Lecture 2:

MONTE CARLO INTEGRATION

Many numerical problems are very difficult, or even impossible, to solve in closed form because the function to be manipulated is extremely complicated or is given by a table of values rather than by an equation. Evaluating a definite integral or solving a differential equation are typical examples. This kind of problems, although deterministic by nature, can be handled by Monte Carlo methods in a way similar to that used for the probabilistic problems arising in the nuclear and particle physics, or in the fields of economics and social science, or in telecommunications. To do that, we simulate the process represented by the function in the original problem as a stochastic process by assigning some probability that a particular functional value will be observed. A random number is selected from the appropriate distribution and used to compute the value of the desired function. If the function is in tabular form, interpolation between entries in the table will be necessary. The sum of successive trials, perhaps weighted by their probability success, is an approximation to the solution. Of course, the more trials are attempted, the better will be the approximation. This is not a very economic course of action. Broadly speaking, there is a square law relationship between the error in an answer and the requisite number of observations; to reduce it tenfold calls for a hundredfold increase in the observations, and so on.

The idea behind the Monte Carlo approach to deterministic problems is to exploit the strength of theoretical mathematics while avoiding its inherent weakness by replacing theory by experiment whenever the former falters. The weakness associated with abstraction and generality of theoretical mathematics, in contrast to the applied mathematics, is that the more general and formal its language, the less is pure theory ready to provide a numerical solution in a particular application.

Searching a Sample Space

To integrate the function f(x) over the interval [a,b] find some value M such that f(x)<M over the entire interval. Then select a random number x from the uniform distribution on [0,1], multiply by M, and determine if f(x ) falls below the curve for f(x) or above it. Score one point if the f(x)³ f(x ). Fig. Illustrates this process. Repeat this procedure a large number of times. Let S be the number of successes (points falling below the curve) in T trials. The estimated probability of success is

S/T = Area under the curve / Total area of rectangle =

After a minimal number of trials, say 100, the value of the integral calculated from the above equation can be compared to the value computed at the previous trial. When the value of the integral does not change by more than a preset tolerance between successive trials, the process may be terminated. An upper limit should be set for the number of trials to ensure termination of the procedure.

Algorithm

Given are f(x), a, b, M, mintrials, maxtrials, toler, and a random number generator to select x from the uniform distribution on [0,1].

T = 0

S = 0

while T < maxtrials

T = T + 1

x = a + x 1*(b - a )

y = x 2 * M

if y <= f(x)

S = S + 1

if T >= mintrials, integral (S) = S* M*(b - a)/T

if | integral(S) - integral(S-1) | <+ toler, stop

This algorithm can be extended to handle multiple integrals by searching a volume of space rather than an area. Thus, a function f(x,y) may be integrated over x and y by selecting random values of x and y which designate a point in the volume bounded by x=a, x = b, y = p, y = q, f(x,y) = 0, and f(x,y) = M. The higher dimensional integrals are evaluated following the same procedure for every variable.

Example:

Integrate sinx from x=a=0 to x=b=p . As the maximal value of sinx on this interval is one, choose that value for M to minimize the search area. The analytical solution is cos0 - cosp = 2.

Solution

Using a random number generator, we obtain the following results for the first three trials (table). In the first trial (x 1) sin2.9685 = 0.1722, which is less than the random number (x 2) 0.8448 obtained for y=x *M=x . Therefore S remains zero, and p S/T = 0 is the first approximation to the integral. The second trial (x 3) gives a value of y = (x 4) which is less than sin(x), so S is one and the new approximation to the integral is p /2. After 100 trials the approximation to the integral is 2.004, which is close to the analytical value of 2.

Table

T

x

y

S

Integral

1

2.9685

0.8448

0

0.0000

2

2.2346

0.7539

1

1.5708

3

1.9689

0.5740

2

2.0944

Comment: The divergence of the computed result towards the exact one is not a steady function. Rather oscillations appear and the process is terminated when they (oscillations) become relatively small.

Statistics of Monte Carlo Integration

Let I denote the value of an integral determined by searching a volume V with random numbers. As shown above, the expectation value of the integral

EI = <I> = a

is approximated by SV/T after a large number of trials. The expectation value of the variance of the integral, that is, the variance of the population of the estimates of the integral is

It can be shown that the probability that the calculated integral is more than m standard deviations from its expectation value is constant.

Therefore the only way to improve the accuracy of the integration is to minimize s . The general variance-reducing techniques lie at the heart of good Monte Carlo work. These techniques depend upon various devices, such as distorting the data so that the variance of the parent distribution is reduced (in importance sampling ), or making allowances for various causes of variation in the data (as in regression).

In the example considered above, the ratio S/T approaches a constant with an increasing the number of trials. In this case the only way to minimize s is to minimize V, the volume of the search space. That is, M should be chosen to be as small as possible. [We will come back to this point in the chapter devoted to the algorithm of Metropolis.- see the first lecture]

Comparison with Polynomial Methods

Evaluation of an n-dimensional integral over an interval divided into N subintervals requires function evaluations. For example, the composite trapezoid rule has an error proportional to where h is the mesh size. For simplicity let the interval of integration be one, that is,

Nh = 1

This tactic merely converts the independent variable to relative units and does not result in any loss of generality. The uncertainty of the value of the integral is

d = K/

where K is a constant.

After q function evaluations using the Monte Carlo method the integral has a probability of 0.9 of being in the interval [I - 2s , I + 2s ] where s 2 is the variance of the population of the estimates. The uncertainty of the integral actually obtaine .

If q equals , Monte Carlo integration is more accurate than the composite trapezoid rule when

Ignoring the difference between K and K , the above condition is met when n is greater than six. Frequently, fewer than trials are required for high dimensional integrals, emphasizing the superiority of the Monte Carlo method for such problems.

Question: What is the difference in the results if different random number generators have been used? Let us inverse the question in order to answer it by using a physical example: Does sampling from different distributions influence the final result?

Testing Hypotheses

A statistical hypothesis is an assertion that a random variable has a particular probability distribution. Although statistical hypothesis may be quite complex, only simple hypotheses will be discussed here; for complex statistical testing the reader is referred to program packages such as in the CERN library. Testing an hypothesis involves calculating the probability that the random variable would deviate from the value predicted by the presumed underlying distribution. If the probability is less than some confidence level, most often 0.01 or 0.05, the hypothesis is accepted. If the calculated probability exceeds the confidence level, the hypothesis is rejected.

Comparing a Sample to a Population

A common hypothesis is that the collection of values for a random variable obtained by a mathematical model or an experiment fits a particular probability density function.

Example: A sequence of 200 random numbers was generated with the linear congruential random number generator

where a =2 147 437 301, c = 453 816 981, and . The hypothesis to be tested is that these 200 numbers are uniformly distributed on the interval [0,1]. Suppose each number is assigned to one of ten bins, depending on its first digit (0 to 9). Each bin should contain 20 numbers if the distribution is completely random. The distributions actually obtained for two values of the seed (ISEED) are given in the table.

Bin

ISEED=1

ISEED=12345

0

19

17

1

21

21

2

22

22

3

20

16

4

24

24

5

12

22

6

22

20

7

24

22

8

9

17

19

21

15

The test gives the goodness of the fit of such results to the expected distribution. The statistics is given by

=

where is a value of the random variable whose distribution is being tested, is the expected value of the variable, and n is the number of comparisons being made. In the above is the number of random numbers in bin i, is the expected number of numbers in that bin (=20), and n = 10 bins. The values of for the seeds 1 and 12345 are 5.8 and 4.0, respectively.

The statistics has its own frequency function which depends on n, the number of comparisons of actual and expected values of the random variable. The probability that a given value of will be exceeded by chance has been tabulated for many values of n-1, the number of degrees of freedom of the distribution. For nine degrees of freedom there is a probability of only 0.05 that will exceed 16.9 by chance. As the actual values of the statistics are less than this value, the hypothesis that the two sequences of random numbers fit the uniform distribution on [0,1] should be accepted.

Testing the Value of a Statistics

The average results of scientific experiment often must be compared to the expected value of the quantity being measured. Because it is unlikely that the measured and theoretical values will agree exactly, it is necessary to determine if the discrepancy is due to random errors of measurement or if it is due to a systematic difference between theoretical preconceptions and reality. Suppose that a large number of replicate measurements were found to be normally distributed with a standard deviation s . Is the observed mean value m of a subsequent experiment involving only a few replicate measurements a reasonable estimate of the theoretical value m o given this known experimental error? In another words, is the observed mean value m a good estimate of the mean of a population distributed as N(m o,s )?

The z-statistics is defined as

z =

where n is the number of estimates that produced the mean m . This statistics is normally distributed with a mean zero and a standard deviation of one. There are tables of the probability that a particular value of z is exceeded (this is the area under the normal curve of error between that value of z and ¥ ). Denote the desired confidence limit by a and the value of z at which the above probability equals a by . If the calculated value of z exceeds , then m is significantly greater than m o. If z is less than (-), then m is significantly smaller than m o. If exceeds , then one of the above conditions applies. The first two of the above cases are called ‘one-tailed’ tests and the third case is called a ‘two-tailed’ test, because they refer to the areas under ‘tails’ of the bell-shaped normal curve of error. The two-tailed test is most often used, but care should be taken that the correct table of z values is being used.

The distribution of the random variables constituting experimental measurement is usually unknown, and only a small sample from that distribution is available. In such cases the standard deviation s in the above formula should be replaced by s, the root-mean-square deviation of the replicate measurements from their mean. Because the sample size is small, s is only an approximation to the true standard deviation and the formula yields a random variable called Student’s t statistics instead of z.

t =

Student’s t statistics can be shown to be distributed as and be a close approximation to the normal distribution for moderate to large values of n, the sample size. Values of t for several confidence limits have been tabulated for many degrees of freedom. For a given confidence level a , if the calculated value of t exceeds (or is less than -), m is significantly greater (or less) than m o . If exceeds , one of the above two cases applies. Care must be taken to use either the one-tailed or two-tailed table of statistics as appropriate.

Example: Several measurements of the density of water have been made at 25° C: 0.987, 0.991,1.005, 0.993, 1.007, 0.995. The true value is known to be 0.997 to the third decimal places. Is the result acceptable at the 0.05 confidence level?

The mean value of the measurements is m = 0.996 and the root-mean-square-deviation is s = 0.00765. The t statistics equals 0.312 to the three decimal places. The table of two-tailed t values gives a value of 2.571 at the 0.05 confidence level. This means that there is a probability of only 0.05 that the tabulated t value would be exceeded by chance. Because the calculated t values does not exceed the tabulated value, the probability that the discrepancy between the theoretical and measured densities reflects the experimental error exceeds 0.95. The result is considered as not significantly different from the known density.

A scientist frequently knows from experience the standard error associated with a particular kind of measurement and wishes to determine if several replicate measurements are representative of the precision he has come to expect from such an experiment. That is, he wishes to determine if the root-mean-square deviation s of his measurements from their mean value is significantly different from the known standard deviation s . If s is significantly greater than s , the experiment in question are less precise than the expected from experience.

This determination can be made using a test.

=

As before, care must be taken to use the one-tailed or two-tailed table of values as is appropriate for the test to be made.

Comparison of Two Populations

Suppose scientists in two laboratories measure the viscosity of a fluid a large number of times. The scientists at the first laboratory find their measurements normally distributed with a mean of m 1 and a standard deviation of s 1. At the second laboratory is found that the measurements have a mean of m 2 and a standard deviation of s 2. The question is if the means of the two populations are significantly different.

As the two populations are normally distributed, a linear combination of them (such as their difference) is also normally distributed. Therefore the z-statistics can be used to test if the difference between the two means is significantly different from zero. Substituting into the formula for z yields

where and are the sizes of the two populations to be compared. As above the calculated z value is compared with that for the desired confidence limit from tables for one-tailed or two-tailed tests.

Most often the standard deviations are not known because only a small sample has been taken from the universe of possible measurements. In this case the true means and the standard deviations of the two populations are unknown because the sample size is too small to make that determination. As before, s 1 and s 2 are replaced by s1 and s2, respectively. The Student’s t statistics is computed instead of z statistics.

.

The calculated value of t is compared to the tabulated value for n degrees of freedom at the desired confidence limit. The number of degrees of freedom is the integer closest to the quantity

The above tests assumes that the two populations of measurements are distinct. To test that the two samples, in fact, do come from the same population calculate the t statistics

where s, a weighted estimate of the standard deviation of the underlying population, is given by

where is the number of degrees of freedom of the t distribution.

The ratio of the estimates of two sample variances tests that the samples are both representative of the same population. The ratio

F =

has its own frequency function which is tabulated for several degrees of freedom in each sample. The table is generally repeated for a few different confidence limits. By convention the larger variance is placed in the numerator. The calculated F ratio is compared to the tabulated value at (-1) numerator and (-1) denominator degrees of freedom at the desired confidence limit a . If the calculated value of F exceeds its tabulated value, there is a probability less than a that the discrepancy between the variances could have arisen by chance and the two samples do not reflect the same population variance.

Example Problem

Two laboratories, using different scintillation counters, measured the intensity of 662 keV g -rays from . The pulse heights of counts of radioactivity in a typical scintillation counter are normally distributed with a mean of H and a standard deviation given by

where E is the energy of the radiation in keV and q is the efficiency of the scintillation phosphor. Let be the pulse height with a probability of observance equal to half the probability of observing the peak pulse height. The uncertainty in the radioactive measurement is approximated by

2

The two laboratories, using the phosphorus NaI (q=30) and anthracene (q=15), each made 16 measurements centered on 662 keV and obtained peak count rates of 3120 and 2250 cpm (counts per minute). Are these measurements equally reliable, that is, is there a statistically significant difference between the estimated experimental uncertainties in the observations from the two laboratories?

Solution

Obviously, the two measurements stand for sampling from entirely different populations with different distributions of pulse height. An F test for the equality of the variances is certain to show a significant difference due to the different amounts of in the two experiments. A common basis for the measurements can be obtained by dividing by the peak count rate H.

The expected variance of the pulse heights relative to the peak pulse height for the first measurements is

corresponding to a relative standard deviation of 2.75%. The expected relative variance for the second laboratory’s measurement is 1,3248 x, corresponding to a relative standard deviation of 3.64%. Multiplying by 2.36 gives measurement errors of 6.07% and 8.59%, respectively.

The F statistic is given by

F =

The tabulated F statistic for 15 degrees of freedom in each variance is 2.40 at the 0.05 confidence level. As the calculated value of F is less than the tabulated value, the hypothesis that the two variances are equivalent is accepted and the results from the two laboratories are equally reliable.