Lecture 1:

This is a ‘how-to-do-it’ course for people who want to use computers to simulate the behavior of finite-size systems and to understand the interrelation between a model and the real world. We hope that it will be useful for first-year graduate students (working on their Master's or Ph.D. degree) and for research workers in industry and academia.

Getting started is the main barrier to writing a simulation program. Few people begin their research into any field by sitting down and composing a program from scratch. Yet these programs are not inherently complicated: there are just a few traps to be avoided. In the past, many simulation programs have been handed down from one research group to another and from one generation of students to the next. Technical details such as methods for improving the speed of the program or for avoiding common mistakes are often buried in the appendices of publications.

The field of computer simulation has enjoyed rapid advances in the last decade. Smart Monte Carlo techniques have been introduced and tested and the molecular dynamics method has been extended to simulate various ensembles. The techniques have been merged into a new field of stochastic simulations and extended to cover quantum mechanical as well as classical systems.

A short history of computer simulations

It is now over 40 years since the first computer simulation of complex systems (liquids and surface protective coverage) was carried out at the Los Alamos National Laboratory in the United States [Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., Teller, A.H., and Teller, E. (1953): Equation of state calculations by fast computing machines, J. Chem. Phys. 21, 1087-92]. The Los Alamos computer, called MANIAC, was at that time one of the most powerful available; it is a measure of the recent rapid advance in computer technology that microcomputers of comparable power are now on the market at modest cost. Modern computers range from the relatively cheap, but powerful, single-user workstation to the extremely fast and expensive mainframes. Recently the supercomputers such as the CRAY, which were the most powerful computing devices in the 1980s, have been superseded by transputer-based and parallel machines and by special-purpose computers such as GRAPE in Japan. It is no longer a luxury to require a computer with specialized features such as pipeline and array processors, and computer simulation is possible on most machines. We provide here ideas and codes that can be used on widely available PCs and workstations; parallel computing is not covered within the frame of this course.

The very earliest work (Metropolis, 1953) laid the foundation of ‘Modern Monte Carlo Simulation’, referred to as Metropolis Monte Carlo. The original models were highly simplified representations of molecules, such as hard spheres and disks. These are still used for educational purposes because the general ideas of the simulations are more transparent than they are for systems interacting in complicated ways. The availability of realistic potentials makes it possible to compare the results of computer simulations with experimental data and analytical theories.

If we are interested in the dynamic properties of a system, then MC is not helpful. A different technique is required to solve the classical equations of motion (Newton's equations): the Molecular Dynamics (MD) technique. This was first accomplished for a system of hard spheres by Alder and Wainwright (1957). They considered particles moving at constant velocity between perfectly elastic collisions and solved the dynamical problem without making any approximations, within the limit imposed by machine accuracy. It was several years before a successful attempt was made to solve the equations of motion for a set of Lennard-Jones particles (Rahman, 1964). Here an approximate, step-by-step procedure is needed, since the forces change continuously as the particles move. Since that time the properties of the Lennard-Jones model have been thoroughly investigated (Verlet, 1967, 1968; Nicolas, Gubbins, Tildesley, 1979).

After the groundwork on atomic systems, computer simulation developed rapidly: models for economic growth and the development of societies, for the rate at which diseases spread, for telephone-wiring networks, and for traffic jams at rush hour. One might say that nearly all human activities are nowadays the subject of simulations. A basic understanding of how to model a specific object therefore seems necessary for everybody, at least to the level of being able to judge the results presented by others.

The aim of this course is twofold.

1. It should help someone who has never made a Monte Carlo simulation, but knows how to program a computer, to start with a Monte Carlo study of simple systems. For this purpose we include simple explicit examples of such programs written in FORTRAN (many of the programs are available in C as well). Those who prefer PASCAL, ADA, or any other language can use them too, although they will unfortunately have to ‘translate’ the codes, which are available for direct use, into their language of choice.

We start by outlining how one gets random numbers and sketch the percolation problem as an example of their immediate use. The Ising system is presented as the simplest case of a system with thermal fluctuations. The discreteness of the variables in both cases is a simplifying feature, which enables computer time and/or memory to be saved by special ‘tricks’ such as the multispin coding technique. Later we describe how systems with continuous variables (like Heisenberg spins) are treated, followed by an introductory discussion of the various pitfalls which may hamper such simulations. With this background, it should be possible to start doing some Monte Carlo work.

2. The second aim is to present some technical aspects of simulations of ‘classical’ systems where recent progress has been achieved. The emphasis will be on problems illustrated with examples taken from the authors' own research: information drawn from the study of charged-particle penetration through thin solid layers, and techniques relying on the use of Plancherel's theorem to relate the measured and the true signal. In the second part of the course we will briefly discuss another simulation technique, molecular dynamics.

Introduction to Monte Carlo Simulations

The Monte Carlo Method was developed by von Neumann, Ulam, and Metropolis at the end of the Second World War to study the diffusion of neutrons in fissionable material [von Neumann, J. and Ulam, S. (1945): Random ergodic theorems. Bull. Am. Math. Soc. 51 (9), 660 (No. 165)], [Metropolis, N. and Ulam, S. (1949): The Monte Carlo Method. J. Am. Stat. Assoc. 44, 335-41], [von Neumann, J. (1951): Various techniques used in connection with random digits. US Nat. Bur. Stand. Appl. Math. Ser. 12, 36-8].

The name ‘Monte Carlo method’ arises from the fact that the method uses random numbers similar to those coming out of roulette games; the name was coined by Metropolis in 1947. Producing the up to 10^10 random numbers necessary for a simulation in this way would, however, consume a lot of time and money, so computers are used for the purpose instead. To understand the principle of the ‘random number generators’ implemented in the software of digital computers, recall that each word of computer memory consists of m bits for one integer. One of these bits indicates the sign, the others give the magnitude. Thus the largest number handled by a computer is 2^(m-1) - 1 if one bit of the word is a sign bit, or 2^m - 1 if all bits are digits. Typically m = 32 (64), but some machines also have m = 16 (48). Some examples: PCs with Intel 286 or 386 processors have m = 16, PCs with Intel 486 and later processors have m = 32, as do the old VAX workstations, and the DEC Alpha workstations have m = 64. One may produce a ‘pseudorandom’ integer I1 between 0 and 2^(m-1) - 1 from an initial integer I0 by multiplying I0 by a suitable number like 65539, which in general produces an integer far exceeding 2^(m-1), and by putting the product back into the interval from 0 to 2^(m-1) - 1 by a modulo operation:

I1 = (65539 * I0) mod 2^(m-1).        (1.1)

This modulo operation can be done on many, but not all, computers by simply omitting the leading bits of the product, which happens automatically when too large integers are multiplied. One has to ensure, however, that the result is not negative. Thus on a 32-bit computer the following FORTRAN statements produce from a random integer (stored as variable I) another integer (stored at the end again under the name I) and a real number X between 0 and 1 (the ‘linear congruential method’):

I = 65539* I

IF (I .LT. 0) I = I + 2147483647 + 1

X = I * 0.4656612E-9 (1.2)

By repeating this procedure one produces further integers I and real numbers X. The distribution of the real numbers is approximately homogeneous and approximately random between 0 and 1; as a rule of thumb, the larger m is, the better the approximation. The resulting series of ‘random’ numbers is completely reproducible, however, if one starts with the same value of I, and it repeats itself after a period of at most 2^(m-2). The initial value of I must be a positive odd integer for procedure (1.2). The described generator is simple, fast, and dirty. Several iterations are needed before it produces values of X which are not very small, so at the beginning of the program one may use a short loop which ‘warms up’ the generator. It is also safer to put I = 2*I + 1 after I is set initially, so that one never erroneously gets an even I. Computer time can be saved if this generator is not written as a separate subroutine but inserted in the main program wherever it is needed. The last line of (1.2) can also be avoided if real numbers are not necessary, or if one redefines some of the quantities. For example, if during the program one must compare random numbers X millions of times to a fixed probability P (or to a finite set of fixed probabilities P(j)), it is better to define at the beginning an integer probability Ip = 2^(m-1) * P and later simply compare the I's from (1.2) to this Ip without calculating the X's. Thus we gain time. Do we lose something? Unfortunately, yes: we sacrifice some precision of the calculations. The main source of the loss is the varying ‘density’ of the number representation in a computer: it is known (Knuth, The Art of Computer Programming, Vol. 2; Simon + Geblon) that the representable numbers between machine zero (more precisely, the machine epsilon ε) and 1 are denser than the numbers between 1 and the largest integer.
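As an illustration, here is a minimal sketch (the program and variable names are ours, and a 32-bit machine on which integer overflow simply wraps around is assumed, as in (1.2)) combining the warm-up loop, the odd initial seed, and the integer-probability trick:

      PROGRAM WARMUP
C     Sketch: generator (1.2) with warm-up loop and integer threshold.
      INTEGER I, IP, K, NHIT
      REAL P
      P = 0.3
      I = 12345
C     make sure the seed is odd
      I = 2*I + 1
C     'warm up' the generator so that the first numbers are not too small
      DO 10 K = 1, 100
         I = 65539*I
         IF (I .LT. 0) I = I + 2147483647 + 1
   10 CONTINUE
C     integer threshold Ip = 2**(m-1) * P replaces the comparison X < P
      IP = INT(2147483647.0*P)
      NHIT = 0
      DO 20 K = 1, 1000000
         I = 65539*I
         IF (I .LT. 0) I = I + 2147483647 + 1
         IF (I .LT. IP) NHIT = NHIT + 1
   20 CONTINUE
      WRITE(*,*) 'fraction below P =', REAL(NHIT)/1.0E6
      END

The printed fraction should be close to P, and the comparison in the inner loop never needs the real number X.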

The simplest test for any random number generator is to fill up a simple cubic lattice of L^3 sites n(k1, k2, k3), where the ki run from 1 to L. Initially all elements n are empty (n = 0). Then one point is selected randomly by calculating its three coordinates k1, k2, and k3 consecutively from three consecutive random numbers X through k1 = 1 + X*L, etc. If the randomly selected lattice site is still empty it is filled (n = 1); otherwise it simply stays filled. (It is a good idea to store in an extra array the cases where one and the same point is selected again; this gives information about the period of the random-number sequence.) The filling procedure is repeated about t*L^3 times, with t of order 10. The restrictions on L come from the memory available on the computer; for example, a lattice length of L = 10 requires about 2 MB of memory (for integers, and twice that for real numbers). Theoretically, the fraction of empty lattice sites should decay as exp(-t). With (1.2) one gets good results for L = 10, but already at L = 15 some deviations are seen, and at L = 20 the failure is drastic: more than 2000 sites remain empty even though theoretically fewer than one should be left empty. This failure is due to strong triplet correlations between consecutive pseudorandom numbers generated by (1.2); it can be avoided by ‘mixing’ two different random number generators. A minimal sketch of this test is given below, after which we complete this section with a general algorithm for random number production based on the ‘linear congruential method’.
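A minimal sketch of this test (our own illustrative code; the Fortran 90 intrinsic RANDOM_NUMBER stands in here for whichever generator is to be tested):

      PROGRAM FILTST
C     Lattice-filling test of a random number generator.
      INTEGER L, T
      PARAMETER (L = 20, T = 10)
      INTEGER N(L,L,L), K1, K2, K3, ITRY, NEMPTY
      REAL X
      DO 10 K1 = 1, L
      DO 10 K2 = 1, L
      DO 10 K3 = 1, L
         N(K1,K2,K3) = 0
   10 CONTINUE
C     select about t*L**3 random sites and fill them
      DO 20 ITRY = 1, T*L*L*L
         CALL RANDOM_NUMBER(X)
         K1 = 1 + INT(X*L)
         CALL RANDOM_NUMBER(X)
         K2 = 1 + INT(X*L)
         CALL RANDOM_NUMBER(X)
         K3 = 1 + INT(X*L)
         N(K1,K2,K3) = 1
   20 CONTINUE
C     count the sites that were never hit
      NEMPTY = 0
      DO 30 K1 = 1, L
      DO 30 K2 = 1, L
      DO 30 K3 = 1, L
         IF (N(K1,K2,K3) .EQ. 0) NEMPTY = NEMPTY + 1
   30 CONTINUE
      WRITE(*,*) 'empty sites :', NEMPTY
      WRITE(*,*) 'expected    :', L*L*L*EXP(-REAL(T))
      END

Substituting the generator (1.2) for RANDOM_NUMBER should reproduce the deviations from exp(-t) described above as L grows.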

Algorithm for generating random numbers in the interval [0,1]:

Start with x_0, the so-called ‘seed’ number

On the nth call to the random number generator calculate

x_{n+1} = (a·x_n + c) mod m

Return x_{n+1}/m.

The integers a, c, and m must be selected very carefully to ensure that the successive values x_n/m cover the interval [0,1] uniformly, are only weakly correlated, and have a very long period. The following empirical rules have been found for the selection of these three parameters (a minimal sketch implementing them is given after the list).

    1. Choose a such that a mod 8 = 5.
    2. Let m be the maximal integer value for the computer used.
    3. Choose an odd value for c.
    4. Choose c such that c/m ≈ 0.211.
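A minimal Fortran sketch of such a generator follows. The constants are only one illustrative choice satisfying the rules above, m = 2^31, a = 69069 (69069 mod 8 = 5) and c = 453119049 (odd, c/m ≈ 0.211); 64-bit integers are used so that the multiplication does not have to rely on overflow:

      PROGRAM LCGDEM
C     Linear congruential generator x(n+1) = (a*x(n) + c) mod m.
      INTEGER*8 ISEED, A, C, M
      PARAMETER (A = 69069_8, C = 453119049_8, M = 2147483648_8)
      REAL X
      INTEGER K
      ISEED = 123457
      DO 10 K = 1, 5
         ISEED = MOD(A*ISEED + C, M)
         X = REAL(ISEED)/REAL(M)
         WRITE(*,*) X
   10 CONTINUE
      END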

An example of “Simple Sampling”: The Percolation Problem

What can we do with these random numbers in the field of statistical physics? Let us start with a simple, purely geometric problem, the so-called percolation transition: one considers an (infinite) lattice where each site is randomly occupied with probability p and empty with probability (1-p). Neighboring occupied sites are said to form ‘clusters’. For more details see (in Bulgarian) B. Karadjov (1996), Master's thesis, and F. Babalievski (1994), Ph.D. thesis, as well as Efros (1982), in Russian. Some questions concerning this problem are: how many clusters n_l(p) containing l occupied sites exist in the lattice? At which probability pc does an infinite cluster form for the first time, one that ‘percolates’ from one boundary of the lattice to the opposite one? How does the shape of the lattice cells (triangles, squares, honeycombs) influence the value of pc? The percolation problem has been studied very extensively in connection with critical phenomena and continuous (or second-order) phase transitions. We will return to this problem from a purely mathematical point of view when discussing phase transitions and scaling theory.

A Monte Carlo procedure which generates sample configurations of such a partially filled lattice then simply consists of FORTRAN statements like

DO 1 K1 = 1,L

DO 1 K2 = 1,L

DO 1 K3=1,L

N(K1,K2,K3)=0

IF(RANF(I).LT.P) N(K1,K2,K3) = 1

1 CONTINUE

Comment: The name RANF should be replaced with the appropriate name for the system used in computations.

Here RANF(.) is the random number generator and P, L are constants given at the beginning of the program; this example is for a simple cubic lattice. One sweep through the lattice already determines the whole system: there is no need to wait until some ‘equilibrium’ is established, in contrast to thermal or quantum systems. Rather sophisticated codes group the occupied sites into clusters of various sizes [the most effective code at present is the one published by J. Hoshen and R. Kopelman, Phys. Rev. B 14, 3438 (1976)]. Since each site is randomly occupied or empty, independent of what happens at neighboring sites, one needs to store only the current plane of a three-dimensional lattice (or the current row in two dimensions) for this analysis. Figure 1.1 shows the size S∞ of the largest cluster for the two-dimensional L×L triangular lattice in the case where exactly half the lattice sites are filled. Except for small L the data follow a straight line on a log-log plot, which corresponds to a power law S∞ ~ L^(91/48). This so-called critical exponent, or fractal dimension, 91/48 of the largest clusters agrees with theoretical expectations. This example (95000×95000) was one of the largest systems simulated in the late 1980s and showed that the size dependence of some quantities may contain useful information. The theoretical basis for such an analysis is provided by the so-called finite-size scaling theory [Ferdinand, A., Fisher, M.E.: Phys. Rev. 185, 832 (1969)].

RANDOM SAMPLING FROM DIFFERENT DISTRIBUTIONS

The pseudorandom numbers generated by the procedures just described are expected to be uniform in [0,1] or in [1, MAX]. How do we produce numbers having different distributions?

The most common probability functions P(x) (frequency functions) used in science are the Poisson distribution for discrete random variables and the exponential and Gaussian distributions for continuous random variables. The usual normalization applies:

∫_a^∞ P(x) dx = 1,

where a is determined by the ‘nature’ of the variable x: a = 0 or a = -∞.

Having in hand a well-tested generator of uniformly distributed random numbers, one can generate numbers with any other distribution by means of special techniques.

  1. Method of the Inverse Functions: If the integral

F(x) = ∫_a^x P(x') dx'

can be solved analytically, the following procedure produces numbers x with the desired distribution:

x = F^(-1)(ξ),

where ξ is uniformly distributed in [0,1].
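As an illustration (our example, not from the lecture): for the density P(x) = 2x on [0,1] one finds F(x) = x², so x = SQRT(ξ). A minimal Fortran sketch, with the standard intrinsic RANDOM_NUMBER standing in for the uniform generator:

      PROGRAM INVDEM
C     Inverse-function sampling for the example density P(x) = 2x.
      REAL XI, X
      INTEGER K
      DO 10 K = 1, 5
         CALL RANDOM_NUMBER(XI)
         X = SQRT(XI)
         WRITE(*,*) X
   10 CONTINUE
      END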

2. Smart Direct Methods for Sampling

Sampling from the Poisson distribution

Suppose integers i are drawn from a population of n individuals, each individual being drawn independently with probability q. The probability of drawing exactly i of them is given by the binomial distribution

P(i) = [n! / (i! (n-i)!)] q^i (1-q)^(n-i).

The expectation value E(i) of a random variable distributed according to this distribution, called the ‘mean’, is given by

E(i) = nq.

The standard deviation, σ, is a measure of the degree to which the sampled integers spread about the mean. Its square, called the ‘variance’, is given by

σ² = nq(1-q).

As n becomes infinite, while the mean value m = nq is kept constant, the distribution of the random variable i approaches

P(i) = m^i exp(-m) / i!,

where m is the mean value of the population of n individuals. This is the frequency function for the Poisson distribution. It is important in assessing the reliability of counting discrete events, such as radioactive decay.

Algorithm

To select a non-negative integer i from the Poisson distribution with mean m :

p = exp(-m )

q = 1

i = -1

while q>p

select random number x from the uniform distribution

q = q*x

i = i+1

exit with current value of i.
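A minimal Fortran sketch of this algorithm (the function name IPOISS and the use of the intrinsic RANDOM_NUMBER are our illustrative choices):

      FUNCTION IPOISS(XM)
C     Return a Poisson-distributed integer with mean XM.
      INTEGER IPOISS
      REAL XM, P, Q, XI
      P = EXP(-XM)
      Q = 1.0
      IPOISS = -1
   10 CONTINUE
      IF (Q .GT. P) THEN
         CALL RANDOM_NUMBER(XI)
         Q = Q*XI
         IPOISS = IPOISS + 1
         GO TO 10
      END IF
      RETURN
      END

With INTEGER IPOISS declared in the calling program, a statement like K = IPOISS(3.5) then returns one deviate with mean 3.5.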

Sampling from the Gaussian distribution.

The random variables in most mathematical models assume continuous rather than discrete values. Suppose x is such a variable and is clustered about the population mean m with (positive or negative) deviations δ = x - m. Further suppose that the distribution of x can be expanded about the mean as a power series in δ, and that the terms up to second order in δ remain finite as the sample size n becomes infinite, whereas terms with higher powers of δ approach zero much more rapidly. The probability of selecting a particular x under these conditions is given by

P(x) = [1/(σ·sqrt(2π))] exp( -(x - m)² / (2σ²) ),

where σ is the standard deviation. This is the Gaussian or normal distribution.

There has been much controversy over the applicability of the Gaussian distribution to all experimental errors. Because empirical values far from the mean occur infrequently, the properties of the measurement method far out on the ‘tails’ of the distribution are difficult to examine. Rarely does an experimenter perform sufficient replicate measurements to verify that the experimental errors follow a Gaussian distribution. However, it is certain that this distribution is qualitatively correct for measurement errors, and for practical purposes it may be treated as quantitatively correct.

The equation

y = (x - m)/σ

transforms the random variable x, normally distributed with mean m and standard deviation σ, into a random variable y, normally distributed with mean zero and standard deviation one, commonly denoted N(0,1). Because y can always be transformed back to x, the following algorithm for selecting a value from N(0,1) is completely general.

Algorithm

DO

select random numbers ξ1 and ξ2 from the uniform distribution

ξ1 = 2ξ1 - 1

ξ2 = 2ξ2 - 1

s = ξ1·ξ1 + ξ2·ξ2

while s >= 1   (ensure that the point (ξ1, ξ2) is within the unit circle)

x = ξ1 * SQRT(-2 ln s / s)   (transform to the normal distribution).
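The same algorithm written as a minimal Fortran function (the name GAUSS is ours; the extra guard against s = 0 is a small added safeguard):

      FUNCTION GAUSS()
C     Return one N(0,1) deviate by the method described above.
      REAL GAUSS, XI1, XI2, S
   10 CONTINUE
      CALL RANDOM_NUMBER(XI1)
      CALL RANDOM_NUMBER(XI2)
      XI1 = 2.0*XI1 - 1.0
      XI2 = 2.0*XI2 - 1.0
      S = XI1*XI1 + XI2*XI2
      IF (S .GE. 1.0 .OR. S .EQ. 0.0) GO TO 10
      GAUSS = XI1*SQRT(-2.0*LOG(S)/S)
      RETURN
      END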

Thierry’s algorithm:

Let ξ1 and ξ2 be uniformly distributed in [0,1]. Then the numbers y1 and y2 computed from

y1 = SQRT(-2 ln ξ1) cos(2π ξ2)

y2 = SQRT(-2 ln ξ1) sin(2π ξ2)

are normally distributed in (-∞, +∞).

In some applications (e.g., Brownian dynamics) pairs of random variables obtained by the method outlined above are sampled from the bivariate Gaussian distribution with zero means, given variances, and a given correlation coefficient.

Sampling from the Exponential Distribution

Some mathematical models include continuous random variables that assume only non-negative values. The exponential distribution

P(x) = exp(-x)

for x on [0, ∞) is an important frequency function for such variables. For instance, the distance traveled by a molecule in a gas, or by a subatomic particle in a reactor, prior to a collision follows this distribution.

Algorithms

To sample from the exponential distribution.

First version

select random number ξ from the uniform distribution

x = -ln ξ

Because calculating the logarithm is slow*, we supply

Second version

x = 0

for I = 0, 1, … until satisfied DO

select random number ξ(I) from the uniform distribution

IF ξ(I) > ξ(I-1)

IF I is even, x = x + 1

ELSE exit with x = x + ξ(I).

*) Comment: The computation of the logarithm on contemporary computers is not so slow anymore, but this ‘shift’ method is still very useful.
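As an illustration, the first version written as a tiny Fortran function (the name EXPDEV is ours; the guard against ξ = 0 is added because some generators can return exactly zero):

      FUNCTION EXPDEV()
C     Return an exponentially distributed deviate x = -ln(xi).
      REAL EXPDEV, XI
   10 CALL RANDOM_NUMBER(XI)
      IF (XI .EQ. 0.0) GO TO 10
      EXPDEV = -LOG(XI)
      RETURN
      END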

    3. Rejection Method (J. von Neumann, 1951): US National Bureau of Standards, Applied Mathematics Series 12, 36-38, p. 36 (eq. 3.4.2)

We would like to generate x on [0,1] with p(x) = w(x). The geometrical basis of the von Neumann rejection is illustrated in Fig. 2. Let w'(x) be a positive function such that w'(x) > w(x) over the region of integration; note that this means that the definite integral of w' is greater than 1. A convenient, but not always useful, choice for w' is any constant greater than the maximum value of w in the region of integration. If we generate points in two dimensions that uniformly fill the area under the curve w'(x) and then ‘accept’ for use only those points that are under w(x), then the accepted points will be distributed according to w(x). In practice one chooses two random variables, x and ξ, the former distributed proportionally to w' and the latter distributed uniformly between 0 and 1. The value x is then accepted if ξ is less than the ratio w(x)/w'(x); if a point x is rejected, we go on and generate another pair x, ξ. This technique is obviously efficient if w' is close to w throughout the entire range of integration; otherwise, much time is wasted rejecting useless points.
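A minimal Fortran sketch of the rejection method for an illustrative density (chosen by us, not the one in the exercise below): w(x) = 6x(1-x) on [0,1], with the constant envelope w'(x) = 1.5, its maximum. The acceptance ratio is w(x)/w'(x) = 4x(1-x), so about two thirds of the generated points are accepted:

      FUNCTION WREJ()
C     Von Neumann rejection for the example density w(x) = 6x(1-x).
      REAL WREJ, X, XI
   10 CONTINUE
C     x uniform on [0,1] (i.e. proportional to the constant w'),
C     xi uniform on [0,1]
      CALL RANDOM_NUMBER(X)
      CALL RANDOM_NUMBER(XI)
C     accept x with probability w(x)/w'(x) = 4*x*(1-x)
      IF (XI .GE. 4.0*X*(1.0-X)) GO TO 10
      WREJ = X
      RETURN
      END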

Exercise: Use von Neumann rejection to sample points in the interval [0,1] distributed as w(x) = and evaluate the integral and its uncertainty with these points. Compare the efficiency of your calculation for various choices of w'.

Fig. 2 is here.

Fig. 3: Convolution of a Gaussian and a Lorentzian function: V(1,5,1,2) = L(1,5,1) ⊗ G(1,5,2)

The efficiency of the calculation depends on how closely the chosen area covers the function.

    4. The algorithm of Metropolis et al. (importance sampling algorithm)

This subject is presented in the Lecapp file (appendix to the lecture).