Random Number Generation and Monte Carlo Methods

by James E. Gentle

This document was written in LaTeX, and there has been no serious attempt to translate it to html. Maybe when MathML arrives I'll translate it.

Preface

The role of Monte Carlo methods and simulation in all of the sciences has increased in importance during the past several years. Monte Carlo is a fundamental tool of computational statistics. At the kernel of a Monte Carlo or simulation method is random number generation. Generation of random numbers is also at the heart of many standard statistical methods (``Assume $x_1, x_2, \ldots, x_n$ is a random sample ...'' --- this random sample is generated by the computer).

Various methods for generation of random numbers have been used. Sometimes processes that are considered random are used, but for Monte Carlo methods which depend on millions of random numbers, a physical process as a source of random numbers is generally cumbersome. Instead of ``random'' numbers, most applications use ``pseudorandom'' numbers that are deterministic, but which ``look like'' they were generated randomly. Chapter 1 discusses methods for generation of sequences of pseudorandom numbers that simulate a uniform distribution over the unit interval $(0,1)$. These are the basic sequences from which are derived pseudorandom numbers from other distributions, pseudorandom samples, and pseudostochastic processes.

Chapter 2 discusses general methods for transforming a uniform random deviate or a sequence of uniform random deviates into a deviate from a different distribution. Then Chapter 3 describes methods for specific distributions. It is not the purpose of Chapter 3 just to catalog a large number of methods. In some cases, just the best method is described; in other cases, especially where the best method is very complicated, a simple method is described and references are given for other methods. Although some of the exercises require the reader to write a program to generate random numbers, for production work, I recommend using existing software whenever it is available.

Chapters 1, 2, and 3 are the longest chapters in the book, covering most of the basic methods. Chapter 4 continues the developments of Chapters 2 and 3 to apply them to generation of samples and of nonindependent sequences.

Chapter 5 considers some applications of random numbers. Some of these applications are to solve deterministic problems. This is called Monte Carlo.

The Monte Carlo methods raise questions about the quality of the pseudorandom numbers that simulate physical processes and about the ability of those numbers to cover the range of a random variable adequately. In Chapter 6, I address some of these issues, including the basic question of whether we should even attempt to simulate a random sequence. In many cases a quasirandom sequence that has regular patterns may be more appropriate.

Chapter 7 provides information on computer software for generation of random variates. The discussion concentrates on two software packages, IMSL and S-Plus.

Monte Carlo methods are commonly used in the research literature to evaluate properties of statistical methods. Chapter 8 addresses some of the considerations that apply to this kind of study. It is emphasized that the study uses an {\it experiment}, and the principles of scientific experimentation should be observed.

This book has grown from lecture notes I have used in different courses in the computational and statistical sciences over the past few years. The book would be appropriate for a course in Monte Carlo methods or simulation. It could also be used for supplemental or reference material in a course in statistical computing, computational statistics, or other computational sciences.

The material in this book is part of the field of statistical computing, which comprises computational methods that support statistical methods. In other notes that form the larger whole of which this book is a part, I discuss techniques of computational statistics. The field of computational statistics includes the set of statistical methods that are computationally intensive and which use computation as a basic tool of discovery, as opposed to computation for the purpose of processing data according to a fixed paradigm.

Many methods of computational statistics, such as cross-validation, MCMC, and bootstrap, rely on random number generation and the basic Monte Carlo technique. These methods are not covered in the present text.

The main prerequisite for this text is some background in what is generally called ``mathematical statistics''. Some scientific computer literacy is also necessary. I do not use any particular software system in the book; but I do assume the ability to program in either Fortran or C, and the availability of either S-Plus, Matlab, or Maple.

When I have taught this material, my lectures have consisted in large part of working through examples. Some of those examples have become exercises in the present text. The exercises should be considered an integral part of the book.

In every class I teach in computational statistics, I give Exercise 2 in Chapter 8 as a term project. It is to replicate and extend a Monte Carlo study reported in some recent journal article. Each student picks an article to use. The statistical methods studied in the article must be ones that the student understands, but that is the only requirement as to the area of statistics addressed in the article. I have varied the way the project is carried out, but it always involves more than one student working together. A simple way is for each student to referee another student's first version (due midway through the term), and to provide a report for the student author to use in a revision. Each student is both an author and a referee. In another variation, I have students be co-authors. One student selects the article, designs and performs the Monte Carlo experiment, and another student writes the article, in which the main content is the description and analysis of the Monte Carlo experiment.

I used \TeX\ via \LaTeX\ to write the book, and I used S-Plus and Matlab to generate the graphics. I did all of the typing, programming, etc., myself (mostly early in the morning or late at night), so all mistakes are mine.

Material relating to courses I teach in the computational sciences is available over the World Wide Web at the URL,

http://www.science.gmu.edu/

Notes on this book, including errata, are available at

http://www.science.gmu.edu/~jgentle/rngbk/

James E. Gentle
Fairfax County, Virginia