Sampling random numbers from arbitrary distributions

Have you ever wondered how to sample (pseudo)random numbers on a computer from a nonuniform distribution? I have. I’ll try to explain how it’s done and illustrate the method using Python.

If you are familiar with the excellent NumPy package, you’ve most likely also discovered the numpy.random module, which offers a broad variety of functions for obtaining random numbers from a large number of distributions. But what if your favourite distribution is not in that list? And what if you only know your distribution at discrete points between which you have to interpolate?

Well, one thing at a time.

The inverse transform sampling method

A method to sample random numbers from any distribution using uniformly distributed random numbers

r\in[0,1)

is called inverse transform sampling, and it relies on knowing the cumulative distribution function

F(x)=\int_{-\infty}^{x} f(x')dx'

of our distribution, which we may call f(x). The inverse transform method works by sampling a random number r and then finding the value x for which F(x)=r. Sounds easy enough, so let’s look at an example:

Let’s say our distribution is

f_\lambda(x)=\begin{cases} \lambda\exp(-\lambda x) & x\geq 0\\ 0 & \text{otherwise} \end{cases}

We can analytically calculate the cumulative distribution function to be

F_\lambda(x)=\int_{0}^{x}\lambda\exp(-\lambda x')dx'=1-\exp(-\lambda x).

Here’s a plot of our distribution and its cumulative distribution for different values of \lambda:

[Figure: the exponential distribution f_\lambda(x) and its cumulative distribution F_\lambda(x) for several values of \lambda]

Now, if we have a random number r\in[0,1), and we want to find the x for which

r=F_\lambda(x)=1-\exp(-\lambda x)

we have to solve the above expression for x, which yields

x=-\frac{1}{\lambda}\log(1-r),

the inverse of the cumulative distribution function. In order to sample random numbers from f_\lambda(x) we only have to generate uniformly distributed random numbers and plug them into the above equation. The resulting numbers will be distributed according to f_\lambda(x). We can easily convince ourselves of this using the following Python script:

import numpy as np
import matplotlib.pyplot as plt

def pdf(x, l):
    """
    Exponential probability distribution function
    @param x x-value
    @param l lambda
    """
    return l*np.exp(-l*x)
def inv_cpdf(x, l):
    """
    inverse of the cumulative distribution function of
    the exponential distribution
    @param x x-value
    @param l lambda
    """
    return -np.log(1-x)/l

# let's use lambda=1 for now
l=1.
# generate 10000 random numbers in [0,1)
r=np.random.random(10000)
# put r into inv_cpdf
r_exp=inv_cpdf(r, l)
# x axis for plot
x=np.linspace(0,5,20)
# create figure
fig=plt.figure()
ax=fig.add_subplot(111)
ax.plot(x, pdf(x, l), label=r"$f(x)=\lambda\cdot\exp(-\lambda x)$")
ax.hist(r_exp, bins=200, range=(x[0], x[-1]), histtype='step', density=True, label="Histogram")
ax.set_xlabel(r"$x$")
ax.set_ylabel("PDF")
ax.grid(True)
ax.legend()
fig.savefig("hist_exponential.png")
fig.show()
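By the way, we can cross-check our result against NumPy’s built-in exponential sampler, which expects the scale 1/\lambda as its parameter; appended to the script above:

# cross-check with NumPy's built-in sampler (scale = 1/lambda)
r_np = np.random.exponential(scale=1./l, size=10000)
# both sample means should be close to 1/lambda = 1
print(r_exp.mean(), r_np.mean())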

In lines 4 to 18 the distribution and the inverse of its cumulative distribution are defined. In line 23 we generate 10000 random numbers, which we plug into our inverse cumulative distribution in line 25. The rest of the code is basically just for plotting: we make a normalized histogram of our transformed random numbers and plot the original distribution for comparison:

[Figure: histogram of the transformed random numbers together with f_\lambda(x)]

Nice! The histogram traces out the original distribution, which means that our transformed random numbers are actually distributed according to f_\lambda(x)!

Inverting means flipping your axes

But what do we do if we cannot analytically calculate the cumulative distribution function? Or if we cannot analytically solve r=F(x)? Well, we can do so numerically. In the world of Python there’s the SciPy package which offers a function for evaluating cumulative integrals, and the equation r=F(x) can be rearranged to

F(x)-r=0

and then solved by e.g. Newton’s method, of which there’s also an implementation available in SciPy.
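Here’s a minimal sketch of that route for the exponential distribution from above, using scipy.integrate.quad to evaluate F(x) and scipy.optimize.newton for the root finding (the starting point x0=1 is an arbitrary choice):

import numpy as np
from scipy.integrate import quad
from scipy.optimize import newton

def pdf(x, l=1.):
    """Exponential probability distribution function"""
    return l*np.exp(-l*x) if x >= 0. else 0.

def cdf(x, l=1.):
    """F(x), evaluated by numerical integration"""
    return quad(pdf, 0., x, args=(l,))[0]

def sample(r, l=1.):
    """Solve F(x)-r=0 for x; Newton's method can use F'(x)=f(x)"""
    return newton(lambda x: cdf(x, l) - r, x0=1., fprime=lambda x: pdf(x, l))

print(sample(np.random.random()))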

But there’s an easier way, especially if you only know your distribution at a few points \left(x_i, f(x_i)\right) between which you have to interpolate.

What we did when we solved

r=F(x)

was inverting F(x). That’s why the method outlined above is called inverse transform sampling. The inverse of a function F(x) (if it has one) is usually denoted by F^{-1}(x) (not to be confused with (F(x))^{-1}=\frac{1}{F(x)}!), and its graph looks just like the one of F(x), only with the x and y axes swapped, i.e. mirrored at the line defined by y=x, here illustrated using the cumulative distribution function from above:

[Figure: F_\lambda(x) and its inverse F_\lambda^{-1}(x), mirrored at the line y=x]

Now, as we’re using a computer, we can easily flip the axes of a graph, and thus also easily generate an approximation of the inverse of F(x) by evaluating it at points x_i in the region of interest and calculating an interpolation (e.g. a linear one), where we use F(x_i) as the x-values and x_i as the y-values.

So let’s test this method. Here’s the Python code:

import numpy as np
from scipy.integrate import cumulative_trapezoid
from scipy.interpolate import interp1d
import matplotlib.pyplot as plt

x=np.linspace(-3,3,50) # x axis
y=np.abs((1-.1*np.sin(2.*np.pi*x))*(np.exp(-1.*np.abs(x)))) # some funky function
y=y/np.trapz(y,x) # normalize
cy=cumulative_trapezoid(y,x,initial=0.) # cumulative distribution function

# plots
fig1=plt.figure()
ax1=fig1.add_subplot(111)
ax1.set_title("Probability distribution")
ax1.plot(x,y, label="Probability distribution")
ax1.plot(x,cy, label="Cumulative PDF")
ax1.legend()
ax1.set_xlabel(r"$x$")
ax1.set_ylabel(r"$y$")
ax1.grid(True)
fig1.savefig("funky_distribution_function.png")
fig1.show()

# swap x and y axis and interpolate
inv_cpdf=interp1d(cy, x)
# generate 100000 random numbers
r=np.random.random(100000)
# map random numbers onto our funky distribution
r_dist=inv_cpdf(r)

#make histogram

fig2=plt.figure()
ax2=fig2.add_subplot(111)
ax2.plot(x, y, label="Probability distribution")
ax2.hist(r_dist, bins=200, range=(x[0], x[-1]), density=True, label="Histogram")
ax2.set_xlabel(r"$x$")
ax2.set_ylabel("PDF")
ax2.legend()
ax2.grid(True)
fig2.savefig("hist_funky.png")
fig2.show()

In line 7 we define our probability distribution as some funny function which looks like this (blue curve):

[Figure: the funky probability distribution (blue) and its cumulative distribution (orange)]

Then we calculate the cumulative distribution (orange curve) and create the plot above. Line 25 is where the magic takes place: we calculate an interpolation using scipy.interpolate.interp1d with the cumulative distribution as the interpolation’s x-values and the x-axis on which we defined the distribution as the interpolation’s y-values. Then we generate a few random numbers and put them into our interpolated inverse of the cumulative distribution, just as we did above, only there we had an analytic expression for F^{-1}(x). The rest is just plotting again, generating a histogram to illustrate that the method actually works:

[Figure: histogram of the transformed random numbers together with the funky distribution]

Cool, isn’t it? Of course, instead of generating our (x_i, y_i) points using some funny function, we could have used data points from somewhere else as well.
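For instance, a minimal sketch of sampling from tabulated data could look like this (the file name data.txt and its two-column format are made up for illustration):

import numpy as np
from scipy.integrate import cumulative_trapezoid
from scipy.interpolate import interp1d

# hypothetical file with two columns of points (x_i, f(x_i))
x_i, f_i = np.loadtxt("data.txt", unpack=True)
f_i = f_i/np.trapz(f_i, x_i)                      # normalize the tabulated PDF
F_i = cumulative_trapezoid(f_i, x_i, initial=0.)  # cumulative distribution
samples = interp1d(F_i, x_i)(np.random.random(10000))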


I hope you enjoyed reading this.

Have a nice day!


During the writing of this post I found this and this article, two of probably many on the same subject, so I’m not under the illusion that the methods discussed here are in any way original ideas of mine. However, as I heard of these methods somewhere else before stumbling across those articles, I haven’t cited them as sources here.


Fun with FORTRAN

Quite recently I had to use a program written in FORTRAN77 for some calculations. Curious as I am, I peeked into the source code, only to realize that I didn’t understand a thing of what was going on. So I decided to learn some FORTRAN77.

Why should one use FORTRAN today? From its Wikipedia article one quickly learns that it has been the language of scientific computing for decades and is still quite popular with scientists and engineers today. As a result there’s still a lot of excellent and useful FORTRAN code around, which has to be understood and maintained.

A quick Google search yielded this nice introduction to FORTRAN77, which goes through pretty much all the important topics; if you already have some programming experience you should not have any trouble following it. I’m not going to offer a FORTRAN tutorial here, because there are already plenty of those available, written by people who know the language much better than I do, but I’ll share my first steps in FORTRAN with you, and hopefully you’ll have just as much fun as I did.

So let’s dive into the scientific computing fun and solve a simple problem!

A pendulum

A mathematical pendulum is one of the first problems a physics student encounters in his or her classes. As can easily be shown, the angle \varphi from the vertical obeys the differential equation

\ddot\varphi=-\frac{g}{l}\sin\varphi

where l is the length of the pendulum and g is the gravitational acceleration. The standard way of analytically tackling this problem is to assume \varphi to be small, which enables us to use the small-angle approximation

\sin\varphi\approx\varphi

which, according to Wikipedia, is accurate enough for \varphi < 14^\circ.
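A quick numerical check confirms this threshold; at 14^\circ the relative error of the approximation is only about one percent:

import numpy as np
phi = np.deg2rad(14.)
# 0.2443 (rad) vs. sin: 0.2419, i.e. about 1% relative error
print(phi, np.sin(phi), (phi - np.sin(phi))/np.sin(phi))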

Under this approximation the differential equation becomes

\ddot\varphi=-\frac{g}{l}\varphi,

the differential equation of a harmonic oscillator, the solution of which is well known to be some function like

\varphi(t)=A\cdot\cos(\omega t+\phi)

where \omega=\sqrt{g/l} and the constants A and \phi have to be determined from e.g. the initial conditions, i.e. the values of \varphi(0) and \dot\varphi(0). For the particularly simple case of \varphi(0)=\varphi_0 and \dot\varphi(0)=0 one has

\varphi(t)=\varphi_0\cos(\omega t)

which corresponds to releasing the pendulum from an angle \varphi_0 without giving it any additional velocity.

Numerically solving the differential equation

There are plenty of methods for numerically solving ordinary differential equations. Here I’ll adopt a simplified version of the Leapfrog integrator (a scheme also known as the semi-implicit Euler method) to solve the equation of motion:

The angle \varphi(t+\Delta t) at a later time t+\Delta t is calculated using

\varphi(t+\Delta t)\approx \varphi(t)+\Delta t \cdot \dot \varphi(t).

Then the angular velocity \dot\varphi(t+\Delta t) at t+\Delta t is obtained from the already updated angle:

\dot\varphi(t+\Delta t)\approx \dot\varphi(t) - \frac{g}{l}\sin\left(\varphi(t+\Delta t)\right)\cdot\Delta t

These equations are applied repeatedly in order to propagate \varphi(t) and \dot\varphi(t), and for simplicity we’ll assume l to equal 1\,\mathrm{m}. But that’s enough talk, here’s the code:

C234567
      program pendulum
      implicit none
C     declarations
      real phi, phidot, dt, t
      integer steps
C     Number of steps
      steps=200
C     timestep and initial values
      t=0.
      dt=.05
      phi=0.1
      phidot=0.

      write(*,*) "# t, phi, phidot"
C     Integrator loop
      do while(steps .ge. 0)
      write(*,*) t, phi, phidot
      t = t + dt
      phi = phi + dt*phidot
      phidot = phidot - dt*9.81*sin(phi)
      steps = steps - 1
      enddo
      end

Please note that in FORTRAN77 the first six columns of each line are reserved for labels and continuation characters, so statements begin in column 7. Lines starting with a C indicate a comment. In lines 12 and 13 the initial values \varphi_0=0.1 and \dot\varphi_0=0 are set; lines 17 to 23 are the integrator.
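For comparison, and for readers more at home in Python, here’s a minimal sketch of the same integration scheme (the 201 iterations mirror the do while loop running from steps=200 down to 0):

import numpy as np

g, dt = 9.81, 0.05            # gravitational acceleration (l = 1 m), timestep
phi, phidot, t = 0.1, 0., 0.  # initial conditions
for _ in range(201):
    print(t, phi, phidot)
    t += dt
    phi += dt*phidot            # advance the angle with the old velocity
    phidot -= dt*g*np.sin(phi)  # advance the velocity with the new angle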

This program can be compiled e.g. with the GNU Fortran compiler:

f77 -o pendulum pendulum.f

The program’s output is three columns of numbers: the first column is the time t, the second is \varphi, and the third is \dot\varphi. You can dump the output into a text file for plotting:

./pendulum > pendulum.1.dat

Results

In the above FORTRAN code the initial angle is 0.1\,\mathrm{rad}, i.e. approximately 6^\circ, so we expect the small-angle approximation to be valid and thus the oscillation to be a cosine:

[Figure: plot_small_angle, numerical vs. analytic solution for \varphi_0=0.1]

As we can see, our simple FORTRAN integrator (solid lines) does a pretty good job at reproducing the analytic solution (dashed lines) for small angles, so we can be confident that it works well enough. But what about large angles? We can set the initial angle to a large value by modifying line 12 of the FORTRAN program to read e.g.

       phi=3.09

which corresponds to \varphi\approx177^\circ and is obviously out of the range where the small angle approximation can be expected to yield acceptable results. So let’s recompile the program and generate another data file:

f77 -o pendulum pendulum.f
./pendulum > pendulum3.09.dat

Here’s the plot:

[Figure: plot_large_angle, numerical vs. analytic solution for \varphi_0=3.09]

For large angles the shape of \varphi(t), and consequently \dot\varphi(t) (solid lines, dark colors), is considerably different from the analytical solution (solid lines, light colors), and the period of the oscillation changes as well. This dependence of the period on the amplitude of the oscillation is not captured by the simple harmonic model in the small-angle approximation, but we can study such effects numerically.
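For instance, a rough way of measuring the period from the numerical output is to look for sign changes of \dot\varphi, which occur every half period for a pendulum released from rest (a sketch, assuming the data file generated above):

import numpy as np

data = np.loadtxt("pendulum3.09.dat")
t, phidot = data[:, 0], data[:, 2]
# times at which the angular velocity changes sign
crossings = t[np.where(np.diff(np.sign(phidot)) != 0)]
# twice their spacing estimates the period; compare to the
# small-angle value 2*pi*sqrt(l/g) of roughly 2 s
print(2.*np.diff(crossings))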

Here’s the Python 3 code for generating the plots from the two data files:

import numpy as np
import matplotlib.pyplot as plt
from matplotlib import gridspec

def pendulum(t, phi):
    """Analytic small-angle solution phi*cos(omega*t), omega=sqrt(g/l) with l=1 m"""
    return phi*np.cos(np.sqrt(9.81)*t)
def dpendulum(t, phi):
    """Time derivative of the analytic small-angle solution"""
    return -np.sqrt(9.81)*phi*np.sin(np.sqrt(9.81)*t)

data1=np.loadtxt("pendulum3.09.dat")
data4=np.loadtxt("pendulum.1.dat")

fig=plt.figure()
gs=gridspec.GridSpec(2,1, hspace=0)

ax1=plt.subplot(gs[0])
ax1.set_title(r"Large angle, $\varphi_0=3.09$")
ax1.plot(data1[:,0], data1[:,1], color="C0", label="Num.")
ax1.plot(data1[:,0], pendulum(data1[:,0], 3.09), color="C0", alpha=.3, label="Analyt.")
ax1.set_xlabel(r"$t$ [s]")
ax1.set_ylabel(r"$\varphi$ [rad]")

ax2=plt.subplot(gs[1])
ax2.plot(data1[:,0], data1[:,2], color="C1", label="Num.")
ax2.plot(data1[:,0], dpendulum(data1[:,0], 3.09), color="C1", alpha=.3, label="Analyt.")
ax2.set_xlabel(r"$t$ [s]")
ax2.set_ylabel(r"$\dot\varphi$ [rad/s]")

ax2.legend()
ax1.legend()
fig.tight_layout()
plt.savefig("plot_large_angle.png")
plt.show()

fig2=plt.figure()
ax1=plt.subplot(gs[0])
ax1.set_title(r"Small angle, $\varphi_0=0.1$")
ax1.set_xlabel(r"$t$ [s]")
ax1.set_ylabel(r"$\varphi$ [rad]")
ax1.plot(data4[:,0], data4[:,1], label=r"Num.", color="C0")
ax1.plot(data4[:,0], pendulum(data4[:,0], .1), '--', label=r"Analyt.", color="C2")

ax2=plt.subplot(gs[1])
ax2.plot(data4[:,0], data4[:,2], label=r"Num.", color="C1")
ax2.plot(data4[:,0], dpendulum(data4[:,0], .1), '--', label=r"Analyt.", color="C3")
ax2.set_xlabel(r"$t$ [s]")
ax2.set_ylabel(r"$\dot\varphi$ [rad/s]")

ax2.legend()
ax1.legend()
fig2.tight_layout()

plt.savefig("plot_small_angle.png")
plt.show()

Conclusion

So we used FORTRAN77 to solve an ordinary differential equation and investigate an effect inherent to the nonlinear dynamics of a simple pendulum. Of course, we could also have done that in Python or any other modern language, but that would have defeated the purpose of this exercise, which was to learn some FORTRAN.

My first impression of FORTRAN was positive, and I can understand very well why it is still being used for numerics today. Although I haven’t used any functions, arrays or subroutines yet, I’ll do so very soon and prepare another post to share my experiences.

Also, feel free to experiment with the code. It may be fun to try different initial conditions, e.g. a non-zero initial velocity, or just whatever you can imagine.

So, I hope you enjoyed reading this! Have a nice day!