How to simulate first-passage time probability in Python for a random walk

0 votes

I have a 2D random walk where the particle has equal probabilities to move left, right, up, down, or stay in the same position. I generate a random number from 1 to 5 to decide in which direction the particle will move. The particle performs n steps, and I repeat the simulation several times.

I want to plot the probability F(t) of hitting a linear barrier located at x = -10 for the first time (the particle disappears after hitting this point). I started by counting, for each simulation, the number of particles fp that hit the trap, adding 1 each time a particle reaches the position x = -10. I then plotted fp, the number of particles hitting the trap for the first time, against t, the time step.

import matplotlib.pyplot as plt
import numpy as np
import random

n = 1000
n_simulations=1000

x = np.zeros((n_simulations, n))
y = np.zeros((n_simulations, n))
steps = np.arange(0, n, 1)
for i in range (n_simulations):
    for j in range (1, n):
        val=random.randint(1, 5)
        if val == 1:
            x[i, j] = x[i, j - 1] + 1
            y[i, j] = y[i, j - 1] 
        elif val == 2:
            x[i, j] = x[i, j - 1] - 1
            y[i, j] = y[i, j - 1] 
        elif val == 3:
            x[i, j] = x[i, j - 1]
            y[i, j] = y[i, j - 1] + 1
        elif val == 4:
            x[i, j] = x[i, j - 1]
            y[i, j] = y[i, j - 1] - 1
        else:
            x[i, j] = x[i, j - 1]
            y[i, j] = y[i, j - 1]
        if x[i, j] == -10:
            break

fp = np.zeros((n_simulations, n))  # number of particles that hit the trap for each simulation
for i in range(n_simulations):
    for j in range (1, n):
        if x[i, j] == -10:
            fp[i, j] = fp[i, j - 1] + 1
        else:
            fp[i, j] = fp[i, j - 1]
s = fp.sum(axis=0)  # total hits at each time step, summed over simulations
plt.xlim(0, 1000)
plt.plot(steps, s)
plt.show()

I should have the following plot:

Probability of hitting the target as a function of time

But the plot I get is different: the curve is always increasing, while it should decrease for large t (for large t, most particles have already hit the target, so the probability decreases). Even without summing fp I don't get the desired result. I would like to know where my code is wrong. This is the plot I get with my code.

Increasing values

Apr 4 in Machine Learning by Nandini
• 5,480 points

1 answer to this question.

0 votes

To begin with, you're currently computing fp as the cumulative total of all particles that have hit the trap, so it must asymptotically approach n_simulations. What you're actually looking for is the derivative of that cumulative total: the number of particles hitting the trap per unit time.
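To see the relationship between the two quantities, np.diff recovers the per-step count from a cumulative curve. A minimal sketch with made-up cumulative data (the numbers are illustrative only):

```python
import numpy as np

# Hypothetical cumulative count of particles that have hit the trap by each step
cumulative = np.array([0, 0, 1, 3, 6, 8, 9, 9, 10])

# np.diff recovers the number of NEW hits at each step -- the quantity to plot
per_step = np.diff(cumulative, prepend=0)
print(per_step)  # -> [0 0 1 2 3 2 1 0 1]
```

Note that per_step sums back to the final cumulative value, as it must.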

Only a very simple adjustment to the second loop is required. Change this:

if x[i, j] == -10:
    fp[i, j] = fp[i, j - 1] + 1
else:
    fp[i, j] = fp[i, j - 1]

to this:

fp[i, j] = int(x[i, j] == -10)

This works because bool is a subclass of int, and you only want a 1 or 0 recorded at each step. It's equivalent to removing fp[i, j - 1] from the right-hand side of both branches of the if statement's assignments.
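A quick demonstration of that boolean-to-int behavior:

```python
import numpy as np

x = np.array([-9, -10, -10, -3])
hits = (x == -10).astype(int)  # booleans become 0/1 integers
print(hits)  # -> [0 1 1 0]

# Plain Python bool is a subclass of int, so the same holds outside NumPy
assert int(True) == 1 and int(False) == 0
```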

The plot is as follows:

[Plot: raw per-step first-passage counts]

This looks odd, but hopefully you can already see a glimmer of the plot you're looking for. The low density of particles crossing the trap at any given step is what makes it noisy. You can improve the appearance by increasing the particle count or by smoothing the curve with a running average, for example.

Let's start with the smoothing method using np.convolve:

raw = fp.sum(0)  # per-step hit counts summed over all simulations
y = np.convolve(raw, np.full(11, 1/11), 'same')
z = np.convolve(raw, np.full(101, 1/101), 'same')

plt.plot(steps, raw)
plt.plot(steps, y)
plt.plot(steps, z)
plt.legend(['Raw', 'Window Size 11', 'Window Size 101'])

[Plot: raw curve with window-11 and window-101 running averages]

Except for some normalization concerns, this is starting to resemble the curve you're looking for. Smoothing is useful for estimating the plot's shape, but it's not the best way to actually display the simulation. One issue you can see is that the averaging badly skews the values at the left end of the curve. You can alleviate this by changing how the window is interpreted or by using a different convolution kernel, but edge effects will always exist.
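For example, one way to reduce that left-edge skew (a sketch assuming a simple boxcar kernel, not something from the answer above) is to divide by the number of samples that actually fell inside each window rather than assuming a full window:

```python
import numpy as np

def smooth(signal, window=11):
    """Running average that corrects for partial windows at the edges."""
    kernel = np.ones(window)
    num = np.convolve(signal, kernel, mode='same')  # windowed sums
    # Effective number of samples contributing at each position
    den = np.convolve(np.ones_like(signal, dtype=float), kernel, mode='same')
    return num / den

data = np.full(5, 5.0)
print(smooth(data, window=3))  # -> [5. 5. 5. 5. 5.] (no edge droop)
```

A plain `np.convolve(signal, np.full(window, 1/window), 'same')` would instead drag the endpoint values toward zero.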

You'll want to increase the number of samples to significantly improve the quality of your results. I would propose that you optimize your code a little before proceeding.

Optimization #1, as mentioned in the comments: the design of the trap lets you decouple the two directions, so you don't need to generate both x and y coordinates for this problem. Instead, the particle has a 1/5 chance of stepping in the -x direction, a 1/5 chance of stepping in the +x direction, and a 3/5 chance of leaving x unchanged.
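A loop-based sketch of that decoupled 1-D walk (same 1/5-1/5-3/5 step probabilities, tracking x only):

```python
import random

def first_passage_1d(step_count=1000, barrier=-10):
    """Track only x: -1 or +1 each with probability 1/5, unchanged with probability 3/5."""
    x = 0
    for t in range(1, step_count + 1):
        r = random.randint(1, 5)
        if r == 1:
            x -= 1
        elif r == 2:
            x += 1
        # r in {3, 4, 5}: a y-move or staying put; x is unchanged either way
        if x == barrier:
            return t  # first-passage time in steps
    return None  # never hit the barrier within step_count steps

times = [first_passage_1d() for _ in range(1000)]
hit_times = [t for t in times if t is not None]
```

This is still slow for large particle counts, which is what the vectorized version below addresses.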

The second optimization is solely for speed. You can do everything in a purely vectorized manner rather than running multiple for loops. I'll also provide an example of the new RNG API, which I've found to be far faster than the legacy API.

The third optimization is to improve readability. Without extensive documentation, names like n_simulations, n, and fp are not very informative. To make the code self-documenting, I'll rename a few things in the example below:

particle_count = 1000000
step_count = 1000

# Draws land in {-1, 0, 1, 2, 3}: -1 // 3 == -1 and 3 // 3 == 1, while the rest floor-divide to 0
random_walk = np.random.default_rng().integers(-1, 3, endpoint=True, size=(step_count, particle_count), dtype=np.int16)
random_walk //= 3  # do the division in-place for efficiency
random_walk.cumsum(axis=0, out=random_walk)  # integrate the steps in place

This snippet generates random_walk as a sequence of steps, using the floor-division trick to ensure that the probabilities of stepping in each direction are exactly 1/5. cumsum then integrates the steps in place.
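You can check the mapping for all five equally likely draws directly:

```python
import numpy as np

draws = np.arange(-1, 4)  # the five possible values from integers(-1, 3, endpoint=True)
print(draws)       # -> [-1  0  1  2  3]
print(draws // 3)  # -> [-1  0  0  0  1]
```

So after the division, -1 and +1 each occur with probability 1/5 and 0 with probability 3/5, matching the x-behavior of the 2D walk.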

Using masking, it's simple to locate the step at which each walk first reaches -10:

steps = (random_walk == -10).argmax(axis=0)

argmax returns the index of the first occurrence of the maximum. Because the array (random_walk == -10) is made up of booleans, this gives the index of the first occurrence of -10 in each column. Particles that never reach -10 within step_count steps have all-False columns, so argmax returns 0 for them. This is simple to filter out, because 0 is never a valid number of steps.
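Putting the generation, masking, and filtering together in one self-contained sketch (the seed and particle count here are arbitrary choices for reproducibility):

```python
import numpy as np

step_count, particle_count = 1000, 10_000
rng = np.random.default_rng(0)

walk = rng.integers(-1, 3, endpoint=True, size=(step_count, particle_count), dtype=np.int16)
walk //= 3                   # map draws to -1, 0, +1 with probabilities 1/5, 3/5, 1/5
walk = walk.cumsum(axis=0)   # positions after each step

steps_to_hit = (walk == -10).argmax(axis=0)  # row index of the first -10, or 0 if never hit
hit = steps_to_hit > 0                       # index 0 can only mean "never reached -10"
first_passage = steps_to_hit[hit]            # valid first-passage indices only
```

The smallest possible value in first_passage is 9, since row index 9 corresponds to the tenth step, the earliest a walk can reach -10.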

A histogram showing the number of steps will provide you with the information you require. The fastest approach to compute a histogram with integer data is to use np.bincount:

histogram = np.bincount(steps)
plt.plot(np.arange(2, histogram.size + 1), histogram[1:] / particle_count)

The first element of the histogram is the number of particles that never made it to -10 within step_count steps, since argmax returns 0 for those columns; were it not for that, the first nine entries would always be zero (a walk can't reach -10 in fewer than ten steps). Because histogram[0] nominally shows the count after one step, the display range is shifted by one.

[Plot: first-passage time histogram]

answered Apr 5 by Dev
• 6,000 points
