Error in fsolve in R when trying to fill a matrix with values; looking for an explanation and/or solution - r

Solution Found
I am trying to plot a volatility surface using "persp" in R. To do so I need to fill a matrix, z, with implied volatilities.
I have a data frame of the strike prices, time and market prices. Data only contains call options.
AAPL <- #data
df <- data.frame(AAPL$Strike.Price,AAPL$Time.Left,AAPL$Market.Price)
I currently have a matrix, zz, that has stock prices in the first column, times as the headers and the respective market prices in columns 2, 3 and 4. It is important to note that some values of the market prices are missing (NA).
zz <- cast(df, df.Strike.Price ~ df.Time.Left)
For my x and y axes, I define the vectors:
x0 <- zz$df.Strike.Price #Strike prices for calculation of imp. vol.
x <- zz$df.Strike.Price / 153.06 #Axis for plotting
y <- c(time1, time2, time3)
Now for the z matrix of implied volatilities for plotting. I start with an empty matrix:
z = matrix(data=NA,nrow=length(x0),ncol=length(y))
Then I attempt to fill the matrix, leaving NA for values that cannot be calculated:
for(i in 1:length(x0)){
  for(j in 1:length(y)){
    #Formula for the Black-Scholes call option price (no dividends)
    BS = function(X, T, sigma){
      #Parameters: stock value is the same for all options, r is an arbitrarily selected constant
      S = 153.06; r = 0.05
      d1 = (log(S/X) + (r + sigma^2/2)*T) / (sigma*sqrt(T))
      d2 = d1 - sigma*sqrt(T)
      #Price for call options
      price = S*pnorm(d1) - X*exp(-r*T)*pnorm(d2)
      return(price)
    }
    #To address NA entries in zz
    if(is.na(zz[i,j+1])){
      z[i,j] = NA
    }
    #This is the part of the code that causes issues
    else{
      #Function for fsolve: the Black-Scholes price minus the market price
      A = function(sigma){
        a = BS(x0[i], y[j], sigma) - zz[i,j+1]
        return(a)
      }
      V = fsolve(A, 0.5) #Should give the implied volatility from the market data
      z[i,j] = V
    }
  }
}
Upon executing this piece of code I get the error message:
Error in if (norm(s, "F") < tol || norm(as.matrix(ynew), "F") < tol) break :
missing value where TRUE/FALSE needed
I'm not sure what the error is about. Is there a way to overcome this problem or an alternative method to getting the implied volatilities instead of using fsolve?

The error has to do with the changes in sigma becoming too small for fsolve. I was able to find another function that can solve non-linear equations and used that instead.
The function was nleqslv, from the package of the same name, nleqslv.
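As a minimal sketch of the substitution inside the loop (assuming the same objective function A defined above; nleqslv() returns its root in the x component of the result list):
library(nleqslv)
#Drop-in replacement for the fsolve call, keeping 0.5 as the starting guess
sol = nleqslv(0.5, A)
z[i,j] = sol$x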

Related

Stochastic parameter estimation

I have made a stochastic pandemic simulator which takes the probabilities of an infection, a recovery, or neither, and uses a Gillespie algorithm with vectors to determine the number of people in each category at each time. I want to carry out a simulation study and use maximum likelihood estimation to get parameter estimates for my simulations. It worked perfectly for the SI model, but in this model I get the following error messages that I cannot understand. When I run just the function MLE I get scalars, and I can even produce the vector J. But when I try to use optim it tells me that the function PL isn't a scalar when I know it is. Any help would be greatly appreciated, thanks.
#SIR 100 DAYS WITH 10 INTERVALS A DAY
T<-100 #Setting the number of intervals
dt<-0.01 #Setting the interval lengths
B<-1.5 #Setting Beta
N<-50 #Setting population size
Y<-0.5 #Setting recovery rate
r<-function(i){runif(1,0,1)} #Random number generator
S<-c(1:T)
I<-c(1:T)
R<-c(1:T)
I1<-c(1:T)
I2<-c(1:T)
I3<-c(1:T)
It<-c(1:T)
Time<-c(1:T)
I[1]<-1
S[1]<-N-I[1]
R[1]<-0
It[1]<-I[1]
P1<-function(t){(B)*(I[t])*(S[t])*(dt)*(1/N)} #Creates first event interval (infection)
P2<-function(t){(Y)*(I[t])*(dt)+(B)*(I[t])*(S[t])*(dt)*(1/N)} #Creates second event interval (recovery)
P3<-function(t){1} #Creates third event interval (no transition)
PI1<-function(t){(I1[t])/I[t]} #Creates interval for recovery from first group
PI2<-function(t){((I1[t])/I[t])+((I2[t])/I[t])} #Creates interval for recovery from second group
PI3<-function(t){1} #Creates interval for recovery from third group
for(i in 2:T){
  x <- r(i)
  if(x < P1(i-1)){ #If an infection occurs
    S[i] <- S[i-1]-1
    I[i] <- I[i-1]+1
    R[i] <- R[i-1]
    It[i] <- It[i-1]+1
  }
  else if(x < P2(i-1)){ #If a recovery occurs
    S[i] <- S[i-1]
    I[i] <- I[i-1]-1
    R[i] <- R[i-1]+1
    It[i] <- It[i-1]
  }
  else{ #If no transition occurs
    S[i] <- S[i-1]
    I[i] <- I[i-1]
    R[i] <- R[i-1]
    It[i] <- It[i-1]
  }
}
n<-c(1:T)
for(i in 1:T){
  n[i] <- S[i]+I[i]+R[i]
}
n
S
I
R
Data<-cbind.data.frame(Time,S,I,R,n,It) #Create a dataframe for ease of manipulations
Data$EventInfection<-0
Data$EventRecovery<-0
Data$EventNotransition<-0
#Event indicators to make the likelihood easier
for(i in 2:T){
  if(Data$It[i] > Data$It[i-1]){
    Data$EventInfection[i] <- 1
  } else if(Data$R[i] > Data$R[i-1]){
    Data$EventRecovery[i] <- 1
  } else{
    Data$EventNotransition[i] <- 1
  }
}
PL <- function(i,b,y){
  ((b*S[i]*I[i]*dt*(1/N))^Data$EventInfection[[i]]) *
    ((I[i]*(y)*dt)^Data$EventRecovery[[i]]) *
    ((1-(b*S[i]*I[i]*dt*(1/N))-((y)*I[i]*dt))^Data$EventNotransition[[i]])
}
MLE <- function(b,y){
  J <<- c(1:T)
  for(i in 1:T){
    J[i] <<- log(PL(i,b,y))
  }
  return(sum(J))
}
MLE(1,0.5)
optim(c(1,1), MLE, y=1)
Warning messages:
1: In J[i] <- log(PL(i, b, y)) :
number of items to replace is not a multiple of replacement length
2: In J[i] <- log(PL(i, b, y)) :
number of items to replace is not a multiple of replacement length
3: In J[i] <- log(PL(i, b, y)) :
number of items to replace is not a multiple of replacement length
4: In J[i] <- log(PL(i, b, y)) :
number of items to replace is not a multiple of replacement length
5: In J[i] <- log(PL(i, b, y)) :
number of items to replace is not a multiple of replacement length
MLE() takes two variables, yet you gave the optim() function three parameters. optim() optimizes only over its first argument, so essentially it expects b in your MLE function to be a vector with two slots. If you want to optimize b and y, for example, this will work.
MLE <- function(b){
  J <<- vector(length = T)
  for(i in 1:T){
    J[i] <<- log(PL(i, b[1], b[2]))
  }
  return(sum(J))
}
MLE(c(1, 0.5))
optim(c(1, 1), MLE)
Now b is b[1] and y is b[2]. I'm not sure if that's what you wanted to optimize, though.
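One caveat not in the original exchange: optim() minimizes by default, so to actually maximize the log-likelihood you would flip the sign of the objective via the fnscale control parameter:
#fnscale = -1 turns optim's minimization into maximization
optim(c(1, 1), MLE, control = list(fnscale = -1))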

Estimating an OLS model in R with million observations and thousands of variables

I am trying to estimate a big OLS regression with ~1 million observations and ~50,000 variables using biglm.
I am planning to run each estimation using chunks of approximately 100 observations each. I tested this strategy with a small sample and it worked fine.
However, with the real data I am getting an "Error: protect(): protection stack overflow" when trying to define the formula for the biglm function.
I've already tried:
starting R with --max-ppsize=50000
setting options(expressions = 50000)
but the error persists
I am working on Windows and using RStudio.
# create the sample data frame (In my true case, I simply select 100 lines from the original data that contains ~1,000,000 lines)
DF <- data.frame(matrix(nrow=100,ncol=50000))
DF[,] <- rnorm(100*50000)
colnames(DF) <- c("y", paste0("x", 1:49999))
# get names of covariates
my_xvars <- colnames(DF)[2:( ncol(DF) )]
# define the formula to be used in biglm
# HERE IS WHERE I GET THE ERROR :
my_f <- as.formula(paste("y~", paste(my_xvars, collapse = " + ")))
EDIT 1:
The ultimate goal of my exercise is to estimate the average effect of all 50,000 variables. Therefore, simplifying the model by selecting fewer variables is not the solution I am looking for.
The first bottleneck (I can't guarantee there won't be others) is in the construction of the formula. R can't construct a formula that long from text (the details are too ugly to explore right now). Below I show a hacked version of the biglm code that can take the model matrix X and response variable y directly, rather than using a formula to build them. However, the next bottleneck is that the internal function biglm:::bigqr.init(), which gets called inside biglm, tries to allocate a numeric vector of size choose(nc,2) = nc*(nc-1)/2, where nc is the number of columns. When I try with 50000 columns I get
Error: cannot allocate vector of size 9.3 Gb
(2.3 GB are required when nc is 25000.) The code below runs on my laptop when nc <- 10000.
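As a quick sanity check on those numbers (8 bytes per double):
nc <- 50000
8 * choose(nc, 2) / 2^30  # GB needed by bigqr.init's vector: ~9.3
nc <- 25000
8 * choose(nc, 2) / 2^30  # ~2.3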
I have a few caveats about this approach:
you won't be able to handle a problem with 50000 columns unless you have at least 10 GB of memory, because of the issue described above.
biglm:::update.biglm will have to be modified in an analogous way (this shouldn't be too hard)
I have no idea if the p>>n issue (which applies at the level of fitting the initial chunk) will bite you. When running my example below (with 10 rows, 10000 columns), all but 10 of the parameters are NA. I don't know if these NA values will contaminate the results so that successive updating fails. If so, I don't know if there's a way to work around the problem, or if it's fundamental (so that you would need nr>nc for at least the initial fit). (It would be straightforward to do some small experiments to see if there is a problem, but I've already spent too long on this ...)
don't forget that with this approach you have to explicitly add an intercept column to the model matrix (e.g. X <- cbind(1,X)) if you want one.
Example (first save the code at the bottom as my_biglm.R):
nr <- 10
nc <- 10000
DF <- data.frame(matrix(rnorm(nr*nc),nrow=nr))
respvars <- paste0("x", seq(nc-1))
names(DF) <- c("y", respvars)
# illustrate formula problem: fails somewhere in 15000 < nc < 20000
try(reformulate(respvars,response="y"))
source("my_biglm.R")
rr <- my_biglm(y=DF[,1],X=as.matrix(DF[,-1]))
my_biglm <- function (formula, data, weights = NULL, sandwich = FALSE,
                      y = NULL, X = NULL, off = 0) {
  if (!is.null(weights)) {
    if (!inherits(weights, "formula"))
      stop("`weights' must be a formula")
    w <- model.frame(weights, data)[[1]]
  } else w <- NULL
  if (is.null(X)) {
    tt <- terms(formula)
    mf <- model.frame(tt, data)
    if (is.null(off <- model.offset(mf)))
      off <- 0
    mm <- model.matrix(tt, mf)
    y <- model.response(mf) - off
  } else {
    ## model matrix specified directly
    if (is.null(y)) stop("both y and X must be specified")
    mm <- X
    tt <- NULL
  }
  qr <- biglm:::bigqr.init(NCOL(mm))
  qr <- biglm:::update.bigqr(qr, mm, y, w)
  rval <- list(call = sys.call(), qr = qr, assign = attr(mm, "assign"),
               terms = tt, n = NROW(mm), names = colnames(mm),
               weights = weights)
  if (sandwich) {
    p <- ncol(mm)
    n <- nrow(mm)
    xyqr <- biglm:::bigqr.init(p * (p + 1))
    xx <- matrix(nrow = n, ncol = p * (p + 1))
    xx[, 1:p] <- mm * y
    for (i in 1:p) xx[, p * i + (1:p)] <- mm * mm[, i]
    xyqr <- biglm:::update.bigqr(xyqr, xx, rep(0, n), w * w)
    rval$sandwich <- list(xy = xyqr)
  }
  rval$df.resid <- rval$n - length(qr$D)
  class(rval) <- "biglm"
  rval
}

r - How to translate this time series calculation into a raster calculation?

I'm trying to reproduce this vector (time series) calculation code:
gamma.parameters<- fitdistr(may_baseline_3months[may_baseline_3months>0],"gamma")
into a raster calculation code.
What this code does is fit a gamma distribution, by maximum likelihood estimation, to the vector (time series) may_baseline_3months.
What I want to do is calculate the same thing, but with a raster stack.
I tried doing this with the calc() function:
f1 <- function(x)
{
  library(MASS)
  return(fitdistr(x,"gamma"))
}
gamma.parameters<- calc(x = may_baseline_3months,fun = f1)
Error in .calcTest(x[1:5], fun, na.rm, forcefun, forceapply) :
cannot use this function
but it didn't work.
Note: my raster stack has only 4 layers.
EDIT
You can download example data here: spi
The fitdistr call is part of the procedure for my main goal: I'm trying to calculate the Standardized Precipitation Index. I have already done it for a 30-year time series of monthly precipitation.
Here is the time-series code up to the line where I'm stuck:
data<-read.csv("guatemala_spi.csv",header = T,sep=";")
dates<-data[,1]
rain_1month<-data[,2]
rain_3months<-0
#Setting the first 2 elements to NA because I'm going to accumulate the rainfall over 3 months
for (i in c(1:2)) {
  rain_3months[i] <- NA
}
#Accumulating the rainfall for the rest of the data
number_of_months<-length(rain_1month)
for (j in c(3:number_of_months))
{
  rain_3months[j] <- 0.0
  for (i in c(0:2))
  {
    rain_3months[j] = rain_3months[j] + rain_1month[j-i]
  }
}
#Extracting a time-series for the month of interest (May)
may_rain_3months<-rain_3months[substr(dates,5,6)=="05"]
dates_may<-dates[substr(dates,5,6)=="05"]
number_of_years<-length(dates_may)
#Fitting the gamma distribution by maximum likelihood estimation
start_year<-1971
end_year<-2010
start_index<-which(substr(dates_may,1,4)==start_year)
end_index<-which(substr(dates_may,1,4)==end_year)
may_baseline_3months<-may_rain_3months[start_index:end_index]
library(MASS)
gamma.parameters<-fitdistr(may_baseline_3months[may_baseline_3months>0],"gamma")
That last line is the one that I'm having trouble computing for a raster stack.
Here's what I have so far in raster form:
Example multi-layer raster here (Monthly precipitation 2001 to 2004, 48 layers in total)
#Initiating a dates vector
dates<-c("200101","200102","200103","200104","200105","200106","200107","200108","200109","200110","200111","200112",
"200201","200202","200203","200204","200205","200206","200207","200208","200209","200210","200211","200212",
"200301","200302","200303","200304","200305","200306","200307","200308","200309","200310","200311","200312",
"200401","200402","200403","200404","200405","200406","200407","200408","200409","200410","200411","200412")
#Initiating a NA raster
rain_3months_1layer<-raster(nrow=1600, ncol=1673,extent(-118.4539, -34.80395, -50, 30),res=c(0.05,0.05))
values(rain_3months_1layer)<-NA
#Creating a raster stack NA of 48 layers
rain_3months<-stack(mget(rep( "rain_3months_1layer" , 48 )))
#Reading the data
rain_1month <- stack("chirps_rain_1month.tif")
#Accumulating the rainfall
number_of_months<-nlayers(rain_1month)
for (j in c(3:number_of_months))
{
  rain_3months[[j]] <- 0.0
  for (i in c(0:2))
  {
    rain_3months[[j]] = rain_3months[[j]] + rain_1month[[j-i]]
  }
}
#Extracting the raster for the month of interest (May)
may_rain_3months<-stack(rain_3months[[which(substr(dates,5,6)=="05", arr.ind = T)]])
dates_may<-dates[substr(dates,5,6)=="05"]
number_of_years<-length(dates_may)
#Fitting the gamma distribution by maximum likelihood estimation
start_year<-2001
end_year<-2004
start_index<-which(substr(dates_may,1,4)==start_year)
end_index<-which(substr(dates_may,1,4)==end_year)
may_baseline_3months<-stack(may_rain_3months[[start_index:end_index]])
library(MASS)
f1 <- function(x)
{
  library(MASS)
  return(fitdistr(x,"gamma"))
}
gamma.parameters<- calc(x = may_baseline_3months,fun = f1)
I can't get calc() to apply fitdistr() to the raster stack.
You need to make a function that calc can use. Your function f1 returns an object of class fitdistr. The calc function does not know what to do with that:
library(MASS)
set.seed(0)
x <- runif(10)
f1 <- function(x) {
return(fitdistr(x,"gamma"))
}
a <- f1(x)
class(a)
# [1] "fitdistr"
a
# shape rate
# 4.401575 6.931571
# (1.898550) (3.167113)
You need a function that returns numbers. Like f2:
f2 <- function(x) {
  fitdistr(x,"gamma")$estimate
}
b <- f2(x)
class(b)
#[1] "numeric"
b
# shape rate
#4.401575 6.931571
Test f2 with calc:
library(raster)
#A small example raster to build a stack from (r was not defined in the original)
r <- raster(nrow=10, ncol=10)
s <- stack(lapply(1:12, function(i) setValues(r, runif(ncell(r)))))
out <- calc(s, f2)
I assume that this answers your question. I cannot be sure, because your question is way too complex. The first thing you need to do with a problem like this is to create a simple example, like I have done above.
Next question
Error in stats::optim(x = c(7, 7, 7, 7), par = list(shape = Inf, rate = Inf), : non-finite value supplied by optim.
That is a different issue: you are providing fitdistr with values it cannot deal with. You can add a try clause to skip over those. You could also identify which cells this happens in, and what the values are, to see if there is something else you should do.
f3 <- function(x) {
  x <- try(fitdistr(x,"gamma")$estimate, silent=TRUE)
  if (class(x) == 'try-error') { c(-9999, -9999) } else { x }
}
x[1] <- NA
f2(x)
#Error in fitdistr(x, "gamma") : 'x' contains missing or infinite values
f3(x)
#[1] -9999 -9999
Note that f3 must always return the same number of values; in this case, two. Here I use -9999 so that you can identify the affected cells. You can also use NA.
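For instance, a variant along the same lines that returns NA instead (a sketch, not from the original answer):
f4 <- function(x) {
  out <- try(fitdistr(x,"gamma")$estimate, silent=TRUE)
  #Keep the output length fixed at two, with NAs marking failed fits
  if (inherits(out, 'try-error')) c(NA_real_, NA_real_) else out
}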

Bandpass filter in R using fft

I have a time series z with sampling frequency fs = 12 (monthly data) and I would like to apply a bandpass filter, using the fft, between 10 and 15 months. This is how I would proceed:
y <- as.data.frame(fft(z))
y$freq <- ..
y$y <- ifelse(y$freq>= 1/10 & y$freq<= 1/15,y$y,0)
zz <- fft(y$y, inverse = TRUE)/length(z)
#plot zz in the time domain...
However, I don't know how to derive the frequencies of the fft, and I don't know how to plot zz in the time domain. Can someone help me?
I have a function that wraps fft() a bit:
function(y, samp.freq, ...){
  N <- length(y)
  fk <- fft(y)
  #Keep the positive-frequency bins (dropping the DC component) and scale the amplitudes
  fk <- 2 * fk[2:(N/2 + 1)] / N
  freq <- (1:length(fk)) * samp.freq / (2 * length(fk))
  return(data.frame(fur = fk, freq = freq))
}
y is the values of your signal, and samp.freq is its sampling frequency. The output is a data.frame with two columns: fur, the complex numbers we get after the fast Fourier transform (Mod(fur) is the amplitude and Arg(fur) the phase), and freq, the vector of corresponding frequencies.
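A quick usage sketch (get.spectrum is just a placeholder name I'm assigning the wrapper above to):
get.spectrum <- function(y, samp.freq, ...){
  N <- length(y)
  fk <- fft(y)
  fk <- 2 * fk[2:(N/2 + 1)] / N
  freq <- (1:length(fk)) * samp.freq / (2 * length(fk))
  return(data.frame(fur = fk, freq = freq))
}
z <- sin(2 * pi * (1:240) / 12)          #a cycle with a 12-sample period, sampled monthly
spec <- get.spectrum(z, samp.freq = 12)  #fs = 12 samples per year
spec$freq[which.max(Mod(spec$fur))]      #peak at 1 cycle per year, as expected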
But for frequency filtering I highly recommend using the signal package.
For example using Butterworth filter:
library('signal')
bf <- butter(2, c(low, high), type = "pass")
signal.filtered <- filtfilt(bf, signal.noisy)
In this case the interval should be defined as c(Low.freq, High.freq) * (2/samp.freq), where Low.freq and High.freq are the borders of the frequency interval. More information can be found in the package documentation and the Octave reference guide.
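Applied to the original question (monthly data, so samp.freq = 12 samples per year, and a passband between periods of 15 and 10 months), a sketch might look like this, with z being the series from the question:
library(signal)
samp.freq <- 12                #monthly data: 12 samples per year
Low.freq  <- 12/15             #a 15-month period, in cycles per year
High.freq <- 12/10             #a 10-month period, in cycles per year
bf <- butter(2, c(Low.freq, High.freq) * (2/samp.freq), type = "pass")
z.filtered <- filtfilt(bf, z)  #zero-phase filtering of the original series
plot(z.filtered, type = "l")   #the band-passed signal in the time domain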
Also, notice that with fft you can get only frequencies up to (sample frequency)/2.

How to simulate daily stock returns in R

I need to simulate a stock's daily returns. I am given that r = (P(t+1)-P(t))/P(t) is normally distributed with mean µ = 1% and sd σ = 5%. P(t) is the stock price at the end of day t. Simulate 100,000 instances of such daily returns.
Since I am a new R user, how do I set up t for this example? I am assuming P should be set up as:
P <- rnorm(100000, .01, .05)
r=(P(t+1)-P(t))/P(t)
You are getting it wrong: from what you wrote, the mean and the sd apply to the return, not to the price. I furthermore assume that the mean is quoted on an annual basis (a 1% rate of return from one day to the next is just... huge!) and that t moves along a range of 252 trading days per year.
With these hypotheses, you can get a series of daily returns in R with:
r = rnorm(100000, .01/252, .005)
Assuming the model you mentioned, you can get the series of prices P (containing 100001 elements; I will take P[1] = 100, change it to your own value if needed):
factor = 1 + r
temp = 100
P = c(100, sapply(1:100000, function(u){
  p = factor[u]*temp
  temp <<- p
  p
}))
Your configuration for the returns (mean = 0.01 and sd = 0.05) will however lead to an exploding stock price (an unrealistic model and parameters). Be careful to check that prod(factor) will not return Inf.
Here is the result for the first 1000 values of P, representing 4 years:
plot(1:1000, P[1:1000])
One of the classical models (which does not mean this model is realistic) assumes the observed log returns follow a normal distribution.
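For instance, a one-line sketch of that model (mu and sigma here are assumed daily log-return parameters, not values from the question):
mu <- 0.01/252; sigma <- 0.01
#One year of daily prices built from normally distributed log returns
P <- 100 * exp(cumsum(rnorm(252, mu, sigma)))
plot(P, type = "l")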
Hope this helps.
I see you already have an answer, and ColonelBeauvel might have more domain knowledge than I do (assuming this is business or finance homework). I approached it a bit differently and am going to post a commented transcript. His method uses the <<- operator, which is considered a somewhat suspect strategy in R, although I must admit it seems quite elegant in this application. I suspect my method will be a lot faster if you ever get into doing large-scale simulations.
Starting with your code:
P <- rnorm(100000, .01, .05)
# r=(P(t+1)-P(t))/P(t) definition, not R code
# inference: P_t+1 = r_t*P_t + P_t = P_t*(1+r_t)
# So, all future P's will be determined by P_1 and r_t
Since P_2 will be P_1*(1+r_1), P_3 will be P_1*(1+r_1)*(1+r_2), i.e. a continued product of the vector (1+r), for which there is a vectorized function.
P <- P_1*cumprod(1+r)
#Error: object 'P_1' not found
P_1 <- 100
P <- P_1*cumprod(1+r)
#Error: object 'r' not found
# So the random simulation should have been for `r`, not P
r <- rnorm(100000, .01, .05)
P <- P_1*cumprod(1+r)
plot(P)
#Error in plot.window(...) : infinite axis extents [GEPretty(-inf,inf,5)]
str(P)
This occurred because the cumulative product exceeded the limits of numerical capacity and was assigned Inf (infinity). Let's be a little more careful:
r <- rnorm(300, .01, .05)
P <- P_1*cumprod(1+r)
plot(P)
The strategy below iteratively updates the price at time t as 'temp' and multiplies it by a single factor at each step. It's likely to be a lot slower.
r = rnorm(100000, .01/252, .005)
factor = 1 + r
temp = 100
P = c(100, sapply(1:300, function(u){
  p = factor[u]*temp
  temp <<- p
  p
}))
> system.time( {r <- rnorm(10000, .01/250, .05)
+ P <- P_1*cumprod(1+r)
+ })
user system elapsed
0.001 0.000 0.002
> system.time({r = rnorm(10000, .01/252, .05)
+ factor = 1 + r
+ temp = 100
+ P = c(100, sapply(1:300, function(u){
+ p = factor[u]*temp
+ temp<<-p
+ p
+ }))})
user system elapsed
0.079 0.004 0.101
To simulate the log returns of a daily stock, use the following method:
Consider working with 256 days of daily stock return data.
Load the original data into R
Create another data.frame for simulating Log return.
Code:
logr <- data.frame(Date=gati$Date[1:255], Shareprice=gati$Adj.Close[1:255], LogReturn=log(gati$Adj.Close[1:255]/gati$Adj.Close[2:256]))
gati is the dataset; Date and Adj.Close are the variables.
Notice the offset [] indices.
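More generally, for a price vector sorted oldest to newest, the daily log returns can be computed without hard-coded indices (a generic sketch; price is a placeholder name):
log.returns <- function(price) log(price[-1] / price[-length(price)])
log.returns(c(100, 101, 99))
# [1]  0.00995033 -0.02000067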
P <- rnorm(100000, .01, .05)
r=(P(t+1)-P(t))/P(t)
The second line translates directly into:
r <- (P[-1] - P[-length(P)]) / P[-length(P)] # (1:5)[-1] gives 2:5
Stock returns are not normally distributed as Simple Returns ("R"), given their -1 lower bound per compounded period. However, Log Returns ("r") generally are. The below is adapted from #42's post above. There don't seem to be any #Rstats solutions for simulating from a Log Mean ("Expected Return") and Log Stdev ("Risk"), so I've included one here for those looking for "Monte Carlo Simulation using Log Expected Return and Log Standard Deviation"; these are normally distributed and have no lower bound at -1. Note: from this single example, it would require looping thousands of times to simulate a portfolio, i.e. stacking 100k plots like the one below and averaging a single slice to calculate a portfolio's average expected return at a chosen forward month. The below should give a good basis for doing so.
startPrice = 100
forwardPeriods = 12*10 # 10 years * 12 months with Month-over-Month E[r]
factor = exp(rnorm(forwardPeriods, .04, .10)) # Monthly Expected Ln Return = .04 and Expected Monthly Risk = .1
temp = startPrice
P = c(startPrice, sapply(1:forwardPeriods, function(u){
  p = factor[u]*temp
  temp <<- p
  p
}))
plot(P, type = "b", xlab = "Forward End of Month Prices", ylab = "Expected Price from Log E[r]", ylim = c(0,max(P)))
n <- length(P)
logRet <- log(P[-1]/P[-n])
# Notice, with many samples this nearly matches our initial log E[r] and stdev(r)
mean(logRet)
# [1] 0.04540838
sqrt(var(logRet))
# [1] 0.1055676
If tested with a negative log expected return, the price will not fall below zero. The other examples will return negative prices with negative expected returns. The code shared here can be tested to confirm that negative prices do not occur in the simulation.
min(P)
# [1] 100
max(P)
# [1] 23252.67
Horizontal axis is number of days, and vertical axis is price.
n_prices <- 1000
volatility <- 0.2
amplitude <- 10
chng <- amplitude * rnorm(n_prices, 0, volatility)
prices <- cumsum(chng)
plot(prices, type='l')
