Density in R plot - plot

I am trying to plot the density of the gamma distribution.
x<-seq(0,10000,length.out = 1000)
plot(density(rgamma(1000,shape = 7,scale = 120)))
plot(dgamma(x,shape=7,scale=120),col="red")
But I don't understand why the two plots are totally different.

Since you didn't supply x in the final call, the x coordinates defaulted to the indices 1, 2, 3, ..., 1000 of the vector dgamma(x,shape=7,scale=120) rather than the intended values of x (roughly 0, 10, 20, ...).
If you do:
x<-seq(0,10000,length.out = 1000)
plot(density(rgamma(1000,shape = 7,scale = 120)))
points(x,dgamma(x,shape=7,scale=120),type = "l", col="red")
Then the theoretical density (red) is drawn on the same axes as the kernel density estimate.
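To see the effect of the missing x argument numerically, a quick sketch with the same parameters compares where the peak lands on the index scale versus the x scale. It relies on the mode of a gamma distribution with shape > 1 being (shape - 1) * scale; the variable names are illustrative.

```r
x <- seq(0, 10000, length.out = 1000)
d <- dgamma(x, shape = 7, scale = 120)

# plot(d) puts the index 1..1000 on the horizontal axis
peak_index <- which.max(d)

# plot(x, d) puts the actual x values on the horizontal axis
peak_x <- x[peak_index]

# Mode of a gamma distribution with shape > 1
theoretical_mode <- (7 - 1) * 120

print(c(index = peak_index, x = peak_x, mode = theoretical_mode))
```

The peak position on the x scale should sit within one grid step of the theoretical mode, while the index is smaller by roughly a factor of ten, which is why the two plots did not match.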

Related

How to plot theoretical Pareto distribution in R?

I need to plot theoretical Pareto distribution in R.
I want this as a line - not points and not polylines.
My distribution function is 1−(1/x)^2.
I plotted the empirical distribution of my sample and the theoretical distribution on one graph:
ecdf(b2)
plot(ecdf(b2))
lines(x, (1-(1/x)^2), col = "red", lwd = 2, xlab = "", ylab = "")
But I got:
You can see that the red line is not smooth; it looks like a polyline. Is it possible to get a continuous red line?
Do you have any advice?
Use curve() instead.
library(EnvStats)
set.seed(8675309)
# You did not supply the contents of b2 so I generated some
b2 <- rpareto(100, 1, 2)
plot(ecdf(b2))
ppareto <- function(x) 1 - (1/x)^2
curve(ppareto, col = "red", add = TRUE)
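One likely cause of the jagged line is a coarse (or unsorted) x vector; curve() avoids this by evaluating the function on an evenly spaced grid (101 points by default) over the plotted range. The following base-R sketch shows that lines() also looks smooth on a dense grid. Since b2 was not supplied, the sample here is simulated by inverse-transform sampling, and pareto_cdf and the grid size are illustrative choices.

```r
set.seed(8675309)

# CDF from the question: F(x) = 1 - (1/x)^2 for x >= 1, and 0 below 1
pareto_cdf <- function(x) ifelse(x < 1, 0, 1 - (1 / x)^2)

# Simulated sample via the inverse CDF: F^{-1}(u) = (1 - u)^(-1/2)
b2 <- (1 - runif(100))^(-1 / 2)

plot(ecdf(b2))

# A dense, sorted grid makes the line look smooth
grid <- seq(1, max(b2), length.out = 500)
lines(grid, pareto_cdf(grid), col = "red", lwd = 2)
```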

Histogram overlay not visible

I need to overlay a normal distribution curve based on a dataset on a histogram of the same dataset.
I get the histogram and the normal curve right individually. But the curve just stays a flat line when combined with the histogram using the add = TRUE argument of the curve function.
I tried adjusting xlim and ylim but am not getting the intended results. I am confused about how to set the x and y limits to suit both the histogram and the curve.
Any suggestions? My dataset is a set of daily walk distances for 100 individuals, ranging from min = 0.4 km to max = 10 km.
bd.m <- read_excel('walking.xlsx')
hist(bd.m, ylim = c(0,10))
curve(dnorm(x, mean = mean(bd.m), sd = sd(bd.m)), add = TRUE, col = 'red')
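You need to set freq = FALSE in the call to hist. By default the histogram shows counts, while dnorm returns densities, so the curve sits near zero on the count scale. For example:
dt <- rnorm(1000, 2)
hist(dt, freq = FALSE)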
curve(dnorm(x, mean = mean(dt), sd = sd(dt)), add = TRUE, col = 'red')
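The reason this works is that with freq = FALSE the bar areas sum to 1, which puts the histogram on the same scale as dnorm. A small check on simulated data (not the walking dataset) illustrates this and also shows one way to pick a ylim that fits both the bars and the curve; the variable names are illustrative.

```r
set.seed(1)
dt <- rnorm(1000, mean = 2)

# Compute the bins without drawing anything
h <- hist(dt, plot = FALSE)

total_count <- sum(h$counts)                    # frequency scale
total_area  <- sum(h$density * diff(h$breaks))  # density scale

# Height of the fitted normal curve at its peak
peak <- dnorm(mean(dt), mean = mean(dt), sd = sd(dt))

# Choose ylim so that neither the bars nor the curve is clipped
hist(dt, freq = FALSE, ylim = c(0, max(h$density, peak)))
curve(dnorm(x, mean = mean(dt), sd = sd(dt)), add = TRUE, col = "red")
```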

How to plot the Standard Normal CDF in R?

As the title says, I'm trying to plot the CDF of a N(0,1) distribution between some values a, b. I.e. Phi_0,1 (a) to Phi_0,1 (b). For some reason I'm having issues finding information on how to do this.
You can use curve to do the plotting; pnorm is the normal cumulative distribution function (CDF):
curve(pnorm, from = -5, to = 2)
Adjust the from and to values as needed. Use dnorm if you want the density function (PDF) instead of the CDF. See the ?curve help page for a few additional arguments.
Or using ggplot2:
library(ggplot2)
ggplot(data.frame(x = c(-5, 2)), aes(x = x)) +
  stat_function(fun = pnorm)
Generally, you can generate the data yourself and use almost any plot function capable of drawing lines in a coordinate system.
x = seq(from = -5, to = 2, length.out = 1000)
y = pnorm(x)
plot(x, y, type = "l")
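For the interval version asked about in the question, a short sketch plots the CDF on [a, b] and computes the probability mass between the endpoints. The values of a and b are assumptions chosen for illustration, not taken from the question.

```r
a <- -1
b <- 2

# CDF of N(0, 1) restricted to the interval [a, b]
curve(pnorm, from = a, to = b, xlab = "x", ylab = "Phi(x)")

# Mark the CDF values at the two endpoints
points(c(a, b), pnorm(c(a, b)), pch = 19)

# P(a < X <= b) = Phi(b) - Phi(a)
prob_ab <- pnorm(b) - pnorm(a)
print(prob_ab)
```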

How do you implement rgamma and dgamma in a single plot

For an assignment I was asked this:
For the values of
(shape=5,rate=1),(shape=50,rate=10),(shape=.5,rate=.1), plot the
histogram of a random sample of size 10000. Use a density rather than
a frequency histogram so that you can add in a line for the population
density (hint: you will use both rgamma and dgamma to make this plot).
Add an abline for the population and sample mean. Also, add a subtitle
that reports the population variance as well as the sample variance.
My current code looks like this:
library(ggplot2)
set.seed(1234)
x = seq(1, 1000)
s = 5
r = 1
plot(x, dgamma(x, shape = s, rate = r), rgamma(x, shape = s, rate = r), sub =
paste0("Shape = ", s, "Rate = ", r), type = "l", ylab = "Density", xlab = "", main =
"Gamma Distribution of N = 1000")
After running it I get this error:
Error in plot.window(...) : invalid 'xlim' value
What am I doing incorrectly?
plot() does not take y1 and y2 arguments. See ?plot. You need to do a plot (or histogram) of one y variable (e.g., from rgamma), then add the second y variable (e.g., from dgamma) using something like lines().
Here's one way to get what you want:
#specify parameters
s = 5
r = 1
# plot histogram of random draws
set.seed(1234)
N = 1000
hist(rgamma(N, shape=s, rate=r), breaks=100, freq=FALSE)
# add true density curve
x = seq(from=0, to=20, by=0.1)
lines(x=x, y=dgamma(x, shape=s, rate=r))
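The remaining parts of the assignment (mean lines and a variance subtitle) could be added along these lines. This is only a sketch: it relies on the gamma mean being shape / rate and the variance being shape / rate^2, and the colors, line types, and label formatting are arbitrary choices.

```r
s <- 5
r <- 1
N <- 10000

set.seed(1234)
draws <- rgamma(N, shape = s, rate = r)

pop_mean <- s / r     # population mean of a gamma(shape, rate)
pop_var  <- s / r^2   # population variance

# Density histogram with the variances reported in the subtitle
hist(draws, breaks = 100, freq = FALSE,
     main = paste0("Gamma(shape = ", s, ", rate = ", r, "), N = ", N),
     sub = sprintf("Population variance = %.3f, sample variance = %.3f",
                   pop_var, var(draws)))

# Population density over the observed range
x <- seq(0, max(draws), length.out = 500)
lines(x, dgamma(x, shape = s, rate = r), col = "red")

# Vertical lines at the population and sample means
abline(v = pop_mean, col = "red", lty = 2)
abline(v = mean(draws), col = "blue", lty = 3)
```

Changing s and r to the other two parameter pairs from the assignment reuses the same code.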

Surface plot Q in R - comparable to surf() in MATLAB

I want to plot a matrix of z values with x rows and y columns as a surface similar to this graph from MATLAB.
Surface plot:
Code to generate matrix:
# Parameters
shape<-1.849241
scale<-38.87986
x<-seq(from = -241.440, to = 241.440, by = 0.240)# 2013 length
y<-seq(from = -241.440, to = 241.440, by = 0.240)
matrix_fun<-matrix(data = 0, nrow = length(x), ncol = length(y))
# Generate two dimensional travel distance probability density function
for (i in 1:length(x)) {
  for (j in 1:length(y)) {
    dxy<-sqrt(x[i]^2+y[j]^2)
    prob<-1/(scale^(shape)*gamma(shape))*dxy^(shape-1)*exp(-(dxy/scale))
    matrix_fun[i,j]<-prob
  }
}
# Rescale 2-d pdf to sum to 1
a<-sum(matrix_fun)
matrix_scale<-matrix_fun/a
I am able to generate surface plots using a couple methods (persp(), persp3d(), surface3d()) but the colors aren't displaying the z values (the probabilities held within the matrix). The z values only seem to display as heights not as differentiated colors as in the MATLAB figure.
Example of graph code and graphs:
library(rgl)
persp3d(x=x, y=y, z=matrix_scale, color=rainbow(25, start=min(matrix_scale), end=max(matrix_scale)))
surface3d(x=x, y=y, z=matrix_scale, color=rainbow(25, start=min(matrix_scale), end=max(matrix_scale)))
persp(x=x, y=y, z=matrix_scale, theta=30, phi=30, col=rainbow(25, start=min(matrix_scale), end=max(matrix_scale)), border=NA)
Image of the last graph
Any other tips to recreate the image in R would be most appreciated (i.e. legend bar, axis tick marks, etc.)
Here's a ggplot solution that comes a little closer to the MATLAB plot:
# Parameters
shape<-1.849241
scale<-38.87986
x<-seq(from = -241.440, to = 241.440, by = 2.40)
y<-seq(from = -241.440, to = 241.440, by = 2.40)
df <- expand.grid(x=x,y=y)
df$dxy <- with(df,sqrt(x^2+y^2))
df$prob <- dgamma(df$dxy,shape=shape,scale=scale)
df$prob <- df$prob/sum(df$prob)
library(ggplot2)
library(colorRamps) # for matlab.like(...)
library(scales) # for labels=scientific
ggplot(df, aes(x,y))+
  geom_tile(aes(fill=prob))+
  scale_fill_gradientn(colours=matlab.like(10), labels=scientific)
BTW: You can generate your data frame of probabilities much more efficiently using the built-in dgamma(...) function, rather than calculating it yourself.
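To illustrate that point for the matrix form used in the question, the double loop can be replaced by outer() and dgamma(). The sketch below uses the coarser grid (by = 2.40) to keep it fast and compares the result with the hand-written formula from the question; the name manual is illustrative.

```r
shape <- 1.849241
scale <- 38.87986
x <- seq(from = -241.440, to = 241.440, by = 2.40)
y <- x

# Distance from the origin for every (x, y) pair
dxy <- outer(x, y, function(a, b) sqrt(a^2 + b^2))

# Vectorised gamma density, then rescale so the cells sum to 1
matrix_fun   <- matrix(dgamma(dxy, shape = shape, scale = scale),
                       nrow = length(x))
matrix_scale <- matrix_fun / sum(matrix_fun)

# The hand-written formula from the question, for comparison
manual <- 1 / (scale^shape * gamma(shape)) * dxy^(shape - 1) * exp(-dxy / scale)

print(all.equal(matrix_fun, manual))
```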
In line with alexis_laz's comment, here is an example using filled.contour. You might want to increase the by step in seq() to 2.40: the finer grid makes the plot much slower to generate without improving its quality.
filled.contour(x = x, y = y, z = matrix_scale, color.palette = terrain.colors)
# terrain.colors is in the base grDevices package
If you want something closer to your color scheme above, you can fiddle with the rainbow function:
filled.contour(x = x, y = y, z = matrix_scale,
               color.palette = function(n, ...) rep(rev(rainbow(n/2, ...)[1:9]), each = 3))
Finer granularity:
filled.contour(x = x, y = y, z = matrix_scale, nlevels = 150,
               color.palette = function(n, ...)
                 rev(rep(rainbow(50, start = 0, end = 0.75, ...), each = 3))[5:150])
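For the original persp() attempt, base R colors facets rather than grid points, so the col vector needs one entry per facet. One common approach, adapted from the examples in ?persp, is to average z over the four corners of each facet and map that value to a palette. This is a sketch on a coarser grid than the question's; the palette, the number of colors, and the variable names are arbitrary choices.

```r
shape <- 1.849241
scale <- 38.87986
x <- seq(from = -241.440, to = 241.440, by = 4.80)
y <- x

# Rescaled 2-d density on the grid
z <- outer(x, y, function(a, b) dgamma(sqrt(a^2 + b^2), shape = shape, scale = scale))
z <- z / sum(z)

nrz <- nrow(z)
ncz <- ncol(z)

# Mean height of the four corners of each facet: (nrz - 1) x (ncz - 1) values
zfacet <- (z[-1, -1] + z[-1, -ncz] + z[-nrz, -1] + z[-nrz, -ncz]) / 4

# Blue (low) to red (high) palette, one entry per facet chosen by height
n_colors  <- 50
palette_z <- rev(rainbow(n_colors, start = 0, end = 0.75))
facet_col <- palette_z[cut(zfacet, n_colors)]

persp(x, y, z, col = facet_col, border = NA, theta = 30, phi = 30,
      ticktype = "detailed")
```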
