What is the distribution of the sum of two independent random variables, each uniformly distributed on the range 0 to 1? Run an appropriate simulation, present the result on a graph, and comment on it.
My code:
x<-runif(10,0,1)
y<-runif(10,0,1)
z<-x+y
plot(density(x),col='blue')
par(new = TRUE)
plot(density(y),ann=FALSE, axes = FALSE,col='green')
par(new = TRUE)
plot(density(z),ann=FALSE, axes = FALSE,col='red')
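A minimal sketch of such a simulation, assuming a much larger sample (n = 1e5, chosen here only for illustration) and a single histogram on a density scale instead of three overlaid plots with separate axes. The sum of two independent Uniform(0, 1) variables has a triangular density on [0, 2] that peaks at 1; the helper name tri_density is hypothetical and introduced only for this example.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
n <- 1e5                         # assumed sample size, not from the question
x <- runif(n, 0, 1)
y <- runif(n, 0, 1)
z <- x + y

# Theoretical density of the sum: f(s) = 1 - |s - 1| on [0, 2], 0 elsewhere
tri_density <- function(s) ifelse(s >= 0 & s <= 2, 1 - abs(s - 1), 0)

# Histogram on a density scale with the theoretical curve on the same axes
hist(z, breaks = 50, freq = FALSE, col = "grey",
     main = "Sum of two Uniform(0, 1) variables", xlab = "z = x + y")
curve(tri_density, from = 0, to = 2, add = TRUE, col = "red", lwd = 2)
```

With only 10 draws, as in the code above, the density estimate is too noisy to show the triangular shape, which is why the sketch uses a larger n.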
Related
I want to plot anomalies in a dataset with a different color. For that I generated random numbers, injected anomalies based on a condition, and then plotted them. But the plot that I am getting is wrong. Following is the code:
n = 1000
a = 25
mu = 0
sigma = 0.5
data = rnorm(n,mu,sigma)
n_data = sample(1:n,25,replace = FALSE)
p_data = sample(1:n,25,replace = FALSE)
plot(data)
points(data[n_data],col=2)
points(data[p_data],col=3)
But this gives me a wrong plot. It should show anomalous points distributed across the whole graph, but it shows a plot like this.
How can I plot the points correctly based on index?
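Here you plot your vector data without specifying x, so R uses the index as x: x1 = 1, ..., xn = length(data). The same happens in points(data[n_data]), which places the 25 values at x = 1, ..., 25.
Just supply the corresponding x values and it will work: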
points(n_data, data[n_data],col=2)
points(p_data, data[p_data],col=3)
The problem is that you do not supply x coordinates for your random values, so the plot simply assigns each value an index and treats that as its x-value. You have a total of 1000 points, but each colored subset contains only 25, so those are drawn at indices 1 to 25. If you were to color 1000 points, they would be spread out just as much as the original data.
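A small sketch of the difference, using xy.coords() (the function base graphics uses to resolve coordinates) to show which x-values R assigns in each case. The seed and the variable names implicit and explicit are assumptions made for this illustration.

```r
set.seed(42)                               # arbitrary seed
n <- 1000
data <- rnorm(n, mean = 0, sd = 0.5)
n_data <- sample(1:n, 25, replace = FALSE)

# Without an explicit x, the 25 values are placed at x = 1, 2, ..., 25
implicit <- xy.coords(data[n_data])

# With x = n_data, each value keeps its original position in the series
explicit <- xy.coords(n_data, data[n_data])

plot(data)
points(n_data, data[n_data], col = 2, pch = 19)  # highlighted in place
```

Here implicit$x runs from 1 to 25 while explicit$x equals n_data, which is why the corrected points() call spreads the highlighted points across the whole plot.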
I'm trying to use the inverse cumulative distribution method to plot a histogram from the standard Cauchy distribution and I'm getting a strange plot that doesn't look like the textbook standard Cauchy. I think I have my inverse function correct (x = tan(pi*(u - 1/2))) so I would appreciate some help. Here is the R code that I have used:
n <- 10000
u <- runif(n)
c.samp <- sapply(u, function(u) tan(pi*(u - 1/2)))
hist(c.samp, breaks = 90, col = "blue",
main = "Hist of Cauchy")
The resulting plot just doesn't look correct:
Any help is appreciated, thank you.
The histogram and sampling technique are correct.
Compare the results with the following (which uses the R Cauchy sampling function).
c.samp2 <- rcauchy(n)
hist(c.samp2, breaks = 90, col = "blue",
main = "Hist of Cauchy 2")
The output here also looks incorrect, but it is not.
First, note that the x-axis range is chosen by default from the extreme values that you happen to encounter. As you probably know, the Cauchy distribution is extremely fat-tailed, so very large but rare values are expected. With 10000 samples from the Cauchy distribution, those few extreme observations stretch the axis and squeeze the bulk of the plot, yet they do not show up themselves because only very few observations fall into each bin at those extremes.
The default way hist chooses the bins is also poorly suited to distributions like the Cauchy. Try e.g.
hist(c.samp2, breaks = "FD", col = "blue",
main = "Hist of Cauchy 2",
xlim = c(-500, 500))
I suggest reading the help("hist") page carefully and playing around with the parameters to get a good and useful histogram.
By tweaking the chosen x-axis range, using a density scale on the y-axis, adding the theoretical distribution and a "rug", you get something more useful.
hist(c.samp, breaks = "FD", col = "blue",
main = "Hist of Cauchy distribution",
xlim = c(-50, 50),
freq = FALSE)
curve(dcauchy, add = TRUE, col = "red")
rug(c.samp)
Note that using c.samp or c.samp2 now hardly changes the plot.
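As a numerical cross-check of the claim that the sampling technique is correct, here is a short sketch comparing the inverse-CDF transform with R's built-in qcauchy() and comparing sample quartiles with the theoretical ones. The seed, the sample size of 1e5, and the names p, max_diff, sample_q and theory_q are assumptions made for this example.

```r
# The inverse-CDF transform should agree with R's quantile function
p <- seq(0.01, 0.99, by = 0.01)
max_diff <- max(abs(tan(pi * (p - 1/2)) - qcauchy(p)))

set.seed(123)                            # arbitrary seed
u <- runif(1e5)                          # assumed sample size
c.samp <- tan(pi * (u - 1/2))            # vectorised; sapply() is not needed

# Quartiles are robust to the heavy tails, unlike the mean or the range
sample_q <- quantile(c.samp, c(0.25, 0.5, 0.75))
theory_q <- qcauchy(c(0.25, 0.5, 0.75))  # quartiles of the standard Cauchy
```

Quantiles are used here rather than the mean because the Cauchy distribution has no finite mean, so a sample mean would not settle down no matter how many draws are taken.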
I want to plot a histogram of real data and compare it with a theoretical normal distribution in one plot. But the two plots end up on different scales.
# You can generate some random data for ystar, which is real data.
x<-seq(-4,4,length=200)
y<-dnorm(x,mean=0, sd=1)
plot(x,y, type = "l", lwd = 2, xlim = c(-3.5,3.5),ylim=c(0,0.7))
par(new = TRUE)
hist(ystar,xlim = c(-10,10),freq = FALSE,ylim=c(0,0.7),breaks = 50)
Desired output
Assuming that ystar is a vector, you should change this:
y<-dnorm(x,mean=0, sd=1)
To:
y<-dnorm(x,mean=mean(ystar), sd=sd(ystar))
This will produce a density curve that approximately matches the histogram.
You should then be able to use the same x-limits for both the histogram and the theoretical distribution, which will eliminate the strange overlapping axis labels you have in your current version.
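One way to guarantee shared axes, sketched below, is to draw the histogram first and add the density with curve(..., add = TRUE) rather than overlaying two plots with par(new = TRUE). The simulated ystar is a hypothetical stand-in for the real data, and the seed, sample size, and the name area are assumptions.

```r
set.seed(2024)                          # arbitrary seed
ystar <- rnorm(500, mean = 2, sd = 3)   # hypothetical stand-in for real data

# Histogram on a density scale; hist() returns the bin information invisibly
h <- hist(ystar, breaks = 50, freq = FALSE, col = "grey",
          main = "Histogram with fitted normal density")

# The curve is drawn on the histogram's own axes, so the scales match
curve(dnorm(x, mean = mean(ystar), sd = sd(ystar)),
      add = TRUE, lwd = 2, col = "red")

# With freq = FALSE the bar areas sum to 1, like the density curve
area <- sum(h$density * diff(h$breaks))
```

Because both layers use one coordinate system, there is no second set of axis labels to overlap.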
Could someone explain to me why I get different lines when I plot? Somehow I thought the lines should be the same.
library(sm) # assumption: the aircraft data set comes from the sm package
data(aircraft)
help(aircraft)
attach(aircraft)
lgWeight <- log(Weight)
library(KernSmooth)
# a) Fit a nonparametric regression to data (x_i, y_i) and save the estimated values m_hat(x_i).
# Local polynomial regression (degree 2) of lgWeight against Yr
op <- par(mfrow=c(2,1))
lpr1 <- locpoly(Yr,lgWeight, bandwidth=7, degree = 2, gridsize = length(Yr))
plot(Yr,lgWeight,col="grey", ylab="Log(Weight)", xlab = "Year")
lines(lpr1,lwd=2, col="blue")
lines(lpr1$y, col="black")
How can I get the values from the model? If I print the model, it gives me the values in $x and $y, but somehow if I plot them, the result is not the same as the blue line. I need the values of the fitted model (blue) for every x. Could someone help me?
The fitted model (blue curve) is stored correctly in lpr1. As you said, the correct y-values are in lpr1$y and the correct x-values are in lpr1$x.
The second plot looks like a straight line because you are giving the lines function only one variable, lpr1$y. Since you don't specify the x-coordinates, R automatically plots the values against their index, from 1 to the length of the y variable.
The following are explicit equivalents of your two lines() calls:
lines(x = lpr1$x, y = lpr1$y,lwd=2, col="blue") # plots curve
lines(x = 1:length(lpr1$y), y = lpr1$y, col="black") # plots the index-based line
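Note that locpoly() returns the fit on an equally spaced grid over the range of x, not at the observed x-values themselves. One possible way to get a fitted value for every observation is to interpolate the grid with approx(). The sketch below uses simulated data rather than the aircraft data; the seed, the bandwidth, and the name fitted_at_x are assumptions made for illustration.

```r
library(KernSmooth)

set.seed(10)                             # arbitrary seed
x <- sort(runif(200, 0, 10))             # simulated predictor (assumption)
y <- sin(x) + rnorm(200, sd = 0.3)       # simulated response (assumption)

# Local quadratic fit, returned on an equally spaced grid over range(x)
fit <- locpoly(x, y, degree = 2, bandwidth = 1)

# Linear interpolation of the gridded fit at each observed x
fitted_at_x <- approx(fit$x, fit$y, xout = x, rule = 2)$y

plot(x, y, col = "grey")
lines(fit, lwd = 2, col = "blue")              # gridded fit
points(x, fitted_at_x, pch = 20, cex = 0.5)    # fit evaluated at the data
```

The default grid of 401 points is dense enough that linear interpolation should add little error for a smooth fit, but that is a judgment call rather than a guarantee.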
So I have a barplot in which the y-axis is log(frequency). From just eyeing it, it appears that the bars decrease exponentially, but I would like to know this for sure. What I want to do is also plot an exponential on this same graph. Thus, if my bars fall below the exponential, I would know that my bars decrease either exponentially or faster than exponentially, and if the bars lie on top of the exponential, I would know that they don't decrease exponentially. How do I plot an exponential on a bar graph?
Here is my graph if that helps:
If you're trying to fit the density of an exponential distribution, you should probably plot a density histogram (not a frequency histogram). See this question on how to plot distributions in R.
This is how I would do it.
x.gen <- rexp(1000, rate = 3)
hist(x.gen, prob = TRUE)
library(MASS)
x.est <- fitdistr(x.gen, "exponential")$estimate
curve(dexp(x, rate = x.est), add = TRUE, col = "red", lwd = 2)
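Since the question's y-axis is log(frequency), another check is possible: exponential decay appears as a straight line on a log scale. The sketch below is one possible approach under assumptions (simulated data, 30 bins, and a cutoff of 10 counts to drop noisy tail bins); mids, logc and slope are names introduced here.

```r
set.seed(7)                               # arbitrary seed
x.gen <- rexp(5000, rate = 3)             # simulated data (assumption)

# Bin the data without plotting, keeping only well-populated bins
h <- hist(x.gen, breaks = 30, plot = FALSE)
keep <- h$counts >= 10                    # assumed cutoff for stable logs
mids <- h$mids[keep]
logc <- log(h$counts[keep])

# On a log scale, exponential decay is linear with slope about -rate
fit <- lm(logc ~ mids)
slope <- unname(coef(fit)[2])

plot(mids, logc, xlab = "x", ylab = "log(frequency)")
abline(fit, col = "red", lwd = 2)
```

If the points bend away from the line, the decay is faster or slower than exponential. The straight-line fit is only a visual aid, not a formal test.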
One way of visually inspecting whether two distributions are the same is with a Quantile-Quantile plot, or Q-Q plot for short. Typically this is done when checking whether data follow a normal distribution.
The basic idea is to plot your data against some theoretical quantiles; if the data match that distribution, you will see a straight line. For example:
x <- qnorm(seq(0, 1, length.out = 1002)) # Theoretical normal quantiles
x <- x[-c(1, length(x))] # Drop ends because they are -Inf and Inf
y <- rnorm(1000) # Actual data. 1000 points drawn from a normal distribution
l.1 <- lm(sort(y)~sort(x))
qqplot(x, y, xlab="Theoretical Quantiles", ylab="Actual Quantiles")
abline(coef(l.1)[1], coef(l.1)[2])
Under perfect conditions you should see a straight line when plotting the theoretical quantiles against your data. So you can do the same by plotting your data against the quantiles of the exponential distribution you think it will follow.
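A sketch of that idea for the exponential case, assuming simulated data and a rate estimated as 1 / mean(y) (the maximum-likelihood estimate). The seed and the names rate_hat, theo and qq_cor are assumptions for this example.

```r
set.seed(99)                              # arbitrary seed
y <- rexp(1000, rate = 3)                 # simulated data (assumption)

rate_hat <- 1 / mean(y)                   # ML estimate of the rate

# Theoretical exponential quantiles at evenly spread probabilities
theo <- qexp(ppoints(length(y)), rate = rate_hat)

qqplot(theo, y, xlab = "Theoretical Quantiles", ylab = "Actual Quantiles")
abline(0, 1, col = "red")                 # reference line y = x

# Correlation between theoretical and ordered sample quantiles
qq_cor <- cor(theo, sort(y))
```

Points close to the reference line suggest the exponential is a reasonable description. Deviations in the upper tail are common even for truly exponential data, because the largest order statistics are highly variable.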