R: simple plot with varying slope by value of x axis

I am new to R, hopefully someone can help me. I am trying to figure out how to plot a very simple graph (with plot()):
y-axis should be ay <- c(-1,1) and x-axis should be ax <- c(0:6). I have a vector v1<-c(0,-0.1,-0.3,-0.6,-0.2,-0.4, 0.2), that gives the slopes for each segment on the x-axis (i.e. between 0 and 1, the slope is -0.1, moving from 1 to 2, the slope is -0.3, and so on.).
I simply need to draw a line that goes from 0 to 6 on the x-axis with slopes between the segments given by v1.
Additionally, there should be a separate straight line with slope -0.5 starting from 0, i.e. abline(0,-0.5) in the same figure.
This should be something extremely simple, but I just can't get it right. Thanks in advance!

How about this one?
x <- 0:6
v1 <- c(0,-0.1,-0.3,-0.6,-0.2,-0.4, 0.2)
Note that the slopes are given by the difference in y-values over the difference in x-values. Since the difference in x-values is always 1 for each successive entry in v1, v1 essentially contains the differences in y-values. Taking the first entry in v1 to be y(x=0) (see note below), cumsum(v1) gives you the y-values you need to give to plot.
y <- cumsum(v1)
plot(x, y, type="l")
Note that there are seven slope-values in v1 but only six differences from 0 to 6. If you count, that is 0-1, 1-2, 2-3, 3-4, 4-5, and 5-6, which gives six differences. If the first entry is supposed to be a slope, then the range of x needs to be reconsidered.
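If the x-values were not evenly spaced, the same idea would still work by multiplying each slope by the width of its segment before accumulating. A minimal sketch, assuming the first entry of v1 is the starting value y(x=0) and the remaining six entries are the slopes (the names y0, slopes, and widths are mine, not from the question):

```r
x  <- 0:6
v1 <- c(0, -0.1, -0.3, -0.6, -0.2, -0.4, 0.2)

y0     <- v1[1]    # assumption: first entry is the starting value y(x=0)
slopes <- v1[-1]   # assumption: the remaining six entries are the slopes
widths <- diff(x)  # segment widths (all equal to 1 here)

# Each segment changes y by slope * width; accumulate the changes from y0
y <- y0 + c(0, cumsum(slopes * widths))

# Choose ylim so that both the curve and the reference line are visible
plot(x, y, type = "l", ylim = range(y, -0.5 * x))
abline(0, -0.5, lty = 2)  # straight line through the origin with slope -0.5
```

With unit spacing this reproduces cumsum(v1) exactly, so it is only a generalisation of the answer above, not a different method.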

I have not fully understood your problem. In my opinion the range on the x axis should be (0, 6).
Anyway, see the code below. Hope it can help you.
v1 <- c(-0.1,-0.3,-0.6,-0.2,-0.4, 0.2)
plot(0:6, 1+c(0,cumsum(v1)), type="o", ylim=c(-1,1))
abline(v=0:6, h=seq(-1,1,by=0.1), col="gray", lty=3)

Related

When plotting a curve in R, a piece of the curve gets cut off, not sure why

I am trying to plot this formula. As x approaches 0 from the right, y should be approaching infinity, and so my curve should be going upwards close to y-axis. Instead it gets cut off at y=23 or so.
my_formula = function(x){7.9*x^(-0.5)-1.3}
curve(my_formula,col="red",from=0 ,to=13, xlim=c(0,13),ylim=c(0,50),axes=T, xlab=NA, ylab=NA)
I tried to play with the from= parameter, and actually got what I needed when I put from=-4.8, but I have no idea why this works. In fact x doesn't get less than 0, and from/to should represent the range of x values, shouldn't they? If someone could explain it to me, this would be amazing! Thank you!
By default, curve only chooses 101 x-values within the (from, to) range, set by the default value of the n argument. In your case this means there aren't many values that are close enough to 0 to show the full behaviour of the function. Increasing the number of values that are plotted with something like n=500 helps:
curve(my_formula,col="red",from=0 ,to=13,
xlim=c(0,13),ylim=c(0,50),axes=T, xlab=NA, ylab=NA,
n=500)
This is due mainly to the fact that my_formula(0) is Inf.
Plotting with from=0, to=13 in curve means your first 2 values are by default (with 101 points, as @Marius notes):
# x
seq(0, 13, length.out=101)[1:2]
#[1] 0.00 0.13
# y
my_formula(seq(0, 13, length.out=101)[1:2])
#[1] Inf 20.61066
And R will not plot infinite values, so it cannot draw the line segment that joins the first point to the second one.
If you get as close to 0 on your x axis as is possible on your system, you can make this work a-okay. For instance:
curve(my_formula, col="red", xlim=c(0 + .Machine$double.eps, 13), ylim=c(0,50))
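The two answers can be combined. The sketch below first checks which sampled points are non-finite, then starts the curve at a small positive x with a denser grid. The starting value 1e-3 and n = 1001 are arbitrary choices of mine, not values from the question:

```r
my_formula <- function(x) 7.9 * x^(-0.5) - 1.3

# The default grid used by curve(): 101 evenly spaced points on [from, to]
xs <- seq(0, 13, length.out = 101)
ys <- my_formula(xs)

# Only the first point (x = 0) is non-finite, so the first segment is dropped
which(!is.finite(ys))

# Start slightly to the right of 0 and sample more densely near the y-axis
curve(my_formula, from = 1e-3, to = 13, n = 1001,
      xlim = c(0, 13), ylim = c(0, 50), col = "red",
      xlab = NA, ylab = NA)
```

Any small positive from value should do, as long as the function value there is already above the top of ylim.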

Shade area under a curve [duplicate]

I'm trying to shade an area under a curve in R. I can't quite get it right and I'm not sure why. The curve is defined by
# Define the Mean and Stdev
mean=1152
sd=84
# Create x and y to be plotted
# x is a sequence of numbers shifted to the mean with the width of sd.
# The sequence x includes enough values to show +/-3.5 standard deviations in the data set.
# y is a normal distribution for x
x <- seq(-3.5,3.5,length=100)*sd + mean
y <- dnorm(x,mean,sd)
The plot is
# Plot x vs. y as a line graph
plot(x, y, type="l")
The code I'm using to try to color under the curve where x >= 1250 is
polygon(c( x[x>=1250], max(x) ), c(y[x==max(x)], y[x>=1250] ), col="red")
but here's the result I'm getting
How can I correctly color the portion under the curve where x >= 1250?
You need to follow the x,y points of the curve with the polygon, then return along the x-axis (from the maximum x value to the point at x=1250, y=0) to complete the shape. The final vertical edge is drawn automatically, because polygon closes the shape by returning to its start point.
polygon(c(x[x>=1250], max(x), 1250), c(y[x>=1250], 0, 0), col="red")
If, rather than dropping the shading all the way down to the x-axis, you prefer to have it at the level of the curve, then you can use the following instead. In the example given, though, the curve drops almost to the x-axis, so it's hard to see the difference visually.
polygon(c(x[x>=1250], 1250), c(y[x>=1250], y[x==max(x)]), col="red")
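One detail worth noting: the grid x does not contain the value 1250 itself, so the left edge of the polygon above is slightly slanted. A minimal sketch that adds the cutoff as an explicit point, so the left edge is vertical (the names mu, sigma, cutoff, xs, and ys are mine; the question uses mean and sd, which shadow the base functions of the same name):

```r
mu    <- 1152
sigma <- 84
x <- seq(-3.5, 3.5, length = 100) * sigma + mu
y <- dnorm(x, mu, sigma)
plot(x, y, type = "l")

cutoff <- 1250
# Prepend the cutoff so the curve part of the polygon starts exactly there
xs <- c(cutoff, x[x >= cutoff])
ys <- dnorm(xs, mu, sigma)

# Follow the curve, then return along the x-axis; polygon() closes the shape
polygon(c(xs, max(xs), cutoff), c(ys, 0, 0), col = "red")
```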

Boxplot in R upper/lower whiskers

I am using the most basic form of boxplot, boxplot(x, ..., range = 1.5). If I don't set range and let R use its default value, as in boxplot(x, ...), which quantiles do the whiskers correspond to? I have outliers that are larger or smaller than the upper/lower whiskers. How can I find the exact percentage of outliers above or below the upper/lower whiskers? In other words, without setting range, what percentage of the data lies beyond the upper/lower whiskers?
For example, you could calculate the percentage of outliers as follows:
# Some data with outliers:
d <- rnorm(100)
d[sample(1:100, 10)] <- rnorm(10,mean = 0, sd = 10)
bp <- boxplot(d)
# Get the values of the outliers:
out <- bp$out
# The proportion of outliers:
length(out)/length(d)*100
9
Not entirely sure what your question is, but: ?boxplot says the default value of range is 1.5, and then it says
range: this determines how far the plot whiskers extend out from the
box. If ‘range’ is positive, the whiskers extend to the most
extreme data point which is no more than ‘range’ times the
interquartile range from the box. A value of zero causes the
whiskers to extend to the data extremes.
In other words, the whiskers are not defined as a proportion of the data, but as a multiple of the interquartile range.
If you want to know the proportions, you can use boxplot.stats:
set.seed(101)
x <- runif(100)
bb <- boxplot.stats(x)
c(mean(x<min(bb$stats)),mean(x>max(bb$stats)))
## [1] 0 0
mean(<logical value>) is a shortcut for computing a proportion. Because I have chosen the data from a uniform distribution, there are actually no points beyond the whiskers (confirmed by looking at boxplot(x)). If I were to redo this with rcauchy() there would be lots ...
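To make the "multiple of the interquartile range" rule concrete, the sketch below computes the whisker limits by hand and compares them with boxplot.stats. It uses heavy-tailed rcauchy() data so that some points do fall outside. The names hinges, fences, below, and above are mine, and the hinges come from fivenum(), which is what boxplot.stats uses internally (they can differ slightly from the quartiles returned by quantile()):

```r
set.seed(101)
x <- rcauchy(100)

# Lower and upper hinges (approximately the first and third quartiles)
hinges <- fivenum(x)[c(2, 4)]

# Points further than 1.5 * (hinge spread) from the box count as outliers
fences <- hinges + c(-1.5, 1.5) * diff(hinges)

below <- mean(x < fences[1])  # proportion below the lower whisker
above <- mean(x > fences[2])  # proportion above the upper whisker
c(below, above)

# Cross-check against the outliers reported by boxplot.stats()
bb <- boxplot.stats(x)
length(bb$out) == sum(x < fences[1] | x > fences[2])
```

The proportions depend entirely on the data, which is why there is no fixed percentile attached to the whiskers.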

represent the frequency line in a histogram using freq=TRUE

I have the following piece of code in R:
w=rbeta(365,1,3,ncp=0)
hist(10*w,breaks=25,freq=TRUE,xlim=c(0,10), ylim=c(0,60))
h=seq(0,1,0.05)
So far so good.
What I want to do now is to add a line representing the beta distribution with parameters alpha=1, beta=3 (as in the rbeta call I used), on the frequency scale rather than the density scale. The total number of elements in the rbeta sample is 365 (the days in a year), and I multiply w by 10 because the variable I am studying can take values in [0,10] each day, following the beta distribution described above.
What do I have to do to represent this line?
To summarize, the histogram is based on simulated values, and I want to show how the theoretical beta distribution would have behaved in comparison to the simulation.
If you want them to match up, you're going to want to match up the area under the curves of the histogram and the density plot. That should put them on the same scale. One way to do that would be
set.seed(15) #to make it reproducible
w <- rbeta(365, 1, 3, ncp=0)
hh <- hist(w*10, breaks=25, freq=TRUE, xlim=c(0,10), ylim=c(0,60))
ss <- sum(diff(hh$breaks)*hh$counts)
curve(dbeta(x/10, 1, 3, ncp=0)*ss/10, add=T)
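An alternative way to put the theoretical distribution on the frequency scale is to compute the expected count in each bin directly from the beta CDF. This is a sketch of my own, not part of the answer above; the names n and expected are assumptions introduced for illustration:

```r
set.seed(15)  # same seed as above, for reproducibility
n  <- 365
w  <- rbeta(n, 1, 3)
hh <- hist(w * 10, breaks = 25, freq = TRUE, xlim = c(0, 10), ylim = c(0, 60))

# Expected count per bin: n times the probability mass of the bin.
# The breaks are on the 0-10 scale, so divide by 10 before calling pbeta().
expected <- n * diff(pbeta(hh$breaks / 10, 1, 3))

# Overlay the expected counts at the bin midpoints
points(hh$mids, expected, pch = 19, col = "blue")
lines(hh$mids, expected, col = "blue")
```

For narrow bins this agrees closely with the scaled density curve, since the probability mass of a bin is approximately the density at its midpoint times the bin width.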

Statistical distribution plot with shaded rejection areas

I found this R code online:
stdDev <- 0.75;
x <- seq(-5,5,by=0.01)
y <- dnorm(x,sd=stdDev)
right <- qnorm(0.95,sd=stdDev)
plot(x,y,type="l",xaxt="n",ylab="p",
xlab=expression(paste('Assumed Distribution of ',bar(x))),
axes=FALSE,ylim=c(0,max(y)*1.05),xlim=c(min(x),max(x)),
frame.plot=FALSE)
axis(1,at=c(-5,right,0,5),
pos = c(0,0),
labels=c(expression(' '),expression(bar(x)[cr]),expression(mu[0]),expression('')))
axis(2)
xReject <- seq(right,5,by=0.01)
yReject <- dnorm(xReject,sd=stdDev)
polygon(c(xReject,xReject[length(xReject)],xReject[1]),
c(yReject,0, 0), col='red')
It is doing what I need, which is plotting the normal distribution, and shading a right rejection area according to some number (0.95). What I want to ask is:
How can I change this code to shade a two sided rejection area?
How do I change it for a left side one sided area?
And assume that I want a chi square or F distribution instead, is it enough to just change the dnorm & qnorm commands accordingly?
Another question: In this plot, the plot itself is higher than the y-axis. How do I fix it that the axis matches the height of the plot?
Thank you!
You can start with a polygon covering the whole area under the curve and removing the part that is not rejected:
## Calculate the 5th percentile
left <- qnorm(0.05,sd=stdDev)
## x and y for the whole area
xReject <- c(seq(-5,5,by=0.01))
yReject <- dnorm(xReject,sd=stdDev)
## set y = 0 for the area that is not rejected
yReject[xReject > left & xReject < right] <- 0
## Plot the red areas
polygon(c(xReject,xReject[length(xReject)],xReject[1]),
c(yReject,0, 0), col='red')
For a left one-sided area, proceed as before (recomputing xReject and yReject), but set everything to the right of the left critical value to zero:
yReject[xReject > left] <- 0
As for other distributions: almost. For the chi-squared distribution, for example, you have to supply df (the degrees of freedom) instead of sd, and xlim has to be changed as well. Apart from that, the code would be the same.
The line axis(2) draws the y-axis. You can give some extra arguments to have it the way you want. You can try for example:
s <- seq(0,0.55,0.05)
axis(2, at = s, labels = s)
Hope it helps,
alex
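To make the chi-squared remark concrete, here is a minimal sketch of a right-tail rejection area. The value df = 4 and the x-range 0 to 20 are arbitrary choices of mine for illustration, and xMax is a name I introduced:

```r
df   <- 4    # assumption: degrees of freedom chosen for illustration
xMax <- 20   # assumption: upper plotting limit chosen for illustration

x <- seq(0, xMax, by = 0.01)
y <- dchisq(x, df = df)
right <- qchisq(0.95, df = df)  # critical value for a 5% right tail

plot(x, y, type = "l", ylab = "p", xlim = c(0, xMax))

# Follow the density over the rejection region, then return along the x-axis
xReject <- seq(right, xMax, by = 0.01)
yReject <- dchisq(xReject, df = df)
polygon(c(xReject, xReject[length(xReject)], xReject[1]),
        c(yReject, 0, 0), col = "red")
```

The F distribution would follow the same pattern with df() and qf(), which take two degrees-of-freedom arguments, df1 and df2.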
Take the polygon calls which shade the right-side rejection area and repeat those lines, substituting the coordinates of the left-side area.
I think this will do it:
left <- qnorm(0.05,sd=stdDev)
xLeject <- seq(left,-5,by=-0.01)
yLeject <- dnorm(xLeject,sd=stdDev)
polygon(c(xLeject,xLeject[length(xLeject)],xLeject[1]),
c(yLeject,0, 0), col='red')
As to graph extent, see plot(..., ylim=c(lower,upper))
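Putting the two tails together, a two-sided version could look like the sketch below. The helper shade_tail is hypothetical (my own name), and I assume 2.5% in each tail for a total of 5%, whereas the code above puts 5% in each tail:

```r
stdDev <- 0.75
x <- seq(-5, 5, by = 0.01)
y <- dnorm(x, sd = stdDev)
plot(x, y, type = "l", ylab = "p", ylim = c(0, max(y) * 1.05))

# Hypothetical helper: shade the area under the density between from and to
shade_tail <- function(from, to) {
  xs <- seq(from, to, length.out = 200)
  polygon(c(xs, to, from), c(dnorm(xs, sd = stdDev), 0, 0), col = "red")
}

# Assumption: 2.5% in each tail, i.e. a 5% two-sided rejection region
crit <- qnorm(c(0.025, 0.975), sd = stdDev)
shade_tail(-5, crit[1])  # left rejection area
shade_tail(crit[2], 5)   # right rejection area
```

Calling only one of the two shade_tail lines gives the corresponding one-sided area.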

Resources