Creating shades between a cluster of line plots in R

I am currently working with a line plot in R that would contain a large number of separate plots (which does make it difficult to read). What I would like to do instead is to somehow create a light-colored shade that captures the range of individual line plots. Would that be possible? This is what I have for my plot:
plot(get,Hope2, type="l",col="red", lwd=3, xlab="Cumulative CO2
emissions (TtC)", ylab="One-day maximum precipitation (mm/day)",
main="One-day maximum precipitation for South Sudan for CanESM2
scenarios")
lines(get2.teratons, Hope3, type="l", lwd=3, col="red")
lines(get3.teratons, Hope4, type="l", lwd=3, col="red")
lines(get4.teratons, Hope5, type="l", lwd=3, col="red")
So, this gives 4 separate red lines on the same plot (I am likely going to place up to 10 lines, so you can imagine how cluttered that would look without shading them in the background!). Now, let's say that I wanted to create a light red shade that fills from the upper to lower red curves. How should I go about doing that? The idea would be to capture all of the line plots by doing something like this:
http://jvoigts.scripts.mit.edu/blog/assets/plot_shaded_pretty.png
Thanks, and I would greatly appreciate any help!
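One way this might be approached in base graphics is with polygon(): put every curve on a shared x grid, take the pointwise minimum and maximum across curves, and fill the area between them. The sketch below is only an illustration under assumptions: the data are made up (the real get.teratons/Hope vectors are not available), and the names xs, ys, grid, lower and upper are hypothetical.

```r
# Made-up stand-ins for the four emission/precipitation series;
# each curve has its own x values, as in the question.
set.seed(1)
xs <- lapply(1:4, function(i) sort(runif(50, 0, 2)))
ys <- lapply(1:4, function(i) 20 + 5 * xs[[i]] + rnorm(50, sd = 0.5) + i)

# Common x grid restricted to the range covered by every curve
grid <- seq(max(sapply(xs, min)), min(sapply(xs, max)), length.out = 200)

# Interpolate each curve onto the grid (one column per curve);
# rule = 2 guards against NA values at the grid endpoints
interp <- sapply(1:4, function(i) approx(xs[[i]], ys[[i]], xout = grid, rule = 2)$y)

lower <- apply(interp, 1, min)   # pointwise lower envelope
upper <- apply(interp, 1, max)   # pointwise upper envelope

# Empty plot first, then the shaded band, then the individual lines on top
plot(range(grid), range(unlist(ys)), type = "n",
     xlab = "Cumulative CO2 emissions (TtC)",
     ylab = "One-day maximum precipitation (mm/day)")
polygon(c(grid, rev(grid)), c(lower, rev(upper)),
        col = adjustcolor("red", alpha.f = 0.2), border = NA)
for (i in 1:4) lines(xs[[i]], ys[[i]], col = "red", lwd = 1)
```

If only the band is wanted, the final lines() loop can be dropped. The interpolation step is only needed because the curves do not share x values.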

Related

Contour plot via Scatter plot

Scatter plots are useless when the number of points is large.
So, e.g., using a normal approximation, we can get a contour plot instead.
My question: is there any package that produces a contour plot from a scatter plot?
Thank you @G5W!! I can do it!!
You don't offer any data, so I will respond with some artificial data,
constructed at the bottom of the post. You also don't say how much data
you have although you say it is a large number of points. I am illustrating
with 20000 points.
You used the group number as the plotting character to indicate the group.
I find that hard to read. But just plotting the points doesn't show the
groups well. Coloring each group a different color is a start, but does
not look very good.
plot(x,y, pch=20, col=rainbow(3)[group])
Two tricks that can make a lot of points more understandable are:
1. Make the points transparent. The dense places will appear darker. AND
2. Reduce the point size.
plot(x,y, pch=20, col=rainbow(3, alpha=0.1)[group], cex=0.8)
That looks somewhat better, but did not address your actual request.
Your sample picture seems to show confidence ellipses. You can get
those using the function dataEllipse from the car package.
library(car)
plot(x,y, pch=20, col=rainbow(3, alpha=0.1)[group], cex=0.8)
dataEllipse(x,y,factor(group), levels=c(0.70,0.85,0.95),
plot.points=FALSE, col=rainbow(3), group.labels=NA, center.pch=FALSE)
But if there are really a lot of points, the points can still overlap
so much that they are just confusing. You can also use dataEllipse
to create what is basically a 2D density plot without showing the points
at all. Just plot several ellipses of different sizes over each other filling
them with transparent colors. The center of the distribution will appear darker.
This can give an idea of the distribution for a very large number of points.
plot(x,y,pch=NA)
dataEllipse(x,y,factor(group), levels=c(seq(0.15,0.95,0.2), 0.995),
plot.points=FALSE, col=rainbow(3), group.labels=NA,
center.pch=FALSE, fill=TRUE, fill.alpha=0.15, lty=1, lwd=1)
You can get a more continuous look by plotting more ellipses and leaving out the border lines.
plot(x,y,pch=NA)
dataEllipse(x,y,factor(group), levels=seq(0.11,0.99,0.02),
plot.points=FALSE, col=rainbow(3), group.labels=NA,
center.pch=FALSE, fill=TRUE, fill.alpha=0.05, lty=0)
Please try different combinations of these to get a nice picture of your data.
Additional response to comment: Adding labels
Perhaps the most natural place to add group labels is the centers of the
ellipses. You can get that by simply computing the centroids of the points in each group. So for example,
plot(x,y,pch=NA)
dataEllipse(x,y,factor(group), levels=c(seq(0.15,0.95,0.2), 0.995),
plot.points=FALSE, col=rainbow(3), group.labels=NA,
center.pch=FALSE, fill=TRUE, fill.alpha=0.15, lty=1, lwd=1)
## Now add labels
for(i in unique(group)) {
text(mean(x[group==i]), mean(y[group==i]), labels=i)
}
Note that I just used the number as the group label, but if you have a more elaborate name, you can change labels=i to something like
labels=GroupNames[i].
Data
x = c(rnorm(2000,0,1), rnorm(7000,1,1), rnorm(11000,5,1))
twist = c(rep(0,2000),rep(-0.5,7000), rep(0.4,11000))
y = c(rnorm(2000,0,1), rnorm(7000,5,1), rnorm(11000,6,1)) + twist*x
group = c(rep(1,2000), rep(2,7000), rep(3,11000))
You can use hexbin::hexbin() to show very large datasets.
@G5W gave a nice dataset:
x = c(rnorm(2000,0,1), rnorm(7000,1,1), rnorm(11000,5,1))
twist = c(rep(0,2000),rep(-0.5,7000), rep(0.4,11000))
y = c(rnorm(2000,0,1), rnorm(7000,5,1), rnorm(11000,6,1)) + twist*x
group = c(rep(1,2000), rep(2,7000), rep(3,11000))
If you don't know the group information, then the ellipses are inappropriate; this is what I'd suggest:
library(hexbin)
plot(hexbin(x,y))
which produces a hexagonally binned plot, with each cell shaded by the number of points it contains.
If you really want contours, you'll need a density estimate to plot. The MASS::kde2d() function can produce one; see the examples in its help page for plotting a contour based on the result. This is what it gives for this dataset:
library(MASS)
contour(kde2d(x,y))
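As a rough illustration of combining the two ideas, the contours from kde2d() can be drawn over faint points. This is a sketch, not the only way to do it: the grid size n = 100, the number of levels and the colours are arbitrary choices, and the data are the artificial dataset from above with a fixed seed added.

```r
library(MASS)  # provides kde2d()

set.seed(42)
x <- c(rnorm(2000, 0, 1), rnorm(7000, 1, 1), rnorm(11000, 5, 1))
twist <- c(rep(0, 2000), rep(-0.5, 7000), rep(0.4, 11000))
y <- c(rnorm(2000, 0, 1), rnorm(7000, 5, 1), rnorm(11000, 6, 1)) + twist * x

# Kernel density estimate on a 100 x 100 grid (grid size is an arbitrary choice)
dens <- kde2d(x, y, n = 100)

# Faint, small points underneath; contour lines added on top
plot(x, y, pch = 20, cex = 0.5, col = adjustcolor("black", alpha.f = 0.05))
contour(dens, nlevels = 8, add = TRUE, col = "blue")
```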

Draw 3 curves together in the same graph and scale

I need to draw three curves together in the same graph with the same scale. I know how to draw two curves together, as the following code:
r=0.8
z=2
k=seq(0,5,by=0.1)
y1=(z^2+k*r)/(r*z+k)
y2=z*(z+k*r)/(r+k)
plot(k,y1,type='l',ylab=' ',col="red",ylim=range(c(y1,y2)))
par(new=TRUE)
plot(k,y2,type='l',col="green",ylim=range(c(y1,y2)))
It works fine, but I don't know how to add the third curve, in particular how to set ylim.
Any help is appreciated.
Use lines() to add curves to the existing plot:
r=0.8
z=2
k=seq(0,5,by=0.1)
y1=(z^2+k*r)/(r*z+k)
y2=z*(z+k*r)/(r+k)
y3=0.8*y2
ymin=min(c(y1,y2,y3))
ymax=max(c(y1,y2,y3))
plot(k,y1,type='l',ylab='lines',col="red",ylim=c(ymin,ymax))
# lines() draws on the existing plot, so par(new=TRUE) and ylim are not needed
lines(k,y2,col="green")
lines(k,y3)
Here is another example, with data you have provided as a comment
r=0.8
z=3
p=seq(0.1,5,by=0.1)
y1=(p*z^2+r*z)/(p*r*z+1)
y2=z*(p+r)/(p*r+1)
y3=(p^2*z*2-1+sqrt((p^2*z*2)^2+4*p^2*r*2*z*2))/(2*p*2*r*z)
ymin=-1
ymax=max(c(y1,y2,y3))
plot(p,y1,type='l',ylab='lines',col="red",ylim=c(ymin,ymax))
# lines() draws on the existing plot, so par(new=TRUE) and ylim are not needed
lines(p,y2,col="green")
lines(p,y3, col="blue")
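An alternative that avoids computing ylim by hand is matplot(), which plots each column of a matrix as its own curve and chooses the y range to cover all of them. A minimal sketch using the first example's curves (the matrix name curves, the third colour and the legend position are arbitrary choices):

```r
r <- 0.8
z <- 2
k <- seq(0, 5, by = 0.1)
y1 <- (z^2 + k * r) / (r * z + k)
y2 <- z * (z + k * r) / (r + k)
y3 <- 0.8 * y2

# One column per curve; matplot() picks a y range that covers all columns
curves <- cbind(y1, y2, y3)
cols <- c("red", "green", "black")
matplot(k, curves, type = "l", lty = 1, col = cols, ylab = "lines")
legend("topright", legend = colnames(curves), col = cols, lty = 1)
```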

Graphics using plot.mc

I need help with some of the graphics on a cumulative distribution plot using plot.mc
An example of what I am doing is as follows:
library(mc2d)
ndvar(10000)
test<-mcstoc(rbetagen,type="V",shape1=48, shape2=66.42, min=0, max=4000)
hist(test)
test<-mc(test)
plot(test)
This plots the cumulative distribution of "test".
What I want to do is draw a vertical line up from a specific point on the x-axis (say 1800) until it touches the cumulative distribution line, then draw a line across to the y-axis at the height corresponding to 1800 on the x-axis.
I have looked up help(plot.mc) and help(plot.stepfun), but neither seems to be able to do this.
Many thanks,
Timothy
See ?arrows
plot(test)
arrows(x0=0, y0=.745, x1=1800, y1=.745, code=0, lty=2)
arrows(x0=1800, y0=0, x1=1800, y1=.745, code=0, lty=2)
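The height .745 above is read off the plot by eye. It could instead be computed from the simulated values with ecdf(). The sketch below is an assumption-laden stand-in: it does not use mc2d at all, but draws a Beta(48, 66.42) sample rescaled to [0, 4000] with base R (the vector name sims is hypothetical), and plots a plain empirical CDF rather than the plot.mc output.

```r
set.seed(123)
# Stand-in for the simulated node: Beta(48, 66.42) rescaled to [0, 4000]
sims <- 4000 * rbeta(10000, shape1 = 48, shape2 = 66.42)

x0 <- 1800
p0 <- ecdf(sims)(x0)   # empirical P(X <= 1800)

plot(ecdf(sims), main = "Empirical CDF", xlab = "x", ylab = "Fn(x)")
segments(x0, 0, x0, p0, lty = 2)               # vertical: x-axis up to the curve
segments(par("usr")[1], p0, x0, p0, lty = 2)   # horizontal: curve across to the y-axis
```

With the real object, the simulated values would first have to be extracted from the mc node, and p0 would then replace the hard-coded .745 in the arrows() calls.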

abline and logarithmic x-axis gives horizontal regression line in plot

First of all, please download my data set from http://alexandervanloon.nl/survey_oss.csv and then execute the following content of a script to get a few scatter plots:
# read data and attach it
survey <- read.table("survey_oss.csv", header=TRUE)
attach(survey)
# plot for inhabitants
png("scatterINHABT.png")
plot(INHABT, OSSADP, xlab="Inhabitants", ylab="Adoption of OSS", las=1)
abline(lm(OSSADP~INHABT)) # regression line (y~x)
dev.off()
# plot for inhabitants divided by 1000
png("scatterINHABT_divided.png")
plot(INHABT/1000, OSSADP, xlab="Inhabitants", ylab="Adoption of OSS", las=1)
abline(lm(OSSADP~INHABT)) # regression line (y~x)
dev.off()
# plot for inhabitants in logarithmic scale
png("scatterINHABT_log.png")
plot(INHABT, OSSADP, xlab="Inhabitants", ylab="Adoption of OSS", las=1, log="x")
abline(lm(OSSADP~INHABT)) # regression line (y~x)
dev.off()
# plot for inhabitants in logarithmic scale and divided by 1000
png("scatterINHABT_log_divided.png")
plot(INHABT/1000, OSSADP, xlab="Inhabitants", ylab="Adoption of OSS", las=1, log="x")
abline(lm(OSSADP~INHABT)) # regression line (y~x)
dev.off()
As you can see, in the first scatterplot the problem is that R decides to use scientific notation and the data looks odd because of outliers. That's why I'd like to have the inhabitants on x-axis in thousands and have the x-axis use a logarithmic scale as well.
The problem is twofold. First, I can get rid of scientific notation by simply dividing the inhabitants by 1000, but this produces a flat horizontal regression line unlike the first plot. I know there are other ways to fix this, such as "Do not want scientific notation on plot axis", but I couldn't adapt the code there to my situation.
Second, switching the x-axis to a logarithmic scale also makes the regression line flat. Google points to https://stat.ethz.ch/pipermail/r-help/2006-January/086500.html as a first result for a possible solution and I tried using abline(lm(OSSADP~log10(INHABT))) which is suggested there, but that produces a vertical regression line. And if I divide both by 1000 and use a logarithmic scale, the line is also horizontal.
I'm a social scientist without any background in mathematics and statistics, so I fear I might have missed something obvious, if so my apologies. Thank you all very much for any potential help.
The scientific notation was covered on the R mailing list a while ago, but you can control when R switches to scientific notation with the scipen option:
options(scipen=10)
plot(INHABT, OSSADP, xlab="Inhabitants", ylab="Adoption of OSS")
Second, the problem with your dividing by 1000 is that you didn't divide by a thousand in both the plot and the abline. This would do the trick:
plot(INHABT/1000, OSSADP, xlab="Inhabitants", ylab="Adoption of OSS")
abline(lm(OSSADP~I(INHABT/1000))) # Fixed regression line.
The I() is necessary because the / symbol has a different meaning in formulas.
Also, your las parameter is unnecessary.
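To see why the line looked flat, it may help to compare the two fits directly. In this sketch the data are invented stand-ins for the survey columns (the real survey_oss.csv is not used). Rescaling x by 1000 rescales the slope by 1000, so a line fitted per inhabitant is almost horizontal on an axis measured in thousands.

```r
set.seed(7)
# Hypothetical data standing in for the survey columns
INHABT <- round(10^runif(100, 3, 6))                 # inhabitants, roughly 1e3 to 1e6
OSSADP <- 2 + 3e-6 * INHABT + rnorm(100, sd = 0.5)

fit_raw <- lm(OSSADP ~ INHABT)             # slope per inhabitant
fit_k   <- lm(OSSADP ~ I(INHABT / 1000))   # slope per 1000 inhabitants

# Same fitted line, expressed in different x units
coef(fit_raw)
coef(fit_k)

plot(INHABT / 1000, OSSADP, xlab = "Inhabitants (thousands)", ylab = "Adoption of OSS")
abline(fit_k)   # coefficients match the x-axis units of the plot
```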
I solved the problem of the horizontal line when using log="x" like this:
plot(INHABT, OSSADP, xlab="Inhabitants", ylab="Adoption of OSS", log="x")
abline(lm(OSSADP~log10(INHABT)))
with log10 and not just log.
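The reason log10 matters: on a plot drawn with log="x", abline() draws its line in the transformed coordinates, i.e. against log10(x). A model fitted on log10(INHABT) therefore lines up with the axis, while the coefficients of a fit on raw INHABT are misread as a slope per log10 unit and look flat. The sketch below uses invented data (not the survey file) and also shows the untf=TRUE argument of abline(), which draws the linear-in-x fit as a curve on the log axis.

```r
set.seed(11)
# Hypothetical data standing in for the survey columns
INHABT <- round(10^runif(100, 3, 6))
OSSADP <- 1 + 0.8 * log10(INHABT) + rnorm(100, sd = 0.3)

fit_lin <- lm(OSSADP ~ INHABT)          # linear in inhabitants
fit_log <- lm(OSSADP ~ log10(INHABT))   # linear in log10(inhabitants)

plot(INHABT, OSSADP, log = "x", xlab = "Inhabitants", ylab = "Adoption of OSS")
abline(fit_log, col = "blue")               # straight line in the axis's log10 coordinates
abline(fit_lin, untf = TRUE, col = "red")   # linear-in-x fit, drawn as a curve on the log axis
```

Note that the two lines answer different questions: the blue one models adoption against the logarithm of inhabitants, the red one against inhabitants themselves.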

Reason for strange gap in the plot

I have this plot of Depths vs time:
This plot has a strange gap at the start of May.
I checked the data, but there are no NAs, NaNs, or other missing values.
This is a time series with a regular interval of 15 minutes.
I cannot give the dataset here since it contains 10,000 rows.
Can somebody please suggest what the cause might be?
I am using the following plotting code:
library(zoo)
z=read.zoo("data.txt", header=TRUE)
temp=index(z[,1])
m=coredata(z[,1])
x=0.001
p=rep.int(x,length(temp))
png(filename=paste(Name[k],"_mean1.png", sep=''), width= 3500, height=1600, units="px")
par(mar=c(13,13,5,3),cex.axis= 2.5, cex.lab=3, cex.main=3.5, cex.sub=5)
plot(temp,m, xlab="Time", ylab="Depth",type='l', main=Name[k])
symbols(temp,m,add=TRUE,circles=p, inches=1/15, ann=F, bg="steelblue2", fg=NULL)
dev.off()
Okay, here's a guess from what you have posted.
I'm guessing there is no data for a period right at the start of May where the 'gap' in question pops up. There are no NAs because there just aren't any lines of data for this period at all. There is still a thin black line drawn to the plot by this line of code which links the 'gap' in data...
plot(temp,m, xlab="Time", ylab="Depth",type='l', main=Name[k])
...but there are no blue symbols (circles) plotted close enough together to make it look like a continuous blue line. The blue symbols are plotted over the top of the existing plot with the code below:
symbols(temp,m,add=TRUE,circles=p, inches=1/15, ann=F, bg="steelblue2", fg=NULL)
I suggest that, instead of plotting a line and then plotting symbols over the top of it, you just plot a thick blue line to start with, like:
plot(temp,m, xlab="Time", ylab="Depth",type='l', main=Name[k],lwd=5,col="steelblue2")
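To check the guess about missing rows, the time steps between consecutive observations can be inspected directly. This is a sketch on an invented series (the dates, the year and the name times are hypothetical). With the zoo object from the question, index(z) would take the place of times.

```r
# Hypothetical 15-minute series with two days removed at the start of May
times <- seq(as.POSIXct("2012-04-25 00:00", tz = "UTC"),
             as.POSIXct("2012-05-10 00:00", tz = "UTC"), by = "15 min")
times <- times[times < as.POSIXct("2012-05-01", tz = "UTC") |
               times >= as.POSIXct("2012-05-03", tz = "UTC")]

# Time step between consecutive observations, in minutes
step <- as.numeric(diff(times), units = "mins")

# Positions where the step exceeds the expected 15 minutes
gap_at <- which(step > 15)
data.frame(gap_start = times[gap_at],
           gap_end   = times[gap_at + 1],
           minutes   = step[gap_at])
```

Any row in the resulting table marks a stretch with no observations at all, which would show up as a gap in the symbols even though the data contain no NA values.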
