I want to break the x-axis of a plot of a cumulative distribution function for which I use the function plot.stepfun, but don't seem to be able to figure out how.
Here's some example data:
set.seed(1)
x <- sample(seq(1,20,0.01),300,replace=TRUE)
Then I use the function ecdf to get the empirical cumulative distribution function of x:
x.cdf <- ecdf(x)
And I change the class of x.cdf to stepfun, because I prefer to call plot.stepfun directly over using plot.ecdf (which also uses plot.stepfun, but has fewer possibilities to customize the plot).
class(x.cdf) <- "stepfun"
Then I am able to create a plot as follows:
plot(x.cdf, do.point=FALSE)
But now I want to break up the x-axis between 12 and 20, e.g. using axis.break from the plotrix package, but since I have no ordinary x and y arguments for plotting, I don't know how to do this.
Any help would be very much appreciated!
"Breaking the axis between 12 and 20" doesn't make a lot of sense to me, since 20 is the end of the x range, so I will illustrate breaking it between 12 and 15 instead. The axis.break function from plotrix doesn't actually do very much (as you can see if you step through its example). All it does is draw a pair of slashes at a particular location, the "breakpos". The rest of the work has to be done with regular plotting functions, and plot.stepfun isn't really set up for that, so I'm using plot.default with the type="s" argument. You need to shift the x values that are plotted and the tick positions, while leaving the arguments to the ecdf function and the axis labels unshifted.
library(plotrix)  # provides axis.break
png()
plot( c(seq(1,12,0.1), seq(15,20,0.1)-3),    # plotted x positions, upper part shifted left by 3
      x.cdf(c(seq(1,12,0.1), seq(15,20,0.1))), # ECDF evaluated at the original, unshifted values
      type="s", xaxt="n", xlab="X", ylab="Cumulative probability")
axis(1, at=c( 1:12, (16:20)-3), labels=c(1:12, (16:20)) ) # tick positions shifted, labels unshifted
axis.break(breakpos=12)  # draw the break marks on the x-axis at 12
dev.off()
When using the "curve" function in R, how do you suppress/stop the plot from showing up? For example, this code always plots the curve:
my_curve = curve(x)
Is there a parameter to do this, or should I be using a different function? I just want the x and y points from the curve as a data frame.
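curve() is from the graphics package and is awkward for generating lists of points.
Just compute them directly:
x = seq(from, to, length.out = n)  # from, to, n: your range and number of points
y = f(x)  # f: the function you would have passed to curve()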
If you stick to the curve function, the closest to a solution I know is adding dev.off() after the curve() statement!
Here's a way to take advantage of the part of curve that you want without generating a plot.
I made a copy of the curve function (just type curve in the console); called it by a new name (curve2); and commented out the four lines at the end starting with if (isTRUE(add)). When it's called and assigned, I had a list with two vectors—x and y. No plot.
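A lighter-weight variant of the same idea, offered as a sketch rather than as what this answer did: draw to a null device so the plot is discarded, and keep the list that curve() returns invisibly. Here pdf(NULL) opens a PDF device that writes no file; sin and the range 0 to pi are placeholder choices, and pts is just an illustrative name.

```r
# Open a PDF device that writes no file, so the plot goes nowhere.
pdf(NULL)

# curve() invisibly returns a list with components x and y.
res <- curve(sin, from = 0, to = pi, n = 101)

# Close the null device again.
invisible(dev.off())

# Convert the list of points to a data frame.
pts <- as.data.frame(res)
head(pts)
```

If you only need the points, computing them with seq() and the function itself is simpler; this route is mainly useful when you want curve()'s handling of expressions in x.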
Hopefully a simple question today:
I'm plotting an RDA (in RStudio) and would like to remove the second X and Y axes (top and right). It's purely for aesthetic purposes, but still. The code I'm using is below. I've managed to remove the first axes (I'll replace them with something nicer later) with xaxt="n" and yaxt="n", but it still draws the others.
The question: How do I remove the top and right axes from a plot in R?
To make this example reproducible you will need two data frames with the same number of rows, called "bio" and "abio" respectively.
library(vegan) ##not sure which package I'm actually employing
library(MASS) ##these are just my defaults
rdaY1<-rda(bio,abio) #any dummy data will do so long as they have the same number of rows
par(bg="transparent",new=FALSE)
plot(rdaY1,type="n",bty="n",main="Y1. P<0.001 R2=XXX",
     ylab="XXX% variance explained",
     xlab="XXX% variance explained",
     col.main="black",col.lab="black", col.axis="white",
     xaxt="n",yaxt="n",axes=FALSE)
abline(h=0,v=0,col="black",lwd=1)
points(rdaY1,display="species",col="gray",pch=20)
#text(rdaY1,display="species",col="gray")
points(rdaY1,display="cn",col="black",lwd=2)
text(rdaY1,display="cn",col="black")
UPDATE: Using comments below I've played around with various ways to get rid of the axes and it seems like that second "points" command where I call for the vectors to be plotted is the problem. Any ideas?
bty="L" worked for me. I generated some random data using rnorm() to test:
library(vegan)
mat <- matrix(rnorm(100), nrow = 10)
pl <- rda(mat)
plot(pl, bty="L")
I am trying to overlay two histograms in the same plot, but the option probability=TRUE (relative frequencies) in hist() has no effect with the code below. This is a problem because the two samples have very different sizes (length(cl1)=9 and length(cl2)=339) and, with this script, I cannot visualize the differences between the two histograms because each shows counts. How can I overlay two histograms with the same bin width, showing relative frequencies?
c1<-hist(dataList[["cl1"]],xlim=range(minx,maxx),breaks=seq(minx,maxx,pasx),col=rgb(1,0,0,1/4),main=paste(paramlab,"Group",groupnum,"cl1",sep=" "),xlab="",probability=TRUE)
c2<-hist(dataList[["cl2"]],xlim=range(minx,maxx),breaks=seq(minx,maxx,pasx),col=rgb(0,0,1,1/4),main=paste(paramlab,"Group",groupnum,"cl2",sep=" "),xlab="",probability=TRUE)
plot(c1, col=rgb(1,0,0,1/4), xlim=c(minx,maxx), main=paste(paramlab,"Group",groupnum,sep=" "),xlab="")# first histogram
plot(c2, col=rgb(0,0,1,1/4), xlim=c(minx,maxx), add=T)
cl1Col <- rgb(1,0,0,1/4)
cl2Col <- rgb(0,0,1,1/4)
legend('topright',c('Cl1','Cl2'),
fill = c(cl1Col , cl2Col ), bty = 'n',
border = NA)
Thanks in advance for your help!
When you call plot on an object of class histogram (like c1), it dispatches to the S3 method for that class, namely plot.histogram. You can see the code for this function if you type graphics:::plot.histogram, and you can see its help under ?plot.histogram. The help file for that function states:
freq logical; if TRUE, the histogram graphic is to present a
representation of frequencies, i.e., x$counts; if FALSE, relative
frequencies (probabilities), i.e., x$density, are plotted. The default
is true for equidistant breaks and false otherwise.
So, when plot renders a histogram it doesn't use the previously specified probability or freq arguments; it tries to figure it out for itself. The reason is obvious if you dig around inside c1: it contains all of the data necessary for the plot, but does not specify how it should be rendered.
So, the solution is to repeat the argument freq=FALSE when you call the plot functions. Notably, freq=FALSE works whereas probability=TRUE does not, because plot.histogram does not have a probability option. So, your plot code will be:
plot(c1, col=rgb(1,0,0,1/4), xlim=c(minx,maxx), main=paste(paramlab,"Group",groupnum,sep=" "),xlab="",freq=FALSE)# first histogram
plot(c2, col=rgb(0,0,1,1/4), xlim=c(minx,maxx), add=T, freq=FALSE)
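For anyone without the original dataList, here is a minimal self-contained sketch of the same fix. The data are simulated (an assumption: two normal samples with the sizes 9 and 339 mentioned in the question), and brks, h1, h2 and ymax are illustrative names.

```r
set.seed(1)
cl1 <- rnorm(9, mean = 10)    # small sample (simulated stand-in)
cl2 <- rnorm(339, mean = 12)  # large sample (simulated stand-in)

# Shared, equally spaced breaks that cover both samples.
brks <- seq(floor(min(cl1, cl2)), ceiling(max(cl1, cl2)), by = 0.5)

# Compute the histograms without drawing them.
h1 <- hist(cl1, breaks = brks, plot = FALSE)
h2 <- hist(cl2, breaks = brks, plot = FALSE)

# Draw both on the density scale by passing freq = FALSE to plot().
ymax <- max(h1$density, h2$density)
plot(h1, freq = FALSE, col = rgb(1, 0, 0, 1/4),
     xlim = range(brks), ylim = c(0, ymax), main = "", xlab = "")
plot(h2, freq = FALSE, col = rgb(0, 0, 1, 1/4), add = TRUE)
legend("topright", c("cl1", "cl2"),
       fill = c(rgb(1, 0, 0, 1/4), rgb(0, 0, 1, 1/4)), bty = "n")
```

Setting ylim from both density vectors keeps the taller histogram from being clipped, since the second plot is added to axes fixed by the first.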
This all seems like an oversight/idiosyncratic decision (or lack thereof) on the part of the R devs. To their credit, it is appropriately documented and is not "unexpected behavior" (although I certainly didn't expect it). I wonder where such oddness should be reported, if it should be reported at all.
Let's say I have the following dataset
bodysize=rnorm(20,30,2)
bodysize=sort(bodysize)
survive=c(0,0,0,0,0,1,0,1,0,0,1,1,0,1,1,1,0,1,1,1)
dat=as.data.frame(cbind(bodysize,survive))
I'm aware that the glm plot function has several nice plots to show you the fit,
but I'd nevertheless like to create an initial plot with:
1) raw data points
2) the logistic curve
3) predicted points
4) aggregate points for a number of predictor levels
library(Hmisc)
plot(bodysize,survive,xlab="Body size",ylab="Probability of survival")
g=glm(survive~bodysize,family=binomial,dat)
curve(predict(g,data.frame(bodysize=x),type="resp"),add=TRUE)
points(bodysize,fitted(g),pch=20)
All fine up to here.
Now I want to plot the observed survival rates for given levels of body size:
dat$bd<-cut2(dat$bodysize,g=5,levels.mean=T)
AggBd<-aggregate(dat$survive,by=list(dat$bd),data=dat,FUN=mean)
plot(AggBd,add=TRUE)
#Doesn't work
I've tried to match AggBd to the dataset used for the model, and all sorts of other things, but I simply can't plot the two together. Is there a way around this?
I basically want to superimpose the last plot on the same axes.
Besides this specific task, I often wonder how to superimpose plots of different variables that have a similar scale/range on a two-dimensional plot. I would really appreciate your help.
The first column of AggBd is a factor, so you need to convert its levels to numeric before you can add the points to the plot:
AggBd$size <- as.numeric (levels (AggBd$Group.1))[AggBd$Group.1]
To add the points to the existing plot, use points:
points (AggBd$size, AggBd$x, pch = 3)
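Put together, the whole thing can be sketched as below. This is a base-R variant under stated assumptions: cut() with quantile() breaks stands in for Hmisc's cut2, so the bin edges may differ slightly from the original, and the seed plus the names breaks, bin and agg are illustrative.

```r
set.seed(42)  # arbitrary seed, only for reproducibility
bodysize <- sort(rnorm(20, 30, 2))
survive  <- c(0,0,0,0,0,1,0,1,0,0,1,1,0,1,1,1,0,1,1,1)
dat <- data.frame(bodysize, survive)

# Five quantile-based bins (base-R stand-in for Hmisc::cut2(g = 5)).
breaks  <- quantile(dat$bodysize, probs = seq(0, 1, 0.2))
dat$bin <- cut(dat$bodysize, breaks = breaks, include.lowest = TRUE)

# Mean body size and observed survival rate within each bin.
agg <- data.frame(
  size = as.vector(tapply(dat$bodysize, dat$bin, mean)),
  rate = as.vector(tapply(dat$survive,  dat$bin, mean))
)

# Raw data, fitted logistic curve, fitted points, then the bin means.
g <- glm(survive ~ bodysize, family = binomial, data = dat)
plot(dat$bodysize, dat$survive,
     xlab = "Body size", ylab = "Probability of survival")
curve(predict(g, data.frame(bodysize = x), type = "response"), add = TRUE)
points(dat$bodysize, fitted(g), pch = 20)
points(agg$size, agg$rate, pch = 3)
```

Because the aggregated values are plain numeric x and y coordinates, points() can add them to the existing axes without a second plot() call.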
You are best off specifying your y-axis explicitly. You could also use par(new=TRUE):
plot(bodysize,survive,xlab="Body size",ylab="Probability of survival")
g=glm(survive~bodysize,family=binomial,dat)
curve(predict(g,data.frame(bodysize=x),type="resp"),add=TRUE)
points(bodysize,fitted(g),pch=20)
#then
par(new=TRUE)
#
plot(AggBd$Group.1,AggBd$x,pch=30)
Obviously, remove or change the axis ticks and labels to prevent overlap, e.g.
plot(AggBd$Group.1,AggBd$x,pch=30,xaxt="n",yaxt="n",xlab="",ylab="")
I have a plot() that I'm trying to make, but I do not want the x-values to be used as the axis labels...I want a different character vector that I want to use as labels, in the standard way: Use as many as will fit, drop the others, etc. What should I pass to plot() to make this happen?
For example, consider
d <- data.frame(x=1:5,y=10:14,x.names=c('a','b','c','d','e'))
In barplot, I would pass barplot(height=d$y,names.arg=d$x.names), but in this case the actual x-values are important. So I would like an analog such as plot(x=d$x,y=d$y,type='l',names.arg=d$x.names), but that does not work.
I think you want to first suppress the labels on the x axis with the xaxt="n" option:
plot(flow~factor(month),xlab="Month",ylab="Total Flow per Month",ylim=c(0,55000), xaxt="n")
then use the axis command to add in your own labels. This example assumes the labels are in an object called month.name
axis(1, at=1:12, labels=month.name)
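Applied to the data frame from the question, the same two steps might look like the sketch below. It assumes d has five y values so that all columns have the same length; ticks is just an illustrative name for the value axis() returns.

```r
# Example data: numeric x positions plus a character label for each one.
d <- data.frame(x = 1:5, y = 10:14,
                x.names = c("a", "b", "c", "d", "e"))

# Draw the line at the real x values, but suppress the default x-axis.
plot(d$x, d$y, type = "l", xaxt = "n", xlab = "x", ylab = "y")

# Add an x-axis with ticks at the real x values and the custom labels.
ticks <- axis(1, at = d$x, labels = as.character(d$x.names))
```

axis() drops labels that would overlap their neighbours, which gives the "use as many as will fit" behaviour asked for.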
I had to look up how to do this and I stole the example from here.