I am plotting a boxplot without the outliers, and I would like to draw a new plot in the same coordinate space that the boxplot used. Is there a way of extracting the plotting values for a plot?
I first thought about storing the plot in an object, but it seems to contain no plotting-related parameters.
my_plot <- boxplot(a ~ b, outline = FALSE)
The components of my_plot only hold statistical information, not plotting information.
How can I get the final range (ylim) of the boxplot?
UPDATE: Nick's (@nick-sabbe) suggestion, par("yaxp")[1:2], partially works: it correctly returns the values of the two extreme tick labels on the y axis. The correct way is to use par("usr"), which returns the extremes of the plotting region in the form (x1, x2, y1, y2). Thanks, Nick, for pointing me in the right direction.
I haven't tested this for boxplots, but for normal scatterplots, par("yaxp") gives you useful information about the y axis. So you can use, IIRC, par("yaxp")[1:2] to get the current outer limits of the y axis. This doesn't always do exactly what you want, but typically it does. Let us know if it works for your boxplot...
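To see how the two par() queries differ, here is a minimal sketch. The vectors a and b are made-up stand-ins for the question's data, and the points added at the end are only an illustration of reusing the boxplot's coordinates.

```r
# Made-up data standing in for the question's `a` and `b` (assumption).
set.seed(1)
a <- c(rnorm(40), 10)                      # one extreme value, hidden by outline = FALSE
b <- rep(c("g1", "g2"), length.out = 41)

my_plot <- boxplot(a ~ b, outline = FALSE)

usr  <- par("usr")     # c(x1, x2, y1, y2): extremes of the plotting region
yaxp <- par("yaxp")    # c(lowest tick, highest tick, number of tick intervals)

ylim_used <- usr[3:4]  # the y range actually spanned by the plot

# Anything added now shares the boxplot's coordinates, e.g. the group means
# drawn at x positions 1 and 2 (where boxplot() places the two boxes).
points(1:2, tapply(a, b, mean), pch = 4, col = "red")
```

With the default axis style (yaxs = "r"), R extends the axis range by 4% at each end, which is why the values in usr lie slightly outside the extreme tick values in yaxp.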
Related
I'm working with the R package scatterplot3d and I need to use expression() in the labels because I have to use some Greek letters.
My question is: is there a way to pull the ylab name down or write it along the axis (in a diagonal position)? I went through the help and the package description, but nothing seems to work.
Thanks in advance for any help.
Maria
library(scatterplot3d)
par(mfrow=c(1,1))
A <- c(3,2,3,3,2)
B <- c(2,4,5,3,4)
D <- c(4,3,4,2,3)
scatterplot3d(A,D,B, xlab=expression(paste(x[a],"-",x[b])),
ylab=expression(x[a]),
zlab=expression(sigma^2))
You can't use any of the classic ways due to the way the scatterplot3d() function constructs the plot. It's basically plotted on top of a classic plot pane, which means the axis labels are bound to the classic positions. The z-label is printed at the real left Y-axis, and the y label is printed at the real right Y-axis.
You can use text() to get around this:
use par("usr") to get the limits of the X and Y coordinates
calculate the position you want the label on (at 90% of the horizontal position and 8% of the vertical position for example.)
use text() to place it (and possibly the parameter srt to turn the label)
This makes it a bit more generic, so you don't have to try different values for every new plot you make.
Example :
scatterplot3d(A,D,B, xlab=expression(paste(x[a],"-",x[b])),
ylab="",
zlab=expression(sigma^2))
dims <- par("usr")
x <- dims[1]+ 0.9*diff(dims[1:2])
y <- dims[3]+ 0.08*diff(dims[3:4])
text(x,y,expression(x[a]),srt=45)
This gives a plot with the x[a] label rotated 45 degrees near the lower right corner.
scatterplot3d(A,D,B, xlab=expression(paste(x[a],"-",x[b])),
ylab="",
zlab=expression(sigma^2))
mtext( expression(x[a]), side=4,las=2,padj=18, line=-4)
One does need to use fairly extreme parameter values to get the expression in the right place in that transformed spatial projection.
I am trying to plot a set of data in R
x <- c(1,4,5,3,2,25)
My y scale is fixed at 20, so the last data point would not be visible on the plot if I execute the following code:
plot(x, ylim=c(0,20), type='l')
I want to show the range of the outlying data point by drawing a smaller box above the plot, with an independent y scale, representing only this last data point.
Is there any package or way to approach this problem?
You may try axis.break() from the plotrix package (http://rss.acs.unt.edu/Rdoc/library/plotrix/html/axis.break.html), with which you can define the axis to break and the style, size and color of the break marker.
The potential disadvantage of this approach is that a broken axis can mislead the reader's perception of the trend. Good luck!
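The answer above points to axis.break(). A different route, not mentioned there, is to draw the outlier in its own small panel with base graphics layout(). The following is only a sketch under that assumption; the cap of 20 comes from the question, while the panel heights, margins and labels are arbitrary choices.

```r
x   <- c(1, 4, 5, 3, 2, 25)
cap <- 20                                  # fixed upper limit of the main y scale
out <- which(x > cap)                      # indices of points above the cap

layout(matrix(1:2, ncol = 1), heights = c(1, 3))

# Top panel: only the outlying point(s), on an independent y scale.
op <- par(mar = c(0.5, 4, 2, 1))
plot(out, x[out], xlim = c(1, length(x)), ylim = range(x[out]) + c(-1, 1),
     xaxt = "n", xlab = "", ylab = "outlier", pch = 19)

# Bottom panel: the original line plot with the fixed y scale.
par(mar = c(4, 4, 0.5, 1))
plot(x, ylim = c(0, cap), type = "l", xlab = "Index", ylab = "x")

par(op)                                    # restore the original margins
layout(1)                                  # back to a single panel
```

Both panels use the same x range, so the outlier in the top box sits directly above the position where the line leaves the main plot.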
Let's say I have the following dataset
bodysize=rnorm(20,30,2)
bodysize=sort(bodysize)
survive=c(0,0,0,0,0,1,0,1,0,0,1,1,0,1,1,1,0,1,1,1)
dat=as.data.frame(cbind(bodysize,survive))
I'm aware that the plot method for glm objects has several nice plots to show you the fit,
but I'd nevertheless like to create an initial plot with:
1) raw data points
2) the logistic curve
3) predicted points
4) aggregate points for a number of predictor levels
library(Hmisc)
plot(bodysize,survive,xlab="Body size",ylab="Probability of survival")
g=glm(survive~bodysize,family=binomial,dat)
curve(predict(g,data.frame(bodysize=x),type="resp"),add=TRUE)
points(bodysize,fitted(g),pch=20)
All fine up to here.
Now I want to plot the observed survival rates for given levels of the predictor (bodysize):
dat$bd<-cut2(dat$bodysize,g=5,levels.mean=T)
AggBd<-aggregate(dat$survive,by=list(dat$bd),data=dat,FUN=mean)
plot(AggBd,add=TRUE)
#Doesn't work
I've tried to match AggBd to the dataset used for the model and all sorts of other things, but I simply can't plot the two together. Is there a way around this?
I basically want to superimpose the last plot on the same axes.
Besides this specific task, I often wonder how to superimpose plots of different variables that have a similar scale/range on a two-dimensional plot. I would really appreciate your help.
The first column of AggBd is a factor, so you need to convert its levels to numeric before you can add the points to the plot:
AggBd$size <- as.numeric (levels (AggBd$Group.1))[AggBd$Group.1]
To add the points to the existing plot, use points():
points (AggBd$size, AggBd$x, pch = 3)
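The conversion above relies on a common R idiom for turning factor labels back into numbers. A small sketch with a made-up factor f (not the question's data) shows why as.numeric() on its own is not enough:

```r
# Hypothetical factor whose labels happen to be numbers.
f <- factor(c("10.5", "20", "10.5"))

codes  <- as.numeric(f)              # internal level codes: 1 2 1
values <- as.numeric(levels(f))[f]   # the labels as numbers: 10.5 20.0 10.5
```

This works for AggBd$Group.1 because cut2(..., levels.mean=T) labels each interval with its mean, so the level labels are themselves numeric strings.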
You are best off specifying your y axis explicitly. You could also use par(new=TRUE):
plot(bodysize,survive,xlab="Body size",ylab="Probability of survival")
g=glm(survive~bodysize,family=binomial,dat)
curve(predict(g,data.frame(bodysize=x),type="resp"),add=TRUE)
points(bodysize,fitted(g),pch=20)
#then
par(new=TRUE)
#
plot(AggBd$Group.1,AggBd$x,pch=30)
Remove or change the axis ticks to prevent overlap, e.g.:
plot(AggBd$Group.1,AggBd$x,pch=30,xaxt="n",yaxt="n",xlab="",ylab="")
Hi, I have a data frame weekly.mean.values with the following structure:
week:mean:ci.lower:ci.upper
Here week is a factor; mean, ci.lower and ci.upper are numeric. For each week there is exactly one mean, one ci.lower and one ci.upper.
I was trying to plot a shaded area covering the 95% confidence interval around the mean, with the following code:
ggplot(weekly.mean.values,aes(x=week,y=mean)) +
geom_line() +
geom_ribbon(aes(ymin=ci.lower,ymax=ci.upper))
The plot, however, came out blank (that is, with only the x and y axes present, but no lines or points, let alone shaded areas).
If I removed the geom_ribbon part, I did get a line. I know that this should be a very simple task, but I don't know why I couldn't get geom_ribbon to plot what I wanted. Any hint would be truly appreciated.
I realize this thread is super old, but Google still finds it.
The answer is that you need to set ymin and ymax in terms of the data you are plotting on the y axis. If you set them to scalar values, the ribbon covers the entire plot from top to bottom.
You can use
ymin=0
ymax=mean
to go from 0 to your y-point or even
ymin=mean-1
ymax=mean+1
to have the ribbon cover a strip encompassing your actual data.
I may be missing something, but the ribbon is filled with grey20 by default. You are plotting this layer on top of the data, so it is no wonder that it obscures it. It is also possible that the limits for the plot axes derived from the data provided to the initial ggplot() call are not sufficient to contain the confidence interval ribbon. In that case, I would not be surprised to see a grey/blank plot.
To see if this is the problem, try altering your geom_ribbon() line to:
geom_ribbon(aes(ymin=ci.lower,ymax=ci.upper), alpha = 0.5)
which will plot the ribbon with transparency, which should show the data underneath if the problem is what I think it is.
If so, set the x and y limits to the range of the data +/- the confidence interval you wish to plot and swap the order of the layers (i.e. draw the line on top of the ribbon), and use transparency in the ribbon to show the grid through it.
From ggplot's docs for geom_ribbon (2.1.0):
For each continuous x value, geom_interval displays a y interval. geom_area is a special case of geom_ribbon, where the minimum of the range is fixed to 0.
In this case, x values cannot be factors for geom_ribbon. One solution would be to convert week from a factor to a numeric. e.g.
ggplot(weekly.mean.values,aes(x=as.numeric(week),y=mean)) +
geom_line() +
geom_ribbon(aes(ymin=ci.lower,ymax=ci.upper))
geom_line should handle the switch from factor to numeric without incident, although the X axis scale may display differently.
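Putting the suggestions from these answers together (numeric x, ribbon drawn beneath the line, some transparency), a minimal sketch might look like the following. The data frame is made up to match the structure described in the question, and week.num is a hypothetical helper column. Note that as.numeric() applied directly to a factor returns level codes, which equal the week numbers only when the levels are in numeric order; going through as.character() avoids relying on that.

```r
library(ggplot2)

# Made-up weekly summary in the shape described in the question (assumption).
weekly.mean.values <- data.frame(
  week     = factor(1:10),
  mean     = c(5.0, 5.4, 6.1, 5.8, 6.5, 7.0, 6.8, 7.4, 7.9, 8.1),
  ci.lower = c(4.2, 4.7, 5.3, 5.0, 5.8, 6.1, 6.0, 6.6, 7.0, 7.3),
  ci.upper = c(5.8, 6.1, 6.9, 6.6, 7.2, 7.9, 7.6, 8.2, 8.8, 8.9)
)

# Convert the factor labels (not the level codes) to numbers.
weekly.mean.values$week.num <- as.numeric(as.character(weekly.mean.values$week))

p <- ggplot(weekly.mean.values, aes(x = week.num, y = mean)) +
  geom_ribbon(aes(ymin = ci.lower, ymax = ci.upper), alpha = 0.3) +  # ribbon first
  geom_line()                                                        # line on top
print(p)
```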
I'd like to use ggplot2 density geometry using a log transformation for the x scale:
qplot(rating, data=movies, geom="density", log="x")
This, however, produces a chart with probabilities larger than 1. One solution that seems to work is to scale the dataset before calling qplot:
qplot(rating, data=transform(movies, rating=log(rating)))
But then the x axis doesn't look nice. What is the correct way to handle this?
It seems that my question does not, in fact, make sense. It is fine for probability densities to be larger than one, as per [2]; what is important is that the integral over the entire space is equal to one [3].
This gives the right answer.
qplot(rating, y = ..scaled.., data=movies, geom="density", log="x")
stat_density produces new variables; one of them is ..scaled.., which is the density estimate scaled to a maximum of 1.
HTH
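The point that a density may exceed 1 while still integrating to 1 can be checked numerically. This sketch uses base R's density() on a made-up, tightly concentrated sample rather than the movies data or ggplot2:

```r
set.seed(42)
x <- rnorm(1000, mean = 0, sd = 0.1)   # made-up sample with a very small spread

d    <- density(x)                     # kernel density estimate on an even grid
peak <- max(d$y)                       # height of the curve, well above 1 here
area <- sum(d$y) * diff(d$x[1:2])      # Riemann-sum approximation of the integral
```

The narrower the distribution, the taller the density has to be for the area under it to remain 1, so large values on the y axis are not an error in themselves.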