I am creating several boxplots in ggplot2 with a log10 scale using
coord_trans(y="log10")
It is important that only the scale, and not the data itself, is log-transformed. One data set includes zero values, which become -Inf on a log scale, so the boxplot cannot be drawn.
I have tried to use
scale_y_continuous(trans=pseudo_log_trans(base=10))
However, this changes the data instead of just the scale: the outliers of the boxplot change, and the boxplot stats extracted through ggplot_build(examplefig)$data differ from those of the original data.
Is there any way to create a boxplot in ggplot2 with a log10 scale and data including zero values? There should be no transformation of the data itself and outliers should be displayed like in the boxplot with the original data.
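One possible approach, sketched here under assumptions rather than offered as a definitive answer: compute the boxplot statistics on the raw data first, then draw them with geom_boxplot(stat="identity") so that the scale only positions values that are already fixed. The data frame df, the helper box_stats and the column names are hypothetical. The helper mimics the default 1.5 * IQR rule, and the axis is pseudo-log (roughly linear near zero), not a true log10 axis.

```r
library(ggplot2)
library(scales)  # provides pseudo_log_trans()

# Hypothetical example data: group "b" contains zeros
set.seed(1)
df <- data.frame(
  group = rep(c("a", "b"), each = 50),
  value = c(rlnorm(50, meanlog = 2), c(0, 0, 0, rlnorm(47, meanlog = 3)))
)

# Boxplot statistics computed on the RAW values (1.5 * IQR rule)
box_stats <- function(v, coef = 1.5) {
  q <- quantile(v, c(0.25, 0.5, 0.75), names = FALSE)
  iqr <- q[3] - q[1]
  is_out <- v < q[1] - coef * iqr | v > q[3] + coef * iqr
  list(
    box = data.frame(ymin = min(v[!is_out]), lower = q[1], middle = q[2],
                     upper = q[3], ymax = max(v[!is_out])),
    out = v[is_out]
  )
}

res <- lapply(split(df$value, df$group), box_stats)
stats <- do.call(rbind, lapply(names(res), function(g) {
  data.frame(group = g, res[[g]]$box)
}))
outliers <- do.call(rbind, lapply(names(res), function(g) {
  data.frame(group = rep(g, length(res[[g]]$out)), value = res[[g]]$out)
}))

# The scale only moves the precomputed values; it does not recompute them
p <- ggplot(stats, aes(x = group)) +
  geom_boxplot(aes(ymin = ymin, lower = lower, middle = middle,
                   upper = upper, ymax = ymax), stat = "identity") +
  geom_point(data = outliers, aes(y = value)) +
  scale_y_continuous(trans = pseudo_log_trans(base = 10))
print(p)
```

The stats data frame holds the untransformed statistics, so it can be inspected directly instead of going through ggplot_build().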
This is the very first question I ask here and I am new to R, so I hope the question is clear.
I have three variables (Precipitation, Temperature and PAR Radiation) with different scales. I'm trying to plot these three variables together and I put the daily sum of precipitation represented by a barplot on the left side y axis and the daily average of temperature on the right side y axis. I'd like to put another y axis on the right side, with another scale, in order to represent the daily average of PAR radiation, but I can't. I'm using the ggplot package, because it is useful for other reasons.
I'm trying to achieve something similar to the picture:
A discussion of this topic can be found here:
ggplot with 2 y axes on each side and different scales
A workaround solution can be found here:
https://rpubs.com/MarkusLoew/226759
You can also work with the standard R graphics package:
http://evolvingspaces.blogspot.com/2011/05/multiple-y-axis-in-r-plot.html
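For the two-axis part, here is a minimal sketch using ggplot2's sec_axis(). The data frame weather, its columns and the scaling factor k are hypothetical. The secondary axis has to be a one-to-one transformation of the primary one, and as far as I know ggplot2 offers only one secondary y axis, so a third axis for PAR would still need one of the workarounds linked above.

```r
library(ggplot2)

# Hypothetical daily data; names and units are illustrative only
set.seed(1)
weather <- data.frame(
  day    = 1:30,
  precip = rgamma(30, shape = 0.5, scale = 8),      # daily sum, mm
  temp   = 15 + 5 * sin(1:30 / 5) + rnorm(30)       # daily mean, deg C
)

# Linear factor that maps the temperature range onto the precipitation axis
k <- max(weather$precip) / max(weather$temp)

p <- ggplot(weather, aes(x = day)) +
  geom_col(aes(y = precip), fill = "steelblue") +
  geom_line(aes(y = temp * k), colour = "firebrick") +
  scale_y_continuous(
    name = "Precipitation (mm)",
    # the secondary axis undoes the rescaling so its labels read in deg C
    sec.axis = sec_axis(~ . / k, name = "Temperature (deg C)")
  )
print(p)
```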
I have been searching for, and trying to create, the following plot in R for ages, but nothing seems to work.
What I want is a quantitative variable on the Y axis and a categorical variable on the X axis, with a horizontal histogram (of the Y variable) for each category.
I couldn't find a package that does this. Any suggestions?
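A possible sketch with ggplot2, assuming version 3.3.0 or later (where geom_histogram accepts a y mapping): map the quantitative variable to y and put each category in its own column with facet_grid. The data frame d and its columns are hypothetical.

```r
library(ggplot2)

# Hypothetical data: three categories with different distributions
set.seed(1)
d <- data.frame(
  category = rep(c("A", "B", "C"), each = 200),
  y = c(rnorm(200, 10, 2), rnorm(200, 14, 3), rnorm(200, 8, 1))
)

# Mapping only y makes the bars run horizontally; one panel per category
p <- ggplot(d, aes(y = y)) +
  geom_histogram(bins = 30) +
  facet_grid(. ~ category) +
  labs(x = "Count", y = "Y variable")
print(p)
```

The panels share the y axis, so the categories sit side by side along what reads as a categorical x axis.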
Sorry for the newbie R question...
I have a data.frame that contains measurements of a single variable. These measurements will be distributed differently depending on whether the thing being measured is of type A or type B; that is, you can imagine that my column names are: measurement, type label (A or B). I want to plot the histograms of the measurements for A and B separately, and put the two histograms in the same plot, with each histogram normalised to unit area (this is because I expect the proportions of A and B to differ significantly). By unit area, I mean that A and B each have unit area, not that A+B have unit area. Basically, I want something like geom_density, but I don't want a smoothed distribution for each; I want the histogram bars. Not interleaved, but plotted one on top of the other. Not stacked, although it would be interesting to know how to do this also. (The purpose of this plot is to explore differences in the shapes of the distributions that would indicate that there are quantitative differences between A and B that could be used to distinguish between them.) That's all. Two or more histograms -- not smoothed density plots -- in the same plot with each normalised to unit area. Thanks!
Something like this?
# generate example
set.seed(1)
df <- data.frame(Type=c(rep("A",1000),rep("B",4000)),
Value=c(rnorm(1000,mean=25,sd=10),rchisq(4000,15)))
# you start here...
library(ggplot2)
# after_stat(density) replaces the older ..density.. notation (ggplot2 >= 3.3.0)
ggplot(df, aes(x=Value))+
  geom_histogram(aes(y=after_stat(density),fill=Type),color="grey80")+
  facet_grid(Type~.)
Note that there are 4 times as many samples of type B.
You can also set the y-axis scales to float using: scales="free_y" in the call to facet_grid(...).
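If the histograms should share one panel instead of separate facets, a possible variant is position="identity" with some transparency. This is a sketch only; the binwidth and alpha values are arbitrary choices. The density statistic is computed per group, so each type integrates to one on its own, which the last lines check.

```r
library(ggplot2)

# Same example data as above
set.seed(1)
df <- data.frame(Type = c(rep("A", 1000), rep("B", 4000)),
                 Value = c(rnorm(1000, mean = 25, sd = 10), rchisq(4000, 15)))

# position = "identity" overlays the bars instead of stacking them
p <- ggplot(df, aes(x = Value, fill = Type)) +
  geom_histogram(aes(y = after_stat(density)), binwidth = 2,
                 position = "identity", alpha = 0.5, colour = "grey80")
print(p)

# Area under each histogram: sum of density * bin width, per group
ld <- layer_data(p)
areas <- tapply(ld$density * (ld$xmax - ld$xmin), ld$group, sum)
print(areas)
```

Using position = "stack" instead gives the stacked version, although stacked densities are harder to compare by shape.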
Let's say I have the following dataset
bodysize=rnorm(20,30,2)
bodysize=sort(bodysize)
survive=c(0,0,0,0,0,1,0,1,0,0,1,1,0,1,1,1,0,1,1,1)
dat=as.data.frame(cbind(bodysize,survive))
I'm aware that the glm plot function has several nice plots to show you the fit,
but I'd nevertheless like to create an initial plot with:
1) raw data points
2) the logistic curve, and both
3) predicted points
4) and aggregate points for a number of predictor levels
library(Hmisc)
plot(bodysize,survive,xlab="Body size",ylab="Probability of survival")
g=glm(survive~bodysize,family=binomial,dat)
curve(predict(g,data.frame(bodysize=x),type="resp"),add=TRUE)
points(bodysize,fitted(g),pch=20)
All fine up to here.
Now I want to plot the observed survival rates for given levels of the predictor (body size):
dat$bd<-cut2(dat$bodysize,g=5,levels.mean=T)
AggBd<-aggregate(dat$survive,by=list(dat$bd),data=dat,FUN=mean)
plot(AggBd,add=TRUE)
#Doesn't work
I've tried to match AggBd to the dataset used for the model and all sort of other things but I simply can't plot the two together. Is there a way around this?
I basically want to superimpose the last plot on the same axes.
Besides this specific task, I often wonder how to superimpose different plots that show different variables but have a similar scale/range on two-dimensional plots. I would really appreciate your help.
The first column of AggBd is a factor, so you need to convert its levels to numeric before you can add the points to the plot.
AggBd$size <- as.numeric (levels (AggBd$Group.1))[AggBd$Group.1]
To add the points to the existing plot, use points:
points (AggBd$size, AggBd$x, pch = 3)
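For reference, the whole sequence can also be sketched in base R without Hmisc. This is an assumption-laden variant, not the original code: it bins with cut() at the quintiles of body size and uses the bin means as x positions, which is roughly what cut2(..., g=5, levels.mean=TRUE) provides. The names breaks, bin and agg are illustrative.

```r
# Example data as in the question (seed added for reproducibility)
set.seed(1)
bodysize <- sort(rnorm(20, 30, 2))
survive <- c(0,0,0,0,0,1,0,1,0,0,1,1,0,1,1,1,0,1,1,1)
dat <- data.frame(bodysize, survive)

g <- glm(survive ~ bodysize, family = binomial, data = dat)

# Raw data, fitted curve and fitted values
plot(dat$bodysize, dat$survive, xlab = "Body size",
     ylab = "Probability of survival")
curve(predict(g, data.frame(bodysize = x), type = "response"), add = TRUE)
points(dat$bodysize, fitted(g), pch = 20)

# Five quantile bins; mean size and observed survival rate per bin
breaks <- quantile(dat$bodysize, probs = seq(0, 1, by = 0.2))
dat$bin <- cut(dat$bodysize, breaks = breaks, include.lowest = TRUE)
agg <- data.frame(
  size = as.vector(tapply(dat$bodysize, dat$bin, mean)),
  rate = as.vector(tapply(dat$survive, dat$bin, mean))
)

# points() draws on the existing axes, so no par(new=TRUE) is needed
points(agg$size, agg$rate, pch = 3, col = "red")
```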
You are best off specifying your y-axis explicitly. You could also use par(new=TRUE):
plot(bodysize,survive,xlab="Body size",ylab="Probability of survival")
g=glm(survive~bodysize,family=binomial,dat)
curve(predict(g,data.frame(bodysize=x),type="resp"),add=TRUE)
points(bodysize,fitted(g),pch=20)
#then
par(new=TRUE)
#
plot(AggBd$Group.1,AggBd$x,pch=30)
Obviously, remove or change the axis ticks to prevent overlap, e.g.:
plot(AggBd$Group.1,AggBd$x,pch=30,xaxt="n",yaxt="n",xlab="",ylab="")
giving: