ggplot2 Line graphs in R: Plotting dependent variable on y axis

I am trying to plot the vertical concentration profile of a pollutant. By convention, altitude is plotted on the vertical axis, and concentration is on the x (even though altitude is the independent variable). When plotting the concentrations for pollutants that do not fit a one-to-one function, R connects the points in a most annoying zig-zag pattern, instead of connecting them in order by altitude.
I tried changing the concentration values to factors, with levels based on altitude values:
concSummary$value <- factor(concSummary$value, levels =
concSummary$value[order(concSummary$altitude)])
But this didn't seem to work.
Does anyone know how to get around this problem?

Update: Someone posted a useful solution here: controlling order of points in ggplot2 in R?
Using geom_path() instead of geom_line() tells ggplot2 to connect points in the order in which they appear in the data frame, whereas geom_line() connects them in order of the x variable. This worked for me because the data were already ordered by altitude.
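A minimal sketch of the geom_path() approach. The column names altitude and value follow the question, but the numbers are invented for illustration, and the explicit sort is an assumption that makes the example independent of row order.

```r
library(ggplot2)

# Hypothetical profile: concentration is not a one-to-one function of altitude
concSummary <- data.frame(
  altitude = c(300, 0, 500, 100, 600, 200, 400),
  value    = c(17, 12, 14, 18, 6, 25, 9)
)

# geom_path() joins rows in data-frame order, so sort by altitude first
concSummary <- concSummary[order(concSummary$altitude), ]

p <- ggplot(concSummary, aes(x = value, y = altitude)) +
  geom_path() +     # connects points in row order (here: increasing altitude)
  geom_point() +
  labs(x = "Concentration", y = "Altitude")
print(p)
```

With the same aesthetics, geom_line() would sort the points by concentration and reproduce the zig-zag. Recent ggplot2 versions (3.3.0 and later, as far as I know) also accept geom_line(orientation = "y"), which connects points in order of the y variable without sorting the data first.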

Related

How to create a boxplot with log10 scale and zero values in ggplot2

I am creating several boxplots in ggplot2 with a log10 scale using
coord_trans(y="log10")
It is important that only the scale and not the data itself is log-transformed. One data set includes zero values, which become -Inf on a log10 scale, so the boxplot cannot be drawn.
I have tried to use
scale_y_continuous(trans=pseudo_log_trans(base=10))
However, this makes changes to the data instead of the scale. Outliers of the boxplot change and the boxplot stats extracted through ggplot_build(examplefig)$data are different from the original data.
Is there any way to create a boxplot in ggplot2 with a log10 scale and data including zero values? There should be no transformation of the data itself and outliers should be displayed like in the boxplot with the original data.
This is the very first question I ask here and I am new to R, so I hope the question is clear.
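One possible approach, sketched here under assumptions rather than taken from an accepted answer: pass the pseudo-log transformation to coord_trans() instead of scale_y_continuous(). Coordinate transformations are applied after the statistics are computed, so the box statistics and outliers should stay in the original units. The data below are invented, and a pseudo-log axis is only approximately log10 (it is close to linear near zero).

```r
library(ggplot2)
library(scales)

set.seed(42)
# Hypothetical skewed data that includes exact zeros
d <- data.frame(group = "a", y = c(0, 0, rlnorm(50, meanlog = 2, sdlog = 1)))

# Box statistics are computed on the raw y values;
# only the drawing of the axis is transformed
p <- ggplot(d, aes(x = group, y = y)) +
  geom_boxplot() +
  coord_trans(y = pseudo_log_trans(base = 10))
print(p)

# The computed statistics are still in the original units
stats <- ggplot_build(p)$data[[1]]
stats[, c("ymin", "lower", "middle", "upper", "ymax")]
```

Newer ggplot2 releases may report coord_trans() as superseded by coord_transform(); the idea is the same.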

How to put two y-axis with different scale on the same side of the plot with ggplot?

I have three variables (precipitation, temperature and PAR radiation) with different scales. I'm trying to plot these three variables together. I put the daily sum of precipitation, represented by a barplot, on the left-side y axis and the daily average of temperature on the right-side y axis. I'd like to put another y axis on the right side, with another scale, to represent the daily average of PAR radiation, but I can't. I'm using the ggplot2 package because it is useful for other reasons.
I'm trying to achieve something similar to the picture:

A discussion of this topic can be found here:
ggplot with 2 y axes on each side and different scales
A workaround solution can be found here:
https://rpubs.com/MarkusLoew/226759

You can work with the standard R graphics package:
http://evolvingspaces.blogspot.com/2011/05/multiple-y-axis-in-r-plot.html

Is it possible to make a pirateplot using frequencies instead of densities in R?

I have been searching for, and trying to make, the following plot in R for ages, but nothing seems to work.
What I want is a quantitative variable on the y axis and a categorical variable on the x axis, with just a horizontal histogram (of the y variable) for each category.
I couldn't find a package that does this. Any suggestions?
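A rough sketch of one way to get this with ggplot2 alone, assuming a reasonably recent version (3.3.0 or later) in which mapping the variable to y gives horizontal bars. The data frame and its column names (category, score) are hypothetical.

```r
library(ggplot2)

set.seed(1)
# Hypothetical data: one quantitative and one categorical variable
d <- data.frame(
  category = rep(c("A", "B", "C"), times = c(200, 300, 100)),
  score    = c(rnorm(200, 50, 10), rnorm(300, 60, 5), rnorm(100, 40, 15))
)

# Mapping the variable to y makes the bars run horizontally;
# one panel per category, all sharing the same y axis.
# Bar length is the raw count (frequency), not a density.
p <- ggplot(d, aes(y = score)) +
  geom_histogram(bins = 20) +
  facet_wrap(~ category, nrow = 1) +
  labs(x = "Count", y = "Score")
print(p)
```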

R + ggplot2, multiple histograms in the same plot with each histogram normalised to unit area?

Sorry for the newbie R question...
I have a data.frame that contains measurements of a single variable. These measurements will be distributed differently depending on whether the thing being measured is of type A or type B; that is, you can imagine that my column names are: measurement, type label (A or B). I want to plot the histograms of the measurements for A and B separately, and put the two histograms in the same plot, with each histogram normalised to unit area (this is because I expect the proportions of A and B to differ significantly). By unit area, I mean that A and B each have unit area, not that A+B have unit area. Basically, I want something like geom_density, but I don't want a smoothed distribution for each; I want the histogram bars. Not interleaved, but plotted one on top of the other. Not stacked, although it would be interesting to know how to do this also. (The purpose of this plot is to explore differences in the shapes of the distributions that would indicate that there are quantitative differences between A and B that could be used to distinguish between them.) That's all. Two or more histograms -- not smoothed density plots -- in the same plot with each normalised to unit area. Thanks!

Something like this?
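# generate example data
set.seed(1)
df <- data.frame(Type  = c(rep("A", 1000), rep("B", 4000)),
                 Value = c(rnorm(1000, mean = 25, sd = 10), rchisq(4000, 15)))
# you start here...
library(ggplot2)
ggplot(df, aes(x = Value)) +
  # after_stat(density) replaces the deprecated ..density.. notation
  geom_histogram(aes(y = after_stat(density), fill = Type),
                 color = "grey80", bins = 30) +
  # one panel per type, stacked vertically
  facet_grid(Type ~ .)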
Note that there are 4 times as many samples of type B.
You can also set the y-axis scales to float using: scales="free_y" in the call to facet_grid(...).
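If the histograms should be drawn one on top of the other in a single panel, as the question asks, a sketch along the following lines may work. It reuses the example data above; position = "identity", the alpha value and the bin count are illustrative choices, not part of the original answer.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(Type  = c(rep("A", 1000), rep("B", 4000)),
                 Value = c(rnorm(1000, mean = 25, sd = 10), rchisq(4000, 15)))

# position = "identity" overlays the groups instead of stacking them;
# after_stat(density) is computed per group, so each histogram has unit area
p <- ggplot(df, aes(x = Value, fill = Type)) +
  geom_histogram(aes(y = after_stat(density)),
                 position = "identity", alpha = 0.5, bins = 40)
print(p)
```

Semi-transparent fills keep both shapes visible where they overlap. Dropping position = "identity" falls back to the default stacked layout.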

R - logistic curve plot with aggregate points

Let's say I have the following dataset
bodysize=rnorm(20,30,2)
bodysize=sort(bodysize)
survive=c(0,0,0,0,0,1,0,1,0,0,1,1,0,1,1,1,0,1,1,1)
dat=as.data.frame(cbind(bodysize,survive))
I'm aware that the plot method for glm objects has several nice plots to show you the fit,
but I'd nevertheless like to create an initial plot with:
1) raw data points
2) the logistic curve
3) predicted points
4) aggregate points for a number of predictor levels
library(Hmisc)
plot(bodysize,survive,xlab="Body size",ylab="Probability of survival")
g=glm(survive~bodysize,family=binomial,dat)
curve(predict(g,data.frame(bodysize=x),type="resp"),add=TRUE)
points(bodysize,fitted(g),pch=20)
All fine up to here.
Now I want to plot the observed survival rates for given levels of the predictor (body size):
dat$bd<-cut2(dat$bodysize,g=5,levels.mean=T)
AggBd<-aggregate(dat$survive,by=list(dat$bd),data=dat,FUN=mean)
plot(AggBd,add=TRUE)
#Doesn't work
I've tried to match AggBd to the dataset used for the model and all sorts of other things, but I simply can't plot the two together. Is there a way around this?
I basically want to superimpose the last plot on the same axes.
Besides this specific task, I often wonder how to superimpose plots of different variables that have a similar scale and range on a two-dimensional plot. I would really appreciate your help.

The first column of AggBd is a factor; you need to convert its levels to numeric before you can add the points to the plot.
AggBd$size <- as.numeric(levels(AggBd$Group.1))[AggBd$Group.1]
To add the points to the existing plot, use points():
points(AggBd$size, AggBd$x, pch = 3)
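For reference, here is a self-contained sketch of the whole plot in base R. It is not the original code: it swaps Hmisc::cut2 for cut() with quantile breaks (so no extra package is needed) and uses the mean body size in each bin as the x position. The seed and the names bin and agg are assumptions made for illustration.

```r
set.seed(123)
bodysize <- sort(rnorm(20, 30, 2))
survive  <- c(0,0,0,0,0,1,0,1,0,0,1,1,0,1,1,1,0,1,1,1)
dat <- data.frame(bodysize, survive)

g <- glm(survive ~ bodysize, family = binomial, data = dat)

# Bin body size into 5 quantile groups (base-R stand-in for Hmisc::cut2)
breaks  <- quantile(dat$bodysize, probs = seq(0, 1, length.out = 6))
dat$bin <- cut(dat$bodysize, breaks = breaks, include.lowest = TRUE)

# Numeric x position (mean body size) and observed survival rate per bin
agg <- data.frame(
  size = as.numeric(tapply(dat$bodysize, dat$bin, mean)),
  rate = as.numeric(tapply(dat$survive,  dat$bin, mean))
)

plot(dat$bodysize, dat$survive,
     xlab = "Body size", ylab = "Probability of survival")   # raw data
curve(predict(g, data.frame(bodysize = x), type = "response"),
      add = TRUE)                                             # logistic curve
points(dat$bodysize, fitted(g), pch = 20)                     # predicted points
points(agg$size, agg$rate, pch = 3, col = "red")              # aggregate points
```

Because points() draws in the coordinate system of the existing plot, the aggregate points line up with the fitted curve without any extra axis handling.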

You are best off specifying your y-axis explicitly. You could also use par(new=TRUE):
plot(bodysize,survive,xlab="Body size",ylab="Probability of survival")
g=glm(survive~bodysize,family=binomial,dat)
curve(predict(g,data.frame(bodysize=x),type="response"),add=TRUE)
points(bodysize,fitted(g),pch=20)
# then draw the next plot on top of the current one
par(new=TRUE)
plot(AggBd$Group.1,AggBd$x,pch=3)
Remove or change the axis ticks and labels to prevent overlap, e.g.
plot(AggBd$Group.1,AggBd$x,pch=3,xaxt="n",yaxt="n",xlab="",ylab="")
