I am trying to get the y-axis labels of a ggplot in alphabetical order. The plot has a categorical variable (species) on Y and a continuous variable on X. However, the last species alphabetically appears at the top of the Y axis and the first one at the bottom.
Since I am new I cannot post images, but the plot shows a list of species on the y axis and, for each species, a point at its x value (the mean) with standard error bars. The species run from Wood Duck at the top to Alpine Swift at the bottom, with the rest in alphabetical order in between.
I would like the opposite: Alpine Swift at the top and Wood Duck at the bottom.
The command I used to plot the graph is the following:
# getting data for the error bars
limits <- aes(xmax = mydata$Xvalues + mydata$Xvalues_SD,
              xmin = mydata$Xvalues - mydata$Xvalues_SD)
# plot graph (each + must end a line, not start one, or R stops parsing early)
graph <- ggplot(data = mydata, aes(x = Xvalues, y = species)) +
  scale_y_discrete("Species") +
  scale_x_continuous(" ") + geom_point() + theme_bw() + geom_errorbarh(limits)
I have tried sorting my data set before loading it and drawing the graph.
I have also tried to reorder the species factor using the following command:
mydata$species <- ordered(mydata$species, levels=c("Alpine Swift","Azure-winged Magpie","Barn Swallow","Black-browed Albatross","Blue Tit1","Blue Tit2","Blue-footed Booby","Collared Flycatcher","Common Barn Owl","Common Buzzard","Eurasian Sparrowhawk","European Oystercatcher","Florida Scrub-Jay","Goshawk","Great Tit","Green Woodhoopoe","Grey-headed Albatross","House Sparrow","Indigo Bunting","Lesser Snow Goose","Long-tailed Tit","Meadow Pipit","Merlin","Mute Swan","Osprey","Pied Flycatcher","Pinyon Jay","Sheychelles Warbler","Short-tailed Shearwater","Siberian Jay","Tawny Owl","Ural Owl","Wandering Albatross","Western Gull1","Western Gull2","Wood Duck"))
But I get the same graph.
How can I change the order of my Y axis?
library(ggplot2)
df <- data.frame(x=rnorm(10),Species=LETTERS[1:10])
ggplot(df)+geom_point(aes(x=x,y=Species),size=3,color="red")
df$Species <- factor(df$Species,levels=rev(unique(df$Species)))
ggplot(df)+geom_point(aes(x=x,y=Species),size=3,color="red")
If you want to put y in some other order, say order of decreasing x, do this:
df$Species <- factor(df$Species, levels=df[order(df$x,decreasing=T),]$Species)
ggplot(df)+geom_point(aes(x=x,y=Species),size=3,color="red")
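Try changing +scale_y_discrete("Species") to +scale_y_discrete("Species", limits = rev). Discrete scales have no trans argument, so trans = 'reverse' does not work here; passing a function to limits requires ggplot2 3.3.0 or later.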
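As a sketch of reversing the axis at the scale level rather than in the data: discrete scales accept a limits argument, and in ggplot2 3.3.0 or later (an assumption about the installed version) it can be a function such as rev that is applied to the default order. The data frame below is made up for illustration.

```r
library(ggplot2)

# Toy data, for illustration only
set.seed(1)
df <- data.frame(x = rnorm(10), Species = LETTERS[1:10])

# limits = rev reverses the default (alphabetical) order of the discrete
# y axis, so "A" is drawn at the top and "J" at the bottom.
# Assumes ggplot2 >= 3.3.0, where limits may be a function.
p <- ggplot(df, aes(x = x, y = Species)) +
  geom_point(size = 3, color = "red") +
  scale_y_discrete("Species", limits = rev)
p
```

This leaves the factor levels in the data untouched, which can be convenient if the same column is used elsewhere.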
Using fct_rev() from package forcats and following jlhoward's example:
library(ggplot2)
library(forcats)
df <- data.frame(x = rnorm(10), Species = LETTERS[1:10])
# original plot
ggplot(df) +
geom_point(aes(x = x, y = Species), size = 3, color = "red")
# solution
ggplot(df) +
geom_point(aes(x = x, y = fct_rev(Species)), size = 3, color = "red")
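Applied to the original question, the same idea might look like the sketch below. The column names (species, Xvalues, Xvalues_SD) come from the question, but the four rows of data are invented stand-ins, and base R is used instead of forcats to keep the example dependency-free.

```r
library(ggplot2)

# Toy stand-in for the asker's data; the values are made up
mydata <- data.frame(
  species    = c("Alpine Swift", "Barn Swallow", "Merlin", "Wood Duck"),
  Xvalues    = c(0.42, 0.55, 0.61, 0.38),
  Xvalues_SD = c(0.05, 0.08, 0.04, 0.06)
)

# The first factor level is drawn at the bottom of the y axis, so
# reverse the alphabetical levels to get the first species at the top
mydata$species <- factor(mydata$species,
                         levels = rev(sort(unique(mydata$species))))

# geom_errorbarh is kept from the question; newer ggplot2 versions may
# warn that it is deprecated
graph <- ggplot(mydata, aes(x = Xvalues, y = species)) +
  geom_point() +
  geom_errorbarh(aes(xmin = Xvalues - Xvalues_SD,
                     xmax = Xvalues + Xvalues_SD)) +
  scale_y_discrete("Species") +
  scale_x_continuous(" ") +
  theme_bw()
graph
```

Mapping xmin and xmax by column name inside aes(), rather than through mydata$, keeps the error bars tied to the same rows as the points if the data is later filtered or reordered.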
I'm generating violin plots in ggplot2 for a time series, year_1 to year_32. The years in my df are stored as numerical values. From the examples I've seen, it seems that I must convert these numerical year values to factors to plot one violin per year; and in fact, if I run the code without as.factor, I get one big fat violin. I would like to understand why geom_violin can't have numeric values on the x axis; or, if I'm wrong about that, how to use them.
So:
my_data$year <- as.factor(my_data$year)
p <- ggplot(data = my_data, aes(x = year, y = continuous_var)+
geom_violin(fill = "#FF0000", color = "#000000")+
ylim(0,500)+
labs(x = "x_label", y = "y_label")
p +my_theme()
works fine, but if I skip
my_data$year <- as.factor(my_data$year)
it doesn't work, I get one big fat violin for all years. Why?
TIA
You are missing a closing ) at the end of this line: p <- ggplot(data = my_data, aes(x = year, y = continuous_var)
I have constructed a reproducible example with the ToothGrowth dataset.
This should work now:
library(ggplot2)
my_data <- ToothGrowth
my_data$dose <- as.factor(my_data$dose)
p <- ggplot(data = my_data, aes(x = dose, y = len))+
geom_violin(fill = "#FF0000", color = "#000000")+
ylim(0,500)+
labs(x = "x_label", y = "y_label") +
theme_bw()
p
PS: this discussion would fit better on Cross Validated, as it's more of a statistics question than a coding question.
I'm not 100% sure, but here's my explanation: a violin plot shows the density of a set of data, and you can split your data into groups so that one violin is drawn per group. But if the variable you use to define the groups (the x axis) is continuous, there is no finite set of groups (one for the values at 0, one for 0.1, one for 0.01, etc.), so in the end the data can't be divided, and ggplot probably ignores the x variable for grouping and draws one violin for all your data.
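To illustrate the grouping point: as far as I understand it, ggplot2 builds its default groups from the discrete variables in the plot, so a purely numeric x leaves a single group. A minimal sketch, assuming the group aesthetic is set explicitly, keeps x numeric and still draws one violin per value (ToothGrowth's dose column is used as a stand-in for year):

```r
library(ggplot2)

# dose is numeric here (0.5, 1, 2); it is deliberately NOT converted to a factor
my_data <- ToothGrowth

# group = dose tells ggplot2 which rows belong to the same violin,
# while x itself stays on a continuous scale
p <- ggplot(my_data, aes(x = dose, y = len, group = dose)) +
  geom_violin(fill = "#FF0000", color = "#000000") +
  labs(x = "x_label", y = "y_label") +
  theme_bw()
p
```

With this approach the violins sit at their true numeric positions, so unevenly spaced values (such as 0.5, 1 and 2 here) are unevenly spaced on the axis, unlike with a factor.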
I am trying to combine a line plot and horizontal barplot on the same plot. The difficult part is that the barplot is actually counts of the y values of the line plot.
Can someone show me how this can be done using the example below?
library(ggplot2)
library(plyr)
x <- c(1:100)
dff <- data.frame(x = x,y1 = sample(-500:500,size=length(x),replace=T), y2 = sample(3:20,size=length(x),replace=T))
counts <- ddply(dff, ~ y1, summarize, y2 = sum(y2))
# line plot
ggplot(data=dff) + geom_line(aes(x=x,y=y1))
# bar plot
ggplot() + geom_bar(data=counts,aes(x=y1,y=y2),stat="identity")
I believe what I need is presented in the pseudocode below but I do not know how to write it out in R.
Apologies. I actually meant the secondary x axis representing the value of counts for the barplot, while primary y-axis is the y1.
ggplot(data=dff) + geom_line(aes(x=x,y=y1)) + geom_bar(data=counts , aes(primary y axis = y1,secondary x axis =y2),stat="identity")
I just want the bars to be plotted horizontally, so I tried the code below, but it flips both the line chart and the barplot, which is not what I wanted either.
ggplot(data=dff) +
geom_line(aes(x=x,y=y1)) +
geom_bar(data=counts,aes(x=y2,y=y1),stat="identity") + coord_flip()
You can combine two plots in ggplot like you want by specifying different data = arguments in each geom_ layer (and none in the original ggplot() call).
ggplot() +
geom_line(data=dff, aes(x=x,y=y1)) +
geom_bar(data=counts,aes(x=y1,y=y2),stat="identity")
The following plot is the result. However, since x and y1 have different ranges, are you sure this is what you want?
Perhaps you want y1 on the vertical axis for both plots. Something like this works:
ggplot() +
geom_line(data=dff, aes(x=y1 ,y = x)) +
geom_bar(data=counts,aes(x=y1,y=y2),stat="identity", color = "red") +
coord_flip()
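If the goal is horizontal bars that share the y1 axis with the line, without flipping the line, one possible sketch is to draw the bars with orientation = "y" and show the counts on a secondary x axis. This assumes ggplot2 3.3.0 or later; the scale factor sf, the bar width of 5 and the use of aggregate() in place of ddply() are choices made for this example, not part of the question.

```r
library(ggplot2)

set.seed(42)
x <- 1:100
dff <- data.frame(x  = x,
                  y1 = sample(-500:500, size = length(x), replace = TRUE),
                  y2 = sample(3:20, size = length(x), replace = TRUE))

# Sum y2 for each distinct y1 value (base R stand-in for the ddply call)
counts <- aggregate(y2 ~ y1, data = dff, FUN = sum)

# Scale factor that stretches the count range onto the range of x
sf <- max(dff$x) / max(counts$y2)

p <- ggplot() +
  # horizontal bars: length along x (rescaled counts), position along y (y1);
  # width is the bar thickness in y1 units
  geom_col(data = counts, aes(x = y2 * sf, y = y1),
           orientation = "y", width = 5, fill = "tomato") +
  geom_line(data = dff, aes(x = x, y = y1)) +
  # the secondary x axis undoes the rescaling so it reads in counts
  scale_x_continuous(name = "x",
                     sec.axis = sec_axis(~ . / sf, name = "Counts")) +
  labs(y = "y1")
p
```

The line keeps x on the primary horizontal axis and y1 on the vertical axis, while the top axis reads off the bar lengths in the original count units.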
Maybe you are looking for this. Based on your last code, you are after a double axis. Using dplyr you can store the counts in the same data frame and then plot all the variables. Here is the code:
library(ggplot2)
library(dplyr)
#Data
x <- c(1:100)
dff <- data.frame(x = x,y1 = sample(-500:500,size=length(x),replace=T), y2 = sample(3:20,size=length(x),replace=T))
#Code
dff %>% group_by(y1) %>% mutate(Counts=sum(y2)) -> dff2
#Scale factor
sf <- max(dff2$y1)/max(dff2$Counts)
# Plot
ggplot(data=dff2)+
geom_line(aes(x=x,y=y1),color='blue',size=1)+
geom_bar(stat='identity',aes(x=x,y=Counts*sf),fill='tomato',color='black')+
scale_y_continuous(name="y1", sec.axis = sec_axis(~./sf, name="Counts"))
Output:
I have the following data:
I would like to generate a bar plot that shows the frequency of each value of Var1 for each run. I want the x axis to represent the runs and the y axis to represent the frequency of each Var1 value. To do that, I wrote the following R script:
df <- read.csv("/home/nasser/Desktop/data.csv")
g <- ggplot(df) +
geom_bar(aes(Run, Freq, fill = Var1, colour = Var1), position = "stack", stat = "identity")
The result that I got is:
The issue is that the x axis does not show each run separately (the axis should be 1, 2, ..., etc.) and the legend should show each value of Var1 separately and in a different color. Also, the bars are not clear, since it is difficult to see the frequency of each Var1 value. In other words, the generated plot is not the normal stacked bar chart like the one shown in this answer.
How can I solve that?
You need to convert both variables to factors. Otherwise, R sees them as numerical and not categorical data.
df <- read.csv("/home/nasser/Desktop/data.csv")
g <- ggplot(df) +
geom_bar(aes(factor(Run), Freq, fill = factor(Var1), colour = factor(Var1)),
position = "stack", stat = "identity")
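Since the original data is not shown, here is a sketch with an invented data frame in the shape the question describes (columns Run, Var1 and Freq; all values are made up). It converts the columns once, up front, instead of wrapping them in factor() inside aes(), which also gives cleaner axis and legend titles.

```r
library(ggplot2)

# Made-up data in the shape described in the question
df <- data.frame(
  Run  = rep(1:3, each = 3),
  Var1 = rep(c(10, 20, 30), times = 3),
  Freq = c(5, 2, 1, 4, 3, 2, 6, 1, 3)
)

# Convert once so that both the axis and the legend are categorical
df$Run  <- factor(df$Run)
df$Var1 <- factor(df$Var1)

# geom_col() is shorthand for geom_bar(stat = "identity")
g <- ggplot(df, aes(x = Run, y = Freq, fill = Var1)) +
  geom_col(position = "stack") +
  labs(x = "Run", y = "Frequency", fill = "Var1")
g
```

With factors, each run gets its own tick on the x axis and each Var1 value gets its own legend entry and fill color.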
Is it possible to plot a boxplot and a stripchart next to each other in the same figure? If I run this code, the stripchart is drawn over the boxplots. What I actually want is for them to lie next to each other, so that a figure with 10 columns on the x-axis is formed. Is that possible?
boxplot(doubles[1:5,])
stripchart(doubles[6:10,],add=TRUE,vertical=TRUE, pch=19)
Some example of your data would be good, but the easiest option is probably:
#random data corresponding to your 5 columns
x <- data.frame(V = rnorm(100), W = rnorm(100), X = rnorm(100), Y = rnorm(100),
Z = rnorm(100))
#remove axis with 'axes=F', define wider x-limits with 'xlim'
stripchart(x[1:5,],vertical=TRUE, pch=19,xlim=c(1,6),axes=F)
#add boxplots next to stripchart, decrease width with 'boxwex'
boxplot(x[1:5,],add=T,at=1.5:5.5,boxwex=0.25,axes=F)
#add custom x axis
axis(1,at=1.25:5.25,labels=names(x))
Use ggplot2
library(ggplot2)
qplot(treatment, decrease, data = OrchardSprays) +
scale_y_log10() +
geom_boxplot() +
geom_point(colour = 'blue', alpha = 0.5)
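The ggplot2 code above overlays the points on the boxes. To place them side by side instead, one possible sketch (assuming position_nudge() is acceptable for both layers; the offsets of 0.2 and the box width of 0.3 are arbitrary choices for this example) is:

```r
library(ggplot2)

# Shift the boxes slightly left and the points slightly right of each
# treatment's tick, so that they sit next to each other
p <- ggplot(OrchardSprays, aes(x = treatment, y = decrease)) +
  geom_boxplot(width = 0.3, position = position_nudge(x = -0.2)) +
  geom_point(colour = "blue", alpha = 0.5,
             position = position_nudge(x = 0.2)) +
  scale_y_log10()
p
```

Each treatment then shows a narrow box with its raw observations in a column beside it, which is close to the boxplot-plus-stripchart layout asked for.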
I am trying to display a line on top of a boxplot whose x axis is built from a factor.
This code works well:
x <- c(91,92,93,125,123,140)
y <- c(200,260,220,300,350,360)
d1 <- data.frame(x=x,y=y)
d1$f1 = factor(round(d1$x/10))
qplot(f1,y,data=d1,geom="boxplot")
d2<-data.frame(x2=c(90,140),y2=c(210,320))
qplot(x2,y2,data=d2,geom="line")
But when I try to add the line to the graph...
qplot(f1,y,data=d1,geom="boxplot") + geom_line(data = d2, aes(x = x2, y=y2))
To see my results: http://jeb-files.s3.amazonaws.com/Clipboard01.jpg
How do I manage to have my line align with my boxplot?
Thanks!
The boxplot here uses a factor on x, whereas geom_line is given numeric x-values. You can get what you want by modifying the geom_line call so that the x value is the numeric position of round(x2/10) among the levels of f1. Reusing the levels of f1 keeps the line aligned with the boxes even when d2 does not contain every level:
qplot(f1, y, data = d1, geom = "boxplot") +
geom_line(data = d2, aes(x = as.numeric(factor(round(x2/10), levels = levels(d1$f1))), y = y2))
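An alternative sketch, assuming it is acceptable to keep the x axis numeric: bin x without converting it to a factor, use the group aesthetic to get one box per bin, and put the line on the same numeric scale. The column name bin is introduced here for illustration only.

```r
library(ggplot2)

d1 <- data.frame(x = c(91, 92, 93, 125, 123, 140),
                 y = c(200, 260, 220, 300, 350, 360))
d1$bin <- round(d1$x / 10)   # numeric bin, not a factor
d2 <- data.frame(x2 = c(90, 140), y2 = c(210, 320))

p <- ggplot() +
  # group = bin draws one box per bin on a continuous x axis
  geom_boxplot(data = d1, aes(x = bin, y = y, group = bin)) +
  # the line uses the same units (x / 10), so no factor lookup is needed
  geom_line(data = d2, aes(x = x2 / 10, y = y2)) +
  labs(x = "x / 10", y = "y")
p
```

Because both layers share one continuous scale, the boxes are spaced according to their actual bin values and the line's endpoints fall at their true x positions.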