I am building a ggplot2 figure with a facet grid. My y-axis shows percentages and my x-axis shows the concentration (numeric). Each facet has 3 groups (0, 24 and 48 hours):
ggplot(data=MasterTable, aes(x=Concentration, y=Percentage, group=Time)) +
geom_point() +
geom_line() +
facet_grid(Chemicals ~ Treatments)
This generates a continuous x-axis. Since the values are not evenly spaced, I would prefer a discrete axis to better visualize my data. I followed this tutorial with no luck; its first figure is exactly what I am trying to do.
I also tried formatting the axis:
scale_x_discrete(labels("0", "0.1", "2", "50"))
and formatting the line:
geom_line(aes(Time))
and following this tutorial.
I think the problem is that the values on the x-axis are numbers rather than strings, which makes the default axis continuous. How can I change this? I am sure the solution is simple, I just can't figure it out.
Thanks in advance!
On this page they make the following modification: df2$dose<-as.factor(df2$dose). You can apply the same idea to your own x-axis variable: MasterTable$Concentration<-as.factor(MasterTable$Concentration)
or like this:
ggplot(data=MasterTable, aes(x=factor(Concentration), y=Percentage, group=Time)) +
geom_point() +
geom_line() +
facet_grid(Chemicals ~ Treatments)
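A minimal sketch of the conversion on a small made-up data frame (`toy` and its values are hypothetical, not the asker's MasterTable). Once the column is a factor, ggplot2 treats it as discrete and spaces the levels evenly. factor() already sorts numeric values in ascending order; the levels are spelled out here only to make the ordering explicit.

```r
library(ggplot2)

# Hypothetical stand-in for MasterTable (values invented for illustration)
toy <- data.frame(
  Concentration = rep(c(0, 0.1, 2, 50), times = 3),
  Time          = rep(c(0, 24, 48), each = 4),
  Percentage    = c(10, 20, 35, 60, 15, 30, 50, 80, 20, 40, 65, 95)
)

# Convert once, with the levels in ascending numeric order
toy$ConcFactor <- factor(toy$Concentration,
                         levels = sort(unique(toy$Concentration)))

# group = Time keeps one line per time point on the discrete axis
p <- ggplot(toy, aes(x = ConcFactor, y = Percentage, group = Time)) +
  geom_point() +
  geom_line() +
  xlab("Concentration")
p
```

The four concentrations now sit at equal distances on the x-axis, whatever their numeric gaps.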
Related
I'm trying to create a stacked barchart with gene sequencing data, where for each gene there is a tRF.type and Amino.Acid value. An example data set looks like this:
tRF <- c('tRF-26-OB1690PQR3E', 'tRF-27-OB1690PQR3P', 'tRF-30-MIF91SS2P46I')
tRF.type <- c('5-tRF', 'i-tRF', '3-tRF')
Amino.Acid <- c('Ser', 'Lys', 'Ser')
tRF.data <- data.frame(tRF, tRF.type, Amino.Acid)
I would like the x-axis to represent the amino acid type, the y-axis the number of counts of each tRF type, and the fill of the bars to represent each tRF type.
My code is:
ggplot(chart_data, aes(x = Amino.Acid, y = tRF.type, fill = tRF.type)) +
geom_bar(stat="identity") +
ggtitle("LAN5 - 4 days post CNTF treatment") +
xlab("Amino Acid") +
ylab("tRF type")
However, it generates this graph, where the y-axis is labelled with the categories of tRF type. How can I change my code so that the y-axis scale is numerical and represents the counts of each tRF type?
[Image: Barchart]
OP and Welcome to SO. In future questions, please, be sure to provide a minimal reproducible example - meaning provide code, an image (if possible), and at least a representative dataset that can demonstrate your question or problem clearly.
TL;DR - don't use stat="identity", just use geom_bar() without providing a stat, since default is to use the counts. This should work:
ggplot(chart_data, aes(x = Amino.Acid, fill = tRF.type)) + geom_bar()
The dataset provided doesn't adequately demonstrate your issue, so here's one that can work. The example data herein consists of 100 observations and two columns: one called Capitals for randomly-selected uppercase letters and one Lowercase for randomly-selected lowercase letters.
library(ggplot2)
set.seed(1234)
df <- data.frame(
Capitals=sample(LETTERS, 100, replace=TRUE),
Lowercase=sample(letters, 100, replace=TRUE)
)
If I plot similar to your code, you can see the result:
ggplot(df, aes(x=Capitals, y=Lowercase, fill=Lowercase)) +
geom_bar(stat="identity")
You can see, the bars are stacked, but the y axis is all smooshed down. The reason is related to understanding the difference between geom_bar() and geom_col(). Checking the documentation for these functions, you can see that the main difference is that geom_col() will plot bars with heights equal to the y aesthetic, whereas geom_bar() plots by default according to stat="count". In fact, using geom_bar(stat="identity") is really just a complicated way of saying geom_col().
Even though your y aesthetic is not numeric, ggplot still tries to treat the discrete levels numerically. It doesn't really work out well, and it's the reason why your axis gets smooshed down like that. What you want is geom_bar(stat="count"), which is the same as just using geom_bar() without providing a stat=.
The one catch is that geom_bar() with the default stat="count" accepts either an x or a y aesthetic, not both. This means you should only give it one of them. This fixes the issue and now you get the proper chart:
ggplot(df, aes(x=Capitals, fill=Lowercase)) + geom_bar()
You want your y-axis to be a count, not tRF.type. This code should give you the correct plot: I've removed y = tRF.type from ggplot() and stat = "identity" from geom_bar() (it now uses the default value, stat = "count", instead).
ggplot(tRF.data, aes(x = Amino.Acid, fill = tRF.type)) +
geom_bar() +
ggtitle("LAN5 - 4 days post CNTF treatment") +
xlab("Amino Acid") +
ylab("Count")  # the y-axis now shows counts, not tRF type
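To see which numbers geom_bar() will draw, the counts can be tabulated by hand in base R. This is only a sanity check on the example data from the question, not part of the plotting code.

```r
# Example data from the question
tRF <- c('tRF-26-OB1690PQR3E', 'tRF-27-OB1690PQR3P', 'tRF-30-MIF91SS2P46I')
tRF.type <- c('5-tRF', 'i-tRF', '3-tRF')
Amino.Acid <- c('Ser', 'Lys', 'Ser')
tRF.data <- data.frame(tRF, tRF.type, Amino.Acid)

# Rows are amino acids, columns are tRF types;
# each cell is the height of one stacked bar segment
counts <- table(tRF.data$Amino.Acid, tRF.data$tRF.type)
counts

# Total bar height per amino acid
rowSums(counts)
```

With this data the Ser bar has a total height of 2 (one 5-tRF and one 3-tRF) and the Lys bar a height of 1.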
I'm trying to convert the decimal values in the column titled cover_type_ratio below into percentages in ggplot2. Here is the df:
[Image: DF]
And here is my code for my bar graph:
plot<-ggplot(coverType_count, aes(x=Cover_Type, y=count)) +
geom_bar(stat="identity") +
scale_y_continuous() +
geom_text(data=coverType_count,aes(label=count,y=count+100),size=4) +
geom_text(data=coverType_count,
aes(label=cover_type_ratio,y=count+200),size=4)+
theme(axis.text.x=element_text(angle=30,hjust=1,size=8))+
ggtitle('Cover Type Distribution')
When plotted, it looks like this:
[Image: Screenshot of Plot]
I need to change the jumbled numbers to percentages from the aforementioned cover_type_ratio column in df, and have tried using the scales library, but to no avail. Does anyone have any suggestions?
Thank you!
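One possible approach, sketched in base R without the scales package: build the percentage labels with sprintf() before (or inside) the geom_text() call. The ratio values below are hypothetical, since the original data frame is only shown as an image.

```r
# Hypothetical ratios standing in for coverType_count$cover_type_ratio
cover_type_ratio <- c(0.3646, 0.4876, 0.0612)

# Multiply by 100 and format with one decimal place plus a literal % sign
pct_label <- sprintf("%.1f%%", 100 * cover_type_ratio)
pct_label   # "36.5%" "48.8%" "6.1%"
```

The same expression could be used as the label in the second geom_text() layer, for example aes(label = sprintf("%.1f%%", 100 * cover_type_ratio), y = count + 200).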
I was hoping someone might be able to help. I am still getting to grips with R and I am quite new to ggplot2.
My problem:
I am trying to make a stacked area plot. I have formatted my data frame so that it is in long format. My columns are Date, Category (filter.size) and value (chl.average).
e.g:
[Image: data frame example]
The issue I am having is that when I try to plot this (where Chlstacked is my data.frame):
stkchl <- ggplot(Chlstacked, aes(x=Date, y=chl.average,
fill=filter.size)) + geom_area()
stkchl
the axis and background layer plots but not the actual stack, although it recognises the categories in a legend with colours.
I have tried an alternate method:
stkchl <- ggplot(Chlstacked, aes(x=Date, y=chl.average))
stkchl
stkchl + geom_area(aes(colour = chl.average, fill= chl.average),
position = 'stack')
Which gives: Error in f(...) : Aesthetics can not vary with a ribbon
My thought is that the Dates, which I want on the x-axis (as it is a time series), are repeated for each category (>20, <20>5, <5>GFF), so they are not unique. Maybe that is causing the error, although I am stumped as to how. Or perhaps it is something simple that I am doing wrong in my code?
Any help would be appreciated - thanks
I have a little problem with a ggplot barchart.
I want to make a bar chart with ggplot2 to compare the Svolume values of my 4 stocks over a period of a few months.
I have two problems:
The first is that my y-axis is wrong. The bars look correct, but the axis does not follow them: it seems to list my individual Svolume values instead of a continuous scale. I would like a scale that covers all of my data, e.g. 10, 20, and so on, up to my highest total Svolume.
Here is my code:
Date=c(rep(data$date))
Subject=c(rep(data$subject))
Svolume=c(data$svolume)
Data=data.frame(Date,Subject,Svolume)
Data=ddply(Data, .(Date),transform,pos=cumsum(as.numeric(Svolume))-(0.5*(as.numeric(Svolume))))
ggplot(Data, aes(x=Date, y=Svolume))+
geom_bar(aes(fill=Subject),stat="identity")+
geom_text(aes(label=Svolume,y=pos),size=3)
and here is my plot:
I helped with the question here
Finally, how could I make the same plot for each month? I don't know how to get the values per month, which would make the bar chart more readable; right now nothing can be read from it.
If you have other ideas, I would be very glad to hear any advice! Maybe a line chart would be more readable? Or maybe one bar chart per stock? (I don't know how to get the values per stock either.)
I just found out how to do it with lines, but once again my y-axis is wrong and it's not very readable.
Thanks for your help !! :)
Try adding the following line right before your ggplot call. It looks like your y-axis variable is stored as character (or factor) rather than as a number.
[edit] Incorporating @user20650's comment: apply as.character() first, then convert to numeric.
Data$Svolume <- as.numeric(as.character(Data$Svolume))
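A short illustration of why the as.character() step matters when the column is a factor: as.numeric() on a factor returns the internal level codes, not the numbers shown. The vector `svol` is a made-up example, not the asker's data.

```r
# Numbers stored as a factor, as can happen when a column is read in as text
svol <- factor(c("10", "20", "5"))

as.numeric(svol)                 # level codes, not the values: 1 2 3
as.numeric(as.character(svol))   # the actual values: 10 20 5
```

If the column is already character rather than factor, the extra as.character() call is harmless.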
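To produce the same plot for each month, first add a month variable: Data$Month <- month(as.Date(Date)). Note that month() is not base R; it is presumably the one from the lubridate package, so load that with library(lubridate) first. Then add a facet to your ggplot object.
ggplot(Data, aes(x=Date, y=Svolume)) +
... +
facet_wrap(~ Month)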
For example, your bar chart code will be:
library(lubridate)  # assumed source of month(); it is not part of base R
Data$Svolume <- as.numeric(as.character(Data$Svolume))
Data$Month <- month(as.Date(Date))
ggplot(Data, aes(x=Date, y=Svolume)) +
geom_bar(aes(fill=Subject),stat="identity") +
geom_text(aes(label=Svolume,y=pos),size=3) +
facet_wrap(~ Month)
and your Line chart code will be:
library(lubridate)  # assumed source of month(); it is not part of base R
Data$Svolume <- as.numeric(as.character(Data$Svolume))
Data$Month <- month(as.Date(Date))
ggplot(Data, aes(x=Date, y=Svolume, colour=Subject)) +
geom_line() +
facet_wrap(~ Month)
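month() is not a base R function; it is provided by add-on packages such as lubridate (which package the answer had in mind is an assumption). If no such package is loaded, the month can be extracted with base R alone. The dates below are hypothetical.

```r
# Hypothetical dates; the real ones would come from Data$Date
dates <- as.Date(c("2014-01-15", "2014-02-03", "2014-02-20"))

# "%m" gives the zero-padded month number as a character string
month_chr <- format(dates, "%m")
month_chr               # "01" "02" "02"

# Convert to integer if a numeric month is preferred
month_int <- as.integer(month_chr)
month_int               # 1 2 2
```

Either version can be stored in Data$Month and passed to facet_wrap(~ Month).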
I need to plot some discrete predictions with probability intervals in ggplot2, but I'm having some problems.
I have the following data.frame
city pred min.80 max.80
BH 100 50 150
RJ 120 80 140
SP 90 80 100
I want a plot with the cities on y-axis and the predicted values on x-axis. For each discrete value of y, there should be a horizontal bar with its range being the min.80 and max.80 values. My idea is to use geom_rect from ggplot2 for doing it.
I've tried the following code, but the problem is that I'm converting the discrete variable to continuous in order to plot it, and I lose their values on the label.
> ggplot(df) + geom_rect(aes(xmin=min.80, xmax=max.80, ymin=as.numeric(city)-0.4,
+ ymax=as.numeric(city)+0.4))
Is there another way to do it?
I suggest you use the geom pointrange or crossbar:
ggplot(df, aes(x=city)) +
geom_pointrange(aes(ymin=min.80, ymax=max.80, y=pred)) +
coord_flip()
ggplot(df, aes(x=city)) +
geom_crossbar(aes(ymin=min.80, ymax=max.80, y=pred)) +
coord_flip()
I think you want to keep the y axis as a factor (y=city). This kind of (estimate + interval) data is probably better shown with something like geom_pointrange. After all, the "height" of the rectangle doesn't have an interpretation.
If you have to have the errorbars be horizontal, I've done this before in two ways:
using coord_flip()
Last time I tried coord_flip(), it was a bit limited, so I sometimes also recreated the geom_pointrange() functionality by combining geom_hline() with geom_point().
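A rough sketch of the second approach without coord_flip(). Note that it swaps in geom_segment() for geom_hline(), since geom_hline() draws a line across the whole panel rather than between two x values; treat this as one possible reconstruction, not the answerer's exact code. The data frame is the one from the question.

```r
library(ggplot2)

# Data from the question
df <- data.frame(
  city   = c("BH", "RJ", "SP"),
  pred   = c(100, 120, 90),
  min.80 = c(50, 80, 80),
  max.80 = c(150, 140, 100)
)

# One horizontal interval per city, with the point estimate drawn on top;
# city stays discrete on the y-axis, so its labels are kept
p <- ggplot(df, aes(y = city)) +
  geom_segment(aes(x = min.80, xend = max.80, yend = city)) +
  geom_point(aes(x = pred)) +
  xlab("pred")
p
```

Because city is mapped directly to y, no as.numeric() conversion is needed and the axis keeps the city names.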