I am trying to plot trendlines for my data by month and show them on a faceted ggplot, but R will not draw the trendline. I want to use geom_line() or geom_smooth(). My hypothesis is that ggplot treats the Month names as categorical (a factor), since my code otherwise seems to match online tutorials for showing points and lines on the same faceted ggplot.
To test this, I tried using month.name to convert my month names to recognizable months, but R does this incorrectly (May, July, and September were converted to April, February, and June?).
The plots mostly display as I want. For example, the following code draws the points correctly:
Toledo_month_log_cyano %>% ggplot(aes(x=Month, y = `Log GC/L`)) +
geom_point(aes(color = gene)) +
geom_smooth() +
theme_classic() +
facet_wrap(~gene, ncol = 1)
It looks mainly how I want, I just want a trend line which doesn't show. If someone could help me figure out what the problem is and solve it that would be great!
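One possible explanation, sketched on made-up data (the `toy` data frame, its values, and the `month_num` column are assumptions for illustration, not the asker's data): with a character or factor x, every month is its own group, so geom_smooth() has nothing to fit a line through. Indexing month.name with a factor's integer codes would also explain the wrong conversions, because the codes follow alphabetical level order. Matching the names against month.name gives a numeric x instead. method = "lm" is used here only because the toy data has few points.

```r
library(ggplot2)

# Why the month.name conversion went wrong: factor codes are alphabetical.
f <- factor(c("May", "July", "September", "June", "August", "October"))
wrong <- month.name[as.integer(f)]  # indexes by alphabetical code, not by month

# Toy stand-in for the real data (values are invented).
toy <- data.frame(
  Month = rep(c("May", "June", "July", "August", "September"), times = 2),
  gene  = rep(c("geneA", "geneB"), each = 5),
  value = c(3.1, 3.8, 4.6, 5.0, 4.2, 2.0, 2.4, 3.3, 3.9, 3.5)
)

# Look each name up in month.name to get its calendar number (May -> 5, ...).
toy$month_num <- match(toy$Month, month.name)

p <- ggplot(toy, aes(x = month_num, y = value)) +
  geom_point(aes(color = gene)) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  scale_x_continuous(breaks = 5:9, labels = month.name[5:9]) +
  theme_classic() +
  facet_wrap(~gene, ncol = 1)
```

If the x-axis has to stay discrete, adding group = 1 inside aes() is another commonly suggested way to make the smoother treat each panel as one series.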
I have two questions.
I am using a dataset detailing information on Chicago (temp, ozone, season, date).
I would like a plot that shows the temperature across time, along with a smooth line to
show the trend, restricted to dates from 1997 to 2000.
Thus:
library(ggplot2)
library(dplyr)
chicago <- read.csv(file = 'FilePath')
## Limit data to 1997-2000
chicago2 <- chicago %>%
filter(date >= "1997-01-01" & date <= "2000-12-31")
ggplot(chicago2, aes(x=date, y=temp)) +
geom_point() +
geom_smooth() +
labs(title="Temperature")
My issues are as follows:
There seems to be an issue with the x-axis: the dates are not cleanly represented. When I zoom into the picture, it looks like R is plotting every single date on the x-axis, but I am not sure if this is the actual issue.
While I am able to plot the scatterplot accurately, there is no overlaid smooth line, despite the use of the geom_smooth() function.
Looking forward to your responses,
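Both symptoms are consistent with the date column being read in as text, so that ggplot treats it as a discrete axis. A minimal sketch under that assumption, using a synthetic stand-in for the Chicago data (the weekly dates and the sine-wave temperatures are invented): converting with as.Date() gives a continuous date axis, which lets scale_x_date() pick clean breaks and gives geom_smooth() something to fit.

```r
library(ggplot2)

# Synthetic stand-in for the real file; dates start out as character strings,
# as they would after read.csv().
dates_chr <- as.character(seq(as.Date("1996-01-01"), as.Date("2001-12-31"), by = "week"))
chicago <- data.frame(date = dates_chr, stringsAsFactors = FALSE)
chicago$temp <- 50 + 25 * sin(2 * pi * as.numeric(as.Date(chicago$date)) / 365.25)

# Convert the text column to a real Date before filtering and plotting.
chicago$date <- as.Date(chicago$date)

# Limit data to 1997-2000, comparing Date with Date.
chicago2 <- subset(chicago,
                   date >= as.Date("1997-01-01") & date <= as.Date("2000-12-31"))

p <- ggplot(chicago2, aes(x = date, y = temp)) +
  geom_point() +
  geom_smooth(method = "loess", formula = y ~ x) +
  scale_x_date(date_breaks = "1 year", date_labels = "%Y") +
  labs(title = "Temperature")
```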
This is a bit of a newbie question. I am using the package "nycflights13" in R, and "tidyverse".
library(nycflights13)
library(tidyverse)
I am trying to get a bar chart that shows the total number of flights by airline/carrier, and have it color each bar by the number of flights that occurred each month.
I can get a simple bar chart to show with the following:
ggplot(flights) +
geom_bar(mapping=aes(x=carrier))
When I try to color it with the month, it doesn't change anything.
ggplot(flights) +
geom_bar(mapping=aes(x=carrier, fill=month))
The graph generated by the code above looks exactly the same.
It seems to work when I do the opposite... if I create a chart with "month" on the x-axis and color by carrier, it works just like I would expect.
ggplot(flights) +
geom_bar(mapping=aes(x=month,fill=carrier))
I assume it has something to do with discrete vs continuous variables?
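Yes, this has to do with discrete vs continuous variables. month is stored as a number, so fill treats it as continuous, and a continuous fill cannot split the counted bars into groups. as.factor() converts month to a discrete factor.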
ggplot(flights) +
geom_bar(mapping=aes(x=carrier, fill=as.factor(month)))
For fun, there is a way to override geom_bar's built-in stat_count default. This requires adding a dummy variable to flights, to use as y, and sorting the data by month (or you get weird artifacts). See the help page, ?geom_bar.
flights$n<-1
flights%>%
arrange(month)%>%
ggplot(aes(carrier, n, fill = month)) +
geom_bar(stat = "identity") +
scale_fill_continuous(low="blue", high="red")
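A related sketch, assuming a small made-up table in place of flights (the `toy` data frame is hypothetical): counting rows per carrier and month first with dplyr::count() and then drawing with geom_col() makes the bar heights explicit, without the dummy column.

```r
library(dplyr)
library(ggplot2)

# Tiny stand-in for nycflights13::flights (invented rows).
toy <- data.frame(
  carrier = c("AA", "AA", "AA", "UA", "UA", "DL"),
  month   = c(1, 1, 2, 1, 3, 2)
)

# One row per carrier/month combination, with the count in column n.
counts <- toy %>% count(carrier, month)

# geom_col() plots the precomputed n; factor(month) gives a discrete fill.
p <- ggplot(counts, aes(x = carrier, y = n, fill = factor(month))) +
  geom_col()
```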
I am trying to make a graph with multiple facets in iNEXT. The facets are months. Since the default order for facet panels is alphabetical, it puts them as August, July, June, May, but I want the reverse order. I tried converting site (which is the month in my case) to a factor, but that did not work.
Does anybody have an idea what I can do to fix this?
I've also wanted to know how the designers wanted us to do that. My workaround was to build it up from more basic parts... below you can see how I reordered by descending elevation in the ant dataset:
# faceting in iNEXT
library(iNEXT)
library(ggplot2)
library(dplyr) # for %>%
data(ant)
adat<-iNEXT(ant, datatype = "incidence_freq")
# if you look in Anne Chao's github, you can see the source code for ggiNEXT
# it generates a df with fortify.iNEXT, which is what ggiNEXT plots
plotdf<-fortify.iNEXT(adat, type=1)
#manually relevel the factor
plotdf$site<-factor(plotdf$site, levels=c("h2000m", "h1500m","h1070m", "h500m", "h50m"))
#then just use your own clever ggplot ideas to make the plot you want
# theme_classic() comes before theme(); otherwise it resets legend.position
p <- plotdf %>%
  ggplot(aes(x, y, colour = order)) +
  geom_point(size = 1) +
  geom_line(lwd = 0.5) +
  facet_wrap(~site) +
  theme_classic() +
  theme(legend.position = "none")
print(p)
https://github.com/AnneChao/iNEXT/blob/master/R/ggiNEXT.R
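For the month case in the question, the same releveling idea can be sketched in plain ggplot2 on invented data (the `toy` data frame and its values are assumptions). facet_wrap() lays panels out in factor level order, so setting the levels explicitly controls the panel order once you are plotting the data frame yourself. Whether ggiNEXT itself honors a releveled factor is not something this sketch shows.

```r
library(ggplot2)

# Made-up data with month names as the faceting variable.
toy <- data.frame(
  site = rep(c("August", "July", "June", "May"), each = 3),
  x    = rep(1:3, times = 4),
  y    = c(5, 7, 8, 4, 6, 7, 3, 5, 6, 2, 4, 5)
)

# Without this step the panels would appear alphabetically.
toy$site <- factor(toy$site, levels = c("May", "June", "July", "August"))

p <- ggplot(toy, aes(x, y)) +
  geom_point(size = 1) +
  geom_line() +
  facet_wrap(~site) +
  theme_classic()
```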
I have searched considerably for what I want to accomplish, but I haven't run across examples or plots that are specifically what I'm looking for, so I am reaching out to the community.
What I have (data downloadable here):
Time-series data (each record 2 hours apart and spanning nearly a year) with associated elevation and property ownership.
library(ggplot2)
data <- read.csv("dataex.csv")
data$timestamp <-as.POSIXct(as.character(data$timestamp),format="%m/%d/%Y %H:%M", tz="GMT")
What I want (via ggplot):
A line or bar plot showing elevation (y-axis) across time (x-axis) for each data record, colored by ownership (for a line plot, filling the area under the line; for a bar plot, filling the bar). I've tried iterations of geom_line, geom_bar, and geom_area (the geom_bar version below is the closest I have come). I'd like at least one of the following options to work!
Option A - The closest I have come to achieving this (plotting per data record) is with the following code:
ggplot(data, aes(x=timestamp, y=elev, fill=OWNER)) + geom_bar(stat="identity")
However, I'd like the bars to be touching each other, but if I adjust the width in geom_bar(), everything disappears. (Also, if I run the above code on other batches of similar data, it only shows a fraction of the bars, likely because they have more data records.) It seems like it's just too much data to plot. So I tried another route...
Option B - Plotting by day, which turns out to be more informative, showing each day the variability in ownership.
ggplot(data, aes(x=as.Date(Date, format='%Y-%m-%d'), y=elev, fill=OWNER)) + geom_bar(stat="identity", width=1)
However, this sums the y-axis, so the elevation is not interpretable. I could divide the y-axis by 12 (the typical number of records per day) but there are occasional days with fewer than 12 records, which causes the y-axis to be incorrect. Is there a function or a way to divide the y-axis by the respective number of records per day that is being represented in the plot? Or does someone have advice for a better solution?
Something like:
library(readr)
library(dplyr)
library(ggplot2)
library(ggalt)
readr::read_csv("~/Desktop/dataex.csv") %>%
mutate(timestamp=lubridate::mdy_hm(timestamp)) %>%
select(timestamp, elev, Owner=OWNER) -> df
ggplot(df, aes(timestamp, elev, group=Owner, color=Owner)) +
geom_segment(aes(xend=timestamp, yend=0), size=0.1) +
scale_x_datetime(expand=c(0,0), date_breaks="2 months") +
scale_y_continuous(expand=c(0,0), limits=c(0,2250), labels=scales::comma) +
ggthemes::scale_color_tableau() +
hrbrmisc::theme_hrbrmstr(grid="Y") +
labs(x=NULL, y="Elevation") +
theme(legend.position="bottom") +
theme(axis.title.y=element_text(angle=0, margin=margin(r=-20)))
?
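On the Option B question of dividing by the number of records per day: one way to sketch it, assuming made-up records in place of dataex.csv (the `toy` data frame and the `n_day` and `elev_share` columns are hypothetical), is to divide each record's elevation by that day's record count before summing per owner. The stacked bar for a day then totals that day's mean elevation, however many records the day has.

```r
library(dplyr)
library(ggplot2)

# Invented records: three on the first day, two on the second.
toy <- data.frame(
  Date  = as.Date(c("2015-01-01", "2015-01-01", "2015-01-01",
                    "2015-01-02", "2015-01-02")),
  elev  = c(1000, 1200, 1100, 900, 1100),
  OWNER = c("A", "A", "B", "B", "B")
)

daily <- toy %>%
  group_by(Date) %>%
  mutate(n_day = n()) %>%                       # records on this day
  group_by(Date, OWNER) %>%
  summarise(elev_share = sum(elev / n_day),     # owner's share of the daily mean
            .groups = "drop")

# Stacked segments per day now add up to the daily mean elevation.
p <- ggplot(daily, aes(x = Date, y = elev_share, fill = OWNER)) +
  geom_col(width = 1)
```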
I have a little problem with a ggplot barchart.
I want to make a bar chart with ggplot2 to compare the Svolumes of my 4 stocks over a period of a few months.
I have two problems:
The first is that my y-axis is wrong. My graph and data seem correct, but the y-axis does not follow them: I expected a different scale. I would like the axis to show the totals of my Svolume values, but I think it is listing the individual Svolume values instead. I don't know how to explain it well, but I would like a scale that covers all of the data on the graph, like 10, 20, etc., up to my highest sum of Svolumes.
Here is my code:
library(plyr) # for ddply
library(ggplot2)
Date=c(rep(data$date))
Subject=c(rep(data$subject))
Svolume=c(data$svolume)
Data=data.frame(Date,Subject,Svolume)
Data=ddply(Data, .(Date),transform,pos=cumsum(as.numeric(Svolume))-(0.5*(as.numeric(Svolume))))
ggplot(Data, aes(x=Date, y=Svolume))+
geom_bar(aes(fill=Subject),stat="identity")+
geom_text(aes(label=Svolume,y=pos),size=3)
and here is my plot:
I got help with this from the question here
Finally, how could I make the same plot for each month? I don't know how to get the values per month in order to have a more readable bar chart, as we can't read anything here...
If you have other ideas, I would be very glad to take any suggestions and advice! Maybe a line chart would be more readable? Or maybe one bar chart per stock? (I don't know how to get the values per stock either...)
I just found out how to do it with lines, but once again my y-axis is wrong, and it's not very readable...
Thanks for your help !! :)
Try adding the following line right before your ggplot call. It looks like your y-axis variable is stored as character (or factor) rather than numeric.
[edit] Incorporating @user20650's comment: apply as.character() first, then convert to numeric.
Data$Svolume <- as.numeric(as.character(Data$Svolume))
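A small illustration of why the as.character() step matters, using a made-up factor (`f` is hypothetical): calling as.numeric() directly on a factor returns its internal level codes, not the values printed on screen.

```r
# A factor whose labels look like numbers; levels sort as text: "10" "200" "35".
f <- factor(c("10", "200", "35"))

codes  <- as.numeric(f)                # internal level codes
values <- as.numeric(as.character(f))  # the numbers the labels represent
```

Here `codes` holds 1, 2, 3 while `values` holds 10, 200, 35. If Svolume is a plain character column rather than a factor, the extra as.character() is harmless.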
To produce the same plot for each month, first add a month variable: Data$Month <- month(as.Date(Data$Date)), where month() comes from the lubridate package. Then add a facet to your ggplot object.
ggplot(Data, aes(x=Date, y=Svolume)) +
...
+ facet_wrap(~ Month)
For example, your bar chart code will be:
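library(lubridate) # provides month()
Data$Svolume <- as.numeric(as.character(Data$Svolume))
# use the column inside Data, since ddply may have reordered the rows
Data$Month <- month(as.Date(Data$Date))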
ggplot(Data, aes(x=Date, y=Svolume)) +
geom_bar(aes(fill=Subject),stat="identity") +
geom_text(aes(label=Svolume,y=pos),size=3) +
facet_wrap(~ Month)
and your Line chart code will be:
Data$Svolume <- as.numeric(as.character(Data$Svolume))
Data$Month <- month(as.Date(Data$Date))
ggplot(Data, aes(x=Date, y=Svolume, colour=Subject)) +
geom_line() +
facet_wrap(~ Month)
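A self-contained sketch of the per-month faceting on invented data (the `Data` values below are made up, and the year-month label is a design choice, not part of the answer above). It uses base R's format() instead of lubridate, which keeps months from different years apart, and scales = "free_x" so that each panel shows only its own dates.

```r
library(ggplot2)

# Invented stand-in for the asker's Data; Svolume starts out as text.
Data <- data.frame(
  Date    = as.Date(c("2015-01-05", "2015-01-20", "2015-02-03", "2015-02-17")),
  Subject = c("stockA", "stockB", "stockA", "stockB"),
  Svolume = c("10", "20", "15", "5"),
  stringsAsFactors = FALSE
)

Data$Svolume <- as.numeric(as.character(Data$Svolume))

# Year-month label such as "2015-01" for faceting.
Data$Month <- format(Data$Date, "%Y-%m")

p <- ggplot(Data, aes(x = Date, y = Svolume, fill = Subject)) +
  geom_col() +
  facet_wrap(~ Month, scales = "free_x")
```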