I have a very simple question here. I have a dataset from 2009-2012. I want to plot the data with facets. I have created the faceted plot as follows.
R-code
ggplot(al02428400,aes(x=date,y=as.numeric(Discharge)))+geom_line()+ylab("Discharge(cfs)")+facet_wrap(~Year,scales=("free_x"))+theme_bw()
The output of the above R code is as follows:
On the x-axis I only want to show the month. By default it shows both month and year. Is there any way to get rid of the year?
The fully reproducible code is as follows:
library(ggplot2)
url <- "http://nwis.waterdata.usgs.gov/usa/nwis/uv/?cb_00060=on&cb_00065=on&format=rdb&period=&begin_date=2009-01-01&end_date=2012-12-31&site_no=02428400"
download.file(url,destfile="Data load for stations/data/alabamariver-at-monroeville-2009.txt")
al02428400 <- read.table("Data load for stations/data/alabamariver-at-monroeville-2009.txt",header=T,skip=1,sep="\t")
head(al02428400)
sapply(al02428400,class)
al02428400 <- al02428400[-1,]
names(al02428400)<- c("Agency","SiteNo","Datetime", "TZ","Discharge","Status","Gageheight","gstatus")
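# strptime() returns POSIXlt; convert to POSIXct so the column works in a data frame and with ggplot2
al02428400$date <- as.POSIXct(strptime(al02428400$Datetime, format="%Y-%m-%d %H:%M"))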
al02428400$Discharge <- as.numeric(as.character(al02428400$Discharge))
al02428400$Year <- as.numeric(format(al02428400$date, "%Y"))
ggplot(al02428400,aes(x=date,y=as.numeric(Discharge)))+geom_line()+ylab("Discharge(cfs)")+facet_wrap(~Year,scales=("free_x"))+theme_bw()
Thanks.
As your x values are date-times, you can use scale_x_datetime() to change the format of the labels (scale_x_date() is the equivalent for Date columns). The scales package provides date_format() for formatting the labels.
library(scales)
+scale_x_datetime(labels = date_format("%b"))
For me, what worked was
library(scales)
+ scale_x_date(date_labels = "%b-%d-%Y")
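A minimal sketch of both variants, in case it helps. It uses a small synthetic series rather than the USGS download, so the data frame d and its columns are assumptions made up for illustration. The point is only that "%b" prints the abbreviated month name, and that the scale has to match the column class (Date or POSIXct).

```r
library(ggplot2)

# Synthetic daily series covering two years (hypothetical data, not the USGS file)
set.seed(1)
d <- data.frame(date = seq(as.Date("2009-01-01"), as.Date("2010-12-31"), by = "day"))
d$Discharge <- cumsum(rnorm(nrow(d)))
d$Year <- format(d$date, "%Y")

# Date column: use scale_x_date(); "%b" shows the abbreviated month name only
p_date <- ggplot(d, aes(x = date, y = Discharge)) +
  geom_line() +
  ylab("Discharge (cfs)") +
  facet_wrap(~Year, scales = "free_x") +
  scale_x_date(date_labels = "%b") +
  theme_bw()

# POSIXct column: use scale_x_datetime() with the same label format
d$datetime <- as.POSIXct(d$date)
p_datetime <- ggplot(d, aes(x = datetime, y = Discharge)) +
  geom_line() +
  ylab("Discharge (cfs)") +
  facet_wrap(~Year, scales = "free_x") +
  scale_x_datetime(date_labels = "%b") +
  theme_bw()
```

Printing p_date or p_datetime draws the plot. Month names follow the session locale.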
I created a boxplot with ggplot with the following data.frame:
library(lubridate)
library(ggplot2)
library(plotly)
df <- data.frame(
time = c("00:43:20", "00:44:30","00:45:40"),
sex = c("m","m","m")
)
df$sex <- factor(df$sex)
df$time <- lubridate::hms(df$time)
Now I created my boxplot with ggplot
g <- ggplot(df) +
geom_boxplot(aes(sex, time)) +
scale_y_time()
Everything looks fine, so now I make it interactive with ggplotly():
plotly::ggplotly(g)
But when I hover over the boxplot, I just see seconds, not the lubridate format.
How can I manage to see the data as shown on the y-axis?
The problem is fairly complex, as far as I understand it. The main issue seems to be that lubridate stores times as periods. plotly shows seconds because the values are seconds in ggplot as well; scale_y_time() only converts them for display on the axis.
As far as I can tell, the workaround is to convert the time to a numeric number of minutes. Note that the fractional part is then in decimal minutes, not seconds (43.5 means 43 minutes 30 seconds):
1st option with ggplot:
library(plotly)
library(ggplot2)
library(lubridate)
# calculate time as elapsed minutes, stored as numeric
# (assumes df$time still holds the original "HH:MM:SS" strings)
mins <- as.numeric(lubridate::hms(df$time) - hms("00:00:00"))/60
df$sex <- factor(df$sex)
df$time <- mins
g <- ggplot2::ggplot(df) +
ggplot2::geom_boxplot(aes(sex, time))
plotly::ggplotly(g)
2nd option with plotly directly (this only covers the time values; I'm not sure whether sex can be added as x or whether a second trace is needed, and some cosmetic work remains. In any case, ggplot gives practically the same result):
plotly::plot_ly(y = ~mins, type = "box")
Possibly there is a better solution - I just could not figure it out in the last 2 hours ;(
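To make the decimal-minutes caveat concrete, here is a small sketch. The helper name fmt_hms is made up for this example; period_to_seconds() is lubridate's converter from a period to seconds. It converts the three times from the question to decimal minutes and back to "HH:MM:SS" strings, which could then serve as readable labels.

```r
library(lubridate)

times <- c("00:43:20", "00:44:30", "00:45:40")

# Periods converted to plain seconds, then to decimal minutes
secs <- period_to_seconds(hms(times))
mins <- secs / 60   # the first value is 43.33..., i.e. 43 min 20 s, not 43 min 33 s

# Hypothetical helper: format a number of seconds as "HH:MM:SS"
fmt_hms <- function(s) {
  s <- as.integer(round(s))
  sprintf("%02d:%02d:%02d", s %/% 3600L, (s %% 3600L) %/% 60L, s %% 60L)
}

# Round trip from decimal minutes back to readable labels
labels <- fmt_hms(mins * 60)
```

This only deals with the formatting. Whether the labels can be attached to the boxplot hover text depends on how plotly builds the tooltip for that trace type.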
I would like to create a horizontal bar graph from my data.
The link to my data is here.
The code that I am using
library(ggplot2)
library(forcats) # provides fct_inorder()
ggplot(data=df , aes(x=fct_inorder(WorkSchedule),y=timing, fill=Value)) + geom_col() + coord_flip()
The output of the plot:
How to change the x-axis to show time from 04:00 till 03:45 (24h)
I tried factor(Source) but it does not work.
UPDATE: How can I change the x axis of this graph?
Many thanks
With the function lvls_reorder() from the forcats package, you can specify the order of the levels of your variable.
library(tidyverse) # forcats is included in tidyverse library
df <- df %>%
mutate(WorkSchedule = lvls_reorder(WorkSchedule, c(3,2,4,5,1)))
If you transform the variable Source as a factor, you can also determine the order you want.
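A small sketch of how the ordering works, since the real data is only linked and not shown. The vectors shift and timing below are made-up stand-ins, not the actual columns. lvls_reorder() takes the positions of the current levels in the order you want them, and a plain factor() with explicit levels can give a clock order that starts at 04:00.

```r
library(forcats)

# Hypothetical schedule labels; the real WorkSchedule values are not shown in the question
shift <- factor(c("Night", "Early", "Late", "Early"))
levels(shift)   # default levels are sorted alphabetically

# Positions refer to the current levels: take the 3rd, then the 1st, then the 2nd
reordered <- lvls_reorder(shift, c(3, 1, 2))
levels(reordered)

# Hypothetical 15-minute slots running from 04:00 to 03:45 the next day
start <- as.POSIXct("2000-01-01 04:00", tz = "UTC")
slots <- format(seq(start, by = "15 min", length.out = 96), "%H:%M")

# Using the slots as factor levels makes sorting and axis order follow 04:00 ... 03:45
timing <- factor(c("23:30", "04:00", "03:45"), levels = slots)
sort(timing)
```

ggplot2 draws discrete axes in level order, so a factor built this way should appear from 04:00 to 03:45 without further changes.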
I am trying to plot a time series in ggplot2. Assume I am using the following data structure (2500 x 20 matrix):
set.seed(21)
n <- 2500
x <- matrix(replicate(20,cumsum(sample(c(-1, 1), n, TRUE))),nrow = 2500,ncol=20)
aa <- x
rnames <- seq(as.Date("2010-01-01"), length=dim(aa)[1], by="1 month") - 1
rownames(aa) <- format(as.POSIXlt(rnames, format = "%Y-%m-%d"), format = "%d.%m.%Y")
colnames(aa) <- paste0("aa", seq_len(ncol(aa)))
library("ggplot2")
library("reshape2")
library("scales")
aa <- melt(aa, id.vars = rownames(aa))
names(aa) <- c("time","id","value")
Now the following command to plot the time series produces a weird looking x axis:
ggplot(aa, aes(x=time,y=value,colour=id,group=id)) +
geom_line()
What I found out is that I can change the format to date:
aa$time <- as.Date(aa$time, "%d.%m.%Y")
ggplot(aa, aes(x=time,y=value,colour=id,group=id)) +
geom_line()
This looks better, but still not a good graph. My question is especially how to control the formatting of the x axis.
Does it have to be in Date format? How can I control the amount of breaks (i.e. years) shown in either case? It seems to be mandatory if Date is not used; otherwise ggplot2 uses some kind of useful default for the breaks I believe.
For example the following command does not work:
aa$time <- as.Date(aa$time, "%d.%m.%Y")
ggplot(aa, aes(x=time,y=value,colour=id,group=id)) +
geom_line() +
scale_x_continuous(breaks=pretty_breaks(n=10))
Also, if you have any hints on how to improve the overall look of the graph, feel free to add them (e.g. the lines look a bit imprecise, IMHO).
You can format dates with scale_x_date as @Gopala mentioned. Here's an example using a shortened version of your data for illustration.
library(dplyr)
# Dates need to be in date format
aa$time <- as.Date(aa$time, "%d.%m.%Y")
# Shorten data to speed rendering
aa = aa %>% group_by(id) %>% slice(1:200)
In the code below, we get date breaks every six months with date_breaks="6 months". That's probably more breaks than you want in this case and is just for illustration. If you want to determine which months get the breaks (e.g., Jan/July, Feb/Aug, etc.) then you also need to use coord_cartesian and set the start date with xlim and expand=FALSE so that ggplot won't pad the start date. But when you set expand=FALSE you also don't get any padding on the y-axis, so you need to add the padding manually with scale_y_continuous (I'd prefer to be able to set expand separately for the x and y axes, but AFAIK it's not possible). Because the breaks are packed tightly, we use a theme statement to rotate the labels by 90 degrees.
ggplot(aa, aes(x=time,y=value,colour=id,group=id)) +
geom_line(show.legend=FALSE) +
scale_y_continuous(limits=c(min(aa$value) - 2, max(aa$value) + 1)) +
scale_x_date(date_breaks="6 months",
labels=function(d) format(d, "%b %Y")) +
coord_cartesian(xlim=c(as.Date("2009-07-01"), max(aa$time) + 182),
expand=FALSE) +
theme_bw() +
theme(axis.text.x=element_text(angle=-90, vjust=0.5))
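As a possible alternative, padding can also be set through the expand argument of the scale itself, which applies to one axis only. The sketch below is an assumption-laden illustration on a small made-up series (the data frame d is not the asker's data): the x axis gets no padding while the y axis keeps its default.

```r
library(ggplot2)

# Small synthetic monthly series (hypothetical data for illustration)
set.seed(21)
d <- data.frame(
  time  = seq(as.Date("2010-01-01"), by = "1 month", length.out = 60),
  value = cumsum(sample(c(-1, 1), 60, TRUE))
)

p <- ggplot(d, aes(x = time, y = value)) +
  geom_line() +
  # expand = c(0, 0) removes padding on the x axis only; y keeps its default padding
  scale_x_date(date_breaks = "6 months", date_labels = "%b %Y", expand = c(0, 0)) +
  theme_bw() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5))
```

With this variant the manual y-axis limits are not needed, though where the six-month breaks fall still depends on the data range.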
I have about 20 years of daily data in a time series. It has columns Date, rainfall and other data.
I am trying to plot rainfall vs. time. I want 20 line plots in one graph, each in a different colour, with a legend showing the years. I tried the following code, but it does not give me the desired result. Any suggestion to fix my issue would be most welcome.
library(ggplot2)
library(scales) # provides date_format() and date_breaks()
library(seas)
data(mscdata)
p<-ggplot(data=mscdata,aes(x=date,y=precip,group=year,color=year))
p+geom_line()+scale_x_date(labels=date_format("%m"),breaks=date_breaks("1 months"))
It doesn't look great, but here's a method. We first coerce the data into dates in the same year:
mscdata$dayofyear <- as.Date(format(mscdata$date, "%j"), format = "%j")
Then we plot:
library(ggplot2)
library(scales)
p <- ggplot(data = mscdata, aes(x = dayofyear, y = precip, group = year, color = year))
p + geom_line() +
scale_x_date(labels = date_format("%m"), breaks = date_breaks("1 months"))
While I agree with @Jaap that this may not be the best way to depict these data, try the following:
mscdata$doy <- as.numeric(strftime(mscdata$date, format="%j"))
ggplot(data=mscdata,aes(x=doy,y=precip,group=year)) +
geom_line(aes(color=year))
Although the given answers are good answers to your question as it stands, I don't think they will solve your problem. I think you should be looking at a different way to present the data. @Jaap already suggested using facets. Take for example this approach:
#first add a month column to your dataframe
mscdata$month <- format(mscdata$date, "%m")
#then plot it using boxplot with year on the X-axis and month as facet.
p1 <- ggplot(data = mscdata, aes(x = year, y = precip, group=year))
p1 + geom_boxplot(outlier.shape = 3) + facet_wrap(~month)
This will give you a graph per month, showing the rainfall per year next to each other. Because I use a boxplot, the peaks in rainfall show up as dots ('normal' rain events are inside the box).
Another possible approach would be to use stat_summary.
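A rough sketch of what the stat_summary route could look like, using made-up daily rainfall instead of mscdata (the data frame rain and its columns are assumptions). It draws one line per year showing the mean rainfall of each month.

```r
library(ggplot2)

# Synthetic daily rainfall for three years (hypothetical data, not mscdata)
set.seed(42)
rain <- data.frame(date = seq(as.Date("2001-01-01"), as.Date("2003-12-31"), by = "day"))
rain$precip <- rexp(nrow(rain), rate = 0.5)
rain$year   <- factor(format(rain$date, "%Y"))
rain$month  <- as.integer(format(rain$date, "%m"))

# stat_summary() collapses the daily values to one mean per month and year
p <- ggplot(rain, aes(x = month, y = precip, colour = year, group = year)) +
  stat_summary(fun = mean, geom = "line") +
  scale_x_continuous(breaks = 1:12, labels = month.abb) +
  ylab("Mean daily precipitation")
```

The fun argument is the name used in recent ggplot2 releases; older versions used fun.y instead. Swapping mean for sum or max changes which aspect of the rainfall is compared across years.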
I have created a graph in ggplot2 using zoo to create month bins. However, I want to be able to modify the graph so it looks like a standard ggplot graph. This means that the bins that aren't used are dropped and the bins that are used fill the entire bin space. Here is my code:
library(data.table)
library(ggplot2)
library(scales)
library(zoo)
testset <- data.table(Date=as.Date(c("2013-07-02","2013-08-03","2013-09-04","2013-10-05","2013-11-06","2013-07-03","2013-08-04","2013-09-05","2013-10-06","2013-11-07")),
Action = c("A","B","C","D","E","B","A","B","C","A","B","E","E","C","A"),
rating = runif(30))
The ggplot call is:
ggplot(testset, aes(as.yearmon(Date), fill=Action)) +
geom_bar(position = "dodge") +
scale_x_yearmon()
I'm not sure what I'm missing, but I'd like to find out! Thanks in advance!
To get a "standard-looking" plot, convert the data to a "standard" data type, which is a factor:
ggplot(testset, aes(as.factor(as.yearmon(Date)), fill=Action)) +
geom_bar(position='dodge')
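The same idea can be sketched without zoo, in case that is useful. This is a variation and not the code above: it bins by a "YYYY-MM" string built with format(), and the small data frame d is made up for illustration. Because a factor only has levels for values that occur, months with no rows get no slot on the axis.

```r
library(ggplot2)

# Hypothetical dates with a gap (no rows in September or October)
dates <- as.Date(c("2013-07-02", "2013-08-03", "2013-11-06", "2013-07-03"))

# "YYYY-MM" strings sort chronologically, so the factor levels are in time order
month_bin <- factor(format(dates, "%Y-%m"))
levels(month_bin)

d <- data.frame(month_bin = month_bin, Action = c("A", "B", "A", "B"))

# A discrete x axis shows only the levels present, each filling its own slot
p <- ggplot(d, aes(x = month_bin, fill = Action)) +
  geom_bar(position = "dodge")
```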