I would like to lower the number of points on the lines on my plot.
For example,
date <- c("2017-04-15","2017-04-16","2017-04-17","2017-04-18","2017-04-19","2017-04-20","2017-04-21")
x <- c(1,3,3,4,3,5,2)
df <- data.frame(date,x)
Rather than having a point at every vertex, I would like one at every other vertex. The first, third, fifth and seventh vertices would have points while the others would not.
ggplot(df, aes(date,x,group=1)) +
geom_line(size=.4) +
geom_point(size=.7)
This seems simple enough, but I have been unable to find any information on how to do it.
You can use scale_x_date to control the breaks and labels of your date x axis:
date <- c("2017-04-15","2017-04-16","2017-04-17","2017-04-18","2017-04-19","2017-04-20","2017-04-21")
x <- c(1,2,3,4,3,5,2)
#Convert date to DATE format using as.Date()
df <- data.frame(date = as.Date(date),x)
ggplot(df, aes(date,x,group=1)) +
geom_line(size=.4) +
geom_point(size=.7) +
scale_x_date(date_breaks = "2 days", date_labels = "%d-%b") # use scale_x_date to set the break spacing and label format
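The axis breaks above change which dates are labelled, not which vertices get a point. To draw a point at only every other vertex, as the question asks, one option is to give geom_point its own subset of the data. This is a minimal sketch and not part of the answer above; `every_other` is a name introduced here for illustration.

```r
library(ggplot2)

date <- as.Date(c("2017-04-15", "2017-04-16", "2017-04-17", "2017-04-18",
                  "2017-04-19", "2017-04-20", "2017-04-21"))
x <- c(1, 3, 3, 4, 3, 5, 2)
df <- data.frame(date, x)

# Keep rows 1, 3, 5 and 7 for the points; the line still uses every row.
every_other <- df[seq(1, nrow(df), by = 2), ]

p <- ggplot(df, aes(date, x, group = 1)) +
  geom_line(size = 0.4) +
  geom_point(data = every_other, size = 0.7)
p
```

Because the layer-level `data` argument overrides the plot data for that layer only, the line keeps all seven vertices.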
In ggplot2, I have a question about appropriate scales for making POSIXct datetimes into time-of-day in an axis. Consider:
library(tidyverse)
library(lubridate)
library(hms)
library(patchwork)
test <- tibble(
dates = c(ymd_hms("2022-01-01 6:00:00"),
ymd_hms("2023-01-01 19:00:00")),
x = c(1, 2),
hms_dates = as_hms(dates)
)
plot1 <- ggplot(test) + geom_point(aes(x = x, y = dates)) +
scale_y_time()
plot2 <- ggplot(test) + geom_point(aes(x = x, y = hms_dates)) +
scale_y_time()
plot1 + plot2
The y axis of plot 1 includes dates and times, but plot 2 shows just the time of day. That's what I want! I'd like to generate images like plot 2 without having to use the hms::as_hms approach. This seems to imply some options for scale_y_datetime (or similar) that I can't discover. I'd welcome suggestions.
Does someone have an example of how to use the limits option in scale_*_time, or (see question #1) limits for a scale_y_datetime that specify hours within the day? For example, limits(c(8,22)) predictably fails.
For your second question: when dealing with dates, datetimes or times, you have to set the limits and/or breaks as dates, datetimes or times too, i.e. use limits = as_hms(c("8:00:00", "22:00:00")):
library(tidyverse)
library(lubridate)
library(hms)
ggplot(test) + geom_point(aes(x = x, y = hms_dates)) +
scale_y_time(limits = as_hms(c("8:00:00", "22:00:00")))
#> Warning: Removed 1 rows containing missing values (`geom_point()`).
Concerning your first question: to my knowledge, this cannot be achieved via scale_..._datetime. If you just want to show the time part of your dates, then converting to an hms object is, IMHO, the easiest way to achieve that. You could of course set the format of the axis text via the date_labels argument, e.g. date_labels = "%H:%M:%S", to show only the time of day. However, as your dates variable is still a datetime, the scale, breaks and limits will still reflect that, i.e. you only change the format of the labels, and for your example data you end up with an axis showing the same time for each break, i.e. the start of the day.
ggplot(test) + geom_point(aes(x = x, y = dates)) +
scale_y_datetime(date_labels = "%H:%M:%S")
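If hms really has to be avoided, one possible workaround (a sketch only, not something the scale_*_datetime functions do for you) is to compute the seconds since midnight yourself and label a continuous axis. The names `secs` and `fmt_tod` are hypothetical, and the example assumes the datetimes are in UTC so that each day is exactly 86400 seconds.

```r
library(ggplot2)

test <- data.frame(
  dates = as.POSIXct(c("2022-01-01 06:00:00", "2023-01-01 19:00:00"), tz = "UTC"),
  x = c(1, 2)
)

# Seconds since midnight (valid as written only for UTC datetimes).
test$secs <- as.numeric(test$dates) %% 86400

# Hypothetical label helper: seconds since midnight -> "HH:MM".
fmt_tod <- function(s) sprintf("%02d:%02d", s %/% 3600, (s %% 3600) %/% 60)

p <- ggplot(test, aes(x = x, y = secs)) +
  geom_point() +
  scale_y_continuous(
    breaks = seq(0, 86400, by = 4 * 3600),  # a break every four hours
    labels = fmt_tod,
    limits = c(0, 86400)
  )
p
```

Limits for such an axis are then plain numbers of seconds, e.g. `limits = c(8, 22) * 3600` for 08:00 to 22:00.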
I have a dataset like this:
Country Time Value
1 USA 1999-Q1 292929
2 USA 1999-Q2 392023
3 USA 1999-Q3 9392992
4
.... and so on. Now I would like to plot this data frame with Time on the x-axis and Value on the y-axis. But the problem I face is that I don't know how to plot Time, because it is not given as month/day/year. If it were, I would just use as.Date(format = "%m%d%y"). I am not allowed to change the quarterly names, so they should stay that way in the plot. How can I do this?
Thank you in advance!
Assuming DF shown in the Note at the end, convert the Time column to yearqtr class which directly represents year and quarter (as opposed to using Date class) and use scale_x_yearqtr. See ?scale_x_yearqtr for more information.
library(ggplot2)
library(zoo)
fmt <- "%Y-Q%q"
DF$Time <- as.yearqtr(DF$Time, format = fmt)
ggplot(DF, aes(Time, Value, col = Country)) +
geom_point() +
geom_line() +
scale_x_yearqtr(format = fmt)
It would also be possible to convert it to a wide form zoo object with one column per country and then use autoplot. Using DF from the Note below:
fmt <- "%Y-Q%q"
z <- read.zoo(DF, split = "Country", index = "Time",
FUN = as.yearqtr, format = fmt)
autoplot(z) + scale_x_yearqtr(format = fmt)
Note
Lines <- "
Country Time Value
1 USA 1999-Q1 292929
2 USA 1999-Q2 392023
3 USA 1999-Q3 9392992"
DF <- read.table(text = Lines)
Using ggplot2:
library(ggplot2)
ggplot(df, aes(Time, Value, fill = Country)) + geom_col()
I know other people have already answered, but I think this more general answer should also be here.
When you call as.Date(), the format can cover just the beginning of the string. I tried it on your data frame (I called it df) with only the year in the format, and it ran:
> as.Date(df$Time, format = "%Y")
[1] "1999-11-28" "1999-11-28" "1999-11-28"
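Note that only the year is parsed here: as.Date() fills in the month and day from the current date, so every quarter of 1999 ends up on the same date. I don't know whether you want to use plot() or ggplot() from the ggplot2 library, and it doesn't matter. However you want to draw the x axis, you can convert Time this way.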
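Since a "%Y" format maps every quarter of a year onto one date, a variant that keeps the quarters apart may be more useful. The following is a minimal base-R sketch; `quarter_to_date` is a hypothetical helper introduced here, and it assumes every string has the exact form "YYYY-Qn".

```r
# Hypothetical helper: map "YYYY-Qn" strings to the first day of that quarter.
quarter_to_date <- function(s) {
  year <- as.integer(sub("-Q[1-4]$", "", s))   # "1999-Q2" -> 1999
  qtr  <- as.integer(sub("^.*-Q", "", s))      # "1999-Q2" -> 2
  first_month <- (qtr - 1L) * 3L + 1L          # Q1 -> 1, Q2 -> 4, Q3 -> 7, Q4 -> 10
  as.Date(sprintf("%04d-%02d-01", year, first_month))
}

quarter_dates <- quarter_to_date(c("1999-Q1", "1999-Q2", "1999-Q3"))
quarter_dates
```

The resulting Date column can be plotted on a date axis, although the tick labels then need to be formatted back to the "1999-Q1" style, which the yearqtr approach in the earlier answer handles directly.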
I have a basic dataframe with 3 columns: (i) a date (when a sample was taken); (ii) a site location and (iii) a binary variable indicating what the condition was when sampling (e.g. wet versus dry).
Some reproducible data:
df <- data.frame(Date = rep(seq(as.Date("2010-01-01"), as.Date("2010-12-01"), by="months"),times=2))
df$Site <- c(rep("Site.A",times = 12),rep("Site.B",times = 12))
df$Condition<- as.factor(c(0,0,0,0,1,1,1,1,0,0,0,0,
0,0,0,0,0,1,1,0,0,0,0,0))
What I would like to do is use ggplot to create a bar chart indicating the condition of each site (y axis) over time (x axis) - the condition indicated by a different colour. I am guessing some kind of flipped barplot would be the way to do this, but I cannot figure out how to tell ggplot2 to recognise the values chronologically, rather than summed for each condition. This is my attempt so far which clearly doesn't do what I need it to.
ggplot(df) +
geom_bar(aes(x=Site,y=Date,fill=Condition),stat='identity')+coord_flip()
So I have 2 questions. Firstly, how do I tell ggplot to recognise changes in condition over time and not just group each condition in a traditional stacked bar chart?
Secondly, it seems ggplot converts the date to a numerical value, how would I reformat the x-axis to show a time period, e.g. in a month-year format? I have tried doing this via the scale_x_date function, but get an error message.
labDates <- seq(from = (head(df$Date, 1)),
to = (tail(df$Date, 1)), by = "1 months")
Datelabels <-format(labDates,"%b %y")
ggplot(df) +
geom_bar(aes(x=Site,y=Date,fill=Condition),stat='identity')+coord_flip()+
scale_x_date(labels = Datelabels, breaks=labDates)
I have also tried converting sampling times to factors and displaying these instead. Below I have done this by changing each sampling period to a letter (in my own code, the factor levels are in a month-year format - I put letters here for simplicity). But I cannot format the axis to place each level of the factor as a tick mark. Either a date or factor solution for this second question would be great!
df$Factor <- as.factor(unique(df$Date))
levels(df$Factor) <- list(A = "2010-01-01", B = "2010-02-01",
C = "2010-03-01", D = "2010-04-01", E = "2010-05-01",
`F` = "2010-06-01", G = "2010-07-01", H = "2010-08-01",
I = "2010-09-01", J = "2010-10-01", K= "2010-11-01", L = "2010-12-01")
ggplot(df) +
geom_bar(aes(x=Site,y=Date,fill=Condition),stat='identity')+coord_flip()+
scale_y_discrete(breaks=as.numeric(unique(df$Date)),
labels=levels(df$Factor))
Thank you in advance!
It doesn't really make sense to use geom_bar(), considering that you do not want to summarise the data and need the visualisation over time.
I would rather use geom_line() and increase the line thickness if you want it to look like a bar chart.
library(tidyr)
library(dplyr)
library(ggplot2)
library(scales)
library(lubridate)
df <- data.frame(Date = rep(seq.Date(as.Date("2010-01-01"), as.Date("2010-12-01"), by="months"),times=2))
df$Site <- c(rep("Site.A",times = 12),rep("Site.B",times = 12))
df$Condition<- as.factor(c(0,0,0,0,1,1,1,1,0,0,0,0,
0,0,0,0,0,1,1,0,0,0,0,0))
df$Date <- ymd(df$Date)
ggplot(df) +
geom_line(aes(y=Site,x=Date,color=Condition),size=10)+
scale_x_date(labels = date_format("%b-%y"))
Note that using coord_flip() also does not work; I think this causes the Date issue. See the threads below:
how to use coord_carteisan and coord_flip together in ggplot2
In ggplot2, coord_flip and free scales don't work together
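A different geom can give a similar bar-like picture without thick lines. The sketch below swaps in geom_tile, which is not the approach of the answer above: it draws one tile per site and month, filled by condition, with Date mapped directly to x so that coord_flip() is not needed.

```r
library(ggplot2)

df <- data.frame(
  Date = rep(seq(as.Date("2010-01-01"), as.Date("2010-12-01"), by = "months"),
             times = 2),
  Site = rep(c("Site.A", "Site.B"), each = 12),
  Condition = factor(c(0,0,0,0,1,1,1,1,0,0,0,0,
                       0,0,0,0,0,1,1,0,0,0,0,0))
)

# One tile per (Date, Site) pair; height < 1 leaves a gap between the two sites.
p <- ggplot(df, aes(x = Date, y = Site, fill = Condition)) +
  geom_tile(height = 0.5) +
  scale_x_date(date_breaks = "1 month", date_labels = "%b-%y")
p
```

Tile widths are derived from the smallest gap between dates, so months longer than the shortest one may show thin gaps between tiles.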
I am trying to plot a time series in ggplot2. Assume I am using the following data structure (2500 x 20 matrix):
set.seed(21)
n <- 2500
x <- matrix(replicate(20,cumsum(sample(c(-1, 1), n, TRUE))),nrow = 2500,ncol=20)
aa <- x
rnames <- seq(as.Date("2010-01-01"), length=dim(aa)[1], by="1 month") - 1
rownames(aa) <- format(as.POSIXlt(rnames, format = "%Y-%m-%d"), format = "%d.%m.%Y")
colnames(aa) <- paste0("aa", 1:ncol(aa))
library("ggplot2")
library("reshape2")
library("scales")
aa <- melt(aa, id.vars = rownames(aa))
names(aa) <- c("time","id","value")
Now the following command to plot the time series produces a weird looking x axis:
ggplot(aa, aes(x=time,y=value,colour=id,group=id)) +
geom_line()
What I found out is that I can change the format to date:
aa$time <- as.Date(aa$time, "%d.%m.%Y")
ggplot(aa, aes(x=time,y=value,colour=id,group=id)) +
geom_line()
This looks better, but still not a good graph. My question is especially how to control the formatting of the x axis.
Does it have to be in Date format? How can I control the number of breaks (i.e. years) shown in either case? Setting breaks seems to be mandatory if Date is not used; otherwise, I believe, ggplot2 uses some useful default for the breaks.
For example the following command does not work:
aa$time <- as.Date(aa$time, "%d.%m.%Y")
ggplot(aa, aes(x=time,y=value,colour=id,group=id)) +
geom_line() +
scale_x_continuous(breaks=pretty_breaks(n=10))
Also, if you have any hints on how to improve the overall look of the graph, feel free to add them (e.g. the lines look a bit imprecise, imho).
You can format dates with scale_x_date, as @Gopala mentioned. Here's an example using a shortened version of your data for illustration.
library(dplyr)
# Dates need to be in date format
aa$time <- as.Date(aa$time, "%d.%m.%Y")
# Shorten data to speed rendering
aa = aa %>% group_by(id) %>% slice(1:200)
In the code below, we get date breaks every six months with date_breaks="6 months". That's probably more breaks than you want in this case and is just for illustration. If you want to determine which months get the breaks (e.g., Jan/July, Feb/Aug, etc.) then you also need to use coord_cartesian and set the start date with xlim and expand=FALSE so that ggplot won't pad the start date. But when you set expand=FALSE you also don't get any padding on the y-axis, so you need to add the padding manually with scale_y_continuous (I'd prefer to be able to set expand separately for the x and y axes, but AFAIK it's not possible). Because the breaks are packed tightly, we use a theme statement to rotate the labels by 90 degrees.
ggplot(aa, aes(x=time,y=value,colour=id,group=id)) +
geom_line(show.legend=FALSE) +
scale_y_continuous(limits=c(min(aa$value) - 2, max(aa$value) + 1)) +
scale_x_date(date_breaks="6 months",
labels=function(d) format(d, "%b %Y")) +
coord_cartesian(xlim=c(as.Date("2009-07-01"), max(aa$time) + 182),
expand=FALSE) +
theme_bw() +
theme(axis.text.x=element_text(angle=-90, vjust=0.5))
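On the question of controlling the number of breaks: the failing attempt passed pretty_breaks() to scale_x_continuous, but the same break function can be handed to scale_x_date instead. The sketch below uses a small synthetic data frame `d` (an assumption, standing in for the melted `aa`) and asks for roughly ten breaks.

```r
library(ggplot2)
library(scales)

# Synthetic stand-in for the melted data: ten years of monthly values.
d <- data.frame(
  time  = seq(as.Date("2010-01-01"), by = "1 month", length.out = 120),
  value = cumsum(rep(c(1, -1, 1), 40))
)

p <- ggplot(d, aes(x = time, y = value)) +
  geom_line() +
  scale_x_date(breaks = pretty_breaks(n = 10),  # n is a target, not a guarantee
               date_labels = "%Y")
p

# The break function can also be inspected on its own.
brk <- pretty_breaks(n = 10)(range(d$time))
```

Because `n` is only a hint to the underlying pretty() algorithm, the actual number of breaks can differ slightly from the requested value.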
I have two problems handling my time variable in GNU R!
Firstly, I cannot recode the time data (downloadable here) from factor (or character) with as.POSIXlt or with as.Date without an error message like this:
character string is not in a standard unambiguous format
I have then tried to convert my time data with:
dates <- strptime(time, "%Y-%m-%j")
which only gives me:
NA
Secondly, the reason why I wanted (had) to convert my time data is that I want to plot it with ggplot2 and adjust my scale_x_continuous (as described here) so that it only labels every 50 years (i.e. 1250-01-01, 1300-01-01, etc.) on the x-axis; otherwise the x-axis is too busy (see graph below).
This is the code I use:
library(ggplot2)
library(scales)
library(reshape)
df <- read.csv(file="https://dl.dropboxusercontent.com/u/109495328/time.csv")
attach(df)
dates <- as.character(time)
population <- factor(Number_Humans)
ggplot(df, aes(x = dates, y = population)) + geom_line(aes(group=1), colour="#000099") + theme(axis.text.x=element_text(angle=90)) + xlab("Time in Years (A.D.)")
You need to remove the quotation marks in the date column, then you can convert it to date format:
df <- read.csv(file="https://dl.dropboxusercontent.com/u/109495328/time.csv")
df$time <- gsub('\"', "", as.character(df$time), fixed=TRUE)
df$time <- as.Date(df$time, "%Y-%m-%d") # %d is the day of the month; %j would be the day of the year
ggplot(df, aes(x = time, y = Number_Humans)) +
geom_line(colour="#000099") +
theme(axis.text.x=element_text(angle=90)) +
xlab("Time in Years (A.D.)")
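Once the column is a Date, the 50-year labelling asked for in the question can be set with explicit breaks on scale_x_date. The sketch below uses a synthetic data frame as a stand-in for the CSV, since the file contents are not shown here; the year range 1250 to 1500, the one-row-per-year layout and the values are all assumptions.

```r
library(ggplot2)

# Synthetic stand-in for time.csv (assumed: one row per year, dated 1 January).
years <- 1250:1500
df <- data.frame(
  time = as.Date(sprintf("%04d-01-01", years)),
  Number_Humans = seq_along(years)
)

# Explicit breaks every 50 years, starting at the first date in the data.
breaks_50 <- seq(min(df$time), max(df$time), by = "50 years")

p <- ggplot(df, aes(x = time, y = Number_Humans)) +
  geom_line(colour = "#000099") +
  scale_x_date(breaks = breaks_50, date_labels = "%Y-%m-%d") +
  theme(axis.text.x = element_text(angle = 90)) +
  xlab("Time in Years (A.D.)")
p
```

Passing the break dates explicitly keeps them anchored at 1250, 1300 and so on, regardless of how the axis limits are padded.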