R, Format date on X axis using ggplot

After aggregating, filtering, and parsing a column of dates and hours with as.Date, and then reshaping it with the melt function, my data looks like this:
      DATE    variable value
1 13-09-20  Billete_50 20405
2 14-09-20  Billete_50 19808
3 13-09-20 Billete_100 27787
4 14-09-20 Billete_100 20361
5 13-09-20       Total 48192
6 14-09-20       Total 40169
I want a line graph of the data, but it looks like this:
The labels are in Spanish, I'm sorry, but the dates read almost the same. The axis should show only September (Sep), but it is showing October, January, April, July, then back to October.
This is my ggplot line:
ggplot(plotCajero1Melted, aes(x=DATE, y=value, col=variable)) + geom_line() +xlab("Fecha") +ylab("Saldo") +ggtitle("Cajero 1")
What should I do? Is it something to do with the date? I suspect the two-digit year is making ggplot behave strangely, or maybe there is a ggplot option that I'm missing?

You're right, your dates are not being parsed correctly (they seem to be read as Year-Month-Day).
You can fix this by, for example, using the dmy function from the lubridate package to tell R to read the dates as Day-Month-Year:
library(lubridate)
df$DATE <- dmy(df$DATE)
ggplot(df, aes(x = DATE, y = value, color = variable))+
geom_line()
Is this what you are looking for?
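For reference, the same conversion can be sketched in base R. This assumes DATE is stored as character in day-month-two-digit-year form, as in the question; the small data frame df below is a stand-in built from two rows of the post, not the asker's real object.

```r
# Stand-in for the melted data (two rows copied from the question)
df <- data.frame(
  DATE = c("13-09-20", "14-09-20"),
  variable = "Billete_50",
  value = c(20405, 19808),
  stringsAsFactors = FALSE
)

# Without a format, as.Date() tries year-month-day first,
# so "13-09-20" is taken as year 13, month 9, day 20
wrong <- as.Date(df$DATE)

# Spelling out the format: %d = day, %m = month, %y = two-digit year
right <- as.Date(df$DATE, format = "%d-%m-%y")
format(right, "%Y-%m-%d")
```

With the explicit format the values land in September 2020, which is what the x axis should then show.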

Related

Time series plot in R skips dates on x-axis

I'm trying to create a simple time series plot in R with the following data (it's in tbl format):
Date sales
<date> <dbl>
1 2010-02-05 1105572.
2 2010-09-03 1048761.
3 2010-11-19 1002791.
4 2010-12-24 1798476.
5 2011-02-04 1025625.
6 2011-11-18 1031977.
When I use the following command: plot(by_date$Date, by_date$sales, type = 'l'), the resulting graph does not show the individual dates as I want it to; it shows only the year on the x-axis, like this (please ignore the axis labels for now):
I've checked the format of the date column using class(by_date$Date) and it does show up as 'Date'. I've also checked other threads here and the closest one that came to answering my query is the one here. Tried that approach but didn't work for me, while plotting in ggplot or converting data to data frame didn't work either. Please help, thanks.
With ggplot this should work -
library(ggplot2)
ggplot(by_date, aes(Date, sales)) + geom_line()
You can use scale_x_date to format your x-axis as you want.
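As a rough illustration of that suggestion, the sketch below puts one tick on each observed date. The data are copied from the question; the label format and the rotated axis text are choices made here, not part of the answer.

```r
library(ggplot2)

# Data copied from the question
by_date <- data.frame(
  Date = as.Date(c("2010-02-05", "2010-09-03", "2010-11-19",
                   "2010-12-24", "2011-02-04", "2011-11-18")),
  sales = c(1105572, 1048761, 1002791, 1798476, 1025625, 1031977)
)

p <- ggplot(by_date, aes(Date, sales)) +
  geom_line() +
  # One tick per observation, labelled as day, abbreviated month, year
  scale_x_date(breaks = by_date$Date, date_labels = "%d %b %Y") +
  # Rotate labels so closely spaced dates stay readable
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
p
```

For evenly spaced ticks instead, scale_x_date also accepts date_breaks, for example date_breaks = "3 months".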

Plotting this weeks data versus last weeks in ggplot

I have a dataset structured the following way
date transaction
8/15/2020 585
8/14/2020 780
8/13/2020 1427.8
8/12/2020 4358
8/11/2020 780.9
8/8/2020 585
8/6/2020 1107.4
8/5/2020 2917.35
8/4/2020 1237.1
Is there a way to plot a line graph with all the transactions that occurred this week compared to the previous week? I tried filtering the data manually and assigning it to a new dataframe, which seemed to work, but it's very labor-intensive. Would it be possible to use today() so that it registers the day of execution and runs the results from there? Thanks!
To do that, you need:
real Date values (using as.Date), so that the dates can be handled numerically (not categorically) and broken into weeks;
format, to get each date's week of the year; and
facet_wrap, so that each week gets its own facet with a distinct x axis.
dat$date <- as.Date(dat$date, format = "%m/%d/%Y")
dat$week <- format(dat$date, format = "%V") # or %W
library(ggplot2)
ggplot(dat, aes(date, transaction)) +
facet_wrap("week", ncol = 1, scales = "free_x") +
geom_path()
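To tie the comparison to the day the script runs, as the question asks, one option is to compute the start of each date's week and keep only the current and previous week. This is a sketch rather than part of the answer above: week_start() and this_vs_last() are made-up helper names, weeks are assumed to start on Monday, and Sys.Date() is used as the base R counterpart of lubridate's today().

```r
# Hypothetical helper: Monday of the week containing each date
# (%u gives the weekday number, 1 = Monday ... 7 = Sunday)
week_start <- function(d) d - (as.integer(format(d, "%u")) - 1)

# Hypothetical helper: keep rows from the current and previous week
this_vs_last <- function(dat, today = Sys.Date()) {
  current <- week_start(today)
  ws <- week_start(dat$date)
  keep <- ws %in% c(current, current - 7)
  out <- dat[keep, , drop = FALSE]
  out$week <- ifelse(ws[keep] == current, "this week", "last week")
  out
}

# Data copied from the question
dat <- data.frame(
  date = as.Date(c("8/15/2020", "8/14/2020", "8/13/2020", "8/12/2020",
                   "8/11/2020", "8/8/2020", "8/6/2020", "8/5/2020",
                   "8/4/2020"), format = "%m/%d/%Y"),
  transaction = c(585, 780, 1427.8, 4358, 780.9, 585, 1107.4, 2917.35, 1237.1)
)

# Fixed "today" so the example is reproducible; drop the argument in real use
res <- this_vs_last(dat, today = as.Date("2020-08-15"))
table(res$week)
```

The week column in the result can then be passed to facet_wrap in place of the week-of-year column used above.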

How can I fix my x-axis, and plot my dates using GGPlot2 in R? [closed]

So I have a CSV file with two columns. Date (DD-MM-YYYY) and Gas Price (0.00). I also converted the date using as.Date. But when I attempted to plot it using ggplot, I kept getting this error that it can't work with numeric type or something.
Here is my source code:
gasoline <- read.csv(file.choose())
Date <- gasoline$Date
Price <- as.numeric(gasoline$Price)
str(Price)
ggplot(gasoline, aes(Date, Price)) + geom_line(colour="red")
Unfortunately, this code results in my graph having every date crammed down at the bottom so that it is barely legible:
(imgur.com/a/iitXw).
It's just a flat line. It also says, "geom_path: Each group consists of only one observation. Do you need to adjust the group aesthetic?"
How can I plot dates such as this in Ggplot?
It is hard to diagnose the problem or propose a solution without your code or data. However, the code below is a very simple way to create a basic plot from example data that matches your description of your data:
# attach the ggplot package
library(ggplot2)
# make example data that matches your description
# (where the dates are January, April, July, and October 1 of 2015)
dates <- c('01-01-2015', '01-04-2015', '01-07-2015', '01-10-2015')
prices <- rnorm(4, 3)
# now convert dates to date format
dates <- as.Date(dates, format='%d-%m-%Y')
# you should then be able to create a simple plot using qplot
qplot(dates, prices, geom='line', xlab='Date', ylab='Price',
main='Gas Prices Over Time')
Note the use of the format option in the as.Date() function. Since the dates are in DD-MM-YYYY format, while as.Date() assumes YYYY-MM-DD format, your dates would not convert correctly without that option:
dates <- c('01-01-2015', '01-04-2015', '01-07-2015', '01-10-2015')
dates <- as.Date(dates)
format(dates, format="%B %d %Y")
[1] "January 20 1" "April 20 1" "July 20 1" "October 20 1"
Since you had some problem with your dates being numeric, I suspect you did some data cleaning after converting them to dates; for example,
dates <- c('01-01-2015', '01-04-2015', '01-07-2015', '01-10-2015', NA)
dates <- as.Date(dates, format='%d-%m-%Y')
class(dates)
[1] "Date"
dates <- ifelse(is.na(dates), NA, dates)
class(dates)
[1] "numeric"
This is because a Date object is stored as a number (days since 1970-01-01) with a class attribute, and ifelse() returns the bare number without that attribute (see this page at IDRE). However, that still should not throw an error when plotting; the axis labels should simply be numbers rather than dates.
prices <- c(rnorm(4, 3), NA)
qplot(dates, prices, geom='line', xlab='Date', ylab='Price',
main='Gas Prices Over Time')
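If a cleaning step like that is needed, the Date class can be kept by assigning into the Date vector instead of using ifelse(), or restored afterwards. A small sketch with made-up dates; the cutoff date is arbitrary and only there to have something to blank out:

```r
dates <- as.Date(c("01-01-2015", "01-04-2015", NA), format = "%d-%m-%Y")

# ifelse() drops attributes, so the Date class is lost
as_numbers <- ifelse(is.na(dates), NA, dates)
class(as_numbers)

# Option 1: subset-assign into the Date vector, which keeps the class
kept <- dates
kept[which(kept < as.Date("2015-02-01"))] <- NA
class(kept)

# Option 2: convert the day counts back, stating the origin explicitly
restored <- as.Date(as_numbers, origin = "1970-01-01")
class(restored)
```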
Hopefully this will get you started on a solution to your problem.
EDIT:
Having seen the first twelve rows of your csv file, I can give you the following solution:
library(ggplot2)
df <- read.csv('dataframe.csv') # change file name/path as appropriate
df$Date <- as.Date(df$Date, format='%m/%d/%Y')
ggplot(df, aes(x=Date, y=Price)) + geom_line(color="red")
For me this produced the following plot:
I believe the issue was that in the code you provided me, the date variable was not converted to a date object; it was a factor (as seen in the lower right of the screenshot you provided). There was only one observation per level of the factor, resulting in the error you saw. You can get rid of that error by adding group=1 to the aesthetic (i.e. aes(x=Date, y=Price, group=1)), but this does not accomplish exactly what you want.
The key is making sure you convert the variable to a Date object (being careful about the format, as mentioned above); then everything works out nicely.

R aggregate by %Y-%b as date

I have a data frame item_sold_time with a item_sold_time$SaleDate "POSIXct" "POSIXt" column, looking like this 2015-04-28 07:59:43.
I want to aggregate by month in year:
df2<-aggregate(list(Qty=item_sold_time$Qty), by=list(ISBN=item_sold_time$ISBN,DateYM=strftime(item_sold_time$SaleDate, format="%Y-%b")), FUN=sum)
The problem with strftime is that it converts the date into characters and I get an error if I try to convert it back into date.
I tried all combinations of date formatting, I could find in 2 days of searching. The final destination of that date is to be used in this plot:
ggplot(df2, aes(x = DateYM, y=Qty))
Please help. Thanks
Set each date to (e.g.) the 15th of the month; you can then aggregate by date and plot as normal.
as.Date(format(item_sold_time$SaleDate, "%Y-%m-15"))
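Put together with the aggregation from the question, this might look as follows. The rows below are invented for illustration; only the column names ISBN, Qty and SaleDate come from the question.

```r
# Invented sales rows with a POSIXct SaleDate column
item_sold_time <- data.frame(
  ISBN = c("A", "A", "A", "B"),
  Qty = c(1, 2, 4, 8),
  SaleDate = as.POSIXct(c("2015-04-28 07:59:43", "2015-04-02 10:00:00",
                          "2015-05-10 12:30:00", "2015-04-15 09:15:00"),
                        tz = "UTC")
)

# Collapse every timestamp to the 15th of its month, as a real Date
item_sold_time$DateYM <- as.Date(format(item_sold_time$SaleDate, "%Y-%m-15"))

# Sum quantities per ISBN and month
df2 <- aggregate(Qty ~ ISBN + DateYM, data = item_sold_time, FUN = sum)
df2
```

Because DateYM stays a Date, ggplot(df2, aes(x = DateYM, y = Qty)) can be combined with scale_x_date(date_labels = "%Y-%b") to print the axis labels in the year-month form from the question.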

Order dates in ggplot by month

I have DF$Date in the as.Date format "yyyy-mm-dd" as shown below. Is there an easy way to get these grouped by month in ggplot?
Date
2015-07-30
2015-08-01
2015-08-02
2015-08-06
2015-08-11
2015-08-12
I've added a column DF$Month as "year Monthname" (e.g. April 2015.)
I'm doing this by DF$Month<-strftime(DF$Date,format="%B %Y")
Is there a quick way to factor the month/years so that they are ordinal?
I used a workaround by formatting using:
DF$Month<-strftime(DF$Date,format="%Y-%m") so that the larger numbers are first and subsequently the month number.
This gives the output, which is sortable:
DF$Month
"2015-07"
"2015-08"
This output allows me to get this grouping:
http://imgur.com/df1FI3s
When using this plot:
MonthlyActivity<-ggplot(DF,aes(x=Month, y=TotalSteps))+
geom_boxplot()
MonthlyActivity
Any alternatives so I can use the full month name and still be in the correct time order?
There are probably other solutions, but here is one with full month names as a factor. As you already found out, you need an x variable to group by. We can then treat this as an "order a factor" problem instead of a date-scale problem.
#first, generate some data
dat <- data.frame(
  date = sample(seq(as.Date("01012015", format = "%d%m%Y"),
                    as.Date("01082015", format = "%d%m%Y"), by = 1),
                1000, replace = TRUE),
  value = rnorm(1000))
We find the minimum and maximum month, and do some date arithmetic to allow for all start days (so that February doesn't get skipped when the minimum date falls on the 29th, 30th, or 31st). I used lubridate for this.
library(lubridate)
min_month = min(dat$date)-day(min(dat$date))+1
max_month = max(dat$date)-day(max(dat$date))+1
We generate a grouping variable. It is a factor with labels like 'January 2015, March 2015'. However, we force the order by creating a sequence (by month) from min date to max date and formatting it in the same way.
dat$group <- factor(format(dat$date, "%B %Y"),
levels=format(seq(min_month, max_month,by="month"),
"%B %Y"))
This forces the ordering on the axis:
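The plotting call itself is not shown here. A minimal sketch, assuming a boxplot like the one in the question, might look like this. It rebuilds a smaller version of the example data so that it runs on its own, and it finds the first day of each month with base R's format instead of lubridate.

```r
library(ggplot2)

set.seed(1)
dat <- data.frame(
  date = sample(seq(as.Date("2015-01-01"), as.Date("2015-08-01"), by = 1),
                200, replace = TRUE),
  value = rnorm(200)
)

# First day of the earliest and latest month present in the data
min_month <- as.Date(format(min(dat$date), "%Y-%m-01"))
max_month <- as.Date(format(max(dat$date), "%Y-%m-01"))

# Factor levels follow calendar order, not alphabetical order
dat$group <- factor(format(dat$date, "%B %Y"),
                    levels = format(seq(min_month, max_month, by = "month"),
                                    "%B %Y"))

p <- ggplot(dat, aes(x = group, y = value)) +
  geom_boxplot()
p
```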
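Try adding
scale_x_discrete(limits = month.abb)
so your code would be
MonthlyActivity <- ggplot(DF, aes(x = Month, y = TotalSteps)) +
  geom_boxplot() +
  scale_x_discrete(limits = month.abb)
Note that month.abb is a built-in R constant, so only ggplot2 needs to be loaded (dplyr is not required), and this only works if DF$Month holds abbreviated English month names such as "Jul".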
