I'm trying to create a simple time series plot in R with the following data (it's in tbl format):
Date sales
<date> <dbl>
1 2010-02-05 1105572.
2 2010-09-03 1048761.
3 2010-11-19 1002791.
4 2010-12-24 1798476.
5 2011-02-04 1025625.
6 2011-11-18 1031977.
When I use the following command: plot(by_date$Date, by_date$sales, type = 'l'), the resulting graph does not show the individual dates on the x-axis, as I want it to; it shows only the year, like this (please ignore the axis labels for now):
I've checked the class of the date column using class(by_date$Date), and it does show up as 'Date'. I've also checked other threads here, and the closest one to answering my question is the one here. I tried that approach, but it didn't work for me, and neither did plotting with ggplot or converting the data to a data frame. Please help, thanks.
With ggplot2, this should work:
library(ggplot2)
ggplot(by_date, aes(Date, sales)) + geom_line()
You can use scale_x_date to format your x-axis as you want.
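As a rough sketch of the scale_x_date suggestion, the example below rebuilds the six rows from the question as a plain data frame (the sales values are the rounded figures as printed, so treat them as approximate) and places one labelled tick at each observed date. The label format and the rotated axis text are illustrative choices, not part of the original answer.

```r
library(ggplot2)

# Data re-typed from the question; sales values are rounded as printed there.
by_date <- data.frame(
  Date  = as.Date(c("2010-02-05", "2010-09-03", "2010-11-19",
                    "2010-12-24", "2011-02-04", "2011-11-18")),
  sales = c(1105572, 1048761, 1002791, 1798476, 1025625, 1031977)
)

p <- ggplot(by_date, aes(Date, sales)) +
  geom_line() +
  # One tick per observed date, labelled as year-month-day.
  scale_x_date(breaks = by_date$Date, date_labels = "%Y-%m-%d") +
  # Rotate the labels so that closely spaced dates do not overlap.
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5))
p
```

With many more dates than this, evenly spaced ticks (for example date_breaks = "3 months") are likely to be more readable than one tick per observation.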
I have nearly a quarter of a million rows of data like the ones below.
They consist of readings at 15-minute intervals over 7 years.
Propertimeseries <- ts(Proper)
plot.ts(Propertimeseries)
The above code is what I used to get the graphs below.
I basically want the line plot that can be seen in the bottom section of the graph, with the reading value on the y-axis and the date on one x-axis, plus a secondary x-axis running from 00:15:00 to 00:00:00 for each day.
Any help would be more than appreciated!
Thank you
EDIT*
#dcruvolo
this is the output for dput()
After aggregating, filtering, and parsing a column with dates and hours using as.Date, and then reshaping it with the melt function, my data looks like this:
DATE variable value
1 13-09-20 Billete_50 20405
2 14-09-20 Billete_50 19808
3 13-09-20 Billete_100 27787
4 14-09-20 Billete_100 20361
5 13-09-20 Total 48192
6 14-09-20 Total 40169
I want a line graph to show the data, but it looks like this:
The labels are in Spanish, sorry. The dates are almost the same, so the axis should only say September (Sep), but it is showing October, January, April, July, then back to October.
This is my ggplot line:
ggplot(plotCajero1Melted, aes(x=DATE, y=value, col=variable)) + geom_line() +xlab("Fecha") +ylab("Saldo") +ggtitle("Cajero 1")
What should I do? Is it something to do with the date? My guess is that the two-digit year is making ggplot behave strangely, or maybe there is a ggplot option that I'm missing.
You're right: your dates are not being read as they should be (they seem to be read as Year-Month-Day).
You can fix the date format, for example, by using the dmy function from the lubridate package to tell R to read the dates as Day-Month-Year:
library(lubridate)
df$DATE <- dmy(df$DATE)
ggplot(df, aes(x = DATE, y = value, color = variable))+
geom_line()
Is this what you are looking for?
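For readers who would rather not add a dependency, the same Day-Month-Year parsing can be sketched in base R with an explicit format string. This is an alternative to dmy, not what the answer above uses, and the data frame is re-typed from the rows shown in the question.

```r
# Data frame re-typed from the question (assumed to hold DATE as text).
df <- data.frame(
  DATE     = c("13-09-20", "14-09-20", "13-09-20",
               "14-09-20", "13-09-20", "14-09-20"),
  variable = c("Billete_50", "Billete_50", "Billete_100",
               "Billete_100", "Total", "Total"),
  value    = c(20405, 19808, 27787, 20361, 48192, 40169),
  stringsAsFactors = FALSE
)

# %d = day of month, %m = month, %y = two-digit year.
df$DATE <- as.Date(df$DATE, format = "%d-%m-%y")

# Both dates should now fall in September 2020.
range(df$DATE)
```

Once DATE is a proper Date column, the ggplot call from the question can be used unchanged.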
I have a data frame like the following:
user_id track_id created_at
1 81496937 cd52b3e5b51da29e5893dba82a418a4b 2014-01-01 05:54:21
2 2205686924 da3110a77b724072b08f231c9d6f7534 2014-01-01 05:54:22
3 132588395 ba84d88c10fb0e42d4754a27ead10546 2014-01-01 05:54:22
4 97675221 33f95122281f76e7134f9cbea3be980f 2014-01-02 05:54:24
5 17945688 b5c42e81e15cd54b9b0ee34711dedf05 2014-01-02 05:54:24
6 452285741 8bd5206b84c968eda0af8bc86d6ab1d1 2014-01-02 05:54:25
I want to create a line chart in R showing the number of user_id across days. I want to know how many user_id are present per day and create a plot of that. How do I do it?
First of all, you should know how to process date and time in R. I strongly recommend the lubridate package.
library(lubridate)
t <- ymd_hms("20170621111800")
dt <- floor_date(t, unit='day')
dt
Then you need to learn how to manipulate a data frame in R. I usually use the dplyr package because it is quite simple to learn and the code is easy to read.
library(dplyr)
# Truncate each timestamp to its day, then count distinct users per day
new_df <- df %>%
  mutate(dt = floor_date(ymd_hms(created_at), unit = 'day')) %>%
  group_by(dt) %>%
  summarise(user_cnt = n_distinct(user_id))
new_df
Finally, you need to learn how to plot a data frame in R. I personally prefer to use ggplot2 for this task.
library(ggplot2)
p <- ggplot(new_df) + geom_line(aes(x=dt, y=user_cnt))
p
You will now see the plot in the bottom-right panel if you run the code in RStudio. Furthermore, you can use the plotly package to turn the static image into an interactive chart!
library(plotly)
ggplotly(p)
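To check the counting step on its own, here is a small self-contained sketch that applies the same dplyr and lubridate calls to the six sample rows from the question. The track_id column is left out for brevity, and user_id is stored as text, which is an assumption made here because one of the ids is too large for an R integer.

```r
library(dplyr)
library(lubridate)

# Sample rows re-typed from the question (track_id omitted).
df <- data.frame(
  user_id    = c("81496937", "2205686924", "132588395",
                 "97675221", "17945688", "452285741"),
  created_at = c("2014-01-01 05:54:21", "2014-01-01 05:54:22",
                 "2014-01-01 05:54:22", "2014-01-02 05:54:24",
                 "2014-01-02 05:54:24", "2014-01-02 05:54:25"),
  stringsAsFactors = FALSE
)

new_df <- df %>%
  # Parse the timestamp, then truncate it to the start of its day.
  mutate(dt = floor_date(ymd_hms(created_at), unit = "day")) %>%
  group_by(dt) %>%
  # Count each user once per day, however many rows they have.
  summarise(user_cnt = n_distinct(user_id))

new_df
```

On this sample there are two days with three distinct users each, so the resulting line has only two points; the real data would give one point per day.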
time count
2017-03-08 19:33 1
2017-03-23 22:11 1
2017-03-30 3:30 10
2017-03-09 19:33 13
2017-03-23 22:11 1
2017-03-31 3:30 1
.....
This data describes how quickly consumers write comments,
so I want to make a plot that makes it easy to see how fast comments are coming in.
For example:
On the x-axis, the time series starts from 2017-03-08.
At regular intervals (seconds or minutes) there is a bar.
So if comments are being written quickly, the bars are dense,
and as time goes on and the speed drops, the bars become sparse.
How can I make this?
cc5<-dt[, tdiff := difftime(cc, shift(cc, fill=cc[1L]), units="secs"),
by=title]
Using this code, I can create a difftime column.
I have one more problem: the time column is of character type.
I tried to convert it to a date type using as.Date, but that didn't work,
so I converted it to POSIXct instead.
I think I need a date type to put a time series on the x-axis.
I'm not 100% sure that I understand the result you want,
but generally, when I want to put dates on the x-axis, I go to Understanding dates and plotting a histogram with ggplot2 in R
and use Gauden's Code v1. If you have successfully converted the character column to POSIXct, as.Date() should work fine.
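The conversion chain mentioned above (character to POSIXct to Date) can be sketched in base R as follows. This does not reproduce the linked ggplot2 code; it only illustrates the parsing and a simple per-day bar-style plot, and both the format string and the UTC time zone are assumptions about the data.

```r
# Rows re-typed from the question; time is stored as text.
comments <- data.frame(
  time  = c("2017-03-08 19:33", "2017-03-23 22:11", "2017-03-30 3:30",
            "2017-03-09 19:33", "2017-03-23 22:11", "2017-03-31 3:30"),
  count = c(1, 1, 10, 13, 1, 1),
  stringsAsFactors = FALSE
)

# Character -> POSIXct; the format has no seconds, matching the sample.
comments$time_ct <- as.POSIXct(comments$time,
                               format = "%Y-%m-%d %H:%M", tz = "UTC")

# POSIXct -> Date, using the same time zone to avoid shifting days.
comments$day <- as.Date(comments$time_ct, tz = "UTC")

# Total comments per day.
per_day <- aggregate(count ~ day, data = comments, FUN = sum)

# type = "h" draws one vertical bar per day on a true date axis,
# so quiet periods show up as gaps between bars.
plot(per_day$day, per_day$count, type = "h",
     xlab = "Date", ylab = "Comments")
```

For finer intervals than a day, the POSIXct column could be binned with cut() (for example by "hour") before aggregating.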
I am trying to create a plot of weekly data. Though this is not the exact problem I am having, it illustrates it well. Imagine you want to plot 1, 2, ..., 7 for 7 weeks starting from Jan 1, 2015. My plot should just be a line that trends upward, but instead I get 7 different lines. I tried the code below (and some other variants, to no avail). Help would be greatly appreciated.
startDate = "2015-01-01"
endDate = "2015-02-19"
y=c(1,2,3,4,5,6,7)
tsy=ts(y,start=as.Date(startDate),end=as.Date(endDate))
plot(tsy)
You are plotting both the time and y together as individual plots.
Instead use:
plot(y)
lines(y)
Also, create a date column based on the specifics you gave which will be a time series. From here you can add the date on the x-axis to easily see how your variable changes over time.
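One way to read the date-column suggestion is sketched below in base R: build a weekly sequence of dates, pair it with y in a data frame, and plot one against the other. The column names and axis labels are made up for illustration.

```r
y <- c(1, 2, 3, 4, 5, 6, 7)

# Seven weekly dates starting on 2015-01-01; the last one is 2015-02-12.
dates <- seq(as.Date("2015-01-01"), by = "week", length.out = length(y))

weekly <- data.frame(date = dates, y = y)

# Plotting a Date vector on x gives a date axis and a single line.
plot(weekly$date, weekly$y, type = "l",
     xlab = "Week starting", ylab = "y")
```

Note that seven weekly points starting on 2015-01-01 end on 2015-02-12, one week before the endDate used in the question.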
To make your life easier, I think your first step should be to create an xts time series object (install and load the xts package); then it is a piece of cake to plot, subset, or do whatever you like with the series.
Build your vector of dates as a sequence with start/end date:
seq( as.Date("2011-07-01"), by=1, len=7)
and your data vector: 1:7
A one-liner builds and plots the above time series object:
plot(as.xts(1:7,order.by=seq( as.Date("2011-07-01"), by=1, len=7)))