I have a plot of unemployment over time, and I want to add bands highlighting the periods when there was a recession.
The original data frame is called quarterly_data.
library(lubridate)
library(ggplot2)
recession <- data.frame(
  date_start = as_date(c("1973-07-01", "1980-01-01", "1990-07-01", "2008-04-01")),
  date_end   = as_date(c("1975-07-01", "1981-04-01", "1991-07-01", "2009-04-01"))
)
recession$date_start <- ymd(recession$date_start)
recession$date_end <- ymd(recession$date_end)
ggplot(quarterly_data, aes(x = date, y = Unemployment)) +
  geom_line() +
  geom_rect(data = recession, inherit.aes = FALSE,
            aes(xmin = date_start, xmax = date_end, ymin = -0.1, ymax = 0.1),
            fill = "red", alpha = 0.3)
However, when I run the ggplot, I get this error message:
Error: Invalid input: time_trans works with objects of class POSIXct only
Does anyone know how to fix this?
You have supplied the recession data frame but not quarterly_data, which may be where the error originates. Here are a few things to try, but first a short description of what is probably causing the issue.
First of all, time_trans appears to come from the scales package, but it's not clear from the code above why it needs to run. Is anything else here using the scales package?
As for the error message itself: it requires an object of class POSIXct. That is different from class Date, which is what lubridate's as_date() returns when you build the recession data frame.
You can confirm this yourself by running class(recession$date_start), where you can see the output is a Date class object.
After the ymd() call you still have an object of class Date. According to the documentation, supplying a tz= (time zone) argument makes ymd() return a POSIXct/POSIXt object instead. You can see this with the following:
> class(ymd(recession$date_start))
[1] "Date"
> class(ymd(recession$date_start, tz='GMT'))
[1] "POSIXct" "POSIXt"
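If you would rather not rely on lubridate for the conversion, base R's as.POSIXct() should do the same job. This is only a minimal sketch: it assumes UTC is an acceptable time zone for your data, and the variable names are made up for illustration.

```r
# Dates as in the recession data frame (class Date)
date_start <- as.Date(c("1973-07-01", "1980-01-01", "1990-07-01", "2008-04-01"))
class(date_start)

# Convert to POSIXct; a Date is treated as midnight UTC
date_start_ct <- as.POSIXct(date_start, tz = "UTC")
class(date_start_ct)

# A Date counts days since 1970-01-01, a POSIXct counts seconds,
# so the underlying numbers differ by a factor of 86400
as.numeric(date_start_ct[1]) / as.numeric(date_start[1])
```

The last line is a quick way to see why the two classes cannot share one axis without a conversion.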
So that might fix your problem. You still have some detective work to do, though, since we don't have your other data frame and we don't see any function in your code that calls time_trans from the scales package. The other possibility is that ggplot calls it itself to adjust an axis based on a POSIXt object, but I don't see a scale_ call or coord_flip() that might cause this error. I would recommend the following sequence:
1. Try the "home run" approach: run your ymd() calls again, but supply tz = "GMT" to force the output to be a POSIXct object. I am not sure this will be successful.
2. Run the ggplot() line by itself. Do you get the same error? If so, the problem lies in the quarterly_data data frame, not the recession data frame. If it works, run the ggplot() line with geom_line() added. If that still works, your issue is with geom_rect(), which most likely means the recession data frame.
3. Check the class of the date column in quarterly_data. Is it Date or POSIXct? If it is Date, try converting it to POSIXct (maybe just use as.POSIXct()).
4. Is there more code that belongs to your plot call? If you have coord_flip(), any scale_x_*() call, or other theme elements added to your plot code, they can definitely be trying to adjust the time scale and produce that error.
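Putting the pointers above together, here is a sketch of what a working version might look like. It assumes quarterly_data$date is POSIXct (we cannot see the real data, so the quarterly_data built here is entirely made up), and it uses ymin = -Inf and ymax = Inf so the bands span the whole panel, which is my choice rather than the ±0.1 in the question.

```r
library(ggplot2)

# Hypothetical stand-in for quarterly_data, with a POSIXct date column
dates <- seq(as.POSIXct("1970-01-01", tz = "UTC"),
             as.POSIXct("2010-01-01", tz = "UTC"), by = "3 months")
quarterly_data <- data.frame(date = dates,
                             Unemployment = 6 + sin(seq_along(dates) / 4))

recession <- data.frame(
  date_start = as.Date(c("1973-07-01", "1980-01-01", "1990-07-01", "2008-04-01")),
  date_end   = as.Date(c("1975-07-01", "1981-04-01", "1991-07-01", "2009-04-01"))
)

# Bring the recession dates to the same class as quarterly_data$date
recession$date_start <- as.POSIXct(recession$date_start, tz = "UTC")
recession$date_end   <- as.POSIXct(recession$date_end, tz = "UTC")

p <- ggplot(quarterly_data, aes(x = date, y = Unemployment)) +
  geom_line() +
  geom_rect(data = recession, inherit.aes = FALSE,
            aes(xmin = date_start, xmax = date_end, ymin = -Inf, ymax = Inf),
            fill = "red", alpha = 0.3)
p
```

If quarterly_data$date turns out to be a Date instead, the same idea applies in reverse: leave recession as Date and skip the two as.POSIXct() lines.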
I'm trying out the autoplot and ggseasonplot functions, but neither is working. Please help.
library(readxl)
new<-read_excel('NEW DATA.xlsx')
View(new)
library(ggplot2)
autoplot(new)
class(new)
ggseasonplot(new)
Error: Objects of type tbl_df/tbl/data.frame not supported by autoplot.
From my understanding, autoplot and ggseasonplot do not accept a data.frame; the data has to be converted to a time series (ts) object first. R cannot plot a whole table, matrix, or data frame this way unless you tell it which column to plot and how the observations are spaced in time.
I used
my1 <- ts(<data frame>[, 2], start = c(<year>, <period>), frequency = <observations per cycle>)  # in my case the frequency was 31
As I only wanted to plot column 2, I wrote [, 2]:
my1 <- ts(data1[, 2], start = c(1981, 1), frequency = 31)
This converted it to a time series, which I then plotted with the following commands (after loading the forecast package):
library(forecast)
autoplot(my1)
ggseasonplot(my1)
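For anyone who wants to check the conversion step on its own, here is a small base-R sketch. The data frame is invented, and it uses monthly data with frequency = 12, which is an assumption for the example rather than the frequency of 31 used above.

```r
# Hypothetical data frame: a date column and one value column
data1 <- data.frame(
  month = seq(as.Date("1981-01-01"), by = "month", length.out = 36),
  value = rep(c(5, 6, 8, 11, 15, 19, 22, 21, 17, 12, 8, 6), times = 3)
)

# Convert column 2 to a monthly series starting in January 1981
my1 <- ts(data1[, 2], start = c(1981, 1), frequency = 12)

class(my1)      # "ts"
frequency(my1)  # 12

# Extract the first three months of 1982 to confirm the time index
window(my1, start = c(1982, 1), end = c(1982, 3))
```

Once class(my1) reports "ts", the object should be in a form that autoplot() and ggseasonplot() can handle.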
I'm trying to convert some simple data into a form I thought ggplot2 would accept.
I grabbed some simple stock data and for now I just want to plot it. Later I want to add, say, a 10-day moving average or a 30-day historical volatility series to go with it, which is why I'm using ggplot.
I thought it would work something like this line of pseudocode
ggplot(maindata)+geom_line(moving average)+geom_line(30dayvol)
library(quantmod)
library(ggplot2)
start = as.Date("2008-01-01")
end = as.Date("2019-02-13")
start
tickers = c("AMD")
getSymbols(tickers, src = 'yahoo', from = start, to = end)
closing_prices = as.data.frame(AMD$AMD.Close)
ggplot(closing_prices, aes(y='AMD.Close'))
But I can't even get this to work. The problem, of course, appears to be that I don't have an x-axis. How do I tell ggplot to use the index column as the x-axis? Can this not work? Do I have to create a new "date" or "day" column?
This line, for instance, using the regular R plot function, works just fine:
plot.ts(closing_prices)
It produces a graph without requiring me to specify an x-axis. However, I haven't figured out how to layer other lines onto the same graph, and ggplot is evidently better for that, so I tried it.
Any advice?
as.Date(rownames(df)) will get you the row names and parse them as dates. You also need to add a geom_line() layer, and the column name inside aes() must be unquoted (y = AMD.Close, not y = 'AMD.Close').
library(quantmod)
library(ggplot2)
start = as.Date("2008-01-01")
end = as.Date("2019-02-13")
start
tickers = c("AMD")
getSymbols(tickers, src = 'yahoo', from = start, to = end)
closing_prices = as.data.frame(AMD$AMD.Close)
ggplot(closing_prices, aes(x = as.Date(rownames(closing_prices)),y=AMD.Close))+
geom_line()
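The question also asks about layering a 10-day moving average on top. One way to sketch that, assuming a plain trailing mean is what is wanted: the prices below are synthetic stand-ins for the downloaded data, and the column names close and ma10 are made up for the example.

```r
library(ggplot2)

# Synthetic closing prices standing in for the downloaded AMD data
set.seed(1)
prices <- data.frame(
  date = seq(as.Date("2018-01-01"), by = "day", length.out = 120),
  close = 10 + cumsum(rnorm(120, sd = 0.3))
)

# Trailing 10-day mean; the first 9 values are NA by construction
prices$ma10 <- as.numeric(stats::filter(prices$close, rep(1 / 10, 10), sides = 1))

# One geom_line() per series, sharing the same x aesthetic
p <- ggplot(prices, aes(x = date)) +
  geom_line(aes(y = close)) +
  geom_line(aes(y = ma10), colour = "blue", na.rm = TRUE)
p
```

stats::filter() is called with its namespace because dplyr, if loaded, masks filter() with an unrelated function.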
Edit
Thought it would be easier to explain in the answers as opposed to the comments.
ggplot2 and dplyr support two methods of evaluation: standard and non-standard. That is why ggplot2 has both aes() and aes_(). The former uses non-standard evaluation and the latter uses standard evaluation. There is also aes_string(), which likewise uses standard evaluation.
How are these different?
It's easy to see when we explore all the methods:
library(dplyr)  # provides %>% and mutate()
# Cleaner to read: define every operation in one step
# Non-standard evaluation
closing_prices%>%
mutate(dates = as.Date(rownames(.)))%>%
ggplot()+
geom_line(aes(x = dates,y = AMD.Close))
#Standard Evaluation
closing_prices%>%
mutate(dates = as.Date(rownames(.)))%>%
ggplot()+
geom_line(aes_(x = quote(dates),y = quote(AMD.Close)))
closing_prices%>%
mutate(dates = as.Date(rownames(.)))%>%
ggplot()+
geom_line(aes_string(x = "dates",y = "AMD.Close"))
Why are there so many different ways of doing the same thing? In most cases it's fine to use non-standard evaluation. However, if we want to wrap these plots in functions and dynamically choose the column to plot from function parameters passed as strings, it is helpful to plot using aes_() or aes_string().
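As far as I know, recent ggplot2 releases mark aes_() and aes_string() as deprecated and suggest the .data pronoun inside an ordinary aes() call instead. Here is a sketch of the "wrap it in a function" case written that way. The helper name plot_column and the small demo data frame are hypothetical.

```r
library(ggplot2)

# Hypothetical helper: the column names arrive as strings
plot_column <- function(df, x_col, y_col) {
  ggplot(df, aes(x = .data[[x_col]], y = .data[[y_col]])) +
    geom_line() +
    labs(x = x_col, y = y_col)
}

# Small made-up data frame shaped like closing_prices plus a dates column
demo <- data.frame(
  dates = as.Date("2019-01-01") + 0:4,
  AMD.Close = c(18.8, 17.0, 19.0, 20.6, 20.8)
)

p <- plot_column(demo, "dates", "AMD.Close")
p
```

The labs() call is there only because the default axis titles would otherwise show the .data[[...]] expressions.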
I have several data-sets similar to https://www.dropbox.com/s/j9ihawgfqwxmkgc/pred.csv?dl=0
Loading them from CSV and then plotting works fine
predictions$date <- as.Date(predictions$date)
plot(predictions$date, predictions$pct50)
But when I want to use ggplot to draw these predicted points in a plot, to compare them with the original points, like this:
p = ggplot(theRealPastDataValues,aes(x=date,y=cumsum(amount)))+geom_line()
This command
p + geom_line(predictions, aes(x=as.numeric(date), y=pct50))
generates the following error:
ggplot2 doesn't know how to deal with data of class uneval
But since the first call, plot(predictions$date, predictions$pct50), works with the data, I do not understand what is wrong.
Edit
dput(predictions[1:10, c("date", "pct50")])
structure(list(date = c("2009-07-01", "2009-07-02", "2009-07-03",
"2009-07-04", "2009-07-05", "2009-07-06", "2009-07-07", "2009-07-08",
"2009-07-09", "2009-07-10"), pct50 = c(4276, 4076, 4699.93, 4699.93,
4699.93, 4699.93, 4664.76, 4627.37, 4627.37, 4627.37)), .Names = c("date",
"pct50"), row.names = c(NA, 10L), class = "data.frame")
Edit 2
I changed the call to this:
p + geom_line(data = predictions, aes(x=as.numeric(date), y=pct50))
and the error changed to:
Invalid input: date_trans works with objects of class Date only
In addition: Warning message:
In eval(expr, envir, enclos) : NAs created
so I think the hint to How to deal with "data of class uneval" error from ggplot2? (see comments) was a good idea, but the plot still does not work.
Your first issue (fixed in Edit 2) arises because geom_line() takes mapping (default NULL) as its first argument, so you need to explicitly name the data argument:
p + geom_line(data = predictions, aes(x=as.numeric(date), y=pct50))
Your second issue arises because predictions$date is a character vector, so as.numeric() turns it into NAs. If you want numbers, you need to convert it to a Date first and then to numeric:
as.numeric(as.Date(predictions$date, format = "%Y-%m-%d"))
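The difference can be checked in base R with the first few rows from the question's dput() output. A small sketch (the variable names bad and days are mine):

```r
# First rows of the question's dput() output
predictions <- data.frame(
  date = c("2009-07-01", "2009-07-02", "2009-07-03"),
  pct50 = c(4276, 4076, 4699.93),
  stringsAsFactors = FALSE
)

# Strings like "2009-07-01" are not numbers, so this yields NA
# (R also emits a coercion warning, silenced here)
bad <- suppressWarnings(as.numeric(predictions$date))

# Parse to Date first; the numeric value is then days since 1970-01-01
predictions$date <- as.Date(predictions$date, format = "%Y-%m-%d")
days <- as.numeric(predictions$date)
days[1]  # 14426
```

For the ggplot layer itself, mapping x = date with the column already converted to Date (rather than wrapping it in as.numeric()) is presumably what the "date_trans works with objects of class Date only" message is asking for, since the base plot p already uses a date scale.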
I get the message
Error:no applicable method for 'round_any' applied to an object of
class "labelled"
when I try to plot my graphs using ggplot2 and R. I have labelled my variables in my data frame using Hmisc::label and I think this is the problem. How do I solve this issue?
My labels look like this:
label(data$results_lp)="Lumbure Puncture Results"
label(data$hiv_test)="HIV Test done"
label(data$outcome)="Outcome at Discharge"
label(data$vac_10mnth_complete)="Vaccinne 10months complete"
label(data$vac_3mnth_complete)="Vaccine 3months complete"
label(data$vac_uptodate)="Vaccine up to date"
label(data$dx1_pneum_rcd)="Pneumonia Recorded"
label(data$mal)="Malaria"
label(data$dx1_malaria)="Documented Malaria"
label(data$dehydrat)="Dehydration"
How do I solve this?
Remove the labels for plotting:
library(Hmisc)
DF <- data.frame(x=factor(rep(1:2,5)),y=1:10)
label(DF$x)="xLab"
label(DF$y)="yLab"
library(ggplot2)
ggplot(DF,aes(x=x,y=y)) + geom_boxplot()
#Don't know how to automatically pick scale for object of type labelled. Defaulting to continuous
ggplot(DF,aes(x=factor(unclass(x)),y=unclass(y))) + geom_boxplot()
#no warning
Unfortunately you don't give the details necessary to reproduce your error and give a customized solution.
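If there are many labelled columns, unclassing each one inside aes() gets tedious. Here is a sketch of a helper that strips the labels from a whole data frame before plotting. The name strip_labels is hypothetical, and to keep the example free of package dependencies it imitates what I understand Hmisc::label<- to do (a "label" attribute plus a "labelled" class) rather than calling Hmisc itself.

```r
# Hypothetical helper: drop the "label" attribute and "labelled" class
# from every column, leaving factors and numbers otherwise intact
strip_labels <- function(df) {
  df[] <- lapply(df, function(col) {
    attr(col, "label") <- NULL
    class(col) <- setdiff(class(col), "labelled")
    col
  })
  df
}

# Imitate labelled columns without loading Hmisc (an assumption about
# the structure Hmisc creates, see the note above)
DF <- data.frame(x = factor(rep(1:2, 5)), y = 1:10)
attr(DF$x, "label") <- "xLab"
class(DF$x) <- c("labelled", class(DF$x))
attr(DF$y, "label") <- "yLab"
class(DF$y) <- c("labelled", class(DF$y))

clean <- strip_labels(DF)
class(clean$x)  # "factor"
class(clean$y)  # "integer"
```

Plotting from clean instead of DF keeps the original, labelled data frame available for tables and reports.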
How can I have a data set of only time intervals (no dates) in R, like the following:
TREATMENT_A TREATMENT_B
1:01:12 0:05:00
0:34:56 1:08:09
and compute mean times, etc., and draw boxplots with time intervals on the y-axis?
I am new to R, and I searched for this but found no example on the net.
Thanks
The chron package has a 'times' class that supports arithmetic. You could also do all of that with POSIXct objects and format the date-time output so that it does not include the date. I thought the axis.POSIXct function had a format argument that would let you produce time labels. However, it does not seem to get dispatched properly, so I needed to construct the axis "by hand."
dft <- data.frame(x = factor(sample(1:2, 100, replace = TRUE)),
                  y = Sys.time() + rnorm(100) * 4000)
boxplot(y ~ x, data = dft, yaxt = 'n')
# Tick positions every 3000 seconds across the range of y
ticks <- seq(from = min(dft$y), to = max(dft$y), by = 3000)
axis(2, at = ticks, labels = format(ticks, format = "%H:%M:%S"))
There did turn out to be an appropriate method, Axis.POSIXt (to which I thought boxplot should have been turning for plotting, but it did not seem to recognize the class of the 'y' argument):
boxplot(y~x, data=dft, yaxt='n')
Axis(side=2, x=range(dft$y), format ="%H:%M:%S")
Regarding your request for something "simpler", take a look at this ggplot2-based solution, using the dft data frame defined above with POSIXct times. (I did try with the chron 'times' object but got a message saying ggplot did not support that class):
require(ggplot2); p <- ggplot(dft, aes(x,y))
p + geom_boxplot()
Check out the "lubridate" package, and the "hms" function within it.
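If you would rather avoid extra packages altogether, the arithmetic part can be sketched in base R by converting the "H:MM:SS" strings to seconds and back. The helper names to_seconds and to_hms are made up for this example, and the values are TREATMENT_A from the question.

```r
# Parse "H:MM:SS" strings into a number of seconds
to_seconds <- function(x) {
  parts <- strsplit(x, ":", fixed = TRUE)
  vapply(parts, function(p) sum(as.numeric(p) * c(3600, 60, 1)), numeric(1))
}

# Format a number of seconds back into "H:MM:SS"
to_hms <- function(s) {
  s <- round(s)
  sprintf("%d:%02d:%02d", s %/% 3600, (s %% 3600) %/% 60, s %% 60)
}

treatment_a <- c("1:01:12", "0:34:56")
mean_a <- mean(to_seconds(treatment_a))
to_hms(mean_a)  # "0:48:04"
```

The same pair of helpers could be used for a boxplot: plot the seconds with yaxt = 'n' and pass to_hms() of the tick positions as the labels to axis().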