Trouble showing year on x-axis with ggplot - r

I am using ggplot to show publications over time by year. However, my x-axis is showing decimal numbers (e.g. 2015.0) instead of showing each year under its bar.
p <-ggplot(pubs, aes(x = Year, y=Pubs, fill=Author.order)) +
geom_bar(stat="identity", position=position_stack(reverse = TRUE),size=.3)

Like the other user mentioned, it's difficult to help you without seeing your dataset, pubs. With that being said, if you are seeing your axis labeled as 2015.0 for the year 2015, that means pubs$Year is stored as a plain numeric rather than as a date, so ggplot puts it on a continuous numeric scale.
Try formatting as date first:
pubs$Year <- as.Date(pubs$Year, format="your.format.here")
If I saw how pubs$Year was constructed, I could suggest what to use for format=. When format= is omitted, as.Date tries "%Y-%m-%d" and then "%Y/%m/%d" on character input, per the documentation, and raises an error if neither matches your data. as.Date uses strptime() to convert to the Date class, so the strptime documentation explains how to write the format= string. Note that strptime converts characters to date types, so if your data is numeric, you may want to convert it to character first and then to a date. Sounds weird, but I've done it myself because it works. :)
Date is not the only class in R for representing a date or date/time value. You can also use POSIXct or POSIXlt via the as.POSIX* functions, which operate similarly to as.Date. Any of those classes should work fine with ggplot.
Finally, if you want to specify anything related to the scales in your plot, you can use scale_x_date to change breaks, limits, label formats, etc. See the scale_x_date documentation.
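As a minimal sketch of the advice above, assuming pubs$Year holds plain numeric years: the pubs values below are made up for illustration, since the real dataset was not posted. The numeric year is pasted into a character string, converted with as.Date, and labelled by year with scale_x_date.

```r
library(ggplot2)

# Made-up stand-in for the real `pubs` data frame (an assumption).
pubs <- data.frame(
  Year = c(2013, 2014, 2015, 2013, 2014, 2015),
  Pubs = c(2, 4, 3, 1, 2, 5),
  Author.order = rep(c("First", "Other"), each = 3)
)

# numeric -> character -> Date; every year is pinned to 1 January
pubs$Year <- as.Date(paste0(pubs$Year, "-01-01"), format = "%Y-%m-%d")

p <- ggplot(pubs, aes(x = Year, y = Pubs, fill = Author.order)) +
  geom_bar(stat = "identity", position = position_stack(reverse = TRUE)) +
  # one break per year, labelled with the four-digit year only
  scale_x_date(date_breaks = "1 year", date_labels = "%Y")
print(p)
```

Pinning each year to 1 January is an arbitrary choice; any fixed day works as long as it is the same for every row.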

Related

Gather turns my dates into unrecognisable format

I am trying to gather a couple of date columns so that it's easier to offer them as choices in shiny. However, when I gather the dates, a value such as 2020/12/14 turns into a number like 128284. I have tried as.Date, as.character, and lubridate, but none of them work. (I have been gathering in a separate script, outside shiny.) Please see my code for gathering below.
Here is my data
before gather
df<-df%>%gather(key="date.type", value="dates",
date.1, date.2, date.3, date.4)
This turns it to something like this;
after gather
This becomes a problem when I try to find the difference between two dates in Shiny (I have been using difftime).
The error I get in shiny is:
x character string is not in a standard unambiguous format
I am also thinking of not gathering at all, but allowing the user to choose the from date column and to date column in the UI, but I am not sure how to then find the difference in days between the from and to dates in the server.
mutate(theduration=difftime(input$to,input$from,units="days"))
This doesn't work.
OK, so I had this problem when using gather to make a dataset. When you get those 5 digit time character blocks, try this:
mutate(time=as.Date(as.numeric(time),origin="1899-12-30"))
Apparently, that 5-digit number is the number of days since the origin date. It's an MS thing. Good luck!
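To illustrate what the origin argument does, here is a small sketch with made-up serial numbers. Which origin applies to your data is something you need to check: 1899-12-30 matches Excel-style serial dates, while R's own Date class counts days from 1970-01-01, which is what you would see if a Date column simply lost its class.

```r
# Both serial numbers below are illustrative values, not taken from the question.
excel_serial <- 44179  # days since 1899-12-30 (Excel-style serial date)
r_serial <- 18610      # days since 1970-01-01 (a Date stripped of its class)

from_excel <- as.Date(excel_serial, origin = "1899-12-30")
from_r <- as.Date(r_serial, origin = "1970-01-01")

format(from_excel)  # "2020-12-14"
format(from_r)      # "2020-12-14"
```

If the converted dates come out decades away from what you expect, the wrong origin is the likely cause.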

Plotting POSIXct in ggplot manually scaling x-axis

I am trying to plot up this windspeed data, with years displaying on the x-axis. The data frame was set up as
wsAvg<-data.frame(date=as.POSIXct(ws07$date[1224:1559]),u.1=(ws07$u[1224:1559]),stringsAsFactors = FALSE)
wsAvg<-rbind(wsAvg,c(date=as.POSIXct(ws08$date[1032:1367]),(ws08$u[1032:1367])))
And below using ggplot to plot my windspeed data frame.
ggplot(wsAvg,aes(x=date,y=as.numeric(u.1)))+geom_point(size=3,pch=2)+
geom_smooth(method="lm",colour="black",se=FALSE)
# + scale_x_datetime(limits=as.POSIXct(c('2006-09-01','2016-10-01')),breaks=date_breaks("1 year"),labels=date_format("%Y"))
Without the scale_x_datetime() in my command, I get those dates. When I add the scale_x_datetime() call to manually scale my x-axis to display only years, all my data lines up on 2007. Does anyone know why this is?
It is very difficult to provide the answer to your question, since we don't have a clear picture of any of your data. With that being said, let's look at the information you did provide and see where the likely source of the problem is for your question.
The issue is clearly related to the formatting/data located in your "date" column. It's best to look at this stepwise and test at each step to see what can go wrong here:
Your raw data: There is likely nothing wrong with your base data, but we don't know the format of the "date" vector coming from ws07$date[1224:1559] and ws08$date[1032:1367]. Your raw data originates from two data frames, so just confirm that the raw data from these two vectors is formatted identically, but more importantly, is it already formatted as a date? What is class(ws08$date)? Also, what does the data look like if you took a sample of that dataset? (e.g. ws07$date[sample(1224:1559, 20)]).
Conversion to POSIXct: The first code you show includes as.POSIXct(), but does not include the argument for format=. You may or may not need to specify this, but I would recommend consulting the documentation to be sure you're using the function correctly. You can try converting a small subset of the data just using as.POSIXct(ws07$date[1224:1250]) or something like that. Does it give you the dates formatted correctly? If not, try specifying the format= arg until it "works" as you intended.
Initial plot and second plot: The data is spread out in the first plot, likely kind of how you expected. What about the month/day combinations in the first plot - are they correct? If they are correct, it may indicate the year is being read wrong, since apparently all dates are clustered around May and June of 2007. Comparing the first and second plots, there's no obvious issue with scale_x_datetime() here. Those two plots are consistent with data that has x values = dates ranging from May-June of 2007.
Bottom line: Hard to discern exactly where it's going wrong for you, but likely it's (1) in the conversion to date using as.POSIXct from your ws07 and ws08 datasets, or (2) the format of ws07$date or ws08$date being imported/converted incorrectly. The solution is to use the format= argument in the date conversion/import function you are using to ensure that the format is correct and years/months/dates are imported accordingly.
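A small sketch of the stepwise check described above, using made-up character timestamps in day/month/year order (the real layout of ws07$date is unknown, so both the values and the format strings here are assumptions). It shows how the wrong format= silently produces wrong or missing dates.

```r
# Made-up raw values; day/month/year order is an assumption for illustration.
raw <- c("01/05/2007 12:00", "15/06/2007 18:30")
class(raw)  # check what you actually have before converting

# A format string that matches the data
parsed <- as.POSIXct(raw, format = "%d/%m/%Y %H:%M", tz = "UTC")
format(parsed, "%Y-%m-%d %H:%M")  # "2007-05-01 12:00" "2007-06-15 18:30"

# A mismatched format: month and day swapped
wrong <- as.POSIXct(raw, format = "%m/%d/%Y %H:%M", tz = "UTC")
format(wrong[1], "%Y-%m-%d")  # "2007-01-05", parsed but wrong
is.na(wrong[2])               # TRUE, because there is no month 15
```

Testing a handful of values this way, including ones with a day above 12, makes a swapped format obvious before it reaches the plot.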
Here is the code that worked for me. Instead of using the c() function when binding data from the other datasets, I had to use data.frame() to add the other years to the wsAvg data frame.
wsAvg<-data.frame(date=as.POSIXct(ws07$date[1224:1559]),u.1=(ws07$u[1224:1559]),stringsAsFactors = FALSE)
wsAvg<-rbind(wsAvg,data.frame(date=as.POSIXct(ws08$date[1032:1367]),u.1=(ws08$u[1032:1367])))
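To see the effect in isolation, here is a self-contained sketch with made-up values standing in for ws07 and ws08 (the real data was not posted). Binding a second data frame with matching column names keeps the date column as POSIXct and the years intact.

```r
# Made-up stand-ins for the ws07 and ws08 subsets (assumptions).
wsAvg <- data.frame(
  date = as.POSIXct(c("2007-05-01 00:00", "2007-05-02 00:00"), tz = "UTC"),
  u.1 = c(3.2, 4.1),
  stringsAsFactors = FALSE
)
extra <- data.frame(
  date = as.POSIXct(c("2008-05-01 00:00", "2008-05-02 00:00"), tz = "UTC"),
  u.1 = c(2.7, 5.0)
)

# rbind on two data frames matches columns by name and keeps their classes
wsAvg <- rbind(wsAvg, extra)

class(wsAvg$date)         # still POSIXct
format(wsAvg$date, "%Y")  # "2007" "2007" "2008" "2008"
```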

Dates on X Axis

I have looked everywhere (Example, and a few other posts), but I just cannot understand how to set up the dates on the x-axis. Can someone help me? This is what I have so far:
Data.CSV:
21-Oct-14,
22-Oct-14,
23-Oct-14,....etc
I have tried this: axis(1,at=NVDA_Data$Date) and it doesn't show up, and the console says it's NULL
I just want to replace the 0-250 on the axis with the dates
You can improve your question by telling us that you are using base R plot (which appears to be the case from your screenshot).
One easy solution is to convert your date-looking character or factor variable into a Date variable. Plot will generally make reasonable axis marks from a Date variable. Use summary() on your data frame to determine if the Date variable is a factor or character. If it is a character, then do something like this:
NVDA_Data$Date2 <- as.Date(NVDA_Data$Date, "%d-%b-%y")
If your Date column is read in as a factor, use colClasses to read it in as character, then use the snippet above.
The link @user20650 provided gives some good tips for labeling Date variables.
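A minimal sketch of the conversion and a date-labelled axis in base R follows. The Close column and its values are hypothetical, added only so there is something to plot. Note that %b matches abbreviated month names in the current locale, so "Oct" assumes an English (or C) locale.

```r
# Date strings as in the question; Close is a hypothetical column with made-up values.
NVDA_Data <- data.frame(
  Date = c("21-Oct-14", "22-Oct-14", "23-Oct-14"),
  Close = c(17.9, 18.1, 18.4),
  stringsAsFactors = FALSE
)

# character -> Date; %y reads the two-digit year 14 as 2014
NVDA_Data$Date2 <- as.Date(NVDA_Data$Date, format = "%d-%b-%y")

# Suppress the default x-axis, then draw one labelled tick per date
plot(NVDA_Data$Date2, NVDA_Data$Close, type = "b", xaxt = "n",
     xlab = "Date", ylab = "Close")
axis.Date(1, at = NVDA_Data$Date2, format = "%d-%b")
```

With many rows, passing every date to at= will crowd the axis; a thinned sequence of dates is usually easier to read.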

Want only the time portion of a date-time object in R

I have a vector of times in R, all_symbols$Time and I am trying to find out how to get JUST the times (or convert the times to strings without losing information). I use
strptime(all_symbol$Time[j], format="%H:%M:%S")
which for some reason assumes the date is today and returns
[1] "2013-10-18 09:34:16"
Date and time formatting in R is quite annoying. I am trying to get the time only without adding too many packages (really any--I am on a school computer where I cannot install libraries).
Once you use strptime you will of necessity get a date-time object and the default behavior for no date in the format string is to assume today's date. If you don't like that you will need to prepend a string that is the date of your choice.
@James' suggestion is equivalent to what I was going to suggest:
format(all_symbol$Time[j], format="%H:%M:%S")
The only package I know of that has time classes (i.e time of day with no associated date value) is package:chron. However I find that using format as a way to output character values from POSIXt objects lends itself well to functions that require factor input.
In the decade since this was written, a package named “hms” has appeared that has some sort of facility for hours, minutes, and seconds.
hms: Pretty Time of Day
Implements an S3 class for storing and formatting time-of-day values, based on the 'difftime' class.
I came across the same problem recently and found this post and others, such as "R: How to handle times without dates?", inspiring. I'd like to contribute a little for whoever has similar questions.
If you only want to use base R, take advantage of as.POSIXct(..., format = "...") to transform your date-time into a standard format (as.Date would drop the time of day). Then you can use substr to extract the time, e.g. substr("2013-10-01 01:23:45 UTC", 12, 16) gives you 01:23.
If you can use package lubridate, functions like mdy_hms will make life much easier. And substr works most of the time.
If you want to compare the time, it should work if they are in Date or POSIXt objects. If you only want the time part, maybe force it into numeric (you may need to transform it back later). e.g. as.numeric(hm("00:01")) gives 60, which means it's 60 seconds after 00:00:00. as.numeric(hm("23:59")) will give 86340.
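For a base R version of both ideas above (format() to get the time as text, and a numeric value for comparisons), here is a small sketch with made-up timestamps. The seconds-since-midnight helper is just one way to do it and is not taken from any of the answers.

```r
# Made-up date-times; tz is set explicitly so the output does not depend on the machine.
x <- as.POSIXct(c("2013-10-18 09:34:16", "2013-10-18 23:59:00"), tz = "UTC")

# Time of day as character, no packages needed
time_only <- format(x, format = "%H:%M:%S")
time_only  # "09:34:16" "23:59:00"

# Seconds since midnight, for comparing times regardless of date
parts <- strsplit(time_only, ":", fixed = TRUE)
secs <- vapply(parts, function(p) sum(as.numeric(p) * c(3600, 60, 1)), numeric(1))
secs  # 34456 86340
```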

Plotting Time Series

I'm working on 16 world indices over three years and I want to make a plot from these 16 indices.
all<-read.table("C.../16indices.txt")
dimnames(all)[[2]]<-c("Date","BEL 20","CAC 40","AEX","DAX","FTSE 100","IBEXx 35","ATX","SMI","FTSE MIB","RTX","HSI","NIKKEI 225","S&P 500","NASDAQ","Dow Jones","BOVESPA")
attach(all)
Problems
My dates are written in the form "2009-01-05". I want only "2009" to appear, otherwise I would have too many jumps.
For example, the prices from the BOVESPA go from 40.000,15 to 60.000,137. How do I get nice y-labels? For instance 40.000, 45.000,...,60.000.
How do I get 16 of these plots in one nice figure/plot?
I'm not used to working with R. I tried something like this but that didn't work...
plot(all[1,],all[,2])
Biggest problem is no sample data. Here is advice based on guesswork:
I tried something like this but that didn't work... plot(all[1,],all[,2])
You need to format your date values as R Date class. If they are in YYYY-MM-DD format it will be as simple as:
all$Date <- as.Date(all$Date)
To your specific questions:
1) My dates are written in the form "2009-01-05". I want only "2009" to appear, otherwise I would have too many jumps.
You will need to suppress axis plotting in the plot call (xaxt = "n") and then add an axis() call of your own.
2) For example the prices from the BOVESPA go from 40.000,15 to 60.000,137. How do I get nice y-labels? For instance 40.000, 45.000,...,60.000.
You appear to be in a European locale, and that means your initial read.table call probably mangled the data input. Read the documentation for read.csv2, which properly handles the reversed meanings of the decimal point and comma in numeric data. You should also use colClasses.
3) How do I get 16 of these plots in one nice figure/plot?
You should probably calculate ratios from an initial starting point for each series so there can be a common scale for display.
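The three suggestions above can be sketched in base R as follows. The series is randomly generated for illustration only (no sample data was posted), and the names price and index are hypothetical.

```r
# Made-up weekly series standing in for one index (an assumption).
set.seed(1)
dates <- seq(as.Date("2009-01-05"), as.Date("2011-12-26"), by = "week")
price <- 40000 + cumsum(rnorm(length(dates), mean = 100, sd = 500))

# 1) Suppress the default axes, then add one x tick per year showing only "%Y"
plot(dates, price, type = "l", xaxt = "n", yaxt = "n",
     xlab = "Year", ylab = "Price")
year_ticks <- seq(as.Date("2009-01-01"), as.Date("2012-01-01"), by = "year")
axis.Date(1, at = year_ticks, format = "%Y")

# 2) y labels with "." as thousands separator and "," as decimal mark
y_ticks <- pretty(price)
axis(2, at = y_ticks,
     labels = format(y_ticks, big.mark = ".", decimal.mark = ",", scientific = FALSE))

# 3) Ratio to the first observation, so several series can share one scale
index <- price / price[1]
```

Once every series is rescaled to start at 1, lines() can overlay them on a single plot with a common y-axis.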
