How to make a scatter plot using ggplot2? - r

I am trying to make a data vs. Time graph for some Methane emissions data I have. The code so far looks like this:
CH4 <- as.numeric(Aeris_2_Data$CH4)
Aeris_2_Data$Date.Time <- as.POSIXct(Aeris_2_Data$Time_Stamp, tz = "", "%m/%d/%Y %H:%M:%S")
ggplot(Aeris_2_Data, aes(x = Aeris_2_Data$Date.Time, y = as.numeric(CH4)) + geom_point() + labs(x = "Time", y = "CH4 [ppm]") + ggtitle("Methane Over Time")
My data looks like this:
[Image: output of head(Aeris_2_Data)] and this: [Image: an extension of head]
I am trying to map CH4 over time, as you can probably see from the small code fragment I've managed so far, but I keep getting the error:
Error in seq.int(0, to0 - from, by) : 'to' must be a finite number
Everything seems to match the ggplot info I remember and also found online. What is going wrong? My guess is that it has to do with the formatting of the time data, which is in the format %m/%d/%Y %H:%M:%S and stored as a character in the csv file I am pulling from. How do I properly format that to change it? Thanks in advance.

There are two errors in your code:
the date format is "%m/%d/%Y %H:%M" and not "%m/%d/%Y %H:%M:%S"
one ) is missing after aes()
Additionally, as mentioned in the comments, you should use Date.Time directly inside aes() rather than Aeris_2_Data$Date.Time, and convert CH4 to numeric directly in the data.frame.
The code should be:
Aeris_2_Data$CH4 <- as.numeric(Aeris_2_Data$CH4)
Aeris_2_Data$Date.Time <- as.POSIXct(Aeris_2_Data$Time_Stamp, tz = "", "%m/%d/%Y %H:%M")
ggplot(Aeris_2_Data, aes(x = Date.Time, y = as.numeric(CH4))) + geom_point() + labs(x = "Time", y = "CH4 [ppm]") + ggtitle("Methane Over Time")
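A mismatched format string does not raise an error in as.POSIXct(); it silently returns NA, and an all-NA time axis appears to be what later triggers the 'to' must be a finite number message. The sketch below uses made-up timestamps (the real Time_Stamp values are not shown in the question) to illustrate checking the conversion before plotting:

```r
# Hypothetical timestamps without a seconds field, as the answer implies
stamps <- c("03/14/2021 09:15", "03/14/2021 09:16")

# Format expects seconds that are not present: every value becomes NA
with_seconds <- as.POSIXct(stamps, format = "%m/%d/%Y %H:%M:%S", tz = "UTC")

# Format matches the data: values parse
without_seconds <- as.POSIXct(stamps, format = "%m/%d/%Y %H:%M", tz = "UTC")

# Quick check to run before handing the column to ggplot
anyNA(with_seconds)
anyNA(without_seconds)
```

Running anyNA() (or summary()) on the converted column right after as.POSIXct() is a cheap way to catch this kind of problem early.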

Related

Can I chronologically order dates as characters in R?

I have data in a .csv. The first column is dates, the second column counts a number of days. I want to plot number of days vs. date. (see here)
In my .csv the dates are chronological by year. In RStudio, the initial plot is chronological by the month's number.
install.packages("tidyverse")
library(tidyverse)
#load my spreadsheet
openingData <- read_csv("daysPriorToOpening.csv")
ggplot(data = openingData) +
  geom_col(mapping = aes(x = dateOpened, y = daysPrior)) +
  labs(x = "Date Opened", y = "Days prior to opening at or above 11.0")
That creates this output, with it arranged in order by the number of the month. I like the appearance, just not the order. Someone suggested I try using as.Date()
openingData$dateOpened <- as.Date(openingData$dateOpened, format = "%m/%d/%Y")
Then I ran the code again to graph and it plotted chronologically, but now there are large gaps. See here. The dates aren't labeled as they were in the first picture; the reader has to guess the exact date.
My guess as to the different appearance is that in the first case, the dates are characters and discrete. In the second case, using as.Date() changed them to Dates and they become continuous. Is there a way to either,
keep the display as the first graph but order it by year, or
display as in the second graph but either eliminate the gaps or label the columns with their corresponding date?
openingData %>%
mutate(dateOpened = as.Date(dateOpened,"%m/%d/%y")) %>%
arrange(dateOpened) %>%
mutate(id = factor(row_number(),labels = dateOpened)) %>%
ggplot() +
geom_col(mapping = aes(x = id, y = daysPrior))+
labs(x = "Date Opened", y = "Days prior to opening at or above 11.0")
You need to convert your dates to a factor, and order the factor levels according to the date they represent. This involves converting to a date, ordering, then converting back again.
dates <- as.Date(openingData$dateOpened, format = "%m/%d/%y")
levs <- unique(strftime(sort(dates), format = "%m/%d/%y"))  # unique() avoids duplicated factor levels if a date repeats
openingData$dateOpened <- factor(strftime(dates, format = "%m/%d/%y"), levs)
ggplot(data = openingData) +
geom_col(mapping = aes(x = dateOpened, y = daysPrior)) +
labs(x = "Date Opened", y = "Days prior to opening at or above 11.0")
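The idea of ordering factor levels by the date they represent can be seen on a small example. This is a minimal sketch with made-up date strings (not the contents of daysPriorToOpening.csv), assuming two-digit years as in the answer's format string:

```r
# Hypothetical date strings, deliberately not in chronological order
chars <- c("03/01/19", "12/15/18", "01/20/19")

# Parse to Date only to learn the chronological order
dates <- as.Date(chars, format = "%m/%d/%y")

# Keep the original labels, but order the levels by date
f <- factor(chars, levels = unique(chars[order(dates)]))

levels(f)
# "12/15/18" "01/20/19" "03/01/19"
```

Because the x variable stays a factor, ggplot treats it as discrete: each bar keeps its own label and no gaps appear between dates.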

Trouble with ggplot2 and POSIX x-axis

I'm fairly new to R. While running the code below I get an error (Error in seq.int(0, to0 - from, by) : 'to' must be a finite number) at the final ggplot2 call; details below. I'm not sure what causes this. I saw another post about a similar error, but it doesn't seem to be fully relevant to my code structure. I'd appreciate any input. Thanks.
setwd("C:/Users/xx")
library(ggplot2)
library(forecast)
library(dplyr)
library(colortools)
Master <- read.csv("master.csv")
class(Master$TIMESTAMP)
Master$TIMESTAMP_posix <- as.POSIXct(Master$TIMESTAMP, format = "%D/%M/%Y %H:%M:%S")
class(Master$TIMESTAMP_posix)
(time_plot_2 <- ggplot(Master, aes(x = TIMESTAMP_posix, y = VWC_CS7)) +
geom_line() +
scale_x_datetime(date_labels = "%Y", date_breaks = "1 day") +
theme_classic())
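One likely cause, offered as a guess since master.csv is not shown: in strptime-style formats, %D is shorthand for %m/%d/%y and %M means minutes, so "%D/%M/%Y" does not describe a day/month/year date. If the timestamps are day/month/year, the conversion would return NA for every row. A minimal sketch with a made-up timestamp:

```r
# Hypothetical timestamp in day/month/year order
stamp <- "25/12/2019 10:30:00"

# %D means %m/%d/%y and %M means minutes, so this does not parse
bad <- as.POSIXct(stamp, format = "%D/%M/%Y %H:%M:%S", tz = "UTC")

# Lower-case %d (day of month) and %m (month) describe the string correctly
good <- as.POSIXct(stamp, format = "%d/%m/%Y %H:%M:%S", tz = "UTC")

is.na(bad)
is.na(good)
```

If sum(is.na(Master$TIMESTAMP_posix)) equals nrow(Master), the format string is the first thing to check.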

Problems creating datetime series graph in R using ggplot

I am trying to create a graph with the following characteristics:
x-axis: time and date
y-axis: data
here you can download my dataframe: https://my.cloudme.com/josechka/data
I try to produce the graph using:
p <- ggplot(data, aes(x = Date, y = Var, group = 1)) +
  geom_line() +
  scale_x_date(labels = date_format("%m/%d/%Y")) +
  scale_y_continuous(limits = c(0, 70000))
p
And I get the result:
Error: Invalid input: date_trans works with objects of class Date only
I am quite new in R and ggplot. What am I doing wrong?
As suggested you have to format the Date column into a Date object.
data$Date<-as.Date(data$Date, format="%d/%m/%Y")
Now you can use your script in order to create the plot:
library("ggplot2")
library("scales")
# A leading + on a new line is a syntax error in R, so each + ends the line
p <- ggplot(data, aes(x = Date, y = Var, group = 1)) +
  geom_line() +
  scale_x_date(labels = date_format("%m/%d/%Y")) +
  scale_y_continuous(limits = c(0, 70000))
p
And this is the resulting plot:
Thanks for the comments; they helped me find the solution. Both suggestions let me plot my data. However, there is a small problem: data from the same day are grouped together, so it is not possible to see the daily behaviour of the variable. I tried converting the Date column with the following command instead:
as.POSIXct(data$Date, format="%d/%m/%Y %H:%M:%S")
It worked. Note that the original data must be in the format d/m/Y h:m:s. Thanks very much for the comments, which helped me a lot in solving my problem.
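The grouping described above happens because as.Date() keeps only the calendar day, while as.POSIXct() keeps the time of day as well. A minimal sketch with made-up timestamps (the linked data set is not reproduced here):

```r
# Hypothetical readings taken twice on the same day
stamps <- c("01/02/2015 00:00:00", "01/02/2015 12:00:00")

# as.Date() drops the time part, so both readings share one x value
as_dates <- as.Date(stamps, format = "%d/%m/%Y")

# as.POSIXct() keeps the time part, so the readings stay distinct
as_times <- as.POSIXct(stamps, format = "%d/%m/%Y %H:%M:%S", tz = "UTC")

length(unique(as_dates))
length(unique(as_times))
```

Note that a POSIXct column pairs with scale_x_datetime() rather than scale_x_date(), which accepts Date objects only.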

Cannot convert a time variable to plot it on ggplot

I have two problems handling my time variable in GNU R!
First, I cannot convert the time data (downloadable here) from factor (or character) with as.POSIXlt or as.Date without an error message like this:
character string is not in a standard unambiguous format
I then tried to convert my time data with:
dates <- strptime(time, "%Y-%m-%j")
which only gives me:
NA
Second, the reason I wanted (had) to convert my time data is that I want to plot it with ggplot2 and adjust my scale_x_continuous (as described here) so that it labels only every 50 years (i.e. 1250-01-01, 1300-01-01, etc.) on the x-axis; otherwise the x-axis is too busy (see graph below).
This is the code I use:
library(ggplot2)
library(scales)
library(reshape)
df <- read.csv(file="https://dl.dropboxusercontent.com/u/109495328/time.csv")
attach(df)
dates <- as.character(time)
population <- factor(Number_Humans)
ggplot(df, aes(x = dates, y = population)) + geom_line(aes(group=1), colour="#000099") + theme(axis.text.x=element_text(angle=90)) + xlab("Time in Years (A.D.)")
You need to remove the quotation marks in the date column, then you can convert it to date format:
df <- read.csv(file="https://dl.dropboxusercontent.com/u/109495328/time.csv")
df$time <- gsub('\"', "", as.character(df$time), fixed=TRUE)
df$time <- as.Date(df$time, "%Y-%m-%d")  # %d is day of month; %j would mean day of year
ggplot(df, aes(x = time, y = Number_Humans)) +
geom_line(colour="#000099") +
theme(axis.text.x=element_text(angle=90)) +
xlab("Time in Years (A.D.)")
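The question also asked for a label only every 50 years. Once the column is a Date, one way to get there is to build the break positions explicitly. The sketch below assumes a range of 1250 to 1400 purely for illustration; the real range would come from the data:

```r
# Hypothetical start and end of the time axis
first_year <- as.Date("1250-01-01")
last_year  <- as.Date("1400-01-01")

# One break every 50 years
brks <- seq(first_year, last_year, by = "50 years")

format(brks, "%Y")
# "1250" "1300" "1350" "1400"
```

A vector like brks can be passed to scale_x_date(breaks = brks, date_labels = "%Y"); scale_x_date(date_breaks = "50 years", date_labels = "%Y") is a shorter alternative that lets ggplot2 choose where the breaks start.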

ggplot2 geom_line() and smoothing

I am trying to create a ggplot2 smoothed line graph that looks more like this
Source: http://www.esrl.noaa.gov/psd/enso/mei/
and less like this:
Source: https://dl.dropboxusercontent.com/u/16400709/StackOverflow/Rplot02.png
My data are available on dropbox.
Having looked at previous posts I used the code below:
#MEI Line Graph
d4 <- read.csv("https://dl.dropboxusercontent.com/u/16400709/StackOverflow/Data_MEI.csv")
head(d4,n=20)
MEI<-ggplot(d4,aes(x=d4$Date, y=d4$MEI,group=1))+geom_line()
MEI+stat_smooth(method ="auto",level=0.95)
What I think I need is to reduce the amount of smoothing taking place, but I have yet to figure out how to achieve this.
d4s<-SMA(d4$MEI,n=8)
plot.ts(d4s)
SMA() works well, but I can't get it to work with ggplot.
Any hints would be appreciated!
Be aware that the MEI index is for a 2-month period, so it's already got some smoothing built in. Assuming that you are using the MEI data that NOAA ESRL publishes, you should be able to create the same plot.
First of all, you need to get the system set up, as you'll be working with time zones:
# set things up ----
working.dir = file.path('/code/R/StackOverflow/')
setwd(working.dir)
Sys.setenv(TZ='GMT')
now, download your data and read it in
d.in <- read.csv("MEI.txt")
The next step is to get the dates formatted properly.
d.in$Date <- as.POSIXct(d.in$Date,
format = "%d/%m/%Y",
tz = "GMT")
and because we need to figure out where things cross the x-axis, we'll have to work in decimal dates. Use the Epoch value:
d <- data.frame(x = as.numeric(format(d.in$Date,
'%s')),
y = d.in$MEI)
Now we can figure out the zero-crossings. We'll use Beroe's example for that.
rx <- do.call("rbind",
sapply(1:(nrow(d)-1), function(i){
f <- lm(x~y, d[i:(i+1),])
if (f$qr$rank < 2) return(NULL)
r <- predict(f, newdata=data.frame(y=0))
if(d[i,]$x < r & r < d[i+1,]$x)
return(data.frame(x=r,y=0))
else return(NULL)
}))
and tack that on the end of the initial data:
d2 <- rbind(d,rx)
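For reference, the same zero-crossings can be found without fitting lm() to each pair of points, by interpolating linearly between neighbours whose y values have opposite signs. This is a different technique from the one used above, shown on a made-up series rather than the MEI data:

```r
# Hypothetical toy series: x is time in seconds, y changes sign twice
toy <- data.frame(x = c(0, 10, 20, 30), y = c(-1, 1, 3, -3))

# Indices i where y[i] and y[i + 1] have opposite signs
i <- which(head(toy$y, -1) * tail(toy$y, -1) < 0)

# Linear interpolation: x0 = x1 - y1 * (x2 - x1) / (y2 - y1)
x0 <- toy$x[i] - toy$y[i] * (toy$x[i + 1] - toy$x[i]) / (toy$y[i + 1] - toy$y[i])

crossings <- data.frame(x = x0, y = 0)
crossings$x
# 5 25
```

The resulting data frame has the same x and y columns as rx, so it could be appended with rbind() in the same way.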
now convert back to dates:
d2$date <- as.POSIXct(d2$x,
                      origin = "1970-01-01",  # epoch seconds from '%s' count from 1970-01-01
                      tz = "GMT")
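A quick round-trip check makes the role of the origin visible. The timestamp below is made up; the only assumption is that the numeric values are seconds since the Unix epoch, 1970-01-01 UTC:

```r
# Hypothetical timestamp
t0 <- as.POSIXct("2015-06-01 12:00:00", tz = "GMT")

# Seconds since 1970-01-01 00:00:00 UTC
secs <- as.numeric(t0)

# Converting back with the Unix epoch as origin recovers the timestamp
back <- as.POSIXct(secs, origin = "1970-01-01", tz = "GMT")

# With a 1960 origin (and no format argument) the result lands about ten years early
wrong <- as.POSIXct(secs, origin = "1960-01-01", tz = "GMT")

back == t0
format(wrong, "%Y")
```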
now we can do the plot:
require(ggplot2)
ggplot(d2,aes(x = date,
y = y)) +
geom_area(data=subset(d2, y<=0), fill="blue") +
geom_area(data=subset(d2, y>=0), fill="red") +
scale_y_continuous(name = "MEI")
and that gives you this:
Now, do you really need to smooth this?
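If some smoothing is still wanted, a moving average can be stored as an extra column and drawn like any other variable. The sketch below uses stats::filter() from base R in place of the SMA() call from the question, on made-up values rather than the MEI data; the column names are illustrative only:

```r
# Hypothetical stand-in for the MEI column
mei <- c(0.5, 1.0, 1.5, 0.0, -1.0, -0.5, 0.5, 1.0)

# Trailing moving average over n points; the first n - 1 values are NA
n <- 3
sma <- as.numeric(stats::filter(mei, rep(1 / n, n), sides = 1))

# Keep raw and smoothed values side by side for plotting
smoothed <- data.frame(index = seq_along(mei), MEI = mei, MEI_sma = sma)
smoothed
```

A column such as MEI_sma can then be mapped to y in aes() and drawn with geom_line(). For the stat_smooth() route from the question, method = "loess" with a smaller span argument gives less smoothing.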
