Executing getSunlightTimes function in R with data frame? - r

I am hoping that you can support me with the use of the getSunlightTimes function. I have a pixel level data frame ("latlon2") with latitude ("lat"), longitude ("lon"), and one date ("date") in format YYYY-MM-DD. The data covers the continental US, and I also have a state code variable in the data frame.
To obtain the date variable as a date class variable, I executed:
latlon2$date=as.Date(latlon2$d2003s)
I am trying to use the getSunlightTimes to identify the time of sunrise and sunset for each pixel on the designated date. However, I am having a hard time getting the function to work. There is not a lot of information on this command beyond R's help guides, so I am hoping some of you have worked with it and can offer your suggestions based on my approach so far.
First, I tried calling getSunlightTimes with each latitude/longitude/date column of my data frame:
sunrise2003CET=getSunlightTimes(date="latlon2$date", lat="latlon2$lat", lon="latlon2$lon", tz="CET", keep = c("sunrise", "sunset"))
R returns the error:
Error in getSunlightTimes(date = "latlon2$date2", lat = "latlon2$lat",
: date must to be a Date object (class Date)
What's frustrating about this is that when I look at class(latlon2$date) R verifies that the column is a "Date" class!
Next, I tried designating the data frame only:
sunrise2003CET=getSunlightTimes(data="latlon2", tz="CET", keep = c("sunrise", "sunset"))
R returns the error:
Error in .buildData(date = date, lat = lat, lon = lon, data = data) :
all(c("date", "lat", "lon") %in% colnames(data)) is not TRUE
This seems odd because I named the columns in the dataframe "date", "lat", "lon", but perhaps the error is due to the fact that there are other variables in the data frame (such as state code).
I am trying to perform this task for several dates across 15 years (and four time zones), so any suggestions on how to get this running, and also running efficiently, are much appreciated!
Thank you so much!
Colette

The problem is with the quotes. When you write
sunrise2003CET=getSunlightTimes(date="latlon2$date",
lat="latlon2$lat",
lon="latlon2$lon",
tz="CET",
keep = c("sunrise", "sunset"))
you shouldn't put the expressions for the date, lat and lon arguments in quotes, because then R will see them as strings. (You could try class("latlon2$date") to see this.) Just write it as
sunrise2003CET=getSunlightTimes(date=latlon2$date,
lat=latlon2$lat,
lon=latlon2$lon,
tz="CET",
keep = c("sunrise", "sunset"))
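The second attempt in the question, data="latlon2", most likely fails for the same reason: the string "latlon2" has no column names, so the check for date, lat and lon cannot pass. Extra columns such as the state code are probably not the cause. Below is a minimal sketch with an invented two-row data frame; it assumes getSunlightTimes() comes from the suncalc package and only calls it if that package is installed.

```r
# Hypothetical stand-in for the asker's data frame (values invented for illustration)
latlon2 <- data.frame(
  lat  = c(40.7, 34.1),
  lon  = c(-74.0, -118.2),
  date = as.Date(c("2003-06-21", "2003-06-21"))
)

quoted_class   <- class("latlon2$date")  # a quoted expression is just a string
unquoted_class <- class(latlon2$date)    # unquoted, it evaluates to the Date column
print(quoted_class)
print(unquoted_class)

# Assumption: getSunlightTimes() is provided by suncalc; skip the call if it is not installed
if (requireNamespace("suncalc", quietly = TRUE)) {
  # Pass the data frame object itself (no quotes); it needs columns date, lat and lon
  sun <- suncalc::getSunlightTimes(data = latlon2, tz = "CET",
                                   keep = c("sunrise", "sunset"))
  print(sun)
}
```

Because the function takes whole vectors (or a whole data frame), one call per date range should be enough; looping over pixels is probably unnecessary.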

Related

Finding maximum or minimum date value for each individual

I have a dataframe in a wide format in R, denoting different visit dates for each individual (visitdate1, visitdate2, visitdate3, etc.). I'm trying to find the latest date for each individual and save it as a new column, but this doesn't seem to be working.
I checked the class of the dataframe and each visitdate is already recognized as a Date, so I don't know why the code is not working.
This is the code I tried:
df1$latestdate <- pmax(as_date(df1$visitdate1), as_date(df1$visitdate2),
as_date(df1$visitdate3))
The error I'm getting is the following:
Error in as.Date.default(x, ...) :
do not know how to convert 'x' to class “Date”
The problem is that I'm asking R to find the maximum date value per row, not to convert any date (as it's already a date).
However, even when I leave as_date out of the code, I get the error that :
replacement has 0 rows, data has 120.
Any insight that might help? Thanks in advance! Btw, I'm new to R. :)
Below I provide an example, kind of guessing what your data looks like. pmax may not be the best thing for this.
DATES = seq(as.Date('2011-01-01'),as.Date('2017-01-01'),"months")
df = data.frame(id=1:10,
visitdate1 = sample(DATES,10),
visitdate2 = sample(DATES,10),
visitdate3 = sample(DATES,10)
)
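# columns to take the row-wise maximum over
COLUMNS = c("visitdate1","visitdate2","visitdate3")
# apply() coerces each row to character, so convert the result back to Date
df$latestdate = as.Date(apply(df[,COLUMNS],1,max))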
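For what it's worth, pmax can also be made to work here, since it keeps the Date class of its first argument. The sketch below uses invented data with the column names from the question. As a side note, the "replacement has 0 rows" error usually means the columns being referenced do not exist under those names, so it may be worth checking names(df1) first.

```r
# Invented example data using the column names from the question
df <- data.frame(
  id = 1:3,
  visitdate1 = as.Date(c("2011-01-01", "2012-05-01", "2013-03-01")),
  visitdate2 = as.Date(c("2011-06-01", "2012-02-01", NA)),
  visitdate3 = as.Date(c("2010-12-01", "2014-01-01", "2013-01-01"))
)
COLUMNS <- c("visitdate1", "visitdate2", "visitdate3")

# pmax() compares element-wise across its arguments;
# do.call() passes each date column as one argument, na.rm = TRUE skips missing visits
df$latestdate <- do.call(pmax, c(df[COLUMNS], na.rm = TRUE))
print(df)
```

Unlike the apply() approach, this never leaves the Date class, so no conversion back is needed.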

Receiving odd error while downloading PRISM climate data. How to download a list of dates?

Trying to get mean temperature for a certain date over the last 5 years. Keep running into this same error. Any suggestions would be greatly appreciated!
get_prism_dailys(type="tmean", dates = as.Date("2018-06-01", "2017-06-01", "2016-06-01", "2015-06-01", "2014-06-01"), keepZip=FALSE)
Error Received:
Error in if (!is_within_daily_range(dates)) stop("Please ensure all dates fall within the valid Prism data record") :
missing value where TRUE/FALSE needed
The short answer to your question is quite simple: just replace as.Date with the c() function, so that all the dates are passed as a single vector:
get_prism_dailys(type = "tmean", dates = c("2018-06-01", "2017-06-01",
"2016-06-01", "2015-06-01", "2014-06-01"), keepZip=FALSE)
It seems to me that the third example in the get_prism_dailys() documentation, which you have probably used, has a small issue. Let's look at the source code of get_prism_dailys(), which can be printed with
print(get_prism_dailys)
The dates argument is processed inside get_prism_dailys() by the gen_dates() function:
dates <- gen_dates(minDate = minDate, maxDate = maxDate, dates = dates)
In turn, gen_dates() converts the dates argument with as.Date(), as we can see with getAnywhere():
getAnywhere(gen_dates)
if(!is.null(dates)){
# make sure it is cast as a date if it was provided as a character
dates <- as.Date(dates)
Thus, the transformation of the date input with as.Date() is not necessary, but it is essential to put all the supplied dates into a single vector.
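To see why the original call produced "missing value where TRUE/FALSE needed", here is a small base R sketch (no PRISM download involved). The second positional argument of as.Date() is the format, so the extra dates are never treated as data and the result is NA.

```r
# The second positional argument of as.Date() is `format`, not another date,
# so "2018-06-01" is parsed against the literal pattern "2017-06-01" and fails
wrong <- as.Date("2018-06-01", "2017-06-01")
print(is.na(wrong))

# Combining the dates with c() first gives one character vector,
# which as.Date() converts element by element
dates <- as.Date(c("2018-06-01", "2017-06-01", "2016-06-01",
                   "2015-06-01", "2014-06-01"))
print(dates)
```

An NA date then reaches the range check inside get_prism_dailys(), and if() cannot evaluate an NA condition.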
Hope this helps :)

Creating SpatialLinesDataFrame from SpatialLines object and basic df

Using leaflet, I'm trying to plot some lines and set their color based on a 'speed' variable. My data start at an encoded polyline level (i.e. a series of lat/long points, encoded as an alphanumeric string) with a single speed value for each EPL.
I'm able to decode the polylines to get lat/long series (thanks to Max, here) and I'm able to create segments from those series of points and format them as a SpatialLines object (thanks to Kyle Walker, here).
My problem: I can plot the lines properly using leaflet, but I can't join the SpatialLines object to the base data to create a SpatialLinesDataFrame, and so I can't code the line color based on the speed var. I suspect the issue is that the IDs I'm assigning SL segments aren't matching to those present in the base df.
The objects I've tried to join, with SpatialLinesDataFrame():
"sl_object", a SpatialLines object with ~140 observations, one for each segment; I'm using Kyle's code, linked above, with one key change - instead of creating an arbitrary iterative ID value for each segment, I'm pulling the associated ID from my base data. (Or at least I'm trying to.) So, I've replaced:
id <- paste0("line", as.character(p))
with
lguy <- data.frame(paths[[p]][1])
id <- unique(lguy[,1])
"speed_object", a df with ~140 observations of a single speed var and row.names set to the same id var that I thought I created in the SL object above. (The number of observations will never exceed but may be smaller than the number of segments in the SL object.)
My joining code:
splndf <- SpatialLinesDataFrame(sl = sl_object, data = speed_object)
And the result:
row.names of data and Lines IDs do not match
Thanks, all. I'm posting this in part because I've seen some similar questions - including some referring specifically to changing the ID output of Kyle's great tool - and haven't been able to find a good answer.
EDIT: Including data samples.
From sl_obj, a single segment:
print(sl_obj)
Slot "ID":
[1] "4763655"
[[151]]
An object of class "Lines"
Slot "Lines":
[[1]]
An object of class "Line"
Slot "coords":
lon lat
1955 -74.05228 40.60397
1956 -74.05021 40.60465
1957 -74.04182 40.60737
1958 -74.03997 40.60795
1959 -74.03919 40.60821
And the corresponding record from speed_obj:
row.names speed
... ...
4763657 44.74
4763655 34.8 # this one matches the ID above
4616250 57.79
... ...
To get rid of this error message, either make the row.names of data and Lines IDs match by preparing sl_object and/or speed_object, or, in case you are certain that they should be matched in the order they appear, use
splndf <- SpatialLinesDataFrame(sl = sl_object, data = speed_object, match.ID = FALSE)
This is documented in ?SpatialLinesDataFrame.
All right, I figured it out. The error was caused by my speed_obj not being the same length as my sl_obj, as mentioned here. ("data = object of class data.frame; the number of rows in data should equal the number of Lines elements in sl")
Resolution: used a quick loop to pull out all of the unique lines IDs, then performed a left join against that list of uniques to create an exhaustive speed_obj (with NAs, which seem to be OK).
ids <- data.frame()
for (i in (1:length(sl_obj))) {
id <- data.frame(sl_obj@lines[[i]]@ID)
ids <- rbind(ids, id)
}
colnames(ids)[1] <- "linkId"
speed_full <- join(ids, speed_obj)
speed_full_short <- data.frame(speed_full[,c(-1)])
row.names(speed_full_short) <- speed_full$linkId
splndf <- SpatialLinesDataFrame(sl_obj, data = speed_full_short, match.ID = T)
Works fine now!
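The same alignment can be sketched without a loop or a join, using match() in base R. The IDs and speeds below are invented stand-ins; with sp loaded, the real IDs could presumably be collected with something like sapply(sl_obj@lines, function(l) l@ID), which is an assumption about the object rather than tested code.

```r
# Hypothetical stand-ins: IDs of the line segments, and a smaller speed table
line_ids  <- c("4763657", "4763655", "4616250", "4700001")
speed_obj <- data.frame(speed = c(34.8, 57.79, 44.74),
                        row.names = c("4763655", "4616250", "4763657"))

# match() returns, for each line ID, the row of speed_obj holding its speed
# (NA where a segment has no speed record)
idx <- match(line_ids, row.names(speed_obj))

# One row per segment, in segment order, with row names equal to the line IDs
speed_full <- data.frame(speed = speed_obj$speed[idx], row.names = line_ids)
print(speed_full)
```

A data frame built this way has as many rows as there are segments and matching row names, which are the two conditions the SpatialLinesDataFrame() error messages above point to.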
I may have deciphered the issue.
When I read in my spatial lines data and check its class, it reads "SpatialLinesDataFrame", even though I know it's a simple line shapefile. I'm using readOGR to bring the data in, and I believe this is where the conversion occurs. With that in mind, the speed assignment is relatively easy.
sl_object$speed <- speed_object[ match( sl_object$ID , row.names( speed_object ) ) , "speed" ]
This should do the trick, as I'm willing to bet your class(sl_object) is "SpatialLinesDataFrame".
EDIT: I had received the same error as the OP, which drove me to check class().
My impression is that you got the error because you were trying to coerce a data frame into a data frame, and R wasn't a fan of that.

csv to frequency polygon using R or python

I have a result.csv file which contains information in the following format:
date,tweets
2015-06-15,tweet
2015-06-15,tweet
2015-06-12,tweet
2015-06-11,tweet
2015-06-11,tweet
2015-06-11,tweet
2015-06-08,tweet
2015-06-08,tweet
I want to plot a frequency polygon with the number of entries for each date on the y axis and the dates on the x axis.
I have tried the following code:
pf<-read.csv("result.csv")
library(ggplot2)
qplot(datetime, data =pf, geom = "freqpoly")
but it shows the following error :
geom_path: Each group consist of only one observation. Do you need to adjust the group aesthetic?
Can anyone tell me how to solve this problem? I am totally new to R, so any kind of guidance will be of great help to me.
Your issue is that you are trying to treat datetime as continuous, but it was imported as a factor (discrete/categorical). Let's convert it to a Date object and then things should work:
pf$datetime = as.Date(pf$datetime)
qplot(datetime, data =pf, geom = "freqpoly")
Based on your code, I assume that the result.csv has a header: datetime, atweet. By default, read.csv takes the first line of the CSV file as column names. That means you will be able to access the two columns with pf$datetime and pf$atweet.
If you look at the documentation of read.csv, you will find that stringsAsFactors = default.stringsAsFactors(), which is TRUE in R versions before 4.0.0. That is, the strings from CSV files are converted to factors.
Now, even if you change the value of stringsAsFactors, you still get the same error. That is because ggplot does not know how to order the dates, as it does not recognize the strings as such.
To transform the strings into actual date-time values, you can use strptime.
Here is the working example:
pf<-read.csv("result.csv", stringsAsFactors=FALSE)
library(ggplot2)
qplot(strptime(pf$datetime, "%Y-%m-%d"), data=pf, geom='freqpoly')
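If you want to see the numbers behind the polygon, the counts per date can also be computed directly. This is a minimal base R sketch; it reads the sample rows from the question inline instead of from result.csv, and uses the header names shown there (date, tweets), which is an assumption about the real file.

```r
# Inline copy of the sample rows from the question (instead of reading result.csv)
pf <- read.csv(text = "date,tweets
2015-06-15,tweet
2015-06-15,tweet
2015-06-12,tweet
2015-06-11,tweet
2015-06-11,tweet
2015-06-11,tweet
2015-06-08,tweet
2015-06-08,tweet", stringsAsFactors = FALSE)
pf$date <- as.Date(pf$date)

# One row per date, with the number of entries on that date in column n
counts <- as.data.frame(table(date = pf$date), responseName = "n")
counts$date <- as.Date(as.character(counts$date))  # table() turns the dates into factor levels
print(counts)

# Draw the counts as a line when running interactively
if (interactive()) {
  plot(counts$date, counts$n, type = "b", xlab = "date", ylab = "number of tweets")
}
```

Note that dates with no entries at all simply do not appear in counts, so the line connects only the observed dates.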

Errors when using stConstruct() or STFDF() to create STFDF object

I am relatively new to R, so I apologize if I have trouble expressing what I'm attempting to do. I have a 'spatial' panel dataset in long form and a shapefile. The long form table is a data.frame and it includes a column of dates (that have been converted to dates using 'as.Date') and an ID column that is the same as that in the shapefile used to identify the different polygons (thus my long form dataset has no long/lat values, just an ID field that corresponds to the polygon features in the shapefile). I want to construct a spatiotemporal object of class ST out of these two objects (the shapefile and the long form dataset). To do this I have tried using stConstruct() and STFDF(), but with absolutely no luck. stConstruct() gives me this error:
stConstruct(x, x$ID, x$date, SpatialObj = pol, TimeObj = NULL, interval=FALSE)
Error in stConstruct(x, x$ID, x$date, SpatialObj = pol, TimeObj = NULL, :
unknown parameter combination
and STFDF() gives me this error:
STFDF(shapefile, x$date, x)
Error: nrow(object@data) == length(object@sp) * nrow(object@time) is not TRUE
I've been stuck on this for days reading everything I can about the spacetime package in forums, etc. but to no avail. Any help is greatly appreciated.
thanks!
About STFDF error
From:
http://r-sig-geo.2731867.n2.nabble.com/Error-with-STFDF-td7584461.html
"If you don't have every time value at every spatial point, then you can't
have an STFDF object, as by definition STFDF is a full space by time grid.
The equation in the error is part of the definition/requirement for an
STFDF object.
SDIDF objects don't have that requirement of all times at all locations..."
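The requirement quoted above can be checked, and a long-form table padded out to a full grid, with base R alone. The sketch below uses an invented three-polygon, two-date panel; the column names ID and date follow the question, while value is hypothetical. It stops short of calling STFDF() itself.

```r
# Invented long-form panel: polygon "c" has no record for the second date
x <- data.frame(
  ID    = c("a", "b", "c", "a", "b"),
  date  = as.Date(c("2020-01-01", "2020-01-01", "2020-01-01",
                    "2020-01-02", "2020-01-02")),
  value = c(1.2, 3.4, 5.6, 7.8, 9.0)
)

n_space <- length(unique(x$ID))
n_time  <- length(unique(x$date))
# The condition from the error message: rows must equal space * time
is_full_grid <- nrow(x) == n_space * n_time
print(is_full_grid)

# Build every ID/date combination and merge, so the missing cells become NA rows
grid <- expand.grid(ID = unique(x$ID), date = unique(x$date),
                    stringsAsFactors = FALSE)
full <- merge(grid, x, by = c("ID", "date"), all.x = TRUE)

# STFDF expects the spatial index to move fastest, i.e. all IDs for the first date,
# then all IDs for the second date; the ID order should also follow the polygons' order
full <- full[order(full$date, full$ID), ]
print(full)
```

Whether padding with NA is acceptable depends on the analysis; if most cells would be empty, one of the package's sparse or irregular classes is probably the better fit, as the quoted reply suggests.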
