How to input multiple departure dates and times in gmapsdistance R package? - r

I am using the R package gmapsdistance to extract the Google distance matrix.
My code is:
gdm <- gmapsdistance(origin = origin_vector,
                     destination = destination_vector,
                     combinations = "pairwise",
                     mode = "driving",
                     shape = "long",
                     dep_date = date_vector,
                     dep_time = time_vector)
The Google API key was also set. The code works until I add the dep_date and dep_time parameters; I believe this is because those parameters do not accept vectors. Two errors appear:
1: XML declaration allowed only at the start of the document.
2: Extra content at the end of the document.
In my dataset, each row is a data point with a unique origin, destination, departure date, and departure time. I require the Google distance/time for each row and there are thousands of them. There is no need for me to compare across rows. How may I do this?
Appreciate any help!
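Since each row is independent, one way around this is to make a single gmapsdistance() call per row and bind the results. This is only a sketch: it assumes your rows live in a data frame called mydata with columns origin, destination, dep_date, and dep_time (all names hypothetical), and relies on each call returning a list with $Time and $Distance elements, as gmapsdistance does. With thousands of rows, keep the API's rate and quota limits in mind.

```r
library(gmapsdistance)

# One API call per row; Map() walks the four columns in parallel.
results <- Map(function(o, d, dt, tm) {
  res <- gmapsdistance(origin = o,
                       destination = d,
                       mode = "driving",
                       dep_date = dt,
                       dep_time = tm)
  data.frame(origin = o, destination = d,
             time = res$Time, distance = res$Distance)
}, mydata$origin, mydata$destination, mydata$dep_date, mydata$dep_time)

# Stack the per-row results into one data frame.
gdm_all <- do.call(rbind, results)
```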

Related

R native time series: date data

There are R native datasets, such as the Nile dataset, that are time series. However, if I actually look at the data set, whether raw, after as_tibble(), or after as.data.frame(), there is only one column: x (which, in this specific case, is the "measurement of annual flow of the river"). However, if I plot() the data in any of the three formats (raw, tibble or data.frame), I get plots with the dates:
(Technically, the x axis label changes, but that's not the point).
Where are these dates stored? How can I access them (to use ggplot(), for example), or even – how can I see them?
If you use str(Nile) or print(Nile), you'll see that the Nile data set is stored in a Time-Series object. You can use the start(), end() and frequency() functions to extract those attributes, then create a new column to store that information.
data(Nile)
new_df <- data.frame(Nile)
new_df$Time <- seq(from = start(Nile)[[1]], to = end(Nile)[[1]], by = frequency(Nile))
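Alternatively, time() returns the sampling times of a ts object directly, which avoids reconstructing them with seq(). A minimal sketch:

```r
data(Nile)

# time() extracts the time attribute of a ts object directly.
df <- data.frame(year = as.numeric(time(Nile)),
                 flow = as.numeric(Nile))

head(df)  # years starting at 1871, alongside the flow measurements
```

The year column can now be used as the x aesthetic in ggplot().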

For-loop that loops over dates and months for extracting Google Analytics data in R in order to run Markov Chain Model

I am trying to build a Markov Chain digital attribution model using GA data.
I would like some help with for loops for date ranges and then the creation of a column in my dataset which shows the month.
The query extracts data from GA and specifically from the multi channel funnel report. So, I would like all unique paths and conversions by month, so I can run a Markov Chain model and get the channel attribution by month.
I have done some basic work, but I am stuck as I am not familiar with loops.
Any help would be great.
I will try to break the code in two stages:
Stage a: Extracting the data
Finding the starting and ending date per month:
start_date <- seq(as.Date("2018-01-01"),length=12,by="months")
end_date <- seq(as.Date("2018-02-01"),length=12,by="months")-1
Looping over starting and ending dates:
mcf_data <- list()
for (i in 1:length(start_date)) {
  for (j in 1:length(end_date)) {
    mcf_data <- print(get_mcf(ga_id,
                              start.date = start_date[i], end.date = end_date[j],
                              metrics = "mcf:totalConversions",
                              dimensions = "mcf:basicChannelGroupingPath",
                              sort = NULL,
                              filters = NULL,
                              samplingLevel = NULL,
                              start.index = NULL, max.results = NULL, fetch.by = NULL))
  }
}
This works fine but just gives me the total number of unique paths and conversions. Ideally, I would like to use the i and j indices to create an additional column, so that each pass through the loop produces a data frame tagged with its month; at the end I would have a dataset with unique paths and conversions by month.
Stage b: Running the Markov Chain Model
Ideally, I would like to continue the for loop from extracting the data to running the model by month
df_mcf_data <- data.frame(mcf_data$basicChannelGroupingPath,
                          mcf_data$totalConversions,
                          conv_null = 0)
mod1 <- markov_model(df_mcf_data,
var_path = 'mcf_data.basicChannelGroupingPath',
var_conv = 'mcf_data.totalConversions',
var_null = 'conv_null',
out_more = TRUE)
df_res1 <- mod1$result
The most important part for my work is stage a of the code. But if there can be help for stage b as well, my overall aim is to have a for loop which runs all the code at once and gives me monthly results.
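Since each start date pairs with exactly one end date, a single loop over i is enough; the nested i/j loops pair every start with every end and, as written, overwrite mcf_data on every pass. Below is a runnable sketch of the pattern for stage a. fetch_month() is a hypothetical stand-in for the get_mcf() call (which needs GA credentials), so swap its body for the real query:

```r
# Hypothetical stand-in for get_mcf(); replace with the real API call.
fetch_month <- function(start, end) {
  data.frame(basicChannelGroupingPath = c("Direct", "Organic Search > Direct"),
             totalConversions = c(10, 5))
}

start_date <- seq(as.Date("2018-01-01"), length = 12, by = "months")
end_date   <- seq(as.Date("2018-02-01"), length = 12, by = "months") - 1

mcf_list <- vector("list", length(start_date))
for (i in seq_along(start_date)) {
  df <- fetch_month(start_date[i], end_date[i])
  df$month <- format(start_date[i], "%Y-%m")  # tag each row with its month
  mcf_list[[i]] <- df
}

# One data frame with a month column, ready to group by month.
mcf_data <- do.call(rbind, mcf_list)
```

For stage b, the combined mcf_data can then be split by month, e.g. split(mcf_data, mcf_data$month), and each piece passed to markov_model() inside the same loop or via lapply().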

Executing getSunlightTimes function in R with data frame?

I am hoping that you can support me with the use of the getSunlightTimes function. I have a pixel level data frame ("latlon2") with latitude ("lat"), longitude ("lon"), and one date ("date") in format YYYY-MM-DD. The data covers the continental US, and I also have a state code variable in the data frame.
To obtain the date variable as a date class variable, I executed:
latlon2$date=as.Date(latlon2$d2003s)
I am trying to use the getSunlightTimes to identify the time of sunrise and sunset for each pixel on the designated date. However, I am having a hard time getting the function to work. There is not a lot of information on this command beyond R's help guides, so I am hoping some of you have worked with it and can offer your suggestions based on my approach so far.
First I tried using the getSunlightTimes function, designating each latitude/longitude/date column in my data frame:
sunrise2003CET=getSunlightTimes(date="latlon2$date", lat="latlon2$lat", lon="latlon2$lon", tz="CET", keep = c("sunrise", "sunset"))
R returns the error:
Error in getSunlightTimes(date = "latlon2$date2", lat = "latlon2$lat",
: date must to be a Date object (class Date)
What's frustrating about this is that when I look at class(latlon2$date) R verifies that the column is a "Date" class!
Next, I tried designating the data frame only:
sunrise2003CET=getSunlightTimes(data="latlon2", tz="CET", keep = c("sunrise", "sunset"))
R returns the error:
Error in .buildData(date = date, lat = lat, lon = lon, data = data) :
all(c("date", "lat", "lon") %in% colnames(data)) is not TRUE
This seems odd because I named the columns in the data frame "date", "lat", "lon", but perhaps the error is due to the fact that there are other variables in the data frame (such as state code).
I am trying to perform this task for several dates across 15 years (and four time zones), so any suggestions on how to get this running, and also running efficiently, are much appreciated!
Thank you so much!
Colette
The problem is with the quotes. When you write
sunrise2003CET=getSunlightTimes(date="latlon2$date",
                                lat="latlon2$lat",
                                lon="latlon2$lon",
                                tz="CET",
                                keep = c("sunrise", "sunset"))
you shouldn't put the expressions for the date, lat and lon arguments in quotes, because then R will see them as strings. (You could try class("latlon2$date") to see this.) Just write it as
sunrise2003CET=getSunlightTimes(date=latlon2$date,
                                lat=latlon2$lat,
                                lon=latlon2$lon,
                                tz="CET",
                                keep = c("sunrise", "sunset"))
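The same applies to the second attempt: data="latlon2" passes a string, not the data frame. A sketch of the data-frame form, assuming getSunlightTimes() comes from the suncalc package; to be safe, pass only the three columns it checks for:

```r
library(suncalc)  # assumed source of getSunlightTimes()

# Pass the data frame itself, not its name as a string; keeping only the
# date/lat/lon columns avoids any issue with extra variables like state code.
sunrise2003CET <- getSunlightTimes(data = latlon2[, c("date", "lat", "lon")],
                                   tz = "CET",
                                   keep = c("sunrise", "sunset"))
```

For 15 years and four time zones, this vectorised form (one call per time-zone subset of the data frame) should be much faster than calling the function row by row.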

Butterworth filtering of EEG data in R

I'm very new at R and EEG signalling so please excuse me if the answer to the question is obvious.
I'm trying to apply a Butterworth filter to an EEG signal to extract the alpha band. When I executed the filter, the resulting signal looked very strange and not at all what I expected, with an unusually large peak at the beginning of the time frame. I tried both eegfilter and bwfilter to see if it was a problem with the code, but there was very little difference between the two when I plot the results. I'm at a loss to explain the end result and would be grateful if someone could explain this peculiar result to me.
Here is an example from the data I'm looking at:
https://ufile.io/1ji48wg6
The sampling rate is 512.
I want to extract the alpha band, so frequencies between 8 and 12 Hz.
library(eegkit)
mturk <- read.csv("EEG_alpha.csv", header = TRUE)
mturk.but <- eegfilter(mturk, Fs = 512, lower = 8, upper = 12, method = "butter", order = 4)
plot(mturk.but)
Here is a picture of the data when plotted. The left most image is the raw data. The central plot is the result of applying a Butterworth filter using eegfilter. And the right plot is the result of applying a Butterworth filter using bwfilter.
Plots of data when filters are applied
Header of the dataset:
EEG
-8438.876837
-8442.718979
-8441.877183
-8439.974768
-8443.436883
-8448.900711
-8452.433874
-8441.616546
It seems that the eegfilter and bwfilter functions pad the data with 0's before applying the filter and only then normalise it. As such, you end up with something akin to a Dirac impulse at the start of the data once it's been processed, taking the filtered data from its raw state:
EEG_raw
To this once you've filtered it:
EEG Butterworth filtered
However, if you normalise the data to 0, subtracting the first value of the time series from all the values before you apply the filter, no Dirac-like artifacts occur:
EEG normalised followed by butterworth filter
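A sketch of that fix applied to the code above (assuming the EEG values are in a column named EEG, as in the data sample):

```r
library(eegkit)  # provides eegfilter()

mturk <- read.csv("EEG_alpha.csv", header = TRUE)

# Subtract the first sample so the series starts at 0; this removes the
# large DC offset (around -8440 in the sample above) that the zero-padding
# turns into a Dirac-like transient at the start of the filtered signal.
eeg <- mturk$EEG - mturk$EEG[1]

mturk.but <- eegfilter(eeg, Fs = 512, lower = 8, upper = 12,
                       method = "butter", order = 4)
plot(mturk.but, type = "l")
```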

Creating SpatialLinesDataFrame from SpatialLines object and basic df

Using leaflet, I'm trying to plot some lines and set their color based on a 'speed' variable. My data start at an encoded polyline level (i.e. a series of lat/long points, encoded as an alphanumeric string) with a single speed value for each EPL.
I'm able to decode the polylines to get lat/long series of points (thanks to Max, here), and I'm able to create segments from those series of points and format them as a SpatialLines object (thanks to Kyle Walker, here).
My problem: I can plot the lines properly using leaflet, but I can't join the SpatialLines object to the base data to create a SpatialLinesDataFrame, and so I can't code the line color based on the speed var. I suspect the issue is that the IDs I'm assigning SL segments aren't matching to those present in the base df.
The objects I've tried to join, with SpatialLinesDataFrame():
"sl_object", a SpatialLines object with ~140 observations, one for each segment; I'm using Kyle's code, linked above, with one key change - instead of creating an arbitrary iterative ID value for each segment, I'm pulling the associated ID from my base data. (Or at least I'm trying to.) So, I've replaced:
id <- paste0("line", as.character(p))
with
lguy <- data.frame(paths[[p]][1])
id <- unique(lguy[,1])
"speed_object", a df with ~140 observations of a single speed var and row.names set to the same id var that I thought I created in the SL object above. (The number of observations will never exceed but may be smaller than the number of segments in the SL object.)
My joining code:
splndf <- SpatialLinesDataFrame(sl = sl_object, data = speed_object)
And the result:
row.names of data and Lines IDs do not match
Thanks, all. I'm posting this in part because I've seen some similar questions - including some referring specifically to changing the ID output of Kyle's great tool - and haven't been able to find a good answer.
EDIT: Including data samples.
From sl_obj, a single segment:
print(sl_obj)
[[151]]
An object of class "Lines"
Slot "Lines":
[[1]]
An object of class "Line"
Slot "coords":
          lon      lat
1955 -74.05228 40.60397
1956 -74.05021 40.60465
1957 -74.04182 40.60737
1958 -74.03997 40.60795
1959 -74.03919 40.60821

Slot "ID":
[1] "4763655"
And the corresponding record from speed_obj:
row.names speed
... ...
4763657 44.74
4763655 34.8 # this one matches the ID above
4616250 57.79
... ...
To get rid of this error message, either make the row.names of data and Lines IDs match by preparing sl_object and/or speed_object, or, in case you are certain that they should be matched in the order they appear, use
splndf <- SpatialLinesDataFrame(sl = sl_object, data = speed_object, match.ID = FALSE)
This is documented in ?SpatialLinesDataFrame.
All right, I figured it out. The error wasn't liking the fact that my speed_obj wasn't the same length as my sl_obj, as mentioned here. ("data = object of class data.frame; the number of rows in data should equal the number of Lines elements in sl.")
Resolution: used a quick loop to pull out all of the unique lines IDs, then performed a left join against that list of uniques to create an exhaustive speed_obj (with NAs, which seem to be OK).
ids <- data.frame()
for (i in 1:length(sl_obj)) {
  id <- data.frame(sl_obj@lines[[i]]@ID)
  ids <- rbind(ids, id)
}
colnames(ids)[1] <- "linkId"
speed_full <- join(ids, speed_obj)  # plyr::join, a left join by default
speed_full_short <- data.frame(speed_full[, c(-1)])
row.names(speed_full_short) <- speed_full$linkId
splndf <- SpatialLinesDataFrame(sl_obj, data = speed_full_short, match.ID = TRUE)
Works fine now!
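As an aside, the ID-extraction loop can be collapsed to one line with sapply() (a sketch, assuming sl_obj is a SpatialLines object from the sp package):

```r
library(sp)

# Pull the ID slot from every Lines element at once.
ids <- data.frame(linkId = sapply(sl_obj@lines, function(l) l@ID))
```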
I may have deciphered the issue.
When I pull in my spatial lines data and check the class, it reads as "SpatialLinesDataFrame" even though I know it's a simple linear shapefile. I'm using readOGR to bring the data in, and I believe this is where the conversion occurs. With that in mind, the speed assignment is relatively easy:
sl_object$speed <- speed_object[match(sl_object$ID, row.names(speed_object)), "speed"]
This should do the trick, as I'm willing to bet your class(sl_object) is "SpatialLinesDataFrame".
EDIT: I had received the same error as OP, driving me to check class()
I am under the impression that the error that was populated for you is because you were trying to coerce a data frame into a data frame and R wasn't a fan of that.
