using rWBclimate for historical data in R


I am able to get the following code to work:
world_dat <- get_ensemble_temp(world,"annualavg",2080,2100)
but I would like to switch to historical data and start with 1920-1939 (or even earlier). Unfortunately, the following call keeps failing with an "unused arguments" error:
world_dat2 <- get_historical_temp(world,"annualavg",1920,1939)
I basically want to create a world map showing historical temperatures. Any help will be greatly appreciated. Thx!

You get the "unused argument" error because the two functions take different arguments:
get_ensemble_temp(locator, type, start, end)
get_historical_temp(locator, time_scale)
For the "get_historical_temp" function, you would set time_scale="year", and then subset to the years that you want. E.g.:
USA_dat <- get_historical_temp("USA", "year")
USA_dat_small <- subset(USA_dat, year >= 1920 & year <= 1939)
The outputs of these functions are quite different, too. You will have to average and summarize the data from "get_historical_temp" to make them comparable to the output of "get_ensemble_temp".
Also, I couldn't get your first line to work with the argument "world."
According to the docs (http://cran.r-project.org/web/packages/rWBclimate/rWBclimate.pdf)
you have to use a vector of all country codes in order to get the whole world's data all at once.
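The averaging step mentioned above could look roughly like this. It is only a sketch: the data frame below is made up and stands in for the real download, and the column names year and data are an assumption about what get_historical_temp returns, so check names(USA_dat) on your own result first.

```r
# Hypothetical stand-in for get_historical_temp("USA", "year").
# The column names `year` and `data` are assumptions, not confirmed output.
USA_dat <- data.frame(
  year = 1918:1941,
  data = seq(10, by = 0.1, length.out = 24)  # made-up temperatures
)

# Keep only the years of interest (all columns are kept by default)
USA_dat_small <- subset(USA_dat, year >= 1920 & year <= 1939)

# Collapse the yearly values into one average for the period
period_mean <- mean(USA_dat_small$data)
print(period_mean)
```

Repeating this per country code would give one value per country, which is closer in shape to what a world map needs.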

Related

Why after I use "subset", the filtered data is less than it should be?

I want to have "Blancas" and "Sultana" under the "Variete" column.
Why after I use "subset", the filtered data is less than it should be?
Figure 1 is the original data,
figure 2 is the expected result,
figure 3 is result I obtained with the code below:
df <- read_excel("R_NLE_FTSW.xlsx")
options(scipen=200)
BLANCAS<-subset(df, Variete==c("Blancas","Sultana"))
view(BLANCAS)
It's obvious that some data of BLANCAS are missing.
P.S. And if I try it on a sub-sheet, the final result will sometimes be 5 times larger!
path = "R_NLE_FTSW.xlsx"
df <- map_dfr(excel_sheets(path),
~ read_xlsx(path, sheet = 4))
I don't understand why sometimes it's more and sometimes less than the expected result. Can anyone help me? Thank you so much!

First of all, you say you need both "Blancas" and "Sultana", but your expected result shows only Blancas, so get that straight first.
For data coming from Excel:
Always clean the data after it is imported. Check the unique values to find any extra spaces etc.
Trim the character data, and make sure date fields are correct and numbers are numeric (not characters).
To subset the data, use df %>% filter(Variete %in% c('Blancas','Sultana'))
-> you can modify the c() vector to include the items of interest.
-> if you wish to clean on the go:
df %>% filter(trimws(Variete) %in% c('Blancas','Sultana'))
As for your sub-sheet problem: we don't even know what data is in there. If it is similar, apply the same logic.
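As for why the original subset returned fewer rows: == compares element by element and recycles the shorter vector, so each row is tested against only one of the two names. A small base R illustration with a made-up vector (the values are hypothetical, not taken from the asker's file):

```r
# Made-up stand-in for the Variete column
variete <- c("Blancas", "Blancas", "Sultana", "Sultana", "Other", "Blancas")

# `==` recycles c("Blancas", "Sultana"): row 1 is compared with "Blancas",
# row 2 with "Sultana", row 3 with "Blancas" again, and so on
eq_mask <- variete == c("Blancas", "Sultana")

# `%in%` tests every row against the whole set of names
in_mask <- variete %in% c("Blancas", "Sultana")

print(sum(eq_mask))  # rows kept by the recycled comparison
print(sum(in_mask))  # rows kept by the membership test
```

The "5 times more" result most likely has a separate cause: in the map_dfr call the function always reads sheet = 4 and ignores the sheet name it is given, so the same sheet is read once per sheet in the workbook and the copies are stacked.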

R - Assign the mean of a column sub-sector to each row of that sub-sector

I am trying to create a column which holds the mean of a variable for each subsector of my data set. In this case, the mean is the crime rate of each state calculated from county observations, and I want to assign this number to each county according to the state it is located in. Here is the function I wrote.
Create the new column
Data.Final$state_mean <- 0
Then calculate and assign the mean.
for (j in range[1:3136])
{
state <- Data.Final[j, "state"]
Data.Final[j, "state_mean"] <- mean(Data.Final$violent_crime_2009-2014,
which(Data.Final[, "state"] == state))
}
Here is the following error
Error in range[1:3137] : object of type 'builtin' is not subsettable
I would very much appreciate it if you could take a few minutes to help a beginner out.

You've got a few problems:
range[1:3136] isn't valid syntax. range(1:3136) is valid syntax, but the range() function just returns the minimum and maximum. You don't need anything more than 1:3136, just use
for (j in 1:3136) instead.
Because of the dash, violent_crime_2009-2014 isn't a standard column name. You'll need to use it in backticks, Data.Final$`violent_crime_2009-2014`, or in quotes with [: Data.Final[["violent_crime_2009-2014"]] or Data.Final[, "violent_crime_2009-2014"]
Also, your code is very inefficient: you re-calculate the mean on every single iteration. Try having a look at the
Mean by Group R-FAQ. There are many faster and easier methods to get grouped means.
Without using extra packages, you could do
Data.Final$state_mean = ave(x = Data.Final[["violent_crime_2009-2014"]],
Data.Final$state,
FUN = mean)
For friendlier syntax and greater efficiency, the data.table and dplyr packages are popular. You can see examples using them at the link above.
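To see what the ave() call produces, here is a tiny self-contained sketch. The data frame and its values are invented for illustration; only the column name mirrors the question.

```r
# Toy data: two states with two and three counties (made-up values)
df <- data.frame(state = c("A", "A", "B", "B", "B"))
df[["violent_crime_2009-2014"]] <- c(10, 20, 3, 6, 9)

# ave() returns a vector as long as the input, with each group's mean
# repeated on every row that belongs to that group
df$state_mean <- ave(df[["violent_crime_2009-2014"]], df$state, FUN = mean)
print(df)
```

Every county row ends up carrying its own state's mean, with no loop and no manual indexing.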

Here is one of many ways this can be done (I'm sure someone will post a tidyverse answer soon if not before I manage to post):
# Data for my example:
data(InsectSprays)
# Note I have a response column and a column I could subset on
str(InsectSprays)
# Take the averages with the by var:
mn <- with(InsectSprays,aggregate(x=list(mean=count),by=list(spray=spray),FUN=mean))
# Map the means back to your data using the by var as the key to map on:
InsectSprays <- merge(InsectSprays,mn,by="spray",all=TRUE)
Since you mentioned you're a beginner, I'll just mention that whenever you can, avoid looping in R. Vectorize your operations when you can. The nice thing about using aggregate, and merge, is that you don't have to worry about errors in your mapping because you get an index shift while looping and something weird happens.
Cheers!

R: Window Function "Start" after "End"

I am having a problem with the window function in R.
newdata1 <-window(mergedall,start=c(as.Date(as.character("2014-06-16"))),end=c(as.Date(as.character("2015-01-31"))))
I got this error. I am trying to understand how I can fix this issue. Thank you!
Error in window.default(mergedall, start = c(as.Date(as.character("2014-06-16"))), :
'start' cannot be after 'end'
In addition: Warning message:
In window.default(mergedall, start = c(as.Date(as.character("2014-06-16"))), :
'end' value not changed

I know it's an old post. But, please make sure that "mergedall" is a time series object which was created using the ts command.
While creating the time series object from any vector or series,
some_result_ts <- ts(vector,frequency=xx,start=c(yyyy,m))
This kind of error occurs when the series ends before the start you specify in the window command.
For example, if you take a data frame column or a vector and create the series with the ts command, giving yyyy=2010, m=1 with a frequency of 12, and the data cover 36 months, the implicit end will be 2012,12.
some_result_ts <- ts(vector,frequency=12,start=c(2010,1))
Then, while using a window function, if you are specifying let's say, start = c(2014,1) , then R will give a message that => 'start' cannot be after 'end' and end value not changed.
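The situation described above can be reproduced with a toy series (the values are made up; only the dates matter):

```r
# 36 monthly observations starting in January 2010
x <- ts(seq_len(36), frequency = 12, start = c(2010, 1))

# The implicit end of the series, as a (year, month) pair
print(end(x))

# A window that lies inside the series works as expected
ok <- window(x, start = c(2011, 1), end = c(2011, 12))

# A window that starts after the series ends raises the error;
# tryCatch() captures the message instead of stopping the script
msg <- tryCatch(
  window(x, start = c(2014, 1)),
  error = function(e) conditionMessage(e)
)
print(msg)
```

Checking start(x) and end(x) before calling window is a quick way to confirm that the requested range overlaps the series at all.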

Again, it's an old post, but since I stumbled upon it while searching for the same error, I still want to provide something useful for future Googlers.
I could not replicate your issue because you did not provide your own mergedall dataset. So I am starting with a toy example to show a few places where the problem might be. It's really not that difficult at all.
Potential problem #1:
You did not create a ts object to begin with. The window function operates on a ts object; it cannot just be a vector taken directly from a data frame. Use the ts function to turn the vector into a ts object first, and give it the proper start, end and frequency.
all <-seq(1:8) #eight observations in sequence
Assign these eight values as monthly observations, starting from 201406 to 201501. Frequency 12 means monthly.
all.ts <- ts(all, start = c(2014,6), end = c(2015,1), frequency = 12)
Potential problem #2:
Perhaps you already created your mergedall series as a ts object, but with a different start/end/frequency. My example above is based on monthly observations, so even though it is correct, it will not match your daily-based window call. The window call and the ts object need to be consistent.
Following my example, the window function would look like:
newdata1 <-window(all.ts,start=c(2014,6),end=c(2015,1) )

Hi, here is something you can try; it may be the solution, as I faced the same problem.
You might not be referring to the proper index value in the time series object.
In the code below I have added the index (i). You can use 1 if the object has only one series, use any other number, or pass different values using a simple loop.
Hope it helps!
newdata1 <-window(mergedall[i],start=c(as.Date(as.character("2014-06-16"))),end=c(as.Date(as.character("2015-01-31"))))

I am also a future googler and none of the answers helped me. This was my problem and solution:
MWE issue:
set.seed(50)
data <- ts(rnorm(100), start(1850))
data.train <- window(data, start = 1850, end = 1949)
MWE solution:
set.seed(50)
data <- ts(rnorm(100), start = (1850))
data.train <- window(data, start = 1850, end = 1949)
Issue was the missing equals sign when setting the start date.
The resulting variable data was still a time series; but the give-away was: "Time-Series from 1 to 100" rather than "Time-Series from 1850 to 1949", which told me that something was awry with creating the time series.
The ts function doesn't raise this as an error, presumably because it accepts the start() function from the {stats} package, according to the ?ts doc.
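A short sketch of what happens in the two versions. Called on a plain number, start() returns the default time attributes, so the series silently starts at 1 instead of 1850:

```r
# start(1850) treats 1850 as a one-element series and returns its start
s <- start(1850)
print(s)

set.seed(50)
bad  <- ts(rnorm(100), start(1850))    # second positional argument is `start`
good <- ts(rnorm(100), start = 1850)   # named argument, as intended

# tsp() gives start, end and frequency of each series
print(tsp(bad))
print(tsp(good))
```

Printing tsp() right after creating a series is a cheap way to catch this kind of mistake before calling window.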

This is probably an issue arising from the format of your 'mergedall' object.
Make sure that you have a ts, xts or a zoo object.
For example, try the following first to check the class of your object:
str(mergedall)

R: Currently unsupported data type when using period.apply

First a reproducible example:
library(quantstrat)
getSymbols("AAPL")
Test<-period.apply(AAPL,endpoints(AAPL,on="weeks",k=10),ROC)
TestDF<-as.data.frame(Test)
I want to get the ROC for a certain stock or whatever for x weeks. Or in other words, I want to compare several stocks and rank them with their 10-week ROC, 20 week ROC etc.
The period.apply call itself runs, but when I convert the result to a data frame to look at my data I always get this error:
Error in coredata.xts(x) : currently unsupported data type
Any idea what's wrong?

period.apply requires a function that returns a single row. ROC does not return a single row. So define your own function to do that.
Test <- period.apply(AAPL, endpoints(AAPL,on="weeks",k=10),
function(x) log(last(x)/coredata(first(x))))

R XTS package to.minutes3

I am trying to use the "to.minutes3" function in the xts package to segment my data.
This function correctly puts the time column into the desired intervals, but the data columns become "open", "close", "high" and "low". Is there a way to tell the function to average the data points that fall into the same interval?
Thanks,
Derek

You want period.apply. Assuming your data are in object x and are more frequent than 3-minutes, the code below will give you a mean for each distinct, non-overlapping, 3-minute interval.
period.apply(x, endpoints(x, on = "minutes", k = 3), mean)

It looks to me like the answer is no, not without completely changing that function, based on help("to.period"). to.minutes uses to.period, which says the following about the OHLC parameter:
OHLC: should an OHLC object be returned? (only OHLC=TRUE currently supported)
So other return values aren't supported.
