Faster alternative to nested loops - r

I have written the function below, which contains a nested loop. In short, it calculates differences in emissions between i (28) pairs of alternative technologies for j (48) countries. For a single combination and a single country it takes 0.32 sec, which should give a total time of 0.32*28*48 = around 7 min. The function actually takes about 50 min, which makes me think there may be some unnecessary computation going on. Is a nested loop the most efficient approach here?
Any help is greatly appreciated!
alt.comb.p <- function(Fmat){
  y.empty = matrix(data = 0,ncol = 2,nrow = nrow(FD)-1)
  row.names(y.empty) <- paste(FD$V1[2:nrow(FD)],FD$V2[2:nrow(FD)],sep = " ")
  country.list = unique(FD$V1)
  for (j in 1:length(country.list)){ # for every country
    for (i in 1:ncol(alt.comb)){ # for every possible combination
      # the final demand of the first item of the combination is calculated
      first = alt.comb[,i][1]
      first.name = row.names(Eprice.Exio)[first]
      loc1 = grep(pattern = first.name,x = row.names(y.empty))
      country.first = substr(x = row.names(y.empty)[loc1[j]],start = 0,stop = 2)
      y.empty[,1][loc1[j]] <- Eprice.Exio[first.name,country.first]
      # the final demand of the second item of the combination is calculated
      second = alt.comb[,i][2]
      second.name = row.names(Eprice.Exio)[second]
      loc2 = grep(pattern = second.name,x = row.names(y.empty))
      country.second = substr(x = row.names(y.empty)[loc2[j]],start = 0,stop = 2)
      y.empty[,2][loc2[j]] <- Eprice.Exio[second.name,country.second]
      # calculates the difference between the total pressures from item 1 and item 2
      r.1 = sum(Fmat%*%as.vector(y.empty[,1]))
      r.2 = sum(Fmat%*%as.vector(y.empty[,2]))
      r.dif = r.1-r.2 # negative means alternative 1 is better
      alt.comb[2+j,i] <- r.dif
      row.names(alt.comb)[2+j] <- country.first
      # reset the final demand matrix for the next combination
      y.empty = matrix(data = 0,ncol = 2,nrow = nrow(FD)-1)
      row.names(y.empty) <- paste(FD$V1[2:nrow(FD)],FD$V2[2:nrow(FD)],sep = " ")
    }
  }
  return(alt.comb)
}
Edit:
A simplified example would be:
Fmat = matrix(data = runif(1:9600), ncol=9600, nrow=9600)
alt.comb.p <- function(Fmat){
  y.empty = matrix(data = 0,ncol = 2,nrow = 9600)
  country.list = runif(n = 10)
  alt.comb = matrix(data=0,ncol=5,nrow=10)
  for (j in 1:10){ # for every country
    for (i in 1:5){ # for every possible combination
      y.empty[50,1] <- runif(1)
      y.empty[60,2] <- runif(1)
      # calculates the difference between the total pressures from item 1 and item 2
      r.1 = sum(Fmat%*%as.vector(y.empty[,1]))
      r.2 = sum(Fmat%*%as.vector(y.empty[,2]))
      r.dif = r.1-r.2 # negative means alternative 1 is better
      alt.comb[j,i] <- r.dif
      y.empty = matrix(data = 0,ncol = 2,nrow = 9600)
    }
  }
  return(alt.comb)
}
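One observation that may help here: sum(Fmat %*% y) is the same number as sum(colSums(Fmat) * y), so the column sums of Fmat can be computed once outside both loops instead of repeating a full matrix-vector product twice per iteration. The sketch below only checks that identity on a small random matrix; the size n and the name col.totals are assumptions made for illustration, not part of the original function.

```r
set.seed(1)
n <- 200                                  # small stand-in for the 9600 x 9600 case
Fmat <- matrix(runif(n * n), nrow = n, ncol = n)

# Column sums are computed once, outside any loop
col.totals <- colSums(Fmat)

# A final-demand vector with a single non-zero entry, as in the simplified example
y <- numeric(n)
y[50] <- runif(1)

slow <- sum(Fmat %*% y)       # full matrix-vector product, then sum
fast <- sum(col.totals * y)   # same quantity from the precomputed column sums

all.equal(slow, fast)
```

With this, each r.1 and r.2 inside the loop reduces to a sum over a vector of length nrow(Fmat), or to a single multiplication when only one entry of y is non-zero.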

Related

How can I start this code found on GitHub?

I'm following this code from GitHub, and in line 51 I have a problem with options[i,] <- skew.raw. R says: object "i" not found. Why? What should I put instead?
It also fails to retrieve values: after running the get.options function I get NA values.
# Define function for formating/retrieving options data from json obj
get.options = function(symbols, date){
options = matrix(ncol = 11, nrow = length(symbols))
colnames(options) = c('Cl_price', "call_strike",
"call_lastPrice","call_vol","call_openInt", "call_ImpVoli",
"put_strike","put_lastPrice", 'put_vol',"put_openInt", 'put_ImpVoli')
rownames(options) = symbols
for(u in 1:length(symbols)){
s = symbols[u]
d = as.numeric(as.POSIXct(date, origin = '1970-01-01', tz = 'GMT'))
# the URL must stay on one line: a line break inside the string would become part of the request
json_file <- sprintf('https://query2.finance.yahoo.com/v7/finance/options/%s?date=%d&formatted=true&crumb=UNus6VhY1bn&lang=en-US&region=US&corsDomain=finance.yahoo.com',s,d)
json_data <- suppressWarnings(fromJSON(paste(readLines(json_file), collapse = "")))
# CALLS
n = length(json_data$optionChain$result[[1]]$options[[1]]$calls)
if (n < 1) next
calls = matrix(ncol = 6, nrow = n)
for(i in 1:n) calls[,2][i] = json_data$optionChain$result[[1]]$options[[1]]$calls[[i]]$strike$raw
Cl.price = json_data$optionChain$result[[1]]$quote$regularMarketPrice
x <- which.min(abs((calls[,2]/Cl.price) -1))
calls = calls[x,]
calls[1] = Cl.price
calls[3] = json_data$optionChain$result[[1]]$options[[1]]$calls[[x]]$lastPrice$raw
calls[4] = json_data$optionChain$result[[1]]$options[[1]]$calls[[x]]$volume$raw
calls[5] = json_data$optionChain$result[[1]]$options[[1]]$calls[[x]]$openInterest$raw
calls[6] = json_data$optionChain$result[[1]]$options[[1]]$calls[[x]]$impliedVolatility$raw
# PUTS
n = length(json_data$optionChain$result[[1]]$options[[1]]$puts)
if(n < 1) next
puts = matrix(ncol = 5, nrow = n)
for(i in 1:n) puts[,1][i] = json_data$optionChain$result[[1]]$options[[1]]$puts[[i]]$strike$raw
x <- which.min(abs((puts[,1]/Cl.price) - 0.95))
puts = puts[x,]
puts[2] = json_data$optionChain$result[[1]]$options[[1]]$puts[[x]]$lastPrice$raw
puts[3] = json_data$optionChain$result[[1]]$options[[1]]$puts[[x]]$volume$raw
puts[4] = json_data$optionChain$result[[1]]$options[[1]]$puts[[x]]$openInterest$raw
puts[5] = json_data$optionChain$result[[1]]$options[[1]]$puts[[x]]$impliedVolatility$raw
options[u,] = c(calls, puts)
}
return(options)
}
# Define stocks and gather options data
date = '2017-04-21'
symbols <- c('DIS','CAT','TSLA')
daily.options = as.data.frame(get.options(symbols, date))
which(is.na(daily.options))
skew.raw = daily.options$put_ImpVoli - daily.options$call_ImpVoli # SKEW(i,t)
options[i,] <- skew.raw
write.table(options, 'DISCATTSLA', sep = ",")
options = read.table('DISCATTSLA', sep = ",")
I'm following this code because I read the paper by Rhui Zhao, but the paper did not discuss how to implement the volatility skew in software, so I looked for an implementation and found this code on GitHub.
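A possible reading of the error, judging only from the code shown: i is never defined at the top level of the script (the loop variables u and i exist only inside get.options), and at the top level options is no longer the matrix built inside the function but the base R function of that name. A minimal sketch of one way around it is to store the skew as an extra column of daily.options. The data frame below is a made-up stand-in for the result of get.options, and the column name skew is an assumption.

```r
# Hypothetical stand-in for as.data.frame(get.options(symbols, date)); values are invented
daily.options <- data.frame(
  call_ImpVoli = c(0.21, 0.25, NA),
  put_ImpVoli  = c(0.24, 0.27, 0.55),
  row.names    = c("DIS", "CAT", "TSLA")
)

# One skew value per symbol, kept next to the data it was computed from
daily.options$skew <- daily.options$put_ImpVoli - daily.options$call_ImpVoli

# Symbols for which the download produced no usable value
missing.symbols <- rownames(daily.options)[is.na(daily.options$skew)]
missing.symbols
```

The NA values themselves would still need to be traced back to the downloaded JSON; this sketch only shows where they end up.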

How to make a loop for out-of-sample forecast

I am a beginner in R and was hoping for ideas on how to write a loop.
I would like to automate the following for each of 726 observations: make a 5-step-ahead out-of-sample forecast based on a rolling window of 1000 observations, store only the t+5 forecast in the "pred" column, and then reset the "VIX.Close" column to its original values.
require(highfrequency)
require(quantmod)
require(xts)
getSymbols("^VIX")
VIX_fcst_test <- VIX[, "VIX.Close"]
VIX_fcst_test$pred <- NA
VIX_fcst_test$VIX.Close[3000] <- predict(HARmodel(data = VIX_fcst_test$VIX.Close[2000:2999], periods = c(1, 5 , 22), type = "HAR", inputType = "RM"))
VIX_fcst_test$VIX.Close[3001] <- predict(HARmodel(data = VIX_fcst_test$VIX.Close[2001:3000], periods = c(1, 5 , 22), type = "HAR", inputType = "RM"))
VIX_fcst_test$VIX.Close[3002] <- predict(HARmodel(data = VIX_fcst_test$VIX.Close[2002:3001], periods = c(1, 5 , 22), type = "HAR", inputType = "RM"))
VIX_fcst_test$VIX.Close[3003] <- predict(HARmodel(data = VIX_fcst_test$VIX.Close[2003:3002], periods = c(1, 5 , 22), type = "HAR", inputType = "RM"))
VIX_fcst_test$pred[3004] <- predict(HARmodel(data = VIX_fcst_test$VIX.Close[2004:3003], periods = c(1, 5 , 22), type = "HAR", inputType = "RM"))
VIX_fcst_test$VIX.Close <- VIX[, "VIX.Close"]
I tried this loop but I don't know how to make the last prediction into the "pred" column and reset the "VIX.Close" column.
for (i in 2000:2004) {
HAREstimated <- HARmodel(data = VIX_fcst_test[i: (i+ 999), "VIX.Close"], periods = c(1, 5 , 22), type = "HAR", inputType = "RM")
VIX_fcst_test$VIX.Close[i + 1000] <- predict(HAREstimated)
}
Any ideas?
My understanding is the following:
First, run the loop over each of the five windows, with an if statement so that the final iteration writes to the pred column instead of VIX.Close.
Second, keep the reset of VIX.Close outside the for loop; otherwise it would be reset on every iteration.
for (i in 2000:2004) {
  if (i != 2004) {
    HAREstimated <- HARmodel(data = VIX_fcst_test[i:(i+999), "VIX.Close"], periods = c(1, 5 , 22), type = "HAR", inputType = "RM")
    VIX_fcst_test$VIX.Close[i + 1000] <- predict(HAREstimated)
  } else {
    HAREstimated <- HARmodel(data = VIX_fcst_test[i:(i+999), "VIX.Close"], periods = c(1, 5 , 22), type = "HAR", inputType = "RM")
    VIX_fcst_test$pred[i + 1000] <- predict(HAREstimated)
  }
}
VIX_fcst_test$VIX.Close <- VIX[, "VIX.Close"]
# final prediction
VIX_fcst_test$pred[3004]
So really all you needed was an IF statement in your loop.
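To repeat this for all 726 starting points, one option is an outer loop over the window start with the five recursive steps inside it, working on a copy of the series so that no reset is needed. The sketch below shows only the index bookkeeping: x is a synthetic series, and one_step_forecast is a hypothetical placeholder (a 22-day mean) standing in for predict(HARmodel(...)), so that the example runs without quantmod or highfrequency.

```r
set.seed(42)
# Synthetic stand-in for the VIX.Close series (assumption: a plain numeric vector)
x <- 15 + cumsum(rnorm(3800, sd = 0.3))

# Hypothetical one-step forecaster, used only to keep the sketch runnable;
# in the question this would be predict(HARmodel(...)) on the same window
one_step_forecast <- function(window) mean(tail(window, 22))

window.size <- 1000
horizon     <- 5
starts      <- 2000:2725                  # 726 rolling windows
pred        <- rep(NA_real_, length(x))

for (s in starts) {
  work <- x                               # fresh copy, so x is never overwritten
  for (h in seq_len(horizon)) {
    idx <- (s + h - 1):(s + h + window.size - 2)   # current 1000-observation window
    work[s + h + window.size - 1] <- one_step_forecast(work[idx])
  }
  target <- s + window.size + horizon - 1          # the t+5 position
  pred[target] <- work[target]
}
```

For s = 2000 the inner loop uses the windows 2000:2999 through 2004:3003 and stores the last forecast at position 3004, matching the manual steps in the question.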

How to store list values in a matrix

set.seed(650)
library(maxLik)
y = c(rnorm(15,1,1), rnorm(15,3,1))
dat = data.frame(y)
B = 3 # number bootstrap sample
n = length(dat$y)
n1 = 15
boot.samples = matrix(sample(dat$y, size = B * n, replace = TRUE), n, B)
ml = list()
boot.l = 0
va.l = NULL
for (j in 1:B) {
boot.l = boot.samples[, j]
for (i in 1:n) {
LLl <- function(param) {
mul <- param[1]
sigmal <- param[2]
sum(log(dnorm(dat[1:i, ], mul, sigmal)))
}
ml[[i]] = coef(maxLik(logLik = LLl, start = c(mul = 1, sigmal = 1)))
}
va.l = matrix(unlist(ml), n-1, B*2, byrow = TRUE)
}
va.l
The following is my output.
However, when I print the list I get the following output.
My question is: how can I have the mul estimates for j = 1 in the 1st column, the sigmal estimates for j = 1 in the 2nd column, the mul estimates for j = 2 in the 3rd column, the sigmal estimates for j = 2 in the 4th column, and so on?
Is there any other way to do this? Thank you for your help.
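One way to get that layout is to preallocate the matrix and write each pair of estimates directly into columns 2j-1 and 2j, rather than unlisting the list afterwards. The sketch below makes two assumptions that differ from the posted code: the estimates use the bootstrap sample boot.l (the original likelihood uses dat), and the closed-form normal MLE replaces maxLik() so the example has no package dependency. It also starts at i = 2, since a single observation gives a zero standard deviation.

```r
set.seed(650)
y <- c(rnorm(15, 1, 1), rnorm(15, 3, 1))
B <- 3                                    # number of bootstrap samples
n <- length(y)
boot.samples <- matrix(sample(y, size = B * n, replace = TRUE), n, B)

# Closed-form normal MLE, swapped in for maxLik() to keep the sketch dependency-free
normal.mle <- function(x) {
  mu <- mean(x)
  c(mul = mu, sigmal = sqrt(mean((x - mu)^2)))
}

# One row per i, two columns (mul, sigmal) per bootstrap sample j
va.l <- matrix(NA_real_, nrow = n - 1, ncol = 2 * B)
colnames(va.l) <- paste0(rep(c("mul", "sigmal"), times = B),
                         ".j", rep(seq_len(B), each = 2))

for (j in seq_len(B)) {
  boot.l <- boot.samples[, j]
  for (i in 2:n) {
    # columns 2j-1 and 2j belong to bootstrap sample j
    va.l[i - 1, c(2 * j - 1, 2 * j)] <- normal.mle(boot.l[1:i])
  }
}

head(va.l, 3)
```

With maxLik, the same assignment should work by replacing normal.mle(boot.l[1:i]) with the coef() of the fitted model.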

Rolling window with Copulas

I would like to apply a rolling window to fit a Student t copula and then produce a forecast based on the results of the fitting process. I have already tried it with a for loop, but it always raises errors from the fitCopula command.
#Students t Copula
windowsSize <- 4000 # training data size
testsize <- 351 # number of observations to forecast
for(k in 0:33) # run 34 experiments
{
A <- k*testsize + 1
B <- A + windowsSize - 1
start_obs <- A
end_obs <- B
lgYen_roll <- lgYenUSD[A:B]
lgEuro_roll <- lgEuroUSD[A:B]
ElgYen_roll <- ElgYenUSD[A:B]
ElgEuro_roll <- ElgEuroUSD[A:B]
StdlgYen_roll <- StdlgYenUSD[A:B]
StdlgEuro_roll <- StdlgEuroUSD[A:B]
CopYenEuro_roll <- pobs(as.matrix(cbind(lgYen_roll,lgEuro_roll)))
YenEuro_fit_t_roll <- fitCopula(t.cop,CopYenEuro_roll,method=c('ml'), posDef = is(t.cop, "ellipCopula"),
start = NULL, lower = NULL, upper = NULL,
optim.method = optimMeth(t.cop, method,dim=d),
optim.control = list(maxit=1000),
estimate.variance = NA, hideWarnings = FALSE)
# The first error already occurs here:
# "Error in if (any(u < 0) || any(u > 1)) stop("'u' must be in [0,1] -- probably rather use pobs(.)") :
# missing value where TRUE/FALSE needed"
CO_YenEuro_roll_rho <- coef(YenEuro_fit_t_roll)[1]
CO_YenEuro_roll_df <- coef(YenEuro_fit_t_roll)[2]
YenEurocopula_dist_t_roll <- mvdc(copula=tCopula(param = CO_YenEuro_roll_rho,dim=2), margins=c("norm","norm"),
paramMargins = list(list(mean=ElgYen_roll, sd=StdlgYen_roll),
list(mean=ElgEuro_roll, sd=StdlgEuro_roll)),
check = TRUE, fixupNames = TRUE)
YenEurocopula_random_t_roll.dist <- rMvdc(351,YenEurocopula_dist_t_roll)
#Prediction
A <- B + 1
B <- B + testsize
lgYen_roll <- lgYenUSD[A:B]
lgEuro_roll <- lgEuroUSD[A:B]
ElgYen_roll <- ElgYenUSD[A:B]
ElgEuro_roll <- ElgEuroUSD[A:B]
StdlgYen_roll <- StdlgYenUSD[A:B]
StdlgEuro_roll <- StdlgEuroUSD[A:B]
predict_EXT <- matrix(0, testsize, 1)
for(i in 1:testsize) # do the forecast based on the Copula Fit results
{
predict_EXT[i] <- fitCopula(t.cop,CopYenEuro_rolling[i],method=c('ml'), posDef = is(t.cop, "ellipCopula"),
start = NULL, lower = NULL, upper = NULL,
optim.method = optimMeth(t.cop, method,dim=d),
optim.control = list(maxit=1000),
estimate.variance = NA, hideWarnings = TRUE)
YenEurocopula_dist_t_roll <- mvdc(copula=tCopula(param = CO_YenEuro_roll_rho[i],dim=2), margins=c("norm","norm"),
paramMargins = list(list(mean=ElgYen_roll[i], sd=StdlgYen_roll[i]),
list(mean=ElgEuro_roll[i], sd=StdlgEuro_roll[i])),
check = TRUE, fixupNames = TRUE)
YenEurocopula_random_t_roll.dist <- rMvdc(351,YenEurocopula_dist_t_roll[i])
}}
Maybe someone has a solution to this problem?
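The message "missing value where TRUE/FALSE needed" suggests that NA values reach the fit. One way that can happen in R is indexing past the end of a vector, which returns NA rather than an error. With a window of 4000, a step of 351 and k running to 33, the last training window ends at observation 15583, so a shorter series would produce NAs. The sketch below checks the window bounds in base R only; n.obs and x are hypothetical stand-ins for the length and contents of lgYenUSD.

```r
windowsSize <- 4000     # training data size, as in the question
testsize    <- 351      # number of observations to forecast
n.obs       <- 10000    # hypothetical series length; replace with length(lgYenUSD)
x <- rnorm(n.obs)       # hypothetical stand-in for lgYenUSD

# Indexing beyond the end of a vector silently yields NA
anyNA(x[9990:10010])

# Largest k for which both the training window and the test block fit in the series
k.max <- floor((n.obs - windowsSize - testsize) / testsize)

usable <- logical(34)
for (k in 0:33) {
  A <- k * testsize + 1
  B <- A + windowsSize - 1
  # a window is usable only if it and its test block lie inside the data and hold no NA
  usable[k + 1] <- (B + testsize <= n.obs) && !anyNA(x[A:B])
}
which(usable) - 1       # values of k that can be fitted safely
```

If every window passes this check and the error persists, the NAs would have to come from the data itself.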

Randomly assign teachers to a school with dplyr or similar?

Suppose I have a data frame with 8 schools and their characteristics, and another with 48 teachers and their characteristics. I can generate some fake data with the following code:
library(dplyr)
library(geosphere)
set.seed(6232015)
n.schools <-8
n.teachers <- 48
makeRandomString <- function(pre, n=1, length=12) {
randomString <- c(1:n) # initialize vector
for (i in 1:n) {
randomString[i] <- paste0(pre,'.', paste(sample(c(0:9, letters, LETTERS),
length, replace=TRUE),
collapse=""))
}
return(randomString)
}
gen.teachers <- function(n.teachers){
Teacher.ID <- makeRandomString(pre= 'T', n = n.teachers, length = 20)
Teacher.exp <- runif(n = n.teachers, min = 1, max = 30)
Teacher.Other <- sample(c(0,1), replace = T, prob = c(0.5, 0.5), size = n.teachers)
Teachers <- data.frame(Teacher.ID, Teacher.exp, Teacher.Other)
return(Teachers)
}
gen.schools <- function(n.schools){
School.ID <- makeRandomString(pre= 'S', n = n.schools, length = 20)
School.lat <- runif(n = n.schools, min = -2, max = 2)
School.long <- runif(n = n.schools, min = -2, max = 2)
Schools <- data.frame(School.ID, School.lat, School.long) %>%
rowwise() %>% mutate (School.distance = distHaversine(p1 = c(School.long, School.lat),
p2 = c(0, 0), r = 3961))
return(Schools)
}
Teachers <- gen.teachers(n.teachers = n.teachers)
Schools <- gen.schools(n.schools = n.schools)
To each school, I want to assign 6 teachers (every teacher gets one and only one school). I could use:
Teachers %>% sample_n(6)
to get a list of 6 teachers, assign those to a school, remove them from my original pool, and keep going with a loop. My guess/hope is that there is a much easier way of doing this.
Thanks for the help!
In the context of your code
sample(rep(Schools$School.ID, each = 6))
gives a random sequence of schools in which each School.ID appears 6 times. Set Teachers$AssignedSchool to this sample and each teacher has an assigned school.
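A minimal base R sketch of that suggestion, using short made-up IDs in place of the random strings generated in the question:

```r
set.seed(6232015)

# Stand-ins for Schools$School.ID and Teachers$Teacher.ID from the question
school.ids <- paste0("S", 1:8)
Teachers   <- data.frame(Teacher.ID = paste0("T", 1:48))

# Repeat every school 6 times, shuffle, and assign one entry per teacher
Teachers$AssignedSchool <- sample(rep(school.ids, each = 6))

# Each school should appear exactly 6 times
table(Teachers$AssignedSchool)
```

Because the shuffled vector has exactly 48 entries, every teacher gets one school and no loop or removal from a pool is needed.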
