Regression imputation with dplyr in R

I want to do regression imputation with dplyr in R efficiently. Here is my problem: I have a data set with many missing values for one column - let's call it p. Now I want to estimate the missing values of p with a regression imputation approach. For that I regress p on a set of variables with OLS using uncensored data (a subset of the data set without missing values for p). Then I use the estimated coefficients to calculate the missing values of p.
My data set looks like that:
df = data.frame(
  id = c(1, 1, 1, 2, 2, 2),
  group = c(1, 1, 2, 1, 1, 2),
  sub_group = c(1, 2, 3, 1, 2, 3),
  p = c(4.3, 5.7, NA, NA, NA, 10),
  var1 = c(0.3, 0.1, 0.4, 0.9, 0.1, 0.2),
  var2 = c(0, 0, 0, 1, 1, 1)
)
where id represents individuals who buy goods from a group (e.g. "food") with subgroups (like "bread"). p is the price, while var1 and var2 are demographic variables (like "education" and "age").
What I've done so far:
library(dplyr)
df <- as_tibble(df)
# Create uncensored data
uncensored_df <- df %>%
  filter(!is.na(p))

# Run regression on uncensored data
imp_model <- lm(p ~ var1 + var2, data = uncensored_df)

# Get the coefficients of the fitted model
coefs <- unname(imp_model$coefficients)

# Use coefficients to compute missing values of p
censored_df <- df %>%
  filter(is.na(p)) %>%
  group_by(id, group, sub_group) %>%
  mutate(p = coefs[1] + coefs[2] * var1 + coefs[3] * var2)

# And finally combine the two subsets
bind_rows(uncensored_df, censored_df) %>% arrange(id, group, sub_group)
As I use more than var1 and var2 in my actual problem (about 30 variables), what is a better way to do regression imputation with dplyr? (I'm also open to non-dplyr solutions, though.)

library(dplyr)
fit <- lm(p ~ ., data = select(df, p, starts_with("var")))
df %>%
  rowwise() %>%
  mutate(p = ifelse(is.na(p), predict(fit, newdata = across()), p)) %>%
  ungroup()
How it works
For starters, when fitting your model, you can subset your data frame using select() and any of the tidyselect helpers to pick out your predictor variables (here starts_with("var") is used). This subset data frame then allows you to use the formula p ~ ., which means "regress p on everything else in the subset data frame".
Next you create a row-wise data frame and use your model to predict where p is missing. In this instance across turns each row into a 1x6 tibble that you can pass to the newdata argument. predict then uses the model fit and this new data to predict a value of p.
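Note that predict() is vectorized over rows, so the row-wise loop is not strictly required. As a sketch, the same imputation can be done in one shot with coalesce(), which keeps observed values of p and fills the gaps with the model predictions; this is the dplyr_coalesce() variant benchmarked below, and it yields the same filled-in column as the output shown next:

library(dplyr)
df %>%
  mutate(p = coalesce(p, predict(fit, newdata = df)))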
Output
# A tibble: 6 x 6
     id group sub_group     p  var1  var2
  <dbl> <dbl>     <dbl> <dbl> <dbl> <dbl>
1     1     1         1  4.3    0.3     0
2     1     1         2  5.7    0.1     0
3     1     2         3  3.60   0.4     0
4     2     1         1  5.10   0.9     1
5     2     1         2 10.7    0.1     1
6     2     2         3 10      0.2     1
Benchmarking
As mentioned in the comments, for large data frames the rowwise operation takes significantly longer than some other options, because it calls predict() once per row rather than once for the whole data frame:
library(microbenchmark)
set.seed(1)
df1 <- df %>%
  slice_sample(n = 1E5, replace = TRUE)
fit <- lm(p ~ ., data = select(df1, p, starts_with("var")))

dplyr_rowwise <- function() {
  df1 %>%
    rowwise() %>%
    mutate(p = ifelse(is.na(p), predict(fit, newdata = across()), p)) %>%
    ungroup()
}

dplyr_coalesce <- function() {
  df1 %>%
    mutate(p = coalesce(p, predict(fit, newdata = df1)))
}

base_index <- function() {
  isna <- is.na(df1$p)
  df1$p[isna] <- predict(fit, newdata = subset(df1, isna))
}

microbenchmark(
  dplyr_rowwise(),
  dplyr_coalesce(),
  base_index(),
  times = 10L
)
Unit: milliseconds
             expr        min         lq        mean      median         uq
  dplyr_rowwise() 63739.9512 64441.0800 66926.46041 65513.51785 66923.0241
 dplyr_coalesce()     6.5901     6.9037     8.55971     7.21125     7.7157
     base_index()    13.0368    13.1790    15.73682    13.53310    19.3004

Related

How to nest tables in a column of a dataframe?

I read that it is possible to store dataframes in a column of a dataframe with nest:
https://tidyr.tidyverse.org/reference/nest.html
Is it also possible to store tables in a column of a dataframe?
The reason I ask is that I would like to calculate the Kappa for every subgroup of a dataframe with caret, and caret::confusionMatrix(t) expects a table as input.
In the example code below this works fine if I calculate the Kappa for the complete dataframe at once:
library(tidyverse)
library(caret)
# generate some sample data:
n <- 100L
x1 <- rnorm(n, 1.0, 2.0)
x2 <- rnorm(n, -1.0, 0.5)
y <- rbinom(n, 1L, plogis(1 * x1 + 1 * x2))
my_factor <- rep( c('A','B','C','D'), 25 )
df <- cbind(x1, x2, y, my_factor)
# fit a model and make predictions:
mod <- glm(y ~ x1 + x2, family = "binomial")
probs <- predict(mod, type = "response")
# confusion matrix
probs_round <- round(probs)
t <- table(factor(probs_round, c(1,0)), factor(y, c(1,0)))
ccm <- caret::confusionMatrix(t)
# extract Kappa:
ccm$overall[2]
#>  Kappa
#> 0.5232
But if I try to use group_by to generate the Kappa for every factor level as a subgroup (see code below), it does not succeed. I suppose I need to nest t in df in a certain way, although I don't know how:
# extract Kappa for every subgroup with same factor (NOT WORKING CODE):
df <- cbind(df, probs_round)
df <- as.data.frame(df)
output <- df %>%
  dplyr::group_by(my_factor) %>%
  dplyr::mutate(t = table(factor(probs_round, c(1, 0)), factor(y, c(1, 0)))) %>%
  summarise(caret::confusionMatrix(t))
Expected output:
>   my_factor Kappa
> 1 A          0.51
> 2 B          0.52
> 3 C          0.53
> 4 D          0.54
Is this correct and is this possible?
(the exact values for Kappa will be different due to the randomness in the sample data)
Thanks a lot!
You could skip the intermediate mutate() that's giving you trouble and do:
library(dplyr)
library(caret)
df %>%
  group_by(my_factor) %>%
  summarize(t = confusionMatrix(table(factor(probs_round, c(1, 0)),
                                      factor(y, c(1, 0))))$overall[2])
Returns:
# A tibble: 4 x 2
  my_factor     t
  <chr>     <dbl>
1 A         0.270
2 B         0.513
3 C         0.839
4 D         0.555
The above approach is the easiest way to get the desired result. But just to show what's possible, we can follow your original idea using dplyr::nest_by(), which nests the data and returns a row-wise grouped data frame.
In the approach below we fit a separate glm for each subgroup. I'm not sure if that's what you want to do.
library(tidyverse)
library(caret)
# generate some sample data:
n <- 1000L
df <- tibble(x1 = rnorm(n, 1.0, 2.0),
             x2 = rnorm(n, -1.0, 0.5),
             y = rbinom(n, 1L, plogis(x1 + 1 * x1 + 1 * x2)),
             my_factor = rep(c('A', 'B', 'C', 'D'), 250))

output <- df %>%
  nest_by(my_factor) %>%
  mutate(y = list(data$y),
         mod = list(glm(y ~ x1 + x2,
                        family = "binomial",
                        data = data)),
         probs = list(predict(mod, type = "response")),
         probs_round = list(round(probs)),
         t = list(table(factor(probs_round, c(1, 0)),
                        factor(y, c(1, 0)))),
         ccm = caret::confusionMatrix(t)$overall[2])

output %>%
  pull(ccm)
#> Kappa Kappa Kappa Kappa
#> 0.7743682 0.7078112 0.7157761 0.7549340
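If you want the result in the two-column shape of your expected output, a final step (just a sketch, reusing the output object from above) is to drop the row-wise grouping and pull out the Kappa column:

output %>%
  ungroup() %>%
  select(my_factor, Kappa = ccm)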
Created on 2021-06-23 by the reprex package (v0.3.0)

R behavior of mutate and rnorm

Hello, I have the following code from a course:
library(tidyverse)
library(dslabs)
data("polls_us_election_2016")
head(results_us_election_2016)
results_us_election_2016 %>% arrange(desc(electoral_votes)) %>% top_n(5, electoral_votes)
# Computing the average and standard deviation for each state
polls <- polls_us_election_2016 %>%
  filter(state != "U.S." &
           !grepl("CD", state) &
           enddate >= "2016-10-31" &
           (grade %in% c("A+", "A", "A-", "B+") | is.na(grade))) %>%
  mutate(spread = rawpoll_clinton / 100 - rawpoll_trump / 100) %>%
  group_by(state) %>%
  summarize(avg = mean(spread), sd = sd(spread), n = n()) %>%
  mutate(state = as.character(state))
# joining electoral college votes and results
results <- left_join(polls, results_us_election_2016, by="state")
head(results)
# states with no polls: note Rhode Island and District of Columbia = Democrat
results_us_election_2016 %>% filter(!state %in% results$state)
# assigns sd to states with just one poll as median of other sd values
results <- results %>%
  mutate(sd = ifelse(is.na(sd), median(results$sd, na.rm = TRUE), sd))

# Calculating the posterior mean and posterior standard error
mu <- 0
tau <- 0.02
results %>% mutate(sigma = sd / sqrt(n),
                   B = sigma^2 / (sigma^2 + tau^2),
                   posterior_mean = B * mu + (1 - B) * avg,
                   posterior_se = sqrt(1 / (1 / sigma^2 + 1 / tau^2))) %>%
  arrange(abs(posterior_mean))

# Monte Carlo simulation of Election Night results (no general bias)
mu <- 0
tau <- 0.02
clinton_EV <- replicate(1000, {
  results %>% mutate(sigma = sd / sqrt(n),
                     B = sigma^2 / (sigma^2 + tau^2),
                     posterior_mean = B * mu + (1 - B) * avg,
                     posterior_se = sqrt(1 / (1 / sigma^2 + 1 / tau^2)),
                     simulated_result = rnorm(length(posterior_mean), posterior_mean, posterior_se),
                     clintonvotes = ifelse(simulated_result > 0, electoral_votes, 0)) %>% # award votes if Clinton wins state
    summarize(clinton = sum(clintonvotes)) %>% # total votes for Clinton
    .$clinton + 7 # 7 votes for Rhode Island and DC
})
mean(clinton_EV > 269) # over 269 votes wins election
I don't understand how this line works
simulated_result = rnorm(length(posterior_mean), posterior_mean, posterior_se)
length(posterior_mean) is 47, so rnorm should return a vector of size 47.
When I replace it with 1, each state gets the same result from rnorm, although posterior_mean and posterior_se are different for each state. When I change it to 46 I get an error.
So it seems to me that this line fills the whole column simulated_result (perhaps 47 times with the same results?). I would have expected that mutate uses the values of each row only to manipulate that particular row.
Can someone explain this behavior to me, or point me to a resource where it is explained?
For the rnorm function, if you check the help page (?rnorm):
rnorm(n, mean = 0, sd = 1)

Arguments
x, q: vector of quantiles.
p: vector of probabilities.
n: number of observations. If length(n) > 1, the length is taken to be the number required.
mean: vector of means.
sd: vector of standard deviations.
There are two ways to use it. One, you generate a vector of length n coming from a normal distribution with the same mean and sd for every entry, for example:
set.seed(111)
rnorm(10,0,1)
[1] 0.2352207 -0.3307359 -0.3116238 -2.3023457 -0.1708760 0.1402782 -1.4974267 -1.0101884
[9] -0.9484756 -0.4939622
If you provide a vector that is as long as n, you are specifying the mean and sd for each entry, for example:
set.seed(111)
rnorm(10,1:10,1:10)
[1] 1.23522071 1.33852826 2.06512853 -5.20938263 4.14561978 6.84166935 -3.48198659 -0.08150735
[9] 0.46371956 5.06037783
In this case, you generate a vector of 10 random normal variables: the first entry comes from mean = 1, sd = 1, the 2nd entry from mean = 2, sd = 2, and so on. We can also do something in between:
set.seed(111)
rnorm(10, 1:10, 1)
[1] 1.235221 1.669264 2.688376 1.697654 4.829124 6.140278 5.502573 6.989812 8.051524 9.506038
In this case, it returns a vector of length 10, the first entry coming from mean = 1, sd = 1, the 2nd from mean = 2, sd = 1, and so on, which we can visualize by re-running this:
t(replicate(10, rnorm(10, 1:10, 1)))
It's not very clear what you replaced with 1, but essentially mutate() assigns the whole column in one call: the entire posterior_mean and posterior_se columns are passed to rnorm() at once, producing one draw per row, just like the vectorized examples above.
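To make that concrete, here is a minimal sketch (with made-up posterior_mean and posterior_se values, not the election data) showing that each row gets its own draw from its own distribution when the whole columns are passed to rnorm() inside mutate():

library(dplyr)
set.seed(1)
tibble(posterior_mean = c(-0.02, 0, 0.03),   # hypothetical values
       posterior_se   = c(0.01, 0.02, 0.01)) %>%
  mutate(simulated_result = rnorm(n(), posterior_mean, posterior_se))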

record linear regression results repeatedly

As shown in the following example, what I want to achieve is to run the regression many times, with R recording the estimates of did in one data.frame each time.
Each time, I change the year condition in the ifelse, e.g. ifelse(mydata$year >= 1993, 1, 0), so each run fits a different regression.
mydata$time = ifelse(mydata$year >= 1994, 1, 0)
Can anyone help? My basic code is below (the data can be downloaded through a browser if R returns errors):
library(foreign)
mydata = read.dta("http://dss.princeton.edu/training/Panel101.dta")
mydata$time = ifelse(mydata$year >= 1994, 1, 0)
mydata$treated = ifelse(mydata$country == "E" | mydata$country == "F" | mydata$country == "G", 1, 0)
mydata$did = mydata$time * mydata$treated
didreg = lm(y ~ treated + time + did, data = mydata)
summary(didreg)
Generally if you want to repeat a process many times with some different input each time, you need a function. The following function takes a scalar value year_value as its input, creates local variables for regression and exports estimates for model term did.
foo <- function(year_value) {
  ## create local variables from `mydata`
  y <- mydata$y
  treated <- as.numeric(mydata$country %in% c("E", "F", "G"))  ## use `%in%`
  time <- as.numeric(mydata$year >= year_value)  ## use `year_value`
  did <- time * treated
  ## run regression using local variables
  didreg <- lm(y ~ treated + time + did)
  ## return estimate for model term `did`
  coef(summary(didreg))["did", ]
}
foo(1993)
#     Estimate   Std. Error      t value     Pr(>|t|)
#-2.784222e+09 1.504349e+09 -1.850782e+00 6.867661e-02
Note there are several places where your original code can be improved: using %in% instead of chained |, and using as.numeric instead of ifelse to coerce a logical to numeric (see the quick illustration below).
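A quick illustration of those two points, using a made-up country vector:

country <- c("A", "E", "F")
ifelse(country == "E" | country == "F" | country == "G", 1, 0)  # 0 1 1
as.numeric(country %in% c("E", "F", "G"))                       # same: 0 1 1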
Now you need something like a loop to iterate this function over several different values of year_value. I would use lapply.
## raw list of result from `lapply`
year_of_choice <- 1993:1994 ## taken for example
result <- lapply(year_of_choice, foo)
## rbind them into a matrix
data.frame(year = year_of_choice, do.call("rbind", result), check.names = FALSE)
# year Estimate Std. Error t value Pr(>|t|)
#1 1993 -2784221881 1504348732 -1.850782 0.06867661
#2 1994 -2519511630 1455676087 -1.730819 0.08815711
Note: don't include year 1990 (the minimum of variable year) as a choice, otherwise time will be a vector of 1s, identical to the intercept. The resulting model is rank-deficient and you will get a "subscript out of bounds" error. R versions since 3.5.0 have a new complete argument to the generic function coef, so for stability we may use
coef(summary(didreg), complete = TRUE)["did", ]
But you should then see all NA or NaN for year 1990.
Here is another option. Here we create a matrix covering all the years, join it to mydata, gather to long format, nest by grouping, and then run the regressions and extract the estimates. Note that "gt_et_**" stands for "greater than or equal to ...".
library(foreign)
library(dplyr)
library(tidyr)
library(purrr)
mydata = read.dta("http://dss.princeton.edu/training/Panel101.dta")
mtrx <- matrix(0, length(min(mydata$year):max(mydata$year)),
               length(min(mydata$year):max(mydata$year)))
mtrx[lower.tri(mtrx, diag = TRUE)] <- 1
df <- mtrx %>% as.data.frame() %>% mutate(year = min(mydata$year):max(mydata$year))
colnames(df) <- c(paste0("gt_et_", df$year), "year")

models <- df %>%
  full_join(., mydata, by = "year") %>%
  gather(mod, time, gt_et_1990:gt_et_1999) %>%
  nest(-mod) %>%
  mutate(data = map(data, ~mutate(.x, treated = ifelse(country == "E" | country == "F" | country == "G", 1, 0),
                                  did = time * treated)),
         mods = map(data, ~lm(y ~ treated + time + did, data = .x) %>% summary() %>% coef())) %>%
  unnest(mods %>% map(broom::tidy)) %>%
  filter(.rownames == "did") %>%
  select(-.rownames)
models
#> mod Estimate Std..Error t.value Pr...t..
#> 1 gt_et_1991 -2309823993 2410140350 -0.95837738 0.34137018
#> 2 gt_et_1992 -2036098728 1780081308 -1.14382344 0.25682856
#> 3 gt_et_1993 -2784221881 1504348732 -1.85078222 0.06867661
#> 4 gt_et_1994 -2519511630 1455676087 -1.73081886 0.08815711
#> 5 gt_et_1995 -2357323806 1455203186 -1.61992760 0.11001662
#> 6 gt_et_1996 250180589 1511322882 0.16553749 0.86902697
#> 7 gt_et_1997 405842197 1619653548 0.25057346 0.80292231
#> 8 gt_et_1998 -75683039 1852314277 -0.04085864 0.96753194
#> 9 gt_et_1999 2951694230 2452126428 1.20372840 0.23299421
Created on 2018-09-01 by the reprex package (v0.2.0).

Apply grouped model group-wise

My question is very similar to this one, but the problem I am facing has a twist that those answers do not address. Specifically, I am estimating a spatial model, y = rho * lw * y + X * beta. Because the observations are related by the weights matrix lw, I must apply the model to the entire X matrix simultaneously. Because those answers operate row-wise, they do not apply here.
Here is MWE data, consisting of twenty points across three groups and a spatial weights matrix:
library(spdep)
# Coordinates
pointcoords <- data.frame(x = runif(n = 20, min = 10, max = 100),
                          y = runif(n = 20, min = 10, max = 100),
                          ID = as.character(1:20))
pointsSP <- SpatialPoints(pointcoords[, 1:2])
# Weights matrix
lw <- nb2listw(knn2nb(knearneigh(pointsSP, k = 4, RANN = FALSE),
                      row.names = pointcoords$ID))
# Data
MyData <- data.frame(ID = rep(1:20, each = 3),
                     Group = rep(1:3, times = 20),
                     DV = rnorm(60), IV = rnorm(60))
I can estimate the models by Group with dplyr
library(dplyr)
models <- MyData %>% group_by(Group) %>%
  do(lm = lm(DV ~ IV, data = .),
     sar = lagsarlm(DV ~ IV, data = ., listw = lw))
Predicting to new data with this answer operates on a row-wise basis, working fine for the lm objects,
MyData2 <- data.frame(ID = rep(1:20, each = 3),
                      Group = rep(1:3, times = 20),
                      IV = rnorm(60))
MyData2 %>% left_join(models) %>% rowwise %>%
  mutate(lmPred = predict(lm, newdata = list("IV" = IV))) %>% head()
#Joining by: "Group"
#Source: local data frame [6 x 6]
#Groups:
# ID Group IV lm sar lmPred
#1 1 1 -0.8930794 <S3:lm> <S3:sarlm> -0.21378814
#2 1 2 -1.6637963 <S3:lm> <S3:sarlm> 0.42547796
#3 1 3 0.5243841 <S3:lm> <S3:sarlm> -0.23372996
#4 2 1 -0.1956969 <S3:lm> <S3:sarlm> -0.20860280
#5 2 2 0.8149920 <S3:lm> <S3:sarlm> 0.14771431
#6 2 3 -0.3000439 <S3:lm> <S3:sarlm> 0.05082524
But not for the sar models:
MyData2 %>% left_join(models) %>% rowwise %>%
  mutate(sarPred = predict(sar, newdata = list("IV" = IV), listw = lw)) %>% head()
#Joining by: "Group"
#Error in if (nrow(newdata) != length(listw$neighbours)) stop("mismatch between newdata and spatial weights") :
#  argument is of length zero
I think there should be a better way of doing this, without joining the model to every row. Also, creating a list object for newdata won't work if you have several (or changing) predictor variables. It seems that the dplyr way should be something like this:
MyData2 %>% group_by(Group) %>%
mutate(sarPred = predict(models$sar[[Group]], newdata = ., listw=lw))
But the [[Group]] index isn't quite right.
I ended up doing this with do() in dplyr, going through the models data.frame row by row. I believe it does what you want, although the output doesn't contain the new data used for the predictions. I did add Group to the output, though, as it seemed necessary to keep the groups separated.
models %>%
  do(data.frame(Group = .$Group,
                predlm = predict(.$lm, newdata = filter(MyData2, Group == .$Group)),
                predsar = predict(.$sar, newdata = filter(MyData2, Group == .$Group), listw = lw)))
EDIT
Playing around with adding the explanatory variable into the output data.frame: the following works, although there is likely a better way to do it.
models %>%
  do(data.frame(Group = .$Group,
                IV = select(filter(MyData2, Group == .$Group), IV),
                predlm = predict(.$lm, newdata = filter(MyData2, Group == .$Group)),
                predsar = predict(.$sar, newdata = filter(MyData2, Group == .$Group), listw = lw)))
I'm putting this out there because it does do what I want it to, even if it needs to use a for loop (gasp):
predictobj <- list()
for (i in models$Group) {
  predictobj[[i]] <- predict.sarlm(models$sar[[i]],
                                   newdata = filter(MyData2, Group == i),
                                   listw = lw)
}
Anybody have a dplyr solution?
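For what it's worth, the for loop above can be restated with purrr::map() to stay closer to tidyverse style; this is only a cosmetic rewrite of the same per-group prediction (it assumes, as the loop does, that the values in models$Group also index its rows):

library(purrr)
predictobj <- map(models$Group,
                  ~ predict(models$sar[[.x]],
                            newdata = filter(MyData2, Group == .x),
                            listw = lw))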

rolling regression with dplyr

I have a dataframe of "date", "company" and "return", reproducible by the code below:
library(dplyr)
n.dates <- 60
n.stocks <- 2
date <- seq(as.Date("2011-07-01"), by=1, len=n.dates)
symbol <- replicate(n.stocks, paste0(sample(LETTERS, 5), collapse = ""))
x <- expand.grid(date, symbol)
x$return <- rnorm(n.dates*n.stocks, 0, sd = 0.05)
names(x) <- c("date", "company", "return")
With this dataframe, I can calculate the daily market average return and add that result into a new column "market.ret".
x <- group_by(x, date)
x <- mutate(x, market.ret = mean(x$return, na.rm = TRUE))
Now I want to group all my data by different companies (2 in this case).
x <- group_by(x, company)
After doing this, I would like to fit "return" by "market.ret" and calculate the linear regression coefficient and store the slopes in a new column. If I want to do the fitting for the whole data set within a given company, then I can simply call lm():
group_by(x, company) %>%
  do(data.frame(beta = coef(lm(return ~ market.ret, data = .))[2])) %>%
  left_join(x, .)
However, I actually want to run the linear regression on a "rolling" basis, i.e. for each day over a 20-day trailing window. I want to use rollapply() but do not know how to pass two columns into the function. Any help or suggestion is greatly appreciated.
Note: below is the code that I used for calculating the 20-day rolling standard deviation of returns, which might be helpful:
library(zoo)  # rollapply() comes from zoo
sdnoNA <- function(x){ return(sd(x, na.rm = TRUE)) }
x <- mutate(x, sd.20.0.d = rollapply(return, FUN = sdnoNA, width = 20, fill = NA))
## lms is a function which calculates the linear regression coefficient
lms <- function(y, x){
  s <- which(is.finite(x * y))
  y <- y[s]
  x <- x[s]
  return(cov(x, y) / var(x))
}
## z is a dataframe which stores our final result
z <- data.frame()
## x has to be ungrouped
x <- ungroup(x)
## subset with "filter" and roll with "rollapply"
symbols <- unique(x$company)
for (i in 1:length(symbols)) {
  temp <- filter(x, company == symbols[i])
  z <- rbind(z, mutate(temp, beta = rollapply(temp[, c(3, 4)],
                                              FUN = function(x) lms(x[, 1], x[, 2]),
                                              width = 20, fill = NA,
                                              by.column = FALSE, align = "right")))
}
## final result
print(z)
Here is a dplyr solution
#####
# setup data as OP (notice the fix when computing the market return)
library(dplyr)
set.seed(41797642)
n.dates <- 60
n.stocks <- 2
date <- seq(as.Date("2011-07-01"), by=1, len=n.dates)
symbol <- replicate(n.stocks, paste0(sample(LETTERS, 5), collapse = ""))
x <- expand.grid(date, symbol)
x$return <- rnorm(n.dates*n.stocks, 0, sd = 0.05)
names(x) <- c("date", "company", "return")
x <- x %>%
  group_by(date) %>%
  mutate(market.ret = mean(return))
#####
# compute coefs using rollRegres
library(rollRegres)
func <- . %>% {
  roll_regres.fit(x = cbind(1, .$market.ret),
                  y = .$return, width = 20L)$coefs }
out <- x %>%
  group_by(company) %>%
  # make it explicit that data needs to be sorted
  arrange(date, .by_group = TRUE) %>%
  do(cbind(reg_col = select(., market.ret, return) %>% func,
           date_col = select(., date))) %>%
  ungroup
head(out[!is.na(out$reg_col.1), ], 5)
#R # A tibble: 5 x 4
#R company reg_col.1 reg_col.2 date
#R <fct> <dbl> <dbl> <date>
#R 1 SNXAD -0.0104 0.746 2011-07-20
#R 2 SNXAD -0.00953 0.755 2011-07-21
#R 3 SNXAD -0.0124 0.784 2011-07-22
#R 4 SNXAD -0.0167 0.709 2011-07-23
#R 5 SNXAD -0.0148 0.691 2011-07-24
tail(out[!is.na(out$reg_col.1), ], 5)
#R # A tibble: 5 x 4
#R company reg_col.1 reg_col.2 date
#R <fct> <dbl> <dbl> <date>
#R 1 UYLTS -0.00276 0.837 2011-08-25
#R 2 UYLTS 0.0000438 0.928 2011-08-26
#R 3 UYLTS 0.000250 0.936 2011-08-27
#R 4 UYLTS -0.000772 0.886 2011-08-28
#R 5 UYLTS 0.00173 0.902 2011-08-29
It is very close to the answer here, which is fairly close to this answer, though this one uses the rollRegres package.
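Alternatively, if you want to stay with zoo, the rollapply() + lms() idea from the question can be folded into a single grouped mutate(); this is a sketch assuming the lms() helper and the corrected x defined above:

library(dplyr)
library(zoo)
x %>%
  group_by(company) %>%
  arrange(date, .by_group = TRUE) %>%
  mutate(beta = rollapplyr(cbind(market.ret, return),  # 20-day window over both columns
                           width = 20,
                           FUN = function(m) lms(m[, 2], m[, 1]),  # slope of return on market.ret
                           by.column = FALSE, fill = NA)) %>%
  ungroup()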
