I have to compute characteristic roots (eigenvalues) for hundreds of U.S. counties. I got help doing this with lapply, but I would prefer to use map() from purrr if that is possible, since I am already using purrr to compute linear regressions.
Here is my test problem and the output.
library(tidyverse)
test.dat <- tibble(ID = c(1, 2), a = c(1, 1), b = c(1, 1), c = c(2, 2), d = c(4, 3))
test.out <- test.dat %>%
  nest(-ID) %>%
  mutate(fit = purrr::map(data, ~ function(x) eigen(data.matrix(x)), data = .))
I would appreciate any help.
Thanks
V.K.Chetty
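For reference, here is a minimal sketch of the nest() + purrr::map() pattern, using made-up data in which each ID contributes a square block of numbers (eigen() needs a square matrix, which the one-row-per-ID test data above would not give it):

library(dplyr)
library(tidyr)
library(purrr)

# Made-up example: two IDs, each contributing a 2 x 2 numeric block
dat <- tibble(ID = c(1, 1, 2, 2),
              a  = c(1, 2, 1, 3),
              b  = c(2, 1, 3, 1))

out <- dat %>%
  nest(data = -ID) %>%                               # one nested tibble per ID
  mutate(fit = map(data, ~ eigen(data.matrix(.x))))  # eigen decomposition per ID

out$fit[[1]]$values  # characteristic roots (eigenvalues) for the first ID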
Related
I would like to ask for tips on how to run multiple statistical tests (e.g. t-test, F-test, KS-test) across multiple groups. Basically, I want to run each test as many times as there are groups in my data and combine the results into a single output. In this case, I am trying to do a t-test comparing current-year and previous-year data, grouped by a variable (let's say dealer code).
I have data similar to the dataset below (stress) and would like to find out whether there is a generic approach for this. I managed to use rstatix for the t-test, but it doesn't have this functionality for the F-test and the KS-test.
Any help is appreciated. Thank you!
library(datarium)
library(dplyr)
library(rstatix)

data("stress", package = "datarium")
set.seed(123)
stress %>% sample_n_by(size = 60)

stat.test <- stress %>%
  group_by(exercise) %>%
  t_test(score ~ treatment) %>%
  add_significance()
stat.test
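As a more generic pattern (a sketch, not using rstatix): any base-R test that returns an htest object can be run per group with dplyr::group_modify() and turned into a one-row data frame with broom::tidy(), so the t-test, the F test (var.test()) and the KS test all fit the same mould. run_tests() below is just a helper name made up for this sketch, and the two samples for ks.test() are pulled out with split():

library(datarium)
library(dplyr)
library(broom)

data("stress", package = "datarium")

run_tests <- function(d) {
  s <- split(d$score, d$treatment)  # the two treatment samples
  bind_rows(
    t  = tidy(t.test(score ~ treatment, data = d)),
    F  = tidy(var.test(score ~ treatment, data = d)),
    KS = tidy(ks.test(s[[1]], s[[2]])),
    .id = "test"
  )
}

stress %>%
  group_by(exercise) %>%
  group_modify(~ run_tests(.x)) %>%
  ungroup()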
I am running an ARIMA model using the fable package. I'm just curious whether there is a way to specify the order of the model (e.g. an order of (2,1,1)) when using the ARIMA() function in the package, as opposed to using the optimal lags that are selected automatically.
I'm also trying to figure out the best way to add a vector as a dummy variable in order to control for a structural break at the first observation.
I've used the built-in tourism dataset:
library(fable)
library(tsibble)  # provides the tourism dataset
library(dplyr)
fit <- tourism %>% slice(tail(row_number(), 10)) %>%
  model(arima = ARIMA(Trips))
TIA!
If you check out the help page for the ARIMA() function, it explains how to specify p, d, and q, and it provides a nice example:
# Manual ARIMA specification
USAccDeaths %>%
  as_tsibble() %>%
  model(arima = ARIMA(log(value) ~ 0 + pdq(0, 1, 1) + PDQ(0, 1, 1))) %>%
  report()
You can specify numbers (or ranges of numbers that it can use to determine which one is the best fit) for p, d, and q in the right-hand side of the formula argument within the ARIMA() function. pdq() is for non-seasonal components and PDQ() is for seasonal components.
It also says:
To force a nonseasonal fit, specify PDQ(0, 0, 0) in the RHS of the model formula.
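Applied to the tourism example from the question, a fixed (2,1,1) non-seasonal specification could look like the sketch below (untested; fit_211 is just an assumed name, and this fits every series in tourism, so it may take a moment). ARIMA() still estimates the coefficients, it just no longer searches over other orders; you can also pass ranges such as pdq(p = 0:2, d = 1, q = 0:2) to restrict the search rather than fixing it.

library(fable)
library(tsibble)  # tourism dataset
library(dplyr)

fit_211 <- tourism %>%
  model(arima = ARIMA(Trips ~ pdq(2, 1, 1) + PDQ(0, 0, 0)))  # force (2,1,1), no seasonal part
tidy(fit_211)    # coefficients, one row per series and term
glance(fit_211)  # AICc and other fit statistics, one row per series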
I'm not exactly sure what you're trying to accomplish with the dummy variable. Could you provide a reproducible example?
I wondered whether frollapply() can be leveraged to run a regression in a data.table workflow. I can perform this task using rollRegres and the tidyverse as follows:
library(dplyr)
library(rollRegres)

DT <- DT %>%
  group_by(id) %>%
  do(., mutate(., Beta = roll_regres(Y ~ X, ., 252)$coef[, 2]))
The code above computes the slope of the regression over a rolling 252-day window. I would appreciate any feedback on this, and on whether a frollapply() solution, if relevant, would be faster than the above code.
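For what it's worth, here is an untested sketch of how frollapply() could be pressed into service. frollapply() rolls over a single vector, so one workaround is to roll over the row indices and refit the regression with lm.fit() inside the callback; because this refits one model per window rather than updating it, it may well end up slower than rollRegres, which is implemented in compiled code.

library(data.table)

setDT(DT)  # assuming DT has columns id, Y and X, ordered by date within id
DT[, Beta := frollapply(seq_len(.N), 252, function(i) {
      # i is the vector of row positions in the current 252-observation window
      coef(lm.fit(cbind(1, X[i]), Y[i]))[2]  # slope on X
    }),
   by = id]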
I am using the code below to get correlations between my dependent variable and a questionnaire response (for different levels of different conditions).
BREAK %>%
  group_by(condition, valence) %>%
  summarize(COR = cor(rt, positive_focused_cognitiveER)) %>%
  ungroup()
It gives me the correlations and their directions (+/-).
I would like to know, however, if those correlations are significant.
Is there a way to simply add a line to the code I already have to get the p-values?
Or is there another simple way to do it? (I don't need anything fancy, just the numbers.)
The only fitting post I found for my problem was this one: "Getting p values for groupwise correlation using the dplyr package", but the answer did not help me.
Thanks in advance for any tips! :)
You can compute p-values with stats::cor.test():
BREAK %>%
  group_by(condition, valence) %>%
  summarize(COR  = stats::cor.test(rt, positive_focused_cognitiveER)$estimate,
            pval = stats::cor.test(rt, positive_focused_cognitiveER)$p.value) %>%
  ungroup()
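If you would rather call cor.test() once per group instead of twice, an alternative sketch (using broom, which the code above does not use) is to run the test inside group_modify() and tidy the whole result, which gives the estimate, p-value, and confidence interval in one row per group:

library(dplyr)
library(broom)

BREAK %>%
  group_by(condition, valence) %>%
  group_modify(~ tidy(cor.test(.x$rt, .x$positive_focused_cognitiveER))) %>%
  ungroup()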
I'm trying to write a loop (or something else that can do this) that runs a linear model of the natural log of cases against year, separately for each country, so that I can take the slope from each model and plot the slopes as a histogram.
I'm very new to R and I'm struggling to work out how to do this. My data has columns for country, year, and cases, with 197 countries in total covering the years 1997-2019.
Any help on how to do this would be greatly appreciated, thank you.
Based on your question, take a look at this website.
Let's say your data is in a data.frame called df; then you could do something like this:
library(dplyr)
library(purrr)

df %>%
  split(.$country) %>%
  map(~ lm(log(cases) ~ year, data = .x)) %>%
  map_dbl(~ coef(.x)[["year"]])  # slope on year for each country
Let me know if you need more help with this.
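And since the end goal is a histogram of the slopes, you could store the result of the pipeline above and plot it with base hist(), for example:

slopes <- df %>%
  split(.$country) %>%
  map(~ lm(log(cases) ~ year, data = .x)) %>%
  map_dbl(~ coef(.x)[["year"]])

hist(slopes, xlab = "Slope of log(cases) on year", main = "Per-country slopes")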