I have an experimental design. I want to calculate its D-efficiency.
I thought the R package AlgDesign could help. I found the function optFederov, which generates a design and, if the user wants, returns its D-efficiency. However, I don't want to use optFederov to generate the design, because I already have my design!
I tried eval.design(~.,mydesign). But the only metrics it gives me are: determinant, A, diagonality, and gmean.variances. Maybe there is a way to get from determinant or A to D-efficiency (I am not a mathematician, so I am not sure). Or maybe some other way to calculate D-efficiency "manually" so to say?
Thanks a lot for any hint!
I was working on a similar project and found the formula Deff = |X'X|^(1/p) / ND in this link, where X is the model matrix, p is the number of coefficients (betas) in your linear model, and ND is the number of runs in your experiment. Code like this will do the trick:
det(t(X) %*% X)^(1/p) / numRuns
I checked the results against JMP for my project, so I believe this is the correct formula.
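As a minimal sketch of that formula in practice, the snippet below builds the model matrix with model.matrix and evaluates |X'X|^(1/p) / ND. The data frame mydesign here is a hypothetical 2^2 full factorial in coded (-1/+1) units, used only for illustration; the value of this measure depends on how the factors are coded.

```r
# Hypothetical example design: a 2^2 full factorial in coded (-1/+1) units.
mydesign <- data.frame(A = c(-1, 1, -1, 1),
                       B = c(-1, -1, 1, 1))

# Model matrix for a main-effects model with intercept.
X <- model.matrix(~ ., data = mydesign)

p <- ncol(X)  # number of model coefficients (intercept, A, B)
n <- nrow(X)  # number of runs

# D-efficiency as |X'X|^(1/p) / n; crossprod(X) computes t(X) %*% X.
d_eff <- det(crossprod(X))^(1 / p) / n
d_eff
```

For this orthogonal design X'X is 4 times the identity matrix, so the result is 1 (up to floating-point rounding), the largest value this measure can take with -1/+1 coding.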
Determinant, the first result given by eval.design, is the D-efficiency.
I have used the function hw to analyze a time series
fcast<-hw(my_series,h=12,level=95,seasonal="multiplicative",damped=TRUE,lambda=NULL)
By looking at the fcast$model$par I observe the values of alpha, beta, gamma, phi, and the initial states.
I've also looked at the contents of fcast$model$states to see the evolution of all the values. I've tried to reproduce the results in Excel in order to understand the whole procedure.
To achieve the same values of b (trend) as in fcast$model$states I observe that I have to use a formula like the one in the bibliography about the Holt-Winters method:
b(t)=beta2*(l(t)-l(t-1))+(1-beta2)*phi*b(t-1)
But, if in fcast$model$par beta=0.08128968, I find that in order to achieve the same results I have to use beta2=0.50593541.
What's the reason for that? I don't see any relationship between beta and beta2.
I have also found that in order to get the same forecast as the one obtained with the hw function I have to use the following formulas once the data are finished:
l(t)=l(t-1)+b(t-1)
b(t)=phi*b(t-1)
^y(t)=(l(t-1)+b(t-1))*s(t-m)
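The three forecast-phase formulas above can be transcribed into R roughly as follows. This is only a sketch of the recursions as written in this post, not the internals of hw. The function name hw_forecast_phase and its arguments are hypothetical, and it assumes the last m seasonal states are simply reused in order once the data end.

```r
# l_last, b_last: level and trend at the last observation.
# s_last: the last m seasonal states, oldest first (assumed to repeat).
# phi: damping parameter; h: number of steps ahead.
hw_forecast_phase <- function(l_last, b_last, s_last, phi, h) {
  m <- length(s_last)
  l <- l_last
  b <- b_last
  yhat <- numeric(h)
  for (k in seq_len(h)) {
    # ^y(t) = (l(t-1) + b(t-1)) * s(t-m)
    yhat[k] <- (l + b) * s_last[(k - 1) %% m + 1]
    # l(t) = l(t-1) + b(t-1)
    l <- l + b
    # b(t) = phi * b(t-1)
    b <- phi * b
  }
  yhat
}

# Made-up states for illustration only.
hw_forecast_phase(l_last = 100, b_last = 10, s_last = c(1, 2), phi = 0.5, h = 3)
```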
I haven't found any bibliography on this forecasting phase, explaining that some parameters are no longer used. For instance, in this case phi is still used for b(t), but not used anymore for l(t).
Can anyone refer to any bibliography where I can find this part explained?
So in the end I've been able to reproduce the whole set of data in Excel, but there's a couple of steps I would like to understand better.
Thank you!!
Currently I'm interested in learning how to obtain information from the American Community Survey PUMS files. I have read some of the ACS documentation and found that to use the replicate weights I must apply the following formula:
Thanks to Google I also found the survey package and its svrepdesign function, which should help me get this done:
https://www.rdocumentation.org/packages/survey/versions/3.33-2/topics/svrepdesign
Now, even though I'm getting into R and learning statistics and have a SQL background, there are two BIG problems:
1 - I have no idea what that formula means, and I would really like to understand it before going any further.
2 - I don't understand how the svrepdesign function works or how to use it.
I'm not looking for someone to solve my life/problems, but I would really appreciate if someone points me in the right direction and gives a jump start.
Thank you for your time.
When you use svrepdesign, you are specifying a design with replicate weights, and it uses the formula you provided to calculate the standard errors.
The American Community Survey has 80 replicate weights, so it first calculates the statistic you are interested in with the full-sample weights (X), then calculates the same statistic with each of the 80 replicate weights (X_r). The standard error is built from the squared differences between each X_r and X.
You should read this: https://usa.ipums.org/usa/repwt.shtml
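To make the replicate-weight idea concrete, here is a base-R sketch for a weighted total. It assumes the formula in question is the one documented for the ACS, SE = sqrt(4/80 * sum((X_r - X)^2)). The function name rep_weight_se and the synthetic data are made up for illustration and are not real ACS records.

```r
# Replicate-weight standard error for a weighted total.
# x: analysis variable, w: full-sample weight,
# repw: matrix of replicate weights, one column per replicate (80 for ACS PUMS).
rep_weight_se <- function(x, w, repw) {
  full_est <- sum(w * x)         # X: estimate with the full-sample weight
  rep_est  <- colSums(repw * x)  # X_r: one estimate per replicate weight
  # Assumed ACS factor 4 / (number of replicates), i.e. 4/80.
  sqrt(4 / ncol(repw) * sum((rep_est - full_est)^2))
}

# Synthetic data, for illustration only.
set.seed(1)
n    <- 50
x    <- rbinom(n, 1, 0.3)
w    <- runif(n, 50, 150)
repw <- matrix(w * runif(n * 80, 0.5, 1.5), nrow = n, ncol = 80)

rep_weight_se(x, w, repw)
```

In svrepdesign the same factor is usually supplied through the scale and rscales arguments (for example scale = 4/80 and rscales = rep(1, 80)), so the package does this arithmetic for any statistic, not just totals.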
I am trying to understand how to fit a VAR model that is restricted to a specific form rather than the general one.
I understand that a general VAR(1) model can be fitted with the "vars" package from CRAN.
For example, suppose y is a 10-by-2 matrix. After loading the vars package I did this:
y=df[,1:2] # df is a data frame with many columns (only the first two matter here)
VARselect(y, lag.max=10, type="const")
summary(fitHilda <- VAR(y, p=1, type="const"))
This works fine if no restrictions are placed on the coefficients. However, suppose I would like to fit this restricted VAR model in R.
How can I do so in R?
Please refer me to a page if you know of one. If anything is unclear from your perspective, please do not downvote; let me know what it is and I will try to make it as clear as I can.
Thank you very much in advance
I was not able to find out how to impose restrictions exactly the way I wanted. However, I found a workaround, as follows.
First choose the number of lags using an information criterion, for example:
VARselect(y, lag.max=10, type="const")
This gives you the lag length, which was one in my case. Then fit a VAR(1) model to your data, which in my case is y.
t=VAR(y, p=1, type="const")
When I view the summary, I find that some of the coefficients may be statistically insignificant.
summary(t)
Then run the built-in function restrict from the 'vars' package:
t1=restrict(t, method = "ser", thresh = 2.0, resmat = NULL)
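This function re-estimates the VAR while imposing zero restrictions on coefficients according to their significance (with method = "ser", regressors whose absolute t-value is below thresh are dropped).
To see the result, run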
summary(t1)
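If the zero restrictions are known in advance rather than chosen by significance, restrict also accepts method = "man" together with a 0/1 matrix resmat. The sketch below uses simulated data and an arbitrary restriction pattern, both made up for illustration. It assumes the columns of resmat follow the regressor order y1.l1, y2.l1, const for a two-variable VAR(1) with a constant.

```r
library(vars)

# Simulated two-variable series, for illustration only.
set.seed(123)
y <- data.frame(y1 = rnorm(100), y2 = rnorm(100))

fit <- VAR(y, p = 1, type = "const")

# One row per equation (y1, y2); columns follow the regressors
# y1.l1, y2.l1, const. 1 = estimate the coefficient, 0 = fix it at zero.
resmat <- matrix(c(1, 0, 1,   # y1 equation: y2.l1 restricted to zero
                   1, 1, 1),  # y2 equation: unrestricted
                 nrow = 2, ncol = 3, byrow = TRUE)

fit_man <- restrict(fit, method = "man", resmat = resmat)
summary(fit_man)
```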
I am a novice in R programming.
I would like to ask experts here a question concerning a code of R.
First, let a vector x be c(2,5,3,6,5)
I hope to make another vector y whose i-th component is drawn from N(sum(x[1:i]),1)
(i.e. the i-th component of y follows a normal distribution with variance 1 and mean equal to the sum of x[1](=2) through x[i] (i=1,2,3,4,5))
For example, the third component of y follows a normal distribution with mean x[1]+x[2]+x[3]=2+5+3=10 and variance 1
I want to know a code of R making the vector y described above "without using repetition syntax such as for, while, etc."
Since I am a novice at R programming and have a poor sense for computational statistics, I can't seem to come up with clever R code for this at all.
Please let me know R code that makes the vector explained above without using repetition syntax such as for, while, etc.
Thank you very much in advance for your thoughtful answer.
You can do
rnorm(length(x), mean = cumsum(x), sd = 1)
rnorm is part of the *norm family of functions associated with the normal distribution. To see how a function with a known name works, use
help("rnorm") # or ?rnorm
cumsum takes the cumulative sum of a vector.
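Applied to the vector from the question, the pieces fit together like this. The set.seed call is only there to make the random draw reproducible and is not part of the original answer.

```r
x <- c(2, 5, 3, 6, 5)

# Running totals of x become the means of the normal draws.
means <- cumsum(x)
means  # 2 7 10 16 21

# rnorm is vectorised over mean: the i-th draw uses means[i].
set.seed(42)
y <- rnorm(length(x), mean = means, sd = 1)
y
```

Because rnorm recycles and matches its mean argument element by element, no explicit loop is needed.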
Finding functionality
In R, it's generally a safe bet that most functionality you can think of has been implemented by someone already. So, for example, in the OP's case, it is not necessary to roll a custom loop.
The same naming convention as *norm is followed for other distributions, e.g., rbinom. You can follow the link at the bottom of ?rnorm to reach ?Distributions, which lists others in base R.
If you are starting from scratch and don't know the names of any related functions, consider using the built-in search tools, like:
help.search("normal distribution") # or ??"normal distribution"
If this reveals nothing and yet you still think a function must exist, consider installing and loading the sos package, which allows
findFn("{cumulative mean}") # or ???"{cumulative mean}"
findFn("{the pareto distribution}") # or ???"{the pareto distribution}"
Beyond that, there are other online resources, like Google, that are good. However, a question about functionality on Stack Overflow is a risky proposition, since it will not be received well (downvoted and closed as a "tool request") if the implementation of the desired functionality is nonexistent or unknown to folks here. Stack Overflow's new "Documentation" subsite will hopefully prove to be a resource for finding R functions as well.
Just a warning: I started using R a day ago, so my apologies if anything seems idiotically simple.
Right now I'm trying to have R take in a .txt file with accelerometer data of an impact and calculate the Head Injury Criterion (HIC) for it. The HIC requires that the curve from the data be integrated over a certain interval.
The equation is at the link below. I tried to insert it here as an image but it would not let me; apparently I need some reputation points before it will let me do that.
a(t) is the acceleration curve.
So far I have not had an issue generating a suitable curve in R to match the data. The loess function worked quite well and is exactly the kind of thing I was looking for. I just have no idea how to integrate it. As far as I can tell, loess is a non-parametric regression, so there is no way to determine the equation of the curve itself. Is there a way to integrate it though?
If not, is there another way to accomplish this task using a different function?
Any help or insightful comments would be very much appreciated.
Thanks in advance,
Wes
One more question though, James: how can I get just the number, without the text and error, when using the integrate() function?
You can use the predict function on your loess model to create a function to use with integrate.
# using the inbuilt dataset "pressure"
plot(pressure,type="l")
# create loess object and prediction function
l <- loess(pressure~temperature,pressure)
f <- function(x) predict(l,newdata=x)
# perform integration
integrate(f,0,360)
40176.5 with absolute error < 4.6
And to extract the value alone:
integrate(f,0,360)$value
[1] 40176.5
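Building on the same idea, the HIC contribution of a single time window can be sketched as below. Since the equation itself is not reproduced in the question, this assumes the standard definition, (t2 - t1) * [ (1/(t2 - t1)) * integral of a(t) from t1 to t2 ]^2.5, with the HIC being the maximum of this quantity over all admissible windows. The function name hic_window and the demo pulse are hypothetical.

```r
# a: vectorised function returning acceleration at time t,
#    e.g. function(t) predict(l, newdata = t) for a loess fit l.
# Returns the HIC expression for one window [t1, t2].
hic_window <- function(a, t1, t2) {
  avg_acc <- integrate(a, t1, t2)$value / (t2 - t1)  # mean acceleration
  (t2 - t1) * avg_acc^2.5
}

# Made-up half-sine pulse (peak 50, duration 0.02), for illustration only.
a_demo <- function(t) 50 * sin(pi * t / 0.02)
hic_window(a_demo, 0, 0.02)
```

To get the HIC itself, this function would be evaluated over a grid of (t1, t2) pairs and the largest value kept.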