Integrate nonparametric curve in R - r

Just a warning, I started using R a day ago...my apologies if anything seems idiotically simple.
Right now I'm trying to have R take in a .txt file with accelerometer data of an impact and calculate the Head Injury Criterion (HIC) for it. The HIC requires that the curve from the data be integrated over a certain interval.
The equation is at the link below...I tried to insert it here as an image but it would not let me. Apparently I need some reputation points before it'll let me do that.
a(t) is the acceleration curve.
So far I have not had an issue generating a suitable curve in R to match the data. The loess function worked quite well, and is exactly the kind of thing I was looking for...I just have no idea how to integrate it. As far as I can tell, loess is a non-parametric regression, so there is no way to determine the equation of the curve itself. Is there a way to integrate it though?
If not, is there another way to accomplish this task using a different function?
Any help or insightful comments would be very much appreciated.
Thanks in advance,
Wes
One more question though, James: how can I get just the number, without the text and error, when using the integrate() function?

You can use the predict function on your loess model to create a function to use with integrate.
# using the inbuilt dataset "pressure"
plot(pressure,type="l")
# create loess object and prediction function
l <- loess(pressure~temperature,pressure)
f <- function(x) predict(l,newdata=x)
# perform integration
integrate(f,0,360)
40176.5 with absolute error < 4.6
And to extract the value alone:
integrate(f,0,360)$value
[1] 40176.5
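Applied to the HIC itself, the same predict-plus-integrate idea might look like the sketch below. The impact pulse is synthetic (a noisy half-sine made up for illustration), the names impact, a_hat and hic_window are hypothetical, and the coarse grid search over windows is only one simple way to approximate the maximum over t1 and t2 that the HIC definition calls for. It assumes time in seconds and acceleration in g.

```r
# Synthetic half-sine impact pulse (made-up data, for illustration only)
set.seed(1)
t <- seq(0, 0.03, by = 0.0005)                            # time in seconds
a <- 80 * sin(pi * t / 0.03) + rnorm(length(t), sd = 2)   # acceleration in g
impact <- data.frame(t = t, a = a)

# Smooth the curve with loess and wrap predict() as a function of time
fit <- loess(a ~ t, data = impact, span = 0.3)
a_hat <- function(x) predict(fit, newdata = data.frame(t = x))

# HIC term for one candidate window [t1, t2]:
# (t2 - t1) * (average acceleration over the window)^2.5
hic_window <- function(t1, t2) {
  avg <- integrate(a_hat, t1, t2)$value / (t2 - t1)
  (t2 - t1) * avg^2.5
}

# Coarse grid search over windows inside the data range (an approximation)
grid  <- seq(0.002, 0.028, by = 0.002)
pairs <- subset(expand.grid(t1 = grid, t2 = grid), t2 > t1)
hic   <- max(mapply(hic_window, pairs$t1, pairs$t2))
hic
```

Note that loess does not extrapolate by default, so the windows have to stay inside the time range of the data.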

Related

Optimizing parameter values using non-linear least squares in R (with integrals)

Obviously an R (and math) amateur. I've been working 10+ hours on trying to get this to work, so I thought I'd attempt posting here as a shot.
I have data collected from an experiment with two variables: Iq and q. These data are linear when plotted in loglog space. I am trying to solve for two other variables, por and r, in the following equation:
Iq=SLD^2*(por/Vra)*integral{Rmin to Rmax}((Vr)^2*f(r)*F dr)
Where:
SLD=known constant
por=unknown
Vra=integral{0 to Inf}(Vr*f(r)dr)
Vr=(4/3)*pi*r^3
Rmin and Rmax = known constants
f(r)=((r^-(1+fd))/(Rmin^(-fd) - Rmax^(-fd))/fd)
r=unknown
fd=known constant
F=(3*(sin(q*r)-q*r*cos(q*r))/(q*r)^3)^2
I've tried many attempts at this, but can't seem to wrap my brain around the variables inside the variables into code. This problem used to be solved in an Excel solver routine that optimized parameter values using non-linear least squares that only works on (imo) Windows 95 Excel, and we're trying to adapt it into a more user-friendly data processing method. But I'm a geochemist, so basically useless. Any help would be much appreciated! I can include more details if some kind soul out there is willing to help out.
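One possible way to set this up in R is to wrap integrate() inside a model function and hand that function to nls(). The sketch below is only a guess at the intended model: every constant is a made-up placeholder, f(r) is read as a power law normalised on [Rmin, Rmax], Vra is integrated over [Rmin, Rmax] rather than 0 to Inf, only por is fitted, and the "data" are simulated from the model itself. The names iq_model, f_r and F_qr are hypothetical.

```r
# All constants below are made-up placeholders, not values from the question
SLD  <- 1      # known constant (arbitrary units)
fd   <- 2.5    # known constant
Rmin <- 1
Rmax <- 50

Vr <- function(r) (4 / 3) * pi * r^3
# Power-law size distribution, normalised to integrate to 1 on [Rmin, Rmax]
f_r <- function(r) r^(-(1 + fd)) / ((Rmin^(-fd) - Rmax^(-fd)) / fd)
# F as defined in the question
F_qr <- function(q, r) (3 * (sin(q * r) - q * r * cos(q * r)) / (q * r)^3)^2

# Vra, integrated over [Rmin, Rmax] here (an assumption)
Vra <- integrate(function(r) Vr(r) * f_r(r), Rmin, Rmax)$value

# Model intensity for a vector of q values and a single value of por
iq_model <- function(q, por) {
  sapply(q, function(qi) {
    integrand <- function(r) Vr(r)^2 * f_r(r) * F_qr(qi, r)
    SLD^2 * (por / Vra) * integrate(integrand, Rmin, Rmax)$value
  })
}

# Simulated "measurements" with known por = 0.2 and multiplicative noise
set.seed(42)
dat <- data.frame(q = exp(seq(log(0.01), log(0.2), length.out = 30)))
dat$Iq <- iq_model(dat$q, 0.2) * exp(rnorm(30, sd = 0.05))

# Non-linear least squares on the log scale, fitting por only
fit <- nls(log(Iq) ~ log(iq_model(q, por)), data = dat,
           start = list(por = 0.1))
coef(fit)
```

Fitting on the log scale is a design choice here, picked because the data are described as linear in log-log space; fitting Iq directly would weight the low-q points much more heavily.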

R - replicate weight survey

Currently I'm interested in learning how to obtain information from the American Community Survey PUMS files. I have read some of the ACS documentation and found that to use the replicate weights I must apply the following formula:
And thanks to Google I also found that there's the survey package and the svrepdesign function to help me get this done:
https://www.rdocumentation.org/packages/survey/versions/3.33-2/topics/svrepdesign
Now, even though I'm getting into R and learning statistics and have a SQL background, there are two BIG problems:
1 - I have no idea what that formula means and I would really like to understand it before going any further
2 - I don't understand how the SVREPDESIGN function works nor how to use it.
I'm not looking for someone to solve my life/problems, but I would really appreciate if someone points me in the right direction and gives a jump start.
Thank you for your time.
When you are using svrepdesign, you are specifying that it is a design with replicate weights, and it uses the formula you provided to calculate the standard errors.
The American Community Survey has 80 replicate weights, so it first calculates the statistic you are interested in with the full sample weights (X), then it calculates the same statistic with each of the 80 replicate weights (X_r). The standard error is then sqrt(4/80 * sum((X_r - X)^2)).
You should read this: https://usa.ipums.org/usa/repwt.shtml
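To see what the replicate-weight calculation does, it can be reproduced by hand in base R. The sketch below uses simulated data: the income column and the randomly perturbed replicate weights are made up, and only the column name PWGTP follows the ACS naming. It computes a weighted mean with the full-sample weight, recomputes it with each of 80 replicate weights, and combines the differences with the 4/80 factor used for the ACS.

```r
set.seed(123)
n <- 200

# Hypothetical PUMS-like extract: one full-sample weight per person
pums <- data.frame(income = rlnorm(n, meanlog = 10, sdlog = 0.5),
                   PWGTP  = runif(n, 50, 150))

# 80 made-up replicate weights (real files ship these as PWGTP1..PWGTP80)
rep_w <- sapply(1:80, function(r) pums$PWGTP * runif(n, 0.5, 1.5))

# X: the statistic computed with the full-sample weight
X <- weighted.mean(pums$income, pums$PWGTP)

# X_r: the same statistic recomputed with each replicate weight
X_r <- apply(rep_w, 2, function(w) weighted.mean(pums$income, w))

# Replicate-weight standard error with the ACS factor 4/80
se <- sqrt(4 / 80 * sum((X_r - X)^2))
c(estimate = X, se = se)
```

As far as I know, the matching svrepdesign() call passes the full-sample weight as weights, the 80 replicate columns as repweights, and scale = 4/80 with mse = TRUE, but check the survey package documentation before relying on that.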

Function to plot model with one variable varying and others constant

It's simple, but I can't remember what this procedure is called, so I was not able to find the function for it. I want to explore the effects and gradients of a simple lm() model by plotting the response against one variable at a time, with the others kept constant.
Can anybody tell me which function to use to do so? I seem to remember it's a function generating several plots, or something like this. It could be something akin to sensitivity analysis... Sorry for the beginner question.
Thank you in advance!
The car package has a lot of utilities for analyzing regression models. This sounds like a component+residual plot (or partial residuals plot).
library(car) # for crPlots(...)
fit <- lm(mpg~wt+hp+disp, mtcars)
crPlots(fit)
As noted in the comments, termplot(...) does basically the same thing.
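For reference, a base R version that does not need the car package might look like the sketch below. It shows termplot() with partial residuals, followed by a manual alternative that varies wt over a grid while holding hp and disp at their sample means. Holding the other predictors at their means is an assumption about what "kept constant" should mean; any other fixed values would work the same way.

```r
fit <- lm(mpg ~ wt + hp + disp, data = mtcars)

# One panel per term, with partial residuals and standard-error bands
op <- par(mfrow = c(1, 3))
termplot(fit, partial.resid = TRUE, se = TRUE)
par(op)

# Manual version: vary wt, hold hp and disp at their means
grid <- data.frame(
  wt   = seq(min(mtcars$wt), max(mtcars$wt), length.out = 50),
  hp   = mean(mtcars$hp),
  disp = mean(mtcars$disp)
)
grid$mpg_hat <- predict(fit, newdata = grid)
plot(mpg_hat ~ wt, data = grid, type = "l",
     xlab = "wt", ylab = "predicted mpg")
```

Because the model is linear, the manual curve is a straight line whose slope is the fitted coefficient of wt.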

Calculate D-efficiency of an experimental design in R

I have an experimental design. I want to calculate its D-efficiency.
I thought R package AlgDesign could help. I found function optFederov which generates the design and - if the user wants - returns its D efficiency. However, I don't want to use optFederov to generate the design - I already have my design!
I tried eval.design(~.,mydesign). But the only metrics it gives me are: determinant, A, diagonality, and gmean.variances. Maybe there is a way to get from determinant or A to D-efficiency (I am not a mathematician, so I am not sure). Or maybe some other way to calculate D-efficiency "manually" so to say?
Thanks a lot for any hint!
I was working on a similar project. I found this formula, Deff = (|X'X|^(1/p))/ND, in this link. Here X is the model matrix, p is the number of betas in your linear model, and ND is the number of runs in your experiment. Code like this will do the trick:
det(t(X) %*% X)^(1/p) / ND
I tested the results using JMP for my project so I believe this is the correct formula
Determinant, the first result given by eval.design, is the D-efficiency.
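The formula Deff = (|X'X|^(1/p))/ND can be checked on a small design where the answer is known. The sketch below uses a full 2^3 factorial in coded -1/+1 units with a main-effects model, which is an illustrative choice and not the asker's design. For this orthogonal design X'X is 8 times the identity, so the D-efficiency comes out as 1.

```r
# Full 2^3 factorial in coded units (illustrative design)
design <- expand.grid(A = c(-1, 1), B = c(-1, 1), C = c(-1, 1))

# Model matrix for a main-effects model: intercept plus A, B, C
X  <- model.matrix(~ A + B + C, data = design)
p  <- ncol(X)   # number of model coefficients (4)
ND <- nrow(X)   # number of runs (8)

# D-efficiency: p-th root of det(X'X), divided by the number of runs
d_eff <- det(crossprod(X))^(1 / p) / ND
d_eff
```

To use it on an existing design, replace design and the model formula with your own; factors should be coded consistently, since the value depends on the coding.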

How can I estimate the logarithmic form of data points using R?

I have data points that represent a logarithmic function.
Is there an approach where I can just estimate the function that describes this data using R?
Thanks.
I assume you mean that you have vectors y and x and you are trying to fit a function y(x) = A*log(x).
First of all, fitting log is a bad idea, because it doesn't behave well. Luckily we have x(y)=exp(y/A), so we can fit an exponential function which is much more convenient. We can do it using nonlinear least squares:
nls(x~exp(y/A),start=list(A=1.),algorithm="port")
where start is an initial guess for A. This approach is a numerical optimization, so it may fail.
The more stable way is to transform it to a linear function, log(x(y))=y/A and fit a straight line using lm:
lm(log(x)~y)
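A quick check of the linearised fit on simulated data might look like this. The data and the true value A = 3 are made up, and the intercept is dropped (+ 0) so that the fitted line matches log(x) = y/A exactly; the estimate of A is then one over the fitted slope. For comparison, the sketch also fits y against log(x) directly, which is possible because the model is linear in A.

```r
set.seed(1)
A_true <- 3
x <- seq(1, 50, length.out = 40)
y <- A_true * log(x) + rnorm(40, sd = 0.1)   # simulated noisy data

# Linearised form from the answer: log(x) = y / A, so slope = 1 / A
fit_inv <- lm(log(x) ~ y + 0)
A_inv   <- 1 / coef(fit_inv)[["y"]]

# Direct form for comparison: y = A * log(x), so slope = A
fit_dir <- lm(y ~ log(x) + 0)
A_dir   <- coef(fit_dir)[["log(x)"]]

c(inverse = A_inv, direct = A_dir)
```

Both estimates should land close to the true value when the noise is small; with noisier data the two fits treat the errors differently and can diverge.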
If I understand correctly, you want to estimate a function given some (x,y) values of it. If so, check the following links.
Read about this:
http://en.wikipedia.org/wiki/Spline_%28mathematics%29
http://en.wikipedia.org/wiki/Polynomial_interpolation
http://en.wikipedia.org/wiki/Newton_polynomial
http://en.wikipedia.org/wiki/Lagrange_polynomial
Googled it:
http://www.stat.wisc.edu/~xie/smooth_spline_tutorial.html
http://stat.ethz.ch/R-manual/R-patched/library/stats/html/smooth.spline.html
http://www.image.ucar.edu/GSP/Software/Fields/Help/splint.html
I have never used R, so I am not sure whether that works or not, but if you have Matlab I can explain more.
