plotting lrc in SSasymp in R

My question is similar to this unanswered one: working with SSasymp in r
For a simple SSmicmen:
x1 = seq(0, 10, 1)
y1 = SSmicmen(x1, Vm=10, K=0.5)
plot(y1 ~ x1, type="l")
the value of K is easily identified on the graph: the curve reaches half its maximum (y = Vm/2 = 5) at x = K = 0.5, i.e. at the point (0.5, 5).
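For instance (not in the original question), that point can be marked on the plot just drawn:
points(0.5, 5, pch = 19)   # x = K = 0.5, y = Vm/2 = 5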
Given a simple SSasympOrig:
x2 = seq(0, 10, 1)
y2 = SSasympOrig(x2, Asym=10, lrc=0.1)
# Asym*(1 - exp(-exp(lrc)*input))
plot(y2 ~ x2, type="l")
is there a way to represent and/or identify the meaning and/or effect of the parameter "lrc" on the resulting graph, in a similar way to the example above?

Sure, you can visualize this:
x2 = seq(0, 10, 0.01)
y2 = SSasympOrig(x2, Asym=10, lrc=0.1)
# Asym*(1 - exp(-exp(lrc)*input))
plot(y2 ~ x2, type="n")
for (lrc in (10^((-5):1))) {
  y2 = SSasympOrig(x2, Asym=10, lrc=lrc)
  # Asym*(1 - exp(-exp(lrc)*input))
  lines(y2 ~ x2, type="l", col = 6+log10(lrc))
}
This parameter controls how fast the asymptote is approached. Working this out from the equation only takes high-school-level maths. Alternatively, you could read the Wikipedia entry on half-life:
y2 = SSasympOrig(x2, Asym=10, lrc=0.1)
# Asym*(1 - exp(-exp(lrc)*input))
plot(y2 ~ x2, type="l")
# half the asymptote is reached at input = log(2) / exp(lrc)
points(x = log(2) / exp(0.1), y = 0.5 * 10)
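Building on the same idea, here is a rough sketch (not in the original answer) that marks the half-life point for several lrc values at once, which makes the effect of lrc on the curve explicit:
x2 <- seq(0, 10, 0.01)
plot(SSasympOrig(x2, Asym=10, lrc=1) ~ x2, type="n")   # empty frame with suitable limits
for (lrc in c(-2, -1, 0, 1)) {
  y2 <- SSasympOrig(x2, Asym=10, lrc=lrc)
  lines(y2 ~ x2, col = lrc + 3)
  # half the asymptote (y = 5) is reached at x = log(2) / exp(lrc)
  points(log(2) / exp(lrc), 5, col = lrc + 3, pch = 19)
}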

Related

R: Generate sine wave with variable frequency

This might be more of a math question than an R question, but here goes...
I'm trying to generate a low frequency oscillator (LFO2) where the frequency is controlled by another low frequency oscillator (LFO1). LFO1 has a frequency of 0.02 Hz while I want LFO2 to have a frequency that oscillates between 0.00 and 0.11 Hz dependent on the output of LFO1.
# length in seconds
track_length <- 356
upsample <- 10 # upsample the signal
# LFO rates (Hz)
rate1 <- 0.02
rate2_range <- list(0.00, 0.11)
# make plot of LFO1
x1 <- 1:(track_length*upsample)/upsample
amp <- (rate2_range[[2]] - rate2_range[[1]])/2
y1 <- amp*cos(2*pi*rate1*x1) + amp
plot(x1, y1, type='l')
The variable frequency for LFO2 generated by LFO1 looks exactly as I expected.
So I go on to make LFO2 using the output of LFO1 like so:
# make plot of LFO2
x2 <- x1
y2 <- cos(2*pi*y1*x2)
plot(x2, y2, type='l')
However, the output of LFO2 is not what I expected... It seems to keep getting faster, and some peaks don't reach the full range. I don't understand this, as the only thing I'm adjusting is the frequency, and it shouldn't go above 0.11 Hz. At first I thought it might be an undersampling issue, but I get the same results when upsampling the time series to any degree.
Any idea what I'm missing here?
The "frequency" of cos(f(t)) is not f(t). It's the derivative of f(t).
You have:
y1(t) = A*cos(2πf1t) + A
y2(t) = cos(2πy1(t))
If the frequency you want is Acos(2πf1t) + A, then you need to integrate that to get the argument to cos:
y1(t) = A*sin(2πf1t)/2πf1 + At
y2(t) = cos(2πy1(t))
In R:
# length in seconds
track_length <- 356
upsample <- 10 # upsample the signal
# LFO rates (Hz)
rate1 <- 0.02
rate2_range <- list(0.00, 2)
# make integral of LFO1
x1 <- 1:(track_length*upsample)/upsample
amp <- (rate2_range[[2]] - rate2_range[[1]])/2
y1 <- amp*sin(2*pi*rate1*x1)/(2*pi*rate1) + amp*x1
plot(x1, y1, type='l')
# make plot of LFO2
x2 <- x1
y2 <- cos(2*pi*y1 / upsample)
plot(x2, y2, type='l')
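As a rough sanity check (not part of the original answer), you can numerically differentiate the phase y1 and confirm that it traces the intended frequency curve from LFO1:
# numerical derivative of the phase; should match amp*cos(2*pi*rate1*t) + amp
inst_freq <- diff(y1) / diff(x1)
plot(x1[-1], inst_freq, type = "l")
lines(x1, amp * cos(2*pi*rate1*x1) + amp, col = "red", lty = 2)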
You are not restricting the data by amp as you did for the first plot, so it is normal to see the cos output oscillating between -1 and 1. You need to rescale the formula using max(y1) and min(y1).
So the code below,
y2 <- vector()
amp <- (max(y1) - min(y1))/2
for (i in 1:length(y1)) {
  y2[i] <- amp * cos(2*pi* y1[i] * x2[i]) + amp
}
plot(x2, y2, type='l', col="blue")
grid(nx = NULL, ny = NULL, col = "lightgray", lty = "dotted")
gives this plot,

Exponential decay fit in r

I would like to fit an exponential decay function in R to the following data:
data <- structure(list(x = 0:38, y = c(0.991744340878828, 0.512512332368168,
0.41102449265681, 0.356621905557202, 0.320851602373477, 0.29499198506227,
0.275037747162642, 0.25938850981822, 0.245263623938863, 0.233655093612007,
0.224041426946405, 0.214152907133301, 0.207475138903635, 0.203270738895484,
0.194942528735632, 0.188107106969046, 0.180926819430008, 0.177028560207711,
0.172595416846822, 0.166729221891201, 0.163502461048814, 0.159286528409165,
0.156110097827889, 0.152655498715612, 0.148684858095915, 0.14733605355542,
0.144691873223729, 0.143118852619617, 0.139542186417186, 0.137730138713745,
0.134353615271572, 0.132197800438632, 0.128369567159113, 0.124971834736476,
0.120027536018095, 0.117678812415655, 0.115720611113327, 0.112491329844252,
0.109219168085624)), class = "data.frame", row.names = c(NA,
-39L), .Names = c("x", "y"))
I've tried fitting with nls but the generated curve is not close to the actual data.
It would be very helpful if anyone could explain how to work with such nonlinear data and find a function of best fit.
Try y ~ .lin / (b + x^c). Note that when using "plinear" one omits the .lin linear parameter when specifying the formula to nls and also omits a starting value for it.
Also note that the .lin and b parameters are approximately 1 at the optimum so we could also try the one parameter model y ~ 1 / (1 + x^c). This is the form of a one-parameter log-logistic survival curve. The AIC for this one parameter model is worse than for the 3 parameter model (compare AIC(fm1) and AIC(fm3)) but the one parameter model might still be preferable due to its parsimony and the fact that the fit is visually indistinguishable from the 3 parameter model.
opar <- par(mfcol = 2:1, mar = c(3, 3, 3, 1), family = "mono")
# data = data.frame with x & y col names; fm = model fit; main = string shown above plot
Plot <- function(data, fm, main) {
  plot(y ~ x, data, pch = 20)
  lines(fitted(fm) ~ x, data, col = "red")
  legend("topright", bty = "n", cex = 0.7, legend = capture.output(fm))
  title(main = paste(main, "- AIC:", round(AIC(fm), 2)))
}
# 3 parameter model
fo3 <- y ~ 1/(b + x^c) # omit .lin parameter; plinear will add it automatically
fm3 <- nls(fo3, data = data, start = list(b = 1, c = 1), alg = "plinear")
Plot(data, fm3, "3 parameters")
# one parameter model
fo1 <- y ~ 1 / (1 + x^c)
fm1 <- nls(fo1, data, start = list(c = 1))
Plot(data, fm1, "1 parameter")
par(opar)
AIC
Adding the solutions from the other answers, we can compare the AIC values. We have labelled each solution by the number of parameters it uses (the degrees of freedom would be one greater than that). We have also reworked the log-log solution to use nls instead of lm, with y on the left-hand side, since one cannot compare AIC values of models that have different left-hand sides or that were fitted by different optimization routines, because the log-likelihood constants used could differ.
fo2 <- y ~ exp(a + b * log(x+1))
fm2 <- nls(fo2, data, start = list(a = 1, b = 1))
fo4 <- y ~ SSbiexp(x, A1, lrc1, A2, lrc2)
fm4 <- nls(fo4, data)
aic <- AIC(fm1, fm2, fm3, fm4)
aic[order(aic$AIC), ]
giving from best AIC (i.e. fm3) to worst AIC (i.e. fm2):
      df     AIC
fm3    4 -329.35
fm1    2 -307.69
fm4    5 -215.96
fm2    3 -167.33
A biexponential model would fit much better, though still not perfect. This would indicate that you might have two simultaneous decay processes.
fit <- nls(y ~ SSbiexp(x, A1, lrc1, A2, lrc2), data = data)
#A1*exp(-exp(lrc1)*x)+A2*exp(-exp(lrc2)*x)
plot(y ~x, data = data)
curve(predict(fit, newdata = data.frame(x)), add = TRUE)
If the measurement error depends on magnitude, you could consider using it for weighting.
However, you should consider carefully what kind of model you'd expect from your domain knowledge. Just selecting a non-linear model empirically is usually not a good idea. A non-parametric fit might be a better option.
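As a minimal sketch of what weighting could look like, assuming (purely for illustration, not something derived from the data) that the measurement error scales with y so that weights of 1/y^2 are sensible:
# weighted fit; the 1/y^2 weights are an illustrative assumption
fit_w <- nls(y ~ SSbiexp(x, A1, lrc1, A2, lrc2), data = data, weights = 1/y^2)
summary(fit_w)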
data <- structure(list(x = 0:38, y = c(0.991744340878828, 0.512512332368168,
0.41102449265681, 0.356621905557202, 0.320851602373477, 0.29499198506227,
0.275037747162642, 0.25938850981822, 0.245263623938863, 0.233655093612007,
0.224041426946405, 0.214152907133301, 0.207475138903635, 0.203270738895484,
0.194942528735632, 0.188107106969046, 0.180926819430008, 0.177028560207711,
0.172595416846822, 0.166729221891201, 0.163502461048814, 0.159286528409165,
0.156110097827889, 0.152655498715612, 0.148684858095915, 0.14733605355542,
0.144691873223729, 0.143118852619617, 0.139542186417186, 0.137730138713745,
0.134353615271572, 0.132197800438632, 0.128369567159113, 0.124971834736476,
0.120027536018095, 0.117678812415655, 0.115720611113327, 0.112491329844252,
0.109219168085624)), class = "data.frame", row.names = c(NA,
-39L), .Names = c("x", "y"))
# Shift x by 1 because log(0) is undefined
data$x = data$x + 1
fit = lm(log(y) ~ log(x), data = data)
plot(data$x, data$y)
# back-transform the log-log fit: y = exp(intercept) * x^slope
lines(data$x, exp(fit$coefficients[1]) * data$x ^ fit$coefficients[2], col = "red")
This did a lot better than using the nls formula, and when plotted the fit seems to do fairly well.

R: nls() error. "singular gradient matrix at initial parameter estimates"

I have a simple example below (which doesn't work) that attempts to do multivariate fit using the default algorithm (Gauss-Newton). I get the error: "Error in nlsModel(formula, mf, start, wts) : singular gradient matrix at initial parameter estimates".
## Defining the two independent x variables, and the one dependent y variable.
x1 = 1:100*.01
x2 = (1:100*.01)^2
y1 = 2*x1 + 0.5*x2
## Putting into a data.frame for the nls() function.
df = data.frame(x1, x2, y1)
## Starting parameters: a = 2.1, b = 0.4 (and taking c = 0)
fit_results <- nls(y1 ~ x1*a + x2*b + c, data=df, start=c(a=2.1, b=0.4, c=0))
Note: even when I set a = 2, and b = 0.5 above, I still get the same error message.
Thanks Brian; I'm not sure how to mark a comment as the selected answer. Here is code that works... it turns out I needed to add more randomness to the y1 dependent variable.
## Defining the two independent x variables, and the one dependent y variable.
x1 = 1:100*0.1
x2 = runif(100,0,10)
y1 = 2*x1 + 0.5*x2*runif(100,0.9,1.1)
## Putting into a data.frame for the nls() function.
df = data.frame(x1, x2, y1)
fit_results <- nls(y1 ~ x1*a + x2*b + c, data=df, start=c(a=2.1, b=0.4, c=0))
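As a side note (not from the original exchange), this particular model is linear in its parameters, so lm fits it directly, with no starting values and no risk of the singular-gradient error:
## equivalent linear fit; coefficients should come out near a = 2, b = 0.5, c = 0
fit_lm <- lm(y1 ~ x1 + x2, data = df)
coef(fit_lm)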

Predicting data from a power curve manually

I have a series of data I have fit a power curve to, and I use the predict function in R to allow me predict y values based on additional x values.
set.seed(1485)
len <- 24
x <- runif(len)
y <- x^3 + rnorm(len, 0, 0.06)
ds <- data.frame(x = x, y = y)
mydata=data.frame(x,y)
z <- nls(y ~ a * x^b, data = mydata, start = list(a=1, b=1))
#z is same as M!
power <- round(summary(z)$coefficients[1], 3)
power.se <- round(summary(z)$coefficients[2], 3)
plot(y ~ x, main = "Fitted power model", sub = "Blue: fit; green: known")
s <- seq(0, 1, length = 100)
lines(s, s^3, lty = 2, col = "green")
lines(s, predict(z, list(x = s)), lty = 1, col = "blue")
text(0, 0.5, paste("y =x^ (", power, " +/- ", power.se,")", sep = ""), pos = 4)
Instead of using the predict function here, how could I manually calculate estimated y values based on additional x values based on this power function. If this were just a simple linear regression, I would calculate the slope and y intercept and calculate my y values by
y = mx + b
Is there a similar equation I can use from the output of z that will allow me to estimate y values from additional x values?
> z
Nonlinear regression model
  model: y ~ a * x^b
   data: mydata
    a     b
1.026 3.201
 residual sum-of-squares: 0.07525
Number of iterations to convergence: 5
Achieved convergence tolerance: 5.162e-06
You would do it the same way, except you use the power equation you modelled. You can access the parameters the model calculated using z$m$getPars().
Here is a simple example to illustrate:
predict(z, list(x = 1))
Results in: 1.026125
which equals the result of
z$m$getPars()["a"] * 1 ^ z$m$getPars()["b"]
which is equivalent to y = a * x^b.
Here are some ways.
1) with. This evaluates the formula with respect to the coefficients:
x <- 1:2 # input
with(as.list(coef(z)), a * x^b)
## [1] 1.026125 9.437504
2) attach. We could also use attach, although it is generally frowned upon:
attach(as.list(coef(z)))
a * x^b
## [1] 1.026125 9.437504
3) explicit. Explicit definition:
a <- coef(z)[["a"]]; b <- coef(z)[["b"]]
a * x^b
## [1] 1.026125 9.437504
4) eval. This one extracts the formula from z so that we don't have to specify it again. formula(z)[[3]] is the right-hand side of the formula used to produce z. Use of eval is sometimes frowned upon, but it does avoid the redundant specification of the formula.
eval(formula(z)[[3]], as.list(coef(z)))
## [1] 1.026125 9.437504
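For repeated use, the same idea can be wrapped in a small helper (the name predict_power is just illustrative, not a standard function):
predict_power <- function(fit, x) {
  # evaluate y = a * x^b using the fitted coefficients of this model
  with(as.list(coef(fit)), a * x^b)
}
predict_power(z, 1:2)
## [1] 1.026125 9.437504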

R regularize coefficients in regression

I'm trying to use linear regression to figure out the best weighting for 3 models to predict an outcome. So there are 3 variables (x1, x2, x3) that are the predictions of the dependent variable, y. My question is: how do I run a regression with the constraint that the coefficients sum to 1? For example:
this is good:
y = .2(x1) + .4(x2) + .4(x3)
since .2 + .4 + .4 = 1
this is no good:
y = 1.2(x1) + .4(x2) + .3(x3)
since 1.2 + .4 + .3 > 1
I'm looking to do this in R if possible. Thanks. Let me know if this needs to get moved to the stats area ('Cross-Validated').
EDIT:
The problem is to classify each row as 1 or 0. y is the actual value (0 or 1) from the training set, x1 is the predicted value from a kNN model, x2 is from a randomForest, and x3 is from a gbm model. I'm trying to get the best weightings for each model, so each coefficient is <= 1 and the sum of the coefficients == 1.
Would look something like this:
y/Actual value   knnPred   RfPred   gbmPred
0                .1111     .0546    .03325
1                .7778     .6245    .60985
0                .3354     .1293    .33255
0                .2235     .9987    .10393
1                .9888     .6753    .88933
...              ...       ...      ...
The measure for success is AUC. So I'm trying to set the coefficients to maximize AUC while making sure they sum to 1.
There's very likely a better way that someone else will share, but you're looking for two parameters such that
b1 * x1 + b2 * x2 + (1 - b1 - b2) * x3
is close to y. To do that, I'd write an error function to minimize
minimizeMe <- function(b, x, y) { ## Calculates MSE
  mean((b[1] * x[, 1] + b[2] * x[, 2] + (1 - sum(b)) * x[, 3] - y) ^ 2)
}
and throw it to optim
fit <- optim(par = c(.2, .4), fn = minimizeMe, x = cbind(x1, x2, x3), y = y)
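The weight for x3 is implied by the constraint, so after the fit it can be recovered like this (a quick sketch, assuming fit converged):
w <- c(fit$par, 1 - sum(fit$par))  # weights for x1, x2, x3; they sum to 1 by construction
w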
No data to test on:
mod1 <- lm(y ~ 0+x1+x2+x3, data=dat)
mod2 <- lm(y/I(sum(coef(mod1))) ~ 0+x1+x2+x3, data=dat)
And now that I think about it some more, skip mod2, just:
coef(mod1)/sum(coef(mod1))
For the five rows shown, either round(knnPred) or round(gbmPred) gives perfect predictions, so there is some question whether more than one predictor is needed.
At any rate, to solve the question as stated, the following will give nonnegative coefficients that sum to 1 (except possibly for tiny differences due to computer arithmetic). Here a is the matrix of independent variables and b is the dependent variable; c and d define the equality constraint (coefficients sum to 1), and e and f define the inequality constraints (coefficients are nonnegative).
library(lsei)
a <- cbind(x1, x2, x3)
b <- y
c <- matrix(c(1, 1, 1), 1)
d <- 1
e <- diag(3)
f <- c(0, 0, 0)
lsei(a, b, c, d, e, f)
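As a quick check of the constraints (a sketch, assuming the lsei call above succeeded):
coefs <- lsei(a, b, c, d, e, f)
sum(coefs)    # should equal 1, up to floating-point error
range(coefs)  # all coefficients should be >= 0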
