predicting values with non-linear regression - r

My non-linear model is the following:
fhw <- data.frame(
time=c(10800, 10810, 10820, 10830, 10840, 10850, 10860, 10870, 10880, 10890),
water=c( 105, 103, 103, 104, 107, 109, 112, 113, 113, 112)
)
nl <- nls(formula = water ~ cbind(1,poly(time,4),sin(omega_1*time+phi_1),
sin(omega_2*time+phi_2),
sin(omega_3*time+phi_3)), data = fhw,
start = list(omega_1=(2*pi)/545, omega_2=(2*pi)/205,
omega_3=(2*pi)/85, phi_1=pi, phi_2=pi, phi_3=pi),
algorithm = "plinear", control = list(maxiter = 1000))
Time is between 10800 and 17220, but I want to predict ahead. Using the function predict like this:
predict(nl,data.frame(time=17220:17520))
gives wrong results: the first value it returns is completely different from the last value returned by predict(nl). I think the problem has something to do with poly, but I'm not sure. Furthermore, predicting at a single time point gives the error: 'degree' must be less than number of unique points. Can anybody help?
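The suspicion about poly can be checked in isolation. The sketch below is not the fitted model from the question; it only compares the orthogonal basis that poly() builds from the fitting times with the basis it builds when called again on new times. The names time_fit, time_new, basis_fit, basis_naive and basis_new are illustrative, and the new time values are made up for the demonstration.

```r
# Fitting times (from the sample data) and some hypothetical new times.
time_fit <- seq(10800, 10890, by = 10)
time_new <- seq(10900, 10940, by = 10)

# Orthogonal polynomial basis built from the fitting data.
basis_fit <- poly(time_fit, 4)

# Calling poly() again on the new times builds a different orthogonal
# basis, so coefficients estimated on basis_fit no longer apply to it.
basis_naive <- poly(time_new, 4)

# predict() on the poly object reuses its stored "coefs" attribute and
# evaluates the original basis functions at the new times.
basis_new <- predict(basis_fit, time_new)

# Largest absolute difference between the two 5 x 4 matrices.
max(abs(as.vector(basis_naive) - as.vector(basis_new)))

# A single time point fails with a fresh poly() call but works with predict().
single_naive <- tryCatch(poly(10900, 4), error = function(e) conditionMessage(e))
single_safe <- predict(basis_fit, 10900)
```

If this is indeed the cause, one possible workaround would be to compute the polynomial columns outside the nls formula and build the columns for new data with predict(basis_fit, ...); whether that is enough for the full model has not been verified here.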

Related

Writing a function to run an F-test with two separate Shapiro-Wilk tests conducted inside the function

The function should perform as follows:
i. The function takes the arguments: x1, x2, alt = "two-sided", lev = 0.95, where the equality indicates the default value.
• The arguments x1 and x2 are the X1 and X2 samples, respectively.
• The argument alt is the alternative hypothesis, whose two other possible values are "greater" and "less".
• The argument lev is the confidence level 1 − α.
ii. The function returns an R list containing the test statistic, p-value, confidence level, and confidence interval.
iii. Inside the function, two Shapiro-Wilk tests of normality are conducted separately for the two samples (note the normality assumption at the beginning of the problem). If one or both p-values are less than 0.05, a warning message is printed out explaining the situation.
Here is what I have come up with so far, but I am not sure how to create a single function that runs both tests:
library(stats)
x1 <- c(103, 94, 110, 87, 98, 102, 86, 98, 109, 92)
x2 <- c(97, 82, 123, 92, 175, 88, 118, 81, 165, 97, 134, 92, 87, 114)
var.test(x1, x2, alternative = "two.sided", conf.level = 0.95)
shapiro.test(x1)$p.value < 0.05|shapiro.test(x2)$p.value < 0.05
Some hints:
Your task is to write a function, so you should have something like this:
my_function <- function(x1, x2, alt = "two-sided", lev = 0.95){
# fill in the body of the function here
}
You can do whatever you need to do in the body of the function.
Recall that in R, the last evaluated line of a function is automatically its returned value. So, you might choose to have your last line be list(...) as described in the problem statement.
It will be useful to store results of tests, etc. as variables inside your function, e.g. test_output_1 <- ... so that you can reference those things later in the body of your function.
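Putting the hints together, one possible shape for such a function is sketched below. The function name f_test_with_normality_check, the names of the list elements and the wording of the warning are all assumptions, not part of the problem statement. Note that var.test() spells the two-sided alternative "two.sided", so the sketch translates "two-sided" before passing it on.

```r
f_test_with_normality_check <- function(x1, x2, alt = "two-sided", lev = 0.95) {
  # var.test() expects "two.sided", so translate the spelling used in the task.
  alt_r <- if (identical(alt, "two-sided")) "two.sided" else alt
  alt_r <- match.arg(alt_r, c("two.sided", "greater", "less"))

  # Separate Shapiro-Wilk normality tests for the two samples.
  sw1 <- shapiro.test(x1)
  sw2 <- shapiro.test(x2)
  if (sw1$p.value < 0.05 || sw2$p.value < 0.05) {
    warning("At least one sample has a Shapiro-Wilk p-value below 0.05; ",
            "the normality assumption of the F-test may not hold.")
  }

  # F-test for equality of variances.
  f_out <- var.test(x1, x2, alternative = alt_r, conf.level = lev)

  # The last evaluated expression is the return value.
  list(statistic  = unname(f_out$statistic),
       p.value    = f_out$p.value,
       conf.level = lev,
       conf.int   = as.numeric(f_out$conf.int))
}

# Usage with the samples from the question.
x1 <- c(103, 94, 110, 87, 98, 102, 86, 98, 109, 92)
x2 <- c(97, 82, 123, 92, 175, 88, 118, 81, 165, 97, 134, 92, 87, 114)
res <- f_test_with_normality_check(x1, x2)
```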

Lookup value from another matrix

I have a stepwise tariff structure for treatments. Treatments between 0 and 33 hours receive the tariff $96, treatments between 34 and 96 hours receive 224, etc.
I would like to create a graph of treatment hours and tariffs, with hours on the x-axis and tariff on the y-axis. To do that, I need to create a variable that gives the corresponding tariff for each treatment hour ('hr'). How do I do this in R?
min <- c(381, 201, 97, 34, 0)
max <- c(NA, 380, 200, 96, 33)
tariff2019 <- c(779, 536, 368, 224, 96)
dat <- data.frame(hr=seq(401))
dat$tariff <-
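One way to fill in the missing assignment is a step lookup with findInterval(), sketched below. It assumes the bands are exactly those listed above, re-entered in ascending order under the illustrative names lower_bound and band_tariff (the question's min and max vectors mask base R functions, so they are not reused here).

```r
# Lower bound of each tariff band in ascending order, with its tariff.
lower_bound <- c(0, 34, 97, 201, 381)
band_tariff <- c(96, 224, 368, 536, 779)

dat <- data.frame(hr = seq(401))

# findInterval() returns, for each hr, the index of the band whose lower
# bound is the largest one not exceeding hr.
dat$tariff <- band_tariff[findInterval(dat$hr, lower_bound)]

head(dat)
```

A step plot of the result could then be drawn with plot(dat$hr, dat$tariff, type = "s").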

Confidence Interval for Non Linear Regression Model

My data consist of two columns: time and cumulative number like below:
time <- c(1:14)
cum.num <- c(20, 45, 99, 195, 301, 407, 501, 582, 679, 753, 790, 861, 1011, 1441)
My non linear function is:
c1*cos(0.6731984259*time)+c2*sin(0.6731984259*time)+c3*(time)^2+c4*time+c5
My objective is to fit this function by non-linear regression using nls() in R and to compute confidence intervals for the parameters. I have done the following:
m1.fit <- nls(cum.num ~ c1*cos(0.6731984259*time) + c2*sin(0.6731984259*time) + c3*(time)^2 + c4*time + c5, start = list(c1=-50, c2=-60, c3=5, c4=8, c5=100))
I got an error while computing the confidence intervals. I tried the following:
confint(m1.fit)
After issuing this command, I got the following error:
Waiting for profiling to be done...
Error in prof$getProfile() :
step factor 0.000488281 reduced below 'minFactor' of 0.000976562
Can anyone help me in this regard?
Try package nlstools:
> nlstools::confint2(m1.fit)
          2.5 %     97.5 %
c1   -48.556270  54.959689
c2  -175.654079 -45.216965
c3     3.285062   9.529072
c4   -49.254627  46.007629
c5   -34.135835 272.864743
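For readers without nlstools, Wald-type (asymptotic) intervals can also be computed from the fitted object in base R, which avoids the profiling step that failed. The sketch below assumes the response is the cum.num vector defined in the question; omega, est, se, t_crit and wald_ci are illustrative names. It is an approximation of the asymptotic kind and is not guaranteed to reproduce the output above digit for digit.

```r
time <- 1:14
cum.num <- c(20, 45, 99, 195, 301, 407, 501, 582, 679, 753, 790, 861, 1011, 1441)
omega <- 0.6731984259  # fixed frequency from the question

m1.fit <- nls(cum.num ~ c1*cos(omega*time) + c2*sin(omega*time) +
                c3*time^2 + c4*time + c5,
              start = list(c1 = -50, c2 = -60, c3 = 5, c4 = 8, c5 = 100))

# Wald interval: estimate +/- t quantile * standard error.
est    <- coef(m1.fit)
se     <- sqrt(diag(vcov(m1.fit)))
t_crit <- qt(0.975, df.residual(m1.fit))
wald_ci <- cbind(lower = est - t_crit * se, upper = est + t_crit * se)
wald_ci
```

Because the frequency is fixed, the model is linear in c1 to c5, so the same fit could also be obtained with lm() and its confint() method.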

exponential function in R

I have basic knowledge of R and would like to know how to write code for an exponential function in R:
F(X)=B(1-e^-AX)
where A is the lambda (rate) parameter, B is a parameter scaling the curve to the Y data, and X is the X data below.
I need the exponential model to generate a curve that fits the data, for example:
X <- c(22, 44, 69, 94, 119, 145, 172, 199, 227, 255)
PS: these are the x-axis values (in millions).
Y <- c(1, 7, 8, 12, 12, 14, 14, 18, 19, 22)
These are the y-axis values.
Any idea how to write the code and fit this model to the data?
In R you can write an exponential function with exp(); in your case:
F <- B*(1-exp(-A*X))
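To actually estimate A and B from the data, the expression can be handed to nls(). The sketch below is one way to do that; the starting values are rough guesses read off the data (an assumption, not part of the original answer), and the names sat_exp, fit and X_grid are illustrative.

```r
X <- c(22, 44, 69, 94, 119, 145, 172, 199, 227, 255)
Y <- c(1, 7, 8, 12, 12, 14, 14, 18, 19, 22)

# The model F(X) = B * (1 - exp(-A * X)) as an R function.
sat_exp <- function(X, A, B) B * (1 - exp(-A * X))

# Least-squares fit; start values are rough guesses, not known results.
fit <- nls(Y ~ sat_exp(X, A, B), start = list(A = 0.003, B = 40))
coef(fit)

# Fitted curve on a fine grid, e.g. for adding to a plot of the data.
X_grid <- seq(min(X), max(X), length.out = 100)
Y_hat  <- predict(fit, newdata = data.frame(X = X_grid))
```

The curve could then be drawn with plot(X, Y) followed by lines(X_grid, Y_hat). If nls() does not converge for other data, different starting values are usually the first thing to try.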

R: mix() in mixdist package returning error

I have installed the mixdist package in R to combine distributions. Specifically, I'm using the mix() function. See documentation.
Basically, I'm getting
Error in nlm(mixlike, lmixdat = mixdat, lmixpar = fitpar, ldist = dist, :
missing value in parameter
I googled the error message, but no useful results popped up.
My first argument to mix() is a data frame called data.df. It is formatted exactly like the built-in data set pike65. I also did data.df <- as.mixdata(data.df).
My second argument has two rows. It is a data frame called datapar, formatted exactly like pikepar. My pi values are 0.5 and 0.5. My mu values are 250 and 463 (based on my data set). My sigma values are 0.5 and 1.
My call to mix() looks like:
fitdata <- mix(data.df, datapar, "norm", constr = mixconstr(consigma="CCV"), emsteps = 3, print.level = 2)
The printing shows that my pi values go from 0.5 to NaN after the first iteration, and that my gradient is becoming 0.
I would appreciate any help in sorting out this error.
Thanks,
n.i.
Using the test data you linked to
library(mixdist)
time <- seq(673,723)
counts <-c(3,12,8,12,18,24,39,48,64,88,101,132,198,253,331,
419,563,781,1134,1423,1842,2505,374,6099,9343,13009,
15097,13712,9969,6785,4742,3626,3794,4737,5494,5656,4806,
3474,2165,1290,799,431,213,137,66,57,41,35,27,27,27)
data.df <- data.frame(time=time, counts=counts)
data.mix <- as.mixdata(data.df)
We can see that
startparam <- mixparam(c(699,707),1 )
data.fit <- mix(data.mix, startparam, "norm")
gives the same error. This error appears to be closely tied to the data, so the reason this data set fails could be different from the reason yours fails, but this is the only example you offered.
The problem with this data is that the probabilities for the two groups become indistinguishable at some point. When that happens, the "E" step of the algorithm cannot estimate the pi variable properly. Here
pnorm(717,707,1)
# [1] 1
pnorm(717,699,1)
# [1] 1
both are exactly 1 and this seems to be causing the error. When mix takes 1 minus this value and compares the ratio to estimate group, it gets NaN values which are propagated to the estimate of proportions. When internally these NaN values are passed to nlm() to do the estimation, you get the error message
Error in nlm(mixlike, lmixdat = mixdat, lmixpar = fitpar, ldist = dist, :
missing value in parameter
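The numerical effect described above can be reproduced without the package. The sketch below is not the internal code of mix(); it only shows, under that caveat, how a ratio built from 1 minus two saturated pnorm values turns into NaN, and that the upper tail requested directly is still a tiny positive number. The names p1, p2 and ratio are illustrative.

```r
p1 <- pnorm(717, mean = 707, sd = 1)
p2 <- pnorm(717, mean = 699, sd = 1)

# Both complements are exactly 0 in double precision, so the ratio is 0/0.
ratio <- (1 - p1) / ((1 - p1) + (1 - p2))
is.nan(ratio)

# Asking for the upper tail directly avoids the cancellation.
upper1 <- pnorm(717, mean = 707, sd = 1, lower.tail = FALSE)
upper2 <- pnorm(717, mean = 699, sd = 1, lower.tail = FALSE)
c(upper1, upper2)
```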
The same error message can be replicated with
f <- function(x) sum((x-1:length(x))^2)
nlm(f, c(10,10))
nlm(f, c(10,NaN)) #error
So it appears the mixdist package will not work in this scenario. You may wish to contact the package maintainer to see if they are aware of the problem. In the meantime you will need to find another way to estimate the parameters of your mixture model.
Now, I am not an expert in mixture distributions, but I think @MrFlick's accepted answer is a little misleading for anyone googling the error message (although no doubt correct for the example he gave). The core problem is that in both your linked code and your example, the sigma values are very small compared to the mu values. I think the algorithm simply cannot find a solution with such small starting sigma values. If you increase the sigma values, you will get a solution. Linked code as an example:
library(mixdist)
time <- seq(673,723)
counts <- c(3, 12, 8, 12, 18, 24, 39, 48, 64, 88, 101, 132, 198, 253, 331, 419, 563, 781, 1134, 1423, 1842, 2505, 374, 6099, 9343, 13009, 15097, 13712, 9969, 6785, 4742, 3626, 3794, 4737, 5494, 5656, 4806, 3474, 2165, 1290, 799, 431, 213, 137, 66, 57, 41, 35, 27, 27, 27)
data.df <- data.frame(time=time, counts=counts)
data.mix <- as.mixdata(data.df)
startparam <- mixparam(mu = c(699,707), sigma = 1)
data.fit <- mix(data.mix, startparam, "norm") ## Leads to the error message
startparam <- mixparam(mu = c(699,707), sigma = 5) # Adjust start parameters
data.fit <- mix(data.mix, startparam, "norm")
plot(data.fit)
data.fit ### Estimates somewhat reasonable mixture distributions
# Parameters:
# pi mu sigma
# 1 0.853 699.3 4.494
# 2 0.147 708.6 2.217
The bottom line: if you can increase your starting sigma values, the mix function might find reasonable estimates for you. You do not necessarily have to try another package.
In addition, you can get this message if you have missing data in your data set.
Using the example data set:
data(pike65)
data(pikepar)
pike65$freq[10] <- NA
fitpike1 <- mix(pike65, pikepar, "lnorm", constr = mixconstr(consigma = "CCV"), emsteps = 3)
Error in nlm(mixlike, lmixdat = mixdat, lmixpar = fitpar, ldist = dist, :
missing value in parameter
