User-specified Z matrix in lme - r

I have been searching for a long time for a way to do this in R and cannot find anything! Basically, I want to shrink predictors using a linear mixed model (LMM). I have a set of fixed effects, X, and a set of predictors, Z, that I want to put a random effect on, so the model is
Y=X*beta+Z*u+e
where u~N(0,sigma_u^2 * I) and e ~ N(0,sigma_e^2 * I). I thought I could do this in lme with
fit <- lme(Y~X,random=pdIdent(~-1+Z))
but I only get the error:
Error in getGroups.data.frame(dataMix, groups) :
invalid formula for groups
Any help on this issue is much appreciated.

Have you tried:
library(nlme)
N <- length(Y)       # sample size
group <- rep(1, N)   # a single dummy group containing every observation
fit <- lme(Y~X, random=list(group=pdIdent(~-1+Z)))
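To make the dummy-group idea concrete, here is a minimal sketch on simulated data. The data, the dimensions, and the names `dat` and `u_hat` are assumptions for illustration only; the variables are passed through a data frame (with Z stored as a matrix column) so that lme does not depend on finding them in the calling environment.

```r
library(nlme)

# Simulated data; all values and dimensions are hypothetical.
set.seed(1)
N <- 200                                  # sample size
p <- 10                                   # number of predictors to shrink
X <- rnorm(N)                             # one fixed-effect covariate
Z <- matrix(rnorm(N * p), nrow = N)       # predictors that get a random effect
u <- rnorm(p, sd = 0.5)                   # true random coefficients
Y <- 1 + 2 * X + drop(Z %*% u) + rnorm(N)

# One dummy group holding every observation, as in the answer above.
dat <- data.frame(Y = Y, X = X, group = rep(1, N))
dat$Z <- Z                                # store Z as a matrix column

# pdIdent gives all columns of Z a common variance sigma_u^2.
fit <- lme(Y ~ X, data = dat, random = list(group = pdIdent(~ -1 + Z)))

fixef(fit)                    # estimates of beta
u_hat <- unlist(ranef(fit))   # shrunken predictions of u, one per column of Z
```

Because pdIdent forces a single variance for all columns of Z, the predicted u are shrunk toward zero by a common amount, which is the ridge-like behaviour the question asks for.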


Fixed coefficient/Offset in Fine&Gray competing-risk adjusted model (FGR)

I want to fit a Fine&Gray competing-risk adjusted model including an offset. In other types of models, I am used to being able to simply put in offset(x), which adds an offset with its coefficient fixed at 1.
I tried to do the same using the FGR function from the package riskRegression. I didn't get a warning message, but I then noticed that the coefficients of the other variables were exactly the same for the model with and without offset(x).
Example:
#install.packages("riskRegression")
library(riskRegression)
matrix <- matrix(c(3,6,3,2,5,4,7,2,8,2,
0.8,0.6,0.4,0.25,0.16,0.67,0.48,0.7,0.8,0.78,
60,55,61,62,70,49,59,63,62,64,
15,16,18,12,16,13,19,12,15,14,
0,2,1,0,1,1,0,1,2,0,
345,118,225,90,250,894,128,81,530,268),
nrow=10,ncol=6)
df <- data.frame(matrix)
colnames(df) <- c("x","y","z", "a","event","time")
fit <- FGR(Hist(time,event)~ offset(x)+a+y+z, data=df, cause=1)
fit
fit2 <- FGR(Hist(time,event)~ a+y+z, data=df, cause=1)
fit2
If you run this script, you can see that the coefficients of a, y and z do not change, and you do not get a warning that offset cannot be used (so apparently offset(x) was simply ignored).
Does anybody know of a way to include x as an offset (i.e. with coefficient fixed at 1) in FGR? (Edit: Or another way to calculate the correct coefficients for a, y and z with fixed x?)
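You can use the survival package, whose coxph function accepts offsets: just wrap the variable you want as an offset in offset(var). The call below models event 1. Note that, written this way, every other outcome (including the competing event 2) is treated as censored, so this is a cause-specific Cox model rather than a Fine-Gray subdistribution model; for the latter, the survival package provides finegray(), which builds the weighted data set that coxph is then fitted to. See code below: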
library(survival)
matrix <- matrix(c(3,6,3,2,5,4,7,2,8,2,
0.8,0.6,0.4,0.25,0.16,0.67,0.48,0.7,0.8,0.78,
60,55,61,62,70,49,59,63,62,64,
15,16,18,12,16,13,19,12,15,14,
0,2,1,0,1,1,0,1,2,0,
345,118,225,90,250,894,128,81,530,268),
nrow=10,ncol=6)
df <- data.frame(matrix)
colnames(df) <- c("x","y","z", "a","event","time")
coxph(Surv(time,event==1)~ offset(x)+a+y+z, data=df)
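For a subdistribution-hazard (Fine-Gray) fit with an offset, one possible route within the survival package is to expand the data with finegray() and then fit a weighted coxph. The sketch below uses simulated data rather than the data from the question, since ten observations are too few for a stable fit; the data frame `sim`, its columns, and the labels "censored", "cause1" and "cause2" are all assumptions made for illustration.

```r
library(survival)

# Simulated data; every name and value here is hypothetical.
set.seed(42)
n <- 200
sim <- data.frame(
  x     = rnorm(n, sd = 0.5),   # variable to be entered as an offset
  a     = rnorm(n),
  y     = runif(n),
  time  = rexp(n),
  event = sample(0:2, n, replace = TRUE, prob = c(0.3, 0.4, 0.3))
)

# finegray() expects a factor status whose first level denotes censoring.
sim$status <- factor(sim$event, levels = 0:2,
                     labels = c("censored", "cause1", "cause2"))

# Expand to the weighted counting-process data set for cause 1.
fg <- finegray(Surv(time, status) ~ x + a + y, data = sim, etype = "cause1")

# Weighted Cox fit on the expanded data; offset(x) fixes the coefficient of x at 1.
fg_fit <- coxph(Surv(fgstart, fgstop, fgstatus) ~ offset(x) + a + y,
                weights = fgwt, data = fg)
coef(fg_fit)   # coefficients for a and y only; x has no estimated coefficient
```

Comparing this fit with one that omits offset(x) is a quick check that the offset is actually being used: unlike the FGR output in the question, the coefficients of the remaining covariates should generally differ.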

exponential fit singularity in r

I'm trying to fit a bi-exponential function to this dataset but I can't get it to converge. I've tried using an nls2 grid search to find a good starting point, and I've also tried nlsLM, but neither method works. Does anyone have any suggestions?
function: y = a1*exp(-n1*t) + a2*exp(-n2*t) + c
here is the code:
y <- c(1324,1115,1140,934,1013,982,1048,1143,754,906,895,900,765,808,680,731,728,794,706,531,629,629,519,514,516,454,465,630,415,347,257,363,275,379,329,263,301,315,283,354,230,257,196,268,262,236,220,239,255,213,275,273,294,169,257,178,207,169,169,297,
227,189,214,168,263,227,185,220,169,229,174,231,178,141,195,223,258,206,181,200,150,200,169,194,230,162,174,194,225,216,196,213,150,235,231,224,244,161,219,222,210,
186,188,197,177,251,248,223,273,145,257,236,214,194,211,213,175,168,223,192,318,
263,234,163,202,239,189,216,206,185,185,191,340,145,188,305,112,252,213,245,240,196,196,179,235,241,177,196,191,181,240,164,202,201,306,214,212,185,192,178,203,203,239,141,203,190,216,174,219,153,177,223,207,186,213,173,210,191,258,277)
t <- seq(1,length(y),1)
mydata <- data.frame(t=t,y=y)
library(nls2)
fo <- y~a1*exp(-n1*t)+a2*exp(-n2*t)+c
grd <- expand.grid(a1=seq(-12030,1100,by=3000),
a2=seq(-22110,1900,by=2000),
n1=seq(0.01,.95,by=0.4),
n2=seq(0.02,.9,by=0.25),
c=seq(100,400,by=50))
# Brute-force grid search for starting values (the data frame defined above is mydata)
fit <- nls2(fo, data=mydata, start=grd, algorithm='brute-force', control=list(maxiter=100))
# Refine the best grid point with nls
fit2 <- nls(fo, data=mydata, start=as.list(coef(fit)), control=list(minFactor=1e-12, maxiter=200), trace=FALSE)
error: maximum iteration exceeded
However, if I use nlsLM, I get a "singular gradient matrix at initial parameter estimates" error instead.

Fit state space model using dlm

I am about to fit a state space model to a univariate time series (y_t). The model I am trying to fit is:
y_t=F x_t+\epsilon_t, \epsilon_t \sim N(0,V)
x_{t+1}=G x_t+w_t, w_t \sim N(0,W)
x_0 \sim N(m_0,C_0)
I use the following R-code:
library(dlm)
# Create function of unknown parameters, which returns dlm object
Build <- function(theta) {dlm(FF=theta[1],
GG=theta[2],V=theta[3],W=theta[4],m0=theta[5],
C0=theta[6])}
# Fit model to data using MLE
f1 <- dlmMLE(y,parm=c(1,1,0.1,0.1,0,0.1),Build)
But I get the following error message (after running f1):
Error in dlm(FF = theta[1], GG = theta[2], V = theta[3], W = theta [4], :
V is not a valid variance matrix
My problem is that I don't understand why V is not a valid variance matrix..
Does anyone know what is wrong?
Thank you in advance
Regards fuente
EDIT:
I tried doing the same, but instead of my real data I used:
y <- rnorm(72,6.44,1.97)
This produced, however, an error involving W (and not V...):
Error in dlm(FF = theta[1], GG = theta[2], V = theta[3], W = theta[4], :
W is not a valid variance matrix
I'm confused. Does it have something to do with the starting values passed to parm=...?

Error when using msmFit in R

I'm trying to replicate this paper (Point Forecast Markov Switching Model for U.S. Dollar/ Euro Exchange Rate, by Hamidreza Mostafei) in R. The table that I'm trying to reproduce is on page 483. Here is a link to a pdf.
I wrote the following code and got an error at the last line:
library(MSwM)
mydata <- read.csv("C:\\Users\\User\\Downloads\\EURUSD_2.csv", header=T)
mod <- lm(EURUSD~EURUSD.1, mydata)
mod.mswm = msmFit(mod, k=2, p=1, sw=c(T,T,T,T), control=list(parallel=F))
Error in if ((max(abs(object["Fit"]["logLikel"] - oldll))/(0.1 + max(abs(object["Fit"]["logLikel"]))) < :
missing value where TRUE/FALSE needed
Basically the data that's being used is EURUSD, which is the level change in monthly frequency. EURUSD.1 is the one lag variable. Both EURUSD and EURUSD.1 are in my csv file. (I'm not sure how to attach the csv file here. If someone could point that out that would be great).
I changed the EURUSD.1 values to something random and msmFit function seemed to work. But whenever I tried using the original value, i.e. the lag value, the error came out.
Something degenerate is happening when one variable is simply lagged from the other. Consider:
Sample data frame where Y is lagged X:
> d = data.frame(X=runif(100))
> d$Y=c(.5, d$X[-100])
> mod <- lm(X~Y,d)
> mod.mswm = msmFit(mod, k=2, p=1, sw=c(T,T,T,T), control=list(parallel=F))
Error in if ((max(abs(object["Fit"]["logLikel"] - oldll))/(0.1 + max(abs(object["Fit"]["logLikel"]))) < :
missing value where TRUE/FALSE needed
that gives your error. Let's add a tiny tiny bit of noise to Y and see what happens:
> d$Y=d$Y+rnorm(100,0,.000001)
> mod <- lm(X~Y,d)
> mod.mswm = msmFit(mod, k=2, p=1, sw=c(T,T,T,T), control=list(parallel=F))
> mod.mswm
Markov Switching Model
Call: msmFit(object = mod, k = 2, sw = c(T, T, T, T), p = 1, control = list(parallel = F))
AIC BIC logLik
4.3109 47.45234 3.84455
Coefficients:
(Intercept)(S) Y(S) X_1(S) Std(S)
Model 1 0.8739622 -22948.89 22948.83 0.08194545
Model 2 0.4220748 77625.21 -77625.17 0.21780764
Transition probabilities:
Regime 1 Regime 2
Regime 1 0.3707261 0.3886715
Regime 2 0.6292739 0.6113285
It works! Now either:
Having perfectly lagged variables causes some "divide by zero" error because it's a purely degenerate case (like having perfectly collinear variables in a linear model). A little experimenting shows that the resulting output is very sensitive to how much noise you add, so I'm thinking it's on a knife-edge here.
or
There's some bug in the function.
I have no idea what msmFit does, so that's for you to sort out.
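One plausible reading of the output above, not verified against the MSwM source: the coefficient table has an X_1(S) column, which suggests that p=1 adds the first lag of the response as a regressor. If Y is already that lag, the two regressors are identical, which would also explain why the fitted Y(S) and X_1(S) coefficients are huge and almost exactly cancel. The base-R sketch below reproduces only the collinearity, using lm instead of msmFit; the names `d` and `X_lag1` are made up for illustration.

```r
# Base-R illustration of the suspected degeneracy; this does not call msmFit.
set.seed(1)
n <- 100
X <- runif(n)

d <- data.frame(
  X      = X[-1],     # response from time 2 onward
  Y      = X[-n],     # the user-supplied regressor: X lagged by one step
  X_lag1 = X[-n]      # the AR(1) regressor that p = 1 would presumably add
)

# The two regressors are exactly the same column.
identical(d$Y, d$X_lag1)

# lm detects the rank deficiency and reports NA for the aliased coefficient.
fit <- lm(X ~ Y + X_lag1, data = d)
coef(fit)
```

If this is what happens inside msmFit, then either dropping the lagged column from the lm call and letting p=1 supply it, or keeping the column and setting p=0, would be the natural things to try, rather than adding noise.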

Error with pls function in mixOmics package

I am trying to use the pls function in the mixOmics package.
The code I have is the following:
library(mixOmics)
a = rnorm(100)
X = cbind(1, a, a^2, a^3)
Y = rnorm(100)
pls(X,Y)
When I run it, I get the following error message:
In pls(X, Y) : Zero- or near-zero variance predictors.
Reset predictors matrix to not near-zero variance predictors.
See $nzv for problematic predictors.
But I don't understand where the problem is!
The error tells you that one of your input variables (or column) in X has zero or very little variance.
Here, the problem is simply that your X in pls(X,Y) contains a column with constant values, so that the variance of this variable is exactly zero.
If you remove this column from your data, the pls will work ;)
X = X[,-1]
pls(X,Y)
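To find such columns yourself before calling pls, you can check the column variances directly. This is a base-R sketch that does not use mixOmics; the tolerance 1e-12 and the names `col_var`, `const_cols` and `X_clean` are arbitrary choices for illustration.

```r
set.seed(1)
a <- rnorm(100)
X <- cbind(1, a, a^2, a^3)   # first column is the constant 1

# Variance of each column; a constant column has variance 0.
col_var <- apply(X, 2, var)

# Flag columns whose variance is (numerically) zero.
const_cols <- which(col_var < 1e-12)
const_cols

# Drop them, keeping X a matrix even if only one column remains.
X_clean <- X[, -const_cols, drop = FALSE]
dim(X_clean)
```

Note that X[, -const_cols] only works when at least one column is flagged; with an empty index it would drop every column, so guard with length(const_cols) > 0 in real code.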
