This is my data:
'data.frame': 72 obs. of 7 variables:
$ X1 : chr "2011M1" "2011M2" "2011M3" "2011M4" ...
$ KPR : int 0 0 0 0 0 0 0 0 0 0 ...
$ LTV : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ sukubunga: num 6.5 6.5 6.5 6.5 6.5 6.5 6.5 6.5 6.5 6.5 ...
$ inflasi : num 0.89 0.13 -0.32 -0.31 0.12 0.55 0.67 0.93 0.27 -0.12 ...
$ npl : num 2.31 2.39 2.22 2.2 2.12 ...
$ sbkredit : num 11.4 11.4 11.3 11.4 11.3 ...
I use the glarma package, and these are my steps:
library(readr)
b <- read_csv("E:/b.csv")
dataku<-as.data.frame(b)
dataku$LTV<-as.factor(dataku$LTV)
dataku$LTV<-relevel(dataku$LTV,ref="0")
glmmo<-glm(KPR~LTV+sbkredit+inflasi+npl,data=dataku,family=binomial(link=logit),na.action=na.omit,x=TRUE)
summary(glmmo)
X<-glmmo$x
X<-as.matrix(X)
y1<-dataku$KPR
n1<-rep(1,length(dataku$X))
Y<-cbind(y1,n1-y1)
Y<-as.matrix(Y)
library(glarma)
glarmamo<-glarma(Y,X,phiLags=c(1),phiInit=c(0.6),type="Bin",method="FS",residuals="Pearson",maxit=100,grad=1e-6)
But I get this error:
Error in GL$cov %*% GL$ll.d : requires numeric/complex matrix/vector
arguments
When I multiply GL$cov %*% GL$ll.d for
So, what should I do?
Related
I am trying to run a Shapiro-Wilk normality test in R (Rcmdr, to be more accurate) by going to "Statistics=>Summary=>Descriptive statistics", then selecting one of my dependent variables and choosing "summary by group".
Rcmdr automatically generates the following code:
normalityTest(Algometre.J0 ~ Modalite, test="shapiro.test",
data=Dataset)
And I am getting the following error message:
'groups' must be a factor.
I have already set my independent variable as a factor (I swear I did!).
Any idea what's wrong?
Thanks in advance.
Here is what str(Dataset) shows:
'data.frame': 76 obs. of 11 variables:
$ Modalite : chr "C" "C" "C" "C" ...
$ Angle.J0 : num 20.1 20.5 21 22.5 19.1 ...
$ Angle.J1 : num 21.7 22.6 22.8 23.3 20.5 ...
$ Angle.J2 : num 22.3 23 23.9 24.2 21 ...
$ Epaisseur.J0: num 1.97 1.54 1.76 1.89 1.53 1.87 1.54 2 1.79 1.41 ...
$ Epaisseur.J1: num 2.07 1.49 1.87 1.91 1.54 1.9 1.51 2.03 1.71 1.48 ...
$ Epaisseur.J2: num 2.08 1.69 1.77 2 1.61 1.99 1.38 2.06 1.86 1.53 ...
$ Algometre.J0: num 45 40 105 165 66.3 ...
$ Algometre.J1: num 32.7 39.7 91.7 124 63.7 ...
$ Algometre.J2: num 51.3 58.7 101 138 60.3 ...
$ ObsNumber : int 1 2 3 4 5 6 7 8 9 10 ...
What does that mean?
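The str() output above lists Modalite as chr, so the column is a character vector rather than a factor, which would explain the message. A minimal sketch, assuming a toy Dataset with invented values standing in for the real one: convert the column with factor() and run shapiro.test() within each group using base R only.

```r
# Toy stand-in for the real Dataset (values are invented for illustration).
set.seed(1)
Dataset <- data.frame(
  Modalite = rep(c("C", "T"), each = 10),
  Algometre.J0 = c(rnorm(10, mean = 60, sd = 15), rnorm(10, mean = 80, sd = 15)),
  stringsAsFactors = FALSE
)

# The grouping column is character here, as in the str() output.
print(is.factor(Dataset$Modalite))

# Convert it explicitly to a factor.
Dataset$Modalite <- factor(Dataset$Modalite)

# Shapiro-Wilk test within each level of the factor (base R only).
by_group <- lapply(split(Dataset$Algometre.J0, Dataset$Modalite), shapiro.test)
p_values <- vapply(by_group, function(res) res$p.value, numeric(1))
print(p_values)
```

Once the column is a factor, the Rcmdr-generated normalityTest() call should no longer trip the 'groups' check, though that has not been verified against the original data.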
I'm trying to fit a generalized linear mixed model to explain the presence/absence of a species from 30 fixed environmental variables and 2 random effects ("Location" and "Season"). My data looks like this:
str(glmm_data)
'data.frame': 209 obs. of 40 variables:
$ CODE : Factor w/ 209 levels "VAL1_1","VAL1_2",..: 1 72 142 170 176 183 190 197 203 8 ...
$ Location : Factor w/ 32 levels "ALMENARA","ARES 1",..: 10 11 12 15 17 2 3 4 21 18 ...
$ Season : Factor w/ 7 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
$ PO4 : num -1.301 -1.301 -1.301 0.437 -1.301 ...
$ NO2 : num -1.129 -1.629 -0.781 -1.699 -1.654 ...
$ NO3 : num 1.044 0.115 1.918 1.457 1.467 ...
$ NH4 : num 0.0123 -0.014 -1.301 -0.2772 -1.301 ...
$ ChlA : num 0.341 0.117 0.87 -0.699 1.53 ...
$ Secchi : num 29 23 10 17 20 9 22 25 25 24 ...
$ Temp_w : num 5.4 3.2 10.3 10.5 4.7 7.2 8 9.2 4.6 6.9 ...
$ Conductivity : num 2.74 2.52 2.76 2.36 2.66 ...
$ Oxi_conc : num 11.6 9.2 7.04 9.99 7 ...
$ Hydroperiod : int 0 0 0 0 1 0 1 0 0 0 ...
$ Rain : int 1 1 1 1 1 1 1 1 1 1 ...
$ RainFre : int 0 0 0 0 0 0 0 0 0 0 ...
$ Veg_flo : num 0 0 0 0 0 0 0 0 0 0 ...
$ Veg_emg : num 0.735 0.524 0.226 0.685 0.226 ...
$ Depth_max : num 1.64 1.57 1.18 1.11 1.85 ...
$ Agricultural : num 0 0 0 0 0 ...
$ LowGrass : num 0 0.41 0.766 0 0.856 ...
$ Forest : num 1.097 1.161 0.44 1.05 0.502 ...
$ Buildings : num 0 0 0 0 0 ...
$ Heterogeneity : num 0.512 0.437 1.028 0.559 0.98 ...
$ Morphology : num 0.04519 -0.00115 0.01556 0.00771 0.12125 ...
$ Fish : int 0 0 0 0 0 0 0 0 0 0 ...
$ TempRange : num 1.4 1.4 1.4 1.4 1.4 ...
$ Tavg : num 1.03 1 1.03 1.03 1 ...
$ Precipitation : num 2.8 2.82 2.8 2.81 2.8 ...
$ MatOrg : num 0.264 0.257 0.236 0.251 0.313 ...
$ CO3 : num 0.14 0.163 0.222 0.335 0.306 ...
$ PC1 : num -0.132 -0.186 -0.074 0.127 -0.175 ...
$ PC2 : num -0.0729 0.0568 -0.0428 -0.0688 -0.0464 ...
$ PC3 : num -0.00638 0.01857 0.02817 -0.00918 0.02056 ...
$ Alytes_obstetricans : int 0 0 0 0 0 0 1 0 0 0 ...
$ Bufo_spinosus : int 0 0 0 0 0 0 0 0 0 0 ...
$ Epidalea_calamita : int 0 0 0 0 0 0 0 0 0 0 ...
$ Pelobates_cultripes : int 0 0 0 0 0 0 0 0 0 0 ...
$ Pelodytes_hespericus: int 1 0 0 0 0 0 0 0 0 0 ...
$ Pelophylax_perezi : int 0 0 0 0 1 0 1 0 0 0 ...
$ Pleurodeles_waltl : int 0 0 0 0 0 0 0 0 0 0 ...
PS: if anyone knows a better way to show my data please explain, I'm a noob at this.
The last 7 columns are the response variables, namely presence (1) or absence (0) of said species so my response variables are binomial. I'm using the glmer function from the lme4 package.
I'm trying to create a model for each species. So the first one looks like this:
Aly_Obs_GLMM <- glmer(Alytes_obstetricans ~ PO4 + NO2 + NO3 + NH4 + ChlA +
Secchi + Temp_w + Conductivity + Oxi_conc + Hydroperiod + Rain + RainFre +
Veg_flo + Veg_emg + Depth_max + Agricultural + LowGrass + Forest + Buildings +
Heterogeneity + Morphology + Fish + TempRange + Tavg + Precipitation +
MatOrg + CO3 + PC1 + PC2 + PC3 + (1|Location) + (1|Season), family = binomial,
data = glmm_data
)
However, when I run the code, I get the following error message:
Error in pwrssUpdate(pp, resp, tol = tolPwrss, GQmat = GHrule(0L),
compDev = compDev, : Downdated VtV is not positive definite
and the model is not fitted.
Any ideas on what I may be doing wrong? Thanks
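This error is often reported when the fixed-effects design is close to rank-deficient or when predictors sit on very different scales; whether that is the cause here is an assumption. A minimal base-R sketch of two checks that can be run before calling glmer, using an invented toy data frame (toy and its columns x1, x2, x3 are hypothetical):

```r
# Toy data: x3 is an exact linear combination of x1 and x2 (deliberate collinearity).
set.seed(42)
toy <- data.frame(x1 = rnorm(50), x2 = rnorm(50, mean = 100, sd = 30))
toy$x3 <- toy$x1 + 2 * toy$x2

# Check 1: rank of the fixed-effects design matrix versus its number of columns.
X <- model.matrix(~ x1 + x2 + x3, data = toy)
design_rank <- qr(X)$rank
cat("columns:", ncol(X), " rank:", design_rank, "\n")

# Check 2: centre and scale the numeric predictors so they are comparable.
num_cols <- vapply(toy, is.numeric, logical(1))
toy_scaled <- toy
toy_scaled[num_cols] <- lapply(toy[num_cols], function(col) as.numeric(scale(col)))
print(round(vapply(toy_scaled, sd, numeric(1)), 3))
```

If the rank is lower than the column count, dropping or combining predictors before fitting is worth trying. With 30 predictors and 209 observations, starting from a much smaller model and adding terms gradually may also help.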
I have 19 variables and I want to run 19 different regressions that consist of 2 independent variables from my dataset.
Update: this is my dataset's structure:
$ Failure_Response_Var_Yr: num 0 0 0 0 0 0 0 0 0 0 ...
$ exp_var_nocorr_2 : num 4.61 5.99 6.13 3.17 4.4 ...
$ exp_var_nocorr_3 : num 4.16 5.46 5.24 2.86 3.72 ...
$ exp_var_nocorr_4 : num 0.00191 2.23004 0.5613 1.07986 0.99836 ...
$ exp_var_nocorr_5 : num 0.709 2.79 6.846 15.478 11.418 ...
$ exp_var_nocorr_6 : num 0.724 0.497 1.782 0.156 2.525 ...
$ exp_var_nocorr_7 : num 0 168.17 92.041 0.584 265.338 ...
$ exp_var_nocorr_8 : num -38.64 4.89 1.5 24.8 16.56 ...
$ exp_var_nocorr_9 : num 116 88.3 56.4 60.6 57.6 ...
$ exp_var_nocorr_10 : num 0 10.3 0 93.7 0 ...
$ exp_var_nocorr_11 : num 1.02 1.23 1.31 2.06 1.33 ...
$ exp_var_nocorr_12 : num 60 140 124 275 203 ...
$ exp_var_nocorr_13 : num 10.835 5.175 1.838 0.347 0.783 ...
$ exp_var_nocorr_14 : num 59 60.2 87.2 42.2 84.2 ...
$ exp_var_nocorr_15 : num 61.9 68.3 99 50.2 103.9 ...
$ exp_var_nocorr_16 : num 4.4 11.24 8.23 6.9 8.84 ...
$ exp_var_nocorr_17 : num 6.43 18.62 10.72 15.62 10.35 ...
I wrote this code:
col17 <- names(my.sample)[-c(1:9,26:29)]
so that dput(col17) now gives:
c("exp_var_nocorr_2", "exp_var_nocorr_3", "exp_var_nocorr_4", "exp_var_nocorr_5", "exp_var_nocorr_6", "exp_var_nocorr_7", "exp_var_nocorr_8", "exp_var_nocorr_9", "exp_var_nocorr_10", "exp_var_nocorr_11", "exp_var_nocorr_12", "exp_var_nocorr_13", "exp_var_nocorr_14", "exp_var_nocorr_15", "exp_var_nocorr_16", "exp_var_nocorr_17" )
logit.test2 <- vector("list", length(col17))
# start of loop
for (i in seq_along(col17)) {
  for (k in seq_along(col17)) {
    logit.test2[i] <- glm(reformulate(col17[i]+col17[k], "Failure_Response_Var_Yr"),
                          family=binomial(link='logit'), data=my.sample)
  }
}
# end of loop
but it raised this error:
"Error in col17[i] + col17[k] : non-numeric argument to binary operator"
Can anybody suggest code that fixes this problem?
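The error arises because `+` cannot add two character strings; reformulate() accepts a character vector of term labels instead. A minimal sketch, assuming the goal is one logistic regression per pair of predictors; the toy data frame and the names pairs and fits are invented for illustration:

```r
# Toy stand-in for my.sample (invented values).
set.seed(123)
my.sample <- data.frame(
  Failure_Response_Var_Yr = rbinom(200, 1, 0.4),
  exp_var_nocorr_2 = rnorm(200),
  exp_var_nocorr_3 = rnorm(200),
  exp_var_nocorr_4 = rnorm(200)
)
col17 <- setdiff(names(my.sample), "Failure_Response_Var_Yr")

# All unordered pairs of predictor names, one pair per list element.
pairs <- combn(col17, 2, simplify = FALSE)

# Fit one logistic regression per pair; reformulate() takes the character vector directly.
fits <- lapply(pairs, function(vars) {
  glm(reformulate(vars, response = "Failure_Response_Var_Yr"),
      family = binomial(link = "logit"), data = my.sample)
})
names(fits) <- vapply(pairs, paste, character(1), collapse = " + ")

print(names(fits))
```

Two further points about the original loop: a fitted model should be stored with `[[i]]` rather than `[i]`, and because only i indexes the result list, each slot is overwritten on every pass over k. Building the pairs first avoids both problems.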
While trying to determine the optimal number of clusters for k-means, I tried to use the mclust package with the following code:
d_clust <- Mclust(df,
G=1:10,
mclust.options("emModelNames"))
d_clust$BIC
df is a data frame of 132656 obs. of 19 variables. The data is scaled and there are no missing values (I checked for NA/NaN/Inf with is.na and is.finite). All my variables are numeric, converted with as.numeric.
However, after I run the code, the screen displays "fitting" with a progress bar that goes up to 11%, and after a moment I get the error message:
NAs in foreign function call (arg 13)
Does anyone know why I get this error?
EDIT
Output of str(df) (I changed the variable names for confidentiality reasons):
'data.frame': 132656 obs. of 19 variables:
$ X1: num 0.5 1 1 1 0.5 1 1 1 1 1 ...
$ X2: num 0.714 0.286 1 0.857 0.286 ...
$ X3: num 0.667 1 0.667 0.667 0.667 ...
$ X4: num 0.714 0.429 1 0.714 0.429 ...
$ X5: num 0.667 0.333 1 0.667 0.333 ...
$ X6: num 0.5 0.25 1 0.5 0.25 0.25 0 0.5 0.5 0.25 ...
$ X7: num 0.667 0.667 0.667 0.667 0.667 ...
$ X8: num 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
$ X9: num 0.667 0 0.667 0.333 0 ...
$ X10: num 1 0.833 1 1 1 ...
$ X11: num 1 0.75 1 1 1 1 1 1 1 1 ...
$ X12: num 1 1 1 0.8 1 1 1 1 1 1 ...
$ X13: num 0.5 0.75 0.75 0.5 0.75 0.25 0.75 0.5 0.5 0.5 ...
$ X14: num 0.75 0.75 0.75 1 0.75 0.75 0.75 1 0.75 0.75 ...
$ X15: num 1 0 0.5 1 1 1 0.75 1 0.5 1 ...
$ X16: num 1 0.333 0.667 0.833 0.833 ...
$ X17: num 1 1 1 1 1 1 1 1 1 1 ...
$ X18: num 0.00157 0.000438 0.001059 0.000879 0.004919 ...
$ X19: num 0.5 0.125 1 0.625 0.125 0.125 0.125 1 0.5 0.25 ...
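One possible explanation, stated as an assumption rather than a diagnosis: by default Mclust initialises EM with model-based hierarchical clustering, which works on pairs of observations and does not scale to 132656 rows. The sketch below first shows how large the pair count is, then draws a random subset of rows for the initialisation step via the initialization = list(subset = ...) argument. The toy data and the names toy and init_rows are invented, and the fit only runs if mclust is installed.

```r
# Number of observation pairs for n = 132656, computed in double precision.
n <- 132656
n_pairs <- n * (n - 1) / 2
cat("pairs:", n_pairs, " exceeds integer max:", n_pairs > .Machine$integer.max, "\n")

# Toy stand-in for df (invented values), small enough to run quickly.
set.seed(7)
toy <- as.data.frame(matrix(runif(3000), ncol = 3))

# Draw a random subset of rows to use for the hierarchical initialisation step.
init_rows <- sample(nrow(toy), size = 200)

# Only run the fit if mclust is installed; G is kept small for the toy data.
if (requireNamespace("mclust", quietly = TRUE)) {
  library(mclust)
  fit <- Mclust(toy, G = 1:3, initialization = list(subset = init_rows))
  print(fit$G)
}
```

On the real data the subset size would need tuning; a few thousand rows is a common starting point, but that figure is a suggestion, not something taken from the question.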
I've been trying to use irmi from the VIM package to replace NAs.
My data looks something like this:
> str(sub_mex)
'data.frame': 21 obs. of 83 variables:
$ pH : num 7.2 7.4 7.4 7.36 7.2 7.82 7.67 7.73 7.79 7.7 ...
$ Cond : num 1152 1078 1076 1076 1018 ...
$ CO3 : num NA NA NA NA NA ...
$ Mg : num 25.8 24.9 24.3 24.8 23.4 ...
$ NO3 : num 49.7 25.6 27.1 39.6 52.8 ...
$ Cd : num 0.0088 0.0104 0.0085 0.0092 0.0086 ...
$ As_H : num 0.006 0.0059 0.0056 0.0068 0.0073 ...
$ As_F : num 0.0056 0.0058 0.0057 0.0066 0.0065 0.004 0.004 0.004 0.0048 0.0078 ...
$ As_FC : num NA NA NA NA NA NA NA NA NA 0.0028 ...
$ Pb : num 0.0097 0.0096 0.0092 0.01 0.0093 0.0275 0.024 0.0255 0.031 0.024 ...
$ Fe : num 0.39 0.26 0.27 0.28 0.32 0.135 0.08 NA 0.13 NA ...
$ No_EPT : int 0 0 0 0 0 0 0 0 0 0 ...
I've subset my sub_mex dataset to analyze observations separately, so I have a sub_t dataset, which looks something like this:
> str(sub_t)
'data.frame': 5 obs. of 83 variables:
$ pH : num 7.82 7.67 7.73 7.79 7.7
$ CO3 : num 45 NA 37.2 41.9 40.3
$ Mg : num 41.3 51.4 47.7 51.8 53
$ NO3 : num 47.1 40.7 39.9 42.1 37.6
$ Cd : num 0.0173 0.0145 0.016 0.016 0.0154
$ As_H : num 0.00949 0.01009 0.00907 0.00972 0.00954
$ As_F : num 0.004 0.004 0.004 0.0048 0.0078
$ As_FC : num NA NA NA NA 0.0028
$ Pb : num 0.0275 0.024 0.0255 0.031 0.024
$ Fe : num 0.135 0.08 NA 0.13 NA
$ No_EPT : int 0 0 0 0 0
I impute the NAs of the sub_mex dataset using:
imp_mexi <- irmi(sub_mex)
which works fine.
However, when I try to impute the subset sub_t, I get the following error message:
> imp_t <- irmi(sub_t)
Error in indexNA2s[, variable[j]] : subscript out of bounds
Does anyone have an idea of how to solve this? I want to impute my data sub_t and I don't want to use a subset of the imp_mexi imputed dataset.
Any help will be deeply appreciated.
I had a similar issue and discovered that one of the columns in my data frame was entirely missing, hence the out-of-bounds error.
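Following the answer above, a quick way to check a subset for columns that are entirely NA, and to drop them before imputing, can be sketched in base R. The toy sub_t below uses invented values (As_FC is made fully missing on purpose), and all_na and sub_t_clean are illustrative names:

```r
# Toy stand-in for sub_t (invented values); As_FC is entirely missing here.
sub_t <- data.frame(
  pH = c(7.82, 7.67, 7.73, 7.79, 7.70),
  Fe = c(0.135, 0.08, NA, 0.13, NA),
  As_FC = rep(NA_real_, 5)
)

# Flag columns in which every value is NA.
all_na <- vapply(sub_t, function(col) all(is.na(col)), logical(1))
print(names(sub_t)[all_na])

# Drop those columns before passing the data to the imputation function.
sub_t_clean <- sub_t[, !all_na, drop = FALSE]
str(sub_t_clean)
```

A column with no observed values gives the imputation model nothing to learn from, so removing it (or imputing it from the full dataset) is a reasonable first step; whether this is the cause for the 83-column sub_t in the question can only be confirmed on the real data.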