ANOVA Error in levels(x)[x] - r

I am attempting to run an ANOVA on some data, but it gives me the following error:
Call:
aov(formula = speaker ~ CoG * skewness * kurtosis, data = total)
Error in levels(x)[x] : only 0's may be mixed with negative subscripts
In addition: Warning messages:
1: In model.response(mf, "numeric") :
using type="numeric" with a factor response will be ignored
2: In Ops.factor(y, z$residuals) : - not meaningful for factors
I'm trying to see how well the three variables CoG, skewness and kurtosis can predict the speaker, and whether they differ significantly between speakers. A copy of my data can be found here:
https://www.dropbox.com/s/blzpb12bemv6kuc/All.csv
Can anyone help interpret what the error is saying and where it is occurring?

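Here is the answer I gave on stats.stackexchange.com:
aov() expects a numeric response, but speaker is a factor, which is what the warnings are telling you. It sounds like you are trying to do multinomial regression, so look up information on that.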
Here is a great start:
http://www.ats.ucla.edu/stat/r/dae/mlogit.htm
e.g.
install.packages('nnet')
library(nnet)
test<-multinom(formula = as.factor(speaker) ~ CoG * skewness * kurtosis, data = total)
# Wald z statistics: coefficient divided by its standard error
z <- summary(test)$coefficients/summary(test)$standard.errors
# 2-tailed z test
p <- (1 - pnorm(abs(z), 0, 1)) * 2
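The code above needs the asker's data to run. Here is a minimal sketch of the same steps on R's built-in iris data, assuming the nnet package is installed. The iris data, the Species response and the single predictor Sepal.Length are stand-ins and are not part of the original question:

```r
library(nnet)

# Stand-in data: a factor response (Species) and one numeric predictor
fit <- multinom(Species ~ Sepal.Length, data = iris, trace = FALSE)

# Wald z statistics and two-tailed p-values, one row per non-reference level
s <- summary(fit)
z <- s$coefficients / s$standard.errors
p <- 2 * pnorm(abs(z), lower.tail = FALSE)
p
```

Each row of p compares one level of the response against the reference level, so the p-values answer "does this predictor separate this speaker from the baseline speaker", not a single overall test.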

Related

How can I compare 3 binary variables in R?

I'm looking at debris ingestion in gulls. Each gull is listed by row. Columns contain the sex (0=male, 1=female), whether they ate debris (0=no, 1=yes), and whether I found any of a number of other items in their stomach. For this problem I'd like to see if sex and presence of debris influence the number of birds with shells in their stomach (0=no shells, 1=shells). Debris prevalence is likely overdispersed and zero-inflated, but I'm not sure that matters if I'm using it as a factor to evaluate shell prevalence. Shell prevalence might be overdispersed and zero-inflated as well.
I've plotted the data and want to test whether the differences seen in the plot are significant.
But when trying to run a zero-inflated negative binomial model I get many different errors depending on how I set it up.
library (aod)
library(MASS)
library (ggplot2)
library(gridExtra)
library(pscl)
library(boot)
library(reshape2)
mydata1 <- read.csv('D:/mp paper/analysis wkshts/stats files/FOdata.csv')
mydata1 <- within(mydata1, {
debris <- factor(debris)
sex <- factor(sex)
Shell_frags <- factor(Shell_frags)
})
summary(mydata1)
ggplot(mydata1, aes(Shell_frags, fill=debris)) +
stat_count() +
facet_grid(debris ~ sex, margins=TRUE, scales="free_y")
m1 <- zeroinfl((Shell_frags ~ sex + debris), data = mydata1, dist = "negbin", EM = TRUE)
summary(m1)
Error message:
Error in if (all(Y > 0)) stop("invalid dependent variable, minimum count is not zero") :
missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In model.response(mf, "numeric") :
using type = "numeric" with a factor response will be ignored
2: In Ops.factor(Y, 0) : ‘>’ not meaningful for factors
> summary(m1)
Error in h(simpleError(msg, call)) :
error in evaluating the argument 'object' in selecting a method for function 'summary': object 'm1' not found
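The warnings point at the response being a factor, which zeroinfl() cannot treat as a count. One possible alternative, offered only as a sketch, is to treat the 0/1 shell indicator as a binary outcome and fit a binomial GLM. The data below are simulated, and the names gulls and shells are hypothetical stand-ins for the asker's FOdata.csv columns:

```r
set.seed(7)
n <- 200

# Simulated stand-in for the gull data: two binary predictors as factors
gulls <- data.frame(
  sex    = factor(rbinom(n, 1, 0.5), labels = c("male", "female")),
  debris = factor(rbinom(n, 1, 0.3), labels = c("no", "yes"))
)

# Binary response kept numeric (0/1); debris raises the odds of shells here
gulls$shells <- rbinom(n, 1, plogis(-1 + 0.8 * (gulls$debris == "yes")))

# Logistic regression of shell presence on sex and debris
fit <- glm(shells ~ sex + debris, data = gulls, family = binomial)
summary(fit)$coefficients
```

Whether a logistic model suits the real data is a judgment call for the asker. The sketch only shows that a 0/1 response fits without the factor-related errors.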

Why am I getting NA's for sigma in this gamlss call?

The following question was asked by Michael Barton on Cross Validated and rejected because it was deemed to be a computer question. Regardless, I personally think the question is interesting and am wondering if it can be answered here.
The original post is here.
I am fitting a gamlss model with the call:
gamlss(formula = metric ~ image_name + random(biological_source_name) - 1,
sigma.formula = ~ biological_source_name - 1,
family = "NBI",
data = na.omit(data))
After three iterations I get an error:
GAMLSS-RS iteration 1: Global Deviance = 3814
GAMLSS-RS iteration 2: Global Deviance = 7760
GAMLSS-RS iteration 3: Global Deviance = 7756
In digamma(y + (1/sigma)) : NaNs produced
In digamma(1/sigma) : NaNs produced
In digamma(y + (1/sigma)) : NaNs produced
In digamma(1/sigma) : NaNs produced
Error in glim.fit(f = sigma.object, X = sigma.X, y = y, w = w, fv = sigma, :
NA's in the working vector or weights for parameter sigma
This suggests to me that the estimated sigma for some of the
categorical predictors is going to 0. Would this be correct?
Any suggestions on how to go about resolving this?
I contacted the authors regarding this. The issue is that a negative binomial can only model overdispersion, whereas my data contain both under- and overdispersed responses across different groups. For the underdispersed groups sigma is driven to 0, which produces the error. The author's reply:
The problem could be that the data are underdispersed, so sigma goes to zero and the derivatives produce NAs.
Try to fit double Poisson DPO() in this specific data set.
As recommended by one of the authors, a distribution such as the double Poisson can be fitted here because it allows the variance to be either greater or less than the mean. Using this distribution solved the above problem for me and I was able to fit a model.
gamlss(formula = metric ~ image_name + random(biological_source_name) - 1,
sigma.formula = ~ biological_source_name - 1,
family = "DPO",
data = na.omit(data))
Note the use of DPO in the above example.
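A quick way to see whether some groups are underdispersed is to compare the variance with the mean within each group. This is a minimal base-R sketch on simulated data; the data frame toy and its columns group and y are hypothetical and stand in for biological_source_name and the response:

```r
set.seed(3)

# Two simulated groups with the same mean (10) but different dispersion
toy <- data.frame(
  group = rep(c("under", "over"), each = 200),
  y = c(rbinom(200, size = 20, prob = 0.5),   # variance 5, below the mean
        rnbinom(200, size = 2, mu = 10))      # variance 60, above the mean
)

# Variance-to-mean ratio per group: below 1 suggests underdispersion,
# above 1 suggests overdispersion relative to a Poisson
ratio <- tapply(toy$y, toy$group, function(v) var(v) / mean(v))
ratio
```

Ratios clearly below 1 in some groups would be consistent with the explanation above, since the negative binomial has no parameter value that gives a variance smaller than the mean.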

Using non-integers vs integers: warnings with non-integers but model won't run with integers

I am having some trouble running negative binomial models. Basically, I have a dataset with counts of animals. However, the effort differs between observations, so I can calculate the rate of animals per day. I am doing this with quite a big dataset (>100000 observations). I am quite surprised I couldn't find other topics that cover my question; if you know of one, a pointer would be helpful!
When trying to fit a model to my data, I run into some problems. Either I run a negative binomial model with the rates
> m1<-glm.nb(Rates ~ Par1+Par2+...+Par7+Par8,data=data)
and then I get the following warning messages:
>Warning messages:
1: In dpois(y, mu, log = TRUE) : non-integer x = 25.913718
2: In dpois(y, mu, log = TRUE) : non-integer x = 5.457385
3: In dpois(y, mu, log = TRUE) : non-integer x = 2.195133
4: In dpois(y, mu, log = TRUE) : non-integer x = 2.721088
5: In dpois(y, mu, log = TRUE) : non-integer x = 6.971678
6: In dpois(y, mu, log = TRUE) : non-integer x = 21.863799
7: In dpois(y, mu, log = TRUE) : non-integer x = 5.300733
8: In dpois(y, mu, log = TRUE) : non-integer x = 7.157865
9: In dpois(y, mu, log = TRUE) : non-integer x = 14.117588
10: In dpois(y, mu, log = TRUE) : non-integer x = 6.505993, etc.
Or I run the model with an offset
> m2<-glm.nb(Count ~ Par1+Par2+...+Par7+Par8+offset(Effort),data=data)
This however gives the following error:
> Error: no valid set of coefficients has been found: please supply starting values
In addition: Warning messages:
1: glm.fit: algorithm did not converge
2: glm.fit: fitted rates numerically 0 occurred
I have already tried providing the coefficients of the first model as starting values for the second, but this doesn't work. Using the package pscl doesn't work either, nor does increasing the number of iterations. This is a subset of my data (one species) with very few zeros.
Any suggestions? I feel that actually the second way of modelling this is the proper way of doing it, but I don't know how to get this model to run. Any ideas? Would be much appreciated.
You almost certainly want one of the following, assuming Rates = Count/Effort (the * below stands in for your predictors). Either fit the rate, and use effort as a weighting variable:
glm.nb(Rates ~ *, weights=Effort, data=data)
Or, fit the counts, and use log(effort) as an offset:
glm.nb(Count ~ * + offset(log(Effort)), data=data)
See also my answer on CrossValidated about offsets in poisson/negative binomial models.
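The offset version can be sketched on simulated data as follows, assuming the MASS package is available. The data frame sim, the single predictor x and the true coefficients 0.5 and 0.3 are made up for illustration:

```r
library(MASS)
set.seed(1)
n <- 500

effort <- runif(n, 1, 30)              # hypothetical days of effort
x      <- rnorm(n)                     # hypothetical predictor
mu     <- effort * exp(0.5 + 0.3 * x)  # expected count scales with effort
sim    <- data.frame(count = rnbinom(n, size = 2, mu = mu), effort, x)

# Counts stay integers; log(effort) enters with its coefficient fixed at 1
fit <- glm.nb(count ~ x + offset(log(effort)), data = sim)
coef(fit)
```

Because the model uses a log link, the offset must be on the log scale. Passing offset(Effort) instead, as in the question, multiplies the expected count by exp(Effort), which can explain the failure to find valid coefficients.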

Error in nlme repeated measures

I'm trying to run a linear mixed model with repeated measures at 57 different timepoints. But I keep getting the error message:
Error in solve.default(estimates[dimE[1L] - (p:1), dimE[2L] - (p:1), drop = FALSE]) :
system is computationally singular: reciprocal condition number = 7.7782e-18
What does this mean?
My code is as follows:
model.dataset = data.frame(TimepointM=timepoint,SubjectM=sample,GeneM=gene)
library("nlme")
model = lme(score ~ TimepointM + GeneM,data=model.dataset,random = ~1|SubjectM)
Here's the data:
score = c(2,-3,11,14,1,7,2,-3,11,14,1,7,7,2,-3,11,14,1,7,2,-3,11,14,1,7,2,-3,11,14,1,7,2,-3,11,14,1,7,2,-3,11,14,1,7,7,2,-3,11,14,1,7,2,-3,11,14,1,7,7,2,-3,11,14,1,7,7,2,-3,11,14,1,7,6,7,2,-3,11,14,1,7,6,7,2,-3,11,14,1,7,6,2,-3,11,14,1,7,7,2,-3,11,14,1,7,6,2,-3,11,14,1,7,2,-3,11,14,1,7,7,2,-3,11,14,1,7,2,-3,11,14,1,7,2,-3,11,14,1,7,7,2,-3,11,14,1,7,7,2,-3,11,14,1,7,2,-3,11,14,1,7,2,-3,11,14,1,7,2,-3,11,14,1,7,7,2,-3,11,14,1,7,2,-3,11,14,1,7,6,2,-3,11,14,1,7,7,2,-3,11,14,1,7,6,7,2,-3,11,14,1,7,2,-3,11,14,1,7,7,2,-3,11,14,1,7,2,-3,11,14,1,7,2,-3,11,14,1,7,2,-3,11,14,1,7,6,7,2,-3,11,14,1,7,7,2,-3,11,14,1,7,6,7,2,-3,11,14,1,7,6,2,-3,11,14,1,7,7,2,-3,11,14,1,7,2,-3,11,14,1,7,2,-3,11,14,1,7,2,-3,11,14,1,7,2,-3,11,14,1,7,7,2,-3,11,14,1,7,2,-3,11,14,1,7,2,-3,11,14,1,7,7,2,-3,11,14,1,7,7,2,-3,11,14,1,7,2,-3,11,14,1,7,2,-3,11,14,1,7,2,-3,11,14,1,7,2,-3,11,14,1,7)
timepoint = c(1,1,1,1,1,1,2,2,2,2,2,2,3,3,3,3,3,3,3,4,4,4,4,4,4,5,5,5,5,5,5,6,6,6,6,6,6,7,7,7,7,7,7,8,8,8,8,8,8,8,9,9,9,9,9,9,10,10,10,10,10,10,10,12,12,12,12,12,12,12,12,13,13,13,13,13,13,13,13,14,14,14,14,14,14,14,14,15,15,15,15,15,15,16,16,16,16,16,16,16,16,17,17,17,17,17,17,18,18,18,18,18,18,19,19,19,19,19,19,19,20,20,20,20,20,20,21,21,21,21,21,21,24,24,24,24,24,24,24,25,25,25,25,25,25,25,27,27,27,27,27,27,28,28,28,28,28,28,29,29,29,29,29,29,30,30,30,30,30,30,30,31,31,31,31,31,31,31,32,32,32,32,32,32,33,33,33,33,33,33,33,33,34,34,34,34,34,34,34,35,35,35,35,35,35,36,36,36,36,36,36,36,37,37,37,37,37,37,38,38,38,38,38,38,39,39,39,39,39,39,39,40,40,40,40,40,40,40,41,41,41,41,41,41,41,41,42,42,42,42,42,42,42,42,43,43,43,43,43,43,44,44,44,44,44,44,44,45,45,45,45,45,45,46,46,46,46,46,46,47,47,47,47,47,47,48,48,48,48,48,48,49,49,49,49,49,49,49,50,50,50,50,50,50,51,51,51,51,51,51,52,52,52,52,52,52,52,53,53,53,53,53,53,53,54,54,54,54,54,54,55,55,55,55,55,55,56,56,56,56,56,56,57,57,57,57,57,57)
sample = c("S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S13T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S13T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S13T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S13T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S13T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S13T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S13T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S13T0","S01T0","S02T0","S03T
0","S07T0","S09T0","S10T0","S12T0","S13T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S01T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0","S02T0","S03T0","S07T0","S09T0","S10T0","S12T0")
gene =c(24.1215870,-18.8771658,-27.3747309,-41.5740199,26.1561877,-2.7836332,20.8322796,36.5745088,-24.1541743,-11.2362216,4.9042852,7.4230219,155.8663563,16.4465366,-11.7982286,-1.6102783,-35.9559091,27.7909495,-13.9181661,-29.6037658,-68.4297261,-45.0877920,-48.3157529,17.1649982,-26.9084544,19.7358439,-5.8991143,-24.1541743,-23.5960654,13.0780939,-2.7836332,18.6394081,-28.3157487,-49.9186269,-33.7086648,41.6864242,-30.6199654,36.1823804,-36.5745088,-49.9186269,-44.9448864,-4.9042852,-34.3314764,62.3465425,-42.7609951,-11.7982286,-32.2055657,-56.1811080,5.7216661,-17.6296771,4.3857431,-43.6534459,9.6616697,-44.9448864,18.7997599,-12.9902884,109.1064494,7.6750504,-43.6534459,-17.7130611,-25.8433097,5.7216661,-18.5575548,35.2750175,36.1823804,2.3596457,-25.7644526,-55.0574858,15.5302365,-19.4854325,73.3687689,63.1668918,20.8322796,16.5175201,-22.5438960,-28.0905540,15.5302365,7.4230219,39.5062602,107.4657509,36.1823804,-23.5964573,-45.0877920,-43.8212642,4.0869043,-40.8266205,26.3375068,13.1572292,-25.9561030,-40.2569571,-52.8102415,2.4521426,-49.1775202,246.1047731,36.1823804,11.7982286,-35.4261223,-26.9669318,-2.4521426,-38.0429873,38.5656349,9.8679219,16.5175201,8.0513914,-42.6976421,26.9735686,-26.9084544,4.3857431,12.9780515,-32.2055657,-33.7086648,9.8085704,-36.2800196,215.7518511,6.5786146,-9.4385829,-19.3233394,-40.4503978,17.1649982,-7.4230219,14.2536650,-23.5964573,-53.1391834,-52.8102415,22.0692834,-54.7447866,24.1215870,-44.8332688,-24.1541743,-42.6976421,26.9735686,-40.8266205,191.1413737,17.5429723,-70.7893718,-37.0364006,-39.3267756,-4.9042852,-0.9278777,93.5198138,-6.5786146,-24.7762801,-28.9850091,-39.3267756,22.0692834,-50.1053979,14.2536650,23.5964573,-20.9336177,-53.9338637,14.7128556,-39.8987428,4.3857431,-64.8902575,-59.5802966,-33.7086648,22.0692834,2.7836332,46.0503024,-35.3946859,-43.4775137,-53.9338637,30.2430921,-34.3314764,80.3942259,28.5073300,-87.3068919,-24.1541743,-62.9228410,13.0780939,-25.0526990,35.0859447,-24.7762801,-38.6466789,-
58.4283523,31.0604729,0.0000000,24.4562563,1.0964358,-27.1359259,-75.6830794,-16.8543324,20.4345217,-11.1345329,74.1390629,18.2282447,-27.3044720,-45.2890768,-46.7707724,15.3258912,-27.9523169,-6.9763039,117.3099418,18.6394081,-21.2368115,-38.6466789,-34.8322870,22.0692834,-48.2496425,6.5786146,-64.8902575,-51.5289052,-80.9007955,23.7040451,-26.9084544,223.1349942,8.7714862,10.6184058,-127.2119846,-31.4614205,0.8173809,-16.7017993,9.8679219,-35.3946859,-54.7494617,-44.9448864,14.7128556,-18.5575548,97.5827836,-166.3550237,-95.0064189,-123.5984376,104.6247509,-121.5519839,33.9895089,-44.8332688,-40.2569571,-56.1811080,51.4949946,0.0000000,-16.9312544,95.9808615,6.5786146,-21.2368115,-9.6616697,-13.4834659,10.6259513,-25.9805767,116.4895926,-1.0964358,-16.5175201,-56.3597400,-44.9448864,13.8954747,-12.9902884,-5.6437515,71.3703842,25.2180227,-41.2938002,-53.1391834,-32.5850426,8.9911895,12.9902884,31.9812582,1.0964358,-70.7893718,-33.8158440,-38.2031534,-15.5302365,-25.0526990,153.4053085,36.1823804,-34.2148630,-41.8672354,-19.1015767,22.8866643,0.9278777,20.8322796,-29.4955716,-43.4775137,-69.6645739,33.5126155,-45.4660092,26.3144585,-33.0350402,24.1541743,-42.6976421,0.0000000,-28.7642099,38.3752520,-7.0789372,-22.5438960,-20.2251989,34.3299964,19.4854325,4.3857431,-61.3507889,-33.8158440,-64.0464631,39.2342816,-28.7642099,183.7582306,-4.3857431,-22.4166344,-28.9850091,-57.3047302,25.3388069,-26.9084544,35.0859447,7.0789372,-33.8158440,-43.8212642,-1.6347617,5.5672664,-35.0859447,-40.1139773,-14.4925046,-12.3598438,21.2519025,-14.8460438,119.7709896,30.7002016,-22.4166344,-46.6980703,-43.8212642,5.7216661,-10.2066551,203.4466124,116.2221917,-83.7674233,-109.4989234,-38.2031534,78.4685632,-56.6005421,21.9287154,-63.7104346,-56.3597400,-4.4944886,25.3388069,-73.3023414,29.6037658,-31.8552173,-46.6980703,-79.7771734,21.2519025,-18.5575548,16.4465366,-27.1359259,-43.4775137,-41.5740199,-11.4433321,-23.1969435,27.4108943,-84.9472461,-53.1391834,-40.4503978,22.8866643,16.
7017993)
tl;dr I think your problem is that every individual has exactly the same response value (score) for every time point (i.e. perfect homogeneity within individuals), so the random-effects term completely explains the data; there's nothing left over for the fixed effects. Are you sure you didn't want to use gene as your response variable?? (Discovered after running through a bunch of modeling attempts, by plotting the damn data, something everyone should always do first ...)
## simplifying names etc. slightly
dd <- data.frame(timepoint,sample,gene,score)
library("nlme")
m0 <- lme(score ~ timepoint + gene, data=dd,
random = ~1|sample)
## reproduces error
As a first check, let's just see if there's something in your fixed-effect model that is singular:
lm(score~timepoint+gene,dd)
##
## Call:
## lm(formula = score ~ timepoint + gene, data = dd)
##
## Coefficients:
## (Intercept) timepoint gene
## 5.414652 -0.004064 -0.024485
No, that works fine.
Let's try it in lme4:
library(lme4)
m1 <- lmer(score ~ timepoint + gene + (1|sample), data=dd)
## Error in fn(x, ...) : Downdated VtV is not positive definite
Let's try scaling & centering the data -- sometimes that helps:
ddsc <- transform(dd,
timepoint=scale(timepoint),
gene=scale(gene))
lme still fails:
m0sc <- lme(score ~ timepoint + gene, data=ddsc,
random = ~1|sample)
lmer works -- sort of!
m1sc <- lmer(score ~ timepoint + gene + (1|sample), data=ddsc)
## Warning message:
## In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model is nearly unidentifiable: very large eigenvalue
## - Rescale variables?
The results give coefficients for the parameters that are vanishingly close to zero. (The residual variance is also vanishingly small.)
## m1sc
## Linear mixed model fit by REML ['lmerMod']
## Formula: score ~ timepoint + gene + (1 | sample)
## Data: ddsc
## REML criterion at convergence: -9062.721
## Random effects:
## Groups Name Std.Dev.
## sample (Intercept) 7.838e-01
## Residual 3.344e-07
## Number of obs: 348, groups: sample, 8
## Fixed Effects:
## (Intercept) timepoint gene
## 5.714e+00 -4.194e-16 -1.032e-14
At this point I can only think of a couple of possibilities:
there's something about the experimental design that means the random effects are somehow (?) completely confounded with one or both of the fixed effects
these are simulated data that are artificially constructed to be perfectly balanced ... ?
library(ggplot2); theme_set(theme_bw())
ggplot(dd,aes(timepoint,score,group=sample,colour=gene))+
geom_point(size=4)+
geom_line(colour="red",alpha=0.5)
Aha!
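What the plot reveals can also be checked numerically by counting the distinct score values within each sample. This is a minimal sketch on a small made-up data frame with the same structure as the question's data; toy and its values are hypothetical:

```r
# Toy data: three subjects, four timepoints, score constant within subject
toy <- data.frame(
  sample    = rep(c("S02T0", "S03T0", "S07T0"), times = 4),
  timepoint = rep(1:4, each = 3),
  score     = rep(c(2, -3, 11), times = 4)
)

# Number of distinct score values per subject
n_distinct_scores <- tapply(toy$score, toy$sample,
                            function(v) length(unique(v)))
n_distinct_scores

# TRUE means the subject identity alone determines the response
all(n_distinct_scores == 1)
```

If the same check on the real data returns TRUE, the random intercept for the subject absorbs all variation in the response and nothing is left for the fixed effects or the residual.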
For R to invert a matrix, the matrix needs to be computationally invertible. The error you are getting is telling you that, for computational purposes, your matrix is singular, which means it does not have an inverse.
As this error has more to do with statistical theory, it's probably better suited to Cross Validated. See this link for more information.
Check your data to make sure you do not have perfectly correlated independent variables.

Day-ahead using GLM model in R

I have the following code to get a day-ahead prediction of load consumption at 15-minute intervals using outside air temperature and TOD (time of day, a categorical variable with 96 levels). When I run the code below, I get the following errors.
i = 97:192
formula = as.formula(load[i] ~ load[i-96] + oat[i])
model = glm(formula, data = train.set, family=Gamma(link=vlog()))
I get the following error after the last line using glm(),
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
contrasts can be applied only to factors with 2 or more levels
And the following warnings show up after the last line using predict(),
Warning messages:
1: In if (!se.fit) { :
the condition has length > 1 and only the first element will be used
2: 'newdata' had 96 rows but variable(s) found have 1 rows
3: In predict.lm(object, newdata, se.fit, scale = residual.scale, type = ifelse(type == :
prediction from a rank-deficient fit may be misleading
4: In if (se.fit) list(fit = predictor, se.fit = se, df = df, residual.scale = sqrt(res.var)) else predictor :
the condition has length > 1 and only the first element will be used
You're doing things in a rather roundabout fashion, and one that doesn't translate well to making out-of-sample predictions. If you want to model on a subset of rows, then either subset the data argument directly, or use the subset argument.
# lag the load by one day (96 intervals); assumes rows are in time order
train.set$load_lag <- c(rep(NA, 96), head(train.set$load, -96))
mod <- glm(load ~ load_lag*TOD, data=train.set[97:192, ], ...)
You also need to rethink exactly what you're doing with TOD. If it has 96 levels, then you're fitting (at least) 96 degrees of freedom on 96 observations, which won't give you a sensible outcome.
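The lag-then-subset approach can be sketched end to end on simulated data as follows. The data frame toy, its three days of made-up load and oat values, and the log link are assumptions for illustration. TOD is left out here for the reason given above:

```r
set.seed(42)
n_per_day <- 96   # 15-minute intervals per day

# Three simulated days of strictly positive load and outside air temperature
toy <- data.frame(
  load = rgamma(3 * n_per_day, shape = 20, rate = 2),
  oat  = rnorm(3 * n_per_day, mean = 15, sd = 3)
)

# Previous-day load stored as an ordinary column, so predict() can find it
toy$load_lag <- c(rep(NA, n_per_day), head(toy$load, -n_per_day))

# Train on day 2, predict day 3, by subsetting rows of the data argument
train   <- toy[(n_per_day + 1):(2 * n_per_day), ]
newdata <- toy[(2 * n_per_day + 1):(3 * n_per_day), ]

fit  <- glm(load ~ load_lag + oat, data = train, family = Gamma(link = "log"))
pred <- predict(fit, newdata = newdata, type = "response")
length(pred)
```

Keeping the lag as a named column means the formula contains only column names. Indexing inside the formula (load[i-96]) leaves predict() unable to match newdata to the model terms, which is what produces the "'newdata' had 96 rows" warning.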
