Error in calculating VIF (Variance Inflation Factor) - r

I am getting the following error when calculating VIF on a small dataset in RStudio. Could anyone help? I can provide more information on the dataset if needed.
"Error in as.vector(y) - mean(y) : non-numeric argument to binary operator"
Dataset: 80 obs. and 15 variables (all variables are numeric)
Steps Followed:
# 1. Determine correlation
library(corrplot)
cor.data <- cor(train)
corrplot(cor.data, method = 'color')
cor.data
# 2. Build Model
model2 <- lm(Volume~., train)
summary(model2)
# 3. Calculate VIF
library(VIF)
vif(model2)
Here is a sample dataset with 20 obs.
train <- structure(list(Price = c(949, 2249.99, 399, 409.99, 1079.99,
114.22, 379.99, 65.29, 119.99, 16.99, 6.55, 15, 52.5, 21.08,
18.98, 3.6, 3.6, 174.99, 9.99, 670), X.5.Star.Reviews. = c(3,
2, 3, 49, 58, 83, 11, 33, 16, 10, 21, 75, 10, 313, 349, 8, 11,
170, 15, 20), X.4.Star.Reviews. = c(3, 1, 0, 19, 31, 30, 3, 19,
9, 1, 2, 25, 8, 62, 118, 6, 5, 100, 12, 2), X.3.Star.Reviews. = c(2,
0, 0, 8, 11, 10, 0, 12, 2, 1, 2, 6, 5, 13, 27, 3, 2, 23, 4, 4
), X.2.Star.Reviews. = c(0, 0, 0, 3, 7, 9, 0, 5, 0, 0, 4, 3,
0, 8, 7, 2, 2, 20, 0, 2), X.1.Star.Reviews. = c(0, 0, 0, 9, 36,
40, 1, 9, 2, 0, 15, 3, 1, 16, 5, 1, 1, 20, 4, 4), X.Positive.Service.Review. = c(2,
1, 1, 7, 7, 12, 3, 5, 2, 2, 2, 9, 2, 44, 57, 0, 0, 310, 3, 4),
X.Negative.Service.Review. = c(0, 0, 0, 8, 20, 5, 0, 3, 1,
0, 1, 2, 0, 3, 3, 0, 0, 6, 1, 3), X.Would.consumer.recommend.product. = c(0.9,
0.9, 0.9, 0.8, 0.7, 0.3, 0.9, 0.7, 0.8, 0.9, 0.5, 0.2, 0.8,
0.9, 0.9, 0.8, 0.8, 0.8, 0.8, 0.7), X.Shipping.Weight..lbs.. = c(25.8,
50, 17.4, 5.7, 7, 1.6, 7.3, 12, 1.8, 0.75, 1, 2.2, 1.1, 0.35,
0.6, 0.01, 0.01, 1.4, 0.4, 0.25), X.Product.Depth. = c(23.94,
35, 10.5, 15, 12.9, 5.8, 6.7, 7.9, 10.6, 10.7, 7.3, 21.3,
15.6, 5.7, 1.7, 11.5, 11.5, 13.8, 11.1, 5.8), X.Product.Width. = c(6.62,
31.75, 8.3, 9.9, 0.3, 4, 10.3, 6.7, 9.4, 13.1, 7, 1.8, 3,
3.5, 13.5, 8.5, 8.5, 8.2, 7.6, 1.4), X.Product.Height. = c(16.89,
19, 10.2, 1.3, 8.9, 1, 11.5, 2.2, 4.7, 0.6, 1.6, 7.8, 15,
8.3, 10.2, 0.4, 0.4, 0.4, 0.5, 7.8), X.Profit.margin. = c(0.15,
0.25, 0.08, 0.08, 0.09, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05,
0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.15), Volume = c(12,
8, 12, 196, 232, 332, 44, 132, 64, 40, 84, 300, 40, 1252,
1396, 32, 44, 680, 60, 80)), .Names = c("Price", "X.5.Star.Reviews.",
"X.4.Star.Reviews.", "X.3.Star.Reviews.", "X.2.Star.Reviews.",
"X.1.Star.Reviews.", "X.Positive.Service.Review.", "X.Negative.Service.Review.",
"X.Would.consumer.recommend.product.", "X.Shipping.Weight..lbs..",
"X.Product.Depth.", "X.Product.Width.", "X.Product.Height.",
"X.Profit.margin.", "Volume"), row.names = c(NA, 20L), class = "data.frame")

The vif function from the VIF package does not estimate the Variance Inflation Factor (VIF). "It selects variables for a linear model" and "returns a subset of variables for building a linear model"; see here for the description.
What you want is the vif function from the car package.
install.packages("car")
library(car)
vif(model2) # This should do it
Edit: I won't comment specifically on the statistics side, but it seems like you have a perfect fit, something quite unusual, suggesting some problem in your data.
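If installing car is not an option, the quantity itself is easy to compute by hand. The following is a minimal base R sketch, not the car implementation: it regresses each predictor on the others and takes 1 / (1 - R^2). The mtcars data and the chosen columns are stand-ins for illustration, not the asker's data. For a model with only numeric predictors the numbers should agree with what car reports.

```r
# Manual VIF: regress each predictor on the others and use 1 / (1 - R^2).
# mtcars and the chosen predictors are stand-ins, not the asker's data.
predictors <- mtcars[, c("wt", "hp", "disp", "qsec")]

manual_vif <- sapply(names(predictors), function(v) {
  others <- setdiff(names(predictors), v)
  fit <- lm(reformulate(others, response = v), data = predictors)
  1 / (1 - summary(fit)$r.squared)
})

# Equivalent shortcut: the diagonal of the inverse correlation matrix.
vif_from_cor <- diag(solve(cor(predictors)))

print(round(manual_vif, 3))
```

Both routes fail (division by zero or a singular matrix) when one predictor is an exact linear combination of the others, which is itself a useful diagnostic.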

You're giving vif the wrong input. It wants the response y and predictor variables x:
vif(train$Volume,subset(train,select=-Volume),subsize=19)
I had to set the subsize argument to a value <= the number of observations (the default is 200).

Two R packages, "car" and "VIF", each define a function named vif(), and the two definitions differ. The result (or error) you get depends on which of them was attached last in your current session, because the later one masks the earlier.
If you attach "VIF" in the session and pass the linear model to vif(), you get the error from the original question, as shown below:
> model1 = lm(Satisfaction~., data1)
> library(VIF)
Attaching package: ‘VIF’
The following object is masked from ‘package:car’:
vif
> vif(model1)
Error in as.vector(y) - mean(y) : non-numeric argument to binary operator
In addition: Warning message:
In mean.default(y) : argument is not numeric or logical: returning NA
If you load "car" library in R session and not "VIF", then you will get the vif numbers as expected for a linear model as shown below:
> model1 = lm(Satisfaction~., data1)
> library(car)
Loading required package: carData
Attaching package: ‘car’
The following object is masked from ‘package:psych’:
logit
> vif(model1)
ProdQual Ecom TechSup CompRes Advertising ProdLine SalesFImage ComPricing
1.635797 2.756694 2.976796 4.730448 1.508933 3.488185 3.439420 1.635000
WartyClaim OrdBilling DelSpeed
3.198337 2.902999 6.516014
All the columns in data1 are numeric. Hope that helps
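A way to avoid depending on attach order is to call the function through its namespace. The sketch below assumes only that car may or may not be installed; mtcars stands in for data1, and the variable names are made up for illustration.

```r
# Built-in mtcars is a stand-in for the data1 used above.
model <- lm(mpg ~ wt + hp + disp, data = mtcars)

# List every attached package that currently defines vif(); the first
# entry is the one a bare vif() call would use.
vif_sources <- find("vif")
print(vif_sources)

# Calling through the namespace sidesteps masking entirely.
if (requireNamespace("car", quietly = TRUE)) {
  print(car::vif(model))
}
```

With neither package attached, find("vif") returns an empty character vector, and car::vif() still works as long as car is installed.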

Related

non-numeric argument to binary operator when using apply on a numeric dataframe

I have the dataframe DATA1 as shown for a few rows:
structure(list(S = c(12, 12, 15, 15, 15, 9, 9), UG = c(84, 84,
84, 84, 84, 84, 84), CSi = c(0.487181441487271, 0.623551085193489,
0.505057492620447, 0.704318096382286, 0.575388552145397, 0.400731851672016,
0.490770631112789), N_l = c(1, 3, 1, 3, 5, 1, 3), N_b = c(5,
5, 5, 5, 5, 5, 5), m = c(1.2, 0.85, 1.2, 0.85, 0.65, 1.2, 0.85
), A = c(-12, -12, -15, -15, -15, -9, -9), x.sqr = c(1440, 1440,
2250, 2250, 2250, 810, 810), e_1 = c(21.8, 21.8, 29, 29, 29,
14.6, 14.6), e_2 = c(0, 9.8, 0, 17, 17, 0, 2.6), e_3 = c(0, -2.2,
0, 5, 5, 0, -9.4), e_4 = c(0, 0, 0, 0, -7, 0, 0), e_5 = c(0,
0, 0, 0, -19, 0, 0), K_g = c(6340598.65753794, 6340598.65753794,
6429472.98493414, 6429472.98493414, 6429472.98493414, 6296482.86883766,
6296482.86883766), stiff.girder = c(0.517988322166146, 0.517988322166146,
0.643978136780243, 0.643978136780243, 0.643978136780243, 0.416960174810184,
0.416960174810184), stiff.deck = c(276.422028597005, 276.422028597005,
147.89589537037, 147.89589537037, 147.89589537037, 642.725952664716,
642.725952664716)), row.names = c(10L, 30L, 50L, 70L, 90L, 110L,
130L), class = "data.frame")
I am trying to fit the function Proposed with nonlinear regression, as follows:
Proposed <- function(N_b,N_l,m,A,x.sqr,e_1,e_2,e_3,e_4,e_5,K_g,a,b,c,d) {
e <- data.frame(e_1,e_2,e_3,e_4,e_5,N_l)
CSi <- m * ((N_l/N_b) * ((a*K_g)^b) +
(max(A * apply(e,1,function(v) combn(v[1:5],v["N_l"],sum))) / x.sqr) * ((c*K_g)^d))
return(CSi)
}
library(minpack.lm)
G_1 <- nlsLM(CSi ~ Proposed(N_b,N_l,m,A,x.sqr,e_1,e_2,e_3,e_4,e_5,K_g,a,b,c,d),
data = DATA1,
start = c(a = 0.01, b = 0.01, c = 0.01, d = 0.01))
I get the error:
Error in A * apply(e, 1, function(v) combn(v[1:5], v["N_l"], sum)) :
non-numeric argument to binary operator

Can't use survfit on some data.frames

I have a dataset I'm going to use for survival analysis, and it seems to be working fine when I use the whole set. However, once I slice it into smaller dataframes using data[which(data$variable1=="somevalue"), ] the thing seems to break down.
Most of the resulting smaller dataframes work fine, but some are a problem. In the problematic ones, I can use summary(survfit(Surv(time, status)~variable2, data=smalldataframe))$surv without a problem, but when I try summary(survfit(Surv(time, status)~variable2, data=smalldataframe), time=5)$surv, it throws Error in array(xx, dim = dd) : negative length vectors are not allowed.
I've tried looking at the data, to see if I have any weird values, like negative times, but there aren't any. Besides, if there were a problem with that, the full dataframe should be throwing an error too, but it doesn't. All the smaller dataframes are created using the same line of code, so I also don't understand why they are acting differently. And mostly, I don't understand why summary(survfit(...))$surv works fine, as does plot(survfit(...)), but when I want to calculate survival at a specific time, it suddenly doesn't like the data anymore.
Here's one of the offending dataframes
test <-
structure(list(time2 = c(0.15, 2.08, 2.06, 0.32, 39.45, 39.09,
2.57, 3.64, 13.57, 36.57, 36.26, 0.78, 0.1, 33.94, 3.1, NA, 1.77,
28.38, 1.24, NA, 1.87, 25.83, 2.62, 1.57, 1.6, 22.74, 21.03,
20.54, 20.03, 0.97, 19.35, 18.09, 2.61, 17.68, NA, 3.85, 3.52,
11.22, 11.52, 11.04, 10.51, 1.68, 10.4, 10.61, 9.01, 9.05, 7.8,
0.11, 4.83), status = c(1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1,
0, 1, NA, 1, 1, 1, NA, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1,
0, NA, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0), cas_dg = c(1,
2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5,
6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8,
8, 9, 9, 9, 9, 9)), .Names = c("time2", "status", "cas_dg"), row.names = c(NA, -49L), class = "data.frame")
The call that is giving me trouble is summary(survfit(Surv(time2, status)~cas_dg, data=test), time=5)$surv and that only with some of the smaller dataframes.
You need to use argument extend=TRUE in summary; according to ?summary.survfit:
extend: logical value: if TRUE, prints information for all specified
‘times’, even if there are no subjects left at the end of the
specified ‘times’. This is only valid if the ‘times’
argument is present.
So for your sample data, you can do:
fit <- survfit(Surv(time2, status) ~ cas_dg, data = test)
summary(fit, times = 5, extend = TRUE)$surv
#[1] 0.0000000 0.0000000 0.5555556 0.5000000 0.3333333 0.5714286 0.6000000
#[8] 0.6666667 0.8000000
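To see why extend matters, here is a small sketch on made-up data (the data frame d and its column names are hypothetical). Group "a" has no subjects left after time 3, so asking for survival at time 5 goes past the end of its curve; with extend = TRUE the summary still reports a row for that group.

```r
library(survival)

# Hypothetical toy data: every subject in group "a" is gone before time 5.
d <- data.frame(
  time   = c(1, 2, 3, 4, 6, 8),
  status = c(1, 1, 1, 1, 0, 1),
  grp    = c("a", "a", "a", "b", "b", "b")
)

fit <- survfit(Surv(time, status) ~ grp, data = d)

# extend = TRUE keeps a row for each group even past its last observed time.
s <- summary(fit, times = 5, extend = TRUE)
print(data.frame(strata = s$strata, time = s$time, surv = s$surv))
```

The full argument name is times; time = 5 in the question only works through R's partial argument matching.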

adjusting legend placement in plot of prc:vegan

data(pyrifos)
week <- gl(11, 12, labels=c(-4, -1, 0.1, 1, 2, 4, 8, 12, 15, 19, 24))
dose <- factor(rep(c(0.1, 0, 0, 0.9, 0, 44, 6, 0.1, 44, 0.9, 0, 6), 11))
ditch <- gl(12, 1, length=132)
mod <- prc(pyrifos, dose, week)
plot(mod)
How can I control the placement of the legend in this graph? For example, I would like it in the bottom right corner instead, because for my own data (not shown) the default placement covers the data.
Have you checked the help for plot.prc()? The legpos parameter controls the legend position.
Here is your solution:
library(vegan)
data(pyrifos)
week <- gl(11, 12, labels=c(-4, -1, 0.1, 1, 2, 4, 8, 12, 15, 19, 24))
dose <- factor(rep(c(0.1, 0, 0, 0.9, 0, 44, 6, 0.1, 44, 0.9, 0, 6), 11))
ditch <- gl(12, 1, length=132)
mod <- prc(pyrifos, dose, week)
plot(mod, legpos="bottomright")

plotting issue with autofitVariogram in automap

I am a newbie in R. I am applying autofitVariogram to daily rainfall data from 50 stations. The sample data is provided below. Some of the stations have missing values, represented by "NaN".
My question is regarding the variogram fit. The variogram covers only a distance of 60,000 m. Why are the points in bins beyond 60 km not plotted? I had seen from the spatial correlation plot that the maximum distance from the lon-lat information is > 200 km.
The summary of the latitude and longitude information is provided below.
summary(lonlat)
lon lat
Min. :74.78 Min. :15.77
1st Qu.:75.14 1st Qu.:16.04
Median :75.56 Median :16.33
Mean :75.54 Mean :16.37
3rd Qu.:75.94 3rd Qu.:16.66
Max. :76.31 Max. :17.23
Sample data given below:
dput(rain[140:145,])
structure(list(Col0 = c(0, 0, 1, 9, 6.5, 0), Col1 = c(1.5, 36,
21, 44, 4, 0), Col2 = c(0, 0, 24.5, 21.5, 7.5, 1), Col3 = c(0,
1, 45, 3, 0, 0), Col4 = c(2, 0, 5, 54.5, 13.5, 0), Col5 = c(0.5,
2, 0, 3.5, 13.5, 0), Col6 = c(0.5, 0, 0, 59, 15.5, 0), Col7 = c(0,
0, 2.5, 1, 0, 0), Col8 = c(0, 6, 24, 2, 5.5, 0), Col9 = c(0,
3, 6, 1, 0, 7), Col10 = c(0.5, 1, 64, 20, 1, 0.5), Col11 = c(NaN,
NaN, NaN, NaN, NaN, NaN), Col12 = c(0, 11, 75, 19, 15.5, 0),
Col13 = c(0, 4, 57.5, 50.5, 8.5, 0), Col14 = c(1.5, 0.5,
127, 33.5, 34.5, 0), Col15 = c(0, 7, 0.5, 13, 1, 0), Col16 = c(0,
0.5, 81.5, 15, 49, 0), Col17 = c(0, 0, 4.5, 17, 5.5, 1),
Col18 = c(0, 3, 2.5, 0.5, 0, 0), Col19 = c(NaN, NaN, NaN,
NaN, NaN, NaN), Col20 = c(0, 0, 0, 0, 7, 0), Col21 = c(0,
1, 0, 5, 3.5, 0), Col22 = c(0, 0, 11.5, 28, 3.5, 0), Col23 = c(0,
0, 48.5, 0, 24.5, 0), Col24 = c(0, 0, 0, 10, 0.5, 14), Col25 = c(NaN,
NaN, NaN, NaN, NaN, NaN), Col26 = c(0, 7.5, 16, 28.5, 20.5,
0), Col27 = c(1.5, 0.5, 38, 28.5, 50, 0), Col28 = c(NaN,
NaN, NaN, NaN, NaN, NaN), Col29 = c(NaN, NaN, NaN, NaN, NaN,
NaN), Col30 = c(2.5, 0, 0, 80.5, 28, 13.5), Col31 = c(1,
0, 17, 85.5, 3.5, 0), Col32 = c(0, 0.5, 8, 101, 20, 4), Col33 = c(NaN,
NaN, NaN, NaN, NaN, NaN), Col34 = c(4, 3, 17, 122, 2, 2),
Col35 = c(0, 15.5, 14.5, 20, 3.5, 0), Col36 = c(0, 6.5, 8.5,
21, 7, 0), Col37 = c(0, 0, 1.5, 14.5, 0, 1.5), Col38 = c(0,
28, 30, 4, 0, 73), Col39 = c(28.5, 0, 4.5, 9.5, 1, 0), Col40 = c(1.5,
11.5, 32.5, 55, 0, 1), Col41 = c(0, 14.5, 0, 19, 12.5, 47.5
), Col42 = c(0, 28, 29, 17, 0.5, 20.5), Col43 = c(NaN, NaN,
NaN, NaN, NaN, NaN), Col44 = c(0, 19, 3.5, 42, 0, 0), Col45 = c(0,
0, 85, 15.5, 1, 0), Col46 = c(0, 0.5, 8, 24, 0.5, 0), Col47 = c(0,
1.5, 7, 12, 8.5, 0), Col48 = c(0, 0, 0, 43.5, 0, 1.5), Col49 = c(0,
13.5, 1, 16, 1, 1)), .Names = c("Col0", "Col1", "Col2", "Col3",
"Col4", "Col5", "Col6", "Col7", "Col8", "Col9", "Col10", "Col11",
"Col12", "Col13", "Col14", "Col15", "Col16", "Col17", "Col18",
"Col19", "Col20", "Col21", "Col22", "Col23", "Col24", "Col25",
"Col26", "Col27", "Col28", "Col29", "Col30", "Col31", "Col32",
"Col33", "Col34", "Col35", "Col36", "Col37", "Col38", "Col39",
"Col40", "Col41", "Col42", "Col43", "Col44", "Col45", "Col46",
"Col47", "Col48", "Col49"), row.names = 143:148, class = "data.frame")
# Import the required libraries
library(rgdal)
library(maptools)
library(gstat)
library(sp)
library(automap)
library(XLConnect)
# Read the station data from xls file
stnrain = readWorksheetFromFile(path_fileName,"Sheet1", region = "D1:BA187", header = FALSE)
N = nrow(stnrain)
rain = stnrain[4:N,]
lat = as.numeric(t(stnrain[2,]))
lon = as.numeric(t(stnrain[3,]))
lonlat = cbind(lon,lat)
# Transform from GCS to UTM projection
sp = SpatialPoints(lonlat,proj4string = CRS("+proj=longlat"))
sp_utm = spTransform(sp, CRS("+proj=utm +zone=43N +datum=WGS84"))
krige_value = list() #prepare a list for storing the autokrige output
krige_stderr = list()
nRows = nrow(rain)
for (i in 1:nRows)
{
irain = rain[i,]
miss_indx = (irain == "NaN")
irain = irain[!miss_indx]
irain = as.numeric(irain)
isallZeros = (max(irain) == 0) # To take care of the cases of dry day(irain =0)
irain = as.data.frame(irain)
M = nrow(irain)
if ((M > 5) & (!isallZeros)) # To avoid cases of NaN across many stations
{
print(i)
foo_utm = sp_utm[!indx]# Removing the locations with NaN values
data = data.frame(foo_utm,irain)
names(data) = c("Easting","Northing","rain")
coordinates(data) = c("Easting","Northing")
variogram = autofitVariogram(rain~1,data,model = "Sph",fix.values=c(0,NA,NA))
p = plot(variogram, main="Semi-variogram (Spherical Model)",xlab="Distance(m)",ylab="Semi-Variance(mm2)", sub=paste("Range: ",variogram$var_model$range[2], "Day",i))
print(p)
png(p)
dev.off()
}
else
{
krige_value[[i]] = list(rep(0, L))
krige_stderr[[i]] = list(rep(0, L))
}
}
Q2) How can I save each variogram fit as a png file inside the loop? I understand that dev.off() should be called after saving each figure, which I have done, but I am not able to save the figure.
Any help would be appreciated.
Thanks,
In regard to your first question, the sample variogram is built using points up to a maximum distance of around 1/3 of the diagonal of the area of interest. The assumption here is that points farther away from each other than that are not related, and because they are not in the sample variogram or variogram model they are not plotted. This is just a choice, and might not be the correct choice, but when I wrote autofitVariogram it seemed to work well for my data. The variogram model you show confirms this: the range is smaller than 60 km.
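As a rough illustration of that rule of thumb, the base R sketch below computes the bounding-box diagonal and one third of it for a few made-up projected coordinates (the easting and northing values are hypothetical, in metres). It does not reproduce the exact bin boundaries automap uses; it only shows why the plotted distances stop well short of the largest station-to-station distance.

```r
# Hypothetical projected station coordinates in metres (not the asker's data).
easting  <- c(0, 30000, 60000)
northing <- c(0, 40000, 80000)

# Diagonal of the bounding box that spans the stations.
bbox_diagonal <- sqrt(diff(range(easting))^2 + diff(range(northing))^2)

# Rule-of-thumb cutoff: roughly one third of that diagonal.
cutoff <- bbox_diagonal / 3

# Largest station-to-station distance, for comparison.
max_pair_dist <- max(dist(cbind(easting, northing)))

cat("diagonal:", bbox_diagonal, " cutoff:", round(cutoff),
    " max pair:", max_pair_dist, "\n")
```

If longer lags are really needed, one option is to build the sample variogram directly with gstat's variogram() and pass a larger cutoff, then fit a model to that.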
For saving your png's I have two suggestions. First, call the plot command between png() and dev.off(), so not:
print(p)
png()
dev.off()
but:
png()
print(p)
dev.off()
In addition, I would create meaningful names for the png files.
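Put together, a loop that saves one file per day could look like the sketch below. It is base R only; the output directory, the file name pattern and the dummy plot are assumptions for illustration, and the plot() call stands in for print(p) on the variogram object.

```r
# Write one png per iteration; tempdir() is a placeholder output location.
out_dir <- tempdir()
files <- character(0)

for (i in 1:3) {
  # Meaningful, sortable file name built from the loop index.
  file_name <- file.path(out_dir, sprintf("variogram_day_%03d.png", i))

  png(file_name, width = 800, height = 600)  # open the device first
  plot(1:10, (1:10)^2, main = paste("Day", i))  # stand-in for print(p)
  dev.off()                                   # close it to flush the file

  files <- c(files, file_name)
}

print(basename(files))
```

The first argument of png() is the file name, so passing the plot object to it, as in png(p), cannot work.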
To create sets of variogram plots, I would use ggplot2, with geom_line and facet_wrap. ggplot2 cannot deal directly with gstat/automap variogram models, but luckily you can create distance/semivariance data using the function variogramLine from gstat. See for example figure 3.1, and the plots in appendix A of this report I wrote. This answer I wrote earlier also includes an example of using ggplot2 for spatial data, this time to plot a grid map.

xyplot not merging plots when more than two conditioning variables

When I run the following code, xyplot produces 4 separate 2 by 3 plots, whereas I want a single 4 by 6 trellis (to save real estate on the axis annotation and legends).
Note that my problem is different from this one in that I don't want to see four sets of axes/legends.
Here is some example data:
B <- structure(list(yval = c(0.88, 4.31, 7.52, 3.21, 3.27, 4.93, 4.21,
0.7, 0.68, 0.92, 3.86, 5.67, 9.08, 1.95, 3.27, 1.44, 2.38, 0.85,
0.79, 0.55, 0.79, 10.52, 0.9, 4, 0.78, 2.46, 0.78, 1.64, 2.47,
0.77, 0.83, 0.86, 3.65, 8.25, 0.65, 0.88, 0.95, 4.05, 4.98, 1.43,
4.43, 2.94, 5.52, 0.9, 3.69, 0.79, 0.74, 1.49, 7.29, 0.58, 8.47,
5.82, 0.84, 0.87, 0.69, 1.38, 0.83, 2.32, 0.86, 7.32, 6.73, 6.7,
3.3, 1.58, 2.74, 0.88, 4.2, 3.79, 4.98, 2.54, 1.84, 1.2, 2.59,
11.99, 0.78, 0.92, 0.59, 3.83, 0.92, 2.6, 0.95, 3.18, 2.75, 9.83,
9.81, 0.55, 0.83, 6.29, 1.64, 1.12, 0.65, 3.96, 4.27, 3.99, 20,
0.83, 6.23, 6.81, 0.86, 0.7), xval = c(0.62, 0.81, 9.01, 3.72,
1.49, 3.92, 6.22, 6.64, 5.56, 6.64, 4, 7.36, 9.6, 1, 1.64, 3.34,
3.47, 3.37, 4.34, 6.63, 7.62, 4.07, 5.69, 3.76, 9.74, 1.58, 1.53,
2.62, 1.64, 1.18, 9.79, 9.9, 2.76, 7.96, 5.11, 4.74, 9.92, 0.49,
9.05, 8.59, 0.7, 5.8, 5.34, 3.14, 6.96, 2.05, 8.29, 0.35, 7.52,
6.56, 2.01, 7.92, 3.89, 6.31, 8.64, 6.18, 4.49, 0.63, 7.52, 7.82,
1.25, 9.54, 4.68, 0.4, 1.38, 8.7, 4.71, 8.27, 5.72, 0.75, 6.08,
0.11, 1.38, 0.37, 4.94, 0.53, 7.53, 3.11, 2.73, 4.93, 9.47, 2.18,
4.54, 7.12, 8.28, 6.62, 5.14, 4.42, 0.21, 9.52, 3.77, 6.43, 6.78,
6.87, 9.47, 6.42, 0.81, 8.88, 7.2, 8.68), gval = c(1, 2, 5, 5,
2, 1, 2, 1, 2, 3, 6, 5, 1, 3, 2, 3, 5, 2, 6, 4, 4, 1, 1, 6, 4,
2, 1, 2, 4, 5, 5, 3, 6, 5, 4, 2, 2, 3, 3, 6, 2, 4, 1, 4, 4, 1,
1, 2, 2, 5, 1, 1, 2, 2, 1, 3, 1, 5, 6, 5, 1, 5, 4, 4, 3, 6, 6,
4, 5, 4, 4, 6, 5, 6, 5, 2, 1, 1, 6, 6, 2, 5, 5, 1, 1, 4, 6, 3,
4, 6, 3, 5, 3, 3, 6, 2, 1, 5, 1, 3), type = c(5, 2, 1, 5, 1,
1, 1, 1, 2, 12, 5, 1, 2, 5, 5, 12, 12, 12, 12, 2, 12, 2, 12,
5, 12, 2, 12, 12, 5, 12, 12, 12, 5, 2, 5, 12, 1, 1, 1, 1, 2,
12, 1, 12, 2, 12, 2, 2, 1, 1, 2, 1, 5, 12, 12, 5, 12, 5, 5, 1,
1, 1, 2, 5, 5, 5, 5, 5, 1, 5, 12, 12, 5, 2, 12, 12, 1, 1, 5,
5, 5, 2, 5, 1, 2, 2, 5, 1, 5, 2, 5, 5, 5, 2, 2, 5, 1, 2, 2, 5
), cr = c(0.2, 0.4, 0.4, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.4, 0.4,
0.4, 0.4, 0.2, 0.4, 0.4, 0.4, 0.2, 0.2, 0.2, 0.2, 0.4, 0.4, 0.4,
0.2, 0.2, 0.2, 0.4, 0.2, 0.2, 0.4, 0.4, 0.4, 0.4, 0.2, 0.4, 0.2,
0.4, 0.2, 0.2, 0.4, 0.4, 0.2, 0.2, 0.4, 0.2, 0.2, 0.2, 0.4, 0.2,
0.4, 0.2, 0.2, 0.4, 0.4, 0.2, 0.2, 0.4, 0.2, 0.4, 0.4, 0.4, 0.4,
0.2, 0.4, 0.4, 0.4, 0.4, 0.2, 0.4, 0.4, 0.2, 0.4, 0.4, 0.2, 0.2,
0.2, 0.2, 0.2, 0.4, 0.4, 0.4, 0.2, 0.4, 0.4, 0.2, 0.2, 0.4, 0.4,
0.2, 0.2, 0.2, 0.4, 0.2, 0.4, 0.4, 0.4, 0.4, 0.2, 0.2), p = c(4,
12, 12, 8, 12, 8, 12, 4, 4, 8, 8, 4, 4, 8, 8, 8, 4, 12, 8, 4,
12, 12, 12, 12, 8, 12, 4, 4, 8, 8, 8, 4, 8, 12, 4, 12, 12, 4,
12, 8, 4, 4, 12, 4, 4, 8, 4, 4, 8, 4, 8, 12, 12, 8, 4, 8, 8,
8, 8, 12, 4, 8, 4, 12, 4, 4, 12, 4, 12, 12, 8, 4, 4, 12, 8, 12,
4, 4, 12, 4, 8, 4, 8, 12, 8, 4, 4, 4, 8, 4, 4, 12, 8, 12, 8,
4, 4, 8, 8, 4), nsamp = c(100, 300, 300, 200, 300, 200, 300,
100, 100, 200, 200, 100, 100, 200, 200, 200, 100, 300, 200, 100,
300, 300, 300, 300, 200, 300, 100, 100, 200, 200, 200, 100, 200,
300, 100, 300, 300, 100, 300, 200, 100, 100, 300, 100, 100, 200,
100, 100, 200, 100, 200, 300, 300, 200, 100, 200, 200, 200, 200,
300, 100, 200, 100, 300, 100, 100, 300, 100, 300, 300, 200, 100,
100, 300, 200, 300, 100, 100, 300, 100, 200, 100, 200, 300, 200,
100, 100, 100, 200, 100, 100, 300, 200, 300, 200, 100, 100, 200,
200, 100)), .Names = c("yval", "xval", "gval", "type", "cr",
"p", "nsamp"), row.names = c(NA, -100L), class = "data.frame")
And here is the code I am running:
library(lattice)
library(latticeExtra)
library(grid)
types<-rep(NA,6)
types[1]<-expression(paste(epsilon,"=",0.2,", p=",4,sep=""))
types[2]<-expression(paste(epsilon,"=",0.2,", p=",8,sep=""))
types[3]<-expression(paste(epsilon,"=",0.2,", p=",12,sep=""))
types[4]<-expression(paste(epsilon,"=",0.4,", p=",4,sep=""))
types[5]<-expression(paste(epsilon,"=",0.4,", p=",8,sep=""))
types[6]<-expression(paste(epsilon,"=",0.4,", p=",12,sep=""))
types<-rep(types,4)
cl<-rainbow(7)[-4]
xyplot(B$yval~B$xval|as.factor(B$p)*as.factor(B$cr)*as.factor(B$type),
group=B$gval, as.table=TRUE,
ylab=expression(kappa(Sigma,S)), col=cl, xlab=expression(nu),
xlim=c(0,10), ylim=c(0,10), type=c("l","g"), lwd=5, cex.lab=2,
strip=function(...){
panel.fill(trellis.par.get("strip.background")$col[1])
type <- types[panel.number()]
grid::grid.text(label=type,x=0.5,y=0.5,gp=gpar(fontsize=20))
grid::grid.rect()
},
key=list(text=list(c("A","B","C","D","E","F"),cex=2),
lines=list(type=rep("l",6), label.cex=2,col=cl,lwd=3),columns=3),
par.settings=list(par.xlab.text=list(cex=2),axis.text=list(cex=2),
par.ylab.text=list(cex=2)))
Three conditioning variables means that it makes a three dimensional grid of panels, where the third dimension is onto multiple pages. One alternative is to only condition on two variables; here I use : to make the first conditioning factor the intersection of the first two original conditioning factors.
xyplot(B$yval~B$xval|as.factor(B$p):as.factor(B$cr)*as.factor(B$type), ...
I think you want layout=c(6,4) somewhere in your call to xyplot. Once you do that you will have to reconfigure many other settings.
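The two suggestions can be combined. The sketch below uses made-up data with the same conditioning structure (3 values of p, 2 of cr, 4 of type); the data frame d and the pcr column are hypothetical names. It collapses p and cr into one factor with interaction() and asks for 6 columns by 4 rows on a single page.

```r
library(lattice)

# Hypothetical data with the same conditioning structure as B.
set.seed(1)
d <- expand.grid(p = c(4, 8, 12), cr = c(0.2, 0.4),
                 type = c(1, 2, 5, 12), x = 1:5)
d$y <- runif(nrow(d), 0, 10)

# Collapse two conditioning variables into one factor with 6 levels.
d$pcr <- interaction(d$p, d$cr, sep = ", ")

# Two conditioning factors (6 x 4) laid out on a single page.
plt <- xyplot(y ~ x | pcr * factor(type), data = d,
              type = "l", as.table = TRUE, layout = c(6, 4))
print(plt)
```

Note that layout is given as c(columns, rows), so a 4 by 6 trellis in rows by columns terms is layout = c(6, 4).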
