Specifying number of trials, bootstrap

Specifying number of trials, bootstrap - r

For an assignment, I am applying mixture modeling with the mixtools package on R. When I try to figure out the optimal amount of components with bootstrap. I get the following error
Error in boot.comp(y, x, N = NULL, max.comp = 2, B = 5, sig = 0.05, arbmean = TRUE, :
Number of trials must be specified!
I found out that I have to fill an N: An n-vector of number of trials for the logistic regression type logisregmix. If
NULL, then N is an n-vector of 1s for binary logistic regression.
But, I don't know how to find out what the N is in fact to make my bootstrap working.
Link to my codes:
https://www.kaggle.com/blastchar/telco-customer-churn
My codes:
data <- read.csv("Desktop/WA_Fn-UseC_-Telco-Customer-Churn.csv", stringsAsFactors = FALSE,
na.strings = c("NA", "N/A", "Unknown*", "NULL", ".P"))
data <- droplevels(na.omit(data))
data <- data[c(1:5032),]
testdf <- data[c(5033:7032),]
data <- subset(data, select = -customerID)
set.seed(100)
library(plyr)
library(mixtools)
data$Churn <- revalue(data$Churn, c("Yes"=1, "No"=0))
y <- as.numeric(data$Churn)
x <- model.matrix(Churn ~ . , data = data)
x <- x[, -1] #remove intercept
x <-x[,-c(7, 11, 13, 15, 17, 19, 21)] #multicollinearity
a <- boot.comp(y, x, N = NULL, max.comp = 2, B = 100,
sig = 0.05, arbmean = TRUE, arbvar = TRUE,
mix.type = "logisregmix", hist = TRUE)
Below there is more information about my predictors:
dput(x[1:4,])
structure(c(0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1,
34, 2, 45, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
1, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,
0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0,
0, 0, 1, 1, 0, 29.85, 56.95, 53.85, 42.3, 29.85, 1889.5, 108.15,
1840.75), .Dim = c(4L, 23L), .Dimnames = list(c("1", "2", "3",
"4"), c("genderMale", "SeniorCitizen", "PartnerYes", "DependentsYes",
"tenure", "PhoneServiceYes", "MultipleLinesYes", "InternetServiceFiber optic",
"InternetServiceNo", "OnlineSecurityYes", "OnlineBackupYes",
"DeviceProtectionYes", "TechSupportYes", "StreamingTVYes", "StreamingMoviesYes",
"ContractOne year", "ContractTwo year", "PaperlessBillingYes",
"PaymentMethodCredit card (automatic)", "PaymentMethodElectronic check",
"PaymentMethodMailed check", "MonthlyCharges", "TotalCharges"
)))
My response variable is binary
I hope you guys can help me out!

Looking in the source code of mixtools::boot.comp, which is scary as it is over 800 lines long and in serious need of refactoring, the offending lines are:
if (mix.type == "logisregmix") {
if (is.null(N))
stop("Number of trials must be specified!")
Despite what the documentation says, N must be specified.
Try to set it to a vector of 1s: N = rep(1, length(y)) or N = rep(1, nrow(x))
In fact, if you look in mixtools::logisregmixEM, the internal function called by boot.comp, you'll see how N is set if NULL:
n <- length(y)
if (is.null(N)) {
N = rep(1, n)
}
Too bad this is never reached if N is NULL since it stops with an error before. This is a bug.

Related

Aggregate similar constructs/ FA with binary variables

I would like to aggregate, in order to reduce the number of constructs, its following data frame containing only binary variables that correspond to "yes/no", its following data frame (first 10 row). The original data frame contains 169 rows.
outcome <-
structure(list(Q9_Automazione.processi = c(0, 0, 0, 0, 0, 0,
1, 1, 1, 0), Q9_Velocita.Prod = c(1, 0, 0, 1, 0, 0, 1, 1, 1,
0), Q9_Flessibilita.Prod = c(0, 0, 0, 1, 0, 0, 1, 1, 0, 1), Q9_Controllo.processi = c(0,
0, 0, 1, 0, 0, 1, 1, 0, 0), Q9_Effic.Magazzino = c(0, 0, 0, 1,
0, 0, 0, 0, 0, 0), Q9_Riduz.Costi = c(0, 1, 0, 0, 0, 0, 0, 0,
0, 1), Q9_Miglior.Sicurezza = c(0, 0, 0, 0, 0, 0, 1, 0, 1, 1),
Q9_Connett.Interna = c(0, 0, 0, 0, 0, 0, 0, 1, 1, 0), Q9_Connett.Esterna = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0), Q9_Virtualizzazione = c(0, 0,
0, 0, 0, 0, 0, 0, 0, 0), Q9_Innov.Prod = c(0, 0, 0, 0, 0,
1, 0, 0, 0, 1), Q9_Person.Prod = c(0, 1, 0, 1, 0, 1, 0, 0,
0, 1), Q9_Nuovi.Mercati = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
Q9_Nuovi.BM = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), Q9_Perform.Energ = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0), Q9_Perform.SostAmb = c(0, 0,
0, 0, 0, 0, 0, 0, 0, 0)), row.names = c(NA, 10L), class = "data.frame")
I have tried performing factor analysis via the tethracoric method on the obtained correlation matrix ( the obtained value from the KMO function turns out to be inadequate) both directly on the dataframe and then using tethracoric correletions in fafunction (using cor = "tet" I get a negative Tucker Lewis Index).
I have been reading up on this but cannot find a methodology that is adequate and of which I am certain of the correctness of the analysis.
So basically what I would like to achieve is to aggregate similar constructs, e.g., assess whether column 5 has value 1 (i.e., "yes") almost always when column 11 has value 1 and then aggregate.
Here the code that I try to used
library(psych)
tet <- tetrachoric(outcome)
corrplot(tet$rho, "ellipse", tl.cex = 0.75, tl.col = "black")
par(mfrow = c(1,2))
corr_matrix %>%
ggcorrplot(show.diag = F,
type="lower",
lab=TRUE,
lab_size=2)
KMO(corr_matrix)
cortest.bartlett(corr_matrix)
fa.parallel(corr_matrix, fm = "ml")
factor <- fa(corr_matrix, nfactors = 3, rotate = "oblimin", fm = "ml")
print(factor, cut = 0.3, digits = 3)
# -------- Pearson --------
cor(outcome, method = 'pearson', use = "pairwise.complete.obs") %>%
ggcorrplot(show.diag = F,
type="lower",
lab=TRUE,
lab_size=2)
KMO(outcome)
cortest.bartlett(outcome)
fa.parallel(outcome)
factor1 <- fa(outcome, nfactors = 3, rotate = "oblimin", cor = "tet", fm = "ml")
print(factor1, cut = 0.3, digits = 3)

How to calculate mean value of all columns of datarame [duplicate]

This question already has answers here:
calculate the mean for each column of a matrix in R
(10 answers)
Closed last year.
I have a data frame and I want to calculate the mean of all columns and save it into a new dataframe. I found this solution calculate the mean for each column of a matrix in R however, this is only for matrix and not dataframe
structure(list(TotFlArea = c(1232, 596, 708, 1052, 716), logg_weighted_assess = c(13.7765298160156,
13.1822275291412, 13.328376420438, 13.3076293132057, 13.5164823091252
), TypeDwel1.2.Duplex = c(0, 0, 0, 0, 0), TypeDwelApartment.Condo = c(0,
1, 1, 1, 1), TypeDwelTownhouse = c(1, 0, 0, 0, 0), Age_new.70 = c(0,
0, 0, 0, 0), Age_new0.1 = c(0, 0, 0, 0, 0), Age_new16.40 = c(1,
1, 0, 1, 0), Age_new2.5 = c(0, 0, 0, 0, 0), Age_new41.70 = c(0,
0, 0, 0, 0), Age_new6.15 = c(0, 0, 1, 0, 1), LandFreehold = c(1,
1, 1, 0, 1), LandLeasehold.prepaid = c(0, 0, 0, 1, 0), LandOthers = c(0,
0, 0, 0, 0), cluster_K_mean.1 = c(0, 0, 0, 0, 0)), row.names = c("1",
"2", "3", "4", "5"), class = "data.frame")
Can you please advise how I can do this?
Note: my data frame can have NA values which should be excluded from mean calculation

As #akrun pointed out. Also another alternative
apply(df, 2, mean)
where 2 means by column and 1 is by row.
However, besides its flexibility (e.g. changing from mean to mode or applying to selected columns only apply(df[,c('a', 'b')], 2, mean)) below shows the disadvantage to using apply (in terms of speed)
library(data.table)
library(microbenchmark)
# dummy data
x <- 1e7
df <- data.table(a = 1:x )
y <- letters[2:10]
df[, (y) := lapply(2:10, \(i) a+i)]
# benchmark
z <-
microbenchmark(colMeans = {colMeans(df)}
, apply = {apply(df, 2, mean)}
, times = 30
)
plot(z)

How to make a loop for these regressions?

I have a dataframe with 10 columns like this:
df = structure(list(X1 = c(-0.158841494166799, 1.74997712540787,
-0.603638753496694, -0.253379995687274, -1.13536828104642, -2.72698649676692,
0.0243826193956672, 3.21776393858788, -2.3633921387719, 0.0305028420399468,
-4.26940546325382, 1.55584592541131, -1.05118869595721, -2.84985861365441,
0.729293004856561, -3.80058253179317, 1.31191742952459, 2.33589025288871,
-0.315014740930705, 2.92302109498542, 2.26246357678861, 1.37039290089958,
0.0582396621450368, -0.902295561314538, 2.14889801735908, 0.391493314141552,
2.33280118206325, -0.0490864536654898, 0.068965431468872, 1.24553936025063
), X2 = c(0.428917030889186, 1.38179284926331, 2.83063848525283,
-0.926689840680047, -1.3903457418351, 0.708618895316463, -0.750232095654876,
0.894660121671367, 0.124277732759992, -1.98558522788821, 2.34165530946135,
2.85945910959032, 2.36563560772223, 0.0185741299096399, -2.48859548768527,
-1.26592501904523, -0.790849261930494, -0.828149152178906, 3.2014616981455,
-0.15442363719638, -0.516775154465199, -0.176649448869891, -1.41825025459806,
2.17806157731774, -0.44973481848562, 2.24655423722927, 2.47628395430464,
1.91613790409017, -0.393928617864301, -0.148584327095393), X3 = c(-0.430287082709084,
-0.625327045828844, -0.514958706302276, 0.751266821352889, 1.18006572060265,
-0.802136052260364, 0, 0.105706401868772, 0.442778243504233,
0.905950590121364, -1.26899829497004, 0.745801518841586, -0.978033251950966,
0.113286791573796, 1.20968420311602, -0.635010679093886, 0.858274389205649,
-1.009308286611, 0, 0.167623720006668, -0.720624475890533, -0.443172067136528,
-0.0181650919153045, -0.818017257660172, 1.18137870264753, -0.0566121497554404,
0.572314218120067, -1.01361737919216, 0.637155618813563, 0.00507063594816648
), X4 = c(1.05105923858325, -0.808507106602501, 1.01063388325313,
-0.363828197971125, -0.889357817262751, 0.0808507106602501, 0,
0.4851042639615, 0.687231040612125, -1.53616350254475, 2.02126776650625,
-1.09148459391338, -0.929783172592876, 0.323402842641, 0.768081751272376,
0.444678908631375, -1.53616350254475, 0.970208527923, 0, 0.646805685282001,
-1.01063388325313, 0.40425355330125, -0.323402842641, -0.202126776650625,
1.09148459391338, -0.970208527923, -0.606380329951876, 0, 0.606380329951876,
0.161701421320499), X5 = c(-1.68622583629267, -2.11651291900176,
-2.7418399648306, -3.25679867113288, -2.50553184977999, -1.32546612917734,
-2.12760218143771, -2.12760218143771, -2.02189577956894, -1.5791175360647,
-0.67316694594334, -1.94216524091338, -1.19636372207179, -2.17439697402276,
-2.06111018244896, -0.85142597933294, -1.48643665842683, -0.628162269221177,
-1.63747055583218, -1.63747055583218, -1.46984683582551, -2.19047131171604,
-2.63364337885257, -2.65180847076788, -3.46982572842805, -2.28844702578052,
-2.34505917553596, -1.77274495741589, -2.78636233660805, -2.14920671779449
), X6 = c(-3.52265292326337, -2.47159368468012, -3.28010079128262,
-2.26946690802949, -2.63329510600062, -3.52265292326337, -3.44180221260312,
-3.44180221260312, -2.95669794864162, -2.26946690802949, -3.80563041057424,
-1.78436264406799, -2.87584723798137, -3.80563041057424, -3.48222756793324,
-2.71414581666087, -2.26946690802949, -3.80563041057424, -2.83542188265124,
-2.83542188265124, -2.18861619736924, -3.19925008062237, -2.79499652732112,
-3.11839936996212, -3.32052614661274, -2.22904155269937, -3.19925008062237,
-3.80563041057424, -3.80563041057424, -3.19925008062237), X7 = c(1,
0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 1, 0, 0, 1, 0, 0), X8 = c(0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0),
X9 = c(1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0), X10 = c(3.2936,
3.283, 3.264, 3.3162, 3.31, 3.3485, 3.3017, 3.2927, 3.3186,
3.3004, 3.3043, 3.0945, 2.7868, 2.7635, 2.749, 2.5605, 2.5571,
2.2133, 2.0783, 2.0981, 2.0209, 2.0148, 1.973, 2.0567, 2.0214,
2.0315, 2.0057, 2.078, 2.0152, 2.0277)), row.names = c(NA,
30L), class = "data.frame")
What I want to do is be much faster in the implementation of the following code:
p = 3
instru1 = df$X7[1:(30-p)]
instru2 = df$X8[1:(30-p)]
instru3 = df$X9[1:(30-p)]
var <- VAR(df[, c(1:4,10)], p, type = "const")
summary(lm(var$varresult$[# the name of the first variable that appears]$residuals ~ instru1))
linearHypothesis(lm(var$varresult$[# the name of the first variable that appears]$residuals ~ instru1), "instru1 = 0", test = "F", vcov = vcovHAC, type = "HC1")
I did this code only for instru1 and for one column of interest (col 10 in df). I would like to do the same exercise, for instru1, instru2 and instru3 and, in addition to
VAR(df[, c(1:4,10)], p, type = "const")
for
VAR(df[, c(1:4,5)], p, type = "const")
and
VAR(df[, c(1:4,6)], p, type = "const").
In other words, I would like to avoid doing:
p = 3
instru1 = df$X7[1:(30-p)]
instru2 = df$X8[1:(30-p)]
instru3 = df$X9[1:(30-p)]
var1 <- VAR(df[, c(1:4,5)], p, type = "const")
summary(lm(var1$varresult$[# the name of the first variable that appears]$residuals ~ instru1))
linearHypothesis(lm(var1$varresult$[# the name of the first variable that appears]$residuals ~ instru1), "instru1 = 0", test = "F", vcov = vcovHAC, type = "HC1")
var2 <- VAR(df[, c(1:4,6)], p, type = "const")
summary(lm(var2$varresult$[# the name of the first variable that appears]$residuals ~ instru1))
linearHypothesis(lm(var2$varresult$[# the name of the first variable that appears]$residuals ~ instru1), "instru1 = 0", test = "F", vcov = vcovHAC, type = "HC1")
# and then the same but plugging instru2 and instru3 in place of instru1. Is it possible to have all in one loop?
Ideally I would like to have everything in one loop but, if not feasible, it is also fine to have a code for the VAR part and then I will update the 'instru' bit manually.
Can anyone help me?
Thanks a lot!

I believe that the best way is to write a function generalizing the VAR/linearHypothesis code, and call that function with the required arguments.
Step 1: the function
library(vars)
library(car)
customVAR <- function(DF, Select, Regr, p, type = "const"){
n <- nrow(DF)
instru <- DF[[Regr]][seq.int(n - p)]
var_fit <- VAR(DF[, Select], p = p, type = type)
r <- resid(var_fit$varresult[[1]])
lm_fit <- lm(r ~ instru)
smry <- summary(lm_fit)
lh <- linearHypothesis(lm_fit, "instru = 0", test = "F", vcov = vcovHAC, type = "HC1")
list(Summary = smry, linearHyp = lh)
}
Step 2: test it for one model
customVAR(df, Select = c(1:4, 10), Regr = 'X7', p = 3)
Step 3: fit several models, varying the regressors
Regr_vec <- paste0('X', 7:9)
var_list <- lapply(Regr_vec, function(R){
customVAR(df, Select = c(1:4, 10), Regr = R, p = 3)
})
var_list[[2]]$Summary
var_list[[2]]$linearHyp
Step 4: vary the responses
Select_list <- list(c(1:4, 10), c(1:4, 5), c(1:4, 6))
var_list2 <- lapply(Select_list, function(S){
customVAR(df, Select = S, Regr = 'X7', p = 3)
})
var_list2[[3]]$Summary
var_list2[[3]]$linearHyp

Hi Rollo/Arma 91 hopefully this answers your question (note this is untested as the package used is not specified).
# Constant:
p = 3
# Instruments:
instru_col_seq <- 7:10
instru_list <- vector("list", length(instru_col_seq))
instru_list <- setNames(lapply(instru_col_seq, function(i){
df[(1:(30-p)), i]
}
), paste0("instru", 1:length(instru_col_seq)))
# Var Setup:
var_col_seq <- unique(pmin(seq_along(df)+3, ncol(df)))
var_list <- vector("list", length(col_seq))
var_lm_summmary_list <- var_list
var_lh_list <- var_list
# Var computations:
var_list <- lapply(col_seq, function(j){
var <- VAR(df[, c(1:4,j)], p, type = "const")
}
)
# Var lm:
var_lm_summmary_list <- lapply(seq_along(var_list), function(k){
summary(lm(var_list[k]$varresult$[names(df)[var_col_seq[k]]]$residuals ~ instru_list[1]))
}
)
# Var lh:
var_lh_list <- lapply(seq_along(var_list), function(l){
linearHypothesis(lm(var$varresult$[names(df)[var_col_seq[l]]]$residuals ~ instru_list[1]),
"instru1 = 0", test = "F", vcov = vcovHAC, type = "HC1")
}
)
Data:
df <- structure(list(X1 = c(-0.158841494166799, 1.74997712540787,
-0.603638753496694, -0.253379995687274, -1.13536828104642, -2.72698649676692,
0.0243826193956672, 3.21776393858788, -2.3633921387719, 0.0305028420399468,
-4.26940546325382, 1.55584592541131, -1.05118869595721, -2.84985861365441,
0.729293004856561, -3.80058253179317, 1.31191742952459, 2.33589025288871,
-0.315014740930705, 2.92302109498542, 2.26246357678861, 1.37039290089958,
0.0582396621450368, -0.902295561314538, 2.14889801735908, 0.391493314141552,
2.33280118206325, -0.0490864536654898, 0.068965431468872, 1.24553936025063
), X2 = c(0.428917030889186, 1.38179284926331, 2.83063848525283,
-0.926689840680047, -1.3903457418351, 0.708618895316463, -0.750232095654876,
0.894660121671367, 0.124277732759992, -1.98558522788821, 2.34165530946135,
2.85945910959032, 2.36563560772223, 0.0185741299096399, -2.48859548768527,
-1.26592501904523, -0.790849261930494, -0.828149152178906, 3.2014616981455,
-0.15442363719638, -0.516775154465199, -0.176649448869891, -1.41825025459806,
2.17806157731774, -0.44973481848562, 2.24655423722927, 2.47628395430464,
1.91613790409017, -0.393928617864301, -0.148584327095393), X3 = c(-0.430287082709084,
-0.625327045828844, -0.514958706302276, 0.751266821352889, 1.18006572060265,
-0.802136052260364, 0, 0.105706401868772, 0.442778243504233,
0.905950590121364, -1.26899829497004, 0.745801518841586, -0.978033251950966,
0.113286791573796, 1.20968420311602, -0.635010679093886, 0.858274389205649,
-1.009308286611, 0, 0.167623720006668, -0.720624475890533, -0.443172067136528,
-0.0181650919153045, -0.818017257660172, 1.18137870264753, -0.0566121497554404,
0.572314218120067, -1.01361737919216, 0.637155618813563, 0.00507063594816648
), X4 = c(1.05105923858325, -0.808507106602501, 1.01063388325313,
-0.363828197971125, -0.889357817262751, 0.0808507106602501, 0,
0.4851042639615, 0.687231040612125, -1.53616350254475, 2.02126776650625,
-1.09148459391338, -0.929783172592876, 0.323402842641, 0.768081751272376,
0.444678908631375, -1.53616350254475, 0.970208527923, 0, 0.646805685282001,
-1.01063388325313, 0.40425355330125, -0.323402842641, -0.202126776650625,
1.09148459391338, -0.970208527923, -0.606380329951876, 0, 0.606380329951876,
0.161701421320499), X5 = c(-1.68622583629267, -2.11651291900176,
-2.7418399648306, -3.25679867113288, -2.50553184977999, -1.32546612917734,
-2.12760218143771, -2.12760218143771, -2.02189577956894, -1.5791175360647,
-0.67316694594334, -1.94216524091338, -1.19636372207179, -2.17439697402276,
-2.06111018244896, -0.85142597933294, -1.48643665842683, -0.628162269221177,
-1.63747055583218, -1.63747055583218, -1.46984683582551, -2.19047131171604,
-2.63364337885257, -2.65180847076788, -3.46982572842805, -2.28844702578052,
-2.34505917553596, -1.77274495741589, -2.78636233660805, -2.14920671779449
), X6 = c(-3.52265292326337, -2.47159368468012, -3.28010079128262,
-2.26946690802949, -2.63329510600062, -3.52265292326337, -3.44180221260312,
-3.44180221260312, -2.95669794864162, -2.26946690802949, -3.80563041057424,
-1.78436264406799, -2.87584723798137, -3.80563041057424, -3.48222756793324,
-2.71414581666087, -2.26946690802949, -3.80563041057424, -2.83542188265124,
-2.83542188265124, -2.18861619736924, -3.19925008062237, -2.79499652732112,
-3.11839936996212, -3.32052614661274, -2.22904155269937, -3.19925008062237,
-3.80563041057424, -3.80563041057424, -3.19925008062237), X7 = c(1,
0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 1, 0, 0, 1, 0, 0), X8 = c(0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0),
X9 = c(1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0), X10 = c(3.2936,
3.283, 3.264, 3.3162, 3.31, 3.3485, 3.3017, 3.2927, 3.3186,
3.3004, 3.3043, 3.0945, 2.7868, 2.7635, 2.749, 2.5605, 2.5571,
2.2133, 2.0783, 2.0981, 2.0209, 2.0148, 1.973, 2.0567, 2.0214,
2.0315, 2.0057, 2.078, 2.0152, 2.0277)), row.names = c(NA,
30L), class = "data.frame")

I tackled similar problem recently.
What helped me was using assign() function inside for loop:
example:
for (i in 1:3){
assign(x = paste0("variable",i),value = i*10)}
What automates process of manual assignment like:
variable1 = 10
variable2 = 20
variable3 = 30

What does this error mean "order(vertex_attr(g, measure), decreasing = TRUE) : argument 1 is not a vector" in R?

I am trying to calculate robustness, a graph theory measure using R (braingraph package).
Robustness = robustness(my_networkgraph, type = c("vertex"), measure = ("btwn.cent"))
I get the following error, when I use the above robustness function:
Error in order(vertex_attr(g, measure), decreasing = TRUE) : argument 1 is not a vector
Any idea, what I am doing wrong here?
My network, which is a matrix has been converted to igraph object and robustness was calculated.
My network as a matrix:
mynetwork <- matrix(c(0, 1, 0, 1, 0, 0, 0, 0,
1, 0, 1, 0, 0, 0, 0, 0,
0, 1, 0, 0, 0, 0, 0, 0,
1, 0, 0, 0, 0, 1, 0, 0,
0, 0, 0, 0, 0, 1, 0, 0,
0, 0, 0, 1, 1, 0, 1, 1,
0, 0, 0, 0, 0, 1, 0, 0,
0, 0, 0, 0, 0, 1, 0, 0), nrow = 8)
This matrix was converted as igraph using the following code:
my_networkgraph <-graph_from_adjacency_matrix(mynetwork, mode = c("undirected"),weighted = NULL, diag = TRUE, add.colnames = NULL, add.rownames = NA)
Please help me to understand the above error
Thanks
Priya

There was a bug in the above function. To run the robustness code, you will need to supply a vertex attribute to your network: V(network)$degree <- degree(network) V(network)$btwn.cent <- centr_betw(network)$res

How can I make Gurobi (Using R) show all solutions

As the question states: I know there are several solutions (see output of GA and check that value and constraints are correct), but I can't get them out of Gurobi.
Edit after #Paleo13's answer: As he states, his answer is a good workround. However I would also love to see, if there is a more efficient option. Therefore, I added a bounty. See here and here for what I know.
Reproducible example:
my_fun <- function(x) {
f <- sum(model$obj * x)
penalty <- sum(abs(model$A %*% x - model$rhs))
return_value <- -f - 1e8 * penalty # sum(model$obj^2) * // 1e7 *
return(return_value)
}
model <- structure(
list(modelsense = "min",
obj = c(0, 40, 20, 40, 0, 20, 20, 20, 0),
A = structure(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 1, -1, 0, 1, 0, 0, 1,
1, 0, -1, 0, 0, 0, 0, -1, 1, 0, 0, 1, 0, 0, 0, 0,
0, 0, 0, 0, 1, 0, 1, -1, 0, 0, 1, 0, -1, 0, 1, 0,
0, 1, 0, 0, -1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0),
.Dim = c(7L, 9L),
.Dimnames = list(
c("constraint1", "constraint2", "", "", "", "", ""),
NULL)),
rhs = c(1, 1, 0, 0, 0, 1, 1),
sense = c("=", "=", "=", "=", "=", "=", "="),
vtype = "B"),
.Names = c("modelsense", "obj", "A", "rhs", "sense", "vtype"))
# Gurobi:
params <- list(OutputFlag = 1, Presolve = 2, LogToConsole = 1, PoolSearchMode = 2, PoolSolutions = 10)
ilp_result <- gurobi::gurobi(model, params)
print(ilp_result$x)
# GA for cross-check
GA <- GA::ga(type = "binary", fitness = my_fun, nBits = length(model$obj),
maxiter = 3000, run = 2000, popSize = 10, seed = 12)
# Crosscheck:
summary(GA)
my_fun(ilp_result$x)
my_fun(GA#solution[1, ])
my_fun(GA#solution[2, ])
sum(abs(model$A %*% ilp_result$x - model$rhs))
sum(abs(model$A %*% GA#solution[1, ] - model$rhs))
sum(abs(model$A %*% GA#solution[2, ] - model$rhs))

What you describe can be done with the Solution Pool. Gurobi added the R API for the solution pool in version 8.0. You set parameters to control the solution pool; the multiple solutions are returned in the Solution Pool named components. This is illustrated in the poolsearch.R example, which can also be found in the examples\R subdirectory.
Disclaimer: I manage technical support for Gurobi.

Gurobi can indeed store feasible solutions it that encounters while searing for the optimal solution (or rather a solution that fits within a specified opitmality gap). These solutions are stored in a "solution pool". Unfortunately, the gurobi R package does not have the functionality to access the solutions in the solution pool, so if we are looking for a solution that just uses R then we cannot use the solution pool. Also, it's worth noting that the solution pool may not necessarily contain all the feasible solutions, it only contains the solutions that Gurobi found along the way, so if we require all the feasible solutions then we cannot just rely on the solution pool in a single run of Gurobi.
So, with regards to your question, one strategy is to use a method referred to as "Bender's cuts". This basically involves solving the problem, adding in constraints to forbid the solution we just obtained, and then solving the problem again, and repeating this process until there aren't any more feasible solutions. I have written a function that implements this method using the gurobi R package below and applied it to your example. This method may not scale very well to problems with a large number of feasible solutions, because ideally we would access the solution pool to reduce the total number of Gurobi runs, but this is the best approach to my knowledge (but I would love to hear if anyone has any better ideas).
# define functions
find_all_feasible_solutions <- function(model, params) {
# initialize variables
counter <- 0
solutions <- list()
objs <- numeric(0)
# search for feasible solutions until no more exist
while (TRUE) {
# increment counter
counter <- counter + 1
# solve problem
s <- gurobi::gurobi(model, params)
# break if status indicates that no feasible solution found
if (s$status %in% c("INFEASIBLE")) break()
# store set of solutions
solutions[[counter]] <- s$x
objs[[counter]] <- s$objval
# add constraint to forbid solution this solution
model$rhs <- c(model$rhs, sum(s$x) - 1)
model$sense <- c(model$sense, "<=")
model$A <- rbind(model$A, (s$x * 2) - 1)
}
# throw error if no feasible sets of solutions found
if (length(solutions) == 0) {
stop("no feasible solutions found.")
}
# return solutions as matrix
list(x = do.call(rbind, solutions), obj = objs)
}
# create initial model
model <- list(
modelsense = "min",
obj = c(0, 40, 20, 40, 0, 20, 20, 20, 0),
A = structure(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 1, -1, 0, 1, 0, 0, 1,
1, 0, -1, 0, 0, 0, 0, -1, 1, 0, 0, 1, 0, 0, 0, 0,
0, 0, 0, 0, 1, 0, 1, -1, 0, 0, 1, 0, -1, 0, 1, 0,
0, 1, 0, 0, -1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0),
.Dim = c(7L, 9L),
.Dimnames = list(c("constraint1", "constraint2", "", "", "", "", ""),
NULL)),
rhs = c(1, 1, 0, 0, 0, 1, 1),
sense = c("=", "=", "=", "=", "=", "=", "="),
vtype = "B")
# create parameters
params <- list(OutputFlag = 1, Presolve = 2, LogToConsole = 1)
# find all feasible solutions
output <- find_all_feasible_solutions(model, params)
# print number of feasible solutions
print(length(output$obj))

Develop Reference

r css asp.net wordpress firebase qt symfony nginx http apache-flex

Specifying number of trials, bootstrap - r

Related

Aggregate similar constructs/ FA with binary variables

How to calculate mean value of all columns of datarame [duplicate]

How to make a loop for these regressions?

What does this error mean "order(vertex_attr(g, measure), decreasing = TRUE) : argument 1 is not a vector" in R?

How can I make Gurobi (Using R) show all solutions

Categories

Resources