Weighted vertex cover (as linear programming) in R with the ROI package

I'm trying to solve an instance of the weighted vertex cover problem using R for homework and I can't seem to get it right. I'm using the ROI package (could just as well use linprog).
The instance looks like this:
Edges:
A-B, A-C, A-G,
B-C, B-D, B-E, B-G,
C-E, C-F,
D-F,
E-G,
F-H, F-I,
G-H
Weights:
A - 10,
B - 7,
C - 4,
D - 7,
E - 12,
F - 25,
G - 27,
H - 3,
I - 9
My code is:
# a b c d e f g h i
constraints <- L_constraint(matrix(c(1, 1, 0, 0, 0, 0, 0, 0, 0, # a b
1, 0, 1, 0, 0, 0, 0, 0, 0, # a c
1, 0, 0, 0, 0, 0, 1, 0, 0, # a g
0, 1, 1, 0, 0, 0, 0, 0, 0, # b c
0, 1, 0, 1, 0, 0, 0, 0, 0, # b d
0, 1, 0, 0, 1, 0, 0, 0, 0, # b e
0, 1, 0, 0, 0, 0, 1, 0, 0, # b g
0, 0, 1, 0, 1, 0, 0, 0, 0, # c e
0, 0, 1, 0, 0, 1, 0, 0, 0, # c f
0, 0, 0, 1, 0, 1, 0, 0, 0, # d f
0, 0, 0, 0, 1, 0, 1, 0, 0, # e g
0, 0, 0, 0, 0, 1, 0, 1, 0, # f h
0, 0, 0, 0, 0, 1, 0, 0, 1, # f i
0, 0, 0, 0, 0, 0, 1, 1, 0, # g h
# end of u + v >= 1
1, 0, 0, 0, 0, 0, 0, 0, 0,
0, 1, 0, 0, 0, 0, 0, 0, 0,
0, 0, 1, 0, 0, 0, 0, 0, 0,
0, 0, 0, 1, 0, 0, 0, 0, 0,
0, 0, 0, 0, 1, 0, 0, 0, 0,
0, 0, 0, 0, 0, 1, 0, 0, 0,
0, 0, 0, 0, 0, 0, 1, 0, 0,
0, 0, 0, 0, 0, 0, 0, 1, 0,
0, 0, 0, 0, 0, 0, 0, 0, 1,
# end of u >= 0
1, 0, 0, 0, 0, 0, 0, 0, 0,
0, 1, 0, 0, 0, 0, 0, 0, 0,
0, 0, 1, 0, 0, 0, 0, 0, 0,
0, 0, 0, 1, 0, 0, 0, 0, 0,
0, 0, 0, 0, 1, 0, 0, 0, 0,
0, 0, 0, 0, 0, 1, 0, 0, 0,
0, 0, 0, 0, 0, 0, 1, 0, 0,
0, 0, 0, 0, 0, 0, 0, 1, 0,
0, 0, 0, 0, 0, 0, 0, 0, 1),
# end of u <= 1
ncol = 9), # matrix
dir = c(rep(">=", 14+9), rep("<=", 9)),
rhs = c(rep(1, 14), rep(0, 9), rep(1, 9))) # L_constraint
objective <- L_objective(c(10, 7, 4, 7, 12, 25, 27, 3, 9))
problem <- OP(objective, constraints, rep("C", 9),
maximum = FALSE)
solution <- ROI_solve(problem, solver = "glpk")
The result is "No solution found". I don't know what I'm doing wrong, and it may well be something obvious. I can't get my head around it: a solution should always exist, even if it takes all the vertices (i.e. all variables are >= 0.5).
If it matters, I'm on Arch Linux running R from the repositories (ver. 2.14) and installed the packages via install.packages("...").
Thanks!

Okay, solved it. The problem was that I didn't pass byrow = TRUE to matrix(). In addition I changed ncol = 9 into nrow = .... By default matrix() fills the matrix column by column, so the values I had typed row by row ended up in the wrong cells.
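The fill-order issue is easy to reproduce on a small matrix. The sketch below uses base R only; it also shows one way to build the edge constraint rows from an edge list rather than typing them by hand. The names edges, vertices and A are illustrative and not part of the original code.

```r
# matrix() fills column by column unless byrow = TRUE is given.
vals <- c(1, 1, 0,
          0, 1, 1)
by_col <- matrix(vals, ncol = 3)               # first row becomes 1 0 1
by_row <- matrix(vals, ncol = 3, byrow = TRUE) # first row stays   1 1 0

# Edge list from the question, one edge per row.
edges <- matrix(c("A","B", "A","C", "A","G",
                  "B","C", "B","D", "B","E", "B","G",
                  "C","E", "C","F",
                  "D","F",
                  "E","G",
                  "F","H", "F","I",
                  "G","H"),
                ncol = 2, byrow = TRUE)
vertices <- LETTERS[1:9]

# One row per edge, one column per vertex; a 1 marks each endpoint.
A <- matrix(0, nrow = nrow(edges), ncol = length(vertices),
            dimnames = list(NULL, vertices))
rows <- seq_len(nrow(edges))
A[cbind(rows, match(edges[, 1], vertices))] <- 1
A[cbind(rows, match(edges[, 2], vertices))] <- 1
```

A matrix built this way could be passed to L_constraint() with dir = rep(">=", nrow(A)) and rhs = rep(1, nrow(A)), in the same way the question passes its hand-written matrix.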


R for loop wise : Rowwise sum on conditions : Performance issue

I have a data frame on which I am running code that changes the value of a cell based on the sum of the preceding cells and the sum of the succeeding cells in the same row.
for (i in 1:row1)
{
for(j in 3:col-1)
{ # for-loop over columns
if (as.numeric(rowSums(e[i,2:j])) == 0 )
{
e1[i,j] <- 0
}
else if (as.numeric(rowSums(e[i,2:j])) > 0 && e[i,j] == 0 && as.numeric(rowSums(e[i,j:col])) > 0 )
{
e1[i,j] <- 1
}
else if (as.numeric(rowSums(e[i,2:j])) > 0 && e[i,j] == 1 && as.numeric(rowSums(e[i,j:col])) > 0 )
{
e1[i,j] <- 0
}
}
}
The runtime is very high. Appreciate any suggestions to improve the speed. Additional info: copying new values into the data frame is being done.
Thanks,
Sandy
edit 2:
Sample data:
structure(list(`Sr no` = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 13, 14, 15, 16, 17, 18, 19), `2018-01` = c(0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), `2018-02` = c(0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), `2018-03` = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), `2018-04` = c(0,
0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), `2018-05` = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0), `2018-06` = c(0,
0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0), `2018-07` = c(0,
0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0), `2018-08` = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1), `2018-09` = c(0,
0, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0), `2018-10` = c(1,
0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1), `2018-11` = c(0,
1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1), `2018-12` = c(1,
0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0), `2019-01` = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0), `2019-02` = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0)), row.names = c(NA,
-19L), class = c("tbl_df", "tbl", "data.frame"))
I think you can do this with matrix logic. Depends if you have enough RAM.
# creating fake data
# nc <- 300 # number of columns
nc <- 10 # for testing
nn <- 1e6 # rows
e <- sapply(1:nc, function(x) sample.int(2, nn, replace = TRUE) - 1L)
e <- as.data.frame(e)
row1 <- nrow(e)
colc <- ncol(e)
# note that:
3:colc-1
# isn't equal to:
3:(colc-1)
s <- 3:(colc-1) # I assume you meant this
e1 <- matrix(nrow = row1, ncol = length(s)) # empty resulting matrix
s1 <- sapply(s, function(j) rowSums(e[, 2:j])) # sum for each relevant i,j
s2 <- sapply(s, function(j) rowSums(e[, j:colc])) # sum for each relevant i,j
e2 <- as.matrix(e[, s]) # taking relevant columns of e
e1[s1 == 0] <- 0
e1[s1 > 0 & e2 == 0 & s2 > 0] <- 1
e1[s1 > 0 & e2 == 1 & s2 > 0] <- 0
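As a sanity check, the vectorized version can be compared with the explicit loop on a small random matrix. This is only a sketch under the same assumption as above (that 3:(colc-1) was intended); the names m, vec and loop are made up for the comparison.

```r
# ':' binds tighter than '-', so 3:10-1 means (3:10) - 1, i.e. 2 to 9.
precedence_ok <- all((3:10 - 1) == 2:9)

set.seed(1)
m <- matrix(sample(0:1, 60, replace = TRUE), nrow = 6)  # 6 rows, 10 columns
colc <- ncol(m)
s <- 3:(colc - 1)

# Vectorized: one column of row sums per relevant j.
s1 <- sapply(s, function(j) rowSums(m[, 2:j]))     # sum of columns 2..j
s2 <- sapply(s, function(j) rowSums(m[, j:colc]))  # sum of columns j..colc
e2 <- m[, s]
vec <- matrix(NA_real_, nrow = nrow(m), ncol = length(s))
vec[s1 == 0] <- 0
vec[s1 > 0 & e2 == 0 & s2 > 0] <- 1
vec[s1 > 0 & e2 == 1 & s2 > 0] <- 0

# Reference: the same three conditions written as an explicit loop.
loop <- matrix(NA_real_, nrow = nrow(m), ncol = length(s))
for (i in seq_len(nrow(m))) {
  for (k in seq_along(s)) {
    j <- s[k]
    before <- sum(m[i, 2:j])
    after <- sum(m[i, j:colc])
    if (before == 0) {
      loop[i, k] <- 0
    } else if (m[i, j] == 0 && after > 0) {
      loop[i, k] <- 1
    } else if (m[i, j] == 1 && after > 0) {
      loop[i, k] <- 0
    }
  }
}
```

Cells that match none of the three conditions stay NA in both versions, so identical(vec, loop) is a fair comparison.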

Errors with distance-decay using betapart and ddecay packages

My goal is to create a distance-decay curve for species data vs geographic distance. However, I am running into errors. For the betapart package, this may be due to the lack of columns relative to the number of rows. Is there a way to get past this? If not, is there another method for creating a distance-decay curve (and plotting it)? I also tried the ddecay package but ran into errors there too. Any help is much appreciated. Data is in structure form below.
# BETAPART -------------------------------------------------
library(betapart)
spat.dist<-dist(coords)
dissim.BCI<-beta.pair.abund(spec)$beta.bray.bal
plot(spat.dist, dissim.BCI, ylim=c(0,1), xlim=c(0, max(spat.dist)))
BCI.decay.exp<-decay.model(dissim.BCI, spat.dist, y.type="dissim", model.type="exp", perm=100)
#========================================================================================================
# I also tried a few other packages --------------------------
# ddecay package -------------------------------------------
devtools::install_github("chihlinwei/ddecay")
# The issue with this method is that it requires a gradient. I would like to avoid that if possible, but I do not see a way around it. Also, the package does not include its example data.
dd <- beta.decay(gradient=spat.dist, counts=decostand(spec, method="pa"),
coords=coords, nboots=1000,
dis.fun = "beta.pair", index.family = "sorensen", dis = 1, like.pairs=T)
x <- vegdist(coords, method = "euclidean")
y <- 1 - dist(decostand(spec, method="pa"), index.family = "sorensen")[[1]]
plot(x, y)
lines(dd$Predictions[, "x"], dd$Predictions[,"mean"], col="red", lwd=2)
#========================================================================================================
# DATA -----------------------------------------------------
spec <- structure(list(Ccol = c(0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), Acol = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0), NYcol = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0), Mcol = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0), AAcol = c(14, 0, 14, 3, 11, 1, 0, 2, 0,
3, 0, 4, 0, 1, 8, 2, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 7),
Ncol = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1), ATBcol = c(0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 20, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 3), CVcol = c(0, 0, 0, 0, 0, 0, 1, 20,
0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 7, 0, 2, 0, 0,
0, 6), AZNcol = c(0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), GBcol = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0), KHAcol = c(0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0,
0, 0, 0, 0), AFcol = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 1, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0), AFPcol = c(0,
0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 1), TIAcol = c(4, 1, 0, 2, 6, 0,
1, 1, 0, 2, 0, 0, 0, 1, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 1, 0), AUcol = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), AScol = c(0,
4, 0, 2, 0, 0, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 5, 0, 0), NSAcol = c(0, 0, 0, 0, 0, 0,
0, 0, 0, 7, 0, 0, 3, 0, 0, 0, 4, 0, 2, 0, 1, 0, 9, 5, 1,
0, 0, 2, 0), WZcol = c(0, 0, 0, 0, 0, 0, 1, 0, 0, 10, 4,
0, 0, 0, 0, 0, 0, 1, 5, 0, 0, 0, 17, 4, 0, 0, 0, 0, 0), AJcol = c(0,
3, 6, 0, 0, 1, 0, 4, 0, 0, 0, 0, 39, 12, 0, 0, 0, 0, 0, 0,
0, 4, 5, 1, 12, 13, 16, 0, 5), EADcol = c(4, 1, 2, 1, 2,
0, 0, 0, 0, 4, 0, 2, 1, 1, 0, 0, 0, 0, 0, 10, 0, 0, 0, 0,
0, 0, 0, 0, 1), CAcol = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0), Pcol = c(0,
0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 60, 0, 0,
13, 0, 8, 1, 0, 0, 0, 0, 0), ASDcol = c(3, 5, 6, 17, 3, 5,
26, 2, 0, 17, 3, 10, 6, 3, 2, 4, 0, 0, 5, 25, 0, 0, 0, 2,
2, 9, 0, 2, 8), RMAcol = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
OUcol = c(0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), KAcol = c(0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 12,
0, 0, 0, 0, 0, 8, 1), PACcol = c(0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 11, 2, 0, 37, 0, 24,
1, 0, 0), LAAcol = c(0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,
0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0), GAcol = c(1,
0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 1, 0, 0, 0, 2, 0, 0, 0,
0, 0, 3, 0, 0, 0, 2, 0, 0), AAcol = c(1, 0, 1, 0, 0, 0, 0,
0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 1, 0), EVAcol = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0), EAcol = c(0,
0, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0), AKcol = c(0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0,
0, 0, 0), Acol = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 1, 0), QAcol = c(0,
0, 0, 1, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0), YAcol = c(11, 24, 21, 63, 44,
95, 12, 43, 0, 5, 26, 22, 25, 48, 86, 2, 0, 0, 13, 0, 0,
2, 0, 0, 60, 6, 7, 0, 45), BANcol = c(0, 0, 0, 3, 0, 0, 0,
0, 0, 0, 0, 0, 24, 0, 6, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0,
9, 17, 17), VCcol = c(0, 38, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), Vcol = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 1, 0, 0, 0, 0, 0, 0), Ocol = c(0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 1, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0), AVcol = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1), JXcol = c(0,
3, 3, 0, 0, 0, 0, 0, 8, 0, 0, 10, 3, 0, 0, 5, 0, 0, 0, 1,
0, 0, 0, 2, 4, 1, 0, 0, 0)), class = "data.frame", row.names = c(NA,
-29L))
coords <- structure(list(Lat.x = c(34.43363, 34.36784, 34.32587, 34.19891,
34.24217, 34.24863, 34.18137, 34.16838, 34.10961, 34.08329, 34.40571,
34.39591, 34.39292, 34.37466, 34.28948, 34.26146, 34.04687, 34.0409,
34.068339, 34.34679, 34.17161, 34.23308, 34.21544, 34.14922,
34.27539, 34.2323, 34.19057, 34.07042, 34.06289), Lon.x = c(-94.94494,
-94.92512, -94.94429, -94.84497, -94.8573, -94.85641, -94.887,
-94.91322, -94.92913, -94.93276, -95.02622, -95.04382, -94.96295,
-94.83733, -94.81071, -94.79161, -95.03968, -95.0608, -95.086986,
-95.03345, -95.23862, -95.25619, -95.1041, -95.02286, -95.02672,
-95.02626, -95.02941, -95.01746, -94.98786)), class = "data.frame", row.names = c(NA,
-29L))
You will get more answers if you say what the problem was: for instance, which functions failed and what the error message was. I had a look at betapart::decay.model(), where I could get this error message:
Error in eval(family$initialize) :
cannot find valid starting values: please specify some
To cut a long story short: you cannot use this function with your data because you have dissimilarities of 1. Dissimilarities are turned into similarities with 1-dissimilarity, which makes these values zero similarities (that is, these pairs of sampling units have nothing in common; they share no species). Function decay.model uses glm with gaussian family and log-link, and the log-link requires that you give starting values if you have zeros in the y-variate.
I think that you have four alternatives:
1. You do not use the method, as it does not suit your data.
2. You modify the decay.model function so that you can specify the starting values, as the error message suggests. This means that you add mustart to the function call so that it reads, e.g., glm(y ~ x, family=gaussian(link="log"), mustart=pmax(y, 0.01)). This replaces zeros with 0.01 as starting values.
3. You change maximum distances from 1 to something smaller, for instance, 0.99: dissim.BCI[dissim.BCI==1] <- 0.99. However, this changes the data, and also changes the results from those you get with alternative 2 (which only changes starting values and leaves the data unmodified). The effect is not very large, though, and any Bayesian would claim that dissimilarity 1 is just a frequentist folly (you just haven't yet seen the species that these sampling units have in common).
4. You change the maximum distances to missing values. This changes the data more than alternative 3. It removes the maximum dissimilarities, so they no longer influence the decay curve. The effect is the same as censoring the greatest dissimilarities. The results change more than in alternative 3.
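The starting-value problem can be reproduced outside betapart with a plain glm call. The sketch below uses made-up decay data (x and y are illustrative, not the question's data) and assumes, as described above, that the model is a gaussian glm with a log link.

```r
# Toy similarity values that decay with distance; the last two round to exactly 0.
x <- 0:12
y <- round(exp(-0.5 * x), 2)

# Without starting values the gaussian/log-link initialisation stops,
# because log(0) is not a valid starting point.
err <- tryCatch(
  glm(y ~ x, family = gaussian(link = "log")),
  error = function(e) e
)

# Supplying mustart changes only the starting values, not the data.
fit <- glm(y ~ x, family = gaussian(link = "log"),
           mustart = pmax(y, 0.01))
slope <- coef(fit)[["x"]]  # decay rate on the log scale
```

Here conditionMessage(err) holds the text of the error from the first call, and fit is an ordinary glm object that can be inspected with summary() or predict().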

Working on bipartite networks with igraph : problem with basic measures (density, normalized degree)

I'm new to bipartite network analysis and I'm having some trouble with basic measures.
I'm trying to work on bipartite networks without projecting them onto 1-mode graphs.
My problems come from the fact that the igraph package lets you create bipartite graphs, but its measures do not seem to adapt to the specifics of these graphs.
So, my general question is: how do you work directly on bipartite networks?
Here is a concrete example with density:
## Working with an incidence matrix (sample) with 47 columns and 10 rows (unweighted / undirected)
# Want to compute basic global index like density with igraph
library(igraph)
g <- graph.incidence(m, directed = F )
graph.density(g) # result = 0.04636591
# Now trying to compute basic density for a bipartite graph without igraph (number of edges divided by the product of the two types of vertices)
library(Matrix)
d <- nnzero(m)/ (ncol(m)*nrow(m)) # result 0.1574468
# It seems that bipartite package does the job
library(bipartite)
networklevel(m, index=c("connectance")) # result 0.1574468
But the bipartite package is very specific to ecology, and a lot of its measures are designed for food webs and interactions between species (and some, like the clustering coefficient, don't seem to take into account the bipartite nature of the graph, e.g. by computing 4-cycles).
So, are there simpler ways to work on bipartite networks with igraph? I would like to measure some global indexes (density, clustering coefficient with 4-cycles; I know that tnet does this, but my actual networks are too large) and to normalize local indexes such as degree, closeness and betweenness centralities in a way that takes the bipartite structure into account (as in Borgatti S.P., Everett M.G., 1997, « Network analysis of 2-mode data », Social Networks).
Any advice will be appreciated!
Below is the code to reproduce the sample of my matrix "m":
m <- structure(c(1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0,
0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1,
0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1,
0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1,
0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0,
1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0,
0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1,
0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,
0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0,
0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0,
1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0,
1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1,
0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0,
0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0,
0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0), .Dim = c(10L, 47L), .Dimnames = list(
c("02723", "13963", "F3238", "02194", "15051", "04477", "02164",
"06283", "04080", "08304"), c("1185241", "170063", "10350868",
"217831", "2210247", "2262963", "1816670", "1848354", "2232593",
"146214", "1880252", "2261639", "2262581", "2158177", "1850147",
"2262912", "146412", "2262957", "1566083", "1841811", "146384",
"216281", "2220957", "1846986", "1951567", "1581130", "105343",
"1580240", "170654", "1796236", "1835553", "1835848", "146400",
"1174872", "1283240", "2253354", "1283617", "146617", "160263",
"2263115", "184745", "1809858", "1496747", "10346824", "148730",
"2262582", "146268")))
Density: you already got it
Degree
degv1 <- degree(g, V(g)[type == FALSE])
degv2 <- degree(g, V(g)[type == TRUE])
Normalized degree: divide by the vertex count of the other node category
degnormv1 <- degv1/length(V(g)[type == TRUE])
degnormv2 <- degv2/length(V(g)[type == FALSE])
I have no answer regarding closeness, betweenness or the clustering coefficient.
For the normalized degree, here is a solution without igraph:
normalizedegreeV1 <- data.frame(ND = colSums(m)/nrow(m))
normalizedegreeV2 <- data.frame(ND = rowSums(m)/ncol(m))
but that leaves the other questions about centrality measures open...
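The two density figures in the question differ only in the denominator. Both are consistent with 74 edges: 74 / (10 * 47) gives the bipartite value, and 74 / (57 * 56 / 2) gives the value igraph reports when it treats all 57 vertices as one set. A base R sketch on a toy incidence matrix (toy and the other names are illustrative) shows both, together with the normalized degrees:

```r
# Toy incidence matrix: 2 row vertices, 3 column vertices, 4 edges.
toy <- matrix(c(1, 0, 1,
                0, 1, 1),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("r1", "r2"), c("c1", "c2", "c3")))

n_edges <- sum(toy != 0)

# Bipartite density: edges over the number of possible row-column pairs.
dens_bipartite <- n_edges / (nrow(toy) * ncol(toy))

# One-mode density: edges over all possible pairs among all vertices.
n <- nrow(toy) + ncol(toy)
dens_one_mode <- n_edges / (n * (n - 1) / 2)

# Normalized degree: degree divided by the size of the other vertex set.
norm_deg_rows <- rowSums(toy) / ncol(toy)
norm_deg_cols <- colSums(toy) / nrow(toy)

# The same arithmetic with the counts implied by the question's output.
q_bipartite <- 74 / (10 * 47)
q_one_mode <- 74 / (57 * 56 / 2)
```

This covers only density and degree; closeness, betweenness and the clustering coefficient need their own bipartite normalizations and are not addressed here.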

Translating SAS language to R language: Creating a new variable

I have some SAS code that I want to translate into R. I am interested in creating variables based on conditions on other variables.
data wp;
set wp;
if totalcriteria =>3 and nonecom=0 then content=1;
if totalcriteria =>3 and nonecom=1 then content=0;
if totalcriteria <3 and nonecom=0 then content=0;
if totalcriteria <3 and nonecom=1 then content=0;
run;
This is the code I have in R. My conditions for "content" have changed, and I would like to translate the SAS code to R, ideally to replace the "mutate" line of the code below or to fit in with it:
wpnew <- wp %>%
mutate(content = ifelse (as.numeric(totalcriteria >= 3),1,0))%>%
group_by(district) %>%
summarise(totalreports =n(),
totalcontent = sum(content),
per.content=totalcontent/totalreports*100)
Can you help me translate this SAS code to R? Thank you in advance.
Here is the dput output:
structure(list(Finances = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0), Exercise = c(0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0), Relationships = c(0, 0, 0, 0, 0, 0, 0, 1, 0,
0, 0, 0, 0, 0, 0, 0), Laugh = c(0, 0, 0, 0, 0, 0, 0, 1, 0, 0,
0, 0, 0, 0, 0, 1), Gratitude = c(0, 0, 0, 0, 1, 0, 0, 0, 1, 0,
0, 0, 0, 0, 0, 1), Regrets = c(0, 0, 0, 0, 0, 0, 0, 0, 1, 0,
0, 0, 0, 0, 0, 0), Meditate = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0), Clutter = c(0, 0, 1, 1, 0, 0, 0, 1, 0, 0,
0, 0, 1, 0, 0, 0), Headache = c(0, 0, 1, 1, 0, 0, 0, 1, 0, 0,
0, 0, 1, 0, 0, 0), Loss = c(0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,
0, 0, 0, 0, 0), Anger = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
1, 0, 0, 0), Difficulty = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0), nonecom = c(1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1,
1, 0, 1, 1, 0), Othercon = c(0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0), totalcriteria = c(0, 0, 2, 3, 2, 0, 0, 4, 3,
0, 0, 0, 3, 0, 0, 2)), class = "data.frame", row.names = c(NA,
-16L))
This is what I would like it to look like
V1 V2 V3...V12 nonecom Othercon totalcriteria content
1 1 1 0 1 0 3 0
0 0 1 0 0 0 8 1
1 0 0 0 0 1 2 0
1 0 1 0 1 0 1 0
I use case_when just because I find its syntax closer to the SAS code. Your current approach only tests the first part of the IF condition, not the second part regarding nonecom.
library(dplyr)
wpnew <- wp %>%
mutate(content = case_when(totalcriteria >= 3 & nonecom == 0 ~ 1,
TRUE ~ 0))
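Because only one of the four SAS branches yields 1, the rule reduces to a single logical test. A base R sketch without dplyr, using just the two relevant columns from the dput output (the data frame name wp_small is made up for the example):

```r
# The two columns the rule depends on, copied from the dput output.
wp_small <- data.frame(
  nonecom       = c(1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0),
  totalcriteria = c(0, 0, 2, 3, 2, 0, 0, 4, 3, 0, 0, 0, 3, 0, 0, 2)
)

# content is 1 only when totalcriteria >= 3 and nonecom == 0, otherwise 0.
wp_small$content <- as.integer(
  wp_small$totalcriteria >= 3 & wp_small$nonecom == 0
)

# Rows flagged as content.
flagged <- which(wp_small$content == 1)
```

The resulting 0/1 column can be summed per group in the same way as in the summarise() call from the question.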

How to fix 'Node inconsistent with parents' in R2jags::jags

I am working with the R-package R2jags. After running the code I attach below, R produced the error message: "Node inconsistent with parents".
I tried to solve it. However, the error message persists. The variables I am using are:
i) "Adop": a 0-1 dummy variable.
ii) "NumInfo": a counter variable whose range is {0, 1, 2,...}.
iii) "Price": 5
iv) "NRows": 326.
install.packages("R2jags")
library(R2jags)
# Data you need to run the model.
# Adop: a 0-1 dummy variable.
Adop <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
# NumInfo: a counter variable.
NumInfo <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 2, 2, 2, 3, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1)
# NRows: length of both 'NumInfo' and 'Adop'.
NRows <- length(NumInfo)
# Price: 5
Price <- 5
Data <- list("NRows" = NRows, "Adop" = Adop, "NumInfo" = NumInfo, "Price" = Price)
# The Bayesian model. The parameters I would like to infer are: 'mu.m', 'tau2.m', 'r.s', 'lambda.s', 'k', 'c', and 'Sig2'.
# I would like to obtain samples from the posterior distribution of the vector of parameters.
Bayesian_Model <- "model {
mu.m ~ dnorm(0, 1)
tau2.m ~ dgamma(1, 1)
r.s ~ dgamma(1, 1)
lambda.s ~ dgamma(1, 1)
k ~ dunif(1, 1/Price)
c ~ dgamma(1, 1)
Sig2 ~ dgamma(1, 1)
precision.m <- 1/tau2.m
m ~ dnorm(mu.m, precision.m)
s2 ~ dgamma(r.s, lambda.s)
for(i in 1:NRows){
Media[i] <- NumInfo[i]/Sig2 * m
Var[i] <- equals(NumInfo[i], 0) * 10 + (1 - equals(NumInfo[i], 0)) * NumInfo[i]/Sig2 * s2 * (NumInfo[i]/Sig2 + 1/s2)
Prec[i] <- pow(Var[i], -1)
W[i] ~ dnorm(Media[i], Prec[i])
PrAd1[i] <- 1 - step(-m/s2 - 1/c * 1/s2 * log(1 - k * Price) + 1/2 * c)
PrAd2[i] <- 1 - step(-W[i] - m/s2 - 1/c * 1/s2 * log(1 - k * Price) + 1/2 * c - 1/c * log(1 - k * Price))
PrAd[i] <- equals(NumInfo[i], 0) * PrAd1[i] + (1 - equals(NumInfo[i], 0)) * PrAd2[i]
Adop[i] ~ dbern(PrAd[i])
}
}"
# Save the Bayesian model in your computer with an extension '.bug'.
# Suppose that you saved the .bug file in: "C:/Users/Default/Bayesian_Model.bug".
writeLines(Bayesian_Model, "C:/Users/Default/Bayesian_Model.bug")
# Here I would like to use jags command from R-package called R2jags.
# I would like to generate 1000 iterations.
MCMC_Bayesian_Model <- R2jags::jags(
model.file = "C:/Users/Default/Bayesian_Model.bug",
data = Data,
n.chains = 1,
n.iter = 1000,
parameters.to.save = c("mu.m", "tau2.m", "r.s", "lambda.s", "k", "c", "Sig2")
)
When running the code, R produced the error message: "Node inconsistent with parents". I do not know what the mistakes are. I was wondering if you could help me with this problem, please. If you need more information, please let me know. Thank you very much.
It's a little hard to figure out the model without knowing what you're trying to do, but I suggest two fixes:
1. Instead of k ~ dunif(1, 1/Price), did you mean k ~ dunif(0, 1/Price)? For dunif(a, b), you must have a < b (see page 48 here: http://people.stat.sc.edu/hansont/stat740/jags_user_manual.pdf).
2. I inserted an additional line in the model,
PrAd01[i] <- max(min(PrAd[i], 0.99), 0.01)
and changed the last line to
Adop[i] ~ dbern(PrAd01[i])
Page 49 of the manual above states that 0 < p < 1 for dbern(p).
The model runs with the above two changes.
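The effect of the clamping can be illustrated in plain R, outside JAGS. In the model, PrAd1 and PrAd2 are built from 1 - step(...), so PrAd[i] is always exactly 0 or 1. An observed Adop[i] that contradicts it then has probability zero, which is one likely reading of "Node inconsistent with parents". The values below are invented for the illustration.

```r
# Hypothetical success probabilities and observed 0/1 outcomes.
p    <- c(0, 1, 0.3)
adop <- c(1, 0, 1)

# Bernoulli likelihood of each observation: the first two are impossible.
lik_raw <- dbinom(adop, size = 1, prob = p)

# The same clamp as max(min(PrAd[i], 0.99), 0.01) in the model.
p01 <- pmax(pmin(p, 0.99), 0.01)
lik_clamped <- dbinom(adop, size = 1, prob = p01)

# The uniform prior needs lower < upper; with Price = 5 the original
# bounds are 1 and 0.2, which violates that.
Price <- 5
bounds_ok_original <- 1 < 1 / Price
bounds_ok_fixed <- 0 < 1 / Price
```

After clamping every observation has a small but positive likelihood, so the sampler is able to initialize. Whether 0.01 and 0.99 are sensible limits depends on the model and is a modelling choice, not something this sketch settles.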
