Thank you in advance for your time. I am attempting to identify a method to calculate in-degree Bonacich Power Centrality in R. I'm a long-time UCINET user attempting to make the switch. In UCINET, this is done by selecting Beta Centrality (Bonacich Power) and choosing "in-centrality" for the direction.
In R, it doesn't seem as though there is a way to calculate this using either the sna or igraph packages. Here is the signature for bonpow in sna:
bonpow(dat, g=1, nodes=NULL, gmode="digraph", diag=FALSE, tmaxdev=FALSE,
exponent=1, rescale=FALSE, tol=1e-07)
I do specify digraph, but I am not able to replicate the analysis in R.
Similarly, here it is for power_centrality in igraph:
power_centrality(graph, nodes = V(graph), loops = FALSE,
exponent = 1, rescale = FALSE, tol = 1e-07, sparse = TRUE)
Here, there does not seem to be a way to specify that the graph is directed (although you can specify it when defining the network). By contrast, you can specify this when estimating betweenness centrality.
In neither case do I seem to be able to specify in-degree or out-degree power centrality. Any help is appreciated. Is there something either in these or in a different package that I may be overlooking?
I'm not sure what you mean by direction, since the original paper, it seems to me, does not deal with it. That said, a common way to "change the direction" of statistics that are computed directly from the adjacency matrix is to take the transpose (for example, when computing exposure in the netdiffuseR package, we allow the user to compute "incoming" or "outgoing" exposure by simply transposing the adjacency matrix). When you take the transpose, you are essentially flipping the directionality of the ties, i.e. i->j becomes j->i.
If that's what UCINET does (again, I'm not completely sure what it does), then you can get the "incoming"/"outgoing" version by transposing the network. Here is a toy example:
# Loading the sna package (btw: igraph's implementation is a copy of
# sna's). I wrap it in suppressMessages to avoid the verbose
# startup messages the package prints
suppressMessages(library(sna))
# This is a random graph I generated with 10 vertices
net <- structure(
c(0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 0, 1,
0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1,
0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1,
0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1,
0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0),
.Dim = c(10L, 10L)
)
# Here is the default
bonpow(net)
#> [1] -0.8921521 -0.7658658 -0.9165947 -1.4176664 -0.6151369 -0.7862345
#> [7] -0.9206684 -1.3565601 -1.0347335 -1.0062173
# Here I'm getting the transpose of the adjmat
net2 <- t(net)
# The output is different (as you can see)
bonpow(net2)
#> [1] -0.8969158 -1.1026305 -0.6336011 -0.7158869 -1.2960022 -0.9545159
#> [7] -1.1684592 -0.8845729 -1.0368018 -1.1190876
Created on 2019-11-20 by the reprex package (v0.3.0)
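The same transpose trick should carry over to igraph's power_centrality, since all of the directional information lives in the adjacency matrix. A minimal sketch (my assumption, not part of the reprex above), reusing the net matrix:
library(igraph)
g_out <- graph_from_adjacency_matrix(net, mode = "directed")
g_in <- graph_from_adjacency_matrix(t(net), mode = "directed")
power_centrality(g_out) # power centrality over outgoing ties
power_centrality(g_in) # "incoming" version, via the transpose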
I am trying to set up a simulation, but I am struggling with something.
I know which variables are relevant or not in the beginning as I set them myself. They are stored in a beta matrix taking 0 if insignificant and 1 (same amplitude for all) if they are significant.
I run a procedure which gives me its list of relevant variables. The problem is that this procedure does not give me a 0-1 matrix but just a list of the selected variables by their position (so 38 for the 38th variable in my list). I now wish to test whether this procedure is powerful, and I want to display the confusion matrix (among other things).
Below you can find those two vectors. I've tried using the %in% operator, but it does not work, and I don't want to use a loop, as the simulation already takes long enough.
Thanks in advance!
res = c(1, 9, 11, 18, 19, 25, 26, 28, 31, 34, 37, 38, 39, 42, 43, 47, 48, 50) # from my procedure
beta = c(1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1) # ground truth (1 = relevant)
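For reference, a vectorized sketch of the comparison (no loop needed), using the res and beta vectors above:
selected <- as.integer(seq_along(beta) %in% res) # 1 if the variable was selected, 0 otherwise
table(Selected = selected, Relevant = beta) # confusion matrix of selection vs. truth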
Calculating TPR and FPR using the pROC package. Am I able to specify the thresholds I want in the calculation using the package?
I want to calculate TPR and FPR for thresholds from 0 to 1, in 0.05 increments.
This is the data set I am working with:
structure(list(prediction_resp_4 = c(0.0093660156194744, 0.63696691410065,
0.693562340217509, 0.850026939982271, 0.0921374166454612, 0.223883311111169,
0.000699258172241612, 0.117062385395824, 0.951947429014154, 0.714711536699156,
0.230100717565363, 0.839895799034341, 0.149678433930086, 0.0913803675468538,
0.86430898026459, 0.0807110314548418, 0.452757912184497, 0.819293921115556,
0.0700190883640999, 0.44900095299551, 0.803772423123997, 0.373799624421601,
0.122405205571954, 0.858831937028595, 0.276135791757235, 0.86129869300195,
0.674060141486476, 0.303046534598074, 0.356020758015023, 0.0246899999008411,
0.670342328628664, 0.0178170992678319, 0.0945567242524256, 0.0110559559742041,
0.356077534809716, 0.0792480681507026, 0.630756724182966, 0.0165338433136149,
0.816750535548877, 0.661098390528446, 0.0587373125478858, 0.315062410973728,
0.831315518918304, 0.463520030831427, 0.725937488979879, 0.301643645590828,
0.288625193696339, 0.9038875106375, 0.780722912230085, 0.37912106477669,
0.136094212636133, 0.503643519530075, 0.544482442341009, 0.575738927352128,
0.356077534809716, 0.722011034808203, 0.760550508601042, 0.603109270061287,
0.793014589613734, 0.834485477242473, 0.783008040183127, 0.365330782046478,
0.022358212647161, 0.0884525015278602, 0.200257196356859, 0.912502624283191,
0.230100717565363, 0.112122111461138, 0.453938368209989, 0.704600061065344,
0.224872418284352, 0.395491910748845, 0.999703986760998, 0.794479788600805,
0.385076713799569, 0.0305635117938959, 0.92898574855535, 0.163314780984271,
0.893410014409946, 0.496199240836053, 0.618023472980794, 0.584273518401166,
0.295133623201644, 0.12042873699888, 0.251479713644139, 0.825885814333607,
0.674317836386347, 0.371047453863054, 0.645239618141106, 0.00544077442795404,
0.377910289600606, 0.936696423985203, 0.418497325382622, 0.871684421084382,
0.345285714385491, 0.835470162627044, 0.0581701844461216, 0.612133334197249,
0.675206715878502, 0.667971057122422), landslide.validation.lslpts = c(0,
1, 1, 1, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0,
0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 1,
0, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0,
0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0,
1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0)), row.names = c(NA,
100L), class = "data.frame")
where I have the response and the prediction.
The pROC package allows me to calculate these, but only over all possible thresholds.
My current code is as follows:
library(pROC)
myRoc <- roc(response = new_df$landslide.validation.lslpts, predictor = new_df$prediction_resp_4)
ROC <- data.frame(myRoc$sensitivities, myRoc$specificities, myRoc$thresholds)
I expected to calculate TPR and FPR for thresholds from 0 to 1 with a 0.05 increment. How can I go about doing it?
Any help will be appreciated. Thank you.
Although you can't specify which thresholds to use to build the ROC curve itself (because a ROC curve goes over all thresholds by definition), you can easily extract the information you want with the coords function:
> coords(myRoc, seq(0, 1, 0.05))
0 0.05 0.1 0.15 0.2 0.25 0.3 ...
threshold 0 0.0500000 0.1000000 0.1500000 0.2000000 0.2500000 0.3000000 ...
specificity 0 0.1568627 0.3333333 0.4509804 0.4705882 0.5098039 0.5882353 ...
sensitivity 1 0.9795918 0.9795918 0.9795918 0.9795918 0.9183673 0.9183673 ...
Noting that:
TPR = sensitivity
FPR = 1 − specificity
Please note that, although you have FPR and TPR, this is not a ROC curve, as it doesn't go over all possible thresholds.
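For example, a small sketch building the TPR/FPR table from the matrix layout shown above (note that recent pROC versions return a data frame from coords instead, in which case you would index columns rather than rows):
co <- coords(myRoc, seq(0, 1, 0.05))
TPR <- co["sensitivity", ]
FPR <- 1 - co["specificity", ]
rbind(threshold = co["threshold", ], TPR = TPR, FPR = FPR)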
I am trying to calculate robustness, a graph theory measure, using R (the brainGraph package).
Robustness <- robustness(my_networkgraph, type = "vertex", measure = "btwn.cent")
I get the following error when I use the above robustness function:
Error in order(vertex_attr(g, measure), decreasing = TRUE) : argument 1 is not a vector
Any idea, what I am doing wrong here?
My network, which is a matrix, was converted to an igraph object, and robustness was then calculated.
My network as a matrix:
mynetwork <- matrix(c(0, 1, 0, 1, 0, 0, 0, 0,
1, 0, 1, 0, 0, 0, 0, 0,
0, 1, 0, 0, 0, 0, 0, 0,
1, 0, 0, 0, 0, 1, 0, 0,
0, 0, 0, 0, 0, 1, 0, 0,
0, 0, 0, 1, 1, 0, 1, 1,
0, 0, 0, 0, 0, 1, 0, 0,
0, 0, 0, 0, 0, 1, 0, 0), nrow = 8)
This matrix was converted to an igraph object using the following code:
my_networkgraph <- graph_from_adjacency_matrix(mynetwork, mode = "undirected", weighted = NULL, diag = TRUE, add.colnames = NULL, add.rownames = NA)
Please help me to understand the above error
Thanks
Priya
There was a bug in the above function. To run the robustness code, you will need to supply the vertex attribute named by measure to your network:
V(network)$degree <- degree(network)
V(network)$btwn.cent <- centr_betw(network)$res
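With those attributes in place, the original call should then run. A sketch using the question's own objects (assuming igraph and brainGraph are attached):
V(my_networkgraph)$degree <- degree(my_networkgraph)
V(my_networkgraph)$btwn.cent <- centr_betw(my_networkgraph)$res
Robustness <- robustness(my_networkgraph, type = "vertex", measure = "btwn.cent")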
As the question states: I know there are several solutions (see the output of GA and check that the values and constraints are correct), but I can't get them out of Gurobi.
Edit after @Paleo13's answer: As he states, his answer is a good workaround. However, I would also love to see whether there is a more efficient option, so I have added a bounty. See here and here for what I know.
Reproducible example:
my_fun <- function(x) {
f <- sum(model$obj * x)
penalty <- sum(abs(model$A %*% x - model$rhs))
return_value <- -f - 1e8 * penalty # large penalty for violated constraints (other weights tried: sum(model$obj^2), 1e7)
return(return_value)
}
model <- structure(
list(modelsense = "min",
obj = c(0, 40, 20, 40, 0, 20, 20, 20, 0),
A = structure(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 1, -1, 0, 1, 0, 0, 1,
1, 0, -1, 0, 0, 0, 0, -1, 1, 0, 0, 1, 0, 0, 0, 0,
0, 0, 0, 0, 1, 0, 1, -1, 0, 0, 1, 0, -1, 0, 1, 0,
0, 1, 0, 0, -1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0),
.Dim = c(7L, 9L),
.Dimnames = list(
c("constraint1", "constraint2", "", "", "", "", ""),
NULL)),
rhs = c(1, 1, 0, 0, 0, 1, 1),
sense = c("=", "=", "=", "=", "=", "=", "="),
vtype = "B"),
.Names = c("modelsense", "obj", "A", "rhs", "sense", "vtype"))
# Gurobi:
params <- list(OutputFlag = 1, Presolve = 2, LogToConsole = 1, PoolSearchMode = 2, PoolSolutions = 10)
ilp_result <- gurobi::gurobi(model, params)
print(ilp_result$x)
# GA for cross-check
GA <- GA::ga(type = "binary", fitness = my_fun, nBits = length(model$obj),
maxiter = 3000, run = 2000, popSize = 10, seed = 12)
# Crosscheck:
summary(GA)
my_fun(ilp_result$x)
my_fun(GA@solution[1, ])
my_fun(GA@solution[2, ])
sum(abs(model$A %*% ilp_result$x - model$rhs))
sum(abs(model$A %*% GA@solution[1, ] - model$rhs))
sum(abs(model$A %*% GA@solution[2, ] - model$rhs))
What you describe can be done with the solution pool. Gurobi added solution pool support to the R API in version 8.0. You set parameters to control the solution pool, and the multiple solutions are returned in named components of the result. This is illustrated in the poolsearch.R example, which can also be found in the examples/R subdirectory.
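For example, a minimal sketch (my assumption of the output's shape; with PoolSearchMode and PoolSolutions set as in the question, the stored solutions should appear in the result's pool component on version 8.0 or later):
params <- list(OutputFlag = 1, PoolSearchMode = 2, PoolSolutions = 10)
result <- gurobi::gurobi(model, params)
# each pool entry should hold an objective value (objval) and a solution vector (xn)
for (sol in result$pool) {
  print(sol$objval)
  print(sol$xn)
}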
Disclaimer: I manage technical support for Gurobi.
Gurobi can indeed store feasible solutions that it encounters while searching for the optimal solution (or rather, a solution that fits within a specified optimality gap). These solutions are stored in a "solution pool". Unfortunately, versions of the gurobi R package before 8.0 do not have the functionality to access the solutions in the solution pool, so if we are looking for a solution that just uses R with such a version, we cannot use the solution pool. Also, it's worth noting that the solution pool may not necessarily contain all the feasible solutions; it only contains the solutions that Gurobi found along the way, so if we require all the feasible solutions then we cannot just rely on the solution pool from a single run of Gurobi.
So, with regard to your question, one strategy is to use a method referred to as "Benders' cuts". This basically involves solving the problem, adding a constraint to forbid the solution we just obtained, solving the problem again, and repeating this process until there are no more feasible solutions. I have written a function below that implements this method using the gurobi R package and applied it to your example. This method may not scale very well to problems with a large number of feasible solutions, because ideally we would access the solution pool to reduce the total number of Gurobi runs, but it is the best approach I know of (though I would love to hear if anyone has better ideas).
# define functions
find_all_feasible_solutions <- function(model, params) {
# initialize variables
counter <- 0
solutions <- list()
objs <- numeric(0)
# search for feasible solutions until no more exist
while (TRUE) {
# increment counter
counter <- counter + 1
# solve problem
s <- gurobi::gurobi(model, params)
# break if the status indicates that no feasible solution was found
if (s$status == "INFEASIBLE") break
# store set of solutions
solutions[[counter]] <- s$x
objs[[counter]] <- s$objval
# add a constraint to forbid this solution
model$rhs <- c(model$rhs, sum(s$x) - 1)
model$sense <- c(model$sense, "<=")
model$A <- rbind(model$A, (s$x * 2) - 1)
}
# throw an error if no feasible solutions were found
if (length(solutions) == 0) {
stop("no feasible solutions found.")
}
# return solutions as matrix
list(x = do.call(rbind, solutions), obj = objs)
}
# create initial model
model <- list(
modelsense = "min",
obj = c(0, 40, 20, 40, 0, 20, 20, 20, 0),
A = structure(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 1, -1, 0, 1, 0, 0, 1,
1, 0, -1, 0, 0, 0, 0, -1, 1, 0, 0, 1, 0, 0, 0, 0,
0, 0, 0, 0, 1, 0, 1, -1, 0, 0, 1, 0, -1, 0, 1, 0,
0, 1, 0, 0, -1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0),
.Dim = c(7L, 9L),
.Dimnames = list(c("constraint1", "constraint2", "", "", "", "", ""),
NULL)),
rhs = c(1, 1, 0, 0, 0, 1, 1),
sense = c("=", "=", "=", "=", "=", "=", "="),
vtype = "B")
# create parameters
params <- list(OutputFlag = 1, Presolve = 2, LogToConsole = 1)
# find all feasible solutions
output <- find_all_feasible_solutions(model, params)
# print number of feasible solutions
print(length(output$obj))
Using kernlab I've trained a model with code like the following:
my.model <- ksvm(result ~ f1+f2+f3, data=gold, kernel="vanilladot")
Since it's a linear model, I prefer at run-time to compute the scores as a simple weighted sum of the feature values rather than using the full SVM machinery. How can I convert the model to something like this (some made-up weights here):
> c(.bias=-2.7, f1=0.35, f2=-0.24, f3=2.31)
.bias f1 f2 f3
-2.70 0.35 -0.24 2.31
where .bias is the bias term and the rest are feature weights?
EDIT:
Here's some example data.
gold <- structure(list(result = c(-1, -1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1), f1 = c(0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0,
1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1), f2 = c(13.4138113499447,
13.2216999857095, 12.964145772169, 13.1975227965938, 13.1031520152764,
13.59351759447, 13.1031520152764, 13.2700658838026, 12.964145772169,
13.1975227965938, 12.964145772169, 13.59351759447, 13.59351759447,
13.0897162110721, 13.364151238365, 12.9483051847806, 12.964145772169,
12.964145772169, 12.964145772169, 12.9483051847806, 13.0937231331592,
13.5362700880482, 13.3654209223623, 13.4356400945176, 13.59351759447,
13.2659406408724, 13.4228886221088, 13.5103065354936, 13.5642812689161,
13.3224757352068, 13.1779418771704, 13.5601730479315, 13.5457299603578,
13.3729010596517, 13.4823595997866, 13.0965264603473, 13.2710281801434,
13.4489887206797, 13.5132372154748, 13.5196188787197), f3 = c(0,
1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0,
0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0)), .Names = c("result",
"f1", "f2", "f3"), class = "data.frame", row.names = c(NA, 40L
))
To get the bias, just evaluate the model with a feature vector of all zeros. To get the coefficient of the first feature, evaluate the model with a feature vector with a "1" in the first position, and zeros everywhere else - and then subtract the bias, which you already know. I'm afraid I don't know R syntax, but conceptually you want something like this:
bias = my.model.eval([0, 0, 0])
f1 = my.model.eval([1, 0, 0]) - bias
f2 = my.model.eval([0, 1, 0]) - bias
f3 = my.model.eval([0, 0, 1]) - bias
To test that you did it correctly, you can try something like this:
assert(bias + f1 + f2 + f3 == my.model.eval([1, 1, 1]))
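In R, that pseudocode might look like the following sketch (assuming the model is a classifier, e.g. trained with type = "C-svc", so that kernlab's predict() with type = "decision" returns the raw linear score):
# hypothetical helper: evaluate the decision value at a single feature vector
eval_model <- function(f1, f2, f3)
  predict(my.model, data.frame(f1 = f1, f2 = f2, f3 = f3), type = "decision")[1]
bias <- eval_model(0, 0, 0)
w1 <- eval_model(1, 0, 0) - bias
w2 <- eval_model(0, 1, 0) - bias
w3 <- eval_model(0, 0, 1) - bias
# sanity check: a linear model should add back up
stopifnot(isTRUE(all.equal(bias + w1 + w2 + w3, eval_model(1, 1, 1))))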
If I'm not mistaken, I think you're asking how to extract the W vector of the SVM, where W is defined as
W = \sum_i y_i \alpha_i x_i
i.e. the sum over the support vectors of the label times the alpha weight times the support vector itself. After you calculate W, you can extract the "weight" for the feature you want.
Assuming this is correct, you'd:
Get the indices of your data that are the support vectors
Get their weights (alphas)
Calculate W
kernlab stores the support vector indices and their values in lists (so that it works on multiclass problems, too); the list manipulation below is just to get at the actual data. You'll see that the lists returned by alpha and alphaindex have length 1 if you have a 2-class problem, which I'm assuming you do.
my.model <- ksvm(result ~ f1+f2+f3, data=gold, kernel="vanilladot", type="C-svc")
alpha.idxs <- alphaindex(my.model)[[1]] # Indices of SVs in original data
alphas <- alpha(my.model)[[1]]
y.sv <- gold$result[alpha.idxs]
# for unscaled data
sv.matrix <- as.matrix(gold[alpha.idxs, c('f1', 'f2', 'f3')])
weight.vector <- (y.sv * alphas) %*% sv.matrix
bias <- b(my.model)
kernlab actually scales your data before fitting. You can get the (scaled) weights like so (where, I would guess, the bias should be 0):
weight.vector <- (y.sv * alphas) %*% xmatrix(my.model)[[1]]
If I understood your question, this should get you what you're after.