ga() function giving error when provided with suggestions - r

I am trying to find an optimal ordering of the integers 1 to 20 according to a self-defined function f (see below), using the GA package in R. I want to pass some initial suggestions (a starting population) to ga(). As far as I can tell, the suggestions are in the format that ga() expects, but I get an error. Here is my code:
library(GA)
f <- function(z) sum((z-c(20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3,2,1))^2)
set.seed(123)
c1 <- c(1, 3, 4, 2, 5, 7, 6, 9, 10, 8, 11, 12, 13, 15, 14, 16, 17, 18, 19, 20)
sugg <- as.data.frame(c1)
sugg$c2 <- sample(c1, 20)
sugg$c3 <- sample(c1, 20)
sugg$c4 <- sample(c1, 20)
sugg$c5 <- sample(c1, 20)
sugg$c6 <- sample(c1, 20)
sugg$c7 <- sample(c1, 20)
sugg$c8 <- sample(c1, 20)
sugg$c9 <- sample(c1, 20)
sugg$c10 <- sample(c1, 20)
sugg$c11 <- sample(c1, 20)
sugg$c12 <- sample(c1, 20)
sugg$c13 <- sample(c1, 20)
sugg$c14 <- sample(c1, 20)
sugg$c15 <- sample(c1, 20)
sugg$c16 <- sample(c1, 20)
sugg$c17 <- sample(c1, 20)
sugg$c18 <- sample(c1, 20)
sugg$c19 <- sample(c1, 20)
sugg$c20 <- sample(c1, 20)
sugg <- as.matrix(sugg)
set.seed(123)
result <- ga(type = "permutation", fitness = f, min = c(1), max = c(20), maxiter = 100, suggestions = sugg)
> result <- ga(type = "permutation", fitness = f, min = c(1), max = c(20), maxiter = 100, suggestions = sugg)
Iter = 1 | Mean = 1312.72 | Best = 1804
Error in crossover(object, parents) :
number of items to replace is not a multiple of replacement length
Can somebody help explain why I am getting this error? When I don't provide the suggestions, ga() runs fine.

Old question, but might help someone.
My problem is not the same as yours (for me suggestions doesn't seem to make a difference), but anyway this might help understand what's going on. From the code in the function:
# check suggestions
if(is.null(suggestions))
{ suggestions <- matrix(nrow = 0, ncol = nvars) }
else
{ if(is.vector(suggestions))
{ if(nvars > 1) suggestions <- matrix(suggestions, nrow = 1)
else suggestions <- matrix(suggestions, ncol = 1) }
else
{ suggestions <- as.matrix(suggestions) }
if(nvars != ncol(suggestions))
stop("Provided suggestions (ncol) matrix do not match number of variables of the problem!")
}
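Note that the check above compares ncol(suggestions) with the number of variables, which suggests that each row of the matrix is read as one candidate solution. In the question, each permutation was stored as a column, so the 20 x 20 matrix passes the check while its rows are almost certainly not valid permutations. Below is a minimal base-R sketch of that diagnosis, assuming this is what trips the crossover step. The names is_perm, sugg_cols and sugg_rows are illustrative and not part of GA.

```r
set.seed(123)
c1 <- c(1, 3, 4, 2, 5, 7, 6, 9, 10, 8, 11, 12, 13, 15, 14, 16, 17, 18, 19, 20)

# As in the question: one permutation per COLUMN (c1 plus 19 shuffles of it)
sugg_cols <- cbind(c1, replicate(19, sample(c1, 20)))

# TRUE if v contains each of 1..20 exactly once
is_perm <- function(v) identical(sort(as.numeric(v)), as.numeric(1:20))

# Rows of the original layout mix elements from different permutations
rows_ok_before <- all(apply(sugg_cols, 1, is_perm))

# Transpose so that each ROW is one complete permutation
sugg_rows <- unname(t(sugg_cols))
rows_ok_after <- all(apply(sugg_rows, 1, is_perm))

rows_ok_before
rows_ok_after
```

If the diagnosis holds, passing suggestions = sugg_rows to the ga() call from the question should avoid the error. This has not been checked against every GA version.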

Related

Return dataframe modified by function in R

I made a function that searches for outliers in each row of a data frame. What I'd like to get at the end is the modified data frame, with a new column x$outliers_numb, as a return value rather than just printed output. I added return() at the end, but it doesn't seem to work. Any ideas?
library(dplyr)  # needed for %>% and mutate() used below
outliers <- function(x, s, e){
# x = dataframe
# s = index of first col to take
# e = index of last column to take
p <- x
for(i in s:e){
Q1 <- quantile(p[,i], 0.25, names = FALSE)
Q3 <- quantile(p[,i], 0.75, names = FALSE)
iqr <- IQR(p[,i])
low <- Q1 - iqr*1.5
up <- Q3 + iqr*1.5
p[,i] <- ((p[,i] < low) | (p[,i] > up))
}
p <- p %>% mutate(outliers_numb = rowSums(p[,s:e]))
x$outliers_numb <- p$outliers_numb
return(x)
}
#example
w <- data.frame(col1 = c(1, 2, 3, 4, 5, 90, 6),
col2 = c(13, 60, 13, 18, 13, 12, 0),
col3 = c(1, 899, 5, 4, 3, 8, 6))
outliers(w, 1, 3)
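The function already returns the modified data frame; calling it without an assignment only prints the result. Just assign it to a new variable: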
dataframe_to_reus <- outliers(w, 1, 3)
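To illustrate why the assignment is needed, here is a small sketch with a made-up function add_flag() and made-up data. R functions work on a copy of their argument, so the caller's data frame is unchanged unless the returned value is stored.

```r
# Hypothetical helper: adds a logical column to its local copy and returns it
add_flag <- function(df) {
  df$flag <- df$value > 2
  df
}

original <- data.frame(value = c(1, 2, 3))

add_flag(original)              # prints the modified copy, original is untouched
modified <- add_flag(original)  # keeps the modified copy

names(original)
names(modified)
```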

identify and quantify cluster types in igraph

I have a graph built from the following data:
library(igraph)
K01 <- cbind(c(1, 3, 4, 6, 7, 8, 9, 11, 13, 14),
c(2, 4, 5, 7, 8, 6, 10, 12, 14, 15))
K02 <- graph_from_edgelist(K01, directed = FALSE)
I'd like to be able to quantify the number and type of all subclusters in the graph. That is, I'd like to separate clusters of the same size by whether they're circular or not, or by other properties of their nodes.
I can get cluster size pretty easily with:
K03 <- components(K02)
K04 <- groups(K03)
But this doesn't help me sort by cluster structure.
I'm not entirely sure, but it seems that as long as there are no loops, a group is circular if its number of edges is equal to or greater than its number of vertices.
with(do.call(rbind, lapply(K04, function(x){
g = induced_subgraph(K02, x)
data.frame(n = vcount(g),
type = ifelse(ecount(g) >= vcount(g), "circular", "simple"))
})), table(n, type))
#    type
# n   simple circular
#   2      3        0
#   3      2        1
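The reasoning behind that rule: a connected group with n vertices needs at least n - 1 edges, and with exactly n - 1 edges it is a tree. Any additional edge must close a cycle, so an edge count of at least n means the group contains at least one cycle (not necessarily that the whole group is a single ring). Here is a sketch in base R that reproduces the counts without igraph, assuming a simple graph with no self-loops or duplicate edges. The label-propagation loop is only an illustrative stand-in for components().

```r
edges <- cbind(c(1, 3, 4, 6, 7, 8, 9, 11, 13, 14),
               c(2, 4, 5, 7, 8, 6, 10, 12, 14, 15))

# Label every vertex with the smallest vertex id in its component:
# repeatedly give both ends of each edge the smaller of their two labels
label <- seq_len(max(edges))
repeat {
  old <- label
  for (k in seq_len(nrow(edges))) {
    label[edges[k, ]] <- min(label[edges[k, ]])
  }
  if (identical(old, label)) break
}

# Vertices and edges per component (an edge belongs to its endpoints' component)
v_count <- table(label)
e_count <- table(factor(label[edges[, 1]], levels = names(v_count)))

# At least as many edges as vertices means the component contains a cycle
has_cycle <- as.vector(e_count) >= as.vector(v_count)
table(n = as.vector(v_count),
      type = ifelse(has_cycle, "circular", "simple"))
```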
So, thanks to d.b for his help on this. I broke down and wrote a function that I think performs the task I was trying to accomplish. It's a bit verbose and probably clunky, but it seems to work for what I need.
######
# take an igraph graph
# and return a nested list the length of the number of unique cluster structures
# where each list item is a list of clusters that shared that structure
# names ?
######
CategorizeSubGraphs <- function(IGraph) {
cat("\n")
# get groups
IGroups <- groups(components(IGraph))
Structures <- vector("list",
length = length(IGroups))
StructNames <- vector("list",
length = length(IGroups))
AllSubGraphs <- vector("list",
length = length(IGroups))
# initialize a progress bar
pBar <- txtProgressBar(style = 1L)
# Get all graph structures, remove names for unique()
for (i in seq_along(Structures)) {
AllSubGraphs[[i]] <- induced_subgraph(graph = IGraph,
vids = IGroups[[i]])
Structures[[i]] <- AllSubGraphs[[i]][seq(length(AllSubGraphs[[i]][1]))]
StructNames[[i]] <- dimnames(Structures[[i]])[[1]]
dimnames(Structures[[i]]) <- list(NULL,
NULL)
setTxtProgressBar(pb = pBar,
value = i / length(Structures))
}
cat("\n")
# categorize by structure
GraphSizes <- sapply(unique(Structures),
function(x) nrow(x))
AllSizes <- sapply(Structures,
function(x) nrow(x))
GraphTemplates <- unique(Structures)
Result <- vector("list",
length = length(GraphTemplates))
StructureCategory <- vector("integer",
length = length(AllSizes))
# assign each subgraph to a category
for (i in seq_along(StructureCategory)) {
Catch <- which(GraphSizes == AllSizes[i])
if (length(Catch) == 1L) {
StructureCategory[i] <- Catch
} else {
for (j in seq_along(Catch)) {
if (all(GraphTemplates[[Catch[j]]] == Structures[[i]])) {
StructureCategory[i] <- Catch[j]
}
}
}
setTxtProgressBar(pb = pBar,
value = i / length(StructureCategory))
}
Count <- rep(0L,
length(Result))
ResultSizes <- sapply(unique(StructureCategory),
function(x) length(which(StructureCategory == x)))
for (i in seq_along(Result)) {
Result[[i]] <- vector("list",
length = ResultSizes[i])
}
# collect all subgraphs into their distinct categories
for (i in seq_along(StructureCategory)) {
Count[StructureCategory[i]] <- Count[StructureCategory[i]] + 1L
Result[[StructureCategory[i]]][[Count[StructureCategory[i]]]] <- AllSubGraphs[[i]]
}
cat("\n")
return(Result)
}
So this tries to sort and summarize all the subgraphs in a given graph, and I think it does so successfully.
So given:
K01 <- cbind(c(1, 3, 4, 6, 7, 8, 9, 11, 13, 14, 16, 18, 18, 17, 20, 21, 21),
c(2, 4, 5, 7, 8, 6, 10, 12, 14, 15, 18, 17, 19, 16, 21, 22, 23))
K02 <- graph_from_edgelist(K01,
directed = FALSE)
Which starts with:
K03 <- CategorizeSubGraphs(IGraph = K02)
length(K03) # the number of distinct subgraph types
[1] 5
lengths(K03) # the number of individual subgraphs for each type
[1] 3 2 1 1 1
You can then plot the individual subclusters as needed and grab their names from the graph object.
I've not tested this with graphs with interconnected subgraphs, because I need it with a fairly specific type of data, but it's here if anyone else has this question!

Passing vector elements to call as consecutive arguments

Given function and test vector:
multiply_stuff <- function(...) {
arguments <- list(...)
Reduce(f = `*`, x = arguments)
}
test_vec <- c(1, 20, 3, 40, 5, 60)
I would like to create an unevaluated call automatically listing all arguments of the passed vector. In this example that would be equivalent of expression:
call("multiply_stuff",
test_vec[1],
test_vec[2],
test_vec[3],
test_vec[4],
test_vec[5],
test_vec[6])
Attempts
For instance for the vector:
test_vec_B <- c(1, 5, 6, 8, 9, 11, 12, 14, 20, 11)
I would like to automatically list all test_vec_B arguments within call("multiply_stuff",...). Naturally this won't work:
call("multiply_stuff", test_vec_B)
call("multiply_stuff", list(test_vec_B))
Desired results
Unevaluated expression equivalent to:
call(
"multiply_stuff",
test_vec_B[1],
test_vec_B[2],
test_vec_B[3],
test_vec_B[4],
test_vec_B[5],
test_vec_B[6],
test_vec_B[7],
test_vec_B[8],
test_vec_B[9],
test_vec_B[10]
)
You can create a call object and then add arguments to it:
multiply_stuff <- function(...) {
arguments <- list(...)
Reduce(f = `*`, x = arguments)
}
test_vec_B <- c(1, 5, 6, 8, 9, 11, 12, 14, 20, 11)
get_call <- function(f, arg_vector){
my_call <- call(f)
my_call[2:(length(arg_vector) + 1)] <- arg_vector
return(my_call)
}
multiply_stuff(1, 5, 6, 8, 9, 11, 12, 14, 20, 11)
[1] 878169600
test_call <- get_call("multiply_stuff", test_vec_B)
eval(test_call)
[1] 878169600
Explanation: a call object can be accessed and modified by index, like a list. Element 1 is the function being called, and elements from 2 onwards are its arguments. Run the following to verify:
test_call2 <- call("sum", 1, 2)
test_call2[1]
test_call2[2]
test_call2[3]
eval(test_call2)
test_call2[3] <- 1234
eval(test_call2)
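The same call can also be assembled in one step with as.call(), which turns a list whose first element is the function name into a call. This is a sketch of an alternative to the index assignment above; build_call and alt_call are illustrative names.

```r
multiply_stuff <- function(...) {
  arguments <- list(...)
  Reduce(f = `*`, x = arguments)
}
test_vec_B <- c(1, 5, 6, 8, 9, 11, 12, 14, 20, 11)

# Build an unevaluated call: function name first, then one argument per element
build_call <- function(f, arg_vector) {
  as.call(c(list(as.name(f)), as.list(arg_vector)))
}

alt_call <- build_call("multiply_stuff", test_vec_B)
alt_call        # the unevaluated call
eval(alt_call)  # evaluates to the product of all elements
```

Unlike the index-assignment version, this form also works when arg_vector is empty, because no 2:1 index range is created.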

R: how to get optim to output parameter values at each iteration

library(stats4)
x <- 0:10
y <- c(26, 17, 13, 12, 20, 5, 9, 8, 5, 4, 8)
## Easy one-dimensional MLE:
nLL <- function(lambda) -sum(stats::dpois(y, lambda, log = TRUE))
fit0 <- mle(nLL, start = list(lambda = 5), nobs = NROW(y), method = "L-BFGS-B")
This is a toy example from mle's documentation. The optimization method I chose to use is L-BFGS-B. I'm interested in seeing the lambda values at different iterations.
Looking into optim's help page, I tried adding trace = TRUE. But that seems to give me the likelihood at each iteration and not the lambda values.
> fit0 <- mle(nLL, start = list(lambda = 5), nobs = NROW(y), method = "L-BFGS-B", control = list(trace = TRUE))
final value 42.726780
converged
How can I obtain the lambda estimates at each iteration?
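One possible approach, sketched under assumptions rather than taken from the mle() documentation, is to wrap the objective so that it records every lambda it is called with. The names lambda_trace and nLL_traced are made up for this example, optim() is called directly instead of through mle(), and the lower bound is an added assumption to keep lambda positive. Note that this records every function evaluation, including those used for finite-difference gradients, not just one value per iteration.

```r
y <- c(26, 17, 13, 12, 20, 5, 9, 8, 5, 4, 8)
nLL <- function(lambda) -sum(stats::dpois(y, lambda, log = TRUE))

# Storage for every lambda the optimizer evaluates
lambda_trace <- numeric(0)

# Wrapper: log the parameter, then delegate to the real objective
nLL_traced <- function(lambda) {
  lambda_trace <<- c(lambda_trace, lambda)
  nLL(lambda)
}

fit <- optim(par = 5, fn = nLL_traced, method = "L-BFGS-B", lower = 1e-6)

lambda_trace  # all evaluated lambda values, in order
fit$par       # final estimate
```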

Divide vector with grouping vector

I have two vectors that I would like to combine into one data frame. The values of one vector, value, need to be split across two columns. The second vector, nc, gives the number of values for each observation. If nc is 1, only one value is given in value (which goes into val1), and 999 is written to the second column (val2).
What is an R-ish way to split the vector value and populate the two columns of df? I suspect I am missing something very obvious, but I can't proceed at the moment. Many thanks!
set.seed(123)
nc <- sample(1:2, 10, replace = TRUE)
value <- sample(1:6, sum(nc), replace = TRUE)
# result by hand
df <- data.frame(nc = nc,
val1 = c(6, 3, 4, 1, 2, 2, 6, 5, 6, 5),
val2 = c(999, 5, 999, 6, 1, 999, 6, 4, 4, 999))
Here's an approach based on this answer:
set.seed(123)
nc <- sample(1:2, 10, replace = TRUE)
value <- sample(1:6, sum(nc), replace = TRUE)
splitUsing <- function(x, pos) {
unname(split(x, cumsum(seq_along(x) %in% cumsum(replace(pos, 1, pos[1] + 1)))))
}
combineValues <- function(vals, nums) {
mydf <- data.frame(cbind(nums, do.call(rbind, splitUsing(vals, nums))))
mydf$V3[mydf$nums == 1] <- 999
return(mydf)
}
df <- combineValues(value, nc)
I think this is what you are looking for. I'm not sure it is the fastest way, but it should do the trick.
df <- data.frame(nc = nc, val1 = NA, val2 = NA)  # pre-allocate the result columns
count <- 0
for (i in seq_along(nc)) {
count <- count + nc[i]
if(nc[i]==1) {
df$val1[i] <- value[count]
df$val2[i] <- 999
} else {
df$val1[i] <- value[count-1]
df$val2[i] <- value[count]
}
}
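The loop above can also be written without iteration. This is a sketch of a vectorized variant, assuming nc only ever contains 1 or 2; first and last are illustrative names for the positions in value where each observation starts and ends.

```r
set.seed(123)
nc <- sample(1:2, 10, replace = TRUE)
value <- sample(1:6, sum(nc), replace = TRUE)

last  <- cumsum(nc)      # position in `value` of each observation's last value
first <- last - nc + 1   # position of each observation's first value

df <- data.frame(nc   = nc,
                 val1 = value[first],
                 val2 = ifelse(nc == 2, value[last], 999))
df
```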
