R - computer vision image matching - nearest neighbour for SURF descriptor

I am using https://github.com/bnosac/image
and the image.dlib package for image matching.
I am trying to match two objects for similarity using SURF features, and I have successfully extracted the feature vectors.
But in the nearest neighbour comparison (rflann package), I get varying results.
Do I simply sum up the absolute distances and compare?
Also, I am getting a varying number of SURF points (rows) for each image, while the 64 descriptor columns stay constant.
Attached are reference and candidate images.
Ideally, the reference should match c1. Here's my try.
library(image.dlib)
f <- "D:\\<path to my ref image>\\ref.bmp"
f1 <- "D:\\<path to my candidate images>\\c1.bmp"
f2 <- "D:\\<path to my candidate images>\\c2.bmp"
f3 <- "D:\\<path to my candidate images>\\c3.bmp"
surf_blobs <- image_surf(f, max_points = 1000, detection_threshold = 30)
surf_blobs1 <- image_surf(f1, max_points = 1000, detection_threshold = 30)
surf_blobs2 <- image_surf(f2, max_points = 1000, detection_threshold = 30)
surf_blobs3 <- image_surf(f3, max_points = 1000, detection_threshold = 30)
library(rflann)
# 10 nearest reference descriptors for every descriptor of each candidate image
kn1 <- rflann::Neighbour(as.matrix(surf_blobs1$surf), as.matrix(surf_blobs$surf), k = 10, build = "kmeans")
s1 <- sum(abs(kn1$distances))
kn2 <- rflann::Neighbour(as.matrix(surf_blobs2$surf), as.matrix(surf_blobs$surf), k = 10, build = "kmeans")
s2 <- sum(abs(kn2$distances))
kn3 <- rflann::Neighbour(as.matrix(surf_blobs3$surf), as.matrix(surf_blobs$surf), k = 10, build = "kmeans")
s3 <- sum(abs(kn3$distances))
The links are given below
Reference image
https://i.imgur.com/jvHcA9c.png
Candidate images
Candidate image 1: https://image.ibb.co/jL6xNw/c1.png
Candidate image 2: https://image.ibb.co/khhF9b/c2.png
Candidate image 3: https://image.ibb.co/bNTYvG/c3.png
Additional candidates to test robustness:
https://image.ibb.co/mHqyfG/c2.jpg
https://image.ibb.co/de0W0G/c3.jpg
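One way to make such scores comparable across candidates is to normalise by the number of query descriptors (which varies per image) and/or to apply a ratio test and count confident matches instead of summing raw distances. Below is a minimal sketch along those lines; the 0.7 ratio threshold is a conventional choice rather than anything from the post above, and whether rflann reports plain or squared Euclidean distances should be checked before relying on the numbers.
library(rflann)

score_candidate <- function(cand_surf, ref_surf) {
  # two nearest reference descriptors for every candidate descriptor
  kn <- rflann::Neighbour(as.matrix(cand_surf), as.matrix(ref_surf),
                          k = 2, build = "kmeans")
  d <- sqrt(kn$distances)        # assumes squared distances are returned; drop sqrt() otherwise
  ratio <- d[, 1] / d[, 2]       # Lowe-style ratio test: best vs. second-best match
  good  <- ratio < 0.7           # conventional threshold (an assumption)
  c(mean_dist = mean(d[, 1]),    # average nearest-neighbour distance (lower = more similar)
    n_good    = sum(good))       # number of confident matches (higher = more similar)
}

score_candidate(surf_blobs1$surf, surf_blobs$surf)
score_candidate(surf_blobs2$surf, surf_blobs$surf)
score_candidate(surf_blobs3$surf, surf_blobs$surf)
Because the score is averaged (or counted) per candidate descriptor, the varying number of SURF rows per image no longer skews the comparison the way a raw sum does.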

Related

How do I find the minimum using the which.min function within the function I built?

I would like to build a function that allows me to import any dataset (e.g. Dataset1), take each of a set of chosen numbers and sampled numbers, compute the squared difference (Difference = (chosen number - sample number)^2), and then find the average. With that, I would like to find which chosen number gives the minimum average (this will be the output of my function). Note that the chosen numbers could come from Dataset1 or from outside it, including any real number.
Dataset1 <- c(12739, 172392, 16477, 14738, 12223, 15473, 18999, 12278)
simulation.model <- function(import.dataset){
test.df <- data.frame(matrix(ncol = 4, nrow = 1000))
colnames(test.df) <- c("Dataset","R_Sample","Payment","Mean_Payment" )
sample.numbers <- sample(c(import.dataset, size = 1000, replace = FALSE))
f <- 0
for (i in dataset)
test.df[i,2] <- sample(c(dataset, size = 1000, replace = FALSE))
test.df[i,3] <-(dataset[i] - sample.numbers)^2
test.df[i,4] <- mean(insert.dataset[i]$Payment)
choice <- test.df[which.min(test.df$MeanPayment)]$Dataset
return(c(chosen = choice))
}
new.simulation.function(Dataset1)
I am getting an error for my function - any help will be appreciated!
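For reference, here is a minimal sketch of what the description above seems to ask for (the sampling scheme, sample size, and candidate set are assumptions made for illustration): for each candidate "chosen number", draw a sample from the dataset, compute the mean squared difference, and let which.min pick the candidate with the smallest mean.
Dataset1 <- c(12739, 172392, 16477, 14738, 12223, 15473, 18999, 12278)

simulation.model <- function(dataset, candidates = dataset, n = length(dataset)) {
  # mean squared difference between each candidate and a random sample of the data
  mean.payment <- sapply(candidates, function(chosen) {
    sampled <- sample(dataset, size = n, replace = TRUE)
    mean((chosen - sampled)^2)
  })
  result <- data.frame(Candidate = candidates, Mean_Payment = mean.payment)
  # which.min returns the index of the smallest mean squared difference
  result$Candidate[which.min(result$Mean_Payment)]
}

set.seed(1)
simulation.model(Dataset1)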

Shortest Paths based on edge attribute with igraph

I'm trying to get the shortest paths of a graph but based on its edge ids.
So having the following graph:
library(igraph)
set.seed(45)
g <- erdos.renyi.game(25, 1/10, directed = TRUE)
E(g)$id <- sample(1:3, length(E(g)), replace = TRUE)
The shortest_paths(g, 1, V(g)) function finds all the shortest paths from node 1 to all the other nodes. However, I would like to calculate this not just by the geodesic distance, but by a mix of the geodesic distance and the minimum number of edge id changes.
For example, if this were a train network and the edge ids represented trains, I would like to calculate how to get from node A to all the other nodes using the shortest path, while changing trains the fewest number of times.
OK, I think I have a working solution, although the code is a little ugly. The basic algorithm (let's call it gs(i, j)) goes like this. If we want to find the shortest train journey from i to j (gs(i, j)), we:
1. find the shortest path from i to j considering all trains; if this path has length 0 or 1, return it (there is either no path or a path on 1 train);
2. split the graph up by 'trains' (subset the graph by edges) so as to consider each train network separately, and find the shortest path between i and j in each individual train network;
3. if a single train will get you from i to j, return the train route with the fewest stops between i and j; else
4. if no single train runs from i to j, call gs(i, j-1), where (j-1) is the stop before j in the shortest path between i and j on the full network.
So basically, we look to see if a single train can do it, and if it can't, we call the function recursively, checking whether a single train can get you to the stop before the last stop, and so on.
library(igraph)
# First your data
set.seed(45)
g <- erdos.renyi.game(25, 1/10, directed = TRUE)
E(g)$id <- sample(1:3, length(E(g)), replace = TRUE)
plot(g, edge.color = E(g)$id)
# The function takes as arguments the graph, and the id of the vertex
# you want to go from/to. It should work for a vector of
# destinations but I have not rigorously tested it so proceed with
# caution!
get.shortest.routes <- function(g, from, to){
  # one subgraph per train id, keeping all vertices
  train.routes <- lapply(unique(E(g)$id), function(id) {
    subgraph.edges(g, eids = which(E(g)$id == id), delete.vertices = FALSE)
  })
  # shortest paths on the full network and on each single-train network
  target.sp <- shortest_paths(g, from = from, to = to, output = 'vpath')$vpath
  single.train.paths <- lapply(train.routes, function(gs) {
    shortest_paths(gs, from = from, to = to, output = 'vpath')$vpath
  })
  for (i in seq_along(target.sp)) {
    if (length(target.sp[[i]]) > 1) {
      cands <- lapply(single.train.paths, function(l) l[[i]])
      if (sum(unlist(lapply(cands, length))) != 0) {
        # at least one single train reaches the destination: keep the shortest such route
        cands <- cands[lapply(cands, length) != 0]
        cands <- cands[lapply(cands, length) == min(unlist(lapply(cands, length)))]
        target.sp[[i]] <- cands[[1]]
      } else {
        # no single train works: recurse to the stop before the destination, then add the last leg
        target.sp[[i]] <- c(get.shortest.routes(g, from = as.numeric(target.sp[[i]][1]),
                                                to = as.numeric(target.sp[[i]][(length(target.sp[[i]]) - 1)]))[[1]],
                            get.shortest.routes(g, from = as.numeric(target.sp[[i]][(length(target.sp[[i]]) - 1)]),
                                                to = as.numeric(target.sp[[i]][length(target.sp[[i]])]))[[1]][-1])
      }
    }
  }
  target.sp
}
OK, now let's run some tests. If you squint at the graph above, you can see that the path from vertex 5 to vertex 21 has length 2 if you take two trains, but that you can get there on 1 train if you pass through an extra station. Our new function should return the longer path:
shortest_paths(g, 5, 21)$vpath
#> [[1]]
#> + 3/25 vertices, from b014eb9:
#> [1] 5 13 21
get.shortest.routes(g, 5, 21)
#> Warning in shortest_paths(gs, from = from, to = to, output = "vpath"): At
#> structural_properties.c:745 :Couldn't reach some vertices
#> Warning in shortest_paths(gs, from = from, to = to, output = "vpath"): At
#> structural_properties.c:745 :Couldn't reach some vertices
#> [[1]]
#> + 4/25 vertices, from c22246c:
#> [1] 5 13 15 21
Let's make a really easy graph where we are sure what we want to see: here we should get 1-2-4-5 instead of 1-3-5:
df <- data.frame(from = c(1, 1, 2, 3, 4), to = c(2, 3, 4, 5, 5))
g1 <- graph_from_data_frame(df)
E(g1)$id <- c(1, 2, 1, 3, 1)
plot(g1, edge.color = E(g1)$id)
get.shortest.routes(g1, 1, 5)
#> Warning in shortest_paths(gs, from = from, to = to, output = "vpath"): At
#> structural_properties.c:745 :Couldn't reach some vertices
#> Warning in shortest_paths(gs, from = from, to = to, output = "vpath"): At
#> structural_properties.c:745 :Couldn't reach some vertices
#> [[1]]
#> + 4/5 vertices, named, from c406649:
#> [1] 1 2 4 5
I'm sure there is a more rigorous solution, and you'll probably want to optimize the code a bit. For instance, I just realized that I don't stop the function immediately if the shortest path on the full graph has only two nodes -- doing so would avoid some needless computations! This was a fun problem; I hope some other answers get posted.
Created on 2018-05-11 by the reprex package (v0.2.0).
Here is my take on the problem. A few notes:
1) all_simple_paths will not scale well with large or highly connected graphs
2) I favored fewest changes above all else, which means a path with two changes and a dist of 40 will beat a path with three changes and a dist of 3.
3) I can imagine an even faster approach in which the priority of number of changes vs. distance switches when there is no path on a single id.
library(igraph)
# First your data
set.seed(45)
g <- erdos.renyi.game(25, 1/10, directed = TRUE)
E(g)$id <- sample(1:3, length(E(g)), replace = TRUE)
plot(g, edge.color = E(g)$id)
## Option 1:
rst <- all_simple_paths(g, from = 1, to = 18, mode = "out")
rst <- lapply(rst, as_ids)
# expand each path into the vertex pairs of its consecutive edges
rst1 <- lapply(rst, function(x) c(x[1], rep(x[2:(length(x) - 1)], each = 2), x[length(x)]))
rst2 <- lapply(rst1, function(x) data.frame(eid = get.edge.ids(graph = g, vp = x),
                                            train = E(g)$id[get.edge.ids(graph = g, vp = x)]))
rst3 <- data.frame(pathID = seq_along(rst),
                   changes = sapply(rst2, function(x) length(rle(x$train)$lengths)),
                   dist = sapply(rst2, nrow))
spath <- rst3[order(rst3$changes, rst3$dist), ][1, 1]
# Vertex IDs
rst[[spath]]
# [1] 1 23 8 18
plot(g, edge.color = E(g)$id,
     vertex.color = ifelse(V(g) %in% rst[[spath]], "firebrick", "gray80"),
     edge.arrow.size = 0.5)

How to calculate the edge attributes as the path length in igraph?

Pretend the data frame below is an edge list (the relation between inst2 and motherinst2), and that km is an attribute assigned to the edges that I want to sum up along a path. I'm too new at coding to make a fully reproducible edge list.
library(igraph)
inst2 = c(2, 3, 4, 5, 6)
motherinst2 = c(7, 8, 9, 10, 11)
km = c(20, 30, 40, 25, 60)
df2 = data.frame(inst2, motherinst2)
edgelist = cbind(df2, km)
g = graph_from_data_frame(edgelist)
I know how to calculate the path length of vertices in a graph, but I have some attributes attached to the edges that I want to sum up as path lengths. They are simple attributes (distance in km, time in days, and speed as km/day).
This is how I was calculating the path of vertices (between roots and terminals/leaves):
roots = which(sapply(sapply(V(g), function(x) neighbors(g, x, mode = 'in')), length) == 0)
# slightly tweaking this piece of code will also calculate the 'terminal' nodes (or leaves):
terminals = which(sapply(sapply(V(g), function(x) neighbors(g, x, mode = 'out')), length) == 0)
paths = lapply(roots, function(x) get.all.shortest.paths(g, from = x, to = terminals, mode = "out")$res)
named_paths = lapply(unlist(paths, recursive = FALSE), function(x) V(g)[x])
I want to do essentially exactly what I did above, but summing up the distance, time, and rate (for which I will compute the mean) incurred along each of those paths. If it helps to know how the edges have been added as attributes, I've used cbind like so:
edgelist_df = cbind(edgelist_df, time, dist, speed)
and my graph object (g) is set up like this:
g <- graph_from_data_frame(edgelist_df, vertices = vattrib_df)
vattrib_df is the attributes of the vertices, which is not of interest to us here.
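In case it is useful, one way to sum an edge attribute along a vertex path is to convert the path into its edge sequence with E(g, path = ...) and sum the attribute over those edges. Below is a minimal sketch using the toy edge list from the question; in the real data, the paths produced by the roots/terminals code above would be passed in instead of the single shortest path used here.
library(igraph)

inst2 <- c(2, 3, 4, 5, 6)
motherinst2 <- c(7, 8, 9, 10, 11)
km <- c(20, 30, 40, 25, 60)
g <- graph_from_data_frame(cbind(data.frame(inst2, motherinst2), km))

# given a vertex path, pull out the edges it traverses and sum their 'km' attribute
path_km <- function(g, vpath) {
  sum(E(g, path = vpath)$km)
}

p <- shortest_paths(g, from = "2", to = "7", output = "vpath")$vpath[[1]]
path_km(g, p)
# 20 in this toy graph (a single edge of 20 km); time and speed attributes
# could be summed or averaged the same way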

Raster series sum

I have a loop over a series of raster images, and I want to extract the values equal to 150 and then add up the total number of such pixels over the entire length of the loop. With the code I have, I only manage to get the totals for each image separately and not the overall total. Thanks.
m = 52419  # total pixels in the basin
for (i in 1:4) {
  b1  <- raster(myras1[i])
  bc  <- b1 == 150                    # values equal to 150
  nbc <- cellStats(bc, stat = "sum")
  print(nbc)
  pc  <- nbc * 100 / m
}
# [1] 34962
# [1] 38729
# [1] 52389
# [1] 52176
In general, it is recommended not to use loops in R for things that can be easily vectorised. Rather than trying to fix the (several) problems with your loop, I show instead a better way. You can perform the whole calculation in a single vectorised line:
sum(cellStats(myras1==150, stat="sum")) * 100/m
Breaking this down: cellStats performed on a raster stack will return a vector of values, one for each layer. sum then adds these together. Then we divide by the number of cells in the whole stack (all layers combined) and multiply by 100 to convert to a percentage.
Testing this on some reproducible dummy test data:
set.seed(123)
myras1 = list(
raster(nrows = 100, ncols = 100, vals = sample(140:150,10000,T)),
raster(nrows = 100, ncols = 100, vals = sample(140:150,10000,T)),
raster(nrows = 100, ncols = 100, vals = sample(140:150,10000,T)),
raster(nrows = 100, ncols = 100, vals = sample(140:150,10000,T))
)
myras1 = stack(myras1)
m = ncol(myras1) * nrow(myras1) * nlayers(myras1)
sum(cellStats(myras1==150, stat="sum")) * 100/m
# [1] 8.815
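For completeness, the same total can also be accumulated inside a loop, should one be preferred; this is a sketch assuming myras1 is the RasterStack built above.
library(raster)

total <- 0
for (i in 1:nlayers(myras1)) {
  bc    <- myras1[[i]] == 150                   # logical layer: cells equal to 150
  total <- total + cellStats(bc, stat = "sum")  # accumulate across layers
}
total * 100 / m   # same percentage as the vectorised one-liner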

Why is the actual number of generations not as specified for genetic algorithms in R?

I am working with the genalg library for R, and I am trying to save all the generations when I run a binary genetic algorithm. There does not seem to be a built-in method for that in the library, so my attempt was to save each chromosome, x, coming through the evaluation function.
To test this method I have tried to insert print(x) in the evaluation function to be able to see all the evaluated chromosomes. However, the number of printed chromosomes does not always match what I expect.
I thought that the number of printed chromosomes would be equal to the number of iterations times the population size, but that does not seem to be true all the time.
The problem is that I want to know to which generation (or iteration) each chromosome belongs, which I can't tell if the number of chromosomes differs from iter times popSize.
What is the reason for this, and how can I "fix" it? Or is there another way of saving each chromosome together with the iteration it belongs to?
Below is an example, where I thought the evaluation function would print 2 x 5 = 10 chromosomes, but it prints only 8.
library(genalg)
library(ggplot2)
dataset <- data.frame(
item = c("pocketknife", "beans", "potatoes", "unions", "sleeping bag", "rope", "compass"),
survivalpoints = c(10, 20, 15, 2, 30, 10, 30),
weight = c(1, 5, 10, 1, 7, 5, 1))
weightlimit <- 20
evalFunc <- function(x) {
  print(x)
  current_solution_survivalpoints <- x %*% dataset$survivalpoints
  current_solution_weight <- x %*% dataset$weight
  if (current_solution_weight > weightlimit)
    return(0) else return(-current_solution_survivalpoints)
}
iter = 2
popSize = 5
set.seed(1)
GAmodel <- rbga.bin(size = 7, popSize = popSize, iters = iter, mutationChance = 0.1,elitism = T, evalFunc = evalFunc)
Looking at the function code, it seems that at each iteration (generation) a subset of chromosomes is chosen from the population (a population of 5 chromosomes in your example) with a certain probability (0.1 in your case) and mutated. The evaluation function is called only for the mutated chromosomes at each generation (and of course for all the chromosomes in the first iteration, to establish their initial values).
Note that this subset does not include the elitist group, which in your example is 1 element big (you have erroneously passed elitism = TRUE, and TRUE is implicitly converted to 1).
Anyway, to know the population at each generation, you can pass a monitor function through the monitorFunc parameter, e.g.:
# obj contains a lot of information; try printing it
monitor <- function(obj) {
  print(paste(" GENERATION :", obj$iter))
  print("POPULATION:")
  print(obj$population)
  print("VALUES:")
  print(obj$evaluations)
}
iter = 2
popSize = 5
set.seed(1)
GAmodel <- rbga.bin(size = 7, popSize = popSize,
                    iters = iter, mutationChance = 0.1,
                    elitism = 1, evalFunc = evalFunc, monitorFunc = monitor)
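If the goal is to store each generation for later inspection rather than just print it, one option (a sketch, not something taken from the genalg documentation) is to have the monitor append to a list in the enclosing environment; obj$iter is used as the index, as in the monitor above.
generations <- list()
monitor <- function(obj) {
  # store the population and its evaluations for this generation
  generations[[obj$iter]] <<- list(population  = obj$population,
                                   evaluations = obj$evaluations)
}
set.seed(1)
GAmodel <- rbga.bin(size = 7, popSize = popSize, iters = iter, mutationChance = 0.1,
                    elitism = 1, evalFunc = evalFunc, monitorFunc = monitor)
length(generations)   # one entry per generation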
