Given a list of rectangle coordinates, how can we find all possible combinations of polygons from these coordinates? - math

Given the following list of region coordinates:
x y Width Height
1 1 65 62
1 59 66 87
1 139 78 114
1 218 100 122
1 311 126 84
1 366 99 67
1 402 102 99
7 110 145 99
I wish to identify all possible rectangle combinations that can be formed by combining any two or more of the above rectangles. For instance, one rectangle could be
1 1 66 146 by combining 1 1 65 62 and 1 59 66 87
What would be the most efficient way to find all possible combinations of rectangles using this list?
Okay. I'm sorry for not being specific about the problem.
I have been working on an algorithm for object detection that identifies different windows across the image that might have the object. But sometimes, the object gets divided into several windows. So, when looking for objects, I want to use all the windows that I identified as well as try their different combinations for object detection.
So far I have tried using two nested loops that go across all the windows one by one: if the x coordinate of the inner-loop window lies within the outer-loop window, I merge the two by taking the left-most, top-most, right-most and bottom-most coordinates.
However, this has been taking a lot of time and there are a lot of duplicates present in the final output. I would like to see if there is a more efficient and easier way to do this.
I hope this information helps.
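The pairwise merge described above could be sketched roughly as follows. This is only a minimal sketch under stated assumptions: the helper names `overlaps` and `merge_pair` are hypothetical, the right and bottom edges are taken as x + Width and y + Height (an inclusive pixel convention would shift the sizes by one), and only rectangles that overlap in both x and y are merged. Using `combn` visits each unordered pair exactly once, which avoids the duplicates that two full loops produce.

```r
# Rectangles from the question: x, y (top-left corner), Width, Height
rects <- data.frame(
  x = c(1, 1, 1, 1, 1, 1, 1, 7),
  y = c(1, 59, 139, 218, 311, 366, 402, 110),
  w = c(65, 66, 78, 100, 126, 99, 102, 145),
  h = c(62, 87, 114, 122, 84, 67, 99, 99)
)

# Hypothetical helper: TRUE if rectangles i and j overlap in both x and y
overlaps <- function(r, i, j) {
  r$x[i] < r$x[j] + r$w[j] && r$x[j] < r$x[i] + r$w[i] &&
    r$y[i] < r$y[j] + r$h[j] && r$y[j] < r$y[i] + r$h[i]
}

# Hypothetical helper: bounding box of rectangles i and j
merge_pair <- function(r, i, j) {
  left   <- min(r$x[i], r$x[j])
  top    <- min(r$y[i], r$y[j])
  right  <- max(r$x[i] + r$w[i], r$x[j] + r$w[j])
  bottom <- max(r$y[i] + r$h[i], r$y[j] + r$h[j])
  data.frame(x = left, y = top, w = right - left, h = bottom - top)
}

pairs <- combn(nrow(rects), 2)  # every unordered pair exactly once
keep  <- apply(pairs, 2, function(p) overlaps(rects, p[1], p[2]))
merged <- do.call(rbind, lapply(which(keep), function(k) {
  merge_pair(rects, pairs[1, k], pairs[2, k])
}))
merged <- unique(merged)        # drop identical bounding boxes
```

This covers combinations of two rectangles only. Combinations of three or more could be built by repeating the same step on `rbind(rects, merged)` until no new boxes appear, at the cost of a result set that can grow quickly.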

Related

How do I use loops for automating work in R?

I have a file with data on the delivery of products to a store. I need to calculate the total number of products in the store. I want to use a loop to calculate the total quantity, but my loop only counts the total quantity of the last product. Why?
Here is the delivery data:
"Day" "Cott.cheese, pcs." "Kefir, pcs." "Sour cream, pcs."
1 104 117 119
2 94 114 114
3 105 107 117
4 99 112 120
5 86 104 111
6 88 110 126
7 95 106 129
I put this table in the in1 variable.
Here is the code:
s<-0
for (p in (2:ncol(in1))){
s<-sum(in1[,p]) }
s
I'm not sure I understand your question correctly, but if you only want to add up all the values of your data.frame except for the first column (Day), you just need to do this:
sum(in1[,-1])
You are overwriting the s variable on each iteration, which is why it only shows the result for the last column. Try:
s<-c()
for (p in 2:ncol(in1)) {
s<-c(s,sum(in1[,p]))
}
Alternatively:
colSums(in1[,-1])
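If the goal is a single grand total computed with a loop, as in the question, one option is to add each column sum to a running total instead of overwriting it. The following is a small sketch that rebuilds the table from the question; the column names `cott_cheese`, `kefir` and `sour_cream` are made up for illustration.

```r
# Delivery data from the question (column names are assumed)
in1 <- data.frame(
  Day         = 1:7,
  cott_cheese = c(104, 94, 105, 99, 86, 88, 95),
  kefir       = c(117, 114, 107, 112, 104, 110, 106),
  sour_cream  = c(119, 114, 117, 120, 111, 126, 129)
)

s <- 0
for (p in 2:ncol(in1)) {
  s <- s + sum(in1[, p])  # accumulate instead of overwriting
}
s
# [1] 2277
```

The result agrees with `sum(in1[, -1])`.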

How to sum column based on value in another column in two dataframes?

I am trying to create a limit order book and in one of the functions I want to return a list that sums the column 'size' for the ask dataframe and the bid dataframe in the limit order book.
The output should be...
$ask
oid price size
8 a 105 100
7 o 104 292
6 r 102 194
5 k 99 71
4 q 98 166
3 m 98 88
2 j 97 132
1 n 96 375
$bid
oid price size
1 b 95 100
2 l 95 29
3 p 94 87
4 s 91 102
Total volume: 318 1418
Where the input is...
oid,side,price,size
a,S,105,100
b,B,95,100
I have a function book.total_volumes <- function(book, path) { ... } that should return total volumes.
I tried to use aggregate but struggled with the fact that it is both ask and bid in the limit order book.
I appreciate any help, I am clearly a complete beginner. Only here to learn :)
If there is anything more I can add to make this question clearer, feel free to leave a comment!
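One possible approach, assuming the book is a list with two data frames named `ask` and `bid` as in the output shown above, is to sum the `size` column of each side separately rather than using `aggregate`. The helper `total_volumes` below is a hypothetical stand-in, not the `book.total_volumes(book, path)` function from the question, whose body and `path` argument are not shown.

```r
# Order book rebuilt from the output in the question
book <- list(
  ask = data.frame(
    oid   = c("a", "o", "r", "k", "q", "m", "j", "n"),
    price = c(105, 104, 102, 99, 98, 98, 97, 96),
    size  = c(100, 292, 194, 71, 166, 88, 132, 375)
  ),
  bid = data.frame(
    oid   = c("b", "l", "p", "s"),
    price = c(95, 95, 94, 91),
    size  = c(100, 29, 87, 102)
  )
)

# Hypothetical helper: total size on each side of the book
total_volumes <- function(book) {
  c(bid = sum(book$bid$size), ask = sum(book$ask$size))
}

vol <- total_volumes(book)
cat("Total volume:", vol[["bid"]], vol[["ask"]], "\n")
# Total volume: 318 1418
```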

What can I do to find and remove semi-duplicate rows in a matrix?

Assume I have this matrix
set.seed(123)
x <- matrix(rnorm(410),205,2)
x[8,] <- c(0.13152348, -0.05235148) #similar to x[5,]
x[16,] <- c(1.21846582, 1.695452178) #similar to x[11,]
The values are very similar to the rows specified above, and in the context of the whole data, they are semi-duplicates. What could I do to find and remove them? My original data is an array that contains many such matrices, but the position of the semi duplicates is the same across all matrices.
I know of agrep but the function operates on vectors as far as I understand.
You will need to set a threshold, but you can just compute the distance between each row using dist and find the points that are sufficiently close together. Of course, each point is near itself, so you need to ignore the diagonal of the distance matrix.
DM = as.matrix(dist(x))
diag(DM) = 1 ## ignore diagonal
which(DM < 0.025, arr.ind=TRUE)
row col
8 8 5
5 5 8
16 16 11
11 11 16
48 48 20
20 20 48
168 168 71
91 91 73
73 73 91
71 71 168
This finds the "close" points that you created and a few others that got generated at random.
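To go from finding the close pairs to removing them, one option is to look only at the upper triangle of the distance matrix, so each pair appears once, and drop the later row of each pair. This is a sketch under the same 0.025 threshold; which row of a pair to keep is an arbitrary choice here, and the names `close`, `drop_rows` and `x_clean` are illustrative.

```r
set.seed(123)
x <- matrix(rnorm(410), 205, 2)
x[8, ]  <- c(0.13152348, -0.05235148)   # similar to x[5, ]
x[16, ] <- c(1.21846582, 1.695452178)   # similar to x[11, ]

DM <- as.matrix(dist(x))

# Upper triangle only: row < col, so each close pair is listed once
close <- which(DM < 0.025 & upper.tri(DM), arr.ind = TRUE)

# Drop the later row of every close pair, keep the earlier one
drop_rows <- unique(close[, "col"])
x_clean <- if (length(drop_rows) > 0) x[-drop_rows, , drop = FALSE] else x
```

Because the positions of the semi-duplicates are the same across all matrices, the same `drop_rows` vector could then be reused to index the other matrices in the array.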

Formatting in a Data.frame in R

Hi, I have managed to count the number of occurrences of certain values in every row of a file (I used "by" to achieve this and was adamant about not using a for loop):
[,1]
test 1041
use 474
error 192
why 148
when 96
Now this is of type "integer", and I want to convert it to a data.frame that looks like
Key value
1 test 1041
2 use 474
3 error 192
4 temp 148
5 remedy 96
I am just breaking my head over this, as I could not find the right conversion technique.
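Assuming the object printed above is a one-column integer matrix with the keys as row names (the `[,1]` header suggests this, but it is an assumption), one way to get the desired shape is to build the data.frame from the row names and the single column. The name `counts` is illustrative.

```r
# Stand-in for the object from the question: one-column integer matrix
counts <- matrix(
  c(1041L, 474L, 192L, 148L, 96L),
  ncol = 1,
  dimnames = list(c("test", "use", "error", "why", "when"), NULL)
)

# Keys come from the row names, values from the single column
result <- data.frame(
  Key       = rownames(counts),
  value     = unname(counts[, 1]),
  row.names = NULL              # plain 1..n row numbers
)
result
```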

Retrieving adjacency values in an nng igraph object in R

edited to improve the quality of the question as a result of the (wholly appropriate) spanking received by Spacedman!
I have a k-nearest neighbors object (an igraph) which I created as such, by using the file I have uploaded here:
I performed the following operations on the data, in order to create an adjacency matrix of distances between observations:
W <- read.csv("/path/sim_matrix.csv")
W <- W[, -c(1,3)]
W <- scale(W)
sim_matrix <- dist(W, method = "euclidean", upper=TRUE)
sim_matrix <- as.matrix(sim_matrix)
mygraph <- nng(sim_matrix, k=10)
This gives me a nice list of vertices and their ten closest neighbors, a small sample follows:
1 -> 25 26 28 30 32 144 146 151 177 183 2 -> 4 8 32 33 145 146 154 156 186 199
3 -> 1 25 28 51 54 106 144 151 177 234 4 -> 7 8 89 95 97 158 160 170 186 204
5 -> 9 11 17 19 21 112 119 138 145 158 6 -> 10 12 14 18 20 22 147 148 157 194
7 -> 4 13 123 132 135 142 160 170 173 174 8 -> 4 7 89 90 95 97 158 160 186 204
So far so good.
What I'm struggling with, however, is how to get access to the values for the weights between the vertices, so that I can do meaningful calculations on them. Shouldn't be so hard, this is a common thing to want from graphs, no?
Looking at the documentation, I tried:
degree(mygraph)
which gives me the sum of the weights for each node. But I don't want the sum, I want the raw data, so I can do my own calculations.
I tried
get.data.frame(mygraph,"E")[1:10,]
but this has none of the distances between nodes:
from to
1 1 25
2 1 26
3 1 28
4 1 30
5 1 32
6 1 144
7 1 146
8 1 151
9 1 177
10 1 183
I have attempted to get values for the weights between vertices out of the graph object, that I can work with, but no luck.
If anyone has any ideas on how to go about approaching this, I'd be grateful. Thanks.
It's not clear from your question whether you are starting with a dataset, or with a distance matrix, e.g. nng(x=mydata,...) or nng(dx=mydistancematrix,...), so here are solutions with both.
library(cccd)
df <- mtcars[,c("mpg","hp")] # extract from mtcars dataset
# knn using dataset only
g <- nng(x=as.matrix(df),k=5) # for each car, 5 other most similar mpg and hp
V(g)$name <- rownames(df) # meaningful names for the vertices
dm <- as.matrix(dist(df)) # full distance matrix
E(g)$weight <- apply(get.edges(g,1:ecount(g)),1,function(x)dm[x[1],x[2]])
# knn using distance matrix (assumes you have dm already)
h <- nng(dx=dm,k=5)
V(h)$name <- rownames(df)
E(h)$weight <- apply(get.edges(h,1:ecount(h)),1,function(x)dm[x[1],x[2]])
# same result either way
identical(get.data.frame(g),get.data.frame(h))
# [1] TRUE
So these approaches identify the distances from each vertex to its five nearest neighbors, and set the edge weight attribute to those values. Interestingly, plot(g) works fine, but plot(h) fails. I think this might be a bug in the plot method for cccd.
If all you want to know is the distances from each vertex to the nearest neighbors, the code below does not require package cccd.
knn <- t(apply(dm,1,function(x)sort(x)[2:6]))
rownames(knn) <- rownames(df)
Here, the matrix knn has a row for each vertex and columns specifying the distance from that vertex to its 5 nearest neighbors. It does not tell you which neighbors those are, though.
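If the identities of the neighbors are needed as well, a similar base-R sketch can use `order` instead of `sort`. The names `dm_noself`, `knn_idx` and `knn_names` are illustrative. The diagonal is set to `Inf` first so that a vertex is never returned as its own neighbor, which matters when two rows are identical and their distance is zero (as with Mazda RX4 and Mazda RX4 Wag on mpg and hp).

```r
df <- mtcars[, c("mpg", "hp")]
dm <- as.matrix(dist(df))            # full distance matrix

dm_noself <- dm
diag(dm_noself) <- Inf               # exclude each vertex from its own list

# Row i holds the indices of the 5 nearest neighbors of vertex i
knn_idx <- t(apply(dm_noself, 1, function(x) order(x)[1:5]))

# Same information as vertex names
knn_names <- apply(knn_idx, 2, function(i) rownames(df)[i])
rownames(knn_names) <- rownames(df)
```

The corresponding distances can be looked up with `dm[cbind(row(knn_idx)[, 1], knn_idx[, 1])]` for the first neighbor, and likewise for the other columns.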
Okay, I've found an nng function in the cccd package. Is that it? If so, then mygraph is just an igraph object and you can just do E(mygraph)$whatever to get the values of an edge attribute.
Following one of the cccd examples to create G1 here, you can get a data frame of all the edges and attributes thus:
get.data.frame(G1,"E")[1:10,]
You can get/set individual edge attributes with E(g)$whatever:
> E(G1)$weight=1:250
> E(G1)$whatever=runif(250)
> get.data.frame(G1,"E")[1:10,]
from to weight whatever
1 1 3 1 0.11861240
2 1 7 2 0.06935047
3 1 22 3 0.32040316
4 1 29 4 0.86991432
5 1 31 5 0.47728632
Is that what you are after? Any igraph package tutorial will tell you more!
