Related
I have a problem when tidying a table from website scraping.
I want to get the table (with header V1 to V5) from the link below, but I failed to convert it into the same format in R studio.
This is what I'm doing
url <- "https://www.r-bloggers.com/2018/08/using-control-charts-in-r/"
library(rvest)
library(tidyverse)
h <- read_html(url)
tab <- h %>% html_nodes("table")
tab <- tab[[2]] %>% html_table()
tab <- separate_rows(tab, 1, sep = " ")
tab <- tab[8:132,]
tab <- as.data.frame(tab)
tab1 <- data.frame(c("V1", "V2", "V3", "V4", "V5"))
tab1 <- tab1 %>% setNames("Cat")
tab2 <- cbind(tab1,tab)
tab3 <- tab2 %>% spread(key = Cat, X1)
Here is the result
Error: Each row of output must be identified by a unique combination of keys.
Keys are shared for 125 rows:
* 1, 6, 11, 16, 21, 26, 31, 36, 41, 46, 51, 56, 61, 66, 71, 76, 81, 86, 91, 96, 101, 106, 111, 116, 121
* 2, 7, 12, 17, 22, 27, 32, 37, 42, 47, 52, 57, 62, 67, 72, 77, 82, 87, 92, 97, 102, 107, 112, 117, 122
* 3, 8, 13, 18, 23, 28, 33, 38, 43, 48, 53, 58, 63, 68, 73, 78, 83, 88, 93, 98, 103, 108, 113, 118, 123
* 4, 9, 14, 19, 24, 29, 34, 39, 44, 49, 54, 59, 64, 69, 74, 79, 84, 89, 94, 99, 104, 109, 114, 119, 124
* 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125
So what should I do to get the same table as from the website?
And if you can think of a better way to get the table from this website, please tell me.
P/s: I'm learning R programming on my own, so please teach me!
Cheers.
Here's a way :
library(rvest)
url <- "https://www.r-bloggers.com/2018/08/using-control-charts-in-r/"
url %>%
read_html %>%
html_nodes('table') %>%
.[[2]] %>%
html_table() %>%
dplyr::pull(X1) %>%
stringr::str_extract_all('\\d+\\.\\d+') %>%
.[[1]] %>%
matrix(ncol = 5, byrow = TRUE) %>%
as.data.frame() %>% type.convert() -> tab
tab
# V1 V2 V3 V4 V5
#1 1.45 1.56 1.40 1.45 1.33
#2 1.75 1.53 1.55 1.42 1.42
#3 1.60 1.41 1.35 1.52 1.36
#4 1.53 1.58 1.54 1.71 1.55
#5 1.48 1.34 1.64 1.59 1.46
#6 1.69 1.55 1.49 1.61 1.47
#...
#...
I got a dataframe (merged_df) with 52 columns (I show here only the first 4):
Row.names node_demand Node 1 Node 2
1 Node 1 3 0 87
2 Node 10 6 58 52
3 Node 11 10 43 70
4 Node 12 18 94 8
5 Node 13 3 44 63
6 Node 14 6 21 98
7 Node 15 20 31 64
8 Node 16 4 35 76
9 Node 17 14 58 52
10 Node 18 11 19 71
11 Node 19 19 62 38
12 Node 2 14 87 0
13 Node 20 15 102 19
14 Node 21 15 16 76
15 Node 22 4 54 51
16 Node 23 13 59 75
17 Node 24 13 73 28
18 Node 25 5 82 33
19 Node 26 16 62 72
20 Node 27 3 59 30
21 Node 28 7 73 32
22 Node 29 14 45 48
23 Node 3 1 43 78
24 Node 30 17 69 44
25 Node 31 3 70 43
26 Node 32 3 15 87
27 Node 33 12 38 72
28 Node 34 14 62 81
29 Node 35 20 104 17
30 Node 36 13 18 77
31 Node 37 10 70 22
32 Node 38 9 65 46
33 Node 39 6 24 64
34 Node 4 14 68 23
35 Node 40 18 85 8
36 Node 41 7 20 95
37 Node 42 20 55 82
38 Node 43 9 94 16
39 Node 44 1 10 79
40 Node 45 8 62 63
41 Node 46 5 50 88
42 Node 47 1 70 50
43 Node 48 7 54 73
44 Node 49 9 52 43
45 Node 5 19 57 48
46 Node 50 2 4 86
47 Node 6 2 76 22
48 Node 7 14 79 60
49 Node 8 6 108 25
50 Node 9 7 101 18
The columns Node 1, Node 2 .....Node 45....Node 46 show the distance from the Node indicated on the column respect all the other nodes.
I want to pick the closest nodes, and then to select all the nodes under which cumsum() node_demand is less than 120, starting from the first row. Since the first value is the distance between the main Node and itself I don't consider the first row.
To do that for Node 1 I would do:
test <- merged_df[,c(1,2,3)] # Columns 1 and 2 are fixed
test <- test[(order(test[3])),][2:50,] # to get the closest distances first
test<- test[cumsum(test$node_demand)< 120,]
I then need to create a new variable for each node with the last value of the cumsum()
node_1 <- tail(cumsum(test$`Node 1`), n=1) # 381
The output for node_1 would be 381
To do the same for node_2:
test <- merged_df[,c(1,2,4)] #c(1,2,**4**) 4 instead of 3 as before
test <- test[(order(test[3])),][2:50,]
test<- test[cumsum(test$node_demand)< 120,]
node_2 <- tail(cumsum(test$`Node 2`), n=1)
The output for node_2 is 178
Since this process is very repetitive I guess a loop could do it but I am not sure how to create the different variables I need
for(i in 3:52){
test <- merged_df[,c(1,2,i)]
test <- merged_df[order(test[3]),][2:50]
test<- test[cumsum(test$node_demand)< 120,]
}
node_1 <- tail(cumsum(test$test$`Node 1`), n=1) # should return 381
#I'm not sure how to create the variables node_1, node_2....node_50
The process to follow would be:
Create a subset of the dataframe using columns 1, 2 and i (representing the number from Node 1 to Node 50.
Sort the subset by the column Node i so the smallest distances are placed first.
I need to select rows until cumsum(node_demand) < 120. (which is what I do using test<- test[cumsum(test$node_demand)< 120,])
Then I need to calculate cumsum(test$Node 1). This will give me the cumulative distance from all the nodes under the condition of cumsum(node_demand) < 120
Anybody could give me a hand?
Many thanks!
The output of dput() is :
structure(list(Row.names = structure(c("Node 1", "Node 10", "Node 11",
"Node 12", "Node 13", "Node 14", "Node 15", "Node 16", "Node 17",
"Node 18", "Node 19", "Node 2", "Node 20", "Node 21", "Node 22",
"Node 23", "Node 24", "Node 25", "Node 26", "Node 27", "Node 28",
"Node 29", "Node 3", "Node 30", "Node 31", "Node 32", "Node 33",
"Node 34", "Node 35", "Node 36", "Node 37", "Node 38", "Node 39",
"Node 4", "Node 40", "Node 41", "Node 42", "Node 43", "Node 44",
"Node 45", "Node 46", "Node 47", "Node 48", "Node 49", "Node 5",
"Node 50", "Node 6", "Node 7", "Node 8", "Node 9"), class = "AsIs"),
node_demand = c(3L, 6L, 10L, 18L, 3L, 6L, 20L, 4L, 14L, 11L,
19L, 14L, 15L, 15L, 4L, 13L, 13L, 5L, 16L, 3L, 7L, 14L, 1L,
17L, 3L, 3L, 12L, 14L, 20L, 13L, 10L, 9L, 6L, 14L, 18L, 7L,
20L, 9L, 1L, 8L, 5L, 1L, 7L, 9L, 19L, 2L, 2L, 14L, 6L, 7L
), `Node 1` = c(0, 58, 43, 94, 44, 21, 31, 35, 58, 19, 62,
87, 102, 16, 54, 59, 73, 82, 62, 59, 73, 45, 43, 69, 70,
15, 38, 62, 104, 18, 70, 65, 24, 68, 85, 20, 55, 94, 10,
62, 50, 70, 54, 52, 57, 4, 76, 79, 108, 101), `Node 2` = c(87,
52, 70, 8, 63, 98, 64, 76, 52, 71, 38, 0, 19, 76, 51, 75,
28, 33, 72, 30, 32, 48, 78, 44, 43, 87, 72, 81, 17, 77, 22,
46, 64, 23, 8, 95, 82, 16, 79, 63, 88, 50, 73, 43, 48, 86,
22, 60, 25, 18), `Node 3` = c(43, 28, 11, 84, 15, 35, 52,
68, 30, 45, 73, 78, 97, 43, 72, 20, 78, 57, 91, 58, 80, 58,
0, 42, 83, 29, 69, 94, 91, 51, 70, 36, 41, 70, 79, 33, 22,
78, 34, 25, 13, 86, 84, 35, 73, 46, 60, 43, 101, 94), `Node 4` = c(68,
50, 62, 30, 56, 82, 43, 53, 49, 51, 16, 23, 34, 57, 29, 71,
10, 44, 50, 15, 15, 26, 70, 46, 25, 71, 49, 58, 39, 57, 5,
47, 45, 0, 19, 79, 76, 37, 62, 61, 81, 31, 50, 36, 25, 67,
30, 65, 41, 35), `Node 5` = c(57, 62, 66, 54, 62, 74, 30,
33, 61, 39, 10, 48, 55, 45, 15, 79, 23, 66, 26, 30, 24, 16,
73, 62, 22, 64, 27, 34, 63, 42, 28, 62, 37, 25, 42, 71, 80,
62, 55, 73, 84, 24, 28, 48, 0, 55, 53, 81, 63, 58), `Node 6` = c(76,
34, 54, 27, 46, 84, 60, 73, 36, 63, 45, 22, 40, 68, 53, 58,
38, 16, 77, 25, 41, 46, 60, 29, 50, 73, 71, 85, 34, 71, 28,
29, 55, 30, 27, 82, 67, 20, 67, 44, 70, 56, 75, 28, 53, 76,
0, 44, 41, 35), `Node 7` = c(79, 24, 38, 62, 36, 75, 72,
93, 23, 72, 75, 60, 79, 72, 84, 28, 75, 30, 105, 58, 79,
69, 43, 20, 88, 69, 91, 112, 66, 80, 66, 20, 66, 65, 64,
73, 38, 53, 70, 18, 45, 93, 102, 35, 81, 82, 44, 0, 81, 75
), `Node 8` = c(108, 75, 94, 23, 86, 120, 85, 91, 76, 91,
54, 25, 11, 97, 64, 98, 40, 52, 83, 49, 41, 67, 101, 67,
51, 109, 88, 91, 21, 98, 39, 69, 84, 41, 27, 117, 106, 28,
101, 85, 111, 56, 85, 66, 63, 107, 41, 81, 0, 7), `Node 9` = c(101,
68, 87, 17, 79, 113, 78, 86, 69, 85, 48, 18, 9, 90, 58, 91,
35, 46, 78, 42, 36, 60, 94, 60, 47, 102, 83, 87, 18, 91,
32, 62, 78, 35, 20, 110, 99, 23, 94, 79, 104, 52, 80, 59,
58, 100, 35, 75, 7, 0), `Node 10` = c(58, 0, 23, 57, 16,
58, 51, 70, 8, 50, 58, 52, 71, 51, 63, 24, 60, 29, 85, 40,
62, 48, 28, 16, 69, 49, 69, 91, 64, 59, 50, 8, 43, 50, 55,
56, 34, 50, 48, 12, 37, 73, 80, 14, 62, 60, 34, 24, 75, 68
), `Node 11` = c(43, 23, 0, 76, 10, 37, 45, 65, 22, 41, 65,
70, 89, 39, 67, 17, 71, 52, 85, 51, 73, 51, 11, 34, 77, 31,
64, 89, 83, 47, 63, 30, 36, 62, 71, 35, 18, 72, 34, 21, 19,
81, 79, 27, 66, 46, 54, 38, 94, 87), `Node 12` = c(94, 57,
76, 0, 69, 104, 71, 83, 56, 78, 44, 8, 17, 83, 58, 79, 34,
36, 78, 38, 38, 55, 84, 47, 50, 94, 79, 87, 9, 85, 29, 50,
71, 30, 12, 101, 86, 14, 87, 67, 93, 56, 80, 49, 54, 94,
27, 62, 23, 17), `Node 13` = c(44, 16, 10, 69, 0, 43, 45,
63, 18, 40, 60, 63, 82, 40, 61, 22, 65, 44, 82, 44, 67, 47,
15, 30, 71, 34, 62, 86, 77, 48, 56, 24, 34, 56, 65, 41, 27,
64, 34, 19, 26, 74, 76, 21, 62, 47, 46, 36, 86, 79), `Node 14` = c(21,
58, 37, 104, 43, 0, 46, 56, 58, 35, 78, 98, 114, 30, 73,
51, 88, 86, 82, 71, 89, 60, 35, 70, 88, 12, 58, 82, 113,
36, 84, 66, 39, 82, 97, 3, 44, 103, 22, 58, 37, 89, 74, 57,
74, 25, 84, 75, 120, 113), `Node 15` = c(31, 51, 45, 71,
45, 46, 0, 27, 49, 12, 35, 64, 77, 16, 34, 61, 47, 68, 42,
39, 49, 19, 52, 56, 48, 37, 22, 45, 80, 16, 46, 55, 16, 43,
60, 43, 58, 74, 29, 59, 61, 49, 38, 40, 30, 30, 60, 72, 85,
78), `Node 16` = c(35, 70, 65, 83, 63, 56, 27, 0, 70, 26,
42, 76, 85, 32, 28, 82, 53, 85, 29, 48, 51, 30, 68, 77, 44,
48, 9, 28, 93, 23, 55, 74, 29, 53, 72, 54, 80, 87, 38, 79,
79, 42, 19, 59, 33, 31, 73, 93, 91, 86), `Node 17` = c(58,
8, 22, 56, 18, 58, 49, 70, 0, 49, 56, 52, 71, 50, 63, 23,
59, 31, 84, 41, 63, 47, 30, 12, 70, 49, 68, 90, 63, 57, 50,
10, 43, 49, 54, 55, 31, 51, 48, 14, 37, 74, 80, 14, 61, 60,
36, 23, 76, 69), `Node 18` = c(19, 50, 41, 78, 40, 35, 12,
26, 49, 0, 44, 71, 85, 8, 39, 58, 55, 71, 48, 43, 56, 26,
45, 58, 54, 27, 25, 50, 87, 8, 53, 55, 9, 51, 68, 33, 55,
80, 17, 57, 55, 55, 41, 41, 39, 18, 63, 72, 91, 85), `Node 19` = c(62,
58, 65, 44, 60, 78, 35, 42, 56, 44, 0, 38, 46, 50, 19, 77,
14, 58, 35, 24, 18, 18, 73, 56, 21, 68, 37, 44, 53, 48, 19,
57, 40, 16, 32, 75, 79, 53, 58, 69, 84, 25, 37, 44, 10, 60,
45, 75, 54, 48), `Node 20` = c(102, 71, 89, 17, 82, 114,
77, 85, 71, 85, 46, 19, 0, 90, 58, 94, 33, 51, 75, 44, 35,
60, 97, 63, 46, 103, 81, 84, 18, 91, 32, 66, 78, 34, 19,
111, 101, 28, 95, 82, 107, 51, 78, 62, 55, 101, 40, 79, 11,
9), `Node 21` = c(16, 51, 39, 83, 40, 30, 16, 32, 50, 8,
50, 76, 90, 0, 46, 55, 62, 74, 54, 49, 63, 33, 43, 59, 61,
22, 30, 56, 92, 10, 59, 57, 15, 57, 73, 27, 51, 84, 16, 57,
51, 62, 48, 44, 45, 17, 68, 72, 97, 90), `Node 22` = c(54,
63, 67, 58, 61, 73, 34, 28, 63, 39, 19, 51, 58, 46, 0, 81,
26, 67, 26, 28, 23, 19, 72, 65, 16, 63, 26, 33, 67, 42, 29,
64, 35, 29, 46, 70, 83, 64, 52, 74, 84, 16, 23, 49, 15, 52,
53, 84, 64, 58), `Node 23` = c(59, 24, 17, 79, 22, 51, 61,
82, 23, 58, 77, 75, 94, 55, 81, 0, 81, 50, 100, 62, 84, 65,
20, 33, 90, 46, 80, 105, 85, 64, 73, 29, 53, 71, 77, 49,
13, 73, 51, 14, 19, 94, 95, 35, 79, 63, 58, 28, 98, 91),
`Node 24` = c(73, 60, 71, 34, 65, 88, 47, 53, 59, 55, 14,
28, 33, 62, 26, 81, 0, 53, 44, 22, 7, 29, 78, 56, 18, 77,
49, 53, 43, 60, 10, 57, 50, 10, 22, 85, 85, 43, 68, 71, 90,
24, 47, 46, 23, 71, 38, 75, 40, 35), `Node 25` = c(82, 29,
52, 36, 44, 86, 68, 85, 31, 71, 58, 33, 51, 74, 67, 50, 53,
0, 91, 39, 56, 58, 57, 22, 66, 76, 82, 99, 40, 79, 43, 22,
62, 44, 39, 84, 61, 25, 72, 36, 65, 72, 89, 30, 66, 83, 16,
30, 52, 46), `Node 26` = c(62, 85, 85, 78, 82, 82, 42, 29,
84, 48, 35, 72, 75, 54, 26, 100, 44, 91, 0, 53, 43, 37, 91,
86, 34, 74, 25, 9, 86, 46, 51, 86, 50, 50, 66, 80, 99, 86,
64, 95, 101, 30, 12, 71, 26, 59, 77, 105, 83, 78), `Node 27` = c(59,
40, 51, 38, 44, 71, 39, 48, 41, 43, 24, 30, 44, 49, 28, 62,
22, 39, 53, 0, 23, 23, 58, 40, 30, 60, 46, 60, 48, 50, 14,
39, 35, 15, 29, 69, 67, 39, 51, 52, 69, 34, 50, 27, 30, 58,
25, 58, 49, 42), `Node 28` = c(73, 62, 73, 38, 67, 89, 49,
51, 63, 56, 18, 32, 35, 63, 23, 84, 7, 56, 43, 23, 0, 31,
80, 60, 12, 78, 48, 51, 46, 61, 14, 60, 51, 15, 27, 86, 89,
46, 68, 74, 92, 18, 44, 49, 24, 71, 41, 79, 41, 36), `Node 29` = c(45,
48, 51, 55, 47, 60, 19, 30, 47, 26, 18, 48, 60, 33, 19, 65,
29, 58, 37, 23, 31, 0, 58, 50, 31, 50, 26, 43, 64, 31, 28,
50, 23, 26, 44, 57, 66, 59, 40, 59, 69, 33, 34, 35, 16, 43,
46, 69, 67, 60), `Node 30` = c(69, 16, 34, 47, 30, 70, 56,
77, 12, 58, 56, 44, 63, 59, 65, 33, 56, 22, 86, 40, 60, 50,
42, 0, 69, 61, 74, 93, 53, 66, 47, 9, 51, 46, 46, 67, 41,
41, 59, 21, 49, 74, 84, 19, 62, 70, 29, 20, 67, 60), `Node 31` = c(70,
69, 77, 50, 71, 88, 48, 44, 70, 54, 21, 43, 46, 61, 16, 90,
18, 66, 34, 30, 12, 31, 83, 69, 0, 77, 42, 42, 58, 58, 24,
68, 50, 25, 39, 86, 94, 57, 67, 81, 95, 6, 34, 55, 22, 68,
50, 88, 51, 47), `Node 32` = c(15, 49, 31, 94, 34, 12, 37,
48, 49, 27, 68, 87, 103, 22, 63, 46, 77, 76, 74, 60, 78,
50, 29, 61, 77, 0, 50, 75, 102, 29, 73, 57, 28, 71, 86, 9,
42, 92, 11, 51, 35, 78, 66, 47, 64, 19, 73, 69, 109, 102),
`Node 33` = c(38, 69, 64, 79, 62, 58, 22, 9, 68, 25, 37,
72, 81, 30, 26, 80, 49, 82, 25, 46, 48, 26, 69, 74, 42, 50,
0, 26, 88, 22, 51, 72, 28, 49, 67, 55, 79, 84, 40, 78, 79,
40, 18, 57, 27, 34, 71, 91, 88, 83), `Node 34` = c(62, 91,
89, 87, 86, 82, 45, 28, 90, 50, 44, 81, 84, 56, 33, 105,
53, 99, 9, 60, 51, 43, 94, 93, 42, 75, 26, 0, 96, 47, 60,
93, 53, 58, 75, 80, 103, 95, 65, 101, 104, 37, 12, 77, 34,
58, 85, 112, 91, 87), `Node 35` = c(104, 64, 83, 9, 77, 113,
80, 93, 63, 87, 53, 17, 18, 92, 67, 85, 43, 40, 86, 48, 46,
64, 91, 53, 58, 102, 88, 96, 0, 94, 38, 57, 80, 39, 21, 110,
93, 17, 96, 73, 100, 64, 89, 57, 63, 103, 34, 66, 21, 18),
`Node 36` = c(18, 59, 47, 85, 48, 36, 16, 23, 57, 8, 48,
77, 91, 10, 42, 64, 60, 79, 46, 50, 61, 31, 51, 66, 58, 29,
22, 47, 94, 0, 59, 64, 18, 57, 74, 34, 61, 87, 21, 65, 60,
57, 40, 49, 42, 16, 71, 80, 98, 91), `Node 37` = c(70, 50,
63, 29, 56, 84, 46, 55, 50, 53, 19, 22, 32, 59, 29, 73, 10,
43, 51, 14, 14, 28, 70, 47, 24, 73, 51, 60, 38, 59, 0, 48,
47, 5, 18, 81, 78, 35, 64, 62, 82, 30, 52, 37, 28, 69, 28,
66, 39, 32), `Node 38` = c(65, 8, 30, 50, 24, 66, 55, 74,
10, 55, 57, 46, 66, 57, 64, 29, 57, 22, 86, 39, 60, 50, 36,
9, 68, 57, 72, 93, 57, 64, 48, 0, 48, 47, 50, 63, 39, 44,
55, 16, 44, 73, 83, 16, 62, 66, 29, 20, 69, 62), `Node 39` = c(24,
43, 36, 71, 34, 39, 16, 29, 43, 9, 40, 64, 78, 15, 35, 53,
50, 62, 50, 35, 51, 23, 41, 51, 50, 28, 28, 53, 80, 18, 47,
48, 0, 45, 62, 36, 53, 72, 18, 51, 52, 51, 43, 33, 37, 23,
55, 66, 84, 78), `Node 40` = c(85, 55, 71, 12, 65, 97, 60,
72, 54, 68, 32, 8, 19, 73, 46, 77, 22, 39, 66, 29, 27, 44,
79, 46, 39, 86, 67, 75, 21, 74, 18, 50, 62, 19, 0, 94, 83,
23, 78, 66, 89, 45, 68, 44, 42, 84, 27, 64, 27, 20), `Node 41` = c(20,
56, 35, 101, 41, 3, 43, 54, 55, 33, 75, 95, 111, 27, 70,
49, 85, 84, 80, 69, 86, 57, 33, 67, 86, 9, 55, 80, 110, 34,
81, 63, 36, 79, 94, 0, 42, 100, 20, 56, 36, 86, 72, 55, 71,
24, 82, 73, 117, 110), `Node 42` = c(55, 34, 18, 86, 27,
44, 58, 80, 31, 55, 79, 82, 101, 51, 83, 13, 85, 61, 99,
67, 89, 66, 22, 41, 94, 42, 79, 103, 93, 61, 78, 39, 53,
76, 83, 42, 0, 82, 48, 26, 16, 97, 95, 42, 80, 59, 67, 38,
106, 99), `Node 43` = c(94, 50, 72, 14, 64, 103, 74, 87,
51, 80, 53, 16, 28, 84, 64, 73, 43, 25, 86, 39, 46, 59, 78,
41, 57, 92, 84, 95, 17, 87, 35, 44, 72, 37, 23, 100, 82,
0, 86, 59, 87, 63, 87, 45, 62, 94, 20, 53, 28, 23), `Node 44` = c(10,
48, 34, 87, 34, 22, 29, 38, 48, 17, 58, 79, 95, 16, 52, 51,
68, 72, 64, 51, 68, 40, 34, 59, 67, 11, 40, 65, 96, 21, 64,
55, 18, 62, 78, 20, 48, 86, 0, 52, 43, 67, 55, 43, 55, 13,
67, 70, 101, 94), `Node 45` = c(62, 12, 21, 67, 19, 58, 59,
79, 14, 57, 69, 63, 82, 57, 74, 14, 71, 36, 95, 52, 74, 59,
25, 21, 81, 51, 78, 101, 73, 65, 62, 16, 51, 61, 66, 56,
26, 59, 52, 0, 30, 85, 91, 26, 73, 65, 44, 18, 85, 79), `Node 46` = c(50,
37, 19, 93, 26, 37, 61, 79, 37, 55, 84, 88, 107, 51, 84,
19, 90, 65, 101, 69, 92, 69, 13, 49, 95, 35, 79, 104, 100,
60, 82, 44, 52, 81, 89, 36, 16, 87, 43, 30, 0, 98, 95, 45,
84, 54, 70, 45, 111, 104), `Node 47` = c(70, 73, 81, 56,
74, 89, 49, 42, 74, 55, 25, 50, 51, 62, 16, 94, 24, 72, 30,
34, 18, 33, 86, 74, 6, 78, 40, 37, 64, 57, 30, 73, 51, 31,
45, 86, 97, 63, 67, 85, 98, 0, 30, 60, 24, 67, 56, 93, 56,
52), `Node 48` = c(54, 80, 79, 80, 76, 74, 38, 19, 80, 41,
37, 73, 78, 48, 23, 95, 47, 89, 12, 50, 44, 34, 84, 84, 34,
66, 18, 12, 89, 40, 52, 83, 43, 50, 68, 72, 95, 87, 55, 91,
95, 30, 0, 67, 28, 50, 75, 102, 85, 80), `Node 49` = c(52,
14, 27, 49, 21, 57, 40, 59, 14, 41, 44, 43, 62, 44, 49, 35,
46, 30, 71, 27, 49, 35, 35, 19, 55, 47, 57, 77, 57, 49, 37,
16, 33, 36, 44, 55, 42, 45, 43, 26, 45, 60, 67, 0, 48, 53,
28, 35, 66, 59), `Node 50` = c(4, 60, 46, 94, 47, 25, 30,
31, 60, 18, 60, 86, 101, 17, 52, 63, 71, 83, 59, 58, 71,
43, 46, 70, 68, 19, 34, 58, 103, 16, 69, 66, 23, 67, 84,
24, 59, 94, 13, 65, 54, 67, 50, 53, 55, 0, 76, 82, 107, 100
)), .Names = c("Row.names", "node_demand", "Node 1", "Node 2",
"Node 3", "Node 4", "Node 5", "Node 6", "Node 7", "Node 8", "Node 9",
"Node 10", "Node 11", "Node 12", "Node 13", "Node 14", "Node 15",
"Node 16", "Node 17", "Node 18", "Node 19", "Node 20", "Node 21",
"Node 22", "Node 23", "Node 24", "Node 25", "Node 26", "Node 27",
"Node 28", "Node 29", "Node 30", "Node 31", "Node 32", "Node 33",
"Node 34", "Node 35", "Node 36", "Node 37", "Node 38", "Node 39",
"Node 40", "Node 41", "Node 42", "Node 43", "Node 44", "Node 45",
"Node 46", "Node 47", "Node 48", "Node 49", "Node 50"), class = "data.frame", row.names = c(NA,
-50L))
You can try a tidyverse
library(tidyverse)
d %>%
as.tibble() %>%
gather(k,v, -node_demand, -Row.names) %>%
arrange(k, v) %>%
group_by(k) %>%
filter(Row.names != k) %>%
filter(cumsum(node_demand)<120) %>%
summarise(sum(v))
# A tibble: 50 x 2
k `sum(v)`
<chr> <dbl>
1 Node 1 381
2 Node 10 202
3 Node 11 332
4 Node 12 186
5 Node 13 262
6 Node 14 419
7 Node 15 282
8 Node 16 279
9 Node 17 272
10 Node 18 302
# ... with 40 more rows
Prove result for Node 1 and 2:
.Last.value %>%
filter(k %in% c("Node 1", "Node 2"))
# A tibble: 2 x 2
k `sum(v)`
<chr> <dbl>
1 Node 1 381
2 Node 2 178
The idea is to transform the data from long to wide. After arranging, we group by Node (column k) and filter 1) "self-nodes" and 2) cumsum<120. Finally calculate the sum for each Node.
After running the command connected_components on an undirected graph g with LightGraphs in Julia, I obtain the following result:
9-element Array{Array{Int64,1},1}:
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10 … 41, 42, 43, 44, 45, 46, 47, 48, 49, 50]
[51, 52, 53, 54, 55, 56, 57, 58, 59, 60 … 91, 92, 93, 94, 95, 96, 97, 98, 99, 100]
[69, 88]
[71, 73, 84, 102, 114, 122, 124, 127, 128, 134, 139, 143, 147, 150]
[101, 104, 105, 111, 112, 113, 116, 117, 121, 125 … 137, 138, 140, 141, 142, 144, 145, 146, 148, 149]
[103, 106, 108, 110, 118, 119, 123, 126, 130, 131, 132, 136]
[107]
[109]
[115]
I want to add an element-wise edge between the vertices 107 109 115 and the vertices 91 113 102. I know I can use the commands add_edge!(g,107,91) add_edge!(g,109,113) and add_edge!(g,115,102) but isn't there a command that can do all that in one shot instead of creating a loop or a function?
Thank you!
I have a Data Table with three columns: seller, product and price.
Example data:
seller product price
1: A banana 56
2: A lemon 94
3: A orange 84
4: A banana 11
5: A lemon 86
---
166: C orange 162
167: C banana 109
168: C orange 61
169: C banana 141
170: C orange 22
Code for the data
require (data.table)
DT <- data.table(seller = c(rep(c("A"),60),rep(c("B"),62),rep(c("C"),48)), product = c(rep(c("banana", "lemon", "orange"), 20), rep(c("banana", "lemon"), 31), rep(c("banana", "orange"), 24)),
price = c(56, 94, 84, 11, 86, 103, 151, 51, 117, 71, 63, 101, 45, 147, 135, 93, 26, 164, 90, 67, 12, 34, 14, 131, 92, 145, 48, 74, 62, 57, 20, 80, 113, 46, 88, 102, 134, 98, 137, 123, 169, 133, 146,
160, 58, 42, 52, 158, 170, 2, 152, 10, 130, 30, 33, 144, 73, 41, 139, 107, 163, 9, 66, 81, 79, 127, 40, 165, 106, 161, 16, 1, 112, 70, 115, 138, 76, 105, 17, 118, 114, 121, 25, 39, 15, 155, 50, 166,
100, 159, 5, 19, 29, 24, 64, 149, 120, 35, 119, 53, 21, 7, 72, 132, 154, 168, 156, 38, 3, 148, 69, 44, 6, 28, 140, 77, 104, 153, 59, 142, 116, 150, 97, 31, 91, 43, 47, 27, 143, 99, 37, 54, 49, 4, 111,
32, 23, 85, 167, 136, 78, 129, 83, 124, 36, 96, 110, 13, 65, 108, 8, 18, 157, 87, 82, 60, 122, 89, 125, 68, 75, 126, 128, 55, 95, 162, 109, 61, 141, 22))
I would like to perform a pairwise T.test combination between all sellers that sell the same products.
I would like to have an output as it is shown (hypotetical p.values for the example).
Desire output:
seller.x seller.y product p.value
A B banana 0.45
A B lemon 0.87
B C banana 0.03
A C banana 0.23
A C orange 0.01
You first need to group by product. Then, in your j parameter, you need to compute the combinations of seller for this product and get the p.value for the t.test of price between seller.x and seller.y:
DT[
, {
sellercomb <- data.table(t(combn(unique(seller), 2)))
names(sellercomb) <- c("seller.x", "seller.y")
sellercomb[
, {
data.table(p.value = t.test(price[seller == seller.x], price[seller == seller.y])$p.value)
}
, by = .(seller.x, seller.y)
]
}
, by = .(product)
]
The result for your data above looks like this:
product seller.x seller.y p.value
1: banana A B 0.9384329
2: banana A C 0.2413946
3: banana B C 0.2154216
4: lemon A B 0.7282811
5: orange A C 0.0354320
I have to find the K Shortest Path,However the below code i tried gives the same path when i choose different K Values and the distance computed is not correct.
My dataset is my.graph with class igraph
dput(my.graph)
structure(list(169, FALSE, c(22, 1, 2, 1, 2, 3, 114, 3, 4, 5,
4, 5, 6, 6, 7, 7, 8, 9, 8, 110, 78, 159, 9, 159, 30, 11, 13,
160, 11, 66, 160, 138, 14, 13, 14, 15, 81, 16, 15, 17, 16, 17,
18, 18, 19, 130, 19, 62, 62, 23, 42, 22, 22, 22, 23, 24, 161,
24, 25, 25, 26, 64, 26, 28, 161, 29, 28, 29, 47, 48, 53, 142,
31, 30, 32, 31, 32, 33, 33, 34, 35, 118, 34, 36, 35, 37, 36,
37, 38, 39, 38, 162, 40, 39, 40, 41, 41, 42, 43, 44, 43, 44,
45, 45, 46, 47, 46, 47, 47, 49, 48, 49, 50, 51, 50, 52, 51, 52,
53, 60, 53, 54, 53, 55, 54, 56, 55, 57, 56, 57, 58, 58, 59, 59,
60, 60, 60, 63, 162, 62, 62, 63, 64, 65, 65, 66, 166, 68, 163,
164, 69, 165, 68, 70, 69, 71, 70, 71, 72, 72, 73, 112, 73, 74,
75, 74, 76, 75, 76, 77, 78, 77, 78, 110, 78, 79, 80, 79, 146,
80, 81, 82, 81, 81, 82, 137, 164, 84, 85, 84, 86, 85, 86, 87,
87, 164, 165, 89, 89, 90, 90, 91, 92, 91, 93, 92, 93, 94, 95,
94, 165, 95, 163, 97, 97, 98, 99, 98, 99, 100, 101, 100, 101,
102, 102, 163, 104, 166, 105, 104, 106, 105, 106, 107, 108, 107,
109, 108, 109, 166, 110, 110, 125, 116, 112, 113, 112, 112, 114,
113, 114, 115, 114, 126, 115, 116, 117, 118, 117, 119, 118, 118,
120, 119, 120, 121, 121, 122, 123, 122, 124, 168, 141, 123, 124,
125, 125, 125, 126, 140, 140, 128, 128, 129, 130, 129, 130, 130,
131, 131, 132, 133, 132, 134, 133, 134, 135, 135, 136, 137, 136,
137, 137, 139, 138, 139, 168, 143, 140, 140, 141, 142, 158, 167,
143, 167, 144, 145, 144, 145, 146, 146, 146, 148, 148, 149, 149,
150, 151, 150, 152, 151, 153, 152, 153, 154, 154, 155, 156, 155,
156, 157, 157, 158, 158, 158, 159, 160, 159, 160, 160, 160, 161,
161, 162, 162, 163, 163, 163, 164, 164, 164, 165, 165, 165, 166,
166, 166, 167, 167, 168, 168), c(0, 0, 1, 0, 1, 2, 2, 2, 3, 4,
3, 4, 5, 5, 6, 6, 7, 8, 7, 9, 9, 9, 8, 10, 10, 10, 11, 11, 10,
12, 12, 12, 13, 11, 13, 14, 14, 15, 14, 16, 15, 16, 17, 17, 18,
19, 18, 19, 20, 20, 21, 21, 0, 21, 20, 23, 23, 23, 24, 24, 25,
26, 25, 27, 27, 28, 27, 28, 29, 29, 29, 30, 30, 10, 31, 30, 31,
32, 32, 33, 34, 34, 33, 35, 34, 36, 35, 36, 37, 38, 37, 38, 39,
38, 39, 40, 40, 21, 42, 43, 42, 43, 44, 44, 45, 46, 45, 29, 46,
48, 29, 48, 49, 50, 49, 51, 50, 51, 52, 53, 52, 53, 29, 54, 53,
55, 54, 56, 55, 56, 57, 57, 58, 58, 59, 53, 59, 61, 61, 20, 19,
61, 26, 64, 64, 12, 67, 67, 67, 68, 68, 68, 67, 69, 68, 70, 69,
70, 71, 71, 72, 72, 72, 73, 74, 73, 75, 74, 75, 76, 77, 76, 77,
78, 9, 78, 79, 78, 80, 79, 80, 81, 80, 14, 81, 82, 83, 83, 84,
83, 85, 84, 85, 86, 86, 87, 88, 88, 88, 89, 89, 90, 91, 90, 92,
91, 92, 93, 94, 93, 95, 94, 96, 96, 96, 97, 98, 97, 98, 99, 100,
99, 100, 101, 101, 102, 103, 103, 104, 103, 105, 104, 105, 106,
107, 106, 108, 107, 108, 109, 9, 78, 110, 111, 111, 112, 72,
111, 113, 112, 113, 114, 2, 115, 114, 111, 116, 117, 116, 118,
117, 34, 119, 118, 119, 120, 120, 121, 122, 121, 123, 123, 123,
122, 123, 124, 124, 110, 115, 126, 127, 127, 127, 128, 129, 128,
129, 19, 130, 130, 131, 132, 131, 133, 132, 133, 134, 134, 135,
136, 135, 136, 82, 138, 12, 138, 139, 139, 127, 126, 123, 30,
142, 142, 139, 143, 143, 144, 143, 144, 145, 80, 145, 147, 147,
148, 148, 149, 150, 149, 151, 150, 152, 151, 152, 153, 153, 154,
155, 154, 155, 156, 156, 157, 142, 157, 9, 159, 10, 12, 11, 159,
23, 27, 61, 38, 96, 67, 102, 68, 83, 87, 95, 88, 68, 67, 109,
103, 142, 143, 123, 139), c(3, 1, 4, 2, 7, 5, 10, 8, 11, 9, 13,
12, 15, 14, 18, 16, 22, 17, 28, 25, 33, 26, 34, 32, 38, 35, 40,
37, 41, 39, 43, 42, 46, 44, 52, 0, 53, 51, 54, 49, 57, 55, 59,
58, 62, 60, 66, 63, 67, 65, 73, 24, 75, 72, 76, 74, 78, 77, 82,
79, 84, 80, 86, 83, 87, 85, 90, 88, 93, 89, 94, 92, 96, 95, 97,
50, 100, 98, 101, 99, 103, 102, 106, 104, 107, 68, 108, 105,
110, 69, 111, 109, 114, 112, 116, 113, 117, 115, 122, 70, 120,
118, 124, 121, 126, 123, 128, 125, 129, 127, 131, 130, 133, 132,
135, 119, 136, 134, 140, 47, 139, 48, 141, 137, 142, 61, 144,
143, 145, 29, 152, 147, 154, 150, 156, 153, 157, 155, 159, 158,
162, 160, 165, 163, 167, 164, 168, 166, 171, 169, 174, 20, 172,
170, 177, 175, 179, 176, 183, 36, 182, 180, 184, 181, 189, 187,
191, 188, 192, 190, 194, 193, 198, 197, 200, 199, 203, 201, 205,
202, 206, 204, 209, 207, 211, 208, 214, 213, 217, 215, 218, 216,
221, 219, 222, 220, 224, 223, 229, 226, 231, 228, 232, 230, 235,
233, 237, 234, 238, 236, 240, 19, 241, 173, 246, 161, 247, 244,
249, 245, 252, 6, 250, 248, 254, 251, 255, 243, 258, 256, 261,
81, 260, 257, 263, 259, 264, 262, 266, 265, 269, 267, 273, 268,
274, 270, 277, 242, 276, 275, 278, 253, 282, 281, 285, 283, 287,
45, 286, 284, 289, 288, 292, 290, 294, 291, 295, 293, 297, 296,
300, 298, 302, 185, 301, 299, 304, 31, 305, 303, 309, 279, 308,
280, 310, 272, 311, 71, 314, 307, 318, 316, 319, 317, 321, 178,
322, 320, 324, 323, 326, 325, 329, 327, 331, 328, 333, 330, 334,
332, 336, 335, 339, 337, 340, 338, 342, 341, 344, 312, 345, 343,
346, 21, 348, 23, 350, 27, 349, 30, 351, 347, 352, 56, 353, 64,
355, 91, 354, 138, 357, 148, 356, 212, 358, 225, 359, 149, 360,
186, 361, 195, 364, 151, 363, 196, 362, 210, 365, 146, 367, 227,
366, 239, 368, 313, 369, 315, 370, 271, 371, 306), c(3, 1, 52,
0, 4, 2, 7, 5, 252, 6, 10, 8, 11, 9, 13, 12, 15, 14, 18, 16,
22, 17, 174, 20, 240, 19, 346, 21, 28, 25, 73, 24, 348, 23, 33,
26, 350, 27, 145, 29, 304, 31, 349, 30, 34, 32, 38, 35, 183,
36, 40, 37, 41, 39, 43, 42, 46, 44, 140, 47, 287, 45, 54, 49,
139, 48, 53, 51, 97, 50, 57, 55, 352, 56, 59, 58, 62, 60, 142,
61, 66, 63, 353, 64, 67, 65, 107, 68, 110, 69, 122, 70, 75, 72,
311, 71, 76, 74, 78, 77, 82, 79, 84, 80, 261, 81, 86, 83, 87,
85, 90, 88, 93, 89, 355, 91, 94, 92, 96, 95, 100, 98, 101, 99,
103, 102, 106, 104, 108, 105, 111, 109, 114, 112, 116, 113, 117,
115, 120, 118, 124, 121, 135, 119, 126, 123, 128, 125, 129, 127,
131, 130, 133, 132, 136, 134, 141, 137, 354, 138, 144, 143, 152,
147, 357, 148, 365, 146, 154, 150, 359, 149, 364, 151, 156, 153,
157, 155, 159, 158, 162, 160, 246, 161, 165, 163, 167, 164, 168,
166, 171, 169, 172, 170, 177, 175, 241, 173, 179, 176, 182, 180,
321, 178, 184, 181, 302, 185, 189, 187, 360, 186, 191, 188, 192,
190, 194, 193, 361, 195, 198, 197, 363, 196, 200, 199, 203, 201,
205, 202, 206, 204, 209, 207, 211, 208, 362, 210, 214, 213, 356,
212, 217, 215, 218, 216, 221, 219, 222, 220, 224, 223, 358, 225,
229, 226, 367, 227, 231, 228, 232, 230, 235, 233, 237, 234, 238,
236, 366, 239, 277, 242, 247, 244, 255, 243, 249, 245, 250, 248,
254, 251, 278, 253, 258, 256, 260, 257, 263, 259, 264, 262, 266,
265, 269, 267, 273, 268, 274, 270, 310, 272, 370, 271, 276, 275,
309, 279, 282, 281, 308, 280, 285, 283, 286, 284, 289, 288, 292,
290, 294, 291, 295, 293, 297, 296, 300, 298, 301, 299, 305, 303,
314, 307, 371, 306, 344, 312, 368, 313, 318, 316, 369, 315, 319,
317, 322, 320, 324, 323, 326, 325, 329, 327, 331, 328, 333, 330,
334, 332, 336, 335, 339, 337, 340, 338, 342, 341, 345, 343, 351,
347), c(0, 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 18, 20, 20, 22,
24, 26, 28, 30, 32, 34, 34, 34, 38, 40, 42, 44, 46, 46, 48, 50,
52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82,
84, 88, 90, 92, 94, 96, 98, 102, 104, 106, 108, 110, 112, 114,
118, 118, 122, 124, 126, 128, 130, 130, 132, 134, 136, 138, 140,
142, 144, 146, 148, 150, 154, 156, 158, 162, 164, 164, 166, 168,
170, 172, 172, 174, 176, 178, 180, 182, 184, 186, 186, 188, 190,
192, 194, 196, 198, 198, 200, 202, 204, 206, 208, 210, 214, 214,
218, 220, 224, 226, 228, 230, 234, 236, 238, 240, 242, 244, 246,
250, 252, 252, 254, 256, 260, 262, 264, 266, 268, 270, 272, 276,
278, 280, 284, 286, 288, 290, 292, 294, 298, 298, 300, 302, 304,
306, 308, 310, 312, 314, 316, 318, 322, 326, 332, 336, 340, 346,
352, 358, 364, 368, 372), c(0, 4, 6, 10, 12, 14, 16, 18, 20,
22, 28, 34, 38, 44, 46, 50, 52, 54, 56, 58, 62, 66, 70, 70, 74,
76, 78, 80, 84, 86, 92, 96, 98, 100, 102, 106, 108, 110, 112,
116, 118, 120, 120, 122, 124, 126, 128, 130, 130, 132, 134, 136,
138, 140, 144, 146, 148, 150, 152, 154, 156, 156, 160, 160, 160,
162, 162, 162, 168, 174, 176, 178, 180, 184, 186, 188, 190, 192,
194, 198, 200, 204, 206, 208, 212, 214, 216, 218, 220, 224, 226,
228, 230, 232, 234, 236, 238, 242, 244, 246, 248, 250, 252, 254,
258, 260, 262, 264, 266, 268, 270, 272, 276, 278, 280, 282, 284,
286, 288, 290, 292, 294, 296, 298, 304, 306, 306, 308, 312, 314,
316, 318, 320, 322, 324, 326, 328, 330, 330, 332, 336, 336, 336,
340, 344, 346, 348, 348, 350, 352, 354, 356, 358, 360, 362, 364,
366, 368, 370, 370, 372, 372, 372, 372, 372, 372, 372, 372, 372,
372), list(c(1, 0, 1), structure(list(), .Names = character(0)),
structure(list(name = c("1", "2", "3", "4", "5", "6", "7",
"8", "9", "10", "11", "12", "13", "14", "15", "16", "17",
"18", "19", "20", "21", "22", "23", "24", "25", "26", "27",
"28", "29", "30", "31", "32", "33", "34", "35", "36", "37",
"38", "39", "40", "41", "42", "43", "44", "45", "46", "47",
"48", "49", "50", "51", "52", "53", "54", "55", "56", "57",
"58", "59", "60", "61", "62", "63", "64", "65", "66", "67",
"68", "69", "70", "71", "72", "73", "74", "75", "76", "77",
"78", "79", "80", "81", "82", "83", "84", "85", "86", "87",
"88", "89", "90", "91", "92", "93", "94", "95", "96", "97",
"98", "99", "100", "101", "102", "103", "104", "105", "106",
"107", "108", "109", "110", "111", "112", "113", "114", "115",
"116", "117", "118", "119", "120", "121", "122", "123", "124",
"125", "126", "127", "128", "129", "130", "131", "132", "133",
"134", "135", "136", "137", "138", "139", "140", "141", "142",
"143", "144", "145", "146", "147", "148", "149", "150", "151",
"152", "153", "154", "155", "156", "157", "158", "159", "160",
"161", "162", "163", "164", "165", "166", "167", "168", "169"
)), .Names = "name"), structure(list(DIST_KM_CNT = c(4.89,
1.45, 2.36, 1.45, 2.36, 1.18, 0, 1.18, 0.89, 1.47, 0.89,
1.47, 1.16, 1.16, 1.2, 1.2, 1.02, 0.79, 1.02, 0, 0, 1, 0.79,
0, 0.98, 1.03, 1.15, 0, 1.03, 1.35, 0.95, 0, 0.99, 1.15,
0.99, 1.53, 0, 1.22, 1.53, 1.37, 1.22, 1.37, 1.23, 1.23,
1.1, 0, 1.1, 1.38, 1.69, 3.49, 3.16, 1.38, 4.89, 1.38, 3.49,
1.51, 0, 1.51, 1.39, 1.39, 1.78, 0.947, 1.78, 1.17, 2.12,
3.26, 1.17, 3.26, 1.43, 0, 0, 15.58, 1.11, 0.98, 1.09, 1.11,
1.09, 1.43, 1.43, 1.15, 1.11, 0, 1.15, 1.13, 1.11, 1.96,
1.13, 1.96, 1.86, 2.48, 1.86, 0, 1.44, 2.48, 1.44, 2.38,
2.38, 3.16, 2.41, 1.691, 2.41, 1.691, 1.54, 1.54, 1.65, 4.14,
1.65, 1.43, 4.14, 0.572, 0, 0.572, 0.455, 0.558, 0.455, 0.54,
0.558, 0.54, 0.682, 0.638, 0.682, 0.42, 0, 0.624, 0.42, 0.47,
0.624, 0.895, 0.47, 0.895, 0.493, 0.493, 0.703, 0.703, 0.553,
0.638, 0.553, 4.52, 1.94, 1.69, 1.38, 4.52, 0.947, 2.647,
2.647, 1.35, 0, 1.66, 0, 0, 1.05, 0, 1.66, 1.31, 1.05, 1.54,
1.31, 1.54, 1.72, 1.72, 1.24, 0, 1.24, 0.94, 1.57, 0.94,
1.15, 1.57, 1.15, 0.77, 0.95, 0.77, 0.95, 0, 0, 1.38, 0.6,
1.38, 11.42, 0.6, 0.72, 2.64, 0.72, 0, 2.64, 0, 0.82, 0.708,
0.467, 0.708, 0.59, 0.467, 0.59, 0.828, 0.828, 1.047, 0.77,
0.517, 0.517, 0.897, 0.897, 0.727, 0.602, 0.727, 0.481, 0.602,
0.481, 0.726, 0.602, 0.726, 0.92, 0.602, 0.986, 0.44, 0.44,
0.513, 0.548, 0.513, 0.548, 0.721, 0.513, 0.721, 0.513, 0.564,
0.564, 0.937, 0.412, 0.576, 0.542, 0.412, 0.567, 0.542, 0.567,
0.497, 0.426, 0.497, 0.379, 0.426, 0.379, 0.987, 0, 0, 0.614,
1.321, 1.327, 0.912, 0, 1.327, 1.735, 0.912, 1.735, 1.577,
0, 1.188, 1.577, 1.321, 1.017, 1.057, 1.017, 1.239, 1.057,
0, 0.732, 1.239, 0.732, 0.877, 0.877, 1.548, 0.816, 1.548,
0.806, 0, 11.5, 0.816, 0.806, 0.689, 0.689, 0.614, 1.188,
1.357, 2.496, 1.028, 1.028, 1.432, 0.93, 1.432, 0.93, 0,
0.794, 0.794, 0.811, 1.395, 0.811, 1.323, 1.395, 1.323, 1.385,
1.385, 0.774, 1.53, 0.774, 1.53, 0, 0.841, 0, 0.841, 1.317,
7.75, 2.496, 1.357, 11.5, 15.58, 0.75, 0.905, 7.75, 1.317,
0.89, 0.593, 0.89, 0.593, 0.555, 11.42, 0.555, 1.18, 1.18,
0.87, 0.87, 2.63, 1.21, 2.63, 1.6, 1.21, 1.26, 1.6, 1.26,
1.09, 1.09, 1.12, 1.58, 1.12, 1.58, 1.42, 1.42, 0.54, 0.75,
0.54, 1, 1.03, 0, 0.95, 0, 1.03, 0, 2.12, 1.94, 0, 0.986,
0, 0.937, 0, 0.82, 1.047, 0.92, 0.77, 0, 0, 0.987, 0.576,
0.905, 1.317, 0, 1.317)), .Names = "DIST_KM_CNT")), <environment>), class = "igraph")
K Shortest Path logic
# find k shortest paths
k.shortest.paths <- function(graph, from, to, k){
# first shortest path
k0 <- get.shortest.paths(graph,from,to, output='both')
# number of currently found shortest paths
kk <- 1
# list of alternatives
variants <- list()
# shortest variants
shortest.variants <- list(list(g=graph, path=k0$epath, vert=k0$vpath, dist=shortest.paths(graph,from,to)))
# until k shortest paths are found
while(kk<k){
# take last found shortest path
last.variant <- shortest.variants[[length(shortest.variants)]]
# calculate all alternatives
variants <- calculate.variants(variants, last.variant, from, to)
# find shortest alternative
sp <- select.shortest.path(variants)
# add to list, increase kk, remove shortest path from list of alternatives
shortest.variants[[length(shortest.variants)+1]] <- list(g=variants[[sp]]$g, path=variants[[sp]]$variants$path, vert=variants[[sp]]$variants$vert, dist=variants[[sp]]$variants$dist)
kk <- kk+1
variants <- variants[-sp]
}
return(shortest.variants)
}
# found all alternative routes
calculate.variants <- function(variants, variant, from, to){
# take graph from current path
g <- variant$g
# iterate through edges, removing one each iterations
for (j in unlist(variant$path)){
newgraph <- delete.edges(g, j) # remove adge
sp <- get.shortest.paths(newgraph,from,to, output='both') # calculate shortest path
spd <- shortest.paths(newgraph,from,to) # calculate length
if (spd != Inf){ # the the path is found
if (!contains.path(variants, sp$vpath)) # add to list, unless it already contains the same path
{
variants[[length(variants)+1]] <- list(g=newgraph, variants=list(path=sp$epath, vert=sp$vpath, dist=spd))
}
}
}
return(variants)
}
# does a list contain this path?
contains.path <- function(variants, variant){
return( any( unlist( lapply( variants, function(x){ identical(x$variant$vert,variant) } ) ) ) )
}
# which path from the list is the shortest?
select.shortest.path <- function(variants){
return( which.min( unlist( lapply( variants, function(x){x$variants$dist} ) ) ) )
}
The results are below with Same Path and and the distance computed is also not correct.I am not sure about where i am making the mistake
library(igraph)
k.shortest.paths(my.graph, from = 37, to = 8, k = 2)
[[1]]
[[1]]$g
IGRAPH UN-- 169 372 --
+ attr: name (v/c), DIST_KM_CNT (e/n)
+ edges (vertex names):
[1] 1 --23 1 --2 2 --3 1 --2 2 --3 3 --4 3 --115 3 --4 4 --5
[10] 5 --6 4 --5 5 --6 6 --7 6 --7 7 --8 7 --8 8 --9 9 --10
[19] 8 --9 10--111 10--79 10--160 9 --10 11--160 11--31 11--12 12--14
[28] 12--161 11--12 13--67 13--161 13--139 14--15 12--14 14--15 15--16
[37] 15--82 16--17 15--16 17--18 16--17 17--18 18--19 18--19 19--20
[46] 20--131 19--20 20--63 21--63 21--24 22--43 22--23 1 --23 22--23
[55] 21--24 24--25 24--162 24--25 25--26 25--26 26--27 27--65 26--27
[64] 28--29 28--162 29--30 28--29 29--30 30--48 30--49 30--54 31--143
+ ... omitted several edges
[[1]]$path
[[1]]$path[[1]]
+ 11/372 edges (vertex names):
[1] 36--37 35--36 34--35 33--34 32--33 31--32 11--31 11--160 10--160
[10] 9 --10 8 --9
[[1]]$vert
[[1]]$vert[[1]]
+ 12/169 vertices, named:
[1] 37 36 35 34 33 32 31 11 160 10 9 8
[[1]]$dist
8
37 11
[[2]]
[[2]]$g
IGRAPH UN-- 169 371 --
+ attr: name (v/c), DIST_KM_CNT (e/n)
+ edges (vertex names):
[1] 1 --23 1 --2 2 --3 1 --2 2 --3 3 --4 3 --115 3 --4 4 --5
[10] 5 --6 4 --5 5 --6 6 --7 6 --7 7 --8 7 --8 8 --9 9 --10
[19] 8 --9 10--111 10--79 10--160 9 --10 11--160 11--31 11--12 12--14
[28] 12--161 11--12 13--67 13--161 13--139 14--15 12--14 14--15 15--16
[37] 15--82 16--17 15--16 17--18 16--17 17--18 18--19 18--19 19--20
[46] 20--131 19--20 20--63 21--63 21--24 22--43 22--23 1 --23 22--23
[55] 21--24 24--25 24--162 24--25 25--26 25--26 26--27 27--65 26--27
[64] 28--29 28--162 29--30 28--29 29--30 30--48 30--49 30--54 31--143
+ ... omitted several edges
[[2]]$path
[[2]]$path[[1]]
+ 11/371 edges (vertex names):
[1] 36--37 35--36 34--35 33--34 32--33 31--32 11--31 11--160 10--160
[10] 9 --10 8 --9
[[2]]$vert
[[2]]$vert[[1]]
+ 12/169 vertices, named:
[1] 37 36 35 34 33 32 31 11 160 10 9 8
[[2]]$dist
8
37 11
I know this is like 2 years late but hopefully this will be useful for other people who needs an implementation of yen's algorithm in R.
library(igraph)
library(tidyverse)
#'#return the shortest path as a list of vertices or NULL if there is no path between src and dest
shortest_path <- function(graph, src, dest){
path <- suppressWarnings(get.shortest.paths(graph, src, dest))
path <- names(path$vpath[[1]])
if (length(path)==1) NULL else path
}
#'#return the sum of the weights of all the edges in the given path
path_weight <- function(path, graph) sum(E(graph, path=path)$weight)
#'#description sorts a list of paths based on the weight of the path
sort_paths <- function(graph, paths) paths[paths %>% sapply(path_weight, graph) %>% order]
#'#description creates a list of edges that should be deleted
find_edges_to_delete <- function(A,i,rootPath){
edgesToDelete <- NULL
for (p in A){
rootPath_p <- p[1:i]
if (all(rootPath_p == rootPath)){
edge <- paste(p[i], ifelse(is.na(p[i+1]),p[i],p[i+1]), sep = '|')
edgesToDelete[length(edgesToDelete)+1] <- edge
}
}
unique(edgesToDelete)
}
#returns the k shortest path from src to dest
#sometimes it will return less than k shortest paths. This occurs when the max possible number of paths are less than k
k_shortest_yen <- function(graph, src, dest, k){
if (src == dest) stop('src and dest can not be the same (currently)')
#accepted paths
A <- list(shortest_path(graph, src, dest))
if (k == 1) return (A)
#potential paths
B <- list()
for (k_i in 2:k){
prev_path <- A[[k_i-1]]
num_nodes_to_loop <- length(prev_path)-1
for(i in 1:num_nodes_to_loop){
spurNode <- prev_path[i]
rootPath <- prev_path[1:i]
edgesToDelete <- find_edges_to_delete(A, i,rootPath)
t_g <- delete.edges(graph, edgesToDelete)
#for (edge in edgesToDelete) t_g <- delete.edges(t_g, edge)
spurPath <- shortest_path(t_g,spurNode, dest)
if (!is.null(spurPath)){
total_path <- list(c(rootPath[-i], spurPath))
if (!total_path %in% B) B[length(B)+1] <- total_path
}
}
if (length(B) == 0) break
B <- sort_paths(graph, B)
A[k_i] <- B[1]
B <- B[-1]
}
A
}
#===================Test==========================#
edgeList <- tibble(from=character(), to=character(), weight = numeric())
edgeList[nrow(edgeList)+1,] <-list('c','d',3)
edgeList[nrow(edgeList)+1,] <-list('d','f',4)
edgeList[nrow(edgeList)+1,] <-list('f','h',1)
edgeList[nrow(edgeList)+1,] <-list('c','e',2)
edgeList[nrow(edgeList)+1,] <-list('e','d',1)
edgeList[nrow(edgeList)+1,] <-list('e','f',2)
edgeList[nrow(edgeList)+1,] <-list('e','g',3)
edgeList[nrow(edgeList)+1,] <-list('g','h',2)
edgeList[nrow(edgeList)+1,] <-list('f','g',2)
graph <- graph.data.frame(edgeList)
#k_shortest.yen(graph, 'c','c',7) #expect error
#expect all 7 paths
k_shortest_yen(graph,'c','h',7)
I had the same problem and then i noticed that there are a error in the code. The function identical in function contains.path were not returning the correct value. I simply changed the code of identical(x$variant$vert,variant) to identical(unlist(x$variant$vert),unlist(variant)). And now the code is reporting all routings and no duplicates are present.