Plotting a Dataframe in R - r

I have a dataframe of the form
Region Name 3-15 4-15 5-15 ... 3-16
Name1 30 82 56 ... 32
Name2 65 23 38 ... 11
... ... ... ... ... ...
Name18 87 33 11 ... 51
The first column holds the names of the regions, and the other columns hold the events recorded over time (one column per month).
I'd like to plot the recorded monthly values over time for each name: specifically, a separate line for each named region, each in a different colour. Any advice would be appreciated; many of the plotting functions for data frames seem to expect frames in a different format.
dput() data:
dataframe <- structure(list("LSOA Name" = c("Lancaster 001", "Lancaster 002",
"Lancaster 003", "Lancaster 004", "Lancaster 005", "Lancaster 006",
"Lancaster 008", "Lancaster 009", "Lancaster 010", "Lancaster 011",
"Lancaster 013", "Lancaster 014", "Lancaster 015", "Lancaster 016",
"Lancaster 017", "Lancaster 018", "Lancaster 019", "Lancaster 020"
), "3-15" = c(49L, 16L, 17L, 28L, 21L, 197L, 57L, 143L, 78L,
121L, 67L, 223L, 41L, 86L, 66L, 27L, 40L, 77L), "4-15" = c(63L,
11L, 26L, 29L, 19L, 203L, 69L, 154L, 82L, 125L, 62L, 198L, 44L,
99L, 64L, 26L, 42L, 99L), "5-15" = c(67L, 10L, 20L, 30L, 10L,
194L, 62L, 186L, 61L, 110L, 75L, 273L, 29L, 126L, 92L, 34L, 41L,
88L), "6-15" = c(58L, 8L, 18L, 36L, 29L, 198L, 62L, 167L, 83L,
110L, 59L, 254L, 26L, 99L, 73L, 17L, 30L, 109L), "7-15" = c(53L,
29L, 27L, 23L, 38L, 188L, 56L, 149L, 90L, 129L, 37L, 226L, 32L,
119L, 57L, 14L, 30L, 96L), "8-15" = c(44L, 9L, 25L, 28L, 29L,
237L, 69L, 171L, 78L, 108L, 45L, 261L, 22L, 103L, 68L, 33L, 35L,
108L), "9-15" = c(59L, 12L, 18L, 35L, 19L, 230L, 45L, 128L, 74L,
144L, 56L, 223L, 26L, 90L, 51L, 27L, 23L, 120L), "10-15" = c(45L,
26L, 31L, 23L, 25L, 195L, 53L, 155L, 74L, 120L, 58L, 276L, 38L,
92L, 72L, 25L, 40L, 123L), "11-15" = c(31L, 11L, 33L, 15L, 19L,
188L, 52L, 127L, 66L, 102L, 50L, 241L, 26L, 74L, 72L, 26L, 35L,
68L), "12-15" = c(34L, 22L, 21L, 22L, 17L, 205L, 80L, 150L, 73L,
109L, 50L, 228L, 29L, 57L, 59L, 14L, 45L, 93L), "1-16" = c(20L,
9L, 25L, 21L, 11L, 199L, 46L, 124L, 65L, 117L, 40L, 224L, 28L,
88L, 43L, 22L, 18L, 94L), "2-16" = c(54L, 11L, 29L, 20L, 11L,
164L, 44L, 117L, 70L, 85L, 46L, 192L, 23L, 89L, 50L, 27L, 29L,
86L), "3-16" = c(53L, 11L, 24L, 26L, 19L, 203L, 45L, 144L, 66L,
109L, 47L, 213L, 15L, 120L, 59L, 15L, 33L, 127L)), .Names = c("LSOA Name",
"3-15", "4-15", "5-15", "6-15", "7-15", "8-15", "9-15", "10-15",
"11-15", "12-15", "1-16", "2-16", "3-16"), row.names = c(NA,
-18L), class = "data.frame")

A typical way of plotting lines by groups in ggplot is to shift the data to long format, where one column identifies the group, and the other columns identify the x and y axis values.
This example shifts your data into long format with three columns: LSOAName, month_col, and values_col. It adds a day value onto the month-year, and converts that column to a date. Then it plots a line for each group.
I've renamed your dataframe d, because dataframe could be easily misinterpreted as the function data.frame().
# load libraries
library(magrittr)
library(dplyr)
library(tidyr)
library(ggplot2)
# rename dataframe so it doesn't look so much like the base function
d <- dataframe
# remove spaces in column names
names(d) <- gsub(" ", "", names(d))
# shift data from wide to long and then
# add a day value and convert day-month-year to date class
d %<>% gather(month_col, values_col, -LSOAName) %>%
  mutate(month_col = as.Date(paste0("1-", month_col), "%d-%m-%y"))
# plot using ggplot2
ggplot(d, aes(x = month_col, y = values_col, colour = LSOAName)) +
  geom_line()
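For readers who want to see what the reshape and date conversion do without loading any packages, here is a minimal base-R sketch on two rows and two month columns taken from the posted data. The names wide, value_cols and long are illustrative and not part of the answer above; the column names LSOAName, month_col and values_col mirror the ones used there.

```r
# Two regions and two months copied from the posted data (illustrative subset)
wide <- data.frame(LSOAName = c("Lancaster 001", "Lancaster 002"),
                   `3-15` = c(49L, 16L),
                   `4-15` = c(63L, 11L),
                   check.names = FALSE)

# Every column except the name column holds monthly values
value_cols <- setdiff(names(wide), "LSOAName")

# Long format: one row per region-month combination
long <- data.frame(
  LSOAName   = rep(wide$LSOAName, times = length(value_cols)),
  month_col  = rep(value_cols, each = nrow(wide)),
  values_col = unlist(wide[value_cols], use.names = FALSE)
)

# Prefix a day so that "3-15" becomes "1-3-15", then parse as day-month-year
long$month_col <- as.Date(paste0("1-", long$month_col), format = "%d-%m-%y")

print(long)
```

Each region now occupies several rows, one per month, which is the shape that the colour aesthetic in ggplot() groups on.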
Edit
%<>% is found in the magrittr package. It is a compound pipe assignment operator. While %>% returns the result of a pipeline, %<>% assigns the result back to the left side object.
Instead of writing
d <- d %>% [pipeline]
you can assign the results to d by writing
d %<>% [pipeline]
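A small sketch of that equivalence on a throwaway numeric vector (the names x and y are illustrative only), assuming magrittr is installed:

```r
library(magrittr)

x <- c(4, 9, 16)
y <- x

# Ordinary pipe: the result has to be assigned back explicitly
x <- x %>% sqrt()

# Compound assignment pipe: the result overwrites y in place
y %<>% sqrt()

# Both routes should leave the same values behind
identical(x, y)
```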

Related

Convert pixel values stored in text file to image

I've been trying to find a way to convert text files of pixel values into images (in any format) in R, but I couldn't find one.
I did find solutions for MATLAB and Python, for example.
I have a file with 520 x 640 pixels with values from 0 to 255.
This is a small piece of it.
mid1al <- read.table("C:/Users/u015/Mid1_R_Al.txt", header = FALSE, sep = ";")
mid1al <- mid1al[1:20,1:20]
dput(mid1al)
structure(list(V1 = c(84L, 79L, 97L, 67L, 98L, 113L, 77L, 46L,
41L, 37L, 42L, 46L, 23L, 28L, 24L, 34L, 45L, 51L, 24L, 24L),
V2 = c(118L, 107L, 105L, 82L, 87L, 108L, 100L, 40L, 71L,
74L, 81L, 55L, 41L, 25L, 22L, 58L, 53L, 38L, 26L, 36L), V3 = c(103L,
116L, 128L, 82L, 77L, 104L, 97L, 50L, 65L, 78L, 98L, 111L,
86L, 59L, 35L, 51L, 43L, 46L, 33L, 47L), V4 = c(114L, 91L,
90L, 96L, 103L, 98L, 86L, 36L, 50L, 65L, 98L, 125L, 86L,
32L, 24L, 36L, 36L, 44L, 34L, 43L), V5 = c(68L, 70L, 85L,
85L, 100L, 111L, 61L, 12L, 42L, 70L, 103L, 103L, 45L, 27L,
18L, 27L, 32L, 43L, 51L, 41L), V6 = c(43L, 87L, 85L, 89L,
130L, 123L, 78L, 43L, 15L, 39L, 62L, 44L, 27L, 14L, 19L,
61L, 83L, 90L, 88L, 88L), V7 = c(20L, 72L, 116L, 124L, 133L,
133L, 103L, 56L, 21L, 9L, 19L, 26L, 18L, 32L, 67L, 92L, 100L,
105L, 94L, 79L), V8 = c(69L, 96L, 120L, 144L, 142L, 101L,
96L, 46L, 14L, 4L, 8L, 2L, 24L, 73L, 96L, 106L, 103L, 116L,
109L, 74L), V9 = c(118L, 122L, 134L, 135L, 133L, 98L, 57L,
20L, 5L, 5L, 2L, 14L, 51L, 89L, 117L, 95L, 103L, 93L, 104L,
77L), V10 = c(122L, 107L, 127L, 147L, 128L, 88L, 24L, 11L,
10L, 4L, 10L, 31L, 74L, 104L, 113L, 107L, 109L, 99L, 103L,
45L), V11 = c(105L, 120L, 114L, 132L, 125L, 112L, 51L, 6L,
3L, 9L, 18L, 49L, 82L, 111L, 111L, 96L, 92L, 81L, 75L, 18L
), V12 = c(98L, 104L, 103L, 126L, 147L, 128L, 61L, 26L, 2L,
9L, 18L, 50L, 105L, 103L, 101L, 98L, 74L, 53L, 18L, 1L),
V13 = c(107L, 91L, 108L, 109L, 138L, 114L, 88L, 33L, 2L,
4L, 9L, 61L, 71L, 77L, 78L, 83L, 43L, 38L, 8L, 5L), V14 = c(53L,
60L, 43L, 49L, 104L, 128L, 72L, 44L, 6L, 8L, 10L, 24L, 35L,
27L, 33L, 37L, 31L, 24L, 10L, 5L), V15 = c(13L, 16L, 11L,
27L, 62L, 78L, 73L, 30L, 8L, 7L, 31L, 66L, 66L, 33L, 13L,
27L, 16L, 18L, 12L, 7L), V16 = c(11L, 12L, 7L, 3L, 16L, 35L,
45L, 13L, 5L, 7L, 22L, 74L, 73L, 31L, 16L, 43L, 35L, 14L,
15L, 8L), V17 = c(15L, 16L, 7L, 8L, 1L, 5L, 15L, 13L, 31L,
33L, 22L, 34L, 38L, 17L, 18L, 41L, 39L, 26L, 19L, 12L), V18 = c(9L,
15L, 7L, 2L, 2L, 5L, 5L, 25L, 50L, 55L, 35L, 25L, 14L, 8L,
18L, 44L, 36L, 36L, 19L, 0L), V19 = c(15L, 16L, 4L, 6L, 4L,
6L, 22L, 45L, 59L, 48L, 56L, 58L, 52L, 30L, 22L, 46L, 41L,
50L, 23L, 7L), V20 = c(20L, 7L, 4L, 2L, 6L, 14L, 40L, 55L,
74L, 60L, 69L, 74L, 60L, 56L, 38L, 45L, 67L, 39L, 25L, 11L
)), row.names = c(NA, 20L), class = "data.frame")
Is there a way to create this image in RStudio?
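One possible approach in base R, sketched here on the first 3 x 3 values of the posted sample rather than the full file: convert the data frame to a matrix and build a raster object with as.raster(), telling it that the largest possible value is 255. The names px, m and img are illustrative, and this is only a sketch under the assumption that the values are greyscale intensities.

```r
# First 3 x 3 values of the posted sample (columns V1 to V3)
px <- data.frame(V1 = c(84L, 79L, 97L),
                 V2 = c(118L, 107L, 105L),
                 V3 = c(103L, 116L, 128L))

# A raster is built from a matrix, not a data frame
m <- as.matrix(px)

# max = 255 maps 0 to black and 255 to white
img <- as.raster(m, max = 255)

# Draw the image when running in an interactive session
if (interactive()) plot(img)
```

For the full 520 x 640 file, the same two conversion calls should apply to the data frame returned by read.table(); wrapping plot(img) between png() and dev.off() is one way to write the result to disk.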

Using case_when to fill out a string

I am trying to use case_when in order to pad out a string in R, dependent on the string length.
I take the following 3 examples with lengths 11, 12 and 13:
V1 V2
74300000330 00074300000330
811693200042 08011693200042
8829999820128 88029999820128
V1 is the column I am trying to match with V2
The first row in V1 has 11 digits. If a row has 11 digits, then 3 zeros should be added at the beginning of the number.
I have tried the following code without any luck (I have also tried it with paste0()):
df %>%
mutate(col3 = case_when(length(col1) == 11 ~ str_pad(14, width = 3, pad = "0")))
The second row has 12 digits, so I should add one zero at the beginning of the number and another zero between the first digit (counting from the left) and the 11th digit (counting from the right), so row 2 would go from 81169... to 0801169....
The third row has 13 digits, so I should insert a zero between the 2nd digit (counting from the left) and the 11th digit (counting from the right). The beginning of the sequence therefore goes from 88299 to 880299.
The total number of digits in the sequence should be exactly 14.
Data:
df <- structure(list(col1 = structure(c(1L, 1L, 1L, 2L, 2L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L,
4L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 7L, 7L, 8L, 8L, 8L, 8L, 8L,
8L, 8L, 8L, 8L, 8L, 9L, 9L, 10L, 10L, 10L, 10L, 10L, 10L, 10L,
10L, 11L, 12L, 12L, 13L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L,
20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 21L,
21L, 21L, 22L, 22L, 22L, 22L, 22L, 23L, 23L, 23L, 23L, 23L, 23L,
23L, 23L, 23L, 24L, 24L, 24L, 24L, 25L, 26L, 27L, 27L, 27L, 27L,
27L, 27L, 27L, 27L, 27L, 27L, 28L, 28L, 28L, 29L, 30L, 30L, 30L,
31L, 32L, 33L, 33L, 33L, 33L, 33L, 34L, 34L, 34L, 34L, 35L, 36L,
36L, 36L, 36L, 36L, 36L, 36L, 36L, 36L, 37L, 38L, 38L, 38L, 38L,
38L, 39L, 39L, 39L, 39L, 40L, 41L, 41L, 41L, 42L, 42L, 43L, 44L,
45L, 45L, 45L, 45L, 45L, 46L, 46L, 47L, 47L, 47L, 47L, 47L, 47L,
47L, 47L, 47L, 47L, 47L, 47L, 47L, 47L, 47L, 47L, 47L, 47L, 47L,
47L, 47L, 47L, 47L, 47L, 47L, 47L, 47L, 47L, 47L, 47L, 47L, 47L,
47L, 48L, 49L, 49L, 49L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L,
50L, 50L, 50L, 50L, 50L, 50L, 51L, 51L, 51L, 51L, 51L, 51L, 51L,
51L, 51L, 51L, 51L, 52L, 52L, 53L, 53L, 53L, 53L, 54L, 55L, 56L,
56L, 56L, 56L, 56L, 56L, 56L, 56L, 57L, 58L, 59L, 59L, 60L, 60L,
60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 61L, 61L, 61L, 61L,
61L, 62L, 62L, 63L, 64L, 65L, 66L, 66L, 66L, 66L, 66L, 66L, 66L,
66L, 66L, 66L, 66L, 66L, 66L, 66L, 66L, 66L, 66L, 67L, 67L, 68L,
68L, 69L, 69L, 69L, 70L, 70L, 70L, 70L, 70L, 70L, 71L, 71L, 71L,
71L, 71L, 71L, 71L, 71L, 71L, 71L, 71L, 71L, 71L, 71L, 71L, 71L,
71L, 71L, 71L, 71L, 71L, 71L, 71L, 71L, 71L, 71L, 71L, 72L, 72L,
72L, 72L, 72L, 72L, 72L, 72L, 72L, 72L, 72L, 72L, 72L, 72L, 72L,
73L, 73L, 73L, 73L, 73L, 73L, 73L, 73L, 73L, 73L, 73L, 73L, 74L,
74L, 74L, 74L, 74L, 75L, 75L, 75L, 76L, 77L, 77L, 78L, 79L, 80L,
81L, 82L, 83L, 83L, 83L, 83L, 83L, 83L, 83L, 83L, 83L, 84L, 84L,
84L, 85L, 86L, 86L, 87L, 87L, 87L, 87L, 88L, 89L, 90L, 91L, 92L,
93L, 93L, 93L, 94L, 94L, 95L, 95L, 95L, 95L, 95L, 96L, 97L, 97L,
97L, 98L, 99L, 100L, 100L, 100L, 100L, 101L, 102L, 102L, 103L,
104L, 105L, 105L, 105L, 105L, 105L, 105L, 105L, 105L, 105L, 106L,
107L, 107L, 108L, 109L, 109L, 109L, 109L, 109L, 109L, 109L, 110L,
110L, 110L, 110L, 110L, 110L, 110L, 110L, 110L, 111L, 111L, 111L,
111L, 112L, 112L, 112L, 112L, 112L, 112L, 112L, 113L, 113L, 113L,
113L, 113L, 113L, 114L, 114L, 114L, 114L, 114L, 114L, 114L, 114L,
115L, 116L, 116L, 117L, 117L, 117L, 118L, 118L, 118L, 118L, 118L,
118L, 118L, 118L, 118L, 118L, 119L, 119L, 119L, 119L, 119L, 119L,
119L, 119L, 119L, 120L, 120L, 120L, 121L, 122L, 122L, 122L, 122L,
122L, 122L, 122L, 123L, 123L, 123L, 123L, 123L, 123L, 123L), .Label = c("11114110010",
"11114110022", "11114110029", "11114110036", "11114110210", "11114110230",
"11114110261", "11114110271", "11114110281", "11114110291", "11114110316",
"11114110526", "11780900029", "11780900050", "11780900660", "11780900661",
"12451500878", "12451567602", "12550000033", "12550000365", "12550000366",
"12550000367", "12550000371", "12550000376", "12550000377", "12550000384",
"12550000388", "12550000392", "12550000393", "12550000397", "12550000401",
"12550000402", "12550000538", "12550006763", "12550006764", "12550020040",
"12550020042", "12550020043", "12550020044", "12550020188", "12550020204",
"12550020212", "12550090015", "12800046631", "12800063141", "12800070612",
"14300002922", "14300002923", "14300002924", "14300002925", "14300002934",
"14300002940", "14300002941", "14300002942", "14300003300", "14300004091",
"14300004296", "14300004299", "14300004301", "14300004648", "14300004650",
"14300004651", "14300070522", "15543760143", "15543760145", "15543760186",
"15543760235", "15543760253", "17089302817", "17103800044", "17103800047",
"17103800048", "17103800053", "17103800056", "17103800058", "17103800059",
"17103801173", "17103801175", "17232305018", "17447100091", "17510100575",
"17510100576", "17510121064", "17510121065", "17510181458", "17732447059",
"17762300048", "17762300060", "18903644280", "19955508003", "19955508050",
"19955508060", "19955508061", "19955508531", "19955508534", "19955508758",
"19955508792", "19955508800", "19955508801", "19955508832", "19955508992",
"19955509803", "19955538570", "19955538696", "19955538725", "19955538792",
"21291912261", "21780900078", "22550081121", "22550081122", "22800025406",
"22800030050", "24300070590", "25543760142", "25543760521", "29955539550",
"31291912240", "39955508520", "41114110525", "57103800060", "74300000330",
"8,11693E+11", "8,83E+12"), class = "factor"), col2 = structure(c(1L,
1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 4L, 4L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 7L,
7L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 9L, 9L, 10L, 10L,
10L, 10L, 10L, 10L, 10L, 10L, 11L, 12L, 12L, 13L, 13L, 14L, 15L,
16L, 17L, 18L, 19L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L,
20L, 20L, 20L, 20L, 21L, 21L, 21L, 22L, 22L, 22L, 22L, 22L, 23L,
23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 24L, 24L, 24L, 24L, 25L,
26L, 27L, 27L, 27L, 27L, 27L, 27L, 27L, 27L, 27L, 27L, 28L, 28L,
28L, 29L, 30L, 30L, 30L, 31L, 32L, 33L, 33L, 33L, 33L, 33L, 34L,
34L, 34L, 34L, 35L, 36L, 36L, 36L, 36L, 36L, 36L, 36L, 36L, 36L,
37L, 38L, 38L, 38L, 38L, 38L, 39L, 39L, 39L, 39L, 40L, 41L, 41L,
41L, 42L, 42L, 43L, 44L, 45L, 45L, 45L, 45L, 45L, 46L, 46L, 47L,
47L, 47L, 47L, 47L, 47L, 47L, 47L, 47L, 47L, 47L, 47L, 47L, 47L,
47L, 47L, 47L, 47L, 47L, 47L, 47L, 47L, 47L, 47L, 47L, 47L, 47L,
47L, 47L, 47L, 47L, 47L, 47L, 48L, 49L, 49L, 49L, 50L, 50L, 50L,
50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L, 51L, 51L,
51L, 51L, 51L, 51L, 51L, 51L, 51L, 51L, 51L, 52L, 52L, 53L, 53L,
53L, 53L, 54L, 55L, 56L, 56L, 56L, 56L, 56L, 56L, 56L, 56L, 57L,
58L, 59L, 59L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L, 60L,
60L, 61L, 61L, 61L, 61L, 61L, 62L, 62L, 63L, 64L, 65L, 66L, 66L,
66L, 66L, 66L, 66L, 66L, 66L, 66L, 66L, 66L, 66L, 66L, 66L, 66L,
66L, 66L, 67L, 67L, 68L, 68L, 69L, 69L, 69L, 70L, 70L, 70L, 70L,
70L, 70L, 71L, 71L, 71L, 71L, 71L, 71L, 71L, 71L, 71L, 71L, 71L,
71L, 71L, 71L, 71L, 71L, 71L, 71L, 71L, 71L, 71L, 71L, 71L, 71L,
71L, 71L, 71L, 72L, 72L, 72L, 72L, 72L, 72L, 72L, 72L, 72L, 72L,
72L, 72L, 72L, 72L, 72L, 73L, 73L, 73L, 73L, 73L, 73L, 73L, 73L,
73L, 73L, 73L, 73L, 74L, 74L, 74L, 74L, 74L, 75L, 75L, 75L, 76L,
77L, 77L, 78L, 79L, 80L, 81L, 82L, 83L, 83L, 83L, 83L, 83L, 83L,
83L, 83L, 83L, 84L, 84L, 84L, 85L, 86L, 86L, 87L, 87L, 87L, 87L,
88L, 89L, 90L, 91L, 92L, 93L, 93L, 93L, 94L, 94L, 95L, 95L, 95L,
95L, 95L, 96L, 97L, 97L, 97L, 98L, 99L, 100L, 100L, 100L, 100L,
101L, 102L, 102L, 103L, 104L, 105L, 105L, 105L, 105L, 105L, 105L,
105L, 105L, 105L, 106L, 107L, 107L, 108L, 109L, 109L, 109L, 109L,
109L, 109L, 109L, 110L, 110L, 110L, 110L, 110L, 110L, 110L, 110L,
110L, 111L, 111L, 111L, 111L, 112L, 112L, 112L, 112L, 112L, 112L,
112L, 113L, 113L, 113L, 113L, 113L, 113L, 114L, 114L, 114L, 114L,
114L, 114L, 114L, 114L, 115L, 116L, 116L, 117L, 117L, 117L, 118L,
118L, 118L, 118L, 118L, 118L, 118L, 118L, 118L, 118L, 119L, 119L,
119L, 119L, 119L, 119L, 119L, 119L, 119L, 120L, 120L, 120L, 121L,
123L, 122L, 123L, 123L, 123L, 123L, 123L, 127L, 124L, 126L, 126L,
127L, 127L, 125L), .Label = c("00011114110010", "00011114110022",
"00011114110029", "00011114110036", "00011114110210", "00011114110230",
"00011114110261", "00011114110271", "00011114110281", "00011114110291",
"00011114110316", "00011114110526", "00011780900029", "00011780900050",
"00011780900660", "00011780900661", "00012451500878", "00012451567602",
"00012550000033", "00012550000365", "00012550000366", "00012550000367",
"00012550000371", "00012550000376", "00012550000377", "00012550000384",
"00012550000388", "00012550000392", "00012550000393", "00012550000397",
"00012550000401", "00012550000402", "00012550000538", "00012550006763",
"00012550006764", "00012550020040", "00012550020042", "00012550020043",
"00012550020044", "00012550020188", "00012550020204", "00012550020212",
"00012550090015", "00012800046631", "00012800063141", "00012800070612",
"00014300002922", "00014300002923", "00014300002924", "00014300002925",
"00014300002934", "00014300002940", "00014300002941", "00014300002942",
"00014300003300", "00014300004091", "00014300004296", "00014300004299",
"00014300004301", "00014300004648", "00014300004650", "00014300004651",
"00014300070522", "00015543760143", "00015543760145", "00015543760186",
"00015543760235", "00015543760253", "00017089302817", "00017103800044",
"00017103800047", "00017103800048", "00017103800053", "00017103800056",
"00017103800058", "00017103800059", "00017103801173", "00017103801175",
"00017232305018", "00017447100091", "00017510100575", "00017510100576",
"00017510121064", "00017510121065", "00017510181458", "00017732447059",
"00017762300048", "00017762300060", "00018903644280", "00019955508003",
"00019955508050", "00019955508060", "00019955508061", "00019955508531",
"00019955508534", "00019955508758", "00019955508792", "00019955508800",
"00019955508801", "00019955508832", "00019955508992", "00019955509803",
"00019955538570", "00019955538696", "00019955538725", "00019955538792",
"00021291912261", "00021780900078", "00022550081121", "00022550081122",
"00022800025406", "00022800030050", "00024300070590", "00025543760142",
"00025543760521", "00029955539550", "00031291912240", "00039955508520",
"00041114110525", "00057103800060", "00074300000330", "08011693200041",
"08011693200042", "88029999819907", "88029999820074", "88029999820083",
"88029999820128"), class = "factor")), row.names = c(NA, -513L
), class = "data.frame")
A few issues here. Your columns appear to be factors, which can create confusing problems when you apply string functions to them. You want them to be character, not factor. The correct way to check the length of a string is with nchar (spoiler alert: does not work with factor data!).
Your rules for padding seem a little arbitrary, but the following should work. For padding "within" the digit string, gsub and regular expressions work wonders.
library(dplyr)    # mutate_at(), mutate(), case_when(), %>%
library(stringr)  # str_pad()
df2 <- mutate_at(df, vars(col1, col2), as.character) %>%
  mutate(col3 = case_when(
    nchar(col1) == 11 ~ str_pad(col1, width = 14, pad = '0'),
    nchar(col1) == 12 ~ gsub('(\\d)(\\d+)', '0\\10\\2', col1),
    nchar(col1) == 13 ~ gsub('(\\d\\d)(\\d+)', '\\10\\2', col1),
    TRUE ~ col1
  ))
col1 col3
<chr> <chr>
1 74300000330 00074300000330
2 811693200042 08011693200042
3 8829999820128 88029999820128
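To make the factor and length points concrete, here is a minimal base-R sketch on the three example values from the question. The helper pad_id() is hypothetical and only restates the three padding rules without dplyr or stringr; it is not part of the answer above.

```r
# The three example values from the question, stored as a factor
ids_factor <- factor(c("74300000330", "811693200042", "8829999820128"))

# length() counts elements, not characters
length(ids_factor)

# nchar() rejects factors, so capture the failure instead of stopping
nchar_on_factor <- tryCatch(nchar(ids_factor), error = function(e) "error")

# After converting to character, nchar() gives the digit counts
ids <- as.character(ids_factor)
nchar(ids)

# Hypothetical helper applying the three padding rules with base R only
pad_id <- function(x) {
  n <- nchar(x)
  out <- x
  # 11 digits: three leading zeros
  out[n == 11] <- paste0("000", x[n == 11])
  # 12 digits: a leading zero plus a zero after the first digit
  out[n == 12] <- sub("^([0-9])([0-9]+)$", "0\\10\\2", x[n == 12])
  # 13 digits: a zero after the first two digits
  out[n == 13] <- sub("^([0-9]{2})([0-9]+)$", "\\10\\2", x[n == 13])
  out
}

padded <- pad_id(ids)
print(padded)
```

In the replacement strings, \\1 is followed by a literal 0: R only recognises backreferences \\1 to \\9, so \\10 is read as group 1 and then the character 0.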

How do I store the output of a repeat loop in a dataframe

My basic idea is to compute the column-wise means of chunks of a large matrix and store these means as rows of a data frame. Note that the chunks have different sizes (numbers of rows), which are stored in a vector vec1. Below is my code:
df <- setNames(data.frame(matrix(nrow = 4000, ncol = 3)),
               c("Age","Weight", "height"))
#
i <- 1
j <- vec1[1] - 1
k <- 0
repeat {
  elements <- as.vector(apply(mydata[i : (j + 1), 3:5], 2, mean))
  df <- rbind(df, elements)
  k <- k + 1
  i = i + vec1[k]
  j = j + vec1[k + 1]
  if (j + 1 >= l){
    break
  }
}
N.B.: When I perform the computations manually without looping it works. But the result of the loop yields a 4000 * 3 matrix filled with NA apart from the first row.
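One thing worth noting about the loop above is that df already contains 4000 rows of NA before the loop starts, and rbind() appends each new row after them rather than filling them in. A minimal sketch of an alternative, assuming the chunks are consecutive blocks of rows whose sizes are given by vec1: derive the start and end row of each chunk with cumsum() and assign into row k of a preallocated matrix. The toy mydata and the three-element vec1 below are made up for illustration and stand in for the real objects.

```r
# Toy stand-ins for the real objects (hypothetical values)
mydata <- data.frame(id     = 1:6,
                     grp    = "a",
                     Age    = c(10, 20, 30, 40, 50, 60),
                     Weight = c(1, 2, 3, 4, 5, 6),
                     height = c(2, 4, 6, 8, 10, 12))
vec1 <- c(2L, 1L, 3L)   # chunk sizes; they sum to nrow(mydata)

# Last and first row of every chunk
ends   <- cumsum(vec1)
starts <- ends - vec1 + 1

# Preallocate one row per chunk, then fill row k instead of appending
out <- matrix(NA_real_, nrow = length(vec1), ncol = 3,
              dimnames = list(NULL, c("Age", "Weight", "height")))
for (k in seq_along(vec1)) {
  out[k, ] <- colMeans(mydata[starts[k]:ends[k], 3:5, drop = FALSE])
}

df <- as.data.frame(out)
print(df)
```

Using seq_along(vec1) as the loop range also removes the need for a manual stopping condition.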
vec1 is a vector with 4000 entries; its first 500 elements, head(vec1, 500), are below:
c(15L, 45L, 111L, 32L, 25L, 13L, 144L, 31L, 150L, 124L, 22L,
94L, 60L, 156L, 4L, 30L, 12L, 12L, 16L, 23L, 242L, 58L, 65L,
17L, 63L, 193L, 148L, 162L, 79L, 6L, 22L, 30L, 188L, 44L, 7L,
130L, 49L, 10L, 87L, 11L, 6L, 113L, 113L, 100L, 42L, 5L, 64L,
127L, 73L, 36L, 13L, 120L, 44L, 34L, 153L, 10L, 35L, 205L, 31L,
102L, 181L, 26L, 105L, 75L, 42L, 122L, 42L, 221L, 216L, 120L,
50L, 171L, 56L, 1L, 89L, 11L, 103L, 167L, 96L, 31L, 67L, 182L,
114L, 45L, 4L, 118L, 19L, 243L, 241L, 48L, 36L, 64L, 94L, 63L,
16L, 8L, 213L, 26L, 127L, 139L, 71L, 91L, 133L, 23L, 88L, 31L,
28L, 70L, 112L, 6L, 25L, 82L, 17L, 24L, 196L, 39L, 78L, 23L,
73L, 110L, 64L, 87L, 84L, 11L, 101L, 19L, 6L, 25L, 39L, 59L,
68L, 31L, 183L, 52L, 142L, 63L, 41L, 214L, 19L, 120L, 85L, 104L,
3L, 8L, 38L, 11L, 12L, 21L, 12L, 53L, 37L, 85L, 106L, 12L, 31L,
106L, 75L, 10L, 121L, 60L, 137L, 96L, 177L, 102L, 97L, 145L,
52L, 11L, 112L, 73L, 67L, 8L, 235L, 203L, 182L, 168L, 101L, 144L,
238L, 73L, 38L, 85L, 56L, 14L, 162L, 131L, 14L, 154L, 28L, 30L,
75L, 88L, 268L, 169L, 255L, 127L, 111L, 63L, 42L, 156L, 12L,
22L, 71L, 140L, 110L, 33L, 99L, 79L, 47L, 7L, 131L, 69L, 10L,
61L, 2L, 57L, 96L, 111L, 41L, 250L, 77L, 22L, 198L, 187L, 15L,
108L, 130L, 76L, 190L, 249L, 68L, 117L, 79L, 2L, 13L, 108L, 9L,
39L, 42L, 43L, 149L, 62L, 47L, 66L, 85L, 197L, 109L, 21L, 263L,
54L, 13L, 61L, 72L, 73L, 80L, 46L, 7L, 110L, 128L, 236L, 27L,
240L, 61L, 23L, 82L, 157L, 92L, 95L, 6L, 137L, 237L, 2L, 20L,
45L, 48L, 200L, 20L, 127L, 21L, 64L, 49L, 38L, 108L, 11L, 16L,
108L, 18L, 62L, 15L, 61L, 81L, 28L, 20L, 33L, 50L, 222L, 267L,
29L, 3L, 44L, 46L, 3L, 212L, 53L, 67L, 131L, 43L, 3L, 123L, 134L,
106L, 91L, 194L, 2L, 97L, 43L, 39L, 65L, 96L, 233L, 36L, 81L,
6L, 57L, 29L, 10L, 17L, 10L, 92L, 28L, 168L, 78L, 52L, 227L,
86L, 134L, 58L, 65L, 175L, 20L, 113L, 33L, 143L, 11L, 87L, 101L,
19L, 106L, 63L, 68L, 38L, 263L, 140L, 45L, 169L, 268L, 182L,
114L, 88L, 39L, 6L, 53L, 244L, 84L, 99L, 46L, 53L, 1L, 111L,
88L, 115L, 93L, 35L, 124L, 145L, 262L, 47L, 10L, 84L, 20L, 159L,
207L, 102L, 48L, 79L, 28L, 51L, 77L, 3L, 58L, 20L, 81L, 54L,
46L, 29L, 12L, 74L, 28L, 4L, 18L, 18L, 38L, 29L, 157L, 108L,
94L, 56L, 23L, 92L, 60L, 86L, 39L, 59L, 85L, 14L, 53L, 23L, 88L,
130L, 8L, 149L, 65L, 71L, 88L, 31L, 67L, 83L, 106L, 44L, 35L,
23L, 76L, 90L, 271L, 12L, 167L, 30L, 87L, 3L, 7L, 15L, 159L,
199L, 7L, 35L, 193L, 207L, 6L, 98L, 61L, 81L, 95L, 66L, 2L, 65L,
242L, 221L, 51L, 6L, 5L, 265L, 119L, 126L, 7L, 159L, 74L, 63L,
188L, 15L, 42L, 26L, 41L, 116L, 50L, 62L, 121L, 67L, 1L, 10L,
192L, 59L, 42L, 84L, 187L, 26L, 32L, 35L, 60L, 117L, 227L, 20L,
20L, 125L, 191L, 24L, 270L, 13L, 14L, 59L, 214L, 96L, 100L, 15L,
22L, 100L, 49L, 146L, 137L, 257L, 93L, 91L, 23L, 234L, 108L,
52L, 7L, 124L, 48L, 2L, 42L, 82L, 99L, 85L, 11L, 141L, 185L,
30L, 1L, 269L, 83L, 25L, 187L, 122L, 222L, 11L, 201L, 95L, 40L,
146L, 75L, 218L, 3L, 39L, 76L, 205L, 21L, 23L, 36L, 43L, 105L,
89L, 10L, 155L, 32L, 144L, 160L, 181L, 144L, 139L, 5L, 2L, 26L,
48L, 55L, 177L, 178L, 108L, 221L, 149L, 32L, 77L, 29L, 160L,
115L, 23L, 193L, 113L, 1L, 154L, 87L, 239L, 221L, 36L, 100L,
34L, 42L, 77L, 62L, 20L, 73L, 81L, 17L, 21L, 33L, 3L, 33L, 84L,
92L, 31L, 9L, 65L, 187L, 62L, 87L, 48L, 218L, 6L, 41L, 90L, 102L,
67L, 27L, 1L, 270L, 159L, 46L, 31L, 50L, 19L, 2L, 30L, 35L, 211L,
103L, 12L, 99L, 75L, 37L, 99L, 83L, 49L, 38L, 125L, 53L, 29L,
11L, 23L, 50L, 41L, 114L, 72L, 44L, 32L, 105L, 25L, 67L, 203L,
24L, 82L, 167L, 205L, 28L, 89L, 75L, 52L, 36L, 29L, 16L, 137L,
95L, 230L, 43L, 4L, 194L, 12L, 21L, 25L, 6L, 176L, 48L, 6L, 142L,
24L, 15L, 101L, 160L, 43L, 9L, 125L, 122L, 53L, 55L, 226L, 241L,
259L, 150L, 142L, 47L, 89L, 13L, 2L, 173L, 147L, 5L, 15L, 159L,
7L, 27L, 117L, 97L, 38L, 71L, 7L, 35L, 91L, 172L, 149L, 103L,
51L, 117L, 67L, 142L, 63L, 53L, 87L, 105L, 2L, 1L, 17L, 30L,
114L, 55L, 202L, 34L, 70L, 50L, 37L, 167L, 45L, 7L, 102L, 238L,
176L, 27L, 7L, 86L, 43L, 269L, 88L, 1L, 18L, 41L, 14L, 71L, 88L,
144L, 44L, 19L, 189L, 258L, 76L, 13L, 44L, 20L, 152L, 133L, 86L,
32L, 1L, 56L, 140L, 65L, 74L, 131L, 155L, 40L, 40L, 112L, 186L,
178L, 249L, 42L, 184L, 43L, 5L, 13L, 90L, 111L, 173L, 220L, 71L,
223L, 5L, 178L, 42L, 126L, 56L, 6L, 15L, 249L, 254L, 148L, 60L,
133L, 218L, 111L, 29L, 77L, 16L, 71L, 128L, 100L, 4L, 13L, 72L,
21L, 133L, 130L, 51L, 62L, 14L, 189L, 99L, 32L, 211L, 5L, 15L,
35L, 72L, 153L, 59L, 85L, 165L, 18L, 51L, 21L, 123L, 15L, 93L,
53L, 2L, 210L, 126L, 196L, 62L, 156L, 57L, 179L, 79L, 27L, 22L,
52L, 167L, 33L, 150L, 72L, 30L, 3L, 65L, 36L, 89L, 54L, 18L,
55L, 137L, 119L, 258L, 33L, 21L, 32L, 116L, 12L, 176L, 91L, 168L,
74L, 6L, 4L, 138L, 149L, 39L, 47L, 49L, 81L, 35L, 61L, 4L, 58L,
31L, 172L, 30L, 27L, 184L, 41L, 51L, 24L, 115L, 81L, 71L, 61L,
154L, 206L, 182L, 149L, 42L, 49L, 6L, 104L, 2L, 217L, 27L, 148L,
37L, 159L, 182L, 139L, 49L, 30L, 41L, 20L, 2L, 15L, 35L, 157L,
86L, 261L, 161L, 145L, 105L, 87L, 220L, 12L, 99L, 233L, 190L,
59L, 95L, 151L, 38L, 46L, 32L, 56L, 48L, 71L, 22L, 44L, 143L,
34L, 34L, 7L, 20L, 87L, 106L, 114L, 26L, 7L, 110L, 93L, 113L,
83L, 76L, 43L, 22L, 2L, 101L, 22L, 65L, 17L, 112L, 116L, 138L,
122L, 68L, 5L, 247L, 155L, 149L, 4L, 49L, 130L, 46L, 13L, 223L,
74L, 15L, 175L, 24L, 2L, 96L, 114L, 125L, 56L, 27L, 67L, 30L,
206L, 38L, 42L, 9L, 118L, 24L, 11L, 156L, 109L, 154L, 40L, 175L,
107L, 193L, 30L, 75L, 72L, 44L, 232L, 37L, 130L, 47L, 81L, 18L,
120L, 126L, 93L, 51L, 138L, 6L, 47L, 76L, 65L, 91L, 14L, 92L,
45L, 73L, 107L, 42L, 87L, 158L, 124L, 14L, 151L, 11L, 148L, 122L,
36L, 169L, 149L, 41L, 152L, 116L, 122L, 39L, 196L, 124L, 142L,
12L, 21L, 107L, 4L, 236L, 18L, 193L, 225L, 31L, 147L, 151L, 14L,
63L, 12L, 79L, 55L, 198L, 7L, 84L, 101L, 22L, 194L, 150L, 5L,
20L, 153L, 45L, 231L, 33L, 44L, 174L, 171L, 74L, 9L, 114L, 97L,
107L, 7L, 87L, 113L, 49L, 14L, 32L, 1L, 43L, 131L, 43L, 22L,
32L, 36L, 201L, 206L, 18L, 170L, 79L, 55L, 218L, 198L, 10L, 51L,
35L, 144L, 163L, 255L, 23L, 180L, 20L, 40L, 89L, 107L, 82L, 67L,
115L, 255L, 14L, 155L, 9L, 53L, 55L, 16L, 38L, 16L, 26L, 155L,
4L, 154L, 147L, 223L, 57L, 75L, 54L, 50L, 104L, 79L, 145L, 71L,
39L, 110L, 20L, 23L, 10L, 110L, 67L, 171L, 16L, 5L, 28L, 163L,
204L, 250L, 144L, 101L, 18L, 36L, 139L, 10L, 102L, 57L, 125L,
66L, 33L, 20L, 188L, 15L, 41L, 20L, 112L, 109L, 64L, 28L, 10L,
149L, 196L, 108L, 26L, 173L, 1L, 58L, 185L, 35L, 44L, 37L, 106L,
45L, 58L, 162L, 34L, 151L, 122L, 48L, 8L, 9L, 33L, 4L, 21L, 105L,
36L, 32L, 133L, 55L, 87L, 18L, 18L, 6L, 46L, 79L, 113L, 17L,
70L, 138L, 22L, 42L, 104L, 43L, 9L, 24L, 94L, 142L, 31L, 241L,
23L, 2L, 86L, 62L, 36L, 80L, 2L, 76L, 89L, 160L, 13L, 12L, 4L,
57L, 25L, 85L, 22L, 88L, 170L, 120L, 218L, 14L, 75L, 12L, 9L,
198L, 225L, 139L, 75L, 1L, 6L, 35L, 23L, 67L, 19L, 157L, 68L,
69L, 9L, 6L, 57L, 18L, 169L, 255L, 3L, 20L, 8L, 54L, 94L, 154L,
34L, 151L, 52L, 68L, 85L, 107L, 9L, 232L, 165L, 50L, 153L, 14L,
200L, 78L, 94L, 140L, 222L, 143L, 56L, 37L, 101L, 83L, 48L, 53L,
38L, 155L, 8L, 132L, 148L, 39L, 53L, 151L, 3L, 5L, 59L, 3L, 56L,
100L, 37L, 65L, 192L, 30L, 212L, 70L, 149L, 10L, 43L, 92L, 28L,
97L, 20L, 105L, 133L, 134L, 4L, 65L, 83L, 16L, 158L, 168L, 119L,
47L, 55L, 51L, 38L, 80L, 16L, 124L, 105L, 68L, 178L, 23L, 15L,
177L, 146L, 71L, 7L, 2L, 36L, 7L, 3L, 89L, 54L, 42L, 67L, 133L,
64L, 44L, 39L, 119L, 64L, 15L, 44L, 73L, 41L, 49L, 92L, 8L, 110L,
167L, 59L, 224L, 102L, 23L, 6L, 69L, 126L, 97L, 240L, 21L, 32L,
52L, 59L, 34L, 17L, 12L, 270L, 60L, 119L, 103L, 92L, 218L, 62L,
127L, 15L, 65L, 64L, 63L, 17L, 135L, 67L, 49L, 149L, 24L, 24L,
24L, 54L, 27L, 167L, 7L, 8L, 53L, 72L, 85L, 47L, 92L, 36L, 158L,
113L, 26L, 126L, 3L, 127L, 19L, 27L, 98L, 34L, 82L, 217L, 44L,
105L, 104L, 65L, 35L, 63L, 82L, 41L, 167L, 12L, 136L, 52L, 205L,
18L, 96L, 136L, 74L, 163L, 52L, 194L, 32L, 74L, 217L, 11L, 54L,
228L, 33L, 22L, 51L, 42L, 52L, 8L, 235L, 250L, 38L, 130L, 126L,
57L, 18L, 53L, 108L, 126L, 54L, 128L, 17L, 230L, 40L, 49L, 31L,
38L, 42L, 18L, 14L, 203L, 114L, 73L, 226L, 4L, 4L, 271L, 48L,
86L, 221L, 18L, 55L, 176L, 119L, 255L, 18L, 124L, 63L, 58L, 77L,
159L, 118L, 116L, 71L, 123L, 22L, 38L, 61L, 114L, 114L, 1L, 104L,
115L, 9L, 192L, 4L, 199L, 118L, 199L, 4L, 13L, 114L, 175L, 11L,
39L, 189L, 30L, 113L, 112L, 13L, 102L, 11L, 26L, 130L, 2L, 47L,
90L, 77L, 184L, 76L, 15L, 116L, 166L, 20L, 21L, 3L, 136L, 108L,
106L, 87L, 60L, 78L, 106L, 18L, 45L, 85L, 41L, 11L, 85L, 46L,
33L, 244L, 26L, 35L, 14L, 8L, 45L, 98L, 7L, 203L, 9L, 118L, 70L,
85L, 178L, 23L, 8L, 29L, 221L, 171L, 67L, 106L, 118L, 95L, 216L,
32L, 177L, 72L, 16L, 21L, 161L, 49L, 52L, 80L, 174L, 5L, 70L,
41L, 43L, 13L, 238L, 5L, 70L, 128L, 152L, 53L, 128L, 18L, 19L,
107L, 70L, 94L, 119L, 63L, 2L, 7L, 2L, 208L, 128L, 37L, 73L,
8L, 166L, 243L, 216L, 137L, 115L, 178L, 32L, 31L, 49L, 13L, 4L,
217L, 4L, 40L, 48L, 24L, 127L, 25L, 46L, 238L, 107L, 28L, 76L,
54L, 97L, 104L, 9L, 142L, 4L, 32L, 21L, 46L, 36L, 11L, 75L, 175L,
46L, 109L, 25L, 106L, 115L, 78L, 69L, 152L, 2L, 51L, 10L, 63L,
142L, 66L, 168L, 78L, 11L, 147L, 271L, 90L, 88L, 10L, 143L, 71L,
202L, 259L, 133L, 23L, 71L, 238L, 37L, 38L, 24L, 64L, 133L, 8L,
194L, 24L, 92L, 25L, 230L, 195L, 34L, 162L, 18L, 69L, 75L, 18L,
20L, 34L, 99L, 24L, 152L, 83L, 24L, 4L, 41L, 103L, 77L, 86L,
23L, 46L, 53L, 63L, 98L, 54L, 17L, 122L, 9L, 25L, 237L, 71L,
82L, 42L, 259L, 37L, 35L, 21L, 77L, 2L, 5L, 2L, 41L, 46L, 26L,
100L, 265L, 224L, 45L, 68L, 263L, 136L, 243L, 109L, 122L, 25L,
186L, 1L, 7L, 135L, 116L, 18L, 32L, 94L, 192L, 29L, 184L, 174L,
41L, 71L, 14L, 125L, 61L, 70L, 178L, 90L, 7L, 14L, 194L, 167L,
5L, 2L, 21L, 100L, 60L, 230L, 66L, 10L, 162L, 39L, 99L, 91L,
65L, 22L, 162L, 139L, 43L, 230L, 59L, 61L, 168L, 14L, 23L, 73L,
35L, 141L, 73L, 71L, 44L, 59L, 131L, 127L, 68L, 122L, 164L, 2L,
17L, 111L, 4L, 34L, 147L, 33L, 11L, 33L, 54L, 48L, 235L, 136L,
27L, 57L, 8L, 86L, 63L, 86L, 24L, 212L, 92L, 131L, 113L, 47L,
132L, 5L, 175L, 12L, 51L, 81L, 29L, 232L, 126L, 20L, 157L, 158L,
17L, 16L, 62L, 25L, 74L, 58L, 25L, 35L, 85L, 61L, 112L, 241L,
135L, 183L, 77L, 41L, 12L, 101L, 12L, 25L, 113L, 38L, 28L, 95L,
232L, 6L, 98L, 67L, 13L, 46L, 9L, 107L, 88L, 164L, 79L, 18L,
13L, 200L, 20L, 152L, 107L, 40L, 31L, 146L, 121L, 75L, 6L, 237L,
153L, 150L, 161L, 198L, 174L, 167L, 15L, 154L, 160L, 171L, 169L,
23L, 22L, 187L, 226L, 40L, 213L, 87L, 269L, 136L, 153L, 103L,
141L, 21L, 79L, 22L, 144L, 119L, 1L, 11L, 13L, 7L, 128L, 43L,
77L, 50L, 142L, 79L, 5L, 182L, 19L, 39L, 5L, 63L, 228L, 13L,
5L, 49L, 58L, 14L, 145L, 129L, 102L, 211L, 152L, 43L, 269L, 67L,
36L, 10L, 103L, 98L, 83L, 13L, 25L, 155L, 11L, 33L, 127L, 79L,
46L, 64L, 40L, 88L, 23L, 52L, 204L, 125L, 39L, 10L, 184L, 38L,
113L, 123L, 68L, 69L, 126L, 7L, 36L, 43L, 3L, 243L, 82L, 50L,
109L, 122L, 44L, 40L, 41L, 140L, 134L, 168L, 122L, 16L, 2L, 61L,
37L, 73L, 163L, 70L, 18L, 9L, 205L, 12L, 89L, 1L, 17L, 119L,
17L, 54L, 31L, 13L, 185L, 157L, 113L, 53L, 156L, 157L, 72L, 61L,
29L, 52L, 69L, 23L, 261L, 51L, 118L, 48L, 98L, 49L, 250L, 29L,
222L, 55L, 14L, 130L, 72L, 27L, 23L, 45L, 27L, 5L, 62L, 46L,
208L, 183L, 32L, 37L, 168L, 39L, 47L, 3L, 88L, 74L, 40L, 254L,
5L, 28L, 165L, 109L, 181L, 209L, 142L, 107L, 21L, 14L, 42L, 58L,
198L, 30L, 91L, 175L, 108L, 18L, 60L, 86L, 6L, 82L, 26L, 8L,
85L, 202L, 261L, 113L, 142L, 19L, 67L, 96L, 116L, 262L, 60L,
55L, 47L, 56L, 33L, 39L, 196L, 77L, 10L, 86L, 142L, 11L, 49L,
7L, 56L, 38L, 26L, 180L, 74L, 60L, 236L, 7L, 37L, 81L, 119L,
26L, 7L, 103L, 38L, 6L, 184L, 153L, 90L, 42L, 22L, 140L, 57L,
50L, 97L, 14L, 42L, 3L, 14L, 16L, 66L, 56L, 89L, 21L, 58L, 7L,
101L, 16L, 125L, 224L, 64L, 110L, 20L, 5L, 67L, 57L, 161L, 271L,
13L, 18L, 51L, 119L, 42L, 122L, 51L, 116L, 41L, 2L, 89L, 229L,
2L, 45L, 22L, 180L, 3L, 127L, 195L, 8L, 230L, 203L, 72L, 203L,
61L, 7L, 61L, 253L, 37L, 46L, 59L, 161L, 110L, 5L, 223L, 195L,
45L, 1L, 48L, 163L, 3L, 56L, 76L, 77L, 107L, 183L, 7L, 30L, 145L,
4L, 26L, 174L, 76L, 83L, 73L, 172L, 226L, 2L, 18L, 1L, 8L, 90L,
36L, 8L, 44L, 36L, 90L, 64L, 89L, 127L, 24L, 67L, 7L, 263L, 71L,
178L, 21L, 21L, 28L, 236L, 116L, 46L, 82L, 79L, 17L, 18L, 131L,
49L, 90L, 65L, 168L, 93L, 2L, 267L, 59L, 35L, 126L, 35L, 185L,
6L, 45L, 31L, 42L, 71L, 67L, 85L, 11L, 9L, 30L, 22L, 24L, 123L,
119L, 14L, 98L, 31L, 101L, 137L, 81L, 47L, 79L, 4L, 167L, 78L,
11L, 30L, 9L, 115L, 32L, 12L, 80L, 33L, 68L, 36L, 130L, 31L,
7L, 169L, 54L, 9L, 155L, 61L, 250L, 89L, 149L, 2L, 101L, 66L,
166L, 41L, 4L, 62L, 9L, 160L, 189L, 144L, 101L, 190L, 129L, 11L,
124L, 22L, 13L, 151L, 1L, 58L, 173L, 195L, 47L, 3L, 3L, 24L,
26L, 27L, 177L, 43L, 29L, 27L, 7L, 3L, 154L, 100L, 125L, 91L,
212L, 224L, 77L, 53L, 135L, 2L, 11L, 65L, 60L, 115L, 78L, 55L,
66L, 31L, 88L, 72L, 87L, 181L, 198L, 75L, 239L, 111L, 10L, 128L,
103L, 68L, 27L, 127L, 4L, 24L, 102L, 3L, 19L, 103L, 268L, 5L,
153L, 216L, 9L, 56L, 154L, 3L, 13L, 128L, 252L, 17L, 10L, 78L,
65L, 245L, 53L, 166L, 11L, 28L, 43L, 85L, 11L, 179L, 200L, 127L,
235L, 61L, 7L, 4L, 35L, 28L, 85L, 118L, 69L, 92L, 158L, 40L,
91L, 104L, 165L, 135L, 30L, 230L, 121L, 204L, 44L, 106L, 5L,
51L, 19L, 145L, 34L, 184L, 16L, 217L, 62L, 67L, 44L, 16L, 5L,
39L, 13L, 16L, 95L, 158L, 43L, 93L, 37L, 47L, 33L, 18L, 178L,
13L, 65L, 123L, 54L, 165L, 265L, 9L, 118L, 93L, 10L, 3L, 114L,
13L, 8L, 48L, 103L, 160L, 92L, 135L, 50L, 7L, 38L, 16L, 64L,
85L, 215L, 13L, 251L, 41L, 10L, 67L, 13L, 56L, 202L, 72L, 156L,
249L, 56L, 38L, 27L, 15L, 177L, 39L, 36L, 62L, 53L, 86L, 62L,
126L, 177L, 46L, 30L, 81L, 6L, 74L, 37L, 65L, 54L, 67L, 123L,
66L, 144L, 90L, 48L, 173L, 47L, 49L, 108L, 22L, 103L, 22L, 144L,
23L, 233L, 78L, 181L, 136L, 27L, 3L, 135L, 46L, 34L, 30L, 42L,
6L, 53L, 49L, 180L, 247L, 106L, 22L, 124L, 9L, 161L, 43L, 82L,
112L, 225L, 153L, 124L, 53L, 90L, 64L, 86L, 35L, 121L, 118L,
129L, 39L, 3L, 16L, 24L, 224L, 128L, 145L, 108L, 124L, 32L, 9L,
7L, 22L, 16L, 207L, 51L, 27L, 22L, 6L, 132L, 154L, 26L, 223L,
145L, 105L, 78L, 44L, 171L, 29L, 53L, 229L, 89L, 47L, 41L, 81L,
62L, 169L, 102L, 241L, 35L, 6L, 174L, 51L, 181L, 83L, 52L, 92L,
31L, 110L, 148L, 52L, 7L, 73L, 136L, 25L, 29L, 42L, 84L, 190L,
49L, 139L, 62L, 7L, 86L, 13L, 182L, 203L, 68L, 127L, 13L, 27L,
244L, 69L, 65L, 92L, 14L, 257L, 7L, 49L, 20L, 44L, 17L, 13L,
73L, 20L, 43L, 33L, 242L, 4L, 66L, 70L, 99L, 193L, 12L, 179L,
63L, 14L, 53L, 49L, 105L, 59L, 113L, 79L, 124L, 35L, 9L, 7L,
44L, 6L, 21L, 8L, 114L, 36L, 90L, 121L, 113L, 96L, 26L, 253L,
14L, 53L, 10L, 25L, 18L, 18L, 10L, 87L, 4L, 159L, 179L, 17L,
9L, 222L, 68L, 268L, 120L, 197L, 21L, 67L, 59L, 250L, 221L, 233L,
41L, 114L, 20L, 136L, 136L, 94L, 19L, 29L, 11L, 81L, 179L, 154L,
20L, 29L, 148L, 249L, 34L, 246L, 212L, 46L, 4L, 33L, 118L, 47L,
246L, 116L, 42L, 91L, 60L, 49L, 186L, 37L, 85L, 8L, 26L, 5L,
30L, 44L, 22L, 28L, 48L, 144L, 200L, 33L, 29L, 77L, 15L, 120L,
33L, 27L, 53L, 126L, 183L, 79L, 62L, 102L, 61L, 112L, 56L, 77L,
201L, 74L, 7L, 99L, 120L, 110L, 148L, 35L, 48L, 18L, 4L, 16L,
84L, 53L, 39L, 20L, 36L, 159L, 30L, 3L, 46L, 247L, 31L, 127L,
61L, 127L, 238L, 109L, 154L, 178L, 78L, 31L, 5L, 77L, 69L, 3L,
49L, 165L, 91L, 29L, 72L, 24L, 30L, 105L, 55L, 225L, 28L, 36L,
13L, 18L, 106L, 56L, 143L, 105L, 55L, 33L, 4L, 100L, 215L, 59L,
169L, 103L, 70L, 76L, 189L, 42L, 94L, 101L, 41L, 83L, 52L, 231L,
120L, 111L, 37L, 198L, 69L, 57L, 51L, 13L, 14L, 55L, 24L, 74L,
136L, 1L, 218L, 110L, 125L, 26L, 106L, 203L, 46L, 57L, 16L, 90L,
186L, 209L, 64L, 254L, 1L, 103L, 175L, 3L, 5L, 41L, 51L, 232L,
89L, 73L, 67L, 260L, 85L, 189L, 249L, 166L, 72L, 250L, 56L, 2L,
66L, 232L, 33L, 259L, 12L, 47L, 7L, 106L, 193L, 63L, 132L, 3L,
21L, 76L, 195L, 15L, 43L, 171L, 29L, 108L, 84L, 199L, 189L, 98L,
43L, 83L, 28L, 67L, 47L, 195L, 62L, 57L, 53L, 163L, 48L, 65L,
188L, 3L, 52L, 257L, 62L, 62L, 114L, 38L, 128L, 26L, 205L, 100L,
75L, 104L, 56L, 146L, 105L, 35L, 26L, 18L, 46L, 25L, 96L, 61L,
1L, 91L, 13L, 169L, 35L, 54L, 77L, 35L, 9L, 213L, 124L, 22L,
29L, 52L, 203L, 98L, 61L, 8L, 33L, 14L, 11L, 13L, 48L, 105L,
76L, 22L, 136L, 123L, 18L, 39L, 39L, 9L, 212L, 11L, 37L, 9L,
59L, 254L, 18L, 85L, 38L, 180L, 159L, 94L, 42L, 15L, 230L, 38L,
35L, 19L, 98L, 185L, 10L, 24L, 103L, 67L, 8L, 63L, 200L, 135L,
34L, 39L, 19L, 62L, 175L, 13L, 9L, 1L, 37L, 116L, 41L, 42L, 105L,
54L, 17L, 90L, 47L, 38L, 34L, 23L, 105L, 23L, 57L, 115L, 107L,
58L, 50L, 121L, 123L, 23L, 99L, 31L, 148L, 9L, 106L, 4L, 76L,
55L, 102L, 66L, 135L, 43L, 73L, 7L, 255L, 15L, 24L, 229L, 115L,
55L, 52L, 18L, 22L, 39L, 181L, 1L, 135L, 45L, 103L, 24L, 180L,
118L, 228L, 219L, 116L, 90L, 154L, 35L, 73L, 65L, 48L, 58L, 35L,
26L, 166L, 66L, 128L, 15L, 28L, 109L, 154L, 3L, 24L, 52L, 89L,
50L, 53L, 69L, 17L, 15L, 124L, 50L, 134L, 267L, 11L, 194L, 6L,
143L, 40L, 35L, 223L, 12L, 27L, 45L, 181L, 60L, 37L, 19L, 6L,
24L, 57L, 75L, 12L, 93L, 38L, 27L, 140L, 32L, 57L, 115L, 82L,
262L, 5L, 185L, 223L, 10L, 72L, 7L, 110L, 12L, 81L, 61L, 29L,
91L, 12L, 85L, 62L, 34L, 73L, 27L, 16L, 85L, 216L, 228L, 157L,
66L, 73L, 38L, 88L, 26L, 83L, 184L, 10L, 108L, 43L, 11L, 3L,
47L, 61L, 139L, 10L, 8L, 69L, 11L, 63L, 224L, 82L, 5L, 22L, 3L,
51L, 39L, 5L, 232L, 150L, 93L, 89L, 174L, 5L, 85L, 159L, 49L,
150L, 187L, 101L, 29L, 20L, 48L, 4L, 142L, 44L, 57L, 105L, 79L,
51L, 91L, 89L, 115L, 14L, 67L, 2L, 165L, 114L, 2L, 17L, 67L,
38L, 108L, 23L, 103L, 223L, 1L, 34L, 21L, 41L, 73L, 186L, 55L,
14L, 61L, 81L, 75L, 15L, 95L, 85L, 145L, 222L, 139L, 231L, 162L,
79L, 67L, 80L, 75L, 17L, 27L, 48L, 38L, 27L, 71L, 100L, 51L,
132L, 2L, 183L, 110L, 23L, 37L, 103L, 30L, 43L, 138L, 1L, 13L,
83L, 180L, 27L, 21L, 236L, 78L, 118L, 93L, 95L, 83L, 28L, 15L,
236L, 41L, 51L, 11L, 181L, 91L, 4L, 40L, 86L, 165L, 24L, 115L,
252L, 28L, 35L, 13L, 15L, 7L, 9L, 27L, 33L, 9L, 40L, 5L, 105L,
28L, 5L, 16L, 117L, 153L, 27L, 141L, 52L, 168L, 10L, 84L, 17L,
47L, 56L, 233L, 140L, 69L, 221L, 19L, 8L, 71L, 37L, 123L, 137L,
10L, 55L, 146L, 14L, 41L, 69L, 142L, 89L, 4L, 37L, 170L, 37L,
35L, 182L, 70L, 24L, 158L, 83L, 25L, 38L, 116L, 132L, 209L, 69L,
221L, 41L, 114L, 28L, 20L, 42L, 132L, 83L, 168L, 87L, 64L, 249L,
155L, 66L, 113L, 44L, 35L, 100L, 133L, 31L, 126L, 10L, 184L,
53L, 64L, 57L, 22L, 2L, 30L, 25L, 39L, 151L, 164L, 42L, 72L,
2L, 38L, 29L, 8L, 22L, 9L, 91L, 58L, 58L, 78L, 82L, 117L, 104L,
29L, 80L, 70L, 137L, 137L, 115L, 10L, 87L, 66L, 1L, 11L, 21L,
118L, 262L, 70L, 5L, 153L, 118L, 35L, 249L, 68L, 38L, 79L, 30L,
39L, 39L, 158L, 17L, 145L, 5L, 8L, 47L, 177L, 77L, 203L, 94L,
107L, 96L, 68L, 7L, 12L, 24L, 18L, 146L, 13L, 164L, 54L, 73L,
143L, 96L, 22L, 5L, 100L, 71L, 65L, 1L, 16L, 22L, 13L, 39L, 101L,
39L, 75L, 148L, 45L, 257L, 67L, 18L, 50L, 62L, 29L, 222L, 96L,
7L, 7L, 130L, 108L, 44L, 48L, 109L, 67L, 112L, 100L, 169L, 260L,
130L, 169L, 79L, 111L, 121L, 15L, 21L, 240L, 220L, 56L, 8L, 18L,
4L, 37L, 98L, 46L, 247L, 66L, 69L, 19L, 66L, 112L, 42L, 103L,
122L, 155L, 36L, 4L, 60L, 39L, 25L, 2L, 182L, 105L, 157L, 5L,
70L, 16L, 55L, 52L, 39L, 156L, 14L, 118L, 88L, 91L, 132L, 52L,
18L, 38L, 31L, 35L, 75L, 186L, 45L, 110L, 232L, 52L, 135L, 33L,
11L, 29L, 129L, 147L, 20L, 20L, 59L, 46L, 6L, 53L, 251L, 120L,
192L, 41L, 87L, 38L, 134L, 5L, 120L, 130L, 71L, 121L, 84L, 183L,
166L, 20L, 8L, 20L, 74L, 201L, 35L, 176L, 189L, 17L, 231L, 48L,
38L, 3L, 142L, 53L, 199L, 135L, 6L, 38L, 256L, 76L, 6L, 56L,
154L, 25L, 76L, 69L, 149L, 107L, 113L, 246L, 61L, 23L, 6L, 76L,
3L, 68L, 70L, 89L, 130L, 226L, 31L, 157L, 24L, 80L, 170L, 169L,
64L, 12L, 110L, 47L, 141L, 159L, 22L, 53L, 167L, 61L, 81L, 98L,
172L, 261L, 99L, 9L, 13L, 132L, 103L, 16L, 97L, 186L, 35L, 128L,
73L, 136L, 62L, 187L, 30L, 31L, 26L, 115L, 76L, 260L, 54L, 11L,
169L, 227L, 43L, 6L, 23L, 212L, 23L, 68L, 119L, 181L, 34L, 137L,
144L, 48L, 101L, 25L, 10L, 92L, 5L, 92L, 132L, 206L, 44L, 113L,
9L, 25L, 249L, 69L, 250L, 67L, 35L, 6L, 60L, 251L, 6L, 32L, 94L,
13L, 224L, 21L, 43L, 81L, 9L, 9L, 95L, 11L, 7L, 26L, 172L, 46L,
17L, 3L, 2L, 39L, 26L, 7L, 18L, 57L, 88L, 16L, 47L, 136L, 135L,
73L, 26L, 60L, 56L, 77L, 158L, 23L, 1L, 139L, 234L, 76L, 99L,
28L, 22L, 83L, 114L, 6L, 122L, 7L, 36L, 59L, 4L, 33L, 79L, 25L,
26L, 8L, 28L, 19L, 33L, 2L, 23L, 44L, 158L, 56L, 14L, 8L, 56L,
16L, 36L, 90L, 18L, 22L, 7L, 74L, 70L, 2L, 51L, 13L, 130L, 25L,
17L, 23L, 48L, 37L, 60L, 17L, 58L, 15L, 41L, 261L, 245L, 35L,
17L, 41L, 234L, 13L, 11L, 192L, 3L, 5L, 29L, 14L, 34L, 4L, 110L,
63L, 47L, 157L, 9L, 116L, 120L, 29L, 126L, 26L, 106L, 219L, 209L,
93L, 255L, 137L, 88L, 96L, 87L, 229L, 23L, 128L, 101L, 62L, 2L,
193L, 58L, 1L, 8L, 146L, 44L, 12L, 27L, 99L, 270L, 54L, 41L,
161L, 231L, 53L, 126L, 139L, 77L, 55L, 32L, 6L, 159L, 131L, 54L,
266L, 87L, 13L, 205L, 154L, 3L, 82L, 35L, 11L, 2L, 56L, 84L,
110L, 116L, 28L, 30L, 60L, 74L, 12L, 147L, 31L, 206L, 31L, 56L,
209L, 115L, 149L, 33L, 198L, 205L, 71L, 28L, 40L, 201L, 32L,
3L, 40L, 75L, 91L, 32L, 9L, 4L, 192L, 11L, 41L, 30L, 46L, 57L,
44L, 243L, 67L, 118L, 108L, 181L, 83L, 45L, 93L, 13L, 2L, 104L,
163L, 92L, 8L, 17L, 14L, 150L, 5L, 60L, 123L, 100L, 105L, 110L,
225L, 249L, 207L, 100L, 188L, 138L, 6L, 176L, 68L, 91L, 8L, 20L,
18L, 21L, 79L, 20L, 4L, 99L, 136L, 28L, 156L, 7L, 36L, 226L,
33L, 42L, 1L, 28L, 227L, 11L, 9L, 157L, 206L, 34L, 17L, 61L,
113L, 112L, 158L, 24L, 18L, 36L, 75L, 40L, 18L, 183L, 3L, 37L,
92L, 69L, 13L, 213L, 48L, 163L, 188L, 251L, 59L, 75L, 1L, 12L,
46L, 232L, 13L, 74L, 32L, 149L, 219L, 22L, 59L, 109L, 264L, 25L,
141L, 5L, 67L, 41L, 5L, 71L, 19L, 63L, 114L, 28L, 76L, 80L, 86L,
71L, 18L, 166L, 40L, 57L, 185L, 88L, 115L)
The problem is that you initially created a 4000 * 3 data.frame filled with NA. Please see the corrected code below. I did not include your actual vec1 (it is too long); instead I simulated vec1 by sampling from 1:271 with exponentially decaying probabilities. Additionally, I used colMeans, which is more efficient than apply:
# vec1, mydata, l - simulation
set.seed(123)
vec1 <- (sample(1:271, 4000, replace = TRUE, prob = dexp(1:271, rate = .01)))
mydata <- matrix(1:(300 * 300), nrow = 300)
l <- 300
# data given by OP
df <- data.frame(Age = 1, Weight = 1, height = 1 )
df <- df[-1, ]
i <- 1
j <- vec1[1] - 1
k <- 0
repeat {
  # drop = FALSE keeps a one-row block as a matrix, so colMeans() does not fail
  elements <- as.vector(colMeans(mydata[i:(j + 1), 3:5, drop = FALSE]))
  df <- rbind(df, elements)
  k <- k + 1
  i <- i + vec1[k]
  j <- j + vec1[k + 1]
  if (j + 1 >= l) {
    break
  }
}
df <- setNames(df, c("Age","Weight", "height"))
df
Output:
Age Weight height
1 608.0 908.0 1208.0
2 638.0 938.0 1238.0
3 716.0 1016.0 1316.0
4 787.5 1087.5 1387.5
5 816.0 1116.0 1416.0
6 835.0 1135.0 1435.0
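As a hedged aside: if the block lengths are known up front, the same block-wise column means can be sketched without growing a data frame inside a loop. The block sizes and the names block_sizes, starts, ends and res are made up for illustration; only the matrix mirrors the simulated mydata above.

```r
# Simulated input, same shape as the mydata used above
mydata <- matrix(1:(300 * 300), nrow = 300)

# Hypothetical block lengths standing in for vec1
block_sizes <- c(50, 120, 80)

# Last and first row index of each consecutive block
ends <- cumsum(block_sizes)
starts <- c(1, head(ends, -1) + 1)

# Column means of columns 3:5 for each block; drop = FALSE keeps
# one-row blocks as matrices so colMeans() still works
res <- t(mapply(function(s, e) colMeans(mydata[s:e, 3:5, drop = FALSE]),
                starts, ends))
colnames(res) <- c("Age", "Weight", "height")
res
```

Each row of res corresponds to one block, so the result has as many rows as there are entries in block_sizes.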

How to find Number of unique occurrences of a value in data-set?

I have the following piece of my data-set:
> dput(test)
structure(list(X2002.06.26 = structure(c(99L, 88L, 65L, 94L,
60L, 101L, 27L, 83L, 16L, 12L, 54L, 97L, 63L, 41L, 13L, 2L, 58L,
9L, 82L, 22L, 14L, 77L, 55L, 32L, 45L, 80L, 39L, 70L, 114L, 103L,
69L, 104L, 106L, 108L, 38L, 10L, 64L, 1L, 112L, 102L, 67L, 98L,
66L, 19L, 81L, 72L, 89L, 23L, 48L, 4L, 25L, 91L, 26L, 62L, 33L,
3L, 28L, 57L, 17L, 20L, 73L, 78L, 90L, 84L, 5L, 92L, 43L, 74L,
75L, 93L, 100L, 56L, 36L, 79L, 111L, 52L, 24L, 105L, 29L, 53L,
110L, 71L, 18L, 8L, 34L, 50L, 109L, 61L, 35L, 21L, 11L, 47L,
59L, 51L, 113L, 44L, 30L, 42L, 107L, 7L, 87L, 6L, 68L, 96L, 86L,
15L, 46L, 85L, 31L, 49L, 40L, 76L, 95L, 115L, 37L), .Label = c("BMG4388N1065",
"BMG812761002", "GB00BYMT0J19", "IE00BLS09M33", "IE00BQRQXQ92",
"US0003611052", "US0015471081", "US0025671050", "US0028962076",
"US0044981019", "US0116591092", "US01741R1023", "US0185223007",
"US01988P1084", "US0305061097", "US0311001004", "US03662Q1058",
"US0375981091", "US0383361039", "US03836W1036", "US03937C1053",
"US0396701049", "US0462241011", "US06652V2088", "US0997241064",
"US1033041013", "US1096961040", "US1170431092", "US1250711009",
"US1258961002", "US12686C1099", "US1311931042", "US1416651099",
"US1423391002", "US1431301027", "US1564311082", "US1718711062",
"US1778351056", "US2193501051", "US2289031005", "US23331A1097",
"US2537981027", "US2829141009", "US2925621052", "US2966891028",
"US3116421021", "US34354P1057", "US3498531017", "US3693851095",
"US3984331021", "US3989051095", "US4158641070", "US4222451001",
"US4285671016", "US4586653044", "US4835481031", "US5261071071",
"US5367971034", "US5463471053", "US55305B1017", "US5535301064",
"US5562691080", "US5663301068", "US5871181005", "US59001A1025",
"US6081901042", "US62914B1008", "US6517185046", "US6900701078",
"US6907684038", "US6936561009", "US7081601061", "US7132781094",
"US7234561097", "US7310681025", "US7415034039", "US7496851038",
"US7549071030", "US7595091023", "US76009N1000", "US7703231032",
"US7811821005", "US7835491082", "US8081941044", "US8308791024",
"US83088M1027", "US83545G1022", "US8354951027", "US8528572006",
"US8545021011", "US85590A4013", "US8581191009", "US8589121081",
"US8681571084", "US8685361037", "US8712371033", "US8793691069",
"US8799391060", "US8832031012", "US8851601018", "US8865471085",
"US8873891043", "US88830M1027", "US8968181011", "US89785X1019",
"US8990355054", "US90385D1072", "US9134831034", "US9202531011",
"US92552R4065", "US9410531001", "US9427491025", "US9433151019",
"US9633201069", "US9837721045"), class = "factor"), X2002.06.27 = structure(c(57L,
43L, 73L, 70L, 35L, 114L, 58L, 88L, 55L, 7L, 72L, 28L, 16L, 84L,
110L, 44L, 75L, 20L, 99L, 18L, 10L, 80L, 113L, 52L, 66L, 36L,
60L, 101L, 107L, 103L, 34L, 22L, 81L, 40L, 1L, 46L, 108L, 106L,
91L, 37L, 98L, 9L, 104L, 115L, 54L, 100L, 42L, 2L, 3L, 26L, 21L,
71L, 23L, 62L, 50L, 97L, 11L, 94L, 27L, 53L, 79L, 4L, 51L, 76L,
49L, 78L, 87L, 32L, 59L, 96L, 13L, 86L, 15L, 48L, 109L, 29L,
85L, 68L, 17L, 41L, 64L, 31L, 8L, 38L, 90L, 45L, 12L, 56L, 6L,
39L, 92L, 63L, 5L, 82L, 19L, 89L, 69L, 74L, 25L, 95L, 105L, 61L,
67L, 14L, 112L, 111L, 102L, 83L, 93L, 33L, 30L, 47L, 65L, 24L,
77L), .Label = c("CH0044328745", "GB00BVVBC028", "LR0008862868",
"US0003611052", "US0010841023", "US0044981019", "US0079731008",
"US0116591092", "US0305061097", "US0311001004", "US0383361039",
"US03937C1053", "US0462241011", "US06652V2088", "US0733021010",
"US0952291005", "US0997241064", "US1096411004", "US1096961040",
"US1265011056", "US12686C1099", "US1311931042", "US1431301027",
"US1564311082", "US1628251035", "US1630721017", "US1897541041",
"US2017231034", "US23331A1097", "US2829141009", "US2925621052",
"US29444U7000", "US2974251009", "US3024913036", "US3138551086",
"US34354P1057", "US3596941068", "US3693851095", "US3719011096",
"US3825501014", "US3984331021", "US3989051095", "US4108671052",
"US4130861093", "US4158641070", "US4456581077", "US4586653044",
"US4606901001", "US48666K1097", "US5006432000", "US5053361078",
"US5138471033", "US5179421087", "US5246601075", "US5260571048",
"US5463471053", "US5526761086", "US5535301064", "US5663301068",
"US5766901012", "US59001A1025", "US6117421072", "US63935N1072",
"US6515871076", "US67066G1040", "US6795801009", "US6819191064",
"US6900701078", "US6907684038", "US6935061076", "US6936561009",
"US6951561090", "US7004162092", "US73179P1066", "US7376301039",
"US7401891053", "US74762E1029", "US7496851038", "US7549071030",
"US7757111049", "US7811821005", "US8305661055", "US8308791024",
"US8335511049", "US83545G1022", "US8354951027", "US8358981079",
"US8545021011", "US85590A4013", "US86732Y1091", "US8681681057",
"US8712371033", "US87305R1095", "US8799391060", "US8851601018",
"US88830M1027", "US8894781033", "US8962391004", "US8968181011",
"US89785X1019", "US9022521051", "US90385D1072", "US9046772003",
"US9111631035", "US9134831034", "US92552R4065", "US92552V1008",
"US9258151029", "US9292361071", "US9410531001", "US9427491025",
"US9433151019", "US9699041011", "US9746371007", "US9807451037"
), class = "factor")), .Names = c("X2002.06.26", "X2002.06.27"
), class = "data.frame", row.names = c(NA, -115L))
The actual data extends over 3000+ columns and there are approximately 1150 unique values.
I need to count how many times each of these values appears in the Data-Set.
We can first flatten the elements of the data frame into a single character vector, then apply table():
tab1 <- table(do.call(c, lapply(df, as.character)))
Another option is to convert the data frame to a matrix and then apply table():
tab2 <- table(as.matrix(df))
identical(tab1, tab2)
[1] TRUE
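To see why the columns are converted to character first: each factor column carries its own set of levels, so the same integer code can stand for different values in different columns. A toy sketch (the data frame df_small and its values are invented for illustration):

```r
# Two factor columns with different level sets
df_small <- data.frame(
  d1 = factor(c("A", "B", "C")),
  d2 = factor(c("B", "C", "D"))
)

# The underlying integer codes are identical even though the labels differ
as.integer(df_small$d1)
as.integer(df_small$d2)

# Count on the labels, not the codes
counts <- table(unlist(lapply(df_small, as.character), use.names = FALSE))

# Most frequent values first
counts_sorted <- sort(counts, decreasing = TRUE)
counts_sorted
```

Sorting the table is optional; it just makes the most frequent identifiers easy to read off when there are around a thousand distinct values.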

Select a set of edges which create the largest graph given that some edges are mutually exclusive of others

I'm trying to determine how to best tackle this problem.
Given a set of nodes and multiple, conflicting ways in which they could be connected, I need to select the set of non-conflicting relations such that the largest number of nodes remain connected.
Example.
Here is a graph including all possible relations (edges), ignoring conflicts. That is, this image doesn't depict the dependence of the edges on each other.
All edges attached to a specific node are dependent on one another. For simplicity, each edge implies an attribute for each node it connects, say A...Z. If an edge connecting nodes 3 and 16 specifies attributes 3-B and 16-F, then all edges connecting 16 to other nodes must have attribute 16-F. Similarly, all edges connecting 3 to other nodes must have attribute 3-B.
Here is the same graph when specifying attribute F to node 16. This attribute removes most edges leaving one edge connecting 16-4 and one edge connecting 16-3. This has left no edges between 16-42.
(16 is near the left in both images.)
This image does not illustrate that the edge connecting 3-42 will specify an attribute for node 42, say 42-X. This will further constrain connections to 42 and further break up the graph. I have not displayed this because this is what my question pertains to.
I am looking for advice.
Is this a known problem? Can you point me to any references?
How would you approach this problem? My best idea is to iterate, starting at each edge, over all possible attributes. Evaluate each partitioning and find which preserves the largest network. This sounds challenging though and I could use some help.
If this is the solution, is there a way using igraph in R to specify an "edge attribute constraint" and pull out the resulting, fragmented graph?
I have dput the graph here:
df = structure(list(nodeA = c(3L, 4L, 42L, 43L, 44L, 29L, 30L, 29L, 30L, 3L, 4L, 6L, 43L, 44L, 43L, 44L, 29L, 30L, 29L, 30L, 52L, 29L, 30L, 35L, 25L, 35L, 25L, 43L, 44L, 29L, 30L, 3L, 4L, 43L, 44L, 29L, 30L, 25L, 29L, 30L, 42L, 3L, 4L, 17L, 43L, 44L, 29L, 30L, 29L, 30L, 17L, 17L, 29L, 30L, 6L, 43L, 44L, 29L, 30L, 52L, 35L, 35L, 25L, 25L, 24L, 24L, 43L, 44L, 29L, 30L, 35L, 35L, 25L, 25L, 24L, 24L, 43L, 44L, 29L, 30L, 35L, 35L, 25L, 25L, 24L, 24L, 52L, 42L, 3L, 42L, 42L, 3L, 4L, 42L, 25L, 42L, 25L, 3L, 4L, 42L, 3L, 4L, 17L, 35L, 3L, 4L, 35L, 43L, 44L, 29L, 30L, 35L, 35L, 35L, 52L, 25L, 25L, 24L, 24L, 35L, 29L, 30L, 3L, 4L, 43L, 44L, 29L, 30L, 25L, 29L, 30L, 52L, 43L, 44L, 29L, 30L, 25L, 29L, 30L, 3L, 4L, 43L, 44L, 29L, 30L, 52L, 43L, 44L, 43L, 44L, 29L, 30L, 3L, 4L, 43L, 44L, 29L, 30L, 52L, 52L, 43L, 44L, 29L, 30L, 35L, 52L, 52L, 3L, 4L, 43L, 44L, 29L, 30L, 52L, 43L, 44L, 29L, 30L, 43L, 44L, 29L, 30L, 17L, 17L, 42L, 42L, 43L, 44L, 29L, 30L, 43L, 44L, 29L, 30L, 43L, 44L, 29L, 30L, 3L, 4L, 25L, 25L, 16L, 16L, 3L, 4L, 43L, 44L, 24L, 3L, 4L, 52L, 52L, 17L, 35L, 35L, 35L, 17L, 3L, 4L, 6L, 35L, 42L, 42L, 42L, 42L, 3L, 4L, 17L, 25L, 17L, 17L, 29L, 30L, 25L, 3L, 4L, 29L, 30L, 3L, 4L, 17L, 17L, 17L, 35L, 3L, 4L, 17L, 17L, 17L, 29L, 30L, 43L, 44L, 43L, 44L, 29L, 30L, 17L, 6L, 43L, 44L, 29L, 30L, 43L, 44L, 29L, 30L, 43L, 44L, 29L, 30L, 3L, 43L, 44L, 29L, 30L, 3L, 43L, 44L, 29L, 30L, 17L, 17L, 42L, 42L, 25L, 42L, 25L, 43L, 44L, 29L, 30L, 42L, 17L, 17L, 42L, 42L, 43L, 44L, 29L, 30L, 25L, 29L, 30L, 43L, 44L, 29L, 30L, 43L, 44L, 29L, 30L, 25L, 29L, 30L, 43L, 44L, 29L, 30L, 43L, 44L, 29L, 30L, 43L, 44L, 29L, 30L, 25L, 25L, 25L, 25L), nodeB = c(16L, 16L, 17L, 24L, 24L, 25L, 25L, 35L, 35L, 16L, 16L, 17L, 24L, 24L, 24L, 24L, 25L, 25L, 25L, 25L, 35L, 35L, 35L, 43L, 43L, 44L, 44L, 24L, 24L, 25L, 25L, 16L, 16L, 24L, 24L, 25L, 25L, 35L, 35L, 35L, 16L, 16L, 16L, 24L, 24L, 24L, 25L, 25L, 35L, 35L, 43L, 44L, 52L, 52L, 17L, 24L, 24L, 25L, 25L, 35L, 43L, 44L, 29L, 30L, 43L, 44L, 24L, 24L, 25L, 
25L, 43L, 44L, 29L, 30L, 43L, 44L, 24L, 24L, 25L, 25L, 43L, 44L, 29L, 30L, 43L, 44L, 17L, 24L, 42L, 43L, 44L, 16L, 16L, 17L, 35L, 17L, 35L, 16L, 16L, 52L, 16L, 16L, 6L, 25L, 16L, 16L, 52L, 24L, 24L, 25L, 25L, 43L, 44L, 25L, 25L, 29L, 30L, 43L, 44L, 17L, 42L, 42L, 16L, 16L, 24L, 24L, 25L, 25L, 35L, 35L, 35L, 35L, 24L, 24L, 25L, 25L, 35L, 35L, 35L, 16L, 16L, 24L, 24L, 25L, 25L, 35L, 17L, 17L, 24L, 24L, 25L, 25L, 16L, 16L, 24L, 24L, 25L, 25L, 25L, 35L, 24L, 24L, 25L, 25L, 25L, 29L, 30L, 16L, 16L, 24L, 24L, 25L, 25L, 35L, 24L, 24L, 25L, 25L, 24L, 24L, 25L, 25L, 43L, 44L, 3L, 4L, 24L, 24L, 25L, 25L, 24L, 24L, 25L, 25L, 24L, 24L, 25L, 25L, 16L, 16L, 35L, 35L, 3L, 4L, 16L, 16L, 17L, 17L, 17L, 16L, 16L, 29L, 30L, 6L, 25L, 29L, 30L, 42L, 16L, 16L, 25L, 52L, 16L, 16L, 16L, 16L, 16L, 16L, 24L, 35L, 43L, 44L, 52L, 52L, 35L, 16L, 16L, 52L, 52L, 16L, 16L, 24L, 43L, 44L, 25L, 16L, 16L, 24L, 43L, 44L, 52L, 52L, 17L, 17L, 24L, 24L, 25L, 25L, 52L, 42L, 24L, 24L, 25L, 25L, 24L, 24L, 25L, 25L, 24L, 24L, 25L, 25L, 42L, 24L, 24L, 25L, 25L, 42L, 24L, 24L, 25L, 25L, 43L, 44L, 4L, 17L, 35L, 17L, 35L, 24L, 24L, 25L, 25L, 16L, 43L, 44L, 4L, 4L, 24L, 24L, 25L, 25L, 35L, 35L, 35L, 24L, 24L, 25L, 25L, 24L, 24L, 25L, 25L, 35L, 35L, 35L, 24L, 24L, 25L, 25L, 24L, 24L, 25L, 25L, 24L, 24L, 25L, 25L, 35L, 35L, 35L, 35L), attributeA = c(25L, 25L, 130L, 110L, 110L, 110L, 110L, 113L, 113L, 43L, 43L, 71L, 5L, 5L, 127L, 127L, 5L, 5L, 127L, 127L, 72L, 130L, 130L, 137L, 140L, 137L, 140L, 6L, 6L, 6L, 6L, 56L, 56L, 137L, 137L, 137L, 137L, 130L, 140L, 140L, 29L, 68L, 68L, 56L, 143L, 143L, 143L, 143L, 146L, 146L, 43L, 43L, 45L, 45L, 46L, 80L, 80L, 80L, 80L, 47L, 11L, 11L, 80L, 80L, 80L, 80L, 84L, 84L, 84L, 84L, 14L, 14L, 84L, 84L, 84L, 84L, 90L, 90L, 90L, 90L, 18L, 18L, 90L, 90L, 90L, 90L, 110L, 37L, 122L, 114L, 114L, 108L, 108L, 58L, 27L, 136L, 109L, 26L, 26L, 115L, 111L, 111L, 78L, 109L, 112L, 112L, 78L, 114L, 114L, 114L, 114L, 37L, 37L, 47L, 73L, 114L, 114L, 114L, 114L, 128L, 111L, 111L, 125L, 125L, 54L, 
54L, 54L, 54L, 45L, 58L, 58L, 143L, 55L, 55L, 55L, 55L, 126L, 136L, 136L, 44L, 44L, 56L, 56L, 56L, 56L, 145L, 68L, 68L, 57L, 57L, 57L, 57L, 128L, 128L, 58L, 58L, 58L, 58L, 143L, 146L, 59L, 59L, 59L, 59L, 126L, 70L, 70L, 129L, 129L, 60L, 60L, 60L, 60L, 73L, 61L, 61L, 61L, 61L, 62L, 62L, 62L, 62L, 124L, 124L, 91L, 91L, 63L, 63L, 63L, 63L, 64L, 64L, 64L, 64L, 65L, 65L, 65L, 65L, 135L, 135L, 58L, 136L, 127L, 127L, 57L, 57L, 143L, 143L, 68L, 138L, 138L, 143L, 143L, 80L, 136L, 126L, 126L, 109L, 139L, 139L, 128L, 80L, 110L, 112L, 113L, 30L, 141L, 141L, 135L, 70L, 125L, 125L, 126L, 126L, 142L, 69L, 69L, 128L, 128L, 144L, 144L, 138L, 128L, 128L, 142L, 145L, 145L, 139L, 129L, 129L, 130L, 130L, 121L, 121L, 79L, 79L, 79L, 79L, 91L, 109L, 82L, 82L, 82L, 82L, 86L, 86L, 86L, 86L, 88L, 88L, 88L, 88L, 97L, 92L, 92L, 92L, 92L, 118L, 94L, 94L, 94L, 94L, 107L, 107L, 89L, 138L, 111L, 140L, 113L, 116L, 116L, 116L, 116L, 1L, 134L, 134L, 92L, 19L, 135L, 135L, 135L, 135L, 128L, 138L, 138L, 136L, 136L, 136L, 136L, 137L, 137L, 137L, 137L, 130L, 140L, 140L, 138L, 138L, 138L, 138L, 139L, 139L, 139L, 139L, 140L, 140L, 140L, 140L, 138L, 140L, 144L, 146L), attributeB = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 11L, 11L, 11L, 11L, 13L, 13L, 13L, 13L, 13L, 13L, 14L, 14L, 14L, 14L, 17L, 17L, 17L, 17L, 17L, 17L, 18L, 18L, 18L, 18L, 19L, 19L, 19L, 19L, 19L, 23L, 23L, 23L, 23L, 24L, 24L, 25L, 25L, 25L, 27L, 27L, 28L, 28L, 29L, 29L, 29L, 36L, 36L, 36L, 36L, 36L, 36L, 37L, 37L, 37L, 37L, 37L, 37L, 38L, 38L, 38L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 42L, 42L, 42L, 42L, 42L, 42L, 42L, 43L, 43L, 43L, 43L, 43L, 43L, 43L, 44L, 44L, 44L, 44L, 44L, 44L, 45L, 45L, 45L, 45L, 45L, 45L, 45L, 45L, 46L, 46L, 46L, 46L, 46L, 46L, 46L, 47L, 47L, 47L, 47L, 47L, 47L, 47L, 48L, 48L, 48L, 48L, 49L, 
49L, 49L, 49L, 49L, 49L, 50L, 50L, 50L, 50L, 50L, 50L, 51L, 51L, 51L, 51L, 52L, 52L, 52L, 52L, 54L, 54L, 54L, 55L, 56L, 56L, 56L, 56L, 56L, 56L, 57L, 58L, 58L, 58L, 58L, 59L, 59L, 59L, 59L, 59L, 60L, 60L, 60L, 60L, 62L, 63L, 64L, 65L, 66L, 66L, 66L, 66L, 66L, 66L, 66L, 66L, 67L, 68L, 68L, 68L, 68L, 70L, 70L, 70L, 70L, 70L, 71L, 72L, 72L, 72L, 72L, 72L, 72L, 72L, 77L, 77L, 78L, 78L, 78L, 78L, 79L, 80L, 81L, 81L, 81L, 81L, 85L, 85L, 85L, 85L, 87L, 87L, 87L, 87L, 89L, 91L, 91L, 91L, 91L, 92L, 93L, 93L, 93L, 93L, 96L, 96L, 97L, 108L, 108L, 110L, 110L, 115L, 115L, 115L, 115L, 117L, 117L, 117L, 118L, 122L, 125L, 125L, 125L, 125L, 125L, 125L, 125L, 126L, 126L, 126L, 126L, 127L, 127L, 127L, 127L, 127L, 127L, 127L, 128L, 128L, 128L, 128L, 129L, 129L, 129L, 129L, 130L, 130L, 130L, 130L, 135L, 137L, 141L, 143L)), .Names = c("nodeA", "nodeB", "attributeA", "attributeB" ), row.names = c(3L, 4L, 5L, 7L, 8L, 9L, 10L, 12L, 13L, 18L, 19L, 20L, 24L, 25L, 26L, 27L, 28L, 29L, 31L, 32L, 35L, 36L, 37L, 38L, 39L, 40L, 41L, 52L, 53L, 54L, 55L, 59L, 60L, 62L, 63L, 64L, 65L, 71L, 72L, 73L, 78L, 82L, 83L, 86L, 87L, 88L, 89L, 90L, 96L, 97L, 98L, 99L, 108L, 109L, 112L, 114L, 115L, 116L, 117L, 120L, 121L, 122L, 129L, 131L, 134L, 135L, 141L, 142L, 143L, 144L, 146L, 147L, 153L, 154L, 156L, 157L, 163L, 164L, 165L, 166L, 168L, 169L, 175L, 176L, 178L, 179L, 183L, 186L, 187L, 188L, 189L, 196L, 197L, 198L, 201L, 204L, 206L, 208L, 209L, 213L, 216L, 217L, 221L, 222L, 225L, 226L, 230L, 241L, 242L, 243L, 244L, 248L, 249L, 255L, 256L, 259L, 260L, 264L, 265L, 272L, 276L, 277L, 284L, 285L, 287L, 288L, 289L, 290L, 292L, 293L, 294L, 295L, 303L, 304L, 305L, 306L, 308L, 309L, 310L, 315L, 316L, 318L, 319L, 320L, 321L, 325L, 333L, 334L, 336L, 337L, 338L, 339L, 347L, 348L, 350L, 351L, 352L, 353L, 354L, 359L, 365L, 366L, 367L, 368L, 369L, 373L, 374L, 381L, 382L, 384L, 385L, 386L, 387L, 390L, 395L, 396L, 397L, 398L, 406L, 407L, 408L, 409L, 411L, 412L, 416L, 417L, 421L, 422L, 423L, 424L, 430L, 431L, 432L, 433L, 438L, 
439L, 440L, 441L, 447L, 448L, 450L, 452L, 454L, 455L, 456L, 457L, 458L, 459L, 468L, 472L, 473L, 476L, 477L, 481L, 483L, 484L, 485L, 488L, 493L, 494L, 495L, 501L, 504L, 508L, 511L, 512L, 513L, 514L, 516L, 518L, 519L, 520L, 523L, 524L, 526L, 528L, 529L, 534L, 535L, 538L, 539L, 540L, 543L, 544L, 550L, 555L, 556L, 558L, 561L, 562L, 564L, 565L, 576L, 577L, 582L, 583L, 584L, 585L, 590L, 594L, 596L, 597L, 598L, 599L, 605L, 606L, 607L, 608L, 613L, 614L, 615L, 616L, 620L, 622L, 623L, 624L, 625L, 629L, 631L, 632L, 633L, 634L, 643L, 644L, 647L, 657L, 660L, 665L, 666L, 673L, 674L, 675L, 676L, 691L, 692L, 693L, 696L, 700L, 705L, 706L, 707L, 708L, 711L, 712L, 713L, 720L, 721L, 722L, 723L, 728L, 729L, 730L, 731L, 733L, 734L, 735L, 741L, 742L, 743L, 744L, 750L, 751L, 752L, 753L, 759L, 760L, 761L, 762L, 772L, 777L, 787L, 790L), class = "data.frame")
library(igraph)
g = graph.data.frame(df)
plot(g, vertex.size = 6, edge.arrow.mode=1, edge.arrow.size = 0)
> head(df)
nodeA nodeB attributeA attributeB
1 3 16 25 1
4 4 16 25 1
5 42 17 130 1
7 43 24 110 1
8 44 24 110 1
9 29 25 110 1
In the above, attributeA in the first row (25) is the exclusive attribute for node 3, so all other edges connecting to node 3 must also have attribute 25. Similarly, attributeB indicates that all edges connecting to node 16 must have attribute 1. It is not necessary that the first row be a retained edge, but it is necessary that no retained edges conflict.
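To make the constraint concrete, here is a minimal sketch of what fixing one node's attribute does to the edge list. The toy edges data frame and the helper fix_attribute are hypothetical and use base R only; the column names follow the data frame above.

```r
# Hypothetical toy edge list with the same columns as df
edges <- data.frame(
  nodeA      = c(3, 4, 42, 3),
  nodeB      = c(16, 16, 16, 42),
  attributeA = c(25, 25, 29, 97),
  attributeB = c(1, 1, 6, 89)
)

# Drop every edge that touches `node` with an attribute other than `attr`
fix_attribute <- function(edges, node, attr) {
  conflict <- (edges$nodeA == node & edges$attributeA != attr) |
              (edges$nodeB == node & edges$attributeB != attr)
  edges[!conflict, ]
}

# Fixing attribute 1 on node 16 removes the 42-16 edge (it needs 16 to have 6)
kept <- fix_attribute(edges, node = 16, attr = 1)
kept
```

The 3-42 edge survives this step because it does not touch node 16, but keeping it would in turn fix attributes on nodes 3 and 42, which is the cascading effect the question is about.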
Thanks for reading!
Is this a known problem? Can you point me to any references?
This is quite an interesting problem, and not one that I've encountered before.
How would you approach this problem?
I would approach this problem from an integer programming perspective. The decision variables will be used to select the attribute of each node (only edges labeled with the attributes of both of their endpoints will be allowed). Further, we will select a "root node" that we expect to be in the large connected component, and we will create flow outward from this root node. Each other node will have demand 1, and flow will only be possible over valid edges. We will maximize the amount of flow pushed out from the root node; this will be the number of other nodes in the large component.
To achieve this formulation, I would create two classes of variables:
Node attribute variables: For each node i and attribute a, I would create a binary variable z_ia that is 1 if node i is assigned attribute a and 0 otherwise.
Flow variables: For each edge from node i to j (I assume "from" is nodeA in your data frame and "to" is nodeB in your data frame), variable x_ij indicates the amount of flow from i to j (negative values indicate flow from j to i).
We also have a number of different constraints:
Each node only has 1 attribute: This can be achieved with \sum_{a\in A} z_ia = 1 for each node i, where A is the set of all attributes.
Edge flows are 0 if the edge is not valid: For each edge from i to j with attributes a and b, respectively, we will have x_ij <= n*z_ia, x_ij <= n*z_jb, x_ij >= -n*z_ia, and x_ij >= -n*z_jb. In all four constraints, n is the total number of nodes. These constraints will force x_ij=0 if z_ia=0 or z_jb=0, and otherwise will not be binding.
The net flow to any non-root node falls in [0, 1]: This constraint ensures that all outflow must come from the root, so nodes can only get flow if they are connected to the root. For each non-root node i with edges incoming from node set I and edges outgoing to node set O, these constraints are of the form \sum_{j\in I} x_ji - \sum_{j\in O} x_ij >= 0 and \sum_{j\in I} x_ji - \sum_{j\in O} x_ij <= 1.
The objective is to maximize the amount of flow out of the root node r. If r has incoming edges from nodes in set I and outgoing edges to nodes in set O, then this objective (which we maximize) is \sum_{j\in O} x_rj - \sum_{j\in I} x_jr.
With these variables and constraints in place, all you need to do is specify the root node r and solve; the solution will indicate the best possible assignment of attributes to nodes, assuming that r is in the largest component. If you re-solved for each root node r, you would end up with the global optimal assignment.
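For reference, the pieces described above can be collected into one model. This is only a restatement under my reading of the text: the subscripted sets I_i and O_i (in-neighbours and out-neighbours of node i) are notation introduced here, and the root's outflow is written with r as the tail of outgoing edges.

```latex
\begin{aligned}
\max_{x,\,z}\quad & \sum_{j \in O_r} x_{rj} \;-\; \sum_{j \in I_r} x_{jr} \\
\text{s.t.}\quad
& \sum_{a \in A} z_{ia} = 1
  && \text{for each node } i, \\
& -n\,z_{ia} \le x_{ij} \le n\,z_{ia}
  && \text{for each edge } (i,j) \text{ with attributes } (a,b), \\
& -n\,z_{jb} \le x_{ij} \le n\,z_{jb}
  && \text{for each edge } (i,j) \text{ with attributes } (a,b), \\
& 0 \le \sum_{j \in I_i} x_{ji} - \sum_{j \in O_i} x_{ij} \le 1
  && \text{for each node } i \ne r, \\
& z_{ia} \in \{0, 1\}.
\end{aligned}
```

When two nodes are joined by several edges with different attribute pairs, each of those edges gets its own flow variable, so x is really indexed by edge rather than by node pair.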
The following is an implementation of this approach with the lpSolve package in R:
library(lpSolve)
optim <- function(df, r) {
  # Note: defining this in the global environment masks stats::optim
  # Some bookkeeping
  nodes = c(df$nodeA, df$nodeB)
  u.nodes <- unique(nodes)
  if (!r %in% u.nodes) {
    stop("Invalid root node provided")
  }
  n.node <- length(u.nodes)
  attrs = c(df$attributeA, df$attributeB)
  node.attrs <- do.call(rbind, lapply(u.nodes, function(x) {
    data.frame(node=x, attr=unique(attrs[nodes == x]))
  }))
  n.na <- nrow(node.attrs)
  n.e <- nrow(df)
  # Constraints limiting each node to have exactly one attribute
  node.one.attr <- t(sapply(u.nodes, function(i) {
    c(node.attrs$node == i, rep(0, 2*n.e))
  }))
  node.one.attr.dir <- rep("==", n.node)
  node.one.attr.rhs <- rep(1, n.node)
  # Constraints limiting edges to only be used if both attributes are selected
  edge.flow <- do.call(rbind, lapply(seq_len(n.e), function(idx) {
    i <- df$nodeA[idx]
    j <- df$nodeB[idx]
    a <- df$attributeA[idx]
    b <- df$attributeB[idx]
    na.i <- node.attrs$node == i & node.attrs$attr == a
    na.j <- node.attrs$node == j & node.attrs$attr == b
    rbind(c(-n.node*na.i, seq_len(n.e) == idx, -(seq_len(n.e) == idx)),
          c(-n.node*na.j, seq_len(n.e) == idx, -(seq_len(n.e) == idx)),
          c(n.node*na.i, seq_len(n.e) == idx, -(seq_len(n.e) == idx)),
          c(n.node*na.j, seq_len(n.e) == idx, -(seq_len(n.e) == idx)))
  }))
  edge.flow.dir <- rep(c("<=", "<=", ">=", ">="), n.e)
  edge.flow.rhs <- rep(0, 4*n.e)
  # Constraints limiting net flow on non-root nodes
  net.flow <- do.call(rbind, lapply(u.nodes, function(i) {
    if (i == r) {
      return(NULL)
    }
    rbind(c(rep(0, n.na), (df$nodeB == i) - (df$nodeA == i),
            -(df$nodeB == i) + (df$nodeA == i)),
          c(rep(0, n.na), (df$nodeB == i) - (df$nodeA == i),
            -(df$nodeB == i) + (df$nodeA == i)))
  }))
  net.flow.dir <- rep(c(">=", "<="), n.node-1)
  net.flow.rhs <- rep(c(0, 1), n.node-1)
  # Build the model
  mod <- lp(direction = "max",
            objective.in = c(rep(0, n.na), (df$nodeA == r) - (df$nodeB == r),
                             -(df$nodeA == r) + (df$nodeB == r)),
            const.mat = rbind(node.one.attr, edge.flow, net.flow),
            const.dir = c(node.one.attr.dir, edge.flow.dir, net.flow.dir),
            const.rhs = c(node.one.attr.rhs, edge.flow.rhs, net.flow.rhs),
            binary.vec = seq_len(n.na))
  # Extract the chosen attribute per node and the edges consistent with it
  opt <- node.attrs[mod$solution[1:n.na] > 0.999,]
  valid.edges <- df[opt$attr[match(df$nodeA, opt$node)] == df$attributeA &
                    opt$attr[match(df$nodeB, opt$node)] == df$attributeB,]
  list(attrs = opt,
       edges = valid.edges,
       objval = mod$objval)
}
It can solve the problem for subsets of the nodes in your original graph, but it becomes quite slow as you include an increasing number of nodes:
# Limit to 5 nodes
keep <- c(3, 4, 6, 16, 42)
df.play <- df[df$nodeA %in% keep & df$nodeB %in% keep,]
(opt.play <- optim(df.play, 42))
# $attrs
# node attr
# 24 3 50
# 45 4 50
# 50 42 91
# 60 16 127
# 87 6 109
#
# $edges
# nodeA nodeB attributeA attributeB
# 416 42 3 91 50
# 417 42 4 91 50
#
# $objval
# [1] 2
That run took 15 seconds. To speed this up, you could consider switching to a more powerful solver such as CPLEX or Gurobi. These solvers are free for academic use but require a paid license otherwise.
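For instances this small, the result can also be sanity-checked by brute force. The sketch below is not the integer program above but a swapped-in exhaustive search: it tries every combination of candidate attributes and measures the largest connected component with simple label propagation. The toy edge list and the helper names largest_component_size and brute_force are hypothetical, and the search grows exponentially with the number of nodes, so it is only meant as a cross-check.

```r
# Size (in nodes) of the largest connected component, base R only
largest_component_size <- function(nodes, edges) {
  label <- setNames(seq_along(nodes), nodes)
  repeat {
    changed <- FALSE
    for (k in seq_len(nrow(edges))) {
      a <- as.character(edges$nodeA[k])
      b <- as.character(edges$nodeB[k])
      m <- min(label[[a]], label[[b]])
      if (label[[a]] != m || label[[b]] != m) {
        label[[a]] <- m
        label[[b]] <- m
        changed <- TRUE
      }
    }
    if (!changed) break
  }
  max(table(label))
}

# Try every attribute assignment and keep the one with the largest component
brute_force <- function(edges) {
  nodes <- unique(c(edges$nodeA, edges$nodeB))
  cand <- lapply(nodes, function(v) {
    unique(c(edges$attributeA[edges$nodeA == v],
             edges$attributeB[edges$nodeB == v]))
  })
  grid <- expand.grid(cand)
  best <- 0
  best_assign <- NULL
  for (row in seq_len(nrow(grid))) {
    assign <- unlist(grid[row, ], use.names = FALSE)
    ok <- assign[match(edges$nodeA, nodes)] == edges$attributeA &
          assign[match(edges$nodeB, nodes)] == edges$attributeB
    size <- largest_component_size(nodes, edges[ok, , drop = FALSE])
    if (size > best) {
      best <- size
      best_assign <- setNames(assign, nodes)
    }
  }
  list(size = best, attrs = best_assign)
}

# Hypothetical toy instance with three nodes and four candidate edges
toy <- data.frame(
  nodeA      = c(1, 1, 2, 2),
  nodeB      = c(2, 3, 3, 3),
  attributeA = c(10, 11, 20, 21),
  attributeB = c(20, 30, 30, 31)
)
res <- brute_force(toy)
res
```

Note that size here counts all nodes in the component, whereas objval in the integer program counts the nodes other than the root, so the two should differ by one when they agree.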
If this is the solution, is there a way using igraph in R to specify an "edge attribute constraint" and pull out the resulting, fragmented graph?
Yes, given the attributes you can easily subset and plot the graph. For the 5-node example that I solved above:
g <- graph.data.frame(opt.play$edges, vertices=unique(c(df.play$nodeA, df.play$nodeB)))
plot(g, vertex.size = 6, edge.arrow.mode=1, edge.arrow.size = 0)
While working through this problem I stumbled upon a simpler solution. It seems my formulation of the problem was making the answer hard to see.
The core of the matter is: when two different constraints are applied to a node, it effectively becomes two distinct nodes.
Framing the challenge in this way allows us to rapidly construct graphs for each set of constraints. We can then quickly inspect these, look at the size, and (as my original question desired) select the set of constraints which preserves the largest graph.
g = graph.data.frame(df)
plot(g, vertex.size = 6, edge.arrow.mode=1, edge.arrow.size = 0)
# Combine the node and the rule into a new, unique node id referencing both the node and the constraint
df.split = c(df[,1:2]) + df[,3:4]*1E3
# Keep track of edge numbers in this dataset for later
df.split = cbind(df.split, row = seq(nrow(df)))
g.split = graph.data.frame(df.split)
plot(g.split, vertex.size = 6, edge.arrow.mode=1, edge.arrow.size = 0)
# Decompose into unlinked sub graphs and count the edges in each
g.list = decompose.graph(g.split)
g.list.nodenum = sapply(g.list, ecount)
head(g.list.nodenum[order(g.list.nodenum, decreasing=T)])
[1] 9 8 5 5 5 5
# Select the largest subgraph
g.sub = g.list[[order(g.list.nodenum, decreasing=T)[1]]]
plot(g.sub)
# Find what edges these were in the original dataset
originaledges = E(g.sub)$row
originaledges
[1] 129 157 130 158 131 159 212 213 132
# Play with the resulting graph, the largest graph which obeys constraints at all nodes.
df.largest = df[originaledges,]
df.largest
nodeA nodeB attributeA attributeB
292 25 35 45 41
352 29 25 58 45
293 29 35 58 41
353 30 25 58 45
294 30 35 58 41
354 52 25 143 45
476 52 29 143 58
477 52 30 143 58
295 52 35 143 41
g.largest = graph.data.frame(df.largest)
plot(g.largest, vertex.size = 6, edge.arrow.mode=1, edge.arrow.size = 0)
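A small sketch of the node-splitting idea on its own, using string ids instead of the *1E3 arithmetic above (a deliberate swap, so the ids stay readable). The toy edge list is invented, and the final check is an extra precaution I am assuming is worthwhile: a component of the split graph could in principle contain the same original node under two different attributes, which would not be a valid selection.

```r
# Hypothetical toy edge list with the same columns as df
edges <- data.frame(
  nodeA      = c(3, 4, 3),
  nodeB      = c(16, 16, 16),
  attributeA = c(25, 25, 43),
  attributeB = c(1, 1, 3)
)

# One composite id per (node, attribute) pair
split_edges <- data.frame(
  from = paste(edges$nodeA, edges$attributeA, sep = ":"),
  to   = paste(edges$nodeB, edges$attributeB, sep = ":"),
  row  = seq_len(nrow(edges))
)
ids <- unique(c(split_edges$from, split_edges$to))
ids

# Check a candidate component: no original node may appear twice
component <- c("3:25", "4:25", "16:1")
original_nodes <- sub(":.*$", "", component)
has_conflict <- any(duplicated(original_nodes))
has_conflict
```

In the result shown above, nodes 25, 29, 30, 35 and 52 each appear with a single attribute, so that component passes this check.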
Hopefully this helps someone someday!
