Row- and column-wise calculation in long format in R

I am struggling with cell-wise calculations in a complex dataset (see below for a dput() example).
I need to apply the formula for the standardized mean difference, (M1 - M2) / sqrt(s1^2 + s2^2), to multiple rows and columns (studies and tests). The means M1 and M2 are in the pr_cognm_ columns and the standard deviations s1 and s2 are in the pr_cognsd_ columns; which row is M1 and which is M2 depends on the factors id and tx...3 (treatments).
For example, in the pr_cognm_VV2_CRT_error column, id 336 has two rows and, in this case, the value in the treat1 row needs to be subtracted from the value in the VGT row. Sometimes it is the other way around: within id 162, dGT needs to be subtracted from treat1 (luckily, the direction is always the same for a given comparison). The same pairing applies to the standard deviations, which are squared and added. Finally, the mean difference is divided by the square root of that sum. There are many tests (columns) to run the formula on (e.g. pr_cognm_BVMT_perc_retention, pr_cognm_VV2_CRT_error) and many NAs, since a given id often did not include a given test. The data is in long format and, to make it more complicated, some ids have three rows instead of two (where two rows need to be subtracted from one other row in a specific direction, e.g. task1).
My best idea was the following:
library(dplyr)    # for select() and %>%
library(reshape2) # for dcast()
# make a dataset
a <- readxl::read_excel("C:/.../reprod.xlsx")
b <- a[!grepl("com", a$id), ] # already omitted in the example dataset
pr_cognm <- dplyr::select(b, contains("pr_cognm"))
pr_cognsd <- dplyr::select(b, contains("pr_cognsd"))
c <- cbind(b$tx...3, b$id, pr_cognm, pr_cognsd)
# turn variables into factors and numerics
c$`b$id` <- as.factor(c$`b$id`)
c[, 3:ncol(c)] <- sapply(c[, 3:ncol(c)], as.numeric)
# square all standard deviations (s1^2 and s2^2)
c[, grepl("pr_cognsd", colnames(c))] <- c[, grepl("pr_cognsd", colnames(c))]^2
# then reshape one test to wide format
c %>%
  dcast(b$id ~ b$tx...3, value.var = c("pr_cognm_VV2_CRT_error"), fill = 0)
   b$id BF BL BT dGT H-TT  HFL LM-TT treat1  VGT
 1   55  0  0  0   0    0 0.00  0.00   0.00 0.00
 2  162  0  0  0   0    0 0.00  0.00   0.00 0.00
 3  236  0  0  0   0    0 0.00  0.00   0.00 0.00
 4  336  0  0  0   0    0 0.00  0.00   8.75 7.58
 5  377  0  0  0   0    0 0.00  0.00   0.00 0.00
 6  521  0  0  0   0    0 0.00  0.00   0.00 0.00
 7  525  0  0  0   0    0 0.00  0.00   0.00 0.00
 8  527  0  0  0   0    0 0.00  0.00   0.00 0.00
 9  528  0  0  0   0    0 0.00  0.00   0.00 0.00
10  535  0  0  0   0    0 5.65  6.54   0.00 0.00
11  548  0  0  0   0    0 0.00  0.00   0.00 0.00
12  553  0  0  0   0    0 0.00  0.00   0.00 0.00
Now I could define rules for which columns to subtract, such as c$sub <- c$treat1 - c$VGT and c$sub <- c$HFL - c$`LM-TT`, combine the SDs in a similar fashion, and finally divide the two results to get the SMD. But this only works for one test, in this case value.var = c("pr_cognm_VV2_CRT_error"). I would like to get this matrix for every test in the dataset, e.g. via a loop, since passing more than one value.var does not work:
require(reshape2)
c %>%
dcast(b$id ~ b$tx...3, value.var = c("pr_cognm_VV2_CRT_error", "pr_cognm_BNT_perc_retention"), fill = 0)
Error in .subset2(x, i, exact = exact) : subscript out of bounds
In addition: Warning message:
In if (!(value.var %in% names(data))) { :
the condition has length > 1 and only the first element will be used
If there were a way to loop over all the columns, e.g.
c %>%
  dcast(b$id ~ b$tx...3, value.var = names(c[, 3:ncol(c)]), fill = 0)
then I could maybe rbind the results, do the subtractions into a new variable as described above and, after doing the same for the SDs, finally do the division to get the SMD.
I could not get any solution to work.
Reproducible example (truncated):
a <- structure(list(checked = c("Y", "Y", NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA), id = c("55", "55", "162", "162", "236", "236", "336", "336",
"377", "377", "521", "521", "525", "525", "527", "527", "528",
"528", "535", "535", "548", "548", "548", "553", "553"), tx...3 = c("task1",
"VGT", "dGT", "task1", "BT", "H-TT", "task1", "VGT", "BT", "H-TT",
"task1", "VGT", "HFL", "H-TT", "BF", "BT", "HFL", "task1", "HFL",
"LM-TT", "HFL", "BL", "task1", "HFL", "task1"), nta = c(2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3,
2, 2), id2 = c(1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2,
1, 2, 1, 2, 1, 2, 3, 1, 2), cross = c("N", "N", "N", "N", "N",
"N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N",
"N", "N", "N", "N", "N", "N", "N"), pre_post = c("N", "N", "N",
"N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N",
"N", "N", "N", "N", "N", "N", "N", "N", "N"), case_control = c("N",
"N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N",
"N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N"), expsy = c("Y",
"Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y",
"Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y"), hosp = c("Out",
"Out", "Out", "Out", "In", "In", "NA", "NA", "In", "In", "NR",
"NR", "NR", "NR", "Mx", "Mx", "NR", "NR", "Mx", "Mx", "Out",
"Out", "Out", "Out", "Out"), tx...11 = c("task1", "VGT", "dGT",
"task1", "BT", "H-TT", "task1", "VGT", "BT", "H-TT", "task1",
"VGT", "HFL", "H-TT", "BF", "BT", "HFL", "task1", "HFL", "LM-TT",
"HFL", "BL", "task1", "HFL", "task1"), vt_p = c("17", "17", "24",
"24", "21", "21", "NR", "NR", "NR", "NR", "NA", "NA", "17", "17",
"24", "24", "17", "17", "17", "17", "17", "17", "17", "17", "17"
), n_se = c("12", "12", "20", "20", "6", "6", "10", "10", "NR",
"NR", "20", "20", "10", "6", "8", "8", "15", "15", "10", "6",
"15", "15", "15", "10", "10"), cogn_name_AMI_K = c(NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA), cogn_cite_AMI_K = c(NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA), cogn_last_stim_AMI_K = c(NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA), n_bcogn_AMI_K = c(NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA), n_pcogn_AMI_K = c(NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA), pr_cognm_AMI_K = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA),
pr_cognsd_AMI_K = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
), po_cognm_n_AMI_K = c(NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA), po_cognsd_n_AMI_K = c(NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA), cogn_name_BVMT_perc_retention = c(NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, "brief_visual_memory_test", "brief_visual_memory_test",
"brief_visual_memory_test", NA, NA), cogn_cite_BVMT_perc_retention = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, "benedict_1997", "benedict_1997", "benedict_1997",
NA, NA), cogn_last_stim_BVMT_perc_retention = c(NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, "NR", "NR", "NR", NA, NA), n_bcogn_BVMT_perc_retention = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, 19, 24, 18, NA, NA), n_pcogn_BVMT_perc_retention = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, 19, 24, 18, NA, NA), pr_cognm_BVMT_perc_retention = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, "91.6", "86.6", "90", NA, NA), pr_cognsd_BVMT_perc_retention = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, "17", "36", "13.3", NA, NA), po_cognm_n_BVMT_perc_retention = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, "86.7", "82.1", "71.7", NA, NA), po_cognsd_n_BVMT_perc_retention = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, "20", "39.3", "24.2", NA, NA), cogn_name_BNT_naming = c(NA,
NA, NA, NA, NA, NA, NA, NA, "boston_naming_task_naming",
"boston_naming_task_naming", NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA), cogn_cite_BNT_naming = c(NA,
NA, NA, NA, NA, NA, NA, NA, "kaplan_1983", "kaplan_1983",
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
), cogn_last_stim_BNT_naming = c(NA, NA, NA, NA, NA, NA,
NA, NA, 30, 30, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA), n_bcogn_BNT_naming = c(NA, NA, NA, NA, NA,
NA, NA, NA, 14, 14, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA), n_pcogn_BNT_naming = c(NA, NA, NA, NA,
NA, NA, NA, NA, 14, 14, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA), pr_cognm_BNT_naming = c(NA, NA,
NA, NA, NA, NA, NA, NA, "19.64", "18.14", NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), pr_cognsd_BNT_naming = c(NA,
NA, NA, NA, NA, NA, NA, NA, "9.15", "5.3", NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), po_cognm_n_BNT_naming = c(NA,
NA, NA, NA, NA, NA, NA, NA, "20.21", "20.71", NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), po_cognsd_n_BNT_naming = c(NA,
NA, NA, NA, NA, NA, NA, NA, "9.38", "6.34", NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), cogn_name_VV2_CRT_error = c(NA,
NA, NA, NA, NA, NA, "VV2_crt_error", "VV2_crt_error", NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, "VV2_crt_error", "VV2_crt_error",
NA, NA, NA, NA, NA), cogn_cite_VV2_CRT_error = c(NA, NA,
NA, NA, NA, NA, "robbins_1994", "robbins_1994", NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, "robbins_1994", "robbins_1994",
NA, NA, NA, NA, NA), cogn_last_stim_VV2_CRT_error = c(NA,
NA, NA, NA, NA, NA, "NR", "NR", NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, "NR", "NR", NA, NA, NA, NA, NA), n_bcogn_VV2_CRT_error = c(NA,
NA, NA, NA, NA, NA, 12, 12, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, 12, 12, NA, NA, NA, NA, NA), n_pcogn_VV2_CRT_error = c(NA,
NA, NA, NA, NA, NA, 12, 12, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, 12, 12, NA, NA, NA, NA, NA), pr_cognm_VV2_CRT_error = c(NA,
NA, NA, NA, NA, NA, "8.75", "7.58", NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, "5.65", "6.54", NA, NA, NA, NA, NA), pr_cognsd_VV2_CRT_error = c(NA,
NA, NA, NA, NA, NA, "1.13", "2.84", NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, "1.10", "1.89", NA, NA, NA, NA, NA), po_cognm_n_VV2_CRT_error = c(NA,
NA, NA, NA, NA, NA, "7.50", "5.33", NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, "7.50", "2.34", NA, NA, NA, NA, NA), po_cognsd_n_VV2_CRT_error = c(NA,
NA, NA, NA, NA, NA, "2.06", "2.42", NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, "2.06", "2", NA, NA, NA, NA, NA), cogn_name_VV2_CRT_latency = c(NA,
NA, NA, NA, NA, NA, "VV2_crt_latency", "VV2_crt_latency",
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA), cogn_cite_VV2_CRT_latency = c(NA, NA, NA, NA, NA,
NA, "robbins_1994", "robbins_1994", NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), cogn_last_stim_VV2_CRT_latency = c(NA,
NA, NA, NA, NA, NA, "NR", "NR", NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), n_bcogn_VV2_CRT_latency = c(NA,
NA, NA, NA, NA, NA, 12, 12, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA), n_pcogn_VV2_CRT_latency = c(NA,
NA, NA, NA, NA, NA, 12, 12, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA), pr_cognm_VV2_CRT_latency = c(NA,
NA, NA, NA, NA, NA, "476.05", "465.65", NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), pr_cognsd_VV2_CRT_latency = c(NA,
NA, NA, NA, NA, NA, "35.86", "37.54", NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), po_cognm_n_VV2_CRT_latency = c(NA,
NA, NA, NA, NA, NA, "460.66", "433.13", NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), po_cognsd_n_VV2_CRT_latency = c(NA,
NA, NA, NA, NA, NA, "34.75", "46.70", NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)), row.names = c(NA,
-25L), class = c("tbl_df", "tbl", "data.frame"))
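One possible way to avoid reshaping test by test is to stack all pr_cognm_ and pr_cognsd_ columns into a long table with one row per id, treatment, and test, and then apply the formula once per group. The following is only a sketch under assumptions: it uses tidyr and dplyr, a small toy tibble in place of the real data, a column named tx instead of tx...3, and a hypothetical lookup table `direction` that states which treatment supplies M1 in each study. It also assumes exactly two arms per id and test.

```r
library(dplyr)
library(tidyr)

# Toy data in the same shape as the question: one row per id and treatment,
# means in pr_cognm_* and standard deviations in pr_cognsd_* (stored as text).
toy <- tibble(
  id = c("336", "336", "535", "535"),
  tx = c("task1", "VGT", "HFL", "LM-TT"),
  pr_cognm_VV2_CRT_error    = c("8.75", "7.58", "5.65", "6.54"),
  pr_cognsd_VV2_CRT_error   = c("1.13", "2.84", "1.10", "1.89"),
  pr_cognm_VV2_CRT_latency  = c("476.05", "465.65", NA, NA),
  pr_cognsd_VV2_CRT_latency = c("35.86", "37.54", NA, NA)
)

# Hypothetical rule table: the treatment whose mean is M1 (the minuend) per study.
direction <- tibble(id = c("336", "535"), first_tx = c("VGT", "HFL"))

smd_by_test <- toy %>%
  # Split each column name into statistic (m or sd) and test name;
  # ".value" turns the statistic into the new columns `m` and `sd`.
  pivot_longer(
    cols = starts_with("pr_cogn"),
    names_to = c(".value", "test"),
    names_pattern = "pr_cogn(m|sd)_(.*)"
  ) %>%
  mutate(m = as.numeric(m), sd = as.numeric(sd)) %>%
  filter(!is.na(m)) %>%                 # drop tests a study did not report
  left_join(direction, by = "id") %>%
  group_by(id, test) %>%
  summarise(
    # (M1 - M2) / sqrt(s1^2 + s2^2), as given in the question
    smd = (m[tx == first_tx] - m[tx != first_tx]) / sqrt(sum(sd^2)),
    .groups = "drop"
  )

print(smd_by_test)
```

Studies with three arms (such as id 548) do not fit this two-row assumption; they would probably need a lookup table with one row per pairwise comparison instead of a single first_tx per id.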

Related

Remove all rows before row meeting condition in R

I am importing multiple excel files. The files have a non-standard structure, but all have the required data after a row of headers, midway down the rows of the data frame.
Here is a MWE:
df= structure(list(...1 = c("CPET Results", NA, "Operator", NA, NA,
"Patient data", NA, "Administrative Data", "ID", "Title", "Last Name",
"First Name", "Name Addition", "Sex", "Date of Birth", NA, NA,
"Biological and Medical Baseline Data", "Height", "Weight", "Mask",
"Race", "Body Fat", "Hip/Waist Ratio", "BMI", "Estimated Fitness Level",
"BSA", "Hct", "Hb", "Medication that changes the Heart Rate",
"Medication", "Existing Medical Conditions", NA, NA, NA, "Test data",
"Start Time", "Duration", "CPET device", "Serial number", "Firmware version",
"Flow Sensor", "Temperature", "Barometric Pressure", "Humidity",
NA, NA, NA, "Variable", "V'O2", "V'CO2", "V'O2/kg", "V'O2/HR",
"HR", "V'E/V'O2", "V'E/V'CO2", "V'E", "BF", "RER", "WR", NA,
NA, "t", "h:mm:ss.ms", "0:00:25.000", "0:00:26.000", "0:00:27.000",
"0:00:28.000", "0:00:29.000", "0:00:30.000"), ...2 = c(NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "Unit",
"L/min", "L/min", "ml/min/kg", "ml", "/min", NA, NA, "L/min",
"/min", NA, "W", NA, NA, "Phase", NA, "Rest", "Rest", "Rest",
"Rest", "Rest", "Rest"), ...3 = c(NA, NA, NA, NA, NA, NA, NA,
NA, "343", NA, "GFRex", "343", NA, "female", "21/05/1924", NA,
NA, NA, "178 cm", "88.2 kg", "Blue, medium", NA, NA, "0.96",
"28", NA, "2.06 m2", NA, NA, NA, NA, NA, NA, NA, NA, NA, "12/04/2021 11:27 AM",
"0:15:12", "MetaLyzer 3B-R3", "231821624", "1.3.10", NA, "21.5°C",
"1030mBar", "36%", NA, NA, NA, "Rest", "0.36", "0.31", "4", "0",
"-", "35.7", "40.0", "14.9", "14", "0.88", "0", NA, NA, "Marker",
NA, NA, NA, NA, NA, NA, NA), ...4 = c(NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "Unloaded Pedalling",
"-", "-", "-", "-", "-", "-", "-", "-", "-", "-", "-", NA, NA,
"V'O2", "L/min", "0.61123179277253403", "0.61123179277253403",
"0.61123179277253403", "0.61123179277253403", "0.51731964113453299",
"0.51731964113453299"), ...5 = c(NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, "Warm Up", "0.61", "0.47",
"7", "0", "-", "26.2", "33.9", "18.5", "16", "0.77", "0", NA,
NA, "V'O2/kg", "ml/min/kg", "6.9339965147196203", "6.9339965147196203",
"6.9339965147196203", "6.9339965147196203", "5.8686289408341796",
"5.8686289408341796"), ...6 = c(NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, "VT1", "1.22", "0.98", "14",
"-", "-", "28.6", "32.3", "35.4", "22", "0.88", "71", NA, NA,
"V'O2/HR", "ml", "0", "0", "0", "0", "0", "0"), ...7 = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "VT1 % Norm",
"145", "-", "145", "-", "-", "-", "-", "-", "102", "-", "131",
NA, NA, "HR", "/min", NA, NA, NA, NA, NA, NA), ...8 = c(NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "VT1 % Max",
"69", "55", "69", "-", "-", "83", "96", "54", "76", "87", "55",
NA, NA, "WR", "W", "0", "0", "0", "0", "0", "0"), ...9 = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "VT2",
"1.71", "1.66", "19", "-", "-", "31.9", "32.7", "60.0", "32",
"0.97", "122", NA, NA, "V'E/V'O2", NA, "30.6521809263484", "30.6521809263484",
"30.6521809263484", "30.6521809263484", "34.760039405568897",
"34.760039405568897"), ...10 = c(NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, "VT2 % Norm", "203", "-",
"203", "-", "-", "-", "-", "-", "147", "-", "226", NA, NA, "V'E/V'CO2",
NA, "35.697970640705897", "35.697970640705897", "35.697970640705897",
"35.697970640705897", "39.618822090063901", "39.618822090063901"
), ...11 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, "VT2 % Max", "97", "93", "97", "-", "-", "93",
"97", "92", "110", "96", "96", NA, NA, "RER", NA, "0.858653317715381",
"0.858653317715381", "0.858653317715381", "0.858653317715381",
"0.87736175817015105", "0.87736175817015105"), ...12 = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "V'O2peak",
"1.77", "1.79", "20", "0", "-", "34.3", "33.8", "65.4", "29",
"1.01", "128", NA, NA, "V'E", "L/min", "23.334937499999999",
"23.334937499999999", "23.334937499999999", "23.334937499999999",
"21.762284444444401", "21.762284444444401"), ...13 = c(NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "V'O2peak % Norm",
"210", "-", "210", "0", "-", "-", "-", "-", "135", "-", "237",
NA, NA, "VT", "L", "0.86250000000000004", "0.86250000000000004",
"0.86250000000000004", "0.86250000000000004", "0.97866666666666702",
"0.97866666666666702"), ...14 = c(NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, "Normal", "0.84", "-", "10",
"8", "104", "-", "-", "-", "22", "-", "54", NA, NA, "BF", "/min",
"27.055", "27.055", "27.055", "27.055", "22.2366666666667", "22.2366666666667"
), ...15 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, "Absolute Maximum Values", "2.06", "1.96", "23",
"0", "-", "44.5", "37.2", "73.5", "34", "1.19", "128", NA, NA,
"V'CO2", "L/min", "0.52483620675725695", "0.52483620675725695",
"0.52483620675725695", "0.52483620675725695", "0.45387646988174501",
"0.45387646988174501"), ...16 = c(NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, "WR", "W", "0", "0", "0", "0", "0",
"0")), row.names = c(NA, -70L), class = c("tbl_df", "tbl", "data.frame"
))
I want to remove all rows before ...1 == "t". I'm importing multiple files and want to do this to all of them at the same time, and the header "t" appears at a different row number in each file.
I have tried
df1 = df[-c(1:row_number(df$...1 =="t")),]
df1 = df[-c(rownames(df[df$...1 =="t",])),]
I'd like a base R or dplyr solution. Thanks
In dplyr, the slice function can be used to select rows by index, and the base-R which() can tell you which row index to start at.
df %>%
slice(min(which(...1 == 't')):n())
This code will check for any rows on which ...1 == 't', then which() tells you the row index. min() is in case you get a file with two rows of 't'. Then, the slice picks all rows from that row you just found to the end (n()).
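In base R you could do the following (indexing from the first match to the last row also works when "t" is already in row 1):
df[min(which(df[["...1"]] == "t")):nrow(df), ]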
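Since the question is about several files, the same idea can be wrapped in a small helper and applied to a list of data frames. This is a minimal base-R sketch: trim_before() is a hypothetical helper name, and the two toy data frames merely stand in for imported sheets in which the "t" row sits at different positions.

```r
# Hypothetical helper: keep the first row where `col` equals `marker`
# and every row after it.
trim_before <- function(d, col = "...1", marker = "t") {
  start <- which(d[[col]] == marker)   # which() skips NA comparisons
  if (length(start) == 0) {
    stop("marker '", marker, "' not found in column ", col)
  }
  d[min(start):nrow(d), , drop = FALSE]
}

# Toy stand-ins for imported sheets; the marker is on a different row in each.
sheets <- list(
  a = setNames(data.frame(c("CPET Results", NA, "t", "0:00:25.000")), "...1"),
  b = setNames(data.frame(c("t", "h:mm:ss.ms", "0:00:25.000")), "...1")
)

# Apply the same trimming to every data frame in the list.
trimmed <- lapply(sheets, trim_before)
print(sapply(trimmed, nrow))
```

With real files, the list could be built with something like lapply(paths, readxl::read_excel), where paths is a character vector of file names that you would have to supply.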

Geom_rect() removed after log2 transformation

(I am aware of a similar issue discussed here, but I was not able to get those solutions to work.)
I used this code to successfully create this plot with a linear scale on the y axis. The important bit is the geom_rect() that creates the blue lines that later disappear when I transform the y axis to a log2 scale.
plot <- ggplot(data, aes(x=ï.., y=Casirivimab)) + geom_point(size = 1)
#plot <- plot + scale_y_continuous(trans='log2', breaks = trans_breaks("log2", function(x) 2^x),labels = trans_format("log2", math_format(2^.x)))
plot <- plot + geom_rect(aes(xmin=ï..-.1, xmax=ï..+.1, ymin=-Inf, ymax=Inf, fill = Cas.Epitope), alpha=0.5, stat="identity")
plot <- plot + scale_fill_manual(values = c("x"="blue"), na.value= "grey92")
plot <- plot + coord_cartesian(xlim = c(340,510))
plot <- plot + theme(panel.background = element_rect(colour = "grey92", fill = "grey92"))
plot <- plot + labs(x = "amino acid position", y = "fold change", title = "Casirivimab", color = "Epitope Position")
plot_Cas <- plot
plot_Cas
This was the resultant plot
Once I add the log2 transformation, the blue lines created by geom_rect() disappear. How do I prevent these blue lines from disappearing when I transform the y axis to a log2 scale?
plot <- ggplot(data, aes(x=ï.., y=Casirivimab)) + geom_point(size = 1)
plot <- plot + scale_y_continuous(trans='log2', breaks = trans_breaks("log2", function(x) 2^x),labels = trans_format("log2", math_format(2^.x)))
plot <- plot + geom_rect(aes(xmin=ï..-.1, xmax=ï..+.1, ymin=-Inf, ymax=Inf, fill = Cas.Epitope), alpha=0.5, stat="identity")
plot <- plot + scale_fill_manual(values = c("x"="blue"), na.value= "grey92")
plot <- plot + coord_cartesian(xlim = c(340,510))
plot <- plot + theme(panel.background = element_rect(colour = "grey92", fill = "grey92"))
plot <- plot + labs(x = "amino acid position", y = "fold change", title = "Casirivimab", color = "Epitope Position")
plot_Cas <- plot
plot_Cas
This is the resulting plot:
Many thanks for your help in advance.
P.S. Here is the data set from which the plotted data is taken. The plots show the "Casirivimab" column only.
structure(list(ï.. = c(18L, 69L, 80L, 141L, 215L, 222L, 241L,
246L, 247L, 321L, 333L, 334L, 335L, 337L, 339L, 340L, 341L, 343L,
344L, 345L, 346L, 348L, 351L, 354L, 357L, 359L, 360L, 361L, 367L,
378L, 384L, 403L, 405L, 406L, 408L, 409L, 415L, 416L, 417L, 417L,
417L, 420L, 421L, 435L, 439L, 440L, 441L, 444L, 444L, 445L, 446L,
447L, 448L, 449L, 450L, 452L, 452L, 453L, 455L, 456L, 456L, 456L,
457L, 458L, 460L, 470L, 472L, 473L, 474L, 475L, 475L, 476L, 477L,
478L, 478L, 479L, 481L, 482L, 483L, 484L, 484L, 485L, 486L, 486L,
486L, 486L, 486L, 487L, 488L, 489L, 490L, 490L, 492L, 493L, 494L,
494L, 495L, 496L, 498L, 500L, 501L, 502L, 503L, 504L, 505L, 508L,
509L, 519L, 537L, 570L, 583L, 655L, 681L, 681L, 692L, 701L, 716L,
859L, 982L, 1118L, 1147L, 1163L, 1229L), Mutation = c("L18F",
"Δ69-70", "D80A", "Δ141-146", "D215G", "A222V", "Δ242-247",
"R246I", "S247R", "Q321L", NA, NA, NA, NA, NA, NA, "V341I", NA,
NA, NA, NA, "A348T", NA, "N354D", NA, "S359N", NA, NA, "V367F",
"K378R", "P384L", NA, NA, "E406W", "R408I", "Q409E", NA, NA,
"K417E", "K417N", "K417T", NA, NA, "A435S", "N439K", "N440K",
NA, "K444Q", "K444T", "V445A", "G446V", NA, NA, "Y449N", "N450D",
"L452Q", "L452R", "Y453F", "L455F", "F456A", "F456K", "F456R",
NA, "K458R", "N460T", NA, "I472V", NA, NA, "A475R", "A475V",
"G476S", "D.2 (S477N)", "T478I", "T478K", "P479S", NA, NA, "V483A",
"E484K", "E484Q", "G485D", "F486K", "F486L", "F486R", "F486S",
"F486V", "N487R", NA, NA, "F490L", "F490S", NA, "Q493K", "S494P",
"S494R", NA, NA, NA, NA, "N501Y", NA, NA, NA, NA, "Y508H", NA,
"H519P", "K537R", "A570D", "E583D", "H655Y", "P681H", "P681R",
"I692V", "A701V", "T716I", "T859N", "S982A", "D1118H", "S1147L",
"D1163Y", "M1229I"), Cas.Epitope = c(NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, "x", NA, NA, NA, NA, NA,
NA, NA, "x", "x", NA, "x", NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, "x", "x", "x", NA, NA, NA, NA, NA, NA, NA,
NA, NA, "x", "x", "x", NA, NA, NA, NA, NA, NA, NA, "x", "x",
"x", "x", "x", "x", "x", "x", "x", "x", "x", NA, NA, NA, "x",
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Casirivimab = c(0.766666667,
1.2, 0.5, 0.9, 1.2, 1.25, 0.866666667, 0.7, NA, 1.6, NA, NA,
NA, NA, NA, NA, 0.8, NA, NA, NA, NA, 1, NA, 1.4, NA, 0.5, NA,
NA, 0.6, 0.8, 1.9, NA, NA, 84.4, 0.5, 1.3, NA, NA, 62.8, 34.8,
7.1, NA, NA, 1.1, 1.22, 1, NA, 0.95, 2, 1.75, 0.4, NA, NA, NA,
1.4, 5.8, 3.966666667, 73.91428571, 89.95, NA, NA, 3, NA, 0.8,
NA, NA, 2.2, NA, NA, 44.4, NA, 3.3, 2.45, 0.866666667, 2.9, 0.9,
NA, NA, 0.4, 18.84166667, 26.5, 3.7, 100, 48.6, 100, 100, 75.2,
100, NA, NA, 2.15, 0.866666667, NA, 42.66666667, 2.966666667,
0.8, NA, NA, NA, NA, 0.9, NA, NA, NA, NA, 1.1, NA, 0.7, 1, 0.433333333,
1, 1.2, 0.833333333, 1.2, 0.3, 0.7, 0.366666667, 1.3, 1, 0.9,
2.1, 0.8, 1.8), Bam.Epitope = c(NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "x",
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "x",
"x", NA, "x", NA, "x", "x", "x", "x", NA, NA, NA, NA, "x", NA,
NA, NA, NA, NA, NA, NA, NA, NA, "x", "x", "x", "x", "x", "x",
"x", "x", "x", "x", "x", "x", "x", "x", "x", "x", "x", "x", "x",
"x", NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Bamlanivimab = c(NA,
0.5, NA, NA, NA, 0.5, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, 0.4, NA, NA, NA, NA, 1.3, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, 100, 1.55, NA, NA, NA, 4.2,
NA, NA, NA, NA, NA, NA, NA, 2.5, NA, NA, 1.1, 2.6, NA, NA, NA,
NA, NA, 100, 100, NA, NA, NA, 100, NA, NA, 0.9, NA, NA, 100,
100, NA, NA, 100, 100, NA, NA, NA, NA, 1.1, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA), Ete.Epitope = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "x", NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, "x", "x", "x", "x", "x", "x", "x", NA,
"x", "x", "x", "x", NA, NA, NA, NA, "x", "x", "x", "x", "x",
NA, "x", "x", "x", "x", NA, "x", "x", "x", "x", "x", "x", "x",
NA, NA, "x", "x", "x", "x", "x", "x", NA, NA, NA, NA, NA, NA,
"x", "x", NA, "x", "x", "x", "x", "x", "x", NA, "x", NA, NA,
NA, "x", "x", "x", "x", NA, NA, NA, "x", "x", NA, "x", "x", NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA), Etesevimab = c(0.6, 0.65, NA, 0.6, NA, 1.2, 0.2, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 83, 49,
NA, NA, NA, 0.35, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
1.4, 1.4, NA, 14.9, 32.6, 100, NA, NA, 100, NA, NA, NA, NA, 100,
16.7, NA, 0.6, 0.6, NA, NA, NA, NA, NA, 3.125, 1.3, NA, NA, 2.9,
17.9, NA, NA, 100, NA, NA, 4.5, 0.8, NA, NA, 0.55, 3.3, NA, NA,
NA, NA, 2.25, NA, NA, NA, NA, NA, NA, NA, NA, 0.2, 0.3, 0.5,
0.5, NA, NA, NA, 0.4, NA, 0.8, 0.5, NA, 0.7, NA), Imde.Epitope = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, "x", NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "x", "x", "x",
"x", "x", "x", "x", "x", "x", "x", "x", "x", NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, "x", "x", NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
), Imdevimab = c(0.7, 0.65, 0.6, 0.95, 0.7, 0.675, 0.433333333,
0.7, NA, 1.2, NA, NA, NA, NA, NA, NA, 0.8, NA, NA, NA, NA, 0.7,
NA, 0.7, NA, 0.6, NA, NA, 0.4, 0.6, 1.3, NA, NA, 100, 0.4, 1,
NA, NA, 1.166666667, 0.566666667, 1.1, NA, NA, 1, 24.66, 95.6,
NA, 68, 100, 68.26666667, 58.35, NA, NA, NA, 18.6, 4.4, 3.15,
1.5625, 1.2, NA, NA, 0.3, NA, 0.4, NA, NA, 1, NA, NA, 0.3, NA,
0.5, 1.375, 0.933333333, 1.4, 1.6, NA, NA, 0.6, 2.169230769,
1.1, 2.2, 0.4, 0.8, 0.3, 0.4, 2.05, 0.1, NA, NA, 2.933333333,
2.35, NA, 2.3, 1.233333333, 3.2, NA, NA, NA, NA, 0.633333333,
NA, NA, NA, NA, 0.7, NA, 0.5, 1.2, 0.466666667, 1, 1.3, 0.766666667,
0.9, 0.6, 0.8, 0.433333333, 1.2, 1.233333333, 0.733333333, 0.9,
1, 0.3), Sotro.Epitope = c(NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, "x", "x", "x", "x", "x", "x", "x", "x", "x", "x", "x", NA,
NA, "x", "x", "x", "x", "x", NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "x", NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, "x", NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA), Sotrovimab = c("1.1", "0.8",
NA, "0.8", NA, "0.8", "0.5", NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, "0.7", "1", NA, NA, NA, "1.05",
NA, NA, NA, NA, NA, NA, NA, NA, "#DIV/0!", NA, NA, NA, "1", NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "#DIV/0!", "0.5",
"0.7", NA, NA, NA, NA, NA, "0.45", NA, NA, NA, "1.5", NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, "0.8", NA, NA, NA, NA, NA, "2.35",
NA, NA, NA, NA, NA, NA, NA, NA, "1.8", "1.1", "0.8", "0.7", NA,
NA, NA, "1", NA, "0.7", "0.5", NA, "#DIV/0!", NA), Regdan.Epitope = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "x",
NA, NA, NA, NA, NA, NA, "x", "x", "x", NA, NA, NA, NA, NA, NA,
NA, NA, NA, "x", NA, NA, "x", "x", "x", "x", "x", "x", "x", "x",
"x", NA, NA, NA, "x", NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, "x", NA, NA, NA, "x", "x", "x", "x", "x", NA, NA, "x",
"x", "x", "x", "x", "x", "x", "x", "x", "x", NA, NA, NA, NA,
NA, "x", NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA), Regdanivimab = c(NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, 0.7, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, 35, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, 1, NA, NA, NA, NA, 8.7, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 5.5,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA)), class = "data.frame", row.names = c(NA,
-123L))
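A likely explanation, offered as an assumption rather than a confirmed diagnosis: a transformed y scale is applied to the ymin and ymax aesthetics as well, and the log of a negative limit is not a number, so the rectangles may be dropped as missing values. The base-R sketch below only shows the arithmetic involved; it does not draw the plot.

```r
# What a log2 transform does to the rectangle limits used in the question.
limits <- c(ymin = -Inf, ymax = Inf)
transformed <- suppressWarnings(log2(limits))  # log2(-Inf) warns and yields NaN
print(transformed)

# Assumed workaround: start the rectangle at 0, which log2 maps to -Inf,
# so the lower edge is still "as far down as possible" after the transform.
fixed <- log2(c(ymin = 0, ymax = Inf))
print(fixed)
```

If that is indeed the cause, replacing ymin = -Inf with ymin = 0 inside the geom_rect(aes(...)) call would be worth trying before anything more involved.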

How to plot an igraph object on a vector map in R

I use the igraph and sf packages.
I have an igraph object, geo_dist_graph, whose vertices have spatial coordinates.
The vertex names and coordinates look like this:
grid_grid <-
structure(list(coords.x1 = c(15.504078, 15.704078, 15.904078,
15.104078, 15.304078, 15.504078, 15.704078, 15.104078, 15.304078,
15.704078, 14.904078, 14.304078, 13.904078, 14.704078, 13.704078,
14.104078, 14.704078, 14.904078, 13.704078, 13.904078, 14.704078,
13.704078, 13.904078, 14.304078),
coords.x2 = c(43.835623, 43.835623,
43.835623, 44.035623, 44.035623, 44.035623, 44.035623, 44.235623,
44.235623, 44.235623, 44.435623, 44.635623, 44.835623, 44.835623,
45.035623, 45.035623, 45.035623, 45.035623, 45.235623, 45.235623,
45.235623, 45.435623, 45.435623, 45.435623),
g9.nodes = c(27,
28, 29, 40, 41, 42, 43, 55, 56, 58, 69, 81, 94, 98, 108, 110,
113, 114, 123, 124, 128, 138, 139, 141)),
class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
"14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24"
))
The graph is built from a simple square adjacency matrix:
geo_dist_graph <-
structure(c(NA, 1, 1, NA, NA, 1, 1, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, NA, 1, NA, NA, NA,
1, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, 1, 1, NA, NA, NA, NA, 1, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, NA,
NA, 1, 1, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, 1, NA, 1, NA, NA, 1, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, NA, NA, NA, 1, NA,
1, NA, 1, 1, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, 1, 1, 1, NA, NA, 1, NA, NA, NA, 1, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, NA, NA, NA,
NA, 1, NA, 1, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, 1, 1, 1, NA, 1, NA, 1, 1, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, 1, NA,
1, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, 1, 1, NA, NA, NA, NA, 1, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, 1, 1, NA, 1, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, NA, NA, 1, 1,
NA, NA, NA, 1, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, 1, 1, NA, NA, NA, NA, 1, 1, NA, NA, 1, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, NA, NA, 1, NA,
NA, 1, 1, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, 1, 1, NA, 1, NA, NA, NA, NA, 1, NA, NA, NA, 1, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, NA, NA, NA, 1,
NA, NA, 1, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, 1, NA, NA, 1, NA, NA, NA, 1, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, NA, NA, NA,
NA, 1, NA, 1, 1, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, 1, NA, 1, 1, NA, NA, 1, NA, NA, 1, 1, 1, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, NA, NA, 1, 1, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, 1, 1, NA, NA, 1, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1,
1, NA, 1, NA, 1, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, 1, NA, NA, NA, 1, NA, NA, 1, NA),
.Dim = c(24L,
24L))
colnames(geo_dist_graph) <- grid_grid$g9.nodes
row.names(geo_dist_graph) <- grid_grid$g9.nodes
geo_dist_graph <- graph_from_adjacency_matrix(geo_dist_graph, mode = "upper", diag = F)
The spatial coordinates were attached this way:
V(geo_dist_graph)$x <-
grid_grid$coords.x1[match(V(geo_dist_graph)$name, grid_grid$g9.nodes)]
V(geo_dist_graph)$y <-
grid_grid$coords.x2[match(V(geo_dist_graph)$name, grid_grid$g9.nodes)]
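The match() lookup above can be illustrated in isolation. This is a minimal base-R sketch with made-up data: lookup and vertex_names are hypothetical stand-ins for grid_grid and V(geo_dist_graph)$name, and the rows are shuffled on purpose to show that the order of the lookup table does not matter.

```r
# Hypothetical lookup table: node ids with a coordinate, deliberately shuffled
lookup <- data.frame(id = c(29, 27, 28), lon = c(15.9, 15.5, 15.7))

# Vertex names taken from matrix dimnames are character strings
vertex_names <- c("27", "28", "29")

# match() coerces both sides to a common type, so "27" finds 27
pos <- match(vertex_names, lookup$id)

# Coordinates reordered to follow the vertex order
lon_by_vertex <- lookup$lon[pos]
lon_by_vertex
```

A vertex name missing from the lookup table would give NA at that position, which is worth checking before plotting.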
The graph is correctly plotted in space when using the plot function. But when I try to add a basemap with plot(map_crop_sp, add = T), the map doesn't show behind the graph, and there is no error message.
The map is a vector map; I don't know whether that matters. Here is the code used to create it:
map <- st_read("ne_10m_coastline/ne_10m_coastline.shp")
map_crop <- st_crop(map, xmin = 13.304078, ymin = 43.635623, xmax = 16.503846, ymax = 45.60185)
map_crop_sp <- as(map_crop, Class = "Spatial")
Answer
Since the graph should be drawn on top of the map, I plot it second. I also added rescale = F:
plot(map_crop_sp)
plot(geo_dist_graph, add = T, rescale = F)
Rationale
I typed ?plot.igraph. From there, I found ?igraph.plotting. It seems that plotting an igraph object rescales its coordinates by default (plot(..., rescale = TRUE)). The documentation for rescale says:
Logical constant, whether to rescale the coordinates to the [-1,1]x[-1,1] interval. This parameter is not implemented for tkplot.
Defaults to TRUE, the layout will be rescaled.
With rescale = F the vertices keep their original coordinates, so they line up with the map.
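A minimal sketch of the same idea with toy data, assuming only igraph and base graphics. The graph g and its coordinates are made up, and the empty plot with a grey rectangle is only a stand-in for map_crop_sp, so that the example runs without the shapefile.

```r
library(igraph)

# Hypothetical toy graph: three vertices with longitude/latitude-like coordinates
g <- make_ring(3)
V(g)$x <- c(14.0, 15.0, 14.5)
V(g)$y <- c(44.0, 44.0, 45.0)

# Stand-in for the basemap: an empty plot covering the same coordinate range
plot(NA, xlim = c(13.5, 15.5), ylim = c(43.5, 45.5),
     xlab = "lon", ylab = "lat", asp = 1)
rect(13.5, 43.5, 15.5, 45.5, border = "grey")

# Draw the graph second, on top of the existing plot, without rescaling,
# so the vertices land at their stored x/y coordinates
plot(g, add = TRUE, rescale = FALSE)
```

Without rescale = FALSE the vertices would be squeezed into the [-1,1] square, which lies outside the plotted longitude/latitude range here.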

UpsetR - choosing intersections to visualize

I would like to use an UpSet plot instead of a Venn diagram to show the overlap between specific groups (20 in total). However, one of the groups (number 10) is the most important for me, and I would like to show how many unique values are in that specific fraction. I would like to present ~25-30 intersections in total on the graph, but the uniqueness of group 10 also has to be shown.
I know about the sets argument, but I would like to present around 25-30 intersections as mentioned, plus this one group in addition.
Any ideas?
EDIT:
Added reproducible example:
dput(rep_exp)
structure(list(Gr_4 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, 24.4310955935393, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA), Gr_5 = c(NA, NA, NA, 21.8310535918369,
NA, NA, NA, NA, NA, 20.7254228450715, NA, 27.1619253143803, NA,
NA, NA, NA, NA, NA, NA, 26.6027203831498, NA, NA, NA, NA, NA,
NA, 30.8729830402671, NA, NA, NA), Gr_6 = c(28.8390902059829,
24.67734371881, 22.683139406727, 29.1546773298581, NA, NA, 21.9107159172821,
NA, 22.9230495998744, 26.9880437180908, NA, 32.391666051163,
NA, NA, NA, 21.6001415858001, 23.0239282537894, NA, 21.055168555216,
30.7121903523751, NA, NA, NA, NA, 22.0963548474675, NA, 32.513357598066,
NA, NA, 23.7976852708585), Gr_7 = c(21.4265985064224, NA, NA,
23.0695638371137, NA, NA, NA, NA, NA, 20.7903453146324, NA, 28.2499758022535,
NA, NA, NA, NA, NA, NA, NA, 25.9613085520105, NA, NA, NA, NA,
NA, NA, 29.355377815192, NA, NA, 21.1302512982254), Gr_8 = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 21.8730880931062, NA,
NA, NA, NA, NA, NA, NA, 22.4189005564519, NA, NA, NA, NA, NA,
NA, 30.2275670312356, NA, NA, NA), Gr_9 = c(22.195894810917,
25.9203441316619, NA, 23.5317193622031, NA, NA, NA, 20.4526193062251,
NA, 22.357699113594, 19.9767319209274, 29.0184743803346, NA,
NA, NA, NA, 22.1010446624755, NA, NA, 26.3118997535445, NA, NA,
NA, NA, NA, NA, 29.9658173049532, NA, NA, 22.7388204380555),
Gr_10 = c(24.9716280984187, 26.6702013159945, NA, 26.0197313721615,
NA, NA, NA, 22.1233522815746, NA, 24.0516716332837, 22.4063679987568,
30.256761573029, NA, NA, NA, NA, 26.4434318913431, NA, NA,
27.9654320211905, NA, NA, NA, NA, NA, NA, 29.8212398361126,
NA, NA, 24.1442935303143), Gr_11 = c(22.9856008507804, 25.1691705265075,
NA, 26.0689081411402, NA, NA, NA, 21.1400234004731, NA, 24.5711480491199,
23.5402595534611, 29.329649538014, NA, NA, NA, NA, 28.6076666902364,
NA, NA, 26.5597151498881, NA, NA, 25.8334491330428, NA, NA,
NA, 29.7854239060885, NA, NA, 21.751849665826), Gr_12 = c(28.2942576160509,
28.4109042369708, NA, 30.2938411874268, 28.1159976488766,
26.6893919055319, NA, 20.2236435193017, NA, 31.1236147481775,
27.1394614209655, 33.7497512742728, NA, NA, NA, 22.1620175455317,
32.740995072413, NA, 23.2685659859292, 31.9204662366898,
NA, NA, 30.7601811119423, NA, NA, 22.8704941623247, 31.3416488641037,
NA, NA, 28.6773773257387), Gr_13 = c(27.9415091276483, 27.0299165363222,
NA, 30.7110417097659, 28.7379570773404, 25.5882365428802,
NA, NA, NA, 32.2667076588073, 27.2933369287433, 34.7079501935325,
NA, 22.8206916170467, NA, NA, 32.5779472688676, NA, NA, 32.6317048040664,
NA, NA, 30.1389490092958, 23.8308408919424, 23.0679896658325,
26.164689687244, 30.2006952484736, NA, 24.447772868487, 29.5606883639626
), Gr_14 = c(27.4616237005853, 26.7750499947566, NA, 30.3932526396929,
31.1062446290124, 27.2595253359549, NA, NA, NA, 33.6656430607522,
27.734214173453, 35.0800848727354, NA, NA, NA, 23.151279208873,
33.2366327906614, NA, NA, 33.4932145181405, 22.9608977649923,
NA, 31.8193222893087, 24.7850652730265, NA, 24.9920915833786,
29.0239557410047, 25.2744788247811, 26.6821750741598, 29.7891764054099
), Gr_15 = c(27.2029382158867, 25.3112934881725, NA, 29.1103329989503,
29.514275096105, NA, NA, NA, NA, 31.6854120776358, 28.5249970429603,
35.9001903675862, 22.4465240056921, NA, NA, NA, 31.8450938083269,
25.5788830788713, NA, 34.7663358707296, 25.6549086895753,
26.2291635318221, 31.9466351025545, 26.715548983008, NA,
25.6752720211283, 28.4457302899793, 27.2647239196348, 25.0412216502086,
31.6489022687779), Gr_16 = c(25.1843096821522, NA, NA, 26.444459119903,
23.8302606418847, NA, NA, NA, NA, 27.987230611469, 27.8591095189136,
32.8816869988268, NA, NA, 24.8165571469754, NA, 28.7689442058935,
25.2395434664377, NA, 32.829999906694, 23.6787411063596,
NA, 27.8325560998723, 25.9582137297807, NA, NA, 25.6769403745901,
25.3048339598422, 23.7070405817542, 29.8423911570548), Gr_17 = c(23.2209751780558,
NA, NA, 24.6434488773652, 22.5225058653221, NA, NA, NA, NA,
27.0216809889885, 26.6607134339159, 31.099676534797, NA,
NA, 26.93077937966, NA, 27.8090060912948, 26.7795654758791,
NA, 32.3731900255852, 24.9494014193233, NA, 24.5609834789349,
26.086325932043, NA, NA, 25.5082418618407, 23.6504233402429,
23.8014399755019, 28.7791270904749), Gr_18 = c(NA, NA, NA,
NA, NA, NA, 26.0401348427439, NA, NA, 24.3341543275568, 24.7556235529872,
30.4889365348298, NA, NA, 26.9888022043666, NA, 25.7387173773674,
27.1316334308385, NA, 31.571451882524, NA, NA, NA, 25.745888266175,
NA, NA, 23.2997781674234, NA, NA, 23.2402643606836), Gr_19 = c(NA,
NA, NA, NA, NA, NA, 24.3940790216008, NA, NA, NA, 21.4222413790374,
25.7991932672173, NA, NA, 25.9372380266141, NA, 22.9217973627502,
20.5334552143032, NA, 28.7776543930148, NA, NA, NA, 23.9298543509444,
NA, NA, 24.3614522942989, NA, NA, NA), Gr_20 = c(NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, 25.6961748124338, NA,
NA, NA, NA, NA, NA, NA, 26.4582321196234, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA), Gr_21 = c(NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, 22.8258042878256, NA, NA, NA, NA, NA,
NA, NA, 25.1317511650203, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA)), row.names = c(NA, 30L), class = "data.frame")
This is the code used to generate the plot:
for (i in 1:nrow(rep_exp)) { rep_exp[i, ][is.na(rep_exp[i, ]) == F] = rownames(rep_exp)[i]}
n_sizescale=nrow(rep_exp)*1.2
p1 <- { upset( fromList(rep_exp),
nsets = ncol(rep_exp), nintersects = 35, order.by = "freq", #degree, freq
empty.intersections = "on", number.angles = 0, mb.ratio = c(0.55, 0.45), point.size = 2.5, line.size = 0.8,
text.scale = c(1.3, 1.3, 1, 1, 1, 1), mainbar.y.label = "Number of groups",
sets.x.label = "Groups",
show.numbers = "yes", keep.order = TRUE,
set_size.show = TRUE,
set_size.scale_max = n_sizescale)
}
If I understand the dots below the bar plot correctly, they indicate how many groups are found in a specific intersection, and a single dot gives the number for that specific Gr alone. Am I right?
How can I force the algorithm to show a couple of "interesting" groups as a single dot (to show the uniqueness of that group) and the others as intersections?
Can you (#krassowski) rewrite the code for the package you mentioned?
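Independently of the plotting package, the number of values unique to one group can be checked by hand. The following is a minimal base-R sketch on hypothetical toy data shaped like rep_exp (rows are items, columns are groups, NA means absent). It assumes that a lone dot in the plot stands for items found in that group and in no other, which is how I understand the bars to be counted.

```r
# Hypothetical toy data in the same shape as rep_exp
toy <- data.frame(
  Gr_9  = c(1.2, NA, 3.1, NA),
  Gr_10 = c(2.0, 5.5, NA, 4.4),
  Gr_11 = c(NA, NA, 0.7, 1.9)
)

# Logical membership matrix: TRUE where the item is present in the group
present <- !is.na(toy)

# Items found in Gr_10 and in no other group
only_gr10 <- present[, "Gr_10"] & rowSums(present) == 1
n_only_gr10 <- sum(only_gr10)

# Size of every exclusive combination of groups
combo <- apply(present, 1, function(r) paste(colnames(present)[r], collapse = "&"))
combo_sizes <- table(combo)
combo_sizes
```

Comparing combo_sizes with the bar heights is a quick way to confirm which intersections a plot actually shows.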

Store position and values of non-NA elements

I have a data set consisting of a matrix with quite a lot of NAs. From this, I want to create a data frame storing both the location and the value of each non-NA element. The locations can be obtained following this answer, via tempList <- which(!is.na(dummy),TRUE).
Currently I use a for loop afterwards to add the values. Is there a better way?
Data:
structure(c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, "#000000FF", NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, "#000000FF", NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), .Dim = c(10L,
10L))
Desired result:
structure(list(x = c(8, 7), y = c(5, 7), colour = structure(c(1L,
1L), .Label = "#000000FF", class = "factor")), class = "data.frame", row.names = c(NA,
-2L))
Current code:
tempList <- which(!is.na(dummy),TRUE)
changedDF <- data.frame(tempList[,1],tempList[,2])
names(changedDF) <- c("row","column")
for(i in 1:nrow(changedDF)){
changedDF$colour[i] <- dummy[changedDF[i,1],changedDF[i,2]]
}
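One possible loop-free variant, as a minimal sketch: base R lets a two-column matrix of row/column positions index a matrix directly, so the values can be pulled out in one step. Here dummy is rebuilt by hand to match the data above, and the column names x, y and colour follow the desired result.

```r
# Rebuild the example matrix: 10 x 10, all NA except two colour strings
dummy <- matrix(NA_character_, nrow = 10, ncol = 10)
dummy[8, 5] <- "#000000FF"
dummy[7, 7] <- "#000000FF"

# Row/column positions of the non-NA cells (columns are named "row" and "col")
idx <- which(!is.na(dummy), arr.ind = TRUE)

# Indexing with the two-column matrix returns one value per position
changedDF <- data.frame(
  x = idx[, "row"],
  y = idx[, "col"],
  colour = factor(dummy[idx])
)
changedDF
```

The factor() call is only there to mirror the desired result; it can be dropped if a character column is preferred.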
