Create (many) columns conditional on similarly named columns - r

I want to create a new column that takes the value of one of two similarly named columns, depending on a third column. There are many such columns to create. Here's my data.
dt <- structure(list(malvol_left_1_w1 = c("1", "1", "4", "3", "4",
"4", "1", "4", "4", "3", "1", "4", "4", "3", "4", "4", "5", "2",
"4", "2"), malvol_left_2_w1 = c("1", "1", "4", "3", "4", "4",
"1", "3", "4", "2", "2", "2", "4", "1", "5", "4", "5", "2", "4",
"2"), malvol_right_1_w1 = c("1", "1", "4", "3", "4", "4", "1",
"3", "4", "2", "1", "4", "4", "5", "5", "4", "2", "6", "4", "1"
), malvol_right_2_w1 = c("1", "1", "4", "3", "4", "4", "1", "3",
"4", "2", "1", "2", "4", "5", "5", "4", "5", "5", "4", "5"),
malvol_left_1_w2 = c("1", "1", "3", "3", "4", "4", "1", "5",
"4", "4", "4", "2", "1", "4", "5", "4", "3", "2", "4", "4"
), malvol_left_2_w2 = c("1", "1", "3", "3", "4", "4", "7",
"5", "4", "2", "3", "1", "1", "4", "4", "4", "3", "4", "4",
"4"), malvol_right_1_w2 = c("1", "3", "3", "3", "4", "4",
"1", "4", "4", "3", "2", "2", "4", "1", "4", "4", "5", "5",
"4", "4"), malvol_right_2_w2 = c("1", "2", "3", "3", "4",
"4", "1", "2", "4", "2", "3", "2", "4", "1", "4", "4", "5",
"4", "4", "3"), leftright_w1 = c("right", "right", "left",
"right", "right", "right", "left", "right", "right", "left",
"left", "left", "left", "right", "left", "left", "right",
"right", "right", "left"), leftright_w2 = c("right", "right",
"left", "left", "right", "left", "left", "right", "right",
"left", "left", "left", "left", "right", "left", "left",
"right", "right", "left", "left")), class = "data.frame", row.names = c("12",
"15", "69", "77", "95", "96", "112", "122", "150", "163", "184",
"216", "221", "226", "240", "298", "305", "354", "370", "379"
))
Now I can do this in dplyr like:
dt <- dt %>%
mutate(
malvol_1_w1 = case_when(
leftright_w1 == "left" ~ malvol_right_1_w1,
leftright_w1 == "right" ~ malvol_left_1_w1),
malvol_2_w1 = case_when(
leftright_w1 == "left" ~ malvol_right_2_w1,
leftright_w1 == "right" ~ malvol_left_2_w1),
malvol_1_w2 = case_when(
leftright_w2 == "left" ~ malvol_right_1_w2,
leftright_w2 == "right" ~ malvol_left_1_w2),
malvol_2_w2 = case_when(
leftright_w2 == "left" ~ malvol_right_2_w2,
leftright_w2 == "right" ~ malvol_left_2_w2))
However, it's not really a feasible solution, because there will be more of both numbers defining a variable (e.g. both malvol_3_w1 and malvol_1_w3 will need to be created).
One solution is to do this with a loop:
for (wave in 1:2) {
for (var in 1:2) {
dt[, paste0("malvol_", var, "_w", wave)] <- dt[, paste0("malvol_right_", var, "_w", wave)]
dt[dt[[paste0("leftright_w", wave)]] == "right", paste0("malvol_", var, "_w", wave)] <-
dt[dt[[paste0("leftright_w", wave)]] == "right", paste0("malvol_left_", var, "_w", wave)]
}
}
However, what is a tidyverse solution?
UPDATE:
I came up with a tidyverse solution myself; however, it is not very elegant. Still looking for more canonical solutions.
dt <- dt %>%
mutate(
malvol_1_w1 = NA, malvol_2_w1 = NA,
malvol_1_w2 = NA, malvol_2_w2 = NA) %>%
mutate(
across(matches("malvol_\\d"),
~ case_when(
eval(parse(text = paste0("leftright_", str_extract(cur_column(), "w.")))) == "left" ~
eval(parse(text = paste0(str_split(cur_column(), "_\\d", simplify = T)[1],
"_right", str_split(cur_column(), "malvol", simplify = T)[2]))),
eval(parse(text = paste0("leftright_", str_extract(cur_column(), "w.")))) == "right" ~
eval(parse(text = paste0(str_split(cur_column(), "_\\d", simplify = T)[1],
"_left", str_split(cur_column(), "malvol", simplify = T)[2]))))))

What makes your problem difficult is that a lot of information is hidden in variable names rather than in data cells. Hence, you need a few steps to transform your data into "tidy" format. In the code below, the crucial part is (1) to turn the variables [malvol]_[lr]_[num]_[w] into four separate columns malvol, lr, num, w (all prefixed with m_), and (2) to extract the variable w (prefixed with l_) from the variables leftright_[w], in both cases using the functions pivot_longer and then separate.
library(dplyr)
library(tidyr)
# Add a row id to your data, for later joining
dt <- dt %>% mutate(id = row_number())
df <- dt %>%
# Tidy the column "malvol"
pivot_longer(cols = starts_with('malvol'), names_to = "m_var", values_to = "m_val") %>%
separate(m_var, into = c("m_malvol", "m_lr", "m_num", "m_w")) %>%
# Tidy the column "leftright"
pivot_longer(cols = starts_with('leftright'), names_to = 'l_var', values_to = 'l_lr') %>%
separate(l_var, into = c(NA, "l_w")) %>%
# Implement the logic
filter(l_w == m_w) %>%
filter(l_lr != m_lr) %>%
# Pivot into original wide format
select(-c(l_w, l_lr, m_lr)) %>%
pivot_wider(names_from = c(m_malvol, m_num, m_w), values_from = m_val)
# Merging back results to original data
dt <- dt %>% mutate(id = row_number()) %>% inner_join(df, by="id")
Although I pivoted the data back into your desired format at the end (to check that the results match your desired output), I would suggest you leave the data in the long format, which is "tidy" and easier to work with than your "wide" format. So maybe skip the last pivot_wider operation.
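To illustrate what staying in the long format looks like, here is a minimal sketch on a two-row stand-in for the question's data. The toy values and the output column names (num, wave, malvol) are assumptions chosen for illustration; the pipeline is the one above with the final pivot_wider left out.

```r
library(dplyr)
library(tidyr)

# Hypothetical two-row stand-in for the question's data
toy <- data.frame(
  malvol_left_1_w1  = c("1", "4"),
  malvol_right_1_w1 = c("2", "5"),
  leftright_w1      = c("left", "right")
)

long <- toy %>%
  mutate(id = row_number()) %>%
  # One row per id and malvol column; split the name into its parts
  pivot_longer(starts_with("malvol"), names_to = "m_var", values_to = "m_val") %>%
  separate(m_var, into = c("m_malvol", "m_lr", "m_num", "m_w")) %>%
  # Same for the leftright columns, keeping only the wave part of the name
  pivot_longer(starts_with("leftright"), names_to = "l_var", values_to = "l_lr") %>%
  separate(l_var, into = c(NA, "l_w")) %>%
  # Keep matching waves, and the side opposite to leftright
  filter(l_w == m_w, l_lr != m_lr) %>%
  # Stay long: one row per id, item number and wave
  select(id, num = m_num, wave = m_w, malvol = m_val)

long
```

Each row of long is then one id, item number and wave, so adding a third item or a third wave only adds rows and needs no new code.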

Related

Make ggplot connect datapoints in a scatterplot chronologically

This seems very simple but for some reason I can't make it work.
I have a dataset with 3 variables. The first variable is a measurement which is taken several times per day across two months (it can take the values 1, 2, 3, 4, 5 and 6; these are not groups, they are values that have been measured). The second variable is the date. The third variable is the time the measurement was taken. I want to plot how this measurement changes across time, so I need the datapoints to be connected chronologically.
Things I have tried:
I tried plotting against the date alone, after making sure it is in Date format and ordered correctly, and then added geom_path(), which should tell R to connect the points row by row:
DF$Date <- as.Date(DF$Date)
DF <- DF[order(DF$Date),]
ggplot(DF, aes(x = Date, y = Measurement)) +
geom_line(linewidth=1, colour="green") +
geom_path()
I created a DateTime variable:
DF$DateTime <- as.POSIXct(paste(DF$Date, DF$Time, format="%y/%m/%d %H:%M:%S"))
ggplot(DF, aes(x = DateTime, y = Measurement)) +
geom_line(linewidth=1, colour="green")
In both cases R just connects all the responses of value 1 to each other, all responses of value 2 to each other, and so on. It does not connect them chronologically.
Thank you!
structure(list(Measurement = c("1", "1", "1", "1", "2", "1", "1",
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1",
"1", "1", "1", "2", "2", "1", "2", "2", "2", "1", "2", "1", "1",
"1", "2", "1", "2", "2", "2", "1", "2", "3", "2", "2", "4", "3",
"2", "2", "2", "2", "3", "3", "3", "3", "3", "4", "3", "3", "2",
"2", "4", "4", "3", "1", "1", "2", "1", "1", "4", "3", "3", "3",
"3", "3", "3", "4", "3", "3", "3", "3", "3", "3", "3", "3", "3",
"3", "5", "4", "5", "3", "5", "5", "5", "4", "4", "4", "4", "4",
"4", "3", "5", "4", "4", "4", "4", "4", "4", "4", "4", "4", "5",
"4", "4", "5", "5", "5", "5", "4", "3", "4", "4", "4", "4", "3",
"4", "3", "4", "4", "4", "4", "3", "5", "4", "4", "5", "4", "4",
"4", "4", "4", "4", "4", "4", "3", "3", "3", "3", "4", "3", "4",
"3", "3", "2", "3", "4", "4", "4", "4", "4", "5", "4", "4", "4",
"4", "3", "4", "5", "4", "3", "4", "4", "4", "4", "4", "3", "4",
"1", "4", "4", "3", "4", "4", "3", "4", "4", "4", "3", "4", "4",
"4", "4", "6", "4", "4", "4", "4", "3", "3", "4", "4", "3", "3",
"4", "3", "3", "3", "4", "5", "4", "4", "4", "4", "1", "3", "4",
"3", "4", "4", "4", "1", "3", "4", "5", "5", "5", "5", "5", "5",
"5", "4", "4", "4", "4", "4", "5", "4", "3", "4", "4", "4", "4",
"5", "4", "4", "3", "4", "4", "4", "4", "4", "5", "4", "4", "4",
"4", "4", "3", "5", "4", "4", "4", "4", "3", "3", "3", "4", "4",
"3", "4", "4", "5"), Date = structure(c(19333, 19333,
19334, 19334, 19334, 19334, 19334, 19334, 19334, 19335, 19335,
19335, 19335, 19335, 19335, 19335, 19335, 19336, 19336, 19336,
19336, 19336, 19336, 19337, 19337, 19337, 19337, 19337, 19337,
19338, 19338, 19338, 19338, 19338, 19338, 19338, 19338, 19338,
19339, 19339, 19339, 19339, 19339, 19339, 19339, 19339, 19339,
19340, 19340, 19340, 19340, 19341, 19341, 19341, 19341, 19342,
19342, 19342, 19342, 19342, 19342, 19342, 19343, 19343, 19343,
19343, 19343, 19344, 19344, 19344, 19344, 19344, 19345, 19345,
19345, 19345, 19346, 19346, 19346, 19346, 19347, 19347, 19347,
19347, 19348, 19348, 19349, 19350, 19350, 19350, 19350, 19350,
19350, 19351, 19351, 19351, 19352, 19353, 19353, 19353, 19353,
19353, 19354, 19355, 19355, 19355, 19356, 19356, 19356, 19356,
19357, 19357, 19357, 19357, 19358, 19358, 19358, 19358, 19359,
19359, 19359, 19359, 19360, 19360, 19360, 19360, 19360, 19361,
19361, 19362, 19362, 19362, 19362, 19363, 19363, 19363, 19363,
19364, 19364, 19364, 19364, 19364, 19365, 19365, 19365, 19365,
19365, 19366, 19366, 19366, 19366, 19366, 19367, 19367, 19367,
19367, 19367, 19368, 19368, 19368, 19368, 19369, 19369, 19369,
19370, 19370, 19370, 19371, 19371, 19371, 19372, 19372, 19372,
19372, 19372, 19373, 19373, 19373, 19373, 19374, 19374, 19374,
19374, 19374, 19374, 19375, 19375, 19375, 19375, 19376, 19376,
19376, 19377, 19377, 19377, 19377, 19377, 19378, 19378, 19378,
19378, 19379, 19379, 19379, 19379, 19379, 19380, 19380, 19380,
19380, 19380, 19381, 19381, 19381, 19381, 19382, 19382, 19382,
19382, 19383, 19383, 19385, 19385, 19385, 19385, 19385, 19385,
19385, 19385, 19385, 19385, 19386, 19386, 19386, 19386, 19386,
19387, 19387, 19387, 19387, 19387, 19387, 19387, 19387, 19387,
19388, 19388, 19388, 19388, 19388, 19388, 19388, 19388, 19388,
19389, 19389, 19389, 19389, 19389, 19389, 19389, 19389, 19390,
19390, 19390, 19390, 19390, 19390, 19390, 19390, 19390), class = "Date"),
Time = structure(c(43810, 44174, 49104, 49343, 50921,
54029, 59443, 65767, 70544, 40647, 40731, 43219, 50506, 54044,
58687, 68571, 71016, 36049, 38921, 44148, 55413, 66503, 70310,
34796, 48468, 48770, 56701, 67069, 73131, 32103, 37937, 43270,
43941, 49507, 57796, 59420, 65187, 70669, 28787, 33612, 38807,
43900, 49607, 54026, 60525, 65861, 76855, 29833, 43197, 45349,
67928, 34018, 44887, 54024, 65491, 34687, 45029, 45029, 45096,
45096, 56881, 70503, 30726, 49625, 54871, 76945, 76990, 30348,
51899, 58286, 65893, 76301, 34075, 54033, 54075, 66322, 34158,
47973, 69070, 69113, 29971, 43838, 43891, 68344, 58512, 64840,
74134, 42286, 48249, 53712, 75669, 75669, 75669, 34484, 67922,
67922, 63298, 30761, 30814, 52835, 67936, 78132, 69679, 44485,
61309, 65893, 32443, 46595, 55031, 65701, 40995, 43257, 78737,
78783, 35103, 58260, 65353, 78583, 36651, 44588, 53857, 74257,
34262, 44172, 50954, 56508, 68744, 32577, 54241, 32233, 45405,
59002, 68596, 33529, 44235, 56676, 65104, 35378, 43263, 59195,
70423, 76305, 34704, 40350, 43769, 54069, 65163, 32335, 43220,
52463, 64829, 64883, 33312, 47326, 56974, 78210, 78249, 37710,
47664, 51668, 67281, 39815, 57103, 67451, 52368, 54111, 66853,
45038, 45079, 64861, 35856, 45970, 54136, 54174, 67102, 32497,
49309, 56959, 68312, 33326, 44280, 53945, 54763, 65275, 65313,
32958, 52099, 57512, 65378, 27223, 58171, 64993, 32862, 44507,
44547, 54631, 76109, 33983, 49720, 58810, 66231, 29886, 53075,
54592, 64904, 64942, 29982, 40303, 43288, 54319, 65762, 28881,
36993, 44716, 65239, 34587, 43395, 64886, 66650, 41670, 53480,
29252, 38412, 38477, 38477, 44963, 48648, 56521, 59572, 65410,
70232, 32517, 38681, 43273, 50715, 74179, 33337, 39419, 39419,
40341, 49560, 59123, 60091, 65164, 70217, 37318, 37822, 43242,
43287, 49346, 55187, 59908, 64815, 72710, 32553, 33678, 37864,
43283, 54029, 59412, 68693, 78965, 34730, 38193, 43936, 51483,
54039, 54134, 59417, 66687, 72937), class = c("hms", "difftime"
), units = "secs"), DateTime = structure(c(1670415010, 1670415374,
1670506704, 1670506943, 1670508521, 1670511629, 1670517043,
1670523367, 1670528144, 1670584647, 1670584731, 1670587219,
1670594506, 1670598044, 1670602687, 1670612571, 1670615016,
1670666449, 1670669321, 1670674548, 1670685813, 1670696903,
1670700710, 1670751596, 1670765268, 1670765570, 1670773501,
1670783869, 1670789931, 1670835303, 1670841137, 1670846470,
1670847141, 1670852707, 1670860996, 1670862620, 1670868387,
1670873869, 1670918387, 1670923212, 1670928407, 1670933500,
1670939207, 1670943626, 1670950125, 1670955461, 1670966455,
1671005833, 1671019197, 1671021349, 1671043928, 1671096418,
1671107287, 1671116424, 1671127891, 1671183487, 1671193829,
1671193829, 1671193896, 1671193896, 1671205681, 1671219303,
1671265926, 1671284825, 1671290071, 1671312145, 1671312190,
1671351948, 1671373499, 1671379886, 1671387493, 1671397901,
1671442075, 1671462033, 1671462075, 1671474322, 1671528558,
1671542373, 1671563470, 1671563513, 1671610771, 1671624638,
1671624691, 1671649144, 1671725712, 1671732040, 1671827734,
1671882286, 1671888249, 1671893712, 1671915669, 1671915669,
1671915669, 1671960884, 1671994322, 1671994322, 1672076098,
1672129961, 1672130014, 1672152035, 1672167136, 1672177332,
1672255279, 1672316485, 1672333309, 1672337893, 1672390843,
1672404995, 1672413431, 1672424101, 1672485795, 1672488057,
1672523537, 1672523583, 1672566303, 1672589460, 1672596553,
1672609783, 1672654251, 1672662188, 1672671457, 1672691857,
1672738262, 1672748172, 1672754954, 1672760508, 1672772744,
1672822977, 1672844641, 1672909033, 1672922205, 1672935802,
1672945396, 1672996729, 1673007435, 1673019876, 1673028304,
1673084978, 1673092863, 1673108795, 1673120023, 1673125905,
1673170704, 1673176350, 1673179769, 1673190069, 1673201163,
1673254735, 1673265620, 1673274863, 1673287229, 1673287283,
1673342112, 1673356126, 1673365774, 1673387010, 1673387049,
1673432910, 1673442864, 1673446868, 1673462481, 1673521415,
1673538703, 1673549051, 1673620368, 1673622111, 1673634853,
1673699438, 1673699479, 1673719261, 1673776656, 1673786770,
1673794936, 1673794974, 1673807902, 1673859697, 1673876509,
1673884159, 1673895512, 1673946926, 1673957880, 1673967545,
1673968363, 1673978875, 1673978913, 1674032958, 1674052099,
1674057512, 1674065378, 1674113623, 1674144571, 1674151393,
1674205662, 1674217307, 1674217347, 1674227431, 1674248909,
1674293183, 1674308920, 1674318010, 1674325431, 1674375486,
1674398675, 1674400192, 1674410504, 1674410542, 1674461982,
1674472303, 1674475288, 1674486319, 1674497762, 1674547281,
1674555393, 1674563116, 1674583639, 1674639387, 1674648195,
1674669686, 1674671450, 1674732870, 1674744680, 1674893252,
1674902412, 1674902477, 1674902477, 1674908963, 1674912648,
1674920521, 1674923572, 1674929410, 1674934232, 1674982917,
1674989081, 1674993673, 1675001115, 1675024579, 1675070137,
1675076219, 1675076219, 1675077141, 1675086360, 1675095923,
1675096891, 1675101964, 1675107017, 1675160518, 1675161022,
1675166442, 1675166487, 1675172546, 1675178387, 1675183108,
1675188015, 1675195910, 1675242153, 1675243278, 1675247464,
1675252883, 1675263629, 1675269012, 1675278293, 1675288565,
1675330730, 1675334193, 1675339936, 1675347483, 1675350039,
1675350134, 1675355417, 1675362687, 1675368937), class = c("POSIXct",
"POSIXt"), tzone = "")), row.names = c(NA, -271L), class = c("tbl_df",
"tbl", "data.frame"))
You can connect the points chronologically by specifying group = 1. Additionally, you can use scale_x_datetime to control the breaks on your x-axis. For the way the x-axis is labelled, see this r-bloggers post for more information about date formats.
I've included one example that best recreates what you were trying to do and another example where the points are colored by Measurement.
library(tidyverse)
DF$Date <- as.Date(DF$Date)
DF <- DF[order(DF$Date),]
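# format is an argument of as.POSIXct(), not paste(); Date and hms values paste as e.g. "2022-12-07 12:10:10"
DF$DateTime <- as.POSIXct(paste(DF$Date, DF$Time), format = "%Y-%m-%d %H:%M:%S")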
# what I think you want
ggplot(DF, aes(x = DateTime, y = Measurement, group = 1)) +
geom_line(linewidth = 1, color = "green") +
geom_point() +
scale_x_datetime(date_breaks = "10 days", date_labels = "%b %d %y")
# another option
ggplot(DF, aes(x = DateTime, y = Measurement, color = Measurement, group = 1)) +
geom_line(linewidth = 1, color = "gray") +
geom_point() +
scale_x_datetime(date_breaks = "10 days", date_labels = "%b %d %y")
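The lines were split by value because Measurement is stored as character in the posted data, and ggplot2 groups by discrete variables by default. As an alternative sketch to group = 1, converting the column to numeric makes the y-axis continuous, so a single line is drawn. The toy data frame below is a hypothetical stand-in for DF.

```r
library(ggplot2)

# Hypothetical stand-in for DF: six hourly readings stored as character
toy <- data.frame(
  DateTime = as.POSIXct("2022-12-07 08:00:00", tz = "UTC") + 3600 * 0:5,
  Measurement = c("1", "2", "1", "3", "2", "4")
)

# A numeric y is continuous, so ggplot2 no longer creates one group per value
toy$Measurement <- as.numeric(toy$Measurement)

p <- ggplot(toy, aes(x = DateTime, y = Measurement)) +
  geom_line(linewidth = 1, colour = "green") +
  geom_point()
p
```

Whether this is preferable depends on whether the values should be treated as numbers on the axis; if they are really ordered categories, keeping them discrete with group = 1 may be the better fit.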

Recode a factor variable, dropping N/A

I have a factor variable with 14 levels, which I'm trying to collapse into only 3 levels. It contains two N/A values, which I also want to remove.
My code looks like this:
job <- fct_collapse(E$occupation, other = c("7","9", "10", "13" "14"), 1 = c("1", "2", "3", "12"), 2 = c("4", "5", "6", "8", "11"))
However, it just gives me tons of errors. Can anyone help me here?
We could also do this with a named list
library(forcats)
lst1 <- setNames(list(as.character(c(7, 9, 10, 13, 14)),
as.character(c(1, 2, 3, 12)), as.character(c(4, 5, 6, 8, 11))), c('other', 1, 2))
fct_collapse(df$occupation, !!!lst1)
data
df <- structure(list(occupation = c("1", "3", "5", "7", "9", "10",
"12", "14", "13", "4", "7", "6", "5")), class = "data.frame", row.names = c(NA,
-13L))
For numbers try using backquotes in fct_collapse.
job <- forcats::fct_collapse(df$occupation,
other = c("7","9", "10", "13", "14"),
`1` = c("1", "2", "3", "12"),
`2` = c("4", "5", "6", "8", "11"))
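Neither snippet above removes the two N/A values mentioned in the question. A minimal sketch, assuming they are real NA values rather than a level literally called "N/A", is to subset them away before collapsing. The vector occ below is a hypothetical stand-in for E$occupation.

```r
library(forcats)

# Hypothetical stand-in for E$occupation, with two missing values
occ <- factor(c("1", "4", "7", NA, "12", NA))

# Drop the missing entries first (assumes they are real NA values)
occ <- occ[!is.na(occ)]

# Collapse the remaining levels; numeric-looking names need backticks
job <- fct_collapse(occ,
  other = "7",
  `1` = c("1", "12"),
  `2` = "4")

table(job)
```

In a data frame, the same filter would normally be applied to whole rows, so that the other columns stay aligned with the collapsed factor.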

Problem with Piping for revalue in R Studio

I would like to revalue 13 different variables. Their levels are currently character labels, which are supposed to be changed to numeric values.
Individually it would work to use
x$eins <- revalue(x$eins, c("Nie Thema" = "1",
"Selten Thema" = "2",
"Manchmal Thema" = "3",
"Häufig Thema" = "4",
"Sehr häufig Thema" = "5",
"Fast immer Thema" = "6"))
With the piping, I guess it would look something like this
x %>%
dplyr::select(., eins:dreizehn) %>%
revalue(., c("Nie Thema" = "1",
"Selten Thema" = "2",
"Manchmal Thema" = "3",
"Häufig Thema" = "4",
"Sehr häufig Thema" = "5",
"Fast immer Thema" = "6"))
With this, I get the warning message from revalue, that x is not a factor or a character vector.
What am I doing wrong?
Thanks in advance.
Use across to apply a function to multiple columns. Note that revalue comes from plyr, not dplyr, so it is called with its namespace here.
library(dplyr)
x <- x %>%
dplyr::mutate(across(eins:dreizehn, ~ plyr::revalue(.x, c("Nie Thema" = "1",
"Selten Thema" = "2",
"Manchmal Thema" = "3",
"Häufig Thema" = "4",
"Sehr häufig Thema" = "5",
"Fast immer Thema" = "6"))))

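As a follow-up to the revalue answer above: if plyr is not wanted at all, a named lookup vector can do the same recoding. This is a sketch under assumptions: the toy data frame has only two of the thirteen columns, and the codes are returned as numbers rather than as the strings "1" to "6".

```r
library(dplyr)

# Hypothetical two-column stand-in for the data frame x
x <- data.frame(
  eins = c("Nie Thema", "Manchmal Thema"),
  zwei = c("Selten Thema", "Fast immer Thema")
)

# Named vector mapping each label to its numeric code
lookup <- c("Nie Thema" = 1,
            "Selten Thema" = 2,
            "Manchmal Thema" = 3,
            "Häufig Thema" = 4,
            "Sehr häufig Thema" = 5,
            "Fast immer Thema" = 6)

# Index the lookup by label; as.character() also covers factor columns
x <- x %>%
  mutate(across(eins:zwei, ~ unname(lookup[as.character(.x)])))

x
```

Labels that are missing from the lookup come back as NA, which makes unexpected values easy to spot.
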
removing unused factors from facet_grid with row and column specified

What is the Problem?
Each facet in ggplot2 may not contain all factor levels in the 'name' column, in which case I want the facet to ignore those factor levels to avoid the gaps in the tiles.
What have I tried
I have tried adding and removing scales = "free", drop = TRUE, and space = "free" from facet_grid as recommended in a stackexchange question here, and a stackoverflow question here.
Any help would be much appreciated!
Example slice of data
test_data <- structure(list(sample = c("1", "1", "1", "1", "1", "2", "2",
"2", "2", "2", "3", "3", "3", "3", "3", "1", "1", "1", "1", "1",
"2", "2", "2", "2", "2", "3", "3", "3", "3", "3"), name = c("IV_1385127_1385127_1_+_3_1",
"IV_78222_78222_1_-_3_1", "XV_978130_978130_1_-_3_1", "XIV_574351_574351_1_+_3_1",
"XV_357215_357215_1_-_3_1", "XII_456601_456601_1_-_3_1", "V_423552_423552_1_+_3_1",
"XI_200191_200191_1_-_3_1", "XII_465717_465717_1_-_4_2", "XII_455342_455342_1_-_3_1",
"VII_84298_84298_1_-_3_1", "IV_229884_229884_1_+_4_2", "XII_633371_633371_1_-_4_2",
"XIII_9888_9888_1_-_4_2", "X_703096_703096_1_-_3_2", "IV_1385127_1385127_1_+_3_1",
"IV_78222_78222_1_-_3_1", "XV_978130_978130_1_-_3_1", "XIV_574351_574351_1_+_3_1",
"XV_357215_357215_1_-_3_1", "XII_456601_456601_1_-_3_1", "V_423552_423552_1_+_3_1",
"XI_200191_200191_1_-_3_1", "XII_465717_465717_1_-_4_2", "XII_455342_455342_1_-_3_1",
"VII_84298_84298_1_-_3_1", "IV_229884_229884_1_+_4_2", "XII_633371_633371_1_-_4_2",
"XIII_9888_9888_1_-_4_2", "X_703096_703096_1_-_3_2"), ntile = c("1",
"1", "1", "1", "1", "1", "1", "1", "2", "1", "1", "2", "2", "2",
"2", "1", "1", "1", "1", "1", "1", "1", "1", "2", "1", "1", "2",
"2", "2", "2"), position = c("-1", "-1", "-1", "-1", "-1", "-1",
"-1", "-1", "-1", "-1", "-1", "-1", "-1", "-1", "-1", "1", "1",
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1"
), base = c("T", "T", "T", "A", "C", "A", "T", "T", "A", "T",
"T", "G", "C", "C", "A", "A", "G", "A", "G", "G", "A", "T", "A",
"G", "T", "A", "A", "A", "A", "A")), class = c("tbl_df", "tbl",
"data.frame"), row.names = c(NA, -30L))
Code used for plotting
p <- ggplot(test_data, aes(x = position, y = factor(name))) +
geom_tile(aes(fill = base)) +
scale_fill_viridis_d() +
theme_bw() +
theme(
axis.title.y=element_blank(),
axis.text.y=element_blank(),
legend.title=element_blank(),
axis.title.x=element_text(margin = margin(t = 15)),
panel.grid=element_blank()
)
p <- p + facet_grid(ntile ~ sample, scales = "free", space = "free", drop = TRUE)
Example plot output
ggplot2 version
ggplot2 3.1.1
There are no unused factor levels in your sample data, which is why those arguments do not give you what you want. A possible workaround is:
# Your script
p <- ggplot(test_data, aes(x = position, y = factor(name))) +
geom_tile(aes(fill = base)) +
scale_fill_viridis_d() +
theme_bw() +
theme(
axis.title.y=element_blank(),
axis.text.y=element_blank(),
legend.title=element_blank(),
axis.title.x=element_text(margin = margin(t = 15)),
panel.grid=element_blank()
)
p + coord_flip() + facet_wrap(ntile ~ sample, scales = "free")
output

Using msSurv package in R

I'm trying to use msSurv for a multi-state modelling problem that looks at an individual's transitions between different stages. Part of that is creating a tree object, which is where I think I'm making a mistake, but I can't work out what it is. I'll include a minimal working example here.
Nodes <- c("1", "2", "3", "4", "5", "6")
Edges <- list("1" = list(edges = c("2", "3", "4", "5", "6")),
"2" = list(edges = c("1", "3", "4", "5", "6")),
"3" = list(edges = c("1", "2", "4", "5", "6")),
"4" = list(edges = c("1", "2", "3", "5", "6")),
"5" = list(edges = c("3", "4", "6")),
"6" = list(edges = NULL))
treeobj <- new("graphNEL", nodes = Nodes, edgeL = Edges, edgemode = "directed")
fit3 <- msSurv(df, treeobj, bs = TRUE, LT = TRUE)
The error I'm getting is as follows.
No states eligible for exit distribution calculation.
Entry distributions calculated for states 6 .
Error in bs.IA[, , j, b] : subscript out of bounds
The dataset in question can be found here.
Any help is sincerely appreciated.
I may be misunderstanding, but your state 6 does not list any of the states 1-6 as an edge, so the program returns an error because, in essence, you are saying that 6 isn't connected to the calculation. As for a solution, I believe 6 should have edges; that is, this line may need a non-NULL edge list: "6" = list(edges = NULL))
