Create dataframe from logical test - r

I've got what I know is a really easy question, but I'm stumped and seem to lack the vocabulary to seek out the answer effectively with the search bar.
I have a data frame full of numbers similar to this (though not of the same class)
Dat <- structure(c(9L, 9L, 3L, 3L, 2L, 9L, 10L, 5L, 6L, 2L, 4L, 6L,
10L, 2L, 9L, 0L, 1L, 8L, 9L, 7L, 7L, 4L, 4L, 3L, 4L, 7L, 7L,
1L, 0L, 3L, 6L, 10L, 8L, 3L, 0L, 7L, 7L, 1L, 2L, 8L, 5L, 7L,
7L, 8L, 2L, 1L, 10L, 3L, 0L, 2L, 7L, 0L, 0L, 7L, 9L, 8L, 9L,
0L, 4L, 4L, 5L, 6L, 6L, 2L, 4L, 1L, 6L, 2L, 4L, 7L, 5L, 2L, 7L,
4L, 8L, 3L, 3L, 2L, 5L, 1L, 1L, 3L, 8L, 0L, 1L, 8L, 8L, 1L, 1L,
0L, 4L, 4L, 4L, 5L, 6L, 9L, 5L, 2L, 6L, 3L), .Dim = c(10L, 10L
))
All I want to do is replace all values > 5 with a 1, and all values less than 5 with a 0. I've gotten as far as getting a frame with TRUE and FALSE, but can't seem to figure out how to replace things.
Datlog <- Dat > 5
Any help would be greatly appreciated. Thank you.

If I read your question correctly, you'll kick yourself for the answer:
(Dat > 5) * 1
TRUE and FALSE in R equate to 1 and 0 respectively. As such, the more semantically correct way to do this would be something like:
out <- as.numeric(Dat > 5)
dim(out) <- dim(Dat)
The two step approach is required in this second approach because when you use as.numeric, the dims of the original data are lost.
One way to replace with different values would be to use factor:
out <- factor((Dat > 5), c(TRUE, FALSE), c("YES", "NO"))
dim(out) <- dim(Dat)
Another way would be basic subsetting and substitution:
out <- Dat
out[out > 5] <- 999
out[out <= 5] <- 0
out

Related

How to filter out duplicates in 2-column data frame but in reverse orientation?

I have a data frame of pairs of genes. There are some pairs which are listed twice but in reverse orientation. How do I remove those pairs which are duplicates (but in reverse orientation)? Thanks!
> dput(all_pairs)
structure(list(gene1 = structure(c(2L, 3L, 4L, 5L, 6L, 7L, 8L,
9L, 10L, 1L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 1L, 2L, 4L, 5L,
6L, 7L, 8L, 9L, 10L, 1L, 2L, 3L, 5L, 6L, 7L, 8L, 9L, 10L, 1L,
2L, 3L, 4L, 6L, 7L, 8L, 9L, 10L, 1L, 2L, 3L, 4L, 5L, 7L, 8L,
9L, 10L, 1L, 2L, 3L, 4L, 5L, 6L, 8L, 9L, 10L, 1L, 2L, 3L, 4L,
5L, 6L, 7L, 9L, 10L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 10L, 1L,
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L), .Label = c("ASXL1", "BICRA",
"CCDC168", "HRAS", "MUC16", "NOTCH1", "OBSCN", "PLEC", "RREB1",
"TTN"), class = "factor"), gene2 = structure(c(1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 6L,
6L, 6L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L,
8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 10L, 10L,
10L, 10L, 10L, 10L, 10L, 10L, 10L), .Label = c("ASXL1", "BICRA",
"CCDC168", "HRAS", "MUC16", "NOTCH1", "OBSCN", "PLEC", "RREB1",
"TTN"), class = "factor")), out.attrs = list(dim = c(10L, 10L
), dimnames = list(Var1 = c("Var1=ASXL1", "Var1=BICRA", "Var1=CCDC168",
"Var1=HRAS", "Var1=MUC16", "Var1=NOTCH1", "Var1=OBSCN", "Var1=PLEC",
"Var1=RREB1", "Var1=TTN"), Var2 = c("Var2=ASXL1", "Var2=BICRA",
"Var2=CCDC168", "Var2=HRAS", "Var2=MUC16", "Var2=NOTCH1", "Var2=OBSCN",
"Var2=PLEC", "Var2=RREB1", "Var2=TTN"))), class = "data.frame", row.names = c(NA,
-90L))
This keeps only one copy of each pair, no matter what the orientation/order is:
all_pairs[!duplicated(t(apply(all_pairs, 1, sort))), ]

R best vizualization

I would like to do vizualisation of 2 vectors (predikcia & test data) of all wrongly classified numbers from my classification problem, where i have 76 data in both vectors - first one (predikcia) has numbers from 0-9 what classificator wrongly predicted and in second vector (test data) are numbers what it should be. Basic plot of these vectors has not good representation or not giving some good information about what numbers were wrongly classified and what number they should be classified correctly. Here is a picture what is basic plot showing
data
classres <- data.frame(
predikcia = c(9L, 8L, 3L, 9L, 1L, 6L, 2L, 2L,
6L, 3L, 5L, 9L, 8L, 1L, 5L, 1L, 3L, 3L, 5L, 9L,
5L, 1L, 8L, 9L, 5L, 0L, 1L, 9L, 5L, 5L, 8L, 9L,
2L, 5L, 8L, 5L, 6L, 9L, 9L, 4L, 9L, 3L, 5L, 5L, 9L, 9L, 9L, 4L, 3L,
5L, 8L, 3L, 0L, 5L, 8L, 8L, 7L, 3L, 8L, 8L, 5L, 9L, 9L, 1L, 5L, 5L,
9L, 9L, 5L, 3L, 1L, 9L, 2L, 5L, 8L, 9L),
testdata = c(4L, 6L, 1L, 5L, 5L, 1L, 1L, 1L, 5L,
9L, 7L, 8L, 0L, 8L, 8L, 9L, 7L, 1L, 9L, 5L, 8L,
8L, 0L, 5L, 1L, 8L, 4L, 1L, 9L, 1L, 0L, 5L, 1L,
9L, 0L, 0L, 0L, 4L, 1L, 2L, 7L, 5L, 9L, 8L, 5L,
5L, 5L, 1L, 9L, 9L, 0L, 9L, 8L, 9L, 6L, 0L, 8L,
5L, 0L, 9L, 8L, 5L, 5L, 9L, 2L, 8L, 0L, 5L, 7L,
1L, 8L, 8L, 9L, 9L, 7L, 1L))
I'm assuming that there is either "correct" or "incorrect" predictions, otherwise the graph would need more work.
First, I have the data in which there are precitions and real values. In this examle they are integers, but I'm pretending that it does not mean anything.
classres <- data.frame(
predikcia = c(9L, 8L, 3L, 9L, 1L, 6L, 2L, 2L,
6L, 3L, 5L, 9L, 8L, 1L, 5L, 1L, 3L, 3L, 5L, 9L),
testdata = c(4L, 6L, 1L, 5L, 5L, 1L, 1L, 1L, 5L,
9L, 7L, 8L, 0L, 8L, 8L, 9L, 7L, 1L, 9L, 5L))
Then I create a count data-frame. The "factor" part is important because I want all the possible combinations to appear on the plot.
dat.plot <- classres %>%
count(testdata, predikcia) %>%
mutate(
testdata = factor(testdata, levels = 0:9),
predikcia = factor(predikcia, levels = 0:9))
Finally, I create a heatmap from the data coloring the inside of each cell with the count values and adding a border to the cells where predictions are considered correct (this is why I need the goodclass data-frame).
goodclass <- data.frame(
testdata = factor(0:9),
predikcia = factor(0:9)
)
dat.plot %>%
ggplot(aes(testdata, predikcia, fill = n)) +
geom_tile() +
scale_fill_gradient(low = "goldenrod", high = "darkorchid4") +
geom_tile(data = goodclass,
aes(testdata, predikcia, color = "Correct\npredictions"),
inherit.aes = FALSE, fill = NA, size = 2) +
scale_color_manual(values = c(`Correct\npredictions` = "limegreen")) +
labs(x = "Real class value", y = "Predicted class value",
fill = "count", color = "") +
coord_equal() +
theme_minimal() +
theme(
panel.grid.major = element_blank(),
panel.grid.minor = element_line(color = "black", size = 2))
And the results hurst a little bit the eyes: it will probably need little bit more work to find more beautiful colors.

Error in pts when when calling function ordiellipse in VEGAN package: subscript out of bounds

I have a nice DCA ordination of some of my data, but can't produce an ordiellipse for all different dataclusters. My data is found in the bottom of this post.
What happens now: I have four different groups (Block A-D) with three clusters per group (three year categories). So, I want to end up with different delineated groups. When I run the ordiellipes I get the following error message:
Error in pts[gr, , drop = FALSE] : subscript out of bounds
In addition: Warning message:
In complete.cases(pts) & !is.na(groups) :
longer object length is not a multiple of shorter object length
and a graph as follows:
DCA ordination of four blocks of vegetation
This piece of code reproduces the error, but due to the reduced dataset presented here, the graph looks a bit different:
install.packages("vegan")
library(vegan)
{plot(site_scr_kikker, type="n", main="Kikkervalleien", xlab="DCA1 Eigenvalue = 0.62",
ylab="DCA2 Eigenvalue = 0.39")
points(site_scr_kikker, display = "sites", cex = 0.8, pch=10)
ordiellipse(site_scr_kikker, Years_KIKKER, kind="se", conf=0.95, lwd=2,
draw = "polygon", col=1:4, border=1:4,
alpha=63)}
DCA Axes Dataset:
structure(list(DCA1 = c(-0.554410061801955, 2.68272013411215,
2.68697635940812, 2.82668169800565, 2.80053527027075, 2.23642581481516,
2.35425133786973, 2.52415368531054, 2.83239838572004, 2.84069370354046,
2.77338234239721, 2.81710120200121, 3.02325331285456, 2.53697043507954,
3.05037536310673, 3.32304086730676, 2.94495328416423, 3.15022598269494,
3.39489992455406, 3.28769160043834, -0.350924337608413, 0.275699505382009,
0.297344502163647, 0.240762119868438, 0.228861788615913, 0.314964666243383,
0.371085287846039, 0.455145784364889, 0.652221371003541, 0.499839296442089,
0.379360398080226, 0.549933370572594, 0.399966004306952, 0.500218697886041,
0.441564088620194, 0.374702692230443, 0.382333410536051, 0.43285459912782,
0.428611459750847, 0.349092514843647, -0.888853907037661, -1.28333663263808,
-1.40792331844972, -1.38537198615101, -1.38995889090796, -1.3655773745443,
-1.31803656153966, -1.34826448701426, -1.34653537792753, -1.49305269877646,
-1.50814236008689, -1.41827597394111, -1.39602666811321, -1.4148816003514,
-1.49783699791751, -1.47003691605731, -1.42755467648435, -1.30533485632748,
-1.36950094020217, -1.23477912087743, 3.23114892464093, 2.9350886798946,
3.14124836945073, 3.26161277365282, 3.09515391638416, 3.1529521123077,
3.06459587965894, 3.10368520711438, 3.22697584876561, 3.53654928835111,
2.98450087615265, 3.270797532973, 3.26776719866589, 3.49199289032157,
3.22923990263853, 3.25429242878212, 3.04740856725947, 3.0826704683258,
3.13214804334072, 3.02742007198209, -0.117264033632094, -0.16440126600505,
-0.0448538849517754, -0.0426633870391433, 0.0330104299718532,
-0.0752808949299411, 0.117242046915944, 0.0416044435035445, 0.124146770645119,
0.0523946356429974, -0.110261999817611, -0.228252641183511, -0.188814210123203,
-0.290927018876809, -0.248633979863795, -0.0903889717097015,
-0.123459222045697, -0.149699086185127, -0.150112841061331, 3.01689076683526,
2.01577708020474, 2.03044077034707, 2.07207139315213, 2.12441461917371,
2.03701011931199, 2.01252790874418, 1.83219506720427, 2.04345013029757,
2.15504917885961, 2.06913115663176, 1.98989149024749, 1.99123174245595,
1.96507730677135, 1.95295285738276, 2.04095710166195, 1.84679490913208,
1.83479477688629, 2.06370280057877, 2.09660967186289, 0.541840589690319,
0.103220405988339, 0.145850580989204, 0.171702980416538, 0.0991444115624873,
0.163980634495489, 0.0100630096884953, 0.00653099371627297, -0.049057450717042,
-0.0731989798652191, 0.0484957737765508, -0.0813429375561661,
0.226394829075491, 0.118747426326434, 0.0785696207929674, 0.372080921641888,
0.228084973201013, 0.436449500065551, 0.380195760092951, 0.421054280535058,
-0.3407891239866, -0.770535673192646, -0.78726979249955, -0.605034153126869,
-0.79603463000109, -0.611191548761836, -0.479087063427777, -0.431712806416684,
-0.442179135680639, -0.359040655364315, -0.387751952086651, -0.333064178275891,
-0.245245634230479, -0.294664916205089, -0.325293571885643, -0.371714350289459,
-0.384076243072539, -0.364275416660051, -0.492176029276133, -0.360665042070641,
1.97498200909125, 1.71918456504906, 1.65998788992634, 1.66434225634425,
1.56633028293729, 1.74620235786651, 1.62590128379407, 1.50258825353478,
1.48820880624004, 1.42926003809109, 1.45513337793396, 1.42592371006012,
1.4963606424124, 1.44021703608174, 1.44438380462437, 1.47109090679392,
1.82139520526838, 1.43656718298432, 1.44873704214624, 1.6139306940386,
0.329534864476447, 0.242211052748716, 0.235001084932526, 0.203151203202996,
0.0621389966258401, 0.0944651233344451, 0.335947463398379, 0.34920131294113,
0.356337550057783, 0.413800173211847, 0.475501084593146, 0.636972497835927,
0.378416570342712, 0.405927373162309, 0.483958766985421, 0.313492417628128,
0.18082570013463, 0.213448692873988, 0.175969392011173, 0.306174433718341,
-0.661344430804266, -0.36312534912334, -0.531638029637394, -0.323308841681458,
-0.1705480775506, -0.320797820641974, -0.0112455616689928, 0.0058094693123143,
-0.173103348877858, -0.187484910613069, -0.140328782633759, -0.262935718115112,
-0.213706195115846, -0.201623466852359, -0.176562229774177, -0.129977719792298,
-0.214157064357283, -0.312304712680445, -0.321801942265119, -0.447307072541585
), DCA2 = c(1.55135681949654, 0.390676820301294, 0.298911889220322,
0.263998071977169, 0.318540344211798, 0.261720092088233, 0.185092505227324,
0.266125079431566, 0.394828240097056, 0.302396200887096, 0.427178178571868,
0.362329582087479, 0.329300702637127, 0.106852609024896, 0.0916401140801768,
-0.0498768808296606, 0.0568755736541453, 0.0409183688588972,
-0.00982842960758612, -0.0532614523308772, 2.37879922539826,
0.870845236307184, 0.875448127097767, 0.641275864686684, 0.642137889278431,
0.61287181240447, 0.46096369228661, 0.353139355245069, 0.30571197713629,
0.127480232335107, 0.155591712070341, 0.201701485575426, 0.164465659451652,
0.053079369473755, 0.0208974057538049, 0.146542798250278, 0.133527092556681,
0.0558014251324042, 0.0947450033654067, 0.146527814538444, 1.5634624218799,
-0.045338607959831, 0.0921067787998133, 0.100136516321785, 0.176555931155601,
0.1779356816878, 0.169352553487154, 0.159084219879744, 0.1416643202517,
0.046751076432749, -0.0143690327219694, 0.0854961342502074, 0.0502099136978105,
0.0730195528098192, 0.0853374008019263, 0.115531044767214, 0.0847573955605063,
0.163097640034325, 0.134198472975748, 0.275479651900967, 0.0460034929226141,
0.560715956164233, 0.37831200537774, 0.258027386145382, 0.388149229049795,
0.321257843490554, 0.403942482899889, 0.195339552141307, 0.151011302110764,
0.0876417694236817, 0.0587161407304979, 0.0994546033680268, 0.251510488850064,
0.122130974589908, 0.111911790245653, 0.165535261771949, 0.060970314956561,
0.225723170237567, 0.313941078588394, 0.231918137883269, 0.993679773684799,
0.881292795126892, 0.949549576326203, 0.820143650778247, 0.967230951435818,
0.913507935790706, 0.987962294037885, 0.89747403569919, 1.0281502304616,
1.03056849037379, 0.985558206829436, 0.956118614451869, 0.990861510942461,
1.03853618229401, 0.76894643786781, 0.71956843396122, 0.895677149723554,
1.04202078104011, 0.994362394242357, 0.45816044069548, -0.256799924265915,
-0.219215409286906, -0.274974314031124, 0.00673120866587418,
-0.34588695905374, -0.330796391785146, -0.284953089585678, -0.358994836114401,
-0.0877152907820218, -0.0179836181616615, -0.0514356092538941,
-0.0631722426274615, -0.321764014760995, -0.292880797095688,
-0.124966216219314, 0.0448721698628494, -0.0122687592139075,
-0.0293240055712474, 0.241689548511685, 0.258771228150735, -0.243978231909183,
-0.273670301716394, -0.346381197575676, -0.540924824745573, -0.578466142473874,
-0.881698449269004, -0.988487876600371, -0.874559759965791, -0.898043753863041,
-0.65177624643986, -0.897172266653606, -0.428027378766614, -0.618350571130815,
-0.650486911424929, -0.522529645458612, -0.540007295687359, -0.56048323820591,
-0.318120499195913, -0.233107811772576, 1.11444300342379, -0.464311522843928,
-0.671043500267456, -0.293709784912165, -0.48957037940714, -0.303799057505386,
-0.5014139212286, -0.446968540045644, -0.584723850846212, -0.768962102318167,
-0.473903387692755, -0.476071214131476, -0.738718937014587, -0.802748557174088,
-0.878862063849493, -1.03232927446183, -0.901938937530595, -0.90685531694932,
-0.835172924486567, -0.444400981243365, 0.0711913939922195, -0.376272209371,
-0.278328148225639, -0.37229823300335, -0.158396017104884, -0.221206427389147,
-0.356652754022269, -0.130851791393296, -0.208569352651987, -0.12330848067377,
0.119039186900003, -0.145975049001435, -0.0110773787283525, 0.154455358806736,
0.186221284305681, 0.0518734671667143, 0.0410707622863646, 0.295096579413462,
0.277622386022512, -0.0377429837590535, -0.126848197591401, -0.0574585504960616,
-0.250845634712495, -0.0177800130809138, -0.107737216176091,
-0.0631643821637247, 0.1010605032824, -0.0442202733629364, -0.372070473916875,
-0.533311401539873, -0.724584176353283, -0.865166680824871, -0.87656068793911,
-0.813421991975295, -0.839998556813832, -0.655707249050569, -0.534597066763741,
-0.378820955906015, -0.0722774697143169, -0.109467994974947,
-0.331582307211823, -1.28959124402666, -1.37962362889618, -1.43451046953702,
-1.38447488090246, -1.69236882979906, -1.44344360082209, -1.3915281556235,
-1.58096147044615, -1.68132043125815, -2.20367091829309, -2.5599499288299,
-2.31297384112025, -2.38435599310711, -2.20768782296035, -1.65607037944418,
-1.64014952994504, -1.69013789782066, -1.7017681151936, -1.61898692370139
)), .Names = c("DCA1", "DCA2"), row.names = c("01A01", "01A02",
"01A03", "01A04", "01A05", "01A06", "01A07", "01A08", "01A09",
"01A10", "01A11", "01A12", "01A13", "01A14", "01A15", "01A16",
"01A17", "01A18", "01A19", "01A20", "08A01", "08A02", "08A03",
"08A04", "08A05", "08A06", "08A07", "08A08", "08A09", "08A10",
"08A11", "08A12", "08A13", "08A14", "08A15", "08A16", "08A17",
"08A18", "08A19", "08A20", "18A01", "18A02", "18A03", "18A04",
"18A05", "18A06", "18A07", "18A08", "18A09", "18A10", "18A11",
"18A12", "18A13", "18A14", "18A15", "18A16", "18A17", "18A18",
"18A19", "18A20", "01B01", "01B02", "01B03", "01B04", "01B05",
"01B06", "01B07", "01B08", "01B09", "01B10", "01B11", "01B12",
"01B13", "01B14", "01B15", "01B16", "01B17", "01B18", "01B19",
"01B20", "18B02", "18B03", "18B04", "18B05", "18B06", "18B07",
"18B08", "18B09", "18B10", "18B11", "18B12", "18B13", "18B14",
"18B15", "18B16", "18B17", "18B18", "18B19", "18B20", "01C01",
"01C02", "01C03", "01C04", "01C05", "01C06", "01C07", "01C08",
"01C09", "01C10", "01C11", "01C12", "01C13", "01C14", "01C15",
"01C16", "01C17", "01C18", "01C19", "01C20", "08C01", "08C02",
"08C03", "08C04", "08C05", "08C06", "08C07", "08C08", "08C09",
"08C10", "08C11", "08C12", "08C13", "08C14", "08C15", "08C16",
"08C17", "08C18", "08C19", "08C20", "18C01", "18C02", "18C03",
"18C04", "18C05", "18C06", "18C07", "18C08", "18C09", "18C10",
"18C11", "18C12", "18C13", "18C14", "18C15", "18C16", "18C17",
"18C18", "18C19", "18C20", "01D01", "01D02", "01D03", "01D04",
"01D05", "01D06", "01D07", "01D08", "01D09", "01D10", "01D11",
"01D12", "01D13", "01D14", "01D15", "01D16", "01D17", "01D18",
"01D19", "01D20", "08D01", "08D02", "08D03", "08D04", "08D05",
"08D06", "08D07", "08D08", "08D09", "08D10", "08D11", "08D12",
"08D13", "08D14", "08D15", "08D16", "08D17", "08D18", "08D19",
"08D20", "18D01", "18D02", "18D03", "18D04", "18D05", "18D06",
"18D07", "18D08", "18D09", "18D10", "18D11", "18D12", "18D13",
"18D14", "18D15", "18D16", "18D17", "18D18", "18D19", "18D20"
), class = "data.frame")
Years_Kikker Dataset
structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 8L, 8L, 8L, 8L, 8L,
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L,
9L, 9L, 9L, 9L, 9L, 9L, 9L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 6L, 6L, 6L, 6L, 6L,
6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 10L,
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L,
10L, 10L, 10L, 10L, 10L, 10L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 7L, 7L, 7L, 7L,
7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L,
11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L,
11L, 11L, 11L, 11L, 11L, 11L, 11L), .Label = c("2001_A", "2001_B",
"2001_C", "2001_D", "2008_A", "2008_C", "2008_D", "2018_A", "2018_B",
"2018_C", "2018_D"), class = "factor")

How to rename integers to factor based on key

I need to make new column of factors based on value of column Quadrat. There are 9 quadrats, and new column called Sponge would be something like:
"Old Growth" if Quadrat = 1,4,9
"Absent" if Quadrat= 3,6,7
"New Growth" if Quadrat = 2,5,8
I am sorry if answer is easy, I did check: How to convert integer to factor in R?
and also I am trying to use recode_factor. Here is my code:
library(dplyr)
key <- list(`1,4,9` = "Old Growth", `3,6,7` = "Absent", `2,5,8` = "New Growth")
df <- mutate(df, Sponge = recode_factor(Quadrat, key))
I get error:
Error in mutate_impl(.data, dots) :
Evaluation error: Vector 1 must be length 108 or one, not 3.
Real data has much more entries than the dataset I include here, if that matters. Thank you for any help.
df <- structure(list(Quadrat = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L,
4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L, 7L, 7L, 7L, 8L, 8L, 8L, 9L,
9L, 9L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L, 5L, 5L,
5L, 6L, 6L, 6L, 7L, 7L, 7L, 8L, 8L, 8L, 9L, 9L, 9L, 1L, 1L, 1L,
2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L, 7L,
7L, 7L, 8L, 8L, 8L, 9L, 9L, 9L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L,
3L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L, 7L, 7L, 7L, 8L, 8L, 8L,
9L, 9L, 9L), Month = structure(c(4L, 4L, 4L, 3L, 3L, 3L, 7L,
7L, 7L, 1L, 1L, 1L, 8L, 8L, 8L, 6L, 6L, 6L, 5L, 5L, 5L, 2L, 2L,
2L, 9L, 9L, 9L, 4L, 4L, 4L, 3L, 3L, 3L, 7L, 7L, 7L, 1L, 1L, 1L,
8L, 8L, 8L, 6L, 6L, 6L, 5L, 5L, 5L, 2L, 2L, 2L, 9L, 9L, 9L, 4L,
4L, 4L, 3L, 3L, 3L, 7L, 7L, 7L, 1L, 1L, 1L, 8L, 8L, 8L, 6L, 6L,
6L, 5L, 5L, 5L, 2L, 2L, 2L, 9L, 9L, 9L, 4L, 4L, 4L, 3L, 3L, 3L,
7L, 7L, 7L, 1L, 1L, 1L, 8L, 8L, 8L, 6L, 6L, 6L, 5L, 5L, 5L, 2L,
2L, 2L, 9L, 9L, 9L), .Label = c("Apr", "Aug", "Feb", "Jan", "Jul",
"Jun", "Mar", "May", "Sep"), class = "factor"), PopDens = c(65.6011820777785,
18.4913752602879, 12.151802276494, 68.0740840677172, 50.9832500135526,
36.8684287818614, 52.0825074084569, 26.8776902493555, 49.2173263626173,
25.5460870559327, 5.4171769618988, 34.4303709487431, 44.3439512783661,
2.25230997451581, 61.2502326716203, 25.9035727053415, 32.339118222706,
24.1017888628412, 12.340617884649, 53.3521768709179, 26.0048255382571,
52.8581868957262, 31.9503199581522, 18.1601244299673, 34.228305231547,
2.09199664392509, 22.6402857622597, 4.48008164577186, 48.2082461479586,
65.4937081446406, 5.43837511213496, 32.8203339113388, 4.44421968702227,
19.8568186087068, 24.2561273102183, 12.3652934685815, 39.0541164302267,
16.1970243314281, 12.9826903613284, 36.3537323835772, 48.7148000504822,
11.5067498446442, 68.7493303583469, 60.7505214684643, 49.3874175737146,
63.0705459746532, 23.721419940237, 53.4379795142449, 57.7867246468086,
38.4747762591578, 8.43540686019696, 20.5636212413665, 28.7687741059344,
53.2144687068649, 32.0859562589321, 10.5120962983929, 53.4312571119517,
13.6547974413261, 31.3038802060764, 14.5005466006696, 6.03453303268179,
62.6867637028918, 17.7734197168611, 11.0327071261127, 51.4377708046231,
26.8335341704078, 9.81126144807786, 43.993699422339, 20.5123583010864,
14.9305799969006, 23.8019575944636, 39.1543961388525, 30.4534046472982,
61.2751477411948, 48.0770866076928, 59.4514226955362, 42.9857548968866,
23.0139948409051, 1.76873184926808, 33.1222371393815, 10.8652087603696,
24.5235243474599, 62.4086231633555, 55.6522683221847, 68.8337469024118,
48.2195318546146, 6.75986870843917, 57.7931131315418, 18.2255988919642,
40.8185531077906, 38.066848333925, 31.8611310839187, 22.2724406518973,
51.7982920755167, 29.2363496678881, 35.541056742426, 66.5265460675582,
28.267403066624, 40.5209824540652, 31.8187582066748, 67.2972998009063,
53.6718824433628, 42.6495425191242, 31.6603209995665, 44.3039192620199,
21.6216275517363, 66.9763269643299, 36.3314134527463)), .Names = c("Quadrat",
"Month", "PopDens"), row.names = c(NA, -108L), class = "data.frame")
If we are using recode_factor, then create the list with individual components instead of pasteed one
key <- setNames(as.list(rep(c("Old Growth", "Absent", "New Growth"),
each = 3)), c(1, 4, 9, 3, 6, 7, 2, 5, 8))
df %>%
mutate(Sponge = recode_factor(Quadrat, !!! key)) %>%
head
# Quadrat Month PopDens Sponge
#1 1 Jan 65.60118 Old Growth
#2 1 Jan 18.49138 Old Growth
#3 1 Jan 12.15180 Old Growth
#4 2 Feb 68.07408 New Growth
#5 2 Feb 50.98325 New Growth
#6 2 Feb 36.86843 New Growth
Use mutate with the factor function
df %>% mutate(Quadrat2 =
factor(Quadrat, levels = 1:9,
labels =rep(c("Old Growth", "New Growth", "Absent"),3)
)
)

For loop assigns variable while i want to plot something with qplot (R)

So i have the following for loop:
for (Count in 1:19){
png(paste0(colnames(fdd$rawCounts)[Count], ".pdf"))
qplot(y = log2(fdd$rawCounts[,Count]), main = colnames(fdd$rawCounts)[Count])
dev.off()
}
Which should simply plot some count data which i put a head from here:
structure(c(11L, 3L, 12L, 8L, 15L, 2L, 5L, 2L, 8L, 7L, 6L, 10L,
6L, 1L, 7L, 4L, 2L, 1L, 3L, 0L, 4L, 4L, 2L, 5L, 8L, 0L, 13L,
4L, 10L, 7L, 2L, 1L, 2L, 4L, 7L, 7L, 14L, 4L, 25L, 17L, 14L,
16L, 4L, 2L, 5L, 5L, 5L, 2L, 9L, 5L, 11L, 8L, 1L, 4L, 10L, 8L,
8L, 7L, 9L, 5L, 9L, 15L, 14L, 11L, 16L, 8L, 11L, 4L, 3L, 6L,
3L, 0L, 6L, 3L, 4L, 6L, 1L, 4L, 11L, 11L, 12L, 6L, 2L, 6L, 7L,
9L, 22L, 8L, 13L, 7L, 6L, 1L, 4L, 5L, 6L, 2L, 4L, 2L, 6L, 7L,
3L, 2L, 6L, 3L, 3L, 2L, 5L, 5L, 9L, 2L, 6L, 5L, 4L, 2L), .Dim = c(6L,
19L), .Dimnames = structure(list(feature = c("chr10:100000001-100000500",
"chr10:10000001-10000500", "chr10:1000001-1000500", "chr10:100000501-100001000",
"chr10:100001-100500", "chr10:100001001-100001500"), sample = c("K562_FAIRE_Acla_4hr_1",
"K562_FAIRE_Acla_4hr_2", "K562_FAIRE_Daun_4hr_1", "K562_FAIRE_Daun_4hr_2",
"K562_FAIRE_Etop_4hr_1", "K562_FAIRE_Etop_4hr_2", "K562_FAIRE_untreated",
"FAIRE.seq_K562_2MethylDoxo_A", "FAIRE.seq_K562_2MethylDoxo_B",
"FAIRE.seq_K562_Ctr_A", "FAIRE.seq_K562_Ctr_B", "FAIRE.seq_K562_Doxo_10uM_4hrs_A",
"FAIRE.seq_K562_Doxo_10uM_4hrs_B", "FAIRE.seq_K562_Epirubicin_A",
"FAIRE.seq_K562_Epirubicin_B", "FAIRE.seq_K562_MTX_40uM_4hrs_A",
"FAIRE.seq_K562_MTX_40uM_4hrs_B", "FAIRE.seq_K562_MTX_5uM_4hrs_A",
"FAIRE.seq_K562_MTX_5uM_4hrs_B")), .Names = c("feature", "sample"
)))
Now if i try to plot the data it gives me a variable called Count and the value is 19L. While i expect 19 plots to be drawn. Why is this happening?
Thanks!
This works for me:
for (Count in 1:19){
pdf(paste0(colnames(df)[Count], ".pdf"))
print(qplot(y = log2(df[, Count]), main = colnames(df)[Count]))
dev.off()
}
A couple of changes:
I read in your data as df. It turns out to be a matrix, so I adjusted the subsetting accordingly.
I also wrapped the qplot function in a print function which forces the figure to be created.
Finally, I switched the png function to pdf as that seemed like the files you were trying to create based on the paste0 result.

Resources