R best vizualization

R best vizualization - r

I would like to do vizualisation of 2 vectors (predikcia & test data) of all wrongly classified numbers from my classification problem, where i have 76 data in both vectors - first one (predikcia) has numbers from 0-9 what classificator wrongly predicted and in second vector (test data) are numbers what it should be. Basic plot of these vectors has not good representation or not giving some good information about what numbers were wrongly classified and what number they should be classified correctly. Here is a picture what is basic plot showing
data
classres <- data.frame(
predikcia = c(9L, 8L, 3L, 9L, 1L, 6L, 2L, 2L,
6L, 3L, 5L, 9L, 8L, 1L, 5L, 1L, 3L, 3L, 5L, 9L,
5L, 1L, 8L, 9L, 5L, 0L, 1L, 9L, 5L, 5L, 8L, 9L,
2L, 5L, 8L, 5L, 6L, 9L, 9L, 4L, 9L, 3L, 5L, 5L, 9L, 9L, 9L, 4L, 3L,
5L, 8L, 3L, 0L, 5L, 8L, 8L, 7L, 3L, 8L, 8L, 5L, 9L, 9L, 1L, 5L, 5L,
9L, 9L, 5L, 3L, 1L, 9L, 2L, 5L, 8L, 9L),
testdata = c(4L, 6L, 1L, 5L, 5L, 1L, 1L, 1L, 5L,
9L, 7L, 8L, 0L, 8L, 8L, 9L, 7L, 1L, 9L, 5L, 8L,
8L, 0L, 5L, 1L, 8L, 4L, 1L, 9L, 1L, 0L, 5L, 1L,
9L, 0L, 0L, 0L, 4L, 1L, 2L, 7L, 5L, 9L, 8L, 5L,
5L, 5L, 1L, 9L, 9L, 0L, 9L, 8L, 9L, 6L, 0L, 8L,
5L, 0L, 9L, 8L, 5L, 5L, 9L, 2L, 8L, 0L, 5L, 7L,
1L, 8L, 8L, 9L, 9L, 7L, 1L))

I'm assuming that there is either "correct" or "incorrect" predictions, otherwise the graph would need more work.
First, I have the data in which there are precitions and real values. In this examle they are integers, but I'm pretending that it does not mean anything.
classres <- data.frame(
predikcia = c(9L, 8L, 3L, 9L, 1L, 6L, 2L, 2L,
6L, 3L, 5L, 9L, 8L, 1L, 5L, 1L, 3L, 3L, 5L, 9L),
testdata = c(4L, 6L, 1L, 5L, 5L, 1L, 1L, 1L, 5L,
9L, 7L, 8L, 0L, 8L, 8L, 9L, 7L, 1L, 9L, 5L))
Then I create a count data-frame. The "factor" part is important because I want all the possible combinations to appear on the plot.
dat.plot <- classres %>%
count(testdata, predikcia) %>%
mutate(
testdata = factor(testdata, levels = 0:9),
predikcia = factor(predikcia, levels = 0:9))
Finally, I create a heatmap from the data coloring the inside of each cell with the count values and adding a border to the cells where predictions are considered correct (this is why I need the goodclass data-frame).
goodclass <- data.frame(
testdata = factor(0:9),
predikcia = factor(0:9)
)
dat.plot %>%
ggplot(aes(testdata, predikcia, fill = n)) +
geom_tile() +
scale_fill_gradient(low = "goldenrod", high = "darkorchid4") +
geom_tile(data = goodclass,
aes(testdata, predikcia, color = "Correct\npredictions"),
inherit.aes = FALSE, fill = NA, size = 2) +
scale_color_manual(values = c(`Correct\npredictions` = "limegreen")) +
labs(x = "Real class value", y = "Predicted class value",
fill = "count", color = "") +
coord_equal() +
theme_minimal() +
theme(
panel.grid.major = element_blank(),
panel.grid.minor = element_line(color = "black", size = 2))
And the results hurst a little bit the eyes: it will probably need little bit more work to find more beautiful colors.

Related

How to filter out duplicates in 2-column data frame but in reverse orientation?

I have a data frame of pairs of genes. There are some pairs which are listed twice but in reverse orientation. How do I remove those pairs which are duplicates (but in reverse orientation)? Thanks!
> dput(all_pairs)
structure(list(gene1 = structure(c(2L, 3L, 4L, 5L, 6L, 7L, 8L,
9L, 10L, 1L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 1L, 2L, 4L, 5L,
6L, 7L, 8L, 9L, 10L, 1L, 2L, 3L, 5L, 6L, 7L, 8L, 9L, 10L, 1L,
2L, 3L, 4L, 6L, 7L, 8L, 9L, 10L, 1L, 2L, 3L, 4L, 5L, 7L, 8L,
9L, 10L, 1L, 2L, 3L, 4L, 5L, 6L, 8L, 9L, 10L, 1L, 2L, 3L, 4L,
5L, 6L, 7L, 9L, 10L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 10L, 1L,
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L), .Label = c("ASXL1", "BICRA",
"CCDC168", "HRAS", "MUC16", "NOTCH1", "OBSCN", "PLEC", "RREB1",
"TTN"), class = "factor"), gene2 = structure(c(1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 6L,
6L, 6L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L,
8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 10L, 10L,
10L, 10L, 10L, 10L, 10L, 10L, 10L), .Label = c("ASXL1", "BICRA",
"CCDC168", "HRAS", "MUC16", "NOTCH1", "OBSCN", "PLEC", "RREB1",
"TTN"), class = "factor")), out.attrs = list(dim = c(10L, 10L
), dimnames = list(Var1 = c("Var1=ASXL1", "Var1=BICRA", "Var1=CCDC168",
"Var1=HRAS", "Var1=MUC16", "Var1=NOTCH1", "Var1=OBSCN", "Var1=PLEC",
"Var1=RREB1", "Var1=TTN"), Var2 = c("Var2=ASXL1", "Var2=BICRA",
"Var2=CCDC168", "Var2=HRAS", "Var2=MUC16", "Var2=NOTCH1", "Var2=OBSCN",
"Var2=PLEC", "Var2=RREB1", "Var2=TTN"))), class = "data.frame", row.names = c(NA,
-90L))

This keeps only one copy of each pair, no matter what the orientation/order is:
all_pairs[!duplicated(t(apply(all_pairs, 1, sort))), ]

Error in pts when when calling function ordiellipse in VEGAN package: subscript out of bounds

I have a nice DCA ordination of some of my data, but can't produce an ordiellipse for all different dataclusters. My data is found in the bottom of this post.
What happens now: I have four different groups (Block A-D) with three clusters per group (three year categories). So, I want to end up with different delineated groups. When I run the ordiellipes I get the following error message:
Error in pts[gr, , drop = FALSE] : subscript out of bounds
In addition: Warning message:
In complete.cases(pts) & !is.na(groups) :
longer object length is not a multiple of shorter object length
and a graph as follows:
DCA ordination of four blocks of vegetation
This piece of code reproduces the error, but due to the reduced dataset presented here, the graph looks a bit different:
install.packages("vegan")
library(vegan)
{plot(site_scr_kikker, type="n", main="Kikkervalleien", xlab="DCA1 Eigenvalue = 0.62",
ylab="DCA2 Eigenvalue = 0.39")
points(site_scr_kikker, display = "sites", cex = 0.8, pch=10)
ordiellipse(site_scr_kikker, Years_KIKKER, kind="se", conf=0.95, lwd=2,
draw = "polygon", col=1:4, border=1:4,
alpha=63)}
DCA Axes Dataset:
structure(list(DCA1 = c(-0.554410061801955, 2.68272013411215,
2.68697635940812, 2.82668169800565, 2.80053527027075, 2.23642581481516,
2.35425133786973, 2.52415368531054, 2.83239838572004, 2.84069370354046,
2.77338234239721, 2.81710120200121, 3.02325331285456, 2.53697043507954,
3.05037536310673, 3.32304086730676, 2.94495328416423, 3.15022598269494,
3.39489992455406, 3.28769160043834, -0.350924337608413, 0.275699505382009,
0.297344502163647, 0.240762119868438, 0.228861788615913, 0.314964666243383,
0.371085287846039, 0.455145784364889, 0.652221371003541, 0.499839296442089,
0.379360398080226, 0.549933370572594, 0.399966004306952, 0.500218697886041,
0.441564088620194, 0.374702692230443, 0.382333410536051, 0.43285459912782,
0.428611459750847, 0.349092514843647, -0.888853907037661, -1.28333663263808,
-1.40792331844972, -1.38537198615101, -1.38995889090796, -1.3655773745443,
-1.31803656153966, -1.34826448701426, -1.34653537792753, -1.49305269877646,
-1.50814236008689, -1.41827597394111, -1.39602666811321, -1.4148816003514,
-1.49783699791751, -1.47003691605731, -1.42755467648435, -1.30533485632748,
-1.36950094020217, -1.23477912087743, 3.23114892464093, 2.9350886798946,
3.14124836945073, 3.26161277365282, 3.09515391638416, 3.1529521123077,
3.06459587965894, 3.10368520711438, 3.22697584876561, 3.53654928835111,
2.98450087615265, 3.270797532973, 3.26776719866589, 3.49199289032157,
3.22923990263853, 3.25429242878212, 3.04740856725947, 3.0826704683258,
3.13214804334072, 3.02742007198209, -0.117264033632094, -0.16440126600505,
-0.0448538849517754, -0.0426633870391433, 0.0330104299718532,
-0.0752808949299411, 0.117242046915944, 0.0416044435035445, 0.124146770645119,
0.0523946356429974, -0.110261999817611, -0.228252641183511, -0.188814210123203,
-0.290927018876809, -0.248633979863795, -0.0903889717097015,
-0.123459222045697, -0.149699086185127, -0.150112841061331, 3.01689076683526,
2.01577708020474, 2.03044077034707, 2.07207139315213, 2.12441461917371,
2.03701011931199, 2.01252790874418, 1.83219506720427, 2.04345013029757,
2.15504917885961, 2.06913115663176, 1.98989149024749, 1.99123174245595,
1.96507730677135, 1.95295285738276, 2.04095710166195, 1.84679490913208,
1.83479477688629, 2.06370280057877, 2.09660967186289, 0.541840589690319,
0.103220405988339, 0.145850580989204, 0.171702980416538, 0.0991444115624873,
0.163980634495489, 0.0100630096884953, 0.00653099371627297, -0.049057450717042,
-0.0731989798652191, 0.0484957737765508, -0.0813429375561661,
0.226394829075491, 0.118747426326434, 0.0785696207929674, 0.372080921641888,
0.228084973201013, 0.436449500065551, 0.380195760092951, 0.421054280535058,
-0.3407891239866, -0.770535673192646, -0.78726979249955, -0.605034153126869,
-0.79603463000109, -0.611191548761836, -0.479087063427777, -0.431712806416684,
-0.442179135680639, -0.359040655364315, -0.387751952086651, -0.333064178275891,
-0.245245634230479, -0.294664916205089, -0.325293571885643, -0.371714350289459,
-0.384076243072539, -0.364275416660051, -0.492176029276133, -0.360665042070641,
1.97498200909125, 1.71918456504906, 1.65998788992634, 1.66434225634425,
1.56633028293729, 1.74620235786651, 1.62590128379407, 1.50258825353478,
1.48820880624004, 1.42926003809109, 1.45513337793396, 1.42592371006012,
1.4963606424124, 1.44021703608174, 1.44438380462437, 1.47109090679392,
1.82139520526838, 1.43656718298432, 1.44873704214624, 1.6139306940386,
0.329534864476447, 0.242211052748716, 0.235001084932526, 0.203151203202996,
0.0621389966258401, 0.0944651233344451, 0.335947463398379, 0.34920131294113,
0.356337550057783, 0.413800173211847, 0.475501084593146, 0.636972497835927,
0.378416570342712, 0.405927373162309, 0.483958766985421, 0.313492417628128,
0.18082570013463, 0.213448692873988, 0.175969392011173, 0.306174433718341,
-0.661344430804266, -0.36312534912334, -0.531638029637394, -0.323308841681458,
-0.1705480775506, -0.320797820641974, -0.0112455616689928, 0.0058094693123143,
-0.173103348877858, -0.187484910613069, -0.140328782633759, -0.262935718115112,
-0.213706195115846, -0.201623466852359, -0.176562229774177, -0.129977719792298,
-0.214157064357283, -0.312304712680445, -0.321801942265119, -0.447307072541585
), DCA2 = c(1.55135681949654, 0.390676820301294, 0.298911889220322,
0.263998071977169, 0.318540344211798, 0.261720092088233, 0.185092505227324,
0.266125079431566, 0.394828240097056, 0.302396200887096, 0.427178178571868,
0.362329582087479, 0.329300702637127, 0.106852609024896, 0.0916401140801768,
-0.0498768808296606, 0.0568755736541453, 0.0409183688588972,
-0.00982842960758612, -0.0532614523308772, 2.37879922539826,
0.870845236307184, 0.875448127097767, 0.641275864686684, 0.642137889278431,
0.61287181240447, 0.46096369228661, 0.353139355245069, 0.30571197713629,
0.127480232335107, 0.155591712070341, 0.201701485575426, 0.164465659451652,
0.053079369473755, 0.0208974057538049, 0.146542798250278, 0.133527092556681,
0.0558014251324042, 0.0947450033654067, 0.146527814538444, 1.5634624218799,
-0.045338607959831, 0.0921067787998133, 0.100136516321785, 0.176555931155601,
0.1779356816878, 0.169352553487154, 0.159084219879744, 0.1416643202517,
0.046751076432749, -0.0143690327219694, 0.0854961342502074, 0.0502099136978105,
0.0730195528098192, 0.0853374008019263, 0.115531044767214, 0.0847573955605063,
0.163097640034325, 0.134198472975748, 0.275479651900967, 0.0460034929226141,
0.560715956164233, 0.37831200537774, 0.258027386145382, 0.388149229049795,
0.321257843490554, 0.403942482899889, 0.195339552141307, 0.151011302110764,
0.0876417694236817, 0.0587161407304979, 0.0994546033680268, 0.251510488850064,
0.122130974589908, 0.111911790245653, 0.165535261771949, 0.060970314956561,
0.225723170237567, 0.313941078588394, 0.231918137883269, 0.993679773684799,
0.881292795126892, 0.949549576326203, 0.820143650778247, 0.967230951435818,
0.913507935790706, 0.987962294037885, 0.89747403569919, 1.0281502304616,
1.03056849037379, 0.985558206829436, 0.956118614451869, 0.990861510942461,
1.03853618229401, 0.76894643786781, 0.71956843396122, 0.895677149723554,
1.04202078104011, 0.994362394242357, 0.45816044069548, -0.256799924265915,
-0.219215409286906, -0.274974314031124, 0.00673120866587418,
-0.34588695905374, -0.330796391785146, -0.284953089585678, -0.358994836114401,
-0.0877152907820218, -0.0179836181616615, -0.0514356092538941,
-0.0631722426274615, -0.321764014760995, -0.292880797095688,
-0.124966216219314, 0.0448721698628494, -0.0122687592139075,
-0.0293240055712474, 0.241689548511685, 0.258771228150735, -0.243978231909183,
-0.273670301716394, -0.346381197575676, -0.540924824745573, -0.578466142473874,
-0.881698449269004, -0.988487876600371, -0.874559759965791, -0.898043753863041,
-0.65177624643986, -0.897172266653606, -0.428027378766614, -0.618350571130815,
-0.650486911424929, -0.522529645458612, -0.540007295687359, -0.56048323820591,
-0.318120499195913, -0.233107811772576, 1.11444300342379, -0.464311522843928,
-0.671043500267456, -0.293709784912165, -0.48957037940714, -0.303799057505386,
-0.5014139212286, -0.446968540045644, -0.584723850846212, -0.768962102318167,
-0.473903387692755, -0.476071214131476, -0.738718937014587, -0.802748557174088,
-0.878862063849493, -1.03232927446183, -0.901938937530595, -0.90685531694932,
-0.835172924486567, -0.444400981243365, 0.0711913939922195, -0.376272209371,
-0.278328148225639, -0.37229823300335, -0.158396017104884, -0.221206427389147,
-0.356652754022269, -0.130851791393296, -0.208569352651987, -0.12330848067377,
0.119039186900003, -0.145975049001435, -0.0110773787283525, 0.154455358806736,
0.186221284305681, 0.0518734671667143, 0.0410707622863646, 0.295096579413462,
0.277622386022512, -0.0377429837590535, -0.126848197591401, -0.0574585504960616,
-0.250845634712495, -0.0177800130809138, -0.107737216176091,
-0.0631643821637247, 0.1010605032824, -0.0442202733629364, -0.372070473916875,
-0.533311401539873, -0.724584176353283, -0.865166680824871, -0.87656068793911,
-0.813421991975295, -0.839998556813832, -0.655707249050569, -0.534597066763741,
-0.378820955906015, -0.0722774697143169, -0.109467994974947,
-0.331582307211823, -1.28959124402666, -1.37962362889618, -1.43451046953702,
-1.38447488090246, -1.69236882979906, -1.44344360082209, -1.3915281556235,
-1.58096147044615, -1.68132043125815, -2.20367091829309, -2.5599499288299,
-2.31297384112025, -2.38435599310711, -2.20768782296035, -1.65607037944418,
-1.64014952994504, -1.69013789782066, -1.7017681151936, -1.61898692370139
)), .Names = c("DCA1", "DCA2"), row.names = c("01A01", "01A02",
"01A03", "01A04", "01A05", "01A06", "01A07", "01A08", "01A09",
"01A10", "01A11", "01A12", "01A13", "01A14", "01A15", "01A16",
"01A17", "01A18", "01A19", "01A20", "08A01", "08A02", "08A03",
"08A04", "08A05", "08A06", "08A07", "08A08", "08A09", "08A10",
"08A11", "08A12", "08A13", "08A14", "08A15", "08A16", "08A17",
"08A18", "08A19", "08A20", "18A01", "18A02", "18A03", "18A04",
"18A05", "18A06", "18A07", "18A08", "18A09", "18A10", "18A11",
"18A12", "18A13", "18A14", "18A15", "18A16", "18A17", "18A18",
"18A19", "18A20", "01B01", "01B02", "01B03", "01B04", "01B05",
"01B06", "01B07", "01B08", "01B09", "01B10", "01B11", "01B12",
"01B13", "01B14", "01B15", "01B16", "01B17", "01B18", "01B19",
"01B20", "18B02", "18B03", "18B04", "18B05", "18B06", "18B07",
"18B08", "18B09", "18B10", "18B11", "18B12", "18B13", "18B14",
"18B15", "18B16", "18B17", "18B18", "18B19", "18B20", "01C01",
"01C02", "01C03", "01C04", "01C05", "01C06", "01C07", "01C08",
"01C09", "01C10", "01C11", "01C12", "01C13", "01C14", "01C15",
"01C16", "01C17", "01C18", "01C19", "01C20", "08C01", "08C02",
"08C03", "08C04", "08C05", "08C06", "08C07", "08C08", "08C09",
"08C10", "08C11", "08C12", "08C13", "08C14", "08C15", "08C16",
"08C17", "08C18", "08C19", "08C20", "18C01", "18C02", "18C03",
"18C04", "18C05", "18C06", "18C07", "18C08", "18C09", "18C10",
"18C11", "18C12", "18C13", "18C14", "18C15", "18C16", "18C17",
"18C18", "18C19", "18C20", "01D01", "01D02", "01D03", "01D04",
"01D05", "01D06", "01D07", "01D08", "01D09", "01D10", "01D11",
"01D12", "01D13", "01D14", "01D15", "01D16", "01D17", "01D18",
"01D19", "01D20", "08D01", "08D02", "08D03", "08D04", "08D05",
"08D06", "08D07", "08D08", "08D09", "08D10", "08D11", "08D12",
"08D13", "08D14", "08D15", "08D16", "08D17", "08D18", "08D19",
"08D20", "18D01", "18D02", "18D03", "18D04", "18D05", "18D06",
"18D07", "18D08", "18D09", "18D10", "18D11", "18D12", "18D13",
"18D14", "18D15", "18D16", "18D17", "18D18", "18D19", "18D20"
), class = "data.frame")
Years_Kikker Dataset
structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 8L, 8L, 8L, 8L, 8L,
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L,
9L, 9L, 9L, 9L, 9L, 9L, 9L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 6L, 6L, 6L, 6L, 6L,
6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 10L,
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L,
10L, 10L, 10L, 10L, 10L, 10L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 7L, 7L, 7L, 7L,
7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L,
11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L,
11L, 11L, 11L, 11L, 11L, 11L, 11L), .Label = c("2001_A", "2001_B",
"2001_C", "2001_D", "2008_A", "2008_C", "2008_D", "2018_A", "2018_B",
"2018_C", "2018_D"), class = "factor")

How to rename integers to factor based on key

I need to make new column of factors based on value of column Quadrat. There are 9 quadrats, and new column called Sponge would be something like:
"Old Growth" if Quadrat = 1,4,9
"Absent" if Quadrat= 3,6,7
"New Growth" if Quadrat = 2,5,8
I am sorry if answer is easy, I did check: How to convert integer to factor in R?
and also I am trying to use recode_factor. Here is my code:
library(dplyr)
key <- list(`1,4,9` = "Old Growth", `3,6,7` = "Absent", `2,5,8` = "New Growth")
df <- mutate(df, Sponge = recode_factor(Quadrat, key))
I get error:
Error in mutate_impl(.data, dots) :
Evaluation error: Vector 1 must be length 108 or one, not 3.
Real data has much more entries than the dataset I include here, if that matters. Thank you for any help.
df <- structure(list(Quadrat = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L,
4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L, 7L, 7L, 7L, 8L, 8L, 8L, 9L,
9L, 9L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L, 5L, 5L,
5L, 6L, 6L, 6L, 7L, 7L, 7L, 8L, 8L, 8L, 9L, 9L, 9L, 1L, 1L, 1L,
2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L, 7L,
7L, 7L, 8L, 8L, 8L, 9L, 9L, 9L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L,
3L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L, 7L, 7L, 7L, 8L, 8L, 8L,
9L, 9L, 9L), Month = structure(c(4L, 4L, 4L, 3L, 3L, 3L, 7L,
7L, 7L, 1L, 1L, 1L, 8L, 8L, 8L, 6L, 6L, 6L, 5L, 5L, 5L, 2L, 2L,
2L, 9L, 9L, 9L, 4L, 4L, 4L, 3L, 3L, 3L, 7L, 7L, 7L, 1L, 1L, 1L,
8L, 8L, 8L, 6L, 6L, 6L, 5L, 5L, 5L, 2L, 2L, 2L, 9L, 9L, 9L, 4L,
4L, 4L, 3L, 3L, 3L, 7L, 7L, 7L, 1L, 1L, 1L, 8L, 8L, 8L, 6L, 6L,
6L, 5L, 5L, 5L, 2L, 2L, 2L, 9L, 9L, 9L, 4L, 4L, 4L, 3L, 3L, 3L,
7L, 7L, 7L, 1L, 1L, 1L, 8L, 8L, 8L, 6L, 6L, 6L, 5L, 5L, 5L, 2L,
2L, 2L, 9L, 9L, 9L), .Label = c("Apr", "Aug", "Feb", "Jan", "Jul",
"Jun", "Mar", "May", "Sep"), class = "factor"), PopDens = c(65.6011820777785,
18.4913752602879, 12.151802276494, 68.0740840677172, 50.9832500135526,
36.8684287818614, 52.0825074084569, 26.8776902493555, 49.2173263626173,
25.5460870559327, 5.4171769618988, 34.4303709487431, 44.3439512783661,
2.25230997451581, 61.2502326716203, 25.9035727053415, 32.339118222706,
24.1017888628412, 12.340617884649, 53.3521768709179, 26.0048255382571,
52.8581868957262, 31.9503199581522, 18.1601244299673, 34.228305231547,
2.09199664392509, 22.6402857622597, 4.48008164577186, 48.2082461479586,
65.4937081446406, 5.43837511213496, 32.8203339113388, 4.44421968702227,
19.8568186087068, 24.2561273102183, 12.3652934685815, 39.0541164302267,
16.1970243314281, 12.9826903613284, 36.3537323835772, 48.7148000504822,
11.5067498446442, 68.7493303583469, 60.7505214684643, 49.3874175737146,
63.0705459746532, 23.721419940237, 53.4379795142449, 57.7867246468086,
38.4747762591578, 8.43540686019696, 20.5636212413665, 28.7687741059344,
53.2144687068649, 32.0859562589321, 10.5120962983929, 53.4312571119517,
13.6547974413261, 31.3038802060764, 14.5005466006696, 6.03453303268179,
62.6867637028918, 17.7734197168611, 11.0327071261127, 51.4377708046231,
26.8335341704078, 9.81126144807786, 43.993699422339, 20.5123583010864,
14.9305799969006, 23.8019575944636, 39.1543961388525, 30.4534046472982,
61.2751477411948, 48.0770866076928, 59.4514226955362, 42.9857548968866,
23.0139948409051, 1.76873184926808, 33.1222371393815, 10.8652087603696,
24.5235243474599, 62.4086231633555, 55.6522683221847, 68.8337469024118,
48.2195318546146, 6.75986870843917, 57.7931131315418, 18.2255988919642,
40.8185531077906, 38.066848333925, 31.8611310839187, 22.2724406518973,
51.7982920755167, 29.2363496678881, 35.541056742426, 66.5265460675582,
28.267403066624, 40.5209824540652, 31.8187582066748, 67.2972998009063,
53.6718824433628, 42.6495425191242, 31.6603209995665, 44.3039192620199,
21.6216275517363, 66.9763269643299, 36.3314134527463)), .Names = c("Quadrat",
"Month", "PopDens"), row.names = c(NA, -108L), class = "data.frame")

If we are using recode_factor, then create the list with individual components instead of pasteed one
key <- setNames(as.list(rep(c("Old Growth", "Absent", "New Growth"),
each = 3)), c(1, 4, 9, 3, 6, 7, 2, 5, 8))
df %>%
mutate(Sponge = recode_factor(Quadrat, !!! key)) %>%
head
# Quadrat Month PopDens Sponge
#1 1 Jan 65.60118 Old Growth
#2 1 Jan 18.49138 Old Growth
#3 1 Jan 12.15180 Old Growth
#4 2 Feb 68.07408 New Growth
#5 2 Feb 50.98325 New Growth
#6 2 Feb 36.86843 New Growth

Use mutate with the factor function
df %>% mutate(Quadrat2 =
factor(Quadrat, levels = 1:9,
labels =rep(c("Old Growth", "New Growth", "Absent"),3)
)
)

Facets: organising their order and organising the levels within facets

I would like to please organise the following plots so that facets are printed out from most to least busy (i.e. Hemiptera, Coleoptera, Hymenoptera, Siphonaptera, Lepidoptera, etc.)
I would also like to order the levels within each facet like in Coleoptera. I realise that the X-labels will change order too so I need each facet to print out its own X-label according the level order.
I have already read many threads and that's how I was able to organise Coleoptera. But now I want it to be more tidy.
This is the data (let me know if this format is ok, if not I can try another way):
structure(list(Order = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 2L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L,
6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 7L, 7L,
7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 9L, 9L,
9L, 10L, 10L, 10L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L), .Label = c("Coleoptera",
"Dermaptera", "Dictyoptera", "Diptera", "Hemiptera", "Hymenoptera",
"Lepidoptera", "Phthiraptera", "Psocoptera", "Siphonaptera",
"Thysanoptera"), class = "factor"), Nrange = structure(c(1L,
3L, 4L, 5L, 6L, 7L, 8L, 10L, 11L, 12L, 14L, 14L, 1L, 10L, 1L,
3L, 4L, 6L, 7L, 10L, 11L, 12L, 14L, NA, 1L, 4L, 5L, 6L, 7L, 8L,
9L, 10L, 11L, 12L, 14L, NA, 1L, 4L, 5L, 6L, 7L, 8L, 10L, 11L,
12L, 14L, 15L, NA, 1L, 2L, 4L, 5L, 6L, 7L, 8L, 10L, 11L, 12L,
13L, 14L, 4L, 10L, 11L, 12L, 14L, 1L, 4L, 10L, 11L, 12L, 13L,
14L, 1L, 5L, 10L, 1L, 4L, 6L, 7L, 10L, 11L, 12L, 14L), .Label = c("Africa",
"Africa, Asia", "Americas", "Asia", "Asia-Temp", "Asia-Trop",
"Australasia", "C&S America", "Cosmopolitan", "Cryptogenic",
"N America", "S America", "Trop", "Trop, SubTrop", "Unknown"), class = "factor"),
Records = c(16L, 1L, 9L, 7L, 11L, 17L, 1L, 15L, 8L, 8L, 5L,
1L, 2L, 1L, 5L, 1L, 1L, 1L, 1L, 9L, 9L, 2L, 1L, 4L, 11L,
10L, 30L, 15L, 9L, 2L, 2L, 2L, 34L, 11L, 21L, 1L, 21L, 16L,
8L, 1L, 14L, 3L, 5L, 25L, 4L, 2L, 1L, 1L, 8L, 1L, 10L, 1L,
2L, 1L, 1L, 8L, 5L, 2L, 1L, 2L, 2L, 9L, 1L, 2L, 1L, 3L, 1L,
12L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 4L, 1L, 1L, 1L, 1L, 3L,
3L, 2L)), .Names = c("Order", "Nrange", "Records"), row.names = c(NA,
-83L), class = c("grouped_df", "tbl_df", "tbl", "data.frame"), vars = "Order", drop = TRUE)
This is the reordering that I guess is affecting only Coleoptera.
xy<-x%>%
mutate(Nrange=reorder(Nrange,-Records,sum))
This is the plot:
to_plot<-xy %>%
filter(!is.na(Nrange))
ggplot(to_plot,aes(x=Nrange,y=Records,fill=Nrange))+
geom_col()+
theme(axis.text.x = element_text(angle=90, vjust=0.7), legend.position = "none") +
facet_wrap(~Order,ncol=3)+
labs(title="Insects recorded as alien-invasive to mainland Spain",
subtitle="Native ranges vs number of records",
caption="Data source: DAISIE (http://www.europe-aliens.org/)")
And this is the plot:
enter image description here

Assuming you're using the tidyverse (based on your code):
library(tidyverse)
xy <- x %>%
ungroup() %>%
mutate(
Order = fct_reorder(Order, Records, sum, .desc = TRUE)
)
xy %>%
filter(!is.na(Nrange)) %>%
ggplot() +
aes(x = Nrange, y = Records, fill = Nrange) +
geom_col() +
facet_wrap(~Order, ncol = 3)
fct_reorder comes from the forcats package, which I believe is now a part of the tidyverse.
Or, using base R, something like this:
xy <- x
record_sums <- tapply(xy$Records, xy$Order, sum)
levels(xy$Order) <- levels(xy$Order)[order(record_sums, decreasing = TRUE)]

For loop assigns variable while i want to plot something with qplot (R)

So i have the following for loop:
for (Count in 1:19){
png(paste0(colnames(fdd$rawCounts)[Count], ".pdf"))
qplot(y = log2(fdd$rawCounts[,Count]), main = colnames(fdd$rawCounts)[Count])
dev.off()
}
Which should simply plot some count data which i put a head from here:
structure(c(11L, 3L, 12L, 8L, 15L, 2L, 5L, 2L, 8L, 7L, 6L, 10L,
6L, 1L, 7L, 4L, 2L, 1L, 3L, 0L, 4L, 4L, 2L, 5L, 8L, 0L, 13L,
4L, 10L, 7L, 2L, 1L, 2L, 4L, 7L, 7L, 14L, 4L, 25L, 17L, 14L,
16L, 4L, 2L, 5L, 5L, 5L, 2L, 9L, 5L, 11L, 8L, 1L, 4L, 10L, 8L,
8L, 7L, 9L, 5L, 9L, 15L, 14L, 11L, 16L, 8L, 11L, 4L, 3L, 6L,
3L, 0L, 6L, 3L, 4L, 6L, 1L, 4L, 11L, 11L, 12L, 6L, 2L, 6L, 7L,
9L, 22L, 8L, 13L, 7L, 6L, 1L, 4L, 5L, 6L, 2L, 4L, 2L, 6L, 7L,
3L, 2L, 6L, 3L, 3L, 2L, 5L, 5L, 9L, 2L, 6L, 5L, 4L, 2L), .Dim = c(6L,
19L), .Dimnames = structure(list(feature = c("chr10:100000001-100000500",
"chr10:10000001-10000500", "chr10:1000001-1000500", "chr10:100000501-100001000",
"chr10:100001-100500", "chr10:100001001-100001500"), sample = c("K562_FAIRE_Acla_4hr_1",
"K562_FAIRE_Acla_4hr_2", "K562_FAIRE_Daun_4hr_1", "K562_FAIRE_Daun_4hr_2",
"K562_FAIRE_Etop_4hr_1", "K562_FAIRE_Etop_4hr_2", "K562_FAIRE_untreated",
"FAIRE.seq_K562_2MethylDoxo_A", "FAIRE.seq_K562_2MethylDoxo_B",
"FAIRE.seq_K562_Ctr_A", "FAIRE.seq_K562_Ctr_B", "FAIRE.seq_K562_Doxo_10uM_4hrs_A",
"FAIRE.seq_K562_Doxo_10uM_4hrs_B", "FAIRE.seq_K562_Epirubicin_A",
"FAIRE.seq_K562_Epirubicin_B", "FAIRE.seq_K562_MTX_40uM_4hrs_A",
"FAIRE.seq_K562_MTX_40uM_4hrs_B", "FAIRE.seq_K562_MTX_5uM_4hrs_A",
"FAIRE.seq_K562_MTX_5uM_4hrs_B")), .Names = c("feature", "sample"
)))
Now if i try to plot the data it gives me a variable called Count and the value is 19L. While i expect 19 plots to be drawn. Why is this happening?
Thanks!

This works for me:
for (Count in 1:19){
pdf(paste0(colnames(df)[Count], ".pdf"))
print(qplot(y = log2(df[, Count]), main = colnames(df)[Count]))
dev.off()
}
A couple of changes:
I read in your data as df. It turns out to be a matrix, so I adjusted the subsetting accordingly.
I also wrapped the qplot function in a print function which forces the figure to be created.
Finally, I switched the png function to pdf as that seemed like the files you were trying to create based on the paste0 result.

Develop Reference

r css asp.net wordpress firebase qt symfony nginx http apache-flex

R best vizualization - r

Related

How to filter out duplicates in 2-column data frame but in reverse orientation?

Error in pts when when calling function ordiellipse in VEGAN package: subscript out of bounds

How to rename integers to factor based on key

Facets: organising their order and organising the levels within facets

For loop assigns variable while i want to plot something with qplot (R)

Categories

Resources