Reverse leaf order in dendrogram using R

I have tried for several days to flip a dendrogram so that the last gene appears first in the figure and the first gene appears last. Even when I manage to move leaves around, the internal ordering is not preserved. Here is my script:
cluster.hosts <- read.table("Norm_0_to1_heatmap.txt", header = TRUE, sep="", quote="/", row.names = 1)
# A table with 8 columns and 229 rows corresponding to gene expression
hosts.dist <- dist(cluster.hosts, method = "euclidean", diag = FALSE, upper = FALSE, p = 2)
hc <- hclust(hosts.dist, method = "average")
dd <- as.dendrogram(hc)
order.dendrogram(dd)
X11()
par(cex=0.5,font=3)
plot(dd, main="Dendrogram of Syn9 genes")
order.dd <- order.dendrogram(dd) #the numbers in the order indicate the position of the gene in the original table
# Then I generate a vector with the reverse of the order obtained
y <- c(206, 204, 210, 209, 213, 212, 211, 207, 208, 94, 199, 192, 195, 198, 193, 201, 203, 200, 185, 61, 191, 190, 197, 189, 188, 196, 187, 215, 214, 202, 217, 220, 219, 218, 95, 180, 179, 181, 182, 186, 178, 132, 133, 122, 66, 65, 64, 58, 91, 88, 92, 89, 62, 184, 103, 128, 127, 229, 231, 230, 148, 63, 228, 116, 134, 104, 221, 78, 20, 232, 160, 159, 225, 112, 167, 164, 166, 140, 222, 51, 149, 227, 79, 68, 90, 131, 130, 136, 135, 105, 147, 172, 150, 176, 175, 174, 177, 152, 151, 165, 137, 168, 163, 52, 146, 141, 145, 82, 81, 56, 161, 120, 144, 129, 84, 1, 173, 143, 142, 86, 85, 83, 194, 183, 111, 55, 53, 54, 224, 171, 170, 223, 169, 93, 59, 60, 123, 121, 124, 87, 125, 226, 3, 158, 47, 10, 162, 138, 139, 154, 153, 119, 118, 117, 106, 80, 45, 70, 69, 126, 205, 77, 67, 19, 102, 46, 13, 108, 107, 109, 72, 71, 73, 23, 22, 25, 57, 48, 216, 155, 29, 24, 101, 35, 113, 115, 36, 37, 114, 110, 2, 14, 6, 16, 15, 17, 18, 74, 31, 30, 76, 12, 75, 8, 11, 5, 7, 99, 98, 100, 39, 38, 33, 32, 97, 96, 49, 44, 34, 50, 156, 26, 157, 42, 41, 43, 4, 28, 27, 9, 40, 21)
rx <- reorder(dd, y, agglo.FUN=mean)
order.rx <- order.dendrogram(rx)
write(order.rx, file="order_hosts_rx.txt", sep="\t")
write(labels(rx), file="labels_order_hosts_rx.txt", sep="\t")
X11()
par(cex=0.5)
plot(rx, main="Dendrogram of Syn9 genes")
I guess it has something to do with the heights of the leaves but I just want to flip the dendrogram...
Thanks in advance!
Miguel

You can use rev(dd); rev.dendrogram simply returns the dendrogram with reversed nodes:
hc <- hclust(dist(USArrests), "ave")
dd <- as.dendrogram(hc)
plot(dd)
plot(rev(dd))
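A minimal check, using the same USArrests example, that rev() gives an exact mirror image: the leaf order and labels come out reversed while the merge heights stay the same. The variable names (flipped, same_order and so on) are only illustrative. As far as I understand it, reorder() instead rearranges branches by an aggregate of the supplied weights (mean in the question), which is probably why it did not produce a plain flip.

```r
hc <- hclust(dist(USArrests), "ave")
dd <- as.dendrogram(hc)
flipped <- rev(dd)

# Leaf order of the flipped tree is the original order read backwards
same_order <- all(order.dendrogram(flipped) == rev(order.dendrogram(dd)))

# The labels follow the leaves, so they are reversed as well
same_labels <- all(labels(flipped) == rev(labels(dd)))

# The height of the root (and of every merge) is untouched by the flip
same_height <- isTRUE(all.equal(attr(flipped, "height"), attr(dd, "height")))

print(c(same_order, same_labels, same_height))
```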

Related

Highlighting the part of a line graph that has the highest slope in R

My dataset contains 2 variables Y and X. Y was measured every 1.0 seconds.
My Data:
dput(Dataexample)
structure(list(X = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,
14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29,
30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45,
46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61,
62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77,
78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93,
94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107,
108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120,
121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133,
134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146,
147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159,
160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172,
173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185,
186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198,
199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211,
212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224,
225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237,
238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250,
251, 252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263,
264, 265, 266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276,
277, 278, 279, 280, 281, 282, 283, 284, 285, 286, 287, 288, 289,
290, 291, 292, 293, 294, 295, 296, 297, 298, 299, 300, 301, 302,
303, 304, 305, 306), Y = c(71756.2344, 71745.85, 70882.42, 71025.61,
70539.02, 70602.3047, 70811.87, 70514.125, 69998.63, 70531.76,
70424.9141, 70663.51, 70075.375, 69731.0859, 70029.74, 70519.31,
69858.63, 69987.23, 70080.56, 69970.63, 69829.6, 69872.12, 69775.68,
69679.24, 69814.05, 69639.84, 69645.02, 69344.35, 69430.41, 70078.49,
69239.65, 69734.1953, 69736.27, 69549.63, 69506.0859, 69108,
69669.91, 69516.45, 69490.54, 69609.77, 69314.29, 69454.25, 69590.07,
69721.76, 69525.79, 69736.27, 69303.92, 69171.23, 69294.59, 69430.41,
69457.36, 69462.54, 69144.27, 69590.07, 69446.99, 70083.67, 69358.87,
69800.56, 69680.28, 69332.95, 69723.83, 69942.63, 69772.56, 69969.59,
69808.86, 70043.23, 70208.13, 70077.45, 69856.56, 70423.875,
69490.54, 69984.12, 70175.98, 70192.58, 70279.7, 70480.93, 70594,
70792.16, 70234.06, 70165.61, 70249.62, 70564.95, 70403.13, 70444.625,
70426.99, 69907.375, 70327.4141, 70686.3359, 70473.67, 71031.83,
70864.78, 70710.1953, 70691.52, 70703.97, 70826.39, 70708.12,
70595.04, 70946.75, 71319.27, 70977.875, 70475.74, 70612.68,
70680.11, 70527.61, 70461.22, 70877.2344, 70631.35, 70723.68,
70677, 70433.21, 70306.6641, 71246.63, 70375.125, 70416.62, 70150.0547,
70733.0156, 70583.63, 70866.86, 70580.5156, 70433.21, 70377.2,
70114.79, 70347.12, 70613.71, 70576.37, 70599.19, 70407.28, 70581.5547,
70650.02, 71122.11, 70909.4, 70694.63, 71076.45, 70650.02, 71133.52,
70810.83, 71240.41, 70630.31, 71144.94, 71493.63, 71117.95, 71374.28,
71143.9, 70805.64, 71349.375, 71208.2344, 71322.39, 71727.1641,
71060.88, 71546.56, 71569.4, 70984.1, 72032.37, 71573.55, 71787.375,
71469.76, 71398.15, 71683.57, 71709.52, 71637.9, 71556.9453,
71870.4141, 71612.99, 71953.47, 71515.43, 71315.125, 72007.4453,
72021.9844, 71549.68, 72001.22, 71359.75, 71775.95, 72327.23,
71949.31, 71844.47, 71857.96, 72128.9141, 72147.6, 71501.94,
72268.05, 72104, 72217.1641, 72253.51, 72198.48, 72908.78, 72084.27,
72653.29, 72431.06, 72858.92, 72512.0547, 72632.5156, 72700.02,
72335.53, 72713.52, 73065.62, 72818.42, 73004.3359, 72458.06,
73436.48, 73231.82, 73002.26, 73313.89, 73213.125, 72980.4453,
72948.25, 73106.13, 72931.625, 73409.47, 73057.31, 73141.4453,
73218.32, 73216.24, 73273.375, 73701.42, 73486.35, 72574.37,
73229.74, 73576.74, 73195.46, 73697.2656, 73115.48, 73065.62,
73062.5, 73111.32, 73988.23, 73619.3359, 73874.95, 73683.76,
73674.41, 73550.7656, 74166.9844, 73875.99, 74013.17, 74092.16,
73872.875, 74015.25, 73984.07, 73911.33, 73606.87, 74082.8, 73866.64,
74550.53, 74271.95, 73980.95, 74502.71, 74901.92, 74753.25, 74310.4141,
75178.51, 74748.05, 74756.37, 75194.1, 74797.95, 75531.0547,
75549.77, 75293.94, 75378.17, 75457.21, 75676.67, 76087.56, 76141.6641,
76008.5, 76241.55, 76585.96, 76091.73, 76880.4844, 76898.18,
77005.38, 77080.32, 77548.78, 77337.4453, 77000.18, 77448.8359,
76997.0547, 77314.54, 77919.47, 77185.46, 78127.75, 77464.45,
78349.59, 77824.71, 77465.49, 77818.46, 78140.25, 78547.51, 77850.74,
78236.06, 78341.2656, 78104.8359, 78464.17, 77888.23, 78392.3,
78686.0547, 78149.625, 78623.5547, 78672.5156, 78810.03, 78498.55,
78652.72, 78717.31, 78831.91, 78882.96, 78715.23, 78499.5859,
78892.3359, 78372.51)), row.names = c(NA, -306L), class = c("tbl_df",
"tbl", "data.frame"))
I have used ggplot to plot the data and used a loop to calculate the average slope within a moving 60-second-window for the entire duration of the dataset to find the 60 consecutive seconds where the slope is greatest.
Code:
library(readr)
library(ggplot2)
Dataexample<- read_csv("HF-6.csv", skip = 3)
Dataexample<- head(Dataexample, -1)
Dataexample$X <- as.numeric(Dataexample$X)
df <- data.frame(Dataexample)
ggplot(data=df, aes(x=X, y=Y, group=1)) +
  geom_line()
slopes <- rep(NA, nrow(Dataexample)-59)
for (i in 1:length(slopes)) {
  slopes[i] <- lm(Y ~ X, data=Dataexample[i:(i+59), ])$coefficients[2]
}
print(slopes)
which.max(slopes)
max(slopes)
My question is: how can I take the result of the loop, which identifies the 60 consecutive seconds where the slope is highest, and change the color of the line in the plot during those 60 seconds to highlight where the slope is greatest?

This should work:
# slopes is the vector computed in the question's loop
maxslope_ind <- which.max(slopes)
Dataexample$highlight <- ifelse(Dataexample$X %in% maxslope_ind:(maxslope_ind+59), 1, 0)
library(ggplot2)
ggplot(data=Dataexample, aes(x=X, y=Y, group=1)) +
  geom_line(aes(colour=as.factor(highlight)), show.legend=FALSE) +
  scale_colour_manual(values=c("black", "red"))
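The window search and the highlight flag can also be sketched in base R on synthetic data. The data frame below is a made-up stand-in for Dataexample, and the flag is built from row positions rather than from the values of X, which is an assumption that makes it work even when X does not start at 1. The resulting highlight column can be mapped to colour as in the ggplot call above.

```r
# Synthetic stand-in for the question's data: flat, then a steep rise after X = 200
set.seed(1)
df <- data.frame(X = 1:300)
df$Y <- ifelse(df$X > 200, 5 * (df$X - 200), 0) + rnorm(300)

window <- 60
n_win <- nrow(df) - window + 1

# Least-squares slope of Y on X in every run of `window` consecutive rows
slopes <- vapply(seq_len(n_win), function(i) {
  rows <- i:(i + window - 1)
  coef(lm(Y ~ X, data = df[rows, ]))[["X"]]
}, numeric(1))

# First row of the steepest window, and a logical flag for the rows inside it
start <- which.max(slopes)
df$highlight <- seq_len(nrow(df)) %in% start:(start + window - 1)
```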

How to identify parameters for SARIMA model in R

# Part 2: Boston
plot(boston, ylab=" Boston crime data", xlab= "Time")
#Time series seem to have homogeneous variance upon visual inspection
#Q2
#Trend looks linear in the plot, so for trend differencing operator take d=1
newboston= as.numeric(unlist(boston))
xdiff = diff(newboston)
plot(xdiff)
#Q3
#ADF
library(tseries)
adf.test(xdiff)
#From the result, alternative hypothesis is stationary so null hypothesis is rejected
#KPSS test
install.packages('fpp3', dependencies = TRUE)
library(fpp3)
unitroot_kpss(xdiff)
#the p-value is >0.05, so fail to reject null hypothesis for KPSS
#Q4
library(astsa)
acf2(xdiff, max.lag = 50)
model1 = sarima(xdiff, p, 1, q)
This is what I have tried so far. I am quite new to R, so please be kind if my work makes little sense. For context, boston is data I imported from an Excel file; it is simply a column of x-axis data.
First, I am trying to do Q4, but I am not sure how to find p and q.
Second, I am unsure whether what I did in Q2 to detrend my data is correct in the first place.
Here is the output of dput(boston):
dput(boston)
structure(list(x = c(41, 39, 50, 40, 43, 38, 44, 35, 39, 35,
29, 49, 50, 59, 63, 32, 39, 47, 53, 60, 57, 52, 70, 90, 74, 62,
55, 84, 94, 70, 108, 139, 120, 97, 126, 149, 158, 124, 140, 109,
114, 77, 120, 133, 110, 92, 97, 78, 99, 107, 112, 90, 98, 125,
155, 190, 236, 189, 174, 178, 136, 161, 171, 149, 184, 155, 276,
224, 213, 279, 268, 287, 238, 213, 257, 293, 212, 246, 353, 339,
308, 247, 257, 322, 298, 273, 312, 249, 286, 279, 309, 401, 309,
328, 353, 354, 327, 324, 285, 243, 241, 287, 355, 460, 364, 487,
452, 391, 500, 451, 375, 372, 302, 316, 398, 394, 431, 431),
y = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30,
31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45,
46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60,
61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75,
76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90,
91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104,
105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116,
117, 118)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,
-118L))
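A hedged sketch for Q4: a common heuristic is to read candidate orders from the ACF/PACF plot and then compare a small grid of fits by AIC. The example uses base R's arima() on a simulated series instead of astsa::sarima() and the Boston data; both substitutions, and the 0:2 search range, are assumptions for illustration. The differencing is requested through d = 1, so the undifferenced series is passed in. Supplying an already differenced series together with d = 1 would difference it twice.

```r
# Simulated stand-in series with one unit root (not the Boston data)
set.seed(42)
x <- cumsum(arima.sim(model = list(ar = 0.5), n = 200))

# Candidate (p, q) pairs; d is fixed at 1 and applied by arima() itself
grid <- expand.grid(p = 0:2, q = 0:2)

# AIC of each candidate; a fit that fails to converge is recorded as NA
grid$aic <- vapply(seq_len(nrow(grid)), function(i) {
  fit <- tryCatch(
    arima(x, order = c(grid$p[i], 1, grid$q[i])),
    error = function(e) NULL
  )
  if (is.null(fit)) NA_real_ else AIC(fit)
}, numeric(1))

# Row with the smallest AIC among the candidates that could be fitted
best <- grid[which.min(grid$aic), ]
print(best)
```

The lowest AIC is only a starting point; residual diagnostics should still be checked before settling on p and q.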

Return column value based on a vectors of row.names in a data.frame [duplicate]

I have a sample data.frame below (subset of a very large cyclic database)
> dput(try)
structure(list(Actuator.Force = c(-402.57388, -400.83463, -402.72595,
-404.24283, -404.07663, -403.83575, -407.55435, -418.7684, -435.86246,
-462.38239, -504.09146, -558.40039, -618.46674, -681.58704, -748.87347,
-814.95032, -880.57739, -946.11627, -1012.9043, -1075.2557, -1141.4972,
-1209.1968, -1272.8707, -1336.021, -1400.5078, -1465.5786, -1528.6499,
-1589.5626, -1654.6541, -1717.825, -1780.0903, -1839.9329, -1902.9841,
-1964.1945, -2025.569, -2085.9578, -2148.239, -2207.5295, -2267.5806,
-2328.6467, -2388.4958, -2447.5298, -2506.7534, -2567.687, -2625.7661,
-2682.866, -2741.3511, -2802.1934, -2858.2546, -2915.1028, -2972.7683,
-3030.8093, -3089.2439, -3145.5701, -3199.8442, -3259.2087, -3315.8582,
-3371.958, -3426.5596, -3484.3855, -3541.2642, -3595.3362, -3650.0208,
-3708.3748, -3763.8076, -3820.0623, -3875.3044, -3932.9504, -3989.6238,
-4047.5957, -4104.8169, -4164.8237, -4223.5444, -4283.3813, -4341.3989,
-4403.166, -4462.1479, -4522.5728, -4584.0186, -4644.7656, -4704.3525,
-4762.6826, -4821.8706, -4878.8818, -4924.1021, -4959.0415, -4985.9517,
-5005.4531, -5017.8027, -5026.0757, -5032.3428, -5036.8042, -5038.9292,
-5039.5361, -5043.021, -5043.0981, -5043.0415, -5042.627, -5014.4199,
-4853.5854, -4566.9771, -4198.7612, -3774.5527, -3317.6958, -2847.5229,
-2364.7585, -1880.9485, -1405.4272, -930.289, -467.04822, -18.867363,
421.17499, 838.86719, 1239.9121, 1626.0669, 1990.6389, 2334.0852,
2655.344, 2962.0227, 3243.7817, 3506.2249, 3744.2622, 3959.8271,
4156.7061, 4324.9048, 4469.229, 4591.6689, 4687.4194, 4764.0801,
4814.6167, 4840.313, 4846.0181, 4826.3135, 4777.6553, 4696.0791,
4583.854, 4442.457, 4272.5254, 4076.7224, 3851.1211, 3603.1853,
3330.7456, 3038.3157, 2724.115, 2386.5476, 2032.5809, 1660.0547,
1268.0084, 859.16675, 432.4075, -14.131592, -479.29309, -955.67108,
-1444.614, -1937.2562, -2437.0085, -2941.8914, -3450.9009, -3959.9597,
-4468.9795, -4981.2549, -5492.6997, -6002.334, -6510.5425, -7016.2432,
-7517.8286, -8013.1348, -8500.4199, -8974.8867, -9439.5479, -9890.5938,
-10326.367, -10744.421, -11147.754, -11534.83, -11902.651, -12248.997,
-12577.919, -12885.458, -13172.309, -13441.554, -13691.502, -13922.634,
-14127.116, -14305.272, -14458.267, -14582.934, -14685.274, -14758.539,
-14806.058, -14830.719, -14836.625, -14822.204, -14773.916, -14700.484,
-14597.968, -14469.834, -14312.099, -14126.422, -13915.136, -13676.505,
-13412.388, -13120.703, -12807.961, -12473.883, -12115.751, -11740.082,
-11342.633, -10929.945, -10502.158, -10062.869, -9611.8271, -9146.6006,
-8673.3545, -8191.7417, -7700.769, -7200.9346, -6695.8809, -6185.2378,
-5670.8711, -5154.9995, -4643.4414, -4135.0015, -3629.2859, -3125.657,
-2626.541, -2134.0662, -1646.4242, -1168.816, -699.63068, -245.34488,
192.7984, 618.76703, 1033.223, 1428.922, 1807.2645, 2165.6274,
2507.6655, 2826.2754, 3120.4724, 3395.2593, 3647.6946, 3879.4983,
4086.3855, 4265.1323, 4421.6831, 4554.3594, 4657.8184, 4736.9561,
4792.6724, 4822.3784, 4830.3091, 4815.9038, 4773.9692, 4706.4736,
4614.8379, 4491.3198, 4337.8892, 4158.002, 3949.3147, 3713.4622,
3453.9114, 3167.8179, 2861.2598, 2536.3259, 2187.3623, 1822.752,
1437.5449, 1034.8208, 617.23962, 183.35637, -270.79733, -738.95618,
-1220.1345, -1710.7787, -2206.1941, -2706.4871, -3210.8625, -3721.0002,
-4233.6387, -4747.7271, -5258.7578, -5771.3071, -6280.7759, -6791.0166,
-7295.0229, -7794.4199, -8287.4189, -8771.6377, -9243.3457, -9702.2559,
-10146.865, -10577.053, -10989.863, -11385.981, -11760.477, -12116.938,
-12456.351, -12772.688, -13071.995), No.Rows = c(1, 2, 3, 4,
5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21,
22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37,
38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53,
54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69,
70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85,
86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100,
101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113,
114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126,
127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139,
140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152,
153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165,
166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178,
179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191,
192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204,
205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217,
218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230,
231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243,
244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255, 256,
257, 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269,
270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 282,
283, 284, 285, 286, 287, 288, 289, 290, 291, 292, 293, 294, 295,
296, 297, 298, 299, 300)), row.names = c(NA, 300L), class = "data.frame")
I find the peaks and valleys of the data using:
library(quantmod)
max <- findPeaks(try$Actuator.Force)
min <- findValleys(try$Actuator.Force)
The result is the row numbers of the try data.frame corresponding to the peaks and valleys. What I want is a vector of the Actuator.Force values at the row numbers that findPeaks and findValleys return.

If min and max hold row indices of the try data frame, you can use them to subset try:
> try[min, ]
    Actuator.Force No.Rows
5        -404.0766       5
97      -5043.0415      97
193    -14822.2040     193
> try[max, ]
    Actuator.Force No.Rows
3        -402.7260       3
7        -407.5543       7
133      4826.3135     133
253      4815.9038     253
If you only want the Actuator.Force values at those row indices:
> try[min, "Actuator.Force"]
[1] -404.0766 -5043.0415 -14822.2040
> try[max, "Actuator.Force"]
[1] -402.7260 -407.5543 4826.3135 4815.9038
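The same indexing can be sketched without quantmod on a toy data frame (the values below are made up). Turning points are located from sign changes of the first difference, and the + 1 maps each position back to the row of the extremum itself. Judging from the output above, where row 3 is reported although row 2 holds the local maximum -400.83463, the quantmod indices appear to point one row past the turning point; that is an inference from the printed values, so check it against your own data before relying on it.

```r
# Toy stand-in for the question's data frame (hypothetical values)
dat <- data.frame(Actuator.Force = c(1, 3, 2, 0, -1, 4, 6, 5),
                  No.Rows = 1:8)
x <- dat$Actuator.Force

# Negative change in the sign of the slope = peak, positive change = valley
turn <- diff(sign(diff(x)))
peak_rows <- which(turn < 0) + 1
valley_rows <- which(turn > 0) + 1

# Index the column by the row vectors to get plain numeric vectors
peak_values <- dat[peak_rows, "Actuator.Force"]
valley_values <- dat$Actuator.Force[valley_rows]
```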

In R, how can I properly subset a data frame based on a list of values inside of a function?

I have a function that is attempting to select rows from a dataframe based on a list of values.
For instance, some values might be:
> subset_ids
[1] "JUL_0003_rep1" "JUL_0003_rep2"
[3] "JUL_0003_rep3" "JUL_0007_rep1"
[5] "JUL_0007_rep2" "JUL_0007_rep3"
I have a data frame called "targets" with a column called "LongName". It has many other columns, but they do not matter here. I want to select the rows from targets where LongName is in subset_ids.
I can do this fine with either:
targets[is.element(targets$LongName, subset_ids),]
or
targets[targets$LongName %in% subset_ids,]
The problem is that I want to do this in a function, and I don't know what the column will be called in advance.
So I tried using the eval/parse method, which upon recent reading may not be the best way to do it. When I do the following:
sub1 <- paste("targets[is.element(targets$", column_name, ", subset_ids),]", sep="")
targets_subset <- as.character(eval(parse(text = sub1)))
It returns some strange concatenation of row numbers. It looks like this:
[1] "c(5, 6, 7, 17, 18, 19, 26, 27, 28, 35, 36, 46, 47, 48, 54, 55, 61, 62, 63, 64, 73, 74, 75, 76, 77, 78, 91, 92, 93, 102, 103, 104, 114, 117, 118, 129, 136, 137, 140, 141, 151, 152, 153, 157, 158, 159, 169, 172, 173, 183, 187, 188, 199, 200, 201, 208, 209, 210, 232, 233, 241, 242, 243, 252, 253, 254, 264, 265, 270, 271, 285, 286, 296, 297, 298)"
[2] "c(5, 6, 7, 17, 18, 19, 26, 27, 28, 35, 36, 46, 47, 48, 54, 55, 61, 62, 63, 64, 73, 74, 75, 76, 77, 78, 91, 92, 93, 102, 103, 104, 114, 117, 118, 129, 136, 137, 140, 141, 151, 152, 153, 157, 158, 159, 169, 172, 173, 183, 187, 188, 199, 200, 201, 208, 209, 210, 232, 233, 241, 242, 243, 252, 253, 254, 264, 265, 270, 271, 285, 286, 296, 297, 298)"
[3] "c(3, 3, 3, 7, 7, 7, 11, 11, 11, 15, 15, 19, 19, 19, 22, 22, 26, 26, 27, 27, 31, 31, 31, 32, 32, 32, 39, 39, 39, 43, 43, 43, 47, 49, 49, 53, 57, 57, 59, 59, 63, 63, 63, 65, 65, 65, 70, 72, 72, 76, 78, 78, 83, 83, 83, 86, 86, 86, 97, 97, 100, 100, 100, 104, 104, 104, 108, 108, 111, 111, 117, 117, 121, 121, 121)"
So 5, 6, 7, 17 ... appear to be the right rows for the target I'm trying to pick, but I don't understand why it sent this back in the first place, or what item [3] is at all.
If I manually execute the line generated by the above "sub1 <- ...", then it returns the proper data. If I ask the function to do it, it returns this garbage.
My question is two-fold. 1: Why is the data being returned this way? 2: Is there a better way than eval/parse to do what I'm trying to do?
I suspect some strange scope or environment level issue, but it is unclear to me at this point. I appreciate any advice anyone has.

The data are returned that way because you are coercing the dataframe to a character object. Try
as.character(head(targets))
to see a short example.
So, your method works if you eliminate the as.character(). Here it is as a MWE:
targets <- data.frame(LongName = sample(letters, 1000, replace = TRUE),
                      SeqNum = 1:1000,
                      X = rnorm(1000))
subset_ids <- c("a","f")
targets[is.element(targets$LongName, subset_ids),]
targets[targets$LongName %in% subset_ids,]
testfun <- function(targets, column_name, subset_ids){
  sub1 <- paste("targets[is.element(targets$", column_name, ", subset_ids),]", sep="")
  targets_subset <- eval(parse(text = sub1))
  return(targets_subset)
}
testfun(targets, column_name = "LongName", subset_ids)
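On the second part of the question, eval/parse can be avoided altogether: [[ ]] accepts the column name as a string, so the column does not need to be known in advance. A minimal sketch, where subset_by_column and the toy data are hypothetical names and values chosen for illustration:

```r
subset_by_column <- function(data, column_name, ids) {
  # Fail early if the requested column does not exist
  stopifnot(is.character(column_name), length(column_name) == 1,
            column_name %in% names(data))
  # [[ ]] looks the column up by the string stored in column_name
  data[data[[column_name]] %in% ids, , drop = FALSE]
}

targets <- data.frame(
  LongName = c("JUL_0003_rep1", "JUL_0005_rep1", "JUL_0007_rep2"),
  SeqNum = 1:3
)
subset_ids <- c("JUL_0003_rep1", "JUL_0007_rep2")

result <- subset_by_column(targets, "LongName", subset_ids)
print(result)
```

drop = FALSE keeps the result a data frame even when only one column remains.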

ggvis barchart using dates as x axis

I have been switching over from ggplot to ggvis when working with shiny apps. I have figured out a lot, but I am really stumped when it comes to bar graphs. I have a time series with dates and values, and I simply want bars instead of points for each value. Ideally I would like to plot multiple semi-transparent bars (if anyone has had success with that, please share), but here I just want to get one set of bars working.
Works with layer_points():
df %>% ggvis(~date, ~x) %>% layer_points() %>% scale_datetime("x")
Doesn't work with layer_bars():
df %>% ggvis(~date, ~x) %>% layer_bars() %>% scale_datetime("x")
Data I am using:
structure(list(date = structure(c(7680, 7687, 7694, 7701, 7708,
7715, 7722, 7729, 7736, 7743, 7750, 7757, 7764, 7771, 7778, 7785,
7792, 7799, 7806, 7813, 7820, 7827, 7834, 7841, 7848, 7855, 7862,
7869, 7876, 7883, 7890, 7897, 7904, 7911, 7918, 7925, 7932, 7939,
7946, 7953, 7960, 7967, 7974, 7981, 7988, 7995, 8002, 8009, 8016,
8023, 8030, 8037, 8044, 8051, 8058, 8065, 8072, 8079, 8086, 8093,
8100, 8107, 8114, 8121, 8128, 8135, 8142, 8149, 8156, 8163, 8170,
8177, 8184, 8191, 8198, 8205, 8212, 8219, 8226, 8233, 8240, 8247,
8254, 8261, 8268, 8275, 8282, 8289, 8296, 8303, 8310, 8317, 8324,
8331, 8338, 8345, 8352, 8359, 8366, 8373, 8380, 8387, 8394, 8401,
8408, 8415, 8422, 8429, 8436, 8443, 8450, 8457, 8464, 8471, 8478,
8485, 8492, 8499, 8506, 8513, 8520, 8527, 8534, 8541, 8548, 8555,
8562, 8569, 8576, 8583, 8590, 8597, 8604, 8611, 8618, 8625, 8632,
8639, 8646, 8653, 8660, 8667, 8674, 8681, 8688, 8695, 8702, 8709,
8716, 8723, 8730, 8737, 8744, 8751, 8758, 8765, 8772, 8779, 8786,
8793, 8800, 8807, 8814, 8821, 8828, 8835, 8842, 8849, 8856, 8863,
8870, 8877, 8884, 8891, 8898, 8905, 8912, 8919, 8926, 8933, 8940,
8947, 8954, 8961, 8968, 8975, 8982, 8989, 8996, 9003, 9010, 9017,
9024, 9031, 9038, 9045, 9052, 9059, 9066, 9073), class = "Date"),
x = c(-0.034038302, 0.122310949, -0.002797319, 0.026515253,
0.039961798, 0.034473263, 0.00549937, -0.024125944, 0.000132490000000001,
0.011038357, -0.02135072, 0.030663311, -0.008915551, 0.004855042,
0.01563688, -0.007397493, 0.013569146, -0.004968811, -0.00250391,
0.014624532, 0.036937453, -0.023685917, 0.018921356, -0.003066779,
-0.009217771, 0.005317513, 0.010378968, 0.001580798, -0.015085972,
-0.000121644000000001, 0.020468644, 0.007925229, 0.007721276,
-0.003123545, -0.018317891, -0.014900591, 0.003260844, -0.001565358,
-0.014833886, 0.00366766, 0.014297139, -0.00725552, 0.012207931,
0.024035152, -0.024195095, -0.0043564, 0.000847468, 0.033031596,
0.023685033, 0.025143071, 0.046264348, 0.038285177, -0.009180356,
-0.01630399, -0.010131294, -0.009939386, -0.007620427, 0.013062259,
0.009912238, 0.000192973, -0.01683559, -0.002627549, 0.019836063,
-0.019946159, -0.020124331, 0.012921737, 0.034604405, -0.020774015,
0.00334805, 0.002271156, -0.018676732, 0.019160923, -0.01945997,
-0.014342636, -0.004867796, -0.010002446, -0.004372991, 0.023164369,
0.019824112, -0.00321832, -0.015785746, 0.040836652, 0.00148831,
0.012084485, -0.009603897, -0.004642148, -0.008399234, 0.010463218,
0.000256571000000001, -0.01978405, -0.003439498, -0.015669975,
0.026180724, 0.020373255, 0.019160773, 0.00692683, 0.010215506,
0.010861939, 0.012041143, 0.025734568, -0.004828156, 0.006914552,
-0.00720089, -0.000538489999999999, -0.008479448, 0.022926604,
0.002131842, -0.003688597, 0.025325639, -0.009562293, -0.024336741,
0.012907537, 0.004339383, 0.010744364, -0.013058765, -0.003672014,
-0.023887493, 0.01062259, 0.02088054, -0.035249878, -0.001462821,
0.01904368, -0.001308787, 0.009203217, 0.019856479, 0.011296979,
0.010039545, -0.01559142, 0.006083419, -0.017958978, -0.007488063,
0.01236649, -0.004459064, -0.004375386, 0.025500722, 0.005557851,
0.008444321, 0.002827649, 0.020320308, 0.031611803, -0.010199803,
-0.009425874, 0.007942729, -2.59379999999999e-05, 0.016669077,
-0.011666062, 0.022835386, -0.025599107, 0.013562535, -0.018365192,
0.018148786, 0.016649144, -0.009530455, 0.012996597, 0.002034778,
-0.005926478, -0.004897238, -0.004419719, 0.010848926, -0.006039757,
-0.030287605, 0.019221837, 0.001808161, -0.009566133, 0.005009292,
0.005365023, -0.004879922, -0.024637933, -0.0186584, 0.004786059,
-0.008245254, -0.000106243, -0.001714888, -0.017804006, -0.021200061,
0.003812757, 0.021940886, 0.002270448, -0.015417493, -0.045754612,
-0.003468442, -0.006242659, 0.022383824, -0.018753927, 0.008577571,
0.008655048, 0.02374636, 0.029522811, 0.009946946, 0.015419714,
-0.016714623, -0.014616188, 0.019670855, -0.038979063, 0.020491563,
-0.009640674, 0.046051144, -0.021434575, 0.000190443999999998,
-0.029013969), id = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26,
27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41,
42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56,
57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71,
72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86,
87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100,
101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112,
113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124,
125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136,
137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148,
149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160,
161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172,
173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184,
185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196,
197, 198, 199, 200)), .Names = c("date", "x", "id"), row.names = 53:252, class = "data.frame")

Convert df$date to character:
df$date <- as.character(df$date)
and then:
df %>% ggvis(~date, ~x) %>% layer_bars()
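A small base-R illustration of what the conversion does, using the first three dates from the data above. as.character() turns each Date into an ISO "YYYY-MM-DD" string, and because those strings sort in the same order as the dates, the bars should stay in chronological order. This assumes that ggvis treats a character x as categorical and orders the categories alphabetically, which I have not verified.

```r
# First three values of df$date from the question, as days since 1970-01-01
dates <- as.Date(c(7680, 7687, 7694), origin = "1970-01-01")

# Dates become plain strings in ISO format
labels <- as.character(dates)

# Alphabetical order of the strings matches chronological order of the dates
same_order <- identical(order(labels), order(dates))
print(labels)
```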
