How to consecutively subset an array and count the number of iterations?

How to consecutively subset an array and count the number of iterations? - r

I need to repeatedly apply a function on the resultant arrays until all data in the array is reduced to a single set, and count the number of iterations.
Data
Array ar
structure(c(0, 11, 17, 15, 22, 67, 73, 68, 31, 31, 28, 33, 34,
32, 11, 0, 9, 12, 21, 67, 73, 67, 35, 30, 34, 67, 60, 36, 17,
9, 0, 6, 19, 70, 74, 68, 36, 36, 36, 64, 66, 39, 15, 12, 6, 0,
13, 64, 69, 66, 34, 37, 39, 77, 65, 45, 22, 21, 19, 13, 0, 59,
60, 66, 38, 39, 39, 40, 43, 43, 67, 67, 70, 64, 59, 0, 10, 18,
77, 75, 78, 93, 93, 85, 73, 73, 74, 69, 60, 10, 0, 15, 76, 74,
80, 103, 101, 95, 68, 67, 68, 66, 66, 18, 15, 0, 59, 65, 73,
90, 87, 82, 31, 35, 36, 34, 38, 77, 76, 59, 0, 8, 19, 24, 28,
32, 31, 30, 36, 37, 39, 75, 74, 65, 8, 0, 12, 20, 22, 23, 28,
34, 36, 39, 39, 78, 80, 73, 19, 12, 0, 6, 14, 18, 33, 67, 64,
77, 40, 93, 103, 90, 24, 20, 6, 0, 2, 8, 34, 60, 66, 65, 43,
93, 101, 87, 28, 22, 14, 2, 0, 6, 32, 36, 39, 45, 43, 85, 95,
82, 32, 23, 18, 8, 6, 0), .Dim = c(14L, 14L))
From
a<-colSums(ar<25)
b<-which.max(a)
c<-ar[ar[,b] > 25,, drop = FALSE]
we get
structure(c(0, 11, 17, 15, 22, 67, 73, 68, 11, 0, 9, 12, 21,
67, 73, 67, 17, 9, 0, 6, 19, 70, 74, 68, 15, 12, 6, 0, 13, 64,
69, 66, 22, 21, 19, 13, 0, 59, 60, 66, 67, 67, 70, 64, 59, 0,
10, 18, 73, 73, 74, 69, 60, 10, 0, 15, 68, 67, 68, 66, 66, 18,
15, 0, 31, 35, 36, 34, 38, 77, 76, 59, 31, 30, 36, 37, 39, 75,
74, 65, 28, 34, 36, 39, 39, 78, 80, 73, 33, 67, 64, 77, 40, 93,
103, 90, 34, 60, 66, 65, 43, 93, 101, 87, 32, 36, 39, 45, 43,
85, 95, 82), .Dim = c(8L, 14L))
then from
a<-colSums(c<25)
b<-which.max(a)
d<-c[c[,b]>25,,drop=FALSE]
we get
structure(c(67, 73, 68, 67, 73, 67, 70, 74, 68, 64, 69, 66, 59,
60, 66, 0, 10, 18, 10, 0, 15, 18, 15, 0, 77, 76, 59, 75, 74,
65, 78, 80, 73, 93, 103, 90, 93, 101, 87, 85, 95, 82), .Dim = c(3L,
14L))
applying once more
a<-colSums(d<25)
b<-which.max(a)
e<-d[d[,b]>25,,drop=FALSE]
results in a array with no values
structure(numeric(0), .Dim = c(0L, 14L))
Then, the operation can be performed a number of times, and I need to know how many times; in this case it was 3 times.
Also, I need to reduce the number of the lines of the code probably with a loop function, as the action is repetitive.

You can use recursion as shown below:
fun<- function(x, i=1){
a<-which.max(colSums(x<25))
b<-x[x[,a] > 25,, drop = FALSE]
if(length(b)) Recall(b, i+1) else i
}
fun(ar)
[1] 3

Related

Batch column aggregation and reordering dataframe in R

I have census tract data divided my age variables by sex, into a value for males (varname_m) and females (varname_f):
Rows: 146,112
Columns: 13
$ tractid <chr> "01001020100", "01001020100", "01001020200", "01001020200", "01001020300", "01001020300", "01001020400", "01001020400", "01001020500", "01001020500", "0100102060…
$ ag18to19_m <dbl> 37, 57, 24, 15, 49, 27, 87, 33, 293, 159, 57, 40, 19, 41, 18, 56, 143, 86, 25, 155, 41, 7, 40, 0, 35, 0, 99, 25, 190, 420, 61, 157, 63, 110, 37, 127, 67, 45, 198…
$ ag20_m <dbl> 6, 14, 64, 0, 11, 18, 16, 8, 115, 21, 42, 15, 53, 71, 16, 0, 63, 77, 43, 96, 32, 15, 21, 0, 12, 44, 8, 0, 105, 80, 34, 20, 8, 0, 13, 46, 88, 0, 83, 241, 10, 96, …
$ ag21_m <dbl> 18, 0, 15, 7, 0, 16, 117, 18, 14, 40, 23, 26, 45, 47, 32, 0, 41, 50, 0, 76, 14, 45, 20, 1, 48, 11, 11, 30, 18, 30, 60, 55, 20, 0, 28, 43, 31, 21, 9, 0, 11, 8, 0,…
$ ag22to24_m <dbl> 48, 64, 109, 45, 25, 62, 65, 41, 224, 531, 28, 51, 31, 60, 0, 24, 132, 96, 59, 98, 27, 45, 111, 30, 113, 58, 71, 61, 46, 114, 11, 86, 116, 99, 28, 158, 72, 135, …
$ ag25to29_m <dbl> 49, 31, 83, 99, 87, 144, 153, 142, 428, 327, 69, 35, 36, 22, 61, 113, 202, 420, 184, 255, 94, 84, 118, 82, 71, 30, 47, 195, 44, 135, 118, 150, 215, 157, 118, 180…
$ ag30to34_m <dbl> 52, 72, 59, 97, 84, 157, 124, 85, 415, 227, 95, 13, 105, 202, 37, 86, 274, 334, 161, 182, 91, 173, 84, 84, 81, 106, 79, 67, 263, 77, 40, 115, 199, 411, 81, 115, …
$ ag18to19_f <dbl> 33, 8, 51, 7, 31, 19, 107, 15, 33, 25, 47, 37, 35, 81, 98, 92, 127, 147, 72, 0, 109, 57, 7, 74, 78, 0, 36, 24, 109, 268, 88, 62, 10, 0, 47, 33, 79, 191, 63, 134,…
$ ag20_f <dbl> 13, 40, 23, 18, 27, 18, 12, 11, 37, 0, 58, 83, 19, 45, 20, 77, 16, 103, 0, 36, 15, 0, 8, 37, 29, 34, 36, 0, 23, 30, 37, 0, 10, 48, 51, 67, 17, 15, 125, 55, 27, 1…
$ ag21_f <dbl> 40, 6, 13, 24, 36, 0, 16, 19, 17, 0, 11, 0, 0, 89, 28, 31, 39, 20, 15, 0, 7, 13, 0, 17, 9, 13, 17, 47, 106, 36, 42, 94, 0, 13, 19, 50, 67, 0, 122, 48, 21, 9, 145…
$ ag22to24_f <dbl> 21, 67, 71, 21, 69, 35, 28, 165, 346, 350, 15, 0, 53, 50, 25, 42, 207, 165, 158, 114, 20, 0, 73, 66, 29, 29, 59, 39, 83, 94, 22, 24, 79, 69, 37, 21, 73, 201, 282…
$ ag25to29_f <dbl> 36, 24, 86, 51, 88, 160, 130, 73, 318, 539, 157, 127, 128, 111, 86, 29, 334, 365, 87, 217, 57, 60, 177, 92, 17, 90, 86, 113, 67, 204, 136, 120, 130, 108, 211, 51…
$ ag30to34_f <dbl> 36, 73, 38, 42, 87, 154, 63, 84, 440, 414, 51, 95, 151, 73, 27, 70, 429, 458, 231, 173, 54, 82, 104, 24, 61, 159, 69, 30, 218, 82, 88, 214, 222, 158, 76, 125, 24…
I want to aggregate each of the variables divided by sex to a single combined variable. For example, I want to add ag18to19_m and ag18to19_f to create ag18to19. I can easily do this using mutate and the following code and order them to the front of the data frame:
aggregated <- merged %>%
mutate(ag18to19 = ag18to19_m + ag18to19_f) %>%
relocate(ag18to19, .before = ag18to19_m) %>%
mutate(ag20 = ag20_m + ag20_f) %>%
relocate(ag20, .before = ag20_m) %>%
mutate(ag21 = ag21_m + ag21_f) %>%
relocate(ag21, .before = ag21_m) %>%
mutate(ag22to24 = ag22to24_m + ag22to24_f) %>%
relocate(ag22to24, .before = ag22to24_m) %>%
mutate(ag25to29 = ag25to29_m + ag25to29_f) %>%
relocate(ag25to29, .before = ag25to29_m) %>%
mutate(ag30to34 = ag30to34_m + ag30to34_f) %>%
relocate(ag30to34, .before = ag30to34_m)
I know there's a more efficient way to do this using a loop or map_df function that will also give me a data frame as an output. I've been trying for the last hour to write a function and use map_df but haven't had any success. Does anyone have a suggestion?
More efficient code here is best practice and will help me apply this same data cleaning step to several other variables that are grouped in the same way (e.g., income grouped by sex or education grouped by age).
Any help would be greatly appreciated. Thanks.

Here is an option in tidyverse
library(dplyr)
library(stringr)
merged1 <- merged %>%
mutate(across(ends_with('_m'), ~
. + get(str_replace(cur_column(), '_m', '_f')),
.names = '{.col}_new')) %>%
rename_at(vars(ends_with('_new')),
~ str_remove(., '_[m]_new$')) %>%
select(tract_id, order(names(.)[-1]) + 1)

Plotly animation in R; frames are not in correct order, mix up in frames

The Data:
dput(LifeExpCH$Age)
c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49,
50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65,
66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81,
82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97,
98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110,
111)
> dput(LifeExpCH$Die)
c(380, 16, 11, 9, 9, 8, 7, 7, 7, 8, 7, 7, 8, 8, 8, 10, 11, 13,
16, 19, 20, 20, 21, 20, 19, 21, 20, 21, 23, 24, 25, 27, 30, 31,
35, 37, 41, 44, 48, 52, 57, 63, 70, 76, 84, 94, 104, 115, 129,
143, 159, 176, 195, 215, 237, 258, 283, 307, 334, 363, 392, 424,
458, 495, 534, 578, 624, 677, 734, 798, 869, 952, 1044, 1149,
1271, 1411, 1569, 1750, 1955, 2184, 2440, 2723, 3032, 3363, 3711,
4064, 4404, 4710, 4952, 5106, 5143, 5041, 4795, 4408, 3908, 3342,
2753, 2190, 1679, 1243, 888, 612, 406, 259, 158, 93, 51, 27,
13, 5, 2, 1)
I want to create a plotly animation in R. The animation is not working as intended. There is a mix up in the frames. Frames 100:109 are at the start. May I ask for some help, how to get the frames in the right order?
Here is the code:
library(plotly)
library(dplyr)
library(purrr)
LifeExpCH %>%
split(.$Age) %>% accumulate(~bind_rows(.x, .y)) %>%
set_names(0:111) %>%
bind_rows(.id = "frame") %>%
plot_ly(x = ~Age, y = ~Die) %>%
add_lines(frame = ~frame, showlegend = FALSE)

Determining the FWHM from a distribution in R

I have a dataset:
x = c(30, 23, 22, 31, 18, 16, 19, 16, 15, 23, 21, 17, 17, 27, 24, 22, 27, 32, 23, 21, 14, 19, 22, 23, 22, 19, 13, 22, 33, 25, 24, 26, 24, 24, 13, 21, 23, 23, 31, 23, 25, 20, 24, 23, 24, 16, 19, 33, 36, 29, 29, 26, 22, 23, 23, 23, 20, 27, 32, 30, 34, 37, 28, 24, 29, 32, 35, 32, 25, 28, 27, 32, 33, 25, 25, 30, 25, 25, 26, 36, 30, 37, 27, 25, 26, 34, 38, 29, 30, 32, 31, 33, 33, 31, 37, 36, 42, 32, 37, 34, 37, 38, 31, 25, 27, 27, 32, 33, 39, 38, 36, 32, 35, 33, 36, 40, 39, 32, 32, 34, 35, 38, 43, 43, 39, 30, 33, 39, 39, 46, 51, 40, 30, 37, 48, 52, 47, 58, 41, 47, 51, 55, 56, 46, 34, 47, 45, 48, 52, 66, 56, 49, 66, 71, 63, 47, 41, 50, 56, 58, 56, 56, 61, 49, 57, 58, 52, 65, 70, 75, 59, 55, 44, 48, 49, 49, 55, 61, 58, 56, 56, 62, 64, 58, 59, 64, 64, 61, 56, 62, 64, 78, 86, 80, 75, 65, 75, 66, 72, 85, 65, 70, 63, 58, 73, 83, 89, 78, 75, 79, 84, 92, 90, 80, 76, 82, 83, 82, 82, 93, 93, 79, 92, 97, 84, 85, 81, 90, 99, 106, 100, 96, 97, 101, 128, 120, 121, 109, 125, 118, 121, 118, 123, 108, 115, 123, 123, 117, 113, 124, 128, 142, 134, 120, 116, 112, 129, 149, 150, 134, 134, 133, 139, 165, 167, 173, 158, 170, 188, 186, 191, 172, 163, 176, 182, 188, 205, 203, 219, 211, 229, 242, 245, 273, 270, 285, 292, 329, 355, 358, 362, 403, 429, 480, 516, 512, 525, 544, 575, 595, 668, 622, 612, 627, 649, 620, 576, 608, 576, 545, 471, 435, 422, 416, 405, 372, 338, 299, 285, 279, 274, 251, 241, 213, 201, 197, 203, 197, 196, 189, 184, 166, 165, 161, 167, 160, 157, 139, 131, 141, 152, 143, 144, 140, 136, 148, 123, 114, 113, 109, 122, 134, 120, 103, 117, 134, 117, 106, 114, 112, 104, 86, 94, 103, 108, 98, 109, 98, 100, 108, 114, 92, 78, 83, 111, 98, 78, 80, 80, 68, 62, 76, 74, 89, 78, 85, 86, 86, 76, 71, 72, 72, 64, 76, 77, 77, 89, 76, 65, 61, 66, 68, 72, 75, 72, 67, 67, 69, 75, 65, 63, 75, 68, 65, 59, 68, 61, 60, 63, 63, 58, 63, 59, 49, 68, 55, 60, 67, 65, 69, 68, 53, 59, 64, 45, 43, 42, 48, 46, 50, 52, 41, 38, 44, 38, 51, 50, 51, 41, 40, 41, 41, 34, 41, 32, 35, 39, 52, 46, 38, 37, 39, 36, 36, 34, 41, 40, 38, 38, 47, 45, 46, 36, 40, 34, 32, 39, 41, 47, 38, 38, 33, 44, 37, 38, 30, 34, 30, 40, 43, 41, 31, 27, 39, 34, 31, 29, 29, 25, 38, 38, 33, 42, 45, 46, 42, 37, 40, 35, 50, 34, 29, 25, 30, 36, 35, 36, 35, 24, 22, 29, 29, 32, 32, 25, 32, 30, 28, 23, 28, 34, 31, 28, 30, 27, 27, 20, 25, 32, 32, 41, 28, 19, 22, 23, 20, 25, 31, 27, 24, 26, 21, 20, 25, 33, 31, 44, 31, 31, 22, 29, 29, 32, 20, 24, 26, 27, 28, 24, 16, 19, 24, 23, 28, 27, 22, 24, 18, 19, 19, 21, 26, 26, 25, 28, 28, 32, 32, 26, 23, 31, 27, 20, 18, 29, 25, 15, 23, 28, 29)
I know the data follows the following distribution, a quadratic background with a cauchy peak:
a + bx + cx^2 + dx^3 + ex^4 + fx^5 + gx^6 + (A/pi)(gamma/((x-pos)^2 + gamma^2))
I am trying to determine the optimum parameters to fit this data in order to find the 'gamma' parameter. This will tell me the FWHM of the distribution. I only know the 'pos' parameter. This is the location of the peak.
Here is a plot of this distribution:
I have achieved this in python using the lmfit package but have not found a way to do this in R yet. I believe it requires the fitdistrplus package but I have had no luck. Any ideas?

Even though the data sets are a little different, you should be able to use the same recipe as found at R Find Full width at half maximum for a gausian density distribution -- to get FWHM you need a density curve which is the part that's missing:
> d <- density(na.omit(x),n=1e4)
> xmax <- d$x[d$y==max(d$y)]
> x1 <- d$x[d$x < xmax][which.min(abs(d$y[d$x < xmax]-max(d$y)/2))]
> x2 <- d$x[d$x > xmax][which.min(abs(d$y[d$x > xmax]-max(d$y)/2))]
> points(c(x1, x2), c(d$y[d$x==x1], d$y[d$x==x2]), col="red")
> (FWHM <- x2-x1)
[1] 43.37897

Discretize the columns first - Apriori in R

I extracted this data from a file:
forests<-read.table("~/Desktop/f.txt", header = FALSE, sep = " ", fill = TRUE)
library(arules) //after installing the package
I get an error when I type
forests<- apriori(forests, parameter = list(support=0.3))
Error in asMethod(object) : column(s) 1, 2, 3, 4, 5, 6, 7, 8, 9,
10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26,
27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43,
44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60,
61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77,
78, 79, 80, 81, 82, 83, 84, 85 not logical or a factor. Discretize the
columns first.
I tried discritize(forests). It still doesn't work.
itemFrequencyPlot(forests) also gives an error saying unable to find inherited method.
Have I imported the data incorrectly?

R - hist plot colours by quantile

I am trying to do a simple hist plot and colour the bins by quantile.
I was wondering why when the bins size change the colours gets all messed up.
Maybe I am not doing it right from the beginning.
The quantiles are
quantile(x)
0% 25% 50% 75% 100%
0.00 33.75 58.00 78.25 123.00
Then I am setting the colours with the quantile values
k = ifelse(test = x <= 34, yes = "#8DD3C7",
no = ifelse(test = (x > 34 & x <= 58), yes = "#FFFFB3",
no = ifelse(test = (x > 58 & x <= 79), yes = "#BEBADA",
no = ifelse(test = (x > 79), yes = "#FB8072", 'grey'))) )
Then when I plot with larger bin, I get :
hist(dt, breaks = 10, col = k)
Which seems right, even though the last bin is wrong (?!).
But when I try with smaller bins, the colours are not right.
Could someone help me understand why is it wrong ? Or if my code is wrong ?
The x in question
x = c(23, 23, 16, 16, 34, 34, 43, 43, 97, 97, 63, 63, 39, 39, 29,
29, 63, 63, 48, 48, 7, 7, 80, 80, 69, 69, 110, 110, 103, 103,
43, 43, 39, 39, 46, 46, 14, 14, 56, 56, 76, 76, 52, 52, 18, 18,
32, 32, 66, 66, 70, 70, 26, 26, 40, 40, 105, 105, 62, 62, 51,
51, 58, 58, 37, 37, 55, 55, 42, 42, 11, 11, 89, 89, 55, 55, 109,
109, 49, 49, 27, 27, 96, 96, 27, 27, 65, 65, 74, 74, 17, 17,
33, 33, 89, 89, 63, 63, 18, 18, 25, 25, 36, 36, 108, 108, 3,
3, 52, 52, 83, 83, 74, 74, 56, 56, 99, 99, 6, 6, 25, 25, 51,
51, 4, 4, 100, 100, 17, 17, 44, 44, 23, 23, 70, 70, 85, 85, 14,
14, 22, 22, 89, 89, 45, 45, 2, 2, 29, 29, 14, 14, 69, 69, 96,
96, 10, 10, 58, 58, 97, 97, 54, 54, 60, 60, 65, 65, 2, 2, 54,
54, 4, 4, 28, 28, 107, 107, 74, 74, 72, 72, 71, 71, 42, 42, 92,
92, 64, 64, 39, 39, 111, 111, 72, 72, 73, 73, 58, 58, 41, 41,
56, 56, 73, 73, 18, 18, 73, 73, 36, 36, 60, 60, 49, 49, 47, 47,
95, 95, 19, 19, 8, 8, 7, 7, 38, 38, 38, 38, 38, 38, 28, 28, 79,
79, 53, 53, 30, 30, 19, 19, 14, 14, 53, 53, 68, 68, 39, 39, 42,
42, 87, 87, 33, 33, 18, 18, 77, 77, 83, 83, 19, 19, 14, 14, 7,
7, 32, 32, 94, 94, 30, 30, 55, 55, 89, 89, 30, 30, 45, 45, 84,
84, 38, 38, 59, 59, 73, 73, 77, 77, 22, 22, 55, 55, 31, 31, 52,
52, 20, 20, 26, 26, 62, 62, 55, 55, 46, 46, 26, 26, 49, 49, 22,
22, 65, 65, 67, 67, 73, 73, 29, 29, 88, 88, 86, 86, 76, 76, 32,
32, 12, 12, 19, 19, 14, 14, 8, 8, 63, 63, 63, 63, 65, 65, 84,
84, 34, 34, 42, 42, 26, 26, 75, 75, 68, 68, 28, 28, 95, 95, 17,
17, 76, 76, 33, 33, 91, 91, 93, 93, 80, 80, 89, 89, 64, 64, 81,
81, 98, 98, 47, 47, 70, 70, 46, 46, 11, 11, 92, 92, 69, 69, 95,
95, 51, 51, 87, 87, 61, 61, 50, 50, 47, 47, 35, 35, 31, 31, 39,
39, 19, 19, 81, 81, 35, 35, 68, 68, 68, 68, 67, 67, 57, 57, 7,
7, 9, 9, 23, 23, 50, 50, 89, 89, 41, 41, 54, 54, 53, 53, 57,
57, 89, 89, 32, 32, 40, 40, 48, 48, 35, 35, 15, 15, 90, 90, 1,
1, 17, 17, 53, 53, 73, 73, 76, 76, 59, 59, 45, 45, 68, 68, 21,
21, 37, 37, 33, 33, 51, 51, 61, 61, 31, 31, 15, 15, 23, 23, 29,
29, 45, 45, 96, 96, 87, 87, 37, 37, 104, 104, 50, 50, 58, 58,
103, 103, 91, 91, 72, 72, 73, 73, 27, 27, 60, 60, 23, 23, 99,
99, 28, 28, 78, 78, 27, 27, 82, 82, 63, 63, 34, 34, 84, 84, 62,
62, 2, 2, 99, 99, 22, 22, 85, 85, 39, 39, 47, 47, 66, 66, 17,
17, 74, 74, 45, 45, 70, 70, 87, 87, 28, 28, 97, 97, 89, 89, 33,
33, 50, 50, 79, 79, 86, 86, 69, 69, 91, 91, 75, 75, 52, 52, 76,
76, 13, 13, 71, 71, 42, 42, 20, 20, 28, 28, 56, 56, 69, 69, 16,
16, 47, 47, 60, 60, 45, 45, 72, 72, 78, 78, 107, 107, 4, 4, 64,
64, 88, 88, 9, 9, 3, 3, 10, 10, 92, 92, 41, 41, 5, 5, 35, 35,
31, 31, 24, 24, 70, 70, 47, 47, 41, 41, 32, 32, 92, 92, 90, 90,
75, 75, 3, 3, 78, 78, 30, 30, 93, 93, 60, 60, 17, 17, 25, 25,
48, 48, 70, 70, 69, 69, 66, 66, 76, 76, 104, 104, 31, 31, 72,
72, 56, 56, 64, 64, 92, 92, 68, 68, 102, 102, 100, 100, 27, 27,
40, 40, 47, 47, 29, 29, 76, 76, 78, 78, 20, 20, 13, 13, 10, 10,
113, 113, 17, 17, 61, 61, 69, 69, 65, 65, 16, 16, 100, 100, 5,
5, 18, 18, 24, 24, 54, 54, 41, 41, 64, 64, 66, 66, 90, 90, 29,
29, 97, 97, 37, 37, 42, 42, 84, 84, 37, 37, 74, 74, 65, 65, 12,
12, 49, 49, 31, 31, 108, 108, 9, 9, 93, 93, 71, 71, 39, 39, 70,
70, 79, 79, 92, 92, 60, 60, 104, 104, 79, 79, 103, 103, 38, 38,
93, 93, 46, 46, 66, 66, 79, 79, 51, 51, 31, 31, 65, 65, 93, 93,
25, 25, 22, 22, 91, 91, 123, 123, 51, 51, 34, 34, 64, 64, 31,
31, 24, 24, 74, 74, 57, 57, 95, 95, 83, 83, 28, 28, 56, 56, 72,
72, 43, 43, 18, 18, 66, 66, 32, 32, 17, 17, 67, 67, 10, 10, 44,
44, 66, 66, 57, 57, 89, 89, 57, 57, 55, 55, 18, 18, 78, 78, 82,
82, 103, 103, 110, 110, 92, 92, 54, 54, 35, 35, 8, 8, 53, 53,
86, 86, 45, 45, 99, 99, 19, 19, 84, 84, 94, 94, 92, 92, 80, 80,
69, 69, 45, 45, 22, 22, 59, 59, 9, 9, 41, 41, 72, 72, 24, 24,
117, 117, 79, 79, 57, 57, 29, 29, 96, 96, 47, 47, 23, 23, 64,
64, 33, 33, 48, 48, 80, 80, 30, 30, 42, 42, 10, 10, 42, 42, 68,
68, 46, 46, 58, 58, 39, 39, 82, 82, 79, 79, 80, 80, 89, 89, 85,
85, 24, 24, 106, 106, 40, 40, 90, 90, 69, 69, 92, 92, 84, 84,
82, 82, 86, 86, 80, 80, 73, 73, 78, 78, 39, 39, 27, 27, 55, 55,
100, 100, 63, 63, 21, 21, 46, 46, 94, 94, 6, 6, 45, 45, 66, 66,
94, 94, 52, 52, 78, 78, 59, 59, 86, 86, 67, 67, 76, 76, 54, 54,
47, 47, 37, 37, 76, 76, 32, 32, 49, 49, 87, 87, 122, 122, 27,
27, 82, 82, 51, 51, 50, 50, 22, 22, 32, 32, 99, 99, 77, 77, 54,
54, 29, 29, 82, 82, 80, 80, 85, 85, 30, 30, 57, 57, 41, 41, 50,
50, 65, 65, 51, 51, 109, 109, 89, 89, 50, 50, 6, 6, 66, 66, 42,
42, 48, 48, 88, 88, 67, 67, 89, 89, 109, 109, 80, 80, 64, 64,
64, 64, 95, 95, 76, 76, 76, 76, 78, 78, 44, 44, 51, 51, 19, 19,
29, 29, 31, 31, 75, 75, 11, 11, 10, 10, 64, 64, 80, 80, 29, 29,
73, 73, 67, 67, 38, 38, 27, 27, 23, 23, 74, 74, 79, 79, 49, 49,
78, 78, 29, 29, 59, 59, 70, 70, 8, 8, 24, 24, 39, 39, 80, 80,
27, 27, 29, 29, 36, 36, 94, 94, 86, 86, 35, 35, 84, 84, 99, 99,
83, 83, 92, 92, 81, 81, 58, 58, 2, 2, 64, 64, 75, 75, 29, 29,
53, 53, 58, 58, 11, 11, 38, 38, 83, 83, 108, 108, 86, 86, 56,
56, 12, 12, 84, 84, 76, 76, 38, 38, 54, 54, 37, 37, 27, 27, 61,
61, 83, 83, 37, 37, 59, 59, 81, 81, 76, 76, 70, 70, 61, 61, 101,
101, 77, 77, 68, 68, 74, 74, 83, 83, 70, 70, 93, 93, 53, 53,
64, 64, 89, 89, 1, 1, 53, 53, 67, 67, 81, 81, 71, 71, 51, 51,
85, 85, 35, 35, 67, 67, 53, 53, 37, 37, 31, 31, 65, 65, 82, 82,
47, 47, 60, 60, 81, 81, 21, 21, 94, 94, 75, 75, 92, 92, 113,
113, 93, 93, 84, 84, 77, 77, 82, 82, 84, 84, 58, 58, 83, 83,
84, 84, 80, 80, 1, 1, 49, 49, 73, 73, 22, 22, 99, 99, 74, 74,
28, 28, 33, 33, 74, 74, 91, 91, 83, 83, 70, 70, 99, 99, 69, 69,
38, 38, 68, 68, 47, 47, 61, 61, 47, 47, 70, 70, 85, 85, 20, 20,
100, 100, 3, 3, 49, 49, 100, 100, 85, 85, 54, 54, 8, 8, 3, 3,
47, 47, 46, 46, 45, 45, 27, 27, 87, 87, 20, 20, 24, 24, 51, 51,
50, 50, 105, 105, 73, 73, 13, 13, 18, 18, 51, 51, 75, 75, 55,
55, 62, 62, 85, 85, 56, 56, 51, 51, 66, 66, 74, 74, 63, 63, 2,
2, 81, 81, 85, 85, 19, 19, 16, 16, 83, 83, 36, 36, 79, 79, 63,
63, 41, 41, 45, 45, 76, 76, 62, 62, 67, 67, 74, 74, 92, 92, 47,
47, 41, 41, 80, 80, 57, 57, 100, 100, 66, 66, 58, 58, 65, 65,
59, 59, 20, 20, 54, 54, 10, 10, 79, 79, 64, 64, 106, 106, 44,
44, 28, 28, 41, 41, 49, 49, 80, 80, 61, 61, 20, 20, 75, 75, 59,
59, 93, 93, 32, 32, 38, 38, 30, 30, 41, 41, 8, 8, 8, 8, 54, 54,
56, 56, 83, 83, 81, 81, 77, 77, 42, 42, 59, 59, 11, 11, 21, 21,
77, 77, 84, 84, 86, 86, 84, 84, 34, 34, 48, 48, 80, 80, 92, 92,
18, 18, 66, 66, 40, 40, 45, 45, 60, 60, 80, 80, 2, 2, 5, 5, 84,
84, 66, 66, 70, 70, 70, 70, 95, 95, 62, 62, 0, 0, 67, 67, 61,
61, 71, 71, 73, 73, 82, 82, 45, 45, 54, 54, 43, 43)

It is because you mistunderstand the col argument of hist.
The col argument is a vector where col[i] is the colour of the ith bar of the histogram.
Your k vector has one element per element of x, which is many more than the number of bars in the histogram.
In the first case, only the first 13 elements of k are used to colour the bars (in that order), since there are only 13 bars. In the second case, the first n elements of k are used to colour the bars, where n is the number of bars (see how the first 13 bars of the small-bin histogram have the same colour as the first 13 of the first histogram?).
If you want to colour the bars by quantile, you will have to work out how many bars are in each quantile (not how many data points), and create your k like that.
To do this, you need to know the histogram breaks - the breakpoints of your bins. The output of hist returns an object where you can get the breakpoints and so on - see ?hist.
# do the histogram counts to get the break points
# don't plot yet
h <- hist(x, breaks=20, plot=F) # h$breaks and h$mids
To work out the colour the bar should be, you can use either the starting coordinate of each bar (all but the last element of h$breaks), the ending coordinate of each bar (all but the first element of h$breaks) or the midpoint coordinate of each bar (h$mids). Set your colours like you did above.
The findInterval(h$mids, quantile(x), ...) works out which quantile each bar is in (determined by the bar's midpoint); it returns an integer with which interval it is in, or 0 if it's outside (though by definition every bar of the histogram is between the 0th and 100th quantile, so technically your "grey" colour is not ever used). rightmost.closed makes sure the 100% quantile value is included in the top-most colour bracket. The cols[findInterval(...)+1] is just a cool/tricksy way to do your ifelse(h$mids <= ..., "$8DD3C7", ifelse(h$mids <= ..., .....)); you could do it the ifelse way if you prefer.
cols <- c('grey', "#8DD3C7", "#FFFFB3", "#BEBADA", "#FB8072")
k <- cols[findInterval(h$mids, quantile(x), rightmost.closed=T, all.inside=F) + 1]
# plot the histogram with the colours
plot(h, col=k)
Have a look at k - it is only as long as the number of bars in the histogram, rather than as long as the number of datapoints in x.

Develop Reference

r css asp.net wordpress firebase qt symfony nginx http apache-flex

How to consecutively subset an array and count the number of iterations? - r

You can use recursion as shown below: fun<- function(x, i=1){ a<-which.max(colSums(x<25)) b<-x[x[,a] > 25,, drop = FALSE] if(length(b)) Recall(b, i+1) else i } fun(ar) [1] 3

Related

Batch column aggregation and reordering dataframe in R

Plotly animation in R; frames are not in correct order, mix up in frames

Determining the FWHM from a distribution in R

Discretize the columns first - Apriori in R

R - hist plot colours by quantile

Categories

Resources