Order a numeric vector by length in R

I've got two numeric vectors that I want to order by the length of their observations, i.e., the number of times each observation appears.
For example:
x <- c("a", "a", "a", "b", "b", "b", "b", "c", "e", "e")
Here, b occurs four times, a three times, e two and c one time. I'd like my result in this order.
ans <- c("b", "b", "b", "b", "a", "a", "a", "e", "e", "c")
I've tried this:
x <- x[order(-length(x))] # and some similar lines.
Thanks

Using rle you can get the values and their run lengths. Order the lengths, then use the values to recreate the vector in the new order:
xx <- c('a', 'a', 'a', 'b', 'b', 'b','b', 'c', 'e', 'e')
rr <- rle(xx)
ord <- order(rr$lengths,decreasing=TRUE)
rep(rr$values[ord], rr$lengths[ord])
## [1] "b" "b" "b" "b" "a" "a" "a" "e" "e" "c"
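
Note that rle counts runs of consecutive equal values, so the approach above relies on identical values sitting next to each other, as they do in the example. Here is a minimal sketch for input where they are not adjacent, assuming table() is an acceptable substitute for rle(). The shuffled vector x_mixed is made up for illustration:

```r
# Hypothetical input: same values as in the question, but not grouped together
x_mixed <- c("a", "b", "a", "c", "b", "b", "e", "a", "e", "b")

counts <- table(x_mixed)                              # occurrences of each distinct value
ord <- order(as.integer(counts), decreasing = TRUE)   # most frequent first

# Rebuild the vector: each value repeated as often as it occurred
result <- rep(names(counts)[ord], times = as.integer(counts)[ord])
result
```

Sorting the input first (rle(sort(x_mixed))) would be another way to make the rle version work on such data.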

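You may also use ave to calculate the lengths:
# ave() over seq_along(x) keeps the counts numeric; ave(x, x, ...) would coerce them to character
x[order(ave(seq_along(x), x, FUN = length), decreasing = TRUE)]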
# [1] "b" "b" "b" "b" "a" "a" "a" "e" "e" "c"


Finding in which vector does the element belong to

suppose I have 3 vectors:
a = c("A", "B", "C")
b = c("D", "E", "F")
c = c("G", "H", "I")
then I have an element:
element = "E"
I want to find which list my element belongs to. In this case, list b.
A more general solution would be appreciated, because my real data set has more than a hundred lists.
element = "E"
names(our_lists)[sapply(our_lists, `%in%`, x = element)]
# [1] "b"
Data
our_lists <- list(
a = c("A", "B", "C"),
b = c("D", "E", "F"),
c = c("G", "H", "I")
)
Using grep.
element <- "E"
l <- mget(c("a", "b", "c"))
names(l)[grep(element, l)]
# [1] "b"
If you keep the data in individual objects, you need to check for the element in each one individually. Get them in a list.
list_data <- mget(c('a', 'b', 'c'))
names(Filter(any, lapply(list_data, `==`, element)))
#[1] "b"
If all your vectors have the same length then a vectorised idea can be,
c('a', 'b', 'c')[ceiling(which(c(a, b, c) == 'E') / length(a))]
#[1] "b"
You can use dplyr::lst, which creates a named list from variable names, and then purrr::keep to keep only the vectors that contain your element.
require(tidyverse)
lst(a, b, c) %>%
keep(~ element %in% .x) %>%
names()
output:
[1] "b"
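
If many elements have to be looked up against more than a hundred vectors, it may be worth building a lookup table once instead of scanning every vector per element. This is a rough sketch in base R, assuming each element occurs in at most one vector. The names lookup, elements and found are made up for illustration:

```r
our_lists <- list(
  a = c("A", "B", "C"),
  b = c("D", "E", "F"),
  c = c("G", "H", "I")
)

# Named vector mapping every element to the name of the vector that holds it
lookup <- setNames(
  rep(names(our_lists), lengths(our_lists)),
  unlist(our_lists, use.names = FALSE)
)

# Hypothetical batch of elements to look up in one go
elements <- c("E", "G", "A")
found <- unname(lookup[elements])
found
```

An element that is in none of the vectors comes back as NA with this approach.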

Create new vector from row index of two matching columns

I have a data frame:
a <- c(1,2,3,4,5,6)
b <- c(1,2,1,2,1,4)
c <- c("A", "B", "C", "D", "E", "F")
df <- data.frame(a,b,c)
What I want to do is create another vector d, which contains the value of c in the row of a that matches each value of b.
So my new vector would look like this:
d <- c("A", "B", "A", "B", "A", "D")
As an example, the final value of b is 4, which matches with the 4th row of a, so the value of d is the 4th row of c, which is "D".
If a and b are both vectors of integer values, you can use them directly as indices.
d <- c[b[a]]
d
[1] "A" "B" "A" "B" "A" "D"
If a is a regular integer sequence along c, you can simply index c by b.
c[b]
[1] "A" "B" "A" "B" "A" "D"
Another option is to convert to factor and use it as:
factor(a, labels = c)[b]
#[1] A B A B A D
OR
as.character(factor(a, labels = c)[b])
#[1] "A" "B" "A" "B" "A" "D"
data
a <- c(1,2,3,4,5,6)
b <- c(1,2,1,2,1,4)
c <- c("A", "B", "C", "D", "E", "F")
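
The indexing answers above rely on a being the sequence 1, 2, 3, ... so that its values double as row numbers. For the case where a holds arbitrary keys, a small sketch with match() may help. The variable names and key values below are hypothetical, chosen so the result equals the d from the question:

```r
# Hypothetical variant of the data in which the first column holds arbitrary keys
a_keys <- c(10, 20, 30, 40, 50, 60)
b_keys <- c(10, 20, 10, 20, 10, 40)
c_vals <- c("A", "B", "C", "D", "E", "F")

# match() returns, for each value of b_keys, its first position in a_keys
d <- c_vals[match(b_keys, a_keys)]
d
```

Values of b_keys that do not occur in a_keys produce NA in d.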

Returning the values of a list based on "two" parameters

Very new to R. So I am wondering if you can use two different parameters to get the position of both elements from a list. See the below example...
x <- c("A", "B", "A", "A", "B", "B", "C", "C", "A", "A", "B")
y <- c(which(x == "A"))
[1] 1 3 4 9 10
x[y]
[1] "A" "A" "A" "A" "A"
x[y+1]
[1] "B" "A" "B" "A" "B"
But I would like to return the positions of both y and y+1 together in the same list. My current solution is to merge the two above lists by row number and create a dataframe from there. I don't really like that and was wondering if there is another way. Thanks!
I don't know exactly what you want, but this could help:
newY = c(which(x == "A"),which(x == "A")+1)
After that you can sort it with
finaldata <- newY[order(newY)]
Or you do both in one step:
finaldata <- c(which(x == "A"),which(x == "A")+1)[order(c(which(x == "A"),which(x == "A")+1))]
Then you could also delete duplicates if you want to. Please tell me if this is what you wanted.
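
The steps above can be condensed, and if the goal is to see each match next to its successor, a small data frame built directly from y is one option. This is only a sketch of one reading of the question. The names pos and pairs are made up for illustration:

```r
x <- c("A", "B", "A", "A", "B", "B", "C", "C", "A", "A", "B")
y <- which(x == "A")

# All positions of "A" and of the elements right after them, sorted, no duplicates
pos <- sort(union(y, y + 1L))
pos <- pos[pos <= length(x)]   # guard in case the last element is "A"

# Alternatively, keep each match side by side with the element that follows it
pairs <- data.frame(pos = y, current = x[y], following = x[y + 1L])
pairs
```

The data frame is built in one step from y, so no merge by row number is needed.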

Return all elements of list containing certain strings

I have a list of vectors containing strings and I want R to give me another list with all vectors that contain certain strings. MWE:
list1 <- list("a", c("a", "b"), c("a", "b", "c"))
Now, I want a list that contains all vectors with "a" and "b" in it. Thus, the new list should contain two elements, c("a", "b") and c("a", "b", "c").
As list1[grep("a|b", list1)] gives me a list of all vectors containing either "a" or "b", I expected list1[grep("a&b", list1)] to do what I want, but it did not (it returned a list of length 0).
This should work:
test <- list("a", c("a", "b"), c("a", "b", "c"))
test[sapply(test, function(x) sum(c('a', 'b') %in% x) == 2)]
Try purrr::keep
library(purrr)
keep(list1, ~ all(c("a", "b") %in% .))
We can use Filter
Filter(function(x) all(c('a', 'b') %in% x), test)
#[[1]]
#[1] "a" "b"
#[[2]]
#[1] "a" "b" "c"
A solution with grepl:
> list1[grepl("a", list1) & grepl("b", list1)]
[[1]]
[1] "a" "b"
[[2]]
[1] "a" "b" "c"
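
Regular expressions have no "and" operator, which is why the pattern "a&b" searches for the literal text a&b and finds nothing. The grepl answer also matches against a text version of each vector, so substrings count as hits. If exact element matches are wanted, the %in% idea from the other answers can be wrapped in a helper. The function name contains_all and the extra test vector are hypothetical:

```r
# Hypothetical helper: keep the vectors that contain every required string
contains_all <- function(lst, required) {
  keep <- vapply(lst, function(v) all(required %in% v), logical(1))
  lst[keep]
}

# The question's list plus c("ab", "c"), which has neither "a" nor "b" as an element
list2 <- list("a", c("a", "b"), c("a", "b", "c"), c("ab", "c"))
res <- contains_all(list2, c("a", "b"))
length(res)
```

vapply is used instead of sapply here so that the result is always a logical vector, even for an empty list.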

Using fct_relevel over a list of variables using map_at

I have a bunch of factor variables that have the same levels, and I want them all reordered similarly using fct_relevel from the forcats package. Many of the variable names start with the same characters ("Q11A" to "Q11X", "Q12A" to "Q12X", "Q13A" to "Q13X", etc.). I wanted to use the starts_with function from dplyr to shorten the task. The following code didn't give me an error, but it didn't do anything either. Is there anything I'm doing wrong?
library(dplyr)
library(purrr)
library(forcats)
library(tibble)
#Setting up dataframe
f1 <- factor(c("a", "b", "c", "d"))
f2 <- factor(c("a", "b", "c", "d"))
f3 <- factor(c("a", "b", "c", "d"))
f4 <- factor(c("a", "b", "c", "d"))
f5 <- factor(c("a", "b", "c", "d"))
df <- tibble(f1, f2, f3, f4, f5)
levels(df$f1)
[1] "a" "b" "c" "d"
#Attempting to move level "c" up before "a" and "b".
df <- map_at(df, starts_with("f"), fct_relevel, "c")
levels(df$f1)
[1] "a" "b" "c" "d" #Didn't work
#If I just re-level for one variable:
fct_relevel(df$f1, "c")
[1] a b c d
Levels: c a b d
#That worked.
I think you're looking for mutate_at:
df <- mutate_at(df, vars(starts_with("f")), fct_relevel, "c")
df$f1
[1] a b c d
Levels: c a b d
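
For comparison, the same reordering can be sketched without any packages. This swaps in base R's relevel() for fct_relevel() and startsWith() for starts_with(), so it is an alternative rather than the approach from the answer. The data frame df_base and its id column are made up for illustration:

```r
# Hypothetical data frame: three factor columns plus one unrelated column
df_base <- data.frame(
  f1 = factor(c("a", "b", "c", "d")),
  f2 = factor(c("a", "b", "c", "d")),
  f3 = factor(c("a", "b", "c", "d")),
  id = 1:4
)

# Select columns by name prefix, then move level "c" to the front in each
cols <- startsWith(names(df_base), "f")
df_base[cols] <- lapply(df_base[cols], relevel, ref = "c")

levels(df_base$f1)
```

Unlike fct_relevel(), relevel() moves only a single reference level to the front.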
