I have a list that looks like this.
my_list <- list(Y = c("p", "q"), K = c("s", "t", "u"))
I want to name each element of the character vectors with the name of the list element it belongs to. All elements of the same vector must have the same name.
I was able to write this function that works on a single list element
name_vector <- function(x){
names(x[[1]]) <- rep(names(x[1]), length(x[[1]]))
return(x)
}
> name_vector(my_list[1])
$Y
Y Y
"p" "q"
But I can't find a way to vectorize it. If I run it with an apply function, it just returns the list unchanged:
> lapply(my_list, name_vector)
$Y
[1] "p" "q"
$K
[1] "s" "t" "u"
My desired output for my_list is a named vector
Y Y K K K
"p" "q" "s" "t" "u"
We can unlist the list and set the names by repeating each list name once per element of its vector:
setNames(unlist(my_list), rep(names(my_list), lengths(my_list)))
Or stack into a two column data.frame, extract the 'values' column and name it with 'ind'
with(stack(my_list), setNames(values, ind))
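As a rough sketch of why the first approach works, and of what a per-element version could look like: lapply() hands the function only the bare vector, without its list name, so the name has to be passed in separately. The Map() variant below is just one way to do that, not the only one; `res` and `res2` are illustrative variable names.

```r
my_list <- list(Y = c("p", "q"), K = c("s", "t", "u"))

# lengths() gives the number of elements in each vector: Y = 2, K = 3,
# so rep() repeats each list name once per element.
nms <- rep(names(my_list), lengths(my_list))
res <- setNames(unlist(my_list), nms)

# Per-element alternative: Map() passes each vector together with its name.
# unname() drops the outer list names so unlist() keeps only the inner ones.
res2 <- unlist(unname(Map(function(v, nm) setNames(v, rep(nm, length(v))),
                          my_list, names(my_list))))

names(res)   # "Y" "Y" "K" "K" "K"
names(res2)  # "Y" "Y" "K" "K" "K"
```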
If your list names don't themselves end in digits, you can unlist and then strip the numeric suffixes that unlist() appends:
vec <- unlist(my_list)
names(vec) <- sub("\\d+$","",names(vec))
vec
# Y Y K K K
# "p" "q" "s" "t" "u"
I am trying to get a vector of the unique elements of two vectors that respects the order of both of the original vectors.
The vectors are both sampled from a longer "hidden" vector that only contains unique entries (i.e. no repeats are allowed), which ensures that v1 and v2 have a compatible order (i.e. v1 <- c("Z", "A", ...) and v2 <- c("A", "Z", ...) cannot occur).
The order is arbitrary, so I cannot use any simple order() or sort().
An example below:
v1 <- c("Z", "A", "F", "D")
v2 <- c("A", "T", "F", "Q", "D")
Result desired:
c("Z", "A", "T", "F", "Q", "D") or
Further explanation: v1 establishes the relationship
"Z" < "A" < "F" < "D"
and v2 states
"A" < "T" < "F" < "Q" < "D"
so the sequence that satisfies v1 and v2 is
"Z" < "A" < "T" < "F" < "Q" < "D"
I understand this case is fully determined (the two vectors do completely define the order of all elements), but there would be cases when this is not enough. In that case, any permutation that respects the two sets of ordering would be a satisfactory solution.
Any tips will be appreciated.
You can take the unique elements of c(v1, v2) and reorder them with match() against v2 and then v1, repeating until nothing changes.
x <- unique(c(v1, v2))
repeat {
y <- x
i <- match(v2, x)
x[sort(i)] <- x[i]
i <- match(v1, x)
x[sort(i)] <- x[i]
if(identical(x, y)) break;
}
x
#[1] "Z" "A" "T" "F" "Q" "D"
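To check whether a candidate result respects both orderings, something like the following could be used. `respects_order` is a hypothetical helper name, not part of base R; it only tests that the elements of one input vector appear in the candidate in the same relative order.

```r
v1 <- c("Z", "A", "F", "D")
v2 <- c("A", "T", "F", "Q", "D")

# Hypothetical helper: TRUE when every element of `v` is present in `x`
# and the elements appear in `x` in the same relative order as in `v`.
respects_order <- function(x, v) {
  pos <- match(v, x)
  !anyNA(pos) && !is.unsorted(pos, strictly = TRUE)
}

candidate <- c("Z", "A", "T", "F", "Q", "D")
respects_order(candidate, v1)          # TRUE
respects_order(candidate, v2)          # TRUE

# The plain concatenation is not enough: "T" lands after "F" and "D".
respects_order(unique(c(v1, v2)), v2)  # FALSE
```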
Alternatively, you can take the letters common to v1 and v2 as anchor points and join the subsets of v1 and v2 that lie between them:
i <- v2[na.omit(match(v1, v2))]
j <- c(0, match(i, v2))
i <- c(0, match(i, v1))
unique(c(unlist(lapply(seq_along(i)[-1], function(k) {
c(v1[head((i[k-1]:i[k]), -1)], v2[head((j[k-1]:j[k])[-1], -1)])
})), v1, v2))
#[1] "Z" "A" "T" "F" "Q" "D"
For this example the following code works. First define auxiliary vectors w1 and w2, depending on which vector starts with the first common element, and a vector w into which the missing elements are inserted in order.
A for loop would be clearer and would avoid this cumbersome code, but this version is shorter.
w <- w1 <- unlist(ifelse(intersect(v1,v2)[1] == v1[1], list(v2), list(v1)))
w2 <- unlist(ifelse(intersect(v1,v2)[1] == v1[1], list(v1), list(v2)))
unique(lapply(setdiff(w2,w1), function(elmt) w <<- append(w, elmt, after = match(w2[match(elmt,w2)-1],w)))[[length(setdiff(w2,w1))]])
[1] "Z" "A" "T" "F" "Q" "D"
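For comparison, here is a minimal sketch of the for-loop version mentioned above. It assumes v1 and v2 have a compatible order, as the question states, and simply starts from v1; `merge_orders` is a made-up name for illustration.

```r
v1 <- c("Z", "A", "F", "D")
v2 <- c("A", "T", "F", "Q", "D")

# Start from v1 and insert each element that only occurs in v2
# immediately after its predecessor in v2.
merge_orders <- function(v1, v2) {
  w <- v1
  for (elmt in setdiff(v2, v1)) {
    k <- match(elmt, v2)
    # Position in w of the element that precedes elmt in v2;
    # 0 means elmt is the first element of v2 and goes to the front.
    after <- if (k == 1) 0 else match(v2[k - 1], w)
    w <- append(w, elmt, after = after)
  }
  w
}

merge_orders(v1, v2)
# [1] "Z" "A" "T" "F" "Q" "D"
```

Because setdiff() keeps the order of v2, the predecessor of each inserted element is already in w by the time it is needed.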
I have a list (my.list) and a data frame (my.dataframe). The names of each element within my.list are of a sequence and are of the same type as the elements within two variables in my.dataframe. I want to pull out elements of the list whose names fall at, within, or just outside the range of the elements of two columns in my.dataframe.
RNGkind('Mersenne-Twister')
set.seed(1)
#Create my.dataframe
my.letters <- sample(x = sample(LETTERS[1:20],
size = 13,
replace = FALSE),
size = 100,
replace = TRUE)
my.other.letters <- LETTERS[match(my.letters, LETTERS) +
sample(x = 0:5,
size = 100,
replace = TRUE)]
my.dataframe <- data.frame(col1 = my.letters,
col2 = my.other.letters)
head(my.dataframe)
col1 col2
1 D F
2 C C
3 O O
4 A E
5 T T
6 D F
#So here, I'd want to pull out elements within my.list whose names fall between D
#and F for the first row, C for the second row, O for the third, A and E for the fourth,
#and so on.
#Create my.list
temp.data <- data.frame(a = rnorm(13*20, 10, 1),
b = rep(LETTERS[sample(1:length(LETTERS),
size = 13,
replace = FALSE)],
each = 20))
my.list <- split(x = temp.data$a, f = factor(temp.data$b))
I've used mapply() to try and do this:
mega.list <- mapply(function(f, s)my.list[which(LETTERS == f):which(LETTERS == s)], f = my.dataframe$col1, s = my.dataframe$col2)
But it only works if col1, col2, and the names of the elements in my.list have all the letters of the alphabet, but they don't. If you look at mega.list[[98]], you've got an empty list because it's looking for names within my.list that fall between T and Y(my.dataframe[98,]). Seeing as there isn't a list element whose name is T, you get nothing.
sort(unique(as.character(my.dataframe$col1))); sort(unique(as.character(my.dataframe$col2))); sort(unique(names(my.list)))
[1] "A" "B" "C" "D" "F" "H" "I" "K" "N" "O" "P" "S" "T"
[1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O" "P" "Q" "R" "S" "T" "U" "V" "X" "Y"
[1] "A" "B" "D" "E" "F" "G" "H" "J" "K" "R" "S" "W" "Z"
Question: If the exact letter name isn't available within my.list, is there a way to select the next closest letter before or after the letters in col1 or col2, respectively? For example, if it tries to look for a letter N from col1, how could I get it to select K instead? Likewise, if it's trying to find U from col2, how can I get it to look for W instead?
I figured it out. I had to amend the mapply() call: the first which() now finds all names at or before f and takes the last one (with tail()), and the second which() finds all names at or after s and takes the first one (with [1]).
mega.list <- mapply(function(f, s)my.list[tail(which(names(my.list) <= f), n = 1):which(names(my.list) >= s)[1]], f = as.character(my.dataframe$col1), s = as.character(my.dataframe$col2))
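To see the two lookups in isolation, here is a small sketch using the list names printed in the question. `lower` and `upper` are illustrative helper names, and the comparison relies on the usual alphabetical ordering of single upper-case letters.

```r
# Names of my.list as printed in the question
nms <- c("A", "B", "D", "E", "F", "G", "H", "J", "K", "R", "S", "W", "Z")

# Index of the last name at or before f.
# Note: returns integer(0) if f sorts before every name.
lower <- function(f) tail(which(nms <= f), n = 1)

# Index of the first name at or after s.
# Note: returns NA if s sorts after every name.
upper <- function(s) which(nms >= s)[1]

nms[lower("N")]              # "K"
nms[upper("U")]              # "W"
nms[lower("T"):upper("Y")]   # "S" "W" "Z"
```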
I have a vector in which I want to replace one element with multiple elements. I am able to replace it with one element but not with several. Can anyone help?
For example I have
data <- c('a', 'x', 'd')
> data
[1] "a" "x" "d"
I want to replace "x" with "b", "c" to get
[1] "a" "b" "c" "d"
However
gsub('x', c('b', 'c'), data)
gives me
[1] "a" "b" "d"
Warning message:
In gsub("x", c("b", "c"), data) :
argument 'replacement' has length > 1 and only the first element will
be used
Here's how I would tackle it:
data <- c('a', 'x', 'd')
lst <- as.list(data)
unlist(lapply(lst, function(x) if(x == "x") c("b", "c") else x))
# [1] "a" "b" "c" "d"
We're making use of the fact that list-structures are more flexible than atomic vectors. We can replace a length-1 list-element with a length>1 element and then unlist the result to go back to an atomic vector.
Since you want to replace exact matches of "x" I prefer not to use sub/gsub in this case.
You may try this, although I believe the accepted answer is great:
unlist(strsplit(gsub("x", "b c", data), split = " "))
Logic: replace "x" with "b c" (separated by a space), split on the space with strsplit, and then convert the result back to a vector with unlist.
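One caveat worth illustrating: gsub() matches "x" anywhere inside a string, whereas the `==` comparison only matches whole elements. The vector below is a made-up example, not the asker's data, chosen to show the difference.

```r
data <- c("a", "x", "fox")

# Exact-match approach: only the element that is exactly "x" is replaced.
exact <- unlist(lapply(data, function(el) if (el == "x") c("b", "c") else el))
exact
# [1] "a"   "b"   "c"   "fox"

# Regex approach: the "x" inside "fox" is replaced and split as well.
regex <- unlist(strsplit(gsub("x", "b c", data), split = " "))
regex
# [1] "a"   "b"   "c"   "fob" "c"
```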
This is a slightly tricky problem because the replacement also has to grow your vector. That being said, I believe this should work:
replacement <- c("b","c")
new_data <- rep(data, times = ifelse(data=="x", length(replacement), 1))
new_data[new_data=="x"] <- replacement
new_data
#[1] "a" "b" "c" "d"
This will also work if you have multiple "x"s in your vector like:
data <- c("a","x","d","x")
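Spelled out for that second vector, the same steps would run like this. It works because the positions holding "x" come in consecutive groups of length(replacement), so the replacement is recycled cleanly across them.

```r
data <- c("a", "x", "d", "x")
replacement <- c("b", "c")

# Repeat each "x" once per replacement value; keep everything else once.
new_data <- rep(data, times = ifelse(data == "x", length(replacement), 1))
new_data
# [1] "a" "x" "x" "d" "x" "x"

# The four "x" slots are filled by recycling c("b", "c").
new_data[new_data == "x"] <- replacement
new_data
# [1] "a" "b" "c" "d" "b" "c"
```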
Another approach:
data <- c('a', 'x', 'd')
pattern <- "x"
replacement <- c("b", "c")
sub_one_for_many <- function(pattern, replacement, x) {
# Index of the first exact match (NA if there is none)
removal_index <- which(x == pattern)[1]
if (is.na(removal_index)) return(x)
# Drop the matched element and splice the replacement in at its position;
# this also handles a match in the first or last position
result <- append(x[-removal_index], replacement, after = removal_index - 1)
return(result)
}
answer <- sub_one_for_many(pattern, replacement, data)
Output:
> answer
[1] "a" "b" "c" "d"
I started with a matrix
Xray Stay Leave
[1,] "H" "H" "H"
[2,] "A" "L" "O"
And I have the following vector:
[1] "H" "L"
I want to get the output
"Stay".
I tried this:
which(vec %in% matrix )
but that gives me the following output:
[1] 1 2
Seems to be just telling me the rows that it finds the H and L in. I need the column name of the one that is an exact match.
Another approach:
vec <- c("H", "L")
colnames(mat)[colMeans(mat == vec) == 1]
# [1] "Stay"
where mat is the name of your matrix.
Assuming m is your matrix, you could do
> vec <- c("H", "L")
> colnames(m)[apply(m, 2, identical, vec)]
NOTE: identical used here because the original post says "I need the column name of the one that is an exact match"
This should return a logical vector:
logv <- apply(mat, 2, function(x) identical(vec,x))
Then this will select the correct column name:
dimnames(mat)[[2]][logv]
[1] "Stay"
Test case:
mat <- matrix( c( "H","H", "H", "A","L","O") ,2, byrow=TRUE,
dimnames=list(NULL, c('Xray', 'Stay', 'Leave') ) )
You could try:
If mat and vec are the matrix and vector
colnames(mat)[table(mat %in% vec, (seq_along(mat)-1)%/%nrow(mat) +1)[2,] >1]
#[1] "Stay"
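The approaches above are not all equivalent. As a sketch using the test matrix from the earlier answer, the comparison-based version requires the same order as the vector, while the %in%-based version only counts how many entries of each column occur in the vector. The reversed vector `rev_vec` is a made-up input to show the difference.

```r
mat <- matrix(c("H", "H", "H", "A", "L", "O"), 2, byrow = TRUE,
              dimnames = list(NULL, c("Xray", "Stay", "Leave")))

# Same letters as the asker's vector, but in reverse order
rev_vec <- c("L", "H")

# Position-wise comparison: no column equals c("L", "H") in that order.
ordered_match <- colnames(mat)[colMeans(mat == rev_vec) == 1]
ordered_match
# character(0)

# Membership count per column: order is ignored, so "Stay" still matches.
col_index <- (seq_along(mat) - 1) %/% nrow(mat) + 1
unordered_match <- colnames(mat)[table(mat %in% rev_vec, col_index)[2, ] > 1]
unordered_match
# [1] "Stay"
```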
I have a list
myList <- list(matrix(letters[1:4], nrow=2), matrix(letters[5:8], nrow=2))
names(myList) <- c("xx", "yy")
I want to rbind this list of matrices, along with the names xx and yy, using Reduce. The problem I have is that Reduce goes directly to myList[[i]], so it loses the names if I pass myList directly. I'm guessing the solution is some combination of creating more 'layers' and clever use of [, but I can't seem to figure it out.
The desired output is
"xx"
"a" "c"
"b" "d"
"yy"
"e" "g"
"f" "h"
library(MASS)
for( nm in names(myList)){ cat(nm,"\n"); write.matrix(myList[[nm]]) }
xx
a c
b d
yy
e g
f h
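If the result is needed as a single matrix rather than printed output, one possible sketch is to attach each name as a header row before calling Reduce. Padding the header row with an empty string is an assumption made here so that every row has two columns; the question does not say how the name rows should be filled.

```r
myList <- list(matrix(letters[1:4], nrow = 2), matrix(letters[5:8], nrow = 2))
names(myList) <- c("xx", "yy")

# Put each name on top of its matrix first (header row padded with ""),
# then let Reduce() stack the pieces with rbind.
with_headers <- Map(function(nm, m) rbind(c(nm, ""), m), names(myList), myList)
out <- Reduce(rbind, with_headers)

out[, 1]  # "xx" "a" "b" "yy" "e" "f"
out[, 2]  # ""   "c" "d" ""   "g" "h"
```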