R sort multiple (linked) lists

I have two lists with the same number of elements, and they are linked.
For example, one list stores the ID numbers of the students, while the other stores the exam marks of these students.
I want to sort the exam marks from smallest to largest, but I do not want to lose the one-to-one link between the student IDs and their marks. How can I do this in R?

Let's assume a list like this:
a <- list(id = c("c", "a", "b"), score = c(2, 9, 12))
> a
$id
[1] "c" "a" "b"
$score
[1] 2 9 12
Then you can reorder every element with the same ordering index using lapply:
> lapply(a, function(x)x[order(a$id)])
$id
[1] "a" "b" "c"
$score
[1] 9 12 2
> lapply(a, function(x)x[order(a$score, decreasing=TRUE)])
$id
[1] "b" "a" "c"
$score
[1] 12 9 2
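If the two vectors always travel together, another option is to keep them in a data frame, so that one row holds one student and the link cannot be broken by sorting. A minimal sketch, assuming made-up IDs and marks (the names marks and marks_sorted are only for illustration):

```r
# Illustrative data: IDs and marks are invented for this sketch
marks <- data.frame(
  id    = c("s3", "s1", "s2"),
  score = c(9, 2, 12),
  stringsAsFactors = FALSE
)

# order() returns the row positions that put score in ascending order;
# indexing the rows with it moves id and score together
marks_sorted <- marks[order(marks$score), ]
print(marks_sorted)
```

Use order(marks$score, decreasing = TRUE) for largest to smallest.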

Related

Pasting together two vectors in R elementwise

I am trying to build a web scraper that'll go through a number of websites and extract some information. In order to do this, I'm going to need to make a list of the last couple of characters in the url, so I can map the information into a data frame. To do this, I've found out that the end of the url should have a number between 1 and 10 to identify a department, and a given character to identify a unit. Therefore I've made two vectors with the needed units and department IDs:
departments <- c(1, 2, 9, 10)
units <- c("A", "B", "C", "F", "I", "O", "V")
To go through each website, I need a list that combines these two vectors in each possible way, like for instance "1A", "1B", "1C", "1F", "1I", "1O", "1V", "2A" and so forth.
I've tried different solutions, but they do not return what I expected, like for instance:
> depUn <- as.list(paste(departments, units, sep = ""))
> depUn
[[1]]
[1] "1A"
[[2]]
[1] "2B"
[[3]]
[1] "9C"
[[4]]
[1] "10F"
[[5]]
[1] "1I"
[[6]]
[1] "2O"
[[7]]
[1] "9V"
Does someone have any good insight into how I could solve this?
EDIT
I've already tried the expand.grid option, and while it was successful in putting the elements after one another in a list, it failed to put them together in one string. Can someone help me accomplish this? Here's an excerpt of the code and results I tried back then:
> dfDepUn <- expand.grid(departments, units)
> DepUn <- lapply(apply(dfDepUn, 1, identity), unlist)
> DepUn
[[1]]
Var1 Var2
"1" "A"
[[2]]
Var1 Var2
"2" "A"
[[3]]
Var1 Var2
"9" "A"
[[4]]
Var1 Var2
"10" "A"
Is this what you need?
df <- expand.grid(departments, units)
as.list(paste0(df$Var1, df$Var2))
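Note that expand.grid varies its first argument fastest, so the result above starts "1A", "2A", "9A", "10A", "1B". If the order from the question ("1A", "1B", "1C", ...) matters, one possible base-R sketch repeats each department once per unit and lets paste0 recycle the units:

```r
departments <- c(1, 2, 9, 10)
units <- c("A", "B", "C", "F", "I", "O", "V")

# Repeat each department length(units) times: 1,1,...,1,2,2,...
# units (length 7) is recycled to match the 28 repeated departments
depUn <- paste0(rep(departments, each = length(units)), units)

head(depUn, 8)
length(depUn)
```

Wrap the result in as.list() if a list rather than a character vector is needed.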

How to sort data according to sectors and subsectors?

I would like to sort the data x (here: 1:12) according to the sectors sec and subsectors ssec. Below is an example showing how this can be done. The question is whether this can be done more elegantly (ideally with a base-R function, without additional packages).
## Data
set.seed(17)
(sec <- sample(rep(LETTERS[1:3], each = 4))) # 3 sectors
(ssec <- rep(sample(1:4, 12, replace = TRUE))) # 4 subsectors
x <- 1:12 # data to sort according to increasing sectors and subsectors
## Sort according to sectors
ord <- order(sec)
x. <- x[ord]
sec. <- sec[ord]
ssec. <- ssec[ord]
## Sort according to subsectors
usec. <- unique(sec.)
x.. <- x.
ssec.. <- ssec.
for(grp in usec.) {
    ii <- sec. == grp # indices of components in that sector
    ord. <- order(ssec.[ii])
    x..[ii] <- x.[ii][ord.]
    ssec..[ii] <- ssec.[ii][ord.]
}
## Result
x..
sec.
ssec..
The order function from base R also accepts multiple arguments. From ?order:
order returns a permutation which rearranges its first argument into
ascending or descending order, breaking ties by further arguments.
To demonstrate, we can check how order(sec, ssec) sorts the sectors and subsectors.
Here is the original sec and ssec:
sec
[1] "B" "C" "A" "B" "A" "B" "C" "C" "C" "A" "B" "A"
ssec
[1] 3 1 3 2 1 2 4 1 3 2 1 4
After applying the ordered index, sec is sorted alphabetically and ssec is sorted within each sec, which means the index order(sec, ssec) is the sorting index expected:
sec[order(sec, ssec)]
[1] "A" "A" "A" "A" "B" "B" "B" "B" "C" "C" "C" "C"
ssec[order(sec, ssec)]
[1] 1 2 3 4 1 2 2 3 1 1 3 4
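The same index then sorts the data x in one step, replacing the loop from the question. A short sketch, with the sec and ssec values typed in from the output above so that it does not depend on the random seed:

```r
sec  <- c("B", "C", "A", "B", "A", "B", "C", "C", "C", "A", "B", "A")
ssec <- c(3, 1, 3, 2, 1, 2, 4, 1, 3, 2, 1, 4)
x    <- 1:12

# One ordering index: sort by sector, break ties by subsector
ord <- order(sec, ssec)

x_sorted    <- x[ord]
sec_sorted  <- sec[ord]
ssec_sorted <- ssec[ord]

# Keep the three vectors side by side to inspect the result
data.frame(x = x_sorted, sec = sec_sorted, ssec = ssec_sorted)
```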

Random sequence from fixed ensemble that contains at least one of each character

I am trying to generate a random sequence from a fixed number of characters that contains at least one of each character.
For example, given the ensemble
m = letters[1:3]
I would like to create a sequence of N = 10 elements that contains at least one of each of the characters in m, like
a
a
a
a
b
c
c
c
c
a
I tried sample(m, N, replace = TRUE), but this way a sequence like
a
a
a
a
a
c
c
c
c
a
can be generated that does not contain b.
f <- function(x, n){
    # include every element of x once, fill the rest at random, then shuffle
    sample(c(x, sample(x, n-length(x), replace=TRUE)))
}
f(letters[1:3], 5)
# [1] "a" "c" "a" "b" "a"
f(letters[1:3], 5)
# [1] "a" "a" "b" "b" "c"
f(letters[1:3], 5)
# [1] "a" "a" "b" "c" "a"
f(letters[1:3], 5)
# [1] "b" "c" "b" "c" "a"
Josh O'Brien's answer is a good way to do it but doesn't provide much input checking. Since I had already written mine, I might as well present it. It's pretty much the same thing, but it only considers unique items and makes sure n is large enough to hold at least one of each.
at_least_one_samp <- function(n, input){
    # Only consider unique items.
    items <- unique(input)
    unique_items_count <- length(items)
    if(unique_items_count > n){
        stop("n is too small to give at least one of each unique item")
    }
    # Get values for vector - force each item in at least once
    # then randomly select values to get the remaining.
    vals <- c(items, sample(items, n - unique_items_count, replace = TRUE))
    # Now shuffle them
    sample(vals)
}
m <- c("a", "b", "c")
at_least_one_samp(10, m)
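To convince yourself that the construction really guarantees every character, it can help to run it many times and check each draw. A small sketch of that check (the seed and the number of repetitions are arbitrary choices for illustration):

```r
set.seed(42)  # arbitrary seed, only so the run is reproducible
m <- letters[1:3]
N <- 10

# Same idea as above: every element of m once, random fill, then shuffle
draw <- function() sample(c(m, sample(m, N - length(m), replace = TRUE)))

# Repeat 1000 times and test that each draw contains all of m
draws   <- replicate(1000, draw(), simplify = FALSE)
all_ok  <- all(vapply(draws, function(s) all(m %in% s), logical(1)))
lengths_ok <- all(lengths(draws) == N)

all_ok
lengths_ok
```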

Shuffling a vector - all possible outcomes of sample()?

I have a vector with five items.
my_vec <- c("a","b","a","c","d")
If I want to re-arrange those values into a new vector (shuffle), I could use sample():
shuffled_vec <- sample(my_vec)
Easy - but the sample() function only gives me one possible shuffle. What if I want to know all possible shuffling combinations? The various "combn" functions don't seem to help, and expand.grid() gives me every possible combination with replacement, when I need it without replacement. What's the most efficient way to do this?
Note that my vector contains the value "a" twice, so each shuffled vector returned should also contain "a" twice.
I think permn from the combinat package does what you want:
library(combinat)
permn(my_vec)
A smaller example
> x
[1] "a" "a" "b"
> permn(x)
[[1]]
[1] "a" "a" "b"
[[2]]
[1] "a" "b" "a"
[[3]]
[1] "b" "a" "a"
[[4]]
[1] "b" "a" "a"
[[5]]
[1] "a" "b" "a"
[[6]]
[1] "a" "a" "b"
If the duplicates are a problem, you could do something like this to get rid of them:
strsplit(unique(sapply(permn(my_vec), paste, collapse = ",")), ",")
Or, probably a better approach to removing duplicates, bind the permutations into a matrix and keep only the rows that are not duplicated:
dat <- do.call(rbind, permn(my_vec))
dat[!duplicated(dat),]
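If installing combinat is not an option, the distinct shuffles can also be generated directly in base R, without creating duplicates first. A minimal recursive sketch; distinct_perms is a hypothetical helper written for this illustration, not a function from any package:

```r
# All distinct orderings of v, even when v has repeated values.
# Each distinct value is tried once as the first element, then the
# remaining elements are permuted recursively.
distinct_perms <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (val in unique(v)) {
    i <- match(val, v)  # first position holding this value
    for (rest in distinct_perms(v[-i])) {
      out[[length(out) + 1]] <- c(val, rest)
    }
  }
  out
}

my_vec <- c("a", "b", "a", "c", "d")
perms <- distinct_perms(my_vec)
length(perms)  # 5! / 2! = 60 distinct shuffles
```

Growing a list inside a loop is fine for short vectors like this one; the number of permutations grows factorially, so any approach becomes impractical beyond roughly ten elements.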
Noting that your data is effectively 5 levels from 1-5, encoded as "a", "b", "a", "c", and "d", I went looking for ways to get the permutations of the numbers 1-5 and then remap those to the levels you use.
Let's start with the input data:
my_vec <- c("a","b","a","c","d") # the character
my_vec_ind <- seq_along(my_vec) # their identifier
To get the permutations, I applied the function given at Generating all distinct permutations of a list in R:
permutations <- function(n){
    if(n==1){
        return(matrix(1))
    } else {
        sp <- permutations(n-1)
        p <- nrow(sp)
        A <- matrix(nrow=n*p,ncol=n)
        for(i in 1:n){
            A[(i-1)*p+1:p,] <- cbind(i,sp+(sp>=i))
        }
        return(A)
    }
}
First, create a data.frame with the permutations:
tmp <- data.frame(permutations(length(my_vec)))
You now have a data frame tmp of 120 rows, where each row is a unique permutation of the numbers 1-5:
> tmp
X1 X2 X3 X4 X5
1 1 2 3 4 5
2 1 2 3 5 4
3 1 2 4 3 5
...
119 5 4 3 1 2
120 5 4 3 2 1
Now you need to remap them to the strings you had. You can remap them using a variation on the theme of gsub(), proposed here: R: replace characters using gsub, how to create a function?
gsub2 <- function(pattern, replacement, x, ...) {
    for(i in seq_along(pattern))
        x <- gsub(pattern[i], replacement[i], x, ...)
    x
}
gsub() won't work because you have more than one value in the replacement array.
You also need a function you can call using lapply() to use the gsub2() function on every element of your tmp data.frame.
remap <- function(x, old, new){
    return(gsub2(pattern = old,
                 replacement = new,
                 fixed = TRUE,
                 x = as.character(x)))
}
Almost there. We do the mapping like this:
shuffled_vec <- as.data.frame(lapply(tmp,
                                     remap,
                                     old = as.character(my_vec_ind),
                                     new = my_vec))
which can be simplified to...
shuffled_vec <- as.data.frame(lapply(data.frame(permutations(length(my_vec))),
                                     remap,
                                     old = as.character(my_vec_ind),
                                     new = my_vec))
.. should you feel the need.
That gives you your required answer:
> shuffled_vec
X1 X2 X3 X4 X5
1 a b a c d
2 a b a d c
3 a b c a d
...
119 d c a a b
120 d c a b a
Looking at a previous question (R: generate all permutations of vector without duplicated elements), I can see that the gtools package has a function for this. I couldn't however get this to work directly on your vector as such:
library(gtools)
permutations(n = 5, r = 5, v = my_vec)
#Error in permutations(n = 5, r = 5, v = my_vec) :
# too few different elements
You can adapt it however like so:
apply(permutations(n = 5, r = 5), 1, function(x) my_vec[x])
# [,1] [,2] [,3] [,4]
#[1,] "a" "a" "a" "a" ...
#[2,] "b" "b" "b" "b" ...
#[3,] "a" "a" "c" "c" ...
#[4,] "c" "d" "a" "d" ...
#[5,] "d" "c" "d" "a" ...
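The same indexing idea also works without gtools. A sketch, assuming the recursive base-R helper from the earlier answer (renamed perm_index here only to avoid a name clash with gtools::permutations): index my_vec with the whole matrix of positions, then drop the repeated rows caused by the two "a" values.

```r
# Base-R helper copied from the earlier answer: each row of the result
# is one permutation of the positions 1..n
perm_index <- function(n) {
  if (n == 1) return(matrix(1))
  sp <- perm_index(n - 1)
  p <- nrow(sp)
  A <- matrix(nrow = n * p, ncol = n)
  for (i in 1:n) {
    A[(i - 1) * p + 1:p, ] <- cbind(i, sp + (sp >= i))
  }
  A
}

my_vec <- c("a", "b", "a", "c", "d")
idx <- perm_index(length(my_vec))

# Indexing with a matrix of positions returns a plain vector in
# column-major order, so reshape it back to one shuffle per row
shuffled <- matrix(my_vec[idx], nrow = nrow(idx))

# unique() on a matrix drops duplicated rows
distinct <- unique(shuffled)
dim(shuffled)
dim(distinct)
```

This avoids the string replacement step entirely, since the positions are used directly as indices.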

R dynamic lists of lists

Hi, I'm new to R, and for a school project I'm trying to create a list of lists that I can access by index and append to. Something like
aList[1] = A, B, C
aList[1] returns [1] A, B, C
aList[1] += D
aList[1] returns [1] A, B, C, D
aList[2] = 1, 2, 3
aList[2] returns [2] 1, 2, 3
aList returns [1] A, B, C, D
[2] 1, 2, 3
However, I'm not sure if I'm using the right datatype (and definitely not the proper syntax) as everything I've tried just either makes a single index of a list or makes multiple indexes of one item.
This isn't the homework. This shouldn't even be an issue but I can't find a solution.
Lists in R are different from atomic vectors: each item in an atomic vector can only be a basic type like a number or a string, while a list can contain vectors or other lists. It sounds like you want to create a list of vectors. This could be done as:
> aList = list(c("A", "B", "C"), c(1, 2, 3))
> aList[[1]]
[1] "A" "B" "C"
> aList[[1]] = c(aList[[1]], "D")
> aList[[1]]
[1] "A" "B" "C" "D"
> aList[[2]]
[1] 1 2 3
> aList
[[1]]
[1] "A" "B" "C" "D"
[[2]]
[1] 1 2 3
Note that you normally access a single item of a list using double brackets, like [[1]]. If you use single brackets, you get a sub-list (a list of length one) rather than the item itself:
> aList[1]
[[1]]
[1] "A" "B" "C" "D"
That isn't what you want if you want to modify that item.
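To grow the list itself (the "dynamic" part of the question), assign to the position one past the current end. A short sketch that starts from a single item and adds the second one later; the variable names are only illustrative:

```r
# Start with one item and grow from there
aList <- list(c("A", "B", "C"))

# Append a value to the vector stored in item 1
aList[[1]] <- c(aList[[1]], "D")

# Append a whole new item at the end of the list
aList[[length(aList) + 1]] <- c(1, 2, 3)

# Single vs double brackets: sub-list vs the item itself
class(aList[1])    # "list"
class(aList[[1]])  # "character"

str(aList)
```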
