Create list with specific iteration in R - r

I have the following dataset containing dates:
> dates
[1] "20180412" "20180424" "20180506" "20180518" "20180530" "20180611" "20180623" "20180705" "20180717" "20180729"
I am trying to create a list where in each position, the name is 'Coherence_' + the first and second dates in dates. So in output1[1] I would have Coherence_20180412_20180424. Then in output1[2] I would have Coherence_20180506_20180518, etc.
I am starting with this code, but it is not working the way I need:
output1<-list()
for (i in 1:5){
output1[[i]]<-paste("-Poutput1=", S1_Out_Path,"Coherence_VV_TC", dates[[i]],"_", dates[[i+1]], ".tif", sep="")
}
Do you have any suggestions?

Try this:
Without loop
even_indexes<-seq(2,10,2) # List of even indexes
odd_indexes<-seq(1,10,2) # List of odd indexes
print(paste('Coherence',paste(odd_indexes,even_indexes,sep = "_"),sep = "_"))
Link answer from here: Create list in R with specific iteration
Update (to get the result as a list):
lst <- as.list(paste('Coherence', paste(odd_indexes, even_indexes, sep = "_"), sep = "_"))
OR
a=c(1:10)
for (i in seq(1, 9, 2)){
print(paste('Coherence',paste(a[i],a[i+1],sep = "_"),sep = "_"))
}
Output:
[1] "Coherence_1_2"
[1] "Coherence_3_4"
[1] "Coherence_5_6"
[1] "Coherence_7_8"
[1] "Coherence_9_10"
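The snippets above paste the index numbers themselves. A minimal sketch of the same idea applied to the dates vector from the question, with the result stored as a list, might look like this. It assumes the number of dates is even, and the as.list call is an addition for illustration:

```r
dates <- c("20180412", "20180424", "20180506", "20180518", "20180530",
           "20180611", "20180623", "20180705", "20180717", "20180729")

# Positions of the first and second date of each non-overlapping pair
odd_indexes  <- seq(1, length(dates), by = 2)
even_indexes <- seq(2, length(dates), by = 2)

# paste() is vectorized, so no loop is needed; as.list() turns the
# resulting character vector into a list with one name per element
output1 <- as.list(paste("Coherence", dates[odd_indexes], dates[even_indexes], sep = "_"))
output1[[2]]
```

Here output1[[1]] pairs the 1st and 2nd dates, output1[[2]] the 3rd and 4th, and so on.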

You can create these patterns by taking advantage of the fact that paste is vectorized:
dates <- c("20180412", "20180424", "20180506", "20180518", "20180530",
"20180611", "20180623", "20180705", "20180717", "20180729")
# drop the last element and the first element, respectively, to form overlapping pairs
paste("Coherence", dates[-length(dates)], dates[-1], sep="_")
[1] "Coherence_20180412_20180424" "Coherence_20180424_20180506" "Coherence_20180506_20180518"
[4] "Coherence_20180518_20180530" "Coherence_20180530_20180611" "Coherence_20180611_20180623"
[7] "Coherence_20180623_20180705" "Coherence_20180705_20180717" "Coherence_20180717_20180729"
Or other simple patterns can be generated as:
paste("Coherence", dates[seq(1, length(dates), 2)], dates[seq(2, length(dates), 2)], sep="_")
[1] "Coherence_20180412_20180424" "Coherence_20180506_20180518" "Coherence_20180530_20180611"
[4] "Coherence_20180623_20180705" "Coherence_20180717_20180729"

You can use matrix(..., nrow=2):
dates <- c("20180412", "20180424", "20180506", "20180518", "20180530", "20180611", "20180623", "20180705", "20180717", "20180729")
paste0("Coherence_", apply(matrix(dates, 2), 2, FUN=paste0, collapse="_"))
# > paste0("Coherence_", apply(matrix(dates, 2), 2, FUN=paste0, collapse="_"))
# [1] "Coherence_20180412_20180424" "Coherence_20180506_20180518" "Coherence_20180530_20180611" "Coherence_20180623_20180705"
# [5] "Coherence_20180717_20180729"
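This works because matrix() fills column by column, so each column of a two-row matrix holds one consecutive pair of dates. A sketch of the same idea without apply, assuming the number of dates is even (the names m and pairs are illustrative):

```r
dates <- c("20180412", "20180424", "20180506", "20180518", "20180530",
           "20180611", "20180623", "20180705", "20180717", "20180729")

# Two rows, filled column-wise: column k holds dates 2k-1 and 2k
m <- matrix(dates, nrow = 2)

# Row 1 holds the 1st, 3rd, 5th, ... date; row 2 holds the 2nd, 4th, 6th, ...
pairs <- paste("Coherence", m[1, ], m[2, ], sep = "_")
pairs
```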

Related

how to sort list.files() in correct date order?

Using plain list.files() in the working directory returns the file list, but the numeric order is messed up.
f <- list.files(pattern="*.nc")
f
# [1] "te1971-1.nc" "te1971-10.nc" "te1971-11.nc" "te1971-12.nc"
# [5] "te1971-2.nc" "te1971-3.nc" "te1971-4.nc" "te1971-5.nc"
# [9] "te1971-6.nc" "te1971-7.nc" "te1971-8.nc" "te1971-9.nc"
where the number after "-" describes the month number.
I used the following to try to sort it
myFiles <- paste("te", i, "-", c(1:12), ".nc", sep = "")
mixedsort(myFiles)
it returns ordered files but in reverse:
[1] "te1971-12.nc" "te1971-11.nc" "te1971-10.nc" "te1971-9.nc"
[5] "te1971-8.nc" "te1971-7.nc" "te1971-6.nc" "te1971-5.nc"
[9] "te1971-4.nc" "te1971-3.nc" "te1971-2.nc" "te1971-1.nc"
How do I fix this?
The issue is that the file names are sorted alphabetically, so "10" comes before "2".
You could capture the year and month as groups with gsub, append "-1" as the first day of the month, coerce the result with as.Date, and order by that.
x[order(as.Date(gsub('.*(\\d{4})-(\\d{1,2}).*', '\\1-\\2-1', x)))]
# [1] "te1971-1.nc" "te1971-2.nc" "te1971-3.nc" "te1971-4.nc" "te1971-5.nc"
# [6] "te1971-6.nc" "te1971-7.nc" "te1971-8.nc" "te1971-9.nc" "te1971-10.nc"
# [11] "te1971-11.nc" "te1971-12.nc"
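If all files belong to the same year, a simpler sketch is to extract the month as an integer and order by it. The regular expression below assumes the month is the run of digits between the last hyphen and the .nc extension, and the names files, month and sorted are illustrative:

```r
files <- c("te1971-1.nc", "te1971-10.nc", "te1971-11.nc", "te1971-12.nc",
           "te1971-2.nc", "te1971-3.nc")

# Pull out the digits between the last "-" and ".nc" and convert to integer
month <- as.integer(sub(".*-(\\d+)\\.nc$", "\\1", files))

# Order the file names by the numeric month instead of alphabetically
sorted <- files[order(month)]
sorted
```

For files spanning several years, ordering by the full date as shown above is the safer choice.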
Data:
x <- c("te1971-1.nc", "te1971-10.nc", "te1971-11.nc", "te1971-12.nc",
"te1971-2.nc", "te1971-3.nc", "te1971-4.nc", "te1971-5.nc", "te1971-6.nc",
"te1971-7.nc", "te1971-8.nc", "te1971-9.nc")

Sequence of numbers by hyphen without hyphenating single occurrences

I want to generate readable number sequences (e.g. 1, 2, 3, 4 = 1-4), but for a set of data where each number in the sequence must have four digits (e.g. 99 = 0099 or 1 = 0001 or 1022 = 1022) AND where there are different letters in front of each number.
I was looking at the answer to this question, which managed to do almost exactly as I want with two caveats:
If there is a stand-alone number that does not appear in a sequence, it will appear twice with a hyphen in between
If there are several stand-alone numbers that do not appear in a sequence, they won't be included in the result
### Create Data Set ====
## Create the data for different tags. I'm only using two unique levels here, but in my dataset I've got
## 400+ unique levels.
FM <- paste0('FM', c('0001', '0016', '0017', '0018', '0019', '0021', '0024', '0026', '0028'))
SC <- paste0('SC', c('0002', '0003', '0004', '0010', '0012', '0014', '0033', '0036', '0039'))
## Combine data
my.seq1 <- c(FM, SC)
## Sort data by number in sequence
my.seq1 <- my.seq1[order(substr(my.seq1, 3, 7))]
### Attempt Number Sequencing ====
## Get the letters
sp.tags <- substr(my.seq1, 1, 2)
## Get the readable number sequence
my.seq2 <- lapply(split(my.seq1, sp.tags), ## Split data by the tag ID
function(x){
## Get the run lengths as per the previous answer
rl <- rle(c(1, pmin(diff(as.numeric(substr(x, 3, 7))), 2)))
## Generate number sequence by separator as per the previous answer
seq2 <- paste0(x[c(1, cumsum(rl$lengths))], c("-", ",")[rl$values], collapse="")
return(substr(seq2, 1, nchar(seq2)-1))
})
## Combine lists and sort elements
my.seq2 <- unlist(strsplit(do.call(c, my.seq2), ","))
my.seq2 <- my.seq2[order(substr(my.seq2, 3, 7))]
names(my.seq2) <- NULL
my.seq2
[1] "FM0001-FM0001" "SC0002-SC0004" "FM0016-FM0019" "FM0028" "SC0039"
my.seq1
[1] "FM0001" "SC0002" "SC0003" "SC0004" "SC0010" "SC0012" "SC0014" "FM0016" "FM0017" "FM0018" "FM0019" "FM0021"
[13] "FM0024" "FM0026" "FM0028" "SC0033" "SC0036" "SC0039"
The major problems with this are:
Some values are completely missing from the data set (e.g. FM0021, FM0024, FM0026)
The first number in the sequence (FM0001) appears with a hyphen in between
I feel like I'm getting warmer by using A5C1D2H2I1M1N2O1R2T1's answer to utilize seqToHumanReadable because it's quite elegant AND solves both problems. Two more problems are that I'm not able to tag the ID before each number and can't force the number of digits to four (e.g. 0004 becomes 4).
library(R.utils)
lapply(split(my.seq1, sp.tags), function(x){
return(unlist(strsplit(seqToHumanReadable(substr(x, 3, 7)), ',')))
})
$FM
[1] "1" " 16-19" " 21" " 24" " 26" " 28"
$SC
[1] "2-4" " 10" " 12" " 14" " 33" " 36" " 39"
Ideally the result would be:
"FM0001, SC0002-SC0004, SC0010, SC0012, SC0014, FM0016-FM0019, FM0021, FM0024, FM0026, FM0028, SC0033, SC0036, SC0039"
Any ideas? It's one of those things that's really simple to do by hand but would take blinking ages, and you'd think a function would exist for it but I haven't found it yet or it doesn't exist :(
This should do?
# get the prefix/tag and number
tag <- gsub("(^[A-Za-z]+)(.+)", "\\1", my.seq1)
num <- gsub("([A-Za-z]+)(\\d+$)", "\\2", my.seq1)
# get a sequence id
n <- length(tag)
do_match <- c(FALSE, diff(as.numeric(num)) == 1 & tag[-1] == tag[-n])
seq_id <- cumsum(!do_match) # a sequence id
# tapply to combine the result
res <- setNames(tapply(my.seq1, seq_id, function(x)
if(length(x) < 2)
return(x)
else
paste(x[1], x[length(x)], sep = "-")), NULL)
# show the result
res
#R> [1] "FM0001" "SC0002-SC0004" "SC0010" "SC0012" "SC0014" "FM0016-FM0019" "FM0021"
#R> [8] "FM0024" "FM0026" "FM0028" "SC0033" "SC0036" "SC0039"
# compare with
my.seq1
#R> [1] "FM0001" "SC0002" "SC0003" "SC0004" "SC0010" "SC0012" "SC0014" "FM0016" "FM0017" "FM0018" "FM0019" "FM0021" "FM0024"
#R> [14] "FM0026" "FM0028" "SC0033" "SC0036" "SC0039"
Data
FM <- paste0('FM', c('0001', '0016', '0017', '0018', '0019', '0021', '0024', '0026', '0028'))
SC <- paste0('SC', c('0002', '0003', '0004', '0010', '0012', '0014', '0033', '0036', '0039'))
my.seq1 <- c(FM, SC)
my.seq1 <- my.seq1[order(substr(my.seq1, 3, 7))]
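The steps above can be wrapped into a reusable function and the result collapsed into the single comma-separated string the question asks for. This is only a sketch: collapse_runs is a hypothetical name, and it assumes every ID is a letter prefix followed by digits and that the input is already sorted by number:

```r
# Hypothetical helper: collapse runs of consecutive numbers that share a prefix
collapse_runs <- function(ids) {
  tag <- sub("[0-9]+$", "", ids)                 # letter prefix, e.g. "FM"
  num <- as.numeric(sub("^[^0-9]+", "", ids))    # numeric part, e.g. 16
  n <- length(ids)
  # TRUE where an element continues the run started by its predecessor
  continues <- c(FALSE, diff(num) == 1 & tag[-1] == tag[-n])
  run_id <- cumsum(!continues)
  # A run of length 1 stays as is; longer runs become "first-last"
  unname(vapply(split(ids, run_id), function(run) {
    if (length(run) < 2) run else paste(run[1], run[length(run)], sep = "-")
  }, character(1)))
}

ids <- c("FM0001", "SC0002", "SC0003", "SC0004", "SC0010", "FM0016", "FM0017")
toString(collapse_runs(ids))
```

Because the original strings are reused rather than rebuilt from the numbers, the four-digit padding and the prefixes are preserved automatically.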

Change order of multiple optional substrings

That's a bit like this question, but I have multiple substrings that may or may not occur.
The substrings code for two different dimensions, in my example "test" and "eye". They can occur in any imaginable order.
The variables can be coded in different ways - in my example, "method|test" would be two ways to code for "test", as well as "r|re|l|le" different ways to code for eyes.
I found a convoluted solution, which is using a chain of seven (!) gsub calls, and I wondered if there is a more concise way.
x <- c("id", "r_test", "l_method", "test_re", "method_le", "test_r_old",
"test_l_old", "re_test_new","new_le_method", "new_r_test")
x
#> [1] "id" "r_test" "l_method" "test_re"
#> [5] "method_le" "test_r_old" "test_l_old" "re_test_new"
#> [9] "new_le_method" "new_r_test"
Desired output
#> [1] "id" "r_test" "l_test" "r_test" "l_test"
#> [6] "r_test_old" "l_test_old" "r_test_new" "l_test_new" "r_test_new"
How I got there (convoluted)
## Unify codes for variables, I use the underscores to make it more unique for future regex
clean_test<- gsub("(?<![a-z])(test|method)(?![a-z])", "_test_", tolower(x), perl = TRUE)
clean_r <- gsub("(?<![a-z])(r|re)(?![a-z])", "_r_", tolower(clean_test), perl = TRUE)
clean_l <- gsub("(?<![a-z])(l|le)(?![a-z])", "_l_", tolower(clean_r), perl = TRUE)
## Now sort, one after the other
sort_eye <- gsub("(.*)(_r_|_l_)(.*)", "\\2\\1\\3", clean_l, perl = TRUE)
sort_test <- gsub("(_r_|_l_)(.*)(_test_)(.*)", "\\1\\3\\2\\4", sort_eye, perl = TRUE)
## Remove underscores
clean_underscore_mult <- gsub("_{2,}", "_", sort_test)
clean_underscore_ends <- gsub("^_|_$", "", clean_underscore_mult)
clean_underscore_ends
#> [1] "id" "r_test" "l_test" "r_test" "l_test"
#> [6] "r_test_old" "l_test_old" "r_test_new" "l_test_new" "r_test_new"
I would be very grateful for a suggestion on how to proceed more concisely from the "## Now sort, one after the other" step onward.
How about tokenizing the string and using lookup tables instead? I'll use data.table to assist but the idea fits naturally with other data grammars as well
library(data.table)
# build into a table, keeping track of an ID
# to say which element it came from originally
l = strsplit(x, '_', fixed=TRUE)
DT = data.table(id = rep(seq_along(l), lengths(l)), token = unlist(l))
Now build a lookup table:
# defined using fread to make it easier to see
# token & match side-by-side; only define tokens
# that actually need to be changed here
lookups = fread('
token,match
le,l
re,r
method,test
')
Now combine:
# default value is the token itself
DT[ , match := token]
# replace anything matched
DT[lookups, match := i.match, on = 'token']
Next use factor ordering to get the tokens in the right order:
# the more general case (where you don't have an exact list of all the possible
# tokens ready at hand) is a bit messier -- you might do something
# similar to setdiff(unique(match), lookups$match)
DT[ , match := factor(match, levels = c('id', 'r', 'l', 'test', 'old', 'new'))]
# sort to this new order
setorder(DT, id, match)
Finally combine again (an aggregation) to get the output:
DT[ , paste(match, collapse='_'), by = id]$V1
# [1] "id" "r_test" "l_test" "r_test" "l_test"
# [6] "r_test_old" "l_test_old" "r_test_new" "l_test_new" "r_test_new"
Here's a one-liner with nested sub that transforms x without any intermediary steps:
sub("^(\\w+)_(r|re|l|le)", "\\2_\\1",
sub("method", "test",
sub("(l|r)e", "\\1",
sub("(^new)_(\\w+_\\w+)$", "\\2_\\1", x))))
# [1] "id" "r_test" "l_test" "r_test" "l_test" "r_test_old"
# [7] "l_test_old" "r_test_new" "l_test_new" "r_test_new"
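To see what each of the nested calls contributes, the one-liner can be unrolled from the inside out. The sketch below uses a hypothetical helper, trace_steps, that applies the same four patterns one at a time and returns every intermediate result:

```r
# Hypothetical helper: apply the four sub() calls in order, innermost first,
# and return every intermediate result
trace_steps <- function(s) {
  step1 <- sub("(^new)_(\\w+_\\w+)$", "\\2_\\1", s)      # move a leading "new" to the end
  step2 <- sub("(l|r)e", "\\1", step1)                    # shorten "le"/"re" to "l"/"r"
  step3 <- sub("method", "test", step2)                   # unify "method" to "test"
  step4 <- sub("^(\\w+)_(r|re|l|le)", "\\2_\\1", step3)   # move the eye code to the front
  c(input = s, step1 = step1, step2 = step2, step3 = step3, step4 = step4)
}

trace_steps("new_le_method")
trace_steps("test_re")
```

Note that the pattern (l|r)e is not anchored to token boundaries, so this approach assumes no other token contains "le" or "re".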
Data:
x <- c("id", "r_test", "l_method", "test_re", "method_le", "test_r_old",
"test_l_old", "re_test_new","new_le_method", "new_r_test")
Much inspired by, and building on, user MichaelChirico's answer, this is a function using base R only, which (in theory) should work with any number of substrings to sort. The list defines the sort order (by the order of its elements) and specifies all the ways to code for the default tokens (the list names).
## I've added some more ways to code for right and left eyes, as well as different further strings that are not known.
x <- c("id", "r_random_test_old", "r_test", "r_test_else", "l_method", "test_re", "method_le", "test_od_old",
"test_os_old", "re_mth_new","new_le_method", "new_r_test_random")
x
#> [1] "id" "r_random_test_old" "r_test"
#> [4] "r_test_else" "l_method" "test_re"
#> [7] "method_le" "test_od_old" "test_os_old"
#> [10] "re_mth_new" "new_le_method" "new_r_test_random"
sort_substr(x, list(r = c("od","re"), l = c("os","le"), test = c("method", "mth"), time = c("old","new")))
#> [1] "id" "r_test_time_random" "r_test"
#> [4] "r_test_else" "l_test" "r_test"
#> [7] "l_test" "r_test_time" "l_test_time"
#> [10] "r_test_time" "l_test_time" "r_test_time_random"
The sort_substr function is defined as follows:
sort_substr <- function(x, list_substr) {
lookups <- data.frame(match = rep(names(list_substr), lengths(list_substr)),
token = unlist(list_substr))
l <- strsplit(x, "_", fixed = TRUE)
DF <- data.frame(id = rep(seq_along(l), lengths(l)), token = unlist(l))
match_token <- lookups$match[match(DF$token, lookups$token)]
DF$match <- ifelse(is.na(match_token), DF$token, match_token)
rest_token <- base::setdiff(DF$match, names(list_substr))
DF$match <- factor(DF$match, levels = c(names(list_substr), rest_token))
DF <- DF[with(DF, order(id, match)), ]
out <- vapply(split(DF$match, DF$id),
paste, collapse = "_",
FUN.VALUE = character(1),
USE.NAMES = FALSE)
out
}

How to select a specific interval of dataframes/objects inside a list()?

I have a list composed of 10 numeric vectors. I would like to select the first five (1:5), or, say, just the 3rd and the 9th of these numeric vectors inside the list.
This below would be an example of a list:
n_vec = lapply(1:10, function(x) rnorm(20,5,2))
bLister = list()
keeping_names = NULL
for (i in 1:length(n_vec)) {
single_name_ = paste("thisis_vec",i)
temp = n_vec[[i]]
keeping_names = c(keeping_names,single_name_)
bLister[[i]] = temp
}
names(bLister) = keeping_names
This way doesn't work:
bLister[[1:5]]
bLister[[c(3,9)]]
How can I do this?
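You can subset lists like so. Notice the number of square brackets: single brackets [ accept a vector of indices and return a list, whereas double brackets [[ extract a single element.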
> bLister[c(3, 9)]
$`thisis_vec 3`
[1] 5.603467 3.749571 3.944807 7.279552 7.122220 2.065051 2.587282 4.405463
[9] 6.687400 7.567451 6.239640 6.017510 2.484759 3.223271 5.301008 1.545704
[17] 2.465992 1.518966 6.997675 3.966775
$`thisis_vec 9`
[1] 3.900151 5.260895 7.971662 6.578425 4.861220 3.770569 1.128102 6.164506
[9] 4.767511 5.286352 3.898185 2.298500 8.476691 7.794415 7.148588 6.699527
[17] 3.638074 4.240355 8.575829 5.340551
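A compact sketch of the difference between the two operators, rebuilding a list like the one in the question (the seed and the variable names first_five, third_and_ninth and third_only are only for illustration):

```r
set.seed(42)  # only to make the example reproducible
bLister <- setNames(lapply(1:10, function(i) rnorm(20, 5, 2)),
                    paste("thisis_vec", 1:10))

first_five      <- bLister[1:5]       # single brackets: a list of five vectors
third_and_ninth <- bLister[c(3, 9)]   # single brackets: a list of two vectors
third_only      <- bLister[[3]]       # double brackets: the numeric vector itself

names(third_and_ninth)
```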

Accessing selected elements of a list of lists in R

I have a list of lists, subgame[[i]]$Weight, of this type:
[[1]]
[1] 0.4720550 0.4858826 0.4990469 0.5115899 0.5235512 0.5349672 0.5458720
[8] 0.5562970 0.5662715 0.5758226 0.5849754 0.5937532 0.6021778 0.6102692
[15] 0.6180462 0.6255260 0.6327250 0.6396582 0.6463397 0.6527826
[[2]]
[1] 0.4639948 0.4779027 0.4911519 0.5037834 0.5158356 0.5273443 0.5383429
[8] 0.5488623 0.5589313 0.5685767 0.5778233 0.5866943 0.5952111 0.6033936
[15] 0.6112605 0.6188291 0.6261153 0.6331344 0.6399002 0.6464260
[[3]]
[1] 0.4629488 0.4768668 0.4901266 0.5027692 0.5148329 0.5263534 0.5373639
[8] 0.5478953 0.5579764 0.5676339 0.5768926 0.5857755 0.5943041 0.6024984
[15] 0.6103768 0.6179568 0.6252543 0.6322844 0.6390611 0.6455976
What I am looking for is a way to access the j-th element of every list. For example, if j=1 I should get:
>0.4720550 0.4639948 0.4629488
How can I do it?
I found
sapply(1:length(subgame[[i]]$Weight),function(k) subgame[[i]]$Weight[[k]][1])
But it seems too tricky to me.
Is there a more elegant way?
If j=1, then you're interested in subgame[[i]]$Weight[[1]][1], subgame[[i]]$Weight[[2]][1], and subgame[[i]]$Weight[[3]][1]. In other words, you want to use [1] on each list element.
But what happens when you subset a vector? For example:
(x <- rnorm(5))
# [1] -1.8965529 0.4688618 0.6588774 0.2749539 0.1829046
x[3]
# [1] 0.6588774
[ is actually a function, and it gets called in this situation. You can read a bit more about it with ?"[", but the point is that you can call it like any other function. Its first argument will be the object to subset, then you can pass it the index (or indices) you're interested in (along with some other arguments that the help page discusses):
x[3]
# [1] 0.6588774
`[`(x, 3)
# [1] 0.6588774
Note the backticks surrounding the name. A bare [ will throw an error, so you need to quote it. The same goes for other functions like +.
So if you want to get the first element of each list element, you can apply [ to each element of the list, passing it 1 or whatever j is:
sapply(subgame[[i]]$Weight, `[`, 1)
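A self-contained sketch of this call, where weights stands in for subgame[[i]]$Weight and holds only the first three values of each vector from the question. The vapply variant is an addition for illustration: it does the same thing but fails loudly if any element does not yield exactly one number:

```r
weights <- list(c(0.4720550, 0.4858826, 0.4990469),
                c(0.4639948, 0.4779027, 0.4911519),
                c(0.4629488, 0.4768668, 0.4901266))
j <- 1

# Apply the subsetting function `[` to every list element, passing j as the index
first_elements <- sapply(weights, `[`, j)

# Same result, but with the expected type and length of each result declared
first_elements_strict <- vapply(weights, `[`, FUN.VALUE = numeric(1), j)
first_elements
```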
I would like to add a solution that returns the result you want for the Weight list of each element of your subgame list.
> subgame <- list(list(weight = list(c(1, 2), c(3, 4), c(5, 6))), list(weight = list(c(7, 8), c(9, 10), c(11, 12))))
>
> j = 1
>
> do.call(rbind, subgame[[1]]$weight)[,j]
[1] 1 3 5
>
> lapply(subgame, function(x) {do.call(rbind, x$weight)[,j]})
[[1]]
[1] 1 3 5
[[2]]
[1] 7 9 11
