Random extraction from a list with NO REPLACEMENT - r

I am wondering how to randomly extract a string from a list in R with NO REPLACEMENT until the list is empty.
Writing
sample(x, size=1, replace=FALSE)
does not help, since strings are extracted more than once before the list is empty.
Kind regards

In every iteration one list element will be picked, and from this element a value removed. If there is only one value left, the list element is removed.
x <- list(a = "bla", b = c("ble", "bla"), c = "bli")
while (length(x) > 0) {
  s <- sample(x, size = 1)
  column <- x[[names(s)]]
  value <- sample(unlist(s, use.names = FALSE), size = 1)
  list_element_without_value <- subset(column, column != value)
  x[[names(s)]] <- if (length(list_element_without_value) == 0) {
    NULL
  } else {
    list_element_without_value
  }
}

sample(x)
You can't call sample() repeatedly with size=1 and expect it to remember which values were selected previously: replace=FALSE only applies within a single call. You have to draw all the values you want at one time. This code shuffles your data; take the first element when you need a value, the second element the next time, and so on.
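A minimal sketch of the shuffle-once idea, assuming the data is a plain character vector. The helper name make_drawer is hypothetical and not part of base R; it shuffles once and hands out one element per call until nothing is left.

```r
# Hypothetical helper: shuffle once, then return one element per call.
make_drawer <- function(x) {
  shuffled <- sample(x)  # a single call, so no value can repeat
  i <- 0
  function() {
    if (i >= length(shuffled)) return(NULL)  # nothing left to draw
    i <<- i + 1
    shuffled[[i]]
  }
}

draw <- make_drawer(c("bla", "ble", "bli", "blo"))
drawn <- character(0)
while (!is.null(value <- draw())) {
  drawn <- c(drawn, value)
}
print(drawn)
```

Each value appears exactly once in drawn, and further calls to draw() return NULL.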

Related

Tracking parent list in nested list

I have a nested list that I need to iterate over. I want to go to the end of the list (in this case the second item of the parent list) and add an item to it once it is no longer a nested list. The list may have many levels of nesting. Right now, I'm only getting the second list back. How do I keep track of the parent list?
a <- list(x = list(1, 2, 3), y = list(4, 5, 6))
con <- TRUE
while (con) {
  i <- length(a)
  for (k in i:i) {
    if (!typeof(a[[k]]) == "list") {
      a[[k + 1]] <- "test"
      con <- FALSE
    } else {
      a <- a[[k]]
      i <- length(a)
    }
  }
}
Expected result: a <- list(x = list(1,2,3), y = list(4,5,6, "test"))
Actual result: a <- list(4,5,6,"test")
library(magrittr)
a <- list(x = list(1, 2, 3), y = list(4, 5, 6), z = 1)
temp <- lapply(a, typeof) %>% unlist
tempList <- (temp != "list")
if (sum(tempList) > 0) {
  a[[max(which(tempList == FALSE))]] %<>% append("test")
} else {
  a[[length(a)]] %<>% append("test")
}
It isn't clear to me exactly what you want to do but, concentrating just on your example, this would work.
In short: find which elements of the parent list are lists and append "test" to the last of them. If every element is a list, "test" is appended to the last element.
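The same idea can be sketched in base R without magrittr. This is only an illustration on the example data, assuming a single level of nesting; the variable names is_list and last are made up for the sketch.

```r
a <- list(x = list(1, 2, 3), y = list(4, 5, 6), z = 1)

# TRUE for every element of the parent list that is itself a list
is_list <- vapply(a, is.list, logical(1))

if (any(is_list)) {
  last <- max(which(is_list))              # position of the last nested list
  a[[last]] <- c(a[[last]], list("test"))  # append inside the parent list
}

str(a)
```

Because the assignment goes through a[[last]], the parent list a is modified in place rather than being replaced by its child.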

How can I subindex a multiway array with vector that preserves blanks

As a minimal example, let's consider the following multiway array (a):
a = as.table(array(c(1:8), dim=c(2,2,2)))
For this array, manual subindexing is easy, e.g.
a[1,,] (a 2 x 2 matrix that corresponds to dimension one being in state 1 (A))
My question is: how can I do the same thing with a vector that preserves blanks, e.g. something like c(1,,)?
Note that I need to define which dimensions are left blank (dynamically) based on the observed variables in an instance. My initial thought was a generic character vector b=c("","",""), in which I could replace variable 1 with 1 if it was observed in state 1, e.g. b[1]="1". However, I do not know how to use this vector for indexing (something like a[b]), or whether there is a better way of doing this.
I need this dynamic indexing because I want to update parts of the table as I receive evidence (information == counts).
Thank you very much in advance!
Best,
Sebastian
Here's how I would do it:
while (evidence) {
  idx <- lapply(dim(a), function (dimsize) 1:dimsize)
  ## update `idx` according to `evidence`, e.g.,
  ## if you want to do `a[1,,2]`
  idx[[1]] <- 1
  idx[[3]] <- 2
  do.call(`[`, c(list(a), idx))
  ## if you want to do `a[1,,2] <- c(20, 30)`
  a <- do.call(`[<-`, c(list(a), idx, list(value=c(20, 30))))
}
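A runnable sketch of the do.call approach, assuming the observed states are kept in a numeric vector with NA marking a dimension that should be left blank. The helper build_index and the vectors b and b2 are hypothetical names introduced for illustration.

```r
a <- as.table(array(1:8, dim = c(2, 2, 2)))

# Hypothetical helper: b has one entry per dimension, NA means "leave blank".
build_index <- function(arr, b) {
  idx <- lapply(dim(arr), seq_len)       # start with "everything" per dimension
  observed <- which(!is.na(b))
  idx[observed] <- as.list(b[observed])  # fix the observed dimensions
  idx
}

# Equivalent to a[1, , ]
b <- c(1, NA, NA)
sub <- do.call(`[`, c(list(a), build_index(a, b)))

# Equivalent to a[1, , 2] <- c(20, 30), stored in a copy
b2 <- c(1, NA, 2)
a2 <- do.call(`[<-`, c(list(a), build_index(a, b2), list(value = c(20, 30))))
```

Using NA instead of empty strings keeps the index numeric, so no conversion is needed before indexing.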
Here is a dirty way of solving it:
data:
a = as.table(array(c(1:8), dim=c(2,2,2)))
Your dynamic indices should be character strings (how you turn your condition into a string such as index or index2 is a separate question):
index = "1,,"
index2 = ",2,"
function:
crazyIndexing <- function(obj, index) {
  stringExpr = paste0(obj, "[",index,"]")
  return(eval(parse(text=stringExpr)))
}
Call your function (each pair below returns the same result):
a[1,,]
crazyIndexing("a",index)
a[,2,]
crazyIndexing("a",index2)
Please note that the index string can be built from your character vector:
b=c("","",""); b[1]="1"
index = paste0(b, collapse = ",")
#[1] "1,,"
You can of course change your function accordingly:
crazyIndexing2 <- function(obj, obj2, index) {
  stringExpr = paste0(obj ,"[",index,"]", "<-", obj, "[",index,"]", "+", obj2)
  eval(parse(text=stringExpr))
  return( get(obj) )
}
a = as.table(array(c(1:8), dim=c(2,2,2)))
aa = a[,2,]
aopt = crazyIndexing2("a","aa","1,,")
Now you have all the tools.

Subset from a list with relational operators

I have a list object that contains multiple lists, each with the same elements. The list below is a short version of it.
logs <- list(list(success = TRUE, details = "check", timestamp = as.Date("2017-10-06")),
list(success = FALSE, details = "uncheck", timestamp = as.Date("2017-10-07")),
list(success = FALSE, details = "check", timestamp = as.Date("2017-10-08")),
list(success = FALSE, details = "uncheck", timestamp = as.Date("2017-10-09")))
I want to create two vectors: one vector (success_true) that contains the second element of each list whose first element is TRUE, and one vector (success_false) that contains the second element of each list whose first element is FALSE. The result that I'm looking for looks like this:
success_true <- c("check")
success_false <- c("uncheck", "check", "uncheck")
The sapply solution that Shaun Wilkinson came up with works.
# Solution number 1 by Shaun Wilkinson: sapply
successes <- sapply(logs, function(e) e$success)
details <- sapply(logs, function(e) e$details)
success_true <- details[successes]
success_false <- details[!successes]
I also came up with another solution that incorporates a conditional statement within a for loop.
# Solution number 2 by SHW: conditional statement with for loop
success_true <- c() # create two vectors
success_false <- c()
for (log in logs) {
  if (log$success == TRUE) {
    success_true <- c(success_true, log$details) # add the details element to success_true if the condition is met
  } else {
    success_false <- c(success_false, log$details) # otherwise add the details element to success_false
  }
}
Try this:
successes <- sapply(logs, function(e) e$success)
details <- sapply(logs, function(e) e$details)
success_true <- details[successes]
success_false <- details[!successes]
In addition to Shaun's answer, I have found another solution that incorporates a conditional statement in a for loop. I think this solution allows for more flexibility and is therefore the solution that I will be using.
success_true <- c() # create two vectors
success_false <- c()
for (log in logs) {
  if (log$success == TRUE) {
    success_true <- c(success_true, log$details) # add the details element to success_true if the condition is met
  } else {
    success_false <- c(success_false, log$details) # otherwise add the details element to success_false
  }
}
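For comparison, a sketch of a third variant that groups the details by outcome in one step with split(). It assumes every log entry has exactly one logical success value and one character details value, which is why vapply() is used in place of sapply(); the name by_outcome is made up for the sketch.

```r
logs <- list(list(success = TRUE, details = "check", timestamp = as.Date("2017-10-06")),
             list(success = FALSE, details = "uncheck", timestamp = as.Date("2017-10-07")),
             list(success = FALSE, details = "check", timestamp = as.Date("2017-10-08")),
             list(success = FALSE, details = "uncheck", timestamp = as.Date("2017-10-09")))

# vapply() fails loudly if an entry does not hold a single value of the stated type
successes <- vapply(logs, function(e) e$success, logical(1))
details <- vapply(logs, function(e) e$details, character(1))

# split() groups details by the value of successes ("FALSE" / "TRUE")
by_outcome <- split(details, successes)
success_true <- by_outcome[["TRUE"]]
success_false <- by_outcome[["FALSE"]]
```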

dealing with empty elements in lapply()

This might be a rather beginner-level question. lapply() is useful for applying a specific function to each component of a list. However, when I deal with data periodically generated by the database, it sometimes happens that one or more elements in the list are empty, while all other components are of the same class, let's say, data frames.
When I use lapply() on the whole list, an error occurs when it reaches the empty elements, because the dimension, length, or class doesn't fit. What I do in this case is use an if/else check, but I guess there must be a neater way to tackle this problem.
Here is an example:
FTSR.site.app <- lapply(sortier.d.f, function(x) {
  if (length(x) != 1) {
    FTSR <- as.numeric(get.FTSR(x))
  } else FTSR <- 0
})
sortier.d.f is a list consisting of data frames with numerous rows and columns. If an empty element is present among them, which means no data was generated there, it will not get along with the get.FTSR function (which I wrote for a particular calculation), because the latter can only process data frames. The length of this empty element will be 1, I guess because it still exists as a 0 or a FALSE. Without such empty elements I could simply use
FTSR.site.app <- lapply(sortier.d.f, get.FTSR)
Could you please suggest a better solution for the problem with empty elements in such a case?
A simpler dummy example here:
test.A <- data.frame(name = c("Michael", "John", "Mary"),
                     mathematik = c(85, 72, 90), physics = c(67, 82, 94))
test.B <- vector(length = 0, mode = "numeric")
test.L <- list(test.A, test.B)
sum.mean.calc <- function(test){
  test$total <- apply(test[,2:3], MARGIN = 1, sum)
  test$mean <- apply(test[,2:3], MARGIN = 1, mean)
  return(test)
}
# first attempt: fails on test.B
test.L <- lapply(test.L, sum.mean.calc)
# second attempt: check each element before processing it
test.L <- lapply(test.L, function(x){
  if(length(x) != 0){
    x <- sum.mean.calc(x)
  }else x <- 0
  return(x)
})
The first attempt to use lapply failed because test.B is a zero-length vector, so it can't be processed by the function sum.mean.calc. In the second attempt I therefore have to add the extra check
if(length(x) != 0){
...
}else x <- 0
to process all components in the list test.L, and that can be annoying when I want to use lapply a number of times on that list.
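One possible way to avoid repeating the check, sketched on the dummy example: wrap the check once in a small function factory and pass the wrapped function to lapply. The name skip_empty and its default argument are hypothetical, and the sketch assumes that "empty" means anything that is not a data frame with at least one row.

```r
test.A <- data.frame(name = c("Michael", "John", "Mary"),
                     mathematik = c(85, 72, 90), physics = c(67, 82, 94))
test.B <- vector(length = 0, mode = "numeric")
test.L <- list(test.A, test.B)

sum.mean.calc <- function(test) {
  test$total <- apply(test[, 2:3], MARGIN = 1, sum)
  test$mean <- apply(test[, 2:3], MARGIN = 1, mean)
  test
}

# Hypothetical wrapper: returns a version of f that skips empty elements
skip_empty <- function(f, default = 0) {
  function(x) {
    if (is.data.frame(x) && nrow(x) > 0) f(x) else default
  }
}

result <- lapply(test.L, skip_empty(sum.mean.calc))
```

The same wrapper can be reused with any other function applied to the list, e.g. lapply(test.L, skip_empty(summary)).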

combine results from loop in one file in R (some results were missing)

I want to combine the results from a for loop into one txt file, and I have written my code based on the suggestion from this link:
combine results from a loop in one file
There is one problem: I am supposed to get 8 results (rows), but I ended up with only 5. Somehow the other results did not get into the file. I think the problem is with the if statement, but I don't know how to fix it.
Here is my code:
prob <- c(0.10, 0.20)
for (j in seq(prob)) {
  range <- c(2,3)
  for (i in seq(range)) {
    sample <- c(10,20)
    for (k in seq(sample)) {
      data <- Simulation(X =1,Y =range[i], Z=sample[k] ,p = prob[j])
      filename <- paste('file',i,'txt')
      if (j == 1) {
        write.table(data, "Desktop/file2.txt", col.names= TRUE)
      } else {
        write.table(data,"Desktop/file2.txt", append = TRUE, col.names = FALSE)
      }
    }
  }
}
That's because the if ( j == 1 ) bit is meant to check whether this is the first time you've written to the file or not.
If it is the first time, then it will write the column names (i.e. X, Y, Z, p) into the file (see the col.names=TRUE?).
If it isn't the first time, then it won't write the column names, but will just append the data.
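Since you have multiple nested loops, that condition won't work for you: when j==1 (i.e. for prob=0.1) the two inner loops run 4 times in total, and because j==1 on each of those iterations, the file is overwritten every time. Only the last of those 4 results survives; the 4 results for j==2 are then appended, which gives the 5 rows you see.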
I'd recommend initialising a variable count that counts how many times you've performed Simulation, and then changing that line to if ( count == 1 ):
count <- 1
prob <- c(0.10,0.20)
# .... code as before
      data <- Simulation(X =1,Y =range[i], Z=sample[k] ,p = prob[j])
      if ( count == 1 ) {
        write.table(data, "Desktop/file2.txt", col.names=TRUE)
      } else {
        write.table(data, "Desktop/file2.txt", append=TRUE, col.names=FALSE)
      }
      # increment count
      count <- count + 1
}}}
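A self-contained sketch of the counter approach. The real Simulation function is not shown in the question, so a hypothetical stand-in that returns a one-row data frame is used here; the output goes to a temporary file instead of Desktop/file2.txt, the vectors are renamed so they do not mask range() and sample(), and row.names = FALSE is added only so that the file can be read back for checking.

```r
# Hypothetical stand-in for the asker's Simulation(); returns one row per call.
Simulation <- function(X, Y, Z, p) data.frame(X = X, Y = Y, Z = Z, p = p)

out_file <- tempfile(fileext = ".txt")  # assumption: temporary file for the demo
prob <- c(0.10, 0.20)
range_vals <- c(2, 3)
sample_sizes <- c(10, 20)

count <- 1
for (j in seq_along(prob)) {
  for (i in seq_along(range_vals)) {
    for (k in seq_along(sample_sizes)) {
      data <- Simulation(X = 1, Y = range_vals[i], Z = sample_sizes[k], p = prob[j])
      if (count == 1) {
        # first write: create the file and include the header
        write.table(data, out_file, col.names = TRUE, row.names = FALSE)
      } else {
        # later writes: append rows without repeating the header
        write.table(data, out_file, append = TRUE, col.names = FALSE, row.names = FALSE)
      }
      count <- count + 1
    }
  }
}

result <- read.table(out_file, header = TRUE)
nrow(result)
```

With the counter deciding between create and append, all 8 combinations end up in the file.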
