Passing Along Column Values to Paste - r

This is a follow-up to Paste/Collapse in R
I assume it's preferable to start a new question rather than endlessly edit a previous one with new questions.
I have some vectors representing strategies, and I want to simulate playing them against each other. The goal is to randomly pick two strategies to play against each other; once the results matrix is made, a magical for loop will assign each strategy a score.
###Sample Strategies
whales <- c("C","D","C","D","D")
quails <- c("D","D","D","D","D")
snails <- c("C", "C", "C", "C", "C")
bales <- c("D", "D", "C", "D", "C")
####Combining into a matrix
gameboard<-cbind(whales, quails, bales, snails, deparse.level = 1)
####All of the names of the strategies/columns
colnames(gameboard)
####Randomly pick two random column names
game1<- colnames(gameboard)[sample(1:ncol(gameboard), 2, replace= FALSE)]
results <-paste(game1[1], game1[2], sep='')
Now this does work, except that I am accessing the column names, not the data in the columns. So I end up with results like 'whalesbales' when I want the actual concatenation of the values: CD DD CC DD DC.
Maybe 'apply' or 'lapply'...apply here?
The inevitable follow up question is how can I get the last line where it says 'results' to instead say 'results_whalesVbales'?
because I assume
results"game1[1]", sep='V',game1[2]"
is not going to cut it, and there is some ugly way to do this with lots of parentheses and block quotes.
Thanks in advance for advice.
#FOLLOW UP
Thanks, Ferdinand, for the response and thorough explanation.
A couple of follow ups:
(1) Is there a way to get the
paste(.Last.value, collapse=" ")
[1] "DC DD CC DD CD"
result to be a new object (vector?) that is named result_balesVwhales based on
paste0("results_", paste(colnames(gameboard)[randompair], collapse="V"))
[1] "results_balesVwhales"
Everything I've tried so far gives the vector the value "results_balesVwhales" instead of that name.
(2) Can I force the new results_balesVwhales to have the "long" (columnar) format that bales and whales each have individually, w/o reshape?

Ferdinand has answered the first question. Regarding your second question, the function you are asking about is assign.
x = 'foo'
assign(x, 2)
foo
# [1] 2
However, there be dragons... Instead, the R way of doing that would be to assign into an element of a named list:
game1 <- sample(colnames(gameboard), 2)
result <- list()
# game1 already holds the sampled column names, so paste them directly
list_name <- paste0("results_", paste(game1, collapse="V"))
result[[list_name]] <- paste(gameboard[, game1[1]],
gameboard[, game1[2]],
sep='',
collapse=' ')
If you want the vector of pasted elements, just remove the collapse as I've done below.
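As a minimal sketch of that difference, here is the same paste call with and without collapse. The pair is fixed to whales and bales (an assumption made only so the output is reproducible) instead of being sampled, and the names pair, per_round and whole_game are illustrative.

```r
# Strategies from the question
whales <- c("C", "D", "C", "D", "D")
quails <- c("D", "D", "D", "D", "D")
snails <- c("C", "C", "C", "C", "C")
bales  <- c("D", "D", "C", "D", "C")
gameboard <- cbind(whales, quails, bales, snails)

# Fixed pair for reproducibility; in the real game this would come from sample()
pair <- c("whales", "bales")

# One string per round (the "long" format): no collapse
per_round <- paste(gameboard[, pair[1]], gameboard[, pair[2]], sep = "")

# A single string for the whole game: collapse the rounds with a space
whole_game <- paste(per_round, collapse = " ")

per_round
# [1] "CD" "DD" "CC" "DD" "DC"
whole_game
# [1] "CD DD CC DD DC"
```

Keeping the per-round vector is usually the more convenient form for scoring later, since each round can be looked up by position.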
Eventually, once I sorted out how I wanted this all to work, I would wrap it in a function or two. These can be much cleaner, but the verbosity illustrates the idea.
my_game <- function(trial, n_samples, dat) {
# as per my comment, generate the game result and name using the colnames directly
game <- sample(colnames(dat), n_samples)
list_name <- paste0("results_", paste(game, collapse="V"))
game_result <- paste(dat[, game[1]],
dat[, game[2]],
sep='')
# return both the name and the data in the format desired to parse out later
return(list(list_name, game_result))
}
my_game_wrapper <- function(trials, n_samples, dat) {
# for multiple trials we create a list of lists of the results and desired names
results <- lapply(1:trials, my_game, n_samples, dat)
# extract the names and data
result_names <- sapply(results, '[', 1)
result_values <- sapply(results, '[', 2)
# and return them once the list is pretty.
names(result_values) <- result_names
return(result_values)
}
my_game_wrapper(3, 2, gameboard)
# $results_quailsVwhales
# [1] "DC" "DD" "DC" "DD" "DD"
# $results_quailsVbales
# [1] "DD" "DD" "DC" "DD" "DC"
# $results_balesVquails
# [1] "DD" "DD" "CD" "DD" "CD"

Related

ifelse assignment does not return pre-defined character vector

I cannot figure out why an ifelse assignment does not return the entire object I'm trying to pass.
Below, I look to see if the state of Texas is in a vector of state names (in my actual code, I'm looking up unique state names in a shapefile). Then, if Texas is in this list of state names, assign to a new object (states_abb) the abbreviations for Texas and New Mexico. Otherwise, assign to states_abb the abbreviations for California and Nevada.
I know from this post and ?ifelse that...
ifelse returns a value with the same shape as test which is filled
with elements selected from either yes or no depending on whether the
element of test is TRUE or FALSE.
So in my first example below, I can understand that only CA is returned.
states <- c("California", "Nevada")
(states_abb <- ifelse("Texas" %in% states, c("NM", "TX"), c("CA", "NV")))
# [1] "CA"
But in this second example below, I've pre-defined the object canv_abb. Why doesn't that whole object get passed? Even if it's a character vector, it's its own object, right? Shouldn't that whole "package" get passed?
txnm_abb <- c("NM", "TX")
canv_abb <- c("CA", "NV")
(states_abb <- ifelse("Texas" %in% states, txnm_abb, canv_abb))
# [1] "CA"
I appreciate any insights as to why this is happening. And can someone offer a solution so that I can assign BOTH abbreviations?
ifelse is the wrong tool here. You use ifelse when you have a vector of logical tests and you wish to create a new vector of the same length. You have a single logical test, but want your output to be a vector.
What you are describing is better handled by an if and else clause:
states_abb <- if("Texas" %in% states) txnm_abb else canv_abb
states_abb
#> [1] "CA" "NV"
To try to get a feel for how ifelse works, consider the following input and output:
condition <- c(TRUE, FALSE, FALSE)
input1 <- c("A", "b", "C")
input2 <- c("1", "2", "3")
result <- ifelse(condition, input1, input2)
result
#> [1] "A" "2" "3"
You will see that ifelse is a vectorized, shorthand way of writing the following loop, which is a pattern that comes up surprisingly often in data wrangling:
result <- character(3)
for(i in 1:length(condition)) {
result[i] <- if(condition[i]) input1[i] else input2[i]
}
result
#> [1] "A" "2" "3"
Note though that this is not what you are trying to do in your own code.
The key phrase here is 'same length'. With ifelse, the output always has the same length as the test. In your case the test has length 1 while yes and no have length 2, so only the first element is returned.
See ?ifelse:
"ifelse returns a value with the same shape as test"
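A small sketch of what "same shape as test" means in practice. The variable names yes, no, one and two are illustrative only.

```r
yes <- c("NM", "TX")
no  <- c("CA", "NV")

# Length-1 test: the result has length 1, so only the first element of `no` is used
one <- ifelse(FALSE, yes, no)
one
# [1] "CA"

# Length-2 test: the result has length 2, chosen element by element
two <- ifelse(c(TRUE, FALSE), yes, no)
two
# [1] "NM" "NV"
```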
Solution: use a classic if/else structure:
states <- c("California", "Nevada")
states_abb <- if ("Texas" %in% states) {
c("NM", "TX")
} else {
c("CA", "NV")
}
> states_abb
[1] "CA" "NV"

names of leaves of nested list in R

I want to check if two nested lists have the same names at the last level.
If unlist gave an option not to concatenate names this would be trivial. However, it looks like I need some function leaf.names():
X <- list(list(a = pi, b = list(alpha.c = 1:5, 7.12)), d = "a test")
leaf.names(X)
[1] "a" "alpha.c" "NA" "d"
I want to avoid any inelegant grepping if possible. I feel like there should be some easy way to do this with rapply or unlist...
leaf.names <- function(X) names(rlang::squash(X))
or
leaf.names <- function(X){
while(any(sapply(X, is.list))) X <- purrr::flatten(X)
names(X)
}
gives
leaf.names(X)
# [1] "a" "alpha.c" "" "d"
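If adding a package dependency is not desirable, a recursive helper in base R is one possible alternative. This is only a sketch; leaf_names_base is a hypothetical name, and it assumes that unnamed leaves should be reported as empty strings, as in the output above.

```r
leaf_names_base <- function(x) {
  nms <- names(x)
  # A list with no names at all has names(x) == NULL; treat every name as ""
  if (is.null(nms)) nms <- rep("", length(x))
  # Recurse into sub-lists, keep the name of anything that is not a list
  out <- Map(function(el, nm) if (is.list(el)) leaf_names_base(el) else nm,
             x, nms)
  unlist(out, use.names = FALSE)
}

X <- list(list(a = pi, b = list(alpha.c = 1:5, 7.12)), d = "a test")
leaf_names_base(X)
# [1] "a"       "alpha.c" ""        "d"
```

Two nested lists can then be compared with identical(leaf_names_base(X), leaf_names_base(Y)).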

Extracting coefficients while looping over variable names

I'm working on some time-series stuff in R (version 3.4.1), and would like to extract coefficients from regressions I ran, in order to do further analysis.
All results are so far saved as uGARCHfit objects, which are basically complicated list objects, from which I want to extract the coefficients in the following manner.
What I want is in essence this:
for(i in list){
i_GARCH_mxreg <- i_GARCH@fit$robust.matcoef[5,1]
}
"list" is a list object, where every element is the name of one observation. For now, I want my loop to create a new numeric object named as I specified in the loop.
Now this obviously doesn't work because the index, 'i', isn't replaced as I would want it to be.
How do I rewrite my loop appropriately?
Minimal working example:
list <- as.list(c("one", "two", "three"))
one_a <- 1
two_a <- 2
three_a <- 3
for (i in list){
i_b <- i_a
}
what this should give me would be:
> one_b
[1] 1
> two_b
[1] 2
> three_b
[1] 3
Clarification:
I want to extract the coefficients from multiple list objects. These are named in the manner 'string'_obj. The problem is that I don't have a function that would extract these coefficients, and the list "is not subsettable", so I have to call the individual objects via obj@fit$robust.matcoef[5,1] (or is there another way?). I wanted to use the loop to take my list of strings and, in every iteration, take one string, evaluate 'string'_obj@fit$robust.matcoef[5,1], and save this value into an object, named again with " 'string'_name ".
It might well be easier to have this in a list rather than in individual objects, as someone suggested with lapply, but this is not my primary concern right now.
There is likely an easy way to do this, but I am unable to find it. Sorry for any confusion and thanks for any help.
The following should match your desired output:
# your list
l <- as.list(c("one", "two", "three"))
one_a <- 1
two_a <- 2
three_a <- 3
# my workspace: note that there is no one_b, two_b, three_b
ls()
[1] "l" "one_a" "three_a" "two_a"
for (i in l){
# first, let's define the names as characters, using paste:
dest <- paste0(i, "_b")
orig <- paste0(i, "_a")
# then let's assign the values. Since we are working with
# characters, the functions assign and get come in handy:
assign(dest, get(orig) )
}
# now let's check my workspace again. Note one_b, two_b, three_b
ls()
[1] "dest" "i" "l" "one_a" "one_b" "orig" "three_a"
[8] "three_b" "two_a" "two_b"
# let's check that the values are correct:
one_b
[1] 1
two_b
[1] 2
three_b
[1] 3
To comment on the functions used: assign takes a character as first argument, which is supposed to be the name of the newly created object. The second argument is the value of that object. get takes a character and looks up the value of the object in the workspace with the same name as that character. For instance, get("one_a") will yield 1.
Also, just to follow up on my comment earlier: If we already had all the coefficients in a list, we could do the following:
# hypothetical coefficients stored in list:
lcoefs <- list(1,2,3)
# let's name the coefficients:
lcoefs <- setNames(lcoefs, paste0(c("one", "two", "three"), "_c"))
# push them into the global environment:
list2env(lcoefs, envir = .GlobalEnv)
# look at environment:
ls()
[1] "dest" "i" "l" "lcoefs" "one_a" "one_b" "one_c"
[8] "orig" "three_a" "three_b" "three_c" "two_a" "two_b" "two_c"
one_c
[1] 1
two_c
[1] 2
three_c
[1] 3
And to address the comments, here a slightly more realistic example, taking the list-structure into account:
l <- as.list(c("one", "two", "three"))
# let's "hide" the values in a list:
one_a <- list(val = 1)
two_a <- list(val = 2)
three_a <- list(val = 3)
for (i in l){
dest <- paste0(i, "_b")
orig <- paste0(i, "_a")
# let's get the list-object:
tmp <- get(orig)
# extract value:
val <- tmp$val
assign(dest, val )
}
one_b
[1] 1
two_b
[1] 2
three_b
[1] 3
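As a sketch of the list-based route mentioned above, mget() can fetch all the source objects in one call, and the values can be kept together in a single named vector instead of separate objects. The names keys, objs and coefs are illustrative; for the real uGARCHfit objects, the extractor would presumably be the question's own expression (o@fit$robust.matcoef[5, 1]) in place of o$val, which is untested here.

```r
# Stand-ins for the fitted objects, as in the example above
one_a   <- list(val = 1)
two_a   <- list(val = 2)
three_a <- list(val = 3)

keys <- c("one", "two", "three")

# Fetch all source objects at once; mget() returns a named list
objs <- mget(paste0(keys, "_a"))

# Pull one number out of each object and keep them in one named vector
coefs <- vapply(objs, function(o) o$val, numeric(1))
names(coefs) <- paste0(keys, "_b")

coefs[["two_b"]]
# [1] 2
```

A named vector like this can be indexed by name in later analysis, which avoids having to build object names with paste0 again.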

R: replace the whole part of a number but keep the decimal part unchanged

ok I'm very new to R and this is a general question. I have various numbers and I want to replace the whole part and keep the decimal part unchanged.
For example I have values from 25.01 to 25.99, I want to replace 25 with 10 but keep the decimal part unchanged (so my new numbers would be from 10.01 to 10.99).
Is that possible?
The commenters are right. It doesn't matter how long the list of numbers is. R can do the operation in one go.
x <- c(25.01, 25.8, 25.4)
x-15
[1] 10.01 10.80 10.40
If you do run into a situation with many wide-ranging numbers and need to literally keep what comes after the decimal and pick a number to put in front, use something like this:
keep.decimal <- function(x, n=10) {
paste0(n, gsub('\\d*(\\.\\d*)', '\\1', x))
}
Then you can even choose which number you want in front of it.
keep.decimal(x)
[1] "10.01" "10.8" "10.4"
keep.decimal(x, 50)
[1] "50.01" "50.8" "50.4"
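Note that keep.decimal returns character strings. If numeric output is needed, one possible sketch is to work with the fractional part directly. replace_whole is a hypothetical name, and the results are subject to the usual floating-point rounding, so they are compared with all.equal rather than ==.

```r
replace_whole <- function(x, n = 10) {
  # x - trunc(x) is the fractional part (it keeps the sign of x)
  n + (x - trunc(x))
}

x <- c(25.01, 25.8, 25.4)
y <- replace_whole(x)

is.numeric(y)
# [1] TRUE
isTRUE(all.equal(y, c(10.01, 10.8, 10.4)))
# [1] TRUE
```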
Explanation of vectorization
When a vector is defined like x <- 1:5, R will attempt to execute operations element by element when possible. If 2 had to be added to each element of x and assigned to a new variable y, it could be done with a for loop, but that is verbose:
y <- NULL
for (i in 1:length(x)) {
y[i] <- x[i] + 2
}
y
[1] 3 4 5 6 7
But with R, this operation is simplified to:
y <- x + 2
This operational technique extends further than arithmetic. Here's a problem to solve: write an expression that pairs each lowercase letter with its capitalized counterpart. So turn this:
x <- c("a", "b", "d")
x
[1] "a" "b" "d"
Into
y
[1] "aA" "bB" "dD"
I can write
y <- paste0(x, LETTERS[match(x, letters)])
y
[1] "aA" "bB" "dD"
Instead of:
y <- NULL
for(i in 1:length(x)) {
y[i] <- paste0(x[i], LETTERS[match(x[i], letters)])
}
y
[1] "aA" "bB" "dD"
It is much faster too. As an analogy, think of restaurant service. R is your server, ready to fetch what you want. If you ask your server for ketchup, they get the ketchup and deliver it. Then you ask for a napkin. They go back to get the napkin and deliver it. Then you ask for a refill.
This process takes a long time. If you had asked for ketchup, a napkin, and a refill at the same time, the server could have optimized their trip to get everything fastest. Many of R's built-in functions are written in C and are very fast, and we exploit that speed every time we vectorize.
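To see the difference on your own machine, a rough timing sketch like the following can be used. The function name loop_add and the vector length are arbitrary choices, and the timings vary by machine, so no numbers are shown.

```r
x <- seq_len(1e5)

# Element-by-element version, with the result vector pre-allocated
loop_add <- function(x) {
  y <- numeric(length(x))
  for (i in seq_along(x)) {
    y[i] <- x[i] + 2
  }
  y
}

t_loop <- system.time(y_loop <- loop_add(x))
t_vec  <- system.time(y_vec <- x + 2)

# Both approaches produce the same values
identical(y_loop, y_vec)
# [1] TRUE

# Compare the "elapsed" entries of the two timings
t_loop[["elapsed"]]
t_vec[["elapsed"]]
```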

R: Replacing rownames of data frame by a substring[2]

I have a question about the use of gsub. The rownames of my data share the same partial names. See below:
> rownames(test)
[1] "U2OS.EV.2.7.9" "U2OS.PIM.2.7.9" "U2OS.WDR.2.7.9" "U2OS.MYC.2.7.9"
[5] "U2OS.OBX.2.7.9" "U2OS.EV.18.6.9" "U2O2.PIM.18.6.9" "U2OS.WDR.18.6.9"
[9] "U2OS.MYC.18.6.9" "U2OS.OBX.18.6.9" "X1.U2OS...OBX" "X2.U2OS...MYC"
[13] "X3.U2OS...WDR82" "X4.U2OS...PIM" "X5.U2OS...EV" "exp1.U2OS.EV"
[17] "exp1.U2OS.MYC" "EXP1.U20S..PIM1" "EXP1.U2OS.WDR82" "EXP1.U20S.OBX"
[21] "EXP2.U2OS.EV" "EXP2.U2OS.MYC" "EXP2.U2OS.PIM1" "EXP2.U2OS.WDR82"
[25] "EXP2.U2OS.OBX"
In my previous question, I asked if there is a way to get the same names for the same partial names. See this question: Replacing rownames of data frame by a sub-string
The answer is a very nice solution. The function gsub is used in this way:
transfecties = gsub(".*(MYC|EV|PIM|WDR|OBX).*", "\\1", rownames(test))
Now, I have another problem, the program I run with R (Galaxy) doesn't recognize the | characters. My question is, is there another way to get to the same solution without using this |?
Thanks!
If you don't want to use the "|" character, you can try something like :
Rnames <-
c( "U2OS.EV.2.7.9", "U2OS.PIM.2.7.9", "U2OS.WDR.2.7.9", "U2OS.MYC.2.7.9" ,
"U2OS.OBX.2.7.9" , "U2OS.EV.18.6.9" ,"U2O2.PIM.18.6.9" ,"U2OS.WDR.18.6.9" )
Rlevels <- c("MYC","EV","PIM","WDR","OBX")
tmp <- sapply(Rlevels,grepl,Rnames)
apply(tmp,1,function(i)colnames(tmp)[i])
[1] "EV" "PIM" "WDR" "MYC" "OBX" "EV" "PIM" "WDR"
But I would seriously consider mentioning this to the Galaxy team, as it seems rather awkward not to be able to use the symbol for OR...
I wouldn't recommend doing this in general in R as it is far less efficient than the solution @csgillespie provided, but an alternative is to loop over the various strings you want to match and do the replacements on each string separately, i.e. search for "MYC" and replace only in those rownames that match "MYC".
Here is an example using the x data from @csgillespie's Answer:
x <- c("U2OS.EV.2.7.9", "U2OS.PIM.2.7.9", "U2OS.WDR.2.7.9", "U2OS.MYC.2.7.9",
"U2OS.OBX.2.7.9", "U2OS.EV.18.6.9", "U2O2.PIM.18.6.9","U2OS.WDR.18.6.9",
"U2OS.MYC.18.6.9","U2OS.OBX.18.6.9", "X1.U2OS...OBX","X2.U2OS...MYC")
Copy the data so we have something to compare with later (this just for the example):
x2 <- x
Then create a list of strings you want to match on:
matches <- c("MYC","EV","PIM","WDR","OBX")
Then we loop over the values in matches and do three things (numbered ## 1 to ## 3 in the code):
1. Create the regular expression by pasting together the current match string i with the other bits of the regular expression we want to use.
2. Use grepl() to return a logical indicator for those elements of x that contain the string i.
3. Use the same style of gsub() call as you were already shown, but only on the elements of x2 that matched the string, replacing only those elements.
The loop is:
for(i in matches) {
rgexp <- paste(".*(", i, ").*", sep = "") ## 1
ind <- grepl(rgexp, x) ## 2
x2[ind] <- gsub(rgexp, "\\1", x2[ind]) ## 3
}
x2
Which gives:
> x2
[1] "EV" "PIM" "WDR" "MYC" "OBX" "EV" "PIM" "WDR" "MYC" "OBX" "OBX" "MYC"
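Another possible workaround, if the problem is only that the literal | character cannot appear in the script text, is to build the alternation pattern at run time from its code point. This is a sketch under that assumption and has not been tested inside Galaxy; the names bar, pattern and x3 are illustrative.

```r
x <- c("U2OS.EV.2.7.9", "U2OS.PIM.2.7.9", "U2OS.WDR.2.7.9", "U2OS.MYC.2.7.9",
       "U2OS.OBX.2.7.9", "U2OS.EV.18.6.9", "U2O2.PIM.18.6.9", "U2OS.WDR.18.6.9",
       "U2OS.MYC.18.6.9", "U2OS.OBX.18.6.9", "X1.U2OS...OBX", "X2.U2OS...MYC")
matches <- c("MYC", "EV", "PIM", "WDR", "OBX")

# Code point 124 is the vertical bar, so the character never appears in the source
bar <- intToUtf8(124)

# Same pattern as the original gsub() call, assembled from the pieces
pattern <- paste0(".*(", paste(matches, collapse = bar), ").*")

x3 <- gsub(pattern, "\\1", x)
x3
# [1] "EV"  "PIM" "WDR" "MYC" "OBX" "EV"  "PIM" "WDR" "MYC" "OBX" "OBX" "MYC"
```

This keeps the single vectorized gsub() call, so it avoids the per-string loop.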

Resources