R: Make character string refer to an object

I have a large list of files (file1, file2, file3, etc.) and, for each analysis, I want to refer to two files from this list (e.g. function(file1, file2)). When I try to do this using paste0("file", pairs[1,x]), I get back the character string "file1" rather than the object file1.
How can I refer to the objects rather than create a character string?
Thank you very much!
Additional comment:
pairs is a 2xn matrix where each column is the combination of files for one analysis (e.g. pairs[1,1] = 1 and pairs[2,1] = 2 for the comparison between file1 and file2).

Are you looking for get()?
> a <- 1:5
> get("a")
[1] 1 2 3 4 5
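Applied to the paired-files question, get() can be combined with paste0() inside a loop over the columns of pairs. The following is only a sketch: the contents of file1 to file3 and the compare() function are made-up placeholders, since the original post does not show the real data or analysis.

```r
# Hypothetical stand-ins for the real objects file1, file2, file3
file1 <- c(1, 2, 3)
file2 <- c(4, 5, 6)
file3 <- c(7, 8, 9)

# Each column of `pairs` holds the two file numbers for one analysis
pairs <- combn(3, 2)

# compare() is a placeholder for the actual analysis function
compare <- function(x, y) sum(x) - sum(y)

# Build each object name as a string, then fetch the object with get()
results <- sapply(seq_len(ncol(pairs)), function(i) {
  first  <- get(paste0("file", pairs[1, i]))
  second <- get(paste0("file", pairs[2, i]))
  compare(first, second)
})
results
```

Here get() turns the string built by paste0() into the object of that name, so compare() receives the data rather than the character string.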

How to get the variable from a string containing the variable name:
> a = 10
> string = "a"
> string
[1] "a"
> eval(parse(text = string))
[1] 10
> eval(parse(text = "a"))
[1] 10
Hope this helps.

Another alternative:
eval(as.name(paste0("file", pairs[1, x])))
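If the objects already exist under numbered names, it may be easier to collect them into one named list and index that list by name. This is a sketch of that design choice, not something the answers above propose; the values assigned to file1 to file3 are placeholders.

```r
# Placeholder objects standing in for the real files
file1 <- "first"
file2 <- "second"
file3 <- "third"

# mget() looks up several names at once and returns a named list
files <- mget(paste0("file", 1:3))

# A computed name now works with ordinary [[ ]] indexing
picked <- files[[paste0("file", 2)]]
picked
```

With the objects in a list, no further calls to get() or eval() are needed, and the list can be passed to lapply() or Map() directly.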

Related

Finding consecutive values in a string in R

I am trying to find 3 or more consecutive "a" within the last 10 letters of my data frame string. My data frame looks like this:
V1
aaashkjnlkdjfoin
jbfkjdnsnkjaaaas
djshbdkjaaabdfkj
jbdfkjaaajbfjna
ndjksnsjksdnakns
aaaandfjhsnsjna
I have written some code, but it only returns the number of consecutive "a" in the whole string. I want it to look only at the last 10 characters and then print the strings where the consecutive "a" are found. The code I have written is:
out: [1] 3
I am wanting my output to look like this:
jbfkjdnsnkjaaaas
djshbdkjaaabdfkj
jbdfkjaaajbfjna
Can anyone help?
Using regex, you could do:
string <- c("aaashkjnlkdjfoin", "jbfkjdnsnkjaaaas", "djshbdkjaaabdfkj",
"jbdfkjaaajbfjna", "ndjksnsjksdnakns", "aaaandfjhsnsjna")
grep("(?=.{10}$).*?a{3,}", string, perl = TRUE, value = TRUE)
[1] "jbfkjdnsnkjaaaas" "djshbdkjaaabdfkj" "jbdfkjaaajbfjna"
If you have a data frame and need to subset it:
df <- data.frame(V1 = string)
subset(df, grepl("(?=.{10}$).*?a{3}", V1, perl = TRUE))
V1
2 jbfkjdnsnkjaaaas
3 djshbdkjaaabdfkj
4 jbdfkjaaajbfjna
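For comparison, the same filter can be written without a lookahead by cutting out the last 10 characters first and then searching only that piece. This is an alternative sketch, not the answerer's method; the name tail10 is introduced here for illustration.

```r
string <- c("aaashkjnlkdjfoin", "jbfkjdnsnkjaaaas", "djshbdkjaaabdfkj",
            "jbdfkjaaajbfjna", "ndjksnsjksdnakns", "aaaandfjhsnsjna")

# Take the last 10 characters of each string (or the whole string if shorter)
n <- nchar(string)
tail10 <- substr(string, pmax(n - 9, 1), n)

# Keep the strings whose tail contains three or more consecutive "a"
result <- string[grepl("a{3,}", tail10)]
result
```

One difference worth noting: the lookahead pattern only matches strings with at least 10 characters, while this version also checks shorter strings.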

String matching within a list of lists [duplicate]

I have a list like this:
map_tmp <- list("ABC",
c("EGF", "HIJ"),
c("KML", "ABC-IOP"),
"SIN",
"KMLLL")
> grep("ABC", map_tmp)
[1] 1 3
> grep("^ABC$", map_tmp)
[1] 1 # by using regex, I get the index of "ABC" in the list
> grep("^KML$", map_tmp)
[1] 5 # I wanted 3, but I got 5. Claiming the end of a string by "$" didn't help in this case.
> grep("^HIJ$", map_tmp)
integer(0) # the regex do not return to me the index of a string inside the vector
How can I get the index of a string (exact match) in the list?
I'm ok not to use grep. Is there any way to get the index of a certain string (exact match) in the list? Thanks!
Using lapply:
which(lapply(map_tmp, function(x) grep("^HIJ$", x))!=0)
The lapply call returns, for each element of the list, the positions within that element where the pattern matches (an empty integer vector if there is no match). Comparing the result with != 0 and passing it to which() then gives the index of the list element in which your string occurs.
Use either mapply or Map with str_detect to find the position. I have run it only for the string "KML"; you can run it for the others in the same way. I hope this is helpful.
First of all, we pad the list elements to the same length so that we can process them easily:
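Since the goal is an exact match rather than a pattern match, a variant without regular expressions is also possible. The sketch below uses %in% inside vapply(); find_exact is a hypothetical helper name, not a base R function.

```r
map_tmp <- list("ABC",
                c("EGF", "HIJ"),
                c("KML", "ABC-IOP"),
                "SIN",
                "KMLLL")

# Return the indices of the list elements that contain `target` exactly
find_exact <- function(lst, target) {
  which(vapply(lst, function(x) target %in% x, logical(1)))
}

find_exact(map_tmp, "KML")  # 3
find_exact(map_tmp, "HIJ")  # 2
find_exact(map_tmp, "ABC")  # 1
```

Because %in% compares whole strings, "ABC" does not match "ABC-IOP" and "KML" does not match "KMLLL". It also still returns one logical per list element when an element contains the target more than once.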
library(stringr)
map_tmp_1 <- lapply(map_tmp, `length<-`, max(lengths(map_tmp)))
### Making the list even
val <- t(mapply(str_detect,map_tmp_1,"^KML$"))
> which(val[,1] == T)
[1] 3
> which(val[,2] == T)
integer(0)
In case of "ABC" string:
val <- t(mapply(str_detect,map_tmp_1,"ABC"))
> which(val[,1] == T)
[1] 1
> which(val[,2] == T)
[1] 3
I had the same question. I cannot explain why grep works on a list with a plain character pattern but not with an anchored regex. Anyway, the best way I found to match a character string using base R is:
map_tmp <- list("ABC",
c("EGF", "HIJ"),
c("KML", "ABC-IOP"),
"SIN",
"KMLLL")
sapply( map_tmp , match , 'ABC' )
It returns a list with the same structure as the input, containing NA or 1 depending on the result of the match test:
[[1]]
[1] 1
[[2]]
[1] NA NA
[[3]]
[1] NA NA
[[4]]
[1] NA
[[5]]
[1] NA
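A likely explanation for the grep behaviour is that grep() coerces a non-character argument with as.character(), and for a list this turns each multi-element vector into the text of a c(...) call. Anchors such as ^ and $ then apply to that whole text, not to the individual strings. A small sketch to inspect this:

```r
map_tmp <- list("ABC",
                c("EGF", "HIJ"),
                c("KML", "ABC-IOP"),
                "SIN",
                "KMLLL")

# What grep() effectively searches when given the list
as_text <- as.character(map_tmp)
as_text[2]  # the second element becomes the text of a c(...) call

# "HIJ" is inside that text, but the text does not start and end with it
unanchored <- grepl("HIJ", as_text)
anchored   <- grepl("^HIJ$", as_text)
```

This is consistent with grep("^HIJ$", map_tmp) returning integer(0) while an unanchored pattern still finds the element.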

R: return column using get() and paste()

Why does get() in combination with paste() work for dataframes but not for columns within a dataframe? How can I make it work?
ab<-12
get(paste("a","b",sep=""))
# gives: [1] 12
ab<-data.frame(a=1:3,b=3:5)
ab$a
#gives: [1] 1 2 3
get(paste("a","b",sep=""))
# gives the whole dataframe
get(paste("ab$","a",sep=""))
# gives: Error in get(paste("ab$", "a", sep = "")) : object 'ab$a' not found
Columns in dataframes are not first class objects. Their "names" are really indexing values for list-extraction. Despite the understandable confusion caused by the existence of the names function, they are not true R-names, i.e. unquoted tokens or symbols, in the list of R objects. See the ?is.symbol help page. The get function takes a character value, and then looks for it in the workspace and returns it for further processing.
> ab<-data.frame(a=1:3,b=3:5)
> ab$a
[1] 1 2 3
> get(paste("a","b",sep=""))
a b
1 1 3
2 2 4
3 3 5
>
> # and this would be the way to get the 'a' column of the ab object
get(paste("ab",sep=""))[['a']]
If there were a named object target with the value "a", then you could also do:
target <- "a"
get(paste("ab",sep=""))[[target]] # notice no quotes around target
# because `target` is a _real_ R name
It doesn't work because get() interprets the string it's passed as referring to an object named "ab$a" (not as referring to the element named "a" of the object named "ab"). Here's probably the best way to see what that means:
ab<-data.frame(a=1:3,b=3:5)
`ab$a` <- letters[1:3]
get("ab$a")
# [1] "a" "b" "c"
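Putting the two answers together, both the object name and the column name can be built with paste(), as long as get() only receives the object name and the column is extracted with [[ ]]. A minimal sketch, where obj_name and col_name are names introduced here for illustration:

```r
ab <- data.frame(a = 1:3, b = 3:5)

obj_name <- paste("a", "b", sep = "")  # "ab", the name of the data frame
col_name <- paste("a", sep = "")       # "a", the name of the column

# get() fetches the data frame; [[ ]] extracts the column by its string name
column <- get(obj_name)[[col_name]]
column
```

If the data frame is already at hand, ab[[col_name]] gives the same result without get().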

Storing array of words from string in a list of full names in R

I have a list of 120777 records which contains names of people. I want to store an array of name parts for each record in the dataset. I tried this in R.
my_list$name_parts<- strsplit(my_list$name, " ")
I get my_list$name_parts as a list of 120777 items. When I try querying the number of words in each name using length(my_list$name_parts), I get 120777 for all.
Let's use this simple example:
my_list <- list()
my_list$name <- c("toto t. tutu", "foo bar")
To get the number of words, you can do this:
lapply(strsplit(my_list$name," "), length)
which gives in the simple example above:
[[1]]
[1] 3
[[2]]
[1] 2
To avoid getting a list, you can even do:
unlist(lapply(strsplit(my_list$name," "), length))
[1] 3 2
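The result in the question follows from length() counting the elements of the list itself, that is, the number of records. Base R also has lengths(), which returns the length of each element in one call. A short sketch on the same toy data:

```r
my_list <- list()
my_list$name <- c("toto t. tutu", "foo bar")
my_list$name_parts <- strsplit(my_list$name, " ")

# length() counts the records, not the words per record
n_records <- length(my_list$name_parts)

# lengths() gives the number of name parts for each record
n_words <- lengths(my_list$name_parts)
n_words
```

On this data n_records is 2 and n_words matches the unlist(lapply(...)) result above.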
