How to convert object variable into string inside a function - r

I have the following list of vectors
v1 <- c("foo","bar")
v2 <- c("qux","uip","lsi")
mylist <- list(v1,v2)
mylist
#> [[1]]
#> [1] "foo" "bar"
#>
#> [[2]]
#> [1] "qux" "uip" "lsi"
What I want to do is apply a function so that it prints this string:
v1:foo,bar
v2:qux,uip,lsi
So it involves two steps: 1) convert the object's variable name to a string, and
2) collapse the vector into a string. The latter is easy, as I can do this:
make_string <- function (content_vector) {
cat(content_vector,sep=",")
}
make_string(mylist[[1]])
# foo,bar
make_string(mylist[[2]])
# qux,uip,lsi
I am aware of this solution, but I don't know how to turn the object name into a string within a function so that
it prints my desired output.
I need to do this inside a function, because there are many other outputs I need to process.

We can use
cat(paste(c('v1', 'v2'), sapply(mylist, toString), sep=":", collapse="\n"), '\n')
#v1:foo, bar
#v2:qux, uip, lsi
If we need to pass the original object i.e. 'v1', 'v2'
make_string <- function(vec){
obj <- deparse(substitute(vec))
paste(obj, toString(vec), sep=":")
}
make_string(v1)
#[1] "v1:foo, bar"

If you want to use a list, you can name the objects in the list to be able to use them in a function. Remove the cat if you just want a string to be returned.
v1 <- c("foo","bar")
v2 <- c("qux","uip","lsi")
# objects given names here
mylist <- list("v1" = v1, "v2" = v2)
# see names now next to the $
mylist
$v1
[1] "foo" "bar"
$v2
[1] "qux" "uip" "lsi"
make_string <- function (content_vector) {
vecname <- names(content_vector)
cat(paste0(vecname, ":", paste(sapply(content_vector, toString), sep = ",")))
}
make_string(mylist[1])
v1:foo, bar
make_string(mylist[2])
v2:qux, uip, lsi
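The named-list idea can also be sketched so that the whole list is formatted in one call, and with a bare comma (no space) between elements as in the question's desired output. This is only a sketch: format_named is a hypothetical helper name, and it assumes every list element is an atomic vector.

```r
v1 <- c("foo", "bar")
v2 <- c("qux", "uip", "lsi")
mylist <- list(v1 = v1, v2 = v2)

# format_named is a hypothetical helper, not from the answers above.
# It builds one "name:elem1,elem2" string per list element.
format_named <- function(x) {
  # collapse = "," joins each vector without the space that toString() adds
  body <- vapply(x, paste, character(1), collapse = ",")
  paste0(names(x), ":", body)
}

lines <- format_named(mylist)
cat(lines, sep = "\n")
# v1:foo,bar
# v2:qux,uip,lsi
```

Returning the character vector and calling cat() separately keeps the helper usable when the strings are needed for something other than printing.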

Related

Accessing column name inside lapply

I use 'deparse(substitute(x))' from inside my function to get the name of the dataframe column passed as argument. It works great... but not with 'lapply'
myfun <- function(x)
{
return(deparse(substitute(x)))
}
a <- c(1,2,3)
b <- c(4,5,5)
df<-data.frame(a,b)
myfun(df$a)
[1] "df$a"
but, with 'lapply'...
lapply(df, myfun)
$a
[1] "X[[i]]"
$b
[1] "X[[i]]"
How can I get the name inside 'lapply'?
EDIT: I need to access not the column name but the FULL NAME (dataFrameName$varName)
You can use colnames():
f=function(d) {
paste0(deparse(substitute(d)),"$",colnames(d))
}
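If a per-column result is still wanted, one possible sketch is to capture the data frame's name once, outside lapply, and then iterate over the column names rather than the columns. The function name full_names is made up for illustration.

```r
df <- data.frame(a = c(1, 2, 3), b = c(4, 5, 5))

# full_names is a hypothetical helper. The data frame's name is captured
# once, before lapply() runs, because inside lapply() the argument would
# deparse to "X[[i]]".
full_names <- function(d) {
  d_name <- deparse(substitute(d))
  cols <- setNames(colnames(d), colnames(d))
  lapply(cols, function(col) paste0(d_name, "$", col))
}

res <- full_names(df)
res$a
# [1] "df$a"
```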

Split vector, categorized by regex

I am searching for a method to split a character vector based on a RegEx pattern.
Example of input:
input <- c("a_foo","b_foo", "c_bar", "d_bar")
split_by <- c("foo", "bar")
The result I am searching for:
$foo
[1] "a_foo" "b_foo"
$bar
[1] "c_bar" "d_bar"
EDIT
Based on the comments and answers, there is need for a clarification.
split_by can have any number of elements;
the RegEx pattern varies from case to case; and
an element in input may be assigned to 0 (no matches), 1, or multiple splits depending on the match.
Hence, the following input:
input <- c("foo_bar", "nothing", "a_foo", "c_bar")
split_by <- c("foo", "bar")
Could return:
$foo
[1] "foo_bar" "a_foo"
$bar
[1] "foo_bar" "c_bar"
In the real case, can you extract the split_by values from the input data?
This works for the example shared.
split(input, sub('.*_', '', input))
#$bar
#[1] "c_bar" "d_bar"
#$foo
#[1] "a_foo" "b_foo"
where
sub('.*_', '', input) #returns
#[1] "foo" "foo" "bar" "bar"
lapply(split_by, grep, x = input, value = TRUE)
# [[1]]
# [1] "a_foo" "b_foo"
#
# [[2]]
# [1] "c_bar" "d_bar"
To get named output you could do:
lapply(setNames(split_by, split_by), grep, x = input, value = TRUE)
split.regex <- function(input, split_by, pattern, add_names=TRUE) {
out <- lapply(split_by, function(x) {
input[grepl(sprintf(pattern, x), input)]
})
if (add_names) {
names(out) <- split_by
}
return(out)
}
First, the pattern must be defined. Since foo and bar occur at the end of the string, sprintf("%s$", split_by) can be used. The sprintf call happens inside the function, so the pattern argument should be given as the sprintf format string "%s$".
First example
By defining input and split_by as in the question's first example, and then running:
split.regex(input=input, split_by=split_by, pattern="%s$", add_names=TRUE)
We get the desired result:
$foo
[1] "a_foo" "b_foo"
$bar
[1] "c_bar" "d_bar"
Second example
By defining input and split_by as in the question's second example, and then running:
split.regex(input=input, split_by=split_by, pattern="(%s)", add_names=TRUE)
We get the desired result:
$foo
[1] "foo_bar" "a_foo"
$bar
[1] "foo_bar" "c_bar"
Since the input "nothing" didn't match on any, it was correctly excluded from the split, whereas "foo_bar" was correctly added to both splits as it matched on both.
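For comparison, the shorter grep-based approach from the earlier answer appears to handle the clarified case as well. The sketch below applies it to the question's second example; the variable name res is only for illustration.

```r
input <- c("foo_bar", "nothing", "a_foo", "c_bar")
split_by <- c("foo", "bar")

# Each element of split_by is used as the grep pattern; value = TRUE
# returns the matching strings instead of their positions.
res <- lapply(setNames(split_by, split_by), grep, x = input, value = TRUE)

res$foo
# [1] "foo_bar" "a_foo"
res$bar
# [1] "foo_bar" "c_bar"
```

Because every pattern is matched against the full input independently, an element can land in zero, one, or several groups.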

Collapsing mixed types into a neat comma separated string

I have a list of mixed types which I would like to collapse into a neat comma separated string to be read somewhere else. The following is a MWE:
a <- "name"
b <- as.vector(c(10))
names(b) <- c('s')
c <- as.vector(c(1, 2))
names(c) <- c('p1', 'p2')
d <- 20
r <- list(a, b, c, d)
r
# [[1]]
# [1] "name"
#
# [[2]]
# s
# 10
#
# [[3]]
# p1 p2
# 1 2
#
# [[4]]
# [1] 20
I want this:
# [1] '"name","10","1,2","20"'
But this is as far as I got:
# Collapse individual elements into individual strings.
# `sapply` with `paste` works perfectly:
> sapply(r, paste, collapse = ",")
# [1] "name" "10" "1,2" "20"
# Try paste again (doesn't work):
> paste(sapply(r, paste, collapse = ","), collapse = ',')
# [1] "name,10,1,2,20"
I tried paste0, cat to no avail. The only way I could do it is using write.table and passing it a buffer memory. That way is too complicated, and quite error prone. I need to have my code working on a cluster with MPI.
You need to add in the quotes - the ones printed after your sapply are just markers to show they are strings. This seems to work...
cat(paste0('"',sapply(r, paste, collapse = ','),'"',collapse=','))
"name","10","1,2","20"
You might need to try with and without the cat if you are writing to a file. Without it, at the terminal, you get backslashes before the 'real' quotes.
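A variant of the same idea, sketched with sprintf for the quoting and with the result kept as a string rather than printed immediately. The names fields and out are illustrative only.

```r
r <- list("name", c(s = 10), c(p1 = 1, p2 = 2), 20)

# Collapse each list element to a single string first.
fields <- vapply(r, paste, character(1), collapse = ",")

# Wrap each field in literal double quotes, then join the fields.
out <- paste(sprintf('"%s"', fields), collapse = ",")

# writeLines() shows the raw characters; print(out) would show escaped quotes.
writeLines(out)
# "name","10","1,2","20"
```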

R: Argument as variablename and string in function? [duplicate]

I am looking for the reverse of get().
Given an object name, I wish to have the character string representing that object extracted directly from the object.
Trivial example with foo being the placeholder for the function I am looking for.
z <- data.frame(x=1:10, y=1:10)
test <- function(a){
mean.x <- mean(a$x)
print(foo(a))
return(mean.x)}
test(z)
Would print:
"z"
My work around, which is harder to implement in my current problem is:
test <- function(a="z"){
mean.x <- mean(get(a)$x)
print(a)
return(mean.x)}
test("z")
The old deparse-substitute trick:
a<-data.frame(x=1:10,y=1:10)
test<-function(z){
mean.x<-mean(z$x)
nm <-deparse(substitute(z))
print(nm)
return(mean.x)}
test(a)
#[1] "a" ... this is the side-effect of the print() call
# ... you could have done something useful with that character value
#[1] 5.5 ... this is the result of the function call
Edit: Ran it with the new test-object
Note: this will not succeed inside a local function when list items are passed from the first argument of lapply (it also fails when an object is taken from a list given to a for-loop). If a named vector is being processed, you can extract the ".Names" attribute and the order of processing from the structure result.
> lapply( list(a=4,b=5), function(x) {nm <- deparse(substitute(x)); strsplit(nm, '\\[')} )
$a # This "a" and the next one in the print output are put in after processing
$a[[1]]
[1] "X" "" "1L]]" # Notice that there was no "a"
$b
$b[[1]]
[1] "X" "" "2L]]"
> lapply( c(a=4,b=5), function(x) {nm <- deparse(substitute(x)); strsplit(nm, '\\[')} )
$a
$a[[1]] # but it's theoretically possible to extract when its an atomic vector
[1] "structure(c(4, 5), .Names = c(\"a\", \"b\"))" ""
[3] "1L]]"
$b
$b[[1]]
[1] "structure(c(4, 5), .Names = c(\"a\", \"b\"))" ""
[3] "2L]]"
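Rather than parsing the deparsed call, a common workaround is to pass the names alongside the values. The following is a sketch of that alternative using Map, not something taken from the answer above.

```r
x <- list(a = 4, b = 5)

# Map() walks the values and their names in parallel, so the name is an
# ordinary argument and no deparse(substitute()) is needed.
res <- Map(function(value, name) paste0(name, "=", value), x, names(x))

res$a
# [1] "a=4"
```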
deparse(quote(var))
My intuitive understanding:
quote() freezes the variable or expression so that it is not evaluated,
and deparse(), which is the inverse of parse(), turns that frozen symbol back into a string.
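A small sketch of how that differs from substitute() inside a function. The helper name_of and the variable some_var are hypothetical names used only for this illustration.

```r
# quote() returns the expression exactly as typed; var does not need to exist.
deparse(quote(var))
# [1] "var"

# Inside a function, quote(arg) would always give "arg", whereas
# substitute(arg) gives the expression the caller supplied.
name_of <- function(arg) deparse(substitute(arg))

some_var <- 42
name_of(some_var)
# [1] "some_var"
```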
Note that for print methods the behavior can be different.
print.foo=function(x){ print(deparse(substitute(x))) }
test = list(a=1, b=2)
class(test)="foo"
#this shows "test" as expected
print(test)
#this (just typing 'test' on the R command line)
test
#shows
#"structure(list(a = 1, b = 2), .Names = c(\"a\", \"b\"), class = \"foo\")"
Other comments I've seen on forums suggest that the last behavior is unavoidable. This is unfortunate if you are writing print methods for packages.
To elaborate on Eli Holmes' answer:
myfunc works beautifully when called directly.
I was tempted to call it within another function (as discussed in his Aug 15, '20 comment).
That fails: the wrapper's own argument name is returned instead.
Within a function, coded directly (rather than called from an external function), the deparse(substitute()) trick works well.
This is all implicit in his answer, but for the benefit of peeps with my degree of obliviousness, I wanted to spell it out.
an_object <- mtcars
myfunc <- function(x) deparse(substitute(x))
myfunc(an_object)
#> [1] "an_object"
# called within another function
wrapper <- function(x){
myfunc(x)
}
wrapper(an_object)
#> [1] "x"
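One way around this, sketched under the assumption that the inner function can be changed, is to give it an explicit name argument that defaults to the deparsed expression, and to have the wrapper capture the name itself. The names get_name and wrapper2 are hypothetical.

```r
# get_name is a hypothetical variant of myfunc with an overridable name.
get_name <- function(x, name = deparse(substitute(x))) name

# The wrapper deparses its own argument and forwards the result, so the
# inner function never has to look through the extra call layer.
wrapper2 <- function(x) {
  get_name(x, name = deparse(substitute(x)))
}

an_object <- 1:3
get_name(an_object)
# [1] "an_object"
wrapper2(an_object)
# [1] "an_object"
```

This only covers one level of wrapping; each additional layer would need to forward the name in the same way.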

subset() drops attributes on vectors; how to maintain/persist them?

Let's say I have a vector where I've set a few attributes:
vec <- sample(50:100,1000, replace=TRUE)
attr(vec, "someattr") <- "Hello World"
When I subset the vector, the attributes are dropped. For example:
tmp.vec <- vec[which(vec > 80)]
attributes(tmp.vec) # Now NULL
Is there a way to subset and persist attributes without having to save them to another temporary object?
Bonus: Where would one find documentation of this behaviour?
I would write a method for [ or subset() (depending on how you are subsetting) and arrange for that to preserve the attributes. That would also need a "class" attribute to be added to your vector so that dispatch occurs.
vec <- 1:10
attr(vec, "someattr") <- "Hello World"
class(vec) <- "foo"
At this point, subsetting removes attributes:
> vec[1:5]
[1] 1 2 3 4 5
If we add a method [.foo we can preserve the attributes:
`[.foo` <- function(x, i, ...) {
attrs <- attributes(x)
out <- unclass(x)
out <- out[i]
attributes(out) <- attrs
out
}
Now the desired behaviour is preserved
> vec[1:5]
[1] 1 2 3 4 5
attr(,"someattr")
[1] "Hello World"
attr(,"class")
[1] "foo"
And the answer to the bonus question:
From ?"[" in the details section:
Subsetting (except by an empty index) will drop all attributes except names, dim and dimnames.
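If adding a class is not wanted, the documented behaviour suggests another possible sketch: subset normally, then copy back every attribute other than names, dim and dimnames. The helper keep_attrs_subset is a made-up name, and this assumes the extra attributes do not depend on the vector's length.

```r
# keep_attrs_subset is a hypothetical helper, not part of base R.
keep_attrs_subset <- function(x, i) {
  out <- x[i]
  extra <- attributes(x)
  # names, dim and dimnames are already handled by `[`, so skip them
  extra <- extra[setdiff(names(extra), c("names", "dim", "dimnames"))]
  attributes(out) <- c(attributes(out), extra)
  out
}

vec <- 50:100
attr(vec, "someattr") <- "Hello World"

tmp <- keep_attrs_subset(vec, vec > 80)
attr(tmp, "someattr")
# [1] "Hello World"
```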
Thanks to a similar answer to my question by @G. Grothendieck, you can use collapse::fsubset.
library(collapse)
#tmp_vec <- fsubset(vec, vec > 80)
tmp_vec <- sbt(vec, vec > 80) # Shortcut for fsubset
attributes(tmp_vec)
# $someattr
# [1] "Hello World"
