Sapply different than individual application of function - r

When applied individually to each element of the vector, my function gives a different result than using sapply. It's driving me nuts!
Item I'm using: this (simplified) list of arguments another function was called with:
f <- as.list(match.call()[-1])
> f
$ampm
c(1, 4)
To replicate this you can run the following:
foo <- function(ampm) {as.list(match.call()[-1])}
f <- foo(ampm = c(1,4))
Here is my function. It just strips the 'c(...)' from a string.
stripConcat <- function(string) {
sub(')','',sub('c(','',string,fixed=TRUE),fixed=TRUE)
}
When applied alone it works as so, which is what I want:
> stripConcat(f)
[1] "1, 4"
But when used with sapply, it gives something totally different, which I do NOT want:
> sapply(f, stripConcat)
ampm
[1,] "c"
[2,] "1"
[3,] "4"
Lapply doesn't work either:
> lapply(f, stripConcat)
$ampm
[1] "c" "1" "4"
And neither do any of the other apply functions. This is driving me nuts--I thought lapply and sapply were supposed to be identical to repeated applications to the elements of the list or vector!

The discrepency you are seeing, I believe, is simply due to how as.character coerces elements of a list.
x2 <- list(1:3, quote(c(1, 5)))
as.character(x2)
[1] "1:3" "c(1, 5)"
lapply(x2, as.character)
[[1]]
[1] "1" "2" "3"
[[2]]
[1] "c" "1" "5"
f is not a call, but a list whose first element is a call.
is(f)
[1] "list" "vector"
as.character(f)
[1] "c(1, 4)"
> is(f[[1]])
[1] "call" "language"
> as.character(f[[1]])
[1] "c" "1" "4"
sub attempts to coerce anything that is not a character into a chracter.
When you pass sub a list, it calls as.character on the list.
When you pass it a call, it calls as.character on that call.
It looks like for your stripConcat function, you would prefer a list as input.
In that case, I would recommend the following for that function:
stripConcat <- function(string) {
if (!is.list(string))
string <- list(string)
sub(')','',sub('c(','',string,fixed=TRUE),fixed=TRUE)
}
Note, however, that string is a misnomer, since it doesn't appear that you are ever planning to pass stripConcat a string. (not that this is an issue, of course)

Related

How To Add Characters Around Elements

I am using R and want to covert following:
"A,B"
to
"A","B" OR 'A','B'
I tried str_replace(), but that's not working out.
Please suggest, thanks.
Update
I tried the suggested answer by d.b. Though it works, but I didn't realize that I should have shared that, I am going to use above solution for vector. I need the values in data with "A,B" to split in order to use it as a vector.
Using strsplit
> data
[1] "A,B"
> test <- strsplit(x = data, split = ",")
> test
[[1]]
[1] "A" "B"
above test won't be useful because I can't use it for following:
> output_1 <- c(test)
> outputFinalData <- outputFinal[outputFinal$Column %in% output_1,]
outputFinalData is empty with above process. But is not empty when I do:
> output_2 <- c("A", "B")
> outputFinalData <- outputFinal[outputFinal$Column %in% output_2,]
Also, output_1 and output_2 are not same:
> output_1
[[1]]
[1] "Bin_14" "Bin_15"
> output_2
[1] "Bin_14" "Bin_15"
> output_1 == output_2
[1] FALSE FALSE
Use strsplit:
> data = "A,B"
> strsplit(x=data,split=",")
[[1]]
[1] "A" "B"
Note that it returns a list with a vector. The list is length one because you asked it to split one string. If you ask it to split two strings you get a list of length 2:
> data = c("A,B","Foo,bar")
> strsplit(x=data,split=",")
[[1]]
[1] "A" "B"
[[2]]
[1] "Foo" "bar"
So if you know you are only going to have one thing to split you can get a vector of the parts by taking the first element:
> data = "A,B"
> strsplit(x=data,split=",")[[1]]
[1] "A" "B"
However it might be more efficient to do a load of splits in one go and put the bits in a matrix. As long as you can be sure everything splits into the same number of parts, then something like:
> data = c("A,B","Foo,bar","p1,p2")
> do.call(rbind,(strsplit(x=data,split=",")))
[,1] [,2]
[1,] "A" "B"
[2,] "Foo" "bar"
[3,] "p1" "p2"
>
Gets you the two parts in columns of a matrix that you can then add to a data frame if that's what you need.

R Removing an element of a list element

This is more a general question on the behavior of lists in R, but the specific problem is:
I have a list of groups of words which I'm trying to manually remove specific words for - where no word is mentioned twice.
Currently, I'm using this method
l = strsplit(c("a b", "c d"), " ")
> l
[[1]]
[1] "a" "b"
[[2]]
[1] "c" "d"
# remove the value "d"
l = lapply(l, function(x) { x[x != "d"] })
> l
[[1]]
[1] "a" "b"
[[2]]
[1] "c"
Is there any sort of built in list indexing method that would be preferable to use? I feel like I should just be able to parse the list without using lapply. If not, is it possible that someone could explain why this is the case?
Thanks
You need to go through each element of the list and check if the vector contains d to filter/remove it.
One of the reason is that a list can contains various type of data (functions, data.frame, numeric, character, boolean, other lists, class) so there can't be vectorized operations (which are - as suggests the name - for vectors).
What you do is to filter you filter on the front end - eg when you have the list. It could be preferable to filter in the back end your vector, eg before obtaining the list:
l = strsplit(gsub('d','',c("a b", "c d")), " ")
#[[1]]
#[1] "a" "b"
#[[2]]
#[1] "c"
Some alternative solution for a front end filtering:
lapply(l, grep, pattern='[^d]', value=T)

How do I apply an index vector over a list of vectors?

I want to apply a long index vector (50+ non-sequential integers) to a long list of vectors (50+ character vectors containing 100+ names) in order to retrieve specific values (as a list, vector, or data frame).
A simplified example is below:
> my.list <- list(c("a","b","c"),c("d","e","f"))
> my.index <- 2:3
Desired Output
[[1]]
[1] "b"
[[2]]
[1] "f"
##or
[1] "b"
[1] "f"
##or
[1] "b" "f"
I know I can get the same value from each element using:
> lapply(my.list, function(x) x[2])
##or
> lapply(my.list,'[', 2)
I can pull the second and third values from each element by:
> lapply(my.list,'[', my.index)
[[1]]
[1] "b" "c"
[[2]]
[1] "e" "f"
##or
> for(j in my.index) for(i in seq_along(my.list)) print(my.list[[i]][[j]])
[1] "b"
[1] "e"
[1] "c"
[1] "f"
I don't know how to pull just the one value from each element.
I've been looking for a few days and haven't found any examples of this being done, but it seems fairly straight forward. Am I missing something obvious here?
Thank you,
Scott
Whenever you have a problem that is like lapply but involves multiple parallel lists/vectors, consider Map or mapply (Map simply being a wrapper around mapply with SIMPLIFY=FALSE hardcoded).
Try this:
Map("[",my.list,my.index)
#[[1]]
#[1] "b"
#
#[[2]]
#[1] "f"
..or:
mapply("[",my.list,my.index)
#[1] "b" "f"

Can't access vector elements in a for loop

Let's say I have two vectors:
a1=c("a","b")
a2=c("x","y")
Now in a 'for' loop, I want to access the first element of each vector:
for(i in c(a1,a2)) {
print(i[1])
}
If I run the above code, I get:
[1] "a"
[1] "b"
[1] "x"
[1] "y"
But I just want:
[1] "a"
[1] "x"
More surprisingly, if I want to access the second element:
for(i in c(a1,a2)) {
print(i[2])
}
I get:
[1] "NA"
[1] "NA"
[1] "NA"
[1] "NA"
Any help will be highly appreciated.
Because c(a1, a2) = c("a","b","x","y") -- passing multiple atomic vectors to c causes them to get collapsed. Use list(a1, a2) in the loop instead.

In R, how can a string be split without using a seperator

i am try split method and i want to have the second element of a string containing only 2 elemnts. The size of the string is 2.
examples :
string= "AC"
result shouldbe a split after the first letter ("A"), that I get :
res= [,1] [,2]
[1,] "A" "C"
I tryed it with split, but I have no idea how to split after the first element??
strsplit() will do what you want (if I understand your Question). You need to split on "" to split the string on it's elements. Here is an example showing how to do what you want on a vector of strings:
strs <- rep("AC", 3) ## your string repeated 3 times
next, split each of the three strings
sstrs <- strsplit(strs, "")
which produces
> sstrs
[[1]]
[1] "A" "C"
[[2]]
[1] "A" "C"
[[3]]
[1] "A" "C"
This is a list so we can process it with lapply() or sapply(). We need to subset each element of sstrs to select out the second element. Fo this we apply the [ function:
sapply(sstrs, `[`, 2)
which produces:
> sapply(sstrs, `[`, 2)
[1] "C" "C" "C"
If all you have is one string, then
strsplit("AC", "")[[1]][2]
which gives:
> strsplit("AC", "")[[1]][2]
[1] "C"
split isn't used for this kind of string manipulation. What you're looking for is strsplit, which in your case would be used something like this:
strsplit(string,"",fixed = TRUE)
You may not need fixed = TRUE, but it's a habit of mine as I tend to avoid regular expressions. You seem to indicate that you want the result to be something like a matrix. strsplit will return a list, so you'll want something like this:
strsplit(string,"",fixed = TRUE)[[1]]
and then pass the result to matrix.
If you sure that it's always two char string (check it by all(nchar(x)==2)) and you want only second then you could use sub or substr:
x <- c("ab", "12")
sub(".", "", x)
# [1] "b" "2"
substr(x, 2, 2)
# [1] "b" "2"

Resources