Replace values in list - r

I have a nested list, which could look something like this:
characlist<-list(list(c(1,2,3,4)),c(1,3,2,NA))
Next, I want to replace all values equal to one with NA. I tried the following, but it produces an error:
lapply(characlist,function(x) ifelse(x==1,NA,x))
Error in ifelse(x == 1, NA, x) :
(list) object cannot be coerced to type 'double'
Can someone tell me what's wrong with the code?

Use rapply instead:
> rapply(characlist,function(x) ifelse(x==1,NA,x), how = "replace")
#[[1]]
#[[1]][[1]]
#[1] NA 2 3 4
#
#
#[[2]]
#[1] NA 3 2 NA
The problem in your initial approach was that your first list element is itself a list. Hence you cannot directly apply the ifelse logic as you would on an atomic vector. By using ?rapply you can avoid that problem (rapply is a recursive version of lapply).
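For what it's worth, here is a minimal sketch of the same idea with a small helper instead of ifelse. The name replace_ones is made up for illustration; only rapply and its how argument come from the answer above.

```r
characlist <- list(list(c(1, 2, 3, 4)), c(1, 3, 2, NA))

# Hypothetical helper: set every element equal to 1 to NA.
# The !is.na(x) guard keeps existing NAs out of the logical index.
replace_ones <- function(x) {
  x[!is.na(x) & x == 1] <- NA
  x
}

# how = "replace" keeps the nesting of the input list;
# the default how = "unlist" would flatten the result into one vector.
result <- rapply(characlist, replace_ones, how = "replace")
str(result)
```

Because rapply only calls the function on the leaves, the helper never sees a list, which is what tripped up the lapply version.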

Another option is to use relist after replacing the elements equal to 1 with NA in the unlisted vector. We pass the original list as the skeleton to get the same structure back.
v1 <- unlist(characlist)
relist(replace(v1, v1==1, NA), skeleton=characlist)
#[[1]]
#[[1]][[1]]
#[1] NA 2 3 4
#[[2]]
#[1] NA 3 2 NA

Related

How to unwrap list with access variables?

I have a vector like below
tmp <- c(a=1, b=2, c=3)
a b c
1 2 3
I want to flatten this vector to get only 1, 2, 3.
I tried unlist(tmp) but it still gives me the same result.
How to achieve that efficiently?
You just want to remove the names attribute from tmp. There are a number of ways to do that.
You can unname it.
unname(tmp)
# [1] 1 2 3
Or use a very common method for removing names, by setting them to NULL.
names(tmp) <- NULL
Or strip the attributes with as.vector.
as.vector(tmp)
# [1] 1 2 3
Or re-concatenate it without the names.
c(tmp, use.names=FALSE)
# [1] 1 2 3
Or use setNames.
setNames(tmp, NULL)
# [1] 1 2 3
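A short sketch to check that these approaches agree and that most of them leave tmp itself alone. The variable names plain and same_result are only for illustration.

```r
tmp <- c(a = 1, b = 2, c = 3)

# unname(), as.vector() and setNames(, NULL) each return a new vector
# and do not modify tmp.
plain <- unname(tmp)
same_result <- identical(plain, as.vector(tmp)) &&
  identical(plain, setNames(tmp, NULL))
same_result

names(tmp)             # the original still carries its names
is.null(names(plain))  # the copy does not
```

Only names(tmp) <- NULL changes tmp in place, so pick that one when you do not need the named version afterwards.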
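Note that building the vector in two steps gives exactly the same named vector:
tmp <- c(1,2,3)
names(tmp) <- c("a","b","c")
unname(tmp) alone is still enough here. Wrapping it in unlist is harmless but redundant, because unlist returns an atomic vector unchanged:
unlist(unname(tmp))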

R: Applying function to DataFrame

I have following code:
library(Ecdat)
data(Fair)
Fair[1:5,]
x1 = function(x){
mu = mean(x)
l1 = list(s1=table(x),std=sd(x))
return(list(l1,mu))
}
mylist <- as.list(Fair$occupation,
Fair$education)
x1(mylist)
What I wanted is for x1 to output the result for the items selected in mylist. However, I get the warning In mean.default(x) : argument is not numeric or logical: returning NA.
You need to use lapply if you're passing a list to a function:
output<-lapply(mylist,FUN=x1)
This will process your function x1 for each element in mylist and return a list of results to output.
Here mylist is not created correctly, and a list is not needed anyway, because a data.frame is already a list of columns of equal length. So just subset the columns of interest and apply the function:
lapply(Fair[c("occupation", "education")], x1)
In the OP's code, as.list simply creates a list of length 601 with only a single element in each.
str(mylist)
#List of 601
#$ : int 7
#$ : int 6
#$ : int 1
#...
#...
Another problem in the code is that as.list ignores the second argument. A simple example:
as.list(1:3, 1:2)
#[[1]]
#[1] 1
#[[2]]
#[1] 2
#[[3]]
#[1] 3
The second argument is silently dropped. What was probably intended is list:
list(1:3, 1:2)
#[[1]]
#[1] 1 2 3
#[[2]]
#[1] 1 2
But for data.frame columns, we don't need to explicitly call the list as it is a list of vectors that have equal length.
Regarding the warning in the OP's post: mean works on numeric or logical vectors, not on a list or a data.frame.
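To try this without installing Ecdat, here is a small sketch on a made-up data frame. The values in df are invented stand-ins for the two Fair columns; x1 is the OP's function with only the style tidied.

```r
# Hypothetical stand-in for Fair[c("occupation", "education")]
df <- data.frame(occupation = c(7, 6, 1, 6), education = c(14, 12, 16, 12))

x1 <- function(x) {
  mu <- mean(x)
  l1 <- list(s1 = table(x), std = sd(x))
  list(l1, mu)
}

# A data.frame is a list of columns, so lapply calls x1 once per column
res <- lapply(df[c("occupation", "education")], x1)
names(res)
res$occupation[[2]]   # mean of the occupation column
```

Each element of res is the two-element list returned by x1, named after the column it came from.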

Dynamically creating named list in R

I need to create named lists dynamically in R as follows.
Suppose there is an array of names.
name_arr<-c("a","b")
And that there is an array of values.
value_arr<-c(1,2,3,4,5,6)
What I want to do is something like this:
list(name_arr[1]=value_arr[1:3])
But R throws an error when I try to do this. Any suggestions as to how to get around this problem?
You can use [[...]] to assign values to keys given by strings:
my.list <- list()
my.list[[name_arr[1]]] <- value_arr[1:3]
You could use setNames. Examples:
setNames(list(value_arr[1:3]), name_arr[1])
#$a
#[1] 1 2 3
setNames(list(value_arr[1:3], value_arr[4:6]), name_arr)
#$a
#[1] 1 2 3
#
#$b
#[1] 4 5 6
Or without setNames:
mylist <- list(value_arr[1:3])
names(mylist) <- name_arr[1]
mylist
#$a
#[1] 1 2 3
mylist <- list(value_arr[1:3], value_arr[4:6])
names(mylist) <- name_arr
mylist
#$a
#[1] 1 2 3
#
#$b
#[1] 4 5 6
Your code will throw an error, because in list(A = B) the A must be a literal name, not an expression that evaluates to one.
You could build the call as a string and evaluate it with eval and parse. Here is an example:
eval(parse(text = sprintf('list(%s = value_arr[1:3])',name_arr[1])))
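If you need this for every name rather than just the first, the [[...]] assignment extends naturally to a loop. This is only a sketch and it assumes the values split into equally sized consecutive chunks, one per name, which the question does not actually state. The names chunk, idx and my_list are illustrative.

```r
name_arr <- c("a", "b")
value_arr <- c(1, 2, 3, 4, 5, 6)

# Assumption: equally sized consecutive chunks, one per name
chunk <- length(value_arr) / length(name_arr)

my_list <- list()
for (i in seq_along(name_arr)) {
  idx <- ((i - 1) * chunk + 1):(i * chunk)
  # [[<- accepts the name as a string computed at run time
  my_list[[name_arr[i]]] <- value_arr[idx]
}
my_list
```

This avoids eval(parse(...)) entirely, which is usually easier to read and debug.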

Using backticks and operators in apply family functions

I saw in a recent answer an apply family function with assignments built-in and can't generalize it.
lst <- list(a=1, b=2:3)
lst
$a
[1] 1
$b
[1] 2 3
This can't yet be made into a data.frame because of the unequal lengths. But by coercing the max length to the list, it works:
data.frame(lapply(lst, `length<-`, max(lengths(lst))))
a b
1 1 2
2 NA 3
That works. But I've never used arrow assignments in apply functions. I tried to understand it by generalizing like:
lapply(lst, function(x) length(x) <- max(lengths(lst)))
$a
[1] 2
$b
[1] 2
That's not the correct output. Nor is
lapply(lst, function(x) length(x) <- max(lengths(x)))
Error in lengths(x) : 'x' must be a list
This would be a useful technique to understand well. Is there a way to express the assignment in the anonymous function form?
An anonymous function returns the value of its last expression. Here that is the assignment, whose value is the right-hand side (the new length), not the modified 'x'. We have to return x explicitly, with return(x) or simply x as the last line.
lapply(lst, function(x) {
length(x) <- max(lengths(lst))
x})
#$a
#[1] 1 NA
#$b
#[1] 2 3
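To see that the two forms really are the same thing, you can compare them directly. A small sketch; n, via_backtick and via_anon are names chosen for illustration.

```r
lst <- list(a = 1, b = 2:3)
n <- max(lengths(lst))

# `length<-` is an ordinary function: `length<-`(x, value) returns the modified x
via_backtick <- lapply(lst, `length<-`, n)

# Anonymous-function form: do the assignment, then return x
via_anon <- lapply(lst, function(x) {
  length(x) <- n
  x
})

identical(via_backtick, via_anon)
```

The backtick form works because lapply passes each element as the first argument and n as the value argument of the replacement function.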

summary still shows NAs after using both na.omit and complete.cases

I am a grad student using R and have been reading the other Stack Overflow answers regarding removing rows that contain NA from dataframes. I have tried both na.omit and complete.cases. When using both it shows that the rows with NA have been removed, but when I write summary(data.frame) it still includes the NAs. Are the rows with NA actually removed or am I doing this wrong?
na.omit(Perios)
summary(Perios)
Perios[complete.cases(Perios),]
summary(Perios)
The problem is that you never assigned the output of na.omit:
Perios <- na.omit(Perios)
If you know which column the NAs occur in, then you can just do
Perios[!is.na(Perios$Periostitis),]
or more generally:
Perios[!is.na(Perios$colA) & !is.na(Perios$colD) & ... ,]
Then as a general safety tip for R, throw in an na.fail to assert it worked:
na.fail(Perios) # trust, but verify! Paranoia is healthy.
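Here is a small sketch of the difference between calling na.omit and assigning its result. The data frame is invented (the Age column in particular is hypothetical), since the real Perios data is not shown in the question.

```r
# Hypothetical stand-in for the Perios data
Perios <- data.frame(Periostitis = c(1, NA, 0), Age = c(30, 41, NA))

na.omit(Perios)              # prints the filtered rows, Perios is unchanged
unchanged <- anyNA(Perios)   # still TRUE

Perios <- na.omit(Perios)    # assign the result to keep it
cleaned <- !anyNA(Perios)

na.fail(Perios)              # errors if any NA is left, otherwise returns the data
```

After the assignment, summary(Perios) no longer reports NAs.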
is.na on its own is not the right tool here. You want complete.cases, which is equivalent to function(x) apply(!is.na(x), 1, all), or na.omit, to filter the data:
That is, you want all rows where there are no NA values.
> x <- data.frame(a=c(1,2,NA), b=c(3,NA,NA))
> x
a b
1 1 3
2 2 NA
3 NA NA
> x[complete.cases(x),]
a b
1 1 3
> na.omit(x)
a b
1 1 3
Then this is assigned back to x to save the data.
complete.cases returns a vector, one element per row of the input data frame. On the other hand, is.na returns a matrix. This is not appropriate for returning complete cases, but can return all non-NA values as a vector:
> is.na(x)
a b
[1,] FALSE FALSE
[2,] FALSE TRUE
[3,] TRUE TRUE
> x[!is.na(x)]
[1] 1 2 3
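A quick way to convince yourself what complete.cases computes is to build the same row-wise check by hand from is.na. This is only a sketch; keep and manual are made-up names.

```r
x <- data.frame(a = c(1, 2, NA), b = c(3, NA, NA))

keep <- complete.cases(x)                    # one logical per row
manual <- unname(apply(!is.na(x), 1, all))   # TRUE when no value in the row is NA
identical(keep, manual)

x <- x[keep, ]   # assign back so the filtered rows are kept
nrow(x)
```

complete.cases is the shorter spelling, and it also accepts several vectors or data frames at once.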
