I've got a list of lists of bootstrap statistics from a function that I wrote in R. The main list has the 1000 bootstrap iterations. Each element within the list is itself a list of three things, including fitted values for each of the four variables ("fvboot" -- a 501x4 matrix).
I want to make a vector of the values for each position on the grid of x values, from 1:501, and for each variable, from 1:4.
For example, for the ith point on the xgrid of the jth variable, I want to make a vector like the following:
vec = bootfits$fvboot[[1:1000]][i,j]
but when I do this, I get:
recursive indexing failed at level 2
After googling around, I think I understand why R is doing this, but I haven't found a way to get the (i, j)th element of each fvboot matrix into a 1000x1 vector.
Help would be much appreciated.
Use the unlist() function in R. From example(unlist):
unlist(options())
unlist(options(), use.names = FALSE)
l.ex <- list(a = list(1:5, LETTERS[1:5]), b = "Z", c = NA)
unlist(l.ex, recursive = FALSE)
unlist(l.ex, recursive = TRUE)
l1 <- list(a = "a", b = 2, c = pi+2i)
unlist(l1) # a character vector
l2 <- list(a = "a", b = as.name("b"), c = pi+2i)
unlist(l2) # remains a list
ll <- list(as.name("sinc"), quote( a + b ), 1:10, letters, expression(1+x))
utils::str(ll)
for (x in ll)
  stopifnot(identical(x, unlist(x)))
This would be easier with a minimal example object. In general, you cannot index a list with a vector such as [[1:1000]]: with a vector inside [[ ]], R indexes recursively, which is where the error comes from. I would use the plyr functions. This should do it (although I haven't tested it):
library(plyr)
laply(bootfits, function(l) l$fvboot[i, j])
If you are not familiar with plyr: I always found Hadley Wickham's article 'The split-apply-combine strategy for data analysis' very useful.
You can extract one vector at a time using sapply, e.g. for i=1 and j=1:
i <- 1
j <- 1
vec <- sapply(bootfits, function(x){x$fvboot[i,j]})
sapply applies the function (in this case an inline function we have written) to each element of the list bootfits, and simplifies the result if possible (i.e. converts it from a list to a vector).
To extract a whole set of values as a matrix (e.g. over all the i's) you can wrap this in another sapply, but this time over the i's for a specified j:
j <- 1
mymatrix <- sapply(1:501, function(i) {
  sapply(bootfits, function(x) x$fvboot[i, j])
})
Warning: I haven't tested this code but I think it should work.
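To check the idea on something small, here is a sketch with a made-up bootfits (3 iterations of a 5x4 matrix instead of 1000 of 501x4; the `other` element is just a hypothetical placeholder for the remaining list entries). It uses vapply, which behaves like sapply but insists that each result is a single number, and it also shows stacking the matrices into a 3-d array so that any (i, j) can be pulled out without looping again.

```r
# Toy stand-in for the real object: 3 bootstrap iterations, each a list
# holding a 5x4 matrix called fvboot (sizes and 'other' are made up).
set.seed(1)
bootfits <- lapply(1:3, function(b) {
  list(fvboot = matrix(rnorm(5 * 4), nrow = 5, ncol = 4), other = b)
})

i <- 2
j <- 3

# One value per bootstrap iteration; vapply requires a single numeric each time.
vec <- vapply(bootfits, function(x) x$fvboot[i, j], numeric(1))

# Alternative: stack all fvboot matrices into a 5 x 4 x 3 array once,
# then index any (i, j) directly along the third dimension.
fv_array <- simplify2array(lapply(bootfits, function(x) x$fvboot))
vec2 <- fv_array[i, j, ]

length(vec)            # one entry per iteration
identical(vec, vec2)   # both routes give the same vector
```

With the real data the array would be 501 x 4 x 1000, which may be the more convenient shape if you need many (i, j) combinations.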
I have a data frame whose columns contain string answers from different assessors, who used upper and lower case inconsistently in their answers. I want to convert everything to lower case. I have code that works as follows:
# Creating a reproducible data frame similar to what I am working with
dfrm <- data.frame(a = sample(names(islands))[1:20],
                   b = sample(unname(islands))[1:20],
                   c = sample(names(islands))[1:20],
                   d = sample(unname(islands))[1:20],
                   e = sample(names(islands))[1:20],
                   f = sample(unname(islands))[1:20],
                   g = sample(names(islands))[1:20],
                   h = sample(unname(islands))[1:20])
# This is how I did it originally by writing everything explicitly:
dfrm1 <- dfrm
dfrm1$a <- tolower(dfrm1$a)
dfrm1$c <- tolower(dfrm1$c)
dfrm1$e <- tolower(dfrm1$e)
dfrm1$g <- tolower(dfrm1$g)
head(dfrm1) #Works as intended
The problem is that as the number of assessors increases, I keep making copy-paste errors. I tried to simplify my code by writing a function around tolower and using sapply to loop over the columns, but the final data frame does not look like what I wanted:
# function and sapply:
dfrm2 <- dfrm
my_list <- c("a", "c", "e", "g")
my_low <- function(x){dfrm2[,x] <- tolower(dfrm2[,x])}
sapply(my_list, my_low) #Didn't work
# Alternative approach:
dfrm2 <- as.data.frame(sapply(my_list, my_low))
head(dfrm2) #Lost the numbers
What am I missing?
I know this must be a very basic concept that I'm not getting. There was this question and answer that I simply couldn't follow, and this one where my non-working solution simply seems to work. Any help appreciated, thanks!
Maybe you want to create a logical vector that selects the columns to change and run an apply function only over those columns.
# only choose non-numeric columns
changeCols <- !sapply(dfrm, is.numeric)
# change values of selected columns to lower case
dfrm[changeCols] <- lapply(dfrm[changeCols], tolower)
If you have other types of columns, say logical, you could also be more explicit about the types of columns that you want to change. For example, to select only factor and character columns, use:
changeCols <- sapply(dfrm, function(x) is.factor(x) | is.character(x))
For your first attempt, if you want the assignments to your data frame dfrm2 to stick, use the <<- assignment operator:
my_low <- function(x){ dfrm2[,x] <<- tolower(dfrm2[,x]) }
sapply(my_list, my_low)
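If you would rather not reach for <<-, the same column list can be used directly as an index. A minimal sketch on a small made-up data frame (the values are invented; only my_list and dfrm2 follow the question's names):

```r
# Small stand-in data frame (made up): two text columns, one numeric.
dfrm2 <- data.frame(a = c("Kyushu", "BORNEO"),
                    b = c(14, 280),
                    c = c("Java", "CUBA"),
                    stringsAsFactors = FALSE)

my_list <- c("a", "c")   # columns to convert

# lapply returns the converted columns; assigning them back in one step
# leaves the numeric column untouched and needs no global assignment.
dfrm2[my_list] <- lapply(dfrm2[my_list], tolower)

dfrm2
```

The original my_low did not work because the assignment inside the function changed only a local copy of dfrm2, which was discarded when the function returned.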
I am working with the consumer price index (CPI), and in order to calculate it I have to multiply the index matrix by the corresponding weights:
grossCPI77_10 <- grossIND1977 %*% weights1910/100
grossCPI82_10 <- grossIND1982 %*% weights1910/100
Of course, I would rather have code like the following:
grossIND1982 <- replicate(20, cbind(1:61))
grossIND1993 <- replicate(20, cbind(1:61))
weights1910_sc <- c(1:20)
grossIND_list <- mget(ls(pattern = "grossIND...."))
totalCPI <- mapply("*", grossIND_list, weights1910_sc)
The problem is that it gives me a 1200x20 matrix. I expected an ordinary matrix (61x20) by vector (20x1) multiplication, which should result in a 20x1 vector. Could you explain what I am doing wrong? Thanks.
Part of your problem is that you don't have matrices but 3D arrays with one singleton dimension. The other issue is that mapply tries to combine the results into a matrix, and that constant arguments should be passed via MoreArgs. But actually, this is more a case for lapply.
grossIND1982 <- replicate(20, cbind(1:61))[,1,]
grossIND1993 <- replicate(20, cbind(1:61))[,1,]
weights1910_sc <- c(1:20)
grossIND_list <- mget(ls(pattern = "grossIND...."))
totalCPI <- mapply("*", grossIND_list, MoreArgs=list(e2 = weights1910_sc), SIMPLIFY = FALSE)
totalCPI <- lapply(grossIND_list, "*", e2 = weights1910_sc)
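Note that "*" multiplies element by element. If what you are after is the matrix-by-vector product from the first two lines of the question, a sketch along the following lines may be closer. The inputs are made up to match the question's shapes, and the explicit named list is my own substitution for mget:

```r
# Made-up inputs shaped like the question's: two 61 x 20 index matrices
# and a weight vector of length 20.
grossIND1982 <- matrix(rep(1:61, 20), nrow = 61)
grossIND1993 <- matrix(rep(1:61, 20), nrow = 61)
weights1910_sc <- 1:20

# An explicit named list avoids picking up stray objects via ls().
grossIND_list <- list(grossIND1982 = grossIND1982,
                      grossIND1993 = grossIND1993)

# (61 x 20) %*% (20 x 1) gives a 61 x 1 matrix per index matrix;
# %*% binds tighter than /, so the division applies to the product.
totalCPI <- lapply(grossIND_list, function(m) m %*% weights1910_sc / 100)

sapply(totalCPI, dim)
```

Each element of totalCPI then has one row per row of the index matrix, so the result is 61x1 rather than 20x1.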
I am not sure I understood all aspects of your problem (especially what should be columns, what should be rows, and in which order the matrix product should be applied), but I will try to cover at least some of them. See the comments in the code below for clarification of what you did and what you might want. I hope it helps; let me know if this is what you need.
#instead of using mget, I recommend to use a list structure
#otherwise you might capture other variables with similar names
#that you do not want
INDlist <- sapply(c("1990", "1991"), function(x) {
  #this is how to set up a matrix correctly, check `?matrix`
  #I think your combination of cbind and rep did not give you what you wanted
  matrix(rep(1:61, 20), nrow = 61)
}, USE.NAMES = TRUE, simplify = FALSE)
weights <- list(c(1:20))
#the first argument of mapply needs to be a function, in this case of two variables
#the body of the function calculates the cross product
#you feed the arguments (both lists) in the following part of mapply
#I have repeated your weights, but you might assign different weights for each year
res <- mapply(function(x, y) {x %*% y}, INDlist, rep(weights, length(INDlist)))
dim(res)
#[1] 61 2
I just started writing R scripts and I can't figure out this problem.
I have a list of vectors, let's say:
myListOfVector <- list(
c(1,2),
c(1,2),
c(1,2),
c(1,2)
)
What I want is the sum of the elements of the vectors in my list, position by position,
so that if I have 3 vectors that each contain (a, b, c), I get the sum of the a's, the sum of the b's and the sum of the c's in a list or vector.
I know that all the vectors have the same length.
What I am looking for is something like this:
result <- sum(myListOfVector)
# result must be c(4, 8)
Does anybody have an idea?
The only way I've been able to do it is with a loop, but it takes so much time that I can't resign myself to it.
I tried apply and lapply, but they don't seem to work the way I want because all I get is one vector at a time.
Clarifications:
The list of vectors is returned by a function that I can't modify.
I need an efficient solution if possible.
A list of vectors of the same length can be summed with Reduce:
Reduce(`+`, myListOfVector)
Putting it in a matrix and using colSums or rowSums, as mentioned by @AnandaMahto and @JanLuba, might be faster in some cases; I'm not sure.
Side note. The example in the OP is not a list; instead, it should be constructed like
myListOfVector <- list( # ... changing c to list on this line
c(1,2),
c(1,2),
c(1,2),
c(1,2)
)
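You should first convert your list to a numeric matrix (unlist is needed, because calling matrix() on the list itself gives a list-matrix that colSums cannot handle):
mymatrix <- matrix(unlist(myListOfVector), ncol = 2, byrow = TRUE)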
and then use colSums:
colSums(mymatrix)
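For comparison, here is a small sketch that runs both approaches from this thread on the example data and shows that they agree. The names sum_reduce and sum_cols are just for illustration, and do.call(rbind, ...) is one of several ways to stack the vectors as rows:

```r
myListOfVector <- list(c(1, 2), c(1, 2), c(1, 2), c(1, 2))

# Element-wise sum by folding `+` over the list.
sum_reduce <- Reduce(`+`, myListOfVector)

# Same result by stacking the vectors as rows and summing each column.
sum_cols <- colSums(do.call(rbind, myListOfVector))

sum_reduce
sum_cols
```

For this input both give c(4, 8), the value asked for in the question.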
As far as I understand it, in R it can be more elegant to use functions such as lapply rather than for loops (which are used more often than not in other object-oriented languages). However, I cannot get my head around the syntax and am making foolish errors when trying to implement simple tasks with it. For example:
I have a series of data frames loaded from csv files using a for loop. The following dummy data frames adequately describe the data:
x <- c(0,10,11,12,13)
y <- c(1,NA,NA,NA,NA)
z <- c(2,20,21,22,23)
a <- c(0,6,5,4,3)
b <- c(1,7,8,9,10)
c <- c(2,NA,NA,NA,NA)
df1 <- data.frame(x,y,z)
df2 <- data.frame(a,b,c)
I first generate a list of data frame names (data_names; I do this when loading the csv files) and then simply want to sum the columns. My attempt of course does not work:
lapply(data_names, function(df) {
counts <- colSums(!is.na(data_names))
})
I could of course use lists (and I realise that in the long run this may be better), but from a pedagogical point of view I would like to understand lapply better.
Many thanks for any pointers.
It's really just your use of is.na and the fact that you don't need the assignment operator <- inside the function. lapply returns a list, which is the result of applying FUN to each element of the input list. You assign the output of lapply to a variable, e.g. res <- lapply( .... , FUN ).
I'm also not too sure how you made the list initially, but the code below should suffice. You also don't need an anonymous function in this case: you can use the named function colSums and provide the na.rm = TRUE argument to take care of the pesky NAs in your data:
lapply( list( df1, df2 ) , colSums , na.rm = TRUE )
[[1]]
x y z
46 1 88
[[2]]
a b c
18 35 2
So you can read this as:
For each df in the list:
apply colSums with the argument na.rm = TRUE
The result is a list, each element of which is the result of applying colSums to each df in the list.
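If data_names really is a character vector of object names (an assumption on my part, since the question does not show how it was built), the attempt in the question can be repaired roughly like this. The sketch counts non-missing values per column, which is what colSums(!is.na(...)) computes:

```r
df1 <- data.frame(x = c(0, 10, 11, 12, 13),
                  y = c(1, NA, NA, NA, NA),
                  z = c(2, 20, 21, 22, 23))
df2 <- data.frame(a = c(0, 6, 5, 4, 3),
                  b = c(1, 7, 8, 9, 10),
                  c = c(2, NA, NA, NA, NA))

# Assumed form of data_names: the names of the data frames as strings.
data_names <- c("df1", "df2")

# mget() looks the names up and returns a named list of data frames;
# the function then works on its argument df, not on data_names.
counts <- lapply(mget(data_names), function(df) colSums(!is.na(df)))

counts$df1
```

The key change from the question is that the function body refers to df, the element lapply hands it, rather than to the whole data_names vector.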
What code in R will do the following:
Given a list 1, 2, ..., M create a list of N random entries from that list. Furthermore, obtain the complement list.
example:
N = 5
M = 10
list = [1,4,3,9,2]
complement = [5,6,7,8,10]
?sample
samp_range <- 1:M
out <- sample(samp_range, N)
complement <- samp_range[!samp_range %in% out]
or as per Joran's comment:
complement <- setdiff(samp_range, out)
Also, as a rule, avoid using things like list as variable names since they are internal R functions.
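Putting the pieces together, a self-contained sketch might look like this. The seed is only there to make the draw repeatable, and the name `rest` is an arbitrary choice of mine:

```r
set.seed(42)   # only so the draw is repeatable
M <- 10
N <- 5

samp_range <- seq_len(M)
out <- sample(samp_range, N)       # N distinct values drawn from 1..M
rest <- setdiff(samp_range, out)   # everything that was not drawn

sort(c(out, rest))                 # together they recover 1..M
```

sample() draws without replacement by default, so out never contains duplicates and the two pieces always partition 1..M.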