In R: combine columns of different dataframes

I am trying to combine the columns of three different dataframes, column by column, so that each result has the same number of rows as the original dataframes and three columns, one from each source. Each of the original dataframes has 10 columns and 14 rows.
I tried it with a for-loop, but the result is not usable for me.
t <- NULL
for(i in 1 : length(net)) {
  a <- cbind(imp.qua.00.09[i], exp.qua.00.09[i], net[i])
  t <- list(t, a)
}
t
But in the end I would like to get 10 separate dataframes with three columns each.
So I want to loop through this:
a <- cbind(imp.qua.00.09[i], exp.qua.00.09[i], net[i])
for every column of each original dataframe. But if I use t <- list(t, a) it constructs a crazy list. Thanks.

The way you append elements to t is wrong; you should do it this way:
t <- list()
for(i in seq_along(net)) {
  a <- cbind(imp.qua.00.09[i], exp.qua.00.09[i], net[i])
  t[[length(t)+1]] <- a
}
t
Your code fails because at each step you turn t into a two-element list whose first element is the previous t (itself a list, except in the first iteration, where it is NULL) and whose second element is the new three-column subset. So in the end you get a recursively nested list: two elements, where the second is the data.frame subset and the first is again a list of two elements with the same structure, ten levels deep.
Anyway, the corrected loop is equivalent to this one-liner (which is probably more efficient since it does not grow the list step by step):
t <- lapply(seq_along(net),
            function(i) cbind(imp.qua.00.09[i], exp.qua.00.09[i], net[i]))
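As a minimal sketch of the same idea on toy data: the data frames imp_df, exp_df and net_df below are hypothetical stand-ins for imp.qua.00.09, exp.qua.00.09 and net, and Map() is used in place of the index-based lapply so that the list keeps the original column names.

```r
# Hypothetical toy data standing in for the three original data frames
imp_df <- data.frame(a = 1:3, b = 4:6)
exp_df <- data.frame(a = 7:9, b = 10:12)
net_df <- exp_df - imp_df

# Map() walks over the columns of all three data frames in parallel
# and builds one three-column data frame per original column
combined <- Map(function(x, y, z) data.frame(imp = x, exp = y, net = z),
                imp_df, exp_df, net_df)

names(combined)  # one list element per original column
combined$a       # three columns: imp, exp, net
```

Each element of combined can then be accessed by the name of the original column, for example combined$b.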

If a single wide data frame holding all 30 columns is what you need, this should work:
do.call(cbind,list(imp.qua.00.09, exp.qua.00.09, net))
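For comparison, here is a small sketch of what do.call(cbind, ...) returns. The toy data frames imp_df, exp_df and net_df are hypothetical stand-ins for the originals; the point is only that the result is one wide data frame rather than a list of per-column data frames.

```r
# Hypothetical toy data standing in for the three original data frames
imp_df <- data.frame(a = 1:3, b = 4:6)
exp_df <- data.frame(a = 7:9, b = 10:12)
net_df <- exp_df - imp_df

# Binds all columns side by side into a single data frame
wide <- do.call(cbind, list(imp_df, exp_df, net_df))

dim(wide)    # rows of the originals, columns of all three combined
names(wide)  # column names are repeated, not made unique
```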

Related

Break apart nested list into two (or more) lists if the nested list contains a vector of all NAs

I have a nested list (or list of lists) containing vectors of integers. I run this list through a custom function that randomly replaces the integer with NA. I would like to "break apart" the internal list into two lists if a vector contains all NAs
It's probably easier to show what I have and what I want than to explain it in text:
#Example of full list data
a<-list(1,3,c(0,2,0),c(0,0))
b<-list(1,6,c(0,3,2,0,1,0),c(0,0,0,1,0,0),1,2,c(0,0),2,c(0,0))
c<-list(1,0)
d<-list(1,0)
e<-list(1,4,c(2,0,0,0),c(4,1),c(1,0,0,0,0),0)
L.full<-list(a,b,c,d,e)
#Example of list with random positions replaced with NA
f<-list(1,3,c(0,NA,0),c(0,0))
g<-list(1,6,c(0,3,NA,0,NA,0),c(0,NA,0,1,0,0),1,NA,c(0,0),2,c(0,0))
h<-list(1,NA)
i<-list(NA,0)
j<-list(1,NA,c(NA,0,0,0),c(NA,NA),c(1,0,0,NA,0),0)
L.miss<-list(f,g,h,i,j)
#To get what I want, I need to evaluate each list in the list-of-lists for vectors containing all NAs,
#and "break" into two lists (or more, if multiple vectors in the list contain all NAs)
#In this example:
#"f" should remain complete since no vector in the list contains all NAs
#"g" should be "broken" since the 6th position only has one position and is NA (i.e. all NAs) and has subsequent positions in the list
#"g" should be broken up such that:
g.1<-list(1,6,c(0,3,NA,0,NA,0),c(0,NA,0,1,0,0),1)
g.2<-list(2,c(0,0))
#"h" should remain complete since the NA is at the end and there are no subsequent positions in the list
#"i" should remain complete since the NA is at the beginning and there are no previous positions in the list
#"j" should be broken up since the 2nd and 4th positions contain all NA and have previous/subsequent positions in the list
#"j" should be broken up such that:
j.1<-1
j.2<-c(NA,0,0,0)
j.3<-list(c(1,0,0,NA,0),0)
#In this example, the original list of 5 lists would result in a list of 8 lists/individual vectors, such that:
L.want<-list(f,g.1,g.2,h,i,j.1,j.2,j.3)
I tried quite a few things but I am quite stuck. I thought I may be on to something when I realized a vector of all NAs is a logical, not numeric, so I started coding
#Checking each vector in the nested list for if it is logical
for(i in 1:length(L.miss)){
  for (j in 1:length(L.miss[[i]])){
    if(is.logical(L.miss[[i]][[j]])){
      ##I have no idea what to do here to break it apart##
    }
  }
}
I appreciate any advice or guidance!
One option is a nested lapply: build a numeric grouping index with cumsum over the all-NA elements and split on it. This assumes L.miss is a named list (the names are set under "data" below).
L.split <- lapply(names(L.miss), function(nm) {
  split(L.miss[[nm]], cumsum(sapply(L.miss[[nm]], function(x) all(is.na(x)))))
})
From this, if we need to remove the elements that are all NA:
L.split2 <- lapply(L.split, function(lstA) lapply(lstA,
    function(x) Filter(function(y) !all(is.na(y)), x)))
names(L.split2) <- names(L.miss)
data
names(L.miss) <- c('f', 'g', 'h', 'i', 'j')
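To see what the cumsum-based index does, here is a minimal sketch on the single list g from the question. The names is_all_na, grp, parts and cleaned are only illustrative. Each all-NA element starts a new group, and the all-NA elements are then filtered out.

```r
g <- list(1, 6, c(0, 3, NA, 0, NA, 0), c(0, NA, 0, 1, 0, 0), 1, NA,
          c(0, 0), 2, c(0, 0))

# TRUE for every element that consists only of NAs
is_all_na <- vapply(g, function(x) all(is.na(x)), logical(1))

# The running sum increases by one at each all-NA element,
# so it acts as a group id
grp <- cumsum(is_all_na)

# Split the list by group, then drop the all-NA elements themselves
parts <- split(g, grp)
cleaned <- lapply(parts, function(p) Filter(function(y) !all(is.na(y)), p))

lengths(cleaned)  # number of elements left in each group
```

Note that a trailing all-NA element, as in h, leaves an empty group after filtering, so some extra clean-up may be needed depending on the desired output.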

create boxplots with first element of first row of multiple dataframes

I have a list of dataframes. Each dataframe has 6 rows. I want to create 6 boxplots. The first boxplot should take the values of the first row of the first column. The second boxplot should take the values of the second row of the first column, etc.
I want to end up with something like this: example image
Each row should be one boxplot on the horizontal axis.
Right now I have started to do it in a loop, but I think this is not the way to go:
for (counter in seq(from = 1, to = wins)) {
  res <- lapply(mylist, function(x) x[counter,1])
  boxplot(res)
}
The variable mylist contains the dataframes. I already use lapply to get the first/second/etc. row elements over all dataframes according to the counter variable. However, I think I have to also avoid the loop, but this would need a 'better' lapply which also loops over the rows of the dataframes in mylist.
Maybe not the one-liner you want, but this works for me:
# Add a column to each data frame with the row index
for (i in seq_along(mylist)) {
  mylist[[i]]$rowID <- seq_len(nrow(mylist[[i]]))
}
# Stick all the data frames into one single data frame
allData <- do.call(rbind, mylist)
# Split the first column based on rowID
boxList <- split(allData[,1], allData$rowID)
# boxplot likes a list
boxplot(boxList)
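A shorter variant is sketched below, assuming every data frame in mylist has the same number of rows and a numeric first column. The toy mylist and the name first_cols are made up for illustration.

```r
# Hypothetical toy input: 5 data frames with 6 rows each
set.seed(42)
mylist <- replicate(5, data.frame(value = rnorm(6), other = runif(6)),
                    simplify = FALSE)

# One column per data frame, one row per row index (a 6 x 5 matrix)
first_cols <- sapply(mylist, function(df) df[, 1])

# boxplot() draws one box per matrix column, so transpose
# to get one box per original row
boxplot(t(first_cols), xlab = "row index", ylab = "value in first column")
```

This avoids both the explicit loop and the extra rowID column, at the cost of requiring equally sized data frames.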

How to code this if else clause in R?

I have a function that outputs a list of strings. Now I want to check whether all strings in this list consist only of 0's, or whether there is at least one string that is not all 0's (there can be more).
I have a large dataset, and I am going to execute my function on each of its rows. Basically:
for each row of the dataset
  mylst <- func(row[i])
  if (mylst contains strings containing all 0's)
    process the next row of the dataset
  else
    execute some other code
I can code the if-else clause, but not the part where I have to check the list for all 0's. How can I do this in R?
Thanks!
You can use this for loop:
for (i in seq_len(nrow(dat))) {
  if (!any(grepl("^0+$", unlist(dat[i, ])))) {
    # execute some other code
  }
}
where dat is the name of your data frame.
Here, the regex "^0+$" matches a string that consists of 0s only.
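The same regex can be applied directly to a list of strings, as in the question. This is a minimal sketch; the helper name is_all_zero is made up for illustration.

```r
# TRUE only if every string in the list consists of 0s only
is_all_zero <- function(strings) {
  all(grepl("^0+$", unlist(strings)))
}

is_all_zero(list("00", "000"))  # TRUE
is_all_zero(list("00", "010"))  # FALSE
```

Replacing all() with any() would instead test whether at least one string is all 0s.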
I'd like to suggest a solution that avoids an explicit for-loop.
For a given data set df, one can find a logical vector that indicates the rows with all zeroes:
all.zeros <- apply(df,1,function(s) all(grepl('^0+$',s))) # grepl() was taken from Sven's solution
With this logical vector, it is easy to subset df to remove all-zero rows:
df[!all.zeros,]
and use it for any subsequent transformations.
'Toy' dataset
df <- data.frame(V1=c('00','01','00'),V2=c('000','010','020'))
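Put together in execution order, the toy example runs as follows. The name kept is introduced here only to hold the result.

```r
df <- data.frame(V1 = c('00', '01', '00'), V2 = c('000', '010', '020'))

# TRUE for rows in which every string consists of 0s only
all.zeros <- apply(df, 1, function(s) all(grepl('^0+$', s)))

# Keep only the rows that are not all zeros
kept <- df[!all.zeros, ]

unname(all.zeros)  # TRUE FALSE FALSE
kept               # rows 2 and 3 remain
```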
UPDATE
If you'd like to apply the function to each row first and then analyze the resulting strings, you should slightly modify the all.zeros expression:
all.zeros <- apply(df,1,function(s) all(grepl('^0+$',func(s))))

List elements to dataframes in R

How would I go about taking elements of a list and making them into dataframes, with each dataframe name consistent with the list element name?
Ex:
exlist <- list(west=c(2,3,4), north=c(2,5,6), east=c(2,4,7))
Where I'm tripping up is in the actual naming of the unique dataframes -- I can't figure out how to do this with a for() loop or with lapply:
for(i in exlist) {
i <- data.frame(exlist$i)
}
gives me an empty dataframe called i, whereas I'd expect three dataframes to be made (one called west, another called north, and another called east)
When I use lapply syntax and call the individual list element name, I get empty dataframes:
lapply(exlist, function(list) i <- data.frame(list["i"]))
yields
data frame with 0 columns and 0 rows
> $west
list..i..
1 NA
$north
list..i..
1 NA
$east
list..i..
1 NA
If you want to convert your list elements to data.frames, you can try either
lapply(exlist, as.data.frame)
Or (as suggested by @Richard), depending on your desired output:
lapply(exlist, as.data.frame.list)
It is always recommended to keep multiple data frames in a list rather than polluting your global environment, but if you insist on doing this, you could use list2env (don't do this), such as:
list2env(lapply(exlist, as.data.frame.list), .GlobalEnv)
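A minimal sketch of the list-based approach recommended above: the data frames stay in a named list and are accessed by name. The column name value and the variable dfs are arbitrary choices for illustration.

```r
exlist <- list(west = c(2, 3, 4), north = c(2, 5, 6), east = c(2, 4, 7))

# One single-column data frame per list element; the list names are kept
dfs <- lapply(exlist, function(v) data.frame(value = v))

names(dfs)  # "west" "north" "east"
dfs$west    # the data frame built from exlist$west
```

Keeping the data frames in one list also makes it easy to apply the same operation to all of them with another lapply call.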
This should create the three objects you want:
df.names <- "value" ## vector with column names here
for (i in names(exlist)) assign(i, setNames(data.frame(exlist[[i]]), df.names))

Sorting list of matrices by the first column

I have a list containing 4 matrices, each with 21 random numbers in 3 columns and 7 rows.
I want to create a new list using the lapply function in which each matrix is sorted by its first column.
I tried:
#example data
set.seed(1)
list.a <- replicate(4, list(matrix(sample(1:99, 21), nrow=7)))
ordered <- order(list.a[,1])
lapply(list.a, function(x){[ordered,]})
but at the first step R gives me the error "incorrect number of dimensions". I don't know what to do. It works with one matrix, though.
Please help me. Thanks!
You were almost there, but you need to iterate through the list to reorder each matrix.
It's easier to do this in one lapply statement:
lapply(list.a, function(x) x[order(x[,1]),])
Note that x in the function call represents the matrices in the list.
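Here is a self-contained sketch of the same approach with a quick check that each result is sorted. The name sorted is illustrative, and drop = FALSE is an added safeguard that keeps the result a matrix even if a matrix had only one row.

```r
set.seed(1)
list.a <- replicate(4, matrix(sample(1:99, 21), nrow = 7), simplify = FALSE)

# Reorder the rows of each matrix by its first column
sorted <- lapply(list.a, function(m) m[order(m[, 1]), , drop = FALSE])

# Check that the first column of every matrix is now in ascending order
vapply(sorted, function(m) !is.unsorted(m[, 1]), logical(1))
```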
