rbind() in mapply() gives different output in R

I am new to R and trying to understand the behaviour of mapply().
I have two data frames, k1 and k2.
When I use rbind(k1, k2), it appends k2 at the bottom of k1.
But when I use mapply(rbind, k1, k2), it interleaves rows, taking one from k1 and one from k2.
Is it possible to append one data frame after the other (as rbind() does) using mapply()?
Can you also explain when mapply() is used in real-world scenarios?
> k1 <- data.frame(x1 = 1:10, x2 = 11:20)
> k2 <- data.frame(x1= 21:30, x2 = 31:40)
>
> rbind(k1,k2)
x1 x2
1 1 11
2 2 12
3 3 13
4 4 14
5 5 15
6 6 16
7 7 17
8 8 18
9 9 19
10 10 20
11 21 31
12 22 32
13 23 33
14 24 34
15 25 35
16 26 36
17 27 37
18 28 38
19 29 39
20 30 40
> mapply(rbind,k1,k2)
x1 x2
[1,] 1 11
[2,] 21 31
[3,] 2 12
[4,] 22 32
[5,] 3 13
[6,] 23 33
[7,] 4 14
[8,] 24 34
[9,] 5 15
[10,] 25 35
[11,] 6 16
[12,] 26 36
[13,] 7 17
[14,] 27 37
[15,] 8 18
[16,] 28 38
[17,] 9 19
[18,] 29 39
[19,] 10 20
[20,] 30 40
I want the output from mapply() to look like the first output.

mapply applies the function to the first element of the first input and the first element of the second input, then the second element of each, and so on. Since k1 and k2 are both data frames, their "elements" are columns, so mapply(rbind, k1, k2) rbinds the first column of k1 to the first column of k2, and so on. It then simplifies the output to a matrix.
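To see this concretely, here is what mapply() computes for the first column pair (using the k1 and k2 defined above); the flattened version of this 2 x 10 matrix is exactly the first column of the simplified mapply() output:
rbind(k1$x1, k2$x1)             # a 2 x 10 matrix: row 1 is k1$x1, row 2 is k2$x1
as.vector(rbind(k1$x1, k2$x1))  # 1 21 2 22 ... i.e. the x1 column of mapply(rbind, k1, k2)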
If you want to use mapply in this fashion, use the non-simplified version (mapply(f, ..., SIMPLIFY = FALSE)) and supply the data frames wrapped in single-element lists; this forces mapply to rbind the whole data frames and not simplify the output list:
mapply(rbind, list(k1), list(k2), SIMPLIFY = FALSE)
This gives you the correct binding, but results in a list of one element.
[[1]]
x1 x2
1 1 11
2 2 12
3 3 13
4 4 14
5 5 15
6 6 16
7 7 17
8 8 18
9 9 19
10 10 20
11 21 31
12 22 32
13 23 33
14 24 34
15 25 35
16 26 36
17 27 37
18 28 38
19 29 39
20 30 40
Extracting that single element gives you the same result as rbind(k1, k2):
mapply(rbind, list(k1), list(k2), SIMPLIFY = FALSE)[[1]]
As noted by @Mako212, Reduce and do.call might be better tools to use in this case.

It sounds like you may be misusing mapply(). Instead, create a list of the data frames:
dfList <- list(k1,k2)
And rbind using Reduce like so:
Reduce(rbind, dfList)
And another way:
do.call("rbind",dfList)

Related

How to select the n lowest values, n lowest values excluding the lowest, etc. in a for loop in R?

In R, I want to make a for loop that selects the n lowest values, then the n lowest values excluding the lowest value, then the n lowest values excluding the two lowest values, and so on.
Here's an example to clarify:
set.seed(1)
x <- round(rnorm(10,20,15))
n <- 4
I want to get:
7 8 11 15
8 11 15 23
11 15 23 25
15 23 25 27
23 25 27 29
25 27 29 31
27 29 31 44
I tried the following code, but then I do not get the last row (it does not include the last/highest value). I could get it by adding another line inside the for loop, but I was wondering whether this could be done more efficiently.
y <- matrix(data=NA, nrow=length(x)+1-n, ncol=n)
for (i in 1:(length(x)-n)) {y[i,] <- sort(x)[i:(i+n-1)]}
Thanks
set.seed(1)
x <- round(rnorm(10,20,15))
n <- 4
Get the pattern:
rbind(sort(x)[1:4], sort(x)[2:5], sort(x)[3:6], sort(x)[4:7], sort(x)[5:8], sort(x)[6:9], sort(x)[7:10])
Now generalise it with sapply so it works for any n:
matrix(c(sapply(1:(length(x)+1-n), function(i) sort(x)[i:(i+n-1)])), nrow = length(x)+1-n, byrow = TRUE)
[,1] [,2] [,3] [,4]
[1,] 7 8 11 15
[2,] 8 11 15 23
[3,] 11 15 23 25
[4,] 15 23 25 27
[5,] 23 25 27 29
[6,] 25 27 29 31
[7,] 27 29 31 44
A cleaner version:
t(sapply(1:(length(x)+1-n), function(i) sort(x)[i:(i+n-1)]))
[,1] [,2] [,3] [,4]
[1,] 7 8 11 15
[2,] 8 11 15 23
[3,] 11 15 23 25
[4,] 15 23 25 27
[5,] 23 25 27 29
[6,] 25 27 29 31
[7,] 27 29 31 44
Note that sapply returns its results column-wise, hence the transpose to get the windows as rows.
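For completeness, base R's embed() builds the same sliding windows without sapply; a minimal sketch (embed() returns each window reversed, hence the column flip):
embed(sort(x), n)[, n:1]   # same 7 x 4 matrix of n-length windows as above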
Note to Rob: the apply family (apply, mapply, sapply, tapply, etc.) can usually replace an explicit for loop and is generally the more idiomatic choice in R, so use it where it fits.

R: initialize /create data frame in for loop

I wonder what the best or most appropriate way is to create and modify a data frame in a for loop, using cbind or rbind. For the first iteration the data frame has no columns or rows, so in the example below cbind does not work. Only for this first case do I need the if-else inside the for loop. Isn't there a more elegant way to write the code below, i.e. without the if-else?
mydat <- data.frame()
for (j in 1:10) {
if (ncol(mydat) == 0)
mydat <- data.frame(sample(x = j * 5, size = 20, replace = T))
else
mydat <- cbind(mydat, data.frame(sample(x = j * 5, size = 20, replace = T)))
}
colnames(mydat) <- sprintf("x%i", 1:10)
Here is a simple way to combine lapply and the do.call(cbind, ...) idiom to generate the data you want.
set.seed(1234)
gendata <- function(x) {
sample(x = x*5, size = 20, replace = T)
}
do.call(cbind, lapply(1:10, gendata))
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
# [1,] 1 4 9 18 24 2 27 21 20 24
# [2,] 4 4 10 1 12 17 33 40 26 18
# [3,] 4 2 5 7 4 9 35 13 20 31
# [4,] 4 1 10 1 14 7 33 20 11 4
# [5,] 5 3 5 5 5 5 18 15 4 48
# [6,] 4 9 8 15 23 10 10 26 29 2
# [7,] 1 6 11 7 10 5 9 30 20 43
# [8,] 2 10 8 11 8 4 18 23 4 32
# [9,] 4 9 4 2 5 14 18 40 37 16
# [10,] 3 1 12 12 23 2 12 24 15 38
# [11,] 4 5 2 3 5 22 34 18 35 32
# [12,] 3 3 5 18 23 4 23 10 27 50
# [13,] 2 4 11 1 4 29 5 4 32 7
# [14,] 5 6 8 16 4 4 15 35 20 45
# [15,] 2 2 3 2 3 7 33 10 16 41
# [16,] 5 8 8 11 13 28 17 40 35 42
# [17,] 2 3 8 8 8 29 32 25 20 42
# [18,] 2 3 12 2 1 9 21 40 26 37
# [19,] 1 10 3 7 8 4 23 16 6 50
# [20,] 2 9 13 14 19 24 31 23 14 32
EDIT:
As was pointed out by Konrad Rudolph, the result I provided was a matrix, not a data.frame. Just convert the matrix using as.data.frame():
set.seed(1234)
gendata <- function(x) {
sample(x = x*5, size = 20, replace = T)
}
dat <- as.data.frame(do.call(cbind, lapply(1:10, gendata)))
names(dat) <- sprintf("x%i", 1:10)
head(dat)
# x1 x2 x3 x4 x5 x6 x7 x8 x9 x10
# 1 1 4 9 18 24 2 27 21 20 24
# 2 4 4 10 1 12 17 33 40 26 18
# 3 4 2 5 7 4 9 35 13 20 31
# 4 4 1 10 1 14 7 33 20 11 4
# 5 5 3 5 5 5 5 18 15 4 48
# 6 4 9 8 15 23 10 10 26 29 2
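As an aside, the matrix step can be skipped entirely by building a named list and converting it directly; a small sketch reusing the gendata() helper from above:
set.seed(1234)
dat2 <- as.data.frame(lapply(setNames(1:10, sprintf("x%i", 1:10)), gendata))
head(dat2)  # same layout as dat above: 20 rows, columns x1 to x10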

R: transpose a series to a matrix with ignoring remains

Matlab can do this task, but I cannot get it right so far in R using matrix(), t(), and reshape().
My intention is to reshape a series into a matrix with a fixed 10 rows, where the number of columns varies with the length of the series. If there are some values left over, they can be discarded.
For example:
Row #1 1 2 3 4
Row #2 5 6 7 8
Row #3 9 10 11 12
Row #4 13 14 15 16
Row #5 17 18 19 20
Row #6 21 22 23 24
Row #7 25 26 27 28
Row #8 29 30 31 32
Row #9 33 34 35 36
Row #10 37 38 39 40
If there are any values left over (i.e. 41 to 49), they can simply be discarded.
Any suggestions?
This is what I think you are asking for: a vector of arbitrary length and data, turned into a matrix with nrow 10 and ncol based on the data length.
# your series of arbitrary length
data = 1:49
# calculate the number of full columns based on the length
col = length(data) %/% 10
# largest index that still fills a full column
maxIndx = 10 * col
# create and transpose the matrix, discarding the leftover values
yourMtx = t(matrix(data[1:maxIndx], col, 10))
# your matrix
> yourMtx
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 5 6 7 8
[3,] 9 10 11 12
[4,] 13 14 15 16
[5,] 17 18 19 20
[6,] 21 22 23 24
[7,] 25 26 27 28
[8,] 29 30 31 32
[9,] 33 34 35 36
[10,] 37 38 39 40
#create reverse matrix
revMtx = yourMtx[,rev(seq_len(ncol(yourMtx)))]
# reverse matrix
> revMtx
[,1] [,2] [,3] [,4]
[1,] 4 3 2 1
[2,] 8 7 6 5
[3,] 12 11 10 9
[4,] 16 15 14 13
[5,] 20 19 18 17
[6,] 24 23 22 21
[7,] 28 27 26 25
[8,] 32 31 30 29
[9,] 36 35 34 33
[10,] 40 39 38 37
If I understand your question correctly, this looks to be an approach you could use.
# generate my series
myseries <- 1:49
# specify number of columns and rows
ncols <- 4
nrows <- 10
# create a matrix with the first ncols*nrows elements and fill by row
mymatrix <- matrix(myseries[1:(ncols*nrows)],
ncol = ncols, nrow = nrows, byrow = TRUE)
mymatrix
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 5 6 7 8
[3,] 9 10 11 12
[4,] 13 14 15 16
[5,] 17 18 19 20
[6,] 21 22 23 24
[7,] 25 26 27 28
[8,] 29 30 31 32
[9,] 33 34 35 36
[10,] 37 38 39 40
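To tie this back to the original requirement (the column count driven by the data length rather than hard-coded), ncols can be computed with integer division so that any leftover values, 41 to 49 here, are dropped automatically; a minimal sketch:
myseries <- 1:49
nrows <- 10
ncols <- length(myseries) %/% nrows                 # 4 full columns, remainder discarded
mymatrix <- matrix(myseries[seq_len(ncols * nrows)],
                   nrow = nrows, ncol = ncols, byrow = TRUE)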

indifference curves in R - plotting table with repeating function

I just spent last week learning R, and now I am playing with it, but I can't find an answer to this problem.
I have a utility function written like this:
u(x, y) = min(3x, 9y)
and the goal is to plot a contour graph of this function.
I have tried quite a lot of things, and so far I have come to this:
x<-seq(0,30,3)
y<-seq(0,90,9)
n<-1:11
table<-c(
pmin(x,y[1]),
pmin(x,y[2]),
pmin(x,y[3]),
pmin(x,y[4]),
pmin(x,y[5]),
pmin(x,y[6]),
pmin(x,y[7]),
pmin(x,y[8]),
pmin(x,y[9]),
pmin(x,y[10]),
pmin(x,y[11]))
mat<-matrix(table, ncol=11, nrow=11)
contour(x,y,mat)
and obviously, the contour graph is not very precise.
I would like to know what I can use so that I do not have to write the table out by hand like this.
I wanted to use sapply somehow, but the function takes two values and I was not sure how to feed them in.
I would be very thankful if someone showed me how to plot this contour graph as efficiently as possible.
You can use outer for this:
n <- 11
foo <- outer(X=seq_len(n), Y=seq_len(n), function(x, y) pmin(3*x, 9*y))
outer applies the specified function to all combinations of elements of the vectors passed to the arguments X and Y. In this case, we are applying pmin(3*x, 9*y) to all pairs of elements of the two vectors, each of which is the numbers 1 to n. (Note that outer needs a vectorised function, which is why pmin is used here rather than min.)
foo
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
## [1,] 3 3 3 3 3 3 3 3 3 3 3
## [2,] 6 6 6 6 6 6 6 6 6 6 6
## [3,] 9 9 9 9 9 9 9 9 9 9 9
## [4,] 9 12 12 12 12 12 12 12 12 12 12
## [5,] 9 15 15 15 15 15 15 15 15 15 15
## [6,] 9 18 18 18 18 18 18 18 18 18 18
## [7,] 9 18 21 21 21 21 21 21 21 21 21
## [8,] 9 18 24 24 24 24 24 24 24 24 24
## [9,] 9 18 27 27 27 27 27 27 27 27 27
## [10,] 9 18 27 30 30 30 30 30 30 30 30
## [11,] 9 18 27 33 33 33 33 33 33 33 33
contour(foo)
To increase the number of points at which the function is evaluated, just pass finer-resolution vectors:
foo2 <- outer(seq(1, 11, 0.01), seq(1, 11, 0.01), function(x, y) pmin(3*x, 9*y))
contour(foo2)
Now it is evaluated at 0.01 increments from 1 through 11.
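If you want the contour plot labelled in the original units rather than grid indices, you can pass the x and y grids from the question to both outer() and contour(); a short sketch:
x <- seq(0, 30, 3)
y <- seq(0, 90, 9)
z <- outer(x, y, function(x, y) pmin(3 * x, 9 * y))
contour(x, y, z, xlab = "x", ylab = "y")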

Select rows without missing values in R

I am a new user to R and for loops. I am trying to take samples from my data and check whether a collinear column exists in each sample. If it does, I want to record that iteration in a vector (baditr) and print a line saying that collinearity occurs at iteration i, then jump to the next iteration and continue running. For each good iteration, I would like the code to save the column sums in the corresponding row of a matrix.
My problem is that I am getting NA rows for the bad iterations; my intent is for bad iterations not to appear in the matrix at all. Here is my code:
a0=rep(1,40)
a=rep(0:1,20)
b=c(rep(1,20),rep(0,20))
c0=c(rep(0,12),rep(1,28))
c1=c(rep(1,5),rep(0,35))
c2=c(rep(1,8),rep(0,32))
c3=c(rep(1,23),rep(0,17))
da=matrix(cbind(a0,a,b,c0,c1,c2,c3),nrow=40,ncol=7)
sing <- function(nrw){
sm <- matrix(NA,nrow=nrw,ncol=ncol(da))
baditr <- NULL
for(i in 1:nrw){
ind <- sample(1:nrow(da), nrow(da),replace =TRUE)
smdat <- da[ind,]
evals <- eigen(crossprod(smdat))$values
if(any(abs(evals) < 1e-7)){
baditr <- c(baditr,i)
cat("singularity occurs at", paste(i),"\n")
next
}
sm[i,] <- apply(smdat,2,sum)
}
return(sm)
}
sing(20)
I will get the following output:
singularity occurs at 9
singularity occurs at 13
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] 40 23 22 25 5 8 26
[2,] 40 20 18 30 4 7 22
[3,] 40 19 24 28 6 7 25
[4,] 40 19 22 30 6 9 26
[5,] 40 12 26 26 8 13 30
[6,] 40 17 16 27 7 10 19
[7,] 40 20 17 33 3 5 19
[8,] 40 22 19 28 4 9 23
[9,] NA NA NA NA NA NA NA
[10,] 40 21 24 28 3 6 27
[11,] 40 21 16 31 2 4 22
[12,] 40 21 21 26 3 6 23
[13,] NA NA NA NA NA NA NA
[14,] 40 18 16 29 2 7 22
[15,] 40 24 18 30 6 9 21
[16,] 40 23 18 29 4 8 21
[17,] 40 17 25 25 3 8 29
[18,] 40 22 28 23 9 14 30
[19,] 40 25 23 25 7 11 30
[20,] 40 20 23 27 7 10 26
I would like my matrix to look like this:
singularity occurs at 9
singularity occurs at 13
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] 40 23 22 25 5 8 26
[2,] 40 20 18 30 4 7 22
[3,] 40 19 24 28 6 7 25
[4,] 40 19 22 30 6 9 26
[5,] 40 12 26 26 8 13 30
[6,] 40 17 16 27 7 10 19
[7,] 40 20 17 33 3 5 19
[8,] 40 22 19 28 4 9 23
[10,] 40 21 24 28 3 6 27
[11,] 40 21 16 31 2 4 22
[12,] 40 21 21 26 3 6 23
[14,] 40 18 16 29 2 7 22
[15,] 40 24 18 30 6 9 21
[16,] 40 23 18 29 4 8 21
[17,] 40 17 25 25 3 8 29
[18,] 40 22 28 23 9 14 30
[19,] 40 25 23 25 7 11 30
[20,] 40 20 23 27 7 10 26
As a fail-safe, I would also appreciate any information on saving a batch of iterations (for example, 50) to a file, so that when the next batch of 50 is produced it is added to the same file; after the second batch, the file would then hold 100 iterations.
Sorry for the long post. But thanks in advance.
Before you return sm, you can filter out the rows with NA values using complete.cases(). It would look something like sm[complete.cases(sm), ]. complete.cases() returns a logical vector that is TRUE for rows with no missing values, so subsetting with it drops the rows that contain NA.
Also, it doesn't look like you are doing anything with baditr after defining it. I can comment out all lines referring to baditr and your function seems to work just fine... maybe it's a legacy from an older iteration of your code?
Update
Here's your updated function using complete.cases(). Note I also commented out everything related to baditr to illustrate that it's not doing anything currently in your code.
sing <- function(nrw){
sm <- matrix(NA,nrow=nrw,ncol=ncol(da))
#baditr <- NULL
for(i in 1:nrw){
ind <- sample(1:nrow(da), nrow(da),replace =TRUE)
smdat <- da[ind,]
evals <- eigen(crossprod(smdat))$values
if(any(abs(evals) < 1e-7)){
#baditr <- c(baditr,i)
cat("singularity occurs at", paste(i),"\n")
next
}
sm[i,] <- apply(smdat,2,sum)
}
return(sm[complete.cases(sm),])
}
Now let's run the function; I'm wrapping dim() around the call, which tells us the number of rows and columns of the resulting object:
> dim(sing(20))
singularity occurs at 6
[1] 19 7
So one singularity and a matrix of 19 rows and 7 columns, am I missing something?
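As a small aside, na.omit() performs the same row filtering as subsetting with complete.cases(); a toy illustration (the matrix m here is just a stand-in, not part of your code):
m <- matrix(1:12, ncol = 3)
m[2, ] <- NA
m[complete.cases(m), ]   # rows with no missing values
na.omit(m)               # same rows, plus an "na.action" attribute recording what was dropped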
As to your other question about writing things out, are you aware of the append parameter to write.table() and friends? The help page tells us: "If TRUE, the output is appended to the file. If FALSE, any existing file of the name is destroyed."
Update 2
Here's an example using append = TRUE in write.table()
#Matrix 1 definition and write to file
x <- matrix(1:9, ncol = 3)
write.table(x, "out.txt", sep = "\t", col.names = TRUE, row.names = FALSE)
#Matrix 2 definition and write to same file with append = TRUE
x2 <- matrix(10:18, ncol = 3)
write.table(x2, "out.txt", sep = "\t", col.names = FALSE, row.names = FALSE, append = TRUE)
#read consolidated data back in to check if it's right
x3 <- read.table("out.txt", header = TRUE)
Results in
V1 V2 V3
1 1 4 7
2 2 5 8
3 3 6 9
4 10 13 16
5 11 14 17
6 12 15 18
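For the fail-safe you describe, one way (a sketch only; the chunk size, file name, and the per-iteration row are placeholders) is to accumulate rows and flush them to the file every 50 iterations with append = TRUE:
chunk <- 50
buf <- NULL
for (i in seq_len(200)) {
  buf <- rbind(buf, matrix(rnorm(7), nrow = 1))    # stand-in for one iteration's column sums
  if (i %% chunk == 0) {
    write.table(buf, "sim_results.txt", sep = "\t",
                col.names = (i == chunk),           # header only on the first flush
                row.names = FALSE, append = (i != chunk))
    buf <- NULL
  }
}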
