How to insert a column into a model.matrix [duplicate]

I'm trying to add a new column to an existing matrix, but I get a warning every time.
I'm trying this code:
normDisMatrix$newColumn <- labels
Getting this message:
Warning message: In normDisMatrix$newColumn <- labels : Coercing LHS to a list
Afterwards, when I check the dimensions of the matrix, I get NULL:
dim(normDisMatrix)
NULL
Note: labels is just a vector of numbers between 1 and 4.
What could be the problem?

As @thelatemail pointed out, the $ operator cannot be used on a matrix. A matrix is just a single atomic vector with a dimension attribute, and $ does not work on atomic vectors. When you used $ to try to add a new column, R coerced your matrix to the simplest structure on which $ can be used, which is a list. That is also why dim() returned NULL afterwards.
The function you want is cbind() (column bind). Suppose I have the matrix m:
(m <- matrix(51:70, 4))
# [,1] [,2] [,3] [,4] [,5]
# [1,] 51 55 59 63 67
# [2,] 52 56 60 64 68
# [3,] 53 57 61 65 69
# [4,] 54 58 62 66 70
To add a new column from a vector called labels, we can do
labels <- 1:4
cbind(m, newColumn = labels)
# newColumn
# [1,] 51 55 59 63 67 1
# [2,] 52 56 60 64 68 2
# [3,] 53 57 61 65 69 3
# [4,] 54 58 62 66 70 4
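A small sketch of the point above, assuming the same m and labels as in the answer: a matrix is an atomic vector with a dim attribute, and cbind() returns a matrix with one more column. The name m2 is introduced here only for illustration.

```r
m <- matrix(51:70, 4)
labels <- 1:4

# A matrix is an atomic vector plus a "dim" attribute
attributes(m)          # contains only $dim, here 4 and 5
is.list(m)             # FALSE: `$` is meant for lists and data frames

# cbind() keeps the matrix structure and appends the column
m2 <- cbind(m, newColumn = labels)
dim(m2)                # 4 6
m2[, "newColumn"]      # select the new column by name
```

Note that cbind() returns a new matrix, so the result has to be assigned (here to m2, or back to m) to keep the added column.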

Related

Create 'usable' bins from a vector in R

I have a numeric vector of integers.
I want to transform it into "bins".
I want these bins to be used as sampling frames from which I can then sample again, uniformly.
So far I can do both using findInterval but I am looking for a way to do it with cut.
Let's consider a random vector of integers which will be split into equally sized intervals of length 10:
df = sample(1:100,10)
df
[1] 81 11 38 95 45 14 10 61 96 88
Using findInterval I get the bins and an approximate way of sampling:
breaks = seq(1,max(df+1),by=10)
b <- findInterval(df, breaks)
b
[1] 9 2 4 10 5 2 1 7 10 9
# If b is equal to 1 or 10, use ifelse() to prevent leaking outside [1,100]
sam <- round(runif(10,ifelse(b==1,10*b-9,10*b-10),ifelse(b==10,10*b,10*b+10)))
sam
[1] 85 14 39 94 50 16 7 63 93 85
Using cut I get the intervals:
breaks = seq(1,max(df+1),by=10)
cut(df,breaks,right=TRUE)
[1] (71,81] (1,11] (31,41] <NA> (41,51] (11,21] (1,11] (51,61] <NA> (81,91]
Levels: (1,11] (11,21] (21,31] (31,41] (41,51] (51,61] (61,71] (71,81] (81,91]
But I don't know how to use those values as intervals from which to sample.
If there is another approach, I would be interested to know!

Good question! I will give you a completely different approach.
Basically, you want to perform Latin hypercube sampling, i.e. stratified uniform sampling on the interval [0, 100] with bins of width 10.
For this, it is easiest to install the lhs package and use its randomLHS function to perform the stratified sampling.
First step: generate one uniform draw from each of the 10 deciles (strata), as many times as you want. In this example, let's do it 5 times:
library(lhs)
X <- randomLHS(10, 5)
X
[,1] [,2] [,3] [,4] [,5]
[1,] 0.92154144 0.22185959 0.49953326 0.66248165 0.79035832
[2,] 0.47571700 0.05894016 0.55883326 0.34875162 0.98831829
[3,] 0.57738486 0.64525528 0.04955733 0.50939147 0.46297294
[4,] 0.17578838 0.83843074 0.27138703 0.87421301 0.16401042
[5,] 0.03850768 0.40746004 0.69518073 0.23487653 0.55537945
[6,] 0.83942905 0.52957416 0.84952231 0.14031915 0.84956654
[7,] 0.22802502 0.79911728 0.76789194 0.09788194 0.08667802
[8,] 0.61821268 0.93088726 0.30789950 0.95831993 0.36903120
[9,] 0.70391230 0.11445154 0.97976851 0.42027836 0.61097786
[10,] 0.31385709 0.33557430 0.18389684 0.70124986 0.27601550
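To make the stratification explicit, here is a rough base-R sketch of the same idea without the lhs package. It is not the package's implementation; the names n_strata, n_rep and X_base are assumptions for illustration. Each column gets exactly one uniform draw from each of the 10 strata [0, 0.1), [0.1, 0.2), and so on, in shuffled order.

```r
set.seed(123)          # arbitrary seed, only for reproducibility
n_strata <- 10
n_rep <- 5

X_base <- sapply(seq_len(n_rep), function(j) {
  strata <- sample(n_strata) - 1          # strata 0..9 in random order
  (strata + runif(n_strata)) / n_strata   # one uniform draw inside each stratum
})
dim(X_base)            # 10 5
```

Like the matrix returned by randomLHS, each column of X_base covers every decile once, but the rows are not sorted.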
Second step: Although each column of X is stratified, its rows are in random order. Therefore, to show the final stratified draws, we scale each column to [0, 100] and sort it:
Y <- apply(X, 2, function(x) sort(round(x * 100)))
Y
[,1] [,2] [,3] [,4] [,5]
[1,] 4 6 5 10 9
[2,] 18 11 18 14 16
[3,] 23 22 27 23 28
[4,] 31 34 31 35 37
[5,] 48 41 50 42 46
[6,] 58 53 56 51 56
[7,] 62 65 70 66 61
[8,] 70 80 77 70 79
[9,] 84 84 85 87 85
[10,] 92 93 98 96 99
NB: I have rounded only to make the stratification obvious; there is no need to call round if you are happy to have non-integer draws as output.
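If you would rather stay with cut, as the question asks, one possible sketch is to turn the factor returned by cut into bin indices and then look up each bin's end points in breaks. This assumes breaks that cover the whole range (here seq(0, 100, by = 10), not the breaks from the question), so that no value falls outside the bins and becomes NA. The names idx, lower and upper are introduced for illustration.

```r
set.seed(42)                                   # arbitrary seed
df <- c(81, 11, 38, 95, 45, 14, 10, 61, 96, 88)

# Assumption: breaks span the full range, giving bins (0,10], (10,20], ..., (90,100]
breaks <- seq(0, 100, by = 10)
bins <- cut(df, breaks, right = TRUE)          # factor of intervals

idx <- as.integer(bins)                        # position of each value's bin
lower <- breaks[idx]                           # left end point of that bin
upper <- breaks[idx + 1]                       # right end point of that bin

# One uniform draw from the bin each original value fell into
sam <- runif(length(df), min = lower, max = upper)
```

runif is vectorised over min and max, so each element of sam is drawn from its own interval. Wrap the result in round() if integer draws are needed.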

How to rbind a list of different lengths in R

I have extracted tables from a PDF file with the tabulizer package. After extracting them, I want to rbind the tables, which are returned as a list of tables of different lengths.
table1 <- extract_tables("\\AC002_2017.pdf")
final <- do.call(rbind, table1)
But it gives me the following error:
Error in (function (..., deparse.level = 1) :
number of columns of matrices must match (see arg 2)
How can I rbind it?
The format of the data is as follows:
[[1]]
[,1] [,2] [,3] [,4]
[1,] 20 45 34 34
[2,] 23 34 67 43
[3,] 22 23 42 34
[4,] 45 44 56 54
[5,] 12 11 12 14
[6,] 34 33 45 32
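The error means that the extracted matrices do not all have the same number of columns. One possible workaround, sketched below on made-up matrices rather than the PDF data, is to pad the narrower matrices with NA columns on the right before calling rbind. Whether right-padding is appropriate depends on how the tables were split, so treat it as an assumption; tables, max_cols and pad_cols are hypothetical names.

```r
# Hypothetical stand-in for the list returned by extract_tables()
tables <- list(matrix(1:8, ncol = 4), matrix(1:6, ncol = 3))

# Widest table determines the target number of columns
max_cols <- max(vapply(tables, ncol, integer(1)))

# Append NA columns until the matrix has n columns
pad_cols <- function(m, n) {
  if (ncol(m) < n) {
    m <- cbind(m, matrix(NA, nrow = nrow(m), ncol = n - ncol(m)))
  }
  m
}

final <- do.call(rbind, lapply(tables, pad_cols, n = max_cols))
dim(final)   # 4 4
```

It is worth inspecting vapply(tables, ncol, integer(1)) first: if the tables differ because of a split or merged column, padding will misalign the data and the tables need to be fixed individually instead.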

aperm clarification in R

I am following the thread "2d matrix to 3d stacked array in r" and have a question about the aperm function.
1) I get the first part of the solution, but did not understand the c(2,1,3) used in the function. Could you kindly clarify that?
2) Also I am trying a slight variation of the example in that thread.
My case is as follows:
For a similar matrix in example:
set.seed(1)
mat <- matrix(sample(100, 12 * 5, TRUE), ncol = 5)
[,1] [,2] [,3] [,4] [,5]
[1,] 27 69 27 80 74
[2,] 38 39 39 11 70
[3,] 58 77 2 73 48
[4,] 91 50 39 42 87
[5,] 21 72 87 83 44
[6,] 90 100 35 65 25
[7,] 95 39 49 79 8
[8,] 67 78 60 56 10
[9,] 63 94 50 53 32
[10,] 7 22 19 79 52
[11,] 21 66 83 3 67
[12,] 18 13 67 48 41
I am trying to rearrange such that I have a 3 (row) X 5 (col) x 11 (third dim) array.
So, essentially the rows would overlap and show something like:
,,1
27 69 27 80 74
38 39 39 11 70
58 77 2 73 48
,,2
38 39 39 11 70
58 77 2 73 48
91 50 39 42 87
,,3
58 77 2 73 48
91 50 39 42 87
21 72 87 83 44
and so on until we hit ,,11
Would someone have any experience with this?
Thanks!

Just stumbled over this question. Though the answer comes a little late, here are two options for you.
First, you need to extend mat in such a way that its rows overlap. We can use this vector for row indexing:
#[1] 1 2 3 2 3 4 3 4 5 4 5 6 5 6 7 6 7 8 7 8 9 8 9 10 9 10 11 10 11 12
I used rollapply from the zoo package to create it as follows:
library(zoo)
row_nums <- c(t(rollapply(1:nrow(mat), width = 3, FUN = rep, 1)))
mat <- mat[row_nums, ]
dim(mat)
#[1] 30 5
Now use the matsplitter function that @Mr.Flick provided in this answer (please consider upvoting his answer) to get the desired output:
matsplitter(mat, 3, 5)
#, , 1
#
# [,1] [,2] [,3] [,4] [,5]
#[1,] 27 69 27 80 74
#[2,] 38 39 39 11 70
#[3,] 58 77 2 73 48
#
#, , 2
#
# [,1] [,2] [,3] [,4] [,5]
#[1,] 38 39 39 11 70
#[2,] 58 77 2 73 48
#[3,] 91 50 39 42 87
#
#, , 3
#
# [,1] [,2] [,3] [,4] [,5]
#[1,] 58 77 2 73 48
#[2,] 91 50 39 42 87
#[3,] 21 72 87 83 44
#
#, , 4
# ...
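Note that you will end up with an array of dimension 3 x 5 x 10, not 11, because 12 rows contain only 10 overlapping windows of 3 rows.
For reference, matsplitter is defined as follows (define it before calling it):
matsplitter <- function(M, r, c) {
  # block index of every cell along the rows and along the columns
  rg <- (row(M) - 1) %/% r + 1
  cg <- (col(M) - 1) %/% c + 1
  # single running block number per cell
  rci <- (rg - 1) * max(cg) + cg
  # number of r x c blocks in M
  N <- prod(dim(M)) / r / c
  # collect the cells block by block and reshape into an r x c x N array
  cv <- unlist(lapply(1:N, function(x) M[rci == x]))
  dim(cv) <- c(r, c, N)
  cv
}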
Here is a solution using aperm as in the linked answer (assuming that mat was extended as above and is of dimension 30 x 5).
aperm(`dim<-`(t(mat), list(5, 3, 10)), c(2, 1, 3))
t(mat): transposes mat (new dimension: 5 x 30)
`dim<-`(t(mat), list(5, 3, 10)): changes the dimension of t(mat) from 5 X 30 to 5 x 3 x 10
aperm(..., c(2, 1, 3)): permutes the dimensions of the array `dim<-`(t(mat), list(5, 3, 10)) from 5 x 3 x 10 to 3 x 5 x 10, i.e. the second dimension becomes the first, the first dimension becomes the second and the third dimension stays the same.
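The steps above can also be put together without zoo. The following is a sketch under the assumption that overlapping windows of 3 rows are wanted; width, starts, ext and arr are names introduced here, and array() is used in place of `dim<-` purely for readability.

```r
set.seed(1)
mat <- matrix(sample(100, 12 * 5, TRUE), ncol = 5)

width <- 3                                     # rows per window
starts <- seq_len(nrow(mat) - width + 1)       # 1..10: first row of each window

# Row indices 1 2 3 2 3 4 ... 10 11 12, built without zoo
row_nums <- c(sapply(starts, function(s) s:(s + width - 1)))
ext <- mat[row_nums, ]                         # 30 x 5 extended matrix

# Fill a 5 x 3 x 10 array from the transpose, then swap the first two dimensions
arr <- aperm(array(t(ext), dim = c(ncol(mat), width, length(starts))), c(2, 1, 3))
dim(arr)                                       # 3 5 10
```

With this construction, arr[, , k] holds rows k to k + 2 of the original mat. The transpose is needed because R fills arrays column by column: without t() and aperm(), each slice would be filled down the columns of ext instead of row by row.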

Extract Consecutive Pairs of Elements from a Vector and Place in a Matrix

This may be a simple question, but I cannot find how to produce sequential pairs of values from a vector, where each pair consists of the previous value and the next value, stored in a two-column matrix. Example below:
C<-c(1 , 20 , 44 , 62 , 64 , 89 , 91, 100)
Desired matrix:
newpairs
[,1] [,2]
[1,] 1 20
[2,] 20 44
[3,] 44 64
[4,] 64 89
[5,] 89 91
[6,] 91 100
When I try matrix(), it does not work, because the last element of each pair is not repeated as the first element of the next pair:
newpairs <- matrix(C, ncol=2, byrow=TRUE)
newpairs
[,1] [,2]
[1,] 1 20
[2,] 44 62
[3,] 64 89
[4,] 91 100
I guess you could subset, but if the values in C change, you would have to change which elements are dropped or kept. I have also tried functions that extract certain increments or every nth element. However, I would like to find a systematic way to create the first example matrix.
Any help is welcome.

This gives every consecutive pair (including the pairs with 62, which the example in the question skips):
cbind(C[-length(C)], C[-1])
[,1] [,2]
[1,] 1 20
[2,] 20 44
[3,] 44 62
[4,] 62 64
[5,] 64 89
[6,] 89 91
[7,] 91 100

How about:
## define input
C <- c(1 , 20 , 44 , 62 , 64 , 89 , 91, 100)
## replicate all but first and last elements
Crep <- rep(C,c(1,rep(2,length(C)-2),1))
## create matrix
matrix(Crep,ncol=2,byrow=TRUE)
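As a quick check, the two approaches above produce the same matrix. The sketch below also shows a third variant using embed() from the stats package, which is not part of either answer; the names pairs_cbind, pairs_rep and pairs_embed are introduced for illustration.

```r
C <- c(1, 20, 44, 62, 64, 89, 91, 100)

# Answer 1: drop the last element for column 1, the first element for column 2
pairs_cbind <- cbind(C[-length(C)], C[-1])

# Answer 2: repeat every inner element twice, then fill a matrix row by row
Crep <- rep(C, c(1, rep(2, length(C) - 2), 1))
pairs_rep <- matrix(Crep, ncol = 2, byrow = TRUE)

# Variant: embed() returns (current, previous) pairs, so reverse the columns
pairs_embed <- embed(C, 2)[, 2:1]

all(pairs_cbind == pairs_rep)     # TRUE
all(pairs_cbind == pairs_embed)   # TRUE
```

All three give a matrix with length(C) - 1 rows, so they keep working when the values or the length of C change.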
