How to create a Matrix from an Array in R?

I have a simple array, like:
x <- c(10,20,30,40,50,60,70,80,90,100)
I would like to create a matrix from this array, because those numbers are prices of two stocks.
stock A: 10 30 50 70 90
stock B: 20 40 60 80 100
How can I create two columns from this list of prices?
Thank you

I suspect the OP actually wants:
> matrix(x, ncol = 2, byrow = TRUE)
[,1] [,2]
[1,] 10 20
[2,] 30 40
[3,] 50 60
[4,] 70 80
[5,] 90 100
or possibly
> split(x, rep(c("A","B"), length(x)/2))
$A
[1] 10 30 50 70 90
$B
[1] 20 40 60 80 100
which can be converted to a data frame easily enough...
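A minimal sketch of that conversion, assuming the prices alternate A, B, A, B as in the question. The names `prices` and `prices_df` are illustrative and not from the original post.

```r
x <- c(10, 20, 30, 40, 50, 60, 70, 80, 90, 100)

# Label the elements alternately A, B, A, B, ... and split by label
prices <- split(x, rep(c("A", "B"), length(x) / 2))

# A named list of equal-length vectors converts directly to a data frame
prices_df <- as.data.frame(prices)
prices_df$A
prices_df$B
```

Each stock then lives in its own named column, which is usually easier to work with than positional matrix columns.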

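Just pass the vector to matrix():
matrix(x, ncol = 2)
There is no need to specify the number of rows, since it is implied by the length of the vector and ncol. Note that matrix() fills column by column by default, so this call puts 10 to 50 in the first column and 60 to 100 in the second; add byrow = TRUE to get one stock per column as in the question. See ?matrix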
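To make the fill order concrete, here is a small sketch comparing the default column-wise fill with byrow = TRUE. The variable names and the column labels "A" and "B" are assumptions for illustration.

```r
x <- c(10, 20, 30, 40, 50, 60, 70, 80, 90, 100)

# Default: fill column by column, so the first five values form column 1
by_col <- matrix(x, ncol = 2)

# byrow = TRUE: fill row by row, so alternating values land in separate columns
by_row <- matrix(x, ncol = 2, byrow = TRUE,
                 dimnames = list(NULL, c("A", "B")))

by_col[, 1]     # 10 20 30 40 50
by_row[, "A"]   # 10 30 50 70 90
by_row[, "B"]   # 20 40 60 80 100
```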

Related

Vector to a matrix where the next row starts 1 observation

Suppose I have a data set with 40 observations
y <- rnorm(40,10,10)
Now I would like to transform this vector into a matrix with 4 observations in each row.
On top of that, I would like row i to start with value y[i], shifting by one each row up until the 40th observation.
So for example:
r1 = y[1] y[2] y[3] y[4]
r2 = y[2] y[3] y[4] y[5]
r3 = y[3] y[4] y[5] y[6]
.
.
r40 = y[39] y[38] y[37] y[36]
Does anyone know how to do this?
You can use matrix like:
y <- 1:40
matrix(y, 41, 4)[1:37,]
# [,1] [,2] [,3] [,4]
# [1,] 1 2 3 4
# [2,] 2 3 4 5
# [3,] 3 4 5 6
#...
#[35,] 35 36 37 38
#[36,] 36 37 38 39
#[37,] 37 38 39 40
Or use seq in mapply to build an index matrix, then fill it with the values of y.
i <- 1:37
M <- t(mapply(seq, i, i+3))
M
# [,1] [,2] [,3] [,4]
# [1,] 1 2 3 4
# [2,] 2 3 4 5
# [3,] 3 4 5 6
#...
#[35,] 35 36 37 38
#[36,] 36 37 38 39
#[37,] 37 38 39 40
M[] <- y[M]
This is one way to produce the first 37 rows. If you want to change the direction for the last 3 rows, then it would be easy to do with the same code:
library(magrittr)  # provides the %>% pipe used below
purrr::map(seq_len(37), ~y[.x:(.x+3)]) %>%
  unlist() %>%
  matrix(nrow = 37, byrow = TRUE)
The only difference would be to first save the values of the first 37 rows, then produce the last 3 rows, bind them, and turn that vector into a matrix.
Try embed:
embed(y, 4)[, 4:1]
which could give the desired output. embed() returns each window in reverse order, hence the column flip with 4:1.
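As a cross-check on the answers above, here is a sketch that builds the sliding windows from an explicit index matrix and compares the result with embed(). It only covers the complete windows (rows 1 to 37 for 40 observations); the names `w`, `idx`, and `windows` are assumptions for illustration.

```r
y <- 1:40   # stand-in data, as in the answer above
w <- 4      # window width

# Row i of the index matrix holds i, i + 1, ..., i + w - 1
idx <- outer(seq_len(length(y) - w + 1), seq_len(w) - 1, `+`)
windows <- matrix(y[idx], nrow = nrow(idx))

# embed() returns each window in reverse order, so flip the columns to compare
all.equal(windows, embed(y, w)[, w:1])

windows[1, ]               # first window
windows[nrow(windows), ]   # last window
```

Building the index matrix first keeps the window logic separate from the data, so the same `idx` can be reused for any vector of the same length.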

assign() for only one sheet of an array - avoid manual change

I have some arrays that I will need to fill in. The names of the arrays are variable, but the same functions will happen to them throughout. Basically I need a way to replace only one "sheet" of an array with another without manually entering the array name. Example below:
big_array_1 <- array(dim = c(5,5,10))
big_array_1[,,1] <- sample(c(1:10), 25, replace=T)
big_array_2 <- array(dim = c(5,5,10))
big_array_2[,,1] <- sample(c(40:50), 25, replace=T)
small_array <- array(dim = c(5,5,2))
small_array[,,] <- sample(c(20:30), 50, replace=T)
So each big array needs its second sheet (along the third dimension) replaced by the second sheet of the small array, but I want to set just a number (i.e. big array "1" or "2") to make this work in my code instead of changing the name manually every time.
# So I know I can do this, but I want to avoid manually changing the "_1" to "_2" when I run the script
big_array_1[,,2] <- small_array[,,2]
# instead, I'm hoping I can use a variable and some kind of assign()
arraynumber <- 1
# but this gives an error for assigning a non-language object
get(paste0("big_array_",arraynumber))[,,2] <- small_array[,,2]
# and this gives an error for invalid first argument.
assign(get(paste0("big_array_",arraynumber))[,,2], small_array[,,2])
# even though get(paste0("big_array_",arraynumber))[,,2] works on its own.
Any suggestions?
In R, you cannot assign values to the result of get(). Additionally, it is not advisable to use assign, or even attach, eval+parse, list2env, and other dynamic, environment-changing methods, which tend to be hard to debug.
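For completeness, a sketch of what a working (but still discouraged) variant of the question's attempt looks like: read the object with get(), modify the local copy, and write it back with assign(), whose first argument must be the variable name as a string. The name `target` and the temporary `tmp` are assumptions for illustration.

```r
big_array_1 <- array(dim = c(5, 5, 10))
small_array <- array(sample(20:30, 50, replace = TRUE), dim = c(5, 5, 2))

arraynumber <- 1
target <- paste0("big_array_", arraynumber)   # variable name as a string

# get() returns the value, so modify a local copy ...
tmp <- get(target)
tmp[, , 2] <- small_array[, , 2]

# ... and write it back under the same name
assign(target, tmp)

identical(big_array_1[, , 2], small_array[, , 2])
```

This works, but every access needs the get/modify/assign round trip.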
As commented, simply use named lists for identically structured objects. Lists can hold any object, from arrays to data frames to plots, with no limit on the number of elements. You also avoid flooding the global environment with separately named objects and instead work with a handful of lists containing many underlying elements, which makes assignment and iteration easier to manage.
Definition
set.seed(8620)
# NAMED LIST OF TWO ARRAYS
big_array_list <- list(big_array_1 = array(dim = c(5,5,10)),
big_array_2 = array(dim = c(5,5,10)))
big_array_list$big_array_1[,,1] <- sample(c(1:10), 25, replace=TRUE)
big_array_list$big_array_2[,,1] <- sample(c(40:50), 25, replace=TRUE)
# NAMED LIST OF ONE ARRAY
small_array_list <- list(small_array_1 = array(sample(c(20:30), 50, replace=TRUE),
dim = c(5,5,2)))
Assignment
# ASSIGN BY FIXED NAME
big_array_list$big_array_1[,,2] <- small_array_list$small_array_1[,,2]
big_array_list$big_array_1[,,2]
# [,1] [,2] [,3] [,4] [,5]
# [1,] 30 29 26 24 23
# [2,] 21 20 22 20 24
# [3,] 27 24 26 30 30
# [4,] 30 26 24 29 25
# [5,] 26 21 26 20 30
# ASSIGN BY DYNAMIC NAME
arraynumber <- 1
big_array_list[[paste0("big_array_",arraynumber)]][,,2] <- small_array_list[[paste0("small_array_",arraynumber)]][,,2]
big_array_list[[paste0("big_array_",arraynumber)]][,,2]
# [,1] [,2] [,3] [,4] [,5]
# [1,] 30 29 26 24 23
# [2,] 21 20 22 20 24
# [3,] 27 24 26 30 30
# [4,] 30 26 24 29 25
# [5,] 26 21 26 20 30
# ASSIGN BY INDEX
big_array_list[[1]][,,2] <- small_array_list[[1]][,,2]
big_array_list[[1]][,,2]
# [,1] [,2] [,3] [,4] [,5]
# [1,] 30 29 26 24 23
# [2,] 21 20 22 20 24
# [3,] 27 24 26 30 30
# [4,] 30 26 24 29 25
# [5,] 26 21 26 20 30
Iterative Needs
# RETURN DIMENSIONS OF EACH big_array
lapply(big_array_list, dim)
# SHOW FIRST 5 ELEMENTS OF EACH big_array
sapply(big_array_list, `[`, 1:5)
# RETURN LIST WHERE ALL big_arrays ARE EQUAL TO small_array
mapply(`<-`, big_array_list, small_array_list, SIMPLIFY=FALSE)
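The dynamic-name assignment can also be wrapped in a small function so that only the array number changes between runs. This is a sketch under the naming scheme used above; `replace_sheet` and its arguments are hypothetical names, not part of any package.

```r
set.seed(1)

big_array_list <- list(big_array_1 = array(dim = c(5, 5, 10)),
                       big_array_2 = array(dim = c(5, 5, 10)))
small_array <- array(sample(20:30, 50, replace = TRUE), dim = c(5, 5, 2))

# Hypothetical helper: replace one sheet of the array selected by number
replace_sheet <- function(arrays, number, sheet, value) {
  name <- paste0("big_array_", number)
  stopifnot(name %in% names(arrays))
  arrays[[name]][, , sheet] <- value
  arrays   # return the modified list; the caller must reassign it
}

# Update a single array chosen by number
big_array_list <- replace_sheet(big_array_list, 1, 2, small_array[, , 2])

# Or update every array in the list in one pass
big_array_list <- lapply(big_array_list, function(a) {
  a[, , 2] <- small_array[, , 2]
  a
})
```

Because R copies on modification, the function returns the updated list instead of changing it in place, so the result has to be assigned back.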

sum vectors in a matrix by distance from a cell (R)

Suppose I have a matrix A of dimensions n x m, a starting cell (i,j), and a constant k which satisfies k < n x m.
I need a way to extract the values inside A such that all values are within k steps of the starting cell. A step is either a column move or a row move.
Then I'm looking to sum the extracted values in 2 groups: one group consists of the sums taken along each column of the original matrix, and the other of the sums taken along each row of the original matrix.
It is important for me that this addresses situations where the starting cell is within k steps from the edge of the matrix.
Example set (I'm heavily simplifying here):
> #create matrix where m = 7,n = 7
> Mat <- sample(1:49,49) %>% matrix(7,7)
>
> #declare starting cell where (i = 4, j = 2)
> i = 4
> j = 2
>
> #declare number of steps
> k = 2
>
> Mat
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] 25 35 29 10 16 46 23
[2,] 32 43 7 5 31 1 14
[3,] 36 19 49 45 13 41 47
[4,] 17 18 48 9 3 28 12
[5,] 26 6 30 33 20 2 11
[6,] 40 24 39 21 37 38 8
[7,] 4 15 34 22 27 44 42
> Mat[i,j]
[1] 18
For this example, the output would be two vectors (one for column sums and one for row sums):
> Columnsum <- c(sum(36,17,26) , #sum(Mat[3:5,1])
+ sum(43,19,18,6,24), #sum(Mat[2:6,2])
+ sum(49,48,30), #sum(Mat[3:5,3])
+ sum(9)) #sum(Mat[4:4,4])
>
> Rowsum <- c(sum(43), #sum(Mat[2,2:2])
+ sum(36,19,49), #sum(Mat[3,1:3])
+ sum(17,18,48,9), #sum(Mat[4,1:4])
+ sum(26,6,30), #sum(Mat[5,1:3])
+ sum(24)) #sum(Mat[6,2:2])
>
> Columnsum
[1] 79 110 127 9
> Rowsum
[1] 43 104 92 62 24
You could 'remove' parts of your matrix Mat with entries more than k steps away from (i,j) by overwriting them with NA:
Mat[abs(row(Mat) - i) + abs(col(Mat) - j) > k] <- NA
Then remove the rows and columns that are entirely NA:
Mat <- Mat[rowSums(is.na(Mat)) != ncol(Mat), colSums(is.na(Mat)) != nrow(Mat)]
And finally you can compute the row and column sums:
Columnsum <- colSums(Mat, na.rm = TRUE)
Rowsum <- rowSums(Mat, na.rm = TRUE)
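The three steps can be checked against the example from the question. The sketch below wraps them in a helper, `sums_within_k`, which is a hypothetical name; it assumes the input matrix contains no NA values of its own and uses drop = FALSE so that a single remaining row or column stays a matrix.

```r
# Matrix from the question, entered row by row
Mat <- matrix(c(25, 35, 29, 10, 16, 46, 23,
                32, 43,  7,  5, 31,  1, 14,
                36, 19, 49, 45, 13, 41, 47,
                17, 18, 48,  9,  3, 28, 12,
                26,  6, 30, 33, 20,  2, 11,
                40, 24, 39, 21, 37, 38,  8,
                 4, 15, 34, 22, 27, 44, 42),
              nrow = 7, byrow = TRUE)

# Hypothetical helper wrapping the three steps
sums_within_k <- function(A, i, j, k) {
  # Cells more than k row/column steps from (i, j) are masked out
  A[abs(row(A) - i) + abs(col(A) - j) > k] <- NA
  # Keep only rows and columns that still contain a value
  keep_rows <- rowSums(!is.na(A)) > 0
  keep_cols <- colSums(!is.na(A)) > 0
  A <- A[keep_rows, keep_cols, drop = FALSE]
  list(Columnsum = colSums(A, na.rm = TRUE),
       Rowsum    = rowSums(A, na.rm = TRUE))
}

res <- sums_within_k(Mat, i = 4, j = 2, k = 2)
res$Columnsum   # 79 110 127 9
res$Rowsum      # 43 104 92 62 24
```

Because row() and col() only index cells that exist, a starting cell near the edge needs no special handling: the neighbourhood is clipped at the matrix boundary.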

Sum Every N Values in Matrix

So I have taken a look at a question posted before which was used for summing every 2 values in each row in a matrix: sum specific columns among rows. I also took a look at another question here: R Sum every k columns in matrix, which is more similar to mine. I could not get the solution in this case to work. Here is the code that I am working with...
y <- matrix(1:27, nrow = 3)
y
m1 <- as.matrix(y)
n <- 3
dim(m1) <- c(nrow(m1)/n, ncol(m1), n)
res <- matrix(rowSums(apply(m1, 1, I)), ncol=n)
identical(res[1,],rowSums(y[1:3,]))
sapply(split.default(y, 0:(length(y)-1) %/% 3), rowSums)
I just get an error message when applying this. The desired output is a matrix with the following values:
[,1] [,2] [,3]
[1,] 12 39 66
[2,] 15 42 69
[3,] 18 45 72
To sum consecutive sets of n elements from each row, you just need to write a function that does the summing and apply it to each row:
n <- 3
t(apply(y, 1, function(x) tapply(x, ceiling(seq_along(x)/n), sum)))
# 1 2 3
# [1,] 12 39 66
# [2,] 15 42 69
# [3,] 18 45 72
Transform the matrix to an array and use colSums (as suggested by @nongkrong):
y <- matrix(1:27, nrow = 3)
n <- 3
a <- y
dim(a) <- c(nrow(a), ncol(a)/n, n)
b <- aperm(a, c(2,1,3))
colSums(b)
# [,1] [,2] [,3]
#[1,] 12 39 66
#[2,] 15 42 69
#[3,] 18 45 72
Of course this assumes that ncol(y) is divisible by n.
PS: You can of course avoid creating so many intermediate objects. They are there for didactic purposes.
I would do something similar to the OP -- apply rowSums on subsets of the matrix:
n = 3
ng = ncol(y)/n
sapply( 1:ng, function(jg) rowSums(y[, (jg-1)*n + 1:n ]))
# [,1] [,2] [,3]
# [1,] 12 39 66
# [2,] 15 42 69
# [3,] 18 45 72
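Another option, sketched here as an alternative to the answers above, is base R's rowsum(), which adds up the rows of a matrix by group. Grouping with ceiling() also copes with a column count that is not divisible by n, in which case the last group is simply shorter. The helper name `sum_every_n` is hypothetical.

```r
y <- matrix(1:27, nrow = 3)

# Hypothetical helper: sum each run of n consecutive columns within every row
sum_every_n <- function(m, n) {
  group <- ceiling(seq_len(ncol(m)) / n)   # 1,1,1,2,2,2,... for n = 3
  # rowsum() adds up rows by group, so transpose, sum, and transpose back
  res <- t(rowsum(t(m), group))
  unname(res)
}

sum_every_n(y, 3)   # the 3 x 3 matrix asked for in the question
sum_every_n(y, 4)   # groups of 4, 4 and 1 columns
```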

Keep column name when filtering matrix columns

I have a matrix, like the one generated with this code:
> m = matrix(data=c(1:50), nrow= 10, ncol = 5);
> colnames(m) = letters[1:5];
If I filter the columns, and the result have more than one column, the new matrix keeps the names. For example:
> m[, colnames(m) != "a"];
b c d e
[1,] 11 21 31 41
[2,] 12 22 32 42
[3,] 13 23 33 43
[4,] 14 24 34 44
[5,] 15 25 35 45
[6,] 16 26 36 46
[7,] 17 27 37 47
[8,] 18 28 38 48
[9,] 19 29 39 49
[10,] 20 30 40 50
Notice that here, the class is still matrix:
> class(m[, colnames(m) != "a"]);
[1] "matrix"
But when the filter leaves only one column, the result is a vector (an integer vector in this case), and the column name is lost.
> m[, colnames(m) == "a"]
[1] 1 2 3 4 5 6 7 8 9 10
> class(m[, colnames(m) == "a"]);
[1] "integer"
The name of the column is very important.
I would like to keep both, matrix structure (a one column matrix) and the column's name.
But, the column's name is more important.
I already know how to solve this by the long way (by keeping track of every case). I'm wondering if there is an elegant, enlightening solution.
You need to set drop = FALSE. This is good practice for programmatic use. From the help page for `[`:
drop
For matrices and arrays. If TRUE the result is coerced to the lowest possible dimension (see the examples)
m[,'a',drop=FALSE]
This will retain the names as well.
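A short sketch showing the difference on the matrix from the question, using the same logical column filter. The names `v` and `m1` are assumptions for illustration.

```r
m <- matrix(1:50, nrow = 10, ncol = 5)
colnames(m) <- letters[1:5]

# Default (drop = TRUE): a single selected column collapses to a plain vector
v <- m[, colnames(m) == "a"]
is.matrix(v)      # FALSE

# drop = FALSE: the result stays a one-column matrix and keeps its name
m1 <- m[, colnames(m) == "a", drop = FALSE]
dim(m1)           # 10 1
colnames(m1)      # "a"
```

Writing drop = FALSE in every programmatic subset means the code behaves the same whether the filter matches one column or many.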
You can also use subset:
m.a = subset(m, select = colnames(m) == "a")
