Changing dimnames of matrices and data frames in R

Let's say I have created the following matrix:
> x <- matrix(1:20000,nrow=100)
> x[1:10,1:10]
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 1 101 201 301 401 501 601 701 801 901
[2,] 2 102 202 302 402 502 602 702 802 902
[3,] 3 103 203 303 403 503 603 703 803 903
[4,] 4 104 204 304 404 504 604 704 804 904
[5,] 5 105 205 305 405 505 605 705 805 905
[6,] 6 106 206 306 406 506 606 706 806 906
[7,] 7 107 207 307 407 507 607 707 807 907
[8,] 8 108 208 308 408 508 608 708 808 908
[9,] 9 109 209 309 409 509 609 709 809 909
[10,] 10 110 210 310 410 510 610 710 810 910
What are the methods in R to change row and column names? For example, I would like the row names to be SS1, SS2, ..., SS100 and the column names to be M1, M2, ..., M200. I usually work with data that has thousands of rows and columns, so I need a method that scales. Some people use something like attributes(x)$dimnames <- list(...) and some use rownames(x) <- paste(...). What are all the possible methods?
My second question is, can I use the same methods after I convert the matrix to a data frame?

From comment to answer:
row.names(x) <- paste("SS", 1:nrow(x), sep="")
colnames(x) <- paste("M" , 1:ncol(x), sep="")
As @doug wrote, it works for matrices and data frames.

Yes, the same methods work for both matrices and data frames; see below:
A = matrix(1:12, nrow=4)
colnames(A) = c("col1", "col2", "col3")
row.names(A) = c("row1", "row2", "row3", "row4")
dfA = as.data.frame(A)
row.names(dfA) = c("r1", "r2", "r3", "r4")
colnames(dfA) = c("C1", "C2", "C3")
And to save time, you can do this:
x = rep("col", dim(M)[2])
y = 1:dim(M)[2]
colnames(M) = paste(x, y, sep="")

If it is within a list, you can do:
dimnames(x)[[1]]<-paste("SS", 1:nrow(x), sep="")
dimnames(x)[[2]]<-paste("M" , 1:ncol(x), sep="")
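As a minimal sketch of the dimnames route mentioned in the question, both sets of names can be assigned in a single call, and the same assignment works on a data frame. The `r`/`c` prefixes and the name `df` are only illustrative, and `paste0()` is used as shorthand for `paste(..., sep = "")`.

```r
# Matrix with the same shape as in the question
x <- matrix(1:20000, nrow = 100)

# Set row and column names in one assignment
dimnames(x) <- list(paste0("SS", seq_len(nrow(x))),
                    paste0("M", seq_len(ncol(x))))

# The same kind of assignment works after conversion to a data frame
df <- as.data.frame(x)
dimnames(df) <- list(paste0("r", seq_len(nrow(df))),
                     paste0("c", seq_len(ncol(df))))

x[1:3, 1:3]
df[1:3, 1:3]
```

`seq_len()` is used instead of `1:nrow(x)` so the code still behaves sensibly if a dimension happens to be zero.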

Related

Populate new dataframe with rowMeans for every four columns minus other rowMeans value

I am trying to calculate my data's means by populating a new dataframe with data corrected by my experiment's blank.
So far, I have created my new data frame:
data_mean <- data.frame(matrix(ncol = 17, # As many columns as experimental conditions plus one for "Time(h)"
nrow = nrow(data)))
Copied the data corresponding to time:
data_mean[,1] <- data[,1]
And attempted to populate the dataframe by assigning the mean of every condition minus the mean of the blanks to each column:
data_mean[,2] <- rowMeans(data[,5:8])-rowMeans(data[,2:4])
data_mean[,3] <- rowMeans(data[,9:12])-rowMeans(data[,2:4])
data_mean[,4] <- rowMeans(data[,13:16])-rowMeans(data[,2:4])
data_mean[,5] <- rowMeans(data[,17:20])-rowMeans(data[,2:4])
and so on.
Is there an easier way to do this rather than typing the same code over and over?
res <- sapply(split.default(data[, -1], seq(ncol(data) - 1)%/%4), rowSums)
res[,-1] - res[,1] # Differences of row sums; use rowMeans in place of rowSums above to get the means asked for
example:
data <- data.frame(matrix(1:200, 10))
res <- sapply(split.default(data[, -1], seq(ncol(data) - 1)%/%4), rowSums)
res[,-1] - res[,1]
1 2 3 4
[1,] 161 321 481 641
[2,] 162 322 482 642
[3,] 163 323 483 643
[4,] 164 324 484 644
[5,] 165 325 485 645
[6,] 166 326 486 646
[7,] 167 327 487 647
[8,] 168 328 488 648
[9,] 169 329 489 649
[10,] 170 330 490 650
and you can check:
rowSums(data[, 5:8]) - rowSums(data[,2:4])
[1] 161 162 163 164 165 166 167 168 169 170 # first column
rowSums(data[, 9:12]) - rowSums(data[,2:4])
[1] 321 322 323 324 325 326 327 328 329 330 # second column
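The answer above works with row sums, while the question asks for means. A sketch of the same grouping idea with rowMeans might look like the following; the names `groups`, `means` and `corrected` are illustrative, and the layout (time in column 1, three blank columns, then blocks of four) is assumed from the question.

```r
# Example data with the assumed layout: 1 time column, 3 blanks, 4 blocks of 4
data <- data.frame(matrix(1:200, 10))

# Group index for columns 2..ncol: 0 for the three blanks, then 1, 2, ... per block of four
groups <- seq(ncol(data) - 1) %/% 4

# One column of row means per group
means <- sapply(split.default(data[, -1], groups), rowMeans)

# Subtract the blank mean (first group) from every condition mean
corrected <- means[, -1] - means[, 1]

# Time column followed by the blank-corrected means
data_mean <- data.frame(time = data[, 1], corrected)
```

The subtraction relies on R recycling the blank vector down each column of the matrix, which is why it lines up row by row.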

Extract each RGB layers from raster and paste into matrix in R

I would like to store a raster image into an empty matrix. I load my image and I create an empty matrix of same dimension.
setwd("C:/Users/Desktop/image/")
img_path <- "image.jpeg"
raster <- brick(img_path, package="raster")
nrow <- dim(raster)[1]
ncol <- dim(raster)[2]
img_matrix <- matrix(, nrow = nrow, ncol = ncol)
dim(raster)
[1] 896 1408 3
dim(img_matrix)
[1] 896 1408
When I try to load the raster into the matrix I got this error:
img_matrix[1, 1] <- raster
Error in img_matrix[1, 1] <- raster :
number of items to replace is not a multiple of replacement length
I think the problem is that the raster has 3 layers and the matrix has just 1. So I think I should create 3 matrices, paste each R, G, B layer of the raster into the associated matrix, and finally assemble the 3 matrices to create the image.
How to do it?
There is an as.array method for objects of class raster.
library(raster)
raster <- brick("3.jpg", package="raster")
dim(raster)
#[1] 665 800 3
array <- as.array(raster)
dim(array)
#[1] 665 800 3
array[201:205,401:405,1]
# [,1] [,2] [,3] [,4] [,5]
#[1,] 244 244 244 244 244
#[2,] 244 244 244 244 244
#[3,] 244 244 244 244 244
#[4,] 244 244 244 244 244
#[5,] 244 244 244 244 244
array[201:205,401:405,2]
# [,1] [,2] [,3] [,4] [,5]
#[1,] 202 202 202 202 202
#[2,] 202 202 202 202 202
#[3,] 202 202 202 202 202
#[4,] 202 202 202 202 202
#[5,] 202 202 202 202 202
As you can see, this will coerce the raster object into a three dimensional array.
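To get the three separate matrices the question asks about, the array can be sliced along its third dimension. The sketch below uses a small random array as a hypothetical stand-in for the result of as.array(raster), so it runs without an image file; the names `img`, `r`, `g`, `b` and `rebuilt` are illustrative.

```r
# Stand-in for as.array(raster): rows x cols x 3 layers (made-up pixel values)
img <- array(sample(0:255, 4 * 5 * 3, replace = TRUE), dim = c(4, 5, 3))

# One matrix per layer
r <- img[, , 1]
g <- img[, , 2]
b <- img[, , 3]

# Reassemble the three matrices into a single 3-D array
rebuilt <- array(c(r, g, b), dim = c(nrow(r), ncol(r), 3))

identical(rebuilt, img)
```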

Apply same function on same object, with different parameters -- the tidy way

I would like to apply iteratively the same function (in the example, my_function) to the same object (here, my_matrix) with different parameters (my_parameters) stored in a list, or something similar.
The reprex is:
library(tidyverse)
my_matrix <- replicate(10, sample(1:10, 10, replace = TRUE))
my_parameters <- tibble(value1 = sample(1:10, 5, replace = TRUE),
value2 = sample(2:3, 5, replace = TRUE))
my_function <- function(param1, param2, matrix) {
(matrix + param1) * param2
}
new_matrix <- my_matrix
for (i in seq.int(nrow(my_parameters))) {
new_matrix <- my_function(my_parameters$value1[[i]], my_parameters$value2[[i]], new_matrix)
}
new_matrix
(I guess I could compute everything in one iteration with a clever factorization, but this is just to illustrate the problem.)
Is there a tidy function, like:
magical_function(my_matrix, my_function, my_parameters)
that would do the trick?
So far, I have dug into purrr, but accumulate and similar functions do not seem to apply here.
Of note, I am aware that factorization does not (directly) apply here, and that could explain why there is no obvious answer here (as far as I know).
Here is an option using the tidyverse: reduce2 takes the matrix (original_matrix) as .init, so that it is used as the first value to start the accumulation. Note that the ..1, ..2, etc. notation is needed because more than two objects are passed into the function. Here ..1 is the accumulated matrix (original_matrix on the first step), and ..2 and ..3 are the current elements of the 'x' and 'y' parameters, in that order.
library(purrr)
reduce2(my_params$x, my_params$y, ~ (..1 + ..2) * ..3, .init = original_matrix)
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
# [1,] 656 512 704 464 368 560 560 560 704 368
# [2,] 704 752 560 416 368 752 416 464 608 752
# [3,] 512 560 560 608 416 608 704 320 752 464
# [4,] 704 656 464 752 512 656 752 416 656 416
# [5,] 416 656 320 464 416 464 416 464 368 752
# [6,] 416 416 752 608 656 320 368 320 320 608
# [7,] 464 368 608 416 416 560 464 752 320 368
# [8,] 752 416 704 512 608 368 320 464 608 560
# [9,] 368 560 368 512 464 368 608 512 368 320
#[10,] 560 368 464 560 656 416 704 704 320 464
If we want to check how the values are getting changed in each run, use the accumulate2
accumulate2(my_params$x, my_params$y, ~ (..1 + ..2) * ..3, .init = original_matrix)
data
set.seed(23131)
original_matrix <- replicate(10, sample(1:10, 10, replace = TRUE))
my_params <- tibble(x = sample(1:10, 5, replace = TRUE),
y = sample(2:3, 5, replace = TRUE))
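For comparison, the same fold can be sketched in base R with Reduce, iterating over row indices of the parameter table and carrying the matrix along as the accumulator. This is not part of the answer above; `loop_res` and `fold_res` are illustrative names, and a plain data.frame stands in for the tibble so no packages are needed.

```r
set.seed(23131)
original_matrix <- replicate(10, sample(1:10, 10, replace = TRUE))
my_params <- data.frame(x = sample(1:10, 5, replace = TRUE),
                        y = sample(2:3, 5, replace = TRUE))

# Explicit loop, as in the question
loop_res <- original_matrix
for (i in seq_len(nrow(my_params))) {
  loop_res <- (loop_res + my_params$x[i]) * my_params$y[i]
}

# Fold over the row indices; `m` is the running matrix, `i` the current row
fold_res <- Reduce(function(m, i) (m + my_params$x[i]) * my_params$y[i],
                   seq_len(nrow(my_params)), original_matrix)

identical(loop_res, fold_res)
```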
So I've been thinking about this and I'm not sure there's a 'clean' tidy way of doing it. purrr is all about functional programming, which means that each iteration is independent of the others. As you say, accumulate and reduce can't solve this (at least not without other types of checks). One way to do this would be to update your desired matrix using <<-. Below is an example that matches your results:
library(tidyverse)
set.seed(23131)
original_matrix <- replicate(10, sample(1:10, 10, replace = TRUE))
my_params <- tibble(x = sample(1:10, 5, replace = TRUE),
y = sample(2:3, 5, replace = TRUE))
# Define the matrix that will be updated in each iteration
my_matrix <- original_matrix
my_function2 <- function(param1, param2) {
# In each iteration, multiply by your params
# and update `my_matrix` in the global environment
# This ensures that in every iteration you're getting
# the up to date `my_matrix`.
my_matrix <<- (my_matrix + param1) * param2
my_matrix
}
final_res <- map2(my_params$x,
my_params$y,
my_function2)
final_res[[nrow(my_params)]]
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#> [1,] 656 512 704 464 368 560 560 560 704 368
#> [2,] 704 752 560 416 368 752 416 464 608 752
#> [3,] 512 560 560 608 416 608 704 320 752 464
#> [4,] 704 656 464 752 512 656 752 416 656 416
#> [5,] 416 656 320 464 416 464 416 464 368 752
#> [6,] 416 416 752 608 656 320 368 320 320 608
#> [7,] 464 368 608 416 416 560 464 752 320 368
#> [8,] 752 416 704 512 608 368 320 464 608 560
#> [9,] 368 560 368 512 464 368 608 512 368 320
#> [10,] 560 368 464 560 656 416 704 704 320 464

Turn extraction list to csv file

I have loaded a raster file and a polyline shapefile into R and used the extract function to extract the data from every pixel along the polyline. How do I turn the list output by extract into a CSV file?
Here is a simple self-contained reproducible example (this one is taken from ?raster::extract)
library(raster)
r <- raster(ncol=36, nrow=18, vals=1:(18*36))
cds1 <- rbind(c(-50,0), c(0,60), c(40,5), c(15,-45), c(-10,-25))
cds2 <- rbind(c(80,20), c(140,60), c(160,0), c(140,-55))
lines <- spLines(cds1, cds2)
e <- extract(r, lines)
e is a list
> e
[[1]]
[1] 126 127 161 162 163 164 196 197 200 201 231 232 237 266 267 273 274 302 310 311 338 346 381 382 414 417 450 451 452 453 487 488
[[2]]
[1] 139 140 141 174 175 177 208 209 210 213 243 244 249 250 279 286 322 358 359 394 429 430 465 501 537
and you cannot directly write this to a csv because the list elements (vectors) have different lengths.
So first make them all the same length
x <- max(sapply(e, length))
ee <- sapply(e, `length<-`, x)
Let's see
head(ee)
# [,1] [,2]
#[1,] 126 139
#[2,] 127 140
#[3,] 161 141
#[4,] 162 174
#[5,] 163 175
#[6,] 164 177
tail(ee)
# [,1] [,2]
#[27,] 450 NA
#[28,] 451 NA
#[29,] 452 NA
#[30,] 453 NA
#[31,] 487 NA
#[32,] 488 NA
And now you can write to a csv file
write.csv(ee, "test.csv", row.names=FALSE)
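If padding with NA is not wanted, a long format is another option: one row per extracted value plus a column saying which line it came from. The sketch below uses a short made-up list as a stand-in for the output of extract, and the names `line1`, `line2`, `long` and `out_file` are illustrative.

```r
# Hypothetical stand-in for the list returned by extract(): vectors of unequal length
e <- list(c(126, 127, 161), c(139, 140))
names(e) <- paste0("line", seq_along(e))

# stack() returns a data frame with columns `values` and `ind` (the list element name)
long <- stack(e)

# Write to a temporary file; replace with a real path as needed
out_file <- tempfile(fileext = ".csv")
write.csv(long, out_file, row.names = FALSE)
```

This keeps every value and avoids the trailing NA rows, at the cost of a taller file.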
If I understand what it is you're asking, I think you could resolve your situation by using unlist().
d <- c(1:10) # creates a sample integer vector to use
d <- as.list(d) # converts the vector into a list
d <- unlist(d) # flattens the list back into a vector

Calculating ratios and put them into a matrix in R

I have a table similar to this one, and want to calculate the ratio between column A and B. For example:
A B C D E F
[1,] 187 174 183 115 101 104
[2,] 451 166 177 842 101 133
[3,] 727 171 187 12803 98 134
[4,] 1532 181 196 730 98 108
[5,] 4139 188 214 20358 105 159
[6,] 689 185 211 1633 110 162
[7,] 1625 184 195 2283 109 114
[8,] 771 181 190 904 105 110
[9,] 950 177 190 1033 106 112
[10,] 703 180 191 463 106 110
[11,] 2052 178 188 2585 100 105
[12,] 1161 178 187 2874 99 110
[13,] 214 175 184 173 98 110
[14,] 473 184 191 971 104 111
[15,] 756 185 193 14743 107 114
I want to create a new matrix that has all of those previous rows as new rows and columns (15 rows and 15 columns) like so (values in parentheses are placeholders for the calculated ratios):
[,1] [,2] [,3] [,4]
[1,] (A1:B1) (A1:B2) (A1:B3) (A1:B4) ...
[2,]
[3,]
[4,]
...
That is maybe not the best example, but I hope it is not too confusing.
To calculate the ratios A1:B1, A2:B2, A3:B3 I could do something like:
data.matrix(data["A"]/data["B"])
And to do it for all, I would do something like:
data.matrix(data[1,]/data[1,1])
data.matrix(data[1,]/data[1,2])
...
and so on.
This seems to be a lot of work and maybe someone knows a quicker and more efficient method.
EDIT
I thought the combn function would work, but then I figured out it doesn't. When I have a 2 column matrix, such as:
A B
[1,] 187 115
[2,] 451 842
[3,] 727 12803
[4,] 1532 730
[5,] 4139 20358
[6,] 689 1633
[7,] 1625 2283
[8,] 771 904
[9,] 950 1033
[10,] 703 463
[11,] 2052 2585
[12,] 1161 2874
[13,] 214 173
[14,] 473 971
[15,] 756 14743
And I use the combn function to calculate all possible ratios (A1:B1, A1:B2, ... A2:B1, A2:B2...) I get just the result for A1 vs all values of B.
> combn(ncol(data), 2, function(x) data[,x[1]]/data[,x[2]])
[,1]
[1,] 1.62608696
[2,] 0.53562945
[3,] 0.05678357
[4,] 2.09863014
[5,] 0.20331074
[6,] 0.42192284
[7,] 0.71178274
[8,] 0.85287611
[9,] 0.91965150
[10,] 1.51835853
[11,] 0.79381044
[12,] 0.40396660
[13,] 1.23699422
[14,] 0.48712667
[15,] 0.05127857
Or maybe I just don't understand the combn function and I am doing something wrong here.
You can achieve what you want by using expand.grid, apply and matrix functions as below
I am assuming what you want is matrix like
A1/B1 A1/B2 A1/B3 ...
A2/B1 A2/B2 A2/B3 ...
... ... ... ...
... ... ... ...
Here is the code to do that. Explanation in comments
txt <- "A B C D E F\n187 174 183 115 101 104\n451 166 177 842 101 133\n727 171 187 12803 98 134\n1532 181 196 730 98 108\n4139 188 214 20358 105 159\n689 185 211 1633 110 162\n1625 184 195 2283 109 114\n771 181 190 904 105 110\n950 177 190 1033 106 112\n703 180 191 463 106 110\n2052 178 188 2585 100 105\n1161 178 187 2874 99 110\n214 175 184 173 98 110\n473 184 191 971 104 111\n756 185 193 14743 107 114"
data <- as.matrix(read.table(textConnection(txt), header = TRUE))
# expand.grid : creates every combination of one element from column A and one
# from column B, with the values of A varying fastest
# apply : calls function(x) x[1]/x[2] for every combination returned by
# expand.grid
# matrix : converts the result of apply into a matrix. The dimnames argument sets
# rownames and colnames for easy verification
result <- matrix(apply(expand.grid(data[, "A"], data[, "B"]), 1, function(x) x[1]/x[2]),
nrow = nrow(data), dimnames = list(data[, "A"], data[, "B"]))
# note that we have set rownames for result to be values of A and colnames for
# result to be value of B
result
## 174 166 171 181 188 185 184
## 187 1.074713 1.126506 1.093567 1.033149 0.9946809 1.010811 1.016304
## 451 2.591954 2.716867 2.637427 2.491713 2.3989362 2.437838 2.451087
## 727 4.178161 4.379518 4.251462 4.016575 3.8670213 3.929730 3.951087
## 1532 8.804598 9.228916 8.959064 8.464088 8.1489362 8.281081 8.326087
## 4139 23.787356 24.933735 24.204678 22.867403 22.0159574 22.372973 22.494565
## 689 3.959770 4.150602 4.029240 3.806630 3.6648936 3.724324 3.744565
## 1625 9.339080 9.789157 9.502924 8.977901 8.6436170 8.783784 8.831522
## 771 4.431034 4.644578 4.508772 4.259669 4.1010638 4.167568 4.190217
## 950 5.459770 5.722892 5.555556 5.248619 5.0531915 5.135135 5.163043
## 703 4.040230 4.234940 4.111111 3.883978 3.7393617 3.800000 3.820652
## 2052 11.793103 12.361446 12.000000 11.337017 10.9148936 11.091892 11.152174
## 1161 6.672414 6.993976 6.789474 6.414365 6.1755319 6.275676 6.309783
## 214 1.229885 1.289157 1.251462 1.182320 1.1382979 1.156757 1.163043
## 473 2.718391 2.849398 2.766082 2.613260 2.5159574 2.556757 2.570652
## 756 4.344828 4.554217 4.421053 4.176796 4.0212766 4.086486 4.108696
## 181 177 180 178 178 175 184
## 187 1.033149 1.056497 1.038889 1.050562 1.050562 1.068571 1.016304
## 451 2.491713 2.548023 2.505556 2.533708 2.533708 2.577143 2.451087
## 727 4.016575 4.107345 4.038889 4.084270 4.084270 4.154286 3.951087
## 1532 8.464088 8.655367 8.511111 8.606742 8.606742 8.754286 8.326087
## 4139 22.867403 23.384181 22.994444 23.252809 23.252809 23.651429 22.494565
## 689 3.806630 3.892655 3.827778 3.870787 3.870787 3.937143 3.744565
## 1625 8.977901 9.180791 9.027778 9.129213 9.129213 9.285714 8.831522
## 771 4.259669 4.355932 4.283333 4.331461 4.331461 4.405714 4.190217
## 950 5.248619 5.367232 5.277778 5.337079 5.337079 5.428571 5.163043
## 703 3.883978 3.971751 3.905556 3.949438 3.949438 4.017143 3.820652
## 2052 11.337017 11.593220 11.400000 11.528090 11.528090 11.725714 11.152174
## 1161 6.414365 6.559322 6.450000 6.522472 6.522472 6.634286 6.309783
## 214 1.182320 1.209040 1.188889 1.202247 1.202247 1.222857 1.163043
## 473 2.613260 2.672316 2.627778 2.657303 2.657303 2.702857 2.570652
## 756 4.176796 4.271186 4.200000 4.247191 4.247191 4.320000 4.108696
## 185
## 187 1.010811
## 451 2.437838
## 727 3.929730
## 1532 8.281081
## 4139 22.372973
## 689 3.724324
## 1625 8.783784
## 771 4.167568
## 950 5.135135
## 703 3.800000
## 2052 11.091892
## 1161 6.275676
## 214 1.156757
## 473 2.556757
## 756 4.086486
Edit: I seem to have misunderstood the question. The answer is even simpler using outer:
# gives the same 15*15 matrix as geektrader's
outer(mm[,1], mm[,2], '/')
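As a small self-contained sketch of the outer approach, using the first three values of columns A and B from the question (the names `A`, `B` and `ratios` are illustrative): element [i, j] of the result is A[i] / B[j].

```r
# First three values of columns A and B from the question's table
A <- c(187, 451, 727)
B <- c(174, 166, 171)

# Every ratio A[i] / B[j]; rows follow A, columns follow B
ratios <- outer(A, B, "/")
dimnames(ratios) <- list(A, B)

ratios
```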
Old answer (not correct):
You should use combn:
# combn(ncol(mm), 2) gives you all possible combinations
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15]
# [1,] 1 1 1 1 1 2 2 2 2 3 3 3 4 4 5
# [2,] 2 3 4 5 6 3 4 5 6 4 5 6 5 6 6
# it also accepts a function argument. we can use it to divide
# respective columns
mm.div <- combn(ncol(mm), 2, function(x) mm[,x[1]]/mm[,x[2]])
# set column names of the matrix
colnames(mm.div) <- combn(colnames(mm), 2, paste, collapse="")
I might be completely missing the point here, but why not just use a couple of for loops? I wrote a quick function that you can pass the pairs to.
For example:
A <- rnorm(15)
B <- rnorm(15)
data <- data.frame(A,B)
ratio <- function(input1, input2){
# rows follow input1, columns follow input2
out <- matrix(0, nrow=length(input1), ncol=length(input2))
for (i in seq_along(input1)){
for (j in seq_along(input2)){
out[i, j] <- input1[i] / input2[j]
}
}
return(out)
}
ratio(data$A, data$B)
EDIT
Another thought. To then use the function to do all possible pairs of ratios, you could simply add another for loop, like this:
combs <- combn(ncol(data), 2) # one column per pair of column indices
out <- list()
for (i in seq_len(ncol(combs))){
out[[i]] <- ratio(data[,combs[1,i]], data[,combs[2,i]])
}
Hope that helps!
