Expanding a list of numbers into a matrix (list with n values to multiply to a n x n matrix) - r

I have a set of numbers, which I want to expand into a matrix.
There are 4 values in the list which I want to expand into a 4x4 matrix.
Here is some example data
freq <- c(627,449,813,111)
I want to expand this into a matrix so that it looks like this.
Apologies I have just copied and pasted data, thus it's not an R output, but hope it helps to get the idea across.
        1    2    3    4  Total
1     197  141  255   35    627
2     141  101  183   25    449
3     255  183  330   45    813
4      35   25   45    6    111
Total 627  449  813  111   2000
The cells are multiplication of the (row total)x(column total)/(table total). The value in 1,1 = (627 x 627)/2000 = 197. The value in 2,1 = (627 x 449)/2000 = 141, and so on.
Is there a function that will create this matrix? I will try to do it via a loop but was hoping there is a function or matrix calculation trick that can do this more efficiently? Apologies if I didn't articulate the above too well, any help is greatly appreciated. Thanks

freq <- c(627,449,813,111)
round(outer(freq, freq)/sum(freq))
#>      [,1] [,2] [,3] [,4]
#> [1,]  197  141  255   35
#> [2,]  141  101  183   25
#> [3,]  255  183  330   45
#> [4,]   35   25   45    6
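To reproduce the table from the question including its margins, one option is to add the totals before rounding, because the rounded cells no longer sum exactly to the original frequencies (197 + 141 + 255 + 35 = 628, not 627). This is a minimal sketch; the names `expected` and `tab` are chosen here for illustration only.

```r
freq <- c(627, 449, 813, 111)

# Unrounded cells: (row total) x (column total) / (table total)
expected <- outer(freq, freq) / sum(freq)
dimnames(expected) <- list(1:4, 1:4)

# Append the margins first, then round, so the totals match freq
tab <- rbind(cbind(expected, Total = rowSums(expected)),
             Total = c(colSums(expected), sum(expected)))
round(tab)
```

The row and column totals of the unrounded matrix equal `freq` up to floating-point error, which is why rounding is left to the very end.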

It doesn't really matter here, but it is good practice to avoid constructions like outer(x, x) / sum(x) in favour of ones like tcrossprod(x / sqrt(sum(x))):
round(tcrossprod(freq / sqrt(sum(freq))))
##      [,1] [,2] [,3] [,4]
## [1,]  197  141  255   35
## [2,]  141  101  183   25
## [3,]  255  183  330   45
## [4,]   35   25   45    6
There are a few issues with the outer approach:
1. outer(x, x) evaluates tcrossprod(as.vector(x), as.vector(x)) internally. The as.vector calls and everything else that happens inside of outer are completely redundant if x is already a vector. The as.vector calls are actually worse than redundant: if x has any attributes, then as.vector(x) requires a deep copy of x.
2. Naively doing A <- outer(x, x); A / sum(x) requires R to allocate memory for two n-by-n matrices. For large enough n, that can be quite wasteful, if not impossible. R is clever enough to avoid the second allocation if you compute outer(x, x) / sum(x) directly. However, such optimizations are low level, come with a number of gotchas, and are not even documented in ?Arithmetic, so it can be unsafe to rely on them.
3. outer(x, x) can result in underflow or overflow if the elements of x are very (very) small or large.
tcrossprod(x / sqrt(sum(x))) avoids all of these issues by scaling x before computing an outer product and cutting out all of the redundancies of outer.
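
The overflow point can be checked directly. The sketch below first confirms that both forms agree on the question's data, then uses a contrived vector of very large values (an assumption made purely for illustration) where the unscaled product exceeds the largest representable double.

```r
freq <- c(627, 449, 813, 111)

# For ordinary inputs the two forms agree up to floating-point error
all.equal(outer(freq, freq) / sum(freq),
          tcrossprod(freq / sqrt(sum(freq))))

# Contrived large values: x[i] * x[j] is about 1e400, beyond the
# largest double (about 1.8e308), so outer() overflows to Inf
x <- c(1e200, 1e200)
a <- outer(x, x) / sum(x)          # Inf divided by 2e200 is still Inf
b <- tcrossprod(x / sqrt(sum(x)))  # scaled first, so it stays finite

all(is.infinite(a))
all(is.finite(b))
```

Note that the scaled form relies on sum(x) being positive, which holds for frequency counts like those in the question.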

Related

Adding the subsequent numbers of list containing random numbers, to the subsequent indices

I have a list with some random numbers. I want to add the two following numbers for each random number and add them to the subsequent indices in the list, without using a for loop.
So, let's say I have this list: v <- c(238,1002,569,432,6,1284)
Then the output I want is:
v <- c(238,239,240,1002,1003,1004,569,570,571,432,433,434,6,7,8,1284,1285,1286)
I am still pretty new to R, so I don't really know what I'm doing, but I've tried for hours now with no results. I have, though, made it work using a for loop, but I know R isn't too happy with loops, so I really need to vectorize it somehow.
Does anybody know how I can implement this into my r code in an efficient manner?
You can just use outer to calculate the outer sum:
res <- outer(0:2, v, "+")
#     [,1] [,2] [,3] [,4] [,5] [,6]
#[1,]  238 1002  569  432    6 1284
#[2,]  239 1003  570  433    7 1285
#[3,]  240 1004  571  434    8 1286
You can then turn the resulting matrix into a vector:
res <- as.vector(res)
#[1] 238 239 240 1002 1003 1004 569 570 571 432 433 434 6 7 8 1284 1285 1286
Note that matrices are "column-major" in R.
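
Column-major means as.vector() reads the matrix down each column in turn, which is why the offsets 0:2 go in the rows. As a sketch of an alternative that skips the intermediate matrix, rep() with each = 3 can repeat every value and let the offsets recycle (the names `via_outer` and `via_rep` are illustrative):

```r
v <- c(238, 1002, 569, 432, 6, 1284)

# Matrix route: one column per element of v, read off column by column
via_outer <- as.vector(outer(0:2, v, "+"))

# Recycling route: repeat each value three times, then add 0, 1, 2 cyclically
via_rep <- rep(v, each = 3) + 0:2

identical(via_outer, via_rep)
```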

How to automatically multiply and add some coefficient to a data frame in R?

I have this data set
obs <- data.frame(replicate(8,rnorm(10, 0, 1)))
and these coefficients
coeff <- data.frame(replicate(8,rnorm(2, 0, 1)))
For each column of obs, I need to multiply by the first element of the corresponding column of coeff and then add the second element of that column. I need to do the same for all 8 columns. I read somewhere that if you copy and paste code more than once, you are doing something wrong... and that's exactly what I did.
obs.transformed.X1 <-(obs[1]*coeff[1,1])+coeff[2,1]
obs.transformed.X2 <-(obs[2]*coeff[1,2])+coeff[2,2]
.
.
.
.
.
obs.transformed.X8 <-(obs[8]*coeff[1,8])+coeff[2,8]
I know there is a smarter way to do this (loop?), but I just couldn't figure it out. Any help will be appreciated.
This is what I've tried but I am only getting the last column
for (i in 1:length(obs)) {
results=(obs[i]*coeff[1,i])+coeff[2,i]
}
If you coerce to matrix class, you can use the sweep function twice in sequence: first multiplying the columns by the first row of coeff, then adding the second row, again column-wise:
obs <- data.frame(matrix(1:60, 10)) # I find checking with random numbers difficult
coeff <- data.frame(matrix(1:12,2))
sweep(
sweep(as.matrix(obs), 2, as.matrix(coeff)[1,], "*"), # first operation is "*"
2, as.matrix(coeff)[2,], "+" ) # arguments for the addition
#--------------------------------
      X1 X2  X3  X4  X5  X6
 [1,]  3 37 111 225 379 573
 [2,]  4 40 116 232 388 584
 [3,]  5 43 121 239 397 595
 [4,]  6 46 126 246 406 606
 [5,]  7 49 131 253 415 617
 [6,]  8 52 136 260 424 628
 [7,]  9 55 141 267 433 639
 [8,] 10 58 146 274 442 650
 [9,] 11 61 151 281 451 661
[10,] 12 64 156 288 460 672
I decreased the number of columns because your original code was too wide for my RStudio console, but this should be very general. I suspect there's an equivalent matrix operator method, but it didn't come to me.
I came up with this solution..
results = list()
for (i in 1:length(obs)) {
results[[i]]=(obs[i]*coeff[1,i])+coeff[2,i]
}
results <- as.data.frame(results)
Is there any efficient way to do this?
I used Map
results <- as.data.frame(Map(`+`, Map(`*`, obs, coeff[1,]), coeff[2,]))
This should also give what you are looking for.
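
One candidate for the matrix-operator method suspected above is plain arithmetic on the transposed matrix, relying on R's vector recycling. This is a sketch only, using the deterministic 6-column example data from the sweep answer; `slope` and `intercept` are names introduced here for illustration.

```r
obs   <- data.frame(matrix(1:60, 10))
coeff <- data.frame(matrix(1:12, 2))

slope     <- unlist(coeff[1, ], use.names = FALSE)  # first row of coeff
intercept <- unlist(coeff[2, ], use.names = FALSE)  # second row of coeff

# After t(), each row is one variable, so the length-6 vectors recycle
# down the columns and line up with the variables
res <- t(t(as.matrix(obs)) * slope + intercept)
head(res, 2)
```

The double transpose is needed because recycling runs down columns; without it the coefficients would be matched to rows rather than to variables.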

What does sapply do for given function

I am still learning R. Kindly, I'd like to understand this function:
sapply(M[,-1], function(x) x^2)
Where M is a matrix. It looks like it is squaring every element in M. Can someone provide a brief example of how this line functions?
Thank you
The apply family of functions in R comes in several variants, depending on the use case.
1. When you want to apply a function to the rows or columns of a matrix, use apply().
2. When you want to apply a function to each element of a list in turn and get a list back, use lapply().
3. When you want to apply a function to each element of a list in turn but get a vector back rather than a list, use sapply().
In your case, yes: sapply() treats the matrix M[,-1] as a plain vector of its elements, so it squares every value except those in the first column and returns the results as a vector. See below:
M <- matrix(seq(10,25), 4, 4) # example 4 by 4 matrix
[,1] [,2] [,3] [,4]
[1,] 10 14 18 22
[2,] 11 15 19 23
[3,] 12 16 20 24
[4,] 13 17 21 25
M[,-1]
[,1] [,2] [,3]
[1,] 14 18 22
[2,] 15 19 23
[3,] 16 20 24
[4,] 17 21 25
sapply(M[,-1], function(x) x^2)
[1] 196 225 256 289 324 361 400 441 484 529 576 625
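
Since `^` is already vectorized in R, the sapply() call is not strictly needed here. The sketch below compares it with direct arithmetic, which keeps the matrix shape, and with apply(), which works column by column (the names `s`, `sq`, and `col_sums` are illustrative):

```r
M <- matrix(seq(10, 25), 4, 4)

# sapply() visits the 12 elements one at a time and returns a plain vector
s <- sapply(M[, -1], function(x) x^2)

# Direct arithmetic gives the same values and keeps the 4 x 3 shape
sq <- M[, -1]^2

# apply() with MARGIN = 2 calls the function once per column
col_sums <- apply(M[, -1], 2, function(col) sum(col^2))

dim(sq)
all.equal(as.vector(sq), s)
```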

Loop over matrix using n consecutive rows in R

I have a matrix that consists of two columns and a number (n) of rows, while each row represents a point with the coordinates x and y (the two columns).
This is what it looks like:
V1 V2
146 17
151 19
153 24
156 30
158 36
163 39
168 42
173 44
...
Now, I would like to use a subset of three consecutive points starting from 1 to do some fitting, save the values from this fit in another list, and then go on to the next three points, and the next three, ... until the list is finished. Something like this:
Data_Fit_Kasa_1 <- CircleFitByKasa(Data[1:3,])
Data_Fit_Kasa_2 <- CircleFitByKasa(Data[4:6,])
....
Data_Fit_Kasa_n <- CircleFitByKasa(Data[i:(i+2),])
I have tried to construct a loop, but I can't make it work. R either tells me that there's an "unexpected '}' in "}"" or that the "subscript is out of bounds". This is what I've tried:
minimal runnable code
install.packages("conicfit")
library(conicfit)
CFKasa <- NULL
Data.Fit <- NULL
for (i in 1:length(Data)) {
row <- Data[i:(i+2),]
CFKasa <- CircleFitByKasa(row)
Data.Fit[i] <- CFKasa[3]
}
RStudio Version 0.99.902 – © 2009-2016 RStudio, Inc.; Win10 Edu.
The third element of the fitted circle (CFKasa[3]) represents the radius, which is what I am really interested in. I am really stuck here, please help.
Many thanks in advance!
Best, David
Turn your data into a 3D array and use apply:
DF <- read.table(text = "V1 V2
146 17
151 19
153 24
156 30
158 36
163 39", header = TRUE)
a <- t(DF)
dim(a) <-c(nrow(a), 3, ncol(a) / 3)
a <- aperm(a, c(2, 1, 3))
# , , 1
#
# [,1] [,2]
# [1,] 146 17
# [2,] 151 19
# [3,] 153 24
#
# , , 2
#
# [,1] [,2]
# [1,] 156 30
# [2,] 158 36
# [3,] 163 39
center <- function(m) c(mean(m[,1]), mean(m[,2]))
t(apply(a, 3, center))
# [,1] [,2]
#[1,] 150 20
#[2,] 159 35
center(DF[1:3,])
#[1] 150 20
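
If reshaping into a 3D array feels heavy, the same grouping can be sketched by splitting the row indices into blocks of three. This assumes the number of rows is a multiple of 3; `fit_fun` is a hypothetical stand-in (here the `center` function from above) so the example runs without the conicfit package.

```r
DF <- data.frame(V1 = c(146, 151, 153, 156, 158, 163),
                 V2 = c(17, 19, 24, 30, 36, 39))

# Hypothetical stand-in for the fitting function; going by the question,
# the radius would come from something like
# function(m) CircleFitByKasa(as.matrix(m))[3]
fit_fun <- function(m) c(mean(m[, 1]), mean(m[, 2]))

# Row indices 1:3, 4:6, ... as a list of non-overlapping groups
groups <- split(seq_len(nrow(DF)), ceiling(seq_len(nrow(DF)) / 3))

# One result row per group of three points
res <- t(sapply(groups, function(idx) fit_fun(DF[idx, ])))
res
```

Iterating over precomputed index groups also avoids the out-of-bounds subscripts that an `i:(i+2)` loop over every row produces near the end of the data.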

shuffle elements of a matrix's column to correlate to another column of the matrix in R

I have a matrix of human height in R like:
#the first 4 rows of 400 total
[,1] [,2]
[1,] 178 162
[2,] 186 157
[3,] 179 159
[4,] 180 157
I need to shuffle the elements of the second column, i.e. x[,2], so that cor(x[,1],x[,2])≈0.6 or anything more than 0.5, while keeping x[,1] untouched. (For now the correlation is very weak, <0.1.)
Does anybody know how to do this? Thanks in advance.
