How to multiply each column by each scalar in R?

I have the following variable Q
a = c(1,2,3,4)
b = c(45,4,3,2)
c = c(34,23,12,45)
Q = cbind(a,b,c)
I also have another variable r
r = c(10,20,30)
I would like to multiply each column of Q by the respective value in r (for example, the first column of Q multiplied by the first value in r, the second column of Q multiplied by the second value in r, and so on).
Specifically for this example, the output I am looking for is:
10 900 1020
20 80 690
30 60 360
40 40 1350
I am new to R and looking for the most optimal way to do this.

Try this:
Q %*% diag(r)
giving:
[,1] [,2] [,3]
[1,] 10 900 1020
[2,] 20 80 690
[3,] 30 60 360
[4,] 40 40 1350
or any of these:
t(t(Q) * r)
Q * r[col(Q)]
sweep(Q, 2, r, "*")
Q * rep(r, each = nrow(Q))
mapply("*", as.data.frame(Q), r)
See this answer for the same question except using division:
How to divide each row of a matrix by elements of a vector in R
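As a quick sanity check, the sketch below compares the alternatives listed above against the Q %*% diag(r) result, using the data from the question. The names expected, alternatives, same and naive_matches are introduced here only for illustration. It also shows why plain Q * r is not enough: R recycles r down the columns, not across them.

```r
# Data from the question
Q <- cbind(a = c(1, 2, 3, 4), b = c(45, 4, 3, 2), c = c(34, 23, 12, 45))
r <- c(10, 20, 30)

# Reference result: post-multiplying by diag(r) scales column j by r[j]
expected <- Q %*% diag(r)

# The alternatives from the answer, collected in a named list
alternatives <- list(
  double_transpose = t(t(Q) * r),
  col_index        = Q * r[col(Q)],
  sweep            = sweep(Q, 2, r, "*"),
  rep_each         = Q * rep(r, each = nrow(Q)),
  mapply           = mapply("*", as.data.frame(Q), r)
)

# Compare values only, because dimnames differ between the approaches
same <- vapply(alternatives,
               function(m) isTRUE(all.equal(unname(m), expected)),
               logical(1))
print(same)

# Plain Q * r recycles r down each column, so it is NOT the column-wise product
naive_matches <- isTRUE(all.equal(unname(Q * r), expected))
print(naive_matches)
```

Which variant is fastest depends on the matrix size, so it is worth benchmarking on your own data before settling on one.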

You just need a double transpose:
t(r*t(Q))
a b c
[1,] 10 900 1020
[2,] 20 80 690
[3,] 30 60 360
[4,] 40 40 1350

Related

Comparing two lists of values of different lengths

I have a long list of random numbers between 1 and 100, and I would like to count how many of them are larger than 10, 20, 30, etc.
x <- c(sample(1:100, 500, replace = T))
y <- seq(0,100, by = 10)
I am looking for this to return an output such as:
Total Count
10 7
20 13
30 17
40 28
50 42
where Count is the number of x values that are larger than Total (each y value).
So far, I have tried
Count = ifelse(x > y, 1, 0)
However, this returns a vector of binary 1/0 values, one for each of the 500 values of x.
I'd appreciate any help
This answer assumes you're looking for counts per interval, not for the cumulative count of numbers greater than each threshold.
cut + table are useful here:
table(cut(x, breaks = y))
(0,10] (10,20] (20,30] (30,40] (40,50] (50,60] (60,70] (70,80] (80,90] (90,100]
51 66 36 44 54 49 55 46 58 41
findInterval + table will give you the same result
table(findInterval(x, y, left.open = TRUE))
Data
set.seed(505)
x <- c(sample(1:100, 500, replace = T))
y <- seq(0,100, by = 10)
With base R this is one approach
rbind(Total = y, Count = rowSums(sapply(x, ">", y)))
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
Total 0 10 20 30 40 50 60 70 80 90 100
Count 500 444 381 329 279 241 198 150 104 52 0
If I understood correctly, this might work:
x <- c(sample(1:100, 500, replace = T))
y <- seq(0,100, by = 10)
is_bigger_than <- function(y){
data.frame(y, n = sum(x > y,na.rm = TRUE))
}
purrr::map_df(y,is_bigger_than)
y n
1 0 500
2 10 450
3 20 403
4 30 359
5 40 305
6 50 264
7 60 201
8 70 155
9 80 100
10 90 52
11 100 0
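The answers above compute two different things: counts per interval (cut + table) and counts above each threshold. The sketch below shows how the two relate, assuming, as in the question, that every x lies in 1..100. The seed is arbitrary and the names per_interval, from_intervals and direct are only illustrative.

```r
set.seed(505)  # arbitrary seed, only for reproducibility
x <- sample(1:100, 500, replace = TRUE)
y <- seq(0, 100, by = 10)

# Per-interval counts: (0,10], (10,20], ..., (90,100]
per_interval <- table(cut(x, breaks = y))

# The count of x > y[j] is the sum of the interval counts from interval j upwards.
# Nothing exceeds the last break (100), so a 0 is appended for it.
from_intervals <- c(rev(cumsum(rev(as.vector(per_interval)))), 0L)

# Direct count for each threshold
direct <- vapply(y, function(threshold) sum(x > threshold), integer(1))

print(data.frame(Total = y, Count = direct))
```

The reverse cumulative sum only matches the direct count because the breaks cover the whole range of x; values outside the breaks would be dropped by cut().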

Expanding a list of numbers into a matrix (list with n values to multiply to a n x n matrix)

I have a set of numbers, which I want to expand into a matrix.
There are 4 values in the list which I want to expand into a 4x4 matrix.
Here is some example data
freq <- c(627,449,813,111)
I want to expand this into a matrix so that it looks like this.
Apologies I have just copied and pasted data, thus it's not an R output, but hope it helps to get the idea across.
1 2 3 4 Total
1 197 141 255 35 627
2 141 101 183 25 449
3 255 183 330 45 813
4 35 25 45 6 111
627 449 813 111 2000
The cells are multiplication of the (row total)x(column total)/(table total). The value in 1,1 = (627 x 627)/2000 = 197. The value in 2,1 = (627 x 449)/2000 = 141, and so on.
Is there a function that will create this matrix? I will try to do it via a loop but was hoping there is a function or matrix calculation trick that can do this more efficiently? Apologies if I didn't articulate the above too well, any help is greatly appreciated. Thanks
freq <- c(627,449,813,111)
round(outer(freq, freq)/sum(freq))
#> [,1] [,2] [,3] [,4]
#> [1,] 197 141 255 35
#> [2,] 141 101 183 25
#> [3,] 255 183 330 45
#> [4,] 35 25 45 6
It doesn't really matter here, but it is good practice to avoid constructions like outer(x, x) / sum(x) in favour of ones like tcrossprod(x / sqrt(sum(x))):
round(tcrossprod(freq / sqrt(sum(freq))))
## [,1] [,2] [,3] [,4]
## [1,] 197 141 255 35
## [2,] 141 101 183 25
## [3,] 255 183 330 45
## [4,] 35 25 45 6
There are a few issues with the outer approach:
outer(x, x) evaluates tcrossprod(as.vector(x), as.vector(x)) internally. The as.vector calls and everything else that happens inside of outer are completely redundant if x is already a vector. The as.vector calls are actually worse than redundant: if x has any attributes, then as.vector(x) requires a deep copy of x.
Naively doing A <- outer(x, x); A / sum(x) requires R to allocate memory for two n-by-n matrices. For large enough n, that can be quite wasteful, if not impossible. R is clever enough to avoid the second allocation if you compute outer(x, x) / sum(x) directly. However, such optimizations are low level, come with a number of gotchas, and are not even documented in ?Arithmetic, so it can be unsafe to rely on them.
outer(x, x) can result in underflow or overflow if the elements of x are very (very) small or large.
tcrossprod(x / sqrt(sum(x))) avoids all of these issues by scaling x before computing an outer product and cutting out all of the redundancies of outer.
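The following sketch checks that the two constructions agree on the question's data and illustrates the overflow point with a deliberately extreme vector. The vector big and the other variable names are made up for this illustration.

```r
freq <- c(627, 449, 813, 111)

# Both constructions give (row total * column total) / grand total
E1 <- outer(freq, freq) / sum(freq)
E2 <- tcrossprod(freq / sqrt(sum(freq)))
print(all.equal(E1, E2))

# The margins of the expected table reproduce the original totals
print(all.equal(rowSums(E1), freq))

# Overflow illustration: the products exceed the double range before scaling
big <- c(1e200, 1e200)
big_outer  <- outer(big, big) / sum(big)          # intermediate 1e400 overflows to Inf
big_tcross <- tcrossprod(big / sqrt(sum(big)))    # scaled first, stays finite
print(big_outer)
print(big_tcross)
```

For ordinary frequency tables such as the one in the question, either form should be fine; the difference only matters at extreme magnitudes or very large n.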

Using a function and mapply in R to create new columns that sums other columns

Suppose, I have a dataframe, df, and I want to create a new column called "c" based on the addition of two existing columns, "a" and "b". I would simply run the following code:
df$c <- df$a + df$b
But I also want to do this for many other columns. So why won't my code below work?
library(dplyr)  # provides %>% and select(), used below
# Reproducible data:
martial_arts <- data.frame(gym_branch=c("downtown_a", "downtown_b", "uptown", "island"),
day_boxing=c(5,30,25,10),day_muaythai=c(34,18,20,30),
day_bjj=c(0,0,0,0),day_judo=c(10,0,5,0),
evening_boxing=c(50,45,32,40), evening_muaythai=c(50,50,45,50),
evening_bjj=c(60,60,55,40), evening_judo=c(25,15,30,0))
# Creating a list of the new column names of the columns that need to be added to the martial_arts dataframe:
pattern<-c("_boxing","_muaythai","_bjj","_judo")
d<- expand.grid(paste0("martial_arts$total",pattern))
# Creating lists of the columns that will be added to each other:
e<- names(martial_arts %>% select(day_boxing:day_judo))
f<- names(martial_arts %>% select(evening_boxing:evening_judo))
# Writing a function and using mapply:
kick_him <- function(d,e,f){d <- rowSums(martial_arts[ , c(e, f)], na.rm=T)}
mapply(kick_him,d,e,f)
Now, mapply produces the correct results in terms of the addition:
> mapply(kick_him,d,e,f)
Var1 <NA> <NA> <NA>
[1,] 55 84 60 35
[2,] 75 68 60 15
[3,] 57 65 55 35
[4,] 50 80 40 0
But it doesn't add the new columns to the martial_arts dataframe. The function in theory should do the following
martial_arts$total_boxing <- martial_arts$day_boxing + martial_arts$evening_boxing
...
...
martial_arts$total_judo <- martial_arts$day_judo + martial_arts$evening_judo
and add four new total columns to martial_arts.
So what am I doing wrong?
The assignment is wrong here: a string such as "martial_arts$total_boxing" cannot serve as an assignment target. Use the column name "total_boxing" alone, and put the assignment on the left-hand side of the Map/mapply call. As the OP already stored the 'martial_arts$' prefix in the Var1 column of 'd', we strip that prefix with sub() and then do the assignment:
kick_him <- function(e,f){rowSums(martial_arts[ , c(e, f)], na.rm=TRUE)}
martial_arts[sub(".*\\$", "", d$Var1)] <- Map(kick_him, e, f)
Check the dataset now:
> martial_arts
gym_branch day_boxing day_muaythai day_bjj day_judo evening_boxing evening_muaythai evening_bjj evening_judo total_boxing total_muaythai total_bjj total_judo
1 downtown_a 5 34 0 10 50 50 60 25 55 84 60 35
2 downtown_b 30 18 0 0 45 50 60 15 75 68 60 15
3 uptown 25 20 0 5 32 45 55 30 57 65 55 35
4 island 10 30 0 0 40 50 40 0 50 80 40 0
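As an alternative sketch that avoids Map altogether, the column names can be built with paste0() and the two blocks of columns added as whole data frames. This is not the approach from the answer above, just another way to get the same totals; the names day_cols, evening_cols and total_cols are illustrative.

```r
# Reproducible data from the question
martial_arts <- data.frame(
  gym_branch = c("downtown_a", "downtown_b", "uptown", "island"),
  day_boxing = c(5, 30, 25, 10), day_muaythai = c(34, 18, 20, 30),
  day_bjj = c(0, 0, 0, 0), day_judo = c(10, 0, 5, 0),
  evening_boxing = c(50, 45, 32, 40), evening_muaythai = c(50, 50, 45, 50),
  evening_bjj = c(60, 60, 55, 40), evening_judo = c(25, 15, 30, 0)
)

pattern <- c("_boxing", "_muaythai", "_bjj", "_judo")

# Build matching column names for each block
day_cols     <- paste0("day", pattern)
evening_cols <- paste0("evening", pattern)
total_cols   <- paste0("total", pattern)

# Adding two data frames of equal shape works element-wise,
# so all four totals are created in one assignment
martial_arts[total_cols] <- martial_arts[day_cols] + martial_arts[evening_cols]

print(martial_arts[c("gym_branch", total_cols)])
```

Unlike rowSums(..., na.rm = TRUE), the + operator propagates NA, so this variant only matches the answer's result when the columns contain no missing values.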

What does sapply do for given function

I am still learning R. Kindly, I'd like to understand this function:
sapply(M[,-1], function(x) x^2)
Where M is a matrix. It looks like it is squaring every element in M. Can someone provide a brief example of how this line functions?
Thank you
The apply family of functions in R comes in different types depending on the use case.
1. When you want to apply a function to the rows or columns of a matrix, use apply().
2. When you want to apply a function to each element of a list in turn and get a list back, use lapply().
3. When you want to apply a function to each element of a list in turn but want a vector back rather than a list, use sapply().
In your case, yes: it squares every value of the matrix except those in the first column, and returns the result as a vector. See below:
M <- matrix(seq(10,25), 4, 4) # example 4 by 4 matrix
[,1] [,2] [,3] [,4]
[1,] 10 14 18 22
[2,] 11 15 19 23
[3,] 12 16 20 24
[4,] 13 17 21 25
M[,-1]
[,1] [,2] [,3]
[1,] 14 18 22
[2,] 15 19 23
[3,] 16 20 24
[4,] 17 21 25
sapply(M[,-1], function(x) x^2)
[1] 196 225 256 289 324 361 400 441 484 529 576 625
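Because ^ is already vectorized in R, the same values can be obtained without sapply. The short sketch below contrasts the two; the variable names squared_vec and squared_mat are only for illustration. The main difference is that sapply walks over the matrix element by element and drops the dimensions, while the vectorized form keeps the matrix shape.

```r
M <- matrix(seq(10, 25), 4, 4)

# sapply treats the matrix as a plain vector of 12 elements
squared_vec <- sapply(M[, -1], function(x) x^2)

# The vectorized operator squares every element and keeps the 4 x 3 shape
squared_mat <- M[, -1]^2

print(dim(squared_vec))   # NULL: the result is a plain vector
print(dim(squared_mat))

# Same values, in column-major order
print(all.equal(squared_vec, as.vector(squared_mat)))
```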

Loop over matrix using n consecutive rows in R

I have a matrix that consists of two columns and a number (n) of rows, while each row represents a point with the coordinates x and y (the two columns).
This is what it looks like:
V1 V2
146 17
151 19
153 24
156 30
158 36
163 39
168 42
173 44
...
Now, I would like to use a subset of three consecutive points, starting from row 1, to do some fitting, save the values from this fit in another list, and then go on to the next three points, and the next three, and so on until the list is finished. Something like this:
Data_Fit_Kasa_1 <- CircleFitByKasa(Data[1:3,])
Data_Fit_Kasa_2 <- CircleFitByKasa(Data[4:6,])
....
Data_Fit_Kasa_n <- CircleFitByKasa(Data[i:(i+2),])
I have tried to construct a loop, but I can't make it work. R either tells me that there's an "unexpected '}' in "}"" or that the "subscript is out of bounds". This is what I've tried:
Minimal runnable code:
install.packages("conicfit")
library(conicfit)
CFKasa <- NULL
Data.Fit <- NULL
for (i in 1:length(Data)) {
row <- Data[i:(i+2),]
CFKasa <- CircleFitByKasa(row)
Data.Fit[i] <- CFKasa[3]
}
RStudio Version 0.99.902 – © 2009-2016 RStudio, Inc.; Win10 Edu.
The third element of the fitted circle (CFKasa[3]) represents the radius, which is what I am really interested in. I am really stuck here, please help.
Many thanks in advance!
Best, David
Turn your data into a 3D array and use apply:
DF <- read.table(text = "V1 V2
146 17
151 19
153 24
156 30
158 36
163 39", header = TRUE)
a <- t(DF)
dim(a) <-c(nrow(a), 3, ncol(a) / 3)
a <- aperm(a, c(2, 1, 3))
# , , 1
#
# [,1] [,2]
# [1,] 146 17
# [2,] 151 19
# [3,] 153 24
#
# , , 2
#
# [,1] [,2]
# [1,] 156 30
# [2,] 158 36
# [3,] 163 39
center <- function(m) c(mean(m[,1]), mean(m[,2]))
t(apply(a, 3, center))
# [,1] [,2]
#[1,] 150 20
#[2,] 159 35
center(DF[1:3,])
#[1] 150 20
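If a loop-style solution is preferred, a minimal sketch is to step through the row indices in blocks of three and stop before the window would run past the last row, which is presumably what triggers the "subscript out of bounds" error in the question. Here fit_fun is a hypothetical stand-in (column means) so that the example runs without conicfit; swapping in CircleFitByKasa and keeping element 3 should give the radius described in the question.

```r
# The eight points shown in the question
Data <- as.matrix(read.table(text = "V1 V2
146 17
151 19
153 24
156 30
158 36
163 39
168 42
173 44", header = TRUE))

# Hypothetical stand-in for conicfit::CircleFitByKasa (assumption, not the real fit)
fit_fun <- function(m) colMeans(m)

# Start rows of each complete block of three: 1, 4, ...
# An incomplete trailing block (rows 7 and 8 here) is skipped
starts <- seq(1, nrow(Data) - 2, by = 3)

# Apply the fit to rows i, i + 1, i + 2 of each block
fits <- lapply(starts, function(i) fit_fun(Data[i:(i + 2), , drop = FALSE]))

# One row of results per block
result <- do.call(rbind, fits)
print(result)
```

Using by = 1 instead of by = 3 in seq() would give overlapping (sliding) windows, if that is what is wanted.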
