I have a matrix with two columns and a number (n) of rows, where each row represents a point with coordinates x and y (the two columns).
This is what it looks like (LINK):
V1 V2
146 17
151 19
153 24
156 30
158 36
163 39
168 42
173 44
...
Now I would like to take a subset of three consecutive points, starting from row 1, do some fitting, save the values from this fit in another list, and then go on to the next three points, and the next three, ... until the list is finished. Something like this:
Data_Fit_Kasa_1 <- CircleFitByKasa(Data[1:3,])
Data_Fit_Kasa_2 <- CircleFitByKasa(Data[4:6,])
....
Data_Fit_Kasa_n <- CircleFitByKasa(Data[i:(i+2),])
I have tried to construct a loop, but I can't make it work. R either complains about an "unexpected '}'" or tells me that the "subscript is out of bounds". This is what I have tried:
Minimal runnable code:
install.packages("conicfit")
library(conicfit)
CFKasa <- NULL
Data.Fit <- NULL
for (i in 1:length(Data)) {
row <- Data[i:(i+2),]
CFKasa <- CircleFitByKasa(row)
Data.Fit[i] <- CFKasa[3]
}
RStudio Version 0.99.902 – © 2009-2016 RStudio, Inc.; Win10 Edu.
The third element of the fitted circle (CFKasa[3]) represents the radius, which is what I am really interested in. I am really stuck here, please help.
Many thanks in advance!
Best, David
Turn your data into a 3D array and use apply:
DF <- read.table(text = "V1 V2
146 17
151 19
153 24
156 30
158 36
163 39", header = TRUE)
a <- t(DF)
dim(a) <- c(nrow(a), 3, ncol(a) / 3)
a <- aperm(a, c(2, 1, 3))
# , , 1
#
# [,1] [,2]
# [1,] 146 17
# [2,] 151 19
# [3,] 153 24
#
# , , 2
#
# [,1] [,2]
# [1,] 156 30
# [2,] 158 36
# [3,] 163 39
center <- function(m) c(mean(m[,1]), mean(m[,2]))
t(apply(a, 3, center))
# [,1] [,2]
#[1,] 150 20
#[2,] 159 35
center(DF[1:3,])
#[1] 150 20
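To connect this back to the question: once the full Data matrix has been reshaped into such a 3D array (its number of rows must be a multiple of 3), the circle fits can be collected the same way. A sketch, relying on the question's statement that the third element returned by CircleFitByKasa() is the radius:
library(conicfit)
# one radius per non-overlapping block of three points
radii <- apply(a, 3, function(m) CircleFitByKasa(m)[3])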
I have a set of numbers, which I want to expand into a matrix.
There are 4 values in the list which I want to expand into a 4x4 matrix.
Here is some example data
freq <- c(627,449,813,111)
I want to expand this into a matrix so that it looks like this.
Apologies, I have just copied and pasted the data, so it's not R output, but I hope it gets the idea across.
          1    2    3    4  Total
1       197  141  255   35    627
2       141  101  183   25    449
3       255  183  330   45    813
4        35   25   45    6    111
Total   627  449  813  111   2000
Each cell is (row total) x (column total) / (table total), rounded. For example, the value in cell (1,1) = (627 x 627)/2000 ≈ 197, the value in cell (2,1) = (627 x 449)/2000 ≈ 141, and so on.
Is there a function that will create this matrix? I could do it with a loop, but I was hoping there is a function or a matrix-calculation trick that does this more efficiently. Apologies if I didn't articulate the above too well; any help is greatly appreciated. Thanks
freq <- c(627,449,813,111)
round(outer(freq, freq)/sum(freq))
#> [,1] [,2] [,3] [,4]
#> [1,] 197 141 255 35
#> [2,] 141 101 183 25
#> [3,] 255 183 330 45
#> [4,] 35 25 45 6
It doesn't really matter here, but it is good practice to avoid constructions like outer(x, x) / sum(x) in favour of ones like tcrossprod(x / sqrt(sum(x))):
round(tcrossprod(freq / sqrt(sum(freq))))
## [,1] [,2] [,3] [,4]
## [1,] 197 141 255 35
## [2,] 141 101 183 25
## [3,] 255 183 330 45
## [4,] 35 25 45 6
There are a few issues with the outer approach:
outer(x, x) evaluates tcrossprod(as.vector(x), as.vector(x)) internally. The as.vector calls and everything else that happens inside of outer are completely redundant if x is already a vector. The as.vector calls are actually worse than redundant: if x has any attributes, then as.vector(x) requires a deep copy of x.
Naively doing A <- outer(x, x); A / sum(x) requires R to allocate memory for two n-by-n matrices. For large enough n, that can be quite wasteful, if not impossible. R is clever enough to avoid the second allocation if you compute outer(x, x) / sum(x) directly. However, such optimizations are low level, come with a number of gotchas, and are not even documented in ?Arithmetic, so it can be unsafe to rely on them.
outer(x, x) can result in underflow or overflow if the elements of x are very (very) small or large.
tcrossprod(x / sqrt(sum(x))) avoids all of these issues by scaling x before computing an outer product and cutting out all of the redundancies of outer.
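As a quick sanity check (a sketch), both forms give the same matrix for the example data, up to floating-point error:
all.equal(outer(freq, freq) / sum(freq),
          tcrossprod(freq / sqrt(sum(freq))))
#> [1] TRUE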
I have generated a random matrix d and then done some matrix operations.
Finally, I need to store the result in a vector B. The code is below:
set.seed(42)
n <- 3
m <- 4
d <- matrix(sample(0:255, n*m, replace=T), nrow = n, ncol = m)
# some matrix operation
B <-c(d[1,], d[2,], d[3,])
> d
[,1] [,2] [,3] [,4]
[1,] 234 212 188 180
[2,] 239 164 34 117
[3,] 73 132 168 184
> B
[1] 234 212 188 180 239 164 34 117 73 132 168 184
Could someone please explain how to rewrite the last line as a function, so that the n row vectors are combined into one vector?
I have tried
B <- sapply(1:n, FUN=function(i) B<-c(d[i,]))
Thanks!
This function should do it (overkill, since c(t(d)) as suggested by @joran works fine):
vectorizeByRow <- function(IN) {
  OUT <- rep(NA_real_, length(IN))  # pre-allocate the output vector
  nc <- ncol(IN)
  nr <- nrow(IN)
  a <- seq(1, length(IN), nc)       # start position of each row within OUT
  b <- a + nc - 1                   # end position of each row within OUT
  for (n in 1:length(a)) {
    OUT[a[n]:b[n]] <- IN[n, ]       # copy row n into its slot
  }
  OUT
}
Use:
vectorizeByRow(d)
Produces:
[1] 234 212 188 180 239 164 34 117 73 132
[11] 168 184
This is from the HandyStuff package. Disclaimer: I am the author.
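For comparison, the one-liner mentioned above gives the same vector:
c(t(d))
# [1] 234 212 188 180 239 164 34 117 73 132 168 184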
I stumbled upon the following problem. I have read the reshape manual, but I am still lost.
Is there an efficient and more elegant way to reshape the matrix into even chunks?
The code to generate the matrix and the reshaped matrix is below.
# current matrix
x <- matrix(sample(20*9), 20, 9)
colnames(x) <- c(paste("time",c(1:3),sep="_"),
paste("SGNL", 1, c(1:3), sep="_"),
paste("SGNL", 2, c(1:3), sep="_"))
# reshaped matrix
x.reshaped <- rbind( x[,c(1,4,7)], x[,c(2,5,8)], x[,c(3,6,9)] )
colnames(x.reshaped) <- sub("\\_1$", "", colnames(x.reshaped))
Thanks!
If you want to use an approach that is name-based and not position-based, then you should look at melt from "data.table":
library(data.table)
melt(as.data.table(x), measure.vars = patterns("time", "SGNL_1", "SGNL_2"))
Example output:
head(melt(as.data.table(x), measure.vars = patterns("time", "SGNL_1", "SGNL_2")))
# variable value1 value2 value3
# 1: 1 48 110 155
# 2: 1 67 35 140
# 3: 1 102 55 72
# 4: 1 161 39 66
# 5: 1 36 137 99
# 6: 1 158 169 85
Or, in base R:
patts <- c("time", "SGNL_1", "SGNL_2")
sapply(patts, function(y) c(x[, grep(y, colnames(x))]))
# time SGNL_1 SGNL_2
# [1,] 48 110 155
# [2,] 67 35 140
# [3,] 102 55 72
# [4,] 161 39 66
# [5,] 36 137 99
# .
# .
# .
# .
# [56,] 13 1 84
# [57,] 40 46 95
# [58,] 152 7 178
# [59,] 81 79 123
# [60,] 50 101 146
Data generated with set.seed(1).
We could create the subsets of the matrix (based on the indices generated by seq) in a list and then rbind them together.
do.call(rbind, lapply(1:3, function(i) x[,seq(i, length.out=3, by=3)]))
Or using a for loop
m2 <- c()
for(i in 1:3) { m2 <- rbind(m2, x[,seq(i, length.out=3, by=3)])}
x[,c(matrix(1:9, 3, byrow=TRUE))] # or shorter:
x[,matrix(1:9, 3, byrow=TRUE)]
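A quick check (a sketch) that the lapply approach reproduces the reshaped matrix from the question, up to column names:
res <- do.call(rbind, lapply(1:3, function(i) x[, seq(i, length.out = 3, by = 3)]))
all.equal(unname(res), unname(x.reshaped))
# [1] TRUE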
I am trying to apply a function to each row or column of a matrix, but I need to pass a different argument value for each row.
I thought I was familiar with lapply, mapply, etc., but apparently not enough.
As a simple example:
> a<-matrix(1:100,ncol=10);
> a
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 1 11 21 31 41 51 61 71 81 91
[2,] 2 12 22 32 42 52 62 72 82 92
[3,] 3 13 23 33 43 53 63 73 83 93
[4,] 4 14 24 34 44 54 64 74 84 94
[5,] 5 15 25 35 45 55 65 75 85 95
[6,] 6 16 26 36 46 56 66 76 86 96
[7,] 7 17 27 37 47 57 67 77 87 97
[8,] 8 18 28 38 48 58 68 78 88 98
[9,] 9 19 29 39 49 59 69 79 89 99
[10,] 10 20 30 40 50 60 70 80 90 100
Let's say I want to apply a function to each row; I would do:
apply(a, 1, myFunction);
However, my function takes an argument, so:
apply(a, 1, myFunction, myArgument);
But if I want my argument to take a different value for each row, I cannot find the right way to do it.
If I define a 'myArgument' with multiple values, the whole vector will obviously be passed to each call of 'myFunction'.
I think that I would need a kind of hybrid between apply and the multivariate mapply. Does that make sense?
One 'dirty' way to achieve my goal is to split the matrix by rows (or columns), use mapply on the resulting list and merge the result back into a matrix:
do.call(rbind, Map(myFunction, split(a,row(a)), as.list(myArgument)));
I had a look at sweep, aggregate and all the *apply variations, but I couldn't find the perfect match for my need. Did I miss it?
Thank you for your help.
You can use sweep to do that.
a <- matrix(rnorm(100),10)
rmeans <- rowMeans(a)
a_new <- sweep(a,1,rmeans,`-`)
rowMeans(a_new)
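Note that sweep only covers cases where myFunction can be written as a vectorized binary operation between each row and its per-row value. For example, reusing the rmeans vector from above to divide instead of subtract (a sketch):
a_scaled <- sweep(a, 1, rmeans, `/`)  # row i of a is divided by rmeans[i]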
I don't think there are any great answers, but you can somewhat simplify your solution by using mapply, which handles the "rbind" part for you, assuming your function always returns a vector of the same length (also, Map is really just a wrapper around mapply):
a <- matrix(1:80,ncol=8)
myFun <- function(x, y) (x - mean(x)) * y
myArg <- 1:nrow(a)
t(mapply(myFun, split(a, row(a)), myArg))
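A quick check (a sketch) that this matches the Map/do.call approach from the question, using the same myFun and myArg:
all.equal(t(mapply(myFun, split(a, row(a)), myArg)),
          do.call(rbind, Map(myFun, split(a, row(a)), as.list(myArg))))
# [1] TRUE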
I know the topic is quite old, but I had the same issue and solved it this way:
# Original matrix
a <- matrix(runif(n=100), ncol=5)
# Different value for each row
v <- runif(n=nrow(a))
# Result matrix -> Add a column with the row number
o <- cbind(1:nrow(a), a)
fun <- function(x, v) {
  idx <- 2:length(x)  # positions of the actual data (column 1 holds the row number)
  i <- x[1]           # row number that was prepended with cbind() above
  r <- x[idx] / v[i]  # divide this row by its matching value from v
  return(r)
}
o <- t(apply(o, 1, fun, v = v))
By adding a column with the row numbers to the left of the original matrix, the function can read the index of the value it needs from the argument vector out of the first element of each row it receives.
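For this particular example the whole construction amounts to dividing each row of a by the corresponding element of v, so a quick sanity check (a sketch):
all.equal(o, a / v)  # TRUE: recycling divides row i of a by v[i]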
What's the most efficient way to create a moving average or rolling sum in R? How do you do the rolling function along with a "group by"?
While zoo is great, sometimes there are simpler ways. If your data behaves nicely and is evenly spaced, the embed() function effectively lets you create multiple lagged versions of a time series. If you look inside the vars package for vector autoregression, you will see that the package author chose this route.
For example, to calculate the 3-period rolling average of x, where x = (1:20)^2:
> x <- (1:20)^2
> embed (x, 3)
[,1] [,2] [,3]
[1,] 9 4 1
[2,] 16 9 4
[3,] 25 16 9
[4,] 36 25 16
[5,] 49 36 25
[6,] 64 49 36
[7,] 81 64 49
[8,] 100 81 64
[9,] 121 100 81
[10,] 144 121 100
[11,] 169 144 121
[12,] 196 169 144
[13,] 225 196 169
[14,] 256 225 196
[15,] 289 256 225
[16,] 324 289 256
[17,] 361 324 289
[18,] 400 361 324
> apply (embed (x, 3), 1, mean)
[1] 4.666667 9.666667 16.666667 25.666667 36.666667 49.666667
[7] 64.666667 81.666667 100.666667 121.666667 144.666667 169.666667
[13] 196.666667 225.666667 256.666667 289.666667 324.666667 361.666667
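Note that embed(x, 3) drops the first two observations, so the result is two elements shorter than x. If you want the rolling mean aligned with the original series, you could pad it (a sketch):
c(NA, NA, apply(embed(x, 3), 1, mean))  # right-aligned 3-period mean, same length as x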
I scratched up a good answer from Achim Zeileis over on the R-help list. Here's what he said:
library(zoo)
## create data
x <- rnorm(365)
## transform to regular zoo series with "Date" index
x <- zooreg(x, start = as.Date("2004-01-01"))
plot(x)
## add rolling/running/moving average with window size 7
lines(rollmean(x, 7), col = 2, lwd = 2)
## if you don't want the rolling mean but rather a weekly
## time series of means you can do
nextfri <- function(x) 7 * ceiling(as.numeric(x - 1)/7) + as.Date(1)
xw <- aggregate(x, nextfri, mean)
## nextfri is a function which computes for a certain "Date"
## the next friday. xw is then the weekly series.
lines(xw, col = 4)
Achim went on to say:
Note that the difference between the rolling mean and the aggregated series is due to different alignments. This can be changed by changing the 'align' argument in rollmean() or the nextfri() function in the aggregate call.
All this came from Achim, not from me:
http://tolstoy.newcastle.edu.au/R/help/05/06/6785.html
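Neither snippet above addresses the "group by" part of the question. One hedged sketch, assuming a data frame df with a grouping column g and a value column x, combines base R's ave() with zoo's right-aligned rollmeanr() (fill = NA keeps each group at its original length):
library(zoo)
df <- data.frame(g = rep(c("a", "b"), each = 10), x = rnorm(20))
# 3-period rolling mean computed separately within each group
df$roll <- ave(df$x, df$g, FUN = function(v) rollmeanr(v, 3, fill = NA))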