I am having trouble solving a set of linear equations.
I have three 3D points (A, B, C) and I want to compute the plane through these 3 points automatically.
It is very simple to do by hand, but I can't get my code to produce the solution.
The Cartesian equation of a plane is ax+by+cz+d=0, so each point gives one equation in the unknowns a, b, c, d:
a*xA + b*yA + c*zA + d = 0 #point A
a*xB + b*yB + c*zB + d = 0 #point B
etc.
I put the coordinates in a matrix, for example with A=(0,0,1), B=(4,2,3) and C=(-3,1,0).
Solving by hand, I get this solution for the example: x+3y-5z+5=0.
To solve it in R, I wanted to use solve().
A <- c(0,0,1)
B <- c(4,2,3)
C <- c(-3,1,0)
res0 <- c(-d,-d,-d) #I don't know how to obtain d, so I tried c(0,0,0) because each equation equals 0. But I'm really not sure about this!
#' @param A vector 3x1 with the 3d coordinates of the point A
carteq <- function(A, B, C, res0) {
matrixtest0 <- matrix(c(A[1], A[2], A[3], B[1], B[2], B[3],C[1], C[2], C[3]), ncol=3) #I tried to add the 4th column for solving "d" but that doesn't work.
#checking the invertibility of my matrix
out <- tryCatch(determinant(matrixtest0)$modulus<threshold, error = function(e) e)#or out <- tryCatch(solve(X) %*% X, error = function(e) e)
abcd <- solve(matrixtest0, res0) #returns just 3 values
abcd <- qr.solve(matrixtest0, res0) #returns just 3 values
}
That's not the right method... But I don't know how to add "d" to my problem.
The return value that I need is: return(a, b, c, d)
I think that my problem is classical and easy, but I can't find a function like solve() or qr.solve() that can solve it...
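Your manual solution is actually wrong: point C does not satisfy it, since -3 + 3*1 - 5*0 + 5 = 5, not 0. The normal vector of the plane is the cross product of two edge vectors: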
A <- c(0,0,1)
B <- c(4,2,3)
C <- c(-3,1,0)
CrossProduct3D <- function(x, y, i=1:3) {
#http://stackoverflow.com/a/21736807/1412059
To3D <- function(x) head(c(x, rep(0, 3)), 3)
x <- To3D(x)
y <- To3D(y)
Index3D <- function(i) (i - 1) %% 3 + 1
return (x[Index3D(i + 1)] * y[Index3D(i + 2)] -
x[Index3D(i + 2)] * y[Index3D(i + 1)])
}
N <- CrossProduct3D(A - B, C - B)
#[1] 4 2 -10
d <- -sum(N * B)
#[1] 10
#test it:
crossprod(A, N) + d
# [,1]
#[1,] 0
crossprod(B, N) + d
# [,1]
#[1,] 0
crossprod(C, N) + d
# [,1]
#[1,] 0
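To get the a, b, c, d that the question asks for in one call, the steps above can be wrapped in a small function. This is only a minimal sketch: the name plane_from_points is hypothetical (it is not part of the original answer), and it writes the cross product out directly instead of calling CrossProduct3D. As a cross-check it also uses solve() with d fixed to 1, which is valid only under the assumption that the plane does not pass through the origin.

```r
# Hypothetical helper: coefficients of a*x + b*y + c*z + d = 0 through A, B, C
plane_from_points <- function(A, B, C) {
  u <- A - B
  v <- C - B
  # Normal vector = cross product of two edge vectors
  n <- c(u[2] * v[3] - u[3] * v[2],
         u[3] * v[1] - u[1] * v[3],
         u[1] * v[2] - u[2] * v[1])
  if (all(n == 0)) stop("A, B and C are collinear: the plane is not unique")
  c(a = n[1], b = n[2], c = n[3], d = -sum(n * B))
}

A <- c(0, 0, 1)
B <- c(4, 2, 3)
C <- c(-3, 1, 0)
p <- plane_from_points(A, B, C)
p
#   a   b   c   d
#   4   2 -10  10

# Cross-check with solve(): assume d = 1 (plane not through the origin),
# so each point gives a*x + b*y + c*z = -1
abc <- solve(rbind(A, B, C), rep(-1, 3))
all.equal(unname(abc), unname(p[1:3] / p["d"]))
# [1] TRUE
```

Dividing the result by 2 gives the simpler equivalent equation 2x + y - 5z + 5 = 0.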
Suppose I have a function with a kink. I want to find the kink point, which in this case is 0.314. I tried optim, but it does not work.
Here is an example. In general, I want to recover c. Of course, I could use brute force, but it is slow.
# function with a kink
f <- function(x, c){
(x >= 0 & x < c) * 0 + (x >= c & x <=1) * (sin(3*(x-c))) +
(x < 0 | x > 1) * 100
}
# plot
x_vec <- seq(0, 1, .01)
plot(x_vec, f(x_vec, c = pi/10), "l")
# does not work
optim(.4, f, c = pi/10)
This function has no unique minimum: it is 0 everywhere on [0, c].
Here, a trick is to transform the function slightly so that its kink becomes a unique minimum.
g <- function (x, c) f(x, c) - x
x_vec <- seq(0, 1, 0.01)
plot(x_vec, g(x_vec, c = pi/10), type = "l")
# now works
optim(0.4, g, c = pi/10, method = "BFGS")
#$par
#[1] 0.3140978
#
#$value
#[1] -0.3140978
#
#$counts
#function gradient
# 34 5
#
#$convergence
#[1] 0
#
#$message
#NULL
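Because x is one-dimensional here, the same transformed objective can also be handed to optimize(), which searches a fixed interval and needs no starting value. This is only a sketch of an alternative and not part of the original answer; it assumes the kink lies in [0, 1] and, like the answer, relies on subtracting x to turn the kink into the minimum.

```r
# Function with a kink at c (copied from the question)
f <- function(x, c) {
  (x >= 0 & x < c) * 0 + (x >= c & x <= 1) * (sin(3 * (x - c))) +
    (x < 0 | x > 1) * 100
}
# Transformed objective from the answer: the kink becomes the minimum
g <- function(x, c) f(x, c) - x

# One-dimensional search on [0, 1]; c is passed through to g
res <- optimize(g, interval = c(0, 1), c = pi / 10)
res$minimum                          # estimate of the kink location
abs(res$minimum - pi / 10) < 1e-3    # compare against the known c
```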
Note:
In mathematics, if we want to find something, we have to first define it precisely. So what is a "kink" exactly? In this example, you refer to the parameter c = pi / 10. But what is it in general? Without a clear definition, there is no algorithm/function to get it.
I need to evaluate an integral in the following form:
\int_a^b f(x) \int_0^x g(t)(x-t)dtdx
Can you please suggest a way? I assume that this integral can't be done with the standard approach suggested in the linked answer ("Standard approach").
Update: Functions are added in the following image. f(x) basically represents a pdf of a uniform distribution but the g(t) is a bit more complicated. a and b can be any positive real numbers.
The code below integrates g(t)(x-t) over the region a <= t <= x <= b, which in (t, x) coordinates is a simplex (triangle) with vertices (a,a), (a,b) and (b,b). Use the SimplicialCubature package:
library(SimplicialCubature)
alpha <- 3
beta <- 4
g <- function(t){
((beta/t)^(1/2) + (beta/t)^(3/2)) * exp(-(t/beta + beta/t - 2)/(2*alpha^2)) /
(2*alpha*beta*sqrt(2*pi))
}
a <- 1
b <- 2
h <- function(tx){
t <- tx[1]
x <- tx[2]
g(t) * (x-t)
}
S <- cbind(c(a, a), c(a ,b), c(b, b))
adaptIntegrateSimplex(h, S)
# $integral
# [1] 0.01962547
#
# $estAbsError
# [1] 3.523222e-08
Another way, less efficient and less reliable, is:
InnerFunc <- function(t, x) { g(t) * (x - t) }
InnerIntegral <- Vectorize(function(x) { integrate(InnerFunc, a, x, x = x)$value})
integrate(InnerIntegral, a, b)
# 0.01962547 with absolute error < 2.2e-16
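The nested integrate() pattern can be sanity-checked on a case with a known closed form. The sketch below replaces g with the constant function 1 (an assumption made purely for testing, not the g from the question), for which the integral of (x - t) over a <= t <= x <= b equals (b - a)^3 / 6.

```r
a <- 1
b <- 2
g_const <- function(t) rep(1, length(t))   # stand-in for g, for testing only

inner_func <- function(t, x) g_const(t) * (x - t)
# integrate() needs a vectorized integrand, hence Vectorize() over x
inner_integral <- Vectorize(function(x) integrate(inner_func, a, x, x = x)$value)

num <- integrate(inner_integral, a, b)$value
exact <- (b - a)^3 / 6
all.equal(num, exact)
# [1] TRUE
```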
I am trying to code a function which will identify which row of an nxm matrix M is closest to a vector y of length m.
What am I doing wrong in my code, please? I am aiming for the function to produce a column vector of length n that gives the distance between each row of the matrix and the vector y. I then want to output the row number of the matrix that is closest to the vector.
closest.point <- function(M, y) {
p <- length(y)
k <- nrow(M)
T <- matrix(nrow=k)
T <- for(i in 1:n)
for(j in 1:m) {
(X[i,j] - x[j])^2 + (X[i,j] - x[j])^2
}
W <- rowSums(T)
max(W)
df[which.max(W),]
}
Even though there is already a better approach to the problem (avoiding for loops when dealing with matrices), I would like to show you how to fix your for-loop approach.
There were some mistakes in your function. Some variables, such as n, m and X, are undefined.
Also, try to avoid naming variables T, because R interprets T as TRUE. It works, but it could cause errors if T is later used to mean TRUE.
When looping, you need to index the variable that you are updating, i.e. T.matrix[i, j] and not just T.matrix, as the latter would overwrite T.matrix at every iteration.
closest.point <- function(M, y) {
k <- nrow(M)
m <- ncol(M)
T.matrix <- matrix(nrow = k, ncol = m)
for (i in 1:k) {
for (j in 1:m) {
T.matrix[i, j] <- (M[i,j] - y[j])^2 # squared difference, counted once
}
}
W <- rowSums(T.matrix)
return(which.min(W))
}
# example 1
closest.point(M = rbind(c(1, 1, 1),
c(1, 2, 5)),
y = cbind(c(1, 2, 5)))
# [1] 2
# example 2
closest.point(M = rbind(c(1, 1, 1, 1),
c(1, 2, 5, 7)),
y = cbind(c(2, 2, 6, 2)))
# [1] 2
You should try to avoid using for loops for operations on vectors and matrices. The dist function (from the stats package that ships with R) calculates distances between the rows of a matrix; which.min then gives you the index of the minimal distance.
set.seed(0)
M <- matrix(rnorm(100), ncol = 5)
y <- rnorm(5)
closest_point <- function(M, y) {
dist_mat <- as.matrix(dist(rbind(M, y)))
all_distances <- dist_mat[1:nrow(M),ncol(dist_mat)]
which.min(all_distances)
}
closest_point(M, y)
#>
#> 14
Created on 2021-12-10 by the reprex package (v2.0.1)
Hope this makes sense, let me know if you have questions.
There are a number of problems here:
p is defined but never used.
Although not wrong, T does not really have to be a matrix. A vector would be sufficient.
Although not wrong, using T as a variable name is dangerous because T also means TRUE.
The code defines T and then immediately throws it away by overwriting it in the next statement. The prior statement defining T is never used.
for always has the value NULL, so assigning it to T is pointless.
The double for loop doesn't do anything. There are no assignments in it, so the loops have no effect.
The loops refer to m, n, X and x, but these are not defined anywhere.
(X[i,j] - x[j])^2 is repeated. It is only needed once.
Writing max(W) on a line by itself has no effect. It only causes printing when run directly in the console; inside a function it does nothing. If you meant to print it, write print(max(W)).
We want the closest point, not the farthest point, so max should be min.
df is used in the last line but is not defined anywhere.
The question is incomplete without a test run.
I have tried to make the minimum changes to make this work:
closest.point <- function(M, y) {
nr <- nrow(M)
nc <- ncol(M)
W <- numeric(nr) # vector having nr zeros
for(i in 1:nr) {
for(j in 1:nc) {
W[i] <- W[i] + (M[i,j] - y[j])^2
}
}
print(W)
print(min(W))
M[which.min(W),]
}
set.seed(123)
M <- matrix(rnorm(12), 4); M
## [,1] [,2] [,3]
## [1,] -0.56047565 0.1292877 -0.6868529
## [2,] -0.23017749 1.7150650 -0.4456620
## [3,] 1.55870831 0.4609162 1.2240818
## [4,] 0.07050839 -1.2650612 0.3598138
y <- rnorm(3); y
## [1] 0.4007715 0.1106827 -0.5558411
closest.point(M, y)
## [1] 0.9415062 2.9842785 4.6316069 2.8401691 <--- W
## [1] 0.9415062 <--- min(W)
## [1] -0.5604756 0.1292877 -0.6868529 <-- closest row
That said, the closest row can be computed by a function with a one-line body. We transpose M and subtract y from it; because R recycles y down each column of t(M), and the columns of t(M) are the rows of M, this subtracts y from each row of M. Then we take the column sums of the squared differences, find which one is smallest, and subscript M with that index.
closest.point2 <- function(M, y) {
M[which.min(colSums((t(M) - y)^2)), ]
}
closest.point2(M, y)
## [1] -0.5604756 0.1292877 -0.6868529 <-- closest row
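The recycling in t(M) - y is easy to misread, so the same computation can be spelled out with sweep(), which states explicitly that y is subtracted along the columns. The function name closest.point3 and the small test matrix are hypothetical; this is a sketch of an equivalent formulation, not a replacement for the version above.

```r
# Hypothetical variant: subtract y[j] from column j of M, then sum squares by row
closest.point3 <- function(M, y) {
  sq_dist <- rowSums(sweep(M, 2, y)^2)
  M[which.min(sq_dist), ]
}

M <- rbind(c(1, 1, 1),
           c(1, 2, 5))
y <- c(1, 2, 4)
closest.point3(M, y)
# [1] 1 2 5

# Agrees with the transpose-and-recycle one-liner
identical(closest.point3(M, y), M[which.min(colSums((t(M) - y)^2)), ])
# [1] TRUE
```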
I'm given a question in R language to find the 30th term of the recurrence relation x(n) = 2*x(n-1) - x(n-2), where x(1) = 0 and x(2) = 1. I know the answer is 29 from mathematical deduction. But as a newbie to R, I'm slightly confused by how to make things work here. The following is my code:
loop <- function(n){
a <- 0
b <- 1
for (i in 1:30){
a <- b
b <- 2*b - a
}
return(a)
}
loop(30)
I'm returned 1 as a result, which is way off.
In case you're wondering why this looks Python-ish, I've mostly only been exposed to Python programming thus far (I'm new to programming in general). I've tried to check out all the syntax in R, but I suppose my logic is quite fixed by Python. Can someone help me out in this case? In addition, does R have any resources like PythonTutor to help visualise the code execution logic?
Thank you!
I guess what you need might be something like below
loop <- function(n){
if (n<=2) return(n-1)
a <- 0
b <- 1
for (i in 3:n){
a_new <- b
b <- 2*b - a
a <- a_new
}
return(b)
}
then
> loop(30)
[1] 29
If you need a recursive version, here is one implementation (note that it recomputes the same subproblems, so its running time grows exponentially with n)
loop <- function(n) {
if (n<=2) return(n-1)
2*loop(n-1)-loop(n-2)
}
which also gives
> loop(30)
[1] 29
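If several terms are needed, or n is large enough that the doubly recursive version becomes slow, the terms can be stored in a vector and filled in order. A minimal sketch (the name recurrence_terms is hypothetical):

```r
# Hypothetical helper: first n terms of x(n) = 2*x(n-1) - x(n-2),
# with x(1) = 0 and x(2) = 1
recurrence_terms <- function(n) {
  x <- numeric(n)              # x(1) = 0 is already in place
  if (n >= 2) x[2] <- 1
  if (n >= 3) {
    for (i in 3:n) x[i] <- 2 * x[i - 1] - x[i - 2]
  }
  x
}

terms <- recurrence_terms(30)
terms[30]
# [1] 29
all(terms == 0:29)
# [1] TRUE
```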
You can solve it in a couple of other ways.
Solve the linear homogeneous recurrence relation. Let
x(n) = r^n
Plugging this into the recurrence relation gives
r^n - 2*r^(n-1) + r^(n-2) = 0
which, after dividing by r^(n-2), is the quadratic
r^2 - 2*r + 1 = 0
with the repeated root
r = 1, 1
This leads to the general solution
x(n) = c1 * 1^n + c2 * n * 1^n = c1 + n * c2
and with x(1) = 0 and x(2) = 1, you get c2 = 1, c1 = -1, so that
x(n) = n - 1
=> x(30) = 29
Hence, the R code to compute x(n) as a function of n is trivial, as shown below:
x <- function(n) {
return (n-1)
}
x(30)
#29
Use matrix powers. From the recurrence relation, [x(n+1) x(n)]T = A.[x(n) x(n-1)]T with A = [2 -1; 1 0]:
(The matrix A has the repeated eigenvalue 1 with algebraic multiplicity 2 but geometric multiplicity 1, so its eigenvector matrix is singular and A is not diagonalizable; otherwise you could use the spectral decomposition yourself for fast computation of matrix powers. Here we use the expm library, as shown below.)
library(expm)
A <- matrix(c(2,1,-1,0), nrow=2)
A %^% 29 %*% c(1,0) # [x(31) x(30)]T = A^29.[x(2) x(1)]T
# [,1]
# [1,] 30 # x(31)
# [2,] 29 # x(30)
# compute x(n)
x <- function(n) {
(A %^% (n-1) %*% c(1,0))[2]
}
x(30)
# 29
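If installing expm is not an option, the matrix power can be computed in base R by repeated squaring. The helper mat_pow below is a hypothetical sketch (it assumes a square matrix and a non-negative integer exponent); it is not how expm implements %^%.

```r
# Hypothetical helper: A^k by repeated squaring, for integer k >= 0
mat_pow <- function(A, k) {
  result <- diag(nrow(A))            # identity matrix = A^0
  while (k > 0) {
    if (k %% 2 == 1) result <- result %*% A
    A <- A %*% A
    k <- k %/% 2
  }
  result
}

A <- matrix(c(2, 1, -1, 0), nrow = 2)
# [x(n+1) x(n)]T = A^(n-1).[x(2) x(1)]T, so take the second component
x_n <- function(n) (mat_pow(A, n - 1) %*% c(1, 0))[2]
x_n(30)
# [1] 29
```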
You're not using the variable you're iterating on in the loop, and because a is overwritten with b before b is updated, 2*b - a always evaluates to b, so nothing is updating.
loop <- function(n){
a <- 0
b <- 1
for (i in 1:30){
a <- b
b <- 2*i - a
}
return(a)
}
You could define a recursive function.
f <- function(x, n) {
n <- 1:n
r <- function(n) {
if (length(n) == 2) x[2]
else r({
x <<- c(x[2], 2*x[2] - x[1])
n[-1]
})
}
r(n)
}
x <- c(0, 1)
f(x, 30)
# [1] 29
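The same state update can be written without <<- by threading the pair (x(n-1), x(n)) through Reduce(). This is a sketch of an alternative formulation; the name x_reduce is hypothetical and the function assumes n >= 3.

```r
# Hypothetical helper: fold the update over the indices 3..n (assumes n >= 3)
x_reduce <- function(n, init = c(0, 1)) {
  # Each step maps (x(k-1), x(k)) to (x(k), x(k+1)); the index i is unused
  step <- function(s, i) c(s[2], 2 * s[2] - s[1])
  Reduce(step, 3:n, init)[2]
}
x_reduce(30)
# [1] 29
```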
I have a 2396x34 double matrix named y wherein each row (2396) represents a separate situation consisting of 34 consecutive time segments.
I also have a numeric[34] named x that represents a single situation of 34 consecutive time segments.
Currently I am calculating the correlation between each row in y and x like this:
crs[,2] <- cor(t(y),x)
What I need now is to replace the cor function in the above statement with a weighted correlation. The weight vector xy.wt is 34 elements long so that a different weight can be assigned to each of the 34 consecutive time segments.
I found the Weighted Covariance Matrix function cov.wt and thought that if I first scale the data it should work just like the cor function. In fact you can specify for the function to return a correlation matrix as well. Unfortunately it does not seem like I can use it in the same manner because I cannot supply my two variables (x and y) separately.
Does anyone know of a way I can get a weighted correlation in the manner I described without sacrificing much speed?
Edit: Perhaps some mathematical function could be applied to y prior to the cor function in order to get the same results that I'm looking for. Maybe if I multiply each element by xy.wt/sum(xy.wt)?
Edit #2 I found another function corr in the boot package.
corr(d, w = rep(1, nrow(d))/nrow(d))
d
A matrix with two columns corresponding to the two variables whose correlation we wish to calculate.
w
A vector of weights to be applied to each pair of observations. The default is equal weights for each pair. Normalization takes place within the function so sum(w) need not equal 1.
This also is not what I need but it is closer.
Edit #3
Here is some code to generate the type of data I am working with:
x<-cumsum(rnorm(34))
y<- t(sapply(1:2396,function(u) cumsum(rnorm(34))))
xy.wt<-1/(34:1)
crs<-cor(t(y),x) #this works but I want to use xy.wt as weight
Unfortunately the accepted answer is wrong when y is a matrix of more than one row. The error is in the line
vy <- rowSums( w * y * y )
We want to multiply column j of y by w[j], but w * y recycles the elements of w down the columns of y (column-major order), so the weights end up misaligned with the time segments. Thus
> f(x, y[1, , drop = FALSE], xy.wt)
[1] 0.103021
is correct, because in this case the multiplication is performed element-wise, which is equivalent to column-wise multiplication here, but
> f(x, y, xy.wt)[1]
[1] 0.05463575
gives a wrong answer due to the row-wise multiplication.
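The recycling behaviour is easy to see on a small example. In the sketch below, y_small and w_small are made-up illustration values, not data from the question.

```r
y_small <- matrix(1, nrow = 2, ncol = 3)
w_small <- c(10, 20, 30)

# w is recycled down the columns, so the weights do not line up with columns
bad <- w_small * y_small
bad
#      [,1] [,2] [,3]
# [1,]   10   30   20
# [2,]   20   10   30

# Transposing first makes each weight apply to one column of y_small
good <- t(w_small * t(y_small))
good
#      [,1] [,2] [,3]
# [1,]   10   20   30
# [2,]   10   20   30
```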
We can correct the function as follows
f2 <- function( x, y, w = rep(1,length(x))) {
stopifnot(length(x) == dim(y)[2] )
w <- w / sum(w)
# Center x and y, using the weighted means
x <- x - sum(x * w)
ty <- t(y - colSums(t(y) * w))
# Compute the variance
vx <- sum(w * x * x)
vy <- colSums(w * ty * ty)
# Compute the covariance
vxy <- colSums(ty * x * w)
# Compute the correlation
vxy / sqrt(vx * vy)
}
and check the results against those produced by corr from the boot package:
> res1 <- f2(x, y, xy.wt)
> res2 <- sapply(1:nrow(y),
+ function(i, x, y, w) corr(cbind(x, y[i,]), w = w),
+ x = x, y = y, w = xy.wt)
> all.equal(res1, res2)
[1] TRUE
which in itself gives another way that this problem could be solved.
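The weighted sums can also be written as matrix products, which avoids relying on recycling altogether. The sketch below is an assumption-laden illustration rather than the answer's code: wcor_rows is a hypothetical name, the data are a 5-row version of the question's generator, and the reference values come from cov.wt(..., cor = TRUE) in the stats package, which the question mentions.

```r
set.seed(1)
x <- cumsum(rnorm(34))
y <- t(sapply(1:5, function(u) cumsum(rnorm(34))))   # 5 rows instead of 2396
w <- 1 / (34:1)

# Hypothetical helper: weighted correlation of x with every row of y
wcor_rows <- function(x, y, w) {
  w <- w / sum(w)
  xc <- x - sum(w * x)                      # centre x at its weighted mean
  yc <- sweep(y, 1, as.vector(y %*% w))     # centre each row of y likewise
  vx <- sum(w * xc^2)                       # weighted variance of x
  vy <- as.vector(yc^2 %*% w)               # weighted variance of each row
  vxy <- as.vector(yc %*% (w * xc))         # weighted covariances
  vxy / sqrt(vx * vy)
}

# Reference: one cov.wt() call per row of y
ref <- sapply(seq_len(nrow(y)), function(i) {
  cov.wt(cbind(x, y[i, ]), wt = w / sum(w), cor = TRUE)$cor[1, 2]
})
all.equal(wcor_rows(x, y, w), ref)
# [1] TRUE
```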
You can go back to the definition of the correlation.
f <- function( x, y, w = rep(1,length(x))) {
stopifnot( length(x) == dim(y)[2] )
w <- w / sum(w)
# Center x and y, using the weighted means
x <- x - sum(x*w)
y <- y - apply( t(y) * w, 2, sum )
# Compute the variance
vx <- sum( w * x * x )
vy <- rowSums( w * y * y ) # Incorrect: see Heather's remark, in the other answer
# Compute the covariance
vxy <- colSums( t(y) * x * w )
# Compute the correlation
vxy / sqrt(vx * vy)
}
f(x,y)[1]
cor(x,y[1,]) # Identical
f(x, y, xy.wt)
Here is a generalization to compute the weighted Pearson correlation between two matrices (instead of a vector and a matrix, as in the original question):
matrix.corr <- function (a, b, w = rep(1, nrow(a))/nrow(a))
{
# normalize weights
w <- w / sum(w)
# center matrices
a <- sweep(a, 2, colSums(a * w))
b <- sweep(b, 2, colSums(b * w))
# compute weighted correlation
t(w * a) %*% b / sqrt(colSums(w * a^2) %*% t(colSums(w * b^2)))
}
Using the above example and the correlation function from Heather, we can verify it:
> sum(matrix.corr(as.matrix(x, nrow=34),t(y),xy.wt) - f2(x,y,xy.wt))
[1] 1.537507e-15
In terms of calling syntax, this resembles the unweighted cor:
> a <- matrix( c(1,2,3,1,3,2), nrow=3)
> b <- matrix( c(2,3,1,1,7,3,5,2,8,1,10,12), nrow=3)
> matrix.corr(a,b)
[,1] [,2] [,3] [,4]
[1,] -0.5 0.3273268 0.5 0.9386522
[2,] 0.5 0.9819805 -0.5 0.7679882
> cor(a, b)
[,1] [,2] [,3] [,4]
[1,] -0.5 0.3273268 0.5 0.9386522
[2,] 0.5 0.9819805 -0.5 0.7679882
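With non-uniform weights there is no cor result to compare against, but cov.wt(..., cor = TRUE) from the stats package can serve as a reference. In this sketch the weights are arbitrary illustration values, and matrix.corr is repeated from above only so that the block runs on its own.

```r
matrix.corr <- function(a, b, w = rep(1, nrow(a)) / nrow(a)) {
  w <- w / sum(w)                       # normalize weights
  a <- sweep(a, 2, colSums(a * w))      # centre columns at weighted means
  b <- sweep(b, 2, colSums(b * w))
  t(w * a) %*% b / sqrt(colSums(w * a^2) %*% t(colSums(w * b^2)))
}

a <- matrix(c(1, 2, 3, 1, 3, 2), nrow = 3)
b <- matrix(c(2, 3, 1, 1, 7, 3, 5, 2, 8, 1, 10, 12), nrow = 3)
w <- c(1, 2, 3)   # arbitrary weights, for illustration only

# Weighted correlation matrix of all six columns; keep the a-versus-b block
ref <- cov.wt(cbind(a, b), wt = w / sum(w), cor = TRUE)$cor[1:2, 3:6]
max(abs(matrix.corr(a, b, w) - ref)) < 1e-12
# [1] TRUE
```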