Weighted average of neighbor elements in a vector in R

I have two vectors, x and w. Vector w is a numeric vector of weights with the same length as x.
How can we get the weighted average of neighboring elements in vector x (the weighted average of the first and second elements, then the weighted average of the second and third elements, and so on)? For example, these vectors are as follows:
x = c(0.0001560653, 0.0001591889, 0.0001599698, 0.0001607507, 0.0001623125,
0.0001685597, 0.0002793819, 0.0006336307, 0.0092017241, 0.0092079042,
0.0266525118, 0.0266889564, 0.0454923285, 0.0455676525, 0.0457005450)
w = c(2.886814e+03, 1.565955e+04, 9.255762e-02, 7.353589e+02, 1.568933e+03,
5.108046e+05, 6.942338e+05, 4.912165e+04, 9.257674e+00, 3.609918e+02,
8.090436e-01, 1.072975e+00, 1.359145e+00, 9.828314e+00, 9.455688e+01)

sapply(1:(length(x)-1), function(i) weighted.mean(x[i:(i+1)], w[i:(i+1)]))

A functional programming approach. It will be slower than @David Robinson's answer.
# lots of `Map` / functional programming
mapply(weighted.mean,
x = Map(c, head(x,-1), tail(x,-1)),
w = Map(c, head(w,-1), tail(w,-1)))
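Both answers above call weighted.mean once per pair. As a rough sketch of an alternative, the same quantity can be written directly from the definition of a two-element weighted mean, (x1*w1 + x2*w2) / (w1 + w2), using shifted copies of the vectors. The function name pairwise_wmean is hypothetical and is not part of either answer.

```r
# Hypothetical helper: weighted mean of each adjacent pair, fully vectorized.
pairwise_wmean <- function(x, w) {
  stopifnot(length(x) == length(w), length(x) >= 2)
  x1 <- head(x, -1); x2 <- tail(x, -1)   # left and right element of each pair
  w1 <- head(w, -1); w2 <- tail(w, -1)   # matching weights
  (x1 * w1 + x2 * w2) / (w1 + w2)
}

# Small illustrative input (not the data from the question).
x_demo <- c(1, 2, 4)
w_demo <- c(1, 3, 1)
vec_result <- pairwise_wmean(x_demo, w_demo)

# Reference result from the sapply approach, for comparison.
ref_result <- sapply(1:(length(x_demo) - 1),
                     function(i) weighted.mean(x_demo[i:(i + 1)], w_demo[i:(i + 1)]))
print(vec_result)
print(all.equal(vec_result, ref_result))
```

This version avoids the per-pair function call, so it is likely to be faster on long vectors, although that has not been benchmarked here.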

Related

Pointwise multiplication and right matrix division

I'm currently trying to recreate this Matlab function in R:
function X = uniform_sphere_points(n,d)
% X = uniform_sphere_points(n,d)
%
% function generates n points uniformly within the unit sphere in d dimensions
z= randn(n,d);
r1 = sqrt(sum(z.^2,2));
X=z./repmat(r1,1,d);
r=rand(n,1).^(1/d);
X = X.*repmat(r,1,d);
For the right matrix division I installed the pracma package. My R code right now is:
uniform_sphere_points <- function(n,d){
# function generates n points uniformly within the unit sphere in d dimensions
z = rnorm(n, d)
r1 = sqrt(sum(z^2,2))
X = mrdivide(z, repmat(r1,1,d))
r = rnorm(1)^(1/d)
X = X * matrix(r,1,d)
return(X)
}
But it is not working: I always end up with a non-conformable arrays error in R.
This operation for sampling n random points from the d-dimensional unit sphere could be stated in words as:
Construct an n x d matrix with entries drawn from the standard normal distribution
Normalize each row so it has (2-norm) magnitude 1
For each row, compute a random value by taking a draw from the uniform distribution (between 0 and 1) and raise that value to the 1/d power. Multiply all elements in the row by that value.
The following R code does these operations:
unif.samp <- function(n, d) {
z <- matrix(rnorm(n*d), nrow=n, ncol=d)
z * (runif(n)^(1/d) / sqrt(rowSums(z^2)))
}
Note that in the second line of code I have taken advantage of the fact that multiplying an n x d matrix in R by a vector of length n multiplies each row by the corresponding value in that vector (R stores matrices column by column and recycles the vector down each column). This saves us the work of using repmat to construct matrices of exactly the same size as our original matrix for these sorts of row-specific operations.
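The row-scaling behaviour described above can be checked on a tiny matrix, and the sampler can be sanity-checked by confirming that no point falls outside the unit sphere. This is only a sketch: the name unif_sphere, the seed, and the sample sizes are illustrative assumptions, and the function body simply restates the answer's two lines.

```r
# Recycling check: a length-nrow vector scales rows, because R stores
# matrices column by column and recycles the vector down each column.
m <- matrix(1:6, nrow = 2)          # rows: (1, 3, 5) and (2, 4, 6)
v <- c(10, 100)
scaled <- m * v                     # row 1 times 10, row 2 times 100
same_as_sweep <- all(scaled == sweep(m, 1, v, "*"))

# Same sampler as in the answer, under a hypothetical name.
unif_sphere <- function(n, d) {
  z <- matrix(rnorm(n * d), nrow = n, ncol = d)
  z * (runif(n)^(1 / d) / sqrt(rowSums(z^2)))
}

set.seed(1)                         # arbitrary seed, for reproducibility only
pts <- unif_sphere(1000, 3)
radii <- sqrt(rowSums(pts^2))       # distance of each point from the origin
print(same_as_sweep)
print(max(radii) <= 1)
```

sweep(m, 1, v, "*") states the row-wise intent explicitly and may be easier to read than relying on recycling.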

Calculate an n-by-n matrix using values in 2 vectors (of length n) in R

I'm trying to calculate an n-by-n matrix in R using the values from two vectors of length n.
For example, let's say I have the following vectors.
formula f(x,y)=x+y
x<-c(1,2,3)
y<-c(8,9,10)
z should be a 3-by-3 matrix where z[1,1] is f(x[1], y[1]), z[1,2] is f(x[1], y[2]), and so on. Is there any way to perform such a calculation in R?
You can try outer
outer(x, y, FUN= f)
where
f <- function(x,y) x+y
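As a small worked sketch of the outer call above: outer passes two expanded vectors to FUN in a single call, so FUN needs to be vectorized. The second function, g, is a hypothetical non-vectorized example added here only to show the usual workaround with Vectorize.

```r
f <- function(x, y) x + y
x <- c(1, 2, 3)
y <- c(8, 9, 10)

# z[i, j] holds f(x[i], y[j]).
z <- outer(x, y, FUN = f)
print(z)

# Hypothetical scalar-only function: `if` inspects a single value, so it
# cannot be handed to outer directly. Vectorize wraps it with mapply.
g <- function(a, b) if (a < 2) a + b else a * b
z2 <- outer(x, y, FUN = Vectorize(g))
print(z2)
```

Since `+` is already vectorized, f works as-is; Vectorize is only needed for functions written to handle one pair of scalars at a time.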

Euclidean distance between two n-dimensional vectors

What's an easy way to find the Euclidean distance between two n-dimensional vectors in Julia?
Here is a simple way
using LinearAlgebra # provides norm (required on Julia 0.7 and later)
n = 10
x = rand(n)
y = rand(n)
d = norm(x-y) # The euclidean (L2) distance
For Manhattan/taxicab/L1 distance, use norm(x-y,1)
This is easily done thanks to the lovely Distances package:
using Pkg; Pkg.add("Distances") # if you don't have it
using Distances
one7d = rand(7)
two7d = rand(7)
dist = euclidean(one7d,two7d)
Also, if you have, say, two matrices of 9-dimensional column vectors, you can get the distance between each corresponding pair of columns using colwise:
thousand9d1 = rand(9,1000)
thousand9d2 = rand(9,1000)
dists = colwise(Euclidean(), thousand9d1, thousand9d2)
#returns: 1000-element Array{Float64,1}
You can also compare to a single vector, e.g. the origin (if you want the magnitude of each column vector):
origin9 = zeros(9)
mags = colwise(Euclidean(), thousand9d1, origin9)
#returns: 1000-element Array{Float64,1}
Other distances are also available:
Squared Euclidean
Cityblock
Chebyshev
Minkowski
Hamming
Cosine
Correlation
Chi-square
Kullback-Leibler divergence
Jensen-Shannon divergence
Mahalanobis
Squared Mahalanobis
Bhattacharyya
Hellinger
More details are available on the package's GitHub page.

Find K nearest neighbors, starting from a distance matrix

I'm looking for a well-optimized function that accepts an n X n distance matrix and returns an n X k matrix with the indices of the k nearest neighbors of the ith datapoint in the ith row.
I find a gazillion different R packages that let you do KNN, but they all seem to include the distance computations along with the sorting algorithm within the same function. In particular, for most routines the main argument is the original data matrix, not a distance matrix. In my case, I'm using a nonstandard distance on mixed variable types, so I need to separate the sorting problem from the distance computations.
This is not exactly a daunting problem -- I obviously could just use the order function inside a loop to get what I want (see my solution below), but this is far from optimal. For example, the sort function with partial = 1:k when k is small (less than 11) goes much faster, but unfortunately returns only sorted values rather than the desired indices.
Try the FastKNN package from CRAN (although it is not well documented). It offers a k.nearest.neighbors function that accepts an arbitrary distance matrix. The example below computes the matrix you need.
library(FastKNN) # provides k.nearest.neighbors
# arbitrary data
train <- matrix(sample(c("a","b","c"),12,replace=TRUE), ncol=2) # n x 2
n = dim(train)[1]
distMatrix <- matrix(runif(n^2,0,1),ncol=n) # n x n
# matrix of neighbours
k=3
nn = matrix(0,n,k) # n x k
for (i in 1:n)
nn[i,] = k.nearest.neighbors(i, distMatrix, k = k)
Note: you can always search the CRAN package list (Ctrl+F for 'knn') to find related functions:
https://cran.r-project.org/web/packages/available_packages_by_name.html
For the record (I won't mark this as the answer), here is a quick-and-dirty solution. Suppose sd.dist is the special distance matrix. Suppose k.for.nn is the number of nearest neighbors.
n = nrow(sd.dist)
knn.mat = matrix(0, ncol = k.for.nn, nrow = n)
knd.mat = knn.mat
for(i in 1:n){
knn.mat[i,] = order(sd.dist[i,])[1:k.for.nn]
knd.mat[i,] = sd.dist[i,knn.mat[i,]]
}
Now knn.mat is the matrix with the indices of the k nearest neighbors in each row, and for convenience knd.mat stores the corresponding distances.
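The loop above can also be written without preallocating and filling row by row. The following is a minimal base R sketch, not a tuned implementation; the function name knn_from_dist and the exclude_self argument are hypothetical. Note that when the diagonal of the distance matrix is zero, the loop above returns each point as its own first neighbor, whereas this sketch masks the diagonal by default.

```r
# Hypothetical helper: k nearest neighbors from a precomputed n x n distance matrix.
knn_from_dist <- function(d, k, exclude_self = TRUE) {
  stopifnot(is.matrix(d), nrow(d) == ncol(d), k >= 1)
  n <- nrow(d)
  if (exclude_self) diag(d) <- Inf            # a point is not its own neighbor
  stopifnot(k <= n - exclude_self)
  # One order() call per row; keep only the first k indices.
  idx_list <- lapply(seq_len(n), function(i) order(d[i, ])[seq_len(k)])
  index <- matrix(unlist(idx_list), nrow = n, byrow = TRUE)
  # Look up the matching distances with (row, column) index pairs.
  distance <- matrix(d[cbind(rep(seq_len(n), times = k), as.vector(index))],
                     nrow = n)
  list(index = index, distance = distance)
}

# Illustrative input: four points on a line at 0, 1, 3 and 7.
d_demo <- as.matrix(dist(c(0, 1, 3, 7)))
res <- knn_from_dist(d_demo, k = 2)
print(res$index)
print(res$distance)
```

This still sorts each full row, so it costs O(n log n) per row; it does not address the partial-sort speedup mentioned in the question.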

Calculating Partitioned Matrices from subs

Say you have a matrix A of size P × P, and a number Q < P that is used to partition the matrix, where:
A1 is the upper-left sub matrix, with dimension Q × Q,
A2 is the upper-right sub matrix, with dimension Q × (P-Q),
A3 is the lower-left sub matrix, with dimension (P-Q) × Q,
A4 is the lower-right sub matrix, with dimension (P-Q) × (P-Q).
Which looks like this:
    A1 | A2
A = ---+---
    A3 | A4
How can you calculate the matrix:
Where 0q is a Q × Q matrix with zero elements.
I'm learning from a book called "Discovering Statistics using R" and although it discusses partitioned matrices, it doesn't show how to calculate one like the one given above and unfortunately I'm having no luck on the programming or maths based searches...
Any help, either mathematically and/or example R code would be great. Thanks in advance.
R has various ways to grab blocks from matrices. For instance, you can use a vector of indexes to reference a set of rows or set of columns using the extract function [, as shown in this example. (The option drop=FALSE is needed if you must handle the case p=1 or q=1 so that R continues to treat the results as matrices and not just vectors.)
#
# Create a symmetric positive-definite matrix of size p+q.
#
p <- 2; q <- 3 # Both must be 1 or greater
x <- matrix(rnorm((p+q)^2 * 2), ncol=p+q)
a <- cov(x)
#
# Compute b.
#
i <- 1:p; j <- 1:q + p # Indexes of the blocks
b <- a[i,i, drop=FALSE] -
a[i,j, drop=FALSE] %*% solve(a[j,j, drop=FALSE], a[j,i, drop=FALSE])
Matrix inversion is implemented by solve (which is numerically more stable and efficient than computing the inverse of a[j,j] and multiplying that by a[j,i]), the single remaining multiplication is carried out by %*%, and - subtracts one matrix from another component by component. In this fashion the code closely parallels the mathematical expression in the question.
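The quantity b computed above is the Schur complement of the block a[j,j] in a. One way to sanity-check the block arithmetic, offered here as a sketch rather than as part of the original answer, uses the standard identity that the upper-left block of the inverse of a equals the inverse of that Schur complement. The seed and the choice of 50 rows are arbitrary assumptions made so that the example is reproducible and well conditioned.

```r
set.seed(42)                                   # arbitrary seed
p <- 2; q <- 3
x <- matrix(rnorm(50 * (p + q)), ncol = p + q) # 50 rows: illustrative choice
a <- cov(x)                                    # symmetric positive-definite

i <- 1:p; j <- 1:q + p                         # indexes of the blocks
b <- a[i, i, drop = FALSE] -
  a[i, j, drop = FALSE] %*% solve(a[j, j, drop = FALSE], a[j, i, drop = FALSE])

# Identity check: inverting the upper-left block of solve(a) should recover b.
b_check <- solve(solve(a)[i, i, drop = FALSE])
max_abs_diff <- max(abs(b - b_check))
print(max_abs_diff)                            # expected to be near machine precision
```

Inverting the full matrix is used here only as a check; for actual computation the solve(a[j,j], a[j,i]) form in the answer avoids forming any explicit inverse.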
