Suppose I create a hexbin (using the hexbin package):
h <- hexbin(df)
where df has x and y fields. For a particular value of x and y, how do I get the count of the corresponding bin?
Assuming you are using the hexbin function from library(hexbin), you can use the bin IDs to achieve what you want.
Call the function as hexbin(..., IDs = TRUE) and the result will have a slot, cID, that tells you which bin each point falls in.
Working example:
library(hexbin)
x <- c(1, 1.2, 1, 3, 5, -2, 1, 0, 0.8)
y <- c(1, 1, 0, -1, 0, 2, -1, 1, 1)
h <- hexbin(x, y, xbins = 3, IDs = TRUE)
# what is the cell ID of point 1?
ID1 <- h@cID[1]
# how many points fall in that cell?
sum(h@cID == ID1) # answer is 4 in this case
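The same count is also stored in the hexbin object itself. As a minimal sketch, assuming the cell and count slots list the non-empty cells and their counts in matching order, the count for every point can be looked up directly (count_per_point is just an illustrative name):

```r
library(hexbin)

x <- c(1, 1.2, 1, 3, 5, -2, 1, 0, 0.8)
y <- c(1, 1, 0, -1, 0, 2, -1, 1, 1)
h <- hexbin(x, y, xbins = 3, IDs = TRUE)

# h@cID  : cell ID of each input point (only filled when IDs = TRUE)
# h@cell : IDs of the non-empty cells
# h@count: number of points in each of those cells
count_per_point <- h@count[match(h@cID, h@cell)]

# count of the bin that contains the first point
count_per_point[1]
```

This avoids recounting the IDs for every query, which may matter when there are many points.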
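# Alternative without IDs: return the count of the cell whose centre of mass
# (xcm, ycm) is closest to the point (x, y)
get_count <- function(x, y, h) {
  my_dist <- function(x2, y2) {
    return(sqrt((x - x2) ^ 2 + (y - y2) ^ 2))
  }
  distances <- mapply(my_dist, attr(h, 'xcm'), attr(h, 'ycm'))
  return(attr(h, 'count')[which.min(distances)])
}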
I have the code below for a bivariate normal distribution:
library(mnormt)
x <- seq(-5, 5, 0.25)
y <- seq(-5, 5, 0.25)
mu <- c(0, 0)
sigma <- matrix(c(2, -1, -1, 2), nrow = 2)
f <- function(x, y) dmnorm(cbind(x, y), mu, sigma)
z <- outer(x, y, f)
a) What is the algebraic expression z = f(x, y) that this code computes? Please write it out explicitly. b) Which parameters of that expression do the numbers 2, -1, -1 and 2 in matrix(c(2, -1, -1, 2), nrow = 2) correspond to?
If you want to see the source code, you can find it there.
I have commented the code for you:
dmnorm <- function(x, mean=rep(0,d), varcov, log=FALSE)
{
  # number of variables
  d <- if(is.matrix(varcov)) ncol(varcov) else 1
  if(d==1) return(dnorm(x, mean, sqrt(varcov), log=log))
  x <- if (is.vector(x)) t(matrix(x)) else data.matrix(x)
  if(ncol(x) != d) stop("mismatch of dimensions of 'x' and 'varcov'")
  if(is.matrix(mean)) {
    if ((nrow(x) != nrow(mean)) || (ncol(mean) != d))
      stop("mismatch of dimensions of 'x' and 'mean'") }
  if(is.vector(mean)) mean <- outer(rep(1, nrow(x)), as.vector(matrix(mean,d)))
  # center the observations
  X <- t(x - mean)
  # compute the inverse of the covariance matrix (and its log determinant)
  conc <- pd.solve(varcov, log.det=TRUE)
  # Q is the quadratic form inside the exponential
  Q <- colSums((conc %*% X)* X)
  # retrieve the log determinant of the covariance matrix
  log.det <- attr(conc, "log.det")
  # log density
  logPDF <- as.vector(Q + d*logb(2*pi) + log.det)/(-2)
  if(log) logPDF else exp(logPDF)
}
It is a strict application of the multivariate normal density formula, which comes from this website.
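Written out for this question, and assuming the usual notation (d is the dimension, μ the mean vector, Σ the covariance matrix, which is varcov in the code), the density that dmnorm evaluates is the first line below. Plugging in μ = (0, 0) and Σ = matrix(c(2, -1, -1, 2), nrow = 2) gives the explicit expression for z = f(x, y). The symbols σ_x, σ_y and ρ are introduced here only to label the entries of Σ:

```latex
f(\mathbf{x}) = (2\pi)^{-d/2}\,|\Sigma|^{-1/2}
  \exp\!\left(-\tfrac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^{\top}\Sigma^{-1}(\mathbf{x}-\boldsymbol{\mu})\right)

\Sigma = \begin{pmatrix} \sigma_x^2 & \rho\,\sigma_x\sigma_y \\ \rho\,\sigma_x\sigma_y & \sigma_y^2 \end{pmatrix}
       = \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix},
\qquad |\Sigma| = 3,
\qquad \Sigma^{-1} = \frac{1}{3}\begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}

z = f(x, y) = \frac{1}{2\pi\sqrt{3}}\exp\!\left(-\frac{x^2 + xy + y^2}{3}\right)
```

So the two 2s are the variances of x and y, and the two -1s are their covariance, which corresponds to a correlation of -1/2.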
I have 2 step function objects (ecdf objects to be exact). How do I calculate a step function that is the difference or sum of these two?
I just had the same question and found the following nice solution:
y1 <- c(0, 1, 2, 0)
x1 <- c(1, 2, 3)
f1 <- stepfun(x = x1, y = y1)
par(mfrow = c(2, 2))
plot(f1)
y2 <- c(0, 1, 0)
x2 <- c(1.5, 2.5)
f2 <- stepfun(x = x2, y = y2)
plot(f2)
fs <- function(x, f1, f2) {
  return(f1(x) + f2(x))
}
fm <- function(x, f1, f2) {
  return(f1(x) * f2(x))
}
x <- seq(0, 4, length.out = 100)
plot(x, fs(x, f1, f2), type = "s", main = "Sum f1+f2")
plot(x, fm(x, f1, f2), type = "s", main = "Multiplication f1*f2")
par(mfrow = c(1, 1))
There might be a more elegant version using + and * operators, see e.g. here...
foo <- structure(list(value = 1, txt = 'a'), class = 'foo')
`+.foo` <- function(leftfoo, rightfoo) { return (paste(leftfoo$txt, rightfoo$txt)) }
foo + foo
#[1] "a a"
Depends on what you need it for. An object of class stepfun is in one sense a function; if a <- ecdf(rnorm(100)), then a(0) will evaluate to something close to 0.5. So, given two ecdf objects ec1 and ec2, you can add them just by adding functions: ecdf.sum <- function(x) { ec1(x) + ec2(x) }. This will yield something that is effectively a step function, but not of class stepfun or ecdf.
Regardless, what you get out will not be an ecdf object, because the values will not have the correct range. But to at least recover it as a step function, you can decompose it into knots:
knots.new <- sort(unique(c(knots(ec1), knots(ec2))))
ec.new <- stepfun(knots.new, c(0,ec1(knots.new) + ec2(knots.new)))
The c(0, ... is because you need one more value than the knots (for the left-hand value of the step function), and for objects of type ecdf 0 is a safe value.
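As a quick check of the knot-based construction, here is a minimal sketch on simulated data (the samples, the seed and the names ec.sum and ec.diff are assumptions made for illustration). It builds the sum and the difference as stepfun objects and compares the sum with plain function addition on a grid:

```r
set.seed(1)                         # only to make the sketch reproducible
ec1 <- ecdf(rnorm(20))
ec2 <- ecdf(rnorm(30, mean = 1))

# All jump locations of either ecdf, sorted and without duplicates
knots.new <- sort(unique(c(knots(ec1), knots(ec2))))

# Step functions for the sum and the difference; 0 is the value left of the first knot
ec.sum  <- stepfun(knots.new, c(0, ec1(knots.new) + ec2(knots.new)))
ec.diff <- stepfun(knots.new, c(0, ec1(knots.new) - ec2(knots.new)))

# The stepfun objects agree with adding the two functions pointwise
grid <- seq(-4, 5, length.out = 200)
all.equal(ec.sum(grid), ec1(grid) + ec2(grid))
```

Because the results are of class stepfun, plot(ec.sum) and knots(ec.sum) work as they do for any other step function.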
I'm implementing the Jacobi iterative method to solve linear systems Ax = b
I have the following code:
data.a <- c(3, -1, 1, 3, 6, 2, 3, 3, 7)
A <- matrix(data.a, nrow = 3, ncol = 3, byrow = TRUE)
b <- c(1, 0, 4)
Xo <- c(0,0,0)
X <- c(0, 0, 0)
#A is the matrix:
#3 -1 1
#3 6 2
#3 3 7
#b is the column vector:
#[1, 0, 4]
#and Xo is the previous X computed
for(i in 1:nrow(A)){
  sum = 0
  for(j in 1:ncol(A)){
    if(j != i){
      sum = sum + A[i,j]*Xo[j]
    }
  }
  X[i] = -(1/A[i,i])*(sum - b[i])
}
The thing is, because I only multiply and sum the values A[i,j]*Xo[j] for j != i, I am using nested for-loops and the auxiliary variable sum.
My question is: Could I use something like
A[i,] %*% Xo
to compute the values of the sum without the nested-for loops?
edit: I found a solution
X[i] = -(1/A[i,i])*(A[i,]%*%Xo - A[i,i]*Xo[i] - b[i])
# I subtracted the term A[i,i]*Xo[i] from the product A*Xo
You can even remove the first loop by making a matrix R, whose elements are equal to A except that diagonal elements are zeros.
update <- function(x, A, b) {
  D <- diag(diag(A))
  R <- A - D
  sums <- R %*% x
  x <- (b - sums) / diag(D)
  x
}
data.a <- c(3, -1, 1, 3, 6, 2, 3, 3, 7)
A <- matrix(data.a, nrow = 3, ncol = 3, byrow = TRUE)
b <- c(1, 0, 4)
x <- c(0, 0, 0)
for (i in 1:100) x <- update(x, A, b)
x
# verify the answer is correct
solve(A, b)
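Instead of a fixed 100 iterations, the loop can stop once successive iterates stop changing. The following is a minimal sketch of that idea; jacobi_solve, tol and max_iter are illustrative names, and it assumes the iteration converges, as it does for a strictly diagonally dominant matrix such as the A above:

```r
# Jacobi iteration with a stopping rule (illustrative helper, not from the original answer)
jacobi_solve <- function(A, b, x0 = rep(0, length(b)), tol = 1e-10, max_iter = 1000) {
  d <- diag(A)          # diagonal entries of A
  R <- A - diag(d)      # off-diagonal part of A
  x <- x0
  for (k in seq_len(max_iter)) {
    x_new <- as.vector(b - R %*% x) / d
    # stop when the largest change between two iterates is below tol
    if (max(abs(x_new - x)) < tol) {
      return(list(x = x_new, iterations = k))
    }
    x <- x_new
  }
  warning("Jacobi iteration did not converge within max_iter steps")
  list(x = x, iterations = max_iter)
}

A <- matrix(c(3, -1, 1, 3, 6, 2, 3, 3, 7), nrow = 3, byrow = TRUE)
b <- c(1, 0, 4)
res <- jacobi_solve(A, b)
res$x
solve(A, b)  # direct solution for comparison
```

Returning the iteration count alongside the solution makes it easy to see how quickly the method converges for a given matrix.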
I am trying to simulate the shape of the rings of a trunk section in R, but the closer I try to get to the real shape, the more difficult it becomes. I started with four radius measurements and got a nice solution (see here).
However, I now want to plot more than four radii at different angles, and connect the points with a line that simulates the rings, as in this sketch:
My first approach was to rotate the matrix of data, but I could not make all the radii start at the same position (0, 0). I also tried to rotate the axes, without success.
I would therefore like to ask for some direction on how to do this and, finally, how to calculate the area of each ring.
Any help will be welcome.
I am using the spline.poly function from here.
spline.poly
spline.poly <- function(xy, vertices, k=3, ...) {
  # Assert: xy is an n by 2 matrix with n >= k.
  # Wrap k vertices around each end.
  n <- dim(xy)[1]
  if (k >= 1) {
    data <- rbind(xy[(n-k+1):n,], xy, xy[1:k, ])
  } else {
    data <- xy
  }
  # Spline the x and y coordinates.
  data.spline <- spline(1:(n+2*k), data[,1], n=vertices, ...)
  x <- data.spline$x
  x1 <- data.spline$y
  x2 <- spline(1:(n+2*k), data[,2], n=vertices, ...)$y
  # Retain only the middle part.
  cbind(x1, x2)[k < x & x <= n+k, ]
}
DATA
df = data.frame(A = c(1, 4, 5, 8, 10),
B = c(1, 3, 7, 9, 10),
C = c(2, 6, 8, 9, 10),
D = c(1, 3, 4, 7, 9),
E = c(1, 2, 3, 4, 5))
DRAW
#Calculate angles based on number of columns
angles = 0:(NCOL(df) - 1) * 2*pi/NCOL(df)
#Calculate x and y corresponding to each radial distance
toplot = lapply(1:NCOL(df), function(i){
  data.frame(x = df[,i]*cos(angles[i]),
             y = df[,i]*sin(angles[i]))
})
#Split toplot and merge back together the same rows
toplot2 = lapply(toplot, function(x) data.frame(x, ID = sequence(NROW(x))))
toplot2 = do.call(rbind, toplot2)
toplot2 = split(toplot2, toplot2$ID)
#Create empty plot
graphics.off()
plot(do.call(rbind, toplot), type = "n", axes = FALSE, ann = FALSE, asp = 1)
#Allow drawing outside the plot region just in case
par(xpd = TRUE)
#Draw polygons
lapply(toplot2, function(a){
  polygon(spline.poly(xy = cbind(a$x, a$y), vertices = 100, k = 3))
})
#Draw points
lapply(toplot, function(a){
  points(a)
})
#Draw radial lines
lapply(toplot, function(a){
  lines(a)
})
AREA
area_data = lapply(toplot2, function(a){
spline.poly(xy = cbind(a$x, a$y), vertices = 100, k = 3)
})
library(geometry)
lapply(area_data, function(P) polyarea(P[,1], P[,2]))
#$`1`
#[1] 4.35568
#$`2`
#[1] 38.46985
#$`3`
#[1] 96.41331
#$`4`
#[1] 174.1584
#$`5`
#[1] 240.5837
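The values above are the areas enclosed by each outline. If "area of each ring" means the band between two consecutive outlines, a minimal sketch, assuming the outlines are nested (each one lies inside the next), is to take differences of the enclosed areas. The shoelace_area helper is a hypothetical base R alternative to polyarea, included only in case the geometry package is not available:

```r
# Areas enclosed by outlines 1 to 5, copied from the output above
areas <- c(4.35568, 38.46985, 96.41331, 174.1584, 240.5837)

# Area of each ring = area inside its outline minus area inside the previous one
ring_areas <- diff(c(0, areas))
ring_areas

# Base R shoelace formula for the area of a simple polygon
shoelace_area <- function(x, y) {
  nxt <- c(seq_along(x)[-1], 1)   # index of the next vertex, wrapping around
  abs(sum(x * y[nxt] - x[nxt] * y)) / 2
}
shoelace_area(c(0, 2, 2, 0), c(0, 0, 2, 2))  # 2 x 2 square
```

The first ring is the innermost disc-like region, so its area is simply the area of the first outline.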
I'm wondering how I would go about altering this code so that corresponding values of both vectors cannot be equal. As an example: if x = (1, 2, 2, 4, 8, 1, 7, 9, 5, 10) and y = (3, 2, 7, 8, 4, 10, 4, 8, 2, 1), the second values of both vectors equal 2. Is there any way I can tell R to re-sample the second position of vector x until it differs from the value in vector y?
x <- c(1:10)
y <- c(1:10)
sample_x <- sample(x, length(x), replace = TRUE)
z <- sample_x > y
You could do:
while(any(x == y)) x <- sample(x)
Edit: I now realize that x and y probably both come from a similar sample call with replace = TRUE. Here is an approach that avoids a while loop: it uses indices and a modulo shift to ensure that the two samples never match at the same position:
N <- 1:10 # vector to choose from (assumes distinct values)
L <- 20 # sample size - this might be length(N) as in your example
n <- length(N)
i <- sample(n, L, replace = TRUE)
j <- sample(n-1, L, replace = TRUE)
x <- N[i]
y <- N[1 + (i + j - 1) %% n]
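To see why the modulo shift cannot produce a match: y takes the value at position i moved forward by j places (wrapping around), and j is never 0 or n, so the two positions always differ. A small self-contained check, with a seed added only for reproducibility:

```r
set.seed(42)               # only to make the sketch reproducible
N <- 1:10                  # distinct values to sample from
L <- 20                    # sample size
n <- length(N)
i <- sample(n, L, replace = TRUE)      # position of each x value in N
j <- sample(n - 1, L, replace = TRUE)  # non-zero shift between 1 and n - 1
x <- N[i]
y <- N[1 + (i + j - 1) %% n]           # position i shifted by j, wrapping around
any(x == y)                            # FALSE: no position holds the same value
```

Note that the guarantee relies on the values in N being distinct, as the comment in the original code says.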
while (any(ind <- x==y))
x[ind] <- sample(N, sum(ind), TRUE)
where N is what you are sampling from (or the max integer)
The advantage here is that if you do not need to resample all of x, then this will converge more quickly.
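For completeness, here is the targeted resampling loop above as a self-contained sketch. The starting vectors, the seed and the n_rounds counter are assumptions added for illustration:

```r
set.seed(1)                        # only for reproducibility
N <- 1:10
y <- sample(N, 10, replace = TRUE)
x <- sample(N, 10, replace = TRUE)

n_rounds <- 0                      # counts how many resampling passes were needed
while (any(ind <- x == y)) {
  # redraw only the positions where x and y currently match
  x[ind] <- sample(N, sum(ind), replace = TRUE)
  n_rounds <- n_rounds + 1
}
any(x == y)
```

Only the matching positions are redrawn, so the rest of x keeps its original values.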
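You can use the function permn from the combinat package to generate all permutations of a vector of length 10 and keep the first one with no matches (note that this builds all 10! = 3,628,800 permutations, so it is slow and memory-hungry).
library(combinat)
ind <- permn(10)
xy_any_equal <- sapply(ind, function(i) any(x[i] == y))
if(sum(xy_any_equal) < length(xy_any_equal)) x_perm <- x[head(ind[!xy_any_equal],1)[[1]]]
exists("x_perm")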