Looping through vector elements using macro variables


Suppose the user sets dim (some positive integer n), and then the values of one or more variables x_i, with i = 1, ..., n.
After some more computations I need to automatically return a dim-dimensional vector vec of the form: (0, 0, 0.2, 0, 0, ..., 0.3), where in this specific case the user has set:
dim <- 10
x_3 <- 0.2
x_10 <- 0.3
Of course, this is straightforward to do by hand:
vec <- rep(0, dim)
vec[3] <- x_3
vec[10] <- x_10
However, since I want to automate as many operations as possible, how would you link each x_i variable to the corresponding element of vec, given that it is not known in advance which, or how many, variables will be set to a nonzero value?
In other languages this could be done using a for loop with macro variables. The syntax below is wrong, but the idea is something like this:
vec <- rep(0, dim)
for (i in 1:dim) {
if (as.integer(exists(x_i))==1) {
vec[i] <- x_i
}
}
What would you suggest? Thanks!

sapply(paste("x_",seq(dim),sep=""),function(x) if(exists(x)) get(x) else 0)
x_1 x_2 x_3 x_4 x_5 x_6 x_7 x_8 x_9 x_10
0.0 0.0 0.2 0.0 0.0 0.0 0.0 0.0 0.0 0.3
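The loop sketched in the question can also be written almost literally. The following is a minimal sketch, assuming the x_i variables live in the environment where the loop runs; the name var_name is introduced here only for illustration. Note that exists() also searches enclosing environments, so an x_i defined elsewhere would be picked up too.

```r
dim <- 10
x_3 <- 0.2
x_10 <- 0.3

vec <- rep(0, dim)
for (i in seq_len(dim)) {
  var_name <- paste0("x_", i)  # build the variable name, e.g. "x_3"
  # mode = "numeric" skips objects of the same name that are not numeric
  if (exists(var_name, mode = "numeric")) {
    vec[i] <- get(var_name, mode = "numeric")
  }
}
vec
```

Elements whose x_i was never set keep their initial value of 0.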

Related

Lower Triangular Matrix equal to value in Julia

I am trying to find a lower triangular matrix in Julia and place that matrix in a zeros matrix (replacing the zeros). I then want to set the lower triangular matrix equal to the value c. In R this would look something like this:
mat <- matrix(0, n,n)
mat[lower.tri(mat, diag=FALSE)] <- c
with an output that looks like this:
[,1] [,2]
[1,] 0.0000000 0
[2,] 0.4646787 0
I am able to make the zeros matrix in Julia, but I do not know how to apply the lower-triangular function to it and assign the value c.

Here's probably the closest equivalent in Julia:
julia> n = 3; c = 0.4646787; A = zeros(n,n);
julia> A[tril!(trues(size(A)), -1)] .= c;
julia> A
3×3 Array{Float64,2}:
0.0 0.0 0.0
0.464679 0.0 0.0
0.464679 0.464679 0.0
Of course, one of the big advantages in Julia is that you don't need to worry about creating these clever one-liners nearly as much. Writing out the naive for loops is quite easy and should have similar performance:
julia> for j in 1:size(A, 2), i in j+1:size(A, 1)
           A[i, j] = c
       end
julia> A
3×3 Array{Float64,2}:
0.0 0.0 0.0
0.464679 0.0 0.0
0.464679 0.464679 0.0

Reduce computation time of simple function

I have a simple function, which is called hundreds of times. Is there any way to improve the speed of its computation?
# the input variable (y) is a scalar
my_function <- function(y){
  ret_value <- 0.5*((max(0,y)**2)-(3*max(0,y-1)**2)+(3*max(0,y-2)**2)-(max(0,y-3)**2))
  return(ret_value)
}
I already tried an if / else if statement, which resulted in a slightly longer computation time.
I also read a lot about the Rcpp package, but my understanding is that a C++ implementation mostly helps with more complicated structures such as recursion and multiple loops. I am not sure there would be any improvement for a function as simple as mine.
EDIT: The parent function is as follows and as such, a vectorization would probably be the best idea. What's the best / fastest way to do so?
val1 <- 0.9
val2 <- 0.7
F_val <- 1
loop_seq <- c(1, 2, 3)
for(i in loop_seq){
  for(j in loop_seq){
    F_val <- F_val + my_function(val1-i)*my_function(val2-j)
  }
}
My current solution is the following, but I didn't really get any speed improvement:
ret_value <- 0.5*apply(matrix(pmax(0,rep(y_vec,each=4)+c(0,-1,-2,-3))^2*c(1,-3,3,-1), ncol=length(y_vec)),2,sum)

I think **2 is where the time is spent. I replaced it with multiplication:
mff <- function(y) {
  a <- if (y < 0.0) 0.0 else y
  b <- if (y < 1.0) 0.0 else y-1.0
  c <- if (y < 2.0) 0.0 else y-2.0
  d <- if (y < 3.0) 0.0 else y-3.0
  0.5*(a*a - 3.0*b*b + 3.0*c*c - d*d)
}
On my machine, your original function (named mf here):
> system.time(replicate(1e6, mf(0.7)))
user system elapsed
2.88 0.00 2.88
Mine
> system.time(replicate(1e6, mff(0.7)))
user system elapsed
1.48 0.03 1.52
UPDATE
It is also easy to vectorize this function with ifelse()
vecmff <- function(y) {
  a <- ifelse(y < 0.0, 0.0, y)
  b <- ifelse(y < 1.0, 0.0, y-1.0)
  c <- ifelse(y < 2.0, 0.0, y-2.0)
  d <- ifelse(y < 3.0, 0.0, y-3.0)
  0.5*(a*a - 3.0*b*b + 3.0*c*c - d*d)
}
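For the parent double loop in the question, a vectorized function also lets the loop itself disappear, because the sum over i and j of f(val1 - i) * f(val2 - j) factorizes into a product of two sums. A minimal sketch, assuming the same formula as my_function; the name my_function_vec is made up for this example, and no timing is claimed:

```r
# Vectorized form of the question's formula, using pmax() instead of max()
my_function_vec <- function(y) {
  0.5 * (pmax(0, y)^2 - 3 * pmax(0, y - 1)^2 +
           3 * pmax(0, y - 2)^2 - pmax(0, y - 3)^2)
}

val1 <- 0.9
val2 <- 0.7
loop_seq <- c(1, 2, 3)

# sum_i sum_j f(val1 - i) * f(val2 - j) == (sum_i f(val1 - i)) * (sum_j f(val2 - j))
F_val <- 1 + sum(my_function_vec(val1 - loop_seq)) *
  sum(my_function_vec(val2 - loop_seq))
```

This replaces the nine scalar calls in the nested loop with two vectorized calls.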

Speeding up an iterative function with 2 inputs/outputs

Using R, I'm wondering what the best way is to iteratively evaluate a function of multiple inputs and outputs. I'm motivated by the plots seen at: http://paulbourke.net/fractals/clifford/
The key equations are:
x_{n+1} = sin(A* y_n) + C* cos(A* x_n)
y_{n+1} = sin(B* x_n) + D* cos(B* y_n)
And I want to store the results for each iteration. I'm guessing there is a MUCH faster way than going through the loop described in the code below:
#Parameters
A <- -1.4
B <- 1.6
C <- 1.0
D <- 0.7
n_iter <- 10000000
#Initial values
x0 <- 0
y0 <- 0
#function to calculate n+1 points
cliff <- function(x,y){
  c(sin(A*y) + C*cos(A*x), sin(B*x) + D*cos(B*y))
}
#matrix to store results
res_mat <- matrix(0,nrow=n_iter,ncol=2)
#recursive loop (definitely not the fastest way to do this?)
for (i in 2:n_iter){
  res_mat[i,] <- cliff(res_mat[i-1,1],res_mat[i-1,2])
}
I imagine this doesn't actually have to be a single function, but 2 that operate on each other's outputs. Any insight into a more appropriate way to evaluate these functions would be greatly appreciated. I daresay I would benefit here from some general programming advice that would not necessarily be R specific.

One option would be using Rcpp; for iterative functions like this one where each new value is a complex function of the previous iteration's value, this often yields quite good speedups.
library(Rcpp)
cliff.rcpp = cppFunction("
  NumericMatrix cliff(int nIter, double A, double B, double C, double D) {
    NumericMatrix x(nIter, 2);
    for (int i=1; i < nIter; ++i) {
      x(i,0) = sin(A*x(i-1,1)) + C*cos(A*x(i-1,0));
      x(i,1) = sin(B*x(i-1,0)) + D*cos(B*x(i-1,1));
    }
    return x;
  }
}")
cliff.rcpp(10, 1, 2, 3, 4)
# [,1] [,2]
# [1,] 0.0000000 0.0000000
# [2,] 3.0000000 4.0000000
# [3,] -3.7267800 -0.8614156
# [4,] -3.2595913 -1.5266964
# [5,] -3.9781665 -4.2182644
# [6,] -1.1296464 -3.1953775
# [7,] 1.3346977 3.2046776
# [8,] 0.6386906 4.4230487
# [9,] 1.4501988 -2.3914781
# [10,] -0.3208062 0.5208984
We can see that this returns identical results to the code in the question:
cliff.orig <- function(n_iter, A, B, C, D) {
  #function to calculate n+1 points
  cliff <- function(x,y){
    c(sin(A*y) + C*cos(A*x), sin(B*x) + D*cos(B*y))
  }
  #matrix to store results
  res_mat <- matrix(0,nrow=n_iter,ncol=2)
  #recursive loop (definitely not the fastest way to do this?)
  for (i in 2:n_iter){
    res_mat[i,] <- cliff(res_mat[i-1,1],res_mat[i-1,2])
  }
  res_mat
}
identical(cliff.rcpp(10, 1, 2, 3, 4), cliff.orig(10, 1, 2, 3, 4))
# [1] TRUE
For the input in the original question, the Rcpp approach yields a ~50 times speedup:
system.time(cliff.rcpp(10000000, -1.4, 1.6, 1.0, 0.7))
# user system elapsed
# 0.661 0.046 0.717
system.time(cliff.orig(10000000, -1.4, 1.6, 1.0, 0.7))
# user system elapsed
# 34.591 0.245 35.040
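If compiling C++ is not an option, the same recurrence can be written in plain R with two preallocated vectors, which avoids the inner function call and the matrix row assignment on every iteration. This is only a sketch: the name cliff_vec is made up here, and its speed relative to the versions above has not been measured.

```r
cliff_vec <- function(n_iter, A, B, C, D) {
  x <- numeric(n_iter)  # x[1] = 0 is the initial value
  y <- numeric(n_iter)  # y[1] = 0 is the initial value
  for (i in seq_len(n_iter - 1)) {
    # both updates use the values from step i, as in the original equations
    x[i + 1] <- sin(A * y[i]) + C * cos(A * x[i])
    y[i + 1] <- sin(B * x[i]) + D * cos(B * y[i])
  }
  matrix(c(x, y), ncol = 2)  # column 1 = x, column 2 = y
}

head(cliff_vec(10, 1, 2, 3, 4), 3)
```

The result has the same layout as res_mat in the question: one row per iteration, with x in the first column and y in the second.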

r create matrix from repeat loop output

For each value n in some vector N, I want to compute the percentage of values that exceed n for each variable in my data frame T.
Consider the following input data frame:
T <- data.frame(A=c(0.1,0.2,0.3), B=c(0.3,0.3,0.9),C=c(1,0.5,0))
T
# A B C
# 1 0.1 0.3 1.0
# 2 0.2 0.3 0.5
# 3 0.3 0.9 0.0
I would like the output to be a matrix that looks something like this:
A B C
n=0.1 66.6 100 66.6
n=0.2 33.3 100 66.6
My current implementation is not working:
n <- 0.8
repeat {
Tlogic <- T > n
TU <- as.matrix(apply(Tlogic,2,sum))
q = NULL
for (i in seq(along=TU[,1]))
{
percent <- (TU[i]/nrow(T))*100
q = c(q, percent)
}
n <- n - 0.05;
print(n);
if(log(n) < -6) break
}

Basically you're asking, for each value n in some vector N, to compute the percentage of values in each column of T that exceed n.
You can actually do this in one line in R by moving from a solution that writes out loops to one that uses the *apply functions in R:
N <- c(0.1, 0.2)
do.call(rbind, lapply(N, function(n) c(n=n, 100*colMeans(T > n))))
# n A B C
# [1,] 0.1 66.66667 100 66.66667
# [2,] 0.2 33.33333 100 66.66667
For each value n in N, the call lapply(N, function(n) c(n=n, 100*colMeans(T > n))) computes a vector that indicates n as well as the percentage of values in each column of T that exceed n. Then do.call(rbind, ...) groups all of these together into a final output matrix.
In your case, you want N to form a decreasing sequence (by 0.05 each step) from 0.8 until log(n) < -6. You can get the N vector in this case with:
N <- seq(.8, 0, -.05)
N <- N[log(N) >= -6]
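Putting the pieces together, here is a sketch that builds the whole matrix for the decreasing sequence and labels the rows in the "n=..." style asked for in the question. It uses sapply() plus t() instead of do.call(rbind, ...), which is just an alternative way to stack the per-n results; the name res is made up for this example.

```r
T <- data.frame(A = c(0.1, 0.2, 0.3), B = c(0.3, 0.3, 0.9), C = c(1, 0.5, 0))

N <- seq(0.8, 0, by = -0.05)
N <- N[log(N) >= -6]  # drops n = 0, since log(0) is -Inf

# sapply() returns one column per n, so transpose to get one row per n
res <- t(sapply(N, function(n) 100 * colMeans(T > n)))
rownames(res) <- paste0("n=", N)
res
```

Because the values in N come from seq(), they are subject to floating-point rounding, so a data value that sits exactly on a threshold may be counted either way.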

Tabular data to matrix in R

I'm trying to remove the shackles of some legacy code that we use to make decision trees in a retail setting. I got to playing with hclust in R and it's beautiful and I'd like to use it. The heavy lifting for calculating distances is done in SQL and I get an output like this:
main with dist
A A 0.00
A B 1.37
A C 0.64
B B 0
B C 0.1
C C 0
That's loaded as a data frame right now (just reading the SQL query dump), but hclust wants a matrix of distances. E.g.,:
A B C
--+-----------------
A | 0
B | 1.37 0
C | 0.64 0.1 0
My thinking is too procedural and I'm trying to do it in nested loops at the moment. Can someone point me in the direction of something more R-idiomatic to do this?
Thanks!

If you are looking for an actual distance matrix in R, try:
as.dist(xtabs(dist ~ with + main, mydf), diag = TRUE)
# A B C
# A 0.00
# B 1.37 0.00
# C 0.64 0.10 0.00
I'm presuming that the combinations of "main" and "with" are unique, otherwise xtabs would sum the "dist" values.
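For completeness, a self-contained sketch of the full path from the SQL dump to hclust. The data frame mydf below is typed in by hand from the table in the question; in practice it would come from reading the query output.

```r
# Hand-typed copy of the question's table (assumed column names: main, with, dist)
mydf <- data.frame(
  main = c("A", "A", "A", "B", "B", "C"),
  with = c("A", "B", "C", "B", "C", "C"),
  dist = c(0, 1.37, 0.64, 0, 0.1, 0)
)

# xtabs() puts "with" on the rows and "main" on the columns, which fills the
# lower triangle; as.dist() reads the lower triangle only
d <- as.dist(xtabs(dist ~ with + main, data = mydf))

hc <- hclust(d)  # default linkage; pick a method to suit the data
```

The resulting hc object can then be passed to plot() to draw the dendrogram.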

I would suggest changing from letters to numbers (which is straightforward using the ASCII codes) and then using the linearized indices of R matrices to access each pair in a vectorized manner.
Minimal example:
N <- 3
d <- data.frame(x = c(1,2), y = c(2,3), v = c(0.1, 0.2))
m <- matrix(0, N, N)
m[(d$y-1)*N+d$x] = d$v
The output is:
[,1] [,2] [,3]
[1,] 0 0.1 0.0
[2,] 0 0.0 0.2
[3,] 0 0.0 0.0
EDIT: To preserve arbitrary strings as row and col names, consider the following example:
codes <- c('A','B','C')
N <- 3
d <- data.frame(x = c('A','B'), y = c('B','C'), v = c(0.1, 0.2))
m <- matrix(0, N, N)
m[(match(d$y, codes) - 1) * N + match(d$x, codes)] = d$v
rownames(m) = codes
colnames(m) = codes
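The linearized index arithmetic above can also be avoided: R accepts a two-column matrix of (row, column) positions as an index. A minimal sketch on the same toy data, with match() translating labels to positions:

```r
codes <- c("A", "B", "C")
d <- data.frame(x = c("A", "B"), y = c("B", "C"), v = c(0.1, 0.2))

m <- matrix(0, length(codes), length(codes), dimnames = list(codes, codes))

# each row of the index matrix is one (row, column) pair
idx <- cbind(match(d$x, codes), match(d$y, codes))
m[idx] <- d$v
m
```

This fills the same cells as the (d$y - 1) * N + d$x formula, without having to reason about column-major storage.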
