Parallel programming and the <<- assignment operator in R

I have a question regarding parallel programming and the <<- operator in R.
I want to apply the function xfun to each row of the matrix x.
If the entry in the first column is smaller than 0.5, the function is supposed to append the entry in the second column to the vector vec, which lives outside the function in the global environment.
At the end, the function should return the entry in the first column plus the random number y.
If I use the regular apply function, it works exactly as it should. However, I want to apply a similar function to a huge dataset and therefore want to run it in parallel
via the future_apply function from the package "future.apply". But when I do this, the <<- operator does not work and vec stays empty.
Does anyone know why that is, and whether there is a way to make it work?
Thanks in advance.
x <- matrix(runif(20), nrow = 10, ncol = 2)
vec <- NULL
xfun <- function(x) {
y <- runif(1)
if (x[1] < 0.5) {
vec <<- append(vec, x[2])
}
return(x[1] + y)
}
# works
xy <- apply(x, 1, xfun)
library(future.apply)
# vec stays empty
xy <- future_apply(x, 1, xfun)
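A likely explanation, offered as a sketch rather than a definitive answer: futures are meant to be evaluated in isolation, possibly in a separate R process, so an assignment made with <<- changes at most a copy of vec on the worker and never reaches the calling session. A common workaround is to return everything from the function and assemble vec afterwards. The example below uses base lapply so that it runs without extra packages; the helper name xfun2 and the list fields value and keep are made up for illustration, and the assumption is that future_lapply from future.apply could be swapped in for lapply.

```r
set.seed(1)
x <- matrix(runif(20), nrow = 10, ncol = 2)

# Hypothetical variant of xfun: no side effects, everything is returned.
xfun2 <- function(row) {
  y <- runif(1)
  list(
    value = row[1] + y,                      # what xfun returned
    keep  = if (row[1] < 0.5) row[2] else NULL  # what xfun appended to vec
  )
}

# Assumption: future_lapply could replace lapply here.
res <- lapply(seq_len(nrow(x)), function(i) xfun2(x[i, ]))

# Assemble the two results in the calling session.
xy  <- vapply(res, function(r) r$value, numeric(1))
vec <- unlist(lapply(res, function(r) r$keep))
```

Because nothing is written outside the function, the result does not depend on where each call was evaluated.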

Related

How to perform a loop to evaluate a function with different values of x in R?

I created a function and I want to evaluate it at different values of x. I wrote a loop for this and expected the result to be a vector with the different values of the function. However, I am doing something wrong, because neither of the two ways I tried produced the expected vector.
my_function <- function(x) x^2
#First try
for(i in seq(0.1, 1, 0.1)) {
y[i] <- my_function(i)
}
y
#Second try
for(i in seq(0.1, 1, 0.1)) {
y[i] <- my_function[i]
}
y
The first attempt resulted in:
> y
[1] 1
The second attempt resulted in:
object of type 'closure' is not subsettable
I would like to obtain a resulting vector, composed of the function values evaluated at 0.1, 0.2, etc.
In the second attempt, my_function[i] tries to subset the function with an index, but my_function is a function (a closure), not a vector. It has to be called with parentheses: my_function(i).
What we need instead is a vector to store the values, indexed by integer position:
x <- seq(0.1, 1, 0.1)
y <- numeric(length(x))
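Otherwise, with i taking the values 0.1, 0.2, ..., 1 as the index, R truncates the non-integer indices, so only the last iteration (i = 1) assigns anything. Loop over the positions of x instead: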
for(i in seq_along(x)) {
y[i] <- my_function(x[i])
}
Note that my_function is vectorized, so we don't need a loop at all:
my_function(x)
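As a small check of the points above (a sketch, with z as a throwaway name used only for illustration): a fractional index is truncated toward zero, and assigning to index 0 changes nothing, which is why looping over the values themselves fills almost nothing. Looping over positions and the vectorized call agree.

```r
my_function <- function(x) x^2
x <- seq(0.1, 1, 0.1)

# Fractional indices are truncated toward zero, so z[0.9] means z[0],
# and assigning to index 0 leaves the vector unchanged.
z <- numeric(0)
z[0.9] <- 5
length(z)  # still 0

# Loop over integer positions instead of over the values themselves.
y <- numeric(length(x))
for (i in seq_along(x)) {
  y[i] <- my_function(x[i])
}

# The vectorized call gives the same result without a loop.
all.equal(y, my_function(x))
```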

why smart rounding works differently with map/lapply than without?

I would like to round my results "smartly", so that the rounded values add up to the same total as before rounding.
Can someone explain to me why the result is different when I do it with map or lapply?
v <- c(
0.9472164,
71.5330771,
27.5197066)
smart.round <- function(x, digits = 0) {
up <- 10 ^ digits
x <- x * up
y <- floor(x)
indices <- tail(order(x-y), round(sum(x)) - sum(y))
y[indices] <- y[indices] + 1
y / up
}
### works correctly
smart.round(v)
### lapply and map give a different result
lapply(v,smart.round)
map(v,smart.round)
(I think this is merely a comment, but I have not yet earned the right to add comments.)
lapply and purrr::map process the elements of your input one at a time. In your example, lapply takes the first value of v and calls smart.round on it, then moves on to the second value of v, and so on.
In total, smart.round is called three times, each time without any knowledge of the other two values in v.
I'm not entirely sure why you are trying to use lapply here; if this is part of a more complex situation, you might want to expand your question.
I have written my own solution. It is definitely a bit cumbersome, but it works. :) My initial goal was just to input a data frame and output the rounded data frame.
The whole example is here:
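To make the difference concrete, here is a short sketch using the v and smart.round from the question (the names together and one_by_one are only illustrative). Called on the whole vector, smart.round can trade roundings between elements to preserve the total; called on one element at a time, it reduces to ordinary rounding.

```r
v <- c(0.9472164, 71.5330771, 27.5197066)

smart.round <- function(x, digits = 0) {
  up <- 10 ^ digits
  x <- x * up
  y <- floor(x)
  # bump the elements with the largest fractional parts until the total matches
  indices <- tail(order(x - y), round(sum(x)) - sum(y))
  y[indices] <- y[indices] + 1
  y / up
}

# One call that sees all three values: the total is preserved.
together <- smart.round(v)                 # 1 72 27, sum 100

# Three separate calls, each seeing a single value.
one_by_one <- unlist(lapply(v, smart.round))  # 1 72 28, sum 101
```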
v <- data.frame(a = c(0.9472164,
71.5330771,
27.5197066),
b = c(4.6472164,
5.6330771,
27.1197066))
smart.round <- function(x, digits = 0) {
up <- 10 ^ digits
x <- x * up
y <- floor(x)
indices <- tail(order(x-y), round(sum(x)) - sum(y))
y[indices] <- y[indices] + 1
y / up
}
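rounding_function <- function(input_df) {
# pre-allocate an output data frame with the same shape and names as the input
output_df <- data.frame(matrix(ncol = ncol(input_df), nrow = nrow(input_df)))
colnames(output_df) <- colnames(input_df)
for (i in seq_len(nrow(input_df))) {
# round row i as a whole, so the row keeps its rounded total
a <- smart.round(as.numeric(input_df[i,]))
for (k in seq_len(ncol(input_df))) {
output_df[i,k] <- a[k]
}
}
return(output_df)
}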
v_rounded <- rounding_function(v)
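A shorter alternative, given as a sketch and not as a replacement for the solution above: since the rounding is done row by row, apply over rows can do the looping, assuming all columns of the data frame are numeric. The name v_rounded2 is only illustrative.

```r
v <- data.frame(a = c(0.9472164, 71.5330771, 27.5197066),
                b = c(4.6472164, 5.6330771, 27.1197066))

smart.round <- function(x, digits = 0) {
  up <- 10 ^ digits
  x <- x * up
  y <- floor(x)
  indices <- tail(order(x - y), round(sum(x)) - sum(y))
  y[indices] <- y[indices] + 1
  y / up
}

# apply() returns one column per input row, so transpose the result back.
v_rounded2 <- as.data.frame(t(apply(v, 1, smart.round)))

# Each row now sums to the rounded total of the original row.
rowSums(v_rounded2)
```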

Compare array over matrix without loop

I need to compare an array Y of integers and NA values to a matrix and return TRUE, FALSE, or NA. I'm limited in how I can write this: no loops or if statements, so it has to be very plain. The issue is that the comparison only covers the length of the array without repeating over the rest of the matrix; also, it isn't correctly recognizing FALSE values.
I know the problem is in my apply call, but I don't know how to get apply() to repeat by itself without looping.
answer <- function(x,y){
y <- as.matrix(y)
z <- apply(apply(x,2,`==`,y),1,any)
q <- as.matrix(z)
print(q)
}
It depends on how you look at the matrix, but R is a mostly vectorized language: you don't need loops to compare objects of different sizes. Just be mindful of the direction in which the shorter vector is recycled (down the columns).
answer <- function(x,y){
cat('+++++Solution 4+++++\n')
q <- x == y
print(q)
}
x <- matrix(c(1,0,1,0,1,1,1,1,0,1,0,1), nrow=4, ncol=4)
y <- c(1, 1, 1, NA)
answer(x,y)
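The recycling direction can be seen in a small sketch. Two assumptions that are not in the original: the matrix is built with ncol = 3 so that the 12 values fill it exactly, and y_row is a made-up vector with one entry per column.

```r
x <- matrix(c(1, 0, 1, 0,  1, 1, 1, 1,  0, 1, 0, 1), nrow = 4, ncol = 3)
y <- c(1, 1, 1, NA)

# y is recycled down each column: entry [i, j] is x[i, j] == y[i],
# so the NA in y[4] turns the whole fourth row into NA.
by_col <- x == y

# To compare along rows instead (one value per column), transpose,
# compare, and transpose back: entry [i, j] is x[i, j] == y_row[j].
y_row <- c(1, NA, 0)   # hypothetical per-column values
by_row <- t(t(x) == y_row)
```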
Or a solution that compares by row (rather ugly):
answer <- function(x,y){
cat('+++++Solution 4+++++\n')
q <- matrix(apply(t(y),1,`==`,t(x)),nrow = 4,byrow = TRUE)
print(q)
}
answer(x,y)

How to Assign Values to a Vector Based on a Logical Expression in R

I'm trying to write the following function:
f <- function(q, r) {
for(i in seq(from = (1 - r), to = (r - 1), by = 1)){
s <- r + i;
if (q %% s == 0) {
here(s)
}
}
}
However, where I have "here", I'd like to collect those values of s that meet the criterion specified by the "if" statement above it, so that I can perform operations on them (take the max and min values, and so on), i.e. a vector of the form:
v <- c(those values of s that meet the criterion stipulated by if statement)
I'm sure this is relatively simple, but this is the first function I've tried to write in R, so bear with me, if you could. Thanks.
From what I understand of your code, you want to create a vector from (1-r+r) to (r-1+r), i.e. from 1 to 2*r-1, and then apply a function to those values of the vector that divide q evenly.
I created a vector of only the numbers that meet the condition (by subsetting with the TRUE/FALSE vector) and then applied the function only to those values.
I hope this code correctly interprets your function.
f <- function(q, r) {
s <- seq(1, 2*r-1, by=1)
# logical index: TRUE where s divides q (the comparison is already TRUE/FALSE)
ind <- q %% s == 0
result <- here(s[ind])
return(result)
}
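Since here() is only a placeholder, the following sketch returns the matching values themselves so that max, min and similar functions can be applied afterwards. The function name divisors_in_range is made up for illustration, and r is assumed to be a positive integer.

```r
# Hypothetical helper: all s in 1..(2*r - 1) that divide q evenly.
divisors_in_range <- function(q, r) {
  s <- seq_len(2 * r - 1)
  s[q %% s == 0]
}

v <- divisors_in_range(12, 4)  # candidates are 1..7
v        # 1 2 3 4 6
max(v)   # 6
min(v)   # 1
```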

subscript out of bounds error with two for loops inside the function

I am trying to use a two-dimensional matrix to produce a two-dimensional result, where
the number of rows and the number of columns are determined by the arguments I pass to the function, so they change every time I change those values.
The function that produces the "subscript out of bounds" error I would like to resolve is the following:
HRC <- function(n,b,c)
{
R=matrix( ,nrow = n*b, ncol = c)
R[0,]=133
for (j in 1:c)
{
r=rnorm(n*b)
for (i in 1:n*b){
R[i+1,j]=R[i,j]+3*b/r[i]
}
}
return(R)
}
HRC(10,1,3)
The error message that I get is the following:
Error in R[i + 1, j] = R[i, j] + 3 * b/r[i] : subscript out of bounds
I wonder how I can resolve this problem. Thank you so much in advance.
R's indexing starts at 1, not 0.
You also have to be careful with the operator precedence rules: the : operator has higher precedence than *. See ?Syntax.
This should work:
HRC <- function(n, b, c) {
R <- matrix(NA, nrow = n*b, ncol = c)
R[1,]=133
for (j in 1:c) {
r = rnorm(n*b)
for (i in 1:(n*b-1)){
R[i+1,j] = R[i,j] + 3*b/r[i]
}
}
return(R)
}
HRC(10,1,3)
The problem is that you loop from row b to row n*b (with stride b, due to the precedence of * and :) and then index to one greater, so you attempt to index row n*b + 1 of R, which is out of bounds.
Assigning to R[0,] gives incorrect results (it silently assigns nothing) but does not elicit an error from R.
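Both points can be checked in a few lines. This is only a sketch: the values n = 3 and b = 2 are chosen so that the stride is visible (with b = 1 the two sequences coincide), and the names stride, full and m are illustrative.

```r
n <- 3
b <- 2

# ':' binds tighter than '*', so 1:n*b is (1:n) * b.
stride <- 1:n * b     # 2 4 6
full   <- 1:(n * b)   # 1 2 3 4 5 6

# Assigning to row 0 is silently ignored: the matrix stays all NA.
m <- matrix(NA_real_, nrow = 2, ncol = 2)
m[0, ] <- 133
all(is.na(m))         # TRUE
```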
I find the code easier to read if you loop from 2 to n*b, the number of rows, and write the formula in terms of creating row i from row i-1 (rather than creating row i+1 from row i).
In addition, you can drop one loop dimension by vectorizing the operations over the rows:
HRC <- function(n, b, c) {
R <- matrix(NA, nrow = n*b, ncol = c)
R[1,] <- 133
r <- matrix(rnorm(n*b*c), ncol=c)
for (i in 2:(n*b)){
R[i,] <- R[i-1,] + 3*b/r[i-1,]
}
return(R)
}
HRC(10,1,3)
Here, the same number of random samples are taken with rnorm but they are formed as a matrix, and used in the same order as used in the question. Note that not all of the random values are actually used in the computation.
If you set a random seed and then run this function and the function in @flodel's answer, you will get identical results. His answer is also correct.
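The claim about identical results can be checked with a sketch like the one below. The names HRC_loop and HRC_vec are introduced here only to keep the two versions apart, and n*b is assumed to be at least 2.

```r
# Column-by-column version with the inner loop.
HRC_loop <- function(n, b, c) {
  R <- matrix(NA_real_, nrow = n * b, ncol = c)
  R[1, ] <- 133
  for (j in 1:c) {
    r <- rnorm(n * b)
    for (i in 1:(n * b - 1)) {
      R[i + 1, j] <- R[i, j] + 3 * b / r[i]
    }
  }
  R
}

# Row-vectorized version: all random draws up front, one column per j.
HRC_vec <- function(n, b, c) {
  R <- matrix(NA_real_, nrow = n * b, ncol = c)
  R[1, ] <- 133
  r <- matrix(rnorm(n * b * c), ncol = c)
  for (i in 2:(n * b)) {
    R[i, ] <- R[i - 1, ] + 3 * b / r[i - 1, ]
  }
  R
}

# Same seed, same draws in the same order, so the results should agree.
set.seed(100)
res_loop <- HRC_loop(10, 1, 3)
set.seed(100)
res_vec <- HRC_vec(10, 1, 3)
all.equal(res_loop, res_vec)
```

The matrix of draws is filled column by column, which is why column j of r matches the j-th call to rnorm(n*b) in the loop version.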
I think you are making three mistakes:
First: You are getting the row range of the loop wrong. It should be 1:(n*b) and not 1:n*b.
Second: In R, indexing starts at 1. So R[0,] should be replaced by R[1,].
Third: Even with the loop bounds 1:c and 1:(n*b), you have to keep track of the indices: R[i+1,j] goes out of bounds when i reaches n*b.
Try this:
set.seed(100)
HRC <- function(n, b, c) {
R <- matrix(0, nrow = n*b, ncol = c)
R[1,] <- 133
for (j in 1:c) {
r <- rnorm(n*b)
for (i in 2:(n*b)){
R[i,j] <- R[i-1,j] + 3*b/r[i-1]
}
}
return(R)
}
HRC(10,1,3)
Lastly, I would like to warn you about interchangeable use of the assignment operators. See here.
