Converting a for loop in R

I would like to convert a for cycle into a faster operation such as apply.
Here is my code
for(a in 1:dim(k)[1]){
for(b in 1:dim(k)[2]){
if( (k[a,b,1,1]==0) & (k[a,b,1,2]==0) & (k[a,b,1,3]==0) ){
k[a,b,1,1]<-1
k[a,b,1,2]<-1
k[a,b,1,3]<-1
}
}
}
It's simple code that checks each position of the multidimensional array k, and if the three elements there are all equal to 0, it assigns them the value 1.
Is there a way to make it faster? The array k has 1,444,000 elements and the loop takes too long to run. Can anyone help?
Thanks

With apply you can return all your 3-combinations as a numeric vector and then check for your specific condition:
# This creates an array with the same structure as yours
# (named arr so that it does not mask the base function array())
arr <- array(data = sample(c(0, 1), 81, replace = TRUE,
prob = c(0.9, 0.1)), dim = c(3, 3, 3, 3))
# This loops over all vectors along the fourth dimension and returns a
# vector of ones if all of its elements are zero
res <- apply(arr, MARGIN = c(1, 2, 3), FUN = function(x) {
if (all(x == 0))
rep(1, length(x))
else
x
})
Note that the MARGIN argument specifies the dimensions over which to loop. You want the vectors along the fourth dimension, so you specify c(1, 2, 3).
apply() puts the values returned by FUN in the first dimension of its result, so res has to be permuted with aperm(res, c(2, 3, 4, 1)) before it lines up with the original array. If you then assign the permuted array to the old one, you have replaced all vectors where the condition is met with ones.
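Since apply() is itself a loop, a fully vectorised version may be faster still. The following is a minimal sketch, assuming k is a 4-dimensional numeric array indexed exactly as in the question; the helper name fill_zero_triples and the simulated data are made up for illustration.

```r
# Hypothetical helper: set k[a, b, 1, 1:3] to 1 wherever all three values are 0
fill_zero_triples <- function(k) {
  # Logical matrix: TRUE at every (a, b) where the three entries are all zero
  all_zero <- k[, , 1, 1] == 0 & k[, , 1, 2] == 0 & k[, , 1, 3] == 0
  for (ch in 1:3) {
    slice <- k[, , 1, ch]   # 2-D slice for this index of the fourth dimension
    slice[all_zero] <- 1    # overwrite only the flagged positions
    k[, , 1, ch] <- slice
  }
  k
}

# Simulated stand-in for the real data
set.seed(1)
k <- array(sample(c(0, 1), 4 * 5 * 2 * 3, replace = TRUE), dim = c(4, 5, 2, 3))
k_new <- fill_zero_triples(k)
```

The only explicit loop left runs three times, once per index of the fourth dimension, regardless of how many rows and columns k has.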

You should first use the filter function twice (composed), and then the apply (lapply?) function on the filtered array. Maybe you can also reduce the array, because it looks like you're not very interested in the third dimension (always accessing the 1st item). You should probably do some reading about functional programming in R here http://adv-r.had.co.nz/Functionals.html
Note that I'm not an R programmer, but I'm quite familiar with functional programming (Haskell etc.), so this might give you an idea. It might be faster, but that depends on how R is designed (lazy or eager evaluation etc.).

Related

Is there a native R syntax to extract rows of an array?

Imagine I have an array in R with N dimensions (a matrix would be an array with 2 dimensions) and I want to select the rows from 1 to n of my array. I was wondering if there was a syntax to do this in R without knowing the number of dimensions.
Indeed, I can do
x = matrix(0, nrow = 10, ncol = 2)
x[1:5, ] # to take the 5 first rows of a matrix
x = array(0, dim = c(10, 2, 3))
x[1:5, , ] # to take the 5 first rows of a 3D array
So far I haven't found a way to use this kind of writing to extract rows of an array without knowing its number of dimensions (obviously if I knew the number of dimensions I would just have to put as many commas as needed). The following snippet works but does not seem to be the most native way to do it:
x = array(0, dim = c(10, 2, 3, 4))
apply(x, 2:length(dim(x)), function(y) y[1:5])
Is there a more R way to achieve this?
Your apply solution is the best, actually.
apply(x, 2:length(dim(x)), `[`, 1:5)
or even better as @RuiBarradas pointed out (please vote his comment too!):
apply(x, -1, `[`, 1:5)
Coming from Lisp, I can say, that R is very lispy.
And the apply solution is a very lispy solution.
And therefore it is very R-ish (a solution following the functional programming paradigm).
Function slice.index() is easily overlooked (as I know to my cost! see magic::arow()) but can be useful in this case:
x <- array(runif(60), dim = c(10, 2, 3))
array(x[slice.index(x,1) %in% 1:5],c(5,dim(x)[-1]))
HTH, Robin
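Another possibility, not taken from the answers above, is to build the argument list for `[` programmatically and call it with do.call(). This is a minimal sketch; the helper name take_rows is hypothetical.

```r
# Hypothetical helper: select rows of an array with any number of dimensions
take_rows <- function(x, rows) {
  n_other <- length(dim(x)) - 1
  # TRUE as an index selects everything along that dimension
  args <- c(list(x, rows), rep(list(TRUE), n_other), list(drop = FALSE))
  do.call(`[`, args)
}

x <- array(seq_len(10 * 2 * 3 * 4), dim = c(10, 2, 3, 4))
first_five <- take_rows(x, 1:5)
dim(first_five)
```

Passing drop = FALSE keeps the number of dimensions unchanged even when a single row is requested.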

R function ifelse returning only evaluation of the last row

I have a 200 x 10 matrix of digits, each either 2 or 7, called predictions.
I'm trying to return the majority for each row using this code.
for (i in 1:nrow(predictions)) {
if_else(mean(predictions[i,] == 7) > 0.5, 7, 2)
}
This returns nothing at all on the console.
If I try to assign a variable to it like so:
for (i in 1:nrow(predictions)) {
if_else(mean(predictions[i,] == 7) > 0.5, 7, 2)
} -> ens
it returns the result for the last row.
If I try to assign a variable to it at the beginning, the variable contains NULL:
ens <- for(i in 1:nrow(predictions)) {
if_else(mean(predictions[i,] == 7) > 0.5, 7, 2)
}
What am I missing?
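Your code is not returning anything because you're not saving the result of each iteration. A for loop itself always returns NULL (invisibly), which is why ens <- for(...) gives NULL. In your second attempt, the -> ens is parsed as part of the loop body, so ens is overwritten on every iteration and ends up holding the result for the last row. To return the majority for each row, the minimal modification is something like this: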
majority <- c()
for(i in 1:nrow(predictions)){
majority[i] <- if_else(mean(predictions[i,]==7)>0.5,7,2)
}
Then, depending on what you wanted to do with the vector of majorities, you could either mutate a row number or bind it to the original predictions matrix.
EDIT
If you want to step away from using for loops, you can use apply statements. If you want to stay in the tidyverse (I see you're using dplyr::if_else()), check out the purrr family of map() functions.
I think you want to be using apply here in row mode:
ens <- apply(predictions, 1, function(x) {
if (mean(x == 7) > 0.5) 7 else 2
})
Note that I am using regular if ... else non vectorized flow logic, since we are dealing with scalar aggregates of each row in the anonymous function being passed to apply().
You generally do not want or need to process matrix or data frame content with for loops. You can compute the proportion of 7s in each row as a vector with the rowMeans function, applied to the logical matrix predictions == 7. Try:
result <- ifelse(rowMeans(predictions == 7) > 0.5, 7, 2)
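As a quick check, the vectorised form can be compared with the loop on simulated data. This is a minimal sketch; the random matrix below only stands in for the real predictions.

```r
# Simulated stand-in for the 200 x 10 matrix of 2s and 7s
set.seed(42)
predictions <- matrix(sample(c(2, 7), 200 * 10, replace = TRUE), nrow = 200)

# Vectorised: proportion of 7s per row, then pick the majority digit
ens_vec <- ifelse(rowMeans(predictions == 7) > 0.5, 7, 2)

# Loop version, storing the result of every iteration
ens_loop <- numeric(nrow(predictions))
for (i in seq_len(nrow(predictions))) {
  ens_loop[i] <- if (mean(predictions[i, ] == 7) > 0.5) 7 else 2
}

identical(ens_vec, ens_loop)
```

Note that a 5-5 tie gives a proportion of exactly 0.5, which is not greater than 0.5, so both versions return 2 in that case.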

What does term in function mean?

Hi, sorry, I'm still learning and slow at understanding function arguments.
I am wondering whether anyone could explain what a certain part of a function means:
x = sum(abs(apply(embed(y, 4), 1, prod)))
On paper it gives the following:
# sum(|y{j}| * |y{j-1}| * |y{j-2}| * |y{j-3}|)
I am wondering what the 1 does. I think embed(y, 4) means y together with its first 3 lags, and I know prod is the product.
This function was written for me, and I am trying to modify it to equal:
# sum(|y{j}|^(3/2) * |y{j-1}|^(3/2) * |y{j-2}|^(3/2) * |y{j-3}|^(3/2))
So basically, to raise the y's to the power 3/2, should my modified function be:
x = sum(abs(apply(embed((y^3/2), 4), 1, prod)))
or to:
x = sum(abs(apply(embed(y, 4), 3/2, prod)))
or another?
Any help?
Thank you in advance for your input
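The 1 is the value of the apply argument called MARGIN. This is why I advocate specifying argument names. The apply function "loops" over the rows (MARGIN = 1) or columns (MARGIN = 2) of data frames, arrays, matrices... The object to loop over (or an expression that evaluates to one) is passed as X, or, if you prefer not to name your arguments, as the first argument. So MARGIN is not the place for the exponent: if you want to raise y to some power, you have to do it before the data reach apply, as in your first attempt. Note, however, that y^3/2 is parsed as (y^3)/2, so you need parentheses, y^(3/2), and because a negative number raised to the power 3/2 gives NaN, the absolute value should be taken first: abs(y)^(3/2).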
In other words, this command will sum all elements across rows:
apply(X = my.object, MARGIN = 1, FUN = sum)
or across columns:
apply(X = my.object, MARGIN = 2, FUN = sum)
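To see how the pieces fit together, here is a minimal sketch on a small made-up vector y; the values are chosen only so that the result can be checked by hand.

```r
y <- c(1, -2, 3, -1, 2, 4)

# embed(y, 4): row i holds y[i+3], y[i+2], y[i+1], y[i]
lagged <- embed(y, 4)

# Original statistic: sum over rows of |product of the four values|
x_orig <- sum(abs(apply(X = lagged, MARGIN = 1, FUN = prod)))  # 6 + 12 + 24 = 42

# Modified statistic: take abs() first, raise to 3/2, then multiply per row
x_mod <- sum(apply(X = embed(abs(y)^(3/2), 4), MARGIN = 1, FUN = prod))
```

Because every factor is already non-negative after abs(y)^(3/2), the outer abs() is no longer needed in the modified version.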

Fortran's do-loop over arbitrary indices like for-loop in R?

I have two p-times-n arrays x and missx, where x contains arbitrary numbers and missx is an array containing zeros and ones. I need to perform recursive calculations on those points where missx is zero. The obvious solution would be like this:
do i = 1, n
do j = 1, p
if(missx(j,i)==0) then
z(j,i) = ... something depending on the previous computations and x(j,i)
end if
end do
end do
The problem with this approach is that missx is 0 most of the time, so there are a lot of if statements that are almost always true.
In R, I would do it like this:
for(i in 1:n)
for(j in which(missx[,i]==0))
z[j,i] <- ... something depending on the previous computations and x[j,i]
Is there a way to do the inner loop like that in Fortran? I did try a version like this:
do i = 1, n
do j = 1, xlength(i) !xlength(i) gives the number of zero-elements in missx(:,i)
j2=whichx(j,i) !whichx(1:xlength(i),i) contains the indices of zero-elements in missx(:,i)
z(j2,i) = ... something depending on the previous computations and x(j2,i)
end do
end do
This seemed slightly faster than the first solution (if not counting the amount of defining xlength and whichx), but is there some more clever way to this like the R version, so I wouldn't need to store those xlength and whichx arrays?
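I don't think you are going to get a dramatic speedup anyway. If you must do the iteration for most items, then storing just the list of those with the 0 value for the whole array is not an option. You can of course use the WHERE or FORALL construct:
forall(i = 1:n, j = 1:p, missx(j,i)==0) z(j,i) = ...
or just
where(missx==0) z = ...
But the usual limitations of these constructs apply: they are array assignments whose right-hand side is evaluated before anything is assigned, so they do not fit a computation in which each z(j,i) depends on elements computed earlier in the same loop.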

unexpected behavior of replace

I need help with the replace() command
replace(c(3,2,2,1),1:3,4:6)
I was expecting an output of 6,5,5,4 but got 4,5,6,1
What am I doing wrong?
My understanding of what replace does is this: it looks up the index of each element of the first argument in the second argument (e.g. 3 is the 3rd element in 1:3) and then replaces it with the element of the third argument that has the same index (e.g. the 3rd element in 4:6 is 6, which is why I expected the first element of the result to be 6).
Thank you. (replace help file doesn't have example... need to ask for clarification here)
While replace doesn't give the behaviour you desired, what you intended is quite easy to achieve using match, where x, i and new are the three arguments of your replace() call:
new[match(x,i)]
It is all given in the description of replace(), just read carefully:
‘replace’ replaces the values in ‘x’ with indices given in ‘list’
by those given in ‘values’. If necessary, the values in ‘values’
are recycled.
x <- c(3, 2, 2, 1)
i <- 1:3
new <- 4:6
so this means in your case:
x[i] <- new
That command says to take the vector c(3, 2, 2, 1) and to replace the components with indices in 1:3 by the values given by the vector 4:6. This gives c(4, 5, 6, 1).
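The two behaviours can be put side by side in a minimal sketch that uses the same three vectors as the question:

```r
x <- c(3, 2, 2, 1)
i <- 1:3
new <- 4:6

# replace(): positions 1:3 of x are overwritten with 4:6
by_position <- replace(x, i, new)   # 4 5 6 1

# match(): each value of x is looked up in i and mapped to the same slot of new
by_value <- new[match(x, i)]        # 6 5 5 4
```

Values of x that do not occur in i would come back as NA from match(), so this lookup assumes every element of x has a counterpart in i.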

Resources