Query for loops in R - r

I am working with some data in R.
a<-rep(NA,400)
for(i in 1:10){for(j in 0:40){print(dat$V2[i]-j)}}
Instead of printing, I want to store each value in an empty array (a). I would be thankful if someone could help me with this.

In a case like this (nested loops), the easiest way is often to add a counter that keeps track of the position in the array. Note that j in 0:40 takes 41 values, so the loop produces 10 * 41 = 410 results, not 400:
a <- rep(NA, 10 * 41)
counter <- 1
for(i in 1:10){
  for(j in 0:40){
    a[counter] <- dat$V2[i]-j
    counter <- counter + 1
  }
}

Here is one way:
#a<-rep(NA,400)
#for(i in 1:10){for(j in 0:40){print(dat$V2[i]-j)}}
a <- as.numeric( sapply( 1:10, function(i){
  sapply( 0:40, function(j) {
    dat$V2[i]-j
  })
}))
sapply is useful here because, contrary to a for loop, it returns a value for each iteration. So I first loop over 1:10, like your outer for loop, with the useful difference that each iteration returns something.
What each iteration does is run a nested block, also using sapply, this time looping over 0:40 and returning the expression you have innermost in your for loop.
So for each of 1:10 it loops over 0:40, calculating and returning your expression each time. The result is a matrix with one column per value of i.
as.numeric is wrapped around it to flatten that matrix into one long vector, which seems to be what you want.
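To see what the as.numeric wrapper does, here is a small sketch on made-up data (the data frame dat below is a hypothetical stand-in for the asker's data, shrunk to 3 rows and j in 0:4). It suggests that the flattened result comes out in the same order as the counter-based loop from the other answer:

```r
# Hypothetical stand-in for the asker's data
dat <- data.frame(V2 = c(100, 200, 300))

m <- sapply(1:3, function(i) {
  sapply(0:4, function(j) dat$V2[i] - j)
})
dim(m)              # 5 rows (one per j), 3 columns (one per i)

a <- as.numeric(m)  # flattened column by column: all j for i = 1, then i = 2, ...

# Same computation with an explicit counter
b <- numeric(15)
counter <- 1
for (i in 1:3) {
  for (j in 0:4) {
    b[counter] <- dat$V2[i] - j
    counter <- counter + 1
  }
}
identical(a, b)
```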

A way using only loops:
for(i in 1:10){
  for(j in 0:39){
    print(i*40 - 39 + j)  # show the index being filled
    a[i*40 - 39 + j] <- dat$V2[i]-j
  }
}
PS: since you want to create a vector with 400 observations, and i goes from 1 to 10, j needs to take 40 values, so you want it to be 0:39.
When you want to fill a (one-dimensional) vector with a double loop, and j starts at 0, the following formula normally applies:
index of the vector = i*length(j) - (length(j)-1) + j
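As a quick check of that index formula, here is a minimal sketch. The data frame dat and the name n_j are made up for illustration; n_j plays the role of length(j) in the formula.

```r
# Hypothetical data: 10 values, 1000, 1100, ..., 1900
dat <- data.frame(V2 = seq(1000, by = 100, length.out = 10))

n_j <- 40                        # number of values j takes (0:39)
a <- rep(NA_real_, 10 * n_j)
for (i in 1:10) {
  for (j in 0:(n_j - 1)) {
    idx <- i * n_j - (n_j - 1) + j   # equivalent to (i - 1) * n_j + j + 1
    a[idx] <- dat$V2[i] - j
  }
}
anyNA(a)   # FALSE means every position from 1 to 400 was filled
```

If any NA were left over, the formula would have skipped a position or written to the same one twice.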

Related

How to take results of a function and apply it to function again in R?

I am aware this is a very basic question and am sorry to take up everyone's time. I created a function and would like to take its result and apply the function to it again (I am trying to model growth).
I don't think I want to use a loop because I need the values to come from the function. I also don't think it's apply because I need to extract the values from the function.
Here is my function
initial<-c(36.49)
second<-NULL
growth <- function(x){
second <- (131.35-(131.35 -x)*exp(-0.087))
}
second<-growth(initial)
third<-growth(second)
fourth<-growth(third)
fifth<-growth(fourth)
sixth<-growth(fifth)
seventh<-growth(sixth)
Here is how I am doing it now, but as you can see I would have to keep doing this over and over again.
You can use a loop. Just store the outputs in a vector:
# initial value
initial <- c(36.49)
# not needed, I think
# second <- NULL
# create a holding vector for the results
values <- vector()
# assign the starting value
values[1] <- initial
# your function
growth <- function(x){
  second <- (131.35-(131.35 -x)*exp(-0.087))
}
# start the loop at 2, since values[1] is already set
for(i in 2:7){
  # access the previous value using i - 1,
  # then store the result at the next index, which is i
  values[i] <- growth(values[i - 1])
}
This should give the same results as your chain of calls.
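If you would rather avoid the explicit loop, base R's Reduce with accumulate = TRUE is one alternative. This is only a sketch, not something from the answers above; the argument name step is a placeholder that is deliberately ignored.

```r
growth <- function(x) 131.35 - (131.35 - x) * exp(-0.087)
initial <- 36.49

# Apply growth() six times in a row, keeping every intermediate value.
# Each call receives the previous result; `step` is only a counter.
values <- Reduce(function(prev, step) growth(prev), seq_len(6),
                 init = initial, accumulate = TRUE)
values
```

The result has seven elements: the initial value followed by second through seventh.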
Something along these lines might help:
x <- 1
# named increment rather than try, to avoid masking base::try
increment <- function(x) x <<- x+1
for(i in 1:5) increment(x)

How to append a vector to a vector r - in a vectorized style

We all know that appending a vector to a vector within a for loop in R is a bad thing because it costs time. A solution would be to do it in a vectorized style. Here is a nice example by Joshua Ulrich. It is important to first create a vector with known length and then fill it up, instead of appending each new piece to an existing piece within the loop.
Still, his example 'only' demonstrates how to append one piece of data at a time. I am now struggling with how to fill a vector with vectors, not scalars.
Imagine I have a vector with a length of 100
vector <- numeric(length=100)
and a smaller vector that would fit 10 times into the first vector
vec <- seq(1,10,1)
How would I have to construct a loop that adds the smaller vector to the large vector without using c() or append ?
EDIT: This example is simplified - vec does not always consist of the same sequence but is generated within a for loop and should be added to vector.
You could just use normal vector indexing within the loop to accomplish this:
vector <- numeric(length=100)
for (i in 1:10) {
vector[(10*i-9):(10*i)] <- 1:10
}
all.equal(vector, rep(1:10, 10))
# [1] TRUE
Of course if you were just trying to repeat a vector a certain number of times rep(vec, 10) would be the preferred solution.
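For the case in the EDIT, where the piece changes on every iteration, the same indexing idea can be sketched as follows. The names out, chunk, and start are made up here, and the way vec is generated is just a placeholder for whatever the real loop computes.

```r
chunk <- 10                 # length of each piece (assumed constant)
out <- numeric(100)         # preallocated result
for (i in 1:10) {
  vec <- i + 0:(chunk - 1)          # placeholder: a piece that differs per iteration
  start <- (i - 1) * chunk + 1      # first position of the i-th block
  out[start:(start + chunk - 1)] <- vec
}
out[1:10]    # first piece
out[11:20]   # second piece
```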
A similar approach, perhaps a little more clear if your new vectors are of variable length:
# Let's over-allocate so that we know the big vector is big enough
big_vec = numeric(1e4)
this.index = 1
for (i in 1:10) {
# Generate a new vector of random length
new_vec = runif(sample(1:20, size = 1))
# Stick it in big_vec by index
big_vec[this.index:(this.index + length(new_vec) - 1)] = new_vec
# update the starting index
this.index = this.index + length(new_vec)
}
# truncate to only include the added values
big_vec = big_vec[1:(this.index - 1)]
As @josilber suggested in comments, lists would be more R-ish. This is a much cleaner approach, unless the new vector generation depends on the previous vectors, in which case the for loop might be necessary.
vec_list = list()
for (i in 1:10) {
# Generate a new vector of random length
vec_list[[i]] = runif(sample(1:20, size = 1))
}
# Or, use lapply
vec_list = lapply(1:10, FUN = function(x) {runif(sample(1:20, size = 1))})
# Then combine with do.call
do.call(c, vec_list)
# or more simply, just unlist
unlist(vec_list)

Explaining a for loop in R

I'm very new to R, and even newer to programming in R. I have the following question and its answer (which is not mine). I've been trying to understand where some of the values come from and why they are used.
Question: Make the vector 3 5 7 9 11 13 15 17 with a for loop. Start
with x=numeric() and fill this vector with the for loop
I know I have to create x=numeric() so I can fill it with the result obtained from the loop.
The answer from a classmate was:
> x <- numeric()
> for(i in 1:8){
    if(i==1){ ## Why ==1 and not 0, or any other value
      x[i] <- 3
    }else{
      x[i] <- x[i-1]+2 ### And why i-1
    }
  }
I'm having similar problems in questions like:
Make a for loop that adds the second element of a vector to the first,
subtracts the third element from the result, adds the fourth again and
so on for the entire length of the vector
So far, I created the vector and the empty vector
> y = c(5, 10, 15, 20, 25, 30)
> answer <- 0
And then, when I try to do the for loop, I get stuck here:
for(i in 1:length(y)){
if(i...){ ### ==1? ==0?
answer = y[i] ###and here I really don't know how to continue.
}else if()
}
Believe me when I tell you I've read several replies to questions here, like in How to make a vector using a for loop, plus pages and pages about for loop, but cannot really figure how to solve these (and other) problems.
I repeat, I'm very new, so I'm struggling trying to understand it. Any help would be much appreciated.
First, I will annotate the loop to explain what it is doing.
# Initialize the vector
x <- numeric()
for(i in 1:8){
  # Initialize the first element of the vector, x[1]. Remember, R indexes start at 1, not 0.
  if(i==1){
    x[i] <- 3
  } else {
    # Define each additional element in terms of the previous one (x[i - 1]
    # is the element of x before the current one).
    x[i] <- x[i-1]+2
  }
}
A better solution that still uses a loop and grows the vector (as the instructions state) is something like this:
x <- numeric()
for(i in 1:8){
x[i] <- 2 * i + 1
}
This is still not a good way to do things because growing a vector inside a loop is very slow. To fix this, you can preallocate the vector by telling numeric the length of the vector you want:
x <- numeric(8)
The best way to solve this would be:
2 * 1:8 + 1
using vectorized operations.
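Putting the preallocation advice together with the loop gives something like the sketch below; the final line checks it against the vectorized version.

```r
x <- numeric(8)              # preallocate eight zeros instead of growing x
for (i in seq_along(x)) {
  x[i] <- 2 * i + 1
}
x
identical(x, 2 * 1:8 + 1)    # compare with the vectorized result
```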
To help you solve your other problem, I suggest writing out each step of the loop as a table. For example, for my solution, the table would be
i | x[i]
------------------
1 | 2 * 1 + 1 = 3
2 | 2 * 2 + 1 = 5
and so on. This will give you an idea of what the for loop is doing at each iteration.
This is intentionally not an answer, because there are better ways to solve the alternating-sign summation problem than a for loop. I suppose there could be value in getting comfortable with for loops, but the vectorized approaches in R should be learned as well. R has "argument recycling" for many of its operations, including the "*" (multiplication) operation. Look at:
(1:10)*c(1,-1)
Then take an arbitrary vector, say vec and try:
sum( vec*c(1,-1) )
The more correct answer after looking at that result would be:
vec[1] + sum( vec[-1]*c(1,-1) )
which has the educational advantage of illustrating R's negative indexing. Look up "argument recycling" in your documentation. The shorter objects are automagically duplicated/triplicated/however-many-needed-cated to match the length of the longest vector in the mathematical or logical expression.
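For comparison, here is a sketch of both approaches on the vector y from the question. The loop is one possible way to fill in the asker's skeleton, not the only one. In the vectorized line, rep_len() spells out the recycling explicitly, which avoids the warning R gives when the longer length is not a multiple of the shorter one.

```r
y <- c(5, 10, 15, 20, 25, 30)

# Loop version: start from the first element, then alternate + and -
answer <- y[1]
for (i in 2:length(y)) {
  if (i %% 2 == 0) {
    answer <- answer + y[i]   # even positions are added
  } else {
    answer <- answer - y[i]   # odd positions (from the third on) are subtracted
  }
}

# Vectorized version with the sign pattern written out
signs <- rep_len(c(1, -1), length(y) - 1)
vectorized <- y[1] + sum(y[-1] * signs)

answer == vectorized
```

Both compute 5 + 10 - 15 + 20 - 25 + 30.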

How to print the name of current row when using apply in R?

For example, I have a matrix k
> k
  d e
a 1 3
b 2 4
I want to apply a function on k
> apply(k,MARGIN=1,function(p) {p+1})
  a b
d 2 3
e 4 5
However, I also want to print the name of the row being processed, so that I know which row the function is being applied to at that time.
It might look like this:
apply(k,MARGIN=1,function(p) {print(rowname(p)); p+1})
But I really don't know how to do that in R.
Does anyone have any idea?
Here's a neat solution to what I think you're asking. (I've called the input matrix mat rather than k for clarity - in this example, mat has 2 columns and 10 rows, and the rows are named abc1 through to abc10.)
In the code below, the result out1 is the thing you wanted to calculate (the outcome of the apply command). The result out2 comes out identically to out1 except that it prints out the rownames that it is working on (I put in a delay of 0.3 seconds per row so you can see it really does do this - take this out when you want the code to run full speed obviously!)
The trick I came up with was to cbind the row numbers (1 to n) onto the left of mat (to create a matrix with one additional column), and then use this to refer back to the rownames of mat. Note the line x = y[-1] which means that the actual calculation within the function (here, adding 1) ignores the first column of row numbers, which means it's the same as the calculation done for out1. Whatever sort of calculation you want to perform on the rows can be done this way - just pretend that y never existed, and formulate your desired calculation using x. Hope this helps.
set.seed(1234)
mat = as.matrix(data.frame(x = rpois(10,4), y = rpois(10,4)))
rownames(mat) = paste("abc", 1:nrow(mat), sep="")
out1 = apply(mat,1,function(x) {x+1})
out2 = apply(cbind(seq_len(nrow(mat)),mat),1,
  function(y) {
    x = y[-1]
    cat("Doing row:",rownames(mat)[y[1]],"\n")
    Sys.sleep(0.3)
    x+1
  }
)
identical(out1,out2)
You can use a variable outside of the apply call to keep track of the row index and pass the row names as an extra argument to your function:
idx <- 1
apply(k, 1, function(p, rn) {print(rn[idx]); idx <<- idx + 1; p + 1}, rownames(k))
This should work. The cat() function is what you want to use when printing results during evaluation of a function. paste(), conversely, just returns a character vector but doesn't send it to the command window.
The solution below uses a counter created as a closure, allowing it to "remember" how many times the function has been run before. Note the use of the superassignment operator <<-, which modifies i in the enclosing function's environment rather than creating a local copy. If you really want to understand what's going on here, I recommend reading through this wiki https://github.com/hadley/devtools/wiki/
Note there may be an easier way to do this; my solution assumes that there is no way to access the rownumber or rowname of a current row using typical means within an apply function. As previously mentioned, this would be no problem in a loop.
k <- matrix(c(1,2,3,4),ncol=2)
rownames(k) <- c("a","b")
colnames(k) <- c("d","e")
make.counter <- function(){
  i <- 0
  function(){
    i <<- i+1
    i
  }
}
counter1 <- make.counter()
apply(k,MARGIN=1,function(p){
  current.row <- rownames(k)[counter1()]
  cat(current.row,"\n")
  return(p+1)
})
As far as I know you cannot do that with apply, but you could loop through the rownames of your data frame. Lame example:
lapply(rownames(mtcars), function(x) sprintf('The mpg of %s is %s.', x, mtcars[x, 1]))
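Applied to the matrix k from the question, the same idea of iterating over the row names might look like the sketch below. It uses sapply rather than apply, so the row name is available directly inside the function.

```r
k <- matrix(c(1, 2, 3, 4), ncol = 2,
            dimnames = list(c("a", "b"), c("d", "e")))

res <- sapply(rownames(k), function(rn) {
  cat("Doing row:", rn, "\n")   # rn is the name of the current row
  k[rn, ] + 1                   # the actual per-row calculation
})
res
```

As with apply(k, 1, ...), the result has one column per row of k.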

missing value where TRUE/FALSE needed error in R

I have got a column with different numbers (from 1 to tt) and would like to use looping to perform a count on the occurrence of these numbers in R.
count = matrix(ncol=1,nrow=tt) #creating an empty matrix
for (j in 1:tt)
{count[j] = 0} #initiate count at 0
for (j in 1:tt)
{
for (i in 1:N) #for each observation (1 to N)
{
if (column[i] == j)
{count[j] = count[j] + 1 }
}
}
Unfortunately I keep getting this error.
Error in if (column[i] == j) { :
missing value where TRUE/FALSE needed
So I tried:
for (i in 1:N) #from obs 1 to obs N
if (column[i] == 1) print("Test")
I basically got the same error.
I tried to do a bit of research on this kind of error, and a lot of it talks about "debugging", which I'm not familiar with.
Hopefully someone can tell me what's happening here. Thanks!
As you progress with your learning of R, one feature you should be aware of is vectorisation. Many operations that (in C, say) would have to be done in a loop can be done all at once in R. This is particularly true when you have a vector/matrix/array and a scalar, and want to perform an operation between them.
Say you want to add 2 to the vector myvector. The C/C++ way to do it in R would be to use a loop:
for ( i in 1:length(myvector) )
myvector[i] = myvector[i] + 2
Since R has vectorisation, you can do the addition without a loop at all, that is, add a scalar to a vector:
myvector = myvector + 2
Vectorisation means the loop is done internally. This is much more efficient than writing the loop within R itself! (If you've ever done any Matlab or python/numpy it's much the same in this sense).
I know you're new to R so this is a bit confusing but just keep in mind that often loops can be eliminated in R.
With that in mind, let's look at your code:
The initialisation of count to 0 can be done at creation, so the first loop is unnecessary.
count = matrix(0,ncol=1,nrow=tt)
Secondly, because of vectorisation, you can compare a vector to a scalar.
So for your inner loop in i, instead of looping through column and doing if column[i]==j, you can do idx = (column==j). This returns a vector that is TRUE where column[i]==j and FALSE otherwise.
To find how many elements of column are equal to j, we just count how many TRUEs there are in idx. That is, we do sum(idx).
So your double-loop can be rewritten like so:
for ( j in 1:tt ) {
idx = (column == j)
count[j] = sum(idx) # no need to add
}
Now it's even possible to remove the outer loop in j by using the function sapply:
sapply( 1:tt, function(j) sum(column==j) )
The above line of code means: "for each j in 1:tt, return function(j)", and returns a vector where the j'th element is the result of the function.
So in summary, you can reduce your entire code to:
count = sapply( 1:tt, function(j) sum(column==j) )
(Although this doesn't explain your error, which I suspect is to do with the construction or class of your column).
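One common cause of this particular error, though not confirmed for the asker's data, is a missing value: comparing NA with a number gives NA, and if() cannot decide on NA. The same happens when i runs past the end of column, since indexing beyond the length also yields NA. A minimal sketch on made-up data:

```r
column <- c(1, 2, NA, 2, 1, 1)   # made-up data with one missing value
tt <- 2

# NA == 1 is NA; using it as an if() condition stops with
# "missing value where TRUE/FALSE needed"
is.na(column[3] == 1)

# Counting with na.rm = TRUE skips the missing entries
count <- sapply(1:tt, function(j) sum(column == j, na.rm = TRUE))
count
```

Checking anyNA(column) and length(column) against N is a quick way to see whether either applies.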
I suggest not using for loops, but instead using the count function from the plyr package. This function does exactly what you want in one line of code.
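If installing plyr is not an option, base R's table() gives similar counts. This is a sketch on made-up data; wrapping the values in factor() with explicit levels is my own choice here, so that numbers which never occur still show up with a count of 0.

```r
column <- c(1, 3, 3, 2, 3, 1)   # made-up data
tt <- 4

counts <- table(factor(column, levels = seq_len(tt)))
counts
as.vector(counts)   # plain integer vector, one entry per value 1..tt
```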

Resources