append results of loop into numeric vector - r

I would like to create a numeric vector with the results of a loop such as
> for (i in 1:5) print(i+1)
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6
It seems strange that the same expression without 'print' returns nothing
> for (i in 1:5) i+1
>
Does anyone have an explanation/solution?

This is standard behaviour. When you say you want to create a numeric vector,
print will not do that.
The expression in a for loop is an argument to the primitive function for.
From ?`for` in the value section
for, while and repeat return NULL invisibly. for sets var to the last
used element of seq, or to NULL if it was of length zero.
print prints the results to the console.
for(i in 1:5) i + 1
merely calculates i + 1 for each iteration and returns nothing
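A quick way to check the documented return value is to capture it. This is only a sketch; res is a name introduced here for illustration.

```r
# Capture whatever the for loop itself returns.
res <- for (i in 1:5) i + 1

# The loop evaluated i + 1 five times but kept none of the results.
print(is.null(res))

# As ?`for` says, i is left at the last used element of the sequence.
print(i)
```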
If you want to assign something, then assign it using <- or, less advisably, assign().
You can avoid an explicit loop by using sapply. This should avoid the pitfalls of growing vectors.
results <- sapply(1:5, function(i) { i + 1})
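For comparison, here is a minimal sketch of three ways to collect the values, assuming the result length is known up front. The names n, results_loop, results_vapply and results_vec are made up for this example.

```r
n <- 5

# 1. Preallocate, then assign by index inside the loop.
results_loop <- numeric(n)
for (i in seq_len(n)) {
  results_loop[i] <- i + 1
}

# 2. vapply works like sapply but checks that each result is one numeric value.
results_vapply <- vapply(seq_len(n), function(i) i + 1, numeric(1))

# 3. The arithmetic is already vectorized, so no loop is needed at all.
results_vec <- seq_len(n) + 1

print(results_loop)
```

All three give the same vector; the third is usually the simplest when the operation is plain arithmetic.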

Now frankly, there must be a better solution than this
loopee <- function(x){
res <- vector(mode = "numeric", length = x)
for (i in seq_len(x)) {res[i] <- i+1}
return(res)}
> loopee(5)
[1] 2 3 4 5 6

Related

Can I further vectorize this function

I am relatively new to R, and to matrix-based scripting languages in general. I have written this function to return the indices of each row whose content is similar to the content of any other row. It is a primitive form of spam reduction that I am developing.
if (!require("RecordLinkage")) install.packages("RecordLinkage")
library("RecordLinkage")
# Takes a column of strings, returns a vector of indices
check_similarity <- function(x) {
threshold <- 0.8
values <- NULL
for(i in 1:length(x)) {
values <- c(values, which(jarowinkler(x[i], x[-i]) > threshold))
}
return(values)
}
Is there a way that I could write this to avoid the for loop entirely?
We can simplify the code somewhat using sapply.
# some test data #
x = c('hello', 'hollow', 'cat', 'turtle', 'bottle', 'xxx')
# same threshold as in the question
threshold = 0.8
# create an x by x matrix specifying which strings are alike
m = sapply(x, jarowinkler, x) > threshold
# set diagonal to FALSE: we're not interested in strings being identical to themselves
diag(m) = FALSE
# And find index positions of all strings that are similar to at least one other string
which(rowSums(m) > 0)
# [1] 1 2 4 5
I.e. this returns the index positions of 'hello', 'hollow', 'turtle', and 'bottle' as being similar to another string.
If you prefer, you can use colSums instead of rowSums to get a named vector, but this could be messy if the strings are long:
which(colSums(m) > 0)
#  hello hollow turtle bottle
#      1      2      4      5
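The same matrix idea can be tried without installing RecordLinkage. The sketch below swaps in base R's adist (Levenshtein edit distance) for jarowinkler, so the scores, and any sensible threshold, differ from the original. The names d, longest, sim and similar_idx, the normalisation by the longer string's length, and the 0.6 cutoff are all assumptions made for illustration.

```r
x <- c('hello', 'hollow', 'cat', 'turtle', 'bottle', 'xxx')

# Pairwise edit distances between all strings (x by x matrix).
d <- utils::adist(x)

# Length of the longer string in each pair, used to scale distances to 0..1.
longest <- outer(nchar(x), nchar(x), pmax)

# Turn distance into a similarity score: 1 means identical, 0 means nothing shared.
sim <- 1 - d / longest

# Flag pairs above the (assumed) cutoff and ignore self-comparisons.
m <- sim > 0.6
diag(m) <- FALSE

# Index positions of strings that are similar to at least one other string.
similar_idx <- which(rowSums(m) > 0)
print(similar_idx)
```

Because the similarity measure is different, this will not flag the same pairs as the jarowinkler version at 0.8; the cutoff needs tuning for real data.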

Avoid storing null values when skipping an iteration in a for loop

Is there a way to avoid storing NULL values in an iterative process when some condition triggers a skip to the next iteration? I would like to solve this problem within the structure of the loop itself.
[CONTEXT]:
I am referring to the case where you need a storing mechanism inside a loop together with a conditional statement, and one of the possible paths is not of interest. To handle this at that moment, rather than after the computation, you skip to the next iteration.
[EXAMPLE]
Suppose that, given a certain sequence of numbers, I am only interested in storing the numbers greater than 2 in a list.
storeGreaterThan2 <- function(x){
y <- list()
for (i in seq_along(x)) {
if (x[i] > 2) {
y[[i]] <- x[i]
} else {
next
}
}
y
}
The previous function serves the final purpose, but whenever the condition to skip the iteration is triggered, the skipped index is filled with a NULL value in the final list.
> storeGreaterThan2(1:5)
[[1]]
NULL
[[2]]
NULL
[[3]]
[1] 3
[[4]]
[1] 4
[[5]]
[1] 5
In the spirit of dealing with the problem inside the structure of the loop, how could I deal with that?
This is a rather strange example, and I wonder if it's an XY problem. It may be better to say more about your situation and what you ultimately want to do. For example, there are different ways of doing this depending on whether the function's input will always be an ascending sequence. @Dave2e's comment that there will be better ways depending on what you are really after is right on the mark, in my opinion. At any rate, you can simply remove the NULL elements before you return the list. Consider:
storeGreaterThan2 <- function(x){
y <- list()
for(i in seq_along(x)) {
if(x[i] > 2) {
y[[i]] <- x[i]
} else {
next
}
}
y <- y[!vapply(y, is.null, logical(1))] # keep non-NULL elements; also safe when there is nothing to drop
return(y)
}
storeGreaterThan2(1:5)
# [[1]]
# [1] 3
#
# [[2]]
# [1] 4
#
# [[3]]
# [1] 5
Here is a possible way to do this without ever having stored the NULL element, rather than cleaning it up at the end:
storeGreaterThan2 <- function(x){
y <- list()
l <- 1 # l is an index for the list
for(i in seq_along(x)){ # i is an index for the x vector
if(x[i] > 2) {
y[[l]] <- x[i]
l <- l+1
}
}
return(y)
}
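If the loop is not strictly required, logical subsetting gives the same list without ever creating NULL elements. This is a sketch of an alternative rather than a fix to the loop itself; storeGreaterThan2_vec is a hypothetical name used only here.

```r
# Keep the elements greater than 2, then convert the vector to a list.
storeGreaterThan2_vec <- function(x) {
  as.list(x[x > 2])
}

res <- storeGreaterThan2_vec(1:5)
print(length(res))
```

The result has one list element per kept value, in the original order.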

R programming language LOOPS

y <- vector()
i <- 5
while((2<3)<i){
y[i] <- "Hello World!"
i <- i-1 }
y
I don't understand how the while loop works when the condition is while((2<3)<i). 2<3 is always TRUE, so I end up with TRUE<i. What does this mean? Or am I thinking about it wrong?
I just don't get how the condition of the while loop works; if I understand that, I believe I can work out the rest.
Also another question:
xxx <- function(vec){
n <- length(vec)
for(i in 1:n){
x <- vec[i]
if (vec[i]<x){
x <- vec[i]
}
}
return(x)
}
This xxx function is supposed to output the minimum value of the vector? Okay, I see, but how?
When we enter the loop we first do x <- vec[i]; without doing this we can't pass to the next command, the if statement, right? So since we do x <- vec[i] first, the if statement probably never fires, since x == vec[i] all the time.
Please help, since I have the exam tomorrow :(
1) ?Comparison says, referring to the two arguments of any comparison operator such as < :
If the two arguments are atomic vectors of different types, one is
coerced to the type of the other, the (decreasing) order of precedence
being character, complex, numeric, integer, logical and raw.
In this case we have one logical argument and one numeric argument, so the logical argument is coerced to numeric (FALSE is converted to 0 and TRUE is converted to 1). Thus (2<3)<5 is the same as TRUE < 5, which is the same as 1 < 5, which is TRUE:
(2<3)<5
## [1] TRUE
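Applied to the loop in the question, the condition (2<3)<i is therefore just 1 < i. A short trace, using the question's own code, shows where the loop stops:

```r
# TRUE is coerced to 1 when compared with a number.
print(as.numeric(2 < 3))

y <- vector()
i <- 5
while ((2 < 3) < i) {   # equivalent to while (1 < i)
  y[i] <- "Hello World!"
  i <- i - 1
}

# The body ran for i = 5, 4, 3, 2 and stopped when i reached 1,
# so y[1] was never assigned and stays NA.
print(i)
print(y)
```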
2) For xxx you probably want this:
xxx <- function(vec) {
x <- Inf
for(i in seq_along(vec)) if (vec[i] < x) x <- vec[i]
x
}
The first statement in the body assigns Inf to x. In the second statement, seq_along(vec) is 1, 2, ..., length(vec), so the for loop iterates i over 1, 2, ..., length(vec), with each iteration replacing x with vec[i] if vec[i] is less than x. Note that if vec has zero length then the loop is not run at all, since seq_along(vec) has zero length.
Testing it out:
> xxx(1:3)
[1] 1
> xxx(3:1)
[1] 1
> xxx(numeric(0)) # zero length input
[1] Inf
Of course R already has the min function which does the same thing.
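A small check of that claim, with xxx repeated from the answer so the block runs on its own (v is an example vector chosen here):

```r
# xxx as defined in the answer above.
xxx <- function(vec) {
  x <- Inf
  for (i in seq_along(vec)) if (vec[i] < x) x <- vec[i]
  x
}

v <- c(4, 2, 7)
print(xxx(v) == min(v))

# min() also returns Inf for zero-length input, but it emits a warning,
# which is suppressed here to keep the output clean.
print(suppressWarnings(min(numeric(0))))
```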

adding values to the vector inside for loop in R

I have just started learning R and I wrote this code to learn on functions and loops.
squared<-function(x){
m<-c()
for(i in 1:x){
y<-i*i
c(m,y)
}
return (m)
}
squared(5)
NULL
Why does this return NULL? I want the i*i values to be appended to the end of m and returned as a vector. Can someone please point out what's wrong with this code?
Nothing is ever added to m inside your loop, because the result of c(m, y) is not assigned to anything. m therefore keeps its initial value:
m <- c()
m
# NULL
You can change the function to return the desired values by assigning m in the loop.
squared <- function(x) {
m <- c()
for(i in 1:x) {
y <- i * i
m <- c(m, y)
}
return(m)
}
squared(5)
# [1] 1 4 9 16 25
But this is inefficient because we know the length of the resulting vector will be 5 (or x). So we want to allocate the memory first before looping. This will be the better way to use the for() loop.
squared <- function(x) {
m <- vector("integer", x)
for(i in seq_len(x)) {
m[i] <- i * i
}
m
}
squared(5)
# [1] 1 4 9 16 25
Also notice that I have removed return() from the second function. It is not necessary there, so it can be removed. It's a matter of personal preference to leave it in this situation. Sometimes it will be necessary, like in if() statements for example.
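As an illustration of a case where return() does matter, here is a minimal sketch of an early exit from inside an if() branch. safe_sqrt is a hypothetical function invented for this example.

```r
safe_sqrt <- function(x) {
  if (x < 0) {
    # Leave the function immediately; the code below is skipped.
    return(NA_real_)
  }
  # The last evaluated expression is returned without an explicit return().
  sqrt(x)
}

print(safe_sqrt(9))
print(safe_sqrt(-1))
```

Without return() in the if() branch, the function would carry on and evaluate sqrt(x) for negative input as well.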
I know the question is about looping, but I also must mention that this can be done more efficiently with seven characters using the primitive ^, like this
(1:5)^2
# [1] 1 4 9 16 25
^ is a primitive function, which means the code is written entirely in C and will be the most efficient of these three methods
`^`
# function (e1, e2) .Primitive("^")
Here's a general approach:
# Create empty vector
vec <- c()
for(i in 1:10){
# Inside the loop, make one or more elements to add to vector
new_elements <- i * 3
# Use 'c' to combine the existing vector with the new_elements
vec <- c(vec, new_elements)
}
vec
# [1] 3 6 9 12 15 18 21 24 27 30
If you happen to run out of memory (e.g. if your loop has a lot of iterations or vectors are large), you can try vector preallocation which will be more efficient. That's not usually necessary unless your vectors are particularly large though.
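A preallocated version of the same loop might look like the sketch below. It assumes each iteration produces exactly one element, so the final length n is known in advance.

```r
n <- 10

# Allocate the full vector once instead of growing it with c() on every pass.
vec <- numeric(n)
for (i in seq_len(n)) {
  vec[i] <- i * 3
}
print(vec)
```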

indexing through values of a nested list using mapply

I have a list of lists, with each sub-list containing 3 values. My goal is to cycle through every value of this nested list in a systematic way (i.e. start with list 1, go through all 3 values, go to list 2, and so on), applying a function to each. But my function hits missing values and breaks and I've traced the problem to the indexing itself, which doesn't behave in the way I am expecting. The lists are constructed as:
pop <- 1:100
treat.temp <- NULL
treat <- NULL
## Generate 5 samples of pop
for (i in 1:5){
treat.temp <- sample(pop, 3)
treat[[i]] <- treat.temp
}
## Create a list with which to index mapply
iterations <- (1:5)
Illustrative function and results.
test.function <- function(j, k){
for (n in 1:3){
print(k[[n]][j])
}
}
results <- mapply(test.function, iterations, treat)
[1] 61
[1] 63
[1] 73
[1] NA
[1] NA
[1] NA
[1] NA
[1] NA
<snipped>
For the first cycle through 'j', this works. But after that it throws NAs. But if I do it manually, it returns the values I would expect.
> print(treat[[1]][1])
[1] 61
> print(treat[[1]][2])
[1] 63
> print(treat[[1]][3])
[1] 73
> print(treat[[2]][1])
[1] 59
> print(treat[[2]][2])
[1] 6
> print(treat[[2]][3])
[1] 75
<snipped>
I'm sure this is a basic question, but I can't seem to find the right search terms to find an answer here or on Google. Thanks in advance!
Edited to Add: MrFlick's answer works well for my problem. I have multiple list inputs (hence mapply) in my actual use. A more detailed example, with a few notes.
pop <- 1:100
years <- seq.int(2000, 2014, 1)
treat.temp <- NULL
treat <- NULL
year.temp <- NULL
year <- NULL
## Generate 5 samples of treated states, control states and treatment years
for (i in 1:5){
treat.temp <- sample(pop, 20)
treat[[i]] <- treat.temp
year.temp <- sample(years, 1)
year[[i]] <- year.temp
}
## Create a list with which to index mapply
iterations <- (1:5)
## Define function
test.function <- function(j, k, l){
for (n in 1:3){
## Cycles treat through each value of jXn
print(k[n])
## Holds treat (k) fixed for each 3 cycle set of n (using first value in each treat sub-list); cycles through sub-lists as j changes
print(k[1])
## Same as above, but with 2nd value in each sub-list of treat
print(k[2])
## Holds year (l) fixed for each 3 cycle set of n, cycling through values of year each time j changes
print(l[1])
## Functionally equivalent to
print(l)
}
}
results <- mapply(test.function, iterations, treat, year)
Well, you might be misunderstanding how mapply works. The function is called once per position, stepping through both of the arguments you pass in parallel, which means treat is also subset on each iteration. Essentially, the function calls being made are
test.function(iterations[1], treat[[1]])
test.function(iterations[2], treat[[2]])
test.function(iterations[3], treat[[3]])
...
and you seem to treat the k variable as if it were the entire list. You also have your indexes backwards. But just to get your test working, you can do
test.function <- function(j, k){
for (n in 1:3) print(k[n])
}
results <- mapply(test.function, iterations, treat)
but this isn't really a super awesome way to iterate a list. What exactly are you trying to accomplish?
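To make the pairing concrete, here is a small sketch with fixed values instead of random samples, so the result is reproducible. The data and the names first_vals and all_vals are made up for illustration.

```r
# Two sub-vectors standing in for the sampled groups.
treat <- list(c(61, 63, 73), c(59, 6, 75))
iterations <- 1:2

# On call j, mapply passes iterations[j] as j and treat[[j]] as k,
# so k is already a single sub-vector, not the whole list.
first_vals <- mapply(function(j, k) k[1], iterations, treat)
print(first_vals)

# If the goal is just to visit every value in order, flattening is enough.
all_vals <- unlist(treat)
print(all_vals)
```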

Resources