I am attempting to create a loop that runs a function with specific values of i in a vector:
For example, I would like to save i + 2 when i is 1 and when i is 5:
test<- c()
for(i in c(1,5)){
  test[i] <- i + 2
}
This ends up producing NA at positions 2, 3 and 4:
[1] 3 NA NA NA 7
while the result I would like is:
[1] 3 7
This is probably very elementary but I cannot seem to figure this out.
R is vectorized, which means you can do this:
c(1, 5) + 2
# [1] 3 7
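Explicit for loops in R can be slow, particularly when they grow a vector one element at a time. This is one reason the functions of the *apply family, whose iteration is implemented in C, are often used instead (the speed gain is usually modest, but the code is more compact), e.g.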
sapply(c(1, 5), \(i) i + 2)
# [1] 3 7
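A type-stable variant of the sapply call, as a minimal sketch: vapply is a base R function that additionally takes the expected shape of each result (here numeric(1)) and raises an error if a result does not match it. The names vals and res are illustrative only.

```r
# Values to transform (illustrative name, not from the question)
vals <- c(1, 5)

# vapply checks that every result is a single numeric value
res <- vapply(vals, function(i) i + 2, numeric(1))
res
# [1] 3 7
```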
If you really need to rely on a for loop, loop over the indices of the vector rather than over its values (indexing the result by the values themselves is a quite common mistake!):
v <- c(1, 5)
test <- vector('numeric', length(v))
for (i in seq_along(v)) {
  test[i] <- v[i] + 2
}
test
# [1] 3 7
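To see where the NAs in the question come from, here is a small sketch (the name padded is illustrative): assigning to position 5 of an empty vector extends it to length 5, and the positions that were never assigned are filled with NA.

```r
# Start from an empty vector, as in the question
padded <- c()

# Assigning to position 5 forces the vector to length 5;
# the untouched positions 1 to 4 are filled with NA
padded[5] <- 7
padded
# [1] NA NA NA NA  7
length(padded)
# [1] 5
```

This is what happens in the original loop when i is 5: test[5] is written while test only has one element.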
Use append(), which adds each result to the end of the vector regardless of the value of i:
test <- c()
for (i in c(1, 5)) {
  test <- append(test, i + 2)
}
I'm writing a function to analyse .csv files in a directory on my hard drive, using a series of for and while loops (I know for loops are unpopular in R, but they're good for what I need).
The function creates a number of data-frames, and performs actions on each one in turn before overwriting them and moving on to the next file in the directory to repeat the action.
The part of the code that does not work so far is the creation of a matrix from vectors taken from the data files being analysed. A simplified version of the code is shown below:
data1 <- seq(1, 10, 1)
data2 <- seq(1, 7, 1)
data3 <- seq(1, 5, 1)
n <- max(length(data1), length(data2), length(data3))
k <- c(1, 2, 3)
for(a in k){
  if(a == 1){
    length(get(paste("data", a, sep = ""))) <- n
    data_matrix <- get(paste("data", a, sep = ""))
  }else{
    while(exists(paste("data", a, sep = ""))){
      length(get(paste("data", a, sep = ""))) <- n
      data_matrix <- cbind(data_matrix, get(paste("data", a, sep = "")))
    }
  }
}
The nature of my data is that the length of the columns in my datasets varies with each data collection, so I've adapted a technique found in this post that uses cbind to bind objects of different lengths without recycling the data in the shorter objects.
The issue I have when trying to implement this code is I get the error message:
Error in length(get(paste("data", a, sep = ""))) <- n :
target of assignment expands to non-language object
I'm guessing the issue is that the function get() cannot be used to select items in the Global Environment and to modify them in this way.
You could use:
get("x")[1:n]
to get a vector called "x" padded with NA to length n.
That is:
> x=1:3
> n=10
> get("x")[1:n]
[1] 1 2 3 NA NA NA NA NA NA NA
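As for the error itself: the target of a replacement call such as length(x) <- n has to be a variable name (optionally wrapped in indexing or other replacement functions), not the result of a function call such as get(). If the variable really has to be modified by name, one possible sketch is to go through a temporary copy and assign(). The names nm and tmp are hypothetical and used only for illustration.

```r
data2 <- seq(1, 7, 1)
n <- 10

nm <- paste0("data", 2)   # name of the variable to modify (hypothetical helper)
tmp <- get(nm)            # copy the value out
length(tmp) <- n          # pad the copy with NA up to length n
assign(nm, tmp)           # write the padded copy back under the same name

data2
# [1]  1  2  3  4  5  6  7 NA NA NA
```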
Having said that, this is a neater way to get the matrix you want (hopefully you can adapt to your scenario):
> datalist <- list(data1, data2, data3)
> maxlength <- max(lengths(datalist))
> sapply(datalist, function(x) x[1:maxlength] )
      [,1] [,2] [,3]
 [1,]    1    1    1
 [2,]    2    2    2
 [3,]    3    3    3
 [4,]    4    4    4
 [5,]    5    5    5
 [6,]    6    6   NA
 [7,]    7    7   NA
 [8,]    8   NA   NA
 [9,]    9   NA   NA
[10,]   10   NA   NA
For those who want to see how the solution proposed by @GeorgeSavva looks using the loop method that I am employing (my loop contained additional errors):
data1 <- seq(1, 10, 1)
data2 <- seq(1, 7, 1)
data3 <- seq(1, 5, 1)
n <- max(length(data1), length(data2), length(data3))
k <- c(1, 2, 3)
for(a in k){
  if(a == 1){
    data_matrix <- get(paste("data", a, sep = ""))[1:n]
  }else{
    data_matrix <- cbind(data_matrix, get(paste("data", a, sep = ""))[1:n])
  }
}
The while loop was unnecessary. I have written my code this way to keep it as versatile as possible, because every day I obtain a varying number of datasets, each of a varying size.
I can use common operations on each dataset, so I can write a function that will tidy the data, construct charts and compare the datasets automatically without having to write new commands for each analysis.
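When the number of datasets changes from day to day, the get()/paste() loop can also be replaced by collecting the vectors by name in one step. The following is only a sketch: it assumes the vectors are called data1, data2, ... in the current environment, and nms and datalist are illustrative names.

```r
data1 <- seq(1, 10, 1)
data2 <- seq(1, 7, 1)
data3 <- seq(1, 5, 1)

# Build the variable names and fetch all vectors into a named list
nms <- paste0("data", 1:3)
datalist <- mget(nms, envir = environment())

# Pad every vector with NA to the longest length and bind as columns
n <- max(lengths(datalist))
data_matrix <- sapply(datalist, function(x) x[seq_len(n)])

dim(data_matrix)
# [1] 10  3
colnames(data_matrix)
# [1] "data1" "data2" "data3"
```

Changing 1:3 to the actual range of dataset numbers is the only adjustment needed when more vectors are present.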
Suppose I have the following R loop:
for(i in 1:5){
  print(i)
  i = i + 1
}
This produces the following result:
1
2
3
4
5
Which is weird since I did redefine the index inside the loop.
Why do we see this behavior?
I thought I would see something like the following:
1
3
4
5
6
Assignment to the looping variable i is discarded on the next loop iteration, so i = i + 1 has no effect. If you want to change the value of i within the loop, you can use a while loop.
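However, your intended output of 1 3 4 5 6 doesn't make sense for a couple of reasons:
- the assignment to i is discarded, as already mentioned;
- if the increment did take effect, it would apply on every iteration and not only the first, so under C-like semantics you would expect 1 3 5; and
- the domain of i, 1:5, is locked in at the first pass of the loop, so 6 can never be reached.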
Having said that, try this:
i <- 1
lim <- 5
while (i <= lim) {
  if (i == 2) {
    lim <- lim + 1
  } else {
    print(i)
  }
  i <- i + 1
}
# [1] 1
# [1] 3
# [1] 4
# [1] 5
# [1] 6
(My only caution here is that if your increment of i is conditional on anything, there needs to be something else that prevents this from being an infinite loop. Perhaps this caution is unnecessary in your use-case, but it seems relevant to at least mention it.)
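The for-loop behaviour described above can be checked with a small sketch (x and seen are illustrative names): the sequence is evaluated once before the first iteration, and i is reassigned from it at the start of every pass, so changes to either inside the body do not alter the iterations.

```r
x <- 1:3
seen <- integer(0)

for (i in x) {
  seen <- c(seen, i)   # record the value i has at the start of the iteration
  i <- i + 10L         # overwritten when the next iteration begins
  x <- c(x, 99L)       # has no effect on the number of iterations
}

seen
# [1] 1 2 3
i
# [1] 13
length(x)
# [1] 6
```

Only the last assignment to i survives, because no further iteration overwrites it.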
In a for loop, i takes the next value of the sequence at the start of every iteration, so it cannot be changed permanently with i = i + 1. We may use an if condition instead:
for(i in 1:5){
  if(i != 2)
    print(i)
}
Output:
[1] 1
[1] 3
[1] 4
[1] 5
Also, if the intention is to subset a vector, why not use a vectorized option?
v1 <- 1:5
v1[v1 != 2]
Good morning,
I have the following problem.
My data.frame "data" has the format:
Type amount
1 2
2 0
3 3
I would like to create a vector with the format:
1
1
3
3
3
This means I would like to transform my data.
I created a vector and wrote the following code for my transformation in R:
vector <- numeric(5)
for (i in 1:3){
  k <- 1
  while (k <= data[i,2]){
    vector[k] <- data[i,1]
    k <- k+1
  }
}
The problem is that I get the following result, and I have no idea where I went wrong:
3
3
3
0
0
There might be many different ways of solving this particular problem in R, but I am curious why my solution doesn't work. I am thankful for alternatives, but I would really like to know what my mistake is.
Thanks for your help!
Try this solution:
df <- data.frame(type = c(1, 2, 3), amount = c(2, 0, 3))
result <- unlist(mapply(function(x, y) rep.int(x, y), df[, "type"], df[, "amount"]))
result
The output is as follows:
# [1] 1 1 3 3 3
Your code is indeed buggy. The corrected code should look like this:
df <- data.frame(type = c(1, 2, 3), amount = c(2, 0, 3))
vector <- numeric(5)
k <- 1
for (i in 1:3) {
  j <- 1
  while (j <= df[i, 2]) {
    vector[k] <- df[i, 1]
    k <- k + 1
    j <- j + 1
  }
}
vector
# [1] 1 1 3 3 3
Probably the fastest and most elegant way to obtain this result has been posted before in a comment by @akrun:
with(data, rep(Type, amount))
[1] 1 1 3 3 3
However, if you want to do this with for/while loops, it could be helpful to use a list for such cases, where the number of entries is not known at the beginning.
Here is an example with minimal modifications of your code:
my_list <- vector("list", 3)
for (i in 1:3) {
  k <- 1
  while (k <= data[i,2]){
    my_list[[i]][k] <- data[i,1]
    k <- k + 1
  }
}
vector <- unlist(my_list)
#> vector
#[1] 1 1 3 3 3
The reason why your code didn't work was essentially that you were trying to put too much information into a single variable, k. It cannot serve as both an index into your output vector and a counter for the individual entries in the first column of data; the counter is reset to 1 each time the while loop has finished, so every row starts writing at position 1 again.
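The effect of reusing k can be made visible by printing the output vector after each row has been processed. This is only a tracing sketch of the loop from the question; the data frame is rebuilt here from the values shown in the question, and out is an illustrative name replacing vector.

```r
data <- data.frame(Type = c(1, 2, 3), amount = c(2, 0, 3))
out <- numeric(5)

for (i in 1:3) {
  k <- 1                          # reset for every row, so writing restarts at position 1
  while (k <= data[i, 2]) {
    out[k] <- data[i, 1]
    k <- k + 1
  }
  cat("after i =", i, ":", out, "\n")
}
# after i = 1 : 1 1 0 0 0
# after i = 2 : 1 1 0 0 0
# after i = 3 : 3 3 3 0 0
```

The third row overwrites the two entries written by the first row, which gives the 3 3 3 0 0 from the question.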
I have two solutions that should in theory be identical: one is vectorized and the other uses a for loop. But the vectorized solution returns the wrong result, and I want to understand why. The logic is simple: replace each NA with the previous non-NA value in the vector.
# vectorized
f1 <- function(x) {
  idx <- which(is.na(x))
  x[idx] <- x[ifelse(idx > 1, idx - 1, 1)]
  x
}
# non-vectorized
f2 <- function(x) {
  for (i in 2:length(x)) {
    if (is.na(x[i]) && !is.na(x[i - 1])) {
      x[i] <- x[i - 1]
    }
  }
  x
}
v <- c(NA,NA,1,2,3,NA,NA,6,7)
f1(v)
# [1] NA NA 1 2 3 3 NA 6 7
f2(v)
# [1] NA NA 1 2 3 3 3 6 7
The two pieces of code are different.
The first one replaces each NA with the element that preceded it in the original vector, so an NA that follows another NA stays NA.
The second one replaces each NA with the previous element if that element is not NA, and the previous element may itself be the result of an earlier substitution.
Which one is correct really depends on what you need. The second behaviour is more difficult to vectorize, but there are existing implementations such as zoo::na.locf.
Or, if you only want to use base packages, you could have a look at this answer.
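One possible base-R sketch of the carry-forward behaviour is shown below. It is not necessarily the approach taken in the answer referred to above; locf is a hypothetical helper name. The idea is that cumsum(!is.na(x)) counts how many non-NA values have been seen so far, which is the position of the value to carry forward among the non-NA values.

```r
# Hypothetical helper: carry the last non-NA value forward
locf <- function(x) {
  idx <- cumsum(!is.na(x))        # 0 for leading NAs, otherwise rank of the last non-NA value
  c(NA, x[!is.na(x)])[idx + 1]    # prepend NA so that idx 0 maps to NA
}

v <- c(NA, NA, 1, 2, 3, NA, NA, 6, 7)
locf(v)
# [1] NA NA  1  2  3  3  3  6  7
```

For this input the result matches f2(v) from the question.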
These two solutions are not equivalent. The first function is rather like:
f2_as_f1 <- function(x) {
  y <- x # a copy of x
  for (i in 2:length(x)) {
    if (is.na(y[i])) {
      x[i] <- y[i - 1]
    }
  }
  x
}
Note the usage of the y vector.
Using the following code:
a <- seq(1, 10, 1)
b <- seq(2, 20, 2)
I would like to subtract a[i - 1] from b[i] for each i, in order to obtain something like
c <- NULL
for(i in 1:length(b)) {
  c[i] <- b[i] - a[i - 1]
}
but I would like to do this without using for() loop.
Does anyone know how to do it in a single line?
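Since your a and b are the same length, one element has to be dropped from each so that the shifted vectors line up. I've assumed you'd like to trim the last element off of b and the first off of a, which pairs b[i] with a[i + 1]. (Try b - a[-1] to see why trimming is probably desirable: R recycles the shorter vector and issues a warning.)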
b[-length(b)] - a[-1]
# [1] 0 1 2 3 4 5 6 7 8
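Read literally, the question's b[i] - a[i - 1] pairs each element of b with the preceding element of a, which is the opposite shift. A sketch of that reading, for i from 2 to 10 (there is no a[0], so the first element of b has no partner), with literal as an illustrative name:

```r
a <- seq(1, 10, 1)
b <- seq(2, 20, 2)

# b[i] - a[i - 1] for i = 2, ..., 10: drop the first b and the last a
literal <- b[-1] - a[-length(a)]
literal
# [1]  3  4  5  6  7  8  9 10 11
```

Which of the two shifts is wanted depends on the intent behind the question.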
You can do this with time series:
a <- ts(seq(1, 10, 1))
b <- ts(seq(2, 20, 2))
b - lag(a, 1)
##-----
Time Series:
Start = 1
End = 9
Frequency = 1
[1] 0 1 2 3 4 5 6 7 8
Not that I am necessarily recommending this. The base time-series formalism is a widely feared source of confusion. Most people avoid it, giving preference to the zoo and xts classed objects.