I am relatively new to R and am trying to create a for loop with a conditional that references the previous row for equivalence. In order to learn to write this code for my own data, I created a simpler, representative data frame:
df<- c(1,1,2,0,0,0,0,1,1,2)
My goal would be to print a 1 for every value in df that is different from the previous value. For this df, this should look like:
[1] 0,0,1,1,0,0,0,1,0,1.
Here is what I have tried thus far:
for(i in df){
if(x[i] != x[i-1]){
print(1)
}else{
print(0)
}
}
From the above code, I consistently get the error "argument is of length zero". Very possible that I am making a simple mistake, but I appreciate any suggestions!
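Maybe try this. You are using x, which is not defined in your example; use df inside the loop instead. Your loop also iterates over the values of df rather than its positions, and an index of 0 (for example i-1 when i is 1) selects nothing, so the comparison has length zero and if() fails with "argument is of length zero". Since each element is compared with the one before it, start the loop at the second position, because the first element has no previous value to compare against. Here is the code: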
#Data
df<- c(1,1,2,0,0,0,0,1,1,2)
#Loop
for(i in 2:length(df)){
if(df[i] != df[i-1]){
print(1)
}else{
print(0)
}
}
Output (one line per comparison, starting with the second element):
[1] 0
[1] 1
[1] 1
[1] 0
[1] 0
[1] 0
[1] 1
[1] 0
[1] 1
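The same comparison can also be written without a loop. This is only a minimal sketch, assuming the input is a plain numeric vector as in the question; the name `changed` is made up for illustration. diff() returns the differences between consecutive elements, so a non-zero difference marks a change, and a leading 0 stands in for the first element, which has nothing before it.

```r
df <- c(1, 1, 2, 0, 0, 0, 0, 1, 1, 2)

# diff(df) has length(df) - 1 elements: df[i] - df[i - 1] for i = 2, 3, ...
# A non-zero difference means the value changed from the previous one.
# Prepend 0 for the first element, which has no predecessor.
changed <- c(0L, as.integer(diff(df) != 0))
print(changed)
```

Unlike the loop, this returns all ten values in one vector, which is easier to store or attach to other data than printed output.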
list<-c("a2012","a2013")
a2012<-c("al,","al,rb,","cu,pvc,")
a2013<-c("ab,al,","al,cu,","pvc,al,")
sum(str_count(a2012,"al,")==1)
[1] 2
sum(str_count(a2013,"al,")==1)
[1] 3
output <- vector("integer")
for(i in seq_along(list))
{
output[[i]]<-sum(str_count(list[[i]],"al,")==1)
}
output
[1] 0 0
This is the whole process. I'm pretty much a noob.
I don't know why this happens. Please help
library(stringr)  # provides str_count()
a2012<-c("al,","al,rb,","cu,pvc,")
a2013<-c("ab,al,","al,cu,","pvc,al,")
mylist <- list("a2012" = a2012,
"a2013" = a2013)
output <- vector("integer")
for(i in seq_along(mylist))
{
output[[i]]<-sum(str_count(mylist[[i]],"al,")==1)
}
> output
[1] 2 3
> mylist
$a2012
[1] "al," "al,rb," "cu,pvc,"
$a2013
[1] "ab,al," "al,cu," "pvc,al,"
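The main thing is that your list doesn't need to contain names and then refer to variables by those names - it can just contain the vectors themselves. In your version, list[[i]] is the character string "a2012" (or "a2013"), not the vector stored under that name, so str_count searches the name itself, finds no "al,", and the sum is 0.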
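If the names really do have to stay as strings, base R's mget() can look up the objects they refer to. The sketch below is an illustration under assumptions, not the answerer's method: `var_names` and the helper `count_fixed` are hypothetical names, and the helper uses base R only as a stand-in for str_count so that the example runs without stringr.

```r
a2012 <- c("al,", "al,rb,", "cu,pvc,")
a2013 <- c("ab,al,", "al,cu,", "pvc,al,")
var_names <- c("a2012", "a2013")

# Hypothetical helper: count non-overlapping literal occurrences of
# `pattern` in each element of `x`, using base R only.
count_fixed <- function(x, pattern) {
  lengths(regmatches(x, gregexpr(pattern, x, fixed = TRUE)))
}

# Counting in the names themselves searches the strings "a2012" and "a2013",
# neither of which contains "al,".
print(count_fixed(var_names, "al,"))

# mget() fetches the objects that the names refer to, as a named list.
vectors <- mget(var_names)
output <- sapply(vectors, function(v) sum(count_fixed(v, "al,") == 1))
print(output)
```

Building the list from the vectors directly, as in the answer above, is usually simpler; mget() is mainly useful when the names arrive as data.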
I have a numeric vector and I need to find the index of the first number that is greater than 24 and divisible by 13; if no number meets the conditions, print 0. This is the code I wrote:
numbers_vector=c(1,5,26,7,94)
for(i in numbers_vector){
if(i>24&&i%%13==0){
print(i)
}else{
print(0)
}
}
the answer it returns:
[1] 0
[1] 0
[1] 26
[1] 0
[1] 0
It should return the number 3 (the index), as 26 meets the conditions.
Can anyone see what I'm doing wrong?
Thanks
which(numbers_vector>24 & numbers_vector%%13==0)[1]
This gives you the index of the first number that meets both conditions; if none of the numbers fits, it returns NA. (which.max() on the same condition does not handle that case: on an all-FALSE vector it returns 1.) If you want zero in such cases, do this:
a=which(numbers_vector>24 & numbers_vector%%13==0)[1]
ifelse(is.na(a), 0, a)
Two general comments:
a. Avoid automatically going for the for loop. R's greatest strength is in vectorized calculations.
b. Avoid using print to return your result.
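Both comments can be combined in a small function that returns the index instead of printing it. This is a sketch under assumptions: the function name `first_match_index` and its parameters `lower` and `divisor` are invented for illustration. It relies on match(), which returns the position of the first match and accepts a `nomatch` value for the case where there is none.

```r
# Hypothetical helper: index of the first element that is greater than
# `lower` and divisible by `divisor`, or 0 if no element qualifies.
first_match_index <- function(x, lower = 24, divisor = 13) {
  qualifies <- x > lower & x %% divisor == 0
  # match() returns the position of the first TRUE, or `nomatch` if none.
  match(TRUE, qualifies, nomatch = 0L)
}

first_match_index(c(1, 5, 26, 7, 94))  # 26 is the third element
first_match_index(c(1, 5, 7))          # nothing qualifies
```

Because the result is returned rather than printed, it can be stored in a variable or used in a later calculation.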
Only a small change is needed:
numbers_vector=c(1,5,26,7,94)
for(i in numbers_vector){
if(i>24 && i%%13==0){
print(which(numbers_vector == i))
}else{
print(0)
}
}
You are printing i, which is the number itself rather than its index.
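Note that the loop above still prints one line per element, and which(numbers_vector == i) returns every position holding that value if it occurs more than once. If a single result is wanted, one possible variation is to loop over positions and stop at the first match. This is only a sketch; the names `result`, `idx`, and `value` are made up for illustration.

```r
numbers_vector <- c(1, 5, 26, 7, 94)

result <- 0  # stays 0 if no element meets the conditions
for (idx in seq_along(numbers_vector)) {
  value <- numbers_vector[idx]
  if (value > 24 && value %% 13 == 0) {
    result <- idx  # remember the position, not the value
    break          # stop at the first match
  }
}
print(result)
```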
I need to create a vector (let's call it the "support vector") that marks each position in another vector (the "main vector") according to whether it holds a desired value. My guess is that the easiest way is to use for and ifelse. I am familiar with appending new values to a vector using for, but it does not work with ifelse. Here is a simple example that shows what I have in mind (please read the # descriptions):
#the "main vector" with fixed values
main_vector=c("ABC","ABC","ABC","XYZ")
#empty "support vector" which I want fill
support_vector=c()
#loop that puts into "support vector" 1 if "ABC" and 0 if "XYZ"
for(i in 1:length(main_vector)){ifelse(main_vector[i]=="ABC",support_vector[i]=1,support_vector[i]=0}
It generates an error that suggests = is the issue in my code (or am I wrong?). What method or functions should I use to avoid using =?
Thank you in advance
Simplest way could be:
support_vector <- as.numeric(main_vector == "ABC")
> support_vector
#[1] 1 1 1 0
If, for some reason, the OP still wants to use a for loop, there is no need for ifelse; a plain if-else is the more convenient and readable option when the condition has length 1.
support_vector <- c()
for(i in 1:length(main_vector)){
if(main_vector[i]=="ABC"){
support_vector[i]=1
} else {
support_vector[i]=0
}
}
Note: Solution provided by @gdkrmr is very elegant.
Here's another approach (not as nice as @gdkm or @MKR but may be useful in some way):
> main_vector=c("ABC","ABC","ABC","XYZ")
> support_vector <- sapply(main_vector, function(x) as.integer(x=="ABC"))
> support_vector
ABC ABC ABC XYZ
1 1 1 0
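On the original error: inside a function call, `name = value` is read as a named argument, so `support_vector[i]=1` cannot be parsed as an argument to ifelse. ifelse returns its results rather than assigning them, so the assignment belongs outside the call. The following is a minimal sketch of that usage, together with a way to drop the names that sapply attaches; the name `unnamed` is made up for illustration.

```r
main_vector <- c("ABC", "ABC", "ABC", "XYZ")

# ifelse() is vectorized: it tests every element and returns a vector of
# results, so the assignment happens once, outside the call.
support_vector <- ifelse(main_vector == "ABC", 1, 0)
print(support_vector)

# sapply() keeps the input strings as names; unname() drops them if unwanted.
unnamed <- unname(sapply(main_vector, function(x) as.integer(x == "ABC")))
print(unnamed)
```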
I am trying to understand for loops and if statements in R, so I ran some code that should return 1 if the sum of a row is bigger than 3 and 0 otherwise:
Here is the code
set.seed(2)
x = rnorm(20)
y = 2*x
a = cbind(x,y)
hold = c()
Now comes the if-statement
for (i in nrow(a)) {
if ([i,1]+ [i,2] > 3) hold[i,] == 1
else ([i,1]+ [i,2]) <- hold[i,] == 0
return (cbind(a,hold)
}
I know that maybe combining for and if may not be ideal, but I just want to understand what is going wrong. Please keep the explanation at a dummy level:) Thanks
You've got some issues. @mnel covered a better way to go about doing this, I'll focus on understanding what went wrong in this attempt (but don't do it this way at all, use a vectorized solution).
Line 1
for (i in nrow(a)) {
a has 20 rows. nrow(a) is 20. Thus your code is equivalent to for (i in 20), which means i will only ever be 20.
Fix:
for (i in 1:nrow(a)) {
Line 2
if ([i,1]+ [i,2] > 3) hold[i,] == 1
[i,1] isn't anything, it's the ith row and first column of... nothing. You need to reference your data: a[i,1]
You initialized hold as a vector, c(), so it only has one dimension, not rows and columns. So we want to assign to hold[i], not hold[i,].
== is used for equality testing. = or <- are for assignment. Right now, if the >3 condition is met, then you check if hold[i,] is equal to 1. (And do nothing with the result).
Fix:
if (a[i,1]+ a[i,2] > 3) hold[i] <- 1
Line 3
else ([i,1]+ [i,2]) <- hold[i,] == 0
As above for assignment vs equality testing. (Here you used an arrow assignment, but put it in the wrong place - as if you're trying to assign to the else)
else happens whenever the if condition isn't met, you don't need to try to repeat the condition
Fix:
else hold[i] <- 0
Fixed code together:
for (i in 1:nrow(a)) {
if (a[i,1] + a[i,2] > 3) hold[i] <- 1
else hold[i] <- 0
}
You aren't using curly braces for your if and else expressions. They are not required for single-line expressions (if something do this one line). They are required for multi-line expressions (if something do a bunch of stuff), but I think they're a good idea to use either way. Also, in R, it's good practice to put the else on the same line as the } from the preceding if (inside a for loop or a function it doesn't matter, but otherwise it would, so it's good to get in the habit of always doing it). I would recommend this reformatted code:
for (i in 1:nrow(a)) {
if (a[i, 1] + a[i, 2] > 3) {
hold[i] <- 1
} else {
hold[i] <- 0
}
}
Using ifelse
ifelse() is a vectorized if-else statement in R. It is appropriate when you want to test a vector of conditions and get a result out for each one. In this case you could use it like this:
hold <- ifelse(a[, 1] + a[, 2] > 3, 1, 0)
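ifelse will take care of the looping for you. If you want it as a column in your data, bind it on directly (no need to initialize first). Because a is a matrix here, use cbind; the a$hold <- ... form only works if a is a data frame:
a <- cbind(a, hold = ifelse(a[, 1] + a[, 2] > 3, 1, 0))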
Such operations in R are nicely vectorised.
You haven't included a reference to the dataset you wish to index in your calls to [ (e.g. a[i,1]).
Using rowSums:
h <- rowSums(a) > 3
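The line above yields TRUE/FALSE values. If 1/0 values in an extra column are wanted, as in the question, one way to get there is sketched below. It assumes the same `a` as in the question; the name `a_with_hold` is made up for illustration.

```r
set.seed(2)
x <- rnorm(20)
y <- 2 * x
a <- cbind(x, y)

# rowSums(a) > 3 is a logical vector with one entry per row;
# as.integer() turns TRUE/FALSE into 1/0.
hold <- as.integer(rowSums(a) > 3)

# `a` is a matrix, so the new column is attached with cbind().
a_with_hold <- cbind(a, hold)
head(a_with_hold)
```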
I am going to assume that you are new to R and trying to learn about the basic function of the for loop itself. R has fancy functions called "apply" functions that are specifically for doing basic math on each row of a data frame. I am not going to talk about these.
You want to do the following on each row of the array.
1. Sum the elements of the row.
2. Test that the sum is greater than 3.
3. Return a value of 1 or 0 representing the result of 2.
For 1, luckily "sum" is a built in function. It pays off to check out the built in functions within every programming language because they save you time. To sum the elements of a row, just use sum(a[row_number,]).
For 2, you are evaluating a logical statement "is x >3?" where x is the result from 1. The ">3" statement returns a value of true or false. The logical expression is a fancy "if then" statement without the "if then".
> 4>3
[1] TRUE
> 2>3
[1] FALSE
For 3, a true or false value has the type "logical" in R, and a 1 or 0 value has the type "numeric". By converting the "logical" into a "numeric", you change TRUE to 1 and FALSE to 0.
> class(4>3)
[1] "logical"
> as.numeric(4>3)
[1] 1
> class(as.numeric(4>3))
[1] "numeric"
A for loop has a sequence, a counter, and a body. The counter takes each value of the sequence in turn, from the first to the last, and the body runs once for each value. Here the sequence is 1:nrow(a), so you start at the first row and go to the last row. Putting all the elements together looks like this.
for (i in 1:nrow(a)){
hold[i] <- as.numeric(sum(a[i,])>3)
}
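The loop above assumes that hold already exists. A slightly more defensive variant is sketched here, as one possible arrangement rather than a required one: the result vector is created at its final length up front, and seq_len(nrow(a)) replaces 1:nrow(a), which would count 1, 0 if the matrix had no rows.

```r
set.seed(2)
x <- rnorm(20)
a <- cbind(x, y = 2 * x)

# Preallocate the result instead of growing it one element at a time.
hold <- numeric(nrow(a))

# seq_len(n) is 1, 2, ..., n, and is empty when n is 0,
# whereas 1:0 would yield c(1, 0).
for (i in seq_len(nrow(a))) {
  hold[i] <- as.numeric(sum(a[i, ]) > 3)
}
print(hold)
```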
I am relatively new to R. I am iterating over a vector in R by using a for() loop. However, based on a certain condition, I need to skip some values in the vector. The first thought that comes to mind is to change the loop index within the loop. I have tried that, but somehow it's not changing. There must be some way to achieve this in R.
Thanks in advance.
Sami
You can change the loop index within a for loop, but it will not affect the execution of the loop; see the Details section of ?"for":
The ‘seq’ in a ‘for’ loop is evaluated at the start of the loop;
changing it subsequently does not affect the loop. If ‘seq’ has
length zero the body of the loop is skipped. Otherwise the
variable ‘var’ is assigned in turn the value of each element of
‘seq’. You can assign to ‘var’ within the body of the loop, but
this will not affect the next iteration. When the loop terminates,
‘var’ remains as a variable containing its latest value.
Use a while loop instead and index it manually:
i <- 1
while(i < 100) {
# do stuff
if(condition) {
i <- i+3
} else {
i <- i+1
}
}
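To make the template above concrete, here is a small runnable sketch. The data and the skipping rule are invented for illustration (a negative value causes the next two positions to be skipped), and `visited` simply records which positions the loop actually reached.

```r
values <- c(4, 8, -1, 15, 16, 23, 42)  # made-up data

visited <- integer(0)
i <- 1
while (i <= length(values)) {
  visited <- c(visited, i)
  if (values[i] < 0) {
    i <- i + 3  # hypothetical rule: a negative value skips the next two
  } else {
    i <- i + 1
  }
}
print(visited)
```

Positions 4 and 5 are never visited, because the counter jumps from 3 to 6 after the negative value.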
Look at
?"next"
The next command will skip the rest of the current iteration of the loop and begin the next one. That may accomplish what you want.
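A minimal sketch of next in use, with made-up data and a made-up condition (skip the value 15); `kept` collects the values that were not skipped.

```r
values <- c(5, 10, 15, 20, 25)  # made-up data

kept <- c()
for (v in values) {
  if (v == 15) {
    next  # skip the rest of this iteration and continue with the next value
  }
  kept <- c(kept, v)
}
print(kept)
```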
Without an example it is hard to see what you want to do, but you can always use an if-statement inside a for-loop:
foo <- 1:10*5
for (i in seq_along(foo))
{
if (foo[i] != 15) print(foo[i])
}
In R, local alterations in the index variable are "corrected" with the next pass:
for (i in 1:10){
if ( i==5 ) {i<-10000; print(i)} else{print(i)}
}
#-----
[1] 1
[1] 2
[1] 3
[1] 4
[1] 10000
[1] 6
[1] 7
[1] 8
[1] 9
[1] 10
Since you have some criterion for skipping, you should apply the criterion to the loop vector inside the for-parentheses. E.g.:
for( i in (1:10)[-c(3,4,6,8,9)] ) {
print(i)}
#----
[1] 1
[1] 2
[1] 5
[1] 7
[1] 10
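When the values to skip follow from a condition rather than a fixed list of positions, the same idea works with a logical filter. This is a sketch with a made-up criterion (skip multiples of 3); the names `index`, `keep`, and `processed` are hypothetical.

```r
index <- 1:10

# Hypothetical criterion: skip every index that is a multiple of 3.
keep <- index %% 3 != 0

processed <- c()
for (i in index[keep]) {
  processed <- c(processed, i)
}
print(processed)
```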