Multiple if statement in loop condition - r

I am a PhD student at the University of Padua and I am trying to write a little script (my first!) in R (CRAN version 3.0.1) to run an epidemiological simulation.
I'd like to change the values of a vector of 883 values based on a neighbour matrix constructed with nb2mat from a shapefile: if i and j (two cells) are neighbours according to the matrix and i or j has a positive value in the vector, I'd like to set the value of both i and j to 1 (positive); otherwise the values of i and j should remain 0. When I run the following little script:
for(i in 1:883)
{ for(j in 1:883)
{ if(MatriceDist[i,j] > 0 & ((vectorID[i] > 0 | vectorID[j] > 0)) {
vectorID[i] = 1 & vectorID[j] = 1
print(vectorID)
} } }
the answer from the software is:
Error: unexpected '{' in:
" { for(j in 1:883)
{ while(MatriceDist[i,j] > 0 & ((vectorID[i] > 0 | vectorID[j] > 0)) {"
I think the error is in the if statement, but I cannot work out how to fix it...
Thank you everyone!
Elisa

Check your brackets :-)
for(i in 1:883) {
  for(j in 1:883) {
    if(MatriceDist[i,j] > 0 & (vectorID[i] > 0 | vectorID[j] > 0)) {
      vectorID[i] = 1
      vectorID[j] = 1
      print(vectorID)
    }
  }
}
You had one ( too many before vectorID in your if statement. The two assignments also have to be separate statements; they cannot be joined with &.
Please double-check that the condition as now written is still the one you need.
By the way, for loops can be slow in R. If you know the final size of vectorID, pre-allocate it in full rather than growing it inside the loop. That will speed things up a little.
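If a single synchronous update is what you need (each cell looks only at the current state of its neighbours), the double loop can be replaced by one matrix product. The following is only a sketch on a made-up four-cell chain; nb and status are hypothetical stand-ins for MatriceDist and vectorID. It is not identical to the loop, which updates vectorID in place and can therefore spread a positive value further within one pass.

```r
# Hypothetical 4-cell chain: 1-2, 2-3 and 3-4 are neighbours
nb <- matrix(0, nrow = 4, ncol = 4)
nb[cbind(1:3, 2:4)] <- 1
nb <- nb + t(nb)                  # make the neighbour matrix symmetric

status <- c(1, 0, 0, 0)           # only cell 1 is positive at the start

adjacent <- (nb > 0) * 1          # 1 where two cells are neighbours
positive <- as.numeric(status > 0)

# Number of positive neighbours of each cell
n_pos_neighbours <- as.vector(adjacent %*% positive)

# A cell is positive afterwards if it was positive or has a positive neighbour
new_status <- as.integer(positive > 0 | n_pos_neighbours > 0)
new_status
```

Here only cell 2 gains a positive value, because it is the only neighbour of cell 1. Repeating the step would carry the value one cell further each time.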

Related

R nested if statements error "condition has length > 1 and only the first element will be used"

I'm trying to evaluate many conditions in R but I'm getting the warning:
Warning messages:
1: In if (closeV > openV) { :
the condition has length > 1 and only the first element will be used
2: In if ((highV - closeV) < Minimum) { :
the condition has length > 1 and only the first element will be used
3: In if ((openV - lowV) > Threshold) { :
the condition has length > 1 and only the first element will be used
4: In if (((openV - lowV) < Threshold)) { :
the condition has length > 1 and only the first element will be used
5: In if ((closeV - openV) < Threshold) { :
the condition has length > 1 and only the first element will be used
6: In if ((closeV - lowV) < (Threshold * 2)) { :
the condition has length > 1 and only the first element will be used
This is a huge nest of ifs. It is not optimized right now, but I can't get it to work because of that warning.
There are around 40 ifs in that function. Any idea what I need to do to get rid of this warning?
The code looks something like this:
if(closeV>openV)#1 First we check if we have a positive value
{
if((highV-closeV)<Minimum)
{
if((openV-lowV) >Threshold)
{
if((closeV-openV)<Threshold)
{
#3.1 This is a Hammer with positive movement
if((closeV-lowV)<(Threshold*2))
{
#3.1.1 not much movement
return(X*2)
}
else if((closeV-lowV)>(Threshold*2))
{
#3.1.2 a lot of movement
return(X*3)
}
}
else if((closeV-openV)>Threshold)
{
#3.2 Hammer but with a lot of movement
if((closeV-lowV)<(Threshold*2))
{
#3.2.1 not much movement
return(X)
}
else if((closeV-lowV)>(Threshold*2))
{
#3.2.2 a lot of movement
return(X*5)
}
}
}
else if(((openV-lowV)<Threshold)
and it keeps on going through a lot of possibilities.
The issue is not the nested if statements but the data you feed into them. The warning tells you that each condition is a vector with more than one element, and that if only looks at the first element of that vector. (From R 4.2.0 onwards the same situation is an error rather than a warning.)
While
a = seq(1, 10, 1)
b = seq(0, 18, 2)
if (a>b){
print(a)
} else{
print(b)
}
throws the same warning messages you get,
a = seq(1, 10, 1)
b = seq(0, 18, 2)
for (i in 1:10) {
if (a[i]>b[i]){
print(a[i])
} else{
print(b[i])
}
}
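by contrast, runs without warnings.
Also note that although both pieces of code run, they give very different results: the first prints a whole vector chosen by the first comparison alone, while the second prints one value per element.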
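When the goal is an element-by-element choice like the one in the loop, the comparison can also stay vectorised. A minimal sketch using the same a and b; the names larger and larger_pmax are introduced here only for illustration:

```r
a <- seq(1, 10, 1)
b <- seq(0, 18, 2)

# Element-wise choice: take a[i] where a[i] > b[i], otherwise b[i]
larger <- ifelse(a > b, a, b)

# For this particular comparison, pmax() gives the same result
larger_pmax <- pmax(a, b)

larger
```

ifelse() evaluates the condition for every element, so no length-one condition is needed. Whether this fits the nested trading logic above depends on whether each branch can be written as a vector expression.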

Adding a counter to a loop

A broad question that I haven't been able to find an answer to for R:
I'm trying to add a counter at the beginning of a loop.
So that when I run the loop sim = 1000:
if(hours$week1 > 1 and hours$week1 < 48) add 1 to the counter
ifelse add 0
I have come across counter tutorials that print a sentence to let you know where you are (in case something goes wrong):
e.g.
for (i in 1:1000) {
if (i%%100==0) print(paste("No work", i))
}
But the purpose of my counter is to generate a value output, measuring how many of the 1000 runs in the loop fall inside a specified range.
You basically had it. You just need to a) initialize the counter before the loop, b) use & instead of and in your if condition, c) actually add 1 to the counter. Since adding 0 is the same as doing nothing, you don't have to worry about the "else".
counter = 0
for (blah in your_loop_definition) {
... loop code ...
if(hours$week1 > 1 & hours$week1 < 48) {
counter = counter + 1
}
... more loop code ...
}
Instead of
if(hours$week1 > 1 & hours$week1 < 48) {
counter = counter + 1
}
you could also use
counter = counter + (hours$week1 > 1 && hours$week1 < 48)
since R converts TRUE to 1 and FALSE to 0.
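If the values to be checked are available as a vector once the simulation has run, the counter can also be computed without an explicit loop. A minimal sketch, in which week1_hours is made-up data standing in for one simulated value per run:

```r
# Made-up values standing in for one simulated result per run
week1_hours <- c(0, 10, 47, 48, 60, 1, 2)

# Loop version with an explicit counter
counter <- 0
for (h in week1_hours) {
  if (h > 1 && h < 48) {
    counter <- counter + 1
  }
}

# Vectorised version: TRUE counts as 1, FALSE as 0
counter_vec <- sum(week1_hours > 1 & week1_hours < 48)

c(counter, counter_vec)
```

Both approaches count the three values (10, 47 and 2) that lie strictly between 1 and 48.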
How about this?
count = 0
for (i in 1:1000) {
count = ifelse(i %in% 1:100, count + 1, count)
}
count
#> [1] 100
If your goal is just to monitor progress coarsely and you're using RStudio, a simple solution is to refresh the Environment tab to check the current value of i.

Plot function with else if statement in R

I'm trying to plot the following function over the interval [-1,1] but I'm getting these warnings:
"Warning messages:
1: In if (g < a) { :
the condition has length > 1 and only the first element will be used
2: In if (g >= a & g <= b) { :
the condition has length > 1 and only the first element will be used"
unifCDF<-function(g) {
if (g< a) {
0
}
else if (g>=a & g<=b) {
(g-a)/(b-a)
}
else if (g>b) {
1
}
}
I know the function itself works since unifCDF() works for all values I tested. Any ideas?
Your function works on single values:
> unifCDF(.5)
[1] 0.75
but not on vectors:
> unifCDF(c(0.2,.3))
[1] 0.60 0.65
Warning messages:
1: In if (g < a) { :
the condition has length > 1 and only the first element will be used
2: In if (g >= a & g <= b) { :
the condition has length > 1 and only the first element will be used
and plot.function needs functions to work on vectors. The lazy way is to just Vectorize your function:
> unifCDF=Vectorize(unifCDF)
> unifCDF(c(0.2,.3))
[1] 0.60 0.65
> plot.function(unifCDF,-1,1)
which then works.
The right way is to code it so that it handles vector arguments naturally.
unifCDF = function(g){
res = (g-a)/(b-a)
res[g<a]=0
res[g>b]=1
res
}
In this code, res is always a vector of the same length as g. The first line computes the slopey bit for all values of g, and the next two lines set the values outside the (a,b) limits to 0 and 1.
Note that relying on global variables such as a and b is generally a bad idea.
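One way to avoid the globals is to pass a and b as arguments. This is only a sketch: the name unifCDF2 is made up, and the defaults a = -1 and b = 1 are an assumption chosen because they reproduce the outputs shown earlier (0.75 at 0.5, 0.60 at 0.2, 0.65 at 0.3).

```r
# a and b are passed in; the defaults are an assumption chosen to match
# the outputs shown earlier in this answer
unifCDF2 <- function(g, a = -1, b = 1) {
  # Clamp the linear part to the interval [0, 1]
  pmin(pmax((g - a) / (b - a), 0), 1)
}

g <- c(-2, 0.2, 0.3, 0.5, 2)
unifCDF2(g)

# Base R already provides the uniform CDF as punif()
all.equal(unifCDF2(g), punif(g, min = -1, max = 1))
```

Because pmin() and pmax() work element by element, the function accepts a vector directly and can be handed to plot.function without Vectorize.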

Multiple conditions in if statements in R

I am trying to cut down a list of gene names that I have been given. I'm trying to eliminate any repetitive names that may be present but I keep getting an error when running my code:
counter=0
i=0
j=0
geneNamesRevised=array(dim=length(geneNames))
for (i in 0:length(geneNamesRevised))
geneNamesRevised[i]=""
geneNamesRevised
for (i in 1:length(geneNames))
for (j in 1:length(geneNamesRevised))
if (geneNames[i]==geneNamesRevised[j])
{
break
}
else if ((j==length(geneNamesRevised)-1) &&
(geneNames[i]!=geneNamesRevised[j]))
{
geneNamesRevised[counter]=geneNames[i]
counter++
}
The error message is a repeated string of:
the condition has length > 1 and only the first element will be used
the condition has length > 1 and only the first element will be used
the condition has length > 1 and only the first element will be used
and this error message is for the last "else if" statement that has the '&&'.
Thank you!
Why not just
geneNamesRevised <- unique( geneNames )
... which returns a shortened list. There is also a duplicated function that can be used to remove duplicates when negated.
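A small sketch of both approaches, using made-up gene names in place of the real geneNames vector (byUnique and byDuplicated are illustrative names):

```r
# Made-up gene names with repeats
geneNames <- c("BRCA1", "TP53", "BRCA1", "EGFR", "TP53")

# Keep the first occurrence of each name, in order of appearance
byUnique <- unique(geneNames)

# Same result via duplicated(): drop entries already seen earlier
byDuplicated <- geneNames[!duplicated(geneNames)]

identical(byUnique, byDuplicated)
```

The duplicated() form is handy when the same logical index also has to be applied to other columns that belong to the gene names.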
There are a few problems in your code.
1) The else is incorrectly specified - or not :) thanks #Mohsen_Fatemi
2) & is usually what you need rather than &&
3) counter++ isn't R
Copy the code below and see if it runs
for (i in 1:length(geneNames)){
for (j in 1:length(geneNamesRevised)){
if (geneNames[i]==geneNamesRevised[j])
{
break
} else {
if ((j==length(geneNamesRevised)-1) & (geneNames[i]!=geneNamesRevised[j]))
{
geneNamesRevised[counter]=geneNames[i]
counter <- counter + 1
}
}
}
}
Edit
4) Also, you were missing braces for your for loops.
Use & instead of &&:
else if ((j==length(geneNamesRevised)-1) & (geneNames[i]!=geneNamesRevised[j]))
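For reference, a short sketch of how the two operators differ; x is made-up data. Which operator a given condition needs depends on whether its operands are single values or vectors, so treat this as an illustration, not as a verdict on the code above.

```r
x <- c(5, 20, 50)

# & compares element by element and returns a vector
elementwise <- x > 1 & x < 48

# && is meant for single values, e.g. inside if()
first_ok <- x[1] > 1 && x[1] < 48

# To use a vector comparison in if(), reduce it to one value first
if (any(elementwise)) {
  message("at least one value is in range")
}
```

If a condition inside if() turns out to have length greater than one, it is usually worth checking what the operands actually contain (for example with str()) before changing the operator.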

R - vectorised conditional replace

Hi, I'm trying to manipulate a list of numbers and I would like to do so without a for loop, using fast native operations in R. The pseudocode for the manipulation is:
By default the starting total is 100 (for every block between zeros).
From the first zero to the next zero, the moment the cumulative total falls by more than 2% from its running maximum, replace all subsequent numbers with zero.
Do this for all blocks of numbers between zeros.
The cumulative sum resets to 100 every time.
For example, if the following were my data:
d <- c(0,0,0,1,3,4,5,-1,2,3,-5,8,0,0,-2,-3,3,5,0,0,0,-1,-1,-1,-1);
the result would be:
0 0 0 1 3 4 5 -1 2 3 -5 0 0 0 -2 -3 0 0 0 0 0 -1 -1 -1 0
Currently I have an implementation with a for loop, but since my vector is really long, the performance is terrible.
Thanks in advance.
Here is a running sample code :
d <- c(0,0,0,1,3,4,5,-1,2,3,-5,8,0,0,-2,-3,3,5,0,0,0,-1,-1,-1,-1);
ans <- d;
running_total <- 100;
count <- 1;
max <- 100;
toggle <- FALSE;
processing <- FALSE;
for(i in d){
if( i != 0 ){
processing <- TRUE;
if(toggle == TRUE){
ans[count] = 0;
}
else{
running_total = running_total + i;
if( running_total > max ){ max = running_total;}
else if ( 0.98*max > running_total){
toggle <- TRUE;
}
}
}
if( i == 0 && processing == TRUE )
{
running_total = 100;
max = 100;
toggle <- FALSE;
}
count <- count + 1;
}
cat(ans)
I am not sure how to translate your loop into vectorized operations. However, there are two fairly easy options for large performance improvements. The first is to simply put your loop into an R function and use the compiler package to precompile it. The second, slightly more complicated, option is to translate your R loop into a C++ loop and use the Rcpp package to link it to an R function. You then call an R function that passes its argument to the C++ code, which is fast. I show both options with timings. I gratefully acknowledge the help of Alexandre Bujard from the Rcpp listserv, who helped me with a pointer issue I did not understand.
First, here is your R loop as a function, foo.r.
## Your R loop as a function
foo.r <- function(d) {
ans <- d
running_total <- 100
count <- 1
max <- 100
toggle <- FALSE
processing <- FALSE
for(i in d){
if(i != 0 ){
processing <- TRUE
if(toggle == TRUE){
ans[count] <- 0
} else {
running_total = running_total + i;
if (running_total > max) {
max <- running_total
} else if (0.98*max > running_total) {
toggle <- TRUE
}
}
}
if(i == 0 && processing == TRUE) {
running_total <- 100
max <- 100
toggle <- FALSE
}
count <- count + 1
}
return(ans)
}
Now we can load the compiler package and compile the function and call it foo.rcomp.
## load compiler package and compile your R loop
require(compiler)
foo.rcomp <- cmpfun(foo.r)
That is all it takes for the compilation route. It is all R and obviously very easy. For the C++ approach, we use the Rcpp package together with the inline package, which allows us to "inline" the C++ code. That is, we do not have to create a source file and compile it ourselves; we just include the code in the R script and the compilation is handled for us.
## load Rcpp package and inline for ease of linking
require(Rcpp)
require(inline)
## Rcpp version
src <- '
const NumericVector xx(x);
int n = xx.size();
NumericVector res = clone(xx);
int toggle = 0;
int processing = 0;
double tot = 100;  // double, so that non-integer inputs are not truncated
double max = 100;
typedef NumericVector::iterator vec_iterator;
vec_iterator ixx = xx.begin();
vec_iterator ires = res.begin();
for (int i = 0; i < n; i++) {
if (ixx[i] != 0) {
processing = 1;
if (toggle == 1) {
ires[i] = 0;
} else {
tot += ixx[i];
if (tot > max) {
max = tot;
} else if (.98 * max > tot) {
toggle = 1;
}
}
}
if (ixx[i] == 0 && processing == 1) {
tot = 100;
max = 100;
toggle = 0;
}
}
return res;
'
foo.rcpp <- cxxfunction(signature(x = "numeric"), src, plugin = "Rcpp")
Now we can test that we get the expected results:
## demonstrate equivalence
d <- c(0,0,0,1,3,4,5,-1,2,3,-5,8,0,0,-2,-3,3,5,0,0,0,-1,-1,-1,-1)
all.equal(foo.r(d), foo.rcpp(d))
Finally, create a much larger version of d by repeating it 10^5 times. Then we can run the three different functions: pure R code, compiled R code, and an R function linked to C++ code.
## make larger vector to test performance
dbig <- rep(d, 10^5)
system.time(res.r <- foo.r(dbig))
system.time(res.rcomp <- foo.rcomp(dbig))
system.time(res.rcpp <- foo.rcpp(dbig))
On my system, this gives:
> system.time(res.r <- foo.r(dbig))
user system elapsed
12.55 0.02 12.61
> system.time(res.rcomp <- foo.rcomp(dbig))
user system elapsed
2.17 0.01 2.19
> system.time(res.rcpp <- foo.rcpp(dbig))
user system elapsed
0.01 0.00 0.02
The compiled R code takes about 1/6 of the time of the uncompiled R code, needing only about 2 seconds to process the vector of 2.5 million elements. The C++ code is orders of magnitude faster even than the compiled R code, requiring just 0.02 seconds to complete. Aside from the initial setup, the syntax of the basic loop is nearly identical in R and C++, so you do not even lose clarity. I suspect that even if parts or all of your loop could be vectorized in R, you would be hard pressed to beat the performance of the R function linked to C++. Lastly, just for proof:
> all.equal(res.r, res.rcomp)
[1] TRUE
> all.equal(res.r, res.rcpp)
[1] TRUE
The different functions return the same results.
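For completeness, here is one possible vectorised sketch in plain R. It is not part of the timings above and has not been benchmarked against them. It assumes that every zero starts a new block with the total reset to 100, which is how the loop behaves on the example data; foo.vec is a made-up name.

```r
# Hypothetical vectorised variant; assumes every zero starts a new block
foo.vec <- function(d) {
  block <- cumsum(d == 0)                           # block id per element
  total <- 100 + ave(d, block, FUN = cumsum)        # running total per block
  peak  <- ave(pmax(total, 100), block, FUN = cummax)  # running maximum
  breached <- as.numeric(0.98 * peak > total)       # more than 2% below the peak
  # Number of breaches strictly before each element of the same block
  breached_before <- ave(breached, block, FUN = cumsum) - breached
  d[breached_before > 0] <- 0
  d
}

d <- c(0,0,0,1,3,4,5,-1,2,3,-5,8,0,0,-2,-3,3,5,0,0,0,-1,-1,-1,-1)
expected <- c(0,0,0,1,3,4,5,-1,2,3,-5,0,0,0,-2,-3,0,0,0,0,0,-1,-1,-1,0)
identical(foo.vec(d), expected)
```

The element that triggers the breach is kept and only the later elements of the same block are set to zero, which matches the expected result given in the question. For non-integer data the running totals are summed in a different order than in the loop, so tiny floating-point differences near the 2% threshold are possible.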

Resources