r for loop in dataframe

I have a for loop calculating fields in a dataframe, averaging out the difference over missing data over several rows. I used:
for(l in i:k-1){data[l,j]=as.numeric(data[l-1,j])+increase}
intending to change the fields from i to k-1.
What happens instead is that fields from i-1 to k get changed. Is this what R should do?
I appreciate that I can get the results I really want by enclosing the k-1 in brackets (as I did in the first for loop below), but don't understand why R is interpreting my i:k-1 as i-1:k (see the second loop).
Example:
> data[l,j]=0
> data[l-1,j]=0
> data[l+1,j]=0
> increase
[1] -2.8
> i
[1] 11019
> k
[1] 11020
> for(l in i:(k-1)){data[l,j]=as.numeric(data[l-1,j])+increase}
> data[l-1,j]
[1] "0"
> data[l+1,j]
[1] "0"
> data[l,j]
[1] "-2.8"
> data[l,j]=0
> for(l in i:k-1){data[l,j]=as.numeric(data[l-1,j])+increase}
> data[l,j]
[1] "7.7"
> data[l+1,j]
[1] "0"
> data[l-1,j]
[1] "10.5"
> data[l-2,j]
[1] "11.1"
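
A likely explanation, based on R's documented operator precedence: the colon operator binds more tightly than binary minus, so i:k-1 is parsed as (i:k)-1, which shifts the whole sequence down by one and gives (i-1):(k-1). A minimal sketch, reusing the values of i and k from the example above:

```r
i <- 11019
k <- 11020

# ":" is evaluated before binary "-", so this is (i:k) - 1
shifted <- i:k - 1

# Parentheses force the subtraction to happen first
intended <- i:(k - 1)

print(shifted)   # rows 11018 and 11019
print(intended)  # row 11019 only
```

Writing the parentheses explicitly, or using seq(i, k - 1), avoids relying on precedence at all.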

Related

Getting "argument is of length 0" inside if statement [duplicate]

I am having a little problem with R and I am not sure why. It is telling me that this line: if(temp > data[[k]][[k2]]) { is of argument length 0. Here is the block which is not that big:
for(k in 1:length(data)) {
  temp <- 0
  for(k2 in 3:length(data[[k]])) {
    print(data[[k]][[k2]])
    if(temp > data[[k]][[k2]]) {
      temp <- data[[k]][[k2]]
    }
    fMax[k] <- temp
    k2 <- k2 + 1
  }
  k <- k + 1
}
example of what is in data[[k]][[k2]]:
[1] "3050"
[1] "3051"
[1] "3054"
[1] "3054"
[1] "3052"
[1] "3053"
[1] "3059"
[1] "3059"
[1] "3057"
[1] "3060"
[1] "3063"
[1] "3060"
[1] "3068"
[1] "3067"
[1] "3079"
[1] "3085"
[1] "3094"
[1] "3107"
[1] "3121"
[1] "3135"
[1] "3147"
[1] "3161"
[1] "3200"
[1] "3237"
[1] "3264"
[1] "3274"
[1] "3284"
[1] "3289"
[1] "3292"
[1] "3300"
[1] "3301"
[1] "3303"
[1] "3306"
[1] "3310"
[1] "3312"
[1] "3313"
[1] "3319"
[1] "3314"
[1] "3318"
[1] "3318"
[1] "3320"
[1] "3322"
[1] "3322"
[1] "3322"
[1] "3328"
[1] "3332"
[1] "3338"
[1] "3350"
[1] "3358"
[1] "3378"
[1] "3395"
[1] "3402"
[1] "3875"
[1] "3950"
[1] "3988"
[1] "4018"
[1] "4039"
[1] "4048"
[1] "4057"
[1] "4062"
[1] "4067"
[1] "4076"
[1] "4082"
[1] "4085"
[1] "4092"
[1] "4098"
[1] "4099"
[1] "4101"
[1] "4107"
[1] "4119"
[1] "4139"
[1] "4164"
[1] "4231"
[1] "4347"
[1] "4559"

"argument is of length zero" is a very specific problem that comes from one of my least-liked elements of R. Let me demonstrate the problem:
> FALSE == "turnip"
[1] FALSE
> TRUE == "turnip"
[1] FALSE
> NA == "turnip"
[1] NA
> NULL == "turnip"
logical(0)
As you can see, comparisons to a NULL not only don't produce a boolean value, they don't produce a value at all - and control flows tend to expect that a check will produce some kind of output. When they produce a zero-length output... "argument is of length zero".
(I have a very long rant about why this infuriates me so much. It can wait.)
So, my question; what's the output of sum(is.null(data[[k]]))? If it's not 0, you have NULL values embedded in your dataset and will need to either remove the relevant rows, or change the check to
if(!is.null(data[[k]][[k2]]) && temp > data[[k]][[k2]]){
  #do stuff
}
Hopefully that helps; it's hard to tell without the entire dataset. If it doesn't help, and the problem is not a NULL value getting in somewhere, I'm afraid I have no idea.
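
A minimal sketch of the failure and of a guarded check, using a hypothetical variable named value in place of data[[k]][[k2]]. The guard relies on the short-circuit operator &&; the vectorised & evaluates both sides and still yields a zero-length result.

```r
value <- NULL  # hypothetical stand-in for a missing element

# The unguarded comparison yields logical(0), which if() rejects
msg <- tryCatch(
  if (value > 0) "positive" else "not positive",
  error = function(e) conditionMessage(e)
)
print(msg)

# "&" does not short-circuit, so the result is still zero-length
unguarded <- !is.null(value) & value > 0
print(length(unguarded))

# "&&" stops at the first FALSE, so the comparison is never evaluated
guarded <- if (!is.null(value) && value > 0) "positive" else "not positive"
print(guarded)
```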

The same error message results not only for NULL but also for other zero-length values, e.g. factor(0). In this case, the check must be if(length(element) > 0 && otherCondition), or, to make both cases explicit, if(!is.null(element) && length(element) > 0 && otherCondition).

You can use isTRUE for such cases. isTRUE is the same as { is.logical(x) && length(x) == 1 && !is.na(x) && x }
If you use shiny, you could use isTruthy, which returns FALSE for the following cases (and TRUE otherwise):
FALSE
NULL
""
An empty atomic vector
An atomic vector that contains only missing values
A logical vector that contains all FALSE or missing values
An object of class "try-error"
A value that represents an unclicked actionButton()
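
As a rough illustration of the isTRUE approach in base R (the variable names here are made up for the example), a condition that might be zero-length or NA can be wrapped so that if() always receives a single TRUE or FALSE:

```r
cond_null <- NULL == "turnip"   # logical(0)
cond_na   <- NA == "turnip"     # NA

# isTRUE() returns a single FALSE for anything that is not exactly TRUE
results <- c(
  zero_length = isTRUE(cond_null),
  missing     = isTRUE(cond_na),
  true_value  = isTRUE("a" == "a")
)
print(results)

# Safe to use directly as an if() condition
outcome <- if (isTRUE(cond_null)) "matched" else "no match"
print(outcome)
```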

I spent an entire day bashing my head against this, and the solution turned out to be simple.
R isn't zero-indexed.
Every programming language that I've used before has its data start at 0; R starts at 1.
The result is an off-by-one error, but in the opposite direction of the usual.
Indexing at position 0 returns a zero-length result, and using that in an if statement gives the "argument is of length zero" error. The confusion started because the dataset doesn't contain any NULL, and starting at position [0], as in other programming languages, turned out to be out of bounds.
Perhaps starting at 1 makes more sense to people with no programming experience (the target market for R?), but for a programmer it is a real head-scratcher if you're unaware of this.
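
A small sketch of what indexing at position 0 does to an ordinary vector (x is an invented example vector, not the asker's data):

```r
x <- c(3050, 3051, 3054)

at_zero <- x[0]  # zero-length numeric vector, not an element
at_one  <- x[1]  # the first element

print(length(at_zero))
print(at_one)

# Using the zero-length result as a condition triggers the error
msg <- tryCatch(
  if (x[0] > 3000) "big" else "small",
  error = function(e) conditionMessage(e)
)
print(msg)
```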

The "argument is of length zero" error also occurs when the condition evaluates to an integer vector of length 0, i.e. integer(0), rather than NULL.
You can verify this by checking the class of your output:
> class(output)
[1] "integer"

The simplest solution to the problem is to change your for loop statement.
Instead of using
for (i in 0:n)
use
for (i in 1:n)
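
One caveat worth sketching: 1:n counts backwards when n is 0, so the loop body would still run with i = 1 and i = 0. Base R's seq_len() (or seq_along() for an existing vector) may be a safer choice. The variable n below is only an illustration.

```r
n <- 0

backwards <- 1:n        # counts down: 1, 0
empty     <- seq_len(n) # zero-length, so a loop over it never runs

print(backwards)
print(length(empty))

iterations <- 0
for (i in seq_len(n)) {
  iterations <- iterations + 1
}
print(iterations)
```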

In my case, I just wanted to check whether the character was in the first position, as follows:
htagPos <- which(strsplit(val, "")[[1]] == "#")
if(htagPos == 1){
  next
} # this did not work :(
So I had to check the length of the result before checking the value:
htagPos <- which(strsplit(val, "")[[1]] == "#")
if(length(htagPos) >= 1 && htagPos == 1){
  next
}
I see why most people prefer python...

Another possible cause of this error is an if condition that is the return value of another function.
For example:
check <- function (value) {
  if (value == 0) {
    return(TRUE)
  }
}
Now, if this function is called like this:
if(check(value)) {
  # do something
}
then, assuming value is not 0, there is no return statement for that case, so check() returns NULL. In this case too, you'll get the "argument is of length zero" error.
Hope this is helpful!
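
A sketch of the problem and one possible fix. The names check_incomplete and check_complete are hypothetical; the only change is an explicit FALSE for the case the original function leaves unhandled.

```r
# No value is returned when value != 0, so the result is NULL
check_incomplete <- function(value) {
  if (value == 0) {
    return(TRUE)
  }
}

# Every path returns a single logical value
check_complete <- function(value) {
  if (value == 0) {
    return(TRUE)
  }
  FALSE
}

print(is.null(check_incomplete(5)))
print(check_complete(5))
print(check_complete(0))
```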

String data structures have the last data addressed nulled, so use max(data) instead of data[last].
https://www.geeksforgeeks.org/string-data-structure/
For example, a string with 4 elements will have a number element in its 5th element.

Confused on a very simple "==" test

How is this possible?
> a=TC_df$temp[561]
> a
[1] 15.6
> a==15.6
[1] FALSE
> a=="15.6"
[1] TRUE
> class(a)
[1] "numeric"

You are comparing a number with a string using the == operator, which coerces the number to a string before comparing. Use identical instead.
Start with:
> a=15.60000000000001
> a
[1] 15.6
> a=="15.6"
[1] TRUE
> a==15.6
[1] FALSE
In your case, a number of this kind is stored in the variable a.
options(digits = ...) controls the number of digits shown when printing numeric values. Now set the number of digits to print to 16:
> options(digits=16)
> a
[1] 15.60000000000001
> toString(a)
[1] "15.6"
Do you see what happened? identical does not suffer from this problem.
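
If the goal is to treat two nearly equal numbers as equal, a tolerance-based comparison is a common alternative. The sketch below assumes the stored value differs from 15.6 only in its last digits, as in the answer's example; the tolerance 1e-9 is an arbitrary choice for illustration.

```r
a <- 15.60000000000001

exact        <- a == 15.6                  # strict comparison of the stored doubles
close_enough <- isTRUE(all.equal(a, 15.6)) # equal within all.equal's default tolerance
within_tol   <- abs(a - 15.6) < 1e-9       # explicit, hand-picked tolerance

print(exact)
print(close_enough)
print(within_tol)
```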

Rounding Error when converting from character to numeric

I have a data.table of numbers in character format that I am trying to convert to numeric. However, the issue is that the numbers are very long and I want to retain all of the digits without any rounding from R. For example, here are the first 5 elements of the data.table:
> TimeO[1]
[1] "20110630224701281482"
> TimeO[2]
[1] "20110630224701281523"
> TimeO[3]
[1] "20110630224701281533"
> TimeO[4]
[1] "20110630224701281548"
> TimeO[5]
[1] "20110630224701281762"
I wrote a function to convert from a character into numeric:
convert_time_fast <- function(tim){
b <- tim - tim%/%10^12*10^12
# hhmmssffffff
ms <- b%%10^6; b <-(b-ms)/10^6
ss <- b%%10^2; b <-(b-ss)/10^2
mm <- b%%10^2; hh <-(b-mm)/10^2
# if hours>=22, subtract 24 (previous day)
hh <- hh - (hh>=22)*24
return(hh+mm/60+ss/3600+ms/(3600*10^6))
}
However, rounding occurs in R, so the data points now appear to have the same time. See the first 5 elements after converting:
TimeOC <--convert_time_fast(as.numeric(TimeO))
> TimeOC[1]
[1] 1.216311
> TimeOC[2]
[1] 1.216311
> TimeOC[3]
[1] 1.216311
> TimeOC[4]
[1] 1.216311
> TimeOC[5]
[1] 1.216311
Any help figuring this out would be greatly appreciated!

You should test whether they are really equal (all.equal()).
By default R limits the number of digits it prints (usually to 7), but the remaining digits are still there.
See also this example:
> as.numeric("1.21631114")
[1] 1.216311
> as.numeric("1.21631118")
[1] 1.216311
> all.equal(as.numeric("1.21631114"), as.numeric("1.21631118"))
[1] "Mean relative difference: 3.288632e-08" # which indicates they're not the same
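
Printing aside, a 20-digit value cannot be held exactly in a double, which carries roughly 15 to 16 significant digits, so as.numeric(TimeO) may already blur neighbouring timestamps before the function runs. One possible workaround, sketched here under the assumption that every string has the fixed layout YYYYMMDDhhmmssffffff suggested by the comment in the question, is to slice the fields out of the string first. The name convert_time_chr is hypothetical.

```r
# Assumes the last 12 characters are hhmmssffffff (an assumption, see above)
convert_time_chr <- function(tim) {
  n  <- nchar(tim)
  hh <- as.numeric(substr(tim, n - 11, n - 10))
  mm <- as.numeric(substr(tim, n - 9,  n - 8))
  ss <- as.numeric(substr(tim, n - 7,  n - 6))
  us <- as.numeric(substr(tim, n - 5,  n))
  # if hours >= 22, subtract 24 (previous day), as in the original function
  hh <- hh - (hh >= 22) * 24
  hh + mm / 60 + ss / 3600 + us / (3600 * 10^6)
}

TimeO  <- c("20110630224701281482", "20110630224701281523")
TimeOC <- convert_time_chr(TimeO)

# Ask for more digits so the difference is visible when printed
print(TimeOC, digits = 15)
print(TimeOC[1] == TimeOC[2])
```

Each field is small enough to be represented exactly, so the two timestamps stay distinct.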

A more precise sum() in R for big values?

I'm trying to do a simple sum over a large column in R. The answer comes back all right, but not to the specificity that I want. For example:
> tail(x)
[,1]
[1999995,] 1999995
[1999996,] 0
[1999997,] 1999997
[1999998,] 0
[1999999,] 1999999
[2e+06,] 0
If I do a sum(x), I get:
> sum(x)
[1] 1e+12
Which is fine, but I'd like it to print out something with more significant figures like 158683269821 or something. Is there an option in sum() to specify how many sigfigs I want?

The options I wound up using were thus:
> options("scipen"=100, "digits"=4)
> sum(x)
[1] 1000000000000
> sum(x)
[1] 1000000000000
> sum(x)+1
[1] 1000000000001
> sum(x)+2
[1] 1000000000002
> sum(x)-1
[1] 999999999999
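
As an alternative to changing global options, the formatting can be applied to a single value. This is a sketch only: x below is a reconstruction that mimics the pattern shown in tail(x) (odd positions hold their own index, even positions hold 0), not the asker's actual data.

```r
idx <- seq_len(2e6)
x   <- as.numeric(idx) * (idx %% 2)  # odd positions keep their index, even ones are 0

total <- sum(x)

# Fixed notation for this one value, leaving options() untouched
plain  <- sprintf("%.0f", total)
plain1 <- sprintf("%.0f", total + 1)
pretty <- format(total, scientific = FALSE, big.mark = ",")

print(plain)
print(plain1)
print(pretty)
```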
