Is there an easy way of avoiding a division-by-zero error in R? Specifically:
a <- c(1,0,2,0)
b <- c(3,2,1,0)
sum(b/a)
This code gives an error due to division by zero.
I would like a way to define anything/0 = 0 so that this kind of operation would still be valid.
Well, let's back up a minute. R does NOT return an error. It quite correctly returns NaN. You had better have a darn good reason for rejecting NaN values in your work. Why do you allow any element of a to be zero in the first place? You need to think about what your code is really intended to do, and what it means (statistically, say) to cheerfully throw away all the cases where a[j]==0.
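To see what R actually returns for the vectors in the question, a quick check along these lines may help. It uses only the a and b from the question and base R.

```r
a <- c(1, 0, 2, 0)
b <- c(3, 2, 1, 0)

# Element-wise division: 2/0 is Inf and 0/0 is NaN, neither is an error.
b / a          # 3.0 Inf 0.5 NaN

# Any NaN in the input makes the sum NaN.
sum(b / a)     # NaN
```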
What if you set all zero items in the denominator to NA and then exclude the NAs in your sum?
a[a==0] <- NA
sum(b/a, na.rm=TRUE)
#-----
[1] 3.5
Or without modifying a: sum(b/ifelse(a==0,NA,a), na.rm = TRUE)
You could change the function "/" to have an exception for zero:
"/" <- function(x,y) ifelse(y==0,0,base:::"/"(x,y))
For example:
> 10/0
[1] 0
This is very risky though; for example, it might break other people's code. If you want to do this, it is probably a good idea to define a different operator rather than changing /. Also, it makes no sense mathematically!
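A minimal sketch of the "different operator" idea, assuming a made-up operator name %div0% (it is not part of base R or any package). The built-in / stays untouched, so other code is unaffected.

```r
# Hypothetical operator: ordinary division, except that a zero
# denominator yields 0 instead of Inf or NaN.
`%div0%` <- function(x, y) ifelse(y == 0, 0, x / y)

a <- c(1, 0, 2, 0)
b <- c(3, 2, 1, 0)

b %div0% a        # 3.0 0.0 0.5 0.0
sum(b %div0% a)   # 3.5
10 / 0            # still Inf: the built-in operator is unchanged
```

Note that an NA in the denominator still gives NA, because ifelse() propagates a missing test value.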
If you want to mask all the NaN and Inf results, try something like:
a <- c(1,0,2,0)
b <- c(3,2,1,0)
result <- b/a
sum(result[is.finite(result)])
[1] 3.5
Or all in one line:
sum((b/a)[is.finite(b/a)])
[1] 3.5
I just worked on a similar situation.
This works well with ifelse():
sum(ifelse(a==0,0,b/a))
Related
I am trying to log transform my data with the following...
skew(data_1)
any(data_1 == 0)
data_1_no_zero <- data_1 != 0
transform <- log(data_1_no_zero)
skew(transform)
[1] NaN
But I get the following output of NaN, which I know means "not a number". Can anyone help figure out what is wrong with this, or what else could be done to get a better output?
Your proximal problem is that you should write
data_1_no_zero <- data_1[data_1 != 0]
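That is, select the elements of data_1 that are non-zero (your version creates a logical vector instead). Filtering after the transformation is also possible, but note that log(0) is -Inf rather than NA, so na.omit(log(data_1)) would not drop it; you would need to keep only the values for which is.finite() is TRUE. Removing zeros first is arguably better (farther upstream).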
As to what you should do more broadly - use a different transformation, remove zeros, add 1/use log1p(), add a small constant (e.g. data_1 <- data_1 + min(data_1[data_1>0])/2) ... that very much depends on your context/what you're trying to do.
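The options listed above can be sketched as follows. The data vector here is made up purely for illustration, and which option is appropriate still depends on your context.

```r
# Made-up, right-skewed sample containing zeros (illustration only).
data_1 <- c(0, 0, 1, 2, 5, 10, 50)

# Option 1: drop the zeros, then take logs.
log_nonzero <- log(data_1[data_1 != 0])

# Option 2: log(1 + x), which maps 0 to 0.
log_plus_one <- log1p(data_1)

# Option 3: shift by half the smallest positive value before taking logs.
shift <- min(data_1[data_1 > 0]) / 2
log_shifted <- log(data_1 + shift)

# All three results are finite, whereas the plain log contains -Inf.
all(is.finite(log_nonzero), is.finite(log_plus_one), is.finite(log_shifted))
any(is.infinite(log(data_1)))
```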
I'm trying to create a vector in R using the rep() function
p <- .9
n <- 100
rep(8,n*(1-p)^2) # expect 8
What is causing the unexpected behavior?
The reason for this is in the comments to the question. A workaround is using:
rep(8, round(n*(1-p)^2))
Condensing the comments: the second argument of rep should be an integer. From the help page ?as.integer, we know that real numbers are truncated towards zero. So
n*(1-p)^2
is passed to
as.integer(n*(1-p)^2)
which is equal to 0, because in floating-point arithmetic n*(1-p)^2 evaluates to slightly less than 1.
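A short check of the explanation above, using the p and n from the question; the variable name len is only introduced here for readability.

```r
p <- 0.9
n <- 100
len <- n * (1 - p)^2

len == 1                     # FALSE: the stored value falls just short of 1
as.integer(len)              # 0, because the fractional part is truncated
round(len)                   # 1

length(rep(8, len))          # 0
length(rep(8, round(len)))   # 1
```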
I want to exclude the rows in which x has values less than or equal to -10, so I wrote this:
newdata <- data[which(data$x> -10), ]
Is this right, or do I need to put -10 in double quotation marks?
Thank you.
(Decided to upgrade this from a comment to an answer.)
Using double quotation marks is not wise: it will mess you up in some quite surprising ways. For example, 1 > "-10" is FALSE (!!) because of the way in which R compares strings.
R's use of <- for assignment may get you in trouble; if you want x<-10 to do the comparison rather than assign the value 10 to x, you need either spaces x < -10 or parentheses (x<(-10)). However, this doesn't arise with the > comparison.
You can always use parentheses if you're worried (x > (-10)); the only drawback is that things get harder to read if you use too many (e.g., data[(which(((data$x)>(-10)))),]).
As pointed out in the comments, R is an interactive environment; if you can't figure something like this out from the documentation or other help sources, you should just try a small example and convince yourself that it works.
For example:
x <- c(-20,-15,-10,-4,0)
x[x>-10]
## -4 0
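In the same spirit, here is a small experiment showing the difference between the comparison and the accidental assignment described above. The copy y is only there so that x is not overwritten.

```r
x <- c(-20, -15, -10, -4, 0)

x > -10        # comparison: FALSE FALSE FALSE TRUE TRUE
x[x > -10]     # -4 0

y <- x         # work on a copy so that x survives
y < -10        # comparison, thanks to the space: TRUE TRUE FALSE FALSE FALSE
y<-10          # assignment: y is now the single number 10
y              # 10
```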
Sorry for the ugly code, but I'm not sure exactly what's going wrong:
for (i in 1:1)
tab_sector[1:48,i] <-
tapply(get(paste("employee",1997-1+i, "[birth<=(1997-1+i)]",sep="")),
ordered(sic2digit[birth<=(1997-1+i)],levels=tab_sector_list))
# Error in get(paste("employee", 1997 - 1 + i,
# "[birth<=(1997-1+i))]", : object 'employee97[birth<=(1997-1+i)]' not found
but the variable is there:
head(employee97[birth<=(1997-1+i)])
# [1] 1 2 2 1 3 4
A simpler version, where "employee" is not conditioned on "birth", works.
It would help if you told us what you are trying to accomplish.
In your code, the get function is looking for a variable whose name is "employee97[birth<=(1997-1+i)]". The code that works finds a variable whose name is "employee1997" and then subsets it. Those are very different: the get function does not do subsetting.
Part of what you are trying to do is FAQ 7.21, the most important part of which is the end where it suggests storing your data in lists to make accessing easier.
You can't get an indexed element, e.g. get("x[i]") fails: you need get("x")[i].
Your code is almost too messy to see what's going on, but this is an attempt at a translation:
for (i in 1:1){
ind <- 1997-1+i
v1 <- get(paste0("employee",ind))
tab_sector[1:48,i] <- tapply(v1[birth<=ind],
ordered(sic2digit[birth<=ind],levels=tab_sector_list))
}
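A self-contained sketch of both points, using toy data invented for illustration (the real birth and employee vectors are not shown in the question). It first does the lookup with get() followed by subsetting, then shows the list-based alternative that FAQ 7.21 recommends.

```r
# Toy data, invented for illustration.
birth <- c(1990, 1995, 1997, 1998, 1996)
employee1997 <- c(1, 2, 2, 1, 3)

# get() takes a plain object name; subsetting happens afterwards.
v1 <- get(paste0("employee", 1997))
v1[birth <= 1997]                      # 1 2 2 3

# Passing the subsetting expression as part of the name fails.
failed <- inherits(try(get("employee1997[birth<=1997]"), silent = TRUE),
                   "try-error")
failed                                 # TRUE

# The list-based alternative: one list, indexed by year.
employee <- list("1997" = employee1997, "1998" = c(2, 2, 3, 1, 4))
employee[["1998"]][birth <= 1998]      # 2 2 3 1 4
```

With the list, the loop only needs employee[[as.character(year)]] and no name pasting at all.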
I am not sure what I am doing wrong here.
ee <- eigen(crossprod(X))$values
for(i in 1:length(ee)){
if(ee[i]==0:1e^-9) stop("singular Matrix")}
Using the eigen value approach, I am trying to determine if the matrix is singular or not. I am attempting to find out if one of the eigen values of the matrix is between 0 and 10^-9. How can I use the if statement (as above) correctly to achieve my goal? Is there any other way to approach this?
What if I want to collect the zero eigenvalues in a vector?
zer <-NULL
ee <- eigen(crossprod(X))$values
for(i in 1:length(ee)){
if(abs(ee[i])<=1e-9)zer <- c(zer,ee[i])}
Can I do that?
@AriBFriedman is quite correct. I can, however, see a couple of other issues:
1e^-9 should be 1e-9.
0:1e-9 returns 0 (: creates a sequence in steps of one from 0 up to at most 1e-9, and therefore returns just 0). See ?`:` for more details.
Using == with decimals will cause problems due to floating point arithmetic
In the form written, your code checks (individually) whether the elements ee[i] == 0, which is not what you want (nor does it make sense in terms of floating point arithmetic).
You are looking for cases where the eigen value is less than this small number, so use less than (<).
What you are looking for is something like
if(any(abs(ee) < 1e-9)) stop('singular matrix')
If you want to get the zero (or small) eigenvalues, then use which
# this will give the indices (which elements are small)
small_values <- which(abs(ee) < 1e-9)
# and those small values
ee[small_values]
There is no need for the for loop as everything being done is vectorized.
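The points above can be tried out on a small example. The matrix X below is an assumption made up for illustration: its third column is the sum of the first two, so crossprod(X) is singular by construction.

```r
0:1e-9                              # 0: the sequence steps by 1, so only 0 fits
0.1 + 0.2 == 0.3                    # FALSE: exact comparison of doubles is fragile
isTRUE(all.equal(0.1 + 0.2, 0.3))   # TRUE: comparison with a tolerance

# Made-up rank-deficient matrix: column 3 = column 1 + column 2.
X <- cbind(c(1, 2, 3, 4), c(2, 1, 0, 5), c(3, 3, 3, 9))
ee <- eigen(crossprod(X))$values

# Vectorized test, no loop needed.
is_singular <- any(abs(ee) < 1e-9)

# Indices of the small eigenvalues, and the values themselves.
small_values <- which(abs(ee) < 1e-9)
ee[small_values]
```

For this X the exact eigenvalues are 162, 6 and 0, so only the last computed value falls under the threshold.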
if takes a single condition of length 1.
Try either ifelse or using any() or all() to turn your vector of logicals into a logical vector of length 1.
Here's an example reproducing your data:
X <- matrix(1:10, nrow = 1)
ee <- eigen(crossprod(X))$values
This will test whether any of the values of ee are > 0 AND < 1e-9:
if (any((ee > 0) & (ee < 1e-9))) {stop("singular matrix")}
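A small sketch of how any() and all() collapse a logical vector to the single value that if expects. The eigenvalues below are made-up numbers, chosen only to show the behaviour. The second half points out one caveat of the ee > 0 test: a tiny negative value, which rounding error can produce, would slip through, whereas a test on abs() catches it.

```r
# Made-up eigenvalues for illustration: one large, one tiny positive.
ee <- c(385, 2e-14)
cond <- (ee > 0) & (ee < 1e-9)   # logical vector: FALSE TRUE
any(cond)                        # TRUE: a single value, safe to use in if()
all(cond)                        # FALSE

# A tiny negative value escapes the ee > 0 test.
ee_neg <- c(385, -3e-15)
any((ee_neg > 0) & (ee_neg < 1e-9))   # FALSE
any(abs(ee_neg) < 1e-9)               # TRUE
```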