What does accessing the zero element in R do?

If I have a vector a <- c(3, 5, 7, 8)
and run a[1], not surprisingly I get 3.
But if I run a[0], I get numeric(0).
What does this mean?
What does it do?
How can I use it for practical purposes?

Others have answered what x[0] does, so I thought I'd expand on why it's useful: generating test cases. It's great for making sure that your functions work with unusual data structure variants that users sometimes produce accidentally.
For example, it makes it easy to generate 0 row and 0 column data frames:
mtcars[0, ]
mtcars[, 0]
These can arise when subsetting goes wrong:
mtcars[mtcars$cyl > 10, ]
But in your testing code it's useful to flag that you're doing it deliberately.
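As a rough sketch of what such a test might look like: col_means below is a hypothetical helper invented for this illustration (it is not part of the answer), and the checks only confirm that it copes with both edge cases.

```r
# Hypothetical helper (an assumption for illustration): mean of every column.
col_means <- function(df) {
  vapply(df, mean, numeric(1))
}

zero_rows <- mtcars[0, ]   # all 11 columns, no rows
zero_cols <- mtcars[, 0]   # all 32 rows, no columns

dim(zero_rows)
dim(zero_cols)

# The helper should not error on either edge case.
res_rows <- col_means(zero_rows)   # one NaN per column, since mean(numeric(0)) is NaN
res_cols <- col_means(zero_cols)   # zero-length numeric vector
```

Writing mtcars[0, ] in the test makes it obvious that the empty input is intentional rather than the result of a filter that matched nothing.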

http://cran.r-project.org/doc/manuals/r-release/R-lang.html#Indexing-by-vectors
As you can see it says: A special case is the zero index, which has null effects: x[0] is an empty vector and otherwise including zeros among positive or negative indices has the same effect as if they were omitted.
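A small sketch of the rule quoted above, using the vector from the question (the variable names other than a are assumptions for illustration):

```r
a <- c(3, 5, 7, 8)

empty <- a[0]              # zero index alone: an empty vector of the same type
with_pos <- a[c(0, 1, 3)]  # zeros among positive indices are ignored
with_neg <- a[c(0, -1)]    # zeros among negative indices are ignored too

empty
with_pos
with_neg
```

So a[0] is not an error and not NA; it is simply a length-zero numeric vector.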

Related

Should I use 'which' on filters?

When filtering a dataset you can use:
df[df$column==value,]
or
df[which(df$column==value),]
In the first, the row index is a logical vector. In the second, it is a vector of indexes (the positions where that logical vector is TRUE). Is one better than the other? I see that the first one sometimes returns a row with all values NA...
Which of the two expressions is more correct?
Thanks!
You should (almost) always prefer the first version.
Why? Because it’s simpler. Don’t add unnecessary complexity to your code — programming is hard enough as it is, we do not want to make it even harder; and small complexities add to each other supra-linearly.
One case where you might want to use which is when your input contains NAs that you want to ignore:
df = data.frame(column = c(1, NA, 2, 3))
df[df$column == 1, ]
# 1 NA
df[which(df$column == 1), ]
# 1
However, even in this case I would not use which; instead, I would handle the presence of NAs explicitly to document that the code expects NAs and wants to handle them. The idea is, once again, to make the code as simple and self-explanatory as possible. This implies being explicit about your intent, instead of hiding it behind non-obvious functions.
That is, in the presence of NAs I would use the following instead of which:
df[! is.na(df$column) & df$column == 1, ]
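To make the difference visible, here is a sketch with a second column added to the data frame. The column named other is an assumption for illustration; it is there only so that the result stays a data frame and the all-NA row can be seen.

```r
df <- data.frame(column = c(1, NA, 2, 3),
                 other  = c("a", "b", "c", "d"))

# Logical index: the NA in the comparison produces an extra all-NA row.
by_logical <- df[df$column == 1, ]

# which() silently drops the NA.
by_which <- df[which(df$column == 1), ]

# Explicit NA handling: same rows as which(), but the intent is spelled out.
explicit <- df[!is.na(df$column) & df$column == 1, ]

nrow(by_logical)
nrow(by_which)
nrow(explicit)
```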

Making a looping statement that populates a vector?

I've tried a couple ways of doing this problem but am having trouble with how to write it. I think I did the first three steps correctly, but now I have to fill the vector z with numbers from y that are divisible by four, not divisible by three, and have an odd number of digits. I know that I'm using the print function in the wrong way, I'm just at a loss on what else to use ...
This is different from that other question because I'm not using a while loop.
#Step 1: Generate 1,000,000 random, uniformly distributed numbers between 0
#and 1,000,000,000, and name as a vector x. With a seed of 1.
set.seed(1)
x=runif(1000000, min=0, max=1000000000)
#Step 2: Generate a rounded version of x with the name y
y=round(x,digits=0)
#Step 3: Empty vector named z
z=vector("numeric",length=0)
#Step 4: Create for loop that populates z vector with the numbers from y that are divisible by
#4, not divisible by 3, with an odd number of digits.
for(i in y) {
if(i%%4==0 && i%%3!=0 && nchar(i,type="chars",allowNA=FALSE,keepNA=NA)%%2!=0){
print(z,i)
}
}
NOTE: As per @BenBolker's comment, a loop is an inefficient way to solve your problem here. Generally, in R, try to avoid loops where possible to maximise the efficiency of your code. @SymbolixAU has provided an example of doing so in the comments. Having said that, to help you learn the ins and outs of loops and vectors, here's a solution which only requires a change to one line of your code:
You've got the vector created before the loop, which is a good start. Now, inside your loop, you need to populate that vector. To do so, you've currently got print(z,i), which won't do much. What you need to do is change the vector itself:
z <- c( z, i )
Should work for you (just replace that print line in your loop).
What's happening here is that we're taking the existing z vector, binding i to the end of it, and making that new vector z again. So every time a value is added, the vector gets a little longer, such that you'll end up with a complete vector.
Where you have print, put this instead:
z <- append(z, i)
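For comparison, a vectorized version might look roughly like the sketch below. It is not the code from the comments mentioned above; it assumes a smaller sample (1,000 draws rather than 1,000,000) so that it runs instantly, and it counts digits with formatC because nchar() applied directly to a number such as 1e+08 counts the characters of "1e+08" rather than the nine digits.

```r
set.seed(1)
# Assumption: 1,000 draws instead of 1,000,000 to keep the sketch quick.
y <- round(runif(1000, min = 0, max = 1e9))

# Count digits from a fixed-notation string, so a value such as 1e+08
# is counted as "100000000" (9 digits) rather than "1e+08" (5 characters).
n_digits <- nchar(formatC(y, format = "f", digits = 0))

# Divisible by 4, not divisible by 3, odd number of digits.
keep <- y %% 4 == 0 & y %% 3 != 0 & n_digits %% 2 == 1
z <- y[keep]

length(z)
head(z)
```

Growing z with c() or append() inside a loop copies the vector on every addition, which is why the logical-index version tends to be much faster on a million values.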

How to use negative values in which() statement in R?

I want to exclude the rows in which x has values less than or equal to -10, so I wrote this:
newdata <- data[which(data$x> -10), ]
Is this right, or do I need to put -10 in double quotation marks?
Thank you.
(Decided to upgrade this from a comment to an answer.)
Using double quotation marks is not wise: it will mess you up in some quite surprising ways. For example, 1 > "-10" is FALSE (!!) because of the way in which R compares strings.
R's use of <- for assignment may get you in trouble; if you want x<-10 to do the comparison rather than assign the value 10 to x, you need either spaces x < -10 or parentheses (x<(-10)). However, this doesn't arise with the > comparison.
You can always use parentheses if you're worried (x > (-10)); the only drawback is that things get harder to read if you use too many (e.g., data[(which(((data$x)>(-10)))),]).
As pointed out in the comments, R is an interactive environment; if you can't figure something like this out from the documentation or other help sources, you should just try a small example and convince yourself that it works.
For example:
x <- c(-20,-15,-10,-4,0)
x[x>-10]
## -4 0
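A slightly longer sketch along the same lines, using a made-up data frame (the id column and the variable v are assumptions for illustration). It also shows the assignment pitfall with < mentioned above.

```r
data <- data.frame(x = c(-20, -15, -10, -4, 0), id = 1:5)

# Keep rows where x is greater than -10; no quotation marks needed.
newdata <- data[which(data$x > -10), ]
newdata

# With spaces, "<" followed by "-10" is a comparison.
below <- data$x[data$x < -10]
below

# Without spaces, "<-" is parsed as assignment: v is overwritten with 10.
v <- c(-20, 5)
v<-10
v
```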

if statement in R?

I am not sure what I am doing wrong here.
ee <- eigen(crossprod(X))$values
for(i in 1:length(ee)){
if(ee[i]==0:1e^-9) stop("singular Matrix")}
Using the eigenvalue approach, I am trying to determine whether the matrix is singular. I want to find out whether any of the eigenvalues of the matrix is between 0 and 10^-9. How can I use the if statement (as above) correctly to achieve my goal? Is there any other way to approach this?
What if I want to collect the zero eigenvalues in a vector?
zer <-NULL
ee <- eigen(crossprod(X))$values
for(i in 1:length(ee)){
if(abs(ee[i])<=1e-9)zer <- c(zer,ee[i])}
Can I do that?
@AriBFriedman is quite correct. I can, however, see a couple of other issues:
1e^-9 should be 1e-9.
0:1e-9 returns 0 (: creates a sequence in steps of one from 0 up to 1e-9, so it returns just 0). See ?`:` for more details.
Using == with decimals will cause problems due to floating point arithmetic.
In the form written, your code checks (individually) whether each element ee[i] == 0, which is not what you want (nor does it make sense in terms of floating point arithmetic).
You are looking for cases where the eigen value is less than this small number, so use less than (<).
What you are looking for is something like
if(any(abs(ee) < 1e-9)) stop('singular matrix')
If you want to get the zero (or small) eigenvalues, then use which
# this will give the indices (which elements are small)
small_values <- which(abs(ee) < 1e-9)
# and those small values
ee[small_values]
There is no need for the for loop as everything being done is vectorized.
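Putting those pieces together, a self-contained sketch might look like this. The matrix X is made up for illustration (its second column is twice the first, so crossprod(X) is singular), and tol simply names the 1e-9 threshold from the question.

```r
# Assumed example matrix: column 2 is exactly 2 * column 1.
X <- cbind(1:5, 2 * (1:5), c(1, 0, 2, 0, 3))

# crossprod(X) is symmetric, so tell eigen() to treat it that way.
ee <- eigen(crossprod(X), symmetric = TRUE)$values

tol <- 1e-9

# One length-1 logical for if(): is any eigenvalue numerically zero?
is_singular <- any(abs(ee) < tol)

# Positions and values of the small eigenvalues, no loop needed.
small_idx <- which(abs(ee) < tol)
zer <- ee[small_idx]

if (is_singular) message("singular matrix")
```

An absolute threshold such as 1e-9 is what the question asks for; for matrices with very large or very small entries, a threshold relative to the largest eigenvalue may be more appropriate.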
if takes a single argument of length 1.
Try either ifelse or using any() or all() to turn your vector of logicals into a logical vector of length 1.
Here's an example reproducing your data:
X <- matrix(1:10, nrow = 1)
ee <- eigen(crossprod(X))$values
This will test whether any of the values of ee are > 0 AND < 1e-9:
if (any((ee > 0) & (ee < 1e-9))) {stop("singular matrix")}

How to access single elements in a table in R

How do I grab elements from a table in R?
My data looks like this:
V1 V2
1 12.448 13.919
2 22.242 4.606
3 24.509 0.176
etc...
I basically just want to grab elements individually. I'm getting confused with all the R terminology, like vectors, and I just want to be able to get at the individual elements.
Is there a function where I can just do like data[v1][1] and get the element in row 1 column 1?
Try
data[1, "V1"] # Row first, quoted column name second, and case does matter
Further note: Terminology in discussing R can be crucial and sometimes tricky. Using the term "table" to refer to that structure leaves open the possibility that it was a 'table'-classed, a 'matrix'-classed, or a 'data.frame'-classed object. The answer above would succeed with any of them, while @BenBolker's suggestion below would only succeed with a 'data.frame'-classed object.
There is a ton of free introductory material for beginners in R: CRAN: Contributed Documentation
?"[" pretty much covers the various ways of accessing elements of things.
Under usage it lists these:
x[i]
x[i, j, ... , drop = TRUE]
x[[i, exact = TRUE]]
x[[i, j, ..., exact = TRUE]]
x$name
getElement(object, name)
x[i] <- value
x[i, j, ...] <- value
x[[i]] <- value
x$i <- value
The second item is sufficient for your purpose
Under Arguments it points out that with [ the arguments i and j can be numeric, character or logical
So these work:
data[1,1]
data[1,"V1"]
As does this:
data$V1[1]
and keeping in mind a data frame is a list of vectors:
data[[1]][1]
data[["V1"]][1]
will also both work.
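Here is a sketch that runs all of these against the three rows shown in the question. The data frame is rebuilt by hand, and the matrix copy m is an assumption added to illustrate the earlier point that [row, "name"] indexing also works on a matrix.

```r
data <- data.frame(V1 = c(12.448, 22.242, 24.509),
                   V2 = c(13.919, 4.606, 0.176))

e1 <- data[1, 1]          # row 1, column 1 by position
e2 <- data[1, "V1"]       # row 1, column by name
e3 <- data$V1[1]          # extract the column, then its first element
e4 <- data[[1]][1]        # a data frame is a list of columns
e5 <- data[["V1"]][1]

# The [row, "name"] form also works on a matrix.
m <- as.matrix(data)
e6 <- m[1, "V1"]

c(e1, e2, e3, e4, e5, e6)
```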
So that's a few things to be going on with. I suggest you type in the examples at the bottom of the help page one line at a time. Yes, actually type the whole thing in, one line at a time, and see what each line does. You'll pick things up very quickly, and typing rather than copy-pasting is an important part of committing it to memory.
Maybe not so perfect as above ones, but I guess this is what you were looking for.
data[1:1,3:3] #works with positive integers
data[1:1, -3:-3] #does not work, gives the entire 1st row without the 3rd element
data[i:i,j:j] #given that i and j are positive integers
Here indexing starts from 1, i.e.,
data[1:1,1:1] #means the top-leftmost element
