How can we do operations inside indexing operations in R?

For example, consider the following vector in R:
a <- 1:8; k <- 2
What I would like to do is get, for example, all elements between positions 2k and 3k, namely:
interesting_elements <- a[2k:3k]
Error: unexpected symbol in "test[2k"
interesting_elements <- a[(2k):(3k)]
Error: unexpected symbol in "test[2k"
Unfortunately, indexing a vector this way does not work in R, and the only workaround I can see is to create one variable k' storing the result of 2k and another k'' storing the result of 3k.
Is there a way to do arithmetic inside the index without creating a new variable each time?

R does not interpret 2k as implicit multiplication the way mathematical notation does. You need to use the explicit arithmetic operator *.
If you are trying to access the 4th to 6th elements of a, then you need to use * and parentheses:
a[(2*k):(3*k)]
[1] 4 5 6
If you leave off the parentheses, the sequence operator : is evaluated first and the multiplications are applied afterwards, because : has higher precedence than *:
2*k:3*k
[1] 8 12
which is the same as
(k:3)*2*k
[1] 8 12
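As a minimal sketch of the point above, reusing a and k from the question (the names lo, hi, inside, with_vars and no_parens are only illustrative), the arithmetic can sit directly inside the brackets and gives the same selection as the intermediate-variable approach:

```r
a <- 1:8
k <- 2

# Arithmetic inside the brackets works once the operator is explicit.
inside <- a[(2 * k):(3 * k)]

# The same selection with intermediate variables, for comparison.
lo <- 2 * k
hi <- 3 * k
with_vars <- a[lo:hi]

# ':' binds tighter than '*', so this is 2 * (k:3) * k, not (2*k):(3*k).
no_parens <- 2 * k:3 * k

print(inside)
print(identical(inside, with_vars))
print(no_parens)
```

Any expression that evaluates to a valid index vector can be used between the brackets, so the extra variables are a matter of readability rather than necessity.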

Related

array index difference notation Python <-> R

What is the Python notation a[i-j] translated to R? As far as I understand it, it should be the array element at position i-j. But in R it seems to be the array up to the ith element, minus the element at position j.
R and Python index in broadly similar ways, with two main differences. First, indexing starts at 0 in Python and at 1 in R. Second, a negative index in Python counts from the end, while in R it removes the element at that position from the result. For your case, a[i-j] selects a single element in both languages when i - j is a positive integer, although R returns the element one position earlier because of its 1-based start (Python's a[i-j] corresponds to R's a[i-j+1]). When i - j is negative, the two expressions mean completely different things. The illustration below should help:
Python:
#Create a list
lst = [1,3,5,6,7,7]
#index element at 4-2 (which is 2)
lst[4-2] # returns 5
#index element at 2-4 (which is -2) or lst[len(lst)-2]
lst[2-4] # returns 7
R:
lst <- c(1,3,5,6,7,7)
#indexing element at 4-2 (which is 2)
lst[4-2] # returns 3 (because R indexing starts at 1, not 0)
[1] 3
#BUT indexing element at 2-4 (which is -2) does not work,
#because it means that you are removing the element at index 2, i.e. 3
lst[2-4] #returns the original list without element at index 2
[1] 1 5 6 7 7
These are the main differences in indexing a list that I could offer to help with your question. The differences in indexing become more prominent as you tackle more complicated data structures in both languages.
I hope this is helpful.
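To make the translation concrete, here is a small sketch of a helper that maps a Python-style index onto R. The name py_index is hypothetical (it is not part of base R), and the sketch assumes a single integer index that is in range:

```r
# Hypothetical helper: translate a Python-style index (0-based, negatives
# count from the end) into the equivalent 1-based R position.
py_index <- function(x, i) {
  n <- length(x)
  pos <- if (i < 0) n + i + 1 else i + 1
  x[[pos]]
}

lst <- c(1, 3, 5, 6, 7, 7)
py_index(lst, 4 - 2)  # same element as Python's lst[4-2]
py_index(lst, 2 - 4)  # same element as Python's lst[2-4]
```

Using [[ ]] rather than [ ] makes an out-of-range position raise an error instead of silently returning NA.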

Combining the common elements in two lists in R, using only logical and arithmetic operators

I'm currently attempting to work out the GCD of two numbers (x and y) in R. I'm not allowed to use loops or if, else, ifelse statements, so I'm restricted to logical and arithmetic operators. So far, using the code below (and the same for y), I've managed to make lists of the factors of x and y.
xfac <- 1:x
xfac <- xfac[x %% xfac == 0]  # keep only the values that divide x
This gives me two lists of factors but i'm not sure where to go from here. Is there a way I can combine the common elements in the two lists and then return the greatest value?
Thanks in advance.
Yes, max(intersect(xfac,yfac)) should give the gcd.
You have almost solved the problem. Let's take the example x <- 12 and y <- 18. The GCD is in this case 6.
We can start by creating vectors xf and yf containing the divisors of each number, similar to the code you have shown:
xf <- (1:x)[!(x%%(1:x))]
#> xf
#[1] 1 2 3 4 6 12
yf <- (1:y)[!(y%%(1:y))]
#> yf
#[1] 1 2 3 6 9 18
The parentheses after the negation operator ! are not necessary due to specific rules of operator precedence in R, but I think that they make the code clearer in this case (see fortunes::fortune(138)).
Once we have defined these vectors, we can extract the GCD with
max(xf[xf %in% yf])
#[1] 6
Or, equivalently,
max(yf[yf %in% xf])
#[1] 6
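The steps above can also be folded into a single function. This is only a sketch: the name gcd_divisors is made up for illustration, and it assumes x and y are positive integers. It stays within the constraints of the question (no loops, no if/else) by testing every candidate divisor at once:

```r
# Hypothetical helper: largest common divisor, using only vectorised
# arithmetic and logical operators (no loops, no if/else).
gcd_divisors <- function(x, y) {
  cand <- 1:min(x, y)  # no common divisor can exceed min(x, y)
  max(cand[x %% cand == 0 & y %% cand == 0])  # candidates dividing both
}

gcd_divisors(12, 18)
```

Because 1 divides every positive integer, the subset passed to max() is never empty.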

Why my output has different spacing for a specific vector with different data types in R?

I have recently started learning R and was working on combining vectors. I was following a tutorial, and when I print a vector built with c() from character, complex and numeric values, the spacing between the elements differs.
I have enclosed the snapshot for the same as I might not be able to articulate it properly in words.
As Roland commented, a vector can only contain one specific data type. Here since you have character datatype, all the other data types are coerced into character datatype.
x <- c(123.56, 21, "rajat", 2+4i); print(x)
The extra space, which as far as I understand should not be a problem, appears because the elements of the vector have different numbers of characters, and print() pads them to a common width.
nchar(x)
[1] 6 2 5 4
Now, if every element has the same number of characters, the spacing is even, as expected:
x <- c(123.56, 210000, "rajata", 2+442i); print(x)
[1] "123.56" "210000" "rajata" "2+442i"
nchar(x)
[1] 6 6 6 6
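A short sketch of what is going on, using the vector from the first example. The variable names widths and padded are only illustrative; format() is used here just to make the padding visible, and cat() is shown as one way to write the elements without it:

```r
x <- c(123.56, 21, "rajat", 2+4i)

# Every element has been coerced to character.
class(x)

# print() pads each quoted string to the width of the widest element;
# format() exposes the same padding explicitly.
widths <- nchar(x)
padded <- format(x)
nchar(padded)

# cat() writes the elements separated by single spaces, without padding.
cat(x, "\n")
```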

R: detect changing characters without loop

I'm analyzing a huge dataset of ~700000 rows.
I would like to detect where (in which rows) the value changes from the previous one, without using loops.
For instance, for the vector "dat", the ideal function would give c(4,6)
dat=c("BIS84003", "BIS84003", "BIS84003", "BIS84005", "BIS84005", "BIS84006")
Does anyone have any idea?
Here are two ways of doing this:
Use run-length encoding
Directly compare vectors
Method 1: Use run length encoding with the function rle().
dat=c("BIS84003", "BIS84003", "BIS84003", "BIS84005", "BIS84005", "BIS84006")
head(cumsum(rle(dat)$lengths) + 1, -1)
[1] 4 6
Method 2: compare vectors
1 + which(dat[-1] != dat[-length(dat)])
[1] 4 6
Using diff
which(!!c(0,diff(as.numeric(factor(dat)))))
#[1] 4 6
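As a quick check of the approaches above, the sketch below wraps the direct comparison in a function (change_points is a hypothetical name) and also runs it on a made-up vector, dat2, in which a value reappears after a different one:

```r
# Hypothetical wrapper around the direct comparison (Method 2).
change_points <- function(v) 1 + which(v[-1] != v[-length(v)])

dat <- c("BIS84003", "BIS84003", "BIS84003", "BIS84005", "BIS84005", "BIS84006")
change_points(dat)

# Made-up vector in which a value comes back after a different one.
dat2 <- c("A", "A", "B", "A")
by_compare <- change_points(dat2)
by_rle     <- head(cumsum(rle(dat2)$lengths) + 1, -1)
by_diff    <- which(!!c(0, diff(as.numeric(factor(dat2)))))

all(by_compare == by_rle)
all(by_compare == by_diff)
```

All three are vectorised, so none of them needs an explicit loop over the ~700000 rows.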

r Error dim(X) must have a positive length?

I want to compute the mean of "Population" in the built-in matrix state.x77. The code is:
apply(state.x77[,"Population"],2,FUN=mean)
#Error in apply(state.x77[, "Population"], 2, FUN = mean) :
# dim(X) must have a positive length
How can I prevent this error? If I use the $ sign:
apply(state.x77$Population,2,mean)
# Error in state.x77$Population : $ operator is invalid for atomic vectors
What is an atomic vector?
To expand on joran's comments, consider:
> is.vector(state.x77[,"Population"])
[1] TRUE
> is.matrix(state.x77[,"Population"])
[1] FALSE
So, your Population data is now no different from any other vector, like 1:10, which has neither columns nor rows to apply against. It is just a series of numbers with no more advanced structure or dimension. E.g.
> apply(1:10,2,mean)
Error in apply(1:10, 2, mean) : dim(X) must have a positive length
Which means you can just use the mean function directly against the matrix subset which you have selected: E.g.:
> mean(1:10)
[1] 5.5
> mean(state.x77[,"Population"])
[1] 4246.42
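If apply() is still wanted, one option is to keep the selection as a one-column matrix. The sketch below assumes the standard drop = FALSE argument of [ and uses pop as an illustrative name:

```r
# drop = FALSE keeps the selection as a 50 x 1 matrix, so apply() has a
# column dimension to work over.
pop <- state.x77[, "Population", drop = FALSE]
dim(pop)
apply(pop, 2, mean)

# colMeans() gives the mean of every column of the full matrix at once.
colMeans(state.x77)["Population"]
```

For a single column, calling mean() directly remains the simpler choice.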
To explain 'atomic' vectors further, see the R Language Definition (and this gets a bit complex, so hold on to your hat)...
R has six basic (‘atomic’) vector types: logical, integer, real,
complex, string (or character) and raw.
http://cran.r-project.org/doc/manuals/r-release/R-lang.html#Vector-objects
So atomic in this instance is referring to vectors as the basic building blocks of R objects (like atoms make up everything in the real world).
If you read R's inline help by entering ?"$" as a command, you will find it says:
‘$’ is only valid for recursive objects, and is only
discussed in the section below on recursive objects.
Since vectors (like 1:10) are basic building blocks ("atomic"), with no recursive sub-elements, trying to use $ to access parts of them will not work.
Since your matrix (state.x77) is essentially just a vector with some dimensions, like:
> str(matrix(1:10,nrow=2))
int [1:2, 1:5] 1 2 3 4 5 6 7 8 9 10
...you also can't use $ to access sub-parts.
> state.x77$Population
Error in state.x77$Population : $ operator is invalid for atomic vectors
But you can access subparts using [ and names like so:
> state.x77[,"Population"]
Alabama Alaska Arizona...
3615 365 2212...
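For completeness, a sketch of how the atomic/recursive distinction can be checked, and how $ becomes usable once the matrix is converted to a data frame (the name states is only illustrative):

```r
is.atomic(state.x77)     # a matrix is an atomic vector with a dim attribute
is.recursive(state.x77)

# A data frame is a list of columns, so $ works after conversion.
states <- as.data.frame(state.x77)
is.recursive(states)
mean(states$Population)
```

Converting is only worthwhile if $ access is needed elsewhere; for a single mean, the [ , "Population"] form on the matrix is enough.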

Resources