Unexpected behavior of order(x, na.last = FALSE) [duplicate] - r

I am having trouble understanding the difference between the R function rank and the R function order. They seem to produce the same output:
> rank(c(10,30,20,50,40))
[1] 1 3 2 5 4
> order(c(10,30,20,50,40))
[1] 1 3 2 5 4
Could somebody shed some light on this for me?
Thanks

set.seed(1)
x <- sample(1:50, 30)
x
# [1] 14 19 28 43 10 41 42 29 27 3 9 7 44 15 48 18 25 33 13 34 47 39 49 4 30 46 1 40 20 8
rank(x)
# [1] 9 12 16 25 7 23 24 17 15 2 6 4 26 10 29 11 14 19 8 20 28 21 30 3 18 27 1 22 13 5
order(x)
# [1] 27 10 24 12 30 11 5 19 1 14 16 2 29 17 9 3 8 25 18 20 22 28 6 7 4 13 26 21 15 23
rank returns a vector with the "rank" of each value: the number in the first position, 14, is the 9th lowest, so rank(x)[1] is 9. order returns the indices that would put the initial vector x in order.
The 27th value of x is the lowest, so 27 is the first element of order(x), and if you look at rank(x), the 27th element is 1.
x[order(x)]
# [1] 1 3 4 7 8 9 10 13 14 15 18 19 20 25 27 28 29 30 33 34 39 40 41 42 43 44 46 47 48 49
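The relationship described in this answer can be checked directly. The sketch below is only an illustration and relies on x having no ties (it is drawn without replacement); the names o and r are introduced here for convenience. The sampled values may differ between R versions, but the checks hold either way.

```r
set.seed(1)
x <- sample(1:50, 30)   # distinct values, so there are no ties

o <- order(x)
r <- rank(x)

# Indexing with order() gives the same result as sort()
identical(x[o], sort(x))     # TRUE

# Without ties, rank and order are inverse permutations of each other
all(r[o] == seq_along(x))    # TRUE
all(o[r] == seq_along(x))    # TRUE
all(order(o) == r)           # TRUE
```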

As it turned out, my example was a special case, which made things confusing. I explain below for anyone interested:
rank returns, for each element, the position it would have in an ascending list
order returns, for each position in an ascending list, the index of the element in the original vector
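A three-element vector, chosen here purely for illustration, makes the difference visible, and also hints at why the vector in the question hides it: that vector only swaps neighbouring pairs, so the permutation is its own inverse.

```r
y <- c(30, 10, 20)
rank(y)    # 3 1 2: 30 is the 3rd smallest, 10 the 1st, 20 the 2nd
order(y)   # 2 3 1: the smallest value is y[2], then y[3], then y[1]

# The vector from the question swaps neighbours only, so rank and order coincide
z <- c(10, 30, 20, 50, 40)
all(rank(z) == order(z))   # TRUE
```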

I always find it confusing to think about the difference between the two, and I always think, "how can I get to order using rank"?
Starting with Justin's example:
Order using rank:
## Setup example to match Justin's example
set.seed(1)
x <- sample(1:50, 30)
## Make a vector to store the sorted x values
xx = integer(length(x))
## i is the index, ir is the ith "rank" value
i = 0
for(ir in rank(x)){
  i = i + 1
  xx[ir] = x[i]
}
all(xx==x[order(x)])
# [1] TRUE
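The same idea can be written without a loop. This is a sketch that assumes x has no ties, so that every rank is a distinct whole number and can be used as an index.

```r
set.seed(1)
x <- sample(1:50, 30)

# Put each x[i] at position rank(x)[i]
xx <- integer(length(x))
xx[rank(x)] <- x

identical(xx, sort(x))   # TRUE
```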

rank is more complicated and not necessarily an index (integer), because tied values share the average of their ranks by default:
> rank(c(1))
[1] 1
> rank(c(1,1))
[1] 1.5 1.5
> rank(c(1,1,1))
[1] 2 2 2
> rank(c(1,1,1,1))
[1] 2.5 2.5 2.5 2.5
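The fractional values come from the default ties.method = "average". A short sketch of the other documented tie-breaking methods, using an illustrative vector v that is not from the original answer:

```r
v <- c(20, 10, 20, 30)

rank(v)                          # 2.5 1.0 2.5 4.0  (default "average")
rank(v, ties.method = "min")     # 2 1 2 4
rank(v, ties.method = "max")     # 3 1 3 4
rank(v, ties.method = "first")   # 2 1 3 4

# With ties.method = "first", rank is again the inverse of order
all(rank(v, ties.method = "first") == order(order(v)))   # TRUE
```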

In layman's terms, order gives, for each value in sorted sequence, its position in the original vector.
For example:
a<-c(3,4,2,7,8,5,1,6)
sort(a)
# [1] 1 2 3 4 5 6 7 8
The position of 1 in a is 7; similarly, the position of 2 in a is 3.
order(a)
# [1] 7 3 1 2 6 8 4 5

As stated by ?order at the R prompt, order returns a permutation which sorts the original vector into ascending or descending order.
Suppose that we have a vector
A<-c(1,4,3,6,7,4);
A.sort<-sort(A);
then, for a vector without ties,
order(A) == match(A.sort,A);
rank(A) == match(A,A.sort);
With the duplicated 4 in this particular A, the comparisons are not all TRUE, because match always returns the first matching position and rank averages tied ranks by default.
Besides, I find that order has the following properties (not validated theoretically):
1 every element of order(A) lies in 1, ..., length(A)
2 order(order(order(....order(A)....))): applying order an odd number of times always gives the same result, and so does applying it an even number of times.
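These claims can be tried out as follows. B is a hypothetical tie-free variant of A, introduced here only to show where the match identities hold and where ties break them.

```r
A <- c(1, 4, 3, 6, 7, 4)   # the vector from the answer; note the duplicated 4
B <- c(1, 4, 3, 6, 7, 5)   # hypothetical variant without ties

# Without ties, both identities hold element-wise
all(order(B) == match(sort(B), B))   # TRUE
all(rank(B)  == match(B, sort(B)))   # TRUE

# With ties, match() keeps returning the first matching position
order(A)             # 1 3 2 6 4 5
match(sort(A), A)    # 1 3 2 2 4 5

# order() of a permutation is its inverse, so a third application
# brings back the original ordering
identical(order(order(order(A))), order(A))   # TRUE
```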

some observations:
set.seed(0)
x<-matrix(rnorm(10),1)
dm<-diag(length(x))
# compute rank from order and backwards:
rank(x) == col(x)%*%dm[order(x),]
order(x) == col(x)%*%dm[rank(x),]
# in special cases like this
x<-cumsum(rep(c(2,0),times=5))+rep(c(0,-1),times=5)
# they are equal
order(x)==rank(x)

Related

How to do some operation in the elements of a vector with previous elements

This may be a very basic question, but I am scratching my head.
Suppose I have a vector v with 10 elements
v <- 1:10
> v
[1] 1 2 3 4 5 6 7 8 9 10
Now I want to apply some operation (a two-argument function), say +, to each element and its previous element, to get an output like 1+1, 1+2, 2+3, 3+4, and so on. For the first element, which has no previous value, I'll use the first value itself. I can do this by manually creating another vector, something like c(v[1], v[-length(v)]), but I presume there may be a more direct method or built-in function to do so.
> v + c(v[1], v[-length(v)])
[1] 2 3 5 7 9 11 13 15 17 19
#OR if product is the operation
> v * c(v[1], v[-length(v)])
[1] 1 2 6 12 20 30 42 56 72 90
Please guide
This can be achieved with dplyr::lag or data.table::shift, although I am not sure if this is what you were looking for.
v + dplyr::lag(v, default = v[1])
#[1] 2 3 5 7 9 11 13 15 17 19
Perhaps you may want to try mapply.
v <- 1:10
init <- v[1]
mapply(`+`, c(init, v[-length(v)]), v)
# [1] 2 3 5 7 9 11 13 15 17 19
mapply(`*`, c(init, v[-length(v)]), v)
# [1] 1 2 6 12 20 30 42 56 72 90
mapply(`^`, c(init, v[-length(v)]), v)
# [1] 1 1 8 81 1024 15625 279936 5764801 134217728 3486784401
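Since +, * and ^ are already vectorised in R, the shifted vector can also be passed to the function directly. The helper below is a hypothetical wrapper (with_previous is not a built-in), sketched under the assumption that f is vectorised and that v is not empty.

```r
# Hypothetical helper: apply a vectorised binary function f to each element
# and its predecessor; the first element is paired with itself.
with_previous <- function(v, f) {
  prev <- c(v[1], head(v, -1))   # shift right by one, repeating the first value
  f(prev, v)
}

v <- 1:10
with_previous(v, `+`)   # 2 3 5 7 9 11 13 15 17 19
with_previous(v, `*`)   # 1 2 6 12 20 30 42 56 72 90
```

The previous element is passed as the first argument, which matters for non-commutative operations such as ^.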

Multiplying odd numbers in a vector

I am a beginner in R and I need to multiply the odd numbers in the following vector by two:
x<-c(1:20)
I tried with this:
x2<-c[lapply(x,"%%",2*2)==1]
But something is wrong.
For a vector like your example comprised of consecutive integers, we can use recycling
x * c(2,1)
##[1] 2 2 6 4 10 6 14 8 18 10 22 12 26 14 30 16 34 18 38 20
More generally, we can do
x * (x%%2 + 1L)
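The difference between the two expressions shows up on a vector that does not alternate odd and even values. The vector y below is made up for illustration.

```r
y <- c(3, 8, 5, 5, 2, 11)

# Recycling doubles by position, not by parity
y * c(2, 1)          # 6 8 10 5 4 11

# y %% 2 is 1 for odd values and 0 for even ones, so the multiplier is 2 or 1
y * (y %% 2 + 1L)    # 6 8 10 10 2 22
```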
Using base R, we can try
ifelse(x %% 2 != 0, x * 2, x)
# [1] 2 2 6 4 10 6 14 8 18 10 22 12 26 14 30 16 34 18 38 20
We could find the indices of the odd values and multiply those elements by 2.
inds <- as.logical(x %% 2)
x[inds] <- x[inds] * 2
x
#[1] 2 2 6 4 10 6 14 8 18 10 22 12 26 14 30 16 34 18 38 20

R ordering bug?

vals<-c(10.3,10.3,10.2,16.4,18.8,19.7,15.6,18.2,22.6,19.9,24.2,21.0,21.4,21.3,19.1,22.2,33.8,27.4,25.7,24.9,34.5,31.7,36.3,38.3,42.6,55.4,55.7,58.3,51.5,51.0,77.0)
# Standard Order
# the second and third values should be reversed
order(vals)
# ------------------------------------------------------------
# [1] 3 1 2 7 4 8 5 15 6 10 12 14 13 16 9 11 20 19 18 22 17 21 23 24 25
# [26] 30 29 26 27 28 31
# ------------------------------------------------------------
# Reverse Decreasing
# should be the same as the original, but it isn't (it's correct)
rev(order(vals, decreasing=T))
# ------------------------------------------------------------
# [1] 3 2 1 7 4 8 5 15 6 10 12 14 13 16 9 11 20 19 18 22 17 21 23 24 25
# [26] 30 29 26 27 28 31
# ------------------------------------------------------------
I need some help in understanding what is happening in R. I think there's a bug in the output of order, because the two results are not the same. Notice the second and third values of both outputs. Shouldn't the order be 3,3,1 or 2,2,1 or 3,2,1, depending on how order treats equal values? Regardless, the third value should have order=1.
Is my understanding correct, or am I missing something?
As per the documentation,
order returns a permutation which rearranges its first argument into ascending or descending order, breaking ties by further arguments.
i.e. order() returns a set of indices such that x[order(x)] is in increasing order, or that x[order(x,decreasing = TRUE)] is in decreasing order.
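If two values in x are identical, then the order of their indices in the value returned by order does not affect whether x[order(x)] is sorted. According to ?order, unresolved ties are left in their original order, so index 1 comes before index 2 in both calls, and rev() then puts 2 before 1. The output you expected (tied values sharing a place, and the third value in first place) is what rank(vals) returns, not order(vals).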
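A cut-down version of the question's data shows that both index vectors are valid, and that rank is the function that reports the place of each value. The sketch keeps only the first four values, and the commented outputs assume the tie behaviour documented in ?order.

```r
vals <- c(10.3, 10.3, 10.2, 16.4)   # first four values from the question

order(vals)                           # 3 1 2 4
rev(order(vals, decreasing = TRUE))   # 3 2 1 4

# Both index vectors sort the data correctly
identical(vals[order(vals)], sort(vals))                           # TRUE
identical(vals[rev(order(vals, decreasing = TRUE))], sort(vals))   # TRUE

# rank(), not order(), answers "which place does each value take?"
rank(vals)                            # 2.5 2.5 1.0 4.0
```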

How does the order() function in R work for character vectors? [duplicate]


rank and order in R

