I'm reading a book on R and I don't understand the behavior of the seq function. Can someone please explain to me what it's doing when you give it a vector such as what is shown below on line 4?
> seq(1,5,1)
[1] 1 2 3 4 5
> x <- c(1,5,1)
> seq(x)
[1] 1 2 3
seq generates a sequence. Its usual form is
seq(from, to, by)
so seq(1,5,1) printed 1 to 5, incrementing by 1 each time.
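The c function combines its arguments into a vector, so x is the three-element vector 1, 5, 1. When seq is called with a single vector argument of length greater than one, it does not read the values as from, to and by; it returns the sequence from 1 to length(x), just as seq_along(x) would. Since x has three elements, you get 1 2 3.
Check the documentation in the links below to see the default methods.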
Sequence generation: seq
Combine/concatenate: c
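A small sketch of the two calling patterns side by side may help. The variable y is only an illustration and is not from the question; it shows the one case where seq(x) and seq_along(x) disagree, namely a numeric vector of length one.

```r
x <- c(1, 5, 1)

# Three separate arguments: from, to, by
s1 <- seq(1, 5, 1)      # 1 2 3 4 5

# One vector argument of length 3: positions 1..length(x)
s2 <- seq(x)            # 1 2 3
s3 <- seq_along(x)      # 1 2 3

# A single number is treated as 1:from, not as a position index
y <- 5
s4 <- seq(y)            # 1 2 3 4 5
s5 <- seq_along(y)      # 1
```

If the goal is to loop over the positions of a vector, seq_along(x) is usually the safer choice because it does not change behaviour when x happens to have length one.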
I have the following vector in R:
> A<-c(8.1915935, 3.0138083, 0.3245712, 10.7353747, 13.7505131 ,63.2337407, 16.7505131, 5.7781297)
I want to sort it and, at the same time, know each element's position in the sorted vector. So I use the following function:
sort(A, index.return=T)
And I get the following output, which I don't clearly understand:
$x
[1] 0.3245712 3.0138083 5.7781297 8.1915935 10.7353747 13.7505131 16.7505131 63.2337407
$ix
[1] 3 2 8 1 4 5 7 6
Looking at the original vector A, the first element goes in the 4th position of the sorted vector, so the first element of "$ix" should be 4. Why is it 3?
Then, the biggest number of the vector is the 6th of A. But the 6th element of $ix is not 8 (the length of the vector), as I expected, but 5. Why?
And so on, for all the elements. Clearly, there is something I don't understand about this output.
$ix gives, for each element of the sorted vector x, its position in the original vector; you were hoping for the reverse: the position of each element of the original vector within x. That is the difference between order() and rank():
> order(A)
[1] 3 2 8 1 4 5 7 6
> rank(A)
[1] 4 2 1 5 6 8 7 3
Note that order(order(A)) == rank(A) (as long as A has no ties), so one way to get the answer you're looking for is
result <- sort(A, index.return = TRUE)
order(result$ix)
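To make the two directions concrete, here is a short sketch using the vector from the question. The names ord, rnk and sorted are just illustrative.

```r
A <- c(8.1915935, 3.0138083, 0.3245712, 10.7353747,
       13.7505131, 63.2337407, 16.7505131, 5.7781297)

sorted <- sort(A)
ord <- order(A)   # for each sorted position: which element of A sits there
rnk <- rank(A)    # for each element of A: where it lands in the sorted vector

# Indexing A by ord reproduces the sorted vector
from_order <- A[ord]

# Indexing the sorted vector by rnk reproduces the original A
from_rank <- sorted[rnk]
```

So ord answers "where did this sorted value come from?", while rnk answers "where does this original value go?". The asker wanted the second one.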
When we want a sequence in R, we use either construction:
> 1:5
[1] 1 2 3 4 5
> seq(1,5)
[1] 1 2 3 4 5
This produces a sequence from start to stop (inclusive).
Is there a way to generate a sequence from start to stop (exclusive)? Like
[1] 1 2 3 4
Also, I don't want to use a workaround like a minus operator, like:
seq(1,5-1)
This is because I would like to have statements in my code that are elegant and concise. In my real world example the start and stop are not hardcoded integers but descriptive variable names. Using the variable_name - 1 construction just makes my script uglier and harder for a reviewer to read.
PS: The difference between this question and the one at remove the last element of a vector is that I am asking about sequence generation, while the other one focuses on removing the last element of a vector.
Moreover, the answers provided here are different and relevant to my problem.
One possible solution would be
head(1:5, -1)
# [1] 1 2 3 4
or you could define your own function
seq_last_exclusive <- function(x) return(x[-length(x)])
seq_last_exclusive(1:5)
# [1] 1 2 3 4
We can use the following function
f <- function(start, stop, ...) {
  if(identical(start, stop)) {
    return(vector("integer", 0))
  }
  seq.int(from = start, to = stop - 1L, ...)
}
Test
f(1, 5)
# [1] 1 2 3 4
f(1, 1)
# integer(0)
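The special case for start equal to stop is needed because seq.int(1, 0) counts downwards and returns 1 0 rather than an empty vector. As an alternative sketch, one could build the sequence from seq_len, which returns an empty vector for a length of zero. The helper name seq_excl is hypothetical, and the sketch assumes integer-valued start and stop with a step of 1.

```r
# Hypothetical helper: start, start + 1, ..., stop - 1 (stop excluded).
# Assumes integer-valued arguments and a step of 1.
seq_excl <- function(start, stop) {
  n <- max(stop - start, 0)    # number of elements, never negative
  start + seq_len(n) - 1L
}

a <- seq_excl(1, 5)   # 1 2 3 4
b <- seq_excl(1, 1)   # empty
d <- seq_excl(5, 1)   # also empty, instead of counting down

# For comparison, the pitfall the identical() check guards against:
down <- seq.int(1, 0) # 1 0
```

Unlike f above, this version also returns an empty vector when stop is smaller than start; whether that is desirable depends on the use case.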
I am trying to count integers in a vector that also contains zeros. However, tabulate doesn't count the zeros. Any ideas what I am doing wrong?
Example:
> tabulate(c(0,4,4,5))
[1] 0 0 0 2 1
but the answer I expect is:
[1] 1 0 0 0 2 1
Use a factor and define its levels
tabulate(factor(c(0,4,4,5), 0:5))
#[1] 1 0 0 0 2 1
The explanation for the behaviour you're seeing is in ?tabulate (bold face mine)
bin: a numeric vector (of positive integers), or a factor. Long
vectors are supported.
In other words, if you give it a numeric vector, that vector must contain only positive integers (greater than 0); other values are not counted. Otherwise, use a factor.
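A quick sketch of the behaviour, plus one more workaround that is not from the answers above: when the smallest possible value is known to be 0, shifting the data by one makes every value a valid bin.

```r
x <- c(0, 4, 4, 5)

# Zeros are not positive, so they are not counted
plain <- tabulate(x)                       # 0 0 0 2 1

# Factor with explicit levels 0..5: one bin per level
by_factor <- tabulate(factor(x, 0:5))      # 1 0 0 0 2 1

# Shift so that 0 maps to bin 1, 1 to bin 2, and so on
# (assumes the data contain no negative values)
shifted <- tabulate(x + 1)                 # 1 0 0 0 2 1
```

With the shifted version, element i of the result is the count of the value i - 1.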
I got annoyed enough by tabulate to write a short function that can count not only the zeroes but any other integers in a vector:
my.tab <- function(x, levs) {
  sapply(levs, function(n) {
    length(x[x==n])
  })
}
The parameter x is an integer vector that we want to tabulate. levs is another integer vector that contains the "levels" whose occurrences we count. Let's set x to some integer vector:
x <- c(0,0,1,1,1,2,4,5,5)
A) Use my.tab to emulate R's built-in tabulate. 0-s will be ignored:
my.tab(x, 1:max(x))
# [1] 3 1 0 1 2
B) Count the occurrences of integers from 0 to 6:
my.tab(x, 0:6)
# [1] 2 3 1 0 1 2 0
C) If you want to know (for some strange reason) only how many 1-s and 4-s your x vector contains, but ignore everything else:
my.tab(x, c(1,4))
# [1] 3 1
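For what it's worth, the same counts can probably be had from base R by combining table with a factor whose levels are the values of interest. This is only a sketch; count_levels is a hypothetical name, not an existing function.

```r
# Hypothetical helper: count how often each value in levs occurs in x.
# Values of x that are not in levs become NA in the factor and are
# dropped by table(), so they are simply ignored.
count_levels <- function(x, levs) {
  as.vector(table(factor(x, levels = levs)))
}

x <- c(0, 0, 1, 1, 1, 2, 4, 5, 5)

all_levels <- count_levels(x, 0:6)       # 2 3 1 0 1 2 0
no_zero    <- count_levels(x, 1:max(x))  # 3 1 0 1 2
some       <- count_levels(x, c(1, 4))   # 3 1
```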
First of all, sorry for this question. I suppose it's super basic, but I can't find the right search terms. For a vector a, let's say:
a<-c(1,1,3,2,1)
I want to get a vector b that results from summing element by element, so that each entry is the running total up to that position:
>b
1 2 5 7 8
it would be something like:
x<-2
b<-as.vector(a[1])
while(x<=length(a)) {
  c<-a[x]+b[x-1]
  b=c(b,c)
  x=x+1
}
rm(x,c)
but isn't there a built-in function for this?
You are looking for cumsum:
a = c(1,1,3,2,1)
R> cumsum(a)
[1] 1 2 5 7 8
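As a sketch of what cumsum is doing, the same running total can be written with Reduce and accumulate = TRUE, which keeps every intermediate result. The related built-ins cumprod and cummax follow the same pattern for products and running maxima.

```r
a <- c(1, 1, 3, 2, 1)

running_sum <- cumsum(a)                          # 1 2 5 7 8

# Same idea spelled out: fold with `+`, keeping each partial result
via_reduce <- Reduce(`+`, a, accumulate = TRUE)   # 1 2 5 7 8

running_prod <- cumprod(a)                        # 1 1 3 6 6
running_max  <- cummax(a)                         # 1 1 3 3 3
```

In practice cumsum is the one to use; the Reduce form is only useful when the combining function has no built-in cumulative version.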
I do:
assign('test', 'bye')
test
[1] "bye"
Now the variable test holds the string 'bye'.
I would like to use the string stored in test as the name of a column of the following list:
list(test=c(1:10))
$test
[1] 1 2 3 4 5 6 7 8 9 10
But I would like to use 'bye' as the name (because 'bye' is what is stored in the test variable).
How can I do it?
I don't think eval or assign are at all necessary here; their use usually (although not always) indicates that you're doing something the hard way, or at least the un-R-ish way.
> test <- "bye"
> L <- list(1:10) ## c() unnecessary here too
> names(L) <- test
> L
$bye
[1] 1 2 3 4 5 6 7 8 9 10
If you really want to do this in a single statement, you can do:
L <- setNames(list(1:10), test)
or
L <- structure(list(1:10), .Names=test)
I guess this will be the answer you're looking for?
assign('test','bye')
z<-list(c(1:10))
names(z)<-test
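One more variant, as a sketch: list(test = 1:10) uses the literal word test as the name, whereas indexing with [[ ]] evaluates the variable, so the element can be created under the stored string directly. The names literal and L are only illustrative.

```r
test <- "bye"

# The argument name in list() is taken literally
literal <- list(test = 1:10)
names(literal)        # "test"

# [[ ]] evaluates its index, so the stored string becomes the name
L <- list()
L[[test]] <- 1:10
names(L)              # "bye"

# Equivalent one-liner from the answer above
L2 <- setNames(list(1:10), test)
```

The [[ ]] form is handy when elements are added one at a time in a loop, with the names computed along the way.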