Create a vector that repeats itself in R

I would like to create a vector that repeats itself (e.g. 1:3 repeated until it has 12 elements):
1,2,3,1,2,3,1,2,3,1,2,3
How can I do this in R?
Thanks for your help.

See ?rep. What you want is as easy as
> rep(1:3, times = 4)
[1] 1 2 3 1 2 3 1 2 3 1 2 3
If you don't know the length of the input vector until run time but you do know the required length of the output, you could do (updated to reflect comment from @baptiste):
> rep(1:3, length.out = 12)
[1] 1 2 3 1 2 3 1 2 3 1 2 3
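The three main arguments of rep can be compared side by side. This is a minimal sketch; the variable names are only illustrative. Note that length.out cuts the last cycle short when the requested length is not a multiple of the pattern length, and rep_len is a base R shorthand for the same thing.

```r
pattern <- 1:3

# Repeat the whole pattern a fixed number of times.
by_times <- rep(pattern, times = 4)

# Repeat until the output has exactly 10 elements; the last cycle is truncated.
by_length <- rep(pattern, length.out = 10)

# Repeat each element in place instead of cycling through the pattern.
by_each <- rep(pattern, each = 2)

# rep_len(x, n) gives the same result as rep(x, length.out = n).
same_as_length <- rep_len(pattern, 10)
```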

Related

Reorder (collate) vector elements automatically

It's an easy one, but I can't find a simple solution for my problem. I have several vectors that look like rep(1:3, each = 3), and I want to convert them so they look like rep(1:3, times = 3).
So each element is repeated multiple times c(1,1,1,2,2,2,3,3,3) and I want to reorder them to c(1,2,3,1,2,3,1,2,3). How can I achieve that?
You can use a matrix transpose (nrow is the number of times each value is repeated):
x <- rep(1:3, each = 3)
as.vector(t(matrix(x, nrow = 3)))
# [1] 1 2 3 1 2 3 1 2 3
Alternatively, use run-length encoding:
v1 <- c(1,1,1,2,2,2,3,3,3)
o1 <- rle(v1)
rep(o1$values, min(o1$lengths))
[1] 1 2 3 1 2 3 1 2 3
This works for any number of distinct numbers or strings, but it assumes each value occurs the same number of times. When some values occur more often than others, you have to choose how many repetitions to keep.
Consider:
v2 <- c(1,1,1,2,2,2,3,3,3,3)
o2 <- rle(v2)
rep(o2$values, min(o2$lengths))
[1] 1 2 3 1 2 3 1 2 3
rep(o2$values, max(o2$lengths))
[1] 1 2 3 1 2 3 1 2 3 1 2 3
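If every original element should be kept even when the runs have unequal lengths, one possible approach is to order the elements by their position within their run. This is a sketch, not one of the answers above: collate_runs is a hypothetical helper name, and it assumes equal values are already adjacent, as in the question.

```r
# Hypothetical helper: interleave runs of repeated values, keeping every element.
collate_runs <- function(x) {
  runs <- rle(x)
  # Position of each element within its own run: 1, 2, 3, 1, 2, 3, ...
  pos_in_run <- sequence(runs$lengths)
  # order() leaves ties in their original order, so equal positions keep run order.
  x[order(pos_in_run)]
}

v1 <- c(1, 1, 1, 2, 2, 2, 3, 3, 3)
v2 <- c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3)

collated_equal   <- collate_runs(v1)  # same result as the matrix transpose
collated_unequal <- collate_runs(v2)  # the extra 3 ends up last

# The transpose approach for comparison (only valid for equal run lengths).
transposed <- as.vector(t(matrix(v1, nrow = 3)))
```

Unlike the rep(values, min(...)) and rep(values, max(...)) variants, this neither drops nor invents elements; it only reorders them.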

How to compare and select the minimum of two features in R?

Assume I have the following dataset:
dt<-data.frame(X=sample(5),Y=sample(5))
Now I need to compare these two features row by row and select the smaller one.
X Y
1 4 3
2 5 2
3 2 4
4 3 5
5 1 1
Then the expected answer would be
3
2
2
3
1
I know
min(dt[1,])
could be helpful but it only gives me 1
Use pmin, which is the vectorized version of min:
pmin(dt$X,dt$Y)
For example:
> dt<-data.frame(X=sample(5),Y=sample(5))
> dt
X Y
1 3 2
2 4 3
3 1 5
4 2 4
5 5 1
> pmin(dt$X,dt$Y)
[1] 2 3 1 2 1
low <- apply(dt[,c("X","Y")], 1, min)
is another implementation; it applies min to each row.
A zero-length result such as integer(0) occurs when either X or Y has length 0. From the documentation:
For min or max, a length-one vector. For pmin or pmax, a vector of length the longest of the input vectors, or length zero if one of the inputs had zero length.
So max(which(1:3 == 5),10) works and returns 10, but pmax(which(1:3 == 5),10) returns a zero-length vector.
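The approaches above can be checked against each other on fixed data. This is a small sketch using the values from the question instead of sample(), so the result is reproducible; the do.call variant is an addition here, shown as one way to extend pmin to any number of columns.

```r
dt <- data.frame(X = c(4, 5, 2, 3, 1), Y = c(3, 2, 4, 5, 1))

# Vectorized element-wise minimum of two columns.
row_min_pmin <- pmin(dt$X, dt$Y)

# Row-wise minimum via apply; unname() drops the row-name labels apply adds.
row_min_apply <- unname(apply(dt[, c("X", "Y")], 1, min))

# Pass every column of the data frame to pmin at once.
row_min_all <- do.call(pmin, dt)

# Zero-length input: max() ignores it, pmax() returns a zero-length result.
no_match <- which(1:3 == 5)
scalar_result <- max(no_match, 10)
parallel_len  <- length(pmax(no_match, 10))
```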

Using factors in R programming

If I have the code:
x <- c(rnorm(10),runif(10), rnorm(10,1))
f <- gl(3,10)
f
[1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3
Levels: 1 2 3
tapply(x,f,mean)
         1          2          3
0.07368817 0.42992416 0.64212383
How are the 1,2,3's decided? I am assuming they are levels of something.
Furthermore, why is f used as the second argument? I don't see why it is an index, and how does it know when to stop running through the index?
I tried looking up the function definition but to no avail.
If you are asking about how tapply works (rather than gl) consider another simpler example:
> x1 <- c(1,1,2,2,3,3)
> tapply(x1, x1, mean)
1 2 3
1 2 3
> f2 <- c(2,2,2,2,3,3)
> tapply(x1, f2, mean)
  2   3
1.5 3.0
In the first case, tapply takes the first two items (both labelled 1) and finds their mean, giving 1 for group 1, then the next two items (2 and 2) with mean 2, and so on.
In the second case, the first four items are labelled 2 and have mean (1+1+2+2)/4 = 1.5, and the last two are labelled 3 and have mean (3+3)/2 = 3.
In effect, the "index" labels the data, and the requested function is applied to each "group".
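The grouping can be made explicit with split, which is roughly what tapply does before applying the function. The sketch below reuses x1 and f2 from the example above; f3 is an illustrative name. It also shows where the 1, 2, 3 in the question come from: gl(n, k) generates a factor with n levels, each repeated k times, and the index must have the same length as the data, which is how tapply knows where to stop.

```r
x1 <- c(1, 1, 2, 2, 3, 3)
f2 <- c(2, 2, 2, 2, 3, 3)

# split() puts each value of x1 into the group named by its label in f2.
groups <- split(x1, f2)

# Applying mean to each group by hand gives the same numbers as tapply.
means_split  <- sapply(groups, mean)
means_tapply <- tapply(x1, f2, mean)

# gl(3, 2) builds the labels 1 1 2 2 3 3 with levels "1", "2", "3".
f3 <- gl(3, 2)
means_gl <- tapply(x1, f3, mean)
```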

How to fill a list based off of other items in the list in R?

I have a list that looks like this:
n <- c(1, rep(NA, 9), 2, rep(NA, 9))
I want the 9 observations following the first observation to contain the same value as the first observation. And continue this pattern throughout the whole list. So ideally, I want my list to look like this:
c(rep(1, 10), rep(2, 10))
I want to accomplish this without using for loops, is there a way to do this?
library(zoo)
na.locf(n)
##[1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2
You can use the each argument in the rep command:
rep(1:2, each = 10)
# [1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2
My favorite non-na.locf way:
c(NA, n[!is.na(n)])[cumsum(!is.na(n)) + 1]
# [1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2
If there are NAs before the first value, they will stay. But if you know there are no NAs at the beginning of the vector it's just:
not.na <- !is.na(n)
n[not.na][cumsum(not.na)]
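The cumsum trick can be wrapped in a small function to see how it treats leading NAs. This is a sketch in base R only; fill_forward is a hypothetical name, not a function from zoo or base.

```r
# Hypothetical helper: carry the last non-NA value forward.
fill_forward <- function(v) {
  seen <- !is.na(v)
  # cumsum(seen) counts how many non-NA values have appeared so far;
  # the leading NA in the lookup vector covers positions before the first one.
  c(NA, v[seen])[cumsum(seen) + 1]
}

n <- c(1, rep(NA, 9), 2, rep(NA, 9))
filled <- fill_forward(n)

# NAs before the first observed value stay NA.
with_leading_na <- fill_forward(c(NA, 5, NA, 7, NA))
```

Note that zoo's na.locf drops leading NAs by default (na.rm = TRUE), so its result can be shorter than the input in that case, whereas this helper always keeps the original length.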

extracting row labels (?) from a data.frame

Starting with a data.frame...
df = data.frame(k=c(1,5,4,7,6), v=c(3,1,4,1,5))
> df
k v
1 1 3
2 5 1
3 4 4
4 7 1
5 6 5
I might run some number of arbitrary manipulations...
> foo1 = df[df$k>3,]
> foo2 = head(foo1[order(foo1$v),], 2)
> foo2
k v
2 5 1
4 7 1
At this point foo2 has somehow retained the original row numbers from df (in this case 2 and 4).
How do I extract these?
> insert_magic_function_here(foo2)
[1] 2 4
I think you're looking for rownames: rownames(foo2) returns the row labels as a character vector.
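A short sketch using the data from the question. Since rownames returns character strings, as.integer is needed to get numbers back; that conversion assumes the row names are still the automatic numeric ones. The names row_labels and row_numbers are only illustrative.

```r
df <- data.frame(k = c(1, 5, 4, 7, 6), v = c(3, 1, 4, 1, 5))
foo1 <- df[df$k > 3, ]
foo2 <- head(foo1[order(foo1$v), ], 2)

# Row labels survive subsetting and reordering, as character strings.
row_labels <- rownames(foo2)

# Convert to integers to use them as positions in the original df.
row_numbers <- as.integer(row_labels)
```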
