reshuffle the sequence of rows in data frame [duplicate] - r

This question already has answers here:
How to randomize (or permute) a dataframe rowwise and columnwise?
(9 answers)
Closed 7 years ago.
I have a dataframe with 9000 rows and 6 columns. I want to make the order of rows random i.e. some kind of shuffling to produce another dataframe with the same data but the rows in random order. Could anyone tell me how to do this in R?
Thanks

If you want to shuffle the order of the rows while keeping each row's values together, you can sample the row indices.
df <- data.frame(x=1:8, y=1:8, z=1:8)
df[sample(nrow(df)), ]
which will produce a random ordering such as
x y z
2 2 2 2
3 3 3 3
4 4 4 4
6 6 6 6
5 5 5 5
8 8 8 8
7 7 7 7
1 1 1 1
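Because sample() draws a new permutation on every call, the printed order will differ between runs. A minimal sketch of a repeatable shuffle, assuming a fixed seed is acceptable; the seed value 42 and the name shuffled are illustrative and not part of the original answer:

```r
# Same example data as in the answer
df <- data.frame(x = 1:8, y = 1:8, z = 1:8)

set.seed(42)                         # illustrative seed; makes the shuffle repeatable
shuffled <- df[sample(nrow(df)), ]   # permute the row indices, keep every column

# Optional: renumber the rows 1..n instead of keeping the original row names
rownames(shuffled) <- NULL
```

Each row stays intact, so x, y and z still agree within a row; only the row order changes.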
If instead the values should be shuffled independently within each column, then you can do something like
lapply(df, function(x) { sample(x)})
which results in
$x
[1] 3 1 4 6 5 2 8 7
$y
[1] 2 5 6 3 4 8 7 1
$z
[1] 6 1 8 3 2 7 4 5
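Note that lapply returns a plain list rather than a data frame. If a data frame is wanted, one possible sketch is to assign the shuffled columns back into a copy; the name shuffled_cols is illustrative:

```r
df <- data.frame(x = 1:8, y = 1:8, z = 1:8)

shuffled_cols <- df                     # copy, so the original stays untouched
shuffled_cols[] <- lapply(df, sample)   # `[]` keeps the data frame structure

# Each column is permuted on its own, so values no longer line up across a row
str(shuffled_cols)
```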

Related

create a column in R with data from two other columns [duplicate]

This question already has answers here:
Replace a value NA with the value from another column in R
(5 answers)
Closed 3 years ago.
I don't have the slightest idea of programming, but I need to solve the following problem in R.
Let's suppose I have this data:
x y
5 8
6 5
2
9 8
4
0
6 6
7 3
3 2
I need to create a third column called "z" containing the data of "y", except for the missing values, where it should have the values of "x". It would be something like this:
x y z
5 8 8
6 5 5
2 2
9 8 8
4 4
0 0
6 6 6
7 3 3
3 2 2
dat <- data.frame(x=c(5,6,2,9,4,0,6,7,3), y = c(8,5,NA,8,NA,NA,6,3,2))
library(tidyverse)
dat %>% mutate(z = ifelse(is.na(y), x, y))
# x y z
# 1 5 8 8
# 2 6 5 5
# 3 2 NA 2
# 4 9 8 8
# 5 4 NA 4
# 6 0 NA 0
# 7 6 6 6
# 8 7 3 3
# 9 3 2 2
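If loading the tidyverse is not desirable, the same idea can be sketched in base R. This is only an alternative spelling of the ifelse() call above, not a different method:

```r
dat <- data.frame(x = c(5, 6, 2, 9, 4, 0, 6, 7, 3),
                  y = c(8, 5, NA, 8, NA, NA, 6, 3, 2))

# Take x wherever y is missing, otherwise keep y
dat$z <- ifelse(is.na(dat$y), dat$x, dat$y)
dat
```

Within a dplyr pipeline, mutate(z = coalesce(y, x)) expresses the same rule.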

Formatting R combn output

As a short example, when running combn(1:5,2), I get a matrix of 2 rows and 10 columns.
I know I can convert the output matrix to a data frame, but is there any option inside combn to get the output directly as a vertical data frame of 2 columns and 10 rows?
Thanks.
Simply transpose the matrix with t():
data.frame(t(combn(1:5, 2)))
Yields:
X1 X2
1 1 2
2 1 3
3 1 4
4 1 5
5 2 3
6 2 4
7 2 5
8 3 4
9 3 5
10 4 5
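Going by its documented arguments (x, m, FUN, simplify), combn does not appear to offer a switch for the output orientation, so the transpose is still needed. A small sketch that also gives the columns readable names; pair_df, first and second are illustrative names:

```r
# One combination per row instead of one per column
pair_df <- as.data.frame(t(combn(1:5, 2)))
names(pair_df) <- c("first", "second")

head(pair_df, 3)
```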

Order data frame by column and display WITH indices

I have the following R data frame
> df
a
1 3
3 2
4 1
5 3
6 6
7 7
8 2
10 8
I order it by the a column with the order function df[ order(df), ]:
[1] 1 2 2 3 3 6 7 8
This is the result I want, BUT, how can I list the whole data frame with the permuted indices?
The only thing that works is the following, but it seems sloppy and I don't really understand what it does:
> df[ order(df), c(1,1) ] # I want this but without the a.1 column!!!!
a a.1
4 1 1
3 2 2
8 2 2
1 3 3
5 3 3
6 6 6
7 7 7
10 8 8
Thanks
If we need the indices as well, use sort with index.return = TRUE
data.frame(sort(df$a, index.return=TRUE))
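Note that the ix column returned by sort() holds positions within the vector, not the original row names. If the goal is the sorted data frame with its original row names, a possible sketch is to index with drop = FALSE, which stops a single-column data frame from collapsing to a vector. The data frame below is rebuilt from the question's printout, so its construction is an assumption:

```r
# Rebuild the question's data frame, including its non-consecutive row names
df <- data.frame(a = c(3, 2, 1, 3, 6, 7, 2, 8),
                 row.names = c("1", "3", "4", "5", "6", "7", "8", "10"))

# order() on the column gives the permutation; drop = FALSE keeps a data frame
sorted <- df[order(df$a), , drop = FALSE]
sorted
```

This also explains the c(1,1) workaround in the question: selecting two columns prevents the drop to a vector, at the cost of the unwanted a.1 column.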

Sort two columns alphabetically by row and swap [duplicate]

This question already has answers here:
Sorting rows alphabetically
(4 answers)
Closed 7 years ago.
I am trying to sort each row of a data frame using this line,
sapply(df, function(x) sort(x))
However, the columns are getting sorted instead of the rows.
For example, this data frame
5 10 7 1 5
6 3 9 2 4
4 5 1 3 3
is ending up like this:
4 3 1 1 3
5 5 7 2 4
6 10 9 3 5
And I want this:
1 5 5 7 10
2 3 4 6 9
1 3 3 4 5
Any recommendations? Thanks
You could use the plain apply function with MARGIN = 1 to apply over rows and then transpose the result.
t(apply(df, 1, sort))
You can transpose it (which converts it to a matrix), split by column, and sort each piece:
t(sapply(split(t(df), col(t(df))), sort))
# [,1] [,2] [,3] [,4] [,5]
# 1 1 5 5 7 10
# 2 2 3 4 6 9
# 3 1 3 3 4 5
Because a data.frame is a list of columns, sapply iterates over the columns, so that call sorts each column rather than each row.
Alternatively, apply over the rows:
t(apply(df, 1, sort))
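A self-contained sketch of the row-wise version, using the numbers from the question. The column names a to e are assumptions, since the question does not show any. Because sort() keeps element names, the result may carry column labels that no longer match the sorted values, so this sketch drops them:

```r
# Data from the question; column names a..e are made up for illustration
df <- data.frame(a = c(5, 6, 4), b = c(10, 3, 5), c = c(7, 9, 1),
                 d = c(1, 2, 3), e = c(5, 4, 3))

# apply() over rows returns one sorted row per column, hence the transpose
sorted <- t(apply(df, 1, sort))
dimnames(sorted) <- NULL            # old column labels are meaningless after sorting

sorted_df <- as.data.frame(sorted)  # back to a data frame, columns V1..V5
sorted_df
```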
