This question already has answers here:
How to randomize (or permute) a dataframe rowwise and columnwise?
(9 answers)
Closed 7 years ago.
I have a dataframe with 9000 rows and 6 columns. I want to make the order of rows random i.e. some kind of shuffling to produce another dataframe with the same data but the rows in random order. Could anyone tell me how to do this in R?
Thanks
If you want to shuffle whole rows, keeping the values within each row together, you can just sample the row indices.
df <- data.frame(x=1:8, y=1:8, z=1:8)
df[sample(nrow(df)), ]
which produces, for example (the order changes on every run):
  x y z
2 2 2 2
3 3 3 3
4 4 4 4
6 6 6 6
5 5 5 5
8 8 8 8
7 7 7 7
1 1 1 1
If instead the values should be shuffled independently within each column, then you can do something like
lapply(df, function(x) { sample(x)})
which results in a list of independently shuffled columns, for example
$x
[1] 3 1 4 6 5 2 8 7
$y
[1] 2 5 6 3 4 8 7 1
$z
[1] 6 1 8 3 2 7 4 5
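Note that lapply returns a list rather than a data frame. Below is a minimal sketch of both shuffles that keeps the result as a data frame, using the same example df. The names shuffled_rows and shuffled_cols and the seed value are illustrative assumptions, not part of the original answer.

```r
set.seed(42)  # assumption: fixed seed only so the run is reproducible

df <- data.frame(x = 1:8, y = 1:8, z = 1:8)

# Shuffle whole rows: values within a row stay together
shuffled_rows <- df[sample(nrow(df)), ]

# Shuffle each column independently; assigning into [] keeps the data frame structure
shuffled_cols <- df
shuffled_cols[] <- lapply(df, sample)

print(shuffled_rows)
print(shuffled_cols)
```

Both results contain the same values as df; only the ordering differs.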
This question already has answers here:
Replace a value NA with the value from another column in R
(5 answers)
Closed 3 years ago.
I don't have the slightest idea of programming, but I need to solve the following problem in R.
Let's suppose I have this data:
x  y
5  8
6  5
2
9  8
4
0
6  6
7  3
3  2
I need to create a third column called "z" containing the data of "y", except for the missing values, where it should have the values of "x". It would be something like this:
x  y  z
5  8  8
6  5  5
2     2
9  8  8
4     4
0     0
6  6  6
7  3  3
3  2  2
dat <- data.frame(x=c(5,6,2,9,4,0,6,7,3), y = c(8,5,NA,8,NA,NA,6,3,2))
library(tidyverse)
dat %>% mutate(z = ifelse(is.na(y), x, y))
#   x  y z
# 1 5  8 8
# 2 6  5 5
# 3 2 NA 2
# 4 9  8 8
# 5 4 NA 4
# 6 0 NA 0
# 7 6  6 6
# 8 7  3 3
# 9 3  2 2
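If loading the tidyverse is not an option, the same rule can be sketched in base R. This assumes the dat defined in the answer above and is only one of several ways to do it.

```r
dat <- data.frame(x = c(5, 6, 2, 9, 4, 0, 6, 7, 3),
                  y = c(8, 5, NA, 8, NA, NA, 6, 3, 2))

# Take y where it is present, fall back to x where y is NA
dat$z <- ifelse(is.na(dat$y), dat$x, dat$y)

print(dat)
```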
As a short example, when running combn(1:5,2), I get a matrix of 2 rows and 10 columns.
I know I can convert the output matrix to a data frame, but is there any option inside combn to get the output directly as a vertical data frame of 2 columns and 10 rows?
Thanks.
Simply transpose the matrix with t():
data.frame(t(combn(1:5, 2)))
Yields:
   X1 X2
1   1  2
2   1  3
3   1  4
4   1  5
5   2  3
6   2  4
7   2  5
8   3  4
9   3  5
10  4  5
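As far as I know, combn itself has no argument that returns a data frame, so the conversion has to happen afterwards. Here is a small sketch of the transpose approach with explicit column names. The names pairs, first and second are made up for illustration.

```r
# combn() returns one combination per column, so transpose to get one per row
pairs <- as.data.frame(t(combn(1:5, 2)))

# Hypothetical column names, chosen only for this example
names(pairs) <- c("first", "second")

print(pairs)
```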
I have the following R data frame
> df
   a
1  3
3  2
4  1
5  3
6  6
7  7
8  2
10 8
I order it by the a column with the order function df[ order(df), ]:
[1] 1 2 2 3 3 6 7 8
This is the result I want, but how can I list the whole data frame with the permuted indices?
The only thing that works is the following, but it seems sloppy and I don't really understand what it does:
> df[ order(df), c(1,1) ] # I want this but without the a.1 column!!!!
   a a.1
4  1   1
3  2   2
8  2   2
1  3   3
5  3   3
6  6   6
7  7   7
10 8   8
Thanks
If we need the indices as well, use sort with index.return = TRUE. The ix column of the result holds positions within df$a, not the original row names:
data.frame(sort(df$a, index.return=TRUE))
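For context, indexing a one-column data frame with df[i, ] returns a bare vector because the data frame structure is dropped by default. Below is a sketch of an alternative that keeps the data frame and its original row names, assuming the df from the question. It uses order(df$a) with drop = FALSE, which is a different technique from the sort call above.

```r
# Rebuild the example data with the row names shown in the question
df <- data.frame(a = c(3, 2, 1, 3, 6, 7, 2, 8),
                 row.names = c("1", "3", "4", "5", "6", "7", "8", "10"))

# drop = FALSE stops R from collapsing the single column into a vector
sorted <- df[order(df$a), , drop = FALSE]

print(sorted)
```

The row names of sorted show which original row each value came from.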
This question already has answers here:
Sorting rows alphabetically
(4 answers)
Closed 7 years ago.
I am trying to sort each row of a data frame using this line,
sapply(df, function(x) sort(x))
However, the columns are getting sorted instead of the rows.
For example, this data frame
5 10 7 1 5
6 3 9 2 4
4 5 1 3 3
is ending up like this:
4 3 1 1 3
5 5 7 2 4
6 10 9 3 5
And I want this:
1 5 5 7 10
2 3 4 6 9
1 3 3 4 5
Any recommendations? Thanks
You could use the plain apply function with MARGIN = 1 to apply over rows and then transpose the result.
t(apply(df, 1, sort))
You can transpose it (which converts it to a matrix), split by column, and sort:
t(sapply(split(t(df), col(t(df))), sort))
#   [,1] [,2] [,3] [,4] [,5]
# 1    1    5    5    7   10
# 2    2    3    4    6    9
# 3    1    3    3    4    5
Because a data.frame is a list of columns, when you sapply like that you are sorting the columns.
Alternatively, apply by row:
t(apply(df, 1, sort))
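The apply approach returns a matrix, so a data frame has to be rebuilt afterwards if one is needed. Below is a minimal sketch with the example values from the question. The input column names V1 to V5 and the output names sorted_1 to sorted_5 are assumptions, since the question does not show any names.

```r
# Example data from the question; column names are assumed
df <- data.frame(V1 = c(5, 6, 4), V2 = c(10, 3, 5), V3 = c(7, 9, 1),
                 V4 = c(1, 2, 3), V5 = c(5, 4, 3))

# apply() over rows returns each sorted row as a column, so transpose back
sorted <- as.data.frame(t(apply(df, 1, sort)))

# After sorting, positions no longer map to the original columns, so rename them
names(sorted) <- paste0("sorted_", seq_len(ncol(sorted)))

print(sorted)
```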