This question already has answers here:
Reorder the rows of data frame in dplyr
(2 answers)
dplyr arrange by reverse alphabetical order [duplicate]
(1 answer)
Closed 3 years ago.
How can I invert the rows of a dataframe/tibble using dplyr? I don't want to arrange it by a certain variable, but rather have it just inverted.
I.e. the tibble
# A tibble: 5 x 2
a b
<int> <chr>
1 1 one
2 2 two
3 3 three
4 4 four
5 5 five
should become
# A tibble: 5 x 2
a b
<int> <chr>
1 5 five
2 4 four
3 3 three
4 2 two
5 1 one
Just arrange() by descending row_number() like this:
my_tibble %>%
  dplyr::arrange(-dplyr::row_number())
We can use desc
my_tibble %>%
  arrange(desc(row_number()))
Or another option is slice
my_tibble %>%
  slice(rev(row_number()))
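Or arrange by the 'a' column in descending order (this only reverses the rows because 'a' is already sorted in ascending order)
my_tibble %>%
  arrange(desc(a))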
# a b
#1 5 five
#2 4 four
#3 3 three
#4 2 two
#5 1 one
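For reference, here is a self-contained sketch that rebuilds the example tibble and reverses it with the row-number-based approaches. The name my_tibble follows the answers above; the data construction is an assumption based on the printed table in the question, and slice(n():1) is an additional variant not mentioned above.

```r
library(dplyr)

# Rebuild the example tibble from the question (values taken from the printout)
my_tibble <- tibble(
  a = 1:5,
  b = c("one", "two", "three", "four", "five")
)

# Three ways to reverse the row order without relying on any column's values
rev_arrange <- my_tibble %>% arrange(desc(row_number()))
rev_slice   <- my_tibble %>% slice(rev(row_number()))
rev_n       <- my_tibble %>% slice(n():1)

print(rev_arrange)
```

Because none of these depend on the contents of 'a' or 'b', they should keep working when the data is not sorted by any column.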
This question already has answers here:
count number of rows in a data frame in R based on group [duplicate]
(8 answers)
Closed 1 year ago.
I want to summarize the following data frame into a summary table.
plot <- c(rep(1,2), rep(2,4), rep(3,3))
bird <- c('a','b', 'a','b', 'c', 'd', 'a', 'b', 'c')
area <- c(rep(10,2), rep(5,4), rep(15,3))
birdlist <- data.frame(plot,bird,area)
birdlist
plot bird area
1 1 a 10
2 1 b 10
3 2 a 5
4 2 b 5
5 2 c 5
6 2 d 5
7 3 a 15
8 3 b 15
9 3 c 15
I tried the following
birdlist %>%
group_by(plot, area) %>%
mutate(count(bird))
I am trying to get a data frame as result that looks like the following
plot bird area
1 2 10
2 4 5
3 3 15
Please help or advise on how to count the birds in each plot, along with the respective area of the plot. Thanks.
You were very close. You want summarize instead of mutate, though, and you can use n() to count the number of rows within the groups you're specifying.
library(tidyverse)
birdlist %>%
  group_by(plot, area) %>%
  summarize(bird = n(),
            .groups = "drop")
#> # A tibble: 3 x 3
#> plot area bird
#> <dbl> <dbl> <int>
#> 1 1 10 2
#> 2 2 5 4
#> 3 3 15 3
If you're set on count, you would use it without group_by.
birdlist %>%
  count(plot, area, name = "bird")
We could group_by plot and summarise using unique(), assuming area is constant within each plot:
birdlist %>%
  group_by(plot) %>%
  summarise(bird = n(), area = unique(area))
plot bird area
<dbl> <int> <dbl>
1 1 2 10
2 2 4 5
3 3 3 15
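If area could ever differ within a plot, unique(area) would return more than one value for that group. The sketch below is one hedged alternative: it uses first() so each group always yields a single value, and adds a hypothetical helper column n_area (not part of the original answers) to flag plots with more than one area. It also shows the count() version on the same data for comparison.

```r
library(dplyr)

# Data from the question
birdlist <- data.frame(
  plot = c(rep(1, 2), rep(2, 4), rep(3, 3)),
  bird = c("a", "b", "a", "b", "c", "d", "a", "b", "c"),
  area = c(rep(10, 2), rep(5, 4), rep(15, 3))
)

# Count rows per plot/area combination in one step
by_count <- birdlist %>%
  count(plot, area, name = "bird")

# group_by + summarise; first() always yields one value per group, and
# n_area (hypothetical helper) reports how many distinct areas a plot has
by_summarise <- birdlist %>%
  group_by(plot) %>%
  summarise(bird = n(), area = first(area), n_area = n_distinct(area))

print(by_count)
print(by_summarise)
```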
This question already has answers here:
Numbering rows within groups in a data frame
(10 answers)
Closed 1 year ago.
I am working on a two-column dataset in R containing response values ("Response") for samples belonging to different groups ("Group"). I want to create a third ID column that numbers each sample from 1 to [..] within its group (the groups do not all contain the same number of samples). Here are just a few lines as an example: Example
Thanks for your help.
Try:
library(tidyverse)
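your_data %>%
  group_by(Group) %>%
  mutate(ID = 1:n())
We could use cur_group_id(), which numbers the groups themselves (new_id in the output) rather than the rows within each group: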
library(dplyr)
df %>%
  group_by(Group) %>%
  mutate(new_id = cur_group_id())
Output:
Group Response Id new_id
<chr> <dbl> <dbl> <int>
1 A 1.5 1 1
2 A 3.4 2 1
3 A 2.3 3 1
4 A 1.8 4 1
5 B 1.9 1 2
6 B 1.4 2 2
7 C 2.7 1 3
8 C 2.3 2 3
9 C 3.2 3 3
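To make the difference between the two kinds of ID concrete, here is a small sketch that builds both in one pipeline. The data frame is reconstructed from the output printed above, so treat it as an assumption about the original data; row_number() is used here as an equivalent of 1:n().

```r
library(dplyr)

# Data reconstructed from the printed output (assumed to match the original)
df <- data.frame(
  Group = c("A", "A", "A", "A", "B", "B", "C", "C", "C"),
  Response = c(1.5, 3.4, 2.3, 1.8, 1.9, 1.4, 2.7, 2.3, 3.2)
)

ids <- df %>%
  group_by(Group) %>%
  mutate(
    Id = row_number(),       # position of the row within its group
    new_id = cur_group_id()  # index of the group itself
  ) %>%
  ungroup()

print(ids)
```

Id restarts at 1 in every group, while new_id is constant within a group, so which one is wanted depends on whether samples or groups need numbering.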
This question already has answers here:
Convert data from long format to wide format with multiple measure columns
(6 answers)
Closed 3 years ago.
I have a df:
df= data.frame(year=c(rep(2018,4),rep(2017,3)),Area=c(1:4,1:3),P=1:7,N=1:7)
I want to split it by years, and then merge everything together again so I can see years as columns for each area. In order to do this, I am splitting and merging:
s=split(df,df$year)
# split() orders the list by year, so s[[1]] is 2017 and s[[2]] is 2018
m=merge(s[[1]][,2:4],s[[2]][,2:4],by='Area',all=TRUE)
colnames(m)=c('area','P2017','N2017','P2018','N2018')
I am sure there is a more efficient way, especially as the possibility for errors is very high once I include data from other years.
Any suggestions?
We can gather the data to long form, excluding the year and Area columns, unite the key with the year, and then spread it back to wide format.
library(dplyr)
library(tidyr)
df %>%
  gather(key, value, -year, -Area) %>%
  unite(key, key, year, sep = "") %>%
  spread(key, value)
# Area N2017 N2018 P2017 P2018
#1 1 5 1 5 1
#2 2 6 2 6 2
#3 3 7 3 7 3
#4 4 NA 4 NA 4
We can do this with dcast from data.table which can take multiple value.var columns
library(data.table)
dcast(setDT(df), Area ~ year, value.var = c("P", "N"))
# Area P_2017 P_2018 N_2017 N_2018
#1: 1 5 1 5 1
#2: 2 6 2 6 2
#3: 3 7 3 7 3
#4: 4 NA 4 NA 4
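In current tidyr, gather() and spread() are superseded by pivot_longer() and pivot_wider(). As a hedged sketch, the same reshape can be written as a single pivot_wider() call on the question's df; names_sep = "" and names_sort = TRUE are choices made here to mimic the column names of the gather/spread answer, not something the original answers specify.

```r
library(tidyr)

# Data from the question
df <- data.frame(year = c(rep(2018, 4), rep(2017, 3)),
                 Area = c(1:4, 1:3), P = 1:7, N = 1:7)

# One column per measure/year combination, one row per Area
wide <- pivot_wider(
  df,
  names_from = year,        # years become part of the new column names
  values_from = c(P, N),    # both measures are spread at once
  names_sep = "",           # "P2017" rather than "P_2017"
  names_sort = TRUE         # order the year columns ascending
)

print(wide)
```

Adding data for further years needs no change to this call, since the new column names are derived from the year values.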
This question already has answers here:
R: reshaping wide to long [duplicate]
(1 answer)
Using tidyr to combine multiple columns [duplicate]
(1 answer)
Reshaping multiple sets of measurement columns (wide format) into single columns (long format)
(8 answers)
Closed 4 years ago.
I'm hoping to reshape a dataframe in R so that a set of columns read in with duplicated names, and then renamed as var, var.1, var.2, anothervar, anothervar.1, anothervar.2 etc. can be treated as independent observations. I would like the number appended to the variable name to be used as the observation so that I can melt my data.
For example,
dat <- data.frame(ID=1:3, var=c("A", "A", "B"),
                  anothervar=c(5,6,7), var.1=c("C","D","E"),
                  anothervar.1 = c(1,2,3))
> dat
ID var anothervar var.1 anothervar.1
1 1 A 5 C 1
2 2 A 6 D 2
3 3 B 7 E 3
How can I reshape the data so it looks like the following:
ID obs var anothervar
1 1 A 5
1 2 C 1
2 1 A 6
2 2 D 2
3 1 B 7
3 2 E 3
Thank you for your help!
We can use melt from data.table, which accepts multiple patterns in the measure argument
library(data.table)
melt(setDT(dat), measure = patterns("^var", "anothervar"),
     variable.name = "obs", value.name = c("var", "anothervar"))[order(ID)]
# ID obs var anothervar
#1: 1 1 A 5
#2: 1 2 C 1
#3: 2 1 A 6
#4: 2 2 D 2
#5: 3 1 B 7
#6: 3 2 E 3
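As for a tidyverse solution, we can use unite with gather
library(dplyr)
library(tidyr)
dat %>%
  unite("1", var, anothervar) %>%   # paste each pair of columns into one, e.g. "A_5"
  unite("2", var.1, anothervar.1) %>%
  gather(obs, value, -ID) %>%       # the column names "1" and "2" become obs
  separate(value, into = c("var", "anothervar"))   # split the pairs again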
# ID obs var anothervar
#1 1 1 A 5
#2 2 1 A 6
#3 3 1 B 7
#4 1 2 C 1
#5 2 2 D 2
#6 3 2 E 3
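Another tidyverse route, sketched here under the assumption that a recent tidyr is available, is pivot_longer() with the special ".value" name. The unsuffixed columns are first renamed with an explicit ".0" suffix (a choice made for this sketch) so that every name has the form variable.index, and obs is then shifted to start at 1 as in the desired output.

```r
library(dplyr)
library(tidyr)

# Data from the question
dat <- data.frame(ID = 1:3, var = c("A", "A", "B"),
                  anothervar = c(5, 6, 7), var.1 = c("C", "D", "E"),
                  anothervar.1 = c(1, 2, 3),
                  stringsAsFactors = FALSE)

long <- dat %>%
  # Give the first set of columns an explicit index so all names split cleanly
  rename(var.0 = var, anothervar.0 = anothervar) %>%
  pivot_longer(
    -ID,
    names_to = c(".value", "obs"),  # part before the dot names the output column
    names_sep = "\\."               # split column names at the literal dot
  ) %>%
  mutate(obs = as.integer(obs) + 1L)  # 0/1 suffixes become observations 1/2

print(long)
```

Unlike the unite/separate approach, this keeps anothervar numeric because the values are never pasted into strings.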
This question already has answers here:
Aggregate / summarize multiple variables per group (e.g. sum, mean)
(10 answers)
Closed 6 years ago.
I'm attempting to collapse a dataframe onto itself. The aggregate function seems like my best bet, but I'm not sure how to have some columns summed while others remain the same.
My dataframe looks like this
A 1 3 2
A 2 3 4
B 1 2 4
B 4 2 2
How can I use the aggregate function or the ddply function to create something that looks like this:
A 3 3 6
B 5 2 6
We can use dplyr
library(dplyr)
df1 %>%
  group_by(col1) %>%
  summarise_each(funs(if(n_distinct(.)==1) .[1] else sum(.)))
Or, if the column 'col3' is constant within each 'col1' group, another option is to keep it in the group_by and then sum the other columns
df1 %>%
  group_by(col1, col3) %>%
  summarise_each(funs(sum))
# col1 col3 col2 col4
# <chr> <int> <int> <int>
#1 A 3 3 6
#2 B 2 5 6
Or with aggregate
aggregate(.~col1+col3, df1, FUN = sum)
# col1 col3 col2 col4
#1 B 2 5 6
#2 A 3 3 6
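summarise_each() and funs() are deprecated in current dplyr. As a hedged sketch, the first dplyr option above can be written with across() instead; df1 and its column names col1 to col4 are assumptions reconstructed from the question's data and the answers' output.

```r
library(dplyr)

# Data reconstructed from the question (column names assumed from the answers)
df1 <- data.frame(col1 = c("A", "A", "B", "B"),
                  col2 = c(1L, 2L, 1L, 4L),
                  col3 = c(3L, 3L, 2L, 2L),
                  col4 = c(2L, 4L, 4L, 2L))

collapsed <- df1 %>%
  group_by(col1) %>%
  # Keep a column's value if it is constant within the group, otherwise sum it
  summarise(across(everything(),
                   ~ if (n_distinct(.x) == 1) .x[1] else sum(.x)))

print(collapsed)
```

Note that this rule keeps rather than sums any column whose values happen to be identical within a group, so grouping by col1 and col3 explicitly is the safer choice when the columns to sum are known in advance.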