This question already has answers here:
How to reshape data from long to wide format
(14 answers)
Closed 6 years ago.
I have a data set like the following, in which each ID has multiple rows, one per attribute.
ID<-c(1,1, 2,2,3,3)
Score<-c(4,5, 5,7,8,9)
Attribute<-c("Att_1","Att_2", "Att_1","Att_2", "Att_1","Att_2")
T<-data.frame(ID, Score, Attribute)
I need to transform it into the following format so that each ID has one row:
ID Att_1 Att_2
1 4 5
2 5 7
3 8 9
There are threads on how to do this in Excel; I'm just wondering whether there is a neat way to do it in R. Thanks a lot!
You could try this:
library(reshape2)
dcast(T, ID ~ Attribute, value.var="Score")
# ID Att_1 Att_2
#1 1 4 5
#2 2 5 7
#3 3 8 9
This can be done with reshape():
reshape(data.frame(ID, Score, Attribute), idvar='ID', timevar='Attribute', direction='wide')
## ID Score.Att_1 Score.Att_2
## 1 1 4 5
## 3 2 5 7
## 5 3 8 9
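The Score. prefix and the non-consecutive row names in the reshape() output can be tidied up to match the layout asked for in the question. A minimal sketch, assuming the example data from the question; the names long and wide are chosen here for illustration only.

```r
# Example data from the question; 'long' is an illustrative name
long <- data.frame(
  ID        = c(1, 1, 2, 2, 3, 3),
  Score     = c(4, 5, 5, 7, 8, 9),
  Attribute = c("Att_1", "Att_2", "Att_1", "Att_2", "Att_1", "Att_2")
)

# Long to wide: one row per ID, one column per Attribute
wide <- reshape(long, idvar = "ID", timevar = "Attribute", direction = "wide")

# Drop the "Score." prefix and renumber the rows
names(wide) <- sub("^Score\\.", "", names(wide))
rownames(wide) <- NULL
wide
```

Naming the data frame something other than T also avoids masking T, the built-in shorthand for TRUE.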
This question already has answers here:
Repeat each row of data.frame the number of times specified in a column
(10 answers)
Closed 2 years ago.
I want to create a data frame by repeating rows by using content of a column in a data frame. Below is the source data frame.
data.frame(c("a","b","c"), c(4,5,6), c(2,2,3)) -> df
colnames(df) <- c("sample", "measurement", "repeat")
df
sample measurement repeat
1 a 4 2
2 b 5 2
3 c 6 3
I want to repeat each row the number of times given in the "repeat" column, to get a data frame like the one below. Ideally, I would like to have a function to do this.
sample measurement repeat
1 a 4 2
2 a 4 2
3 b 5 2
4 b 5 2
5 c 6 3
6 c 6 3
7 c 6 3
Thanks in advance!
Solved. df[rep(rownames(df), df[["repeat"]]), ] did the job. (repeat is a reserved word in R, so the column is read with [[ ]] here rather than $.)
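Since the question asks for a function, the same idea can be wrapped up as below. This is only a sketch: repeat_rows and its argument names are hypothetical, and it indexes by row position rather than by row name. Because repeat is a reserved word in R, the column is looked up by its name as a string.

```r
# Hypothetical helper: repeat each row of 'data' as many times as
# given in the column named by 'times_col'
repeat_rows <- function(data, times_col) {
  idx <- rep(seq_len(nrow(data)), times = data[[times_col]])
  out <- data[idx, , drop = FALSE]
  rownames(out) <- NULL  # renumber rows 1..n
  out
}

# Source data frame from the question
df <- data.frame(c("a", "b", "c"), c(4, 5, 6), c(2, 2, 3))
colnames(df) <- c("sample", "measurement", "repeat")

repeat_rows(df, "repeat")
```

Passing the column name as a string keeps the helper usable for any count column, not only one called "repeat".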
This question already has answers here:
Aggregate a dataframe on a given column and display another column
(8 answers)
Closed 6 years ago.
I have a df like this:
Id count
1 0
1 5
1 7
2 5
2 10
3 2
3 5
3 4
and I want to get the maximum count and apply that to the whole "group" based on ID, like this:
Id count max_count
1 0 7
1 5 7
1 7 7
2 5 10
2 10 10
3 2 5
3 5 5
3 4 5
I've tried pmax, slice, etc. I'm generally having trouble working with data that is in interval-specific form; if you could direct me to tools well suited to that type of data, I would really appreciate it!
Figured it out with help from Gavin Simpson here: Aggregate a dataframe on a given column and display another column
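maxcount <- aggregate(count ~ Id, data = df, FUN = max)
names(maxcount)[2] <- "max_count"  # rename so merge() joins on Id only, not on Id and count
new_df <- merge(df, maxcount, by = "Id")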
Better way:
df$max_count <- with(df, ave(count, Id, FUN = max))
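As a worked example of the ave() approach on the data from the question (a minimal sketch; the data frame is rebuilt here from the table shown above):

```r
# Data from the question
df <- data.frame(
  Id    = c(1, 1, 1, 2, 2, 3, 3, 3),
  count = c(0, 5, 7, 5, 10, 2, 5, 4)
)

# ave() applies FUN within each Id group and returns a vector as long
# as the input, so the group maximum is repeated on every row
df$max_count <- ave(df$count, df$Id, FUN = max)
df
```

Because the result lines up row by row with the original data, the row order is preserved and no merge step is needed.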
This question already has answers here:
Counting the number of elements with the values of x in a vector
(20 answers)
Closed 6 years ago.
I have a single column of numbers. I'm wondering how I can split it into columns that give the tally of each distinct value. I've tried playing around with "separate", but I can't figure out how to make it work.
Here's my data frame:
2
2
2
2
2
4
4
4
I'd like it to be
2 4
5 3
You can use the table() function.
> df
V1
1 2
2 2
3 2
4 2
5 2
6 4
7 4
8 4
> table(df$V1)
2 4
5 3
We can also use tabulate(), which is typically faster than table() but returns only the counts, without the value labels:
tabulate(factor(df1$V1))
#[1] 5 3
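Since tabulate() returns bare counts, the values being counted have to be attached separately if they are needed. A small sketch, assuming the eight-row column from the question; the names f and counts are illustrative:

```r
# Column from the question
df <- data.frame(V1 = c(2, 2, 2, 2, 2, 4, 4, 4))

f <- factor(df$V1)
counts <- tabulate(f, nbins = nlevels(f))  # integer counts, one per level
names(counts) <- levels(f)                 # reattach the values being counted
counts

# table() keeps the labels on its own
table(df$V1)
```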
This question already has answers here:
Convert data from many rows to many columns [duplicate]
(3 answers)
Closed 6 years ago.
I have one dataset
sn Name Feature score
1 pen-1 cost 2
2 pen-1 color 3
3 pen-1 look 1
4 pen-2 cost 1
5 pen-2 color 2
6 pen-2 look 4
I want to change it to the below format
sn Name Cost Look color
1 Pen-1 2 1 3
2 pen-2 1 4 2
Please help me solve this problem in R. Thanks.
We can use dcast
library(reshape2)
dcast(df1, Name~Feature, value.var="score")
Or spread from tidyr
library(tidyr)
spread(df1[-1], Feature, score)
# Name color cost look
#1 pen-1 3 2 1
#2 pen-2 2 1 4
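If installing a package is not an option, a similar wide table can be sketched in base R with xtabs(). This is an alternative to the dcast and spread calls above, not what either of them does internally; it assumes each Name/Feature pair occurs at most once, because xtabs() sums duplicates and fills missing pairs with 0.

```r
# Data from the question
df1 <- data.frame(
  sn      = 1:6,
  Name    = rep(c("pen-1", "pen-2"), each = 3),
  Feature = rep(c("cost", "color", "look"), times = 2),
  score   = c(2, 3, 1, 1, 2, 4)
)

# Cross-tabulate score by Name (rows) and Feature (columns)
wide <- as.data.frame.matrix(xtabs(score ~ Name + Feature, data = df1))

# Move the row names into a regular Name column
wide <- cbind(Name = rownames(wide), wide)
rownames(wide) <- NULL
wide
```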
This question already has answers here:
Sort (order) data frame rows by multiple columns
(19 answers)
Closed 7 years ago.
How can I sort the Chrom# column in ascending order (1 to 23) within each unique ID? As shown below, there are multiple rows with the same ID, one chrom per row; for example, MB-0002 has chrom values 1, 1, 1, 2, 4, 22, etc. I am new to R, so any help would be appreciated. Thanks so much!
sample dataset
If you can use dplyr::arrange then you can easily sort by two variables.
tmp <- data.frame(id=c("a","a","b","a","b","c","a","b","c"),
value=c(3,2,4,1,2,1,7,4,3))
tmp
# id value
# 1 a 3
# 2 a 2
# 3 b 4
# 4 a 1
# 5 b 2
# 6 c 1
# 7 a 7
# 8 b 4
# 9 c 3
library(dplyr)
tmp %>% arrange(id, value)
# id value
# 1 a 1
# 2 a 2
# 3 a 3
# 4 a 7
# 5 b 2
# 6 b 4
# 7 b 4
# 8 c 1
# 9 c 3
FYI, an image doesn't work as a usable sample dataset.
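The same two-key sort can also be done in base R with order(), without loading dplyr. A minimal sketch on the tmp data above; sorted is an illustrative name:

```r
tmp <- data.frame(id    = c("a", "a", "b", "a", "b", "c", "a", "b", "c"),
                  value = c(3, 2, 4, 1, 2, 1, 7, 4, 3))

# order() sorts by the first key and breaks ties with the second
sorted <- tmp[order(tmp$id, tmp$value), ]
rownames(sorted) <- NULL  # renumber rows after sorting
sorted
```

If the column to be sorted numerically is stored as text, it would need converting first (for example with as.numeric), since text sorts "10" before "2".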