This question already has answers here:
Convert data from many rows to many columns [duplicate]
(3 answers)
Closed 6 years ago.
I have one dataset
sn Name Feature score
1 pen-1 cost 2
2 pen-1 color 3
3 pen-1 look 1
4 pen-2 cost 1
5 pen-2 color 2
6 pen-2 look 4
I want to change it to the below format
sn Name Cost Look color
1 pen-1 2 1 3
2 pen-2 1 4 2
Please solve my problem using R. Thanks.
We can use dcast
library(reshape2)
dcast(df1, Name~Feature, value.var="score")
Or spread from tidyr
library(tidyr)
spread(df1[-1], Feature, score)
# Name color cost look
#1 pen-1 3 2 1
#2 pen-2 2 1 4
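The answer above does not define df1, so here is a minimal base R sketch that rebuilds the data from the question and produces the same wide layout without extra packages. The construction of df1 and the use of tapply() are my own assumptions for illustration, not part of the original answer; sum is used only because tapply() needs a summary function, and each Name/Feature pair occurs once.

```r
# Rebuild the data shown in the question (assumed layout)
df1 <- data.frame(
  sn = 1:6,
  Name = rep(c("pen-1", "pen-2"), each = 3),
  Feature = rep(c("cost", "color", "look"), times = 2),
  score = c(2, 3, 1, 1, 2, 4)
)

# Cross-tabulate score by Name (rows) and Feature (columns)
m <- tapply(df1$score, list(df1$Name, df1$Feature), sum)

# Turn the matrix into a data frame with Name as a regular column
wide <- data.frame(Name = rownames(m), m, check.names = FALSE)
rownames(wide) <- NULL
wide
```

If a Name/Feature pair could occur more than once, the choice of summary function would matter and should be made deliberately.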
This question already has answers here:
How to create a consecutive group number
(13 answers)
Closed 3 years ago.
I'm trying to use the tidyverse (whatever package is appropriate) to add a column (via mutate()) that is a running total of the unique values that have occurred in the column so far. Here is some toy data, showing the desired output.
data.frame("n"=c(1,1,1,6,7,8,8),"Unique cumsum"=c(1,1,1,2,3,4,4))
Who knows how to accomplish this in the tidyverse?
Here is an option with group_indices
library(dplyr)
df1%>%
mutate(unique_cumsum = group_indices(., n))
# n unique_cumsum
#1 1 1
#2 1 1
#3 1 1
#4 6 2
#5 7 3
#6 8 4
#7 8 4
data
df1 <- data.frame("n"=c(1,1,1,6,7,8,8))
Here's one way, using the fact that a factor will assign a sequential value to each unique item, and then converting the underlying factor codes with as.numeric:
data.frame("n"=c(1,1,1,6,7,8,8)) %>% mutate(unique_cumsum=as.numeric(factor(n)))
n unique_cumsum
1 1 1
2 1 1
3 1 1
4 6 2
5 7 3
6 8 4
7 8 4
Another solution:
df <- data.frame("n"=c(1,1,1,6,7,8,8))
df <- df %>% mutate(`unique cumsum` = cumsum(!duplicated(n)))
This should work even if your data is not sorted.
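To see what "not sorted" changes, here is a small base R sketch on made-up unsorted data (the vector n below is hypothetical). It compares the running count of distinct values with two ways of labelling groups; which one is wanted depends on how the question is read.

```r
# Hypothetical unsorted input
n <- c(8, 1, 1, 6, 8, 7)

# Running total of distinct values seen so far
running_unique <- cumsum(!duplicated(n))   # 1 2 2 3 3 4

# Group id in order of first appearance
first_seen_id <- match(n, unique(n))       # 1 2 2 3 1 4

# Group id by sorted value (what the factor approach gives)
sorted_id <- as.integer(factor(n))         # 4 1 1 2 4 3
```

On sorted input such as the question's data, all three give the same result.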
This question already has answers here:
Creation of a specific vector without loop or recursion in R
(2 answers)
Split data.frame by value
(2 answers)
Closed 4 years ago.
I have a dataframe whose rows represent people. For a given family, the first row has the value 1 in column A, and all following rows contain members of the same family until another row has the value 1 in column A. Then, a new family starts.
I would like to assign IDs to all families in my dataset. In other words, I would like to take:
A
1
2
3
1
3
3
1
4
And turn it into:
A family_id
1 1
2 1
3 1
1 2
3 2
3 2
1 3
4 3
I'm playing with a dataframe of 3 million rows, so a simple for-loop solution I came up with falls short of necessary efficiency. Also, the family_id need not be sequential.
I'll take a dplyr solution.
data:
df <- data.frame(A = c(1:3,1,3,3,1,4))
code:
df$family_id <- cumsum(c(-1,diff(df$A)) < 0)
result:
# A family_id
#1 1 1
#2 2 1
#3 3 1
#4 1 2
#5 3 2
#6 3 2
#7 1 3
#8 4 3
please note:
This solution starts a new group when a number occurs that is smaller than the previous one.
If it is 100% certain that a new group always begins with a 1, then ronak's solution is perfect.
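The case in the note, where every family is known to start with a 1, can be sketched as below. This is my assumption of what such a solution might look like, not a copy of the answer referred to; the second vector A2 is hypothetical data chosen to show where the two rules disagree.

```r
df <- data.frame(A = c(1:3, 1, 3, 3, 1, 4))

# New family whenever A equals 1
by_ones <- cumsum(df$A == 1)                 # 1 1 1 2 2 2 3 3

# New family whenever A is smaller than the previous value
by_drop <- cumsum(c(-1, diff(df$A)) < 0)     # 1 1 1 2 2 2 3 3

# Hypothetical data where a value drops inside a family
A2 <- c(1, 3, 2, 1, 2)
by_ones2 <- cumsum(A2 == 1)                  # 1 1 1 2 2
by_drop2 <- cumsum(c(-1, diff(A2)) < 0)      # 1 1 2 3 3
```

Both are vectorised, so either should cope with a few million rows.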
This question already has answers here:
Counting the number of elements with the values of x in a vector
(20 answers)
Closed 6 years ago.
I have a single row of numbers. I'm wondering how I can separate it out so that it outputs columns that total the tally of each set of numbers. I've tried playing around with "separate" but I can't figure out how to make it work.
Here's my data frame:
2
2
2
2
2
4
4
4
I'd like it to be
2 4
5 3
You can use the table() function.
> df
V1
1 2
2 2
3 2
4 2
5 2
6 4
7 4
8 4
> table(df$V1)
2 4
5 3
We can use tabulate, which would be faster (note that it returns the counts without the value labels):
tabulate(factor(df1$V1))
#[1] 5 3
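Since tabulate() drops the labels, a small sketch of how they could be put back is shown here. The vector v stands in for df1$V1 and is an assumption for illustration.

```r
v <- c(2, 2, 2, 2, 2, 4, 4, 4)
f <- factor(v)

# Counts per factor level, with the level names attached
counts <- setNames(tabulate(f, nbins = nlevels(f)), levels(f))
counts

# Without factor(), tabulate() counts the integers 1..max(v),
# so the values 1 and 3 show up with a count of zero
raw <- tabulate(v)   # 0 5 0 3
```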
This question already has answers here:
How to reshape data from long to wide format
(14 answers)
Closed 6 years ago.
I have a data set as following, in which each ID has multiple rows for different attributes.
ID<-c(1,1, 2,2,3,3)
Score<-c(4,5, 5,7,8,9)
Attribute<-c("Att_1","Att_2", "Att_1","Att_2", "Att_1","Att_2")
T<-data.frame(ID, Score, Attribute)
Need to transform it to following format so each ID has one row:
ID Att_1 Att_2
1 4 5
2 5 7
3 8 9
There are threads on how to do this in Excel; just wondering if there is any neat way to do it in R? Thanks a lot!
You could try this:
library(reshape2)
dcast(T, ID ~ Attribute, value.var="Score")
# ID Att_1 Att_2
#1 1 4 5
#2 2 5 7
#3 3 8 9
This can be done with reshape():
reshape(data.frame(ID,Score,Attribute), idvar='ID', timevar='Attribute', direction='wide')
## ID Score.Att_1 Score.Att_2
## 1 1 4 5
## 3 2 5 7
## 5 3 8 9
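The reshape() result keeps a "Score." prefix and the original row numbers. If the exact layout from the question is wanted, one possible cleanup is sketched below. The data frame is called dat here (my choice, to avoid masking T, the shorthand for TRUE); everything else follows the question's data.

```r
ID <- c(1, 1, 2, 2, 3, 3)
Score <- c(4, 5, 5, 7, 8, 9)
Attribute <- c("Att_1", "Att_2", "Att_1", "Att_2", "Att_1", "Att_2")
dat <- data.frame(ID, Score, Attribute)

wide <- reshape(dat, idvar = "ID", timevar = "Attribute", direction = "wide")

# Drop the "Score." prefix and reset the row names
names(wide) <- sub("^Score\\.", "", names(wide))
rownames(wide) <- NULL
wide
```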
Closed. This question needs debugging details. It is not currently accepting answers.
Closed 5 years ago.
I want to reshape my data into a long format, but I would like to repeat the entire range of ids for each variable in my data set, even for those ids on which the variable takes no value. At the moment I can only get narrow data with the ids for which each variable has a corresponding entry.
Suppose my data has 15 variables and 20 possible ids. I want to create a narrow form of this data that is 15*20 rows long (the range of ids, repeated for each variable): the values of variable1 are shown for id1, id2, id3, etc. until the end of the range of ids is reached, then variable2 is shown for id1, id2, id3, etc.
I am unsure of how to do this in R. I am currently using the reshape package.
You can use the rep function, which replicates the elements of a vector:
v1 <- 1:5
v2 <- 1:6
rep(v1, each = 6)
# 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 3 4 4 4 4 4 4 5 5 5 5 5 5
rep(v2, 5)
#1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6
Yeah, this is hard to work with, but you're looking for the melt function I think...
library(reshape2)
melt(yourdata, id.vars = 'ID COLUMN')
This will return a 300 x 3 data set that looks like:
ID COLUMN variable value
1 col2 7
1 col3 8
.... .... ....
20 col14 99
20 col15 100
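If the long data is already missing some id/variable combinations, one way to get every id repeated for every variable is to build the full grid and merge the values onto it. This is a base R sketch on made-up data; the column names id, variable and value, and the toy values, are assumptions for illustration.

```r
# Hypothetical long data with gaps: v1 has no id 3, v2 has no id 2
long <- data.frame(
  id = c(1L, 2L, 1L, 3L),
  variable = c("v1", "v1", "v2", "v2"),
  value = c(10, 20, 30, 40),
  stringsAsFactors = FALSE
)

# Every id paired with every variable
grid <- expand.grid(id = 1:3, variable = c("v1", "v2"),
                    stringsAsFactors = FALSE)

# Left join: combinations without data get NA
full <- merge(grid, long, by = c("id", "variable"), all.x = TRUE)

# Order as requested: all ids for variable 1, then all ids for variable 2
full <- full[order(full$variable, full$id), ]
rownames(full) <- NULL
full
```

With 15 variables and 20 ids the same steps would give the 300 rows described in the question.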