This question already has answers here:
Reshape data from wide to long? [duplicate]
(3 answers)
Closed 9 years ago.
I have a table with a header like this:
Id x.1960 x.1970 x.1980 x.1990 x.2000 y.1960 y.1970 y.1980 y.1990 y.2000
I want to pivot this table into:
Id time x y
What is the best way to do this in Excel or R?
Something like this using base R reshape:
Get some data first
test <- read.table(text="Id x.1960 x.1970 x.1980 x.1990 x.2000 y.1960 y.1970 y.1980 y.1990 y.2000
a 1 2 3 4 5 6 7 8 9 10
b 10 20 30 40 50 60 70 80 90 100",header=TRUE)
Then reshape:
reshape(
  test,
  idvar="Id",
  varying=list(2:6,7:11),
  direction="long",
  v.names=c("x","y"),
  times=seq(1960,2000,10)
)
Or let reshape guess the names automatically based on the . separator:
reshape(
  test,
  idvar="Id",
  varying=-1,
  direction="long",
  sep="."
)
Resulting in:
Id time x y
a.1960 a 1960 1 6
b.1960 b 1960 10 60
a.1970 a 1970 2 7
b.1970 b 1970 20 70
a.1980 a 1980 3 8
b.1980 b 1980 30 80
a.1990 a 1990 4 9
b.1990 b 1990 40 90
a.2000 a 2000 5 10
b.2000 b 2000 50 100
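The row names such as a.1960 are generated by reshape. As a minimal sketch, assuming the same test data as above and that you would rather have the rows grouped by Id with plain row numbers, the result can be sorted and its row names reset (the name long is only illustrative):

```r
# Same sample data as in the answer above
test <- read.table(text = "Id x.1960 x.1970 x.1980 x.1990 x.2000 y.1960 y.1970 y.1980 y.1990 y.2000
a 1 2 3 4 5 6 7 8 9 10
b 10 20 30 40 50 60 70 80 90 100", header = TRUE)

# Let reshape() split the column names on "." into variable and time
long <- reshape(test, idvar = "Id", varying = -1, direction = "long", sep = ".")

# Sort by Id, then time, and drop the generated row names
long <- long[order(long$Id, long$time), ]
rownames(long) <- NULL
long
```

Sorting is optional; it only changes the row order, not the values.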
This question already has answers here:
Calculate the mean by group
(9 answers)
Aggregate / summarize multiple variables per group (e.g. sum, mean)
(10 answers)
Closed 5 years ago.
Hi, I have 3 data sets which contain items and counts. I need to stack all the data sets and combine the counts based on the item names. Here is my input.
Df1 <- data.frame(items =c("Cookies", "Candys","Toys","Games"), Counts = c( 10,20,30,5))
Df2 <- data.frame(items =c( "Candys","Cookies","Toys"), Counts = c( 5,21,20))
Df3 <- data.frame(items =c( "Playdows","Gummies","Candys"), Counts = c(10,15,20))
Df_all <- rbind(Df1,Df2,Df3)
Df_all
items Counts
1 Cookies 10
2 Candys 20
3 Toys 30
4 Games 5
5 Candys 5
6 Cookies 21
7 Toys 20
8 Playdows 10
9 Gummies 15
10 Candys 20
I need to combine the rows that share the same item value by adding their counts, so that each item appears only once. My output should be:
items Counts
1 Cookies 31
2 Candys 45
3 Toys 50
4 Games 5
5 Playdows 10
6 Gummies 15
Could you help me get this output in R?
Use dplyr:
library(dplyr)
result <- Df_all %>% group_by(items) %>% summarize(sum(Counts))
> result
# A tibble: 6 x 2
items `sum(Counts)`
<fct> <dbl>
1 Candys 45.0
2 Cookies 31.0
3 Games 5.00
4 Toys 50.0
5 Gummies 15.0
6 Playdows 10.0
You can use tapply:
tapply(Df_all$Counts, Df_all$items, FUN=sum)
which returns
Candys Cookies Games Toys Gummies Playdows
45 31 5 50 15 10
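If a data frame is wanted rather than a named vector, base R's aggregate is another option. This is a minimal sketch using the data from the question; the name totals is only illustrative:

```r
# Data from the question
Df1 <- data.frame(items = c("Cookies", "Candys", "Toys", "Games"), Counts = c(10, 20, 30, 5))
Df2 <- data.frame(items = c("Candys", "Cookies", "Toys"), Counts = c(5, 21, 20))
Df3 <- data.frame(items = c("Playdows", "Gummies", "Candys"), Counts = c(10, 15, 20))
Df_all <- rbind(Df1, Df2, Df3)

# Sum Counts within each value of items; the result is a data frame
# with one row per item and the columns items and Counts
totals <- aggregate(Counts ~ items, data = Df_all, FUN = sum)
totals
```

The rows come back ordered by item name rather than by first appearance.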
This question already has answers here:
How to reshape data from long to wide format
(14 answers)
Closed 5 years ago.
I'm trying to reshape a data frame in R so that multiple rows are grouped by a measurement. The table has a location (km), a size (mm), a count of things in that size bin, a site, and a year. I want to take the sizes, make a column from each one (2, 4 and 6 in this example), and place the corresponding count into the row for that location, site and year.
It seems like a combination of transposing and grouping, but I can't figure out a way to accomplish this in R. I've looked at t(), dcast() and aggregate(), but those aren't really close at all.
So I would go from something like this:
df <- data.frame(km=c(rep(32,3),rep(50,3)), mm=rep(c(2,4,6),2), count=sample(1:25,6), site=rep("A", 6), year=rep(2013, 6))
km mm count site year
1 32 2 18 A 2013
2 32 4 2 A 2013
3 32 6 12 A 2013
4 50 2 3 A 2013
5 50 4 17 A 2013
6 50 6 21 A 2013
To this:
km site year mm_2 mm_4 mm_6
1 32 A 2013 18 2 12
2 50 A 2013 3 17 21
Edit: I tried the solution in a suggested duplicate, but it did not work for me; I'm not really sure why. The answer below worked better.
As suggested in the comment above, we can use the sep argument in spread:
library(tidyr)
spread(df, mm, count, sep = "_")
km site year mm_2 mm_4 mm_6
1 32 A 2013 4 20 1
2 50 A 2013 15 14 22
As you mentioned dcast(), here is a method using it.
set.seed(1)
df <- data.frame(km=c(rep(32,3),rep(50,3)),
                 mm=rep(c(2,4,6),2),
                 count=sample(1:25,6),
                 site=rep("A", 6),
                 year=rep(2013, 6))
library(reshape2)
dcast(df, ... ~ mm, value.var="count")
# km site year 2 4 6
# 1 32 A 2013 13 10 20
# 2 50 A 2013 3 17 1
And if you want a bit of a challenge you can try the base function reshape().
df2 <- reshape(df, v.names="count", idvar="km", timevar="mm", ids="mm", direction="wide")
colnames(df2) <- sub("count.", "mm_", colnames(df2), fixed=TRUE)
df2
# km site year mm_2 mm_4 mm_6
# 1 32 A 2013 13 10 20
# 4 50 A 2013 3 17 1
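In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(). The following is a minimal sketch of the same reshaping, assuming a recent tidyr is installed; the counts are copied from the table in the question instead of being sampled, so the result is reproducible:

```r
library(tidyr)

# Fixed counts taken from the example table in the question
df <- data.frame(km = c(rep(32, 3), rep(50, 3)),
                 mm = rep(c(2, 4, 6), 2),
                 count = c(18, 2, 12, 3, 17, 21),
                 site = rep("A", 6),
                 year = rep(2013, 6))

# One column per value of mm, filled with count and prefixed with "mm_";
# all remaining columns (km, site, year) identify the rows
wide <- pivot_wider(df, names_from = mm, values_from = count, names_prefix = "mm_")
wide
```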
I am very new to R so I am not sure how basic my question is, but I am stuck at the following point.
I have data that has a panel structure, similar to this
Country Year Outcome Country-characteristic
A 1990 10 40
A 1991 12 40
A 1992 14 40
B 1991 10 60
B 1992 12 60
For some reason I need to put this in a cross-sectional structure such that I get averages over all years for each country; that is, in the end it should look like:
Country Outcome Country-Characteristic
A 12 40
B 11 60
Has anybody faced a similar problem? I was playing with lapply(table$country, table$outcome, mean), but that did not work as I wanted.
Two tips: (1) When you ask a question, you should provide a reproducible example of the data too (as I did with read.table below). (2) It's not a good idea to use "-" in column names; use "_" instead.
You can get a summary using the dplyr package:
df1 <- read.table(text="Country Year Outcome Countrycharacteristic
A 1990 10 40
A 1991 12 40
A 1992 14 40
B 1991 10 60
B 1992 12 60", header=TRUE, stringsAsFactors=FALSE)
library(dplyr)
df1 %>%
  group_by(Country) %>%
  summarize(Outcome=mean(Outcome),Countrycharacteristic=mean(Countrycharacteristic))
# A tibble: 2 x 3
Country Outcome Countrycharacteristic
<chr> <dbl> <dbl>
1 A 12 40
2 B 11 60
We can do this in base R with aggregate
aggregate(.~Country, df1[-2], mean)
# Country Outcome Countrycharacteristic
#1 A 12 40
#2 B 11 60
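The lapply call in the question fails because lapply expects a single list or vector followed by a function, so table$outcome ends up in the function slot. A minimal base R sketch of what was probably intended, assuming the df1 defined above, is to split the outcome by country first and then apply mean to each piece (the name means is only illustrative):

```r
# Same data as in the answer above
df1 <- read.table(text = "Country Year Outcome Countrycharacteristic
A 1990 10 40
A 1991 12 40
A 1992 14 40
B 1991 10 60
B 1992 12 60", header = TRUE, stringsAsFactors = FALSE)

# split() gives one vector of outcomes per country;
# sapply() then takes the mean of each vector
means <- sapply(split(df1$Outcome, df1$Country), mean)
means
```

This returns a named vector with one mean per country, whereas aggregate and dplyr return a data frame.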
This question already has answers here:
how to spread or cast multiple values in r [duplicate]
(2 answers)
Closed 7 years ago.
I would like to concatenate column values with column names to create new columns. I am experimenting with dcast from library(reshape2); however, I can't get the required output.
Is there a method that doesn't involve performing dcast multiple times then merging the resulting sets back together?
Current data frame:
observation=c(1,1,1,2,2,2,3,3,3)
event=c('event1','event2','event3','event1','event2','event3','event1','event2','event3')
value1=c(1,2,3,4,5,6,7,8,9)
value2=c(11,12,13,14,15,16,17,18,19)
current=data.frame(observation,event,value1,value2)
current
Required data frame:
observation=c(1,2,3)
event1_value1 =c(1,4,7)
event2_value1 =c(2,5,8)
event3_value1 =c(3,6,9)
event1_value2 =c(11,14,17)
event2_value2 =c(12,15,18)
event3_value2 =c(13,16,19)
required=data.frame(observation,event1_value1,event2_value1,event3_value1,event1_value2,event2_value2,event3_value2)
required
The method below works but I feel there must be a quicker way!
library(reshape2)
value1 <- dcast(current,observation~event,value.var ="value1")
value2 <- dcast(current,observation~event,value.var ="value2")
merge(value1,value2,by="observation",suffixes = c("_value1","_value2"))
This is an extension of reshaping from long to wide.
You can use the devel version of data.table, i.e. v1.9.5, whose dcast can take multiple value.var columns (the feature has been part of the CRAN releases since v1.9.6). Instructions to install the devel version are here
library(data.table)#v1.9.5+
dcast(setDT(current), observation~event, value.var=c('value1', 'value2'))
# observation event1_value1 event2_value1 event3_value1 event1_value2
#1: 1 1 2 3 11
#2: 2 4 5 6 14
#3: 3 7 8 9 17
# event2_value2 event3_value2
#1: 12 13
#2: 15 16
#3: 18 19
Or reshape from base R
reshape(current, idvar='observation', timevar='event', direction='wide')
# observation value1.event1 value2.event1 value1.event2 value2.event2
#1 1 1 11 2 12
#4 2 4 14 5 15
#7 3 7 17 8 18
# value1.event3 value2.event3
#1 3 13
#4 6 16
#7 9 19
I'm not sure of the efficiency but you could try this -
> dcast(melt(current,id.vars = c('observation','event')),observation~event+variable)
observation event1_value1 event1_value2 event2_value1 event2_value2 event3_value1 event3_value2
1 1 1 11 2 12 3 13
2 2 4 14 5 15 6 16
3 3 7 17 8 18 9 19
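The base R reshape result above uses names like value1.event1, while the question asks for event1_value1. As a minimal sketch, assuming the current data frame from the question, the names can be built with sep="_" and then the two parts swapped with a regular expression (the name wide is only illustrative):

```r
# Data from the question
observation <- c(1, 1, 1, 2, 2, 2, 3, 3, 3)
event <- rep(c("event1", "event2", "event3"), 3)
value1 <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)
value2 <- c(11, 12, 13, 14, 15, 16, 17, 18, 19)
current <- data.frame(observation, event, value1, value2)

# Wide reshape; sep = "_" yields names such as value1_event1
wide <- reshape(current, idvar = "observation", timevar = "event",
                direction = "wide", sep = "_")

# Swap the two parts of each name: value1_event1 becomes event1_value1
names(wide) <- sub("^(value[0-9]+)_(event[0-9]+)$", "\\2_\\1", names(wide))
rownames(wide) <- NULL
wide
```

The columns come out grouped by event rather than by value, so reorder them afterwards if the exact column order of the required data frame matters.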
This question already has answers here:
Reshaping data.frame from wide to long format
(8 answers)
Closed 6 years ago.
How do I reshape this wide data: (from a csv file)
Name Code Indicator 1960 1961 1962
Into this long format?
Name Code Indicator Year
The reshape2 package does this nicely with the function melt:
library(reshape2)
yourdata_melted <- melt(yourdata, id.vars=c('Name', 'Code', 'Indicator'), variable.name='Year')
This will add a value column that you can drop:
yourdata_melted$value <- NULL
And just because I like to continue my campaign for using base R functions:
Test data:
test <- data.frame(matrix(1:12,nrow=2))
names(test) <- c("name","code","indicator","1960","1961","1962")
test
name code indicator 1960 1961 1962
1 1 3 5 7 9 11
2 2 4 6 8 10 12
Now reshape it!
reshape(
  test,
  idvar=c("name","code","indicator"),
  varying=c("1960","1961","1962"),
  timevar="year",
  v.names="value",
  times=c("1960","1961","1962"),
  direction="long"
)
# name code indicator year value
#1.3.5.1960 1 3 5 1960 7
#2.4.6.1960 2 4 6 1960 8
#1.3.5.1961 1 3 5 1961 9
#2.4.6.1961 2 4 6 1961 10
#1.3.5.1962 1 3 5 1962 11
#2.4.6.1962 2 4 6 1962 12
With tidyr:
library(tidyr)
gather(test, "time", "value", 4:6)
Data
test <- data.frame(matrix(1:12,nrow=2))
names(test) <- c("name","code","indicator","1960","1961","1962")
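In tidyr 1.0.0 and later, gather() is superseded by pivot_longer(). A minimal sketch of the equivalent call, assuming a recent tidyr is installed and using the same test data (the column name year and the object name long are illustrative choices):

```r
library(tidyr)

# Same test data as above
test <- data.frame(matrix(1:12, nrow = 2))
names(test) <- c("name", "code", "indicator", "1960", "1961", "1962")

# Stack columns 4 to 6: the old column names go into "year"
# and the cell values into "value"
long <- pivot_longer(test, cols = 4:6, names_to = "year", values_to = "value")
long
```

Unlike gather(), which stacks one column after another, pivot_longer() keeps the rows of each original record together, so the row order differs.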