Dear Community,
I am working with R and looking for trends in time-series data of bilateral exports over a span of 20 years. As the data fluctuates a lot between years (and in addition is not 100% reliable), I would prefer to use four-year averages instead of looking at every single year separately, in order to analyze how the main export partners have changed over time.
I have the following dataset, called GrossExp3, covering the bilateral exports (in 1000 USD) of 15 reporter countries to all available partner countries for all years from 1998 to 2019.
It covers the following four variables:
Year, ReporterName (= exporter), PartnerName (= export destination), 'TradeValue in 1000 USD' (= export value to the destination)
The PartnerName column also includes an entry, called “All”, which is the total sum of all exports for each year by reporter.
Here is the summary of my data
> summary(GrossExp3)
Year ReporterName PartnerName TradeValue in 1000 USD
Min. :1998 Length:35961 Length:35961 Min. : 0
1st Qu.:2004 Class :character Class :character 1st Qu.: 39
Median :2009 Mode :character Mode :character Median : 597
Mean :2009 Mean : 134370
3rd Qu.:2014 3rd Qu.: 10090
Max. :2018 Max. :47471515
My goal is to return a table which shows, for each exporter, exports to each destination as a percentage of that exporter's total exports for the period. Instead of every single year, I want the average data for the following periods: 2000-2003, 2004-2007, 2008-2011, 2012-2015, 2016-2019.
What I tried
My current code (created with the support of this amazing community) is the following. At the moment it shows the data for each year separately, but I need the period averages in the column headers:
# install packages
library(data.table)
library(dplyr)
library(tidyr)
library(stringr)
library(plyr)
library(visdat)
# set working directory
setwd("C:/R/R_09.2020/Other Indicators/Bilateral Trade Shift of Partners")
# load data
# create a file path SITC 3
path1 <- file.path("SITC Rev 3_Data from 1998.csv")
# load csv data table, call it "SITC3"
SITC3 <- fread(path1, drop = c(1,9,11,13))
# prepare data (SITC3) for analysis
# Filter for GROSS EXPORTS SITC3 (Gross exports = Exports that include intermediate products)
GrossExp3 <- SITC3 %>%
filter(TradeFlowName == "Gross Exp.", PartnerISO3 != "All", Year != 2019) %>% # filter for gross exports, remove "All", remove 2019
select(Year, ReporterName, PartnerName, `TradeValue in 1000 USD`) %>%
arrange(ReporterName, desc(Year))
# compare with old subset
summary(GrossExp3)
summary(SITC3)
# calculate percentage of total
GrossExp3Main <- GrossExp3 %>%
group_by(Year, ReporterName) %>%
add_tally(wt = `TradeValue in 1000 USD`, name = "TotalValue") %>%
mutate(Percentage = 100 * (`TradeValue in 1000 USD` / TotalValue)) %>%
arrange(ReporterName, desc(Year), desc(Percentage))
head(GrossExp3Main, n = 20)
# print tables in separate sheets to get an overview about hierarchy of export partners and development over time
SpreadExpMain <- GrossExp3Main %>%
select(Year, ReporterName, PartnerName, Percentage) %>%
spread(key = Year, value = Percentage) %>%
arrange(ReporterName, desc(`2018`))
View(SpreadExpMain) # shows whole table
Here is the head of my data
> head(GrossExp3Main, n = 20)
# A tibble: 20 x 6
# Groups: Year, ReporterName [7]
Year ReporterName PartnerName `TradeValue in 100~ TotalValue Percentage
<int> <chr> <chr> <dbl> <dbl> <dbl>
1 2018 Angola China 24517058. 42096736. 58.2
2 2018 Angola India 3768940. 42096736. 8.95
3 2017 Angola China 19487067. 34904881. 55.8
4 2017 Angola India 2890061. 34904881. 8.28
5 2016 Angola China 13923092. 28057500. 49.6
6 2016 Angola India 1948845. 28057500. 6.95
7 2016 Angola United States 1525650. 28057500. 5.44
8 2015 Angola China 14320566. 33924937. 42.2
9 2015 Angola India 2676340. 33924937. 7.89
10 2015 Angola Spain 2245976. 33924937. 6.62
11 2014 Angola China 27527111. 58672369. 46.9
12 2014 Angola India 4507416. 58672369. 7.68
13 2014 Angola Spain 3726455. 58672369. 6.35
14 2013 Angola China 31947235. 67712527. 47.2
15 2013 Angola India 6764233. 67712527. 9.99
16 2013 Angola United States 5018391. 67712527. 7.41
17 2013 Angola Other Asia, ~ 4007020. 67712527. 5.92
18 2012 Angola China 33710030. 70863076. 47.6
19 2012 Angola India 6932061. 70863076. 9.78
20 2012 Angola United States 6594526. 70863076. 9.31
I am not sure whether the results I get up to this point are right.
In addition, I have the following questions:
Do you have any recommendations on how to print nice-looking tables with R?
How can I round the percentage data to one decimal place?
As I have been stuck on these issues all week, I would be very grateful for any recommendations on how to solve them!
Wishing you a nice weekend and all the best,
Melike
**EDIT**
Here is some sample data:
dput(head(GrossExp3Main, n = 20))
structure(list(Year = c(2018L, 2018L, 2018L, 2018L, 2018L, 2018L,
2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L,
2018L, 2018L, 2018L, 2018L, 2018L), ReporterName = c("Angola",
"Angola", "Angola", "Angola", "Angola", "Angola", "Angola", "Angola",
"Angola", "Angola", "Angola", "Angola", "Angola", "Angola", "Angola",
"Angola", "Angola", "Angola", "Angola", "Angola"), PartnerName = c("China",
"India", "United States", "Spain", "South Africa", "Portugal",
"United Arab Emirates", "France", "Thailand", "Canada", "Indonesia",
"Singapore", "Italy", "Israel", "United Kingdom", "Unspecified",
"Namibia", "Uruguay", "Congo, Rep.", "Japan"), `TradeValue in 1000 USD` = c(24517058.342,
3768940.47, 1470132.736, 1250554.873, 1161852.097, 1074137.369,
884725.078, 734551.345, 649626.328, 647164.297, 575477.283, 513982.584,
468914.918, 452453.482, 425616.975, 423008.886, 327921.516, 320586.229,
299119.102, 264671.779), TotalValue = c(42096736.31, 42096736.31,
42096736.31, 42096736.31, 42096736.31, 42096736.31, 42096736.31,
42096736.31, 42096736.31, 42096736.31, 42096736.31, 42096736.31,
42096736.31, 42096736.31, 42096736.31, 42096736.31, 42096736.31,
42096736.31, 42096736.31, 42096736.31), Percentage = c(58.2398078593471,
8.9530467213552, 3.49227247731025, 2.97066942147468, 2.75995765667944,
2.55159298119945, 2.10164767046284, 1.74491281127062, 1.54317504144777,
1.53732653342598, 1.3670353890672, 1.22095589599877, 1.11389850877492,
1.07479467925527, 1.01104506502775, 1.00484959899258, 0.778971352043039,
0.761546516668669, 0.710551762961598, 0.62872279943737)), row.names = c(NA,
-20L), groups = structure(list(Year = 2018L, ReporterName = "Angola",
.rows = structure(list(1:20), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), row.names = 1L, class = c("tbl_df",
"tbl", "data.frame"), .drop = TRUE), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"))
To do what you want, you need an additional variable that groups the years together. I used cut() to do that.
library(dplyr)
# Define the cut breaks and labels for each group.
# The breaks are the starting years of the groups; right = FALSE makes
# each interval include its left endpoint, which gives the desired cut.
year_group_break <- c(2000, 2004, 2008, 2012, 2016, 2020)
year_group_labels <- c("2000-2003", "2004-2007", "2008-2011", "2012-2015", "2016-2019")
data %>%
# create the year group variable
mutate(year_group = cut(Year, breaks = year_group_break,
labels = year_group_labels,
include.lowest = TRUE, right = FALSE)) %>%
# calculate the total value for each Reporter + Partner in each year group
group_by(year_group, ReporterName, PartnerName) %>%
summarize(`TradeValue in 1000 USD` = sum(`TradeValue in 1000 USD`),
.groups = "drop") %>%
# calculate the percentage value for Partner of each Reporter/Year group
group_by(year_group, ReporterName) %>%
mutate(Percentage = `TradeValue in 1000 USD` / sum(`TradeValue in 1000 USD`)) %>%
ungroup()
Sample output
year_group ReporterName PartnerName `TradeValue in 1000 USD` Percentage
<fct> <chr> <chr> <dbl> <dbl>
1 2016-2019 Angola Canada 647164. 0.0161
2 2016-2019 Angola China 24517058. 0.609
3 2016-2019 Angola Congo, Rep. 299119. 0.00744
4 2016-2019 Angola France 734551. 0.0183
5 2016-2019 Angola India 3768940. 0.0937
6 2016-2019 Angola Indonesia 575477. 0.0143
7 2016-2019 Angola Israel 452453. 0.0112
8 2016-2019 Angola Italy 468915. 0.0117
9 2016-2019 Angola Japan 264672. 0.00658
10 2016-2019 Angola Namibia 327922. 0.00815
11 2016-2019 Angola Portugal 1074137. 0.0267
12 2016-2019 Angola Singapore 513983. 0.0128
13 2016-2019 Angola South Africa 1161852. 0.0289
14 2016-2019 Angola Spain 1250555. 0.0311
15 2016-2019 Angola Thailand 649626. 0.0161
16 2016-2019 Angola United Arab Emirates 884725. 0.0220
17 2016-2019 Angola United Kingdom 425617. 0.0106
18 2016-2019 Angola United States 1470133. 0.0365
19 2016-2019 Angola Unspecified 423009. 0.0105
20 2016-2019 Angola Uruguay 320586. 0.00797
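On the follow-up questions about rounding and presentation: a one-decimal percentage can be produced with round() inside the pipeline, and the period columns can be spread wide with tidyr::pivot_wider(). A minimal sketch, with toy values standing in for the summarised result above:

```r
library(dplyr)
library(tidyr)

# toy stand-in for the grouped result above (values are illustrative)
res <- tibble(
  year_group   = c("2016-2019", "2016-2019", "2012-2015", "2012-2015"),
  ReporterName = "Angola",
  PartnerName  = c("China", "India", "China", "India"),
  Percentage   = c(0.609, 0.0937, 0.476, 0.0978)
)

res %>%
  mutate(Percentage = round(100 * Percentage, 1)) %>%   # one decimal place
  pivot_wider(names_from = year_group, values_from = Percentage)
```

For nice-looking printed tables, knitr::kable() or the gt package are common choices (an assumption about your output target; both format a data frame as a publication-style table).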
I've browsed extensively online but have so far not found an appropriate answer to my question for this specific case.
I'm looking to partly restructure a panel data set from long to wide format, but only for specific values that are identified by their names in rows in R.
Consider this original format:
SERIES ECONOMY YEAR Value
246 CPI Panama 1960 0.05
247 CPI Peru 1960 0.05
248 CPI XXXXXX 1960 0.05
249 CPI Panama 1961 0.06
250 CPI Peru 1961 0.06
251 CPI XXXXXX 1961 0.06
252 % Gross savings Panama 1960 5
253 % Gross savings Peru 1960 6
254 % Gross savings XXXXXX 1960 7
255 % Gross savings Panama 1961 20
256 % Gross savings Peru 1961 21
257 % Gross savings XXXXXX 1961 22
(And so on for different countries, different indicators in the "SERIES" column, during 1960-2020 for each country and indicator.)
I'm looking to keep "ECONOMY" as its own column specifying the country as originally seen, keep the year as a column as well, but move each separate indicator under SERIES (e.g. CPI / % Gross savings) into their own columns like this:
ECONOMY YEAR CPI %_GROSS_SAVINGS
1 Panama 1960 0.05 5
2 Peru 1960 0.05 6
3 XXXXXX 1960 0.05 7
4 Panama 1961 0.06 20
5 Peru 1961 0.06 21
6 XXXXXX 1961 0.06 22
Any ideas? Grateful for answers.
Not sure if I follow - this seems to me like a typical pivot_wider use:
library(tidyr)
dat |> pivot_wider(names_from = "SERIES",
values_from = "Value")
#> # A tibble: 6 x 4
#> ECONOMY YEAR CPI `% Gross savings`
#> <chr> <dbl> <dbl> <dbl>
#> 1 Panama 1960 0.05 5
#> 2 Peru 1960 0.05 6
#> 3 XXXXXX 1960 0.05 7
#> 4 Panama 1961 0.06 20
#> 5 Peru 1961 0.06 21
#> 6 XXXXXX 1961 0.06 22
Created on 2022-04-08 by the reprex package (v2.0.0)
Reproducible data:
dat <- structure(list(SERIES = c("CPI", "CPI", "CPI", "CPI", "CPI",
"CPI", "% Gross savings", "% Gross savings", "% Gross savings",
"% Gross savings", "% Gross savings", "% Gross savings"), ECONOMY = c("Panama",
"Peru", "XXXXXX", "Panama", "Peru", "XXXXXX", "Panama", "Peru",
"XXXXXX", "Panama", "Peru", "XXXXXX"), YEAR = c(1960, 1960, 1960,
1961, 1961, 1961, 1960, 1960, 1960, 1961, 1961, 1961), Value = c(0.05,
0.05, 0.05, 0.06, 0.06, 0.06, 5, 6, 7, 20, 21, 22)), row.names = c(NA,
-12L), class = c("tbl_df", "tbl", "data.frame"))
reshape2
reshape2::dcast(ECONOMY + YEAR ~ SERIES, data = zz)
# Using Value as value column: use value.var to override.
# ECONOMY YEAR %_Gross_savings CPI
# 1 Panama 1960 5 0.05
# 2 Panama 1961 20 0.06
# 3 Peru 1960 6 0.05
# 4 Peru 1961 21 0.06
# 5 XXXXXX 1960 7 0.05
# 6 XXXXXX 1961 22 0.06
Data
zz <- structure(list(SERIES = c("CPI", "CPI", "CPI", "CPI", "CPI", "CPI", "%_Gross_savings", "%_Gross_savings", "%_Gross_savings", "%_Gross_savings", "%_Gross_savings", "%_Gross_savings"), ECONOMY = c("Panama", "Peru", "XXXXXX", "Panama", "Peru", "XXXXXX", "Panama", "Peru", "XXXXXX", "Panama", "Peru", "XXXXXX"), YEAR = c(1960L, 1960L, 1960L, 1961L, 1961L, 1961L, 1960L, 1960L, 1960L, 1961L, 1961L, 1961L), Value = c(0.05, 0.05, 0.05, 0.06, 0.06, 0.06, 5, 6, 7, 20, 21, 22)), class = "data.frame", row.names = c("246", "247", "248", "249", "250", "251", "252", "253", "254", "255", "256", "257"))
I have a dataframe like this:
structure(list(from = c("China", "China", "Canada", "Canada",
"USA", "China", "Trinidad and Tobago", "China", "USA", "USA"),
to = c("Japan", "Japan", "USA", "USA", "Japan", "USA", "USA",
"Rep. of Korea", "Canada", "Japan"), weight = c(4766781396,
4039683737, 3419468319, 3216051707, 2535151299, 2513604035,
2303474559, 2096033823, 2091906420, 2066357443)), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -10L), groups = structure(list(
from = c("Canada", "China", "China", "China", "Trinidad and Tobago",
"USA", "USA"), to = c("USA", "Japan", "Rep. of Korea", "USA",
"USA", "Canada", "Japan"), .rows = structure(list(3:4, 1:2,
8L, 6L, 7L, 9L, c(5L, 10L)), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -7L), .drop = TRUE))
I would like to compute the absolute value of the difference in the weight column, grouped by from and to.
I'm trying the function aggregate(), but it seems to work for means and sums, not for differences. For example (df is the name of my dataframe):
aggregate(weight~from+to, data = df, FUN=mean)
which produces:
from to weight
1 USA Canada 2091906420
2 China Japan 4403232567
3 USA Japan 2300754371
4 China Rep. of Korea 2096033823
5 Canada USA 3317760013
6 China USA 2513604035
7 Trinidad and Tobago USA 2303474559
EDIT. The desired result is instead
from to weight
1 USA Canada 2091906420
2 China Japan 727097659
3 USA Japan 468793856
4 China Rep. of Korea 2096033823
5 Canada USA 203416612
6 China USA 2513604035
7 Trinidad and Tobago USA 2303474559
As we can see, the countries that appear twice in the from and to columns collapse into a single row, with the difference between the weights in the weight column. E.g.,
from to weight
China Japan 4766781396
China Japan 4039683737
become
from to weight
China Japan 727097659
because
> 4766781396-4039683737
[1] 727097659
The difference should be positive (which is why I wrote "the absolute value of the difference of the weights").
The pairs of countries which appear in just one row of dataframe df remain unchanged, e.g.
from to weight
7 Trinidad and Tobago USA 2303474559
Assuming at most 2 values per group and that the order of the difference is not important:
aggregate(weight ~ from + to, data = df, FUN = function(x) {
  abs(ifelse(length(x) == 1, x, diff(x)))
})
from to weight
1 USA Canada 2091906420
2 China Japan 727097659
3 USA Japan 468793856
4 China Rep. of Korea 2096033823
5 Canada USA 203416612
6 China USA 2513604035
7 Trinidad and Tobago USA 2303474559
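For comparison, the same idea in dplyr (a sketch with toy data; like the aggregate() version, it assumes at most two rows per from/to pair):

```r
library(dplyr)

# toy data: one duplicated pair and one singleton pair
df <- tibble(
  from   = c("China", "China", "USA"),
  to     = c("Japan", "Japan", "Canada"),
  weight = c(4766781396, 4039683737, 2091906420)
)

df %>%
  group_by(from, to) %>%
  # diff() for pairs, the value itself for singletons
  summarize(weight = if (n() == 1) weight else abs(diff(weight)),
            .groups = "drop")
```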
Is the following what you are looking for? (The length check keeps single-row pairs as they are instead of producing NA.)
f <- function(x) if (length(x) == 1) x else abs(x[2] - x[1])
aggregate(weight ~ from + to, data = df, FUN = f)
#>                  from            to     weight
#> 1                 USA        Canada 2091906420
#> 2               China         Japan  727097659
#> 3                 USA         Japan  468793856
#> 4               China Rep. of Korea 2096033823
#> 5              Canada           USA  203416612
#> 6               China           USA 2513604035
#> 7 Trinidad and Tobago           USA 2303474559
I am getting an error while trying to filter data in the 1st df based on the countries available in the 2nd df, using the pipe operator inside the filter condition.
Referencing countries df
Overall_top5
########### output ###########
continent country gdpPercap
Africa Botswana 8090
Africa Equatorial Guinea 20500
Africa Gabon 19600
Africa Libya 12100
Africa Mauritius 10900
Americas Canada 51600
Americas Chile 15100
Americas Trinidad and Tobago 17100
main df
gap_longer
########### output #############
country year gdpPercap continent
Australia 2019 57100 Oceania
Botswana 2019 8090 Africa
Canada 2019 51600 Americas
Chile 2019 15100 Americas
Denmark 2019 65100 Europe
When I try the code below, it gives me an error:
gap_longer %>%
filter(year == 2019,
country %in% Overall_top5 %>% select(country) )
Error: Problem with `filter()` input `..1`. x no applicable method for 'select_' applied to an object of class "logical" i Input `..1` is `country %in% Overall_top5 %>% select(country)`. Run `rlang::last_error()` to see where the error occurred.
How can I run this using pipes? I am able to do it in base R but don't know how to fix it using pipes.
gap_longer %>%
filter(year == 2019,
country %in% Overall_top5$country )
Raw data
Overall_top5 <- structure(list(continent = c("Africa", "Africa", "Africa", "Africa", "Africa", "Americas", "Americas", "Americas"), country = c("Botswana", "Equatorial Guinea", "Gabon", "Libya", "Mauritius", "Canada", "Chile", "Trinidad and Tobago"), gdpPercap = c(8090L, 20500L, 19600L, 12100L, 10900L, 51600L, 15100L, 17100L)), row.names = c(NA, -8L), class = "data.frame")
gap_longer <- structure(list(country = c("Australia", "Botswana", "Canada", "Chile", "Denmark"), year = c(2019L, 2019L, 2019L, 2019L, 2019L), gdpPercap = c(57100L, 8090L, 51600L, 15100L, 65100L), continent = c("Oceania", "Africa", "Americas", "Americas", "Europe")), class = "data.frame", row.names = c(NA, -5L))
First, you want to use pull rather than select, as select returns a data frame rather than a vector (but that alone doesn't solve your problem).
Your problem comes from operator precedence. In your example, %in% is evaluated first, then %>%. To fix this, use parentheses.
gap_longer %>%
filter(
year == 2019,
country %in% (Overall_top5 %>% pull(country))
)
#> # A tibble: 3 x 4
#> country year gdpPercap continent
#> <chr> <dbl> <dbl> <chr>
#> 1 Botswana 2019 8090 Africa
#> 2 Canada 2019 51600 Americas
#> 3 Chile 2019 15100 Americas
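The parsing itself can be checked in base R: all %op% operators (including %in% and %>%) share one precedence level and associate left to right, so the pipe ends up as the outermost call.

```r
# quote() captures the unevaluated expression; its first element is the
# outermost function being called
e <- quote(country %in% Overall_top5 %>% select(country))
as.character(e[[1]])
#> [1] "%>%"
```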
Parenthesis problem. Try:
gap_longer %>%
filter(year == 2019, country %in% Overall_top5$country) %>%
select(country)
or, if you want a vector of country names, not a data frame:
gap_longer %>%
filter(year == 2019, country %in% Overall_top5$country) %>%
pull(country)
I have two datasets which look like below.
Sales
Region ReviewYear Sales Index
South Asia 2006 1.5 NA
South Asia 2009 4.5 NA
South Asia 2011 11 0
South Asia 2014 16.7 NA
Africa 2008 0.4 NA
Africa 2013 3.5 0
Africa 2017 9.7 NA
Strategy
Region StrategyYear
South Asia 2011
Africa 2013
Japan 2007
SE Asia 2009
There are multiple regions and many review years, which are not periodic and not even the same for all regions. I have added an 'Index' column to the 'Sales' dataframe such that the index value is zero for the strategy year taken from the second dataframe. I now want to change the NAs to a series of numbers that tell how many rows before or after the 0 row each particular row is, grouped by 'Region'.
I can do this using a for loop, but that is tedious, so I am checking whether there is a cleaner way. The final output should look like:
Sales
Region ReviewYear Sales Index
South Asia 2006 1.5 -2
South Asia 2009 4.5 -1
South Asia 2011 11 0
South Asia 2014 16.7 1
Africa 2008 0.4 -1
Africa 2013 3.5 0
Africa 2017 9.7 1
Join the two datasets by Region and, for each Region, create an Index column by subtracting the position where StrategyYear matches ReviewYear from the row number.
library(dplyr)
left_join(Sales, Strategy, by = 'Region') %>%
arrange(Region, StrategyYear) %>%
group_by(Region) %>%
mutate(Index = row_number() - match(first(StrategyYear), ReviewYear))
# Region ReviewYear Sales Index StrategyYear
# <chr> <int> <dbl> <int> <int>
#1 Africa 2008 0.4 -1 2013
#2 Africa 2013 3.5 0 2013
#3 Africa 2017 9.7 1 2013
#4 SouthAsia 2006 1.5 -2 2011
#5 SouthAsia 2009 4.5 -1 2011
#6 SouthAsia 2011 11 0 2011
#7 SouthAsia 2014 16.7 1 2011
data
Sales <- structure(list(Region = c("SouthAsia", "SouthAsia", "SouthAsia",
"SouthAsia", "Africa", "Africa", "Africa"), ReviewYear = c(2006L,
2009L, 2011L, 2014L, 2008L, 2013L, 2017L), Sales = c(1.5, 4.5,
11, 16.7, 0.4, 3.5, 9.7), Index = c(NA, NA, 0L, NA, NA, 0L, NA
)), class = "data.frame", row.names = c(NA, -7L))
Strategy <- structure(list(Region = c("SouthAsia", "Africa", "Japan", "SEAsia"
), StrategyYear = c(2011L, 2013L, 2007L, 2009L)), class = "data.frame",
row.names = c(NA, -4L))
I have a data frame in R that looks like the one below. I want to create a new column called tfp level[1980] that, within each country, takes the 1980 value of tfp level.
So e.g. Australia will take the value 0.796980202 for every year, and Costa Rica 1.082085967 for every year.
country ISO year tfp level tfp level[1980]
Australia AUS 1980 0.796980202
Australia AUS 1981 0.808527768
Australia AUS 1982 0.790943801
Australia AUS 1983 0.818122745
Australia AUS 1984 0.827925146
Australia AUS 1985 0.825170755
Costa Rica CRI 1980 1.082085967
Costa Rica CRI 1981 1.033975005
Costa Rica CRI 1982 0.934024811
Costa Rica CRI 1983 0.920588791
There must be a way to solve this neatly with dplyr, for instance using the group_by command, but I can't get to a good solution myself.
Thanks.
After grouping by 'country', use mutate to get the 'tfp level' corresponding to the 'year' value 1980:
library(dplyr)
df1 %>%
group_by(country) %>%
mutate(tfplevel1980 = `tfp level`[year == 1980])
# A tibble: 10 x 5
# Groups: country [2]
# country ISO year `tfp level` tfplevel1980
# <chr> <chr> <int> <dbl> <dbl>
# 1 Australia AUS 1980 0.797 0.797
# 2 Australia AUS 1981 0.809 0.797
# 3 Australia AUS 1982 0.791 0.797
# 4 Australia AUS 1983 0.818 0.797
# 5 Australia AUS 1984 0.828 0.797
# 6 Australia AUS 1985 0.825 0.797
# 7 Costa Rica CRI 1980 1.08 1.08
# 8 Costa Rica CRI 1981 1.03 1.08
# 9 Costa Rica CRI 1982 0.934 1.08
#10 Costa Rica CRI 1983 0.921 1.08
Or using base R
df1$tfplevel1980 <- with(df1, ave(`tfp level` * (year == 1980),
country, FUN = function(x) x[x!= 0]))
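One caveat with the multiplication trick above: it recovers the group value by testing x != 0, so it would fail if a 1980 tfp level were exactly zero. A slightly more robust base R variant (a sketch with a toy data frame; `tfp` stands in for the `tfp level` column):

```r
# toy data frame with two countries and a 1980 baseline year each
df <- data.frame(country = c("A", "A", "B", "B"),
                 year    = c(1980, 1981, 1980, 1981),
                 tfp     = c(0.797, 0.809, 1.082, 1.034))

# keep the value where year == 1980, NA elsewhere, then propagate the
# single non-NA value across each country group
df$tfp1980 <- with(df, ave(ifelse(year == 1980, tfp, NA), country,
                           FUN = function(x) x[!is.na(x)][1]))
df$tfp1980
#> [1] 0.797 0.797 1.082 1.082
```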
data
df1 <- structure(list(country = c("Australia", "Australia", "Australia",
"Australia", "Australia", "Australia", "Costa Rica", "Costa Rica",
"Costa Rica", "Costa Rica"), ISO = c("AUS", "AUS", "AUS", "AUS",
"AUS", "AUS", "CRI", "CRI", "CRI", "CRI"), year = c(1980L, 1981L,
1982L, 1983L, 1984L, 1985L, 1980L, 1981L, 1982L, 1983L),
`tfp level` = c(0.796980202,
0.808527768, 0.790943801, 0.818122745, 0.827925146, 0.825170755,
1.082085967, 1.033975005, 0.934024811, 0.920588791)),
class = "data.frame", row.names = c(NA,
-10L))