Make time-period observations into annual observations in R - r

I have a dataset (df1) on hundreds of national crises, where each observation is a crisis event at the country level with a start and an end date. I also have the date when the crisis was announced (yyyy-mm-dd format), and a bunch of other crisis characteristics.
df1 <- data.frame(cbind(eventID=c(1,2,3,4), country=c("ALB","ALB","ARG","ARG"), start=c(1994, 1998, 1998, 1991), end=c(1996,1999,1999,1993), announcement=c("1994-11-01","1998-03-01","1998-07-01","1992-01-01"), x1=c(6,2,8,7), x2=c("a","q","k","b")))
eventID country start end announcement x1 x2
1 ALB 1994 1996 1994-11-01 6 a
2 ALB 1998 1999 1998-03-01 2 q
3 ARG 1998 1999 1998-07-01 8 k
4 ARG 1991 1993 1992-01-01 7 b
I need to make df2, a panel of countries with annual observations from the earliest "start" year to the latest "end" year. I want to have a dummy variable, "crisis", that equals 1 for the years between "start" and "end" in df1, and 0 otherwise. I want "announcement" to contain the announcement date in df1 for the year with an announcement, and "NA" otherwise. I would like the extra crisis characteristics, x1 and x2, to show up for crisis years to which they correspond, and "NA" otherwise.
I also need observations for each country for years in which no country has a crisis (in df2: 1997).
df2 <- data.frame(cbind(year=c(1991,1992,1993,1994,1995,1996,1997,1998,1999,1991,1992,1993,1994,1995,1996,1997,1998,1999), country=c("ALB","ALB","ALB","ALB","ALB","ALB","ALB","ALB","ALB","ARG","ARG","ARG","ARG","ARG","ARG","ARG","ARG","ARG"),crisis=c(0,0,0,1,1,1,0,1,1,1,1,1,0,0,0,0,1,1), announcement=c(NA, NA,NA,"1994-11-01",NA,NA,NA,"1998-03-01",NA,NA,"1992-01-01",NA,NA,NA,NA,NA,"1998-07-01"), x1=c(NA,NA,NA,6,6,6,NA,2,2,8,8,8,NA,NA,NA,NA,7,7), x2=c(NA,NA,NA,"a","a","a",NA,"q","q","k","k","k",NA,NA,NA,NA,"b","b")))
year country crisis announcement x1 x2
1991 ALB 0 NA NA NA
1992 ALB 0 NA NA NA
1993 ALB 0 NA NA NA
1994 ALB 1 1994-11-01 6 a
1995 ALB 1 NA 6 a
1996 ALB 1 NA 6 a
1997 ALB 0 NA NA NA
1998 ALB 1 1998-03-01 2 q
1999 ALB 1 NA 2 q
1991 ARG 1 NA 8 k
1992 ARG 1 1992-01-01 8 k
1993 ARG 1 NA 8 k
1994 ARG 0 NA NA NA
1995 ARG 0 NA NA NA
1996 ARG 0 NA NA NA
1997 ARG 0 NA NA NA
1998 ARG 1 1998-07-01 7 b
1999 ARG 1 NA 7 b
I would love any suggestions! I'm stumped as to how to replicate the observations for each year, but only include x1 and x2 values when my new "crisis" dummy = 1
Thanks!

Making use of dplyr and tidyr this could be achieved like so:
library(dplyr)
library(tidyr)
df1 <- data.frame(cbind(eventID=c(1,2,3,4), country=c("ALB","ALB","ARG","ARG"), start=c(1994, 1998, 1998, 1991), end=c(1996,1999,1999,1993), announcement=c("1994-11-01","1998-03-01","1998-07-01","1992-01-01"), x1=c(6,2,8,7), x2=c("a","q","k","b")))
df1 %>%
mutate(year = factor(start, levels = min(start):max(end))) %>%
complete(year, country) %>%
mutate(year = as.numeric(as.character(year))) %>%
arrange(country, year) %>%
group_by(country) %>%
fill(eventID, end, x1, x2) %>%
ungroup() %>%
mutate(across(c(eventID, end, x1, x2), ~ ifelse(end < year, NA, .)),
crisis = as.numeric(!is.na(eventID)))
#> # A tibble: 18 x 9
#> year country eventID start end announcement x1 x2 crisis
#> <dbl> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <dbl>
#> 1 1991 ALB <NA> <NA> <NA> <NA> <NA> <NA> 0
#> 2 1992 ALB <NA> <NA> <NA> <NA> <NA> <NA> 0
#> 3 1993 ALB <NA> <NA> <NA> <NA> <NA> <NA> 0
#> 4 1994 ALB 1 1994 1996 1994-11-01 6 a 1
#> 5 1995 ALB 1 <NA> 1996 <NA> 6 a 1
#> 6 1996 ALB 1 <NA> 1996 <NA> 6 a 1
#> 7 1997 ALB <NA> <NA> <NA> <NA> <NA> <NA> 0
#> 8 1998 ALB 2 1998 1999 1998-03-01 2 q 1
#> 9 1999 ALB 2 <NA> 1999 <NA> 2 q 1
#> 10 1991 ARG 4 1991 1993 1992-01-01 7 b 1
#> 11 1992 ARG 4 <NA> 1993 <NA> 7 b 1
#> 12 1993 ARG 4 <NA> 1993 <NA> 7 b 1
#> 13 1994 ARG <NA> <NA> <NA> <NA> <NA> <NA> 0
#> 14 1995 ARG <NA> <NA> <NA> <NA> <NA> <NA> 0
#> 15 1996 ARG <NA> <NA> <NA> <NA> <NA> <NA> 0
#> 16 1997 ARG <NA> <NA> <NA> <NA> <NA> <NA> 0
#> 17 1998 ARG 3 1998 1999 1998-07-01 8 k 1
#> 18 1999 ARG 3 <NA> 1999 <NA> 8 k 1

Related

Add multiple columns lagged by one year

I need to add a 1-year-lagged version of multiple columns from my dataframe. Here's my data:
data<-data.frame(Year=c("2011","2011","2011","2012","2012","2012","2013","2013","2013"),
Country=c("America","China","India","America","China","India","America","China","India"),
Value1=c(234,443,754,334,117,112,987,903,476),
Value2=c(2,4,5,6,7,8,1,2,2))
And I want to add two columns that contain Value1 and Value2 at t-1, so that my dataframe looks like this:
How can I do this? Would this be the correct way to lag my variables by year?
Thanks in advance!
Using data.table:
library(data.table)
setDT(data)
cols <- grep("^Value", colnames(data), value = TRUE)
data[, paste0(cols, "_lag") := lapply(.SD, shift), .SDcols = cols, by = Country]
# Year Country Value1 Value2 Value1_lag Value2_lag
# 1: 2011 America 234 2 NA NA
# 2: 2011 China 443 4 NA NA
# 3: 2011 India 754 5 NA NA
# 4: 2012 America 334 6 234 2
# 5: 2012 China 117 7 443 4
# 6: 2012 India 112 8 754 5
# 7: 2013 America 987 1 334 6
# 8: 2013 China 903 2 117 7
# 9: 2013 India 476 2 112 8
In dplyr, use lag by group:
library(dplyr) #1.1.0
data %>%
mutate(across(contains("Value"), lag, .names = "{col}_lagged"), .by = Country)
Year Country Value1 Value2 Value1_lagged Value2_lagged
1 2011 America 234 2 NA NA
2 2011 China 443 4 NA NA
3 2011 India 754 5 NA NA
4 2012 America 334 6 234 2
5 2012 China 117 7 443 4
6 2012 India 112 8 754 5
7 2013 America 987 1 334 6
8 2013 China 903 2 117 7
9 2013 India 476 2 112 8
Below 1.1.0:
data %>%
group_by(Country) %>%
mutate(across(c(GDP, Population), lag, .names = "{col}_lagged")) %>%
ungroup()
Another way using dplyr to ge tthe job done.
library(dplyr)
data_lagged <- data %>%
group_by(Country) %>%
mutate(Value1_Lagged = lag(Value1),
Value2_Lagged = lag(Value2),
Year = as.integer(as.character(Year)) + 1)
data_final <- cbind(data, data_lagged[, c("Value1_Lagged", "Value2_Lagged")])
data_final
Output:
Year Country Value1 Value2 Value1_Lagged Value2_Lagged
1 2011 America 234 2 NA NA
2 2011 China 443 4 NA NA
3 2011 India 754 5 NA NA
4 2012 America 334 6 234 2
5 2012 China 117 7 443 4
6 2012 India 112 8 754 5
7 2013 America 987 1 334 6
8 2013 China 903 2 117 7
9 2013 India 476 2 112 8

R - calculate annual population conditional on survival in every year

I have a data frame with three columns: birth_year, death_year, gender.
I have to calculate total alive male and female population for every year in a given range (1950:1980).
The data frame looks like this:
birth_year death_year gender
1934 1988 male
1922 1993 female
1890 1966 male
1901 1956 male
1946 2009 female
1909 1976 female
1899 1945 male
1887 1949 male
1902 1984 female
The person is alive in year x if death_year > x & birth year <= x
The output I am looking for is something like this:
year male female
1950 3 4
1951 2 3
1952 4 3
1953 4 5
.
.
1980 6 3
Thanks!
Does this work:
library(tidyr)
library(purrr)
library(dplyr)
df %>% mutate(year = map2(1950,1980, seq)) %>% unnest(year) %>%
mutate(isalive = case_when(year >= birth_year & year < death_year ~ 1, TRUE ~ 0)) %>%
group_by(year, gender) %>% summarise(alive = sum(isalive)) %>%
pivot_wider(names_from = gender, values_from = alive) %>% print( n = 50)
`summarise()` regrouping output by 'year' (override with `.groups` argument)
# A tibble: 31 x 3
# Groups: year [31]
year female male
<int> <dbl> <dbl>
1 1950 4 3
2 1951 4 3
3 1952 4 3
4 1953 4 3
5 1954 4 3
6 1955 4 3
7 1956 4 2
8 1957 4 2
9 1958 4 2
10 1959 4 2
11 1960 4 2
12 1961 4 2
13 1962 4 2
14 1963 4 2
15 1964 4 2
16 1965 4 2
17 1966 4 1
18 1967 4 1
19 1968 4 1
20 1969 4 1
21 1970 4 1
22 1971 4 1
23 1972 4 1
24 1973 4 1
25 1974 4 1
26 1975 4 1
27 1976 3 1
28 1977 3 1
29 1978 3 1
30 1979 3 1
31 1980 3 1
Data used:
df
# A tibble: 9 x 3
birth_year death_year gender
<dbl> <dbl> <chr>
1 1934 1988 male
2 1922 1993 female
3 1890 1966 male
4 1901 1956 male
5 1946 2009 female
6 1909 1976 female
7 1899 1945 male
8 1887 1949 male
9 1902 1984 female
Here's a simple base R solution. Summing a logical vector will get you your count of alive or dead because TRUE is 1 and FALSE is 0.
number_alive <- function(range, df){
sapply(range, function(x) sum((df$death_year > x) & (df$birth_year <= x)))
}
output <- data.frame('year' = 1950:1980,
'female' = number_alive(1950:1980, df[df$gender == 'female']),
'male' = number_alive(1950:1980, df[df$gender == 'male']))
# year female male
# 1 1950 4 3
# 2 1951 4 3
# 3 1952 4 3
# 4 1953 4 3
# 5 1954 4 3
# 6 1955 4 3
# 7 1956 4 2
# 8 1957 4 2
# 9 1958 4 2
# 10 1959 4 2
# 11 1960 4 2
# 12 1961 4 2
# 13 1962 4 2
# 14 1963 4 2
# 15 1964 4 2
# 16 1965 4 2
# 17 1966 4 1
# 18 1967 4 1
# 19 1968 4 1
# 20 1969 4 1
# 21 1970 4 1
# 22 1971 4 1
# 23 1972 4 1
# 24 1973 4 1
# 25 1974 4 1
# 26 1975 4 1
# 27 1976 3 1
# 28 1977 3 1
# 29 1978 3 1
# 30 1979 3 1
# 31 1980 3 1
This approach uses an ifelse to determine if alive (1) or dead (0).
Data:
df <- "birth_year death_year gender
1934 1988 male
1922 1993 female
1890 1966 male
1901 1956 male
1946 2009 female
1909 1976 female
1899 1945 male
1887 1949 male
1902 1984 female"
df <- read.table(text = df, header = TRUE)
Code:
library(dplyr)
library(tidyr)
library(tibble)
library(purrr)
df %>%
mutate(year = map2(1950,1980, seq)) %>%
unnest(year) %>%
select(year, birth_year, death_year, gender) %>%
mutate(
alive = ifelse(year >= birth_year & year <= death_year, 1, 0)
) %>%
group_by(year, gender) %>%
summarise(
is_alive = sum(alive)
) %>%
pivot_wider(
names_from = gender,
values_from = is_alive
) %>%
select(year, male, female)
Output:
#> # A tibble: 31 x 3
#> # Groups: year [31]
#> year male female
#> <int> <dbl> <dbl>
#> 1 1950 3 4
#> 2 1951 3 4
#> 3 1952 3 4
#> 4 1953 3 4
#> 5 1954 3 4
#> 6 1955 3 4
#> 7 1956 3 4
#> 8 1957 2 4
#> 9 1958 2 4
#> 10 1959 2 4
#> # … with 21 more rows
Created on 2020-11-11 by the reprex package (v0.3.0)

r data.table adjust min and max years only if each set has at least one incrementing obs

I have a data set that holds an id, location, start year, end year, age1 and age2. For each group defined as id, location, age1 and age2, I would like to create new start and end year. For instance, I may have three entries for china encompassing age 0 - age 4. One will be 2000 - 2000, the other is 2001 - 2001, and the final is 2005-2005. Since the years are incrementing by 1 in the first two entries, I'd want their corresponding newstart and newend to be 2000-2001. The third entry would have newstart==2005 and newend==2005 as this is not apart of a continuous set of years.
The data table I have resembles the following, except it has thousands of entries many combinations :
id location start end age1 age2
1 brazil 2000 2000 0 4
1 brazil 2001 2001 0 4
1 brazil 2002 2002 0 4
2 argentina 1990 1991 1 1
2 argentina 1991 1991 2 2
2 argentina 1992 1992 2 2
2 argentina 1993 1993 2 2
3 belize 2001 2001 0.5 1
3 belize 2005 2005 1 2
I want to alter the data table so that it will look like the following
id location start end age1 age2 newstart newend
1 brazil 2000 2000 0 4 2000 2002
1 brazil 2001 2001 0 4 2000 2002
1 brazil 2002 2002 0 4 2000 2002
2 argentina 1990 1991 1 1 1991 1991
2 argentina 1991 1991 2 2 1991 1993
2 argentina 1992 1992 2 2 1991 1993
2 argentina 1993 1993 2 2 1991 1993
3 belize 2001 2001 0.5 1 2001 2001
3 belize 2005 2005 1 2 2005 2005
I have tried creating a variable that tracks the difference of the previous year and the current year using lag and then calculating the difference between these two years. I then created the newstart and newend by placing the min start and max end. I have found that this only works if there is a set of 2 in continuous years. If I have a larger set, this doesn't work as it has no way of tracking the number of obs in which the years increase by 1 for each grouping. I believe I need some type of loop.
Is there a more efficient way to accomplish this?
data.table
You tagged with data.table, so my first suggestion is this:
library(data.table)
dat[, contiguous := rleid(c(TRUE, diff(start) == 1)), by = .(id)]
dat[, c("newstart", "newend") := .(min(start), max(end)), by = .(id, contiguous)]
dat[, contiguous := NULL]
dat
# id location start end age1 age2 newstart newend
# 1: 1 brazil 2000 2000 0.0 4 2000 2002
# 2: 1 brazil 2001 2001 0.0 4 2000 2002
# 3: 1 brazil 2002 2002 0.0 4 2000 2002
# 4: 2 argentina 1990 1991 1.0 1 1990 1993
# 5: 2 argentina 1991 1991 2.0 2 1990 1993
# 6: 2 argentina 1992 1992 2.0 2 1990 1993
# 7: 2 argentina 1993 1993 2.0 2 1990 1993
# 8: 3 belize 2001 2001 0.5 1 2001 2001
# 9: 3 belize 2005 2005 1.0 2 2005 2005
base R
If instead you really just mean data.frame, then
dat <- transform(dat, contiguous = ave(start, id, FUN = function(a) cumsum(c(TRUE, diff(a) != 1))))
dat <- transform(dat,
newstart = ave(start, id, contiguous, FUN = min),
newend = ave(end , id, contiguous, FUN = max)
)
# Warning in FUN(X[[i]], ...) :
# no non-missing arguments to min; returning Inf
# Warning in FUN(X[[i]], ...) :
# no non-missing arguments to min; returning Inf
# Warning in FUN(X[[i]], ...) :
# no non-missing arguments to max; returning -Inf
# Warning in FUN(X[[i]], ...) :
# no non-missing arguments to max; returning -Inf
dat
# id location start end age1 age2 newstart newend contiguous
# 1 1 brazil 2000 2000 0.0 4 2000 2002 1
# 2 1 brazil 2001 2001 0.0 4 2000 2002 1
# 3 1 brazil 2002 2002 0.0 4 2000 2002 1
# 4 2 argentina 1990 1991 1.0 1 1990 1993 1
# 5 2 argentina 1991 1991 2.0 2 1990 1993 1
# 6 2 argentina 1992 1992 2.0 2 1990 1993 1
# 7 2 argentina 1993 1993 2.0 2 1990 1993 1
# 8 3 belize 2001 2001 0.5 1 2001 2001 1
# 9 3 belize 2005 2005 1.0 2 2005 2005 2
dat$contiguous <- NULL
Interesting point I just learned about ave: it uses interaction(...) (all grouping variables), which is going to give all possible combinations, not just the combinations observed in the data. Because of that, the FUNction may be called with zero data. In this case, it did, giving the warnings. One could suppress this with function(a) suppressWarnings(min(a)) instead of just min.
We could use dplyr. After grouping by 'id', take the difference of the 'start' and the lagof the 'start', apply rleid to get the run-length-id' and create the 'newstart', 'newend' as the min and max of the 'start'
library(dplyr)
library(data.table)
df1 %>%
group_by(id) %>%
group_by(grp = rleid(replace_na(start - lag(start), 1)),
.add = TRUE) %>%
mutate(newstart = min(start), newend = max(end))
-output
# A tibble: 9 x 9
# Groups: id, grp [4]
# id location start end age1 age2 grp newstart newend
# <int> <chr> <int> <int> <dbl> <int> <int> <int> <int>
#1 1 brazil 2000 2000 0 4 1 2000 2002
#2 1 brazil 2001 2001 0 4 1 2000 2002
#3 1 brazil 2002 2002 0 4 1 2000 2002
#4 2 argentina 1990 1991 1 1 1 1990 1993
#5 2 argentina 1991 1991 2 2 1 1990 1993
#6 2 argentina 1992 1992 2 2 1 1990 1993
#7 2 argentina 1993 1993 2 2 1 1990 1993
#8 3 belize 2001 2001 0.5 1 1 2001 2001
#9 3 belize 2005 2005 1 2 2 2005 2005
Or with data.table
library(data.table)
setDT(df1)[, grp := rleid(replace_na(start - shift(start), 1))
][, c('newstart', 'newend') := .(min(start), max(end)), .(id, grp)][, grp := NULL]

Trying to convert data long format to wide format

My data frame currently looks like
country_txt Year nkill_yr Countrycode Population deathsPer100k
<chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Afghanistan 1973 0 4 12028 0.000000e+00
2 Afghanistan 1979 53 4 13307 3.982866e-05
3 Afghanistan 1987 0 4 11503 0.000000e+00
4 Afghanistan 1988 128 4 11541 1.109089e-04
5 Afghanistan 1989 10 4 11778 8.490406e-06
6 Afghanistan 1990 12 4 12249 9.796718e-06
It contains a list of al countries, and the terrorist Deaths per 100,000 population.
Ideally I would Like a data frame in wide format that has the structure of:
country_txt 1970 1971 1972 1973 1974 1975
Afghanistan 3.98 1.1 0 4.3 0.8 0.09
Albania 0 0.4 0.5 0 0 0
Algeria 0 0 0 0.1 0.2 0
Angola 0 0.3 0 0 0 0
Except my function currently repeats like this:
YearCountryRatio<- spread(data = YearCountryRatio, Year, deathsPer100k )
country_txt 1970 1971 1972 1973
Afghanistan 3.98 NA NA NA
Afghanistan NA 1.1 NA NA
Afghanistan NA NA 0 NA
Afghanistan NA NA NA 4.3
And similarly for other countries,
Is there any way to either:
Collapse all of the NA values to show only one country or
Put it directly into wide format?
I've assumed you want each country_txt value reduced to a single row and are happy to drop the unused variables. (Note: I added a dummy country_txt value of "XYZ" to the sample data to show how multiple countries spread)
library(dplyr)
library(tidyr)
df <- read.table(text = "country_txt Year nkill_yr Countrycode Population deathsPer100k
1 Afghanistan 1973 0 4 12028 0.000000e+00
2 Afghanistan 1979 53 4 13307 3.982866e-05
3 Afghanistan 1987 0 4 11503 0.000000e+00
4 XYZ 1988 128 4 11541 1.109089e-04
5 XYZ 1989 10 4 11778 8.490406e-06
6 XYZ 1990 12 4 12249 9.796718e-06", header = TRUE)
df <- mutate(df, deathsPer100k = round(deathsPer100k*100000, 2))
select(df, country_txt, Year, deathsPer100k) %>% spread(Year, deathsPer100k, fill = 0)
#> country_txt 1973 1979 1987 1988 1989 1990
#> 1 Afghanistan 0 3.98 0 0.00 0.00 0.00
#> 2 XYZ 0 0.00 0 11.09 0.85 0.98

Merging datasets based on more than 1 column in both datasets

I'm trying to merge two datasets, by year and country. The first data set (df = GNIPC) represent Gross national income per capite for every country from 1980-2008.
Country Year GNIpc
(chr) (dbl) (dbl)
1 Afghanistan 1990 NA
2 Afghanistan 1991 NA
3 Afghanistan 1992 2010
4 Afghanistan 1993 NA
5 Afghanistan 1994 12550
6 Afghanistan 1995 NA
The second dataset (df = sanctions) represents the imposition of economic sanctions from 1946 to present day.
country imposition sanctiontype sanctions_period
(chr) (dbl) (chr) (chr)
1 Afghanistan 1 1 6 8 1997-2001
2 Afghanistan 1 7 1979-1979
3 Afghanistan 1 4 7 1995-2002
4 Albania 1 2 8 2005-2005
5 Albania 1 7 2005-2006
6 Albania 1 8 2004-2005
I would like to merge the two datasets so that for every GNI year i either have sanctions present in the country or not. For the GNI years that are not in the sanctions_period the value would be 0 and for those that are it would be 1. This is what i want it to look like:
Country Year GNIpc Imposition sanctiontype
(chr) (dbl) (dbl) (dbl) (chr)
1 Afghanistan 1990 NA 0 NA
2 Afghanistan 1991 NA 0 NA
3 Afghanistan 1992 2010 0 NA
4 Afghanistan 1993 NA 0 NA
5 Afghanistan 1994 12550 0 NA
6 Afghanistan 1995 NA 1 4 7
Some example data:
df1 <- data.frame(country = c('Afghanistan', 'Turkey'),
imposition = c(1, 0),
sanctiontype = c('1 6 8', '4'),
sanctions_period = c('1997-2001', '2003-ongoing')
)
country imposition sanctiontype sanctions_period
1 Afghanistan 1 1 6 8 1997-2001
2 Turkey 0 4 2012-ongoing
The "sanctions_period" column can be transformed with dplyr and tidyr:
library(tidyr)
library(dplyr)
df.new <- separate(df1, sanctions_period, c('start', 'end'), remove = F) %>%
mutate(end = ifelse(end == 'ongoing', '2016', end)) %>%
mutate(start = as.numeric(start), end = as.numeric(end)) %>%
group_by(country, sanctions_period) %>%
do(data.frame(country = .$country, imposition = .$imposition, sanctiontype = .$sanctiontype, year = .$start:.$end))
sanctions_period country imposition sanctiontype year
<fctr> <fctr> <dbl> <fctr> <int>
1 1997-2001 Afghanistan 1 1 6 8 1997
2 1997-2001 Afghanistan 1 1 6 8 1998
3 1997-2001 Afghanistan 1 1 6 8 1999
4 1997-2001 Afghanistan 1 1 6 8 2000
5 1997-2001 Afghanistan 1 1 6 8 2001
6 2012-ongoing Turkey 0 4 2012
7 2012-ongoing Turkey 0 4 2013
8 2012-ongoing Turkey 0 4 2014
9 2012-ongoing Turkey 0 4 2015
10 2012-ongoing Turkey 0 4 2016
From there, it should easy to merge with your first data frame. Note that your first data frame capitalizes Country and Year, while the second doesn't.
df.merged <- merge(df.first, df.new, by.x = c('Country', 'Year'), by.y = c('country', 'year'))
Using dplyr:
left_join(GNIPC, sanctions, by=c("Country"="country", "Year"="Year")) %>%
select(Country,Year, GNIpc, Imposition, sanctiontype)

Resources