Fill in Multiple NAs with Lagged Values R

I am trying to fill the NA values in this data frame with the most recent non-NA value in the cost column. I want to group by city - so all NAs for Omaha should be 44.50, and the NAs for Lincoln should be 62.50. Here is the code I have been using - it replaces the first NA (April) for each group with the correct value, but does not fill past that.
df <- df %>%
group_by(city) %>%
mutate(cost = ifelse(is.na(cost), lag(cost, na.rm=TRUE), cost))
Data before running code:
year month city cost
2021 January Omaha 45.50
2021 February Omaha 46.75
2021 March Omaha 44.50
2021 April Omaha NA
2021 May Omaha NA
2021 June Omaha NA
2021 January Lincoln 55.25
2021 February Lincoln 53.80
2021 March Lincoln 62.50
2021 April Lincoln NA
2021 May Lincoln NA
2021 June Lincoln NA

Use:
library(tidyverse)
df %>%
group_by(city) %>%
fill(cost)
# A tibble: 12 x 4
# Groups: city [2]
year month city cost
<int> <chr> <chr> <dbl>
1 2021 January Omaha 45.5
2 2021 February Omaha 46.8
3 2021 March Omaha 44.5
4 2021 April Omaha 44.5
5 2021 May Omaha 44.5
6 2021 June Omaha 44.5
7 2021 January Lincoln 55.2
8 2021 February Lincoln 53.8
9 2021 March Lincoln 62.5
10 2021 April Lincoln 62.5
11 2021 May Lincoln 62.5
12 2021 June Lincoln 62.5

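To keep your ifelse approach, use last rather than lag, and wrap cost in na.omit so that last returns the final non-NA value in each group (though fill is the much better option here). lag only shifts the column down by one row, which is why only the first NA in each group was replaced. The last(na.omit(cost)) version gives the right answer here because every NA comes after the last observed value in its group.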
library(tidyverse)
df %>%
group_by(city) %>%
mutate(cost = ifelse(is.na(cost), last(na.omit(cost)), cost))
Output
year month city cost
<int> <chr> <chr> <dbl>
1 2021 January Omaha 45.5
2 2021 February Omaha 46.8
3 2021 March Omaha 44.5
4 2021 April Omaha 44.5
5 2021 May Omaha 44.5
6 2021 June Omaha 44.5
7 2021 January Lincoln 55.2
8 2021 February Lincoln 53.8
9 2021 March Lincoln 62.5
10 2021 April Lincoln 62.5
11 2021 May Lincoln 62.5
12 2021 June Lincoln 62.5
Data
df <- structure(list(year = c(2021L, 2021L, 2021L, 2021L, 2021L, 2021L,
2021L, 2021L, 2021L, 2021L, 2021L, 2021L), month = c("January",
"February", "March", "April", "May", "June", "January", "February",
"March", "April", "May", "June"), city = c("Omaha", "Omaha",
"Omaha", "Omaha", "Omaha", "Omaha", "Lincoln", "Lincoln", "Lincoln",
"Lincoln", "Lincoln", "Lincoln"), cost = c(45.5, 46.75, 44.5,
NA, NA, NA, 55.25, 53.8, 62.5, NA, NA, NA)), class = "data.frame", row.names = c(NA,
-12L))
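The difference between the two answers only shows up when an NA sits between two observed values. The following is a minimal sketch on hypothetical toy data (the data frame toy and the column names cost_fill and cost_last are made up for illustration), assuming dplyr and tidyr are installed:

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: the NA in row 2 sits between two observed values
toy <- data.frame(
  city = c("Omaha", "Omaha", "Omaha", "Omaha"),
  cost = c(45.5, NA, 44.5, NA)
)

filled <- toy %>%
  group_by(city) %>%
  mutate(
    cost_fill = cost,
    # every NA gets the final non-NA value of the whole group
    cost_last = ifelse(is.na(cost), last(na.omit(cost)), cost)
  ) %>%
  fill(cost_fill) %>%  # each NA gets the closest non-NA value above it
  ungroup()

filled
```

In this sketch cost_fill carries 45.5 into row 2, while cost_last puts 44.5 there, so the two approaches agree only when all NAs are trailing.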

Related

Rename/recode variable value in R based on condition using dplyr

I have a dataset dataExtended with the variables CountryOther and n, where n is the count of wines in that particular country. CountryOther is of type character and n is an integer. I want to rename the values in CountryOther to Other whenever n <= 20. I would like to do it with the dplyr package, but I am not sure how, or whether to use only mutate or mutate_at.
Since I wasn't able to write the condition as stated above, I tried to do it manually as follows, but it didn't work:
dataExtended$CountryOther <- dataExtended$Country
dataExtended %>%
mutate(CountryOther = recode(CountryOther,
China = "Other",
Mexico = "Other",
Slovakia = "Other",
Bulgaria = "Other",
Canada = "Other",
Croatia = "Other",
Uruguay = "Other",
Georgia = "Other",
Turkey = "Other",
Moldova = "Other",
Slovenia = "Other",
Hungary = "Other",
Switzerland = "Other",
Greece = "Other",
Israel = "Other",
Lebanon= "Other"))

Using the Red.csv from your link imported with readr::read_csv() creates a data.frame / tibble
#> data
# A tibble: 8,666 × 8
Name Country Region Winery Rating NumberOf…¹ Price Year
<chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <chr>
1 Pomerol 2011 France Pomerol Château La Providence 4.2 100 95 2011
2 Lirac 2017 France Lirac Château Mont-Redon 4.3 100 15.5 2017
3 Erta e China Rosso di Toscana 2015 Italy Toscana Renzo Masi 3.9 100 7.45 2015
4 Bardolino 2019 Italy Bardolino Cavalchina 3.5 100 8.72 2019
5 Ried Scheibner Pinot Noir 2016 Austria Carnuntum Markowitsch 3.9 100 29.2 2016
6 Gigondas (Nobles Terrasses) 2017 France Gigondas Vieux Clocher 3.7 100 19.9 2017
7 Marion's Vineyard Pinot Noir 2016 New Zealand Wairarapa Schubert 4 100 43.9 2016
8 Red Blend 2014 Chile Itata Valley Viña La Causa 3.9 100 17.5 2014
9 Chianti 2015 Italy Chianti Castello Montaùto 3.6 100 10.8 2015
10 Tradition 2014 France Minervois Domaine des Aires Hautes 3.5 100 6.9 2014
# … with 8,656 more rows, and abbreviated variable name ¹NumberOfRatings
Now, with dplyr's help:
library(dplyr)
data %>%
add_count(Country, name = "WineCount") %>%
mutate(CountryOther = ifelse(WineCount <= 20, "Other", Country))
we get
# A tibble: 8,666 × 10
Name Country Region Winery Rating Numbe…¹ Price Year WineC…² Count…³
<chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <chr> <int> <chr>
1 Pomerol 2011 France Pomerol Château La… 4.2 100 95 2011 2256 France
2 Lirac 2017 France Lirac Château Mo… 4.3 100 15.5 2017 2256 France
3 Erta e China Rosso di Toscana 2015 Italy Toscana Renzo Masi 3.9 100 7.45 2015 2650 Italy
4 Bardolino 2019 Italy Bardolino Cavalchina 3.5 100 8.72 2019 2650 Italy
5 Ried Scheibner Pinot Noir 2016 Austria Carnuntum Markowitsch 3.9 100 29.2 2016 220 Austria
6 Gigondas (Nobles Terrasses) 2017 France Gigondas Vieux Cloc… 3.7 100 19.9 2017 2256 France
7 Marion's Vineyard Pinot Noir 2016 New Zealand Wairarapa Schubert 4 100 43.9 2016 63 New Ze…
8 Red Blend 2014 Chile Itata Valley Viña La Ca… 3.9 100 17.5 2014 326 Chile
9 Chianti 2015 Italy Chianti Castello M… 3.6 100 10.8 2015 2650 Italy
10 Tradition 2014 France Minervois Domaine de… 3.5 100 6.9 2014 2256 France
# … with 8,656 more rows, and abbreviated variable names ¹NumberOfRatings, ²WineCount, ³CountryOther
We can filter for WineCount <= 30:
# A tibble: 125 × 10
Name Country Region Winery Rating Numbe…¹ Price Year WineC…² Count…³
<chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <chr> <int> <chr>
1 Steiner 2013 Hungary Sopron Wenin… 3.7 100 24.5 2013 9 Other
2 Viile Metamorfosis Merlot 2015 Romania Dealu Mare Vitis… 3.5 102 7.5 2015 23 Romania
3 Halkidiki Limnio - Merlot 2013 Greece Chalkidiki Tsant… 3.2 105 12.5 2013 13 Other
4 Cabernet Sauvignon 2013 Mexico Valle de Guad… L. A.… 3.4 1066 8.65 2013 1 Other
5 Driopi Classic Agiorgitiko Nemea 2017 Greece Nemea Κτημα… 3.7 107 11.5 2017 13 Other
6 Malbec de Purcari 2018 Moldova South Eastern Châte… 4.1 107 12.0 2018 8 Other
7 Cabernet Sauvignon de Purcari 2017 Moldova South Eastern Châte… 4.1 1082 13.0 2017 8 Other
8 Cabernet Sauvignon 2016 Romania Samburesti Caste… 3.3 112 7.9 2016 23 Romania
9 Aigle Les Murailles Rouge 2015 Switzerland Aigle Henri… 3.7 112 23.2 2015 12 Other
10 Γουμένισσα (Goumenissa) 2015 Greece Goumenissa Chatz… 3.7 115 20 2015 13 Other
to check the output: several rows now contain "Other" in the CountryOther column.

In the end I wrote this code, which works:
#New table with wine count
wineCount <- data %>% count(Country)
#Joining two tables together
dataExtended <- inner_join(wineCount, data, by = "Country")
# Creating new variable CountryOther
dataExtended$CountryOther <- dataExtended$Country
# Renaming count from n to WineCount
dataExtended <- rename(dataExtended, WineCount = n)
# Replacement of countries with WineCount<=20 to Other
dataExtended <- dataExtended %>%
mutate(CountryOther = ifelse(WineCount<=20, "Other", CountryOther))
# Final check
unique(dataExtended$CountryOther)
The problem was I needed to store changes into the dataframe, which I didn't do before (as you can see in my last comment):
dataExtended <- rename(dataExtended, WineCount = n)
and
dataExtended <- dataExtended %>%
mutate(CountryOther = ifelse(WineCount<=20, "Other", CountryOther))
I also tested your code and it works as well and additionally it looks neater. So thank you very much for your help.
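For comparison, the same count-then-recode logic can be written without any packages. This is only a sketch on hypothetical toy data: the data frame wines and the lowered threshold are made up so that the result is easy to check by eye (the question uses 20).

```r
# Hypothetical toy data: one row per wine
wines <- data.frame(
  Country = c("France", "France", "France", "Italy", "Italy", "Mexico"),
  stringsAsFactors = FALSE
)
threshold <- 1  # assumption for the toy data; the question uses 20

# Count wines per country, then look the count up for every row
counts <- table(wines$Country)
wines$WineCount <- as.integer(counts[wines$Country])

# Countries at or below the threshold become "Other"
wines$CountryOther <- ifelse(wines$WineCount <= threshold, "Other", wines$Country)

wines
```

As in the thread, the result has to be assigned back (here via wines$... <-) for the change to persist.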

How can I create a new column that only extracts either the year or month from a mm/dd/yy hh:mm string?

I have a date/time string variable that looks like this:
> dput(df$starttime)
c("12/16/20 7:24", "6/21/21 13:20", "1/22/20 9:03", "1/07/20 17:19",
"11/8/21 10:14", NA, NA, "10/26/21 7:19", "3/14/22 9:48", "5/12/22 13:29")
I basically want to create one column that has only the year (2020, 2021, 2022) and another that has the year + month (e.g., "Jan 2022").

1) Base R Assuming that you want separate month and year numeric columns, define a function which converts a string in the format shown in the question to a year or month number and then invoke it twice. No packages are used.
toNum <- function(x, fmt) format(as.Date(x, "%m/%d/%y"), fmt) |>
type.convert(as.is = TRUE)
transform(df, year = toNum(starttime, "%Y"), month = toNum(starttime, "%m"))
giving
starttime year month
1 12/16/20 7:24 2020 12
2 6/21/21 13:20 2021 6
3 1/22/20 9:03 2020 1
4 1/07/20 17:19 2020 1
5 11/8/21 10:14 2021 11
6 <NA> NA NA
7 <NA> NA NA
8 10/26/21 7:19 2021 10
9 3/14/22 9:48 2022 3
10 5/12/22 13:29 2022 5
2) yearmon Assuming that you want a yearmon class column, we can use the following. yearmon represents year and month internally as year + fraction, where fraction is 0 for Jan, 1/12 for Feb, ..., 11/12 for Dec, so that it sorts appropriately and adding 1/12, say, gives the next month. Note that if ym is yearmon then as.integer(ym) is the year and cycle(ym) is the month number (1, 2, ..., 12).
library(zoo)
transform(df, yearmon = as.yearmon(starttime, "%m/%d/%y"))
giving:
starttime yearmon
1 12/16/20 7:24 Dec 2020
2 6/21/21 13:20 Jun 2021
3 1/22/20 9:03 Jan 2020
4 1/07/20 17:19 Jan 2020
5 11/8/21 10:14 Nov 2021
6 <NA> <NA>
7 <NA> <NA>
8 10/26/21 7:19 Oct 2021
9 3/14/22 9:48 Mar 2022
10 5/12/22 13:29 May 2022
Note
If you want to sort by starttime then use
ct <- as.POSIXct(df$starttime, format = "%m/%d/%y %H:%M") # %y because the year has two digits
df[order(ct),, drop = FALSE ]

If you want a chronologically sortable output, you could use the tsibble::yearmonth type:
tsibble::yearmonth(lubridate::mdy_hm(c("12/16/20 7:24", "6/21/21 13:20", "1/22/20 9:03", "1/07/20 17:19",
"11/8/21 10:14", NA, NA, "10/26/21 7:19", "3/14/22 9:48", "5/12/22 13:29")))
result
<yearmonth[10]>
[1] "2020 Dec" "2021 Jun" "2020 Jan" "2020 Jan" "2021 Nov" NA NA
[8] "2021 Oct" "2022 Mar" "2022 May"

An option is to convert to datetime class POSIXct with mdy_hm (from lubridate), then format to extract the month (%b) and 4 digit year (%Y), filter out the NA elements and arrange based on the converted datetime column
library(dplyr)
library(lubridate)
df %>%
mutate(starttime = mdy_hm(starttime),
yearmonth = format(starttime, "%b %Y")) %>%
filter(complete.cases(yearmonth)) %>%
arrange(starttime)
-output
# A tibble: 8 × 2
starttime yearmonth
<dttm> <chr>
1 2020-01-07 17:19:00 Jan 2020
2 2020-01-22 09:03:00 Jan 2020
3 2020-12-16 07:24:00 Dec 2020
4 2021-06-21 13:20:00 Jun 2021
5 2021-10-26 07:19:00 Oct 2021
6 2021-11-08 10:14:00 Nov 2021
7 2022-03-14 09:48:00 Mar 2022
8 2022-05-12 13:29:00 May 2022

Try this with lubridate
library(lubridate)
data.frame(df,
Year = format(mdy_hm(df$starttime), "%Y"),
MonthYear = format(mdy_hm(df$starttime), "%b %Y"))
starttime Year MonthYear
1 12/16/20 7:24 2020 Dec 2020
2 6/21/21 13:20 2021 Jun 2021
3 1/22/20 9:03 2020 Jan 2020
4 1/07/20 17:19 2020 Jan 2020
5 11/8/21 10:14 2021 Nov 2021
6 <NA> <NA> <NA>
7 <NA> <NA> <NA>
8 10/26/21 7:19 2021 Oct 2021
9 3/14/22 9:48 2022 Mar 2022
10 5/12/22 13:29 2022 May 2022
It uses mdy_hm together with format: %Y gives the four-digit year, and %b %Y gives the abbreviated month followed by the year.
Ordered rows:
df_new <- data.frame(df,
Year = format(mdy_hm(df$starttime), "%Y"),
MonthYear = format(mdy_hm(df$starttime), "%b %Y"))
df_new[order(my(df_new$MonthYear)),]
starttime Year MonthYear
3 1/22/20 9:03 2020 Jan 2020
4 1/07/20 17:19 2020 Jan 2020
1 12/16/20 7:24 2020 Dec 2020
2 6/21/21 13:20 2021 Jun 2021
8 10/26/21 7:19 2021 Oct 2021
5 11/8/21 10:14 2021 Nov 2021
9 3/14/22 9:48 2022 Mar 2022
10 5/12/22 13:29 2022 May 2022
6 <NA> <NA> <NA>
7 <NA> <NA> <NA>
Without NAs
na.omit(df_new[order(my(df_new$MonthYear)),])
starttime Year MonthYear
3 1/22/20 9:03 2020 Jan 2020
4 1/07/20 17:19 2020 Jan 2020
1 12/16/20 7:24 2020 Dec 2020
2 6/21/21 13:20 2021 Jun 2021
8 10/26/21 7:19 2021 Oct 2021
5 11/8/21 10:14 2021 Nov 2021
9 3/14/22 9:48 2022 Mar 2022
10 5/12/22 13:29 2022 May 2022
Data
df <- structure(list(starttime = c("12/16/20 7:24", "6/21/21 13:20",
"1/22/20 9:03", "1/07/20 17:19", "11/8/21 10:14", NA, NA, "10/26/21 7:19",
"3/14/22 9:48", "5/12/22 13:29")), class = "data.frame", row.names = c(NA,
-10L))
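The answers that format with %b depend on the session's locale for the month abbreviation. As a package-free sketch that sidesteps this, the month number can be looked up in the built-in month.abb constant instead. The variable names below (ct, year, month_num, year_month) and the choice of tz = "UTC" are assumptions for illustration:

```r
starttime <- c("12/16/20 7:24", "6/21/21 13:20", "1/07/20 17:19", NA)

# %y parses the two-digit year; tz is fixed only to keep the example reproducible
ct <- as.POSIXct(starttime, format = "%m/%d/%y %H:%M", tz = "UTC")

year <- as.integer(format(ct, "%Y"))
month_num <- as.integer(format(ct, "%m"))

# month.abb holds English abbreviations regardless of locale;
# the ifelse keeps missing timestamps as NA instead of the string "NA NA"
year_month <- ifelse(is.na(ct), NA_character_, paste(month.abb[month_num], year))

data.frame(starttime, year, year_month)
```

Note that year_month is a plain character column, so it sorts alphabetically; sort by ct (or use one of the yearmon/yearmonth classes above) when chronological order matters.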

NAs in a data frame split by country in R [duplicate]

I would like to impute NA's in a dataframe with means of observed data in each country. In other words, while dealing with NAs, the values in the specific country should be taken into consideration. For instance;
Date Country Battles Riots
March 2018 Afghanistan 380 NA
March 2018 Yemen 88 5
March 2018 Mali 45 NA
April 2018 Afghanistan 350 NA
April 2018 Yemen NA 66
April 2018 Mali 67 NA
May 2018 Afghanistan NA 7
May 2018 Yemen NA NA
May 2018 Mali NA 6
I have used the following code, but obviously it calculates the means without taking the country specific information.
for(i in 6:ncol(my_data)) {
my_data[ , i][is.na(my_data[ , i])] <- mean(my_data[ , i], na.rm = TRUE)
}
Many thanks in advance.

You could use:
library(dplyr)
library(tidyr)
df %>%
group_by(Country) %>%
mutate(across(c(Battles, Riots), ~ replace_na(.x, mean(.x, na.rm = TRUE)))) %>%
ungroup()
which returns
Date Country Battles Riots
<chr> <chr> <dbl> <dbl>
1 March 2018 Afghanistan 380 7
2 March 2018 Yemen 88 5
3 March 2018 Mali 45 6
4 April 2018 Afghanistan 350 7
5 April 2018 Yemen 88 66
6 April 2018 Mali 67 6
7 May 2018 Afghanistan 365 7
8 May 2018 Yemen 88 35.5
9 May 2018 Mali 56 6
Data
structure(list(Date = c("March 2018", "March 2018", "March 2018",
"April 2018", "April 2018", "April 2018", "May 2018", "May 2018",
"May 2018"), Country = c("Afghanistan", "Yemen", "Mali", "Afghanistan",
"Yemen", "Mali", "Afghanistan", "Yemen", "Mali"), Battles = c(380,
88, 45, 350, NA, 67, NA, NA, NA), Riots = c(NA, 5, NA, NA, 66,
NA, 7, NA, 6)), row.names = c(NA, -9L), class = c("tbl_df", "tbl",
"data.frame"))

A data.table option (borrowing df from @Martin Gal):
library(data.table)
setDT(df)[
,
c("Battles", "Riots") := lapply(
.(Battles, Riots),
function(x) replace(x, is.na(x), mean(x, na.rm = TRUE))
),
Country
][]
gives
Date Country Battles Riots
1: March 2018 Afghanistan 380 7.0
2: March 2018 Yemen 88 5.0
3: March 2018 Mali 45 6.0
4: April 2018 Afghanistan 350 7.0
5: April 2018 Yemen 88 66.0
6: April 2018 Mali 67 6.0
7: May 2018 Afghanistan 365 7.0
8: May 2018 Yemen 88 35.5
9: May 2018 Mali 56 6.0
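The same per-group imputation can also be sketched in base R with ave, which applies a function within each level of a grouping variable and returns a vector of the original length. The data frame conflicts below is hypothetical toy data, shortened from the question's example:

```r
# Hypothetical toy data with two countries
conflicts <- data.frame(
  Country = c("Afghanistan", "Afghanistan", "Afghanistan", "Yemen", "Yemen"),
  Battles = c(380, 350, NA, 88, NA)
)

# Replace each NA with the mean of the observed values in the same country
conflicts$Battles <- ave(
  conflicts$Battles,
  conflicts$Country,
  FUN = function(x) replace(x, is.na(x), mean(x, na.rm = TRUE))
)

conflicts
```

With any of these approaches, a country whose values are all NA ends up with NaN, because the mean of an empty set of observations is NaN in R.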

dplyr sample_n by one variable through another one

I have a data frame with a "grouping" variable season and another variable year which is repeated for each month.
df <- data.frame(month = as.character(sapply(month.name,function(x)rep(x,4))),
season = c(rep("winter",8),rep("spring",12),rep("summer",12),rep("autumn",12),rep("winter",4)),
year = rep(2021:2024,12))
I would like to use dplyr::sample_n or something similar to choose 2 months in the data frame for each season and keep the same months for all the years, for example:
month season year
1 January winter 2021
2 January winter 2022
3 January winter 2023
4 January winter 2024
5 February winter 2021
6 February winter 2022
7 February winter 2023
8 February winter 2024
9 March spring 2021
10 March spring 2022
11 March spring 2023
12 March spring 2024
13 May spring 2021
14 May spring 2022
15 May spring 2023
16 May spring 2024
17 June summer 2021
18 June summer 2022
19 June summer 2023
20 June summer 2024
21 July summer 2021
22 July summer 2022
23 July summer 2023
24 July summer 2024
25 October autumn 2021
26 October autumn 2022
27 October autumn 2023
28 October autumn 2024
29 November autumn 2021
30 November autumn 2022
31 November autumn 2023
32 November autumn 2024
I cannot use df %>% group_by(season,year) %>% sample_n(2), since it chooses different months for each year.
Thanks!

We can randomly sample 2 values from month and filter them by group.
library(dplyr)
df %>%
group_by(season) %>%
filter(month %in% sample(unique(month),2))
# month season year
# <chr> <chr> <int>
# 1 January winter 2021
# 2 January winter 2022
# 3 January winter 2023
# 4 January winter 2024
# 5 February winter 2021
# 6 February winter 2022
# 7 February winter 2023
# 8 February winter 2024
# 9 March spring 2021
#10 March spring 2022
# … with 22 more rows
If certain groups have fewer than 2 unique values, we can sample the minimum of 2 and the number of unique values in the group.
df %>%
group_by(season) %>%
filter(month %in% sample(unique(month),min(2, n_distinct(month))))
Using the same logic with base R, we can use ave
df[as.logical(with(df, ave(month, season,
FUN = function(x) x %in% sample(unique(x),2)))), ]

An option using slice
library(dplyr)
df %>%
group_by(season) %>%
slice(which(!is.na(match(month, sample(unique(month), 2)))))
# A tibble: 32 x 3
# Groups: season [4]
# month season year
# <fct> <fct> <int>
# 1 October autumn 2021
# 2 October autumn 2022
# 3 October autumn 2023
# 4 October autumn 2024
# 5 November autumn 2021
# 6 November autumn 2022
# 7 November autumn 2023
# 8 November autumn 2024
# 9 April spring 2021
#10 April spring 2022
# … with 22 more rows
Or using base R
by(df, df$season, FUN = function(x) subset(x, month %in% sample(unique(month), 2 )))
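The two stages (draw the months once per season, then keep every year for those months) can also be made explicit with a lookup table. This is a sketch assuming dplyr 1.0.0 or later for slice_sample; the names picked and result and the seed are arbitrary choices for illustration:

```r
library(dplyr)

set.seed(1)  # arbitrary seed, only so the draw is repeatable

df <- data.frame(
  month = rep(month.name, each = 4),
  season = c(rep("winter", 8), rep("spring", 12), rep("summer", 12),
             rep("autumn", 12), rep("winter", 4)),
  year = rep(2021:2024, 12)
)

# Stage 1: one row per (season, month) pair, then draw 2 months per season
picked <- df %>%
  distinct(season, month) %>%
  group_by(season) %>%
  slice_sample(n = 2) %>%
  ungroup()

# Stage 2: keep all rows (all years) of df whose (season, month) pair was drawn
result <- semi_join(df, picked, by = c("season", "month"))

result
```

Keeping picked as its own table makes it easy to inspect or reuse the drawn months, for example to apply the same selection to another data frame.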

How to add missing months to a data frame?

I have a dataset with three observations: January, February, and March. I would like to add the remaining months as observations of zero to the same datatable, but I'm having trouble appending these.
Here's my current code:
library(dplyr)
Period <- c("January 2015", "February 2015", "March 2015",
"January 2016", "February 2016", "March 2016",
"January 2017", "February 2017", "March 2017",
"January 2018", "February 2018", "March 2018")
Month <- c("January", "February", "March",
"January", "February", "March",
"January", "February", "March",
"January", "February", "March")
Dollars <- c(936, 753, 731,
667, 643, 588,
948, 894, 997,
774,745, 684)
dat <- data.frame(Period = Period, Month = Month, Dollars = Dollars)
dat2 <- dat %>%
dplyr::select(Month, Dollars) %>%
dplyr::group_by(Month) %>%
dplyr::summarise(AvgDollars = mean(Dollars))
Any ideas for populating April through December in the dataset are greatly appreciated. Thanks in advance!

Here's the way to do it using complete in one step:
library(tidyverse)
Then use complete:
dat2 <- data.frame(Period = Period, Month = Month, Dollars = Dollars) %>%
# make a "year" variable
mutate(Year = word(Period, 2,2)) %>%
# remove period variable (we'll add it in later)
select(-Period) %>%
# month.name is a built-in constant listing all month names (thanks @Gregor).
# nesting by "Year" lets complete know you only want the years listed in your dataset.
complete(Month = month.name, nesting(Year), fill = list(Dollars = 0)) %>%
# Arrange by Year, then by calendar order of Month (arrange(Year, Month) would sort the months alphabetically)
arrange(Year, match(Month, month.name)) %>%
#remake the "period" variable
mutate(Period = paste(Month, Year)) %>%
group_by(Month) %>%
summarise(AvgDollars = mean(Dollars))

Here is a two-step solution:
library(dplyr)
Sys.setlocale("LC_TIME", "English") # so that %B yields English month names; the locale name is platform-dependent
# first, build a data frame with each month from January 2015 to December 2018;
# the periods are stored in their own vector so that Month is derived from them
# and not from the shorter Period vector that already exists in the workspace
periods <- format(seq(as.Date("2015/1/1"),
as.Date("2018/12/1"), by = "month"),
format = "%B %Y")
dat2 <- data.frame(Period = periods,
Month = substr(periods, 1, nchar(periods) - 5))
# then, join dat onto dat2, which keeps dat2's calendar order
dat2 %>%
left_join(select(dat, Period, Dollars), by = "Period")
Period Month Dollars
1 January 2015 January 936
2 February 2015 February 753
3 March 2015 March 731
4 April 2015 April NA
5 May 2015 May NA
6 June 2015 June NA
7 July 2015 July NA
8 August 2015 August NA
9 September 2015 September NA
10 October 2015 October NA
11 November 2015 November NA
12 December 2015 December NA
13 January 2016 January 667
14 February 2016 February 643
15 March 2016 March 588
16 April 2016 April NA
17 May 2016 May NA
18 June 2016 June NA
19 July 2016 July NA
20 August 2016 August NA
21 September 2016 September NA
22 October 2016 October NA
23 November 2016 November NA
24 December 2016 December NA
25 January 2017 January 948
26 February 2017 February 894
27 March 2017 March 997
28 April 2017 April NA
29 May 2017 May NA
30 June 2017 June NA
31 July 2017 July NA
32 August 2017 August NA
33 September 2017 September NA
34 October 2017 October NA
35 November 2017 November NA
36 December 2017 December NA
37 January 2018 January 774
38 February 2018 February 745
39 March 2018 March 684
40 April 2018 April NA
41 May 2018 May NA
42 June 2018 June NA
43 July 2018 July NA
44 August 2018 August NA
45 September 2018 September NA
46 October 2018 October NA
47 November 2018 November NA
48 December 2018 December NA
The added months have NA in Dollars; replace those with 0 afterwards if zeros are needed.

Maybe there's a more graceful solution with dplyr, but here is a quick solution without much typing:
dat <- rbind(data.frame(Period = Period, Month = Month, Dollars = Dollars),
data.frame(Period = c(sapply(2015:2018, function(x) format(ISOdate(x,4:12,1),"%B %Y"))),
Month = c(sapply(2015:2018, function(x) format(ISOdate(x,4:12,1),"%B"))),
Dollars = 0))
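All three answers follow the same idea: build the full set of year/month combinations, attach the observed values, and fill the gaps. A package-free sketch of that idea, using hypothetical toy data with a separate Year column (the names grid and full are made up for illustration), could look like this:

```r
# Hypothetical toy data: only a few months observed per year
dat <- data.frame(
  Year = c(2015, 2015, 2016),
  Month = c("January", "February", "January"),
  Dollars = c(936, 753, 667)
)

# Every month for every year that occurs in the data
grid <- expand.grid(Month = month.name, Year = unique(dat$Year),
                    stringsAsFactors = FALSE)

# Left join: keep all grid rows, attach Dollars where observed
full <- merge(grid, dat, by = c("Year", "Month"), all.x = TRUE)

# Months without an observation get zero, as the question asks
full$Dollars[is.na(full$Dollars)] <- 0

# Sort by year, then by calendar order of the month
full <- full[order(full$Year, match(full$Month, month.name)), ]

full
```

Whether zeros or NAs are the right filler depends on the later step: averaging by month with zeros included pulls the averages down, whereas NAs can be skipped with na.rm = TRUE.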
