How to create quarterly data in R

I have daily scores and their corresponding dates, as seen below, and I am currently struggling to convert them to quarterly data. On top of that, the years are not in chronological order, and I am confused about how to deal with these two issues. See the sample data below.
dates score
1 July 1, 2019 Monday 8
2 October 25, 2015 Sunday -3
3 June 17, 2020 Wednesday -5
4 January 17, 2018 Wednesday -1
5 April 15, 2019 Monday 6
6 October 30, 2019 Wednesday 10
7 March 6, 2017 Monday -2
8 November 19, 2018 Monday 3
9 June 11, 2020 Thursday 5
10 October 11, 2017 Wednesday -13
11 December 3, 2017 Sunday -8
12 November 14, 2018 Wednesday -6
13 August 22, 2017 Tuesday 8
14 December 13, 2017 Wednesday 5
15 January 22, 2016 Friday 5
dates <- sapply(date, function(x)
  trimws(grep(paste(month.name, collapse = '|'), x, value = TRUE)))
sort(as.Date(dates, '%B %d, %Y %A'))

This is a job for lubridate. You can parse your date column with lubridate::parse_date_time() and extract the quarter they fall in with lubridate::quarter():
library("tibble")
library("dplyr")
library("lubridate")
tbl <- tribble(~date, ~score,
"July 1, 2019 Monday", 8,
"October 25, 2015 Sunday", -3,
"June 17, 2020 Wednesday", -5,
"January 17, 2018 Wednesday", -1,
"April 15, 2019 Monday", 6,
"October 30, 2019 Wednesday", 10,
"March 6, 2017 Monday", -2,
"November 19, 2018 Monday", 3,
"June 11, 2020 Thursday", 5,
"October 11, 2017 Wednesday", -13,
"December 3, 2017 Sunday", -8,
"November 14, 2018 Wednesday", -6,
"August 22, 2017 Tuesday", 8,
"December 13, 2017 Wednesday", 5,
"January 22, 2016 Friday", 5)
tbl %>%
  mutate(date = parse_date_time(date, "B d, Y")) %>%
  mutate(quarter = quarter(date, with_year = TRUE))
#> # A tibble: 15 x 3
#> date score quarter
#> <dttm> <dbl> <dbl>
#> 1 2019-07-01 00:00:00 8 2019.3
#> 2 2015-10-25 00:00:00 -3 2015.4
#> 3 2020-06-17 00:00:00 -5 2020.2
#> 4 2018-01-17 00:00:00 -1 2018.1
#> 5 2019-04-15 00:00:00 6 2019.2
#> 6 2019-10-30 00:00:00 10 2019.4
#> 7 2017-03-06 00:00:00 -2 2017.1
#> 8 2018-11-19 00:00:00 3 2018.4
#> 9 2020-06-11 00:00:00 5 2020.2
#> 10 2017-10-11 00:00:00 -13 2017.4
#> 11 2017-12-03 00:00:00 -8 2017.4
#> 12 2018-11-14 00:00:00 -6 2018.4
#> 13 2017-08-22 00:00:00 8 2017.3
#> 14 2017-12-13 00:00:00 5 2017.4
#> 15 2016-01-22 00:00:00 5 2016.1

If you are trying to change the dates column to the Date class, you can use as.Date:
df$new_date <- as.Date(trimws(df$dates), '%B %d, %Y')
This should also work with lubridate's mdy():
df$new_date <- lubridate::mdy(df$dates)

Once the data has been converted to date values per Ronak Shah's answer, we can use lubridate::quarter() to generate year and quarter values.
textData <- " dates|score
July 1, 2019 Monday| 8
October 25, 2015 Sunday| -3
June 17, 2020 Wednesday| -5
January 17, 2018 Wednesday| -1
April 15, 2019 Monday| 6
October 30, 2019 Wednesday| 10
March 6, 2017 Monday| -2
November 19, 2018 Monday| 3
June 11, 2020 Thursday| 5
October 11, 2017 Wednesday| -13
December 3, 2017 Sunday| -8
November 14, 2018 Wednesday| -6
August 22, 2017 Tuesday| 8
December 13, 2017 Wednesday| 5
January 22, 2016 Friday| 5
"
df <- read.csv(text = textData,
               header = TRUE,
               sep = "|")
library(lubridate)
df$dt_quarter <- quarter(mdy(df$dates),
                         with_year = TRUE,
                         fiscal_start = 1)
head(df)
We include the with_year = TRUE and fiscal_start = 1 arguments to illustrate that one can include or exclude the year information, as well as change the start month of the year from the default of 1.
...and the output:
> head(df)
dates score dt_quarter
1 July 1, 2019 Monday 8 2019.3
2 October 25, 2015 Sunday -3 2015.4
3 June 17, 2020 Wednesday -5 2020.2
4 January 17, 2018 Wednesday -1 2018.1
5 April 15, 2019 Monday 6 2019.2
6 October 30, 2019 Wednesday 10 2019.4
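To make the fiscal_start behavior concrete, here is a small sketch (my addition, not part of the original answer): moving the fiscal-year start to July changes which quarter an October date lands in.

```r
library(lubridate)

d <- as.Date("2015-10-25")
# default calendar year: October is in Q4
quarter(d, with_year = TRUE)  # 2015.4
# with a July fiscal-year start, October is the 4th month of the
# fiscal year, so it falls in fiscal Q2
quarter(d, with_year = TRUE, fiscal_start = 7)
```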

The yearqtr class represents a year and quarter as the year plus 0 for Q1, 0.25 for Q2, 0.5 for Q3 and 0.75 for Q4. If date is defined as a yearqtr object as below then as.integer(date) is the year and cycle(date) is the quarter: 1, 2, 3 or 4. Note that junk at the end of the date field is ignored by as.yearqtr so we only need to specify month, day and year percent codes.
If you want a Date object instead of a yearqtr object then uncomment one of the commented out lines.
data is defined reproducibly in the Note at the end. (In the future please use dput to display your input data to prevent ambiguity as discussed in the information at the top of the r tag page.)
library(zoo)
date <- as.yearqtr(data$date, "%B %d, %Y")
# uncomment one of these lines if you want a Date object instead of yearqtr object
# date <- as.Date(date) # first day of quarter
# date <- as.Date(date, frac = 1) # last day of quarter
data.frame(date, score = data$score)[order(date), ]
giving the following sorted data frame assuming that we do not uncomment any of the commented out lines above.
date score
2 2015 Q4 -3
15 2016 Q1 5
7 2017 Q1 -2
...snip...
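To make the claim above about as.integer() and cycle() concrete, here is a minimal sketch (my addition):

```r
library(zoo)

# trailing weekday text after the date is ignored by the format
yq <- as.yearqtr("July 1, 2019 Monday", "%B %d, %Y")
yq              # 2019 Q3
as.integer(yq)  # 2019, the year component
cycle(yq)       # 3, the quarter component
```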
Time series
If this is supposed to be a time series with a single aggregated score per quarter then we can get a zoo series like this where data is the original data defined in the Note below.
library(zoo)
to_yq <- function(x) as.yearqtr(x, "%B %d, %Y")
z <- read.zoo(data, FUN = to_yq, aggregate = "mean")
z
## 2015 Q4 2016 Q1 2017 Q1 2017 Q3 2017 Q4 2018 Q1 2018 Q4 2019 Q2
## -3.000000 5.000000 -2.000000 8.000000 -5.333333 -1.000000 -1.500000 6.000000
## 2019 Q3 2019 Q4 2020 Q2
## 8.000000 10.000000 0.000000
or as a ts object like this:
as.ts(z)
## Qtr1 Qtr2 Qtr3 Qtr4
## 2015 -3.000000
## 2016 5.000000 NA NA NA
## 2017 -2.000000 NA 8.000000 -5.333333
## 2018 -1.000000 NA NA -1.500000
## 2019 NA 6.000000 8.000000 10.000000
## 2020 NA 0.000000
Note
The input data in reproducible form:
data <- structure(list(dates = c("July 1, 2019 Monday", "October 25, 2015 Sunday",
"June 17, 2020 Wednesday", "January 17, 2018 Wednesday", "April 15, 2019 Monday",
"October 30, 2019 Wednesday", "March 6, 2017 Monday", "November 19, 2018 Monday",
"June 11, 2020 Thursday", "October 11, 2017 Wednesday", "December 3, 2017 Sunday",
"November 14, 2018 Wednesday", "August 22, 2017 Tuesday", "December 13, 2017 Wednesday",
"January 22, 2016 Friday"), score = c(8L, -3L, -5L, -1L, 6L,
10L, -2L, 3L, 5L, -13L, -8L, -6L, 8L, 5L, 5L)), class = "data.frame", row.names = c(NA,
-15L))

Related

Identifying weeks from complex date format [duplicate]

I have a pretty annoying format for the dates in a data frame. Here is a sample:
"Jan 1, 2020, 8:36:55 PM" "Jan 7, 2020, 12:00:00 PM" "Jan 9, 2020, 8:24:55 PM"
The first thing I had to do was to filter it by year. I ended up just using grep(), since there is no other context in which 2020 appears, but this isn't an elegant solution. I hope the answer to my current problem can help with this, too.
Anyway, I now want to identify the weeks. I want to take the sum of each cell of a different column by week. However, I don't even know how to turn that character string into some sort of date...
Just to give you a sample of my data, it would be this (already filtered for 2020):
Activity.Date Moving.Time
1 Jan 1, 2020, 8:36:55 PM 3581
2 Jan 7, 2020, 12:00:00 PM 1200
3 Jan 9, 2020, 8:24:55 PM 970
4 Jan 12, 2020, 7:51:30 PM 5564
5 Feb 4, 2020, 9:20:21 AM 1350
6 Feb 5, 2020, 9:20:00 AM 2400
7 Feb 6, 2020, 9:15:00 AM 2415
8 Feb 16, 2020, 11:55:51 AM 1836
9 Feb 17, 2020, 8:36:47 PM 511
10 Feb 25, 2020, 7:30:00 PM 928
11 Mar 4, 2020, 7:41:02 PM 558
12 Mar 6, 2020, 8:25:27 PM 2637
13 Mar 9, 2020, 8:37:11 PM 577
14 Mar 11, 2020, 7:46:10 PM 523
15 Mar 11, 2020, 10:00:25 PM 1278
16 Mar 12, 2020, 12:34:41 AM 442
17 Mar 13, 2020, 8:26:55 PM 2410
18 Mar 16, 2020, 8:25:22 PM 609
19 Sep 12, 2020, 7:27:26 PM 1884
20 Sep 15, 2020, 7:46:27 PM 1783
21 Sep 17, 2020, 8:41:19 PM 1838
22 Sep 19, 2020, 12:08:56 PM 1995
23 Sep 22, 2020, 7:29:01 PM 1776
24 Sep 24, 2020, 7:08:35 PM 1972
25 Sep 26, 2020, 7:24:52 PM 4032
26 Oct 3, 2020, 7:27:22 PM 4172
27 Oct 7, 2020, 8:00:41 PM 2987
28 Oct 8, 2020, 6:57:21 PM 2319
29 Oct 10, 2020, 7:23:39 PM 2509
30 Oct 12, 2020, 6:54:36 PM 5711
31 Oct 13, 2020, 7:56:59 PM 1764
32 Oct 14, 2020, 7:18:06 PM 4822
33 Oct 15, 2020, 8:09:31 PM 1863
34 Oct 17, 2020, 7:50:45 PM 5086
35 Oct 20, 2020, 7:58:39 PM 1583
36 Oct 21, 2020, 8:16:10 PM 4978
37 Oct 22, 2020, 7:23:26 PM 1940
38 Oct 22, 2020, 8:18:24 PM 1857
EDIT: I also need the number of rows that were summed in a third column, if possible...
You can use as.POSIXct.
x <- c("Jan 1, 2020, 8:36:55 PM", "Jan 7, 2020, 12:00:00 PM", "Jan 9, 2020, 8:24:55 PM")
as.POSIXct(x, format = '%b %d, %Y, %I:%M:%S %p', tz = 'UTC')
#[1] "2020-01-01 20:36:55 UTC" "2020-01-07 12:00:00 UTC" "2020-01-09 20:24:55 UTC"
The formats are mentioned in ?strptime.
If this is difficult to remember, you can use mdy_hms() from lubridate.
lubridate::mdy_hms(x)
Once you do that you can extract the week information and sum Moving.Time in each week.
library(dplyr)
library(lubridate)
df %>%
  mutate(Activity.Date = mdy_hms(Activity.Date)) %>%
  group_by(Week = week(Activity.Date)) %>%
  summarise(Moving.Time = sum(Moving.Time))
Convert your activity.date column to a date/time object with this:
activitydate <- as.POSIXct("Jan 1, 2020, 8:36:55 PM", format = "%b %d, %Y, %r")
Then to identify the week number:
format(activitydate, "%V") # or %U
See help for strptime for more information.
Update
To answer your second question about providing the number of rows: this is easily done with the dplyr library.
df$Activity.Date <- as.POSIXct(df$Activity.Date, format="%b %d, %Y, %r")
df$week <- format(df$Activity.Date, "%V")
library(dplyr)
df %>% group_by(week) %>% summarize(count = n(), sum = sum(Moving.Time))
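One caveat worth noting (my addition): lubridate::week() and format's %V do not count weeks the same way. week() counts 7-day blocks from January 1, while %V follows ISO-8601, as does lubridate::isoweek(), so the two answers above can assign the same date to different weeks.

```r
library(lubridate)

d <- as.Date("2021-01-01")
week(d)     # 1: the first 7-day block counted from January 1
isoweek(d)  # 53: ISO-8601 assigns Jan 1, 2021 to week 53 of 2020,
            # matching format(d, "%V") on platforms that support %V
```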

Is there an R function or piece of code to identify when two columns in a row are duplicates?

I have a dataframe in R with roughly 1,700 observations. I intersected GPS points with polygons, and want to determine if multiple IDs enter the same polygon in the same 12-hour period (6pm to 6am). Here is the head of my dataframe.
ID date time DOP datetime p pid1 Long Lat
289 Friday, September 1, 2017 1:15:29 AM 4.2 2017-09-01 01:15:29 <NA> 2 763692.8 3617676
289 Friday, September 1, 2017 4:15:15 AM 1.4 2017-09-01 04:15:15 <NA> 2 763674.5 3617692
299 Friday, September 1, 2017 5:00:16 AM 3.6 2017-09-01 05:00:16 <NA> 2 764427.2 3616750
13 Friday, September 1, 2017 5:15:25 AM 2.8 2017-09-01 05:15:25 <NA> 1 767800.5 3613057
299 Friday, September 1, 2017 5:15:29 AM 1.6 2017-09-01 05:15:29 <NA> 2 764420.7 3616746
299 Friday, September 1, 2017 5:30:08 AM 1.4 2017-09-01 05:30:08 <NA> 2 764420.7 3616747
You can see that for Friday, September 1st, 2017, both ID numbers 289 and 299 were within PID1 #2 (PID1 #2 refers to polygon #2) at one point (roughly 45 minutes apart). I'd like some function or script to run through my dataset and identify instances where this occurs. That way I can identify which IDs are in which PID1 during specific times (within the 12-hour window), to ultimately have a dataset that shows how many times multiple IDs interact within a specific polygon.
Here is a sample dataset using dput for the first 5 lines of my dataset:
structure(list(X = c("388933", "387022", "507722", "941954",
"506441"), ID = structure(c(12L, 12L, 15L, 1L, 15L), .Label = c("13",
"17", "97", "100", "253", "255", "256", "259", "263", "272",
"281", "289", "294", "297", "299", "329", "337", "339", "344",
"347"), class = "factor"), date = c("Friday, September 1, 2017",
"Friday, September 1, 2017", "Friday, September 1, 2017", "Friday, September 1, 2017",
"Friday, September 1, 2017"), time = c("1:15:29 AM", "4:15:15 AM",
"5:00:16 AM", "5:15:25 AM", "5:15:29 AM"), DOP = c(4.2, 1.4,
3.6, 2.8, 1.6), datetime = structure(c(1504246529, 1504257315,
1504260016, 1504260925, 1504260929), class = c("POSIXct", "POSIXt"
), tzone = "CST6CDT"), p = c(NA_character_, NA_character_, NA_character_,
NA_character_, NA_character_), pid1 = c("2", "2", "2", "1", "2"
), Long = c(763692.811797531, 763674.546077539, 764427.163679506,
767800.455784065, 764420.684442097), Lat = c(3617675.85664874,
3617692.02070415, 3616749.72487458, 3613057.33334349, 3616746.22303673
)), row.names = c("224811", "223697", "277383", "525686", "276768"
), class = "data.frame")
EDIT: I am editing this to show how I eventually made this work.
uni <- unique(df[,c("ID","date", "pid1")])
df2 <- aggregate(ID~pid1+date, data= uni,length)
This was able to create a dataframe with the number of unique IDs per pid1 per day.
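The same per-day unique-ID count can be sketched with dplyr (my addition, assuming the df from the dput above):

```r
library(dplyr)

# one row per distinct ID/date/polygon combination, then count
# distinct IDs within each polygon-day
df %>%
  distinct(ID, date, pid1) %>%
  count(pid1, date, name = "n_ids")
```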
Thank you
Major EDIT
Give it a try now on your real data. I think I was letting the fact that ID was a factor get in the way. If we agree this is giving you the right ideas, we can close out the basics and perhaps start a new question on your 12-hour blocks.
library(dplyr)
library(tidyr)
set.seed(2020)
ID <- sample(10:200, replace = TRUE, size = 1000)
PID <- sample(1:31, replace = TRUE, size = 1000)
Date <- sample(c("Friday, September 1, 2017",
"Saturday, September 2, 2017",
"Sunday, September 3, 2017",
"Monday, September 4, 2017",
"Tuesday, September 5, 2017",
"Wednesday, September 6, 2017",
"Thursday, September 7, 2017",
"Friday, September 8, 2017"),
replace = TRUE,
size = 1000)
play <- data.frame(ID, Date, PID)
play$ID <- factor(play$ID)
ids_by_pid <- play %>%
  mutate(ID = as.integer(as.character(ID))) %>%
  arrange(Date, PID, ID) %>%
  tidyr::pivot_wider(id_cols = Date,
                     values_from = ID,
                     names_from = PID,
                     names_prefix = "pid",
                     names_sort = TRUE,
                     values_fn = list)
Here's the code I use to compare IDs in a PID for a particular day
play %>%
  filter(Date == "Saturday, September 2, 2017", PID == "1") %>%
  pull(ID)
#> [1] 40 30 89 133 36
#> 189 Levels: 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 ... 200
ids_by_pid %>%
  filter(Date == "Saturday, September 2, 2017") %>%
  select(pid1) %>%
  pull() %>%
  unlist()
#> [1] 30 36 40 89 133

Adding missing dates in time series data [duplicate]

I have data with random dates from 2008 to 2020 and their corresponding values:
Date Val
September 16, 2012 32
September 19, 2014 33
January 05, 2008 26
June 07, 2017 02
December 15, 2019 03
May 28, 2020 18
I want to fill in the missing dates from January 01, 2008 to March 31, 2020, with a corresponding value of 1.
I referred to some posts like Post1 and Post2, but I was not able to solve the problem based on them. I am a beginner in R.
I am looking for data like this
Date Val
January 01, 2008 1
January 02, 2008 1
January 03, 2008 1
January 04, 2008 1
January 05, 2008 26
........
Use tidyr::complete():
library(dplyr)
df %>%
  mutate(Date = as.Date(Date, "%B %d, %Y")) %>%
  tidyr::complete(Date = seq(as.Date('2008-01-01'), as.Date('2020-03-31'),
                             by = 'day'),
                  fill = list(Val = 1)) %>%
  mutate(Date = format(Date, "%B %d, %Y"))
# A tibble: 4,475 x 2
# Date Val
# <chr> <dbl>
# 1 January 01, 2008 1
# 2 January 02, 2008 1
# 3 January 03, 2008 1
# 4 January 04, 2008 1
# 5 January 05, 2008 26
# 6 January 06, 2008 1
# 7 January 07, 2008 1
# 8 January 08, 2008 1
# 9 January 09, 2008 1
#10 January 10, 2008 1
# … with 4,465 more rows
data
df <- structure(list(Date = c("September 16, 2012", "September 19, 2014",
"January 05, 2008", "June 07, 2017", "December 15, 2019", "May 28, 2020"
), Val = c(32L, 33L, 26L, 2L, 3L, 18L)), class = "data.frame",
row.names = c(NA, -6L))
We can create a data frame with the desired date range, join our data frame onto it, and replace all NAs with 1:
library(tidyverse)
days_seq %>%
  left_join(df) %>%
  mutate(Val = if_else(is.na(Val), as.integer(1), Val))
Joining, by = "Date"
# A tibble: 4,474 x 2
Date Val
<date> <int>
1 2008-01-01 1
2 2008-01-02 1
3 2008-01-03 1
4 2008-01-04 1
5 2008-01-05 33
6 2008-01-06 1
7 2008-01-07 1
8 2008-01-08 1
9 2008-01-09 1
10 2008-01-10 1
# ... with 4,464 more rows
Data
days_seq <- tibble(Date = seq(as.Date("2008/01/01"), as.Date("2020/03/31"), "days"))
df <- tibble::tribble(
~Date, ~Val,
"2012/09/16", 32L,
"2012/09/19", 33L,
"2008/01/05", 33L
)
df$Date <- as.Date(df$Date)
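The is.na()/if_else() step can also be written with tidyr::replace_na(), which avoids the as.integer(1) coercion (a sketch of an alternative, not the answer author's code, assuming days_seq and df as defined above):

```r
library(dplyr)
library(tidyr)

days_seq %>%
  left_join(df, by = "Date") %>%
  mutate(Val = replace_na(Val, 1L))
```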

How to Create Table from text in R?

In R, what would be the best way to separate the following data into a table with 2 columns?
March 09, 2018
0.084752
March 10, 2018
0.084622
March 11, 2018
0.084622
March 12, 2018
0.084437
March 13, 2018
0.084785
March 14, 2018
0.084901
I considered using a for loop but was advised against it. I do not know how to parse things very well, so if the best method involves parsing, please be as clear as possible.
The final table should look something like this:
https://i.stack.imgur.com/u5hII.png
Thank you!
Input:
input <- c("March 09, 2018",
"0.084752",
"March 10, 2018",
"0.084622",
"March 11, 2018",
"0.084622",
"March 12, 2018",
"0.084437",
"March 13, 2018",
"0.084785",
"March 14, 2018",
"0.084901")
Method:
library(dplyr)
library(lubridate)
df <- matrix(input, ncol = 2, byrow = TRUE) %>%
as_tibble() %>%
mutate(V1 = mdy(V1), V2 = as.numeric(V2))
Output:
df
# A tibble: 6 x 2
V1 V2
<date> <dbl>
1 2018-03-09 0.0848
2 2018-03-10 0.0846
3 2018-03-11 0.0846
4 2018-03-12 0.0844
5 2018-03-13 0.0848
6 2018-03-14 0.0849
Use names() or rename() to rename the columns.
names(df) <- c("Date", "Value")
data.table::fread can read "...a string (containing at least one \n)...."
The 'f' in fread stands for 'fast', so the code below should work on fairly large inputs as well.
require(data.table)
x = 'March 09, 2018
0.084752
March 10, 2018
0.084622
March 11, 2018
0.084622
March 12, 2018
0.084437
March 13, 2018
0.084785
March 14, 2018
0.084901'
o = fread(x, sep = '\n', header = FALSE)
o[, V1L := shift(V1, type = "lead")]  # pair each line with the following one
o[, keep := (1:.N) %% 2 != 0]         # odd rows hold the dates
z = o[(keep)]                         # keep date rows, each carrying its value in V1L
z[, keep := NULL]
z
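For this alternating date/value layout, recycled logical indexing is a simpler alternative to the shift()/keep bookkeeping (my sketch, using a shortened version of the input):

```r
library(data.table)

x <- "March 09, 2018\n0.084752\nMarch 10, 2018\n0.084622"
lines <- strsplit(x, "\n")[[1]]
dt <- data.table(date  = lines[c(TRUE, FALSE)],              # odd lines: dates
                 value = as.numeric(lines[c(FALSE, TRUE)]))  # even lines: values
dt
```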
result = data.frame(matrix(input, ncol = 2, byrow = T), stringsAsFactors = FALSE)
result
# X1 X2
# 1 March 09, 2018 0.084752
# 2 March 10, 2018 0.084622
# 3 March 11, 2018 0.084622
# 4 March 12, 2018 0.084437
# 5 March 13, 2018 0.084785
# 6 March 14, 2018 0.084901
You should next adjust the names and classes, something like this:
names(result) = c("date", "value")
result$value = as.numeric(result$value)
# etc.
Using Nik's nice input:
input = c(
"March 09, 2018",
"0.084752",
"March 10, 2018",
"0.084622",
"March 11, 2018",
"0.084622",
"March 12, 2018",
"0.084437",
"March 13, 2018",
"0.084785",
"March 14, 2018",
"0.084901"
)

R timeseries - identify missing observations (timestamps) and insert NAs to create time series of given length

I have a set of 24 grouped (hierarchical) time series supposedly running over 3 years, and I want to look at monthly sales, but it turns out that a number of them have missing observations, e.g.
getCounts(Shop1, ...)
2011-01 2011-02 2011-03 2011-04 2011-05 2011-06 2011-07 2011-08 2011-09 2011-10 2011-11 2011-12 2012-02 2012-03 2012-04 2012-05 2012-06 2012-07 2012-08 2012-09 2012-10 2012-11
10 22 10 12 36 31 25 19 7 7 7 5 1 9 9 11 10 16 25 3 2 5
is missing an observation for January 2012 and ends in November 2012 although it's supposed to run to December 2013.
getCounts uses the command
with(myDF, tapply(varName, substr(dateName, 1, 7), sum))
to get the monthly counts.
I want to replace the missing observations, both in the middle of the time series and at the end, with NAs, so that all my time series have the same number of observations and, if there are any "holes", they will be visible in a plot.
Can anybody help me do this?
Thanks!
Edit: My preferred output would be something like this:
Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2011 1 NA 2 3 4 5 6 NA 7 8 9 10
2012 2 3 4 5 6 NA NA NA NA NA NA NA
where each NA is replacing a missing observation.
Edit 2: getCounts() look like this:
getCounts <- function(dataObject, dateName, varName){
  dataNameString <- deparse(substitute(dataObject))
  countsStr <- paste0("with(", dataNameString, ", tapply(", varName,
                      ", substr(", dateName, ", 1, 7), sum))")
  counts <- eval(parse(text = countsStr))
  return(counts)
}
And here's the dput:
structure(c(10, 22, 10, 12, 36, 31, 25, 19, 7, 7, 7, 5, 1, 9,
9, 11, 10, 16, 25, 3, 2, 5), .Dim = 22L, .Dimnames = list(c("2011-01",
"2011-02", "2011-03", "2011-04", "2011-05", "2011-06", "2011-07",
"2011-08", "2011-09", "2011-10", "2011-11", "2011-12", "2012-02",
"2012-03", "2012-04", "2012-05", "2012-06", "2012-07", "2012-08",
"2012-09", "2012-10", "2012-11")))
Try this:
df <- data.frame(Year = substr(names(x), 1, 4),
                 Month = factor(month.abb[as.numeric(substr(names(x), 6, 7))],
                                levels = month.abb),
                 Value = x)
library(tidyr)
spread(df, Month, Value)
# Year Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
# 1 2011 10 22 10 12 36 31 25 19 7 7 7 5
# 2 2012 NA 1 9 9 11 10 16 25 3 2 5 NA
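spread() still works but is superseded in current tidyr; the same reshape with pivot_wider() looks like this (my sketch, on a small stand-in data frame rather than the full x above):

```r
library(tidyr)

df <- data.frame(Year  = c("2011", "2011", "2012"),
                 Month = factor(c("Jan", "Feb", "Jan"), levels = month.abb),
                 Value = c(10, 22, 1))
# pivot_wider() is the current replacement for spread()
pivot_wider(df, names_from = Month, values_from = Value)
```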
Data
x <- structure(c(10, 22, 10, 12, 36, 31, 25, 19, 7, 7, 7, 5, 1, 9,
9, 11, 10, 16, 25, 3, 2, 5), .Dim = 22L, .Dimnames = list(c("2011-01",
"2011-02", "2011-03", "2011-04", "2011-05", "2011-06", "2011-07",
"2011-08", "2011-09", "2011-10", "2011-11", "2011-12", "2012-02",
"2012-03", "2012-04", "2012-05", "2012-06", "2012-07", "2012-08",
"2012-09", "2012-10", "2012-11")))
