Month names come out in a different language in R

If I run the following code:
everyday = seq(from=as.Date('2005-1-1'), to=as.Date('2005-12-31'), by='day')
cmonth = format(everyday, '%B')
table(cmonth)
cmonth
10월 11월 12월 1월 2월 3월 4월 5월 6월 7월 8월 9월
31 30 31 31 28 31 30 31 30 31 31 30
I get the month names in Korean, but I want them in English, like this:
October November December January February March ...
How can I change that?
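One way to get English month names regardless of the session locale is to map the month number onto R's built-in month.name constant; temporarily switching the LC_TIME locale is another option. The following is a minimal sketch, assuming the "C" locale is acceptable for the session (the variable name old is only illustrative):

```r
everyday <- seq(from = as.Date("2005-01-01"), to = as.Date("2005-12-31"), by = "day")

# Option 1: locale-independent. month.name is a base R constant holding the
# English month names; the factor levels keep the months in calendar order.
cmonth <- factor(month.name[as.integer(format(everyday, "%m"))], levels = month.name)
table(cmonth)

# Option 2: switch the date/time locale, then restore the previous setting.
old <- Sys.getlocale("LC_TIME")
Sys.setlocale("LC_TIME", "C")
first_month <- format(everyday[1], "%B")   # "January" while LC_TIME is "C"
Sys.setlocale("LC_TIME", old)
```

Option 1 also avoids the alphabetical ordering that table() applies to plain character vectors.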

Related

Get last day of month from YYYYMM (year/month) variable in R [duplicate]

I would like to get the last day of the month from a year/month variable that is formatted as an integer YYYYMM.
yearmonth <- c(seq(202001,202012),
seq(202101,202112))
The output I am looking for is below. For instance, the last day of Feb/2020 was 29 (2020 was a leap year) whereas the last day of Feb/2021 was 28.
last <- c(31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31,
31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)
Ideally I would like to use the lubridate package.
Base R:
Try using:
as.integer(format(as.Date(cut(as.Date(paste0(yearmonth, "01"), format = "%Y%m%d") + 32, "month")) - 1, "%d"))
Output:
31 29 31 30 31 30 31 31 30 31 30 31 31 28 31 30 31 30 31 31 30 31 30 31
Explanation:
This adds 32 days to each date, which always lands in the following month, and uses the cut function to snap the result to the first day of that month. Subtracting 1 from those dates then gives the last day of the original month.
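The steps can be traced on a single date. This is only an illustrative sketch; the intermediate variable names are not part of the answer:

```r
d <- as.Date("2020-02-01")                     # first day of the month of interest
shifted <- d + 32                              # always falls in the following month
first_next <- as.Date(cut(shifted, "month"))   # first day of that following month
last_day <- first_next - 1                     # last day of the original month
format(last_day, "%d")
```

Because no month has more than 31 days, adding 32 to the first of a month can never skip past the following month.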
Update:
Please note akrun's comment: we can use the truncated argument of the ymd() function to declare the number of formats that can be truncated (here 1, for the missing day):
days_in_month(ymd(yearmonth, truncated = 1))
First answer:
Here is a lubridate solution:
Construct the date elements (year, month and day).
Use make_date() to get an object of class Date.
Then use the days_in_month() function from lubridate.
library(lubridate)
my_year <- substr(yearmonth,1,4)
my_month <- as.integer(substr(yearmonth,5,6))
my_day <- rep(1, length(my_year))
days_in_month(make_date(my_year, my_month, my_day))
# you can wrap the call in `unname()` to get a vector without names
Output:
Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec Jan Feb Mar Apr May Jun Jul
31 29 31 30 31 30 31 31 30 31 30 31 31 28 31 30 31 30 31
Aug Sep Oct Nov Dec
31 30 31 30 31
# without names:
unname(days_in_month(make_date(my_year, my_month, my_day)))
[1] 31 29 31 30 31 30 31 31 30 31 30 31 31 28 31 30 31 30 31 31 30 31 30 31
Using lubridate, I add one month to the first day of each provided month, and then I subtract one day. This date should be the last day of your months.
yearmonth <- c(seq(202001,202012),
seq(202101,202112))
yearmonthday <- as.Date(paste0(yearmonth, "01"), format = "%Y%m%d")
library(lubridate)
last <- as.numeric(format(yearmonthday + months(1) - days(1), format = "%d"))
last
Using zoo's as.yearmon -
yearmonth |>
as.character() |>
zoo::as.yearmon('%Y%m') |>
as.Date(frac = 1) |>
format('%d')
#[1] "31" "29" "31" "30" "31" "30" "31" "31" "30" "31" "30" "31" "31" "28"
#[15] "31" "30" "31" "30" "31" "31" "30" "31" "30" "31"
Using lubridate's ceiling_date function.
library(lubridate)
format(ceiling_date(ymd(paste0(yearmonth, '01')), 'month') - 1, '%d')
Using lubridate:
library(lubridate)
day(ceiling_date(as.Date(paste0(yearmonth,'01'), format = '%Y%m%d'), unit = 'month') - 1)
[1] 31 29 31 30 31 30 31 31 30 31 30 31 31 28 31 30 31 30 31 31 30 31 30 31

Converting Month character to date for time series without "0" before Month

How do I convert this data set into a time series format in R? Let's call the data set Bob. This is what it looks like:
1/2013 25
2/2013 865
3/2013 26
4/2013 33
5/2013 74
6/2013 24
Are you looking for something like this?
> dat <- read.table(text = "1/2013 25
2/2013 865
3/2013 26
4/2013 33
5/2013 74
6/2013 24
", header=FALSE) # your data
> ts(dat$V2, start=c(2013, 1), frequency = 12) # time series object
Jan Feb Mar Apr May Jun
2013 25 865 26 33 74 24
Assuming that your starting point is the data frame DF defined reproducibly in the Note at the end, this converts it to a zoo series z as well as a ts series tt.
library(zoo)
z <- read.zoo(DF, FUN = as.yearmon, format = "%m/%Y")
tt <- as.ts(z)
z
## Jan 2013 Feb 2013 Mar 2013 Apr 2013 May 2013 Jun 2013
## 25 865 26 33 74 24
tt
## Jan Feb Mar Apr May Jun
## 2013 25 865 26 33 74 24
Note
Lines <- "1/2013 25
2/2013 865
3/2013 26
4/2013 33
5/2013 74
6/2013 24"
DF <- read.table(text = Lines)

Number of days in each month between two dates

diff(seq(as.Date("2016-12-21"), as.Date("2017-04-05"), by="month"))
Time differences in days
[1] 31 31 28
The code above gives day counts for Dec, Jan and Feb only.
However, my requirement is as follows:
# Results that I need:
# days per month from 2016-12-21 to 2017-04-05
11, 31, 28, 31, 5
# i.e. 11 days of Dec, 31 of Jan, 28 of Feb, 31 of Mar and 5 days of Apr.
I also tried days_in_month() from lubridate, but could not get this result:
library(lubridate)
days_in_month(c(as.Date("2016-12-21"), as.Date("2017-04-05")))
Dec Apr
31 30
Try this:
x = rle(format(seq(as.Date("2016-12-21"), as.Date("2017-04-05"), by=1), '%b'))
> setNames(x$lengths, x$values)
# Dec Jan Feb Mar Apr
# 11 31 28 31 5
Although we have seen a clever replacement of table by rle and a pure table solution, I want to add two approaches using grouping. All approaches have in common that they create a sequence of days between the two given dates and aggregate by month but in different ways.
aggregate()
This one uses base R:
# create sequence of days
days <- seq(as.Date("2016-12-21"), as.Date("2017-04-05"), by = 1)
# aggregate by month
aggregate(days, list(month = format(days, "%b")), length)
# month x
#1 Apr 5
#2 Dez 11
#3 Feb 28
#4 Jan 31
#5 Mrz 31
Unfortunately, the months are ordered alphabetically, as with the simple table() approach. (The abbreviations Dez and Mrz come from the German locale of this session.) In these situations, I prefer the ISO 8601 way of naming the months unambiguously:
aggregate(days, list(month = format(days, "%Y-%m")), length)
# month x
#1 2016-12 11
#2 2017-01 31
#3 2017-02 28
#4 2017-03 31
#5 2017-04 5
data.table
Now that I've got used to the data.table syntax, this is my preferred approach:
library(data.table)
data.table(days)[, .N, .(month = format(days, "%b"))]
# month N
#1: Dez 11
#2: Jan 31
#3: Feb 28
#4: Mrz 31
#5: Apr 5
The order of months is kept as they have appeared in the input vector.

Why does R give me a time series of row numbers instead of values?

I load a CSV with a single column of values in the order I need them. I'm trying to make it a time series. Instead of giving me the values I've entered, R gives me the row number.
Here is what I get. If someone could help me understand what's happening and how I can get the time series of my values, I would be grateful.
Shown below are:
the actual list of values in the object "values";
the time series from a plain ts() call on values;
the time series when I provide a start date and frequency.
> values
[1] 9976955.44 9362712.43 10012331.62 10068304.8 10532572.67 10195531.47 10324432.96 11208386.78
[9] 10700973.87 11068831.1 10176578.68 10188604.94 11380302.06 10204762.87 10668741.18 10897544.85
[17] 11521619.21 10323947.98 10778145.47 10454028.37 10455870.06 10382488.99 9987219.4 10260642.81
[25] 10848819.19 9732347.5 10203843.16 9869125.29 7542383.87 8569148.28 9890259.72 9440525.82
[33] 9361047.7 9715566.45 8409379.61
35 Levels: 10012331.62 10068304.8 10176578.68 10188604.94 10195531.47 10203843.16 ... 9987219.4
ts(values)
Time Series:
Start = 1
End = 35
Frequency = 1
[1] 34 28 1 2 14 5 10 21 16 20 3 4 22 7 15 19 23 9 17 12 13 11 35 8 18 31 6 32 24 26 33 29 27 30 25
attr(,"levels")
[1] 10012331.62 10068304.8 10176578.68 10188604.94 10195531.47 10203843.16 10204762.87 10260642.81
[9] 10323947.98 10324432.96 10382488.99 10454028.37 10455870.06 10532572.67 10668741.18 10700973.87
[17] 10778145.47 10848819.19 10897544.85 11068831.1 11208386.78 11380302.06 11521619.21 7542383.87
[25] 8409379.61 8569148.28 9361047.7 9362712.43 9440525.82 9715566.45 9732347.5 9869125.29
[33] 9890259.72 9976955.44 9987219.4
ts(values, start=c(2012, 1), frequency=12)
Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2012 34 28 1 2 14 5 10 21 16 20 3 4
2013 22 7 15 19 23 9 17 12 13 11 35 8
2014 18 31 6 32 24 26 33 29 27 30 25
Your values variable is a factor (usually used for categorical values).
Convert values to numeric before creating time series:
values <- as.numeric(levels(values))[values]
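The difference between the level codes and the underlying numbers can be seen on a small example. This sketch uses a three-element factor built from the first three entries of the question's data as a stand-in for the real column; the names codes and recovered are illustrative:

```r
# A small factor standing in for the `values` column
values <- factor(c("9976955.44", "9362712.43", "10012331.62"))

codes <- as.numeric(values)                       # internal level codes, not the data
recovered <- as.numeric(levels(values))[values]   # numbers stored in the level labels

codes
recovered
```

ts() applied to a factor works on the codes, which is why the series showed what looked like row numbers. If the column comes from read.csv(), passing stringsAsFactors = FALSE (the default since R 4.0.0) avoids the factor in the first place; a column that then reads as character rather than numeric probably contains a non-numeric entry somewhere.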

Seasonality by day of month

I want to check for seasonality in a time series by day of the month.
The problem is that the months are not of equal length (or frequency): there are months with 31, 28 and 30 days.
When declaring the ts object I can only specify one fixed frequency, so it won't be correct.
> x <- data.frame(d = as.Date("2013-01-01") + 1:365 , v = runif(365))
> tapply(as.numeric(format(x$d,"%d")) , format(x$d,"%m") , max)
01 02 03 04 05 06 07 08 09 10 11 12
31 28 31 30 31 30 31 31 30 31 30 31
How can I create a time series object in R that I can later decompose to check for seasonality?
Is it possible to create a pivot table and convert it into a ts?
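The pivot-table idea can be sketched in base R with tapply(). This is only a first look at a day-of-month pattern under the assumption of one value per calendar day, not a decomposition; the names pivot and day_profile are illustrative:

```r
set.seed(1)  # only so the random example data are reproducible
x <- data.frame(d = as.Date("2013-01-01") + 0:364, v = runif(365))

# Pivot: one row per month, one column per day of the month (01..31);
# days that do not exist in a month (e.g. 30 February) are NA
pivot <- tapply(x$v,
                list(month = format(x$d, "%Y-%m"), day = format(x$d, "%d")),
                mean)

# Average value per day of the month across all months
day_profile <- colMeans(pivot, na.rm = TRUE)

dim(pivot)
```

The NA cells make the unequal month lengths explicit, which a single fixed ts frequency cannot represent.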
