I want to implement a Shiny app that has a dateInput, but I want to restrict the user to selecting dates within a range of months (e.g. January to June), regardless of the year. Is there any way to do this?
Sure, here you go:
test_dt <- seq(as.Date('2015-01-01'), as.Date('2015-12-31'), by = "month")
format(test_dt, "%B")
[1] "January" "February" "March" "April" "May" "June" "July" "August" "September" "October" "November" "December"
You could probably implement this with a validate(need(...)) construct, checking that the month entered by the user falls within the required range. You can read more here: http://shiny.rstudio.com/reference/shiny/latest/validate.html
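A minimal sketch of the month check itself, under some assumptions: month_in_range is a hypothetical helper (not part of Shiny), and the input id "date" and output id "result" in the commented server fragment are made up for illustration.

```r
# Hypothetical helper: TRUE when the month of `date` lies between
# `first_month` and `last_month` (1 = January), whatever the year.
month_in_range <- function(date, first_month = 1, last_month = 6) {
  m <- as.integer(format(as.Date(date), "%m"))
  m >= first_month & m <= last_month
}

# Inside a Shiny server function, the helper could back a validate()/need()
# check (assumed ids: input$date, output$result):
#
# output$result <- renderText({
#   validate(need(month_in_range(input$date),
#                 "Please pick a date between January and June."))
#   format(input$date, "%Y-%m-%d")
# })

# Quick check outside Shiny: one date in March, one in September.
check <- month_in_range(as.Date(c("2015-03-15", "2021-09-01")))
print(check)
```

Note that dateInput itself still lets the user click any date; the check only blocks the dependent output and shows the message when the month is out of range.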
I have a df with a column of dates stored as character strings, from which I want to extract the months. For this I use the following:
mutate(
  Date = as.Date(
    str_remove(Timestamp, "_.*")
  ),
  Month = month(
    Date,
    label = FALSE)
)
However, October, November and December are stored with an extra zero in front of the month number (e.g. 2021-010-01), which lubridate doesn't recognise. How can I adjust the code above to fix this? This is my Timestamp column:
c("2021-010-01_00h39m", "2021-010-01_01h53m", "2021-010-01_02h36m",
"2021-010-01_10h32m", "2021-010-01_10h34m", "2021-010-01_14h27m"
)
First convert the values to dates, then use format to get the month from them.
format(as.Date(x, '%Y-0%m-%d'), '%b')
#[1] "Oct" "Oct" "Oct" "Oct" "Oct" "Oct"
%b gives the abbreviated month name; you may also use %B (full name) or %m (month number), depending on what you need.
format(as.Date(x, '%Y-0%m-%d'), '%B')
#[1] "October" "October" "October" "October" "October" "October"
format(as.Date(x, '%Y-0%m-%d'), '%m')
#[1] "10" "10" "10" "10" "10" "10"
One way would be to use strsplit to extract the second element:
month.abb[readr::parse_number(sapply(strsplit(x, split = '-'), "[[", 2))]
which will return:
#"Oct" "Oct" "Oct" "Oct" "Oct" "Oct"
data:
c("2021-010-01_00h39m", "2021-010-01_01h53m", "2021-010-01_02h36m",
"2021-010-01_10h32m", "2021-010-01_10h34m", "2021-010-01_14h27m"
) -> x
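Another option for the timestamp question above, sketched under the assumption that the only defect is a single extra zero in front of a two-digit month: repair the string with a regular expression first, then parse it as an ordinary date. The sample vector ts is made up and mixes a malformed and a well-formed value.

```r
# Made-up sample: one malformed month field ("010") and one normal one ("09").
ts <- c("2021-010-01_00h39m", "2021-09-15_01h53m")

# 1. Drop the time part after the underscore.
# 2. Remove the extra zero from a three-digit month field ("010" -> "10");
#    well-formed values do not match the pattern and pass through unchanged.
date_part <- sub("_.*", "", ts)
clean <- sub("^([0-9]{4})-0([0-9]{2})-", "\\1-\\2-", date_part)

dates <- as.Date(clean)
month_num <- as.integer(format(dates, "%m"))
print(month_num)
```

Once the strings are clean, the original mutate() pipeline with lubridate's month() should work on them unchanged.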
Let's assume I have a dataframe with one column, Date, that goes from 2000 to 2019. The problem is that I don't have a perfect monthly frequency (in fact I should have 245 observations, but I only have 215). My aim is to detect which months are missing from the column.
Let's take this example. This is a sample dataframe:
df <- data.frame(Date = c("2015-01-22", "2015-03-05", "2015-04-15", "2015-06-03", "2015-07-16", "2015-09-03", "2015-10-22", "2015-12-03", "2016-01-21", "2016-03-10", "2016-04-21", "2016-06-02", "2016-07-21", "2016-09-08", "2016-10-20", "2016-12-08", "2017-01-19", "2017-03-09", "2017-04-27", "2017-06-08", "2017-07-20", "2017-09-07", "2017-10-26", "2017-12-14", "2018-01-25", "2018-03-08", "2018-04-26", "2018-06-14", "2018-07-26", "2018-09-13", "2018-10-25", "2018-12-13", "2019-01-24", "2019-03-07", "2019-04-10", "2019-06-06", "2019-07-25", "2019-09-12", "2019-10-24", "2019-12-12"))
df
I would like to find code that tells me which months are missing from my column of dates.
Can anyone help me?
Thanks a lot
Here are two types of results to see the missing months, with base R:
If you want to see the missing months regardless of year, you can try the following code:
missingMonths <- month.name[setdiff(seq(12),as.numeric(format(as.Date(df$Date),"%m")))]
such that
> missingMonths
[1] "February" "May" "August" "November"
If you want to check the missing months by year, you can try the code below:
missingMonths <- lapply(split(df,format(as.Date(df$Date),"%Y")),
function(x) month.name[setdiff(seq(12),as.numeric(format(as.Date(x$Date),"%m")))])
such that
> missingMonths
$`2015`
[1] "February" "May" "August" "November"
$`2016`
[1] "February" "May" "August" "November"
$`2017`
[1] "February" "May" "August" "November"
$`2018`
[1] "February" "May" "August" "November"
$`2019`
[1] "February" "May" "August" "November"
Not as succinct as above, but still does the trick in a couple of steps:
month_date_strings <- unique(paste0(sub("-[^-]+$", "",
sapply(df$Date, as.character)), "-01"))
month_seq_strings <- unique(as.character(seq.Date(as.Date("2000-01-01", "%Y-%m-%d"),
as.Date("2019-12-31", "%Y-%m-%d"), by = "month")))
month_seq_strings[!(month_seq_strings %in% month_date_strings)]
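The same idea can be sketched with "YYYY-MM" keys and setdiff, taking the start and end of the sequence from the data instead of hard-coding them. The short dates vector below is an illustrative subset, not the full column from the question.

```r
# Illustrative subset of the dates in the question.
dates <- as.Date(c("2015-01-22", "2015-03-05", "2015-04-15", "2015-06-03"))

# Year-month keys that actually occur in the data.
observed <- format(dates, "%Y-%m")

# Every month between the first and the last observation,
# anchored on day 1 so that seq() never skips a short month.
first <- as.Date(format(min(dates), "%Y-%m-01"))
last <- as.Date(format(max(dates), "%Y-%m-01"))
expected <- format(seq(first, last, by = "month"), "%Y-%m")

missing_months <- setdiff(expected, observed)
print(missing_months)
```

Because the keys include the year, this reports each gap once per year, which is what the expected-versus-actual observation count in the question is really about.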
In base R, we have easy access to an array containing the calendar month names, month.name, and to an array containing the calendar month abbreviations, month.abb:
> month.name
# [1] "January" "February" "March" "April" "May" "June"
# [7] "July" "August" "September" "October" "November" "December"
> month.abb
# [1] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"
Similarly, in Python there are two array-like objects in the standard library calendar module:
>>> from calendar import month_name, month_abbr
>>> list(month_name)
# ['', 'January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December']
>>> list(month_abbr)
# ['', 'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
Does Julia have a similar array of month names in the standard library Dates module, or perhaps a third-party package?
@hckr has provided a nice answer about where these things are currently stored internally. However, LOCALES is not exported by Dates, and explicitly using non-exported objects from modules is something we should try to avoid. In this case, you could retrieve the month names using the (exported) function monthname:
julia> monthname.(1:12)
12-element Array{String,1}:
"January"
"February"
"March"
"April"
"May"
"June"
"July"
"August"
"September"
"October"
"November"
"December"
Edit: See @Colin T Bowers' correct answer. You should use the monthname, monthabbr, dayname and dayabbr functions to retrieve these names and abbreviations, e.g. dayabbr.(1:7) or dayabbr(2). These functions can also take a locale argument to provide names/abbreviations in other languages/locales. My answer is now about where these things are currently stored internally.
It has. They are stored in a Dict for available locales (only English, by default, though you can add others). You can access them with, for example Dates.LOCALES["english"]. This will give you a struct in the following form.
struct DateLocale
months::Vector{String}
months_abbr::Vector{String}
days_of_week::Vector{String}
days_of_week_abbr::Vector{String}
month_value::Dict{String, Int}
month_abbr_value::Dict{String, Int}
day_of_week_value::Dict{String, Int}
day_of_week_abbr_value::Dict{String, Int}
end
So, Dates.LOCALES["english"].months_abbr will give you the abbreviations of months in English as an array of strings. You can also get days of the week and their abbreviations. You can also add other locales to this dict using DateLocale constructors.
The information in Dates.LOCALES["localename"] is also used when parsing dates in the localename locale.
https://docs.julialang.org/en/v1/stdlib/Dates/index.html#Query-Functions-1
I have a dataframe consisting of 96321 observations of 11 variables. This data is confidential, so I am not able to share it with you, although I am sharing some screenshots of my data.
My focus is on the FY and OM variables.
levels(mydata$FY)
[1] "2010/11" "2011/12" "2012/13" "2013/14" "2014/15" "2015/16"
levels(mydata$OM)
[1] "Apr" "Aug" "Dec" "Feb" "Jan" "Jul" "Jun" "Mar" "May" "Nov" "Oct" "Sep"
I just want to re-arrange the levels of the 'OM' variable as I want to start my year from April to March (financial Year).
I used the following command to rearrange the levels of my 'OM' variables:
table(is.na(mydata$OM))
FALSE
96321
levels(mydata$OM)<-c('Apr','May','Jun','July','Aug','Sep','Oct','Nov','Dec','Jan','Feb','Mar'
)
table(is.na(mydata$OM)) #NO NA is introduced
FALSE
96321
levels(mydata$OM)
[1] "Apr" "May" "Jun" "July" "Aug" "Sep" "Oct" "Nov" "Dec" "Jan" "Feb" "Mar"
I got the result I expected, but when I tried to sort my data by the 'OM' variable using SQL, I did not get the desired result.
sortedData <-sqldf('SELECT * FROM mydata
ORDER BY OM ASC')
I expected the rows in increasing order of the levels of the 'OM' variable: Apr first, then May, and Mar last. But the order is somewhat distorted. Please help me with this.
Note:- I also tried
mydata$OM <- factor(mydata$OM, levels = c('Apr','May','Jun','July','Aug','Sep','Oct','Nov','Dec','Jan','Feb','Mar'
))
mydata$OM <-factor(mydata$OM, levels = c('Apr','May','Jun','July','Aug','Sep','Oct','Nov','Dec',
'Jan','Feb','Mar'),
labels = c('Apr','May','Jun','July','Aug','Sep','Oct','Nov','Dec',
'Jan','Feb','Mar'))
But these introduced NA in the result.
table(is.na(mydata$OM))
FALSE TRUE
88097 8224
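Set the level order with factor(), spelling each level exactly as it appears in the original data ('Jul', not 'July'; any value that is missing from levels becomes NA, which is most likely where the NAs above came from). Do this on the freshly loaded data, because levels(mydata$OM) <- c(...) relabels the existing levels in place instead of reordering them.
mydata$OM <- factor(mydata$OM, levels = c('Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec','Jan','Feb','Mar'))
Then use mydata[order(mydata$OM),]
This will solve your problem. To sort by multiple columns, use
mydata[order(mydata$OM,mydata$FY),]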
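A small self-contained sketch of the financial-year ordering, using a made-up data frame demo in place of the confidential mydata. As far as I know, sqldf hands factor columns to SQLite as plain text, so ORDER BY compares the month labels alphabetically and ignores the level order; order() in R works on the factor's integer codes and therefore respects it.

```r
# Made-up stand-in for mydata; OM starts out with alphabetical levels.
demo <- data.frame(
  FY = c("2011/12", "2010/11", "2010/11", "2011/12", "2010/11"),
  OM = c("Apr", "Jan", "Jul", "Mar", "May"),
  stringsAsFactors = TRUE
)

# Financial-year order: April first, March last.
# month.abb uses "Jul", matching the labels in the data.
fy_levels <- month.abb[c(4:12, 1:3)]
demo$OM <- factor(demo$OM, levels = fy_levels)

# Sort by financial year, then by month within the year.
sorted <- demo[order(demo$FY, demo$OM), ]
print(sorted)
```

Within 2010/11 the rows come out as May, Jul, Jan, and within 2011/12 as Apr, Mar, with no NA introduced.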
I would like to know which distinct elements occur in a vector that has a lot of duplicates. Please, before you suggest using levels(), let me explain first.
So, for example:
data <-c( "Jan", "Jan", "Feb", "Feb", "Feb", "Mar" )
supermagicfunction( data )
[1] "Jan" "Feb" "Mar"
As you see, I'm working with dates. I'm using POSIX (actually strftime()) for that. This is where the problem is. Normally, I would use levels. But that returns all months of the year as levels because I work with POSIX dates. Like this:
levels( data )
[1] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"
I assume POSIXct kindly determines the levels for this vector.
Now, my question is: Does anyone know a function (perhaps even a primitive?) that could help here?
Ha! I just found it myself. This will work:
unique( data )
[1] "Jan" "Feb" "Mar"
And it's fast, too.
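To make the difference concrete, here is a minimal sketch with made-up dates, assuming the months are held in a factor that carries all twelve levels: levels() reports every possible level, unique() reports only the values that occur, and droplevels() is an alternative that keeps the result a factor.

```r
# Made-up dates covering only January to March.
dates <- as.Date(c("2020-01-05", "2020-01-20", "2020-02-11", "2020-03-02"))

# Month factor with all twelve levels; the month number is used as an index
# into month.abb so the labels do not depend on the locale.
months <- factor(month.abb[as.integer(format(dates, "%m"))], levels = month.abb)

all_levels <- levels(months)               # all 12 months, used or not
present <- unique(as.character(months))    # only the months that occur
used_levels <- levels(droplevels(months))  # same months, via the factor

print(present)
```

unique() returns values in order of first appearance, while droplevels() keeps them in level order; for calendar-ordered data like this the two agree.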