how to extract all values from a data frame by month for years - r

I have data in a zoo data structure. I want to pull all August daily values over 10 years and compute monthly statistics for the period of record. Any thoughts on an easy way to do this?

An example of the specific date format would help, but try format().
For example:
x <- as.POSIXct("2009-08-03 12:01:59.23")
format(x,"%b")  # abbreviated month name, e.g. "Aug" in an English locale
For simplicity, just create a new column with format(), then subset it by the month you're looking for.
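A minimal sketch of that approach in base R, assuming daily data held in a data frame. The data frame df, its column names, and the random values are hypothetical stand-ins for the asker's series; for a zoo object z, the same columns could be built from index(z) and coredata(z). It uses "%m" rather than "%b" so the comparison does not depend on the locale.

```r
# Hypothetical daily data covering 2001-2010 (values are random placeholders)
set.seed(1)
dates <- seq(as.Date("2001-01-01"), as.Date("2010-12-31"), by = "day")
df <- data.frame(date = dates, value = rnorm(length(dates)))

# "%m" gives the zero-padded month number as a string, independent of locale
df$month <- format(df$date, "%m")
df$year  <- format(df$date, "%Y")

# Keep only the August rows across all years
aug <- df[df$month == "08", ]

# Statistics over the whole period of record
aug_mean <- mean(aug$value)
aug_sd   <- sd(aug$value)

# The same statistic computed separately for each year's August
aug_by_year <- aggregate(value ~ year, data = aug, FUN = mean)
```

Swapping FUN for median, max, or another summary function gives other monthly statistics.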

Related

Data frames and datetimes [duplicate]

I have a dataset that I'm working with and I'm trying to change the format of my time column. The current format reads like this, for example: "2022-05-23 23:06:58". I'm trying to change this to show only the time and drop the date.
Other info: I want to make this change within my data frame, not just on a few sample times. I need to change over 100,000 rows, so I need a function or solution that will do that. Tidyverse, lubridate, format(), etc. are all fine. Thank you.
Edit: There was one thing I may not have articulated fully. I want to keep the exact time and nothing else, so '23:48:07' is what I'm looking for, not just the hour. I need it so I can eventually compute the time elapsed between two columns.
Try this.
For the first question, here is the code to extract the time of day:
your_time <- format(as.POSIXct(your_time), format = "%H:%M:%S")
# which gives "23:06:58" for the example above
Since you want to apply this to a large dataset, use the following:
library(dplyr)
large_df %>%
  mutate(Hour = format(as.POSIXct(Datetime), format = "%H:%M:%S"))
Here large_df is your dataset of over 100,000 records. mutate() adds a new column named Hour to hold the result, and Datetime is the date-time column in large_df.
Is the time as a string OK? If so, you can use substr to extract the hours and minutes like so:
time <- c("2022-05-23 23:02:58", "2022-05-23 13:52:58", "2022-05-23 03:31:58", "2022-05-23 09:09:58")
n <- nchar(time)
hour <- substr(time, n - 7, n - 3)
Use substr(time, n - 7, n) to keep the seconds as well. Just replace time with your 100,000-row time column.
library(data.table)
hour("2022-05-23 23:06:58") # 23
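The answers above all return character strings or a bare hour, and neither can be subtracted to get elapsed time. A minimal sketch of one way to do both, assuming two date-time columns: the data frame df, the column names start and end, and the UTC time zone are all hypothetical. The idea is to keep the POSIXct values for the arithmetic and use format() only for display.

```r
# Hypothetical two-column data frame; names and values are placeholders
df <- data.frame(
  start = c("2022-05-23 23:06:58", "2022-05-23 13:52:58"),
  end   = c("2022-05-23 23:48:07", "2022-05-23 14:02:58"),
  stringsAsFactors = FALSE
)

# Parse once; the time zone is fixed here only to make the example reproducible
start <- as.POSIXct(df$start, tz = "UTC")
end   <- as.POSIXct(df$end, tz = "UTC")

# Time-of-day strings for display
df$start_time <- format(start, "%H:%M:%S")
df$end_time   <- format(end, "%H:%M:%S")

# Elapsed time computed on the date-times, not on the strings
df$elapsed_mins <- as.numeric(difftime(end, start, units = "mins"))
```

Both operations are vectorized, so they apply to a 100,000-row column without a loop.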

How can I convert characters into dates in RStudio?

Still new to R. I want to create a simple bar chart of the number of burglaries per month in my city. I found that the column 'Occurence_Date' is a character; I want it to be a "time", or something simpler, so I can create a visualization. I want the x-axis to show the months January to June 2019 and the y-axis to show the number of burglaries per month. Can anyone help me get started on this please? Thanks!
This is my data frame
The lubridate package is very helpful for working with dates and times in R.
# install.packages("lubridate") ## only run once
library(lubridate)
df$Occurence_Date <- ymd(df$Occurence_Date) # converts text in year-month-day format to Date; if the strings also carry a time, ymd_hms() parses both
Generally it's better to put example data in your question so people can work with it and show an example.
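Since the question did not include its data, here is a minimal base R sketch of the remaining steps under stated assumptions: the data frame below is a made-up stand-in, and the real 'Occurence_Date' column may use a different date format, in which case the format string needs to change.

```r
# Hypothetical stand-in for the asker's data (one row per burglary)
df <- data.frame(
  Occurence_Date = c("2019-01-05", "2019-01-20", "2019-02-11",
                     "2019-03-02", "2019-03-15", "2019-03-30", "2019-06-09"),
  stringsAsFactors = FALSE
)

# Character -> Date
df$Occurence_Date <- as.Date(df$Occurence_Date, format = "%Y-%m-%d")

# Month as a factor with all six levels, so months with no burglaries still get a bar
months <- factor(format(df$Occurence_Date, "%m"),
                 levels = sprintf("%02d", 1:6),
                 labels = month.abb[1:6])

# Count rows per month and plot
counts <- table(months)
barplot(counts, xlab = "Month (2019)", ylab = "Number of burglaries")
```

Declaring the factor levels up front is what keeps April and May on the axis even though the sample data has no rows for them.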

time series in R with sales prediction with only date values

I have data with dates (2015) in mm/dd/yy format, plus sales. I need to predict sales for 2016 from the given data. I know I need to use time series forecasting, but I have no idea how. Many examples have only years (1960, 1970, ...), whereas my data covers a single year with several months. I don't know how to plot it either. Can you give me a clear structure for how to proceed?
Assuming the date is a string in the format mm/dd/yy,
convert the string into a date using this code:
a <- "07/23/15"
b <- as.Date(a, format = "%m/%d/%y")
fullYear <- format(b, '%Y') # to get "2015" as the year
halfYear <- format(b, '%y') # to get "15" as the year
After this you can work on
I have found the solution. I converted the sales figures into a time series,
plotted the data, and checked whether there was any trend or seasonality.
Since the data showed only a trend, I applied Holt's exponential smoothing from the forecast package. Sales for 2016 were then forecast and plotted.
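The steps described above can be sketched roughly as follows. This is not the poster's code: it swaps in base R's HoltWinters() with gamma = FALSE (Holt's linear trend method) in place of the forecast package so that it runs without extra installs. The daily sales values and the smoothing parameters alpha and beta are made up for illustration.

```r
# Hypothetical daily sales for 2015 with dates stored as mm/dd/yy strings
days <- seq(as.Date("2015-01-01"), as.Date("2015-12-31"), by = "day")
raw <- data.frame(date = format(days, "%m/%d/%y"),
                  sales = 100 + 0.5 * seq_along(days),
                  stringsAsFactors = FALSE)

# Parse the strings, then total the sales for each month
raw$date  <- as.Date(raw$date, format = "%m/%d/%y")
raw$month <- format(raw$date, "%Y-%m")
monthly <- aggregate(sales ~ month, data = raw, FUN = sum)

# Monthly time series starting in January 2015
sales_ts <- ts(monthly$sales, start = c(2015, 1), frequency = 12)

# Holt's method: level and trend, no seasonal component.
# alpha and beta are fixed to illustrative values; leaving them out
# asks HoltWinters() to estimate them from the data instead.
fit <- HoltWinters(sales_ts, alpha = 0.8, beta = 0.2, gamma = FALSE)

# Forecast the 12 months of 2016 and plot them with the fitted series
fc <- predict(fit, n.ahead = 12)
plot(fit, fc)
```

With only twelve monthly observations there is no way to estimate seasonality, which is one reason a trend-only method is a reasonable choice here.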

Simple time series analysis with R: aggregating and subsetting

I want to convert monthly data into quarterly averages. These are my 2 datasets:
gas <- UKgas
dd <- UKDriverDeaths
I was able to accomplish (I think) for the dd data as so:
dd.zoo <- zoo(dd)
ddq <- aggregate(dd.zoo, as.yearqtr, mean)
However I cannot figure out how to do this with the gas data...any help?
Follow-up
When I try to subset the data based on date (1969-1984) the resulting data does not include 1969 Q1 and instead includes 1985 Q1...any suggestions on how to fix this? I was just trying to subset as gas[1969:1984].
Originally I did not plan to post an answer, as it looks like you did not check your UKgas dataset first to see that it is already a quarterly time series.
But the follow-up question is worth answering. A "ts" object comes with many handy generic functions. We can use window to easily subset a time series. To extract the section between the first quarter of 1969 and the final quarter of 1984, we can use
window(UKgas, start = c(1969,1), end = c(1984,4))
The result will still be a quarterly time series.
On the other hand, if we use "[" for subsetting, we lose object class:
class(UKgas[1:12])
#[1] "numeric"
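Putting both parts together, a short sketch using only base R and the built-in datasets from the question. It relies on the "ts" method of aggregate() rather than the zoo route the asker used, and the names gas_sub and ddq are just illustrative. Note that "[" indexes by position, so gas[1969:1984] asks for elements 1969 to 1984 rather than for those years.

```r
# UKgas is already quarterly, so no aggregation is needed; window() subsets by time
gas_sub <- window(UKgas, start = c(1969, 1), end = c(1984, 4))

# UKDriverDeaths is monthly; aggregate() on a "ts" object averages each block of three months
ddq <- aggregate(UKDriverDeaths, nfrequency = 4, FUN = mean)

frequency(gas_sub)  # 4
class(gas_sub)      # "ts"
```

Both results cover 1969 Q1 to 1984 Q4 at quarterly frequency, so they can be compared or plotted side by side.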

How to get monthly time series cross sectional into zoo using R

I want to get a panel data set into zoo so that it catches both month and year. My data set looks like this.
and the data can be downloaded from HERE.
The best I could do is:
dat<-read.csv("dat_lag.csv")
zdat <- read.zoo(dat, format="%d/%m/%Y")
However, I could only do this by including column 1 (Date) and column 4 (Day) in my data set. Is there a clever way to get both month and year into zoo using R without including the Date and Day columns? Thanks in advance for any help.
