Duplicate NETCDF data values across timesteps using R - r

I have 6-hourly data and would like to 'duplicate' it to hourly data.
The first 6-hour timestep starts on 2017-01-01 00:00:00 and the next 6-hour timestep starts on 2017-01-01 06:00:00. I would like to copy the value at 2017-01-01 00:00:00 and assign it to the next 5 hourly timesteps, and so on.
The output should follow this pattern (illustration only):
Date Time Value
2017-01-01 00:00:00 0.00012120
2017-01-01 01:00:00 0.00012120
2017-01-01 02:00:00 0.00012120
2017-01-01 03:00:00 0.00012120
2017-01-01 04:00:00 0.00012120
2017-01-01 05:00:00 0.00012120
.
.
.
2019-12-01 00:00:00 0.0024270
2019-12-01 01:00:00 0.0024270
2019-12-01 02:00:00 0.0024270
2019-12-01 03:00:00 0.0024270
2019-12-01 04:00:00 0.0024270
2019-12-01 05:00:00 0.0024270
.
.
.
Do the same for the next 6-hour timestep which is 2017-01-01 06:00:00 in the attached file.
Assume that the hourly rainfall remains constant during the 6h period. Thus, each hour in the 6h period has the same rainfall value.
Sample NETCDF data are found here

First create 5 NetCDF files with the time shifted by 1, 2, 3, 4 and 5 hours:
cdo -shifttime,1hour testing.nc testing1.nc
cdo -shifttime,2hour testing.nc testing2.nc
cdo -shifttime,3hour testing.nc testing3.nc
cdo -shifttime,4hour testing.nc testing4.nc
cdo -shifttime,5hour testing.nc testing5.nc
Then merge them using mergetime:
cdo mergetime testing*.nc out.nc
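Since the question asks about R, the same duplication can also be sketched in base R once the variable has been read into an array. This is only a minimal sketch under stated assumptions: the toy array six_hourly and the vector times_6h are hypothetical stand-ins for data read from the file (for example with a NetCDF reader package, not shown), and time is assumed to be the last dimension.

```r
# Toy stand-in for a NetCDF variable: dimensions are lon x lat x time (assumed layout).
six_hourly <- array(1:8, dim = c(2, 2, 2))
times_6h <- as.POSIXct(c("2017-01-01 00:00:00", "2017-01-01 06:00:00"), tz = "UTC")

# Repeat each 6-hourly time index six times: 1 1 1 1 1 1 2 2 2 2 2 2
idx <- rep(seq_along(times_6h), each = 6)

# Index the time dimension with the repeated indices to copy each slice six times.
hourly <- six_hourly[, , idx, drop = FALSE]

# Build the matching hourly time axis: each 6-hourly stamp plus 0..5 hours.
times_1h <- times_6h[idx] + 3600 * rep(0:5, times = length(times_6h))
```

Each hourly slice then carries the value of the 6-hourly timestep it falls in, which matches the assumption that rainfall is constant within each 6-hour period.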

Related

Update year only in column timestamp date field SQLITE

I want to update only the year to 2025, without changing the month, day and time.
What I have:
2027-01-01 09:30:00
2012-03-06 12:00:00
2014-01-01 17:24:00
2020-07-03 04:30:00
2020-01-01 05:50:00
2021-09-03 06:30:00
2013-01-01 23:30:00
2026-01-01 08:30:00
2028-01-01 09:30:00
What I need is below:
2025-01-01 09:30:00
2025-03-06 12:00:00
2025-01-01 17:24:00
2025-07-03 04:30:00
2025-01-01 05:50:00
2025-09-03 06:30:00
2025-01-01 23:30:00
2025-01-01 08:30:00
2025-01-01 09:30:00
I am using DB Browser for SQLite.
What I have tried, but it didn't work:
update t set
d = datetime(strftime('%Y', datetime(2059)) || strftime('-%m-%d', d));
You may update via a substring operation:
UPDATE yourTable
SET ts = '2025-' || SUBSTR(ts, 6, 14);
Note that SQLite does not actually have a timestamp/datetime type. Instead, these values would be stored as text, and hence we can do a substring operation on them.
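The substring logic can be checked outside the database. A minimal R sketch, assuming every timestamp is a 19-character string in the form YYYY-MM-DD HH:MM:SS (the vector ts is illustrative, taken from the sample values in the question):

```r
# Sample timestamps stored as text, as they would be in SQLite.
ts <- c("2027-01-01 09:30:00", "2012-03-06 12:00:00", "2014-01-01 17:24:00")

# Keep characters 6..19 (month, day and time) and prepend the new year.
updated <- paste0("2025-", substr(ts, 6, 19))
updated
```

Characters 6 to 19 cover the same 14 characters that SUBSTR(ts, 6, 14) selects in the SQL statement.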

Using subset on dates giving shifted dates from the desired time frame

I have a data frame (called homeAnew) from which the head is as follows.
date total
1 2014-01-01 00:00:00 0.756
2 2014-01-01 01:00:00 0.717
3 2014-01-01 02:00:00 0.643
4 2014-01-01 03:00:00 0.598
5 2014-01-01 04:00:00 0.604
6 2014-01-01 05:00:00 0.638
I wanted to extract explicit dates and I originally used:
Hourly <- subset(homeAnew,date >= "2014-04-10 00:00:00" & date <= "2015-04-10 00:00:00")
However the result was a dataframe that started at 2014-04-09 12:00:00 and ended 2015-04-09 12:00:00. Basically it was shifted back 12 hours from where I wanted it.
I was able to use
Date1<-as.Date("2014-04-10 00:00:00")
Date2<-as.Date("2015-04-10 00:00:00")
Hourly<-homeAnew[homeAnew$date>=Date1 & homeAnew$date<=Date2,]
to get what I was after, but I was wondering if someone could explain to me why subset would behave like that?
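One possible cause (an assumption, since the question does not show how the date column was created) is a time-zone mismatch: when a POSIXct column is compared with a character string, R converts the string using the session's time zone, which may differ from the zone the column is stored in. A minimal sketch that avoids the ambiguity by building the bounds as POSIXct with an explicit zone; the toy data frame and the choice of UTC are illustrative only.

```r
# Toy hourly data in an explicit time zone (UTC is an assumption for illustration).
homeAnew <- data.frame(
  date  = seq(as.POSIXct("2014-04-09 00:00:00", tz = "UTC"),
              by = "hour", length.out = 72),
  total = seq_len(72)
)

# Build the bounds as POSIXct in the same zone instead of passing bare strings.
lower <- as.POSIXct("2014-04-10 00:00:00", tz = "UTC")
upper <- as.POSIXct("2014-04-11 00:00:00", tz = "UTC")

Hourly <- subset(homeAnew, date >= lower & date <= upper)
range(Hourly$date)
```

Checking attr(homeAnew$date, "tzone") and Sys.timezone() on the real data would show whether the two zones actually differ.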

Alter the data frame to compute hourly values from the daily values in r

I am working on analyzing a delivery unit. I have total number of packets delivered by a unit crew in a day aggregated by session.
The first column is the Date, the second is the session start time (Session_start), the third is the session end time (Session_end), the fourth is the total number of packets delivered in that session (Packets), and the fifth is the session length in hours (Diff_Time). I added Session_Type to distinguish the sessions.
Data Frame: df1
Date Session_start Session_end Packets Diff_Time Session_Type
7/01/2016 00:00:00 03:00:00 6000 3 NIGHT
7/01/2016 04:00:00 06:00:00 5000 2 MORNING
Now I would like to convert above data which is aggregated according to session into hourly data as follows:
Data Frame: df2
Date Session_start Session_end Packets Diff_Time Session_Type
7/01/2016 00:00:00 01:00:00 2000(6000/3) 1 NIGHT
7/01/2016 01:00:00 02:00:00 4000(cumsum) 1 NIGHT
7/01/2016 02:00:00 03:00:00 6000 1 NIGHT
7/01/2016 03:00:00 04:00:00 6000 1 NIGHT
7/01/2016 04:00:00 05:00:00 8500 1 MORNING
7/01/2016 05:00:00 06:00:00 11000 1 MORNING
7/01/2016 06:00:00 07:00:00 11000 1 MORNING
.
.
7/01/2016 23:00:00 24:00:00 11000 1 MORNING
Total packets in the day = 11000 (6000 + 5000), which should be the cumulative sum at the end of the day in the reshaped data frame df2.
Could anyone point me in the right direction to take this forward?
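One way to approach this, sketched in base R under simplifying assumptions: session times are reduced to whole hours in hypothetical columns start_hour and end_hour, and each session's packets are spread evenly over its hours before taking a running total.

```r
# Sessions from df1, with start/end reduced to whole hours (hypothetical columns).
df1 <- data.frame(start_hour = c(0, 4), end_hour = c(3, 6), Packets = c(6000, 5000))

# Packets delivered in each hour of the day; hours outside any session stay at 0.
hourly <- numeric(24)
for (i in seq_len(nrow(df1))) {
  hrs <- seq(df1$start_hour[i], df1$end_hour[i] - 1)  # hours covered by the session
  hourly[hrs + 1] <- hourly[hrs + 1] + df1$Packets[i] / length(hrs)
}

# Running total per hour, matching the cumulative Packets column wanted in df2.
df2 <- data.frame(hour_start = 0:23, hour_end = 1:24, Packets = cumsum(hourly))
head(df2, 7)
```

The running total stays flat during hours with no session and ends the day at the sum of all session totals.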

Create a dataframe with columns of given Date and Time

I would like to create a dataframe with its first column as Date and second column as Time. The time should increase in 30-minute intervals, with the date advancing accordingly. I will add other columns manually later.
> df
Date Time
2012-01-01 00:00:00
2012-01-01 00:30:00
2012-01-01 01:00:00
2012-01-01 01:30:00
.......... ........
.......... ........
and so on...
EDIT
Can be done in another way as well.
A single column can be created with the given date and time and then separated later using tidyr or any other packages.
> df
DateTime
2012-01-01 00:00:00
2012-01-01 00:30:00
2012-01-01 01:00:00
2012-01-01 01:30:00
..........
..........
and so on...
Any help will be appreciated. Thank you in advance.
You can generate a sequence using seq, specifying the start and end dates and the time interval in seconds:
df <- data.frame(DateTime = seq(as.POSIXct("2012-01-01"),
as.POSIXct("2012-02-01"),
by=(30*60)))
head(df)
DateTime
1 2012-01-01 00:00:00
2 2012-01-01 00:30:00
3 2012-01-01 01:00:00
4 2012-01-01 01:30:00
5 2012-01-01 02:00:00
6 2012-01-01 02:30:00
And to get them in two separate columns we can use ?strftime
date_seq <- seq(as.POSIXct("2012-01-01"),
as.POSIXct("2012-02-01"),
by=(30*60))
df <- data.frame(Date = strftime(date_seq, format="%Y-%m-%d"),
Time = strftime(date_seq, format="%H:%M:%S"))
Date Time
1 2012-01-01 00:00:00
2 2012-01-01 00:30:00
3 2012-01-01 01:00:00
4 2012-01-01 01:30:00
5 2012-01-01 02:00:00
6 2012-01-01 02:30:00
Update
You can include the time part of the POSIXct datetime too. This will give you finer control over your upper & lower bounds:
date_seq <- seq(as.POSIXct("2012-01-01 00:00:00"),
as.POSIXct("2012-02-02 23:30:00"),
by=(30*60))
df <- data.frame(Date = strftime(date_seq, format="%Y-%m-%d"),
Time = strftime(date_seq, format="%H:%M:%S"))

R time series missing values

I was working with a time series dataset having hourly data. The data contained a few missing values so I tried to create a dataframe (time_seq) with the correct time value and do a merge with the original data so the missing values become 'NA'.
> data
date value
7980 2015-03-30 20:00:00 78389
7981 2015-03-30 21:00:00 72622
7982 2015-03-30 22:00:00 65240
7983 2015-03-30 23:00:00 47795
7984 2015-03-31 08:00:00 37455
7985 2015-03-31 09:00:00 70695
7986 2015-03-31 10:00:00 68444
//converting the date in the data to POSIXct format.
> data$date <- format.POSIXct(data$date,'%Y-%m-%d %H:%M:%S')
// creating a dataframe with the correct sequence of dates.
> time_seq <- seq(from = as.POSIXct("2014-05-01 00:00:00"),
to = as.POSIXct("2015-04-30 23:00:00"), by = "hour")
> df <- data.frame(date=time_seq)
> df
date
8013 2015-03-30 20:00:00
8014 2015-03-30 21:00:00
8015 2015-03-30 22:00:00
8016 2015-03-30 23:00:00
8017 2015-03-31 00:00:00
8018 2015-03-31 01:00:00
8019 2015-03-31 02:00:00
8020 2015-03-31 03:00:00
8021 2015-03-31 04:00:00
8022 2015-03-31 05:00:00
8023 2015-03-31 06:00:00
8024 2015-03-31 07:00:00
// merging with the original data
> a <- merge(data,df, x.by = data$date, y.by = df$date ,all=TRUE)
> a
date value
4005 2014-07-23 07:00:00 37003
4006 2014-07-23 07:30:00 NA
4007 2014-07-23 08:00:00 37216
4008 2014-07-23 08:30:00 NA
The values I get after merging are incorrect and they contain half-hourly values. What would be the correct approach for solving this?
Why is the merge result in 30-minute intervals when both my dataframes are hourly?
PS: I looked into this question: Fastest way for filling-in missing dates for data.table, and followed the steps, but it didn't help.
You can use the padr package to solve this problem.
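library(padr)
library(dplyr) # for the pipe operator
# pad() needs date to be a Date or POSIXct column; it inserts a row for each
# missing hour, with NA in value.
# fill_by_value() then replaces those NAs (with 0 by default); drop that step
# to keep them as NA.
data %>%
pad() %>%
fill_by_value()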
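If an extra package is not wanted, the merge attempted in the question can also be made to work in base R. A minimal sketch, assuming the date column is kept as POSIXct (not converted to text with format) and that both data frames use the same time zone; the toy values are taken from the question:

```r
# Toy hourly data with a gap; date is kept as POSIXct, in UTC for illustration.
data <- data.frame(
  date  = as.POSIXct(c("2015-03-30 22:00:00", "2015-03-30 23:00:00",
                       "2015-03-31 02:00:00"), tz = "UTC"),
  value = c(65240, 47795, 37455)
)

# Complete hourly sequence covering the range of the data.
full <- data.frame(date = seq(min(data$date), max(data$date), by = "hour"))

# Left join on the shared column name: hours absent from data get NA in value.
filled <- merge(full, data, by = "date", all.x = TRUE)
filled
```

Note that merge takes by (or by.x and by.y) as column names; the x.by and y.by arguments used in the question are not arguments of merge.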

Resources