I have a simple (and long) pandas DataFrame with a DatetimeIndex and prices. I am trying to create a new column ['ewm_12'] which, every 120 minutes, picks the 11 previous rows, appends the current price, and calculates the ewm.
I am trying to do it in a vectorized fashion, as the DataFrame is long, using the following code:
dftemp['ewm_12'] = dftemp.loc[dftemp.index::120][-11:].append(dftemp.loc[(dftemp.index)])[dfs].ewm(min_periods=12,span=12, adjust = True).mean()[-1:][0]
TypeError: Cannot convert input [DatetimeIndex(['2018-01-01 22:00:00+00:00', '2018-01-01 22:01:00+00:00','2019-01-18 22:00:00+00:00'],dtype='datetime64[ns, Europe/London]', length=394561, freq=None)] of type to Timestamp
This seems very strange: if I pick just one row, dftemp.index returns a Timestamp, but when I ask it to iterate over the whole df.index it says it cannot convert the DatetimeIndex (which is a collection of Timestamps). I can do it with a for loop, but that takes several minutes, and I am sure there must be a vectorized way. If someone knows, please help.
I want to create a date range from 01-01-2005 till 23-01-2015. Is it possible to populate such a date range in Azure Data Factory (specifically in mapping data flows)? If yes, which function should one use to do this?
Thank you!
You can use an Until activity to loop from the start date to the end date and an Append Variable activity to append each date, building up the range of dates.
Create variables for the start date, the end date, and the number of days to be appended.
Until activity:
increment day by 1 : @string(add(int(variables('add_count')),1))
days to add : @variables('count')
Append date to list : @formatdatetime(adddays(formatdatetime(variables('start'),'MM-dd-yyyy'),int(variables('count'))),'MM-dd-yyyy')
max date from list : @adddays(formatdatetime(variables('start')),int(variables('count')))
output list of dates:
I am currently using a vector to extract certain rows from my data set based on time (formatted as POSIXct):
Vector.Time <- c('2020-03-06 10:09:11',
'2020-03-06 10:13:11',
'2020-03-06 10:18:12')
One of the instruments I am using logs data at the end of each minute, so I need to reference a second vector where 1-minute is added to all the values in the original vector. Is there a simple way of doing this without having to create a new vector?
Use minutes() from lubridate:
library(lubridate)
as.POSIXct(Vector.Time) + minutes(1)
You can add/subtract time for a POSIXct object using base R; the arithmetic is done in seconds. So to add 1 minute to Vector.Time you can add 60 seconds:
as.POSIXct(Vector.Time) + 60
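If the goal is to avoid keeping a separate shifted vector around, the addition can also be done inline at the point of use. A minimal sketch, assuming a hypothetical data frame df with a POSIXct column Time recorded in the same time zone as Vector.Time:
# shift the reference times by one minute inline (no separate vector stored)
# and keep the logger rows whose timestamps match those shifted times
df[df$Time %in% (as.POSIXct(Vector.Time) + 60), ]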
My data contains several measurements per day. It is stored in a CSV file and looks like this:
[screenshot of the raw data]
The V1 column is of factor type, so I'm adding an extra column of date-time type: vd$Vdate <- as_datetime(vd$V1):
[screenshot of the data with the new Vdate column]
Then I'm trying to convert the vd data into a time series: vd.ts <- ts(vd, frequency = 365)
But then the dates are gone:
[screenshot of the ts output with the dates gone]
I just cannot see what I am doing wrong! Could someone help me, please?
Your dates are gone because you need to build the ts data from your variables (V1, ..., V7) disregarding the date field; the ts() call is what tells R how to structure the dates.
Also, I noticed that you have what seems like hourly data, so you need to provide a frequency appropriate to your timescale, not 365. Given what you posted, your frequency looks a bit odd, and I recommend finding a way to establish it correctly. For example, if I have hourly data for every day of the year, then the frequency is 365.25*24 (the 0.25 accounts for leap years).
So the following is just an example; it still won't work properly with what I see (it is a limited view of your dataset, so I am not 100% sure):
# Build ts data (univariate)
vd.ts <- ts(vd$V1, frequency = 365, start = c(2019, 4))
# check to see if it is structured correctly
print(vd.ts, calendar = TRUE)
Finally my time series is working properly. I used
ts <- zoo(measurements, date_times)
and I found out that date_times had to be converted with as_datetime(), as otherwise the values were of character type. The measurements are converted into a data.frame.
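A minimal sketch of that final setup, assuming the raw data frame is vd with the timestamps in V1 and the measurements in the remaining columns (any other names here are assumptions):
library(zoo)
library(lubridate)
# convert the character/factor timestamps to real date-times
date_times <- as_datetime(vd$V1)
# coerce the numeric measurement columns to a matrix for zoo
measurements <- as.matrix(vd[, -1])
# index the measurements by the actual date-times instead of a fixed frequency
vd.zoo <- zoo(measurements, order.by = date_times)
head(vd.zoo)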
I have 10 million+ data points which look like:
Identifier Times Data
6597104 2015-05-01 04:08:05 0.15512575543732
In order to study these I want to add a Period (1, 2,...) column so the oldest row with the 6597104 identifier is period 1 and the second oldest is period 2 etc. However the times come irregularly so I can't just make it a time series object.
Does anyone know how to do this? Thanks in advance
Let's call your data frame data
First sort it using
data <- data[order(data$Times), ]  # order() gives row indices; ascending puts the oldest rows first
Then add a new column called Period
for (i in 1:nrow(data)) {
  data$Period[i] <- paste("Period", i, sep = " ")
}
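With 10 million+ rows a row-by-row loop will be slow, and the question wants the numbering to restart for each Identifier. A vectorized sketch in base R, assuming Times is already POSIXct (column names taken from the sample data):
# sort by identifier, then by time, so the oldest row in each group comes first
data <- data[order(data$Identifier, data$Times), ]
# number the rows within each Identifier: oldest = 1, next oldest = 2, ...
data$Period <- ave(seq_len(nrow(data)), data$Identifier, FUN = seq_along)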
I am new to SSRS.
I have a dataset; it gets its data from a stored procedure.
One of the parameters of my stored procedure is StartDate and another is EndDate; their type is datetime.
And the table has a datetime column called Date.
I have two gauges and I want to bind integer values to them.
The first is the count of rows where Date < DateAdd(DateInterval.Hour, 24, StartDate),
and the second is the count of rows where Date > DateAdd(DateInterval.Hour, 24, StartDate).
How do I write the exact expression? Whatever I wrote is not working.
I appreciate any help, thanks.
You need to set the gauge Pointer value as something like:
=Sum(IIf(DateDiff(DateInterval.Day, Parameters!StartDate.Value, Fields!Date.Value) >= 1
, 1
, 0))
This counts rows where the Date is at least a day after the StartDate parameter. Just swap the 1 and 0 to get those where the difference is less than a day:
=Sum(IIf(DateDiff(DateInterval.Day, Parameters!StartDate.Value, Fields!Date.Value) >= 1
, 0
, 1))
Worked fine for me in a quick test: