How does rmongodb convert time (and how to do the reverse operation)?

I use rmongodb to query a MongoDB. I connect to the DB, which works nicely (require(rmongodb); mongo <- mongo.create("foo")), and I am generally able to get stuff out of the database. I just don't know what to do about the date formats.
TIME <- strptime("2013-11-11 15:00",format="%Y-%m-%d %H:%M",tz="CET")
query = mongo.bson.buffer.create()
mongo.bson.buffer.append(query, "timestamp", TIME)
query = mongo.bson.from.buffer(query)
when I look at this query it says:
timestamp : 9 1198930688
So mongo.bson.buffer.append has properly recognized that timestamp is a date class and does some conversion -- which I don't understand. This is not UNIX time, and I would not really care if the values returned from the database weren't in this format as well. I'm particularly puzzled because quite a few of these numeric date values are negative, while all my dates are from 2013. Some more examples:
# 2013-10-10 12:15 --> -1579369312
# 2013-10-10 12:30 --> -1578469312
# 2013-11-10 12:30 --> 1103530688
So basically my question is: How can I convert this funny date format (1198930688) back to POSIXct?
Thanks a lot!
skr

Try
myTIME <- mongo.bson.value( query, "timestamp" )
myTIME
[1] "2013-11-11 15:00:00 CET"

Related

How to filter data between date & time in sqlite

I have a table Orders whose Order_Date column has datatype smalldatetime, and my Order_Date format is 01/10/2018 10:00:00 PM.
Now I want to filter data between 01/10/2018 04:00:00 PM and 02/10/2018 04:00:00 AM.
What I tried
SELECT distinct(Order_No),Order_Date from Orders WHERE Order_Date BETWEEN '01/10/2018 04:00:00 PM' and '02/10/2018 04:00:00 AM'
This query shows only the 01/10/2018 data, but I want the data between 01/10/2018 04:00:00 PM and 02/10/2018 04:00:00 AM.
Is there any way to get the data from today 4 PM to the next day 4 AM?
First off, SQLite does not have actual date/time types. It's a simple database with only a few types. Your smalldatetime column actually has NUMERIC affinity (see the affinity rules).
For SQLite's built-in functions to be able to understand them, dates and times can be stored as numbers or text; numbers are either the number of seconds since the Unix epoch or a Julian day. Text strings can be in one of a number of formats; see the list in the documentation. All of these have the additional advantage that, when compared to other timestamps in the same format, they sort properly.
You seem to be using text strings like '01/10/2018 04:00:00 PM'. This is not one of the formats that SQLite's date and time functions understand, and it doesn't sort naturally, so you can't use it in comparisons other than testing equality. Plus it's ambiguous: is it October 1 or January 10? Depending on where you're from, you'll have a different interpretation of it.
If you change your timestamp format to a better one like '2018-10-01 16:00:00' (assuming October 1), you'll be able to sort, compare ranges, and use the values with SQLite's date and time functions.
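As a quick illustration, here is a minimal RSQLite sketch (the table contents below are made up for this example) showing that ISO-8601 strings compare correctly with BETWEEN:
library(RSQLite)
con <- dbConnect(SQLite(), ":memory:")
dbWriteTable(con, "Orders", data.frame(
  Order_No   = 1:3,
  Order_Date = c("2018-10-01 15:00:00",   # before the window
                 "2018-10-01 22:00:00",   # inside the window
                 "2018-10-02 03:30:00"))) # inside the window
dbGetQuery(con, "SELECT DISTINCT Order_No, Order_Date FROM Orders
                 WHERE Order_Date BETWEEN '2018-10-01 16:00:00'
                                      AND '2018-10-02 04:00:00'")
#   Order_No          Order_Date
# 1        2 2018-10-01 22:00:00
# 2        3 2018-10-02 03:30:00
dbDisconnect(con)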

How to convert sqlite dates to a date in R

I am working with a SQLite database table. I have pulled the data into R using the RSQLite package. One of the columns holds a date. SQLite is storing it as a real number, the number of days since noon in Greenwich on November 24, 4714 B.C. (e.g. 1264896000). Any ideas on how to convert this to a valid date in R? I tried the following
as.POSIXct(1264896000,origin = "-4714-11-24")
However, this doesn't work, as the character string is not in a standard form. Any ideas?
I suspected your claim about the origin was wrong and tested an alternative: the theory that these are POSIX date-times (origin 1970-01-01, with times in seconds) is supported by experiment.
> as.POSIXct(1264896000,origin = "1970-01-01")
[1] "2010-01-30 16:00:00 PST"

Date issue in R

I have some data, for example:
Date CAC Index
2014-10-10 4073,71
2014-10-17 4033,18
2014-10-24 4128,9
But when I read it into R with the XLConnect library I get the following:
wb<-loadWorkbook(file.choose())
lp<-getSheets(wb)
data=lapply(seq_along(lp),function(i) readWorksheet(wb,sheet=lp[i],startRow=1))[[1]]
data[,1]=as.character(data[,1])
tail(data,3)[,c(1,4)]
Date CAC.Index
719 2014-10-09 22:00:00 4073.71
720 2014-10-16 22:00:00 4033.18
721 2014-10-23 22:00:00 4128.90
Why don't I get the same dates? For example, I don't get 2014-10-24; instead I get 2014-10-23 22:00:00.
Might it be an issue with
ttz<-Sys.getenv('TZ')
Sys.setenv(TZ='GMT')
?
Best regards
I think it comes from importing the data as GMT and converting it to your local timezone, which seems to be GMT-2; therefore 2014-10-10 00:00 is shown as 2014-10-09 22:00.
Maybe you could solve this by specifying your tz according to the OlsonNames() list, or by specifying that your date column is a Date instead of a POSIXct.
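A minimal sketch of the timezone route, reusing the question's own reading code (set TZ to GMT while reading so midnight stays midnight, then restore it):
ttz <- Sys.getenv("TZ")
Sys.setenv(TZ = "GMT")
wb <- loadWorkbook(file.choose())
lp <- getSheets(wb)
data <- readWorksheet(wb, sheet = lp[1], startRow = 1)
Sys.setenv(TZ = ttz)  # restore the previous timezone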
It looks like it has converted the date string in the Excel file to a Date object in R. Try str(data) to see what the types are in your data.frame (a good habit to get into).
If it is a Date object, then you can use format to put it in the way you would like to read it. Something like:
##assuming data$Date is a Date class object
data$DateFormatted <- format(data$Date, format="%Y-%m-%d")
See ?format for other examples.
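If str(data) instead shows the column is a POSIXct shifted into a GMT-2 local zone, as the output above suggests, a hedged sketch of recovering the intended calendar date ("Etc/GMT+2" is the Olson name for UTC-2):
x <- as.POSIXct("2014-10-09 22:00:00", tz = "Etc/GMT+2")  # what was displayed
as.Date(x, tz = "GMT")  # convert back to GMT before dropping the time
# [1] "2014-10-10"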

Removing date from %d/%m/%Y %H:%M in R

The R code that I am working on is supposed to use data collected at five-minute intervals.
The data is saved in CSV format. However, due to inconsistency in the data collected, the time column sometimes represents a full timestamp instead of just a time (dd/mm/yyyy HH:MM instead of HH:MM).
This causes an error in my system, as the system reads the data as having multiple different values for the same time value. Therefore, I would like to omit the date part of the timestamp so that the code only reads the time value.
My failed attempt was:
as.Date(data[[1]],"%H:%M")
which gave me all NA values for the time column.
I have searched for similar questions on SO, but I did not manage to find a clear answer. Can anyone suggest some possible functions to use?
I appreciate your help.
You could just strip the date portion of the text and then use as.POSIXct to convert them all to a %H:%M timestamp, e.g.:
x <- c("10:25","01/01/2014 10:30")
x <- gsub("^.+(\\d{2}:\\d{2})$","\\1",x)
as.POSIXct(x,format="%H:%M",tz="UTC")
#[1] "2014-06-02 10:25:00 UTC" "2014-06-02 10:30:00 UTC"

BigQuery converting to a different timezone

I am storing data as Unix timestamps in Google BigQuery. However, when a user asks for a report, she will need the data filtered and grouped by her local timezone.
The data is stored in GMT. The user may wish to see the data in EST. The report may ask the data to be grouped by date.
I don't see a timezone conversion function in the documentation.
Does anyone know how I can do this in BigQuery, i.e. how do I group by date after converting the timestamp to a different timezone?
Standard SQL in BigQuery has built-in functions:
DATE(timestamp_expression, timezone)
TIME(timestamp, timezone)
DATETIME(timestamp_expression, timezone)
Example:
SELECT
original,
DATETIME(original, "America/Los_Angeles") as adjusted
FROM sometable;
+---------------------+---------------------+
| original | adjusted |
+---------------------+---------------------+
| 2008-12-25 05:30:00 | 2008-12-24 21:30:00 |
+---------------------+---------------------+
You can use standard IANA timezone names or offsets.
As of September 2016 BigQuery has adopted standard SQL and you can now just use the "DATE(timestamp, timezone)" function to offset for a timezone. You can reference their docs here:
BigQuery DATE docs
To those that stumble here:
How to convert a timestamp to another timezone?
Given that TIMESTAMP values, once constructed, are stored as UTC, and that TIMESTAMP does not have a (TIMESTAMP, STRING) constructor, you can convert a timestamp to another time zone by transforming it first to a DATETIME and then constructing the new TIMESTAMP from the DATETIME in the new timezone:
SELECT TIMESTAMP(DATETIME(timestamp_field, '{timezone}'))
Example:
SELECT
input_tz,
input,
'America/Montreal' AS output_tz,
TIMESTAMP(DATETIME(input,'America/Montreal')) AS output
FROM (
SELECT 'US/Pacific' AS input_tz, TIMESTAMP(DATETIME(DATE(2021, 1, 1), TIME(16, 0, 0)), 'US/Pacific') AS input
UNION ALL
SELECT 'UTC' AS input_tz, TIMESTAMP(DATETIME(DATE(2021, 1, 1), TIME(16, 0, 0)), 'UTC') AS input
UNION ALL
SELECT 'Europe/Berlin' AS input_tz, TIMESTAMP(DATETIME(DATE(2021, 1, 1), TIME(16, 0, 0)), 'Europe/Berlin') AS input
) t
results in:
+-----+---------------+-------------------------+------------------+-------------------------+
| Row | input_tz      | input                   | output_tz        | output                  |
+-----+---------------+-------------------------+------------------+-------------------------+
| 1   | US/Pacific    | 2021-01-02 00:00:00 UTC | America/Montreal | 2021-01-01 19:00:00 UTC |
| 2   | UTC           | 2021-01-01 16:00:00 UTC | America/Montreal | 2021-01-01 11:00:00 UTC |
| 3   | Europe/Berlin | 2021-01-01 15:00:00 UTC | America/Montreal | 2021-01-01 10:00:00 UTC |
+-----+---------------+-------------------------+------------------+-------------------------+
How to strip time zone info from a DATETIME value?
DATETIME values in BigQuery are time zone naive: they do not contain timezone info. That said, if you have business knowledge that tells you the timezone of a DATETIME, you can strip that timezone offset by converting it to a TIMESTAMP with the known timezone:
SELECT TIMESTAMP(datetime_value, '{timezone}')
Given that the TIMESTAMP stores the value in UTC, you can then re-convert to DATETIME if that's your preferred method of storage, but now you'll know that your DATETIME is in UTC :)
Hopefully this can be helpful! :)
Your premise is right. If you group like this, then users who want EST or EDT will get incorrect date grouping:
GROUP BY UTC_USEC_TO_DAY(ts_field)
But as long as you figure out the offset that your user wants, you can still do the full calculation on the server. For example, since EST is 5 hours behind UTC, you can query like this:
GROUP BY UTC_USEC_TO_DAY(ts_field - (5*60*60*1000*1000000) )
Just parameterize the "5" to be the offset in hours, and you're all set. Here's a sample based on one of the sample data sets:
SELECT
COUNT(*) as the_count,
UTC_USEC_TO_DAY(timestamp * 1000000 - (5*60*60*1000*1000000) ) as the_day
FROM
[publicdata:samples.wikipedia]
WHERE
comment CONTAINS 'disaster'
and timestamp >= 1104537600
GROUP BY
the_day
ORDER BY
the_day
You can remove the offset to see how some edits move to different days.
To convert a DateTime string in any timezone to UTC, one can use PARSE_TIMESTAMP with the supported TIMESTAMP formats in BigQuery.
For example, to convert an IST (Indian Standard Time) string to UTC, use the following:
SAFE.PARSE_TIMESTAMP("%a %b %d %T IST %Y", timeStamp_vendor, "Asia/Kolkata")
Here PARSE_TIMESTAMP parses the IST string into a UTC TIMESTAMP (not a string). Adding the SAFE prefix takes care of errors, nulls, etc.
To convert this to a readable string format in BigQuery, use FORMAT_TIMESTAMP as follows:
FORMAT_TIMESTAMP("%d-%b-%Y %T %Z", SAFE.PARSE_TIMESTAMP("%a %b %d %T IST %Y", timeStamp_vendor, "Asia/Kolkata"))
This example would take an IST string of the format Fri May 12 09:45:12 IST 2019 and convert it to 12-May-2019 04:15:12 UTC.
Replace IST with the required timezone abbreviation and Asia/Kolkata with the relevant timezone name to achieve the conversion for your timezone.
2016 update: see the answers below; BigQuery now provides timestamp and timezone methods.
You are right - BigQuery doesn't provide any timestamp conversion methods.
In this case, I suggest that you run your GROUP BY based on dimensions of the GMT/UTC timestamp field, and then convert and display the result in the local timezone in your code.
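For instance, a minimal client-side sketch in R (the column name ts is made up for illustration; assume the query returned UTC timestamps):
df <- data.frame(ts = as.POSIXct("2008-12-25 05:30:00", tz = "UTC"))
format(df$ts, tz = "America/New_York", usetz = TRUE)  # display in EST/EDT
# [1] "2008-12-25 00:30:00 EST"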
For me, the TIMESTAMP_SUB and TIMESTAMP_ADD functions did the job. When I needed to convert a timestamp from UTC to PST I used:
TIMESTAMP_SUB(`timestamp`, INTERVAL 8 HOUR)
Note that a fixed offset like this gives PST specifically; it does not follow daylight saving time.
