Proper handling of date operations in Gremlin

I am using AWS Neptune Gremlin with gremlin_python.
My date in property is stored as datetime as required in Neptune specs.
I created it using Python code like this:
properties_dict['my_date'] = datetime.fromtimestamp(my_date, timezone.utc)
and then constructed the Vertex with properties:
for prop in properties:
    query += """.property("%s", "%s")"""%(prop, properties[prop])
Later when interacting with the constructed graph, I am only able to find the vertices by an exact string matching query like the following:
g.V().hasLabel('Object').has("my_date", "2017-12-01 00:00:00+00:00").valueMap(True).limit(3).toList()
What's the best way for dealing with date or datetime in Gremlin?
How can I do range queries such as "give me all Vertices that have date in year 2017"?
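A likely reason only exact string matching works: the query builder above wraps every value in quotes via "%s", so the datetime is probably sent to the server as its string form rather than as a date. A minimal sketch in plain Python, where the timestamp 1512086400 is an assumed example value and not taken from the original data:

```python
from datetime import datetime, timezone

# Assumed example value: seconds since epoch for 2017-12-01 00:00:00 UTC.
my_date = 1512086400

value = datetime.fromtimestamp(my_date, timezone.utc)

# Same formatting pattern as the query builder in the question.
step = """.property("%s", "%s")""" % ("my_date", value)

print(step)
```

The printed step carries the date as a quoted string, which matches the literal used in the exact-match query.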

Personally, I prefer to store date/time values as days/seconds/milliseconds since epoch. This will definitely work on any Graph DB and makes range queries much simpler. Also, the conversion to days or seconds since epoch and back should be a simple method call in pretty much any language.
So, when you create your properties dictionary, you could simplify your code by changing it to:
properties_dict['my_date'] = my_date
... as my_date should represent the number of seconds since epoch. And a range query would be as simple as:
g.V().has("Object", "my_date", P.between(startTimestamp, endTimestamp)).
limit(3).valueMap(True)
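For the "all vertices in year 2017" case, the two bounds could be computed as sketched below. This assumes my_date holds seconds since epoch in UTC; the helper name year_bounds is made up for illustration. In TinkerPop, P.between includes the first bound and excludes the second, so the start of 2018 serves as the upper bound.

```python
from datetime import datetime, timezone

def year_bounds(year):
    """Return (start, end) epoch seconds covering `year` in UTC; end is exclusive."""
    start = datetime(year, 1, 1, tzinfo=timezone.utc)
    end = datetime(year + 1, 1, 1, tzinfo=timezone.utc)
    return int(start.timestamp()), int(end.timestamp())

start_ts, end_ts = year_bounds(2017)

# With gremlin_python (not imported here), the range query would then be:
#   g.V().has("Object", "my_date", P.between(start_ts, end_ts)).valueMap(True).toList()

# The same half-open check applied to plain Python values:
samples = [1483228799, 1483228800, 1512086400, 1514764800]
in_2017 = [ts for ts in samples if start_ts <= ts < end_ts]
```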

Related

How do we convert string to number in a gremlin step?

I have a graph which has vertex E1 with property "price" and "name" which are storing String values. I need to calculate sum of the column "price" grouped by "name". I am writing the below query using Java:
g.withSideEffect("Neptune#repeatMode","BFS")
.V().hasLabel("E1").group()
.by("name").by(values("price").unfold().sum())
.unfold()
.project("rowName","data")
.by(select(keys).properties(MandatoryCustomerAttributes.firstName.name()).value())
I am getting this error:
{
"requestId": "38b781ce-fc02-4f7d-a71e-476dfd1925ce",
"code": "UnsupportedOperationException",
"detailedMessage": "java.lang.String cannot be cast to java.lang.Number"
}
Please help me in converting the String to any number format so that I can do some mathematical operations.
At this time, you cannot do such a conversion with Gremlin steps unless you use a lambda step. Lambdas are not always available, depending on the graph database you are using; since you tagged this question with Neptune, you definitely can't take that approach, and it isn't advisable anyway. You would need to store your data natively as a number, or do the conversion and the related mathematical calculation within your application. There is a possibility that Gremlin will address this limitation in 3.7.0 as part of the various primitive operations aimed at String.
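Doing the conversion in the application, as suggested, might look like the sketch below. It is written in Python rather than the Java of the question, and it assumes the name/price pairs have already been fetched from the graph as strings; the sample rows are invented. Decimal is used instead of float to avoid rounding surprises with prices.

```python
from collections import defaultdict
from decimal import Decimal

# Invented sample rows standing in for values fetched from the graph.
rows = [
    {"name": "a", "price": "10.50"},
    {"name": "b", "price": "3"},
    {"name": "a", "price": "4.25"},
]

def sum_price_by_name(rows):
    """Convert each string price to Decimal and sum per name."""
    totals = defaultdict(Decimal)  # Decimal() is zero
    for row in rows:
        totals[row["name"]] += Decimal(row["price"])
    return dict(totals)

totals = sum_price_by_name(rows)
```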

Neo4j / Good way to retrieve nodes created from a specific startDate

Let's suppose this Cypher query (Neo4j):
MATCH(m:Meeting)
WHERE m.startDate > 1405591031731
RETURN m.name
With millions of Meeting nodes in the graph, which strategy should I choose to make this kind of query fast?
Indexing the Meeting's startDate property?
Indexing it but with a LuceneTimeline?
Avoiding index and preferring such a structure?
However, this structure seems to be relevant for querying by a range of dates (FROM => TO), not for just a From.
I have no use cases where I would query a range: FROM this startDate TO this endDate.
By the way, it seems that simple indexes work only when dealing with equality... (not comparison like >).
Any advice?
Take a look at this answer: How to filter edges by time stamp in neo4j?
When selecting nodes using relational operators, it is best to select on an intermediate node that is used to group your meeting nodes into a discrete interval of time. When adding meetings to the database you would determine which interval each timestamp occurred within and get or create the intermediate node that represents that interval.
Assuming your timestamps are in milliseconds, you could run the following query from the Neo4j shell on your millions of meeting nodes to group the meetings into intervals of 10 seconds.
MATCH (meeting:Meeting)
MERGE (interval:Interval { timestamp: toInt(meeting.timestamp / 10000) })
MERGE (meeting)-[:ON]->(interval);
Then for your queries you could do:
MATCH (interval:Interval) WHERE interval.timestamp >= toInt(1405591031731 / 10000)
WITH interval
MATCH (interval)<-[:ON]-(meeting:Meeting)
WHERE meeting.timestamp > 1405591031731
RETURN meeting
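The bucketing idea is independent of Cypher. Here is a small sketch, assuming millisecond timestamps and the same 10-second interval; the dictionary stands in for the Interval nodes, and all names and sample meetings are illustrative. Interval numbers are timestamps divided by 10000, so the query value has to be scaled the same way, and the boundary interval still needs a per-meeting timestamp check.

```python
from collections import defaultdict

INTERVAL_MS = 10_000  # 10 seconds, matching the division by 10000 above

def bucket_of(timestamp_ms):
    """Map a millisecond timestamp to its interval number."""
    return timestamp_ms // INTERVAL_MS

def build_index(meetings):
    """Group (name, timestamp_ms) pairs by interval, like the :ON relationships."""
    index = defaultdict(list)
    for name, ts in meetings:
        index[bucket_of(ts)].append((name, ts))
    return index

def meetings_after(index, start_ms):
    """Return names of meetings strictly after start_ms, visiting only candidate intervals."""
    first = bucket_of(start_ms)
    return sorted(
        name
        for bucket, items in index.items()
        if bucket >= first
        for name, ts in items
        if ts > start_ms
    )

meetings = [("old", 1405591020000), ("same-bucket", 1405591035000), ("later", 1405591100000)]
index = build_index(meetings)
result = meetings_after(index, 1405591031731)
```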

F# query expressions - restriction using string comparison in SqlProvider with SQLite

SQLite doesn't really have date columns. You can store your dates as ISO-8601 strings, or as the integer number of seconds since the epoch, or as Julian day numbers. In the table I'm using, I want my dates to be human-readable, so I've chosen to use ISO-8601 strings.
Suppose I want to query all the records with dates after today. The ISO-8601 strings will sort properly, so I should be able to use string comparison with the ISO-8601 string for today's date.
However, I see no way to do the comparison using the F# SqlProvider type provider. I'm hoping that this is just a reflection of my lack of knowledge of F# query expressions.
For instance, I can't do:
query {
    for calendarEntry in dataContext.``[main].[calendar_entries]`` do
    where (calendarEntry.date >= System.DateTime.Today.ToString("yyyy-MM-dd hh:mm:ss"))
    ... }
I get:
The binary operator GreaterThanOrEqual is not defined for the types 'System.String' and 'System.String'.
I also can't do any variation of:
query {
    for calendarEntry in dataContext.``[main].[calendar_entries]`` do
    where (calendarEntry.date.CompareTo(System.DateTime.Today.ToString("yyyy-MM-dd hh:mm:ss")) >= 0)
    ... }
I get:
Unsupported expression. Ensure all server-side objects appear on the left hand side of predicates. The In and Not In operators only support the inline array syntax.
Anyone know how I might do string comparisons in the where clause? It seems that my only option for filtering inside the query is to store seconds-since-epoch in the database and use integer comparisons.
This was a temporary bug in an old SQLProvider version, and it should work now. If it does not, please open a new issue in the GitHub repository: https://github.com/fsprojects/SQLProvider
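On the premise of the question: ISO-8601 strings do compare in chronological order as plain strings, provided every value uses the same fixed-width format and a 24-hour clock (HH in .NET format strings; hh is the 12-hour form). A quick check in Python, with made-up sample values:

```python
from datetime import datetime

FORMAT = "%Y-%m-%d %H:%M:%S"  # fixed width, 24-hour clock

stamps = ["2024-03-01 09:00:00", "2023-12-31 23:59:59", "2024-03-01 13:30:00"]

# Sorting as strings and sorting as parsed datetimes give the same order.
by_string = sorted(stamps)
by_datetime = sorted(stamps, key=lambda s: datetime.strptime(s, FORMAT))

# A 12-hour clock breaks this: 1:30 PM renders as "01:30" and sorts before 9 AM.
twelve_hour = [datetime.strptime(s, FORMAT).strftime("%Y-%m-%d %I:%M:%S") for s in stamps]
```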

Fetching documents in xquery

I need to fetch documents from DB in xquery between dates [from date and to date].
From Date - 30 days before from Current Date
To Date - current date
In every document, I have an attribute named "loadDate". I have to fetch without creating an index for this attribute. Is that possible?
Please help.
Thanks,
-N
Assuming that your 'loadDate' attribute has type xs:date, and making up an imaginary structure for your documents, it sounds as if your query is simply:
/myns:doc
[@loadDate gt (current-date() - xs:dayTimeDuration('P30D'))]
Such a query might be slower without an index, but why would it not be possible? In a declarative query language, the general principle is that the existence of an index should not change the meaning of any query, only the speed with which it can be evaluated.
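The same point can be illustrated outside XQuery: without an index the filter is just a scan over every document. A sketch in Python, assuming loadDate is an ISO date (YYYY-MM-DD) and inventing the element name and ids; unlike the XPath above, it also applies the "to date" upper bound from the question.

```python
import xml.etree.ElementTree as ET
from datetime import date, timedelta

def loaded_within(docs, today, days=30):
    """Return docs whose loadDate is after (today - days) and not after today."""
    cutoff = today - timedelta(days=days)
    recent = []
    for doc in docs:  # linear scan: no index involved
        load_date = date.fromisoformat(doc.get("loadDate"))
        if cutoff < load_date <= today:
            recent.append(doc)
    return recent

# Invented documents; the element name and ids are assumptions.
docs = [
    ET.fromstring('<doc id="1" loadDate="2024-05-20"/>'),
    ET.fromstring('<doc id="2" loadDate="2024-04-01"/>'),
    ET.fromstring('<doc id="3" loadDate="2024-06-01"/>'),
]
recent_ids = [d.get("id") for d in loaded_within(docs, today=date(2024, 6, 1))]
```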

How to store and get datetime value in SQLite

My table contains a Birthdate field whose datatype is datetime.
I want to get all records having birthday today.
How can I get it?
Try this query:
SELECT * FROM mytable
WHERE strftime('%m-%d', 'now') = strftime('%m-%d', birthday)
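A way to try this query from Python's sqlite3 module is sketched below. The table contents and the name column are invented, and the reference date is passed as a parameter so the result is reproducible; passing 'now' instead gives the behaviour of the query above.

```python
import sqlite3

def birthdays_on(conn, today):
    """Return names whose birthday month and day match `today` (an ISO date string)."""
    rows = conn.execute(
        "SELECT name FROM mytable "
        "WHERE strftime('%m-%d', ?) = strftime('%m-%d', birthday) "
        "ORDER BY name",
        (today,),
    )
    return [name for (name,) in rows]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (name TEXT, birthday TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [("Ann", "1976-12-03"), ("Bob", "1990-07-15"), ("Cy", "2001-12-03 08:30:00")],
)
names = birthdays_on(conn, "2024-12-03")
```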
Having a special datetime type has always seemed like unnecessary overhead to me; integers are fast, flexible, and use less space.
For general datetime values use Unix Epoch timestamps. They are easy to work with, extremely flexible, and timezone (and even calendar!) agnostic. (I recently wrote an article on using them, which I really have to plug...)
That said, if you're only interested in dates in the Gregorian calendar you may want to use a large integer in the following format: YYYYMMDD, e.g. 19761203. For your particular usage you could even create a four-digit integer like MMDD, say 1203. That's got to result in fast selects!
SQLite has very poor support for storing dates. You can use the method suggested by Nick D above, but bear in mind that this query will result in a full table scan, since dates are not indexed correctly in SQLite (actually, SQLite does not support dates as a built-in type at all).
If you really want to do a fast query then you'll have to add a separate (integral) column for storing the birth day (1-31) and attach an index for it in the database.
If you only want to compare dates then you can add a single (INTEGER) column that will store the date UTC value (but this trick won't allow you to search for individual date components easily).
Good Luck
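The separate indexed integer column suggested in the answers could look like the sketch below. It combines the MMDD encoding with the index advice; the column name birth_mmdd, the index name, and the sample rows are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (name TEXT, birthday TEXT, birth_mmdd INTEGER)")
conn.execute("CREATE INDEX idx_birth_mmdd ON mytable (birth_mmdd)")

def add_person(conn, name, birthday):
    """Store the ISO birthday plus a derived MMDD integer, e.g. '1976-12-03' -> 1203."""
    mmdd = int(birthday[5:7]) * 100 + int(birthday[8:10])
    conn.execute("INSERT INTO mytable VALUES (?, ?, ?)", (name, birthday, mmdd))

for name, birthday in [("Ann", "1976-12-03"), ("Bob", "1990-07-15"), ("Cy", "2001-12-03")]:
    add_person(conn, name, birthday)

# Equality on the indexed integer column; no strftime call per row.
query = "SELECT name FROM mytable WHERE birth_mmdd = ? ORDER BY name"
names = [name for (name,) in conn.execute(query, (1203,))]

# EXPLAIN QUERY PLAN reports whether SQLite would use the index for this query.
plan = " ".join(str(row[-1]) for row in conn.execute("EXPLAIN QUERY PLAN " + query, (1203,)))
```

The trade-off is that the derived column has to be kept in sync with the birthday value on every insert or update.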
