Gremlin query is not returning any hits when using both union and choose steps

I used the Gremlin query below to fetch records, but it returns 0 hits even though matching records exist.
g.V().hasLabel('LABEL').
union(hasLabel('LABEL')).as('vertex').
select('vertex').
choose(
has('core.classType'),
values('core.classType')
).groupCount('core.classType').
select('vertex')
When I ran the same query without either the union or the choose step, it worked fine.
Examples:
Without union:
g.V().hasLabel('LABEL').as('vertex').
select('vertex').
choose(
has('core.classType'),
values('core.classType')
).groupCount('core.classType').
select('vertex')
Without choose:
g.V().hasLabel('LABEL').
union(hasLabel('LABEL')).as('vertex').
select('vertex')
When I used sideEffect instead of storing the results with the as step, it also worked fine:
g.V().hasLabel('LABEL').
union(hasLabel('LABEL')).sideEffect(choose(
has('core.classType'),
values('core.classType')
).
groupCount('core.classType'))
I want to get the hits while using both the union and choose steps. Can someone please explain why this happens? Do we need barrier steps to handle this?

Related

how to execute a query after getting response of another query?

I need to pass the result of the first query as an input parameter to my second query, and I also want to know how to write multiple queries.
In my use case, the second query can only be traversed using the result of the first query, and it needs a loop (similar to a for loop).
const query1 = g.V().hasLabel('Province').has('code',market).inE('partOf').outV().has('type',state).values('code').as('state')
After executing query1,the result is
res=[{id1},{id2},........]
query2 = select('state').repeat(has('code',res[0]).inE('partOf').outV().has('type',city).value('name')).times(${res.length-1}).as('city')
I'm assuming that your first query tries to find "states by market", where the market is a variable you intend to pass to your query. If that is correct, then your first query simplifies to:
g.V().hasLabel('Province').has('code',market).
in('partOf').
has('type','state').values('code')
So, prefer in() to inE().outV() when no filtering on edge properties is needed.
Your second query doesn't look like valid Gremlin, but maybe you were just trying to provide an example of what you wanted to do. You wrote:
select('state').
repeat(has('code',res[0]).
inE('partOf').outV().
has('type',city).value('name')).
times(${res.length-1}).as('city')
and I assume that means you want to use the states found in the first query to find their cities. If that's what you're after you can simplify this to a single query of:
g.V().hasLabel('Province').has('code',market).
in('partOf').has('type','state').
in('partOf').has('type','city').
values('name')
If you need data about the state and the city as part of the result then consider project():
g.V().hasLabel('Province').has('code',market).
in('partOf').has('type','state').
project('state','cities').
by('code').
by(__.in('partOf').has('type','city').
values('name').
fold())

Gremlin: Rollback the query if an exception occurs

I am trying to submit a batch-like operation that creates multiple vertices and edges in the same query.
g.addV('os').property('name', 'linux').as('linux').
addV('os').property('name', 'windows').as('windows').
addV('os').property('name', 'mac').as('mac').
addE('competitor').from('linux').to('UNEXISTING OS'). // fails here
addE('competitor').from('linux').to('windows').
addE('competitor').from('windows').to('mac').
addE('competitor').from('linux').to('mac').
iterate()
The query is constructed to intentionally fail, however all vertices before the failing line are being created.
Is it possible to achieve a kind of transaction for the whole query? So that if one subquery fails, it should rollback the ones that were previously executed.
Thanks!
The query could not be tested in the Gremlin Console with TinkerGraph because, per the TinkerPop documentation, the built-in TinkerGraph object does not support transactions.
But, as cygri pointed out, AWS Neptune supports transactions (see here). The request runs as a single transaction whether it is submitted as the OP's original query or as several queries separated by a semicolon (;) or a newline character (\n):
g.addV('os').property('name', 'linux').next();
g.addV('os').property('name', 'windows').next();
g.addE('competitor').from(__.V('1101')).to(__.V('1102'))
You can also use Gremlin sessions: create a sessioned connection, and the queries are rolled back if an error occurs.

db2 UDB count(*) returns 0 from the view, but select * returns valid data

I have encountered a strange situation in DB2 UDB V11.
When I run SELECT COUNT(*) FROM view_name it returns 0.
However, when I run SELECT * FROM view_name the data is returned properly.
I have tried dropping and re-creating the view, and I ran REORG and RUNSTATS on the underlying table.
Has anyone seen this situation before?
I have seen this before when an MQT (materialized query table) was involved. The optimizer picks the best way to run a query and can rewrite it to use the MQT, so this situation can happen when the MQT has not been refreshed but rows in the table itself have already been updated or deleted.
So check if any MQTs are involved.

How to query GA export to BQ schema with hits.customDimensions.index in WHERE clause?

Until a few days ago, the following query worked fine on BQ with the schema generated by an export from GA:
SELECT hits.customDimensions.value
FROM TABLE_DATE_RANGE([88399188.ga_sessions_], TIMESTAMP('20150623'), TIMESTAMP('20150623'))
WHERE hits.customDimensions.index=14
LIMIT 1000
Now, I get the following error:
Error: Cannot query the cross product of repeated fields customDimensions.index and hits.customDimensions.index.
Interestingly, the following query works fine (i.e. without the WHERE clause):
SELECT hits.customDimensions.value
FROM TABLE_DATE_RANGE([88399188.ga_sessions_], TIMESTAMP('20150623'), TIMESTAMP('20150623'))
LIMIT 1000
Also, the following query works fine:
SELECT hits.customDimensions.value
FROM [88399188.ga_sessions_20150623]
WHERE hits.customDimensions.index=14
LIMIT 1000
Note that the FROM clause is the only difference between this one and the failing query; even though they are supposed to resolve to the exact same query. Please help! What am I doing wrong?
The problem is that customDimensions and hits are both REPEATED RECORDs, and each can repeat independently of the other. Therefore, selecting hits.customDimensions.value while filtering on hits.customDimensions.index is not well defined. If, for example, you want to skip the entire record when none of the hits.customDimensions.index values is 14, you can use the following query:
SELECT hits.customDimensions.value
FROM TABLE_DATE_RANGE(
[88399188.ga_sessions_], TIMESTAMP('20150623'), TIMESTAMP('20150623'))
OMIT RECORD IF EVERY(hits.customDimensions.index != 14)
LIMIT 1000
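To make the record-level behaviour concrete, here is a minimal Python sketch that models sessions as nested lists and mimics what OMIT RECORD IF EVERY(hits.customDimensions.index != 14) keeps. The sample data and the names keep_record and selected_values are hypothetical; the sketch only illustrates the filtering logic, not how BigQuery executes it.

```python
# Hypothetical sample data: every session has repeated hits,
# and every hit has repeated customDimensions.
sessions = [
    {"hits": [{"customDimensions": [{"index": 14, "value": "a"},
                                    {"index": 2, "value": "b"}]}]},
    {"hits": [{"customDimensions": [{"index": 3, "value": "c"}]}]},
]


def keep_record(session):
    """Mimic OMIT RECORD IF EVERY(index != 14).

    The record is omitted when every index differs from 14, so it is
    kept when at least one index equals 14. In this sketch a session
    with no custom dimensions at all is also dropped (an assumption).
    """
    indexes = [cd["index"]
               for hit in session["hits"]
               for cd in hit["customDimensions"]]
    return not all(i != 14 for i in indexes)


def selected_values(all_sessions):
    """Return every customDimensions value from the records that are kept."""
    return [cd["value"]
            for s in all_sessions if keep_record(s)
            for hit in s["hits"]
            for cd in hit["customDimensions"]]
```

Note that a kept record contributes all of its values, including those whose index is not 14, because the filter acts on whole records rather than on individual custom dimensions.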
This anomaly is no longer reproducible. Though I have not gotten confirmation from Google, I have to assume this was a temporary bug that got fixed.

BigQuery error: Cannot query the cross product of repeated fields

I am running the following query on Google BigQuery web interface, for data provided by Google Analytics:
SELECT *
FROM [dataset.table]
WHERE
  hits.page.pagePath CONTAINS "my-fun-path"
I would like to save the results into a new table, but I get the following error message when using Flatten Results = False:
Error: Cannot query the cross product of repeated fields
customDimensions.value and hits.page.pagePath.
This answer implies that this should be possible: Is there a way to select nested records into a table?
Is there a workaround for the issue found?
Depending on what kind of filtering is acceptable to you, you may be able to work around this by switching from WHERE to OMIT IF. It will give different results, but perhaps such different results are acceptable.
The following removes an entire hit record unless a page inside it meets the criterion. Note two things here:
It uses OMIT hits IF instead of the more commonly used OMIT RECORD IF.
The condition is inverted, because OMIT IF is the opposite of WHERE.
The query is:
SELECT *
FROM [dataset.table]
OMIT hits IF EVERY(NOT hits.page.pagePath CONTAINS "my-fun-path")
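As a rough illustration of the hit-level scope, the Python sketch below models one record with two hits and drops the hits whose page path does not contain the search string, while the record itself survives. The sample data and the name prune_hits are hypothetical, and the sketch only mirrors the intended filtering, not BigQuery's execution.

```python
# Hypothetical sample record: one session with two hits.
session = {
    "hits": [
        {"page": {"pagePath": "/my-fun-path/start"}},
        {"page": {"pagePath": "/other"}},
    ]
}


def prune_hits(record, needle):
    """Mimic OMIT hits IF EVERY(NOT hits.page.pagePath CONTAINS needle).

    Hits whose pagePath does not contain `needle` are removed; the
    surrounding record is kept with whatever hits remain.
    """
    kept = [hit for hit in record["hits"]
            if needle in hit["page"]["pagePath"]]
    # Build a new record so the input is left untouched.
    return {**record, "hits": kept}
```

This differs from the record-level OMIT RECORD IF, which would keep or drop the session as a whole.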
Update: see the related thread, I am afraid this is no longer possible.
It would be possible to use NEST function and grouping by a field, but that's a long shot.
Using flatten call on the query:
SELECT *
FROM flatten([google.com:analytics-bigquery:LondonCycleHelmet.ga_sessions_20130910],customDimensions)
WHERE
  hits.page.pagePath CONTAINS "m"
Then, in the web UI, setting a destination table, allowing large results, and disabling Flatten Results does the job correctly, and the produced table matches the original schema.
I know this is an old question, but it can now be achieved by using the standard SQL dialect instead of legacy SQL:
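#standardSQL
-- hits is an array, so unnest it and read page.pagePath from each hit;
-- standard SQL has no CONTAINS operator, so LIKE is used instead.
-- Note: this returns one output row per matching hit.
SELECT t.*
FROM `dataset.table` t, UNNEST(t.hits) AS hit
WHERE
  hit.page.pagePath LIKE '%my-fun-path%'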

Resources