Symfony 2 - Join returns fewer results when object has no associated children

I've been googling for a while now and I guess I just have trouble stating my question the correct way.
I have a Product and my Product has "optional" ProductImages associated with it.
When I lazy-load the products everything works as expected, but I'd like to join the images up front to reduce the total number of queries.
Here's the code:
$qb->select('product')
->from('FocumaTCBundle:Product', 'product')
->join('product.ProductType', 'type')
->join('product.ProductImages', 'productImage')
->where('type.id = :productTypeId')
->setParameter('productTypeId', $PRODUCT_HOTEL_TYPE);
However, this returns fewer results than the query without the join. I'm not sure how to create an "optional" join :(
Thanks for some help on this!

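Use leftJoin() instead of join(). join() produces an inner join, which drops every product that has no images, while leftJoin() keeps those products. Note that Doctrine only loads the images in the same query if the productImage alias is also added to select().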
$qb->select('product')
->from('FocumaTCBundle:Product', 'product')
->leftJoin('product.ProductType', 'type')
->leftJoin('product.ProductImages', 'productImage')
->where('type.id = :productTypeId')
->setParameter('productTypeId', $PRODUCT_HOTEL_TYPE);
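The difference between the two join types can be reproduced outside Doctrine. Below is a minimal sketch in Python using an in-memory SQLite database; the table and column names are made up for illustration and are not the asker's real schema.

```python
import sqlite3

# Hypothetical two-table schema standing in for Product / ProductImages.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product_image (id INTEGER PRIMARY KEY, product_id INTEGER, path TEXT);
    INSERT INTO product VALUES (1, 'Hotel A'), (2, 'Hotel B'), (3, 'Hotel C');
    INSERT INTO product_image VALUES (1, 1, 'a.jpg'), (2, 1, 'a2.jpg');
""")


def product_ids(join_keyword):
    """Return the distinct product ids that survive the given join type."""
    sql = (
        "SELECT DISTINCT p.id FROM product p "
        f"{join_keyword} product_image i ON i.product_id = p.id ORDER BY p.id"
    )
    return [row[0] for row in conn.execute(sql)]


inner_ids = product_ids("JOIN")       # only products that have at least one image
left_ids = product_ids("LEFT JOIN")   # every product, with or without images
print(inner_ids, left_ids)
```

Only product 1 has images here, so the inner join loses products 2 and 3, which is the same effect as the missing results in the question.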

Gremlin/Tinkerpop - is there a way to add metadata to a union step so I know which query the resulting traversal came from?

This is a little strange, but I have a situation where it'd be beneficial for me to know which traversal an element came from.
For a simple example, something like this:
.union(
select('parent').out('contains'), //traversal 1
select('parent2').out('contains') //traversal 2
)
.dedup()
.project('id','traversal')
.by(id())
.by(/* any way to determine which traversal it came from, or whether it was in both? */)
Edit: One thing I found is that I can use Map with Group/By to get partly there:
.union(
select('parent').out('contains')
.map(group().by(identity()).by(constant('t1'))),
select('parent2').out('contains')
.map(group().by(identity()).by(constant('t2')))
)
.dedup() //Dedup isn't gonna work here because each hashmap will be different.
.project('id','traversal')
.by(/* here I can't figure out how to read a value from the hashmap inline */)
The above query without the project/by piece returns this:
[{v[199272505353083909]: 't1'}, {v[199272515180338177]: 't2'}]
Or is there a better way to do this?
Thanks!
One simple approach might be to just fold the results. If you get back an empty list you will know you did not find any on that "branch":
g.V('44').
union(out('route').fold().as('a').project('res','branch').by().by(constant('b1')),
out('none').fold().as('b').project('res','branch').by().by(constant('b2')))
which yields
{'res': [v[8], v[13], v[20], v[31]], 'branch': 'b1'}
{'res': [], 'branch': 'b2'}
UPDATED after discussion in the comments to include an alternative that uses nested union steps to avoid the project step inside the union. I still prefer the project approach unless its measured performance turns out to be poor.
g.V('44').
union(local(union(out('route').fold(),constant('b1')).fold()),
local(union(out('none').fold(),constant('b2')).fold()))
which yields
[[v[8], v[13], v[20], v[31]], 'b1']
[[], 'b2']
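If it is acceptable to run the two branches as separate queries, the "which branch, or both?" bookkeeping can also be done on the client. This is only a sketch in plain Python, not Gremlin: the branch labels and vertex ids are invented, and tag_by_branch is a hypothetical helper.

```python
from collections import defaultdict


def tag_by_branch(branches):
    """Map each element to the sorted list of branch labels that produced it."""
    seen = defaultdict(set)
    for label, elements in branches.items():
        for element in elements:
            seen[element].add(label)
    return {element: sorted(labels) for element, labels in seen.items()}


# Hypothetical vertex ids returned by two separate traversals.
results = tag_by_branch({"t1": [8, 13, 20], "t2": [13, 31]})
print(results)
```

Each element appears once in the result, so the merge also takes care of the dedup step, and an element found by both branches carries both labels.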

Gremlin order on Map results

I have the below query:
g.V('1')
.union(
out('R1')
.project('timestamp', 'data')
.by('completionDate')
.by(valueMap().by(unfold())),
out('R2')
.project('timestamp', 'data')
.by('endDate')
.by(valueMap().by(unfold()))
)
How can I order the UNION results by timestamp?
I've tried using ".order().by('timestamp')" but this only works on traversals and UNION returns a MAP object.
Here are a couple of ways to approach it. First, you could just use your code as-is and then order() by the "timestamp":
g.V('1').
union(out('R1').
project('timestamp', 'data').
by('completionDate').
by(valueMap().by(unfold())),
out('R2').
project('timestamp', 'data').
by('endDate').
by(valueMap().by(unfold()))).
order().by(select('timestamp'))
Note the difference is to select() the key from the Map that you want to sort on. Versions after 3.4.5 will work more as you expect and you can simply do by('timestamp') for a Map as well as an Element.
I think a more readable approach, however, would be this:
g.V('1').
out('R1','R2').
project('timestamp', 'data').
by(coalesce(values('endDate'), values('completionDate'))).
by(valueMap().by(unfold())).
order().by(select('timestamp'))
You might need to enhance the by(coalesce(...)) depending on the nature of your schema, but hopefully you get the idea of what I'm trying to do there.
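To make the coalesce-then-order idea concrete, here is the same logic on plain Python dictionaries. It is a sketch only: the sample vertices are invented, and to_row is a hypothetical stand-in for the project() step rather than anything from a Gremlin client.

```python
def to_row(props):
    """Mimic project('timestamp', 'data') with coalesce(endDate, completionDate)."""
    timestamp = props.get("endDate", props.get("completionDate"))
    return {"timestamp": timestamp, "data": props}


# Invented property maps: some vertices carry endDate, others completionDate.
vertices = [
    {"name": "a", "completionDate": "2020-03-01"},
    {"name": "b", "endDate": "2020-01-15"},
    {"name": "c", "completionDate": "2020-02-10"},
]

# Equivalent of order().by(select('timestamp')): sort on the projected key.
rows = sorted((to_row(v) for v in vertices), key=lambda row: row["timestamp"])
ordered_names = [row["data"]["name"] for row in rows]
print(ordered_names)
```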

sqlite-net-pcl full text rank function

I'm creating a Xamarin.Forms application, and we use sqlite-net-pcl by Frank A. Krueger. It is supposed to support full text searching, which I am trying to implement.
Now, full text search seems to work. I created a query like:
SELECT * FROM Document d JOIN(
SELECT document_id
FROM SearchDocument
WHERE SearchDocument MATCH 'test*'
) AS ranktable USING(document_id)
which seems to work fine. However, I'd like to return the results in order of their rank, otherwise the result is useless. According to the documentation (https://www.sqlite.org/fts3.html), the syntax should be:
SELECT * FROM Document d JOIN(
SELECT document_id, rank(matchinfo(SearchDocument)) AS rank
FROM SearchDocument
WHERE SearchDocument MATCH 'test*'
) AS ranktable USING(document_id)
ORDER BY ranktable.rank
However, the engine doesn't seem to know the "rank" function:
[ERROR] FATAL UNHANDLED EXCEPTION: SQLite.SQLiteException: no such function: rank
It does know the "matchinfo" function though.
Can anyone tell me what I'm doing wrong?
Edit: After some more searching it seems that the rank function is simply not implemented in the library. I'm confused. How can people use the fulltext search without caring about the order of the results? Is there some other way of ordering the results so that the most relevant results are at the top?
sqlite-net-pcl depends on SQLitePCLRaw.bundle_green for its native SQLite access, so that package is worth looking into.
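For context, the rank function in the FTS3 documentation is not built into SQLite. The documentation presents it as an application-defined function that the program has to register itself, which is why matchinfo is found and rank is not. The sketch below shows that idea in Python's sqlite3 module, not in sqlite-net-pcl. The scoring is a simplified version of the documented example (no per-column weights), and how to register a function through sqlite-net-pcl is not covered here.

```python
import sqlite3
import struct


def rank_from_matchinfo(blob):
    """Score a row from matchinfo()'s default 'pcx' output.

    The blob is a sequence of 32-bit unsigned integers in machine byte order:
    phrase count, column count, then three integers per phrase/column pair.
    """
    ints = struct.unpack(f"={len(blob) // 4}I", blob)
    phrases, columns = ints[0], ints[1]
    score = 0.0
    for i in range(phrases * columns):
        hits_this_row, hits_all_rows, _docs_with_hits = ints[2 + 3 * i : 5 + 3 * i]
        if hits_all_rows:
            score += hits_this_row / hits_all_rows
    return score


def search(conn, term):
    """Run a ranked query; assumes an SQLite build with FTS3/FTS4 enabled."""
    conn.create_function("rank", 1, rank_from_matchinfo)
    sql = (
        "SELECT document_id, rank(matchinfo(SearchDocument)) AS rank "
        "FROM SearchDocument WHERE SearchDocument MATCH ? ORDER BY rank DESC"
    )
    return conn.execute(sql, (term,)).fetchall()
```

Calling search(conn, 'test*') on a connection that contains the SearchDocument table would then return the most relevant rows first.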

BigQuery Timeout Errors in R Loop Using bigrquery

I am running a query in a loop for each store in a dataframe. Typically there are 70 or so stores so the loop repeats that many times for each complete loop.
Maybe 75% of the time this loop works all the way through with no errors.
About 25% of the time I get the following error during any one of the loop iterations:
Error in curl::curl_fetch_memory(url, handle = handle) :
Timeout was reached
Then I have to figure out which iteration bombed, and repeat the loop excluding iterations that completed successfully.
I can't find anything on the web to help me understand what is causing this seemingly random error. Perhaps it is a BQ technical issue? There does not seem to be any relation to the size of the result set it crashes on.
Here is the part of my code that does the loop...again it works all the way through most of the time. The cartesian product across IDs is intentional, as I want every combination of each Test ID with all possible Control IDs within store.
sql<-"SELECT pstore as store, max(pretrips) as pretrips FROM analytics.campaign_ids
group by 1 order by 1"
store_maxtrips<-query_exec(sql,project=project, max_pages = 1)
store_maxtrips
for (i in 1:length(store_maxtrips$store)) {
#pull back all ids shopping in same primary store as each test ID with their pre metrics
sql<-paste("SELECT a.pstore as pstore, a.id as test_id,
b.id as ctl_id,
(abs(a.zpbsales-b.zpbsales)*",wt_pb_sales,")+(abs(a.zcatsales-b.zcatsales)*",wt_cat_sales,")+
(abs(a.zsales-b.zsales)*",wt_retail_sales,")+(abs(a.ztrips-b.ztrips)*",wt_retail_trips,") as zscore
FROM analytics.campaign_ids a inner join analytics.pre_zscores b
on a.pstore=b.pstore
where a.id<>b.id and a.pstore=",store_maxtrips$store[i]," order by a.pstore, a.id, zscore")
print(paste("processing store",store_maxtrips$store[i]))
query_exec(sql,project=project,destination_table = "analytics.campaign_matches",
write_disposition = "WRITE_APPEND", max_pages = 1)
}
Solved!
It turns out I was using query_exec, but I should have been using insert_query_job since I do not want to retrieve any results. The errors were all happening in the course of R trying to retrieve results from BigQuery which I didn't want anyhow.
By using insert_query_job + wait_for(job) in my loop instead of the query_exec command, it eliminated all issues with the loop finishing.
I did also need to add a try() function to help circumvent some rare errors that still popped up with this approach. Thanks to MarkeD for this tip. So my final solution looked like this:
try(job<-insert_query_job(sql,project=project,destination_table = "analytics.campaign_matches",
write_disposition = "WRITE_APPEND"))
wait_for(job)
Thanks to everyone who commented and helped me research the issue.
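The "figure out which iteration bombed and rerun the rest" step can also be automated. Here is a language-neutral sketch of that pattern in Python; run_all and run_store are hypothetical names, and run_store stands in for whatever submits the job for one store and waits for it.

```python
import time


def run_all(stores, run_store, attempts=3, delay_seconds=0.0):
    """Call run_store(store) for each store, retrying on failure.

    Returns the stores for which every attempt failed.
    """
    failed = []
    for store in stores:
        for attempt in range(1, attempts + 1):
            try:
                run_store(store)
                break
            except Exception as error:  # broad on purpose, like R's try()
                print(f"store {store}: attempt {attempt} failed: {error}")
                time.sleep(delay_seconds)
        else:
            failed.append(store)  # no attempt succeeded
    return failed
```

One caveat with WRITE_APPEND: if a job actually finished on the server but the client timed out, a retry could append the same rows twice, so it may be worth checking the destination table before rerunning a store.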

GraceNote rhythm API with Pygn

The following works and returns a list of seemingly random tracks which GraceNote thinks are similar to Bowie's work:
radioPlayList = pygn.createRadio(GRACENOTE_CLIENT_ID, GRACENOTE_USER_ID, artist='Bowie', count='3');
However, I would strongly prefer to pass a genre rather than an artist; I just can't figure out how.
This radioPlayList = pygn.createRadio(GRACENOTE_CLIENT_ID, GRACENOTE_USER_ID, genre='38', count='3'); returns <RESPONSES>\n <RESPONSE STATUS="NO_MATCH">\n </RESPONSE>\n</RESPONSES> which leads me to believe that the genre should not be just a simple number.
And trying to give the genre as a text, radioPlayList = pygn.createRadio(GRACENOTE_CLIENT_ID, GRACENOTE_USER_ID, genre='Oldies', count='3'); gives <RESPONSES>\n <MESSAGE>GCSP: RADIOCREATE error: [8] radio: Invalid attribute seed.</MESSAGE>\n <RESPONSE STATUS="ERROR">\n </RESPONSE>\n</RESPONSES>\n so that is obviously not the way to do it.
QUESTION: how can I pass a Genre (only) and get a radio playlist in return?
The only Pygn documentation which I can find does not help. I am hoping that #cweichen will see this question and help me. Does anyone else know how?
[Update] Looking in the code of Pygn's test.py, I see
# Example how to create a radio playlist by genre classical music
result = pygn.createRadio(clientID, userID, genre='36061', popularity ='1000', similarity = '1000')
print(json.dumps(result, sort_keys=True, indent=4))
Question: where do I get a list of those genre values? The file readme.md says genre: a genre ID from the genres below, but there is no list below.
To get the list of genres (or moods, or eras) you need to make a call to the "fieldvalues" API - this isn't in pygn yet, but you can see how to do it here:
https://developer.gracenote.com/rhythm-api#attribute-station
This call will give you the list of supported genres:
https://cXXXXXXX.web.cddbp.net/webapi/json/1.0/radio/fieldvalues?fieldname=RADIOGENRE&client=CLIENT_ID&user=USER_ID
You can then use the returned IDs with pygn.createRadio().
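Since pygn does not wrap this call, one way to make it is with the standard library. This is a rough sketch: fieldvalues_url and fetch_fieldvalues are hypothetical helpers, the host keeps the cXXXXXXX placeholder from the URL above, and the shape of the returned JSON is not assumed.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def fieldvalues_url(host, client_id, user_id, fieldname="RADIOGENRE"):
    """Build the fieldvalues URL in the form shown above."""
    query = urlencode({"fieldname": fieldname, "client": client_id, "user": user_id})
    return f"https://{host}/webapi/json/1.0/radio/fieldvalues?{query}"


def fetch_fieldvalues(host, client_id, user_id, fieldname="RADIOGENRE"):
    """Fetch and parse the response; the JSON layout is left to the caller."""
    with urlopen(fieldvalues_url(host, client_id, user_id, fieldname)) as response:
        return json.load(response)


# Placeholders only: substitute your own host prefix, client ID and user ID.
url = fieldvalues_url("cXXXXXXX.web.cddbp.net", "CLIENT_ID", "USER_ID")
print(url)
```

The genre IDs found in the response can then be passed as the genre argument, as in the test.py example quoted in the question.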
