ArangoDB - how to implement a custom recommendation engine using graph? - graph

Lets say we have a database of food items such as:
item1 = {name: 'item1', tags: ['mexican', 'spicy']};
item2 = {name: 'item2', tags: ['sweet', 'chocolate', 'nuts']};
item3 = {name: 'item3', tags: ['sweet', 'vanilla', 'cold']};
And we have a user looking for food recommendations, where they indicate their preference weight for some tags:
foodPref = {sweet: 4, chocolate: 11}
Now we need to calculate how well each item scores and recommend the best items:
item1 score = 0 (doesn't contain any of the tags user is looking for)
item2 score = 4 (contains the tag 'sweet')
item3 score = 15 (contains the tag 'sweet' and 'chocolate')
I have modeled the problem as a graph:
What's the correct way to get the recommendations -- a custom traversal object or just filter and count using AQL or just implement it in Foxx (javascript layer)?
Also, can you help out with a sample implementation for the methods you suggest?
Thanks in advance!

First, lets create the collections and their contents the way you specified them. We will add a second user.
db._create("user")
db._create("tags")
db._create("dishes")
db.user.save({_key: 'user1'})
db.user.save({_key: 'user2'})
db.tags.save({_key: 'sweet'})
db.tags.save({_key: 'chocolate'})
db.tags.save({_key: 'vanilla'})
db.tags.save({_key: 'spicy'})
db.dishes.save({_key: 'item1'})
db.dishes.save({_key: 'item2'})
db.dishes.save({_key: 'item3'})
Now lets create the edge collections with their edges:
db._createEdgeCollection("userPreferences")
db._createEdgeCollection("dishTags")
db.userPreferences.save("user/user1", "tags/sweet", {score: 4})
db.userPreferences.save("user/user1", "tags/chocolate", {score: 11})
db.userPreferences.save("user/user2", "tags/sweet", {score: 27})
db.userPreferences.save("user/user2", "tags/vanilla", {score: 7})
db.dishTags.save("tags/sweet", "dishes/item2", {score: 4});
db.dishTags.save("tags/sweet", "dishes/item3", {score: 7})
db.dishTags.save("tags/chocolate", "dishes/item2", {score: 2})
db.dishTags.save("tags/vanilla", "dishes/item3", {score: 3})
db.dishTags.save("tags/spicy", "dishes/item1", {score: 666})
Our relations are like this:
user-[userPreferences]->tags-[dishTags]->dishes
finding out what user1 likes can be done with this query:
FOR v, e IN 1..2 OUTBOUND "user/user1" userPreferences, dishTags
RETURN {item: v, connection: e}
if you now want to find all dishes that user1 likes best:
FOR v, e IN 2..2 OUTBOUND "user/user1" userPreferences, dishTags
FILTER e.score > 4 RETURN v
We filter for the score attribute.
Now we want to find another user that has the same preferences as user1 does:
FOR v, e IN 2..2 ANY "user/user1" userPreferences RETURN v
We go into ANY direction (forward and backward), but only are interested in the userPreferences edge collection, else 2..2 would also give use dishes. The way we do it now. we go back into the user collections to find users with similar preferences.
Whether or not creating a Foxx-service is a good option depends on personal preferences. Foxx is great if you want to combine & filter results on the server side, so client communication is less. You can also use it if you like to put your Application rather on top of microservices than on db-queries. Your application may then stay free of database specific code - it only operates with the microservice as its backend. There may be usecases where Foxx
In general, there is no "correct" way - there are different ways which you may prefer above others because of performance, code cleanness, scalability, etc.

Related

Godot 3.4: Quest System using Dictionaries

I took this to the Godot Reddit first, but honestly it's not much help imo. A lot of questions go unanswered there. So here I am.
As the title says, Im making a quest system in godot 2d using nested dictionaries. Ive seen people use Nodes, Classes or otherwise for their quest systems, and a few dictionary based ones out there. I chose dictionaries as I know them the best(still isnt a whole lot, or i probably wouldnt be here asking this lol) And my quest system is set up like so:
QuestBase --> QuestHandler --> PlayerData
QuestBase is a giant dictionary of all available quests in the game.
PlayerData is a giant dictionary of all the player stats(including active, completed, or failed quests)
and QuestHandler takes a questName in a function to Copy the quest(questName) from QuestBase dict into the PlayerData.quests_active dict.(quests_active is a dictionary of quests(also dictionaries) inside of the PlayerData dictionary lol) But i cant seem to get it to work. I've done it a couple different ways now, including the way the Godot documentation states on how to add Dinctionaries into dictionaries. Please help, this is the error I get from QuestHandler:
Invalid set index 'tutorialQuest' (on base: 'Nil') with value of type 'Dictionary'
QuestBase:
var storyQuests = {
"tutorialQuest":{
"name" : "Your First Steps",#Name of the Quest
"desc" : "Get to know your environment and learn the ropes.",#Description of the Quest
"level" : 1, #Required level before player can accept quest
"questType" : 0, #0=Story Quest || 1=Side Quest || 2=Repeateable Quest
"taskType": 0, #0=Progressive(task must be completed in order) || 1=Synchronous(tasks can be completed in any order, and all at once)
"tasks":{#Dictionary of Quest's Tasks
"task1":{
"text":"Talk to Bjorn",#Text to display in Quest log
"type":0,#0=Talk,1=Slay,2=Fetch,3=Deliver,4=Collect,5=Goto
"quantity":0,#Determines the amount to complete task for Slay, Fetch, and Collect
"target":"Bjorn",#Determines Who to talk to, or who to slay, or what to collect.
"completed":false#Is this task complete?
},
"task2":{
"text":"Talk to Bjorn",
"type":"Talk",
"target":"Bjorn",
"completed":false
},
"task3":{
"text":"Talk to Bjorn",
"type":"Talk",
"target":"Bjorn",
"completed":false
}
},
"itemReward":{
"items":["Sword", "Basic Elixer"],#Names of items to give
"amount":[1, 5]#Amount of each item to give, respectively.
},
"expReward":{
"skill":["Forestry","Smithing"],#Names of skills to add Exp to
"amount": [100,100] #Amount to add to each skill, respectively.
#1st number will increase first skill in "skill", etc...
},
"Complete":false,
"Failed":false
}
}
PlayerData:
var playerData = {
... irrelevant player data...
#Quests
"quests_active": {"blank":"blank"},
"quests_completed": {},
"quests_failed": {},
and Finally, Quest Handler:
func startQuest(questName: String):
var active_quests = PlayerData.playerData.get("quests_active")
if QuestBase.storyQuests.has(questName):
var questCopy = QuestBase.storyQuests.get(questName).duplicate(true)
PlayerData.playerData.get("quests_active")[questName] = questCopy #.quests_active.append(questCopy)
#print("Story Quest Started: " + PlayerData.playerData.QuestsActive.get(questName).get("name"))
elif QuestBase.sideQuests.has(questName):
var questCopy = QuestBase.sideQuests.get(questName).duplicate(true)
active_quests.append(questCopy)
#print("Side Quest Started: " + PlayerData.playerData.QuestsActive.get(questName).get("name"))
else:
print("Quest Doesn't Exist! Check Spelling!")

DynamoDB transactional insert with multiple conditions (PK/SK attribute_not_exists and SK attribute_exists)

I have a table with PK (String) and SK (Integer) - e.g.
PK_id SK_version Data
-------------------------------------------------------
c3d4cfc8-8985-4e5... 1 First version
c3d4cfc8-8985-4e5... 2 Second version
I can do a conditional insert to ensure we don't overwrite the PK/SK pair using ConditionalExpression (in the GoLang SDK):
putWriteItem := dynamodb.Put{
TableName: "example_table",
Item: itemMap,
ConditionExpression: aws.String("attribute_not_exists(PK_id) AND attribute_not_exists(SK_version)"),
}
However I would also like to ensure that the SK_version is always consecutive but don't know how to write the expression. In pseudo-code this is:
putWriteItem := dynamodb.Put{
TableName: "example_table",
Item: itemMap,
ConditionExpression: aws.String("attribute_not_exists(PK_id) AND attribute_not_exists(SK_version) **AND attribute_exists(SK_version = :SK_prev_version)**"),
}
Can someone advise how I can write this?
in SQL I'd do something like:
INSERT INTO example_table (PK_id, SK_version, Data)
SELECT {pk}, {sk}, {data}
WHERE NOT EXISTS (
SELECT 1
FROM example_table
WHERE PK_id = {pk}
AND SK_version = {sk}
)
AND EXISTS (
SELECT 1
FROM example_table
WHERE PK_id = {pk}
AND SK_version = {sk} - 1
)
Thanks
A conditional check is applied to a single item. It cannot be spanned across multiple items. In other words, you simply need multiple conditional checks. DynamoDb has transactWriteItems API which performs multiple conditional checks, along with writes/deletes. The code below is in nodejs.
const previousVersionCheck = {
TableName: 'example_table',
Key: {
PK_id: 'prev_pk_id',
SK_version: 'prev_sk_version'
},
ConditionExpression: 'attribute_exists(PK_id)'
}
const newVersionPut = {
TableName: 'example_table',
Item: {
// your item data
},
ConditionExpression: 'attribute_not_exists(PK_id)'
}
await documentClient.transactWrite({
TransactItems: [
{ ConditionCheck: previousVersionCheck },
{ Put: newVersionPut }
]
}).promise()
The transaction has 2 operations: one is a validation against the previous version, and the other is an conditional write. Any of their conditional checks fails, the transaction fails.
You are hitting your head on some of the differences between a SQL and a no-SQL database. DynamoDB is, of course, a no-SQL database. It does not, out of the box, support optimistic locking. I see two straight forward options:
Use a software layer to give you locking on your DynamoDB table. This may or may not be feasible depending on how often updates are made to your table. How fast 'versions' are generated and the maximum time your application can be gated on the lock will likely tell you if this can work foryou. I am not familiar with Go, but the Java API supports this. Again, this isn't a built-in feature of DynamoDB. If there is no such Go API equivalent, you could use the technique described in the link to 'lock' the table for updates. Generally speaking, locking a no-SQL DB isn't a typical pattern as it isn't exactly what it was created to do (part of which is achieving large scale on unstructured documents to allow fast access to many consumers at once)
Stop using an incrementor to guarantee uniqueness. Typically, incrementors are frowned upon in DynamoDB, in part due to the lack of intrinsic support for it and in part because of how DynamoDB shards you don't want a lot of similarity between records. Using a UUID will solve the uniqueness problem, but if you are porting an existing application that means more changes to the elements that create that ID and updates to reading the ID (perhaps to include a creation-time field so you can tell which is the newest, or the prepending or appending of an epoch time to the UUID to do the same). Here is a pertinent link to a SO question explaining on why to use UUIDs instead of incrementing integers.
Based on Hung Tran's answer, here is a Go example:
checkItem := dynamodb.TransactWriteItem{
ConditionCheck: &dynamodb.ConditionCheck{
TableName: "example_table",
ConditionExpression: aws.String("attribute_exists(pk_id) AND attribute_exists(version)"),
Key: map[string]*dynamodb.AttributeValue{"pk_id": {S: id}, "version": {N: prevVer}},
},
}
putItem := dynamodb.TransactWriteItem{
Put: &dynamodb.Put{
TableName: "example_table",
ConditionExpression: aws.String("attribute_not_exists(pk_id) AND attribute_not_exists(version)"),
Item: data,
},
}
writeItems := []*dynamodb.TransactWriteItem{&checkItem, &putItem}
_, _ = db.TransactWriteItems(&dynamodb.TransactWriteItemsInput{TransactItems: writeItems})

Multiple complications in watchos

I'm building complications for a nutrition tracking app. I'd like to use offer multiple smaller complications, so the user can track their nutrition.
EG:
'MyApp - Carbohydrates'
'MyApp - Protein'
'MyApp - Fat'
this way on the Modular watch face, they could track all three by using the three bottom 'modular small' complications.
I'm aware this can be achieved by only offering larger sizes that can display everything at once (eg the 'modular large' complication), but I'd like to offer the user choice about how they set up their watch face.
I can't see a way to offer multiple of the same complication, is there any way around this?
The previous answer is outdated. WatchOS 7 onwards, we can now add multiple complications to the same complication family for our app.
Step 1:
In our ComplicationController.swift file, we can make use of the getComplicationDescriptors function, which allows us to describe what complications we are making available in our app.
In the descriptors array, we can append one CLKComplicationDescriptor() for each kind of complication per family that we want to build.
func getComplicationDescriptors(
handler: #escaping ([CLKComplicationDescriptor]) -> Void) {
var descriptors: [CLKComplicationDescriptor] = []
for progressType in dataController.getProgressTypes() {
var dataDict = Dictionary<AnyHashable, Any>()
dataDict = ["id": progressType.id]
// userInfo helps us know which type of complication was interacted with by the user
let userActivity = NSUserActivity(
activityType: "org.example.foo")
userActivity.userInfo = dataDict
descriptors.append(
CLKComplicationDescriptor(
identifier: "\(progressType.id)",
displayName: "\(progressType.title)",
supportedFamilies: CLKComplicationFamily.allCases, // you can replace CLKComplicationFamily.allCases with an array of complication families you wish to support
userActivity: userActivity)
)
}
handler(descriptors)
}
The app will now have multiple complications (equal to the length of the dataController.getProgressTypes() array) for each complication family that you support.
But how do you now display different data and views for different complications?
Step 2:
In the getCurrentTimelineEntries and getTimelineEntries functions, we can then make use of the complication.identifier value to identify the data that was passed along when this complication entry was called for.
Example, in the getTimelineEntries function:
func getTimelineEntries(for complication: CLKComplication, after date: Date, limit: Int, withHandler handler: #escaping ([CLKComplicationTimelineEntry]?) -> Void) {
// Call the handler with the timeline entries after the given date
var entries: [CLKComplicationTimelineEntry] = []
...
...
...
var next: ProgressDetails
// Find the progressType to show using the complication identifier
if let progressType = dataController.getProgressAt(date: current).first(where: {$0.id == complication.identifier}) {
next = progressType
} else {
next = dataController.getProgressAt(date: current)[0] // Default to the first progressType
}
let template = makeTemplate(for: next, complication: complication)
let entry = CLKComplicationTimelineEntry(
date: current,
complicationTemplate: template)
entries.append(entry)
...
...
...
handler(entries)
}
You can similarly find the data that is passed in the getCurrentTimelineEntry and the getLocalizableSampleTemplate functions.
Step 3:
Enjoy!
There is currently no way to create multiple complications in the same family (e.g. Modular Small, Modular Large, Utilitarian Small etc.).
You could offer a way for the user to customize each complication to display Carbohydrates, Protein and Fat. You could even display different data for each complication family, but as far as having, for example, 2 modular small complications displaying different data, it is not possible yet.
You can see this if you put 2 of the same built in Apple complications in different places on your watchface they display the same thing. If Apple isn't even doing it with their own complications then it is more than likely impossible. Hope that explanation helps.

ArangoDB copy Vertex and Edges to neighbors

I'm trying to copy a vertex node and retain it's relationships in ArangoDB. I'm getting a "access after data-modification" error (1579). It doesn't like it when I iterate over the source node's edges and insert an edge copy within the loop. This makes sense but I'm struggling to figure out how to do what I'm wanting within a single transaction.
var query = arangojs.aqlQuery`
let tmpNode = (FOR v IN vertices FILTER v._id == ${nodeId} RETURN v)[0]
let nodeCopy = UNSET(tmpNode, '_id', '_key', '_rev')
let nodeCopyId = (INSERT nodeCopy IN 'vertices' RETURN NEW._id)[0]
FOR e IN GRAPH_EDGES('g', ${nodeId}, {'includeData': true, 'maxDepth': 1})
let tmpEdge = UNSET(e, '_id', '_key', '_rev')
let edgeCopy = MERGE(tmpEdge, {'_from': nodeCopyId})
INSERT edgeCopy IN 'edges'
`;
This quesion is somewhat similar to 'In AQL how to re-parent a vertex' - so let me explain this in a similar way.
One should use the ArangoDB 2.8 pattern matching traversals to solve this.
We will copy Alice to become Sally with similar relations:
let alice=DOCUMENT("persons/alice")
let newSally=UNSET(MERGE(alice, {_key: "sally", name: "Sally"}), '_id')
let r=(for v,e in 1..1 ANY alice GRAPH "knows_graph"
LET me = UNSET(e, "_id", "_key", "_rev")
LET newEdge = (me._to == "persons/alice") ?
MERGE(me, {_to: "persons/sally"}) :
MERGE(me, {_from: "persons/sally"})
INSERT newEdge IN knows RETURN newEdge)
INSERT newSally IN persons RETURN newSally
We therefore first load Alice. We UNSET the properties ArangoDB should set on its own. We change the properties that have to be uniq to be uniq for Alice so we have a Sally afterwards.
Then we open a subquery to traverse ANY first level relations of Alice. In this subequery we want to copy the edges - e. We need to UNSET once more the document attributes that have to be autogenerated by ArangoDB. We need to find out which side of _from and _to pointed to Alice and relocate it to Sally.
The final insert of Sally has to be outside of the subquery, else this statement will attempt to insert one Sally per edge we traverse. We can't insert Saly in front of the query as you already found out - no subsequent fetches are allowed after the insert.

CouchDB View with 2 Keys

I am looking for a general solution to a problem with couchdb views.
For example, have a view result like this:
{"total_rows":4,"offset":0,"rows":[
{"id":"1","key":["imported","1"],"value":null},
{"id":"2","key":["imported","2"],"value":null},
{"id":"3","key":["imported","3"],"value":null},
{"id":"4","key":["mapped","4"],"value":null},
{"id":"5,"key":["mapped","5"],"value":null}
]
1) If I want to select only "imported" documents I would use this:
view?startkey=["imported"]&endkey=["imported",{}]
2) If I want to select all imported documents with an higher id then 2:
view?startkey=["imported",2]&endkey=["imported",{}]
3) If I want to select all imported documents with an id between 2 and 4:
view?startkey=["imported",2]&endkey=["imported",4]
My Questtion is: How can I select all Rows with an id between 2 and 4?
You can try to extend the solution above, but prepend keys with a kind of "emit index" flag like this:
map: function (doc) {
emit ([0, doc.number, doc.category]); // direct order
emit ([1, doc.category, doc.number]); // reverse order
}
so you will be able to request them with
view?startkey=[0, 2]&endkey=[0, 4, {}]
or
view?startkey=[1, 'imported', 2]&endkey=[1, 'imported', 4]
But 2 different views will be better anyway.
I ran into the same problem a little while ago so I'll explain my solution. Inside of any map function you can have multiple emit() calls. A map function in your case might look like:
function(doc) {
emit([doc.number, doc.category], null);
emit([doc.category, doc.number], null);
}
You can also use ?include_docs=true to get the documents back from any of your queries. Then your query to get back rows 2 to 4 would be
view?startkey=[2]&endkey=[4,{}]
You can view the rules for sorting at CouchDB View Collation

Resources