I have a question about CouchDB views / map-reduce.
Let's say we have a database with hotel documents like the following:
[{
_id: "1",
name: "Hotel A",
type: "hotel",
stars: 3,
flags: ["family-friendly","green-hotel","sport"],
hotelType: "premium",
food: ["breakfast","lunch"]
}, {
_id: "2",
name: "Hotel B",
type: "hotel",
stars: 5,
flags: ["family-friendly","pet-friendly"],
hotelType: "budget",
food: ["breakfast","lunch","dinner"]
}]
To find all hotels with 3 stars, the following view will fit:
function(doc) {
emit([doc.stars, doc.name]);
}
If I use startkey=[3], everything is fine.
But how is it possible to make a view with multiple filters?
For example - all hotels:
with 3 stars and the flags "family-friendly" and "pet-friendly" and
with hotelType "budget"?
with hotelType "premium" and food "breakfast" or "lunch"?
etc.
Any ideas?
EDIT:
I have now decided to use good old MySQL. CouchDB was a nice experience for me, but there are too many problems if you need more than the data of one document :(
You can emit a key with a group of values:
emit([[doc.stars,doc.hotelType], doc.name]);
The problem is that this only works if you can order your attributes by importance since they will always get reduced in the same order. Kxepal's solution of using different views is probably the best for your situation.
Source: http://wiki.apache.org/couchdb/Introduction_to_CouchDB_views#Grouping
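A minimal sketch of how such a compound key is typically queried, assuming a flat key [doc.stars, doc.hotelType, doc.name] instead of the nested one above. The helper names prefixRange and toQueryString are hypothetical. It relies on CouchDB's view collation, in which an object sorts after any number or string, so appending {} to the prefix closes the range.

```javascript
// Build startkey/endkey for "all rows whose key starts with `prefix`".
// Assumes the view emits a flat array key such as [stars, hotelType, name].
function prefixRange(prefix) {
  return {
    startkey: JSON.stringify(prefix),
    // {} collates after numbers and strings, so it acts as an upper bound.
    endkey: JSON.stringify(prefix.concat([{}])),
  };
}

// Turn the range into a query string that can be appended to a view URL.
function toQueryString(range) {
  return Object.keys(range)
    .map((name) => name + "=" + encodeURIComponent(range[name]))
    .join("&");
}

// All 3-star hotels of type "budget", whatever their name.
const range = prefixRange([3, "budget"]);
console.log(range.startkey);
console.log(range.endkey);
console.log(toQueryString(range));
```

This only narrows by the leading parts of the key. Filtering by hotelType alone would still need a view whose key starts with hotelType.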
You need to use different views for that. Each view handles its own domain with its own keys. You could create one view for all data by emitting several times with different key values, but in the long run that will be hard to maintain.
CouchDB views are one-dimensional, and you are looking for a multi-dimensional query:
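A rough sketch of the one-view-per-filter idea, with the results combined on the client. The view names (byStars, byFlag, byHotelType) and the helpers queryView and intersect are hypothetical, and queryView is only an in-memory stand-in for requesting a view with ?key=... over HTTP. In a real design document each map function would be written as function(doc) { emit(key, null); } with the global emit.

```javascript
// Hypothetical map functions, one per view. `emit` is passed in so the
// sketch runs outside CouchDB.
const maps = {
  byStars: (doc, emit) => { if (doc.type === "hotel") emit(doc.stars); },
  byFlag: (doc, emit) => {
    if (doc.type === "hotel") (doc.flags || []).forEach((flag) => emit(flag));
  },
  byHotelType: (doc, emit) => { if (doc.type === "hotel") emit(doc.hotelType); },
};

// In-memory stand-in for querying one view with ?key=<key>: returns doc ids.
function queryView(docs, map, key) {
  const ids = new Set();
  docs.forEach((doc) => map(doc, (k) => { if (k === key) ids.add(doc._id); }));
  return ids;
}

// Keep only the ids that appear in every result set.
function intersect(sets) {
  return [...sets[0]].filter((id) => sets.every((set) => set.has(id)));
}

const hotels = [
  { _id: "1", name: "Hotel A", type: "hotel", stars: 3,
    flags: ["family-friendly", "green-hotel", "sport"], hotelType: "premium" },
  { _id: "2", name: "Hotel B", type: "hotel", stars: 5,
    flags: ["family-friendly", "pet-friendly"], hotelType: "budget" },
];

// Hotels flagged family-friendly AND pet-friendly with hotelType "budget".
const matches = intersect([
  queryView(hotels, maps.byFlag, "family-friendly"),
  queryView(hotels, maps.byFlag, "pet-friendly"),
  queryView(hotels, maps.byHotelType, "budget"),
]);
console.log(matches);
```

The cost is one request per filter plus the intersection on the client, which is the trade-off for keeping each view simple.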
x = stars
y = flags
z = hotelType
Unfortunately, multi-dimensional queries are not supported. For example, if you need to query a geographical location by latitude and longitude, then you'll have to use GeoCouch.
I want to get the shape of a road link (from one node to another node, i.e. between 2 junctions) but I cannot find how to do it.
If I try with
https://pde.api.here.com/1/tile.json?app_id=&app_code=&layer=LINK_ATTRIBUTE_FC1&level=9&tilex=537&tiley=399
there is no shape.
This is to store the shape in my geoserver to later reuse the map. I am not sure this is doable according to the commercial license... So any commercial explanation is also welcome.
Is there a price for this? Is this allowed?
I think the ADAS attribute layer would be more useful for your use case:
https://pde.api.here.com/1/tile.json?&app_id=xxx&app_code=yyy&layer=ADAS_ATTRIB_FC1&level=9&tilex=537&tiley=399
Example out of the response:
[...]
"LINK_ID": "52493206",
"HPX": "89681500,-1400,-8000",
"HPY": "502884700,1900,11200",
"HPZ": "18242,22,138",
"SLOPES": "547,-2,92",
"HEADINGS": "334899",
"CURVATURES": "-116",
"VERTICAL_FLAGS": "0",
"REFNODE_LINKCURVHEADS": "9588455:-110:-25252",
"NREFNODE_LINKCURVHEADS": "1143217772:-111:335789",
"BUA_ROAD": "4",
"BUA_ROAD_VERIFIED": "Y"
}, {
"LINK_ID": "52493207",
"HPX": "89658700,-8700",
"HPY": "502913000,13500",
"HPZ": "18592,167",
"SLOPES": "525,70",
"HEADINGS": "",
"CURVATURES": "",
"VERTICAL_FLAGS": "",
"REFNODE_LINKCURVHEADS": "497590520:-88:-22269",
"NREFNODE_LINKCURVHEADS": "869077244:-83:338527",
"BUA_ROAD": "4",
"BUA_ROAD_VERIFIED": "Y"
},
[...]
Here you can see all available layers and the data they contain:
https://pde.api.here.com/1/doc/layers.json?&app_id=xxx&app_code=yyy
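A small sketch of how the HPX/HPY strings from the excerpt above might be turned into shape points. It rests on assumptions that should be checked against the layer documentation: that HPX holds longitudes and HPY latitudes, that the first value is absolute and each later value is a delta to the previous one, and that the unit is 1e-7 degrees. The function names are hypothetical.

```javascript
// Decode one coordinate string such as "89681500,-1400,-8000".
// Assumption: first value absolute, following values relative to the
// previous one, all in units of 1e-7 degrees.
function decodeDeltas(csv) {
  let current = 0;
  return csv.split(",").map((part) => {
    current += Number(part);
    return current / 1e7;
  });
}

// Combine HPX (assumed longitude) and HPY (assumed latitude) into points.
function linkShape(link) {
  const lons = decodeDeltas(link.HPX);
  const lats = decodeDeltas(link.HPY);
  return lons.map((lon, i) => ({ lat: lats[i], lon: lon }));
}

const shape = linkShape({
  LINK_ID: "52493206",
  HPX: "89681500,-1400,-8000",
  HPY: "502884700,1900,11200",
});
console.log(shape);
```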
Update: Regarding storing the data, please go through the Terms and Conditions on the HERE website: https://legal.here.com/en-gb/terms/here-wego-here-application-and-here-maps-service-terms
You undertake that you will safeguard, protect, and keep your HERE
account confidential and shall not disclose it to any person, or store
the information in any manner, except as required by law.
I'm new to NoSQL databases. Currently I'm trying to use Firebase and integrate it with iOS. To predefine the database, I used trial and error to make it look like this:
When I retrieve the "stories" path in iOS, I get a JSON structure like this:
[
<null>,
{
comments: [
<null>,
1,
2,
3
],
desc: "Blue versus red in a classic battle of good versus evil and right versus wrong.",
duration: 30,
rating: 4.42,
tags: [
<null>,
"fantasy",
"scifi"
],
title: "The Order of the Midnight Sun",
writer: 1
}
]
My question is: why is there always a null at the beginning of each array? What should I do in the database editor to avoid the null?
It looks like you started pushing data at index 1 and not 0. Inserting data into a list and retrieving data from it start at index 0.
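The effect can be reproduced in plain JavaScript, assuming (as the answer suggests) that the children were entered under the keys 1, 2, 3 and that Firebase hands back consecutive numeric keys as an array. The helper keysToArray is a hypothetical stand-in for that conversion, not Firebase SDK code.

```javascript
// Hypothetical stand-in: place each numerically keyed child at its index.
function keysToArray(children) {
  const out = [];
  Object.keys(children).forEach((key) => { out[Number(key)] = children[key]; });
  // Slots that were never assigned come out as null once serialized.
  return JSON.parse(JSON.stringify(out));
}

const startingAtOne = keysToArray({ 1: "fantasy", 2: "scifi" });
const startingAtZero = keysToArray({ 0: "fantasy", 1: "scifi" });
console.log(startingAtOne);  // index 0 was never set
console.log(startingAtZero); // no gap at the front
```

Under that assumption, numbering the children from 0 in the database editor, or using non-numeric keys, avoids the leading null.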
For example, I have a list of 1.000.000 users, the data look like this:
users: {
$userId: {
name: "",
sex: "",
age: "",
city: "",
maritalStatus: "",
// can be more
}
}
I want to filter and paginate the data for users who are single, male, with age < 30, living in city X.
Is there a good practice to make this kind of queries less painful?
Firebase doesn't have a direct way to query for more than one child at a time.
You can structure your data to make it easier - for example
users
$userId
gender_age: male_27
$userId
gender_age: male_32
Then, to query for males between 30 and 40:
gender_age....queryStartingAtValue("male_30").endingAtValue("male_40")
That will narrow down the results - you could then filter in code for the ones you want, for example (conceptual)
if snapshot.child("maritalStatus") = "Single" and
snapshot.child("city") = "AnyTown" then
//add person to list for display
You could expand this out a bit to narrow the results further:
users
$userId
city_gender_age: anytown_male_27
city_gender_age....queryStartingAtValue("anytown_male_30").endingAtValue("anytown_male_40")
Unfortunately the pattern breaks down if the query is less specific; e.g. if we are querying for either male or female in anytown between 30 and 40, this won't work.
However, disk space is cheap so storing 'duplicate' data in another node would resolve that
another_node
$user_id
city_age: anytown_27
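A sketch of the combined-key idea in plain JavaScript. The range query is simulated in memory, so rangeQuery is a hypothetical stand-in for a start/end query on the combined child, not a Firebase call. Zero-padding the age is an assumption added here: the values are compared as strings, and without padding "male_9" would sort after "male_30".

```javascript
// Build the combined value stored on each user, e.g. "anytown_male_027".
// The zero-padding is an addition so that string order matches numeric order.
function compositeKey(user) {
  const age = String(user.age).padStart(3, "0");
  return [user.city.toLowerCase(), user.sex, age].join("_");
}

// In-memory stand-in for a start/end range query on the combined value.
function rangeQuery(users, start, end) {
  return users.filter((user) => {
    const key = compositeKey(user);
    return key >= start && key <= end;
  });
}

const users = [
  { name: "Al", sex: "male", age: 27, city: "AnyTown", maritalStatus: "single" },
  { name: "Bo", sex: "male", age: 9, city: "AnyTown", maritalStatus: "single" },
  { name: "Cy", sex: "male", age: 32, city: "AnyTown", maritalStatus: "single" },
  { name: "Di", sex: "male", age: 25, city: "AnyTown", maritalStatus: "married" },
];

// Males in anytown younger than 30; marital status is then filtered in code.
const singles = rangeQuery(users, "anytown_male_000", "anytown_male_029")
  .filter((user) => user.maritalStatus === "single")
  .map((user) => user.name);
console.log(singles);
```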
Let's say we have a database of food items such as:
item1 = {name: 'item1', tags: ['mexican', 'spicy']};
item2 = {name: 'item2', tags: ['sweet', 'chocolate', 'nuts']};
item3 = {name: 'item3', tags: ['sweet', 'vanilla', 'cold']};
And we have a user looking for food recommendations, where they indicate their preference weight for some tags:
foodPref = {sweet: 4, chocolate: 11}
Now we need to calculate how well each item scores and recommend the best items:
item1 score = 0 (doesn't contain any of the tags the user is looking for)
item2 score = 15 (contains the tags 'sweet' and 'chocolate')
item3 score = 4 (contains the tag 'sweet')
I have modeled the problem as a graph:
What's the correct way to get the recommendations -- a custom traversal object or just filter and count using AQL or just implement it in Foxx (javascript layer)?
Also, can you help out with a sample implementation for the methods you suggest?
Thanks in advance!
First, let's create the collections and their contents the way you specified them. We will add a second user.
db._create("user")
db._create("tags")
db._create("dishes")
db.user.save({_key: 'user1'})
db.user.save({_key: 'user2'})
db.tags.save({_key: 'sweet'})
db.tags.save({_key: 'chocolate'})
db.tags.save({_key: 'vanilla'})
db.tags.save({_key: 'spicy'})
db.dishes.save({_key: 'item1'})
db.dishes.save({_key: 'item2'})
db.dishes.save({_key: 'item3'})
Now let's create the edge collections with their edges:
db._createEdgeCollection("userPreferences")
db._createEdgeCollection("dishTags")
db.userPreferences.save("user/user1", "tags/sweet", {score: 4})
db.userPreferences.save("user/user1", "tags/chocolate", {score: 11})
db.userPreferences.save("user/user2", "tags/sweet", {score: 27})
db.userPreferences.save("user/user2", "tags/vanilla", {score: 7})
db.dishTags.save("tags/sweet", "dishes/item2", {score: 4});
db.dishTags.save("tags/sweet", "dishes/item3", {score: 7})
db.dishTags.save("tags/chocolate", "dishes/item2", {score: 2})
db.dishTags.save("tags/vanilla", "dishes/item3", {score: 3})
db.dishTags.save("tags/spicy", "dishes/item1", {score: 666})
Our relations are like this:
user-[userPreferences]->tags-[dishTags]->dishes
Finding out what user1 likes can be done with this query:
FOR v, e IN 1..2 OUTBOUND "user/user1" userPreferences, dishTags
RETURN {item: v, connection: e}
If you now want to find all dishes that user1 likes best:
FOR v, e IN 2..2 OUTBOUND "user/user1" userPreferences, dishTags
FILTER e.score > 4 RETURN v
We filter for the score attribute.
Now we want to find another user that has the same preferences as user1 does:
FOR v, e IN 2..2 ANY "user/user1" userPreferences RETURN v
We go in ANY direction (forward and backward), but are only interested in the userPreferences edge collection; otherwise 2..2 would also give us dishes. This way, we go back into the user collection to find users with similar preferences.
Whether creating a Foxx service is a good option depends on personal preference. Foxx is great if you want to combine and filter results on the server side, so there is less client communication. You can also use it if you would rather build your application on top of microservices than on database queries. Your application then stays free of database-specific code and only talks to the microservice as its backend. There may be use cases where Foxx
In general, there is no "correct" way - there are different ways which you may prefer above others because of performance, code cleanness, scalability, etc.
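For comparison, here is a plain JavaScript sketch of the scoring rule from the question, the kind of logic that could live in a Foxx service or in the client. It uses the item and preference data from the question rather than the graph collections above, and the function names score and recommend are hypothetical.

```javascript
const items = [
  { name: "item1", tags: ["mexican", "spicy"] },
  { name: "item2", tags: ["sweet", "chocolate", "nuts"] },
  { name: "item3", tags: ["sweet", "vanilla", "cold"] },
];
const foodPref = { sweet: 4, chocolate: 11 };

// Sum the preference weight of every tag the item carries.
function score(item, prefs) {
  return item.tags.reduce((sum, tag) => sum + (prefs[tag] || 0), 0);
}

// Rank items by score, best first.
function recommend(allItems, prefs) {
  return allItems
    .map((item) => ({ name: item.name, score: score(item, prefs) }))
    .sort((a, b) => b.score - a.score);
}

const ranking = recommend(items, foodPref);
console.log(ranking);
```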
I am trying to learn how to use map/reduce functions with Couchbase. Until now I have built report engines on SQL, using WHERE with multiple terms (adding and removing terms) and modifying the GROUP BY part.
I am trying to create this report engine using views.
My problem is how to create a report that lets users drill down and find more and more data, all the way to individual IP stats.
For example: how many clicks were there today? From which traffic source? What did they see? From which country? And so on.
My basic doc for this example look like this:
"1"
{
"date": "2014-01-13 10:00:00",
"ip": "111.222.333.444",
"country": "US",
"source":"1",
}
"2"
{
"date": "2014-01-13 10:00:00",
"ip": "555.222.333.444",
"country": "US",
"source":"1",
}
"3"
{
"date": "2014-01-13 11:00:00",
"ip": "111.888.888.888",
"country": "US",
"source":"2",
}
"4"
{
"date": "2014-01-13 11:00:00",
"ip": "111.777.777.777",
"country": "US",
"source":"1",
}
So I want to let the user see, on the first screen, how many clicks per day there are on this site.
For that I need to count the clicks, with a simple map/reduce:
MAP:
function (doc, meta) {
emit(dateToArray(doc.date),1);
}
Reduce:
_count
With group_level=4 and group=true,
this gives the number of clicks per hour.
Now, if I want to allow a breakdown of countries, I need a dynamic parameter to change. From what I understand, that can only be the group level.
So assume I have added the source to the emit like this:
emit([dateToArray(doc.date), doc.source], 1);
Then group level 5 would allow this split, with the key used to focus on a certain date. But what if I need to add a country breakdown? Add it to the emit again?
This seems like a mess, especially if I want the country stats before the source. Is there a smarter way to do this?
Second part...
What if I want to get the first count as follows:
[2014,1,28,10] {ip:"555.222.333.444","111.222.333.444","count":"2"}
I want to see all the IPs that are counted for this time.
How should I write my reduce function?
This is my current attempt, which doesn't work:
function(key, values, rereduce) {
var result = {id: 0, count: 0};
for(i=0; i < values.length; i++) {
if(rereduce) {
result.id = result.id + (values[i]).ip +',';
result.count = result.count + values[i].count;
} else {
result.id = values.ip;
result.count = values.length;
}
}
return result;
}
I didn't get the answer format I was looking for.
I hope this is not too messy and that you can help me with this.
Thanks!
For the first part of your question, I think you are on the right track. That is how you break down views to enable coarse drill down. However, it is important to remember that views are not intended to store your entire documents, nor are they necessarily going to be able to give you a clean cut swatch of data. You probably will need to do fine-filtering within the access layer of your code (using Linq perhaps).
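As a rough illustration of that drill-down, the sketch below emits one flat key ordered from coarse to fine (year, month, day, hour, country, source) and simulates a _count reduce at different group levels in memory. The dateToArray defined here is only a stand-in for the built-in helper so that the code runs outside the server, and countByGroupLevel is a hypothetical name. Note that the key is a single flat array: group levels count top-level key elements, so the date parts are not nested inside another array.

```javascript
// Stand-in for the built-in dateToArray, so the sketch runs outside the server.
function dateToArray(text) {
  return text.match(/\d+/g).map(Number); // "2014-01-13 10:00:00" -> [2014,1,13,10,0,0]
}

// Map: one flat key, coarse to fine: year, month, day, hour, country, source.
function map(doc, emit) {
  const hour = dateToArray(doc.date).slice(0, 4);
  emit(hour.concat([doc.country, doc.source]), 1);
}

// In-memory stand-in for a _count reduce queried with ?group_level=<level>.
function countByGroupLevel(docs, level) {
  const counts = {};
  docs.forEach((doc) => map(doc, (key) => {
    const group = JSON.stringify(key.slice(0, level));
    counts[group] = (counts[group] || 0) + 1;
  }));
  return counts;
}

const clicks = [
  { date: "2014-01-13 10:00:00", ip: "111.222.333.444", country: "US", source: "1" },
  { date: "2014-01-13 10:00:00", ip: "555.222.333.444", country: "US", source: "1" },
  { date: "2014-01-13 11:00:00", ip: "111.888.888.888", country: "US", source: "2" },
  { date: "2014-01-13 11:00:00", ip: "111.777.777.777", country: "US", source: "1" },
];

console.log(countByGroupLevel(clicks, 4)); // clicks per hour
console.log(countByGroupLevel(clicks, 5)); // per hour and country
console.log(countByGroupLevel(clicks, 6)); // per hour, country and source
```

Because the group level only truncates the key from the right, putting source before country needs a second view or a second emit with the other order.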
For the second part of your question, a reduce is not the appropriate mechanism to accomplish this. Reduce values have a very finite (and limited) size and will crash the map/reduce engine once they get too big. I suspect you have experimented with that and discovered this for yourself.
The way you worded the question, it seems like you wish to search for all IP addresses that have been counted "X" number of times. This cannot be accomplished directly in Couchbase's map/reduce architecture; however, if you simply want the count for a given IP address, that is something the map/reduce framework has built-in (just use Date + IP as a key).
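If the goal is simply to list the IPs behind each hourly count, one option is to skip the reduce and do the grouping in the access layer. The sketch below assumes the view is queried with reduce=false and that the map function emitted the hour key with doc.ip as the value; that row shape and the function name ipsPerHour are assumptions made for the example.

```javascript
// Rows as they might come back from a view queried with reduce=false,
// where the map function emitted (hourKey, doc.ip). The shape is assumed.
const rows = [
  { key: [2014, 1, 13, 10], value: "111.222.333.444" },
  { key: [2014, 1, 13, 10], value: "555.222.333.444" },
  { key: [2014, 1, 13, 11], value: "111.888.888.888" },
  { key: [2014, 1, 13, 11], value: "111.777.777.777" },
];

// Group rows by key and collect the IPs plus a count for each hour.
function ipsPerHour(viewRows) {
  const byHour = new Map();
  viewRows.forEach((row) => {
    const id = JSON.stringify(row.key);
    if (!byHour.has(id)) byHour.set(id, { key: row.key, ips: [], count: 0 });
    const entry = byHour.get(id);
    entry.ips.push(row.value);
    entry.count += 1;
  });
  return [...byHour.values()];
}

const report = ipsPerHour(rows);
console.log(JSON.stringify(report, null, 2));
```

This keeps the reduce output small (or absent) and moves the unbounded list of IPs to the client, in line with the size limit on reduce values described above.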