How should I go about implementing search-as-you-type in Amazon CloudSearch to search Amazon DynamoDB data, the way Algolia does it?
You can implement search-as-you-type by issuing a prefix search every time the user enters a character -- it would look something like this:
(prefix field=name 'dri')
The prefix search is necessary because a regular search for q=dri would not match drive, drivel, etc.
Here are the prefix search docs: http://docs.aws.amazon.com/cloudsearch/latest/developerguide/searching-text.html#searching-text-prefixes
If you don't want to specify the fields for your prefix search you can use a query of the form q=dri* | dri (the non-* term is necessary because q=dri* does not match the word "dri" -- it requires there to be at least one additional character).
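The two query forms above can be generated per keystroke with a small helper. This is a minimal sketch in Python, not an official client: the function names are hypothetical, the field name `name` is taken from the example above, and the `size` value is an arbitrary assumption. It only builds the query string; sending it to your domain's search endpoint is left out.

```python
from urllib.parse import urlencode


def structured_prefix_query(field: str, typed: str) -> str:
    """Build a structured-parser expression such as (prefix field=name 'dri')."""
    # Backslashes and single quotes inside the quoted string are backslash-escaped.
    escaped = typed.replace("\\", "\\\\").replace("'", "\\'")
    return f"(prefix field={field} '{escaped}')"


def simple_prefix_query(typed: str) -> str:
    """Build the field-agnostic form: the prefix term OR the exact term.

    Assumes `typed` is a single word, as in the example above.
    """
    return f"{typed}* | {typed}"


def search_params(typed: str, field: str = "name", size: int = 5) -> str:
    """URL-encode the parameters for one keystroke's request (hypothetical helper)."""
    return urlencode({
        "q": structured_prefix_query(field, typed),
        "q.parser": "structured",
        "size": size,
    })


print(structured_prefix_query("name", "dri"))  # (prefix field=name 'dri')
print(simple_prefix_query("dri"))              # dri* | dri
```

In practice you would probably debounce the keystrokes on the client so that a fast typist does not trigger one request per character.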
Folks, I was wondering what the best way is to model documents and/or map functions so that I can run "Not Equals" queries.
For example, my documents are:
1. { name : 'George' }
2. { name : 'Carlin' }
I want to run a query that returns every document where name does not equal 'John'.
Note: I don't have all possible names beforehand, so the query parameter can be any arbitrary text, like 'John' in my example.
In short: there is no easy solution.
You have four options:
sending a multi-range query
filtering the view response with a server-side list function
using a CouchDB plugin
using the Mango query language
Sending a multi-range query
You can request the view with two ranges defined by startkey and endkey. Choose the ranges so that the key John is not requested.
Unfortunately, you have to track down the commit request for this, which exists somewhere, and compile CouchDB with it yourself. It's not included in the official source.
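A possible workaround that needs no patched build, and which is a different technique from the single multi-range request described above, is to issue two ordinary view requests: one for everything below the excluded key and one for everything above it. The sketch below only builds the two query strings. The function name is hypothetical, and it relies on the standard view parameters endkey, inclusive_end and descending.

```python
import json
from urllib.parse import urlencode


def not_equal_ranges(excluded_key: str) -> list[str]:
    """Return query strings for two view requests that together skip excluded_key."""
    key = json.dumps(excluded_key)  # view keys are JSON-encoded in the URL
    # Request 1: ascending from the first key, stopping just before the excluded key.
    below = {"endkey": key, "inclusive_end": "false"}
    # Request 2: descending from the last key, stopping just before the excluded key.
    above = {"descending": "true", "endkey": key, "inclusive_end": "false"}
    return [urlencode(below), urlencode(above)]


for query_string in not_equal_ranges("John"):
    # The view path here is a made-up example.
    print("/db/_design/people/_view/by_name?" + query_string)
```

The rows of the second request arrive in descending key order, so reverse them on the client if you need one sorted result.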
Filtering the view response with a server-side list function
It's not recommended, but you can use a list function and skip the row with the key John in the response, much as you would filter a JavaScript array.
Using a CouchDB plugin
Create an additional index with e.g. couchdb-lucene. The Lucene server has such query capabilities.
Using the "Mango" query language
It's included in the CouchDB 2.0 developer preview. It's not ready for production yet, but it will definitely be included in the stable release.
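For the question's example, a Mango "not equals" query could look roughly like this. It is a minimal sketch that only builds the JSON body. The helper name is hypothetical, and it assumes the body is POSTed to the database's _find endpoint and that the $ne operator is available in your CouchDB version.

```python
import json


def not_equals_selector(field: str, value: str) -> dict:
    """Build a Mango query body matching documents where field != value."""
    return {"selector": {field: {"$ne": value}}}


body = json.dumps(not_equals_selector("name", "John"))
print(body)  # {"selector": {"name": {"$ne": "John"}}}
```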
I want to develop an app that pulls the singers of any song that we query for. So if someone types in Carry On from the Some Nights album, the app is supposed to pull out who all sang that song. Thanks.
You can search for this using the Freebase Search API and Search Metaschema like this:
https://www.googleapis.com/freebase/v1/search?query=Carry+On&filter=(all+/music/release_track/release:"Some+Nights")&output=(/music/release_track/release+/music/release_track/recording./music/recording/artist)
There are three parts to this API request: the query, the filter and the output parameter. The query is simply the name of the track that you're looking for:
query=Carry+On
The filter parameter constrains the results to only tracks which are part of an album release named "Some Nights":
filter=(all+/music/release_track/release:"Some+Nights")
The output parameter tells the API which properties to return in the response. In this case we want to know which release the track is part of and which artist recorded the track.
output=(/music/release_track/release+/music/release_track/recording./music/recording/artist)
You'll notice that this query actually returns 8 matching tracks right now. This is because there were many different releases of the album which all contained recordings of that track (and not necessarily the exact same recording).
For what you're building it sounds like you should be able to just take the first result. You can constrain the search API to only return the first result by adding a limit parameter to the request:
limit=1
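Putting the four parameters together in code might look like the following. This is a sketch in Python that only assembles the request URL from the pieces described above. The function name is hypothetical, and no API key handling or HTTP call is shown.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://www.googleapis.com/freebase/v1/search"


def track_artist_url(track: str, album: str, limit: int = 1) -> str:
    """Build a search URL for the artists of `track` on releases named `album`."""
    params = {
        # The track name to look for.
        "query": track,
        # Only tracks that are part of a release with this name.
        "filter": f'(all /music/release_track/release:"{album}")',
        # Return the release and the recording's artist.
        "output": "(/music/release_track/release "
                  "/music/release_track/recording./music/recording/artist)",
        # Keep only the first match.
        "limit": limit,
    }
    # urlencode turns spaces into "+" and percent-encodes the other special characters.
    return f"{SEARCH_URL}?{urlencode(params)}"


print(track_artist_url("Carry On", "Some Nights"))
```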
I'm trying to work out how to use ActiveRecord to return some data based on a nested model.
My relationship is set up as below:
User - Has many books
Book - Has many users
UserBook - belongs to user and belongs to book
I can access users through books like so:
book.users.first
book.users.second
etc.
I'd like to select all the books that do not have a particular user.
I have written a query like this. Please note that the near method is provided by the Geocoder gem.
Book.near(location, distance).joins(:users).where("users.id != #{@current_user.id}")
I believe the syntax is correct and no errors occur; however, the query still returns books that belong to the current user.
The issue appears to be that if book.users contains a user whose id is not the current user's id AND also contains the current user, the book is still returned.
I can get the desired result using code like this, but I presume there is a way to get ActiveRecord to do it for me.
search = Book.near(location, distance).reject do |book|
  # Drop every book that the current user already has.
  book.users.include?(@current_user)
end
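The behaviour described above follows from how the join works: it produces one row per (book, user) pair, so a book shared with another user still has a row that passes the != test. The sketch below reproduces this with Python's sqlite3 module and shows one possible fix, a NOT IN subquery. It illustrates the SQL only and is not ActiveRecord code. The table names users, books and user_books and the sample rows are assumptions, not taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY);
CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE user_books (user_id INTEGER, book_id INTEGER);
INSERT INTO users (id) VALUES (1), (2);
INSERT INTO books (id, title) VALUES
  (10, 'shared'), (20, 'other only'), (30, 'mine only');
INSERT INTO user_books (user_id, book_id) VALUES
  (1, 10), (2, 10), (2, 20), (1, 30);
""")
current_user_id = 1

# Join plus "!=": the (book 10, user 2) row passes, so the shared book leaks through.
joined = conn.execute("""
    SELECT DISTINCT books.id, books.title FROM books
    JOIN user_books ON user_books.book_id = books.id
    JOIN users ON users.id = user_books.user_id
    WHERE users.id != ?
    ORDER BY books.id""", (current_user_id,)).fetchall()

# Subquery: exclude every book that has any row for the current user.
excluded = conn.execute("""
    SELECT books.id, books.title FROM books
    WHERE books.id NOT IN (SELECT book_id FROM user_books WHERE user_id = ?)
    ORDER BY books.id""", (current_user_id,)).fetchall()

print([title for _, title in joined])    # ['shared', 'other only']
print([title for _, title in excluded])  # ['other only']
```

Note that the subquery form also returns books that have no users at all, which the join form never does. Whether that is desirable depends on your data.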
I'm trying to figure out how to model data in Riak. Let's say you are building something like a CMS with two features, news and products. You need to be able to store this information for multiple clients X and Y. How would you typically structure this?
One bucket per client and then two keys news and products. Store multiple objects under each key and then use map/reduce to order them.
Store both the news and the products in the same bucket, but with a new autogenerated key for each news item and product item. That is, one bucket for X and one for Y.
One bucket per client/feature combination, that is, the buckets would be X-news, X-products, Y-news and Y-products. Then use map/reduce on the whole bucket to return the results in order.
Which would be the best way to handle this problem?
I'd create 2 buckets: news and products.
Then I'd prefix keys in each bucket with client names.
I'd probably also include dates in news keys for easy date ranging.
news/acme_2011-02-23_01
news/acme_2011-02-23_02
news/bigcorp_2011-02-21_01
And optionally prefix product names with category names:
products/acme_blacksmithing_anvil
products/bigcorp_databases_oracle
Then in your map/reduce you could use key filtering:
// BigCorp News items
{
  "inputs":{
    "bucket":"news",
    "key_filters":[["starts_with", "bigcorp"]]
  }
  // ... rest of mapreduce job
}
// Acme Blacksmithing items
{
  "inputs":{
    "bucket":"products",
    "key_filters":[["starts_with", "acme_blacksmithing"]]
  }
  // ... rest of mapreduce job
}
// News for all clients from Feb 12th to 19th
{
  "inputs":{
    "bucket":"news",
    "key_filters":[["tokenize", "_", 2],
                   ["between", "2011-02-12", "2011-02-19"]]
  }
  // ... rest of mapreduce job
}
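The key scheme and the filter inputs above could be generated by a few helpers like these. This is a sketch in Python with hypothetical function names. It only builds key strings and the "inputs" part of a job, and it assumes client names never contain an underscore, since the tokenize filter splits on it.

```python
from datetime import date


def news_key(client: str, day: date, seq: int) -> str:
    """Key such as acme_2011-02-23_01: client, ISO date, per-day sequence number."""
    return f"{client}_{day.isoformat()}_{seq:02d}"


def product_key(client: str, category: str, name: str) -> str:
    """Key such as acme_blacksmithing_anvil."""
    return f"{client}_{category}_{name}"


def starts_with_inputs(bucket: str, prefix: str) -> dict:
    """Inputs selecting every key in `bucket` that begins with `prefix`."""
    return {"inputs": {"bucket": bucket,
                       "key_filters": [["starts_with", prefix]]}}


def date_range_inputs(bucket: str, first: date, last: date) -> dict:
    """Inputs selecting keys whose second "_"-separated token is in [first, last]."""
    return {"inputs": {"bucket": bucket,
                       "key_filters": [["tokenize", "_", 2],
                                       ["between", first.isoformat(), last.isoformat()]]}}


print(news_key("acme", date(2011, 2, 23), 1))  # acme_2011-02-23_01
print(product_key("bigcorp", "databases", "oracle"))  # bigcorp_databases_oracle
print(date_range_inputs("news", date(2011, 2, 12), date(2011, 2, 19)))
```

Zero-padded ISO dates sort lexicographically in date order, which is what makes the between filter work on plain strings.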
An even more efficient approach than key filtering (as per Kev Burns's recommendation) is to model this scenario with Secondary Indexes or Riak Search.
Take a look at my answers to Which clustered NoSQL DB for a Message Storing purpose? and Links in Riak: what can they do/not do, compared to graph databases? for a discussion of similar cases.
You have several decisions to make, depending on your use case. In all cases, you would start out with a company bucket, so that each company has a unique key.
1) Whether to store the items of interest in 2 separate buckets (news and products) or in one (something like items_of_interest) depends on your preference and ease of querying. If you're always going to be querying for both news and products for a company in a single query, you might as well store them in a single bucket. But I recommend using 2 separate ones, to keep easier track of them, especially if you'll have something like separate tabs or pages for "Company X - Products" and "Company X - News". And if you need to combine them into a single feed, you would make 2 queries (one for news and one for products), and combine them in the client code (by date or whatever).
2) If a news/product item can have one and only one company that it belongs to, create a secondary index on company_key for each item. That way, you can easily fetch all news or products for a company via a secondary index (2i) query for that company.
3) If there's a many-to-many relationship (a news/product item can belong to several companies, for example a news item about a joint venture between 2 separate companies), then I recommend modeling the relationship as a separate Riak object. For example, you could create a mentions bucket, and for each company mentioned in a news story, you would insert a Mention object with its own unique key and a secondary index for company_key. Its value would contain a type ('news' or 'product') and an item_key (news key or product key).
Extracting relationships to separate Riak objects like this allows you to do a lot of interesting things -- tag them arbitrarily using Riak Search, query them for subscription event notifications, etc.
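As a rough illustration of the Mention object from point 3, the sketch below models it in Python and shows what storing and querying it over Riak's HTTP interface might involve. The class, the helper and the sample keys are hypothetical. The x-riak-index-..._bin header and the /buckets/.../index/... path reflect my understanding of the 2i HTTP conventions, so check them against the documentation for your Riak version. No request is actually sent.

```python
import json
from dataclasses import dataclass


@dataclass
class Mention:
    key: str          # unique key of the Mention object itself
    company_key: str  # indexed: which company is mentioned
    item_type: str    # 'news' or 'product'
    item_key: str     # key of the news or product object

    def headers(self) -> dict:
        """HTTP headers for the store request, including the secondary index entry."""
        return {
            "Content-Type": "application/json",
            "x-riak-index-company_key_bin": self.company_key,
        }

    def body(self) -> str:
        """JSON value stored under the Mention's key."""
        return json.dumps({"type": self.item_type, "item_key": self.item_key})


def index_query_path(bucket: str, index: str, value: str) -> str:
    """Path of an exact-match 2i query returning the keys indexed under `value`."""
    return f"/buckets/{bucket}/index/{index}/{value}"


mention = Mention("m-001", "acme", "news", "acme_2011-02-23_01")
print(mention.headers())
print(mention.body())
print(index_query_path("mentions", "company_key_bin", "acme"))
```

The query returns the matching Mention keys. Fetching each Mention and then the item it points to takes further reads, which is the usual cost of modeling the relationship as its own object.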