Send data to a specific Solr collection

I would like to know how to send data to a specific collection in a running Solr instance (actually a running SolrCloud instance).
I've started a SolrCloud instance with a bunch of hand-made collections (created using the SolrCloud Collections REST API) and now want to send some data to a specific collection in order to easily distinguish one sort of data from another. Unfortunately I haven't found a way to do that.
Is it possible? If it is, then how?

The collection is part of the URL you're using when querying any server, usually located under http://localhost:8983/solr/<collection name>/. If you're querying the products collection to retrieve all documents, the URL would be http://localhost:8983/solr/products/select?q=*:*. The same goes for updating a collection: POST your content to http://localhost:8983/solr/products/update.
Replace localhost:8983 with one of your own server host/port combinations.
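For example, a minimal sketch of indexing a couple of JSON documents into a hypothetical products collection (any HTTP client works; this uses Node's built-in fetch, and the host, collection name, and documents are placeholders):

```typescript
// Minimal sketch: add documents to the "products" collection and commit.
// Assumes a SolrCloud node at localhost:8983 and an existing "products" collection.
const solrUpdateUrl = "http://localhost:8983/solr/products/update?commit=true";

async function addToProducts(docs: object[]): Promise<void> {
  const response = await fetch(solrUpdateUrl, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    // Solr's JSON update handler accepts a plain array of documents.
    body: JSON.stringify(docs),
  });
  if (!response.ok) {
    throw new Error(`Solr update failed: ${response.status} ${await response.text()}`);
  }
}

// Usage: index two documents, then query them back via /solr/products/select?q=*:*
addToProducts([
  { id: "1", name: "Widget" },
  { id: "2", name: "Gadget" },
]).catch(console.error);
```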
You can also see further examples in the Getting Started with Solr Cloud tutorial.

Related

Is there any way to run Firebase Realtime Database queries with multiple keys?

I am working on a personal project to recreate the news feed of Facebook. What I am trying to do is recreate the scenario where, when the user goes to the news feed, they only get posts from the people they follow. Is there any way to run a query like that against the Firebase Realtime Database using a list of "followings"?
I can successfully display a single user's posts in the Android Studio app using a snapshot and a RecyclerView.
If you're asking whether you can get posts from multiple userUID values with a single query, that is not possible.
If you're asking whether you can pass a list of postUID values to retrieve, that is also not possible.
In both cases the solution is to execute a separate query/read operation for each of the values, and merge the results in your application code. This is not nearly as slow as you may think, since Firebase pipelines the requests over a single web socket connection - which is quite efficient. For more on this, see Speed up fetching posts for my social network app by using query instead of observing a single event repeatedly
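As a rough sketch of that fan-out-and-merge approach, assuming posts are stored under a /posts node with an authorUid child and a timestamp child (all of these names are made up for illustration), the web/Node JavaScript SDK version could look like this:

```typescript
import { initializeApp } from "firebase/app";
import { getDatabase, ref, query, orderByChild, equalTo, get } from "firebase/database";

const app = initializeApp({ /* your Firebase config */ });
const db = getDatabase(app);

// One query per followed user; Firebase pipelines these over a single connection,
// so issuing them in parallel is cheap.
async function loadFeed(followedUids: string[]) {
  const snapshots = await Promise.all(
    followedUids.map((uid) =>
      get(query(ref(db, "posts"), orderByChild("authorUid"), equalTo(uid)))
    )
  );

  // Merge the per-user result sets into a single feed in application code.
  const posts: any[] = [];
  snapshots.forEach((snap) => {
    snap.forEach((child) => {
      posts.push({ id: child.key, ...child.val() });
    });
  });

  // Order the merged feed, e.g. newest first by a "timestamp" child.
  posts.sort((a, b) => (b.timestamp ?? 0) - (a.timestamp ?? 0));
  return posts;
}
```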

Archive Data from Cosmos DB - Mongo DB

In the project I am working on, we have a database per tenant, and each tenant consists of at least one department. One of the requirements we have is that when an admin user deletes a department using a custom frontend we've provided, the system should first archive that department's data to blob storage before the data is deleted. We have the same requirement for the tenant: we need to archive the data before the tenant's database is removed from the account.
Now, my question: is there any best practice for doing this? We are planning to retrieve all the data from all collections with a Mongo query, based on the department id (which is also the partition key), and then send it to blob storage. The challenge is executing the query that retrieves all the data: it can be a huge amount, and the RUs required for that action may affect the performance of the system for other users who may be using it while we remove the data.
I looked at mongodump and mongoexport, but these are standalone applications, so we cannot execute them from our code, can we?
Any ideas? Thanks a lot.
I think one way to solve this is by using the change feed, as it really helps and simplifies writing a carbon copy somewhere else.
However, as of now the change feed processor won't notify you about deleted documents, so you can't listen for them; that feature is planned but not yet available.
Your best bet is to write a custom application that does the archiving using the query language support.
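As a hedged sketch of that custom-application route (the connection strings, database name, department-archives container, and departmentId field are all placeholders):

```typescript
import { MongoClient } from "mongodb";
import { BlobServiceClient } from "@azure/storage-blob";

// Sketch: export every collection's documents for one department to blob storage.
async function archiveDepartment(departmentId: string): Promise<void> {
  const mongo = new MongoClient(process.env.COSMOS_MONGO_URI!);
  const blobService = BlobServiceClient.fromConnectionString(process.env.BLOB_CONN_STRING!);
  const container = blobService.getContainerClient("department-archives");

  try {
    await mongo.connect();
    const db = mongo.db("tenant-db");

    for (const info of await db.listCollections().toArray()) {
      // Filtering on the partition key keeps this a single-partition query;
      // for very large departments, iterate the cursor in batches to spread RU usage.
      const docs = await db.collection(info.name).find({ departmentId }).toArray();
      if (docs.length === 0) continue;

      const blob = container.getBlockBlobClient(`${departmentId}/${info.name}.json`);
      const payload = JSON.stringify(docs);
      await blob.upload(payload, Buffer.byteLength(payload));
    }
  } finally {
    await mongo.close();
  }
}
```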

Adobe EchoSign: Is it possible to get documents by name?

We have an EchoSign account with lots of documents in it. We now need to retrieve the documents from another application and would like to use the SOAP or REST API for that. I have an API key and managed to get the SOAP API working. Unfortunately I did not find a way to retrieve documents by their name. The creator of the documents did not provide an externalID.
I got some responses on this issue from Adobe support:
The external ID needs to be set at the time of creating the documents, and cannot be set later on.
The only way to get a document by name is to use the getUsersInAccount request to retrieve a set of all users, then use the getUserDocuments request for each user, then scan the names in the DocumentsListItems, and finally retrieve the document using the documentKey.
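A rough sketch of that user-by-user scan using the node soap package; the operation names come from the steps above, but the request and response field shapes (apiKey, email, documentListItems, and so on) are assumptions that need to be checked against the EchoSign WSDL:

```typescript
import * as soap from "soap";

// Sketch only: walk every user's documents and return the documentKey whose name matches.
async function findDocumentKeyByName(wsdlUrl: string, apiKey: string, targetName: string) {
  // Typed as any because the operation methods are generated from the WSDL at runtime.
  const client: any = await soap.createClientAsync(wsdlUrl);

  // 1. Retrieve the set of all users in the account (argument shape is an assumption).
  const [usersResult] = await client.getUsersInAccountAsync({ apiKey });
  const users = usersResult?.users ?? [];

  for (const user of users) {
    // 2. Retrieve this user's documents (again, the field names are assumptions).
    const [docsResult] = await client.getUserDocumentsAsync({ apiKey, email: user.email });
    const items = docsResult?.documentListItems ?? [];

    // 3. Scan the names and return the documentKey of the first match.
    for (const item of items) {
      if (item.name === targetName) {
        return item.documentKey;
      }
    }
  }
  return undefined; // no document with that name was found
}
```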

Putting records into the Elasticsearch index before the relational database

I have an application which consumes RSS feeds and makes them searchable by performing the following steps:
pulling articles from the feed URL
storing that data in a relational DB
indexing the data in Elasticsearch
I'd like to reverse this process so that I can use the RSS River Elasticsearch plugin to pull data from feeds. However, this plugin integrates directly with Elasticsearch, bypassing my relational DB (which is a problem for other parts of the application which rely on each article having a record in the DB).
How can I have Elasticsearch notify the DB when a new article has been indexed (and de-indexed)?
Edit
Currently I'm using Ruby on Rails 4 with a PostgreSQL DB. RSS feeds are fetched in the background using Sidekiq to manage jobs. They go directly into PG and are then indexed by Elasticsearch. I'm using Chewy to provide an interface to the ES index. It doesn't support callbacks like I'm looking for (no Ruby library does, as far as I know).
Searching queries ES for matches, then loads the records from PG to display the results.
It sounds like you are looking for the sort of notification/trigger functionality described in this feature request. In the absence of that feature I think the approach suggested in that thread by the user "cravergara" is your best bet - that is, you can alter the RSS river Elasticsearch plugin to update your DB whenever an article is indexed.
That would handle the indexing requirement. To sync the de-indexing, you should make sure that any code that deletes your Elasticsearch documents also deletes the corresponding DB records.
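The question's stack is Rails/Chewy, but the pattern is language-agnostic; here is an illustrative sketch (the articles index and table names are assumptions) of pairing every de-index with the matching row delete:

```typescript
import { Client } from "@elastic/elasticsearch";
import { Pool } from "pg";

const es = new Client({ node: "http://localhost:9200" });
const pg = new Pool({ connectionString: process.env.DATABASE_URL });

// Any code path that removes an article must hit both stores in one place,
// so the index and the relational DB can never silently drift apart.
async function removeArticle(articleId: string): Promise<void> {
  // De-index from Elasticsearch...
  await es.delete({ index: "articles", id: articleId });
  // ...then delete the corresponding relational record.
  await pg.query("DELETE FROM articles WHERE id = $1", [articleId]);
}
```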

Firebase and indexing/search

I am considering using Firebase for an application that should allow people to use full-text search over a collection of a few thousand objects. I like the idea of delivering a client-only application (not having to worry about hosting the data), but I am not sure how to handle search. The data will be static, so the indexing itself is not a big deal.
I assume I will need some additional service that runs queries and returns Firebase object handles. I can spin up such a service at some fixed location, but then I have to worry about its availability and scalability. Although I don't expect too much traffic for this app, it can peak at a couple of thousand concurrent users.
Architectural thoughts?
Long-term, Firebase may have more advanced querying, so hopefully it'll support this sort of thing directly without you having to do anything special. Until then, you have a few options:
Write server code to handle the searching. The easiest way would be to run some server code responsible for the indexing/searching, as you mentioned. Firebase has a Node.JS client, so that would be an easy way to interface the service with Firebase. All of the data transfer could still happen through Firebase, but you would write a Node.JS service that watches for client "search requests" at some designated location in Firebase and then "responds" by writing the result set back into Firebase for the client to consume (a sketch of this pattern follows after these options).
Store the index in Firebase with clients automatically updating it. If you want to get really clever, you could try implementing a server-less scheme where clients automatically index their data as they write it... So the index for the full-text search would be stored in Firebase, and when a client writes a new item to the collection, it would be responsible for also updating the index appropriately. And to do a search, the client would directly consume the index to build the result set. This actually makes a lot of sense for simple cases where you want to index one field of a complex object stored in Firebase, but for full-text-search, this would probably be pretty gnarly. :-)
Store the index in Firebase with server code updating it. You could try a hybrid approach where the index is stored in Firebase and is used directly by clients to do searches, but rather than have clients update the index, you'd have server code that updates the index whenever new items are added to the collection. This way, clients could still search for data when your server is down. They just might get stale results until your server catches up on the indexing.
Until Firebase has more advanced querying, #1 is probably your best bet if you're willing to run a little server code. :-)
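A minimal sketch of option #1's watcher/responder pattern with the Admin/Node SDK; the /searchRequests and /searchResults paths and the search() placeholder are made up for illustration, with the real search backed by whatever engine the service wraps:

```typescript
import * as admin from "firebase-admin";

admin.initializeApp(); // relies on GOOGLE_APPLICATION_CREDENTIALS being set
const db = admin.database();

// Placeholder for the actual full-text engine (Lunr, Elasticsearch, ...).
function search(term: string): string[] {
  return []; // return the keys of matching objects
}

// Watch for client "search requests" at a designated location...
db.ref("searchRequests").on("child_added", async (snapshot) => {
  const request = snapshot.val(); // e.g. { term: "firebase" }
  const matches = search(request.term);

  // ..."respond" by writing the result set where the requesting client listens...
  await db.ref(`searchResults/${snapshot.key}`).set(matches);

  // ...and clean up the processed request.
  await snapshot.ref.remove();
});
```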
Google's currently recommended way to do full-text search seems to be syncing with either Algolia or BigQuery using Cloud Functions for Firebase.
Here's Firebase's Algolia full-text search integration example, and their BigQuery integration example, which could be extended to support full-text search.
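A hedged sketch of that Cloud Functions + Algolia sync (firebase-functions v1 trigger API and the algoliasearch v4 client; the /objects path and index name are placeholders):

```typescript
import * as functions from "firebase-functions";
import algoliasearch from "algoliasearch";

const algolia = algoliasearch(process.env.ALGOLIA_APP_ID!, process.env.ALGOLIA_API_KEY!);
const index = algolia.initIndex("objects");

// Mirror every write under /objects into the Algolia index that clients query directly.
export const syncToAlgolia = functions.database
  .ref("/objects/{objectId}")
  .onWrite(async (change, context) => {
    const objectId = context.params.objectId;

    if (!change.after.exists()) {
      // The object was deleted: remove it from the search index as well.
      await index.deleteObject(objectId);
      return;
    }

    // Create or update: push the new value, keyed by the Firebase key.
    await index.saveObject({ objectID: objectId, ...change.after.val() });
  });
```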
