Creating collections in solr4.4 cloud with different schemas - solrcloud

How can I create collections in Solr 4.4 cloud with different schemas?
I want to create collections in Solr 4.4 cloud, each with two shards. However, each collection should have a schema of its own, i.e. the collections have different schemas.

You would have to upload two configs to ZooKeeper, one for each schema, and then make two Collections API calls like:
curl -v "http://localhost:8080/solr/admin/collections?action=CREATE&name=collection1&numShards=2&collection.configName=collection1"
and
curl -v "http://localhost:8080/solr/admin/collections?action=CREATE&name=collection2&numShards=2&collection.configName=collection2"
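The same two calls can be sketched in Python by building the Collections API URLs from their parameters. This is only an illustration: the base URL and the collection and config names are copied from the curl commands above, the helper name `create_collection_url` is made up for this example, and it assumes both config sets have already been uploaded to ZooKeeper.

```python
from urllib.parse import urlencode

# Base URL taken from the curl commands above; adjust to your installation.
SOLR_BASE = "http://localhost:8080/solr"


def create_collection_url(name, config_name, num_shards=2, base=SOLR_BASE):
    """Build a Collections API CREATE URL for one collection (hypothetical helper)."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "collection.configName": config_name,
    }
    return f"{base}/admin/collections?{urlencode(params)}"


# One collection per uploaded config set; the names mirror the curl examples.
urls = [create_collection_url(n, n) for n in ("collection1", "collection2")]
for url in urls:
    print(url)
```

Each printed URL can be passed to curl or any HTTP client. The only thing that differs between the two collections is the collection.configName value, which is what gives each one its own schema.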

Related

Is it possible to generate jobs in Dagster dynamically using configuration from database

Currently, my database has multiple departments. I need to apply a data pipeline to all of these departments with different configurations.
I want to load the configuration for each department from a database, then use these configurations to generate a list of jobs in Dagster.
For example, I have 3 tenants:
Department1: Configuration1
Department2: Configuration2
Department3: Configuration3
This information is stored in my database.
How can I load this information and dynamically create 3 jobs (pipelines):
Pipeline1 for Department1 with Configuration1
Pipeline2 for Department2 with Configuration2
Pipeline3 for Department3 with Configuration3
Is it possible to do it on Dagster? I can do it with Airflow (dynamically generating DAGs) but not sure how to do this in Dagster. I cannot load database configuration outside of op/job in Dagster.
In Dagster, your @repository function is just a regular function, so you can run arbitrary code in there to query your database and generate jobs dynamically:
from dagster import repository

@repository
def my_repo():
    configs = ...  # some query to your database
    jobs = []
    for config in configs:
        # get_job_for_my_config is your own factory that builds one job per config
        jobs.append(get_job_for_my_config(config))
    return jobs
If you expect the database call to be somewhat expensive in terms of time, you can look into making your repository lazy-loaded; the Dagster RepositoryDefinition docs describe how to do this.
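As a rough sketch of the "query to your database" step, the following uses an in-memory SQLite database as a stand-in for the real one. The table name `department_configs`, its columns, and the helpers `load_department_configs` and `job_name_for` are all assumptions made for this example. It deliberately leaves Dagster out and only shows how one config (and one distinct job name) per department could be produced before handing each config to a job factory.

```python
import json
import sqlite3


def load_department_configs(conn):
    """Return one dict per department row (hypothetical table and columns)."""
    rows = conn.execute(
        "SELECT department, config_json FROM department_configs ORDER BY department"
    ).fetchall()
    return [{"department": dept, "config": json.loads(cfg)} for dept, cfg in rows]


def job_name_for(config):
    """Derive a distinct job name from a department config."""
    return f"pipeline_{config['department'].lower()}"


# Demo data: an in-memory database standing in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department_configs (department TEXT, config_json TEXT)")
conn.executemany(
    "INSERT INTO department_configs VALUES (?, ?)",
    [(f"Department{i}", json.dumps({"setting": f"Configuration{i}"})) for i in (1, 2, 3)],
)

configs = load_department_configs(conn)
job_names = [job_name_for(c) for c in configs]
print(job_names)
```

Inside the repository function, each entry of `configs` would be passed to the job factory, with something like `job_name_for` supplying the job's name.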

How to get the size of a partition in Cosmos DB collection?

I am really surprised that I could not find this information.
I have seen a comment (without details) that there is a REST API to get this information.
I will try to find this information.
You can use the Azure CLI to get the size of a MongoDB collection like this:
az cosmosdb mongodb collection show -g myRG -a mycosmosaccount -d mydatabase -n mycollection
I also think you can get this via REST API on Management Plane. API sample for that is here

List available collections for database in ArangoDB using HTTP interface?

I am trying to use ArangoDB's HTTP interface to dump all collections belonging to a specific database.
I am able to view all available databases using the following command:
curl http://localhost:8529/_api/database
However, once I find a database name (for example, "test") I am unable to dump the collections belonging to this database. Ultimately, I would like to dump the collections for this database, and then all results within a chosen collection.
I have followed the documentation provided here: https://www.arangodb.com/docs/stable/http/general.html, however I am still unable to find the relevant documentation for this request.
You can get the list of all collections with
curl http://localhost:8529/_db/DBNAME/_api/collection
as is implied in this part of the documentation: https://www.arangodb.com/docs/stable/http/collection.html#address-of-a-collection
The rest of the interface works accordingly, e.g.
curl http://localhost:8529/_db/DBNAME/_api/collection/COLLNAME
to get information about a single collection (that is already included in the output of the first call).
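The URL pattern above can be sketched as a small Python helper. The host and port are taken from the curl commands; the function name `collection_url` and the example names "test" and "users" are only placeholders for illustration.

```python
from urllib.parse import quote

# Host and port taken from the curl commands above.
ARANGO_BASE = "http://localhost:8529"


def collection_url(db_name, collection=None, base=ARANGO_BASE):
    """Build the collection endpoint URL for a database (hypothetical helper).

    Without a collection name the URL lists all collections of the database;
    with one it addresses that single collection.
    """
    url = f"{base}/_db/{quote(db_name, safe='')}/_api/collection"
    if collection is not None:
        url += f"/{quote(collection, safe='')}"
    return url


print(collection_url("test"))
print(collection_url("test", "users"))
```

The database name is the only part that changes between databases, so the same helper covers every call that is scoped with the /_db/DBNAME prefix.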
You can find the complete Swagger documentation with just two clicks in the web interface.

Does Firebase Realtime Database REST API support multi path updates at different entity locations?

I am using the REST API of Firebase Realtime Database from an AppEngine Standard project with Java. I am able to successfully put data under different locations, however I don't know how I could ensure atomic updates to different paths.
To put some data separately at a specific location I am doing:
requestFactory.buildPutRequest("dbUrl/path1/17/", new ByteArrayContent("application/json", json1.getBytes())).execute();
requestFactory.buildPutRequest("dbUrl/path2/1733455/", new ByteArrayContent("application/json", json2.getBytes())).execute();
Now to ensure that when saving a /path1/17/ a /path2/1733455/ is also saved, I've been looking into multi path updates and batched updates (https://firebase.google.com/docs/firestore/manage-data/transactions#batched-writes, only available in Cloud Firestore?) However, I did not find whether this feature is available for the REST API of the Firebase Realtime Database as well or only through the Firebase Admin SDK.
The example here shows how to do a multi path update at two locations under the "users" node.
curl -X PATCH -d '{
"alanisawesome/nickname": "Alan The Machine",
"gracehopper/nickname": "Amazing Grace"
}' \
'https://docs-examples.firebaseio.com/rest/saving-data/users.json'
But I don't have a common upper node for path1 and path2.
I tried setting the URL to the database URL without any nodes (https://db.firebaseio.com.json) and adding the nodes in the JSON object sent, but I get an error: nodename nor servname provided, or not known.
This would be possible with the Admin SDK I think, according to this blog post: https://firebase.googleblog.com/2015/09/introducing-multi-location-updates-and_86.html
Any ideas if these atomic writes can be achieved with the REST API?
Thank you!
If the updates are going to a single database, there is always a common path.
In your case you'll run the PATCH command against the root of the database:
curl -X PATCH -d '{
"path1/17": json1,
"path2/1733455": json2
}' 'https://yourdatabase.firebaseio.com/.json'
The key difference with your URL seems to be the / before .json. Without that you're trying to connect to a domain on the json TLD, which doesn't exist (yet) afaik.
Note that the documentation link you provide for Batched Updates is for Cloud Firestore, which is a completely separate database from the Firebase Realtime Database.
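For illustration, here is a minimal Python sketch of the same root-level PATCH, built with the standard library only. The database URL is the placeholder from the curl example, the payload values are invented, and the helper name `build_multi_path_patch` is hypothetical. Authentication is left out, and the request is only constructed, not sent.

```python
import json
import urllib.request


def build_multi_path_patch(db_url, updates):
    """Build a PATCH request against the database root (hypothetical helper).

    `updates` maps paths relative to the root, such as "path1/17",
    to the JSON value that should be written at that path.
    """
    return urllib.request.Request(
        url=f"{db_url.rstrip('/')}/.json",  # note the "/" before ".json"
        data=json.dumps(updates).encode("utf-8"),
        method="PATCH",
        headers={"Content-Type": "application/json"},
    )


# Placeholder values standing in for json1 and json2.
request = build_multi_path_patch(
    "https://yourdatabase.firebaseio.com",
    {"path1/17": {"name": "first"}, "path2/1733455": {"name": "second"}},
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it. The same shape (one PATCH to the root with both paths as keys of a single JSON object) carries over to the Java request factory used in the question.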

Adding shard in collection using collection API in solr

I am using Apache Solr to create collections and shards. I am able to build a collection using
sudo curl 'http://localhost:8983/solr/admin/collections?action=CREATE&name=demo&numShards=2&replicationFactor=1'
Here, collection name = "demo"
number of Shards = "2"
But when I try to add a new shard using
sudo curl 'http://localhost:8983/solr/admin/collections?action=CREATESHARD&shard=shard3&collection=demo'
it gives this error:
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">400</int><int name="QTime">1</int></lst><lst name="error"><str name="msg">shards can be added only to 'implicit' collections</str><int name="code">400</int></lst>
</response>
From the documentation for CREATESHARD:
Shards can only created with this API for collections that use the 'implicit' router. Use SPLITSHARD for collections using the 'compositeId' router. A new shard with a name can be created for an existing 'implicit' collection.
So the proper way to do this is to issue a SPLITSHARD command instead, and then remove the old shard after the two new shards have been created. From the SPLITSHARD documentation:
Splitting a shard will take an existing shard and break it into two pieces. The original shard will continue to contain the same data as-is but it will start re-routing requests to the new shards. The new shards will have as many replicas as the original shard. After splitting a shard, you should issue a commit to make the documents visible, and then you can remove the original shard (with the Core API or Solr Admin UI) when ready.
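A SPLITSHARD call for the "demo" collection from the question could be built like this in Python. The base URL comes from the question's curl commands; the shard name "shard1" is an assumption about how the existing shards are named, and `split_shard_url` is a hypothetical helper for this example.

```python
from urllib.parse import urlencode


def split_shard_url(collection, shard, base="http://localhost:8983/solr"):
    """Build a Collections API SPLITSHARD URL (hypothetical helper)."""
    params = {"action": "SPLITSHARD", "collection": collection, "shard": shard}
    return f"{base}/admin/collections?{urlencode(params)}"


# "shard1" is assumed; check the actual shard names of your collection first.
url = split_shard_url("demo", "shard1")
print(url)
```

The printed URL can be used with curl in the same way as the CREATE call in the question.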
Shards can only created with this API for collections that use the 'implicit' router (i.e., when the collection was created, router.name=implicit). A new shard with a name can be created for an existing 'implicit' collection.
Reference: https://solr.apache.org/guide/8_6/collection-management.html
