AddToSet operation requires a target array field - azure-cosmosdb

Trying to make use of Azure DocumentDB/CosmosDB using the MongoDB driver. I have learned that there are many limitations, as the full set of features is not currently implemented. I want to use aggregate functions, specifically $group and .distinct, but I don't think those are available yet. As a workaround, I am trying to maintain a separate "tracking" document to enable "distinct". I am trying to update that document using $addToSet, but I am getting the following:
MongoError: Message: {"Errors":["Encountered exception while executing function. Exception = Error: AddToSet operation requires a target array field.\r\nStack trace: Error: AddToSet operation requires a target array field.\n at arrayAddToSet (__.sys.commonUpdate.js:2907:25)\n at handleUpdate (__.sys.commonUpdate.js:2649:29)\n at processOneResult (__.sys.commonUpdate.js:2484:25)\n at queryCallback (__.sys.commonUpdate.js:2461:21)\n at Anonymous function (__.sys.commonUpdate.js:619:29)"]}
The update command I am using:
var usersDocument = collection.updateOne(
    { "type": "users" },
    { $addToSet: { users: "someone#gmail.com" } },
    function(err, count, status) {
        console.log("updateOne err: " + err)
        console.log("updateOne count: " + count)
        console.log("updateOne status: " + status)
    }
)
This seems to me to be a pretty straightforward command, pulled from the Mongo documentation with the fields adjusted as needed. Maybe I am missing something really basic?
My ultimate goal was to keep my code portable so that I could move it into a Mongo cluster if I so desired (and not be locked into anything Azure-specific). To get started without having to manage a multi-server cluster, Azure CosmosDB looked like a great jumpstart, but the limitations are maddening.
UPDATE:
Now that I have fixed my document and I actually have a field with an array, $addToSet is just replacing the value, rather than adding to the array. I'll create a new question for that.

Yup, something basic. The error message was actually correct. After inspecting the existing document, I found:
{ "users": "[]" }
And changed it to:
{ "users": [] }
Now it is working.
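For reference, a minimal sketch of repairing the tracking document so that the field is a real array (the $set step is my assumption about how to overwrite the bad string value; the collection and field names are taken from the question):
var fixDocument = collection.updateOne(
    { "type": "users" },
    { $set: { users: [] } }, // replace the string "[]" with an actual array
    function(err, result) {
        if (err) return console.log("fix err: " + err);
        // subsequent $addToSet calls now have a target array field to work with
        collection.updateOne(
            { "type": "users" },
            { $addToSet: { users: "someone#gmail.com" } },
            function(err2, result2) {
                console.log("addToSet err: " + err2);
            }
        );
    }
);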

Related

DynamoDB/Amplify non-negative field and field validation on mutations

I am new to AWS in general; I am building a relatively simple application with Amplify, but I've used Google Firebase before. My question is: is there a way to set a constraint for a field to be non-negative? I have an application that does transactions and I don't want my balance to be negative. I just need a simple error/exception. Is it possible to set a field constraint in DynamoDB that says "this field should be >= 0"?
I also checked whether it was possible to do it in the VTL Amplify-generated resolver of my GraphQL mutation, and indeed it is possible to set some constraints. But somehow it allows the operation and crashes on the next one (when the balance in the DB is already < 0, as if it checks the balance before the update). I tried saying something like "current_balance - transaction >= 0" but I couldn't get it to work.
So it seems that the only way is to create a custom Lambda resolver that does the various checks before submitting the mutation to DynamoDB. I haven't tried it yet, but I don't understand how I can do a check on the current balance (stored in the DB) without doing a query.
More generally, is it even possible to validate fields (even with simple assertions like non-negative) in Amplify/DynamoDB? Would moving to another DB like Aurora help?
Thanks for your help.
DynamoDB supports conditional updates, which allow an update to be applied only when a given condition is met. You can set the condition current_balance >= cost for your update.
However, the negative balance is not the main problem. What you should address is how to prevent other requests from updating the same current_balance at the same time, or in short, race conditions on current_balance. To deal with that, you also need a conditional update whose condition is "current_balance = initial_balance". The initial_balance is, I guess, what you get from DynamoDB at the very beginning of the purchase process.
Sample VTL code
#set( $remaining_balance = $initial_balance - $transaction_cost )
#if( $remaining_balance < 0 )
    $util.error("Insufficient balance")
#end
{
    "version" : "2018-05-29",
    "operation" : "UpdateItem",
    "key": { <your-dynamodb-key> },
    "update" : {
        "expression" : "SET current_balance = :remaining_balance",
        "expressionValues" : {
            ":remaining_balance" : $util.dynamodb.toNumberJson($remaining_balance)
        }
    },
    "condition": {
        "expression": "current_balance = :initial_balance",
        "expressionValues" : {
            ":initial_balance" : $util.dynamodb.toNumberJson($initial_balance)
        }
    }
}
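For comparison, here is a rough sketch of the same conditional-update idea using the AWS SDK for JavaScript, e.g. from a custom Lambda resolver. The table name, key, and attribute values are assumptions for illustration, not your actual schema:
var AWS = require('aws-sdk');
var docClient = new AWS.DynamoDB.DocumentClient();

var accountId = 'user-123';   // hypothetical key value
var transactionCost = 25;     // hypothetical transaction amount

var params = {
    TableName: 'Accounts',    // hypothetical table name
    Key: { id: accountId },
    UpdateExpression: 'SET current_balance = current_balance - :cost',
    // the update is rejected unless the balance can cover the cost
    ConditionExpression: 'current_balance >= :cost',
    ExpressionAttributeValues: { ':cost': transactionCost }
};

docClient.update(params, function(err, data) {
    if (err && err.code === 'ConditionalCheckFailedException') {
        // insufficient balance (or the balance changed concurrently) - surface an error to the caller
    }
});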

Ingest from storage with persistDetails = true does not save the ingest status result

I'm implementing a program to migrate a large amount of data to ADX based on the Ingest from Storage feature of ADX, and I need to check the status of each ingestion request when it finishes, but I'm facing an issue.
Based on the MS documentation here
If I set persistDetails = true, for example with the command below, it should save the ingestion status, but currently this setting does not seem to work (with or without it):
.ingest async into table MigrateTable
(
    h'correct blob url link'
)
with (
    jsonMappingReference = 'table_mapping',
    format = 'json',
    persistDetails = true
)
The above command returns an OperationId, and when I use it to check the ingestion status after the ingest task finishes, I always get this error message:
Error An admin command cannot be executed due to an invalid state: State='Operation 'DataIngestPull' does not persist its operation results' clientRequestId: KustoWebV2;
Can someone clarify the root cause of this for me? To me it seems like a bug in ADX.
1. Ingesting data directly against the Data Engine, by running .ingest commands, is usually not recommended, compared to using Queued Ingestion (motivation included in the link). Using Kusto's ingestion client library allows you to track the ingestion status.
2. Some tools/services already do that for you, and you can consider using them directly, e.g. LightIngest or Azure Data Factory.
3. If you don't follow option 1, you can still look for the state/status of your command using the operation ID you get when using the async keyword, by using .show operations.
4. You can also use the client request ID to filter the result set of .show commands to view the state/status of your command.
5. If you're interested in looking specifically at failures, .show ingestion failures is also available for you.
The persistDetails option you specified in your .ingest command actually has no effect - as mentioned in the docs:
Not all control commands persist their results, and those that do usually do so by default on asynchronous executions only (using the async keyword). Please search the documentation for the specific command and check whether it does (see, for example, data export).
UPDATE: sample code following the suggestion from Yoni
It turns out another member of my team had messed up the access rights in ADX; after fixing that, everything works fine.
I just have one concern related to PartiallySucceeded that needs clarification from #yoni or someone with better knowledge of it.
try
{
    var ingestProps = new KustoQueuedIngestionProperties(model.DatabaseName, model.IngestTableName)
    {
        ReportLevel = IngestionReportLevel.FailuresAndSuccesses,
        ReportMethod = IngestionReportMethod.Table,
        FlushImmediately = true,
        JSONMappingReference = model.IngestMappingName,
        AdditionalProperties = new Dictionary<string, string>
        {
            { "jsonMappingReference", $"{model.IngestMappingName}" },
            { "format", "json" }
        }
    };
    var sourceId = Guid.NewGuid();
    var clientResult = await IngestClient.IngestFromStorageAsync(model.FileBlobUrl, ingestProps, new StorageSourceOptions
    {
        DeleteSourceOnSuccess = true,
        SourceId = sourceId
    });
    var ingestionStatus = clientResult.GetIngestionStatusBySourceId(sourceId);
    while (ingestionStatus.Status == Status.Pending)
    {
        await Task.Delay(WaitingInterval);
        ingestionStatus = clientResult.GetIngestionStatusBySourceId(sourceId);
    }
    if (ingestionStatus.Status == Status.Succeeded)
    {
        return true;
    }
    LogUtils.TraceError(_logger, $"Error when ingest blob file events, error: {ingestionStatus.ErrorCode.FastGetDescription()}");
    return false;
}
catch (Exception e)
{
    return false;
}

IncludeKeys in PFQuery not returning all relational data

I am using PFQueryTableViewController to retrieve objects from Parse. This particular Parse class has three columns (group, category, client) which are pointers to other Parse classes. I want to use the includeKey option to bring in all object data for each of those pointer objects. However, when I run the code below, I do retrieve the basic data about each pointer column (like ObjectID), but none of the additional columns, like the "name" column of the Category class.
override func queryForTable() -> PFQuery {
    let query:PFQuery = PFQuery(className:self.parseClassName!)
    query.whereKey("client", equalTo: currentUser!)
    query.whereKey("status", equalTo: "Open")
    query.whereKey("expires", greaterThan: NSDate())
    query.includeKey("group")
    query.includeKey("category")
    query.includeKey("client")
    query.orderByAscending("expires")
    if(objects?.count == 0)
    {
        query.cachePolicy = PFCachePolicy.CacheThenNetwork
    }
    return query
}
When my PFQueryTableViewController calls its cellForRowAtIndexPath function, I have code to get the 'name' column of the category object brought in via includeKey:
override func tableView(tableView: UITableView, cellForRowAtIndexPath indexPath: NSIndexPath, object: PFObject?) -> PFTableViewCell? {
    ...
    if let category = task["category"] as? PFObject {
        cell?.categoryLabel.text = String(category["name"])
    }
    ...
}
Running this retrieval of the category's 'name' value results in the following error and crash in Xcode:
*** Terminating app due to uncaught exception 'NSInternalInconsistencyException', reason: 'Key "name" has no data. Call fetchIfNeeded before getting its value.'
This same error results when I attempt to access any additional columns of my pointer reference objects.
When I print(category) I appear to get a valid (but empty) object in my log, with no additional columns, like 'name':
<Category: 0x13c6ed870, objectId: mEpn6TH6Tc, localId: (null)> {
}
I have successfully tested calling the suggested fetch query to retrieve the missing pointer data for each pointer column, but (IMHO) the additional fetch defeats the purpose of the includeKey query option to limit API requests.
One thought I had was that a PFQuery may only allow one includeKey call. But the research I've done through both Parse's own iOS documentation and various developer posts does not indicate any limit on the number of includeKeys a query can have.
Am I doing something unsupported by Parse or am I just syntactically not retrieving each pointers' object data the proper way?
I'm running the latest ParseUI as of this posting (1.1.6) and Parse (1.9.0) with Xcode 7.0.1
Thank you in advance for reading my post! I am only a couple months into learning iOS development, so this is both interesting and frustrating at the same time!
cell.category.text = object.objectForKey("category")!.objectForKey("name") as! String
Also, use the other "version" of cellForRowAtIndexPath, the one for PFQueryTableViewControllers:
override func tableView(tableView: UITableView, cellForRowAtIndexPath indexPath: NSIndexPath, object: PFObject!) -> PFTableViewCell? {
    //4
    let cell = tableView.dequeueReusableCellWithIdentifier("yourCellIdentifier", forIndexPath: indexPath) as! yourCell
    return cell
}
This allows you to use the syntax I answered with above.

this.removed(collection, id) causes an exception to be thrown somewhere

I'm trying to inform subscribers when a document is removed from a collection. I use this.removed(collection, id) when the removed callback of observeChanges is called:
Meteor.publish('tasks_listsPub', function(sUrl){
    ...
    var self = this;
    ocTasksLists.find().observeChanges({
        added: function (sId, oFields) {
            console.log('added:'+sId);
            self.added('tasks_lists', sId, oFields);
        },
        removed: function (sId) {
            console.log('removed:'+sId);
            self.removed('tasks_lists', sId); // throws an exception, but sometimes it works in the browser
        },
        changed: function(sId, oFields){
            console.log('changed:'+sId);
            self.changed('tasks_lists', sId, oFields);
        }
    });
    var cVisibleTasksLists = ocTasksLists.find({_id: {$in: oWs.tasks_lists}});
    return cVisibleTasksLists;
});
The problem is that the server throws an exception:
removed:K8BBys7WRH4tTQRBg
Exception in queued task: Error: Removed nonexistent document K8BBys7WRH4tTQRBg
at _.extend.removed (app/packages/livedata/livedata_server.js:181:17)
and the other browsers sometimes do not remove the deleted document. Any solution? Thanks.
You appear to be publishing two conflicting datasets within one publish function.
The self.added, self.removed, and self.changed calls inside observeChanges are trying to keep the client updated with everything in ocTasksLists.
The return cVisibleTasksLists; line, however, is trying to publish only the subset of ocTasksLists that matches your query.
These conflicting publish instructions mean the client sometimes does not have the documents being removed from ocTasksLists, which produces your error message.
Whether you want the whole dataset or the subset, either can be published just by returning the database cursor, as you do in your last two lines. Removing the observeChanges call along with the .added, .removed, and .changed calls will fix your error, as in the sketch below.
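A minimal sketch of the simplified publish function, keeping only the cursor (this assumes ocTasksLists and oWs are defined the same way as in your elided code):
Meteor.publish('tasks_listsPub', function(sUrl){
    // Returning the cursor lets Meteor handle added/changed/removed for the client.
    return ocTasksLists.find({_id: {$in: oWs.tasks_lists}});
});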

How to work with async code in Mongoose virtual properties?

I'm trying to work with associated documents in different collections (not embedded documents), and while there is an open issue for that in Mongoose, I'm trying to work around it for now by lazy-loading the associated document with a virtual property, as documented on the Mongoose website.
The problem is that the getter for a virtual takes a function as an argument and uses its return value as the virtual property's value. This is great when the virtual doesn't require any async calls to calculate its value, but it doesn't work when I need to make an async call to load the other document. Here's the sample code I'm working with:
TransactionSchema.virtual('notebook')
.get( function() { // <-- the return value of this function is used as the property value
Notebook.findById(this.notebookId, function(err, notebook) {
return notebook; // I can't use this value, since the outer function returns before we get to this code
})
// undefined is returned here as the properties value
});
This doesn't work since the function returns before the async call is finished. Is there a way I could use a flow control library to make this work, or could I modify the first function so that I pass the findById call to the getter instead of an anonymous function?
You can define a virtual method, for which you can define a callback.
Using your example:
TransactionSchema.method('getNotebook', function(cb) {
    Notebook.findById(this.notebookId, function(err, notebook) {
        cb(notebook);
    })
});
And while the sole commenter appears to be one of those pedantic types, you also should not be afraid of embedding documents. It's one of Mongo's strong points, from what I understand.
One uses the above code like so:
instance.getNotebook(function(notebook){
    // hey man, I have my notebook and stuff
});
While this addresses the broader problem rather than the specific question, I still thought it was worth submitting:
You can easily load an associated document from another collection (with a nearly identical result to defining a virtual) by using Mongoose's query populate function. Using the above example, this requires specifying the ref of the ObjectId in the Transaction schema (to point to the Notebook collection), then calling populate('notebookId') while constructing the query. The linked Mongoose documentation addresses this pretty thoroughly.
I'm not familiar with Mongoose's history, but I'm guessing populate did not exist when these earlier answers were submitted.
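A hedged sketch of the populate approach, using field and model names borrowed from the question (the exact schema and model setup here is an assumption for illustration):
var mongoose = require('mongoose');
var Schema = mongoose.Schema;

var TransactionSchema = new Schema({
    // the ref tells Mongoose which model the ObjectId points to
    notebookId: { type: Schema.Types.ObjectId, ref: 'Notebook' }
});
var Transaction = mongoose.model('Transaction', TransactionSchema);

Transaction.findById(someTransactionId)
    .populate('notebookId') // loads the referenced Notebook document in the same query flow
    .exec(function(err, transaction) {
        // transaction.notebookId is now the full Notebook document, not just an ObjectId
    });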
Josh's approach works great for single document look-ups, but my situation was a little more complex. I needed to do a look-up on a nested property for an entire array of objects. For example, my model looked more like this:
var TransactionSchema = new Schema({
    ...
    , notebooks: {type: [Notebook]}
});
var NotebookSchema = new Schema({
    ...
    , authorName: String // this should not necessarily persist to db because it may get stale
    , authorId: String
});
var AuthorSchema = new Schema({
    firstName: String
    , lastName: String
});
Then, in my application code (I'm using Express), when I get a Transaction, I want all of the notebooks with the authors' last names:
...
TransactionSchema.findById(someTransactionId, function(err, trans) {
    ...
    if (trans) {
        var authorIds = trans.notebooks.map(function(notebook) {
            return notebook.authorId;
        });
        Author.find({_id: {$in: authorIds}}, [], function(err2, authors) {
            for (var a in authors) {
                for (var n in trans.notebooks) {
                    if (authors[a].id == trans.notebooks[n].authorId) {
                        trans.notebooks[n].authorLastName = authors[a].lastName;
                        break;
                    }
                }
            }
            ...
        });
    }
});
This seems wildly inefficient and hacky, but I could not figure out another way to accomplish this. Lastly, I am new to node.js, mongoose, and stackoverflow so forgive me if this is not the most appropriate place to extend this discussion. It's just that Josh's solution was the most helpful in my eventual "solution."
As this is an old question, I figured it might use an update.
To achieve asynchronous virtual fields, you can use mongoose-fill, as stated in mongoose's github issue: https://github.com/Automattic/mongoose/issues/1894
