Do CosmosDB Mongo API compound unique indexes require each field to be unique? - azure-cosmosdb

I'm trying to set up a collection of versioned documents in which I insert a new document with the same id and a timestamp whenever there's an edit operation. I use a unique compound index for this on the id and timestamp fields. CosmosDB is giving me MongoError: E11000 duplicate key error whenever I try to insert a document with a different id but an identical timestamp to another document. The MongoDB documentation says that I should be able to do this:
https://docs.mongodb.com/v3.4/core/index-unique/#unique-compound-index
You can also enforce a unique constraint on compound indexes. If you use the unique constraint on a compound index, then MongoDB will enforce uniqueness on the combination of the index key values.
I tried using a non-unique index but the Resource Manager template failed, saying that non-unique compound indexes are not supported. I'm using the node.js native driver v3.2.4. I also tried to use Azure Portal to insert documents but received the same error. This makes me believe it's not a problem between CosmosDB and the node.js driver.
Here's a small example to demonstrate the problem. I'm running it with Node v10.15.3.
const { MongoClient } = require('mongodb');
const mongoUrl = process.env.COSMOSDB_CONNECTION_STRING;
const collectionName = 'indextest';
const client = new MongoClient(mongoUrl, { useNewUrlParser: true });
let connection;
const testIndex = async () => {
const now = Date.now();
connection = await client.connect();
const db = connection.db('master');
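// drop() rejects if the collection does not exist yet, so ignore that error on the first run
await db.collection(collectionName).drop().catch(() => {});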
const collection = await db.createCollection(collectionName);
await collection.createIndex({ id: 1, ts: -1 }, { unique: true });
await collection.insertOne({ id: 1, ts: now, title: 'My first document' });
await collection.insertOne({ id: 2, ts: now, title: 'My other document' });
};
(async () => {
try {
await testIndex();
console.log('It works');
} catch (err) {
console.error(err);
} finally {
await client.close();
}
})();
I would expect the two insert operations to work and for the program to exit with It works. What I get instead is an Error:
{ MongoError: E11000 duplicate key error collection: master.indextest Failed _id or unique key constraint
at Function.create (/home/node/node_modules/mongodb-core/lib/error.js:43:12)
at toError (/home/node/node_modules/mongodb/lib/utils.js:149:22)
at coll.s.topology.insert (/home/node/node_modules/mongodb/lib/operations/collection_ops.js:859:39)
at handler (/home/node/node_modules/mongodb-core/lib/topologies/replset.js:1155:22)
at /home/node/node_modules/mongodb-core/lib/connection/pool.js:397:18
at process._tickCallback (internal/process/next_tick.js:61:11)
driver: true,
name: 'MongoError',
index: 0,
code: 11000,
errmsg:
'E11000 duplicate key error collection: master.indextest Failed _id or unique key constraint',
[Symbol(mongoErrorContextSymbol)]: {} }
Is this expected behavior or a bug in CosmosDB's MongoDB API?
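For reference, the semantics the quoted MongoDB documentation describes (uniqueness of the combination of values, not of each field) can be sketched in plain JavaScript. This is only an in-memory model under that assumption; CompoundUniqueIndex and its methods are hypothetical names, and nothing here is CosmosDB or MongoDB driver code.

```javascript
// Hypothetical in-memory model of a unique compound index on { id, ts }.
class CompoundUniqueIndex {
  constructor(fields) {
    this.fields = fields;
    this.seen = new Set();
  }

  // Build one key from the combination of the indexed field values.
  keyFor(doc) {
    return JSON.stringify(this.fields.map((field) => doc[field]));
  }

  // Returns true if the document is accepted, false on a duplicate combination.
  insert(doc) {
    const key = this.keyFor(doc);
    if (this.seen.has(key)) return false;
    this.seen.add(key);
    return true;
  }
}

const index = new CompoundUniqueIndex(['id', 'ts']);
const now = 1554120000000; // fixed timestamp so the run is repeatable

console.log(index.insert({ id: 1, ts: now })); // accepted
console.log(index.insert({ id: 2, ts: now })); // accepted: same ts, different id
console.log(index.insert({ id: 1, ts: now })); // rejected: same (id, ts) pair
```

Under these semantics the two inserts from the question would both succeed, which is the behavior the question expects.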

Related

How to backfill new AppSync fields using AWS Amplify

I'm adding a sort field to one of my AppSync tables using GraphQL. The new schema looks like:
type MyTable
@model
@auth(rules: [{allow: owner}])
@key(name: "BySortOrder", fields: ["sortOrder"], queryField: "tableBySortOrder")
{
id: ID!
name: String!
sortOrder: Int
}
However, when retrieving a list using tableBySortOrder I get an empty list because the new field sortOrder is null.
My question is, how do I backfill this data in the DynamoDB table so that my existing users will not be disrupted by this new change? With a traditional database, I would run a SQL update: UPDATE MyTable SET sortOrder = #.
However, I'm new to NoSQL/AWS and couldn't find a way to do this except build a backfill script whenever a user logs into my app. That feels very hacky. What is the best practice for handling this type of scenario?
Have you already created the new field in DDB?
If yes, I think you should backfill it before making the client side change.
Write a script to iterate through and update the table. Options for this:
Java - Call updateItem to update the table if you have any integ tests running.
Bash - Use AWS CLI: aws dynamodb scan --table-name item_attributes --projection-expression "whatever" > /tmp/item_attributes_table.txt and then aws dynamodb update-item --table-name item_attributes --key. This is a dirty way.
Python - Same logic as above.
Ended up using something similar to what Sunny suggested with a nodejs script:
const AWS = require('aws-sdk')
AWS.config.update({
region: 'us-east-1'
})
// To confirm credentials are set (never print the secret access key)
AWS.config.getCredentials(function (err) {
if (err) {
// credentials not loaded
console.log(err.stack)
} else {
console.log('Access key:', AWS.config.credentials.accessKeyId)
}
})
const docClient = new AWS.DynamoDB.DocumentClient()
const table = 'your-table-dev'
const params = {
TableName: table
}
const itemMap = new Map()
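// Using scan to retrieve all rows. Note that a single scan call returns at most 1 MB of data,
// so larger tables need follow-up calls using LastEvaluatedKey.
docClient.scan(params, function (err, data) {
if (err) {
console.error('Unable to scan. Error:', JSON.stringify(err, null, 2))
} else {
console.log('Scan succeeded.')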
data.Items.forEach(item => {
if (itemMap.has(item.owner)) {
itemMap.set(item.owner, [...itemMap.get(item.owner), item])
} else {
itemMap.set(item.owner, [item])
}
})
itemMap.forEach(ownerConnections => {
ownerConnections.forEach((connection, index) => {
connection.sortOrder = index
update(connection)
})
})
}
})
function update(connection) {
const params = {
TableName: table,
Key: {
'id': connection.id
},
UpdateExpression: 'set sortOrder = :s',
ExpressionAttributeValues: {
':s': connection.sortOrder,
},
ReturnValues: 'UPDATED_NEW'
};
console.log('Updating the item...');
docClient.update(params, function (err, data) {
if (err) {
console.error('Unable to update item. Error JSON:', JSON.stringify(err, null, 2));
} else {
console.log('UpdateItem succeeded:', JSON.stringify(data, null, 2));
}
});
}
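One thing to watch with a scan-based backfill: a single scan call returns at most 1 MB of data, and the response carries a LastEvaluatedKey when more pages remain. A minimal sketch of following those pages is below. scanAll and scanPage are hypothetical names; scanPage is assumed to be a thin wrapper such as (params) => docClient.scan(params).promise(), and a fake is used here so the sketch runs without AWS.

```javascript
// scanPage is a hypothetical wrapper that takes scan params and resolves to
// one page of results, e.g. (params) => docClient.scan(params).promise().
async function scanAll(scanPage, baseParams) {
  const items = [];
  let startKey; // undefined on the first request
  do {
    const params = { ...baseParams };
    if (startKey) params.ExclusiveStartKey = startKey;
    const page = await scanPage(params);
    items.push(...(page.Items || []));
    // LastEvaluatedKey is only present when more pages remain.
    startKey = page.LastEvaluatedKey;
  } while (startKey);
  return items;
}

// Stand-in for DynamoDB: two pages of results keyed by where the scan resumes.
const fakePages = {
  first: { Items: [{ id: 'a' }, { id: 'b' }], LastEvaluatedKey: { id: 'b' } },
  b: { Items: [{ id: 'c' }] },
};
const fakeScanPage = async (params) =>
  params.ExclusiveStartKey ? fakePages[params.ExclusiveStartKey.id] : fakePages.first;

scanAll(fakeScanPage, { TableName: 'your-table-dev' })
  .then((items) => console.log(items.map((item) => item.id)));
```

The collected items can then be grouped by owner and updated exactly as in the script above.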

firebase firestore adding new document inside a transaction - transaction.add is not a function

I was assuming that it was possible to do something like:
transaction.add(collectionRef,{
uid: userId,
name: name,
fsTimestamp: firebase.firestore.Timestamp.now(),
});
But apparently it is not:
transaction.add is not a function
The above message is displayed inside the chrome console.
I see that we can use the set method of the transaction to add a new document transactionally. see: https://firebase.google.com/docs/firestore/manage-data/transactions
The thing is, if I use set instead of add (which is not supported anyway), I have to create the document id manually; Firestore won't create it for me.
see: https://firebase.google.com/docs/firestore/manage-data/add-data
Do you see any downside of this not having an add method that generates the id for you automatically?
For example, is it possible that the id generated by the firestore itself is somehow optimized considering various concerns including performance?
Which library/method do you use to create your document IDs in react-native while using transaction.set?
Thanks
If you want to generate a unique ID for later use in creating a document in a transaction, all you have to do is use CollectionReference.doc() with no parameters to generate a DocumentReference which you can set() later in a transaction.
(What you're proposing in your answer is way more work for the same effect.)
// Create a reference to a document that doesn't exist yet, it has a random id
const newDocRef = db.collection('coll').doc();
// Then, later in a transaction:
transaction.set(newDocRef, { ... });
After some more digging, I found the class/method below for id generation in the Firestore source code itself:
export class AutoId {
static newId(): string {
// Alphanumeric characters
const chars =
'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789';
let autoId = '';
for (let i = 0; i < 20; i++) {
autoId += chars.charAt(Math.floor(Math.random() * chars.length));
}
assert(autoId.length === 20, 'Invalid auto ID: ' + autoId);
return autoId;
}
}
see: https://github.com/firebase/firebase-js-sdk/blob/73a586c92afe3f39a844b2be86086fddb6877bb7/packages/firestore/src/util/misc.ts#L36
I extracted the method (except the assert statement) and put it inside a method in my code. Then I used the set method of the transaction as below:
generateFirestoreId(){
const chars = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789';
let autoId = '';
for (let i = 0; i < 20; i++) {
autoId += chars.charAt(Math.floor(Math.random() * chars.length));
}
//assert(autoId.length === 20, 'Invalid auto ID: ' + autoId);
return autoId;
}
then,
newDocRef = db.collection("PARENTCOLL").doc(PARENTDOCID).collection('SUBCOLL').doc(this.generateFirestoreId());
transaction.set(newDocRef,{
uid: userId,
name: name,
fsTimestamp: firebase.firestore.Timestamp.now(),
});
Since I am using the same algo for the id generation as the firestore itself I feel better.
Hope this helps/guides someone.
Cheers.
Based on the answer from Doug Stevenson, this is how I got it to work with @angular/fire:
// Create a reference to a document and provide it a random id, e.g. by using uuidv4
const newDocRef = this.db.collection('coll').doc(uuidv4()).ref;
// In the transaction:
transaction.set(newDocRef, { ... });
To complete Stefan's answer: for those using AngularFire versions prior to 5.2, calling CollectionReference.doc() without an argument results in the error "CollectionReference.doc() requires its first argument to be of type non-empty string".
This workaround worked for me:
const id = this.afs.createId();
const ref = this.afs.collection(this.collectionRef).doc(id);
transaction.set(ref, { ... });
Credit: https://github.com/angular/angularfire/issues/1974#issuecomment-448449448
I'd like to add an answer solving the id problem. There's no need to generate your own ids. The DocumentReference returned by doc() already carries the generated id, so it can be read directly, before or after transaction.set() is called:
const docRef = collectionRef.doc();
transaction.set(docRef, input);
const id = docRef.id;
First of all, the Firestore transaction object has four methods (get, set, update, delete) and does not have an "add" method. However, the "set" method can be used instead.
import { collection, doc, runTransaction, Timestamp } from "firebase/firestore";
A DocumentReference must be created first for the "set" method.
Steps:
1-) The collection method creates a CollectionReference object.
const collectionRef = collection(FirebaseDb,"[colpath]");
2-) The doc method creates a DocumentReference object with a unique random id for the specified CollectionReference.
const documentRef = doc(collectionRef);
3-) The add operation can then be performed with the transaction's set method.
try {
await runTransaction(FirebaseDb, async (transaction) => {
// set() only queues the write; it is committed when the callback finishes
transaction.set(documentRef, {
uid: userId,
name: name,
fsTimestamp: Timestamp.now(),
});
})
} catch (e) {
console.error("Error : ", e);
}

Add timestamp in Firestore documents

I'm a newbie to Firestore. The Firestore docs say...
Important: Unlike "push IDs" in the Firebase Realtime Database, Cloud Firestore auto-generated IDs do not provide any automatic ordering. If you want to be able to order your documents by creation date, you should store a timestamp as a field in the documents.
Reference: https://firebase.google.com/docs/firestore/manage-data/add-data
So do I have to create a key named timestamp in the document? Or is created sufficient to fulfill the above statement from the Firestore documentation?
{
"created": 1534183990,
"modified": 1534183990,
"timestamp":1534183990
}
firebase.firestore.FieldValue.serverTimestamp()
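Whatever you want to call it is fine afaik. With the Realtime Database you can then use orderByChild('created'); the Firestore equivalent is orderBy('created').
I also mostly use firebase.database.ServerValue.TIMESTAMP when setting the time in the Realtime Database: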
ref.child(key).set({
id: itemId,
content: itemContent,
user: uid,
created: firebase.database.ServerValue.TIMESTAMP
})
Use the Firestore Timestamp class, firebase.firestore.Timestamp.now(), since firebase.firestore.FieldValue.serverTimestamp() does not work with the add method from Firestore.
For Firestore
ref.doc(key).set({
created: firebase.firestore.FieldValue.serverTimestamp()
})
REALTIME SERVER TIMESTAMP USING FIRESTORE
import firebase from "firebase/app";
import "firebase/firestore"; // registers firebase.firestore on the namespaced SDK
const someFunctionToUploadProduct = () => {
firebase.firestore().collection("products").add({
name: name,
price : price,
color : color,
weight :weight,
size : size,
createdAt : firebase.firestore.FieldValue.serverTimestamp()
})
.then(function(docRef) {
console.log("Document written with ID: ", docRef.id);
})
.catch(function(error) {
console.error("Error adding document: ", error);
});
}
All you need is to import 'firebase' and then call
firebase.firestore.FieldValue.serverTimestamp() wherever you need it. Be careful with the spelling though, it's "serverTimestamp()". In this example it provides the timestamp value for 'createdAt' when uploading to the Firestore products collection.
That's correct: like most databases, Firestore doesn't store creation times. In order to sort objects by time:
Option 1: Create timestamp on client (correctness not guaranteed):
db.collection("messages").doc().set({
....
createdAt: firebase.firestore.Timestamp.now()
})
The big caveat here is that Timestamp.now() uses the local machine time. Therefore, if this is run on a client machine, you have no guarantee the timestamp is accurate. If you're setting this on the server or if guaranteed order isn't so important, it might be fine.
Option 2: Use a timestamp sentinel:
db.collection("messages").doc().set({
....
createdAt: firebase.firestore.FieldValue.serverTimestamp()
})
A timestamp sentinel is a token that tells the Firestore server to set the time server side on first write.
If you read the field before the server has written it (e.g., in a listener), it will be null unless you read the document like this:
doc.data({ serverTimestamps: 'estimate' })
Set up your query with something like this:
// quick and dirty way, but uses local machine time
const midnight = new Date(firebase.firestore.Timestamp.now().toDate().setHours(0, 0, 0, 0));
const todaysMessages = firebase
.firestore()
.collection(`users/${user.id}/messages`)
.orderBy('createdAt', 'desc')
.where('createdAt', '>=', midnight);
Note that this query uses the local machine time (Timestamp.now()). If it's really important that your app uses the correct time on the clients, you could utilize this feature of Firebase's Realtime Database:
const serverTimeOffset = (await firebase.database().ref('/.info/serverTimeOffset').once('value')).val();
const midnightServerMilliseconds = new Date(serverTimeOffset + Date.now()).setHours(0, 0, 0, 0);
const midnightServer = new Date(midnightServerMilliseconds);
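To make the effect of the offset concrete, here is a small sketch that wraps the same arithmetic in a function with injectable inputs. serverMidnight is a hypothetical name, offsetMs stands in for the value read from '/.info/serverTimeOffset', and the example dates are made up; it runs without Firebase.

```javascript
// offsetMs stands in for the value read from '/.info/serverTimeOffset';
// nowMs is injectable so the function can be exercised without Firebase.
function serverMidnight(offsetMs, nowMs = Date.now()) {
  const estimatedServerNow = new Date(nowMs + offsetMs);
  // setHours mutates the date and works in the machine's local time zone.
  estimatedServerNow.setHours(0, 0, 0, 0);
  return estimatedServerNow;
}

// A client clock that reads 23:30 local time but is two hours behind the server.
const clientNow = new Date(2021, 6, 25, 23, 30).getTime();
const twoHours = 2 * 60 * 60 * 1000;

console.log(serverMidnight(0, clientNow).toString());        // midnight of the 25th (local clock trusted)
console.log(serverMidnight(twoHours, clientNow).toString()); // midnight of the 26th (offset applied)
```

With the offset applied, the query boundary moves to the next day, which is the difference between the two approaches described above.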
The documentation isn't suggesting the names of any of your fields. The part you're quoting is just saying two things:
The automatically generated document IDs for Firestore don't have a natural time-based ordering like they did in Realtime Database.
If you want time-based ordering, store a timestamp in the document, and use that to order your queries. (You can call it whatever you want.)
This solution worked for me:
Firestore.instance.collection("collectionName").add({'created': Timestamp.now()});
Try this for Swift 4, using Timestamp(date: Date()):
let docData: [String: Any] = [
"stringExample": "Hello world!",
"booleanExample": true,
"numberExample": 3.14159265,
"dateExample": Timestamp(date: Date()),
"arrayExample": [5, true, "hello"],
"nullExample": NSNull(),
"objectExample": [
"a": 5,
"b": [
"nested": "foo"
]
]
]
db.collection("data").document("one").setData(docData) { err in
if let err = err {
print("Error writing document: \(err)")
} else {
print("Document successfully written!")
}
}
What worked for me was simply taking the timestamp from the snapshot parameter, snapshot.updateTime:
exports.newUserCreated = functions.firestore.document('users/{userId}').onCreate(async (snapshot, context) => {
console.log('started! v1.7');
const userID = context.params['userId'];
// return the promise so the function does not finish before the write completes
return firestore.collection(`users/${userID}/lists`).add({
'created_time': snapshot.updateTime,
'name':'Products I ♥',
}).then(documentReference => {
console.log("initial public list created");
return null;
}).catch(error => {
console.error('Error creating initial list', error);
process.exit(1);
});
});
I am using Firestore to store data that comes from a Raspberry PI with Python. The pipeline is like this:
Raspberry PI (Python using paho-mqtt) -> Google Cloud IoT -> Google Cloud Pub/Sub -> Firebase Functions -> Firestore.
Data in the device is a Python Dictionary. I convert that to JSON.
The problem I had was that paho-mqtt will only send (publish) data as a string, and one of the fields of my data is a timestamp. This timestamp is saved on the device because it accurately says when the measurement was taken, regardless of when the data is ultimately stored in the database.
When I send my JSON structure, Firestore will store my field 'timestamp' as String. This is not convenient. So here is the solution.
I do a conversion in the Cloud Function that is triggered by the Pub/Sub to write into Firestore using Moment library to convert.
Note: I am getting the timestamp in python with:
currenttime = datetime.datetime.utcnow()
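var moment = require('moment'); // require Moment
function toTimestamp(strDate){
// parse as UTC (the device uses utcnow()) and return a native Date, which Firestore stores as a timestamp
return moment.utc(strDate, "YYYY-MM-DD HH:mm:ss:SS").toDate();
}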
exports.myFunctionPubSub = functions.pubsub.topic('my-topic-name').onPublish((message, context) => {
let parsedMessage = null;
try {
parsedMessage = message.json;
// Convert timestamp string to timestamp object
parsedMessage.date = toTimestamp(parsedMessage.date);
// Get the Device ID from the message. Useful when you have multiple IoT devices
const deviceID = parsedMessage._deviceID;
// return the promise so the function is kept alive until the write finishes
return db.collection('MyDevices')
.doc(deviceID)
.collection('DeviceData')
.add(parsedMessage)
.then ( (ref) => {
console.log('Added document ID: ', ref.id);
return null;
}).catch ( (error) => {
console.error('Failed to write database', error);
return null;
});
} catch (e) {
console.error('PubSub message was not JSON', e);
}
// Expected return or a warning will be triggered in the Firebase Function logs.
return null;
});
The Firestore method does not work. Use Timestamp from java.sql.Timestamp and don't cast it to a string; Firestore then formats it properly. For example, to mark the current time use:
val timestamp = Timestamp(System.currentTimeMillis())
There are multiple ways to store time in Firestore:
The firebaseAdmin.firestore.FieldValue.serverTimestamp() method. The actual timestamp is computed when the document is written to Firestore.
The firebaseAdmin.firestore.Timestamp.now() method.
For both methods, the next time you fetch the data it is returned as a Firestore Timestamp object, so you first need to convert it to a native JS Date object before calling methods such as toISOString() on it:
export function FStimestampToDate(
timestamp:
| FirebaseFirestore.Timestamp
| FirebaseFirestore.FieldValue
): Date {
return (timestamp as FirebaseFirestore.Timestamp).toDate();
}
Store as a Unix timestamp with Date.now(). It will be stored as a number, e.g. 1627235565028, but you won't be able to see it as a readable date in the Firestore console.
To query on this field, convert the date to a numeric timestamp first and then query.
Store as new Date().toISOString(), e.g. "2021-07-25T17:56:40.373Z". It is stored as a string, so range queries compare it as text rather than as a date.
I prefer the 2nd or 3rd way.
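A note on the ISO-string option: strings produced by toISOString() are fixed-width and always in UTC, so for ordinary four-digit years comparing them as plain text gives the same order as comparing the instants they represent. The sketch below checks that in plain JavaScript; it does not touch Firestore, and the sample dates are made up apart from the one quoted above.

```javascript
const dates = [
  new Date(Date.UTC(2021, 6, 25, 17, 56, 40, 373)),
  new Date(Date.UTC(2019, 0, 1)),
  new Date(Date.UTC(2021, 6, 25, 9, 0, 0)),
];

const asNumbers = dates.map((d) => d.getTime());
const asIsoStrings = dates.map((d) => d.toISOString());

// Numeric sort on epoch milliseconds.
const sortedNumbers = [...asNumbers].sort((a, b) => a - b);
// Plain string sort on the ISO form (the default sort compares as text).
const sortedIso = [...asIsoStrings].sort();

console.log(sortedIso);
// Check that both orders describe the same sequence of instants.
console.log(sortedIso.every((s, i) => Date.parse(s) === sortedNumbers[i]));
```

This only holds while every stored value uses exactly this format; mixing in local-time strings or other layouts breaks the ordering.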
According to the docs, you can "set a field in your document to a server timestamp which tracks when the server receives the update".
Example:
import { doc, updateDoc, serverTimestamp } from "firebase/firestore";
const docRef = doc(db, 'objects', 'some-id');
// Update the timestamp field with the value from the server
const updateTimestamp = await updateDoc(docRef, {
timestamp: serverTimestamp() // this does the trick!
});
Sharing what worked for me after googling for 2 hours, for firebase 9+
import { addDoc, collection, serverTimestamp } from "firebase/firestore";
export const postData = ({ name, points }: any) => {
const scoresRef = collection(db, "scores");
return addDoc(scoresRef, {
name,
points,
date: serverTimestamp(),
});
};
Swift 5.1
...
"dateExample": Timestamp(date: Date()),
...
With the newest version of Firestore, you should use it as follows:
import { doc, setDoc, Timestamp } from "firebase/firestore";
const docData = {
...
dateExample: Timestamp.fromDate(new Date("December 10, 1815"))
};
await setDoc(doc(db, "data", "one"), docData);
Or, for a server timestamp:
import { doc, updateDoc, serverTimestamp } from "firebase/firestore";
const docRef = doc(db, 'objects', 'some-id');
const updateTimestamp = await updateDoc(docRef, {
timestamp: serverTimestamp()
});

conditionalExpression for unique primary key in dynamodb

I need to insert a document if the primary key doesn't exist. I have tried to solve this using conditionExpression but it seems to fail.
const primaryKey = "4234241";
const tableSpec = {
TableName: 'tableName',
Item: params,
ConditionExpression: '#primaryId <> :primaryId',
ExpressionAttributeNames: {'#primaryId': 'primaryId'},
ExpressionAttributeValues: {
':primaryId': primaryKey
}
};
var docClient = new AWS.DynamoDB.DocumentClient();
docClient.put(tableSpec, function (err, data) {
if (err) {
console.log(err);
}
});
"ConditionalCheckFailedException: The conditional request failed" is the output of the console.log statement.
Is it throwing the exception when an object with this primary key exists? Then it's fine: just catch the exception, process it if you need to (maybe log that the object already exists) and move on.
With this you make a single call that either returns success if the object was created or throws an exception (that you can catch and ignore) if the object already exists.
The only solution to ConditionalCheckFailedException is to get the item and check before inserting.
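For what it's worth, the usual way to express "insert only if the key is absent" in a single call is the attribute_not_exists condition function on the key attribute. A minimal sketch of the request parameters and of recognizing the resulting error is below; buildPutIfAbsentParams and isAlreadyExistsError are hypothetical helper names, and the table and attribute names are taken from the question. No AWS call is made here.

```javascript
// Hypothetical helper: builds DocumentClient.put params that only succeed
// when no item with the same key exists yet.
function buildPutIfAbsentParams(tableName, item, keyAttribute) {
  return {
    TableName: tableName,
    Item: item,
    // attribute_not_exists is evaluated against the existing item with this key
    ConditionExpression: 'attribute_not_exists(#pk)',
    ExpressionAttributeNames: { '#pk': keyAttribute },
  };
}

// A failed condition is reported as ConditionalCheckFailedException
// (err.code in AWS SDK v2, err.name in v3).
function isAlreadyExistsError(err) {
  return Boolean(err) &&
    (err.code === 'ConditionalCheckFailedException' ||
      err.name === 'ConditionalCheckFailedException');
}

const putParams = buildPutIfAbsentParams('tableName', { primaryId: '4234241' }, 'primaryId');
console.log(putParams);
console.log(isAlreadyExistsError({ code: 'ConditionalCheckFailedException' }));
```

The parameters would be passed to docClient.put as in the question, with isAlreadyExistsError(err) used in the callback to tell "already exists" apart from other failures.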

Query List of Maps in DynamoDB

I am trying to filter a list of maps from a DynamoDB table whose items have the following format.
{
id: "Number",
users: [
{ userEmail: "abc@gmail.com", age: "23" },
{ userEmail: "de@gmail.com", age: "41" }
]
}
I need to get the data of the user whose userEmail is "abc@gmail.com". Currently I am doing it with the following DynamoDB query. Is there a more efficient way to solve this?
var params = {
TableName: 'users',
Key:{
'id': id
}
};
var docClient = new AWS.DynamoDB.DocumentClient();
docClient.get(params, function (err, data) {
if (!err) {
const users = data.Item.users;
// keep only the entries whose userEmail matches the one we are looking for
const filtered = users.filter(function (user) {
return user.userEmail === userEmail;
});
// filtered has the required user in it
}
});
The only way to get a single item in DynamoDB directly by a given attribute is to have a table with that attribute as the partition key. So you need to have a table that looks like:
Email (string) - partition key
Id (some-type) - user id
...other relevant user data
Unfortunately, since a nested field cannot be a partition key you will have to maintain a separate table here and won't be able to use an index in DynamoDB (neither LSI, nor GSI).
It's a common pattern in NoSQL to duplicate data, so there is nothing unusual in it. If you were using Java, you could use transactions library, to ensure that both tables are in sync.
If you are not going to use Java you could read DynamoDB stream of the original database (where emails are nested fields) and update the new table (where emails are partition keys) when an original table is updated.
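As a rough illustration of the duplicated-table idea, the sketch below derives the rows of an email-keyed lookup table from one item of the original table. toEmailLookupItems and the lookup-table attribute names (email, id, age) are hypothetical; writing the rows to DynamoDB, or wiring this to a stream, is left out.

```javascript
// Hypothetical helper: turns one item of the original table into the rows of a
// lookup table whose partition key is the email address.
function toEmailLookupItems(item) {
  return (item.users || []).map((user) => ({
    email: user.userEmail, // partition key of the lookup table
    id: item.id,           // points back to the original item
    age: user.age,
  }));
}

const original = {
  id: 42,
  users: [
    { userEmail: 'abc@gmail.com', age: '23' },
    { userEmail: 'de@gmail.com', age: '41' },
  ],
};

console.log(toEmailLookupItems(original));
```

Each returned object would be one item in the second table, so a user can be fetched with a plain get on the email key.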
