I am working on an online booking system for items.
I am using MongoDB to store booking and item details:
Item
{
  "id": "3",
  "name": "",
  "description": "",
  "extra": [{}]
}
Booking
{
  "id": "",
  "itemId": "",
  "startDate": millis,
  "endDate": millis,
  "status": "",
  "userId": ""
}
I have to implement a search between dates: it should return only the items that are available for the specified period. How can I build a scalable search for this? I am planning to use Elasticsearch for the search. Suggestions involving other technologies are also welcome.
I'd suggest making the booking the base object and putting the item info inside it. That is to say:
Set up mapping:
PUT bookings
{
  "mappings": {
    "properties": {
      "id": {
        "type": "keyword"
      },
      "item": {
        "properties": {
          "id": {
            "type": "keyword"
          },
          "name": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword"
              }
            }
          },
          "description": {
            "type": "text"
          },
          "extra": {
            "type": "nested"
          }
        }
      },
      "startDate": {
        "type": "date",
        "format": "epoch_millis"
      },
      "endDate": {
        "type": "date",
        "format": "epoch_millis"
      },
      "status": {
        "type": "keyword"
      },
      "userId": {
        "type": "keyword"
      }
    }
  }
}
Ingest the simplest booking
POST bookings/_doc
{
  "item": {
    "id": "987"
  },
  "startDate": 1587110540025,
  "endDate": 1587220730025
}
Now search, restricting on the *Date fields and returning only the corresponding item:
GET bookings/_search
{
  "_source": "item",
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "startDate": {
              "gte": "17/04/2020",
              "format": "dd/MM/yyyy"
            }
          }
        },
        {
          "range": {
            "endDate": {
              "lte": "18/04/2020",
              "format": "dd/MM/yyyy"
            }
          }
        }
      ]
    }
  }
}
Note that although our date fields are defined as epoch_millis, we can still query using human-readable date strings, provided we specify the format. You can of course use milliseconds if you prefer.
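One caveat (my own addition, not implied by the mapping above): the query above finds bookings that fall entirely inside the window, but to answer "which items are available?" you usually want the inverse: find bookings that overlap the requested period (startDate before the requested end and endDate after the requested start), collect their item ids, and exclude those ids when listing items. You would typically also filter on status to ignore cancelled bookings. A sketch of the overlap query, with the booked item ids collected in a terms aggregation (the aggregation name booked_items is just a label):
GET bookings/_search
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        { "range": { "startDate": { "lt": "18/04/2020", "format": "dd/MM/yyyy" } } },
        { "range": { "endDate": { "gt": "17/04/2020", "format": "dd/MM/yyyy" } } }
      ]
    }
  },
  "aggs": {
    "booked_items": {
      "terms": { "field": "item.id", "size": 10000 }
    }
  }
}
The exclusion of those ids from your item list would then happen on the application side or in a second query.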
While indexing the items into Elasticsearch you can also check the bookings. Say you are indexing items and you fetch each item from Mongo: you can fetch the bookings for that item as well and add a field like bookingCount to the item document in Elasticsearch. When searching, you can then use the bookingCount field to return only items without bookings (see the sketch after the example document below).
Indexing is generally an async operation, so you can use a queue; this keeps latency low for user-facing operations, and inside the indexing step you can do whatever enrichment you want, for example computing a booking summary and putting it inside the item:
{
  "id": "3",
  "name": "",
  "description": "",
  "extra": [{}],
  "bookingCount": "",
  "bookingsByStatus": {
    "status_1": 1233,
    "status_2": 1233,
    ...
  }
}
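A minimal sketch of the search side, assuming the items are indexed in an items index and bookingCount is mapped as an integer (both are assumptions, not stated above):
GET items/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "bookingCount": 0 } }
      ]
    }
  }
}
Keep in mind that bookingCount alone only tells you whether an item has any bookings at all, not whether it is free for a specific period.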
But this is a business decision, and after any update to an item or a booking you need to update that item in the Elasticsearch index. You can also use another solution like the one mentioned by @jzzfs.
Related
I'm scared to put this out there because it should be so easy, and I am facing the same issue as the posts here, here and here, and I have tried each of the answers to no avail. Below are the current resulting input (redacted) and the related code view of the inputs.
The Result
{
  "method": "post",
  "headers": {
    "x-ms-documentdb-raw-partitionkey": "\"2020\""
  },
  "path": "/dbs/xxxx/colls/smtp/docs",
  "host": {
    "connection": {
      "name": "/subscriptions/..."
    }
  },
  "body": {
    "category": [
      [
        "cat facts"
      ]
    ],
    "email": "example@test.com",
    "event": "processed",
    "id": "yada",
    "partitionKey": "\"2020\"",
    "sg_event_id": "yada yada",
    "sg_message_id": "yada",
    "smtp-id": "yada",
    "timestamp": 1604345542
  }
}
The Code View
{
  "inputs": {
    "body": {
      "category": [
        "@items('For_each')['category']"
      ],
      "email": "@items('For_each')['email']",
      "event": "@items('For_each')['event']",
      "id": "@items('For_each')['sg_message_id']",
      "partitionKey": "\"@{formatDateTime(utcNow(),'yyyy')}\"",
      "sg_event_id": "@items('For_each')['sg_event_id']",
      "sg_message_id": "@items('For_each')['sg_message_id']",
      "smtp-id": "@items('For_each')['smtp-id']",
      "timestamp": "@items('For_each')['timestamp']"
    },
    "headers": {
      "x-ms-documentdb-raw-partitionkey": "\"@{formatDateTime(utcNow(),'yyyy')}\""
    }
  }
}
The error I'm getting is the usual one: "PartitionKey extracted from document doesn't match the one specified in the header".
I just can't see what I'm missing here now.
Thanks all.
First, as Matias comments, check your partition key path.
Then, change this code "partitionKey": "\"@{formatDateTime(utcNow(),'yyyy')}\"", to "partitionKey": "@{formatDateTime(utcNow(),'yyyy')}", in your document.
It works fine on my side.
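In other words, assuming the collection's partition key path is /partitionKey (an assumption based on the question), the body property holds the plain value while the raw-partition-key header keeps the JSON-encoded form, roughly:
{
  "inputs": {
    "body": {
      "partitionKey": "@{formatDateTime(utcNow(),'yyyy')}"
    },
    "headers": {
      "x-ms-documentdb-raw-partitionkey": "\"@{formatDateTime(utcNow(),'yyyy')}\""
    }
  }
}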
I want to copy items from
CosmosDB databaseA/productCollection
to
CosmosDB databaseB/productCollection
Therefore I decided to use Azure Data Factory.
I also activated "Export as-is to JSON files or Cosmos DB collection".
The read operation works as expected.
Unfortunately, the write operation stops because of an error related to the data:
ErrorCode=InvalidTemplate, ErrorMessage=Unable to parse expression 'Currency'
{
  "ProductName": "Sample",
  "Price": {
    "@Currency": "GBP",
    "$": "2624.83"
  }
}
I'm not able to change the input data itself.
The output data has to equal the input data.
Is there a possibility that @Currency will not be interpreted as an expression?
In ARM, this part is failing:
Price.{@Currency}
I had the same problem and was able to resolve it as follows.
I am using a Pipeline with a Source that is a Dataset referencing JSON data.
Clicking the button highlighted below, I had to change the JSON from
{
  "name": "SourceDataset",
  "properties": {
    "linkedServiceName": {
      "referenceName": "StorageAccountLink",
      "type": "LinkedServiceReference"
    },
    "annotations": [],
    "type": "Json",
    "typeProperties": {
      "location": {
        "type": "AzureBlobStorageLocation",
        "container": "test-data"
      }
    },
    "schema": {
      "type": "object",
      "properties": {
        "@context": {
          "type": "string"
        },
        "value": {
          "type": "array",
          "items": {
            "type": "object",
            "properties": {
              "id": {
                "type": "string"
              }
            }
          }
        }
      }
    }
  }
}
To (escaping the @ with @@):
{
  "name": "SourceDataset",
  "properties": {
    "linkedServiceName": {
      "referenceName": "StorageAccountLink",
      "type": "LinkedServiceReference"
    },
    "annotations": [],
    "type": "Json",
    "typeProperties": {
      "location": {
        "type": "AzureBlobStorageLocation",
        "container": "test-data"
      }
    },
    "schema": {
      "type": "object",
      "properties": {
        "@@context": {
          "type": "string"
        },
        "value": {
          "type": "array",
          "items": {
            "type": "object",
            "properties": {
              "id": {
                "type": "string"
              }
            }
          }
        }
      }
    }
  }
}
I tried to reproduce your issue, but it works for me. I used a copy activity to transfer data from account A to account B.
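For reference, a rough sketch of the kind of copy activity I mean (the dataset names ProductCollectionA and ProductCollectionB are placeholders for datasets pointing at databaseA/productCollection and databaseB/productCollection, and the source/sink type names assume the Cosmos DB SQL API connector, so adjust for your connector version):
{
  "name": "CopyProducts",
  "type": "Copy",
  "inputs": [
    { "referenceName": "ProductCollectionA", "type": "DatasetReference" }
  ],
  "outputs": [
    { "referenceName": "ProductCollectionB", "type": "DatasetReference" }
  ],
  "typeProperties": {
    "source": { "type": "CosmosDbSqlApiSource" },
    "sink": { "type": "CosmosDbSqlApiSink" }
  }
}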
Additionally, if this operation only needs to be executed once, please consider using the Azure Cosmos DB Data Migration Tool. It's free to use: you could export the data from Cosmos DB A as a JSON file and then import it into Cosmos DB B very simply. It can also be run from the command line, so it can be set up as a scheduled job on Windows.
I want to know the page id or page URL of pages that have "paternity" in the content. When I use the search functionality, I get the section details but I don't get any information about the page.
Query:
https://graph.microsoft.com/v1.0/me/drive/root/search(q='paternity')
Result:
{
  "@odata.context": "https://graph.microsoft.com/v1.0/$metadata#Collection(driveItem)",
  "value": [
    {
      "@odata.type": "#microsoft.graph.driveItem",
      "createdDateTime": "2018-07-11T11:06:28Z",
      "id": "dfhsdfkhfklsdfsdkjf",
      "lastModifiedDateTime": "2018-07-11T11:20:26Z",
      "name": "Section1.one",
      "webUrl": "https://microsoft-my.sharepoint.com/personal/abcd_contoso_com/_layouts/15/WopiFrame.aspx?sourcedoc=%7B7E1C4305-983D-4CE2-A15E-DBAF1B961423%7D&file=Section1.one&action=default&DefaultItemOpen=1",
      "size": 390328,
      "createdBy": {
        "user": {
          "email": "abcd@contoso.com",
          "displayName": "abcd"
        }
      },
      "lastModifiedBy": {
        "user": {
          "email": "abcd@contoso.com",
          "displayName": "abcd"
        }
      },
      "parentReference": {
        "driveId": "b!QqRkFzhjsdgjkdhfkjdhXiDBfhDiNEmqz4NJGbg-Gcv-NrFDvVRJca8R9-3ylQ",
        "driveType": "business",
        "id": "01QsdjhdkjhdsdkjhHGT4FXINN2A"
      },
      "file": {
        "mimeType": "application/msonenote"
      },
      "fileSystemInfo": {
        "createdDateTime": "2018-07-11T11:06:28Z",
        "lastModifiedDateTime": "2018-07-11T11:20:26Z"
      },
      "searchResult": {}
    }
  ]
}
Please advise on how to get page-level information.
The OneNote API has search - you may try using that one:
https://blogs.msdn.microsoft.com/onenotedev/2014/11/17/introducing-the-onenote-search-api-beta-powered-by-bing/
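If you are staying within Microsoft Graph, the rough equivalent is the OneNote pages endpoint with the search query parameter (a sketch only; search support on OneNote pages has limitations, for example around consumer vs. business notebooks, so check the current docs):
GET https://graph.microsoft.com/v1.0/me/onenote/pages?search=paternity&$select=id,title,links
Each result is a page, so the response gives you page ids and page links rather than just the containing section file.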
We're trying to deploy an ARM template which deploys a Stream Analytics job with n Event Hubs outputs depending on an input parameter.
Right now we're having success with everything except the listKeys() function inside the outputs property copy loop, which gets each Event Hub's primary key:
"sharedAccessPolicyKey": "[listKeys(resourceId('Microsoft.EventHub/namespaces/eventhubs/authorizationRules', variables('clientEventHubNamespace'), parameters('clients')[copyIndex('outputs')].id, variables('clientEventHubClientSharedAccessName')), '2015-08-01').primaryKey]"
We get the error:
17:44:31 - Error: Code=InvalidTemplate; Message=Deployment template
validation failed: 'The template resource
'tailor-router-axgf7t3gtspue' at line '129' and column '10' is not
valid: The template function 'copyIndex' is not expected at this
location. The function can only be used in a resource with copy
specified. Please see https://aka.ms/arm-copy for usage details..
Please see https://aka.ms/arm-template-expressions for usage
details.'.
However, if we change this to be a specific index:
"sharedAccessPolicyKey": "[listKeys(resourceId('Microsoft.EventHub/namespaces/eventhubs/authorizationRules', variables('clientEventHubNamespace'), parameters('clients')[0].id, variables('clientEventHubClientSharedAccessName')), '2015-08-01').primaryKey]"
it works.
Is copyIndex('propertyName') inside a listKeys() a supported function?
If not, is there a workaround that would achieve the same effect?
Kind regards,
Nick
Stream Analytics job resource definition:
{
  "apiVersion": "2016-03-01",
  "type": "Microsoft.StreamAnalytics/StreamingJobs",
  "name": "[variables('routerStreamAnalyticsName')]",
  "location": "[variables('location')]",
  "dependsOn": [ "clientsEventHubCopy" ],
  "tags": {
    "boundedContext": "[variables('boundedContextName')]"
  },
  "properties": {
    "sku": {
      "name": "[parameters('routerStreamAnalyticsSkuTier')]"
    },
    "outputErrorPolicy": "drop",
    "eventsOutOfOrderPolicy": "adjust",
    "eventsOutOfOrderMaxDelayInSeconds": 0,
    "eventsLateArrivalMaxDelayInSeconds": 5,
    "dataLocale": "en-US",
    "compatibilityLevel": "1.0",
    "inputs": [
      {
        "name": "input0",
        "properties": {
          "type": "stream",
          "serialization": {
            "type": "Avro"
          },
          "datasource": {
            "type": "Microsoft.ServiceBus/EventHub",
            "properties": {
              "serviceBusNamespace": "[parameters('input0EventHubNamespace')]",
              "sharedAccessPolicyName": "[parameters('input0EventHubSharedAccessPolicyName')]",
              "sharedAccessPolicyKey": "[parameters('input0EventHubSharedAccessPolicyKey')]",
              "eventHubName": "[parameters('input0EventHubName')]"
            }
          }
        }
      }
    ],
    "transformation": {
      "name": "routing",
      "properties": {
        "streamingUnits": "[parameters('routerStreamAnalyticsSkuTier')]",
        "query": "omitted"
      }
    },
    "copy": [
      {
        "name": "outputs",
        "count": "[length(parameters('clients'))]",
        "input": {
          "name": "[parameters('clients')[copyIndex('outputs')].id]",
          "properties": {
            "datasource": {
              "type": "Microsoft.ServiceBus/EventHub",
              "properties": {
                "serviceBusNamespace": "[variables('clientEventHubNamespace')]",
                "sharedAccessPolicyName": "[variables('clientEventHubClientSharedAccessName')]",
                "sharedAccessPolicyKey": "[listKeys(resourceId('Microsoft.EventHub/namespaces/eventhubs/authorizationRules', variables('clientEventHubNamespace'), parameters('clients')[copyIndex('outputs')].id, variables('clientEventHubClientSharedAccessName')), '2015-08-01').primaryKey]",
                "eventHubName": "[parameters('clients')[copyIndex('outputs')].id]"
              }
            },
            "serialization": {
              "type": "Avro"
            }
          }
        }
      }
    ]
  }
},
Thanks for reporting this and sorry for the inconvenience.
I just talked to the ARM team; we had an issue when copyIndex was used inside an array index, e.g. 'array[copyIndex()]'. It should be fixed now.
Let us know how it goes.
Thanks,
JS - Azure Stream Analytics
This might be a silly question, but I could not manage to filter an Elasticsearch index by a datetime field. I must be missing something.
This is the mapping:
"created_at": {
"type": "date",
"format": "strict_date_optional_time||epoch_millis"
},
This is what I got:
{
  "_index": "myindex",
  "_type": "myindextype",
  "_id": "21c",
  "_score": 1,
  "_source": {
    "code": "21c",
    "name": "hello",
    ...
    "created_at": "2015-04-30T13:10:50.107769Z"
  }
},
With this query:
"query": {
"filtered": {
"query": {},
"filter": {
"range": {
"created_at": {
"gte": "2015-05-02T13:10:50.107769Z"
"format": "strict_date_optional_time||epoch_millis"
}}}}}
I would expect to filter out the entry above. But it returns nothing.
Is there a problem with the time format? It comes directly from Django Rest Framework's serializers; they claim it is ISO 8601 format, and Elasticsearch claims the same.
I would also like to filter them out by a simpler date like "2015-05-02".
I am stuck. Thank you in advance.
Edit: It does not matter what I write into the range filter; it always returns all the entries.
This worked. I tried a lot of different things and lost my way at some point.
{
  "query": {
    "filtered": {
      "filter": {
        "range": {
          "created_at": {
            "gte": "2015-05-02"
          }
        }
      }
    }
  }
}
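As a side note beyond the original answer: the filtered query was deprecated in Elasticsearch 2.0 and removed in 5.0, so on newer versions the equivalent is a bool query with a filter clause:
{
  "query": {
    "bool": {
      "filter": {
        "range": {
          "created_at": {
            "gte": "2015-05-02"
          }
        }
      }
    }
  }
}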