I have a 6NF-esque schema in part of my database, where any time a property value is changed, a new row is created with the CURRENT_TIMESTAMP. For example:
+----------+-------+---------+
| EntityID | Value | TimeSet |
+----------+-------+---------+
| 1        | foo   | 1:30 PM |
| 1        | bar   | 1:31 PM |
+----------+-------+---------+
So the PK is (EntityID, TimeSet). TimeSet is a MySQL TIMESTAMP; I just used readable values for the example. Any GET request will SELECT only the latest value for the entity (i.e. GET /entities/1/<property> would return bar only).
As of right now, there are no behaviors that depend on the time set, it's just there for auditing. My question is: when I want to set values for this attribute over HTTP, should I be using PUT or POST? Technically, a new row is being created every time the user sends a value, but from the standpoint of the API, the request is idempotent, because you could create 100 rows of the same value, and only the most recent one will be returned for any GET requests.
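To make the pattern concrete, here is a minimal sketch of the append-only table and the latest-value read, using sqlite3 in place of MySQL and explicit timestamps in place of CURRENT_TIMESTAMP (table and column names taken from the example above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE entity_values (
        EntityID INTEGER NOT NULL,
        Value    TEXT    NOT NULL,
        TimeSet  TEXT    NOT NULL,
        PRIMARY KEY (EntityID, TimeSet)
    )
""")

def set_value(entity_id, value, time_set):
    # Every write appends a new row; nothing is ever updated in place.
    conn.execute(
        "INSERT INTO entity_values (EntityID, Value, TimeSet) VALUES (?, ?, ?)",
        (entity_id, value, time_set),
    )

def get_value(entity_id):
    # A GET only ever sees the most recent row for the entity.
    row = conn.execute(
        "SELECT Value FROM entity_values WHERE EntityID = ? "
        "ORDER BY TimeSet DESC LIMIT 1",
        (entity_id,),
    ).fetchone()
    return row[0] if row else None

set_value(1, "foo", "2024-01-01 13:30:00")
set_value(1, "bar", "2024-01-01 13:31:00")
```

Setting the same value twice creates two rows, but a subsequent GET is unchanged, which is the observable-idempotence point made above.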
Maybe this can help you:
The POST method is used to request that the origin server accept the entity enclosed in the request as a new subordinate of the resource identified by the Request-URI in the Request-Line. POST is designed to allow a uniform method to cover the following functions:
- Annotation of existing resources;
- Posting a message to a bulletin board, newsgroup, mailing list, or similar group of articles;
- Providing a block of data, such as the result of submitting a form, to a data-handling process;
- Extending a database through an append operation.
The PUT method requests that the enclosed entity be stored under the supplied Request-URI.
http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html
You should be looking at things from a resource perspective. Though you are just updating a value with a timestamp, you are actually creating a new resource on the server rather than modifying the old one. Returning the latest timestamped resource is part of your business logic and should not be confused with the choice between PUT and POST.
So the right answer is to use a POST request.
Related
I am writing a Python backend which will retrieve text via several methods, such as:
1. URL fetched via a request and beautifulsoup4 parsing -> str
2. RSS news text fetched via feedparser -> str
3. MongoDB document key/value from a given collection/db -> str
4. MySQL SELECT -> str
5. SPARQL query -> str
For every str I receive from any of the above, I will run an NLP pipeline and wish to persist its resulting findings, with some reference that lets me go back to that same text.
For case 1, maybe case 2 (also with a URL?), and case 5, I can benefit from a unique ID, which is the URL/URI; correspondingly, I could also use the unique MongoDB _id for case 3, while for case 4 I am not sure what I could store to be able to go back to the original text.
Maybe I could then persist the NLP processing using a schema such as:
METHOD | IDENTIFIER | NLPRESULT
to store the results of all of these access methods?
Just as an additional clarification, if the above works well and I am presented with a text that originates from a resource I have already processed, I should be able to just skip the processing and fetch the previous result.
Are there best practices recommended to approach this task?
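A sketch of how that (METHOD, IDENTIFIER, NLPRESULT) cache could work; using a content hash as the identifier for the MySQL case is an assumption on my part, not something settled above, and the in-memory dict stands in for a real table with a unique index on (method, identifier):

```python
import hashlib

# (method, identifier) -> nlp_result; a real backend would use a DB table
# with a unique index on (method, identifier).
cache = {}

def identifier_for(method, source_ref, text):
    # URL/URI-style sources (and MongoDB _id) have a natural identifier;
    # for a MySQL SELECT, hashing the retrieved text itself is one
    # (assumed) option for getting back to the same content.
    if source_ref is not None:
        return source_ref
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def process(method, source_ref, text, nlp_pipeline):
    key = (method, identifier_for(method, source_ref, text))
    if key in cache:
        return cache[key]  # already processed: skip the pipeline entirely
    result = nlp_pipeline(text)
    cache[key] = result
    return result

# Illustrative usage with a stand-in pipeline:
calls = []
def fake_pipeline(text):
    calls.append(text)
    return text.upper()

r1 = process("mysql", None, "hello world", fake_pipeline)
r2 = process("mysql", None, "hello world", fake_pipeline)
```

The second call is answered from the cache, which is exactly the "skip the processing and fetch the previous result" behavior described above.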
Suppose that I have a table called persons and that a request to change any information about a person also updates that record's last_modified column. Would such a request still be considered idempotent? What I'm trying to find out is if auxiliary fields can be exempted from the criteria of idempotence.
If repeating a request changes information in the database each time (a POST request, obviously; you would not alter a person record on a GET request), then it's not idempotent, by definition, unless the only thing that changes is auxiliary bookkeeping (like logs or stats).
Here it's not the last_modified column which is important, it's the "change any information about a person" part.
A GET request is idempotent (and, strictly speaking, safe): you can take any URI and put it in an <IMG> tag in a web page, and browsers will load it without asking, so it must not alter anything in the database or in the session (destroying a session, for example, is not idempotent). An idempotent request can be prefetched and can run at any priority (no need to care about the order of several idempotent queries, since none of them can impact the others), etc.
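A small sketch of the distinction for the persons example (sqlite3, with illustrative column names): repeating the identical update leaves the record in the same state, which is what idempotence requires, even though last_modified is rewritten on every call.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE persons (
        id            INTEGER PRIMARY KEY,
        email         TEXT,
        last_modified TEXT
    )
""")
conn.execute("INSERT INTO persons VALUES (1, 'old@example.com', 't0')")

def set_email(person_id, email, now):
    # A PUT-style update: the target state depends only on the input,
    # so repeating it yields the same row (modulo last_modified).
    conn.execute(
        "UPDATE persons SET email = ?, last_modified = ? WHERE id = ?",
        (email, now, person_id),
    )

set_email(1, "new@example.com", "t1")
first = conn.execute("SELECT email FROM persons WHERE id = 1").fetchone()
set_email(1, "new@example.com", "t2")  # repeat the identical request
second = conn.execute("SELECT email FROM persons WHERE id = 1").fetchone()
```

The substantive state (email) is identical after one call or two; only the auxiliary last_modified column differs.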
As part of the Application Insights custom telemetry I create and send from my application, I can provide custom properties with Events that I choose to track. Those are then available later within the normal AI UI or Analytics interface to query against. Similarly, when a user begins a session, I can use the AI API to set the app-defined user identifier or an app-defined session identifier.
But is there a way to do a cross of the two? For example, is there a way I could set a custom property for a given user (such as an audience or role she is part of)? Or a way to set a custom property for a given user session (perhaps the connection type or the company branch office they are in)? There are plenty of predefined user- and session-related properties that AI implicitly associates with each user session (like city, country, device, etc.).
I would really like to set properties like these one time for that session (or user) and then be able to associate other activities during that user session with these properties. (such as custom events, metrics, trace entries, etc.) What I need to avoid is having to set such properties with every event, every trace, or every metric logged (e.g., with an ITelemetryInitializer), because I've got about 25 different ASP.NET apps instrumented on the client and server side and a couple of separate SaaS apps instrumented only on the client side. To try to introduce custom extensions and then continually and repeatedly determine the custom properties to be added to everything logged would be a monumental undertaking across a lot of teams.
Is this possible? If so, how? I haven't been able to find any mention of it in the API documentation and Intellisense snooping in the C# API has similarly turned up nothing obvious. (e.g., with Microsoft.ApplicationInsights.Channel.ITelemetry.Context.Session or .User)
Yes, you can set a property once per session and then use a join to associate it with the rest of the events.
For instance, the query below counts events per session and then associates this count with the custom property. After that, it can be piped into further aggregations if needed.
// all custom events from the last day
let events = customEvents
| where timestamp > ago(1d);
events
// event count per session
| summarize count() by session_Id
// join in the one event per session that carries the custom property
| join kind=inner (
    events
    | where name == "MySingleEventPerSession"
    | summarize any(*) by session_Id
) on session_Id
| project count_, any_customDimensions.MyCustomProperty, session_Id
Imagine that there are two clients, client1 and client2, both writing the same key. This key has three replicas, named A, B, and C. A receives client1's request first and then client2's, while B receives client2's request first and then client1's. Now A and B must be inconsistent with each other, and they cannot resolve the conflict even using vector clocks. Am I right?
If so, it seems that write conflicts occur easily in Dynamo. Why are so many open source projects based on Dynamo's design?
If you're using Dynamo and are worried about race conditions (which you should be if you're using Lambda), you can add condition expressions to putItem or updateItem; if the condition fails, the write is rejected.
For example: during getItem the timestamp was 12345, so you add a condition that the timestamp must still equal 12345. If another process updates the item and changes the timestamp to 12346, your put/update will now fail. In Java, for example, you can catch ConditionalCheckFailedException, do another getItem, apply your changes on top, and resubmit the put/update.
To prevent a new item from replacing an existing item, use a conditional expression that contains the attribute_not_exists function with the name of the attribute being used as the partition key for the table. Since every record must contain that attribute, the attribute_not_exists function will only succeed if no matching item exists.
For more information about PutItem, see Working with Items in the Amazon DynamoDB Developer Guide.
Parameters:
putItemRequest - Represents the input of a PutItem operation.
Returns:
Result of the PutItem operation returned by the service.
Throws:
ConditionalCheckFailedException - A condition specified in the operation could not be evaluated.
https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/dynamodbv2/AmazonDynamoDB.html#putItem-com.amazonaws.services.dynamodbv2.model.PutItemRequest-
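The timestamp-check pattern above can be sketched as a helper that builds the parameters for an update_item call. This is a plain dict of the kind you would pass to boto3's Table.update_item (the ts attribute name and version values are illustrative, not a prescribed schema):

```python
def conditional_update_params(key, new_value, read_ts, new_ts):
    # Optimistic locking: the update only succeeds if the item's timestamp
    # is still the one we read. Otherwise DynamoDB raises
    # ConditionalCheckFailedException, and the caller can re-read the item,
    # reapply its change on top, and resubmit.
    return {
        "Key": key,
        "UpdateExpression": "SET #val = :new_value, ts = :new_ts",
        "ConditionExpression": "ts = :read_ts",
        # 'value' collides with a reserved word, so alias it:
        "ExpressionAttributeNames": {"#val": "value"},
        "ExpressionAttributeValues": {
            ":new_value": new_value,
            ":read_ts": read_ts,
            ":new_ts": new_ts,
        },
    }

# We read the item and saw ts == 12345; our write bumps it to 12346.
params = conditional_update_params({"pk": "item-1"}, "hello", 12345, 12346)
```

If a concurrent writer got in first, the stored ts is no longer 12345, the condition fails, and the write is safely rejected instead of silently lost.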
I can't talk about HBase, but I can about Cassandra, which is inspired by Dynamo.
If that happens in Cassandra, the most recent write wins.
Cassandra uses coordinator nodes (which can be any node) that receive the client requests and forward them to all replica nodes, so each request carries its own timestamp.
Imagine that Client2's request is the most recent, milliseconds after Client1's.
Replica A receives Client1's write, which is saved, and then Client2's, which is saved over Client1's since it is the more recent information for that key.
Replica B receives Client2's write, which is saved, and then Client1's, which is rejected since it has an older timestamp.
Both replicas A and B end up with Client2's value, the most recent information, and are therefore consistent.
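A toy simulation of this last-write-wins rule (a deliberate simplification: one timestamp per write, tie-breaking ignored). Each replica keeps only the value with the highest timestamp, so both arrival orders converge:

```python
class Replica:
    def __init__(self):
        self.value = None
        self.ts = -1

    def write(self, value, ts):
        # Last write wins: accept only writes newer than what we hold;
        # older writes are rejected, regardless of arrival order.
        if ts > self.ts:
            self.value, self.ts = value, ts

client1 = ("from-client1", 100)
client2 = ("from-client2", 101)  # milliseconds later, so a higher timestamp

a, b = Replica(), Replica()
a.write(*client1); a.write(*client2)   # replica A: client1 first
b.write(*client2); b.write(*client1)   # replica B: client2 first
```

Both replicas finish holding client2's value, mirroring the narrative above.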
My question is really basic. Consider querying a modality worklist to get some work items with a C-FIND query, using a sequence (SQ) as a Return Key attribute, for example (0040,0100) (Scheduled Procedure Step Sequence) with universal matching.
What should I expect in the SCP's C-FIND response? Or, better said, what should I expect to find regarding the scheduled procedure step for a specific work item? All the mandatory attributes that the Modality Worklist Information Model declares as encapsulated in the sequence? Or should I instead explicitly issue a C-FIND request for those keys I want the SCP to return in the response?
For example: if I want the SCP to return the Scheduled Procedure Step Start Time and Scheduled Procedure Step Start Date, do I need to issue a C-FIND request with those specific keys, or is querying for the Scheduled Procedure Step Sequence key enough to force the SCP to send all attributes related to the Scheduled Procedure Step itself?
Yes, you should include the Scheduled Procedure Step Start Time / Start Date tags inside the (0040,0100) sequence.
See also the Service Class Specifications (K.6.1.2.2).
This does not guarantee that you will retrieve the information, because which attributes are returned depends on the Modality Worklist provider.
You could also request a DICOM Conformance Statement from the modality provider to learn which tags it supports for request/retrieve.
As for Table K.6-1, you can consider it as showing only the requirements on the SCP side: which matching keys (i.e., query filters) the SCP is required to support, and which Return Key attribute values it must send back with a successful match. It is up to the SCP's implementation to support matching against a required key, but you can always expect the SCP to use the values in the matching keys as query filters.
Also note that the SCP is only required to return values for attributes that are present in the C-FIND request. One exception is sequence matching, where you have a universal-matching-like mechanism: you can pass a zero-length item to retrieve the entire sequence. So, as stated in PS 3.4 Section C.2.2.2.6, you can just include an empty item under the Scheduled Procedure Step Sequence (0040,0100) element (VR SQ) for universal matching.