Elasticsearch bulk update followed by search - asynchronous

On my server I update some documents using the bulk API:
{"update":{"_type":"post","_retry_on_conflict":"3","_index":"xxxx","_id":"yyyy"}}
{"doc":{"sentiment":"positive","mood":1,"upgrade":true}}
After I get the response I make a new request for the same document using search:
{"query":{"filtered":{"filter":{"ids":{"values":["yyyy"]}}}}}
But the returned document does not have the updated value (it still has the old one). If I wait for some time, the updated value appears. I think that happens because bulk is asynchronous? Is there any way to fix this?

You can use the refresh API to force an index refresh, or even add ?refresh=true at the end of the bulk command, although this is normally not recommended. Also, if there is more than one node, you may need to use a synced flush.
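For example, a sketch reusing the bulk request from the question (the refresh parameter makes the changes visible to search before the call returns):
POST /xxxx/_bulk?refresh=true
{"update":{"_type":"post","_retry_on_conflict":"3","_index":"xxxx","_id":"yyyy"}}
{"doc":{"sentiment":"positive","mood":1,"upgrade":true}}
Alternatively, refresh the index explicitly before searching:
POST /xxxx/_refresh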

Related

How to paginate data in Firebase in Next.js SSR app?

I am trying to build a Next.js server-side rendered blog. For it, I need to paginate the posts data. However, I have yet to find a way to use the query cursors Firebase provides to paginate the data. The query-building code is:
let postsQuery = firebase.firestore()
  .collection('/posts')
  .orderBy('postedOn', 'asc')
  .limitToLast(10);
if (currentTagFilter !== 'All') {
  postsQuery = postsQuery.where('tag', '==', currentTagFilter);
}
Now, this works for the first page, but I do not know how to request the next 10 posts. I could have saved the first document of the query and used .endBefore(firstPost). But if I create some state in _app.js and save the first document in an array, for example, I cannot find a way to make it accessible in getServerSideProps. Not to mention that if the user goes straight to /page/2, nothing will be displayed, since the query for page 1 has not been performed yet.
How can I paginate the data correctly?
You might want to rethink your pagination strategy entirely. Firestore doesn't support pagination by index or page number. You have to provide a document snapshot, or document details from the last seen document, in order to get the next page.
Given these limitations and requirements, it's not possible for the user to go straight to page 2 (or any page other than the first one). So, it would be a bad idea to provide a link or mechanism to do that.
If you want to paginate data "correctly" with Firestore, you have to start at the first page, and cycle through the results using startAfter(), providing the details of the document where the last page ended. This is illustrated in the documentation.
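A minimal getServerSideProps sketch of that approach (assumptions: the query is flipped to descending order with a plain limit(), and a hypothetical after query parameter carries the postedOn value, in milliseconds, of the last post on the previous page):
// pages/posts.js - a sketch, not a drop-in solution
export async function getServerSideProps({ query }) {
  let postsQuery = firebase.firestore()
    .collection('/posts')
    .orderBy('postedOn', 'desc') // newest first
    .limit(10);
  if (query.tag && query.tag !== 'All') {
    postsQuery = postsQuery.where('tag', '==', query.tag);
  }
  // Cursor from the previous page; without it we serve page 1.
  if (query.after) {
    postsQuery = postsQuery.startAfter(new Date(Number(query.after)));
  }
  const snapshot = await postsQuery.get();
  const posts = snapshot.docs.map((doc) => ({ id: doc.id, ...doc.data() }));
  return { props: { posts } };
}
The "next page" link would then carry the last post's postedOn value, which is also why a visitor cannot jump straight to page 2: the cursor only exists once page 1 has been fetched.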

How to read documents from Change Feed in Azure Cosmos DB since last checkpoint after restart?

I am using the Change Feed processor library to read the Change Feed on a partitioned collection, and below is how I have configured it. I am using mostly the default options.
ChangeFeedProcessorOptions feedProcessorOptions = new ChangeFeedProcessorOptions
{
    LeaseRenewInterval = TimeSpan.FromSeconds(15),
};
var docObserverFactory = DocumentFeedObserverFactory.Create(this.destinationCollectionInfo, this.dbRepository);
this.builder
.WithHostName(hostName)
.WithFeedCollection(this.monitoredCollectionInfo)
.WithLeaseCollection(this.leaseCollectionInfo)
.WithProcessorOptions(feedProcessorOptions)
.WithObserverFactory(docObserverFactory);
This runs fine as long as the Change Feed application is running and documents are being inserted/updated in the collection and the Change Feed app picks them up as expected.
The problem happens when I stop the Change Feed app for some time and insert/update a few documents in the collection. When I then start the Change Feed app, it doesn't pick up changes from where it last left off: the changes made while the app was stopped are lost. But when I set the StartFromBeginning flag to true, it picks everything up from the start, including the changes that were made while the app was stopped.
My understanding of reading from the current point (StartFromBeginning set to false) is that the Change Feed reads documents from where it last left off. But that doesn't seem to happen. Please help.
There are two ways to continue from exactly where you left off.
The first, and more accurate one, is to store the Continuation token of the last thing you read. That way you can specify it when you start again and it will win over both the StartTime and the StartFromBeginning flags.
The second one is to provide the StartTime property which will try and find the continuation token of a given time automatically. It has an approximate 5 second precision so there is a chance that you might miss some documents though.
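A sketch against the options object from the question (storedContinuationToken and lastStopTimeUtc are hypothetical values that you would persist yourself):
ChangeFeedProcessorOptions feedProcessorOptions = new ChangeFeedProcessorOptions
{
    LeaseRenewInterval = TimeSpan.FromSeconds(15),
    // A previously stored continuation token wins over both
    // StartTime and StartFromBeginning.
    StartContinuation = storedContinuationToken,
    // Alternatively, resume from a point in time (about 5 second
    // precision), e.g.: StartTime = lastStopTimeUtc
};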

Meteor: Single-Document Subscription

Whenever I encounter code snippets on the web, I see something like
Meteor.subscribe('posts', 'bob-smith');
The client can then display all posts of "bob-smith".
The subscription returns several documents.
What I need, in contrast, is a single-document subscription in order to show an article's body field. I would like to filter by (article) id:
Meteor.subscribe('articles', articleId);
But I got suspicious when I searched the web for similar examples: I cannot find even one single-document subscription example.
What is the reason for that? Why does nobody use single-document subscriptions?
Oh but people do!
This is not against any best practice that I know of.
For example, here is a code sample from the GitHub repository of Telescope where you can see a publication for retrieving a single user based on his or her id.
Here is another one for retrieving a single post, and here is the subscription for it.
It is actually sane to subscribe only to the data that you need at a given moment in your app. If you are writing a single post page, you should make a single post publication/subscription for it, such as:
Meteor.publish('singleArticle', function (articleId) {
  return Articles.find({_id: articleId});
});
// Then, from an iron-router route for example:
Meteor.subscribe('singleArticle', this.params.articleId);
A common pattern that uses a single document subscription is a parameterized route, ex: /posts/:_id - you'll see these in many iron:router answers here.
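On the client, once the subscription is ready, the document itself is read with a plain findOne. A sketch (the articlePage template name is made up here):
// client-side: resolve the single article inside a Blaze template helper
Template.articlePage.helpers({
  article: function () {
    return Articles.findOne(Router.current().params.articleId);
  }
});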

Updating Meteor.users

I've created a form for users to update their profiles. When I submit the form I'm receiving a [403] error.
Not permitted. Untrusted code may only update documents by ID.
My question is, if I'm going to use Meteor.users.allow, where - in what file/directory - do I write this code?
Thanks,
Nathan
The error you're getting is not a result of your allow/deny rules. You would get a straight 'Access Denied' error if it were.
When updating your users (as well as having the correct allow rules in place), you need to update the user by their _id - especially if the update is made on the client.
So instead of
Meteor.users.update({name: "etc"}, {$set:..});
You need to split it in two: one query to get the _id, and then one to update your document by that _id.
var user = Meteor.users.findOne({name: 'etc'});
Meteor.users.update({_id: user._id}, {$set:..});
The rule is that, on the client, you can only select a document by its _id when updating.
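As for where the allow rules go: they only take effect in code that runs on the server, so a file under server/ (or a shared file outside client/) is the usual place. A sketch (the file name and the exact rule are just examples):
// server/permissions.js
Meteor.users.allow({
  update: function (userId, doc, fields, modifier) {
    // Example rule: users may only modify their own profile field.
    return userId === doc._id &&
      _.every(fields, function (f) { return f === 'profile'; });
  }
});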

WordPress Write Cache Issue with Multiple Sessions

I'm working on a content-dripper custom plugin in WordPress that my client asked me to build. He wants it to catch a page view event and, if it's the right time of day (24 hours since the last post), pull from a resource file and output another post. He also needed it to raise a flag and prevent other sessions from firing that same snippet of code. So, raise some kind of flag saying, "I'm posting that post, go away other process," then make the post and release the flag again.
However, the strangest thing occurs when the site is placed under load with multiple sessions hitting it with page views. Instead of firing one post, it randomly creates 1, 2, or 3 extra posts, each session thinking it was the right time to post because 24 hours had passed since the last post. Because it's somewhat random, I'm guessing the problem is some kind of write caching where the other sessions don't see the raised flag until a few microseconds pass.
The plugin was raising the "flag" by simply writing to the wp_options table with the update_option() API in WordPress. The other user sessions were supposed to read that value with get_option() and see the flag, and then not run that piece of code that creates the post because a given session was already doing it. Then, when done, I lower the flag and the other sessions continue as normal.
But what it's doing is letting those other sessions in.
To make this work, I was using add_action('loop_start','checkToAddContent'). The odd thing about that function though is that it's called more than once on a page, and in fact some plugins may call it. I don't know if there's a better event to hook. Even still, even if I find an event to hook that only runs once on a page view, I still have multiple sessions to contend with (different users who may view the page at the same time) and I want only one given session to trigger the content post when the post is due on the schedule.
I'm wondering if there are any WordPress plugin devs out there who could suggest another event hook to latch on to, and to figure out another way to raise a flag that all sessions would see. I mean, I could use the shared memory API in PHP, but many hosting plans have that disabled. Can't use a cookie or session var because that's only one single session. About the only thing that might work across hosting plans would be to drop a file as a flag, instead. If the file is present, then one session has the flag. If the file is not present, then other sessions can attempt to get the flag. Sure, I could use the file route, but it's kind of immature in my opinion and I was wondering if there's something in WordPress I could do.
The key may be to create a semaphore record in the database for the "drip" event.
Warning - consider the following pseudocode - I'm not looking up the functions.
When the post is queried, use a SQL statement like
$ts = get_time_now(); // or whatever the function is
$sid = session_id();
INSERT INTO table (postcategory, timestamp, sessionid)
SELECT "$category", $ts, "$sid"
WHERE NOT EXISTS (SELECT 1 FROM table WHERE postcategory = "$category"
                  AND timestamp > $ts - 24 hours)
Database integrity will make this atomic, so only one record can be inserted, and the insertion will only take place if the timespan has been exceeded.
Then immediately check to see if the current session_id() and timestamp are yours. If they are, drip.
SELECT sessionid FROM table
WHERE postcategory = "$category"
AND timestamp = $ts
AND sessionid = "$sid"
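In WordPress terms the same atomic insert could go through $wpdb. A sketch (the drip_lock table, its columns, and $category are hypothetical and would be created/set by the plugin):
global $wpdb;
$ts    = time();
$sid   = session_id();
$table = $wpdb->prefix . 'drip_lock'; // hypothetical lock table
// Atomic "insert only if no drip in the last 24 hours": the database
// arbitrates the race, so concurrent sessions cannot both succeed.
$inserted = $wpdb->query($wpdb->prepare(
    "INSERT INTO $table (postcategory, ts, sessionid)
     SELECT %s, %d, %s FROM DUAL
     WHERE NOT EXISTS (
         SELECT 1 FROM $table WHERE postcategory = %s AND ts > %d
     )",
    $category, $ts, $sid, $category, $ts - DAY_IN_SECONDS
));
if ($inserted === 1) {
    // This session won the race: create the scheduled post here.
}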
The problem can occur with page requests from the same session (same visitor), but also with page requests from separate visitors. It works like this:
If you are doing content dripping, then a page request is probably what you intercept with add_action('wp','myPageRequest'). From there, if a scheduled post is due, then you create the new post.
The post takes a little bit of time to write to the database. In that time, a query on get_posts() may not see that new record yet. It may actually trigger your piece of code to create a new post when one has already been placed.
The fix, which appears to force WordPress to flush the write cache, is this:
try {
    // Touch a hidden custom field on the most recent post; the extra
    // write defeats the write cache without being too CPU heavy.
    $asPosts = wp_get_recent_posts(1);
    foreach ($asPosts as $asPost) { break; }
    delete_post_meta($asPost['ID'], '_thwart');
    add_post_meta($asPost['ID'], '_thwart', '' . date('Y-m-d H:i:s'));
} catch (Exception $e) {}

// Now re-read the most recent post and compare its date to today's.
$asPosts = wp_get_recent_posts(1);
foreach ($asPosts as $asPost) { break; }
$sLastPostDate = $asPost['post_date'];
$sLastPostDate = substr($sLastPostDate, 0, strpos($sLastPostDate, ' '));
$sNow = date('Y-m-d H:i:s');
$sNow = substr($sNow, 0, strpos($sNow, ' '));
if ($sLastPostDate != $sNow) {
    // No post today, so go ahead and post your new blog post.
    // Place that code here.
}
The first thing we do is get the most recent post. We don't really care whether it's truly the most recent one; all we need from it is a single post ID, so that we can add a hidden custom field (hence the leading underscore) called
_thwart
...as in, thwart the write cache by posting some data to the database that's not too CPU heavy.
Once that is in place, we call wp_get_recent_posts(1) again to check whether the most recent post's date is today's date. If it isn't, we are clear to drip some content in. (Or, if you want to drip only every 72 hours, etc., you can adjust this here.)
