Network timeout using node.js client at about 5 seconds - azure-cosmosdb

Note: this question was previously very different. What follows is the real issue.
When I call executeStoredProcedure() using the Node.js client, I get a 408 RequestTimeout code and no data back from the sproc's "body". The failure seems to surface at about 5 seconds, but when I time-bound the work inside the sproc itself, any limit over roughly 700 milliseconds causes a network timeout (even though I don't see it until about 5 seconds have passed).
Note that longer-running sprocs that only do reads are fine. The problem only seems to occur when I have a lot of createDocument() operations, so I don't think it's on the client side; I think something is happening on the server side.
It's still possible that my original theory is right: I'm not getting a false back from a createDocument() call, my sproc keeps running past its timeout, and that's what causes the 408.
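For reference, the call itself is nothing unusual. It looks roughly like this (a sketch with a placeholder endpoint, key, and sproc link, assuming the documentdb Node client):
var DocumentClient = require('documentdb').DocumentClient;

// Placeholder endpoint, key, and sproc link.
var client = new DocumentClient('https://myaccount.documents.azure.com:443/', {
    masterKey: 'REPLACE_WITH_KEY'
});
var sprocLink = 'dbs/mydb/colls/mycoll/sprocs/generateData';

client.executeStoredProcedure(sprocLink, [{ remaining: 10000 }], function(err, result) {
    if (err && err.code === 408) {
        // This is the RequestTimeout described above; no sproc body comes back with it.
        console.log('408 RequestTimeout');
    } else if (err) {
        throw err;
    } else {
        console.log(result);
    }
});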
Here is the time-limited version of my create-documents sproc:
generateData = function(memo) {
    var collection, collectionLink, nowTime, row, startTime, timeout;
    if ((memo != null ? memo.remaining : void 0) == null) {
        throw new Error('generateData must be called with an object containing a `remaining` field.');
    }
    if (memo.totalCount == null) {
        memo.totalCount = 0;
    }
    memo.countForThisRun = 0;
    timeout = memo.timeout || 600; // Works at 600. Fails at 800.
    startTime = new Date();
    memo.stillTime = true;
    collection = getContext().getCollection();
    collectionLink = collection.getSelfLink();
    memo.stillQueueing = true;
    while (memo.remaining > 0 && memo.stillQueueing && memo.stillTime) {
        row = {
            a: 1,
            b: 2
        };
        getContext().getResponse().setBody(memo);
        memo.stillQueueing = collection.createDocument(collectionLink, row);
        if (memo.stillQueueing) {
            memo.remaining--;
            memo.countForThisRun++;
            memo.totalCount++;
        }
        nowTime = new Date();
        memo.nowTime = nowTime;
        memo.startTime = startTime;
        memo.stillTime = (nowTime - startTime) < timeout;
        if (memo.stillTime) {
            memo.continuation = null;
        } else {
            memo.continuation = 'Value does not matter';
        }
    }
    getContext().getResponse().setBody(memo);
    return memo;
};

The stored procedure above queues document-create operations in a while loop until the API returns false.
Keep in mind that createDocument() is an asynchronous method. The boolean it returns indicates whether it is time to wrap up execution right there and then. The return value isn't "smart" enough to estimate and account for how much time the async call will take, so it can't be used to queue up a bunch of calls in a while() loop.
As a result, the stored procedure above doesn't terminate gracefully when the boolean comes back false, because it still has a bunch of createDocument() calls in flight. The end result is a timeout (which eventually leads to blacklisting on repeated attempts).
In short, avoid this pattern:
while (stillQueueing) {
    stillQueueing = collection.createDocument(collectionLink, row);
}
Instead, you should use the callback for control flow. Here is the refactored code:
function(memo) {
    var collection = getContext().getCollection();
    var collectionLink = collection.getSelfLink();
    var row = {
        a: 1,
        b: 2
    };
    if ((memo != null ? memo.remaining : void 0) == null) {
        throw new Error('generateData must be called with an object containing a `remaining` field.');
    }
    if (memo.totalCount == null) {
        memo.totalCount = 0;
    }
    memo.countForThisRun = 0;
    createMemo();
    function createMemo() {
        var isAccepted = collection.createDocument(collectionLink, row, function(err, createdDoc) {
            if (err) throw err;
            memo.remaining--;
            memo.countForThisRun++;
            memo.totalCount++;
            if (memo.remaining > 0) {
                createMemo();
            } else {
                getContext().getResponse().setBody(memo);
            }
        });
        if (!isAccepted) {
            getContext().getResponse().setBody(memo);
        }
    }
};
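Once the sproc wraps up gracefully like this, the client is responsible for driving it to completion: the memo it sets as the response body still carries the remaining count, so the caller simply invokes the sproc again with that memo until nothing is left. A rough sketch of that loop, reusing the same placeholder client and sprocLink shown in the question and assuming the documentdb Node client:
// Same placeholder client and sproc link as in the question's sketch.
function generateAll(memo, done) {
    client.executeStoredProcedure(sprocLink, [memo], function(err, result) {
        if (err) return done(err);
        if (result.remaining > 0) {
            // The sproc stopped early because isAccepted came back false; resume with its memo.
            return generateAll(result, done);
        }
        return done(null, result);
    });
}

generateAll({ remaining: 10000 }, function(err, finalMemo) {
    if (err) throw err;
    console.log('Created ' + finalMemo.totalCount + ' documents.');
});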

Related

How to know if client is in sync with server in Meteor?

I am trying to implement a feature where the user can see whether all requests to the server have been handled, i.e. whether the client is in sync with the server, so they can be confident that all of their changes are saved.
My idea was to override Meteor.call and increment a counter for each call, then decrement it when a reply or error comes back. On the client I will then show "Synced" if the counter is zero and "Unsynced" otherwise.
Basically, my question is whether there is any built-in feature in Meteor that already keeps track of outgoing method calls, or whether I should proceed as I have started.
This is what my code looks like at the moment:
var originalMeteorCall = Meteor.call;
var counter = 0;
Meteor.call = function() {
    if (this.isClient) {
        if (arguments && arguments.length > 1) {
            counter++;
            var returnFunc = arguments[arguments.length - 1];
            var newReturnFunc = function (err, result) {
                counter--;
                return returnFunc(err, result);
            }
            arguments[arguments.length - 1] = newReturnFunc;
        }
    }
    var result = originalMeteorCall.apply(this, arguments);
    return result;
}
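To drive a "Synced"/"Unsynced" label from this counter reactively, the wrapper can store the count in a ReactiveVar. A sketch of that idea, assuming the reactive-var package and a hypothetical syncStatus template, and (like the code above) only counting calls that pass a callback:
var pendingCalls = new ReactiveVar(0);

var originalMeteorCall = Meteor.call;
Meteor.call = function() {
    var args = Array.prototype.slice.call(arguments);
    var last = args[args.length - 1];
    if (Meteor.isClient && typeof last === 'function') {
        pendingCalls.set(pendingCalls.get() + 1);
        args[args.length - 1] = function(err, result) {
            pendingCalls.set(pendingCalls.get() - 1);
            return last(err, result);
        };
    }
    return originalMeteorCall.apply(this, args);
};

// Template helper that re-renders whenever the counter changes.
Template.syncStatus.helpers({
    syncLabel: function() {
        return pendingCalls.get() === 0 ? 'Synced' : 'Unsynced';
    }
});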

"Meteor code must always run within a Fiber" when using Meteor.runAsync

I am using Cassandra with Meteor:
let client = new cassandra.Client({contactPoints: [cassandraHost]});
var cassandraExecSync = Meteor.wrapAsync(client.execute, client);

MyProject.Feed.CassandraMeteorWrap = {
    insertNewPost: function (userId, postContentJson, relevance) {
        var insertCommand = insert(userId, postContentJson, relevance);
        try {
            return cassandraExecSync(insertCommand);
        } catch (err) {
            console.log("error inserting: " + insertCommand);
            console.log(err);
        }
    }
};
So I wrapped cassandra.Client.execute (which takes a callback as its last argument) with Meteor.wrapAsync.
The first few inserts work, but after a while (the insert is called periodically) I get:
[Error: Meteor code must always run within a Fiber. Try wrapping callbacks that you pass to non-Meteor libraries with Meteor.bindEnvironment.]
Update: debugging Meteor showed the stack trace; the exception originates in the npm package I use, "cassandra-driver", when its timer callback fires inside node's listOnTimeout() (the throwing line is marked below):
function listOnTimeout() {
    var msecs = this.msecs;
    var list = this;

    debug('timeout callback ' + msecs);

    var now = Timer.now();
    debug('now: %d', now);

    var first;
    while (first = L.peek(list)) {
        // If the previous iteration caused a timer to be added,
        // update the value of "now" so that timing computations are
        // done correctly. See test/simple/test-timers-blocking-callback.js
        // for more information.
        if (now < first._monotonicStartTime) {
            now = Timer.now();
            debug('now: %d', now);
        }

        var diff = now - first._monotonicStartTime;
        if (diff < msecs) {
            list.start(msecs - diff, 0);
            debug(msecs + ' list wait because diff is ' + diff);
            return;
        } else {
            L.remove(first);
            assert(first !== L.peek(list));

            if (!first._onTimeout) continue;

            // v0.4 compatibility: if the timer callback throws and the
            // domain or uncaughtException handler ignore the exception,
            // other timers that expire on this tick should still run.
            //
            // https://github.com/joyent/node/issues/2631
            var domain = first.domain;
            if (domain && domain._disposed) continue;
            try {
                if (domain)
                    domain.enter();
                var threw = true;
                first._onTimeout(); // <-- the exception is thrown here
                if (domain)
                    domain.exit();
                threw = false;
            } finally {
                if (threw) {
                    // We need to continue processing after domain error handling
                    // is complete, but not by using whatever domain was left over
                    // when the timeout threw its exception.
                    var oldDomain = process.domain;
                    process.domain = null;
                    process.nextTick(function() {
                        list.ontimeout();
                    });
                    process.domain = oldDomain;
                }
            }
        }
    }

    debug(msecs + ' list empty');
    assert(L.isEmpty(list));
    list.close();
    delete lists[msecs];
}
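The error message itself points at the usual fix: callbacks that non-Meteor code invokes later (here, from the driver's own timers) need to be wrapped so they run inside a Fiber. A rough sketch of that approach, passing the callback through Meteor.bindEnvironment instead of using Meteor.wrapAsync, and reusing the client and insert() helper from the code above:
MyProject.Feed.CassandraMeteorWrap = {
    insertNewPost: function (userId, postContentJson, relevance) {
        var insertCommand = insert(userId, postContentJson, relevance);
        // Meteor.bindEnvironment keeps the callback inside a Fiber even when
        // the driver fires it from one of its own timers.
        client.execute(insertCommand, Meteor.bindEnvironment(function (err, result) {
            if (err) {
                console.log("error inserting: " + insertCommand);
                console.log(err);
            }
        }));
    }
};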

Application Cache and Slow Process

I want to create an application-wide feed on my ASP.NET 3.5 web site using the application cache. The data that I use to populate the cache is slow to obtain, maybe up to 10 seconds (it comes from a remote server's data feed). My question is: what is the best way to structure the cache management?
private const string CacheKey = "MyCachedString";
private static string lockString = "";

public string GetCachedString()
{
    string data = (string)Cache[CacheKey];
    string newData = "";
    if (data == null)
    {
        // A - Should this method call go here?
        newData = SlowResourceMethod();
        lock (lockString)
        {
            data = (string)Cache[CacheKey];
            if (data != null)
            {
                return data;
            }
            // B - Or here, within the lock?
            newData = SlowResourceMethod();
            Cache[CacheKey] = data = newData;
        }
    }
    return data;
}
The actual method would be exposed by an HttpHandler (.ashx).
If I collect the data at point 'A', I keep the lock time short but might end up calling the external resource many times (from web pages all trying to reference the feed). If I put it at point 'B', the lock time will be long, which I assume is a bad thing.
What is the best approach, or is there a better pattern that I could use?
Any advice would be appreciated.
I have added comments to the code.
private const string CacheKey = "MyCachedString";
private static readonly object syncLock = new object();

public string GetCachedString()
{
    string data = (string)Cache[CacheKey];
    string newData = "";

    // Start by checking whether it is already in the cache.
    if (data == null)
    {
        // A - Should this method call go here?
        // Absolutely not here.
        // newData = SlowResourceMethod();

        // We wait here in case someone else is already building it.
        lock (syncLock)
        {
            // Now let's see if someone else has made it...
            data = (string)Cache[CacheKey];
            // We have it; send it.
            if (data != null)
            {
                return data;
            }
            // We don't have it, so now is the time to fetch it.
            // B - Or here, within the lock?
            newData = SlowResourceMethod();
            // Store it in the cache.
            Cache[CacheKey] = data = newData;
        }
    }
    return data;
}
Better, in my view, is to use a Mutex and lock based on the CacheKey name, so you don't lock one shared object and block unrelated resources. A basic example with a Mutex:
private const string CacheKey = "MyCachedString";

public string GetCachedString()
{
    string data = (string)Cache[CacheKey];
    string newData = "";

    // Start by checking whether it is already in the cache.
    if (data == null)
    {
        // Lock based on the resource key.
        // (Note that not all characters are valid in a mutex name.)
        var mut = new Mutex(true, CacheKey);
        try
        {
            // Wait until it is safe to enter,
            // but wait at most 30 seconds.
            mut.WaitOne(30000);

            // Now let's see if someone else has made it...
            data = (string)Cache[CacheKey];
            // We have it; send it.
            if (data != null)
            {
                return data;
            }
            // We don't have it, so now is the time to fetch it.
            // B - Or here, within the lock?
            newData = SlowResourceMethod();
            // Store it in the cache.
            Cache[CacheKey] = data = newData;
        }
        finally
        {
            // Release the Mutex.
            mut.ReleaseMutex();
        }
    }
    return data;
}
You can also read: Image caching issue by using files in ASP.NET

Raven DB DocumentStore - throws out of memory exception

I have code like this:
public bool Set(IEnumerable<WhiteForest.Common.Entities.Projections.RequestProjection> requests)
{
    var documentSession = _documentStore.OpenSession();
    //{
    try
    {
        foreach (var request in requests)
        {
            documentSession.Store(request);
        }
        //requests.AsParallel().ForAll(x => documentSession.Store(x));
        documentSession.SaveChanges();
        documentSession.Dispose();
        return true;
    }
    catch (Exception e)
    {
        _log.LogDebug("Exception in RavenRequstRepository - Set. Exception is [{0}]", e.ToString());
        return false;
    }
    //}
}
This code gets called many times. After around 50,000 documents have passed through it, I get an OutOfMemoryException.
Any idea why? Perhaps after a while I need to declare a new DocumentStore?
Thank you.
UPDATE:
I ended up using the Batch/Patch API to perform the update I needed.
You can see the discussion here: https://groups.google.com/d/topic/ravendb/3wRT9c8Y-YE/discussion
Basically, since I only needed to update one property on my objects, and after considering Ayende's comments about re-serializing all the objects back to JSON, I did something like this:
internal void Patch()
{
    List<string> docIds = new List<string>() { "596548a7-61ef-4465-95bc-b651079f4888", "cbbca8d5-be45-4e0d-91cf-f4129e13e65e" };
    using (var session = _documentStore.OpenSession())
    {
        session.Advanced.DatabaseCommands.Batch(GenerateCommands(docIds));
    }
}

private List<ICommandData> GenerateCommands(List<string> docIds)
{
    List<ICommandData> retList = new List<ICommandData>();
    foreach (var item in docIds)
    {
        retList.Add(new PatchCommandData()
        {
            Key = item,
            Patches = new[] { new Raven.Abstractions.Data.PatchRequest()
            {
                Name = "Processed",
                Type = Raven.Abstractions.Data.PatchCommandType.Set,
                Value = new RavenJValue(true)
            }}
        });
    }
    return retList;
}
Hope this helps ...
Thanks a lot.
I just did this for my current project. I chunked the data into pieces and saved each chunk in a new session. This may work for you, too.
Note that this example chunks 1,024 documents at a time, but only bothers chunking when there are at least 2,000 to write. So far my inserts get the best performance with a chunk size of 4,096; I think that's because my documents are relatively small.
internal static void WriteObjectList<T>(List<T> objectList)
{
    int numberOfObjectsThatWarrantChunking = 2000; // Don't bother chunking unless we have at least this many objects.

    if (objectList.Count < numberOfObjectsThatWarrantChunking)
    {
        // Just write them all at once.
        using (IDocumentSession ravenSession = GetRavenSession())
        {
            objectList.ForEach(x => ravenSession.Store(x));
            ravenSession.SaveChanges();
        }
        return;
    }

    int numberOfDocumentsPerSession = 1024; // Chunk size

    List<List<T>> objectListInChunks = new List<List<T>>();
    for (int i = 0; i < objectList.Count; i += numberOfDocumentsPerSession)
    {
        objectListInChunks.Add(objectList.Skip(i).Take(numberOfDocumentsPerSession).ToList());
    }

    Parallel.ForEach(objectListInChunks, listOfObjects =>
    {
        using (IDocumentSession ravenSession = GetRavenSession())
        {
            listOfObjects.ForEach(x => ravenSession.Store(x));
            ravenSession.SaveChanges();
        }
    });
}

private static IDocumentSession GetRavenSession()
{
    return _ravenDatabase.OpenSession();
}
Are you trying to save it all in one call?
The DocumentSession needs to turn all of the objects that you pass it into a single request to the server, which means it may allocate a lot of memory for that write.
Usually we recommend batches of about 1,024 items if you are doing bulk saves.
DocumentStore is a disposable class, so I worked around this problem by disposing the instance after each chunk. I highly doubt this is the most efficient way to run operations, but it will prevent significant memory overhead from happening.
I was running a sort of "delete all" operation like so. You can see the using blocks disposing both the DocumentStore and the IDocumentSession objects after each chunk.
static DocumentStore GetDataStore()
{
    DocumentStore ds = new DocumentStore
    {
        DefaultDatabase = "test",
        Url = "http://localhost:8080"
    };
    ds.Initialize();
    return ds;
}

static IDocumentSession GetDbInstance(DocumentStore ds)
{
    return ds.OpenSession();
}

static void Main(string[] args)
{
    // Running totals for the paging loop.
    int deleteCount = 0;
    int deleteSum = 0;
    do
    {
        using (var ds = GetDataStore())
        using (var db = GetDbInstance(ds))
        {
            // The `Take` operation will cap out at 1,024 by default, per Raven documentation.
            var list = db.Query<MyClass>().Skip(deleteSum).Take(5000).ToList();
            deleteCount = list.Count;
            deleteSum += deleteCount;
            foreach (var item in list)
            {
                db.Delete(item);
            }
            db.SaveChanges();
            list.Clear();
        }
    } while (deleteCount > 0);
}

Most efficient way to check to see if there is any data in a SQL row

The code below works, but I know it can't be the most efficient. Is there another way to ask if there are any rows rather than using Any()?
I'd like the NoResults div to be hidden by default and shown only when no rows are present, and likewise the repeater to be visible by default and hidden only when there are no results.
using (AgileEntities context = new AgileEntities())
{
    int StoryID = Convert.ToInt32(Request["StoryID"]);

    var tasks = from t in context.Tasks
                where t.StoryId == StoryID
                orderby t.Number
                select t;

    rptTasks.DataSource = tasks;
    rptTasks.DataBind();

    if (tasks.Any())
    {
        rptTasks.Visible = true;
        NoResults.Visible = false;
    }
    else
    {
        rptTasks.Visible = false;
        NoResults.Visible = true;
    }
}
Caution - calling .Any() may re-execute your query
I would do this a bit 'safer' to ensure single execution.
// Force execution once
var taskList = tasks.ToList();

rptTasks.Visible = taskList.Count > 0;
NoResults.Visible = taskList.Count == 0;
And
rptTasks.DataSource = taskList;
rptTasks.DataBind();
The problem with Any() and Count() is that they cause your query to execute over and over. A test case:
static void Main(string[] args)
{
    // Populate the test class (the element type is assumed to expose a CreateDate property).
    List<TestClass> list = new List<TestClass>(1000);
    for (int i = 0; i < 1000; i++)
    {
        list.Add(new TestClass { CreateDate = DateTime.Now });
    }

    // Deferred query: items created within the last 5 seconds.
    var newList = list.Where(o => o.CreateDate.AddSeconds(5) > DateTime.Now);

    while (newList.Any())
    {
        // Note - the actual count keeps decreasing, showing our 'execute' is running every time we call it.
        Console.WriteLine(newList.Any());
        System.Threading.Thread.Sleep(500);
    }
}
You can replace Any() with Count() above to see the same effect. Basically, the code re-evaluates the query every time you call Any(); I'm not sure whether this applies to LINQ to SQL as well, or whether it has a different caching mechanism.
var tasks = from t in context.Tasks
            where t.StoryId == StoryID
            orderby t.Number
            select t;

var tasksList = tasks.ToList();

rptTasks.DataSource = tasksList;
rptTasks.DataBind();

if (tasksList.Count > 0)
{
    rptTasks.Visible = true;
    NoResults.Visible = false;
}
else
{
    rptTasks.Visible = false;
    NoResults.Visible = true;
}
The ToList() call executes the query once and creates a list of task objects.
Your DataBind() call has already caused the query to be executed, so calling Any() on top of that shouldn't cost you anything further.
You can change this to:
rptTasks.Visible = tasks.Any();
NoResults.Visible = !rptTasks.Visible;
