Extremely slow Flex Remoting/Cffunction response when arguments include a string which can be cast to a VERY large number - apache-flex

I've run into a strange problem that I've been wrestling with for a few days. I'm hoping someone will have some insights.
I have a Flex app that uses standard remoting to access a SQL db via ColdFusion. (Flex sdk 3.4 (yes, old), CF 10)
I've recently found a problem where if the arguments to a cffunction includes a string that could be a very large number ("2e4361251", for example), the remoting result/fault handler in my AS code will not get called for a VERY long time, effectively locking the app. (App logic requires that I wait for this operation to finish before moving on.)
My most recent test took 45 minutes for the result handler to be called.
What the cffunction does does not matter.
If I am attempting to update my db, the update is successful (whether the string is used in that update or not), my app just doesn't get the result for a very long time;
If the cffunction does NOTHING, like
< cffunction name="myFunc" access="remote">
< !-- do nothing -->
< /cffunction>
it STILL takes an extremely long time to return.
Repeated testing has a noticeable effect on the server; CF starts eating up processor and RAM, though it will eventually recover.
Passing the string IN as a parameter is key; if I'm retrieving the string from the db, there is no problem.
These strings could represent VERY large numbers. If I were trying to do any sort of calculations or other manipulations, I could maybe understand this. But as I said, the function doesn't have to do anything at all, just the presence of the string causes the problem. (And I have verified that it is being passed as a String, not an int, Number, float, decimal, etc)
The string that alerted me to this issue originated with a user. It is a catalogue number of some sort. They had many with different letter/number combos; this one happened to use a single 'E'. I cannot simply 'disallow' strings of this nature.
Anyone have any ideas on why this is happening and, more importantly at the moment, how I can avoid it? A 45 min delay = 'frozen app' to the user.

Related

What could cause a sqlite application to slow down over time with high load?

I'll definitely need to update this based on feedback so I apologize in advance.
The problem I'm trying to solve is roughly this.
The graph shows Disk utilization in the Windows task manager. My sqlite application is a webserver that takes in json requests with timestamps, looks up the existing entry in a 2 column key/value table, merges the request into the existing item (they don't grow over time), and then writes it back to the database.
The db is created as follows. I've experimented with and without WAL without difference.
createStatement().use { it.executeUpdate("CREATE TABLE IF NOT EXISTS items ( key TEXT NOT NULL PRIMARY KEY, value BLOB );") }
The write/set is done as follows
try {
val insertStatement = "INSERT OR REPLACE INTO items (key, value) VALUES (?, ?)"
prepareStatement(insertStatement).use {
it.setBytes(1, keySerializer.serialize(key))
it.setBytes(2, valueSerializer.serialize(value))
it.executeUpdate()
}
commit()
} catch (t: Throwable) {
rollback()
throw t
}
I use a single database connection the entire time which seems to be ok for my use case and greatly improves performance relative to getting a new one for each operation.
val databaseUrl = "jdbc:sqlite:${System.getProperty("java.io.tmpdir")}/$name-map-v2.sqlite"
if (connection?.isClosed == true || connection == null) {
connection = DriverManager.getConnection(databaseUrl)
}
I'm effectively serializing access to the db. I'm pretty sure the default threading mode for the sqlite driver is to serialize and I'm also doing some serializing in kotlin coroutines (via actors).
I'm load testing the application locally and I notice that disk utilization spikes around the one minute mark but I can't determine why. I know that throughput plummets when that happens though. I expect the server to chug along at a more or less constant rate. The db in these tests is pretty small too, hardly reaches 1mb.
Hoping people can recommend some next steps or set me straight as far as performance expectations. I'm assuming there is some sqlite specific thing that happens when throughput is very high for too long, but I would have thought it would be related to WAL or something (which I'm not using).
I have a theory but it's a bit farfetched.
The fact that you hit a performance wall after some time makes me think that either a buffer somewhere is filling up, or some other kind of data accumulation threshold is being reached.
Where exactly the culprit is, I'm not sure.
So, I'd run the following tests.
// At the beginning
connection.setAutoCommit(true);
If the problem is in the driver side of the rollback transaction buffer, then this will slightly (hopefully) slow down operations, "spreading" the impact away from the one-minute mark. Instead of getting fast operations for 59 seconds and then some seconds of full stop, you get not so fast operations the whole time.
In case the problem is further down the line, try
PRAGMA JOURNAL_MODE=MEMORY
PRAGMA SYNCHRONOUS=OFF disables the rollback journal synchronization
(The data will be more at risk in case of a catastrophic powerdown).
Finally, another possibility is that the page translation buffer gets filled after a sufficient number of different keys has been entered. You can test this directly by doing these two tests:
1) pre-fill the database with all the keys in ascending order and a large request, then start updating the same many keys.
2) run the test with only very few keys.
If the slowdown does not occur in the above cases, then it's either TLB buffer management that's not up to the challenge, or database fragmentation is a problem.
It might be the case that issuing
PRAGMA PAGE_SIZE=32768
upon database creation might solve or mitigate the problem. Conversely, PRAGMA PAGE_SIZE=1024 could "spread" the problem avoiding performance bottlenecks.
Another thing to try is closing the database connection and reopening it when it gets older than, say, 30 seconds. If this works, we'll still need to understand why it works (in this case I expect the JDBC driver to be at fault).
First of all, I want to say that I do not use exactly your driver for sqlite, and I use different devices in my work. (but how different are they really?)
From what I see, correct me if im wrong, you use one transaction, for one insert statement. You get request, you use the disc, you use the memory, open, close etc... every time. This can't work fast.
The first thing I do when I have to do inserts in sqlite is to group them, and use a single transaction to do it. That way, you are using your resources in batches.
One transaction, many insert statements, single commit. If there is a problem with a batch, handle the valid separately, log the faulty, move the next batch of requests.

What Are The Reasons For Bit Shifting A Float Before Sending It Via A Network

I work with Unity and C# - when making multiplayer games I've been told that when it comes to values like positions that are floats, I should use a bit shift operator on them before sending them and reverse the operation on receive. I have been told this not only allows for larger numbers values and is capable of maintaining floating point precision which may be lost. However, if I do not have to, I do not wish to run this operation every time I receive a packet unless I have to. Though the bottle necks seem to be the actual parsing of the bytes received. Especially without message framing and attempting to move from string to byte array. (But that's another story!)
My question are:
Are these valid reason to undergo the operation? Are they accurate statements?
If not should I be running bit shift ops on my floats?
If I should, what are the real reasons to do it?
Any additional information would be most appreciated.
One of the resourcesI'm referring to:
Main reasons for going back and forth to/from network byte order is to combat endianness caused problems, mainly to ensure each byte of multi byte values (long, int but also floats) is read and written in the way giving the same results regardless of architecture. This issue can be theoretically ignored if you are sure you are exchanging data between systems using the same endianness, but that's rather bad idea from very beginning as you are simply creating unneded technological debt and keep unjustified exceptions in the code ("It all works BUT on the same endianness only. What can go wrong?").
Depending on your app architecture you can rewrite the packet payload/data once you receive it and then use that version further in the code. Also note that you need to encode the data again prior sending it out.

BlazeDS slow with large number of objects

I'm developing a mobile app using Flex and I have run into some problems using BlazeDS. Some users request a (relatively) pretty large amount of data from my server, which returns in about 2 seconds. The data consists of some pretty simple objects (Client, which has a name/phone/email, and a few other properties, some of which are other nested objects with more properties). The largest requests consist of no more than about 10,000 of these objects, which is only a few MB in size. The problem I am running into is that as soon as the server sends its response, the mobile screen locks up while the data is being processed. For 10,000 objects, this can take several minutes and sometimes even crash the device, and at best leave the user with a frozen screen the entire time. For the average user, it is at least 2-5 seconds of frozen screen. This is not only an issue for devices with limited capabilities. This also happens on my PC (i5 processor, 8GB RAM). From what I can tell, this downtime is taking place somewhere between when the device receives the response and when I can access the data. Setting a breakpoint on the first line of the following RemoteObject result handler has the screen lock up BEFORE it reaches the breakpoint:
protected function myResultHandler(event:ResultEvent):void
{
var result:ArrayCollection = event.result as ArrayCollection;
//Do other stuff here
}
I know very little about BlazeDS and AMF, so my only guess is that the freezing happens while the objects are being created on the device. Is there any way to speed up this process at all? Should I normally expect to see really poor performance like this? Any help would be greatly appreciated.
After a couple of hours digging around, I found the solution to my problem: On the server side, the objects I was sending had a ton of extraneous properties unrelated to the information I needed on the mobile app. In addition, there were helper methods on those classes in the form getMyHelper() which would attempt to generate a property on the Flex side. This resulted in a huge list of reference errors being thrown during the download since no properties with those names existed in the AS objects. I created stripped down 'lite' versions of the objects I needed sent across with no extra properties or methods. The massive lists now display nearly instantaneously after receiving the response from the server.

High rate of Gen 1 garbage collections

I am profiling an application(using VS 2010) that is behaving badly in production. Once of the recommendations given by VS 2010 is:
Relatively high rate of Gen 1 garbage collections is occurring. If, by
design, most of your program's data structures are allocated and
persisted for a long time, this is not ordinarily a problem. However,
if this behavior is unintended, your app may be pinning objects. If
you are not certain, you can gather .NET memory allocation data and
object lifetime information to understand the pattern of memory
allocation your application uses.
Searching on google gives the following link=> http://msdn.microsoft.com/en-us/library/ee815714.aspx. Are there some obvious things that I can do to reduce this issue?I seem to be lost here.
Double-click the message in the Errors List window to navigate to the
Marks View of the profiling data. Find the .NET CLR Memory# of Gen 0
Collections and .NET CLR Memory# of Gen 1 Collections columns.
Determine if there are specific phases of program execution where
garbage collection is occurring more frequently. Compare these values
to the % Time in GC column to see if the pattern of managed memory
allocations is causing excessive memory management overhead.
To understand the application’s pattern of managed memory usage,
profile it again running a.NET Memory allocation profile and request
Object Lifetime measurements.
For information about how to improve garbage collection performance,
see Garbage Collector Basics and Performance Hints on the Microsoft
Web site. For information about the overhead of automatic garbage
collection, see Large Object Heap Uncovered.
The relevant line there is:
To understand the application’s pattern of managed memory usage, profile it again running a.NET Memory allocation profile and request Object Lifetime measurements.
You need to understand how many objects are being allocated by your application and when, and how long they are alive for. You're probably allocating hundreds (or thousands!) of tiny objects inside a loop somewhere without really thinking about the consequences of reclaiming that memory when the references fall out of scope.
http://msdn.microsoft.com/en-us/library/ms973837.aspx states:
Now that we have a basic model for how things are working, let's
consider some things that could go wrong that would make it slow. That
will give us a good idea what sorts of things we should try to avoid
to get the best performance out of the collector.
Too Many Allocations
This is really the most basic thing that can go wrong. Allocating new
memory with the garbage collector is really quite fast. As you can see
in Figure 2 above is all that needs to happen typically is for the
allocation pointer to get moved to create space for your new object on
the "allocated" side—it doesn't get much faster than that. However,
sooner or later a garbage collect has to happen and, all things being
equal, it's better for that to happen later than sooner. So you want
to make sure when you're creating new objects that it's really
necessary and appropriate to do so, even though creating just one is
fast.
This may sound like obvious advice, but actually it's remarkably easy
to forget that one little line of code you write could trigger a lot
of allocations. For example, suppose you're writing a comparison
function of some kind, and suppose that your objects have a keywords
field and that you want your comparison to be case insensitive on the
keywords in the order given. Now in this case you can't just compare
the entire keywords string, because the first keyword might be very
short. It would be tempting to use String.Split to break the keyword
string into pieces and then compare each piece in order using the
normal case-insensitive compare. Sounds great right?
Well, as it turns out doing it like that isn't such a good idea. You
see, String.Split is going to create an array of strings, which means
one new string object for every keyword originally in your keywords
string plus one more object for the array. Yikes! If we're doing this
in the context of a sort, that's a lot of comparisons and your
two-line comparison function is now creating a very large number of
temporary objects. Suddenly the garbage collector is going to be
working very hard on your behalf, and even with the cleverest
collection scheme there is just a lot of trash to clean up. Better to
write a comparison function that doesn't require the allocations at
all.

Memory Leak with FileMonitorTarget / CacheDependency+DepFileInfo in ASP.NET

In our ASP.NET web app we're experiencing a quite extensive memory leak which I am investigating right now. Using WinDbg I got down to the largest memory eaters in our app which are (ran !dumpheap -stat in the WinDbg console to get these):
MethodTable Addr Count Overall size Type
...
000007fee8306e10 212928 25551360 System.Web.UI.LiteralControl
000007feebf44748 705231 96776168 System.Object[]
000007fee838fd18 4394539 140625248 System.Web.Caching.CacheDependency+DepFileInfo
000007fee838e678 4394614 210941472 System.Web.FileMonitorTarget
000007feebf567b0 18259 267524784 System.Collections.Hashtable+bucket[]
00000000024897c0 1863 315249528 Free
000007feebf56cd0 14315 735545880 System.Byte[]
000007feebf4ec90 1293939 1532855608 System.String
For all I know a large number of String objects can be quite normal; still there's definitely room for improvement. But what really makes me itch is the count of System.Web.FileMonitorTarget objects: we have over 4 million instances on the heap (à 48 bytes)! Using two memory dumps and comparing them I've found out that these objects are not being cleaned up by the GC.
What I'm trying to find out is: where are these objects coming from? I've already tried ANTS Memory Profiler to get to the root of the evil but it leads nowhere near any of our own classes. I see the connection with System.Web.Caching.CacheDependency+DepFileInfo and thus the System.Web.Cache but we do not use file dependencies to invalidate our cache entries.
Also, there are 14315 instances of System.Byte[] making up for over 700 MB on the heap which stuns me - the only place where we use Byte[] is our image uploading component but we have only around 30 image uploads per day.
What might be the source of these Byte arrays and FileMonitorTarget objects? Any hints are very welcome!
Oliver
P.S. Someone asked pretty much the same question here but the only 'answer' there was very general.
There are a couple of things I would look into. You're right the strings are often used in great number. Still you have approx. 1.4 GB worth of strings on the heap. Does that sound right? If not I would look into that. If that is withing the expected range, just ignore it.
If you suspect FileMonitorTarget and/or Byte[] to be leaking, dump the instances using !dumpheap -mt XXX where XXX is the listed MethodTable for the types. You may want to use PSSCOR2 instead of SOS, as it makes this task a bit easier (the output from !dumpheap shows a delta column and you can limit the number of instances dumped).
The next thing to do is to start looking into what is keeping specific instances alive. The !gcroot command will tell you what roots a specific instance. Pick an instance at random and inspect the roots. If everything is as expected move on to the next. If you application is leaking instances of these types chances are that you will get an instance that should have been freed. Once you get the roots you need to figure out what part of the code is holding on to these. A common source is unsubscribed events, but there are other possible reasons why objects are kept alive.
Objects of type System.Web.Caching.CacheDependency+DepFileInfo are created automatically by ASP.NET to monitor file changes to your website. So even if you are not specifically using a FileDependency cache expiration, ASP.NET itself does.
If I run a dump field against some of these objects, I get a path to my controls/pages.
0:000> !df -field _filename 0d3f24ec
Name: System.String
MethodTable: 79330b24
EEClass: 790ed65c
Size: 180(0xb4) bytes
GC Generation: 2
(C:\WINDOWS\assembly\GAC_32\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll)
String: C:\inetpub\wwwroot\Website\Application\Base\UserControl\Messages.ascx
Fields:
MT Field Offset Type VT Attr Value Name
79332d70 4000096 4 System.Int32 1 instance 82 m_arrayLength
79332d70 4000097 8 System.Int32 1 instance 81 m_stringLength
79331804 4000098 c System.Char 1 instance 44 m_firstChar
79330b24 4000099 10 System.String 0 shared static Empty
>> Domain:Value 000e0ba0:02581198 00109f28:02581198 <<
79331754 400009a 14 System.Char[] 0 shared static WhitespaceChars
>> Domain:Value 000e0ba0:025816f0 00109f28:02586410 <<
You can see this link describing a bit more detail: Understanding ASP.NET Dynamic Compilation
However, your case might still be different. Try running !GCRoot [obj_addr] and see what is holding onto those objects. In my case it is entirely IIS /.NET related objects.
That said, I still had a problem where millions of these cache objects were created, and I have no idea why. :| (this is the first time it happened to me, but I don't think it appeared or will disappear magically...)

Resources