I've implemented a caching interface and memcached provider for our website using enyim. It works great in testing, until we get to load testing, where it spikes the CPU of w3wp.exe to near 100%. We have a configuration property to switch the caching provider back to .NET's API, and the CPU goes back to 5-7%. Has anyone experienced anything similar?
Every time you store something in memcached through enyim, the .NET runtime will perform binary serialization on the stored object, and deserialization when you retrieve it. For some types (string, byte[] and a few more), enyim implements a more specific and lightweight serialization, but most types are serialized by the standard BinaryFormatter. This is processor intensive.
It especially hurts when your code was written against the in-memory cache in ASP.NET. You will probably have code that assumes getting something from cache is free, so it gets the same value from cache again and again. We had comparable problems when we switched to memcached. If you do some profiling, you'll probably find that you do an enormous number of cache reads.
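One mitigation is to memoize per request, so each key costs at most one network round-trip and one deserialization per request. A minimal sketch, assuming the enyim MemcachedClient (the helper name is illustrative):

using System.Collections;
using System.Web;
using Enyim.Caching;

public static class RequestLocalCache
{
    // Fetch a key at most once per request; later reads within the same
    // request come from HttpContext.Items, which is request-local.
    public static T GetOncePerRequest<T>(MemcachedClient client, string key)
    {
        IDictionary items = HttpContext.Current.Items;
        if (!items.Contains(key))
        {
            items[key] = client.Get<T>(key);
        }
        return (T)items[key];
    }
}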
Our experiences with the enyim client have been very positive. We run memcached in an ASP.NET server farm on around 10 nodes and it is very stable. For some forms of data (very often accessed), we prefer the in-memory in-process caching of ASP.NET.
Be sure to also check your serialization and deserialization code for proper object or stream disposal.
I had the exact same w3wp.exe spiking to 99% symptoms, and thought for sure it was an Enyim/Membase driver bug, but it wasn't. It was ours, and it was because we forgot to Dispose() the MemoryStream after deserializing every JSON object in our JSON helper class:
public static T DeserializeToObject<T>(this string json)
{
    byte[] byteArray = Encoding.ASCII.GetBytes(json);
    MemoryStream stream = new MemoryStream(byteArray);
    DataContractJsonSerializer serializer = new DataContractJsonSerializer(typeof(T));
    T returnObject = (T)serializer.ReadObject(stream);
    stream.Close();
    stream.Dispose(); // we forgot this line!
    return returnObject;
}
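For what it's worth, a using block makes this failure mode impossible; here's a sketch of the same helper rewritten that way:

public static T DeserializeToObject<T>(this string json)
{
    byte[] byteArray = Encoding.ASCII.GetBytes(json);
    using (MemoryStream stream = new MemoryStream(byteArray))
    {
        var serializer = new DataContractJsonSerializer(typeof(T));
        // the stream is disposed on exit, even if ReadObject throws
        return (T)serializer.ReadObject(stream);
    }
}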
I am trying to serialize an object into a MemoryStream using System.Text.Json's JsonSerializer. I can't find a method for that in the documentation. Can someone share a sample implementation of serialization and deserialization using System.Text.Json?
UPDATE
.NET 6 added JsonSerializer.Serialize overloads that write to a stream. It's now possible to write just:
JsonSerializer.Serialize(stream, myObject);
This produces unindented JSON using UTF8 without a BOM.
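Since the question asks about a MemoryStream specifically, here is a minimal round-trip sketch (assuming .NET 6+; MyObject is a placeholder type):

using var stream = new MemoryStream();
JsonSerializer.Serialize(stream, myObject);

// rewind before reading back
stream.Position = 0;
MyObject roundTripped = JsonSerializer.Deserialize<MyObject>(stream);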
Original Answer
It's unclear what the problem is, or what documentation and examples are missing, as there are multiple sections on learn.microsoft.com and hundreds of blog posts and articles. In the docs, JSON serialization and deserialization is a good place to start, and How to serialize and deserialize (marshal and unmarshal) JSON in .NET includes the section Serialize to UTF8.
A MemoryStream is just a Stream wrapper over a byte[] array anyway, so serializing to a MemoryStream is the same as serializing to a byte[] array directly. This can be done with JsonSerializer.SerializeToUtf8Bytes:
byte[] jsonUtf8Bytes = JsonSerializer.SerializeToUtf8Bytes(weatherForecast);
And finally, in .NET anything that needs to serialize to something works through Reader and Writer objects, like TextReader/TextWriter and StreamReader/StreamWriter. In System.Text.Json's case, this is done through the Utf8JsonWriter object. JsonSerializer.Serialize has an overload that writes to a Utf8JsonWriter:
using var stream = File.OpenWrite(somePath);
using var writer = new Utf8JsonWriter(stream);
JsonSerializer.Serialize(writer, myObject);
That's the slow way of using System.Text.Json though. Using buffers means allocating them and cleaning them up, which is costly, especially in web applications. For this reason, ASP.NET Core uses IO pipelines instead of streams to receive and send data to sockets, using reusable buffers leased from buffer pools and passed along each step in the ASP.NET Core pipeline. Passing byte[] buffers around copies their contents, so .NET Core introduced the Span<> and Memory<> types, which represent a view over an existing (possibly pooled) buffer. This way, ASP.NET Core passes those "views" of the buffers around, not the buffers themselves.
System.Text.Json was built to use pipelines and reusable memory instead of streams, allowing ASP.NET Core to use minimal memory and as few allocations as possible in high traffic web sites. ASP.NET Core uses the Utf8JsonWriter(IBufferWriter) constructor to write to the output pipeline through a PipeWriter.
We can use the same overload to write to a reusable buffer with an ArrayBufferWriter. That's the equivalent of using a MemoryStream, but the output is accessed through either a ReadOnlySpan<byte> or Memory<byte>, so it doesn't have to be copied around:
var buffer = new ArrayBufferWriter<byte>(65536);
using var writer = new Utf8JsonWriter(buffer);
JsonSerializer.Serialize(writer, myObject);
ReadOnlySpan<byte> data = buffer.WrittenSpan;
The better option is to use Newtonsoft.Json. It has a lot of examples.
I have a web service that receives requests from users and returns some json. I need to save the json string in the database so for the moment, the write query occurs before the response is sent back.
Is there a way to send the response first and then do the write query, after the response left the web service?
Thanks.
There are a couple of different options here - they all have tradeoffs, though, and some are pretty esoteric. You don't mention why you want to do this, so I'm guessing performance. If that's the case, I think you're barking up the wrong tree - a simple write is almost certainly not your performance problem.
So, off the top of my head:
1. Queuing, as Ragesh mentions, would be a nice approach. This gets you semantics similar to a transaction while offloading the write. You still have to write to the queue, though, which may be about the same overhead as writing to the DB.
2. You could spawn a new thread (using either the ThreadPool or System.Threading.Thread - there's some debate about which is preferable in ASP.NET) to handle the write. This can generally work, but you may have issues with unhandled exceptions, app domain restarts, etc.
3. You could store the JSON data in a static or Application variable, then use a Timer to periodically write it to the DB. This will be multithreaded code, so you will need to synchronize reads/writes to the collection.
4. Similar to #3, store the JSON data in the Cache and use the invalidation callback to write to the DB (see the sketch after this list).
5. Lots of variations on "store somewhere (memory, disk, flat DB table, etc.), process later (ASP.NET, scheduled task, Windows Service, Sql Agent, etc.)".
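A rough sketch of option #4, using System.Web.Caching (the key, the timeout, and the SaveJsonToDatabase helper are illustrative, not part of any existing API):

using System;
using System.Web;
using System.Web.Caching;

public static class DeferredJsonWriter
{
    public static void Defer(string key, string json)
    {
        HttpRuntime.Cache.Insert(
            key,
            json,
            null,                            // no cache dependency
            DateTime.UtcNow.AddSeconds(30),  // flush no later than this
            Cache.NoSlidingExpiration,
            CacheItemPriority.NotRemovable,
            OnRemoved);                      // fires when the item is evicted
    }

    // Runs on eviction/expiration, outside the request that queued the data.
    private static void OnRemoved(string key, object value, CacheItemRemovedReason reason)
    {
        SaveJsonToDatabase(key, (string)value);
    }

    private static void SaveJsonToDatabase(string key, string json)
    {
        // ... SQL insert/update goes here ...
    }
}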
#frenchie says: a response starts by reading the JSON string from the DB and ends with writing it back. In other words, if the user sends a request, the JSON string that's going to be read must be the one that was written in the previous response.
That complicates things, since inherent in async work is not knowing when something is done. If you require the async portion (writing back to the DB) to be done before handling the next request, you'll have to execute a wait to make sure it actually completed. To do that, you'll need to keep per-client state on the server side - not exactly a best practice as far as services go (though it sounds like you're already doing that with these JSON request/response pairs).
Given the complications, I would make sure that you've done your profiling and determined it is indeed a performance problem.
You can schedule the query work like this:
ThreadPool.QueueUserWorkItem(state =>
    AsynchronousExecuteReference());
// and run
static void AsynchronousExecuteReference()
{
    // run your SQL update here
}
Here is another example using a Thread inside a class, so you can pass parameters to it.
public class RunThreadProcess
{
    // some parameters
    public int cProductID;
    // my thread
    private Thread t = null;
    // start it
    public Thread Start()
    {
        t = new Thread(new ThreadStart(this.work));
        t.IsBackground = true;
        t.SetApartmentState(ApartmentState.MTA);
        t.Start();
        return t;
    }
    // the actual work
    private void work()
    {
        // do the thread work here - all parameters (e.g. cProductID) are available
    }
}
And here is how I run it:
var OneAction = new RunThreadProcess();
OneAction.cProductID = 100;
OneAction.Start();
Do not worry about memory: the GC knows the object is in use until the thread ends. I have checked this - the GC does not collect it while the thread is running.
You should look at using message queues like MSMQ, ActiveMQ or RabbitMQ to do this. When you receive your request, you'll put the relevant data in to the queue, and send your response to the client. At the other end of the queue, you'll have some process that reads from the queue and inserts data in to your database.
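For illustration, a minimal MSMQ sketch using System.Messaging (the queue path and the SaveJsonToDatabase helper are assumptions):

using System.Messaging;

public static class JsonWriteQueue
{
    // The queue path is an assumption; create the queue once beforehand.
    private const string QueuePath = @".\Private$\jsonWrites";

    // Called by the web service: enqueue and return immediately,
    // so the response can be sent without waiting for the database.
    public static void Enqueue(string json)
    {
        using (var queue = new MessageQueue(QueuePath))
        {
            queue.Send(json);
        }
    }

    // Called in a loop by a background process or Windows Service.
    public static void ProcessNext()
    {
        using (var queue = new MessageQueue(QueuePath))
        {
            queue.Formatter = new XmlMessageFormatter(new[] { typeof(string) });
            Message message = queue.Receive(); // blocks until a message arrives
            SaveJsonToDatabase((string)message.Body);
        }
    }

    private static void SaveJsonToDatabase(string json)
    {
        // ... SQL insert goes here ...
    }
}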
This is missing the point of request/response, unless you want to get into async commands like a service bus - but that's pub/sub, not request/response. The point of request/response is to do the work on the server after receiving the request and before sending the response, even if that work is sending an async message to a service bus.
You could try moving your web service URL to an ASPX page where the lifecycles come in to play.
In the code-behind, call your routine that does the main portion of the work in Page_Load or Page_Prerender (or whenever is appropriate prior to the response being sent) and then do your DB work in the Page_Unload event which occurs after the response has been sent (http://msdn.microsoft.com/en-us/library/ie/ms178472.aspx).
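A sketch of that approach (BuildJsonResponse and SaveJsonToDatabase are hypothetical helpers):

using System;

public partial class JsonEndpoint : System.Web.UI.Page
{
    private string _json;

    // Runs before the response is sent: build and write the payload.
    protected void Page_Load(object sender, EventArgs e)
    {
        _json = BuildJsonResponse();
        Response.ContentType = "application/json";
        Response.Write(_json);
    }

    // Runs after the response has been sent, so the DB write
    // no longer delays the client.
    protected void Page_Unload(object sender, EventArgs e)
    {
        SaveJsonToDatabase(_json);
    }
}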
What I would like to do is stream the request to a file store asynchronously, so that the incoming request does not take up a lot of memory and so that the handling thread is not held up by IO.
I see that there is an asynchronous HTTP handler that I can implement. This looks like it would help with the thread usage, but it looks like the request has already been fully copied into memory by this point by IIS/ASP.NET.
Is there a way to keep ASP.NET from reading the entire request in before handling it?
There is a new method added to the HttpRequest in .NET 4 called GetBufferlessInputStream(), which gives you synchronous access to the request stream.
From the MSDN article:
This method provides an alternative to using the InputStream property. The InputStream property waits until the whole request has been received before it returns a Stream object. In contrast, the GetBufferlessInputStream method returns the Stream object immediately. You can use the method to begin processing the entity body before the complete contents of the body have been received.
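As a rough sketch, a handler using it might look like this (the target path and buffer size are arbitrary choices):

using System;
using System.IO;
using System.Web;

public class StreamingUploadHandler : IHttpHandler
{
    public void ProcessRequest(HttpContext context)
    {
        // Unbuffered: ASP.NET hands us the bytes as they arrive.
        Stream input = context.Request.GetBufferlessInputStream();

        string path = Path.Combine(@"C:\uploads", Guid.NewGuid() + ".bin");
        using (FileStream output = File.Create(path))
        {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
            {
                output.Write(buffer, 0, read);
            }
        }
    }

    public bool IsReusable
    {
        get { return true; }
    }
}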
The ability to access the request stream asynchronously will be available in .NET 4.5. See the What's New in ASP.NET 4.5 article for more information. It looks like there will be several nice ASP.NET performance improvements in 4.5.
You are not searching SO enough.
The solution you need is explained here, step by step, in great detail: Efficiently Uploading Large Files via Streaming
Check this one; your question is a duplicate: Streaming uploaded files directly into a database table
While this SO question is specifically about MVC, the answers should work for ASP.NET generally. Specifically, people appear to have had a good experience with Darren Johnstone's UploadModule.
Is there any way to use caching in ASP.NET other than a SQL Server second-level cache? As this is my first time working with caching, I would like any approach with an example. I have found that NHibernate implements this, but we are using .netTiers as an application framework.
The Session cache seems to be the appropriate caching mechanism here; it is a fault-tolerant cache of objects.
Inserting an object
Session["Username"] = "Matt";
Reading an object
string username = (string)Session["Username"];
Removing an object
Session.Remove("Username");
I say fault-tolerant because if the value with the key you specify doesn't exist in the Session cache, it will not throw an exception; it will return null. You need to consider that when implementing your code.
One thing to note, if you are using Sql Server or State Server, the objects you can put in the cache need to be serializable.
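For example, a type stored under those modes has to be marked serializable (an illustrative type):

[Serializable]
public class UserProfile
{
    public string Username { get; set; }
}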
Memcached is also a very good way to go, as it is very flexible. It is a windows service that runs on any number of machines and your app can talk to the instances to store and retrieve from the cache. Good Article Here
We're using a Linq-to-SQL DataContext in a web application that provides read-only data to the application and is never updated (we have set ObjectTrackingEnabled = false to enforce this).
Since the data never changes (except for occasional config updates) it seems wasteful to be reloading it from SQL Server with a new DataContext for each web request.
We tried caching the DataContext in the Application object for all requests to use, but it was generating a lot of errors, and our research since shows that this was a bad idea: a DataContext should be disposed of within the same unit of work, it is not thread-safe, etc.
So since the DataContext is meant to be a data access mechanism, not a data store, we need to be looking at caching the data that we get from it, not the context itself.
Would prefer to do this with the entities and collections themselves so the code can be agnostic about whether it is dealing with cached or "fresh" data.
How can this be done safely?
First, I need to make sure that the entities and collections are fully loaded before I dispose of the DataContext. Is there a way to force a full load of everything from the database conveniently?
Second, I'm pretty sure that storing references to the entities and collections is a bad idea, because it will either
(a) cause the entities to be corrupted when the DataContext goes out of scope or
(b) prevent the DataContext from going out of scope
So should I clone the EntitySets and store them? If so, how? Or what's the go here?
This is not exactly an answer to your question, but I suggest avoiding caching on web site side.
I would rather focus on optimizing database queries for faster and more efficient data retrieval.
Caching will:
- not be scalable
- need extra code for synchronization (I assume your data isn't completely static in the DB?)
- be bug prone, because of that extra code
- eat up your web server's memory quickly; the next thing you might end up addressing is a memory issue on your web server
- not work very well when you need to load-balance your web site
[Edit]
If I needed to cache 5 MB of data, I would use the Cache object, probably with lazy loading. I would use a set of lightweight collections, like ReadOnlyCollection<T> and Collection<T>, and probably ReadOnlyDictionary<TKey, TValue> for quick in-memory searches. I would use LINQ to Objects to manipulate the collections.
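As a sketch of what that could look like (the cache key and the GetMyTypeListFromDb helper are illustrative):

using System.Collections.ObjectModel;
using System.Web;

public static class StaticDataCache
{
    public static ReadOnlyCollection<MyType> GetMyTypes()
    {
        var cached = (ReadOnlyCollection<MyType>)HttpRuntime.Cache["MyTypes"];
        if (cached == null)
        {
            // lazy-load on first access; the read-only wrapper keeps
            // callers from mutating the shared cached collection
            cached = new ReadOnlyCollection<MyType>(GetMyTypeListFromDb());
            HttpRuntime.Cache.Insert("MyTypes", cached);
        }
        return cached;
    }
}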
You want to cache the data retrieved from the DataContext rather than the DataContext object itself. I usually refactor commonly-retrieved data out into methods that implement silent caching, something like this (you may need to add thread-safety logic):
public class MyBusinessLayer {
    private List<MyType> _myTypeCache = null;
    public List<MyType> GetMyTypeList() {
        if (_myTypeCache == null) {
            _myTypeCache = // data retrieved from SQL server
        }
        return _myTypeCache;
    }
}
This is the simplest pattern that can be used; it caches for the lifetime of the business layer instance, typically one web request. To cache for longer periods, store the contents in longer-term storage, such as Application or Cache. For instance, to store data at the Application level, use this kind of pattern:
public static List<MyType> GetMyTypeList() {
    HttpApplicationState application = HttpContext.Current.Application;
    if (application["MyTypeCacheName"] == null) {
        application["MyTypeCacheName"] = // data retrieved from SQL server
    }
    return (List<MyType>)application["MyTypeCacheName"];
}
This would be for data that almost never changes, such as a static collection of status types to choose from in a DropDownList. For more volatile data, you can use the Cache with a timeout period, which should be selected based on how often the data changes. With the Cache, items can be invalidated manually with code if necessary, or with a dependency checker like SqlCacheDependency.
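For example, a time-based Cache insert might look like this (the 10-minute window is an arbitrary choice):

// cache for up to 10 minutes; the next request after expiration reloads
HttpRuntime.Cache.Insert(
    "MyTypeCacheName",
    myTypeList,
    null,                            // no dependency (could be a SqlCacheDependency)
    DateTime.UtcNow.AddMinutes(10),
    System.Web.Caching.Cache.NoSlidingExpiration);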
Hope this helps!