Most efficient method of generating PNG as HTTP response

I've built an ASP.NET page whose output stream is a dynamically-generated PNG image containing only text on a transparent background.
The text is based upon database IDs contained in the querystring. There will be a limited number of variations.
Which one of the following would be the most efficient means of returning the image to the client?
1. Store each variation when it is first generated, and retrieve it from disk thereafter.
2. Simply generate the image each time.
3. Cache the output response based upon the querystring.

Totally depends on how often this image is going to have to be generated.
If it's a small project I would elect to generate it each time as this would be the simplest solution.
If you are expecting a lot of generations, then storing each image the first time it's generated and checking for pre-generated images would be next. It gets a bit complicated, though; it all depends on how many unique variations of images you expect to be generated. If that number is small, go for it; otherwise you may have to put expiry dates on the images that are not so frequently accessed.
In short, it depends on what the application is, and not enough information was given to give a comprehensive answer for your specific situation.
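For illustration, the third option (caching the output keyed on the querystring) might look roughly like this in a generic handler. This is only a sketch; the handler name, the "id" query parameter, the image size, and the one-hour cache duration are all assumptions, not anything from the question.

    using System;
    using System.Drawing;
    using System.Drawing.Imaging;
    using System.IO;
    using System.Web;

    public class TextImageHandler : IHttpHandler
    {
        public void ProcessRequest(HttpContext context)
        {
            string id = context.Request.QueryString["id"];

            // Ask ASP.NET to cache the rendered response on the server, varied by the
            // "id" parameter, so each variation is generated only once per cache window.
            context.Response.Cache.SetCacheability(HttpCacheability.Server);
            context.Response.Cache.SetExpires(DateTime.UtcNow.AddHours(1));
            context.Response.Cache.SetValidUntilExpires(true);
            context.Response.Cache.VaryByParams["id"] = true;

            using (var bmp = new Bitmap(300, 60))
            using (var g = Graphics.FromImage(bmp))
            {
                g.Clear(Color.Transparent);
                g.DrawString("Text for " + id, SystemFonts.DefaultFont, Brushes.Black, 0f, 0f);

                // PNG encoding needs a seekable stream, so render to memory first
                // and then copy the bytes to the response.
                using (var ms = new MemoryStream())
                {
                    bmp.Save(ms, ImageFormat.Png);
                    context.Response.ContentType = "image/png";
                    ms.WriteTo(context.Response.OutputStream);
                }
            }
        }

        public bool IsReusable { get { return true; } }
    }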

Related

Serving Lazy Thumbnail Images from Azure Blob Storage - What is the overhead of Exists?

I have a website where users upload images. These images are shown on various sections of the site with various thumbnail dimensions. Since the site is still under rapid development, I don't yet want to commit to a set number of thumb sizes. Thus I believe I should be generating thumbnails on a lazy basis.
Of the two options, which is the most performant way to do this:
1. When I go to serve the thumbnail, convert the dimensions into a canonical filename (like "bighouse-thumb-160x120"). Check if the file exists in blob storage using client.GetContainerReference(containerName).GetBlockBlobReference(key).Exists(); if it does not exist, generate it and save it.
2. When I go to serve the thumbnail, query my SQL database to see if the thumbnail exists. If it exists, get the blob URI from the DB and emit that as HTML. If it does not exist, generate it and update the SQL database.
I've used #2 in the past, but design-wise it duplicates state, which is bad. If querying Azure for the existence of blobs is scalable, I'd rather do that. I don't really understand the threading model in ASP.NET: if I have 200 users requesting thumbs, will my Azure Exists calls all happen in parallel? Even if they do, two round trips seem like a lot of overhead. I assume round-tripping the database is faster and lends itself more easily to generic caching solutions.
What is the right answer?
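For reference, option 1 might look roughly like this. It is only a sketch assuming the WindowsAzure.Storage client library; the container wiring, key format, and RenderThumbnail helper are made up for illustration.

    using System;
    using Microsoft.WindowsAzure.Storage;
    using Microsoft.WindowsAzure.Storage.Blob;

    public class ThumbnailStore
    {
        private readonly CloudBlobContainer _container;

        public ThumbnailStore(CloudStorageAccount account, string containerName)
        {
            _container = account.CreateCloudBlobClient().GetContainerReference(containerName);
        }

        public Uri GetOrCreateThumbnail(string imageName, int width, int height)
        {
            // Canonical key, e.g. "bighouse-thumb-160x120".
            string key = string.Format("{0}-thumb-{1}x{2}", imageName, width, height);
            CloudBlockBlob blob = _container.GetBlockBlobReference(key);

            // This Exists() call is an extra storage round trip on every request.
            if (!blob.Exists())
            {
                byte[] png = RenderThumbnail(imageName, width, height);
                blob.Properties.ContentType = "image/png";
                blob.UploadFromByteArray(png, 0, png.Length);
            }

            return blob.Uri;
        }

        private byte[] RenderThumbnail(string imageName, int width, int height)
        {
            // Placeholder for the actual resize logic (load original, resize, encode PNG).
            throw new NotImplementedException();
        }
    }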
Regardless of the overhead, I would pre-generate thumbnails when you upload/store the image. This way you move the burden of generating thumbnails from something that is done many times (retrieving an image) to something that is executed far less often (storing an image).
Consider the following scenario, when you lazily generate thumbnails on the first view:
Check for an existing thumbnail (it won't be there; it's the first view, remember ;))
Generate a thumbnail
Store the thumbnail
Send the thumbnail to the client
With pre-generated thumbnails the process is much shorter:
Send the thumbnail to the client
Done.
With 'lazy generating', the check for an existing thumbnail is network overhead on every hit, generating the thumbnail can be hugely expensive memory- and CPU-wise, and then you have to store it, with network overhead again. You can even offload generating the thumbnail(s) to a separate process, possibly started by queue messages, to take the burden of generating the images even further away from your webservers.
However, this brings up the question of what you should do when you introduce a new thumbnail/image size. When you pre-generate the thumbnails you can write a simple tool to create the new sizes and store them, and if you went the separate process route it's even simpler. Just upgrade the separate process, generate a queue message for every existing image and just let it do its work.
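If you go the separate-process route, the upload path only has to drop a message on a queue. Here is a hedged sketch using an Azure storage queue; the queue name and the image-name message format are assumptions, not anything from the answer.

    using Microsoft.WindowsAzure.Storage;
    using Microsoft.WindowsAzure.Storage.Queue;

    public static class ThumbnailJobs
    {
        public static void Enqueue(CloudStorageAccount account, string imageName)
        {
            CloudQueue queue = account.CreateCloudQueueClient().GetQueueReference("thumbnail-jobs");
            queue.CreateIfNotExists();

            // The worker process picks up the image name and renders every configured size.
            queue.AddMessage(new CloudQueueMessage(imageName));
        }
    }

The same queue can be re-filled with a message per existing image whenever a new thumbnail size is introduced, which is the upgrade path described above.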

Showing two DICOM images with the same header

I wonder if it is possible to have two different images with the same header. If so, can we display these two images in the PACS, or otherwise which tags should be changed so that both can be displayed?
No, you can't. The issue is that DICOM requires every image to have its own unique UID stored in the header.
PACS systems commonly ignore newly received images if their UID is identical to one they already have in the database, so if you don't follow this part of the standard you probably won't get very far.
If you want to generate DICOM images, and you are unfamiliar with the standard, my advice is to use OFFIS img2dcm or a similar tool to convert them from normal images.
Jan de Vaan is correct, you SHOULDN'T have duplicate headers with different image bitmaps, and that is 'guaranteed' by the unique SOP Instance UID. I've never seen a duplicate UID come from any modality. That said, there are sloppy applications with terrible implementations of unique UID generation. There are teleradiology apps, secondary capture devices and other open source DICOM packages (for instance) that, typically on reset, could and do create a repeatable UID sequence. Many research labs get along fine with these. I expect you have one of these applications up the pipeline back to the source modality. If it's messing up something as fundamental to DICOM as the UID, I would wonder what other wonderful things are happening.
If you must get these into the PACS, you would have to change the SOP Instance UID, which is not a recommended practice.
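If you really do have to rewrite the UID, a DICOM library can do it in a few lines. This is only a sketch assuming the fo-dicom library; the class, method, and file paths are placeholders, and the exact API can differ between library versions.

    using Dicom;

    public static class UidFixer
    {
        public static void RegenerateSopInstanceUid(string inputPath, string outputPath)
        {
            var file = DicomFile.Open(inputPath);

            // Replace the SOP Instance UID with a freshly generated unique value
            // so the PACS no longer rejects the object as a duplicate.
            file.Dataset.AddOrUpdate(DicomTag.SOPInstanceUID,
                                     DicomUIDGenerator.GenerateDerivedFromUUID());

            file.Save(outputPath);
        }
    }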

Should I use Wordpress Transient API in this case?

I'm writing a simple Wordpress plugin for work and am wondering if using the Transients API is practical in this case, or if I should seek out another way.
The plugin's purpose is simple. I'm making a call to USZip Web Service (http://www.webservicex.net/uszip.asmx?op=GetInfoByZIP) to retrieve data. Our sales team is using a Lead Intake sheet that the plugin will run on.
I wanted to reduce the number of API calls, so I thought of setting a transient for each zip code, with the zip code as the key, storing the incoming data (city and zip). If the corresponding data for a given zip code already exists, there's no need to make an API call.
Here are my concerns:
1. After a quick search, I realized that transient data is stored in the wp_options table, and storing this data would balloon that table in no time. Would this cause a significant performance issue if the DB becomes huge?
2. Is it horrible practice to create this many transient keys? It could easily become thousands in a few months' time.
If using Transient is not the best way, could you please help point me in the right direction? Thanks!
P.S. I opted for the Transients API vs. the Options API. I know zip codes don't change often, but they sometimes do, so I set an expiration time of 3 months.
A less-inflated solution would be:
Store a single option called uszip with a serialized array inside the option
Grab the entire array each time and simply check if the zip code exists
If it doesn't exist, grab the data and save the whole transient again
You should make sure you don't hit the upper bounds of a serialized array in this table (9,000 elements) considering 43,000 zip codes exist in the US. However, you will most likely have a very localized subset of zip codes.

Autocomplete optimization for large data sets

I am working on a large project where I have to present an efficient way for a user to enter data into a form.
Three of the fields of that form require a value from a subset of a common data source (SQL Table). I used JQuery and JQuery UI to build an autocomplete, which posts to a generic HttpHandler.
Internally the handler uses LINQ to SQL to grab the data required from that specific table. The table has about 10 different columns, and the LINQ expression uses SqlMethods.Like() to match the single search term against each of those 10 fields.
The problem is that the table contains some 20K rows. The autocomplete works flawlessly, except that the sheer volume of data introduces delays, in the vicinity of 6 seconds or so (when debugging on my local machine), before the results show up.
The jQuery UI autocomplete has 0 delay, queries starting at the third character, and the result of the post is rendered as Facebook-style multi-row selectable options. (I almost had to rewrite the autocomplete plugin...)
So the problem is data vs. speed. Any thoughts on how to speed this up? The only two thoughts I had were to cache the data (how/where?), or to use a straight-up SqlDataReader for data access.
Any ideas would be greatly appreciated!
Thanks,
<bleepzter/>
I would look at only returning the first X rows using the .Take(10) LINQ method. That should translate into a sensible SQL call, which will put much less load on your database. As the user types they will find fewer and fewer matches, so they will only see the data they require.
I normally reckon 10 items is enough for the user to understand what is going on and still get to the data they need quickly (see the amazon.com search bar for an example).
Obviously if you can sort the data in a meaningful fashion then the 10 results will be much more likely to give the user what they are after quickly.
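As a rough illustration of the Take approach, assuming a LINQ to SQL DataContext; the context name, table, and column names here are made up, and only two of the ten columns are shown.

    using System.Data.Linq.SqlClient;
    using System.Linq;

    public static class CustomerSearch
    {
        public static string[] Top10Matches(MyDataContext db, string term)
        {
            string pattern = "%" + term + "%";

            return db.Customers
                     .Where(c => SqlMethods.Like(c.Name, pattern)
                              || SqlMethods.Like(c.City, pattern))  // ...and the other searchable columns
                     .OrderBy(c => c.Name)                          // a meaningful sort makes the top 10 more useful
                     .Select(c => c.Name)
                     .Take(10)                                      // translated to TOP (10) in the generated SQL
                     .ToArray();
        }
    }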
Returning the top N results is a good idea for sure. We found (querying a potential list of 270K) that returning the top 30 is a better bet for the user finding what they're looking for, but that COMPLETELY depends on the data you are querying.
Also, you REALLY should drop the delay to something sensible like 100-300 ms. When you set delay to ZERO, once you hit the 3-character trigger, effectively EVERY. SINGLE. KEY. STROKE. is sent as a new query to your server. This could easily have the unintended and unwelcome effect of slowing down the response even MORE.

Using MaxRequestLength with Large Web Searches

I have an application that holds data referencing 300,000 customers. When a user does a search, the result is often bigger than our MaxRequestLength would allow. We have dealt with this in two ways: we increased our MaxRequestLength to 102400 (KB), and we required the user to supply two letters of the first name and two letters of the last name, to limit the sheer number of customer records returned. This keeps us from exceeding the MaxRequestLength limit.
I was just wondering if anyone had any insight into whether this is a particularly good approach, whether there is a limit to how big MaxRequestLength can or should be, and what other options might be useful in this situation.
Most web applications I have seen deal with this by returning a paginated list, and displaying only the first page of results.
In modern implementations using ORMs, the "Skip" and "Take" operators are used to retrieve only those records which are required for a given page.
So any given response is no larger than one page of records.
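A minimal sketch of that kind of server-side paging with Skip/Take, assuming a LINQ to SQL (or Entity Framework) context; the context, entity, and column names are illustrative only.

    using System.Collections.Generic;
    using System.Linq;

    public static class CustomerPaging
    {
        public static List<Customer> GetPage(MyDataContext db, string firstTwoOfFirstName,
                                             string firstTwoOfLastName, int page, int pageSize)
        {
            return db.Customers
                     .Where(c => c.FirstName.StartsWith(firstTwoOfFirstName)
                              && c.LastName.StartsWith(firstTwoOfLastName))
                     .OrderBy(c => c.LastName).ThenBy(c => c.FirstName) // a stable order is required before Skip
                     .Skip((page - 1) * pageSize)                       // page is 1-based
                     .Take(pageSize)
                     .ToList();
        }
    }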
I would recommend paging the results instead of displaying everything. I would also suggest adding multiple search fields allowing your users to filter their results even further. This will allow your user to find what they are looking for faster.
As you can guess from my comment, I think MaxRequestLength only restricts the size of the request (-> the amount of data sent from the client/browser to the server).
If you are exceeding this limit, then this probably means that you have a huge ViewState which is sent with every response. ViewState is stored in a hidden field on the page and is sent back to the server with every PostBack (and that's where the MaxRequestLength setting could come into play). You can easily check this by looking at the source of your page in the web browser and looking for a hidden INPUT element with the name "__VIEWSTATE" and a large string-value.
If this is the case, then you should try to reduce the size of the ViewState, e.g. by
setting EnableViewState="false" on your controls (GridView or whatever) and re-binding the control on every PostBack (this is the recommended approach; see the sketch after this list)
storing the ViewState on the server side
compressing the ViewState
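As a sketch of the first option, assuming a GridView named CustomersGrid and a LoadCustomers() data-access method (both names are examples, not part of the question):

    protected void Page_Load(object sender, EventArgs e)
    {
        // With EnableViewState="false" on the GridView (in markup or here), the grid's
        // data is no longer serialized into the hidden __VIEWSTATE field.
        CustomersGrid.EnableViewState = false;

        // Because nothing survives in ViewState, the grid has to be re-bound on every
        // request, including PostBacks, not just on the first load.
        CustomersGrid.DataSource = LoadCustomers();
        CustomersGrid.DataBind();
    }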
If your requirements allow it, I would suggest implementing server-side paging. That way you only send one page worth of records over the wire rather than the entire record set.
300,000 records is a completely unusable result set from a human perspective.
As others have said, page the results to something like the top 50 or 100 records. Let them sort it and provide a way to narrow the search criteria.
For perspective, look at google. They default to 10 records per page. Part of the reason for this is that people would rather provide more criteria than go spelunking through a large result set.