My application allows users to upload a CSV file that is processed and whose records are written to the database. But the file can contain a very large number of records, for example 300,000, in which case processing can take up to half an hour. I would like my application not to freeze the page for that period, but to show progress and possibly errors; better still, the user could move to other pages and come back from time to time to check on the process.
By what means can I achieve that?
The approach we took to resolve a similar issue was as follows:
Upload the file using normal HTTP methods.
Save the file locally.
Submit the file to an asynchronous web service (.asmx). This process inserts a record that stores the status of the import, and then starts importing the records. Once all records have been processed, it sets the status accordingly.
This all happens in a single flow. Because the WebMethod is asynchronous, it returns without waiting for the work to complete, and the import runs in the background.
You now redirect the user to a page that periodically checks the status of the asynchronous import until it is finished. You can also surface additional information, such as progress, by batching the records and updating further fields accordingly.
This has worked well for us for many years now. I have not added any real detail as that will be specific to your implementation.
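Purely for illustration, one rough shape the status-tracking half of this could take is sketched below. The ImportStatus table and the SetStatus, ReadCsvInBatches and BulkInsert helpers are hypothetical placeholders, and the background thread merely stands in for whatever asynchronous mechanism you use (such as the async .asmx WebMethod described above):

```csharp
// Sketch only: ImportStatus, SetStatus, ReadCsvInBatches and BulkInsert are
// hypothetical placeholders for your own schema and helpers.
[WebMethod]
public Guid StartImport(string savedFilePath)
{
    Guid importId = Guid.NewGuid();
    SetStatus(importId, "Running", rowsProcessed: 0);        // INSERT a status row

    // Return immediately; the real work happens on a background thread.
    ThreadPool.QueueUserWorkItem(_ =>
    {
        try
        {
            int processed = 0;
            foreach (var batch in ReadCsvInBatches(savedFilePath, 1000))
            {
                BulkInsert(batch);
                processed += batch.Count;
                SetStatus(importId, "Running", processed);   // progress for the polling page
            }
            SetStatus(importId, "Completed", processed);
        }
        catch (Exception ex)
        {
            SetStatus(importId, "Failed: " + ex.Message, 0);
        }
    });

    return importId;                                         // the status page polls on this id
}
```

The page the user is redirected to then simply reads the ImportStatus row for that id every few seconds.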
I'm sorry this question is quite vague, but I don't know where to start with the following.
I have a Blazor WASM app that can import data from an uploaded Excel document.
I'd like to give the user the chance to preview import sanity checks before performing actual db update.
I could call my controller to process the input file, run the sanity checks and return the results to the client; then, if everything is fine, I could call my controller again and process the file a second time to do the actual db modifications.
But I'd like to avoid reading and manipulating the input file twice, so I wonder if there is a way (SignalR? I don't know anything about it) to process the file in the controller, suspend execution, show the sanity-check results, ask for confirmation and, finally, update the db. Maybe with a timeout, to avoid wasting server resources.
Any hint is appreciated.
Thank you!
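One common way to get the flow described above without parsing the file twice is to parse once, cache the parsed rows server-side under a token with an expiry, return the sanity-check results together with the token, and commit only when the client confirms with that token. A rough sketch, assuming an ASP.NET Core controller and IMemoryCache; the ParseExcel, RunSanityChecks and SaveToDatabase helpers and the ImportRow type are hypothetical placeholders:

```csharp
using System;
using System.Collections.Generic;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Extensions.Caching.Memory;

[ApiController]
[Route("api/import")]
public class ImportController : ControllerBase
{
    private readonly IMemoryCache _cache;
    public ImportController(IMemoryCache cache) => _cache = cache;

    [HttpPost("preview")]
    public IActionResult Preview(IFormFile file)
    {
        var rows = ParseExcel(file.OpenReadStream());        // placeholder parser
        var errors = RunSanityChecks(rows);                  // placeholder checks

        var token = Guid.NewGuid().ToString();
        _cache.Set(token, rows, TimeSpan.FromMinutes(15));   // timeout frees server resources

        return Ok(new { token, errors });
    }

    [HttpPost("confirm/{token}")]
    public IActionResult Confirm(string token)
    {
        if (!_cache.TryGetValue(token, out List<ImportRow> rows))
            return NotFound("Preview expired; please upload the file again.");

        SaveToDatabase(rows);                                 // placeholder db update
        _cache.Remove(token);
        return Ok();
    }
}
```

Keeping the parsed rows in IMemoryCache is the simplest option but ties the preview to a single server instance; a temp table or a distributed cache would lift that restriction.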
I have a database with almost 600 million records. I want to perform the search by uploading an Excel file: when I upload the file, the search should start in the background and show its status as 'In Progress'; once it has finished, the process should stop and show the message 'Completed'.
I can write the code to upload the xls file and search in the normal way, but how do I write it using a background worker in asynchronous mode with ASP.NET and C#?
I would be uploading an Excel file with 50,000 records to search against the 600 million records in the database.
Define your requirements:
What is a typical time the search will require?
How many results might such a search return (50K? 600M?)?
You don't want to show them all on the result page, do you?
Depending on the requirements there are a few ways you can go, e.g.:
Set up a DB table or MSMQ queue to which you log the request. Your web app could then monitor the status of the request and subsequently notify the user of its completion.
Asynchronous Pages
I would opt for my first suggestion and:
upload the file / log the request into the database
set up a DB job which will process the request and log all results into a table
do not wait until the search is finished, but make an AJAX call that checks the result table (a minimal status endpoint is sketched below)
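The status check in that last step can be a very small action that the page polls via AJAX every few seconds. A sketch, assuming ASP.NET MVC and a hypothetical SearchRequest table and "Db" connection string:

```csharp
using System.Configuration;
using System.Data.SqlClient;
using System.Web.Mvc;

public class SearchStatusController : Controller
{
    public ActionResult Status(int requestId)
    {
        var connectionString =
            ConfigurationManager.ConnectionStrings["Db"].ConnectionString;  // placeholder name

        using (var conn = new SqlConnection(connectionString))
        {
            conn.Open();
            var cmd = new SqlCommand(
                "SELECT Status, MatchesFound FROM SearchRequest WHERE RequestId = @id", conn);
            cmd.Parameters.AddWithValue("@id", requestId);

            using (var reader = cmd.ExecuteReader())
            {
                if (!reader.Read())
                    return HttpNotFound();

                return Json(new
                {
                    status = reader.GetString(0),   // e.g. "In Progress" or "Completed"
                    matches = reader.GetInt32(1)
                }, JsonRequestBehavior.AllowGet);
            }
        }
    }
}
```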
A similar topic has been discussed as part of other questions: Long-running code within asp.net process,
BackgroundWorker thread in ASP.NET
I am using ASP.NET MVC 3. In one particular use case I need to present the user with three forms, one after the other. On each form the user can upload up to 5 files, each file being a maximum of 2 MB, so the combined file size can reach 30 MB if a user uploads every file on every form. In some cases, depending on other selections on the page, the user may not have to pass through all three forms. At the end of it all there will be a submit button that inserts all the data into a table.
My question is: what is the best way to manage the files uploaded on each page, along with their associated data, until the user hits submit on the final page? I am thinking that keeping it all in session is not a good idea, as this server can have up to 100 simultaneous users, which would mean a lot of memory required on the server. Am I wrong in that assumption? If I don't use session to store all that data, what are my other options?
Thanks in advance.
Store your files in an "in progress" table. It is okay to save state in workflow scenarios.
When they complete the process, move the data to the permanent resting place.
If they cancel the upload process, you can delete the data from the staging environment.
If they are not logged in, you can delete them when the session expires. Make sure you put a date stamp on the records so that you can run cleanup scripts periodically.
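A sketch of what that staging record and the periodic cleanup might look like; the entity, table, and column names here are invented for the example:

```csharp
using System;
using System.Data.SqlClient;

// Hypothetical staging entity for uploads that are still part of an unfinished workflow.
public class PendingUpload
{
    public int Id { get; set; }
    public string SessionId { get; set; }     // or the user id if they are logged in
    public string FileName { get; set; }
    public byte[] Content { get; set; }
    public DateTime CreatedUtc { get; set; }  // the date stamp used by the cleanup script
}

public static class PendingUploadCleanup
{
    // Deletes abandoned uploads; run this from a scheduled job or SQL Agent task.
    public static void DeleteOlderThan(TimeSpan age, string connectionString)
    {
        using (var conn = new SqlConnection(connectionString))
        {
            conn.Open();
            var cmd = new SqlCommand(
                "DELETE FROM PendingUpload WHERE CreatedUtc < @cutoff", conn);
            cmd.Parameters.AddWithValue("@cutoff", DateTime.UtcNow - age);
            cmd.ExecuteNonQuery();
        }
    }
}
```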
I would like to know the best way to deal with long running processes started on demand from an ASP.NET webpage.
The process may consist of various steps (such as uploading files to the server, running SSIS packages on them, executing some stored procedures, etc.), and sometimes it can take up to a couple of hours to finish.
If I go for asynchronous execution using a WCF service, what happens if the user closes the browser while the process is running, and how should the success or failure of the process be reported to the user? To solve this I chose one-way WCF service calls, but the problem is that I then need to create a process table and store the outcome in it (including error messages if a step fails, and which steps completed successfully). That is additional overhead, because there are many such processes, each with various steps, that the user can invoke from the web page. The user needs to be kept aware of progress (in the simplest case the status could just be "process xyz running"), and once a process is done its output needs to be displayed to the user (for example, the result of running a stored procedure).
What is the best way to design the solution for this?
As I see it, you have three options:
Have a long running page where the user waits for the response. If this is several hours, you're going to have many usability problems, so I wouldn't even consider it.
Create a process table to store the results of operations. Run the service functions asynchronously and delegate logging the results to the service. There can be a page, which the user refreshes, that shows the latest results from this table.
If you really don't want to create a table, then store all the current process details in the user's session state and have a "current processes" page as above. You have the possible issue that the session might time out, or the web app might restart, and you'll lose all of this.
I can't see that number 2 is such a great hardship. You could make the table fairly generic to encompass all types of processes: process details could just be encoded as binary or xml and interpreted by the web application. You then have the most robust solution.
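For option 2, a fairly generic process table could look something like the sketch below. The names, states, and XML payload are illustrative only, and UpdateProcess is a placeholder for however you write to the table:

```csharp
using System;

public enum ProcessState { Pending, Running, Completed, Failed }

// One row per long-running process; Details holds XML (or binary) that only
// the web application needs to know how to interpret.
public class ProcessRecord
{
    public Guid ProcessId { get; set; }
    public string ProcessType { get; set; }    // e.g. "SsisImport", "SprocRun"
    public ProcessState State { get; set; }
    public string Details { get; set; }        // step-by-step progress, error messages, output
    public DateTime StartedUtc { get; set; }
    public DateTime? FinishedUtc { get; set; }
}

public class ProcessRunner
{
    // Inside the one-way service call each step logs as it goes, so the record
    // is still there if the user closes the browser and comes back later.
    public void Run(Guid processId)
    {
        UpdateProcess(processId, ProcessState.Running, "<step>Uploading files</step>");
        // ... run the SSIS package, stored procedures, etc. ...
        UpdateProcess(processId, ProcessState.Completed, "<result>done</result>");
    }

    // Placeholder: writes the state and details to the process table.
    private void UpdateProcess(Guid id, ProcessState state, string detailsXml) { /* ... */ }
}
```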
I can't say what the best way would be, but using Windows Workflow Foundation for such long-running processes is definitely one way to go about it.
You can do tracking of the process to see what stage it is at, even persist it if you have steps where it is awaiting user input etc.
WF provides a lot of features out of the box (especially if your storage medium is SQL Server) and may be a good option to consider.
http://www.codeproject.com/KB/WF/WF4Extensions.aspx might help give you some more insight.
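If you go down this route, the hosting side can be quite small. A minimal sketch with persistence to a SQL Server instance store follows; the MyLongRunningImport activity and the connection string are placeholders:

```csharp
using System;
using System.Activities;
using System.Activities.DurableInstancing;

class WorkflowHost
{
    static void Main()
    {
        // MyLongRunningImport is a placeholder for your own workflow definition (an Activity).
        var app = new WorkflowApplication(new MyLongRunningImport());

        // Persist to the SQL Server instance store so the workflow survives restarts
        // and can sit idle for hours while it waits on external steps or user input.
        app.InstanceStore = new SqlWorkflowInstanceStore(
            "Server=.;Database=WFInstanceStore;Integrated Security=SSPI");

        // Unload (and persist) whenever the workflow goes idle.
        app.PersistableIdle = e => PersistableIdleAction.Unload;
        app.Completed = e => Console.WriteLine("Workflow finished: " + e.CompletionState);

        app.Run();
        Console.ReadLine();
    }
}
```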
I think you are on the right track. You should run the process asynchronously, store the execution somewhere (a table), and keep the status of the running process there.
Your user should see a pending label while the process is executing, and a finished label with the result when the process has finished. If the user closes the browser, she will see the result of her running process the next time she logs in.
Hypothetically, if the user clicks "save,save,save,save" a bunch of times on a text file, making single character changes at a time and managing to resave 5 times before the first save is processed, what is best practice in that situation? Assuming we don't have a "batch process" option... Or maybe they push save,save,save,save on 4 files one after the next before they've all been processed.
Should you queue up the requests and process them one at a time? Or should you just let it happen and it will work itself out somehow?
Thanks!
We usually send down a GUID in the page when a form is initialized, and send it along on the save request. The server checks a shared short-term memory cache- if it's a miss, we store the GUID and process the save, if it's a hit, we fail as a dupe request. This allows one save per page load (unless you do something to reinit the GUID on a successful save).
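A compact way to implement the duplicate check is to let the cache insert itself be the test. A sketch using MemoryCache; the key prefix and the expiry window are arbitrary choices here:

```csharp
using System;
using System.Runtime.Caching;

public static class DuplicateSaveGuard
{
    // Returns true if this is the first save we've seen for this form token,
    // false if it is a duplicate submit inside the window.
    public static bool TryBeginSave(Guid formToken, TimeSpan window)
    {
        // ObjectCache.Add is atomic and returns false if the key already exists,
        // so a second "save" carrying the same GUID is rejected as a dupe.
        return MemoryCache.Default.Add(
            "save:" + formToken,
            true,
            DateTimeOffset.UtcNow.Add(window));
    }
}
```

The page emits a fresh GUID when the form is rendered (and again after a successful save if you want to allow saving again without a reload). Note that MemoryCache is per process, so on a web farm the shared short-term cache would need to be something like a distributed cache instead.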
If you make your save operation light enough, it shouldn't matter how many times the user hits save. The amount of traffic a single user can generate is usually quite light compared to the load of thousands of users.
I'd suggest watching the HTTP traffic when you compose a Gmail message or work in Google docs. Both are very chatty, and frequently send updates back to the server.