I'm sorry that this question is quite vague, but I don't know where to start to achieve the following.
I have a Blazor WASM app that can import data from an uploaded Excel document.
I'd like to give the user the chance to preview import sanity checks before performing the actual db update.
I could call my controller to process the input file, run the sanity checks and return the results to the client; then, if everything is fine, I could call my controller again and process the file a second time to do the actual db modifications.
But I'd like to avoid reading and manipulating the input file twice, so I wonder if there is a way (SignalR? I don't know anything about it) to process the file in the controller, suspend execution, show the sanity-check results, ask for confirmation and, finally, update the db, ideally with a timeout so server resources aren't wasted.
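For illustration, a minimal sketch of what such a two-phase flow could look like on the server, assuming an ASP.NET Core controller with IMemoryCache; ExcelParser, ParsedImport and Importer are hypothetical placeholders:

using System;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Extensions.Caching.Memory;

// Sketch only: parse the file once, cache the parsed result under a token
// with a sliding expiration (the timeout), and commit on confirmation.
[ApiController]
[Route("api/import")]
public class ImportController : ControllerBase
{
    private readonly IMemoryCache _cache;
    public ImportController(IMemoryCache cache) => _cache = cache;

    [HttpPost("preview")]
    public IActionResult Preview(IFormFile file)
    {
        ParsedImport parsed = ExcelParser.Parse(file.OpenReadStream()); // hypothetical parser
        string token = Guid.NewGuid().ToString("N");
        _cache.Set(token, parsed, new MemoryCacheEntryOptions
        {
            SlidingExpiration = TimeSpan.FromMinutes(10) // frees resources if the user never confirms
        });
        return Ok(new { token, issues = parsed.SanityCheckResults });
    }

    [HttpPost("confirm/{token}")]
    public IActionResult Confirm(string token)
    {
        if (!_cache.TryGetValue(token, out ParsedImport parsed))
            return NotFound("Preview expired; please upload again.");
        Importer.CommitToDb(parsed); // the actual db update
        _cache.Remove(token);
        return Ok();
    }
}

Note the cached result lives in one server's memory, so this only works on a single server or with sticky sessions; on a farm the parsed result would have to be persisted somewhere shared.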
Any hint is appreciated.
Thank you!
Related
I have an application that can generate a raw export in xls.
The problem is that the xls generation can take very long, longer than the timeout duration.
I've checked, and my query isn't the culprit (it takes <2s for a regular query), but the xls generation is very slow (for several thousand lines I put different colors in cells, conditionally display data, and so on).
I was thinking about the console command, which runs in the CLI without a timeout problem.
I can't use it directly, because the data generation has to be triggered by users (who have no CLI access).
So I thought about calling the command from my controller.
The user would choose the parameters in a form, submit the form, and then the controller would pass the parameters to the command, which would do the heavy lifting.
My question is: in this case, is the command executed in the CLI context (with CLI timeout = 0) or in the web application context (with timeout <50s)? In the latter case this would be useless, and I would be grateful for advice on any alternative method to solve my problem.
This is a textbook case for a message queue.
RabbitMQ is recommended, and easy to use with Symfony.
You will have a producer, which generates a message and puts it in a queue. This happens in your controller.
The db query and the sheet generation belong in the consumer: the command running in the background, picking messages from the queue and processing them.
When the sheet is ready, save it as a file, and perhaps log it in the database with a unique ID.
This might sound difficult, but it is very simple, and you should learn it anyway :)
A remaining problem is showing the result to the user. The simplest way is to refresh the browser every X seconds. Other options include polling with AJAX and websocket-based notifications from the server.
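In Symfony specifically, the Messenger component with a RabbitMQ transport gives you this producer/consumer split out of the box. The shape of the pattern is the same in any stack; here is a minimal sketch using the .NET RabbitMQ.Client, since the rest of this page is .NET-centric (queue name and message shape are invented, and in practice the producer and consumer live in separate processes):

using System.Text;
using RabbitMQ.Client;
using RabbitMQ.Client.Events;

// Producer (controller side): enqueue the export request and return at once.
var factory = new ConnectionFactory { HostName = "localhost" };
using var connection = factory.CreateConnection();
using var channel = connection.CreateModel();
channel.QueueDeclare(queue: "xls-export", durable: true, exclusive: false,
                     autoDelete: false, arguments: null);
byte[] body = Encoding.UTF8.GetBytes("{\"exportId\":42}"); // invented message shape
channel.BasicPublish(exchange: "", routingKey: "xls-export",
                     basicProperties: null, body: body);

// Consumer (the background command): pick up messages and do the slow work.
var consumer = new EventingBasicConsumer(channel);
consumer.Received += (_, ea) =>
{
    var message = Encoding.UTF8.GetString(ea.Body.ToArray());
    // ... run the query, build the sheet, save it as a file, log it in the db ...
    channel.BasicAck(ea.DeliveryTag, multiple: false);
};
channel.BasicConsume(queue: "xls-export", autoAck: false, consumer: consumer);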
My application allows the user to upload a CSV file that is processed and whose records are written to the database. But this file can contain a very large number of records, for example 300,000, and in that case it may take up to half an hour to process them all. I would like my application not to freeze the page for this period, but to show progress and maybe some errors; better still, the user could move to other pages and come back from time to time to check on the process.
By what means can I achieve that?
The approach we took to resolve a similar issue was as follows:
Upload the file using normal HTTP methods.
Save the file locally.
Submit the file to an asynchronous web service (.asmx). This process inserts a record that stores the status of the import, and then actually starts importing the records. Once all records have been processed, it sets the status accordingly.
This all happens in a single flow. Because the WebMethod is asynchronous, it returns without waiting for the work to complete, and the import happens in the background.
You now redirect the user to a page that periodically checks the status of the asynchronous import until it is finished. You can also add further information to this process, such as progress, by batching the records and updating additional fields accordingly.
This has worked well for us for many years now. I have not added any real detail as that will be specific to your implementation.
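A rough sketch of that flow (the original used an asynchronous .asmx WebMethod; here it's shown as a fire-and-forget task, and StatusStore, CsvReader and Database are invented helpers):

using System;
using System.Threading.Tasks;

// Sketch: insert a status record, import in the background, let the page poll.
public static class ImportStarter
{
    public static Guid StartImport(string savedFilePath)
    {
        var id = Guid.NewGuid();
        StatusStore.Insert(id, "Pending"); // invented status store
        Task.Run(() =>
        {
            StatusStore.SetState(id, "Running");
            try
            {
                foreach (var batch in CsvReader.ReadBatches(savedFilePath, 1000))
                {
                    Database.InsertBatch(batch);              // write the records
                    StatusStore.AddProgress(id, batch.Count); // drives the progress display
                }
                StatusStore.SetState(id, "Done");
            }
            catch (Exception ex)
            {
                StatusStore.SetState(id, "Failed: " + ex.Message);
            }
        });
        return id; // the status page polls StatusStore with this id
    }
}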
I would like to know the best way to deal with long-running processes started on demand from an ASP.NET web page.
The process may consist of various steps (upload files to the server, run SSIS packages on them, execute stored procedures, etc.) and can sometimes take a couple of hours to finish.
If I go for asynchronous execution using a WCF service, what happens if the user closes the browser while the process is running? How should the success or failure of the process be shown to the user? To solve this I chose one-way WCF calls, but then I need to create a process table and store the results in it (including error messages if a step fails, and which steps completed successfully). That is additional overhead, because there are many such processes with various steps that the user can invoke from the web page; the user needs to be aware of the progress (in the simplest case, a status like "process xyz running"), and once it is done the output needs to be displayed (for example by running a stored procedure).
What is the best way to design the solution for this?
As I see it, you have three options:
Have a long-running page where the user waits for the response. If this takes several hours, you're going to have serious usability problems, so I wouldn't even consider it.
Create a process table to store the results of operations. Run the service functions asynchronously and delegate logging the results to the service. The user can refresh a page that shows the latest rows of this table.
If you really don't want to create a table, store all the current process details in the user's session state, with a current-processes page as above. The risk is that the session might time out, or the web app might restart, and you lose everything.
I can't see that number 2 is such a great hardship. You could make the table fairly generic to cover all types of processes: the process details could just be encoded as binary or XML and interpreted by the web application. You then have the most robust solution.
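For illustration, such a generic process table might look like this as an entity (schema invented):

using System;

// Sketch of a generic process table row (option 2); schema invented.
public class ProcessRecord
{
    public Guid Id { get; set; }
    public string ProcessType { get; set; }       // e.g. "FileUpload", "SsisPackage"
    public string Status { get; set; }            // "Running", "Succeeded", "Failed"
    public string LastCompletedStep { get; set; }
    public string Detail { get; set; }            // xml/binary payload the web app interprets
    public DateTime StartedUtc { get; set; }
    public DateTime? FinishedUtc { get; set; }
}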
I can't say what the best way would be, but using Windows Workflow Foundation for such long-running processes is definitely one way to go about it.
You can track the process to see what stage it is at, and even persist it if there are steps where it awaits user input, etc.
WF provides a lot of features out of the box (especially if your storage medium is SQL Server) and may be a good option to consider.
http://www.codeproject.com/KB/WF/WF4Extensions.aspx might give you some more insight.
I think you are on the right track. You should run the process asynchronously, store the execution somewhere (a table), and keep the status of the running process there.
Your user should see a pending label while the process is executing, and a finished label with the result when the process is done. If the user closes the browser, she will see the result of her running process the next time she logs in.
I know that similar questions have been asked all over the place, but I'm having trouble finding one that relates directly to what I'm after.
I have a website where a user uploads a data file, and that file is then transformed and imported into SQL. The file can be up to 50 MB in size, and sometimes this process takes 30 minutes or even longer.
I realise I need to palm the actual work off to another process and poll that process from the web page. I'm wondering what the best approach would be, though. Being a web developer by trade, I find all this Windows Service stuff a bit confusing, and I just want somewhere to start.
So:
Can I / should I be doing this with a Windows service? If so, how?
Should I use WCF? If this runs under IIS, will I have problems with aspnet_wp.exe recycling and timing out my process?
Clarifications
The data is imported into SQL; there's no file distribution taking place.
If there is a failure, it absolutely MUST be reported to the user. The web page will poll, let's say every 5 seconds, from the time the async task begins, to get the 'status' of the import. Once it's finished, another response tells the page to stop polling for status updates.
Queries on the final decision
OK, so as I thought, it seems a Windows service is the best idea. As to HOW to get it to work: the 'put the file there and wait for the service to pick it up' idea seems to be the generally accepted way, but is there a way to start a process run by the service without it having to constantly check a database table or folder? As I said earlier, I don't have any experience with Windows Services; if I put a public method in the service, can I call it somehow?
Well ...
var thread = new Thread(() =>
{
    // your action
});
thread.Start();
But you will have problems with that:
What if the import to SQL fails? Should there be any response to the client?
If it fails, how do you make sure the file is still available for a later attempt?
What if the application shuts down? This newly created and started thread will be killed as well.
...
It's not always a good idea to store everything in SQL (especially files...). If you want to make the file available to several servers, why not distribute it via FTP?
I believe that your whole concept is a bit muddled (sorry for assuming this), and it would be helpful if you elaborated and gave us more information about your intentions!
Edit:
Can I / should I be doing this with a Windows service? If so, how?
You can :) I advise you to create a simple console program and convert it with srvany and sc. You can get a rough overview of how to do it here (note: insert blanks after the = signs... that's a silly pitfall).
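The pitfall being referred to: sc.exe only accepts its options with a space after each equals sign, e.g. (service name and path invented):

sc create MyImportService binPath= "C:\services\srvany.exe" start= auto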
The term 'should' is relative, because you did not answer the most important question:
What if a record is persisted to the database, telling a consumer that file test.img should be processed, but your service hasn't picked it up or transformed it yet?
So ... moving on:
Should I use WCF? If this runs under IIS, will I have problems with aspnet_wp.exe recycling and timing out my process?
You could probably create a WCF service which receives some binary data and then stores it in a database. This request could be async, yes. But what for?
Once again:
Please give us more insight into your workflow: what exactly are you trying to achieve? What 'environmental conditions' do you have (e.g. app A polls the db and expects file records that are referenced in table x to be persisted)?
Edit:
So you want to import a .csv file. Well, that changes everything :)
But I won't advise you to use a WCF service (there could be a use for one, e.g. a WCF service with a method to insert a single row, with the iteration through the file implemented in another app... not that good, though).
I would suggest the following:
At first, do everything in your web app (as you've already done), but use some sort of bulk insert and do your transformation/logic on the database side (see the SqlBulkCopy sketch below).
If you then hit some sort of bottleneck, I would suggest something like a minor job service, e.g.:
The web app uploads the file and inserts a row into a job table. The job service continuously polls the table, or gets informed via WCF by the web app (hey, hey, finally a use for WCF in your scenario :) ), and then does the import job, writing a finish note to a table or setting the job's state to finished.
But this is a bit overkill :)
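For the bulk-insert step mentioned above, SqlBulkCopy is the usual .NET tool; a minimal sketch (connection string, columns and table name are placeholders):

using System.Data;
using Microsoft.Data.SqlClient;

// Sketch: bulk-load the parsed CSV rows into a staging table, then do the
// transformation/logic on the database side (e.g. a stored procedure).
var table = new DataTable();
table.Columns.Add("Name", typeof(string));
table.Columns.Add("Amount", typeof(decimal));
// ... fill the table from the parsed csv rows ...

using var connection = new SqlConnection("<connection string>");
connection.Open();
using var bulk = new SqlBulkCopy(connection)
{
    DestinationTableName = "dbo.ImportStaging", // placeholder staging table
    BatchSize = 5000
};
bulk.WriteToServer(table);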
Please see if my comments below help you to resolve your issue:
• Can I / should I be doing this with a Windows service? If so, how?
Yes, you can do this with a Windows service, and I think that is the way you should be doing it. You can implement your own service to process your requests, or you can use the open-source Job Processor code.
Basically the idea is:
You submit a request to process the CSV file, as a row in a database table, with a status of "not started".
Your Windows service picks up the requests from the database table that are not started and updates them to an "in progress" status.
Once the processing completes successfully or unsuccessfully, your service updates the database table with a status of "completed" or "failed".
Your ASP.NET page can poll the database table for the current status every 5 seconds or so.
• Should I use WCF? If this runs under IIS, will I have problems with aspnet_wp.exe recycling and timing out my process?
You should not be using WCF for this purpose.
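A sketch of what that service's polling loop might look like (JobTable and CsvImporter are invented helpers):

using System;
using System.ServiceProcess;
using System.Timers;

// Sketch of the polling Windows service described above.
public class CsvImportService : ServiceBase
{
    private Timer _timer;

    protected override void OnStart(string[] args)
    {
        _timer = new Timer(5000); // check the job table every 5 seconds
        _timer.Elapsed += (_, __) => ProcessPendingJobs();
        _timer.Start();
    }

    protected override void OnStop() => _timer.Stop();

    private void ProcessPendingJobs()
    {
        foreach (var job in JobTable.GetNotStarted())   // invented data access
        {
            JobTable.SetStatus(job.Id, "In Progress");
            try
            {
                CsvImporter.Run(job.FilePath);          // the actual import
                JobTable.SetStatus(job.Id, "Completed");
            }
            catch (Exception ex)
            {
                JobTable.SetStatus(job.Id, "Failed: " + ex.Message);
            }
        }
    }
}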
I am thinking of the following approach, but I'm not sure if it's the best way:
Step 1 (server side): a TaskManager class creates a new thread and starts a task.
Step 2 (server side): store the TaskManager object reference in the cache for future reference.
Step 3 (client side): use periodic AJAX calls to check the status of the task.
Basically the intention is to have a framework for running a background task (about 5 minutes) and providing regular feedback on the web UI about the percentage of the task completed.
Is there a neat way around this, or any existing ASP.NET API that would help?
Edit 1: I want to run the task in-proc with the app.
Edit 2: It looks like the badge implementation on Stack Overflow also uses the cache to track background tasks. https://blog.stackoverflow.com/2008/07/easy-background-tasks-in-aspnet/
I think the problem with storing the result in the cache is that ASP.NET might scavenge that cache entry for other purposes (e.g. if it's short on memory, or just grumpy). Anything served from the cache should be something you can recreate on demand if it isn't found there; the ASP.NET runtime is free to dump cache entries whenever it feels like it.
The usage of the cache in the badge discussion seems fundamentally different: in that case the task was short-lived, and the cache was just being used as a hacky timer to fire the task off periodically.
Can you confirm this is a task that will take 5 minutes and require its own thread the whole time? That is a performance concern in itself: you will only be able to support a limited number of such requests if each needs its own thread for so long. Only if that's acceptable would I let the task camp on a thread for that long.
If it's OK for these tasks to camp on a thread, then I'd just go ahead and store the result in a dictionary global to the process. The key of the dictionary would correlate to the client request / AJAX callback series, and should incorporate the user ID as well if security is at all important.
If you need to scale up to many users, then I think you need to break the task down into asynchronous steps, and in that case I'd probably use a DB table to store the results (again keyed per request / user).
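A minimal sketch of the process-global dictionary idea (the key shape and API are invented):

using System.Collections.Concurrent;

// Sketch: progress store global to the process, keyed by user + request id
// so one user cannot read another's progress.
public static class ProgressStore
{
    private static readonly ConcurrentDictionary<string, int> Percent =
        new ConcurrentDictionary<string, int>();

    public static string KeyFor(string userId, string requestId)
        => userId + ":" + requestId;

    public static void Report(string key, int percent) => Percent[key] = percent;

    public static bool TryGet(string key, out int percent)
        => Percent.TryGetValue(key, out percent);
}

// The worker thread calls ProgressStore.Report(key, 42) as it goes;
// the AJAX handler calls TryGet and returns the percentage to the client.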
Microsoft Message Queuing was built for scenarios like the one you're trying to solve:
http://www.microsoft.com/windowsserver2003/technologies/msmq/default.mspx
Windows Communication Foundation also has message queuing support.
Hope this helps.
Thomas
One approach is to use application state. When you spawn a worker thread, pass it a request ID that you generate, and return this ID to the client. The client then passes the request ID back to the server in its AJAX calls, and the server fetches the status from application state using that ID. (The worker thread updates application state as its status changes.)
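In classic ASP.NET that could look roughly like this (the status values are invented; note the application state object is captured before spawning the thread, since HttpContext.Current is not available on a background thread):

using System;
using System.Threading;
using System.Web;

public partial class StartWorkPage : System.Web.UI.Page
{
    // Sketch: kick off a worker and track its status in application state.
    protected string StartWork()
    {
        string requestId = Guid.NewGuid().ToString("N");
        HttpApplicationState app = Application; // capture before spawning the thread

        app[requestId] = "Starting";
        new Thread(() =>
        {
            app.Lock();
            app[requestId] = "Running"; // the worker updates this as it progresses
            app.UnLock();
            // ... do the work, then set "Done" or "Failed" ...
        }).Start();

        return requestId; // the client sends this back in its AJAX polls
    }
}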
I saw an approach to a similar problem somewhere. The solution was something like this:
Start the background task on the server. Return immediately with a URL for the result.
Until the result is posted, this URL returns 404.
The client checks this URL periodically.
The client reads the results once they are finally posted.
The URL will be something like http://mysite/myresults/cffc6c30-d1c2-11dd-ad8b-0800200c9a66.
The best document format is probably JSON.
If feedback on progress is important, extend the document to also contain a status (in progress / finished) and a progress percentage (42 %).
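A sketch of the server side of that scheme (ASP.NET Core; the route and ResultStore are invented):

using System;
using Microsoft.AspNetCore.Mvc;

// Sketch: the result URL returns 404 until the background task has posted
// a result document; ResultStore is an invented lookup.
[ApiController]
public class ResultsController : ControllerBase
{
    [HttpGet("myresults/{id:guid}")]
    public IActionResult Get(Guid id)
    {
        string json = ResultStore.Find(id); // null until the task finishes
        if (json == null)
            return NotFound(); // the client keeps polling on 404
        // e.g. {"status":"finished","progress":100,"rows":[...]}
        return Content(json, "application/json");
    }
}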