Consuming StackOverflow API and Visual Studio 2010 - asp.net

I have downloaded TheWorldsWorstStackOverflowClone. One of the projects is called TheWorldWorsts.ApiWrapper, which is basically the core for accessing the API. There is a class called ApiProxy (in ApiProxy.cs), which has all the methods for the API calls. This is good.
Now I want to collect data through this API and store it in a database. I know the limit is 10k API calls per day, so I want to call the methods in the ApiProxy class up to 10k times per day, automatically. How can I do this?
The non-automatic way would be to create a dummy site that runs the whole process every time I access it, but this is not efficient. It seems that I have to write some kind of scheduler by deploying a web service, but that is too complicated... as explained here. Are there any simpler methods?

A Windows Service or Desktop App might be a better solution than a web application. You are not deploying a web service, you are consuming one using a proxy class, and this does not require you to have a web server or a web site.
You could use a web application to control and monitor progress as your service downloads data, but the actual work is long-running and needs to be offloaded to another process or thread so you can tell the user what's going on.
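
As a rough illustration of the pacing such a service or desktop app would need, the sketch below spreads a daily quota evenly over 24 hours. It is written in Python only to keep it short; the same shape carries over to a C# service. `fetch_once`, `run_paced`, and `DAILY_LIMIT` are hypothetical names, and the 10,000 figure is simply the limit quoted in the question.

```python
import time

SECONDS_PER_DAY = 24 * 60 * 60
DAILY_LIMIT = 10_000  # quota quoted in the question (assumption, not verified)


def seconds_between_calls(daily_limit=DAILY_LIMIT):
    """Evenly spaced interval that keeps the caller within the daily quota."""
    return SECONDS_PER_DAY / daily_limit


def run_paced(fetch_once, calls, interval, sleep=time.sleep, clock=time.monotonic):
    """Call fetch_once `calls` times, leaving at least `interval` seconds
    between the start of one call and the start of the next."""
    results = []
    for _ in range(calls):
        started = clock()
        results.append(fetch_once())  # hypothetical wrapper around one API call
        remaining = interval - (clock() - started)
        if remaining > 0:
            sleep(remaining)
    return results
```

With the quoted limit this works out to one call every 8.64 seconds. `sleep` and `clock` are parameters only so the loop can be exercised without actually waiting.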

Check out this one
http://stacky.codeplex.com/
This looks like what you need. I am facing some debugging issues with it myself, but I hope you can figure it out.

Related

Azure Worker Role Deployment for background data

I want my application to have two parts. One part will simply fetch data in JSON format from an API and store it in a SQL database (or maybe a NoSQL one), and the other part (the web part) will read the data and implement customized alerts. So, basically, I need to create a worker for the fetch process. But I'm confused about the difference between a worker role and a web role in Azure. What is the best way to implement this design?
You can just merge both in the same web role - the part of code running in IIS (the ASP.NET project created when you create a web role from a Visual Studio template) will handle web requests and the part running the "role entry point" will run the fetch process. Unless you absolutely need to scale them separately this will give you a simpler and more manageable solution.
Have you looked at this tutorial? It gives possible use cases and tutorials for both web and worker roles.
http://www.windowsazure.com/en-us/documentation/articles/cloud-services-dotnet-multi-tier-app-storage-1-overview/

Referencing an unstable DLL

We are referencing a 3rd party proprietary CLI DLL in our .net project. This DLL is only an interface to their proprietary C++ library. Our project is an asp.net (MVC4/Web API) web application.
The C++ unmanaged library is rather unstable. Sometimes it crashes with e.g. dangling pointers. We have no way of solving it, and using this library is a first-class customer requirement.
When the application crashes, the application pool in IIS doesn't respond anymore. We have to restart it, and doing so takes a couple minutes (yes, that long!).
We would like to keep this unstable DLL from crashing our application. What's the best way of doing it? Can we keep the CLI DLL in a separate AppDomain? How?
Thanks in advance.
I think every answer to this question will be some kind of workaround.
My workaround would be to not interact directly with the DLL from your web application.
Instead write your requests from the web application to either a Message Queue or a SQL table. You can then have another application such as a Windows Service which reads the requests, interacts with the DLL and then writes the results back for your web application to read.
I'm not saying that SQL / Message Queues are the right way, I'm more thinking of the general process flow.
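
To make that process flow concrete, here is a minimal sketch of a SQL table used as a request queue. It is in Python with SQLite purely for brevity; the table layout and the names `open_queue`, `enqueue`, `process_next`, and `read_result` are assumptions for illustration, and `handler` stands in for whatever code talks to the DLL.

```python
import sqlite3


def open_queue(path=":memory:"):
    """Create the (hypothetical) requests table if it does not exist yet."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS requests ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " payload TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'pending',"
        " result TEXT)"
    )
    conn.commit()
    return conn


def enqueue(conn, payload):
    """Web application side: record a request and return its id."""
    cur = conn.execute("INSERT INTO requests (payload) VALUES (?)", (payload,))
    conn.commit()
    return cur.lastrowid


def process_next(conn, handler):
    """Worker side: handle the oldest pending request; return False if none."""
    row = conn.execute(
        "SELECT id, payload FROM requests"
        " WHERE status = 'pending' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return False
    request_id, payload = row
    try:
        result, status = handler(payload), "done"
    except Exception as exc:  # ordinary failures are recorded, not raised
        result, status = str(exc), "failed"
    conn.execute(
        "UPDATE requests SET status = ?, result = ? WHERE id = ?",
        (status, result, request_id),
    )
    conn.commit()
    return True


def read_result(conn, request_id):
    """Web application side: poll for (status, result) of a request."""
    return conn.execute(
        "SELECT status, result FROM requests WHERE id = ?", (request_id,)
    ).fetchone()
```

The sketch assumes a single worker. If the library takes the whole worker process down, the `except` clause never runs and the row simply stays `pending`, so the web application keeps running and the worker can pick the request up again after a restart.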
I had this exact problem with a third party library that accessed protected memory for purposes of interacting with a hardware copy protection dongle. It worked fine in a console or winforms app, but crashed like crazy when called from an IIS application.
We tried several different things, some of which are mentioned in other answers on this page. But ultimately, the best solution for us was to use a very old technology - .NET Remoting. I know - it's somewhat frowned on these days. But it fit this particular need quite well.
The unstable code was placed in a Windows Service application. The web application made remoting calls to this service, which relayed the commands to the third-party library.
Now I'm sure you could do the same thing with WCF, sockets, etc. But remoting was quick and easy to set up, and since we only talk to the same server it works without opening any ports. It just talks over a named pipe.
It does mean a second service to install besides the web application, but that was acceptable in my particular use case.
If you did something similar, and the third-party code actually crashed the service, you could probably write some code in your main application to bring it back up.
So perhaps a process boundary is more useful than an App Domain when you have unstable code to wrangle.
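
The effect of a process boundary can be shown with a small, language-neutral sketch (Python here, not the remoting setup described above). The helper name `call_in_child` is hypothetical; the point is only that a crash or hang in the child is reported to the caller instead of taking the caller down.

```python
import subprocess
import sys


def call_in_child(code, timeout=10):
    """Run `code` in a separate Python process and return (ok, output).

    A crash or hang in the child cannot bring down the calling process."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,  # the child is killed if it exceeds this
        )
    except subprocess.TimeoutExpired:
        return False, "timed out"
    if proc.returncode != 0:
        return False, f"child exited with code {proc.returncode}"
    return True, proc.stdout.strip()
```

A long-lived service, as in the answer above, avoids paying the process start-up cost on every call; this one-shot form trades that overhead for simplicity.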
I would first increase the IIS process recycling rate; maybe the DLL code fails after a certain number of calls, or after the process reaches a certain amount of memory usage.
You can find information on the configuration of IIS 7.0 recycling options here: http://technet.microsoft.com/en-us/library/cc753179(v=ws.10).aspx
In your case I would recycle the process at a specific time, when you know there is less load on the application, and also after a certain number of requests (lower than the default), to try to have a "fresh" process most of the time.
The recycling process is graceful in the sense that the old process is not terminated until the one that will replace it is ready, so there should be no noticeable downtime.
More information about the recycling mechanism here: http://technet.microsoft.com/en-us/library/cc745955.aspx
If the above does not solve the problem I would wrap the calls in my own code that manages the unstable DLL execution.
This code should recover from the failures for example by repeating the failing calls until a result is obtained and failing with a graceful error if it is not possible after a number of attempts.
Internally, the calls to the unstable DLL could be made in a spawned thread, or the code could even live in a new external executable that you launch with Process.Start.
This last option has more overhead but it might be your only option. See this SO question for more information on this: How do you handle a thread that has a hung call?
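
The "repeat, then fail gracefully" wrapper described above might look roughly like this. It is a Python sketch rather than C#, and `call_with_retries` and `on_failure` are hypothetical names. It only handles failures that surface as exceptions; a crash that kills the process still needs the separate-process approach.

```python
def call_with_retries(operation, attempts=3, on_failure=None):
    """Call `operation` up to `attempts` times and return its first result.

    Raises RuntimeError (chained to the last error) if every attempt fails."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception as exc:
            last_error = exc
            if on_failure is not None:
                on_failure(attempt, exc)  # e.g. log the failed attempt
    raise RuntimeError(f"operation failed after {attempts} attempts") from last_error
```

Catching the final `RuntimeError` at the edge of the application is where the graceful error message for the user would be produced.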
I suggest the following solution.
Wrap this DLL in another web application. It can be one of the following; since you already use Web API, that is the most suitable for you:
Simple ASMX Web Service
WCF Service
ASP.NET MVC - Web API Service
Check your P/Invoke code so that it does not introduce any bugs of its own. See the following articles:
The Black Art of P/Invoke and Marshaling in .NET
P/Invoke Revisited
Publish this application to IIS with a different application pool.
Use the standard techniques suggested before. I suggest configuring IIS recycling for both memory and scheduled times:
IIS process recycling rate
How to limit the memory used by an application in IIS?

What architecture to use for my ASP.NET Application?

Here is the scenario:
There's a data source sitting at site A that I could communicate using a set of APIs to get data I need.
I want to build an ASP.NET web application that periodically fetches data from site A and updates/stores the data in my own database. It should also periodically process the data and store the results in my database so that users can browse them in my web application front-end.
I have no clue how to design the architecture. How do I achieve things like periodically communicating with another data source and periodically processing data in my database from a web application?
I have very little experience designing web applications. It would be really nice if you could elaborate.
In answer to:
"How to achieve things like periodically communicate with another data source and process data in my database periodically in a web application?"
I do this by creating a web service, then creating a console application. I use Windows Task Scheduler to run the console application at an interval of my choice. The task is run and the web service is called, which communicates with various data sources and processes the data.
The question is kind of vague to answer.
There are tools that help with the communication with each API, and some services provide wrappers for communication. When they are not provided, look into something like Hammock as a wrapper.
Some high-level help follows; these are not ABSOLUTES, just tips and thoughts:
Follow a multi-tiered model where you clearly separate your layers
Model
View
Controller
Use an ORM like ServiceStack's OrmLite for data access
Create a small console app to do the processing
Use a scheduled job in Windows to run it.
DON'T do this with something like Quartz (way too much overhead)
DON'T do this with SQL Server Agent: too much overhead, and not enough control if you are a .NET programmer
Watch how big your objects and lists are getting; you can run out of memory when working with other people's data
Use JSON; it is a great format for passing data around, both within and outside your application
Set up logging and make sure it works; other people's data breaks things
Scrub incoming data; you can't assume other people's data is clean.
Profile your application to know where the hot paths are
Write unit tests
Run your unit tests regularly.
Test on multiple browsers
That's probably good for now. Clarify your question and we may be able to give more help.
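
As one possible reading of the "scrub incoming data" and "set up logging" tips, the Python sketch below parses a JSON payload, keeps only well-formed records, and logs whatever it drops. The field names `id` and `title`, the logger name, and the function `scrub_records` are made up for illustration; a real feed will have its own schema.

```python
import json
import logging

log = logging.getLogger("import")

REQUIRED_FIELDS = ("id", "title")  # hypothetical schema, for illustration only


def scrub_records(raw_json):
    """Return cleaned records; skip and log anything malformed."""
    try:
        items = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        log.error("payload is not valid JSON: %s", exc)
        return []
    if not isinstance(items, list):
        log.error("expected a JSON array, got %s", type(items).__name__)
        return []
    clean = []
    for index, item in enumerate(items):
        if not isinstance(item, dict) or any(k not in item for k in REQUIRED_FIELDS):
            log.warning("skipping record %d: missing required fields", index)
            continue
        try:
            record_id = int(item["id"])
        except (TypeError, ValueError):
            log.warning("skipping record %d: id is not an integer", index)
            continue
        clean.append({"id": record_id, "title": str(item["title"]).strip()})
    return clean
```

Dropping bad records and logging them, rather than aborting the whole run, is a design choice; a stricter importer might reject the entire batch instead.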

Speeding up a Web Service

I have a web service running, and I consume it from my desk application, which is written for the Compact Framework.
It takes 13 seconds to retrieve 8 results which is kinda slow. I also expect to be retrieving more results in the future. The database query runs fast.
Two questions: how do I detect where the speed slow down occurs? Do I put timers in the Web services code?
I would like to detect whether it is the network or the application code.
This is my first exposure to web services in a real environment so please bear with me.
I used ASP.NET 2.0 and C# to write a simple web service.
Another good profiler is the EQATEC Profiler. I did a write up on it here: http://elegantcode.com/2009/07/02/eqatec-profiler-and-net-cf-profiling-and-regular-net/
It works fine for .NET CF projects, and it will let you see whether there are performance issues in unexpected places.
You're already on the right track with adding event logging that includes timers. Note that doing so will add to the overall time it takes, so you'll want to remove them after you track down the culprit. Also look into running the same web service call multiple times without re-initiating the connection; that may be the cause as well.
-Jay
A starting point is to profile your web service to see where the delay is coming from.
Do you know the CLR Profiler? There are some tools you can use to see what is happening:
http://msdn.microsoft.com/en-us/library/ms998579.aspx
The database connectivity from your service to the DB could be a possible cause of the slowdown. Adding timers should do the trick. If the code isn't too large, you can first read through it to make an informed guess about where things could be slow, and then add the timers there. You will get a fair idea of where things are slowing down.
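
A minimal sketch of the "add timers" idea, in Python rather than C# and with hypothetical names (`timed`, `timings`, `slowest`): wrap each suspect section, accumulate the elapsed time per label, and compare the totals.

```python
import time
from contextlib import contextmanager

timings = {}  # label -> accumulated seconds


@contextmanager
def timed(label, clock=time.perf_counter):
    """Accumulate the wall-clock time spent inside the `with` block."""
    start = clock()
    try:
        yield
    finally:
        timings[label] = timings.get(label, 0.0) + (clock() - start)


def slowest():
    """Return the label with the largest accumulated time."""
    return max(timings, key=timings.get)
```

Typical use would be `with timed("database"): ...` around the query and `with timed("serialize"): ...` around building the response, then logging `timings` at the end of the request. Timing the same call from the client side as well separates server time from network time.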
Two biggest pain points are going to be instantiating the web service reference and transferring all the data over the network. Pending anything turning up where some obvious blunder was made, I would look at ways of reducing the size of your xml and ways of better handling your web service reference.
All I know about the Compact Framework is that it is a pain to work in. I've worked on a number of web projects though, and profiling your server and adding logging to record the time taken will be helpful. If all the time is being taken after the server responds, however, it won't do much more than prove your server is working quickly.
SoapUI is a fantastic Java application for consuming web services. It has a lot of functionality, including time metrics. I would start with that and see how long it takes to consume the same thing your client would. If no issues turn up there, start with what I recommended above.

Should I use a Windows Service or an ASP.NET Background Thread?

I am writing a web application in ASP.NET 3.5 that takes care of some basic data entry scenarios. There is also a component to the application that needs to continuously poll some data and perform actions based on business logic.
What is the best way to implement the "polling" component? It needs to run and check the data every couple of minutes or so.
I have seen a couple of different options in the past:
The web application starts a background thread that will always run while the web application does. (The implementation I saw started the thread in the Application_Start event.)
Create a windows service that is always running
What are the benefits to either of these options? Are there additional options?
I am leaning toward a windows service because it is separated and can run on a different server (more scalable) as well as there is more control over when it is started/stopped, etc. However, I feel like the compactness of having the "background" logic running in the process of the web application might make the entire solution more understandable.
I'd go for the separate Windows service primarily for the reasons you give:
You can run it on a different server if necessary.
You can start and stop it independently of the web site.
I'd also add that it could well have some impact on the performance of the web site itself - something you want to avoid.
The buzz-word here is "separation of concerns". The web site is concerned with presenting the data to the user, the service with checking the integrity of the data.
You can also update the web site and service independently of each other should you need to.
I was going to suggest that you look at a scheduled task and let Windows control when the process runs, but I re-read your question and noted that you wanted the checks to run every couple of minutes. The overhead of starting the process might be too great in this case - though some experimentation would probably prove this one way or the other.
If you use a scheduled task there's also the possibility that you could start the next check before the current one has finished - something you can code for if you're in complete control.
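
One common way to guard a scheduled task against overlapping runs is an exclusive lock file. The Python sketch below assumes all runs happen on the same machine and can agree on a `lock_path`; the helper name `single_instance` is hypothetical.

```python
import os
from contextlib import contextmanager


@contextmanager
def single_instance(lock_path):
    """Yield True if this run acquired the lock, False if another run holds it."""
    try:
        # O_EXCL makes creation fail if the file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        yield False
        return
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(str(os.getpid()))  # record the owner for diagnostics
        yield True
    finally:
        os.remove(lock_path)
```

A run that gets `False` should simply exit and let the next scheduled start try again. The lock is not released if the process is killed outright, so a stale file would need manual or age-based cleanup.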
Why not just use a console app that has no UI? It can do everything the Windows service can and is much easier to debug and maintain. I would not write a Windows service unless you absolutely have to.
You might find that the SQL Server job scheduler is sufficient for what you want.
A console application does not do well in this case. I wrote a TAPI application which had to stay in the background and intercept incoming calls, but it did so only once because the TAPI manager got garbage-collected and was never available for the second incoming call.

Resources