Disclaimer: I'm a C# ASP.NET developer learning "RoR". Sorry if this question doesn't "get" RoR; any corrections are greatly appreciated!
What is multithreading
My understanding of "multithread" ability in web apps is twofold:
1. Every time a web/app server receives a request it can assign a thread to the new request, thus multiple requests can run concurrently.
2. The app runtime + language allows for multiple threads to be used WITHIN a single request (in ASP.NET via "Async" methods and keywords for example).
In this way, IIS7 + ASP.NET can do points 1 AND 2.
I'm confused about RoR
I've read these two articles and they have left me confused:
Clearing up some things about LinkedIn mobile’s move from Rails to node.js
How to deploy a multi-threaded Rails app
question one.
I think I understand that RoR doesn't lend itself very well to point number 2 above, that is, having multiple threads within the same request. Have I got that right?
question two.
Just to be crystal clear, RoR app/web servers can also do point number 1 above, right (that is, multiple requests can run concurrently)? Or is that not always the case with RoR?
Question 1:
You can spawn more Ruby threads in one request if you want, although that seems to be outside the typical use case for Rails. There are uses for it for certain long-running IO or external operations.
Question 2:
The limiting factor for Ruby concurrency in general, not just with Rails, is the Global Interpreter Lock (GIL). In the default Ruby interpreter (MRI), this lock prevents more than one thread of Ruby code from executing at any given time per process. The lock is released whenever non-Ruby code is executing, such as while waiting for disk IO or SQL responses. You can get around it by using a different Ruby implementation, such as JRuby, although not all alternative implementations remove the lock.
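To illustrate the point about the lock being released during IO, here is a minimal sketch in Python rather than Ruby (CPython has a comparable interpreter lock, so the behaviour is analogous, but this is not Rails code). `fake_io_call` and `handle_request` are hypothetical names; the IO wait is simulated with `time.sleep`, which releases the lock while it waits.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io_call(name):
    # Hypothetical stand-in for a SQL query or remote call.
    # time.sleep releases the interpreter lock while it waits.
    time.sleep(0.2)
    return f"{name} done"


def handle_request():
    # Fan out four blocking calls from within a single "request".
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(fake_io_call, ["db", "cache", "api", "disk"]))
    elapsed = time.perf_counter() - started
    return results, elapsed
```

Run one after another, the four calls would take roughly 0.8 seconds; with threads the waits overlap. CPU-bound work would not speed up this way while the lock is in place.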
Phusion Passenger uses process-based concurrency to handle a few requests concurrently, so, strictly speaking, it is not "multithreaded," but it is still concurrent.
This talk from Ruby MidWest 2011 has some good thoughts on getting multithreaded Ruby on Rails going.
Since this is about "from ASP.NET to RoR" there is another small but important detail to remember: In *nix environments it's common to achieve concurrency of a service application through multi-processing rather than multi-threading. This is an architecture that goes way back and is related to the relatively cheap cost of multi-processing on *nix systems using fork and Copy-on-Write. Each process serves one request at a time in a single thread and the main process controls spawning and killing worker child processes. Multiple requests are served concurrently by different child processes.
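The multi-process model described above can be sketched roughly as follows. This is a simplified illustration in Python, assuming a *nix system where `os.fork` is available; `handle_request` and `serve_concurrently` are hypothetical names. Real pre-forking servers usually fork a fixed pool of workers up front and share a listening socket, whereas this sketch forks one child per request and collects the responses over pipes.

```python
import os


def handle_request(request):
    # Each worker is single-threaded and serves exactly one request.
    return f"pid {os.getpid()} handled {request}"


def serve_concurrently(requests):
    children = []
    for request in requests:
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child process: do the work, send the response, then exit.
            try:
                os.close(read_fd)
                os.write(write_fd, handle_request(request).encode())
            finally:
                os._exit(0)
        # Parent process: keep only the read end and move on.
        os.close(write_fd)
        children.append((pid, read_fd))

    responses = []
    for pid, read_fd in children:
        with os.fdopen(read_fd) as pipe:
            responses.append(pipe.read())
        os.waitpid(pid, 0)  # reap the finished worker
    return responses
```

Each response carries a different PID, which shows that the requests were handled by separate processes rather than by threads inside one process.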
Modern service applications, for example Apache, have multi-process, multi-threaded, and even combined modes (where the service forks several processes, each running several threads).
In cases where the application was built with portability in mind (examples again: Apache, MySQL, etc.) it is customary to run it in multi-process or combined mode on *nix systems, and in multi-threaded mode on Windows servers.
However, admittedly Rails is somewhat lacking on the Windows front. It's not that you can't run it on Windows, it's just that not a lot of effort went into making sure it runs well and smoothly for production use on Windows servers. It's not a common production platform among the RoR community.
As a result, even though Rails itself has been thread-safe since version 2.2, there isn't yet a good multi-threaded server for it on Windows, and you get the best results by running it on *nix servers using a multi-process/single-threaded concurrency model.
Rails as a framework is thread-safe. So, the answer is yes!
The second link that you posted names many Rails servers that don't do well with multi-threading. The author later says that nginx is the way to go (it is definitely the most popular one and highly recommended), but he doesn't explain how he reached those conclusions.
Ruby 1.9.3 came out recently and has some new threading goodness built in which didn't exist before.
Use of multi-threading generally depends on the use case.
Personally, I tried it once about a year ago and it worked, but I haven't used it in any production code because I haven't come across a use case where multi-threading made more sense than pushing the long-running task to a background job.
I would love to explore this more. So, if you can describe what you are trying to achieve then maybe we can do a POC.
Related
I have two questions:
Is there any API in Qt for getting all the processes that are running right now?
Given the name of a process, can I check if there is such a process currently running?
Process APIs are notoriously platform-dependent. Qt provides just the bare minimum for spawning new processes with QProcess. Interacting with any processes on the system (that you didn't start) is out of its depth.
It's also beyond the reach of things like Boost.Process. Well, at least for now. Note their comment:
Boost.Process' long-term goal is to provide a portable abstraction layer over the operating system that allows the programmer to manage any running process, not only those spawned by it. Due to the complexity in offering such an interface, the library currently focuses on child process management alone.
I'm unaware of any good C++ library for cross-platform arbitrary process listing and management. You kind of have to just pick the platforms you want to support and call their APIs. (Or call out to an external utility of some kind that will give you back the info you need.)
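As an example of the "pick a platform and call its API" approach, here is a minimal sketch for Linux only, written in Python rather than Qt/C++ purely for illustration. It reads the kernel's `/proc` filesystem; `list_processes` and `is_running` are hypothetical helper names, and other platforms would need entirely different code.

```python
import os


def list_processes():
    """Map PID -> short command name by reading Linux's /proc."""
    processes = {}
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(f"/proc/{entry}/comm") as comm_file:
                processes[int(entry)] = comm_file.read().strip()
        except OSError:
            continue  # the process exited while we were looking
    return processes


def is_running(name):
    # Note: the kernel truncates names in "comm" to 15 characters.
    return name in list_processes().values()
```

Because of the 15-character truncation, a check by name is only approximate for long executable names.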
I'm interested in using a statistical programming language within a web site I'm building to do high performance stats processing that will then be displayed to the web.
I'm wondering if an R compiler can be embedded within a web server and threaded to work well with the LAMP stack so that it can work smoothly with the front-end and back-end of the web site and improve the performance of the site.
If R is not the right choice for such an application, then perhaps there is another tool that is?
The general rule is that the web server should do NO calculations; whatever you do, it will always end in a bad user experience. Instead, the server should respond to a calculation request by scheduling the job for some worker process, show the user a working status, and then push the results obtained from the worker when they are ready (most likely with AJAX polling or some more recent Comet technique).
Of course this requires some RPC protocol to R and some queuing agent -- this can be done either with background processes (easy yet slow), R HTTP servers (more difficult yet faster), or real RPC like Rserve or triggr (hard, yet fast to ultra-fast).
You are confusing two issues.
Yes, R can be used via a webplatform. In fact, the R FAQ has an entire section on this. In the fifteen+ years that both R and 'the Web' have ridden to prominence, many such frameworks have been proposed. And since R 2.13.0 R even has its own embedded web server (to drive documentation display).
Yes, R scripts can run faster via the bytecode compiler, but that does not give you an orders-of-magnitude speedup.
I would like to hear some opinions about OpenEJB: we are considering using it on a new project, but we haven't found many opinions about it.
So, here is my question: how about it? Does it perform well? Is it stable enough for a production environment?
We switched to OpenEJB (deployed embedded in our app on Tomcat). Performance tests showed results equal to or better than JBoss when processing our transactions (which include data access, JMS, and servlets). We use ActiveMQ within OpenEJB for JMS. There have been no stability problems so far, though we are still in a staging (pre-production) environment. The documentation is definitely lacking, but not as poor as that of other embedded choices. Overall, we consider it a good choice if you run on Tomcat. Deploying it on other application servers (JBoss, WebLogic, WebSphere) turned out to be much more difficult, but there is usually little reason to do so (we had a few reasons but dropped the idea after several attempts basically failed).
And as in all open source products: expect lack of support (documentation, troubleshooting, bugs, etc.) to be compensated by free access to sources.
We've had experience with Oracle OAS and JBoss before. We decided to give OpenEJB a try. We've found that it is not only very fast but also much easier to set up and configure, and it has much better defaults.
Currently we implement our own failure measures in the client, so we don't know how they compare for clustering, or other advanced features that we don't use.
When we have to go back and deal with JBoss on the development side, we see a drop in productivity, because it takes too long to bootstrap.
Until recently I'd considered myself to be a pretty good web programmer (coming up for 10yrs commercial experience on a variety of e-commerce, static and enterprise applications). I'm self taught and have always used the Microsoft product stack (ASP, ASP.NET)...
My applications are always functional, relatively bug free, but have never been lightning quick. As a frequent web user I always found this to be the norm... how fast are the websites from the big tech players (eBay, Facebook, Microsoft, IBM, Dell, Telerik etc etc) - in truth none are particularly fast. I always attributed this to "the way things are with web apps"...
...then I came across a product called Jira from Atlassian and this has stopped me in my tracks...
This application is fast, and I mean blindingly fast.. too fast to time the switches between pages, fully live content, lots of images and data and cross references etc etc...
I run this on an intranet, with a large application DB, and this is running on a very normal server (single processor, SATA HDD, 8GB RAM).
Am I missing something?? Are my programming techniques that bad?? I am wondering if this speed gain is down to it being written in Java and running on Tomcat.
Does anyone have any benchmarks to compare JSP / ASP or Tomcat / IIS???
Thanks,
Mark
NOTE: this isn't a blatant plug for Jira. I don't work for them or have any affiliation to them... but I would like to be able to write applications like them :)
YMMV. But one of the longest-lived Things That Aren't True Anymore is the assertion that "Java Is Slow". Excepting floating-point (where most Java implementations aren't at liberty to use the floating-point hardware), Java is generally as fast or faster than compiled code. Some of the best and brightest have spent years of effort ensuring this, including such things as dynamic recompilation/re-optimization of code based on run-time metrics - something that statically-compiled languages like C or assembler cannot boast.
ASP is sort of the opposite extreme, since the original ASP had to recompile each page every time it was requested. ASPX addressed this by allowing retention of the compiled page code. That got rid of a lot of useless overhead.
A more compelling reason to prefer Java over ASPanything/IIS is freedom. A Java/Tomcat webapp will run under almost any OS on almost any hardware. IIS runs on Windows. Period. And for the most part, that also means Intel. Not Sparc, Not zSeries. Maybe you don't care. But then again, maybe next week IBM will offer your employer a can't-refuse deal on a mainframe.
I don't have benchmarks, and there are a lot of things that can make one platform preferable. But I permanently gave up on the "Java is slow" idea when I encountered the Poseidon UML tool with its cool real-time graphics UI and the FreeMind mindmapper tool. A small hit to startup the JVM, but after that, you'd never know what language you were working under.
The great debate. Java vs. .Net.
When .Net first came out there was an application written called "The Pet Shop." Which was a .Net port of Sun's J2EE reference application, "The Pet Store". It was announced that Microsoft's implementation was "faster."
As with anything, especially anything to do with marketing, you have to dig deeper to find the truth.
Any technology can be fast with enough hardware and the correct design.
In my experience there are two factors to speed: What type of hardware is used and how you architect your application (this includes database tuning).
Caching at various levels (response, db, etc.) makes a huge difference in the responsiveness of a web application. There are also a lot of things that are done to reduce time-consuming operations, like db connection pooling, SQL statement caching, etc. As much as I'd like to say Java is better :-), I think in this case the performance is due to the way Jira was written and the fact that it's being run internally (probably with few users compared to eBay, Facebook, Microsoft). This site, Stack Overflow, uses ASP.NET MVC and IIS and is very responsive, and my guess (since the code is not open sourced, yet) is that they use many of the same techniques you would find in Jira or any other web application built to scale.
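As a rough illustration of why caching at the data layer matters, here is a minimal sketch in Python (it says nothing about how Jira or Stack Overflow actually do it). `run_query` is a hypothetical stand-in for a real database call, and `query_log` exists only to show how many times the "database" was actually hit.

```python
from functools import lru_cache

# Records every simulated trip to the database.
query_log = []


def run_query(product_id):
    # Hypothetical stand-in for an expensive SQL query.
    query_log.append(product_id)
    return {"id": product_id, "name": f"product-{product_id}"}


@lru_cache(maxsize=1024)
def get_product(product_id):
    # Repeated lookups for the same id are served from memory.
    return run_query(product_id)
```

Calling `get_product(7)` many times results in a single entry in `query_log`; only a lookup for a new id goes back to the database. A real application would also need some invalidation strategy for when the underlying data changes.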
I think that it is not typically the frameworks and languages used that make an application slow. In my experience, some frameworks like JSF or .NET server-side controls give developers a lot of freedom to make too many database calls and look things up too often, but that's definitely not the fault of the framework used.
Keep your application as light as possible and focus on keeping the data sent to the client as small as possible, and you will have a fast application. It's usually faster to develop fast applications too.
The Jira folks have written a best in class application (and charge for it) - nice work crocodile dundees.
I also suggest considering two more aspects:
the maintenance activities: logging and deployment. In my opinion, on a Unix-like server it is easier to log, deploy, and maintain new releases than on a Windows server.
if the project requires some open source application (e.g. the Alfresco repository), Java is the better solution.
People's opinion is mostly biased. Most people have never really tried the other while claiming the other is slower. I wouldn't trust any answer: it's mere opinion. It's boring to always read the same 4 cents again and again.
It's one of those things I see a lot but never really think about. For the purpose of web application development (specifically ASP.NET WebForms/MVC), do you think it's advantageous to use virtualization, and if so, what kind of advantages come out of it?
By virtualization I mean using products like Hyper-V to separate the server context like your SQL and Web Server, etc.
First question is, virtualization of what? Do you mean server virtualization? Do you mean running VMWare on each dev's laptop with multiple OSes? Do you mean moving everything to the cloud?
Virtualization of servers, in a web app context, is not really different from that in general IT - most of the servers on the Internet, including Stack Overflow's, are bought to handle peak loads and spend most of the time idling away the cycles, so virtualizing them makes sense once you have more than a certain number of them.
VMWare on the desktop (or other parallels on other operating systems) is superb because your devs can run a full instance of your server environment, including multiple virtual servers connected in a virtual network - this is about as close to the real thing as you can get, minus hardware costs and minus devs messing with each other's servers. For clients, you can use Linux and multiple Windows installs to test various browsers, font sizes, etc. quickly - also a big win.
Moving everything to the cloud makes sense in many cases, but is probably a topic for a separate full-sized question :)
One big advantage I see is, that every developer can have his/her own sandbox to work on. If someone messes up his/her sandbox he/she can take a clean image and all is OK again. So I guess that means that there is room to experiment without losing valuable time getting back to the normal setup, you can simply do a rollback.
I'm in doubt a bit on whether you should use virtualisation for production environments. Depending on the application of course.
The only time I would use a virtual machine for ASP.NET development is if the app required a specific setup, such as relying on installed software, weird settings or particular shares. Every developer has their own web server and can run their own database, so if it's a "basic" webapp I don't see much value in virtuals.. it's pretty hard to break anything with a basic web app deployment :)
With a virtual server, you can test your code in a production-like environment. It is also possible to quickly revert back to the original setup. For many applications, it is useful in that time period just after you write the code, but before it goes to production.
I'm a fan of virtualization and use it in testing and production (VMWare and Hyper-V), but over the last year I have found it less important on a dev machine. TFS provides me with all the backup/rollback ability that I need, multiple versions of .NET can now exist on the same machine, and VS2008 can target all those versions.
In a development environment a virtual environment is useful to put several different servers on one box, you can have an instance for your web app, one for your services, one for database, etc. That way it mimics your production environment if you are using separate servers.
One of the benefits of using virtualization in production is that your application is not tied to a specific machine. If you wanted to move your web server instance to another box, it is trivial to do so. You don't need to install or configure things on the new server and hope that everything is set up properly.
One problem I have had though in testing virtual instances is that it can run slower for some applications, specifically engineering apps that like running the CPU at 100%. So test before you leap.