Developing a distributed system as a grid - grid

Has anyone had experience with developing a distributed system as a grid?
By grid, I mean, a distributed system where all nodes are identical and there is no central management, database etc.
How can the grid achieve even distribution of: CPU, Memory, Disk, Bandwidth etc.?

Something akin to Plan9 perhaps?
wikipedia entry.

What you're actually talking about is a cluster. There is a lot of software available for load balancing etc., and even dedicated Linux distros such as Rocks, which comes complete with MPI/PVM and monitoring tools built in.

Related

Why is it hard to implement fast network transfer of small files?

I use several kinds of network mounts and transfer tools (Samba Windows shares, sshfs, scp) on different networks (LAN, dial-up). Whenever I transfer a large number of small files, I see poor performance, far below what would theoretically be possible. No resource appears really busy, so it seems to be a question about the software behind it (which is why I hope I'm not off-topic here).
What is the problem from a software developer perspective behind that? Why do those tools not saturate any component of my system or the network?
Is that just because the Linux kernel makes some stuff complicated, or is there more to know about?

Are ASP.NET websites already multithreaded considering scalability on Azure?

This refers to another question, "How does the Windows Azure platform scale for my app?", which states that only multithreaded applications would benefit from a multi-core architecture.
My question is: since ASP.NET/IIS (or any other web application and web server) is basically multithreaded, will it not take advantage of multiple cores?
Assume the site is a simple web application with some logic that is executed on page display, without any explicit multithreading implementation.
I believe it is only console applications or scheduled jobs that need explicit multithreaded programming to take advantage of multiple cores. Please advise.
By design, IIS takes full advantage of all CPU cores available to the OS. However, which VM size to choose is really application specific. I rarely see anything larger than Medium being used for a Web Role!
The most common case is to use a Small VM for a Web Role and scale out to multiple instances when necessary. I always advise customers to start with a Small VM and do some performance/load testing while closely monitoring the VM to assess the need for a larger one (if any). From the point of view of resilience and durability, and in many cases even performance, it is much better to rely on 4 Small VMs rather than 1 Large!
Keep in mind that it is always easier, and more cost-effective, to scale in smaller steps (1 or 2 CPU cores at a time, meaning a Small or Medium VM).
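To make the resilience argument concrete, here is a rough sketch comparing how much capacity survives the loss of a single instance. The core counts (Small = 1, Large = 4) and the `remaining_capacity` helper are assumptions made for illustration; they are not taken from any Azure API.

```python
def remaining_capacity(instances: int, cores_per_instance: int, failed: int = 1) -> float:
    """Fraction of total cores still serving traffic after `failed` instances go down."""
    total = instances * cores_per_instance
    surviving = max(instances - failed, 0) * cores_per_instance
    return surviving / total


# Assumed sizes for illustration: Small = 1 core, Large = 4 cores.
four_small = remaining_capacity(instances=4, cores_per_instance=1)
one_large = remaining_capacity(instances=1, cores_per_instance=4)

print(f"4 x Small after one failure: {four_small:.0%}")  # 75%
print(f"1 x Large after one failure: {one_large:.0%}")   # 0%
```

Both setups have the same total core count under these assumptions, but only the scaled-out one keeps serving requests while an instance is being recycled.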
Many real-world business requirements call for multithreading in a web application, such as generating data and inserting it into a DB, or exporting Excel files via AJAX. On a Small VM with minimal RAM and a powerful core, if you don't manage your logic well and take advantage of the hardware, you will run out of RAM quickly.

Choosing Embedded Linux for device

I am starting to create a Qt application with SQLite for a handheld device. My project manager has asked me to select an operating system (embedded Linux) for the device (we are not considering Android).
As on the desktop, are there many embedded Linux distributions for devices?
If so, which embedded Linux should I consider?
You have multiple choices, but I will suggest the easier and - in my opinion - better two.
Buildroot - is a set of makefiles that lets you create your custom embedded distribution. Can take care of building the Linux kernel, the toolchain and a barebox or U-Boot bootloader, too. Easily expandable and with a practically zero learning curve. You have a fully working system in a matter of hours.
Yocto - a fully fledged (and complicated) build system. Suggested over Buildroot when you need a LOT of packages/components and may need flexibility in expanding the system directly on premises. What you can do substantially depends on the "layers" (sets of rules for building things) available: you combine layers together to obtain your system. Has a steep learning curve but is used and directly supported by multiple vendors (e.g.: Atmel, TI).
Anyway, unless you have more than good reasons, I strongly suggest the former.
There are several Linux distros to be used with ARM. Maybe you should consider Fedora ARM https://fedoraproject.org/wiki/Architectures/ARM
This is a difficult question to answer not knowing more about the project requirements (not just software requirements, but also non-functional ones as well) and capabilities of the platform.
Angstrom (based on OpenEmbedded) is another possibility for Linux.
I would challenge the assumption that the operating system must be Linux. Why? If time-to-market or having commercial support are important, you might be better off with commercial embedded or RT operating systems such as VxWorks or QNX.
There are also professionally supported Linux distros such as MontaVista.
Whilst free Linux distros are, well, free, you are generally on your own and your team's time isn't free.
You can use Qt for embedded devices. It's fast and compatible with a lot of hardware, and if your hardware is not supported, porting it to new hardware is not so hard.
Plus, it has a special rendering system.

VMWare - network applications

I am developing a distributed file system using Java; I cannot give many details at this moment. I need to test some things on Linux, so I will use VMware Server and install Linux inside a virtual machine. Is there any difference between the simulated network card and a real Ethernet interface?
I am developing a distributed file system ... I will use VMware Server and install Linux inside a virtual machine.
VMware is great for this sort of thing. There should be no difference except, as RichieHindle said, in performance, especially if you're planning to run multiple vms on the same server.
Use real hardware if you want usable performance benchmark results.
Java is its own 'VM'... on top of a layer of virtualization in the guest OS... on top of VMware... on a virtual execution model CPU. Take a little virtualization here, add a little virtualization there, and pretty soon we're talking about some real abstraction!
From the point of view of application code, no, there's no difference.
The only visible difference might be in performance - the speed of response and exact timings of things might be different, but you're talking microseconds.
There's so much general-purpose software that works flawlessly under VMs that the answer to almost every question of the form "Are VMs different from real machines?" at the application level is "No".
(Things might be different if you were talking about kernel-level driver software.)
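If you want to see how large the timing difference actually is in your setup, a small round-trip measurement can help. The following is a minimal sketch, not a proper benchmark: the `echo_server` and `measure_round_trips` helpers are hypothetical names, and the demo runs over loopback only so that it is self-contained. For a real comparison you would run the echo side on the VM (or the physical machine) and point the client at that host.

```python
import socket
import statistics
import threading
import time


def echo_server(listener: socket.socket) -> None:
    """Accept one connection and echo every byte back until the client closes."""
    conn, _ = listener.accept()
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:
                break
            conn.sendall(data)


def measure_round_trips(host: str, port: int, samples: int = 200) -> list[float]:
    """Return round-trip times in seconds for `samples` one-byte pings."""
    timings = []
    with socket.create_connection((host, port)) as client:
        # Disable Nagle's algorithm so each tiny write is sent immediately.
        client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(samples):
            start = time.perf_counter()
            client.sendall(b"x")
            client.recv(1)
            timings.append(time.perf_counter() - start)
    return timings


# Loopback demo: bind to an ephemeral port and echo from a background thread.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
port = listener.getsockname()[1]
threading.Thread(target=echo_server, args=(listener,), daemon=True).start()

rtts = measure_round_trips("127.0.0.1", port)
listener.close()

print(f"median RTT: {statistics.median(rtts) * 1e6:.1f} us")
print(f"max RTT:    {max(rtts) * 1e6:.1f} us")
```

Comparing the median and the maximum between a virtual and a physical NIC gives a first impression of both the typical overhead and the jitter, which is usually what matters for a distributed file system.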

Scalability Case Studies

I'm starting to build a community website from the ground up, and my stack will be ASP.NET and MySQL.
I want to start planning some scalability into the infrastructure early because I'm anticipating high traffic when the site goes live.
Are there any case studies you recommend reading in which ASP.NET or MySQL has been scaled and which demonstrate good scaling techniques?
I think it could be a challenge to find reference materials for that particular combination. Many .NET shops stick to SQL Server, and fewer use MySQL (at least at scale).
In general it would be appropriate to:
Follow general .NET practices for scalability. Weed out what is not appropriate for you.
Learn about database performance and implications of various design strategies such as denormalisation (when and why).
Consider out-of-process caching like memcached.
Review books on MySQL performance. Most of these are focused on UNIX platforms. Windows users may have problems applying some of these practices.
Read up on how other people are scaling their sites (Building Scalable Sites and The Art of Capacity Planning)
Consider how you might optimise your web design to be more scalable. Are you using AJAX? Work out what the impact of excessive polling may be etc.
Learn how to measure the performance of your application and database (starting points ASP.NET and MySQL).
Develop a plan for scaling your architecture (1 server to 2 servers, to multiple servers etc) so that you have some frame of reference for making decisions about building things in your system.
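As an illustration of the out-of-process caching point above, here is a minimal cache-aside sketch. It is only a sketch under stated assumptions: `TTLCache` is an in-memory stand-in for a real memcached client, and `cached_lookup` and `fake_db_query` are hypothetical names invented for this example.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """In-memory stand-in for an out-of-process cache such as memcached."""

    def __init__(self) -> None:
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        self._store[key] = (time.monotonic() + ttl_seconds, value)


def cached_lookup(cache: TTLCache, key: str,
                  load_from_db: Callable[[str], Any],
                  ttl_seconds: float = 60.0) -> Any:
    """Cache-aside: try the cache first, fall back to the database on a miss.

    Note: a loaded value of None is treated as a miss and is reloaded each time.
    """
    value = cache.get(key)
    if value is None:
        value = load_from_db(key)
        cache.set(key, value, ttl_seconds)
    return value


db_calls = []


def fake_db_query(key: str) -> dict:
    """Hypothetical database call; records each invocation for the demo."""
    db_calls.append(key)
    return {"id": key, "name": "example user"}


cache = TTLCache()
first = cached_lookup(cache, "user:42", fake_db_query)
second = cached_lookup(cache, "user:42", fake_db_query)
print(len(db_calls))  # 1: the second lookup was served from the cache
```

With a real out-of-process cache the same pattern lets several web servers share one cache, which is what makes it useful once you move from one server to many.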
I only know of one really good resource to read case studies about scalability techniques and I am really surprised no one has mentioned it. High Scalability
There are so many examples of "out of the box" thinking and different techniques for scaling that I think it makes a good read for anyone who is interested in the topic.
BrianLy said it best here:
"Develop a plan for scaling your architecture (1 server to 2 servers, to multiple servers etc) so that you have some frame of reference for making decisions about building things in your system."
As a forum I frequent says, 'quoted for truth'. All of his points are excellent, but this one is a key point that many people overlook. It doesn't matter how scalable your code and database are if you are running on a creaky old server. The hardware may not be as important as your code, improving it beyond a certain point will give diminishing returns VERY quickly, but do NOT forget to get your hardware to that point. If you have crap hardware, or even good hardware but not enough of it, your site will bomb out.
For MySQL scaling, you may find this interesting: danga livejournal

Resources