Openstack: How to decide hardware capacity? - openstack

I'm reading some OpenStack material recently, but didn't get a chance to try yet. I got the sense that Openstack could management a large number of virtual machines via API or dashboard interface. User could easily create/start virtual machines.
Then I come out a confusion. As the underlying computer hardware might vary, some computer maybe only able to host one virtual machine, some maybe ten. When user start a virtual machine, does user manually or Openstack automatically designate a hardware computer to host the virtual machine? In either case, how to decide the hardware computer's capacity? Does Openstack provide the functionality to set capacity attribute of hardware computer?

When you run OpenStack, each physical machine (which OpenStack calls compute hosts) will periodically report how many CPUs it has and how much RAM it has, as well as how many CPUs and how much RAM have been allocated to virtual machines that are currently running.
The OpenStack scheduler uses this information to determine which compute host to run a VM on. First, it checks to see if a host has enough CPUs (by applying the CoreFilter) and enough RAM (by applying the RamFilter). Compute hosts that don't have enough CPUs or RAM available won't even be considered.
Once it has a set of candidate hosts that have enough CPU and RAM, the scheduler needs to pick one of them. By default, the scheduler will use a "spread-first" strategy, allocating VMs to machines that have the most amount of CPU/RAM that isn't currently allocated to VM. It's possible to change this strategy to a "fill-first" behavior, so that the compute host with the least amount of free resources will get allocated first. This is configured by setting the nova.scheduler.least_cost.compute_fill_first_cost_fn parameter.
For more information, see the chapter on scheduling in the OpenStack Compute Admin guide.

Related

Do all servers have one base OS, like in RED HAT openstack architecture?

I'm a noob learning openstack. And The resources are all over the place tbh. I came across this image and would like to know one thing,
So, Suppose I have 100TB of storage and 10 server grade processors, and ram of 1TB, do all these resources make up of only one base OS- RED hat enterprise Linux? So, they sell resources to connect all the equipment and connect to install one single OS which can comprehend them all?
And Upon this, we throw an Openstack architecture so clients can use them as needed? Do we need as many NICs or the NICs virtual?
How to scale?
As you say, you just add a server. Install RHEL or another supported Linux distro (it's best to install the same distro and version on all servers), then OpenStack and configure it. The new server will register with the OpenStack controllers and can be used for launching virtual machines immediately.
The process is a bit more involved when you run a cloud with baremetal instances (i.e. you don't launch VMs but provision physical systems), but in principle it's the same.
by definition(at consumer scale-like one laptop) we need a network interface card for one IP
This is incorrect. You can configure multiple IP addresses on a single interface, even on your PC at home, even if that PC runs Windows.
An enterprise cloud requires connecting nodes to several networks. Usually, servers have several physical NICs, bond them together, and use VLANs or other multiplexing technologies to implement the networks. See this blog (five years old, but the principles still apply today, and it's well-written) for a good example of a real-world OpenStack network architecture.
Openstack uses one big special NIC
OpenStack can be deployed in many ways. It is not a shrink-wrapped solution. It can be used on servers with single NICs, bonded NICs, VLANs, normal networks, etc. Your statement is almost correct if you think of a typical deployment and a bond interface as a "big special NIC".
If you are interested to try this out at home, see the OpenStack installation tutorial. You will learn a lot.

Load testing should be done locally or remotely?

I am using a vps for my website so I don't believe I can access it from the local network or something.
I am using digitalocean as a vps.
So where should I install tools like ab, siege, jmeter etc. , locally on the vps / on my own computer (client) / on another droplet(vps) in the same region and connect to the web server droplet via private network?
From my understanding if I use those tools on the vps itself, they might use too much of the cpu and ram (same cpu and ram the web server uses) for the test to be correct.
On the other hand testing remotely might end up with bad values because of network bottleneck. Is this the case if I use another vps on the same subnet (digitalocean private network function for example)?
I am lost, both solutions seem wrong so what am I missing?
The best option is to install the load generator on another VPS residing in the same subnet as the application under test - this way you will be able to get more "clean" results not impacted by connect times / latency
Having both application under test and the load generator at the same machine is not recommended as load testing tools themselves are very resource intensive and you may run into the situation when both applications are "struggling" for resources hence load generator is not capable of sending requests fast enough and application under test cannot handle requests properly. In general it is recommended to keep an eye on resources consumption by the application under test/load generators in order to ensure that both have enough headroom, you will also be able to correlate increasing number of virtual users with increased resources consumption. You can use an APM tool or alternatively JMeter PerfMon Plugin if you don't have any alternatives in place.
As a fallback you can use your local machine for testing, however make sure that you have enough bandwidth (you can check it using i.e. https://www.speedtest.net/ service) and your ISP is aware of your plans and won't block you for the fraudulent action (as it might be considered a DOS attack)
We get good results using Unix machines from Amazon Webservices as load generator. You get not such a clean result like Dimitri mentioned, when the load generator is located in the same network. But you get a realistic result, like the enduser will get it too. With our scenario we evaluate some key values during execution like CPU, DB connections and amount of changed data sets in db during test. We repeat the test several times because there is always some variance in the result. The loadtest in the same network will deliver more stable results and can be compared to a measurement in a laboratory, but I think it is very good to know how your application behave in reality.

Does Google Compute Engine offer SR-IOV (Single Root I/O Virtualization)?

Amazon / AWS EC2 offers SR-IOV (Single Root I/O Virtualization) instances, which it dubs "enhanced networking" -- does Google offer this on Compute Engine?
Specifically, are any GCE instance types able to bypass the hypervisor and have direct access to a multi-queue NIC?
SRV-IOV support is needed to take advantage of Scylla DB's architecture?
HN Discussion: https://news.ycombinator.com/item?id=10262719
Currently Google Compute Engine does not offer SR-IOV. That said, SR-IOV is not strictly necessary to take advantage of Scylla's architecture.
GCE offers multi-queue networking and it is possible to directly user-mode assign the virtio-net queues using Intel's DPDK. This should allow our virtio-net NIC to work with Scylla, although at least at one point DPDK made certain qemu specific assumptions with respect to virtio-net (in particular it assumed Tx/Rx queue depths of 256 descriptors; the virtio-net NIC in GCE currently advertises 16,384 entry queues although this is likely to change in the near future).
For applications like Scylla this should offer superior network performance and better in-guest compute overhead over utilizing the kernel TCP/IP stack.
Additionally, for all GCE instances with >= 1 cores (i.e., not fractional core instances) we offer multi-Gbps throughput subject to fabric availability. Latency is likely to be lowest in zones with Haswell processors. We do not currently guarantee specific network characteristics, but we offer up to 2 Gbps/core of network throughput shared between the virtual NIC and any attached persistent disk volumes (Local SSD throughput does not count against this limit). Throughput wise this makes 8-vCPU and larger instances comparable to EC2 Enhanced Networking.
At the moment, nothing that we offer is similar to AWS' "enhanced networking".
You are more than welcome posting this as a Feature Request on our Compute Engine Issue tracker though, so we can look at implementing a similar feature.

How to get cpu cores utilizations in xen?

I have installed xen as hypervisor and there are dom0 and some paravirtualized machins as domu VMs on it.
I know xentop is used for checking the performance of system and virtual machines and I can read the output of it for measuring the virtual machine cpu utilization, But, it just gives the total usage of cpus!
So, is there any tool or any way to get cpu usages per cores?
I think you may be able to get what you want from XenMon. http://www.virtuatopia.com/index.php/Xen_Monitoring_Tools_and_Techniques
Also, try using xentop with the VCPUs option: -v or press V when inside xentop.
If you are using XenServer, then there are lots of interesting host metrics available, much more than the base Xen setup. Check out http://support.citrix.com/servlet/KbServlet/download/38321-102-714737/XenServer-6.5.0_Administrators%20Guide.pdf, Chapter 9.
It's all particularly good if you use XenCenter to view those metrics, but then I would say that since I wrote a significant portion of it ;-)

ASP.NET Hosting on Virtual Servers running on VMWare

My Company is running several international websites for selling insurance products.
Our current setup is a Webfarm with multiple Loadbalanced Webservers hosting our ASP.NET applications. The backend is a single - yet powerful - SQL Server. (all in one data center)
Our network admins want to move to virtual servers running on VMWare.
Scenarios could be
Webfarm: Multiple standard webservers, Loadbalanced (current setup), Session state on SQL Server
Virtual Webfarm: Multiple virtual servers, loadbalanced on one physical VMWare Host, Session state on SQL Server
2.a same as above but with multiple physical hosts
Single Virtual Webserver: One big powerful virtual webserver, no loadbalancing required, session state can be kept in process
There is a big hype around virtualization and I can see the benefits, but have no experience with this. I cannot tell what issues we will face and to what we should pay special attention.
Does anyone have experience with such a virtual setup?
What are general recommendations?
I tend towards 2a. I am afraid of having all webservers on one single physical machine.
Many thanks in advance to share your thoughts.
There are three reasons to use more than one webserver for an application:
Scaling - More grunt is required than one machine can provide
Reliability - Website should keep running in case of failure (a. hardware b. software)
Prioritization - One of the webservers takes on heavy work (perhaps scheduled tasks) leaving the other to respond to client requests quickly.
Marrying that up to you scenarios:
Scenario 1 provides 1, 2, 3
Scenario 2 provides 2b (perhaps 2a if it is fully hardware redundant (doubt it))
Scenario 2a provides 1, 2
Scenario 3 provides none of the above
Advantages of Virtual Hosting:
Lower Total Cost of Ownership (TCO) on big cluster serving multiple purposes is cost effective
New servers can be created quickly if needed
Redundant hardware is easier to justify if the cost is shared among many applications
Disadvantages:
Other virtual machines may suck away your CPU/Disk IO capacity
IMHO there is little point to load balancing multiple virtual machines on the same virtual server.
Robert's pretty much covered it all, I'm mostly just adding a note to say that at least one of our clients is currently running with option 2a.
So we have multiple loadbalanced web servers running on a couple of VM hosts, talking to a non-virtualised SQL cluster - this works quite well for them.
One other advantage of virtualisation is that it allows you to more fully utilise your hardware - however, you need to be aware that if you're running your virtual host at around 90% capacity with multiple VMs, you've not got a lot of spare capacity for any traffic spikes - if you're not expecting any, then great, but if you are, you'll need to have something in place to cope.
I agree with all of the above answers, and I actually work at a webhost. :-) If you're using multiple load-balanced webservers now then I can only assume the reason for it is either
Hardware Redundancy: If a single app server fails then those sessions are lost, but the app keeps running on the other servers and users can immediately re-connect.
or
Application Load Distribution (it's late so I can't think of a better name): Your traffic dictates that you have multiple app servers since all of your users would crash a single app server.
If #1 is the reason, then going to VMWare defeats the purpose since you only have one server supporting everything, and in case of hard drive crash, etc, you are down while it is repaired. If #2 is the reason then a VMWare based solution MAY work, however keep in mind that the hardware you'd use would almost necessarily be of a higher caliber than what you're currently using. So you maybe get more bang for your buck, but you STLL lose the redundancy that multiple physical machines gave you.
Now, you could always combine the two by having multiple physical machines all running VMWare, but that adds a level of complexity to things that you may not necessarily want either.
It doesn't sound like there would be any tangible benefit from running multiple virtual servers on the same physical host, you're just adding overhead. Unless I'm missing something with the way you've described the setup, there wouldn't be any benefit at all from moving to VMware - unless you're looking at taking advantage of features such as VMotion
VMware is most useful for consolidating underutilized hardware. If your hardware is running at near-capacity during peak periods then you don't want to run multiple VMs on the one machine.
There are benefits to Virtualization but your network admins need to prove that there is a benefit for your company before you even consider switching. I would say if you have multiple apps running on dedicated servers with low traffic (i.e. each app has it's own physical server) then sure, Virtualize. If you have one app over many servers, then don't.
You should be able to use virtual machine hosts with multiple vm per host and load balance across all of them.
Microsoft is doing this with msdn and technet http://virtualization.info/en/news/2008/05/microsoft-migrates-msdn-and-technet-on.html.

Resources