Setting up an OpenStack compute node with a fake hypervisor

I'm trying to set up OpenStack compute nodes that mimic real nodes but never actually create VMs on a physical host.
In the OpenStack tests, fake drivers (defined in nova/virt/fake.py) are used through a complex system of testing classes.
I want to get such a node up and running outside of a test (meaning I don't want to use these classes to spawn the compute node), on an actual VM or container. However, I cannot figure out how to get a compute process to run with this fake hypervisor (or, more specifically, with one that I define myself).
How do I inject this fake driver instead of the real driver in a compute node?
(Also, I'm installing OpenStack using devstack (latest).)
To clarify, my goal is to stress test OpenStack by running multiple fake compute nodes, not an all-in-one configuration. Devstack is used to set up the controller node only to simplify the process, but the system should be:
A controller node, running the core services (Nova, Glance, Keystone etc.).
Multiple compute nodes, using fake hypervisors on different machines.

When installing a new compute node, a configuration file, nova-compute.conf, is created automatically.
It seems that in /etc/nova/nova-compute.conf there is an option:
compute_driver = libvirt.LibvirtDriver
This makes libvirt the default hypervisor for a compute node. According to the nova configuration documentation, in addition to hyperv, vmwareapi and xenapi, one can choose the fake driver by changing this option to:
compute_driver = fake.FakeDriver
To make the fake driver use our own implementation, we can replace the fake driver written in fake.py with something else.
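As a rough illustration of the option change above, here is a minimal Python sketch that rewrites the setting with the standard configparser module. It assumes compute_driver sits in the [DEFAULT] section; the helper name set_compute_driver and the sample text are made up for the example. On a real node you would apply it to the contents of /etc/nova/nova-compute.conf and restart the nova-compute service afterwards.

```python
import configparser
import io


def set_compute_driver(conf_text, driver="fake.FakeDriver"):
    """Return conf_text with [DEFAULT] compute_driver set to `driver`.

    Hypothetical helper. Note that configparser drops comments when
    it writes the file back out.
    """
    # strict=False tolerates repeated keys; interpolation=None keeps '%' literal.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # preserve option name case
    parser.read_string(conf_text)
    parser["DEFAULT"]["compute_driver"] = driver
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Sample input standing in for the real file contents (assumption).
SAMPLE = "[DEFAULT]\ncompute_driver = libvirt.LibvirtDriver\n"
UPDATED = set_compute_driver(SAMPLE)
print(UPDATED)
```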

Related

Kaa cluster architecture

I don't really understand the Kaa cluster architecture. First, I need to install and configure the Kaa components on a single Linux node by following this link: http://kaaproject.github.io/kaa/docs/v0.10.0/Administration-guide/System-installation/Single-node-installation/
I need to install SQL, NoSQL and ZooKeeper on it. Does this mean that the single node is actually a cluster? I want to implement scalability and high availability. Do I need to clone the single node to implement failover?
The Kaa cluster architecture is:
http://kaaproject.github.io/kaa/docs/v0.10.0/Architecture-overview/
To set up and configure a Kaa cluster, follow the instructions on the Kaa Cluster setup documentation page. The Single Node Installation page describes which Kaa dependencies should be installed and how they should be configured, but because it covers a single-node installation, they are all placed on one node and configured accordingly.
The cluster setup is much like the single-node installation, but it requires more nodes and extra configuration so that they cooperate correctly.
Thus, the difference between a Kaa cluster and single-node operation is generally in the configuration of the components rather than in the components themselves.
Therefore, you can clone a single-node Kaa server as a basis for a cluster node, but you will need to change its configuration accordingly before it can operate correctly as a cluster node.

Kubernetes and MPI

I want to run an MPI job on my Kubernetes cluster. The context is that I'm actually running a modern, nicely containerised app but part of the workload is a legacy MPI job which isn't going to be re-written anytime soon, and I'd like to fit it into a kubernetes "worldview" as much as possible.
One initial question: has anyone had any success in running MPI jobs on a kube cluster? I've seen Christian Kniep's work in getting MPI jobs to run in docker containers, but he's going down the docker swarm path (with peer discovery using consul running in each container) and I want to stick to kubernetes (which already knows the info of all the peers) and inject this information into the container from the outside. I do have full control over all the parts of the application, e.g. I can choose which MPI implementation to use.
I have a couple of ideas about how to proceed:
1) fat containers containing slurm and the application code -> populate the slurm.conf with appropriate info about the peers at container startup -> use srun as the container entrypoint to start the jobs
2) slimmer containers with only OpenMPI (no slurm) -> populate a rankfile in the container with info from outside (provided by kubernetes) -> use mpirun as the container entrypoint
3) an even slimmer approach, where I basically "fake" the MPI runtime by setting a few environment variables (e.g. the OpenMPI ORTE ones) -> run the mpicc'd binary directly (where it'll find out about its peers through the env vars)
4) some other option
5) give up in despair
I know trying to mix "established" workflows like MPI with the "new hotness" of kubernetes and containers is a bit of an impedance mismatch, but I'm just looking for pointers/gotchas before I go too far down the wrong path. If nothing exists I'm happy to hack on some stuff and push it back upstream.
I tried running MPI jobs on Kubernetes for a few days and solved it by using dnsPolicy: None and dnsConfig (the CustomDNS=true feature gate is needed).
I pushed my manifests (as a Helm chart) here:
https://github.com/everpeace/kube-openmpi
I hope it helps.
Assuming you don't want to use a hardware-specific MPI library (for example, anything that uses direct access to the communication fabric), I would go with option 2.
1) First, implement a wrapper for mpirun which populates the necessary data using the Kubernetes API, specifically using endpoints if you use a service (might be a good idea). It could also scrape the pods' exposed ports directly.
2) Add some form of checkpoint program that can be used for "rendezvous" synchronization before starting the actual code (I don't know how well MPI deals with ephemeral nodes). This ensures that when mpirun starts, it has a stable set of pods to use.
3) Finally, build a container with the necessary code and, I guess, an SSH service for mpirun to use for starting processes in other pods.
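To make the wrapper idea a bit more concrete, here is a minimal sketch of the part that turns a list of peers into something mpirun can read. It writes an Open MPI hostfile ("<host> slots=<n>" per line) rather than the rankfile mentioned in the question, since that is the simpler format. The peer names are hypothetical; in practice they would come from the Kubernetes API, and build_hostfile is just a name picked for this example.

```python
def build_hostfile(hosts, slots_per_host=1):
    """Render Open MPI hostfile text, one '<host> slots=<n>' line per peer.

    Hosts are de-duplicated and sorted so every pod generates the
    same file from the same peer set.
    """
    if slots_per_host < 1:
        raise ValueError("slots_per_host must be at least 1")
    lines = [f"{host} slots={slots_per_host}" for host in sorted(set(hosts))]
    return "\n".join(lines) + "\n"


# Hypothetical pod hostnames; a real wrapper would look these up.
PEERS = ["mpi-worker-1", "mpi-worker-0", "mpi-worker-1"]
HOSTFILE = build_hostfile(PEERS, slots_per_host=2)
print(HOSTFILE)
```

The resulting file could then be passed to mpirun with its --hostfile option, for example mpirun --hostfile <file> -np 4 <binary>.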
Another interesting option would be to use Stateful Sets, possibly even running SLURM inside, to implement a "virtual" cluster of MPI machines running on kubernetes.
This provides stable hostnames for each node, which would reduce the problem of discovery and keeping track of state. You could also use statefully-assigned storage for each container's local work filesystem (which, with some work, could be made to always refer to the same local SSD, for example).
Another benefit is that it would probably be the least invasive to the actual application.
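The stable hostnames are predictable enough to compute up front. A small sketch, assuming the StatefulSet is backed by a headless service and the cluster uses the default cluster.local domain; the names "mpi" and "mpi-svc" are placeholders for this example:

```python
def statefulset_hostnames(name, service, replicas,
                          namespace="default", cluster_domain="cluster.local"):
    """List the DNS names Kubernetes gives StatefulSet pods.

    Pods are named <name>-<ordinal>, and each one is resolvable as
    <pod>.<service>.<namespace>.svc.<cluster_domain> when the governing
    service is headless.
    """
    return [
        f"{name}-{ordinal}.{service}.{namespace}.svc.{cluster_domain}"
        for ordinal in range(replicas)
    ]


# Placeholder names (assumptions, not taken from any real deployment).
NODES = statefulset_hostnames("mpi", "mpi-svc", replicas=3)
for node in NODES:
    print(node)
```

Because the list depends only on the set name and replica count, every pod can derive the full peer list without asking the API server.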

openstack: relation between controller & compute nodes

I just started playing with OpenStack, and there are many things I still don't understand. As I see it, to start a VM instance, we normally execute some commands on the controller, e.g.
glance image-create
nova boot
But how does the controller know:
1) on which compute node to start the VM
2) how many compute nodes it has
Where does it get this information?
The controller determines where to launch the instance based on the information provided by nova-scheduler:
http://docs.openstack.org/juno/config-reference/content/section_compute-scheduler.html
As for how many compute nodes are recognized, this is determined when you register a compute node with nova compute on the controller. Here is a reference for how compute is installed and configured for RHEL/CentOS/Fedora:
http://docs.openstack.org/juno/install-guide/install/yum/content/ch_nova.html
I'd suggest learning the OpenStack software architecture for such questions; for example, look at this page: http://docs.openstack.org/openstack-ops/content/example_architecture.html.
Simply speaking, OpenStack saves all of its configuration in a database (MySQL by default), so the controller knows all of this information. A Nova component named nova-scheduler, running as a controller service, decides where to place a VM among all available hosts.
A good starting point is to deploy a multi-node environment. You will learn how OpenStack works during the deployment procedure.
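To give a feel for what "deciding where to place a VM" means, here is a toy sketch of the filter-then-pick idea. It is not Nova's code: the Host fields, the pick_host name and the numbers are all invented, and the real scheduler works from host state reported by the compute nodes and supports many more filters.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    free_ram_mb: int
    free_vcpus: int


def pick_host(hosts, ram_mb, vcpus):
    """Drop hosts that cannot fit the request, then prefer the most free RAM."""
    candidates = [
        h for h in hosts
        if h.free_ram_mb >= ram_mb and h.free_vcpus >= vcpus
    ]
    if not candidates:
        return None  # no valid host for this request
    return max(candidates, key=lambda h: h.free_ram_mb)


# Invented host states for illustration.
HOSTS = [
    Host("compute1", free_ram_mb=2048, free_vcpus=4),
    Host("compute2", free_ram_mb=8192, free_vcpus=2),
    Host("compute3", free_ram_mb=4096, free_vcpus=8),
]
CHOSEN = pick_host(HOSTS, ram_mb=4096, vcpus=4)
print(CHOSEN.name if CHOSEN else "no valid host")
```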

devstack multi node installation

I have 3 nodes which I am using for a multi-node setup. I am thinking of using the following structure:
Controller: keystone, horizon, g-reg, g-api, n-api, n-crt, n-sch, n-cond, n-cauth, n-obj, n-novnc, n-xvnc, c-api, c-sch (this node will have mysql and rabbitmq as well)
Network: q-svc, q-agt, q-dhcp, q-l3, q-meta, quantum
Compute: n-cpu, c-vol
I have a few questions.
1) In the compute node, do I need to keep n-api? Also, what else is needed apart from n-api and c-vol? Is q-agt needed on the compute node?
2) Will I need c-api along with c-vol? Does the compute node need RabbitMQ installed?
Q1)
You generally don't want nova-api on the compute nodes. It's better on the controller.
nova-api makes use of system credentials hard-coded in its paste file, and you don't want that paste file exposed on any node that a user may compromise with a hypervisor escape.
nova-compute and nova-volume are probably all you need. They do communicate with the scheduler over RabbitMQ, so make sure that's working =P
Q2)
You don't NEED cinder to run an OpenStack cloud, though I see no reason not to include it.
I don't know what impact disabling cinder has on the devstack stack.sh script; I've never done it.
As for RabbitMQ, see the answer above.
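For reference, devstack picks the services for a node from its ENABLED_SERVICES setting, so the layout from the question can be written down per node. The sketch below only renders those lines; the service names are copied verbatim from the question and have not been checked against any particular devstack release, and the helper name is made up.

```python
# Service layout exactly as listed in the question (unverified names).
LAYOUT = {
    "controller": [
        "keystone", "horizon", "g-reg", "g-api", "n-api", "n-crt", "n-sch",
        "n-cond", "n-cauth", "n-obj", "n-novnc", "n-xvnc", "c-api", "c-sch",
    ],
    "network": ["q-svc", "q-agt", "q-dhcp", "q-l3", "q-meta", "quantum"],
    "compute": ["n-cpu", "c-vol"],
}


def enabled_services_line(services):
    """Render a devstack ENABLED_SERVICES line from a list of names."""
    return "ENABLED_SERVICES=" + ",".join(services)


LINES = {role: enabled_services_line(svcs) for role, svcs in LAYOUT.items()}
for role, line in LINES.items():
    print(f"# {role}")
    print(line)
```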

Should nova-api run on different compute nodes?

I am working with OpenStack (Folsom) and I want to deploy it across different compute nodes. Is it necessary to run the Nova API service on every node?
My requirement seems to imply that every compute node needs a nova-api service, but I don't think that makes sense.
In my understanding, only one nova-api service is required in the whole cloud system.
Request -> nova-api -> nova-scheduler to determine which node to use.
Yes, I think so. According to the official OpenStack guide, Installing Additional Compute Nodes, only the dependencies and the nova-* components, or just the nova-compute package, need to be installed on an additional compute node.
In general, you only need one nova-api service running.
However, if your networking is configured for multi-host, then you will need to run a metadata service on each compute node. In this scenario, you need to run nova-api-metadata service on each compute node.
It is not necessary to run the nova-api service on every compute node. However, if you are using one of the available images with a cloud-init script that looks for metadata from the Nova API, then you need to install it on every compute node.
If you can build your own VM image without cloud-init scripts, then it will not be required.
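The metadata lookup those images perform is a plain HTTP request from inside the guest to the link-local address 169.254.169.254. A minimal sketch of such a request, using only the standard library; the function names are invented here, and it returns None when no metadata service answers (which is what happens if nothing is serving it for that compute node):

```python
import urllib.request

METADATA_HOST = "169.254.169.254"  # link-local metadata address


def metadata_url(path="latest/meta-data/"):
    """Build the URL for a metadata path (EC2-style path by default)."""
    return f"http://{METADATA_HOST}/{path.lstrip('/')}"


def fetch_metadata(path="latest/meta-data/", timeout=2.0):
    """Return the metadata body as text, or None if the service is unreachable."""
    try:
        with urllib.request.urlopen(metadata_url(path), timeout=timeout) as resp:
            return resp.read().decode("utf-8")
    except OSError:  # covers URLError, timeouts and connection errors
        return None
```

Calling fetch_metadata() from inside an instance is a quick way to check whether the metadata service is reachable from that compute node.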
