Setting up an Artifactory instance to mirror another Artifactory

We would like to have two JFrog Artifactory instances: one for users inside the company's local network (with completely open access) and one that can be used from outside (with restricted access). We want the second instance to mirror some (or all) of the repositories of the first one. Where should I start?

You would probably want to set up Remote Repositories on the outer instance that point to the inner-network instance (the inner instance has to be reachable from outside your network, but only with the credentials you specify in the remote repository's configuration) so that they can serve whatever is in the inner instance's repositories when requested.
Another option is to use Replication (which runs periodically) to sync (mirror) a repository on the outside instance with the inner one.
I would probably go with the first option, as you will not have to wait for replication cycles to complete to be sure the outer instance is fully synced; replication is more of a DR use case.
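If you go with remote repositories, one way to set them up is through the Artifactory REST API. A minimal sketch, assuming an outer instance proxying a libs-release-local repository on the inner instance (the hostnames, repository keys and credentials are placeholders, and the exact JSON fields can vary by Artifactory version):
# Create a remote repository on the outer instance that proxies the inner one
curl -u admin:password -X PUT \
  "https://outer.example.com/artifactory/api/repositories/libs-release-remote" \
  -H "Content-Type: application/json" \
  -d '{
    "key": "libs-release-remote",
    "rclass": "remote",
    "packageType": "maven",
    "url": "https://inner.example.com/artifactory/libs-release-local",
    "username": "mirror-user",
    "password": "mirror-password"
  }'
The credentials in the remote repository configuration are the ones the outer instance uses to reach the inner instance, so only that account needs to be reachable from outside.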

Related

Is it possible to create a load balancer in GCP that works without instance groups in a regional scope?

The problem is that I have 3 virtual machines created from the same source image in three different zones of a region. I can't put them in a MIG because each of them has to attach to a specific persistent disk, and from what I researched I have no control over which VM in the MIG will be attached to which persistent disk (please correct me if I'm wrong). I explored the unmanaged instance group option too, but it only has zonal scope. Is there any way to create a load balancer that works with my VMs, or do I have to build another solution (e.g. NGINX)?
You have two options.
1. Create an Unmanaged Instance Group. This allows you to set up your VM instances as you want, and you can create multiple instance groups within the region (one per zone).
Creating groups of unmanaged instances
2. Use a Network Endpoint Group. This supports specifying backends based upon the Compute Engine VM instance's internal IP address. You can also specify other methods, such as an Internet address.
Network endpoint groups overview
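As a rough gcloud sketch of option 1 (the zone, instance and backend-service names are placeholders, and this assumes a global backend service and health check already exist):
# One unmanaged instance group per zone, containing the existing VM in that zone
gcloud compute instance-groups unmanaged create web-ig-b --zone=us-central1-b
gcloud compute instance-groups unmanaged add-instances web-ig-b --zone=us-central1-b --instances=vm-b

# Repeat for the other two zones, then attach each group to the backend service
gcloud compute backend-services add-backend web-backend \
  --instance-group=web-ig-b --instance-group-zone=us-central1-b --global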
The solution in my case was using a dedicated VM, installing nginx as a load balancer and creating static IPs for each VM in the group. I couldn't implement managed or unmanaged instance groups and a managed load balancer.
That worked fine, but another solution I found in quicklabs was adding all instances to an "instance pool"; maybe in the future I will implement that solution.

Kubernetes and MPI

I want to run an MPI job on my Kubernetes cluster. The context is that I'm actually running a modern, nicely containerised app but part of the workload is a legacy MPI job which isn't going to be re-written anytime soon, and I'd like to fit it into a kubernetes "worldview" as much as possible.
One initial question: has anyone had any success in running MPI jobs on a kube cluster? I've seen Christian Kniep's work in getting MPI jobs to run in docker containers, but he's going down the docker swarm path (with peer discovery using consul running in each container) and I want to stick to kubernetes (which already knows the info of all the peers) and inject this information into the container from the outside. I do have full control over all the parts of the application, e.g. I can choose which MPI implementation to use.
I have a couple of ideas about how to proceed:
- fat containers containing slurm and the application code -> populate the slurm.conf with appropriate info about the peers at container startup -> use srun as the container entrypoint to start the jobs
- slimmer containers with only OpenMPI (no slurm) -> populate a rankfile in the container with info from outside (provided by kubernetes) -> use mpirun as the container entrypoint
- an even slimmer approach, where I basically "fake" the MPI runtime by setting a few environment variables (e.g. the OpenMPI ORTE ones) -> run the mpicc'd binary directly (where it'll find out about its peers through the env vars)
- some other option
- give up in despair
I know trying to mix "established" workflows like MPI with the "new hotness" of kubernetes and containers is a bit of an impedance mismatch, but I'm just looking for pointers/gotchas before I go too far down the wrong path. If nothing exists I'm happy to hack on some stuff and push it back upstream.
I tried MPI jobs on Kubernetes for a few days and solved it by using dnsPolicy: None and dnsConfig (the CustomDNS=true feature gate is needed).
I pushed my manifests (as a Helm chart) here:
https://github.com/everpeace/kube-openmpi
I hope it helps.
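Roughly, using it amounts to cloning the repository and installing the chart with Helm; the chart path, release name and any values below are assumptions, so check the repository's README for the actual procedure:
git clone https://github.com/everpeace/kube-openmpi
cd kube-openmpi
# Install the chart into the cluster (Helm v2 syntax; path and release name are placeholders)
helm install --name my-mpi-cluster ./chart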
Assuming you don't want to use a hardware-specific MPI library (for example anything that uses direct access to the communication fabric), I would go with option 2.
1. First, implement a wrapper for mpirun which populates the necessary data using the kubernetes API, specifically using endpoints if you use a service (might be a good idea); it could also scrape the pods' exposed ports directly. A rough sketch of such a wrapper is shown after these steps.
2. Add some form of checkpoint program that can be used for "rendezvous" synchronization before starting the actual code (I don't know how well MPI deals with ephemeral nodes). This is to ensure that when mpirun starts it has a stable set of pods to use.
3. And finally, actually build a container with the necessary code and, I guess, an SSH service for mpirun to use for starting processes in the other pods.
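A minimal sketch of the wrapper from step 1, assuming the worker pods sit behind a headless Service called mpi-workers, kubectl is available in the launcher pod, and the application binary is /app/my_mpi_prog (all of these names are placeholders):
#!/bin/sh
# Turn the service's endpoint IPs into an OpenMPI hostfile.
discover() {
  kubectl get endpoints mpi-workers \
    -o jsonpath='{range .subsets[*].addresses[*]}{.ip}{"\n"}{end}' > /tmp/hostfile
}

# Crude "rendezvous": wait until the expected number of pods is registered.
discover
while [ "$(wc -l < /tmp/hostfile)" -lt "${EXPECTED_WORKERS:-4}" ]; do
  sleep 2
  discover
done

# Launch the job across the discovered pods (OpenMPI starts remote ranks over SSH).
mpirun --hostfile /tmp/hostfile -np "$(wc -l < /tmp/hostfile)" /app/my_mpi_prog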
Another interesting option would be to use Stateful Sets, possibly even running SLURM inside, to implement a "virtual" cluster of MPI machines running on kubernetes.
This provides stable hostnames for each node, which would reduce the problem of discovery and keeping track of state. You could also use statefully-assigned storage for the container's local work filesystem (which, with some work, could be made to always refer to, for example, the same local SSD).
Another benefit is that it would probably be the least invasive to the actual application.

Defining Apigee Target Servers hosts (the right way) in an HA Architecture

I'm on 14.04 On-Prem
I have an Active and DR setup
see here: http://www.slideshare.net/michaelgeiser/apigee-dc-failover
When I failover to the DR site, I update my DNS entry (at Akamai)
Virtual hosts work fine; Target Servers are giving me a headache
How can I setup and work with the Target Servers so I do not have to modify the Proxy API bundle but have traffic flow to the right VIP based on the DC?
I prefer not to do something like MyService-Target-dc1 and MyService-Target-dc2 and use the deploy script to modify the target name in the bundle.
I do not want a JavaScript that modifies the target or anything else in the Proxy API; I need to define this in the Environment setup.
I also cannot put the two DCs each into a separate Org; I need to use the same API Key when I move between the Active and DR sites; different Orgs mean different API Keys (right?).
TIA
One option is to modify the DNS lookup on each set of MPs (Message Processors) per DC so that a name like 'myservice.target.dc' resolves to a different VIP in each DC. You'll of course want to document this well, especially since it is external to the Apigee product.
I know you weren't too keen on modifying the target, but if you were open to that option, you could try using the host header of an ELB in front (if you have one) or client IP address (e.g., in geo-based routing) to identify which DC a call is passing through. From there, you can modify the target URL.
And yes, different Orgs do mean different API keys.
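For illustration, the per-DC override from the first suggestion can be as crude as a hosts-file entry on each Message Processor (the name and VIPs below are placeholders):
# On the DC1 Message Processors
echo "10.10.1.100  myservice.target.dc" >> /etc/hosts
# On the DC2 Message Processors
echo "10.20.1.100  myservice.target.dc" >> /etc/hosts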
You can try Named Target Servers. They are part of the Load Balancing function, but you can set them up individually and have different targets for different environments. See:
Load Balancing Across Backend Servers
http://apigee.com/docs/api-services/content/load-balancing-across-backend-servers
Create a Named Target Server
http://apigee.com/docs/management/apis/post/organizations/%7Borg_name%7D/environments/%7Benv_name%7D/targetservers
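A rough sketch of creating such a target server per environment with the management API referenced above (the management host, org/env names, credentials and VIP hostname are all placeholders):
# Define the same logical name in each environment, pointing at that DC's VIP;
# the proxy bundle then only references "MyService-Target" and never changes.
curl -u admin@example.com -X POST \
  "http://management-server:8080/v1/organizations/myorg/environments/prod-dc1/targetservers" \
  -H "Content-Type: application/json" \
  -d '{ "name": "MyService-Target", "host": "vip-dc1.example.com", "port": 443, "isEnabled": true }'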

how to get instances back after reboot in openstack

After a successful installation of devstack and launching instances, once I reboot the machine I have to start all over again and I lose all the instances that were launched. I tried rejoin-stack but it did not work. How can I get the instances back after a reboot?
You might set resume_guests_state_on_host_boot = True in nova.conf. The file should be located at /etc/nova/nova.conf
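A minimal sketch of that change (crudini is optional, editing the [DEFAULT] section by hand works just as well; restart nova-compute afterwards):
# Ask nova to resume guests that were running before the host rebooted
crudini --set /etc/nova/nova.conf DEFAULT resume_guests_state_on_host_boot True
# equivalent hand edit in /etc/nova/nova.conf:
#   [DEFAULT]
#   resume_guests_state_on_host_boot = True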
I've found some old discussion http://www.gossamer-threads.com/lists/openstack/dev/8772
AFAIK, at the present time OpenStack (Icehouse) is still not completely aware of the environment inside it, so it can't restore it completely after a reboot. The instances will be there (virsh domains), but even if you start them manually or via the nova flag, I'm not sure whether the other facilities will handle this correctly (e.g. whether neutron will correctly configure all L3 rules according to the DB records, etc.). Honestly, I'm pretty sure they won't...
The answer depends on what you need to achieve:
If you need a template environment (e.g. a similar set of instances and networks each time after reboot) you may just script everything. In other words, write a bash script that creates everything you need and run it each time after stack.sh (a rough sketch is shown after this answer). Make sure you're starting with a clean environment, since the OpenStack DB state remains between ./unstack and ./stack.sh or ./rejoin-stack.sh (you might try to just clean the DB, or delete it; stack.sh will build it back).
If you need a persistent environment (e.g. you don't want to lose the VMs and the whole infrastructure state after a reboot), I'm not aware of a way to do this with OpenStack. E.g. the neutron agents (they configure iptables, dhcp, etc.) do not save state and are driven by events from the Neutron service. They will not restore after a reboot, so the network will be dead. I'll be very glad if someone shares a method to do such a recovery.
In general I think OpenStack is not focusing on this and will not focus on it during the nearest release cycles. The common approach is to have a multi-node environment where each node is replaceable.
See http://docs.openstack.org/high-availability-guide/content/ch-intro.html for reference
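A rough sketch of the "script everything" approach from the first case above, using the unified openstack CLI (image, flavor, network names and addressing are placeholders; on an older devstack the equivalent nova/neutron commands apply):
#!/bin/bash
# Recreate a known-good set of networks and instances after each stack.sh run.
source openrc admin admin

openstack network create demo-net
openstack subnet create demo-subnet --network demo-net --subnet-range 10.0.10.0/24

for i in 1 2 3; do
  openstack server create --image cirros-0.3.5-x86_64-disk --flavor m1.tiny \
    --network demo-net "vm-$i"
done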
devstack is an ephemeral environment. It is not supposed to survive a reboot; this is not supported behavior.
That being said, you might find success in re-initializing the environment by running
./unstack.sh
followed by
./stack.sh
again.
Again, devstack is an ephemeral environment. Its primary purpose is to run gate testing for OpenStack's CI infrastructure.
Or try ./rejoin-stack.sh to re-join the previous screen sessions.

launching adhoc docker instances: Is it recommended to launch a docker instance per request?

Is it recommended to launch a docker instance per request?
I have either lighttpd or Nginx running on my web server as a reverse proxy. I support a number of subdomains with very low usage. When a request for a subdomain arrives I want to start the docker instance. Preferably I'd like to launch them dynamically, so that if more than one user arrives I would launch one per user... and/or a shared instance (determined by configuration).
Originally I said this should work well for low traffic sites, but upon further thought, no, this is a bad idea.
Each time you launch a Docker container, it adds a read-write layer on top of the image. Even if very little data is written, the layer exists, and each request will generate one. When a single user visits a website, rendering the page will generate tens to thousands of requests, for CSS, for JavaScript, for each image, for fonts, for AJAX, and each of these would create one of those read-write layers.
Right now there is no automatic cleanup of the read-write layers -- they persist even after the Docker container has exited. By default, nothing is lost.
So, even for a single low traffic site, you would find your disk use growing steadily over time. You could add your own automated cleanup.
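For example, a cron-driven cleanup could be as simple as removing exited containers (assuming the per-request containers are never reused):
# Remove all exited containers, reclaiming their read-write layers
docker rm $(docker ps -a -q -f status=exited)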
Then there is the second problem: anything uploaded to the website would not be available to any other requests unless it was written to some out-of-container shared storage. That's pretty easy to do with S3 or a separate and persistent database service, but it does start showing the weakness in the "one new Docker container per request" approach. If you're going to have some persistent services, why not make the Docker containers more persistent and run them longer?
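If you do go down this road, anything users upload has to live outside the container, for example on a host volume or an external store (the paths and image name below are placeholders):
# Per-request container whose upload directory is a shared host volume
docker run -v /srv/example.com/uploads:/var/www/uploads example/site-image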
