AFAIK, Artifactory keeps two replicas of each artifact.
1. In a cluster of N nodes with local storage (no shared filesystem), how does Artifactory pick which nodes will get the two replicas when an artifact is first deployed?
2. When a client requests artifact X, how does Artifactory determine which nodes host the given artifact, and how does it decide which node will serve the artifact for this request?
3. Is there artifact-to-node affinity? In other words, if node1 and node2 have the two replicas of artifact X and I manually move the artifact from node1 to node7, will Artifactory attempt to revert this move? If such affinity exists, is it stored in the DB? Can it be modified? The reason I am asking is that in my cluster of N nodes, some artifact hashes are present on all N nodes, so I have N replicas. I'd like to remove some of them to reduce disk space pressure.
Thank you. I can provide more details, if necessary.
AFAIK, Artifactory keeps two replicas of each artifact
This only applies if you are using a sharding-type configuration, for example double shards, redundant shards, or a clustered filesystem. In either case, the number of replicas is configurable.
In a cluster of N nodes with local storage (no shared filesystem), how does Artifactory pick which nodes will get the two replicas when artifact is first deployed to Artifactory?
The node that received the request gets a copy, and then as many other nodes as are needed to meet the redundancy requirement.
For sharding, this is configurable with the writeBehavior property (where you can specify roundRobin, freeSpace, or percentageFreeSpace).
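The redundancy and write behavior for local storage in an HA cluster live in binarystore.xml. A minimal sketch, assuming the sharding-based cluster template and parameter names from the JFrog documentation; verify the exact element names against the binarystore reference for your Artifactory version:

    <!-- binarystore.xml (sketch only, not a complete configuration) -->
    <config version="2">
        <chain template="cluster-file-system"/>
        <provider id="sharding-cluster" type="sharding-cluster">
            <redundancy>2</redundancy>                  <!-- replicas kept per artifact -->
            <writeBehavior>roundRobin</writeBehavior>   <!-- or freeSpace / percentageFreeSpace -->
        </provider>
    </config>

The redundancy value is what controls how many replicas Artifactory tries to maintain.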
When a client requests artifact X, how does Artifactory determine which nodes host given artifact and how does it decide which node will provide the artifact for this request?
Is there artifact-to-node affinity? In other words, if node1 and node2 have two replicas of artifact X, and I manually move the artifact from node1 to node7, will Artifactory attempt to revert this move?
There is no affinity.
The node that received the request will respond to it. If it does not have the binary, it will stream it from a node that does and then send it to the user.
I am trying to run the Example CorDapp on two different VMs, with the Notary and PartyC on the 1st server and PartyA and PartyB on the 2nd server.
I followed the steps here: Corda nodes: how to connect two independent pc as two nodes
In the conf files of:
Notary and PartyC - I have edited the P2P address
PartyA and PartyB - I have edited the P2P address
With the above conf files, I ran the Network Bootstrapper jar on server 1, copied the PartyA and PartyB folders into another copy of the example CorDapp on server 2, and started the Notary and the parties one by one in the corresponding VMs.
All nodes started successfully, and when I try to execute an IOU flow from PartyC (on server 1) to PartyB (on server 2), it pauses at the "Collecting counterparties signature" step without proceeding further. Below is what I see in PartyC's console:
[Screenshot of PartyC's console output]
The flow getting stuck in CollectSignaturesFlow means the initiating node is not able to get a response from the counterparty node.
CollectSignaturesFlow internally establishes a session with the counterparty node and shares the transaction data to get the signature.
Since your nodes are on separate machines, they are probably not able to see each other. Generally, if nodes are hosted on separate VMs, the VMs must have public IPs or must be on a network where they can reach each other.
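Concretely, the P2P address in each node's node.conf must be an address the nodes on the other server can actually reach, and that port must be open in the VM's firewall / security group. A hypothetical fragment (illustrative IP and ports only):

    // node.conf for PartyB on server 2 (example values, not from the question)
    myLegalName = "O=PartyB,L=New York,C=US"
    p2pAddress = "203.0.113.12:10008"      // public IP of server 2, reachable from server 1
    rpcSettings {
        address = "localhost:10009"
        adminAddress = "localhost:10049"
    }

A quick check is to telnet from server 1 to that IP and port; if the connection fails, the CollectSignaturesFlow session can never be established.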
So I've read several articles and looked through the OpenStack docs for the definition of a node.
Node
A Node is a logical object managed by the Senlin service. A node can be a member of at most one cluster at any time. A node can be an orphan node, which means it doesn't belong to any cluster.
Node types
According to the Oracle docs, there are different node types (controller node, compute node, etc.). What I'm confused about is whether a single node is a single physical host. Does that mean I can still deploy multiple nodes with different node types on the same host?
Node Cluster
I read that a cluster is a group of nodes. What could the cluster for the controller node look like?
CONTROLLER NODE
The controller node is the control plane for the OpenStack environment. The control plane handles identity (Keystone), dashboard (Horizon), telemetry (Ceilometer), orchestration (Heat) and the network server service (Neutron).
In this architecture, I have different OpenStack services (Horizon, Glance, etc.) running on one node. Can I conclude from this picture whether it's part of a cluster?
OK, so a node in the context of the OpenStack documentation is synonymous with host:
The example architecture requires at least two nodes (hosts)
from the sentence on the page: https://docs.openstack.org/newton/install-guide-ubuntu/overview.html
You already found out what a node is in the context of Senlin.
Node types: the nodes referred to here are the physical hosts, as in the rest of the OpenStack documentation. The node type is determined by the services running on the host. Usually you can run several services on one host.
In OpenStack, the word cluster is only used to refer to a collection of services managed by Senlin. So usually no, these services need not form a cluster.
I do not really understand the Kaa cluster architecture. First, I need to install and configure Kaa components on a single Linux node by using this link: http://kaaproject.github.io/kaa/docs/v0.10.0/Administration-guide/System-installation/Single-node-installation/
I need to install SQL, NoSQL and Zookeeper on it. Does that mean this single node is actually a cluster? I want to implement scalability and high availability. Do I need to clone the single node to implement a failover process?
The Kaa cluster architecture is described here:
http://kaaproject.github.io/kaa/docs/v0.10.0/Architecture-overview/
To set up and configure a Kaa cluster, you should follow the instructions on the Kaa cluster setup documentation page. The single-node installation page describes which Kaa dependencies should be installed and how they should be configured, but as it is a single-node installation, they are all placed on one node and configured accordingly.
The cluster setup is mostly like a single-node installation, but requires more nodes and configuration for their correct cooperation.
Thus, the difference between a Kaa cluster and single-node operation is generally in the configuration of the components rather than in the components themselves.
Therefore, you can clone a single-node Kaa server as a basis for a cluster node, but you will need to change its configuration accordingly before it can operate correctly as a cluster node.
I've setup a kafka cluster with 3 nodes.
kafka01.example.com
kafka02.example.com
kafka03.example.com
Kafka does replication so that any node in the cluster can be removed without losing data.
Normally I would send all data to kafka01, however that will break the entire cluster if that one node goes down.
What is industry best practice when dealing with clusters? I'm evaluating setting up an NGINX reverse proxy with round-robin load balancing. Then I can point all data producers at the proxy and it will divvy the data up between the nodes.
I need to ensure that no data is lost if one of the nodes becomes unavailable.
Is an nginx reverse proxy an appropriate tool for this use case?
Is my assumption correct that a round robin reverse proxy will distribute the data and increase reliability without data loss?
Is there a different approach that I haven't considered?
Normally your producer takes care of distributing the data to all (or a selected set of) nodes that are up and running, using a partitioning function either in round-robin mode or with some semantics of your choice. The producer publishes to a partition of a topic, and different nodes are leaders for different partitions of the same topic. If a broker node becomes unavailable, it falls out of the in-sync replicas (ISR) and new leaders are elected for the partitions it was leading. Through metadata requests/responses, your producer becomes aware of this and pushes messages to the nodes that are currently up.
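Concretely, instead of pointing producers at kafka01 (or at an NGINX proxy in front of it), you list all brokers in bootstrap.servers, create the topic with a replication factor of 3 (and, for example, min.insync.replicas=2), and send with acks=all so a single broker failure loses no acknowledged data. A minimal sketch in Java, with an assumed topic name "events":

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class ReliableProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            // List all brokers; the client discovers the full cluster from any one of them,
            // so there is no single entry point that needs an external load balancer.
            props.put("bootstrap.servers",
                    "kafka01.example.com:9092,kafka02.example.com:9092,kafka03.example.com:9092");
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            // Wait until all in-sync replicas have the record before acknowledging.
            props.put("acks", "all");

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // Records without a key are spread across partitions by the default partitioner.
                producer.send(new ProducerRecord<>("events", "hello"));
            }
        }
    }

The producer itself load-balances across partition leaders, so an NGINX reverse proxy in front of the brokers would add a single point of failure rather than remove one.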
If we start a SolrCloud with 2 shards, documents are distributed over the 2 shards by a hash function (Murmur). It is claimed that we can send a query to any of the cores and it will go to the right shard because the shards know about each other. I want to know how they know about each other.
Solr becomes SolrCloud with a ZooKeeper ensemble, so it is ZooKeeper that makes it possible for the nodes in the SolrCloud to talk to each other. You can think of ZooKeeper as a central repository for the SolrCloud configuration.
Now you can send queries to any node in the cloud; the node will consult ZooKeeper to find out which other nodes in the shards are alive and will distribute the query to all live nodes in the cluster. Every node in the cluster executes the query and sends its results back to the node that distributed the request. The node that was queried then combines the search results returned by all the nodes in the shards and sends them back to the client.
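If you use SolrJ, this ZooKeeper-awareness is explicit: the client connects to the ZooKeeper ensemble rather than to a specific Solr node, reads the cluster state from it, and routes the request to live replicas. A small sketch, assuming a collection named "mycollection" and ZooKeeper hosts zk1/zk2/zk3 (names are placeholders):

    import java.util.Arrays;
    import java.util.Optional;
    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.CloudSolrClient;
    import org.apache.solr.client.solrj.response.QueryResponse;

    public class SolrCloudQueryExample {
        public static void main(String[] args) throws Exception {
            // The client reads live-node and shard information from ZooKeeper,
            // so queries are only sent to replicas that are currently up.
            try (CloudSolrClient client = new CloudSolrClient.Builder(
                    Arrays.asList("zk1:2181", "zk2:2181", "zk3:2181"), Optional.empty()).build()) {
                client.setDefaultCollection("mycollection");
                QueryResponse rsp = client.query(new SolrQuery("*:*"));
                System.out.println("Found " + rsp.getResults().getNumFound() + " documents");
            }
        }
    }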