Add multiple notaries in Corda

I am trying to create a DLT with 4 nodes and 2 notaries, where each notary is responsible for communicating with 2 nodes.
Sample Gradle config:
task deployNodes(type: net.corda.plugins.Cordform, dependsOn: ['jar']) {
    directory "./build/nodes"
    networkMap "O=Controller,L=London,C=GB"
    node {
        name "O=Controller,L=London,C=GB"
        advertisedServices = ["corda.notary.validating"]
        p2pPort 10002
        rpcPort 10003
        cordapps = ["net.corda:corda-finance:$corda_release_version"]
    }
    node {
        name "O=ControllerNY,L=New York,C=US"
        advertisedServices = ["corda.notary.validating"]
        p2pPort 10004
        rpcPort 10005
        cordapps = ["net.corda:corda-finance:$corda_release_version"]
    }
    node {
        name "O=PartyA,L=London,C=GB"
        advertisedServices = []
        p2pPort 10006
        rpcPort 10007
        webPort 10008
        cordapps = ["net.corda:corda-finance:$corda_release_version"]
        rpcUsers = [[ user: "user1", "password": "test", "permissions": []]]
    }
    node {
        name "O=PartyB,L=London,C=GB"
        advertisedServices = []
        p2pPort 10009
        rpcPort 10010
        webPort 10011
        cordapps = ["net.corda:corda-finance:$corda_release_version"]
        rpcUsers = [[ user: "user1", "password": "test", "permissions": []]]
    }
    node {
        name "O=PartyC,L=New York,C=US"
        advertisedServices = []
        p2pPort 10012
        rpcPort 10013
        webPort 10014
        cordapps = ["net.corda:corda-finance:$corda_release_version"]
        rpcUsers = [[ user: "user1", "password": "test", "permissions": []]]
    }
    node {
        name "O=PartyD,L=New York,C=US"
        advertisedServices = []
        p2pPort 10015
        rpcPort 10016
        webPort 10017
        cordapps = ["net.corda:corda-finance:$corda_release_version"]
        rpcUsers = [[ user: "user1", "password": "test", "permissions": []]]
    }
}
How can I add both Controller and ControllerNY to the network so that they are picked up as notaries and not as normal nodes?

Both Controller and ControllerNY will be added to your network as notaries in this case, because they both advertise a notary service.
Each node is then free to use either notary for a given transaction. You pick your notary within the flow using something like:
serviceHub.networkMapCache.getNotary(notaryToUse)
Or
serviceHub.networkMapCache.notaryIdentities.single { it.name.organisation == notaryToUse }
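For example, a flow could take the notary's organisation name as a parameter and resolve it against the network map before building the transaction. The sketch below is illustrative only, assuming the Corda 3.x API; the flow name, the notaryToUse parameter, and the omitted states and commands are placeholders, not part of the original question:
import co.paralleluniverse.fibers.Suspendable
import net.corda.core.flows.FinalityFlow
import net.corda.core.flows.FlowLogic
import net.corda.core.flows.InitiatingFlow
import net.corda.core.flows.StartableByRPC
import net.corda.core.identity.Party
import net.corda.core.transactions.SignedTransaction
import net.corda.core.transactions.TransactionBuilder

// Illustrative flow: the caller decides which notary to use by passing its
// X.500 organisation name (e.g. "Controller" or "ControllerNY").
@InitiatingFlow
@StartableByRPC
class IssueWithChosenNotaryFlow(private val notaryToUse: String) : FlowLogic<SignedTransaction>() {
    @Suspendable
    override fun call(): SignedTransaction {
        // Resolve the requested notary from the network map cache.
        val notary: Party = serviceHub.networkMapCache.notaryIdentities
            .single { it.name.organisation == notaryToUse }

        // Build the transaction against the chosen notary as usual.
        val builder = TransactionBuilder(notary)
        // ... add output states and commands here ...

        val stx = serviceHub.signInitialTransaction(builder)
        return subFlow(FinalityFlow(stx))
    }
}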

Related

How to check resource access rights via team membership in firebase security rules?

A user should be allowed to access a resource if they are in a team that is allowed to access the resource.
How can I do this in security rules?
I have two collections:
teams, with a members field in each
resources, with a teamsThatCanAccess field in each
If I wrote this in JS, it'd be something like:
const canUserAccess = (userId, resource) => {
  const teams = resource.teamsThatCanAccess
  let hasAccess = false
  teams.forEach((team) => {
    if (team.members.includes(userId)) {
      hasAccess = true
    }
  })
  return hasAccess
}
However, as I understand it, security rules don't allow loops.
--EDIT--
To illustrate further, the database I'm building will look something like this:
teams = [
  { name: "teamA", org: "org1", members: ["uid1", "uid2", "uid3"] },
  { name: "teamB", org: "org1", members: ["uid1", "uid2"] },
  { name: "teamC", org: "org1", members: ["uid3", "uid4", "uid5"] },
  { name: "teamD", org: "org2", members: ["uid201", "uid202"] },
]
resources = [
  {
    id: "projectId1",
    name: "project 1",
    org: "org1",
    teamsThatCanAccess: ["teamA", "teamB"],
  },
  {
    id: "projectId2",
    name: "project 2",
    org: "org1",
    teamsThatCanAccess: ["teamA", "teamB", "teamC"],
  },
  {
    id: "projectId3",
    name: "project 3",
    org: "org1",
    teamsThatCanAccess: ["teamC"],
  },
  {
    id: "projectId4",
    name: "project 201",
    org: "org2",
    teamsThatCanAccess: ["teamD"],
  },
]
projectFiles = [
  { content: "document text", project: "projectId1" },
  { content: "document text 2", project: "projectId1" },
  { content: "document text 3", project: "projectId2" },
]
Based on what you described, you have a structure that looks like this:
// document at /teams/someTeamId
{
  "members": [
    "uid1",
    "uid2",
    "uid3"
  ],
  /* ... */
}
// document at /resources/someResourceId
{
  "teamsThatCanAccess": [
    "someTeamId",
    "otherTeamId"
  ],
  /* ... */
}
To secure the data, you will need to introduce a new collection of documents, called something like teamsByUser:
// document at /teamsByUser/uid1
{
  "memberOf": [
    "someTeamId",
    "otherTeamId"
  ]
}
By introducing this array, you can now use the rules.List#hasAny method to find if there is any overlap between the memberOf array in /teamsByUser/{userId} and the teamsThatCanAccess array in /resources/{resourceId}.
This then allows you to configure your rules as follows:
rules_version = '2';
service cloud.firestore {
  match /databases/{database}/documents {
    match /resources/{resourceId} {
      allow read: if resource.data.size() == 0 // is empty?
        || get(/databases/$(database)/documents/teamsByUser/$(request.auth.uid)).data.memberOf.hasAny(resource.data.teamsThatCanAccess); // accessing user is a member of an allowed team
    }
    // don't forget to add rules to prevent users joining arbitrary teams from clients
  }
}
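To keep the teamsByUser index in sync, membership writes should come from a trusted environment (or be locked down by their own rules). As a rough sketch only, here is what that maintenance could look like using the Firebase Kotlin/Android SDK purely as an illustration; the helper name and the team IDs are placeholders, and in production this would more likely run in a Cloud Function via the Admin SDK:
import com.google.firebase.firestore.FieldValue
import com.google.firebase.firestore.FirebaseFirestore
import com.google.firebase.firestore.SetOptions

// Hypothetical helper: add a user to a team and mirror the membership into
// the /teamsByUser/{uid} index that the security rules read.
fun addUserToTeam(db: FirebaseFirestore, uid: String, teamId: String) {
    // Record the membership on the team document itself.
    db.collection("teams").document(teamId)
        .update("members", FieldValue.arrayUnion(uid))

    // Mirror it into the per-user index used by hasAny() in the rules.
    db.collection("teamsByUser").document(uid)
        .set(mapOf("memberOf" to FieldValue.arrayUnion(teamId)), SetOptions.merge())
}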

Transitive dependencies of CorDapp don't get updated

Original question: how do I update the transitive dependencies of a CorDapp to use Artemis 2.5.0? I'm following the corda-ftp demo. I updated build.gradle as shown below. When I run gradle dependencies I can see that Artemis 2.5.0 wins, but somehow the nodes pick up 2.2.0, as I can see in the classpath in the node logs.
buildscript {
    ext.corda_release_version = '3.1-corda'
    ext.corda_gradle_plugins_version = '3.1.0'
    ext.quasar_version = '0.7.9'
    ext.junit_version = '4.12'
    ext.spring_boot_version = '2.0.2.RELEASE'
    ext.corda_release_group = 'net.corda'
    ext.kotlin_version = '1.1.60'
    ext.username = "corda"
    ext.password = "corda_initial_password"
    ext.client_port = 10009
    repositories {
        mavenLocal()
        mavenCentral()
        jcenter()
    }
    dependencies {
        classpath "org.jetbrains.kotlin:kotlin-gradle-plugin:$kotlin_version"
        classpath "net.corda.plugins:cordapp:$corda_gradle_plugins_version"
        classpath "net.corda.plugins:cordformation:$corda_gradle_plugins_version"
        classpath "net.corda.plugins:quasar-utils:$corda_gradle_plugins_version"
        classpath "io.spring.gradle:dependency-management-plugin:1.0.5.RELEASE"
    }
}
repositories {
    mavenLocal()
    jcenter()
    mavenCentral()
    maven { url 'https://dl.bintray.com/kotlin/exposed' }
    maven { url 'https://jitpack.io' }
    maven { url 'https://ci-artifactory.corda.r3cev.com/artifactory/corda-releases' }
    maven { url 'https://ci-artifactory.corda.r3cev.com/artifactory/corda-dev/' }
}
apply plugin: 'kotlin'
apply plugin: "io.spring.dependency-management"
apply plugin: 'net.corda.plugins.cordapp'
apply plugin: 'net.corda.plugins.cordformation'
apply plugin: 'net.corda.plugins.quasar-utils'
dependencyManagement {
    dependencies {
        dependencySet(group: 'org.apache.activemq', version: '2.5.0') {
            entry 'artemis-amqp-protocol'
            entry 'artemis-commons'
            entry 'artemis-core-client'
            entry 'artemis-jdbc-store'
            entry 'artemis-jms-client'
            entry 'artemis-journal'
            entry 'artemis-native'
            entry 'artemis-selector'
            entry 'artemis-server'
        }
    }
}
sourceSets {
    main {
        resources {
            srcDir "config/dev"
        }
    }
    test {
        resources {
            srcDir "config/test"
        }
    }
}
dependencies {
    compile "org.jetbrains.kotlin:kotlin-stdlib-jre8:$kotlin_version"
    testCompile "org.jetbrains.kotlin:kotlin-test:$kotlin_version"
    testCompile "junit:junit:$junit_version"
    // Corda integration dependencies
    cordaCompile "$corda_release_group:corda-core:$corda_release_version"
    cordaCompile "$corda_release_group:corda-finance:$corda_release_version"
    cordaCompile "$corda_release_group:corda-jackson:$corda_release_version"
    cordaCompile "$corda_release_group:corda-rpc:$corda_release_version"
    cordaCompile "$corda_release_group:corda-node-api:$corda_release_version"
    cordaCompile "$corda_release_group:corda-webserver-impl:$corda_release_version"
    cordaRuntime "$corda_release_group:corda:$corda_release_version"
    cordaRuntime "$corda_release_group:corda-webserver:$corda_release_version"
    testCompile "$corda_release_group:corda-test-utils:$corda_release_version"
    testCompile "$corda_release_group:corda-node-driver:$corda_release_version"
    // GraphStream: For visualisation (required by TemplateClientRPC app)
    compile "org.graphstream:gs-core:1.3"
    compile("org.graphstream:gs-ui:1.3") {
        exclude group: "bouncycastle"
    }
    // CorDapp dependencies
    // Specify your cordapp's dependencies below, including dependent cordapps
    compile "io.reactivex:rxjava:1.2.4"
}
tasks.withType(org.jetbrains.kotlin.gradle.tasks.KotlinCompile).all {
    kotlinOptions {
        languageVersion = "1.1"
        apiVersion = "1.1"
        jvmTarget = "1.8"
        javaParameters = true // Useful for reflection.
    }
}
def copyConfigTask(nodeName) {
    return tasks.create("copy${nodeName}", Copy) {
        from "${nodeName}.json"
        into "./build/nodes/${nodeName}/"
        rename {
            "cordaftp.json"
        }
    }
}
task deployNodes(type: net.corda.plugins.Cordform, dependsOn: ['jar', copyConfigTask("CorpA"), copyConfigTask("CorpB")]) {
    directory "./build/nodes"
    node {
        name "O=R3Corp,OU=corda,L=London,C=GB"
        notary = [validating : false]
        p2pPort 10002
        rpcSettings {
            address("localhost:10003")
            adminAddress("localhost:10043")
        }
        cordapps = []
    }
    node {
        name "O=CorpA,L=Paris,C=FR"
        p2pPort 10005
        rpcSettings {
            address("localhost:10006")
            adminAddress("localhost:10046")
        }
        extraConfig = [
            jvmArgs : [ "-Xmx1g"],
            attachmentContentCacheSizeMegaBytes: 100
        ]
        cordapps = []
        // TODO: Replace username / password with vars such that we can DRY the username, password
        rpcUsers = [[ "user": "corda", "password": "corda_initial_password", "permissions": ["ALL"]]]
    }
    node {
        name "O=CorpB,L=Rome,C=IT"
        p2pPort 10008
        rpcSettings {
            address("localhost:10009")
            adminAddress("localhost:10049")
        }
        extraConfig = [
            jvmArgs : [ "-Xmx1g"],
            attachmentContentCacheSizeMegaBytes: 100
        ]
        cordapps = []
        // TODO: Ditto
        rpcUsers = [[ "user": "corda", "password": "corda_initial_password", "permissions": ["ALL"]]]
    }
}
task(runClientB, dependsOn: 'classes', type: JavaExec) {
    classpath = sourceSets.main.runtimeClasspath
    main = 'net.corda.cordaftp.SenderKt'
    args "localhost:$client_port", "$username", "$password", "build/nodes/CorpB/cordaftp.json"
}
You cannot control which versions of dependencies the nodes use internally; the node's own classpath (including its Artemis libraries) is fixed by the Corda distribution it runs, not by your CorDapp's dependencyManagement block. You can only control the dependencies used by your CorDapp.

Building a CorDapp JAR and deploying it to a new node

I am trying to integrate two examples (the Corda Java template: https://github.com/corda/cordapp-template-java and the Oracle example: https://github.com/corda/oracle-example) so as to integrate an Oracle node into the template.
I changed build.gradle and settings.gradle and copied the base and service packages to the template folder. Though the project is not logically linked, as the Oracle corresponds to a different service, it compiles and creates classes under the build folder successfully. After re-syncing the Gradle project, the Gradle tasks were updated and I am able to run deployNodes successfully.
However, no JAR is present in the build/nodes/Oracle/cordapp folder.
Kindly advise if additional changes need to be made.
Git URL for the changes made: https://github.com/ashubisht/cordapp-template-java/tree/OracleIntegration_IOURelV3_0307
Here's the updated Gradle file:
buildscript {
    ext.corda_release_group = 'net.corda'
    ext.corda_release_version = '3.1-corda'
    ext.corda_gradle_plugins_version = '3.1.0'
    ext.junit_version = '4.12'
    ext.quasar_version = '0.7.9'
    ext.kotlin_version = '1.1.60'
    repositories {
        mavenLocal()
        mavenCentral()
        jcenter()
    }
    dependencies {
        classpath "org.jetbrains.kotlin:kotlin-gradle-plugin:$kotlin_version"
        classpath "net.corda.plugins:cordapp:$corda_gradle_plugins_version"
        classpath "net.corda.plugins:cordformation:$corda_gradle_plugins_version"
        classpath "net.corda.plugins:quasar-utils:$corda_gradle_plugins_version"
    }
}
repositories {
    mavenLocal()
    jcenter()
    mavenCentral()
    maven { url 'https://jitpack.io' }
    maven { url 'https://ci-artifactory.corda.r3cev.com/artifactory/corda-releases' }
}
apply plugin: 'kotlin'
apply plugin: 'java'
apply plugin: 'net.corda.plugins.cordapp'
apply plugin: 'net.corda.plugins.cordformation'
apply plugin: 'net.corda.plugins.quasar-utils'
sourceSets {
    main {
        resources {
            srcDir "config/dev"
        }
    }
    test {
        resources {
            srcDir "config/test"
        }
    }
    integrationTest {
        java {
            compileClasspath += main.output + test.output
            runtimeClasspath += main.output + test.output
            srcDir file('src/integration-test/java')
        }
    }
}
configurations {
    integrationTestCompile.extendsFrom testCompile
    integrationTestRuntime.extendsFrom testRuntime
}
dependencies {
    testCompile "junit:junit:$junit_version"
    // Corda integration dependencies
    cordaCompile "$corda_release_group:corda-core:$corda_release_version"
    cordaCompile "$corda_release_group:corda-finance:$corda_release_version"
    cordaCompile "$corda_release_group:corda-jackson:$corda_release_version"
    cordaCompile "$corda_release_group:corda-rpc:$corda_release_version"
    cordaCompile "$corda_release_group:corda-node-api:$corda_release_version"
    cordaCompile "$corda_release_group:corda-webserver-impl:$corda_release_version"
    cordaRuntime "$corda_release_group:corda:$corda_release_version"
    cordaRuntime "$corda_release_group:corda-webserver:$corda_release_version"
    testCompile "$corda_release_group:corda-node-driver:$corda_release_version"
    // CorDapp dependencies
    // Specify your CorDapp's dependencies below, including dependent CorDapps.
    // We've defined Cash as a dependent CorDapp as an example.
    cordapp project(":cordapp")
    cordapp project(":cordapp-contracts-states")
    //Added oracle support to template for testing/ experimenting configs
    cordapp project(":base")
    cordapp project(":service")
    //Oracle changes end here
    cordapp "$corda_release_group:corda-finance:$corda_release_version"
}
task integrationTest(type: Test, dependsOn: []) {
    testClassesDir = sourceSets.integrationTest.output.classesDir
    classpath = sourceSets.integrationTest.runtimeClasspath
}
tasks.withType(JavaCompile) {
    options.compilerArgs << "-parameters" // Required for passing named arguments to your flow via the shell.
}
task deployNodes(type: net.corda.plugins.Cordform, dependsOn: ['jar']) {
    directory "./build/nodes"
    node {
        name "O=Notary,L=London,C=GB"
        notary = [validating : true]
        p2pPort 10002
        cordapps = [
            "$project.group:cordapp-contracts-states:$project.version",
            "$project.group:cordapp:$project.version",
            "$corda_release_group:corda-finance:$corda_release_version"
        ]
    }
    node {
        name "O=PartyA,L=London,C=GB"
        p2pPort 10005
        rpcSettings {
            address("localhost:10006")
            adminAddress("localhost:10046")
        }
        webPort 10007
        cordapps = [
            "$project.group:cordapp-contracts-states:$project.version",
            "$project.group:cordapp:$project.version",
            "$corda_release_group:corda-finance:$corda_release_version"
        ]
        rpcUsers = [[ user: "user1", "password": "test", "permissions": ["ALL"]]]
    }
    node {
        name "O=PartyB,L=New York,C=US"
        p2pPort 10008
        rpcSettings {
            address("localhost:10009")
            adminAddress("localhost:10049")
        }
        webPort 10010
        cordapps = [
            "$project.group:cordapp-contracts-states:$project.version",
            "$project.group:cordapp:$project.version",
            "$corda_release_group:corda-finance:$corda_release_version"
        ]
        rpcUsers = [[ user: "user1", "password": "test", "permissions": ["ALL"]]]
    }
    node {
        name "O=Oracle,L=New York,C=US"
        p2pPort 10011
        rpcSettings {
            address("localhost:10012")
            adminAddress("localhost:10052")
        }
        webPort 10013
        //The below cordapps will be deployed to oracle.
        //Create below packages named base and service and add to dependency
        cordapps = [
            "$project.group:cordapp-contracts-states:$project.version",
            "net.corda.examples.oracle:base:1.0",
            "net.corda.examples.oracle:service:1.0"
        ]
        rpcUsers = [[ user: "user1", "password": "test", "permissions": ["ALL"]]]
    }
}
task runTemplateClient(type: JavaExec) {
    classpath = sourceSets.main.runtimeClasspath
    main = 'com.template.TemplateClient'
    args 'localhost:10006'
}
When defining your oracle node in deployNodes, you have provided the following cordapps block:
cordapps = [
    "$project.group:cordapp-contracts-states:$project.version",
    "net.corda.examples.oracle:base:1.0",
    "net.corda.examples.oracle:service:1.0"
]
However, your project's group, as given in your project's gradle.properties file, is com.template. Therefore you need to specify your CorDapps as follows:
cordapps = [
    "$project.group:cordapp-contracts-states:$project.version",
    "com.template:base:0.1",
    "com.template:service:0.1"
]
Alternatively, you can use the following shorthand:
cordapps = [
    "$project.group:cordapp-contracts-states:$project.version",
    "$project.group:base:0.1",
    "$project.group:service:0.1"
]

DC/OS Marathon constraints hostname list

When I want to use
"constraints": [["hostname", "CLUSTER", "192.168.18.6(1|2)"]]
or
"constraints": [["hostname", "CLUSTER", "DCOS-S-0(1|2)"]]
in Marathon, the app named "/zaslepki/4maxpl" has the Waiting status all the time.
So I tried to use an attribute instead. I executed:
[root@DCOS-S-00 etc]# systemctl stop dcos-mesos-slave-public.service
[root@DCOS-S-00 etc]# mesos-slave --work_dir=/var/lib/mesos/slave --attributes=DC:DL01 --master=zk://192.168.18.51:2181,192.168.18.51:2181,192.168.18.53:2181/mesos
WARNING: Logging before InitGoogleLogging() is written to STDERR
I1229 13:16:19.800616 24537 main.cpp:243] Build: 2016-11-07 21:31:04 by
I1229 13:16:19.800720 24537 main.cpp:244] Version: 1.0.1
I1229 13:16:19.800726 24537 main.cpp:251] Git SHA: d5746045ac740d5f28f238dc55ec95c89d2b7cd9
I1229 13:16:19.807195 24537 systemd.cpp:237] systemd version `219` detected
I1229 13:16:19.807232 24537 main.cpp:342] Inializing systemd state
I1229 13:16:19.820071 24537 systemd.cpp:325] Started systemd slice `mesos_executors.slice`
I1229 13:16:19.821051 24537 containerizer.cpp:196] Using isolation: posix/cpu,posix/mem,filesystem/posix,network/cni
I1229 13:16:19.825422 24537 linux_launcher.cpp:101] Using /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
I1229 13:16:19.826690 24537 main.cpp:434] Starting Mesos agent
2016-12-29 13:16:19,827:24537(0x7f8ecae60700):ZOO_INFO#log_env#726: Client environment:zookeeper.version=zookeeper C client 3.4.8
2016-12-29 13:16:19,827:24537(0x7f8ecae60700):ZOO_INFO#log_env#730: Client environment:host.name=DCOS-S-00
2016-12-29 13:16:19,827:24537(0x7f8ecae60700):ZOO_INFO#log_env#737: Client environment:os.name=Linux
2016-12-29 13:16:19,827:24537(0x7f8ecae60700):ZOO_INFO#log_env#738: Client environment:os.arch=3.10.0-514.2.2.el7.x86_64
2016-12-29 13:16:19,827:24537(0x7f8ecae60700):ZOO_INFO#log_env#739: Client environment:os.version=#1 SMP Tue Dec 6 23:06:41 UTC 2016
2016-12-29 13:16:19,827:24537(0x7f8ecae60700):ZOO_INFO#log_env#747: Client environment:user.name=root
2016-12-29 13:16:19,827:24537(0x7f8ecae60700):ZOO_INFO#log_env#755: Client environment:user.home=/root
2016-12-29 13:16:19,827:24537(0x7f8ecae60700):ZOO_INFO#log_env#767: Client environment:user.dir=/opt/mesosphere/etc
2016-12-29 13:16:19,827:24537(0x7f8ecae60700):ZOO_INFO#zookeeper_init#800: Initiating client connection, host=192.168.18.51:2181,192.168.18.51:2181,192.168.18.53:2181 sessionTimeout=10000 watcher=0x7f8ed221a030 sessionId=0 sessionPasswd=<null> context=0x7f8ebc001ee0 flags=0
I1229 13:16:19.828233 24537 slave.cpp:198] Agent started on 1)#192.168.18.60:5051
2016-12-29 13:16:19,828:24537(0x7f8ec8c49700):ZOO_INFO#check_events#1728: initiated connection to server [192.168.18.51:2181]
I1229 13:16:19.828263 24537 slave.cpp:199] Flags at startup: --appc_simple_discovery_uri_prefix="http://" --appc_store_dir="/tmp/mesos/store/appc" --attributes="DC:DL01" --authenticate_http_readonly="false" --authenticate_http_readwrite="false" --authenticatee="crammd5" --authentication_backoff_factor="1secs" --authorizer="local" --cgroups_cpu_enable_pids_and_tids_count="false" --cgroups_enable_cfs="false" --cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false" --cgroups_root="mesos" --container_disk_watch_interval="15secs" --containerizers="mesos" --default_role="*" --disk_watch_interval="1mins" --docker="docker" --docker_kill_orphans="true" --docker_registry="https://registry-1.docker.io" --docker_remove_delay="6hrs" --docker_socket="/var/run/docker.sock" --docker_stop_timeout="0ns" --docker_store_dir="/tmp/mesos/store/docker" --docker_volume_checkpoint_dir="/var/run/mesos/isolators/docker/volume" --enforce_container_disk_quota="false" --executor_registration_timeout="1mins" --executor_shutdown_grace_period="5secs" --fetcher_cache_dir="/tmp/mesos/fetch" --fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1weeks" --gc_disk_headroom="0.1" --hadoop_home="" --help="false" --hostname_lookup="true" --http_authenticators="basic" --http_command_executor="false" --image_provisioner_backend="copy" --initialize_driver_logging="true" --ip_discovery_command="/opt/mesosphere/bin/detect_ip" --isolation="posix/cpu,posix/mem" --launcher_dir="/opt/mesosphere/packages/mesos--253f5cb0a96e2e3574293ddfecf5c63358527377/libexec/mesos" --logbufsecs="0" --logging_level="INFO" --master="zk://192.168.18.51:2181,192.168.18.51:2181,192.168.18.53:2181/mesos" --oversubscribed_resources_interval="15secs" --perf_duration="10secs" --perf_interval="1mins" --port="5051" --qos_correction_interval_min="0ns" --quiet="false" --recover="reconnect" --recovery_timeout="15mins" --registration_backoff_factor="1secs" --revocable_cpu_low_priority="true" --sandbox_directory="/mnt/mesos/sandbox" --strict="true" --switch_user="true" --systemd_enable_support="true" --systemd_runtime_directory="/run/systemd/system" --version="false" --work_dir="/var/lib/mesos/slave"
I1229 13:16:19.829263 24537 slave.cpp:519] Agent resources: cpus(*):8; mem(*):6541; disk(*):36019; ports(*):[31000-32000]
I1229 13:16:19.829306 24537 slave.cpp:527] Agent attributes: [ DC=DL01 ]
I1229 13:16:19.829319 24537 slave.cpp:532] Agent hostname: DCOS-S-00
2016-12-29 13:16:19,832:24537(0x7f8ec8c49700):ZOO_INFO#check_events#1775: session establishment complete on server [192.168.18.51:2181], sessionId=0x1593f6a1ef20fce, negotiated timeout=10000
I1229 13:16:19.832623 24548 state.cpp:57] Recovering state from '/var/lib/mesos/slave/meta'
I1229 13:16:19.832695 24547 group.cpp:349] Group process (group(1)#192.168.18.60:5051) connected to ZooKeeper
I1229 13:16:19.832723 24547 group.cpp:837] Syncing group operations: queue size (joins, cancels, datas) = (0, 0, 0)
I1229 13:16:19.832736 24547 group.cpp:427] Trying to create path '/mesos' in ZooKeeper
I1229 13:16:19.834234 24547 detector.cpp:152] Detected a new leader: (id='70')
I1229 13:16:19.834319 24547 group.cpp:706] Trying to get '/mesos/json.info_0000000070' in ZooKeeper
I1229 13:16:19.835002 24547 zookeeper.cpp:259] A new leading master (UPID=master#192.168.18.53:5050) is detected
Failed to perform recovery: Incompatible agent info detected.
------------------------------------------------------------
Old agent info:
hostname: "192.168.18.60"
resources {
name: "ports"
type: RANGES
ranges {
range {
begin: 1
end: 21
}
range {
begin: 23
end: 5050
}
range {
begin: 5052
end: 32000
}
}
role: "slave_public"
}
resources {
name: "disk"
type: SCALAR
scalar {
value: 37284
}
role: "slave_public"
}
resources {
name: "cpus"
type: SCALAR
scalar {
value: 8
}
role: "slave_public"
}
resources {
name: "mem"
type: SCALAR
scalar {
value: 6541
}
role: "slave_public"
}
attributes {
name: "public_ip"
type: TEXT
text {
value: "true"
}
}
id {
value: "8bc3d621-ed8a-4641-88c1-7a7163668263-S9"
}
checkpoint: true
port: 5051
------------------------------------------------------------
New agent info:
hostname: "DCOS-S-00"
resources {
name: "cpus"
type: SCALAR
scalar {
value: 8
}
role: "*"
}
resources {
name: "mem"
type: SCALAR
scalar {
value: 6541
}
role: "*"
}
resources {
name: "disk"
type: SCALAR
scalar {
value: 36019
}
role: "*"
}
resources {
name: "ports"
type: RANGES
ranges {
range {
begin: 31000
end: 32000
}
}
role: "*"
}
attributes {
name: "DC"
type: TEXT
text {
value: "DL01"
}
}
id {
value: "8bc3d621-ed8a-4641-88c1-7a7163668263-S9"
}
checkpoint: true
port: 5051
------------------------------------------------------------
To remedy this do as follows:
Step 1: rm -f /var/lib/mesos/slave/meta/slaves/latest
This ensures agent doesn't recover old live executors.
Step 2: Restart the agent.
[root@DCOS-S-00 etc]# rm -f /var/lib/mesos/slave/meta/slaves/latest
[root@DCOS-S-00 etc]# systemctl start dcos-mesos-slave-public.service
and I used the following in the .json application configuration file:
"constraints": [["DC", "CLUSTER", "DL01"]]
The application status is still Waiting.
This is my .json application file for "/zaslepki/4maxpl":
{
  "id": "/zaslepki/4maxpl",
  "cmd": null,
  "cpus": 0.5,
  "mem": 256,
  "disk": 0,
  "instances": 2,
  "constraints": [["hostname", "CLUSTER", "DCOS-S-0(3|4)"]],
  "acceptedResourceRoles": [
    "slave_public"
  ],
  "container": {
    "type": "DOCKER",
    "volumes": [],
    "docker": {
      "image": "arekmax/4maxpl",
      "network": "BRIDGE",
      "portMappings": [
        {
          "containerPort": 80,
          "hostPort": 0,
          "servicePort": 10015,
          "protocol": "tcp",
          "labels": {}
        }
      ],
      "privileged": false,
      "parameters": [],
      "forcePullImage": false
    }
  },
  "healthChecks": [
    {
      "path": "/",
      "protocol": "HTTP",
      "portIndex": 0,
      "gracePeriodSeconds": 300,
      "intervalSeconds": 30,
      "timeoutSeconds": 10,
      "maxConsecutiveFailures": 2,
      "ignoreHttp1xx": false
    }
  ],
  "labels": {
    "HAPROXY_GROUP": "external"
  },
  "portDefinitions": [
    {
      "port": 10015,
      "protocol": "tcp",
      "labels": {}
    }
  ]
}
What am I doing wrong? I found the same problem (link), but there the problem was fixed by using
constraints: [["DC", "CLUSTER", "DL01"]]
You've got a clue in the log:
Invalid attribute key:value pair 'DL01'
Change your attribute to a key:value pair, e.g. DC:DL01, and it should work. You will probably need to clean the metadata directory because you are changing the agent configuration.
The CLUSTER operator doesn't work with multiple values. You need to pass a regular expression using the LIKE operator, so it should look like this:
"constraints": [["hostname", "LIKE", "192.168.18.6(1|2)"]]

Docker Swarm Mode: Not all VIPs for a service work. Getting timeouts for several VIPs

Description
I'm having issues with an overlay network using docker swarm mode (IMPORTANT: swarm mode, not swarm). I have an overlay network named "internal". I have a service named "datacollector" that is scaled to 12 instances. I docker exec into another service running in the same swarm (and on the same overlay network) and run curl http://datacollector 12 times. However, 4 of the requests result in a timeout. I then run dig tasks.datacollector and get a list of 12 ip addresses. Sure enough, 8 of the ip addresses work but 4 timeout every time.
I tried scaling the service down to 1 instance and then back up to 12, but got the same result.
I then used docker service ps datacollector to find each running instance of my service. I used docker kill xxxx on each node to manually kill all instances and let the swarm recreate them. I then checked dig again and verified that the list of IP addresses for the task was no longer the same. After this I ran curl http://datacollector 12 more times. Now only 3 requests work and the remaining 9 timeout!
This is the second time this has happened in the last 2 weeks or so. The previous time I had to remove all services, remove the overlay network, recreate the overlay network, and re-create all of the services in order to resolve the issue. Obviously, this isn't a workable long term solution :(
Output of docker service inspect datacollector:
[
{
"ID": "2uevc4ouakk6k3dirhgqxexz9",
"Version": {
"Index": 72152
},
"CreatedAt": "2016-11-12T20:38:51.137043037Z",
"UpdatedAt": "2016-11-17T15:22:34.402801678Z",
"Spec": {
"Name": "datacollector",
"TaskTemplate": {
"ContainerSpec": {
"Image": "507452836298.dkr.ecr.us-east-1.amazonaws.com/swarm/api:61d7931f583742cca91b368bc6d9e15314545093",
"Args": [
"node",
".",
"api/dataCollector"
],
"Env": [
"ENVIRONMENT=stage",
"MONGODB_URI=mongodb://mongodb:27017/liveearth",
"RABBITMQ_URL=amqp://rabbitmq",
"ELASTICSEARCH_URL=http://elasticsearch"
]
},
"Resources": {
"Limits": {},
"Reservations": {}
},
"RestartPolicy": {
"Condition": "any",
"MaxAttempts": 0
},
"Placement": {
"Constraints": [
"node.labels.role.api==true",
"node.labels.role.api==true",
"node.labels.role.api==true",
"node.labels.role.api==true",
"node.labels.role.api==true"
]
}
},
"Mode": {
"Replicated": {
"Replicas": 12
}
},
"UpdateConfig": {
"Parallelism": 1,
"FailureAction": "pause"
},
"Networks": [
{
"Target": "88e9fd9715o5v1hqu6dnkg3vp"
}
],
"EndpointSpec": {
"Mode": "vip"
}
},
"Endpoint": {
"Spec": {
"Mode": "vip"
},
"VirtualIPs": [
{
"NetworkID": "88e9fd9715o5v1hqu6dnkg3vp",
"Addr": "192.168.1.23/24"
}
]
},
"UpdateStatus": {
"State": "completed",
"StartedAt": "2016-11-17T15:19:34.471292948Z",
"CompletedAt": "2016-11-17T15:22:34.402794312Z",
"Message": "update completed"
}
}
]
Output of docker network inspect internal:
[
{
"Name": "internal",
"Id": "88e9fd9715o5v1hqu6dnkg3vp",
"Scope": "swarm",
"Driver": "overlay",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": null,
"Config": [
{
"Subnet": "192.168.1.0/24",
"Gateway": "192.168.1.1"
}
]
},
"Internal": false,
"Containers": {
"03ac1e71139ff2140f93c80d9e6b1d69abf442a0c2362610bee3e116e84ef434": {
"Name": "datacollector.5.cxmvk7p1hwznautresir94m3s",
"EndpointID": "22445be80ba55b67d7cfcfbc75f2c15586bace5f317be8ba9b59c5f9f338525c",
"MacAddress": "02:42:c0:a8:01:72",
"IPv4Address": "192.168.1.114/24",
"IPv6Address": ""
},
"08ae84c7cb6e57583baf12c2a9082c1d17f1e65261cfa93346aaa9bda1244875": {
"Name": "auth.10.aasw00k7teq4knxibctlrrj7e",
"EndpointID": "c3506c851f4c9f0d06d684a9f023e7ba529d0149d70fa7834180a87ad733c678",
"MacAddress": "02:42:c0:a8:01:44",
"IPv4Address": "192.168.1.68/24",
"IPv6Address": ""
},
"192203a127d6831c3f4a41eabdd8df5282e33c3e92b99c3baaf1f213042f5418": {
"Name": "parkingcollector.1.8yrm6d831wrfsrkzhal7cf2pm",
"EndpointID": "34de6e9621ef54f7d963db942a7a7b6e0013ac6db6c9f17b384de689b1f1b187",
"MacAddress": "02:42:c0:a8:01:9a",
"IPv4Address": "192.168.1.154/24",
"IPv6Address": ""
},
"24258109e16c1a5b15dcc84a41d99a4a6617bcadecc9b35279c721c0d2855141": {
"Name": "stream.8.38npsusmpa1pf8fbnmaux57rx",
"EndpointID": "b675991ffbd5c0d051a4b68790a33307b03b48582fd1b37ba531cf5e964af0ce",
"MacAddress": "02:42:c0:a8:01:74",
"IPv4Address": "192.168.1.116/24",
"IPv6Address": ""
},
"33063b988473b73be2cbc51e912e165112de3d01bc00ee2107aa635e30a36335": {
"Name": "billing.2.ca41k2h44zkn9wfbsif0lfupf",
"EndpointID": "77c576929d5e82f1075b4cc6fcb4128ce959281d4b9c1c22d9dcd1e42eed8b5e",
"MacAddress": "02:42:c0:a8:01:87",
"IPv4Address": "192.168.1.135/24",
"IPv6Address": ""
},
"8b0929e66e6c284206ea713f7c92f1207244667d3ff02815d4bab617c349b220": {
"Name": "shotspottercollector.2.328408tiyy8aryr0g1ipmm5xm",
"EndpointID": "f2a0558ec67745f5d1601375c2090f5cd141303bf0d54bec717e3463f26ed74d",
"MacAddress": "02:42:c0:a8:01:90",
"IPv4Address": "192.168.1.144/24",
"IPv6Address": ""
},
"938fe5f6f9bb893862e8c06becd76c1a7fe5f2d3b791fc55d7d8164e67ee3553": {
"Name": "inrixproxy.2.ed77crvat0waw41phjknhhm6v",
"EndpointID": "88f550fecd60f0bdb0dfc9d5bf0c74716a91d009bcc27dc4392b113ab1215038",
"MacAddress": "02:42:c0:a8:01:96",
"IPv4Address": "192.168.1.150/24",
"IPv6Address": ""
},
"970f9d4c6ae6cc4de54a1d501408720b7d95114c28a6615d8e4e650b7e69bc40": {
"Name": "rabbitmq.1.e7j721g6hfhs8r7p3phih4g9v",
"EndpointID": "c04a4a5650ee6e10b87884004aa2cb1ec6b1c7036af15c31579462b6621436a2",
"MacAddress": "02:42:c0:a8:01:1e",
"IPv4Address": "192.168.1.30/24",
"IPv6Address": ""
},
"b1f676e6d38eec026583943dc0abff1163d21e6be9c5901539c46288f8941638": {
"Name": "logspout.0.51j8juw8aj0rjjccp2am0rib5",
"EndpointID": "98a93153abd6897c58276340df2eeec5c0ceb77fbe17d1ce8c465febb06776c7",
"MacAddress": "02:42:c0:a8:01:10",
"IPv4Address": "192.168.1.16/24",
"IPv6Address": ""
},
"bab4d80be830fa3b3fefe501c66e3640907a2cbb2addc925a0eb6967a771a172": {
"Name": "auth.2.8fduvrn5ayk024b0lkhyz50of",
"EndpointID": "7e81d41fa04ec14263a2423d8ef003d6d431a8c3ff319963197f8a8d73b4e361",
"MacAddress": "02:42:c0:a8:01:3a",
"IPv4Address": "192.168.1.58/24",
"IPv6Address": ""
},
"bc3c75a7c2d8c078eb7cc1555833ff0d374d82045dd9fb24ccfc37868615bb5e": {
"Name": "reverseproxy.6.2g20zphn5j1r2feylzcplyorg",
"EndpointID": "6c2138966ebcd144b47229a94ee603d264f3954a96ccd024d9e96501b7ffd5c0",
"MacAddress": "02:42:c0:a8:01:6c",
"IPv4Address": "192.168.1.108/24",
"IPv6Address": ""
},
"cd59d61b16ac0325336121a8558e8215e42aa5300f75054df17a70bf1f3e6c0c": {
"Name": "usgscollector.1.0h0afyw8va8maoa4tjd5qz588",
"EndpointID": "952073efc6a567ebd3f80d26811222c675183e8c76005fbf12388725a97b1bee",
"MacAddress": "02:42:c0:a8:01:48",
"IPv4Address": "192.168.1.72/24",
"IPv6Address": ""
},
"d40476e56b91762b0609acd637a4f70e42c88d266f8ebb7d9511050a8fc1df17": {
"Name": "kibana.1.6hxu5b97hfykuqr5yb9i9sn5r",
"EndpointID": "08c5188076f9b8038d864d570e7084433a8d97d4c8809d27debf71cb5d652cd7",
"MacAddress": "02:42:c0:a8:01:06",
"IPv4Address": "192.168.1.6/24",
"IPv6Address": ""
},
"e29369ad8ee5b12fb0c6f9bcb899514ab092f7da291a7c05eea758b0c19bfb65": {
"Name": "weatherbugcollector.1.crpub0hf85cewxm0qt6annsra",
"EndpointID": "afa1ddbad8ab8fdab69505ddb5342ac89c0d17bc75a11e9ac0ac8829e5885997",
"MacAddress": "02:42:c0:a8:01:2e",
"IPv4Address": "192.168.1.46/24",
"IPv6Address": ""
},
"f1bf0a656ecb9d7ef9b837efa94a050d9c98586f7312435e48b9a129c5e92e46": {
"Name": "socratacollector.1.627icslq6kdb4syaha6tzkb19",
"EndpointID": "14bea0d9ec3f94b04b32f36b7172c60316ee703651d0d920126a49dd0fa99cf5",
"MacAddress": "02:42:c0:a8:01:1b",
"IPv4Address": "192.168.1.27/24",
"IPv6Address": ""
}
},
"Options": {
"com.docker.network.driver.overlay.vxlanid_list": "257"
},
"Labels": {}
}
]
Output of dig datacollector:
; <<>> DiG 9.9.5-9+deb8u8-Debian <<>> datacollector
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 38227
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;datacollector. IN A
;; ANSWER SECTION:
datacollector. 600 IN A 192.168.1.23
;; Query time: 0 msec
;; SERVER: 127.0.0.11#53(127.0.0.11)
;; WHEN: Thu Nov 17 16:11:57 UTC 2016
;; MSG SIZE rcvd: 60
Output of dig tasks.datacollector:
; <<>> DiG 9.9.5-9+deb8u8-Debian <<>> tasks.datacollector
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 9810
;; flags: qr rd ra; QUERY: 1, ANSWER: 12, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;tasks.datacollector. IN A
;; ANSWER SECTION:
tasks.datacollector. 600 IN A 192.168.1.115
tasks.datacollector. 600 IN A 192.168.1.66
tasks.datacollector. 600 IN A 192.168.1.22
tasks.datacollector. 600 IN A 192.168.1.114
tasks.datacollector. 600 IN A 192.168.1.37
tasks.datacollector. 600 IN A 192.168.1.139
tasks.datacollector. 600 IN A 192.168.1.148
tasks.datacollector. 600 IN A 192.168.1.110
tasks.datacollector. 600 IN A 192.168.1.112
tasks.datacollector. 600 IN A 192.168.1.100
tasks.datacollector. 600 IN A 192.168.1.39
tasks.datacollector. 600 IN A 192.168.1.106
;; Query time: 0 msec
;; SERVER: 127.0.0.11#53(127.0.0.11)
;; WHEN: Thu Nov 17 16:08:54 UTC 2016
;; MSG SIZE rcvd: 457
Output of docker version:
Client:
Version: 1.12.3
API version: 1.24
Go version: go1.6.3
Git commit: 6b644ec
Built: Wed Oct 26 23:26:11 2016
OS/Arch: darwin/amd64
Server:
Version: 1.12.3
API version: 1.24
Go version: go1.6.3
Git commit: 6b644ec
Built: Wed Oct 26 21:44:32 2016
OS/Arch: linux/amd64
Output of docker info:
Containers: 58
Running: 15
Paused: 0
Stopped: 43
Images: 123
Server Version: 1.12.3
Storage Driver: aufs
Root Dir: /var/lib/docker/aufs
Backing Filesystem: extfs
Dirs: 430
Dirperm1 Supported: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: host null overlay bridge
Swarm: active
NodeID: 8uxexr2uz3qpn5x1km9k4le9s
Is Manager: true
ClusterID: 2kd4md2qyu67szx4y6q2npnet
Managers: 3
Nodes: 8
Orchestration:
Task History Retention Limit: 5
Raft:
Snapshot Interval: 10000
Heartbeat Tick: 1
Election Tick: 3
Dispatcher:
Heartbeat Period: 5 seconds
CA Configuration:
Expiry Duration: 3 months
Node Address: 10.10.44.201
Runtimes: runc
Default Runtime: runc
Security Options: apparmor
Kernel Version: 3.13.0-91-generic
Operating System: Ubuntu 14.04.4 LTS
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 3.676 GiB
Name: stage-0
ID: 76Z2:GN43:RQND:BBAJ:AGUU:S3F7:JWBC:CCCK:I4VH:PKYC:UHQT:IR2U
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Username: herbrandson
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Labels:
provider=generic
Insecure Registries:
127.0.0.0/8
Additional environment details:
Docker swarm mode (not swarm). All nodes are running on AWS. The swarm has 8 nodes (3 managers and 5 workers)
UPDATE:
Per the comments, here's a snippet from the docker daemon logs on the swarm master:
time="2016-11-17T15:19:45.890158968Z" level=error msg="container status
unavailable" error="context canceled" module=taskmanager task.id=ch6w74b3cu78y8r2ugkmfmu8a
time="2016-11-17T15:19:48.929507277Z" level=error msg="container status unavailable" error="context canceled" module=taskmanager task.id=exb6dfc067nxudzr8uo1eyj4e
time="2016-11-17T15:19:50.104962867Z" level=error msg="container status unavailable" error="context canceled" module=taskmanager task.id=6mbbfkilj9gslfi33w7sursb9
time="2016-11-17T15:19:50.877223204Z" level=error msg="container status unavailable" error="context canceled" module=taskmanager task.id=drd8o0yn1cg5t3k76frxgukaq
time="2016-11-17T15:19:54.680427504Z" level=error msg="container status unavailable" error="context canceled" module=taskmanager task.id=9lwl5v0f2v6p52shg6gixs3j7
time="2016-11-17T15:19:54.949118806Z" level=error msg="container status unavailable" error="context canceled" module=taskmanager task.id=51q1eeilfspsm4cx79nfkl4r0
time="2016-11-17T15:19:56.485909146Z" level=error msg="container status unavailable" error="context canceled" module=taskmanager task.id=3vjzfjjdrjio2gx45q9c3j6qd
time="2016-11-17T15:19:56.934070026Z" level=error msg="Error closing logger: invalid argument"
time="2016-11-17T15:20:00.000614497Z" level=error msg="Error closing logger: invalid argument"
time="2016-11-17T15:20:00.163458802Z" level=error msg="container status unavailable" error="context canceled" module=taskmanager task.id=4xa2ub5npxyxpyx3vd5n1gsuy
time="2016-11-17T15:20:01.463407652Z" level=error msg="Error closing logger: invalid argument"
time="2016-11-17T15:20:01.949087337Z" level=error msg="Error closing logger: invalid argument"
time="2016-11-17T15:20:02.942094926Z" level=error msg="Failed to create real server 192.168.1.150 for vip 192.168.1.32 fwmark 947 in sb 938fe5f6f9bb893862e8c06becd76c1a7fe5f2d3b791fc55d7d8164e67ee3553: no such process"
time="2016-11-17T15:20:03.319168359Z" level=error msg="Failed to delete a new service for vip 192.168.1.61 fwmark 2133: no such process"
time="2016-11-17T15:20:03.363775880Z" level=error msg="Failed to add firewall mark rule in sbox /var/run/docker/netns/5de57ee133a5: reexec failed: exit status 5"
time="2016-11-17T15:20:05.772683092Z" level=error msg="Error closing logger: invalid argument"
time="2016-11-17T15:20:06.059212643Z" level=error msg="Error closing logger: invalid argument"
time="2016-11-17T15:20:07.335686642Z" level=error msg="Failed to delete a new service for vip 192.168.1.67 fwmark 2134: no such process"
time="2016-11-17T15:20:07.385135664Z" level=error msg="Failed to add firewall mark rule in sbox /var/run/docker/netns/6699e7c03bbd: reexec failed: exit status 5"
time="2016-11-17T15:20:07.604064777Z" level=error msg="Error closing logger: invalid argument"
time="2016-11-17T15:20:07.673852364Z" level=error msg="Failed to delete a new service for vip 192.168.1.75 fwmark 2097: no such process"
time="2016-11-17T15:20:07.766525370Z" level=error msg="Failed to add firewall mark rule in sbox /var/run/docker/netns/6699e7c03bbd: reexec failed: exit status 5"
time="2016-11-17T15:20:09.080101131Z" level=error msg="Failed to create real server 192.168.1.155 for vip 192.168.1.35 fwmark 904 in sb 192203a127d6831c3f4a41eabdd8df5282e33c3e92b99c3baaf1f213042f5418: no such process"
time="2016-11-17T15:20:11.516338629Z" level=error msg="Error closing logger: invalid argument"
time="2016-11-17T15:20:11.729274237Z" level=error msg="Failed to delete a new service for vip 192.168.1.83 fwmark 2124: no such process"
time="2016-11-17T15:20:11.887572806Z" level=error msg="Failed to add firewall mark rule in sbox /var/run/docker/netns/5b810132057e: reexec failed: exit status 5"
time="2016-11-17T15:20:12.281481060Z" level=error msg="Failed to delete a new service for vip 192.168.1.73 fwmark 2136: no such process"
time="2016-11-17T15:20:12.395326864Z" level=error msg="Failed to add firewall mark rule in sbox /var/run/docker/netns/5b810132057e: reexec failed: exit status 5"
time="2016-11-17T15:20:20.263565036Z" level=error msg="Failed to create real server 192.168.1.72 for vip 192.168.1.91 fwmark 2163 in sb cd59d61b16ac0325336121a8558e8215e42aa5300f75054df17a70bf1f3e6c0c: no such process"
time="2016-11-17T15:20:20.410996971Z" level=error msg="Failed to delete a new service for vip 192.168.1.95 fwmark 2144: no such process"
time="2016-11-17T15:20:20.456710211Z" level=error msg="Failed to add firewall mark rule in sbox /var/run/docker/netns/88d38a2bfb77: reexec failed: exit status 5"
time="2016-11-17T15:20:21.389253510Z" level=error msg="Failed to create real server 192.168.1.46 for vip 192.168.1.99 fwmark 2145 in sb cd59d61b16ac0325336121a8558e8215e42aa5300f75054df17a70bf1f3e6c0c: no such process"
time="2016-11-17T15:20:22.208965378Z" level=error msg="Failed to create real server 192.168.1.46 for vip 192.168.1.99 fwmark 2145 in sb e29369ad8ee5b12fb0c6f9bcb899514ab092f7da291a7c05eea758b0c19bfb65: no such process"
time="2016-11-17T15:20:23.334582312Z" level=error msg="Failed to create a new service for vip 192.168.1.97 fwmark 2166: file exists"
time="2016-11-17T15:20:23.495873232Z" level=error msg="Failed to create real server 192.168.1.48 for vip 192.168.1.17 fwmark 552 in sb e29369ad8ee5b12fb0c6f9bcb899514ab092f7da291a7c05eea758b0c19bfb65: no such process"
time="2016-11-17T15:20:25.831988014Z" level=error msg="Failed to create real server 192.168.1.116 for vip 192.168.1.41 fwmark 566 in sb 03ac1e71139ff2140f93c80d9e6b1d69abf442a0c2362610bee3e116e84ef434: no such process"
time="2016-11-17T15:20:25.850904011Z" level=error msg="Failed to create real server 192.168.1.116 for vip 192.168.1.41 fwmark 566 in sb 03ac1e71139ff2140f93c80d9e6b1d69abf442a0c2362610bee3e116e84ef434: no such process"
time="2016-11-17T15:20:37.159637665Z" level=error msg="container status unavailable" error="context canceled" module=taskmanager task.id=6yhu3glre4tbz6d08lk2pq9eb
time="2016-11-17T15:20:48.229343512Z" level=error msg="Error closing logger: invalid argument"
time="2016-11-17T15:51:16.027686909Z" level=error msg="Error getting service internal: service internal not found"
time="2016-11-17T15:51:16.027708795Z" level=error msg="Handler for GET /v1.24/services/internal returned error: service internal not found"
time="2016-11-17T16:15:50.946921655Z" level=error msg="container status unavailable" error="context canceled" module=taskmanager task.id=cxmvk7p1hwznautresir94m3s
time="2016-11-17T16:16:01.994494784Z" level=error msg="Error closing logger: invalid argument"
UPDATE 2:
I tried removing the service and re-creating it and that did not resolve the issue.
UPDATE 3:
I went through and rebooted each node in the cluster one-by-one. After that things appear to be back to normal. However, I still don't know what caused this. More importantly, how do I keep this from happening again in the future?
