Euca 5.0 Ansible Console Task Failing - eucalyptus

Background:
I am only able to get past the ansible console install/config tasks by adding --region localhost to anywhere in: /usr/share/eucalyptus-ansible/roles/cloud-post/tasks/console.yml wherever it calls tools that take that argument.
Otherwise each sub task fails like this: ["euca-describe-images: error: connection error (('Connection aborted.', gaierror(-2, 'Name or service not known')))"]
Running the commands from that playbook directly on the euca server being configured gives the same result unless I specify --region localhost
Problem:
I'm stuck here: [cloud-post : update console route53 system domain for eucalyptus-cloud authentication]
Error: "euform-update-stack: error (ValidationError): No updates are to be performed.", "stderr_lines": ["euform-update-stack: error (ValidationError): No updates are to be performed."]
All services are running except the ImagingBackend is Not Ready
No instances are running according to euca-describe-instances
Images are available:
IMAGE ami-5be483c81cf8bd65c eucalyptus-console-image-5-0-823/eucalyptus-console-image-5-0-823.raw.manifest.xml 000216594841 available private x86_64 machine instance-store hvm
TAG image ami-5be483c81cf8bd65c type eucalyptus-console-image
TAG image ami-5be483c81cf8bd65c version 5.0.823
IMAGE ami-f31092ddb73e29af9 eucalyptus-service-image-v5.0.100/eucalyptus-service-image.raw.manifest.xml 000216594841 available privatx86_64 machine instance-store hvm
TAG image ami-f31092ddb73e29af9 provides imaging,loadbalancing
TAG image ami-f31092ddb73e29af9 type eucalyptus-service-image
TAG image ami-f31092ddb73e29af9 version 5.0.100
---
all:
hosts:
exp-euca.lan.com:
exp-enc-[01:02].lan.com:
vars:
vpcmido_public_ip_range: "192.168.100.5-192.168.100.254"
vpcmido_public_ip_cidr: "192.168.100.1/24"
cloud_system_dns_dnsdomain: "cloud.lan.com"
cloud_public_port: 443
eucalyptus_console_cloud_deploy: yes
cloud_service_image_rpm: no
cloud_properties:
services.imaging.worker.ntp_server: "x.x.x.x"
services.loadbalancing.worker.ntp_server: "x.x.x.x"
children:
cloud:
hosts:
exp-euca.lan.com:
console:
hosts:
exp-euca.lan.com:
node:
hosts:
exp-enc-[01:02].lan.com:
EDIT:
Solved. Details are in the comments of the marked answer.

The name error most likely means that DNS for the domain cloud.lan.com is not being correctly delegated to your deployment. To test this, check if the nameserver is found:
dig +short NS cloud.lan.com
you should see "ns1.cloud.lan.com" and then should be able to use that nameserver to resolve services, e.g.
dig +short ec2.cloud.lan.com #ns1.cloud.lan.com
which should be the IP of the host for the compute service.
The second item is a bug in the ansible playbook that occurs when the stack is already present and up to date. To work around it, you can either update your playbook or delete the stack before running the playbook. Depending on how far the playbook progressed you may have a script to do this:
/usr/local/bin/console-manage-stack -a delete
the related playbook change is https://github.com/AppScale/ats-deploy/pull/36

Related

DynamoDB local behaving erratically

This is a very strange situation that's driving me nuts, and I would really appreciate some help here.
I am using CDK to define the DynamoDB table and associated indices. To test them locally, I installed cdklocal and DynamoDB local using localstack. When the computer (Mac running Ventura 13.1) is restarted, everything works as expected. Here is the script I use to bootstrap and start the stack (this is in a file called startStack.sh):
docker-compose up -d
echo "Waiting for 5 seconds"
sleep 5
cd test-app
cdklocal bootstrap
echo "Waiting for 5 seconds"
sleep 5
cdklocal deploy TestAppStack
#cdklocal deploy TestAppStack/ops-table
DYNAMO_ENDPOINT="http://localhost:4566/" dynamodb-admin &
open http://0.0.0.0:8001
cd ..
The test-app directory contains a local copy of the DynamoDB (and associated indices) definition. I do not encounter any errors running the cdklocal (or cdk) deploy commands so I am assuming that the CDK definition is not an issue.
The docker-compose looks like this:
version: "3.8"
services:
localstack:
container_name: AWS-DEVELOPMENT-WITH-LOCALSTACK
image: localstack/localstack:latest
network_mode: bridge
ports:
- "127.0.0.1:53:53"
- "127.0.0.1:53:53/udp"
- "127.0.0.1:443:443"
- "127.0.0.1:4566:4566"
- "127.0.0.1:4571:4571"
- "127.0.0.1:${PORT_WEB_UI-8080}:${PORT_WEB_UI-8080}"
environment:
- DYNAMODB_SHARE_DB=1
- DISABLE_CORS_CHECKS=1
- SERVICES=s3,dynamodb,sns,sqs,firehose,kinesis,ses,sts,cloudformation
- DEBUG=1
- DATA_DIR=/tmp/localstack/data
- PORT_WEB_UI=8080
- LAMBDA_EXECUTOR=local
- KINESIS_ERROR_PROBABILITY=1.0
- DOCKER_HOST=unix:///var/run/docker.sock
- HOST_TMP_FOLDER=./.localstack
volumes:
- './.localstack:/var/lib/localstack'
- '/var/run/docker.sock:/var/run/docker.sock'
Everything works as expected when I first run the startStack.sh file - the dynamodb-admin window opens up correctly and other interfaces can interact with the local DynamoDB table. But after some time (and I have not been able to pinpoint the cause), all interactions with local DynamoDB start failing with the following error(s):
Bootstrapping environment aws://000000000000/us-west-2...
❌ Environment aws://000000000000/us-west-2 failed bootstrapping: UnknownEndpoint: Inaccessible host: `localhost' at port `4566'. This service may not be available in the `us-west-2' region.
at Request.ENOTFOUND_ERROR (/usr/local/lib/node_modules/aws-sdk/lib/event_listeners.js:611:46)
at Request.callListeners (/usr/local/lib/node_modules/aws-sdk/lib/sequential_executor.js:106:20)
at Request.emit (/usr/local/lib/node_modules/aws-sdk/lib/sequential_executor.js:78:10)
at Request.emit (/usr/local/lib/node_modules/aws-sdk/lib/request.js:686:14)
at error2 (/usr/local/lib/node_modules/aws-sdk/lib/event_listeners.js:443:22)
at ClientRequest.<anonymous> (/usr/local/lib/node_modules/aws-sdk/lib/http/node.js:99:9)
at ClientRequest.emit (node:events:513:28)
at ClientRequest.emit (node:domain:489:12)
at Socket.socketErrorListener (node:_http_client:494:9)
at Socket.emit (node:events:513:28) {
code: 'UnknownEndpoint',
region: 'us-west-2',
hostname: 'localhost',
retryable: true,
originalError: [Error],
time: 2023-01-15T06:46:40.614Z
}
Inaccessible host: `localhost' at port `4566'. This service may not be available in the `us-west-2' region.
The script hangs at the following message:
[16:52:01] Retrieved account ID 000000000000 from disk cache
[16:52:01] Assuming role 'arn:aws:iam::000000000000:role/cdk-hnb659fds-deploy-role-000000000000-us-west-2'.
[16:52:01] Assuming role failed: Inaccessible host: `localhost' at port `4566'. This service may not be available in the `us-west-2' region.
[16:52:01] Could not assume role in target account using current credentials Inaccessible host: `localhost' at port `4566'. This service may not be available in the `us-west-2' region. . Please make sure that this role exists in the account. If it doesn't exist, (re)-bootstrap the environment with the right '--trust', using the latest version of the CDK CLI.
current credentials could not be used to assume 'arn:aws:iam::000000000000:role/cdk-hnb659fds-deploy-role-000000000000-us-west-2', but are for the right account. Proceeding anyway.
[16:52:01] Waiting for stack CDKToolkit to finish creating or updating...
Restarting the computer fixes it, but it's not clear what causes the issue in the first place. Restarting Docker does not help either.
Any thoughts on what could be causing the problem and how I can avoid it?
I'm adding this as an answer, although I do not have an affirmative answer I thought I would try to help.
I believe your port is being occupied and thus the process you are running is unable to obtain it resulting in error. Before running the job, check if the port is occupied:
sudo lsof -i :4566

docker run is stalling

I'm trying to run a docker image, which is essentially the default ASP.Net 6 default API:
docker run my-image-name --restart=aways -d -p 80:5111
I've tried a few different ways of running this, but it appears to always do the same thing. First of all, whether I launch in detached mode or interactive, I get the following display:
{"EventId":14,"LogLevel":"Information","Category":"Microsoft.Hosting.Lifetime","Message":"Now listening on: http://[::]:80","State":{"Message":"Now listening on: http://[::]:80","address":"http://[::]:80","{OriginalFormat}":"Now listening on: {address}"}}
{"EventId":0,"LogLevel":"Information","Category":"Microsoft.Hosting.Lifetime","Message":"Application started. Press Ctrl\u002BC to shut down.","State":{"Message":"Application started. Press Ctrl\u002BC to shut down.","{OriginalFormat}":"Application started. Press Ctrl\u002BC to shut down."}}
{"EventId":0,"LogLevel":"Information","Category":"Microsoft.Hosting.Lifetime","Message":"Hosting environment: Production","State":{"Message":"Hosting environment: Production","envName":"Production","{OriginalFormat}":"Hosting environment: {envName}"}}
{"EventId":0,"LogLevel":"Information","Category":"Microsoft.Hosting.Lifetime","Message":"Content root path: /app","State":{"Message":"Content root path: /app","contentRoot":"/app","{OriginalFormat}":"Content root path: {contentRoot}"}}
It then seems to stall - I can't ctrl-c, nor can I attach to the process (it does show under docker ps). I also can't connect to the API:
http://localhost:5111/swagger
Just returns ERR_CONNECTION_REFUSED
My guess here is that there's something wrong with the dockerfile itself - however, it builds fine. The question I have is: how can I debug this in order to determine the error?
I can't add this as a comment because i don't have the reputation so it will have to be an answer instead.
I am new to Docker myself so I am not sure why it is stalling but when you specify the port with -p 80:5111 you are mapping port 5111 in the container to port 80 on the Docker host.
So you should connect with http://localhost:80/swagger instead.

AWS credentials not found for celery-k8s deployment

I'm trying to run dagster using celery-k8s and using the examples/celery-k8s as a start. upon running the pipeline from playground I get
Initialization of resources [s3, io_manager] failed.
botocore.exceptions.NoCredentialsError: Unable to locate credentials
I have configured aws credentials in env variables as mentioned in the document
deployments:
- name: "user-code-deployment-test"
image:
repository: "somasays/dagster-usercode-example"
tag: "0.5"
pullPolicy: Always
dagsterApiGrpcArgs:
- "-f"
- "/workspace/repo.py"
port: 3030
env:
AWS_ACCESS_KEY_ID: AAAAAAAAAAAAAAAAAAAAAAAAA
AWS_SECRET_ACCESS_KEY: qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq
AWS_DEFAULT_REGION: eu-central-1
and I can also see these values are set in the env variables of the pod and can also access the s3 location after pip install awscli and aws s3 ls see the screenshot below the job pod however throws Unable to locate credentials
Please help
The deployment configuration applies to the user code servers. Meanwhile the celery executor runs your pipeline code in separate kubernetes jobs. To provide your secrets there, you will want to configure the env_secrets field of the celery-k8s executor in your pipeline run config.
See https://github.com/dagster-io/dagster/blob/master/python_modules/libraries/dagster-k8s/dagster_k8s/job.py#L321-L327 for details on the config.

Mosquitto: Starting in local only mode

I have a virtual machine that is supposed to be the host, which can receive and send data. The first picture is the error that I'm getting on my main machine (from which I'm trying to send data from). The second picture is the mosquitto log on my virtual machine. Also I'm using the default config, which as far as I know can't cause these problems, at least from what I have seen from other examples. I have very little understanding on how all of this works, so any help is appreciated.
What I have tried on the host machine:
Disabling Windows defender
Adding firewall rules for "mosquitto.exe"
Installing mosquitto on a linux machine
Starting with the release of Mosquitto version 2.0.0 (you are running v2.0.2) the default config will only bind to localhost as a move to a more secure default posture.
If you want to be able to access the broker from other machines you will need to explicitly edit the config files to either add a new listener that binds to the external IP address (or 0.0.0.0) or add a bind entry for the default listener.
By default it will also only allow anonymous connections (without username/password) from localhost, to allow anonymous from remote add:
allow_anonymous true
More details can be found in the 2.0 release notes here
You have to run with
mosquitto -c mosquitto.conf
mosquitto.conf, which exists in the folder same with execution file exists (C:\Program Files\mosquitto etc.), have to include following line.
listener 1883 ip_address_of_the_machine(192.168.1.1 etc.)
By default, the Mosquitto broker will only accept connections from clients on the local machine (the server hosting the broker).
Therefore, a custom configuration needs to be used with your instance of Mosquitto in order to accept connections from remote clients.
On your Windows machine, run a text editor as administrator and paste the following text:
listener 1883
allow_anonymous true
This creates a listener on port 1883 and allows anonymous connections. By default the number of connections is infinite. Save the file to "C:\Program Files\Mosquitto" using a file name with the ".conf" extension such as "your_conf_file.conf".
Open a terminal window and navigate to the mosquitto directory. Run the following command:
mosquitto -v -c your_conf_file.conf
where
-c : specify the broker config file.
-v : verbose mode - enable all logging types. This overrides
any logging options given in the config file.
I found I had to add, not only bind_address ip_address but also had to set allow_anonymous true before devices could connect successfully to MQTT. Of course I understand that a better option would be to set user and password on each device. But that's a next step after everything actually works in the minimum configuration.
For those who use mosquitto with homebrew on Mac.
Adding these two lines to /opt/homebrew/Cellar/mosquitto/2.0.15/etc/mosquitto/mosquitto.conf fixed my issue.
allow_anonymous true
listener 1883
you can run it with the included 'no-auth' config file like so:
mosquitto -c /mosquitto-no-auth.conf
I had the same problem while running it inside docker container (generated with docker-compose).
In docker-compose.yml file this is done with:
command: mosquitto -c /mosquitto-no-auth.conf

~/.ssh/id_rsa.pub not found error while installing capistrano as ansible playbook

I try to install https://github.com/roots/bedrock-ansible to get a bedrock deployment (http://roots.io/wordpress-stack/) running.
When I run "vagrant up", after some time I get the error:
TASK: [capistrano-setup | Setup deploy group] *********************************
skipping: [default]
TASK: [capistrano-setup | Setup deploy user] **********************************
skipping: [default]
TASK: [capistrano-setup | Adding public key to server] ************************
fatal: [default] => could not locate file in lookup: ~/.ssh/id_rsa.pub
FATAL: all hosts have already failed -- aborting
PLAY RECAP ********************************************************************
to retry, use: --limit #/Users/johannes/site.retry
default : ok=46 changed=16 unreachable=1 failed=0
Ansible failed to complete successfully. Any error output should be
visible above. Please fix these errors and try again.
I do not have a clou how i can fix this. Do you have an idea?
It seems the role is trying to find your local public key. It should be in the location in the error message '~/.ssh/id_rsa.pub', but it's not. So either you don't have one, or you keep it in another location.
If you're not familiar with generating SSH keys you probably don't have one. I personally like the GitHub help page for this: https://help.github.com/articles/generating-ssh-keys/
(you only have to perform steps 1 and 2).
If you do have SSH keys, but in a different location, the capistrano-install role in bedrock uses some variables:
deploy_user: deploy
deploy_keys:
- "~/.ssh/id_rsa.pub"
So you can set (multiple) public key files in the deploy_keys list and they will be added to the deploy_user's authorized keys.
All this is needed because Capistrano will use the deploy user to connect to the remote server later. http://blakesmith.me/2010/02/08/understanding-public-key-private-key-concepts.html

Resources