I'm getting the following Warning message when trying to start the dagster-daemon:
Schedule my_hourly_schedule was started from a location Scheduler that can no longer be found in the workspace, or has metadata that has changed since the schedule was started. You can turn off this schedule in the Dagit UI from the Status tab.
I'm trying to automate some pipelines with dagster and created a new project using dagster new-project Scheduler where "Scheduler" is my project.
This command, as expected, created a directory with some hello_world files. Inside it I put the dagster.yaml file with the configuration for a PostgreSQL database to which I want to write the logs. The whole thing looks like this:
However, whenever I run dagster-daemon run from the directory where the workspace.yaml file is located, I get the message above. I tried running the daemon from other folders, but it then complains that it can't find any workspace.yaml files.
I guess, I'm running into a "beginner mistake", but could anyone help me with this?
I appreciate any counsel.
One thing to note is that the dagster.yaml file will not do anything unless you've set your DAGSTER_HOME environment variable to point at the directory that this file lives in.
That being said, I think what's going on here is that you don't have the Scheduler package installed into the python environment that you're running your dagster-daemon in.
To fix this, you can run pip install -e . in the Scheduler directory, although the README.md inside that directory has more specific instructions for working with virtualenvs.
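Both points can be checked before starting the daemon. The following is only a sketch: check_dagster_home is a hypothetical helper name (it is not part of dagster), and the absolute-path check reflects my understanding that dagster expects DAGSTER_HOME to be an absolute path.

```shell
# Hypothetical helper (not part of dagster): sanity-check DAGSTER_HOME.
check_dagster_home() {
    if [ -z "${DAGSTER_HOME:-}" ]; then
        echo "DAGSTER_HOME is not set"
        return 1
    fi
    case "$DAGSTER_HOME" in
        /*) ;;  # absolute path, fine
        *)  echo "DAGSTER_HOME should be an absolute path"; return 1 ;;
    esac
    if [ ! -f "$DAGSTER_HOME/dagster.yaml" ]; then
        echo "no dagster.yaml found in $DAGSTER_HOME"
        return 1
    fi
    echo "ok"
}
```

Assuming dagster.yaml sits in the Scheduler directory, you would export DAGSTER_HOME with the full path to that directory, call check_dagster_home, run pip install -e . inside Scheduler, and then start dagster-daemon run from the directory that contains workspace.yaml.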
When I created a new .conf file inside /etc/supervisor/conf.d/ and tried to start the program, it showed some errors (fatal error) and kept restarting by itself. Then I ran sudo service supervisor restart, but after that supervisor itself stopped and could not be restarted. While I was trying to solve this, the nginx server also got stuck.
After spending a lot of time I recovered it, Alhamdulillah, and I am writing the solution in the answer section.
Don't rely on this solution entirely for your problem. Your problem may have a different cause.
Sometimes Supervisor shows the error below when you restart the service (with sudo service supervisor restart):
unix:///var/run/supervisor.sock refused connection
Try to diagnose the problem by running supervisord. You can also run journalctl -xe.
Problems and Solutions:
A new .conf file inside the /etc/supervisor/conf.d directory may contain statements that generate errors.
For example, the .conf file runs a script, and that script runs Gunicorn to deploy a Python web app. The script binds Gunicorn to a unix socket, but the directory where the socket should be created does not allow the .sock file to be created there. This leads to a permission error.
The demo gunicorn command is below:
SOCKFILE=/home/shamim/python_project/another_directroy/gunicorn.sock
gunicorn ${DJANGO_WSGI_MODULE}:application \
--name $NAME \
--bind=unix:$SOCKFILE
If another_directory does not allow a .sock file to be created inside it, an error can occur. So give that directory permissions that allow the process to create files in it. Alternatively, bind to an IP address and port instead of a unix socket (like 127.0.0.1:ANY_PORT), after making sure the port is not used by another application.
The error can also occur if a directory path is used inside the .conf file but that directory does not exist.
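Both conditions (missing directory, no write permission) can be checked before starting the program. This is a minimal sketch; check_sock_dir is a hypothetical helper name, and the check only reflects the permissions of the user who runs it, so it should be run as the same user that supervisor uses for the program.

```shell
# Hypothetical helper: check that the directory for a socket file exists
# and is writable by the user running this check.
check_sock_dir() {
    sock_dir=$(dirname "$1")
    if [ ! -d "$sock_dir" ]; then
        echo "missing directory: $sock_dir"
        return 1
    fi
    if [ ! -w "$sock_dir" ]; then
        echo "not writable: $sock_dir"
        return 1
    fi
    echo "ok: $sock_dir"
}
```

For the example above you would call check_sock_dir with the value of SOCKFILE as its argument.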
Now run the command supervisord.
If the error persists after fixing the above issues and supervisord now shows an error like:
another program is already listening on a port that one of our HTTP servers is configured to use
then run the below command to fix this issue:
sudo unlink /var/run/supervisor.sock
If the command above does not work, check for the socket file at /tmp/supervisor.sock and unlink that instead.
Keep in mind that the nginx server can also show errors and fail to restart (or start) if any .conf file refers to a socket file that doesn't exist or that nginx doesn't have permission to access.
Example: suppose you write the code below in any nginx config file:
upstream surveyapp_payment_stripe {
server unix:/home/shamim/python_project/another_directroy/gunicorn.sock fail_timeout=0 weight=5 max_fails=3;
}
If the above socket doesn't exist or doesn't have the right permissions, errors may occur.
Nginx can also show an error if a directory path used here does not exist. To get nginx running again quickly, just delete the .conf file or change its extension (to anything other than .conf).
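To find which socket paths a config file depends on, something like the following can help. It is only a sketch: both function names are hypothetical, the sed pattern only handles simple "server unix:/path ...;" lines like the one above, and paths containing spaces are not supported.

```shell
# Print every socket path that follows "server unix:" in an nginx config file.
list_upstream_sockets() {
    sed -n 's/^[[:space:]]*server[[:space:]]\{1,\}unix:\([^[:space:];]*\).*/\1/p' "$1"
}

# Report paths that do not exist as socket files; returns 1 if any are missing.
check_upstream_sockets() {
    status=0
    for sock in $(list_upstream_sockets "$1"); do
        if [ ! -S "$sock" ]; then
            echo "missing socket: $sock"
            status=1
        fi
    done
    return $status
}
```

Independently of this, sudo nginx -t tests the configuration without restarting the server, so it is worth running before any restart.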
Hopefully this explanation will help someone in future.
I wanted to change the backup to a different disk. I mounted the disk to /mnt2 on CentOS, and when I navigate to Admin > Backups > Backup Daily > Edit backup-daily Backup, I see an option Server Path For Backup. I tried the following two things.
I entered the mount directory /mnt2 and hit run now. The background job fails with the following error in logs.
An error occurred while performing a backup: Backup directory provided
in configuration: '/mnt2' cannot be created or is not a directory.
I also tried creating a tmp2 directory on local drive and entered /tmp2 and hit run now. The background job fails with the same error as above.
Note 1:
I restarted the docker container just to see if it's not picking up file system changes in real time. That did not work.
Note 2:
There is a browse button next to Server Path For Backup, and I don't see the /mnt2 or /tmp2 directories I created there. I could not find anything useful in the documentation either.
How do I change the backup directory for artifactory?
The setup is artifactory with docker.
For artifactory docker instance, a volume needs to be specified so it maps to the local folder, say, /opt/artifactory/.
In my case, /var/opt/jfrog/artifactory(docker) is mapped to /opt/artifactory(local)
I had to create a folder here: /opt/artifactory/backup_mount, and give read and write access to user and group 1039. It shows up in the Artifactory UI as /var/opt/jfrog/artifactory/backup_mount.
Note:
If you create a directory, it shows up without any docker restart.
If you create a mount, restart docker so the mount is recognized.
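The path entered in the UI is the path as seen inside the container, not the host path. A small sketch of that translation, using the mapping from my setup (to_container_path is a hypothetical helper name; adjust both roots to your own volume mapping):

```shell
# Volume mapping taken from the answer above; adjust to your own setup.
HOST_ROOT=/opt/artifactory
CONTAINER_ROOT=/var/opt/jfrog/artifactory

# Hypothetical helper: translate a host path into the path the container sees.
to_container_path() {
    case "$1" in
        "$HOST_ROOT"/*)
            rest=${1#"$HOST_ROOT"}
            echo "$CONTAINER_ROOT$rest"
            ;;
        *)
            echo "not inside the mapped volume: $1" >&2
            return 1
            ;;
    esac
}
```

With this mapping, /opt/artifactory/backup_mount translates to /var/opt/jfrog/artifactory/backup_mount, while /mnt2 and /tmp2 from the question are rejected because the container cannot see them. For the directory itself, sudo mkdir -p and sudo chown with the UID and GID the container actually runs as (1039 in my case) should be enough.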
I have a Google Compute Engine VM instance with an Asterisk server running on it. I get this message when I try to run sudo:
sudo: parse error in /etc/sudoers near line 21
sudo: no valid sudoers sources found, quitting
sudo: unable to initialize policy plugin
Is there a password for root so I can try to change it there? Any suggestions on this?
It looks like you have manually edited the /etc/sudoers file so while you would normally have sudo access, due to the parse error, you won't be able to do this directly.
Here's how to fix this situation.
1. Save the current boot disk
go to the instance view in Developers Console
find your VM instance and click on its name; you should now be looking at a URL such as
https://console.cloud.google.com/project/[PROJECT]/compute/instancesDetail/zones/[ZONE]/instances/[VM-NAME]
stop the instance
detach the boot disk from the instance
2. Fix the /etc/sudoers on the boot disk
create a new VM instance with its own boot disk; you should have sudo access here
attach the disk saved above as a separate persistent disk
mount the disk you just attached
fix the /etc/sudoers file on the disk
unmount the second disk
detach the second disk from the VM
delete the new VM instance (let it delete its boot disk, you won't need it)
3. Restore the original VM instance
re-attach the boot disk to the original VM
restart the original VM with its original boot disk, with fixed config
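The same steps can be sketched with the gcloud CLI instead of the console. Treat this as a rough outline under assumptions: the function names are hypothetical, the VM, disk, and zone names are placeholders you pass in, and the script only prints the commands unless DRY_RUN=0 is set. Mounting the disk and fixing the file still happens by hand on the temporary VM.

```shell
# Dry-run by default: commands are printed, not executed. Set DRY_RUN=0 to run them.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# Step 1: stop the broken VM and detach its boot disk. Args: VM DISK ZONE
detach_boot_disk() {
    run gcloud compute instances stop "$1" --zone "$3"
    run gcloud compute instances detach-disk "$1" --disk "$2" --zone "$3"
}

# Step 2: create a temporary VM and attach the disk to it. Args: RESCUE_VM DISK ZONE
attach_to_rescue_vm() {
    run gcloud compute instances create "$1" --zone "$3"
    run gcloud compute instances attach-disk "$1" --disk "$2" --zone "$3"
}

# Step 3: move the fixed disk back as boot disk, start the original VM,
# and delete the temporary one. Args: VM RESCUE_VM DISK ZONE
restore_boot_disk() {
    run gcloud compute instances detach-disk "$2" --disk "$3" --zone "$4"
    run gcloud compute instances attach-disk "$1" --disk "$3" --boot --zone "$4"
    run gcloud compute instances start "$1" --zone "$4"
    run gcloud compute instances delete "$2" --zone "$4"
}
```

On the temporary VM, after mounting the attached disk (for example under /mnt, which is an assumption), sudo visudo -c -f /mnt/etc/sudoers checks the syntax of the file without installing it.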
How to avoid this in the future
Always use the visudo command, rather than just any text editor, to edit the /etc/sudoers file. visudo validates the contents of the file before saving it.
I ran into this issue as well and had the same issue Nakilon was reporting when trying the gcloud workaround.
What we ended up doing was configure a startup script that removed the broken sudoers file.
So in your metadata put something like:
#!/bin/sh
rm "/etc/sudoers.d/broken-config-file"
echo "ok" > /tmp/ok.log
https://cloud.google.com/compute/docs/startupscript
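One way to attach such a script from the command line, sketched under assumptions: the function names are hypothetical, the VM and zone are placeholders, and the commands are only printed unless DRY_RUN=0 is set. The startup script runs at boot, so the VM has to be stopped and started again.

```shell
# Dry-run by default: commands are printed, not executed. Set DRY_RUN=0 to run them.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# Write the startup script from above to a local file. Arg: SCRIPT_PATH
write_fix_script() {
    cat > "$1" <<'EOF'
#!/bin/sh
rm "/etc/sudoers.d/broken-config-file"
echo "ok" > /tmp/ok.log
EOF
}

# Attach the script as startup-script metadata and restart the VM.
# Args: VM ZONE SCRIPT_PATH
apply_fix_script() {
    run gcloud compute instances add-metadata "$1" --zone "$2" \
        --metadata-from-file "startup-script=$3"
    run gcloud compute instances stop "$1" --zone "$2"
    run gcloud compute instances start "$1" --zone "$2"
}
```

Once sudo works again, it is probably a good idea to remove the startup-script metadata so the script does not run on every boot.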
As you probably figured out this requires the /etc/sudoers file to be fixed. As nobody has root access to the instance, you will not be able to do this from inside the instance.
The best way to solve this is to edit the disk from another instance. The basic steps to do this are:
Take a snapshot of your disk as a backup (!)
Shut down your instance, taking care not to delete the boot disk.
Start a new "debugger" instance from one of the stock GCE images.
Attach the old boot disk to the new instance.
In the debugger instance, mount the disk.
In the debugger instance, fix the sudoers file on the mounted disk.
In the debugger instance, unmount the disk
Shut down the debugger instance.
Create a new instance with the same specs as your original instance using the fixed disk as the boot disk.
The new disk will then have the fixed sudoers file.
Since I bumped into this issue too: if you have another instance or any place where you can run with gcloud privileges, you can run:
gcloud compute --project "<project id>" ssh --zone "europe-west1-b" "<servername>"
I ran this on a server which had gcloud as root, so you log in to the other box as root too! Then fix your issue. (If you don't have a box, just spin up a micro instance with the correct gcloud privileges.) This saves the hassle of the disk stuff.
As mentioned in above comments, I am getting the same error like below in gcp VM.
sudo: parse error in /etc/sudoers near line 21
sudo: no valid sudoers sources found, quitting
sudo: unable to initialize policy plugin
To solve this
I SSHed to another VM, became root, and then ran the gcloud ssh command to our main VM (the one where you are getting the sudo error):
gcloud compute --project "<project id>" ssh --zone "europe-west1-b" "<servername>"
And BOOM! Now you are logged in as root on the VM.
Now you can access/change the /etc/sudoers file accordingly.
I found this hack better than recreating vm/disks.
Hope this helps someone!
It is possible to connect to a VM as root from your developers console Google Cloud Shell. Make sure the VM is running, start the shell and use this command:
gcloud compute ssh root@<instance-name> --zone <zone> [--project <project-id>]
where instance-name is found in the Compute Engine VM Instances screen. project-id is optional but required if you are connecting to an instance in a different project from the project where you started the shell.
You can then fix this and other issues that may prevent you from using sudo.
I got a Permission denied error when trying to ssh to the problem instance via gcloud. Using a startup script as mentioned above by @Jorick works. Instructions for it are here. You will have to stop and restart the VM instance for the startup script to get executed. I modified the script slightly:
rm -f /etc/sudoers.d/google_sudoers >& /tmp/startup.log
After the restart, launch an SSH session from the cloud console and check that you are able to view the file contents (with sudo more /etc/sudoers.d/google_sudoers for example). If that works your problem has been solved.
I've seen some similar issues, but none seem to address this exact problem I'm having.
I use the Nginx pid file when running an awstats update + log rotation so that I can tell the process to close and reopen the log files. This is the standard way I've seen of doing this:
kill -USR1 `cat /usr/local/nginx/logs/nginx.pid` (http://wiki.nginx.org/LogRotation)
However, my issue is that sometimes the pid file disappears. When this happens, the log rotation doesn't properly reopen files and nginx continues to write to the same log files. I have no idea why this happens, and I usually have to do a full nginx stop + nginx start to get it to recreate the pid file. It also doesn't happen on a regular schedule. Sometimes the webserver will be fine for months and then all of a sudden the PID file will disappear and then the logs won't get updated properly.
Is this something that anyone else has encountered? Any ideas to try?
Nginx version: 1.5.13
OS: CentOS 6.5
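For reference, the rotation step could be made tolerant of a missing pid file along these lines. This is only a sketch of a workaround, not an explanation of why the file disappears: nginx_master_pid is a hypothetical helper name, and the pgrep fallback assumes the master process title contains "nginx: master process".

```shell
# Print the nginx master PID: from the pid file if it exists and is non-empty,
# otherwise (assumption) from the oldest process whose command line
# contains "nginx: master process".
nginx_master_pid() {
    pidfile=${1:-/usr/local/nginx/logs/nginx.pid}
    if [ -s "$pidfile" ]; then
        cat "$pidfile"
    else
        pgrep -o -f 'nginx: master process'
    fi
}
```

The rotation script would then capture the output of nginx_master_pid in a variable and send kill -USR1 to that PID only if the function succeeded.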