How do I edit files in a Docker image?

I am just setting up Docker on my local machine for web development.
I have seen lots of tutorials for Docker with Rails etc.
I am curious how Docker works in terms of editing the project's source code.
I am trying to wrap my head around this -v flag.
In many of the tutorials I have seen, users store their Dockerfile in the project base directory and build from there. Do you just edit the code in the directory, refresh the browser, and leave Docker running?
Sorry if this is a basic question, I am just trying to wrap my head around it all.

I usually differentiate two use cases of Docker:
in one case I want a Dockerfile that helps end users get started easily
in another case I want a Dockerfile that helps code contributors get a testing environment up and running easily
For end users, you want your Dockerfile to:
install dependencies
check out the latest stable code (from GitHub or elsewhere)
set up some kind of default configuration
For contributors, you want your Dockerfile to:
install dependencies
document how to run a Docker container that sets up a volume to share the source code between their development environment and the container (see the sketch below).
To sum up, for end users the Docker image should embed the application code, while for contributors the Docker image will just have the dependencies.
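For the contributor case, sharing the source code into the container is exactly what the -v flag does. A minimal sketch, assuming a hypothetical image named my-dev-image that runs the app from /app and serves it on port 3000:

docker run --rm -it \
  -p 3000:3000 \
  -v "$(pwd)":/app \
  my-dev-image

With the project directory mounted at /app, you edit the files with your normal editor on the host and simply refresh the browser; the running container sees the changes without a rebuild.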

Related

Passing files from a rocker container to a LaTeX container within a GitLab CI job

I would like to use GitLab CI to compile a LaTeX article as explained in this answer on tex.stackexchange (a similar PDF generation example is shown in the GitLab documentation for artifacts). I use a special LaTeX template given by the journal editor. My LaTeX article contains figures made with the R statistical software. R and LaTeX are two large software installations with a lot of dependencies, so I decided to use two separate containers for the build: one for the statistical analysis and visualization with R, and one to compile the LaTeX document to PDF.
Here is the content of .gitlab-ci.yml:
knit_rnw_to_tex:
  image: rocker/verse:4.0.0
  script:
    - Rscript -e "knitr::knit('article.Rnw')"
  artifacts:
    paths:
      - figure/

compile_pdf:
  image: aergus/latex
  script:
    - ls figure
    - latexmk -pdf -bibtex -use-make article.tex
  artifacts:
    paths:
      - article.pdf
The knit_rnw_to_tex job executed in the R "rocker" container is successful and I can download the figure artifacts from the GitLab "jobs" page. The issue in the second job, compile_pdf, is that ls figure shows me an empty folder and the LaTeX article compilation fails because of the missing figures.
It should be possible to use artifacts to pass data between jobs according to this answer and to this well-explained forum post, but they use only one container for different jobs. It doesn't work in my case. Probably because I use two different containers?
Another solution would be to use only the rocker/tidyverse container and install latexmk in there, but apt install latexmk fails for an unknown reason. Maybe because it has over a hundred dependencies and that is too much for GitLab CI?
The "dependencies" keyword could help according to that answer, but the artifacts are still not available when I use it.
How can I pass the artifacts from one job to the other?
Should I use cache as explained in docs.gitlab.com / caching?
Thank you for the comment, as I wanted to be sure how you do it. An example would help too, but I'll stay generic for now (using Docker).
To run multiple containers you need the Docker executor.
To quote the documentation on it:
The Docker executor, when used with GitLab CI, connects to Docker Engine and runs each build in a separate and isolated container using the predefined image that is set up in .gitlab-ci.yml and in accordance with config.toml.
Workflow
The Docker executor divides the job into multiple steps:
Prepare: Create and start the services.
Pre-job: Clone, restore cache and download artifacts from previous stages. This is run on a special Docker image.
Job: User build. This is run on the user-provided Docker image.
Post-job: Create cache, upload artifacts to GitLab. This is run on a special Docker Image.
Your config.toml could look like this:
[runners.docker]
  image = "rocker/verse:4.0.0"
  builds_dir = "/home/builds/rocker"

  [[runners.docker.services]]
    name = "aergus/latex"
    alias = "latex"
From above linked documentation:
The image keyword
The image keyword is the name of the Docker image that is present in the local Docker Engine (list all images with docker images) or any image that can be found at Docker Hub. For more information about images and Docker Hub please read the Docker Fundamentals documentation.
In short, with image we refer to the Docker image, which will be used to create a container on which your build will run.
If you don’t specify the namespace, Docker implies library which includes all official images. That’s why you’ll see many times the library part omitted in .gitlab-ci.yml and config.toml. For example you can define an image like image: ruby:2.6, which is a shortcut for image: library/ruby:2.6.
Then, for each Docker image there are tags, denoting the version of the image. These are defined with a colon (:) after the image name. For example, for Ruby you can see the supported tags at Docker Hub. If you don’t specify a tag (like image: ruby), latest is implied.
The image you choose to run your build in via image directive must have a working shell in its operating system PATH. Supported shells are sh, bash, and pwsh (since 13.9) for Linux, and PowerShell for Windows. GitLab Runner cannot execute a command using the underlying OS system calls (such as exec).
The services keyword
The services keyword defines just another Docker image that is run during your build and is linked to the Docker image that the image keyword defines. This allows you to access the service image during build time.
The service image can run any application, but the most common use case is to run a database container, e.g., mysql. It’s easier and faster to use an existing image and run it as an additional container than install mysql every time the project is built.
You can see some widely used services examples in the relevant documentation of CI services examples.
If needed, you can assign an alias to each service.
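For illustration, a hedged sketch of declaring a service with an alias in .gitlab-ci.yml (the MySQL image here is just the typical documentation example, not something your pipeline needs):

services:
  - name: mysql:8.0
    alias: db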
As for your questions:
It should be possible to use artifacts to pass data between jobs according to this answer and to this well-explained forum post, but they use only one container for different jobs. It doesn't work in my case. Probably because I use two different containers?
The builds and cache storage (from documentation)
The Docker executor by default stores all builds in /builds/<namespace>/<project-name> and all caches in /cache (inside the container). You can overwrite the /builds and /cache directories by defining the builds_dir and cache_dir options under the [[runners]] section in config.toml. This will modify where the data are stored inside the container.
If you modify the /cache storage path, you also need to make sure to mark this directory as persistent by defining it in volumes = ["/my/cache/"] under the [runners.docker] section in config.toml.
builds_dir -> Absolute path to a directory where builds are stored in the context of the selected executor. For example, locally, Docker, or SSH.
The [[runners]] section documentation
As you may have noticed, I have customized builds_dir in your config.toml to /home/builds/rocker; please adjust it to your own path.
How can I pass the artifacts from one job to the other?
You can use the builds_dir directive. A second option would be to use the Job Artifacts API.
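For the API route, a hedged sketch of downloading another job's artifacts with curl (the host, project ID, ref name and token are placeholders):

curl --header "PRIVATE-TOKEN: <your_access_token>" \
  --output artifacts.zip \
  "https://gitlab.example.com/api/v4/projects/<project_id>/jobs/artifacts/<ref_name>/download?job=knit_rnw_to_tex"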
Should I use cache as explained in docs.gitlab.com / caching?
Yes, you should use cache to store project dependencies. The advantage is that you fetch the dependencies from the internet only once, and subsequent runs are much faster because they can skip this step. Artifacts are used to share results between build stages.
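As an illustration, a minimal cache block in .gitlab-ci.yml, assuming your jobs download dependencies into a project-local directory such as .cache/ (the key and path are placeholders, not part of your current setup):

cache:
  key: "$CI_COMMIT_REF_SLUG"
  paths:
    - .cache/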
I hope it is now clearer and that I have pointed you in the right direction.
The two different images are not the cause of your problems. The artifacts are saved in one image (which seems to work), and then restored in the other. I would therefore advise against building (and maintaining) a single image, as that should not be necessary here.
The reason you are having problems is that you are missing build stages which inform gitlab about dependencies between the jobs. I would therefore advise you to specify stages as well as their respective jobs in your .gitlab-ci.yml:
stages:
  - do_stats
  - do_compile_pdf

knit_rnw_to_tex:
  stage: do_stats
  image: rocker/verse:4.0.0
  script:
    - Rscript -e "knitr::knit('article.Rnw')"
  artifacts:
    paths:
      - figure/

compile_pdf:
  stage: do_compile_pdf
  image: aergus/latex
  script:
    - ls figure
    - latexmk -pdf -bibtex -use-make article.tex
  artifacts:
    paths:
      - article.pdf
Context:
By default, all artifacts of previous build stages are made available in later stages if you add the corresponding specifications.
If you do not specify any stages, GitLab will put all jobs into the default test stage and execute them in parallel, assuming that they are independent and do not require each other's artifacts. It will still store the artifacts but not make them available between the jobs. This is presumably what is causing your problems.
As for the cache: artifacts are how you pass files between build stages. Caches are for, well, caching. In practice, they are used for things like external packages in order to avoid having to download them multiple times, see here. Caches are somewhat unpredictable in situations with multiple different runners. They are used only for performance reasons, and passing files between jobs via the cache rather than the artifact system is a huge anti-pattern.
Edit: I don't know precisely what your knitr setup is, but if you generate an article.tex from your article.Rnw, then you probably need to add that to your artifacts as well.
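In that case, the artifacts block of the first job would need to list the generated .tex file as well; a sketch, assuming article.tex lands in the project root:

artifacts:
  paths:
    - article.tex
    - figure/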
Also, services are used for things like a MySQL server for testing databases, or the dind (docker in docker) daemon to build docker images. This should not be necessary in your case. Similarly, you should not need to change any runner configuration (in their respective config.toml) from the defaults.
Edit 2: I added an MWE here, which works with my GitLab setup.

How to convert a container into an image?

I installed WordPress with docker-compose. Now that I've finished developing the website, how can I turn this container into a permanent image so that I'm able to update the website even if I remove the current container?
The procedure I went through is the same as in this tutorial.
Now I have the WordPress containers shown below:
$ docker-compose images
Container               Repository   Tag            Image Id       Size
---------------------------------------------------------------------------
wordpress_db_1          mysql        5.7            e47e309f72c8   355 MB
wordpress_wordpress_1   wordpress    5.1.0-apache   523eaf9f0ced   402 MB
If that wordpress image is well made, you should only need to back up your volumes. However, if you changed files on the container filesystem (as opposed to in volumes), you will also need to commit your container to produce a new Docker image. Such an image could then be used to create new containers.
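If it came to that, committing the container is a one-liner; a sketch, with an illustrative image name and tag:

docker commit wordpress_wordpress_1 my-wordpress:customized

The resulting image could then be referenced from your docker-compose.yml instead of wordpress:5.1.0-apache.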
In order to figure out if files were modified/added on the container filesystem, run the docker diff command:
docker diff wordpress_wordpress_1
In my tests, after going through the WordPress setup, and even after updating WordPress, plugins and themes, the result of the docker diff command gives me:
C /run
C /run/apache2
A /run/apache2/apache2.pid
Which means that only 2 files/directories were Changed and 1 file Added.
As such, there is no point going through the trouble of using the docker commit command to produce a new Docker image. Such an image would only contain those 3 modifications.
This also means that this WordPress Docker image is well designed, because all valuable data is persisted in Docker volumes. (The same applies to the MySQL image.)
How to deal with a lost container?
As we have verified earlier, all valuable data lies in Docker volumes. So it does not matter if you lose your containers; all that matters is not to lose your volumes. The question of how to back up a Docker volume is already answered multiple times on Stack Overflow.
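One common pattern is to archive a volume's contents from a throwaway container; a sketch, assuming the database volume is named wordpress_db_data (check docker volume ls for the actual name docker-compose generated):

docker run --rm \
  -v wordpress_db_data:/volume:ro \
  -v "$(pwd)":/backup \
  alpine tar czf /backup/db_data.tgz -C /volume .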
Now be aware that a few docker and docker-compose commands do delete volumes! For instance, if you run docker rm -v <my container>, the -v option tells Docker to also delete the associated volumes while deleting the container. And if you run docker-compose down -v, volumes will also be deleted.
How to back up WordPress running in a docker-compose project?
Well, the best way is to back up your WordPress data with a WordPress plugin that is well known for doing so correctly. Running WordPress in Docker containers does not mean that WordPress good practices no longer apply.
In case you need to restore your website, start new containers/volumes with your docker-compose.yml file, go through the minimal WordPress setup, install your backup plugin and use it to restore your data.

WordPress local dev environment image

I would like to set up an image for a local WordPress environment.
We have developers working on Mac, Windows and Linux, and I want it to be easy to set up a working environment.
It is important that they can use an IDE outside the VM for development, as well as Git.
What is the best way to achieve that? Docker or Vagrant?
I tried doing so according to this tutorial, but there is some stuff wrong with it and it does not work.
https://resources.distilnetworks.com/all-blog-posts/wordpress-development-with-vagrant
I've used Varying Vagrant Vagrants, which is great for WordPress development, for a pretty long time; recently I've switched to Laravel's Homestead, which is also great and works fine with WordPress too.
Both are pre-made environments that are easy to install, pre-configured and ready to use. I have never had any issues with them.
You can use WPTunnel for this purpose.
It creates local WordPress installations inside Docker containers. The source files are mounted from a local directory, so any IDE and Git can access them:
https://github.com/dsdenes/wptunnel
E.g. if you create a local installation with:
$ wptunnel create mysite
then you can access the source files at ~/wptunnel/projects/mysite
So if you want to use a version control tool like Git, then you can do:
$ git init ~/wptunnel/projects/mysite
and edit the files with your favorite IDE:
$ code ~/wptunnel/projects/mysite
Disclaimer: I'm the author of the library.

In the Docker WordPress image, what causes a delay in copying application files?

I have created a new Dockerfile based on the official WordPress image, and I have been trying to troubleshoot why I cannot remove the default themes. I discovered that the reason is that, at the time the command is executed, the files do not actually exist yet.
Here are the relevant lines from my Dockerfile:
FROM wordpress
RUN rm -rf /var/www/html/wp-content/themes/twenty*
The delete command works as expected if I run it manually once the container is running.
As a side note, I have also discovered that when I copy additional custom themes to the /var/www/html/wp-content/themes directory from the Dockerfile, it does work, but not quite as I would expect: any files in the official Docker image will overwrite my custom versions of the same file. I would have imagined this to work the other way around, in case I want to supply my own config file.
So I actually have two questions:
Is this behavior Docker-related, or is it specific to the WordPress image?
How can I resolve this? It feels like a hack, but is there a way to run a delayed command asynchronously from the Dockerfile?
What's up, Ben!
Your issue is related to a concept introduced by Docker called the entrypoint. It's typically a script that is executed when the container is run, and it contains actions that need to be run at runtime, not at build time. That script is run right after you start the container. Entrypoints are used to make containers behave like services. The parameters set with the CMD directive are, by default, the ones passed directly to the entrypoint, and they can be overwritten.
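For illustration, a minimal sketch of how the two directives interact (the base image and script name here are illustrative, not taken from the WordPress image):

FROM debian:bookworm-slim
COPY docker-entrypoint.sh /usr/local/bin/
# The entrypoint runs first at container start; CMD supplies its default arguments ("$@"),
# and a well-behaved entrypoint usually finishes with `exec "$@"`.
ENTRYPOINT ["docker-entrypoint.sh"]
CMD ["apache2-foreground"]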
You can find the Debian template of the Dockerfile of the image you are pulling here. As you can see, it calls an entrypoint named docker-entrypoint.sh. Without diving into it too much: basically, it performs the installation of your application at container start.
Since you are inheriting from the WordPress image, the entrypoint of that image is being executed. Overwriting it so that it is not executed anymore is not a good idea either, since that would render your image useless.
A simple hack that would work in this case would be the following:
FROM wordpress
# Prepend the cleanup to the final `exec "$@"` line of the image's entrypoint script.
# The path below is the one used by the official image; adjust it if yours differs.
RUN sed -i 's|exec "\$@"|rm -rf /var/www/html/wp-content/themes/twenty* \&\& exec "$@"|' /usr/local/bin/docker-entrypoint.sh
That rewrites the entrypoint so that the last exec clause first removes those files and then runs whatever service the image decided to run (typically Apache, though it depends on the variant of the image).
I hope that helps! :)

Developing in a Docker image that's under version control

I currently have a pipeline that I use to build reports in R and publish them with Jekyll. I keep my files under version control on GitHub and that has been working great so far.
Recently I began thinking about how I might take R, Ruby and Jekyll and build a Docker image that any of my coworkers could download and use to run the same report without having all of the packages and gems set up on their computer. I looked at Docker Hub and found that the automated builds for git commits were a very interesting feature.
I want to build an image that I could use to run this configuration, keep it under version control as well, and keep it up to date on Docker Hub. How does something like this work?
If I just kept my current setup I could add a Dockerfile to my repo and Docker Hub would build my image for me; I just think it would be interesting to run my work on the same image.
Any thoughts on how a pipeline like this might work?
The Docker Hub build service should work (https://docs.docker.com/docker-hub/builds/). You can also consider using GitLab CI or Travis CI (GitLab will be useful for private projects, and it also provides a private Docker registry).
You should have two Dockerfiles: one with all the dependencies, and a second, very minimalistic one for the reports (builds will be much faster). Something like:
FROM base_image:0.1
COPY . /reports
WORKDIR /reports
RUN replace-with-required-jekyll-magic
The Dockerfile above should be in your reports repository.
In the second repository you can create the base image with all the tools, plus nginx or something similar for serving static files. Make sure that the nginx www-root is set to /reports. If you need to update the tools, just update the base_image tag in the Dockerfile for the reports.
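A rough sketch of what that base image Dockerfile might look like, assuming rocker/verse as the R/LaTeX base and Debian's nginx packaging (the image name, tag and paths are illustrative):

FROM rocker/verse:4.0.0
# Ruby + Jekyll on top of the R/LaTeX toolchain, plus nginx to serve the built site.
RUN apt-get update \
    && apt-get install -y --no-install-recommends ruby-full build-essential nginx \
    && gem install jekyll bundler \
    && rm -rf /var/lib/apt/lists/*
# Point nginx's www-root at /reports (or the subdirectory where Jekyll writes the built site).
RUN sed -i 's|root /var/www/html;|root /reports;|' /etc/nginx/sites-available/default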
