Is there a way to expire the cache of the docker build command without explicitly adding the --no-cache flag?
Let's say I would like to check for package (security) updates at least once per week when building my image in the GitLab CI/CD pipeline, and allow caching the rest of the time to speed up the build.
I was thinking of using an auxiliary artifact that expires after a certain time, but this is probably not the most elegant (and certainly not the most portable) solution.
Example of my Dockerfile:
FROM rocker/r-ver:4.0
# Apply OS updates
# This will be cached by default
RUN apt-get update && apt-get --yes upgrade
COPY myscript.R ./myscript.R
CMD [ "Rscript", "./myscript.R" ]
We have a JS-based stack in our application: React, with the vast majority of it being a React Admin frontend, built on a Next.js server, with Postgres, Prisma and Nexus on the backend. I realize it's not a great use case for Next.js (React Admin basically puts the entire application in a single "component" (root), so I basically have a giant index.tsx page instead of lots of smaller pages), but we've had quite terrible build times in GitLab CI and I'd like to know if there's anything I can do about it.
We use custom GitLab runners deployed on the company Kubernetes cluster. Our build job essentially looks like:
- docker login
- CACHE_IMAGE_NAME="$CI_REGISTRY_IMAGE/$CI_COMMIT_REF_SLUG:latest"
- SHA_IMAGE_NAME="$CI_REGISTRY_IMAGE/$CI_COMMIT_SHORT_SHA"
- docker pull $CACHE_IMAGE_NAME || true
- docker build
  -t $CACHE_IMAGE_NAME
  -t $SHA_IMAGE_NAME
  --cache-from=$CACHE_IMAGE_NAME
  --build-arg BUILDKIT_INLINE_CACHE=1
  .
- docker push # both tags
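Note that --build-arg BUILDKIT_INLINE_CACHE=1 only embeds cache metadata when the build actually runs under BuildKit. A sketch of the job variable that enables it (assuming a Docker-in-Docker style runner, which the post doesn't spell out):
variables:
  DOCKER_BUILDKIT: "1"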
And the Dockerfile for that is:
FROM node:14-alpine
WORKDIR /app
RUN chown -R node:node /app
USER node
COPY package.json yarn.lock ./
ENV NODE_ENV=production
RUN yarn install --frozen-lockfile --production
COPY . .
# Prisma client generate
RUN npm run generate
ENV NEXT_TELEMETRY_DISABLED 1
RUN npm run build
ARG NODE_OPTIONS=--max-old-space-size=4096
ENV NODE_OPTIONS $NODE_OPTIONS
EXPOSE 3000
CMD ["yarn", "start"]
This built image is then deployed with Helm into our K8s cluster, with the premise that the initial build is slower but subsequent builds in the pipeline will be faster because they can utilize the Docker cache. This works fine for npm install (the first run takes around 10 minutes, subsequent runs are cached), but next build is where all hell breaks loose. The build times are around 10-20 minutes. I recently updated to Next.js 12.0.2, which ships with the new Rust-based SWC compiler that is supposed to be up to 5 times faster, and it's actually even slower (16 minutes).
I must be doing something wrong, but can anyone point me in some direction? Unfortunately, React Admin cannot be split across several Next.js pages AFAIK, and rewriting it to not use the framework is not an option either. I've tried running npm install and next build in the CI job, copying the results into the image and storing them in the GitLab cache, but that seems to just shift the time spent from installing/building to copying the massive directories into and out of the cache and into the image. I'd like to try caching the .next directory between builds; maybe some kind of incremental build is possible, but I'm skeptical to say the least.
Well, there are different things we can look at to make this faster.
You're using Prisma, but you're generating the client every time any of the files changes, which prevents the Docker cache from actually taking care of that layer. If we take a look at the Prisma documentation, we only need to generate the Prisma Client when there's a change in the Prisma schema, not in the TS/JS code.
I will suppose you have your Prisma schema under the prisma directory, but feel free to adapt my assumptions to the reality of your project:
ENV NODE_ENV=production
COPY package.json yarn.lock ./
RUN yarn install --frozen-lockfile --production
COPY prisma prisma
RUN npm run generate
You're using a huge image for your final container, which may not have a significant impact on the build time, but it definitely has one on the final size and on the time required to load/download the image. I would recommend migrating to a multi-stage build like the following one:
ARG NODE_OPTIONS=--max-old-space-size=4096
FROM node:alpine AS dependencies
# WORKDIR is needed so the later COPY --from=dependencies /app/node_modules finds the files
WORKDIR /app
COPY package.json yarn.lock ./
RUN yarn install --frozen-lockfile --production
COPY prisma prisma
RUN npm run generate
FROM node:alpine AS build
WORKDIR /app
COPY . .
COPY --from=dependencies /app/node_modules ./node_modules
ENV NEXT_TELEMETRY_DISABLED 1
RUN npm run build
FROM node:alpine
ARG NODE_OPTIONS
ENV NODE_ENV production
ENV NODE_OPTIONS $NODE_OPTIONS
RUN addgroup -g 1001 -S nodejs
RUN adduser -S nextjs -u 1001
WORKDIR /app
COPY --from=build /app/public ./public
COPY --from=build /app/node_modules ./node_modules
COPY --from=build /app/package.json ./package.json
COPY --chown=nextjs:nodejs --from=build /app/.next ./.next
USER 1001
EXPOSE 3000
CMD ["yarn", "start"]
From another point of view, you could probably also improve the Next.js build itself by changing some tooling and tweaking the Next.js configuration. You should run https://github.com/stephencookdev/speed-measure-webpack-plugin locally to analyze what the culprit of that humongous build time is (probably something related to Sass), and also take a look at the TerserPlugin and the IgnorePlugin.
I'm using CodePipeline to deploy whatever is on the master branch of the Git repository to Elastic Beanstalk.
I followed this tutorial to extend the default nginx configuration (specifically the max-body-size): https://medium.com/swlh/using-ebextensions-to-extend-nginx-default-configuration-in-aws-elastic-beanstalk-189b844ab6ad
However, because I'm not using the standard eb deploy command, I don't think the CodePipeline flow is going into the .ebextensions directory and doing the things it's supposed to do.
Is there a way to use CodePipeline (so I can have CI/CD from master) and still get the benefits of .ebextensions?
Does this work if you use the eb deploy command directly? If yes, then I would try using the pipeline execution history to find a recent artifact to download and test with the eb deploy command.
If CodePipeline's Elastic Beanstalk job worker does not play well with ebextensions, I would consider it completely useless for deploying to Elastic Beanstalk.
I believe there is some problem with the ebextensions themselves. You can investigate the execution in these log files to see if something is going wrong during deployment:
/var/log/eb-activity.log
/var/log/eb-commandprocessor.log
/var/log/eb-version-deployment.log
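A quick way to inspect them (a sketch, assuming the EB CLI is configured for the environment; eb logs is an alternative if SSH is not set up):
eb ssh
tail -n 200 /var/log/eb-activity.log /var/log/eb-commandprocessor.log /var/log/eb-version-deployment.log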
All the config files under .ebextensions are executed in order of precedence when deploying to Elastic Beanstalk. It doesn't matter whether you are using CodePipeline or eb deploy; all the files in the .ebextensions directory will be executed, so you don't have to worry about that.
Be careful about the platform you're using: on "64bit Amazon Linux 2 v5.0.2" you have to use .platform instead of .ebextensions.
Create a .platform directory instead of .ebextensions
Create the subfolders and the proxy.conf file, i.e. .platform/nginx/conf.d/proxy.conf
In proxy.conf, write what you need; for the request body size it is just client_max_body_size 20M;
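For reference, a minimal sketch of that file with only the body-size override:
# .platform/nginx/conf.d/proxy.conf
client_max_body_size 20M;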
I resolved the problem. You need to include the .ebextensions folder in your deployment artifact.
I was only copying the dist files, so I also had to include:
- .ebextensions/**/*
Example:
## Required mapping. Represents the buildspec version. We recommend that you use 0.2.
version: 0.2
phases:
  ## install: install dependencies you may need for your build
  install:
    runtime-versions:
      nodejs: 12
    commands:
      - echo Installing Nest...
      - npm install -g @nestjs/cli
  ## pre_build: final commands to execute before build
  pre_build:
    commands:
      - echo Installing source NPM dependencies...
      - npm install
  ## build: actual build commands
  build:
    commands:
      # Build your app
      - echo Build started on `date`
      - echo Compiling the Node.js code
      - npm run build
      ## Clean up node_modules to keep only production dependencies
      # - npm prune --production
  ## post_build: finishing touches
  post_build:
    commands:
      - echo Build completed on `date`
# Include only the files required for your application to run.
artifacts:
  files:
    - dist/**/*
    - package.json
    - node_modules/**/*
    - .ebextensions/**/*
And the config file /.ebextensions/.node-settings.config:
option_settings:
  aws:elasticbeanstalk:container:nodejs:
    NodeCommand: "npm run start:prod"
I am defining a Dockerfile where I install sqlite3 in an Ubuntu-based image, something very similar (I also install gRPC and Rust, as well as all the necessary dependencies) to:
FROM ubuntu
RUN apt-get update && \
    apt-get install -y sqlite3 libsqlite3-dev && \
    apt-get clean && \
    apt-get autoremove
I use this image to build my Rust project within it. The issue I am facing is that cargo build fails on my GitLab CI due to a linking issue:
Compiling migrations_macros v1.4.0
error: linking with `cc` failed: exit code: 1
...
= note: /usr/bin/ld: cannot find -lsqlite3
I found out that this is due to this symlink not being present in the Docker image that is running on CI:
libsqlite3.so -> /usr/lib/x86_64-linux-gnu/libsqlite3.so.0.8.6
while the file libsqlite3.so.0.8.6 exists. So if I create the symlink during the CI job, I have a working workaround. The weird thing is that if I pull the exact same image from my registry on my PC and run the container, I can build without any issue and without any change, because the symlink is actually there.
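For completeness, a sketch of that workaround as a CI step, with the paths taken from the error and the listing above:
ln -sf /usr/lib/x86_64-linux-gnu/libsqlite3.so.0.8.6 /usr/lib/x86_64-linux-gnu/libsqlite3.so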
What could be the cause of the problem and how to solve it?
After quite a bit of thinking, the following ideas come to mind that could help.
Docker history
The docker command has a built-in feature to view the history of a built image. It gives you the option to identify the problematic command in the Dockerfile.
docker history <image id or name>
For more visual filtering I recommend the dive tool, but others are also available.
Correct Docker version
Since the two Docker instances behave differently in this scenario, the obvious question is: are they running the same version of the Docker daemon and the same storage (file system) driver?
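A quick sketch for comparing the two hosts (run on both the CI runner and your PC):
docker version --format '{{.Server.Version}}'
docker info --format '{{.Driver}}'   # reports the storage driver in use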
I successfully created an example .travis.yml file and the build passes. However, it is a bit slow; the main cause is composer install, which takes up to 40 seconds.
Since my experience with Travis CI is two days old, I need someone with experience to tell me what the better practice is for running composer install in the Travis environment. Under which block should I call it, and what should the command itself be?
Note: I use Symfony projects, so if there is something specific to this framework, please let me know.
I read some blog posts and went through example files in some open source projects and ended up confusing myself. Some run it under before_script:, some under install:, etc. Also, some use composer install, some composer install --prefer-source --no-interaction --dev, some travis_retry composer install --ignore-platform-reqs --no-interaction --prefer-source, and so on. My aim is to speed up the build time.
What's wrong with composer and your .travis.yml?
Composer update without PHP environment checking
PHP and Continuous Integration with Travis CI
and so on
My own .travis.yml:
language: php
php:
  - 5.6
env:
  global:
    - SOURCE_DIR=src
install:
  - sudo apt-get update > /dev/null
  - sudo apt-get install apache2 libapache2-mod-fastcgi > /dev/null
before_script:
  - ...
  - ...
  - composer self-update
  - composer install
  - ...
  - ...
script:
  - bin/phpspec run --no-ansi --format=dot
  - bin/behat --profile=default -f progress
  - ...
  - ...
There's no one right answer for which section composer install belongs in, IMO. Use what makes sense for you; I would put it in install.
Prefixing it with travis_retry is also a good idea if the command is prone to fail due to network issues, for example. It will retry the same command a default of 3 times; the command is only considered a failure if, after all retry attempts, the wrapped command still did not exit 0.
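For example, a sketch of how that could look in this file (the --prefer-dist flag is my addition):
install:
  - travis_retry composer install --no-interaction --prefer-dist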
As for speeding up your build, I wouldn't bother trying to shave 40s off the build time. That said, you can have a look at caching the Composer install directory. This saves a tarball of that directory after the build finishes and tries to fetch it from network storage at the beginning of the next build, so only new/changed dependencies need to be installed. Since that archive lives on network storage and not inside the container, however, this might not give you any actual speedup. See the Travis CI docs on caching.
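A sketch of the corresponding cache block (the directory names assume Composer defaults):
cache:
  directories:
    - vendor
    - $HOME/.composer/cache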
While trying to follow these instructions (on setting up testing for Meteor using RTD by Xolvio) I came across this line early on:
Ensure you have node and Meteor and that these dependencies globally
installed (you may need to run this as sudo, depending on how you're
setup)
How can I quickly determine whether Meteor has been installed globally on this machine?
If you followed the instructions to install meteor:
$ curl https://install.meteor.com | /bin/sh
It will be installed "globally" (i.e. for all users). You can verify this just by typing:
$ which meteor
This should return (on either Linux or Mac):
/usr/local/bin/meteor
Executables in /usr/local/bin should be available to all users. If this returned something like /home/dave/local/bin/meteor, then it may be available only to a single user.
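As an additional quick check (a sketch), you can confirm the binary resolves on the PATH and actually runs:
command -v meteor && meteor --version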