I am building a CI/CD pipeline for a product and am confused about a few things.
So far I have worked in a system that uses "code promotion" for environment progression, i.e. each branch points to a certain env and changes move through PRs between the branches. Very recently I read about "artifact promotion"; it seems like a sensible thing to do and I want to give it a try.
For my microservices, I am able to manage it by keeping a Docker image for each env, free from any env specific variables, and supplying env variables to my pod directly. It all works. My frontend, however, is hosted on S3 & CloudFront and built with the Next.js framework, and the way env variables work in Next.js is that we need to supply them at build time and they get embedded in the export/dist.
How do I do "artifact promotion" in such cases, especially when the env variables are different for each environment?
PS: I know this question is very specific to my use case. Apologies if I am asking it in the wrong place!


Apache Airflow - Multiple deployment environments

When handling multiple environments (such as Dev/Staging/Prod etc.), having separate (preferably identical) Airflow instances for each of these environments would be the best case scenario.
I'm using the GCP managed Airflow (GCP Cloud Composer), which is not very cheap to run, and having multiple instances would increase our monthly bill significantly.
So, I'd like to know if anyone has recommendations on using a single Airflow instance to handle multiple environments?
One approach I was considering is to have separate top-level folders within my dags folder corresponding to each of the environments (i.e. dags/dev, dags/prod etc)
and copy my DAG scripts to the relevant folder through the CI/CD pipeline.
So, within my source code repository if my dag looks like:
During the CI stage, I could have a build step that creates 2 separate versions of this file:
I would follow a strict naming convention for naming my DAGs, Airflow Variables etc to reflect the environment name, so that the above build step can automatically rename those accordingly.
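The build step described above could be sketched roughly as follows. This is only an illustration under assumptions: the `__ENV__` placeholder token, the `render_dag_for_envs` function and the `dags/<env>/` layout are hypothetical conventions made up for this example, not something Airflow or Cloud Composer provides. The renaming matters because DAG IDs must be unique within a single Airflow instance.

```python
from pathlib import Path

# Hypothetical placeholder that the naming convention reserves for the
# environment name inside DAG IDs, Airflow Variable names, etc.
ENV_PLACEHOLDER = "__ENV__"


def render_dag_for_envs(source_file: Path, dags_root: Path, envs: list[str]) -> list[Path]:
    """Write one copy of source_file per environment under dags_root/<env>/."""
    template = source_file.read_text()
    written = []
    for env in envs:
        target_dir = dags_root / env
        target_dir.mkdir(parents=True, exist_ok=True)
        target = target_dir / source_file.name
        # Substitute the environment name wherever the placeholder appears.
        target.write_text(template.replace(ENV_PLACEHOLDER, env))
        written.append(target)
    return written


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "my_dag.py"
        src.write_text('DAG_ID = "__ENV__daily_load"\n'.replace("__ENV__", "__ENV___"))
        for path in render_dag_for_envs(src, Path(tmp) / "dags", ["dev", "prod"]):
            print(path.relative_to(tmp), "->", path.read_text().strip())
```

A plain text substitution like this only works if the placeholder never appears in the source for any other reason, which is why the strict naming convention is important.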
I wanted to check whether this is an approach others have used. Or are there any better/alternative suggestions?
Please let me know if any additional clarifications are needed.
Thank you in advance for your support. Highly appreciated.
I think a shared environment can be a good approach because it's cost effective.
However, if you have a Composer cluster per environment, it's simpler to manage and it allows a better separation.
If you stay on a shared environment, I think you are heading in the right direction by separating the DAGs in the Composer bucket with a folder per environment.
If you use Airflow variables, you also have to handle the environment for them, in addition to the DAGs.
You can then manage the access to each folder in the bucket.
In my team, we chose another approach.
Cloud Composer 2 uses GKE in Autopilot mode, which is more cost effective than the previous version.
It's also easier to manage the environment size of the cluster and to tune different parameters (workers, CPU, webserver...).
In our case, we created a cluster per environment but we have a different configuration per environment (managed by Terraform):
For the dev and uat envs, we use a small sizing, with the environment size set to Small
For the prod env, we use a higher sizing, with the environment size set to Medium
It's not perfect but this allows us to have a compromise between cost and separation.

Firebase Cloud Functions -- package all in a single VS Code project, or create multiple VS Code projects?

I am new to cloud functions and a little unclear about the way they are "containerized" after they are written and deployed to my project.
I have two quite different sets of functions. One set deals with image storage and firebase, another deals with some time consuming computations. The two sets (let's call them A and B) use different node modules and have no dependencies on each other, except that they both use Firestore.
My question is whether it matters if I put all the functions in a single VS Code project, or whether I should split them up into separate projects. One concern is on the deployment side: it seems like all the functions in the project are deployed when you run firebase deploy, even if some of them haven't changed. Probably more important is whether functions that don't need sharp or other image manipulation packages are "containerized" together with functions that need stats and math related packages. Does it make any difference how they are organized into projects?
I realize this question is high level and not about specific code, but it's not so clear to me from the various resources what the appropriate way is to bundle these two sets of unrelated cloud functions, so as not to waste a lot of unnecessary loading once they are deployed out to Firebase.
A Visual Studio Code project is simply a way to package your code. You can create two folders in your project, one for each set of functions, each with its own firebase configuration.
Only the source repository can be a constraint here, especially if 2 different teams work on the code base and each one doesn't need to see the code of the other set of functions
In addition, if you open a VS Code project containing both sets of functions, it will take more time to load and lint them.
On the Google Cloud side, each function is deployed in its own container. However, because the packaging engine (Buildpack) doesn't know which code a given function needs, the whole code base is added to the container. When the app starts, all of that code is loaded. The more code you have, the longer the initialization takes.
If you segregate the code for each set of functions into a different folder in your project, only the code for set A is embedded in the containers of the A functions, and the same goes for B.
Now, of course, if you put all the functions at the same level even though they don't use the same data, the same code and so on, it becomes:
A mess to understand which function does what
A mess in the container, which loads too many things
So, it's not a great code base design, but it's beyond the "Google Cloud" topic, and an engineering choice.
Initially I was really confused on GCP project vs VS Code IDE project...
On the question about how cloud functions are "grouped" into containers during deployment - I strongly believe that each cloud function "image" is "deployed" into its own dedicated and separate container in GCP. I think Guillaume described it absolutely correctly. At the same time, the "source" code packed into an "image" might have a lot of redundancies, and there may be plenty of resources which are not used by the given cloud function. It may be a good idea to minimize that.
I also would like to suggest, that neither development nor deployment process should depend on the client side IDE, and ideally the deployment should not happen from the client machine at all, to eliminate any local configuration/version variability between different developers. If we work together - I may use vi, and you VS Code, and Guillaume - GoLand, for example. There should not be any difference in deployment, as the deployment process should take all code from (origin/remote) git repository, rather than from the local machine.
In terms of "packaging" - for every cloud function it may be useful to "logically" consolidate all required code (and other files), so that all required files are archived together on deployment and pushed into a dedicated GCS bucket, and to exclude any unused (not required) files from such "archives". In that case we might have many "archives" - one per cloud function. The deployment process should redeploy only modified "archives", and not touch unmodified cloud functions.
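One way to decide which "archives" are modified is to fingerprint each function's folder and compare it with the fingerprint recorded at the previous deployment. The sketch below is only an illustration under assumptions: it assumes one sub-folder per cloud function, and the names `folder_digest` and `functions_to_redeploy` are hypothetical helpers, not part of the Firebase or gcloud tooling. Where the previous digests are stored (for example alongside the archives in the bucket) is left open.

```python
import hashlib
from pathlib import Path


def folder_digest(folder: Path) -> str:
    """SHA-256 over the relative paths and contents of every file in folder."""
    digest = hashlib.sha256()
    for path in sorted(p for p in folder.rglob("*") if p.is_file()):
        # Include the relative path so that renames also change the digest.
        digest.update(path.relative_to(folder).as_posix().encode())
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def functions_to_redeploy(functions_root: Path, previous: dict[str, str]) -> list[str]:
    """Names of function folders whose digest differs from the recorded one."""
    changed = []
    for folder in sorted(p for p in functions_root.iterdir() if p.is_dir()):
        # A folder with no recorded digest is new and is treated as changed.
        if previous.get(folder.name) != folder_digest(folder):
            changed.append(folder.name)
    return changed
```

A CI job could call `functions_to_redeploy` after checking out the repository and then deploy only the returned names, which keeps the decision independent of any developer's machine.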

Jenkins Predefined environment variables

It is interesting that we have used Jenkins predefined build environment variables like $WORKSPACE, $BUILD_NUMBER etc in a lot of our Jenkins jobs.
What I find hard to understand is how Jenkins arranges things so that printing $WORKSPACE shows the current workspace of whichever job is running. How does it map the variable $WORKSPACE to the corresponding Jenkins job?
Jenkins needs to know certain things about your build environment and jobs in order to do its job properly. For instance it needs to know the current build number, the location where your project should be checked out, who started the current build, etc. These things are typically exposed to you through the web interface.
Jenkins also exposes this information to your build scripts through environment variables, which Jenkins injects into your script when it is first launched. These environment variables can then be picked up by your script to do whatever is necessary with them.
In the example you gave ($WORKSPACE) Jenkins needs to know the absolute path to this location on your build slave because if it didn't, it wouldn't be able to check out your source and build it. Since it knows this information it exposes it to you as well to make writing your scripts easier.
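As a small illustration, a build script can read these values from its own process environment. Because each build's script is launched with its own environment, the same variable name resolves to a different value in each job. This is a minimal sketch: `describe_build` is a hypothetical helper written for this example, and the fallback strings are arbitrary.

```python
import os
from collections.abc import Mapping


def describe_build(env: Mapping[str, str] = os.environ) -> str:
    """Summarize the current build from the variables Jenkins injected."""
    # Outside Jenkins these variables are normally unset, so fall back.
    workspace = env.get("WORKSPACE", "<no workspace>")
    build_number = env.get("BUILD_NUMBER", "<no build number>")
    return f"build {build_number} in {workspace}"


if __name__ == "__main__":
    print(describe_build())
```

Run from a Jenkins build step, this prints that job's own workspace path and build number; the script itself contains nothing job specific.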
There's a complete list of generally available environment variables provided by Jenkins available here.

How to handle environment variables on a deployed SF4 application

Symfony has provided a Dotenv component since Symfony 3, which allows us to handle environment variables as application parameters. This looks really nice, and it is the best practice according to the Twelve-Factor App manifesto.
Now, regarding Symfony 4 they went further by pushing forward this practice and that's why I started using environment variables via the .env file.
And then I wanted to deploy and I realized that the .env file must not be persisted on the server as it would be the same as having a parameters.yml file.
So I've been digging into the documentation a bit and I found this article which explains that we can directly create environment variables via some webserver directives. That's great for code being executed via FPM but it does not tell us how to handle environment variables when running a command via the CLI for instance.
How can I achieve this?
Should there be an equivalent of a .env file stored somewhere? But then parameters would be duplicated?
I'm welcoming any help ;)
Finally had the time to check the link Neodan posted and everything is in there!
So for those of you wondering what to do, simply edit the /etc/environment file and add your variables. Then reboot your server and all your processes will have access to these variables.
I guess that's the simplest solution. The only drawback of this method is that these variables are available to any process or user, but that's ok as far as I'm concerned.
If you want a more secure solution I suppose that you could, as I stated before, configure your webserver to add environment variables and export them via your .bash_profile or .bashrc file but be careful about how you start your shell (when deploying your application for instance). It's more complicated to maintain and prone to errors I'd say.
N.B.: You also might want to be careful about how you name your variables to prevent collisions.

Require local file in PHPUnit

I am trying to test our source tree using PHPUnit with old, web based legacy code, trying to make as few changes as possible to begin. Once testing is in place, I can then change the library functions for better use, and better unit testing. However, I need the tests done to allow me to change it.
Question: We share the library code across many projects of our application, but they all share a common directory structure. When the website runs, local directories are available when we require files in the library.
Consider this:
When you launch the application, you point to either APP1 or APP2 for the different applications. They have the common code (messaging, DB access, etc.) in Library. The problem is that the functions in the library need special parameters to work, as they are coded today. These libraries simply require('Config.php'); since it will be found in either APP1 or APP2 (they both have one with application specific settings) and the web server is using APP1 or APP2 as the working directory when the Library files are require()'d.
While this works, it fails when attempting to run the code in PHPUnit. My question is how to include the Config.php file without having to change the legacy code too much before the testing is in place.
I know this is the wrong format, but this is what I inherited.
I can not simply require('../../APP1/Config.php'); since both applications share this library.
Any suggestions are greatly appreciated.
Note: We are trying to test the library and all projects as we begin writing tests, so not sure if the include_path will solve it. I am contemplating different PHPUnit.xml.dist files for each application, but trying to avoid this right now due to corporate influence of testing all applications right away.
From phpunit.xml (<phpunit bootstrap="./bootstrap.php">...) and from the cli (--bootstrap ./bootstrap.php) you can specify a bootstrap file. In that file you could do this inclusion you are looking to do.
As a word of advice, when starting to test a legacy code base, don't start with unit tests. Your first goal should be "get some kind of automated tests in place". For most people, this will be system tests, that is, testing the stack/site as a whole. A common tool for this is Selenium.
This is still no small task. What you are going to have to do is work out "how do I put my system into a consistent state". The first thing you may need to do is automate importing and emptying test data in your database. Once you can do that, you will be able to reliably run automated tests. You will need to get many other things to be consistent too, dates and times being a good example.
My point is that, from experience, starting with unit tests will not give you the value you need to prove that automated testing is worth the effort.
Good luck!
