What artifacts to save for a nightly build? - build-process

Assume that I set up an automatic nightly build. What artifacts of the build should I save?
For example:
Input source code
output binaries
Also, how long should I save them, and where?
Do your answers change if I do Continuous Integration?

You shouldn't save anything for the sake of saving it. You should save it because you need it (e.g., QA uses nightly builds to test). At that point, "how long to save it" becomes however long QA wants them.
I wouldn't "save" source code so much as tag/label it. I don't know what source control you're using, but tagging is trivial (in performance and disk space) for any quality source control system. Once your build is tagged, unless you need the binaries, there really isn't any benefit to keeping them around, because you can simply recompile from source when necessary.
Most CI tools let you tag on each successful build. This can become problematic for some systems as you can easily have 100+ tags a day. For such cases I recommend still running a nightly build and only tagging that.

Here are some artifacts/information that I usually keep for each build:
The tag name of the snapshot you are building (tag and do a clean checkout before you build)
The build scripts themselves or their version number (if you treat them as a separate project with its own version control)
The output of the build script: logs and final product
A snapshot of your environment:
compiler version
build tool version
libraries and dll/libs versions
database version (client & server)
ide version
script interpreter version
OS version
source control version (client and server)
versions of other tools used in the process and everything else that might influence the content of your build products. I usually do this with a script that queries all this information and logs it to a text file that should be stored with the other build artifacts.
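A minimal sketch of such a script, assuming Python is available on the build machine. The tool list in VERSION_COMMANDS is a hypothetical placeholder (only the Python interpreter itself is queried here); you would add your own compiler, build tool, and source control commands.

```python
import platform
import subprocess
import sys

# Hypothetical tool list: replace or extend with the tools your build uses,
# e.g. "svn": ["svn", "--version", "--quiet"].
VERSION_COMMANDS = {
    "python": [sys.executable, "--version"],
}


def tool_version(command):
    """Return the first line a tool prints for its version query."""
    try:
        result = subprocess.run(command, capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired) as exc:
        # The tool is missing or hung; record that instead of failing the build.
        return f"unavailable ({exc.__class__.__name__})"
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else "no output"


def snapshot_environment(commands=VERSION_COMMANDS):
    """Build the text of the environment snapshot, one 'name: version' per line."""
    lines = [f"os: {platform.platform()}"]
    for name, command in sorted(commands.items()):
        lines.append(f"{name}: {tool_version(command)}")
    return "\n".join(lines) + "\n"


def write_snapshot(path, commands=VERSION_COMMANDS):
    """Write the snapshot next to the other build artifacts."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(snapshot_environment(commands))
```

Calling write_snapshot("environment.txt") at the start of each build would leave a text file that can be archived with the logs and binaries.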
Ask yourself this question: "If something entirely destroys my build/development environment, what information would I need to create a new one so I can redo my build #6547 and end up with the exact same result I got the first time?"
Your answer is what you should keep at each build and it will be a subset or superset of the things I already mentioned.
You can store everything in your SCM (I'd recommend a separate repository), but in that case your question of how long to keep the items no longer makes sense. Alternatively, store it in zipped folders or burn a CD/DVD with the build result and artifacts. Whatever you choose, have a backup copy.
You should store them as long as you might need them. How long will depend on your development team's pace and your release cycle.
And no, I don't think it changes if you do continuous integration.

This isn't a direct answer to your question, but don't forget to version control the nightly build setup itself. When the project structure changes, you may have to change the build process, which will break older builds from that point on.

In addition to the binaries, as everyone else has mentioned, I would recommend setting up a symbol server and a source server and making sure the correct information gets out of the build and into those. It will aid debugging tremendously.

We save the binaries, stripped and unstripped (so we have exactly the same binary, once with and once without debug symbols). Further, we build everything twice, once with debug output enabled and once without (again, stripped and unstripped, so every build results in 4 binaries). The build is stored in a directory named after the SVN revision number. That way we can always retrieve the source from the SVN repository by simply checking out that revision (so the source is archived as well).
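As a rough illustration of that layout, here is a sketch that computes where the four variants of one build would live. The root directory, the "r" prefix, and the variant names are assumptions made for the example, not the poster's actual naming.

```python
from pathlib import Path

# Two build modes times two symbol states gives the four binaries per build.
# The names are hypothetical.
VARIANTS = [
    (mode, symbols)
    for mode in ("debug", "release")
    for symbols in ("unstripped", "stripped")
]


def archive_dir(root, revision):
    """Directory for one build, keyed by the SVN revision number."""
    return Path(root) / f"r{revision}"


def variant_paths(root, revision, binary_name):
    """Paths of all four variants of a binary for the given revision."""
    base = archive_dir(root, revision)
    return [base / f"{mode}-{symbols}" / binary_name for mode, symbols in VARIANTS]
```

Because the directory name carries the revision, something like "svn checkout -r 1234" on the repository gives back the matching source.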

A surprising one I learned about recently: If you're in an environment that might be audited you'll want to save all the output of your build, the script output, the compiler output, etc.
That's the only way you can verify your compiler settings, build steps, etc.
Also, how long to save them for, and where to save them?
Save them until you know that build won't be going to production; in other words, as long as you have the compiled bits around.
One logical place to save them is your SCM system. Another option is to use a tool that will automatically save them for you, like AnthillPro and its ilk.

We're doing something close to "embedded" development here, and I can tell you what we save:
the SVN revision number and timestamp, as well as the machine it was built on and by whom (also burned into the build binaries)
a full build log, showing whether it was a full/incremental build, any interesting (STDERR) output the data baking tools produced, a list of files compiled and any compiler warnings (this compresses very well, being text)
the actual binaries (for anywhere from 1-8 build configurations)
files produced as a side effect of linking: a linker command file, address map, and a sort of "manifest" file indicating what was burned into the final binaries (CRC and size for each), as well as the debugging database (.pdb equivalent)
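A manifest of the kind mentioned in the last item (a CRC and size for each file) can be sketched as follows. This is only an illustration: the poster's manifest comes out of their linker tooling, whereas this version walks an output directory and uses CRC-32 from Python's zlib module.

```python
import zlib
from pathlib import Path


def file_crc32(path, chunk_size=65536):
    """CRC-32 of a file, read in chunks so large binaries are not loaded whole."""
    crc = 0
    with open(path, "rb") as handle:
        while chunk := handle.read(chunk_size):
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def build_manifest(directory):
    """Return (relative path, size in bytes, CRC-32 as hex) for every file."""
    entries = []
    for path in sorted(Path(directory).rglob("*")):
        if path.is_file():
            entries.append(
                (
                    path.relative_to(directory).as_posix(),
                    path.stat().st_size,
                    f"{file_crc32(path):08x}",
                )
            )
    return entries
```

Writing these entries to a text file alongside the binaries makes it possible to check later that an archived build still matches what was produced.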
We also mail out the result of running some tools over the "side-effect" files to interested users. We don't actually archive these since we can reproduce them later, but these reports include:
total and delta of filesystem size, broken down by file type and/or directory
total and delta of code section sizes (.text, .data, .rodata, .bss, .sinit, etc)
When we have unit tests or functional tests (e.g. smoke tests) running, those results show up in the build log.
We've not thrown out anything yet -- given, our target builds usually end up at ~16 or 32 MiB per configuration, and they're fairly compressible.
We do keep uncompressed copies of the binaries around for 1 week for ease of access; after that we keep only the lightly compressed version. About once a month we have a script that extracts each .zip that the build process produces and 7-zips a whole month of build outputs together (which takes advantage of only having small differences per build).
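The retention scheme just described could be modeled roughly like this. It only decides which tier a build belongs in and which builds get archived together; the actual zipping and 7-zipping are left out, and the function names are made up for the example.

```python
from collections import defaultdict
from datetime import date


def storage_tier(build_date, today, uncompressed_days=7):
    """Builds younger than a week stay uncompressed; older ones are kept compressed only."""
    age_days = (today - build_date).days
    return "uncompressed" if age_days < uncompressed_days else "compressed"


def group_by_month(build_dates):
    """Group build dates by 'YYYY-MM' so each month can go into one archive."""
    groups = defaultdict(list)
    for build_date in sorted(build_dates):
        groups[build_date.strftime("%Y-%m")].append(build_date)
    return dict(groups)
```

A monthly job would then take each group returned by group_by_month and pack the corresponding build outputs into a single archive.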
An average day might have a dozen or two builds per project... The buildserver wakes up about every 5 minutes to check for relevant differences and builds. A full .7z on a large very active project for one month might be 7-10GiB, but it's certainly affordable.
For the most part, we've been able to diagnose everything this way. Occasionally there's a hiccup on the build system and a file isn't actually at the revision it's supposed to be when a build happens, but there's usually enough evidence of this in the logs. Sometimes we have to dig out a tool that understands the debugging database format and feed it a few addresses to diagnose a crash (we have automatic stack dumps built into the product). But usually all the information needed is there.
I should mention that we haven't had to crack open the .7z archives yet. But we have the info there, and I have some interesting ideas on how to mine bits of useful data from it.

Save what can't be reproduced easily. I work on FPGAs where only the FPGA team have the tools and some cores (libraries) of the design are licensed to compile on only one machine. So we save the output bitstreams. But try to check them over one another rather than with a date/time/version stamp.

Save as in check in to source code control, or just on disk? Save nothing to source code control. All derived files should be visible in the file system and available to developers. Don't check in binaries, code generated from XML files, message digests, etc. A separate packaging step will make these end products available. As you have the change number, you can always reproduce the build if necessary, assuming of course that everything you need to do a build is completely in the tree and is available to all builds by syncing.

I would save your built binaries for exactly as long as they have a chance to go into production or be used by some other team (like a QA group). Once something has left production, what you do with it can vary a lot. For a lot of teams, they'll keep just their most recent prior build around (for rollback) and otherwise discard their builds.
Others have regulatory requirements to keep anything that went into production around for as long as seven years (banks). If you are a product company, I'd keep around any binary a customer might have installed in case a tech support guy wants to install the same version.


R and Rstudio Docker vs Binder

My problem is that I can't use RStudio at my workplace, as IT does not support it. I want to use the R and RStudio installed on my personal laptop from my company laptop (using a modern browser, which is behind a firewall). I am considering two options:
Should I build a Docker image for R and RStudio (I see base images are already available)? I am mostly interested in basic R and the dplyr (haven, xporter, and reticulate) packages.
Should I use Binder? I am not a technical person and my programming skills are very limited. Can anyone suggest a way?
What exactly is the difference between the Docker option and Binder?
I know I can use RStudio online and get my work done, but with the new paid account I am running out of project hours, and it is sometimes very slow. Thanks in advance.
Here are some examples beyond the modern RStudio MyBinder example:
https://github.com/fomightez/pythonista_skewedf
https://github.com/fomightez/r_phylogenetics_worshop
https://github.com/fomightez/chapter7/tree/master/binder
The modern RStudio MyBinder example has been set as a template on GitHub so you can use
The first one is for a special use of a package not on conda. And I started that one from square one.
The other two were converted from content by others to aid in making them Binder-ready.
You essentially list everything you need from conda in the environment.yml along with the appropriate channels. If you need special stuff not on conda, you need the other configuration files included there.
Getting everything working can take some iterations on adding things, letting the image get built, and testing your libraries are available. Although you seem to think your situation is not overly complex.
The Binder launch badges you see are just images where you modify the URL to point the MyBinder federation site at your repository. Look at the URL and you should see the pattern where you put ?urlpath=rstudio at the end of the URL pointing at your repo. The form at the MyBinder.org site can help with this; however, most often it is easier to just adapt a working launch badge's code copied from elsewhere. The form isn't set up at this time for making the URLs for launching to RStudio.
Download anything useful you create in a running session. The sessions time out after 10 minutes of inactivity, although RStudio usually keeps them active.
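To make the URL pattern concrete, here is a small sketch that assembles a launch URL and the matching badge Markdown for a GitHub repository. The user and repository names in the usage line are placeholders, and the default ref of "HEAD" is an assumption; use a branch, tag, or commit if you prefer.

```python
from urllib.parse import quote


def binder_rstudio_url(user, repo, ref="HEAD"):
    """MyBinder launch URL for a GitHub repo that opens RStudio."""
    return (
        "https://mybinder.org/v2/gh/"
        f"{quote(user, safe='')}/{quote(repo, safe='')}/{quote(ref, safe='')}"
        "?urlpath=rstudio"
    )


def binder_badge_markdown(user, repo, ref="HEAD"):
    """Markdown for a launch badge that links to the RStudio launch URL."""
    url = binder_rstudio_url(user, repo, ref)
    return f"[![Binder](https://mybinder.org/badge_logo.svg)]({url})"
```

For example, binder_rstudio_url("some-user", "some-repo") gives a link you can paste into a browser or a README.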
Lack of Persistence and limited memory, storage, & power can be drawbacks. The inherent reproducibility and portability are advantages.
MyBinder.org doesn't work with private repos. If you have code you don't want to share, you can upload it to the temporary session, using the repo only for specifying the environment. You could host a private BinderHub that does allow the use of private git repositories; however, that is probably overkill for your use case and would exceed your ability level at this time.
GitHub isn't the only place to host repositories that can be pointed at the MyBinder system. If you go to the MyBinder.org page and click where it says 'GitHub' on the left side of the top line of the form, you can see a list of the sources at which you can host a repository and point the system to build an image and launch a container with that specified image.
Building the image from a source repository takes some minutes the first time. Once the image is built on the service, though, launch typically takes less than 30 seconds. Each time you make a change to the source repo, a build is necessary. Some changes don't cause the new build to take as long as the initial one, as some optimizing is done to only build what is necessary after a change. Keep in mind there are several members of the federation around the world, and if traffic on the internet gets sent to where the built image isn't yet available, it will be built from scratch again first.
The Holepunch project is out there to offer some help for users working in the R ecosystem; however, with the R-Conda system that is now integrated into MyBinder, it is pretty much as easy to do it the way I described. Last I knew, the Holepunch route makes a Dockerfile that isn't as easy to troubleshoot as the current R-Conda route. Dockerfiles are essentially a last-ditch configuration file that MyBinder can handle, the reason being that the other configuration files are much easier and don't require knowing Dockerfile syntax. MyBinder aims to let you take advantage of Docker containers with a specified environment without needing to know anything about Docker.
There is a Binder Help category for posting to get help at the Jupyter Discourse Forum. Some other examples of posts already there may help you troubleshoot.
Notice of a common pitfall
Most of the configuration files for making a repository Binder-ready are simply text and can be edited right in the GitHub browser interface, without needing git or even cloning the repo locally.
Last I knew, there are two exceptions to this. The postBuild and start configuration files have settings that allow them to be run as scripts, and these get altered in a way that they no longer work if you edit them via the GitHub browser interface. (This was my experience when last I tried. Your mileage may vary or things may have changed now.) To edit those, you have to have git available on a system you have and pull one from some other source. Then edit it on your machine that has git working, add it to your repo, and push it back up from your local computer.
(If this is a problem, you can post in the Jupyter Discourse Forum Binder help category and you and I could coordinate where I fork and edit those files in your repo to your specifications and then make a pull request to update your source of the fork with those changes.)
If you are using Jupyter notebooks extensively, then it may make sense to use Binder.
But if you simply want to use R and RStudio, then all you need is Docker. A good resource is
https://github.com/rocker-org/rocker

Qt 5.13.2.0 possible malware Variant.Adware.Kazy.795337 in qwebp.dll

Today we received a report from one of our customers about this malware detection:
Gen:Variant.Adware.Kazy.795337
It's reported only in the qwebp.dll file that is added to our project by the qtdeploy process.
We're building 32-bit Qt (5.13.2.0) from source, and the same issue is reported on the same DLL no matter where it was built. We're using the latest VS 2019.
https://www.virustotal.com/gui/file/9f09c05803ad4ffcd99454c420a840e17549ee711690fb1f11fd1b59bccc3b23/detection
https://www.virustotal.com/gui/file/80c4c747d781a27c72de71c0900ccc045aefd2b4e4f17c949aaeeb3d0b7973b1/detection
When I scanned the older version (5.13.0.0), everything was OK; previous versions seem to be clean:
https://www.virustotal.com/gui/file/b7b7cacaef0e76439ef8c367c401524e93dfa00c9ca67a20290e829fec325a5a/detection
Also, any debug build and 64-bit builds are clean too.
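For anyone who wants to check their own copy of the DLL: the VirusTotal links above are keyed by what appears to be the file's SHA-256 hash, so a local file can be looked up without uploading it again. A minimal sketch, assuming that URL pattern holds; the file path in the usage note is a placeholder.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        while chunk := handle.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def virustotal_url(sha256_hex):
    """Detection page URL in the same form as the links posted above."""
    return f"https://www.virustotal.com/gui/file/{sha256_hex}/detection"
```

Running virustotal_url(sha256_of_file("qwebp.dll")) on each build would show whether different builds produce the same flagged binary.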
Any idea what can cause this? Can anyone else please try to scan this file?
Thanks
TL;DR: It is probably nothing, but notify Qt anyway (and check your own systems).
Are you using the prebuilt Qt binaries or are you compiling the sources yourself?
If you are using the official prebuilt binaries, I'd of course expect that the Qt dev team scans them and verifies that they don't accidentally spread malware, but there is always the minuscule chance of something slipping through.
Same goes for the sources - while their review process should be thorough enough to avoid malicious code being slipped in, there is still the outside chance of either a key account being compromised or (even more unlikely) bad code being added slice-by-slice over a longer time period to avoid detection (along the lines of the underhanded C contest). Still, either case seems to be rather unlikely.
Bottom line: while this does sound like (and probably is) a false positive, you still may want to raise an issue with Qt, e.g. on their bug tracking site or directly with Qt support (if you have a commercial license), to be sure. Also (if you haven't done so already) verify that the problem is not on your end, e.g. that your computers are clean and that the file isn't simply where an infection of your own happens to be detected.
Update:
A ticket concerning this issue was opened (I assume by Ludek Vodicka) on the Qt bug tracker. It was opened on Nov 19th and categorized as P1: Critical, but unfortunately there is no indication that it is actually being worked on (at least as of Dec 18th).

Profiling GPRBuild

I have a large GPR based project that can take over 30 minutes to compile.
Having analyzed the build process I noticed many obvious inefficiencies (multiple calls to gprbuild rather than aggregates, excessive use of alternative files rather than configurations, etc). I am wondering if there is some means to 'profile' the build process to see what takes so long.
In particular, it takes about 5 minutes to recompile when even a single file changes and there is an error in it. In theory it should be pretty quick to realize that that file has to be recompiled (it's the only one that does) and start the compilation process, rapidly discovering the error.
From the verbose output it looks like it takes quite a while just parsing the massive web of gpr files used to define the build, but I would like to know where it spends most of its time.
Thus my question is: Is it possible to profile a build done by gprbuild? If so, how?
From low to high complexity:
Ask gprbuild to report more details about what it is doing with the flag -vh.
Run gprbuild through strace.
Rebuild gprbuild with the required flags to profile it using gprof (but be aware that gprof doesn't always tell the truth).
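Before going as far as gprof, the first two options can be wrapped in a small timing harness to get a baseline. This is only a sketch: the project file name is a placeholder, and the strace options used here (-f to follow child processes, -c for a per-syscall summary, -o for the output file) are one reasonable choice rather than the only one.

```python
import subprocess
import time


def verbose_command(project_file):
    """gprbuild invocation with high verbosity (-vh)."""
    return ["gprbuild", "-vh", "-P", project_file]


def strace_command(project_file, summary_file):
    """gprbuild run under strace, writing a syscall summary to summary_file."""
    return ["strace", "-f", "-c", "-o", summary_file, "gprbuild", "-P", project_file]


def timed_run(command):
    """Run a command and return (elapsed seconds, exit code, combined output)."""
    start = time.perf_counter()
    result = subprocess.run(command, capture_output=True, text=True)
    elapsed = time.perf_counter() - start
    return elapsed, result.returncode, result.stdout + result.stderr
```

Comparing timed_run(verbose_command("my_project.gpr")) for a no-op rebuild against a one-file change gives a rough idea of how much time goes into project parsing versus compilation.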

What really is a patch?

This was one of my interview questions, which I couldn't answer. Is it related to web development?
What really is a patch?
And my interview question was:
How will you start a new patch?
A patch is a small release of source code that fixes a specific (and usually critical) issue in a product. Patches are also usually released outside of a normal release cycle due to an urgent issue.
patch(1) is a program that comes with most every Unix or Linux type system which takes a diff file as input and applies the differences that file contains. This means one developer can run the diff(1) tool on two versions of a bit of source code, then send the resulting diff file to someone else with one of these source versions, and they can patch their copy to look like the other version.
There are several different diff formats. patch(1) likes unified diffs best.
It is common for open source projects to request patches from outsiders in unified diff format. This lets the outsider make their changes, then produce a patch (that is, a unified diff file) that someone with checkin rights can apply to the source repository directly. Some source management systems -- Subversion, for example -- make this easy: "svn diff" gets you a unified diff, which isn't the default with the regular Unix diff command. You can thus say something like "svn diff > my-changes.patch" and get a patch file.
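To see what a unified diff actually looks like, here is a small sketch using Python's standard difflib module rather than diff(1) or svn diff. The file names are placeholders for the example.

```python
import difflib


def make_patch(old_text, new_text, old_name="a/file.txt", new_name="b/file.txt"):
    """Return a unified diff that turns old_text into new_text."""
    diff_lines = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=old_name,
        tofile=new_name,
    )
    return "".join(diff_lines)


if __name__ == "__main__":
    # Lines starting with "-" are removed, lines starting with "+" are added.
    print(make_patch("one\ntwo\nthree\n", "one\n2\nthree\n"), end="")
```

The resulting text is the kind of file that patch(1) takes as input to bring a copy of the old version up to the new one.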

How can I uninstall Win32 assemblies and cleanup WinSxS?

After a lot of trial and error (mostly due to lack of documentation and examples) I have managed to create MSI installers that install custom DLLs to WinSxS as side-by-side assembly. There is only one problem: Uninstalling leaves all files (DLLs, manifests and catalogs) in the WinSxS directory. How can or should I best clean that up? I know for sure that nothing else references it.
I have read somewhere that WinSxS has a self-scavenging process that cleans up over time but I could not find more information about that. Can you manually invoke this to clean up stuff?
The only other way I see is manually deleting those bits. First you have to change the owner of all files (assembly, catalog, manifest and their respective directory) from SYSTEM to an administrator account, adjust the permissions and delete them. There are also pieces left in the registry (I think HKLM\COMPONENTS\DerivedData\Components may be one place), but since WinSxS should be treated as opaque it is hard to find any information.
Scavenging isn't exposed anywhere that I know of. I'm not even sure when it is kicked off automatically. Maybe on uninstall of a service pack? Maybe some tool admins can run? I really forget.
Anyway, my suggestion is don't fight it. There are so many twisty turns down there that it just isn't worth trying to get the disk space back. Once uninstalled the bits still in the SxS cache will not be activated so they are just wasting space.
It's a dumb design but blame Microsoft and don't try to overcompensate.
Here is an article; it's a fairly complete guide to WinSxS.
So, in short, you can only uninstall some components (all their versions are in this folder), and you can run the Service Pack cleanup utility (in Vista it is named VSP1CLN.EXE and ships with SP1). Note that after running it, you won't be able to uninstall the SP or roll any components back to their state prior to the SP release.
No-one is convinced you can - short of a complete reinstall, your bloaty WinSxS directory is there to stay.
There's been a long "discussion" of the problem on technet.
There is no documentation of the format, or any instructions how to remove files that are no longer needed - MS seems to think that disc space is cheap. There is a self-scavenging feature, but no-one's convinced it works, or if it does, it is very conservative (as you'd hope as you don't want it to break your OS)
You can tell if the scavenger is working by checking the "C:\Windows\winsxs\Temp\PendingDeletes" folder, as this is where Windows Update or an installer moves files to be removed; the scavenger just deletes the files in here.
You'll notice that after you uninstall your assembly, while the files are still there, they can no longer be bound to - so they are just "staged", or cached, but not really installed.
Rob & gbjbaanb are correct - you cannot manually invoke a scavenge yourself. Don't try to delete the files yourself - there are multiple places in the registry where they are registered, DerivedData\Components being only one of the many references.
I think the rule for Vista is scavenging is kicked off by the TrustedInstaller service after 10 minutes of machine inactivity, after the last servicing operation (service pack, hotfix, etc). But it's very fickle, so it doesn't run as often as it should. So just be patient, and the files will disappear on their own.
Well, I was having some issues, as I have an 80 GB SSD for Windows and the WinSxS folder was about 12 GB.
I was searching the net and I found this command:
DISM.exe /online /Cleanup-Image /spsuperseded
And now my WinSxS folder is 7 GB, which was wonderful news.
There are a few updates regarding the cleanup method that apply to newer OS versions. Check http://www.karafilis.net/winsxs-cleanup
