How to guarantee test server is killed? - gnu-make

I'm a big fan of Python's try/finally and the builtin trap command in various shells. I have a Make target to which I would like to apply the same sort of logic. Suppose I have this target and dependencies:
test : start-server run-test-group-1 run-test-group-2 stop-server
If tests fail during the run-test-* phases, the stop-server actions won't execute. Is there a way to guarantee that the stop-server actions run even when "-k" is not given? I realize I could place "-" before the relevant command(s) in the run-test-* actions, but I think that would cause make to exit with a 0 status, making the controlling process think the tests succeeded. I still want the parent process to know the tests failed.

Use the shell trap mechanism instead and run make recursively, perhaps? Like this:
test:
	trap '$(MAKE) stop-server' EXIT; \
	$(MAKE) start-server && \
	$(MAKE) actual-test
actual-test: run-test-group-1 run-test-group-2
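If you would rather not rely on trap at all, a minimal sketch of the same idea (using the same recursive-make targets as above) is to record the test result, always stop the server, and then re-raise the original exit status:

test:
	$(MAKE) start-server
	$(MAKE) actual-test; rc=$$?; \
	$(MAKE) stop-server; \
	exit $$rc

Because rc is captured before stop-server runs, make still exits non-zero when the tests fail.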

Related

RobotFramework: Command sent appends "2&>1" to it

Is there any reason the command below does this, and is there any way to stop it from appending it? Code:
Run And Return Rc cat ${files_to_process_path}${FILE}
Outputs...
16:47:26.424 TRACE Arguments: [ 'cat /var/www/sponsor1_integration/to_process/study-6313_LONGBOAT_20170112_12:37.csv' ]
16:47:26.428 INFO Running command 'cat /var/www/sponsor1_integration/to_process/study-6313_LONGBOAT_20170112_12:37.csv 2>&1'.
16:47:26.431 TRACE Return: 0
It is appended so that any error output of a shell command is propagated back to your keywords, i.e. so it is not hidden. The construct 2>&1 does exactly that - it redirects stderr to stdout.
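For illustration, this is what the redirection changes in plain shell, outside Robot Framework (the file name is made up):

cat /no/such/file                 # the error message goes to stderr and is not captured by $(...)
out=$(cat /no/such/file 2>&1)     # stderr is merged into stdout, so $out now holds the error text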
As for removing it - no, it's embedded too deep in the OperatingSystem library, with no option to disable it.
If you really don't want it, you would have to create your own library for that. That said, it does no harm; on the contrary, the benefit is that you'll see any errors from the commands.

Robot framework exit status of a command is wrong?

I'm trying to execute a command remotely through Robot Framework, and it is failing and giving me a wrong exit status of 13.
But if we run it manually, the exit status of TTman.sh is 112, which is actually a pass (not the standard return codes).
Am I doing something wrong here?
You are not getting the return code of the remote command; in fact, the RC 13 you are getting from the run is most probably from Robot Framework itself - on run completion, its RC is the number of failed cases. I.e. 13 cases must have failed when you observed this.
To get the return code of your command, a few changes in the case are needed; this is how the second-to-last line should look, with explanations below:
${rc}= Execute Command your_command_from_the_question &>/dev/null; echo $?
First, all the output of your command (stdout & stderr) is redirected to /dev/null - so it is not returned. Then the special variable $? is printed - it holds the RC of the last executed command (and is available in most *sh variants, like bash).
Finally, that value is stored in the ${rc} Robot Framework variable, and you can do whatever checks you need on it further in the case.
This approach has one drawback - as stderr is hidden, you will not be able to see any errors coming from running the command. But if it were not hidden, the errors would be interleaved with the RC, which would require further processing of the ${rc} variable to get the desired value. If you need the stderr output in case of failures, change accordingly.
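For example, one hedged shell-level alternative (not tied to any particular Robot Framework library) is to keep stderr visible and print the RC on its own marker line, then parse that marker out of the combined output in the test case:

your_command 2>&1; echo "RC=$?"   # stderr stays visible, the RC is appended as a separate RC=<n> line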
P.S. Don't add screenshots of source code in a question; it is much less usable than a text version.

resume Rsnapshot to same drive

Sometimes, on a large rsync using Rsnapshot, the NFS mount we are syncing to will drop.
Then when you run:
rsnapshot monthly
to resume it, it behaves as if this were a brand new run, rotating monthly.0 to monthly.1 and so on.
Is there a way to resume the rsync using rsnapshot monthly if something gets interrupted - one that won't start a brand new backup?
The answer is: not really, but sort of. rsnapshot runs as a batch job, generally triggered from a cron job. It does not keep any state between runs apart from the backups themselves. If your NFS mount goes away mid-backup, after a while you'll get some sort of IO error, and rsnapshot will give up and die with an error. The backup will fail. The next time it is run after the failure, it will start the backup as if from scratch.
However, if you use the sync_first config option, and not the link_dest option, rsnapshot will do a better job of recovery. It will leave the files it has already transferred in place and won't have to transfer them again, but it will have to check again that the source and destination are the same, in the usual rsync way. The man page gives some detail on this.
This is not the case with the link_dest method, which for most errors removes the work it has done and starts again. Specifically, on error link_dest does a "rollback" like this:
ERROR: /usr/bin/rsync returned 255 while processing sam@localhost:..
WARNING: Rolling back "localhost/"
/bin/rm -rf /tmp/rs-test/backups/hourly.0/localhost/
/bin/cp -al /tmp/rs-test/backups/hourly.1/localhost \
/tmp/rs-test/backups/hourly.0/localhost
touch /tmp/rs-test/backups/hourly.0/
rm -f /tmp/rs-test/rsnapshot-test.lock
If you're not using it already, and you can use it (apparently some non-UNIX systems can't), use the sync_first method.
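A minimal sketch of how that setup might look (your rsnapshot.conf will differ; fields in that file are tab-separated):

sync_first	1

Then transfer first, and only rotate the interval once the sync has succeeded:

rsnapshot sync && rsnapshot monthly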
I don't use rsnapshot, but I've written my own equivalent wrapper over rsync [in perl] and I do delta backups, so I've encountered similar problems.
To solve the problem of creating "false whole backups", the basic idea is to do:
while true; do
    if rsnapshot remote_whatever/tmp; then
        mv remote_whatever/tmp remote_whatever/monthly_whatever
        break
    fi
done
The above mv should be atomic, even over NFS.
I just answered a similar question here: https://serverfault.com/questions/741346/rsync-directory-so-all-changes-appear-atomically/741420#741420 that has more details

Jenkins Build Script exits after Google Test execution

I am building a Qt GUI application via Jenkins. I added 3 build steps:
Building the test executable
Running the test executable
Compiling a coverage report with gcovr
For some reason, the shell task for running the test executable stops right after executing it; even a simple echo placed afterwards does not run. The tests are written with Google Test and output xUnit XML files, which are analyzed after the build.
Some tests start the application's user interface, so I installed the Jenkins xvnc plugin to get them to run.
The build tasks are as follows:
Build
cd $WORKSPACE/projectfiles/QMake
sh createbin.sh
Test
cd $WORKSPACE/bin
./Application --gtest_output=xml
Coverage Report
cd $WORKSPACE/projectfiles/QMake/out
gcovr -x -o coverage.xml
Now, an echo at the end of the first build task is correctly printed, but an echo at the end of the second is not. The third build task is therefore not even run, although the Google Test output is visible. I thought that maybe the problem is that some of the Google Tests fail, but why would the script stop executing just because the tests fail?
Maybe someone can give me a hint on why the second task stops.
Edit
The console output looks like this:
Updating svn://repo/ to revision '2012-11-15T06:43:15.228 -0800'
At revision 2053
no change for svn://repo/ since the previous build
Starting xvnc
[VG5] $ vncserver :10
New 'ubuntu:10 (jenkins)' desktop is ubuntu:10
Starting applications specified in /var/lib/jenkins/.vnc/xstartup
Log file is /var/lib/jenkins/.vnc/ubuntu:10.log
[VG5] $ /bin/sh -xe /tmp/hudson7777833632767565513.sh
+ cd /var/lib/jenkins/workspace/projectfiles/QMake
+ sh createbin.sh
... Compiler output ...
+ echo Build Done
Build Done
[VG5] $ /bin/sh -xe /tmp/hudson4729703161621217344.sh
+ cd /var/lib/jenkins/workspace/VG5/bin
+ ./Application --gtest_output=xml
Xlib: extension "XInputExtension" missing on display ":10".
[==========] Running 29 tests from 8 test cases.
... Test output ...
3 FAILED TESTS
Build step 'Execute shell' marked build as failure
Terminating xvnc.
$ vncserver -kill :10
Killing Xvnc4 process ID 1953
Recording test results
Skipping Cobertura coverage report as build was not UNSTABLE or better ...
Finished: FAILURE
Generally, if one Build Step fails, the rest will not be executed.
Pay attention to this line from your log:
[VG5] $ /bin/sh -xe
The -x makes the shell print each command to the console before executing it.
The -e makes the shell exit with an error if any of the commands fails.
A "fail" in this case would be a non-zero return code from any of the individual commands.
You can verify this by running this directly on the machine:
./Application --gtest_output=xml
echo $?
If the echo $? displays 0, it indicates successful completion of the previous command. If it displays anything else, it indicates an error code from the previous command (from ./Application), and Jenkins treats it as such.
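To see the -e part in isolation, here is a hypothetical two-line script (not from your build):

#!/bin/sh -xe
false                  # returns 1, so -e aborts the script right here
echo "never reached"   # never executed; the step is marked as failed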
Now, there are several things at play here. The first is that your second Build Step (essentially a temporary shell script, /tmp/hudson4729703161621217344.sh) is set to fail if any one command fails (the default behaviour). When a Build Step fails, Jenkins stops and fails the whole job.
You can fix this particular behaviour by adding set +e to the top of your second Build Step. This stops the script (Build Step) from failing due to an individual command failure (it will display an error for the command and continue).
However, the overall result of the script (Build Step) is the exit code of its last command. Since in your OP you only have 2 commands in the script, and the last one is failing, the whole script (Build Step) will still be considered a failure, despite the set +e that you've added. Note that if you added an echo as the 3rd command, this would actually "work", since the last script command (the echo) would be successful; however this workaround is not what you need.
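That workaround would look like this (shown only to illustrate the point):

set +e
cd $WORKSPACE/bin
./Application --gtest_output=xml
echo "done"   # the last command succeeds, so the step "passes" even when tests failed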
What you need is proper error handling added to your script. Consider this:
set +e
cd "$WORKSPACE/bin" && ./Application --gtest_output=xml
if [ $? -ne 0 ]; then
    echo "Tests failed, however we are continuing"
else
    echo "All tests passed"
fi
Three things are happening in the script:
First, we are telling the shell not to exit on the failure of individual commands.
Then I've added basic error handling in the second line. The && means "execute ./Application if-and-only-if the previous cd was successful". You never know - maybe the bin folder is missing, or whatever else can happen. BTW, the && internally works on the same "error code equals 0" principle.
Lastly, there is now proper error handling for the result of ./Application. If the result is not 0, we report that the tests failed; else we report that they passed. Note that since the last command is not the (potentially failing) ./Application but an echo from one of the if-else branches, the overall result of the script (Build Step) will be a success (i.e. 0), and the next Build Step will be executed.
BTW, you could just as well put all 3 of your build steps into a single Build Step with proper error handling, as sketched below.
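A hedged sketch of such a combined step, reusing the paths from your three steps (adjust to taste; the intent is that only genuine script errors fail the step, while test failures are reported but do not abort it):

set +e

cd "$WORKSPACE/projectfiles/QMake" && sh createbin.sh || exit 1

cd "$WORKSPACE/bin" && ./Application --gtest_output=xml
test_rc=$?
if [ $test_rc -ne 0 ]; then
    echo "Tests failed (rc=$test_rc), continuing to the coverage report"
fi

cd "$WORKSPACE/projectfiles/QMake/out" && gcovr -x -o coverage.xml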
Yes... this answer may be a little longer than what's required, but I wanted you to understand how Jenkins and the shell treat exit codes.

All the tests passed, but bamboo build fails with a statement "No failed tests found, a possible compilation error occurred."

I'm supposed to run some JBehave (automated) tests in Bamboo. Once the tests run, I'll generate some JUnit-compatible XML files so that Bamboo can understand them. All the JBehave tests are run as part of a script, because I need to run them in a separate display screen (remember, these are automated browser tests). An example script is as follows.
Ex:
export DISPLAY=:0 && xvfb-run --server-args="-screen 0, 1024x768x24" \
    mvn clean integration-test -DskipTests -P integration-test -Dtest=*
I have one more JUnit parser task which points to the generated JUnit-compatible XML files. So, once the Bamboo build runs, even if all the tests pass, I get a red build with the message "No failed tests found, a possible compilation error occurred."
Can someone please help me in this regard?
Your build script might be producing successful test reports, but one (or both, possibly) of your tasks is failing. That means that the failure is probably* occurring after your tests complete. Check your build logs for errors. You might also try logging in to your Bamboo server (as the bamboo user) and running the commands by hand.
I've seen this message in the past when our test task was crashing halfway through the test run, resulting in one malformed report that Bamboo ignored and a bunch of successful reports.
*Check the build log to make sure that your tests are indeed running. If mvn clean doesn't clean out the test report directory, Bamboo might just be parsing stale test reports.
EDIT: (in response to Kishore's links)
It looks like your task to kill Xvfb is what is causing the build to fail.
18-Jul-2012 09:50:18 Starting task 'Kill Xvfb' of type 'com.atlassian.bamboo.plugins.scripttask:task.builder.script'
18-Jul-2012 09:50:18
Beginning to execute external process for build 'Functional Tests - Application Release Test - Default Job'
... running command line:
/bin/sh
/tmp/FUNC-APPTEST-JOB1-91-ScriptBuildTask-4153769009554485085.sh
... in: /opt/bamboo-home/xml-data/build-dir/FUNC-APPTEST-JOB1
... using extra environment variables:
<..snip (no meaningful output)..>
18-Jul-2012 09:50:18 Failing task since return code was 1 while expected 0
18-Jul-2012 09:50:18 Finished task 'Kill Xvfb'
What does your "Kill Xvfb" script do? Are you trying something like pkill -f "[x]vfb"? pkill -f silently returns non-zero if it can't match the expression to any processes.
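If that is the case, a minimal sketch of a "Kill Xvfb" script that never fails the build (assuming the pkill approach) would be:

#!/bin/sh
# pkill -f returns 1 when no process matches, so swallow that case explicitly
pkill -f "[x]vfb" || true
exit 0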
My solution was to make a 'script' task:
#!/bin/bash
/usr/local/bin/phpcs --report=checkstyle --report-file=build/logs/checkstyle.xml --standard=PSR2 ./lib || exit 0
Which always exits with status 0.
This is because PHP CodeSniffer returns exit status 1 when even a single coding violation (warning / error) is found, which causes the build to fail.
Turns out to be a simple fix.
General Bamboo behavior is to fail the build if any task returns a failure code (1). For this specific configuration I had some 6 scripts, one of which was to kill Xvfb (the frame buffer). For some reason the server was not able to kill Xvfb, and that task was returning a failure code. Because of this, even though all the tests passed, Bamboo got an error code from one of the previous tasks and the build was failing.
The current fix was to remove the task which kills Xvfb, and the build went green! \o/
