VSTS: how to run tests from a different build after a release?

I have two builds. One is the functional code that "does the work" (let's call it 'F') and the other is the automated tests (let's call it 'T'). The reason they're separate is basically that the two use completely different technologies (T is C#/MSTest, F is something completely different).
What I want is to automatically run the tests from T whenever I release F into an environment, say integration.
I tried looking around (and googled a lot!) but couldn't find a way to do that. Any ideas?

Add two sets of artifacts to your release: One from build F, one from build T.
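In Azure Pipelines YAML terms, the test step of that release stage might then look roughly like this (a minimal sketch; classic release definitions configure the same thing through the UI, and the artifact alias "T" plus the assembly pattern are assumptions, not from the question):
steps:
- task: VSTest@2
  displayName: Run the automated tests from build T
  inputs:
    testSelector: testAssemblies
    testAssemblyVer2: |
      **\*Tests*.dll
      !**\obj\**
    searchFolder: $(System.DefaultWorkingDirectory)/T   # folder where the linked T artifact is downloaded (assumed path)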

Related

Abort ExUnit on the first test that does not pass

In Ruby, specifically RSpec, you can tell the test runner to abort on the first test that does not pass with the command-line flag --fail-fast. This helps a lot to not waste time or lose focus when fixing a lot of tests in a row, for example when doing test-driven or behavior-driven development.
Now, in Elixir with ExUnit, I am looking for a way to do exactly that. Is there one?
There has been such an option since Elixir 1.8.
Use the --max-failures switch to limit the number of failures the suite tolerates before stopping. To halt the test suite after the first failure, run this:
mix test --max-failures 1
Unfortunately there is (to my knowledge) no such flag implemented.
However, you can run a single test with
mix test path/to/testfile.exs:12
where 12 is the line number of the test.
Hope that helps!
That does not make much sense, since tests in Elixir are a) meant to run blazingly fast and b) in most cases meant to run asynchronously. Immediately terminating the test suite on a failed test is an anti-pattern, and that's why the ExUnit authors don't allow it.
One still has the option to shoot oneself in the foot: just implement a custom handler for the EventManager and kill the whole application on a "test failed" event.
For BDD, one preferably uses tags, running the test suite with only the relevant feature included. That way you retain the ability to run tests per feature at any time in the future.
Also, as a last resort one might run a specific case only by passing the file name to mix test and/or a specific test only by passing the file name followed by a colon and a line number.
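To illustrate the tag approach mentioned above, a minimal sketch (the :feature_x tag name is made up):
# test/feature_x_test.exs
defmodule FeatureXTest do
  use ExUnit.Case, async: true

  @tag :feature_x
  test "does the feature" do
    assert 1 + 1 == 2
  end
end

# run only that feature:
#   mix test --only feature_x
# or exclude it by default via test/test_helper.exs:
#   ExUnit.configure(exclude: [:feature_x])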

Julia code encapsulation. Is this a generally good idea?

Is it a common thing for Julia users to encapsulate functionality into their own modules with the module keyword?
I've been looking at modules, and they seem to use include more than they actually use the module keyword for parts of their code.
Which is the better way?
Julia has 3 levels of "places you can put code":
- include'd files
- submodules
- packages -- which for our purposes have exactly 1 (non-sub) module.
I used to be a big fan of submodules, given that I come from Python.
But submodules in Julia are not so good.
From my experience, code written with them tends to be annoying from both a developer and a user perspective.
It just doesn't work out cleanly.
I have ripped the submodules out of at least one of my packages, and switched to plain includes.
Python needs submodules to help deal with its namespace -- "Namespaces are great, let's do more of those".
But because of multiple dispatch, Julia does not run out of function names -- you can reuse the same name with a different type signature, and that is fine (good, even).
In general, submodules grant you separation and decoupling of each submodule from the others. But in that case, why are you not using entirely separate packages (with a common dependency)?
Things that I find go wrong with submodules:
say you had:
- module A
- module B (i.e A.B)
- type C
One would normally do using A.B,
but by mistake you can do using B, since B is probably in a file called B.jl on your LOAD_PATH.
Say you then want to access type C.
If you had done using A.B, you would end up with a type A.B.C when you entered the statement B.C().
But if you had mistakenly done using B, then B.C() gives you the type B.C.
And this type is not compatible with functions that (because they do the right using) expect an A.B.C.
It just gets a bit messy.
Also, reload("A.B") does not work
(though reload often doesn't work great anyway).
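A minimal sketch of the layout described above (Julia 0.6/0.7-era semantics, matching the answer; the file names just spell out the A/B/C example):
# A.jl  (on your LOAD_PATH)
module A
include("B.jl")        # defines the submodule A.B
end

# B.jl  (same directory, so it is also reachable on the LOAD_PATH)
module B
struct C end
end

# intended usage:
using A.B
B.C()                  # an A.B.C

# the mistake:
using B                # loads B.jl a second time, as a top-level module
B.C()                  # a B.C -- a different type, incompatible with A.B.C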
Base is one of the only major chunks of Julia code that uses submodules (that I am aware of). And even Base has pushed a lot of those into separate (stdlib) packages for Julia 0.7.
The short of it is: if you are thinking about using a submodule,
check that it isn't a habit you are bringing over from another language,
and consider whether you don't just want to release another separate package.

Using many slots to test many classes in Qt unit testing

I've seen a variety of workarounds posted in various places that suggest writing custom main functions instead of relying on the Qt QTEST_MAIN() macro when creating a single test executable that works through many different tests of different classes.
Correct me if I'm wrong, but couldn't you just have a single test class and have as many slots as you need to test as many classes as you want? Just instantiate the desired class you want to test inside the slot's implementation and run your tests in that slot. Then, a different slot might instantiate a different class and run different tests. The single QTEST_MAIN is supposed to run through all your slot tests, so everything gets tested, right?
Here are some of the alternate techniques I've read about that don't use QTEST_MAIN:
https://sites.google.com/a/embeddedlab.org/community/technical-articles/qt/qt-posts/creatingandexecutingasingletestprojectwithmultipleunittests
https://stackoverflow.com/a/12207504/768472
Of course you can have as many slots as you want in a test class. But sooner or later you will need to separate tests and group them, simply because you have too many tests to place them all in one class. And you will have to create several test classes.
The original intent of the QTEST_MAIN macro is to run only one test class. If you want to test several classes and you can do it independently of each other, you can put them in separate test classes, add a QTEST_MAIN macro for each of them, and compile each class into a separate executable. The plus is that if one test case crashes, other tests continue to run properly. The minus is that you need a test runner to run all tests and check their results, and qtestlib doesn't provide a runner. You can write your own runner or use one of the existing ones (example).
The options are:
to obey the paradigm of QTestLib: separate tests into different executables so that one test cannot fail because of other tests.
to store all tests in one class. If your app is not tiny, this will be very inconvenient.
to run all tests manually using a custom main function. It's not very bad, but it's inconvenient too because you need to list the test classes manually.
to use another test library. I prefer Google Test. It's much more powerful than qtestlib: it supports death tests, and it automatically registers and runs tests and counts their results, so these problems don't arise. Note that you can use many useful qtestlib features (like QSignalSpy) along with another test framework.
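For reference, the single-class / many-slots approach from the question looks roughly like this with qtestlib (a minimal sketch; Calculator and Parser are hypothetical classes under test):
// testall.cpp -- one test class, one private slot per class under test
#include <QtTest>
#include "calculator.h"   // hypothetical
#include "parser.h"       // hypothetical

class TestAll : public QObject
{
    Q_OBJECT
private slots:
    void calculatorAdds()
    {
        Calculator c;               // instantiate the class under test inside the slot
        QCOMPARE(c.add(2, 2), 4);
    }
    void parserRejectsEmptyInput()
    {
        Parser p;
        QVERIFY(!p.parse(""));
    }
};

QTEST_MAIN(TestAll)                 // runs every private slot above as a test
#include "testall.moc"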

'make'-like dependency-tracking library?

There are many nice things to like about Makefiles, and many pains in the butt.
In the course of doing various projects (I'm a research scientist, "data scientist", or whatever) I often find myself starting out with a few data objects on disk, generating various artifacts from those, generating artifacts from those artifacts, and so on.
It would be nice if I could just say "this object depends on these other objects", and "this object is created in the following manner from these objects", and then ask a Make-like framework to handle the details of actually building them, figuring out which objects need to be updated, farming out work to multiple processors (like Make's -j option), and so on. Makefiles can do all this - but the huge problem is that all the actions have to be written as shell commands. This is not convenient if I'm working in R or Perl or another similar environment. Furthermore, a strong assumption in Make is that all targets are files - there are some exceptions and workarounds, but if my targets are e.g. rows in a database, that would be pretty painful.
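For concreteness, the kind of Makefile workflow being described (and its shell-command, file-target limitation) might look like this (file names and R scripts are made up; recipes must be indented with tabs):
# Makefile -- every target and prerequisite has to be a file on disk
clean.csv: raw.csv clean.R
	Rscript clean.R raw.csv clean.csv

report.html: clean.csv report.R
	Rscript report.R clean.csv report.html

# rebuild only what is out of date, using 4 parallel jobs:
#   make -j4 report.html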
To be clear, I'm not after a software-build system. I'm interested in something that (more generally?) deals with dependency webs of artifacts.
Anyone know of a framework for these kinds of dependency webs? Seems like it could be a nice tool for doing data science, & visually showing how results were generated, etc.
One extremely interesting example I saw recently was IncPy, but it looks like it hasn't been touched in quite a while, and it's very closely coupled with Python. It's probably also much more ambitious than I'm hoping for, which is why it has to be so closely coupled with Python.
Sorry for the vague question, let me know if some clarification would be helpful.
A new system called "Drake" was announced today that targets this exact situation: http://blog.factual.com/introducing-drake-a-kind-of-make-for-data . Looks very promising, though I haven't actually tried it yet.
This question is several years old, but I thought adding a link to remake here would be relevant.
From the GitHub repository:
The idea here is to re-imagine a set of ideas from make but built for R. Rather than having a series of calls to different instances of R (as happens if you run make on R scripts), the idea is to define pieces of a pipeline within an R session. Rather than being language agnostic (like make must be), remake is unapologetically R focussed.
It is not on CRAN yet, and I haven't tried it, but it looks very interesting.
I would give Bazel a try for this. It is primarily a software build system, but with its genrule type of build rule it can perform pretty arbitrary file generation, too.
Bazel is very extensible, using its Python-like Starlark language, which should be far easier to use for complicated tasks than make. You can start by writing simple genrule steps by hand, then refactor common patterns into macros, and if things become more complicated even write your own rules. So you should be able to express your individual transformations at a high level that models how you think about them, then turn that representation into lower-level constructs using something that feels like a proper programming language.
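A rough sketch of what a two-step pipeline might look like in a BUILD file (the file names, the R scripts, and the assumption that Rscript is available to the build actions are all mine):
# BUILD
genrule(
    name = "clean_data",
    srcs = ["raw.csv", "clean.R"],
    outs = ["clean.csv"],
    cmd = "Rscript $(location clean.R) $(location raw.csv) > $@",
)

genrule(
    name = "report",
    srcs = [":clean_data", "report.R"],
    outs = ["report.html"],
    cmd = "Rscript $(location report.R) $(location :clean_data) > $@",
)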
Where make depends on timestamps, Bazel checks fingerprints. So if any one step produces the same output even though one of its inputs changed, subsequent steps won't need to be re-computed. If some of your data processing steps project or filter data, there might be a high probability of this kind of thing happening.
I see your question is tagged for R, even though it doesn't mention it much. Under the hood, R computations in Bazel would still boil down to R CMD invocations on the shell. But you could have complicated multi-line commands assembled in complicated ways to read your inputs, process them, and store the outputs. If the cost of initializing the R binary is a concern, Rserve might help, although using it would make the setup depend on a locally accessible Rserve instance, I believe. Even with that, I see nothing that would avoid the cost of storing the data to file and loading it back from file. If you want something that avoids that cost by keeping things in memory between steps, then you'd be looking at a very R-specific tool, not a generic tool like you requested.
In terms of “visually showing how results were generated”, bazel query --output graph can be used to generate a graphviz dot file of the dependency graph.
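For example (the //pipeline:report label is hypothetical):
bazel query "deps(//pipeline:report)" --output graph > graph.dot
dot -Tpng graph.dot -o graph.png    # render with graphviz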
Disclaimer: I'm currently working at Google, which internally uses a variant of Bazel called Blaze. Bazel is actually the open-source release of Blaze. I'm very familiar with using Blaze, but not with setting up Bazel from scratch.
Red-R has a concept of data flow programming. I have not tried it yet.

Project nearing completion. Time to begin testing. Which methods are feasible towards the end of the development cycle?

Let's assume one joins a project near the end of its development cycle. The project has been passed across many teams and has been an overall free-for-all, with no testing whatsoever taking place the whole time. The other members on this team have no knowledge of testing (shame!), and unit testing each method seems infeasible at this point.
What would the recommended strategy for testing a product be at this point, besides usability testing? Is this normally the point where you're stuck with manual point-and-click expected output/actual output work?
I typically take a bottom-up approach to testing, but I think in this case you want to go top-down. Test the biggest components you can wrap unit-tests around and see how they fail. Those failures should point you towards what sub-components need tests of their own. You'll have a pretty spotty test suite when this is done, but it's a start.
If you have the budget for it, get a test automation suite. HP/Mercury QuickTest is the leader in this space, but it is very expensive. The idea is that you record test cases like macros by driving your GUI through use cases. You fill out inputs on a form (web, .NET, Swing, pretty much any sort of GUI), and the engine learns the form elements' names. Then you can check for expected output on the GUI and in the db. Then you can plug in a table or spreadsheet of various test inputs, including invalid cases where it should fail, and run it through hundreds of scenarios if you like. After the tests are recorded, you can also edit the generated scripts to customize them. It builds a neat report for you in the end showing you exactly what failed.
There are also some cheap and free GUI automation testing suites that do pretty much the same thing but with fewer features. In general, the more expensive the suite, the less manual customization is necessary. Check out this list: http://www.testingfaqs.org/t-gui.html
I think this is where a good Quality Assurance test would come in. Write out old-fashioned test cases and hand them out to multiple people on the team to test.
What would the recommended strategy for testing a product be at this point, besides usability testing?
I'd recommend code inspection, by someone/people who know (or who can develop) the product's functional specification.
An extreme, purist way would be to say that, because it "has been an overall free-for-all with no testing whatsoever", one can't trust any of it: not the existing testing, nor the code, nor the developers, nor the development process, nor management, nothing about the project. Furthermore, testing doesn't add quality to software (quality has to be built in as part of the development process). The only way to have a quality product is to build a quality product; this product had no quality in its build, and therefore one needs to rebuild it:
Treat the existing source code as a throw-away prototype or documentation
Build a new product piece-by-piece, optionally incorporating suitable fragments (if any) of the old source code.
But doing code inspection (and correcting defects found via code inspection) might be quicker. That would be in addition to functional testing.
Whether you'll want to not only test it but also spend the extra time and effort to develop automated tests depends on whether you'll want to maintain the software (i.e., in the future, to change it in any way and then retest it).
You'll also need:
Either:
Knowledge of the functional specification (and non-functional specification)
Developers and/or QA people with a clue
Or:
A small, simple product
Patient, forgiving end-users
Continuing technical support after the product is delivered
One technique that I incorporate into my development practice when entering a project at this point in the lifecycle is to add unit tests as defects are reported (by QA or end users). You won't get full code coverage of the existing code base, but at least this way future development can be driven and documented by tests. Also, this way you can be sure that each test fails before you work on the fix; if you write the test and it doesn't fail, the test is faulty.
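For example, a defect-driven regression test might look like this (hypothetical defect, module, and function names; shown in Python/pytest style, but the idea is framework-agnostic):
# test_defect_1234.py -- regression test for (hypothetical) bug report #1234:
# "discount is applied twice for orders over $100"
from billing import total_price   # hypothetical module under test

def test_discount_applied_only_once():
    # should FAIL against the current, buggy code; the fix is done when it passes
    assert total_price(items=[120.00], discount=0.10) == 108.00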
Additionally, as you add new functionality to the system, start with tests so that at least those sub-systems are tested. As the new code interacts with the existing code, try adding tests around the old boundary layers and work your way in over time. While these won't be unit tests, such integration tests are better than nothing.
Refactoring is yet another prime target for testing. Refactoring without tests is like walking a tightrope without a net. You may get to the other side successfully, but is the risk worth the reward?
