What is the best practice to develop CD/CI when you use ML studio APIs? - azure-machine-learning-studio

In our backend development process, we have two environments: testing and production. We develop our code, and then we push the code into the testing repository. Then on the release date, we push everything into production.
Now that we are going to use ML studio, I'm struggling with setting up testing and production environments for my ML studio experiments.
I created two identical experiments with independent APIs; one experiment for testing and the other experiment is used by the production. When it comes to moving the trained experiment from testing to production, I make all the changes I made in the testing environment to the production environment, which is a very time demanding process.
Do you know any better solution so we can deploy and test our changes and then deploy the latest changes to the production? How people use ML studio in their CD/CI process?
The attached image shows the design that I have now. I'd appreciate if you can help me in improving this process. Maybe ML studio has some features to manage this scenario that I don't know.

In MLStudio, when you doing publishing an experiment as a API, the current API get replaced with a prevailing one. One of the practices you can do is like this.
Maintain a test experiment.
Maintain an identical copy of that and push it for the production.
When you done with the changes in the test experiment keep it as that (Then you can change it later) and make a copy of that(Using Save as) - publish it as the production service.
There are some drawbacks in this too. You have to update the API endpoints on the production code once you going to release and you may have to document the versions going for the production manually. The only advantage is the time of updating two experiments goes off.

Related

Azure GUIX automated testing

Can one implement automated UI testing for any UI designed using Azure GUIX? I came across an earlier post asking the same question.
Azure GUIX automated testing specifically for languages
I searched for Azure Test Harness but could not find much on it. Any suggestions?
We have added a string-fit test in GUIX Studio for the 6.1.9.1 patch release, under Edit -> Run String Fit Test. This release has been pushed to the App Store and will be rolling out over the next couple of days. This tests to insure every string in every language fits within the assigned widgets. Of course you can assign and re-assign strings at runtime, so Studio can't test for that, only the know assignments of widgets and fonts can be statically tested by GUIX Studio.
For our own internal regression testing, we build our test apps as Win32 apps and write python scripts to generate events into those apps to drive them. We do md5sum calculations of the canvas memory and compare the computed value with "golden values" to insure nothing has been broken. We haven't yet instrumented anything similar to support on-target regression testing but we have this feature in our backlog, I will see if we can get this on the priority list for the next release.
Best Regards

Advantages of a build server?

I am attempting to convince my colleagues to start using a build server and automated building for our Silverlight application. I have justified it on the grounds that we will catch integration errors more quickly, and will also always have a working dev copy of the system with the latest changes. But some still don't get it.
What are the most significant advantages of using a Build Server for your project?
There are more advantages than just finding compile errors earlier (which is significant):
Produce a full clean build for each check-in (or daily or however it's configured)
Produce consistent builds that are less likely to have just worked due to left-over artifacts from a previous build
Provide a history of which change actually broke a build
Provide a good mechanism for automating other related processes (like deploy to test computers)
Continuous integration reveals any problems in the big picture, as different teams/developers work in different parts of the code/application/system
Unit and integration tests ran with the each build go even deeper and expose problems that would maybe not be seen on the developer's workstation
Free coffees/candy/beer. When someone breaks the build, he/she makes it up for the other team members...
I think if you can convince your team members that there WILL be errors and integration problems that are not exposed during the development time, that should be enough.
And of course, you can tell them that the team will look ancient in the modern world if you don't run continuous builds :)
See Continuous Integration: Benefits of Continuous Integration :
On the whole I think the greatest and most wide ranging benefit of Continuous Integration is reduced risk. My mind still floats back to that early software project I mentioned in my first paragraph. There they were at the end (they hoped) of a long project, yet with no real idea of how long it would be before they were done.
...
As a result projects with Continuous Integration tend to have dramatically less bugs, both in production and in process. However I should stress that the degree of this benefit is directly tied to how good your test suite is. You should find that it's not too difficult to build a test suite that makes a noticeable difference. Usually, however, it takes a while before a team really gets to the low level of bugs that they have the potential to reach. Getting there means constantly working on and improving your tests.
If you have continuous integration, it removes one of the biggest barriers to frequent deployment. Frequent deployment is valuable because it allows your users to get new features more rapidly, to give more rapid feedback on those features, and generally become more collaborative in the development cycle. This helps break down the barriers between customers and development - barriers which I believe are the biggest barriers to successful software development.
From my personal experience, setting up a build server and implementing CI process, really changes the way the project is conducted. The act of producing a build becomes an uneventful everyday thing, because you literally do it every day. This allows you to catch things earlier and be more agile.
Also note that setting build server is only a part of the CI process, which includes setting up tests and ultimately automating the deployment (very useful).
Another side-effect benefit that often doen't get mentioned is that CI tools like CruiseControl.NET becomes the central issuer of all version numbers for all branches, including internal RCs. You could then enforce your team to always ship a build that came out of the CI tool, even if it's a custom version of the product.
Early warning of broken or incompatible code means that all conflicts are identified asap, thereby avoiding last minute chaos on the release date.
When your boss says "I need a copy of the latest code ASAP" you can get it to them in < 5 minutes.
You can make the build available to internal testers easily, and when they report a bug they can easily tell you "it was the April 01 nightly build" so that you can work with the same version of the source code.
You'll be sure that you have an automated way to build the code that doesn't rely on libraries / environment variables / scripts / etc. that are set up in developers' environments but hard to replicate by others who want to work with the code.
We have found the automatic VCS tagging of the exact code that produce a version very helpful in going back to a specific version to replicate an issue.
Integration is a blind spot
Integration often doesn't get any respect - "we just throw the binaries into an installer thingie". If ithis doesn't work, it's the installers fault.
Stable Build Environment
Prevents excuses such as "This error sometimes occurs when built on Joe's machine". Prevents using old dependent libraries accidentally when building on Mikes machine.
True dogfooding
You inhouse testers and users have a true customer experience. Your developers have a clear reference for reproducing errors.
My manager told us we needed to set them up for two major reasons. None were really to do with the final project but to make sure what is checked in or worked on is correct.
First to clean up DLL Hell. When someone builds on their local machine they can be pointing at any reference folder. Lots of projects were getting built with the wrong versions of dlls from someone not updating their local folder. In the build server it will always be built of the same source. All you have to do is get latest to get the latest references.
The second major thing for us was a way to support projects with little knowledge of them. Any developer can go grab the source and do a minor fix if required. They don't have to mess with hours of set up or finding references. We have an overseas team that works primarily on a project but if there is a rush fix we need to do during US hours we can grab latest and be able to build not have to worry about broken source or what didn't get checked in. Gated checkins save everyone else on your team time.

Deploying code from Subversion Repository to web server without building

At my company, we develop our ASP.NET applications as websites and often just work off of our network drive, which points directly to the files on our development web server. Our code is compiled at time of HTML request, so we don't build our web applications. I've read that automated builds are a best-practice, and aim to set that up as well at some point. Right now we're using VSS, which is awful, and I'd like to switch us to subversion.
I've read about NAnt for builds and deployment, and also just heard about CruiseControl.NET. Can I use these tools simply to push code from our SVN Repository to our development web server when a developer commits changes to it from their working copy?
You certainly can. I have gone through a very similar migration. We always compiled our web apps, but we migrated from VSS to SVN and then setup cruisecontrol and nant to automate our builds and deployments. We used to just drag and drop with windows explorer which was down right painful.
As it happens I have been blogging on this process. My last post specifically covers using cruisecontrol.net and NANT: http://www.mattwrock.com/post/2009/10/22/The-Perfect-Build-Part-3-Continuous-Integration-with-CruiseControlnet-and-NANT-for-Visual-Studio-Projects.aspx
I am CM/Developer at my company. We use Nant, CCNET and Subversion, for continous integration and automated deployments to the DEV servers. Works perfectly.
Things to note:
1. If your getting Nant, get Nant contrib as well
2. If your building and deploying installers, it will be easier to use devenv.exe to build the installers.
3. You can check out PSTools to install stuff on remote servers.
4. I would set up two different build categories in CCNEt, 1. for Continous and 2. For force builds....this should be your publish.
The set up can get pretty complex, I have used it with VSS as well, email me if you have any questions or need scripting help.
Yep.
At one company we built a nAnt script that did this. Very simple and effective, but extremely cryptic to change or update.
At another we used Cruise Control which worked great but again was a little cryptic (I think it uses nAnt on the backend), but was very nice to look at and see the steps and problems visually.
Honestly though, the latest Team Server from MS is very good at managing code and very nice at producing builds as well. By far the easiest and effective way I've ever used to deploy .net code.

Continuous Integration vs. Nightly Builds

Reading this post has left me wondering; are nightly builds ever better for a situation than continuous integration? The consensus of the answers seems to be pretty lopsided in favor of continuous integration, is that evangelism or is there really no reason to use nightly builds when continuous integration is an option?
If you're really doing continuous integration with all available tests, nightly builds would be redundant, since the last thing checked in that day would already have been tested.
On the other hand, if your CI regime only involves running a subset of all available tests, for example because some of your tests take a long time to run, then you can use nightlies additionally to run all tests. This'll let you catch many bugs early, and if you can't catch them early, you can at least catch them overnight.
I don't know, though, if that's technically still CI, since you're only doing a "partial" build each time, by ignoring some of the tests.
In our organization, nightly builds and CI builds have two distinct purposes. The CI build is a 'latest code' build in which the unit tests are run against the last check in as you would expect. We also run several code metrics on the CI build.
For nightly builds, however, we only incorporate source code that has been through the peer review process and is deemed ready for testing.
This way, the nightly build always contains build that is 'feature ready' for testing, while the CI build contains features that while functional (to the extent that the unit tests pass) may not be ready to send the to the test group.
The test groups writes new CRs exclusively from one of the nightly builds as opposed to the CI build, although those are also available for informal exploratory type testing.
Yes, if you have a process you want attached to a build, but it is resource heavy. For example, on my team we run JTest during the nightly build. We can't run it during the day because:
It requires a lot of resources, which may not be available
It takes 4 hours to complete each time
If you have a nice robust CI process in place a "nightly" is still useful.
As mentioned, a "nightly" build can do exhaustive tests and perhaps some high-level system tests. End-to-end stuff.
The concept of a "nightly" build is easily understood by everyone in the organization. If you have trouble communicating CI builds out to other groups (for example, a QA group that doesn't have the same handle on Agile that the Dev group might) a "nightly" is a powerful and simple concept.
If your nightly is a separate set of resources, it can be managed separately and used to cut "gold" images with some claims to software integrity. For example, developer writes code, some trusted build system that dev can't touch builds it, QA tests the gold build and signs it. In such a situation, the nightly build functions like a production build system.
Just some thoughts.
In my professional opinion, the only reason to use nightly builds is when the build process takes so long that it can't complete in a "reasonable" amount of time.
For example, if your build process takes 5 hours to complete there is really no reason to do a build on check in.
Beyond that, there is so much value in knowing as soon as possible when a build fails that it overrides other concerns.
It depends on the purpose and length of each of your builds. Basically, you should identify what you are trying to learn from CI and decide if it is worth while spending the resources on running multiple builds.
We used continuous integration at my last job for a few different purposes.
First, we used it to make sure that the repository and thus the developers always had a version of the code that compiled. Few things are worse for team members than having to manage another person's broken changes through commenting, uncommenting, reverting and merging, because one person checked in bad code. For this, we had a build that ran instantaneously with no tests or other validation so we knew as soon as possible if the code was safe to update. Builds usually took about ten minutes and the machine was probably running around 50% on a normal workday. No documentation was generated here, just a quiet pass or a loud fail siren.
Second, we wanted to know as soon as possible whether or not any rules were broken. The quicker that you find a broken rule, the easier it is to fix. For this purpose, we had a separate machine that ran a full build and validation of the code. This machine was running 12-14 hours a day continuously on a normal workday. Email status of the build was sent out describing broken unit tests, code compliance, etc.
We stopped there as far as automatically triggered builds go. A nightly build on top of that seemed a bit extreme for us. But I suppose if you wanted to have a snapshot build archived daily, you may want to schedule a third build with the extra steps required for that. Though, we did have another build that wrapped up and archived our QA deployment artifacts for quick and easy deployment, but we only ever manually triggered that one.
I think the other posts cover the common reasons, like having a build process that takes "too long" or having to run only a subset of tests during the CI build. But there's another reason which is political.
In some organizations the official builds are handled by a minimally responsive build/infrastructure/release management/SCM team. In these cases you might put them in charge of the nightly build and then run the CI build out of development. This avoids a fight because their build remains "the official build" and your CI build give you the feedback you need.
We have both continuous integration and nightly builds in place. They serve two different purposes.
Our continuous integration mechanism builds the software and runs unit tests under the continuous integration suite.
Our nightly build tags the source under version control, builds the software, runs the unit tests under nightly build suite. The software built here is then used in various system tests and stress tests.
I think that one of the main differentiator for nightly build is system tests.

What is a good CI build-process

What constitutes a good CI build-process?
We use CI, but is deployment to production even a realistic CI goal when you have dependencies on several services that should be deployed too and other apps may depend on these too.
Is a good good CI build process good enough when its automated to QA and manual from there?
Well "it depends" :)
We use our CI system to:
build & unit test
deploy to single box, run intergration tests and code analisys
deploy to lab environment
run acceptance tests in prod-like system
drop builds that pass to code drop for prod deployment
This is for a greenfield project of about a dozen services and databases deployed to 20+ servers, that also had dependencies on half a dozen other 'external' services.
Using a CI tool to deploy your product to a production environment as a realistic goal? again... "it depends"
Why would you want to do this?
if you have the process you can roll changes (and roll back) faster and more often
less chance for human error
you can test the same deployment strategy in a test environment before going to production and catch issues earlier
Some technical things you have to address before you can answer this:
what is the uptime requirements for your system -- Are you allowed to have downtime or does it need to be up 24/7?
do you have change control processes in place that require human intervention/approval?
is your deployment robust enough for any component to roll back to a known-good state if a deployment fails?
is your system designed to handle different versions of services or clients in case one or several component deployments fails (and you have the above rollback to last known good)?
does the process have the smarts to handle a partial deployment where a component cannot handle mixed versions of its dependencies/clients?
how are you handing database deployment/upgrades?
do you have monitoring in place so you know when something goes wrong?
Here are a couple of recent related links about automation and building the tools you need.
When it comes down to it the more complex your system the more difficult it is do automate everything, but that does not mean it is not a worthy goal, it just takes a lot more effort and willpower to get it done -- everything from knowing the difficulties you're going to face, the problems you have to account for (failure will happen), the political challenges of building infrastructure (vs. more product features).
Now heres the big secret... the technical challenges are challenging but not impossible... the political challenges may be insurmountable. Everything about this costs money whether its dev time or buying 3rd party solutions. So really, can you build the $1K, $10K, $100K, or $1M solution?
Whatever solution you go for make sure the automation is robust first, complete second... i.e. make sure you have as robust a solution as you can for getting deployment to a test environment rather than a fragile solution that deploys to production.
CI is not intended as a deployment mechanism. It is good to have your CI execute any automated deployment to a QA/Test server, to ensure those aspects of your build work, but I would not use a CI system like Cruise Control or Bamboo as the means of deployment.
CI is for building the codebase periodically to automate execution of automated tests, verification of the codebase via static analysis and other checks of that nature.
Be sure you understand the idea behind a CI build. CI stands for Continuous Integration and CI builds are really intended to be throw-away builds that are performed when a developer checks code in to the source control system (or at some specified interval) to ensure that the newest changes do not break the code base (hence the idea of continuously integrating the changes to the code base).
To that end, the technology used for the actual build server process is largely irrelevant compared to what actually happens during the build. As #pdavis mentioned, the CI build should compile the code base, execute some code analysis (FxCop, StyleCop, Lint, etc.), execute unit tests and code coverage, and execute any other custom analysis you want performed that should impact the concept of a "successful" or "failed" build.
Having a CI build automatically deploy to an environment really doesn't fall under the control of a build server. That being said, you can always create a separate project that runs on the build server that handles the deployment when it detects certain conditions (such as a build completes successfuly), but that should always be done as a completely independent thing.
I am starting on a new project at work that I am really looking forward to. We are still in the initial design stage and have just recently completed the Logical System Architecture. We have ordered new servers for the testing and staging environments and are setting up a Continuous Integration (CI) build system based on Cruise Control (http://cruisecontrol.sourceforge.net/) and MSBuild (http://msdn2.microsoft.com/en-us/library/wea2sca5.aspx) which is basically an improved port of ANT. It appears that Visual Studio 2005 project and solution files are all now in MSBuild format. Cruise Control will be automatically pulling the source from Visual Source Safe (ok, it isn't Subversion but we can deal), compiling it, and then running it through fxCop (http://www.gotdotnet.com/Team/FxCop/), nUnit (http://www.nunit.org/), nCover (http://ncover.org/site/), and last but not lease Simian (http://www.redhillconsulting.com.au/products/simian/). Cruise Control has a pretty good website interface for displaying all of the compiled results from the various tools and can even display code changes from one build to the next. It also keeps track of all builds in a build history. I'm looking forward to the test driven development and think that this type of approach combined with nUnit/nCover should give us a pretty good idea before we roll out changes that we haven't broken anything. There are also plans to incorporate some type of automated user interface testing once we are far enough along in the project. Depending on the tool, this should be just a matter of installing the tool on the build server and calling it from Cruise Control. Sweet.
A good CI process will have full or nearly-full unit test coverage. Unit tests test classes and methods, vs. integration tests, which test multiple parts of the system. When you set up your CI builds, have them automate the unit tests. That way, the CI builds can run multiple times per day. We have ours set to run every 2 hours.
You can have longer running builds that run once per day. These can use other services and run integration tests.
I was watching a ThoughtWorks presentation (creators of Cruise Control) and they actually addressed this issue. Their answer is that NO deployment is too complex to test. Why? Because otherwise, your customers become your testers, which is exactly where you don't want to be.
If you have a complex deployment structure, set up a visualization server. Have it pretend to be all the systems you need to talk to. They can always start in a known good state, because you can reset to a clean image.
To answer your initial question, a good process is one which enables communication between the repository and the developers. If the repository is in a bad state (non-compiling code, failed tests, etc.), the developers know about it as soon as possible, so that they can correct it.
The later a bug is discovered, the costlier it is to fix. So bugs should be discovered as early as possible. This is the motivation behind CI.
A good CI should ensure catching as many bugs as possible. The whole application comprises of code (often in multiple languages), Database schema, deployment files etc. Errors in any of these can cause bugs - so the CI should try to exercise as many of them as possible.
CI does not replace a proper QA discipline. Also, CI need not be very comprehensive on day one of the project. One can start with a simple CI process that does basic compilation & unit testing initially. As you discover more classes of bugs in QA, you should adapt the CI process to try to catch future occurrences of those bugs. It can also involve static code-analysis checks, so that you can implement consistent coding and design ideals across the codebase.

Resources