Say you got an old codebase that you need to maintain and it's clearly not compliant with current standards. How would you distribute your efforts to keep the codebase backwards compatible while gaining standards compliance? What is important to you?
At my workplace we don't get any time to refactor things just because it makes the code better. There has to be a bug or a feature request from the customer.
Given that rule I do the following:
When I refactor:
If I have to make changes to an old project, I cleanup, refactor or rewrite the part that I'm changing.
Because I always have to understand the code to make the change in the first place, that's the best time to make other changes.
Also this is a good time to add missing unit tests, for your change and the existing code.
I always do the refactoring first and then make my changes, so that way I'm much more confident my changes didn't break anything.
Priorities:
The code parts which need refactoring the most, often contain the most bugs. Also we don't get any bug reports or feature requests for parts which are of no interest to the customer. Hence with this method prioritization comes automatically.
Circumenventing bad designs:
I guess there are tons of books about this, but here's what helped me the most:
I build facades:
Facades with a new good design which "quarantine" the existing bad code. I use this if I have to write new code and I have to reuse badly structured existing code, which I can't change due to time constraints.
Facades with the original bad design, which hide the new good code. I use this if I have to write new code which is used by the existing code.
The main purpose of those facades is dividing good from bad code and thus limiting dependencies and cascading effects.
I'd factor in time to fix up modules as I needed to work with them. An up-front rewrite/refactor isn't going to fly in any reasonable company, but it'll almost definitely be a nightmare to maintain a bodgy codebase without cleaning it up somewhat.
Related
I've got a bit of an issue. I'm writing and continually developing an ASP.NET MVC application. The problem is, everytime I update one small part of our site (not even anything to do with our database), it seems to break about 4 other things in other parts of the site. I've been doing my best to anticipate it, but in the end, I know there are better ways out there to test, and I'm just wondering what the general consensus is for best practices?
Thanks!
In short - not organized answer!
Have a plan.
Know what the change is.
Document what you are going to test and why.
Maintain a test suite (Regression) to execute after all major changes.
Keep improving your test cases and adding the ones you missed
(there should be a test case associated with all your bugs.
This will ensure the same bug doesn't get introduced due to regression.
And like you said have a proper structure, ensure process implementation, have better code reviews (catch the bugs before they get submitted) and keep reading a lot about all this.
This answer is open for editing to make it better.
Q:
Lately ,i asked for testing a code ,,to detect the bugs and fix the problems ,,i find many problems ,, but the main problem here is the code it self ,, spaghetti code many many code lines and the tracing to fix problems is so difficult ,, plus some code is copied and pasted from the internet as it is without any modification,, no documentation is possible to this code,,the performance is so bad due to the heavy using of viewstate in every thing,, and it takes me a lot of time to fix the problems ,, and iam afraid of after all this time still other bugs may appear in the future .. how to handle this case ,,any advices concerning this issue will be a great help.
Start refactoring. This will not make only the code better readable it will also give you a btter understanding on how it works and what it does.
That's a very common situation - software was developed for some years and by many different developers and now you have to support it. You should fight the will to throw it away and rewrite the whole application - it is a big effort and chances are big that you will do many mistakes that are already fixed in bad code. See Joel article for more explanations.
In my experience the best way is to refactor the code. However it involves writing a lot of tests - unit, automated acceptance - and it will take almost the same time as rewriting, but it will be less pain
I think the mentality for working with spaghetti code is summed up quite well in Refactoring by Martin Fowler.
The picture that comes to my mind is surgery: The entire patient except the part to be operated on is draped. The draping leaves the surgeon with only a fixed set of variables. Now, we could have long arguments over whether this abstraction of a person to a lower left quadrant abdomen leads to good health care, but at the moment of surgery, I'm kind of glad the surgeon can focus
Basically what that means is start small, and update a single part of the code at a time. Little changes will add up to good, structured code in the future.
Of course, if there are major problems with the entire architecture, you may have to heed Cybernate's advice and start from scratch (unpleasant as it may be).
I would also write unit tests as well as refactoring as codymanix said. This does a couple things:
As you refactor, you will have some check that your refactoring won't break things.
Writing the tests is a good way of learning the domain knowledge embedded in the code.
Of course, there is an inherent catch-22 with adding unit tests to spaghetti code: it is usually hard to unit test, and you need to refactor it to be able to make it testable. But if you go bit by bit and start with the low-hanging fruit, usually you can make it readable.
Scrap the existing code and start afresh.
That involves lot less effort comparitively.
If that's not a possible option start with Refactoring using tool like ReSharper that will give a good start.
I just finished working on a project for the last couple of months. It's online and ready to go. The client is now back with what is more or less a complete rewrite of most parts of the application. A new contract has been drafted and payment made for the additional work involved.
I'm wondering what would be the best way to start reworking this whole thing. What are the first few things you would do? How would you rework the design in a way that you stay confident that the stuff you're changing does not break other stuff?
In short, how would you tackle drastic application design changes efficiently (both DB and code)?
Presuming that you have unit tests in place, this is just refactoring.
If you don't have unit tests in place, then
Write unit tests for the parts you're likely to keep.
Write unit tests for the parts you're going to change.
Run the tests. The "keep" should pass. The "change" should fail.
Start refactoring until the tests pass.
This is NOT-A-NEW thing in software and people have done this and written a lot about this.
Try reading
Working Effectively with Legacy
Code
Refactoring Databases:
Evolutionary Database Design
The techniques explained here are invaluable to sustain any kind of long running IT projects.
Database design is different from application design in this regard.
Very often, client rethinking changes the application completely, but changes little, if anything, in the fundamental underlying data model of the enterprise. The reason for this is that clients tend to think in terms of business processes, but not in terms of fundamental data. Business processing and data processing are tightly coupled. Data storage is less tightly coupled.
In the days of classical database design, designers learned how to exploit this pattern, by dividing their database design into (at least) two layers: logical design and physical design. There are any number of times that a change of business process requires a complete rewrite of the application, and a major rework of the database physical design, but requires few, if any, changes to the logical design.
If your database design didn't separate out the layers like this, it's hard to tell what gets affected and what doesn't. Start with your tables and columns. Ask yourself if any of the changes require removing any column from the table it's in, or require inventing new columns. If the answer is no, you're in luck. Next, look at the constraints placed on the database (things like PRIMARY KEY, FOREIGN KEY, UNIQUE and NOT NULL). These constraints might be tightened or loosened by the client's changes. If not, you're in luck. If you didn't declare any constraints in the database, and chose to do all your integrity protection in application code, you're probably out of luck.
You still have a fair amount of work to do in terms of changing the indexes on the tables, and the way the application works with the data. But you've salvaged part of the investment in the old system.
The application itself is much more vulnerable to client changes in process than the database. If your database design was completely driven by your application design, you may be out of luck.
If it's THAT drastic of a change it might be best to just start over. I've worked on a number of projects that have gone through some drastic changes.
Starting over gives you a chance to use experience learned since the last project and provide a more efficent product.
I would recommend against trying to re-work the old site into the new site, you'll probably spend more time fiddling around changing things than you would have if you had just re-written it.
Best of luck to you !
How would you rework the design in a way that you stay confident that the stuff you're changing does not break other stuff? In short, how would you tackle drastic application design changes efficiently (both DB and code)?
Tests, code complexity/coverage metrics, and a continuous integration system. Run them early and often, so you know which parts are the riskiest and where to start writing.
These will become your safety nets when you have to make potentially problematic changes. If something does break, your CI system will tell you, and you won't have spent weeks down some rabbit hole before you realize there's a problem.
Sometimes you do things better the second time around so just try and stay positive. Plus you will have more domain knowledge this time around.
Although I am using drupal since the D4 series, I only started developing professionally for it with D6, so - despite I did various site upgrades - I was never faced by the task of having to port my own code to a new version.
I know the Drupal community will come up with lot of technical support about changed API's and architectural changes (see the deadwood module for D5-D6 or even these stubs of D6-D7 how-to's for modules and themes).
However what I am looking for with my question is more in the line of strategy thinking, or in other words, I am looking for inputs and advice on how to plan / implement / review the process of porting my own code, in the light of what colleague developers learned by previous experience. Some example:
Would you advice to begin to port my modules as soon as I have time for doing it, and to maintain a concurrent D7 for some time (so I am "prepared" for the D-day) or would you advice to rather wait for the day in which the port will be actually imminent and then upgrade the modules to D7 and drop the D6 version?
Only some of my modules have full test coverage. Would you advice to complete test coverage for the D6 version so to have all tests working to check the D7 port, or would you advice to write my test directing at porting time, to test the D7 version?
Did you find that being an early adopter gives you an edge in terms of new features and better API's or did you rather find that is more convenient to delay the conversion so as to leverage the larger amount of readily available contrib modules?
Did you set for yourself quality standards / evaluation criteria or did you just set the bar to "if it works, I'm happy"? Why? If you set certain standards or goals, what did they where / what will they be? How did they help you?
Are there common pitfalls that you experienced in the past and that you think are applicable to the D6-D7 porting process?
Is porting a good moment to do some refactoring or it is just going to make everything more complex to be put back together?
...
These questions are not an exhaustive list, but I hope they give an idea of what kind of information I am looking for. I would rather say: whatever you think is relevant and I did not list above gets a "plus"! :)
If I did not manage to express myself clearly enough, please post a comment with the info you think I should add in the question. Thank you in advance for your time!
PS: Yes I know... D7 is not yet out and it will take months before important contrib modules will be upgraded... but it's never too early to start thinking! :)
Good questions, so let's see:
(when to start porting)
This certainly depends on the complexity of the modules to port. If there are really complex/large ones, it might be useful to start early in order to find tricky spots while not being under pressure. For smaller/standard ones, I'd try to find a bigger time slot later on where I can port many of them in a row in order to get the routine stuff memorized quickly (and benefit from the probably improved documentation).
(test coverage)
Normally I'd say that having a good test coverage before starting refactoring/porting would certainly be advisable. But given that Drupal-7 introduces a major change concerning the testing framework by moving it to core, I'd expect the need to rewrite a significant amount of tests anyway. So if there is no need to maintain the Drupal-6 versions after the migration, I'd save the time/trouble and aim for increased coverage after the porting.
(early adopter vs. wait and see)
Using Drupal since the 4.7 version, we have always waited for at least the first official release of a new major version before even thinking about porting. With Drupal 6, we waited for the views module before porting our first site, and we still have some smaller projects on Drupal-5, as they are working just fine and it would be hard to justify the extra bill for our clients as long as there are still maintenance/security fixes for it. There is just so much time in a day and there is always this backlog of bugs to fix, features to add, etc., so no use playing with unfinished technology while there are more imminent things to do that would immediately benefit our clients. Now this would certainly be different if we'd have to maintain one or more 'official' contributed modules, as offering an early port would be a good thing.
I'm a bit in a bind here - being an early adopter certainly benefits the community, as someone has to find that bugs before they can get fixed, but on the other hand, it makes little business sense to fight hour after hour with bugs others might have found/fixed if you'd just waited a bit longer. As long as I have to do this for a living, I need to watch my resources, trying to strike an acceptable balance between serving the community and benefiting from it :-/
(quality standards)
"If it works, I'm happy" just doesn't cut it, as I don't want to be happy momentarily only, but tomorrow as well. So one of my quality standards is that I need to be (somewhat) certain that I 'grokked' new concepts well enough in order to not just makes things work, but make them work like they should. Now this is hard to define more precisely, as it is obviously impossible to know if one 'got it' before 'getting it', so it boils down to a gut feeling/distinction of 'yeah, it kinda works' vs. 'yup, that looks right', and one has to accept that he will quite regularly be wrong about this.
That said, one particular point I'm looking out for is 'intervene as early as possible'. As a beginner, I often tweaked stuff 'after the fact' during the theming stage, while it would have been much easier to apply the 'fix' earlier in the processing chain by means of one hook or the other. So right now, whenever I'm about to 'adjust' something in the theme layer, I deliberately take a small time out to check if this can not be done more cleanly/compatible within a hook earlier on. As I expect Drupal-7 to add even more hooking options, this is something I will pay extra attention to, as it usually reduces conflicts and sudden 'breaking of stuff' when adding new modules.
(common pitfalls)
Well - mainly porting to early, finding out afterwards/in between that one or more needed modules were not available for the new version at all, or only in dev/alpha/early beta state. So I'd make sure to compile a complete list of used/needed modules first, listing their porting state, along with a quick inspection of their issue queues.
Besides that, I have so far always been very pleased with the new versions and their improvements, and I'm looking forward for Drupal-7 again.
(refactoring while porting)
One could say that porting is a rather large refactoring in itself, so there is no need to add to the complexity by restructuring non porting related stuff. On the other hand, if you already have to shred your modules to pieces anyway, why not use the opportunity to make it a major overhaul? I'd try to draw a line based on complexity - for big/complex modules, I'd do the port as straight as possible, and refactor more later on, if need be. For smaller modules, it shouldn't really matter, as the likelihood of introducing subtle bugs should be rather small.
(other stuff)
... need to think about it ...
Ok, other stuff:
Resource needs - given some of the Drupal-7 threads, it looks like they are likely to go up, so this should be evaluated before porting smaller sites that sit on a shared/restricted hosting account.
Latest versions first - This one is rather obvious and always stressed in the upgrade guides, but nevertheless worth mentioning: Upgrade core and all modules to their latest current version first before doing a major upgrade, as the upgrade code is highly likely to depend on the latest table/data structures to work correctly. Given Drupals 'piecemeal', one step at a time update strategy, it would be very hard to implement upgrade code that would detect different pre-upgrade states and acted accordingly.
I have recently been put in charge of debugging two different programs which will eventually need to share an XML parsing script, at the minimum. One was written with PureMVC, and another was built from scratch. While it made sence, originally, to write the one from scratch (it saved a good deal of memory, but the memory problems have since been resolved).
Porting the non-PureMVC application will take a good deal of time and effort which does not need to be used, but it will make documentation and code-sharing easier. It will also lower the overall learning curve. With that in mind:
1. What should be taken into account when considering whether it is best to move things to one standard?
(On a related note)
Some of the code is a little odd. Because the interpreting App had to convert commands from one syntax to another, it made sense to have an interpreter Object. Because there needed to be communication with the external environment, it made more sense to have one object interact with the environment, and for that to deal with the interpreter exclusively.
Effectively, an anti-Singleton was created. The object would only interface with the interpreter, and that's it. If a member of another class were to try to call one of its public methods, the object would raise an Exception.
There are better ways to accomplish this, but it is definitely a bit odd. There are more standard means of accomplishing the same thing, though they often involve the creation of classes or class files which are extraordinarily large. The only solution which I could find that was standards compliant would involve as much commenting and explanation as is currently required, if not more. Considering this:
2. If some code is quirky, but effective, is it better to change it to make it less quirky, even if it is made a more unwieldy?
In my opinion this type of refactoring is often not considered in schedules and can only be done when there is extra time.
More often than not, the criterion for shipping code is if it works, not necessarily if it's the best possible code solution.
So in answer to your question, I try and refactor when I have time to do so. Priority One still remains to produce a functional piece of code.
Things to take into account:
Does it work as-is?
As Galwegian notes, this is the only criterion in many shops. However, IMO just as important is:
How skilled are the programmers who are going to maintain it? Have they ever encountered nonstandard code? Compare the cost of their time to learn it (including the cost of delayed dot releases) to the cost of your time to refactor it.
If you're maintaining it, then instead consider:
How much time will dealing with the nonstandard code cost you over the intended lifecycle of the codebase (e.g., the time between now and when the whole thing is rewritten)?
That's hard to guess, but consider that many codebases FAR outlive the usefulness envisioned by their original authors. (Y2K anyone?) I've gradually developed a sense of when a refactoring is worthwhile and when it's not, mostly by erring on the side of "not" too often and regretting it later.
Only change it if you need to be making changes anyway. But less quirky is always a good goal. Most of the time spent on a particular piece of software is in maintenance, so if you can do something to make that easier, you'll be reducing the overall time spent on that piece of code. Nonetheless, don't change something if it's working and doesn't need any modifications.
If you have time, now. If you don't have time and it can be avoided, later.