Upgrading from Drupal 6 to Drupal 7: best programmer's practices?

Although I have been using Drupal since the D4 series, I only started developing professionally for it with D6, so - although I have done various site upgrades - I have never faced the task of porting my own code to a new version.
I know the Drupal community will come up with a lot of technical support about changed APIs and architectural changes (see the deadwood module for D5-D6 or even these stubs of D6-D7 how-tos for modules and themes).
However, what I am looking for with my question is more along the lines of strategic thinking. In other words, I am looking for input and advice on how to plan / implement / review the process of porting my own code, in the light of what fellow developers have learned from previous experience. Some examples:
Would you advise me to begin porting my modules as soon as I have time for it, and to maintain a concurrent D7 version for some time (so I am "prepared" for D-day), or would you advise me to wait until the port is actually imminent, then upgrade the modules to D7 and drop the D6 version?
Only some of my modules have full test coverage. Would you advise completing test coverage for the D6 version, so that all tests are in place to check the D7 port, or would you advise writing my tests directly at porting time, to test the D7 version?
Did you find that being an early adopter gives you an edge in terms of new features and better APIs, or did you rather find that it is more convenient to delay the conversion so as to leverage the larger number of readily available contrib modules?
Did you set quality standards / evaluation criteria for yourself, or did you just set the bar at "if it works, I'm happy"? Why? If you set certain standards or goals, what were they / what will they be? How did they help you?
Are there common pitfalls that you experienced in the past and that you think are applicable to the D6-D7 porting process?
Is porting a good moment to do some refactoring, or is it just going to make everything more complex to put back together?
...
These questions are not an exhaustive list, but I hope they give an idea of what kind of information I am looking for. I would rather say: whatever you think is relevant and I did not list above gets a "plus"! :)
If I did not manage to express myself clearly enough, please post a comment with the info you think I should add in the question. Thank you in advance for your time!
PS: Yes I know... D7 is not yet out and it will take months before important contrib modules will be upgraded... but it's never too early to start thinking! :)

Good questions, so let's see:
(when to start porting)
This certainly depends on the complexity of the modules to port. If there are really complex/large ones, it might be useful to start early in order to find tricky spots while not being under pressure. For smaller/standard ones, I'd try to find a bigger time slot later on where I can port many of them in a row in order to get the routine stuff memorized quickly (and benefit from the probably improved documentation).
(test coverage)
Normally I'd say that having good test coverage before starting refactoring/porting would certainly be advisable. But given that Drupal-7 introduces a major change concerning the testing framework by moving it to core, I'd expect the need to rewrite a significant number of tests anyway. So if there is no need to maintain the Drupal-6 versions after the migration, I'd save the time/trouble and aim for increased coverage after the porting.
(early adopter vs. wait and see)
Using Drupal since the 4.7 version, we have always waited for at least the first official release of a new major version before even thinking about porting. With Drupal 6, we waited for the views module before porting our first site, and we still have some smaller projects on Drupal-5, as they are working just fine and it would be hard to justify the extra bill for our clients as long as there are still maintenance/security fixes for it. There is just so much time in a day and there is always this backlog of bugs to fix, features to add, etc., so no use playing with unfinished technology while there are more imminent things to do that would immediately benefit our clients. Now this would certainly be different if we'd have to maintain one or more 'official' contributed modules, as offering an early port would be a good thing.
I'm a bit in a bind here - being an early adopter certainly benefits the community, as someone has to find those bugs before they can get fixed, but on the other hand, it makes little business sense to fight hour after hour with bugs others might have found/fixed if you'd just waited a bit longer. As long as I have to do this for a living, I need to watch my resources, trying to strike an acceptable balance between serving the community and benefiting from it :-/
(quality standards)
"If it works, I'm happy" just doesn't cut it, as I don't want to be happy momentarily only, but tomorrow as well. So one of my quality standards is that I need to be (somewhat) certain that I 'grokked' new concepts well enough to not just make things work, but make them work like they should. Now this is hard to define more precisely, as it is obviously impossible to know if one 'got it' before 'getting it', so it boils down to a gut feeling/distinction of 'yeah, it kinda works' vs. 'yup, that looks right', and one has to accept that one will quite regularly be wrong about this.
That said, one particular point I'm looking out for is 'intervene as early as possible'. As a beginner, I often tweaked stuff 'after the fact' during the theming stage, while it would have been much easier to apply the 'fix' earlier in the processing chain by means of one hook or the other. So right now, whenever I'm about to 'adjust' something in the theme layer, I deliberately take a small time out to check whether this cannot be done more cleanly/compatibly within a hook earlier on. As I expect Drupal-7 to add even more hooking options, this is something I will pay extra attention to, as it usually reduces conflicts and sudden 'breaking of stuff' when adding new modules.
(common pitfalls)
Well - mainly porting too early, finding out afterwards/in between that one or more needed modules were not available for the new version at all, or only in dev/alpha/early beta state. So I'd make sure to compile a complete list of used/needed modules first, listing their porting state, along with a quick inspection of their issue queues.
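To make the module inventory concrete, here is a minimal sketch in Python (chosen only for illustration; it is not Drupal code). The module names, release states and issue counts are invented placeholders, and the `STATES` ordering and the `blockers` helper are assumptions of this sketch rather than anything Drupal provides.

```python
from dataclasses import dataclass

# Release states ordered from least to most ready (illustrative labels only).
STATES = ["none", "dev", "alpha", "beta", "rc", "stable"]


@dataclass
class ModuleStatus:
    name: str
    d7_state: str     # one of STATES, filled in by hand from the project page
    open_issues: int  # rough count from a quick look at the issue queue


def blockers(modules, minimum="rc"):
    """Return the names of modules whose D7 port is below the minimum state."""
    threshold = STATES.index(minimum)
    return [m.name for m in modules if STATES.index(m.d7_state) < threshold]


# Hypothetical inventory: every value here is a placeholder, not real data.
inventory = [
    ModuleStatus("module_a", "beta", 40),
    ModuleStatus("module_b", "none", 0),
    ModuleStatus("module_c", "stable", 12),
]

print(blockers(inventory))
```

If the list of blockers is not empty, that is probably a sign that it is still too early to port the site in question.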
Besides that, I have so far always been very pleased with the new versions and their improvements, and I'm looking forward to Drupal-7 again.
(refactoring while porting)
One could say that porting is a rather large refactoring in itself, so there is no need to add to the complexity by restructuring non porting related stuff. On the other hand, if you already have to shred your modules to pieces anyway, why not use the opportunity to make it a major overhaul? I'd try to draw a line based on complexity - for big/complex modules, I'd do the port as straight as possible, and refactor more later on, if need be. For smaller modules, it shouldn't really matter, as the likelihood of introducing subtle bugs should be rather small.
(other stuff)
... need to think about it ...
Ok, other stuff:
Resource needs - given some of the Drupal-7 threads, it looks like they are likely to go up, so this should be evaluated before porting smaller sites that sit on a shared/restricted hosting account.
Latest versions first - This one is rather obvious and always stressed in the upgrade guides, but nevertheless worth mentioning: Upgrade core and all modules to their latest current version first before doing a major upgrade, as the upgrade code is highly likely to depend on the latest table/data structures to work correctly. Given Drupal's 'piecemeal', one step at a time update strategy, it would be very hard to implement upgrade code that would detect different pre-upgrade states and act accordingly.

Related

Real life experience developing with Meteor

I'm working on a project where we have to take the decision soon whether to invest in our current technology stack to improve it and make it more flexible to support our time to market (LAMP based stack) or whether to change to a different stack in the hope that it would make our development faster, more efficient and possibly more fun.
One framework we're looking at is Meteor. So I'm wondering: Does anyone have real life experience with starting or shifting a medium-sized project to Meteor (3 developers, couple of hundred active users, mostly short-lived small pieces of user-generated content that are viewed by all users and need to be updated instantly)? Do you have metrics on productivity, code quality, code efficiency that you could share? Or just an overall feeling for how it went? How happy are you with Meteor after working with it for more than just a week or two? How is maintainability over a longer period? How well does it scale up?
Would appreciate any insight!
I'll try to be as fact based as possible to keep this objective:
I switched from Django to Meteor, PostgreSQL to MongoDB.
Switching stacks has a huge cost. A new language, syntax, patterns, and maybe even IDE. Online courses to be taken, a solid node.js foundation, curiosity about io.js, ES6, and Mongo 3.0. A refresher on how JavaScript treats Dates and numbers, and how to use JavaScript to query mongo.
On top of that, you'll want your developers to peek under the hood to see the Meteor magic so they understand fibers, reactivity, DDP, and minimongo. All these things will cost each developer at LEAST 160 hours, yet are necessary to be a competent developer. Skip these steps, and you've got a team of monkeys pulling levers.
To answer your questions:
Productivity? It will hit rock bottom along with code quality. Then slowly climb, and possibly exceed the previous mark (IF it's something the developers enjoy). This is because client & server are in the same language & just a file away. Debugging messages & stack traces are pretty good & hot code reloads, although still not great, are good.
Code quality has absolutely nothing to do with the framework.
Code efficiency is good because reactivity is handled behind the scenes most of the time and fibers make it possible to write server code in a synchronous fashion. This increases code readability.
Maintainability is another word for code quality.
Scalability is more of a question about node.js, but will work for the VAST majority of projects. An honest critique of node's shortcomings is here: https://medium.com/code-adventures/farewell-node-js-4ba9e7f3e52b

to rewrite or not to rewrite

So I need to maintain this old legacy project. One part of it is amateurishly written with WordPress, with lots of crappy custom plugins and lots of duct-taped scripts that solve one problem or another, and the database is very poorly designed too. Another part is written with Zend and is tightly coupled to the WP part, and there is also this other "masterpiece" project connected to the first project's data. The main table contains around 1.5M records and needs to be normalized too. Now this big ball of nails "works", but it has lots of LOC as a result of bad foundations, so it is a huge pain to maintain. The way I see it, by not rewriting we are losing in the long term, because we lose flexibility from both a technology and a business perspective, and it is starting not to scale. But rewriting is a risk, and we would need to convert old data to new data structures. The hacker part of me wants to break this, take a risk and do it right, but at the same time I have a feeling that my immaturity is trying to take too big a bite at once. So what do you think?
In situations like yours, 9 out of 10 times people suggest a rewrite and they are wrong.
Unless you have great application level knowledge about what the system is doing you will not be able to rewrite it successfully, quickly.
If the system is working today, but is crappy in many ways, and you have management buy-in (they own the software) to "fix stuffs", then I suggest an incremental approach will often be better than a full-on rewrite.
I suspect that the database is giving you the most headaches, so that may be the best place to start. Start by understanding the problem that it is currently solving and write that down. If there is no layer between the software and the db (other than jdbc or the like) add a layer. Once there is a layer separating the db from the application, it will be easier to change the db (and the layer) while minimizing the impact on the application.
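As a rough illustration of such a layer, here is a minimal sketch in Python using the standard `sqlite3` module. The question's stack is different, so treat this purely as an outline of the idea. The `CustomerStore` class, the `customers` table and its columns are hypothetical names invented for this example.

```python
import sqlite3


class CustomerStore:
    """The only place that knows table and column names (hypothetical schema)."""

    def __init__(self, conn):
        self.conn = conn

    def add(self, name, email):
        cur = self.conn.execute(
            "INSERT INTO customers (name, email) VALUES (?, ?)", (name, email)
        )
        self.conn.commit()
        return cur.lastrowid

    def find_by_email(self, email):
        row = self.conn.execute(
            "SELECT id, name, email FROM customers WHERE email = ?", (email,)
        ).fetchone()
        if row is None:
            return None
        # Callers get a plain dict, never a raw row tied to the table layout.
        return {"id": row[0], "name": row[1], "email": row[2]}


# In-memory database so the sketch runs on its own.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")

store = CustomerStore(conn)
new_id = store.add("Ada", "ada@example.com")
print(store.find_by_email("ada@example.com"))
```

Once the application only talks to a class like this, the tables behind it can be renamed, split or normalized by changing the queries in one place.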
At some point you will be happy with the first thing you changed. At that point, fix some other part. Repeat until the system is "more better".
Concerning risk: Taking risks is not bad, but being careless is terrible. Understand the risks and plan to mitigate them.
In situations like yours, 9 out of 10 times, I suggest a rewrite. Rationally, the situation a) won't get better, and b) will certainly get worse. You should bite the bullet before it's too late.
And by too late I mean something breaks completely and not only will you have to rewrite the whole thing, but your service will also be offline (ergo you may be also losing users/customers).
A good strategy in these situations is to "strangle" your application as described by Martin Fowler:
http://martinfowler.com/bliki/StranglerApplication.html
The strategy is to gradually create a new system around the edges of the old, letting it grow slowly over several years until the old system is strangled.
I already strangled a legacy application using this approach with great results and practically no offline time.
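One common way to apply the strangler idea is a thin routing layer in front of both systems, which sends already-migrated paths to the new code and everything else to the old one. The sketch below is only an illustration in Python; `StranglerRouter`, `legacy_app` and `new_app` are hypothetical stand-ins, not part of any framework.

```python
class StranglerRouter:
    """Sends migrated path prefixes to the new system, the rest to the legacy one."""

    def __init__(self, legacy_handler, new_handler):
        self.legacy = legacy_handler
        self.new = new_handler
        self.migrated_prefixes = set()

    def migrate(self, prefix):
        # Call this each time a slice of functionality has been rebuilt.
        self.migrated_prefixes.add(prefix)

    def handle(self, path):
        if any(path.startswith(p) for p in self.migrated_prefixes):
            return self.new(path)
        return self.legacy(path)


# Stand-ins for the two systems (assumptions for the example).
def legacy_app(path):
    return f"legacy:{path}"


def new_app(path):
    return f"new:{path}"


router = StranglerRouter(legacy_app, new_app)
router.migrate("/reports")

print(router.handle("/reports/2024"))
print(router.handle("/orders/7"))
```

The old system keeps serving everything that has not been migrated, so each slice can be moved over and rolled back independently.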

How to ensure quality checkins with continuous build systems?

I religiously go through all of my code before I check it in: I do a diff of the before and after code, read through it, and make sure I understand the changes. Usually I end up having to add comments, amend variable names, amend algorithms, amend code, retest things, discuss with other developers about their code, add new bugs/issues, but I very rarely end up doing the checkin immediately.
I do notice however that a lot of developers these days seem to check their code in and think that when the build breaks that it is enough, then they go back and look at their changes. This is one of the things about continuous build systems that I definitely do not like, in that sometimes I think developers stop thinking about their code enough.
What best practices are there for ensuring only quality code goes into continuous build systems?
I do notice however that a lot of developers these days seem to check their code in and think that when the build breaks that it is enough, then they go back and look at their changes.
In my opinion, using the continuous build for "verification" purposes is indeed a bad practice: developers should always try not to commit bad code that can affect the team and interrupt the work (the why is so obvious that if you don't get that, just look for another job). So, if your CI engine doesn't offer pre-tested commits (like TeamCity, Team Foundation Server as I just saw, maybe Hudson one day, etc.), you should always sync/build/sync (and rebuild if necessary) locally before committing. Not doing this is lazy and disrespectful to the team.
What best practices are there for ensuring only quality code goes into continuous build systems?
Just in case, remind the team why breaking the build is bad: bad code potentially affects the whole team and interrupts the work (sigh).
If you can get tool support (see the mentioned solutions above), get it.
If not, document the right workflow: 1. sync 2. build locally 3. sync 4. back to 2 if required 5. commit. Make this visible so that there are no excuses.
If this is an option, use something like Hudson's Continuous Integration Game plugin. This can make things more fun.
I've seen people using a light financial "penalty" for broken build but I don't really like this idea. First, we should be able to behave as responsible adults. Second, the obtained result was that people started to delay commits (which at the end was the opposite of the expected benefit).
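The sync / build / sync / commit workflow from the list above can be sketched as a small loop. This is only an illustration in Python: the `sync`, `build` and `commit` callables are hypothetical hooks you would wire to your own version control and build commands, and the `max_rounds` limit is an assumption added to keep the loop bounded.

```python
def pretested_commit(sync, build, commit, max_rounds=5):
    """Run the documented workflow.

    sync() must return True if it pulled in new changes,
    build() must return True if the local build (and tests) passed.
    Returns True if the change was committed.
    """
    sync()                        # 1. sync
    for _ in range(max_rounds):
        if not build():           # 2. build locally
            return False          # broken locally: fix it before committing
        if not sync():            # 3. sync again
            commit()              # 5. nothing new arrived, safe to commit
            return True
        # 4. new changes arrived during the build: back to step 2
    return False


# Fake steps so the sketch runs on its own.
log = []
incoming = iter([True, True, False])  # results of the three sync calls


def fake_sync():
    log.append("sync")
    return next(incoming)


def fake_build():
    log.append("build")
    return True


def fake_commit():
    log.append("commit")


ok = pretested_commit(fake_sync, fake_build, fake_commit)
print(ok, log)
```

Wrapping the steps in one command makes the right workflow the easy one, which is the point of documenting it.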
Team Foundation Server has something called pre checkin validation.

How to Convince Programming Team to Let Go of Old Ways?

This is more of a business-oriented programming question that I can't seem to figure out how to resolve. I work with a team of programmers who have been working with BASIC for over 20 years. I was brought in to help write the same software in .NET, only with updates and modern practices. The problem is that I can't seem to get any of the other 3 team members (all BASIC programmers, though one does .NET now as well) to understand how to correctly design a relational database. Here's the thing they won't understand:
We basically have a transaction that keeps track of a customer's tag information. We need to be able to track current transactions and past transactions. In the old system, a flat-file database was used that had one table that contained records with the basic current transaction of the customer, and another transaction that contained all the previous transactions of the customer along with important money information. To prevent redundancy, they would overwrite the current transaction with the history transactions (the history file was updated first, then the current one). It's totally unnecessary since you only need one transaction table, but neither my supervisor nor my other two co-workers seem to understand this. How exactly can I convince them to see the light so that we won't have to do ridiculous amounts of work and end up hitting the database too many times? Thanks for the input!
Firstly I must admit it's not absolutely clear to me from your description what the data structures and logic flows in the existing structures actually are. This does imply to me that perhaps you are not making yourself clear to your co-workers either, so one of your priorities must be to be able to explain, either verbally or preferably in writing and diagrams, the current situation and the proposed replacement. Please take this as an observation rather than any criticism of your question.
Secondly I do find it quite remarkable that programmers with 20 years' experience do not understand relational databases and transactions. Flat file coding went out of the mainstream a very long time ago - I first handled relational databases in a commercial setting back in 1988 and they were pretty commonplace by the mid-90s. What sector and product type are you working on? It sounds possible to me that you might be dealing with some sort of embedded or otherwise 'unusual' system, in which case you do need to make sure that you don't have some sort of communication issue and you're overlooking a large elephant that hasn't been pointed out to you - you wouldn't be the first 'consultant' brought into a team who has been set up in some manner by not being fed the appropriate information. That said, such archaic shops do still exist - one of my current client's systems interfaces to a flat-file based system coded in COBOL, and yes, it is hell to manage ;-)
Finally, if you are completely sure of your ground and you are faced with a team who won't take on board your recommendations - and demonstration code is a good idea if you can spare the time - then you'll probably have to accept the decision gracefully and move on. In this position I would attempt to abstract out the issue - can the database updates be moved into stored procedures, for example, so the code to update both tables is in the SP and can be modified at a later date to move to your schema without a corresponding application change? Make sure your arguments are well documented and recorded so you can revisit them later should the opportunity arise.
You will not be the first coder who's had to implement a sub-optimal solution because of office politics - use it as a learning experience for your own personal development about handling such situations and console yourself with the thought you'll get paid for the additional work. Often the deciding factor in such arguments is not the logic, but the 'weight of reputation' you yourself bring to the table - it sounds like, having been brought in, you don't have much of that sort of leverage with your team, so you may have to work on gaining a reputation by excelling at implementing what they do agree to do before you have sufficient reputation in subsequent cases - you need to be modded up first!
Sometimes you can't.
If you read some XP books, they often say that one of your biggest hurdles will be convincing your team to abandon what they have always done.
Generally they will recommend letting people who can't adapt go to other projects (Or just letting them go).
Code reviews might help in your case. Mandatory code reviews of every line of code are not unheard of.
Sometimes the best argument is an example. I'd write a prototype (or a replacement if not too much work). With an example to examine it will be easier to see the pros and cons of a relational database.
As an aside, flat-file databases have their places since they are so much easier to "administer" than a true relational database. Keep an open mind. ;-)
I think you may have to lead by example - when people see that the "new" way is less work they will adopt it (as long as you don't rub their noses in it).
I would also ask yourself whether the old design is actually causing a problem or whether it is just aesthetically annoying. It's important to pick your battles - if the old design isn't causing a performance problem or making the system hard to maintain you may want to leave the old design alone.
Finally, if you do leave the old design in place, try and abstract the interface between your new code and the old database so if you do persuade your co-workers to improve the design later you can drop the new schema in without having to change anything else.
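A minimal sketch of such an interface, in Python with the standard `sqlite3` module for illustration only (the question is about .NET). The `TransactionStore` class, its method names and the single `transactions` table are assumptions made for this example. A version backed by the existing current/history pair could expose the same three methods, so calling code would not change when the schema does.

```python
import sqlite3


class TransactionStore:
    """Callers see record/current/history; the table layout stays private."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS transactions ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "customer_id INTEGER NOT NULL, "
            "amount REAL NOT NULL)"
        )

    def record(self, customer_id, amount):
        self.conn.execute(
            "INSERT INTO transactions (customer_id, amount) VALUES (?, ?)",
            (customer_id, amount),
        )
        self.conn.commit()

    def current(self, customer_id):
        # The "current" transaction is simply the most recent row.
        row = self.conn.execute(
            "SELECT amount FROM transactions WHERE customer_id = ? "
            "ORDER BY id DESC LIMIT 1",
            (customer_id,),
        ).fetchone()
        return None if row is None else row[0]

    def history(self, customer_id):
        rows = self.conn.execute(
            "SELECT amount FROM transactions WHERE customer_id = ? ORDER BY id",
            (customer_id,),
        ).fetchall()
        return [r[0] for r in rows]


store = TransactionStore(sqlite3.connect(":memory:"))
store.record(1, 10.0)
store.record(1, 25.5)
print(store.current(1), store.history(1))
```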
It is difficult to extract a whole lot except general frustration from the original question.
Yes, there are a lot of techniques and habits long-timers pick up over time that can be useless and even costly in light of technology changes. Some things that made sense when processing power, memory, and even disk was expensive can be foolish attempts at optimization now. It is also very much the case that people accumulate bad habits and bad programming patterns over time.
You have to be careful though.
Sometimes there are good reasons for the things those old timers do. Sadly, they may not even be able to verbalize the "why" - if they even know why anymore.
I see a lot of this sort of frustration when newbies come into an enterprise software development shop. It can be bad even when the environment is all fairly modern technology and tools. If most of your experience is in writing small-community desktop and Web applications a lot of what you "know" may be wrong.
Often there are requirements for transaction journaling at a level above what your DBMS may do. Quite often it can be necessary to go beyond DB transaction semantics in order to ensure time-sequence correctness, once and only once updating, resiliency, and non-repudiation.
And this doesn't even begin to address the issues involved in enterprise or inter-enterprise scalability. When you begin to approach half a million complex transactions a day you will find that RDBMS technology fails you. Because relational databases are not designed to handle high transaction volumes you must often break with standard paradigms for normalization and updating. Conventional RDBMS locking techniques can destroy scalability no matter how much hardware you throw at the problem.
It is easy to dismiss all of it as stodginess or general wrong-headedness - even incompetence. But be careful because this isn't always the case.
And by the way: There are other models besides the RDBMS, and the alternative to an RDBMS is not necessarily "flat files" - contrary to the experience of most coders today. There are transactional hierarchical DBMSs that can handle much higher throughput than an RDBMS. IMS is still very much alive in large IBM shops, for example. Other vendors offer similar software for different platforms.
Of course in a 4-man shop maybe none of this applies.
Sign them up for some decent training and then it's up to you to convince them that with new technologies a lot more is possible (or at least easier!).
But I think the most important thing here is that professional, certified trainers teach them the basics first. They will be more impressed by that instead of just one of their colleagues telling them: "hey, why not use this?"
Related post here.
The following may not apply in your situation, but you make very little mention of technical details, so I thought I'd mention it...
Sometimes, if the access patterns are very different for current data than for historical data (I'm making this example up, but say that current data is accessed 1000s of times per second, touches only a small subset of columns, and fits in less than 1 GB, whereas historical data uses 1000s of GBs, is accessed only 100s of times per day, and access is to all columns), then what your co-workers are doing would make perfect sense as a performance optimization. By separating the current data (albeit redundantly) you can optimize the indices and data structures in that table for the higher-frequency access patterns, in a way that you could not in the historical table.
Not everything that is "academically", or "technically" correct from a purely relational perspective makes sense when applied in an actual practical situation.

Productivity gains of using CASE tools for development

I was using a CASE tool called MAGIC for a system I'm developing. I had never used this kind of tool before and at first sight I liked it; a month later I had a lot of the application generated, I felt very productive and ... I would say ... satisfied.
In some ways I felt uncomfortable, because there is no code and none of the things I was used to, but on the other hand I could speed up my development. The fact is that I eventually returned to C# because I find it more flexible to develop with: I can do unit testing, use CVS, I have access to more resources and basically I have "all the control". I felt that this tool didn't give me confidence, and I thought that later in the project I would not be able to manage it due to its rigid development rules. Also, a lot of things like sending emails, using my own controls, and so on had their complications; it seemed that at some point it was not going to be as easy as I initially thought and as the product claims. This reminds me of a very nice article called "No Silver Bullet".
This CASE tool had its advantages, but on the other hand it doesn't have resources you can consult, and the license and certification are very expensive. Another disappointing thing for me is that, because of its simplistic approach to development, I felt scared: first because of my inexperience with these kinds of tools, and second because I thought that if I continued using it, it might turn into a complex monster that I could not manage later in the project.
I think it's good to use these kinds of solutions to speed things up, but I wonder: why aren't these programs as popular as VS.Net, J2EE, Ruby, Python, etc. if they claim to enhance productivity more than the tools I've mentioned?
We use a CASE tool at my current company for code generation and we are trying to move away from it.
The benefits that it brings - a graphical representation of the code making components 'easier' to pick up for new developers - are outweighed by the disadvantages in my opinion.
Those main disadvantages are:
We cannot do automatic merges, making it close to impossible for parallel development on one component.
Developers get dependent on the tool and 'forget' how to handcode.
Just a couple questions for you:
How much productivity do you gain compared to the control that you use?
How testable and reliable is the code you create?
How well can you implement a new pattern into your design?
I can't imagine that there is a CASE tool out there that would let me write a test first and then generate the code I need. I'd rather stick to ReSharper, which can easily do my mundane tasks, and retain full control of my code.
The project I'm on originally went w/ the Oracle Development Suite to put together a web application.
Over time (5+ years), customer requirements became more complex than originally anticipated, and the screens were not easily maintainable. So, the team informally decided to start doing custom (hand coded) screens in web PL/SQL, instead of generating them using the Oracle Development Suite CASE tools (Oracle Designer).
The Oracle Report Builder component of the Development Suite is still being used by the team, as it seems to "get the job done" in a timely fashion. In general, the developers using the Report Builder tool are not very comfortable coding.
In this case, it seems that the productivity aspect of such CASE tools is heavily dependent on customer requirements and developer skill sets/training/background.
Unfortunately the Magic tool doesn't generate code, and it can't implement a design pattern either. I don't have control over the code because, as I stated before, there is no code to modify. The bottom line is that it can speed up productivity in some ways, but it makes it impossible to use CVS or patterns, and I can't control all the details.
I agree with gary when he says "it seems that the productivity aspect of such CASE tools is heavily dependent on customer requirements and developer skill sets/training/background" but also I can't agree more with Klelky;
Those main disadvantages are:
1. We cannot do automatic merges, making it close to impossible for parallel development on one component.
2. Developers get dependent on the tool and 'forget' how to handcode.
Thanks
