I'm working with Drupal and I want to diff between two databases, to see which tables have been updated in a given session. how do I do this? - drupal

I'm working with Drupal on a project and trying to find a way to speed up our tests (we're using Cucumber and Selenium). I want to see which tables have been changed in a given series of steps, so I can dump and reset just those tables between test cases.
Right now Simpletest, the Drupal testing framework, works by installing and setting up the tables for every module needed for a test, which makes for slow tests. I'm emulating a similar approach by loading a DB dump for each test.
Given that a site under integration testing has a 'known good' state to start from, I think it would be faster to revert to that point each time, instead of waiting twenty seconds or so to drop the database and pipe the dump file back in between test runs.
However, when I try diffing two dump files (i.e. before.I.create.a.node.sql and after.I.create.a.node.sql), the output is an unreadable load of serialised PHP that I can't make sense of.
Are there any tools I can use to help work out which tables I need to drop and rebuild between test cases, so I don't incur the 20 second hit on each test, short of reading the schema and code of every module I'm working with?
I'm following the ideas outlined here for getting Cucumber to work with PHP, and yes, I have seen this question here on a similar subject.
Thanks!

Drupal does store a lot of serialized PHP in the database, but most of it is kept in the cache tables (cache, cache_field, cache_menu, etc.), and you can safely truncate these before dumping the database.
If you have any simpletest tables, you can drop those too. They are all temporary and are used only for running your Simpletest test suite.
That should reduce the dump size a lot. If it's not enough, I can recommend reading up on the tables in the book Pro Drupal Development, or you could skim through the .install files to read each module's schema definitions. Most of what remains will probably be real data that you'd want to revert between tests, though.
Because of the relational nature of the database, be sure to either know exactly what you're doing or dump/revert all the remaining tables together.
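To find out which of the remaining tables actually changed between two dumps, without reading the serialized PHP by eye, something along these lines might help. It is only a rough sketch: it assumes mysqldump-style dumps where each data line starts with INSERT INTO `table`, and it assumes no table prefix is in use. The function names (table_digests, changed_tables) are made up for illustration.

```python
import hashlib
import re
from collections import defaultdict

# Assumption: mysqldump-style data lines of the form "INSERT INTO `table` VALUES ...".
INSERT_RE = re.compile(r"^INSERT INTO `?([A-Za-z0-9_]+)`?")


def table_digests(dump_text, skip_prefixes=("cache", "simpletest")):
    """Return {table_name: sha1 hex digest of that table's INSERT lines}."""
    hashers = defaultdict(hashlib.sha1)
    for line in dump_text.splitlines():
        match = INSERT_RE.match(line)
        if not match:
            continue  # schema statements, comments, blank lines
        table = match.group(1)
        if table.startswith(skip_prefixes):
            continue  # cache and simpletest tables are noise for this purpose
        hashers[table].update(line.encode("utf-8"))
    return {table: hasher.hexdigest() for table, hasher in hashers.items()}


def changed_tables(before_text, after_text):
    """Return the sorted names of tables whose data differs between two dumps."""
    before = table_digests(before_text)
    after = table_digests(after_text)
    return sorted(
        table
        for table in before.keys() | after.keys()
        if before.get(table) != after.get(table)
    )


if __name__ == "__main__":
    with open("before.I.create.a.node.sql", encoding="utf-8") as f:
        before_dump = f.read()
    with open("after.I.create.a.node.sql", encoding="utf-8") as f:
        after_dump = f.read()
    print(changed_tables(before_dump, after_dump))
```

This only tells you which tables differ, not what changed inside them. Dumping with mysqldump's --skip-extended-insert option gives one INSERT per row, which makes a follow-up text diff of a single table easier to read.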

Related

Is there any way to execute repeatable flyway scripts first?

We have used Flyway for years to maintain our DB scripts, and it does a wonderful job.
However, there is one situation where I am not really happy, and possibly someone out there has a solution:
In order to reduce the number of scripts required (and to keep an overview of where our procedures are defined), I'd like to implement our functions/procedures in one script. Every time a procedure changes (or a new one is developed), this script would be updated. Repeatable scripts sound perfect for this purpose, but unfortunately they are not.
The drawback is that a new procedure cannot be accessed by non-repeatable scripts: repeatable scripts are executed last, so the procedure does not exist when the non-repeatable script executes.
I had hoped I could control this by specifying different locations (e.g. loc_first containing the repeatables I want executed first, and loc_normal for the standard scripts and the repeatables to be executed last).
Unfortunately the order of locations has no impact on execution order ;-(
What's the proper way to deal with this situation? Right now I need to define the corresponding procedures in non-repeatable scripts, but that's exactly what I'd like to avoid.
I found a workaround on my own: I'm using Flyway directly with Maven (the same would work if you use the API, of course). Each stage of my Maven build has its own profile (specifying the URL etc.).
Now I create two profiles for every stage, so I have e.g. dev and devProcs.
The difference between these two Maven profiles is that the "[stage]Procs" profile operates on a different location (where only the repeatable scripts maintaining procedures are kept). I then need to execute Flyway twice: first with [stage]Procs, then with [stage].
To me this looks a bit messy, but at least I can maintain my procedures in a repeatable script this way.
According to the Flyway docs, repeatable migrations always execute after versioned migrations.
But I think you can use Flyway callbacks. It looks like the beforeMigrate.sql callback is exactly what you need.
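To make the ordering concrete, here is a small Python model of it. It is not Flyway code, just a sketch of the order in which the scripts in a location would run: the beforeMigrate.sql callback first, then versioned migrations in version order, then repeatable migrations by description. The function name execution_order and the version parsing are my own simplifications.

```python
import re

# Assumption: default Flyway naming, V<version>__<description>.sql and R__<description>.sql.
VERSIONED_RE = re.compile(r"^V(\d+(?:[._]\d+)*)__(.+)\.sql$")
REPEATABLE_RE = re.compile(r"^R__(.+)\.sql$")


def execution_order(filenames):
    """Return the script names in the order a single migrate run would use them."""
    callbacks, versioned, repeatable = [], [], []
    for name in filenames:
        versioned_match = VERSIONED_RE.match(name)
        repeatable_match = REPEATABLE_RE.match(name)
        if name == "beforeMigrate.sql":
            callbacks.append(name)
        elif versioned_match:
            # Compare versions numerically so that V10 sorts after V2.
            parts = re.split(r"[._]", versioned_match.group(1))
            versioned.append((tuple(int(part) for part in parts), name))
        elif repeatable_match:
            repeatable.append((repeatable_match.group(1), name))
    versioned.sort()
    repeatable.sort()
    return (
        callbacks
        + [name for _, name in versioned]
        + [name for _, name in repeatable]
    )


print(execution_order([
    "R__views.sql",
    "V2__use_new_procedure.sql",
    "beforeMigrate.sql",
    "V1__create_tables.sql",
]))
```

So procedures defined in beforeMigrate.sql exist before V2__use_new_procedure.sql runs. Keep in mind that a callback runs on every migrate, so the script has to be safe to re-run.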

Static data storage on server-side

Why is some server-side data still stored in DBC files rather than in the SQL database? In particular, spells (spells.dbc). What is the reason?
We have a lot of bugs in spells, and it's very hard to understand what's wrong with a spell, but it's even harder to find the spell itself...
Spells, talents, achievements, etc. are mostly found in DBC files because that is the way Blizzard did it back in the day. It's true that in 2019 this is a pretty outdated way to work. Databases are getting stronger and more versatile, and hard-coded data is proving hard to work with. Hell, DBCs aren't really that heavy anyway, and the only reason we haven't made this change yet is that it is a task that takes a bit of time and is monotonous to do.
We are aware that TrinityCore has already made this change, but they have far more contributors than we do, if that serves as an excuse!
Nonetheless, this is already in our to-do list if you check the issue tracker at the main repository.
It's true that we can't really edit DBC files, because we would lose all the changes if the files were re-extracted or lost. However, we can modify spells in a C++ file called SpellMgr.
There we have a function called SpellMgr::LoadDbcDataCorrections().
The main problem with this change is that we have to modify the core to support it, and the function above contains a lot of corrections. It would need intensive testing to make sure nothing is broken in the process.
In that function, by altering bits you can add or remove certain properties of the desired spells instead of touching the hard-coded DBC files.
If you want an example, in this link, I have changed an Archimonde spell to have no cast time.
NOTE:
In this line, the comment about damage can be misleading, but that's because I made a mistake and I haven't finished this pull request yet as of 18/04/2019.
The work has been started, notably by Kaev. I think at least 3 DBCs are now useless server side (but probably still needed client side, they are called DataBaseClient for a reason) like item.dbc.
Also, the original philosophy (for ALL cores, not just AC) was that we would not touch DBC because we don't do custom modifications, so there was no interest in having them server side.
But we wanted to change this and started to make them available directly in the DB. If you wish to help with that, it would be nice!
Why?
Because when emulation started, DBC fields were 90% unknown. So developers created a parser for them that required just a few code changes to support new fields as soon as their functionality was discovered.
Now that we've discovered 90% of required dbc fields and we've also created some great conversion tools for DBC<->SQL, it's just a matter of "effort".
SQL conversion is useful to avoid using client data on the server (you can totally overwrite them if you don't want to go against the EULA) or just to extend/customize them.
Here is the issue about the DBC->SQL conversion: https://github.com/azerothcore/azerothcore-wotlk/issues/584

Performance of large collection in Meteor 1.0.X

There has been a LOT of development in the Meteor world, and as such it's getting hard to find answers that work for current versions due to the plethora of answers you find for old, out-dated versions.
I have an app that has a LOT of data in a particular collection. By lots I mean somewhere between 10k-100k, and very potentially a lot more. Essentially it's log data, and I need to display the results in a table with no pagination (like a tail). In researching ways to optimize large collections I keep running into things like this that seem to be for older versions of Meteor.
So, as I see it my options are:
Use the fast-render plugin to display the page prior to the subscription (at least this is my understanding of how it works).
Use some sort of progressive publish function, where it loads a limited set of the most relevant data first, then progressively loads the remaining data by expanding the window/limit (not sure if this would cause heavier load on the server, though). There seems to have been a "progressive publish" plugin, but it doesn't seem to be under active development any longer.
Optimize the lookups via indexing (How do you specify that when creating the collection???)
Profiling and optimizing the template further (not sure how).
Some other method I haven't thought of yet...
Some combination of all-the-above.
What is the proper approach by which to publish and render lots of data in this way?
I'm going to assume that "optimize" means reduced query time.
Always start with the biggest bang for your buck.
Unless you're publishing the entire collection or querying on the _id, you want to create an index using _ensureIndex. You can get more info on this on the MongoDB website or by searching other questions. http://docs.mongodb.org/manual/reference/method/db.collection.ensureIndex/
Second, limit the fields to just the info you need, e.g. {fields: {a:1, b:1}}. http://docs.meteor.com/#/full/fieldspecifiers
Third, don't sort.
If this still isn't good enough, make another question with schema & query details & the desired UI so we can better understand the reactivity and why you can't use some form of pagination.

How to avoid code duplication for non-data structures (views, stored procedures etc)

My project contains a lot of objects like views and stored procedures which are changed quite frequently. Now I have to create a new SQL script on every update, containing the complete source code of the changed objects, even though I've actually changed only a few lines. This leads to massive code duplication, and I also find it difficult to review these changes.
I'd like to have only one current version of the SQL script for every object such as a view or procedure, and to recreate these objects every time I redeploy the database. As a result, I could change the existing source file (as in Java or C programming) instead of creating a new update script every time I need to alter a view or procedure.
Is there a possibility to execute some scripts every time I migrate the database with Flyway?
I'm not sure why that got so many downvotes, it's a perfectly understandable and valid question. Perhaps it's because it closely resembles this open question:
Migrating Stored Procedures with Flyway
We are actually starting to push against this issue now. We've been using flyway for development and testing (and love it). But we've come to a point where we're starting to have to use procs/triggers/views (p/t/v's) and the fundamental disconnect between how we did it before, and how we must use flyway, is starting to be a strain.
Before, for a given database object (let's say it's a procedure), there'd be one source file. And if you needed to change the proc 'n' times, there would be 'n' versions of the same file in your VCS. Diff tools work great, IDE's all understand this, merges detect when two developers working in separate branches make changes to the proc, etc, etc. You know, old school.
But with flyway, any one proc with 'n' changes is now scattered across 'n' files. Instead of "one object in one file with 'n' versions", you have "one object in 'n' files with one change each". I now need to do a text search in my IDE for any instance of "proc_name" if I want to know the history of changes to the proc. The VCS knows nothing about it. Devs can each make a migration in their own branches that succeeds when each is deployed, but leaves the proc with a missing update.
I'm not saying any of this to complain about flyway, and I fully realize it's not a simple area. I'd almost say it's unsolvable (by flyway).
We're scheming how to handle this problem, and I'd be very interested to know how others have handled it.
Repeatable migrations are now supported, as of Flyway 4.0.
Just add SQL files whose names start with "R" and carry no version information to your migration folder:
R__Blue_cars.sql
You have to ensure that the script can be applied repeatedly.
This is usually done with "CREATE OR REPLACE" clauses in your DDL statements.
https://flywaydb.org/documentation/migration/repeatable
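As far as I know, Flyway decides whether to re-run a repeatable migration by comparing the script's checksum with the one recorded at the last run, which is why editing the single source file is enough to get the object recreated. The Python sketch below models only that decision. It is not Flyway's implementation: zlib.crc32 is a stand-in checksum, and the names (checksum, repeatables_to_apply) and the dictionaries are hypothetical.

```python
import zlib


def checksum(script_text):
    """Stand-in checksum of a script; Flyway's exact algorithm is not reproduced here."""
    return zlib.crc32(script_text.encode("utf-8"))


def repeatables_to_apply(scripts, applied_checksums):
    """Return names of repeatable scripts that are new or have changed.

    scripts: {filename: current script text}
    applied_checksums: {filename: checksum recorded when it was last applied}
    """
    return sorted(
        name
        for name, text in scripts.items()
        if applied_checksums.get(name) != checksum(text)
    )


old_view = "CREATE OR REPLACE VIEW blue_cars AS SELECT id FROM cars WHERE color = 'blue';"
new_view = "CREATE OR REPLACE VIEW blue_cars AS SELECT id, model FROM cars WHERE color = 'blue';"

applied = {"R__Blue_cars.sql": checksum(old_view)}
print(repeatables_to_apply({"R__Blue_cars.sql": old_view}, applied))  # unchanged script
print(repeatables_to_apply({"R__Blue_cars.sql": new_view}, applied))  # edited script
```

Because the whole script is executed again after any edit, every statement in it has to tolerate the object already existing, which is what CREATE OR REPLACE gives you.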

Getting a handle on a huge classic asp project

I've been working with an ASP Classic site with over 500 files, some of which are used and some of which aren't, along with a database with hundreds of procs, functions and tables in the same shape.
I need a way to get a grip on the project so I can eventually migrate it. I don't have time yet to walk through every single page and look at the SQL (stored procedures are in the database and are called properly within the ASP pages), so I'm at a loss as to how to get a handle on this.
My immediate thought is to make ASP classes and put them into the pages as I go - they'd pretty much be used for getting and setting fields, validation, and sending recordsets into display functions.
Is this a reasonable approach? Am I missing some strategy?
How would you approach this? A migration to another platform is being considered, but it is not feasible in the short term (the next couple of months).
You can try to compile the project using http://aspclassiccompiler.codeplex.com/ or you can migrate to ASP.NET MVC one page at a time (when needed) and use a mix of both in the meantime.
My simple advice is to stop thinking about the code. Spend more time with the UI, actually using it, and spend time examining the database schema in detail.
Edit
If you are trying to determine which pages are active, use IIS logging to harvest the distinct pages hit. Also do some scripting to collect the names of the files and text search the files in the site for any occurrence of those names. This info should identify parts of the site that are rarely used or dead.
However, in all probability there will be considerable content in the "active" files which is also dead. Let me reiterate: do not add classes or refactor the code at this stage. You should concentrate on understanding what it does, not how it does it. Understanding the DB schema is a vital step, followed by understanding which UI interactions bring about specific changes in the DB.
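The log-harvesting step can be scripted fairly easily. Below is a rough Python sketch, assuming the IIS logs are in the W3C extended format (with a "#Fields:" header line that includes cs-uri-stem) and that the .asp files map directly to URL paths under the site root. The function names are made up for illustration.

```python
import os
from collections import Counter


def page_hits(log_lines):
    """Count hits per URL path (lower-cased) from W3C extended log lines."""
    hits = Counter()
    uri_index = None
    for line in log_lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The header names the columns; find where cs-uri-stem sits.
            fields = line.split()[1:]
            uri_index = fields.index("cs-uri-stem") if "cs-uri-stem" in fields else None
            continue
        if not line or line.startswith("#") or uri_index is None:
            continue
        parts = line.split()
        if len(parts) > uri_index:
            hits[parts[uri_index].lower()] += 1
    return hits


def asp_pages(site_root):
    """Return the URL paths of all .asp files found under site_root."""
    pages = []
    for folder, _dirs, files in os.walk(site_root):
        for name in files:
            if name.lower().endswith(".asp"):
                relative = os.path.relpath(os.path.join(folder, name), site_root)
                pages.append("/" + relative.replace(os.sep, "/"))
    return pages


def never_requested(site_pages, hits):
    """Return the pages that never appear in the logs, sorted by path."""
    return sorted(page for page in site_pages if page.lower() not in hits)
```

A page missing from the logs is only a candidate for being dead: files pulled in through includes are never requested directly, which is why the text search for file names is still needed alongside this.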
