Automating Drupal Migrations
How to Go from an Estimated One Week to Two Minutes of Downtime

About the Speakers

Dan Harris
- Founder of Webdrips.com
- Nine years of Drupal experience
- Front end, back end, etc.
- Porting Node Hierarchy module to Drupal 8

Heron Ordoñez
- Senior Developer at NVIDIA
- Nine years of Drupal experience
- Newspapers, magazines, radio and TV

Note About the Migration Process
Although this presentation covers a Drupal 6 to 7 migration, most, if not all, of the ideas presented here should work for any Drupal-to-Drupal migration.

Overview: Initial Plan/Estimates
- Initial estimate: one week of downtime
- SQL queries would be used to export/import where Drupal Migrate's coverage was limited
- The only automation was what the Migrate modules provided
- Existing Drupal 7 architecture

Overview: Updated Plan
- Virtually zero downtime
- Complete migration in one business day
- Over 99% automated
- D7 site to be built from scratch during migration

About the Drupal 6 Site
- Architecturally, it was a mess ("Frankensite")
- The migration provided a chance to clean up architecture and code
- Six custom themes (one custom theme, five subthemes)
- 35 custom modules
- 151 contributed modules
- 1,000 privileged users
- About 400K non-privileged users
- 25 content types, including Webforms
- Over 2,500 pages

About the Drupal 7 Site
- 106 modules
- Bootstrap primary theme
- One Bootstrap subtheme, four sub-subthemes
- Only six content types
- 11 Features provided the architecture

Automated Migration Process Requirements
- Migrate modules: migrate, migrate_extras, migrate_d2d, migrate_webform
- Import modules: menu_import, path_redirect_import
- Four custom modules
- Scripts for migration and deployment
- Fast server with SSD

Migration Script Overview
Requirements:
- Create new Drupal 7 site
- Build out site architecture with Features
- Enable modules
- Migrate D6 to D7
- Import items that couldn't be migrated
This provided a repeatable, reliable process.

Migration Script Highlights (Review)
Build the site:
    drush site-install
Enable features and modules:
    drush en feature_name -y
Migrate each entity:
    drush mi entity

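The three steps above can be strung into a single script, which is what makes the run repeatable. The following is a minimal sketch, not the script used on the project: the `standard` install profile, the `run` helper, the `DRY_RUN` switch, and the `User` migration name are assumptions made for illustration (`feature_name` comes from the slide, and `Role` from the D2D arguments shown later in this deck). The import step for menu links and path redirects is left out.

```shell
#!/bin/sh
# Sketch of a repeatable build-and-migrate run.
# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"
FEATURES="${FEATURES:-feature_name}"   # features/modules to enable, in order
MIGRATIONS="${MIGRATIONS:-Role User}"  # migrations to import, in order (User is hypothetical)

# run CMD...: print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

rebuild_and_migrate() {
  # 1. Build a fresh Drupal 7 site (the "standard" profile is an assumption).
  run drush site-install standard -y || return 1

  # 2. Enable features and modules so the fields exist before migration.
  for feature in $FEATURES; do
    run drush en "$feature" -y || return 1
  done

  # 3. Import each migration in order; stop at the first failure.
  for migration in $MIGRATIONS; do
    run drush mi "$migration" || return 1
  done
}

rebuild_and_migrate
```

Running it with `DRY_RUN=0` would execute the commands instead of printing them. Stopping at the first failure keeps a broken step from being buried under later output.
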
Custom Migration Modules
1. Disable "edits" to the D6 site
   a. Basically redirect webform pages, admin pages, and paths like node/add, node/edit, etc.
2. Views (implemented with Features) only for migration status and post-processing
3. Migrate_d2d module
4. CSV-based migration

Drupal Migrate/D2D/Extras
- Handled most of the heavy lifting
  - Everything except menu links, path redirects, and slide shows
- Extensive drush support
- Plenty of methods available to massage data
- D2D: simplifies migration code

Migrating Users
Challenges
- Nearly 400K unprivileged users
- Needed to assign users to organic groups
  - Based on how webform questions were answered
- Had to fix user passwords
  - Fixed by writing directly to the users table inside the migration

Migrate Users Code
Unprivileged vs. privileged was a simple query:

    class NvidiaPrivilegedUserMigration extends NvidiaUserMigration {
      protected function query() {
        $query = parent::query();
        // Privileged users have an nvidia.com address. The unprivileged
        // migration uses the same condition with 'NOT LIKE'.
        $query->condition('u.mail', '%nvidia.com', 'LIKE');
        return $query;
      }
    }

Fix the password:

    public function complete($account, $row) {
      parent::complete($account, $row);
      // Copy the password from the source row and write it directly
      // to the users table.
      $account->pass = $row->pass;
      db_update('users')
        ->fields(array('pass' => $account->pass))
        ->condition('uid', $account->uid)
        ->execute();
      // Assign the user to organic groups.
      $this->nvidia_memberships($row);
    }

Assign Users to Groups (Review)

    public function nvidia_memberships($row) {
      // Collect this user's webform answers from the D6 source database.
      $membership_query = Database::getConnection('default', 'd6source')
        ->select('webform_submissions', 'ws');
      $membership_query->join('webform_submitted_data', 'wd', 'wd.sid = ws.sid');
      $membership_query->fields('wd', array('cid'));
      $membership_query->fields('ws', array('nid'));
      $membership_query->addExpression('group_concat(data)', 'data');
      $membership_query->groupBy('ws.sid');
      $membership_query->groupBy('cid');
      $membership_query->condition('ws.uid', $row->uid);
      $membership_query->condition('ws.nid', array(1234567, 2345678, 3456789, 4567890, 5678901), 'IN');
      $membership_id = nvidia_og_membership_associate_user_with_program();
      // The slide ends here; the rest of the method is not shown.
    }

Node Migration Challenges
- Body images and links with absolute paths
- Empty fields sometimes caused display issues
- Had to deal with "interesting" architecture decisions on the D6 site
- Moved larger files to the cloud
- Reduced the number of content types

Node Migration Code
Dealing with textarea images:
- Needed to use Simple HTML DOM Parser
- Code review

How a Strange Dev. Decision Can Affect a Migration
The D6 product page and the DB variables table (review) led to the following code:

    $variable_name = 'nvidia_product_disable_product_image_' . $row->nid;
    $result = Database::getConnection('default', 'd6source')
      ->select('variable', 'v')
      ->fields('v', array('name', 'value'))
      ->condition('v.name', $variable_name, '=')
      ->execute()
      ->fetchAll();
    // Values in the variable table are serialized; 'i:1;' is the integer 1.
    // The empty() check avoids a notice when the variable does not exist.
    if (!empty($result) && $result[0]->value == 'i:1;') {
      $row->field_inline_image = NULL;
    }

Remove Empty Textarea Fields

    public function prepare($entity, stdClass $row) {
      foreach ($row as $key => $value) {
        // isset() is already FALSE for NULL values, so one check is enough.
        if (!isset($row->$key)) {
          $entity->$key = NULL;
        }
      }
    }

"Non-Standard" Entity Migrations (Review)
- D2D handles well-established Drupal entities well
  - Nodes, users, taxonomy, etc.
- But what if you want to migrate block content to an entity?
- CSV migration to the rescue

Challenges
- Biggest challenge was reducing the migration time
  - The original estimate just for migrating users was over 40 hours
  - Eventually that time was reduced to about 3 hours
- We tweaked my.cnf, php.ini, and drush.ini
- Got a really fast server with Intel Xeon processors, fast RAM, and an SSD

Challenges
- Installation of modules in order
  - Circular dependencies
  - Features that add fields need to be installed before migration
- Relationships between content
  - Both nodes need to exist before creating a relationship
  - "Parent" content that did not exist in the original site

Migration Timeline
- 7 days before release: content freeze
- 2 days before: automated rebuild, content migration, and editorial approval
- 8 hours before: registration lockdown and migration start
- 2 hours before: batch processing of content by editors and final tests

Accelerating Migration
- Use Drush
- Single pass for each item
  - Migration objects are big and slow
  - Don't load an object from the DB twice

Multithreading
https://www.deeson.co.uk/labs/multi-processing-part-2-how-make-migrate-move
- Add multithreading to a working migration class
- Not very portable
  - Needs a Drush extension
  - Needs to run on the "fast" server
- Very effective

Add Multithreading to a Working Migration Class
- Subclass the migration
- Make all the sub-migrations use the same index
- Make each sub-migration work on a small "chunk" of the index
- Break the migration into parts and send the chunks to multiple threads

    <?php
    class NVMultiThread extends NvidiaUnprivilegedUserMigration {
      public function __construct($args) {
        // This is boilerplate needed by D2D.
        $args += array(
          'source_connection' => NVIDIA_MIGRATE_SOURCE_DATABASE,
          'source_version' => 6,
          'format_mappings' => array(
            '1' => 'filtered_html',
            '2' => 'full_html',
            '3' => 'plain_text',
            '4' => 'full_html',
          ),
          'description' => t('Multithreaded Migration of users from Drupal 6'),
          'role_migration' => 'Role',
        );
        parent::__construct($args);

        $this->limit = empty($args['limit']) ? 100 : $args['limit'];
        $this->offset = empty($args['offset']) ? 0 : $args['offset'];

        // Map/index table, shared by all the sub-migrations.
        $this->map = new MigrateSQLMap('nvidiaunprivilegeduser',
          // Index definition.
          array(
            'uid' => array(
              'type' => 'int',
              'unsigned' => TRUE,
              'not null' => TRUE,
              'description' => 'User migration reference',
            ),
          ),
          MigrateDestinationUser::getKeySchema()
        );
      }

      // Modify the original query to limit the number of items to work on.
      protected function query() {
        $query = parent::query();
        $query->range($this->arguments['offset'], $this->arguments['limit']);
        return $query;
      }
    }

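The chunking idea can be tried without Drupal. The following is a minimal sketch under stated assumptions: `migrate_chunk` is a hypothetical placeholder for whatever Drush command runs the subclass with a given offset and limit (the slides do not show that command), and `chunk_ranges` and `run_parallel` are illustrative helper names. Workers are started in batches of `MAX_JOBS` and each batch is awaited as a whole, which is simple rather than optimal.

```shell
#!/bin/sh
# chunk_ranges TOTAL CHUNK: print one "offset limit" pair per line.
chunk_ranges() {
  total=$1
  chunk=$2
  offset=0
  while [ "$offset" -lt "$total" ]; do
    remaining=$((total - offset))
    if [ "$remaining" -lt "$chunk" ]; then
      limit=$remaining   # last chunk may be smaller
    else
      limit=$chunk
    fi
    echo "$offset $limit"
    offset=$((offset + chunk))
  done
}

# migrate_chunk OFFSET LIMIT: hypothetical worker. A real one would run
# the sub-migration for this slice of the index.
migrate_chunk() {
  echo "chunk offset=$1 limit=$2"
}

# run_parallel TOTAL CHUNK MAX_JOBS: run workers in batches of MAX_JOBS.
run_parallel() {
  chunk_ranges "$1" "$2" | {
    max_jobs=$3
    running=0
    while read -r offset limit; do
      migrate_chunk "$offset" "$limit" &
      running=$((running + 1))
      if [ "$running" -ge "$max_jobs" ]; then
        wait            # let the current batch finish
        running=0
      fi
    done
    wait                # wait for the final, possibly partial, batch
  }
}

# 1,000 items in chunks of 100, three workers at a time.
run_parallel 1000 100 3
```

Because the workers run concurrently, the order of the output lines is not guaranteed.
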
Measuring the Improvement
- Same server
- Restore destination DB from backup after each run
- Same source DB
- Both DBs on the same server
- MySQL optimizations for concurrency issues

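A measurement loop following these rules might look like the sketch below. It is an illustration only: `restore_destination_db` and `run_migration` are hypothetical placeholders (the slides do not show the actual restore or migration commands), and here they do nothing, so the reported times are close to zero.

```shell
#!/bin/sh
# Placeholder: a real version would reload the destination database
# from the same backup before every run.
restore_destination_db() {
  :
}

# Placeholder: a real version would run the migration with $1 threads.
run_migration() {
  :
}

# benchmark THREADS...: print "threads seconds" for each thread count.
benchmark() {
  for threads in "$@"; do
    restore_destination_db        # identical starting state for every run
    start=$(date +%s)
    run_migration "$threads"
    end=$(date +%s)
    echo "$threads $((end - start))"
  done
}

benchmark 1 2 3
```

Restoring before each run matters because a destination that already holds migrated rows would make later runs do different work from the first.
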
Measuring the Improvement

1,000 items, 100 per thread

Threads  Time   Speed
1        71s    845/min
2        60s    1000/min
3        54s    1111/min

10,000 items, 1,000 per thread

Threads  Time   Speed
1        707s   848/min
2        303s   1980/min
3        300s   2000/min
4        291s   2061/min
5        351s   1709/min

50,000 items, 5,000 per thread

Threads  Time   Speed
3        1990s  1507/min
4        1562s  1920/min
5        1303s  2302/min
6        1637s  1832/min

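The Speed column is consistent with items × 60 / seconds, rounded down. The sketch below recomputes it for the 10,000-item run; `items_per_minute` is an illustrative helper name, not something from the original scripts.

```shell
#!/bin/sh
# items_per_minute ITEMS SECONDS: throughput, rounded down by integer division.
items_per_minute() {
  echo $(( $1 * 60 / $2 ))
}

# Rows from the 10,000-item run, as "threads seconds".
for row in "1 707" "2 303" "3 300" "4 291" "5 351"; do
  set -- $row
  echo "$1 threads: $(items_per_minute 10000 "$2")/min"
done
```

This prints 848, 1980, 2000, 2061, and 1709 items per minute, matching the table. Throughput peaks at four threads in the 10,000-item run and at five threads in the 50,000-item run, then drops when one more thread is added.
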
Conclusion
- Drop the DNS TTL to 1 minute days before launch
- Repeatability is key
- Migration is very powerful but can be slow
- Automation helps drop downtime close to zero
- Ask for help
- There are many ways to use Migration; if one way is not working, drop it and try a different one
  - CSV vs. direct read from the DB
- Weird things happen with orphaned fields