Automating Drupal Migrations: How to Go from an Estimated One Week to Two Minutes of Downtime
About Dan Harris
● Founder, Webdrips.com
  ○ Drupal-based web design and development shop
  ○ Founded in July 2011
  ○ Nine years of Drupal experience
  ○ 21 years of professional experience
● Twitter: @webdrips
● Email: [email protected]
Note About the Migration Process
Although this presentation covers a Drupal 6 to 7 migration, most, if not all, of the ideas presented here should work for any Drupal-to-Drupal migration.
Overview: Initial Plan/Estimates
● Initial estimate: one week of downtime
● SQL queries would be used to export/import where Drupal Migrate coverage was limited
● The only automation was what the Migrate modules provided
● Existing Drupal 7 architecture
Overview: Updated Plan
● Virtually zero downtime
  ○ Intermediate goal: ask for one day of downtime or less
● Complete the migration in one business day
● Over 99% automated
● D7 site to be built from scratch during the migration
About the Drupal 6 Site
● Architecturally, it was a mess (a “Frankensite”)
  ○ The migration provided a chance to clean up architecture and code
● Six custom themes (1 custom/5 subthemes)
● 35 custom modules
● 151 contributed modules
About the Drupal 6 Site
● 1,000 privileged users
● About 400K non-privileged users
● 25 content types, including Webforms
● Over 2,500 pages
About the Drupal 7 Site
● 106 modules
● Bootstrap primary theme
● One Bootstrap subtheme, four sub-subthemes
● Only six content types
● 11 Features provided the architecture
Automated Migration Process
Requirements
● Migrate modules: migrate, migrate_extras, migrate_d2d, migrate_webform
● Import modules: menu_import, path_redirect_import
● Four custom modules
● Scripts for migration and deployment
● Fast server with SSD
Migration Script Overview
Requirements:
● Create a new Drupal 7 site
● Build out the site architecture with Features
● Enable modules
● Migrate D6 to D7
● Import items that couldn’t be migrated
This provided a repeatable, reliable process.
Migration Script Highlights (Review)
Build the site:
  drush site-install
Enable features and modules:
  drush en feature_name -y
Migrate each entity:
  drush mi entity
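The three steps above can be sketched as a small driver that assembles the drush commands in the order the script runs them. This is only an illustration under assumptions: the function names, the `standard` install profile, and the feature and migration names are placeholders, not the names used on the project.

```php
<?php
// Hypothetical driver: builds the drush command sequence in a fixed order
// (install, enable, migrate). All names passed in are placeholders.
function build_migration_commands(string $profile, array $features, array $migrations): array {
  // 1. Build the site from scratch.
  $commands = array('drush site-install ' . escapeshellarg($profile) . ' -y');
  // 2. Enable features and modules so fields exist before any data arrives.
  foreach ($features as $feature) {
    $commands[] = 'drush en ' . escapeshellarg($feature) . ' -y';
  }
  // 3. Run each migration in the order given.
  foreach ($migrations as $migration) {
    $commands[] = 'drush mi ' . escapeshellarg($migration);
  }
  return $commands;
}

// Runs the commands one by one and stops at the first failure, so a broken
// step cannot be silently followed by later ones.
function run_commands(array $commands): void {
  foreach ($commands as $command) {
    passthru($command, $status);
    if ($status !== 0) {
      throw new RuntimeException("Command failed: $command");
    }
  }
}

$commands = build_migration_commands(
  'standard',
  array('feature_name'),
  array('Role', 'User')
);
```

Keeping the command list as data makes a run easy to review before executing it, and stopping at the first failure keeps each rebuild repeatable.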
Custom Migration Modules
1. Disable “edits” to the D6 site
   a. Basically redirect webform pages, admin pages, and paths like node/add, node/edit, etc.
2. Views (implemented with Features) only for migration status and post-processing
3. Migrate_d2d module
4. CSV-based migration
Drupal Migrate/D2D/Extras
● Handled most of the heavy lifting
  ○ Everything except menu links, path redirects, and slide shows
● Extensive Drush support
● Plenty of methods available to massage data
● D2D: simplifies migration code
Migrating Users
Challenges
● Nearly 400K unprivileged users
● Needed to assign users to organic groups
  ○ Based on how webform questions were answered
● Had to fix user passwords
  ○ Fixed by writing directly to the users table inside the migration
Migrate Users Code
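Unprivileged vs. privileged was a simple query condition (LIKE for privileged users, NOT LIKE for unprivileged users):
class NvidiaPrivilegedUserMigration extends NvidiaUserMigration {
  protected function query() {
    $query = parent::query();
    // Privileged users have an nvidia.com address. The unprivileged
    // migration uses 'NOT LIKE' in place of 'LIKE'.
    $query->condition('u.mail', '%nvidia.com', 'LIKE');
    return $query;
  }
}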
Migrate Users Code
Fix the password:
public function complete($account, $row) {
  parent::complete($account, $row);
  // Write the pass value from the source row directly to the users table.
  $account->pass = $row->pass;
  db_update('users')
    ->fields(array('pass' => $account->pass))
    ->condition('uid', $account->uid)
    ->execute();
  // Assign the user to organic groups.
  $this->nvidia_memberships($row);
}
Assign Users to Groups (Review)
public function nvidia_memberships($row) {
  $membership_query = Database::getConnection('default', 'd6source')
    ->select('webform_submissions', 'ws');
  $membership_query->join('webform_submitted_data', 'wd', 'wd.sid = ws.sid');
  $membership_query->fields('wd', array('cid'));
  $membership_query->fields('ws', array('nid'));
  $membership_query->addExpression('group_concat(data)', 'data');
  $membership_query->groupBy('ws.sid');
  $membership_query->groupBy('cid');
  $membership_query->condition('ws.uid', $row->uid);
  $membership_query->condition('ws.nid', array(1234567, 2345678, 3456789, 4567890, 5678901), 'IN');
  $membership_id = nvidia_og_membership_associate_user_with_program();
Node Migration Challenges
● Body images and links with absolute paths
● Empty fields sometimes caused display issues
● Had to deal with “interesting” architecture decisions on the D6 site
● Moved larger files to the cloud
● Reduced the number of content types
Node Migration Code
Dealing with textarea images:
● Needed to use Simple HTML DOM Parser
● Code review
How a Strange Dev. Decision can Affect a Migration
D6 product page and DB variable table (review) led to the following code:
$variable_name = 'nvidia_product_disable_product_image_' . $row->nid;
// drush_print_r($variable_name);
$query = Database::getConnection('default', 'd6source')
  ->select('variable', 'v')
  ->fields('v', array('name', 'value'))
  ->condition('v.name', $variable_name, '=')
  ->execute()
  ->fetchAll();
// The variable may not exist for every node, so check before reading it.
$product_image_disabled = !empty($query) ? $query[0]->value : NULL;
// D6 stores variable values serialized; 'i:1;' is the integer 1.
if ($product_image_disabled == 'i:1;') {
  $row->field_inline_image = NULL;
}
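The comparison against 'i:1;' works because the D6 variable table stores PHP-serialized values. As a hedged alternative, the raw value could be unserialized first, which also covers a flag that happened to be saved as a boolean or a string. The helper name below is hypothetical and not part of the original migration.

```php
<?php
// Hypothetical helper: decides whether a raw D6 variable value is "on".
// $raw is the serialized string from the variable table, or NULL when the
// row does not exist.
function d6_variable_is_enabled(?string $raw): bool {
  if ($raw === null || $raw === '') {
    return false;
  }
  // allowed_classes => false keeps unserialize() from instantiating objects.
  // A malformed value yields FALSE, which is also treated as "off".
  $value = @unserialize($raw, array('allowed_classes' => false));
  return !empty($value);
}

$disabled = d6_variable_is_enabled('i:1;');
```

In the migration, the result would replace the string comparison before setting `$row->field_inline_image` to NULL.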
Remove Empty Textarea Fields
public function prepare($entity, stdClass $row) {
  foreach ($row as $key => $value) {
    // isset() is FALSE for NULL, so this matches every source field with no value.
    if (!isset($row->$key)) {
      $entity->$key = NULL;
    }
  }
}
“Non-Standard” Entity Migrations (Review)
● D2D handles established Drupal entities well
  ○ Nodes, users, taxonomy, etc.
● But what if you want to migrate block content to an entity?
  ○ CSV migration to the rescue
Challenges
● Biggest challenge was reducing the migration time
  ○ The original estimate just for migrating users was over 40h
  ○ Eventually that time was reduced to ~3 hours
  ○ We tweaked my.cnf, php.ini, drush.ini
  ○ Got a really fast server with Intel Xeon processors, fast RAM, and an SSD
Challenges
● Installation of modules in order
  ○ Circular dependencies
  ○ Features that add fields need to be installed before migration
● Relationships between content
  ○ Both nodes need to exist before creating a relationship
  ○ “Parent” content that did not exist in the original site
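The install-order problem can be sketched as a depth-first ordering of a dependency map that fails loudly on a cycle. This is a minimal illustration, not the script used on the project: the function name and the `nv_*` module names are made up, and the dependency map is illustrative only.

```php
<?php
// Returns module names ordered so that every module comes after the modules
// it depends on. $dependencies maps a module name to the names it needs.
function install_order(array $dependencies): array {
  $order = array();
  $state = array(); // name => 'visiting' or 'done'
  $visit = function (string $name) use (&$visit, &$order, &$state, $dependencies) {
    if (($state[$name] ?? '') === 'done') {
      return;
    }
    if (($state[$name] ?? '') === 'visiting') {
      // We came back to a module that is still being resolved: a cycle.
      throw new RuntimeException("Circular dependency involving $name");
    }
    $state[$name] = 'visiting';
    foreach ($dependencies[$name] ?? array() as $dependency) {
      $visit($dependency);
    }
    $state[$name] = 'done';
    $order[] = $name;
  };
  foreach (array_keys($dependencies) as $name) {
    $visit($name);
  }
  return $order;
}

// Illustrative map: the feature that adds fields must precede the migration.
$order = install_order(array(
  'nv_migrate' => array('nv_fields_feature', 'migrate_d2d'),
  'nv_fields_feature' => array('features'),
  'migrate_d2d' => array('migrate'),
  'migrate' => array(),
  'features' => array(),
));
```

Detecting the cycle up front gives a clear error at build time instead of a half-installed site.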
Migration timeline
● -7 days to release: Content freeze
● -2 days: Automated rebuild, content migration, and editorial approval
● -8h: Registration lockdown and migration start
● -2h: Batch processing of content by editors and final tests
Accelerating migration
● Use Drush
● Single pass for each item
  ○ Migration objects are big and slow
  ○ Don’t load an object from the DB twice
● Multithreading
  ○ https://www.deeson.co.uk/labs/multi-processing-part-2-how-make-migrate-move
Add multithreading to a working migration class
● Not very portable
  ○ Needs a Drush extension
  ○ Needs to run on the “fast” server
● Very effective
Add multithreading to a working migration class
● Subclass the migration
● Make all the sub-migrations use the same index
● Make the sub-migration work on a small “chunk” of the index
● Break the migration into parts and send chunks of it to multiple threads
Add multithreading to a working migration class
<?php
class NVMultiThread extends NvidiaUnprivilegedUserMigration {
  public function __construct($args) {
    // This is boilerplate needed by D2D.
    $args += array(
      'source_connection' => NVIDIA_MIGRATE_SOURCE_DATABASE,
      'source_version' => 6,
      'format_mappings' => array(
        '1' => 'filtered_html',
        '2' => 'full_html',
        '3' => 'plain_text',
        '4' => 'full_html',
      ),
      'description' => t('Multithreaded Migration of users from Drupal 6'),
      'role_migration' => 'Role',
    );
    parent::__construct($args);
    $this->limit = empty($args['limit']) ? 100 : $args['limit'];
    $this->offset = empty($args['offset']) ? 0 : $args['offset'];
    // Map/index table: all sub-migrations share this one.
    $this->map = new MigrateSQLMap('nvidiaunprivilegeduser',
      // Index definition.
      array(
        'uid' => array(
          'type' => 'int',
          'unsigned' => TRUE,
          'not null' => TRUE,
          'description' => 'User migration reference',
        ),
      ),
      MigrateDestinationUser::getKeySchema()
    );
  }

  // Modify the original query to limit the number of items to work on.
  protected function query() {
    $query = parent::query();
    $query->range($this->arguments['offset'], $this->arguments['limit']);
    return $query;
  }
}
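The class above works on one chunk, defined by an offset and a limit. A minimal sketch of how the full index might be split into such chunks follows; the function name is hypothetical, and how each pair reaches a worker depends on the Drush extension, which is not shown here.

```php
<?php
// Splits $total rows into consecutive (offset, limit) pairs of at most
// $limit rows each. The last chunk is shortened to fit.
function chunk_ranges(int $total, int $limit): array {
  if ($limit < 1) {
    throw new InvalidArgumentException('limit must be at least 1');
  }
  $ranges = array();
  for ($offset = 0; $offset < $total; $offset += $limit) {
    $ranges[] = array(
      'offset' => $offset,
      'limit' => min($limit, $total - $offset),
    );
  }
  return $ranges;
}

// 1000 rows at 100 per chunk gives ten chunks to hand out to workers.
$ranges = chunk_ranges(1000, 100);
```

One caution that applies to any offset-based split: without a stable ORDER BY on the source query, SQL does not guarantee row order, so chunks could overlap or skip rows.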
Measuring the improvement
● Same server
● Restore destination DB from backup after each run
● Same source DB
● Both DBs on the same server
● MySQL optimizations for concurrency issues
Measuring the improvement
1000 rows, 100 per thread
Threads  Time  Speed
1        71s   845/min
2        60s   1000/min
3        54s   1111/min
Measuring the improvement
10,000 rows, 1000 per thread
Threads  Time  Speed
1        707s  848/min
2        303s  1980/min
3        300s  2000/min
4        291s  2061/min
5        351s  1709/min
Measuring the improvement
50,000 rows, 5000 per thread
Threads  Time   Speed
3        1990s  1507/min
4        1562s  1920/min
5        1303s  2302/min
6        1637s  1832/min
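The Speed column in the tables above matches rows × 60 / seconds, rounded down. A short sketch of that arithmetic, plus the speedup relative to a single thread, follows; the function names are illustrative.

```php
<?php
// Rows migrated per minute, rounded down as in the tables.
function rows_per_minute(int $rows, int $seconds): int {
  return intdiv($rows * 60, $seconds);
}

// How many times faster a run is than the baseline run.
function speedup(int $baseline_seconds, int $seconds): float {
  return $baseline_seconds / $seconds;
}

// 10,000-row test: one thread took 707s, four threads took 291s.
$single = rows_per_minute(10000, 707);     // 848
$four = rows_per_minute(10000, 291);       // 2061
$gain = round(speedup(707, 291), 2);       // 2.43
```

In the 10,000-row test the gain flattens after two threads and reverses at five. In the 50,000-row test the best time is at five threads, with six slower again.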
Conclusion
● Drop DNS TTL to 1 minute days before launch
● Repeatability is key
● Migration is very powerful but can be slow
● Automation helps drop downtime close to zero
Conclusion
● Ask for help
● There are many ways to use Migrate; if one way is not working, drop it and try a different one
  ○ CSV vs. direct read from DB
● Weird things happen with orphaned fields