
Database Migrations with Gradle and Liquibase

Dan Stine
Copyright Clearance Center
www.copyright.com

Gradle Summit
June 12, 2015


About Me

• Software Architect

• Library & Framework Developer

• Platform Engineering Lead & Product Owner

• Gradle User Since 2011

• Enemy of Inefficiency & Needless Inconsistency

dstine at copyright.com
sw at stinemail.com
github.com/dstine

6/12/2015


About Copyright Clearance Center

• Global licensing solutions that make © work for everyone

– Get, share and manage content

– Rights broker for the world’s most sought-after materials

– Global company (US, Europe, Asia) – HQ in Danvers, MA

• Industry-specific software systems

– Internal and external user base

– Applications, services, databases

– Organic growth over many years

• In 2011, CCC adopted a Product Platform strategy for growing its software portfolio


Agenda

• Context

• Liquibase

• Development Process

• Deploy Time

• Extensibility

• Wrap Up


CONTEXT


Database Migrations

• Database structure changes

– Tables, constraints, indexes, etc.

– Schema changes (DDL, not DML)

• Reference data

– List of countries, user types, order status, etc.

– Set of allowed values

• Database logic

– Functions, procedures, triggers

– (Very little of this)


Our Historical Approach

• DB migrations handled in relatively ad-hoc fashion

• Various flavors of “standard” practice

– Framework copied and modified from project to project

– Framework not always used (“small” projects)

• Development teams shared a DEV database

– Conflicts between code and database


Development Pain Points

• Intra-team collaboration was difficult

• Forced synchronous updates within development team

• Learn variations when switching between projects

• Project startup was costly


Deployment Pain Points

• Manual process

– Where are the scripts for this app?

– Which scripts should be run and how?

• Recurring difficulties

– Hours spent resolving mismatches between app and database

– Testing activities frequently delayed or even restarted

• Impossible to automate

– Too many variations

• Self-service deployment was a pipe dream


Standard Software Platform

• Started platform definition in 2011

– Homogeneous by default

• Tools

– Java, Spring, Tomcat, Postgres

– Git / GitHub, Gradle, Jenkins, Artifactory, Liquibase, Chef

• Process

– Standard development workflow

– Standard application shape & operational profile


Vision for Database Script Management

• Integrated into developer workflow

• Feeds cleanly into deployment workflow

• Developer commits scripts and the process takes over

– Just like with application code


A Plan For Pain Relief

• Manage scripts as first-class citizens

– Same repo as application code

– Standard location in source tree

• Standard execution engine

– No more variations

– Automatic tracking of applied migrations

• Prevent conflicts and mismatches

– Introduce developer workstation databases (LOCAL)

– Dedicated sandbox

– Commit database and associated application change together


A Plan For Pain Relief

• Liquibase

– Database described as code

– Execution engine & migration tracking

• Gradle

– Provide conventions

– Tasks for invoking Liquibase

– Already familiar to developers from existing build process

– Flexibility to integrate into deployment process

– Flexibility to handle emergent requirements


LIQUIBASE


Liquibase Basics

• Provides vocabulary of database changes

– Create Table, Add PK, Add FK, Add Column, Add Index, …

– Drop Table, Drop PK, Drop FK, Drop Column, Drop Index, …

– Insert, Update, Delete, …

• Changes are grouped into changesets

– Change(s) that should be applied atomically

• Changesets are grouped into changelogs

– Files managed in version control


Liquibase Basics

• Changesets uniquely identified by [Author, ID, File]

– Liquibase tracks changeset execution in a special table

– Lock table to prevent concurrent Liquibase invocations

– Modified changesets are detected via checksums

• Supported databases

– MySQL, PostgreSQL, Oracle, SQL Server, …

• Groovy DSL

– Liquibase v2 supported only XML

– https://github.com/tlberglund/groovy-liquibase


Example Changeset

changeSet(id: '2015-01-23', author: 'John Doe <jdoe@copyright.com>') {

  createTable(schemaName: 'apps', tableName: 'myapp_version', tablespace: 'ccc_data') {
    column(name: 'version_uid', type: 'VARCHAR(128)')
    column(name: 'type', type: 'VARCHAR(10)')
    column(name: 'owner_uid', type: 'VARCHAR(128)')
    column(name: 'version', type: 'VARCHAR(20)')
    column(name: 'start_date', type: 'TIMESTAMPTZ')
    column(name: 'end_date', type: 'TIMESTAMPTZ')
  }

  addPrimaryKey(constraintName: 'PK_myapp_version',
    schemaName: 'apps', tableName: 'myapp_version',
    tablespace: 'ccc_index', columnNames: 'version_uid')

  addForeignKeyConstraint(constraintName: 'FK_myapp_version_2_owner',
    baseTableSchemaName: 'apps', baseTableName: 'myapp_version',
    baseColumnNames: 'owner_uid',
    referencedTableSchemaName: 'apps', referencedTableName: 'myapp_owner',
    referencedColumnNames: 'owner_uid')
}


Liquibase @ CCC

• Learning curve

– Team needs to understand the underlying model

– Don’t edit changesets once they’ve been applied

• Our standards

– Schema name and tablespace are required

– Parameterize schema name and tablespace

createTable(
  schemaName: dbAppsSchema,
  tableName: 'myapp_version',
  tablespace: dbDataTablespace)
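
The two points above (never edit an applied changeset, always parameterize schema and tablespace) can be sketched together as a changelog fragment. This is an illustration only: the second changeset, its id, and the `comment` column are hypothetical, and dbAppsSchema / dbDataTablespace are assumed to be resolvable inside the changelog the same way the slide uses them.

```groovy
databaseChangeLog {

    // Already applied somewhere: leave this changeset exactly as it is,
    // otherwise its checksum no longer matches what Liquibase recorded.
    changeSet(id: '2015-01-23', author: 'John Doe <jdoe@copyright.com>') {
        createTable(schemaName: dbAppsSchema, tableName: 'myapp_version',
                    tablespace: dbDataTablespace) {
            column(name: 'version_uid', type: 'VARCHAR(128)')
        }
    }

    // A later change goes into a NEW changeset with its own id (hypothetical example).
    changeSet(id: '2015-02-10', author: 'John Doe <jdoe@copyright.com>') {
        addColumn(schemaName: dbAppsSchema, tableName: 'myapp_version') {
            column(name: 'comment', type: 'VARCHAR(255)')
        }
    }
}
```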


DEVELOPMENT PROCESS


Development Workflow

• Gradle is our SCM hub

– Workstation builds

– LOCAL app servers via command line

– IDE integration

– CI and release builds on Jenkins

• Maintain Gradle-centric workflow

– Integrated database development


Standard Project Structure

• Single Git repo with multi-project Gradle build

myapp
  myapp-db
  myapp-rest
  myapp-service
  myapp-ui

group = com.copyright.myapp

• UI and REST service published as WARs

• DB published as JAR


Custom Gradle Plugin

• Created custom plugin: ccc-postgres

• Standard script location

– Main source set: src/main/liquibase

– Package: com.copyright.myapp.db

• Standard versions

– Liquibase itself

– Postgres JDBC driver


Plugin Extension

• Custom DSL via Gradle extension

cccPostgres {
  mainChangelog = 'com/copyright/myapp/db/main.groovy'
}

• Main changelog includes other changelogs
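
A main changelog of this kind might look roughly like the sketch below. The `include(file: ...)` element is part of the Liquibase Groovy DSL; the two included file names are hypothetical and only illustrate splitting structure from reference data.

```groovy
// com/copyright/myapp/db/main.groovy
databaseChangeLog {
    // Hypothetical file names; each included file is itself a changelog.
    include(file: 'com/copyright/myapp/db/tables.groovy')
    include(file: 'com/copyright/myapp/db/reference-data.groovy')
}
```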


Development Lifecycle Tasks

• Provided by ccc-postgres

• Easy to manage LOCAL development database

– Isolated from other developers and deployments

– Pull in new schema changes, then run a task

• Built on Gradle Liquibase plugin

https://github.com/tlberglund/gradle-liquibase-plugin
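
A minimal sketch of the kind of configuration a convention plugin such as ccc-postgres could apply on top of the Gradle Liquibase plugin. The `liquibase { activities { ... } }` block follows that plugin's DSL as I understand it; the changelog path, URL, and credentials are placeholder assumptions. Applying the plugin itself is left out because the plugin id differs between plugin versions.

```groovy
liquibase {
    activities {
        main {
            // Assumed path: standard script location plus the main changelog package
            changeLogFile 'src/main/liquibase/com/copyright/myapp/db/main.groovy'
            // Assumed LOCAL defaults; real values would come from project properties
            url 'jdbc:postgresql://localhost:5432/ccc'
            username 'ccc'
            password 'changeme'   // placeholder only
        }
    }
}
```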



Development Lifecycle Tasks

• Typical developer loop

– gradlew update

– gradlew tomcatRun and/or IDE

• Not just for product development teams

– Simple to run any app

– Architects, QA, Platform Engineering


Development Lifecycle Tasks

Task                 Runs As    Description
createDatabase       postgres   Creates ccc user and database; creates data and index tablespaces
createSchema         ccc        Creates apps schema
update               ccc        Runs main changelog
dropDatabase         postgres   Drops ccc user and database
resetBaseChangelog   postgres   Truncates postgres.public.databasechangelog

• resetBaseChangelog

– Must clear all traces of Liquibase to start over

Plugin Configuration

• Override default library versions

cccPostgres.standardDependencies.postgresDriver

• Defaults point to LOCAL development database

– Can override property values

dbHost, dbPort, dbName

dbUsername, dbPassword

dbDataTablespace, dbIndexTablespace

dbBaseUsername, dbBasePassword
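
One way to picture "defaults point to LOCAL, properties override" is the plain Groovy sketch below. The property names dbHost, dbPort, and dbName come from the slide; the default values and the helper names are assumptions, not the plugin's actual code.

```groovy
// Hypothetical LOCAL defaults; any supplied override wins over them.
Map<String, String> withDefaults(Map<String, String> overrides) {
    def defaults = [dbHost: 'localhost', dbPort: '5432', dbName: 'ccc']
    return defaults + overrides
}

// Builds a PostgreSQL JDBC URL from the effective property values.
String jdbcUrl(Map<String, String> overrides) {
    def p = withDefaults(overrides)
    return "jdbc:postgresql://${p.dbHost}:${p.dbPort}/${p.dbName}".toString()
}

println jdbcUrl([:])                 // LOCAL database
println jdbcUrl([dbHost: 'testdb'])  // e.g. gradlew update -PdbHost=testdb
```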


Standardization and Compliance

• So all our teams are authoring DB code

• But Liquibase is new to many

• And we have company standards

• Let’s automate!


Static Analysis

• CodeNarc

– Static analysis of Groovy code

– Allows custom rule sets

• Created a set of custom CodeNarc rules

– Analyze our Liquibase Groovy DSL changelogs

• Apply to our db projects via the Gradle codenarc plugin

– Fail build if violations are found
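
A rough build-script sketch of this setup. The `codenarc` extension properties (`configFile`, `ignoreFailures`) belong to Gradle's CodeNarc plugin; the rule-set path is hypothetical, and treating src/main/liquibase as a Groovy source directory is an assumption about how the changelogs reach the analysis.

```groovy
apply plugin: 'groovy'
apply plugin: 'codenarc'

repositories {
    mavenCentral()   // CodeNarc itself is resolved from here
}

// Assumption: changelogs are part of the main Groovy source set, so the
// per-source-set CodeNarc task picks them up. The Groovy dependency
// declaration and the custom-rule JAR are omitted from this sketch.
sourceSets {
    main {
        groovy { srcDir 'src/main/liquibase' }
    }
}

codenarc {
    configFile = file('config/codenarc/liquibase-rules.groovy') // hypothetical path
    ignoreFailures = false                                      // violations fail the build
}
```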


Static Analysis – Required Attributes

• Our rule categorizes all change attributes

– Required by Liquibase

• createTable requires tableName

– Required by CCC

• createTable requires schemaName and tablespace

– Optional

• Unintended positive consequence!

– Catches typos that otherwise would not be detected until farther downstream

– constrainttName or tablspace
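
The categorization idea can be sketched in plain Groovy, independent of CodeNarc. The class name, the rule table, and the message texts are hypothetical; only the createTable requirements are taken from the slide. The sketch shows why typos surface: a misspelled attribute is both missing (under its correct name) and unknown (under its misspelled name).

```groovy
class AttributeCheck {
    // Illustrative table for one change type; a real rule set would cover many more.
    static final Map RULES = [
        createTable: [
            requiredByLiquibase: ['tableName'],
            requiredByCcc      : ['schemaName', 'tablespace'],
            optional           : ['remarks']
        ]
    ]

    static List<String> violations(String change, Map attributes) {
        def rule = RULES[change]
        if (rule == null) {
            return []   // change type not covered by this sketch
        }
        def required = rule.requiredByLiquibase + rule.requiredByCcc
        def known = required + rule.optional

        def missing = required.findAll { !attributes.containsKey(it) }
        def unknown = attributes.keySet().findAll { !(it in known) }

        def messages = []
        missing.each { messages << "missing required attribute: ${it}".toString() }
        unknown.each { messages << "unknown attribute: ${it}".toString() }
        return messages
    }
}

// 'tablspace' is a typo, so it is reported twice: once as missing, once as unknown.
println AttributeCheck.violations('createTable',
        [tableName: 'myapp_version', schemaName: 'apps', tablspace: 'ccc_data'])
```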


Static Analysis – Required Parameterization

• Ensure that schemaName & tablespace are parameterized for future flexibility

@Override
void visitMapExpression(MapExpression mapExpression) {
  mapExpression.mapEntryExpressions
    .findAll { it.keyExpression instanceof ConstantExpression }
    .findAll { ['schemaName', 'tablespace'].contains(it.keyExpression.value) }
    .findAll { it.valueExpression instanceof ConstantExpression }
    .each { addViolation(it, "${it.keyExpression.value} should not be hard-coded") }

  super.visitMapExpression(mapExpression)
}


Schema Spy

• Generates visual representation of database structure

– Requires running database instance

– Requires GraphViz installation

• Custom task runSchemaSpy

– By default, points at LOCAL database


Continuous Integration for DB Scripts

• Compile Groovy

– Catches basic syntax errors

• CodeNarc analysis

– Catches policy and DSL violations

• Integration tests

– Apply Liquibase scripts to H2 in-memory database

– Catches additional classes of error


Release Build

• Publish JAR

– Liquibase Groovy scripts from src/main/liquibase

• META-INF/MANIFEST.MF contains entry point

Name: ccc-postgres
MainChangelog: com/copyright/myapp/db/main.groovy
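
A manifest section like this can be produced with Gradle's standard `jar` task, as in the sketch below. The two-argument `attributes(map, sectionName)` call is Gradle's manifest API; whether ccc-postgres does it this way is an assumption.

```groovy
jar {
    manifest {
        // Writes a named section: "Name: ccc-postgres" followed by the attribute.
        attributes(['MainChangelog': 'com/copyright/myapp/db/main.groovy'], 'ccc-postgres')
    }
}
```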


DEPLOY TIME


Deployment Automation

• Early efforts focused on applications themselves

– Jenkins orchestrating Chef runs

– Initial transition from prose instructions to Infrastructure as Code

• Database deployments remained manual

– Better than ad-hoc approach

– But still error prone and time-consuming


Automated Application Deployments

• Chef environment file

– Cookbook versions: which instructions are used

• Chef data bags

– Configuration values for each environment

– Encrypted data bags for (e.g.) database credentials

• Jenkins deploy jobs (a.k.a. “the button”)

– Parameters = environment, application version


Initial Delivery Pipeline

[Diagram; label: Manual Deploy]

Initial Delivery Pipeline (DB Deployments)

• Clone Git repo and checkout tag

• Manually configure & run Gradle task from ccc-postgres

gradlew update -PdbHost=testdb.copyright.com -PdbPort=5432 -PdbDatabase=ccc -PdbUsername=ccc -PdbPassword=******

• Many apps x many versions x multiple environments = TIME & EFFORT & ERROR


Target Delivery Pipeline

[Diagram; label: Full Stack Automated Deploy]

Target Delivery Pipeline

• Automated process should also update database

– Single Jenkins job for both apps and database scripts

• Maintain data-driven design

– Environment file lists database artifacts

– Controlled flow down the pipeline

• Gradle database deployment task

– Retrieve scripts from Artifactory

– Harvest information already in Chef data bags (URL, password)

– Execute Liquibase


Automated Database Deployment


Jenkins Deploy Job

• One job per application group, per set of deployers

– E.g. myapp.qa allows QA to deploy to environments they own

– Typically contains multiple deployables (apps, db artifacts)

– Typical deployer sets = DEV, QA, OPS

• Executes Liquibase via Gradle for database deployments

– Invokes deployDbArtifact task for each db artifact

• (Executes Chef for application deployments)
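
As an illustration of "invokes deployDbArtifact for each db artifact", the sketch below builds one Gradle invocation per artifact. The project property names are the ones the deck lists for deployDbArtifact; the helper name and the shape of the input map (artifact name to version) are assumptions about how a deploy job could be driven.

```groovy
// Returns one command line per database artifact listed for the environment.
List<String> deployCommands(String appGroup, String environment, Map<String, String> dbArtifacts) {
    dbArtifacts.collect { name, version ->
        ("gradlew deployDbArtifact -PappGroup=${appGroup} -PartifactName=${name} " +
         "-PartifactVersion=${version} -Penvironment=${environment}").toString()
    }
}

deployCommands('myapp', 'TEST', ['myapp-db': '2.1.12']).each { println it }
```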


Gradle deployDbArtifact Task

• Parameterized via Gradle project properties

– appGroup = myapp

– artifactName = myapp-db

– artifactVersion = 2.1.12

– environment = TEST

• Downloads JAR from Artifactory

– com.copyright.myapp:myapp-db:2.1.12

– Extract MainChangelog value from manifest
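
Reading the entry point back out of the downloaded JAR needs only the JDK's `java.util.jar` classes. A minimal sketch, assuming the manifest carries a `ccc-postgres` section as shown for the release build; the helper name is hypothetical.

```groovy
import java.util.jar.Manifest

// Returns the MainChangelog value from the ccc-postgres section, or null if absent.
String mainChangelogOf(Manifest manifest) {
    def section = manifest.getAttributes('ccc-postgres')
    return section?.getValue('MainChangelog')
}
```

For a JAR on disk, the manifest can be obtained with `new java.util.jar.JarFile(jarFile).manifest` and passed to this helper.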


Gradle deployDbArtifact Task

• Retrieves DB URL from Chef data bag item for TEST

"myapp.db.url": "jdbc:postgresql://testdb:5432/ccc"

• Retrieves password from encrypted Chef data bag

– myapp.db.password

• Executes Liquibase


Data Bag Access

• Built on top of Chef Java bindings from jclouds

• No support for encrypted data bags

• Java Cryptography Extensions and the following libs:

compile 'org.apache.jclouds.api:chef:1.7.2'
compile 'org.apache.jclouds.provider:enterprisechef:1.7.2'
compile 'commons-codec:commons-codec:1.9'
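
The deck does not show the decryption code, so the following is only a sketch of what decrypting one encrypted data bag value with the JDK's crypto classes could look like. It assumes the version 1 encrypted data bag layout as I understand it: AES-256-CBC, key derived as the SHA-256 digest of the shared secret, Base64-encoded `iv` and `encrypted_data` fields, and a plaintext JSON wrapper. It uses java.util.Base64 instead of commons-codec to stay self-contained, and the helper names are hypothetical.

```groovy
import groovy.json.JsonSlurper
import java.security.MessageDigest
import javax.crypto.Cipher
import javax.crypto.spec.IvParameterSpec
import javax.crypto.spec.SecretKeySpec

// Assumption: the AES key is the SHA-256 digest of the shared secret.
SecretKeySpec keyFor(String secret) {
    byte[] digest = MessageDigest.getInstance('SHA-256').digest(secret.getBytes('UTF-8'))
    return new SecretKeySpec(digest, 'AES')
}

// item is the map for one encrypted value, e.g. [encrypted_data: '...', iv: '...'].
Object decryptValue(Map item, String secret) {
    Cipher cipher = Cipher.getInstance('AES/CBC/PKCS5Padding')
    // The MIME decoder tolerates line breaks inside the Base64 text.
    byte[] iv = Base64.mimeDecoder.decode(item.iv as String)
    cipher.init(Cipher.DECRYPT_MODE, keyFor(secret), new IvParameterSpec(iv))
    byte[] plain = cipher.doFinal(Base64.mimeDecoder.decode(item.encrypted_data as String))
    // Assumption: the plaintext is JSON of the form {"json_wrapper": <value>}.
    return new JsonSlurper().parseText(new String(plain, 'UTF-8')).json_wrapper
}
```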


Push-Button Deploys

Deploy History

[Screenshot dated 6/10/2015; columns: DEV, TEST, PROD]

Automated Deployments By Role

[Chart; annotations: Initial Rollout, QA Rising, QA Overtakes OPS, OPS Falling]

EXTENSIBILITY


Additional Scenarios

• Framework originally designed to handle migrations for schema owned by each application

• Achieved additional ROI by managing additional database deployment types with low effort


Roles and Permissions

• An application that manages user roles and permissions (RP) for all other applications

– Has rp-db project to manage its schema, of course

– But every consuming app (e.g. myapp) needs to manage the particular roles and permissions known to it

– Reference data that lives in tables owned by another app

• myapp now has multiple db projects

– myapp-db to manage its schema

– myapp-rp-db to manage its RP reference data

– Both are deployed with new versions of myapp


Roles and Permissions

• Minor addition of conditional logic

if (artifactName.endsWith('-rp-db')) {
  // e.g. myapp-rp-db
  // deploy to RP database
} else {
  // e.g. myapp-db
  // deploy to application's own database
}
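
One way this branch could feed the rest of the task is by choosing which data bag key holds the target database URL. In the sketch below the `myapp.db.url` style key matches the data bag example in the deck; the `rp.db.url` key and the helper name are hypothetical.

```groovy
// Picks the data bag key for the database an artifact should be deployed to.
String dbUrlKeyFor(String appGroup, String artifactName) {
    if (artifactName.endsWith('-rp-db')) {
        return 'rp.db.url'                     // hypothetical key for the RP database
    }
    return "${appGroup}.db.url".toString()     // the application's own database
}

println dbUrlKeyFor('myapp', 'myapp-rp-db')
println dbUrlKeyFor('myapp', 'myapp-db')
```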

• Easy to implement because … Gradle & Groovy

• Conceptual integrity of framework is maintained


WRAP UP


Observations

• Power of convention and consistency

– Once first schemas were automated, dominoes toppled quickly

• Power of flexible tools and building blocks

– Handle legacy complexities, special cases, acquisitions, strategy changes, evolving business conditions

– New database project types fell easily into place


Observations

• Know your tools

– Knowledge (how) has to propagate through the organization

– Ideally the underlying model (why)

• Schema changes no longer restrained by process

“If it hurts, do it more often”

“If it’s easy, do it more often”

Reduced technical debt

Dirty Work …

• Database development and deployment processes are often considered to be unexciting

• But sometimes you need to roll up your sleeves and do the dirty work to realize a vision

• And relational databases are still the bedrock of most of today’s information systems


Dirty Work … Can Be Exciting!

• Efficient processes

• Reliable and extensible automation

• CONTINUOUS DELIVERY


Full Stack Automated Self-Service Deployments

• Reduced workload of Operations team

• Safely empowered individual product teams

• Significantly reduced the DEV-to-TEST time delay

• Reinvested the recouped bandwidth

– More reliable & frequent software releases

– Additional high-value initiatives


Resources

• Liquibase

http://www.liquibase.org

https://github.com/tlberglund/groovy-liquibase

https://github.com/tlberglund/gradle-liquibase-plugin

• Refactoring Databases: Evolutionary Database Design, Ambler and Sadalage (2006)

• Jenkins and Chef: Infrastructure CI and Application Deployment

http://www.slideshare.net/dstine4/jenkins-and-chef-infrastructure-ci-and-automated-deployment

https://www.youtube.com/watch?v=PQ6KTRgAeMU


The word and design marks which appear in this presentation are the trademarks of their respective companies.


Thank You:

Copyright Clearance Center Engineering Team

Gradle Summit Organizers
