
Documenting metadata application profiles and vocabularies

Jan 21, 2018


Paul Walk
Transcript
Page 1: Documenting metadata application profiles and vocabularies

Paul Walk

Director, Antleaf

Managing Director, Dublin Core Metadata Initiative (DCMI)

Web: http://www.paulwalk.net

Email: [email protected]

Twitter: @paulwalk

www.antleaf.com www.dublincore.org

Sharing profiles: Documenting profiles and vocabularies on the Web

Page 2: Documenting metadata application profiles and vocabularies

is it more important that application profiles are machine-friendly, or user-friendly?

Page 3: Documenting metadata application profiles and vocabularies

the specific challenge:

how to manage & publish the Dublin Core technical documentation in a more efficient & sustainable way, making it as user-friendly as possible while maintaining its machine-readability

Page 4: Documenting metadata application profiles and vocabularies

context

• DCMI publishes important technical documentation (vocabularies, specifications, models) on the Web
• until recently, managed in a sophisticated bespoke system:
  • sources edited as XML files
  • maintained in a Subversion repository
  • assembled & converted with shell scripts and 'Ant'
  • FTP to a 'staging server'
  • deployed to the live server by the server admin, on request
• essentially a "closed" system

Page 5: Documenting metadata application profiles and vocabularies

three technologies which make the difference

1. Git
  • stable, sophisticated, free version control technology which is ubiquitously supported
  • GitHub: global-scale infrastructure providing Git as a service
  • invite contribution by 'pull request'
2. Markdown
  • simple, parseable but easily readable plain text format
3. Static website generators
  • a new class of content management system where sources are managed locally and compiled into webpages which are then uploaded to a server (like we used to do it in the early 90s!)
  • supports distributed content-management via Git
  • supports long-term preservation by requiring only simple text-based formats
  • supports use of desktop authoring tools - e.g. text-editors

Page 6: Documenting metadata application profiles and vocabularies

we are exploring how these three technologies:

* Git/GitHub
* Markdown (with metadata “front matter”)
* static-site generators

can be harnessed together to address our challenge

Page 7: Documenting metadata application profiles and vocabularies

what are static site generators?

Page 8: Documenting metadata application profiles and vocabularies

what are static site generators?

• a different kind of web-content management system, designed to publish content as static content to a bog-standard web-server
• content is processed during the publishing operation, rather than when the user requests content (although client-side JavaScript is still supported)
• simple command-line application to generate content and serve pages
• no database - content in semi-structured text files
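A minimal sketch of the "process at publish time" idea, in Python. It is not how any particular generator works: the in-memory `SOURCES` mapping, the `PAGE_TEMPLATE` string and the `build_site` function are all hypothetical names chosen for illustration, and a real generator would read files from disk and convert Markdown.

```python
from html import escape

# Hypothetical in-memory "source files": path -> plain-text content.
SOURCES = {
    "index.txt": "Welcome to the documentation.",
    "terms/title.txt": "A name given to the resource.",
}

# Hypothetical one-line page template with a single placeholder.
PAGE_TEMPLATE = "<html><body><p>{body}</p></body></html>"


def build_site(sources):
    """Render every source file to an HTML string once, at publish time."""
    output = {}
    for path, text in sources.items():
        # Swap the source extension for .html; the folder part is kept as-is.
        html_path = path.rsplit(".", 1)[0] + ".html"
        output[html_path] = PAGE_TEMPLATE.format(body=escape(text))
    return output


if __name__ == "__main__":
    for path, html in build_site(SOURCES).items():
        print(path, "->", len(html), "bytes")
```

The result is a set of finished pages that any ordinary web server can deliver without running code per request.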

Page 9: Documenting metadata application profiles and vocabularies

components - standard to most systems

1. content-model
  • folder hierarchy, text files
2. content pages
  • (markdown, front-matter)
  • blog type content is also often supported
3. templates (& themes)
  • (with some level of basic scripting)
4. generator software
  • typically a command-line script or application
5. configuration file

Page 10: Documenting metadata application profiles and vocabularies

1. content-model

• text files arranged in folder hierarchy
• folder hierarchy relates to URL path structure
• filename relates to URL
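The mapping from folders and filenames to URLs can be sketched as below. This assumes a convention where sources live under a folder named `content`, where `index` files stand for their folder, and where URLs end in a slash; actual generators differ in these details, and `url_for` is a hypothetical helper.

```python
from pathlib import PurePosixPath


def url_for(source_path):
    """Map a content file path to the URL path it would be published at."""
    path = PurePosixPath(source_path)
    if path.stem == "index":
        # content/terms/index.md -> /terms/
        parts = path.parent.parts
    else:
        # content/terms/title.md -> /terms/title/
        parts = path.parent.parts + (path.stem,)
    # Drop the leading content folder (assumed to be named "content").
    if parts and parts[0] == "content":
        parts = parts[1:]
    return "/" + "/".join(parts) + ("/" if parts else "")


if __name__ == "__main__":
    for source in ("content/index.md", "content/terms/title.md"):
        print(source, "->", url_for(source))
```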

Page 11: Documenting metadata application profiles and vocabularies

2. content pages

• "front-matter" metadata
  • often in YAML format (as in the example on the slide)
• main body in Markdown, arbitrary HTML also accepted where necessary
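To make the page layout concrete, here is a rough Python sketch that separates front matter from the Markdown body. The field names in `SAMPLE_PAGE` (`title`, `label`, `status`) are invented for illustration, and `split_front_matter` handles only flat `key: value` lines, a small subset of YAML; a real generator would use a full YAML parser.

```python
# Hypothetical content page: front matter between "---" lines, then Markdown.
SAMPLE_PAGE = """---
title: Title
label: title
status: example
---
A name given to the resource.

Further *free-text* description can follow in Markdown.
"""


def split_front_matter(text):
    """Split a page into (metadata dict, Markdown body).

    Only flat `key: value` lines are understood; nested YAML is not.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    try:
        end = lines.index("---", 1)
    except ValueError:
        return {}, text  # no closing delimiter, so treat everything as body
    metadata = {}
    for line in lines[1:end]:
        if ":" in line:
            key, value = line.split(":", 1)
            metadata[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return metadata, body


if __name__ == "__main__":
    meta, body = split_front_matter(SAMPLE_PAGE)
    print(meta)
    print(body)
```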

Page 12: Documenting metadata application profiles and vocabularies

3. templates

• can reference metadata (e.g. 'page title') from content page
• can re-use 'partial' templates (e.g. a common 'header' & 'footer')
• often in a common templating language such as HAML
• (example below is in Go's templating syntax)

= include partials/header.html .
div.row-fluid
  div class="col-xs-12"
    h1.page-title {{if .Draft}}[**draft**]{{end}}{{.Title}}
    h2.page-title
      i {{.Params.author}}, {{.Date.Format "Monday, January 02, 2006"}}
    {{.Content}}
= include partials/share_buttons.html .
= include _internal/disqus.html .
= include partials/footer.html .
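The same two ideas, page metadata referenced from a template and shared partials, can be sketched with Python's standard `string.Template`. This swaps in a different templating mechanism from the one on the slide; `PARTIALS`, `PAGE` and `render` are hypothetical names, and no HTML escaping is attempted.

```python
from string import Template

# Hypothetical partials shared by every page.
PARTIALS = {
    "header": "<header>Example vocabulary docs</header>",
    "footer": "<footer>Generated statically</footer>",
}

# Page template: partials and page metadata are both plain placeholders.
PAGE = Template("$header\n<h1>$title</h1>\n<p><i>$author</i></p>\n$content\n$footer")


def render(metadata, content_html):
    """Fill the page template from front-matter metadata plus shared partials."""
    title = metadata["title"]
    if metadata.get("draft"):
        # Flag unpublished pages in the heading.
        title = "[draft] " + title
    return PAGE.substitute(
        PARTIALS,
        title=title,
        author=metadata.get("author", "unknown"),
        content=content_html,
    )


if __name__ == "__main__":
    print(render({"title": "Title", "author": "A. Editor", "draft": True}, "<p>Body</p>"))
```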

Page 13: Documenting metadata application profiles and vocabularies

4. generator software

• used to generate new content
• also used to run a local server to see how the site will look

Page 14: Documenting metadata application profiles and vocabularies

deployment options

• SFTP
• Rsync (over SSH)
• git commit hooks (or GitHub webhooks)
  • requires the site to be built on the server, so a little more infrastructure (a simple CGI) is required
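As an illustration of the Rsync option, the sketch below assembles an rsync-over-SSH command in Python. The build folder `public` and the remote target are made-up examples, and `rsync_command` is a hypothetical helper; it only builds the argument list, which could then be passed to `subprocess.run`.

```python
import shlex


def rsync_command(build_dir, remote, dry_run=False):
    """Build an rsync-over-SSH command that mirrors build_dir to remote."""
    # -a: archive mode, -z: compress, --delete: remove files gone from the build.
    cmd = ["rsync", "-az", "--delete", "-e", "ssh"]
    if dry_run:
        cmd.append("--dry-run")
    # Trailing slash: copy the directory's contents, not the directory itself.
    cmd += [build_dir.rstrip("/") + "/", remote]
    return cmd


if __name__ == "__main__":
    # Hypothetical paths; run with subprocess.run(cmd, check=True) to execute.
    cmd = rsync_command("public", "user@example.org:/var/www/docs/", dry_run=True)
    print(shlex.join(cmd))
```

Because `--delete` removes remote files that no longer exist locally, trying the command with `dry_run=True` first is a sensible precaution.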

Page 15: Documenting metadata application profiles and vocabularies

436 known generators

https://staticsitegenerators.net

Page 16: Documenting metadata application profiles and vocabularies

workflow

Page 17: Documenting metadata application profiles and vocabularies

‘flipping’ the approach

Page 18: Documenting metadata application profiles and vocabularies

old approach (single source file)

Page 19: Documenting metadata application profiles and vocabularies

new approach (many source files, one per term)

Page 20: Documenting metadata application profiles and vocabularies

pros and cons

• old approach (source in XML file or similar)
  • pros:
    • easy to track source files (few in number)
    • easy to transform into other machine-readable formats
  • cons:
    • difficult to maintain the source - not user-friendly
    • poor support for extensive free text description
• new approach (source in Markdown+YAML)
  • pros:
    • easier for humans to read and maintain
    • good support for extensive free text description
    • easy to re-use (partially/completely)
  • cons:
    • may not suit very complex vocabularies or profiles
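One way the per-term sources could still yield a machine-readable output is sketched below in Python. The records in `TERMS` stand in for metadata read from each term's front matter; the field names, the `to_json` function and the `http://example.org/terms/` base URI are assumptions for illustration, not DCMI's actual build.

```python
import json

# Hypothetical per-term metadata, as it might be read from each file's front matter.
TERMS = [
    {"label": "title", "definition": "A name given to the resource."},
    {"label": "creator", "definition": "An entity primarily responsible for making the resource."},
]


def to_json(terms, base_uri):
    """Combine per-term records into one machine-readable JSON document."""
    records = [
        {"id": base_uri + term["label"], **term}
        for term in sorted(terms, key=lambda t: t["label"])
    ]
    return json.dumps({"terms": records}, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(to_json(TERMS, "http://example.org/terms/"))
```

Sorting by label keeps the combined output stable between builds, so changes show up cleanly in version control.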

Page 21: Documenting metadata application profiles and vocabularies

simplifying curation and preservation

• version control and redundancy
  • synchronised repositories & distributed version control via Git
• active curation
  • ease of access and contribution to sources via Git
  • simple & readable plain text formats (Markdown)
  • "one click" deployment
• minimal deployment infrastructure
  • standard web-server
  • text files, open formats, no database or server-side 'logic', static site generators
  • reduces broken websites

Page 22: Documenting metadata application profiles and vocabularies

issues & challenges

Page 23: Documenting metadata application profiles and vocabularies

1. is this still too technical for some people who may need to maintain a metadata profile or vocabulary?

Page 24: Documenting metadata application profiles and vocabularies

2. will this approach be sophisticated enough to document the majority of candidate profiles/vocabularies?

Page 25: Documenting metadata application profiles and vocabularies

3. can we generalise this approach to provide a useful, re-usable tool kit for others to adopt?

Page 26: Documenting metadata application profiles and vocabularies

4. how do we handle versioning? By term, or by ‘collection’ - e.g. vocabulary or profile?

Page 27: Documenting metadata application profiles and vocabularies

versioning by term

Page 28: Documenting metadata application profiles and vocabularies

Paul Walk

Director, Antleaf

Managing Director, Dublin Core Metadata Initiative (DCMI)

Web: http://www.paulwalk.net

Email: [email protected]

Twitter: @paulwalk

www.antleaf.com www.dublincore.org

Thank you!