Applying the LIPARM Schema to legacy content Paul S Ell David Hardy Centre for Data Digitisation and Analysis LIPARM Project Workshop 28 January 2013

Feb 24, 2016


Applying the LIPARM Schema to legacy content

Paul S Ell, David Hardy

Centre for Data Digitisation and Analysis

LIPARM Project Workshop, 28 January 2013


Backdrop

• Significant investment in British Isles Parliamentary content – BOPCRIS, Stormont Papers, Cobbett’s Parliamentary Papers

• Generally each resource has its own interface and metadata standards

• Systematic research using disparate resources is hampered by this

• Consequently the impact of the digital resources was reduced


CDDA’s Role

• To take the standardised Parliamentary Metadata Language (PML) developed by the project and apply it to sample legacy datasets

• To examine existing authority files/controlled vocabularies and see the degree to which they need augmentation

• To advise on the challenges of applying the schema to legacy materials

• To identify methodologies to reduce the capital cost of implementing the schema

• To establish the time and investment needed to convert existing content

• To advise on the application of the schema to born digital content


Capturing what?

• Members of parliament – John Smith, Lord Smith, Viscount Smith, Member for Manchester South, the Prime Minister, the Chancellor etc

• Parliamentary constituencies – changes of name over time, names presented in different ways (South Manchester/Manchester South), varying boundaries where the name remains the same, differentiating John Taylor (UU MP), Lord Kilclooney and Kilclooney the place in Donegal

• Calendar objects – Parliaments 1979-1983, sessions 1/9/79-1/6/80, sittings 15/1/80

• Functions – PM, Speaker, Chancellor

• Proceeding objects – debates, reading of bills, reading of acts

• Divisions – and members who cast votes
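The calendar objects listed above nest inside one another: a sitting falls within a session, which falls within a parliament. A minimal Python sketch of that containment check follows, using the dates from the slide. The `Period` class and `parse_day` helper are hypothetical names for illustration and are not part of the LIPARM schema; the 1 January and 31 December bounds for the parliament are an assumption, since the slide gives years only.

```python
from dataclasses import dataclass
from datetime import date, datetime


def parse_day(text: str) -> date:
    """Parse a d/m/yy date as written on the slide, e.g. '15/1/80'."""
    return datetime.strptime(text, "%d/%m/%y").date()


@dataclass(frozen=True)
class Period:
    """A named calendar object with an inclusive date range (hypothetical)."""
    label: str
    start: date
    end: date

    def contains(self, other: "Period") -> bool:
        """True when the other period lies wholly inside this one."""
        return self.start <= other.start and other.end <= self.end


# Assumption: year-only bounds are widened to whole calendar years.
parliament = Period("Parliament 1979-1983", date(1979, 1, 1), date(1983, 12, 31))
session = Period("Session", parse_day("1/9/79"), parse_day("1/6/80"))
sitting = Period("Sitting", parse_day("15/1/80"), parse_day("15/1/80"))

# A sitting should sit inside a session, and a session inside a parliament.
nested = parliament.contains(session) and session.contains(sitting)
```

A check of this kind could flag a sitting keyed to the wrong session before any tagging is done.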


Authority files/Controlled vocabularies

• The schema is highly dependent on authority files such as members of parliament and the dates they were in parliament, offices of state and individuals associated with them, constituency lists for each parliament and an association between a person and a constituency

• Whilst to a degree authority files could be populated automatically, in practice they still needed manual amendment

• Authority files also had to cope with differing parliamentary models between Westminster and Northern Ireland – for example in NI single constituencies had more than one member serving them

• Ideally controlled vocabularies/authority files should facilitate links to non-parliamentary e-resources
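As an illustration of what such authority files have to support, the Python sketch below maps variant constituency names to one identifier and links people to constituencies over date ranges. All identifiers, field names and rows are hypothetical placeholders, not LIPARM data or the project's actual file layout. The lookup returns a list so that a constituency with more than one member, as in Northern Ireland, is handled in the same way as a single-member one.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Membership:
    """Association between a person and a constituency (hypothetical layout)."""
    person_id: str
    constituency_id: str
    start: date
    end: Optional[date]  # None means the person is still serving


# Variant spellings map to one hypothetical constituency identifier.
CONSTITUENCY_VARIANTS = {
    "manchester south": "const-001",
    "south manchester": "const-001",
}

# Placeholder rows for illustration only, not real parliamentary data.
MEMBERSHIPS = [
    Membership("person-001", "const-001", date(1980, 1, 1), date(1982, 12, 31)),
    Membership("person-002", "const-001", date(1980, 1, 1), None),
]


def resolve_constituency(name: str) -> Optional[str]:
    """Normalise case and whitespace, then look the name up."""
    key = " ".join(name.lower().split())
    return CONSTITUENCY_VARIANTS.get(key)


def members_on(constituency_id: str, day: date) -> list[str]:
    """Return every person serving the constituency on the given day."""
    return [
        m.person_id
        for m in MEMBERSHIPS
        if m.constituency_id == constituency_id
        and m.start <= day
        and (m.end is None or day <= m.end)
    ]
```

For example, `members_on(resolve_constituency("South Manchester"), date(1981, 1, 1))` would return both placeholder members.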


Issues to consider

• Initially the schema was applied manually, which was both very time-consuming and error-prone. A number of steps were introduced to automate the system

• The amount of work involved in retro-conversion varies from parliamentary year to year. New administrations tend to have more legislation, administrations with slight majorities tend to have more divisions etc.

• The schema needs to be sufficiently flexible or adaptable to cope with differences between parliaments – such as multi-member constituencies.

• It would be useful to see to what degree existing XML could be used to apply the schema

• A pick and mix approach to elements of the schema would be good. Such is the level of detail at present that tagging is highly complex.


Lessons learnt

• Real-time conversion of content – as proposed to the Welsh and Northern Ireland Assemblies – is likely to be far less problematic than retro-conversion

• In total only 14 years of Hansard have been converted during the project. Whilst the PML was honed, and staff became more familiar with the content, this is a very slow process

• Hence there is a need to make the best possible use of any existing XML and to automate as much of the process as possible

• The project has primarily addressed the application of the PML to Hansard. Other content – parliamentary reports for example – will result in additional challenges
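One way existing XML might be put to use is to harvest speaker names from it and match them against an authority list before any manual tagging. The Python sketch below assumes a purely hypothetical `<speech speaker="...">` markup and a hypothetical authority dictionary; neither is taken from Hansard XML or from the PML.

```python
import xml.etree.ElementTree as ET

# Hypothetical legacy markup, invented for this illustration.
SAMPLE = """<debate>
  <speech speaker="Mr John Smith">First contribution.</speech>
  <speech speaker="The Prime Minister">Second contribution.</speech>
  <speech speaker="Mr John Smith">Third contribution.</speech>
</debate>"""

# Hypothetical authority list keyed by lower-cased name.
AUTHORITY = {"mr john smith": "person-001"}


def match_speakers(xml_text, authority):
    """Split speaker names into those found in the authority list and the rest."""
    root = ET.fromstring(xml_text)
    matched, unmatched = {}, []
    for speech in root.iter("speech"):
        name = speech.get("speaker", "").strip()
        person_id = authority.get(name.lower())
        if person_id:
            matched[name] = person_id
        elif name not in unmatched:
            unmatched.append(name)  # left for manual review
    return matched, unmatched


matched, unmatched = match_speakers(SAMPLE, AUTHORITY)
```

Here "The Prime Minister" is left unmatched, which is the kind of case where a function list tied to dates would be needed to identify the person.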


Examples of stages in the process

Creating a unique name and date range for each volume
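The slide does not record how the names were formed, so the following Python sketch shows only one possible approach: a sortable key built from a series title, a volume number and the date range, plus a check for accidental duplicates. The function names and the key pattern are assumptions made for illustration.

```python
from datetime import date


def volume_key(series: str, volume: int, start: date, end: date) -> str:
    """Build a sortable key from series, volume number and date range."""
    if end < start:
        raise ValueError("volume end date precedes start date")
    slug = "-".join(series.lower().split())
    return f"{slug}_v{volume:04d}_{start.isoformat()}_{end.isoformat()}"


def find_duplicates(keys: list[str]) -> set[str]:
    """Return any key that occurs more than once."""
    seen, dupes = set(), set()
    for key in keys:
        if key in seen:
            dupes.add(key)
        else:
            seen.add(key)
    return dupes
```

Zero-padding the volume number keeps the keys in volume order when sorted as plain text.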


Development of a function/job list


The fields are pre-populated from the existing authority files. Some skilled data entry staff have sufficient access privileges to create new roles/people etc


Entering divisions
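Since divisions record which members cast votes, one plausible safeguard at entry time is to check each voter against the authority file while tallying. The Python sketch below is a hypothetical illustration; the identifiers, the `aye`/`no` labels and the function name are assumptions, not the project's data entry tool.

```python
from collections import Counter

# Hypothetical set of member identifiers drawn from an authority file.
KNOWN_MEMBERS = {"person-001", "person-002", "person-003"}


def tally_division(votes: dict[str, str]) -> tuple[Counter, list[str]]:
    """Count votes per side and list voters missing from the authority file."""
    unknown = sorted(p for p in votes if p not in KNOWN_MEMBERS)
    counts = Counter(side for p, side in votes.items() if p in KNOWN_MEMBERS)
    return counts, unknown


counts, unknown = tally_division({
    "person-001": "aye",
    "person-002": "no",
    "person-003": "aye",
    "person-999": "aye",  # not in the authority file, so flagged
})
```

Unknown voters are reported rather than counted, so a mistyped name surfaces for correction instead of silently altering the result.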
