
Working with Data Managers

Renee Woodten Frost

Internet2 Middleware Initiative

University of Michigan

Copyright Renee Woodten Frost 2003. This work is the intellectual property of the author. Permission is granted for this material to be shared for non-commercial, educational purposes, provided that this copyright statement appears on the reproduced materials and notice is given that the copying is by permission of the author. To disseminate otherwise or to republish requires written permission from the author.


Base CAMP - February 5-7, 2003

Topics

• Vignette
• Data: Role in Directory Implementation
• Data Policy Issues
• Key Data Needs
  – Identifiers
  – Directory data
  – eduPerson schema
• Strategies and Recommendations


Vignette

Sam is taking a class in genetics at Alpha U (AU) and needs to do some research for a paper. At lunch, he goes online to access a restricted EBSCO database that AU shares with Beta U. A window pops up in the browser asking if it’s okay for AU to give EBSCO information about his status, because only students from subscribing institutions can access the database. He clicks OK, knowing that only his status is passed, not his name or contact information. The browser then loads the restricted website.


Vignette Illustrates

• Privacy trust
• Sam controls personal information flow
• Administrative and security services integration
• Inter-campus access
• University vouches for and acts on behalf of Sam
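
The consent-and-release step in the vignette can be sketched in a few lines. This is only an illustration: the person record, the service name `ebsco-shared-db`, and the policy table are hypothetical, and a real deployment would rely on its federation software rather than hand-written code.

```python
# Hypothetical person record held by the campus (Alpha U).
SAM = {
    "displayName": "Sam Example",
    "mail": "sam@alpha-u.example",
    "eduPersonAffiliation": ["student", "member"],
}

# Hypothetical per-service policy: which attributes the campus may release.
RELEASE_POLICY = {
    "ebsco-shared-db": ["eduPersonAffiliation"],
}


def release_attributes(person, service, user_consented):
    """Return only the attributes the policy allows, and only with consent."""
    if not user_consented:
        return {}
    allowed = RELEASE_POLICY.get(service, [])
    return {name: person[name] for name in allowed if name in person}


# Sam clicks OK: only his status leaves the campus, not his name or email.
released = release_attributes(SAM, "ebsco-shared-db", user_consented=True)
print(released)
```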


Demands on IT Revealed

• One stop for university services integrated with course management systems
• Expensive library databases shared with other schools by joint agreement
• Browser or desktop preferences follow you
• Submission and/or maintenance of information online
• Privacy protection


Important questions, Important data

• Are the people using these services who they claim to be?
• Are they members of our campus community?
• Have they been given permission?
• Is their privacy being protected?


Pause for Some Terminology

• Identity: set of attributes about you.
• Attributes: specific information stored about you.
• Authentication: process used to prove your identity. Often a login process.
• Authorization: process of determining if policy permits an intended action to proceed.
• Directories: where an identity’s basic characteristics are stored.


Enterprise Directory

• Anti-stovepipe architecture that can provide authentication, attribute, & group services to applications.

• Adds value by improving cost/benefit of online services and by improving security.

• A new and visible flow of administrative data.


Definitions: Enterprise Directory Services

Enterprise Directory services - where your electronic identifiers are reconciled and basic characteristics are kept

– Very quick lookup function

– Machine address, voice mail box, email box location, address, campus identifiers


Enterprise Directory

• Determine application-driven requirements for authentication, attribute, and group services and then design these four stages to meet the requirements:

1. Data Sources
2. Metadirectory Processes
3. Directory Services
4. Applications
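
A toy walk through the four stages, as a sketch only. The two feeds, the shared person ID, and the function names are invented for illustration; real metadirectory processes also handle identity matching, conflicts, and deletions.

```python
# Stage 1: data sources (hypothetical feeds keyed by a shared person ID).
hr_feed = {"1001": {"cn": "Pat Lee", "title": "Lab Manager"}}
sis_feed = {
    "1001": {"cn": "Pat Lee", "major": "Biology"},
    "1002": {"cn": "Sam Roe", "major": "Genetics"},
}


def metadirectory_join(*feeds):
    """Stage 2: merge per-person records from every source into one entry."""
    merged = {}
    for feed in feeds:
        for person_id, attrs in feed.items():
            merged.setdefault(person_id, {}).update(attrs)
    return merged


# Stage 3: the directory service, reduced here to an in-memory lookup table.
directory = metadirectory_join(hr_feed, sis_feed)


def white_pages_lookup(person_id):
    """Stage 4: an application that only reads from the directory."""
    entry = directory.get(person_id)
    return entry["cn"] if entry else None


print(white_pages_lookup("1002"))
```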


UoM Core Middleware Stages

[Diagram: core middleware in four numbered stages, left to right: data sources, metadirectory processes, directories, applications. Labels appearing in the diagram: HRS, SIS, IDs, Ph Rebuild & AMP, BuildUser, cardswipe, qi-oper utils, qiSynch, UMDI, NDS, LDAP DS, Ph, mail routing, RADIUS, tigerlan, address books, webauthN, webauthZ, IMAP/POP mail, webmail, calendar, SMTP AUTH, white pages, mass messaging, dialup, wireless, FRS, misc actions.]


Nature of Directory Work

• Technology
  – Establish campus-wide services: name space, authentication
  – Build an enterprise directory service
  – Populate the directory from source systems
  – Enable applications to use the directory
• Policies and Politics
  – Clarify relationships between individuals and institution
  – Determine who manages, who can update and who can see common data
  – Structure information access and use rules between departments and central administrative units
  – Reconcile business rules and practices


Data Policy Issues

• Cross organizational data sharing
  – Enabling a centralized repository
  – Identifying authoritative sources
  – Building trust
• Privacy constraints – FERPA, HIPAA
• New procedures
• Security
• Auditability
• Accountability


Stage 1: Analyze Data Sources

• Common Identifiers on campus
• Identify systems of record and data owners
  – Determine data and data access needed
  – Determine frequency of the feed
  – Provide Standard Data Collection Model
• Define database load procedure and produce audit log


Definitions: Identifiers

Identifiers – your electronic identification
  – Multiple names and corresponding information in multiple places
  – Single unique identifier for each authorized user
  – Names and information in other systems can be cross-linked to it
    • Admin systems, library systems, building systems


Definitions: Authentication

Authentication – maps the physical you to an electronic identifier

– Password authentication most common

– Security need should drive authentication method

– Distance learning and inter-campus applications


Major campus identifiers

• UUID
• Student and/or emplid
• Person registry ID
• Account login ID
• Enterprise-LAN ID
• Student ID card
• Net ID
• Email address
• Library/departmental ID
• Publicly visible ID (and pseudo-SSN)
• Pseudonymous ID


General Identifier Characteristics

• Uniqueness (within a given context)
• Dumb vs intelligent (i.e. whether subfields have meaning)
• Readability (machine vs human vs device)
• Affordance (centrally versus locally provided)
• Resolver approach (how an identifier is mapped to associated object)
• Metadata (both associated with the assignment and resolution of an identifier)
• Persistence (permanence of relationship between identifier and specific object)


General Identifier Characteristics

• Granularity (the degree to which identifier denotes a collection or component)
• Format (checkdigits)
• Versions (can defining characteristics of identifier change over time)
• Capacity (size limitations imposed on the domain or object range)
• Extensibility (the capability to intelligently extend one identifier to be the basis for another identifier).


Important Characteristics

• Semantics and syntax - what it names and how it names it
• Domain - who issues it and over what space the identifier is unique
• Revocation - can the subject ever be given a different value for the identifier
• Reassignment - can the identifier ever be given to another subject
• Opacity - is the real world subject easily deduced from the identifier - privacy and use issues


Identifier Mapping Process

• Map campus identifiers against a canonical set of functional needs

• For each identifier, establish its key characteristics, including revocation, reassignment, privileges, and opacity

• Shine a light on some of the shadowy underpinnings of middleware

• A key first step towards the loftier middleware goals
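
One way to record the outcome of such a mapping exercise is a small table of identifier profiles. The sketch below is illustrative only: the class, the field names, and the answers given for each identifier are assumptions for a hypothetical campus, not values from the Internet2 template.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdentifierProfile:
    name: str
    issuer: str         # domain: who issues the identifier
    revocable: bool     # can the subject be given a different value?
    reassignable: bool  # can the value be given to another subject?
    opaque: bool        # True if the subject cannot be deduced from the value


# Illustrative answers for a hypothetical campus.
profiles = [
    IdentifierProfile("Person registry ID", "central IT", False, False, True),
    IdentifierProfile("Net ID", "central IT", True, True, False),
    IdentifierProfile("Email address", "central IT", True, True, False),
]


def safe_as_permanent_key(profile):
    """A key used to cross-link systems should never change or be reused."""
    return not profile.revocable and not profile.reassignable


permanent_keys = [p.name for p in profiles if safe_as_permanent_key(p)]
print(permanent_keys)
```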


Identifier Mapping Template

• Model Identifier Mapping and examples:

http://middleware.internet2.edu/earlyadopters/identifier-mappings/


Stage 1: Analyze Data Sources

• Common Identifiers on campus
• Identify systems of record and data owners/managers
  – Determine data and data access needed
  – Determine frequency of the feed/updates
  – Provide Standard Data Collection Model
• Define database load procedure and produce audit log


Cross Organizational Data Sharing

• Information gathering across silos
  – What are the systems of record? The authoritative source of the data?
  – Who are the owners/stewards/managers?
    • Centralized vs Distributed
  – Environment
    • Cooperative vs Competitive
  – Uncovering skeletons
  – Normalizing the data


Systems of Record

• Data (e.g., names, addresses) exist in multiple systems; which is authoritative?
• Individual can have several roles; which is primary?
  – Student and alum
  – Student and staff/teaching assistant
• How is maintenance, especially purge process, handled?
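
Once the authoritative system has been agreed for each data element, the decision can be written down as a simple lookup. This is a sketch under assumed names: the attributes, system names, and the choice of which system wins are hypothetical and would come out of negotiation with the data stewards.

```python
# Hypothetical outcome of the negotiation: one system of record per attribute.
AUTHORITATIVE_SOURCE = {
    "name": "registrar",
    "office_address": "hr",
}


def resolve(attribute, values_by_system):
    """Pick the value from the system of record and ignore the other copies."""
    source = AUTHORITATIVE_SOURCE[attribute]
    return values_by_system.get(source)


# The library holds a different spelling, but the registrar is authoritative.
name = resolve("name", {"registrar": "Samuel Roe", "library": "Sam Roe"})
print(name)
```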


Data Stewards/Managers

• Registrar
• Human Resources
• Alumni Records
• Library Records
• Schools and Colleges
• Telecommunications
• [Potentially, many] others


Requires Education and Communication with Data Stewards/Managers

• Need to understand data as a resource
• Need to understand the concept of authoritative data and be willing to collaborate
• Need to understand the value of data sharing and appropriate access
• Need to be reassured that proper security/privacy practices are being adhered to


Institutional Environment Impact

• Public vs. Private Institutions
• Institutional Vision vs. Local Control
• Change Readiness
• Strategic vs. Tactical Planning
• Role of IT
• Policy and Legal Constraints
• Resource Determination/Allocation


Institutional Environment: Organizational Culture/Structure

• Competitive or collaborative
  – Challenges “ownership”
  – Can feel disenfranchised
  – Anticipate clear needs and keep everyone on the same page = educate and communicate
• Willingness to change
  – Technical infrastructure
  – Formally or informally, organizational structure may need to change too


Institutional Environment: Policy and Legal Constraints

• Ownership of Data
  – Is data stewardship well-defined?
  – Is it centralized or distributed?
• Access to Data
  – Formally or loosely governed?
  – Access authority centralized or distributed?
• Data Administration
  – Centrally managed or distributed?
  – FERPA and HIPAA compliant?


Data Administration

• Definition: the development and application of formal rules and methods to the management of an institution’s data resource

• Management of any resource: establish policy and procedures and monitor compliance


University of Michigan Data Resource Management Policy

• Institutional data resource is a University asset
• Data resource will be safeguarded/protected
• Data will be shared based on institutional policies
• Data will be managed as an institutional resource
• Institutional data will be identified and defined
• Databases will be developed based on functional needs
• Information quality will be actively managed


University of Michigan Data Resource Guidelines

• Defines data management roles
• Introduces concept of “Institutional Database”
• Provides guidelines for: collection & maintenance, validation & correction, manipulation, modification, and reporting, security, access, data availability and integration, and documentation (includes data definitions and level of security)


University of Michigan Data Administration

• Philosophy: the value of data as an institutional resource is increased through widespread and appropriate use; the value is diminished through misuse, misinterpretation, or unnecessary restriction.

• University “owns” the data, stewardship is identified and maintained


Without Data Administration ... and/or high-level exec sponsorship

• the burden of data manager and data source identification and negotiation often falls to IT leadership

• requires lead time, energy, communication and negotiation skills, and continual education and communication


Approach

• Dependent on institutional environment
• Dependent on drivers
• Dependent on project methods (often related to environment)
  – Campus strategic project
  – Application requirement
  – Stealth


Primary Tasks to be Completed

• Select attributes/data for inclusion
• Negotiate for access to data
• Determine data access policy
• Develop familiarity with semantics of desired data elements
• Develop familiarity with business processes that maintain them
• Define database load procedure, with standard feeds, and produce audit log
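
The last task, a load procedure that produces an audit log, might look roughly like the following. It is a minimal sketch: the feed layout, the `load_feed` name, and the audit record fields are assumptions, and a production load would also need timestamps, error handling, and handling of removed records.

```python
def load_feed(directory, feed, source):
    """Apply a feed to the directory; return one audit entry per change."""
    audit = []
    for person_id, attrs in feed.items():
        entry = directory.setdefault(person_id, {})
        for name, new_value in attrs.items():
            old_value = entry.get(name)
            if old_value != new_value:
                entry[name] = new_value
                audit.append({
                    "id": person_id,
                    "attr": name,
                    "old": old_value,
                    "new": new_value,
                    "source": source,
                })
    return audit


# Hypothetical directory contents and a one-record feed from HR.
directory = {"1001": {"mail": "pat@old.example"}}
hr_feed = {"1001": {"mail": "pat@new.example"}}
audit_log = load_feed(directory, hr_feed, source="hr")
print(audit_log)
```

Because only real changes are logged, reloading the same feed yields an empty audit list, which makes repeated scheduled feeds easy to review.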


What Data is Needed?

• The object classes/schema and source data to populate directories are determined by the applications to be directory enabled.
• Common initial or early applications include white pages and email routing which require:
  – identifiers
  – directory information (name, addresses, phone numbers, email addresses, etc.) - found in standard directory schemas such as inetOrgPerson
  – eduPerson attributes


“Good” Practices for Attributes

• Use standard schemas: inetOrgPerson, eduPerson, localPerson

• Never repurpose fields defined as standards (RFC-defined). Add new attributes instead; adding attributes is easier than you might think
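
As an illustration of these practices, here is a directory entry modelled as a plain Python dictionary. The `localParkingPermit` attribute and the `local` naming prefix are hypothetical; the point is that campus-specific data goes in a new local attribute rather than in a repurposed standard field.

```python
# An entry combining standard object classes with a local auxiliary class.
entry = {
    "objectClass": ["inetOrgPerson", "eduPerson", "localPerson"],
    "cn": ["Sam Roe"],
    "sn": ["Roe"],
    "mail": ["sam@alpha-u.example"],
    "eduPersonPrimaryAffiliation": ["student"],
    # Hypothetical local attribute, added instead of reusing a standard field.
    "localParkingPermit": ["B-17"],
}


def local_attributes(directory_entry, prefix="local"):
    """List the attributes that are campus-defined, by naming convention."""
    return sorted(name for name in directory_entry if name.startswith(prefix))


print(local_attributes(entry))
```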


eduPerson

• A directory object class intended to support inter-institutional applications
• Fills gaps in traditional directory schema
• For existing attributes, states good practices where known
• Specifies several new attributes and controlled vocabulary to use as values
• Provides suggestions on how to assign values, but leaves it to the institution to choose
• Latest version released with NMI components in October, 2002


Upper Class Attributes Issues

• eduPerson inherits attributes from Person, inetOrgPerson

• Some of those attributes need conventions about controlled vocabulary (e.g. telephones)

• Some of those attributes need ambiguity resolved via a consistent interpretation (e.g. email address)

• Some of the attributes need standards around indexing and search (e.g. compound surnames)

• Many of those attributes need access control and privacy decisions (e.g. JPEG photo, email address, etc.)


eduPerson Attributes

• eduPersonAffiliation
• eduPersonEntitlement
• eduPersonNickname
• eduPersonOrgDN
• eduPersonOrgUnitDN
• eduPersonPrimaryAffiliation
• eduPersonPrimaryOrgUnitDN
• eduPersonPrincipalName


eduPersonAffiliation

• Multi-valued list of relationships an individual has with institution

• Controlled vocabulary includes: faculty, staff, student, alum, member, affiliate, employee

• Applications that use: Shibboleth digital libraries, Directory of Directories for Higher Education
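
A feed can be checked against the controlled vocabulary before it is loaded. The sketch below uses the seven values listed on this slide; the function name and the sample source values are illustrative.

```python
# Controlled vocabulary as listed on the slide.
AFFILIATION_VOCABULARY = {
    "faculty", "staff", "student", "alum", "member", "affiliate", "employee",
}


def invalid_affiliations(values):
    """Return the values that are not in the controlled vocabulary, sorted."""
    return sorted(set(values) - AFFILIATION_VOCABULARY)


# Source systems often use local codes that must be mapped before loading.
print(invalid_affiliations(["student", "undergrad", "TA"]))
```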


eduPersonPrimaryAffiliation

• Single-valued attribute that would be the status put on a name badge at a conference

• Controlled vocabulary includes: faculty, staff, student, alum, member, affiliate

• Determined by institutional business rules
• Applications that use: white pages, restricted access sites
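
One possible business rule is a fixed precedence order applied to a person's affiliations. The order below is purely an assumption for illustration; each institution would choose its own.

```python
# Hypothetical institutional precedence, highest priority first.
PRECEDENCE = ["faculty", "staff", "student", "alum", "affiliate", "member"]


def primary_affiliation(affiliations):
    """Return the highest-precedence affiliation, or None if none match."""
    for candidate in PRECEDENCE:
        if candidate in affiliations:
            return candidate
    return None


# A teaching assistant who is both student and staff gets one badge status.
print(primary_affiliation(["member", "student", "staff"]))
```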


Strategies

• Executive Dictate (overt or stealth)
• Data Administration
  – Fully functioning unit or philosophy itself
  – Data managers committee
• Education/communication/negotiation
  – Data administration concepts
  – Vignettes/scenarios (relevant to data manager)
  – Institutional drivers (external, internal, apps)
  – Case studies from other universities
  – NMI/Internet2 materials


Key Planning Recommendations

• Understand the institutional environment, including data policies and business rules, and the value of the enterprise directory to your institution
• Build in time to collect and map/resolve identifiers
• Allow considerable time upfront to work with/educate data stewards, possibly developing policy
• Think standards
• Be prepared for political wounds from the possible reduction of duchies in data and policies
• Give priority to both education and communication plans (continual and consistent)


Strategies You Used?

• Discussion
• Questions


More Information

• Middleware:
  – http://middleware.internet2.edu
  – http://www.nmi-edit.org
• My contact information:
  – [email protected]