The LongNow
The LongNow. Why FERPA? The Sequel: Key Problems Entity Resolution Regulatory Hurdles.

Dec 13, 2015

Page 1

The LongNow

Page 2

Why FERPA?

Page 3

The Sequel:

Page 4

Key Problems

Entity Resolution

Regulatory Hurdles

Page 5

Entity Resolution

LEA: 4907023 Jorge Castillo-Estrada 9/30/1997 M L 437659887

LEA: 6002007 George Castillo 9/30/1997 M L 906773502

Page 6

Name Counts

Student Count   First Name   Last Name
64              JOSHUA       SMITH
56              ASHLEY       SMITH
52              JESSICA      SMITH
48              JUSTIN       SMITH
37              ASHLEY       JONES
31              JUSTIN       WILLIAMS
30              JESSICA      JOHNSON
27              JOSHUA       BROWN

There are ~55,000 unique first names among students in Arkansas and ~40,000 last names.

Approximately 20% of Arkansas students share both the same first and last name with another student.
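A share like the one quoted above could be computed along the following lines. This is only a sketch: the function name `shared_name_fraction` and the sample roster are made up for illustration, and it assumes names are compared case-insensitively with no other cleanup.

```python
from collections import Counter

def shared_name_fraction(students):
    """Fraction of students whose (first, last) name pair occurs more than once.

    `students` is a list of (first_name, last_name) tuples (hypothetical layout).
    """
    counts = Counter((first.upper(), last.upper()) for first, last in students)
    shared = sum(n for n in counts.values() if n > 1)
    return shared / len(students)

# Made-up roster, not real data.
roster = [
    ("Joshua", "Smith"),
    ("JOSHUA", "SMITH"),
    ("Ashley", "Jones"),
    ("Justin", "Williams"),
    ("Ashley", "Smith"),
]
fraction = shared_name_fraction(roster)
```

In this toy roster two of the five students share a full name, so the fraction is 0.4.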

Page 7

More Data Issues

There are 4,026 students in Arkansas who share an SSN with at least one other student in the state.

Between August and January, 874 student transfers to other schools resulted in an SSN change.

Between August and January, an additional 1,018 students changed their SSN—we have records for only 300 of these changes.

Between August and January, 21,255 students moved to another district in the state—only 18,986 students were marked as “withdrawn.”
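One way to surface the first kind of issue (several students reporting the same SSN) is to group records by SSN and flag any group with more than one distinct student. The sketch below assumes a hypothetical record layout of (student_id, ssn) pairs and uses obviously fake values; it is not the method behind the counts above.

```python
from collections import defaultdict

def students_sharing_ssn(records):
    """Return the set of student ids whose SSN is also reported for another student."""
    by_ssn = defaultdict(set)
    for student_id, ssn in records:
        by_ssn[ssn].add(student_id)
    shared = set()
    for ids in by_ssn.values():
        if len(ids) > 1:  # the same SSN appears for more than one student
            shared |= ids
    return shared

# Fake records for illustration only.
records = [
    ("S1", "000-00-0001"),
    ("S2", "000-00-0001"),
    ("S3", "000-00-0002"),
    ("S3", "000-00-0002"),  # repeated record for the same student is not a conflict
]
flagged = students_sharing_ssn(records)
```

The same grouping idea applies to the transfer figures: 21,255 moves against 18,986 withdrawals leaves 2,269 moves with no matching "withdrawn" mark.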

Page 8

The Knowledge Base Approach

“Indicative” information from multiple data sources is stored and merged into an “equivalence class” for each entity, using both fuzzy and logical associations. Knowledge base identifiers are used to manage the references.

Bob Smith, Barton Elementary

Robert Smith, Barton Elementary

Bob Smith, Wilson Elementary

Fuzzy Match

Logical Match (Drop/Enroll)

Knowledge Base

Identifier   Representation
KB5765       Bob Smith, Barton
KB5765       Robert Smith, Barton
KB5765       Bob Smith, Wilson
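The merging of references into equivalence classes can be sketched with a union-find structure. Everything specific here is an assumption made for illustration: the nickname table, the 0.9 similarity threshold, the `event` field used for the drop/enroll rule, and the generated `KB0001`-style identifiers. The slides do not describe the actual matching rules.

```python
from difflib import SequenceMatcher

# Hypothetical nickname table; a real system would need a far larger one.
NICKNAMES = {"bob": "robert", "kate": "katherine"}

def normalize(name):
    first, _, rest = name.lower().partition(" ")
    return f"{NICKNAMES.get(first, first)} {rest}"

def fuzzy_match(a, b, threshold=0.9):
    """Same school and near-identical normalized names."""
    if a["school"] != b["school"]:
        return False
    ratio = SequenceMatcher(None, normalize(a["name"]), normalize(b["name"])).ratio()
    return ratio >= threshold

def logical_match(a, b):
    """Same name dropped at one school and enrolled at another (drop/enroll)."""
    return (a["name"] == b["name"] and a["school"] != b["school"]
            and {a["event"], b["event"]} == {"drop", "enroll"})

def assign_kb_ids(records):
    """Union-find over pairwise matches; returns one identifier per record."""
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            if fuzzy_match(records[i], records[j]) or logical_match(records[i], records[j]):
                parent[find(i)] = find(j)

    ids_by_root, result = {}, []
    for i in range(len(records)):
        root = find(i)
        if root not in ids_by_root:
            ids_by_root[root] = f"KB{len(ids_by_root) + 1:04d}"  # made-up id format
        result.append(ids_by_root[root])
    return result

records = [
    {"name": "Bob Smith", "school": "Barton Elementary", "event": "drop"},
    {"name": "Robert Smith", "school": "Barton Elementary", "event": "enroll"},
    {"name": "Bob Smith", "school": "Wilson Elementary", "event": "enroll"},
    {"name": "Erica Davis", "school": "Barton Elementary", "event": "enroll"},
]
kb_ids = assign_kb_ids(records)
```

The first two records are joined by the fuzzy rule and the first and third by the logical rule, so all three Smith references land in one class while the unrelated record gets its own identifier.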

Page 9

Two Agencies, Two Regulations

HIPAA FERPA

Page 10

A trusted broker maintains a cross reference table, encoding the identifiers for various agencies and for various representations of the entities.

Trusted Broker

Bob Smith AC0236 Robert Smith ED4297

ACHI ADE

Trusted Broker

Identifier   Representation         Identifier Encoded for ACHI   Identifier Encoded for ADE
KB5765       Bob Smith, Barton      AC0236                        ED4297
KB5765       Robert Smith, Barton   AC0236                        ED4297
KB5765       Bob Smith, Wilson      AC0236                        ED4297
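A cross-reference table of this kind might be held in a structure like the one below. This is a rough sketch, not the broker's actual design: the `TrustedBroker` class, its method names, and the choice of random four-digit codes are all assumptions, since the slides do not say how identifiers are encoded.

```python
import secrets

class TrustedBroker:
    """Holds the cross-reference table; agencies never see knowledge base identifiers."""

    def __init__(self):
        self._links = {}    # kb_id -> {agency: encoded_id}
        self._reverse = {}  # (agency, encoded_id) -> kb_id

    def encode(self, kb_id, agency, prefix):
        """Return the agency-specific code for kb_id, minting one on first use."""
        links = self._links.setdefault(kb_id, {})
        if agency not in links:
            code = f"{prefix}{secrets.randbelow(10_000):04d}"
            while (agency, code) in self._reverse:  # retry on collision
                code = f"{prefix}{secrets.randbelow(10_000):04d}"
            links[agency] = code
            self._reverse[(agency, code)] = kb_id
        return links[agency]

    def translate(self, code, source, target):
        """Map one agency's code to another agency's code for the same entity."""
        kb_id = self._reverse[(source, code)]
        return self._links[kb_id][target]

broker = TrustedBroker()
achi_code = broker.encode("KB5765", "ACHI", "AC")
ade_code = broker.encode("KB5765", "ADE", "ED")
```

Because the codes are random rather than derived from the knowledge base identifier, neither agency can recover that identifier, or the other agency's code, without going through the broker.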

Page 11

Encoded Links

The trusted broker can provide multiple agencies with encoded versions of the (hidden) knowledge base identifiers, protecting all future data requests.

ACHI

Bob Smith AC0236
Robert Smith AC0236
Bob Smith AC0236
Katherine Johns AC0651
Kate Sanders AC0651
Erica Davis AC1327

ADE

ED4297 Bob Smith
ED4297 Robert Smith
ED4297 Bob Smith
ED8516 Katherine Johns
ED8516 Kate Sanders
ED3508 Erica Davis-Hill

Trusted Broker

Page 12

Data Requests

The trusted broker translates encoded links between agencies for data requests, and no personally identifying information needs to be exchanged.

ACHI ADE

What are the test scores for the following students?
AC0236
AC0651
AC1327

ED3508 Score: 385
ED4297 Score: 242
ED8516 Score: 417

Trusted Broker

AC0236 ↔ ED4297
AC0651 ↔ ED8516
AC1327 ↔ ED3508

Brokered Result 1
AC0236 Score: 242
AC0651 Score: 417
AC1327 Score: 385

Brokered Result 2
Score: 242
Score: 385
Score: 417

Brokered Result 3
Average Score: 348
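The request flow on this slide can be traced in a few lines. The identifiers and scores below are the ones shown on the slide; the function names and the in-memory dictionaries are hypothetical stand-ins for whatever the broker and ADE actually run.

```python
# Cross-reference held only by the broker (pairs taken from the slide).
ACHI_TO_ADE = {"AC0236": "ED4297", "AC0651": "ED8516", "AC1327": "ED3508"}

def ade_lookup_scores(ade_ids):
    """Stand-in for ADE's side: scores keyed by ADE-encoded identifiers."""
    scores = {"ED3508": 385, "ED4297": 242, "ED8516": 417}
    return {ade_id: scores[ade_id] for ade_id in ade_ids}

def brokered_request(achi_ids):
    """Translate ACHI codes to ADE codes, query ADE, and relabel the answer."""
    ade_ids = [ACHI_TO_ADE[achi_id] for achi_id in achi_ids]
    ade_scores = ade_lookup_scores(ade_ids)
    # Results go back keyed by the requester's own codes; no names are exchanged.
    return {achi_id: ade_scores[ACHI_TO_ADE[achi_id]] for achi_id in achi_ids}

result = brokered_request(["AC0236", "AC0651", "AC1327"])
```

Here `result` pairs each ACHI code with its score, which corresponds to Brokered Result 1.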

Page 13

Result Options

The trusted broker may deliver results between agencies in a variety of ways, without exchanging personally identifying information.

Trusted Broker

Brokered Result 1
AC0236 Score: 242
AC0651 Score: 417
AC1327 Score: 385
Individual level results with encoded links (safe, encoded)

Brokered Result 2
Score: 242
Score: 385
Score: 417
Individual level results without links, random (safe, anonymous)

Brokered Result 3
Average Score: 348
Aggregated results (safe, anonymous)
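The three delivery options can be illustrated starting from the linked result. This is a sketch with hypothetical function names; shuffling is used here as one plausible reading of "without links, random".

```python
import random
from statistics import mean

# Linked result as shown on the slide (ACHI-encoded identifier -> score).
linked = {"AC0236": 242, "AC0651": 417, "AC1327": 385}

def with_encoded_links(result):
    """Option 1: individual-level rows, still keyed by encoded identifiers."""
    return dict(result)

def without_links(result, rng=None):
    """Option 2: individual-level scores only, in random order."""
    scores = list(result.values())
    (rng or random).shuffle(scores)
    return scores

def aggregated(result):
    """Option 3: a single summary statistic."""
    return mean(result.values())

option_1 = with_encoded_links(linked)
option_2 = without_links(linked, random.Random(0))
option_3 = aggregated(linked)
```

The mean of 242, 417, and 385 is 348, matching the aggregated figure on the slide. Option 2 carries the same three scores as option 1 but with no identifier to tie any score back to a student.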