New York State Center of Excellence in Bioinformatics & Life Sciences, R T U. Tracking Referents (based on OIC, December 1, 2006). Barry SMITH and Werner CEUSTERS, Center of Excellence in Bioinformatics and Life Sciences, University at Buffalo, NY, USA. http://www.org.buffalo.edu/RTU
Tracking Referents
(based on OIC, December 1, 2006)
Barry SMITH and Werner CEUSTERS
Center of Excellence in Bioinformatics and Life Sciences
University at Buffalo, NY, USA
http://www.org.buffalo.edu/RTU
Representational artifacts
classified according to the sort of entities they are about
A realist view of the world
• The world consists of entities that can be divided according to three dichotomies
– entities that are
• Either particulars or universals;
• Either occurrents or continuants;
• Either dependent or independent;
– together with relations between these entities
• <particular, universal> e.g. is-instance-of,
• <particular, particular> e.g. is-member-of
• <universal, universal> e.g. is_a (is-subtype-of)
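The relation signatures listed above can be sketched as a small type check. This is only an illustration: the names Generality, Entity and relation_is_well_typed are assumptions introduced here, and only the particular/universal dichotomy is modelled.

```python
from dataclasses import dataclass
from enum import Enum


class Generality(Enum):
    PARTICULAR = "particular"
    UNIVERSAL = "universal"


# Kinds of entity linked by each example relation, as listed on the slide.
RELATION_SIGNATURES = {
    "is-instance-of": (Generality.PARTICULAR, Generality.UNIVERSAL),
    "is-member-of": (Generality.PARTICULAR, Generality.PARTICULAR),
    "is_a": (Generality.UNIVERSAL, Generality.UNIVERSAL),
}


@dataclass(frozen=True)
class Entity:
    name: str
    generality: Generality


def relation_is_well_typed(relation: str, subject: Entity, obj: Entity) -> bool:
    """Check that subject and obj are the kinds of entity the relation expects."""
    return (subject.generality, obj.generality) == RELATION_SIGNATURES[relation]


enola_gay = Entity("Enola Gay", Generality.PARTICULAR)
airplane = Entity("airplane", Generality.UNIVERSAL)
print(relation_is_well_typed("is-instance-of", enola_gay, airplane))
```

A particular can stand in is-instance-of to a universal, but the same pair would be rejected for is_a, which holds only between universals.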
A realist view of the world (1)
[Figure: universals/types (airplane, philosopher, airport, president) and instances/particulars (Enola Gay, Barry Smith, JFK, George Bush), linked by "instance of"]
A realist view of the world (2)
[Figure: continuants (Enola Gay, Barry Smith, JFK, George Bush) and occurrents (flying, meeting) along a time axis t]
A realist view of the world (3)
[Figure: universals (philosopher, president, child, adult) and particulars (Barry Smith, George Bush) along a time axis t, linked by "Instance-at t"]
Inadequate representational units
• “JFK” “Enola Gay”
• “Barry Smith” “George Bush”
Proposed Solution: Referent Tracking
• Purpose:
– explicit reference to the concrete individual entities relevant to the accurate description of a scene
[Cartoon caption: "Now! That should clear up a few things around here!"]
Ceusters W, Smith B. Strategies for Referent Tracking in Electronic Health Records. J Biomed Inform. 2006 Jun;39(3):362-78.
78
Numbers instead of words
• Method:
– Introduce an Instance Unique Identifier (IUI) for each relevant particular (individual) entity
Essentials of Referent Tracking
• generating universally unique identifiers;
• deciding which particulars should receive a IUI;
• finding out whether or not a particular has already been assigned a IUI (each particular should receive at most one IUI);
• using IUIs to make statements;
• determining the truth values of statements in which IUIs are used;
• correcting errors in the assignment of IUIs.
IUI generation
• Universally Unique IDs:
– recently standardized through ISO/IEC 9834-8:2004,
– specifies format and generation rules enabling users to produce 128-bit identifiers that are either guaranteed or have a high probability of being globally unique
– Meaningless strings
– Central management or certification not needed to guarantee uniqueness
• (But use as IUI requires this)
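As a minimal sketch of the generation step only, Python's standard uuid module produces 128-bit identifiers of this kind. The function name new_iui is an assumption made here. The sketch covers neither registration nor the central management that use as a IUI requires.

```python
import uuid


def new_iui() -> str:
    """Return a fresh random (version 4) UUID string as a candidate IUI."""
    return str(uuid.uuid4())


candidate = new_iui()
parsed = uuid.UUID(candidate)
# The identifier carries no meaning; it is 16 bytes (128 bits) long.
print(candidate, parsed.version, len(parsed.bytes) * 8)
```

A random UUID is unique only with high probability, which matches the "guaranteed or high probability" wording above.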
IUI assignment
• = an act carried out by the first ‘cognitive agent’ who recognizes the need to acknowledge the existence of a particular it has information about by labeling it with a IUI.
• ‘cognitive agent’:
– A person;
– An organisation;
– A device or software agent, e.g.
• Bank note printer
• Image analysis software
Criteria for IUI assignment (1)
1. Different for continuants and for occurrents
2. The continuant is in front of you, you can see it, photograph it
– The photograph gets a IUI; your act (occurrent) of taking the photo gets a IUI
3. The occurrent occurs in your presence, you can make a video
– The video gets a IUI; your act (occurrent) of taking the video gets a IUI
4. When assigning a IUI you may not know exactly what the particular is (which type it instantiates)
Criteria for IUI assignment (2)
2. The particular’s existence ‘may not already have been determined as the existence of something else’:
• Morning star and evening star
• Himalaya: 2 observers not knowing they observed the same thing
3. May not have already been assigned a IUI.
4. It must be relevant to do so:
• Personal decision, (scientific) community guideline, ...
• Possibilities offered by the EHR system
• If a IUI has been assigned by somebody, everybody else making statements about the particular should use it
Assertion of assignments
• IUI assignment is an act whose execution has to be asserted in the IUI-repository:
– <da, Ai, td>
• da: IUI of the registering agent
• Ai: the assertion of the assignment <pa, pp, tap, c>
» pa: IUI of the author of the assertion
» pp: IUI of the particular
» tap: time of the assignment
» c: optional description for identification
• td: time of registering Ai in the IUI-repository
• Neither td nor tap gives any information about when #pp started to exist. This might be asserted in statements providing information about #pp.
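The two nested tuples above can be sketched as plain records. The field names follow the slide. The class names, the in-memory IUIRepository and its register method are assumptions for illustration, not the authors' system.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Optional


@dataclass(frozen=True)
class AssignmentAssertion:
    """Ai = <pa, pp, tap, c>."""
    pa: str                    # IUI of the author of the assertion
    pp: str                    # IUI of the particular
    tap: datetime              # time of the assignment
    c: Optional[str] = None    # optional description for identification


@dataclass(frozen=True)
class RegisteredAssignment:
    """<da, Ai, td> as stored in the IUI-repository."""
    da: str                    # IUI of the registering agent
    ai: AssignmentAssertion
    td: datetime               # time of registering Ai


class IUIRepository:
    """Hypothetical in-memory repository, keyed by the assigned IUI."""

    def __init__(self) -> None:
        self._entries: Dict[str, RegisteredAssignment] = {}

    def register(self, da: str, ai: AssignmentAssertion,
                 td: Optional[datetime] = None) -> RegisteredAssignment:
        # Refuse to register the same IUI twice.
        if ai.pp in self._entries:
            raise ValueError(f"IUI {ai.pp} is already registered")
        when = td if td is not None else datetime.now(timezone.utc)
        entry = RegisteredAssignment(da=da, ai=ai, td=when)
        self._entries[ai.pp] = entry
        return entry
```

The sketch can only detect that the same IUI is registered twice. Detecting that one particular has received two different IUIs needs information about the particular itself, which is what the assignment criteria address.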
PTP statements – particular to particular
• ordered sextuples of the form <sa, ta, r, o, P, tr>
sa is the IUI of the author of the statement,
ta a reference to the time when the statement is made,
r a reference to a relationship (available in o) obtaining between the particulars referred to in P,
o a reference to the ontology from which r is taken,
P an ordered list of IUIs referring to the particulars between which r obtains, and,
tr a reference to the time at which the relationship obtains.
• P contains as many IUIs as required by the arity of r. In most cases, P will be an ordered pair such that r obtains between the particular represented by the first IUI and the one referred to by the second IUI.
• As with A statements, these statements must also be accompanied by a meta-statement capturing when the sextuple became available to the referent tracking system.
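A sextuple of this form maps naturally onto a record with an arity check on P. In this sketch the class name, the RELATION_ARITY table and all identifiers in the example are hypothetical. Times are kept as opaque references (strings), as the slide speaks of "a reference to the time".

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PtoPStatement:
    """<sa, ta, r, o, P, tr>."""
    sa: str               # IUI of the author of the statement
    ta: str               # reference to the time the statement is made
    r: str                # reference to a relationship available in o
    o: str                # reference to the ontology from which r is taken
    P: Tuple[str, ...]    # ordered IUIs of the related particulars
    tr: str               # reference to the time at which r obtains


# Hypothetical arity table; in practice this would come from the ontology o.
RELATION_ARITY = {"is-member-of": 2}


def has_required_arity(stmt: PtoPStatement) -> bool:
    """P must contain as many IUIs as the arity of r requires."""
    return len(stmt.P) == RELATION_ARITY[stmt.r]


stmt = PtoPStatement(sa="IUI-author", ta="t5", r="is-member-of",
                     o="example-ontology", P=("IUI-1", "IUI-2"), tr="t3")
print(has_required_arity(stmt))
```

The order of P matters: here r is asserted to obtain between the particular denoted by IUI-1 and the one denoted by IUI-2, not the reverse.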
PTCL statements – particular to class
<sa, ta, inst, o, p, cl, tr>
sa is the IUI of the author of the statement,
ta a reference to the time when the statement is made,
inst a reference to an instance relationship available in o obtaining between p and cl,
o a reference to the ontology from which inst and cl are taken,
p the IUI referring to the particular whose inst relationship with cl is asserted,
cl the class in o to which p enjoys the inst relationship, and,
tr a reference to the time at which the relationship obtains.
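The same record style works for these seven-place statements. This sketch assumes hypothetical identifiers throughout. The helper classes_at is an illustration introduced here: it looks up which classes a particular was asserted to instantiate at a given time reference.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class PtoCLStatement:
    """<sa, ta, inst, o, p, cl, tr>."""
    sa: str     # IUI of the author of the statement
    ta: str     # reference to the time the statement is made
    inst: str   # reference to an instance relationship available in o
    o: str      # reference to the ontology from which inst and cl are taken
    p: str      # IUI of the particular
    cl: str     # class in o that p is asserted to instantiate
    tr: str     # reference to the time at which the relationship obtains


def classes_at(p: str, tr: str, statements: Iterable[PtoCLStatement]) -> List[str]:
    """Classes that particular p is asserted to instantiate at time reference tr."""
    return sorted({s.cl for s in statements if s.p == p and s.tr == tr})


statements = [
    PtoCLStatement("IUI-author", "t9", "instance-of", "example-ontology", "IUI-0001", "child", "t1"),
    PtoCLStatement("IUI-author", "t9", "instance-of", "example-ontology", "IUI-0001", "adult", "t2"),
]
print(classes_at("IUI-0001", "t1", statements))
```

Because tr is part of every statement, the same particular can be an instance of different classes at different times without contradiction.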
Other Advantages
• mapping as by-product of tracking
– Descriptions about the same particular using different ontologies/concept-based systems
• Quality control of ontologies and concept-based systems
– Systematic “inconsistent” descriptions within or across terminologies may indicate poor definition of the respective terms
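One way to picture mapping as a by-product: if descriptions from different systems share a IUI, terms that repeatedly co-describe the same particulars become candidate correspondences. The function name, the tuple layout and the sample data below are assumptions for illustration only.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Iterable, Tuple

Term = Tuple[str, str]  # (ontology, class)


def candidate_mappings(descriptions: Iterable[Tuple[str, str, str]]) -> Dict[Tuple[Term, Term], int]:
    """Count how often two terms from different ontologies describe the same IUI."""
    by_particular = defaultdict(set)
    for iui, ontology, cls in descriptions:
        by_particular[iui].add((ontology, cls))

    counts: Dict[Tuple[Term, Term], int] = defaultdict(int)
    for terms in by_particular.values():
        for a, b in combinations(sorted(terms), 2):
            if a[0] != b[0]:  # only pair terms from different ontologies
                counts[(a, b)] += 1
    return dict(counts)


descriptions = [
    ("IUI-1", "O1", "fracture"), ("IUI-1", "O2", "broken bone"),
    ("IUI-2", "O1", "fracture"), ("IUI-2", "O2", "broken bone"),
    ("IUI-3", "O1", "fracture"),
]
print(candidate_mappings(descriptions))
```

High counts suggest a correspondence between terms. Pairs that co-occur where the terms should be incompatible are the kind of systematic "inconsistent" description that may point to a poorly defined term.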
Dynamic aspects
Accept that everything may change:
1. changes in the underlying reality:
• Particulars and universals come and go
2. changes in our (scientific) understanding:
• The planet Vulcan does not exist
3. reassessments of what is considered to be relevant for inclusion (notion of purpose).
4. encoding mistakes introduced during data entry or ontology development.
Reality versus beliefs, both in evolution
[Figure: Reality (U1, U2, p3) and Belief (O-#0, O-#1, O-#2, IUI-#3) shown evolving along a time axis t]
[Link symbol] = “denotes” = what constitutes the meaning of representational units. Therefore: O-#0 is meaningless
An “optimal” representational artifact (2)
• Each representational unit in such a representational artifact would designate
– (1) a single portion of reality (POR), which is
– (2) relevant to its purposes and such that
– (3) the authors intended to use this representational unit to designate this POR, and
– (4) there would be no PORs objectively relevant to these purposes that are not referred to in the representational artifact.
Sources of error
• assertion errors: sources may be in error as to what is the case in their target domain;
• relevance errors: sources and analysts may be in error as to what is objectively relevant to a given purpose;
• encoding errors: they may not successfully encode their underlying cognitive representations, so that particular representational units fail to point to the intended PORs.
Key requirement for updating
Any change in an ontology or data repository should be associated with the reason for that change, so that it can be assessed later what kind of mistake has been made!
Example: a person’s gender
• In John Smith’s EHR:
– At t1: “male”; at t2: “female”
• What are the possibilities?
• Change in reality:
• transgender surgery
• change in legal self-identification
• Change in understanding: it was female from the very beginning but interpreted wrongly
• Correction of data entry mistake (was understood as male, but wrongly transcribed)
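The requirement that every change carry its reason can be sketched as a change record with a mandatory reason field. The enum values paraphrase the four kinds of change listed earlier. The class names, field names and the helper old_value_was_mistaken are assumptions introduced for this illustration.

```python
from dataclasses import dataclass
from enum import Enum


class ChangeReason(Enum):
    REALITY_CHANGED = "change in the underlying reality"
    UNDERSTANDING_CHANGED = "change in understanding"
    RELEVANCE_REASSESSED = "reassessment of relevance"
    ENCODING_MISTAKE_CORRECTED = "correction of an encoding mistake"


@dataclass(frozen=True)
class Change:
    subject: str           # IUI of the particular the entry is about
    attribute: str
    old_value: str
    new_value: str
    at: str                # reference to the time of the change
    reason: ChangeReason   # mandatory: no change without a reason


def old_value_was_mistaken(change: Change) -> bool:
    """True when the reason implies the earlier entry was wrong when it was made."""
    return change.reason in (ChangeReason.UNDERSTANDING_CHANGED,
                             ChangeReason.ENCODING_MISTAKE_CORRECTED)


surgery = Change("IUI-john-smith", "gender", "male", "female", "t2",
                 ChangeReason.REALITY_CHANGED)
typo = Change("IUI-john-smith", "gender", "male", "female", "t2",
              ChangeReason.ENCODING_MISTAKE_CORRECTED)
print(old_value_was_mistaken(surgery), old_value_was_mistaken(typo))
```

The two records hold identical old and new values. Only the recorded reason tells a later reader whether the entry at t1 was correct for its time.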
A realism-based metric for data quality
• Must be able to deal with a variety of problems by which matching endeavors thus far have been affected
– different authors may have different though still veridical views on the same portion of reality,
– authors may make mistakes,
• when interpreting reality, or
• when formulating their interpretations in their chosen representation language
– a matcher can never be sure to what the expressions in a repository actually refer (no God’s eye perspective),
– if two ontologies are developed at different times, reality itself may have changed in the intervening period.
An example: merging data from two sources
Reality exists before any observation
[Figure: R]
And also most structures in reality are there in advance
The author of O1 acknowledges the existence of some Portion Of Reality (POR)
[Figure: R, B1]
Some portions of reality escape his attention.
He considers only some of them relevant for O1, and thus represents only part, here with Int = R+.
[Figure: R, B1, O1, #1, RU1B1, RU1O1]
• Both RU1B1 and RU1O1 are representational units referring to #1;
• RU1O1 is NOT a representation of RU1B1;
• RU1O1 is created through concretization of RU1B1 in some medium.
Similarly concerning the author of O2
[Figure: R, B1, B2, O1, O2]
Creation of the mapping
[Figure: R, B1, B2, O1, O2, Om]
Two (out of many other) possible configurations
#1 was not considered to be relevant for O2, but is considered to be relevant for Om.
The author of O1 made an encoding mistake, so that his ontology contains a reference to a non-intended referent, and this is copied into Om.
[Figure: two diagrams, each showing R, B1, B2, O1, O2 and Om]
Typology of expressions included in and excluded from an ontology in light of relevance and relation to external reality
Valid presence in the representation
Valid absence in the representation
Unjustified presence in the representation
Unjustified absence in the representation
But sometimes you get lucky …
The original beliefs are usually not accessible
[Figure: R, O1, O2 and Om; B1 and B2 appear in the first version of the slide and are absent from the second]
• But if the ontologies are well documented and representations intelligible, then many such beliefs can be inferred, and mistakes found.
For concept-based systems, there is also no reality
[Figure: R, O1, O2, Om]
But whatever must hold if both ontologies are believed to be right can be believed to mirror reality
[Figure: O1, Om, O2]
The principle of forced backward belief
[Figure: O1, Om, O2]
A lot of information loss
A decision support tool for dealing with inconsistencies?
• O1:
– Holds that penguins are birds, birds fly
• O2:
– Holds that penguins are birds, penguins don’t fly
• The problem for Om:
– Which source ontology to believe?
– What might be the source of the inconsistency?
• O1 is right and penguins do fly
• O1 is wrong and either penguins are not birds or not all birds fly
• Both are right but the representational units ‘penguin’, ‘bird’ and ‘fly’ do not refer to the same entities in reality.
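The penguin case can be reproduced with a toy merge that only flags the clash and leaves the decision to a human. This is a sketch under strong assumptions: "birds fly" is read as holding for every subtype, each type has at most one supertype, and the dictionary layout and function names are invented here.

```python
from typing import Dict, List, Set, Tuple


def ancestors(term: str, is_a: Dict[str, str]) -> List[str]:
    """Supertypes of term, nearest first (single inheritance assumed)."""
    found: List[str] = []
    while term in is_a:
        term = is_a[term]
        if term in found:  # guard against cyclic is_a data
            break
        found.append(term)
    return found


def find_conflicts(is_a: Dict[str, str], can_fly: Set[str],
                   cannot_fly: Set[str]) -> List[Tuple[str, str]]:
    """Pairs (term, source): term is said not to fly, yet flying is asserted for source."""
    conflicts = []
    for term in sorted(cannot_fly):
        for source in [term] + ancestors(term, is_a):
            if source in can_fly:
                conflicts.append((term, source))
    return conflicts


o1 = {"is_a": {"penguin": "bird"}, "can_fly": {"bird"}, "cannot_fly": set()}
o2 = {"is_a": {"penguin": "bird"}, "can_fly": set(), "cannot_fly": {"penguin"}}

merged_is_a = {**o1["is_a"], **o2["is_a"]}
conflicts = find_conflicts(merged_is_a,
                           o1["can_fly"] | o2["can_fly"],
                           o1["cannot_fly"] | o2["cannot_fly"])
print(conflicts)
```

Each source ontology is consistent on its own; the clash appears only in the merge. The sketch cannot tell which of the three explanations listed above applies.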
Possible evolutions through updates
Example: a relevant entity ceases to exist, but the representation is not updated:
Updating is an active process
• authors assume in good faith that
– all included representational units are of the P+1 type, and
– all those they are aware of, but did not include, are of type A+1 or A+2.
• If they become aware of a mistake, they make a change under the assumption that their changes are also towards the P+1, A+1, or A+2 cases.
• Thus at that time, they know of what type the previous entry must have been, given what they believe the current one to be, and the reason for the change.
This leads to a calculus …
• NOT:
– to demonstrate how good an individual version of an ontology is,
• But rather
– to measure how much it improved (hopefully) as compared to its predecessors.
• Principle: recursive belief revision
Backward belief revision over time
Reality: a POR exists and is not relevant
• At time t, an analyst correctly perceives the existence of some particular, but considers it relevant while it isn’t, and he makes an encoding error such that the representational unit does not refer.
• There is thus a -2 error with respect to reality, but this remains, of course, unknown.
[Figure: table with headings R and P; beliefs column "At t about t"; value shown: -2]
Backward belief revision over time
Reality: a POR exists and is not relevant
• At t+1, he corrects the encoding mistake, which forces him to believe that at t, the unit-reality configuration was of type P-4 rather than P+1.
[Figure: table with headings R and P; beliefs columns "At t about t", "At t+1 about t+1", "At t+1 about t"; value shown: -2]
Backward belief revision over time
Reality: a POR exists and is not relevant
• Although he believes that the current situation is P+1, it is in reality P-6, where it was P-7 before.
• The real error is now -1, while the perceived error with respect to t is also -1
[Figure: table with headings R and P; beliefs columns "At t about t", "At t+1 about t+1", "At t+1 about t"; values shown: -2, -1, -1]
Backward belief revision over time
Reality: a POR exists and is not relevant
• At t+2, he believes that the posited POR in fact does not exist
[Figure: table with headings R and P; beliefs columns "At t about t", "At t+1 about t+1", "At t+1 about t"; values shown: -2, -1, -1]
Backward belief revision over time
Reality: a POR exists and is not relevant
[Figure: table with headings R and P; beliefs columns "At t about t", "At t+1 about t+1", "At t+1 about t", "At t+2 about t+2", "At t+2 about t+1", "At t+2 about t"; values shown: -2, -1, -1, -1, -3, -5]
Conclusion
• Realist ontology is a powerful quality assurance tool for building high quality ontologies AND high quality databases;
• Referent tracking, based on realist ontology, is a means to remove the ambiguity in data that cannot be solved by realist ontology alone;
– It is a form of “adult” annotation
• Application of RT requires a globally accessible repository
• The use of “meaningless” IUIs allows very strict safety and