An intelligent tutoring system for visual classification problem solving
Rebecca S. Crowley a,b,c,*, Olga Medvedeva a

Artificial Intelligence in Medicine (2006) 36, 85-117
http://www.intl.elsevierhealth.com/journals/aiim

a Center for Pathology Informatics, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
b Center for Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
c Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA, USA

Received 24 May 2004; received in revised form 31 December 2004; accepted 10 January 2005

KEYWORDS: Intelligent tutoring systems; Knowledge-based systems; Cognitive tutoring systems; Classification problem solving; Ontologies
Summary

Objective: This manuscript describes the development of a general intelligent tutoring system for teaching visual classification problem solving.

Materials and methods: The approach is informed by cognitive theory, previous empirical work on expertise in diagnostic problem-solving, and our own prior work describing the development of expertise in pathology. The architecture incorporates aspects of cognitive tutoring system and knowledge-based system design within the framework of the unified problem-solving method description language component model. Based on the domain ontology, domain task ontology and case data, the abstract problem-solving methods of the expert model create a dynamic solution graph. Student interaction with the solution graph is filtered through an instructional layer, which is created by a second set of abstract problem-solving methods and pedagogic ontologies, in response to the current state of the student model.

Results: In this paper, we outline the empirically derived requirements and design principles, describe the knowledge representation and dynamic solution graph, detail the functioning of the instructional layer, and demonstrate two implemented interfaces to the system.

Conclusion: Using the general visual classification tutor, we have created SlideTutor, a tutoring system for microscopic diagnosis of inflammatory diseases of skin.

© 2005 Elsevier B.V. All rights reserved.
* Corresponding author at: Center for Pathology Informatics, University of Pittsburgh Medical Center, Shadyside, Cancer Pavilion, Room 307, 5230 Centre Avenue, Pittsburgh, PA 15232, USA. Tel.: +1 412 623 1752; fax: +1 412 647 5380.
E-mail address: [email protected] (R.S. Crowley).

0933-3657/$ — see front matter © 2005 Elsevier B.V. All rights reserved.
doi:10.1016/j.artmed.2005.01.005
1. Background
Knowledge-based systems (KBS) have a long history of use in medical decision support [1], but are rarely designed specifically for educating health professionals [2]. One reason for this may be the absence of general architectures or frameworks for incorporating medical knowledge bases in larger instructional systems. In this paper, we describe our adaptation of the intelligent tutoring system (ITS), a well-studied and successful paradigm for creating intelligent educational systems. The advantage of the ITS paradigm is that it provides a basic pedagogic approach with proven efficacy in domains outside of medicine [3-6]. The disadvantage of the ITS paradigm is that it was not designed to use large, frequently changing, or existing knowledge bases. In response to this, we have incorporated aspects of both ITS and KBS design to create a general architecture for instruction in visual classification problem solving. This architecture has three important advantages: (1) it modularizes all domain and pedagogic knowledge into ontologies, making the system itself domain-neutral and general; (2) it preserves all of the major pedagogic features associated with cognitive tutoring systems, a highly effective subtype of ITS; and (3) it allows for significant flexibility in the pedagogic components, more flexibility than is generally achieved in current ITS.

1 The relative effect size of an educational intervention can be described in relationship to the effect of conventional classroom learning. An intervention which raises performance by two standard deviations over classroom learning is described as having a "2-sigma" effect size.
1.1. Intelligent tutoring systems
Intelligent tutoring systems are adaptive, instructional systems that strive to emulate the well-known benefits of one-on-one tutoring when compared to other instructional methods. Model tracing ITS (MTITS) are a subtype of ITS in which expertise is represented as a set of production rules that guide the student through the problem space, correcting errors in the intermediate steps and offering hints specific to the current problem state [7]. Model tracing ITS are thus well suited to complex, multi-step problems, in which identifiable steps can be described as "on the solution path" or "off the solution path" at any given time. Cognitive intelligent tutoring systems (CITS), a subtype of MTITS, incorporate domain-specific production rules that are based on a cognitive theory of skill acquisition, such as ACT-R [8]. Often, the intermediate cognitive steps are first identified using empirical methods such as cognitive task analysis [9]. Most MTITS and CITS have been developed for domains that are highly procedural and do not require substantial declarative knowledge bases. Examples include mathematics and science instruction [4,6], flight simulator training [10], and training in the workplace [5]. These tutors are among the most rigorously evaluated ITS, and have been shown to be highly effective in increasing student performance. The instructional "gold standard" is considered to be one-on-one human tutoring, which is associated with a 2-sigma effect over classroom learning1 [11]. Cognitive Tutors and other MTITS have been shown to bring students 1-sigma or more above standard instructional methods [4-7]. In contrast, meta-analyses of many traditional computer-assisted instruction systems have exhibited only a 0.3-0.5 sigma effect [12,13].
Only a handful of medical ITS, of any type, have been developed [14-19], and very few of these have been evaluated. The GUIDON project [14,20-23] extensively explored the development of KBS for teaching classification problem solving. Knowledge-based tutors were envisioned as a way to provide instruction in domains with significant declarative knowledge requirements embedded with procedural skills. Many medical domains and tasks share these characteristics. GUIDON used MYCIN's rule set to teach medical students to reason about causative organisms in infectious meningitis and bacteremia given a patient's history, physical examination, and laboratory results. Before a case was presented, consultation with MYCIN was used to generate an AND/OR tree representing Goals (OR nodes) and Rules (AND nodes). GUIDON then used the AND/OR tree to structure the discussion with the student. GUIDON interacted with the student using a mixed-initiative method of dialogue. Students could assert, specify, and hypothesize, and GUIDON would respond. But GUIDON could also redirect student attention to other problem aspects. Unlike MTITS and CITS, GUIDON made no attempt to limit student options in the problem space or guide them towards the most efficient solution.
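The goal/rule structure described above can be illustrated with a minimal sketch. This is not GUIDON's actual implementation, and the rules and findings below are hypothetical; it only shows the evaluation pattern: a goal (OR node) holds if any rule concluding it fires, and a rule (AND node) fires only if all of its premises hold.

```python
# Minimal AND/OR tree evaluator in the spirit of the GUIDON goal/rule
# structure (hypothetical rules and findings, not MYCIN's actual rule set).

def satisfied(node, findings, rules):
    """A goal (OR node) holds if any rule concluding it fires;
    a rule (AND node) fires only if all of its premises hold."""
    if node in findings:                      # leaf: directly observed datum
        return True
    concluding = [r for r in rules if r["conclusion"] == node]
    # OR over the rules concluding this goal, AND over each rule's premises
    return any(all(satisfied(p, findings, rules) for p in r["premises"])
               for r in concluding)

rules = [
    {"premises": ["fever", "stiff_neck"], "conclusion": "meningitis"},
    {"premises": ["meningitis", "gram_negative_rods"],
     "conclusion": "bacterial_meningitis"},
]

print(satisfied("bacterial_meningitis",
                {"fever", "stiff_neck", "gram_negative_rods"}, rules))  # True
```

In GUIDON the tree was precomputed by running a MYCIN consultation before the case was presented; the tutor then walked this structure during the dialogue rather than re-deriving it.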
One of the central advances of GUIDON was the separation of domain knowledge (represented by MYCIN's ~400 rules) from pedagogic knowledge (represented by GUIDON's ~200 tutorial rules). The modularity of the pedagogic system permitted incorporation with any expert system that utilized the EMYCIN architecture [24,25]. The other central advance of GUIDON was the recognition that KBS like MYCIN can simultaneously perform very well at making decisions and very poorly at teaching others to make decisions [23]. MYCIN's rules represented compiled expert knowledge that students found "difficult to understand, remember, and incorporate into a problem-solving approach" [14]. The authors determined that the ordering of premise clauses contained implicit procedural knowledge
about diagnostic hierarchies and strategies that could not be referenced by GUIDON's tutorial system. This led Clancey and Letsinger to reconfigure the MYCIN rule set, creating NEOMYCIN, an expert model utilized in GUIDON-2 [23]. NEOMYCIN was distinguished from MYCIN by its forward-directed use of data, top-down refinement strategy, use of an etiological taxonomy, and incorporation of rules for (1) expressing implicit "world relations" and (2) reasoning about evidence-hypothesis connections. Because NEOMYCIN reasoned more like human experts, GUIDON-2 appeared to provide more understandable explanations and advice to students.
Like GUIDON, current MTITS typically use domain-specific production rules to accomplish the model-tracing. In domains such as algebra, geometry and physics there is no pressing need to create general rules and scalable architectures, because each type of problem solving can only be modeled by a unique set of production rules. For example, a tutoring system that teaches construction of a geometry proof would require a completely different set of productions when compared to a tutoring system that teaches conversion of an algebra word problem to its equivalent equation form. In contrast, visual classification problem-solving tasks share a very similar structure, and can therefore be modeled with a very general set of production rules, instantiated with domain knowledge to create a plethora of domain-specific models. Thus, a tutoring system for dermatopathology and a tutoring system for X-ray classification of fractures could conceivably share a single general framework but utilize different domain and pedagogic content. In domains that include expansive declarative knowledge bases, this more general approach is necessary to construct systems capable of tutoring across large parts of a domain.
1.2. Knowledge-based systems, ontologies and the UPML component model
Research in KBS over the last 20 years has demonstrated the value of combining ontologies and general reasoning methods [26] (often called abstract problem-solving methods or abstract PSMs). Ontologies are formal specifications of concepts and their relationships [27]. The addition of instances to the ontologically defined classes and relationships creates a knowledge base that can be used to abstract declarative knowledge within a domain. Abstract PSMs combined with the domain knowledge declaratively defined in the knowledge base produce domain-specific solutions, but may use a very small number of generic rules. As previously described, very few ITS use these more abstract methods for knowledge representation and reasoning, due partly to the specific domains for which these systems have been developed.
Reusable problem-solving methods and ontologies permit scalability, and ease acquisition and maintenance of knowledge [26]. But they also introduce a specific problem when used in ITS. Knowledge-based systems are designed to solve problems, but the purpose of an ITS is to teach humans to solve problems. Therefore, ITS require pedagogic components that provide feedback on intermediate steps in the solution. In most ITS, this is accomplished with additional rule clauses that are intimately associated with the expert model of task performance. Among cognitive tutors, it is unusual to have pedagogic models that are entirely separate from the model-tracing component [28]. Feedback is very specific to the intermediate steps modeled by a set of domain-specific production rules. A more generic approach that couples abstract PSMs and ontologies must also disentangle the pedagogic and expert models. Otherwise pedagogic feedback becomes increasingly limited and inflexible, because (1) the more abstract productions support an even more limited set of feedback types and (2) the system has no way to instruct across sets of general productions, making all feedback specific to the single general rule to which it applies.
Increasingly, methods for developing highly modular and reusable expert systems are resulting from research on the Semantic Web. The unified problem-solving method description language (UPML) [29] provides a specification for creating distributed reasoning systems. The UPML component architecture [30] provides a detailed specification of relationships between ontologies and abstract problem-solving methods that maximizes the modularity of the components. A task defines the problem to be solved by the KBS as a set of subgoals. A problem-solving method describes a reasoning process of the KBS. A domain model describes the declarative knowledge used by the KBS during problem-solving. Task, domain model, and PSM are entirely independent, and are specified using ontologies. Relationships between parts of the model are described using bridges and refiners. Bridges are used to model relationships between two architectural components (for example task and PSMs), while refiners model the adaptation of an architectural component.
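The component roles just described can be rendered schematically. This is our illustrative reading of the UPML component model, with invented field and instance names (the specification itself defines these components in ontological, not programmatic, terms): task, PSM and domain model are declared independently, and a bridge maps between their vocabularies.

```python
# Schematic rendering of the UPML component roles (illustrative field and
# instance names only; UPML specifies these components ontologically).
from dataclasses import dataclass, field

@dataclass
class Task:                      # the problem to be solved, as subgoals
    name: str
    subgoals: list

@dataclass
class ProblemSolvingMethod:      # a reasoning process of the KBS
    name: str
    inputs: list
    outputs: list

@dataclass
class DomainModel:               # declarative knowledge used during solving
    name: str
    assertions: dict

@dataclass
class Bridge:                    # relates two components, e.g. task <-> PSM
    source: object
    target: object
    mapping: dict = field(default_factory=dict)

classify = Task("classify-case", ["identify-features", "assert-diagnosis"])
cover = ProblemSolvingMethod("cover-and-differentiate",
                             inputs=["observations"], outputs=["solutions"])
bridge = Bridge(classify, cover,
                mapping={"identify-features": "observations"})
print(bridge.mapping["identify-features"])  # observations
```

Because the mapping lives in the bridge rather than in the task or the PSM, either side can be replaced without editing the other, which is the modularity property the architecture is designed to maximize.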
Given the absence of general frameworks for developing ITS in medical domains, we elected to create our own framework, incorporating aspects of both KBS and ITS design. In particular, we use the highly modular UPML component architecture to create two interlacing KBS: one that produces
the expert model as a dynamic and advancing representation of the solution, and one that produces an instructional layer tailored to the specific student. The instructional layer is therefore independent of the expert model, and is able to inspect the student's progress across the entire solution, providing feedback that is specific to intermediate states that are defined by more than just a single rule.
The visual classification tutor (VCT) reproduces the interaction style of the cognitive tutors, and achieves the scalability of modern KBS, but also provides a novel method for separating the pedagogic system so that it can be tailored to the individual student while operating within the confines of a more general framework. In the VCT, abstract PSMs create a dynamic solution graph (DSG), a representation of the current problem state and valid next steps, that is updated with each student action. The VCT can be used with different interfaces, different domain ontologies and different pedagogic ontologies to create different visual classification tutoring systems. Using the VCT, we have implemented SlideTutor, a web-deployed system for teaching microscopic classification of inflammatory diseases of skin. The purpose of this paper is to describe the general architecture, knowledge representation, and functionality of the VCT, and to detail how it was used to create SlideTutor.
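The model-tracing role of a dynamic solution graph can be sketched in miniature. This toy is an assumption-laden simplification (step names and the dependency encoding are invented; the VCT's actual DSG is built by abstract PSMs from ontology and case instances, as described in Section 5): each accepted action advances the state, and the set of valid next steps is recomputed from what has been completed so far.

```python
# Toy dynamic solution graph: nodes are solution steps, and each accepted
# student action advances the state and recomputes the valid next steps.
# (Hypothetical step names; the VCT builds its DSG from ontology instances.)

class DynamicSolutionGraph:
    def __init__(self, dependencies):
        # dependencies: step -> set of steps that must be completed first
        self.dependencies = dependencies
        self.completed = set()

    def valid_next_steps(self):
        return {s for s, deps in self.dependencies.items()
                if s not in self.completed and deps <= self.completed}

    def trace(self, action):
        """Model tracing: accept the action only if it is valid now."""
        if action in self.valid_next_steps():
            self.completed.add(action)
            return "Accept"
        return "Failure"

dsg = DynamicSolutionGraph({
    "identify-feature:blister": set(),
    "assert-hypothesis:EM": {"identify-feature:blister"},
    "assert-diagnosis:EM": {"assert-hypothesis:EM"},
})
print(dsg.trace("assert-diagnosis:EM"))        # Failure (premature)
print(dsg.trace("identify-feature:blister"))   # Accept
```

The same state is what an instructional layer can inspect: because the graph, not an individual production rule, holds the current problem state, feedback can refer to any intermediate configuration of the solution.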
2. The VCT: a domain-neutral system for cognitive tutoring of visual classification problem solving
Visual classification problem solving is a common cognitive task in which the problem-solver uses visual cues to deduce the correct class membership for an instance of unknown class. In medicine, visual classification problem solving is an important aspect of expert performance in radiology, hematology and pathology. An ITS for visual classification problem-solving requires a developmental cognitive model of this task, because the system must be able to provide guidance and advice specific to a particular student's needs, but must also adapt as the student gains proficiency.
Our developmental model of visual classification problem solving derives from two sources: (1) previous empirical and theoretical work in radiology [31-33] and also in non-visual domains [34-38] and (2) an empirical cognitive task analysis [39,40] that we performed to explore differences in the visual diagnostic processes of novice, intermediate, and expert pathologists.
The model on which we have based our tutoring system is summarized in Table 1, as differences in five basic areas of classification problem solving: search and detection, feature identification, feature refinement, hypothesis triggering, and hypothesis evaluation. Early in the development of expertise, students lack basic abilities in searching for regions of interest, and limiting their diagnostic search space. They are frequently incorrect when determining the meaning of particular visual evidence, lack the ability to refine evidence, and do not know how to process evidence towards a diagnostic conclusion. Intermediates, on the other hand, rarely exhibit errors related to search and detection of regions of interest. Instead, they have difficulty in correctly attaching symbolic meaning to visual evidence and knowing when they need to refine evidence. They are often uncertain of the meaning of particular visual cues even when they are correct, and consider an overly broad set of hypotheses. Unlike experts, intermediates do not use backwards reasoning selectively on a small hypothesis set.
The implications of this developmental model for the instructional framework of the VCT are shown in the lower pane of Table 1, as the ITS instructional design requirements. A fundamental characteristic of these requirements is that they assume the ability of the system to change instructional approach as the student progresses through a set of developmental stages. Taken together, these implications argue for the following design principles for our system:
1. The system must be able to determine both correct and incorrect student actions, determine the general class of error that has been made, and leave open the specific instructional response to this error, which may change based on the student model for a particular student, the particular classification task, or the interface itself.

2. The system must reason with the student, in order to provide correct feedback as intermediate solutions are developed. For example, the system should accept hypotheses and even diagnoses based on an early, incomplete or incompletely refined set of evidence. When additional evidence is identified, the system should require that students revise their diagnoses or statements of diagnostic support.

3. The system must be able to reason both forwards (evidence to hypothesis) and backwards (hypothesis to evidence) so that it can support both strategies among students. More novice students may need to learn what decisions remain to be made before a given hypothesis can be confirmed, while more expert students may need to be encouraged to look for particular evidence that separates one or more hypotheses. Both scenarios require the ability to reason from hypothesis to evidence.

Table 1 Developmental cognitive model of visual diagnostic expertise and resulting ITS instructional design requirements
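The bidirectional reasoning called for by the third design principle can be sketched over a single evidence-hypothesis table. The entries below are hypothetical and the encoding is a simplification of the representation described in Section 5; the point is only that one table supports both directions: forward chaining proposes hypotheses from observed evidence, and backward chaining lists what a hypothesis still needs before it can be confirmed.

```python
# Forward and backward reasoning over one evidence-hypothesis table
# (hypothetical entries; a simplification of the paper's representation).

KB = {  # hypothesis -> evidence required to confirm it
    "dermatitis_herpetiformis": {"subepidermal_blister", "neutrophils"},
    "erythema_multiforme": {"subepidermal_blister", "lymphocytes"},
}

def forward(evidence):
    """Evidence -> hypotheses consistent with at least one finding."""
    return {h for h, req in KB.items() if evidence & req}

def backward(hypothesis, evidence):
    """Hypothesis -> evidence still needed before it can be confirmed."""
    return KB[hypothesis] - evidence

seen = {"subepidermal_blister"}
print(sorted(forward(seen)))
# ['dermatitis_herpetiformis', 'erythema_multiforme']
print(sorted(backward("dermatitis_herpetiformis", seen)))
# ['neutrophils']
```

The `backward` direction is what lets a tutor tell a novice which decisions remain before a hypothesis can be confirmed, or prompt a more advanced student to seek the evidence that separates competing hypotheses.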
In the remainder of this manuscript, we describe the design and implementation of a general knowledge-based tutoring system for visual classification that meets these requirements.
3. Composite general architecture of the VCT
The architecture and implementation we describe are novel; to our knowledge no other ITS has taken this approach. Like other ITS [41], the VCT includes an expert model against which student actions are tested and a pedagogic component that responds to incorrect actions and requests for help. But unlike other ITS [4,6] the VCT does not rely on a set of domain-specific production rules as the basis for the expert model. In contrast, the VCT consists of a set of abstract PSMs that build a DSG from three separate sets of frames that describe the domain knowledge, task sequence, and the case to be classified. Interaction with the DSG is filtered through an instructional layer created by a second set of abstract PSMs from a set of frames that describe the pedagogic knowledge, pedagogic task sequence, and eventually data related to a particular student.
The components of the VCT (Fig. 1) conform to a proposed standard for KBS, the UPML component model [30]. The expert model is composed of domain model, domain task, and abstract PSMs. The domain model represents the domain knowledge used to solve problems. The domain task represents the goals of the problem-solving process.
Figure 1 Visual classification tutor component architecture.
In combination with the case data, the abstract PSMs create the DSG against which student actions are tested. The instructional model is composed of a separate pedagogic model, pedagogic task model, and PSMs. The pedagogic model represents the pedagogic knowledge used to teach problem solving. The pedagogic task represents the goals of the instructional process. In combination with data reflecting the current state of the student model, the instructional model is manifested as a highly flexible and context-specific instructional layer between the student interface and the DSG.
4. Interaction scenario
In order to clarify our methods, we first provide a single interaction scenario that incorporates examples of a wide range of system functionality. The scenario is taken from the domain of dermatopathology and uses SlideTutor, but similar scenarios could be generated for any domain in which visual classification is feature based. In Sections 5-8, we use these examples to detail the conceptual implementation and function of each of the VCT components. Individual actions enumerated in parentheses match the tabular format of the interaction scenario shown in Table 2. A video clip showing the interaction scenario in the SlideTutor case-focused interface is available at http://slidetutor.upmc.edu/video.html (accessed: 31 December 2004).

A first year pathology resident is using the SlideTutor system to learn microscopic diagnosis of inflammatory skin diseases. A case is selected by
Table 2 Student actions and tutor responses corresponding to interaction scenario in text

Action | Student action (SUBTASK type and value) | Tutor response | Description and references
2.1 | Identify-Feature: blister | Accept | See corresponding state of DSG in Fig. 5A
2.2 | Identify-Attribute (of blister): location = intraepidermal | Failure | Discrete values of attribute modeled (see Section 5). See Table 5, Error S3
2.3 | Help Request | Best-next-step | Hint after bug targets previous error (see Section 7)
2.4 | Identify-Attribute (of blister): location = subepidermal | Accept | Discrete values of attribute modeled (see Section 5)
2.5 | Identify-Feature: mucin | Failure | See example of error delivery in Fig. 7. See Table 5, Error I4
2.6 | Identify-Absent-Feature: mucin | Accept | Pertinent negative (see Section 5)
2.7 | Assert-Hypothesis: erythema multiforme | Accept | Hypothesis consistent with at least one feature (see Table 7) (Task description)
2.8 | Assert-Hypothesis: suction blister | Accept | Task description
2.9 | Identify-Feature: inflammatory infiltrate | Accept |
2.10 | Assert-Support-Link: inflammatory infiltrate and erythema multiforme | Accept | Support-link represented in DSG as present or absent edge between nodes (see Section 6)
2.11 | Assert-Support-Link: inflammatory infiltrate and suction blister | Failure | See Table 5, Error E1
2.12 | Identify-Feature: neutrophilic inflammatory infiltrate | Failure | Alert after correct action (see Section 7). See Table 5, Error E6
2.13 | Identify-Attribute (of neutrophilic inflammatory infiltrate): location = dermis | Accept |
2.14 | Identify-Attribute (of neutrophilic inflammatory infiltrate): quantity = minimal | Failure | Continuous values of attribute modeled (see Section 5). See Table 5, Error S2
2.15 | Assert-Hypothesis: acquired epidermolysis bullosa | Accept |
2.16 | Help Request | Best-next-step | Feature identification precedes hypothesis triggering in current pedagogic task (see Table 6, State 3). See corresponding state of DSG in Fig. 5B
2.17 | Identify-Feature: eosinophilic inflammatory infiltrate | Failure | Ambiguous error states (see Section 7). See Table 5, Errors I2 and I3
2.18 | Identify-Attribute (of eosinophilic inflammatory infiltrate): location = dermis | Accept |
2.19 | Identify-Attribute (of eosinophilic inflammatory infiltrate): quantity = moderate | Accept |
2.20 | Identify-Feature: epithelial necrosis | Failure | See Table 5, Error I1
2.21 | Identify-Feature: nuclear dust | Accept |
2.22 | Assert-Diagnosis: acquired epidermolysis bullosa | Failure | EC represents integrated evidence-hypothesis relationship (Fig. 6). Student action fails because no edge from EC to asserted diagnosis (see Fig. 5C). See Table 5, Error E10
2.23 | Help Request | Best-next-step | Hints are given only for required nodes (see Section 7). See corresponding state of DSG in Fig. 5C
2.24 | Assert-Hypothesis: linear IgA dermatosis | Accept |
2.25 | Help Request | Best-next-step |
2.26 | Assert-Hypothesis: dermatitis herpetiformis | Accept |
2.27 | Help Request | Best-next-step |
2.28 | Assert-Hypothesis: dermatitis herpetiformis-like drug eruption | Accept |
2.29 | Assert-Diagnosis: linear IgA dermatosis | Accept |
2.30 | Problem Done | Failure | See Table 5, Error C1
2.31 | Help Request | Best-next-step |
2.32 | Assert-Diagnosis: dermatitis herpetiformis | Accept |
2.33 | Help Request | Best-next-step |
2.34 | Assert-Diagnosis: dermatitis herpetiformis-like drug eruption | Accept |
2.35 | Problem Done | Accept |
the tutoring system. At the conclusion of the student-system interaction, this case will match to FS-A in Fig. 2, but this is not known by student or system until the problem is solved. The student is presented with a brief clinical history, and a virtual slide that the student uses to pan and zoom, as they would use a microscope. The interface also provides a palette for students to build a diagrammatic representation of their reasoning. When student actions are correct, the diagram is updated to reflect the new state of the argument. When student actions are incorrect, students may modify the diagram to correct errors.

The student begins by scanning through the slide, and finds an abnormal discontinuation of the epidermis. She correctly specifies the location of the abnormality in the image and correctly describes this abnormal feature as "blister" (2.1). The system accepts the feature. After looking closely at the blister, the student erroneously concludes that the blister is intra-epidermal (2.2). She is told that although she is wise to evaluate location of the blister, the blister is not intra-epidermal in this case. The diagram reflects this error. The student asks for a hint (2.3), and is told to delete
Figure 2 Relationship of FEATURE_SPECIFICATION to DISEASE instances for five feature specifications and five diseases.
the incorrect location. She properly asserts that the blister is subepidermal (2.4). She suggests dermal mucin in another area (2.5), but is told that dermal mucin is absent in this case, and that absence of mucin is an important negative finding. The student corrects this mistake by asserting absence of dermal mucin (2.6) and notes the location of the absent mucin as within the reticular dermis. The student suggests that this could be erythema multiforme (EM) (2.7) or suction blister (2.8). The system adds these hypotheses to the diagram because both are consistent with at least one piece of evidence, the subepidermal blister. In an area that seems hypercellular, she indicates the presence of inflammatory infiltrate (2.9), a general category with subclasses delineated by cellular type. The system considers this to be correct because both neutrophilic and eosinophilic inflammation (subtypes of inflammatory infiltrate) are present in this slide at the location specified by the student.

The student tries to create explicit support for the diagnosis of EM with inflammatory infiltrate in her diagram (2.10). The system considers this feature supportive and modifies the diagram, because EM is associated with lymphocytic infiltrate (another subtype of inflammatory infiltrate). But when the student tries to support suction blister with inflammatory infiltrate (2.11), the system marks this as incorrect in the diagram and the student removes it. After further consideration, the student modifies the more general concept of inflammatory infiltrate to the more specific concept of neutrophil-rich inflammatory infiltrate (2.12). The system accepts this modification, but also alerts the student that by further specifying the type of infiltrate, the supportive relationship between the inflammatory infiltrate and EM no longer applies. Although lymphocytic infiltrate is seen in EM, neutrophilic infiltrate is not. The student amends the diagram by deleting this explicit relationship. Since she has excluded EM and suction blister she deletes these hypotheses from the palette. She correctly specifies the location of the neutrophilic infiltrate (2.13) and then suggests that the infiltrate is minimal in quantity (2.14). The system responds that although minimal neutrophilic infiltrate can be seen in cases of this entity, the severity is not best described as minimal in this case. The student corrects the error.

The student thinks of acquired epidermolysis bullosa (AEB) (2.15), and adds this to the diagram. But at this point she is stuck. There don't seem to be further features, and these are all the hypotheses she can come up with. She asks for a hint (2.16). The system suggests that there are still further features to find. Uncertain where these features are, she asks for more hints and is taken to the correct field and magnification for the next feature. The student thinks she knows what's here and indicates that there is eosinophil-rich inflammatory infiltrate in a particular area of the image (2.17). The system suggests that the eosinophil-rich inflammatory infiltrate is present, but not in the area that she has indicated. The student asks for additional hints, and the system annotates the correct region. A final hint moves the slide and annotates an area in which the eosinophilia is particularly easy to recognize. The student correctly asserts the location (2.18) and degree (2.19) of the eosinophilic infiltrate. She finds an area near the blister that seems necrotic. She identifies epithelial necrosis (2.20), but is told that this is not a significant feature of this case. The student removes the feature. Thinking that they could be fragments of neutrophils instead, she asserts nuclear dust (2.21), which is accepted into the diagram.

The student feels certain that this is AEB. She tries to make the hypothesis a full-fledged diagnosis (2.22). But the system responds that although blister, nuclear dust, and neutrophilia are features of AEB, the eosinophilia is not consistent. The student is unsure what to do next. Are there other important features that she has missed? She asks for another hint (2.23). The system responds that she has found all of the features, but has not asserted all the hypotheses that apply given these features. After additional requests for hints (2.25, 2.27), she is told that all of the features are consistent with linear IgA dermatosis, dermatitis herpetiformis (DH) and dermatitis herpetiformis-like drug eruption (DHLDE). She asserts these hypotheses (2.24, 2.26, 2.28). Which diagnosis is it? She guesses linear IgA dermatosis (2.29) and the system promotes the hypothesis to the status of diagnosis in the diagram, indicating that this is a perfectly acceptable diagnosis. The student tries to conclude the problem (2.30) but is warned that there are more acceptable diagnoses. With additional hints (2.31, 2.33), she is told that DH and DHLDE are also consistent with all of these features. The student adds them to the reasoning palette (2.32, 2.34), and correctly indicates that the problem has been solved (2.35).

2 We use the term model to refer to a subset of the complete ontology, specific to a particular purpose. For example, the domain model is the subset of the complete ontology that specifies only the concepts relevant to domain knowledge. In contrast, the task model specifies only the concepts relevant to task knowledge. Separation of domain, task and case concepts is a foundational principle of the UPML component architecture.
5. Expert model: representation of domain knowledge, task and case data
The expert model of the VCT provides the evolving solution that student actions are tested against. Abstract PSMs are applied to (1) instances in the domain knowledge base; (2) instances defining the task sequence and (3) instances derived from case data, to create a DSG. We first describe the three models, and then detail how these frames are used to generate the DSG.
5.1. Domain model
The domain model expresses the relationships between evidence and disease entities. The class concepts and relationships of the domain model are general, and apply widely throughout pathology and other areas of medicine in which classification is feature-based. The class structure of the domain model is depicted in Fig. 3 in relationship to the case model.

Figure 3 UML class diagram of domain and case showing relationships to shared feature, attribute and value primitives.

Our representation slightly extends the ontology for classification problem solving described by Motta and Lu [42], by adding additional attributes to features. A similar representation has also been used to create ontologies for petrographic analysis [43]. Diseases are hierarchically represented. (We use the term 'disease' as the class name throughout this manuscript, in order to clarify relationships within a medical context; however, this class is more appropriately named 'visual entity', because the DOMAIN model applies to any visual classification task.) Any disease may have more than one parent (multiple inheritance). Diseases have one or more FEATURE_SPECIFICATION instances, each of which corresponds to a combination of evidence. This structure is quite analogous to what Evans and Gadd have termed "facets" in describing clinical reasoning [44]. Each FEATURE_SPECIFICATION consists of one or more FEATURE_ATTRIBUTE_VALUE_SET (FAVS). Each FAVS is in turn composed of two primary components: (1) a single FEATURE and (2) one or more ATTRIBUTE_VALUE_SETS. Instances of FEATURE represent distinct perceptual primitives of visual entities (such as blister) that form the "atoms" of visual feature recognition. Instances of ATTRIBUTE_VALUE_SET represent the additional cognitive steps required for refining these features (such as the distinction of a blister's location relative to the epidermis as subepidermal or intraepidermal). ATTRIBUTE_VALUE_SETS are composed of a single ATTRIBUTE and a set of VALUE.

The domain model therefore represents many-to-many relationships between FEATURE_SPECIFICATION and DISEASE. An example of this relationship is shown in Fig. 2, for five instances of DISEASE under consideration in the interaction scenario, and five of twelve total FEATURE_SPECIFICATIONS associated with these five diseases.
Continuous and discrete values of attributes are modeled differently:

- For continuous values such as the quantity of inflammation, multiple possible values are expressed as a value range of the ATTRIBUTE in a single FEATURE_SPECIFICATION. For example, in Fig. 2, FS-A indicates that the value range of neutrophil-rich inflammatory infiltrate is moderate to marked.
- For discrete values such as the location of inflammation, when an ATTRIBUTE of one FEATURE has multiple values within a single case (logical AND), these are represented as multiple ATTRIBUTE_VALUE_SETS within a single FEATURE_SPECIFICATION. For example, in Fig. 2, FS-E contains two separate ATTRIBUTE_VALUE_SETS for location of the eosinophil-rich inflammatory infiltrate, one for papillary dermis and one for reticular dermis.
- For continuous or discrete values, when an ATTRIBUTE of one FEATURE can have only one of many values (logical OR), these are represented as separate FEATURE_SPECIFICATIONs (not shown in Fig. 2).
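The three conventions above (value ranges for continuous attributes, multiple ATTRIBUTE_VALUE_SETS for logical AND, separate FEATURE_SPECIFICATIONs for logical OR) can be sketched as simple data structures. This is an illustrative Python reconstruction under assumed class and field names, not the published implementation:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AttributeValueSet:
    attribute: str   # e.g. "quantity" or "location"
    values: tuple    # allowed values; an inclusive range for continuous attributes

@dataclass
class Favs:
    # FEATURE_ATTRIBUTE_VALUE_SET: one FEATURE plus one or more ATTRIBUTE_VALUE_SETs.
    feature: str
    attribute_value_sets: list

# FS-A (Fig. 2): neutrophil-rich infiltrate with quantity moderate..marked
# (a continuous value range in a single specification).
fs_a = [Favs("neutrophil-rich inflammatory infiltrate",
             [AttributeValueSet("quantity", ("moderate", "marked"))])]

# FS-E (Fig. 2): eosinophil-rich infiltrate in papillary AND reticular dermis;
# logical AND is modeled as multiple ATTRIBUTE_VALUE_SETs in one specification.
fs_e = [Favs("eosinophil-rich inflammatory infiltrate",
             [AttributeValueSet("location", ("papillary dermis",)),
              AttributeValueSet("location", ("reticular dermis",))])]

# Logical OR (only one of many values) is modeled as separate specifications.
fs_or = [[Favs("blister", [AttributeValueSet("location", ("subepidermal",))])],
         [Favs("blister", [AttributeValueSet("location", ("intraepidermal",))])]]
```

The example values mirror FS-A and FS-E from Fig. 2; the OR example uses a hypothetical pair of blister specifications.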
Identifying the absence of some features is often a critical aspect of problem-solving. Students must learn when it is important to identify pertinent negatives, because they form branch points in decision making. These 'explicitly absent features' must therefore be represented in the DSG so that the instructional layer can separate these pertinent negatives from the sea of absent features the student might correctly try to note in any individual case. For example, when the student in our interaction scenario says that mucin is present, the system responds that actually it is absent, and that the absence of this feature is salient (Table 2, Action 2.5). Explicitly absent features are represented using the same method as present features, except that the quantity attribute is defined as 'none'. For example, the absence of mucin in the reticular dermis is expressed in FS-A and FS-B (Fig. 2) as a FAVS with feature: mucin, location: reticular dermis, and quantity: none.

5.2. Case

In the VCT, the CASE provides the location and semantic content of each feature encoded in a single tutor case. The class relationships for CASE are shown in Fig. 3. Each CASE consists of a set of LOCATED_OBSERVABLE, each of which combines a single OBSERVABLE and a set of SHAPE that defines the geographic distribution of the OBSERVABLE in the image. OBSERVABLES are composed of one FEATURE and one or more ATTRIBUTE_VALUE_PAIR. ATTRIBUTE_VALUE_PAIR are composed of one ATTRIBUTE and a single VALUE. For continuous values, the single values in ATTRIBUTE_VALUE_PAIR in the case model contrast with the value ranges expressed in ATTRIBUTE_VALUE_SET in the domain model.

Fig. 4 shows the CASE instance used in the interaction scenario. In combination with Fig. 2, it can be surmised that the CASE which is loaded at the start of the interaction scenario will be determined to represent FS-A by both student and system by the conclusion of problem solving. Note that the moderate quantity of inflammatory infiltrate in the ATTRIBUTE_VALUE_PAIR for CASE is within the range of the ATTRIBUTE_VALUE_SET (moderate, marked) for FS-A. One requirement for instances of CASE to match to instances of DISEASE is that all features and attributes match, and that single values within ATTRIBUTE_VALUE_PAIR are within the value ranges of the corresponding ATTRIBUTE_VALUE_SET.

Figure 4 Instance of CASE used in interaction scenario.

The relationship of the domain model and case model is also shown in Fig. 3. ATTRIBUTE, VALUE and FEATURE are primitives shared by both domain and case models. Instances of CASE are created using a case authoring tool, which constrains authoring to use only the FEATURE_SPECIFICATIONS enumerated in the domain knowledge base. This is more fully described in Section 12.
5.3. Domain task
The task model represents an abstraction of the goal structure for classification problem solving. The task model is composed of a small set of SUBTASK instances, with one SUBTASK for each general cognitive goal in a particular classification problem-solving exercise. The task model used in the interaction scenario (Table 3) contained one SUBTASK for each of the following types of cognitive goals: (1) identify a feature in the image; (2) determine that a critical feature is absent; (3) refine the feature to include attributes and values; (4) assert a hypothesis; (5) accept a hypothesis as a diagnosis; (6) assert a supporting relationship between a feature and a hypothesis; (7) assert a refuting relationship between a feature and a hypothesis and (8) find a feature that distinguishes between one or more competing hypotheses. Each of these generic SUBTASKs may be instantiated to form one or more case-specific goals.

Table 3 Subtask types used in TASK model

Has_Type | Has_Parent | Has_Role | Explanation of subtask
Identify-Feature | None | Direct | Present feature in case to be identified and localized
Identify-Absent-Feature | None | Direct | Pertinent negative feature in case to be noted
Identify-Attribute | Identify-Feature | Property | Qualities of feature to be identified only after feature is identified
Assert-Hypothesis | Identify-Feature | Direct | Goals for making hypotheses only instantiated after a feature node already exists which is supportive of this hypothesis
Assert-Diagnosis | Assert-Hypothesis | Direct | Assertion of diagnosis follows assertion of hypothesis
Support-Link | Identify-Feature, Assert-Hypothesis | Indirect | Reasoning step to indicate that identified feature (constrained or unconstrained by attributes) is supportive of particular hypotheses
Refute-Link | Identify-Feature, Assert-Hypothesis | Indirect | Reasoning step to indicate that identified feature (constrained or unconstrained by attributes) is refutive of particular hypotheses
Identify-Distinguishing-Feature | Assert-Hypothesis | Indirect | Features to be identified when competing hypotheses exist where features distinguish between competing hypotheses

Relationships between instantiated goals are derived from the spatial structure of TASK, and ultimately specify the entire problem space of valid actions for a given case. The Has_Parent slot of SUBTASK is used to create this spatial structure. For example, a SUBTASK of type Assert-Diagnosis contains a Has_Parent slot that may be filled by Assert-Hypothesis or Identify-Feature. When filled by Assert-Hypothesis (as is true in the task shown in Table 3), the diagnosis must be preceded by the formation of a hypothesis. When filled by Identify-Feature, the diagnosis may be asserted directly from the evidence without the need to first trigger and test a hypothesis.

The behavior of each type of goal during problem solving is defined in the Has_Role slot of SUBTASK. This slot specifies whether the goal has an inherent ordering in the subtask (direct role), is an attribute of an already achieved goal (property role), or reflects relationships between the goals (indirect role). Examples of goals with direct role include feature identification and hypothesis triggering because (1) features must be found before they are refined, and (2) hypotheses must be based on features. An example of a goal with a property role is feature refinement, because refinement of a feature must be preceded by identification of the feature. Examples of goals with indirect roles include (1) supporting and refuting hypotheses, and (2) distinguishing features. Any step, whether direct, indirect or property, may be either required or not required, but it is up to the instructional layer to determine which of these two states is in effect.

By modifying TASK, the goal instances created in the DSG will alter the behavior of the expert model that is used to evaluate student actions. Consequently the same system can be used to model different strategies for solving a single problem, dependent on the state of the student model.
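The effect of refilling the Has_Parent slot of Assert-Diagnosis can be sketched as follows. The dictionary layout, slot names, and the chain-walking helper are illustrative assumptions, not the paper's frame representation:

```python
# Sketch of the TASK model from Table 3: each SUBTASK carries Has_Parent and
# Has_Role slots. Swapping the parent of Assert-Diagnosis changes the strategy
# the expert model enforces.

hypothesis_driven_task = {
    "Identify-Feature":   {"parent": None,                "role": "direct"},
    "Identify-Attribute": {"parent": "Identify-Feature",  "role": "property"},
    "Assert-Hypothesis":  {"parent": "Identify-Feature",  "role": "direct"},
    "Assert-Diagnosis":   {"parent": "Assert-Hypothesis", "role": "direct"},
}

# Direct-from-evidence strategy: diagnosis no longer requires a prior hypothesis.
evidence_driven_task = dict(hypothesis_driven_task)
evidence_driven_task["Assert-Diagnosis"] = {"parent": "Identify-Feature",
                                            "role": "direct"}

def prerequisite_chain(task, subtask):
    """Walk Has_Parent slots to list the goals that must precede a subtask."""
    chain = []
    parent = task[subtask]["parent"]
    while parent is not None:
        chain.append(parent)
        parent = task[parent]["parent"]
    return chain
```

Under the first task definition a diagnosis presupposes a hypothesis and, before that, a feature; under the second it presupposes only a feature, modeling the two strategies described above.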
6. Dynamic solution graph
The DSG is a directed acyclic graph that models the current problem state and all valid-next-steps. The DSG generates the initial problem state at the beginning of each problem, using knowledge derived from the domain, task and case models. After each student action, the graph structure is updated by a set of abstract PSMs that may add or delete nodes and arcs, or change the node state. These changes are determined by the behavior encapsulated in the Has_Role and Has_Parent slots of the instantiated goals.

The system does not know the solution of the problem a priori, and only reaches the solution with the student. Although any individual state of the DSG represents only the current problem state and all valid next-steps, sequential DSG states define one path through the problem space: the path taken by the student. The graph representation supports the ability to reason in both directions (design principle 3) and the dynamic nature of the graph enables the system to reason with the student (design principle 2).

Fig. 5A-C depicts the DSG in three different states, corresponding to the states following actions 2.1 (Fig. 5A), 2.15 (Fig. 5B), and 2.17 (Fig. 5C) in the interaction scenario (Table 2). Completed nodes are shown as filled. The current problem state is defined by the set of all completed nodes. All other nodes represent valid-next-steps. Bolded nodes depict calculated best-next-steps.
6.1. Basic structure
The DSG is composed of a set of nodes and a set of arcs. Each node represents one instantiated SUBTASK. Once instantiated, we refer to the nodes as subgoals because they are specific to the context of the problem state. During instantiation, both case and domain knowledge are incorporated. As a result, an individual node in the DSG encapsulates the properties that are essential elements in the achievement of this subgoal (Table 4). For example, in Fig. 5, the node entitled "nuclear dust" is an Identify-Feature node and therefore has properties FeatureName, Area, and Magnification (Table 4). Each property contains a value or set of values. As determined by the instructional layer, a correct action by the student must match all of these node properties. For example, when the student identified nuclear dust in the interaction scenario (Action 2.21 in Table 2), the student's feature name matched the node FeatureName, the student's feature location was within the set of regions defined by the Area, and the student's degree of magnification was greater than or equal to the Magnification. Arcs in the DSG represent temporal relationships between the current problem state and valid-next-steps. As the DSG advances, new nodes are added which represent additional valid-next-steps. For example, after the assertion of "blister" the solution graph contains new goals for attributes associated with blister and for hypotheses that are supported by blister (Fig. 5A).
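The three-part match just described (feature name, location within the node's Area, and sufficient magnification) can be sketched as a predicate. Regions are simplified here to axis-aligned rectangles rather than the system's polygons, and all names and coordinates are illustrative:

```python
# Sketch of testing a student action against an Identify-Feature node
# (cf. Table 4): the feature name must match, the indicated location must
# fall inside one of the node's regions, and the viewer magnification must
# be at least the lowest magnification at which the feature is observable.

def matches_identify_feature(node, action):
    if action["feature"] != node["feature"]:
        return False
    x, y = action["location"]
    in_area = any(x0 <= x <= x1 and y0 <= y <= y1
                  for (x0, y0, x1, y1) in node["area"])
    return in_area and action["magnification"] >= node["magnification"]

# Hypothetical node for the "nuclear dust" example of Action 2.21.
nuclear_dust_node = {"feature": "nuclear dust",
                     "area": [(120, 80, 200, 140)],   # illustrative coordinates
                     "magnification": 20}
```

An action failing any one of the three tests would return a 'failure' from the DSG and be routed to the error-matching logic of Section 7.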
The evidence cluster node is an additional node used to express an integrated relation between features and hypotheses. Cluster nodes are used for disambiguating Feature-Hypothesis relationships as the student traverses the problem space. The method for disambiguation is shown with other graph definitions in Fig. 6. The domain knowledge base contains a one-to-one mapping between FEATURE_SPECIFICATION and DISEASE SET. However, any feature may be present in multiple FEATURE_SPECIFICATION and any DISEASE may be present in multiple DISEASE SETs. Because the DSG reasons only one step ahead of the student, the system must deal with an incomplete set of features in bounding the student's problem solving. Until all features have been identified, the evidence cluster provides the only way to determine the DISEASE SETs that are valid at any given time. For example, in Fig. 5A-C, as additional features are identified and refined, the evidence cluster points to fewer and fewer hypotheses. At the end of problem-solving only the one-to-one mapping between FEATURE_SPECIFICATION and DISEASE SET will remain. This can be seen in Fig. 5C, where the evidence cluster points to three hypotheses: linear IgA dermatosis, DHLDE and DH. These three hypotheses share the FS-A feature specification (Fig. 2). In this state, the problem has been solved. Note that in the interaction scenario, the student identifies blister early on, and therefore the set of valid hypotheses is initially quite large. If the student had identified nuclear dust first, the set of valid hypotheses would be small initially, and would grow as additional features were identified.

In addition to bounding the student's solution, the evidence cluster can be used by the pedagogic system to permit only hypotheses that are consistent with ALL identified features (hypothesis is a member of the DISEASE_SET in evidence cluster) as opposed to ANY identified features (hypothesis is a child node of Identify-Feature). The current default instructional model implements the latter criterion.
Figure 5 (A) State of DSG following identification of blister (Action 2.1 in Table 2). (B) State of DSG following identification of blister (Action 2.15 in Table 2). (C) State of DSG following identification of blister (Action 2.17 in Table 2).

6.2. Creating the graph

To create the initial problem state, information from CASE and TASK is used to instantiate initial DSG nodes. Identify-Feature nodes are instantiated with feature names, areas and magnification from CASE. Identify-Attribute nodes are instantiated with attribute name and value. Both nodes contain structural information derived from TASK. At the start of the problem only Identify-Feature and Identify-Attribute nodes are present.
6.3. Determination of best-next-step
As shown in Fig. 5, the DSG includes all valid-next-steps in problem solving (unfilled nodes). In order to help students traverse the problem-space, the DSG must select a single best-next-step for each cycle (shown as bolded nodes in Fig. 5). These best-next-steps are context-specific to the state of the problem and what is known about the student, and are used by the pedagogic system to deliver context-specific help. For example, in the interaction scenario, the hint given to the student when there is still evidence to identify (response to Action 2.16) is different than the hint given when all evidence has been identified (response to Action 2.23). Identification of best-next-step in the DSG is analogous to conflict-resolution among separate production rules in more traditional ITS architectures.

The best-next-step is determined each cycle based on information derived from the instructional model. Each node contains a value for hint priority and an indicator for whether or not it is a required node. The best-next-step in any state is the required node of highest priority. For example, in Fig. 5A, the best-next-step is to refine the feature blister by adding that it has a subepidermal location, because feature-refinement has a higher priority than feature identification or hypothesis formation for the current pedagogic model. Priorities and requirement indicators may be altered by the pedagogic model, based on the state of the student model.

Table 4 DSG node structure and behavior

Slots common to all DSG nodes:
- Is_Goal: indicates whether the node is a goal node, based on current problem state
- Is_From_Case: indicates whether the goal is derived from the case
- State: reflects whether the node has been completed by the student or not

Properties, by node type:
- Identify-Feature (direct): Feature name, e.g. 'blister'; Area, the set of polygons associated with the feature; Magnification, the lowest observable magnification
- Identify-Attribute (direct): Attribute name, e.g. 'location'; Attribute value, the value of the attribute derived from the case, e.g. 'subepidermal'
- Assert-Hypothesis (direct): Name, e.g. 'linear IgA dermatosis'
- Assert-Diagnosis (direct): Name, e.g. 'linear IgA dermatosis'
- Support-Link (indirect): Name, e.g. 'blister - linear IgA dermatosis'
- Refute-Link (indirect): Name, e.g. 'nuclear dust - arthropod bite'
- Identify-Distinguishing-Feature (indirect): Name, e.g. 'nuclear dust'

General behavior of DSG nodes (applies to all nodes):
- Value of state slot changes when completed by student
- Completed nodes generate all direct parents and children based on contents of has-parent slot of Task model
- Node cannot be deleted if derived from case
- Nodes connected to other completed nodes cannot be deleted on update

Specific behavior of DSG nodes, by subgoal type:
- Identify-Feature: completion results in update of evidence cluster based on integration of additional information
- Identify-Attribute: nodes cannot be completed before parent evidence nodes, because Identify-Attribute nodes have a property role derived from Task; completion results in update of evidence cluster based on integration of additional information
- Assert-Diagnosis: on update, a diagnosis node of the same name as a hypothesis node will be created only if the hypothesis node Is_Goal
- Indirect nodes (Support-Link, Refute-Link, Identify-Distinguishing-Feature): all indirect nodes are dynamic and reflect the existing relationship between the direct nodes, and can become incorrect when these relationships are no longer valid

Figure 6 Expert model definitions.
6.4. Updating the graph
From the initial state, each subsequent student action is translated into an event that propagates through the DSG and alters its structure. When the event propagates, alterations to individual nodes are specific to the type of node (Table 4). For example, after a correct identification of a feature (student action matches a particular Identify-Feature node exactly) the graph updates by (1) changing the state of the Identify-Feature node to identified, (2) adding Assert-Hypothesis nodes which are supported by this feature, (3) creating arcs between the Identify-Feature node and each Assert-Hypothesis node supported by that feature, (4) updating the evidence cluster, and (5) calculating the new best-next-step.
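Steps (1) through (4) of this update cycle can be sketched as follows (step 5, best-next-step selection, is omitted). The graph layout and the feature-to-hypothesis supports table are illustrative assumptions, not Fig. 2's actual specifications:

```python
# Sketch of the Section 6.4 update after a correct Identify-Feature action.

# Hypothetical supports table: which hypotheses each feature supports.
SUPPORTS = {"blister":      {"linear IgA dermatosis", "AEB", "DH"},
            "nuclear dust": {"linear IgA dermatosis", "DH"}}

def complete_feature(graph, feature):
    graph["completed"].add(feature)                     # (1) change node state
    for hyp in SUPPORTS[feature]:
        graph["nodes"].add(("Assert-Hypothesis", hyp))  # (2) add hypothesis nodes
        graph["arcs"].add((feature, hyp))               # (3) arc feature -> hypothesis
    # (4) evidence cluster: hypotheses consistent with ALL completed features.
    clusters = [SUPPORTS[f] for f in graph["completed"]]
    graph["cluster"] = set.intersection(*clusters)
    return graph

graph = {"completed": set(), "nodes": set(), "arcs": set(), "cluster": set()}
complete_feature(graph, "blister")
complete_feature(graph, "nuclear dust")
```

As in Fig. 5, each additional identified feature narrows the evidence cluster to the hypotheses consistent with everything found so far.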
6.5. Reasoning forward and backward
The DSG models both the forward and backward intermediate reasoning steps that are seen in empirical studies of developing expertise. As demonstrated in the GUIDON project [14,23], forward directed reasoning provides a much more natural method for students to progress through the problem space. However, backward or goal-directed reasoning can be a powerful operator and is often used by experts to perform more complex kinds of reasoning. These include checking of solutions, and distinguishing between alternative hypotheses by seeking evidence that is differentially expressed among the alternatives.

The DSG supports forward directed reasoning by maintaining the set of all valid-next-steps and calculating the best-next-step with each cycle. Consequently, by iteratively cycling the DSG, it can be used to traverse the entire problem space to arrive at the goal, just as a forward chaining system would be used to arrive at the solution state. Forward-directed reasoning is used by the instructional model to guide the student through step-by-step. In fact, the structure defined by the TASK ensures that as the DSG is successively augmented, the direction of the graph models the forward-directed process from evidence to diagnosis.

If we define the TASK differently, the DSG can be iteratively cycled from a solution state to add goal nodes until an initial problem state is reached. For example, if we define the TASK to reason from diagnosis to evidence, then the graph will recreate the entire problem space from the solution state backward toward all initial states.

However, typically we are only interested in selected aspects of backward reasoning in order to perform particular instructional actions. Therefore we do not recreate the entire backward solution. Instead, the current TASK produces the entire sequence of forward directed reasoning and simultaneously models single backward or goal-directed reasoning steps. In particular, we model backward reasoning from hypothesis to evidence in order to (1) offer more specific kinds of remediation and (2) help students learn how to distinguish among hypotheses.
Remediation can be made more specific by identifying whether a student could be using backward reasoning to set goals for searching for particular features. For example, when a hypothesis is added, goals for finding features and attributes relevant to the added hypothesis appear in the next state of the DSG. Fig. 5B corresponds to the state following Action 2.15 in Table 2, and shows that nodes for isolated neutrophils, isolated eosinophils and predominantly lymphocytic inflammatory infiltrate have been added to the DSG, after AEB was asserted. Arcs connect these goals to the AEB hypothesis. In this case, none of these features are actually present in the slide. In other situations, backward reasoning goals may overlap with features present in the case. If in State 5B, the student were to assert the presence of a predominantly lymphocytic inflammatory infiltrate, the system would provide explicit feedback that this feature is not present, but would simultaneously reinforce the student for searching for a feature that should be present, given that EM is under consideration.

Identify-Distinguishing-Feature is an additional subgoal that can be supported by adding nodes for features and attributes, using backward directed reasoning. Specifically, the structure of the DSG can be used to determine the intersection and complement of features for a given set of hypotheses. For example, linear IgA dermatosis and arthropod bite can be distinguished by the presence of nuclear dust, which is present in linear IgA dermatosis but absent in arthropod bite (Fig. 2).
The DSG therefore supports the essential characteristics of a cognitive tutor: it provides the ability to distinguish between correct and incorrect actions, models the set of valid actions for the next step, and selects a single step as the next-best step which can be used by the instructional layer to guide the student through the problem [8]. Furthermore, the DSG supports the foundational principles we established from our developmental model (Table 1): (1) it determines general classes of errors but provides flexibility in tutor response, (2) it reasons with the student, supporting intermediate solutions and revision, and (3) it models both forward and backward reasoning. Finally, the DSG enables a more scalable architecture because all domain and pedagogic knowledge is maintained in separate knowledge bases.
7. Instructional model
Pedagogic and domain knowledge are entirely separate in our system. The DSG is used to determine accepted actions versus failures, and to determine the best-next-step in problem-solving. But all instructional content is provided by a separate system that responds to states in which (1) student actions do not match an existing DSG node or (2) the student requests help.

In symmetry to the expert model, the instructional model is composed of a pedagogic model, pedagogic task and PSMs (Fig. 1). The pedagogic task is rudimentary in the current system, consisting only of the goals to deliver hints and alerts, and the priorities attached to these goals. In the default state of the pedagogic task described in this manuscript, only a single response (error) or set of responses (hints) is delivered, regardless of the student. In future versions of the system, more complex reasoning will be utilized to produce different hints and alerts based on student model state.

The pedagogic model contains the declarative knowledge required for two types of case-specific interventions: (1) explanations delivered by the system as alerts when the student makes a particular kind of error, and (2) explanations delivered by the system as hints when the student requests help.
7.1. Errors
Errors constitute student actions that generate a 'failure' response from the expert model because they do not fulfill the criteria for a complete match at any node. When this is the case, student actions will match to one or more error states (Table 5). Error states can be categorized as errors of feature identification, feature refinement, hypothesis triggering and hypothesis evaluation, as outlined in our developmental model of expertise (Table 1). Error states for slide search have not yet been implemented, primarily because our developmental model suggests that the target users (intermediates) have largely mastered these skills.

Table 5 Errors remediated by type

Feature identification:
- I1: Feature identified by student is not present
- I2: Feature identified by student exists at another location
- I3: Feature identified by student exists elsewhere, but a second feature present in this location has been missed
- I4: Feature identified by student is explicitly absent
- I5: Feature identified by student is not present in this case, but can be present for one or more hypotheses under consideration (including correct)
- I6: Feature identified as absent is present in location currently under consideration
- I7: Feature identified as absent is present in another location not currently seen in viewer
- I8: Absent feature identified by student is absent but not important to note
- I9: Magnification used by student is too low to identify absent feature
- I10: Magnification used by student is too low to identify feature

Feature refinement:
- S1: Feature doesn't require further specification of attributes and values
- S2: Wrong attribute for feature
- S3: Correct attribute for feature, but incorrect value for attribute
- S4: Attribute can have that value for hypotheses under consideration, but not in this case (used for backwards reasoning)
- S5: Attribute can have that value for hypotheses not currently under consideration, but not in this case (used for forwards reasoning)

Hypothesis triggering:
- T1: No feature identified to support this hypothesis

Hypothesis evaluation:
- E1: Feature indicated as supporting hypothesis does not support that hypothesis because feature does not match
- E2: Feature indicated as refuting hypothesis does not refute that hypothesis because feature does not match
- E3: Feature indicated as supporting hypothesis does not support that hypothesis because one or more attribute value pairs do not match
- E4: Feature indicated as refuting hypothesis does not refute that hypothesis because one or more attribute value pairs do not match
- E5: Feature previously indicative of supporting hypothesis now does not support hypothesis because attribute value pairs have been added
- E6: Feature previously indicative of supporting hypothesis now does not support hypothesis because feature has been further specified within feature hierarchy
- E7: Feature previously indicative of refuting hypothesis now does not refute hypothesis because attribute value pairs have been added
- E8: Hypothesis previously supported by one feature is no longer supported by any feature because attribute value pairs have been added
- E9: Diagnosis does not fit with feature(s) found so far
- E10: Diagnosis fits with some features that have been identified but not other features that have been identified
- E11: Diagnosis now inconsistent with identified feature because new feature added
- E12: Diagnosis now inconsistent with identified feature because new attribute value pairs of feature added

Problem completion:
- C1: Student indicates problem done before all required subtasks are completed
Within each category, individual errors describe specific reasons for failure. For example, errors of identification occur when there is no match to an Identify-Feature node in the DSG. Ten specific errors are recognized in this category, relating to incorrect present and absent feature assertions, incorrect locations for these assertions, and incorrect viewer magnification when identifying features (Table 5).

When the student action does not match any valid-next-step, a 'failure' response returns from the DSG. A separate set of pedagogic PSMs is used to match the student actions to the error states described in the pedagogic model. For example, in the interaction scenario, when the student asserts mucin (Action 2.5 in Table 2) the DSG returns a 'failure' but the student actions match to Error I4 (Table 5), because an Identify-Feature node exists for absence of mucin, but the student has identified mucin as present. In contrast, when the student identifies epithelial necrosis (Action 2.20 in Table 2), the system matches the student actions to Error I1 (Table 5).
Figure 7 Example alert (respon
For each error state the pedagogic model maintains a response (alert) that can be delivered to the student. Alerts are composed of context-specific text with accompanying tutor actions. Context-specific text is generated with text templates. A simple markup language is used for insertion of context-specific information derived from the current structure of the DSG and/or the incorrect student action. Tutor actions are general kinds of non-text interventions delivered along with the text. They specify the general form of intervention, but do not indicate the specific target object on the interface side. For example, the system may flag an incorrect step in the diagrammatic reasoning interface. Like the text templates, tutor actions are made specific to context and interface object by insertion of values derived from the current structure of the DSG. For example, when the student incorrectly identifies mucin as present, the system delivers the text template for Error I4, using the value ‘mucin’ derived from the DSG node for this feature (Fig. 7). Additionally, the system flags the incorrect node by changing the behavior of the object to blinking, and the color of the object to red.
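The alert-generation step (template text plus tutor actions) can be sketched roughly as below. The template wording, the `string.Template` placeholder syntax, and the action tuples are illustrative assumptions, not the system's actual markup language:

```python
from string import Template

# Illustrative alert template for Error I4; wording approximates Fig. 7.
ALERTS = {"I4": Template("You have identified $feature as present, "
                         "but $feature is absent in this case.")}

def render_alert(error_code, dsg_node):
    """Fill the text template from the DSG node and pair it with
    generic tutor actions; the interface maps these to concrete
    objects (e.g. blinking the node and coloring it red)."""
    text = ALERTS[error_code].substitute(feature=dsg_node["feature"])
    tutor_actions = [("flag-node", dsg_node["feature"]),
                     ("set-behavior", "blink"),
                     ("set-color", "red")]
    return text, tutor_actions

text, actions = render_alert("I4", {"feature": "mucin"})
print(text)
```

The separation mirrors the paper's design: the text template and action list stay generic, and only the inserted DSG values tie them to a specific interface object.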
There are three extensions to the basic error logic:
• Ambiguous error states. In some cases, student actions match to multiple error states because there is ambiguity about which error the student has made. In this case, the system must determine which alert should be delivered. In the interaction scenario, the student incorrectly identifies eosinophil-rich inflammatory infiltrate (EII) at a particular location (Action 2.17 in Table 2). Eosinophil-rich inflammatory infiltrate is present elsewhere on the slide (Table 5, Error I2). But there is also another feature present in this location (nuclear dust) that has not yet been identified (Table 5, Error I3). Was the student correctly identifying EII as they traversed the slide, but incorrectly bounding the area in which it is seen, or was the student looking specifically at nuclear dust and incorrectly labeling this EII? The system cannot differentiate between these two conditions, but must assign blame to one of them in order to (1) explain the error, and (2) maintain the assessment of student skills needed for the student model. When multiple error states apply, the error state with the highest priority in the pedagogic task is selected by the system. In the case of Action 2.17 (Table 2), Error I2 has higher priority than Error I3, and therefore the I2 alert is delivered. Currently error state priorities are static and reflect the default pedagogic model. In the future, priorities will be dynamic and altered depending on the state of the student model.
• Recommendations. Some kinds of feedback do not fit well into the standard division of feedback as hints and errors. In the interaction scenario, when the student correctly identifies the quantity of EII as moderate, the system accepts the correct response (Action 2.19 in Table 2). Although this is the correct quantity in this case, the student should know that quantity may take a range (including the correct value) and still be consistent with the diagnosis that applies in this case. Additional recommendations are delivered separately to the interface to distinguish them from true errors, but may be concatenated when multiple recommendations apply.
• Alerts after correct actions. Because the expert model reasons with the student, some previous assertions may become inconsistent with additional evidence. In particular, this is true of supporting and refuting relationships. In the interaction scenario, when the student further refines neutrophil-rich inflammatory infiltrate, the system accepts the modification (Action 2.12 in Table 2). However, the existing supportive relationship to EM is no longer valid. The instructional layer alerts the student to this new inconsistency. Interactions of this kind are very difficult to construct with traditional cognitive tutor architectures because they typically provide feedback only for a single rule.
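The priority-based blame assignment used for ambiguous error states (the first extension above) amounts to a maximum over static priorities. A minimal sketch, with numeric priorities as illustrative stand-ins for the defaults in the pedagogic model:

```python
# Static priorities standing in for the default pedagogic model;
# a higher value wins. For Action 2.17, I2 outranks I3.
ERROR_PRIORITY = {"I2": 2, "I3": 1}

def select_error(matching_errors):
    """Assign blame to one of several matching error states by
    choosing the state with the highest pedagogic-task priority."""
    return max(matching_errors, key=lambda e: ERROR_PRIORITY[e])

print(select_error(["I3", "I2"]))  # -> I2, so the I2 alert is delivered
```

Making `ERROR_PRIORITY` a function of the student model, rather than a constant table, is exactly the future extension the paper describes.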
7.2. Requests for help
When the student does not know what to do next, they may request help from the tutoring system. Requests for help return a single best-next-step from the expert model. The type of best-next-step returned determines the hint delivered by the instructional layer, and therefore the hint is specific to the current problem state. As described in Section 6, the best-next-step is determined by the expert model. The best-next-step is dependent on (1) the state of the problem, and (2) the pedagogic task.
The state of the problem (Table 6) defines the valid-next-steps from which a best-next-step is selected. The pedagogic task determines whether any valid-next-step is required or not required (see Section 5), and only required valid-next-steps may be returned as best-next-steps. For example, the default state of the pedagogic task requires hypotheses. Thus, when all features have been identified and refined (State 5 in Table 6), the system suggests that the student assert hypotheses, because hypothesis is a required node. In the interaction scenario, this interchange is shown in Action 2.23 (Table 2). In future versions of the system, we will use this mechanism to change required nodes as expertise develops, allowing more advanced students to skip steps that were required early on. For example, we can then allow students to ‘‘jump’’ to the diagnosis after the region of interest has been seen (see Table 1, Design Requirement 1.7).
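A minimal sketch of this selection, assuming the DSG hands back valid-next-steps as records with a node type, and treating the pedagogic task as an ordered list of required node types (both representations are assumptions for illustration):

```python
def best_next_step(valid_next_steps, required_types):
    """Pick a best-next-step: keep only steps whose node type the
    pedagogic task marks as required, then prefer the type that
    comes earliest in the task's ordering."""
    order = {t: i for i, t in enumerate(required_types)}
    required = [s for s in valid_next_steps if s["type"] in order]
    return min(required, key=lambda s: order[s["type"]], default=None)

# State 5: features are done, so hypothesis and diagnosis steps remain.
steps = [{"type": "diagnosis", "name": "EM"},
         {"type": "hypothesis", "name": "EM"}]
print(best_next_step(steps, ["hypothesis", "diagnosis"]))
```

Changing `required_types` per student is the mechanism described above for letting advanced students skip steps: dropping a type from the list removes it from the hints without forbidding it as a valid-next-step.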
For each DSG node type, the pedagogic model maintains a set of responses (hints) that can be delivered to the student. Like alerts, hints are composed of context-specific text and accompanying tutor actions. But unlike alerts, hints are ordered lists, providing increasingly specific advice. Early hints provide general guidance, and later hints are more directive. In the interaction scenario, the student requests help before she has completely identified the evidence (Action 2.16 in Table 2), and progresses through the hints available for this problem state (Fig. 8). At first the system provides only the feature name, shows the student the area of the case in which the feature can be found, and encourages the student to try to find the feature on her own. In subsequent hints,
the system draws around the feature, and then tells the student the exact set of actions to take. These ‘hint hierarchies’ closely resemble the general structure of hints used in other cognitive tutors [8]. Students can continue to ask for hints until all hints in the hint hierarchy have been displayed.

Table 6 Problem states and corresponding hint sequences for the early intermediate student.

State 1 (student not in area of interest): direct student to area of interest.
State 2 (student in area of interest but has not yet started to identify features): advise student that they have found the area of interest and should look for features.
State 3 (more features can be identified): advise student that more features are present; move viewer to the area closest to the current field that contains a feature; draw around the area containing the feature; provide the name of the feature; show an annotated example of the feature (if available); describe the correct interface actions to identify the feature.
State 4 (feature has been identified but not fully refined with attribute-value pairs): advise student that the feature just identified has important qualities (attribute-value pairs) that should be indicated; provide the attribute name; provide the value name; describe the correct interface actions to assert qualities, and assist by opening the correct menu.
State 5 (all features have been identified and fully refined but no hypotheses have been asserted): advise student that all features have been found and hypotheses should now be considered.
State 6 (hypothesis consistent with all evidence has not been asserted): provide the hypothesis name and supporting evidence; describe the correct interface actions to assert the hypothesis.
State 7 (no remaining hypotheses consistent with all evidence): advise student that they are ready to make a diagnosis.
State 8 (all hypotheses consistent with all evidence have been asserted, but differential diagnosis does not yet contain all members): provide the name of the hypothesis to add to the differential diagnosis; describe the correct interface actions to assert the diagnosis.
State 9 (all hypotheses consistent with all evidence have been asserted, and differential diagnosis contains all members): indicate that the problem has been solved.
The pedagogic model defines the hint text and tutor actions that are delivered in specific problem states (Table 6). Hint text is generated with text templates and a simple markup language for insertion of context-specific information derived from the current structure of the DSG. Tutor actions correspond to general kinds of non-text interventions that are useful in particular problem states. They specify the set of actions to be taken on the interface side. For example, the system may draw around an interface object or open a menu. Tutor actions are made context-specific by insertion of values derived from the current structure of the DSG.
There are two extensions to the basic hint logic:
• Subgoal level and node hints. The instructional layer distinguishes between states in which the student has or has not previously attempted a particular kind of subgoal. When the student has not previously attempted a particular kind of
Figure 8 Example hint sequence (response to Action 2.16 in Table 2).
subgoal, that is, when there is no existing DSG node of the same node type as the best-next-step, the instructional layer delivers a subgoal-level hint, aimed at providing guidance about what kind of subgoal to attend to. For example, if the student begins the problem by asking for a hint, the expert model will suggest that the student look for areas that are abnormal, and try to identify features. In contrast, if the problem state contains a completed subgoal of the same node type as the best-next-step, the instructional layer delivers a subgoal node hint, aimed at providing guidance toward completing a given node. The hints shown in Fig. 8 represent a set of subgoal-node hints.
• Hints after errors. The pedagogic model maintains a separate set of responses to requests for help that is used when the previous student action generates an error. Action 2.3 in Table 2 shows an example of this type of hint. Unlike other hints, hints after errors contain only a single message and set of tutor actions.
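The choice between subgoal-level and subgoal-node hints in the first extension reduces to a check on the current DSG contents. A minimal sketch, with node-type names assumed:

```python
def hint_level(best_next_step_type, existing_node_types):
    """Choose a subgoal-level hint when the student has never
    completed a node of the best-next-step's type, and a
    subgoal-node hint otherwise."""
    if best_next_step_type not in existing_node_types:
        return "subgoal-level"  # e.g. the first hint of a problem
    return "subgoal-node"       # e.g. the sequence shown in Fig. 8

print(hint_level("Identify-Feature", set()))
print(hint_level("Identify-Feature", {"Identify-Feature"}))
```

The same best-next-step thus produces general orientation for a student who has never attempted that kind of subgoal, and targeted completion guidance for one who has.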
The errors detected and remediated (Table 5), flexibilities permitted (Table 7), and responses to help requests (Table 6) implement the instructional design requirements derived in response to our developmental model of expertise (Table 1).
8. Diagrammatic reasoning palettes for visual classification problem solving
Tutor interfaces combine an image viewer with a VCT diagrammatic reasoning palette used by students to construct a graphical representation of their reasoning. This palette acts as the shared medium between student and system, reifying the process of classification problem solving. Unlike many domains in which ITS have been used [4,6,41], classification problem solving in medicine has no formal notation. Therefore, an important aspect of our work is to create and evaluate possible notations for their effect on learning. One benefit of our architecture is that the separation of the TASK from DOMAIN and CASE permits significant latitude in the kinds of problem representations that the palettes can support.
For example, the VCT can currently be used with two different palettes to create significantly different interactions with students. The case-focused palette presents a local view of the problem, which is fundamentally constructivist. Fig. 9 shows the case-focused palette following Action 2.21 in the interaction scenario. Features and absent features appear as square boxes containing their attribute:value pairs. Hypotheses appear as separate rounded boxes, and may be connected to features using support and refute links. Hypotheses may be moved into the Diagnoses area of the palette when a diagnosis can be made (dependent on the state of the DSG and the student model). Only the features present in the actual case are represented, but any valid hypothesis can be added and tested. At the end of each case, the diagram shows the relationships present in a single case. These diagrams will be different for each case.

Table 7 Flexibilities and constraints of the ITS: requirements and implementation. Each entry gives the flexibility or constraint, the pedagogic design requirement (from Table 1), and the implementation of the instructional layer's interaction with the DSG.

1. Location of features must be indicated. Requirement: help students connect visual features to symbolic feature names (1.4). Implementation: the student-indicated location must match the location derived from the case and stored in the DSG evidence node for the feature.

2. Features can be added in any order, but hints provide the optimum order specified in the diagnostic algorithm. Requirement: allows more novice students to fully explore the feature space (1.6) but encourages more advanced students to search for features most efficiently (1.8). Implementation: any valid-next-step is allowed, but the best-next-step follows the most efficient sequence.

3. Attributes can be added in any order, but hints suggest refining a feature immediately after it is asserted. Requirement: encourage feature refinement at the time a feature is asserted to help more novice students learn the salient refinements (1.9). Implementation: any valid-next-step is allowed, but the best-next-step after feature identification is feature refinement.

4. Hints take students through all features before suggesting hypotheses, but only one supporting feature must be identified before a hypothesis can be asserted and even accepted as the diagnosis. If the diagnosis is made incorrect by addition or refinement of features, the student will later be required to revise. Requirement: encourage complete feature articulation among novices (1.5); permit hypotheses consistent with any one feature (1.12) in order to allow students to explore relationships to other hypotheses (1.17); allow students to jump to the diagnosis as long as the region of interest has been seen (1.7); help more advanced students learn when to exclude hypotheses as features are further refined (1.19). Implementation: best-next-steps complete feature identification before suggesting hypotheses; hypotheses can only be added to the DSG when preceded by one or more supporting features; the evidence cluster must support the relationship between all previously asserted features and the diagnosis.

5. Features do not need to be refined before a hypothesis can be asserted or accepted as the diagnosis. Requirement: allow more advanced students to reason without fully refined features (1.10); permit sequences in which new hypotheses require re-examination of feature refinement (1.11). Implementation: attribute-value nodes are not required in the default pedagogic model.

6. All diseases with feature specifications matching the case data must be asserted as hypotheses and diagnoses. Requirement: encourage hypotheses that are consistent with all of the features (1.14), and help students learn sets of hypotheses that share similar features (1.15). Implementation: the problem is not complete until all Diseases matching the FEATURE SPECIFICATION are asserted.
In contrast, the knowledge-focused palette (Fig. 10) presents a global view of the problem. The interface is algorithmic. Students see the diagnostic tree unfold as they work through the problem. Fig. 10 shows the knowledge-focused palette following Action 2.21 in the interaction scenario (same problem state as Fig. 9). Features and absent features appear as square boxes containing their attribute:value pairs. After attributes and values have been correctly added by the student, they are iconized as diamonds, but may be moused over to display their meaning. As features are added, they are connected to form a path toward the diagnoses. When students complete any level of the algorithm by correctly identifying and refining the feature, the tutor reifies all of the other possible choices at that level. The current path (all identified features) is shown in yellow to differentiate it from other paths. Hypotheses appear as separate
Figure 9 Case-focused diagrammatic reasoning palette (state following Action 2.21 in Table 2).
rounded boxes. When students make a hypothesis, the tutor places the hypothesis in the appropriate position on the diagnostic tree. When the hypothesis fits with the current evidence it is shown connected to the current path. When the hypothesis does not fit with the current evidence, it is shown connected to other paths with the content of the associated features and attributes hidden as boxes containing ‘?’ until students specifically request the identity of the feature or attribute. A pointer is always present to provide a cue to the best-next-step. In the problem state shown in Fig. 10, the pointer on the right-hand side of the diagram shows a rounded box containing ‘?H’, because all of the features have been identified and the best-next-step at this state is to make a hypothesis. The student can see that AEB can have five different FEATURE_SPECIFICATIONS (creating five different paths), including some that share features with the current case (blister, eosinophil-rich inflammatory infiltrate, neutrophil-rich inflammatory infiltrate). The student can also determine from the diagram that Acquired Epidermolysis Bullosa is not the diagnosis in the current case. By the conclusion of problem solving the entire diagnostic tree is available for exploration. The knowledge-focused palette therefore expresses relationships between features and hypotheses both within and across cases. Students can use the tree to compare between cases. At the end of each case, the diagram shows the same algorithm, but highlights the FEATURE_SPECIFICATION of the current case.
9. Advantages of the approach

The approach we describe is novel, and has significant advantages for designing medical tutoring systems. The advantages include:
• Use of a paradigm with proven effectiveness. An important aspect of this project is to test the feasibility of an established instructional method
Figure 10 Knowledge-focused diagrammatic reasoning palette (state following Action 2.21 in Table 2).
in a declarative-knowledge-rich domain. One advantage of the architecture that we propose is that it closely reproduces the intermediate feedback and goal-structuring of MTITS, a class of systems that have proven efficacy in other domains [3-6].
• Scalability, ease of maintenance, and reusability. The use of separate domain model and pedagogic model enhances scalability of the system because new domains or pedagogic content can be added by extending the knowledge bases in an ontology editing environment, without altering the code. The VCT domain model extends an existing ontology for classification problem solving [42], and therefore any other ontology that also uses this representation can be easily incorporated. The separate domain model is also reused to constrain case authoring.
• Support for forward and backward reasoning. The separation of domain task from domain model enhances the flexibility of the system to support more than one kind of student reasoning. The DSG can be constructed to use forward reasoning, backward reasoning, or a combination of both by minor alterations to the task model.
• Flexibility of the instructional layer. The separation of the instructional system from the expert model enhances instructional flexibility and individualization. Unlike most cognitive tutors, the instructional responses (hints and errors) are not additional clauses to an expert model rule. Rather, they are separate sets of declarative elements that are mapped to general problem states or error types. Instructional responses are easily changed both by the author when the instructional content is created, and by the system in choosing a particular response based on the state of the student model. Because the DSG maintains the problem-state as the solution advances, the instructional layer can use even more complex methods for individualization of instruction. For example, the typical immediate feedback cycle, in which each student action is answered by a tutor response, can be changed in
our system. The model could be permitted to advance over many student actions and potentially off the correct solution path for multiple steps. In this case the instructional layer could provide feedback at longer intervals, gradually fading the instructional scaffolding as students gain expertise.
10. Implementation of the VCT

The VCT is implemented as a client-server system, with a multi-agent architecture, written in the Java programming language. Agents communicate using Java Agent Services, a library that supports the Foundation for Intelligent Physical Agents (FIPA) standard [45]. The multi-component nature of the system permits replacement of individual components to create different domain-specific tutoring systems.
10.1. Server

The server includes the Tutor Engine, the Xippix Image Delivery System [46], and the Protocol Collection System for obtaining and storing detailed information regarding user-system interaction.
10.1.1. Tutor engine

All abstract PSMs are implemented in Jess, a Java production rule system [47,48]. Ontology classes for the VCT are created in Protégé-2000 [49,50]. Case information acquired through an authoring system (Section 12) is stored in Protégé. Domain and case Protégé projects share a third project containing the FEATURE, ATTRIBUTE, and VALUE primitives (Fig. 3). All Protégé classes and instances are converted to Jess templates and facts using a modification of JessTab [51]. The DSG exists as a set of Jess facts, which are utilized by the abstract PSMs of the instructional layer. For development purposes, we utilize an extension of JGraph [52] to visualize the DSG in Jess Working Memory. Hint and error message templates are stored in Protégé, and can be re-used or modified for different tutoring systems. Instances are created using a markup language that makes general templates context-specific. Instances may include pointers to interface objects, and actions to be performed alone or in combination with text feedback.
10.1.2. Image delivery system

The image delivery system [46] is a commercial application that permits large image files to be delivered in smaller pieces for zooming and panning to a Java image viewer.
10.1.3. Protocol collection system

Inter-agent messages containing time-stamped, low-level interface actions (such as image navigation and button presses), complete student actions (such as feature identification or hypothesis creation), and tutor responses are collected and stored in an Oracle 9i database for further analysis.
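The protocol stream can be pictured as a sequence of time-stamped records like the following; the field names and the in-memory list (standing in for the Oracle database) are assumptions for illustration:

```python
import time

protocol = []  # stands in for the Oracle 9i store

def log_event(agent, kind, payload):
    """Append one time-stamped protocol record."""
    protocol.append({"ts": time.time(), "agent": agent,
                     "kind": kind, "payload": payload})

# A low-level interface action followed by the tutor's response.
log_event("client", "interface-action", {"event": "zoom", "level": 4})
log_event("tutor", "response", {"alert": "I4"})
print(len(protocol))  # -> 2
```

Keeping low-level interface actions, complete student actions, and tutor responses in one ordered stream is what makes the later protocol analysis possible.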
10.2. Client

The VCT student client is a Java WebStart [53] application that communicates with the server via HTTP. The student client is implemented as a framework that can support a range of domain-specific interface elements. The client includes a Java image viewer that communicates directly with the Xippix server [46] to update the image, based on user requests. The client image viewer acts as a virtual microscope. Diagrammatic reasoning interfaces use SpaceTree [54,55] for tree-based selection (both interfaces