Flight Examiners’ Methods 1
Flight Examiners’ Methods of Ascertaining Pilot Proficiency Wolff-Michael Roth¹,²
¹University of Victoria; ²Griffith University
Abstract There are no studies about how flight examiners (check captains) think while
assessing line pilots during flight training and examination. In this study, 23 flight examiners
from 5 regional airlines were observed or interviewed in three contexts: (a) surrounding the
assessment of line pilots in simulator sessions; (b) stimulated recall concerning the assessment of
pilots; and (c) modified think-aloud protocols during assessment of flight episodes. The data
reveal that flight examiners use the documentary method, a mundane form of reasoning, where
observations are treated as evidence of underlying phenomena while presupposing these
phenomena to categorize the observations. Although this method of making sense of pilot
performance is marked by uncertainty and vagueness, different mathematical approaches are
proposed that have the potential to model this form of reasoning or its results. Possible
implications for theory and the practice and training of flight examiners are provided.
Keywords Debriefing · assessment · cognition · cognitive anthropology · thought process
A recently accredited flight examiner during stimulated recall: Probably your biggest fear
is having to fail someone. That’s why people say, “Your worst one always is your first
one.” . . . Becoming an examiner is like going for your first solo or your first commercial
nav. “Well done, you’ve got the job.” (D2)
A first officer during a post-debriefing interview: You’re always trying to remember
[the flight examiner’s] way of doing it. You’ve just got to remember to tick their boxes
and you’re okay. You’re also trying to remember the way they’re thinking. And if you get
back in to that then you don’t get so many comments.
A flight examiner is an authorized check pilot, who has taken on some of the duties of a
flight inspector on behalf of the regulatory authority (e.g., CAA-NZ, 2013; Transport Canada,
2013). Flight examiners are airline captains who serve their airline while remaining responsible to the
regulatory authority, and who assess the competencies of pilots against a national standard for
continued accreditation purposes and during type-rating. In the first introductory quotation, a
recently accredited flight examiner talks about becoming a flight examiner with rather little
experience or training in how to do the job, or in how to assess in such a way that the fears associated with
having to fail a pilot are lessened by the ability to ground the assessment in evidence. Such
comments concern the conceptual shifts individuals have had to make when becoming flight
examiners. Perhaps unsurprisingly, conversations with airline training and standards managers
reveal their eagerness to find out more about how flight examiners think for the purpose of using
such information in their training of new flight examiners, an increasing number of whom are
needed because of the rapid expansion some airlines experience.
In the second introductory quotation, an experienced first officer (52 years of age, 30 years as a
commercial pilot, 9,000 flight hours) talks about what really matters when he undergoes
examination following an operational competency assessment. First, he suggests that a pilot has
to remember how the particular examiner flies, which is the implicit referent for his assessment;
and a pilot also has to remember the way in which flight examiners are thinking, that is, the
processes of their thinking that lead them to make assessments and influence recommendations
for improvement or training. But how do flight examiners think? What evidence do they seek and
use to arrive at an assessment, and what are their thinking processes?
Currently there are few studies; and those that can be found have been conducted in constrained
settings using pre-recorded video (e.g., Roth, Mavin, & Munro, 2014a) rather than observing
flight examiners at work.
Early work on pilot assessment investigated assessment in terms of measurement models and
focused on outcomes (e.g., Flin et al., 2003; Holt, Hansberger, & Boehm-Davis, 2002;
O’Connor, Hörmann, Flin, Lodge, & Goeters, 2002), where the measurements sometimes are
supported and enhanced by automated tools (e.g., Deaton et al., 2007; Johnston, Rushby, &
Maclean, 2000). More recent studies focus on the nature of the evidence that flight examiners
use in support of their ratings (e.g., Roth & Mavin, 2014). None of these studies investigate how
flight examiners think, the method or methods that they use to arrive at statements about the
proficiency, knowledge, skills, or states of pilots. This study, grounded in the cognitive
anthropology of work, was designed to investigate flight examiners’ methods. The ultimate goal
of this work is to construct a basis for the training and professional development of flight
examiners.
There exists considerable research in how experts from a variety of fields think, including
clinical reasoning of medical experts (Boshuizen & Schmidt, 1992), instructional designers
The documentary method of interpretation as described in the literature is based on three
levels of sense: objective sense, expressive sense, and documentary sense. However, the
expressive sense pertains to a social actor’s intentions that cannot be objectively obtained. In the
present study, inferences that the flight examiners make about pilots’ intentions, therefore, are
treated as special cases of the documentary sense.
Objective sense. The objective sense of a situation refers to what different observers can
actually see and agree upon: their facts or their evidence. Flight examiners identify
indisputable facts, for example, that a pilot has (not) pushed the go-around button, what the
precise speed is (e.g., 145 knots, white bug + 10), or what the torque gauge reads. In the
following example from a debriefing, the flight examiner lists a set of observations that
constitute his objective sense, the facts used in the assessment of the pilots.
You were at 34 or something like that. Not a long way out. Told ATC. Made a PA. And then
came back . . . to around the 240 to 250 indicated mark until we got below 10. And then we were
sitting at 235 knots. (B3)¹
The list includes where the pilots had made the decision to turn around (35 on the distance
measuring equipment [DME]) and that there was an exchange with air traffic control followed by
a public announcement. They were flying at a speed between 240 and 250 knots until they were
less than 10 nautical miles on the DME, at which point they were flying at 235 knots.
Such lists do not in themselves constitute an evaluation but are (a) used as manifestations of
underlying intentions and (b) treated as the manifestations (documentary evidence) of one or
more underlying, not directly observable forms of knowledge (aircraft or standard operating procedures),
skill (manipulative, communicative, management, or decision-making), or state (e.g.,
situational awareness). Although more imprecise and fuzzy, non-technical areas are described in
terms of concrete, observable evidence. For example, a flight examiner described the slow
responses of a first officer observable in the cockpit, during debriefings, and during regular
conversations, which he suggested could be verified by the interviewer (see Table 3).
[The first officer’s] execution of procedures is, it’s slow. «First officer’s» response to everything,
and you’ll find this when you talk with him, is a very slow response . . . there’s considerable
delay and then you get a response. And you generally get the right response. (B1)
««««« Insert Table 3 about here »»»»»
Documentary sense. Over the course of a session, the flight examiners build up a
whole/holistic sense of pilots (“a generic fix”), the evaluation of their skills, as per the
documentary evidence that is indicative of and explains the actual performance. For example, in
an icing condition during a single-engine approach, the pilots had not entered the correct speeds
in their landing speed card. All speeds (VREF, VAPP, VAC, and VFS) should have been the same.
This observation contributed to the flight examiner’s sense that the pilots had poor time
management and, as a result, forgot to enter the speeds as per operating procedures (“It comes
back to managing your time and what you actually want to achieve” [E1]).
¹ For cross-referencing purposes with Table 1, participant ID is given at the end of the transcription (e.g., “B3,” where “B” refers to airline B). Square brackets (i.e., [. . .]) enclose descriptive information; chevrons (i.e., «. . .») enclose replacements for proper names of persons, cities, and airports.
Three categories of idealizations can be identified in the data: non/proficiency (pass/fail),
(non/technical) skills and knowledge (e.g., handling, decision-making, management,
communication), and states/processes (e.g., situational awareness, thinking). All of these
idealizations are mundane and uncontroversial cultural objects denoted by the language shared
within the aviation community or specific airline (e.g., Mavin & Roth, 2014). These
idealizations, which are taken to be underlying the observed performance, are not directly
accessible. Instead, the word or language used denotes a sense that arises with observations.
Flight examiners might say, for example, “I had this gut feeling that something’s not right. Whether it was
body language or something I’d seen, I’m not sure. But something didn’t sit with me” (D4). This
“gut-level sense,” which often begins with the flight examiners’ observations of pilot behavior
during the briefing preceding the simulator session, subsequently is worked out in terms of the
evidence, as B3 said in the above quotation, “We are using specific examples to cover a generic
fix.”
Mutual determination of objective and documentary sense. Flight examiners are tasked
with an assessment of pilots’ competency (proficiency) levels, holistically or, as in airlines B,
D, and E, in terms of ratings of a set of human factors. Any idealization given in the sense is
based on what flight examiners actually observe, the objective sense of the situation (facts, actual
performance), which is taken to be a manifestation (document) of what by nature is
unobservable. There is therefore a reflexive relationship between concrete observations and
idealization of the underlying reality (phenomenon): the former lead to the emergence of the
latter, but the latter explains the presence of the former. In the following example recorded
during a debriefing session, the flight examiner justifies a passing grade to a worried first officer
who has had some performance problems in the past.
You maintained situational awareness and were able to make the airplane follow the correct flight
path at all times. The decisions that you had to make today were easy ones today . . . considered
all the points, and I saw a little bit of evidence of that early on in your decision to divert to
«airport 1». You said, “Okay, we need engineering and we need runway length,” which then kind
of, that’s «airport 2» out the way. And obviously «airport 1» was the closest place. So I saw clear
evidence that you were actually diagnosing the situation and making sure that you considered all
the facts that you need to consider to generate the options, which enabled you to make your
decision. (B3)
In this explanation, the maintenance of a correct flight path is evidence for the pilot’s
situational awareness, the overt consideration of requirements for a diversion is evidence for
decision-making/diagnosing, and the active selection of an appropriate alternate airport is
evidence for decision-making/option generation. The state of the derived situational awareness
becomes a master concept, which is both evidenced in observable behaviors and performances
and explains these. The factual evidence determines the examiner’s sense that the pilot has
satisfactory decision-making skills (here options and diagnosis dimensions), and the satisfactory
decision-making skills explain the observed performance. This holistic sense in turn mediates
what flight examiners are looking for, and, therefore, what they collect as data and the intentions
that they take to be expressed in the objective facts. The relationship between an evolving
documentary sense and the objective sense of the situation can be seen at work in the thoughts of
a flight examiner from an airline using an explicit human factors model, the Model of
Assessment of Pilot Performance (MAPP):
If I see something go wrong, then I sit there myself going, “Rightyo.” I then, as you say, visualize
the MAPP and go, “Okay, well where’s this fit in it? Did they lose situational awareness? ((Points
to item on visual MAPP model)) No. Okay, well what else could it have been? Well they flew the
aircraft within tolerances ((Points to item on MAPP)). Decisions? ((Points to item on MAPP))
Yep, they decided to go to «airport». Right call.” And I start trying to cross bits off and then
narrow it down myself. So reckon it’s management of the crew. (E1)
The conceptual tool (MAPP) provides the flight examiners with a way of mapping some
observable expression, a manifestation, to a presumed underlying performance or skill
(idealization).
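The cross-off procedure that E1 describes can be read as a simple process of elimination. The following is a minimal sketch for illustration only: the category labels are taken loosely from the quotation and are not the official MAPP item list, and the function name `narrow_down` is hypothetical.

```python
# Illustrative sketch of E1's "cross bits off and narrow it down" reasoning.
# The category labels are assumptions drawn from the quotation, not the
# official MAPP item list.
MAPP_CATEGORIES = [
    "situational awareness",
    "flying within tolerances",
    "decisions",
    "management of the crew",
]


def narrow_down(categories, crossed_off):
    """Return the categories not yet crossed off, in their original order."""
    return [c for c in categories if c not in crossed_off]


# Categories for which the examiner judged the observed performance acceptable.
crossed_off = {"situational awareness", "flying within tolerances", "decisions"}

remaining = narrow_down(MAPP_CATEGORIES, crossed_off)
print(remaining)  # ['management of the crew']
```

The sketch captures only the elimination step; it leaves out the examiner's judgment about whether each observation counts as acceptable, which is where the documentary sense does its work.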
How Flight Examiners Evolve the Objective Sense
To arrive at conclusions about pilots’ proficiency levels, knowledge, (non/technical) skills, or
states (situational awareness, thinking), flight examiners require documentary evidence on which
to base their assessment. In the simulator, they observe and generally take some notes in real
time without time out or recourse to revisiting an event. These notes are used both for
establishing the record of the flight performance observed and for the debriefing, where they
describe to the pilots what they have done for the purpose of critique or praise. In this subsection,
we report findings concerning the process of establishing the documentary evidence for flight
examiners’ conclusions.
Flight examiners differ in which facts and how many facts they identify. The assessments
are based on documentary evidence. One might ask whether flight examiners identify the same
kind and number of facts. This is difficult to establish in the context of regular examinations but
can easily be done when, as in the present case, the same flight segment is evaluated with the
possibility to repeatedly replay the segment. Whereas there is little debate about facts once they
are articulated (e.g., “the calls were non-standard at the bottom of the approach” [A5]), the
modified think-aloud protocols, which control for the assessment situation, show that there is variation
between the flight examiner pairs in whether a fact is actually noted and therefore taken into
account in the assessment (Table 4). There tends to be no debate about what the standard
operating procedures say and whether a pilot action is consistent or inconsistent with these. For
example, only two of six flight examiner pairs noticed that the pilot in the scenario did not push
the go-around button, the first step specified for a go-around in the standard operating procedure.
Pushing the button disengages the autopilot, which otherwise, by means of the flight director, continues to direct the pilot
downward in the approach rather than upward. Because this step is missing in the
kinetic sequence of the cockpit as a whole (e.g., Roth, Mavin, & Munro, 2014b), the procedure
that follows is “messy” (A3, A4, A5, B3, C1, D2, D3, D4), “untidy” (B3, D4, D6), or otherwise
deemed inappropriate. But the origin of the messy procedure is not apparent to four of the
examiner pairs. Three pairs noted that the captain in the scenario “was flying against the bars,”
that is, had a positive rate of climb whereas the command bars directed him to head down. Two
of these pairs identified the missing engagement of the go-around procedure—pushing the go-
around button—as the source of this divergence. Finally, only three pairs noted the crucial fact
that passengers were evacuated on the side of the running engine after landing with a fire on the
other engine (Table 4). That is, facts about instruments, actuators, and observable performance
constitute a baseline that is relatively undisputed.
««««« Insert Table 4 about here »»»»»
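The counts reported in the preceding paragraph can be expressed as simple noticing rates. The sketch below uses only the numbers stated in the text (six examiner pairs, of which two, three, and three noted the respective facts). The dictionary keys and the function name `noticing_rate` are hypothetical labels introduced here, and Table 4 itself is not reproduced.

```python
from fractions import Fraction

TOTAL_PAIRS = 6  # number of flight examiner pairs, as stated in the text

# Number of pairs that noted each fact, as reported in the text.
# The keys are shorthand labels introduced for this sketch.
pairs_noting = {
    "go-around button not pushed": 2,
    "flying against the bars": 3,
    "evacuation on side of running engine": 3,
}


def noticing_rate(noted, total=TOTAL_PAIRS):
    """Fraction of examiner pairs that noted a given fact."""
    return Fraction(noted, total)


for fact, noted in pairs_noting.items():
    rate = noticing_rate(noted)
    print(f"{fact}: {noted}/{TOTAL_PAIRS} pairs ({float(rate):.0%})")
```

Exact fractions are used so that a rate such as two of six stays one third rather than a rounded decimal.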
There is considerable variation in terms of the total number of facts articulated and taken into
consideration when flight examiners articulate the evidence on which they base their assessment
decisions. In the context of assessing in simulator sessions, flight examiners take notes of what
they observe. But the extent of these notes varies widely. We therefore investigated the number
of facts holding constant the event to be assessed. Thus, in the scenario with the inappropriate
evacuation, the flight examiner pairs made salient and took into account different facts and different numbers thereof
(from 1 to 9) (Table 5). However, all those pairs who noted the evacuation on the side of the
running engine failed both captain and first officer, whereas those who did not passed both.
««««« Insert Table 5 about here »»»»»
Flight examiners tend to be aware of the limitations of their evidence. Flight examiners
tend to be aware of the limitations of the documentary evidence that they obtain. They often find
out in the discussion with the pilots that they have missed something (e.g., “I must admit I didn’t
actually notice at the time too. It was only a bit later when I went oh, what’s going on here?”
[E1]); or report themselves having missed something (e.g., “I failed to note the point when the
autopilot was turned off” [B1]). In part, situations in which the flight examiners do not
notice important facts arise while they take notes (“I don’t see with heads down” [A3]). While
having their heads down to write down observations, they are actually missing other potentially
relevant flight-related facts. As a result, examiners find themselves in situations where their own
observations and those that pilots report differ. This is frequently made explicit in the training of new
flight examiners: “they teach us to try not to get yourself in that situation, because it’s quite, a bit
sort of, you know, he said, she said. I said, they said.”
Flight examiners noted the inherent contradiction in their task: To get the documentary
evidence that they need to support a pass/fail decision or their assessment of underlying skill
levels, they need to record their observations. But in the process of recording such notes,
they miss out on observing flight relevant actions. The flight examiners from airline B explicitly
focus on observation while taking the scantest of notes (1–2 pages, some 15 observations). They
subsequently review their notes and what they remember in addition, pulling together all of the
information to arrive at an overall assessment as well as at assessments of categories of
performance, some of which may require special attention.
I think keeping the notes is actually the thing that’s distracting. I find myself starting to note
something down, I’ll see something else that’s happening and so I’ll stop what I’m doing, take
note of what’s happening and then I forget what I was writing down in the first place. And that’s
lost. Sometimes. That’s a bit of a pain. But you still get the overall picture. (B3)
The flight examiners in the other airlines, too, tend to take brief rather than extended notes
(up to 5 pages for a 4-hour session). These notes in themselves are insufficient as a repertoire of
facts (“I keep my notes pretty short, so if you read them they probably wouldn’t make a lot of
sense to you. But it’s just a few words to jog my memory” [D3]). Instead, these notes trigger
(episodic) memory and allow examiners to bring back what happened and those facts that they
are using in the assessment. What is important to the flight examiners is the overall picture,
which is more important than a complete tally of all facts.
The conflict is mitigated to some extent for those flight examiners who have access to a
debriefing tool. This tool records the entire simulator session and includes a videotape of pilots,
shows what pilots view, and features representations of instruments and actuators. The debriefing
tool allows flight examiners to mark simulator events for subsequent replay in the debriefing.
The process of going from observation to assessment is mirrored in the use of the debriefing tool.
Thus, a flight examiner was observed marking for replay 21 events during a 4-hour simulator
session. However, he would not actually play all of these and instead focus on four. The total
number of marked events gives him a selection to work from. In the end, as the overall picture
emerges, the flight examiner then selects those that he deems most valuable in terms of
triggering learning. In airline D, the marking process has been adapted to their performance
model (MAPP) such that the flight examiners can now mark events according to the agreed-upon
performance categories (e.g., knowledge, communication, decision-making). Even with the
debriefing tool, reviewing one or more sequences for the purpose of getting all the facts may be (but
does not have to be) prohibitive in terms of time available and returns for the investment.
Flight examiners engage in targeted evidence collection. In some airlines, records on the
preceding examination are kept. Individual flight examiners might keep their notes or remember
having assessed individual pilots repeatedly. In both types of cases, flight examiners use the
records or their memory to look for documentary evidence to support statements about whether
or not a pilot has improved: “If he’s still having problems with his engine failure after take-off
then we might have to dig a little bit deeper in to it. And it just helps us tell whether something
that you see is random or systematic” (B3).
Important here is that flight examiners and training managers want to see whether a particular
(poor) performance is recurrent rather than a one-off in the actual performance. Sometimes flight
examiners and training managers choose events such that the evidence required in support of their
documentary sense is produced. This evidence then is used to teach the pilot a particular lesson:
“We know what areas they need to improve in and so sometimes, I have to confess, I would
introduce a malfunction at a difficult time for them to handle so that you can use it as a lesson”
(B3). Across a flight simulator session, flight examiners look for multiple pieces of evidence to
support their assessment of an underlying factor. Thus, in most real examination cases and in
contrast to evaluating brief video scenarios (e.g., Roth et al., 2014a), it is not the performance in
one individual situation that determines the assessment. Instead, the flight examiners build their
case based on the overall performance during the simulator session. In the following quotation,
the flight examiner supports his rating of 3 (satisfactory) rather than 4 (good) on the technical
skill of flying the aircraft within limits: in one instance during a non-directional beacon
(NDB) approach, the aircraft was at the lower limit, which was taken as an indication that the
flight path management was problematic. But for the remainder of the examination, the pilot had
kept the aircraft well within the required limits.
Because we’re looking at a whole, you know, 2-hour, 3-hour session. And for example, «first
officer» got a three for flight path within limits on that exercise. Had it not been for the NDB
approach and the circling . . . he was only just fast enough. And so his flight path management for
the rest of the session was actually quite good ((i.e., rating = 4)). But that dragged it down. So it
was kind of holistic. (B2)
On rare occasions, an examination session is organized to have another flight examiner
provide an independent assessment. In such cases, the examiners use specific events to collect
evidence on the particular issues that the preceding examination/s had identified. They then
obtain the observation that makes the overall decision go one or the other way (“And as soon as I
sort of delved in to that area, it was like, right, that’s black and white” [D4]).
Flight examiners’ selection of events limits the types of facts that they anticipate to
observe. In the situation of the think-aloud protocols, flight examiners were confronted with
brief segments of flights without any knowledge of the context. The situation on the job is different
because flight examiners do not identify arbitrary facts. Instead, having programmed the events,
they have readied themselves to observe specific facts that are associated with this type of event.
Moreover, in a particular examination cycle, all pilots fly the same line-oriented flight segments
and do the same spot checks. From delays in required actions, they anticipate workload to
increase and pilots to come under time pressure, which results in a loss in the awareness of the
situation as a whole. That is, flight examiners’ perception is configured by the choice and timing
of events. It also allows them to anticipate facts related to particular human factors areas that are
more salient than others. Each event has a set of challenges, or “boxes,” and the flight examiner
observes whether or not “they ticked every box” (D3) and “how well they do” (“The session has
actually got a little bit of stop start in it . . . to tick the boxes . . . so you see some slips and errors
that you wouldn’t normally see” [C2]).
Flight examiners use repeat performance to increase the amount of documentary
evidence. The account provided so far may sound as if flight examiners do their work in an
unprincipled manner. But this is not so. To make their cases for the presence of particular levels
of proficiency, knowledge, skill, or state, flight examiners require documentary evidence. They
do not take a single instant as a case for proficiency, knowledge, skill, or state. This is especially
important to them in those cases where the observed performance requires them to make a
pass/fail decision.
At that stage I hadn’t failed him; but I hadn’t passed him either. I was sitting there thinking,
“We’ve got a 4-hour session here; we’ll see how the rest goes.” Depending on how the rest goes,
we’ll need to come back and look at that. (D4)
Repeat observations. Flight examiners build their cases as they go along, taking their
observations as evidence that stands for some performance and the level thereof, which in turn stand for
the underlying skill. For some, this mapping occurs immediately (e.g., supported by the
conceptual model), whereas others may wait. As the session progresses “things may change”
because something else might become more important: “So I do have all those thoughts while I
am going through, but when I come out of the sim, I ask myself, ‘What is the main, the big issue
here?’” (B3). Flight examiners then make one or a series of observations that determines their
decision:
So the guy next to him started managing it. And to me that was the point then when I had had two
individual exercises that weren’t managed well. So I was like, “rightyo, we’ve got issues here.”
And by this stage I had already decided, you know, he’s not going to pass today. (D4)
In another case, a flight examiner takes the fact that the first officer moves correctly through
the list of actions stated in the standard operating procedures as evidence for the presence of the
underlying knowledge. However, these steps occurred in the same manner across situations
(“He’s actually leveled out twice now and hasn’t pulled the power levers up”). There were two
observations where the first officer had leveled out without pushing the power levers forward. In
each situation the observation is evidence pointing to a performance problem. This performance
problem occurs across situations, and, in this, is consistent with an underlying skill issue. That
concern for manipulative ability is more serious than the concern with assertiveness, for which
the flight examiner has had evidence that it can be fixed. One of the observations he has made is
good performance when the captain has been asked to fake incapacitation (by means of a
shoulder tap during the simulator exercise). In this situation, the first officer has performed well
(“stepped up . . . because he didn’t have to deal with the person next to him” [B1]).
Observations during “Repeats.” When performances are problematic, potentially pointing to
underlying problems, flight examiners ask for a repeat of a situation, segment, or exercise to
collect further documentary evidence that allows them to get “a better fix” on a factor of interest.
Repeats provide further information about the proficiency, knowledge, skill, or state
underlying the pilot’s performance. If the pilot/s perform sufficiently well during the repeated
exercise, then this provides the flight examiner with evidence that there was an issue with the
particular performance rather than with the underlying dimension.
We did the exact same exercise again and he made the same mistake. And then just went through
the whole session making individual management mistakes. So someone like that, it’s actually
quite black and white. (D4)
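One way to make the distinction between a random slip and a systematic issue concrete is a simple Bayesian update, in which each repeated exercise shifts the plausibility of an underlying skill problem. This is a hedged sketch and not necessarily the model proposed in this article: the prior and both error probabilities are invented for illustration, and the function name `update_systematic` is hypothetical.

```python
def update_systematic(prior, outcomes, p_err_systematic=0.8, p_err_random=0.2):
    """Posterior probability that errors reflect an underlying skill issue.

    prior: initial probability of a systematic issue (assumed value)
    outcomes: iterable of booleans, True if the pilot erred on an exercise
    p_err_systematic: assumed chance of an error if the issue is systematic
    p_err_random: assumed chance of an error if it was a one-off slip
    """
    p_sys, p_rand = prior, 1.0 - prior
    for erred in outcomes:
        # Weight each hypothesis by the likelihood of the observed outcome.
        p_sys *= p_err_systematic if erred else 1.0 - p_err_systematic
        p_rand *= p_err_random if erred else 1.0 - p_err_random
    # Normalize so the two hypotheses sum to one.
    return p_sys / (p_sys + p_rand)


# Same mistake on the original exercise and again on the repeat.
same_mistake_twice = update_systematic(0.5, [True, True])

# Mistake on the original exercise, clean performance on the repeat.
clean_repeat = update_systematic(0.5, [True, False])

print(round(same_mistake_twice, 3), round(clean_repeat, 3))
```

Under these assumed numbers, a repeated mistake pushes the probability of a systematic issue well above the prior, whereas a clean repeat returns it to the prior. This mirrors the examiners' use of repeats to decide whether a problem lies in the particular performance or in the underlying dimension.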
The flight examiner provides documentary evidence for the fact that an underlying ability is
present but occluded in the performance. Thus, talking about a first officer, a flight examiner
suggested, “And that was the case with «first officer» on a couple of occasions where, for
example, with the briefing the wrong flap setting for landing and briefing the wrong speed. He
knew it, he just hadn’t realized it” (B3).
Flight examiners do not seek to ascertain the nature of evidence even when technology
affords it. With the debriefing tool, flight examiners do have the possibility to replay some event
and to ascertain the nature and number of facts (documentary evidence). But nowhere in the
present dataset does a flight examiner use or talk about using the debriefing tool to check an
observation. When an observation was checked, it was always because of discrepancies between
the flight examiner’s and a pilot’s description of what was the case.
How Flight Examiners Develop and Articulate Their Documentary Sense
The documentary method of interpretation is a common, everyday method for determining
some assumed underlying pattern that also explains the observation—e.g., for finding out what
someone thinks, for a coroner to determine the course of events that led to a death, or for a
historian to describe the worldview of an era (Garfinkel, 1967; Mannheim, 2004). But it is precisely
because it is an everyday method that it is so powerful: the method not only helps in making sense
but also intuitively makes sense. Flight examiners employ the documentary method of
interpretation to determine whether pilots are non/proficient (pass/fail), what their non/technical
knowledge and skills are, or the pilots’ situational awareness. All of these are phenomena that are not
given in themselves: they are cultural constructs that are held to manifest themselves in observables.
These constructs therefore exist only in and as documentary sense.
Viewing the same scenarios, flight examiners evolve different patterns taken to underlie
performance given in documentary sense. Previous studies suggest considerable variation in
the ratings of pilots and flight examiners asked to assess the same video (Flin et al., 2003; Mavin
et al., 2013). Such variation is also observed here in the form of different appreciations of the
proficiency or non-proficiency of a pilot. Thus, the six flight examiner pairs did not come to
complete agreement on the level of performance of the six pilots they assessed in the
think-aloud part of this study: no two pairs had the same ratings across the six pilots (Table 6).
That is, even in a condition where flight examiners work in pairs such that individual subjectivity
is minimized, different conclusions are observed. What previous studies have not explained are
the reasons for such variations.
««««« Insert Table 6 about here »»»»»
The flight examiners do not have access to the knowledge and skills underlying performance
or to a pilot’s grasp of the situation (i.e., situational awareness). Here, as in the case of whether
to pass or fail the pilot, they use the documentary method of interpretation. Because the cultural
objects are not given directly but indirectly through the manner in which they manifest
themselves and because flight examiners differ in the contents of their observations, as shown
above, the differences in flight examiners’ overall documentary sense become intelligible. The
flight examiners are most concerned with overall proficiency, which they tend to ascertain by
means of the question whether they would want themselves or their family members and friends
to be a passenger on the aircraft flown by that pilot. If the response is yes, then the pilot passes; if
no, the pilot fails. This is so independent of the root causes—i.e., human factors—attributed to
non/proficiency:
And at the end of the day, I don’t actually think it matters that much what you call it, as long as
you call it something. And you can say, “Look, what I did notice during this exercise, I know you
know this stuff, but you just couldn’t recall it.” (B3)
The documentary sense begins with an indeterminate feel that articulates itself over time
into a more grounded sense. Flight examiners observe pilots over the course of a four-hour
period and then make their assessment. But their sense of how a pilot is doing emerges early,
often during the initial encounter in the briefing preceding the simulator session but certainly as
soon as the session begins:
I guess one thing I’m thinking in a 4-hour session though is I’ve got no hurry to make up my
decision. You know, you do, and of all the times in the past where I’ve had to not pass someone,
there’s always some stage during the session where I’ve gone, “No, they haven’t passed, they
haven’t failed.” And then at the end you might say, “Well you need to come up with a result.”
(D4)
The beginning tends to be very general and generic, without much description of concrete (objective)
evidence (e.g., “I’ve found the FO is very introverted and he’s either intimidated by
the simulator or he’s intimidated by the process” [A1], “He’s a plodder” [B1], “They are going
reasonably well” [A3], or “They’re getting through okay” [A4]). Sometimes flight examiners
note that their sense begins with observations of pilots’ “body language” (A4, C1, D3, D5). The
overall sense of whether a pilot is proficient or not, while evolving over the entire simulator
session, may start as soon as the session begins (e.g., “So the big picture actually developed over
the whole session. It might start as soon as you walk in” [C3]). This is so because there are
training sessions preceding the actual examination. Their observations during these sessions
configure the flight examiners’ sense at the beginning of the examination session:
And we spent three days leading up to it . . . so it wasn’t just a one off sort of day. However, over
those three days I’d sort of continued to work at all these things and we were making progress.
And my thought was, well he’s going to get through. It’s not going to be a great pass, but as long
as he keeps improving he’ll be fine. (D3)
As the examination session evolves, there is an increasing fixation of the general sense,
deriving from the increasing amount of concrete evidence available that can be used in the
documentation of the case ultimately made. Oftentimes flight examiners say with hindsight that
the problem has shown up from the beginning—e.g., in the body language of the pilot.
The evolving documentary sense shapes subsequent observations. When there is some
event, it cannot be known whether the problematic performance will be recurrent. It is only with
hindsight, after events have repeated, that flight examiners attribute the problem to
some inherent shortcoming in the pilot, which leads to a fail rating. There is therefore path
dependence in the evolving overall sense concerning non/proficiency.
If they appear flustered, straight away, quite often at that point in time, they’ll say the wrong
thing, they’ll say, “Unscheduled feather” when it’s really a prop over speed. That sort of thing’s
quite common. And so usually if they’re going to start making mistakes it’s going to be a poor
performance, it starts happening quite early on. (B3)
Flight examiners seek further evidence (implicitly or explicitly) to confirm or disconfirm the
current documentary sense. As the modified think-aloud protocols reveal, in the attempt to locate
specific facts, flight examiners tend to find more negative evidence. In none of the 20 fail or pass
with marker cases was there an evolution from a more negative to a more positive sense. Instead,
flight examiners either began or moved to a more negative sense concerning a pilot’s
performance. Thus, the two examiner pairs that noted the emergency evacuation into the running
engine while viewing the scenario had the definitive sense that it was a fail. One
flight examiner pair, during one of the repeated viewings, noted the running engine, which led to
the reversal of their earlier sense of a good performance (pass) to a definitive fail.
By means of the documentary sense flight examiners evolve an entire explanatory
framework. Together with the general sense of overall proficiency that flight examiners evolve
based on their observations also arises an explanatory framework. Their observations contribute
to the emergence of a sense (e.g., level of situational awareness), which then explains the fact
observed. The associated idealization (e.g., situational awareness) might then be explained by
something else that is based on evidence (e.g., [workload] management). Thus, the fact that a
pilot delayed some task might be taken as evidence that there are problems with management,
which may have high workload as its consequence, which in turn lowers situation awareness.
That is, taken together all the explanatory terms that flight examiners use in their assessment