CRM and "The Emperor's New Clothes"
John A. Wise
Center for Aviation/Aerospace Research
Embry-Riddle Aeronautical University
Daytona Beach, Florida, USA
Presented at and in the Proceedings of:
The Third Global Flight Safety and Human Factors Symposium
Auckland, New Zealand
9-12 April 1996
The purpose of this paper is to ask the question "Is cockpit resource
management an effective training program for aircrews?" This paper
questions whether present programs can be shown to have a valid
measurable effect on aircrew performance. It is argued that current
evaluation methods (e.g., written exams, simulator evaluations) may not
validly measure the actual effects (if any) of CRM training on crew
behavior. Thus, any inferences made about changes in aircrew
performance based on the current evaluation methods may be
unsubstantiated and thus suspect.
Consider, if you will, the author's modification of an old children's fairy tale, The Emperor's New Clothes.
Once upon a time there was an airline executive who was very vain. Of course, for years his staff had been telling him how wonderful he was. It was not surprising that he had begun to believe it himself. And the surer the airline executive became that he was the most intelligent, most clever man in aviation, the less he liked to hear any criticism at all.
One day, the airline executive decided that he would like the airline's crews to work together as a team. "I need a program, something fitting an airline like ours," he said. "Something more important than anything ever used before." And because he wanted this program to be so special, the airline executive did not believe that any of the usual programs were good enough.
It so happened that visiting the corporation at that time were two psychologists who were looking for a chance to become rich. They presented themselves to the airline executive and announced that they could change the behavior of his crews using a training program that was unlike anything that had ever been used before. "It will be developed using state-of-the-art information," said one, "so advanced that only the truly intelligent can understand it. Ordinary people are too stupid to appreciate it at all." This sounded like just the thing to the airline executive and he asked them to set to work at once.
And so the program began. The two psychologists were given a room in headquarters in which to work and every day they asked for more money to buy the computers and simulators they needed. Of course, they did not buy anything at all, but stored up the money for themselves. The airline's flight crews were evaluated with questionnaires, role playing, and LOFT scenarios. "This is not the way our crews have been evaluated before," the executive said doubtfully. "But, Sir," replied the two psychologists, "to change crews' behaviors requires improved methods, as a man of your intelligence will appreciate."
"Of course, of course," said the airline executive hastily.
At last the day came for the initial briefing on the program. The two psychologists pretended to hold something up before the airline executive. "Isn't it wonderful?" they gushed. "We are particularly pleased with the computer based training. A real triumph, I'm sure you agree." The airline executive hesitated. He could see nothing at all. So he asked his staff's opinion. The staff was very frightened to appear stupid in front of the airline executive, although they could see nothing at all either. "It is quite beyond words," one said slowly. "I can't find expressions to do it justice," said another. At that the airline executive was even more uneasy. "I can't possibly appear more stupid than my staff," he thought to himself. Out loud he said, "Well, this is quite extraordinary. I can truthfully say that I have never seen anything like it!"
"Do you feel that the LOFT scenarios should be a little longer?" one of the staffers asked. "Perhaps just a trifle," said the executive. "And I wonder if the role playing exercise needs a little adjustment," said another staffer. The airline executive found it easier as he went on to think of things to say about a training program that he could not understand at all.
Several more meetings took place before the program was finally announced to be finished. The psychologists provided the airline executive with ample marketing material several days before he was supposed to brief his government's civil aviation agency. When they had finished, the two psychologists stood back and admired their work. "Stunning!" they cried. "Quite, quite remarkable. We only wish that we could stay for the briefing but, alas, we must fly out this very morning." Of course, the airline executive rewarded the two men handsomely and even gave them a bonus for on-time delivery before they left for their airline flight.
At last, with fanfare from multimedia and computer generated graphics, the airline executive made his presentation. As the airline executive began, there was a small silence. But no one wanted to appear more stupid than his friends. "I've never seen anything like it," cried one man. "Quite unique," said another. Soon everyone was cheering the airline executive's program.
But one young employee had not heard about the new program. In a loud, clear voice, he shouted, "Why hasn't the airline executive made any sense?" There was an awful silence. Then everyone began to laugh. "It's taken a youngster to show us up for the idiots we are," chuckled a man in the crowd. Soon the people were helpless with laughter and even the airline's staffers began to giggle behind their briefing handouts.
Only one person was not laughing. The poor airline executive was so embarrassed, he ran straight back to his headquarters, with neither a new program nor dignity.
The author does NOT want to imply that anyone in the CRM training effort is anything but honest and honorable. Nor does he intend to imply that the CRM community has purposely used inappropriate measures for personal gain. The author does feel it is necessary for someone to play the role of that naive young boy, and to force the aviation community to stop and ask itself some tough questions about CRM evaluation.
The desirability of flight crews working together as an effective team is intuitive. In the USA we would call it "motherhood" - i.e., something with which no one would disagree. So when crew coordination was identified as a potential contributor to safety problems, many government civil aviation agencies and airlines quickly developed training programs intended to mitigate the effects of those problems.
The author believes the problem is the way the industry has evaluated the effectiveness of CRM. CRM programs are intended to improve aircrew attention, crew coordination, stress management, attitude, and risk assessment. Traditional CRM programs consist of seminars, training videos, and workshops. A flight simulator evaluation and a written exam are then given to test the aircrews' response to the program. The results are interpreted to evaluate how well the aircrew would perform in future situations in the cockpit. The real problem, in the eyes of the author, is whether the above techniques are a valid measure of the degree of behavior change in the flight crews - i.e., does CRM training do what it claims to do? The author believes that while many people in aviation are raving about the latest CRM programs, there is a real possibility that the programs are "naked" and may not result in the desired changes.
Examples of Potential Weaknesses
Academic: If a student in an experimental psychology class were to use a questionnaire to study whether an experimental treatment changed a subject's behavior, that student would probably fail. Whether the experimental treatment changed a subject's behavior can only be determined by whether the behavior of the subject has changed. It is not sufficient to determine whether the subject thought his behavior changed, or the degree to which the subject thought that the experimenter wanted the subjects to believe that their behavior had changed. Neither is a valid measure of behavior change.
Therapy: Marriage counseling is a profession not unlike CRM (i.e., one of its goals is to help married couples who are having trouble in their marriage communicate and work together). One would not measure therapists' success by looking at their clients' responses to a questionnaire on current psychological theory! Nor would one expect the evaluation to be based on how well the couple could do in a role playing scenario (LOFT?). Rather, one would look to see how many of the couples were still living together after a certain number of years. Although the other information might be useful to the therapist in modifying her program of treatment, it would certainly not be the final measure of whether the treatment had been successful!
Testing problems: Another basic experimental design issue is whether the flight crews provide "real" answers or "official" answers. Pilots tend to be very "test wise." They have taken many tests to get where they are. They have all "lied" many times when they gave the "official" answer on tests rather than what they truly believed to be the case. For example, in the USA the "official" FAA doctrine on how one flies on instruments is very different from the way the military teaches instrument flying (the latter being a simpler technique). But even if one believes and uses the military approach, one must provide the "official" answer on the instrument written and oral exams if one expects to get a civilian instrument flight ticket.
Observation problems: Cockpit observation is likewise suspect. Direct observation always changes the observed person's behavior. No matter how friendly or knowledgeable the observer, or how much rapport is established, any data collected with direct, obtrusive observation are treated as suspect in any traditional experimental environment. As Heisenberg noted, observation always affects the results!
If the above is true, then why would one NOT expect pilots to give back the official company position on CRM when asked, independent of whether or not they believe it? It is often easier to nod in agreement than to fight official policy!
The author is not really sure of the best solution. He is certain, however, that the answer lies in measures of actual behavior and not in questionnaires or obtrusive observation. The basic question is, how does one get those direct measures of actual crew behavior?
At the current time the author would like to suggest that the answer may lie in the use of cockpit voice recorder data - with several important limits. The first is that the voice recorder data be given the same basic protection as that provided by the current incident reporting systems (i.e., it is totally confidential, stripped of all personal information, and results are reported only in generic form). The second limit would be that a relatively small and totally random sample of flights be analyzed every year - maybe three to four hundred flights. This would provide an adequate statistical sample, while keeping the odds extremely low that any one individual would be monitored (thus mitigating the effects of obtrusive observation on the crews' behavior).
The data would be shipped to an organization (e.g., a university) where it would be stored in a totally confidential way. The data would be analyzed by a team consisting of at least the following: a human factors person, a pilot, and a psycholinguist. The data would be analyzed for both "CRM appropriate" and "CRM inappropriate" utterances. The "CRM inappropriate" utterances could be categorized in a way that allows CRM programs to modify their training to address those weaknesses. One might even want to make these observations airline specific to take into account the differences in their various training programs.
While some have argued that the data should be used for identifying "CRM deviant" flight crew members or "CRM deficient" airlines, the author would argue that such identification is premature at this time. That is, if the above data demonstrate that current CRM training techniques are ineffective in changing behavior, then who should be blamed for faulty behavior - the crew member or the training staff? And if the training staff has not had adequate feedback on where their techniques are effective and where they are not, should the airline be held accountable? Until the training staff has had the feedback it needs to modify and improve its training, is it fair to punish the airline? Likewise, until it can be demonstrated that the behaviors of the vast majority of flight crews are actually in "compliance" with the airline's CRM policy, it would seem unfair to punish a crew member for alleged non-compliance, which may be due to faulty or ineffective training!
It is recognized that access to cockpit voice recordings is legally limited in many countries. However, if sufficient protections are established, like those currently accepted by pilots in the safety reporting systems, then it would seem reasonable to assume that pilots would be willing to agree to such analysis.
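The arithmetic behind the sampling limit proposed above can be made concrete with a short sketch. It is only an illustration under stated assumptions: the paper gives the sample size (three to four hundred flights) but not the size of the annual flight population or the number of flights a pilot flies, so the figures of 500,000 flights and 400 flights per pilot used below are hypothetical placeholders, and the function names are the editor's own. The sketch estimates the chance that a given flight, or any of a given pilot's flights, falls into a simple random sample, and the worst-case 95% margin of error for a proportion estimated at the flight level.

```python
import math


def prob_flight_sampled(sample_size: int, total_flights: int) -> float:
    """Chance that one particular flight lands in a simple random sample."""
    return sample_size / total_flights


def prob_pilot_sampled(sample_size: int, total_flights: int, pilot_flights: int) -> float:
    """Chance that at least one of a pilot's flights is sampled.

    Sampling is without replacement, so the chance that none of the
    pilot's k flights is drawn is the product over i = 0..k-1 of
    (N - n - i) / (N - i), a hypergeometric probability.
    """
    if not 0 <= sample_size <= total_flights:
        raise ValueError("sample_size must be between 0 and total_flights")
    if not 0 <= pilot_flights <= total_flights:
        raise ValueError("pilot_flights must be between 0 and total_flights")
    prob_none = 1.0
    for i in range(pilot_flights):
        prob_none *= (total_flights - sample_size - i) / (total_flights - i)
    return 1.0 - prob_none


def worst_case_margin(sample_size: int, z: float = 1.96) -> float:
    """Worst-case (p = 0.5) margin of error for a sampled proportion."""
    return z * math.sqrt(0.25 / sample_size)


if __name__ == "__main__":
    # HYPOTHETICAL inputs: only the sample size comes from the paper.
    n_sample = 350           # midpoint of "three to four hundred flights"
    n_flights = 500_000      # assumed annual flights in the population
    flights_per_pilot = 400  # assumed annual flights for one pilot

    print("P(flight sampled):", prob_flight_sampled(n_sample, n_flights))
    print("P(pilot sampled at least once):",
          prob_pilot_sampled(n_sample, n_flights, flights_per_pilot))
    print("Worst-case 95% margin of error:", worst_case_margin(n_sample))
```

Whether a sample of this size is "adequate" depends on how rare the behaviors of interest are and on whether results are broken out by airline, so the margin computed here should be read as a rough lower bound on the uncertainty, not as a validation of the proposal.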
Conclusion and Plea
In short, this paper is a plea for a concerted effort to improve the validity of CRM evaluation tools to at least the level that meets basic validity criteria used in normal behavioral science research associated with behavioral change. Until that is done, it is reasonably probable that the entire aviation CRM industry is walking around without any clothes!
The author would like to thank the Harris Corporation, Embry-Riddle Aeronautical University, the McDonnell Douglas Corporation, and CACI for their financial support, which allowed the author to attend the Symposium and present this paper. It must be noted, however, that the views expressed in this paper are those of the author, and do not necessarily reflect those of any of the sponsors.