The National Board of Medical Examiners and the Federation of State Medical Boards began administering the Step 2 Clinical Skills Examination (CSE) on July 1, 2004.1 I had no strong feelings about the exam until I took it the following October. My experience revealed important limitations to the exam that, though difficult or impossible to measure, are central to understanding and improving its validity and place in medical education.
According to the United States Medical Licensing Examination (USMLE),2 the CSE evaluates three components: spoken English proficiency, communication and interpersonal skills, and “integrated clinical encounter.” This last category comprises gathering relevant information from the history and physical examination and documenting it in a patient note. Physicians evaluate examinees’ written notes, and standardized patients score performance in the exam rooms using checklists to record examinees’ behavior.
My exam went smoothly, but I felt an abiding sense of irony because the entire enterprise felt much more artificial than I had anticipated. Put bluntly, the standardized patients pretended to be ill, and I pretended to interview and examine them. I was conscious throughout not only that my every move and written comment were being carefully scrutinized and compared to unknown metrics (How many dermatomes must I test to perform a “complete” neurological examination? How many pauses should I allow to convey sufficient empathy?) but also that the exam environment profoundly changed my attitude and the way I interacted with the standardized patients. I was not motivated by concern for the ill person before me or by a genuine curiosity and desire to identify and treat his disease. I was motivated by the need to pass an examination upon which my career depended. I consciously feigned concern while structuring histories and physical examinations to maximize my score.
Neither, of course, had the standardized patients come to me with a frightening experience of illness and a desire to be healed; they came because they were paid to mimic as carefully as possible someone who had. In addition, these were among the most “well-behaved” patients I had ever seen. They were without exception articulate, intelligent, cooperative, and fluent in English. None was morbidly obese or plagued by multiple comorbidities. I could count each patient’s medicines on one hand; their medical and surgical histories were astonishingly brief. I also noticed that standardized patients often answered slightly different questions than the ones I asked, as if they were reciting a script. When I left the examination, a standardized patient shared the elevator with me, lunch box in hand, chatting with a colleague after another day on the job.
Apart from some routine testing anxiety, I felt comfortable interacting with the standardized patients, but each encounter clearly involved mutual pretense. I performed physical examinations without looking for abnormalities. I knew, for example, that no one would have a severe heart murmur or papilledema. So I listened at the four proper cardiac positions but did not take the time to characterize faint murmurs in detail. With careful attention I can visualize a patient’s optic fundi about half of the time; during the CSE I merely went through the motions. My abdominal exams were cursory and gentle, because the instructional video had warned not to palpate standardized patients too deeply.
Standardized patients have been used and studied extensively in medical education, and a large literature supports their value as educational tools.3 I do not believe the CSE is worthless. I do urge, however, that students and USMLE administrators think beyond the quantitative data on the CSE and carefully consider what the test can and cannot evaluate.
The CSE’s ability to evaluate English proficiency, communication, and the integrated clinical encounter depends on questionable presuppositions about knowledge and clinical judgment. The test’s format presumes that actors can adequately mimic an ill patient with sufficient coaching and that medical students can adequately imitate the examination of an ill patient by examining these actors. The exam equates an examinee’s performance with his observable actions, reflecting the belief that with enough effort knowledge can be made wholly explicit and formalized. My interactions with standardized patients were superficially indistinguishable from my interactions with real patients. Yet the thoughts I had during the CSE not only differed from those I have when examining real patients but also affected my actions in important ways. Acting natural for a standardized patient is very different from being natural with a real one.
Physician-philosopher Michael Polanyi has described how knowledge resistant to formal analysis, which he calls tacit knowledge, undergirds all explicit knowledge and plays a fundamental role in human knowing and doing.4 Tacit knowledge is most recognizable in complex tasks such as physical examination. The explicit knowledge necessary for a direct funduscopic exam comprises a few simple instructions. Yet funduscopic exams are not simple; reliable success requires learning from hundreds of attempts. Experienced examiners cannot quantify and describe their skill for easy transmission to novices, because funduscopic exams require a wealth of tacit knowledge (continuous fine-motor adjustments, familiarity with the retinal topography) at which explicit instructions can only vaguely hint. Physicians attend from their tacit knowledge to the explicit aims of the fundus exam. Tacit knowledge is not limited to mechanical skills. It underlies all knowledge and so is central to communication, diagnosis, and treatment.5,6
A detailed discussion of Polanyi’s philosophy is impossible here, but Polanyi makes two important points. First, tacit knowledge means that we always “know more than we can tell”7; second, the process of attending from tacit knowledge to explicit knowledge implies that our actions are inescapably connected to our thoughts and motives. Communication requires not just attention to someone’s movements and utterances, but a sympathetic effort to understand and be understanding of them as meaningful, intentional human beings. Standardized patients complete checklists not by simply observing examinees, but by relying on tacit awareness of their actions as clues for understanding the meanings and intentions these actions convey.
The relationship between tacit and explicit knowledge helps to clarify whether the CSE adequately evaluates its three components. The exam tests English proficiency well. As native English speakers, standardized patients possess the tacit knowledge needed to assess accurately examinees’ ability to convey meaning through the English language.
Evaluation of the second area, interpersonal communications, depends on two problematic assumptions. First, by relying on checklists the CSE’s structure incorrectly implies that realistic patient-physician interactions are independent of the motivations that create the clinical encounter. The CSE’s artificial nature creates a perceptual frame8 or accent of reality9 very different from the genuine article, so passing the CSE’s communication component requires merely good standardized patient communication skills, although good patient communication skills may help. Second, the exam assumes, for example, that recorded eye contact and head nodding are good surrogates for communication and empathy. Perhaps they are in general, but physicians may communicate very well without head nodding, just as they can nod without really listening to their patients.
Whether the CSE effectively evaluates communication skills depends partly on the kinds of checklists standardized patients use. If checklists count pregnant pauses and measure eye contact, then the CSE measures acting skills rather than communication. If, on the other hand, standardized patients are asked to rate how well examinees convey information or establish rapport, the CSE may be able to approximate evaluation of communication skills in the clinical setting. The actual USMLE scoring algorithm is unavailable, but several review books aimed at medical students and the standardized patient literature10 suggest that the CSE uses a combination of these two approaches.
Finally, the CSE probably cannot evaluate medical students’ abilities to gather data in the clinical encounter. Even the best standardized patients lack medical training and so are recording behavior they do not fully understand. Just as only skilled divers are qualified to judge Olympic diving competitions, only trained physicians are competent to evaluate skills unique to medicine. A physician supplements formal criteria with the tacit appreciation of details required to evaluate medical students’ actions and the intentions behind them. Even in an artificial environment, a physician could distinguish a thorough cardiac examination from a rushed one and an adequate abdominal examination from perfunctory motions. A physician could also recognize acceptable variations in approach to the history or physical that a standardized patient might interpret as inadequate. There is a vast difference between faculty’s judging students’ actual clinical encounters and standardized patients’ recording students’ artificial behavior, and studies have found little correlation between the two.11 Although physicians do evaluate examinees’ written notes, an examination using laypeople to evaluate make-believe clinical scenarios does not assess the skills listed under the USMLE’s “integrated clinical encounter” rubric.
The CSE’s checklist format reduces costs and improves reliability, but it will decrease the educational value of standardized patients if it is replicated in medical school curricula.12 An earlier version of the CSE using real patients and physician evaluators was discarded as unreliable,13 so the National Board of Medical Examiners has taken great pains to show that the current CSE is a reliable instrument.1,14 It has legitimate reasons for valuing statistical reliability over verisimilitude in a national licensing examination but should also recognize that the current test format has little relevance to actual clinical skills.
Even the best standardized patient is no substitute for direct faculty observation at the bedside,15 so the National Board of Medical Examiners’ effort to increase direct observation by requiring the CSE for licensure is laudable. This requirement is already changing medical education; 96% of US medical school curricula now include evaluation components that use standardized patients.16 The examination’s broader effect, however, depends on two factors: whether faculty recognize tacit knowledge as important in medicine and whether they recognize that standardized patient encounters are by their nature highly contrived experiences. Evaluating real student-patient interactions implicitly recognizes tacit knowledge and is a valid test of students’ clinical skills; the current CSE recognizes only explicit knowledge and is a reliable test of students’ acting ability. To the extent that the new CSE promotes evaluation and teaching of real medical skills, it may improve medical education and physician quality. To the extent that it focuses medical schools’ attention on examination results, it will not only distract faculty from clinical teaching but also signal to medical students that standardized skills are acceptable substitutes for the real thing.
Funding/Support: This work was supported by the Vanderbilt Medical Scholars Program and MO1 RR00095.
Acknowledgment: I am indebted to Elizabeth Heitman, PhD, for her thoughtful critique and editorial advice.