Evaluation & assessment of EMP

This won’t hurt a bit: assessing English for Nursing

Dan Douglas

July 2019


There are a number of reasons people may not like to talk about language tests. Teachers sometimes find them irrelevant to their teaching and thus unfair to their students and inaccurate measures of what their students know and can do (Shohamy, 2007). For similar reasons, learners often find language tests irrelevant to their needs and thus unfair, stressful, and poor measures of what they know and can do with the language they have been learning (In’nami, 2006; Alderson & Clapham, 1995). Often, too, administrators find language tests somewhat mysterious and wonder what all the fuss is about regarding validity and reliability (Bachman, 2004). In this paper, I want to suggest that there is no reason why language tests, and particularly English for specific purposes tests, have to be painful, irrelevant, or mysterious. If a test is clearly relevant to the test takers’ needs and background and reflects what they have been studying, it will be less stressful for them and more appropriate in the eyes of teachers. If adequate documentation is provided about test objectives, development, and measurement qualities, administrators will have a better understanding and appreciation of what the test is measuring and how to interpret performance on it. We language teaching and assessment professionals have an ethical responsibility to ensure that the tests we make and use are as fair, accurate, relevant, and transparent as possible for the test takers and score users.

The ultimate purpose of the paper is to argue that tests of English for specific purposes should adhere to an overriding criterion of fairness – fairness to the test takers, to teachers, to educational programs, and finally to the societies in which the tests operate. This point is in line with current thinking in language testing generally, that ethical language testing is a much-needed focus in assessment, one that has been neglected for too long. In this paper I will discuss principles of assessment in ESP, going back to such basic questions as the following:

  • Why is ESP testing necessary?
  • What makes ESP tests specific?
  • What is English for Specific Purposes?
  • What problems are associated with ESP testing?

Why is ESP Testing Necessary?

Over the years, language specialists have made the following assertions with respect to the field of ESP testing: 1) specific purpose language proficiency is really just general purpose language proficiency with technical vocabulary thrown in; 2) specific purpose language tests are not necessary since, if we test general language knowledge, specific uses will take care of themselves; 3) specific purpose language tests are unreliable and invalid since subject knowledge interferes with the measurement of language knowledge; 4) specific purpose language tests cannot be justified theoretically; and 5) specific purpose language testing is impossible anyway, since the logical end of specificity is a test for one person at one point in time (Davies, 1990; Alderson & Urquhart, 1985; O’Neill et al., 2007; Read & Wette, 2009). I will argue in this paper that these assertions are not true, that there is a theoretical justification for ESP, that ESP is different from general purpose language, that language knowledge and specific purpose background knowledge are both part of the ESP construct, and that specific purpose language testing is not only possible but necessary.

We often think of two broad purposes for ESP tests: Academic purposes, to determine whether applicants have enough control over the target language to succeed in academic studies, and occupational or professional purposes, to determine whether job applicants or employees can carry out necessary functions in the target language. There are two reasons why practitioners take the trouble to create ESP tests even though there are plenty of more general purpose tests easily available:

  • Reason One: Language performances vary with context, and
  • Reason Two: Specific purpose language is precise.

Language performances vary with context. Nurses use language differently when they are working directly with patients than when in an operating theatre. In the former situation, they use less technical language, more one-on-one social interaction, more interrogatives and directives. In theatre, nurses use more technical language, more receptive language, fewer well-formed sentences. Thus, we adapt the language we use to the communicative situation we are in. This is in part what specific purpose language is about (Douglas, 2000).

Specific purpose language is precise. Consider the following dialogue (Beare, 2011):

Nurse: Good morning, Ms Adams. How are you doing today?

Patient: Horrible! I can’t eat anything! I just feel sick to my stomach. Take the tray away.

Nurse: That’s too bad. I’ll just put this over here for now. Have you felt queasy for very long?

Patient: I woke up during the middle of the night. I couldn’t get back to sleep, and now I feel terrible.

Nurse: Have you been to the bathroom? Any diarrhea or vomiting?

Patient: I’ve been twice, but no diarrhea or vomiting. Maybe I should drink something. Can I have a cup of tea?

Nurse: No problem, I’ll get you a cup right away. Would you like black tea or peppermint tea?

Patient: Peppermint, please.

Adapted from Beare (2011)

The expressions felt queasy, been to the bathroom, diarrhea, vomiting, and black tea or peppermint tea are examples of precise, though not necessarily technical, specific purpose language. Bathroom is a particularly American expression – Americans find the word “toilet” somehow a bit too earthy, I think. Although these expressions may not strike the reader especially as examples of the type of language we usually associate with specific purpose English, they do represent a desire on the part of the speaker to be precise. In this context, precise means the use of language which, while accurate and exact, is clear and likely to be understood by the listener. Even the offer of two kinds of tea is an attempt on the nurse’s part to make sure the patient will not be surprised by the flavor of the tea when it arrives.

Specific purpose language tests are also necessary from a pedagogical point of view. They are fairer than non-test evaluations since all test takers are given the same instructions, input, and scoring criteria. They are more relevant to the learners than general language tests because ESP tests are based on an analysis of the target language use situation and so reflect the actual communicative needs of people in that context, and they reflect the learning content and style in the ESP course. Finally, ESP tests are more accurate than non-test assessments in giving test takers numerous opportunities to demonstrate what they can do in relevant situations. Thus, ESP test scores can be interpreted as evidence of communicative ability in a target language use situation.

What Makes ESP Tests Specific?

Authenticity and Knowledge make ESP tests what they are. Authenticity of task means that test tasks share features with target language use tasks: in our case, the work of nursing (Douglas, 2000). The interaction between language knowledge and specific purpose content knowledge means that both types of knowledge are necessary components of ESP tests. Authenticity refers to the degree to which a learning activity mirrors real-life language use, but as Widdowson (1979) put it:

It is probably better to consider authenticity not as a quality residing in instances of language but as a quality which is bestowed upon them, created by the response of the receiver. Authenticity in this view is a function of the interaction between the reader/hearer and the text which incorporates the intentions of the writer/speaker… Authenticity has to do with appropriate response (p. 66).

Widdowson also made a distinction between genuine and authentic:

Genuineness is a property of written and spoken texts and refers to the origin of the text in an actual communicative situation.

Authenticity, on the other hand, is a perception by language users that communication reflects real life language use and is thus not a linguistic property but a cognitive one.

Consider the following dialogue between a nurse and a patient (Shahady, 2008):

Nurse: Good morning! I see you are in for your annual physical. Do you have any concerns about your health?

Charles: No, I’m feeling pretty good.

Nurse: Would you be willing to take a few minutes together to talk about your health and weight?

Charles: I guess so.

Nurse: How do you feel about your weight?

Charles: I know I could stand to lose a few pounds. My wife nags me about it every day!

Nurse: She’s probably just concerned about your health. Right now your body mass index, or BMI, is 30.1. A healthy BMI is below 25. Also, your waist circumference is 41 inches. We consider a healthy waist circumference something less than 40 inches. Your current BMI and waist circumference put you at risk to develop conditions that I see run in your family, like diabetes and heart disease. What do you think about this?

Charles: It sounds like I have some work to do. I’ve watched my brother deal with diabetes and it doesn’t look like much fun. How much weight do I need to lose?

Nurse: Any weight you lose will get you closer to a healthy weight. Have you ever tried anything to get to a healthier weight?

Charles: My wife tries to get me to eat salad and vegetables, but I’m more of a meat and potatoes guy.

This is a genuine dialogue, transcribed from an actual conversation between a nurse and a patient. Given its origin in a real clinic, it is also an authentic dialogue. Several features give the dialogue its authenticity:

  • Setting: Examining room furnishings, antiseptic smells
  • Participants: Patient, Nurse
  • Purpose: Discussing health concerns
  • Content: Health, weight, diet
  • Tone: Polite, professional
  • Language: Standard English, some medical terminology
  • Norms of Interaction: Patient-provider, knowledgeable to less knowledgeable
  • Genre: “Medical consultation”

If we were to use this dialogue in a language lesson, however, we might choose to focus on certain linguistic and pragmatic aspects of it. For example, the patient says his wife “nags” him about his weight. How does the nurse respond to this? Why? What does body mass index mean? The nurse asks the patient how he feels about his weight and about the risk of diabetes and heart disease. Why? What is the meaning of “run in your family”? In discussing these aspects of the dialogue, we are still working with a genuine text, but I would argue that we lose some authenticity, because a number of its aspects change in the classroom or test:

  • Setting: Classroom furnishings and atmosphere
  • Participants: Learners, teacher
  • Purpose: Language acquisition
  • Content: Vocabulary, syntax, pragmatics, culture
  • Tone: Didactic, communicative
  • Language: Standard English, some Medical English
  • Norms of Interaction: Indirect, teacher-student
  • Genre: “Language lesson”

So although we may use genuine materials in our teaching and testing, authenticity will not necessarily follow unless we try to build it into our lessons and tests. We could do this, for example, by furnishing a corner of the classroom to resemble a medical examining room, bringing in some medical equipment such as a stethoscope, weighing scale, and examining table; we could give the person playing the role of the “nurse” a white coat; and we could perhaps even spray some antiseptic around (Douglas, 2000).

It is usual, too, to distinguish between carrier content and real content in ESP materials (Dudley-Evans & St John, 1998). For example, in the writing task illustrated in the next section, based on case notes, the case notes are the carrier content and are probably genuine, or slightly edited, taken from actual medical records. However, in the ESP classroom or test, the real content is language related: vocabulary, comprehension, written rhetorical conventions, and professional communication between nurses. This distinction helps make it clear why authenticity is so important in ESP teaching and assessment: simply using genuine input material does not guarantee that learners/test takers will perceive the tasks we set for them as representative of communication in the target language use situation.

Interaction between Language and Content Knowledge

There are several pieces of information you need to know in order to purchase medicine at a pharmacy. You probably need to know whether the particular medicine requires a prescription or not, what your symptoms are, and perhaps some possible alternatives. Just knowing the language will not get you very far in communicating in specific purpose situations: you also need to have relevant background or content knowledge. An example writing task illustrates the need for content knowledge in ESP testing (OET Writing Subtest, 2009):


Patient History: Maria Ortiz is a seven-day-old baby. Her mother has been discharged from the maternity hospital.
Social History: Mother Violetta Ortiz (Mrs), DOB 07/08/1980. Husband Jose, 36 years.
Occupation: security guard (night shift).
Other children: Sam, 5 years (currently not attending school), Teresa, 3 years.
Accommodation: Two-bedroom flat (rented).

Nursing Notes: Normal birth. Breast fed. Mother anxious about coping with 3 children. Baby sleepy; reluctant to feed. Baby’s weight: Birth – 3010g. Discharge – 3020g. Father unable to assist with children (night work). Mother very tired. No car; 20-minute walk to shops. Discharged from hospital, 10 April 2010.


Using the information in the case notes, write a letter of referral to the maternal and child health nurse who will provide follow-up care in this case: Ms Josie Hext, Maternal and Child Health Centre, 133 Elm Grove, Oldmeadows.

In your answer: Expand the relevant case notes into complete sentences. Do not use note form. Use correct letter format. The body of the letter should be approximately 180-200 words.

OET Writing Subtest 2009

It seems to me that this task requires field-specific background knowledge. First, to understand the input text, the test taker must know the meaning of DOB, the significance of the birth and discharge weights of the baby, the meaning of letter of referral, and “correct” referral letter format. Second, to complete the task, the test taker must be able to judge what is “relevant” in the case notes. Thus, in order to carry out this task, the test taker has to use both language knowledge and content knowledge, which makes this task a defining example of an ESP assessment.

What is English for Specific Purposes?

First, there is the question of whose English we are talking about. The field of World English calls into question not only the notion of whether Standard English exists or not, but also of what it means to be a native speaker of English (World English, 2004). Secondly, we need to know what we mean by Specific. When we talk about English for Nursing, we might mean that used for Licensed Practical Nursing, Nursing Practice, Travel Nursing, Oncology Nursing, Operating Room Nursing, Cardiac Nursing, Radiology Nursing, Nursing Education, Private Duty Nursing, Disabilities Nursing, Gynecology Nursing, Forensic Nursing, Critical Care Nursing, Clinical Nursing, Nursing Home Nursing, Ambulatory Care Nursing, Gastrointestinal Nursing, Pediatric Nursing, or Anesthetic Nursing (Nursing Guide, 2010). How different is the English needed for each of these specializations? Finally, we need to understand what we mean by Purpose. There is no such thing, of course, as English for no purpose. What, then, is General English? I would argue that the traditional distinction between general English and English for specific purposes is no longer tenable since all language teachers these days seek to provide learners with an ability to solve on their own the profusion of communication problems they will encounter when they leave the language learning classroom.

With regard to the notion of specificity, I have argued that rather than talking about specific purpose or general English as if they are dichotomous, it is better to think of a continuum of specificity, with something like “English conversation” at the more general end, and something like “English for cardiac nursing” at the more specific end, as shown in the figure below:

[figure 1]

English for Specific Purposes has long been categorized into English for Academic Purposes and English for Occupational Purposes. The latter has been categorized into Vocational and Professional Purposes, and each of these consists of both pre- and in-service programs, as shown in the figure below:

[figure 2]

Finally, ESP is not a type of language, teaching material, or method, but rather, since its inception, has been an approach to language teaching/learning based on why learners need to learn English and designed to meet specific learner needs. Its content and methodology are derived from specific disciplines, occupations, and activities, and may be restricted in scope. ESP is usually goal directed, taught for a limited time period, to homogeneous groups of learners (Douglas, 2000; Grove & Brown, 2001; Hutchinson & Waters, 1987; Robinson, 1991; Strevens, 1988). Although these references are fairly old, they are not out of date, and the issue of defining and refining the concept of specific purpose language teaching is an ongoing and current task for practitioners (Douglas, Forthcoming).

ESP tests are based on an analysis of a target language use situation, usually known as needs analysis. Needs analysis techniques include Register Analysis, focusing on technical and sub-technical vocabulary and grammatical features; Discourse Analysis, focusing on specific language forms associated with various types of language use; and Learning Needs Analysis, focusing on the end product, learning skills for independent learning and language use (see Long, 2005, for a discussion and examples of needs analysis).

To conclude this section, my definition of a specific purpose language test is as follows:

A specific purpose language test is one in which content and methods are derived from an analysis of a specific purpose target language use situation so that test tasks and content are authentically representative of tasks in the target situation, allowing for an interaction between the test taker’s language ability and specific purpose content knowledge, on the one hand, and the test tasks on the other, and allowing us to make inferences about a test taker’s capacity to use language in the specific purpose domain.
Douglas 2000, p.19

What are Some Problems Associated with Specific Purpose Language Testing?

The first difficulty with specific purpose language testing is determining where specific purpose language ability resides. Briefly, it resides in our brains: language ability is a cognitive construct which cannot be observed directly and can only be predicted or inferred. The situation is like facing a black box that produces a certain outcome and asking what is inside it.

[figure 3]

We can observe the situational, linguistic, and background knowledge content of the input; we can observe the output; we must infer the nature of the language ability that produced the output (Douglas, 2010). We can make inferences about decontextualized language ability, for example, “Can write a letter using correct spelling, syntax, and punctuation.” Here there is no reference to situation or content knowledge. However, in ESP assessment, we wish to make inferences about language ability and specific purpose background knowledge in the target language use context: “Can write a letter of referral, including appropriate medical information.” This task implies that the test taker will need knowledge of the situation in which a letter of referral is needed as well as the knowledge of what such a letter should contain. We may need to disentangle language and background knowledge in cases of non-experts, such as trainee nurses, for example. The test tasks should reflect the level of technical expertise the learners have at the time of testing.

With regard to the inferences and decisions we might want to make on the basis of ESP test performance, McNamara (1996) has distinguished between a strong and a weak performance hypothesis. If we adopt the strong hypothesis about what we may infer from test performance, we might say something like “This nurse will be able to communicate with physicians, other nurses, and patients while working in a Critical Care facility.” Note that this inference refers to what the nurse will be able to do without reference to his or her personality, state of anxiety, or level of knowledge. Adopting the weaker hypothesis, however, we might say something like “This nurse can use both technical medical English and colloquial English appropriate to the context of working in a Critical Care facility.” This inference focuses on language ability in a context of use, though again without reference to personality or knowledge.

McNamara has stated that he preferred the second of the two since it does not require us to deal either with job performance or specific purpose content knowledge, which he saw as unrelated to language ability.

I would suggest a middle ground, however. We, as language teachers and testers, should indeed make inferences about language ability, not job performance, but specific purpose background knowledge is a part of the ESP construct. Therefore, when we make inferences about the level of ESP language knowledge a test taker possesses, we are also inferring some level of specific purpose content knowledge. This is not easy since it requires us to know something about the field of nursing, not just the field of English language, but I think it is part of what it means to be an ESP teacher or tester.


The procedures and approaches discussed in this paper will make the ESP tests we develop and administer less painful for our learners, more relevant and useful for teachers, and more accessible and interpretable for administrators. Less painful because they will help ensure that the tasks and content of ESP tests are based on the needs and expectations of the test takers; more useful because the inferences and decisions based on test performance will be more applicable to teaching and materials design; and more accessible and interpretable because the results will be more clearly related to eventual job performance in vocational contexts.

The language in ESP tests should be directly related to that used in the target situation, and the tasks in ESP tests should be adapted directly from the target situation (e.g., Grove & Brown, 2001). Both the language and the tasks in ESP tests should also reflect the English in the specific purpose syllabus and methodology. Thus ESP tests will be more relevant and more motivating for test takers and also make the inferences and decisions we make about learners, based on test performance, more accurate and fair. English tests for specific purposes, like having a blood sample taken, can never be entirely painless – they are, after all, tasks that require learners to put forth their best efforts – but they need not be agonizing for test takers if they are clearly relevant to their needs and expectations and are based on the learning that has taken place prior to the testing. We have an ethical responsibility to make sure the tests we use and develop are as fair, relevant, and accurate as we can make them.

Reprinted with kind permission of Dan Douglas, Professor Emeritus, TESL/Applied Linguistics Program, English Department, Iowa State University.


Since this paper is based on a keynote talk presented at the 2010 ESP International Symposium, in Kaohsiung, Taiwan, focusing on ESP for Nursing, many of the examples will be from the field of nursing. However, I hope the principles I advocate will be applicable to ESP practitioners in all fields and languages for specific purposes.
Readers will note that many of the citations in this paper are more than 10 years old. This is because there has been relatively little research over the years on the topic of assessment in ESP for nursing or other areas of medical English. I have cited the most recent and relevant work in this paper.
