English for Specific Purposes 25 (2006) 235–256
www.elsevier.com/locate/esp
doi:10.1016/j.esp.2005.05.002
Engineering English: A lexical frequency instructional model
Olga Mudraya
Department of Linguistics and Modern English Language, Lancaster University, Lancaster LA1 4YT, UK
Abstract
This paper argues for the integration of the lexical approach with a data-driven corpus-based
methodology in English teaching for technical students, particularly students of Engineering.
It presents the findings of the author's computer-aided research, which aimed to
establish a frequency-based corpus of student engineering lexis. The Student Engineering
English Corpus (SEEC), reported here, contains nearly 2,000,000 running words reduced to 1200
word families or 9000 word-types encountered in engineering textbooks that are compulsory
for all engineering students, regardless of their fields of specialization.
The most immediate implication arising from this research is that sub-technical vocabulary
as well as Academic English should be given more attention in the ESP classroom. The paper
illustrates some sample data-driven instructional activities consistent with the lexical
approach, in order to help students acquire the so-called language prefabs, or formulaic
multi-word units/collocations, for technical and non-technical uses. The integration of the
lexical approach with a corpus linguistic methodology can enrich the learners' language
experience and raise their language awareness, bringing out the researcher in them.
© 2005 The American University. Published by Elsevier Ltd. All rights reserved.
1. Introduction
In recent years, corpus linguistics has come together with language teaching by
recognizing the importance of language corpora as a basis for acquiring facts about
the language to be learned and sharing a larger, “chunkier” view of language (Johns,
1991; McEnery & Wilson, 1997; Murison-Bowie, 1996). The availability of language
corpora to language learners and teachers offers promising opportunities in learning
a language, allowing learners to set up and carry out their own language analyses
with the help of computer concordancing programs that are aimed at identifying
collocations, or word partnerships, in which certain words co-occur in natural text with
greater than random frequency.
The lexical approach to language teaching and learning (Lewis, 1993; Nattinger &
DeCarrico, 1992; Willis, 1990; overview in Moudraia, 2001) is similarly directed at
teaching collocations. It makes a particular distinction between vocabulary,
traditionally understood as a stock of individual words with fixed meanings, and lexis
which takes into account not only single words but also word combinations that
we store in our mental lexicons ready for use.
This paper aims to show how the integration of the lexical approach with a
corpus-based methodology in teaching English for Specific Purposes (ESP), especially
Engineering English, can improve the way ESP is taught. My particular point here
is to demonstrate how a technical student can benefit from the data-driven lexical
approach. The examples will be taken from my Student Engineering English Corpus
(SEEC) of nearly 2,000,000 running words (Moudraia, 2003, 2004), which was built
with the purpose of establishing a representative corpus of Student Engineering
English that reflects the lexis encountered in compulsory textbooks for engineering
students, regardless of their fields of specialization.
2. Corpus linguistics and ESP language learning
The lexical approach argues that language consists of chunks which, when combined,
produce continuous coherent text, and that only a minority of spoken sentences
are entirely novel creations. The existence and importance of formulaic multi-word
units has been pointed out by many linguists. Bolinger (1976) called them “the prefabs
of language” before Sinclair (1987, 1991) put forward the notion of the idiom principle
as a clear methodological grounding for viewing collocation, arguing that words do not
occur at random in a text. On the contrary, “a language user has available to him or her
a large number of semi-preconstructed phrases that constitute single choices, even
though they might appear to be analysable into segments” (Sinclair, 1991, p. 110).
Corpus linguistics is a methodology which can be described as the study of natural
language through examples of real-life language use drawn from a corpus (McEnery &
Wilson, 2001), defined as a body of text that is representative of a particular variety of
language and is stored on a computer. The availability of language corpora to language
learners and teachers adds a fresh dimension to the criteria for success in learning a
language. With data-driven learning (Johns, 1991), the data are primary, the teacher
has a new role as coordinator of research, and learners become research workers in
control of their learning process.
In particular, concordancing programs for computerized text analysis can be used
very productively in and outside the ESP classroom. A concordance is “a collection
of the occurrences of a word-form, each in its textual environment” (Sinclair, 1991,
p. 32). Language teachers can use concordancers to produce vocabulary exercises to
help their students understand word partnerships. The concordance data can make
language facts more explicit by isolating common patterns in authentic language
samples, the point of a concordance being to present abundant examples of a word
in its usual contexts. By seeing the contexts and collocates, the learners can get a
much better idea of the use of the word than they would achieve by merely looking
it up in the dictionary. Furthermore, by drawing students' attention to collocates of
the keyword, concordance-based study has considerable potential for expanding
student vocabulary. Essentially, keywords are the words which are most unusually
(or outstandingly, in Scott's (1997) terms) frequent in a given body of text compared
with their frequency in a reference corpus.
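A concordance of the kind described above can be sketched in a few lines of Python. This is only an illustration and not the procedure of any particular concordancing package; the function name `kwic`, the `width` parameter and the sample sentences are assumptions made for the example.

```python
import re

def kwic(text, keyword, width=30):
    """Return keyword-in-context lines for every whole-word match of `keyword`."""
    lines = []
    pattern = r"\b" + re.escape(keyword) + r"\b"
    for m in re.finditer(pattern, text, flags=re.IGNORECASE):
        # Take up to `width` characters on either side of the match.
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        # Right-align the left context so the keyword forms a column.
        lines.append(f"{left:>{width}} {m.group(0)} {right:<{width}}")
    return lines

# Invented sample text, used only to demonstrate the output layout.
sample = ("The solution of the problem is shown below. "
          "A salt solution was heated. The graphical solution is simpler.")
concordance = kwic(sample, "solution", width=20)
for line in concordance:
    print(line)
```

Each printed line keeps the keyword in the same column, which is what allows learners to scan the collocates to its left and right.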
McEnery and Wilson (2001, p. 121) identify ESP as a particular domain-specific
area of language teaching and learning, where “corpora can be used to provide many
kinds of domain-specific material for language learning, including quantitative accounts
of vocabulary and usage which address the specific needs of students in a particular
domain more directly than those taken from more general language corpora”.
In professional domains, various corpora are being built. Most of them are of finite
size, with the exception of so-called monitor corpora – open-ended collections of
texts, to which new texts are being constantly added until the corpora “will get
too large for any practicable handling, and will be effectively discarded” (Sinclair,
1991, p. 25).
The largest professional corpus to date is set to be the Corpus of Professional English
(http://www.perc21.org/cpe_project/index.html). It is being developed in collaboration
between the Professional English Research Consortium (PERC), Japan, and
Lancaster University, UK. When finished, it will consist of a 100-million-word database
of English used by professionals in science, engineering, technology and other
fields. Also, a monitor engineering corpus of several million words, representing
the English used by engineers in over 355 professional engineering organizations,
has been steadily growing at the University of Aizu in Japan (Orr & Takahashi,
2002).
However, for language learning and teaching, smaller corpora can be more useful¹
as they are designed to represent the specific part of the language under investigation
and are tailored to address the aspects of the language relevant to the needs of the
learner. Furthermore, smaller corpora are more manageable, allowing easier and faster
access to language data. Some examples of smaller technical corpora designed for
language learners are Peter Roe's Corpus of Scientific English comprising 280,000
running words, i.e., tokens (cited in Yang, 1986), the 400,000-token Guangzhou
Petroleum English Corpus or GPEC (Qi-bo, 1989), the Jiaotong Daxue English of
Science and Technology (JDEST) Corpus (Yang, 1985a, 1985b, 1986) and the Hong
Kong University of Science and Technology (HKUST) Computer Science Corpus
(James, Davidson, Heung-yeung, & Deerwester, 1994) of 1,000,000 tokens each, as
well as my Student Engineering English Corpus (SEEC) of nearly 2,000,000 tokens
(Moudraia, 1999, 2003, 2004).
1 See Ghadessy, Henry, and Roseberry (2001) for applications of smaller corpora to language teaching.
All these corpora are largely based on textbook selections although they are quite
different in design and have different objectives. For example, the JDEST was created
mainly to monitor language teaching materials in order to learn “how well
the materials which have been developed for the learners of English are representing
the authentic materials they are going to read in the future” and also possibly to provide
some knowledge on the productivity of different multi-word term patterns
(Yang, 1986, p. 103). The authors also hoped that the JDEST might be used for syntactic
and discourse study of EST (Yang, 1985b, p. 95). Peter Roe's Corpus of Scientific
English was used for the automatic identification of scientific/technical terms
(Yang, 1986, p. 97).
The purpose of building the GPEC was threefold: firstly, to get to know more
about the features of Petroleum English; secondly, to provide teachers and learners
with a series of vocabulary lists; and finally, to gain some empirical knowledge in
developing a model for processing a medium-sized corpus on a microcomputer
(Qi-bo, 1989, p. 28). The HKUST had two principal objectives: (i) an empirical
determination of the nature of the comprehension problems of Chinese-speaking
undergraduate students in listening and reading in English for academic purposes;
and (ii) the development of materials to enhance listening and reading skills, informed
by the findings of empirical enquiries (James et al., 1994, p. 3). The SEEC
had three primary aims: (a) to establish a representative corpus of Student Engineering
lexis; (b) to provide teachers and learners with a word list that could serve as the
lexical syllabus foundation of English for Engineering; and (c) to explore the syntactical,
morphological, lexical, and discursive features of Engineering English (Moudraia,
2004, p. 142).
Despite their different purposes, all these corpora have led to the production of
vocabulary lists and lexical syllabuses for ESP/EST courses at tertiary level.
Exploitation of the findings using concordancing software was another outcome
of some of these projects. For example, an automatic monitoring and collecting
system of scientific/technical terms, in which new terms collected could be supplied
with concordances, was envisioned by the JDEST developers (Yang, 1986,
pp. 102–103).
I believe that concordancing is an indispensable tool in course design. In this paper,
I will be exploring the issue of technical/sub-technical/non-technical vocabulary
with examples from the SEEC. In the fourth section of this article, I have included
some data-driven instructional activities based on concordance samples from the
SEEC that are aimed at helping students acquire language prefabs for technical
resistance to semantic change, and a very narrow range; e.g., words such as urethane
or vulcanise. Some researchers (Baker, 1988; Cowan, 1974; Flowerdew, 1993; Trimble,
1985), however, distinguish a third category – so-called sub-technical vocabulary,
a class of words that stand between technical and non-technical words. These are
lexical items with technical as well as non-technical senses, e.g. iron, force, stress,
current, tension, strength, etc., which have the same meaning in several technical
disciplines. As Baker (1988, p. 91) noted, the term sub-technical covers “a whole
range of items that are neither highly technical and specific to a certain field of
knowledge nor obviously general in the sense of being everyday words which are
not used in a distinctive way in specialised texts”.
In addition, according to Yang (1986), sub-technical words are identified by their
frequency and distribution as well as their collocational behaviour. Yang's statistical
analysis has shown that sub-technical words have very high distribution across all
specialized fields; however, their frequency of occurrence is lower than that of function
words. Both function words and sub-technical words are characterized by fairly
low peakratio (i.e., the maximum frequency of occurrence divided by the average
frequency) and rangeratio (i.e., the maximum frequency divided by the minimum
frequency). On the other hand, technical terms have very low distribution but very
high peakratio and rangeratio (Yang, 1986, p. 98). Even so, a sub-technical word
might also be a term in a specific field if it suddenly shows a peak frequency in that
field. In view of this, I will also be examining whether the most frequent words in the
SEEC are indeed technical or non-technical/sub-technical.
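The two ratios defined above are simple to compute once a word's frequency in each specialized field is known. The sketch below follows the definitions as stated in the text; the function names, the handling of a zero minimum and the per-field figures are assumptions introduced for illustration and are not taken from Yang (1986).

```python
from statistics import mean

def peak_ratio(freqs):
    """Maximum frequency divided by the average frequency across fields.
    Assumes the word occurs at least once, so the average is not zero."""
    return max(freqs) / mean(freqs)

def range_ratio(freqs):
    """Maximum frequency divided by the minimum frequency across fields.
    Returns infinity when the word is absent from at least one field."""
    lowest = min(freqs)
    return float("inf") if lowest == 0 else max(freqs) / lowest

# Invented per-field frequencies for two words in four fields.
stress = [40, 50, 30, 40]     # evenly spread: a sub-technical-like profile
urethane = [0, 0, 0, 24]      # concentrated in one field: a term-like profile

print(peak_ratio(stress), range_ratio(stress))
print(peak_ratio(urethane), range_ratio(urethane))
```

With these invented figures the evenly distributed word yields low values on both measures, while the word confined to one field yields high ones, which is the contrast the paragraph describes.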
4. The Student Engineering English Corpus
4.1. Rationale
The goal of the project was to develop a reliable lexical syllabus for engineering
students in order to meet the objectives of English teaching for Engineering at Walailak
University in Thailand,² where I had worked for nearly seven years. We were in
a situation quite common in Southeast Asia: lectures in most subjects were delivered
in a local language (Thai, in this case) whilst textbooks were in English. That is why,
in order to build a representative corpus of Student Engineering English, I selected
English-language textbooks, 13 in total, used in basic engineering disciplines
(BED). By BED, I mean those disciplines which are compulsory for all engineering
students regardless of their fields of specialization. At Walailak University, these
were Engineering Mechanics, Engineering Materials, Mechanics of Materials,
Mechanics of Fluids, Thermodynamics, Electrical Engineering, Engineering Drawing,
Manufacturing Process and Computer Programming. The main criterion for
selection was that the textbooks were recommended for engineering students, who
had to read them in English.
2 The project was supported by a small Grant # 970112 from the Walailak University Research Council.
The main stages in the project included gathering a text corpus, putting it into
machine-readable form, conducting a computational analysis of the material, and building
a word list.³ Whole texts were used in the SEEC, as opposed to text extracts,
which is the case with most other smaller technical corpora designed for language
learners (e.g., GPEC, JDEST and HKUST). In corpus construction, whole texts
are preferable to text extracts wherever possible, as this frees the researcher from
concerns about the validity of sampling techniques; moreover, a corpus made up
of whole documents is open to a wider range of linguistic studies than a collection
of short samples (Sinclair, 1991, p. 19). The SEEC is composed of thirteen text files,
details of which are presented in Table 1. The collected material formed a corpus of
about 2 million tokens and over 18,000 word-types, analysed with the help of the
WordSmith Tools software (Scott, 1996).
4.3. Word list organization
The entries in the resulting word list were organized by word families. The lemmatisation
process reduced the number of entries to about 7700, which were treated
according to the cumulative frequency of occurrence of the members of the word
families, and the most frequent word families (with a sum total of at least 100 occurrences,
or 0.005% of the corpus) were selected. As a result, the 1260 most frequent word families
comprising 8850 words were included in the Student Engineering Word List, which can serve
as the foundation for an Engineering English lexical syllabus.
3 This step required permission from the publishers for the electronic use of their texts. My
acknowledgements go to McGraw-Hill Australia (permission dated October 12, 1998), McGraw-Hill
Companies, Inc. (permission dated December 1, 1998), Brooks/Cole Publishing Company (Grant No.
G-09857, November 17, 1998) and Addison Wesley Longman Limited (ref. AP/2743, November 25, 1998) for
their permission to store their texts in an electronic format in order to create a word list.
The “word family” here is interpreted in the broadest sense, incorporating not
only derived and inflected forms but compound words as well, according to Level
7 of Bauer and Nation's (1993) scale. Table 2 gives an example of the word family
under the headword use, which is the most frequent word family in the Student
Engineering Word List. Also, Appendix A presents the one hundred most frequent
entries listed by headwords (i.e., base word or the most frequent word in
the family).
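The family-based counting and frequency cut-off described in this section can be sketched as follows. This is a minimal illustration rather than the procedure actually used to build the word list: the mapping `family_of`, the function names and the sample sentence are hypothetical, and a real mapping would have to be compiled according to the word-family definition given above.

```python
from collections import Counter

def family_frequencies(tokens, family_of):
    """Sum token counts per headword; unlisted word forms act as their own headword."""
    counts = Counter()
    for token in tokens:
        word = token.lower()
        counts[family_of.get(word, word)] += 1
    return counts

def select_families(counts, min_count=100):
    """Keep the families whose cumulative frequency reaches min_count."""
    return {head: n for head, n in counts.most_common() if n >= min_count}

# Hypothetical fragment of a form-to-headword mapping.
family_of = {"uses": "use", "using": "use", "used": "use",
             "useful": "use", "reusable": "use"}

# Invented sentence, used only to exercise the functions.
tokens = "Using the method used above is useful when the beam is reusable".split()
counts = family_frequencies(tokens, family_of)
selected = select_families(counts, min_count=3)

# A cut-off of 100 occurrences in about 2,000,000 tokens is 0.005% of the corpus.
share_percent = 100 / 2_000_000 * 100
```

In the toy sentence the four forms of use are pooled under one headword, so the family passes a cut-off that none of its members would reach alone.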
4.4. Word frequency analysis – findings
Word frequency analysis of the SEEC was carried out in comparison with the
COBUILD Bank of English Corpus and the British National Corpus (BNC). The
COBUILD Bank of English Corpus is the biggest monitor corpus of the English language,
steadily growing at Birmingham University, UK. Currently, it contains about
450,000,000 tokens; this analysis, however, is based on the 323,302,789 tokens that
COBUILD had in 2000. The BNC, developed at Lancaster University, UK in the
1990s, is the biggest finite corpus of the English language to date, containing around
100,000,000 tokens. For the analysis of the most frequent word forms, I used the
Written part of the BNC of 89,800,000 tokens.
The word frequency analysis was concerned with the most frequent word forms in
all three corpora, including the most frequent closed-class (grammatical) and open-class
(content) word forms. It has revealed, firstly, that the most frequent word forms
in all three corpora, being mainly function words, concur (Appendix B). The
correlation between the fifty most frequent closed-class word forms in the SEEC,
the COBUILD Bank of English and the BNC Written proved to be statistically significant
at the .01 level. The Spearman's rank order correlation between the fifty
most frequent closed-class word forms in the SEEC and the COBUILD Bank of
English is .778 while between the SEEC and the BNC Written it is .802.
Table 2
Use – the most frequent word family in the Student Engineering Word List
N: ABC order – 1186; Frequency order – 1
Headword: use
Frequency: 10,313
%: 0.52
Words joined:
use (2784: n 961, v 1823), uses (262: n 48, v 214), using (2100), used (4538);
useful (341), usefully (1), usefulness (7);
useless (6);
usable (22), useable (2);
user (149), users (24), users (2);
usage (39);
reuse (4: n 3, v 1), re-use (3: n 1, v 2), reused (5), reusable (7);
unused (5 adj), unusable (5);
misuse (1 n), misusing (1), misused (1);
abuse (2: v 1, attrib 1);
multiuse (1 attrib), multi-user (1 attrib)
Secondly, a comparison of the fifty most frequent open-class (content) word
forms has indicated that the content word forms in the SEEC are predominantly
from the scientific register, while the most frequent content word forms in
COBUILD and the BNC Written are of a general nature (Appendix C). Furthermore,
the most frequent content word forms in the SEEC are rather infrequent in
COBUILD and the BNC Written (Appendix D). This finding supports Salager's
(1983, p. 54) observation about “those context-dependent words” which occur with
high frequency across different scientific disciplines but tend to be used infrequently
in general word-frequency counts.
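The rank-order comparison of closed-class word forms reported above rests on Spearman's coefficient, which for n items without tied ranks is 1 − 6Σd²/(n(n² − 1)), where d is the difference between an item's ranks in the two lists. The sketch below applies that formula to a shared list of words; the frequencies are invented for illustration and are not figures from the SEEC, COBUILD or the BNC, and the function names are assumptions.

```python
def ranks(values):
    """Rank values from highest (rank 1) to lowest; assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_rho(freqs_a, freqs_b):
    """Spearman's rho for two equally long frequency lists without ties."""
    n = len(freqs_a)
    d_squared = sum((ra - rb) ** 2
                    for ra, rb in zip(ranks(freqs_a), ranks(freqs_b)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Invented frequencies of the same five function words in two corpora.
words = ["the", "of", "and", "a", "in"]
corpus_a = [70000, 38000, 22000, 27000, 21000]
corpus_b = [61000, 30000, 27000, 22000, 19000]
rho = spearman_rho(corpus_a, corpus_b)
print(round(rho, 3))
```

In this toy case only two neighbouring words swap places between the lists, so the coefficient stays close to 1; a full analysis would also need a significance test and a treatment of tied ranks, both omitted here.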
Similarly, the most frequently encountered words in the SEEC appear to be sub-technical,
i.e., words with non-technical as well as technical senses, common in most
kinds of technical writing, which are identified by their frequency and distribution
as well as their collocational behaviour (Yang, 1986). The SEEC word frequency analysis
has additionally revealed that the non-technical sense of a sub-technical lexical
item is used more frequently than its technical sense. For example, the word solution
is more commonly used in the SEEC in the non-technical sense than in the chemical
sense (Table 3), even in a Chemical Engineering Thermodynamics textbook⁴ (Table 4).
Finally, keyword analysis of the Student Engineering lexis, carried out with the
help of the WordSmith Tools software, has provided further support for my hypothesis
that the most frequent words in a specialist corpus are in fact sub-technical and
non-technical. Basically, keywords are the words which are most unusually frequent
in a given body of text against a reference corpus, while the so-called key-keywords
(Scott, 1997) are the most frequent keywords over a number of files in the database,
ensuring that these words are characteristic of the whole corpus.
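The distinction between keywords and key-keywords can be illustrated with a short sketch. It uses the log-likelihood statistic, one commonly used measure of keyness, with a cut-off of 6.63 (the chi-square critical value for p < .01 with one degree of freedom). Whether this matches the settings used in the study is not stated in the text, so the statistic, the thresholds, the function names and the toy counts should all be read as assumptions.

```python
import math
from collections import Counter

def log_likelihood(a, b, c, d):
    """Keyness of a word seen a times in c study tokens and b times in d reference tokens."""
    e1 = c * (a + b) / (c + d)   # expected count in the study text
    e2 = d * (a + b) / (c + d)   # expected count in the reference corpus
    ll = 0.0
    if a:
        ll += a * math.log(a / e1)
    if b:
        ll += b * math.log(b / e2)
    return 2 * ll

def keywords(study, reference, threshold=6.63):
    """Words over-represented in `study` relative to `reference` (both Counters)."""
    c, d = sum(study.values()), sum(reference.values())
    return {w for w, a in study.items()
            if a / c > reference[w] / d
            and log_likelihood(a, reference[w], c, d) >= threshold}

def key_keywords(files, reference, min_files=2):
    """Count in how many files each word is key; keep those key in at least min_files."""
    tally = Counter(w for f in files for w in keywords(f, reference))
    return {w: n for w, n in tally.items() if n >= min_files}

# Invented word counts: one reference corpus and three study files.
reference = Counter({"the": 600, "solve": 2, "house": 50, "said": 348})
files = [
    Counter({"the": 60, "solve": 12, "beam": 28}),
    Counter({"the": 55, "solve": 10, "fluid": 35}),
    Counter({"the": 58, "house": 5, "weld": 37}),
]
result = key_keywords(files, reference)
```

In the toy data beam, fluid and weld are each key in a single file only, whereas solve is key in two files and is therefore the only key-keyword returned.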
The key-keyword comparison of the SEEC against the BNC Written Sampler⁵
provides some interesting information about the key verbs in the SEEC – they appear
to be predominantly from the academic register. The key-key verbs in the SEEC
are: act, apply, assume, be, become, calculate, consider, correspond to, define, determine,
exert, give, illustrate, indicate, locate, obtain, occur, require, show, sketch, solve,
substitute, and use. These verbs are key in at least five (seven on average) text files out
of the thirteen that constitute the SEEC (Table 1). Importantly, ten of these key
substitute) are included in Coxhead's (2000) New Academic Word List; ten (assume,
correspond, define, illustrate, indicate, locate, obtain, occur, require, sketch) are also in
4 However, the word form solutions, although very infrequent, does occur more frequently in its
chemical sense in the Chemical Engineering Thermodynamics textbook.
5 The BNC Written Sampler is a one-million-word written subcorpus of the BNC containing a wide and
balanced sampling of texts from the BNC Written. It was used for the key-keyword comparison as the full
BNC was too large to be analysed by WordSmith Tools.
Via corpus-based teaching and learning, learners become exposed to authentic
real-life language use and no longer rely solely on published instructional material,
much of which is inauthentic.
Within the lexical approach too, language activities are directed towards naturally
occurring language, and more time is devoted to collocations and idiomatic expressions.
Lewis (1993) claims that the basis of language is lexis, while grammar is the
search for powerful patterns. There is compelling evidence (Lewis, 1993; McKay,
1980) that the majority of errors made by foreign/second language learners are
semantic errors of inappropriate word choice caused by vocabulary deficiency
and, particularly, by lack of collocational power. In consequence, Nattinger (1980,
p. 341) has suggested that
Perhaps we should base our teaching on the assumption that, for a great deal
of the time anyway, language production consists of piecing together the ready-
made units appropriate for a particular situation and that comprehension
relies on knowing which of these patterns to predict in these situations. Our
teaching, therefore would center on these patterns and the ways they can be
pieced together, along with the ways they vary and the situations in which they
occur.
I argue for the integration of the lexical approach with data-driven corpus-based
methodology in English teaching, including ESP teaching, as I believe that the use of
language corpora in the classroom can improve students' knowledge of the language
and their ability to use it effectively. Clearly, the major strength of using a computer
corpus in language teaching is the insight it can provide into the unique collocational
patterns of a word. This is one of the many persuasive reasons for utilizing computer
corpora in the development of vocabulary materials. Although exercises that
resemble those of standard vocabulary and grammar teaching practices (i.e.,
blank-filling, sentence completion, word matching, translation, etc.) can still be
put to use, their linguistic focus has fundamentally changed, with many of the activities
being of the receptive, awareness-raising kind that can aid language acquisition
by providing learners with a tool which enables them to process input more
effectively.
I find concordancing a very valuable tool in course design. A case can be made,
though, for the use of the specialist corpus for teaching ESP students, since, even
where lexis is common to both the general and the specialist corpus, the items in the
specialist corpus, as Flowerdew (1993, p. 236) has noted, may have particular uses that
will be revealed in concordancing. Keeping this in mind, I have worked out some
data-driven exercises based on concordance data that are aimed at helping students
acquire language prefabs for technical and non-technical uses in the specialist context.
These concordance-based activities are designed not so much to help students
understand engineering textbooks but rather to aid productive use of the language
prefabs. Fig. 1 presents a concordance sample from the SEEC that includes carefully
selected examples of the word solution used both in the general sense (e.g., solution of
a problem) and in the technical (chemical) sense. Solution was chosen because it figures,
in its general sense, as a high-frequency word family and also occurs frequently
as a sub-technical item.
Fig. 1. Concordance sample of solution.
will be an oblique triangle and should be solved by applying the law of sines
can be solved quite simply by the use of
When the problem is solved simply by moving the disk from
a wide class of problems which are solved by trial and error.
problems that cannot be solved by the Work-Energy Principle
problems in this chapter have been solved by using the Moody diagram.
Such problems are solved by considering a short length of
equations and as such may be solved by numerical techniques.
set of algebraic equations that can be solved by methods developed earlier
were solved by the application of second law.
Solve by trial the equation
would be relatively simple matter to solve them, say by matrix method.
Solving a problem by following five steps
solved by computer.
Pattern 3: solve/solves/solving/solved using as in Alternatively, we can solve such
problems using graphical solution.
Answer key
deformable-body mechanics problems are solved using these work-energy principle
Alternatively, we can solve such problems using graphical solution
the following problems are intended to be solved using the program provided in
13.14 Solve Problem 12.18c, using the method
15.75 Using the method of 15.7, solve 15.49.
we demonstrate the FORTRAN program that solves these using the routine Original
Equations (13.52) can be solved using each of the two sets
we add this reaction to be solved using the “final” moles
Another useful activity would be finding out when one syntactic pattern (the use
of by, with, using with solve) was preferred over another. It would require examining
all the relevant concordance lines in the corpus, but the limitations of space do not
allow it to be included here.
By using corpora, students gain direct access to abundant samples of authentic
language, resulting in a better understanding of the use and patterns of certain
linguistic features. Thus, corpus-based teaching can help train multi-skilled
autonomous learners who can take charge of their own learning processes.
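The pattern comparison suggested above (solve followed by by, with or using) is the kind of task learners could carry out themselves on a set of concordance lines. The sketch below counts which of the three words first follows a form of solve within a short window; the regular expression, the four-word window and the function name are assumptions, and the sample lines are taken from the concordance extracts shown earlier in this section.

```python
import re
from collections import Counter

# Any form of "solve", then up to four intervening words, then by / with / using.
PATTERN = re.compile(
    r"\bsolv(?:e|es|ed|ing)\b(?:\W+\w+){0,4}?\W+(by|with|using)\b",
    re.IGNORECASE,
)

def count_patterns(lines):
    """Count which of by/with/using first follows a form of solve in each line."""
    tally = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            tally[match.group(1).lower()] += 1
    return tally

lines = [
    "can be solved quite simply by the use of",
    "Such problems are solved by considering a short length of",
    "Alternatively, we can solve such problems using graphical solution",
    "Equations (13.52) can be solved using each of the two sets",
]
tally = count_patterns(lines)
print(tally)
```

Run over all the relevant lines in a corpus, a tally of this kind would give a first indication of which pattern is preferred, although deciding when each is preferred still requires reading the lines in context.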
6. Conclusion
In this paper, I have argued for the integration of the lexical approach with a
data-driven corpus-based methodology in ESP teaching, as I believe that the use
of language corpora in the classroom can improve students' knowledge of the
language and their ability to use it effectively. This leads me to the conclusion that
corpora can also improve the way ESP teaching is approached. They can inform teaching
and learning, producing students who know what it means to use a corpus, who
know how to extract material from it, and who, consequently, can learn a great deal
about language via a corpus. After all, as Dlaska (1999, p. 403) observed, ESP teaching
need not be “dire and difficult pedagogical ground”, forcing language teachers to
surrender their expertise in favour of teaching unfamiliar subjects, but on the contrary,
it needs to “address, and eventually bridge, the discrepancy between general
language ability and specialized language ability . . . since the two areas are not in
opposition but complement each other”.
Appendix A. The one hundred most frequent word families in the Student Engineering
Hadley, G. (2002). An introduction to data-driven learning. RELC Journal, 33(2), 99–124.
James, G., Davidson, R., Heung-yeung, A. C., & Deerwester, S. (1994). English in Computer Science: a
corpus-based lexical analysis. The Hong Kong University of Science and Technology: Longman Asia
Ltd.
Johns, T. (1991). Should you be persuaded – two examples of data-driven learning materials. ELR Journal,
4, 1–16.
Lewis, M. (1993). The lexical approach: the state of ELT and the way forward. Hove, England: Language
Teaching Publications.
Martin, A. V. (1976). Teaching academic vocabulary to foreign graduate students. TESOL Quarterly,
10(1), 91–99.
McEnery, A., & Wilson, A. (1997). Teaching and language corpora (TALC). ReCALL, 9(1),
5–14.
McEnery, A., & Wilson, A. (2001). Corpus linguistics (2nd ed.). Edinburgh: Edinburgh University Press.
McKay, S. L. (1980). Developing vocabulary materials with a computer corpus. RELC Journal, 11(2),
77–87.
Moudraia, O. (1999). Lexical syllabus foundation for engineering. RELC Journal, 30(2), 140–141.
Moudraia, O. (2001). Lexical approach to second language teaching. Eric Digest EDO-FL-01-02.
Washington, DC: ERIC Clearinghouse on Languages and Linguistics. Available from
http://www.cal.org/ericcll/digest/0102lexical.html.
Moudraia, O. (2003). The student engineering corpus: analysing word frequency. In: D. Archer, P.
Rayson, A. Wilson, & T. McEnery (Eds.), Proceedings of the corpus linguistics 2003 conference
(pp. 552–561). UCREL technical paper number 16. UCREL, Lancaster University. ISBN
1862201315.
Moudraia, O. (2004). The student engineering English corpus. ICAME Journal, 28, 139–143.
Mudraya, O. V. (2004). Using a lexical approach for data-driven instruction of engineering English. IEEE
Transactions on Professional Communication, 47(1), 65–70.
Murison-Bowie, S. (1996). Linguistic corpora and language teaching. Annual Review of Applied
Linguistics, 16, 182–199.
Nattinger, J. (1980). A lexical phrase grammar for ESL. TESOL Quarterly, 14, 337–344.
Nattinger, J., & DeCarrico, J. (1992). Lexical phrases and language teaching. Oxford: Oxford University
Press.
Orr, T., & Takahashi, A. (2002). Constructing a corpus of fundamental engineering English for nonnative
speakers. In J. Williams (Ed.), Conference proceedings of the IEEE international professional
communication conference (pp. 403–409). USA: Oregon.
Qi-bo, Z. (1989). A quantitative look at the Guangzhou Petroleum English Corpus. ICAME Journal, 13,
28–38.
Salager, F. (1983). The lexis of fundamental medical English: classificatory framework and rhetorical
function (a statistical approach). Reading in a Foreign Language, 1, 54–64.
Scott, M. (1996). WordSmith tools. Oxford: Oxford University Press. Available from
http://www.lexically.net/wordsmith/.
Scott, M. (1997). PC analysis of key words – and key key words. System, 25(2), 233–245.
Sinclair, J. M. (Ed.). (1987). Looking up: an account of the COBUILD project in lexical computing. London:
Collins COBUILD.
Sinclair, J. (1991). Corpus, concordance, collocation. Oxford: Oxford University Press.
Thurstun, J., & Candlin, C. N. (1998). Concordancing and the teaching of the vocabulary of academic
English. English for Specific Purposes, 17(3), 267–280.
Trimble, L. (1985). English for science and technology: a discourse approach. Cambridge: Cambridge
University Press.
Willis, D. (1990). The lexical syllabus: a new approach to language teaching. London: Collins
COBUILD.
Xue, G., & Nation, I. S. P. (1984). A university word list. Language Learning and Communication, 3(2),
215–229.