Corpora and Language Teaching

1. Language teaching
2. Data-driven learning
3. Learner corpora
1. Language teaching
• Corpora as a source of material/examples
– see later
• Corpora determine the syllabus
– Deciding what to teach and when can be
influenced by how widespread a phenomenon
is in the language
– Teachers’ (and textbooks’) preconceptions
are often wrong
– (Same issue in lexicography)
Language teaching
• Corpora as a source of explanation
– Students often ask about subtle differences in
language
– e.g. differences between close synonyms,
especially when the L1 doesn’t make a similar
distinction
– Corpus evidence can be better than teachers’
intuitions
Tsui (2005)
Amy B M Tsui: “ESL teachers’ questions and corpus evidence”, Int. J.
Corpus Ling. 10, 335-356
• TeleNex website since 1993 to support ESL
teachers in Hong Kong
• Lots of questions sent in about commonly
confused words; use of sentence connectives;
count vs non-count nouns; number agreement;
etc.
• Accurate information about usage can be got
from corpus data
Examples
• big vs large
• finally vs lastly
• less than vs fewer than
– prescriptive grammar books can be at odds
with actual usage
• Sentence-initial conjunctions
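A minimal sketch of how corpus evidence might separate a pair such as big vs large: count the word that immediately follows each adjective. The three sentences are invented for illustration, and the helper names (tokenize, right_collocates) are hypothetical; a real query would run over a large corpus.

```python
from collections import Counter
import re

# Toy sentences invented for illustration; not taken from any real corpus.
TOY_CORPUS = [
    "A large number of students made a big mistake.",
    "She has a big brother and a large amount of homework.",
    "A large number of teachers saw the big mistake.",
]

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def right_collocates(sentences, node):
    """Count the word immediately following each occurrence of `node`."""
    counts = Counter()
    for sentence in sentences:
        tokens = tokenize(sentence)
        for current, following in zip(tokens, tokens[1:]):
            if current == node:
                counts[following] += 1
    return counts

big = right_collocates(TOY_CORPUS, "big")
large = right_collocates(TOY_CORPUS, "large")
print("big   ->", big.most_common())
print("large ->", large.most_common())
```

Even on this toy data the two adjectives pick different nouns, which is the kind of pattern a teacher could show in answer to a student's question.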
2. Data-driven learning
• Part of drive to use new technology to enhance
language learning
• Focus on exploitation of authentic materials
– even for tasks such as acquisition of grammatical
structures and lexical items
• Focus on real, exploratory tasks and activities
rather than traditional “drill & kill” exercises
• Learner-centred activities
• Use and exploitation of tools rather than ready-made or off-the-shelf learnware
Task-based learning
• Acquisition of language and linguistic
competence as well as language and language
learning awareness can best be realised through
tasks which encourage the learner not to focus
explicitly on the structure and the rules of the
new language. Learners will acquire the form of
the foreign language because they are engaged
in exploring aspects of the target language on
the basis of authentic content.
Use of materials
• Not just use of real language in realistic situations
(mainstay of language-teaching methodology since
1960s: “Communicative language teaching theory”,
though starting to be disparaged, and “pedagogic
grammar” method is returning)
• Promotes concept of learning about language – learner
is made aware of need to engage in a learning process
• Learner is given tasks which use language as data, but
which promote learning of language structures
• Language learners engage in linguistically motivated
activities, including corpus-based studies using
concordancers and other tools
Product vs process
• Product approaches are those that carefully
present specific aspects of the language for the
students, usually in terms of grammatical
metalanguage and “rules”
• Process approaches encourage creativity and
self-discovery by students as they experiment
with the language, often at the expense of
accuracy
• Few teachers nowadays take a wholly product- or process-based approach
Data-driven learning
• Tries to marry product and process
approach by teaching things like grammar
through use of real text
G. Hadley http://www.nuis.ac.jp/~hadley/publication/windofchange/windsofchange.htm
Data-driven learning
• Research then theory
• Start with a question – may be provided by
teacher, or occur spontaneously
• Propitious use of materials – again may be
directed by teacher, or student may
explore at will
• Students have to work out the “rule” (or
pattern) for themselves
3. Learner Corpora
• A tool in the study of language learning
and hence language teaching
• studies of SLA (second language
acquisition) interested in
– how learners learn
– evidence from learners’ errors
– how this can feed in to ideas about how to
teach
Interlanguage
• Much work done on the role of L1
interference in language learning
– assumption that elements of learner’s L1
interfere with L2 language acquisition at all
levels (phonology, lexis, grammar of course,
but also more subtly, even as learners
become intermediate or advanced)
• Studies of language produced by
language learners can throw light on this
Learner corpora
• Corpus of texts produced (usually written)
by language learners
• Notably: International Corpus of Learner
English (ICLE), Louvain University
• Collections are corpora in the true sense
in that they are often planned, purposeful,
etc, but above all …
– ANNOTATED
Annotation
• Annotation can include the usual kind of
thing (POS tags etc)
• But of more interest is the annotation of
errors
– identifying that something is “wrong”
– suggesting what the correct version should be
– classification of error types
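One way to picture the three parts of an error annotation (location, suggested correction, error type) is as a small record attached to a token span. The sketch below is an assumption for illustration only: the class ErrorAnnotation, its field names and the error-type labels are hypothetical and do not reproduce the tagset of ICLE or any other learner corpus.

```python
from dataclasses import dataclass

@dataclass
class ErrorAnnotation:
    start: int        # index of the first affected token
    end: int          # index one past the last affected token
    error_type: str   # hypothetical label, e.g. "agreement"
    correction: str   # suggested replacement text

def apply_corrections(tokens, annotations):
    """Return a corrected token list; annotations are assumed not to overlap."""
    corrected = list(tokens)
    # Work right to left so earlier indices stay valid after each replacement.
    for ann in sorted(annotations, key=lambda a: a.start, reverse=True):
        corrected[ann.start:ann.end] = ann.correction.split()
    return corrected

# Invented learner sentence with two annotated errors.
tokens = "She have two cat".split()
annotations = [
    ErrorAnnotation(1, 2, "agreement", "has"),
    ErrorAnnotation(3, 4, "number", "cats"),
]
print(" ".join(apply_corrections(tokens, annotations)))
```

Keeping the original tokens and storing corrections separately means the learner's text is never overwritten, so both versions remain available for study.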
Error typing
• Very difficult task
– multiple errors can be compounded
– often easy to say there’s an error, but which error is it?
e.g. The boys runs
• Error classification can be contentious
• Even saying something is wrong can be
debatable
• Grammatical errors more or less straightforward
• Meaning errors difficult to spot
• Even more difficult: annotating matters of style,
nuance, etc.
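The "which error is it" problem can be made concrete with the sentence The boys runs, which admits at least two analyses with different corrections. The sketch below simply records both; the analysis labels and the helper correct are hypothetical, and nothing here decides which analysis an annotator should prefer.

```python
def correct(tokens, index, replacement):
    """Return a copy of `tokens` with the token at `index` replaced."""
    fixed = list(tokens)
    fixed[index] = replacement
    return fixed

tokens = ["The", "boys", "runs"]

# Two competing analyses of the same sentence; the labels are hypothetical.
analyses = {
    "verb agreement (runs -> run)": correct(tokens, 2, "run"),
    "noun number (boys -> boy)": correct(tokens, 1, "boy"),
}

for label, fixed in analyses.items():
    print(f"{label}: {' '.join(fixed)}")
```

Both outputs are grammatical, so the choice of error type depends on what the learner meant, which the text alone may not reveal.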
Use of learner corpora
• Research over- or underuse of particular
constructions, words or word groups
• Identify lexical errors to help in compilation
of learners’ dictionaries
• Identification of generic L2 errors vs L1-influenced errors
S Granger (ed) Learner English on Computer, London (1998): Longman
S Granger, J Hung & S Petch-Tyson (eds) Computer Learner Corpora, Second
Language Acquisition and Foreign Language Teaching, Amsterdam (2002):
John Benjamins
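Over- or underuse of a word is usually judged by comparing its relative frequency in learner writing with its relative frequency in a reference corpus. The sketch below is a rough illustration under stated assumptions: both texts are invented, the function names are hypothetical, and a real study would also test whether the difference is statistically significant.

```python
from collections import Counter
import re

def relative_freq(text, per=1000):
    """Map each word to its frequency per `per` tokens of `text`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    return {word: n * per / len(tokens) for word, n in counts.items()}

def usage_ratio(word, learner_text, reference_text):
    """Learner frequency divided by reference frequency (>1 suggests overuse)."""
    learner = relative_freq(learner_text).get(word, 0.0)
    reference = relative_freq(reference_text).get(word, 0.0)
    if reference == 0.0:
        # Word absent from the reference text: the ratio is undefined or unbounded.
        return float("inf") if learner > 0 else float("nan")
    return learner / reference

# Invented toy texts standing in for a learner corpus and a reference corpus.
LEARNER = "moreover it is good moreover it is cheap moreover it is fast"
REFERENCE = "it is good and it is cheap moreover it is also quite fast"

print(round(usage_ratio("moreover", LEARNER, REFERENCE), 2))
print(usage_ratio("and", LEARNER, REFERENCE))
```

A ratio well above 1 points to possible overuse and a ratio near 0 to underuse, though texts this small are far too short to support any conclusion.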