Introduction to Cognitive Science
Linguistics Component
Lecture 2
September 22, 2005 (2.00 p.m. – 3.50 p.m.)
Venue: Meng Wah Complex Room 324
Lecturer: Dr. A. B. Bodomo, Department of Linguistics

Topic 3: Formal Grammar: Parsing and Generation

Introduction
• In my previous lectures, we discussed how tacit linguistic knowledge can be represented at the various levels of phonology, morphology, syntax, semantics, and pragmatics, and at their interfaces, including morphophonology, morphosyntax, and the syntax-semantics interface.
• In this lecture, we shall look closely at how these linguistic knowledge representations can be formalised into an algorithm, a computational procedure for processing this linguistic knowledge.

Keywords
• Constituent structure rules
• initial symbol
• terminal symbol
• non-terminal symbol
• generative grammar
• formal grammar

Formal devices and notation
• The symbol ‘→’ indicates that a node is ‘rewritten as…’, ‘consists of’, or ‘has the constituents…’
• This is used in rewrite rules of the type:
  – S → NP + VP
• a sentence, S, has the constituents: noun phrase (NP) and verb phrase (VP)
• A choice between alternatives in the grammar is expressed as {X, Y}.
  – This means apply either X or Y but not both

Formal devices and notation (cont’d)
• Initial symbol: the symbol from which a rewrite rule begins (e.g. S)
• Terminal symbol: the end symbols from which no constituent structure can be further developed (N, V, Art). All others are non-terminal symbols (e.g. NP, VP).
• The symbol # is used to indicate a constituent boundary
  – e.g. # _ is word initial while _# is word final
• The notation X (Y) implies that X is obligatory and may be followed by Y

Two main aspects of grammatical information processing: Generating and Parsing sentences
• Before we begin, let us illustrate with a simple grammar and lexicon, using the following sentence:
  – The students greeted the teacher.

The students greeted the teacher.
• Grammar:
  – S → NP + VP
  – VP → V + NP
  – NP → Art + N
• Lexicon 1:
  – Greeted: V, _NP
  – Students: N
  – The: Art
  – Teacher: N

This grammar can also generate (i.e. produce) the following sentences:
  The teacher greeted the students
  The teacher scared the students
  The child ate an apple
But you have to augment (i.e. increase) the lexicon as follows:

Lexicon 2:
  an: Art
  the: Art
  teacher: N
  students: N
  apple: N
  child: N
  greeted: V, _NP
  scared: V, _NP
  ate: V, _NP

Sentence Generation: the algorithm
• To produce a sentence we need three things:
  – A set of phrase structure rules (as illustrated above)
  – A lexicon (as illustrated above), and
  – A lexical insertion rule (as explained below)
• A lexical insertion rule is an instruction to select the right word from a lexicon
• The following is an example of a lexical insertion rule:

Lexical insertion rule
• For each terminal symbol of a phrase structure rule, select a word from the lexicon that satisfies the following conditions:
  – It is a member of the class of the terminal symbol (e.g. N, V)
  – its subcategorization frame matches that of the terminal symbol (e.g. V, _NP).
  Attach this word as the daughter of this terminal symbol.
• The set of rules above constitutes what is known as a sentence generator.

• The whole procedure of beginning with an initial symbol, working through the phrase structure rules, and then adding the lexical items via lexical insertion rules is driven by an algorithm, i.e. a set of instructions.
• Let us set out an algorithm for the generation (production) of the sentence ‘The students greeted the teacher’, given a grammar and a lexicon as follows:

The students greeted the teacher
Grammar:
  PS Rule (a): S → NP + VP
  PS Rule (b): VP → V + NP
  PS Rule (c): NP → Art + N
Lexicon 1:
  Greeted: V, _NP
  Students: N
  The: Art
  Teacher: N

Rule 1 Start with the initial symbol, S.
Rule 2 For every non-terminal symbol, X, find a phrase structure rule with X as the left-hand symbol and others as the right-hand symbol(s), and develop a rewrite rule with X as the mother and the right-hand symbols as ordered daughters.
Rule 3 Apply Rule 2 until all branches end in terminal symbols.
Rule 4 Apply the lexical insertion rule iteratively until every terminal symbol has a lexical item attached as its daughter.

Illustrating the algorithm
Tree diagram, built up in steps:
• Applying Rule 1: S
• Applying Rules 2, 3: S → NP + VP; NP → Art + N; VP → V + NP; NP → Art + N
• Applying Rule 4: [S [NP [Art The] [N professor]] [VP [V greeted] [NP [Art the] [N students]]]]

• From the above we can see that we have started from an initial symbol and have ended with terminal symbols with lexical items as their daughters. A sentence has thus been generated (produced), telling us how this sentence is built up.
• Now, let us see how we can begin with an existing sentence and then break it down into its component parts by applying rules.

Sentence parsing: the algorithm
• To parse a sentence means to analyse it into its constituent parts by the systematic application of lexical insertion rules and some phrase structure rules.
• It is like the reverse process of generation.

Types of Parsing
• Top-down: Begin with the symbol S.
• Bottom-up: Begin with terminal symbols (words).
Possible research: Which types of parsing in natural languages provide the most cognitively realistic and efficient parser?

Some sentence parsing rules which constitute a PARSER
• For a sentence, S
  – Rule 1: Determine from the lexicon the word class of every item and develop a partial tree for each word where the word class label dominates the word.
  – Rule 2: Find a PS rule of the type X → Y, Z where the right-hand symbols match some sequence of categories in the structure so far, and develop a partial tree with X as the mother and the right-hand symbols as ordered daughters.
  – Rule 3: Continue Rule 2 until the root, S, is reached and there are no unattached strings.

The man drank the tea.
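A possible sketch in Python (not part of the original lecture): the code below illustrates both the generation rules and the parser rules set out above, using the PS rules S → NP + VP, VP → V + NP, NP → Art + N and the lexicon for this sentence. The names GRAMMAR, LEXICON, generate, parse, find_reduction and leaves are hypothetical. Subcategorization frames are left out, and the parser simply applies the first matching PS rule, which is enough for this small grammar but is an assumption that would not hold for ambiguous grammars.

```python
import random

# PS rules from the slides: mother -> ordered daughters.
GRAMMAR = {
    "S": ["NP", "VP"],
    "VP": ["V", "NP"],
    "NP": ["Art", "N"],
}

# Lexicon for "The man drank the tea" (word -> word class).
# Assumption: subcategorization frames (e.g. V, _NP) are omitted here.
LEXICON = {"the": "Art", "man": "N", "tea": "N", "drank": "V"}


def generate(symbol="S", rng=random):
    """Top-down generation: start from S, rewrite, then insert words."""
    if symbol in GRAMMAR:
        # Generation Rules 2-3: rewrite every non-terminal symbol.
        return (symbol, [generate(d, rng) for d in GRAMMAR[symbol]])
    # Generation Rule 4: lexical insertion of a word of the right class.
    candidates = sorted(w for w, c in LEXICON.items() if c == symbol)
    return (symbol, rng.choice(candidates))


def find_reduction(trees):
    """Return (position, mother, length) of the first PS rule that matches."""
    labels = [label for label, _ in trees]
    for mother, daughters in GRAMMAR.items():
        n = len(daughters)
        for i in range(len(labels) - n + 1):
            if labels[i:i + n] == daughters:
                return i, mother, n
    return None


def parse(sentence):
    """Bottom-up parsing; returns a tree rooted in S, or None on failure."""
    words = sentence.rstrip(".").split()
    if any(w.lower() not in LEXICON for w in words):
        return None  # a word is missing from the lexicon
    # Parser Rule 1: one partial tree per word, word class dominating word.
    trees = [(LEXICON[w.lower()], w) for w in words]
    # Parser Rules 2-3: keep combining partial trees while a PS rule matches.
    while True:
        reduction = find_reduction(trees)
        if reduction is None:
            break
        i, mother, n = reduction
        trees[i:i + n] = [(mother, trees[i:i + n])]
    # Success only if the root S is reached with nothing left unattached.
    if len(trees) == 1 and trees[0][0] == "S":
        return trees[0]
    return None


def leaves(tree):
    """Read the words off a tree from left to right."""
    _, body = tree
    if isinstance(body, str):
        return [body]
    return [w for child in body for w in leaves(child)]


if __name__ == "__main__":
    print(parse("The man drank the tea."))
    print(" ".join(leaves(generate())))
```

Because this lexicon lists only word classes, generate() may also return strings such as "the tea drank the man": the grammar licenses them even though they are odd in meaning.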
Grammar:
  PS Rule 1: S → NP + VP
  PS Rule 2: VP → V + NP
  PS Rule 3: NP → Art + N
Lexicon 1:
  drank: V, _NP
  man: N
  the: Art
  tea: N

Applying Rule 1:
  [Art The] [N man] [V drank] [Art the] [N tea]
Applying Rule 2:
  [NP [Art The] [N man]] [V drank] [NP [Art the] [N tea]]
Applying Rule 3:
  [NP [Art The] [N man]] [VP [V drank] [NP [Art the] [N tea]]]
  [S [NP [Art The] [N man]] [VP [V drank] [NP [Art the] [N tea]]]]

Conclusion
• Parsing and generation of natural language data is a very important area of linguistics, especially in computer applications of natural language, which have become an important aspect of the computer and information processing industry.

Topic 4: Language and Literacy Acquisition

Keywords
• language acquisition
• innateness hypothesis
• language faculty / Language Acquisition Device (LAD)
• literacy
• levels of literacy
• literacy acquisition

Introduction
• Theme
  – A survey of how linguistic knowledge is acquired/learnt by speakers of a language, from the point of view of spoken language and from the point of view of literacy (reading and writing).
• Objective
  – an understanding of the basic terms and issues in language and literacy acquisition
  – an interface approach: rather than rigidly treating language acquisition as separate and different from literacy acquisition, we will look at how language acquisition relates to literacy acquisition.

What is language acquisition?
• Gleitman and Bloom 1999:434
  – ‘refers to the process of attaining a specific variant of human language…the fundamental puzzle in understanding this process has to do with the open-ended nature of what is learned: children appropriately use words acquired in one context to make reference in the next, and they construct novel sentences to make known their changing thoughts and desires’ (in MIT Encyclopedia of the Cognitive Sciences).
• Crystal 1997: 430
  – The process of learning a first language in children.
  – The analogous process of gaining a foreign or second language.

Explaining how languages are acquired
• In previous lectures we have tried to account for how all and only the grammatical sentences of a language are produced and represented in the brain of the speakers of a language.
• However, a complete account of linguistic knowledge representation must address the issue of how we acquire a language as children and how we learn foreign languages as adults.
• We will mainly be concerned with first language acquisition and not foreign language learning.

Stages of language development
• the single word stage (12-18 months)
  – the language of the child consists of just a few isolated words of the target language, e.g. ‘mamma’, ‘daddy’, etc.
  – very little grammatical development
• the grammar stage (19-29 months)
  – marked by the emergence of a few nominal and verbal inflections in languages that have these.
  – a few phrases and word utterances apparently strung together: ‘mammy, milk’; ‘daddy bye bye’, etc.
• 30 months
  – can produce more adult-like speech: ‘Where's daddy?’ ‘Daddy, I want to go with you.’

Explaining language acquisition:
• The reason for the uniformity and rapidity in child language acquisition is contained in the innateness hypothesis.
• This is, at least, the position of Chomsky and most cognitive approaches to linguistic explanation.
• In this hypothesis, language acquisition is determined by a biologically endowed innate language faculty (also called the Language Acquisition Device (LAD)).
• The LAD, or language learning ‘program’ in children’s brains, provides them with a set of procedures (let us call it an ‘algorithm’ since we are computer/cognitive science inclined) for developing a grammar.
  – Input: linguistic experience they get from the parents and teachers.

The nature of the language faculty
• Children can acquire any language as their native tongue.
  – e.g. a child of Cantonese speaking parents growing up in England can learn to speak perfect English as her native tongue.
• Those aspects of language innately determined are universal
  – language faculty does not vary significantly from human to human
An important aspect in the language faculty is the search for principles of Universal Grammar!

Universal Grammar (UG)
• A theory of the human language faculty, i.e. a module of the mind/brain involved in the basic design of language (Noam Chomsky)
• It is part of an innate biologically endowed language faculty, an innate mental organ specific to the human species
• It allows us to perceive and interpret information governed by certain formal constraints
• These formal constraints refer to a system of rules and representations and one of its operations (its grammar) by which the acceptable sentences of a language can be generated
  – Examples of formal universals, linguistic constraints of an abstract nature: the binding principles determining what can or cannot be the antecedent of an anaphoric, pronominal, or fully referential nominal element, etc.

Literacy Acquisition
• Literacy: the ability to read, write and calculate basic numbers
• Difficult to define:
  – can mean different things to different people in different areas: computer literacy, investment literacy, etc.
• Is literacy part of our mental, cognitive faculty?
  – Yes, because any human can acquire literacy, i.e. learn how to read, write and calculate basic numbers, given the right environment

Levels of Literacy (cf. Stages of language acquisition)
• 6 stages of reading (Daswani 1999)
  – Stages 1-3: Pre-reading, decoding, fluency (approx.
grades 1 – 3)
  – Stage 4: Acquiring new knowledge (approx. grades 4 – 8)
  – Stage 5: Reading a range of complex materials critically (grades 9 – 12)
  – Stage 6: Mature reader: able to read for various purposes: professional, personal, civic (university and beyond)

The relationship between language and literacy acquisition
• Traditional/historical view of child language acquisition:
  – learning to speak happens up to the age of five years, while learning to read happens after five.
• Now they are seen as very intertwined, i.e. very related: learning to speak and learning to be literate both deal with learning to use language
• The basis of learning to speak has been outlined to provide an ecology for literacy. The most important lesson is that learning to speak and learning to read are very much interwoven.

Evidence of the interface of language and literacy acquisition
• They are both part of learning to USE language.
• Both need input from the environment.
  – can be compared with Vygotsky's idea of ZOPED, zone of proximal development, i.e. the distance between child initiative and ability of child to do things under the influence of parental support.
  – The learning environment: participants, situation, activity and a mechanism
• Literacy acquisition is like language acquisition (cf. Givon's idea of literacy acquisition as a weak reflex of language acquisition).
• Literacy is best acquired in a language one has acquired.

Conclusion
• Literacy (reading and writing) is then another level/kind of linguistic knowledge representation.
• Spoken and written linguistic knowledge representation interface with each other and are very intertwined.
• Language and literacy acquisition have very important social, educational and cognitive implications.
• Language and literacy acquisition should therefore form an integral part of cognitive science.

References
• David Barton. 1994. The roots of literacy.
Literacy: An Introduction to the Ecology of Written Language. Oxford UK and Cambridge USA: Blackwell. Chapter 9, p.130-139.
• C. J. Daswani. 1999. Literacy. In Bernard Spolsky (ed.) 1999. Concise Encyclopedia of Educational Linguistics. Oxford: Elsevier Science Ltd.
• Viv Edwards and David Corson (eds.) 1997. Encyclopedia of Language and Education, Volume 2: Literacy. Netherlands: Kluwer Academic Publishers.
• Talmy Givon. 1998. The grammar of Literacy. In Syntaxis, 1, 1998: 1-40.
• Elfrieda Hiebert. 1994. Literacy in preschool programs. In Alan C. Purves et al. (eds.) 1994. Encyclopedia of English Studies and Language Arts. New York: Scholastic. 754-756.
• Ernest Lepore and Zenon Pylyshyn (eds.) 1999. What Is Cognitive Science. Blackwell Publishers. (especially chapters 10, 11, 12, and 13)
• Neil Stillings and others. 1995. Cognitive Science: An Introduction. MIT Press. (especially chapters 6, 9, 10, and 11)
• Daniel A. Wagner. 1994. Literacy: definitions. In Alan C. Purves et al. (eds.) 1994. Encyclopedia of English Studies and Language Arts. New York: Scholastic. 748-752.
• R. Wilson and Frank C. Keil (eds.) 1999. The MIT Encyclopedia of the Cognitive Sciences. MIT Press.
  – Lila Gleitman and Paul Bloom. Language Acquisition. p.434-438
  – David Olson. Literacy. p.481-482

Tentative List of research topics for Cognitive Science Students
• Supervisor: Dr. Adams BODOMO
• Topics in Syntax: Theory, Description and Application
  – Building human language components in Computational Systems
  – The LFG treatment of serial verbs, Complex Predicates, and other verbal constructions in various languages: French, Norwegian, Japanese, Chinese, Dagaare, etc.
• Topics in Language and Literacy as cognitive processes
  – Chinese writing and computer technology: Survey and evaluation of various inputting systems.
  – New forms and functions of language and literacy in the age of Information technology (emails, ICQ, bulletin boards, mobile phone texting, etc.): A survey of SMS texting as a cognitive and communicative process in HK
• The grammar of aphasic patients

Further studies - courses by Dr Bodomo
• LING1002 - Language.com: Language in the Contemporary World (1st year undergraduate, co-taught with other staff members)
• LING2011 - Language and Literacy in the Information Age
• LING2032 - Syntactic Theory
• LING2018 - Lexical-Functional Grammar
• LING2041 - Language and Information Technology
• LING2050 - Grammatical Description
• LING2051 - French Syntax and Universal Grammar
• Also consider B.A. in Human Language Technology (HLT) as an option for a minor

Take-home Quiz
• Please submit your answers to your tutor on or before September 22, 2005.

- The End -