Natural Language

What do we mean when we speak of a natural
language?

Let us contrast two kinds of language:
"unnatural" (or formal) and "natural".
Unnatural language

Computer languages, such as C++, Java,
Prolog, or LISP.

They are highly constrained; they have a very rigid
syntax.

They work with a very limited initial vocabulary.
A program
program AddNumbersInTheTable (input, output);
var
  table: array [1..200] of integer;
  index: integer;
  sum: real;
begin
  { table is assumed to have been filled in elsewhere }
  index := 1;
  sum := 0;
  { add only the first 100 of the 200 entries }
  while index <= 100 do
  begin
    sum := sum + table[index];
    index := index + 1;
  end;
end.
Natural Languages

tend to be far less precisely defined.

Examples include the languages we speak, write and read:
English, American, French, Chinese, Japanese, or Sanskrit.

They have a very complex syntax or, perhaps, do not even conform
to a well-defined syntax, especially as they are actually used.

They typically have an enormous vocabulary.
Natural language dialogue




Here are some numbers in a table.
Add them.
All of them?
No, just the first 100.
Natural Language and Perception?

Natural language is conveyed in a number of ways.
One of the most important ways is that it is
"spoken".

Speaking a language places sound patterns in the
environment.

These sound patterns join other sounds in the
environment, produced by the wind rushing through
the trees, by machinery, and by other speakers.

Natural language is also conveyed via text, in books,
letters, newspapers and on computer screens.

What sensor do we use to interpret text? What is
perception in this case?
Are we reasoning about our perceptions?
Do we perceive that the string "My hat is red" is a
code for a conceptual entity?


Machine Translation

One of the original motivating reasons for
studying Natural Language was to build
machines which could translate text from one
language to another.

This was originally thought to be a modest
task.
The method first envisioned was to:

1. Replace the words in the language to be
translated with equivalent words in the target
language.
2. Use syntax rules to clean up the resulting
sentences.
Sometimes this works.

Consider the following example.
English Sentence:

I must go home.
Word replacements into German:
I -> Ich
must -> muss
go -> gehen
home -> nach Hause

Resulting sentence: Ich muss gehen nach Hause.

Syntax transform: move the verb to the end of the
sentence.

German sentence: Ich muss nach Hause gehen.

This is pretty good!
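
The two-step method can be sketched in a few lines of Python. This is only an illustration: the dictionary EN_TO_DE, the set INFINITIVES and the function names are hypothetical, and the single reordering rule covers nothing beyond this one example.

```python
# Hypothetical word-for-word dictionary covering only the example sentence.
EN_TO_DE = {"i": "Ich", "must": "muss", "go": "gehen", "home": "nach Hause"}

# Infinitives this toy rule knows how to move (an assumption for illustration).
INFINITIVES = {"gehen"}


def substitute(sentence):
    """Step 1: replace each English word with its dictionary entry."""
    words = sentence.rstrip(".").lower().split()
    return [EN_TO_DE[w] for w in words]


def move_verb_to_end(words):
    """Step 2: one syntax rule, move the infinitive to the end of the sentence."""
    verbs = [w for w in words if w in INFINITIVES]
    rest = [w for w in words if w not in INFINITIVES]
    return rest + verbs


def translate(sentence):
    return " ".join(move_verb_to_end(substitute(sentence))) + "."


print(translate("I must go home."))  # Ich muss nach Hause gehen.
```

Any word missing from the dictionary raises a KeyError, and a word with several meanings gets whichever single entry the table happens to hold.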
Here is an unlucky example.



English: The spirit is willing but the flesh is weak.
Translate: English -> Russian -> English
Result: The vodka is strong but the meat is rotten.
Words can mean many things

Most words have several meanings, and often the meanings are unrelated. Here is a
simple word which has a number of unrelated meanings.
bow: a ribbon tied into a decorative configuration
bow: an instrument used to project an arrow
bow: the forward portion of a boat
bow: an act performed out of respect
bow: a deviation of an object from a straight line
Placement

For one thing, selection of the proper meaning
of a word seems to depend on where that
word occurs in the sentence.
The pen is in the box.
The box is in the pen.
Context

Sentences change their meaning depending on the
context in which the sentence is presented.

The following sentence is ambiguous even if we
specify the meaning of each of the words.
I saw the man on the hill with a telescope.

How well do you think the word substitution method
of machine translation would work on this one?
Natural Language Understanding

The early work in machine translation
made it clear that there was more to translating
natural language than simply substituting words and
massaging the syntax.

Something "deeper" was afoot.

To properly translate a sentence the machine must
first "understand" it. What do we mean by "deeper"
and "understand"? How do we do it?
What does the following phrase mean?

Water pump pulley adjustment screw threads
damage report summary.
How do we understand conversations?

What is going on in the following dialogue?
Do you know the time?
Yes.
Could you tell me the time?
Yes.
Will you tell me the time?
Yes.
I need to know the time.
I understand.
Eliza: Step toward Natural Language
Understanding?

Although its author, Joseph Weizenbaum,
would strongly disagree, it has often been said
that Eliza is a demonstration of natural
language understanding ... at least to some
extent.
Eliza's understanding based on key
words

Eliza had a set of templates, each looking for a key
word in the input sentence.

These templates were of the following form:
(* keyword *)
where the *'s are meant to be wildcards, like the ones used in
UNIX for filename descriptions. They are meant to match
with any string.

The template
(* computers *)
for example, would match the sentence
I am really frustrated with computers.

by matching the first "*" of the template with the
string "I am really frustrated with " and the second
"*" with the string ".".

Eliza's "understanding" of the sentence was simple,
to be sure, but it was enough to return "meaningful"
sentences to the user ... in the context of a therapy
session.

The program would store the strings which were
matched to the *'s and sometime later generate a new
sentence, often using these stored strings.

The responses were built into a table, referenced by
the template.

Whenever Eliza found a match, she would generate one of
the corresponding responses. Question: Is this
understanding?
Template           Responses
(* computers *)    Do computers frighten you?
(* mother *)       Tell me more about your family.
(I hate *)         You say you hate *.
                   Why do you hate *?
no match           Please go on.
                   Tell me more about *.
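
The template matching described above can be sketched in Python with regular expressions. This is a rough illustration, not Weizenbaum's program: the translation of templates into regular expressions, the names RULES, template_to_regex and respond, and the choice argument are all assumptions. The no-match response that reuses an earlier stored string is left out.

```python
import re

# Template/response pairs taken from the table; everything else is an assumption.
RULES = [
    ("* computers *", ["Do computers frighten you?"]),
    ("* mother *", ["Tell me more about your family."]),
    ("I hate *", ["You say you hate *.", "Why do you hate *?"]),
]
NO_MATCH = "Please go on."


def template_to_regex(template):
    """Turn each '*' into a group that matches any string, as a wildcard would."""
    pieces = []
    previous_was_word = False
    for token in template.split():
        if token == "*":
            pieces.append("(.*)")
            previous_was_word = False
        else:
            if previous_was_word:
                pieces.append(r"\s+")  # whitespace between two literal words
            pieces.append(re.escape(token))
            previous_was_word = True
    return re.compile("".join(pieces), re.IGNORECASE)


def respond(sentence, choice=0):
    """Answer with one response of the first template that matches."""
    for template, responses in RULES:
        match = template_to_regex(template).fullmatch(sentence.strip())
        if match:
            # Reuse the last matched string, trimmed of spaces and punctuation.
            captured = match.groups()[-1].strip(" .!?")
            return responses[choice % len(responses)].replace("*", captured)
    return NO_MATCH


pattern = template_to_regex("* computers *")
print(pattern.fullmatch("I am really frustrated with computers.").groups())
# ('I am really frustrated with ', '.')
print(respond("I hate Mondays."))            # You say you hate Mondays.
print(respond("I hate Mondays.", choice=1))  # Why do you hate Mondays?
print(respond("The weather is nice."))       # Please go on.
```

The keyword is matched as a plain substring, so the sketch would also fire on words that merely contain it; nothing here looks at what the sentence means.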
Conceptual Dependency

An attempt to represent sentences about
actions in a way that addresses the similarity
in meaning.
Capturing the similarity of meaning

The goal is to capture the similarity of meaning found in sentences
like:
Mary took the book from John.
Mary received the book from John.
Mary bought the book from John.
John gave the book to Mary.
John sold the book to Mary.
John sold Mary the book.
John bought the book from Mary.
John traded a cigar to Mary for the book.

In all these examples the "ownership" of the
book is transferred between John and Mary.

The direction may vary and the intention may
change, but the result of the event is
similar.
Frames

Schank and Abelson used frames to represent
events. These frames had four slots:
actor: the agent causing the event
action: the action performed by the actor
object: the object being acted upon
direction: the direction in which that action is oriented
All actions in terms of a small set of
“primitive actions”.

One of the lists of primitive actions they worked with was:

atrans: transfer of possession
mtrans: transfer of mental information
ptrans: physical transfer of an object from one place to another
mbuild: to build mental structures
speak: the act of making sounds
ingest: to eat
grasp: to hold in one's hand
propel: to apply a force to an object
attend: to focus one's consciousness upon
move: to move a body part, e.g. moving an arm
Four similar sentences
John gave Mary the book.
actor: John
action: atrans
object: book
direction: from John to Mary

John took the book from Mary.
actor: John
action: atrans
object: book
direction: from Mary to John

John bought the book from Mary.
actor: John
action: atrans
object: book
direction: from Mary to John

Mary sold the book to John.
actor: Mary
action: atrans
object: book
direction: from Mary to John
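
The four frames can be written down as data to show what the representation buys us. The sketch below is an assumption-laden illustration: the class name Frame, the split of the direction slot into source and target, and the helper same_transfer are not part of the original work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    actor: str
    action: str   # a primitive action such as "atrans"
    obj: str      # the "object" slot
    source: str   # direction: from ...
    target: str   # direction: ... to


FRAMES = {
    "John gave Mary the book.": Frame("John", "atrans", "book", "John", "Mary"),
    "John took the book from Mary.": Frame("John", "atrans", "book", "Mary", "John"),
    "John bought the book from Mary.": Frame("John", "atrans", "book", "Mary", "John"),
    "Mary sold the book to John.": Frame("Mary", "atrans", "book", "Mary", "John"),
}


def same_transfer(a, b):
    """True when two frames describe the same change of possession."""
    return (a.action, a.obj, a.source, a.target) == (b.action, b.obj, b.source, b.target)


took = FRAMES["John took the book from Mary."]
bought = FRAMES["John bought the book from Mary."]
sold = FRAMES["Mary sold the book to John."]
gave = FRAMES["John gave Mary the book."]

print(took == bought)             # True: identical frames
print(same_transfer(bought, sold))  # True: only the actor differs
print(same_transfer(gave, took))    # False: the book moves the other way
```

Note that "took" and "bought" collapse into the same frame, which is exactly the kind of lost subtlety that critics of the four-slot frame pointed to.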
Enhancements

One criticism of this representation was that the
“understanding” did not capture some of the subtleties present
in many sentences.

In response, enhancements were made to the frame structure.
actor: the agent causing the event
action: the action performed by the actor
object: the object being acted upon
direction: the direction in which that action is oriented
instrument: device used to accomplish action
cause: events caused by the action
time: timeframe

John went to the store
actor: John
action: ptrans
object: John
direction: to (the store)
instrument: unspecified
cause: unspecified
time: unspecified

John flew to New York
actor: John
action: ptrans
object: John
direction: to (New York)
instrument:
    actor: plane
    action: propel
    object: plane
    direction: to (New York)
    other fields: unspecified
time: past
Challenges: John took a plane to New
York

Is it
actor: John
action: ptrans
object: plane
direction: to New York
Or should it be ...
actor: John
action: ptrans
object: John
direction: to (New York)
instrument:
    actor: plane
    action: propel
    object: plane
    direction: to (New York)
time: past
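
The two candidate readings differ in structure, not just in slot values: the second one nests a whole frame inside the instrument slot. A small Python sketch makes that explicit. The class name Frame, the field names and the helper what_moved are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Frame:
    actor: str
    action: str
    obj: str                              # the "object" slot
    direction: str
    instrument: Optional["Frame"] = None  # a nested frame, or None if unspecified
    time: Optional[str] = None


# Reading 1: John physically moves the plane.
reading_1 = Frame("John", "ptrans", "plane", "to New York")

# Reading 2: John moves himself, by means of a plane propelled to New York.
reading_2 = Frame(
    "John", "ptrans", "John", "to New York",
    instrument=Frame("plane", "propel", "plane", "to New York"),
    time="past",
)


def what_moved(frame):
    """In a ptrans frame, the object slot names what changed location."""
    return frame.obj


print(what_moved(reading_1))        # plane
print(what_moved(reading_2))        # John
print(reading_2.instrument.action)  # propel
```

Nothing in the sentence itself tells a program which of the two frames to build; that choice is the challenge.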
MARGIE (Schank, et al. 1975).

MARGIE was a natural language program
composed of a conceptual analyzer, a
sentence generator and an inference module.

The conceptual analyzer was a kind of
compiler that converted English sentences
into conceptual dependency structures.
English Sentences -> Conceptual
Dependency (CD).

The sentence generator produced "natural language
sentences" from conceptual dependency structures.
Conceptual Dependency -> Natural Language
Sentences.
Schank and his students experimented with sentence
generators in a number of languages.
CD -> English
CD -> French

The hope in these efforts was, of course, to get a
handle on a solution to the language
translation problem.

The inference module would make inferences
from the CD expressions in a data base
together with the new CD expressions created
by the new sentences.

Experiments were conducted to see what
would happen with and without the inference
module enabled.
Inferencing disabled

Here is an example of the output generated with inferencing
disabled.

Input
John killed Mary by choking her.

Output paraphrases:
John strangled Mary.
John choked Mary and she died because she could not breathe.
Mary died because she was unable to inhale some air and she was unable
to inhale some air because John grabbed her by the neck.
Inferencing enabled

When inferencing was enabled the output became
even more interesting:

Input
John gave Mary an aspirin.

Output Sentences
John believes that Mary is sick.
Mary is sick.
Mary wants to feel better.
Mary will ingest the aspirin.

Programs like MARGIE are very interesting. Their
successes and failures help us understand the natural
language understanding process.

When we write programs like this we are trying to
discover what it is about sentences that makes it
possible for us to:
Understand them.
Understand ones we have not seen before.
Produce new ones that others can understand.
Language building blocks or units

Printed text, for example, is organized according to
the following language units:
1. Letters
2. Words
3. Phrases
4. Sentences
5. Paragraphs
6. Sections
7. Chapters
8. Books
9. Fields
Syntax or sentence structure

It seems compelling that syntax or sentence structure plays a
role in determining the meaning of a sentence.
I saw the man on the hill with the telescope.
[Figure 17-01: four alternative structures for this sentence]

If we can determine the structure of the sentence, it seems to
help resolve the ambiguity.

If we can determine that “on the hill” modifies the man and
“with the telescope” modifies the action of seeing, then the
sentence meaning is resolved.
It means that I used the telescope to see the man and that
the man was on the hill.

If, on the other hand, “with the telescope” modifies
“the hill”,
then the sentence means that I saw the man that was located
on the particular hill that had a telescope on it. Etc.
Grammar

One form of grammar describes a sentence in terms of the concepts “noun
phrase, verb phrase, noun, verb, prepositional phrase, adverb, preposition,
etc.”
[Figure 17-02: parse tree rooted at Sentence, with nodes NP, ADJ, N, PP, VP, V, ADV, NP]
Example

sentence : nounphrase, verbphrase.
nounphrase : determiner, nounexpression.
nounphrase : nounexpression.
nounexpression : noun.
nounexpression : adjective, nounexpression.
verbphrase : verb, nounphrase.
determiner : the | a.
noun : dog | bone | mouse | cat.
verb : ate | chases.
adjective : big | brown | lazy.
Parsing in Prolog

To begin with, we will simply determine if a sentence is a
legal sentence. In other words, we will write a predicate
sentence/1, which will determine if its argument is a sentence.

Our two examples assume we have broken the sentences into
words (by testing for the “whitespace” between words) and
stored them in the following Prolog lists:
[the,dog,ate,the,bone]
[the,big,brown,mouse,chases,a,lazy,cat]
Basic strategies for parsing

In the generate-and-test strategy,
the list to be parsed is split in different ways,
with the splittings tested to see if they are components of a
legal sentence.
Prolog code:
sentence(L) :- append(NP, VP, L), nounphrase(NP),
    verbphrase(VP).

The append predicate will generate possible values for the
variables NP and VP, by splitting the original list L.

The next two goals test each of the portions of the list to see if
they are grammatically correct. If not, backtracking into
append/3 causes another possible splitting to be generated.

The clauses for nounphrase/1 and verbphrase/1 are similar to
sentence/1, and call further predicates that deal with smaller
units of a sentence, until the word definitions are met, such as
noun([dog]). verb([ate]).
noun([mouse]). verb([chases]).
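
For readers who do not know Prolog, the same generate-and-test idea can be sketched in Python for the example grammar. The helper names splits, word_in and both are assumptions; splits plays the role of append/3 and any() stands in for backtracking.

```python
def splits(words):
    """Generate every way to cut the list in two, like append(X, Y, L)."""
    for i in range(len(words) + 1):
        yield words[:i], words[i:]


def word_in(vocabulary):
    """Build a test for a one-word list, like the fact noun([dog])."""
    return lambda words: len(words) == 1 and words[0] in vocabulary


determiner = word_in({"the", "a"})
noun = word_in({"dog", "bone", "mouse", "cat"})
verb = word_in({"ate", "chases"})
adjective = word_in({"big", "brown", "lazy"})


def both(first, second, words):
    """Generate each split of words, then test the two parts."""
    return any(first(a) and second(b) for a, b in splits(words))


def nounexpression(words):
    # nounexpression : noun.  |  adjective, nounexpression.
    return noun(words) or both(adjective, nounexpression, words)


def nounphrase(words):
    # nounphrase : determiner, nounexpression.  |  nounexpression.
    return both(determiner, nounexpression, words) or nounexpression(words)


def verbphrase(words):
    # verbphrase : verb, nounphrase.
    return both(verb, nounphrase, words)


def sentence(words):
    # sentence : nounphrase, verbphrase.
    return both(nounphrase, verbphrase, words)


print(sentence(["the", "dog", "ate", "the", "bone"]))  # True
print(sentence(["the", "big", "brown", "mouse", "chases", "a", "lazy", "cat"]))  # True
print(sentence(["dog", "the", "ate"]))  # False
```

Most of the splits that are generated fail their test at once, which is why this strategy does a lot of wasted work on longer sentences.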
Difference strategy

The more efficient strategy is to skip the
generation step and pass the entire list to the
lower level predicates,
which in turn will take the grammatical portion of
the sentence they are looking for from the front of
the list and return the remainder of the list.
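
The difference strategy can be sketched in Python as well. Each parser below receives the whole remaining list and yields the remainder left after it has taken its part from the front; Python generators stand in for Prolog backtracking. The helper names word and seq are assumptions made for this sketch.

```python
def word(vocabulary):
    """Parser that removes one word of the given class from the front of the list."""
    def parse(words):
        if words and words[0] in vocabulary:
            yield words[1:]
    return parse


determiner = word({"the", "a"})
noun = word({"dog", "bone", "mouse", "cat"})
verb = word({"ate", "chases"})
adjective = word({"big", "brown", "lazy"})


def seq(first, second, words):
    """Run first, then run second on each remainder that first returns."""
    for rest in first(words):
        yield from second(rest)


def nounexpression(words):
    # nounexpression : noun.  |  adjective, nounexpression.
    yield from noun(words)
    yield from seq(adjective, nounexpression, words)


def nounphrase(words):
    # nounphrase : determiner, nounexpression.  |  nounexpression.
    yield from seq(determiner, nounexpression, words)
    yield from nounexpression(words)


def verbphrase(words):
    # verbphrase : verb, nounphrase.
    yield from seq(verb, nounphrase, words)


def sentence(words):
    """A legal sentence can be consumed completely, leaving an empty remainder."""
    return any(not rest for rest in seq(nounphrase, verbphrase, words))


print(list(nounphrase(["the", "dog", "ate", "the", "bone"])))  # [['ate', 'the', 'bone']]
print(sentence(["the", "dog", "ate", "the", "bone"]))  # True
print(sentence(["the", "dog", "ate"]))  # False
```

No splitting is generated in advance: each word is examined only by the parsers that actually reach it.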
CD “grammar”

Another grammar is conceptual dependency (CD, Schank and
Abelson). It describes a sentence as a flat structure with well
defined roles: actor, action, object, direction, instrument,
time, etc.
[Figure 17-03: Sentence as a flat structure with the roles actor, action, object, direction, instrument, time]
Transformational grammar

Another approach is transformational grammar (by Noam Chomsky).

The key idea here is that somewhere “deep” in the mind is a structure (to
be determined) that represents ideas and thoughts, perhaps utterances.

From this structure, there is a set of transformations that translate or
convert “deep” structures into a “phrase” structure, like the one grammar
school would produce.

Of course, analogous transformations would convert the phrase structures
into “deep” structures. This work distinguishes between the structure of
the sentence, namely phrase structure, and the structure of the ideas, the
deep structure. The conceptual dependency work of Schank and Abelson
does not.
[Figure 17-04: transformations connect the "DEEP STRUCTURE" to the phrase structure tree (Sentence, NP, ADJ, N, PP, VP, V, ADV, NP)]
A key question


A second key question is: Once we decide on
a grammar, "How can we design an algorithm
that can take a sentence and determine its
structure?"
Such an algorithm would be referred to as a
parsing algorithm.
Consider the sentence group

I saw the man on the hill with a telescope. He
seemed to be looking at the moon.

Yes, your honor, I know he was quite a
distance away, but I had been setting up my
gear for an evening of star watching. I saw
the man on the hill with a telescope.
Context

Clearly the sentence “I saw the man on the
hill with the telescope” loses a lot of its
ambiguity when it is imbedded in a story or
paragraph.
SAM (Script Applier Mechanism)

Another Schank and Abelson program, SAM,
worked with conceptual dependency
structures that were collected into larger
structures called scripts.

These scripts represented stories or patterns of
events we often encounter together.
[Figure 17-06: a script as a sequence of CD structures: CD1, CD2, then alternative branches CD31, CD32, ... and CD41, CD42, ...]
A typical script: a restaurant script.

We could think of
CD1 as representing the event of entering the restaurant.
CD2 could represent the waiter asking if you wanted dinner
or cocktails.
CD31 would be the event of moving to a seat in the lounge;
CD32 would be ordering a drink, etc.




SAM had a number of scripts in its internal
representations.
As sentences were entered SAM would convert them
to CD and focus on a script that "explained" them.
The script provided the logical thread that tied the
sentences together and helped resolve ambiguities in
individual sentences.
In a sense, the selection of the appropriate script was
an "understanding" of the collection of sentences.



John went to the restaurant.
He ordered a Big Mac and an order of fries.
He ate and returned to the turnpike.

Depending on your collection of scripts, one might imagine
the first sentence selecting a number of restaurant scripts,
including one for McDonald's.

The second sentence would eliminate all but the McDonald's
script.

The third sentence could invoke, among other scripts, one
about driving long distances and could identify the
McDonald's script with the CD in the long-distance-driving
script associated with stopping to eat.
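
The narrowing described above can be imitated with a toy sketch in Python. It is not how SAM was implemented: SAM matched CD structures, while here plain strings stand in for them, and the script names, event names and the functions invoke and narrow are all hypothetical.

```python
# Hypothetical scripts: each name maps to the set of events it can explain.
SCRIPTS = {
    "fancy-restaurant": {"enter restaurant", "order wine", "eat", "pay"},
    "mcdonalds": {"enter restaurant", "order big mac", "order fries", "eat"},
    "long-distance-driving": {"drive", "stop to eat", "return to turnpike"},
}


def invoke(events):
    """Select every script that mentions at least one of the events."""
    return {name for name, known in SCRIPTS.items() if events & known}


def narrow(candidates, events):
    """Keep only the candidate scripts that explain all of the new events."""
    return {name for name in candidates if events <= SCRIPTS[name]}


# "John went to the restaurant."
candidates = invoke({"enter restaurant"})
print(sorted(candidates))  # ['fancy-restaurant', 'mcdonalds']

# "He ordered a Big Mac and an order of fries."
candidates = narrow(candidates, {"order big mac", "order fries"})
print(sorted(candidates))  # ['mcdonalds']

# "He ate and returned to the turnpike."
print(sorted(invoke({"return to turnpike"})))  # ['long-distance-driving']
```

Linking the McDonald's script to the "stop to eat" event of the driving script, as the text describes, is left out of this sketch.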