Chapter 3. The Vertical Direction: Concept Hierarchies.
The previous chapter demonstrated a technique whereby a computer, with
very little knowledge of the way English works, can tell us that
hungary, poland, romania, bulgaria and czechoslovakia are
all related. So far so good, but this raises the question: what are
they? The next step would be to have a technique for working out that
they are all European countries, something like the output of the
following interactive demo (now deprecated):
Class Label | Score
European country, European nation | 3.500
Balkan country, Balkan nation, Balkans, Balkan state | 1.250
country, state, land, nation | 0.972
administrative district, administrative division, territorial division | 0.458
district, territory | 0.268
region | 0.176
location | 0.124
object, physical object | 0.092
entity, something | 0.072
Produced by the Infomap group, CSLI, Stanford University
Or, if we want to know what cheese and its top five neighbours
(butter, milk, meat, bread and wine) from the left-hand cluster in
Figure 2.6 have in common, we would like to be able to take these
words and classify them as follows:
Class Label | Score
foodstuff, food product | 2.500
dairy product | 2.250
food, nutrient | 0.944
substance, matter | 0.472
object, physical object | 0.334
beverage, drink, drinkable, potable | 0.250
entity, something | 0.248
money | -0.250
combatant, battler, belligerent, fighter, scrapper | -0.250
baked good, baked goods | -0.250
dark red | -0.250
Needless to say, there is an algorithm behind this class labelling
trick, and it relies on a lot of careful work by many Princeton
students over many years. The goal of this chapter is to describe this
work and the mathematical ideas behind it. The algorithm
itself is running here and you're
welcome to try it out for yourself. Two of the most important
characters in this story --- Aristotle and Darwin --- are not usually
thought of as mathematicians at all, but nonetheless they described
their ideas using mathematical models which were, if anything, far
ahead of their time.
The idea is that concepts can be arranged into a hierarchy,
a tree of meaning whose trunk and branches correspond to general
concepts and whose twigs and leaves correspond to particular or
specific concepts. We shall see that there are many examples of this
sort of structure in common use, from the famous 'Tree of Life' to
postal addresses and computer file systems. If the symmetric
relationships of the previous chapter can be thought of as level or
horizontal in character, the relationship between a child-node and a
parent-node in a hierarchy (most clearly exemplified in the
relationship between a species and its genus in the Tree of Life) can
be thought of as a vertical relationship.
Sections
1. Phylogeny, or the Tree of Life
This section describes the way Aristotle and later Charles Darwin used
the idea of a tree to describe the way a species inherits properties
from its genus. Such a structure is called a taxonomy.
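The idea of inheriting properties down a taxonomy can be sketched in a few lines of code. The following is only an illustration: the lineage is heavily simplified, the property lists are our own examples rather than anything taken from Aristotle or Darwin, and the names PARENT, PROPERTIES and inherited_properties are hypothetical.

```python
# A simplified, illustrative lineage stored as child -> parent.
# Several ranks are omitted; this is not a complete classification.
PARENT = {
    "Homo sapiens": "Homo",
    "Homo": "Hominidae",
    "Hominidae": "Primates",
    "Primates": "Mammalia",
    "Mammalia": "Animalia",
}

# Example properties attached to the node where they are first introduced.
PROPERTIES = {
    "Animalia": {"multicellular"},
    "Mammalia": {"has hair", "produces milk"},
    "Primates": {"grasping hands"},
}


def inherited_properties(node):
    """Collect the properties of a node and of every node above it."""
    found = set()
    while node is not None:
        found |= PROPERTIES.get(node, set())
        node = PARENT.get(node)  # None once we reach the root
    return found


if __name__ == "__main__":
    print(sorted(inherited_properties("Homo sapiens")))
```

Each property is stored once, at the most general node to which it applies, and everything below that node receives it by walking upwards. This is the economy that makes a taxonomy useful as a way of organising knowledge.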
2. Directed Relationships
Such a collection of relationships can be represented as another kind
of graph, but now the links in the graph are directed rather than
symmetric. A good example of a graph with directed links is the world
wide web, in which the link structure can be used to give an estimated
measure of the popularity of different webpages. This is how Google's
original PageRank algorithm worked.
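As a rough illustration of how directed links can yield a popularity estimate, here is a minimal sketch of the textbook power-iteration form of PageRank. It is not Google's production system: the damping value of 0.85, the fixed number of iterations, the treatment of pages without outgoing links, and the three-page example graph are all assumptions made for this sketch.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate page popularity from a dict mapping page -> list of pages it links to."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            if targets:
                # A page passes its rank along its outgoing links, split equally.
                share = damping * rank[p] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[p] / n
                for q in pages:
                    new_rank[q] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical web of three pages: a and b both link to c, and c links back to a.
    example = {"a": ["c"], "b": ["c"], "c": ["a"]}
    for page, score in sorted(pagerank(example).items(), key=lambda kv: -kv[1]):
        print(page, round(score, 3))
```

Note that the direction of a link matters: page b links to c but receives no links itself, so it ends up with the lowest score, whereas in a symmetric graph the two pages would be treated alike.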
3. Antisymmetric Relationships and Trees
The relationships in trees are not only directed, they are also
antisymmetric: if A is an ancestor of B, then
B cannot be an ancestor of A.
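The antisymmetry condition can be checked mechanically. The sketch below assumes that a hierarchy is stored as a child-to-parent mapping; the function names and the small example hierarchies are hypothetical and chosen only for illustration.

```python
def is_ancestor(parent, a, b):
    """Return True if a lies strictly above b in the child -> parent mapping."""
    seen = set()
    node = b
    while node in parent and node not in seen:
        seen.add(node)  # guards against looping forever on a malformed hierarchy
        node = parent[node]
        if node == a:
            return True
    return False


def is_antisymmetric(parent):
    """Check that no two distinct nodes are ancestors of each other."""
    nodes = set(parent) | set(parent.values())
    return not any(
        is_ancestor(parent, a, b) and is_ancestor(parent, b, a)
        for a in nodes
        for b in nodes
        if a != b
    )


if __name__ == "__main__":
    tree = {"cat": "mammal", "dog": "mammal", "mammal": "animal"}
    loop = {"a": "b", "b": "a"}  # each node claims the other as its parent
    print(is_antisymmetric(tree))
    print(is_antisymmetric(loop))
```

The first mapping is a genuine tree and passes the check. The second contains a cycle, so each node is an ancestor of the other and the ancestor relation is no longer antisymmetric.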
4. Representing Linguistic Meaning in Trees
The idea of using a tree to represent meaning in language goes back to
Aristotle's Categories, a tradition that was followed by many
philosophers and artists.
[Figure: The Tree of Porphyry, one of the earliest examples of a concept
hierarchy. From the ceiling fresco at Schussenried, Germany, by Franz
Georg Herrmann (1757), photograph by Jeffrey Garrett (Northwestern
University Library, October 2000).]
Nowadays linguists use parse trees to represent the grammatical
structure of sentences, and semantic knowledge bases contain
taxonomies which are directly descended from Aristotle's work.
5. WordNet
The most widely available linguistic taxonomy is the Princeton WordNet.
[Figure: Some of the uppermost nodes (major categories) in the WordNet
noun taxonomy.]
WordNet has separate taxonomies for nouns and verbs, and organizes
modifiers (such as adjectives) into pairs of antonyms or
opposites. The insight that substances and actions often lack
'contraries' or antonyms, but that qualities normally have contraries,
is found explicitly in Aristotle's Categories.
6. Finding class labels: Mapping collections of words into WordNet
Coming full circle, this section describes the way we can use WordNet
to give a probable class label for a whole list of words.
[Figure: The neighbours of Poland that we found in Chapter 2 are now
classified as European Countries.]
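To make the class-labelling idea concrete, here is a minimal sketch. It rests on two assumptions that this chapter summary does not state. First, the scoring rule: each word adds 1/d^2 to the score of a class lying d levels above it, and subtracts a fixed penalty of 0.25 from the score of any class that does not lie above it. This rule is consistent with the scores in the first table of this chapter, but it should be read as a reconstruction for illustration. Second, the taxonomy: the PARENT mapping is a hypothetical single-parent fragment, not WordNet itself, in which a word may have several senses.

```python
# Hypothetical fragment of a noun taxonomy, stored as child -> parent.
PARENT = {
    "hungary": "European country",
    "poland": "European country",
    "czechoslovakia": "European country",
    "romania": "Balkan country",
    "bulgaria": "Balkan country",
    "Balkan country": "European country",
    "European country": "country",
    "country": "administrative district",
    "administrative district": "district",
    "district": "region",
    "region": "location",
    "location": "object",
    "object": "entity",
}


def ancestors(node):
    """Yield (ancestor, distance) pairs, with the direct parent at distance 1."""
    distance = 0
    while node in PARENT:
        node = PARENT[node]
        distance += 1
        yield node, distance


def class_label_scores(words, penalty=0.25):
    """Rank candidate class labels for a list of words (assumed scoring rule)."""
    distances = {w: dict(ancestors(w)) for w in words}
    candidates = set()
    for d in distances.values():
        candidates |= set(d)
    scores = {}
    for c in candidates:
        scores[c] = sum(
            # Reward classes above the word, more strongly when they are close;
            # penalise classes that do not cover the word at all.
            1.0 / distances[w][c] ** 2 if c in distances[w] else -penalty
            for w in words
        )
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    words = ["hungary", "poland", "romania", "bulgaria", "czechoslovakia"]
    for label, score in class_label_scores(words):
        print(f"{label:25s} {score:.3f}")
```

Under these assumptions the sketch gives 3.500 for European country, 1.250 for Balkan country and 0.972 for country, in agreement with the first table. The penalty is what stops an overly specific label such as Balkan country from winning, while the 1/d^2 decay stops an overly general label such as entity from winning.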
References
The most important reading for this chapter is Aristotle's Categories.
The following paper describes and evaluates the class-labelling work more technically:
- Dominic Widdows. Unsupervised methods for developing taxonomies by
  combining syntactic and statistical information. In Proceedings of
  HLT/NAACL 2003, Edmonton, Canada, June 2003, pages 276-283.