So far we have considered sets of objects, and relationships between these objects such as the symmetric relationships in the graph model of Chapter 2, and the asymmetric or vertical relationships of the concept hierarchies in Chapter 3. We have started to feel our way around a graph or a hierarchy, following chains of relationships and learning about the shape of the space in the vicinity of a few interesting example words.
We have still said very little about measuring these relationships. In building the graph we considered how much weight should be given to each link, and gave some rules of thumb for ranking links. When using WordNet, we wanted to give greater significance to the relationship between oak and tree than to the relationship between oak and living thing, because oak is closer to tree than it is to living thing.
In order to obtain more general results, we need to be able to compare relationships more systematically. One of the most tried and tested mathematical techniques for doing this is to measure the distance between two points. Conversely, we might try to measure the similarity between two points (small distances corresponding to large similarities and large distances corresponding to small similarities).
This chapter introduces some of the ways that have been used to measure distances and similarities in mathematical spaces such as graphs and hierarchies. The most standard distance measures in mathematics are called metrics, which must satisfy certain conditions or axioms (such as being symmetric).
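To make the axioms concrete: a metric d must satisfy d(p, p) = 0, symmetry (d(p, q) = d(q, p)), and the triangle inequality (d(p, r) ≤ d(p, q) + d(q, r)). A minimal sketch, checking these conditions for the Euclidean distance on a few sample points (the points themselves are arbitrary, chosen only for illustration):

```python
import math
import itertools

def euclidean(p, q):
    """Straight-line (Euclidean) distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

points = [(0, 0), (3, 4), (6, 0)]

# Check the metric axioms on every ordered triple of sample points.
for p, q, r in itertools.permutations(points, 3):
    assert euclidean(p, p) == 0                  # identity
    assert euclidean(p, q) == euclidean(q, p)    # symmetry
    # triangle inequality (small tolerance for floating point)
    assert euclidean(p, r) <= euclidean(p, q) + euclidean(q, r) + 1e-9

print("metric axioms hold on the sample points")
```

As the chapter goes on to show, symmetry in particular is an axiom that linguistic judgments of similarity do not always respect.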
However, we must also take great care to test whether these mathematical techniques are actually appropriate when dealing with language. For example, we shall see that some of the properties of metrics are not always ideal for describing distances between words and concepts, and this chapter presents a few case studies in what can go wrong if we are not careful and sensitive to the goal of our work, which is not to use mathematical ideas of distance for their own sake but to infer similarity of meaning in natural language.
Measuring similarity between words can enable us to build whole classes of words with similar semantic properties --- sets of words which are all tools or all musical instruments, for example --- and we demonstrate that the graph model of Chapter 2 can be used for this purpose very successfully, provided we show great care in the presence of ambiguous words. Ambiguous words can sometimes behave like semantic wormholes, accidentally transporting us from one area of meaning to another. But we can also use this effect positively, because finding those words which have this strange wormhole effect when measuring distances helps us to recognize which words are ambiguous in the first place.
Distance functions depend on the space you're working in --- one may choose different ways of measuring distance that are appropriate for different applications. (For example, people often quote driving distances in minutes rather than miles because it is often the more relevant consideration.)
We consider shortest paths in graphs and the Euclidean distance (given by Pythagoras' theorem) as examples.
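Both examples can be sketched in a few lines. In a graph with unweighted links, the distance between two nodes is the number of links on a shortest path between them, which breadth-first search finds directly; the word graph below is a toy example chosen for illustration, not data from the text:

```python
import math
from collections import deque

def shortest_path_length(graph, start, goal):
    """Breadth-first search: length (in links) of a shortest path."""
    queue, seen = deque([(start, 0)]), {start}
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None  # no path exists

# Toy word graph (illustrative only).
graph = {
    "oak": ["tree", "acorn"],
    "tree": ["oak", "plant"],
    "plant": ["tree", "living thing"],
    "living thing": ["plant"],
    "acorn": ["oak"],
}
print(shortest_path_length(graph, "oak", "tree"))          # 1 link
print(shortest_path_length(graph, "oak", "living thing"))  # 3 links

def euclidean(p, q):
    """Pythagoras' theorem, extended coordinate-wise to n dimensions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

print(euclidean((0, 0), (3, 4)))  # 5.0
```

In this toy graph, oak is one link from tree but three links from living thing, matching the intuition from Chapter 3 that oak is closer to tree than to living thing.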
The cosine measure is a very important similarity measure for points in the plane (and in higher dimensions as we shall see later in the book). The cosine measure assigns a high similarity to points that are in the same direction from the origin, zero similarity to points that are perpendicular to one another, and negative similarity for those that are pointing in opposing directions to one another.
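A minimal sketch of the cosine measure, computed as the dot product of two vectors divided by the product of their lengths; the three pairs below exhibit the same direction, perpendicular, and opposing cases just described:

```python
import math

def cosine(u, v):
    """Cosine of the angle between vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norms

print(cosine((1, 0), (2, 0)))   # same direction from the origin: 1.0
print(cosine((1, 0), (0, 3)))   # perpendicular: 0.0
print(cosine((1, 0), (-1, 0)))  # opposing directions: -1.0
```

Note that the cosine measure ignores the lengths of the vectors entirely: (1, 0) and (2, 0) receive the maximum similarity of 1, which is exactly why direction rather than magnitude carries the meaning in the models developed later in the book.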
This section also describes interesting similarity and distance measures that have been used to compare words in a taxonomy.
This section explores some of these objections to the mathematical abstraction, and considers ways of measuring distances between words that are more sensitive to these psychological objections. In particular, by factoring out the overall frequency or popularity of a word in a graph, we show that similarities can be measured in the word graph of Chapter 2 in a way that models the psychological observations much more effectively.
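One plausible sketch of this idea (the function, data, and formula here are illustrative assumptions, not the exact measure developed in the text): count the neighbours two words share in the graph, then discount by each word's overall frequency, so that very common "hub" words do not appear similar to everything:

```python
import math

def neighbour_similarity(graph, freq, w1, w2):
    """Hypothetical sketch: shared neighbours in the word graph,
    discounted by the geometric mean of the words' frequencies.
    Illustrates factoring out popularity; not the book's formula."""
    shared = len(set(graph[w1]) & set(graph[w2]))
    return shared / math.sqrt(freq[w1] * freq[w2])

# Toy data: 'thing' is a high-frequency hub word.
graph = {
    "oak":   ["tree", "wood", "thing"],
    "pine":  ["tree", "wood", "thing"],
    "thing": ["oak", "pine", "tree", "wood"],
}
freq = {"oak": 10, "pine": 10, "thing": 1000}

print(neighbour_similarity(graph, freq, "oak", "pine"))   # high: shared neighbours, low frequencies
print(neighbour_similarity(graph, freq, "oak", "thing"))  # low: the hub's frequency is factored out
```

Without the frequency term, oak would look nearly as similar to the hub word thing as to pine; dividing by frequency restores the intuitive ranking.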
Relationships where you can make inferences like this are called transitive (not to be confused with transitive verbs). Some relationships in natural language and spatial reasoning are transitive, such as the 'is above' relationship. Others are sometimes transitive, such as 'is next to'.
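Transitivity is easy to state as a check on a relation represented as a set of pairs: whenever (a, b) and (b, c) hold, (a, c) must hold too. A minimal sketch, with toy relations chosen to mirror the 'is above' and 'is next to' examples:

```python
def is_transitive(pairs):
    """True if whenever (a, b) and (b, c) are in the relation,
    (a, c) is in the relation as well."""
    rel = set(pairs)
    return all((a, d) in rel
               for a, b in rel
               for c, d in rel
               if b == c)

# 'is above' is transitive: the chain roof-ceiling-floor is closed.
above = {("roof", "ceiling"), ("ceiling", "floor"), ("roof", "floor")}
print(is_transitive(above))    # True

# 'is next to' is only sometimes transitive: the chain is not closed here.
next_to = {("france", "germany"), ("germany", "poland")}
print(is_transitive(next_to))  # False: France is not next to Poland
```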