LM101-007: How to Reason About Uncertain Events using Fuzzy Set Theory and Fuzzy Measure Theory

June 23, 2014
Example of Logical Reasoning in an Environment Characterized by Uncertainty.

Episode Summary:

In real life, there is no certainty. There are always exceptions. In this episode, two methods are discussed for making inferences in uncertain environments. In fuzzy set theory, a smart machine has certain beliefs about imprecisely defined concepts. In fuzzy measure theory, a smart machine has beliefs about precisely defined concepts but some beliefs are stronger than others.

Show Notes:

Hello everyone! Welcome to the seventh podcast in the podcast series Learning Machines 101. In this series of podcasts my goal is to discuss important concepts of artificial intelligence and machine learning in a manner that is, hopefully, both entertaining and educational.

In real life, there is no certainty. There are always exceptions. In this episode, two methods are discussed for reasoning about uncertain events. These methods are called fuzzy set theory and fuzzy measure theory. In fuzzy set theory, a smart machine has certain beliefs about imprecisely defined concepts. In fuzzy measure theory, a smart machine has beliefs about precisely defined concepts but some beliefs are stronger than others.

Uncertainty in the world arises (in part) due to the virtually infinite variations of concepts that are present. Suppose you are developing an artificially intelligent system and you decide you want to create the concepts of “bird” and “flying”. One idea is to simply define CONCEPT 1 as “bird” and CONCEPT 2 as “flying”. However, such simple definitions do not really provide information regarding the criteria one should use to identify a particular entity as a bird or a particular action as flying.

For example, there are many types of birds and each type of bird has its own unique characteristics and features.

Examples of birds include:

Albatross, Blackbird, Bluebird, Lark, Canary, Crow, Cuckoo, Dove, Duck, Eagle, Falcon, Finch, Goose, Grackle, Gull, Hawk, Heron, Hummingbird, Jay, Loon, Magpie, Mallard, Meadowlark, Merlin, Nighthawk, Owl, Pelican, Pheasant, Pigeon, Puffin, Quail, Raven, Roadrunner, Robin, Sandpiper, Sapsucker, Sparrow, Starling, Stork, Swallow, Thrush, Turkey, Vulture, Warbler, Woodpecker, and Wren.

Moreover, this is not a complete list of birds. In addition, keep in mind that for a particular type of bird such as a “robin” there is a lot of genetic variation. So you will have some robins that are fat, some robins that are thin, some robins that are tall, some robins that have orange breasts, some robins that have red breasts, and so on.

Thus, for each of the types of birds discussed above, there are many versions of each type! Moreover, consider one particular robin. A particular robin will have a different visual appearance depending upon lighting conditions, the visual angle of the viewer, and the robin’s age. In addition, each robin will sound different depending upon its choice of song, its physical state of health, its age, its distance from the agent who is listening to the song, and the environment within which the robin sings its song.

Suppose we want to develop a method for representing knowledge about birds and we desire a language that allows us to define a concept such as BIRD yet also explicitly represents our uncertainty regarding the particular bird under consideration. One strategy for defining the BIRD-CONCEPT is to think of the BIRD-CONCEPT as a set of birds. If something is a member of that set, then we will call that something a BIRD.

More specifically, we begin by specifying a particular bird in terms of a list of features. Examples of features might include: (1) less than 1 foot tall, (2) less than 1 foot wide, (3) can_fly_short_distances, (4) can_fly_long_distances, (5) generates bird songs at a high frequency, (6) has colored feathers, and so on. This would be an example of an “abstract feature space”. A particular bird at a particular instant in time could then be represented as a long list of ones and zeros which is called a “feature vector”. If the first number in the feature vector is a one, then this indicates that the first feature “less than 1 foot tall” is observed to be true. If the third number in the feature vector is a one, then this indicates that the third feature “can_fly_short_distances” is true. And so on. Suppose that we represent the characteristics of a bird using 30 features where each feature can have the value of either one or zero. This would be represented as a list of 30 numbers. Since each number on the list can take on two possible values this means that we can have 2^30 (2 raised to the power of 30) or over a thousand million different lists of 30 numbers. Each of those lists could be used to represent a specific type of bird. We thus see the concept of a feature list is very powerful. Using a short list of just 30 numbers we can represent in principle over a thousand million different types of birds! Some of these lists might correspond to different types of robins. Some of these lists might correspond to different types of penguins. Other lists might correspond to birds which have never existed and never will exist. To summarize, we refer to a feature list as a “feature vector” and the set of all possible feature vectors is called a “feature space”.
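As a minimal sketch of this idea in Python, the six example features above can be turned into a feature vector of ones and zeros. The feature names, the helper function to_feature_vector, and the made-up songbird are assumptions chosen only for illustration.

```python
# Hypothetical feature names based on the six examples in the text.
FEATURES = [
    "less_than_1_foot_tall",
    "less_than_1_foot_wide",
    "can_fly_short_distances",
    "can_fly_long_distances",
    "sings_at_high_frequency",
    "has_colored_feathers",
]


def to_feature_vector(observed):
    """Return a list of ones and zeros, one entry per name in FEATURES."""
    return [1 if name in observed else 0 for name in FEATURES]


# An illustrative (made-up) small songbird observed at one instant in time.
songbird = to_feature_vector({
    "less_than_1_foot_tall",
    "less_than_1_foot_wide",
    "can_fly_short_distances",
    "sings_at_high_frequency",
})

# Number of distinct lists of 30 numbers when each number is a one or a zero.
num_vectors_30 = 2 ** 30

print(songbird)
print(num_vectors_30)
```

The count printed on the last line is a little over one thousand million, which is the size of the feature space for 30 binary features.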

As another example, suppose we record the sounds generated by a bird using a microphone. The microphone translates changes in air pressure over time into changes in electrical voltage over time. A digital audio signal records the electrical voltage generated by the microphone in a very small time interval for a large number of very small time intervals. Thus, an audio recording can be represented as a long list of numbers where the first number in the list is the voltage generated by the microphone at the beginning of the recording and the last number in the list is the voltage generated by the microphone at the end of the recording. This long list of numbers is another example of a “feature vector”. In this podcast, the audio signal sampling rate is 44,000 numbers per second which means that each second of speaking in this audio podcast is divided into 44,000 tiny little subdivisions of a single second and the voltage generated by the microphone for each of these 44,000 subdivisions is recorded as a number. This means that if it takes two seconds to say a word, then that word is represented by a feature vector which is a list of 88,000 numbers! Feature vectors of this type are common in machine learning algorithms such as those in your smart phone that are used to recognize voice commands!
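Here is a small Python sketch of the audio example. It uses the sampling rate of 44,000 numbers per second quoted above and, since no real microphone is involved, it fills the list with a synthetic 440 Hz tone. The tone, its frequency, and the function names are assumptions made only for illustration.

```python
import math

SAMPLE_RATE = 44_000  # numbers recorded per second, the figure quoted in the text


def num_samples(duration_seconds, sample_rate=SAMPLE_RATE):
    """Length of the feature vector for a recording of the given duration."""
    return int(round(duration_seconds * sample_rate))


def synthetic_recording(duration_seconds, frequency_hz=440.0, sample_rate=SAMPLE_RATE):
    """Stand-in for microphone voltages: a pure tone sampled at regular intervals."""
    n = num_samples(duration_seconds, sample_rate)
    return [math.sin(2.0 * math.pi * frequency_hz * t / sample_rate) for t in range(n)]


# A "word" that takes two seconds to say becomes a list of 88,000 numbers.
word = synthetic_recording(2.0)
print(len(word))
```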

Feature spaces can be defined in other ways as well. For example, suppose someone is a bird photographer who has a digital camera. Using their digital camera, they have taken hundreds of digital photographs of different types of birds. If you were to look closely at a digital photograph you would see that it is constructed of millions of tiny squares which are called “pixels”. Each pixel in a color photograph is associated with three numbers which indicate the pixel’s color. Thus, one could arrange the numbers associated with all of the pixels in a particular digital photograph in a long list of numbers and call that long list of numbers a “feature vector”. A digital photograph which has 3000 rows and 3000 columns of pixels would consist of 9 million pixels and 27 million features. Thus, the feature vector for this digital photograph would be a list of 27 million numbers! Suppose, in addition, that a number associated with specifying the color of a pixel can take on 500 different values. This implies that a feature vector corresponding to a list of 27 million numbers can take on 500^27,000,000 (500 raised to the power of 27 million) possible values! This is a big number which allows us to represent in principle a very large number of possible images in a digital photograph. To get an idea regarding how large this number is, it has been estimated that the number of atoms in the universe is about 10^80 (a one followed by 80 zeros). The number of possible images that could be represented by a feature vector consisting of a list of 27 million numbers is roughly a one followed by 73 million zeros, which is unimaginably larger than the number of atoms in the universe! Feature vectors of this type are common in machine learning algorithms that recognize faces in digital photographs.
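The arithmetic in the photograph example can be checked with a short Python sketch. The number of possible images is far too large to write out, so the sketch works with its base-10 logarithm, which is roughly the number of digits in the number. The variable names are assumptions; the figures (3000 rows, 3000 columns, 3 numbers per pixel, 500 values per number, 10^80 atoms) come from the text.

```python
import math

rows, cols = 3000, 3000
numbers_per_pixel = 3      # three numbers specify a pixel's color
values_per_number = 500    # each number can take on 500 different values

num_pixels = rows * cols                      # 9 million pixels
num_features = num_pixels * numbers_per_pixel  # 27 million features

# log10(500 ** 27,000,000) = 27,000,000 * log10(500)
log10_num_images = num_features * math.log10(values_per_number)

# Estimated number of atoms in the universe is about 10 ** 80.
log10_atoms = 80

print(num_pixels, num_features)
print(round(log10_num_images))         # number of zeros after the one, roughly
print(log10_num_images - log10_atoms)  # how many powers of ten larger than 10 ** 80
```

So the number of possible images has about 73 million digits, while the number of atoms in the universe has only about 80 digits.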

These examples show how a feature vector may be used to represent complex objects in the real world. Now suppose that we want to represent the concept  “orange canary”.

This concept may be represented as the set of feature vectors which include the features “is_a_bird”, “is_a_canary”, and “is_orange”. Any feature vector in this set would be called an “orange canary”. The concept “orange canary” is the set of all orange canaries.

This is a simple but critical idea.

As another example, if we want to represent the concept of “birds that fly” we simply group all of the feature vectors which have the features “is_a_bird” and “can_fly” into a big set of objects and then we call that big set of objects “flying birds”. Each bird can have different methods of flying and different birds may fly in different ways. The concept “birds that fly” is the collection of all birds that fly. We can discuss and manipulate this concept without knowing the details of the particular bird which is flying or details regarding the method of flying for that bird.
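A concept defined as a set of feature vectors can be sketched in a few lines of Python. To keep the example readable, each feature vector is written as the set of features whose value is one. The four named objects and the helper function concept are made up for illustration only.

```python
# Each object is a feature vector, written as the set of features that are true.
# The objects themselves are hypothetical examples.
objects = {
    "tweety": frozenset({"is_a_bird", "is_a_canary", "is_orange", "can_fly"}),
    "sunny": frozenset({"is_a_bird", "is_a_canary", "is_yellow", "can_fly"}),
    "pingu": frozenset({"is_a_bird", "is_a_penguin"}),
    "truck": frozenset({"has_wheels"}),
}


def concept(required_features):
    """Return the names of all objects that have every required feature."""
    required = frozenset(required_features)
    return {name for name, features in objects.items() if required <= features}


orange_canaries = concept({"is_a_bird", "is_a_canary", "is_orange"})
flying_birds = concept({"is_a_bird", "can_fly"})

print(sorted(orange_canaries))
print(sorted(flying_birds))
```

Notice that the code which builds the set “flying birds” never needs to know how any particular bird flies. It only needs to know which feature vectors belong to the set.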

But concepts are not the only things that we can represent as sets of objects. We can represent IF-THEN logical rules as sets of objects as well.

Consider the following example. Suppose we want to represent the IF-THEN rule that:

IF something is a bird, THEN that something can fly.

This is represented by first obtaining the set of feature vectors which correspond to representations of different birds. For example, the feature vectors in this set might have the feature “is_a_bird” in common. The logical IF-THEN rule asserts that every feature vector in this set also has the feature “can_fly”.

The key idea here is that the IF portion of the IF-THEN statement defines a set of feature vectors which specify the set of all possible birds. The THEN portion of the IF-THEN rule is an assertion about the members of the set of all possible birds.
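This reading of an IF-THEN rule can be sketched in Python as a check over a set of feature vectors. The IF portion selects the feature vectors, and the THEN portion is an assertion which must hold for every one of them. The example worlds and the function name rule_holds are hypothetical.

```python
def rule_holds(feature_vectors, if_feature, then_feature):
    """Check "IF if_feature THEN then_feature" over a collection of feature vectors.

    Each feature vector is written as the set of features that are true.
    """
    # The IF portion defines a set of feature vectors.
    selected = [f for f in feature_vectors if if_feature in f]
    # The THEN portion is an assertion about every member of that set.
    return all(then_feature in f for f in selected)


# A made-up world in which every bird can fly.
world_without_penguins = [
    {"is_a_bird", "is_a_canary", "can_fly"},
    {"is_a_bird", "is_a_robin", "can_fly"},
    {"has_wheels"},
]

# The same world with one bird that cannot fly.
world_with_penguins = world_without_penguins + [{"is_a_bird", "is_a_penguin"}]

print(rule_holds(world_without_penguins, "is_a_bird", "can_fly"))
print(rule_holds(world_with_penguins, "is_a_bird", "can_fly"))
```

A single exception, such as the penguin, is enough to make the crisp rule false, which is one reason why strict logical rules are hard to apply in the real world.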

So far we have made a lot of important progress. We discussed the idea that objects in the world can be represented as feature vectors. We gave examples of how the image in a digital photograph or a spoken voice command can be represented as a feature vector. We then talked about the idea that we can represent a concept as a set of feature vectors. And finally, we discussed the idea of how we can even represent an IF-THEN logical rule as a set of feature vectors as well!

In logic, everything is true or false. Everything is black and white. Unfortunately, the real world is not so neat and tidy. There is always uncertainty. Sometimes this uncertainty is due to the complexity of the environment. Suppose that you check the weather channel and it says that it will be a sunny day. You look outside and it is sunny. You then go outside an hour later and it is cloudy and overcast. Later on in the day it is sunny but it also rains. Was the weather channel’s prediction that it would be a sunny day correct? This type of situation is not unusual. We live in a complex unpredictable world. We might be able to identify some types of regularities in our environment but it is uncommon to find systematic logical IF-THEN rules that characterize the world in which we live.

Moreover, even if the world were deterministic and could be characterized by deterministic logical IF-THEN rules, it is unlikely that we would be successful at identifying these exact rules. Thus, uncertainty is present not only because of the unpredictable nature of the environment but also because of the uncertainty present in our representations of the world in which we live.

We will discuss two different ways of representing uncertainty. The first approach is called Fuzzy Set Theory. Fuzzy Set Theory starts by noting that when we specify a collection of birds, there are some birds in the set which are more “bird-like” than other birds. So, for example, “canaries”, “robins”, and “pigeons” might be considered to be more “bird-like” than “penguins” and “ostriches”. The essential idea of Fuzzy Set Theory is that we acknowledge the fact that some birds in a set can be more “bird-like” than other birds.

To represent this idea more explicitly we simply assign a number to each bird which indicates to what extent that bird is “bird-like”. The rule for assigning a membership number to a particular object is called the set-membership function. A membership number which is equal to 1 indicates the specific bird is very bird-like while a membership number which is equal to 0 indicates the specific bird is not bird-like at all. So, for example, in ordinary set theory and logic, we would assign the membership number 1 to all possible birds and assign the membership number 0 to things which are not birds such as a “truck” or a “hat” or a “dragon”.

In Fuzzy Set Theory, we do not limit ourselves to a fuzzy set membership rule that assigns only the number zero to things that are not birds and the number one to things that are birds. If we have a thing such as a “penguin” which is only partially a member of the set of birds we can assign the membership number 1/10 to the entity “penguin”. The membership number 1/10 indicates that a “penguin” is “sort of” or “kind of” a bird but perhaps is not the best representative of the set of all birds. When the fuzzy membership rule is allowed to assign numbers which are between zero and one, then we refer to the sets as “fuzzy sets” since they explicitly represent sets of objects where an object can either be a member of the set, not a member of the set, or partially a member of the set. The degree of set membership is specified by a special function called the set-membership function which assigns a number indicating the degree to which an entity belongs to the fuzzy set. And, as mentioned previously, an important special case of fuzzy set theory occurs when every object is either a member of a set or is not a member of a set. Such a set is called a “crisp set” and corresponds to our usual notion of a set or collection of objects such as discussed at the beginning of this episode.
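A set-membership function can be sketched in Python as a lookup table from things to membership numbers between zero and one. The membership number 1/10 for “penguin” comes from the discussion above. The other entries and the function names are assumptions chosen for illustration.

```python
# Fuzzy set of birds: each thing is mapped to a membership number in [0, 1].
# Only the penguin's value of 1/10 comes from the text; the rest are assumed.
fuzzy_birds = {"canary": 1.0, "robin": 1.0, "penguin": 0.1, "truck": 0.0}

# Crisp set of birds: the special case where every membership number is 0 or 1.
crisp_birds = {"canary": 1.0, "robin": 1.0, "penguin": 1.0, "truck": 0.0}


def membership(fuzzy_set, thing):
    """Set-membership function: things not listed are treated as non-members."""
    return fuzzy_set.get(thing, 0.0)


def is_crisp(fuzzy_set):
    """True when no object is only partially a member of the set."""
    return all(value in (0.0, 1.0) for value in fuzzy_set.values())


print(membership(fuzzy_birds, "penguin"))
print(membership(fuzzy_birds, "hat"))
print(is_crisp(fuzzy_birds), is_crisp(crisp_birds))
```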

Using Fuzzy Set Theory, one can develop fuzzy logic in which knowledge is represented using fuzzy logical rules. It is important to emphasize that the terminology “fuzzy set theory” or “fuzzy logic” does not mean that we are not trying to carefully define our ideas and that we are confused fuzzy thinkers! Rather, the terminology “fuzzy set theory” and “fuzzy logic” means that we are trying to be precise and explicitly model the fact that in the real world things are not black and white or true or false but rather things can be partially true or partially false. We have “little white lies” and “shades of grey”!

There is not just one way of representing a “fuzzy logical rule” but many different ways. We do not have time to discuss all of these methods but I have provided some references in the show notes located at: www.learningmachines101.com which provide some introduction to the various methods for fuzzy logical inference.

Perhaps the simplest way of doing this is to combine the standard IF-THEN logical rule with fuzzy set theory. We can then make statements such as the following:

IF something is kind-of a type of BIRD, THEN that something can kind-of FLY.

This type of fuzzy IF-THEN rule helps us capture the presence of “vagueness”  or “imprecision” associated with the concepts of “BIRD” and “FLY”.

We can make this fuzzy IF-THEN rule more explicit by using concepts from fuzzy logic. Specifically, we can write:

IF something is a BIRD with membership value 9/10, THEN that something can FLY with membership value 9/10.

The advantage of fuzzy set theory and fuzzy logic is that the “imprecision” and “vagueness” and “uncertainty” in the real world can be explicitly modeled. This is a great strength of fuzzy set theory and fuzzy logic but it can also add an additional complication to the modeling process since it requires the researcher to postulate specific fuzzy membership functions and rules for combining the results of the fuzzy membership operations. Furthermore, it may be difficult sometimes to justify the choice of particular fuzzy membership functions or the particular rules for combining the results of those operations.
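To make the phrase “rules for combining the results of the fuzzy membership operations” more concrete, here is a minimal Python sketch of one widely used choice, which goes back to Zadeh's original formulation of fuzzy sets: the minimum for AND, the maximum for OR, and one minus the membership number for NOT. This episode does not commit to these particular rules, so they should be read as an assumption, and other choices are possible. The example membership numbers are also assumed.

```python
import math


def fuzzy_not(a):
    """Degree to which something is NOT a member."""
    return 1.0 - a


def fuzzy_and(a, b):
    """Degree to which something is a member of both sets (minimum rule)."""
    return min(a, b)


def fuzzy_or(a, b):
    """Degree to which something is a member of at least one set (maximum rule)."""
    return max(a, b)


# Assumed membership numbers for one particular entity.
is_bird = 0.9
can_fly = 0.1

print(fuzzy_and(is_bird, can_fly))  # "is a bird AND can fly"
print(fuzzy_or(is_bird, can_fly))   # "is a bird OR can fly"
print(fuzzy_not(can_fly))           # "can NOT fly"

# With crisp membership numbers (0 or 1) these rules reduce to ordinary logic.
print(fuzzy_and(1.0, 0.0), fuzzy_or(1.0, 0.0), fuzzy_not(1.0))
```

The fact that a researcher must pick among several such combination rules, and then justify that pick, is exactly the additional modeling complication mentioned above.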

Still, despite these concerns, researchers are continuing work on fuzzy logic and fuzzy set theory and have made tremendous progress in this area. Fuzzy logic has been employed in thousands of real-world applications such as car braking controls, weather forecasting, helicopter control, and laundry machines for the purpose of making artificially intelligent decisions in environments characterized by uncertainty. In the show notes, I provide a reference to a recent issue of the journal “Advances in Fuzzy Systems” which is entirely devoted to a review of real-life applications of fuzzy logic.

Still, fuzzy set theory and fuzzy logic are not as widely used as another quite different approach for representing uncertainty. This second approach is called Fuzzy Measure Theory.  The basic idea of fuzzy measure theory is that we assign a Degree of Belief to a crisp set.

In fuzzy measure theory we might make an assertion such as a penguin is a member of the set of birds and its membership number is equal to 1.  However, our degree of belief that this assertion is true might only be equal to 1/20.

In contrast, in fuzzy set theory we might make an assertion such as a penguin is a member of the set of birds and its membership number is equal to 1/20. We would believe this assertion is true so our degree of belief is equal to 1.

In addition, fuzzy measure theory assumes that the monotonicity assumption holds. Suppose that the IF-THEN logical rule:

IF something is a BIRD, THEN that something can fly

is a rule which we assume is true. Then every bird is also something that can fly, so the set of birds is a subset of the set of things that can fly. The monotonicity assumption is that the belief that something is a bird must be less than or equal to the belief that something can fly.

Or, in other words, if event A is a subset of event B then event A must have a degree of belief that is no greater than the degree of belief in event B.

Here is a second example. Suppose we attach the degree of belief 8/10 to the event that it will rain tomorrow. Then the degree of belief that we assign to the event that it will rain tomorrow or the day after tomorrow must be greater than or equal to 8/10.
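The monotonicity assumption can be checked mechanically. In the Python sketch below, each event is a crisp set of possible outcomes and a degree of belief is attached to each event. The degree of belief 8/10 for rain tomorrow comes from the example above. The four outcome names, the other belief values, and the function name is_monotone are assumptions made for illustration.

```python
# Four possible outcomes for the next two days (hypothetical labels).
BOTH, TOMORROW_ONLY, DAY_AFTER_ONLY, NEITHER = (
    "both", "tomorrow_only", "day_after_only", "neither"
)

nothing = frozenset()
rain_tomorrow = frozenset({BOTH, TOMORROW_ONLY})
rain_either_day = frozenset({BOTH, TOMORROW_ONLY, DAY_AFTER_ONLY})
anything = frozenset({BOTH, TOMORROW_ONLY, DAY_AFTER_ONLY, NEITHER})


def is_monotone(belief):
    """True if A being a subset of B always implies belief[A] <= belief[B]."""
    return all(
        belief[a] <= belief[b]
        for a in belief
        for b in belief
        if a <= b  # for frozensets, a <= b means "a is a subset of b"
    )


# Degrees of belief attached to crisp events; only the 0.8 is from the text.
good_beliefs = {nothing: 0.0, rain_tomorrow: 0.8, rain_either_day: 0.9, anything: 1.0}

# Violates monotonicity: the larger event is believed less than the smaller one.
bad_beliefs = {nothing: 0.0, rain_tomorrow: 0.8, rain_either_day: 0.5, anything: 1.0}

print(is_monotone(good_beliefs))
print(is_monotone(bad_beliefs))
```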

To summarize, fuzzy set theory represents uncertainty by assuming true beliefs about fuzzy sets, while fuzzy measure theory represents uncertainty by assuming partially true beliefs about crisp sets.

Moreover, Fuzzy Set Theory and Fuzzy Measure Theory are semantically and computationally quite different even though they might appear very similar at first glance.

Note that the case where you have true beliefs about crisp sets corresponds to the classic IF-THEN logical rules that we described in detail in Episode 3! It is also possible to develop a theory of uncertainty involving partial beliefs about fuzzy sets and some references to this topic may be found in the show notes at the website: www.learningmachines101.com.

In two weeks, we will continue our discussion of fuzzy measure theory. In particular, we will show that an important special case of fuzzy measure theory corresponds to the theory of uncertain reasoning which is most widely used in the field of artificial intelligence and is the basis of many artificially intelligent systems.

 

Further Reading:

Singh, H., Gupta, M. M., Meitzler, T., Hou, Zeng-Guang, Garg, K. K., Solo, A. M., and Zadeh, L. (2013). Real-life applications of fuzzy logic. Advances in Fuzzy Systems. http://www.hindawi.com/journals/afs/2013/581879/

Wikipedia Fuzzy Set Theory entry (http://en.wikipedia.org/wiki/Fuzzy_set)

Wikipedia Fuzzy Logic entry (http://en.wikipedia.org/wiki/Fuzzy_logic)

Wikipedia Fuzzy Measure entry (http://en.wikipedia.org/wiki/Fuzzy_measure)

McNeil, D. and Freiberger, P. (1994). Fuzzy Logic: The Revolutionary Computer Technology. Simon and Schuster. http://www.amazon.com/Fuzzy-Logic-Revolutionary-Computer-Technology/dp/0671875353. This is a very readable introduction to fuzzy logic at the level of this blog.

Wang, Z. and Klir, G. (2010). Fuzzy measure theory. www.amazon.com/Fuzzy-Measure-Theory-Zhenyuan-Wang/dp/1441932259/. This is an advanced text and requires a strong math background but is a useful introduction to the field for advanced undergraduate students or graduate students with background in advanced mathematics.

Copyright Notice:

Copyright © 2014 by Richard M. Golden. All rights reserved.
