Book Review – Pattern Recognition and Machine Learning

Pattern Recognition and Machine Learning
Information Science and Statistics series
Christopher Bishop
Springer, 2006
738 pages

About the Book:
This book is a collection of loosely organized topics, but the discussion of each topic is extremely clear. The loose organization has the advantage that one can flip around the book and read different sections without having to read the earlier ones first. A beginner to machine learning might start by reading Chapters 1 through 4 very carefully and then read the opening sections of the remaining chapters to get a sense of the topics they cover.

The choice of topics hits most of the major areas of machine learning, and the pedagogical and writing styles are quite clear. There are lots of great exercises, excellent color illustrations, intuitive explanations, relevant but not excessive mathematical notation, and numerous comments that are extremely relevant for applying these ideas in practice. Both Pattern Recognition and Machine Learning and The Elements of Statistical Learning are handy references which I like to keep by my side at all times! Indeed, these are perhaps the two most popular graduate-level textbooks on machine learning.

I would say that if your training is in statistics or mathematics, you will probably like The Elements of Statistical Learning a little better than Pattern Recognition and Machine Learning, but if your training is in engineering, you may like Pattern Recognition and Machine Learning a little better than The Elements of Statistical Learning.

Chapters 1 and 2 provide a brief overview of relevant topics in probability theory. Chapters 3 and 4 discuss parameter estimation for linear models of regression and classification. Chapter 5 discusses parameter estimation for feedforward neural network models. Chapters 6 and 7 discuss kernel methods and Support Vector Machines (SVMs). Chapter 8 discusses Bayesian networks and Markov random fields. Chapter 9 discusses mixture models and Expectation Maximization (EM) methods. Chapter 10 discusses variational inference methods. Chapter 11 discusses sampling algorithms, which are useful for seeking global minima as well as for numerically evaluating high-dimensional integrals. Chapter 12 discusses several variants of Principal Component Analysis (PCA), including standard PCA, probabilistic PCA, and kernel PCA. Chapter 13 discusses hidden Markov models. And Chapter 14 discusses Bayesian model averaging and other methods for combining models, such as mixtures of experts.
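To give a concrete flavor of the kind of method the book treats, here is a minimal NumPy sketch of standard PCA in the spirit of Chapter 12 (my own illustration, not code from the book): center the data, then project it onto the leading eigenvectors of the sample covariance matrix.

```python
import numpy as np

def pca(X, n_components=2):
    """Project data onto its top principal components (illustrative sketch)."""
    X_centered = X - X.mean(axis=0)            # center each feature
    cov = np.cov(X_centered, rowvar=False)     # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigh: covariance is symmetric
    order = np.argsort(eigvals)[::-1]          # sort eigenvalues, descending
    components = eigvecs[:, order[:n_components]]
    return X_centered @ components

# Example: reduce 5-dimensional synthetic data to 2 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
Z = pca(X, n_components=2)
print(Z.shape)  # (100, 2)
```

The book itself develops these methods mathematically (for example, probabilistic PCA as a latent-variable model fit with maximum likelihood or EM) rather than as code, so a sketch like this is only a starting point for experimentation.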

Target Audience:
To read this textbook, a student should have taken the standard lower-division course in linear algebra, a lower-division course in calculus (although multivariate calculus is recommended), and a calculus-based probability theory course (typically an upper-division course). With this relatively minimal math background, the book may be a little challenging to read, but it is certainly accessible. If you have a PhD in Statistics, Computer Science, Engineering, or Physics, you will find this book extremely useful because it will help you make contact with topics with which you are already familiar.

About the Author:
Dr. Christopher Bishop is Laboratory Director at Microsoft Research Cambridge and Professor of Computer Science at the University of Edinburgh. He is an expert in Artificial Intelligence and Artificial Neural Networks, and he holds a PhD in Theoretical Physics. In 2017, he was elected a Fellow of the Royal Society.