If you are searching for an exploratory text on probabilistic reasoning, basic graph concepts, belief networks, graphical models, statistics for machine learning, learning as inference, naïve Bayes, Markov models, and machine learning concepts, look no further. Dr. Barber has done a praiseworthy job of describing key concepts in probabilistic modeling and the probabilistic aspects of machine learning. Don’t let the size of this 700-page, 28-chapter book intimidate you; it is surprisingly easy to follow and well formatted for the modern-day reader.
With excellent follow-ups in summaries, code, and exercises, Dr. David Barber, a reader at University College London, provides a thorough and contemporary primer on machine learning with Bayesian reasoning. Starting with probabilistic reasoning, the author offers a refresher on how the standard rules of probability are a consistent, logical way to reason under uncertainty. He proceeds to the basic graph concepts and belief networks, explaining how we can reason with certain or uncertain evidence through repeated application of Bayes' rule. Since a belief network, a factorization of a distribution into conditional probabilities of variables given their parental variables, is a specific case of a graphical model, the book then leads us into the discipline of representing probability models graphically. Following efficient inference in trees and the junction tree, the text elucidates the key stages of moralization, triangulation, potential assignment, and message passing.
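To make the idea concrete, here is a minimal sketch (my own illustration, not code from the book) of reasoning with evidence in a tiny belief network p(R, S, W) = p(R) p(S) p(W | R, S): the rain/sprinkler/wet-grass variables and all the probability numbers are hypothetical, chosen only to show Bayes' rule turning evidence into a posterior belief.

```python
# Illustrative sketch: posterior inference by enumeration in a tiny
# belief network p(R, S, W) = p(R) p(S) p(W | R, S).
# R = rain, S = sprinkler, W = wet grass; all numbers are made up.
from itertools import product

p_rain = {True: 0.2, False: 0.8}
p_sprinkler = {True: 0.1, False: 0.9}
# Conditional table p(W = wet | R, S)
p_wet = {(True, True): 0.99, (True, False): 0.90,
         (False, True): 0.80, (False, False): 0.00}

def joint(r, s, w):
    """Joint probability from the belief-network factorization."""
    pw = p_wet[(r, s)] if w else 1.0 - p_wet[(r, s)]
    return p_rain[r] * p_sprinkler[s] * pw

# Bayes' rule: p(rain | wet) = p(rain, wet) / p(wet),
# marginalizing out the sprinkler variable.
num = sum(joint(True, s, True) for s in (True, False))
den = sum(joint(r, s, True) for r, s in product((True, False), repeat=2))
print(num / den)  # posterior belief that it rained, given wet grass
```

The same enumeration pattern scales (inefficiently) to any belief network, which is exactly why the book's later chapters on trees and the junction tree matter.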
I particularly enjoyed the follow-up chapter on statistics for machine learning, which discusses the classical univariate distributions, including the exponential, Gamma, Beta, Gaussian, and Poisson. It summarizes the Kullback-Leibler divergence, a measure of the difference between distributions, and states that Bayes' rule enables us to achieve parameter learning by translating a prior parameter belief into a posterior parameter belief based on observed data. Learning as inference, naïve Bayes, learning with hidden variables, and Bayesian model selection are followed by machine learning concepts. I found the sequence of chapters to be a bit off (shouldn't graphical models be discussed before a specific case?), but since the book is more rooted in practice than in theorem-proving, the order ultimately makes sense.
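Both ideas fit in a few lines. The sketch below (again my own illustration, not Barber's code) computes the KL divergence between two discrete distributions and performs a conjugate Beta-Bernoulli update, turning a prior parameter belief into a posterior one; the coin-flip counts are hypothetical.

```python
# Illustrative sketch: KL divergence and Bayesian parameter learning.
import math

def kl(p, q):
    """KL(p || q) for discrete distributions given as probability lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

print(kl([0.5, 0.5], [0.9, 0.1]))  # strictly positive; zero only when p == q

# Beta(a, b) prior over a coin's heads-probability; observing h heads
# and t tails yields the posterior Beta(a + h, b + t).
a, b = 1.0, 1.0    # uniform prior belief
h, t = 7, 3        # hypothetical observed data
a_post, b_post = a + h, b + t
print(a_post / (a_post + b_post))  # posterior mean for the heads-probability
```

The Beta update is the simplest instance of the prior-to-posterior translation the chapter describes: the data simply increments the prior's pseudo-counts.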
Since nearest-neighbor methods are general classification methods, the book continues with conditional mixtures of Gaussians into unsupervised linear dimension reduction, supervised linear dimension reduction, kernel extensions, Gaussian processes, mixture models, latent linear models, latent ability models, and discrete- and continuous-state Markov models, culminating in distributed computation of models, sampling, and the holy grail of deterministic approximate inference.
One of the many great things about this book is its practical, code-oriented approach; tips with applied insight, such as "Consistency methods such as loopy belief propagation can work extremely well when the structure of the distribution is close to a tree. These methods have been spectacularly successful in information theory and error correction," make this text distinguished and indispensable.