pgm.HelloWorld() with Wainwright & Jordan
I recently came across Wainwright & Jordan’s paper on exponential families, graphical models, and variational inference and found it to be quite a comprehensive and unifying introduction to the topic. Probabilistic graphical models use a graph-based representation as the basis for compactly encoding a complex distribution over a high-dimensional space. If you are familiar with Koller and Friedman’s work on probabilistic modeling, Wainwright and Jordan’s paper provides a less mathematically terse and more unifying view of the area.
Graphical Models, Exponential Families, and Variational Inference
Compared to Pearl’s work on Causality, this paper provides a contemporary look at Message-passing Algorithms for Approximate Inference, the Connection to Max-Product Message-Passing, and detailed insight into Moment Matrices, Semidefinite Constraints, and Conic Programming Relaxation. Due to its clarity and detailed explanation, the background material on graphs, hypergraphs, exponential families and duality is definitely worth reading even if you don’t need a refresher.
In lieu of Pearl’s polytree approach, Wainwright & Jordan’s work discusses Graphical Models as Exponential Families before delving into Computational Challenges with High-Dimensional Models. Later chapters deal with Sum-Product, Bethe–Kikuchi, and Expectation-Propagation, Mean Field Methods, Variational Methods in Parameter Estimation, Convex Relaxations and Upper Bounds, and Integer Programming, Max-product, and Linear Programming Relaxations, concluding with Moment Matrices, Semidefinite Constraints, and Conic Programming Relaxation. For a computer scientist, it is always interesting to observe the statistical perspective of machine learning. This contemporary insight into Graphical Models, Exponential Families, and Variational Inference was published in Foundations and Trends in Machine Learning, and it clearly builds upon the researchers’ earlier work on Variational inference for Dirichlet process mixtures and Variational inference in graphical models: The view from the marginal polytope.
As an appetizer, I would also recommend Bishop’s chapter on Graphical Models.
Speaking @ SoCal .NET Architecture Users Group – Implementing SOA Design Patterns with WCF
I will be speaking at the next SoCal IASA chapter meeting on Thursday, May 17, 2012 at Rancho Santiago Community College District, 2323 N. Broadway, Santa Ana. Meeting starts at 7:00 pm iA, pizza and networking 6:30 pm. RSVP by emailing mike.vincent@mvasoftware.com if you plan to attend.
Implementing SOA Design Patterns with WCF
Service Oriented Architecture (SOA) is an architectural design pattern whose design is determined by a few guiding principles, mainly (a) Service compatibility is determined based on policy, (b) Services share schema and contract, not class, (c) Services are Autonomous, and (d) Boundaries are Explicit. Implementing these so-called SOA tenets requires a powerful framework which provides a unified programming model, reliable messaging, security, workflow services, interoperability and integration, syndication, metadata exploration support, service versioning, RESTful endpoints and many other modern connected-systems features. Both Service-Orientation and the Windows Communication Foundation (WCF) offer the promise of greater interoperability and ease of integration, but in order to realize benefits such as these we must evolve the way we architect solutions.
This session will be a hands-on introduction to SOA with Windows Communication Foundation. The speaker presents WCF patterns that allow you to define descriptive, maintainable, yet extensible contracts and to implement the SOA tenets. Since SOA promotes loose coupling at the transport layer, you’ll learn how to create loosely coupled systems and the difference between a web reference, a service reference and ChannelFactory. Attendees will learn how to avoid anti-patterns and leverage WCF to create extensible, versioned, responsive, interoperable, and easy-to-maintain services.
Are Bayesian networks Bayesian enough?
In Bayesian Artificial Intelligence, authors Kevin B. Korb and Ann E. Nicholson point out the non-Bayesian nature of belief networks. The researchers note:
Many AI researchers like to point out that Bayesian networks are not inherently Bayesian at all; some have even claimed that the label is a misnomer. At the 2002 Australasian Data Mining Workshop, for example, Geoff Webb made the former claim. Under questioning it turned out he had two points in mind:
(1) Bayesian networks are frequently “data mined” (i.e., learned by some computer program) via non-Bayesian methods.
(2) Bayesian networks at bottom represent probabilities; but probabilities can be interpreted in any number of ways, including as some form of frequency; hence, the networks are not intrinsically either Bayesian or non-Bayesian, they simply represent values needing further interpretation.
These two points are entirely correct. We shall ourselves present non-Bayesian methods for automating the learning of Bayesian networks from statistical data. We shall also present Bayesian methods for the same, together with some evidence of their superiority. The interpretation of the probabilities represented by Bayesian networks is open so long as the philosophy of probability is considered an open question.
Indeed, much of the work presented here ultimately depends upon the probabilities being understood as physical probabilities, and in particular as propensities or probabilities determined by propensities.
Nevertheless, we happily invoke the Principal Principle: where we are convinced that the probabilities at issue reflect the true propensities in a physical system we are certainly going to use them in assessing our own degrees of belief.
The advantages of the Bayesian network representations are largely in simplifying conditionalization, planning decisions under uncertainty and explaining the outcome of stochastic processes. These purposes all come within the purview of a clearly Bayesian interpretation of what the probabilities mean, and so, we claim, the Bayesian network technology which we here introduce is aptly named: it provides the technical foundation for a truly Bayesian artificial intelligence.
References
A Brief Introduction to Graphical Models and Bayesian Networks
Darkroom theme for Lyx/LaTeX
Just as with distraction-free, easy-on-the-eyes dark IDEs, most developers prefer a clutter-free green-on-black color scheme for their text editors as well. In LyX, it’s fairly easy to set up with the step-by-step instructions here. This color scheme is somewhat similar to Darkroom for Windows. All you need to do is modify your LyX preferences file with the following code. The location of the preferences file depends on your OS, but its general location is \
#
# SCREEN & FONTS SECTION ############################
#
\screen_font_roman "Shruti"
\screen_font_sans "Arial Black"
#
# COLOR SECTION ###################################
#
\set_color "cursor" "#00ff00"
\set_color "background" "#000000"
\set_color "foreground" "#00ff00"
\set_color "selection" "#005500"
\set_color "note" "#00ff00"
\set_color "notebg" "#555500"
\set_color "math" "#aaff00"
\set_color "mathbg" "#001000"
\set_color "mathmacrobg" "#003100"
\set_color "mathframe" "#005500"
\set_color "mathcorners" "#005500"
\set_color "buttonframe" "#005500"
\set_color "buttonbg" "#005500"
\set_color "buttonhoverbg" "#00aa00"
Continued adventures in SamIam – Back to Basics with Code Bandit
As discussed in my previous post on Customizing Conditional Probability using Code Generation with SamIam, having programmatic and declarative control over the network is important. Working with SamIam (and, to some extent, with Infer.NET) gives a researcher this flexibility, which is hard to find in proprietary tools.
Here is a simple example of a typical textbook belief network. Once the network is drawn graphically, SamIam’s Code Bandit allows you to extract the model as a class.
This class hard-codes the network …code\samiam30_windows_amd64\samiam\BeliefNet.net; from there, one can operate on the BayesianNetwork object and populate the nodes from a different data source rather than hard-coding them. Once a simple, readable structural model is available in raw code, there are lots of possibilities for data population. To build, ensure that inflib.jar occurs in the command line classpath, e.g. javac -classpath inflib.jar ModelTutorial.java
public BayesianNetwork createBayesianNetwork()
{
    /* Create a domain of size 5. */
    Domain domain = new Domain(5);

    /* Add a discrete variable called "H" to the domain,
       with states "True", "False". */
    String name0 = "H";
    String[] values0 = new String[]{ "True", "False" };
    int id0 = domain.addDim( name0, values0 );

    /* Add a discrete variable called "B" to the domain,
       with states "True", "False". */
    String name1 = "B";
    String[] values1 = new String[]{ "True", "False" };
    int id1 = domain.addDim( name1, values1 );

    /* Add a discrete variable called "L" to the domain,
       with states "True", "False". */
    String name2 = "L";
    String[] values2 = new String[]{ "True", "False" };
    int id2 = domain.addDim( name2, values2 );

    /* Add a discrete variable called "C" to the domain,
       with states "True", "False". */
    String name3 = "C";
    String[] values3 = new String[]{ "True", "False" };
    int id3 = domain.addDim( name3, values3 );

    /* Add a discrete variable called "F" to the domain,
       with states "True", "False". */
    String name4 = "F";
    String[] values4 = new String[]{ "True", "False" };
    int id4 = domain.addDim( name4, values4 );

    /* For the cpts, create arrays of double-precision floating point values. */

    // H      Value
    // True   0.2
    // False  0.8
    double[] cpt0 = new double[]{ 0.2, 0.8 };

    // B      H      Value
    // True   True   0.25
    // True   False  0.05
    // False  True   0.75
    // False  False  0.95
    double[] cpt1 = new double[]{ 0.25, 0.05, 0.75, 0.95 };

    // L      H      Value
    // True   True   0.03
    // True   False  5.0E-4
    // False  True   0.97
    // False  False  0.9995
    double[] cpt2 = new double[]{ 0.03, 5.0E-4, 0.97, 0.9995 };

    // C      L      Value
    // True   True   0.6
    // True   False  0.02
    // False  True   0.4
    // False  False  0.98
    double[] cpt3 = new double[]{ 0.6, 0.02, 0.4, 0.98 };

    // F      L      B      Value
    // True   True   True   0.75
    // True   True   False  0.1
    // True   False  True   0.5
    // True   False  False  0.05
    // False  True   True   0.25
    // False  True   False  0.9
    // False  False  True   0.5
    // False  False  False  0.95
    double[] cpt4 = new double[]{ 0.75, 0.1, 0.5, 0.05, 0.25, 0.9, 0.5, 0.95 };
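To see how the numbers in these CPT comments combine, here is a minimal plain-Java sketch that marginalizes H out of the B table to get P(B=True). It does not use inflib.jar or any SamIam API; the class and method names (MarginalSketch, probBTrue) are made up for illustration, and only the probabilities are taken from the comments above.

```java
// Plain-Java sketch; MarginalSketch and probBTrue are hypothetical names,
// not part of SamIam or inflib.jar.
class MarginalSketch {
    // P(H): index 0 = True, index 1 = False (the values listed for cpt0).
    static final double[] P_H = { 0.2, 0.8 };

    // P(B=True | H): index 0 = H True, index 1 = H False
    // (the two "B = True" rows listed for cpt1).
    static final double[] P_B_TRUE_GIVEN_H = { 0.25, 0.05 };

    // Marginalize H out: P(B=True) = sum over h of P(B=True | H=h) * P(H=h).
    static double probBTrue() {
        double total = 0.0;
        for (int h = 0; h < P_H.length; h++) {
            total += P_B_TRUE_GIVEN_H[h] * P_H[h];
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.printf("P(B=True) = %.4f%n", probBTrue());
    }
}
```

By hand this is 0.25 × 0.2 + 0.05 × 0.8 = 0.09, which is the kind of value you would expect an inference engine to report for B before any evidence is entered.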
Later on, the generated code creates a table from each CPT and eventually builds the network from these tables.
    /*
       Create a IL2 Table for each cpt.
       The parameters to the Table constructor are:
       (1) the domain,
       (2) the variable ids that name the dimensions of the table (in the form of an IntSet),
       (3) the cpt data.
    */
    Table table0 = new Table( domain, new IntSet( new int[]{ id0 } ), cpt0 );
    Table table1 = new Table( domain, new IntSet( new int[]{ id0, id1 } ), cpt1 );
    Table table2 = new Table( domain, new IntSet( new int[]{ id0, id2 } ), cpt2 );
    Table table3 = new Table( domain, new IntSet( new int[]{ id2, id3 } ), cpt3 );
    Table table4 = new Table( domain, new IntSet( new int[]{ id1, id2, id4 } ), cpt4 );

    /* Create an array of all the Tables. */
    Table[] tables = new Table[]{ table0, table1, table2, table3, table4 };

    /*
       The simple BayesianNetwork constructor takes only one argument:
       an array of Tables.
    */
    BayesianNetwork model = new BayesianNetwork( tables );

    /* Hand the finished network back to the caller. */
    return model;
}
Upon building, you get the following console output.
Happy inferring!
Tag Cloud for Belief Network Sensitivity as Background Knowledge
While we are having fun visualizing with tag clouds, here is one built from the following five key papers in the area of pattern mining with Bayesian networks as background knowledge and the discovery of interesting patterns based on that background knowledge.
- Fast discovery of interesting patterns based on Bayesian network background knowledge
- Fast discovery of unexpected patterns in data, relative to a Bayesian network
- Interestingness filtering engine: Mining Bayesian networks for interesting patterns
- Scalable pattern mining with Bayesian networks as background knowledge
- Using sensitivity of a bayesian network to discover interesting patterns
Selected Papers in Machine Learning
- The Discipline of Machine Learning by Tom Mitchell
- Introduction to Support Vector Machines – Dustin Boswell
- Fast Training of Support Vector Machines using Sequential Minimal Optimization
- Introduction to linear regression
- The elements of statistical learning (book)
- Survey of Clustering Algorithms
- Supervised Machine Learning: A Review of Classification Techniques
- Ensemble Methods in Machine Learning
- The Boosting Approach to Machine Learning
- K-means clustering via principal component analysis
- Dimensionality Reduction
- Unsupervised Learning by Probabilistic Latent Semantic Analysis
- Classes of Kernels for Machine Learning: A Statistics Perspective
- Bayesian inference: An introduction to principles and practice in machine learning
- An Introduction to MCMC for Machine Learning
- Linear Algebra Refresher
- Bayesian Modelling in Machine Learning: A Tutorial Review
TagCloud for Sensitivity Analysis of Probabilistic Graphical Models
Here is an interesting word cloud built from the dissertation by Hei Chan of UCLA on Sensitivity analysis of probabilistic graphical models.
The following is the summary of the thesis.
Probabilistic belief systems are used in artificial intelligence to model uncertainty. A popular framework for realizing probabilistic belief systems is to use graphical models, such as Bayesian networks and Markov networks. The topic of sensitivity analysis is concerned broadly with the relationships between local beliefs, such as network parameters, and global beliefs, such as values of probabilistic queries. Sensitivity analysis is crucial to probabilistic belief systems because we often need to revise our state of belief to incorporate new probabilistic information in the form of local belief changes. This work focuses on sensitivity analysis of probabilistic graphical models, by addressing central research problems such as the assessment of global belief changes due to local belief changes, the identification of local belief changes that induce certain global belief changes, and the quantifying of belief changes in general. Our results can be divided into the following parts. First, we develop procedures and complexity results for tuning Bayesian or Markov network parameters (single or multiple) to ensure certain query constraints. Second, we provide network-independent bounds on changes in query values due to arbitrary changes in Bayesian or Markov network parameters. Third, we propose a new distance measure for quantifying probabilistic belief changes, and use it to provide guarantees on global belief changes in Bayesian or Markov networks. Fourth, we provide algorithms and complexity results on the sensitivity of decisions induced by Bayesian networks. Finally, we discuss the philosophical topic of belief revision. Many of our results have been implemented in a program called SamIam (Sensitivity Analysis, Modeling, Inference and More), a graphical Bayesian network tool developed by the UCLA Automated Reasoning Group.
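To make the local-versus-global distinction concrete, here is a minimal sketch that is not taken from the thesis or from SamIam. It assumes a hypothetical two-node network H -> B with made-up numbers: the local belief is the parameter theta = P(H=True), and the global belief is the query P(B=True). For a marginal query like this one the dependence is linear in theta, so tuning the parameter to hit a target query value is a single division.

```java
// Hypothetical two-node network H -> B; all names and numbers are illustrative
// assumptions, not results or APIs from the thesis or SamIam.
class SensitivitySketch {
    static final double P_B_GIVEN_H = 0.25;      // assumed P(B=True | H=True)
    static final double P_B_GIVEN_NOT_H = 0.05;  // assumed P(B=True | H=False)

    // Global belief P(B=True) as a function of the local parameter theta = P(H=True).
    static double query(double theta) {
        return P_B_GIVEN_H * theta + P_B_GIVEN_NOT_H * (1.0 - theta);
    }

    // Value of theta for which P(B=True) equals the target,
    // or NaN when no theta in [0, 1] achieves it.
    static double tune(double target) {
        double slope = P_B_GIVEN_H - P_B_GIVEN_NOT_H;  // d query / d theta
        double theta = (target - P_B_GIVEN_NOT_H) / slope;
        return (theta >= 0.0 && theta <= 1.0) ? theta : Double.NaN;
    }

    public static void main(String[] args) {
        System.out.println("query(0.2) = " + query(0.2));
        System.out.println("tune(0.10) = " + tune(0.10));
    }
}
```

With these assumed numbers, a target above 0.25 or below 0.05 cannot be reached by changing theta alone, which is the flavor of question ("which local changes can enforce this query constraint?") that the summary describes.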
fun -> Infer.NET (probabilistic programming in F#)
“I’m not an outlier; I just haven’t found my distribution yet!” -Ronan Conroy
Infer.NET is a .NET library for machine learning which provides state-of-the-art algorithms for probabilistic inference from data. Because it integrates seamlessly with .NET code, I have used this library frequently and have written about Infer.NET in the past; however, during a recent hunt for a functional dialect for probabilistic programming, I came across Infer.NET Fun, dubbed An F# Library for Probabilistic Programming. Infer.NET being a first-class .NET citizen, it was always possible to call it directly from F#, but having the “succinct syntax of F# into an executable modeling language” and enjoying the raw functional style of the language is always better.
Infer.NET Fun seems to have come out of this research paper Distribution Transformer Semantics for Bayesian Machine Learning and a resulting Microsoft research report by Johannes Borgström, Andrew D. Gordon, Michael Greenberg, James Margetson, and Jurgen Van Gael, Measure Transformer Semantics for Bayesian Machine Learning, no. MSR-TR-2011-18, July 2011
The project is defined as
Infer.NET Fun turns the simple succinct syntax of F# into an executable modeling language. You can code up the conditional probability distributions of Bayes’ rule using F# array comprehensions with constraints. Write your model in F#. Run it directly to synthesize test datasets and to debug models. Or compile it with the Infer.NET compiler and runtime for efficient statistical inference.
The syntax is very powerful and makes every effort to make a functional programmer feel at home. As an example, let’s go with the age-old coin toss problem.
Let’s toss two coins and observe that not both of them are heads; now we can use Infer.NET to calculate the conditional probability that is assigned after the relevant evidence is taken into account. This is called inferring the posterior probability, here the probability that each coin was heads. To do so using Infer.NET Fun, you start by writing a model as follows:
open MicrosoftResearch.Infer.Fun.FSharp.Syntax

[<ReflectedDefinition>]
let coins () =
    let c1 = random (Bernoulli(0.5))
    let c2 = random (Bernoulli(0.5))
    let bothHeads = c1 && c2
    observe (bothHeads = false)
    c1, c2, bothHeads
This is the model definition of the problem. This model is using a function random which returns a random sample from a given distribution. The second function being used is observe which marks the execution as failed if the condition is not satisfied.
It takes some adjustment to understand the functional definition here, but the ultimate goal of inference is to produce the distribution of the model's output variables across all executions in which no observation fails (here, bothHeads = false). Sampling from the model is as simple as calling the function coins(). I am running this in debug mode and stepping through it, as can be seen in the screenshot below.
Printing the function's return value, as follows, gives the values of c1, c2 and bothHeads shown in the screenshot above.
printf "Sample: %O\n" (coins ())
Now, in order to run inference on the model we just set up, you can do the following.
open MicrosoftResearch.Infer.Fun.FSharp.Inference
setVerbose true
setShowFactorGraph false
let (c1D,c2D,bothD) = inferFun3 <@ coins @> ()
printf "coinsD: \n%O\n%O\n%O\n" c1D c2D bothD
Setting verbose mode prints details of the underlying model. To perform the inference itself, the function inferFun3 runs and returns a triple of distributions, one for each return value of coins().
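The posteriors for this tiny model can also be checked by hand with exact enumeration. The following is a minimal plain-Java sketch of that check; it does not call Infer.NET, and the class and method names (CoinPosteriorSketch, posterior) are made up for illustration. It walks over the four equally likely outcomes, drops the one that violates the observation, and renormalizes.

```java
// Exact enumeration of the two-coin model; plain Java, no Infer.NET involved.
// CoinPosteriorSketch and posterior are hypothetical names used for illustration.
class CoinPosteriorSketch {
    // Returns { P(c1 heads | evidence), P(c2 heads | evidence), P(bothHeads | evidence) },
    // where the evidence is "not both coins are heads".
    static double[] posterior() {
        boolean[] sides = { true, false };
        double evidence = 0.0, c1Heads = 0.0, c2Heads = 0.0, both = 0.0;
        for (boolean c1 : sides) {
            for (boolean c2 : sides) {
                double weight = 0.5 * 0.5;        // two independent fair coins
                boolean bothHeads = c1 && c2;
                if (bothHeads) continue;          // mirrors observe (bothHeads = false)
                evidence += weight;
                if (c1) c1Heads += weight;
                if (c2) c2Heads += weight;
                // bothHeads is false on every surviving outcome, so 'both' stays 0.
            }
        }
        return new double[] { c1Heads / evidence, c2Heads / evidence, both / evidence };
    }

    public static void main(String[] args) {
        double[] p = posterior();
        System.out.println("P(c1 heads)   = " + p[0]);
        System.out.println("P(c2 heads)   = " + p[1]);
        System.out.println("P(both heads) = " + p[2]);
    }
}
```

Three outcomes survive the observation, each coin is heads in exactly one of them, so each coin's posterior probability of heads is 1/3 and bothHeads has probability 0. The distributions returned by inference should agree with these values, up to whatever approximation the chosen inference algorithm introduces.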
One of the really cool features of Infer.NET which I like is that you can have it show the model's factor graph by setting the following.
setShowFactorGraph true
The Infer.NET Fun paper concludes with the following summary of its contribution. It stresses the importance of being able to write probabilistic programs directly in a declarative language, Fun, which is a subset of F#.
Our direct contribution is a rigorous semantics for a probabilistic programming language that also has an equivalent factor graph semantics. More importantly, the implication of our work for the machine learning community is that probabilistic programs can be written directly within an existing declarative language (Fun—a subset of F#) and compiled down to lower level Bayesian inference engines.
For the programming language community, our new semantics suggests some novel directions for research. What other primitives are possible—non-generative models, inspection of distributions, on-line inference on data streams? Can we provide a semantics for other data structures, especially of probabilistically varying size? Can our semantics be extended to higher-order programs? Can we verify the transformations performed by machine learning compilers such as Infer.NET compiler for Csoft? Are there type systems for avoiding zero probability exceptions, or to ensure that we only generate factor graphs that can be handled by our back-end?
Don Syme has already said that F stands for fun in F#, but Infer.NET Fun is definitely putting the fun in inference and F#. Well, probably.
Here are the slides and video of the Infer.NET Fun talk at Lang.NEXT 2012.
Reverend Bayes, meet Countess Lovelace: Probabilistic Programming for Machine Learning
This is great news for creating probabilistic DSLs, as F# is moving into the enterprise – Accelerated Analytical and Parallel .NET Development with F# 2.0
Happy Inferring!
CloudCamp LA 2012, CQRS and NoSQL
CloudCamp LA happened a couple of weeks ago at the CoreSite campus in downtown LA. The highlights of the evening were Dave Nielsen’s intro, Lynn Langit’s NoSQL session, Bret Statham’s CQRS (Command Query Responsibility Segregation) talk and CoreSite’s datacenter tour.
Slides from Bret’s lightning talk can be downloaded here.
I have attended CloudCamps organized by Dave Nielsen in the past, but this particular event wasn’t as well organized as the one at the Microsoft campus a couple of years ago (through no fault of his own). Dave is a co-founder of CloudCamp and author of the book PayPal Hacks. The event started late, so the unconference-style sessions and panels were cut short and disrupted. There was a lot of echo, which made it hard to hear, and the topics that came out of the unconference discussion weren’t very diverse or well organized, even for an unconference. However, the data center tour was fun!
There is also a much nicer write-up by morphlaps on CloudCamp LA – Why Open Source (and OpenStack) Matters To the Enterprise
I got to meet Jason Woloz, who is heading up the Cloud Security Alliance LA chapter. The first meetup is coming soon. http://www.meetup.com/LASC-CSA/