Algorithms
The Clairvoyant Load Balancing Algorithm for Highly Available Service Oriented Architectures
References
[1] . Psychic Routing: Upper Bounds on Routing in Private DTNs. , 2011.
[2] . Black-boxing the user: internet protocol over xylophone players (IPoXP). Proceedings of the 2012 ACM annual conference extended abstracts on Human Factors in Computing Systems Extended Abstracts:71—80, 2012.
[3] . Simple efficient load balancing algorithms for peer-to-peer systems. Proceedings of the sixteenth annual ACM symposium on Parallelism in algorithms and architectures:36—43, 2004.
[4] . Increasing Throughput in IP Networks with ESP-Based Forwarding: ESPBasedForwarding. , 2011.
[5] . Service Undiscovery Using Hide-and-Go-Seek for the Domain Pseudonym System (DPS). , 2012.
[6] . Performance analysis of load balancing algorithms. World Academy of Science, Engineering and Technology, 38:269—272, 2008.
LA Machine Learning event on Mining Time Series Data w/ Sylvia Halasz
Last night's LA Machine Learning event on Mining Time Series Data w/ Sylvia Halasz of YP at OpenX Pasadena was quite interesting and well attended. Dr. Halasz spoke about Adaptive Ensemble Kalman Filter and her work on building n-gram correlation with the flu outbreaks. Some of the associated papers follow.
- The ngram chief complaint classifier: A novel method of automatically creating chief complaint classifiers based on international classification of diseases groupings
- Detecting the start of the flu season
- Syndrome Surveillance - CDC
Causality, Probability, and Time - A Temporo-Philosophical Primer to Causal Inference with Case Studies
Causality, Probability and Time by Dr. Samantha Kelinberg is a whirlwind yet original journey of the interdisciplinary study of probabilistic temporal logic and causal inference. Probabilistic causation is a fairly demanding area of study which studies the relationship between cause and effect using the tools of probability theory. Judea Pearl, in his seminal text "Causality: Models, Reasoning, and Inference" refers to this quandary by stating that
(causality) connotes lawlike necessity, whereas probabilities connote exceptionality, doubt, and lack of regularity.
Dr. Kelinberg's work provides a balanced introduction to background work on this topic while breaking new grounds on a well-positioned approach of causality based on temporal logic. The envisioning problem is the problem of deducing the set of facts, possibly as the result of our actions leading to the decision problem. This is compounded with finding a timely and useful way to represent our knowledge about time, change, and chance.
In this ~260 page book, Dr. Kelinberg begins with a brief history of causality leading to Probability, logic and probabilistic temporal logic. The author then defines causality from various different facets, proceeding to causality inference, token causality and then finally the case studies. With practical examples and algorithms, author devises simple mathematical tools for analyzing the relationships between causal connections, inference, causal significance, model complexity, statistical associations, actions and observations.
Exploiting the temporal nature of probabilistic events, Dr. Kelinberg's research is a thought provoking and valuable addition to the scientific community interested in learning causal effects and inference with respect to time. Built upon the works of the likes of Heckerman, Breese, Santos and Young, this book will pave the way probabilistic reasoning researchers think about temporal effects on causality for years to come.
David Hume believed that the causes are invariably followed by their effects: "We may define a cause to be an object, followed by another, and where all the objects similar to the first, are followed by objects similar to the second." So, would you like a well written margin-annotation-laden text which provides formal and practical case study based approach to this somewhat abstract concept of causality? Then look no further!
Selected Papers on Interestingness Measures, Knowledge Discovery and Outlier Mining
- S. Abe and T. Inoue. Fuzzy support vector machines for multiclass problems.In ESANN 2002 Proceedings, pages 113-118, 2002.
- E.L. Allwein, RE. Schapire, and Y. Singer. Reducing multiclass to binary: a unifying approach for margin classifiers. Journal of Machine Learning Research,
1:113-141,2000.
- P. Baldi and K. Hornik. Neural networks and principal component analysis: Learning from examples without local minima. Neural Networks, 2(1):53-581989
- Marc Benioff. Data, data everywhere: A special report on managing information. The Economist, February 2010.
- J.M. Keller R Krishnapuram L.I. Kuncheva J.C. Bezdek and N.R Pal. Will the real iris data please stand up? IEEE Transactions on Fuzzy Systems, 7:3,
1999.
- C. J. C. Burges. A tutorial on support vector machines for pattern recognition.
Data Mining and Knowledge Discovery, 2:121-167, 1998.
- C. Chen, A. Liaw, and L. Breiman. Using random forest to learn imbalanced data. Technical report, Department of Statistics, UC Berkeley, 2004.
- V. Cherkassky and F. Mulier. Learning from Data: Concepts, Theory and
Methods. John Wiley & Sons, Inc., 1998.
- R. Cilibrasi and P. Vitanyi. Clustering by compression. IEEE Transactions on
Information Theory, 51(4):1523-1545, 2005.
- R. Cilibrasi and P. Vitanyi. Normalized web distance and word similarity.
CoRR, abs/0905.4039, 2009.
- R. Cilibrasi, P. Vitanyi, and R. de Wolf. Algorithmic clustering of music. In
WEDELMUSIC, pages 110-117, 2004.
- T. Downs, I. Wood, and M. Gallagher. Empirical evidence for ultrametric structure in multi layer perceptron error surfaces. Neural Processing Letters,
16(2):177~186, 2002.
- A.A. Freitas. Are we really discovering "interesting" knowledge from data?
Expert Update (the BCS-SGAI Magazine), 9(1):41~47, October 2006.
- L. Geng and H. J. Hamilton. Interestingness measures for data mining: A
survey. ACM Comput. Surv., 38(3), 2006.
- M. Gori and F. Scarselli. Are multilayer perceptrons adequate for pattern recognition and verification? IEEE Trans. Pattern Anal. Mach. Intell., 20(11):1121~
1132, 1998.
- P.M. Granitto, P.F. Verdes, and H.A. Cecatto. Neural network ensembles:
evaluation of aggregation algorithms. arXiv, arXiv:cs.AI/0502006vl, 2005.
- S. Hashemi and T.P. Trappenberg. Using svm for classification in data sets with ambiguous data. In International Conference on Information Systems, Analysis and Synthesis (SCI 2002), 2002.
- M. Hassoun. Fundamentals of Artificial Neural Networks. Massachusetts Institute of Technology, 1995.
- S. Haykin. Neural Networks: A Comprehensive Foundation. Prentice-Hall Inc., second edition, 1999.
- Z. He, X. Xu, and S. Deng. Discovering cluster-based local outliers. Pattern
Recognition Letters, 24(9-10):1641-1650, 2003.
- S. Hettich and S. D. Bay. Kdd cup 1999 data. UCI KDD Archive
[http://kdd.ics.uci.edu/ /databases/kddcup99/kddcup99.html], 1999.
- L. Itti and P. Baldi. Bayesian surprise attracts human attention. In Proceedings
Neural Information Processing Systems, 2005.
- B. Liu. Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data
{Data-Centric Systems and Applications}. Springer, January 2007.
- S. Singh M. Markou. Novelty detection: a review part 1: statistical approaches.
Signal Processing, 83(12):2481-2497, December 2003.
- K. McGarry. A survey of interestingness measures for knowledge discovery. The
Knowledge Engineering Review, 00:0:1-24, 2005.
- P. M. Murphy and M. J. Pazzani. Exploring the decision forest: an empirical investigation of occam's razor in decision tree induction. J. Artij. Int. Res.,
1(1):257-275, 1993.
- A. Orriols-Puig, J. Casillas, and E. Bernado-Mansilla. First approach toward online evolution of association rules with learning classifier systems. In GECCO
'08: Proceedings of the 2008 GECCO conference companion on Genetic and evolutionary computation, pages 2031-2038, New York, NY, USA, 2008. ACM.
- Y.H. Pao and C.- Y. Shen. Visualization of pattern data through learning of non- linear variance-conserving dimension-reduction mapping. Pattern Recognition,
30(10):1705-1717,1997.
- J.M. Puche, J.M. Benitez, and J.L. Mantas. Fuzzy pairwise multiclass support vector machines. In A. Gelbukh and C.A. Reyes-Garcia, editors, Mexican International Conference on Artificial Intelligence (MICA I) , volume LNAI, pages 562-571. Springer-Verlag, 2006.
- M. Robnik-Sikonja. Improving random forests. In J.F. Boulicaut et al., editor,
Machine Learning, ECML 2004, 2004.
- J. Schmidhuber. Driven by compression progress: A simple principle explains essential aspects of subjective beauty, novelty, surprise, interestingness, attention, curiosity, creativity, art, science, music, jokes. CoRR, abs/0812.4360, 2009.
- C. Shirky. It's not information overload. it's filter failure. Keynote Speech, September 2008.
- E. Suzuki. Data mining methods for discovering interesting exceptions from an unsupervised table. Journal of Universal Computer Science, 12(6):627-653,
2006. http://w ..... jucs. org/jucs_12_6/data_mining_methods_for.
- E. Suzuki. Lecture Notes in Computer Science, volume 5579/2009, chapter Compression-Based Measures for Mining Interesting Rules, pages 741-746. Springer Berlin / Heidelberg, 2009.
- P.-N. Tan, M. Steinbach, and V. Kumar. Introduction to Data Mining, {First
Edition}. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA,
2005.
- D. Tax and R Duin. Experiments with classifier combining rules. In Lecture
Notes in Computer Science, volume 1857, pages 16-29, Berlin, 2000. Springer- Verlag.
- D. Tax and RP.W. Duin. Using two-class classifiers for multi class classification.
In C. Suen R Kasturi, D. Laurendeau, editor, Proceedings 16th International Conference on Pattern Recognition, volume II, pages 124-127, Quebec City, Canada, Aug.11-15 2002. IEEE Computer Society Press.
- I. Tsang, J. Kwok, P. Cheung, and N. Cristianini. Core vector machines: Fast svm training on very large data sets. Journal of Machine Learning Research,
6:363-392, 2005.
- L. H. Tsoukalas and R E. Uhrig. Fuzzy and Neural Approaches in Engineering.
John Wiley & Sons, Inc., New York, NY, USA, 1996.
- C. S. Wallace and D. M. Boulton. A information measure for classification.
Computer Journal, 11(2):185-194, 1968.
- J.-S. Wang and J.-C. Chiang. An efficient data preprocessing procedure for support vector clustering. Journal of Universal Computer Science, 15(4):705-
721, 2009. http://www . jucs. org/jucs_15_4/an_efficient_data_preprocessing.
- J. D. Williams. The Compleat Strategyst: Being a Primer on the Theory of
Games of Strategy. Dover Publications, 1986.
Hilary Mason - Machine Learning for Hackers
An interesting beginners talk for machine learning enthusiasts.
Ever tried to use a regular expression to parse an unstructured street address? This talk is an introduction to a few machine learning algorithms and some tips for integrating them where they make the most sense and will save you the most headaches.
Hilary Mason - Machine Learning for Hackers from BACON: things developers love on Vimeo.
On Bayesian Sensitivity Analysis in Digital Forensics
The idea of using of Bayesian Belief Networks in digital forensics to quantify the evidence has been around for a while now. To provide qualitative approaches to Bayesian evidential reasoning in the digital Meta-Forensics is however relatively new in the decision support systems research. For law enforcement, decision support and application of data mining techniques to “soft” forensic evidence is a large area in Bayesian forensic statistics which has depicted how Bayesian Networks can be used to infer the probability of defense and prosecution statements based on forensic evidence. Kevin B. Korb and Ann E. Nicholson's study on Sally Clark is Wrongly Convicted of Murdering Her Children and Linguistic Bayesian Networks for reasoning with subjective probabilities in forensic statistics gives an insight into an important development which helps to quantify the meaning of forensic expert testimony for "strong support".
The IEEE paper on Sensitivity Analysis of a Bayesian Network for Reasoning about Digital Forensic Evidence published in 3rd International Conference on Human-Centric Computing (HumanCom), 2010 is of particular interest since it has a comprehensive real-world list of evidence items and hypothesis.
Bayesian network representing an actual prosecuted case of illegal file sharing over a peer-to-peer network has been subjected to a systematic and rigorous sensitivity analysis. Our results demonstrate that such networks are usefully insensitive both to the occurrence of missing evidential traces and to the choice of conditionalevidential probabilities
one of the co-authors Dr. Overill has also covered grounds for A Complexity Based Forensic Analysis of the Trojan Horse Defence.
The evidence nodes are follows.
- Modification time of the destination file equals that of the source file
- Creation time of the destination file is after its own modification time
- Hash value of the destination file matches that of the source file
- BitTorrent client software is installed on the seized computer
- File link for the shared file is created
- Shared file exists on the hard disk
- Torrent file creation record is found
- Torrent file exists on the hard disk
- Peer connection information is found
- Tracker server login record is found
- Torrent file activation time is corroborated by its MAC time and link file
- Internet history record about the publishing website is found
- Internet connection is available
- Cookie of the publishing website is found
- URL of the publishing website is stored in the web browser
- Web browser software is available
- Internet cache record about the publishing of the torrent file is found
- Internet history record about the tracker server connection is found
- The seized computer was used as the initial seeder to share the pirated file on a BitTorrent network
while the following hypothesis stand.
- The pirated file was copied from the seized optical disk to the seized computer
- A torrent file was created from the copied file
- The torrent file was sent to newsgroups for publishing
- The torrent file was activated, which caused the seized computer to connect to the tracker server
- The connection between the seized computer and the tracker server was maintained

The authors conclude, exonerating the sparse evidence such that
The sensitivity analysis reported in this paper demonstrates that the BT BBN used in is insensitive to the occurrence of missing evidence and also to the choice of evidential likelihoods to an unexpected degree.
Our overall finding is gratifying because it implies that the exact choice of values for the inherently subjective evidential likelihoods is not as critical as might have been expected. Values falling within the consensus of experienced expert investigators are sufficiently reliable to be used in the BBN model. Furthermore, our results imply that the inability to recover one or more evidential traces in a digital forensic investigation is not generally critical for the probability of the investigatory hypothesis under consideration.
For some reason, this reminded me of a recent read SuperFreakonomics where authors devise a terrorist-algorithm with the following black-box variable.
“What finally made it work was one last metric that dramatically sharpened the aalgorithm. In the interest of national security, was have been asked to not disclose the particulars; we’ll call it Variable X.
What makes Variable X so special?
For one, it is a behavioral metric, not a demographic one. The dream of anti-terrorist authorities everywhere is to somehow become a fly on the wall in a room full of terrorists. In one small important way, Variable X accomplishes that. Unlike most other metrics in the algorithm, which produce a yes or no answer, Variable X measures the intensity of a particular banking activity. While not unusual in low intensities among the general population, this behavior occurs in high intensities much more frequently among those who have other terrorist markers.
This ultimately gave the algorithm great predictive power. Starting with a database of millions of bank customers, Horsley was able to generate a list of about 30 highly suspicious individuals. According to his rather conservative estimate, at least 5 of those 30 are almost certainly involved in actitvities. Five out of 30 isn’t perfect—the algorithm misses many terrorists and still falsley identifies some innocents—but it sure beats 495 out of 500,495.”
Bayesian Belief Networks can definitely serve as a better probabilistic graphical model to achieve a improved visibility and prior/posterior probabilities for such network related algorithm.
pgm.HelloWorld() with Wainwright & Jordan
I have recently came across Wainwright & Jordan's paper on exponential families, graphical models, and variational inference and found it to be quite comprehensive and unifying introduction of the topic. Probabilistic graphical models use a graph-based representation as the basis for compactly encoding a complex distribution over a high-dimensional space. If you are familiar with Koller and Friedman's work on Probabilistic Modeling, Wainwright and Jordan's paper would provide a less mathenamtically terse and more unifying view of the area.
Graphical Models, Exponential Families, and Variational Inference
As compared to Pearl's work on Causality, this paper provides a contemporary look at Message-passing Algorithms for Approximate Inference, Connection to Max-Product Message-Passing and detailed insight into Moment Matrices, Semidefinite Constraints, and Conic Programming Relaxation. Due to it's clarity and detailed explanation, the background material on Graphs, hypergraphs, exponential families and duality is definitely worth reading even if you don't need a refresher.
In Lieu of Pearl's polytree approach, Wainwright & Jordan's work discusses Graphical Models as Exponential Families before delving into Computational Challenges with High-Dimensional Models. Later chapters deal with Sum-Product, Bethe–Kikuchi, and Expectation-Propagation, Mean Field Methods, Variational Methods in Parameter Estimation, Convex Relaxations and Upper Bounds, Integer Programming, Max-product, and Linear Programming Relaxations concluding with Moment Matrices, Semidefinite Constraints, and Conic Programming Relaxation. For a computer scientist, it is always interesting to observe the statistical perspective of machine learning. This contemporary insight into Graphical Models, Exponential Families, and Variational Inference was published in Foundations and Trends in Machine Learning which is definitely built upon researchers' earlier work on Variational inference for Dirichlet process mixtures and Variational inference in graphical models: The view from the marginal polytope.
As an appetizer, I would also recommend Bishop's chapter on Graphical Model.
Continued adventures in SamIam - Back to Basics with Code Bandit
As discussed previous post on Customizing Conditional Probability using Code Generation with SamIam, I have touched upon importance of having programmatic and declarative control over the network. Working with SamIam (and with Infer.NET to some extent) gives a researcher provides this flexibility which is hard to find in proprietary tools.
Here is a simple example of a typical text book belief network. Once graphically drawn, SamIam's code bandit allow you to extract the model out as a class.
This class hard codes the network ...code\samiam30_windows_amd64\samiam\BeliefNet.net where one can operate on the object BayesianNetwork and can modify the nodes by population from a different data source rather than hard coding. Once a simple, readable structure model is available in raw code, there are lots of possibilities for data population. To build, ensure that inflib.jar occurs in the command line classpath, e.g. javac -classpath inflib.jar ModelTutorial.java
public BayesianNetwork createBayesianNetwork()
{
/* Create a domain of size 5. */
Domain domain = new Domain(5);
/* Add a discrete variable called "H" to the domain,
with states "True", "False". */
String name0 = "H";
String[] values0 = new String[]{ "True", "False" };
int id0 = domain.addDim( name0, values0 );
/* Add a discrete variable called "B" to the domain,
with states "True", "False". */
String name1 = "B";
String[] values1 = new String[]{ "True", "False" };
int id1 = domain.addDim( name1, values1 );
/* Add a discrete variable called "L" to the domain,
with states "True", "False". */
String name2 = "L";
String[] values2 = new String[]{ "True", "False" };
int id2 = domain.addDim( name2, values2 );
/* Add a discrete variable called "C" to the domain,
with states "True", "False". */
String name3 = "C";
String[] values3 = new String[]{ "True", "False" };
int id3 = domain.addDim( name3, values3 );
/* Add a discrete variable called "F" to the domain,
with states "True", "False". */
String name4 = "F";
String[] values4 = new String[]{ "True", "False" };
int id4 = domain.addDim( name4, values4 );
/* For the cpts, create arrays of double-precision floating point values. */
//H Value
//True 0.2
//False 0.8
double[] cpt0 = new double[]{ 0.2, 0.8 };
//B H Value
//True True 0.25
//True False 0.05
//False True 0.75
//False False 0.95
double[] cpt1 = new double[]{ 0.25, 0.05, 0.75, 0.95 };
//L H Value
//True True 0.03
//True False 5.0E-4
//False True 0.97
//False False 0.9995
double[] cpt2 = new double[]{ 0.03, 5.0E-4, 0.97, 0.9995 };
//C L Value
//True True 0.6
//True False 0.02
//False True 0.4
//False False 0.98
double[] cpt3 = new double[]{ 0.6, 0.02, 0.4, 0.98 };
//F L B Value
//True True True 0.75
//True True False 0.1
//True False True 0.5
//True False False 0.05
//False True True 0.25
//False True False 0.9
//False False True 0.5
//False False False 0.95
double[] cpt4 = new double[]{ 0.75, 0.1, 0.5, 0.05, 0.25, 0.9, 0.5, 0.95 };
Later on, SamIam creates the table using the CPT's and eventually build the network using these tables.
/*
Create a IL2 Table for each cpt.
The parameters to the Table constructor are:
(1) the domain,
(2) the variable ids that name the dimensions of the table (in the form of an IntSet),
(3) the cpt data.
*/
Table table0 = new Table( domain, new IntSet( new int[]{ id0 } ), cpt0 );
Table table1 = new Table( domain, new IntSet( new int[]{ id0, id1 } ), cpt1 );
Table table2 = new Table( domain, new IntSet( new int[]{ id0, id2 } ), cpt2 );
Table table3 = new Table( domain, new IntSet( new int[]{ id2, id3 } ), cpt3 );
Table table4 = new Table( domain, new IntSet( new int[]{ id1, id2, id4 } ), cpt4 );
/* Create an array of all the Tables. */
Table[] tables = new Table[]{ table0, table1, table2, table3, table4 };
/*
The simple BayesianNetwork constructor takes only one argument:
an array of Tables.
*/
BayesianNetwork model = new BayesianNetwork( tables );
Upon building, you get the following console output.
Happy inferring!
Tag Cloud for Belief Network Sensitivity as Background Knowledge
While we are having fun visualizing with tag clouds, here is one on the following four key papers in the area of pattern mining with Bayesian networks as background knowledge and discovery of interesting patterns based on Bayesian network background knowledge.
- Fast discovery of interesting patterns based on Bayesian network background knowledge
- Fast discovery of unexpected patterns in data, relative to a Bayesian network
- Interestingness filtering engine: Mining Bayesian networks for interesting patterns
- Scalable pattern mining with Bayesian networks as background knowledge
- Using sensitivity of a bayesian network to discover interesting patterns
Selected Papers in Machine Learning
- The Discipline of Machine Learning by Tom Mitchell
- Introduction to Support Vector Machines - Dustin Boswell
- Fast Training of Support Vector Machines using Sequential Minimal Optimization
- Introduction to linear regression
- The elements of statistical learning (book)
- Survey of Clustering Algorithms
- Supervised Machine Learning: A Review of Classification Techniques
- Ensemble Methods in Machine Learning
- The Boosting Approach to Machine Learning
- K-means clustering via principal component analysis
- Dimensionality Reduction
- Unsupervised Learning by Probabilistic Latent Semantic Analysis
- Classes of Kernels for Machine Learning:A Statistics Perspective
- Bayesian inference: An introduction to principles and practice in machine learning
- An Introduction to MCMC for Machine Learning
- Supervised machine learning: A review of classification techniques
- Linear Algebra Refresher
- Bayesian Modelling in Machine Learning: A Tutorial Review






