Deep learning architectures are built using multiple levels of non-linear aggregators, for instance neural nets with many hidden layers. In this introductory talk, Will Stanton discusses the motivations and principles behind learning algorithms for deep architectures. He provides a primer on neural networks and deep learning, explains how deep learning gives some of the best-ever solutions to problems in computer vision, speech recognition, and natural language processing, and also covers why Google is investing in deep learning.
Presented at IEEE HST 2015
Static Analysis for Web Service Security – Techniques and Tools for a Secure Development Life Cycle
Adnan Masood, Nova Southeastern University; Jim Java, Nova Southeastern University
Presented at IEEE SoutheastCon 2015
Finding Interesting Outliers - A Belief Network based Approach
Abstract: Outliers are deviations from the usual trends of data. Discovering interestingness among outliers, i.e. finding anomalies that are of real interest to subject matter experts, is an active area of research in the data mining and machine learning community. Because interestingness is subjective, the definition of what amounts to ‘interesting’ varies between domains and subject matter experts. In this research, we explore the quantification of measures of interestingness, using Bayesian belief networks as background knowledge. Mining outliers may help discover potential anomalies and fraudulent activities. Meaningful outliers can be retrieved and analyzed by using domain knowledge. Domain knowledge (or background knowledge) is represented using probabilistic graphical models such as Bayesian belief networks. Bayesian networks are graph-based representations used to model and encode mutual relationships between entities. Due to their probabilistic graphical nature, belief networks are an ideal way to capture the sensitivity, causal inference, uncertainty and background knowledge in real-world data sets. Bayesian networks effectively present the causal relationships between different entities (nodes) using conditional probability. This probabilistic relationship shows the degree of belief between entities. A quantitative measure which computes changes in this degree of belief acts as a sensitivity measure. In this research paper we provide an overview of interestingness measures and their use to measure sensitivity in belief networks, and review the earlier work on the so-called Interestingness Filtering Engine. Building upon these foundations, we introduce our algorithm IBOX - Interestingness based Bayesian Outlier eXplainer, which provides progressive improvement in the performance and sensitivity scoring of the earlier works. IBOX provides an iterative model to use multiple interestingness measures, resulting in better performance and improved sensitivity analysis.
The approach quantitatively validates probabilistic interestingness measures as an effective sensitivity analysis technique in rare class mining.
Topic Category: Data Mining and Machine Learning
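To make the "change in degree of belief" idea from the abstract concrete, here is a minimal Python sketch. It is not the IBOX algorithm and not any of the paper's interestingness measures; it only applies Bayes' rule to a hypothetical two-node network (H -> E) and reports how far observing E moves the belief in H. The function names, the fraud/alert scenario and all probabilities are assumptions made up for illustration.

```python
def posterior(prior, likelihood, false_alarm):
    """P(H | E) by Bayes' rule for a two-node network H -> E.

    prior       = P(H)
    likelihood  = P(E | H)
    false_alarm = P(E | not H)
    """
    evidence = likelihood * prior + false_alarm * (1.0 - prior)
    if evidence == 0.0:
        raise ValueError("evidence has zero probability under this model")
    return likelihood * prior / evidence


def belief_change(prior, likelihood, false_alarm):
    """Shift in the degree of belief in H caused by observing E."""
    return posterior(prior, likelihood, false_alarm) - prior


# Hypothetical numbers: H = "transaction is fraudulent", E = "alert fired".
shift = belief_change(prior=0.01, likelihood=0.9, false_alarm=0.05)
print(f"belief shift after observing the alert: {shift:.4f}")
```

When the evidence is equally likely with or without H, the shift is zero, which is what one would expect from a sensitivity score: uninformative findings should not register as interesting.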
Download Paper - NL-ESB - A Negative Latency Enterprise Service Bus
Here is Prof. Hastie's recent talk from the H2O World conference. In this talk, Professor Hastie takes us through ensemble learners built on decision trees, such as random forests, for classification problems.
Other excellent talks from the conference include the following.
- Michael Marks - Values and Art of Scale in Business
- Nachum Shacham, PayPal - R and ROI for Big Data
- Hassan Namarvar, ShareThis - Conversion Estimation in Display Advertising
- Ofer Mendelevitch, Hortonworks - Bayesian Networks with R and Hadoop
- Sandy Ryza, Cloudera - MLlib and Apache Spark
- Josh Bloch, Lord of the APIs - A Brief, Opinionated History of the API
- Macro and Micro Trends in Big Data, Hadoop and Open Source
- Competitive Data Science Panel: Kaggle, KDD and data sports
- Practical Data Science Panel
The complete playlist can be found here.
Following the great scholarly acceptance and outstanding academic success of "The Clairvoyant Load Balancing Algorithm for Highly Available Service Oriented Architectures", this year I present P Not Equal to NP - A Definitive Proof by Contradiction.
Click here to read the entire paper in PDF. P Not Equal to NP - A Definitive Proof by Contradiction.
Over a decade ago, Peter Flach of Bristol University wrote a paper titled "On the state of the art in machine learning: A personal review", in which he reviewed several then-recent books related to developments in machine learning. These included Pat Langley’s Elements of Machine Learning (Morgan Kaufmann), Tom Mitchell’s Machine Learning (McGraw-Hill), and Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations by Ian Witten and Eibe Frank (Morgan Kaufmann), among many others. Dr. Flach singled out Michael Berry and Gordon Linoff’s Data Mining Techniques for Marketing, Sales, and Customer Support (John Wiley) for its excellent writing style, citing the paragraph below and commenting, "I wish that all computer science textbooks were written like this."
“People often find it hard to understand why the training set and test set are “tainted” once they have been used to build a model. An analogy may help: Imagine yourself back in the 5th grade. The class is taking a spelling test. Suppose that, at the end of the test period, the teacher asks you to estimate your own grade on the quiz by marking the words you got wrong. You will give yourself a very good grade, but your spelling will not improve. If, at the beginning of the period, you thought there should be an ‘e’ at the end of “tomato”, nothing will have happened to change your mind when you grade your paper. No new data has entered the system. You need a test set!
Now, imagine that at the end of the test the teacher allows you to look at the papers of several neighbors before grading your own. If they all agree that “tomato” has no final ‘e’, you may decide to mark your own answer wrong. If the teacher gives the same quiz tomorrow, you will do better. But how much better? If you use the papers of the very same neighbors to evaluate your performance tomorrow, you may still be fooling yourself. If they all agree that “potatoes” has no more need of an ‘e’ than “tomato”, and you have changed your own guess to agree with theirs, then you will overestimate your actual grade on the second quiz as well. That is why the evaluation set should be different from the test set.” [3, pp. 76–77]
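The spelling-quiz analogy boils down to keeping three disjoint pools of data. Below is a minimal Python sketch of such a three-way split, using only the standard library. The helper name `three_way_split`, the 60/20/20 proportions and the fixed seed are my own assumptions for illustration. Note also that what the quote calls the "test set" and the "evaluation set" are, as far as I can tell, what most current texts call the validation set and the test set; the code uses the latter names.

```python
import random


def three_way_split(items, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle items and cut them into disjoint train/validation/test lists."""
    if val_frac < 0 or test_frac < 0 or val_frac + test_frac >= 1:
        raise ValueError("fractions must be non-negative and sum to less than 1")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is repeatable
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]                 # touched once, for the final estimate
    val = shuffled[n_test:n_test + n_val]    # used while tuning the model
    train = shuffled[n_test + n_val:]        # used to fit the model
    return train, val, test


train, val, test = three_way_split(range(100))
print(len(train), len(val), len(test))
```

Anything used to fit or tune the model (the first two lists) is "tainted" in the sense of the quote, so only the last list gives an honest estimate of performance.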
Given Dr. Flach's appreciation for clear writing, when I recently came across his own "Machine Learning: The Art and Science of Algorithms that Make Sense of Data", I decided to check it out and wasn't disappointed. Dr. Flach is a Professor of Artificial Intelligence at the University of Bristol, and in this "future classic" he left no stone unturned when it comes to clarity and explainability. The book starts with a machine learning sampler, introduces the ingredients of machine learning, and quickly progresses to binary classification and beyond. Written as a textbook, replete with examples, footnotes and figures, this text covers concept learning, tree models, rule models, linear models, distance-based models, probabilistic models, features and ensembles, concluding with machine learning experiments. I really enjoyed the "Important points to remember" section of the book as a quick refresher on machine learning commandments.
The concept learning section seems to have been influenced by the author's own research interests and is not discussed in as much detail in contemporary machine learning texts. I also found the frequent summarization of concepts to be quite helpful. Contrary to its subtitle and compared to its counterparts, however, the book is light on algorithms and code, possibly on purpose. While it explains the concepts with examples, the number of formal algorithms is kept to a minimum. This may aid clarity and help avoid recipe-book syndrome, while making the book potentially inaccessible to practitioners. Great at basics, the text also falls short on elaboration of intermediate to advanced topics such as LDA, kernel methods, PCA, RKHS, and convex optimization. For instance, in chapter 10, "Matrix transformations and decompositions" could have been made an appendix while expanding upon meaningful topics like LSA and use cases of sparse matrices (pg 327). It is definitely not the book's fault, but rather that of this reader expecting too much from an introductory text just because the author explains everything so well!
As a textbook on the art and science of algorithms, Peter Flach's book definitely delivers on the promise of clarity, with well-chosen illustrations and an example-based approach. Highly recommended reading for all who would like to understand the principles behind machine learning techniques.
Materials can be downloaded from here. They generously include excerpts with background material and literature references, and the full set of 540 lecture slides in PDF, including all figures in the book, along with the LaTeX beamer source.
Going for a little Benoit B. Mandelbrot recursion joke here with the title.
Seth Juarez (github) recently spoke to the Pasadena .NET user group on the topic of Practical Machine Learning using nuML. Seth is a wonderful speaker and educator, and nuML is an excellent library for getting started with machine learning in .NET. His explanations are very intuitive, even for people who have been working in the field for a while. During the talk and follow-up discussions, various technical references were made that went beyond the scope of the talk. To be fair to Seth, he covered a lot of material in an hour and a half, probably a couple of weeks' worth in a traditional ML course.
Therefore I decided to provide links to these underlying topics for the benefit of attendees who are interested in knowing more about them.
- No free lunch in search and optimization
- Probably approximately correct learning
- Kernelized Sorting for NLP Presentation - Paper by Seth
- QP Solver
- NP-Complete Problems
- Intuitive Explanation of Expectation Maximization
- Multi-class classification
- Roslyn and Roslyn CTP Introduces Interactive Code for C#
- Expando Objects
- Cardinality vs Selectivity
- Microsoft Automatic Graph Layout Library
- Positive Definite Matrix
- Kernel Perceptron in Python
- Perceptrons and Kernels
- math.net numerics
- Matrix Slicing
- Vectors and Matrices
- CodeMash 2013 Repo and readme
- What is EM algorithm?
- k-means clustering
- Clustering Algorithms
- Bag of Words Model
- Cosine similarity vs Hamming distance
- Time series regression and generalized least squares
- Machine Learning Techniques for Stock Prediction
- Causality, Correlation and Brownian Motion
Happy Machine Learning!