Research & Development

Monads by Douglas Crockford

The monadic curse is that once someone learns what monads are and how to use them, they lose the ability to explain them to other people.

An excellent lecture. See also the transcript and Monads for Humans.

State of IoT Security

In a recent podcast by Scott Hanselman and Erica Stanley, an Internet of Things (IoT) primer, the guest mentioned how security is being treated as an afterthought for most things IoT. This is unfortunately true in various areas of software development, but given the unprecedented growth of IoT, this laxity in providing security standards will quickly become a safety and security dilemma.

To borrow the variety, velocity, and volume analogy of Big Data, IoT is also subject to a very large variety of devices, supporting different velocities (performance capacities) and volumes (large numbers of devices, meshes, etc.). Protecting the data on these devices and ensuring privacy are definitely the key challenges in IoT. Lax security is also bad for business, since it will decrease adoption, undermining the success of IoT and hindering overall development.

The following links and papers provide an overview, analysis, and taxonomy of the security and privacy challenges in IoT.

 

References and Further Reading

Penetration Testing techniques in Web Applications - Infographic

Penetration Testing techniques in web applications by Dimitris Mandilaras and Nikolaos Tsalis is a succinct infographic review of different security frameworks and methodologies, including OWASP, PTES, ISSAF, NIST, OSSTMM, and PTF.

A short poster can be downloaded from here.

 

Selection of 2014 F# / Functional Programming Resources

Gradient Boosting Machine Learning by Prof. Hastie

Here is Prof. Hastie's recent talk from the H2O World conference. In this talk, Professor Hastie takes us through decision trees and ensemble learners such as random forests for classification problems.

 

Other excellent talks from the conference include the following.

  • Michael Marks - Values and Art of Scale in Business
  • Nachum Shacham, PayPal - R and ROI for Big Data
  • Hassan Namarvar, ShareThis - Conversion Estimation in Display Advertising
  • Ofer Mendelevitch, Hortonworks - Bayesian Networks with R and Hadoop
  • Sandy Ryza, Cloudera - MLlib and Apache Spark
  • Josh Bloch, Lord of the APIs - A Brief, Opinionated History of the API
  • Macro and Micro Trends in Big Data, Hadoop and Open Source
  • Competitive Data Science Panel: Kaggle, KDD and data sports
  • Practical Data Science Panel

The complete playlist can be found here.

Teaching Functional Programming to Professional .NET Developers

An informative paper by Tomas Petricek of the University of Cambridge.

Abstract. Functional programming is often taught at universities to first-year or second-year students and most of the teaching materials have been written for this audience. With the recent rise of functional programming in the industry, it becomes important to teach functional concepts to professional developers with deep knowledge of other paradigms, most importantly object-oriented. We present our experience with teaching functional programming and F# to experienced .NET developers through a book Real-World Functional Programming and commercially offered F# trainings. The most important novelty in our approach is the use of C# for relating functional F# with object-oriented C# and for introducing some of the functional concepts. By presenting principles such as immutability, higher-order functions and functional types from a different perspective, we are able to build on existing knowledge of professional developers. This contrasts with a common approach that asks students to forget everything they know about programming and think completely differently. We believe that our observations are relevant for trainings designed for practitioners, but perhaps also for students who explore functional relatively late in the curriculum.

 

 

Honorable mention to A Look at F# from C#’s corner 

Dissertation - Done!

Screen Shot 2014-08-26 at 10.37.48 AM

P≠NP - A Definitive Proof by Contradiction

Following the great scholarly acceptance and outstanding academic success of "The Clairvoyant Load Balancing Algorithm for Highly Available Service Oriented Architectures", this year I present P Not Equal to NP - A Definitive Proof by Contradiction.

 

P Not Equal to NP - A Definitive Proof by Contradiction

 

Click here to read the entire paper in PDF.

LyX/LaTeX formatting for the C# code

If you are googling for a good way to insert C# code in LyX, this is probably where you'd end up. MaPePer has provided a very good solution; I have modified it slightly (hiding tabs and removing comments), and the following illustrates how to use it in LyX.

The first thing you need is a LyX document (LyxC#CodeListing.lyx); an empty one works well.

Add the following to the preamble (Document -> Settings -> LaTeX Preamble):

\usepackage{color}
\usepackage{listings}

\lstloadlanguages{% check the listings documentation for further languages
C,
C++,
csh,
Java
}

\definecolor{red}{rgb}{0.6,0,0} % identifiers
\definecolor{blue}{rgb}{0,0,0.6} % strings and caption background
\definecolor{green}{rgb}{0,0.8,0} % comments
\definecolor{cyan}{rgb}{0.0,0.6,0.6} % keywords

\lstset{
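language=csh, % listings' csh (C shell) definition serves as the base; the C# keywords are added via morekeywords below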
basicstyle=\footnotesize\ttfamily,
numbers=left,
numberstyle=\tiny,
numbersep=5pt,
tabsize=2,
extendedchars=true,
breaklines=true,
frame=b,
stringstyle=\color{blue}\ttfamily,
showspaces=false,
showtabs=false,
xleftmargin=17pt,
framexleftmargin=17pt,
framexrightmargin=5pt,
framexbottommargin=4pt,
commentstyle=\color{green},
morecomment=[l]{//}, %use comment-line-style!
morecomment=[s]{/*}{*/}, %for multiline comments
showstringspaces=false,
morekeywords={ abstract, event, new, struct,
as, explicit, null, switch,
base, extern, object, this,
bool, false, operator, throw,
break, finally, out, true,
byte, fixed, override, try,
case, float, params, typeof,
catch, for, private, uint,
char, foreach, protected, ulong,
checked, goto, public, unchecked,
class, if, readonly, unsafe,
const, implicit, ref, ushort,
continue, in, return, using,
decimal, int, sbyte, virtual,
default, interface, sealed, volatile,
delegate, internal, short, void,
do, is, sizeof, while,
double, lock, stackalloc,
else, long, static,
enum, namespace, string},
keywordstyle=\color{cyan},
identifierstyle=\color{red},
}
\usepackage{caption}
\DeclareCaptionFont{white}{\color{white}}
\DeclareCaptionFormat{listing}{\colorbox{blue}{\parbox{\textwidth}{\hspace{15pt}#1#2#3}}}
\captionsetup[lstlisting]{format=listing,labelfont=white,textfont=white, singlelinecheck=false, margin=0pt, font={bf,footnotesize}}

 

In the preamble (Document-> Settings-> LaTeX Preamble)
preamble

 

Now add a program listing block (Insert -> Program Listing). Hopefully you have the listings package installed; otherwise you can always get it through a MiKTeX update.

 

insert-program-listing-lyx


Now add the code to the listing block.


lyx-screen

Then press Ctrl-R to view the output.
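
For readers who prefer to see what the listing block amounts to in raw LaTeX, the following is a minimal sketch, assuming the preamble above is already loaded. The Greeter class, the Demo namespace, and the caption text are made up for illustration and are not part of the original solution; LyX should generate something along these lines from the listing inset, though the exact output may differ between versions.

```latex
% Sketch only: relies on the \lstset and caption settings from the preamble above.
% The C# snippet and the caption text are hypothetical examples.
\begin{lstlisting}[caption={A hypothetical C\# example}]
using System;

namespace Demo
{
    // Prints a greeting to the console.
    public class Greeter
    {
        public static void Main(string[] args)
        {
            Console.WriteLine("Hello, LyX!");
        }
    }
}
\end{lstlisting}
```

With the settings above, keywords such as using, namespace, and public would be set in cyan, the string in blue, and the // comment in green, with the caption rendered in white on a blue bar.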

 

CodeListing

 

Tada!

 

Happy Lyxing

 

References & download LyxC#CodeListing.lyx

 

 

Machine Learning - On the Art and Science of Algorithms with Peter Flach

Over a decade ago, Peter Flach of Bristol University wrote a paper titled "On the state of the art in machine learning: A personal review", in which he reviewed several then-recent books on developments in machine learning. These included Pat Langley’s Elements of Machine Learning (Morgan Kaufmann), Tom Mitchell’s Machine Learning (McGraw-Hill), and Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations by Ian Witten and Eibe Frank (Morgan Kaufmann), among many others. Dr. Flach singled out Michael Berry and Gordon Linoff’s Data Mining Techniques for Marketing, Sales, and Customer Support (John Wiley) for its excellent writing style, citing the paragraph below and commenting, "I wish that all computer science textbooks were written like this."

“People often find it hard to understand why the training set and test set are “tainted” once they have been used to build a model. An analogy may help: Imagine yourself back in the 5th grade. The class is taking a spelling test. Suppose that, at the end of the test period, the teacher asks you to estimate your own grade on the quiz by marking the words you got wrong. You will give yourself a very good grade, but your spelling will not improve. If, at the beginning of the period, you thought there should be an ‘e’ at the end of “tomato”, nothing will have happened to change your mind when you grade your paper. No new data has entered the system. You need a test set!

Now, imagine that at the end of the test the teacher allows you to look at the papers of several neighbors before grading your own. If they all agree that “tomato” has no final ‘e’, you may decide to mark your own answer wrong. If the teacher gives the same quiz tomorrow, you will do better. But how much better? If you use the papers of the very same neighbors to evaluate your performance tomorrow, you may still be fooling yourself. If they all agree that “potatoes” has no more need of an ‘e’ than “tomato”, and you have changed your own guess to agree with theirs, then you will overestimate your actual grade on the second quiz as well. That is why the evaluation set should be different from the test set.” [3, pp. 76–77]

 

Machine-Learning-9781107096394

 

That is why, when I recently came across "Machine Learning: The Art and Science of Algorithms that Make Sense of Data", I decided to check it out and wasn't disappointed. Dr. Flach is Professor of Artificial Intelligence at the University of Bristol, and in this "future classic" he leaves no stone unturned when it comes to clarity and explainability. The book starts with a machine learning sampler and introduces the ingredients of machine learning, progressing quickly to binary classification and beyond. Written as a textbook and filled with examples, footnotes, and figures, it covers concept learning, tree models, rule models, linear models, distance-based models, probabilistic models, features, and ensembles, concluding with machine learning experiments. I really enjoyed the "Important points to remember" section of the book as a quick refresher on machine learning commandments.

The concept learning section seems to have been influenced by the author's own research interests and is not discussed in as much detail in contemporary machine learning texts. I also found the frequent summarization of concepts quite helpful. Contrary to its subtitle and compared to its counterparts, however, the book is light on algorithms and code, possibly on purpose. While it explains the concepts with examples, the number of formal algorithms is kept to a minimum. This may aid clarity and help avoid recipe-book syndrome, while making the book potentially inaccessible to practitioners. Great at the basics, the text also falls short in elaborating intermediate to advanced topics such as LDA, kernel methods, PCA, RKHS, and convex optimization. For instance, chapter 10, "Matrix transformations and decompositions", could have been made an appendix, making room to expand on meaningful topics like LSA and use cases of sparse matrices (pg 327). This is definitely not the book's fault, but rather that of this reader expecting too much from an introductory text just because the author explains everything so well!

As a textbook on the art and science of algorithms, Peter Flach's book definitely delivers on the promise of clarity, with well-chosen illustrations and an example-based approach. Highly recommended reading for all who would like to understand the principles behind machine learning techniques.

The materials can be downloaded from here; they generously include excerpts with background material and literature references, and the full set of 540 lecture slides in PDF, including all figures in the book, along with the LaTeX beamer source.
