
Exploring Spark with Data Science Workbench

Apache Spark is a general-purpose cluster computing platform that extends MapReduce to support multiple computation types, including but not limited to stream processing and interactive queries. Last week, IBM's Moktar Kandil presented at the joint meetup of the Tampa Hadoop and Tampa Data Science groups on the topic of exploring Apache Spark. Apache Spark for Azure HDInsight Following are…


State of Facial Recognition (Azure Face API et al) & Sentiment Analysis

Sam just wrote a precis on Why Facial Recognition Is the Next Big Thing in Marketing, which outlines how brands are using, or can use, facial recognition to increase engagement and therefore sales. From a machine learning and data science perspective, building algorithms that understand what one's face is really saying, i.e., performing emotion analysis to find insight into purchasing patterns…


On Explainability of Deep Neural Networks

During a discussion yesterday with software architect extraordinaire David Lazar regarding how everything old is new again, the topic of deep neural networks and their amazing success was brought up. Unless one has been living under a rock for the past five years, the advancements in artificial neural networks (ANN) have been quite significant and noteworthy. Since the…


MIT Machine Learning for Big Data and Text Processing Class Notes Day 5

On the final day (day 5), the agenda for the MIT Machine Learning course was as follows: generative models, mixtures, EM algorithm; semi-supervised and active learning; tagging, information extraction. The day started with Dr. Jaakkola's discussion on parameter selection, generative learning algorithms, Learning Generative Models via Discriminative Approaches, and Generative and Discriminative Models. This led to the questions such…


MIT Machine Learning for Big Data and Text Processing Class Notes Day 4

On day 4 of the Machine Learning course, the agenda was as follows: unsupervised learning, clustering; dimensionality reduction, matrix factorization; collaborative filtering, recommender problems. The day started with Regina Barzilay's (Bio) (Personal Webpage) talk on determining the number of clusters in a data set and approaches to determine the correct number of clusters. The core idea being addressed was difference…


MIT Machine Learning for Big Data and Text Processing Class Notes Day 3

Day 3 of the Machine Learning for Big Data and Text Processing class started with Dr. Regina Barzilay's (Bio) (Personal Webpage) overview of the following: cascades, boosting; neural networks, deep learning; back-propagation; image/text annotation, translation. Dr. Barzilay introduced BoosTexter to the class with a demo on a Twitter feed. BoosTexter is a general-purpose machine-learning program based on boosting for building…


MIT Machine Learning for Big Data and Text Processing Class Notes - Day 2

So after having an awesome Day 1 @ MIT, I was in the CSAIL library and met Pedro Ortega, NIPS 2015 Program Manager @adaptiveagents. Celebrity sighting! Today on Day 2, Dr. Jaakkola (Bio) (Personal Webpage), professor, Electrical Engineering and Computer Science/Computer Science and Artificial Intelligence Laboratory (CSAIL), went over the following: non-linear classification and regression, kernels; passive-aggressive algorithm; overfitting, regularization, generalization; content…
