Unsupervised Learning Essentials

Large amounts of data are recorded every day in fields such as marketing, biomedicine, and security. To discover knowledge from these data, you need machine learning techniques, which fall into two categories: unsupervised and supervised learning.
Unsupervised learning mainly includes clustering and principal component methods. The goal of clustering is to identify patterns or groups of similar objects within a data set of interest. Principal component methods summarize and visualize the most important information contained in a multivariate data set.
These methods are “unsupervised” because we are not guided by a priori ideas of which variables or samples belong in which clusters or groups. The machine algorithm “learns” how to cluster or summarize the data.
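As a minimal sketch of these two unsupervised tasks, the R code below runs k-means clustering and a principal component analysis using only base R. The built-in USArrests data set, the choice of four clusters, and the seed are assumptions made for illustration.

```r
# Assumption: the built-in USArrests data stands in for "a data set of interest".
data("USArrests")
df <- scale(USArrests)   # standardize: each variable gets mean 0 and sd 1

# Clustering: partition the 50 states into k groups.
# k = 4 is an arbitrary illustrative choice.
set.seed(123)            # k-means starts from random centers
km <- kmeans(df, centers = 4, nstart = 25)
table(km$cluster)        # number of states assigned to each cluster

# Principal component analysis: summarize the 4 variables
# with a smaller number of uncorrelated components.
pca <- prcomp(df)
summary(pca)             # proportion of variance explained per component
head(pca$x[, 1:2])       # coordinates of the states on the first two components
```

No outcome variable is supplied in either call: the groups and the components are derived from the data alone.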
Supervised learning consists of building mathematical models for predicting the outcome of future observations. Predictive models can be classified into two main groups:
Regression analysis for predicting a continuous variable. For example, you might want to predict life expectancy based on socio-economic indicators.
Classification for predicting the class (or group) of individuals. For example, you might want to predict the probability of being diabetes-positive based on the glucose concentration in the plasma of patients.
These methods are supervised because we build the model based on known outcome values. That is, the machine learns from known observation outcomes in order to predict the outcome of future cases.
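The two kinds of predictive model can be sketched in R as follows. The data sets are assumptions chosen to mirror the examples above: state.x77 (built into R) for life expectancy, and Pima.tr / Pima.te from the MASS package (a recommended package distributed with R) for diabetes status. The chosen predictors and the 0.5 cutoff are likewise illustrative.

```r
# Regression: predict a continuous outcome (life expectancy)
# from socio-economic indicators.
states <- as.data.frame(state.x77)
colnames(states) <- make.names(colnames(states))  # "Life Exp" -> "Life.Exp"
fit_lm <- lm(Life.Exp ~ Income + Murder + HS.Grad, data = states)
coef(fit_lm)                                      # estimated effect of each indicator

# Classification: predict the probability of being diabetes-positive
# from plasma glucose concentration, using logistic regression.
library(MASS)                                     # provides Pima.tr and Pima.te
fit_glm <- glm(type ~ glu, data = Pima.tr, family = binomial)

# Learn from known outcomes (Pima.tr), then predict cases not used for fitting (Pima.te).
prob <- predict(fit_glm, newdata = Pima.te, type = "response")
pred_class <- ifelse(prob > 0.5, "Yes", "No")     # 0.5 is an arbitrary cutoff
mean(pred_class == Pima.te$type)                  # proportion correctly classified
```

Both models are fitted on observations whose outcome is known, which is what makes them supervised.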
Here, we present a practical guide to machine learning methods for exploring data sets as well as for building predictive models.
You’ll learn the basic ideas of each method, along with reproducible R code for easily computing a large number of machine learning techniques.
Our goal was to write a practical guide to machine learning for everyone.
The main parts of the book include:
The book presents the basic principles of these tasks and provides many examples in R. It offers solid guidance in data mining for students and researchers.
Key features:
At the end of each chapter, we present R lab sections in which we systematically work through applications of the various methods discussed in that chapter.