SVM Model: Support Vector Machine Essentials

Support Vector Machine (or SVM) is a machine learning technique used for classification tasks. Briefly, SVM works by identifying the optimal decision boundary that separates data points from different groups (or classes), and then predicts the class of new observations based on this separation boundary.

Depending on the situation, the different groups may be separable by a straight line or by a non-linear boundary.

Support vector machine methods can handle both linear and non-linear class boundaries, and can be used for both two-class and multi-class classification problems.

In real-life data, the separation boundary is generally non-linear. Technically, the SVM algorithm performs a non-linear classification using what is called the kernel trick. The most commonly used kernel transformations are the polynomial kernel and the radial kernel.
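
For intuition, the radial kernel scores the similarity between two observations as exp(-sigma * ||x - y||^2), which is the form used by the kernlab engine behind caret's radial SVM. Below is a minimal sketch in base R (the function name rbf_kernel and the sigma value are illustrative):

# Radial (RBF) kernel: K(x, y) = exp(-sigma * ||x - y||^2)
rbf_kernel <- function(x, y, sigma = 1) {
  exp(-sigma * sum((x - y)^2))
}
# Nearby points score close to 1; distant points score close to 0
rbf_kernel(c(1, 2), c(1.1, 2.1), sigma = 0.5)
rbf_kernel(c(1, 2), c(5, 9), sigma = 0.5)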

Note that there is also an extension of the SVM for regression, called support vector regression.

In this chapter, we’ll describe how to build an SVM classifier using the caret R package.

Loading required R packages

  • tidyverse for easy data manipulation and visualization
  • caret for easy machine learning workflow
library(tidyverse)
library(caret)

Example data set

Data set: PimaIndiansDiabetes2 [in the mlbench package], introduced in Chapter @ref(classification-in-r), for predicting the probability of being diabetes-positive based on multiple clinical variables.

We’ll randomly split the data into a training set (80%, for building the predictive model) and a test set (20%, for evaluating the model). Make sure to set the seed for reproducibility.

# Load the data
data("PimaIndiansDiabetes2", package = "mlbench")
pima.data <- na.omit(PimaIndiansDiabetes2)
# Inspect the data
sample_n(pima.data, 3)
# Split the data into training and test set
set.seed(123)
training.samples <- pima.data$diabetes %>%
  createDataPartition(p = 0.8, list = FALSE)
train.data  <- pima.data[training.samples, ]
test.data <- pima.data[-training.samples, ]

SVM linear classifier

In the following example, the variables are normalized to make their scales comparable. This is done automatically before building the SVM classifier by setting the option preProcess = c("center","scale").

# Fit the model on the training set
set.seed(123)
model <- train(
  diabetes ~., data = train.data, method = "svmLinear",
  trControl = trainControl("cv", number = 10),
  preProcess = c("center","scale")
  )
# Make predictions on the test data
predicted.classes <- model %>% predict(test.data)
head(predicted.classes)
## [1] neg pos neg pos pos neg
## Levels: neg pos
# Compute model accuracy rate
mean(predicted.classes == test.data$diabetes)
## [1] 0.782

Note that there is a tuning parameter C, also known as Cost, that controls the tolerance for misclassifications. It essentially imposes a penalty on the model for making an error: the higher the value of C, the less likely the SVM algorithm is to misclassify a training point.

By default, caret builds the SVM linear classifier using C = 1. You can check this by typing model in the R console.
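
For example:

# Print the model summary, including the value of C used
model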

It’s possible to automatically compute the SVM classifier for different values of C and to choose the optimal one that maximizes the model cross-validation accuracy.

The following R code computes the SVM classifier over a grid of C values and automatically chooses the final model for predictions:

# Fit the model on the training set
set.seed(123)
model <- train(
  diabetes ~., data = train.data, method = "svmLinear",
  trControl = trainControl("cv", number = 10),
  tuneGrid = expand.grid(C = seq(0, 2, length = 20)),
  preProcess = c("center","scale")
  )
# Plot model accuracy vs different values of Cost
plot(model)

# Print the best tuning parameter C that
# maximizes model accuracy
model$bestTune
##       C
## 12 1.16
# Make predictions on the test data
predicted.classes <- model %>% predict(test.data)
# Compute model accuracy rate
mean(predicted.classes == test.data$diabetes)
## [1] 0.782
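
Beyond the overall accuracy rate, caret's confusionMatrix() function provides a more detailed view of the test-set performance, including sensitivity and specificity:

# Cross-tabulate predicted versus observed classes
confusionMatrix(predicted.classes, test.data$diabetes)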

SVM classifier using non-linear kernels

To build a non-linear SVM classifier, we can use either a polynomial kernel or a radial kernel function. Again, the caret package can be used to easily compute the polynomial and the radial SVM non-linear models.

The package automatically chooses the optimal values for the model tuning parameters, where optimal is defined as the values that maximize the model accuracy.

  • Computing SVM using radial basis kernel:
# Fit the model on the training set
set.seed(123)
model <- train(
  diabetes ~., data = train.data, method = "svmRadial",
  trControl = trainControl("cv", number = 10),
  preProcess = c("center","scale"),
  tuneLength = 10
  )
# Print the best tuning parameters sigma and C that
# maximize model accuracy
model$bestTune
##   sigma    C
## 1 0.136 0.25
# Make predictions on the test data
predicted.classes <- model %>% predict(test.data)
# Compute model accuracy rate
mean(predicted.classes == test.data$diabetes)
## [1] 0.795
  • Computing SVM using polynomial basis kernel:
# Fit the model on the training set
set.seed(123)
model <- train(
  diabetes ~., data = train.data, method = "svmPoly",
  trControl = trainControl("cv", number = 10),
  preProcess = c("center","scale"),
  tuneLength = 4
  )
# Print the best tuning parameters degree, scale and C
# that maximize model accuracy
model$bestTune
##   degree scale C
## 8      1  0.01 2
# Make predictions on the test data
predicted.classes <- model %>% predict(test.data)
# Compute model accuracy rate
mean(predicted.classes == test.data$diabetes)
## [1] 0.795

In our examples, the SVM classifiers using a non-linear kernel give a slightly better result (accuracy of 0.795) than the linear model (accuracy of 0.782).
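
To compare the models on their cross-validation results rather than a single test-set accuracy, caret's resamples() function can be used. The sketch below assumes the linear and radial fits are stored under the illustrative names svm.linear and svm.radial instead of overwriting model:

# Refit and store both models (same seed, so the CV folds match)
set.seed(123)
svm.linear <- train(
  diabetes ~., data = train.data, method = "svmLinear",
  trControl = trainControl("cv", number = 10),
  preProcess = c("center","scale")
  )
set.seed(123)
svm.radial <- train(
  diabetes ~., data = train.data, method = "svmRadial",
  trControl = trainControl("cv", number = 10),
  preProcess = c("center","scale"),
  tuneLength = 10
  )
# Collect and summarize the cross-validation accuracies
summary(resamples(list(linear = svm.linear, radial = svm.radial)))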

Discussion

This chapter describes how to use support vector machines for classification tasks. Other alternatives exist, such as logistic regression (Chapter @ref(logistic-regression)).
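
As a quick point of comparison, a logistic regression model can be trained with the same caret workflow (a minimal sketch; the object name logistic.model is illustrative):

# Fit a logistic regression model with the same 10-fold CV setup
set.seed(123)
logistic.model <- train(
  diabetes ~., data = train.data, method = "glm",
  trControl = trainControl("cv", number = 10),
  preProcess = c("center","scale")
  )
# Compute its accuracy rate on the test data
mean(predict(logistic.model, test.data) == test.data$diabetes)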

You need to assess the performance of different methods on your data in order to choose the best one.