<?xml version="1.0" encoding="UTF-8" ?>
<!-- RSS generated by PHPBoost on Sat, 04 Apr 2026 08:36:09 +0200 -->

<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
	<channel>
		<title><![CDATA[Last articles - STHDA : Advanced Clustering]]></title>
		<atom:link href="https://www.sthda.com/english/syndication/rss/articles/30" rel="self" type="application/rss+xml"/>
		<link>https://www.sthda.com</link>
		<description><![CDATA[Last articles - STHDA : Advanced Clustering]]></description>
		<copyright>(C) 2005-2026 PHPBoost</copyright>
		<language>en</language>
		<generator>PHPBoost</generator>
		
		
		<item>
			<title><![CDATA[DBSCAN: Density-Based Clustering Essentials]]></title>
			<link>https://www.sthda.com/english/articles/30-advanced-clustering/105-dbscan-density-based-clustering-essentials/</link>
			<guid>https://www.sthda.com/english/articles/30-advanced-clustering/105-dbscan-density-based-clustering-essentials/</guid>
			<description><![CDATA[<!-- START HTML -->

  <div id="rdoc">
<p><strong>DBSCAN</strong> (<strong>Density-Based Spatial Clustering of Applications with Noise</strong>) is a <strong>density-based clustering</strong> algorithm, introduced in Ester et al. 1996, which can be used to identify clusters of any shape in a data set containing noise and outliers.</p>
<p>The basic idea behind the density-based clustering approach is derived from a human intuitive clustering method. For instance, by looking at the figure below, one can easily identify four clusters along with several points of noise, because of the differences in the density of points.</p>
<p>Clusters are dense regions in the data space, separated by regions of lower density of points. The DBSCAN algorithm is based on this intuitive notion of “clusters” and “noise”. The key idea is that for each point of a cluster, the neighborhood of a given radius has to contain at least a minimum number of points.</p>
<p><img src="https://www.sthda.com/english/sthda-upload/images/cluster-analysis/dbscan-idea.png" alt="DBSCAN idea" /> (From Ester et al. 1996)</p>
<div class="block">
<p>
In this chapter, we’ll describe the DBSCAN algorithm and demonstrate how to compute DBSCAN using the <em>fpc</em> R package.
</p>
</div>
<br/>
<p>Contents: </p>
<div id="TOC">
<ul>
<li><a href="#why-dbscan">Why DBSCAN?</a></li>
<li><a href="#algorithm">Algorithm</a></li>
<li><a href="#advantages">Advantages</a></li>
<li><a href="#parameter-estimation">Parameter estimation</a></li>
<li><a href="#computing-dbscan">Computing DBSCAN</a></li>
<li><a href="#method-for-determining-the-optimal-eps-value">Method for determining the optimal eps value</a></li>
<li><a href="#cluster-predictions-with-dbscan-algorithm">Cluster predictions with DBSCAN algorithm</a></li>
</ul>
</div>
<br/>
<p>Related Book:</p>
<div class = "small-block content-privileged-friends cluster-book">
    <center>
        <a href = "https://www.sthda.com/english/web/5-bookadvisor/17-practical-guide-to-cluster-analysis-in-r/">
          <img src = "https://www.sthda.com/english/sthda-upload/images/cluster-analysis/clustering-book-cover.png" /><br/>
      Practical Guide to Cluster Analysis in R
      </a>
      </center>
</div>
<div class="spacer"></div>
<div id="why-dbscan" class="section level2">
<h2>Why DBSCAN?</h2>
<p>Partitioning methods (K-means, PAM clustering) and hierarchical clustering are suitable for finding spherical-shaped or convex clusters. In other words, they work well only for compact and well-separated clusters. Moreover, they are severely affected by the presence of noise and outliers in the data.</p>
<p>Unfortunately, real-life data can contain: i) clusters of arbitrary shape, such as those shown in the figure below (oval, linear and “S”-shaped clusters); ii) many outliers and noise.</p>
<p>The figure below shows a data set containing nonconvex clusters and outliers/noise. The simulated data set <em>multishapes</em> [in the <em>factoextra</em> package] is used.</p>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/cluster-analysis/023-dbscan-density-based-clustering-data-dbscan-1.png" width="336" /></p>
<p>The plot above contains 5 clusters and outliers, including:</p>
<ul>
<li>2 oval clusters</li>
<li>2 linear clusters</li>
<li>1 compact cluster</li>
</ul>
<p>Given such data, the k-means algorithm has difficulty identifying these clusters with arbitrary shapes. To illustrate this situation, the following R code computes the k-means algorithm on the multishapes data set. The function <em>fviz_cluster</em>() [<em>factoextra</em> package] is used to visualize the clusters.</p>
<p>First, install factoextra: install.packages(“factoextra”); then compute and visualize k-means clustering using the data set multishapes:</p>
<pre class="r"><code>library(factoextra)
data("multishapes")
df <- multishapes[, 1:2]
set.seed(123)
km.res <- kmeans(df, 5, nstart = 25)
fviz_cluster(km.res, df,  geom = "point", 
             ellipse= FALSE, show.clust.cent = FALSE,
             palette = "jco", ggtheme = theme_classic())</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/cluster-analysis/023-dbscan-density-based-clustering-k-means-multishapes-1.png" width="336" /></p>
<div class="success">
<p>
We know there are 5 clusters in the data, but it can be seen that the k-means method identifies these 5 clusters inaccurately.
</p>
</div>
</div>
<div id="algorithm" class="section level2">
<h2>Algorithm</h2>
<p>The goal is to identify dense regions, which can be measured by the number of objects close to a given point.</p>
<p>Two important parameters are required for DBSCAN: <em>epsilon</em> (“eps”) and <em>minimum points</em> (“MinPts”). The parameter <em>eps</em> defines the radius of the neighborhood around a point x. It’s called the <span class="math inline">\(\epsilon\)</span>-neighborhood of x. The parameter <em>MinPts</em> is the minimum number of neighbors within the “eps” radius.</p>
<p>Any point x in the data set, with a neighbor count greater than or equal to <em>MinPts</em>, is marked as a <em>core point</em>. We say that x is a <em>border point</em> if the number of its neighbors is less than MinPts, but it belongs to the <span class="math inline">\(\epsilon\)</span>-neighborhood of some core point z. Finally, if a point is neither a core nor a border point, then it is called a noise point or an outlier.</p>
<p>The figure below shows the different types of points (core, border and outlier points) using MinPts = 6. Here x is a core point because <span class="math inline">\(neighbours_\epsilon(x) = 6\)</span>, y is a border point because <span class="math inline">\(neighbours_\epsilon(y) < MinPts\)</span>, but it belongs to the <span class="math inline">\(\epsilon\)</span>-neighborhood of the core point x. Finally, z is a noise point.</p>
<p><img src="https://www.sthda.com/english/sthda-upload/images/cluster-analysis/dbscan-principle.png" alt="DBSCAN principle" /></p>
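<p>To make these definitions concrete, here is a minimal base-R sketch (not part of the original article; the helper name <em>classify_points</em> is ours) that labels each point of a data matrix as core, border or noise for given <em>eps</em> and <em>MinPts</em>. Note that conventions differ between implementations on whether a point counts itself as a neighbor; here it is included:</p>
<pre class="r"><code>classify_points <- function(df, eps, MinPts) {
  d <- as.matrix(dist(df))
  # Neighbor counts within eps; each point counts itself here
  n_neighbors <- rowSums(d <= eps)
  is_core <- n_neighbors >= MinPts
  # Border points: not core, but inside the eps-neighborhood of a core point
  is_border <- !is_core &
    sapply(seq_len(nrow(d)), function(i) any(is_core & d[i, ] <= eps))
  ifelse(is_core, "core", ifelse(is_border, "border", "noise"))
}</code></pre>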
<p>We start by defining 3 terms, required for understanding the DBSCAN algorithm:</p>
<ul>
<li><em>Direct density reachable</em>: A point “A” is directly density reachable from another point “B” if: i) “A” is in the <span class="math inline">\(\epsilon\)</span>-neighborhood of “B” and ii) “B” is a core point.</li>
<li><em>Density reachable</em>: A point “A” is density reachable from “B” if there is a chain of core points leading from “B” to “A”.</li>
<li><em>Density connected</em>: Two points “A” and “B” are density connected if there is a core point “C”, such that both “A” and “B” are density reachable from “C”.</li>
</ul>
<p>A density-based cluster is defined as a group of density connected points. The density-based clustering (DBSCAN) algorithm works as follows:</p>
<div class="block">
<ol style="list-style-type: decimal">
<li>
<p>
For each point <span class="math inline"><em>x</em><sub><em>i</em></sub></span>, compute the distance between <span class="math inline"><em>x</em><sub><em>i</em></sub></span> and the other points. Find all neighbor points within distance <em>eps</em> of the starting point (<span class="math inline"><em>x</em><sub><em>i</em></sub></span>). Each point with a neighbor count greater than or equal to <em>MinPts</em> is marked as a <em>core point</em> and flagged as <em>visited</em>.
</p>
</li>
<li>
<p>
For each <em>core point</em>, if it’s not already assigned to a cluster, create a new cluster. Find recursively all its density connected points and assign them to the same cluster as the core point.
</p>
</li>
<li>
<p>
Iterate through the remaining unvisited points in the data set.
</p>
</li>
</ol>
<p>
Those points that do not belong to any cluster are treated as outliers or noise.
</p>
</div>
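<p>The three steps above can be sketched in a few lines of base R (an illustrative O(n²) toy implementation, not the <em>fpc</em> code; use the <em>fpc</em> or <em>dbscan</em> packages in practice):</p>
<pre class="r"><code>naive_dbscan <- function(df, eps, MinPts) {
  d <- as.matrix(dist(df))
  n <- nrow(d)
  labels <- rep(0L, n)                      # 0 = noise / unassigned
  # Step 1: core points have at least MinPts neighbors (self included)
  is_core <- rowSums(d <= eps) >= MinPts
  cluster_id <- 0L
  for (i in seq_len(n)) {
    if (!is_core[i] || labels[i] != 0L) next
    cluster_id <- cluster_id + 1L           # Step 2: start a new cluster
    queue <- i
    while (length(queue) > 0) {
      p <- queue[1]; queue <- queue[-1]
      if (labels[p] == 0L) {
        labels[p] <- cluster_id
        # Only core points expand the cluster; border points do not
        if (is_core[p]) queue <- c(queue, which(d[p, ] <= eps & labels == 0L))
      }
    }
  }
  labels                                    # Step 3: remaining 0s are noise
}</code></pre>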
</div>
<div id="advantages" class="section level2">
<h2>Advantages</h2>
<ol style="list-style-type: decimal">
<li>Unlike K-means, DBSCAN does not require the user to specify the number of clusters to be generated</li>
<li>DBSCAN can find any shape of clusters. The cluster doesn’t have to be circular.</li>
<li>DBSCAN can identify outliers</li>
</ol>
</div>
<div id="parameter-estimation" class="section level2">
<h2>Parameter estimation</h2>
<ul>
<li><p>MinPts: The larger the data set, the larger the value of MinPts should be. MinPts must be at least 3.</p></li>
<li><p><span class="math inline">\(\epsilon\)</span>: The value for <span class="math inline">\(\epsilon\)</span> can then be chosen by using a k-distance graph, plotting the distance to the k = MinPts nearest neighbor, sorted in ascending order. Good values of <span class="math inline">\(\epsilon\)</span> are where this plot shows a strong bend.</p></li>
</ul>
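<p>The k-distance used by this heuristic can be computed by hand (base-R sketch; the helper name <em>k_distance</em> is ours — the <em>dbscan</em> package provides <em>kNNdist</em>() and <em>kNNdistplot</em>() for the same purpose):</p>
<pre class="r"><code>k_distance <- function(df, k) {
  d <- as.matrix(dist(df))
  # Each sorted row starts with the zero self-distance,
  # so the k-th nearest neighbor sits at position k + 1
  apply(d, 1, function(row) sort(row)[k + 1])
}
# plot(sort(k_distance(df, 5)), type = "l") gives the k-distance curve;
# eps is read off at the knee</code></pre>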
</div>
<div id="computing-dbscan" class="section level2">
<h2>Computing DBSCAN</h2>
<p>Here, we’ll use the R package <em>fpc</em> to compute DBSCAN. It’s also possible to use the package <em>dbscan</em>, which provides a faster re-implementation of the DBSCAN algorithm compared to the fpc package.</p>
<p>We’ll also use the <em>factoextra</em> package for visualizing clusters.</p>
<p>First, install the packages as follows:</p>
<pre class="r"><code>install.packages("fpc")
install.packages("dbscan")
install.packages("factoextra")</code></pre>
<p>The R code below computes and visualizes DBSCAN using multishapes data set [factoextra R package]:</p>
<pre class="r"><code># Load the data 
data("multishapes", package = "factoextra")
df <- multishapes[, 1:2]
# Compute DBSCAN using fpc package
library("fpc")
set.seed(123)
db <- fpc::dbscan(df, eps = 0.15, MinPts = 5)
# Plot DBSCAN results
library("factoextra")
fviz_cluster(db, data = df, stand = FALSE,
             ellipse = FALSE, show.clust.cent = FALSE,
             geom = "point", palette = "jco", ggtheme = theme_classic())</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/cluster-analysis/023-dbscan-density-based-clustering-density-based-clustering-1.png" width="336" /></p>
<div class="notice">
<p>
Note that the function <em>fviz_cluster</em>() uses different point symbols for core points (i.e., seed points) and border points. Black points correspond to outliers. You can play with <em>eps</em> and <em>MinPts</em> to change the cluster configurations.
</p>
</div>
<div class="success">
<p>
It can be seen that DBSCAN performs better on these data and, unlike the k-means algorithm, identifies the correct set of clusters.
</p>
</div>
<p>The result of the <em>fpc::dbscan</em>() function can be displayed as follows:</p>
<pre class="r"><code>print(db)</code></pre>
<pre><code>## dbscan Pts=1100 MinPts=5 eps=0.15
##         0   1   2   3  4  5
## border 31  24   1   5  7  1
## seed    0 386 404  99 92 50
## total  31 410 405 104 99 51</code></pre>
<p>In the table above, the column names are cluster numbers. Cluster 0 corresponds to outliers (black points in the DBSCAN plot). The function <em>print.dbscan</em>() shows, for each cluster, the number of points that are seed points and border points.</p>
<pre class="r"><code># Cluster membership. Noise/outlier observations are coded as 0
# A random subset is shown
db$cluster[sample(1:1089, 20)]</code></pre>
<pre><code>##  [1] 1 3 2 4 3 1 2 4 2 2 2 2 2 2 1 4 1 1 1 0</code></pre>
<p>The DBSCAN algorithm requires users to specify the optimal <em>eps</em> value and the parameter <em>MinPts</em>. In the R code above, we used <em>eps = 0.15</em> and <em>MinPts = 5</em>. One limitation of DBSCAN is that it is sensitive to the choice of <span class="math inline">\(\epsilon\)</span>, in particular if clusters have different densities. If <span class="math inline">\(\epsilon\)</span> is too small, sparser clusters will be defined as noise. If <span class="math inline">\(\epsilon\)</span> is too large, denser clusters may be merged together. This implies that, if there are clusters with different local densities, then a single <span class="math inline">\(\epsilon\)</span> value may not suffice.</p>
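<p>This sensitivity can be checked empirically by refitting DBSCAN over a small grid of <em>eps</em> values (illustrative sketch; the grid values here are arbitrary and the resulting cluster counts depend on the data):</p>
<pre class="r"><code>library("fpc")
for (eps in c(0.05, 0.15, 0.5)) {
  db <- fpc::dbscan(df, eps = eps, MinPts = 5)
  n_clusters <- length(setdiff(unique(db$cluster), 0))  # cluster 0 = noise
  cat("eps =", eps, "->", n_clusters, "clusters\n")
}</code></pre>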
<p>A natural question is:</p>
<div class="block">
<p>
How to define the optimal value of <span class="math inline">\(\epsilon\)</span>?
</p>
</div>
</div>
<div id="method-for-determining-the-optimal-eps-value" class="section level2">
<h2>Method for determining the optimal eps value</h2>
<p>The method proposed here consists of computing the k-nearest neighbor distances for the points of a data matrix.</p>
<p>The idea is to calculate the average of the distances from every point to its k nearest neighbors. The value of k is specified by the user and corresponds to <em>MinPts</em>.</p>
<p>Next, these k-distances are plotted in ascending order. The aim is to determine the “knee”, which corresponds to the optimal <em>eps</em> parameter.</p>
<p>A knee corresponds to a threshold where a sharp change occurs along the k-distance curve.</p>
<p>The function <em>kNNdistplot</em>() [in <em>dbscan</em> package] can be used to draw the k-distance plot:</p>
<pre class="r"><code>dbscan::kNNdistplot(df, k =  5)
abline(h = 0.15, lty = 2)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/cluster-analysis/023-dbscan-density-based-clustering-k-nearest-neighbor-distance-1.png" width="384" /></p>
<div class="success">
<p>
It can be seen that the optimal <em>eps</em> value is around a distance of 0.15.
</p>
</div>
</div>
<div id="cluster-predictions-with-dbscan-algorithm" class="section level2">
<h2>Cluster predictions with DBSCAN algorithm</h2>
<p>The function <em>predict.dbscan(object, data, newdata)</em> [in <em>fpc</em> package] can be used to predict the clusters for the points in <em>newdata</em>. For more details, read the documentation (<em>?predict.dbscan</em>).</p>
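<p>A hedged usage sketch, following the argument list in <em>?predict.dbscan</em> (the new points below are hypothetical; <em>data</em> must be the matrix used for the original fit):</p>
<pre class="r"><code>library("fpc")
# db is the fpc::dbscan() result computed above, df the original data
newdata <- cbind(x = c(0, -1), y = c(0.5, 1))  # hypothetical new points
predict(db, data = df, newdata = newdata)      # predicted cluster labels</code></pre>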
</div>
</div><!--end rdoc-->
 
<!-- dynamically load mathjax for compatibility with self-contained -->
<script>
  (function () {
    var script = document.createElement("script");
    script.type = "text/javascript";
    script.src  = "https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML";
    document.getElementsByTagName("head")[0].appendChild(script);
  })();
</script>

<!-- END HTML -->]]></description>
			<pubDate>Thu, 07 Sep 2017 20:02:00 +0200</pubDate>
			
		</item>
		
		<item>
			<title><![CDATA[Model Based Clustering Essentials]]></title>
			<link>https://www.sthda.com/english/articles/30-advanced-clustering/104-model-based-clustering-essentials/</link>
			<guid>https://www.sthda.com/english/articles/30-advanced-clustering/104-model-based-clustering-essentials/</guid>
			<description><![CDATA[<!-- START HTML -->

  <div id="rdoc">
<p>The traditional clustering methods, such as hierarchical clustering (Chapter @ref(agglomerative-clustering)) and k-means clustering (Chapter @ref(kmeans-clustering)), are heuristic and are not based on formal models. Furthermore, the k-means algorithm is commonly initialized randomly, so different runs of k-means will often yield different results. Additionally, k-means requires the user to specify the optimal number of clusters.</p>
<p>An alternative is <strong>model-based clustering</strong>, which considers the data as coming from a distribution that is a mixture of two or more clusters <span class="citation">(Fraley and Raftery 2002, <span class="citation">Fraley et al. (2012)</span>)</span>. Unlike k-means, model-based clustering uses a soft assignment, where each data point has a probability of belonging to each cluster.</p>
<div class="block">
<p>
In this chapter, we illustrate model-based clustering using the R package <em>mclust</em>.
</p>
</div>
<br/>
<p>Contents:</p>
<div id="TOC">
<ul>
<li><a href="#concept-of-model-based-clustering">Concept of model-based clustering</a></li>
<li><a href="#estimating-model-parameters">Estimating model parameters</a></li>
<li><a href="#choosing-the-best-model">Choosing the best model</a></li>
<li><a href="#computing-model-based-clustering-in-r">Computing model-based clustering in R</a></li>
<li><a href="#visualizing-model-based-clustering">Visualizing model-based clustering</a></li>
<li><a href="#references">References</a></li>
</ul>
</div>
<br/>
<p>Related Book:</p>
<div class = "small-block content-privileged-friends cluster-book">
    <center>
        <a href = "https://www.sthda.com/english/web/5-bookadvisor/17-practical-guide-to-cluster-analysis-in-r/">
          <img src = "https://www.sthda.com/english/sthda-upload/images/cluster-analysis/clustering-book-cover.png" /><br/>
      Practical Guide to Cluster Analysis in R
      </a>
      </center>
</div>
<div class="spacer"></div>
<div id="concept-of-model-based-clustering" class="section level2">
<h2>Concept of model-based clustering</h2>
<p>In model-based clustering, the data are considered as coming from a mixture of distributions.</p>
<p>Each component (i.e. cluster) k is modeled by the normal or Gaussian distribution, which is characterized by the following parameters:</p>
<ul>
<li><span class="math inline">\(\mu_k\)</span>: mean vector,</li>
<li><span class="math inline">\(\Sigma_k\)</span>: covariance matrix,</li>
<li>An associated probability in the mixture. Each point has a probability of belonging to each cluster.</li>
</ul>
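<p>Putting these ingredients together (a standard formulation, added here for clarity; <span class="math inline">\(\pi_k\)</span> denotes the mixing proportions), the density of a mixture with K components is:</p>
<p><span class="math display">\[f(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x;\, \mu_k, \Sigma_k), \qquad \sum_{k=1}^{K} \pi_k = 1\]</span></p>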
<p>For example, consider the “old faithful geyser data” [in the MASS R package], which can be illustrated as follows using the ggpubr R package:</p>
<pre class="r"><code># Load the data
library("MASS")
data("geyser")
# Scatter plot
library("ggpubr")
ggscatter(geyser, x = "duration", y = "waiting")+
  geom_density2d() # Add 2D density</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/cluster-analysis/022-model-based-clustering-scatter-plot-1.png" width="336" /></p>
<p>The plot above suggests at least 3 clusters in the mixture. The shape of each of the 3 clusters appears to be approximately elliptical, suggesting three bivariate normal distributions. As the 3 ellipses seem to be similar in terms of volume, shape and orientation, we might anticipate that the three components of this mixture have homogeneous covariance matrices.</p>
</div>
<div id="estimating-model-parameters" class="section level2">
<h2>Estimating model parameters</h2>
<p>The model parameters can be estimated using the <em>Expectation-Maximization</em> (EM) algorithm initialized by hierarchical model-based clustering. Each cluster k is centered at the means <span class="math inline">\(\mu_k\)</span>, with increased density for points near the mean.</p>
<p>Geometric features (shape, volume, orientation) of each cluster are determined by the covariance matrix <span class="math inline">\(\Sigma_k\)</span>.</p>
<p>Different possible parameterizations of <span class="math inline">\(\Sigma_k\)</span> are available in the R package <em>mclust</em> (see <em>?mclustModelNames</em>).</p>
<p>The available model options, in <em>mclust</em> package, are represented by identifiers including: EII, VII, EEI, VEI, EVI, VVI, EEE, EEV, VEV and VVV.</p>
<p>The first identifier refers to volume, the second to shape and the third to orientation. E stands for “equal”, V for “variable” and I for “coordinate axes”.</p>
<p>For example:</p>
<ul>
<li>EVI denotes a model in which the volumes of all clusters are equal (E), the shapes of the clusters may vary (V), and the orientation is the identity (I), i.e. aligned with the coordinate axes.</li>
<li>EEE means that the clusters have the same volume, shape and orientation in p-dimensional space.</li>
<li>VEI means that the clusters have variable volume, the same shape and orientation equal to coordinate axes.</li>
</ul>
</div>
<div id="choosing-the-best-model" class="section level2">
<h2>Choosing the best model</h2>
<p>The <em>mclust</em> package uses maximum likelihood to fit all these models, with different covariance matrix parameterizations, for a range of k components.</p>
<p>The best model is selected using the Bayesian Information Criterion or <em>BIC</em>. A large BIC score indicates strong evidence for the corresponding model.</p>
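<p>For instance (sketch; <em>df</em> stands for a standardized data matrix such as the one prepared in the next section), the BIC values themselves can be computed and inspected with <em>mclustBIC</em>():</p>
<pre class="r"><code>library("mclust")
bic <- mclustBIC(df)   # BIC for all model identifiers over a range of components
summary(bic)           # top models ranked by BIC
plot(bic)              # BIC curves, one per model identifier</code></pre>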
</div>
<div id="computing-model-based-clustering-in-r" class="section level2">
<h2>Computing model-based clustering in R</h2>
<p>We start by installing the <em>mclust</em> package as follows: <em>install.packages(“mclust”)</em></p>
<div class="notice">
<p>
Note that model-based clustering can be applied to univariate or multivariate data.
</p>
</div>
<p>Here, we illustrate model-based clustering on the diabetes data set [mclust package], which gives three measurements and the diagnosis for 145 subjects, described as follows:</p>
<pre class="r"><code>library("mclust")
data("diabetes")
head(diabetes, 3)</code></pre>
<pre><code>##    class glucose insulin sspg
## 1 Normal      80     356  124
## 2 Normal      97     289  117
## 3 Normal     105     319  143</code></pre>
<ul>
<li>class: the diagnosis: normal, chemically diabetic, and overtly diabetic. Excluded from the cluster analysis.</li>
<li>glucose: plasma glucose response to oral glucose</li>
<li>insulin: plasma insulin response to oral glucose</li>
<li>sspg: steady-state plasma glucose (measures insulin resistance)</li>
</ul>
<p>Model-based clustering can be computed using the function Mclust() as follows:</p>
<pre class="r"><code>library(mclust)
df <- scale(diabetes[, -1]) # Standardize the data
mc <- Mclust(df)            # Model-based-clustering
summary(mc)                 # Print a summary</code></pre>
<pre><code>## ----------------------------------------------------
## Gaussian finite mixture model fitted by EM algorithm 
## ----------------------------------------------------
## 
## Mclust VVV (ellipsoidal, varying volume, shape, and orientation) model with 3 components:
## 
##  log.likelihood   n df  BIC  ICL
##            -169 145 29 -483 -501
## 
## Clustering table:
##  1  2  3 
## 81 36 28</code></pre>
<p>For these data, it can be seen that model-based clustering selected a model with three components (i.e. clusters). The optimal selected model is the VVV model; that is, the three components are ellipsoidal with varying volume, shape, and orientation. The summary also contains the clustering table, specifying the number of observations in each cluster.</p>
<p>You can access the results as follows:</p>
<pre class="r"><code>mc$modelName                # Optimal selected model => "VVV"
mc$G                        # Optimal number of clusters => 3
head(mc$z, 30)              # Probability of belonging to a given cluster
head(mc$classification, 30) # Cluster assignment of each observation</code></pre>
</div>
<div id="visualizing-model-based-clustering" class="section level2">
<h2>Visualizing model-based clustering</h2>
<p>Model-based clustering results can be drawn using the base function plot.Mclust() [in mclust package]. Here, we’ll use the function <em>fviz_mclust</em>() [in <em>factoextra</em> package] to create beautiful ggplot2-based plots.</p>
<p>In the situation where the data contain more than two variables, <em>fviz_mclust</em>() uses principal component analysis to reduce the dimensionality of the data. The first two principal components are used to produce a scatter plot of the data. However, if you want to plot the data using only two variables of interest, say c(“insulin”, “sspg”), you can specify them in the <em>fviz_mclust</em>() function using the argument <em>choose.vars = c(“insulin”, “sspg”)</em>.</p>
<pre class="r"><code>library(factoextra)
# BIC values used for choosing the number of clusters
fviz_mclust(mc, "BIC", palette = "jco")
# Classification: plot showing the clustering
fviz_mclust(mc, "classification", geom = "point", 
            pointsize = 1.5, palette = "jco")
# Classification uncertainty
fviz_mclust(mc, "uncertainty", palette = "jco")</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/cluster-analysis/022-model-based-clustering-model-base-clustering-1.png" width="307.2" /><img src="https://www.sthda.com/english/sthda-upload/figures/cluster-analysis/022-model-based-clustering-model-base-clustering-2.png" width="307.2" /><img src="https://www.sthda.com/english/sthda-upload/figures/cluster-analysis/022-model-based-clustering-model-base-clustering-3.png" width="307.2" /></p>
<p>Note that, in the uncertainty plot, larger symbols indicate the more uncertain observations.</p>
</div>
<div id="references" class="section level2 unnumbered">
<h2>References</h2>
<div id="refs" class="references">
<div id="ref-fraley2002">
<p>Fraley, Chris, and Adrian E Raftery. 2002. “Model-Based Clustering, Discriminant Analysis, and Density Estimation.” <em>Journal of the American Statistical Association</em> 97 (458): 611–31.</p>
</div>
<div id="ref-fraley2012">
<p>Fraley, Chris, Adrian E. Raftery, T. Brendan Murphy, and Luca Scrucca. 2012. “Mclust Version 4 for R: Normal Mixture Modeling for Model-Based Clustering, Classification, and Density Estimation.” <em>Technical Report No. 597, Department of Statistics, University of Washington</em>. <a href="https://www.stat.washington.edu/research/reports/2012/tr597.pdf" class="uri">https://www.stat.washington.edu/research/reports/2012/tr597.pdf</a>.</p>
</div>
</div>
</div>
</div><!--end rdoc-->
 
<!-- dynamically load mathjax for compatibility with self-contained -->
<script>
  (function () {
    var script = document.createElement("script");
    script.type = "text/javascript";
    script.src  = "https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML";
    document.getElementsByTagName("head")[0].appendChild(script);
  })();
</script>

<!-- END HTML -->]]></description>
			<pubDate>Thu, 07 Sep 2017 19:46:00 +0200</pubDate>
			
		</item>
		
		<item>
			<title><![CDATA[cmeans() R function: Compute Fuzzy clustering]]></title>
			<link>https://www.sthda.com/english/articles/30-advanced-clustering/103-cmeans-r-function-compute-fuzzy-clustering/</link>
			<guid>https://www.sthda.com/english/articles/30-advanced-clustering/103-cmeans-r-function-compute-fuzzy-clustering/</guid>
			<description><![CDATA[<!-- START HTML -->

  <div id="rdoc">
<p>This article describes how to compute the <strong>fuzzy clustering</strong> using the function <strong>cmeans</strong>() [in <em>e1071</em> R package]. Previously, we explained <a href="https://www.sthda.com/english/articles/30-advanced-clustering/101-fuzzy-clustering-essentials/">what fuzzy clustering is</a> and how to compute it using the R function fanny() [in cluster package].</p>
<p>Related articles:</p>
<ul>
<li><a href="https://www.sthda.com/english/articles/30-advanced-clustering/101-fuzzy-clustering-essentials/">Fuzzy Clustering Essentials</a></li>
<li><a href="https://www.sthda.com/english/articles/30-advanced-clustering/102-fuzzy-c-means-clustering-algorithm/">Fuzzy C-Means Clustering Algorithm</a></li>
</ul>
<div id="cmeans-format" class="section level2">
<h2>cmeans() format</h2>
<p>The simplified format of the function <strong>cmeans</strong>() is as follows:</p>
<pre class="r"><code>cmeans(x, centers, iter.max = 100, dist = "euclidean", m = 2)</code></pre>
<div class="block">
<ul>
<li>
x: a data matrix where columns are variables and rows are observations
</li>
<li>
centers: Number of clusters or initial values for cluster centers
</li>
<li>
iter.max: Maximum number of iterations
</li>
<li>
dist: Possible values are “euclidean” or “manhattan”
</li>
<li>
m: A number greater than 1 giving the degree of fuzzification.
</li>
</ul>
</div>
<p>The function cmeans() returns an object of class fclust, which is a list containing the following components:</p>
<ul>
<li>centers: the final cluster centers</li>
<li>size: the number of data points in each cluster of the closest hard clustering</li>
<li>cluster: a vector of integers containing the indices of the clusters to which the data points are assigned for the closest hard clustering, as obtained by assigning points to the (first) class with maximal membership.</li>
<li>iter: the number of iterations performed</li>
<li>membership: a matrix with the membership values of the data points to the clusters</li>
<li>withinerror: the value of the objective function</li>
</ul>
</div>
<div id="compute-fuzzy-c-means-clustering" class="section level2">
<h2>Compute fuzzy c-means clustering</h2>
<pre class="r"><code>set.seed(123)
# Load the data
data("USArrests")
# Subset of USArrests
ss <- sample(1:50, 20)
df <- scale(USArrests[ss,])
# Compute fuzzy clustering
library(e1071)
cm <- cmeans(df, 4)
cm</code></pre>
<pre><code>## Fuzzy c-means clustering with 4 clusters
## 
## Cluster centers:
##   Murder Assault UrbanPop   Rape
## 1  0.857   0.338   -0.729  0.200
## 2 -0.731  -0.665    1.003 -0.333
## 3 -1.210  -1.248   -0.728 -1.153
## 4  0.629   0.970    0.501  0.865
## 
## Memberships:
##                    1      2      3       4
## Iowa         0.00916 0.0191 0.9658 0.00594
## Rhode Island 0.09885 0.5915 0.2050 0.10463
## Maryland     0.22786 0.0475 0.0273 0.69731
## Tennessee    0.87231 0.0286 0.0211 0.07801
## Utah         0.04446 0.8218 0.0844 0.04929
## Arizona      0.11876 0.1008 0.0399 0.74056
## Mississippi  0.62441 0.0931 0.1030 0.17952
## Wisconsin    0.03363 0.1110 0.8313 0.02403
## Virginia     0.39552 0.2570 0.1918 0.15573
## Maine        0.03433 0.0530 0.8915 0.02117
## Texas        0.24082 0.1595 0.0541 0.54557
## Louisiana    0.61799 0.0653 0.0419 0.27473
## Montana      0.13551 0.1366 0.6657 0.06215
## Michigan     0.09620 0.0371 0.0178 0.84890
## Arkansas     0.56529 0.1223 0.1805 0.13188
## New York     0.13194 0.1323 0.0416 0.69421
## Florida      0.17377 0.0749 0.0398 0.71155
## Alaska       0.38155 0.1354 0.1136 0.36947
## Hawaii       0.06662 0.7206 0.1487 0.06410
## New Jersey   0.05957 0.8009 0.0575 0.08206
## 
## Closest hard clustering:
##         Iowa Rhode Island     Maryland    Tennessee         Utah 
##            3            2            4            1            2 
##      Arizona  Mississippi    Wisconsin     Virginia        Maine 
##            4            1            3            1            3 
##        Texas    Louisiana      Montana     Michigan     Arkansas 
##            4            1            3            4            1 
##     New York      Florida       Alaska       Hawaii   New Jersey 
##            4            4            1            2            2 
## 
## Available components:
## [1] "centers"     "size"        "cluster"     "membership"  "iter"       
## [6] "withinerror" "call"</code></pre>
<p>The different components can be extracted using the code below:</p>
<pre class="r"><code># Membership coefficient
head(cm$membership)</code></pre>
<pre><code>##                    1      2      3       4
## Iowa         0.00916 0.0191 0.9658 0.00594
## Rhode Island 0.09885 0.5915 0.2050 0.10463
## Maryland     0.22786 0.0475 0.0273 0.69731
## Tennessee    0.87231 0.0286 0.0211 0.07801
## Utah         0.04446 0.8218 0.0844 0.04929
## Arizona      0.11876 0.1008 0.0399 0.74056</code></pre>
<pre class="r"><code># Visualize using corrplot
library(corrplot)
corrplot(cm$membership, is.corr = FALSE)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/cluster-analysis/055-cmeans-c-means-clustering-1.png" width="384" /></p>
<pre class="r"><code># Observation groups/clusters
cm$cluster</code></pre>
<pre><code>##         Iowa Rhode Island     Maryland    Tennessee         Utah 
##            3            2            4            1            2 
##      Arizona  Mississippi    Wisconsin     Virginia        Maine 
##            4            1            3            1            3 
##        Texas    Louisiana      Montana     Michigan     Arkansas 
##            4            1            3            4            1 
##     New York      Florida       Alaska       Hawaii   New Jersey 
##            4            4            1            2            2</code></pre>
</div>
<div id="visualize-clusters" class="section level2">
<h2>Visualize clusters</h2>
<pre class="r"><code>library(factoextra)
fviz_cluster(list(data = df, cluster = cm$cluster), 
             ellipse.type = "norm",
             ellipse.level = 0.68,
             palette = "jco",
             ggtheme = theme_minimal())</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/cluster-analysis/055-cmeans-visualize-c-means-clusters-1.png" width="480" /></p>
</div>
</div>
</div><!--end rdoc-->

<!-- END HTML -->]]></description>
			<pubDate>Thu, 07 Sep 2017 19:26:00 +0200</pubDate>
			
		</item>
		
		<item>
			<title><![CDATA[Fuzzy C-Means Clustering Algorithm]]></title>
			<link>https://www.sthda.com/english/articles/30-advanced-clustering/102-fuzzy-c-means-clustering-algorithm/</link>
			<guid>https://www.sthda.com/english/articles/30-advanced-clustering/102-fuzzy-c-means-clustering-algorithm/</guid>
			<description><![CDATA[<!-- START HTML -->

  <div id="rdoc">

<p>In our previous article, we described the basic concept of <a href="https://www.sthda.com/english/articles/30-advanced-clustering/101-fuzzy-clustering-essentials/"><strong>fuzzy clustering</strong></a> and showed how to compute it. In the current article, we present the <strong>fuzzy c-means clustering algorithm</strong>, which is very similar to the <a href="https://www.sthda.com/english/articles/27-partitioning-clustering-essentials/87-k-means-clustering-essentials/">k-means algorithm</a>. The aim is to minimize the objective function defined as follows:</p>
<p>
<span class="math"><span class="math display">\[
\sum\limits_{j=1}^k \sum\limits_{x_i \in C_j} u_{ij}^m (x_i - \mu_j)^2
\]</span></span>
</p>
<p>
Where,
</p>
<ul>
<li>
<span class="math"><span class="math inline">\(u_{ij}\)</span></span> is the degree to which an observation <span class="math"><span class="math inline">\(x_i\)</span></span> belongs to a cluster <span class="math"><span class="math inline">\(c_j\)</span></span>
</li>
<li>
<span class="math"><span class="math inline">\(\mu_j\)</span></span> is the center of the cluster j
</li>
<li>
<span class="math"><span class="math inline">\(m\)</span></span> is the fuzzifier.
</li>
</ul>
<p>
<span class="notice">FCM thus differs from k-means by its use of the membership values <span class="math"><span class="math inline">\(u_{ij}\)</span></span> and the fuzzifier <span class="math"><span class="math inline">\(m\)</span></span>.</span>
</p>
<p>
The membership degree <span class="math"><span class="math inline">\(u_{ij}\)</span></span> is defined as follows:
</p>
<p>
<span class="math"><span class="math display">\[
u_{ij} = \frac{1}{\sum\limits_{l=1}^k \left( \frac{| x_i - c_j |}{| x_i - c_l |}\right)^{\frac{2}{m-1}}}
\]</span></span>
</p>
<p>
The degree of belonging, <span class="math"><span class="math inline">\(u_{ij}\)</span></span>, is inversely related to the distance from <span class="math"><span class="math inline">\(x_i\)</span></span> to the center of cluster j: the closer an observation is to a cluster center, the higher its membership in that cluster.
</p>
<p>
The parameter <span class="math"><span class="math inline">\(m\)</span></span> is a real number greater than 1 (<span class="math"><span class="math inline">\(1.0 < m < \infty\)</span></span>) that defines the level of cluster fuzziness. A value of <span class="math"><span class="math inline">\(m\)</span></span> close to 1 gives a cluster solution increasingly similar to hard clustering, such as k-means, whereas a value of <span class="math"><span class="math inline">\(m\)</span></span> approaching infinity leads to complete fuzziness.
</p>
<p>
<span class="success">A commonly recommended choice is <strong>m = 2.0</strong> (Hathaway and Bezdek 2001).</span>
</p>
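<p>
To see the effect of the fuzzifier in practice, the sketch below contrasts a small and a large value of <span class="math"><span class="math inline">\(m\)</span></span> using the <em>cmeans</em>() function [<em>e1071</em> R package]. This is an illustrative sketch, assuming the <em>e1071</em> package is installed:
</p>
<pre class="r"><code># Effect of the fuzzifier m (illustrative sketch; requires the e1071 package)
library(e1071)
set.seed(123)
df <- scale(USArrests)

cm1 <- cmeans(df, centers = 2, m = 1.1)  # m close to 1: nearly hard clustering
cm3 <- cmeans(df, centers = 2, m = 3)    # large m: much fuzzier memberships

# Maximum membership per observation: close to 1 when m is near 1,
# drifting toward 1/k = 0.5 as m grows
summary(apply(cm1$membership, 1, max))
summary(apply(cm3$membership, 1, max))</code></pre>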
<p>
In <strong>fuzzy clustering</strong>, the centroid of a cluster is the mean of all points, weighted by their degree of belonging to the cluster:
</p>
<p>
<span class="math"><span class="math display">\[
C_j = \frac{\sum\limits_{x \in C_j} u_{ij}^m x}{\sum\limits_{x \in C_j} u_{ij}^m}
\]</span></span>
</p>
<p>
Where,
</p>
<ul>
<li>
<span class="math"><span class="math inline">\(C_j\)</span></span> is the centroid of the cluster j
</li>
<li>
<span class="math"><span class="math inline">\(u_{ij}\)</span></span> is the degree to which an observation <span class="math"><span class="math inline">\(x_i\)</span></span> belongs to a cluster <span class="math"><span class="math inline">\(c_j\)</span></span>
</li>
</ul>
<p>
The fuzzy clustering algorithm can be summarized as follows:
</p>
<ol style="list-style-type: decimal">
<li>
Specify the number of clusters k (chosen by the analyst).
</li>
<li>
Randomly assign to each point a set of coefficients for belonging to the clusters.
</li>
<li>
Repeat until the maximum number of iterations (given by “maxit”) is reached, or when the algorithm has converged (that is, the coefficients’ change between two iterations is no more than <span class="math"><span class="math inline">\(\epsilon\)</span></span>, the given sensitivity threshold):
<ul>
<li>
Compute the centroid for each cluster, using the formula above.
</li>
<li>
For each point, compute its coefficients of being in the clusters, using the formula above.
</li>
</ul>
</li>
</ol>
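<p>
To make the two update formulas concrete, here is a minimal, self-contained R sketch of the iteration above. It is an illustration only, not the implementation used by dedicated packages:
</p>
<pre class="r"><code># Minimal fuzzy c-means iteration (illustrative sketch)
set.seed(123)
x <- scale(USArrests)   # data matrix (n x p)
k <- 2                  # number of clusters
m <- 2                  # fuzzifier

# Step 2: random initial membership coefficients (rows sum to 1)
u <- matrix(runif(nrow(x) * k), nrow(x), k)
u <- u / rowSums(u)

for (iter in 1:100) {   # step 3: iterate until convergence or maxit
  # Centroid update: mean of all points, weighted by u^m
  centers <- t(u^m) %*% x / colSums(u^m)
  # Euclidean distance of each point to each center
  d <- sqrt(sapply(1:k, function(j) rowSums(sweep(x, 2, centers[j, ])^2)))
  d <- pmax(d, 1e-10)   # guard against division by zero
  # Membership update: u_ij = 1 / sum_l (d_ij / d_il)^(2/(m-1))
  u.new <- 1 / (d^(2 / (m - 1)) * rowSums(d^(-2 / (m - 1))))
  if (max(abs(u.new - u)) < 1e-6) break  # sensitivity threshold epsilon
  u <- u.new
}
head(rowSums(u), 3)  # memberships of each observation sum to 1 (up to rounding)</code></pre>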
<p>
The algorithm minimizes intra-cluster variance as well, but has the same problems as k-means; the minimum is a local minimum, and the results depend on the initial choice of weights. Hence, different initializations may lead to different results.
</p>
<p>
Using a mixture of Gaussians together with the expectation-maximization algorithm is a more statistically formalized method that incorporates some of these ideas, notably partial membership in classes.
</p>
</div>


</div><!--end rdoc-->

 
<!-- dynamically load mathjax for compatibility with self-contained -->
<script>
  (function () {
    var script = document.createElement("script");
    script.type = "text/javascript";
    script.src  = "https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML";
    document.getElementsByTagName("head")[0].appendChild(script);
  })();
</script>

<!-- END HTML -->]]></description>
			<pubDate>Thu, 07 Sep 2017 16:44:00 +0200</pubDate>
			
		</item>
		
		<item>
			<title><![CDATA[Fuzzy Clustering Essentials]]></title>
			<link>https://www.sthda.com/english/articles/30-advanced-clustering/101-fuzzy-clustering-essentials/</link>
			<guid>https://www.sthda.com/english/articles/30-advanced-clustering/101-fuzzy-clustering-essentials/</guid>
			<description><![CDATA[<!-- START HTML -->

  <div id="rdoc">
<p><strong>Fuzzy clustering</strong> is considered a soft clustering method, in which each element has a probability of belonging to each cluster. In other words, each element has a set of membership coefficients corresponding to its degree of belonging to each cluster.</p>
<p>This is different from k-means and k-medoids clustering, where each object is assigned to exactly one cluster. K-means and k-medoids clustering are therefore known as hard, or non-fuzzy, clustering.</p>
<p>In fuzzy clustering, points close to the center of a cluster may belong to it to a higher degree than points on the edge of the cluster. The degree to which an element belongs to a given cluster is a numerical value varying from 0 to 1.</p>
<p>The <strong>fuzzy c-means</strong> (FCM) algorithm is one of the most widely used fuzzy clustering algorithms. The centroid of a cluster is calculated as the mean of all points, weighted by their degree of belonging to the cluster.</p>
<div class="block">
<p>
In this article, we’ll describe how to compute fuzzy clustering using the R software.
</p>
</div>
<br/>
<p>Related Book:</p>
<div class = "small-block content-privileged-friends cluster-book">
    <center>
        <a href = "https://www.sthda.com/english/web/5-bookadvisor/17-practical-guide-to-cluster-analysis-in-r/">
          <img src = "https://www.sthda.com/english/sthda-upload/images/cluster-analysis/clustering-book-cover.png" /><br/>
      Practical Guide to Cluster Analysis in R
      </a>
      </center>
</div>
<div class="spacer"></div>
<div id="required-r-packages" class="section level2">
<h2>Required R packages</h2>
<p>We’ll use the following R packages: 1) <em>cluster</em> for computing fuzzy clustering and 2) <em>factoextra</em> for visualizing clusters.</p>
</div>
<div id="computing-fuzzy-clustering" class="section level2">
<h2>Computing fuzzy clustering</h2>
<p>The function <em>fanny</em>() [<em>cluster</em> R package] can be used to compute fuzzy clustering. <strong>FANNY</strong> stands for <strong>fuzzy analysis clustering</strong>. A simplified format is:</p>
<pre class="r"><code>fanny(x, k, metric = "euclidean", stand = FALSE)</code></pre>
<div class="block">
<ul>
<li>
<strong>x</strong>: A data matrix or data frame or dissimilarity matrix
</li>
<li>
<strong>k</strong>: The desired number of clusters to be generated
</li>
<li>
<strong>metric</strong>: Metric for calculating dissimilarities between observations
</li>
<li>
<strong>stand</strong>: If TRUE, variables are standardized before calculating the dissimilarities
</li>
</ul>
</div>
<p>The function <em>fanny</em>() returns an object including the following components:</p>
<ul>
<li><strong>membership</strong>: matrix containing the degree to which each observation belongs to a given cluster. Column names are the clusters and rows are observations</li>
<li><strong>coeff</strong>: Dunn’s partition coefficient F(k) of the clustering, where k is the number of clusters. F(k) is the sum of all squared membership coefficients, divided by the number of observations. Its value is between 1/k and 1. The normalized form of the coefficient is also given. It is defined as <span class="math inline">\((F(k) - 1/k) / (1 - 1/k)\)</span>, and ranges between 0 and 1. A low value of Dunn’s coefficient indicates a very fuzzy clustering, whereas a value close to 1 indicates a near-crisp clustering.</li>
<li><strong>clustering</strong>: the clustering vector containing the nearest crisp grouping of observations</li>
</ul>
<p>For example, the R code below applies fuzzy clustering on the USArrests data set:</p>
<pre class="r"><code>library(cluster)
df <- scale(USArrests)     # Standardize the data
res.fanny <- fanny(df, 2)  # Compute fuzzy clustering with k = 2</code></pre>
<p>The different components can be extracted using the code below:</p>
<pre class="r"><code>head(res.fanny$membership, 3) # Membership coefficients</code></pre>
<pre><code>##          [,1]  [,2]
## Alabama 0.664 0.336
## Alaska  0.610 0.390
## Arizona 0.686 0.314</code></pre>
<pre class="r"><code>res.fanny$coeff # Dunn's partition coefficient</code></pre>
<pre><code>## dunn_coeff normalized 
##      0.555      0.109</code></pre>
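<p>
As a quick sanity check, Dunn’s coefficient can be recomputed by hand from the membership matrix, using the definition given above:
</p>
<pre class="r"><code>library(cluster)
df <- scale(USArrests)
res.fanny <- fanny(df, 2)

k  <- 2
Fk <- sum(res.fanny$membership^2) / nrow(df)   # sum of squared memberships / n
c(dunn_coeff = Fk, normalized = (Fk - 1/k) / (1 - 1/k))</code></pre>
<p>
Both values should agree with the output of <em>res.fanny$coeff</em>.
</p>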
<pre class="r"><code>head(res.fanny$clustering) # Observation groups</code></pre>
<pre><code>##    Alabama     Alaska    Arizona   Arkansas California   Colorado 
##          1          1          1          2          1          1</code></pre>
<p>To visualize observation groups, use the function <em>fviz_cluster</em>() [<em>factoextra</em> package]:</p>
<pre class="r"><code>library(factoextra)
fviz_cluster(res.fanny, ellipse.type = "norm", repel = TRUE,
             palette = "jco", ggtheme = theme_minimal(),
             legend = "right")</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/cluster-analysis/021-fuzzy-clustering-visualize-1.png" width="518.4" /></p>
<p>To evaluate the goodness of the clustering results, plot the silhouette coefficient as follows:</p>
<pre class="r"><code>fviz_silhouette(res.fanny, palette = "jco",
                ggtheme = theme_minimal())</code></pre>
<pre><code>##   cluster size ave.sil.width
## 1       1   22          0.32
## 2       2   28          0.44</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/cluster-analysis/021-fuzzy-clustering-silhouette-1.png" width="518.4" /></p>
</div>
<div id="summary" class="section level2">
<h2>Summary</h2>
<p>Fuzzy clustering is an alternative to k-means clustering, where each data point has membership coefficient to each cluster. Here, we demonstrated how to compute and visualize fuzzy clustering using the combination of <em>cluster</em> and <em>factoextra</em> R packages.</p>
</div>
</div><!--end rdoc-->
 
<!-- dynamically load mathjax for compatibility with self-contained -->
<script>
  (function () {
    var script = document.createElement("script");
    script.type = "text/javascript";
    script.src  = "https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML";
    document.getElementsByTagName("head")[0].appendChild(script);
  })();
</script>

<!-- END HTML -->]]></description>
			<pubDate>Thu, 07 Sep 2017 15:50:00 +0200</pubDate>
			
		</item>
		
		<item>
			<title><![CDATA[Hierarchical K-Means Clustering: Optimize Clusters]]></title>
			<link>https://www.sthda.com/english/articles/30-advanced-clustering/100-hierarchical-k-means-clustering-optimize-clusters/</link>
			<guid>https://www.sthda.com/english/articles/30-advanced-clustering/100-hierarchical-k-means-clustering-optimize-clusters/</guid>
			<description><![CDATA[<!-- START HTML -->

  <div id="rdoc">
<div id="hkmeans" class="section level1">
<h1>Hierarchical K-Means Clustering</h1>
<p>K-means (Chapter @ref(kmeans-clustering)) is one of the most popular clustering algorithms. However, it has some limitations: it requires the user to specify the number of clusters in advance, and it selects the initial centroids randomly. The final k-means solution is very sensitive to this initial random selection of cluster centers, so the result might be (slightly) different each time you compute k-means.</p>
<div class="block">
<p>
In this chapter, we describe a hybrid method, named <strong>hierarchical k-means clustering</strong> (hkmeans), for improving k-means results.
</p>
</div>
<br/>
<p>Related Book:</p>
<div class = "small-block content-privileged-friends cluster-book">
    <center>
        <a href = "https://www.sthda.com/english/web/5-bookadvisor/17-practical-guide-to-cluster-analysis-in-r/">
          <img src = "https://www.sthda.com/english/sthda-upload/images/cluster-analysis/clustering-book-cover.png" /><br/>
      Practical Guide to Cluster Analysis in R
      </a>
      </center>
</div>
<div class="spacer"></div>
<div id="algorithm" class="section level2">
<h2>Algorithm</h2>
<p>The algorithm is summarized as follows:</p>
<ol style="list-style-type: decimal">
<li>Compute hierarchical clustering and cut the tree into k-clusters</li>
<li>Compute the center (i.e. the mean) of each cluster</li>
<li>Compute k-means by using the set of cluster centers (defined in step 2) as the initial cluster centers</li>
</ol>
<div class="notice">
<p>
Note that the k-means step will improve the initial partitioning generated at step 2 of the algorithm. Hence, the initial partitioning can be slightly different from the final partitioning obtained at step 3.
</p>
</div>
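<p>
The three steps can also be sketched directly with base R functions. This is a minimal illustration of the idea, assuming Ward linkage for the hierarchical step:
</p>
<pre class="r"><code># Manual sketch of hierarchical k-means (steps 1-3 above)
df <- scale(USArrests)
k  <- 4

# Step 1: hierarchical clustering, tree cut into k clusters
res.hc <- hclust(dist(df), method = "ward.D2")
grp    <- cutree(res.hc, k = k)

# Step 2: the mean of each hierarchical cluster
centers <- apply(df, 2, function(v) tapply(v, grp, mean))

# Step 3: k-means seeded with those centers
res.km <- kmeans(df, centers = centers)
table(initial = grp, final = res.km$cluster)</code></pre>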
</div>
<div id="r-code" class="section level2">
<h2>R code</h2>
<p>The R function <em>hkmeans</em>() [in <em>factoextra</em>] provides an easy solution for computing hierarchical k-means clustering. The format of the result is similar to that of the standard kmeans() function (see Chapter @ref(kmeans-clustering)).</p>
<p>To install factoextra, type this: <em>install.packages(“factoextra”)</em>.</p>
<p>We’ll use the USArrest data set and we start by standardizing the data:</p>
<pre class="r"><code>df <- scale(USArrests)</code></pre>
<pre class="r"><code># Compute hierarchical k-means clustering
library(factoextra)
res.hk <- hkmeans(df, 4)
# Elements returned by hkmeans()
names(res.hk)</code></pre>
<pre><code>##  [1] "cluster"      "centers"      "totss"        "withinss"    
##  [5] "tot.withinss" "betweenss"    "size"         "iter"        
##  [9] "ifault"       "data"         "hclust"</code></pre>
<p>To print all the results, type this:</p>
<pre class="r"><code># Print the results
res.hk</code></pre>
<pre class="r"><code># Visualize the tree
fviz_dend(res.hk, cex = 0.6, palette = "jco", 
          rect = TRUE, rect_border = "jco", rect_fill = TRUE)</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/cluster-analysis/020-hierarchical-k-means-clustering-hierarchical-k-means-clustering-1.png" width="518.4" /></p>
<pre class="r"><code># Visualize the hkmeans final clusters
fviz_cluster(res.hk, palette = "jco", repel = TRUE,
             ggtheme = theme_classic())</code></pre>
<p><img src="https://www.sthda.com/english/sthda-upload/figures/cluster-analysis/020-hierarchical-k-means-clustering-hierarchical-k-means-clustering-2.png" width="518.4" /></p>
</div>
<div id="summary" class="section level2">
<h2>Summary</h2>
<p>We described hybrid <strong>hierarchical k-means clustering</strong> for improving k-means results.</p>
</div>
</div><!--end rdoc-->

<!-- END HTML -->]]></description>
			<pubDate>Thu, 07 Sep 2017 15:21:00 +0200</pubDate>
			
		</item>
		
	</channel>
</rss>
