@@ -81,7 +81,7 @@ Example
8181 print("Loss is:", c.loss)
8282
8383 Using the sklearn-compatible API
84- -------------------
84+ --------------------------------
8585
8686Note that KMedoids defaults to the `"precomputed"` metric, expecting a pairwise distance matrix.
8787If you have sklearn installed, you can use `metric="euclidean"`.
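
To make the expected input concrete, the following is a minimal pure-Python sketch of a pairwise distance matrix and of the usual k-medoids objective (the sum of distances from each point to its nearest medoid). The helper names `pairwise_distances` and `medoid_loss` are hypothetical and not part of this package; they only illustrate what a "precomputed" input looks like.

```python
import math


def pairwise_distances(points):
    """Return the full symmetric Euclidean distance matrix as a list of lists."""
    n = len(points)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])
            # The matrix is symmetric, so fill both halves at once.
            dist[i][j] = dist[j][i] = d
    return dist


def medoid_loss(dist, medoids):
    """Sum of distances from every point to its nearest medoid (hypothetical helper)."""
    return sum(min(row[m] for m in medoids) for row in dist)


# Two tight pairs of points; indices 0 and 2 serve as medoids.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
dist = pairwise_distances(points)
loss = medoid_loss(dist, [0, 2])
print("Loss is:", loss)
```

A matrix of this shape (n rows, n columns, zero diagonal) is what the `"precomputed"` metric refers to.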
@@ -114,8 +114,14 @@ MNIST (10k samples)
114114 print("PAM took: %.2f ms" % ((time.time() - start) * 1000))
115115 print("Loss with PAM:", pam.loss)
116116
117- Choose the optimal number of clusters
118- -------------------
117+ Choosing the optimal number of clusters
118+ ---------------------------------------
119+
120+ This package includes :ref:`DynMSC<dynmsc>`, an algorithm that optimizes the Medoid Silhouette
121+ and chooses the "optimal" number of clusters in the range 2..kmax.
122+ Beware that if kmax is too large, the optimum result will likely contain many
123+ single-element clusters. Because an overly large kmax may mask more desirable results,
124+ it is recommended to set kmax to at most 2-3 times the number of clusters you expect.
119125
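
The effect of a too large kmax can be seen in a small sketch of the Medoid Silhouette, assuming the common definition 1 - d1/d2 averaged over all points, where d1 and d2 are the distances to the nearest and second-nearest medoid. The function name `medoid_silhouette` and the handling of d2 = 0 are assumptions made for illustration, not this package's implementation.

```python
def medoid_silhouette(dist, medoids):
    """Average Medoid Silhouette for a fixed set of at least two medoids."""
    total = 0.0
    for row in dist:
        # Distances to the nearest and second-nearest medoid.
        d1, d2 = sorted(row[m] for m in medoids)[:2]
        # Assumption: if d2 == 0 the point coincides with two medoids; score it 0.
        total += (1.0 - d1 / d2) if d2 > 0 else 0.0
    return total / len(dist)


# Four points on a line: two tight pairs that are far apart.
xs = [0.0, 1.0, 10.0, 11.0]
dist = [[abs(a - b) for b in xs] for a in xs]

two_clusters = medoid_silhouette(dist, [0, 2])
all_singletons = medoid_silhouette(dist, [0, 1, 2, 3])
print(two_clusters, all_singletons)
```

With every point as its own medoid, each point has d1 = 0 and the score reaches its maximum of 1, even though the two-cluster solution is the intuitive one. This is why a generous kmax tends to favor one-element clusters.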
120126.. code-block:: python
121127
@@ -142,18 +148,26 @@ For larger data sets, it is recommended to only cluster a representative sample
142148Implemented Algorithms
143149======================
144150
151+ K-Medoids Clustering:
152+
145153* :ref:`FasterPAM<fasterpam>` (Schubert and Rousseeuw, 2020, 2021)
146154* :ref:`FastPAM1<fastpam1>` (Schubert and Rousseeuw, 2019, 2021)
147155* :ref:`PAM<pam>` (Kaufman and Rousseeuw, 1987) with BUILD and SWAP
148- * :ref:`Alternating<alternating>` (k-means-style approach)
149156* :ref:`BUILD<build>` (Kaufman and Rousseeuw, 1987)
150- * :ref:`Silhouette<silhouette>` (Kaufman and Rousseeuw, 1987)
157+ * :ref:`Alternating<alternating>` (k-means-style approach)
158+
159+ Silhouette Clustering:
160+
161+ * :ref:`DynMSC<dynmsc>` (Lenssen and Schubert, 2023)
151162* :ref:`FasterMSC<fastermsc>` (Lenssen and Schubert, 2022)
152163* :ref:`FastMSC<fastmsc>` (Lenssen and Schubert, 2022)
153- * :ref:`DynMSC<dynmsc>` (Lenssen and Schubert, 2023)
154- * :ref:`PAMSIL<pamsil>` (Van der Laan and Pollard, 2003)
155164* :ref:`PAMMEDSIL<pammedsil>` (Van der Laan and Pollard, 2003)
156- * :ref:`MedoidSilhouette<medoid_silhouette>` (Van der Laan and Pollard, 2003)
165+ * :ref:`PAMSIL<pamsil>` (Van der Laan and Pollard, 2003)
166+
167+ Evaluation:
168+
169+ * :ref:`Medoid Silhouette<medoid_silhouette>` (Van der Laan and Pollard, 2003)
170+ * :ref:`Silhouette<silhouette>` (Kaufman and Rousseeuw, 1987)
157171
158172Note that the k-means style "alternating" algorithm yields rather poor result quality
159173(see Schubert and Rousseeuw 2021 for an example and explanation).
@@ -193,6 +207,13 @@ PAM BUILD
193207
194208.. autofunction:: pam_build
195209
210+ .. _DynMSC:
211+
212+ DynMSC
213+ ======
214+
215+ .. autofunction:: dynmsc
216+
196217.. _FasterMSC:
197218
198219FasterMSC
@@ -207,12 +228,12 @@ FastMSC
207228
208229.. autofunction:: fastmsc
209230
210- .. _DynMSC:
231+ .. _PAMMEDSIL:
211232
212- DynMSC
233+ PAMMEDSIL
213234=========
214235
215- .. autofunction:: dynmsc
236+ .. autofunction:: pammedsil
216237
217238.. _PAMSIL:
218239
@@ -221,13 +242,6 @@ PAMSIL
221242
222243.. autofunction:: pamsil
223244
224- .. _PAMMEDSIL:
225-
226- PAMMEDSIL
227- =========
228-
229- .. autofunction:: pammedsil
230-
231245.. _Silhouette:
232246
233247Silhouette
@@ -288,10 +302,11 @@ an earlier (slower, and now obsolete) version was published as:
288302
289303For further details on medoid Silhouette clustering with automatic cluster number selection (FasterMSC, DynMSC), see:
290304
291- | Lars Lenssen, Erich Schubert:
292- | **Medoid silhouette clustering with automatic cluster number selection**
293- | Information Systems (120), 2024, 102290
294- | https://doi.org/10.1016/j.is.2023.102290
305+ | Lars Lenssen, Erich Schubert:
306+ | **Medoid silhouette clustering with automatic cluster number selection**
307+ | Information Systems (120), 2024, 102290
308+ | https://doi.org/10.1016/j.is.2023.102290
309+ | Preprint: https://arxiv.org/abs/2309.03751
295310
296311an earlier version was published as:
297312