Abstract
The silhouette index is commonly used in cluster analysis to find the optimal number of clusters, and also for final clustering validation and evaluation, as a synthetic indicator that measures the general quality of a clustering (the relative compactness and separability of clusters; see Walesiak and Gatnar in Statystyczna analiza danych z wykorzystaniem programu R. PWN, Warszawa, p. 420, 2009). Its advantages are low computational complexity and simple interpretation rules. Recently, some proposals have appeared to use this index directly as the basis of clustering algorithms. This paper attempts to evaluate that approach. It shows examples in which the “mechanical” use of the silhouette index leads to results that do not correspond to the actual structure of the classes, and it presents recommendations on the correct application of the index.
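To make the quantity under discussion concrete, the following is a minimal Python sketch of the silhouette index as defined by Rousseeuw (1987). For each object i, s(i) = (b(i) - a(i)) / max(a(i), b(i)), where a(i) is the mean distance from i to the other members of its own cluster and b(i) is the smallest mean distance from i to the members of any other cluster; the index is the mean of s(i) over all objects. This is not the R code from the chapter's appendices. The function names, the use of Euclidean distance, and the toy data are assumptions made for illustration only.

```python
import math


def silhouette_widths(points, labels):
    """Return the silhouette width s(i) of every point (illustrative helper)."""
    # Group point indices by cluster label.
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette index needs at least two clusters")

    widths = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            # Singleton cluster: s(i) is set to 0 by convention.
            widths.append(0.0)
            continue
        # a(i): mean distance to the other members of the same cluster.
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b(i): smallest mean distance to the members of another cluster.
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return widths


def silhouette_index(points, labels):
    """Mean silhouette width over all points."""
    widths = silhouette_widths(points, labels)
    return sum(widths) / len(widths)


# Toy data (assumed): two well-separated groups of three points each.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
matching_labels = [0, 0, 0, 1, 1, 1]     # follows the visible groups
mismatched_labels = [0, 1, 0, 1, 0, 1]   # cuts across the groups

print(silhouette_index(toy_points, matching_labels))
print(silhouette_index(toy_points, mismatched_labels))
```

Choosing the number of clusters then amounts to computing this value for each candidate partition and keeping the largest one. As the abstract cautions, doing so mechanically can select a partition that does not match the actual class structure.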
References
Arbelaitz O, Gurrutxaga I, Muguerza J, Pérez JM, Perona I (2013) An extensive comparative study of cluster validity indices. Pattern Recogn 46(1):243–256
Hennig C (2015) What are the true clusters? Pattern Recogn Lett 64:53–62
Hubert LJ, Arabie P (1985) Comparing partitions. J Classif 2:193–218
Kang JH, Park CH, Kim SB (2016) Recursive partitioning clustering tree algorithm. Pattern Anal Appl 19(2):355–367
Kaufman L, Rousseeuw PJ (1990) Finding groups in data: an introduction to cluster analysis. Wiley, New York
Migdał-Najman K, Najman K (2006) Wykorzystanie indeksu silhouette do ustalania optymalnej liczby skupień. Wiadomości Statystyczne 6:1–10
Rousseeuw PJ (1987) Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J Comput Appl Math 20:53–65
Starczewski A, Krzyżak A (2015) Performance evaluation of the Silhouette index. In: International conference on artificial intelligence and soft computing. Springer, Cham, pp 49–58
Walesiak M, Gatnar E (eds) (2009) Statystyczna analiza danych z wykorzystaniem programu R. PWN, Warszawa
Walesiak M, Dudek A (2019) clusterSim: searching for optimal clustering procedure for a data set. R package version 0.48-5
Appendices
Appendix 1: Source Code of the Procedure for Finding the Number of Clusters with the Silhouette Index in R

Appendix 2: Source Code of an R Implementation of the Recursive Partitioning Clustering Tree (RPCT) Algorithm


Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Dudek, A. (2020). Silhouette Index as Clustering Evaluation Tool. In: Jajuga, K., Batóg, J., Walesiak, M. (eds) Classification and Data Analysis. SKAD 2019. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Cham. https://doi.org/10.1007/978-3-030-52348-0_2
DOI: https://doi.org/10.1007/978-3-030-52348-0_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-52347-3
Online ISBN: 978-3-030-52348-0
eBook Packages: Mathematics and Statistics (R0)