ISSN: 0976-4860
Mamta Mittal, R.K.Sharma, V.P.Singh
Data mining is the process of extracting interesting hidden information from large databases. It can be applied to many kinds of databases, but the kind of patterns to be found is determined by the data mining technique used. Clustering is one such technique: it partitions a database into clusters such that data objects in the same cluster are similar and data objects belonging to different clusters are dissimilar. Researchers have developed many clustering algorithms, but this paper focuses on the well-known partitioning-based technique k-means and compares it with a threshold-based clustering technique. The k-means algorithm partitions the database into k clusters, where k is a user-defined parameter; in addition, it is sensitive to outliers and to the selection of initial seeds. Threshold-based clustering is another method, which generates clusters automatically based on a threshold value. To assess the quality of the clustering obtained from both techniques, several validity measures and validity indices have been applied to synthetic data. From the experiments and comparisons of the clustering results, it has been observed that the clusters obtained from the threshold-based technique are more separated and more compact, which indicates good clustering.
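To make the contrast between the two techniques concrete, the following is a minimal sketch rather than the procedure used in this paper. It assumes a leader-style variant of threshold-based clustering, in which a point joins the nearest existing cluster if that cluster's centroid lies within the threshold and otherwise starts a new cluster, and a plain Lloyd-style k-means whose k is fixed by the number of seeds supplied. The function names `threshold_cluster` and `kmeans`, the Euclidean distance, and the running-mean centroid update are all illustrative assumptions.

```python
import math


def threshold_cluster(points, threshold):
    """Leader-style threshold clustering (an assumed variant, for illustration).

    A point joins the nearest existing cluster if that cluster's centroid is
    within `threshold`; otherwise it starts a new cluster. The number of
    clusters is therefore an output, not an input.
    """
    centroids, sizes, labels = [], [], []
    for p in points:
        # Find the nearest existing centroid, if any.
        best, best_dist = None, float("inf")
        for i, c in enumerate(centroids):
            d = math.dist(p, c)
            if d < best_dist:
                best, best_dist = i, d
        if best is not None and best_dist <= threshold:
            # Update the centroid as a running mean of its members.
            n = sizes[best]
            centroids[best] = tuple(
                (ci * n + pi) / (n + 1) for ci, pi in zip(centroids[best], p)
            )
            sizes[best] = n + 1
            labels.append(best)
        else:
            # No cluster is close enough: start a new one at this point.
            centroids.append(tuple(float(x) for x in p))
            sizes.append(1)
            labels.append(len(centroids) - 1)
    return labels, centroids


def kmeans(points, seeds, max_iter=100):
    """Lloyd-style k-means; k is fixed in advance by the number of seeds."""
    centroids = [tuple(float(x) for x in s) for s in seeds]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for i in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[i] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Two well-separated synthetic groups (hypothetical data).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
t_labels, t_centroids = threshold_cluster(points, threshold=3.0)
k_labels, k_centroids = kmeans(points, seeds=[(0, 0), (10, 10)])
```

On this toy data the threshold-based sketch finds the two groups without being told how many to look for, whereas k-means returns as many clusters as seeds are supplied, so passing three seeds forces a three-way split. Note that this leader-style variant depends on the order in which points are visited, a property of the assumed sketch rather than a claim about the method evaluated here.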