International Journal of Fuzzy Systems, Vol. 16, No. 1, March 2014

A Hybrid Fuzzy Clustering Method with a Robust Validity Index

Horng-Lin Shieh

Abstract

A robust validity index for the fuzzy c-means (FCM) algorithm is proposed in this paper. The purpose of fuzzy clustering is to partition a given set of training data into several clusters that can then be modeled by fuzzy theory. The FCM algorithm has become the most widely used method in fuzzy clustering. Although some successful applications of FCM have been reported, a disadvantage of FCM is that the number of clusters must be predetermined. After clustering, it is often necessary to evaluate the fitness of the results obtained by FCM; such assessment techniques are called cluster validity. In this paper, a new cluster validity index is proposed to evaluate the fitness of the clusters obtained by FCM, and four examples show that the proposed index performs better than other cluster validity indexes.

Keywords: Clustering algorithm, fuzzy c-means (FCM) algorithm, robust, validity index.

Corresponding Author: Horng-Lin Shieh is with the Department of Electrical Engineering, St. John's University, 499, Sec. 4, Tam King Road, Tamsui District, New Taipei City, Taiwan, 251. E-mail: [email protected]
Manuscript received 05 Oct. 2011; revised 10 Oct. 2013; accepted 17 Feb. 2014.

1. Introduction

In data processing, clustering algorithms are widely used for grouping similar data into a number of clusters. Clustering attempts to partition unlabeled input vectors into clusters such that data points within a cluster are more similar to each other than to points belonging to different clusters. There are two kinds of clustering algorithms: hard and soft [1]. In hard clustering, such as K-nearest neighbors (KNN) [2, 3] and k-means [4], each data point is assigned to exactly one cluster, while in soft clustering, such as fuzzy clustering, each data point is assigned a membership value representing the degree to which it belongs to a cluster. In real applications, sampling data often have uncertain attributes and so cannot be correctly partitioned into a single cluster. The fuzzy set proposed by Zadeh [5] is a solution for dealing with this problem. Fuzzy clustering partitions a given set of sampling data into several clusters by means of membership functions. Let X denote the universal set; the membership function μ_A by which a fuzzy set A is defined is usually

$$\mu_A : X \to [0,1]. \qquad (1)$$

In a fuzzy clustering algorithm, the degree to which a data point x ∈ X = {x_1, x_2, ..., x_n} ⊂ R^d belongs to cluster A is denoted by μ_A(x). The value of a fuzzy membership function is any number between 0 and 1, and is meant to be a mathematical characterization of a "set" which may not be precisely defined.

Many fuzzy clustering methods have been proposed in the literature [6-13]. Among them, the fuzzy c-means (FCM) algorithm [2, 8-10] proposed by Bezdek has become the most popular approach for both theoretical and practical applications in recent decades. Let U = {μ_ik} ∈ M_fcn be a c×n partition matrix, where μ_ik is the membership value of x_i belonging to class k, and let V = {v_1, v_2, ..., v_c} be a set of cluster centers. FCM minimizes the following objective function with respect to μ_ik and v_k:

$$J_m(U,V)=\sum_{k=1}^{c}\sum_{i=1}^{n}(\mu_{ik})^m\,\|x_i-v_k\|^2, \qquad (2)$$

where

$$M_{fcn}=\Big\{\,U=[\mu_{ik}]\;\Big|\;0\le\mu_{ik}\le 1,\ \forall i,k;\ \sum_{k=1}^{c}\mu_{ik}=1,\ \forall i;\ 0<\sum_{i=1}^{n}\mu_{ik}<n,\ \forall k\,\Big\}. \qquad (3)$$

To optimize (2), the FCM algorithm alternates between optimizing J_m over U with V fixed and over V with U fixed, producing a sequence {U^(s), V^(s)}. Specifically, the (s+1)th value of V = {v_1, v_2, ..., v_c} is computed using the (s)th value of U in the right-hand side of (4):

$$v_k=\frac{\sum_{i=1}^{n}(\mu_{ik})^m x_i}{\sum_{i=1}^{n}(\mu_{ik})^m}. \qquad (4)$$

The (s+1)th value of U is then obtained by (5):

$$\mu_{ik}=\left[\sum_{j=1}^{c}\left(\frac{\|x_i-v_k\|}{\|x_i-v_j\|}\right)^{2/(m-1)}\right]^{-1}, \qquad (5)$$

where 1 < m < ∞ is the fuzzification parameter, 2 ≤ c ≤ n is the number of centers, and ||·|| denotes the inner-product norm induced on R^d.
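As an illustration of the alternating optimization of (4) and (5), the following is a minimal sketch in Python/NumPy. It is not the author's implementation: the function names (fcm, _memberships), the convergence test, and the optional V0 argument (used later to seed FCM with the SC centers) are illustrative choices.

```python
import numpy as np

def _memberships(X, V, m):
    """Update (5): memberships computed from the current centers."""
    d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
    d2 = np.fmax(d2, 1e-12)                    # guard exact center hits
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def fcm(X, c, m=2.0, max_iter=100, tol=1e-5, V0=None, seed=0):
    """Minimal FCM: alternates updates (4) and (5) until U stabilizes.

    X: (n, d) data; c: cluster count; m: fuzzifier (1 < m < inf).
    V0 optionally supplies initial centers; otherwise U starts random.
    Returns U (n, c) and V (c, d).
    """
    if V0 is None:
        rng = np.random.default_rng(seed)
        U = rng.random((X.shape[0], c))
        U /= U.sum(axis=1, keepdims=True)      # rows sum to 1, as in (3)
    else:
        U = _memberships(X, np.asarray(V0, dtype=float), m)
    for _ in range(max_iter):
        Um = U ** m
        V = (Um.T @ X) / Um.sum(axis=0)[:, None]   # center update (4)
        U_new = _memberships(X, V, m)              # membership update (5)
        if np.abs(U_new - U).max() < tol:
            return U_new, V
        U = U_new
    return U, V
```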
In the literature there have been many successful applications of fuzzy sets, such as data mining [14, 15], decision making [16-18], robot control [19], and function approximation [20]. However, the FCM algorithm has one drawback: it needs to know the number of clusters in advance, which is not always possible in real applications. In 1994, Yager and Filev [21] proposed a mountain function to obtain initial cluster centers, which can lead to better clustering results from FCM. Also in 1994, Chiu [22] modified the mountain method to construct a potential function for calculating the cluster centers of the sampling data; the resulting method is called the subtractive clustering (SC) algorithm. In the SC algorithm, the feature points are likened to potential sources: the potential of a cluster center has a maximal value at the location of the feature point and decreases rapidly at any point away from it [23]. The potential of each data point x_i is defined as

$$P(x_i)=\sum_{k=1}^{n}\exp\big(-\alpha\, d(x_k,x_i)\big), \qquad (6)$$

where α = 4/r_a², r_a is a positive constant, and d(x_k, x_i) is the distance between data points x_i and x_k. It is reasonable to assume that the peaks of the potential function correspond to cluster centers and that the valleys correspond to the decision boundaries between the clusters [23]. Suppose x_k has the highest potential; then x_k is selected as the first cluster center, denoted x_1^*, with corresponding potential value p_1^*. After the first cluster center is selected, the potential of each data point is revised by

$$P(x_i)\leftarrow P(x_i)-p_1^{*}\exp\big(-\beta\, d(x_1^{*},x_i)\big), \qquad (7)$$

where β = 4/r_b² and r_b is a positive constant. To find the next cluster center, x_2^*, the revised potential is maximized and the effects of this new center are again removed. The process is repeated until p_j^*/p_1^* < δ, where δ is a given threshold and p_j^* is the potential value of x_j^*. The choice of δ is an important factor affecting the clustering results: if δ is too large, too few data points will be accepted as cluster centers; if δ is too small, too many cluster centers will be generated [22]. A problem of the SC algorithm is that each cluster center it obtains is located at a certain data point, which is not the precise location of the cluster's center.

In this paper, the SC algorithm is used to identify the initial cluster centers for FCM, and a novel robust validity index is proposed for the FCM algorithm to indicate the fitness of the partitions of a data set with noise. The proposed index combines the compactness within each cluster and the separation between clusters to obtain the correct number of clusters for the FCM algorithm.
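The selection loop of (6)-(7) can be sketched as follows. This is an illustrative reading rather than the paper's code: it assumes Chiu's usual settings α = 4/r_a² and β = 4/r_b², takes d(·,·) as the squared Euclidean distance, and the names subtractive_centers, delta, and c_max are hypothetical.

```python
import numpy as np

def subtractive_centers(X, ra=1.0, rb=1.5, delta=0.15, c_max=10):
    """Subtractive clustering sketch following (6)-(7).

    Returns up to c_max data points selected as cluster centers.
    """
    alpha, beta = 4.0 / ra ** 2, 4.0 / rb ** 2
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    P = np.exp(-alpha * d2).sum(axis=1)        # potentials, eq. (6)
    centers, p1 = [], None
    for _ in range(c_max):
        k = int(np.argmax(P))
        pk = P[k]
        if p1 is None:
            p1 = pk                            # first (highest) potential
        elif pk / p1 < delta:                  # stopping threshold
            break
        centers.append(X[k])
        P = P - pk * np.exp(-beta * d2[k])     # potential revision, eq. (7)
    return np.array(centers)
```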
2. Cluster Validity

A validity index is a function which assigns to the output of a clustering algorithm a value intended to measure the quality of the clustering. A validity index for finding an optimal cluster number c, denoted c*, which can completely describe the data structure, has become the most studied topic in cluster validity. The quality of a partition obtained by the FCM algorithm is evaluated by how closely the data points are associated with the cluster centers, and the membership function indicates this closeness. If one of the membership values of a particular data point is larger than the others, that point is identified as part of the subset of the data represented by the corresponding cluster center [24]. If a data set has c clusters, then each data point has c memberships representing the degree of closeness between the data point and the cluster centers. It is therefore desirable to summarize the information contained in the memberships by a single number which represents the fitness of the data point as classified by the clustering algorithm. Four frequently used validity indexes are [25]:

(a) Partition coefficient (PC): The PC index, proposed by Bezdek in 1981, was the first validity index for FCM. It is defined as

$$PC(c)=\frac{1}{n}\sum_{k=1}^{c}\sum_{i=1}^{n}(\mu_{ik})^2, \qquad \text{with }\ \sum_{k=1}^{c}\mu_{ik}=1, \qquad (8)$$

where μ_ik is the membership value of x_i belonging to cluster k, and 1/c ≤ PC(c) ≤ 1. The optimal cluster number c* for the PC index is found by solving arg max_{2≤c≤n−1} PC(c). However, the PC index only considers the fuzzy membership degrees μ_ik, which indicate the average relative amount of membership sharing between pairs of fuzzy subsets in U by combining the average contents of pairs of fuzzy algebraic products into a single number [26], without considering the data structure of the clusters. The maximum of the PC index is obtained when μ_ik = 1 for some cluster k and μ_ij = 0 for j ≠ k; i.e., the PC index attains its maximum on every hard partition. In particular, there is a degenerate case in which each data point forms its own cluster, i.e., μ_ii = 1 and μ_ij = 0 for j ≠ i. A modification of the PC index proposed by Davé [27], called MPC, is defined as

$$MPC(c)=1-\frac{c}{c-1}\big(1-PC(c)\big). \qquad (9)$$

The value of MPC lies in [0, 1]; MPC is a normalized version of the PC index, with MPC = 0 when PC(c) = 1/c and MPC = 1 when PC(c) = 1. It therefore shares the same disadvantage as the PC index.

(b) Partition entropy (PE): The PE index proposed by Bezdek [28] is defined as

$$PE(c)=-\frac{1}{n}\sum_{k=1}^{c}\sum_{i=1}^{n}\mu_{ik}\ln(\mu_{ik}), \qquad (10)$$

where 0 ≤ PE(c) ≤ ln(c). The optimal cluster number c* for the PE index is found by solving arg min_{2≤c≤n−1} PE(c). The PE index measures the amount of fuzziness in a given U. Its problem is analogous to that of the PC index: PE takes its minimum value on every hard partition. These two indexes only evaluate the fuzziness of U and do not consider the data structure of the clusters.

(c) Xie and Beni (XB) index: The XB index, proposed by Xie and Beni [29] in 1991, is defined as

$$XB(c)=\frac{\sum_{k=1}^{c}\sum_{i=1}^{n}\mu_{ik}^{2}\,\|x_i-v_k\|^{2}}{n\cdot\min_{i\ne j}\|v_i-v_j\|^{2}}, \qquad (11)$$

where v_i and v_j are the centers of clusters i and j, respectively, and ||v_i − v_j|| is the Euclidean distance between them. XB solves arg min_{2≤c≤n−1} XB(c) to obtain the optimal cluster number c* for data set X. The XB index integrates two properties of the data set and clusters, compactness and separation: the numerator represents the compactness within each cluster and the denominator indicates the separation between clusters [30].
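For reference, the PC (8), PE (10), and XB (11) computations can be sketched as below, assuming U is an n×c membership matrix, V a c×d center matrix, and X an n×d data matrix; the function names are illustrative.

```python
import numpy as np

def pc_index(U):
    """Partition coefficient, eq. (8); larger is better."""
    return (U ** 2).sum() / U.shape[0]

def pe_index(U):
    """Partition entropy, eq. (10); smaller is better."""
    Uc = np.clip(U, 1e-12, 1.0)                # guard log(0)
    return -(Uc * np.log(Uc)).sum() / U.shape[0]

def xb_index(U, V, X):
    """Xie-Beni index, eq. (11); smaller is better."""
    d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
    num = ((U ** 2) * d2).sum()
    sep = ((V[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(sep, np.inf)              # exclude i == j
    return num / (X.shape[0] * sep.min())
```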
To obtain the best partition, the compactness within each cluster should be as small as possible and the separation between clusters as large as possible. To avoid the problem that the XB index decreases monotonically when the cluster number c is close to the cardinality of the data set, the authors recommend plotting the XB curve as a function of c and selecting the starting point of the monotonic epoch as the maximum c (c_max) to be considered [31].

(d) Fukuyama and Sugeno (FS) index [32]: The FS index is another index that integrates the compactness and separation properties of the data set and clusters, and is defined as

$$FS(c)=\sum_{i=1}^{n}\sum_{k=1}^{c}(\mu_{ik})^m\,\|x_i-v_k\|^2-\sum_{k=1}^{c}\sum_{i=1}^{n}(\mu_{ik})^m\,\|v_k-\bar v\|^2, \qquad (12)$$

where 1 ≤ m < ∞ and v̄ is the mean of the cluster centers. On the right side of (12), the first term represents the compactness of the clusters and the second term indicates the separation between the clusters. A small FS value indicates a good fuzzy clustering result, with good compactness within clusters and good separation between them. The optimal cluster number is obtained by solving arg min_{2≤c≤n−1} FS(c).

The above indexes share the common objective of finding a good estimate of the cluster number c such that each of the c clusters is compact and/or separated from the other clusters. In real application systems, however, data domains are often affected by noise, and there has been little discussion in the literature of the influence of noise on validity indexes. In this paper, a novel robust validity index is proposed to indicate the goodness of partitions of a data set with noise.

3. A Novel Robust Cluster Validity for the FCM Algorithm

In this paper, a novel validity index is proposed which integrates the properties of compactness and separation to indicate the fitness of the partition obtained by the FCM algorithm. Let W = {w_ik | 0 ≤ w_ik ≤ 1, 1 ≤ i ≤ n, 1 ≤ k ≤ c} be a weighting matrix, where w_ik is a weighting of x_i belonging to cluster k, and let V = {v_1, v_2, ..., v_c} be the set of cluster centers, with v_k ∈ R^d. The weight w_ik is defined by

$$w_{ik}=\exp\!\left(-\frac{\|x_i-v_k\|^2}{2\sigma^2}\right), \qquad (13)$$

where ||x_i − v_k|| is the Euclidean distance between x_i and v_k, and σ is the width of the Gaussian function. In the proposed method, σ is set according to the constant α defined in (6).

This paper defines a new compactness measure to indicate the optimal cluster number obtained by FCM:

$$Com(c)=\frac{1}{n}\sum_{k=1}^{c}\sum_{i=1}^{n}(\mu_{ik})^2\,\|x_i-v_k\|\,w_{ik}, \qquad \text{with }\ \sum_{k=1}^{c}\mu_{ik}=1, \qquad (14)$$

where w_ik is a weight indicating the importance of the distance between the datum x_i and the cluster center v_k. The difference between w_ik and the μ_ik obtained by the FCM algorithm is that w_ik is not subject to the constraint that its sum over k equals 1. The w_ik are therefore more representative than the μ_ik in reflecting the correlation between data and clusters; in particular, noise points need not satisfy this constraint, so their influence is reduced by w_ik. In (14), if all data around cluster k are close to the cluster center v_k, the value of Com(c) is low, which indicates that all clusters are compact; so, in order to find the best partition of the sampling data, the Com(c) value should be as low as possible.
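Under the reading of (12)-(14) given above, the FS index and the proposed compactness measure can be sketched as follows. Since the paper's exact setting of σ is not recoverable from this copy, σ is left as an explicit parameter; the function names are illustrative.

```python
import numpy as np

def fs_index(U, V, X, m=2.0):
    """Fukuyama-Sugeno index, eq. (12); smaller is better."""
    d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
    sep = ((V - V.mean(axis=0)) ** 2).sum(axis=1)   # ||v_k - v_bar||^2
    Um = U ** m
    return (Um * d2).sum() - (Um * sep[None, :]).sum()

def compactness(U, V, X, sigma):
    """Proposed compactness Com(c), eqs. (13)-(14), as read above."""
    d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-d2 / (2.0 * sigma ** 2))            # weights, eq. (13)
    return ((U ** 2) * np.sqrt(d2) * W).sum() / X.shape[0]   # eq. (14)
```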
To measure the separation between clusters, this paper defines the separation function as

$$SE(c)=\frac{1}{SV\cdot MV}, \qquad (15)$$

where

$$SV=\frac{\sum_{i=1}^{c}\sum_{k=1}^{c}\|v_i-v_k\|}{c(c-1)/2}, \qquad MV=\min_{i\ne j}\|v_i-v_j\|.$$

In (15), the denominator indicates the separation between clusters: the first term, SV, represents the average distance between clusters, and the second term, MV, represents the minimal distance between clusters. The SC algorithm adopts (7) to reduce the potential of data around x_i^*, where x_i^* is the i-th selected cluster center. When the data set contains noise far away from the cluster centers and a new cluster center is added at the noise point, the compactness of (14) is reduced, but SV will be decreased to prevent a single datum from forming a cluster. When the noise is close to the cluster centers and a new cluster center is located at the noise point, the minimal distance between cluster centers may be reduced. SV and MV are thus used together to prevent noise points from becoming new cluster centers.

To measure both the separation between clusters and the compactness within each cluster, this paper combines Com(c) with SE(c) to find a good estimate of the cluster number c* for the data clustering algorithm:

$$CS(c)=Com(c)\times SE(c). \qquad (16)$$

A low CS(c) value means that each of the c clusters is internally compact and well separated from the other clusters. In order to find the optimal cluster number c*, this paper solves arg min_{2≤c≤n−1} CS(c) to produce the optimal compactness and separation properties for the clusters generated by the SC algorithm.

The above-mentioned procedure can be summarized in the following algorithm:

Step 1: Set the initial values of α, β, c_max, and the maximum number of iterations.
Step 2: Calculate the potential P(x_i) of each datum x_i, 1 ≤ i ≤ n.
Step 3: Set the initial number of clusters c = 2.
Step 4: Find the maximal potential value p_c^* among the P(x_i). According to p_c^*, find the corresponding data point x_c; that is, v_c = x_c and x_c's potential value is p_c^*.
Step 5: Update the potentials according to (7).
Step 6: Set V = {v_1, v_2, ..., v_c} as the initial cluster centers for FCM.
Step 7: For k = 1 to the maximum number of iterations:
Step 8: Calculate μ_ik and v_k by (5) and (4).
Step 9: Next k.
Step 10: Calculate the CS(c) value by (13)-(16).
Step 11: c = c + 1; if c ≤ c_max, go to Step 4.
Step 12: Search for the minimal value of CS(c); the value of c at which CS(c) is minimal is the optimal number of clusters.
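Steps 1-12 can be sketched as follows, reusing fcm() and compactness() from the earlier sketches. Because the loop back to Step 4 leaves the per-c seeding slightly open, this sketch simply recomputes the first c potential peaks for each candidate c (one plausible reading of Steps 3-6); all parameter defaults are illustrative.

```python
import numpy as np

def separation(V):
    """Proposed separation SE(c) = 1/(SV*MV), eq. (15), as read above."""
    c = V.shape[0]
    d = np.sqrt(((V[:, None, :] - V[None, :, :]) ** 2).sum(axis=2))
    SV = d.sum() / (c * (c - 1) / 2.0)    # average inter-center distance
    np.fill_diagonal(d, np.inf)
    MV = d.min()                          # minimal inter-center distance
    return 1.0 / (SV * MV)

def hybrid_cluster(X, ra=1.0, rb=1.5, sigma=1.0, c_max=10, max_iter=100):
    """Steps 1-12: SC potentials seed FCM; the smallest CS(c) picks c*."""
    alpha, beta = 4.0 / ra ** 2, 4.0 / rb ** 2      # Step 1
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    P0 = np.exp(-alpha * d2).sum(axis=1)            # Step 2
    scores = {}
    for c in range(2, c_max + 1):                   # Steps 3 and 11
        P, V0 = P0.copy(), []
        for _ in range(c):                          # Steps 4-5
            k = int(np.argmax(P))
            V0.append(X[k])
            P = P - P[k] * np.exp(-beta * d2[k])
        U, V = fcm(X, c, max_iter=max_iter, V0=np.array(V0))  # Steps 6-9
        scores[c] = compactness(U, V, X, sigma) * separation(V)  # Step 10
    return min(scores, key=scores.get), scores      # Step 12
```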
4. Experimental Results

In this section, four experiments are presented; their results show that the validity index proposed in this paper outperforms the other indexes in the literature. The values of α and β are set by the definition of the potential function.

Example 1: Fig. 1(a) shows a data set made up of three clusters with noises/outliers. This data set is generated from three cluster centers at (x, y) = {(2, 2), (7, 7), (13, 13)} with Gaussian noise N(0, 0.2), and twelve noises/outliers, denoted by circles, are added. Intuitively, c = 3 is suitable for this data set. As shown in Fig. 1(b), the PE index indicates that the optimal cluster number is three, while in Figs. 1(c) and 1(d), the XB and FS indexes indicate optimal cluster numbers of two and four, respectively. In Fig. 1(e), the CS index indicates that the optimal cluster number is three. The results show that the PE index and the proposed CS index obtain the correct cluster number for this data set.

[Figure 1. A data set contains three clusters. Panels (a)-(e): the data set and the PE, XB, FS, and CS index curves versus cluster number.]

Example 2: Fig. 2(a) shows a data set made up of four clusters with noises. This data set is generated from four cluster centers at (x, y) = {(1, 1), (7, 7), (13, 13), (19, 19)} with Gaussian noise N(0, 0.8), and twenty noises, denoted by circles, are added. Intuitively, c = 4 is suitable for this data set. As shown in Fig. 2(b), the PE index indicates that the optimal cluster number is four, while in Figs. 2(c) and 2(d), the XB and FS indexes indicate optimal cluster numbers of two and eight, respectively. In Fig. 2(e), the CS index indicates that the optimal cluster number is four. The results show that the PE index and the proposed CS index obtain the correct cluster number for this data set.

[Figure 2. A data set contains four clusters with noises. Panels (a)-(e): the data set and the PE, XB, FS, and CS index curves versus cluster number.]

Example 3: In this example, shown in Fig. 3(a), the data set was extended to eight clusters to test the performance of the validity indexes. As shown in Figs. 3(e) and 3(d), the CS and FS indexes correctly acquire the optimal number of clusters. The optimal number of cluster centers obtained by PE is two, as shown in Fig. 3(b). In Fig. 3(c), the XB index indicates that the optimal cluster number is four, although eight is in fact its secondary optimal choice.

[Figure 3. A data set contains eight clusters with noises. Panels (a)-(e): the data set and the PE, XB, FS, and CS index curves versus cluster number.]

Example 4: As shown in Fig. 4(a), this example uses a data set analogous to the one proposed by Wu et al. [33] to test the above-mentioned indexes. In Fig. 4(e), the result indicates that the CS index proposed in this paper correctly acquires the optimal number of clusters. The result shown in Fig. 4(c) shows that XB also obtains the correct optimal cluster number, but PE and FS, shown in Figs. 4(b) and 4(d), indicate optimal cluster numbers of 2 and 15, respectively. Fig. 4(f) shows the sixteen cluster centers obtained by the fuzzy c-means algorithm.

[Figure 4. A data set contains sixteen clusters. Panels (a)-(f): the data set, the PE, XB, FS, and CS index curves versus cluster number, and the sixteen cluster centers obtained by FCM.]

As shown by the above examples, the proposed CS index consistently outperformed the PE, XB, and FS indexes. The PE index performs well when the data set contains a small number of clusters, but as the number of clusters increases, PE cannot obtain the correct cluster number for the various data sets. The performance of the CS index was better than that of the XB index in Examples 1, 2, and 3, and better than that of the FS index in Examples 1, 2, and 4.
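As a usage illustration only, the following hypothetically reconstructs an Example 1-like data set and runs the pipeline sketched above. The sample sizes, outlier range, and parameter values (ra, rb, sigma) are invented for illustration; the paper reports c* = 3 for this configuration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Three Gaussian clusters around (2,2), (7,7), (13,13), as in Example 1.
centers = np.array([[2.0, 2.0], [7.0, 7.0], [13.0, 13.0]])
X = np.vstack([c + rng.normal(0.0, 0.2, (50, 2)) for c in centers])

# Twelve uniformly scattered outliers (hypothetical range).
X = np.vstack([X, rng.uniform(-10.0, 25.0, (12, 2))])

c_star, scores = hybrid_cluster(X, ra=3.0, rb=4.5, sigma=2.0, c_max=10)
print("estimated optimal cluster number:", c_star)
```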
5. Conclusion

In this paper, a novel validity index for the FCM algorithm is proposed for evaluating the fitness of the resultant cluster number for a data set with noise. First, the SC algorithm is adopted to generate the initial cluster centers for the FCM algorithm; the FCM algorithm then refines the cluster centers, using the centers obtained by SC as its initialization. A robust validity index is proposed for evaluating the fitness of the clustering of data sets. The core idea of the proposed validity index is the combination of a compactness measure within each cluster and a separation property between clusters. From the experiments, both the XB and FS indexes are affected by noise such that the optimal cluster number cannot be reached in three of the four simulation examples. Also, since the PE index only considers the fuzzy membership degree of the data belonging to each cluster without considering the structure of the clusters, it shows a lower level of performance than the proposed index, especially when the number of clusters is large. Therefore, the proposed validity index shows the greatest capability to estimate the optimal cluster number for various data sets. The results of the examples show that the validity index proposed in this paper performs better than the PE, XB, and FS indexes.

Acknowledgment

This paper was supported by the National Science Council under contract number NSC 102-2622-E-129001-CC3.

References

[1] V. S. Tsen and C.-P. Kao, "A novel similarity-based fuzzy clustering algorithm by integrating PCM and mountain method," IEEE Trans. on Fuzzy Systems, vol. 15, no. 6, pp. 1188-1196, 2007.
[2] B. Bhattacharya and D. Kaller, "Reference set thinning for the k-nearest neighbor decision rule," in Proc. of the Fourteenth International Conference on Pattern Recognition, vol. 1, pp. 238-242, 1998.
[3] J. M. Keller, M. R. Gray, and J. A. Givens, "A fuzzy k-nearest neighbors algorithm," IEEE Trans. on Systems, Man, and Cybernetics, vol. SMC-15, pp. 580-585, 1985.
[4] J. A. Hartigan and M. A. Wong, "A k-means clustering algorithm," Applied Statistics, vol. 28, no. 1, pp. 100-108, 1979.
[5] L. A. Zadeh, "Fuzzy sets," Information and Control, vol. 8, no. 3, pp. 338-353, 1965.
[6] J. C. Bezdek, "A convergence theorem for the fuzzy ISODATA clustering algorithms," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. PAMI-2, no. 1, pp. 1-8, 1980.
[7] X. Li, H.-S. Wong, and S. Wu, "A fuzzy minimax clustering model and its applications," Information Sciences, vol. 186, no. 1, pp. 114-125, 2012.
[8] E. Nadernejad and A. Barari, "A novel pixon-based image segmentation process using fuzzy filtering and fuzzy c-mean algorithm," International Journal of Fuzzy Systems, vol. 13, no. 4, pp. 350-357, 2011.
[9] P. Kaur, A. K. Soni, and A. Gosain, "A robust kernelized intuitionistic fuzzy c-means clustering algorithm in segmentation of noisy medical images," Pattern Recognition Letters, vol. 34, no. 2, pp. 163-175, 2013.
[10] S. Miyamoto, "Different objective functions in fuzzy c-means algorithms and kernel-based clustering," International Journal of Fuzzy Systems, vol. 13, no. 2, pp. 89-97, 2011.
[11] M. Sabzekar and M. Naghibzadeh, "Fuzzy c-means improvement using relaxed constraints support vector machines," Applied Soft Computing, vol. 13, no. 2, pp. 881-890, 2013.
[12] A. Skabar and K. Abdalgader, "Clustering sentence-level text using a novel fuzzy relational clustering algorithm," IEEE Trans. on Knowledge and Data Engineering, vol. 25, no. 1, pp. 62-75, 2013.
[13] Y. Hu, D. Wu, and A. Nucci, "Fuzzy-clustering-based decision tree approach for large population speaker identification," IEEE Trans. on Audio, Speech, and Language Processing, vol. 21, no. 4, pp. 762-774, 2013.
[14] C. Hu, N. Luo, X. Yan, and W. Shi, "Traffic flow data mining and evaluation based on fuzzy clustering techniques," International Journal of Fuzzy Systems, vol. 13, no. 4, pp. 344-349, 2011.
[15] T.-P. Hong, G.-C. Lan, Y.-H. Lin, and S.-T. Pan, "An effective gradual data-reduction strategy for fuzzy itemset mining," International Journal of Fuzzy Systems, vol. 15, no. 2, pp. 170-181, 2013.
[16] C.-T. Chen, P.-F. Pai, and W.-Z. Hung, "A new decision-making process for selecting project leader based on social network and knowledge map," International Journal of Fuzzy Systems, vol. 15, no. 1, pp. 36-46, 2013.
[17] D. Yu, "Multi-criteria decision making based on generalized prioritized aggregation operators under intuitionistic fuzzy environment," International Journal of Fuzzy Systems, vol. 15, no. 1, pp. 47-54, 2013.
[18] I.-J. Ding, "Fuzzy rule-based system for decision making support of hybrid SVM-GMM acoustic event detection," International Journal of Fuzzy Systems, vol. 14, no. 1, pp. 118-130, 2012.
[19] Y.-J. Mon and C.-M. Lin, "Supervisory fuzzy Gaussian neural network design for mobile robot path control," International Journal of Fuzzy Systems, vol. 15, no. 2, pp. 142-148, 2013.
[20] H. L. Shieh, Y.-K. Yang, P.-L. Chang, and J.-T. Jeng, "Robust neural-fuzzy method for function approximation," Expert Systems with Applications, vol. 36, no. 3, pp. 6903-6913, 2009.
[21] R. R. Yager and D. P. Filev, "Approximate clustering via the mountain method," IEEE Trans. on Systems, Man and Cybernetics, vol. 24, no. 8, pp. 1279-1284, 1994.
[22] S. Chiu, "Fuzzy model identification based on cluster estimation," Journal of Intelligent and Fuzzy Systems, vol. 2, no. 3, pp. 267-278, 1994.
[23] R. N. Davé and R. Krishnapuram, "Robust clustering methods: a unified view," IEEE Trans. on Fuzzy Systems, vol. 5, no. 2, pp. 270-293, 1997.
[24] M. P. Windham, "Cluster validity for the fuzzy c-means clustering algorithm," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. PAMI-4, no. 4, pp. 357-363, 1982.
[25] K. L. Wu and M. S. Yang, "A cluster validity index for fuzzy clustering," Pattern Recognition Letters, vol. 26, no. 9, pp. 1275-1291, 2005.
[26] W. Wang and Y. Zhang, "On fuzzy cluster validity indices," Fuzzy Sets and Systems, vol. 158, no. 19, pp. 2095-2117, 2007.
[27] R. N. Davé, "Validating fuzzy partitions obtained through c-shells clustering," Pattern Recognition Letters, vol. 17, no. 6, pp. 613-623, 1996.
[28] J. C. Bezdek, "Cluster validity with fuzzy sets," Journal of Cybernetics, vol. 3, no. 3, pp. 58-73, 1974.
[29] X. L. Xie and G. Beni, "A validity measure for fuzzy clustering," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 13, no. 8, pp. 841-847, 1991.
[30] D.-W. Kim, K. H. Lee, and D. Lee, "On cluster validity index for estimation of the optimal number of fuzzy clusters," Pattern Recognition, vol. 37, no. 10, pp. 2009-2025, 2004.
[31] N. R. Pal and J. C. Bezdek, "On cluster validity for the fuzzy c-means model," IEEE Trans. on Fuzzy Systems, vol. 3, no. 3, pp. 370-379, 1995.
[32] Y. Fukuyama and M. Sugeno, "A new method of choosing the number of clusters for the fuzzy c-means method," in Proc. of the 5th Fuzzy Systems Symposium, Japan, 1989, pp. 247-250.
[33] K. L. Wu, M. S. Yang, and J.-N. Hsieh, "Robust cluster validity indexes," Pattern Recognition, vol. 42, no. 11, pp. 2541-2550, 2009.

Horng-Lin Shieh was born in Chang Hua, Taiwan, on September 20, 1965. He received his B.S., M.S., and Ph.D. degrees, all in electrical engineering, from National Taiwan University of Science and Technology in 1991, 1993, and 2006, respectively. He is currently a Professor at St. John's University in Taiwan. His research interests include fuzzy clustering, intelligent systems, neural networks, and RFID applications.