PROPOSAL OF A DENSITY- AND GRID-BASED METHOD FOR THE AUTOMATIC CLUSTERING PROBLEM

Gustavo Silva Semaan; Raphael Borges Vasconcelos; José André de Moura Brito; Luiz Satoru Ochi

doi:10.5151/marine-spolm2014-126179

Simpósio de Pesquisa Operacional e Logística da Marinha - Online Publication

August 2014, vol. 1, no. 1 - XVII Simpósio de Pesquisa Operacional e Logística da Marinha

Article - Open Access

Abstract:

Cluster analysis comprises a range of methods that aim to identify groups within a dataset. The method proposed in this paper was developed from the study of a technique based on grids and density. Its goal is to identify the number of clusters in an automatic clustering problem by maximizing the Silhouette index. The computational results presented in this study indicate that the proposed method is promising with respect to the quality of the solutions it produces.
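To make the objective concrete, the following is a minimal sketch of the Silhouette index as defined by Rousseeuw (1987), together with a selection step that keeps the candidate partition with the highest value. It is not the grid- and density-based method of the paper, which this page does not detail; the function names `silhouette_index` and `best_partition`, the toy points, and the candidate labelings are all assumptions made for illustration.

```python
import math


def silhouette_index(points, labels):
    """Mean silhouette width of a labelled point set (Euclidean distance)."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    def mean_dist(i, members):
        # Average distance from point i to the listed members, excluding i.
        others = [j for j in members if j != i]
        return sum(math.dist(points[i], points[j]) for j in others) / len(others)

    total = 0.0
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # s(i) = 0 for singleton clusters, by convention
        a = mean_dist(i, own)  # cohesion: own cluster
        b = min(mean_dist(i, m) for l, m in clusters.items() if l != label)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)  # separation versus cohesion
    return total / len(points)


def best_partition(points, candidate_labelings):
    """Hypothetical selection step: keep the labeling with the highest index."""
    return max(candidate_labelings, key=lambda lab: silhouette_index(points, lab))


# Toy data (illustrative only): two visibly separated groups of three points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good = [0, 0, 0, 1, 1, 1]    # matches the visible structure
mixed = [0, 1, 0, 1, 0, 1]   # ignores the structure
three = [0, 0, 0, 1, 1, 2]   # splits one group unnecessarily

chosen = best_partition(points, [mixed, three, good])
print(chosen, round(silhouette_index(points, chosen), 3))
```

In this setting the number of clusters is not given in advance: each candidate partition may use a different number of groups, and the index, which lies between -1 and 1, serves as the criterion for comparing them.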


How to cite:

Semaan, Gustavo Silva; Vasconcelos, Raphael Borges; Brito, José André de Moura; Ochi, Luiz Satoru; "PROPOSTA DE UM MÉTODO BASEADO EM DENSIDADE E GRADE PARA O PROBLEMA DE AGRUPAMENTO AUTOMÁTICO", p. 153-162. In: Anais do XVII Simpósio de Pesquisa Operacional e Logística da Marinha - SPOLM 2014. São Paulo: Blucher, 2014.
ISSN 2175-6295, DOI 10.5151/marine-spolm2014-126179
