In this paper, we present a graph clustering system built on the SOM neural network and graph spectra. We use this system to support the visualization of similar protein structures in a graph database of protein structures. The graph spectrum is the set of eigenvalues of the normalized Laplacian matrix representing the graph. We sort these eigenvalues in descending order and use the sorted eigenvalues as a feature vector representing the graph. The SOM neural network clusters the graph spectra, with the distance between two graphs defined as the Euclidean distance between their spectra. Using graph spectra speeds up the training phase of the SOM neural network. After clustering, the 2D SOM output layer groups similar protein structures into clusters. By placing the 2D SOM output layer on the computer display, the user can browse the similar protein structures in the database by moving around the display. We tested the proposed solution on protein structures downloaded from the SCOP database, which was created by manual inspection and automated methods to describe the structural and evolutionary relationships between all known proteins. We compare our results with SCOP.
PHUC, Do and PHUNG, Nguyen Thi Kim, 2010. Visualization of the Similar Protein Structures Using SOM Neural Network and Graph Spectra. In: NGUYEN, Ngoc Thanh, LE, Manh Thanh and ŚWIĄTEK, Jerzy (eds.), Intelligent Information and Database Systems. Online. Berlin, Heidelberg: Springer Berlin Heidelberg. p. 258–267. Lecture Notes in Computer Science. ISBN 978-3-642-12100-5. [Accessed 4 March 2024].
Graphs are an effective way to represent data objects. Currently, graphs are used to represent 3D objects such as the structures of chemical elements and proteins, as well as XML documents [2, 7, 9, 11]. As the volume of such data grows rapidly, databases are needed to store data represented by graphs. In this article, graphs are represented by their graph spectra. We use the SOM neural network to cluster the graph spectra. Using graph spectra speeds up the training of the SOM neural network. Finally, we use the SOM output layer to visualize the graphs representing protein structures on the computer display. The paper is organized as follows: 1) Introduction; 2) Organizing data in relational databases; 3) Graph distance; 4) Graph distance based on the graph spectra; 5) Clustering graphs by using SOM neural network; 6) Graph database of protein structure; 7) Using SOM network and graph spectra for visualization and similarity query in graph database of protein structures; 8) Experimental and discussion; 9) Conclusions.
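The clustering step outlined in this introduction can be illustrated with a minimal SOM sketch in Python. This is not the implementation used in the paper: the grid size, the linearly decaying learning rate, the Gaussian neighborhood, and the names `train_som` and `best_matching_unit` are all assumptions made for illustration. The input is any set of fixed-length feature vectors, such as graph spectra.

```python
import numpy as np


def best_matching_unit(weights, x):
    """Return the (row, col) of the output node whose weight vector is closest to x."""
    dist = np.linalg.norm(weights - x, axis=-1)  # Euclidean distance to every node
    return np.unravel_index(np.argmin(dist), dist.shape)


def train_som(vectors, rows=4, cols=4, epochs=50, lr0=0.5, sigma0=2.0, seed=0):
    """Train a rows x cols SOM on fixed-length feature vectors.

    All hyperparameters are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(vectors, dtype=float)
    weights = rng.random((rows, cols, data.shape[1]))  # random initial weights in [0, 1)
    # Grid coordinates of each output node, shape (rows, cols, 2).
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                 # learning rate decays linearly
            sigma = sigma0 * (1.0 - frac) + 1e-3    # neighborhood radius shrinks
            bmu = best_matching_unit(weights, x)
            grid_dist2 = ((grid - np.array(bmu)) ** 2).sum(axis=-1)
            influence = np.exp(-grid_dist2 / (2.0 * sigma ** 2))
            # Pull the winning node and its grid neighbors toward the sample.
            weights += lr * influence[..., None] * (x - weights)
            step += 1
    return weights
```

After training, calling `best_matching_unit(weights, spectrum)` for each graph gives its cell on the 2D output layer, so graphs with similar feature vectors tend to land in the same or neighboring cells.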
The graph spectrum is the set of eigenvalues of the normalized Laplacian matrix representing a graph. For a graph with n vertices, the graph spectrum is as follows: […]
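As a rough illustration, the sketch below computes a spectrum-based feature vector and the Euclidean distance between two such vectors. It assumes the common textbook form of the normalized Laplacian, L = I - D^{-1/2} A D^{-1/2}, for an undirected graph given by its adjacency matrix; the paper's exact formula is not reproduced here. The zero-padding used to compare graphs with different vertex counts is also an assumption, and the function names are hypothetical.

```python
import numpy as np


def normalized_laplacian_spectrum(adjacency):
    """Eigenvalues of the normalized Laplacian, sorted in descending order.

    Assumes an undirected graph (symmetric adjacency matrix) and the
    textbook definition L = I - D^{-1/2} A D^{-1/2}.
    """
    a = np.asarray(adjacency, dtype=float)
    degrees = a.sum(axis=1)
    nonzero = degrees > 0
    # Isolated vertices get a zero scaling factor to avoid division by zero.
    inv_sqrt = np.zeros_like(degrees)
    inv_sqrt[nonzero] = 1.0 / np.sqrt(degrees[nonzero])
    # Diagonal entry is 1 for vertices with nonzero degree, 0 for isolated ones.
    laplacian = np.diag(nonzero.astype(float)) - inv_sqrt[:, None] * a * inv_sqrt[None, :]
    eigenvalues = np.linalg.eigvalsh(laplacian)  # returned in ascending order
    return eigenvalues[::-1]


def spectral_distance(spec_a, spec_b):
    """Euclidean distance between two spectra.

    The shorter spectrum is zero-padded (an assumption of this sketch) so that
    graphs with different numbers of vertices can be compared.
    """
    size = max(len(spec_a), len(spec_b))
    padded_a = np.pad(spec_a, (0, size - len(spec_a)))
    padded_b = np.pad(spec_b, (0, size - len(spec_b)))
    return float(np.linalg.norm(padded_a - padded_b))


# Example graphs: a triangle and a path on three vertices.
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
distance = spectral_distance(
    normalized_laplacian_spectrum(triangle),
    normalized_laplacian_spectrum(path),
)
```

Because each graph is reduced to a vector of at most n numbers, comparing two graphs costs a single vector subtraction, which is consistent with the training speed-up claimed for the SOM.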