Reference Graph

This paper proposes a reference-graph-based recommendation algorithm.

The algorithm starts from a few given documents, uses the citation relations among their references to collect a group of documents the user is likely to be interested in, and displays the topology of that group. This helps researchers see how the documents in a research area relate to one another.

The contribution of this paper focuses on the following three points. (1) It is the first paper to use a reference graph to recommend documents, which makes the recommendation more accurate, comprehensive and clear. (2) It analyzes the structure of documents and the different relationships among them, describes those relationships in depth with a reference graph, and proposes an algorithm to construct the reference graph. (3) It proposes an algorithm to find the related sub-graph in the reference graph, which yields the topology structure among the recommended documents.

The rest of the paper is organized as follows. Section 2 analyzes document structure. Section 3 introduces the reference graph and its construction algorithm. Section 4 presents the algorithm for constructing the related sub-graph and analyzes its cost. Section 5 shows the experimental results. Section 6 gives the conclusion.

~

YANG, Yan and YUN, Long, 2010. Literature recommendation based on reference graph. In: 2010 3rd International Conference on Advanced Computer Theory and Engineering (ICACTE). August 2010. p. V3-400-V3-404. DOI 10.1109/ICACTE.2010.5579583.

> Research papers are frequently requested in digital libraries. When users query for documents about a research problem, they usually want all the related documents and the citations with the relations among them, so that they can understand the development of the research area. Traditional recommendation techniques mainly rely on similarity among users or documents. They do not use the relationships among documents and, especially, cannot provide the topology structure of related documents to the user. This paper proposes a new recommendation technique based on a reference graph of documents. Combined with traditional recommendation techniques, it can recommend research papers more completely, accurately and clearly. Experimental results show that the reference graph based recommendation technique is more effective.

[…]

III. CONSTRUCTION OF REFERENCE GRAPH

The reference graph is constructed according to the degree of correlation between documents, which is defined by the similarity and the citation relations among them.

[…]

The reference graph is constructed when the digital library is created, so the construction algorithm runs as a preprocessing step. The reference graph based recommendation algorithm then finds the related sub-graph of the reference graph and returns it as the recommendation result. The sub-graph is obtained from one or more documents in the reference graph that the user is interested in.

IV. REFERENCE GRAPH BASED RECOMMENDATION ALGORITHM

The construction algorithm for the reference graph adds all the documents directly or indirectly cited by Vi to the reference graph, which makes the graph expand exponentially. The goal of the recommendation algorithm is to identify a sub-graph of the reference graph that contains a limited number of nodes the user is interested in. To find all the documents of interest on a research problem, we must know what the user's interest is. User interest can be reflected by a few given documents, and there are two ways to obtain them. In the first, the user supplies the documents of interest directly. In the second, the system derives them automatically by data mining on the log data. This paper adopts the first method, and the sub-graph is constructed from the given documents.
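The expansion described at the start of this section, where every document directly or indirectly cited by Vi is pulled into the graph, can be illustrated with a breadth-first traversal. This is only a minimal sketch, not the paper's construction algorithm: the function name `cited_closure`, the dictionary layout, and the toy citation data are all hypothetical, and the correlation weights the paper attaches to edges are left out.

```python
from collections import deque


def cited_closure(citations, start):
    """Return every document reachable from `start` by following citations.

    `citations` maps a document id to the ids it cites directly
    (an assumed layout, not taken from the paper).
    """
    seen = set()
    queue = deque([start])
    while queue:
        doc = queue.popleft()
        for ref in citations.get(doc, ()):
            # Skip documents already collected and any cycle back to the start.
            if ref not in seen and ref != start:
                seen.add(ref)
                queue.append(ref)
    return seen


# Hypothetical toy data: document -> documents it cites.
toy_citations = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D", "E"],
    "D": [],
    "E": ["F"],
}

closure = cited_closure(toy_citations, "A")
print(sorted(closure))
```

Even in this small example, starting from a single document brings in every other document in the collection, which is why the recommendation step has to cut the graph down to a limited sub-graph.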

[…]

From the definition, we can see that the best related sub-graph is the sub-graph of the reference graph that has the largest degree of correlation with the set of source nodes. To find it, the degree of correlation of every node with the set of source nodes must be calculated, so all sub-graphs rooted at each source node must be searched. Suppose each document cites 10 documents and the longest path beginning from a source node has length L. The cost of identifying the best related sub-graph is then on the order of 10^L, which is intolerable.
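The growth behind the 10^L estimate can be checked with a few lines of arithmetic. The sketch below assumes the worst case, in which no two citation paths share a document, so the search from one source node forms a full tree. The function name `search_cost` is hypothetical.

```python
def search_cost(branching, depth):
    """Worst-case size of the search from one source node.

    Assumes every document cites exactly `branching` others and that
    no document is reached by two different paths.
    Returns (paths of length `depth`, total nodes visited).
    """
    paths_at_depth = branching ** depth
    total_nodes = sum(branching ** k for k in range(depth + 1))
    return paths_at_depth, total_nodes


for L in (2, 4, 6):
    paths, nodes = search_cost(10, L)
    print(f"L={L}: {paths} paths, {nodes} nodes")
```

With 10 citations per document, each extra step in path length multiplies the work by 10. At L = 6 a single source node already leads to one million distinct paths, and the whole search is repeated for every source node.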

[…]