The arXiv HEP-TH (High Energy Physics Theory) citation graph is derived from the arXiv e-print archive and covers all citations within a dataset of 27,770 papers, giving 352,807 edges. If paper i cites paper j, the graph contains a directed edge from i to j. If a paper cites, or is cited by, a paper outside the dataset, the graph contains no information about that citation.
The data covers papers in the period from January 1993 to April 2003 (124 months). It begins within a few months of the inception of the arXiv, and thus represents essentially the complete history of its HEP-TH section.
The data was originally released as part of the 2003 KDD Cup.
Dataset statistics | |
---|---|
Nodes | 27770 |
Edges | 352807 |
Nodes in largest WCC | 27400 (0.987) |
Edges in largest WCC | 352542 (0.999) |
Nodes in largest SCC | 7464 (0.269) |
Edges in largest SCC | 116268 (0.330) |
Average clustering coefficient | 0.3120 |
Number of triangles | 1478735 |
Fraction of closed triangles | 0.04331 |
Diameter (longest shortest path) | 13 |
90-percentile effective diameter | 5.3 |
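
The parenthesised values in the WCC and SCC rows are fractions of the whole graph (for example, 27400 / 27770 ≈ 0.987). As a rough illustration of how the WCC rows could be obtained, the sketch below finds the largest weakly connected component by ignoring edge direction and running a breadth-first search. The function name `largest_wcc` and the toy edge list are hypothetical and are not part of the dataset or its tooling.

```python
from collections import defaultdict, deque


def largest_wcc(edges):
    """Return the node set of the largest weakly connected component.

    `edges` is an iterable of (source, target) pairs. Direction is
    ignored, which is what "weakly" connected means.
    """
    neighbours = defaultdict(set)
    for src, dst in edges:
        neighbours[src].add(dst)
        neighbours[dst].add(src)

    seen = set()
    best = set()
    for start in neighbours:
        if start in seen:
            continue
        # Breadth-first search over the undirected view of the graph.
        component = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in neighbours[node]:
                if nxt not in component:
                    component.add(nxt)
                    queue.append(nxt)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


# Toy graph (made up for illustration): a 3-cycle plus a separate edge.
toy_edges = [(1, 2), (2, 3), (3, 1), (4, 5)]
wcc = largest_wcc(toy_edges)

all_nodes = {node for edge in toy_edges for node in edge}
node_fraction = len(wcc) / len(all_nodes)
edges_inside = sum(1 for src, dst in toy_edges if src in wcc and dst in wcc)
edge_fraction = edges_inside / len(toy_edges)
```

Because the function only sees nodes that appear in at least one edge, any paper with no citations inside the dataset would have to be counted separately when computing the node fraction.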
File | Description |
---|---|
cit-HepTh.txt.gz | Paper citation network of the arXiv High Energy Physics Theory category |
cit-HepTh-dates.txt.gz | Time of nodes (paper submission time to arXiv) |
cit-HepTh-abstracts.tar.gz | Paper meta information (see below) |
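
A minimal sketch of reading the citation network into memory follows. It assumes that `cit-HepTh.txt.gz` is a gzip-compressed text file with one whitespace-separated `source target` pair of integer paper IDs per line and with comment lines beginning with `#`; that layout is an assumption, so check the file header before relying on it. The function names `parse_edge_list` and `load_citation_graph` and the sample IDs are hypothetical.

```python
import gzip
from collections import defaultdict


def parse_edge_list(lines):
    """Build a map from each citing paper to the set of papers it cites.

    Assumes each data line holds two whitespace-separated integer IDs
    and that lines starting with '#' are comments.
    """
    cites = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        src, dst = line.split()[:2]
        # Edge direction follows the description: src cites dst.
        cites[int(src)].add(int(dst))
    return cites


def load_citation_graph(path="cit-HepTh.txt.gz"):
    """Read the gzip-compressed edge list from `path` (assumed local)."""
    with gzip.open(path, "rt") as handle:
        return parse_edge_list(handle)


# In-memory sample with made-up IDs, so the parser can be tried
# without downloading the file.
sample_lines = [
    "# FromNodeId\tToNodeId",
    "1001\t9304045",
    "1001\t9308122",
    "9308122\t9304045",
]
graph = parse_edge_list(sample_lines)
edge_count = sum(len(targets) for targets in graph.values())
```

Storing targets in a set collapses any repeated lines, so `edge_count` counts distinct citations. If the real file is laid out as assumed, calling `load_citation_graph()` on it should yield an edge count comparable to the 352,807 reported in the statistics table.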