Higher-order network structure of disease pathways

Dataset information

This dataset describes the network structure of disease pathways. It contains network motif counts and the results of a significance analysis of those motifs. Network motifs are subgraphs that recur within disease pathways. Specifically, the dataset contains information on graphlets, which are connected, non-isomorphic induced subgraphs. There are 30 possible graphlets on 2 to 5 nodes. The simplest graphlet is two nodes connected by an edge, and the most complex is a clique of size 5. Taking into account the symmetries between nodes within a graphlet, there are 73 distinct node positions, or orbits, across the 2- to 5-node graphlets.
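The graphlet and orbit counts quoted above can be checked by brute force. The sketch below is not the method used to build the dataset; it is a minimal illustration, with hypothetical function names, that enumerates every connected graph on 2 to 5 labelled nodes, groups them into isomorphism classes, and counts the node orbits of each class from its automorphisms.

```python
from itertools import combinations, permutations


def connected(n, edges):
    """Return True if the graph on nodes 0..n-1 with these edges is connected."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, stack = {0}, [0]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n


def relabel(edges, perm):
    """Edge set obtained by renaming every node v to perm[v]."""
    return frozenset(frozenset((perm[u], perm[v])) for u, v in edges)


def count_graphlets_and_orbits(n):
    """Count isomorphism classes of connected n-node graphs and their node orbits."""
    pairs = list(combinations(range(n), 2))
    perms = list(permutations(range(n)))
    seen = set()  # every labelled edge set whose class was already counted
    graphlets = orbits = 0
    for mask in range(1, 1 << len(pairs)):
        edges = [p for i, p in enumerate(pairs) if (mask >> i) & 1]
        if not connected(n, edges):
            continue
        edge_set = relabel(edges, range(n))  # identity relabelling
        if edge_set in seen:
            continue
        images = [relabel(edges, p) for p in perms]
        seen.update(images)
        graphlets += 1
        # Automorphisms are the relabellings that leave the edge set unchanged.
        autos = [p for p, img in zip(perms, images) if img == edge_set]
        # Two nodes share an orbit if some automorphism maps one to the other.
        orbits += len({frozenset(p[v] for p in autos) for v in range(n)})
    return graphlets, orbits


per_size = {n: count_graphlets_and_orbits(n) for n in range(2, 6)}
total_graphlets = sum(g for g, _ in per_size.values())
total_orbits = sum(o for _, o in per_size.values())
print(per_size)
print(total_graphlets, total_orbits)
```

Exhaustive enumeration is only practical at this scale (at most 1,024 labelled graphs on 5 nodes); counting graphlets inside a real interaction network requires a dedicated graphlet-counting tool.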

Broadly, a disease pathway is a system of interacting proteins whose atypical activity collectively produces some disease phenotype. Given a human physical protein-protein interaction (PPI) network, whose nodes represent proteins and whose edges represent protein-protein interactions, the disease pathway for a given disease is the subgraph of the PPI network specified by the set of proteins associated with the disease and by the set of interactions among those proteins. Examples of disease pathways include 'adrenal cortex carcinoma pathway,' 'Noonan syndrome,' and 'mitochondrial complex I deficiency.'
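The definition above amounts to taking an induced subgraph. As a minimal sketch, assuming the PPI network is given as a list of undirected edges and the disease as a set of protein identifiers (the function name and the toy labels A to D are hypothetical, not taken from the dataset):

```python
def induced_pathway(ppi_edges, disease_proteins):
    """Return (nodes, edges) of the pathway induced by the disease proteins.

    ppi_edges: iterable of (protein, protein) pairs, treated as undirected.
    disease_proteins: iterable of proteins associated with the disease.
    """
    network_nodes = {p for edge in ppi_edges for p in edge}
    # Keep only disease proteins that actually occur in the PPI network.
    nodes = set(disease_proteins) & network_nodes
    # Keep an interaction only if both endpoints are disease proteins.
    edges = {
        frozenset((u, v))
        for u, v in ppi_edges
        if u in nodes and v in nodes and u != v
    }
    return nodes, edges


# Toy network with made-up protein labels.
toy_ppi = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "D")]
toy_nodes, toy_edges = induced_pathway(toy_ppi, {"A", "B", "D"})
print(sorted(toy_nodes))
print(sorted(sorted(e) for e in toy_edges))
```

Here the interactions B-C and C-D are dropped because C is not associated with the toy disease, so the resulting pathway has three proteins and two interactions.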



File                                  Size    Description
D-MtfPathways_disease-motifs.csv.gz   236KB   Network motifs of disease pathways (feature table)
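The feature table is a gzip-compressed CSV file, so it can be read with the Python standard library alone. The sketch below assumes the file has been downloaded to the working directory and that its first row is a header; the column names are not documented on this page, so the code simply reports whatever header it finds. The function name is hypothetical.

```python
import csv
import gzip
import os


def read_feature_table(path):
    """Read a gzip-compressed CSV file and return (header, rows)."""
    with gzip.open(path, "rt", newline="", encoding="utf-8") as fh:
        reader = csv.reader(fh)
        header = next(reader)  # assumption: the first row holds column names
        rows = list(reader)
    return header, rows


# Assumed local path; adjust to wherever the file was saved.
local_path = "D-MtfPathways_disease-motifs.csv.gz"
if os.path.exists(local_path):
    header, rows = read_feature_table(local_path)
    print(len(rows), "rows;", len(header), "columns")
    print(header[:5])
```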