Chemical-gene interaction network

Dataset information

This chemical-gene interaction network contains information on interactions between small molecules and genes (that is, the proteins encoded by those genes). Nodes represent chemicals and genes, and edges represent biological interactions between them. For example, small molecules can activate or inhibit proteins, such as enzymes or receptors, and can target proteins by binding to them with different binding affinities. For instance, aspirin binds its targets with relatively low affinity, whereas rofecoxib binds specifically to the protein PTGS2. The network is global: it considers interactions anywhere in an organism.

Dataset statistics
Nodes 9569
Drug nodes 1774
Gene nodes 7795
Edges 131034
Nodes in largest SCC 9538
Fraction of nodes in largest SCC 0.996760
Edges in largest SCC 131001
Fraction of edges in largest SCC 0.999748
Diameter (longest shortest path) 8
90-percentile effective diameter 3.864298
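
The component statistics above can be recomputed from an edge list. The following is a minimal sketch, assuming the network is treated as undirected (so the largest SCC is an ordinary connected component) and that chemical and gene identifiers never collide. The function name `largest_component_stats` is hypothetical and is not part of any SNAP tooling.

```python
from collections import defaultdict, deque


def largest_component_stats(edges):
    """Return node/edge counts and fractions for the largest connected component.

    `edges` is an iterable of (chemical, gene) pairs. The graph is treated as
    undirected, and duplicate pairs are counted once.
    """
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    visited = set()
    largest = set()
    for start in adjacency:
        if start in visited:
            continue
        # Breadth-first search collects every node reachable from `start`.
        component = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in component:
                    component.add(neighbor)
                    queue.append(neighbor)
        visited |= component
        if len(component) > len(largest):
            largest = component

    unique_edges = {frozenset(pair) for pair in edges}
    edges_inside = sum(1 for edge in unique_edges if edge <= largest)
    n_nodes = len(adjacency)
    n_edges = len(unique_edges)
    return {
        "nodes": n_nodes,
        "edges": n_edges,
        "nodes_in_largest": len(largest),
        "edges_in_largest": edges_inside,
        "node_fraction": len(largest) / n_nodes if n_nodes else 0.0,
        "edge_fraction": edges_inside / n_edges if n_edges else 0.0,
    }
```

Nodes without any edge never appear in an edge list, so they are not counted by this sketch.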

The network aggregates data from high-throughput experiments, manually curated datasets, and the results of several prediction methods into a single global network of chemical-gene interactions.



File Size Description
ChG-InterDecagon_targets.csv.gz 2.5MB Drug-target protein associations from several curated databases
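
The file can be read with the Python standard library alone. The sketch below is illustrative only: the column layout is not documented here, so it assumes a header row followed by rows whose first two columns are the chemical identifier and the gene identifier. Check this against the actual file before relying on it. The helper names `load_edges` and `build_adjacency` are hypothetical.

```python
import csv
import gzip
from collections import defaultdict


def load_edges(path, has_header=True):
    """Read (chemical, gene) pairs from a gzip-compressed CSV file.

    Assumes the first two columns hold the chemical and gene identifiers;
    this layout is an assumption, not something the dataset page states.
    """
    edges = []
    with gzip.open(path, mode="rt", newline="") as handle:
        reader = csv.reader(handle)
        if has_header:
            next(reader, None)  # skip the header row if there is one
        for row in reader:
            if len(row) < 2:
                continue  # ignore blank or malformed lines
            edges.append((row[0].strip(), row[1].strip()))
    return edges


def build_adjacency(edges):
    """Map each chemical to the set of genes it interacts with."""
    targets = defaultdict(set)
    for chemical, gene in edges:
        targets[chemical].add(gene)
    return targets
```

A typical call would be `build_adjacency(load_edges("ChG-InterDecagon_targets.csv.gz"))`, which yields one entry per chemical and keeps the bipartite structure explicit.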