Chemical-gene interaction network

Dataset information

This chemical-gene interaction network describes interactions between small molecules and genes (more precisely, the proteins that the genes encode). Nodes represent chemicals and genes, and edges represent biological interactions between them. Small molecules can activate or inhibit proteins, such as enzymes or receptors, and they bind their targets with different affinities. For instance, aspirin binds its targets with relatively low affinity, whereas rofecoxib binds specifically to the protein PTGS2. The network is global: it considers interactions anywhere in an organism.

Dataset statistics
Nodes	9569
Drug nodes	1774
Gene nodes	7795
Edges	131034
Nodes in largest SCC	9538
Fraction of nodes in largest SCC	0.996760
Edges in largest SCC	131001
Fraction of edges in largest SCC	0.999748
Diameter (longest shortest path)	8
90-percentile effective diameter	3.864298

The network aggregates data from high-throughput experiments, manually curated datasets, and the results of several prediction methods into a single global network of chemical-gene interactions.

References

Modeling polypharmacy side effects with graph convolutional networks. Marinka Zitnik, Monica Agrawal, and Jure Leskovec. Bioinformatics. 2018.
Presented at ISMB 2018

STITCH 5: augmenting protein–chemical interaction networks with tissue and affinity data. Damian Szklarczyk, et al. Nucleic Acids Research. 2015.

Files

File	Size	Description
ChG-InterDecagon_targets.csv.gz	2.5MB	Drug-target protein associations from several curated databases
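
As a rough sketch of how the file might be read, the snippet below uses only the Python standard library. The column layout is an assumption, not something stated here: it expects a comma-separated file with the chemical identifier in the first column and the gene identifier in the second, and it skips lines starting with "#". If the file instead has a plain header row, pass has_header=True. The names read_interactions and summarize are hypothetical helpers.

```python
import csv
import gzip


def read_interactions(path, has_header=False):
    """Read (chemical, gene) pairs from a gzipped two-column CSV file.

    Assumed layout: chemical ID in column 1, gene ID in column 2.
    Blank lines, lines starting with '#', and rows that do not have
    exactly two columns are skipped.
    """
    pairs = []
    with gzip.open(path, "rt", newline="") as handle:
        header_pending = has_header
        for row in csv.reader(handle):
            if len(row) != 2 or row[0].startswith("#"):
                continue
            if header_pending:
                # Drop the first data-like row when the caller says it is a header.
                header_pending = False
                continue
            pairs.append((row[0].strip(), row[1].strip()))
    return pairs


def summarize(pairs):
    """Count distinct chemicals, distinct genes, and distinct pairs."""
    return {
        "chemicals": len({chemical for chemical, _ in pairs}),
        "genes": len({gene for _, gene in pairs}),
        "edges": len(set(pairs)),
    }
```

For example, summarize(read_interactions("ChG-InterDecagon_targets.csv.gz")) should give counts that can be compared against the dataset statistics, provided the assumed layout matches the file.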