This dataset was provided by Cuneyt Akcora from U Manitoba, Friedhelm Victor from TU Berlin, Murat Kantarcioglu and Yulia Gel from UT Dallas.
The Ethereum blockchain stores the transactions that have been executed between roughly 200M account addresses. Ethereum has two types of accounts: externally owned accounts and smart contract accounts. Externally owned accounts (EOAs) are controlled by private keys that are managed by real-life entities. Some entities are ordinary users, whereas others are organizations such as blockchain exchanges. There are two types of exchanges. Centralized exchanges (CEX), also known as custodial exchanges, manage users’ funds on their behalf via multiple, centrally controlled EOAs. Decentralized exchanges (DEX), in contrast, typically do not require placing funds in the custody of a single entity and are usually implemented as smart contract accounts. As exchanges play a major role in blockchain transaction networks, and DEX have gained significant popularity with the advent of Decentralized Finance, understanding these types of addresses and their associated transactions has emerged as an important task.
This dataset consists of weighted, directed graphs with partially available node labels. Specifically, it consists of token (asset) networks extracted from the Ethereum blockchain between Oct-16-2018 and May-04-2020 that are among the largest during that time frame. It covers the ERC20 assets TUSD, BAT, MANA, MGC, BNT, HEX, AMB, LINK, DAI, HT, AZ, LAMB, SAI, EGT, MXM, USDP, MKR, USDC, NPXS, STORJ, BNB, EBK, WETH, KICK, OMG, KNC, ZRX and ENJ, each of which corresponds to a value of the token address field. The data is not anonymized, so addresses can be looked up with online block explorers and linked to external information. The networks can be used individually or jointly, as some nodes may appear in multiple networks. Node labels were obtained in May 2020 from Etherscan.io, a prominent Ethereum block explorer which curates and maintains address labels. In total, 296 addresses from 149 centralized and decentralized exchanges are listed publicly, and these addresses are likely used frequently. The dataset also provides address labels (label, address, name, asset) for addresses in the 0.1-depth AlphaCore of the stablecoin network.
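
As a minimal sketch of how one such token network could be held in memory, the snippet below aggregates a handful of transfer records into a weighted, directed graph and attaches partial exchange labels. The record layout (`from_address`, `to_address`, `timestamp`, `value`), the sample addresses, and the helper names `build_graph` and `node_strengths` are assumptions made for illustration; they are not the documented file format of the archive.

```python
# Hypothetical transfer records: (from_address, to_address, unix_timestamp, value).
# The real files may use a different column order or carry extra fields.
transfers = [
    ("0xaaa", "0xbbb", 1539648000, 10.0),
    ("0xaaa", "0xbbb", 1539734400, 5.0),
    ("0xbbb", "0xccc", 1539820800, 2.5),
    ("0xccc", "0xaaa", 1539907200, 1.0),
]

# Hypothetical partial labels: only some addresses are known exchanges.
labels = {"0xbbb": "CEX"}


def build_graph(records):
    """Aggregate transfers into a weighted, directed adjacency map."""
    graph = {}
    for src, dst, _timestamp, value in records:
        graph.setdefault(src, {})
        graph.setdefault(dst, {})  # pure receivers are nodes too
        # Edge weight = total value sent from src to dst.
        graph[src][dst] = graph[src].get(dst, 0.0) + value
    return graph


def node_strengths(graph):
    """Return (in_strength, out_strength): summed edge weights per node."""
    out_strength = {n: sum(nbrs.values()) for n, nbrs in graph.items()}
    in_strength = {n: 0.0 for n in graph}
    for nbrs in graph.values():
        for dst, weight in nbrs.items():
            in_strength[dst] += weight
    return in_strength, out_strength


graph = build_graph(transfers)
in_s, out_s = node_strengths(graph)
for node in sorted(graph):
    print(node, labels.get(node, "unlabeled"), in_s[node], out_s[node])
```

Summing values per edge discards the temporal dimension; keeping the timestamps (for example, one graph per day) would be the natural variation when the temporal property of the data matters.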
Property | Value |
---|---|
Number of graphs | 28 |
Directed | Yes |
Node features | No |
Edge features | Yes |
Graph labels | No |
Node labels | Partially |
Temporal | Yes |

Possible task | Description |
---|---|
Classification | Given a token transaction network and a list of centralized (CEX) and decentralized (DEX) exchange addresses, predict which other Ethereum addresses belong to an exchange. |
Core decomposition | Given a token transaction network, identify its cores using node features. Use the list of centralized (CEX) and decentralized (DEX) exchange addresses as ground truth, under the assumption that CEX and DEX addresses appear in the highest core of the network (see the AlphaCore article cited below for a justification of this assumption). |
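
The evaluation idea behind the core decomposition task can be illustrated with a classical k-core peeling baseline, sketched below. This is not the AlphaCore algorithm, which is based on data depth over node features; the sketch ignores edge weights and direction and only shows how labeled exchange addresses might be checked against the innermost core. The toy edge list, the labels, and the function name `core_numbers` are made up for illustration.

```python
def core_numbers(edges):
    """Return the k-core number of every node (undirected, unweighted)."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    degree = {n: len(nbrs) for n, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Peel the node with the smallest remaining degree.
        node = min(remaining, key=lambda n: degree[n])
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        for nbr in adj[node]:
            if nbr in remaining:
                degree[nbr] -= 1
    return core


# Toy network: a 4-clique (A, B, C, D) with a two-node tail (E, F).
edges = [
    ("A", "B"), ("A", "C"), ("A", "D"),
    ("B", "C"), ("B", "D"), ("C", "D"),
    ("E", "A"), ("F", "E"),
]
# Hypothetical ground-truth exchange labels.
exchange_labels = {"A": "CEX", "B": "DEX", "E": "CEX"}

core = core_numbers(edges)
top = max(core.values())
in_top_core = sorted(n for n in exchange_labels if core[n] == top)
share = len(in_top_core) / len(exchange_labels)
print(f"highest core: {top}, labeled exchanges inside: {in_top_core} ({share:.0%})")
```

The repeated `min` scan keeps the sketch short but is quadratic in the number of nodes; a bucket-based peeling would be the usual choice for networks of the size described here.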
@inproceedings{victor2021alphacore, title={Alphacore: Data Depth based Core Decomposition}, author={Victor, Friedhelm and Akcora, Cuneyt G and Gel, Yulia R and Kantarcioglu, Murat}, booktitle={Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery \& Data Mining}, pages={1625--1633}, year={2021} }
File | Description |
---|---|
ethereum-exchanges.zip | Decentralized Exchange Classification Dataset: AlphaCore |