Graph Embedding with Self Clustering: Facebook, February 13 2018

Dataset information

We collected data about Facebook pages in November 2017. These datasets represent networks of blue verified Facebook pages in different categories. Nodes represent the pages, and edges are mutual likes between them. We reindexed the nodes to provide a degree of anonymity. The csv files contain the edges; nodes are indexed from 0. We included 8 distinct types of pages, listed below. For each dataset we list the number of nodes and edges.

Dataset Nodes Edges
Government 7,057 89,455
New Sites 27,917 206,259
Athletes 13,866 86,858
Public Figures 11,565 67,114
TV Shows 3,892 17,262
Politician 5,908 41,729
Artist 50,515 819,306
Company 14,113 52,310
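
As a rough illustration of how the edge files can be checked against the node and edge counts listed here, the following Python sketch reads an edge list and counts distinct nodes and undirected edges. It assumes each csv row holds two integer node ids separated by a comma, possibly preceded by a header row; the header name `node_1,node_2` and the helper names `read_edges` and `summarize` are illustrative assumptions, not part of the dataset documentation.

```python
import csv
from typing import Iterable, List, Set, Tuple

Edge = Tuple[int, int]


def read_edges(lines: Iterable[str]) -> List[Edge]:
    """Parse comma-separated edge rows into (source, target) integer pairs.

    Rows that are too short or whose first two fields are not integers
    (for example a header row) are skipped.
    """
    edges: List[Edge] = []
    for row in csv.reader(lines):
        if len(row) < 2:
            continue
        try:
            u, v = int(row[0]), int(row[1])
        except ValueError:
            continue  # assumed to be a header or malformed row
        edges.append((u, v))
    return edges


def summarize(edges: Iterable[Edge]) -> Tuple[int, int]:
    """Return (number of distinct nodes, number of distinct undirected edges)."""
    nodes: Set[int] = set()
    undirected: Set[Edge] = set()
    for u, v in edges:
        nodes.update((u, v))
        # Mutual likes are symmetric, so (u, v) and (v, u) count once.
        undirected.add((min(u, v), max(u, v)))
    return len(nodes), len(undirected)


if __name__ == "__main__":
    # Tiny made-up sample; the header name is an assumption.
    sample = ["node_1,node_2", "0,1", "1,2", "2,0", "1,0"]
    print(summarize(read_edges(sample)))
```

To run this on a real file, pass an open file handle, for example `with open(path, newline="") as f: summarize(read_edges(f))`, where `path` points to one of the extracted csv files.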

Source (citation)

  • B. Rozemberczki, R. Davies, R. Sarkar and C. Sutton. GEMSEC: Graph Embedding with Self Clustering. 2018.

Files

File Description
gemsec_facebook_dataset.tar.gz Facebook data from February 13 2018