This public dataset contains one month of edits made by editors on Wikipedia pages. We selected the 1,000 most edited pages as items and editors who made at least 5 edits as users (a total of 8,227 users). This yields 157,474 interactions. Similar to the Reddit dataset, we convert the edit text into a LIWC-feature vector.
Users | Items | Interactions | Node Labels | Node Features | Edge Labels | Edge Features | Action Repetition (%) |
---|---|---|---|---|---|---|---|
8,227 | 1,000 | 157,474 | Exist | None | None | Exist | 61 |
Srijan Kumar, Xikun Zhang, and Jure Leskovec. 2019. Predicting Dynamic Embedding Trajectory in Temporal Interaction Networks. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD ’19). Association for Computing Machinery, New York, NY, USA, 1269–1278. DOI:https://doi.org/10.1145/3292500.3330895
@inproceedings{kumar2019predicting,
title={Predicting Dynamic Embedding Trajectory in Temporal Interaction Networks},
author={Kumar, Srijan and Zhang, Xikun and Leskovec, Jure},
booktitle={Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
year={2019},
organization={ACM}
}
wikipedia.tsv
row_id user_id item_id timestamp
0 0 0 0
1 1 1 54
2 1 2 306
3 2 3 479
......
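The layout above can be read with Python's standard library. This is a minimal sketch, assuming the columns are tab-separated (as the `.tsv` extension suggests) and that the first line is the header shown above. The names `Interaction` and `parse_interactions` are illustrative and not part of the dataset; the inline sample mirrors the rows listed above.

```python
import csv
import io
from typing import List, NamedTuple


class Interaction(NamedTuple):
    row_id: int
    user_id: int
    item_id: int
    timestamp: float


def parse_interactions(text: str) -> List[Interaction]:
    """Parse tab-separated interaction rows that start with a header line."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [
        Interaction(
            int(row["row_id"]),
            int(row["user_id"]),
            int(row["item_id"]),
            float(row["timestamp"]),
        )
        for row in reader
    ]


# Sample rows copied from the listing above (tab separators are an assumption).
SAMPLE = (
    "row_id\tuser_id\titem_id\ttimestamp\n"
    "0\t0\t0\t0\n"
    "1\t1\t1\t54\n"
    "2\t1\t2\t306\n"
    "3\t2\t3\t479\n"
)

interactions = parse_interactions(SAMPLE)
```

For the real file, the same function can be applied to the text returned by `open("wikipedia.tsv").read()`.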
original_id mapped_id
0 0
1 1
4 2
7 3
......
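The mapping table above can be turned into lookup dictionaries in both directions. This is a sketch under stated assumptions: the columns are whitespace-separated, both IDs are integers, and the helper name `build_id_maps` is hypothetical. Whether this table covers users or items is not stated here, so the code treats the IDs generically.

```python
def build_id_maps(text: str):
    """Return (original -> mapped, mapped -> original) dictionaries."""
    forward = {}
    for line in text.strip().splitlines()[1:]:  # skip the header line
        original_id, mapped_id = line.split()
        forward[int(original_id)] = int(mapped_id)
    # Invert the mapping so mapped IDs can be traced back to original IDs.
    backward = {mapped: original for original, mapped in forward.items()}
    return forward, backward


# Sample rows copied from the listing above.
SAMPLE_MAPPING = """original_id mapped_id
0 0
1 1
4 2
7 3
"""

to_mapped, to_original = build_id_maps(SAMPLE_MAPPING)
```

In the sample, the mapped IDs are contiguous (0 to 3) even though the original IDs are not, so mapped IDs can be used directly as array indices.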
row_id edge_feature_0 edge_feature_1
0 0.5 1.0
1 -0.5 1.0
......
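Edge features can be attached to interactions through `row_id`. The sketch below assumes that `row_id` here refers to the same `row_id` as in `wikipedia.tsv` (suggested by the shared column name, but not stated explicitly) and that columns are whitespace-separated. The helper `parse_edge_features` and the tuple layout of `sample_interactions` are illustrative.

```python
def parse_edge_features(text: str) -> dict:
    """Map each row_id to its list of float edge features."""
    features = {}
    for line in text.strip().splitlines()[1:]:  # skip the header line
        row_id, *values = line.split()
        features[int(row_id)] = [float(v) for v in values]
    return features


# Sample rows copied from the listing above.
SAMPLE_FEATURES = """row_id edge_feature_0 edge_feature_1
0 0.5 1.0
1 -0.5 1.0
"""

# First two interactions as (row_id, user_id, item_id, timestamp) tuples.
sample_interactions = [(0, 0, 0, 0.0), (1, 1, 1, 54.0)]

edge_features = parse_edge_features(SAMPLE_FEATURES)

# Join on row_id: each interaction gains its feature vector.
joined = [
    (user_id, item_id, timestamp, edge_features[row_id])
    for row_id, user_id, item_id, timestamp in sample_interactions
]
```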
user_id timestamp state_label
0 0.000 0
1 36.000 0
1 77.000 1
......
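Because a user can appear with different labels at different timestamps (user 1 above changes from 0 to 1), a lookup by user and time is often convenient. The sketch below returns the most recent label recorded at or before a given time; that "latest label so far" interpretation is an assumption, as are the helper names `build_label_index` and `label_at`.

```python
import bisect
from collections import defaultdict


def build_label_index(rows):
    """Group (user_id, timestamp, state_label) rows per user, sorted by time."""
    index = defaultdict(lambda: ([], []))
    for user_id, timestamp, label in sorted(rows, key=lambda r: (r[0], r[1])):
        times, labels = index[user_id]
        times.append(timestamp)
        labels.append(label)
    return index


def label_at(index, user_id, timestamp):
    """Return the latest label at or before `timestamp`, or None if there is none."""
    entry = index.get(user_id)
    if entry is None:
        return None
    times, labels = entry
    pos = bisect.bisect_right(times, timestamp)
    return labels[pos - 1] if pos else None


# Sample rows copied from the listing above.
SAMPLE_LABELS = [(0, 0.0, 0), (1, 36.0, 0), (1, 77.0, 1)]

label_index = build_label_index(SAMPLE_LABELS)
```

With the sample rows, `label_at(label_index, 1, 50.0)` returns the label recorded at timestamp 36, and a query before a user's first record returns `None`.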
Sejoon Oh, soh337@gatech.edu, Georgia Institute of Technology
Srijan Kumar, srijan@gatech.edu, Georgia Institute of Technology