News Articles with Story-level Alignment (BIGNEWS and BIGNEWSALIGN) (First Release, 2022)
3.7 million news articles from 11 media outlets with different ideological leanings. 1.1 million stories, with each story cluster containing articles from different media outlets.
Efficient Attentions for Long Document Summarization
Luyang Huang, Shuyang Cao, Nikolaus Parulian, Heng Ji, and Lu Wang
Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2021.
NLG with Planning for Content and Style (First Release, 2019)
NLG datasets for arguments and Wikipedia articles.
In Plain Sight: Media Bias through the Lens of Factual Reporting
Lisa Fan, Marshall White, Eva Sharma, Ruisi Su, Prafulla Kumar Choubey, Ruihong Huang, and Lu Wang
Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), short paper, 2019.
Reddit CMV Argument Corpus (a larger collection) and Arguments from News Media (First Release, 2019)
CMV arguments, with relevant arguments collected from mainstream media of different ideological leanings.
Argument Mining for Understanding Peer Reviews
Xinyu Hua, Mitko Nikolov, Nikhil Badugu, and Lu Wang
Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), short paper, 2019.
Reddit CMV Argument Corpus (First Release, 2018)
Arguments and counter-arguments related to politics and policy, collected from reddit.com/r/changemyview.
Socially-Informed Timeline Generation Corpus (First Release, 2015)
New York Times, CNN, and BBC news articles and user comments on four major events that happened in 2014.
New York Times news articles and user comments from 2013.
Socially-Informed Timeline Generation for Complex Events
Lu Wang, Claire Cardie, and Galen Marchetti
Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2015.
Wikipedia Disputed Discussion Corpus (First Release, 2016)
Dispute and non-dispute discussions from Wikipedia talkpages.