General Information

Abstract

This project aims to build text summarization systems that can understand and aggregate information from long documents, allowing users to explore document content through summaries generated in styles they prefer. The summarization tools will make long documents more accessible and comprehensible, easing knowledge acquisition for the general public. Researchers and practitioners can also use the tools to summarize long documents relevant to their work, and educators can incorporate them into their classes to bolster students' reading and writing skills. The project also broadens the investigator's efforts to engage young students in immersive research opportunities, allowing them to participate in the design and implementation of advanced summarization systems.

Keywords

Long document summarization, summarization evaluation, question generation, question-answer hierarchy

Funding Agency

NSF, Award Number: 2046016. Duration: July 1, 2021 - June 30, 2026.

People Involved

In addition to the PI, the following students work on the project:
  • Shuyang Cao
  • Xin Liu
  • Simin (Olivia) Fan

Publications

BOLT: Fast Energy-based Controlled Text Generation with Tunable Biases. Xin Liu, Muhammad Khalifa, and Lu Wang. Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), short paper, 2023.

Time-aware Prompting for Text Generation. Shuyang Cao and Lu Wang. Findings of the Conference on Empirical Methods in Natural Language Processing (Findings of EMNLP), 2022.

HIBRIDS: Attention with Hierarchical Biases for Structure-aware Long Document Summarization. Shuyang Cao and Lu Wang. Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), 2022.

Towards Process-Oriented, Modular, and Versatile Question Generation that Meets Educational Needs. Xu Wang, Simin Fan, Jessica Houghton, and Lu Wang. Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2022.

CLIFF: Contrastive Learning for Improving Faithfulness and Factuality in Abstractive Summarization. Shuyang Cao and Lu Wang. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021.

Controllable Summarization with Constrained Markov Decision Process. Hou Pong Chan, Lu Wang, and Irwin King. Transactions of the Association for Computational Linguistics (TACL), 2021.