Abstract
We consider extractive summarization within a cluster of related texts (multidocument summarization). Unlike in single-document summarization, redundancy is a particular concern because sentences across related documents might convey overlapping information. Sentence extraction in this setting is therefore difficult: one must determine which pieces of information are relevant while avoiding unnecessary repetition. To address this problem, we propose Policy Blending with maximal marginal relevance and Reinforcement Learning (PoBRL), a novel reinforcement learning-based method for multidocument summarization. PoBRL jointly optimizes over the following objectives necessary for a high-quality summary: importance, relevance, and length. Our strategy decouples this multiobjective optimization into different subproblems that can be solved individually by reinforcement learning. Utilizing PoBRL, we then blend the learned policies to produce a summary that is a concise and complete representation of the original input. Our empirical analysis shows high performance on several multidocument datasets. Human evaluation also shows that our method produces high-quality output.
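For readers unfamiliar with maximal marginal relevance (MMR), the relevance-versus-redundancy trade-off that the abstract refers to can be illustrated with the standard greedy MMR selection rule. The sketch below is not the PoBRL method and is not taken from the paper. It assumes bag-of-words cosine similarity, and the names `cosine`, `mmr_select`, `query`, `k`, and `lam` are hypothetical choices made for illustration only.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(sentences, query, k=2, lam=0.7):
    """Greedily pick up to k sentence indices by maximal marginal relevance.

    lam weights relevance to the query; (1 - lam) penalizes similarity
    to sentences that were already selected. Both are illustrative
    assumptions, not settings from the paper.
    """
    vecs = [Counter(s.lower().split()) for s in sentences]
    q = Counter(query.lower().split())
    selected = []
    candidates = list(range(len(sentences)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(vecs[i], q)
            # Redundancy is the highest similarity to any chosen sentence.
            redundancy = max(
                (cosine(vecs[i], vecs[j]) for j in selected), default=0.0
            )
            return lam * relevance - (1 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


if __name__ == "__main__":
    docs = [
        "the storm hit the coast",
        "the storm hit the coast",  # duplicate from a second document
        "rescue teams arrived on monday",
    ]
    print(mmr_select(docs, "storm coast rescue", k=2, lam=0.5))
```

With a balanced `lam`, the duplicated sentence is passed over in favor of the less relevant but non-redundant one. Setting `lam=1.0` removes the redundancy penalty, so the selection reduces to plain relevance ranking.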
Original language | English (US) |
---|---|
Pages (from-to) | 416-427 |
Number of pages | 12 |
Journal | IEEE Transactions on Artificial Intelligence |
Volume | 4 |
Issue number | 3 |
State | Published - Jun 1 2023 |
Externally published | Yes |
All Science Journal Classification (ASJC) codes
- Artificial Intelligence
- Computer Science Applications
Keywords
- Artificial intelligence
- deep learning
- deep reinforcement learning
- document summarization
- machine learning
- natural language processing (NLP)
- reinforcement learning (RL)