Location
https://www.kennesaw.edu/ccse/events/computing-showcase/sp25-cday-program.php
Document Type
Event
Start Date
15-4-2025 4:00 PM
Description
In recent years, the rapid growth of social media platforms has led to information overload. As a result, the ability to compress long, complex texts into short, precise summaries is essential, especially in online discussions and comment sections. Summarizing such content is difficult due to inconsistent sentence structure, slang, abbreviations, and the lack of formal grammar. State-of-the-art models such as BART and PEGASUS have shown promising results, but their performance on informal datasets remains lower than on structured-text benchmarks. To address these challenges, we fine-tune BART and PEGASUS on the Reddit TIFU dataset, leveraging their transformer-based architectures to improve abstractive summarization of informal text. Our contribution lies in adapting state-of-the-art summarization models specifically for informal, user-generated discussions, making summarization more effective for online platforms. Our fine-tuned model achieves a 6.6% improvement in ROUGE-L over existing summarization models, demonstrating its effectiveness in generating concise and coherent summaries of Reddit discussions.
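The reported gain is measured with ROUGE-L, which scores a candidate summary by the length of its longest common subsequence (LCS) with a reference summary. As an illustration of the metric (not the project's actual evaluation code, which would typically use a library such as Hugging Face `evaluate`), here is a minimal stdlib-only sketch of an LCS-based ROUGE-L F1:

```python
def lcs_len(a: list[str], b: list[str]) -> int:
    """Length of the longest common subsequence of two token lists (DP)."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l_f1(candidate: str, reference: str) -> float:
    """ROUGE-L F1 with whitespace tokenization and equal precision/recall weighting."""
    c, r = candidate.lower().split(), reference.lower().split()
    lcs = lcs_len(c, r)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(c), lcs / len(r)
    return 2 * precision * recall / (precision + recall)

# Example: a near-verbatim candidate scores close to 1.0
score = rouge_l_f1("the cat sat on mat", "the cat sat on the mat")
```

In practice, published ROUGE-L scores also involve stemming and sentence-level LCS variants, so library implementations should be preferred for comparable numbers.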
Included in
GRM-060 Abstractive Summarization of Informal Text: Fine-Tuning Transformers on Reddit Discussions