Multi Article Summarization and Visualization

Sreekar Anne
2 min read · Nov 17, 2020

This blog explains my role and the work I did in my 4th-year Capstone project.

The Capstone project is an important test of the knowledge one has gained over the course of an engineering degree. My project is Multi Article Summarization and Visualisation, where we build a news aggregator that can summarise content without any manual intervention.

Introduction

The way people consume news has evolved continuously, from radio to television. Over the next five years, news consumption through television is also likely to decline as people move from conventional media like television and newspapers to the internet. Readers also increasingly prefer short content or a gist over a detailed article. So there is a need for good news and content aggregators in the market.

My Approach

The Indian market currently has a few news aggregator apps such as Inshorts and Dailyhunt. They provide and summarise news, but everything is done manually with the help of content writers. Relying on content writers adds cost and limits the amount of content that can be covered. Our main aim is to build software for existing news aggregators that summarises text automatically. We extract articles on the day's topics from different news sources, summarise the text collectively (using nltk packages), find the important words in the summarised text with named entity recognition (using spaCy packages), and then generate images from these words, which are displayed to the user along with the article.
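To make the "summarise collectively" step concrete, here is a minimal sketch of frequency-based extractive summarisation over several articles. It is not the project's actual code: it uses only the Python standard library instead of nltk so that it runs without any downloads, and the stopword list, the function names and the scoring rule are all assumptions made for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption); nltk.corpus.stopwords
# would give a much fuller one.
STOPWORDS = {
    "a", "an", "the", "and", "or", "of", "to", "in", "on", "for", "is",
    "are", "was", "were", "it", "that", "this", "with", "as", "by", "at",
    "from", "has", "have", "will",
}


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase words with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOPWORDS]


def summarize(articles, max_sentences=3):
    """Pick the highest-scoring sentences across all articles (hypothetical helper)."""
    # Pool the sentences of every article so they are scored together.
    sentences = []
    for article in articles:
        sentences.extend(split_sentences(article))

    # Word frequencies over the whole pool: words repeated across
    # sources count as more important.
    freq = Counter(w for s in sentences for w in tokenize(s))
    if not freq:
        return ""

    def score(sentence):
        words = tokenize(sentence)
        # Average frequency, so long sentences are not favoured automatically.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    # Restore original order so the summary reads naturally.
    chosen = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in chosen)


if __name__ == "__main__":
    articles = [
        "The rocket launch succeeded. Crowds cheered loudly.",
        "The rocket launch was delayed earlier. Weather improved later.",
    ]
    print(summarize(articles, max_sentences=2))
```

Sentences that share words with other sources score higher, which is why pooling articles on the same topic before scoring helps the common facts surface in the summary.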

Learning Outcomes

My part in the project covered abstractive summarization and named entity recognition, both topics in natural language processing. I was familiar with the concepts but had not worked on natural language processing before, so it was a bit challenging at first; with the support and guidance of our faculty, the project eventually took good shape. We built a model and trained it on the Inshorts dataset. The model summarizes the text given to it. Once the text is summarized, it is passed through named entity recognition code, which recognizes the important words in the sentences, and images related to the text are then generated automatically.
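As a rough illustration of the named entity recognition step, the sketch below pulls the "important words" out of a summary using spaCy's entity annotations. The set of labels to keep, the function names and the choice of the en_core_web_sm model are assumptions for this example, not details taken from the project.

```python
from collections import OrderedDict

# Entity labels treated as "important" here; this selection is an assumption.
KEEP_LABELS = {"PERSON", "ORG", "GPE", "EVENT"}


def important_entities(doc, keep_labels=KEEP_LABELS):
    """Return unique (text, label) pairs from a spaCy-style Doc, in order of first appearance."""
    seen = OrderedDict()
    for ent in doc.ents:
        if ent.label_ in keep_labels:
            # setdefault keeps the first label seen for a repeated entity.
            seen.setdefault(ent.text, ent.label_)
    return list(seen.items())


def keywords_from_summary(summary, model_name="en_core_web_sm"):
    """Run spaCy over a summary and return its important entities (hypothetical helper)."""
    import spacy  # imported lazily so important_entities works without spaCy installed

    nlp = spacy.load(model_name)  # the model must be downloaded beforehand
    return important_entities(nlp(summary))
```

The returned words could then serve as search terms or prompts for the image step; how the project actually generated images from them is not described here.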

Conclusion

Working on the capstone project was a wonderful experience altogether and a great learning curve in my education, one that will be helpful throughout my career. I would like to thank Dr. Deepak Garg and Dr. Indrajeet Gupta for their continuous support and guidance.

If you wish to connect, let's connect on LinkedIn.
