2025 archives; digital diagnostic imaging; newspapers; natural language processing; paradigms (social sciences); motivation (psychology); deep learning; storytelling; html (document markup language); libraries and archives; newspaper publishers; book stores and news dealers; news dealers and newsstands; book, periodical, and newspaper merchant wholesalers

Unlocking the Digitized Historical Newspaper Archive: Exploring Historical Insights with Deep Learning.

Wai-Yip Lum, Vincent and Kin-Fu Yip, Michael

This paper applies computer vision and machine/deep learning to historical newspapers, extracting their headlines and illustrations for storytelling. The work seeks to unlock the historical knowledge embedded in newspaper content while applying current methodological paradigms for research in the digital humanities (DH). We aimed to provide an alternative to the traditional search and browse interfaces by combining these DH tools with place- and time-based visualizations. Experimental results showed that our proposed methods, OCR (optical character recognition) with scraping and deep learning object detection models, can extract the textual and image content needed for more sophisticated analysis. Timeline and geodata visualization products were developed to support a comprehensive exploration of our historical newspaper data. The timeline-based tool spanned the period fro

Added 2026-04-21