Over the last few years, as digitization has gradually moved from an experimental, temporary activity toward one that is structural and continuous, mass digitization projects have increased in number. Newspapers appeal to a large audience but are in many cases inaccessible. It is therefore no surprise that many institutions are now deciding to digitize their newspaper collections. Digitization and web delivery make these collections available to a worldwide audience.
Contentra Technologies is one of the leading service providers in the digitization of archived newspapers. Contentra also specializes in digitizing books, journals, manuscripts, and similar materials. We partner with newspaper publishers and with national, university, and state libraries, creating customized solutions to suit each partner's specific needs. Contentra plays an active role in the archival digitization movement.
In order to make a newspaper available for searching on the Internet, the following processes take place:
- The microfilm copy or paper original is scanned
- Master and Web image files are generated
- Images are de-speckled, de-skewed, and cropped
- Metadata is assigned to each issue, page, and article to improve the searchability of the newspaper
- OCR software is run over high-resolution images to create searchable full text
- OCR text, images, and metadata are imported into a digital library software program
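
As a rough illustration of how the steps above fit together, the following Python sketch models a scanned page, runs placeholder cleanup and OCR steps, and builds a simple word index for searching. It is only a sketch under stated assumptions, not a description of Contentra's actual workflow or software: the `Page` fields, the function names, and the `ocr_engine` callable are all hypothetical, and real image cleanup and OCR would be done by dedicated tools.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    """One scanned newspaper page plus its metadata (all field names are illustrative)."""
    issue_date: str       # issue-level metadata, e.g. "1923-05-14"
    page_number: int      # page-level metadata
    master_image: str     # path to the high-resolution master file
    web_image: str        # path to the smaller web derivative
    ocr_text: str = ""    # filled in by the OCR step
    steps_done: list = field(default_factory=list)


def clean_image(page):
    # Placeholder for de-speckling, de-skewing, and cropping.
    page.steps_done.append("clean")
    return page


def run_ocr(page, ocr_engine):
    # ocr_engine is a hypothetical callable mapping an image path to text.
    page.ocr_text = ocr_engine(page.master_image)
    page.steps_done.append("ocr")
    return page


def build_index(pages):
    # Map each lowercased word to the (issue_date, page_number) pairs containing it.
    index = {}
    for page in pages:
        for raw in page.ocr_text.lower().split():
            word = raw.strip(".,;:!?\"'()")
            if word:
                index.setdefault(word, set()).add((page.issue_date, page.page_number))
    return index


def search(index, word):
    # Return matching pages in a stable, sorted order.
    return sorted(index.get(word.lower(), set()))
```

In this sketch, any function that turns an image path into text can be passed as `ocr_engine`, so the OCR step can be swapped out without changing the rest of the pipeline. The index stands in for the import into digital library software, which in practice would also store the images and the article-level metadata.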