Digitizing and structuring over 2 million pages of cultural heritage and historical content

The National Digital Newspaper Program (NDNP), a partnership between the National Endowment for the Humanities and the Library of Congress, is a long-term effort to develop an internet-based, searchable database of U.S. newspapers with descriptive information and select digitization of historic pages. 

DDD worked with the Library of Congress and more than 10 state and local institutions to digitize more than 2 million pages of historic newspaper archives held in their collections. DDD encoded the text as METS/ALTO XML files with article segmentation and descriptive metadata, which supports richer browsing and searching. With these files, the Library of Congress was able to create a digital archive to enhance the study of American history.
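
To give a sense of what ALTO-encoded text looks like to a downstream consumer, here is a minimal Python sketch that reads the recognized words out of a tiny ALTO fragment. The sample XML, the `extract_lines` helper, and the choice of the ALTO v2 namespace are illustrative assumptions, not DDD's actual output or tooling; real NDNP files are far larger and are paired with METS metadata.

```python
import xml.etree.ElementTree as ET

# Assumption: ALTO v2 namespace. Real files may declare a different ALTO version.
ALTO_NS = {"alto": "http://www.loc.gov/standards/alto/ns-v2#"}

# Hypothetical, hand-written sample; not taken from any real newspaper page.
SAMPLE_ALTO = """
<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout>
    <Page ID="P1">
      <PrintSpace>
        <TextBlock ID="TB1">
          <TextLine>
            <String CONTENT="The" HPOS="100" VPOS="200" WIDTH="40" HEIGHT="20"/>
            <SP/>
            <String CONTENT="Daily" HPOS="150" VPOS="200" WIDTH="60" HEIGHT="20"/>
          </TextLine>
          <TextLine>
            <String CONTENT="News" HPOS="100" VPOS="230" WIDTH="55" HEIGHT="20"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>
"""


def extract_lines(alto_xml: str) -> list[str]:
    """Return the text of each TextLine, joining its String elements with spaces."""
    root = ET.fromstring(alto_xml)
    lines = []
    for text_line in root.iterfind(".//alto:TextLine", ALTO_NS):
        # Each recognized word is a String element; its text is in CONTENT.
        words = [s.get("CONTENT", "") for s in text_line.findall("alto:String", ALTO_NS)]
        lines.append(" ".join(words))
    return lines


if __name__ == "__main__":
    for line in extract_lines(SAMPLE_ALTO):
        print(line)
```

Because each `String` element also carries position attributes such as `HPOS` and `VPOS`, a search interface can highlight matched words directly on the page image.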