Improving Digital Library Support for Historic Newspaper Collections
| AUTHOR | Lin, Leo |
| PUBLISHER | LAP Lambert Academic Publishing (05/23/2011) |
| PRODUCT TYPE | Paperback |
Description
National and international initiatives are underway around the globe to digitise the vast treasure troves of historical artefacts they contain and make them available as digital libraries (DLs). The developed DLs are often constructed from facsimile pages with pre-existing metadata, such as historic newspapers stored on microfiche or generated from the non-destructive scanning of precious manuscripts. Access to the source documents is therefore limited to methods constructed from the metadata. Other projects look to introduce full-text indexing through the application of off-the-shelf commercial Optical Character Recognition (OCR) software. While this has greater potential for the end-user experience than the metadata-only versions, the approach currently taken is a best effort in the time available rather than a process informed by detailed analysis of the issues. In this thesis, we investigate whether a richer level of support and service can be achieved by more closely integrating image processing techniques with DL software. The thesis presents a variety of experiments, implemented within the recently published open-source OCR system (OCRopus)...
Product Details
| ISBN-13 | 9783844398854 |
| ISBN-10 | 3844398856 |
| Binding | Paperback or Softback (Trade Paperback (US)) |
| Content Language | English |
More Product Details
| Page Count | 116 |
| Carton Quantity | 68 |
| Product Dimensions | 6.00 x 0.28 x 9.00 inches |
| Weight | 0.40 pounds |
| Country of Origin | US |
Subject Information
BISAC Categories
Computers | Information Technology
Your Price
$62.84
