Digitization Standards and Tips
Documentation from the Digital Library Center, University of Tennessee Libraries
http://diglib.lib.utk.edu/dlc/techdocs/UT_DigitizationStandards2004.pdf
Scanning Standards used for Volunteer Voices (the basics)
- Scan items at 400 ppi on the long edge (refer to the UT Digitization Standards, linked above, for specifics based on object type)
- Save as a .tif file
- Use filename as detailed on Filenaming Schemes pages
Image sizes/resolution for web delivery
- Primary Image
- Thumbnail Image
- Mini-thumbs (for cross-format searching)
- OCR’d Text
- Name each derivative the same as the .tif page from which it was created, but with a different extension: .jpg for each of the images (put in directories JPEG, thumbs, minithumbs) and .txt for the OCR’d text (put in directory ocr).
- Create these subdirectories in the directory of the parent object, such as 00, 01, etc.
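The naming rule above can be sketched in a few lines of Python. This is only an illustration: the function name derivative_paths and the sample filename example0001.tif are hypothetical, and it assumes each master .tif sits directly inside its parent object directory (00, 01, etc.).

```python
from pathlib import PurePosixPath

# Subdirectory names are taken from the rule above. Assumption: each master
# .tif sits directly inside its parent object directory (00, 01, ...).
IMAGE_DIRS = ("JPEG", "thumbs", "minithumbs")
OCR_DIR = "ocr"


def derivative_paths(tif_path):
    """Map one master .tif path to the paths of its derivatives.

    Hypothetical helper: same base name as the .tif, .jpg for the image
    derivatives and .txt for the OCR'd text, each in its own subdirectory.
    """
    tif = PurePosixPath(tif_path)
    parent = tif.parent
    paths = {name: parent / name / (tif.stem + ".jpg") for name in IMAGE_DIRS}
    paths[OCR_DIR] = parent / OCR_DIR / (tif.stem + ".txt")
    return paths


# "example0001.tif" is a made-up filename, not one from the Filenaming Schemes pages.
for kind, path in derivative_paths("00/example0001.tif").items():
    print(kind, path)
```

The function only computes names; it does not create directories or convert images, so it can be used to check a delivery against the expected layout.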
Images - required derivatives to deliver for web production
- 00: (image files, one per metadata record) jpegs, thumbs, minithumbs
- 01: (image files, several per metadata record) jpegs, thumbs, minithumbs only for the first page in each sequence (files ending in 0001.tif)
- 02: (text files to be OCR’d) jpegs, OCR text files
- 03: (text files which have been transcribed: there should be .txt files here already for each page) jpegs
- 04: (text files requiring transcription; the delivery method is undecided, so until it is, provide everything that may be needed) jpegs, thumbs, minithumbs only for the first page in each sequence (files ending in 0001.tif)
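The per-directory requirements above could be written as a lookup, sketched here under one stated assumption: for 01 and 04, "only for the first page" is read as applying to thumbs and minithumbs, while every page still gets a jpeg. If the intended reading differs, the two tables need adjusting. The function name required_derivatives and the sample filenames are hypothetical.

```python
# Derivatives every page in a directory needs (names match the subdirectories
# JPEG, thumbs, minithumbs, ocr).
ALL_PAGES = {
    "00": {"JPEG", "thumbs", "minithumbs"},
    "01": {"JPEG"},
    "02": {"JPEG", "ocr"},
    "03": {"JPEG"},  # transcribed .txt files are expected to exist already
    "04": {"JPEG"},
}

# Assumption: in 01 and 04, thumbs and minithumbs are made only for the
# first page of each sequence (files ending in 0001.tif).
FIRST_PAGE_ONLY = {
    "01": {"thumbs", "minithumbs"},
    "04": {"thumbs", "minithumbs"},
}


def required_derivatives(directory, tif_name):
    """Return the set of derivative types to produce for one .tif page."""
    needed = set(ALL_PAGES[directory])
    if tif_name.endswith("0001.tif"):
        needed |= FIRST_PAGE_ONLY.get(directory, set())
    return needed


# Made-up filenames for illustration only.
print(sorted(required_derivatives("01", "item0001.tif")))
print(sorted(required_derivatives("01", "item0002.tif")))
```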