The person who processes incoming tiffs to create derivatives for delivery will run OCR on these tiff files to create one text file for each tiff.
Not sure whether to transcribe? See the README.
The following files belong to the same object.
a single-page letter (clear handwriting or typed):
0023_000003_000012_0000.xml
0023_000003_000012_0001.tif
0023_000003_000012_0001.txt
(this last is the text file we create with OCR software from the tiff)
OR:
a five-page letter (clear handwriting or typed):
0023_000003_000012_0000.xml
0023_000003_000012_0001.tif
0023_000003_000012_0002.tif
0023_000003_000012_0003.tif
0023_000003_000012_0004.tif
0023_000003_000012_0005.tif
the corresponding OCR text files will be created after the tiffs get to the DLC:
0023_000003_000012_0001.txt
0023_000003_000012_0002.txt
0023_000003_000012_0003.txt
0023_000003_000012_0004.txt
0023_000003_000012_0005.txt
0023 means that this is institution 23
000003 means that this is their collection number 3 (other institutions may have a collection 3 also)
000012 is the item number: this is the 12th item in the series
0000 is the metadata record, with the .xml extension
0001 is the first digital object that applies to this metadata record; the number gives the object's position in the display sequence
(This may NOT be the same as the page number for the scanned image!)
0002 is then the second page in the sequence
0003 is the third page in the sequence, and so on
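The breakdown above can be sketched as a small parser. This is only an illustration, not an official DLC tool: the function name `parse_filename`, the `ParsedName` type, and the fixed field widths (4, 6, 6 and 4 digits) are assumptions inferred from the example filenames on this page.

```python
import re
from typing import NamedTuple

# Assumed layout, inferred from the examples on this page:
# institution(4)_collection(6)_item(6)_sequence(4).extension
FILENAME_RE = re.compile(
    r"^(?P<institution>\d{4})_(?P<collection>\d{6})_"
    r"(?P<item>\d{6})_(?P<sequence>\d{4})\.(?P<ext>[A-Za-z]+)$"
)


class ParsedName(NamedTuple):
    institution: int   # e.g. 23
    collection: int    # e.g. 3 (unique only within an institution)
    item: int          # e.g. 12
    sequence: int      # 0 = metadata record, 1..n = display order
    extension: str     # lowercased, without the dot
    is_metadata: bool  # True only for the 0000 .xml record


def parse_filename(name: str) -> ParsedName:
    """Split a filename into its numbered parts (hypothetical helper)."""
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"filename does not follow the scheme: {name!r}")
    sequence = int(match["sequence"])
    extension = match["ext"].lower()
    return ParsedName(
        institution=int(match["institution"]),
        collection=int(match["collection"]),
        item=int(match["item"]),
        sequence=sequence,
        extension=extension,
        is_metadata=(sequence == 0 and extension == "xml"),
    )
```

For example, `parse_filename("0023_000003_000012_0002.tif")` reports institution 23, collection 3, item 12 and sequence 2. The sequence is the display order, which may differ from the page number on the scanned image.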
The scanned pages and their corresponding metadata files thus do NOT have the same filename!
However, each tiff has a corresponding text file that DOES have the same filename.
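The two rules above (the metadata record always ends in 0000.xml, while the OCR text file shares the tiff's name) can be sketched as follows. The helper names `metadata_name` and `text_name` are hypothetical and only illustrate the naming relationship; they do not create or check any files.

```python
from pathlib import PurePath


def metadata_name(tiff_name: str) -> str:
    """Name of the metadata record for a scanned page (hypothetical helper).

    The sequence number at the end of the name is replaced by 0000
    and the extension becomes .xml.
    """
    stem = PurePath(tiff_name).stem          # e.g. 0023_000003_000012_0003
    prefix, _, _sequence = stem.rpartition("_")
    return f"{prefix}_0000.xml"


def text_name(tiff_name: str) -> str:
    """Name of the OCR text file: same name as the tiff, .txt extension."""
    return PurePath(tiff_name).with_suffix(".txt").name


pages = [
    "0023_000003_000012_0001.tif",
    "0023_000003_000012_0002.tif",
]
for page in pages:
    print(page, "->", text_name(page), "and", metadata_name(page))
```

Every page of the same object maps to the same metadata filename, while each page gets its own text file.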