Scheme 00, for single images with one metadata (xml) record
Scheme 01, for multi-page images with one metadata (xml) record
Scheme 02, for non-transcribed text (such as a letter), no transcription needed, one or more pages per metadata (xml) record
Scheme 03, for digitally transcribed text, one metadata (xml) record
Scheme 04, for single- or multi-page text requiring transcription, one metadata (xml) record
If the digital object you have in front of you does NOT fit any of these descriptions, categorize it as “other” and describe it in detail. We need as much information as possible about this object to be able to develop support for web delivery at a later date.
Main explanation of the filenaming scheme being used by Volunteer Voices as well as other DLC projects.
All sections of numbers are separated by an underscore. The extension is appended in lowercase (.tif for TIFFs, .xml for XML).
The first set of numbers (4 digits) identifies the institution. This corresponds to the institution’s identifying number in the vvadmin database [admindb].
The second set of numbers (6 digits) identifies the collection within that institution. This corresponds to the collection’s identifying number in the vvadmin database. Collection numbers are only unique within an institution, so different institutions may each have collections numbered 001, 002, 003, etc. [select id from coll where name="Whittaker's Confederate Uniforms Collection"]
The third set of numbers (6 digits) is the item number. Together, the first three sets of numbers are unique to this digital object across all digital objects in Volunteer Voices.
For example, the following files all belong to the 432nd item (or digital object) in collection 12 at institution 123. They are the metadata file and the first 3 pages of a text document:
0123_000012_000432_0000.xml
0123_000012_000432_0001.tif
0123_000012_000432_0002.tif
0123_000012_000432_0003.tif
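The layout shown in these filenames can be checked mechanically. What follows is a minimal Python sketch, assuming the 4-6-6-4 digit layout seen in the example and only the two extensions mentioned on this page. The function name `parse_vv_filename` and the field names are illustrative and are not part of any existing Volunteer Voices tooling.

```python
import re

# Assumed layout: institution (4 digits), collection (6), item (6),
# sequence (4), then a lowercase extension. Only .tif and .xml are
# mentioned on this page, so only those are accepted here.
VV_NAME = re.compile(
    r"(?P<institution>\d{4})_(?P<collection>\d{6})_(?P<item>\d{6})"
    r"_(?P<sequence>\d{4})\.(?P<ext>tif|xml)"
)

def parse_vv_filename(name):
    """Split a filename into its parts, or return None if it does not match."""
    m = VV_NAME.fullmatch(name)
    if m is None:
        return None
    # Convert the numeric groups to integers; keep the extension as text.
    parts = {k: int(v) for k, v in m.groupdict().items() if k != "ext"}
    parts["ext"] = m.group("ext")
    return parts

print(parse_vv_filename("0123_000012_000432_0001.tif"))
```

A name that does not follow this layout, or that uses an uppercase extension, yields `None`, which makes misnamed files easy to flag before they are loaded.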
Example: 0123_012_02_0432_0001.tif is the part of the non-transcribed text object that is the 432nd item in the 12th collection from institution #123.
The remainder of the filename (everything after the item number) will be created by the content people and will provide sufficient information (via the file-naming scheme adopted) for us to reconstruct the object and display it correctly. (In the example above, the 5th set of numbers tells us which file is the metadata record and what the display sequence is for each tiff.) A scrapbook with sub-objects for some pages and both images and text will require different processing and display than a set of photos of the four sides of a Roman column, or a thesis with 4 movie files, 2 audio tapes and a spreadsheet. We cannot foresee all possible combinations, so each must be covered by a different scheme. Thus, look up the scheme referenced by the third set of numbers in order to make sense of the remainder of the filename. [select * from FileNameSchemes where id="03";]
Example: 0123_000012_000432_0001.tif is the first page of the non-transcribed text object that is the 432nd item in the 12th collection from institution #123.
NOTE: Content people will need to create this part of the filename (in this last example, the “_0001.tif”)! The rest should be generated by the database when creating the base metadata record, whose filename (in schemes 00, 01, 02, 03, and 04) will end in “_0000.xml”:
Example: 0123_000012_000432_0000.xml is the database-generated filename for the metadata record for the 432nd item (non-transcribed text) in the 12th collection from institution #123.
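As an illustration of how such a name could be generated, here is a short Python sketch. It assumes the three identifiers are available as plain integers; the function name `metadata_filename` is hypothetical and does not describe how the vvadmin database actually produces the name.

```python
def metadata_filename(institution, collection, item):
    """Return the base metadata filename, which ends in _0000.xml."""
    # Zero-pad to the widths given on this page: 4, 6 and 6 digits.
    return f"{institution:04d}_{collection:06d}_{item:06d}_0000.xml"

print(metadata_filename(123, 12, 432))  # 0123_000012_000432_0000.xml
```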
For the scan of the first page, change the last four digits to 0001 and change the extension to “.tif”.
For the scan of the second page, change the last four digits to 0002 and change the extension to “.tif”, and so on. The first three parts of the filename (here, “0123_000012_000432_”) should match the first three parts of the metadata record filename (here, “0123_000012_000432_0000.xml”).
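The renaming steps for page scans can be sketched in Python as follows. This assumes the metadata filename ends in “_0000.xml” as described on this page; the helper name `page_filename` is hypothetical.

```python
def page_filename(metadata_name, page):
    """Derive the .tif filename for a given page from the metadata filename."""
    suffix = "_0000.xml"
    if not metadata_name.endswith(suffix):
        raise ValueError(f"not a metadata filename: {metadata_name!r}")
    if not 1 <= page <= 9999:
        raise ValueError("page number must be between 1 and 9999")
    # Keep the shared prefix, then swap in the page number and extension.
    prefix = metadata_name[: -len(suffix)]
    return f"{prefix}_{page:04d}.tif"

print(page_filename("0123_000012_000432_0000.xml", 2))  # 0123_000012_000432_0002.tif
```

Because the prefix is copied from the metadata filename rather than retyped, the page scans always match their metadata record.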