Digitizing the Photothek’s Card Catalogues

Since 1902, the Photothek has used a handwritten and typed card index – arranged in three catalogues (‘artist’, ‘sites’ and ‘iconography’) – documenting acquisitions from 1897 until the switch to electronic records in 1993. Although this analogue system remains invaluable for scholars, a digitized version of the card index will make its metadata searchable in ways that match different research questions, for example on the provenance of the photographs.
Our project employs an AI-driven workflow that goes beyond simple transcription: it converts each card into machine-readable text while also identifying archival symbols and deciphering shorthand conventions. The system evaluates its own output by assigning confidence scores and flagging uncertain entries for human review. This integrated process will speed up cataloguing, enable full-text search across the entire collection and maintain high data quality.
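The review step can be pictured with a minimal Python sketch. It is an illustration only, not the project's actual implementation: the field names, the sample readings, the `FieldReading` class, the `needs_review` helper and the threshold of 0.85 are all assumptions made for this example.

```python
from dataclasses import dataclass

# Hypothetical cut-off; the project's real threshold is not stated.
REVIEW_THRESHOLD = 0.85


@dataclass
class FieldReading:
    """One transcribed field of a catalogue card (illustrative structure)."""
    name: str          # e.g. "artist", "site", "inventory_no"
    text: str          # machine-readable transcription
    confidence: float  # model's self-assessed score in [0, 1]


def needs_review(fields, threshold=REVIEW_THRESHOLD):
    """Return the names of fields whose confidence falls below the threshold."""
    return [f.name for f in fields if f.confidence < threshold]


# Invented sample card, for demonstration only.
card = [
    FieldReading("artist", "Giotto", 0.97),
    FieldReading("site", "Padova, Cappella degli Scrovegni", 0.91),
    FieldReading("inventory_no", "12 3?4", 0.42),
]

flagged = needs_review(card)
print(flagged)
```

With these invented values only the inventory number falls below the threshold, so only that field would be passed to a human reviewer, while the confident readings go straight into the searchable dataset.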
By digitizing and richly annotating these cards, we will preserve them as artifacts of early archival practice, complete with their original classification and provenance notes. The resulting dataset will underpin the DH Lab’s long-term goal of a unified, searchable catalogue that bridges analogue and digital holdings, fostering deeper discovery and cross-referencing.