While the majority of the material we digitize is in English, a significant number of items are not. The language makes no difference at the image capture and editing stages, but it can change the decisions that direct the rest of the process. Our OCR (optical character recognition) software is capable of recognizing a number of different languages, but it isn’t able to deal with handwritten characters in any language. This means we have to decide whether to describe an item or to provide a full translation. While a translation can offer much greater access to a resource, it may also mean coordinating with scholars or language experts outside the library. Our cataloging staff have a range of linguistic backgrounds and fluencies, but we cannot accommodate every language.
One of the first collections with non-English resources that we tackled was the Chinese-Canadian Collection, featuring materials from the Chinese Consolidated Benevolent Association. As this collection was scanned to be part of the Multicultural Canada collection, we wanted to offer the best possible descriptions of the materials. The first hurdle was the age of the documents; it meant that many were handwritten in traditional (non-simplified) characters. We found a scholar who could read them and was able to supply translations, then got to work on the rest of the collection descriptions.
The decision was made to supply both Chinese and English titles, and to sort by the Chinese title. One of our cataloging staff input the titles using simplified Chinese characters, created an English title, and ensured that the translations were input accurately and were matched to the correct items. The description was then built from the English title. The result is a set of primary documents that are very accessible and that represent an otherwise under-represented part of our local history.
For some documents, we are lucky enough to have a translation already on hand. That was the case with the Certificate of Nobility for Nicolas Abkhazi in the Abkhazi collection. The original is written in Georgian, but there is a typewritten translation which we uploaded as part of the digital object.
Sometimes, the decision is made not to translate or transcribe the content of the documents. This was the decision for items added to the László Hudec collection. Along with a number of blueprints and architectural records, there are letters, receipts, and other ephemera in English, Chinese, and German.
In short, each project is assessed independently, taking into consideration our resources (staff and time), the audience, and the intended use of the collection. Of course, we can always add more information to the records at a later date should we have more time or more staff, or should a researcher propose a project that would enrich any of our digital collections.