As part of a wide range of projects relating to research data, digital copies, archiving and publications – both internal and in cooperation with researchers and external institutions – we are constantly gaining new experience that we can incorporate into training and future projects.
The COVID-19 pandemic has created an unprecedented urgency to obtain reliable information about treatment options. Against this background, an international research team under the leadership of the Department of Clinical Research (DKF) Basel set itself the goal of collecting and processing such information.
Specifically, this means that evidence from various sources and databases can be viewed and COVID-19-relevant publications can be summarised briefly. The study results are then critically evaluated to determine how far they can be trusted. This sets the project apart from most others, where evidence is often collected and made available without saying anything about the reliability of the studies.
The University Medical Library’s information specialists support the COVID evidence team as project partners with their expertise in systematic information searching. The database is updated daily and incorporates information about all planned, current and completed studies on the prevention, diagnosis, treatment and clinical management of COVID-19. It is intended to be a reliable starting point for making treatment decisions, planning new studies or developing practical guidelines and systematic reviews.
The new database can be viewed via this link: https://covid-evidence.org/.
With an external partner, we analysed the contents of an archive which, apart from the usual analogue publications and documents, contained a comprehensive collection of various data storage media from the 1980s and 1990s. Approximately 700 data carriers (floppy disks, Iomega Zip, Iomega Jaz, SyQuest, external hard drives) document the digital creations of a musician who started to use specialised software to compose and synthesise music early on in the emerging digitisation period. Quite a few of these data carriers could only be used with specialised synthesisers or were, unfortunately, no longer readable because of their age. As part of the analysis and preparation of the data for long-term archiving, we examined data consistency, the readability of the media on various systems, efficient ways to evaluate and inventory the data as a basis for scientific appraisal, and the options for migrating file formats and preparing the data for reuse while preserving authenticity and integrity.
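One common way to support consistency and integrity checks of this kind is a checksum manifest: each file copied from a data carrier is hashed once, and the stored hashes are compared again later. The following is a minimal sketch of that idea, not a description of the workflow actually used; the function names and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large disk images do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map every file below root (relative POSIX path) to its SHA-256 hash."""
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return files that were added, removed or altered since the manifest was built."""
    current = build_manifest(root)
    return sorted(
        name
        for name in manifest.keys() | current.keys()
        if manifest.get(name) != current.get(name)
    )
```

An empty result from `changed_files` indicates that the stored copy still matches the state recorded at ingest; any listed path would need to be inspected by hand.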
Since the early 2000s we have been photographing and reproducing manuscripts, early prints, pictures, maps and other analogue historical holdings and publishing these digital items on various platforms for free reuse. However, these digital reproductions must also remain usable in the decades to come. For us as a memory institution, this principle also applies to the digital publications of our university’s researchers and to the born-digital data in our archive holdings. To guarantee this long-term accessibility, we are developing a modular infrastructure that provides this security through redundant data storage, regular copying and planned, requirements-based file migration according to international standards and recommendations. In close cooperation with various partners in Basel and throughout Switzerland, we are working intensively on the structure and characteristics of various file formats, on optimal data storage strategies for distributed storage systems, and on efficient, (partially) automated workflows for data ingestion and migration.
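Planning file migration presupposes knowing which formats are actually held. A simple illustration of format identification is to compare the first bytes of a file with well-known signatures ("magic numbers"). The sketch below is only an example of the principle, covering a handful of formats; the `SIGNATURES` table and the `identify` function are hypothetical names, and production workflows rely on far more complete signature registries.

```python
# Well-known leading byte sequences for a few common formats.
SIGNATURES = {
    b"%PDF-": "PDF",
    b"\x89PNG\r\n\x1a\n": "PNG",
    b"\xff\xd8\xff": "JPEG",
    b"II*\x00": "TIFF (little-endian)",
    b"MM\x00*": "TIFF (big-endian)",
}


def identify(header: bytes) -> str:
    """Return a format name for the given leading bytes, or 'unknown'."""
    for magic, name in SIGNATURES.items():
        if header.startswith(magic):
            return name
    return "unknown"


def identify_file(path: str) -> str:
    """Read only the first bytes of a file and identify its format."""
    with open(path, "rb") as handle:
        return identify(handle.read(16))
```

Files reported as "unknown" are the ones that would need closer manual analysis before a migration path can be chosen.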
The impresso project is an SNSF Sinergia research project that is developing novel ways of accessing newspaper archives using text mining methods. As associated partners, the University Library and the Swiss Economics Archive provide the project with newspapers and journals relevant to economic history. For this purpose, an efficient mass digitisation process was developed and AI-based recognition of newspaper issues was set up.
Instead of delivering the data to the project on a hard disk, we looked for a solution that would keep the newspaper holdings accessible even after completion of the project, in the spirit of FAIR data. We published the newspapers via a standardised IIIF interface. The impresso project was able to extract the data via this interface and then process them further in its own system.
The prototype can be found here:
- IIIF presentation API: impresso collection (JSON)
- Display of images and metadata in the Universal Viewer
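To illustrate how a consumer such as a research project can pull data from an IIIF interface, the following sketch walks a manifest and builds image URLs. It assumes the manifest follows the structure of the IIIF Presentation API version 2 (sequences, canvases, images) and that each image resource advertises an IIIF Image API service; whether the prototype uses exactly this version is not stated here. The function name and the `example.org` URLs are hypothetical.

```python
from typing import Iterator


def image_urls(manifest: dict, size: str = "full") -> Iterator[str]:
    """Yield one IIIF Image API URL per page image found in the manifest."""
    for sequence in manifest.get("sequences", []):
        for canvas in sequence.get("canvases", []):
            for image in canvas.get("images", []):
                service = image.get("resource", {}).get("service", {})
                # Some manifests list several services; take the first one.
                if isinstance(service, list):
                    service = service[0] if service else {}
                base = service.get("@id")
                if base:
                    # Image API pattern: {base}/{region}/{size}/{rotation}/{quality}.{format}
                    yield f"{base.rstrip('/')}/full/{size}/0/default.jpg"


# Hypothetical, heavily shortened manifest for demonstration purposes.
sample_manifest = {
    "sequences": [{
        "canvases": [{
            "images": [{
                "resource": {"service": {"@id": "https://example.org/iiif/page1"}}
            }]
        }]
    }]
}

for url in image_urls(sample_manifest):
    print(url)
```

In practice the manifest would be fetched as JSON from the published presentation API endpoint and passed to the same function.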
For the SNSF research project “Printed Markets. The Basel Avisblatt 1729-1845” of Prof. Burghartz, we digitised the first Basel newspaper, the Avis-Blatt, in full, comprising 116 volumes and 50 000 pages, and took its special requirements into consideration.
A year before the actual project started, we developed a prototype in collaboration with the professorship and the Research Navigator in order to test the digital methods. The prototype included digitising a sample and OCR-processing the Gothic script to create training data. It turned out that for these old “Fraktur” (Gothic) scripts, we achieved far better full-text results (98%) when we did the additional work of editing in Transkribus.
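How the 98% figure was measured is not specified above. One widely used measure of full-text quality is character accuracy, computed from the edit distance between a manually corrected reference transcription and the OCR output. The sketch below shows that calculation under this assumption; the function names are illustrative only.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def character_accuracy(reference: str, ocr_output: str) -> float:
    """Share of reference characters recognised correctly, between 0.0 and 1.0."""
    if not reference:
        return 1.0 if not ocr_output else 0.0
    errors = edit_distance(reference, ocr_output)
    return max(0.0, 1.0 - errors / len(reference))
```

Applied to a sample of corrected pages, such a score makes it possible to compare recognition models before and after training on the same material.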
If digital facsimiles take on the status of research data, this affects the quality requirements for digitisation. In contrast to the usual item-specific digitisation, which reproduces inconsistencies in the original, the aim of this project was to achieve a complete transmission of all the information. To achieve this, we had to obtain additional copies of the newspaper from other archives and carry out a multi-stage check for quality and completeness. This level of quality was achieved through consistent cooperation. Project and library staff worked with the Goobi workflow software, which allowed quality assurance to be integrated and handled efficiently. To ensure seamless processing of the data, the University Library transmitted the volumes directly to the Transkribus platform, which reduced waiting times.