A Net Archive Corpus for Education, Learning and Training
Purpose: The aim of the project was to establish a corpus from the national Danish web archive, Netarkivet, that students could use to learn about the archived web.
The Danish Net Archive contains materials that have already been published, but access is granted only to scholars and researchers.
The main reason is that the archive cannot be guaranteed to be free of sensitive personal data as defined in Danish data protection legislation.
To make the archive usable by students, the project set out to establish an extract that would not contain sensitive data of this kind, allowing students to learn the methods needed to study the archived materials.
A training corpus was a fundamental precondition for the recruitment of new PhD projects and new research projects, and thereby also for the further development of digital methods within the field.
For a number of reasons, the corpus was never established.
Niels Ole Finnemann, Professor, Royal School of Library and Information Science
Ulrich Have, IT-Architect, NetLab/DigHumLab, Aarhus University
in collaboration with the State Library, Aarhus.