The BnL and the “ARKs-in-the-Open” project
In 2001, the California Digital Library (CDL) introduced the ARK format. For many years, the CDL has served as an incubator for ARK infrastructure, maintaining a user registry and providing a “global resolver” service (http://n2t.net). In 2018, the CDL and DuraSpace (now LYRASIS) announced a collaboration aimed at building an open, international community around ARK: ARKs in the Open. 33 organizations on 4 continents have already expressed interest in and support for this project, among them the National Library of Luxembourg.
To achieve long-term sustainability, the ARK infrastructure must emerge from the CDL and mature in partnership with organizations and members of the community. In order to follow international standards and collaborate in parallel on their development, the BnL participates in the technical working group of the project through Roxana Maurer, who is responsible for digital preservation and the technical implementation of “persist.lu”. The technical working group oversees the development and maintenance of the specifications, software and servers that support the infrastructure of the ARKs-in-the-Open community.
Fields of application
The first objects to receive ARKs are documents digitized by the BnL and available for consultation on eluxemburgensia.lu. More than 100,000 ARKs were allocated to digitized objects: daily and weekly newspapers, manuscripts, books and posters.
A second round of mass allocation of ARKs began in early May 2020. A few thousand digital editions (most in PDF format) of several national daily and weekly newspapers were integrated into the long-term preservation system and received ARKs at the same time. The same holds true for the collections of the BnL web archive.
Bibliographic records and authority records are also suitable to receive ARKs. A possible first campaign could target bibliographic records of books in paper format digitized by the BnL.
Technical choices and associated metadata
The BnL has decided not to use prefixes to differentiate between collections or types of resources. The ARKs therefore remain entirely opaque: nothing in the identifier itself distinguishes https://persist.lu/ark:70795/g1v0gx, which references a digitized book (“Grammaire complète du nom ou substantive” by Pierre-Victor Sturm), from https://persist.lu/ark:70795/tjhcdk, which points to a digital daily newspaper (the January 4th, 1945 edition of the “Escher Tageblatt”).
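To illustrate what opacity means in practice, here is a minimal Python sketch that splits the two URLs above into their parts. The names Ark and parse_ark_url are hypothetical and are not part of any BnL software. The only structure the sketch relies on is the resolver host, the "ark:" label, the number that identifies the assigning organization (70795 in both examples), and the opaque name that follows it.

```python
from typing import NamedTuple
from urllib.parse import urlparse


class Ark(NamedTuple):
    """Parts of an ARK URL (hypothetical helper type for illustration)."""
    resolver: str  # e.g. "https://persist.lu"
    naan: str      # number identifying the assigning organization
    name: str      # opaque name assigned to the object


def parse_ark_url(url: str) -> Ark:
    """Split an ARK URL into resolver, organization number and opaque name."""
    parts = urlparse(url)
    path = parts.path.lstrip("/")
    if not path.startswith("ark:"):
        raise ValueError(f"not an ARK URL: {url!r}")
    # Accept both "ark:70795/..." and the older "ark:/70795/..." spelling.
    body = path[len("ark:"):].lstrip("/")
    naan, sep, name = body.partition("/")
    if not sep or not naan or not name:
        raise ValueError(f"incomplete ARK: {url!r}")
    return Ark(f"{parts.scheme}://{parts.netloc}", naan, name)


book = parse_ark_url("https://persist.lu/ark:70795/g1v0gx")
newspaper = parse_ark_url("https://persist.lu/ark:70795/tjhcdk")
```

Both results share the same resolver and organization number and differ only in the name, which carries no information about the collection or resource type. Finding out that one is a book and the other a newspaper requires resolving the ARK or asking for its metadata.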
Using the inflection “?”, the user can access metadata made available by the BnL for its ARKs. Here is the example for ARK ark:70795/ghmrs9:
“type”: “digitized newspaper”,
“owner”: “Bibliothèque nationale du Luxembourg (=) National Library of Luxembourg (=) BnL”,
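As a minimal sketch of how such a request could be formed, the Python snippet below appends the “?” inflection to an ARK on the persist.lu resolver. The function names metadata_url and fetch_metadata are hypothetical. The format of the response is an assumption that the text does not settle, so the sketch returns the raw text instead of parsing it.

```python
from urllib.request import urlopen

RESOLVER = "https://persist.lu"  # resolver used in the examples of this text


def metadata_url(ark: str, resolver: str = RESOLVER) -> str:
    """Build the metadata URL by appending the "?" inflection to the ARK."""
    if not ark.startswith("ark:"):
        raise ValueError(f"not an ARK: {ark!r}")
    return f"{resolver.rstrip('/')}/{ark}?"


def fetch_metadata(ark: str, timeout: float = 10.0) -> str:
    """Request the metadata and return the raw response text.

    The response format is not specified here, so nothing is parsed.
    """
    with urlopen(metadata_url(ark), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


url = metadata_url("ark:70795/ghmrs9")
```

The trailing “?” with nothing after it is the whole inflection. Some HTTP clients and URL normalizers drop an empty query string, so it is worth checking that the question mark actually reaches the resolver.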
There are 3 types of metadata:
1. Information on the ARK:
   - the organization that generated the ARK (“owner”)
   - the current URL for the resource (“target”)
   - the latest ARK update (“updated”)
2. Information on the content of the resource in “erc” format (Electronic Resource Citation)
3. Information on
   - technical choices:
     - the ARK is opaque
     - the ARK can no longer be reassigned once published
     - the ARK uses a check character
   - digital object policies:
     - “objectAvailability: lifetime” = the object is expected to remain available for as long as the provider exists
     - “contentVariance: rising” = the content previously recorded may change during a format migration and can be adjusted or enriched at any time (for example, by additional metadata)
These metadata follow the guidelines outlined in the article: Persistence Statements: Describing Digital Stickiness.
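The three groups of metadata described above can be pictured as a small data structure. The Python sketch below is illustrative only: the class names and the boolean field names are hypothetical, the keys owner, target, updated, objectAvailability and contentVariance are taken from the text, and the target and updated values are explicit placeholders, not real data.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ArkInfo:
    """Group 1: information on the ARK itself."""
    owner: str    # organization that generated the ARK
    target: str   # current URL of the resource
    updated: str  # date of the latest ARK update


@dataclass(frozen=True)
class Policy:
    """Group 3: technical choices and digital object policies."""
    opaque: bool              # hypothetical field name
    reassignable: bool        # hypothetical field name
    check_character: bool     # hypothetical field name
    objectAvailability: str
    contentVariance: str


@dataclass(frozen=True)
class ArkMetadata:
    ark: str
    info: ArkInfo
    erc: dict  # group 2: Electronic Resource Citation, contents not modelled
    policy: Policy


# Policy values as stated in the text.
BNL_POLICY = Policy(
    opaque=True,
    reassignable=False,
    check_character=True,
    objectAvailability="lifetime",
    contentVariance="rising",
)

record = ArkMetadata(
    ark="ark:70795/ghmrs9",
    info=ArkInfo(
        owner="Bibliothèque nationale du Luxembourg (=) "
              "National Library of Luxembourg (=) BnL",
        target="https://example.org/placeholder",  # placeholder, not real
        updated="YYYY-MM-DD",                      # placeholder, not real
    ),
    erc={},  # left empty: the erc fields are not listed in the text
    policy=BNL_POLICY,
)

as_plain_dict = asdict(record)
```

Keeping the policy in its own immutable object reflects that it is the same for every ARK the BnL issues, while the first two groups vary from object to object.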