Digital preservation services are critical to ensuring the longevity and accessibility of scholarly content, particularly for OA journals. These services protect against data loss due to technological change, institutional change and other potential disruptions. Using a variety of technological frameworks and collaborative efforts, digital preservation services maintain the integrity and availability of scholarly works.
This section examines some of the international digital preservation services (LOCKSS, CLOCKSS, PKP PN, Internet Archive, Portico and PubMed Central), detailing their approaches and contributions to the protection of digital content. LOCKSS, CLOCKSS, PKP PN and Portico are so-called "dark archives" that give access to the data only when the publisher can no longer provide it. Internet Archive and PubMed Central are "bright archives" that publish the content on their websites as soon as they have it.
It is important to note that, according to the official websites of these services, Portico is the only preservation service that migrates formats and ensures logical preservation. However, Portico provides only limited details of its workflows. All the other services mainly guarantee bitstream preservation without necessarily protecting content from format obsolescence. It may therefore be advisable to combine two options: a certified institutional or national library that undertakes all preservation actions, and a well-known international service that stores content redundantly, republishes it, is registered with Keepers and meets the requirements of Plan S, NIH or DOAJ's recommendations for best practices.
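The idea behind bitstream preservation can be illustrated with a fixity check: a checksum is recorded when a file is ingested and later recomputed to confirm that the bits are unchanged. The following is a minimal sketch in Python, not the workflow of any of the services named here; the function names and the manifest layout are assumptions made for illustration.

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    """Return the hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def check_fixity(stored: dict, manifest: dict) -> dict:
    """Compare each stored file with the digest recorded at ingest.

    `stored` maps file names to their current bytes; `manifest` maps file
    names to the digest recorded at ingest (both layouts are hypothetical).
    A missing file counts as a failed check.
    """
    results = {}
    for name, expected in manifest.items():
        data = stored.get(name)
        results[name] = data is not None and sha256_digest(data) == expected
    return results


if __name__ == "__main__":
    article = b"%PDF-1.4 example article bytes"
    manifest = {"article.pdf": sha256_digest(article)}
    print(check_fixity({"article.pdf": article}, manifest))
```

A passing check only shows that the bytes are intact. It says nothing about whether the file format can still be opened and rendered, which is the concern of logical preservation and format migration.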
LOCKSS (Lots of Copies Keep Stuff Safe) is a program developed by the Stanford University Libraries in 1999. It is “a principle, a program, a community, and a software application” with solutions for distributed technical infrastructure (FAQ - CLOCKSS, 2024). It is based on a peer-to-peer network of servers that ensures the permanent availability of digital content through numerous distributed copies.
Several networks have been created on this basis, such as the Global LOCKSS Network (GLN). This system enables the funding libraries to access content locally when it is no longer available from publishers. Although publishers participate in the GLN at no cost, the GLN preserves only a small number of OA journals. Sprout & Jordan (2018, p. 247) underline that it held “around 200 OJS titles […] of approximately 10,000” in 2018. This is because the funding libraries choose the content to preserve, and they have often opted for post-cancellation access to subscription journals (Sprout & Jordan, 2018, p. 248). It is nevertheless possible for any scholarly journal to apply for preservation free of charge (cf. https://www.lockss.org/gln#publishers).
Many other services and preservation communities, among them national initiatives, have been launched on the basis of LOCKSS. Lists of them can be found on the LOCKSS website: https://www.lockss.org/join-lockss/networks and https://www.lockss.org/join-lockss/case-studies#preserving_national_open-access_scholarly_output.
The main global long-term preservation services based on LOCKSS for open access journals are CLOCKSS and the PKP Preservation Network (PKP PN).
CLOCKSS (Controlled LOCKSS) is a not-for-profit organization, funded by a network of publishers and libraries and governed by a Board with representatives of those institutions. Using the LOCKSS technology, CLOCKSS preserves content in its original formats and ensures long-term data validity through a polling-and-repair mechanism. Mirror repositories at twelve major academic institutions worldwide guarantee long-term preservation and access. CLOCKSS works with publishers to secure perpetual preservation rights and access to their content, which is ingested and verified for integrity on specialized servers. The content is then managed and preserved through continuous audit and repair processes. Upon a trigger event, content is migrated to the latest formats, copyright checks are performed, and the content is made publicly available under Creative Commons licenses, so that scholarly contributions remain accessible and open to all. Triggered content is “directly available via Open URLs through Crossref, or either of local library link-resolvers or from CLOCKSS triggered contact” (How CLOCKSS Works - CLOCKSS, 2024).
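The principle of polling and repair can be sketched as a simple majority vote: each node reports a digest of its copy, and any copy that disagrees with the majority is replaced with an agreeing one. The Python example below is a deliberately simplified illustration and not the actual LOCKSS protocol, which is considerably more elaborate; the function names and the in-memory `replicas` dictionary are assumptions.

```python
import hashlib
from collections import Counter


def digest(data: bytes) -> str:
    """Return the hex SHA-256 digest of one replica's content."""
    return hashlib.sha256(data).hexdigest()


def poll_and_repair(replicas: dict) -> list:
    """Repair replicas that disagree with the majority.

    `replicas` maps a (hypothetical) node name to the bytes that node holds.
    Returns the names of the nodes whose copy was replaced.
    """
    if not replicas:
        return []
    # Poll: every node "votes" with the digest of its own copy.
    votes = Counter(digest(content) for content in replicas.values())
    winner, count = votes.most_common(1)[0]
    if count <= len(replicas) // 2:
        raise ValueError("no majority agreement; manual intervention needed")
    good_copy = next(c for c in replicas.values() if digest(c) == winner)
    # Repair: overwrite every dissenting copy with a majority copy.
    repaired = []
    for node, content in list(replicas.items()):
        if digest(content) != winner:
            replicas[node] = good_copy
            repaired.append(node)
    return repaired


if __name__ == "__main__":
    nodes = {"node-a": b"article", "node-b": b"article", "node-c": b"artXcle"}
    print(poll_and_repair(nodes))
```

The sketch shows why many independent copies matter: a damaged copy can only be detected and repaired as long as a majority of the other copies still agree.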
More information on the CLOCKSS service can be found here:
CLOCKSS website: https://clockss.org/
Schonfeld, R. (2024). Kitchen Essentials: An Interview with Alicia Wise of CLOCKSS [Blogpost] https://scholarlykitchen.sspnet.org/2024/02/20/kitchen-essentials-alicia-wise-clockss/
Video: Thib Guicherd-Callin on the technology behind CLOCKSS:
The PKP Preservation Network (PKP PN) is a digital preservation initiative designed specifically for journals that use the Open Journal Systems (OJS) software, which is common among OA journals that are published by small publishers and are at risk of loss. Launched in June 2016 by the Public Knowledge Project (PKP), it addresses the critical need to safeguard content from OJS journals, many of which are not preserved through established services like CLOCKSS or Portico. The service is free of charge for OJS users: “PKP has found ways to automate the work of getting the journals into the PN, meaning that the service is not time-intensive to maintain once launched, and that staffing costs are kept to a minimum. One of PKP’s goals in developing the PN was to maintain a low barrier to entry to enable as much participation as possible. The resulting network provides free preservation services for any OJS journal that meets a few minimum requirements” (Sprout & Jordan, 2018). These minimum requirements include installing the plugin, running OJS 3.1.2 or newer, having an ISSN and having published at least one article (for more information cf. https://docs.pkp.sfu.ca/pkp-pn/en/).
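The minimum requirements listed above can be expressed as a simple pre-check that a journal manager might run before registering. The Python sketch below only encodes the four requirements named in this section; the `Journal` class and its field names are hypothetical, nothing here queries OJS or the PKP PN, and the official documentation remains the authoritative list.

```python
from dataclasses import dataclass

# Oldest OJS release named in this section as supported by the PKP PN.
MIN_OJS_VERSION = (3, 1, 2)


@dataclass
class Journal:
    """Hypothetical description of a journal installation."""
    ojs_version: str
    plugin_installed: bool
    issn: str
    published_articles: int


def parse_version(version: str) -> tuple:
    """Turn a dotted version string such as '3.3.0.8' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def unmet_requirements(journal: Journal) -> list:
    """Return a list of requirements the journal does not yet meet."""
    problems = []
    if not journal.plugin_installed:
        problems.append("PKP PN plugin is not installed")
    if parse_version(journal.ojs_version) < MIN_OJS_VERSION:
        problems.append("OJS version is older than 3.1.2")
    if not journal.issn.strip():
        problems.append("journal has no ISSN")
    if journal.published_articles < 1:
        problems.append("no article has been published yet")
    return problems


if __name__ == "__main__":
    example = Journal("3.1.1", True, "1234-5678", 12)
    print(unmet_requirements(example))
```

An empty list means that none of the four checks failed; it does not guarantee acceptance into the network.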
If you would like to learn more about the PKP Preservation Network, consider looking into these resources:
PKP Documentation: https://docs.pkp.sfu.ca/pkp-pn/en/
Sprout, B., & Jordan, M. (2018). Distributed Digital Preservation: Preserving Open Journal Systems Content in the PKP PN [Article] https://dx.doi.org/10.14288/1.0378578
Video: Public Knowledge Project (2022). Registering for the PKP Preservation Network:
One more important preservation service, not based on LOCKSS, is Portico. It was founded in 2005 and is supported by over 1288 libraries and 1105 publishers (Facts and Figures - Portico, 2024). Portico's services have evolved over time to cover not only subscription content but also open access materials and to emphasise the inclusion of smaller publishers (Wise, 2021, p. 3). Portico and CLOCKSS have also developed very complex workflows enabling the preservation of “features such as embedded visualizations, multimedia, data, complex interactive features, maps, annotations, and in some cases they may depend on third-party platforms or APIs, such as YouTube or Google Maps” (Millman, 2020).
If you would like to learn more about Portico, consider these resources:
Portico Website, Why Portico: https://www.portico.org/why-portico/
Wittenberg, K., Glasser, S., Kirchhoff, A., Morrissey, S., & Orphan, S. (2018). Challenges and opportunities in the evolving digital preservation landscape: Reflections from Portico. Insights the UKSG Journal, 31, 28. https://doi.org/10.1629/uksg.421
Ingenta (2019). Ingenta Webinar – Archiving your Valued Content with Portico.
The Internet Archive, which has been archiving and providing access to digital content since 1996, has been involved in a number of projects relating to OA journals. For example, thanks to the Mellon Foundation, Columbia University Libraries archived several "small, relatively obscure titles" that do not use OJS and do not appear in DOAJ, using the Internet Archive's Archive-It tool (Regan, 2016, p. 94), which is still available. Since 2017, the Mellon Foundation has also supported another project in which the Internet Archive (1) developed the Fatcat wiki to record the status of long-term archiving while providing open access to the articles and (2) processed all the scholarly publications already captured and made them available through Internet Archive Scholar (for more detailed information cf. ‘How the Internet Archive Is Ensuring Permanent Access to Open Access Journal Articles | Internet Archive Blogs’, 2020). Currently, the “Internet Archive makes basic automated attempts to capture and preserve all open access research publications on the public web, at no cost. This effort comes with no guarantees around completeness, timeliness, or support communications” (For Publishers - The Fatcat Guide, n.d.). It operates on the basis of metadata harvested from metadata catalogs such as Crossref and DOAJ, and it requires PIDs. Content that has not been harvested automatically can also be saved in the Internet Archive, e.g. via the paid subscription service Archive-It (for more detailed information cf. https://support.archive-it.org/hc/en-us/articles/208111766-Want-to-know-more-about-Archive-It).
If you would like to learn more about the Internet Archive, these may be useful resources:
The Fatcat Guide, For Publishers: https://guide.fatcat.wiki/publishers.html
The Fatcat Guide: https://fatcat.wiki/
Internet Archive Blogs, How the Internet Archive is Ensuring Permanent Access to Open Access Journal Articles: https://blog.archive.org/2020/09/15/how-the-internet-archive-is-ensuring-permanent-access-to-open-access-journal-articles/
CNI Fall 2023 Project Briefings, Multi-Custodial Approaches to Digital Preservation of Scholarship: https://www.cni.org/topics/digital-preservation/multi-custodial-approaches-to-digital-preservation-of-scholarship
PubMed Central (PMC) provides a digital preservation service for biomedical and life sciences literature. Articles are ingested through direct submissions from publishers and authors, which must comply with metadata and format standards such as JATS XML for consistency. Rigorous quality control checks verify the integrity and completeness of submissions. Content is stored redundantly across multiple locations to prevent data loss and is subject to regular format migration to adapt to evolving technologies, while permanent URLs ensure stable access. PMC supports compliance with funding agencies' public access policies in the US and is a well-known database.
To learn more about PubMed Central, consider this resource:
PMC, For Publishers: https://www.ncbi.nlm.nih.gov/pmc/pub/pubinfo/
-
- The table you can download above compares the described preservation services according to the following criteria:
- Governance
- Costs (in USD, per year)
- Open Source
- Preservation method
- What can be preserved?
- Journal content (as of June 2024)
- Access to preserved content
- How to apply
- Publisher platforms with integration to the service
- Trigger event definition
The information presented is based on Laakso et al. (2021) and additional information on the Internet Archive and PubMed Central collected by Anastasiia Afanaseva in July 2024.
Which publication software supports which preservation services?
D3.2 of CRAFT-OA collected information from experts on OJS, Janeway and Lodel on the respective software's support for digital preservation services/long-term archiving. The answers are captured in the following table from the report:
Table: CRAFT-OA Deliverable 3.2: Report on challenges and help measures faced by OA journals and platforms (Laakso et al., 2024, p. 73)
It should be added to the table above that there is a Portico plugin available for OJS (https://github.com/jonasraoni/portico) as well. Additional information on the support of publishing software for long-term archiving can also be found in Finding the Right Platform: A Crosswalk of Academy-Owned and Open-Source Digital Publishing Platforms. For German speakers, there is also a description of the currently most widely used plugins for long-term archiving systems in OJS and general steps for their installation and use: Langzeitarchivierung in Open Journal Systems (OJS)
-
Would you like to test your knowledge after working through this chapter? Take this private self-assessment quiz.
-