Newly Discovered Archival Logic: Functions Emerging from the Records in Contexts Standard

Arian Rajh

doi:10.1515/pdtc-2025-0033

Published/Copyright: September 8, 2025

From the journal Preservation, Digital Technology & Culture, Volume 54, Issue 3

Abstract

The article discusses the reality of information for archivists in an era when information is not primarily shaped in documentary form. The situation in which contemporary administrative, business, and private information is organized in various formats necessitates a reexamination of the subject of archival science and practice. The author focuses on the ontology of the new professional standard, Records in Contexts, using it to redefine essential professional concepts and develop archival descriptions. As it relies on Description Logics, Records in Contexts introduces two new functions in addition to the existing functions. The author presents the first, terminology and discourse-related function by disassembling archival terminology into simpler, more straightforward concepts or their constituents, and analyzing them through the lenses of Description Logics. This function acknowledges today’s new reality, where information is spread in more forms than just documents. The second function is related to a Description Logics reasoning engine that supports the RiC ontology, allowing for the creation of information that does not need to be anticipated by archivists. This liberates archivists from the Schellenbergian illusion of anticipating users’ needs. The article offers an example of a practical exercise for students in archival science, showing how description is powerful when formatted as data according to the new standard. The two functions discussed are interconnected, and a comprehensive approach would have archivists manage all types of information and organize their output more semantically. The conclusion indicates such a direction, a broader application of the new archival standard compared to the previous one.

Keywords: archival materials; description logics; discourse; information; ontology; records in contexts

1 Introducing a New Reality

More than a quarter of a century ago, a Swedish archivist noted that determining which archival records should be publicly available is challenging in the “computerized era” (Gränström 1999, 84). Being publicly available assumes the establishment of a long-term preservation regime and the provision of care by archivists. In today’s era of “data-intensive capitalism” (Beer 2019, 4), the emphasis of this mindset should extend beyond just documents and archival records being solely under the archivist’s control.

Nowadays, data are generated in far greater quantities than documents (Taylor 2024; Bartley 2024; Duarte 2025). Data are continuously created, processed, and stored across social media platforms, streaming services, artificial intelligence systems, IoT devices (including health and sports wearables, as well as medical devices for managing conditions), financial systems, both conventional and those with blockchain-supported digital transactions, business cloud systems, scientific research systems, peer-to-peer networks, decentralized state registers and databases, business analytics systems, and more. Foucault 4.0 derivative theories of Toner (2024) and Beer (2019) have shown mechanisms of new knowledge supporting social power in IoT and Big Data environments and, thus, the immense social impact of data. Interpretative intentions within specific social contexts influence data selection, processing, and use. Big Data has been recognized as a significant challenge to social sciences (Couldry 2020, 1137), which also applies to archival science. Ecosystems, frameworks, platforms, engines, and solutions that leverage Big Data, AI, data analytics, and advanced technologies like Apache Hadoop, Apache Spark, Snowflake, Amazon Redshift, and Databricks represent environments that promote information generation, similar to the roles played by document management systems, records management, enterprise management systems, or content services in previous years or decades. In the last several decades, archivists worldwide have become familiar with managing digital documents and overseeing digital records management systems; they have successfully redefined the profession and broadened their professional tasks to the digital counterpart of paper documents. This was essential for the profession, and archivists would have significantly undermined their social role had they failed to do so. The preservation of digital documents has become more complicated due to digital signing. Digital documents’ authenticity includes integrity and identity properties, with identity being based on certificates, so archivists started considering data from certificates. However, it appears that the most significant change is still ahead of us. Given the “powerful presence of data in ordering and shaping our lives” (Beer 2019, 134), developing Computational Archival Science (CAS) as a peripheral subdomain of archival science would not be enough; instead, we need to transform the field of archival science to enable archivists to adapt to today’s realities. With the new infosphere reality rising, the “archives as data” paradigm (Mordell 2019) no longer seems strange.

This reality extends beyond information found in digital documents, whether they are simple or digitally signed, as well as the data and metadata organized within various relational and non-relational databases and warehouses. It also includes information formatted in much more complex digital forms. Consider some examples. Information in software code, ontologies, and other information products should also be preserved for future understanding of what shaped our present. Geospatial data and smart contracts do not differ from traditional paper maps or conventional contracts in the results for end-users, and they should be seen as contemporary archival materials. However, the difference between smart contracts, geospatial information, and their traditional counterparts is significant in their function and technology, as they do not replicate their paper-based versions. An example of a smart contract can illustrate this significant change in social information. Traditional contracts exist in the form of documents, supported by formal diplomatic characteristics and verified by a trusted third party (e.g., a notary). In contrast, a smart contract is computer code deployed to be executed in a decentralized blockchain environment without needing a trusted third party once data confirms that predefined conditions have been met (Taherdoost 2023, 7). The public Ethereum blockchain platform uses “bytecode,” which consists of step-by-step instructions in the form of a string of characters. In contrast, permissioned Hyperledger Fabric employs “chaincode,” which resembles more traditional application code (Khan et al. 2021, 2903). Neither of them resembles any documents. In a typical use case, a smart contract combines instructions and initial state data during deployment. During execution, it processes input data and interacts with its stored state. It lacks traditional diplomatic features and cannot be considered a formal, classical agreement suitable to become an official archival document. Although its source code may be stored in textual form in repositories before deployment, these “drafts” do not necessarily reflect the smart contract’s final operational version and never prove its execution. Therefore, archiving the source code of smart contracts as textual files like traditional documents serves no real purpose. Some additional examples from today’s infosphere, like AI outputs and crypto tokens, go even further. AI’s actions today and the accompanying activity-related paradata should be preserved and analyzed to establish responsibilities. Also, there are materials like non-fungible tokens (NFTs) and semi-fungible tokens (SFTs), which are used for today’s art and memorabilia, and that contain ownership information. After all, information management practices have previously utilized clay tokens representing goods and evidence of their ownership (Postgate 1992, 51–53), so addressing their digital equivalents by archival domain should not seem so unfamiliar. If archival methods had remained the same throughout history, we would have stayed focused on clay tablets. Although still important, paper and digital documents are merely a cove in today’s vast ocean of information. It would be illogical to try to preserve just this cove.
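The distinction drawn above between a smart contract's stored source text and its executed state can be illustrated with a minimal sketch. The class below is a hypothetical toy model, not Ethereum bytecode or Hyperledger Fabric chaincode; the names `ToyEscrowContract`, `price`, `pay`, and `log` are assumptions introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyEscrowContract:
    """Hypothetical toy model: fixed instructions plus mutable stored state."""
    price: int                      # initial state supplied at deployment
    paid: int = 0                   # stored state, changed only by execution
    released: bool = False
    log: list = field(default_factory=list)  # execution trace (paradata)

    def pay(self, sender: str, amount: int) -> None:
        """Process input data and update the stored state."""
        self.paid += amount
        self.log.append(("pay", sender, amount))
        # The predefined condition is checked by the code itself,
        # not by a trusted third party such as a notary.
        if self.paid >= self.price and not self.released:
            self.released = True
            self.log.append(("release", "goods", self.price))


# Deployment combines the instructions with initial state data.
contract = ToyEscrowContract(price=100)
# Execution processes inputs and interacts with the stored state.
contract.pay("alice", 60)
contract.pay("alice", 40)
```

In this sketch the class definition stays identical before and after the two calls, while `paid`, `released`, and `log` change. Under these assumptions, the source text alone says nothing about whether or how the contract was executed, which is the information an archivist would need to capture.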

Following this introduction, the second section examines various options for archivists to navigate today’s realities. The third section highlights the two new functions of Records in Contexts archival ontology and descriptions based on Semantic Web and Description Logics (DLs) concepts. The author presents a conclusion in the article’s final section and offers a holistic perspective that combines these two new functions.

2 Archivists’ Responses

While other non-documentary forms of information thrive today, focusing solely on practices involving documents and records is ineffective. Conventional definitions of archival science may describe records and historical archives, but they are too limiting and fail to capture the breadth of social information generated today. What might archivists’ responses be to today’s evolving world of information? This section largely follows a literature review approach, although not all sources originate from archival science; some come from interdisciplinary fields that combine data science and sociology, or from other domains.

One option is to ignore what is occurring, disregard the new paradigm (Kuhn 1996) of archives as data (Mordell 2019), and maintain the status quo. Archival communities choosing this option would largely adhere to practices established in the more or less distant past. They would focus on flatly comprehended digital records as products of computer text editors that replaced the typewriter while neglecting contemporary movements, for example, the software code repositories of the GitHub Archive Program with its vault, the Software Heritage initiative, and database archiving initiatives and database standards like SIARD (DILCIS Board 2021). However, what worked for archival science in the late twentieth century, when it successfully transitioned from a paper-based positivist discipline to a field capable of managing new digital and digitized, yet still recognizable, forms of documents, may not be effective anymore. While many documents continue to be generated within official administration, they do not always provide optimal structures for information related to business transactions, administrative activities, and both official and personal activities. Today’s businesses and various social and personal spheres often rely on more fluid information than typical records, so new requirements for preserving information have emerged. Preserving software code is a good example. Its preservation for future study can help us to understand automated, information-driven processes that shape today’s social dynamics. Archiving databases is another example. If data from databases using various technologies that may become obsolete survives, it can be used to understand essential government and business processes or migrate to new technologies for reuse. Data generated by legacy systems delivers historical information just as it was in paper documents. 
This is why standards such as SIARD are essential, and this is why archivists should plan preservation methodologies and suggest more advanced solutions. The first “ignorant” approach involves ignoring the efforts made by Dryad, Zenodo, and Open Science Framework to archive research and scholarly data, and dismissing other non-archival initiatives aimed at preserving dataset-based information as unworthy of archivists’ attention.
Another option would be to accept the novelty without reflexivity and introspection about the archivist’s role. This could lead to archivists being absorbed into a new information ideology, dataism, and what Beer (2019) characterized as a “data imaginary.” The archivists should remember that their efforts must consistently retain the transgressive nature to counterbalance their role as an administrative discipline within the Foucauldian power framework. Additionally, using quick fixes like preservation-related software designed primarily for commercial purposes or similar solutions, which focus on a perceived gap in the field and lack an archival perspective, is unlikely to be very effective. Archivists should take the lead in these processes rather than follow them. Their deep understanding of information management concepts, which are several millennia old, as well as the theoretical and methodological traditions of preservation, should be applied to solving similar problems related to new types of information.
The third way is to leave most of the information that needs to be preserved to a subdomain of archival science, such as CAS, which, for now, remains exterior to theoretical mainstreams. CAS primarily uses computational information processing methods with archival materials to assist archivists’ work (Marciano 2022, 212). As a newly established subdomain of archival science, CAS should guide archival discourse toward a new understanding of the nature of information today, how it is produced, and what can be done with it. CAS incorporates the Big Data concept into its definition (Payne 2018, 6), but, sometimes, CAS still highlights the concept of records (Marciano 2022, 212). CAS represents a good step forward, but it should be thoroughly incorporated into archival science: CAS should become “plain” archival science. This archival subdomain or trans-discipline (Payne 2018) should be integrated into new archival science in a phase when it is “normalized” in Kuhn’s sense (Kuhn 1996). “Computational methods” (Lemieux and Marciano 2025) should be incorporated into a regular methodology box of archival science.
The final option goes in that direction and demonstrates the active role of archivists in today’s world. It entails a genuine disruption of archival theory and practice and advocates for a new approach as the dominant information form has changed. This could be done by incorporating CAS or similar contemporary attempts into mainstream archival science and redefining the bare foundation of archival science before that. CAS has the potential to revolutionize archival science, but the transformation requires a thorough examination of the archival discourse. Changing a discipline is challenging if its language lacks the concepts needed to facilitate this change or the means for its practitioners to communicate throughout the process. This is where the new archival descriptive standard could step in. There is literature that aims for this approach or discusses it (Park 2015; Zou 2019). This article contributes to these efforts.

It is essential not to overlook the discourse in a discipline. Discourse shapes our thinking and practices by establishing rules regarding what we should discuss and act upon, primarily through “principles of classification, of ordering” (Foucault 1981, 56). Discourse achieves this through definitions, along with various discourse devices. Definitions also classify and organize their components. They are resilient to change because they are integrated into national legislations and professional standards. The following section will examine the new professional standard and the valuable, possibly overlooked, functions and tools that come with it. Where could using these functions lead us?

3 The RiC Standard and Its Newly Introduced Functions

The archival description has long been acknowledged for various functions, even prior to the introduction of the new descriptive standard, of course. These “usual” functions are connected to the study, management, and protection of archival materials. These functions were identified during the consolidation period (Ridener 2009, 21–22) when archival practices evolved into the discipline we recognize today. A newer function related to the postcolonial approach in archival practice is the reparative function (Luke and Mizota 2024), but it does not hinge exclusively on the new archival standard. However, two new functions presented in this article are based on the new archival standard and arise from its characteristics, which differ from those of older standards.

Records in Contexts (RiC) is a recent professional standard for archivists, providing descriptive and formative guidelines for metadata related to archival entities. It includes an introduction, a conceptual model (RiC-CM), its ontology (RiC-O), and still-unpublished application guidelines. There are several functions that the RiC standards can fulfill. The most evident is related to describing archival materials and associated entities. RiC is designed to replace the collection of ISAD(G) family standards and Encoded Archival Standards (EAS) that describe archival entities. The older ISAD(G) standard focused on describing archival materials. It was combined with standards for describing record creators (ISAAR-CPF), their functions and activities (ISDF), and archival repositories (ISDIAH). Each standard was designed for creating finding aids or describing archival entities, and their computer-readable forms are based on XML EAS standards: EAD following ISAD(G), EAC (ISAAR-CPF), EAF (ISDF), and EAG (ISDIAH). The other functions described in the literature include managing and protecting archival materials. The reparative function from the literature is based on justice in the representation of communities and individuals within archival materials (Frick and Proffitt 2022, 18).

RiC pertains to the Semantic Web and the Linked Data movement, allowing RiC-compatible descriptions of archival entities to utilize various linked-data possibilities. RiC-harmonized descriptions are being serialized primarily as RDF/XML and RDF Turtle files. RiC-O is an OWL-based ontology, and like other similar ontologies, it evokes Description Logics (DLs) mechanisms (Rajh 2024a, 10). DLs are a cluster of logically grounded languages with varying expressivity and reasoning potentials that can systematically and concisely represent gathered knowledge of one or more areas (Baader et al. 2017, 1). Several ontologies demonstrate the robust applicability of DLs across various fields, including the complex domain of molecular biology, genomics, bioinformatics (Wroe et al. 2003), and the cultural heritage domain (Erlangen CRM/OWL). They demonstrate that building or enhancing a domain is possible based on the potential of DLs. DLs comprise formalisms that capture domain-specific knowledge by articulating its concepts and the characteristics of individuals or their attributes (Baader et al. 2003, 43). Archivists should use DLs to examine RiC and dictionary definitions and achieve greater conciseness in their discourse, as shown in Section 3.1. This represents the first function of RiC, which is different from the known functions of the ISAD(G) standard. This function could enable the community of archivists to review their terminology, analyze compound concepts, decompose them into simple concepts, and possibly cleanse archival discourse. It banks on DLs’ strict precision in modeling and managing the concept hierarchy and subsumptions, fitting specific or simple concepts into broader or more complex ones. With DLs in the background, “[f]ormal ontologies require the majority of concepts to be at least a kind of one other concept” (Wroe et al. 2003, 630). 
This approach provides a tool to detect and prevent contradictions and gaps between them, forming a well-developed system of concepts and utilizing axioms during the modeling process.

Formalisms and various DL languages emerged during the scientific developments of the last third of the twentieth century; however, archivists encountered them when RiC-O was recently released. As will be shown in Section 3.2, indirectly using those formalisms by archivists, through RiC-O, expands archival descriptions for various future applications without explicitly including all possible description details. Description Logics’ reasoning potential can enable unexpected applications of the descriptions of archival materials. There will always be user questions about archival entities that the archivist might not foresee, or unexpected topics that could interest future users. There will continually be users with unpredictable inquiries or unanticipated research angles. An archivist cannot and should not attempt to predict everything. With carefully designed descriptions and alignment on DLs, archivists are not required to predict future uses of their descriptions and archival materials. Section 3.2 discusses this second function, in which RiC excels compared to ISAD(G). This function relies on a strict Description Logics modeling methodology, the formality of these languages, a commitment to precise semantics, and reasoning capabilities based on transitivity, cardinality constraints, inheritance, and other characteristics. These mechanisms are integrated into the RiC ontology itself. In contrast, a conceptual model like RiC-CM does not offer such capabilities, as it is less formal and lacks logical axioms as a foundational structure.
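A minimal sketch can illustrate how reasoning based on transitivity yields statements that the describing archivist never wrote down. The relation name `partOf` and the individuals below are hypothetical and are not taken from RiC-O, and the loop is a naive closure computation rather than an actual DL reasoner.

```python
def transitive_closure(pairs):
    """Return all pairs derivable by chaining a transitive relation."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                # If a relates to b and b relates to d, then a relates to d.
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Statements asserted by the archivist (hypothetical "partOf" relation).
asserted = {
    ("letter_1", "file_A"),
    ("file_A", "series_X"),
    ("series_X", "fonds_Y"),
}

# Statements that follow logically but were never written down.
inferred = transitive_closure(asserted) - asserted
print(sorted(inferred))
```

Here the archivist asserts three direct relations, and three further ones follow from transitivity alone, for instance that `letter_1` belongs to `fonds_Y`. A reasoning engine working over an OWL ontology performs this kind of derivation, along with others such as inheritance along the class hierarchy, without the description having to anticipate the question.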

3.1 Considering Archival Terminology

DLs comprise concepts, roles, and constructors, offering an environment for representing domain knowledge. DLs can also serve to cleanse discourse. As Goerz and his fellow researchers stated, “in the long run the only opportunity to resolve the problems due to underspecification and vagueness is to aim at a formalization in some logical language” (Goerz et al. 2008, 3). Clear definitions should be essential for archival discourse, and DLs provide an exceptionally careful method for defining terminology.

The whole structure of concepts that constitute the knowledge in the archival domain should be more systematic, building definitions from the least to the most complicated, or systematically linking them. Working on professional definitions is not a creative writing exercise; therefore, systematic organization, necessary redundancy, and the repetition of smaller concepts within larger ones are essential. DLs could be tools for remodeling archival science concepts – archives (as archival materials), fonds, collections, record sets, records, record parts, and information and data concepts. Then, the methodology of DLs could explore other primary RiC-O classes and redefine them if necessary. Archivists could use the same method to verify the most crucial professional terminology published in professional dictionaries, like ICA/InterPARES Multilanguage Archival Terminology (MAT). This section will examine definitions from RiC-O and MAT through the lenses of DLs and disassemble them into simpler concepts or their constituents. Concepts and object properties (roles) that link them will be extracted from selected RiC and MAT definitions. After the analysis, a recommendation on archival science’s broader and contemporary subject will be provided.

The typical subjects of archival work are archives, records, and materials. The definition taken from MAT defines the archives as “the whole of the documents made and received by a juridical or physical person or organization in the conduct of affairs, and preserved” (International Council of Archives and InterPARES Trust 2015). The problem with this definition is that administrative databases and other non-documentary information would not strictly be considered archives because they do not exist in documentary form; however, in practice, they are included in the totality of the archival material of a creator. The terminology should follow the practice in this case. On the other hand, RiC-O defines its central term not as archives but rather as records, described as “discrete information content formed and inscribed, at least once, by any method on any carrier in any persistent, recoverable form by an Agent in the course of life or work activity” (EGAD 2024a, n.p.). A non-literal translation to DLs that retains the meaning of this definition can be seen in Rajh (2024a, 13), but this used “Information” instead of the “InformationContent” concept:

$$\mathit{Record} \equiv \mathit{Information} \sqcap \exists \mathit{documents}.\mathit{Activity} \sqcap \exists \mathit{hasCreator}.\mathit{Agent} \sqcap \exists \mathit{hasOrHadInstantiation}.\mathit{Instantiation}.$$

And further,

$$\mathit{Record} \sqsubseteq \mathit{RecordSet}.$$

$$\mathit{RecordPart} \sqsubseteq \mathit{Record}.$$

This is because RiC also focuses on the RecordSet concept (“[o]ne or more records that are grouped together by an Agent based on the records sharing one or more attributes or relations”; EGAD 2024b, n.p.) and the RecordPart concept as the “[c]omponent of a Record with independent information content that contributes to the intellectual completeness of the Record” (EGAD 2024c, n.p.). In these formulas, DL symbols are used to express concept equivalence (≡) and subsumption (⊑), while the constructors ⊓ and ∃ represent conjunction and existential restriction, respectively. A conjunction is used when an individual must belong to all specified simple concepts that are used to build the complex concept (“elements that are in the extension of both” [concepts]; Baader et al. 2017, 12). Existential restriction denotes that the individual being defined has at least one relationship (via roles: documents, hasCreator, hasOrHadInstantiation) to another individual (of classes: Activity, Agent, and Instantiation), or “elements that have at least one [role]-filler that is in” [class used as object] (Baader et al. 2017, 13). The Record concept definition in DLs cited above uses instantiation because it represents “the inscription of information made by an Agent on a physical carrier in any persistent, recoverable form as a means of communicating information through time and space,” i.e., the inscribing of information in a persistent form (EGAD 2024d). A DL definition of Instantiation that would be relatively close to the RiC definition in the narrative form would be the following:

$$\mathit{Instantiation} \equiv \exists \mathit{inscribe}.\mathit{Data} \sqcup \mathit{Information} \sqcup \exists \mathit{hasCreator}.\mathit{Agent} \sqcup \exists \mathit{hasCarrierType}.\mathit{CarrierType}.$$

The proposed DL formula also incorporates the concept of data, as data is instantiated in databases. ⊔ stands for disjunction for “elements that are in the extension of either [the first or the second class], or both” (Baader et al. 2017, 13). The CarrierType RiC-O concept already encompasses the meaning of physical material, and both information and data inherently possess the characteristic of transmissibility (EGAD 2024e). Tables 1 and 2 show more straightforward concepts and roles that form the basis of the definitions of complex concepts in the MAT terminology and the RiC standard.
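The set-based reading of the constructors quoted from Baader et al. can be tried out on a small example. The sketch below evaluates the Record definition over a toy interpretation in which concept extensions are Python sets and roles are sets of pairs. All individuals (`r1`, `r2`, `note`, and so on) and the helper `exists` are hypothetical, and this is a direct model check rather than a DL reasoner.

```python
# Toy interpretation: concept extensions as sets, roles as sets of pairs.
concepts = {
    "Data": {"row_17"},
    "Information": {"r1", "r2", "note"},
    "Activity": {"teaching"},
    "Agent": {"professor"},
    "Instantiation": {"paper_copy"},
}
roles = {
    "documents": {("r1", "teaching"), ("r2", "teaching")},
    "hasCreator": {("r1", "professor")},
    "hasOrHadInstantiation": {("r1", "paper_copy"), ("r2", "paper_copy")},
}


def exists(role, concept):
    """Existential restriction: individuals with at least one
    role-filler that is in the extension of the given concept."""
    return {x for (x, y) in roles[role] if y in concepts[concept]}


# Conjunction is set intersection: an individual must be in every conjunct.
record = (
    concepts["Information"]
    & exists("documents", "Activity")
    & exists("hasCreator", "Agent")
    & exists("hasOrHadInstantiation", "Instantiation")
)

# Disjunction is set union: an individual may be in either concept, or both.
data_or_information = concepts["Data"] | concepts["Information"]

print(record)
```

Only `r1` satisfies every conjunct: `r2` has no creator in this toy interpretation, and `note` has neither a documented activity nor an instantiation. The example is meant only to show how each constructor narrows or widens the set of individuals that fall under a defined concept.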

Table 1: Decomposition of complex concepts into simple concepts included in MAT and RiC-O.

1. Archives (MAT)	2. Record set (RiC-O)	3. Record (RiC-O)	4. Record part (RiC-O)	5. Instantiation (RiC-O)	6. Carrier type (RiC-O)
Document	Record	Information content	Component	Data	Physical material
Juridical person	Agent	Carrier	Record	Information	Information
Physical person	Attribute	Form	Information content	Agent
Organization	Relation	Agent	Completeness	Carrier type
Affairs		Activity
Preservation		Course of life

Table 2: Analyzed combinations of concepts and roles in selected definitions of archives, record set, record, record part, instantiation, and carrier type from MAT and RiC-O.

1. Archives (MAT)	2. Record set (RiC-O)	3. Record (RiC-O)	4. Record part (RiC-O)	5. Instantiation (RiC-O)	6. Carrier type (RiC-O)
(Documents) made by (juridical or physical person or organization)	(Record or records) grouped together by (agent)	(Information content) formed by (method)	(Component of record) contributes to (record)	(Inscription of information) made by (agent)	(Information) is represented on (physical material)
(Documents) received by (juridical or physical person or organization)	(Records) share (attribute or relation)	(Information content) inscribed by (method)
(Documents) preserved (by juridical or physical person or organization)

DLs’ complex concepts comprise one or more simple concepts connected with roles and constructors. Table 2 presents the most crucial roles and object properties defined in the MAT and RiC-O terminologies.

This article aims to provide a broader definition of materials, encompassing data, information, or both, along with an instantiation (generated by specific techniques in a particular structure). These materials may be created through the creator’s activity, accumulated by the creator, or connected to the creator’s events in some other manner. This definition encompasses an extensive collection of information in written form that can originate from an organization or an individual, forming either an organizational or personal archive. Information in a personal archive does not need to relate strictly to the creator’s business activities; it may be preserved for its value in connection with an event involving the individual in some other way. The closest Description Logics representation of a top archival entity that utilizes RiC-O concepts and roles would be as follows:

$$
\begin{aligned}
\mathit{Materials} \equiv{} & \mathit{Data} \sqcup \mathit{Information} \sqcap \exists \mathit{hasOrHadInstantiation}.\mathit{Instantiation} \\
& \sqcap \exists \mathit{isAssociatedWithEvent}.\mathit{Creation} \sqcap \exists \mathit{hasCreator}.\mathit{Agent} \\
& \sqcup \exists \mathit{isAssociatedWithEvent}.\mathit{Accumulation} \sqcap \exists \mathit{hasOrganicOrFunctionalProvenance}.\mathit{Agent} \\
& \sqcup \exists \mathit{isAssociatedWithEvent}.\mathit{Event} \sqcap \exists \mathit{isRelatedTo}.\mathit{Agent}.
\end{aligned}
$$

Materials are data or information structured and recorded in various ways, which were created or accumulated by agents in the performance of their activities or are related to those agents. The definition of materials incorporates “smaller” concepts like Information and Data as subclasses of the general Thing class. Classes of Creation and Accumulation should be defined as subclasses of Activity. The literature’s definitions of data are very diverse. However, data must consist of characters that the chosen formal system provides; they are not randomly selected and can be transmitted. Information is a set of related data whose characteristics, depending on the author, include truthfulness (Floridi 2004), transmissibility (Shannon and Weaver 1972), and the additional meanings that this set of related data has for the user in a specific context (Zack 1999). Because information can be fraudulent, some authors omit truthfulness as a characteristic of information (Harari 2024).

Archival materials represent a specific subgroup of previously identified materials for which preservation is considered essential:

$$\mathit{ArchivalMaterials} \sqsubseteq \mathit{Materials} \sqcap \exists \mathit{isAppraised}.\mathit{Agent} \sqcap \exists \mathit{isPreserved}.\mathit{LongtermPreservationRegime}.$$

As defined above, archival materials are a subclass of materials appraised according to the values of individuals, communities, or societies for which they are preserved. The Description Logics definition of archival materials would involve adding “isPreserved” as a new role and designing the Preservation concept as a sub-concept of Activity. Considering potential archival reappraisals, the definition does not specify the value type or how long the materials should be preserved. The formulas above are just a few examples of the direction the first function should take.

3.2 Clairvoyants and Users of Archival Materials

T. R. Schellenberg is one of archival theory’s foremost canonical writers. He is best known for his appraisal theory and classification of values (Schellenberg 1956, 16, 28, 140, 149). His theory on evaluating materials as archival is rooted in recognizing their potential future uses, particularly in terms of informational value, which depends on the archivists’ ability to foresee how these materials will be used. This is why Booms (1991, 26) viewed Schellenbergian archivists skeptically, regarding them as failed futurologists (Tschan 2002, 188). Freed from the Schellenbergian illusion, archivists no longer need to predict expected usage and can focus on designing descriptions as they see the materials. With Semantic Web technologies and DLs, the products of their description can serve various uses through semantic queries rather than clairvoyance. The following example shows how.

In spring 2024, undergraduate students at the Department of Information and Communication Sciences within the Faculty of Humanities and Social Sciences in Zagreb, Croatia, conducted an exercise on RiC-compatible description (Rajh 2024b). The exercise, which extended over five weeks, aimed to teach the students about archival descriptive work, various standards, linked data, visualization, semantic queries, and the enhanced possibilities offered by RiC. The students described archival boxes with records of Jaroslav Sidak, a university professor and a prominent twentieth-century historian. They were taught archival description, descriptive standards, the Semantic Web, and RDF Turtle serialization, and were acquainted with conventions designed to produce harmonized Turtle files. The exercise resumed a year later and lasted for several weeks in spring 2025, with a new class of students and other participants, as the course became open to the public this academic year. A crucial aspect of these exercises was the provision of syntactic and semantic conventions, which demonstrates the need for EGAD to provide the fourth part of the RiC standard, the guidelines. The students were instructed to adhere to conventions that ensured consistent use of quantity metadata; they were advised to follow the ISO 8601 standard and the related Extended Date/Time Format (EDTF) for time metadata, and so on. The most essential convention concerned the use of object properties to enhance the semantic potential of the description. Participants encountered challenges in organizing the contents of the archival boxes and in structuring RDF Turtle descriptions. Nevertheless, in 2024, eight students successfully hand-coded valid Turtle files, and eight archival boxes were described. All these files were uploaded into a graph database, GraphDB Free edition (Ontotext 2025), to link and visualize metadata. In 2025, thirteen participants submitted valid Turtle files, and the teacher provided the last file, so all twenty-two boxes of the record set were processed. All the entities were visualized again. Each year, after the upload succeeded, the teacher ran a set of semantic queries on this dataset in the GraphDB software environment, which yielded various new information. The first kind of information concerned linked agents, events, dates, and record sets or records (Figure 1). This information was not explicitly included in the Turtle files provided by the students; instead, it was obtained through semantic queries, that is, derived from the underlying DL mechanisms. No end date for the entire record set was stated in the title, and another semantic query provided the date of the newest record that the students described. This second query also yielded new information.
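The two kinds of query described above can be approximated in a few lines. The sketch below is not the SPARQL used in the exercise; it assumes a handful of hypothetical triples (the ex: identifiers, property names, and dates are invented for illustration) merged from two separately written descriptions. It shows how a shared agent identifier links records across files and how the newest date emerges from the merged data.

```python
# Hypothetical triples standing in for two participants' Turtle files.
file_a = {
    ("ex:record1", "ex:hasCreator", "ex:agent1"),
    ("ex:record1", "ex:date", "1952-03-14"),
}
file_b = {
    ("ex:record2", "ex:hasCreator", "ex:agent1"),
    ("ex:record2", "ex:date", "1978-11-02"),
    ("ex:record3", "ex:date", "1961-06-30"),
}

# Loading both files into one store amounts to a union of their triples.
graph = file_a | file_b


def subjects(predicate, obj):
    """Return every subject that has the given predicate and object."""
    return {s for (s, p, o) in graph if p == predicate and o == obj}


# Query 1: records from different files linked through the same agent node.
linked_records = sorted(subjects("ex:hasCreator", "ex:agent1"))

# Query 2: newest record date. Full ISO 8601 calendar dates (YYYY-MM-DD)
# sort chronologically as plain strings, so max() is enough here.
newest_date = max(o for (s, p, o) in graph if p == "ex:date")

print(linked_records)
print(newest_date)
```

Neither file states which records share a creator or what the latest date in the whole set is; both answers appear only once the descriptions are merged and queried. The string comparison relies on every date having the same precision, which is one reason a date convention matters.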

Figure 1: An example of a semantic query illustrating archival entities from various descriptions provided by different students who participated in the exercise.

The exercise showed that archivists could increase the potential usage of their work by designing descriptions that follow the RiC standard and lean on its capabilities provided by OWL (Web Ontology Language) and DLs. It demonstrated that relying on the transitivity of specific roles and other role characteristics, class hierarchy, subsumption, and other features enhances DL-augmented reasoning. During the presentation of the conventions, the students received guidance on using RiC-O object properties and on connecting record sets, people, activities, and places as subjects and objects in their descriptions rather than relying solely on data properties. Although both approaches satisfy the requirements of the RiC standard, using object properties facilitates the actual linking of metadata and enables the generation of various new information. While it is possible to provide the same or very similar information using data or object properties as the predicates of RDF triples, employing object properties can lead to different kinds of SPARQL semantic queries, resulting in information that archivists (or students, in the case mentioned above) may not have anticipated. An archivist who writes descriptions cannot know what kind of usage a researcher will need from archival material; such estimations are, in fact, impossible to make. Various future applications can be facilitated by employing the formal semantics of Description Logics hidden within an OWL-based ontology, such as RiC-O, and by adhering to carefully designed descriptions in conventions or implementation guidelines. The students were also advised to use RiC critically, knowing that we want greater semantic flexibility from the description of the materials without getting lost in the data imaginary (Beer 2019, 18). This distinguishes the second and fourth approaches, or the responses of archivists, discussed in Section 2 of this article. Semantic flexibility is why it would be helpful to consider the additional characteristics of specific object properties within the ontology (e.g., transitivity) to facilitate complex reasoning. The potential for employing automated reasoning shows that RiC is not merely a substitute for the ISAD(G) family of standards and supports the possibility of a shift in archival descriptive methodology. Archivists can enhance their descriptions with more powerful semantic potential instead of acting as clairvoyants who try to anticipate every possible usage of archival materials. Lessons learned from the exercise include using RiC elements based on their semantic potential instead of attempting to predict future use, incorporating relevant archival entities that align with the characteristics of the entire record set along with the creator’s concerns and activities, and maintaining a disciplined application of syntactical forms that can be processed and queried as harmonized data sets.
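Transitivity is perhaps the simplest of these role characteristics to illustrate. The sketch below is a naive closure computation, not a description of how an OWL reasoner or GraphDB works internally; the containment pairs and the ex: names are hypothetical. It shows that three asserted inclusion statements entail three further statements that nobody wrote down.

```python
def transitive_closure(pairs):
    """Repeatedly add (a, d) whenever (a, b) and (b, d) are present,
    until no new pair can be derived."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Asserted statements for a hypothetical transitive "is included in" role:
# letter -> folder -> box -> fonds.
asserted = {
    ("ex:letter", "ex:folder"),
    ("ex:folder", "ex:box"),
    ("ex:box", "ex:fonds"),
}

# Statements that follow from transitivity but were never written down.
inferred = transitive_closure(asserted) - asserted

for pair in sorted(inferred):
    print(pair)
```

A describer only needs to state each item's immediate container; a query for everything inside the fonds can then return the letter as well, provided the role is declared transitive in the ontology.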

4 Conclusions

Our social and private infosphere has changed significantly, and documents are no longer the most widely used form of information. Most information related to business, work-related matters, financial and administrative transactions, health, lifestyle, and other information from both legal and natural persons is captured, processed, and stored as data. Archivists, viewed as guardians, custodians, mediators, analysts, explorers, organizers, or caretakers of information, cannot overlook this reality.

Archival description practices from the late nineteenth and twentieth centuries focused on a type of finding aid that prioritized tutoring users and directing or guessing their information needs. Since the 1990s, archival communities have utilized the ISAD(G) family of standards, contributing to the international harmonization of archival descriptive practices. A new standard, Records in Contexts, has recently emerged, and archivists need time to become familiar with its possibilities. This new standard provides additional functions, so it should not be viewed merely as a replacement for ISAD(G) and related standards. It includes an ontology that can be used not just for interoperability but also for automated reasoning. RiC provides more than just a tool for describing archival entities, as the Description Logics mechanism is incorporated into its ontology’s back end. As mentioned, DLs may not be new tools, but they are new to archivists. RiC and DLs enable additional functions of archival description, such as refining terminology for archivists and generating new information about archival materials for users, that is, information that the archivist, the creator of the description, did not anticipate. These functions are interconnected. Refining archival terminology is advisable to enable the second function fully, as the ontology and descriptions would be more effective. DLs within the RiC-O framework can be used to assess and improve the discourse. The previously mentioned literature on gene and heritage information ontologies supports this claim, and similar efforts could be undertaken in the areas of archives and archival description. This is why the two proposed functions should be considered together. A robust, coherent, logical, and systematic archival ontology has a greater potential to establish new connections among entities defined by archival descriptions and create new information. The key to both functions lies in DLs with inference mechanisms.

Archivists should fully adapt archival science to the new reality, rather than merely adding peripheral data-related solutions to its current landscape. From a holistic perspective, the optimal approach to conceptualizing the archival domain correctly is to carefully examine and define compound concepts (classes) using the simple concepts (classes) and roles (object properties) utilized in RiC-O. Simple concepts are essential for shaping domain knowledge (Goerz et al. 2008, 4), and complex concepts should be developed gradually and systematically. Archivists should consider the simplest concepts because neglecting them can create gaps in understanding the domain subjects and weaknesses in the machine processing of descriptions. The concepts should reflect today’s information universe, encompassing both complex concepts and data, with data being the simplest entity integral to today’s infosphere. The Description Logics beneath the Records in Contexts ontology could provide a new armature for the definitions of archival science, which can then be precisely applied in discourse and legislation as ready-made elements.

Corresponding author: Arian Rajh, PhD, Assoc. Prof., Department of Information and Communication Sciences, Faculty of Humanities and Social Sciences, University of Zagreb, Ivana Lučića 3, HR10000 Zagreb, Croatia, E-mail: arajh@ffzg.hr

References

Baader, Franz, Diego Calvanese, Deborah L. McGuinness, Daniele Nardi, and Peter F. Patel-Schneider, eds. 2003. The Description Logic Handbook: Theory, Implementation, and Applications. Cambridge University Press.

Baader, Franz, Ian Horrocks, Carsten Lutz, and Uli Sattler. 2017. An Introduction to Description Logic. Cambridge University Press. https://doi.org/10.1017/9781139025355.

Bartley, Kevin. 2024. “Big Data Statistics: How Much Data Is There in the World?” Rivery. https://rivery.io/blog/big-data-statistics-how-much-data-is-there-in-the-world/ (accessed March 11, 2025).

Beer, David. 2019. Data Gaze. Sage.

Booms, Hans. 1991. “Überlieferungsbildung: Keeping Archives as a Social and Political Activity.” Archivaria 33 (January). https://archivaria.ca/index.php/archivaria/article/view/11796 (accessed March 14, 2025).

Couldry, Nick. 2020. “Recovering Critique in an Age of Datafication.” New Media and Society 22 (7): 1135–51. https://doi.org/10.1177/1461444820912536.

DILCIS Board. 2021. “SIARD 2.2 Format Specification.” https://siard.dilcis.eu/SIARD_2-2/format/2021-08-31/SIARD_2-2.pdf (accessed June 22, 2025).

Duarte, Fabio. 2025. “Amount of Data Created Daily (2024).” Exploding Topics. https://explodingtopics.com/blog/data-generated-per-day (accessed March 11, 2025).

EGAD – Expert Group on Archival Description. 2024a. Records in Contexts Ontology (ICA RiC-O) Version 1.0.2. Record, September 4, 2024. https://www.ica.org/standards/RiC/RiC-O_1-0-2.html#Record (accessed March 14, 2025).

EGAD – Expert Group on Archival Description. 2024b. Records in Contexts Ontology (ICA RiC-O) Version 1.0.2. Record set, September 4, 2024. https://www.ica.org/standards/RiC/RiC-O_1-0-2.html#RecordSet (accessed March 14, 2025).

EGAD – Expert Group on Archival Description. 2024c. Records in Contexts Ontology (ICA RiC-O) Version 1.0.2. Record part, September 4, 2024. https://www.ica.org/standards/RiC/RiC-O_1-0-2.html#RecordPart (accessed March 14, 2025).

EGAD – Expert Group on Archival Description. 2024d. Records in Contexts Ontology (ICA RiC-O) Version 1.0.2. Instantiation, September 4, 2024. https://www.ica.org/standards/RiC/RiC-O_1-0-2.html#Instantiation (accessed March 14, 2025).

EGAD – Expert Group on Archival Description. 2024e. Records in Contexts Ontology (ICA RiC-O) Version 1.0.2. Carrier type, September 4, 2024. https://www.ica.org/standards/RiC/RiC-O_1-0-2.html#CarrierType (accessed March 14, 2025).

Floridi, Luciano. 2004. “Outline of a Theory of Strongly Semantic Information.” Minds and Machines 14 (2): 197–221. https://doi.org/10.1023/b-mind.0000021684.50925.c9.

Foucault, Michel. 1981. “The Order of Discourse.” In Untying the Text: A Poststructuralist Reader, edited by Robert Young. Routledge and Kegan Paul Ltd.

Frick, Rachel L., and Merrilee Proffitt. 2022. Reimagine Descriptive Workflows: A Community-Informed Agenda for Reparative and Inclusive Descriptive Practice. OCLC Research.

Goerz, Guenther, Bernhard Schiemann, and Martin Oischinger. 2008. “An Implementation of the CIDOC Conceptual Reference Model (4.2.4) in OWL-DL.” In 2008 Annual Conference of CIDOC, Athens, September 15–18, 2008. https://erlangen-crm.org/docs/crm_owl_cidoc2008.pdf (accessed June 16, 2025).

Gränström, Claes. 1999. “Access to Current Records and Archives, as a Tool of Democracy, Transparency and Openness of the Government Administration.” Arhivski Vjesnik 42: 79–92.

Harari, Yuval Noah. 2024. Nexus: A Brief History of Information Networks from the Stone Age to AI. Penguin Random House.

International Council on Archives and InterPARES Trust. 2015. Multilingual Archival Terminology, Archives. http://www.ciscra.org/mat/mat/term/64 (accessed March 14, 2025).

Khan, Shafaq Naheed, Faiza Loukil, Chirine Ghedira-Guegan, E. Benkhelifa, and A. Bani-Hani. 2021. “Blockchain Smart Contracts: Applications, Challenges, and Future Trends.” Peer-to-Peer Networking and Applications 14 (5): 2901–25. https://doi.org/10.1007/s12083-021-01127-0.

Kuhn, Thomas. 1996. The Structure of Scientific Revolutions, 3rd ed. University of Chicago Press.

Lemieux, Victoria L., and Richard Marciano. 2025. “Teaching Computational Archival Science: Context, Pedagogy, and Future Directions.” Information Research an International Electronic Journal 30 (iConf): 301–18. https://doi.org/10.47989/ir30iConf47347.

Luke, Stephanie M., and Sharon Mizota. 2024. “Instituting a Framework for Reparative Description.” Archival Science 24 (3): 481–508. https://doi.org/10.1007/s10502-024-09435-z.

Marciano, Richard. 2022. “Afterword: Towards a New Discipline of Computational Archival Science.” In Archives, Access, and Artificial Intelligence, edited by Lise Jaillant. Bielefeld University Press. https://doi.org/10.1515/9783839455845-009.

Mordell, Devon. 2019. “Critical Questions for Archives as (Big) Data.” Archivaria 87 (May): 140–61.

Ontotext. 2025. “GraphDB 11.0.1.” https://www.ontotext.com/products/graphdb/ (accessed June 21, 2025).

Park, Ok Nam. 2015. “Development of Linked Data for Archives in Korea.” D-Lib Magazine 21 (3/4). https://doi.org/10.1045/march2015-park.

Payne, Nathaniel. 2018. “Stirring the Cauldron: Redefining Computational Archival Science (CAS) for the Big Data Domain.” In 2018 IEEE International Conference on Big Data, 3rd CAS Workshop. https://ai-collaboratory.net/wp-content/uploads/2020/03/4.Payne_.pdf (accessed March 13, 2025). https://doi.org/10.1109/BigData.2018.8622594.

Postgate, Nicholas. 1992. Early Mesopotamia: Society and Economy at the Dawn of History. Routledge.

Rajh, Arian. 2024a. “Considering Description Logic for Analyzing and Clarifying Archival Ontologies.” Atlanti + 34 (2): 7–21. https://doi.org/10.33700/2670-4579.34.2(2024).

Rajh, Arian. 2024b. “Archival Description Turns Truly Collaborative: An Exercise in Records in Contexts Standard.” Moderna arhivistika 7 (1): 63–82. https://doi.org/10.54356/MA/2024/JVGE4567.

Ridener, John. 2009. From Polders to Postmodernism: A Concise History of Archival Theory. California, USA: Litwin Books.

Schellenberg, Theodore R. 1956 (2003). Modern Archives: Principles and Techniques. Chicago, Illinois: Society of American Archivists.

Shannon, Claude E., and Warren Weaver. 1972. The Mathematical Theory of Communication. University of Illinois Press.

Taherdoost, Hamed. 2023. “Smart Contracts in Blockchain Technology: A Critical Review.” Information 14 (2): 117. https://doi.org/10.3390/info14020117.

Taylor, Petroc. 2024. “Amount of Data Created, Consumed, and Stored 2010-2023, with Forecasts to 2028.” Statista. https://www.statista.com/statistics/871513/worldwide-data-created/ (accessed March 11, 2025).

Toner, John. 2024. Wearable Technology in Elite Sport: A Critical Examination. Routledge. https://doi.org/10.4324/9781003184409.

Tschan, Reto. 2002. “A Comparison of Jenkinson and Schellenberg on Appraisal.” American Archivist 65 (2): 176–95. https://doi.org/10.17723/aarc.65.2.920w65g3217706l1.

Wroe, C. J., R. D. Stevens, C. Goble, and M. Ashburner. 2003. “A Methodology to Migrate Gene Ontology to a Description Logic Environment Using DAML+OIL.” In Biocomputing 2003 Proceedings of the Pacific Symposium, Kauai, Hawaii, January 3–7, 2003, 624–35. Singapore, New Jersey, London, Chennai, Beijing, Taipei: World Scientific. https://doi.org/10.1142/9789812776303_0058.

Zack, Michael H. 1999. “Managing Codified Knowledge.” MIT Sloan Management Review. https://sloanreview.mit.edu/article/managing-codified-knowledge/ (accessed March 19, 2025).

Zou, Qing. 2019. The Representation of Archival Descriptions: An Ontological Approach. PhD diss., School of Information Studies, McGill University. https://www.proquest.com/docview/2505370884/abstract/F24BBBC4C1A24796PQ/44 (accessed June 16, 2025).

Received: 2025-05-19

Accepted: 2025-07-11

Published Online: 2025-09-08

Published in Print: 2025-10-27

This work is licensed under the Creative Commons Attribution 4.0 International License.

https://doi.org/10.1515/pdtc-2025-0033

Keywords for this article

archival materials; description logics; discourse; information; ontology; records in contexts