Pragmatically Improving Access to Peirce’s Archive

Mary Keeler

doi:10.1515/css-2020-0009

Mary Keeler (b. 1948) is a retired professor of Telecommunication Media. She formerly served on the Editorial Board of Springer-Verlag’s Lecture Notes in Artificial Intelligence series on Conceptual Structures (1997–2017). Her research includes Peirce’s sign theory, logic, and pragmatism, which are applied in Knowledge Processing technology, Complex Adaptive Systems, and Game Theory. Publications include: “Revelator’s complex adaptive reasoning methodology for resource infrastructure evolution” (2008) and “Complex adaptive reasoning: Knowledge emergence in the revelator game” (2009).

Published/Copyright: February 6, 2020


From the journal Chinese Semiotic Studies, Volume 16, Issue 1

Abstract

This second article (in a series of six) presents an examination of the existing conditions that limit access to Peirce’s corpus, and offers a detailed account of the reasoning technology that could be used to improve these conditions. Peirce’s theory of recursive inquiry, relying on his pragmatism, gives us the basis for designing technology to support effective access to his archive, and to pursue future improvement by developing a crowdsourced community of inquiry for continuing semiosis.

Keywords: corpus; crowdsource; reasoning technology; recursive; semiosis

1 Introduction

Peirce’s philosophical work was not widely appreciated during his lifetime, but he has since been recognized as “the most original, versatile, and comprehensive philosophical mind [America] has yet produced” (Nagel 1980: 185). His ideas are now broadly studied worldwide, in nearly every field of research, even though many of his manuscripts remain accessible only in the Harvard archive (https://library.harvard.edu/collections/charles-s-peirce-papers). He worked as a physicist, chemist, mathematician, astronomer, and geodesist, and was competent and accomplished in the sciences generally, though he always regarded himself primarily as a logician, in a far broader sense than we now use that term. He extended traditional logic to include relations, which made possible the logical analysis of probabilistic reasoning (see Peirce 1992: 258–607; and see W5, 372–387). Peirce not only demonstrated the connections among logic, probability, and inductive reasoning, but also first sketched and proposed an electrical switching circuit to perform logical and arithmetical operations (see Figure 2.1 below; L 269; W5, 421–23 [1886]).

Figure 2.1

Peirce’s proposed electrical switching circuit

As a scientist and (human) computer, Peirce worked to advance the physical sciences by experiment and theory development, and headed the establishment of the Office of Weights and Measures in 1872 (now, the National Institute of Standards and Technology).^[1] In 1876, he was the first to calibrate the accuracy of the standard meter in wavelength units of light.^[2] This contributed to the worldwide effort to establish measurement standards as the basis upon which researchers could rely in designing instruments to compare results in their efforts to measure the force of gravity and to define the shape of the Earth and the Milky Way. As briefly mentioned in the previous article “Discovering the hidden treasure of C. S. Peirce’s manuscripts,” his scientific work led him to conceive pragmatism as a method of thinking for continuing the process of inquiry, with truth as the logical limit, to improve (or economize) the natural trial-and-error procedure of learning by experience. He created a graphical notation tool, Existential Graphs (EGs), as a topology of logic for observing and demonstrating this process of dialogic reasoning. He referred to EGs as the “lens” and to pragmatism as the “compass,” the tool and method of research needed to demonstrate his general theory of signs, providing the capability to explore how knowledge continues to grow through careful observation and strategic conceptualization in collaborative learning. If we hope to use this method to improve access to his manuscript archive, we must first examine the current limitations on access.

2 Nature of the corpus and current conditions of access

For 100 years, the large collection of Peirce’s manuscripts in Harvard’s Houghton Library has remained effectively accessible only to those who visit the archive, but its high-acid paper continues to deteriorate, making it fragile to handle. As explained previously, the textual character of his philosophical explorations would defy representation in any traditional medium, and bibliographically they pose more difficulties. As mentioned, his writings are also filled with graphics and color crucial to interpretation of his philosophy, science, and methodology, in discussions that range across many disciplines. Because Peirce was a polymath, his work cannot be fully comprehended or critically edited from any one disciplinary perspective. Mathematics might occur significantly in a paper on phenomenology, or chemistry in one on pragmatism, or logic in a mathematical paper, or geodesy in one on metaphysics. His philosophical work was never composed for or consolidated into a published account during his lifetime, and only highly selected editions have been produced since.^[3] Very few hardy scholars have studied the estimated 50,000 unpublished pages he wrote from 1900 until his death in 1914. Of this material, Joseph Esposito observes:

In his later years Peirce had ample opportunity to conclude that his lifetime efforts would not be long remembered. Had he known otherwise he might have been inspired to write a short book on his conception of the task of philosophy, a table of contents for future generations to follow. Instead he proposed giant works, works far beyond the scope of a single man to complete even in a lifetime. (1980: 1)

Leslie Morris, who is Curator of Modern Books and Manuscripts at the Houghton Library, helps us examine the conditions of access to Peirce’s archive, provided by current “finding aid” and “metadata” systems. Her answers to the following four questions furnish a basis for examining how technological augmentation might improve these access conditions.

1) How accessible is the Peirce MS collection, compared with other collections in the Houghton's Modern Manuscripts archive?

Answer: To the extent that we can rely on the Annotated Catalogue of the Papers of Charles S. Peirce by Richard Robin (known as “the Robin” catalogue), the Peirce papers are as accessible as other collections of modern manuscript works. The Houghton did not accept the Peirce papers as “uncatalogued,” but the concept of “work” is troublesome here, because it usually refers to what is published or presented in some formal condition. From the Houghton’s perspective, the Robin provides a way of organizing in categories with reference numbers for retrieval and citation. We have no way of knowing what is in the archive other than what the Robin structure provides (Keeler et al. 2019: Notes 2–4).

For the most part, the other modern manuscript collections at Houghton are clearly associated with published works, which establish reliable means of identifying manuscripts, and providing access; although many collections will also have some unpublished material, they are generally more coherent in form and structure than the Peirce manuscripts. Such is the case with the Emily Dickinson collection.^[4]

2) What is the current physical condition of the CSP collection, and what is its expected “lifetime”?

Answer: Condition varies from folder to folder, and box to box. Some paper is relatively strong; other paper is so brittle it breaks when handled, or so soft it tears when turned. We improved storage conditions when we hired a student to go through the collection and remove paper clips, and anything else that might contribute to deterioration; we also re-housed everything in new archival folders and boxes, relieving overcrowding in many. Paper conservation will be needed in some cases to make sure the manuscripts can survive the physical handling of the digitization process. The conservator did a spot survey of the condition of the collection, in preparation for a grant application by the Houghton Library to digitize the CSP Manuscript Collection, in 2010.

Eight representative boxes of the manuscripts (from the 151 boxes in the collection) were selected and assessed for overall condition, to determine conservation treatment needs and to estimate the time needed to stabilize the collection for safe handling during digitization (see the following excerpts from the conservation report).

The manuscripts reviewed span from the mid-19th century to the early 20th century, and are written in all types of ink, colored pencil and graphite on all sorts of paper types, generally single sheets of stationery and lined paper, although other formats and bound composition notebooks are present. There are occasional printed journals and other related material. The collection is cataloged and generally placed in numbered folders within manuscript boxes.

The manuscripts are generally in stable condition. However, some manuscripts have soil, tears, and creases in the text area that present handling problems. The bound materials present issues for digitization using the Cannon system, and may require a studio camera setup. Due to the immense size of the collection (151 boxes, estimated at 80,000–100,000 sheets of paper), the time to treat minor condition issues will be high.^[5]

The conservation literature tells us that the Peirce papers were produced on paper that is inherently fragile:

In the mid-19th century, the transition began from cotton or linen-fiber paper to wood-pulp paper, which involved chemicals that reduced the “inherent life” of these products. “Mass deacidification” of deteriorating paper began in the mid-20th century, but that process is expensive and difficult, and then microfilming began to replace paper conservation efforts.^[6]

3) How can a scholar/researcher who has the title of a Peirce essay, for example “A survey of pragmaticism,” find the MS number for it in the archive (the Robin catalogue is not very useful in this way)?^[7]

Answer: In normal circumstances, when the library has its own electronic finding aid, we add information as we discover it. If a new attribution needs to be added to the record, we add it, but retain the old identification as “formerly identified as,” since the scholarly literature may have cited it in that way, and scholars may have only this earlier form. If a description is found to be wrong, it’s changed. Often, we don’t know where such titles supplied by scholars come from, or how to find them in the collection. Thanks to the work of L. John Old and Houghton archivists, the Robin catalogue was converted to a form compatible with our system, and we can now add information to it. But this is a very recent development, and we do not have the staff available to retrospectively add corrections such as those made by Christian Kloesel to this base document.^[8]

4) How well does the current “finding aid” metadata system work in describing the CSP material, considering its disorder? What might be more effective for searching the collection, considering its nature and condition?

Answer: That’s really a question for the Peirce community, but my impression is that the Robin, while useful, isn’t meeting their needs. From the Library’s perspective, given how long the Robin numbers have been used as a reference point, we’d be reluctant to completely rearrange the papers; and who is to know if a new physical arrangement would be any better? There needs to be a way to create a finding aid with persistent reference numbers that will allow the intellectual, rather than physical, rearrangement of the corpus.

If we had a huge “virtual light table,” we could show how complicated the manuscript collection is, with everything from index cards to large-size drawings. Take for example: http://agrippa.english.ucsb.edu/category/the-book. If you click on “Compare Images of Agrippa in Virtual Lightbox,” it does seem to do what I was suggesting. You can have a group of images, and move them around for comparison purposes. It’s a bit awkward to use once you get larger than thumbnails – and how does one get rid of the extra thumbnails? But I think, if you had enormous screen real estate, it would work well. Maybe we need a combination of that “Light Table” and “Lightbox”?^[9]

3 Reasoning technology to improve access

Although Peirce is best known as the father of American pragmatism and the modern developer of semiotic (or sign theory), he has finally become widely recognized as a leading architect of relational logic, which underlies the advance in computing systems from data and information processors to knowledge processors. John Sowa’s seminal work to develop Conceptual Graphs (CG)^[10] was derived from Peirce’s graphical logic called Existential Graphs (EG), and knowledge science researchers have further developed the CG formalism as a graphical communication interface for building knowledge processing systems, to be interpreted efficiently by both programmer and program processor.

Peirce created the EG (or a “topology of logic”) to observe and experiment with the development of any reasoning as it progresses. He insisted that deductive thought is not the rigid rule-driven (algorithmic) procedure that traditional logic conveniently assumes, and that logic requires an instrument designed for its purpose of observing the reasoning process. EG provides diagrams for mapping the process of reasoning (see CP 4.512-513 [1903]), a purpose analogous to what sophisticated instruments and techniques have given empirical investigation for critical control in examining any evidence (see MS 291 [1905]).

[O]ne can make exact experiments upon uniform diagrams; and when one does so, one must keep a bright lookout for unintended and unexpected changes thereby brought about in the relations of different significant parts of the diagram to one another. Such operations upon diagrams, whether external or imaginary, take the place of the experiments upon real things that one performs in chemical and physical research […] experiments on diagrams are questions put to the Nature of the relations concerned. (CP 4.530 [1906])

Peirce’s theoretical view of inquiry (in semiotic terms) conceives knowing as the provisional result of continuing hypothetical inference. As we create ways to represent knowledge, any conventional system (or medium) we create tends to become algorithmic through our habitual application of it, allowing us to believe we can do only what that system can do. Peirce’s pragmatic method, implied by his sign theory (semiotic), reminds us to maintain a provisional view of our conventions by self-critically examining the actual and possible outcomes of our habitual behavior, using whatever means we can create to do so: observational instruments, modes of expressing and comparing results of these observations, and augmentation of these techniques and powers through the invention of communication media for that purpose. Peter Skagestad (1981) emphasizes Peirce’s perspective on the use of “thinking machines” to augment intelligence: “the specialized tools devised for symbol manipulation are valuable to the extent that they extend, augment, or amplify human intelligence, rather than replicating it” (p. 7).

Where once books and library indexes gave us the only access to intellectual resources, now we must construct new “mediation tools” for communication – not only for effective access, but also for continuing to improve these resources in contexts of collaborative research. Digital archive developers need efficient communication facilities to engage users in designing digital document delivery systems for customizing resources to serve particular requirements of application in education and research. Automated methods of search and retrieval (for both text and images) will continue to become more sophisticated, with knowledge processing increasing control over information quantity and quality. Pattern-matching search systems can now be modularly integrated with text search systems, but true recognition systems require perceptual and semantic judgment (or inference in conceptual-level processing).

Concept-based catalogues could be a primary tool for effective digital resource development and access. Based on the data from current print finding aids, they can link to both transcribed text and digital images of the original manuscript pages, as these are produced. By collaborative development through network communication, scholars and editors can first correct the catalogue entries, using additions and alterations accumulated during the decades of print-edition editing. Concurrently, an ontology of the collection (or a comprehensive conceptual framework that relates the data in a coherent structure, or logical representation) can be constructed as the basis for interrelating the catalogue entries conceptually to accommodate diverse disciplinary perspectives.

4 Ontology development

As more than a database, an ontology must functionally represent the conceptual perspectives of its users, and continue to evolve as users’ inquiry evolves. Graphical views (as visualizations of this data-under-evolution) are imperative if those collaborating are to remain critically aware of the implications of their contributions with respect to other contributions in the scheme as a whole. Methods in use and under continuing development in knowledge representation research make possible the three essential functions required in the process of collaborative conceptual catalogue evolution: reporting, tracking, and mapping (see Sowa 1984: 405–424). Groups of scholars or researchers in any particular discipline must be able to report their interpretations concerning the implications of Peirce’s complex arguments, based on the representations of manuscript pages as their evidence. Any order of pages in a manuscript will be based on conjecture, since many pages have been lost or misplaced, but also because of Peirce’s complex composition style. He sometimes used the same page in several versions of elaborate arguments, the course of which sometimes even doubles back on itself to pick up an unexplored path, as Shea Zellweger’s diagram shows in mapping the page-by-page course of Peirce’s “explorations.”^[11]

Each scholar contributing to the digital resource development of Peirce’s manuscript material must be enabled to create such maps or diagrams, as possible reconstructions, for page-by-page tracking of Peirce’s multi-path argument developments. These “Shea diagrams” (or S-diagrams) can then be matched for discrepancies in sequence across contributions, to pinpoint where controversies in reconstructions lie and further investigation must take place. These diagrams must be linked to digitized manuscript pages, as the primary evidence for what Peirce wrote; and as interpretations proceed, the diagrams can map the courses of different scholars’ arguments advanced on the basis of Peirce’s original work.^[12] The diagrammatic correlation (by linking) between the manuscript page images and a scholar’s represented arguments makes it possible for any researcher to trace developing interpretations back to the original evidence.^[13]
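The matching of reconstructions for discrepancies in sequence can be illustrated with a minimal sketch. It assumes, purely for illustration, that each scholar's reconstruction is reduced to an ordered list of persistent page identifiers; the identifiers, the function name, and the use of Python's standard `difflib` module are assumptions of this sketch, not part of any existing S-diagram tool.

```python
from difflib import SequenceMatcher


def sequence_discrepancies(order_a, order_b):
    """Return the spans where two proposed page orders disagree.

    Each result is (kind, pages_in_a, pages_in_b), where kind is one of
    'replace', 'delete', or 'insert' as reported by difflib.
    """
    matcher = SequenceMatcher(a=order_a, b=order_b, autojunk=False)
    return [
        (kind, order_a[i1:i2], order_b[j1:j2])
        for kind, i1, i2, j1, j2 in matcher.get_opcodes()
        if kind != "equal"
    ]


# Hypothetical page identifiers, for illustration only.
scholar_a = ["p1", "p2", "p3", "p4", "p5"]
scholar_b = ["p1", "p3", "p2", "p4", "p5"]

discrepancies = sequence_discrepancies(scholar_a, scholar_b)
for kind, pages_a, pages_b in discrepancies:
    print(kind, pages_a, pages_b)
```

Each reported span marks a place where the two reconstructions part ways, which is where the linked page images would need to be consulted. A linear comparison of this kind does not capture pages reused in several versions of an argument; a graph representation would be needed for that.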

To begin an approach to that level of augmentation for access to Peirce’s manuscripts, two researchers in the technology of knowledge representation examine the potential. Heather Pfeiffer (Conceptual Graphs researcher) and Uta Priss (Formal Concept Analysis researcher) answer four questions related to those answered by the manuscript curator (see above).

1) Considering that the Robin catalogue is now the primary means of accessing Peirce’s manuscripts, how could computer technology improve standard “finding aid” methodology for this collection?

Answer: Of course, when the collection has been fully imaged and transcribed in digital text, the “finding aid” can rely on complete searchability. Cross-references to the established catalogue (“the Robin”) will then be most helpful, but the Robin must also be available in searchable text, online.^[14] Harvard has created a new finding aid which is not based on the Robin’s numbering system, but includes the Robin numbers in parentheses: https://id.lib.harvard.edu/ead/hou02614/catalog. The original Robin can be found at the Peirce Edition Project (without Kloesel’s annotations) as a searchable online file with the note: “Please bear in mind that the composition dates given in the catalog are those determined by Richard Robin and his associates [in 1967]. Annotations will be added later to make corrections and give composition dates established by the Peirce Edition Project.”

Experimentation with different interfaces will be needed to improve access to the catalogue’s entries, and to make possible updating of those entries when the manuscripts can be more easily resorted as digital files (with image linked to text). These three representations are essential for effective access:

Image-based system
Text-based system
Conceptual catalogue access to full text, connected to image files.

2) Considering the current physical condition of the CSP collection (and its urgent need for digitization) how could a “conceptual catalogue” help capture Peirce’s intentions better in the digital form than can any printed finding aid?^[15]

Answer: Once the manuscripts are preserved in a digital archive, they can be safely resorted in as many “virtual orders” as scholars might imagine. The conceptual catalogue could then keep track of these “experiments,” along with any crowdsourced evaluation of their validity. With continuing examination and experimentation, Peirce’s “intentions” may be converged upon, pragmatically. However, since Peirce’s concepts, as well as the artifact of his writings (the manuscripts) continued to grow and change, any representation of his “intentions” will be expected to evolve as well. A “knowledge base” representation of the preserved material, its continued virtual sorting, and the “shared, scholarly crowd-sourced testing” of possible orders, could exemplify Peirce’s intention of continuity of inquiry.

3) With a “conceptual catalogue,” how could a scholar/researcher who has the title of an unpublished Peirce MS, for example “A survey of pragmaticism,” find the MS number for it in the archive?^[16]

Answer: Begin with an experienced Peirce scholar/editor’s instructions:

First go to the bibliography at the end of volume 8 of the Collected Papers. There the text in question is assigned bibliographic reference: g-c.1907c. Next, go to Robin’s catalogue, and open the third appendix where this reference is identified as MS 318.

Next open up the catalogue at entry 318 where you will see that the Collected Papers text comes from pages 7–45 of one draft and pages 46–87 of another.

Armed with that information you can then consult the microfilm, or, if you are at the Houghton, place a request for manuscript 318.

In short, all you need are the Collected Papers (CP) and Robin’s catalogue. Of course, things may not be this easy in all cases, since Robin’s catalogue is not up to date with current bibliographic Peirce scholarship, which would require a continuing online process of improvement.

The following “technology-augmented answer” assumes that Robin’s catalogue and CP are available in an electronic format, which has been converted into a relational database.^[17]

A relational database for this problem might contain three tables (for the bibliographic records, Robin’s catalogue, and the digital image locations). When a user searches for the title (“A Survey of Pragmaticism”), all tables would be joined via the bibliographic reference (1907) and subsection (c) values. A join combines records from the tables wherever they share common values, producing a new set of records. Through the join query, the manuscript number (318) would be discovered, and the matching records in the image location table would then give the location of the relevant digitized images.

The answer from the experienced Peirce scholar requires manual cross-referencing (see Keeler et al. 2019: Note 13, listing problems in that transition). If the sources are converted into a set of tables within a relational database, then a join query – the process where records from two or more tables are combined using a particular set of query criteria – could be done.^[18] Also, notice (in the scholar/editor’s instructions) that augmenting the procedure for finding manuscript numbers with technology requires: knowing that the Collected Papers has a bibliography at the end of Volume 8; similar knowledge of, and digital access to, Robin’s catalogue’s index correlating CP bibliography reference numbers to manuscript numbers; and digitized images of the manuscript pages transcribed to digital text. Without that digitization, the tables cannot be validated.
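The join described in this answer can be sketched with Python's built-in `sqlite3` module. The table names, column names, and image paths below are hypothetical, chosen only to mirror the three tables mentioned above; only the title, the reference (1907, subsection c), and the manuscript number (318) come from the text.

```python
import sqlite3


def build_demo_database():
    """Create an in-memory database with three illustrative tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE bibliography (title TEXT, ref_year INTEGER, subsection TEXT);
        CREATE TABLE robin_catalogue (ref_year INTEGER, subsection TEXT, ms_number INTEGER);
        CREATE TABLE image_locations (ms_number INTEGER, page INTEGER, location TEXT);
        """
    )
    conn.execute(
        "INSERT INTO bibliography VALUES (?, ?, ?)",
        ("A Survey of Pragmaticism", 1907, "c"),
    )
    conn.execute("INSERT INTO robin_catalogue VALUES (?, ?, ?)", (1907, "c", 318))
    # Hypothetical file paths; a real archive would supply its own locations.
    conn.executemany(
        "INSERT INTO image_locations VALUES (?, ?, ?)",
        [(318, 7, "images/ms318/p007.tif"), (318, 8, "images/ms318/p008.tif")],
    )
    return conn


def find_images(conn, title):
    """Join the three tables to go from a title to manuscript page images."""
    query = """
        SELECT r.ms_number, i.page, i.location
        FROM bibliography AS b
        JOIN robin_catalogue AS r
          ON r.ref_year = b.ref_year AND r.subsection = b.subsection
        JOIN image_locations AS i
          ON i.ms_number = r.ms_number
        WHERE b.title = ? COLLATE NOCASE
        ORDER BY i.page
    """
    return conn.execute(query, (title,)).fetchall()


conn = build_demo_database()
rows = find_images(conn, "A survey of pragmaticism")
for ms_number, page, location in rows:
    print(ms_number, page, location)
```

The case-insensitive comparison is a design choice of the sketch, since titles are capitalized differently across sources; a production catalogue would likely also need variant-title handling.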

4) How could a “conceptual catalogue” improve on the current “finding aid metadata system” in describing the CSP material, considering its disorder and its prevalent “overlapping hierarchy” structure?

Answer: Of course, after the manuscripts are digitized and transcribed, Robin’s catalogue will no longer be the only means of accessing the collection (but the manuscripts must be displayed in searchable full-text format). We are fortunate that the Peirce corpus at Harvard is free from copyright constraints.^[19]

The metadata markup must be chosen carefully so that it maintains essential information but does not restrict future exploration (by imposing an unwanted order or requiring frequent updates). The TEI (Text Encoding Initiative) provides a markup scheme that allows for recording syntactic structures as well as physical structures (such as line breaks and page breaks). But the TEI scheme is quite complex, so must be carefully considered to determine which markup tags should be selected and how they can be applied in a consistent manner.^[20]
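As a rough illustration of how physical structure recorded in TEI markup can be read back out, the sketch below parses a small TEI-style fragment using the TEI elements `pb` (page beginning) and `lb` (line beginning). The sample text is placeholder content, not a transcription of any Peirce manuscript, and the function name and the choice of Python's standard `xml.etree.ElementTree` module are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

TEI_NS = "{http://www.tei-c.org/ns/1.0}"

# Placeholder fragment for illustration; not an actual transcription.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body><p>
    <pb n="7"/>first line of placeholder text<lb/>second line
    <pb n="8"/>text on the next page
  </p></body></text>
</TEI>"""


def lines_by_page(xml_text):
    """Group the text following each <pb/> and <lb/> marker by page number.

    This simplified reader only collects text that directly follows a
    page-break or line-break element; richer markup would need more care.
    """
    root = ET.fromstring(xml_text)
    pages = {}
    current = None
    for elem in root.iter():
        if elem.tag == TEI_NS + "pb":
            current = elem.get("n")
            pages[current] = []
        if elem.tag in (TEI_NS + "pb", TEI_NS + "lb") and current is not None:
            tail = (elem.tail or "").strip()
            if tail:
                pages[current].append(tail)
    return pages


pages = lines_by_page(SAMPLE)
for page_number, lines in pages.items():
    print(page_number, lines)
```

Because the page and line markers are empty elements rather than containers, they do not impose a hierarchy on the text, which is one reason such milestone-style tags suit material whose order is still under investigation.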

A conceptual catalogue will provide a flexible interface for organizing the conceptual content of the manuscripts and for browsing and exploring. Overlapping hierarchies are not a problem. Traditional libraries are restricted to tree-hierarchies because they define an ordering system for arranging books on shelves. Using graphical interfaces on the computer makes it possible to create multiple simultaneous access structures. Hierarchies can be combined or individually presented in a flexible manner. Hierarchies can be designed or extracted from the data. Other graphical structures, such as networks, can also be extracted and combined with the hierarchies. A variety of tools for data exploration should be made available:

browsing interface for the relational database of Robin’s catalogue;
graphical tools for exploring hierarchies and networks arising from that data;
tools for exploring the manuscripts (for example a tool which lets researchers arrange manuscripts, pages or individual text passages – ideally it would be a tool where such items can be placed on a virtual table where they can be moved around, annotated, cut, glued together, etc.);
search functionality; and
linguistic tools (for example, it should be possible to build a concordance of words or phrases, finding all their occurrences in the manuscripts and listing them within their immediate context).
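The concordance tool mentioned in the last item can be sketched in a few lines. The example assumes that transcriptions are available as plain text keyed by a manuscript identifier; the identifiers and sample sentences are invented placeholders, and the function name is hypothetical.

```python
import re


def concordance(texts, term, width=30):
    """List every occurrence of a word or phrase with its immediate context.

    texts maps a manuscript identifier to its transcribed text. Each hit is
    (manuscript_id, left_context, matched_text, right_context).
    """
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    hits = []
    for ms_id, text in texts.items():
        for match in pattern.finditer(text):
            left = text[max(0, match.start() - width):match.start()]
            right = text[match.end():match.end() + width]
            hits.append((ms_id, left, match.group(0), right))
    return hits


# Placeholder transcriptions, for illustration only.
transcriptions = {
    "MS A": "The sign stands for an object. Every sign has an interpretant.",
    "MS B": "No signs here.",
}

hits = concordance(transcriptions, "sign")
for ms_id, left, word, right in hits:
    print(f"{ms_id}: ...{left}[{word}]{right}...")
```

Whole-word matching is used here, so "signs" is not counted as an occurrence of "sign"; a real linguistic tool would probably offer lemmatized or variant-spelling search as an option.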

Because it is difficult to use optical character recognition (OCR) with handwriting, it might be a good idea to use crowdsourcing for the transcription of the manuscripts. Groups devoted to the crowdsourced creation of online reference works, such as Wikipedia, have shown that it is feasible to find volunteers for such tasks. Some online infrastructure for managing the crowdsourcing process would be required [see more on this in section 5, below].

Richard Robin tried to gather all “boxes” and “catalogues” in one place (make order out of chaos). We need to “preserve” that order, but manage continued “sorting” of the MS material, and develop a “sharing” process for crowdsourcing globally among Peirce researchers.^[21]

5 Technology for crowdsourcing manuscript transcription

“Crowdsourcing” on the Web has become a powerful collaborative methodology for everything from finding new stars in the galaxy to revolutionizing governments (see Zooniverse and The CrowdSauceGroup as examples). Employers can even crowdsource a labor force for “human intelligence tasks.”^[22]

In the article “Everyone’s an expert: The crowdsourcing of history” in Data Conservation Laboratory News, M. Ojala reports: “With sites like Wikipedia relying on expertise provided by a vast community of Internet users, crowdsourced knowledge is now something people rely on every day.” She points out that the crowdsourcing phenomenon predates the Internet, and can be as simple as yelling a question in a large crowd, which may by chance return the correct answer – if you are prepared to sift through responses. “Crowdsourcing is now changing the way we think about knowledge, expertise is no longer the exclusive domain of experts.”^[23]

Social media now make it possible to correct an article’s facts or write an argument in a blog post. The “Flickr Commons” project incorporates this power to identify items in the photo collections of 30 participating institutions (including the Library of Congress, the Getty Research Institute, and the British Museum) to crowdsource the collective expertise of ordinary people. Validating answers to questions on social media is also a public process, which these institutions have learned is time-consuming, but they conclude that “greater understanding of our shared past” makes the endeavor worthwhile.^[24]

A growing number of projects are using crowdsourcing to generate research and expand access to collections. The Zooniverse citizen science community is among the most successful in harnessing the public’s enthusiasm for scientific research. Their Old Weather project has helped scientists “recover Arctic and worldwide weather observations made by United States ships since the mid-19th century, by transcribing ships’ logs.” The National Library of Australia recruited amateur historians to correct the OCR text of digitized newspapers; and with user-generated content, Ancestry.com’s World Archives Project uses enthusiastic genealogists to transcribe millions of name indexes for use by researchers.

Recently, projects have been established to crowdsource the more complex task of transcribing manuscript collections, with the intention of engaging participants in adding value to such archives. Scripto, developed at the Roy Rosenzweig Center for History and New Media at George Mason University, and T-Pen from Saint Louis University, are open-source tools designed to facilitate manuscript transcription. Scripto, “a community transcription tool brings the power of MediaWiki to your collections,” is currently used in several projects, including transcription of The U.S. War Department’s collected letters from the Civil War era. T-Pen is used for collaborative transcription of manuscript material, with editorial notation.^[25]

At University College London, the Transcribe Bentham project^[26] has developed crowdsourced manuscript transcription on a large scale within an academic context. They have examined its impact on scholarly editing, and also “assessed the potential benefits of engaging the public in research, […] to break down traditional barriers” of academic exclusivity.^[27]

The original Bentham Project began in 1958 to publish a scholarly edition, The collected works of Jeremy Bentham (to replace the inadequate and incomplete eleven-volume edition published between 1838 and 1843) and has published 29 of the projected 70 volumes. Before they launched Transcribe Bentham in September 2010, they estimated 40,000 manuscripts remained to be transcribed and, like the Peirce manuscripts, much of that material was inadequately studied. Details of their progress further indicate why they decided to try crowdsourced help.

The volumes of the Collected Works are, to a large extent, based on edited transcripts of Bentham’s unpublished manuscripts, 60,000 folios of which are housed in UCL Library’s Special Collections, while a further 12,500 are held in the British Library. For over half a century, editorial staff at the Bentham Project have transcribed and edited Bentham’s manuscripts according to a set of established editorial conventions; they have transcribed an estimated 20,000 folios by hand, by typewriter, and by word processor (Causer et al. 2012: 120).

Transcribe Bentham is a collaborative project involving members of the Faculty of Laws, the University Computer Centre, Library Services, Learning and Media Services, Department of Information Studies, and Centre for Digital Humanities. From the beginning, it was established:

with the intention of engaging the public with Bentham’s thought and works, creating a searchable digital repository of the collection, and quickening the pace of transcription and publication by recruiting unpaid online volunteers to assist in transcribing the remaining manuscripts. Anyone, anywhere in the world with an Internet connection can participate in Transcribe Bentham, and volunteers require no prior background knowledge or technical expertise. After registering a user account, participants transcribe Bentham’s manuscripts into a text box and, using a customized toolbar (Fig. 2), encode the features of the manuscripts in Text Encoding Initiative (TEI)-compliant Extensible Mark-up Language (XML). The transcripts produced by volunteers thus have a dual purpose: they will feed into the Bentham Project’s editorial work and help form the basis of printed editions of the Collected Works, and are uploaded to UCL’s digital Bentham repository where, owing to the TEI encoding, they render the collection fully accessible to all. [Ibid.: 120; they designed a bespoke transcription tool to allow volunteers to transcribe and encode.]
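To make the encoding step described in this passage more concrete, the following is a minimal sketch in Python of how a transcribed line might be wrapped in TEI-style elements. The element names (p, del, add, unclear) are standard TEI ones, but the helper encode_paragraph, its segment format, and the sample line are all hypothetical illustrations; this is not the output format of the Transcribe Bentham toolbar, which the source does not specify.

```python
import xml.etree.ElementTree as ET

# Segment kinds that this sketch turns into TEI-style elements (an assumption,
# not the project's actual tag set).
ALLOWED = {"del", "add", "unclear"}


def encode_paragraph(segments):
    """Build a <p> element from (kind, text) pairs.

    kind is "text" for plain transcription, or one of ALLOWED for a
    deleted, added, or hard-to-read reading.
    """
    paragraph = ET.Element("p")
    last_child = None  # most recent child element, if any
    for kind, text in segments:
        if kind == "text":
            # Plain text goes before the first child or after the latest one.
            if last_child is None:
                paragraph.text = (paragraph.text or "") + text
            else:
                last_child.tail = (last_child.tail or "") + text
        elif kind in ALLOWED:
            last_child = ET.SubElement(paragraph, kind)
            last_child.text = text
        else:
            raise ValueError(f"unknown segment kind: {kind!r}")
    return paragraph


# Invented sample line, not taken from any manuscript.
segments = [
    ("text", "The ship "),
    ("del", "left"),
    ("add", "sailed"),
    ("text", " at "),
    ("unclear", "noon"),
    ("text", "."),
]
encoded = ET.tostring(encode_paragraph(segments), encoding="unicode")
print(encoded)
# <p>The ship <del>left</del><add>sailed</add> at <unclear>noon</unclear>.</p>
```

Keeping the volunteer's decisions (deleted, added, unclear) as explicit elements, rather than as typographic conventions, is what allows the same transcript to serve both editorial work and repository search.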

Their assessment of the project emphasizes the need for moderating the transcription volunteer community:

Though moderation is evidently a time-consuming process, it was—and remains—an indispensable part of Transcribe Bentham. This suggests that project staff must devote some time to moderation and quality control; otherwise, users may lose interest, and feel undervalued, or exploited.^[28]

The Bentham team’s evaluation also considers the long-term curation and preservation of the transcripts and the manuscripts, so that:

In future, editors may work solely with XML files and use an XSLT transformation to prepare from them a scholarly edition for print or digital publication, thus skipping steps currently required and rendering the editorial process more economical. In the current climate, where the possibility of securing full funding for a further forty-one volumes seems increasingly remote, the ability to publish digitally becomes an important asset (Causer et al. 2012: 125).
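The idea of deriving an edition directly from encoded files can be illustrated with a short sketch. The passage speaks of XSLT; the Python below is only a stand-in that uses the standard library's ElementTree to perform one comparable transformation, producing a reading text from a TEI-style fragment. The function reading_text, the rule for flagging unclear readings, and the sample fragment are assumptions made for illustration, not the Bentham Project's editorial conventions.

```python
import xml.etree.ElementTree as ET


def reading_text(elem):
    """Return a reading text for elem and its descendants.

    Deleted readings are dropped, added readings are kept, and unclear
    readings are flagged as [word?]. Text following an element (its tail)
    is always kept, including the text after a deletion.
    """
    if elem.tag == "del":
        body = ""
    else:
        body = (elem.text or "") + "".join(reading_text(child) for child in elem)
        if elem.tag == "unclear":
            body = "[" + body + "?]"
    return body + (elem.tail or "")


# Invented sample fragment, not taken from any manuscript.
source = (
    "<p>The ship <del>left</del><add>sailed</add> "
    "at <unclear>noon</unclear>.</p>"
)
reading = reading_text(ET.fromstring(source))
print(reading)
# The ship sailed at [noon?].
```

A print or digital edition would need many more rules than this, but the sketch shows why encoded transcripts let editors skip intermediate steps: each output format is one more function over the same files.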

They conclude:

With adequate funding to support development costs, and enough time to mature, a crowdsourced manuscript transcription initiative like Transcribe Bentham could be enormously beneficial to a scholarly editorial project like the Bentham Project. An improved transcription tool would relieve volunteers from being overly-concerned with encoding and allow them to concentrate upon deciphering Bentham’s manuscripts, and result in the submission of a greater number of transcripts at a faster pace. The speedy production of high-quality transcripts would then quicken the pace of editing and publishing the anticipated forty-one printed volumes of Bentham’s works yet to appear. Our results suggest that a longer time-scale would allow the community of transcribers to develop and become more self-sufficient, requiring less feedback and quality control from staff, thereby rendering the project more cost-effective. Nevertheless, despite the difficulties involved in transcribing Bentham’s manuscripts, and despite the short time-frame in which the tool was developed, publicized, and made available to the public, Transcribe Bentham has engaged a wide range of people and produced a significant number of usable transcripts (and continues to do so). This underlines the great potential of crowdsourcing manuscript transcription: if untrained volunteers are able to transcribe the papers of Jeremy Bentham, some of which border on the illegible, they can transcribe almost anything. (Causer et al. 2012: 133)

The relative success of Transcribe Bentham contrasts with some other attempts.^[29] However, many other operations continue to thrive (for example, Smithsonian Digital Volunteers), and a new industry has emerged around crowdsourcing.^[30] In fact, the crowdsourcing methodology has spread to many other forms of development beyond research contexts, and there are now many crowdfunding platforms, with sites that rank them in various ways. Social networking studies of crowdsourcing have led to related analysis of community-building, which clarifies the distinction between “crowd” and “community.” Establishing a crowdfunded operation for crowdsourced transcription (and editing) of Peirce’s manuscripts could be much easier than for Bentham, because Peirce already has an extensive (worldwide) community of online researchers.^[31]

6 Conclusions for future access development

We are, according to Peirce, primarily creatures who can form hypotheses, which originate as “spontaneous conjectures of instinctive reason” (CP 6.475 [1908]). His theory of inquiry as learning by experience (or logic in semiosis) attempts to explain how instinct evolves into intellect, and his pragmatic methodology was conceived to clarify how our necessarily fallible knowledge can progress with increasing validity and verifiability. This “bootstrapping operation of inquiry” also justifies its application: “as we remain disposed to self-criticism and to further inquiry, we have in this disposition an assurance that if the truth of any question can ever be got at, we shall eventually get at it”.^[32]

“Reasoning technology” can now augment Peirce’s pragmatic method, which entails continually checking the veracity of our representations with respect to the evidence we intend them to refer to (such as manuscript pages): validity entails verifiability. Any provisionally validated interpretations (as logical arguments) must function recursively in further inquiry, as Michael Hoffmann observes: “Peirce seems to consider the processes of reasoning [abduction, deduction, and induction] themselves to be structured recursively.” He offers this example from Peirce’s 1903 Lectures on Pragmatism: “where in analysis we treat operations as themselves the subject of operations” (137-38; CP 5.162 [1903]). The manuscripts for those lectures (MSS 301-316a) are increasingly colorful and so not yet effectively accessible, but Esposito has closely studied the original pages and finds clarification that Peirce intends the recursive process^[33] to be central to his evolutionary theory and pragmatic method of scientific inquiry – the clue to understanding his theory of inquiry as a theory of learning or evolutionary epistemology, which defines truth as that to which inquiry would converge if continued indefinitely.^[34]

Hoffmann further argues that Peirce’s logical diagrams can represent their abstractions as new objects, “in Peirce’s terminology […] as hypostatic abstractions,” giving us the power to take novel points of view in continuing inquiry. In his “Grand Logic” of 1893, Peirce had already begun to respond to the question: “How do concepts evolve?”

We can answer for ourselves after having worked a while in the logic of relatives. It is not by a simple mental stare, or strain of mental vision. It is by manipulating on paper, or in the fancy, formulæ or other diagrams -- experimenting on them, experiencing the thing. Such experience alone evolves the reason hidden within us and as utterly hidden as gold ten feet below ground--and this experience only differs from what usually carries that name in that it brings out the reason hidden within and not the reason of Nature, as do the chemist’s or the physicist’s experiments. (CP 4.86, and see Keeler et al. 2019: Note 11)

Initially, the Peirce research community could create a Wikipedia-style model for conceptual catalogue transcription and development, which could be greatly improved in its operation if guided by Peircean principles of inquiry and by technology inspired by his ideas. Consider Robert Burch’s appraisal of Peirce in the Stanford Encyclopedia of Philosophy (SEP): “given his lifelong ideas and goals as a scientist-philosopher, he probably would have found the current practical importance of his ideas entirely to be expected.” For example: “He would not be in the least surprised to find that the topic of constructing ‘ontologies’ in vogue among computer scientists […] he would be right at home among them.” Burch finishes his entry with a synoptic account of the many contemporary, practical and even crucial uses of Peirce’s ideas in the development of algorithms at the core of what is known as “Social Network Analysis.” Peirce’s ideas are more relevant still to the future of Web-based community development that could overcome the challenges of access to his writings, as we shall see in further explorations.^[35]

Peirce would now hope that once we better understand the current access limitations and technology’s potential, we could continue to conjecture what sort of augmented access might be developed. As we proceed in the following articles, let us further examine how his manuscript evidence and ideas might lead to technology improvement for unlimited future expeditions, as his pragmatism encourages!

About the author

Mary Keeler

Mary Keeler (b. 1948) is a retired professor of Telecommunication Media. She formerly served on the Editorial Board of the Springer-Verlag’s Lecture Notes in Artificial Intelligence series on Conceptual Structures (1997–2017). Her research includes Peirce’s sign theory, logic, and pragmatism, which are applied in Knowledge Processing technology, Complex Adaptive Systems, and Game Theory. Publications include: “Revelator’s complex adaptive reasoning methodology for resource infrastructure evolution” (2008) and “Complex adaptive reasoning: Knowledge emergence in the revelator game” (2009).

Note: This is the second of a series of short articles on improving effective digital access to Peirce’s manuscripts. It has been partly adapted from the in-progress online handbook Discovering the future in the past: How C.S. Peirce’s 19th century ideas challenge 21st century technology, which is available on ResearchGate at:

https://www.researchgate.net/publication/335870177_Discovering_the_Future_in_the_Past

References

Causer, Tim, Justin Tonra & Valerie Wallace. 2012. Transcription maximized; expense minimized? Crowdsourcing and editing the collected works of Jeremy Bentham. Literary and Linguistic Computing 27(2). 119–137. doi:10.1093/llc/fqs004

Esposito, Joseph. 1980. Evolutionary metaphysics: The development of Peirce’s theory of categories. Athens, OH: Ohio University Press. http://www.digitalpeirce.fee.unicamp.br/p-virtuality.htm (accessed 23 December 2019).

Hoffmann, Michael. 2003. Peirce’s “Diagrammatic reasoning” as a solution of the learning paradox. In G. Debrock (ed.), Process pragmatism: Essays on a quiet philosophical revolution, 121–143. Amsterdam: Rodopi, B.V. doi:10.1163/9789004493261_012

Hookway, Christopher. 1985. Peirce. London: Routledge and Kegan Paul.

Keeler, Mary & Christian Kloesel. 1997. Communication, semiotic continuity, and the margins of the Peircean text. In David C. Greetham (ed.), The margins of the text, 269–322. Ann Arbor, MI: University of Michigan Press.

Keeler, Mary. 1998. Iconic indeterminacy and human creativity in C.S. Peirce’s manuscripts. In George Bornstein & Teresa Tinkle (eds.), The iconic page in manuscript and digital culture, 157–194. Ann Arbor, MI: University of Michigan Press.

Keeler, Mary A., Leslie Morris, Heather Pheiffer, Uta Priss, L. John Old & Michael Hoffmann. 2019. Discovering the future in the past: How C.S. Peirce’s 19th century ideas challenge 21st century technology. ResearchGate, October. https://www.researchgate.net/publication/335870177_Discovering_the_Future_in_the_Past (accessed 23 December 2019).

Ketner, Kenneth & Hilary Putnam. 1992. Reasoning and the logic of things: Charles Sanders Peirce. Cambridge, MA: Harvard University Press.

Peirce, Charles (ed.). 1883. A theory of probable inference. In Charles S. Peirce (ed.), The Johns Hopkins Studies in Logic, 126–181. Boston, MA: Little, Brown and Co. doi:10.1037/12811-007

Peirce, Charles. 1887. Logical machines. The American Journal of Psychology I. 165–170. http://history-computer.com/Library/Peirce.pdf (accessed 23 December 2019).

Peirce, Charles. 1931–1958. Collected papers of Charles Sanders Peirce (8 vols.). Edited by Arthur Burks, Charles Hartshorne & Paul Weiss. Cambridge, MA: Harvard University Press.

Peirce, Charles. 1976. The new elements of mathematics by Charles S. Peirce. Edited by Carolyn Eisele. The Hague & Paris: Mouton & Co. B.V.

Peirce, Charles. 1984–2008. Writings of Charles S. Peirce: A chronological edition. Edited by Edward C. Moore et al. Bloomington, IN: Indiana University Press.

Peirce, Charles. 1977. Semiotic and significs: The correspondence between Charles S. Peirce and Victoria Lady Welby. Edited by Charles S. Hardwick. Bloomington, IN: Indiana University Press.

Peirce, Charles. 1992. Reasoning and the logic of things. Edited by Kenneth Ketner. Cambridge, MA: Harvard University Press.

Peirce, Charles. 1997. Pragmatism as a principle and method of right thinking: The 1903 Harvard “Lectures on pragmatism.” Edited by Patricia Turrisi. Albany, NY: SUNY Press.

Skagestad, Peter. 1981. The road of inquiry: Charles Peirce’s pragmatic realism. New York: Columbia University Press.

Semetsky, Inna. 2009. Meaning and abduction as process-structure: A diagram of reasoning. Cosmos and History: The Journal of Natural and Social Philosophy 5(2). 191–209. http://www.cosmosandhistory.org/index.php/journal/article/viewFile/159/258 (accessed 21 December 2019).

Sowa, John. 1984. Conceptual structures: information processing in mind and machine. Reading, MA: Addison-Wesley.

Peirce citing conventions

^[36]CP x.y = Collected papers of Charles Sanders Peirce, volume x, paragraph y.

EP x:y = The essential Peirce: Selected philosophical writings, volume x, page y.

NEM x:y = The new elements of mathematics by Charles S. Peirce, volume x, page y. (Some scholars use “NE”.)

PPM x = Pragmatism as a principle and method of right thinking: The 1903 Harvard “Lectures on pragmatism,” page x. (Some scholars use “HL”.)

SS: x = Semiotic and significs: The correspondence between C. S. Peirce and Victoria Lady Welby, page x. (Some scholars use “PW”.)

W x:y = Writings of Charles S. Peirce: A chronological edition, volume x, page y.

Published Online: 2020-02-06

Published in Print: 2020-02-25
