1

Introduction to Scientific Reading and Writing and to Technical Modalities of Augmentation

1.1. Introduction

This collective work is the result of a project begun in 2015, the fruit of reflection carried out by members of the haStec Laboratory of Excellence1. The project started with a seminar2, from which some of the participants agreed to contribute or evaluate chapters for this book. This work brings together original contributions, selected and reviewed by at least two members of our scientific committee, to whom we are greatly indebted. Our introduction aims to synthesize the broad outlines of the seminar and to provide tools for understanding the rest of the book.

The purpose of this chapter is to situate digital reading and writing in the context of digital humanities, in order to better understand how these practices are connected to, and involved in, this disciplinary movement. Reading and writing, from a scientific as well as a more general perspective, are ancient practices; the procedures involved have developed in parallel with the tools available, shaping and structuring the thought processes of generations, well before the development of new theories of thought during the 20th Century.

Chapter written by Evelyne BROUDOUX and Gérald KEMBELLEC.
1 Histoire et anthropologie des savoirs, des techniques et des croyances: History and anthropology of knowledge, techniques and beliefs (accessed September 27th 2016, http://www.hesam.eu/labexhastec/).
2 See the seminar webpage (accessed September 7th 2016, http://www.dicen-idf.org/seminaire-ecrilecture/).

Reading and Writing Knowledge in Scientific Communities

Nevertheless, in terms of the history of scientific writing, new schools of philosophical thought emerged in the first half of the 20th Century, which established normative positions in scientific thought, through which thought may be described and categorized. The possibility of using tools to connect human knowledge was explored via the idea of the Memex3, at the end of the Second World War, although it only began to take concrete form in the final decade of the 20th Century: the Web was initially envisaged as a scientific, writable entity. In this section, we shall provide a brief overview of the digital humanities and their connection with the reading and writing process, externalized through dynamic forms of reception. We shall also take the opportunity to present models for structuring information, particularly those used for data linked to the semantic web; this will be useful in understanding certain chapters in this book.

1.2. The digital humanities

1.2.1. Field of practice

The “digital humanities” have progressively gained territory over the last decade or so, as an interdisciplinary field of research which encompasses a set of practices currently coming into use in the humanities and social sciences. The first phase consisted of making use of available computer technology to digitize documents. Objects of study in the fields of history, literature, arts, and the museum and archive sectors have been digitized, offering a wealth of new and unprecedented research opportunities, with simplified access to sources generated by the construction of new databases.

Visual representations of statistical calculations carried out on quantitative data are now accessible to all, thanks to algorithms used in graphical interfaces. In e-books, augmentation takes the form of multi-entry summaries [TRE 14], and map-style representations make it easier to search for information. Finally, narration and hypermedia illustrations add new elements to the experience, as shown in the example below, taken from a prosopographic knowledge base for history of art, which represents a chronological frieze generated by a search in a database.

Figure 1.1. Chronological frieze for the display and examination of content taken from a bibliographic database, linked to external content (LOD). For a color version of this figure, see www.iste.co.uk/kembellec/reading.zip

The first phase in the emergence of the digital humanities was thus the incorporation of digital techniques into methods for the analysis and interpretation of corpora, languages, research terrains and archives, alongside a major focus on editorial digitization projects, with the aim of broadening access for researchers and the general public to historic works, authors and documents.

A second, more reflexive phase, corresponding to the arrival of native digital research objects, highlighted the need for training, corresponding to a form of digital literacy [LED 12], which Emmanuel Souchier [SOU 13] prefers to call digital “lettrure”4. In parallel, a need to consider the ongoing transformation of research and analysis methods became apparent [RIE 12], with shifts taking place in the borders between disciplines and professions [DAC 15].

3 https://fr.wikipedia.org/wiki/Memex (accessed September 27th 2016).
4 Medieval term denoting the process of read/writing performed by monks in copying manuscripts.


1.2.2. A disciplinary movement

An explanation for this movement can be found in another characteristic of the “digital humanities”: their self-description as a form of disciplinary shift. The origins of this movement can be traced back to efforts to break down barriers in the “humanities”, which were seen in North American circles as non-viable disciplines, lacking the connections to social sciences such as sociology and anthropology that were already widespread in Europe, where humanities and social sciences tend to be grouped together. This movement has now had effects far beyond the boundaries of the humanities, posing fundamental questions concerning the theoretical basis of the new inter-discipline.

A movement results from a combination of federating elements, which communicate shared points of view, without being directed by an entity specifically charged with this function. Knowledge of the digital humanities spread through somewhat unconventional meetings, such as BarCamps5, then through the THATCamp6; these events disseminated principles and ideas for action, resulting in the production of manifestos designed to describe situations and define solutions.

The first manifesto, published on December 15, 2008 by Jeffrey Schnapp, Peter Lunenfeld, Johanna Drucker and Todd Presner on the University of California, Los Angeles (UCLA) servers, was unusual in that it was the product of a seminar (Mellon) and of a collective writing process, incorporating 124 comments (filtered by invitation). It made use of the WordPress platform and the dedicated CommentPress plugin7, which allows readers to comment on the text. The contents of the manifesto are intended to be subversive and radical (it states, for example, that anything which is not “open” should be considered to be “the enemy”), provoking critical comments. The main objective of the manifesto, whilst not stated explicitly, was to “free” the humanities from the confines of universities; disciplines and departments were perceived as systems of domination, perpetuating rules designed to legitimize competitive advantages and blocking the progress of change.

5 BarCamps are open, participatory workshop events, where content is provided by the participants themselves, discussing themes of their own choosing.
6 The Humanities and Technology Camp. The first THATCamp was held at the Center for History and New Media at George Mason University (Virginia), 27th–28th June 2009.
7 CommentPress is a WordPress plugin, produced as part of a project run by the Institute for the Future of the Book.

A second manifesto, version 2.0, was published in 2009, ratifying the first edition, notably in terms of insertion into the “wiki-economy” and the fight against the “naturalization” of print culture. In this manifesto, the digital humanities are seen as “an array of convergent practices”, rather than as a unified field. There is a special focus on curation, as an “augmented scholarly practice”, and on openness to actors from outside the scientific sphere.

In France, the Digital Humanities International monitoring blog8, financed by a TGE Adonis project9, published 568 posts on this theme between 2008 and 2012; this was followed by a major upsurge, triggered by Open Edition with the launch of the first European THATCamp on the subject of digital humanities in 2010. This resulted in the publication of a manifesto, this time in French, with certain marked differences from those published on the UCLA website. Specifically:

– the “modification of the conditions of production and diffusion of knowledge”;

– the formation of the field of digital humanities from the “convergence of interests of communities” with regard to practices, tools and a variety of transversal objects (coding of textual sources, geographical information systems, lexicometry, digitization of cultural, scientific and technical heritage, web mapping, data mining, 3D, oral archives, digital and hyper-media arts, literatures, etc.).

The actors involved stated their intention to create a “supportive, open, welcoming and freely accessible community of practice”. The document places an emphasis on free access to data and meta-data, alongside sharing and collective working.

Digital humanities projects have also been encouraged by public infrastructures that aim to provide technical support for digitization initiatives. In France, equipment has been provided (through TGE Adonis then TGIR Huma-Num) alongside a digital scientific library (BSN, bibliothèque scientifique numérique). At European level, the Dariah-EU infrastructure10 has also been created.

8 http://dhi.intd.cnam.fr/ (accessed September 27th 2016).
9 http://www.huma-num.fr.

The dynamic nature of the movement is evident in the information published on the DH list, a French-language discussion list on the digital humanities, created in March 2010 by Frédéric Clavert, Marin Dacos and Pierre Mounier. It has now been transformed into a service run by Humanistica, the French-language association for digital humanities.

1.3. Notable features of reading and writing

1.3.1. Scientific reading and writing

Digital reading and writing practices apply both to scholarly reading and to the Internet. Practices associated with the culture of “scholarly” reading have been developed over centuries, and annotations themselves have become subjects for study, either as additional elements in connection with the original texts or as documents in their own right.

The first “scholarly” reading techniques, seen, historically, from the 12th Century onwards, combine reading and writing in a process known as lettrure, involving both attentive reading and commentary. Reading and writing, the exclusive preserve of a small and essentially monastic “lettered” elite, were considered as a single process, made up of connected and complementary actions in which the highly structuring activity of reading allowed readers to become actors themselves, enriching the transmitted ideas. By means of intellectual capitalization and aggregation, this process participated in a scriptural transformation and could take concrete form on the physical medium through marginalia, footnotes and other annotations.

The networking effects of the Internet have transformed this activity, adding technical layers that relate both to the reading and writing process and to the circulation of texts, their potential and effective augmentation, their diffusion and the interception of feedback concerning their reception. The Internet and technologies associated with the use of hypertext links have resulted in the development of enriched reading environments; we have begun to examine these environments both in terms of innovations in programming and from the perspective of current and future usages.

10 Digital Research Infrastructure for the Arts and the Humanities – EU.

In certain languages, the term “ecrilecture”11 (with regional variations) has been used to refer to creative literary practices involving the use of computers, such as automatic text generation. In 1992, Pedro Barbosa used the term ecrileitura to describe the phenomenon whereby the reader is responsible for the composition of texts to read; the author takes a position earlier in the creative process, producing a text program able to generate multiple variations, of which the author cannot control either the readable forms or the interpretations. Alain Vuillemin used the term to characterize the new behavior of readers involved in creative manipulations from their side of a screen. “The act of ecrilecture, interactive writing and reading, is therefore seen as a peripheral action, implemented by the user of a computer on the basis of a fragment of reference text” [VUI 99, p. 103]. The first French “dynamic annotation system”, intended for use by readers at the Bibliothèque Nationale de France (BNF), was conceptualized, in 1999, as part of a digitization program:

“It will be possible to create a corpus of text from the collections, to organize it using bookmarks or tags, then to create associations with annotations and comments relating to pre-selected fragments” [VUI 99, p. 103].

Unfortunately, the project failed to achieve its aims; the planned “second generation” reading stations were replaced by simple reference search points, which did not offer the capacity for “ecrilecture” or for readers to share the results of their reference searches.

Vuillemin also explained his vision as follows:

“In an ideal world […] reading would result in a writing act, and acts of rewriting would lead to re-reading, not only ‘around’ a text, but also, in a way, within the text, probing its intratextual and intertextual depths. […] As this integration process becomes established, reading will cease to be ‘assisted’ by the computer, instead becoming a form of active or even interactive reading, to the point where it becomes a dynamic action, in constant renewal; in short, a true creative act of ‘ecrilecture’” [VUI 99, p. 102].

11 Note: in English, the term “reading and writing” is most widely used, but it lacks the specificity of the French term écrilecture. In the absence of a satisfactory English equivalent, the term “ecrilecture” has been adopted in this work for reasons of clarity and precision.


In Chapter 2, Viviane Clavier and Céline Paganelli consider ecrilecture as an intellectual and instrumentalized process that allows us to analyze the ways in which knowledge is created in professional communities, based on observations of their documentary practices. They make a distinction between ecrilecture, demonstrating scientific activity, and lettrure, relating to books and demonstrating erudition. Their study covers three different areas, offering a synthesis of similarities and differences in observed practices on the basis of work already carried out. The three communities under consideration were literary researchers, hospital doctors and doctoral students in information and computer sciences.

This critical position, a condition for the transmission of knowledge, is explored by Thomas Bottini in Chapter 3. The concept of ecrilecture starts from the principle that the internal writing process which takes place during reading may be externalized in different forms of annotations, supported by computing procedures. There is also a need to consider the operational aspect, highlighting the specific mental operations involved in criticism and the properties of the media holding “scholarly” content. Any conceptualization of a system of ecrilecture must therefore involve a presentation of the fundamental characteristics to which a multimedia device must respond. First, this space must be able to accommodate a variety of semiotic elements (text, graphics, sound, etc.) without limiting critical exploration and whilst maintaining the basic functions of manipulation: access to the semiotic form of appropriation, definition of a point of interest, definition of a zone or extraction of a fragment. Second, rules emanating from the typodispositional logic of the final document should not be imposed to the detriment of critical operators, which promote the exploration of an emerging network of meaning.

Annotation is notable in that it attracts the interest of actors concerned with writing in both the humanities and computer science, and a considerable amount of work has been produced on this theme over the last 30 years or so. Recently, research blogs, such as the one published by Marc Jahjah in relation to his thesis (2014), Les marginalia de lecture dans les “réseaux sociaux” du livre12 (Reading Marginalia in the “Social Networks” of the Book), or the one published by Johanna Daniel in relation to her creation of a benchmark for annotation tools for use in the history of art13 (2014), have highlighted developments in thesis production.

12 https://marginalia.hypotheses.org/.

Annotation fulfills a variety of functions at all stages of publication, including the advancement of an object during individual or collective writing, or the inclusion of comments, supporting the collaborative creation of a critical apparatus.

At this point, it is useful to note the distinction between metadata and annotations [PRI 04]: metadata is attached to a resource, identified as such, whilst annotations are “situated more within this resource, and written during the course of a reading and annotation process”. Annotation thus occurs within the object of writing, in the course of a manual process of ecrilecture.
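The distinction can be made concrete with a minimal Python sketch. It is only an illustration under stated assumptions: the class names `Resource` and `Annotation`, the character-offset anchoring and the sample values are all hypothetical and do not come from [PRI 04]. Metadata is stored once, about the resource as a whole, whereas each annotation is anchored to a fragment inside it.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    """A reader's note anchored to a fragment inside the resource (hypothetical model)."""
    start: int  # character offset where the annotated fragment begins
    end: int    # character offset just past the annotated fragment
    body: str   # the comment written during reading


@dataclass
class Resource:
    text: str
    # Metadata is attached to the resource as a whole, from the outside.
    metadata: dict = field(default_factory=dict)
    # Annotations are situated within the resource, on fragments of its text.
    annotations: list = field(default_factory=list)

    def annotate(self, fragment: str, body: str) -> Annotation:
        """Anchor a comment to the first occurrence of `fragment` in the text."""
        start = self.text.find(fragment)
        if start == -1:
            raise ValueError("fragment not found in resource text")
        note = Annotation(start, start + len(fragment), body)
        self.annotations.append(note)
        return note

    def annotated_fragment(self, note: Annotation) -> str:
        """Return the piece of text that a given annotation points at."""
        return self.text[note.start:note.end]


# Illustrative values only.
doc = Resource(
    text="Writing is always spatial.",
    metadata={"language": "en", "type": "quotation"},
)
note = doc.annotate("spatial", "Compare with the codex.")
```

Removing every annotation leaves the metadata untouched, and vice versa, which is one simple way to see that the two layers play different roles.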

It is possible to go even further, considering that the processes of ecrilecture, supported by multiple computerized functions, can now be extended automatically by computational reasoning applied to their semantics, creating a form of augmentation.

1.3.2. Ecrilecture: a major concept in the digital humanities

Olivier Le Deuff’s investigation [LED 15] of the role of indexing in the creation of the digital humanities, as an originally manual reading/ writing practice, supports the ideas expressed in version 2.0 of the Digital Humanities Manifesto, which aims to offer readers “an open, outstretched hand” [JUL 15].

Digital humanities enthusiasts are devoted to disseminating the idea that digital technology leads to profound transformations of activities associated with knowledge construction; in this, they are supported by some of their predecessors in the field of computing, who realized early on that this new technology represented a tool for writing as much as for calculation. One notable work was “Computers and writing – State of the art” [HOL 92], a seminal volume of interdisciplinary articles concerning the statistical analysis of text, indexing, text editor design, reference management, collaborative writing, hypertext writing, the cognitive aspects of writing, and so on.

These ideas were developed further by the precursors of the digital humanities, such as Jay Bolter [BOL 90], co-author of the hypertext writing tool Storyspace, for whom computers represent a new phase in the spatialization of writing, following Illich [ILL 91] and Goody [GOO 79]:

13 http://johannadaniel.fr/isidoreganesh/.

“Writing is always spatial, and each technology in the history of writing (e.g., the clay tablet, the papyrus roll, the codex, the printed book) has presented writers and readers with a different space to exploit. The computer is our newest technology of writing, and we are still learning how to use its space” [BOL 90].

It is now common for digital humanities projects to include the provision of a platform for readers, which offers annotation functions. One example with a reflexive focus is Debates in the Digital Humanities14, a hybrid publication platform launched in 2013, which explores debates in the field of digital humanities at their point of emergence. The open-access publication is made available at the same time as the printed edition. The platform developed to include additional functions whereby readers are able to interact with content, explicitly marking passages and adding terms to a collectively produced index.

On the edges of the digital humanities and ecrilecture processes, we find annotation programs, designed to lighten the cognitive load by providing spaces for the externalization of thought, promoting a critical approach. In Chapter 4, Marc Jahjah highlights arguments based on imagination in the presentation of the Hypothes.is program, essentially developed for university research. Semiotic analysis of the interfaces of this extension, a browser add-on that creates an additional column for annotating visited websites, shows that it facilitates an exchangeless ecrilecture process and promotes an “overview” vision.

Other examples of annotation can be found in scientific journals which offer the possibility for open evaluation and comments, such as PeerJ, a biomedical science journal. In France, an experiment run by the journal VertigO15 over three months in late 2015 offered five texts for open evaluation, in contrast to the classic double-blind process. Five further texts were submitted for comments, with the explicit aim of promoting formal improvements in terms of expression.

14 http://dhdebates.gc.cuny.edu/.
15 http://vertigo.hypotheses.org/.

This use of a community of peers for evaluation, consultable by all, and of proposals for formal modifications, submitted in the form of comments by any interested party, highlights a trend toward broadening the scientific selection process to include the general audience.

In Chapter 5, Lisa Chupin presents an example of scientific crowdsourcing for the transcription of herbarium labels. The ecrilecture tasks opened to public contribution are broken down in advance to create normalized texts, from which contributors are able to sort and compare their propositions via controlled forms of interaction. Proposals are then processed by algorithms, which help to resolve conflicts in interpretation, and statistics, facilitating choices established according to criteria of scientific validity. A second level of ecrilecture is found in comments, the contents of which are not of immediate utility to the system in question; however, later use of these comments will lead to an augmentation of the knowledge obtained by participants. Their informational value lies in the connections that may be created within internal collections, or in improvements that might be made to interface design.
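As a rough illustration of how competing transcriptions might be reconciled, the sketch below normalizes each proposal and keeps the most frequent one when agreement is high enough. This is an assumption made for the sake of example: the chapter does not specify which algorithm the herbarium project uses, and the function names, the 0.5 threshold and the sample labels are hypothetical.

```python
from collections import Counter


def normalize(transcription: str) -> str:
    """Collapse whitespace and case so trivially different entries compare equal."""
    return " ".join(transcription.split()).lower()


def consensus(proposals, min_agreement=0.5):
    """Return (value, share) for the most frequent normalized proposal.

    The value is None when the share of agreeing contributors does not
    exceed `min_agreement`, signalling that an expert should arbitrate.
    """
    if not proposals:
        return None, 0.0
    counts = Counter(normalize(p) for p in proposals)
    value, votes = counts.most_common(1)[0]
    share = votes / len(proposals)
    return (value if share > min_agreement else None), share


# Three contributors transcribe the same label; one of them misreads a letter.
label_value, label_share = consensus(
    ["Rosa canina L.", "rosa  canina l.", "Rosa carina L."]
)
```

The agreement share returned alongside the value is the kind of statistic that could support a validity criterion, for instance by routing low-agreement labels to a specialist.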

In addition to digitization and participation, other main concepts encountered in the digital humanities include semantics and interoperability [BLA 15].

Whilst hypertextual writing has not lived up to its full potential in terms of challenging narrativity, the new technical instrumentation available for the semantization of writing has opened new, previously unimaginable, doors.

The technologies involved in the semantic web involve a form of double-writing and double-reading. Reading is carried out by both human users and machines; annotation even has its own vocabulary of descriptive metadata16. Writing with the potential for machine automation, in terms of annotations and metadata, has the capacity to condition later readings by human users. This conditioning is the result of indexing indications, intended for different “horizontal” search engines, such as Google, or “vertical” search engines that harvest information from the open archives of scientific publications.

16 http://lov.okfn.org/dataset/lov/vocabs/vann.


1.4. Current hypertext technologies

1.4.1. From hypertext to the data web

Initially, during the “first Internet period”, the contents of a document were purely textual, illustrated and backed up by hypermedia proposed directly by the editor, who might also be the author, or aggregated by an author-editor, selected from external sources and loaded into their document, potentially using hyperlinks to the source. With the development of Web 2.0 (the social Internet), hyperdocuments were made available within tools which enabled interactions between readers and authors, by means of comment threads; these threads enriched the document by means of collective meta-reflection, perfectly applicable to a scientific process (see Chapter 5 in the case of herbaria). The third, “semantic” revolution enabled segmentation and fine documentation of components of hypertext production, with the aim of normalization and, especially, sharing. Data elements contained in documents became contextualizable, whilst retaining their status as autonomous micro-units of meaning (microdata), and can be freely linked to other data carrying a similar meaning in other contexts. The principle of linked open data is thus etymologically comparable to a form of “weaving”, in the sense used in Roland Barthes’ theory of text.
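The idea of autonomous data units that can be linked across contexts can be sketched with plain subject-predicate-object triples. The example below is a simplified illustration in pure Python rather than an actual RDF toolkit, and every identifier in it (`doc:chapter5`, `ex:mentions`, `ext:Herbarium` and so on) is hypothetical.

```python
# Each statement is a (subject, predicate, object) triple.
# All identifiers are invented for illustration.
triples = {
    ("doc:chapter5", "ex:mentions", "ex:Herbarium"),
    ("doc:article42", "ex:mentions", "ex:Herbarium"),
    ("ex:Herbarium", "ex:sameAs", "ext:Herbarium"),
    ("ext:Herbarium", "ex:label", "Herbarium"),
}


def objects(subject, predicate):
    """All objects linked to `subject` by `predicate`."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def subjects(predicate, obj):
    """All subjects linked to `obj` by `predicate`."""
    return {s for s, p, o in triples if p == predicate and o == obj}


def related_documents(doc):
    """Documents that mention at least one entity also mentioned by `doc`."""
    related = set()
    for entity in objects(doc, "ex:mentions"):
        related |= subjects("ex:mentions", entity)
    related.discard(doc)
    return related
```

Because the entity `ex:Herbarium` is described once and referenced from several documents, a reader starting from one document can reach the others, and the external equivalent, simply by following links.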

The principle of this interaction and the benefits it confers in terms of conceptual disambiguation, serendipity and the discovery of connected information can be easily seen from the perspective of berrypicking, which consists of “bouncing” from one document to another, redefining information requirements over the course of content discovery, following the model presented by Marcia Bates [BAT 89]17. Whilst the value of this editorialization of metadata around content in terms of the interconnection of data and the construction of knowledge is evident, the conceptual aspects of how this is carried out are less obvious. In the next section, we shall return to the definition of the formalism of descriptions and content links in the specific case of hyperdocuments, with a focus on the benefits available within a framework of scientifically contextualized ecrilecture.

New methods of intra- and inter-documentary connection of data, and/or their recent popularization, have widened the field of possibilities in terms of scientific production, particularly through the provision of access to data sets, dedicated descriptive vocabularies, collaborative writing platforms, scientific media databases and, evidently, collections of scientific articles.

17 Bates’ information search model was initially designed to improve the usage of information search interfaces, in this case accelerated through the use of directly hyperlinked and described connected information.

At the heart of these principles is interoperability. In Chapter 6, Camille Prime-Claverie and Annaïg Mahé consider the principle of conceptual and technical interoperability of information fragments, something which is taken for granted in the context of the web of linked data; however, its implementation is far from simple, requiring globalized governance [BOU 16]. In certain areas of the scientific communities, information collections are a veritable Tower of Babel of norms, standards and protocols, which are more of a hindrance than a help when it comes to annotating and augmenting references within a process of ecrilecture.

In Chapter 7, Rosemonde Letricot and Francesco Beretta carry out an in-depth investigation of a digital humanities project in the field of history, considering these problems of data stockpiling and presenting a methodology for modeling information fragments which not only allows, but also encourages interoperability, through a fine-grained description of content and the connections between elements.

This problem may appear to be purely documentary in nature, or even somewhat dated, since it concerns the description of a finite document in a hypermediated context, within which the reading experience is not limited to a single document; the borders of the document itself have been blurred [BRO 16] through the use of inclusions, incoming and outgoing links and different available versions, as in the case of wikis. In order for this approach to retain its relevance, we must consider a more limited context: that of content and “atomized” data, or information fragments [PRI 04]. From a computational standpoint, information was, for a long time, defined as the reception of a contextualized and inscribed content, with editorialization, in a context of reception dependent on the reader. This raised problems of reception, which is clearly fundamental in a process of ecrilecture. This problem was clearly expressed by Shannon in a seminal article, published in 1948 [SHA 48]:

“The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point. Frequently the messages have meaning; that is they refer to or are correlated according to some system with certain physical or conceptual entities. These semantic aspects of communication are irrelevant to the engineering problem”.

The data web has resulted in a shift in this paradigm, with content which may or may not be digital; the finely described and disambiguated collection of this content allows the formation of a corpus, which must necessarily take a digital form, and use a single form of description, the semantic triple. Modes of inscription and reception may take multiple forms, but the fundamental content does not change. Stéphane Crozat goes as far as to propose new editorial chains with associated semantic tools, aiming to make natively semiotic production calculable [CRO 16]. This point will be considered in greater detail later; for now, we must consider the nature of a triple, and the forms in which it may be encountered in various contexts linked to research. The semantic triple, the basic unit of the Resource Description Framework (RDF), may be defined as a formalism for describing content, based on a very simple principle, very similar to the grammatical construction of a phrase with subject, verb and complement; in this case, we have a subject, predicate and object:

– the “subject” is the resource presented in the association. It may be represented by an Internet address in the form of a Uniform Resource Identifier (URI), or by a unique identifier in a knowledge base, known as a Uniform Resource Name (URN). For example, a scientific article may be referred to in a description using a permanent address in a scientific archive such as HAL18 or ArXiv (URI), or its digital object identifier (DOI), a unique identifier assigned through registration agencies, which plays a role comparable to that of a URN for research articles;

– the “predicate” is the property assigned to the “subject”. This property refers to a category which is pre-defined by rules included in a set adopted by communities of use, and stored in a permanent manner on dedicated servers with a static web address. The predicate is thus presented in the form of a web address, including a prefix that specifies the address of the selected descriptive vocabulary (or schema), a radical that specifies the descriptive concept in question, and a suffix that is one of the descriptive attributes of the concept;

– the “object”, the final element of the triple, is the value of the property or predicate assigned to the subject; like the subject, it may be a URI or a URN, but it may also be a chain of characters known as a “literal” (in standard RDF, a literal may only appear in this position).
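The three roles listed above can be sketched as a small data structure. This is only an illustrative model, not a reference implementation: the class names `IRI`, `Literal` and `Triple`, the helper `describe` and the `urn:example:` identifiers are hypothetical, and the sketch assumes, as in standard RDF, that a literal may only occupy the object position.

```python
from typing import NamedTuple, Union


class IRI(NamedTuple):
    """A resource named by a URI or a URN (hypothetical helper type)."""
    value: str


class Literal(NamedTuple):
    """A plain chain of characters (hypothetical helper type)."""
    value: str


class Triple(NamedTuple):
    """One statement: subject, predicate, object."""
    subject: IRI
    predicate: IRI
    object: Union[IRI, Literal]


def describe(triple: Triple) -> str:
    """Render a triple as a readable 'subject -- predicate --> object' line."""
    obj = triple.object
    # Literals are shown in double quotes, identifiers as they are.
    shown = f'"{obj.value}"' if isinstance(obj, Literal) else obj.value
    return f"{triple.subject.value} -- {triple.predicate.value} --> {shown}"


# Illustrative statement: the identifier and the title are placeholders.
example = Triple(
    subject=IRI("urn:example:article-1"),
    predicate=IRI("http://purl.org/dc/elements/1.1/title"),
    object=Literal("An example title"),
)
summary = describe(example)
```

Keeping identifiers and literals as distinct types makes the difference between a link to another resource and a simple text value explicit in the model.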

18 French scientific self-archiving center run by the CNRS.


A simple example of a triple might describe a scientific article hosted online by the archive infrastructure of the CNRS. In this case, the subject would be either the URI where the article may be accessed, or its unique identifier in the archive. For the predicate, the web address of a widely accepted descriptive vocabulary should be given, for example, that of the Dublin Core Metadata Initiative, which includes 15 basic descriptors; the appropriate descriptor must then be chosen.
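As a minimal sketch of how a predicate address is assembled, the following Python fragment joins the Dublin Core element namespace with one of its 15 basic descriptors. The function name `dc_predicate` is hypothetical; the namespace and the element names are those of the Dublin Core Metadata Element Set.

```python
# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NAMESPACE = "http://purl.org/dc/elements/1.1/"

# The 15 basic Dublin Core descriptors.
DC_ELEMENTS = frozenset({
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
})


def dc_predicate(element: str) -> str:
    """Return the full predicate address for a Dublin Core descriptor."""
    if element not in DC_ELEMENTS:
        raise ValueError(f"{element!r} is not one of the 15 Dublin Core elements")
    return DC_NAMESPACE + element


title_predicate = dc_predicate("title")
```

Rejecting unknown names early avoids publishing predicates that no shared vocabulary defines, which would defeat the purpose of using a common descriptive language.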

Thus, the phrase “the article hosted by HAL with the URI https://hal.archives-ouvertes.fr/hal-00628355 is entitled ‘Ontologie franco/anglaise du domaine informatique comme accès à un corpus de textes scientifiques’” would be expressed as follows:

<https://hal.archives-ouvertes.fr/hal-00628355>

<http://purl.org/dc/elements/1.1/title>

<‘Ontologie franco / anglaise du domaine informatique comme accès à un corpus de textes scientifiques’>

The description of this resource may be extended by creating other triples with the same subject – in this case, the article – but with different predicates, such as the author(s), date of publication, the subject of the article and the language of publication; evidently, the objects used must correspond to the chosen predicate.
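The title statement given above can also be written in N-Triples, one of the standard line-based RDF serializations, in which identifiers are placed between angle brackets and literals between double quotes. The sketch below is illustrative only: the function names `escape_literal` and `to_ntriples` are hypothetical, and only the most common escape sequences are handled.

```python
def escape_literal(text: str) -> str:
    """Escape backslashes, double quotes and line breaks for N-Triples."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def to_ntriples(subject: str, predicate: str, value: str, lang: str = "") -> str:
    """Serialize one statement whose object is a literal."""
    # An optional language tag (for example "fr") follows the closing quote.
    tag = f"@{lang}" if lang else ""
    return f'<{subject}> <{predicate}> "{escape_literal(value)}"{tag} .'


line = to_ntriples(
    "https://hal.archives-ouvertes.fr/hal-00628355",
    "http://purl.org/dc/elements/1.1/title",
    "Ontologie franco/anglaise du domaine informatique "
    "comme accès à un corpus de textes scientifiques",
)
```

Further statements about the same article (author, date, language) would simply be additional lines sharing the same subject, each built with a different predicate.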

The grammatical metaphor used above may be taken further, considering phrases expressed in the active or the passive voice, inverting the subject and the complement, with transitive notions of collection: “the chapters make up a book”, or “the book is made up of chapters”. This reflexivity is absolute in terms of content; in one case, the accent is placed on the “chapter” object, whilst in the second, the emphasis is on the collection, the “book”.
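The active and passive formulations can be sketched as a pair of inverse predicates. The example below assumes the `hasPart` and `isPartOf` properties of the Dublin Core terms vocabulary; the function name `invert` and the `urn:example:` identifiers for the book and its chapter are hypothetical.

```python
DCTERMS = "http://purl.org/dc/terms/"

# Each predicate is mapped to the predicate that states the same fact
# from the point of view of the other resource.
INVERSE = {
    DCTERMS + "hasPart": DCTERMS + "isPartOf",
    DCTERMS + "isPartOf": DCTERMS + "hasPart",
}


def invert(triple):
    """Swap subject and object, replacing the predicate by its inverse."""
    subject, predicate, obj = triple
    return (obj, INVERSE[predicate], subject)


# "The book is made up of chapters" (placeholder identifiers).
book_view = ("urn:example:book", DCTERMS + "hasPart", "urn:example:chapter-1")
# "The chapters make up a book".
chapter_view = invert(book_view)
```

Both tuples carry the same information; only the resource placed in the subject position, and therefore the emphasis, differs.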

In technical terms, there are several methods for describing information fragments within web pages; the most widespread, including microdata and RDFa, are widely supported in scientific communities and communities of practice, and have been standardized. Vocabularies and data models, including those for modeling scientific objects, have been created and made available through the schema.org website19. Some content managers have begun fine integration of these elements, making them “discoverable” and usable by end users.
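To give an idea of what such in-page descriptions look like to a machine, the following sketch reads microdata properties from a small HTML fragment using only the Python standard library. The fragment, its title and its author are placeholders, the class name `ItempropCollector` is hypothetical, and nested items are deliberately ignored for brevity.

```python
from html.parser import HTMLParser

# Placeholder fragment using the schema.org ScholarlyArticle type.
SAMPLE = """
<article itemscope itemtype="http://schema.org/ScholarlyArticle">
  <h1 itemprop="name">An example article title</h1>
  <span itemprop="author">A. Author</span>
</article>
"""


class ItempropCollector(HTMLParser):
    """Collect the text of elements carrying an itemprop attribute."""

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._current = None

    def handle_starttag(self, tag, attrs):
        name = dict(attrs).get("itemprop")
        if name:
            # Start recording the text of this property.
            self._current = name
            self.properties[name] = ""

    def handle_data(self, data):
        if self._current:
            self.properties[self._current] += data

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current property.
        self._current = None


collector = ItempropCollector()
collector.feed(SAMPLE)
properties = collector.properties
```

A production tool would rely on a dedicated microdata or RDFa parser; the point here is only that the descriptive layer is carried by attributes that leave the displayed text unchanged.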

Hence, a single web page may be the augmented sum of references to material content, which is itself referenced in digital catalogs and presented using consensual descriptive languages that are available online. The textual elements proposed by the author for description and/or criticism may also be tagged using descriptive elements.

As an illustration, consider a biographical note for an artist. This may include a text biography of the painter, a portrait showing the artist, a partial or full catalog of works – localized and identified, secondary literature and, potentially, a critical analysis of the artist’s work, including influences, dominant themes, any collaborations and contexts of production, such as reviews or exhibiting galleries. Considering this example further, a finer analysis of descriptive methods for the content in this biographical note is possible. The transposition from text to hypertext has already been covered in detail in published literature from a semiotic perspective, with an analysis of text segmentation and associated tagging. “Digital calculability” really comes into play with the latest version of HTML (HyperText Markup Language), which offers the possibility to create hyperdocuments with sections which can all be explicitly identified, using a typology which is oriented more toward semantics than to presentation. It thus becomes possible to refine the granularity of segmentation of documents, with tags marking content as “articles”, “dates”, “definitions”, etc. Birth and death dates are thus factual elements tagged in the text using semiotic markers that can be presented or highlighted by hypertext reading tools. These tools generally take the form of browser add-ons, which may easily be activated or deactivated as required.
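The tagging of factual elements such as dates can be illustrated with the HTML `time` element, whose `datetime` attribute carries a machine-readable value alongside the displayed text. In the sketch below, the biographical sentence and both dates are invented placeholders, and the class name `TimeCollector` is hypothetical.

```python
from html.parser import HTMLParser

# Placeholder biographical note: the dates do not refer to a real artist.
NOTE = (
    "<article><p>The painter was born on "
    '<time datetime="1850-01-01">1 January 1850</time> and died on '
    '<time datetime="1900-12-31">31 December 1900</time>.</p></article>'
)


class TimeCollector(HTMLParser):
    """Collect the machine-readable values of <time> elements."""

    def __init__(self):
        super().__init__()
        self.dates = []

    def handle_starttag(self, tag, attrs):
        if tag == "time":
            value = dict(attrs).get("datetime")
            if value:
                self.dates.append(value)


collector = TimeCollector()
collector.feed(NOTE)
dates = collector.dates
```

A reading tool of the kind described above could use such values to highlight dates in the text or to place them on a timeline, without any change to the prose seen by the human reader.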

More recent work has analyzed the metalinguistic context of hypertext links, highlighting other issues that are invisible to the human eye: the contexts of hypertext publications are no longer the exclusive concern of human readers, and also take account of the needs of machines and indexing algorithms. Indexing algorithms – most importantly Google – receive extra information in addition to that which is displayed, thanks to metadata included in the hypertext code [KEM 16a]. For ethical, editorial or economic reasons, for example, it is possible to cite and hyperlink a resource for human users, whilst preventing search engines from following the outgoing link and from counting it in popularity algorithms [SAE 15a, SAE 15b, SIR 13]. Inversely, it is also possible to link to a resource or meta-information destined exclusively for use by analytical tools [KEM 16b] whilst choosing not to display it, considering the benefits of display to be outweighed by the cognitive cost, following the principle of information overload. In all of these examples, the issues of describing and linking data in a traditional context of optimizing the classification of web pages via an intrinsic structural quality, SEO, are also applicable to the context of research, including in the humanities; new scientific platforms for accessing research documents use these new methods to give access to, and create links between, research documents.

19 This community offers descriptive vocabulary for a wide range of scientific subjects, mainly, but not exclusively, in the technical sciences and in medicine: http://schema.org.

1.4.2. Specific elements of scientific augmentation: examples

These new methods of semantic linking of web fragments have created new possibilities for enrichment. The Adonis VBI20, for example, offers a software extension for research blogs, which allows automatic contextual suggestions for recommended research content (articles, theses, chapters, graphics, etc.) taken from the Isidore platform21 and in direct connection with the current post [POU 16].

Figure 1.2. Example of enrichment of a hypertext page

20 Very Big Infrastructure.

21 An indexing platform for scientific documents.


Working along the same lines, Thomas Francart proposes integrating hypermedia content, displayed on demand and drawn from scientific data repositories, for example, those used in médiHal22 or semantized encyclopedic databases (see Figure 1.2), such as the dbPédia knowledge base, into articles or research blogs; this content is displayed in an extra column alongside the main content23 [FRA 15]. This content may be selected manually by the author, by a physical ecrilector, or even by an algorithm on the basis of tracking observations.

Figure 1.3. Example of article annotation in the PeerJ journal. For a color version of this figure, see www.iste.co.uk/kembellec/reading.zip

In the context of scientific publishing, Hans Dillaerts and Lise Verlaet have reconsidered the concept of linked data, preferring the term “semantic publishing”, as suggested by Shotton [SHO 09]; they associate this term with ecrilecture [VER 16]: “New forms of scientific journal now allow readers to participate in the semantic publishing process, notably via the use of ‘ecrilecture’ tools”. Semantic publishing offers a number of advantages for both authors and readers, including the semantic enrichment of scientific publications with interactive data; reinforcement of the meaning of articles through semantic tagging; and direct linking to external resources or cited references, promoting the discovery and reuse of new knowledge through publication of the article and research data in formats that can be read by both machines and humans. The issues and challenges involved in these new modes of ecrilecture are presented in Chapter 8.

22 The media sub-section of HAL, used for self-archiving scientific media.

23 See the example of the proposed concept at http://labs.sparna.fr/isidore-enrichissement-article.html.

The annotatable online PeerJ journal24 uses editorialized metadata in POSH25, RDFa and Microdata to enable contextual annotation and the insertion of situated questions for authors, thus allowing debates (see Figure 1.3: article text on the right, annotation tool on the left) within a peer review journal. The history of versions or reviews is accessible, allowing asynchronous observation of the definitive construction of the document, with different perspectives on the science being practiced. Following a similar model, Alexandre Monnin created an online annotatable version of his 2013 thesis using Philoweb.org26, following the principle of an augmented scientific semantic web and segmenting his content using the HTML5 mode. The interactive tool used the CommentPress plugin, mentioned above, and allowed critical readers to submit situated comments to the author. Over the course of the experiment, which went on for 2 years, a first layer of peritext was created around the content, allowing the author to present connected information. Readers then began to participate in annotating the manuscript. Following on from this first stage, which the author considered to be valuable, debate continued in parallel in the form of epitext, using social media streams associated with the project; the experiment’s Facebook page played host to a considerable volume of discussion. In a similar vein, Johanna Daniel, a graduate of the prestigious Ecole des Chartes in Paris, created a pragmatic combination of content and form, allowing readers to follow the writing process for her thesis Les outils d’annotation et l’édition scientifique de corpus textuels (annotation tools and scientific publishing of textual corpora) online, herself using an annotation tool. This experiment was not without its advantages for the writer, as it allowed her to raise awareness of her work and to obtain feedback in terms of both content and methodology, alongside collaborative spellchecking27.

24 Peer review journal, available since 2013 at: https://peerj.com/.

25 Plain Old Semantic HTML, the old way of expressing metadata in HTML hyperdocuments.

26 See http://hackyourphd.org/2014/01/interview-dalexandre-monnin-une-these-augmentee-avec-philoweb-org/.

27 See the associated blog: http://johannadaniel.fr/isidoreganesh/memoire/.


1.5. Conclusion

Our aim in this first chapter was to present the framework for digital reading and writing, alongside the main historical, conceptual and technical elements involved in its existence within the digital humanities.

The following chapters go into greater depth with regard to these different aspects, providing illustrations and subjects for reflection, for example, concerning the reproduction of classic editorial modes or their transformation using collaborative modes of reading and writing. Similarly, the automation of references does not necessarily require a modification of the traditional editorial model, but rather its integration into a defined process. In terms of reading and writing, the technicality of practice might be seen as the sign of a transformation of the act of reading itself. Finally, a reflection on augmentation must relate not only to content, but also to the different forms of authorities involved. A number of questions remain to be answered, relating, for example, to the tangible construction of meaning; the creation of new knowledge via data connections, made possible by web semantics, in the idealized “Memex” form envisioned by Bush, or a watering down along the horizontal model observed in social media. This version has already been criticized by Bourdieu in broadcast media, on the basis that it contains very little original content, endlessly duplicated in a form of “circular circulation” of information. The contributions made by social annotations, and the quality of these annotations, are an interesting subject for discussion in the near future.

1.6. Bibliography

[BAT 89] BATES M., “The design of browsing and berrypicking techniques for the online search interface”, Online Information Review, vol. 13, no. 5, pp. 407–424, 1989.

[BLA 15] BLANCHARD A., SABUNCU E., “Les humanités numériques, une science “plug and play”?”, in CARAYOL V., MORANDI F. (eds), Le tournant numérique des sciences humaines et sociales, Maison des sciences de l’homme d’Aquitaine, Pessac, 2015.

[BOL 90] BOLTER J., Writing Space: The Computer, Hypertext, and the History of Writing, Lawrence Erlbaum Associates, Mahwah, 1990.


[BOU 16] BOULET V., “De la SDN à la Nuit debout: les métadonnées et les enjeux de gouvernance internationale”, I2D – Information, données & documents, vol. 53, pp. 35–36, 2016.

[BRO 16] BROUDOUX E., “Contours du document numérique connecté”, in PAGANELLI C., CHAUDIRON S., ZREIK K. (eds), Documents et dispositifs à l’ère post-numérique, Conférence Cide 18, Europia Productions, Paris, France, 2016.

[CRO 16] CROZAT S., “Ecrire avec une machine à calculer, écrire pour une machine à calculer”, I2D – Information, données & documents, vol. 53, pp. 62–64, 2016.

[DAC 15] DACOS M., MOUNIER P., Humanités numériques: état des lieux et positionnement de la recherche française dans le contexte international, Research Report, Institut français, 2015.

[FRA 15] FRANCART T., “L’apport de la sémantique dans l’écriture scientifique augmentée”, Quatrième séance du séminaire écrilecture augmentée sur le web pour les communautés scientifiques, CNAM, Labex Hastec, Paris, France, March 2015.

[GOO 79] GOODY J., La Raison graphique, la domestication de la pensée sauvage, Editions de Minuit, Paris, 1979.

[HOL 92] HOLT P., WILLIAMS N. (eds), Computers and Writing: State of the Art, Kluwer Academic Publishers, Dordrecht, 1992.

[ILL 91] ILLICH I., Du lisible au visible. Sur l’art de lire de Hugues de Saint-Victor, Editions du Cerf, Paris, 1991.

[JUL 15] JULIEN Q., CITTON Y., “Manifeste pour des humanités numériques 2.0”, Multitudes, vol. 59, pp. 181–195, available at: http://www.cairn.info/revue-multitudes-2015-2-page-181.htm, 2015.

[KEM 16a] KEMBELLEC G., “Que voit réellement Google de la sémantique des pages web?”, I2D – Information, données & documents, vol. 53, p. 65, 2016.

[KEM 16b] KEMBELLEC G., “Le web de données en contexte bibliothécaire”, I2D – Information, données & documents, vol. 53, pp. 30–31, 2016.

[LED 12] LE DEUFF O., “Humanisme numérique et littératies”, Semen, no. 34, pp. 117–134, 2012.

[LED 15] LE DEUFF O., “Les humanités digitales précèdent-elles le numérique?”, in SALEH I. et al. (eds), H2PTM’15, ISTE Editions, London, 2015.

[POU 16] POUYLLAU S., “Isidore Suggestion, des recommandations de lecture pour les blogs de science”, I2D – Information, données & documents, vol. 53, p. 44, 2016.


[PRI 04] PRIÉ Y., GARLATTI S., “Méta-données et annotations dans le web sémantique”, Revue I3 Information-Interaction-Intelligence, vol. 4, pp. 45–68, 2004.

[RIE 12] RIEDER B., RÖHLE T., “Digital methods: five challenges”, in BERRY D.-M. (ed.), Understanding Digital Humanities, Palgrave Macmillan, Basingstoke, 2012.

[SAE 15a] SAEMMER A., Rhétorique du texte numérique: figures de la lecture, anticipations de pratiques: essai, Presses de l’Enssib, Villeurbanne, 2015.

[SAE 15b] SAEMMER A., “Pour une sémiotique critique de l’hyperlien”, Quatrième séance du séminaire écrilecture augmentée sur le web pour les communautés scientifiques, Paris, France, available at: http://www.dicen-idf.org/evenement/quatrieme-seance-du-seminaire-ecrilecture, June 2015.

[SHA 48] SHANNON C., “A mathematical theory of communication”, The Bell System Technical Journal, vol. 27, pp. 379–423 and 623–656, July and October 1948.

[SHO 09] SHOTTON D., “Semantic publishing: the coming revolution in scientific journal publishing”, Learned Publishing, vol. 22, no. 2, pp. 85–94, 2009.

[SIR 13] SIRE G., La production journalistique et Google : chercher à ce que l’information soit trouvée, PhD Thesis, Panthéon-Assas University, p. 339, November 2013.

[SOU 13] SOUCHIER E., “La ‘lettrure’ à l’écran”, Communication & langages, vol. 2012, no. 174, pp. 85–108, January 2013.

[TRE 14] TRÉHONDART N., “Le livre numérique ‘augmenté’ au regard du livre imprimé: positions d’acteurs et modélisations de pratiques”, Les Enjeux de l’information et de la communication, no. 15/2, pp. 23–37, 2014.

[VER 16] VERLAET L., DILLAERTS H., “L’enjeu du web de données pour l’édition scientifique”, I2D – Information, données & documents, vol. 53, p. 49, 2016.

[VUI 99] VUILLEMIN A., “La lecture interactive et l’écrilecture”, in VUILLEMIN A., LENOBLE M. (eds), Littérature, informatique, lecture, Presses Universitaires de Limoges, Limoges, 1999.