
DARE: UNESCO computerized data retrieval system …unesdoc.unesco.org/images/0000/000054/005427eo.pdf · DARE Unesco computerized data retrieval system for documentation in the social

Sep 09, 2018


No. 27 DARE Unesco computerized data retrieval system for documentation in the social and human sciences


Unesco


REPORTS AND PAPERS IN THE SOCIAL SCIENCES

The Reports and Papers are intended to present to a restricted public of specialists descriptive or documentary material as and when it becomes available during the execution of Unesco's programme in the field of the social sciences. They will consist of either reports relating to the Regular programme of Unesco and its operational programmes of aid to Member States or documentation in the form of bibliographies, repertories and directories.

The authors alone are responsible for the contents of the Reports and Papers and their views should not necessarily be taken to represent those of Unesco.

These documents are published without strict periodicity. Currently available:

SS/CH 11 - International Repertory of Institutions Conducting Population Studies (bilingual: English/French), 1959.
SS/CH 15 - International Co-operation and Programmes of Economic and Social Development (bilingual: English/French), 1961.
SS/CH 17 - International Directory of Sample Survey Centres (outside the United States of America) (bilingual: English/French), 1962.
SS/CH 18 - The Social Science Activities of Some Eastern European Academies of Sciences, 1963.
SS/CH 19 - Attitude Change: a review and bibliography of selected research, 1964 (out of print in English, available in French).
SS/CH 20 - International Repertory of Sociological Research Centres (outside the U.S.A.) (bilingual: English/French), 1964.
SS/CH 22 - Institutions Engaged in Economic and Social Planning in Africa (bilingual: English/French), 1966.
SS/CH 23 - International Repertory of Institutions Specializing in Research on Peace and Disarmament, 1966.
SS/CH 24 - Guide for the Establishment of National Social Science Documentation Centres in Developing Countries, 1969.
SS/CH 25 - Ecological data in comparative research: Report on a first International Data Confrontation Seminar (bilingual: English/French), 1970.
SS/CH 26 - Data archives for the Social Sciences: Purposes, operations and problems, 1972.

Printed in the Workshops of Unesco, Place de Fontenoy, Paris-7e, France. © Unesco 1972. Printed in France. SHC.72/XV.27/A

1972 International Book Year


DARE Unesco computerized data retrieval system for documentation in the social and human sciences (including an analysis of the present system)

Paul Vásárhelyi, Institute for Economic Planning, Budapest

Unesco


PREFACE

Even before the installation of a computer in Unesco, the Social Science Documentation Centre (SSDC), popularly known as the Social Science Clearing House, was gathering and supplying data to its main clients - social sciences specialists in and out of the Unesco Secretariat, international and national research and training centres, and professional groups. Its main collection of data is in the synoptic card index and supporting files on social science research, advanced training and documentation centres. These now include more than two-thirds of the 3,000 centres it eventually hopes to cover.

Every effort is now being made to ensure that the index will be of value not only within Unesco but also to centres and scholars in Member States. The data stored are circulated through the Reports and Papers in the Social Sciences (RPSS), the International Social Science Journal (ISSJ), the Bibliography, Documentation, Terminology bulletin, the International Peace Research Newsletter, the International Committee for Social Science Documentation's World list of social science periodicals, and in ad hoc publications.

The Centre has a second large file - the biographical file on some 5,000 social science specialists. From this a major directory was extracted: Social scientists specializing in African studies, a revised version of which is now ready for publication. Similar directories for other regions could be compiled as required. Among other needs, these directories would help to satisfy the constant demands from programme specialists for social scientists as potential contractors, consultants, teachers, research investigators, junior experts, speakers, trainees, fellows, and so on.

Thinking about the mechanics of compilation and problems of coding, and experience with similar problems in Southern Asia, led to the decision to plan a long-term integrated project, to be carried out in stages in the coming biennia. The decision to instal an ICL 1902A computer at Unesco headquarters ensured the necessary technical support.

A feasibility study by David Nasatir showed computerization of SSDC information to be both desirable and feasible. The resulting project is, however, modest in its aims, namely to meet the increasing information demands of programme specialists of Unesco, and those of other clients as far as possible.

The next step was to make a detailed analysis of the existing arrangements, and design a new computerized system. This was the responsibility of the present author. The actual programming work is now being done by a team headed by George Majtenyi and consisting of Thomas Koltai and Zoltán Zsombok. The system will be operational by the end of 1972.

The present account describes the main operational features and the quantitative and qualitative development possibilities.

The experience gained in designing the new system may help others similarly asked to provide various services on the basis of large files of non-numeric data. It might also serve as a model for those interested in the creation of specialized but interconnected computerized information systems within the United Nations group of organizations. The system described here may of course be subject to minor changes.


CONTENTS

INTRODUCTION

Chapter I EXISTING SYSTEM
1. Some basic concepts
2. Present operation
3. Limitations

Chapter II IMPROVEMENTS NEEDED
1. The information base
2. New or improved services
3. Tailored information
4. Efficiency

Chapter III DARE: DATA RETRIEVAL SYSTEM FOR DOCUMENTATION IN THE SOCIAL AND HUMAN SCIENCES
1. Sub-systems
2. Design problems

Chapter IV MAINTAINING THE DATA BASE
A. Manual procedures
B. Computerized procedures

Chapter V ANSWERING QUESTIONS
1. Formulating questions
2. Computer procedures involved
3. Use of computer output

Chapter VI PREPARING INDEXES
1. Procedures
2. Frequency of indexing runs; distribution of outputs

Chapter VII LAUNCHING DARE
1. Phases
2. Input to data base
3. Opening files
4. Current Awareness Service


INTRODUCTION

The Social Science Documentation Centre (SSDC) is a servicing unit of the Unesco Secretariat in Paris. The computerized system has been designed to facilitate its work in the circulation of information, data retrieval, and indexing services.

The structure of this report reflects the sequence followed in the planning.

The first chapter gives some basic concepts and shows limitations in a manual system which make computerization desirable - pointing out, however, that the possibilities offered by computerization must not be overestimated. The next two chapters deal with the improvements needed, and the complications caused by the fact that requirements are often mutually contradictory. Chapters IV - VI describe the three major parts of the system: the updating, retrieval and indexing subsystems. The final chapter deals with the change-over to computerization.

The new system is intended to improve the range and quality of both the information and the services. Data acquisition will be improved by getting in touch directly with institutions and specialists and by widening the range of documents scanned. Four computerized files will be created: institutions, projects, specialists, documents. The four files will be linked by pointers created automatically in updating, to allow sophisticated search proceeding from one type of information (file) to another.

Quality will be improved by seeking more up-to-date, precise and detailed information than is at present available. Subject indexing will be improved by the use of descriptors, mechanical updating, and registration on magnetic discs. Current information will be circulated faster by Information Notes and selective, individual current awareness notices, since both will be prepared by computer. Information retrieval will provide answers to individual specialized queries as well. General subject indexes will be updated and published regularly. In short, it should be possible to answer more sophisticated questions faster and more fully than the staff can at present.

It was decided to use the programmes furnished with the computer by ICL, including the searching and reporting facilities of the FIND 2 Multiple Enquiry Package. The system obviously had to be compatible with other systems under development in Unesco; hence the adoption of the MARC II record format. Storage and processing are simplified by coding most of the information, and labour is saved by automatic coding and decoding based on computerized authority and decode files.

The new system will be introduced in three stages: pilot-running, parallel-running and direct change-over.


Chapter I

EXISTING SYSTEM

The Social Science Documentation Centre (SSDC) has for years been providing information services for programme specialists and other clients. The data it has manually collected, processed and stored can now serve as a basis for a computerized information service.

1. BASIC CONCEPTS

In addition to traditional documentation and library terminology, other terms used will correspond to international MARC II communication format usage.

A data element is a unit of information, e.g. pagination, title, type of activity.

A field contains data elements. Each field is assigned a name which represents the contents of that field, e.g. collation, imprint, annotation, etc. There are two kinds of fields: fixed and variable.

A fixed length field contains a data element which is always expressed by the same number of characters. For example, a date is always expressed as four numeric characters, such as 1968. Fixed fields often contain codes, e.g., subject field covered, type of activity.

A variable length field contains a data element, the length of which cannot be predetermined (e.g. title, annotation).

A continuation field contains the end or overflow of a data element too long to be contained in one field or in the first field. (See p. 18, col. 2)

A field which contains more than one data element is called a multiple field. Data elements in a multiple field may be of the same type (e.g. different names, a list of titles) or of different types. For example, a collation is a variable field made up of three data elements of three different types: pagination, illustration and size. Similarly, an imprint is made up of the following elements: place, publisher, date of publication.

A record is a collection of fields treated as a unit, e.g., the information on a catalogue card.
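The terms defined above can be pictured with a small data model. The following is only a sketch in modern Python, not the MARC II layout itself: the class names, the `fixed_length` convention (0 meaning variable length) and the sample card are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Field:
    """A named field holding one or more data elements."""
    name: str
    elements: List[str]
    fixed_length: int = 0  # assumption: 0 marks a variable length field

    def is_multiple(self) -> bool:
        # A multiple field contains more than one data element.
        return len(self.elements) > 1

    def validate(self) -> None:
        # A fixed length field always uses the same number of characters.
        if self.fixed_length:
            for element in self.elements:
                if len(element) != self.fixed_length:
                    raise ValueError(
                        f"{self.name}: {element!r} is not "
                        f"{self.fixed_length} characters long"
                    )


@dataclass
class Record:
    """A collection of fields treated as a unit, e.g. one catalogue card."""
    fields: List[Field] = field(default_factory=list)

    def get(self, name: str) -> Field:
        for f in self.fields:
            if f.name == name:
                return f
        raise KeyError(name)


# Hypothetical catalogue card: a fixed date field, a variable title field
# and a multiple imprint field (place, publisher, date of publication).
card = Record([
    Field("date", ["1968"], fixed_length=4),
    Field("title", ["Data archives for the social sciences"]),
    Field("imprint", ["Paris", "Unesco", "1972"]),
])
for f in card.fields:
    f.validate()
```

Here the imprint is a multiple field with three elements of different types, while the date passes validation only because it is exactly four characters long.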

2. PRESENT OPERATION

The data base has to be maintained, developed and regularly updated, and the information it contains used to provide various services.

The basic aim in processing documents is to update the information base.

This information base is made up of records intended for permanent or recurring use; temporary records used for a limited time period or for a given purpose only; and auxiliary files.

Main files

(a) Synoptic card index of institutions: Questions relating to specialists and research projects have at present to be answered through this index.

Double-coding is done, that is, the different subject fields etc. are assigned code numbers, then the numbers are further coded by assigning them a colour. Colour tags are inserted at given positions, a position or group of positions being allotted to a data element. Coding and retrieval by these means have obvious limitations, rendering the development of such a file beyond a certain point both impossible and aimless.

Another file of some 1,700 record cards is stored in alphabetical sequence by name of institution within larger groups arranged by country.

Still another card file of institutions, used only for the International Social Science Journal (ISSJ), contains more data elements than the synoptic card index. This file is not used for information retrieval. It contains about 1,200 records with high information content in alphabetical order and it will play an important role in the creation of the new, computerized data base on institutions.

(b) Synoptic card index of specialists. Most of the data elements in this index are coded to allow searching by colour, but the size of this file makes a full-scale search by colour quite impossible; it contains about 6,000 records in alphabetical sequence.


(c) Mission reports and conference papers: Three of the main files cover mission reports and conference working papers. The same data is arranged in three different ways: alphabetical by name of author; numeric by subject code; numeric by geographical area code.
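The double-coding described under (a) can be sketched as follows. The report does not give the actual colour scheme or card layout, so the digit-to-colour table, the position groups and the sample codes below are all hypothetical; the sketch only shows the principle of a numeric code being re-coded as colour tags at fixed positions.

```python
# Hypothetical digit-to-colour table; the real SSDC colour scheme is not given.
DIGIT_COLOURS = ["white", "red", "orange", "yellow", "green",
                 "blue", "violet", "brown", "grey", "black"]

# Hypothetical card layout: each data element owns a group of tag positions.
POSITIONS = {"subject_1": (0, 3), "subject_2": (3, 6), "activity": (6, 7)}
CARD_WIDTH = 7


def encode(element, code, tags=None):
    """Second coding step: turn a numeric code into colour tags at fixed positions."""
    tags = list(tags) if tags else [None] * CARD_WIDTH
    start, end = POSITIONS[element]
    digits = str(code).zfill(end - start)
    if len(digits) != end - start:
        raise ValueError("code does not fit the positions allotted")
    for offset, digit in enumerate(digits):
        tags[start + offset] = DIGIT_COLOURS[int(digit)]
    return tags


def matches(tags, element, code):
    """Manual search equivalent: compare the colours in one group of positions."""
    start, end = POSITIONS[element]
    return tags[start:end] == encode(element, code)[start:end]


# A card for an institution with first subject code 142 and activity code 3.
card = encode("subject_1", 142)
card = encode("activity", 3, card)
```

A three-digit subject code occupies three positions, so a searcher has to recognize a combination of three colours for a single data element, which hints at why the text calls full-scale search by colour impracticable on large files.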

Temporary files

(a) Information on institutions for inclusion in the Information Note.

(b) Two "forthcoming conference" files, alphabetical and chronological (for schedules in Information Note and ISSJ).

(c) Two files on new periodicals (for preparing the Information Note, ISSJ, and compilation of bibliographies).

(d) Two "new documents" files, indexed by author and by subject.

Auxiliary files

(a) Alphabetical card catalogue of institutions, containing as many cards for an institution as there are names in the different languages. This catalogue contains about 2,500 records.

(b) Alphabetical card catalogue of acronyms.

(c) Specialists.

(d) Answers to circular letters.

(e) Some 1,500 dossiers on institutions in alphabetical order by country, and giving when possible: (i) description; statutes; members; (ii) administrative meetings: working papers, reports of activities, current research, financial reports, programmes, summary reports; (iii) meetings: working papers, proceedings; (iv) publications.

Updating of these files is by no means easy. The main phases (see fig. 1) are as shown below.

(a) Receive documents (e.g., letters, books, periodicals, reports, fugitive materials, book announcements, etc.) produced by staff members and experts of Unesco; or by outside institutions, organizations, libraries; or coming indirectly through the Unesco Library. The procedure followed in processing internal and external documents is quite different. But external documents constitute the majority of all documents received.

(b) Selection. Only external documents of potential interest are retained for further processing.

(c) Scanning to identify items which contain information (i) modifying an existing file; (ii) demanding the creation of a new file; (iii) rendering an existing file obsolete.

(d) Recording. Checking before actually incorporating the new information is time-consuming and complicated (see fig. 2).

(e) Circulation. After all information has been incorporated, circulation is the next step in the work flow.

(f) Storage. On reception back from circulation, storage will depend on the type of document; normally, only documents related directly and closely to the work of an institution or organization will be kept and stored.

Figure 1

[Flowchart: documents for scanning; scan documents; process records marked in documents; circulate documents; decide on storage.]

The procedure followed in the case of internal documents prepared by staff members and experts (i.e. mission reports or conference working papers) is different. Each is registered and is given a bibliographical description. Catalogue cards are typed in several copies, and filed in author, subject and geographical files. All such documents (except duplicates) are preserved.

Servicing by SSDC

(a) Recurring publications. The Centre publishes an Information Note for internal use and at irregular intervals, and contributes regularly to the ISSJ, the Bibliography, Documentation, Terminology periodical, and to the International Peace Research Newsletter.

(b) Answering queries; non-recurring publications. Answering the queries of programme specialists and of users outside Unesco constitutes a major service. The information is obtained from repertories, directories, and so on, and above all from the card index of institutions. It is often necessary to go back to the supporting dossiers and examine the documents which are stored in them.

The preparation of non-recurring directory-type publications involves a specific type of information retrieval and the specialist concerned is often a member of the SSDC staff.


Figure 2

Flowchart A 12 (Details of block 4 in Flowchart A)

[Flowchart of the recording procedure. Legible steps include: check marked data elements; check name in alphabetical file IC1; search for synoptic card in file; check circular file; insert new data elements; mark answer in circular; code; acknowledge; prepare index tags; prepare synoptic card; prepare card for files IC1 and IC2; prepare circular. Branches depend on whether the record is found or not found, changed or not changed, whether a circular was sent or not sent, whether the document is an answer, and whether the information exceeds the minimum.]


3. LIMITATIONS

The present system was considered by its users to be well designed and efficient, and served them well. However, information needs are constantly and rapidly increasing in complexity, demanding even more rapid, selective, and timely servicing of an ever increasing amount of information.

Certain existing limitations can no longer be dealt with by manual methods, i.e. simply by increasing the human effort.

These limitations are as follows:

(a) The information already processed and stored cannot be fully exploited, because there are too few staff, and certain operations are not feasible by manual methods (e.g., the regular publication of reference documents or indexes covering most or all of the data available on particular subjects). The absence of such references means that even the simplest questions often have to be handled individually by SSDC staff.

A highly selective service is also needed which would give programme specialists exactly the information they require, no more and no less. The problem is not a lack of information but one of selection.

(b) (i) The only questions that can be answered directly are those relating to institutions, missions, or conference reports. In the case of questions concerning research projects, specialists or documents prepared outside Unesco, only indirect search is possible (first in the index of institutions, and then by examining all data on projects, specialists or references in the dossiers). The card index of specialists is out of date and, for projects and documents, no special file has as yet been created.

(ii) Because of the limited information registered on synoptic cards, both the index and the dossiers may have to be searched to find data about, e.g., the administration of a listed institution.

(iii) As was mentioned above, data elements are allotted fixed fields on the cards, and the content of the data element (e.g., the geographical area covered by research in an institution in Africa) is indicated by the colour of the tag inserted. But in a file containing several thousand cards, the search is obviously not going to be easy, especially if two or even three positions are allotted to a data element (e.g., the first and second social science subject codes are both allotted three positions). If the question is multiple (e.g., if the client needs information about institutions which are both working on peace research and provide training), searching has to be done on the basis of a combination of colours in different positions, which is so difficult as to be hardly worth the effort.

(iv) The number of data elements whose content can be indicated by the colour of the tag or tags inserted is very small. On the other hand, they offer the only way of reducing the mental and manual work involved, since data simply written on the card can be found only by physically handling the actual card.

The Information Note has two drawbacks. First, as the material is not organized by subject matter, the user wastes time reading entries of no interest to him; secondly, it is not published regularly.

(c) Users generally ask only questions they know can be answered by SSDC. The system is too slow and cumbersome to deal with more complicated or unusual matters, e.g., if asked to prepare a specialized directory for the purposes of a forthcoming conference.

(d) The number of cards on all the main files already exceeds the amount that can be easily handled. This is especially true of the synoptic files.

(e) Too much duplication is involved in the manual work of updating and using the information base. The same information has to be retyped in the same or a slightly changed sequence in preparing (a) cards for the different files and (b) the ultimate publications, listings and so on.

The Unesco computer

The ICL 1902A machine available at Unesco Headquarters in Paris is a relatively small but expandable third generation computer. The configuration (i.e., input, output, storage and central processor) is shown in fig. 3.


Figure 3

ICL 1902A COMPUTER CONFIGURATION

Core storage (program of instructions; working data storage): 32,000 words of 24 bits each.

Central processor: buffer; calculator.

Input devices: card reader, 1,600 cards/min.; console typewriter.

Input/output and storage devices: magnetic tapes, four units, 17 mil. ch. each; exchangeable disc stores, four units, 8 mil. ch. each.

Output devices: card punch unit, 100 cards/min.; line printer, 1,350 lines/min., 160 pos.


Chapter II

IMPROVEMENTS NEEDED

What can be demanded of a computerized system? Requirements can be stated on the basis of SSDC experience, the known limitations of the present arrangements, and a knowledge of what users want. These requirements were the starting point in designing the new system.

1. THE INFORMATION BASE

A good information base is obviously the indispensable foundation for a good service. If SSDC is to serve the whole Unesco Sector of Social Sciences, Human Sciences and Culture (including the Division of Philosophy), this base obviously cannot be restricted to information on the social sciences only.

All the main files (institutions, specialists, mission reports and conference working papers) must be made comprehensive and easy to handle; the number of records each contains, and the number of data elements forming these records, should be increased; and all the information should be precise, detailed and up-to-date.

(a) Access

At present, the institution file is the best developed, so that information concerning institutions is relatively easy to handle and retrieve. Other types of information are accessible only indirectly and with considerable difficulty. The aim of the new system should be to make all types of information equally accessible, including information about projects, research, specialists in particular subjects, forthcoming conferences (of temporary importance only and so of lower priority).

Hitherto, SSDC has made contact only with institutions. Thus information on specialists, if any, arrives from institutions or is selected from documents, but nothing is directly received from the specialists themselves. This and other points could be covered by sending questionnaires to institutions and individuals.

Access to all files should be direct, and not through the intermediary of the institution file, and all major files should accordingly be computerized.

Interrelated information in the different files should be linked (e.g., data on a specialist working in a particular institution on certain projects) so that, whenever necessary, data elements stored as parts of different records can be simultaneously found. In the present SSDC scheme this is done by pointers which provide references to other records in a different file.
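The linking of files by pointers can be illustrated with a minimal in-memory sketch. It assumes nothing about the actual DARE record layout or the ICL software: the `Entry` class, the keys such as "I1" and the helper functions are hypothetical, and only the idea of a record holding references to records in other files is taken from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Entry:
    key: str
    data: Dict[str, str]
    # Pointers: file name -> keys of related records in that file.
    pointers: Dict[str, List[str]] = field(default_factory=dict)


# The four files named in the report, here as plain dictionaries.
files: Dict[str, Dict[str, Entry]] = {
    "institutions": {}, "projects": {}, "specialists": {}, "documents": {},
}


def add(file_name, key, **data):
    files[file_name][key] = Entry(key, data)


def link(file_a, key_a, file_b, key_b):
    """Create pointers in both directions between two records."""
    files[file_a][key_a].pointers.setdefault(file_b, []).append(key_b)
    files[file_b][key_b].pointers.setdefault(file_a, []).append(key_a)


def follow(file_name, key, target_file):
    """Return the records in target_file that a record points to."""
    entry = files[file_name][key]
    return [files[target_file][k] for k in entry.pointers.get(target_file, [])]


add("institutions", "I1", name="Institute A")
add("specialists", "S1", name="Specialist X")
add("projects", "P1", title="Project on peace research")
link("specialists", "S1", "institutions", "I1")
link("specialists", "S1", "projects", "P1")

# From an institution, find its specialists, then their projects.
projects = [p.data["title"]
            for s in follow("institutions", "I1", "specialists")
            for p in follow("specialists", s.key, "projects")]
```

Because the pointers run in both directions, a search can start from any of the files and proceed to another without passing through the institution file.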

(b) Comprehensiveness

If the number of records on a file is too small, the information that can be provided has only a limited value (e.g., a search for an institution capable of providing technical assistance for a specific project in the form of research facilities and personnel with specific types of expertise may fail to find the best simply because it does not figure in the institution file). On the other hand, selection is imperative - files cannot include everybody and everything, and some kind of optimum must be fixed. The various files can be considered in turn from this point of view.

(i) Institutions. The present number of up-to-date records on institutions having social science activities is about 1,700 and is reasonably representative, but the maximum may exceed 3,000: a very large addition (perhaps 1,000) in the first year of computerization, falling to perhaps 500 in the second year, slowing down subsequently to only a few hundred in the following years.

(ii) Specialists. The proportion at present covered is quite small. Current updating may raise the existing figure of 1,000 to 2,000. After a few years of computerized operation, the records should cover 15,000 - 20,000 specialists: an addition of 5,000 - 7,000 through replies to circulars in the preparatory change-over period, and 1,000 - 2,000 yearly thereafter.

The total should level out at about 20,000, at which point deletions and additions will cancel out.


(iii) Projects. Information on projects is latent, i.e. it exists only in the backing documentation. An input of 2,000 - 3,000 records should be ready at the end of the change-over period. This will still represent only a small proportion of projects in the social sciences. Taking into consideration the fact that projects lose interest after completion, the optimal total number of records is estimated at 15,000 - 20,000.

(iv) Documents. Only documents of current interest reporting on findings resulting from original field research will be included in this file; this means an initial input of 5,000 - 8,000 records. As SSDC is not a library, documents are regarded as products of research which give useful information about institutions, individuals or projects, their activities and results. Document records no longer of current interest will be transferred to the CDS (Computerized Documentation Service) bibliographic file, and thus continue to be available to specialists seeking information on specific subjects. The final size of the file should not exceed a maximum of 40,000 records. This total would be reached by yearly additions of 3,000 - 4,000 records. In view of the limited staff available, it will take several years to reach saturation point, i.e., where deletions and transfers approximately equal additions.

The additional information can be found only by more direct contacts with institutions and individuals, so increasing the number of answers to questionnaires and documents scanned.
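The growth figures quoted for the institution and document files can be checked with a little arithmetic. The sketch below uses only numbers stated in the text, but it makes one simplifying assumption that the report does not: deletions and transfers are ignored until the ceiling is reached, so the result is a lower bound on the time needed.

```python
import math

# Institutions: about 1,700 records now, perhaps 1,000 added in the first
# year of computerization and perhaps 500 in the second.
institutions_after_two_years = 1_700 + 1_000 + 500


def years_to_ceiling(initial, yearly, ceiling):
    """Whole years of additions needed to reach the ceiling, ignoring deletions."""
    return math.ceil((ceiling - initial) / yearly)


# Documents: initial input 5,000 - 8,000 records, yearly additions
# 3,000 - 4,000, maximum file size 40,000.
fastest = years_to_ceiling(8_000, 4_000, 40_000)
slowest = years_to_ceiling(5_000, 3_000, 40_000)

print(institutions_after_two_years, fastest, slowest)
```

On these assumptions the institution file passes 3,000 records within two years, and the document file needs roughly 8 to 12 years of additions to reach 40,000, which is in line with the statement that saturation will take several years.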

(c) Descriptors

The descriptors used will be a more detailed version of the relevant parts of the Unesco List of Descriptors, all to be fully compatible with the Aligned List of Descriptors elaborated for use by the United Nations and other international organizations. Generic terms appearing in the hierarchical part of the list of descriptors will designate the main subjects covered, other details being covered in a concise annotation. The number of descriptors used per entry should not exceed 10 - 15.

Indexing should be done by specialists, preferably Unesco staff. If done outside, it should be checked for consistency.

2. NEW OR IMPROVED SERVICES

(a) Specialized directories

Computerization should make it less necessary for a small staff to have to deal personally with individual queries, and facilitate the preparation of specialized directories from which users can easily obtain the information they need.

For each such directory, the Secretariat would define the subject headings and descriptors. Directories really represent a special form of information retrieval.

(b) Subject indexes

Subject indexes should make it possible to find the answer to most questions without the intervention of SSDC staff and without a computer run, and thus allow both computer and staff to concentrate on answering highly specific and sophisticated questions.

It is hoped to publish the first subject index, covering all the information available in the four files (institutions, specialists, projects, documents) at the end of the first year of computerized operation; and to issue supplements (same format and subject headings) every six months. Full indexes will subsequently be prepared (e.g., the full content of the files printed out, classified by subject) every two years.

In these indexes, the entries will appear under the main subject headings, i.e., under the descriptors forming the hierarchical part of the Unesco List of Descriptors. Each entry will appear as many times as it contains main subject headings, and each entry will contain all information available.

(c) Current awareness service

This service is intended only for programme specialists, to keep them informed without having to go through a huge amount of literature that is not relevant to their particular interests (which they must of course indicate beforehand). They would be supplied directly each week with the relevant computer print-outs containing any new information available.

The number of programme specialists involved would probably be 30 - 50.

(d) Information Note

This should also appear weekly, giving any new information available on institutions, specialists, projects, documents, new periodicals, and approaching conferences.

3. TAILORED INFORMATION

It should be possible to supply answers to multiple questions. Manual systems can answer simple queries only, e.g., "what institutions are doing research on foreign policy?" - and even those with considerable difficulty. A computerized system should be able to deal with much more complicated queries, e.g., "institutes doing research and providing documentation on foreign policy, publishing a journal, established before 1960, having a staff of over 100, financed from university funds only".
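The compound query quoted above can be read as a conjunction of simple conditions tested against each institution record. The following is a minimal sketch of that reading only. The field names (`subjects`, `services`, `publishes_journal`, `founded`, `staff`, `funding`) and the sample records are hypothetical and are not taken from the DARE record layout.

```python
from dataclasses import dataclass

@dataclass
class Institution:
    # All field names are illustrative assumptions, not DARE data elements.
    name: str
    subjects: set
    services: set
    publishes_journal: bool
    founded: int
    staff: int
    funding: set

def matches(inst: Institution) -> bool:
    """Every condition of the compound query must hold at once."""
    return (
        "foreign policy" in inst.subjects
        and {"research", "documentation"} <= inst.services
        and inst.publishes_journal
        and inst.founded < 1960
        and inst.staff > 100
        and inst.funding == {"university"}
    )

institutions = [
    Institution("Institute A", {"foreign policy"}, {"research", "documentation"},
                True, 1948, 150, {"university"}),
    Institution("Institute B", {"foreign policy"}, {"research"},
                True, 1955, 300, {"university"}),
    Institution("Institute C", {"foreign policy"}, {"research", "documentation"},
                True, 1965, 120, {"university", "government"}),
]

selected = [inst.name for inst in institutions if matches(inst)]
print(selected)
```

Only the first sample record satisfies all six conditions; the other two each fail at least one, which is the kind of discrimination a manual card file handles poorly.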

Speed, of course, is another obvious advantage of a computerized system. However, the present configuration of the Unesco computer does not allow direct access by the users. The computer workload and organization will decide how promptly a question or batch of questions will be accepted for processing. It was suggested that SSDC should have access twice a day (morning questions before noon, afternoon questions during the night). The ideal would be to have at least one console giving direct access and the possibility of dialoguing with the computer.

4. EFFICIENCY

The new computerized system must work with other systems whenever they add to its efficiency, and supply them in turn if required. This applies in particular to two other important computerization projects in Unesco, interconnected with SSDC to some extent: the Computerized Documentation Service (CDS), and the Recruitment Roster in the Bureau of Personnel.

Only files containing a huge amount of records and subject to frequent changes should be computerized, as the processing is rapid but also very expensive. This was the reason for deciding not to computerize the bibliographic, conference and periodical files. Part of the information about new documents, mission reports and conference working papers will be processed by CDS. Again, the International Committee for Social Science Documentation (ICSSD) is computerizing social science literature. The exact scope of the project is not known, but it will concentrate on processing information found in articles in periodicals.

Information on approaching conferences will continue to be handled manually.

The World List of Social Science Periodicals is now in its third edition, with the fourth now being prepared by the ICSSD for Unesco. Revised and enlarged editions are published regularly with Unesco assistance. Eventually this work may be considered as the social science component in the proposed International Serial Data System.

Finally, the same information often has to be retyped several times in the manual system. This can be eliminated if computerized outputs are made directly usable in the preparation of publications. This involves using a paper tape input instead of punched cards, for which SSDC would need to have a Flexowriter or access to one. If, to meet the needs of both CDS and SSDC, the configuration is enlarged to include paper tape input and output devices, publications can be prepared directly from the computer's paper tape output. Print-outs would have to be retyped if the quality of the publication so demands. Otherwise, they can be used directly in replying to requests, and so on.

The various points referred to in this chapter are graphically represented in fig. 4.


[Figure 4: diagram summarizing the points of this chapter under three headings: information base (quantitative and qualitative improvements), services (quantitative and qualitative), and efficiency (input phase and output phase).]


Chapter III

DARE: DATA RETRIEVAL SYSTEM FOR THE SOCIAL AND HUMAN SCIENCES

The new Data Retrieval System (DARE) is to be a flexible man-machine arrangement. It will be fairly complex, with three major subsystems: updating, retrieval, and indexing.

I. SUBSYSTEMS

Updating subsystem. The information base is the foundation for all the other services (retrieval, preparation of indexes, current awareness) and must be constantly updated. This also involves keeping authority and decode files, to code information before storage and decode in preparing output listings.

The subsystem can provide certain products ready for more or less immediate use, namely:

(i) computer print-outs that serve as authority lists in filling out input worksheets;

(ii) labels that can be used, e.g. in mailing questionnaires to specialists;

(iii) data for the Information Note;

(iv) data for Current Awareness Notices.

The retrieval subsystem (i) provides - and limits itself to - the precise information needed by users; (ii) permits the preparation of specialized directories.

The indexing subsystem can provide alphabetical indexes for use in filling out input worksheets for the updating subsystem; and general subject indexes that cover a subject as a whole.

II. DESIGN PROBLEMS

A. Updating subsystem

The updating system gives rise to most design problems, of which the following are the most important: (i) how to interconnect records; (ii) record formats; (iii) organization of files; (iv) coding and decoding.

(i) Interconnecting files. DARE is based on four autonomous files which must be interconnected, i.e., information already retrieved must automatically direct the further search, which thus becomes a sequence of retrieval steps passing from one type of information to another. This would be necessary in answering a question of the following type: which institutes are working on peace research, on what research projects, and by whom (specialists)?

Accordingly, new records have to contain the required linking information (pointer) with records already in the file, e.g., a new record on a new project names the institute and specialists concerned. Pointers should link the new information on the card with the institution and specialists files; and link the additional information they contain back in turn to the card.

Naturally, if a record is deleted, the relevant pointers should also be deleted (e.g., a specialist leaves an institution, its name is deleted from his record, as is likewise the pointer in the institution record linking it to his).

As DARE is to be highly selective, the decision to insert a new record should remain the responsibility of the staff. Thus, only the creation of pointers should be automatic but not the creation of the records themselves.

A new record should be included only if it is likely to serve a purpose. This normally means that a certain minimum amount of information should be available as follows:

Institutions: name, country code, address, main subjects covered.

Specialists: name, date of birth, official address, main subjects covered.

Projects: title of project, country code, sponsoring institution, address, main subjects covered.

Documents: name of author (personal or corporate), title, imprint or source (in the case of articles), main subjects covered, analytical abstract with descriptors.
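The division of labour described above (staff decide whether a record is added, the system creates and removes the pointers) can be sketched as follows. This is an illustration under stated assumptions, not the DARE implementation: the in-memory dictionaries stand in for the master files, and the record codes, field names and sample data are all hypothetical.

```python
# Hypothetical in-memory stand-ins for three of the four master files.
institutions = {"I1": {"name": "Peace Research Institute", "projects": set()}}
specialists = {"S1": {"name": "A. Example", "projects": set()}}
projects = {}

# Minimum information for a project record, as listed in the text.
REQUIRED_PROJECT_FIELDS = ("title", "country", "institution", "address", "subjects")

def add_project(code, record, staff):
    """Insert a record chosen by the staff; back-pointers are created automatically."""
    missing = [f for f in REQUIRED_PROJECT_FIELDS if not record.get(f)]
    if missing:
        raise ValueError(f"below minimum information: {missing}")
    projects[code] = {**record, "specialists": set(staff)}
    institutions[record["institution"]]["projects"].add(code)
    for s in staff:
        specialists[s]["projects"].add(code)

def delete_project(code):
    """Deleting a record also deletes every pointer that refers to it."""
    record = projects.pop(code)
    institutions[record["institution"]]["projects"].discard(code)
    for s in record["specialists"]:
        specialists[s]["projects"].discard(code)

add_project(
    "P1",
    {"title": "Arms control survey", "country": "NO", "institution": "I1",
     "address": "Oslo", "subjects": ["peace research"]},
    staff=["S1"],
)
linked = sorted(institutions["I1"]["projects"])        # pointers after insertion
delete_project("P1")
after_delete = sorted(specialists["S1"]["projects"])   # pointers after deletion
```

With pointers held on both sides, a question such as "which institutes, which projects, and by whom" can be answered by following links rather than by rescanning each file.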

(ii) Record formats. As DARE must be compatible with other Unesco computerization projects (especially with the Computerized Documentation Service (CDS) and the Bureau of Personnel Roster Project (PER)), it is best to have a common record format. Information on articles and reports on original research is to be included in the CDS unique document file, while DARE has a common interest with PER in certain data on specialists.

Unfortunately, the structure and format of the CDS and PER cards are entirely different: CDS uses the MARC II format (variable length fields), while the PER records contain fixed length fields only. Data identification is different, and so is country coding and the designation of subjects covered. Moreover, the FIND programming package to be used by DARE adds certain other complications.

As a compromise, it was decided that DARE would also adopt the MARC II format and co-operate mainly with CDS, using the CDS coding system, since DARE data on specialists is normally too scanty for personnel purposes, while DARE is free to use data from the Personnel Roster.

Knowledge implicit to a person looking at a record containing several data elements (e.g., name, address, and so on) has to be made explicit to a computer, which sees them as an uninterrupted string of characters. The machine must be enabled to discriminate between data elements and determine precisely where each begins and ends. The data can then be stored on, e.g., magnetic disc, and the computer can be programmed to manipulate the data elements in a variety of ways.

The MARC II method is as follows. The information content of the fields is identified by labels (called tags), and the end of fields is indicated by special characters (called field terminators). Further information about the field can be supplied by additional one-character codes (called indicators), the use of two indicators being allowed for each variable field. A special symbol is used in machine manipulation to separate data elements within a multiple field. Subfield codes (in lower-case alphabetic characters) are used in conjunction with another special character (called a delimiter) to identify data elements within multiple fields. The MARC II record consists of four groups of fields. The first two groups contain auxiliary information about the record itself (for use by the computer in the course of processing); the third and fourth groups contain the information required by the user.

The first group (called a header or a leader) is fixed in length for all records (24 characters) and contains auxiliary information (e.g., the logical record length, record status, base address of data).

The second group (called the record directory) contains auxiliary information about the variable length fields, telling the computer the name, length and location of each such field included in the record. It is made up of a series of fixed length entries containing the 3-character identification tag, the number of characters in the field identified by the tag, and the starting character position relative to the base of each of the variable length fields.

The third and fourth groups (called control fields and variable fields) are made up of fixed and variable length fields containing the information required by the user.

Using the FIND 2 Multiple Enquiry Package in conjunction with the MARC II format cards, three important points must be observed.

(a) Variable length fields may be of two types: terminator (a specified character is used to terminate the field) or subrecord (each variable length field is preceded by a word indicating the length of that field).

(b) All variable length fields must be present in every record. If a terminator type field is to be omitted within a record, the terminator must still be retained; and if a subrecord type field contains no information, the blank must be indicated by a count word containing count of one.

(c) In the output, there is no way of omitting some characters from the content of a field.

It was accordingly decided that:

(a) the number of fields will be fixed, and the same for all master records. Field terminators will be included in cases when the field contains no information. The length of the record directory will thus also be fixed and contain 348 characters;

(b) as indicators and subfield codes in the output listings would confuse the user, the indicator positions at the beginning of the fields will be left blank.
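The leader, directory and field layout described above, together with the decision to keep a fixed set of fields in every record, can be illustrated with a simplified sketch. Several details are assumptions rather than facts from the text: the 12-character directory entry (3-character tag, 4-digit length, 5-digit starting position) follows the usual MARC exchange convention, the leader positions chosen for record length and base address are likewise borrowed from that convention, and the three tags are hypothetical.

```python
FT = "\x1e"          # field terminator character (MARC convention; assumed here)
LEADER_LEN = 24      # leader length stated in the text
ENTRY_LEN = 12       # 3-char tag + 4-digit length + 5-digit start (assumed)
TAGS = ["100", "245", "260"]   # hypothetical fixed list of fields

def build(fields: dict) -> str:
    """Serialize a record in which every field in TAGS is always present."""
    directory, data, start = "", "", 0
    for tag in TAGS:
        content = fields.get(tag, "") + FT   # an empty field keeps its terminator
        directory += f"{tag}{len(content):04d}{start:05d}"
        data += content
        start += len(content)
    base = LEADER_LEN + len(directory)       # base address of data
    total = base + len(data)                 # logical record length
    leader = f"{total:05d}n{'':6}{base:05d}{'':7}"
    return leader + directory + data

def parse(record: str) -> dict:
    """Use the directory to locate each field instead of scanning characters."""
    base = int(record[12:17])
    fields = {}
    for pos in range(LEADER_LEN, base, ENTRY_LEN):
        entry = record[pos:pos + ENTRY_LEN]
        tag, length, start = entry[:3], int(entry[3:7]), int(entry[7:12])
        fields[tag] = record[base + start:base + start + length].rstrip(FT)
    return fields

record = build({"100": "Example, J.", "260": "Paris, 1972"})
parsed = parse(record)
print(parsed)
```

Because the field list is fixed, the directory has the same length in every record. If the 12-character entry assumed here also applied to DARE, the 348-character directory would correspond to 29 fields (29 x 12 = 348); the text does not state the entry length, so this is only a consistency check.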

(iii) Organization of files. Ideally, format standardization between DARE, PER and CDS files should permit a common storage system. As indicated above, this is ruled out. Updating and searching would mean going over a tape thousands of times, which would waste an inordinate amount of computer time and soon spoil the tape itself. DARE has no alternative to using exchangeable disc stores; CDS decided to use magnetic tapes. The DARE master files are organized at random, while the authority and decode files will be indexed sequential files.

(iv) Coding and decoding. Alphabetical data elements may contain up to 50 - 100 characters and simultaneously appear in several records (e.g., the name of an institute appears not only in the institute record but in all records containing information on projects, specialists or reports connected with it). This would lead to a waste of storage space. Moreover, the search for all the various types of data elements would involve a character-by-character comparison of variable length fields - neither the most efficient nor the fastest method of information retrieval.

It was accordingly decided to code data. This should be done in conjunction with CDS and be automatic. Dictionaries (called authority files) built into the system are used to replace alphabetic input data elements by numbers; numbers are decoded back in preparing output listings.

The five dictionaries cover: descriptors, institutions, projects, specialists and documents respectively.
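The idea of replacing long alphabetic data elements by numbers on input, and decoding them again for output listings, can be shown in a few lines. This is a minimal sketch only: the two dictionaries stand in for an authority file and a decode file, the serial numbering is a simplification, and the sample names are hypothetical.

```python
authority = {}   # alphabetic data element -> numeric code (used on input)
decode = {}      # numeric code -> alphabetic data element (used on output)

def encode(name: str) -> int:
    """Return the code for a name, assigning a new serial code on first use."""
    if name not in authority:
        code = len(authority) + 1
        authority[name] = code
        decode[code] = name
    return authority[name]

institute = "International Committee for Social Science Documentation"

# The long name is stored once; every related record carries only the code.
project = {"title": encode("Survey of documentation centres"),
           "institution": encode(institute)}
specialist = {"name": encode("A. Example"),
              "institution": encode(institute)}

same_code = project["institution"] == specialist["institution"]
listing = decode[project["institution"]]   # decoded back for an output listing
```

Matching two records on the institution now means comparing two numbers rather than two strings of 50 - 100 characters.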

The descriptors file contains codes for the following data elements: subject fields, geographical areas, types of publication, facilities available, methods of data processing used, type of organization, methods of financing, nationality, language.

The DARE codes assigned to descriptors are those accepted by CDS; the codes for nationality, geographical areas, country and language should be agreed upon for all three systems; DARE has a free hand in assigning codes to the remaining data elements, as there is no overlapping with the other systems.

Once established, the descriptors file, based on the revised Unesco list of descriptors, will change little.

A distinction must be made between pointers and references.

As opposed to the descriptors file, the other four will change constantly. It is accordingly desirable that codes for them should be generated automatically. This is ensured right at the beginning in starting the file. Each record, in alphabetical order, is assigned a serial code number. New records are numbered and inserted in relation to the codes found on either side of the new name or title.

However, this is feasible only when the number of records the file will finally contain is approximately known from the beginning. In the case of institutions, for example, about 60% of the final figure may be known at the time of starting the file. Hence, account can be taken of the likely frequency of character distribution. But the proportion will be too small in the case of specialists, projects and documents; hence, the computer codes for them must be generated as serial numbers, irrespective of the alphabetical order, simply in the sequence in which new records are added to the file.

A problem of ambiguity now arises. Institutions or specialists could have the same name, projects or documents could be designated by the same title. However, it is unlikely that two institutions or projects in the same country have the same title; that two specialists have both the same name and the same birthday; or that two documents have the same title, the same date, and authors with the same name. The solution is obviously to add another element in each case to the code, i.e.,
- for institutions: name plus country code;
- for projects: title, plus country code;
- for specialists: full name, plus month and year of birth;
- for documents: use the Luhn Code, the reference being composed
  . from the 5 first letters of the author's last name,
  . from the initials of his first and middle names,
  . from the initials of the first 5 words of the title, and
  . from the last two digits of the year of publication.
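The four components of the document reference listed above can be assembled mechanically. The sketch below follows that recipe literally; the handling of upper case, short surnames, punctuation and short titles is an assumption, since the text does not specify it, and the author name is hypothetical.

```python
def document_reference(last_name, given_names, title, year):
    """Compose a document reference from the four parts named in the text.

    Assumptions: letters are upper-cased, non-letters in the surname are
    dropped, short surnames and short titles are not padded, and every
    title word counts (no stop-word list).
    """
    surname = "".join(c for c in last_name.upper() if c.isalpha())[:5]
    name_initials = "".join(n[0].upper() for n in given_names if n)
    title_initials = "".join(w[0].upper() for w in title.split()[:5])
    return f"{surname}{name_initials}{title_initials}{year % 100:02d}"

ref = document_reference(
    "Example", ["Jane", "Q."],
    "Data archives for the social sciences", 1972,
)
print(ref)  # EXAMPJQDAFTS72
```

Two documents receive the same reference only if author, leading title words and year all coincide, which is the situation the text regards as unlikely.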

However, the names of institutions and projects are often long and similar, the difference appearing mainly in their last parts. From the point of view of storage, it would be uneconomical to create a separate authority record for each. A better solution is to make out an authority record in two parts: the first part contains the first 64 characters of the name or title; the second part consists of one or more continuation fields containing the end of the names or titles that begin with the same 64 characters, supplemented with the appropriate country code. To allow the computer to differentiate, the first part of the record is supplemented with a "continuation count" indicating the number of continuation fields to be checked if the first 64 characters of the name to be coded and those of the authority record are found to be matching. This use of continuation fields should result in a considerable saving in storage space.

The ICL FIND 2 Multiple Enquiry Package also imposes certain limitations on decoding:
- any number of fields in a record can be decoded, but the number of decode files which may be used during the preparation of each output record is limited to one;
- the maximum length of a code (which provides the key in decoding) is 64 characters;
- the maximum length of a decoded entry (i.e., the alphabetical information it contains) is 127 characters;
- control fields cannot be decoded;
- records on the decode file must be of fixed length;
- the decode file must be held on the exchangeable disc store;
- the size of the decode file is limited by the disc storage capacity available; and
- the decode file must be loaded sequentially with standard indexes.

In view of these limitations it was decided to have a single decode file only. The same code number cannot be assigned to different items on the four authority files, even though these are themselves different. Accordingly, the code numbers for descriptors start from 00 000 000, for institutions from 10 000 000, for specialists from 40 000 000, and so on. As the decode records can have fixed length records only, the decoded entry field adopted is the maximum accepted by FIND, i.e., 8 characters in length for the field containing the code, 127 characters in length for the one containing the alphabetical information.

The numeric codes created for coding references to documents are not decoded since the product would merely be another code (the Luhn code) whose only purpose is to help connect interrelated records.

Names and titles cause further problems in designing coding and decoding procedures.

Although the name of an institution may appear in different languages and have different abbreviations and acronyms, a single code number only should be assigned to it. Printed listings do not need to give all the possible name variations. One is selected as the "official name" for all listings, all other "synonymous names" being accepted at the input side only, and never appearing in output listings. No rigid rules govern the selection of the "official name". The normal practice is to use the full name in English in the case of national institutions or, if it is not known, the name in the original language or an abbreviation of it; in the case of international organizations, such accepted abbreviations as Unesco and FAO are used; otherwise, the full name in English.

Similar problems are similarly dealt with in the case of names of projects, specialists, subjects, geographical areas, and so on.

Whenever a new official name is added to one of the authority files, a corresponding decode record should be automatically created for inclusion in the decode files; whenever an official name is deleted, its code is likewise deleted from the decode file.
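The rules on official names, synonymous names and the single decode file can be combined in one small sketch. The starting code numbers for descriptors, institutions and specialists and the 8 + 127 character decode record are taken from the text; everything else is an assumption made for illustration, including the function names, the serial numbering within each range, and the in-memory dictionaries standing in for the files.

```python
# Starting code numbers quoted in the text; other files are omitted here.
RANGE_START = {"descriptor": 0, "institution": 10_000_000, "specialist": 40_000_000}
CODE_LEN, ENTRY_LEN = 8, 127      # decode record layout quoted in the text

official = {}       # (file, official name) -> code
synonyms = {}       # (file, synonymous name) -> code, accepted on input only
decode_file = {}    # code -> fixed-length decode record (8 + 127 characters)
next_serial = dict(RANGE_START)

def add_official(file, name):
    """Adding an official name automatically creates its decode record."""
    code = f"{next_serial[file]:0{CODE_LEN}d}"
    next_serial[file] += 1
    official[(file, name)] = code
    decode_file[code] = code + name[:ENTRY_LEN].ljust(ENTRY_LEN)
    return code

def add_synonym(file, synonym, official_name):
    """A synonymous name shares the code of its official name."""
    synonyms[(file, synonym)] = official[(file, official_name)]

def delete_official(file, name):
    """Deleting an official name also deletes its decode record."""
    del decode_file[official.pop((file, name))]

def encode(file, name):
    return official.get((file, name)) or synonyms[(file, name)]

def decode(code):
    return decode_file[code][CODE_LEN:].rstrip()

add_official("institution", "Unesco")
add_synonym("institution",
            "United Nations Educational, Scientific and Cultural Organization",
            "Unesco")
code = encode("institution",
              "United Nations Educational, Scientific and Cultural Organization")
printed = decode(code)            # output listings show the official name only
first_descriptor = add_official("descriptor", "peace research")
```

Because each authority file draws its codes from a separate range, one decode file can serve all of them without two items ever sharing a code.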

A practical means of simultaneously inserting linking information into all interconnected records can now be explained.

For every new record, new field, or new data element, containing a reference that after coding becomes a pointer, that is added to one of the files, an "inverse" record is automatically created, indicating the computer code of the record containing the new information. This inverse record is processed in the same way as the other records used to update one of the master files.
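The inverse-record mechanism can be sketched as follows. The file code "I" for institutions appears later in the text; the codes "S" and "P", the project code number, and the dictionary layout are hypothetical and serve only to make the example runnable.

```python
# Hypothetical master files: record code -> set of pointers held by the record.
master = {
    "I": {"10000007": set()},   # institutions
    "S": {"40000012": set()},   # specialists
    "P": {},                    # projects
}

def inverse_records(file_code, record_code, pointers):
    """For each pointer in a new record, build one update for the target file."""
    return [
        {"file": target_file, "record": target_code,
         "new_pointer": (file_code, record_code)}
        for target_file, target_code in pointers
    ]

def apply_update(update):
    """An inverse record is processed like any other amendment to a master file."""
    master[update["file"]][update["record"]].add(update["new_pointer"])

# A new project record names one institution and one specialist.
new_code = "20000001"   # hypothetical code number
pointers = [("I", "10000007"), ("S", "40000012")]
master["P"][new_code] = set(pointers)

for update in inverse_records("P", new_code, pointers):
    apply_update(update)
```

After the run, the institution and specialist records each point back to the project, so the link can be followed from either side.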

B. Retrieval subsystem

In designing this subsystem, a distinction had to be made between simple and chained retrieval.

(i) Simple retrieval. A simple list is requested and can be supplied after searching in one only of the master files, e.g., a list of peace research specialists and the names and addresses of the institutions with which they are connected.

(ii) Chained retrieval. The request involves searching in two or more of the master files, e.g. particulars of specialists working in human rights projects and details of the projects themselves.

As they are easy for non-programmers to work with, it was decided to use FIND 2 Multiple Enquiry programmes as far as possible. All operations involved in simple retrieval are performed by means of a FIND run only. In chained retrieval, FIND is used to search the first file (main enquiry). For the second, third and, if appropriate, fourth master files, following the links provided by pointers, a special-purpose chained retrieval programme is used. Thus, in the example quoted above, the search for projects relating to human rights would be made by FIND, the search for the specialists referred to in the records selected would be made by a chained retrieval programme.
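The two stages of chained retrieval can be sketched as a main enquiry on one file followed by a step that follows pointers into a second file. The functions below are stand-ins written for this illustration; they do not reproduce FIND 2 or the DARE chained retrieval programme, and the records and field names are hypothetical.

```python
projects = {
    "P1": {"title": "Human rights in education",
           "subjects": {"human rights"}, "specialists": ["S1", "S2"]},
    "P2": {"title": "Urban migration",
           "subjects": {"migration"}, "specialists": ["S3"]},
}
specialists = {
    "S1": {"name": "A. Example"},
    "S2": {"name": "B. Sample"},
    "S3": {"name": "C. Other"},
}

def main_enquiry(subject):
    """Stage 1: search the first master file by subject."""
    return {code: rec for code, rec in projects.items()
            if subject in rec["subjects"]}

def chained_step(selected):
    """Stage 2: follow the pointers of the selected records into the second file."""
    codes = sorted({s for rec in selected.values() for s in rec["specialists"]})
    return {code: specialists[code] for code in codes}

hits = main_enquiry("human rights")
people = chained_step(hits)
names = [person["name"] for person in people.values()]
print(list(hits), names)
```

The second stage never searches the specialists file by content; it reads only the records that the pointers name.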

One member of the staff should know FIND well enough to be able to formulate the questions in the most appropriate way. As DARE should be flexible, and be able to supply the information requested by a programme specialist in the form most convenient for his use, output formats cannot be fixed in advance. However, after some practical experience in operating DARE, it should be possible to devise some typical formulae for questions and to some extent standardize information retrieval requests.

C. Indexing subsystem

One of the problems in designing the indexing subsystem was that institutions, specialists, projects or documents might involve several subject fields, and so could be assigned several subject codes. In preparing an index by main subjects covered, such institutes, specialists, projects or documents should appear under each of the subject headings. This is done, before sorting, by an operation called demultiplication, which is described in some detail later.
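A plausible reading of demultiplication is that each record is copied once per subject code before the sort, so that it appears under every heading. The sketch below illustrates that reading only; the actual DARE operation is described later in the report and may differ in detail. The records are hypothetical.

```python
def demultiply(records):
    """Emit one (subject, name) entry per subject heading of each record."""
    for rec in records:
        for subject in rec["subjects"]:
            yield subject, rec["name"]

records = [
    {"name": "Institute A", "subjects": ["peace research", "foreign policy"]},
    {"name": "Institute B", "subjects": ["foreign policy"]},
]

# Demultiplication comes first, then sorting by subject heading and name.
index = sorted(demultiply(records))
for subject, name in index:
    print(f"{subject}: {name}")
```

Institute A has two subject headings and therefore appears twice in the sorted index.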


Chapter IV

MAINTAINING THE DATA BASE

The manual and computerized procedures involved in updating are summarized in figs. 5, 6 and 7, 8 respectively.

A. MANUAL PROCEDURES

Receive documents

The term document is used loosely to describe:
- Replies of institutions and specialists to circulars, or other correspondence from them;
- Reports of institutions;
- Books, periodicals, research reports, conference reports, etc.;
- Other fugitive materials.

Documents are registered, and stamped with the date of receipt.

Scan documents

Scan incoming documents, mark any data on an institution, specialist, project or research product of potential interest.

Also mark any information on approaching conferences and new periodicals, but for manual handling only and possible inclusion in Information Note.

Select documents

Select marked items for further processing.

Process records marked in documents

Prepare worksheets to serve as punch documents, i. e., machine-readable records for computer input. Process serially. Also process marked entries in documents serially.

Make out a worksheet for every amendment to an authority or master file, i.e.,

(a) addition of a new institution, specialist, project or document;

(b) addition of new details to any such institution, project, specialist or document;

(c) deletion of an institution, specialist, product or document;

(d) deletion of details regarding an institution, specialist, project or document;

(e) alterations regarding an institution, specialist, project or document;

(f) addition of a new official name to one of the authority files and to the decode file;

(g) addition of a new synonymous name to one of the authority files;

(h) deletion of an official name from one of the authority files and the decode file;

(i) deletion of a synonymous name from one of the authority files.

Different worksheets and input records cover the different cases:

DARE Worksheets
- institution (form AI)
- project (form AP)
- specialist (form AS)
- document (form AD)
- authority list (form B)
- change (form C)

Type of input record code
- Delete official name: 0
- Delete synonymous name: 1
- New official name: 2
- New synonymous name: 3
- Delete record: 4
- New record: 5
- Delete field: 6
- New field: 7
- Change field: 8

[Figure 5: flowchart of the manual procedures. Legible box labels: Receive Documents; Scan Documents; Select Documents; Process Records Marked in Documents; "for Keypunching"; Decide on Storage.]

[Figure 6: Details of Block (4). Flowchart for processing a marked data element. Legible box labels: check marked data element; check content of data element; check name in authority list; search record in Master List; define amount of information; prepare annotation; fill out worksheet; to next data element; to next record.]

[Figure 7: flowchart of the computerized updating procedures. Legible labels: read and validate; 2. Sort; 3. Update authority; list of amendments; invalid items; sorted amendments; labels; new records; decode file; authority files of specialists, institutes, projects, documents and descriptors.]

[Figure 8: flowchart of the computerized updating procedures, continued. Legible labels: 4. Print authority; 6. Sort new records; 7. Prepare Notes and Notices; authority files; decode file.]

Decide the type of file involved, fill in the appropriate worksheet in natural language (i.e., in alphanumerical form). Delete preprinted answers which are non-pertinent. Code only the country and main subject fields covered (see below).

1. Information on an institution (see fig. 6)

Check name or acronym against the printed authority list containing in alphabetical order all the names and acronyms registered by the end of the last updating run.

(a) If the name or acronym is found, some details must be already stored in the master file. Consult the master list of institutions reprinted every 3 months.

If the institute is found, examine the information registered. If not found, consult the Information Note supplementing the master list. Process the new data elements serially.

(aa) If previously registered and contents unchanged, proceed to the next data element.

(ab) If previously registered but content changed, make out type C worksheet (change) and enter:
- the computer code of the institute (from the authority list)
- the type of file code (i.e., I)
- the type of record code (i.e., 8)
- the tag of the field affected and both the old and new content of the field, separated by an equality sign (=).

Proceed to next data element in the institute record.

(ac) If not previously registered, make out type AI worksheet and enter:
- the computer code of the institute (from the authority list)
- the type of file code (i.e., I)
- the type of record code (i.e., 7)
- the tag of the new field and the information to be stored in this field.

Several new fields may constitute input and be included in the same type AI worksheet.

Proceed to the next data element marked for the same institution.
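The worksheet entries for cases (ab) and (ac) can be pictured as small input records. The sketch below is illustrative only: the record type codes 7 and 8 and the file code "I" come from the text, while the dictionary layout, the function names, the field tags and the sample values are hypothetical.

```python
RECORD_TYPE = {"new field": 7, "change field": 8}   # codes quoted in the text
FILE_INSTITUTIONS = "I"                              # file code quoted in the text

def change_entry(code, tag, old, new):
    """Case (ab): type C worksheet, old and new content joined by '='."""
    return {"code": code, "file": FILE_INSTITUTIONS,
            "type": RECORD_TYPE["change field"],
            "tag": tag, "content": f"{old}={new}"}

def new_field_entries(code, fields):
    """Case (ac): type AI worksheet; several new fields may share one worksheet."""
    return [{"code": code, "file": FILE_INSTITUTIONS,
             "type": RECORD_TYPE["new field"],
             "tag": tag, "content": content}
            for tag, content in fields.items()]

# Hypothetical institute code and field tags.
change = change_entry("10000007", "210", "12 Old Street", "34 New Avenue")
additions = new_field_entries("10000007", {"310": "Quarterly journal",
                                           "320": "Library open to visitors"})
print(change["content"])   # 12 Old Street=34 New Avenue
```

Keeping the old content in the change record lets the updating run verify that it is amending the field the indexer actually saw.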

(ad) If a field or subfield is to be deleted, make out type AI worksheet and enter:
- the type of file code (I)
- the type of record code (6)
- the tag of the field to be deleted. If only a sub-field is to be deleted then the tag should be followed by the information to be deleted.

(b) If the name or acronym is not found in the authority list, check whether the name is official or not.

(ba) If the name is official and is not in the authority list, no details about the institute have yet been registered in the master file. Check whether or not the element contains the minimum information. If not, proceed to the next record. If yes,
- assign the country code (from the table of country codes)
- prepare an annotation containing up to 15 descriptors. (In principle the number of descriptors depends on the scope of the institute's activities but it is recommended that the average should not exceed 15)
- note references to specialists, projects and documents.

Make out type AI worksheet and enter:
- the type of file code (I)
- the type of record code (5)
- the tags of the fields (reprinted on the form) and all the information available on the institute.

(bb) If the name is not official but synonymous, find the official name and search for it in the authority list.
- If the official name is found, make out a type B worksheet and enter:
- the type of file code (E)
- the type of record code (3)
- the new synonymous name
- the computer code of the official name, then proceed as in (ba).

Proceed to the next record.

(bc) If only the official name of the institution is changed, make out a type B worksheet and enter:
- the type of file code (E)
- the type of record code (0), and
- delete previous official name.

Prepare another type B worksheet and enter:
- the type of file code (E)
- the type of record code (1)
- the new official name of the institution.

(c) If a full record is to be deleted from the master file (e.g., institution liquidated) make out a type AI worksheet and enter:
- the type of file code (I)
- the type of record code (4)
- and the computer code of the institute (from the authority list).

2. Information on a specialist

Follow the same clerical procedures as in the case of institutions, but note certain differences:

- there will be practically no synonymous names;


- use type AS worksheets for all additions (new record or new field) and deletions (delete record or field).
- the type of file code will be S.
- some references will relate to institutions.
- use type F file code for any direct updating of the specialists authority file.

3. Information on a research project

Again follow same clerical procedures, but note the following:
- always use English language title.
- synonymous names may be used, but should be avoided as far as possible.
- all worksheets are of type AP.
- the type of file code will be P.
- use type G file code for any direct updating of the projects authority file.

4. Information on a document

Follow same clerical procedure, but note the following:
- there will be no updating as contents do not change in time
- there are no synonymous names (Luhn codes) for the same document
- all worksheets are type AR
- the type of file code will be R.

5. Information on a forthcoming conference (Manual processing only)

Check marked information against previous Information Note. If not yet published, check against the alphabetical card index of forthcoming conferences,
- if previously registered, check data elements for changes
- if not yet registered, or if any data elements have changed, prepare slip in 2 copies. File one by date of the conference and use in preparing the Information Note, the other by name of the organizing institution.

6. Information on a new periodical (Manual processing only)

Check marked information against previous Information Note. If not yet published, prepare a slip and file by title in alphabetical order (and use in preparing Information Note).

7. Changes in descriptors

If new descriptors are adopted or existing descriptors are dropped, make out type B worksheet and enter:
- the type of file code (d)
- the type of record code (0, 1, 2, 3, according to the action to be taken)
- the descriptor itself and the computer code assigned to it

To ensure compatibility with CDS, codes for descriptors will be manually assigned, and not generated automatically.

Stamp any worksheets containing no computer code with a serial number to serve for identification during input procedures.

Present worksheet for punching.

Decide on storage

Store only documents directly related to the subject of the main file (mainly institutions and specialists).
- If the document is an annual report, returned questionnaire, letter or other fugitive material describing an institution's activity or current project: store.

- If the document is a book, research report, article or document prepared by an institute: photocopy and store the title page and the abstract.

- If the document is an institution's periodical: store the first and last issues only.

Store documents relating to institutions and projects in Institution Folders, and file by official name of the institution alphabetically within country.

Store documents relating to specialists in Specialist Folders, and file by specialist's name and date of birth.

Send publications for storage by main Unesco Library.

Send worksheets for keypunching

Have amendments to the master files and updating material for the authority files punched on paper tape (some amendments to specialist records will be on magnetic tape resulting from data processing for Personnel). Use CDS input paper tape format and punching procedures.

Store worksheets as follows: on institutions and projects, in Institution Folders; on specialists, in Specialist Folders; on documents, serially in a Document Folder.

With keypunching, the manual part of the updating procedures comes to an end. Subsequent operations are performed by computer.

B. COMPUTERIZED OPERATIONS

The computerized part of the updating subsystem consists of several programmes, each responsible for a major task. (See figs. 7 and 8.)

Read and validate

Check all data elements to ensure:
- that the various numeric fields do indeed contain numeric characters
- that new official names contain all the required data elements (e.g. for descriptors, descriptor itself and its code; for institutions, country code and name; for specialists, date of birth and name; for projects, country code and title)
- that new records contain the specified minimum information
- that all additions or deletions of synonymous names, or field additions, deletions or alterations are properly coded.

Reject all records found to be defective and report on the listing of invalid records, in each case explaining why.
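The checks listed above can be illustrated by a short sketch. It is not the original programme: the record layout (a dictionary with `file_code`, `record_code` and `fields`), the field names and the `NUMERIC_FIELDS` set are assumptions made for the example. Only the record type codes and the key elements required for each file are taken from the text.

```python
# Hypothetical record layout: {"file_code": "S", "record_code": 5, "fields": {...}}.
# Record codes 0-8 follow the table of input record types.
VALID_RECORD_CODES = set(range(9))

# Key data elements required per file type (from the text);
# the field names themselves are invented for this sketch.
REQUIRED_KEYS = {
    "I": ("country_code", "name"),      # institutions
    "S": ("name", "date_of_birth"),     # specialists
    "P": ("country_code", "title"),     # projects
}

# Fields assumed to be numeric in this illustration.
NUMERIC_FIELDS = {"date_of_birth"}


def validate(record):
    """Return the list of reasons for rejection (empty if the record is valid)."""
    errors = []
    code = record.get("record_code")
    if code not in VALID_RECORD_CODES:
        errors.append(f"unknown record code {code!r}")
    fields = record.get("fields", {})
    for name, value in fields.items():
        if name in NUMERIC_FIELDS and not str(value).isdigit():
            errors.append(f"field {name} is not numeric")
    if code == 5:  # a new record must carry the minimum (key) information
        for name in REQUIRED_KEYS.get(record.get("file_code"), ()):
            if not fields.get(name):
                errors.append(f"missing required element {name}")
    return errors


def read_and_validate(records):
    """Split the input into accepted records and (record, reasons) rejects."""
    accepted, rejected = [], []
    for rec in records:
        reasons = validate(rec)
        if reasons:
            rejected.append((rec, reasons))
        else:
            accepted.append(rec)
    return accepted, rejected
```

In this sketch the accepted list plays the part of DARE-WORK-1 and the rejected list that of the listing of invalid records, each reject carrying its explanation.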

Output all data found to be correct and complete via the Read and Validate programme to one temporary file, called DARE-WORK-1, on a magnetic disc.

A difference in structure as between authority and master records calls for a difference also in the processing of input records.

1. If the input record amends an authority file (i.e., is of type 0, 1, 2, or 3) no change in format is required; programme output to magnetic disc has the same format as input on paper tape. Data elements are identified without the use of tags, hence no directory is needed.

2. If the input record adds a new institute, project, specialist or document (i.e., is of type 5), the programme has to update both a master file and the corresponding authority file. The work of punching new official name records (input record type 2) can here be avoided. The programme creates two records on paper tape:

(a) one for the authority file, containing:
- the word count.
- the type of authority file code:
  . for new institutions: E
  . for new specialists: F
  . for new projects: G
  . for new documents: H
- the type of record code (2)
- the key information for the authority record, i.e.:
  . for institutions: country code, name
  . for specialists: name, date of birth
  . for projects: country code, title
  . for documents: the Luhn code

(b) a second record, for the master file, containing:
- a complete header,
- a directory containing the field tags found in the input record, and the corresponding field length and field address,
- a control field containing the record key, i.e.:
  . for institutions: country code, name
  . for specialists: name, date of birth
  . for projects: country code, title
  . for documents: Luhn code
  and the fixed length data elements.
- all variable fields of the input record, together with the field terminator.

3. If the input record deletes from a master file (input record type 4), the record to be deleted can be identified by its computer code only. The programme output need only contain:
- a header with the logical record length and the base address of data,
- an empty directory, and
- a control field containing the computer code of the record to be deleted.

4. If the input record deletes one or more fields (input record type 6), both the record and the fields must be contained in the programme output, i.e.
- a header with the logical record length and base address of data,
- a directory, containing:
  . the tags of the fields to be deleted, and the length and start address of these fields in the input record, if both the tag of the field(s) to be amended and the information to be deleted are found in the input record, or
  . only the tags of the fields to be deleted (without field length and start address) if the input record does not contain the information to be deleted, and
- variable fields, whenever both the tags of the field(s) to be amended in the master record and the information (character string) to be deleted are found in the input record.

5. If the input record is type 7 (new field), the programme output record must contain:
- a header with the logical record length and base address of data,
- a directory containing the tags of fields to be amended, and the length and start address of these fields in the input record, and
- variable fields containing the information to be added to the master record.

6. If the input record is type 8 (change field), the programme output record will be in two parts:
- one (as a record of type 6) containing the information to be deleted, and consisting of a header, a directory, and variable fields
- and one (as a record of type 7) containing the new information, consisting also of a header, a directory and variable fields.

Thus, in fact, no type 8 records will appear in this programme output.
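The conversion rules in items 2 to 6 can be sketched as follows. This is an illustration under stated assumptions, not the original record format: the header is reduced to three invented entries, the terminator character `#` is a placeholder (the text does not name it), and the function names are hypothetical. What the sketch does take from the text is the directory of tag, length and start address, the tag-only directory entry for deletions without content, and the splitting of a type 8 record into a type 6 and a type 7 record.

```python
FIELD_TERMINATOR = "#"  # placeholder; the real terminator is not given in the text


def build_output(record_code, key, fields):
    """Build a simplified output record: header, directory and variable data.

    `fields` is a list of (tag, content) pairs. A content of None means that
    only the tag is known (a delete-field record without the old content).
    """
    directory, data, address = [], "", 0
    for tag, content in fields:
        if content is None:
            directory.append({"tag": tag})  # tag only, no length or address
            continue
        chunk = content + FIELD_TERMINATOR
        directory.append({"tag": tag, "length": len(chunk), "address": address})
        data += chunk
        address += len(chunk)
    header = {"record_code": record_code, "key": key, "data_length": len(data)}
    return {"header": header, "directory": directory, "data": data}


def convert_change(key, tag, change):
    """Split a type 8 change, written 'old=new', into a type 6 and a type 7 record."""
    old, new = change.split("=", 1)
    return [
        build_output(6, key, [(tag, old)]),  # information to be deleted
        build_output(7, key, [(tag, new)]),  # new information
    ]
```

For example, `convert_change(1234, "ADDR", "7 rue A=9 rue B")` yields one delete-field record carrying the old address and one new-field record carrying the new one, so that no type 8 record survives in the output.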

The programme may also read in records from the Bureau of Personnel RECRUT-30-07 sorted tape, but only such information as helps to update DARE specialist records.

As indicated previously, communication between the two files is one-way and will affect existing records in the DARE specialist file only. The authority file is checked to identify specialists registered on both files by name and date of birth. If a match occurs, the programme will create type 6 (delete field) records containing the specialist's computer code (from the authority file) and the tags designating the following fields:

File number
Official-Not official type
Type of dossier
Matricule number or organization codes
Marital status
Availability date
Notice
Languages
Level and area of education
Professional specialty
Interview
Placement restriction
Application date

New field (type 7) records will also be created, containing the same computer code and the same tags, together with the information found in the corresponding fields in the Roster record.

Both records will be output to DARE-WORK-1.

After reading of the two tapes, records found to be incorrect will be repunched and resubmitted to the programme. An accumulation of the last run's correct data and the repunched (now correct) data will be then made.

For checking and statistical purposes, the programme will indicate how many records of each type have been processed, and the number of matches between the two files.
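The one-way matching against the Personnel tape described above might be sketched as below. The data structures are assumptions made for the illustration: the authority file is modelled as a dictionary keyed by (name, date of birth), and roster records as dictionaries whose keys double as field tags.

```python
def match_roster(authority, roster_records, tags):
    """Create delete-field and new-field records for specialists found on both files.

    authority: dict mapping (name, date_of_birth) -> computer code (assumed layout).
    roster_records: iterable of dicts with 'name', 'date_of_birth' and field values.
    tags: the field tags to be refreshed from the roster.
    Returns (output records, number of matches).
    """
    out, matches = [], 0
    for rec in roster_records:
        code = authority.get((rec["name"], rec["date_of_birth"]))
        if code is None:
            continue  # one-way link: unknown specialists are not added
        matches += 1
        # Type 6: delete the current content of the listed fields (tags only).
        out.append({"record_code": 6, "key": code,
                    "fields": {t: None for t in tags}})
        # Type 7: add the same fields again with the content from the roster.
        out.append({"record_code": 7, "key": code,
                    "fields": {t: rec.get(t) for t in tags}})
    return out, matches
```

The returned count corresponds to the number of matches between the two files that the programme reports for checking purposes.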

Sort

The Read and Validate programme produces a mixture containing all types of records not in any order. A standard ICL sorting programme renders this into a new, ordered file as output to magnetic disc. Records in this temporary DARE-WORK-2 file have exactly the same format as in DARE-WORK-1. Initially the length of the DARE-WORK-2 file is 0. It is automatically extended during the run. At the end of this run, DARE-WORK-1 is compressed to its initial 0 length again.

Updating authority files and decode file and the creation of updating records for master files

A third programme is used to update the authority files and the decode file, and to update the master files by creating the necessary record for the next programme.

The programme reads the DARE-WORK-2 output file.

The method of processing a record read from DARE-WORK-2 depends on both the type of file and the type of records. Nevertheless, in so far as the types of files are concerned, the basic difference is only between the processing of records aiming to amend one of the master files and of those aiming to amend one of the authority files.

Authority files

1. In updating an authority file for institutions, the type of record may be:
(a) delete official name (0)
(b) delete synonymous name (1)
(c) new official name (2)
(d) new synonymous name (3)

(a) Delete official name. Check input record containing the country code and name of institution against the authority list of institutions. No match means an error, which is reported on the listing of errors. Check the continuation count in the authority record to find the number of names beginning with the same 61 characters.
- If this count has the value of 1, delete the full record.
- If its value is higher than 1, there must be several names differing only in some of their end characters. Check the continuation field of the input record against the first, second, etc. continuation fields in the authority record until a match occurs. Delete the field found from the authority record and modify the continuation count accordingly. No match again means an error, which is reported on the listing of errors.

Make out an inverse record having for key the computer code found in the authority record. Search the decode file for the corresponding record, and delete.

(b) Delete synonymous name. Proceed as in (a), except that, as the decode file contains only the official names, no inverse record is necessary and the decode file remains unchanged.

(c) Add new official name. Check the first 61 characters of the name, and the country code of the institution against the authority file. If no match is found, generate a computer code for the new official name as an average of the codes of the official names found on either side of the new name, and output the new authority record to the corresponding file with a continuation count of 1.

If a match is found, check the continuation field of the input record against the continuation fields of the authority record.
- A full match means either that the names were mistakenly considered new, or that there are two institutions with the same name in the same country. In both cases a human check and decision is necessary; hence, report on the listing of errors as an "unidentified synonym".
- If no match is found, generate a computer code and a composite record for the authority file, i.e. insert a new continuation field and increase the continuation count accordingly.

For every new official name accepted, create an inverse record, having for key the newly generated computer code and containing up to 127 characters of the official name (more characters cannot be recorded because of the FIND package mentioned earlier). Add this record to the decode file.

(d) Add new synonymous name. As the computer code, official name, synonymous name and country code already appear in the input record, no code generation is necessary and the decode file is not affected. Proceed exactly as in (c), but note that:
- instead of generating a computer code, the new name will be output together with the code found in the input record,
- no inverse record will be created and the decode file remains unchanged.

2. To update the authority file of specialists proceed as in updating the institutions file but note that:
- as the authority records of names will not contain continuation fields and continuation counts, the question of matching does not arise
- a new computer code is found simply by adding 1 to the last computer code.

3. To update the authority file for projects proceed as in updating the institutions file, except that, again, a new computer code is found simply by adding 1 to the last.

4. To update the authority file for documents proceed again as in 1 but note that:
- no inverse record is needed, as Luhn codes do not appear in the output listings and do not form part of the decode file;
- the question of matching continuation fields does not arise
- the computer code is generated as a serial number (as in 2 and 3 above).

5. To update the authority file of descriptors proceed again as in 1, but note that:
- there will be no continuation fields,
- as the code is not generated but manually assigned, it will already be contained in the input record.

The creation of updating records for the four master files involves processes, read from DARE-WORK-2, that are practically the same for all four files and include:
- automatic coding of alphabetic information
- creation of pointers
- creation of inverse records
- transforming records into the master record format
- creation of four updating files for updating by the next programme.

Processes preparatory to the deletion or addition of a record from or to a master file, and the deletion or addition of a field.
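To return briefly to the code-assignment rules given for the authority files in items 1 to 4: the two rules (an average of the neighbouring codes for institutions, a serial increment for specialists, projects and documents) can be sketched as follows. The sketch assumes that codes rise in the same order as the names, which is what averaging the neighbours preserves. The treatment of a name that sorts before or after every existing name (`GAP`) and the error raised when no code is free are assumptions, since the text does not describe these cases.

```python
import bisect

KEY_LENGTH = 61   # names are compared on their first 61 characters
GAP = 1000        # assumed spacing when the new name sorts after all others


def split_name(name):
    """Return the 61-character key part and the continuation part of a name."""
    return name[:KEY_LENGTH], name[KEY_LENGTH:]


def code_between(entries, new_key):
    """Average the codes of the names on either side of `new_key`.

    entries: list of (key, code) pairs sorted by key, with rising codes.
    """
    keys = [key for key, _ in entries]
    i = bisect.bisect_left(keys, new_key)
    lower = entries[i - 1][1] if i > 0 else 0
    upper = entries[i][1] if i < len(entries) else lower + 2 * GAP
    code = (lower + upper) // 2
    if code in (lower, upper):
        raise ValueError("no free code between the neighbouring names")
    return code


def next_serial(last_code):
    """Serial rule used for specialists, projects and documents."""
    return last_code + 1
```

With entries `[("AAA", 1000), ("CCC", 2000)]`, a new name `"BBB"` receives the code 1500, so that a listing sorted by code remains in alphabetical order without renumbering.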

Master files

1. Deletion of a record. No coding, formatting or other operation is necessary. The record read from DARE-WORK-2 contains the computer code (the only information necessary in searching for the record to be deleted), and will be thus output unchanged to the appropriate updating file.

2. Addition of a new record.

(a) Replace the alphabetical record identification by a computer code from the authority file:
- for institutions: by name of institution and country code
- for specialists: by name and date of birth
- for projects: by title of project and country code
- for documents: by Luhn code generated by the validation programme.

(b) Code, from the authority file, all alphabetical information on:
- types of activity
- geographical areas covered
- types of publication
- facilities available
- methods of data processing
- type of organization
- financing
- nationality

(c) Select all descriptors from the current text, and code for insertion in the "Authority codes" field of the record.

(d) Generate as follows pointers to connect interrelated records in different files. Check all references to institutions, specialists, projects and documents found in the record read from DARE-WORK-2 against the corresponding authority file.
- If a match occurs, at least the minimum information already exists in a record forming part of the information base. Code a link between the stored and the new records (using the computer code found in the authority record), and insert this code in the appropriate pointer field (e.g., check the names of senior staff enumerated in an institution record against the authority file of specialists, and insert the computer code found in this authority file in the pointer field of specialists).
- If no match is found, no record exists in the authority file, so there is no question of creating a pointer. Accordingly, retain the reference in alphabetical form in the reference field, no pointer being inserted in the corresponding pointer field (e.g., names of any senior staff not yet registered in the authority file of specialists remain in alphabetical form and no pointer is inserted for them in the pointer field for specialists).

(e) After coding, format the record into master record format, i.e., in view of the limitations of the FIND 2 package, all records must contain field terminators even for fields not actually containing information. The programme must accordingly:
- define any fields missing from the record,
- input field terminators as appropriate in the record,
- input spaces at the beginning of variable length fields in the positions reserved for indicators if such indicators are lacking,
- insert a word containing the value of 1 in binary form to allow for the preparation of statistical tables if subsequently necessary (this is because FIND can perform arithmetical operations only with binary fields),
- correspondingly change all the record's directory information, inserting the new tags, field lengths and field addresses, and amending the previously recorded field lengths and addresses.

(f) In view of the need for two-way communication between interrelated records, all records referred to by pointers in new records should be supplemented by the computer code of these new records. For this purpose the programme will create inverse records of type 7 (new field) for all pointers in the new record. Each of these will be output to the corresponding updating file.

(g) Also use the new record as input to DARE-WORK-1 (which is free at this point) to produce the Information Note and the Current Awareness Notices listings.

(h) A final operation is necessary for new institutions and new project records only. A specialist with no record in the master file mentioned in connexion with an institution or project is sent a questionnaire to obtain the information necessary. The programme covers this operation as follows. When no match is found in checking names from an institution or project record against the authority file for specialists, the programme outputs to the line printer the names and institution addresses of the specialists concerned. The printer prepares the labels for addressing the questionnaires.

3. Deletion of a field (type 6). The first task is to find out whether or not the field contains a pointer. Examine the tag of the field;
- If the field is a pointer field, create type 6 (delete field) inverse records;
- If a normal field, no inverse record is needed.

The records will be output to the corresponding updating file.

4. Addition of a field. Proceed exactly as in 3, except that the inverse record, if necessary, will be type 7 (new field).
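The creation of pointers and of the matching inverse records, as described under 2 (d) and 2 (f), can be illustrated by the following sketch. It is deliberately simplified: the authority file is modelled as a dictionary keyed by name alone (the text keys specialists by name and date of birth), and the record layout and function name are invented for the example.

```python
def link_references(new_code, references, authority):
    """Resolve the names cited in a new record into pointers.

    new_code: computer code of the new record.
    references: names cited in the new record (e.g. senior staff).
    authority: dict mapping name -> computer code (simplified layout).
    Returns (pointers, unresolved names, inverse records).
    """
    pointers, unresolved, inverse = [], [], []
    for name in references:
        code = authority.get(name)
        if code is None:
            unresolved.append(name)  # stays in alphabetical form, no pointer
            continue
        pointers.append(code)
        # Type 7 (new field) record adding the back-pointer to the cited record.
        inverse.append({"record_code": 7, "key": code, "pointer": new_code})
    return pointers, unresolved, inverse
```

In this reading, the unresolved names are also those for which, under 2 (h), address labels would be printed for the questionnaires.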

Print authority list

Authority lists are constantly changing (following changes in the master files) and should be reprinted as often as updated. A programme for this purpose is included in the updating subsystem.

1. The programme reads the authority files of institutions, specialists, projects, documents and, optionally, of descriptors.

2. When continuation fields occur in the authority record for institutions or for projects, a demultiplication operation has to be performed before output. In the printout, the names of institutions or projects beginning with the same 61 characters must not appear as a single entry (as they do in the storage records). Hence, as many output records are created as there are continuation fields in the authority records, so that the printout in each case contains the full name.

3. The programme next organizes the output of information as a listing on the line printer, the fields being automatically given column headings, generated from their names, as follows:
- for institutions: country code, name of institution, computer code
- for specialists: name of specialist, date of birth, computer code
- for projects: country code, title of project, computer code
- for documents: Luhn code, computer code
- for descriptors: descriptor, computer code.
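The demultiplication step in item 2 amounts to expanding one stored record into one print line per full name. A minimal sketch follows; the record layout (a shared 61-character `stem` and a list of continuation fields, each paired with its own computer code) is an assumption, as the text does not give the exact storage format.

```python
def demultiply(authority_record):
    """Expand one stored authority record into one print line per full name.

    Assumed layout: {"country": ..., "stem": first 61 characters of the name,
    "continuations": [(rest_of_name, computer_code), ...]}.
    """
    lines = []
    for rest, code in authority_record["continuations"]:
        full_name = authority_record["stem"] + rest
        lines.append((authority_record["country"], full_name, code))
    return lines
```

Each resulting tuple carries the three columns listed for institutions in item 3: country code, name of institution, computer code.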

Update master files

The programme successively reads the four updating files and performs an input-output operation on each of the corresponding master files. This is a simple procedure.

However, updating a master record may involve modifying one or more other records also. Deleting an institution record, for example, involves deleting pointers in it to collaborating specialists, and the creating of inverse records. These, however, cannot be simply added to the updating file because processing of that file may be already completed (e.g. the institutions file is read first; later, updating the specialists file generates an inverse record to delete a pointer from an institution record; this inverse record cannot simply be added to the updating file).

The solution is as follows. In updating the master files, the programme serially writes a temporary file (which may again be DARE-WORK-2), in which all newly-created inverse delete field records are provisionally stored. Once it has reached the end of the last updating file, the programme reads this temporary file, thus really completing the updating run.

It should be noted that the organization of the master files will be at random and they will be updated by random access.

The detailed programme is as follows. An updating record is read from an updating file according to the value of an indicator which will be changed on reaching the end of that file. The method of processing this updating record will vary with the type of record.

1. Delete record

Check on master file for a record having the same key.
- If none is found, there has been an error, which is included in the listing of errors.
- If a record is found, create as many inverse records of type 6 (delete field) as it contains pointers. These inverse records contain both the tag and the information content of the field to be deleted; they are prepared (as explained above) on the DARE-WORK-2 file. However, no information will be output to the master file, and the master record itself will be deleted.

2. New record

If a record having the same key is found on the master file, then the updating record is not really a new one; an error has occurred which is reported on the listing of errors.

If no such record is found, no further operation is necessary, and the updating record is simply output to the master file.

3. Delete field

If no record having the given key is found in the master file, there can obviously be no deletion, and there is an error, which is mentioned on the listing of errors.

If a record is found, determine whether the field to be deleted is multiple (i.e., contains several data elements of the same kind), or simple. Examine the structure of the updating record.

If the delete field record contains only the tag of the field to be deleted, the field is simple. Proceed as follows:
- compress the corresponding field in the master record in such a way that it contains only the field terminator (the field must not be wholly deleted, as the FIND programme requires that all variable length fields be present in every record, even if they contain no information).
- amend the directory accordingly, changing the start address of the fields following the one deleted, and the length of the deleted field.
- amend the word count of the record.

If the delete field record contains both the tag identifying the field and the information to be deleted, the field is multiple (containing the data element to be deleted and others that must not be touched). Hence:
- search for the subfield that contains the information to be deleted and compress to zero length;
- amend the directory and word count accordingly.

In both cases, before transferring the amended record to the master file, ensure that it is not affected by the next updating record to be processed. When the new updating record is read in, check the key of the record to be amended against that of the record already in the core store. Only if no match is found will the programme output the previously amended record and search for the newly affected one.

4. New field

Check whether a master record having the same key as the updating record really exists on the file.
- If not, a new field obviously cannot be added, the error is reported on the listing of updating errors, and the programme proceeds to the next updating record.
- If the record is found, the procedures will depend on the type of field affected.
- If the field in the master record already contains information, this field is multiple. Create a new subfield to contain the new information. Amend the starting positions of the fields following the extended field, its own length, and the word count.
- If the field in the master record does not yet contain any information, insert the information contained in the "new field" record at the correct place, and suitably amend the directory and the word count.

A new master record is read in if, and only if, the new updating record arriving from the updating file does not relate to a master record in the primary store (cf. procedure under 3. Delete field).
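The four cases above, together with the deferred handling of inverse records, can be sketched in a much simplified form. All structural details are assumptions made for the illustration: the four master files are merged into one dictionary, each record is a dictionary of tag to list of subfields (an empty list standing for a field that holds only its terminator), and the pointer tags in `INVERSE_TAG` are invented. Directory and word-count maintenance, and the buffering of the current record in core store, are not modelled.

```python
# Hypothetical pointer tags and the tag of the back-pointer in the other record.
INVERSE_TAG = {"SPEC_PTR": "INST_PTR", "INST_PTR": "SPEC_PTR"}


def apply_update(master, upd, deferred, errors):
    """Apply one updating record to `master` (dict: key -> {tag: [subfields]})."""
    code, key = upd["record_code"], upd["key"]
    record = master.get(key)
    if code == 5:                                   # new record
        if record is None:
            master[key] = upd["fields"]
        else:
            errors.append(("duplicate key", key))
    elif record is None:                            # types 4, 6, 7 need a record
        errors.append(("key not found", key))
    elif code == 4:                                 # delete record
        for tag, values in record.items():
            if tag in INVERSE_TAG:
                for target in values:               # one inverse record per pointer
                    deferred.append({"record_code": 6, "key": target,
                                     "tag": INVERSE_TAG[tag], "value": key})
        del master[key]
    elif code == 6:                                 # delete field
        if upd.get("value") is None:
            record[upd["tag"]] = []                 # simple field: kept, but empty
        elif upd["value"] in record.get(upd["tag"], []):
            record[upd["tag"]].remove(upd["value"])  # multiple field: one subfield
        else:
            errors.append(("subfield not found", key))
    elif code == 7:                                 # new field or new subfield
        record.setdefault(upd["tag"], []).append(upd["value"])


def run_updates(master, updating_files):
    """Process the updating files in turn, then the deferred inverse records."""
    deferred, errors = [], []
    for updates in updating_files:
        for upd in updates:
            apply_update(master, upd, deferred, errors)
    for upd in list(deferred):                      # the temporary file is read last
        apply_update(master, upd, deferred, errors)
    return errors
```

The `deferred` list stands in for the temporary file: inverse delete-field records created while one file is being processed are applied only after the last updating file has been read.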

Sort new records

When updating is completed the master files are ready for use, but are not needed in preparing the Information Note and Current Awareness Notices, because all new records are separately available on the DARE-WORK-1 file written during Run 3 (update authority files and create updating records for the master files). The purpose of DARE-WORK-1 is precisely to avoid having to select the new records from the whole bulk of master records.

The names of institutions, projects, and so on must be arranged by country and, within countries, in alphabetical order. A standard ICL sort programme sorts by fields in the following order:

alphabetical by (sorting) name within country within type of file.

This new sorted file is output to a magnetic disc and will constitute a serial file DARE-NEWREC.
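The sort order stated above (name within country within type of file) corresponds to a three-level key. As a minimal sketch, with field names that are assumptions made for the example:

```python
def sort_new_records(records):
    """Order records by type of file, then country, then sorting name."""
    return sorted(records,
                  key=lambda r: (r["file_code"], r["country"], r["sort_name"]))
```

The most significant key comes first in the tuple, so that all institutions precede all projects, and within each file the records are grouped by country before being arranged alphabetically.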

Prepare Information Note and Current Awareness Notices

The FIND 2 programme package is used to select and print information for the above final products.

As the Information Note covers all new records added to the information base during updating, no selection operation is performed, and FIND is used only as an output generator, producing all the necessary products in a single run.

The preparation of the Current Awareness Notices, on the other hand, demands selection from among the new record input on the DARE-NEWREC file.

The FIND programme can, in one run, process a batch of 96 questions simultaneously. The first is so framed that all the records needed for the preparation of the Note are retrieved and become output to the line printer, sorted in the order indicated above. The remaining 95 serve 95 Current Awareness Notice users. These questions are derived from the interest "profiles" of programme specialists, and modified as these profiles change.

All questions are communicated to the machine by punched parameter cards. The output format required is also specified. To repeat regular, recurring inquiries, there is no need to repeat the inquiry or specifications; the FIND programme package can store specifications and programme in a parameter file. Thus the Information Note and Current Awareness Notices programme is generated before the first regular Current Awareness run (i.e., during the change-over period) and the profiles will be card-input only on this occasion. The programme will then be preserved for regular use; on any regular run, profile modifications become input on cards as changed parameters.

In computer terminology, preparation of the Note and Notices means a re-entry to the interrogation phase of FIND 2.

Each record read from DARE-NEWREC is examined by the programme to see whether it matches one or more user profiles. If so, it is marked in such a way as to allow the output programme to identify the user(s) concerned. The new record is extended to contain 96 additional positions (bits), each corresponding to a given profile (e.g., the fifth position (bit) corresponds to profile 5). These positions are originally filled with zeros. When a record matches a profile, the position changes to 1, and the corresponding bit is added.

On the termination of the process, each record is entered in a temporary "hit" file (DARE-HIT-NEW).

At this point, the line printer can be allotted to print:
- the total number of hits for each profile;
- the total number of records interrogated; and
- the percentage hit rate for each inquiry.
The programme then automatically enters the output phase, and organizes the output from the hit file as a listing on the line printer.
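The marking of new records against profiles can be sketched as follows. This is a minimal Python illustration, not the FIND 2 code: the `Profile` class, its predicate functions and the sample records are hypothetical. Only the 96 positions, the initial zeros and the statistics sent to the line printer are taken from the description above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

N_PROFILES = 96  # one position (bit) per profile, as described in the text


@dataclass
class Profile:
    number: int                      # 1-based profile number (hypothetical layout)
    matches: Callable[[Dict], bool]  # stand-in for the punched inquiry conditions


def mark_record(record: Dict, profiles: List[Profile]) -> List[int]:
    """Return the 96 positions for one record; position n-1 is 1 if profile n matched."""
    bits = [0] * N_PROFILES          # positions are originally filled with zeros
    for profile in profiles:
        if profile.matches(record):
            bits[profile.number - 1] = 1
    return bits


def hit_statistics(all_bits: List[List[int]]) -> Dict[int, Dict[str, float]]:
    """Total hits and percentage hit rate for every profile with at least one hit."""
    total = len(all_bits)
    stats = {}
    for n in range(N_PROFILES):
        hits = sum(bits[n] for bits in all_bits)
        if hits:
            stats[n + 1] = {"hits": hits, "rate": 100.0 * hits / total}
    return stats


# Hypothetical sample data, for illustration only.
profiles = [
    Profile(1, lambda r: True),                    # question 1: every new record (the Note)
    Profile(5, lambda r: r["area"] == "AFRICA"),   # an invented specialist profile
]
records = [{"area": "AFRICA"}, {"area": "ASIA"}]
marked = [mark_record(r, profiles) for r in records]
stats = hit_statistics(marked)
```

In this sketch the first question matches every record, mirroring the way the Information Note is produced by the same run as the 95 Current Awareness profiles.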

First, the Information Note is printed, containing all new records decoded and arranged in order.

In printing, each Current Awareness Notice lists the name and address of the specialist to be notified, the profile, and all new records matching it (or only those of them specifically requested by the specialist).

The printout format is simple, clearly arranged, all fields beginning on a new line. Variable length fields are allotted 30 characters in the print line. If the information contained by the field is longer, the continuation is printed on the following lines, directly underneath, within the same print positions.

The List option (the most flexible of the four FIND 2 output options) will be used.
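The 30-character allotment for variable length fields can be pictured with a short Python sketch. It assumes a hard split every 30 characters; whether FIND 2 breaks at word boundaries is not stated in the text, and the function name and `start_col` parameter are illustrative only.

```python
from typing import List

FIELD_WIDTH = 30  # characters allotted to a variable length field in the print line


def print_lines(value: str, start_col: int = 0, width: int = FIELD_WIDTH) -> List[str]:
    """Split a field value into print lines of at most `width` characters.

    Continuation lines start at the same print position as the first line,
    so the overflow appears directly underneath.
    """
    chunks = [value[i:i + width] for i in range(0, len(value), width)] or [""]
    return [" " * start_col + chunk for chunk in chunks]


# A 70-character value occupies three print lines within the same positions.
lines = print_lines("X" * 70, start_col=4)
```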

Runs: frequency, requests

Updating runs will be made weekly. If there are any changes in user profiles, the standard ICL form (from which parameter cards for the preparation of Current Awareness Notices can be punched) should be clipped on to the request.

Distribution of computer output

The programmes composing the updating subsystem prepare the following printed listings:
- List of input errors
- List of authority errors
- List of updating errors
- List of names and addresses of specialists to be sent questionnaires
- A full printout of the authority lists
- Information Note
- Current Awareness Notices

The 3 error lists explain the reasons for rejection and are for internal use only. Those responsible for punching cards and completing the worksheets correct the errors and resubmit the data to the computer.

The names and addresses of specialists are stuck to envelopes by a Cheshire labelling machine; each envelope receives a questionnaire and is forwarded to the dispatch department.

The full printout of the authority lists is distributed to staff performing the checking and coding operations necessary in preparing worksheets to be used for punch documents.

The Information Note is multiplied and distributed to all interested specialists.

Current Awareness Notices are sent to the specialists named on them.

Chapter V

ANSWERING QUESTIONS

The ICL FIND 2 Multiple Enquiry System is a general purpose package which can handle search and output problems at computer speeds without programming costs. The Enquiry language is close to ordinary language.

Simple retrieval problems relating to one file can be solved by a single FIND run. Chained retrieval, using links between different files, requires a special programme.

1. FORMULATING QUESTIONS

The various references available are naturally always consulted first. Only if they fail to provide an answer is a computer run requested.

A question requiring computerized information retrieval involves parameters that express specific instructions to the general purpose programme. For example, a programme specialist wants a male family planning expert, over 40, for a training programme in Africa. The list should give, for each candidate, name, address, date of birth, institutions to which attached, specializations, current occupation. Each requirement in the question is treated as a separate condition, e.g., experience of African problems. Each condition is broken down into 3 components known in ICL terminology as field name, constant, relationship.

Field designates a section of a record containing a specific piece of information. The field name, which serves as reference, should clearly indicate what this information is, and do so in what, as nearly as possible, is everyday language. A field containing the designation of the geographical area of interest to an individual can be designated GEOGR-AREA-INT.

In the present example, there is one constant: AFRICA. The types of constants that may occur are as follows:
- Character constants, which are normally identification data (e.g., names) comprising any characters from the standard set of 64 characters (e.g., AFRICA, SMITH).
- Numeric constants: 6 types of numeric constants are acceptable to FIND, but DARE will use only positive integers (e.g., 150).
- Date. Dates can be relative to any base date, but DARE will use the ICL standard, i.e., 1 January 1900. The constant to be compared with a date field will have the format DD/MM/YY, where DD = day, MM = month, YY = year.
- Predefined constant. The same constant used several times in an Enquiry batch can be specified and given a name.

The field name and constant are linked by one of six relationships: equal to, not equal to, greater than, greater than or equal to, less than, less than or equal to. In formulating a question these are written as mnemonics (EQL, NEQ, GTR, GEQ, LSS, LEQ) or symbols (e.g., =).

So far, the above example now reads:

GEOGR-AREA-INT EQL AFRICA

Other conditions are linked by the logical operator AND, e.g.:

SEX EQL MALE AND DATE-BIRTH LEQ 01/01/31

Here, each condition has to be satisfied. The logical operator OR is used if fulfilling any one of several conditions is sufficient.
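The breakdown of a question into field name, relationship and constant can be made concrete with a small Python sketch. It is an illustration under stated assumptions, not the FIND 2 implementation: records are modelled as dictionaries, date fields are assumed to be held as a count of days since the base date of 1 January 1900, and the two-digit year is assumed to fall in 1900-1999.

```python
import operator
from datetime import date
from typing import Dict, List, Tuple

# The six relationships named in the text, mapped to Python comparisons.
RELATIONSHIPS = {
    "EQL": operator.eq, "NEQ": operator.ne,
    "GTR": operator.gt, "GEQ": operator.ge,
    "LSS": operator.lt, "LEQ": operator.le,
}

BASE_DATE = date(1900, 1, 1)  # ICL standard base date cited in the text

Condition = Tuple[str, str, object]  # (field name, relationship, constant)


def date_constant(text: str) -> int:
    """Convert DD/MM/YY to days since the base date (assumes years 1900-1999)."""
    day, month, year = (int(part) for part in text.split("/"))
    return (date(1900 + year, month, day) - BASE_DATE).days


def satisfies_all(record: Dict[str, object], conditions: List[Condition]) -> bool:
    """AND: every condition has to be satisfied."""
    return all(RELATIONSHIPS[rel](record[field], const)
               for field, rel, const in conditions)


def satisfies_any(record: Dict[str, object], conditions: List[Condition]) -> bool:
    """OR: fulfilling any one of the conditions is sufficient."""
    return any(RELATIONSHIPS[rel](record[field], const)
               for field, rel, const in conditions)


conditions = [
    ("GEOGR-AREA-INT", "EQL", "AFRICA"),
    ("SEX", "EQL", "MALE"),
    ("DATE-BIRTH", "LEQ", date_constant("01/01/31")),
]
# Hypothetical specialist records, for illustration only.
older = {"GEOGR-AREA-INT": "AFRICA", "SEX": "MALE",
         "DATE-BIRTH": date_constant("15/06/28")}
younger = dict(older, **{"DATE-BIRTH": date_constant("15/06/45")})
```

Holding dates as day counts makes "born on or before 1 January 1931" an ordinary integer comparison, which is why a single LEQ relationship suffices for the age requirement.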

When details are stored in different files, the pointers and way of proceeding from file to file must be defined (e.g., what institutions are doing research on human rights, on what projects, with what specialists?). Here retrieval starts from the institution file, proceeds to the project file and terminates with the specialists file.

Four basic options are available on FIND: Print, List, Total and Table, each producing a different type of output; DARE will usually use List and Print.

The Print option is used for reports not requiring a particular format or more than one data element per record (e.g., names of specialists working on a given subject). A major limitation of this option is that reports are limited to one line of output per record and one record per line. Only variable length fields containing more than 30 characters can be printed in several lines. (This is important, as answers to most DARE queries need more than one line).

The List option will usually be used by DARE. It is the most flexible, as the format and content of each line can be specified by the user but, of course, the preparation of a detailed specification also involves more human effort.

This specification must indicate the type of output device to be used (line printer or magnetic tape); the size of line (or of output record in the case of magnetic tape); the option required (Print or List); the main heading to appear at the top of the first page; the heading to be printed at the top of each page; the details of line spacing; and the details of all fields to be output.

The DARE output format of each field and spacing between fields is specified by coded letters:

A  alphanumeric
B  spaces
Z  zero
X  integers
E  (binary) dates

The code letters are preceded by a number indicating the size of field required. Fields should be named as in formulating inquiry conditions.

Other parameters can be used. As suggested earlier, the staff should have at least one liaison officer with a good enough knowledge of FIND language and facilities to be able to formulate more sophisticated questions. Next, parameter data must be transferred to standard ICL punch worksheets.

According to ICL conventions, each parameter starts with a major directive, preceded by the sign # (e.g., #READ, #WRITE, #ENQUIRY). Within the parameters, fields in free format are filled out with instructions that express the specific requirements.

The full list of input parameters is as follows:

(a) #READ

- Indicate type of storage device containing the file to be read (DARE will always use Exchangeable Discs, ED) immediately after this major directive.
- Specify master file to be searched (e.g., DARE-SPEC-MA) and file generation number (always the highest generation of the file on line to the system, i.e., -1).
- As intermediate information is dumped, the dump frequency, which must also be specified, is NO.

The format of the READ parameter will thus be:

READ, ED, DARE-PERS-MA, -1, NO

(b) #DICTIONARY

To avoid having to prepare new dictionaries, a standard dictionary will be input for all runs interrogating DARE files. It will indicate:

- for fixed length fields:
  - field name (starting with an alphabetic character, and consisting of any number of alphanumeric characters, but so arranged that the combination of the first seven is always unique within the dictionary)
  - field type (in DARE, either a character field or a binary field holding a positive integer)
  - start address, giving the word address if a field starts in character position 0 (in the case of ICL 1900 series machines, records are considered to consist of smaller units called words, each word being 4 characters long), or giving the word address and character position in the word if a field does not start at character position 0
  - length of field (either in words or in characters).
- for variable length fields:
  - field name (as for fixed length fields)
  - field type (as for fixed length fields)
  - number of preceding fields (terminating character address of the last one)

(c) #ENQUIRY

The parameters in (a) and (b) are preceded by the major directive #ENQUIRY.

Each inquiry is given an identifying name as up to 96 can be submitted in one computer run. The first character of this identifier is a question mark (e.g., ? ENQ 1).

If the inquiry wants to find out about candidates, it now reads:

ENQUIRY ? CANDIDATES (ACTIVITY EQL TRAINING AND SUBJECT EQL FAMILY-PLANNING AND SEX EQL MALE AND DATE-BIRTH LEQ 01/01/31)

The identifier and the conditions linked by logical operators of the first question are followed by those of the second, third (up to 96) questions to be answered in the same retrieval run.
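Two of the #DICTIONARY rules given under (b) lend themselves to a short check: the word-based start address and the requirement that the first seven characters of a field name be unique. The Python sketch below is illustrative only. It assumes that word addresses and character positions are both counted from 0 and that a record can be treated as a flat run of characters; the function names are hypothetical.

```python
from typing import Iterable, List

CHARS_PER_WORD = 4  # ICL 1900 series: each word is 4 characters long


def character_offset(word_address: int, char_position: int = 0) -> int:
    """Offset of a field from the start of the record, counted in characters."""
    if not 0 <= char_position < CHARS_PER_WORD:
        raise ValueError("character position must be 0, 1, 2 or 3")
    return word_address * CHARS_PER_WORD + char_position


def clashing_names(field_names: Iterable[str]) -> List[str]:
    """Return the names whose first seven characters repeat an earlier name."""
    seen = set()
    clashes = []
    for name in field_names:
        prefix = name[:7]
        if prefix in seen:
            clashes.append(name)
        seen.add(prefix)
    return clashes


# DATE-BIRTH and DATE-BIRTHPLACE share the prefix "DATE-BI" and would clash.
clashes = clashing_names(["DATE-BIRTH", "DATE-BIRTHPLACE", "SEX"])
```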

(d) In a chained retrieval, three additional parameters follow the #ENQUIRY parameter.

First, the optional parameter #WRITE, and the name of the hit file, which will depend on the master file interrogated, and accordingly be one of the following: DARE-HIT-INS, DARE-HIT-PER, DARE-HIT-PRO, or DARE-HIT-DOC.

Secondly, #HALT, used to halt the programme before the output run in order to insert the special chained retrieval programme.

Thirdly (this is not a FIND parameter), the major directive #CHAIN, and the following information.

- The level number (1, 2, 3 or 4) which defines the sequence of each type of information at the output. Level 1 is the level of the main question. For example, if the request is for (1) the names of institutions working on a specific subject, (2) their projects, (3) specialists working on these projects and (4) details of reports published by these specialists (all these data to be supplied by a single run, without any manual intervention), the level numbers will be:

  institution details  1
  project details      2
  specialist details   3
  document details     4

  Two or three types of information may have the same level, e.g., the user does not want specialists classified by project, but only the details of projects in selected institutions and staff working in them. In this case the levels will be:

  institution details  1
  project details      2
  specialist details   2

- The name of the file to be processed at the level defined by the level number.
- The position reserved for each type of pointer (first, institution pointer; second, specialist pointer; third, project pointer; fourth, document pointer). A character (1) is punched in the appropriate position whenever a pointer in a chained retrieval is to be followed as a link to another file. For example: the main question relates to institutions working on a specific field, and the user asks for more details about projects in these institutions. In the position reserved for the project pointers, 1 is punched. The system can follow as many pointers at the different levels as may be necessary, except that no pointers linking a lower level file to a higher level file are feasible in the same inquiry (e.g., if the institution is level 1 and specialists level 2, no pointers from specialists back to institutions are allowed; the programme follows the links in one direction only).

[Figure 9. Retrieval subsystem flowchart: 1. Validation and Interrogation; 2. Chained Retrieval (optional); 3. Sort (only in case of Chained Retrieval); 4. Print. Inputs shown include the parameters, one of the master files (1, 2 or 3 of the master files for chained retrieval) and the decode file; the output is a listing.]

(e) In a simple retrieval, #ENQUIRY parameters are used, while in a chained retrieval, #CHAIN parameters are used, followed by the output parameters.

An example (for a simple question) will show how the output can be specified. The data elements required are given a mnemonic name for each of the fields containing the information:

name of specialist  NAME
his address         ADDR
date of birth       DATE-BIRTH
annotation          ANNOTN

Format requirements are as follows:
- main heading: "candidates for teaching family planning", to be followed by 4 blank lines (spaces)
- heading for each page: "particulars of specialists", to be followed by 2 blank lines
- each data element to appear on a new line; address, date of birth and annotation to be preceded by 4 blanks.

To produce such a report on a line printer having 160 positions the output should be specified as follows:

OPTION, LP, 160, LIST
HEAD CANDIDATES FOR TEACHING FAMILY PLANNING
SPACE 4
PAGE PARTICULARS OF SPECIALISTS
SPACE 2
FORM 140 A, NAME
FORM 4B 140 A, ADDR
FORM 4B 8E, DATE-BIRTH
FORM 4B 140 A, ANNOTN
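How a FORM line combines a size and a code letter can be illustrated with a rough Python sketch. It is not the FIND 2 output generator: it handles only B (spaces) and the A, X and E codes (treated alike, as a value cut to the given size), it reads each FORM entry as a layout followed by a field name, and the sample record is invented.

```python
import re
from typing import Dict, List

ITEM = re.compile(r"(\d+)\s*([ABZXE])")  # a size followed by a code letter


def render_form(layout: str, field_name: str, record: Dict[str, str]) -> str:
    """Render one FORM line, e.g. layout '4B 140 A' with field 'ADDR'.

    Z and the real FIND 2 editing rules are outside this sketch.
    """
    out: List[str] = []
    for size, code in ITEM.findall(layout):
        if code == "B":
            out.append(" " * int(size))
        elif code in "AXE":
            out.append(str(record[field_name])[: int(size)])
        else:
            raise NotImplementedError(code)
    return "".join(out).rstrip()


FORMS = [("140 A", "NAME"), ("4B 140 A", "ADDR"),
         ("4B 8E", "DATE-BIRTH"), ("4B 140 A", "ANNOTN")]

# Hypothetical record, for illustration only.
record = {"NAME": "SMITH", "ADDR": "NAIROBI",
          "DATE-BIRTH": "15/06/28", "ANNOTN": "TRAINER"}
lines = [render_form(layout, name, record) for layout, name in FORMS]
```

Each entry of `FORMS` yields one print line, so the name starts at the margin and the other three data elements are preceded by 4 blanks, as the format requirements ask.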

(f) The last parameter is: #STOP

If waiting time is not to exceed one day, SSDC would need access twice a day to the computer, once to process requests before noon, a second time to process those received during the afternoon, during the night.

Requests for retrieval runs, prepared and signed by the liaison officer, should be accompanied by the clipped-on standard ICL form from which parameter cards can be punched.

2. COMPUTER PROCEDURES INVOLVED

The ICL FIND 2 package can be built into any larger system. As it can also be used by information centres without having a detailed knowledge of the various procedures, there is no point in going into too much detail about programmes, except for the chained retrieval programme which needs some explanations.

The programmes constituting the retrieval subsystem are shown in fig. 9. They are, briefly, as follows.

Validation and interrogation

This programme is in three phases: validation, inquiry translation, interrogation.

(a) Validation. The programme reads parameter input on punched cards indicating:
- the master file to be interrogated
- the inquiry (the main inquiry in the case of chained retrieval)
- the information required.
The programme checks the parameter cards and lists correct parameters on the parameter listing. Correct parameters are output to a file which can be scrapped at the end of the inquiry translation phase (see (b)).

At the end of the input parameter file the programme will halt if errors have occurred; otherwise it goes on to the next phase.

(b) Inquiry translation. The inquiry tables produced by the validation phase are translated into one large inquiry programme.

(c) Interrogation. Input will be one of the four master files.

The inquiry programme generated in (b) reads the appropriate master file serially and interrogates each record.

Batches of up to 96 inquiries may be processed simultaneously, with the following types of inquiry:
- Standard inquiry: records satisfying criteria in the question are output by the programme.
- Count inquiry: only the number of records satisfying the question is counted (the records themselves are not output by the programme).
- Lead inquiry: this tests whether all the following inquiries should be checked (there may be no point in doing so, because a major common condition was not satisfied by the record). The purpose is to decrease interrogation time, when, e.g., a set of inquiries relate to projects that contain different record conditions.

In standard inquiries, when at least one of the inquiries is a hit, 1-2-3-4 special hit words containing 24 positions (bits) are added to the end of the input record (1 hit word for 24 inquiries). When the record is a hit for one inquiry, the bit corresponding is set in the hit word. The hit word is then used later in the output phase to check the inquiries satisfied by the record.

The programme outputs the records supplemented with hit words to hit file.

At the end of the run, the line printer indicates:
- the total number of records interrogated
- the total number of hits for each inquiry
- the percentage hit rate for each inquiry.
The programme halts at the end of this run, if a chained retrieval is to be performed; otherwise it enters the output phase. If there are no hits, the programme will halt in any case.
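The packing of inquiry hits into 24-bit hit words can be sketched in a few lines of Python. The sketch assumes that inquiries are numbered from 1 and that inquiry 1 occupies the lowest bit of the first word; the text does not say which end of the word is used, so the bit order is an assumption.

```python
from typing import Iterable, List

BITS_PER_WORD = 24  # one hit word covers 24 inquiries


def hit_words(n_inquiries: int, hits: Iterable[int]) -> List[int]:
    """Build the hit words appended to one record.

    `hits` holds the 1-based numbers of the inquiries the record satisfied.
    """
    n_words = -(-n_inquiries // BITS_PER_WORD)  # ceiling division: 1 to 4 words
    words = [0] * n_words
    for inquiry in hits:
        index = inquiry - 1
        words[index // BITS_PER_WORD] |= 1 << (index % BITS_PER_WORD)
    return words


def satisfied(words: List[int]) -> List[int]:
    """Output phase: recover the inquiry numbers from the hit words."""
    return [w * BITS_PER_WORD + b + 1
            for w, word in enumerate(words)
            for b in range(BITS_PER_WORD) if (word >> b) & 1]


# A record that satisfies inquiries 1, 25 and 96 in a full batch of 96.
words = hit_words(96, [1, 25, 96])
```

With 96 inquiries the record carries four hit words, and a batch of 24 or fewer needs only one, which matches the "1-2-3-4" hit words mentioned above.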

Chained retrieval

Chained retrieval is necessary when a question asks for two or more types of information.

The programme reads the hit file prepared as a result of the previous run, and 1, 2 or 3 master files. The minimum number of input files is thus 2, the maximum 4.

The programme validates the parameter cards and checks:
- that they contain the level number and name of the file
- that there are no pointers linking a lower level file to a higher.

If errors occur, the programme halts, indicating why the parameter set has been rejected.

If no errors occur, the parameters are output to the line printer, and the programme proceeds as follows:

It reads the first record from the hit file (file level 1) and searches for the first type of pointer indicated in the parameter card. Having found the field in the hit record, it follows the first link designated by the first pointer in this field in order to find the corresponding record in the other (level 2) file. Having found this record, it checks the level 2 parameter information. If this contains an instruction to proceed to the next level, the programme follows the link along the line indicated by the first pointer of the second type of pointer on the parameter card, thus reaching the level 3 file. If necessary, it follows the link down to the level 4 file.

In other words, chains of pointers are followed to their end, and the programme each time outputs the record found at the end of each chain (see fig. 10).
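The walk down the chains of pointers can be sketched as a small recursive routine. This Python illustration makes several assumptions not found in the text: each file is a dictionary keyed by computer code, each record keeps its pointers in a `pointers` entry grouped by target file, and only one file is used per level (the case where two files share a level is not covered). File names such as `INS`, `PRO` and `SPE` are placeholders.

```python
from typing import Dict, List, Tuple

Record = Dict[str, object]
File = Dict[str, Record]  # computer code -> record


def validate_chain(levels: List[Tuple[int, str]]) -> None:
    """Reject parameter sets whose links would not run strictly downwards."""
    numbers = [number for number, _ in levels]
    if not numbers or numbers[0] != 1:
        raise ValueError("the main question must be level 1")
    if any(n not in (1, 2, 3, 4) for n in numbers):
        raise ValueError("level numbers must be 1, 2, 3 or 4")
    if any(later <= earlier for earlier, later in zip(numbers, numbers[1:])):
        raise ValueError("links may only be followed in one direction")


def follow(hits: File, files: Dict[str, File],
           levels: List[Tuple[int, str]]) -> List[Tuple[List[str], Record]]:
    """Return (chain of computer codes, record) for every record reached."""
    validate_chain(levels)
    found: List[Tuple[List[str], Record]] = []

    def descend(code_chain: List[str], record: Record, depth: int) -> None:
        found.append((code_chain, record))
        if depth + 1 == len(levels):
            return
        _, next_file = levels[depth + 1]
        for code in record.get("pointers", {}).get(next_file, []):
            descend(code_chain + [code], files[next_file][code], depth + 1)

    for code, record in hits.items():
        descend([code], record, 0)
    return found


# Hypothetical data: one institution, one project, two specialists.
files = {
    "PRO": {"P1": {"pointers": {"SPE": ["S1", "S2"]}}},
    "SPE": {"S1": {}, "S2": {}},
}
hits = {"I1": {"pointers": {"PRO": ["P1"]}}}
levels = [(1, "INS"), (2, "PRO"), (3, "SPE")]
found = follow(hits, files, levels)
```

Each entry of `found` carries the computer codes of every record touched on the way down, which is the information the following sort step needs.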

The records found at the end of a chain are supplemented by a field which provides the key in the following sorting run. Its length is the sum of the lengths of the computer code fields of the four input files (i.e., 4 x 8 characters). It contains the computer codes of all records touched in following a chain of pointers down to the record to be output (including the computer code of that record).

The structure of this field is thus the following:
- the first subfield (first 8 characters) contains the computer code of the 1st level record, which is the starting point of the chain leading to the record to be output.
- the second subfield (second 8 characters) contains the computer code of the 2nd level record found in following down the chain, and so on.

Subfields not filled with a computer code (because the search is, e.g., only a level 3 search) are filled with 0's if, to the right of them, there is no subfield containing a computer code. If there is such a subfield, they are filled with high values.

This supplementary field is added to the end of the record to be output, all other fields of this record remaining unchanged. The record is then output to a serial work file.

Sort

In a chained retrieval, records belonging together though of different types must appear together at the output (the first level record being at the top of the entry, followed by the first second level record belonging to it, and so on).

An indented format should appear on the output listing (see fig. 11), and is obtained by having the file containing all records selected during the chained retrieval sorted by a standard ICL sort programme on the supplementary field containing the four computer codes.
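Why sorting on the supplementary field yields the indented order can be seen from a short Python sketch. It assumes that unused subfields are filled with the character zero and that real computer codes sort after a run of zeros; the high-values case is not covered, and the 8-character codes shown are invented.

```python
from typing import List

CODE_LENGTH = 8   # characters per computer code
LEVELS = 4        # one subfield per level


def sort_key(code_chain: List[str]) -> str:
    """Build the 4 x 8 character supplementary field for one output record."""
    if not 1 <= len(code_chain) <= LEVELS:
        raise ValueError("a chain has 1 to 4 computer codes")
    if any(len(code) != CODE_LENGTH for code in code_chain):
        raise ValueError("computer codes are 8 characters long")
    return "".join(code_chain) + "0" * (CODE_LENGTH * (LEVELS - len(code_chain)))


# Hypothetical chains, deliberately out of order.
chains = [
    ["INST0002"],
    ["INST0001", "PROJ0001", "SPEC0001"],
    ["INST0001"],
    ["INST0001", "PROJ0001"],
]
ordered = sorted(chains, key=sort_key)
# Indent each record by its level to imitate the output listing.
listing = ["  " * (len(chain) - 1) + chain[-1] for chain in ordered]
```

Because a parent's key is its child's key with the trailing subfields zeroed, the parent always sorts immediately ahead of its own children and ahead of the next record at its own level.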

Print

The programme processes:
- the hit file output by the interrogation programme (in simple retrievals) or
- the file output by the sort programme (in chained retrievals).
The output is thus organized as a listing to the line printer.

In a simple retrieval, the programme automatically enters the output phase at the end of the interrogation phase. In a chained retrieval, the user can re-enter the system using a previously validated parameter file, thus allowing direct entry to the output phase.

The nature of the processing will depend on the output option specified in the validation phase.

3. USE OF COMPUTER OUTPUT

On receipt of computer output, answers should be checked for possible errors by the liaison officer before forwarding to the user. The user should be asked how pertinent, correct and useful the answer is. It might be necessary to perform another computer run the better to meet his requirements. Checks should be made as often as possible. Direct contact with the user is essential both before and after the retrieval run.

[Figure 10. Following chains of pointers from a 1st level record down through the 2nd, 3rd and 4th levels. P11 = 1st pointer of type 1, etc.; C = computer code; the numbered arrows mark the 1st step, 2nd step and so on.]

[Figure 11. Indented output listing: each 1st level record is followed by its 2nd level records, each of these by its 3rd level records, and so on.]

Chapter VI

PREPARING INDEXES

The operation of the indexing subsystem is almost entirely automatic; not even the preparatory procedures are manual. The programme is summarized in fig. 12.

1. PROCEDURES

Sort

In a printed subject index, entries should be easy to handle, in alphabetic order, by country, under subject headings. Record storage is random. The first operation is sorting. A standard ICL sort programme reads one of the master files and sorts the records by sorting name within country. Sorting by subject is not feasible at this stage because a record may contain several subject codes; moreover, the same record cannot be output to different places by a simple sort programme.

No format changes being necessary, the sorted file is simply output to a work file (DARE-WORK-1).

Print reference index

As an intermediate product from the run, a reference list containing all information currently available on the files can be printed out at this stage, arranged alphabetically within countries. The output programme re-enters the output phase and prints the information, each field appearing in a new line.

Demultiplication

The next task is to demultiply, i.e., to create as many new records (to be used in the preparation of a subject index) as there are main subject codes in each input record. These new records will then contain only one main subject code each.

Full subject indexes for DARE will be prepared only at the end of the change-over period and thereafter, at the end of each biennium. In between, supplements will be published. The programme can be instructed to read all records, or only those input to the master file over a given period. The distinction is made by the computer code, which also expresses the order of input of records. Thus, in preparing a supplement, only records whose key is a computer code higher than that of the input on the last run are read (except for institution records where only records posterior to a given date will be read).

As already mentioned, the programme creates as many new records as there are main subject codes. The format remains unchanged, except that each new record will contain only one main subject code in the field identified by the tag. This field thus also becomes a fixed length field; as it begins on a given position, the demultiplication allows a standard ICL sort programme to be used for the preparation of the subject index.

The programme outputs the records to a work file (DARE-WORK-2).

Sort

The file is now ready for the final sorting operation.

A standard ICL sort programme is used to sort the records output by the demultiplication programme in the final order required for the preparation of the subject index. The order of sorting is: sorting name within country within main subject code.

The records will still be output to a work file (DARE-WORK-1).
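Demultiplication followed by the final sort can be sketched in Python as below. The record layout is hypothetical: `subjects` stands for the list of main subject codes held in the input record, and `subject`, `country` and `name` for the fixed fields on which the sort then operates.

```python
from typing import Dict, List


def demultiply(record: Dict[str, object]) -> List[Dict[str, object]]:
    """Create one copy of the record per main subject code."""
    copies = []
    for code in record["subjects"]:
        copy = dict(record)
        copy["subject"] = code   # exactly one main subject code per new record
        del copy["subjects"]
        copies.append(copy)
    return copies


def subject_index_order(records: List[Dict[str, object]]) -> List[Dict[str, object]]:
    """Sort by sorting name within country within main subject code."""
    return sorted(records, key=lambda r: (r["subject"], r["country"], r["name"]))


# Hypothetical master records, for illustration only.
master = [
    {"name": "B INSTITUTE", "country": "FR", "subjects": ["0200", "0100"]},
    {"name": "A INSTITUTE", "country": "FR", "subjects": ["0100"]},
]
work_2 = [copy for record in master for copy in demultiply(record)]
index = subject_index_order(work_2)
entries = [(r["subject"], r["country"], r["name"]) for r in index]
```

The institution with two subject codes appears twice in the index, once under each heading, which is exactly what a plain sort of the undivided records could not achieve.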

Chained retrieval

Separate subject indexes will be published for institutions, specialists, projects, documents. Any of the four can, if necessary, include other kinds of information (e.g., details of some projects in the index of institutions). The chained retrieval programme can be optionally used for this purpose.

[Figure 12. Indexing subsystem flowchart: 1. Sort; 2. Print reference index; 3. Demultiplication; 4. Sort; 5. Chained Retrieval (optional); 6. Print Subject Index. Inputs shown include a parameter, one master file and the decode file; intermediate results pass through work files; the outputs are the alphabetical reference index and the subject index.]

Output subject index

A subject index, being for general use, should be published as an ordinary book, with capitals and lower case letters. The present computer configuration does not allow this.

The FIND 2 Multiple Enquiry Package output format is flexible, but the Unesco model has at present no facilities for output to paper tape, and relies on magnetic tape. This can be sent to a service bureau for final processing. Alternatively, if the present configuration is supplemented by a paper tape punch, another programme could be used for output from magnetic tape to paper tape.

2. FREQUENCY OF INDEXING RUNS; DISTRIBUTION

Frequency of runs and how to request them

Index preparation runs will be performed regularly:
- for the printing of quarterly alphabetical reference indexes
- for the printing of general subject indexes:
  . a full index at the end of each biennium
  . supplements at the end of each intermediate 6 months period, i.e., 3 times in a biennium.

All requests for runs should specify the type of index to be prepared (alphabetical reference index only, or with general subject index also); and whether a full index or only a supplement. If only a supplement, indicate date of last subject-index preparation run (in the case of institutions) or the last computer codes (in the case of specialist, project and document records).

Distribution of outputs

In the case of the indexing subsystem, the computer will output:
- an alphabetical reference index
- optional general subject index or supplements
The first is for internal, Secretariat distribution only.

The subject indexes will be for more general use, and hence produced in larger issues for distribution to Unesco staff, other United Nations organizations, and possibly to other subscribers.

Chapter VII

LAUNCHING DARE

Prior to the change-over to full-scale computerized operation, various contacts have to be made with persons and institutions, and procedures necessary to the opening of files and the creation of a framework for the system as a whole have to be worked out. The change-over period will last for several months.

1. PHASES

Three phases, or combinations of them, are involved: pilot running, parallel running and direct change-over.

Pilot running

All DARE programmes will be prepared and tested with experimental data before any actual data is input or any services are provided. By the time the four master files are opened, the system should be fully operational.

Parallel running

Meanwhile, as the manual system continues, the computerized information base will be built up until it contains, as a minimum, all up-to-date information available on the manual files. Retrieval requests answered by both systems will allow mutual checking of results and show what corrections to the system, if any, are needed. The old system can be discontinued when all records are input to the computerized file, and the new system can be shown to be operating correctly.

Direct change-over

Updating the Synoptic Card Indexes can be stopped right at the beginning, and the change-over made to the new procedures required for computerized updating.

2. INPUT TO DATA BASE

Questionnaires will seek any supplementary information that may be required over and above what is already available in the manual files.

3. OPENING FILES

As new records for inclusion in the master files involve automatic coding, authority and decode files have to be opened. Moreover, the alphabetic sequence institution codes must take account of the frequency of character distribution; hence all the names should be input before any other information on institutions is recorded.

To open the authority and decode files, which are indexed sequential files, a standard ICL programme (marked # XJEZ) must be used.

The procedure is as follows.

- Make out type B worksheets for all existing information.
  . For institutions, enter official name (i.e. English or original name only) and country code from alphabetical card catalogue.
  . For specialists, enter name and date of birth.
  . For projects, enter title and country code.
  . For documents, enter by Luhn code.
  . For descriptors, enter together with their manually assigned code number. All descriptor codes are right justified with leading zeros, and will thus always start with a zero.

- Create authority files independently in the following order: institutions, specialists, projects, documents.

- Punch information recorded on worksheets, and input to computer, but only after all available information for a given authority file has been punched (to avoid having to reorganize the indexed sequential files).

- For the starting authority records use the validating programme. Records are output to DARE-WORK-1.


To ensure correspondence between the alphabetical sequence and the code sequence, sort this file alphabetically (standard ICL sort programme). Records are output to DARE-WORK-2.

To open the authority files, use the standard ICL programme for opening indexed sequential files, with a built-in segment for automatically generating the computer codes required. This segment works as follows:

- For institutions, the computer code expresses alphabetical order of names within country. Code numbers start from 10,000,000 and increment by 50,000, allowing the inclusion of 49,999 new names between two names already on file. In DARE, some 1,700 official names (i.e. about 60% of the final authority list of institutions) can already be loaded at this stage. Hence, the frequency of character distribution can be taken into account. As only 1,500 - 2,000 new institutions (say 12,000 new official and synonymous names) will be added later, the increment of 50,000 allows the inclusion of new names without any need for resequencing, even when the file reaches its maximum size.

- For specialists, projects and documents, the computer codes do not express the alphabetical order; hence serial numbers are assigned, each incrementing the previous number by 1.

In the case of descriptors, the segment does not work.
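The two numbering rules above can be illustrated with a short sketch. It is written in Python for readability and is not the ICL segment itself. The starting value 10,000,000 and the increment 50,000 come from the text; everything else is an assumption made for illustration: the function names, sorting by country code first and name second (the text does not spell out how the country enters the code), the midpoint rule for placing a later name, and the starting serial number 1.

```python
START = 10_000_000   # first institution code, as stated in the text
STEP = 50_000        # gap between consecutive codes, as stated in the text

def assign_institution_codes(records):
    """Give each (country_code, official_name) pair a code in sorted order."""
    ordered = sorted(records, key=lambda r: (r[0], r[1].casefold()))
    return {rec: START + i * STEP for i, rec in enumerate(ordered)}

def free_slots(lower, upper):
    """Number of unused codes strictly between two codes already on file."""
    return upper - lower - 1

def code_between(lower, upper):
    """Hypothetical rule: place a new name midway between its two neighbours."""
    if free_slots(lower, upper) < 1:
        raise ValueError("no free code left; the file would need resequencing")
    return (lower + upper) // 2

def assign_serial_codes(items, first=1):
    """Serial numbering for specialists, projects and documents (start assumed)."""
    return {item: first + i for i, item in enumerate(items)}

# Hypothetical sample data, for illustration only.
codes = assign_institution_codes([
    ("FRA", "Institut B"), ("GBR", "Centre C"), ("FRA", "Institut A"),
])
new_code = code_between(codes[("FRA", "Institut A")], codes[("FRA", "Institut B")])
serials = assign_serial_codes(["Project X", "Project Y"])
```

With an increment of 50,000, `free_slots` confirms the 49,999 unused codes between two neighbouring names, which is why later additions need no resequencing as long as a gap remains.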

The standard ICL programme for opening authority files does not create any inverse records for inclusion in the decode file.

To open the indexed sequential decode file, create inverse records for the official names recorded in the authority files. Using a small special programme, sort these inverse cards into the fields containing the computer code, and output to a work file. Use the standard ICL opening programme to read the work file and write the decode file. Once the authority and decode files are ready, open the random organized master files.
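The relation between an authority file and its decode file can be sketched as follows: the authority file leads from an official name to its computer code, and the decode file holds the inverse records, ordered by code. This is an illustrative Python sketch only; the function name `build_decode_records` and the sample names and codes are hypothetical, and an in-memory mapping stands in for the indexed sequential files.

```python
def build_decode_records(authority):
    """Invert name -> code entries and sort the inverse records by code."""
    inverse = [(code, name) for name, code in authority.items()]
    inverse.sort(key=lambda rec: rec[0])   # the sort step before writing the file
    return inverse

# Hypothetical authority entries, for illustration only.
authority = {"Institut A": 10_050_000, "Centre B": 10_000_000}
work_file = build_decode_records(authority)
decode = dict(work_file)                   # code -> official name
print(decode[10_000_000])
```

Sorting before writing mirrors the reason given earlier for punching all records first: an indexed sequential file is loaded in key order, so that it need not be reorganized afterwards.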

4. CURRENT AWARENESS SERVICE

Discuss requirements personally with the programme specialists concerned, including the output format desired. Use these profiles to prepare parameter cards. These cards are input to the FIND programme, validated and translated. This programme is conserved for regular use.

At the end of the change-over period, when all available information is stored in the machine, a basic list of information can be prepared for all programme specialists on the Current Awareness Notices mailing list. Thereafter, the conserved programme can be regularly used in preparing Current Awareness Notices.




1972 International Book Year

US $1.50; 45p (stg.); 6 F plus taxes, if applicable [B,3051]