Historical Reflections

How Charles Bachman Invented the DBMS, a Foundation of Our Digital World

His 1963 Integrated Data Store set the template for all subsequent database management systems.

Fifty-three years ago a small team working to automate the business processes of the General Electric Company built the first database management system. The Integrated Data Store (IDS) was designed by Charles W. Bachman, who won the ACM’s 1973 A.M. Turing Award for the accomplishment. Before General Electric, he had spent 10 years working in engineering, finance, production, and data processing for the Dow Chemical Company. He was the first ACM A.M. Turing Award winner without a Ph.D., the first with a background in engineering rather than science, and the first to spend his entire career in industry rather than academia.

Some stories, such as the work of Babbage and Lovelace, the creation of the first electronic computers, and the emergence of the personal computer industry, have been told to the public again and again. They appear in popular books, such as Walter Isaacson’s recent The Innovators: How a Group of Hackers, Geniuses and Geeks Created the Digital Revolution, and in museum exhibits on computing and innovation. In contrast, perhaps because database management systems are rarely experienced directly by the public,

Figure 1. This image, from a 1962 internal General Electric document, conveyed the idea of random access storage using a set of “pigeon holes” in which data could be placed.



establish a “totally integrated management information system.”8 This would integrate and automate all the core operations of a business, ideally with advanced management reporting and simulation capabilities built right in. The latest and most expensive computers of the era had new capabilities that seemed to open the door to a more aggressive approach. Compared to the machines of the 1950s they had relatively large memories. They featured disk storage as well as tape drives, could process data more rapidly, and some were even used to drive interactive terminals.

The reality of data processing changed much more slowly than the hype, and remained focused on simple administrative applications that batch processed large files to accomplish tasks such as weekly payroll processing, customer statement generation, or accounts payable reporting.

Many companies announced their intention to build totally integrated management information systems, but few ever claimed significant success. A modern reader would not be shocked to learn that firms were unable to create systems of comparable scope to today’s Enterprise Resources Planning and data warehouse projects using computers with perhaps the equivalent of 64KB of memory, no real operating system, and a few megabytes of disk storage. Still, even partially integrated systems covering significant portions of a business would have real value. The biggest roadblocks to even modest progress toward this goal were the sharing of data between applications and the difficulties application programmers faced in exploiting random access disk storage.

Getting a complex job done might involve dozens of small programs and the generation of many working tapes full of intermediate data. These banks of whirring tape drives provided computer centers with their main source of visual interest in the movies of the era. Tape-based processing techniques evolved directly from those used with pre-computer mechanical punched card machines: files, records, fields, keys, grouping, merging data from two files, and the hierarchical combination of master and detail records within a single file. These applied to

database history has been largely neglected. For example, the index of Isaacson’s book does not include entries for “database” or for any of the four people to have won Turing Awards in this area: Charles W. Bachman (1973), Edgar F. Codd (1981), James Gray (1998), or Michael Stonebraker (2014).

That’s a shame, because if any technology was essential to the rebuilding of our daily lives around digital infrastructures, which I assume is what Isaacson means by “the Digital Revolution,” then it was the database management system. Databases undergird the modern world of online information systems and corporate intranet applications. Few skills are more essential for application developers than a basic familiarity with SQL, the standard database query language, and a database course is required for most computer science and information systems degree programs. Within ACM, SIGMOD (the Special Interest Group for Management of Data) has a long and active history fostering database research. Many IT professionals center their entire careers on database technology: the Bureau of Labor Statistics estimates the U.S. alone employed 120,000 database administrators in 2014 and predicts faster than average growth for this role.

Bachman’s IDS was years ahead of its time, implementing capabilities that had until then been talked about but never accomplished. Detailed functional specifications for the system were complete by January 1962, and Bachman was presenting details of the planned system to his team’s in-house customers by May of that year. It is less clear from archival materials when the system first ran, but Bachman tells me that a prototype installation of IDS was tested with real data in the summer of 1963, running twice as fast as a custom-built manufacturing control system performing the same tasks.

The details of IDS, Bachman’s life story, and the context in which it arose have been explored elsewhere.2,6 In this column, I focus on two specific questions:

• Why do we view IDS as the first database management system, and

• What were its similarities and differences versus later systems?

There will always be an element of subjectivity in judgments about “firsts,” particularly as IDS predated the concept of a database management system. As a fusty historian I value nuance and am skeptical of the idea that any important innovation can be fully understood by focusing on a single breakthrough moment. I have documented many ways in which IDS built on earlier file management and report generation systems.7 However, if any system deserves the title of “first database management system” then it is clearly IDS. It became a model for the earliest definitions of “data base management system” and included most of the core capabilities later associated with the concept.

What Was IDS For?

Bachman created IDS as a practical tool, not an academic research project. In 1963 there was no database research community. Computer science was just beginning to emerge as an academic field, but its early stars focused on programming language design, theory of computation, numerical analysis, and operating system design. In contrast to this academic neglect, the efficient and flexible handling of large collections of structured data was the central challenge for what we would now call corporate information systems departments, and was then called business data processing.

During the early 1960s the hype and reality of business computing diverged dramatically. Consultants, visionaries, business school professors, and computer salespeople had all agreed that the best way to achieve real economic payback from computerization was to




magnetic tape much as they had done to punched cards, except that tape storage made sorting much harder. The formats of tape files were usually fixed by the code of the application programs working with the data. Every time a field was added or changed all the programs working with the file would need to be rewritten. If applications were integrated, for example, by treating order records from the sales accounting system as input for the production scheduling application, the resulting web of dependencies made it increasingly difficult to make even minor changes when business needs shifted.

The other key challenge was making effective use of random access storage in business application programs. Sequential tape storage was conceptually simple, and the tape drives themselves provided some intelligence to aid programmers in reading or writing records. Applications were batch-oriented because searching a tape to find or update a particular record was too slow to be practical. Instead, master files were periodically updated with accumulated data or read through to produce reports. With the arrival, in the early 1960s, of disk storage a computer could theoretically apply updates one at a time as new data came in and generate reports as needed based on current data. Indeed this was the target application of IBM’s RAMAC computer, the first to be equipped with a hard disk drive. A programmer working with a disk-based system could easily instruct the disk drive to pull data from any particular platter or track, but the hard part was figuring out where on the disk the desired record could be found. The phrase “data base” was associated with random access storage but was not particularly well established, so Bachman’s alternative choice of “data store” would not have seemed any more or less familiar at the time.

Without significant disk file management support from the rudimentary operating systems of the era only elite programmers could hope to create an efficient random access application. Mainstream application programmers were beginning to shift from assembly language to high-level languages such as COBOL, which included high-level support for structuring data in tape files but lacked comparable support for random access storage. Harnessing the power of disks meant finding ways to sequence, insert, delete, or search for records that did not simply replicate the sequential techniques used with tape. Solutions such as hashing, linked lists, chains, indexing, inverted files, and so on were quickly devised but these were relatively complex to implement and demanded expert judgment to select the best method for a particular task (see Figure 1).
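To make the difficulty concrete, here is a minimal Python sketch of one of the techniques named above: hashing a record key to a "home" bucket and following an overflow chain when that bucket is full. It is an illustration only, not a reconstruction of any 1960s system. The names `HashedFile`, `Bucket`, `home_address`, and the bucket sizes are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional

BUCKETS = 8          # assumed number of home addresses on the "disk"
BUCKET_CAPACITY = 2  # assumed number of records a bucket holds before overflowing


@dataclass
class Bucket:
    records: dict = field(default_factory=dict)  # key -> record
    overflow: Optional["Bucket"] = None          # link to the next bucket in the chain


def home_address(key: str) -> int:
    """Map a key to a bucket number with a simple deterministic hash."""
    return sum(key.encode()) % BUCKETS


class HashedFile:
    """Hypothetical random access file: hashing picks the bucket, chains absorb collisions."""

    def __init__(self):
        self.buckets = [Bucket() for _ in range(BUCKETS)]

    def store(self, key, record):
        bucket = self.buckets[home_address(key)]
        while True:
            # Update in place, or use free space in the current bucket.
            if key in bucket.records or len(bucket.records) < BUCKET_CAPACITY:
                bucket.records[key] = record
                return
            # Bucket is full: extend the overflow chain if needed and move along it.
            if bucket.overflow is None:
                bucket.overflow = Bucket()
            bucket = bucket.overflow

    def retrieve(self, key):
        bucket = self.buckets[home_address(key)]
        while bucket is not None:
            if key in bucket.records:
                return bucket.records[key]
            bucket = bucket.overflow
        return None  # key is not in the file
```

Even this toy version forces choices about bucket size, hash function, and overflow handling, which is the kind of expert judgment the paragraph above describes.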

IDS was intended to substantially solve these two problems, so that applications could be integrated to share data files and ordinary programmers could effectively develop random access applications using high-level languages. Bachman designed IDS to meet the needs of an integrated systems project called MIACS, for Manufacturing Information and Control System. General Electric had many factories spread over its various divisions, and could not produce and support a different integrated manufacturing system for each one. Furthermore, it was entering the computer business, and its managers recognized that a flexible and generic integrated system based on disk storage would be a powerful tool in selling its machines to other companies. A prototype version of MIACS was being built and tested on the firm’s Low Voltage Switch Gear department by a group of systems-minded staff specialists.

Was IDS a Database Management System?

By interposing itself between application programs and the disk files in which they stored data, IDS carried out what we still consider the core task of a database management system. Programs could not manipulate data files directly, instead making calls to IDS so that it would perform the data operations on their behalf.

Like modern database management systems, IDS explicitly stored and manipulated metadata about the records and their relationships, rather than expecting each application program to understand and respect the format of every data file it worked with. It enforced relationships between different record types, and would protect database integrity. Database designers specified record clusters, linked list sequencing, indexes, and other details of record organization to boost performance based on expected usage patterns. However, the first versions did not include a formal data description language. Instead of being defined through textual commands the metadata was punched onto specially formatted input cards. A special command told IDS to read and apply this information. New elements could be added without deleting existing records. Each data manipulation command contained a reference to the appropriate element in the metadata.
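The idea of keeping record descriptions in the system rather than in each program can be sketched in a few lines of Python. This is a loose illustration under stated assumptions, not IDS's actual card format or call interface: the class `MiniStore` and its methods `define`, `add_field`, `store`, and `retrieve` are hypothetical names invented for the example.

```python
class MiniStore:
    """Hypothetical store that checks every operation against shared metadata."""

    def __init__(self):
        self.catalog = {}  # record type -> {field name: default value}
        self.records = {}  # record type -> list of stored rows

    def define(self, rtype, fields):
        """Register a record type (the metadata) once, outside any application."""
        self.catalog[rtype] = dict(fields)
        self.records.setdefault(rtype, [])

    def add_field(self, rtype, name, default):
        """Extend a record type without touching or deleting existing records."""
        self.catalog[rtype][name] = default

    def store(self, rtype, **values):
        unknown = set(values) - set(self.catalog[rtype])
        if unknown:
            raise KeyError(f"fields not in metadata: {sorted(unknown)}")
        self.records[rtype].append(dict(values))
        return len(self.records[rtype]) - 1  # reference to the stored record

    def retrieve(self, rtype, ref, *wanted):
        """Return the requested fields, or all fields known to the metadata."""
        row = self.records[rtype][ref]
        layout = self.catalog[rtype]
        return {name: row.get(name, layout[name]) for name in (wanted or layout)}
```

A program that asks only for `order_no` and `qty` keeps working after someone adds a `due_date` field, because the layout lives in the catalog and not in the program.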

IDS was designed to be used with a high-level programming language. In the initial prototype version, operational in early 1963, this was General Electric’s own GECOM language, though performance and memory concerns drove a shift to assembly language for the application programming in a higher performance version completed in 1964. Calls to IDS operations such as store, retrieve, modify, and delete were evaluated at runtime against embedded metadata. As high-level languages matured and memory grew less scarce, later versions of IDS worked with application programs written in COBOL.

This provided a measure of what is now called data independence for programs. If a file was restructured to add fields or modify their length then the programs using it would continue to work properly. Files could be moved around and records reorganized without rewriting application programs. That made running different application programs against the same database much more feasible. IDS also in-




Controller was built and installed at Weyerhaeuser, on a computer hooked up to a national Teletype network. The system serviced remote users at their Teletypes without any intervention needed by local operators. Requests to process order entry, inventory management, invoicing, and other business transactions were processed automatically by the Problem Controller and application programs.

Bachman’s original version of IDS lacked a backup and recovery system, a key feature of later database management systems. This was added in 1964 by the International General Electric team that produced and operated the first production installation of IDS. A recovery and restart magnetic tape logged each new transaction as it was started and captured database pages “before” and “after” they were modified by the transaction, so that the database could be restored to a prior consistent state if something went wrong before the transaction was completed. The same tape also served as a backup of all changes written to the disk in case there was a disk failure since the last full database backup.
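The two uses of the recovery tape described above, undoing an unfinished transaction and rebuilding after a disk failure, can be illustrated with a small Python sketch. It assumes a much simplified model: pages are dictionary entries, the "tape" is a Python list, and the names `PagedDatabase`, `rollback_incomplete`, and `replay_onto` are hypothetical. It is not a description of the 1964 implementation.

```python
class PagedDatabase:
    """Toy page store with a log of before and after images (a stand-in for the tape)."""

    def __init__(self, pages):
        self.pages = dict(pages)  # page number -> contents (the "disk")
        self.log = []             # append-only recovery log

    def begin(self, txn):
        self.log.append(("BEGIN", txn))

    def write(self, txn, page_no, new_contents):
        before = self.pages.get(page_no)  # None means the page did not exist yet
        self.log.append(("PAGE", txn, page_no, before, new_contents))
        self.pages[page_no] = new_contents

    def commit(self, txn):
        self.log.append(("END", txn))

    def _finished(self):
        return {entry[1] for entry in self.log if entry[0] == "END"}

    def rollback_incomplete(self):
        """Undo unfinished transactions by restoring before images, newest first."""
        finished = self._finished()
        for entry in reversed(self.log):
            if entry[0] == "PAGE" and entry[1] not in finished:
                _, _, page_no, before, _ = entry
                if before is None:
                    self.pages.pop(page_no, None)
                else:
                    self.pages[page_no] = before

    def replay_onto(self, backup_pages):
        """Rebuild from a full backup by reapplying after images of finished transactions."""
        finished = self._finished()
        restored = dict(backup_pages)
        for entry in self.log:
            if entry[0] == "PAGE" and entry[1] in finished:
                restored[entry[2]] = entry[4]
        return restored
```

Before images support the return to a prior consistent state; after images are what let the same log double as an incremental backup.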

The first packaged versions of IDS did lack some features later viewed as essential for database management systems. One was the idea that specific users could be granted or denied access to particular parts of the database. This omission was related to another limitation: IDS databases could be queried or modified only by writing and executing programs in which IDS calls were included. There was no capability to specify “ad hoc” reports or run one-off queries without having to write a program.a These capabilities did exist during the 1960s in report generator systems (such as 9PAC and MARK IV) and in online interactive data management systems (such as TDMS) but these packages were generally seen as

a On reading this observation, Bachman noted “IDS came into use long before the notion of online, interactive users came into vogue. There is no record of anyone writing an IDS transaction processing application program that processed transactions that specified a query or report and returned the desired output. However, the capability of IDS and the Problem Controller to handle such a query or report specifying transaction programs was clearly available. A missed opportunity!”

was used for paging buffers by IDS’s virtual memory manager.

Requests from users to process particular transactions were read from “problem control records” stored and retrieved by IDS in the same manner as application data records. Transactions could be simple, or contain a batch of data cards to be processed. The Problem Controller processed one transaction at a time by executing the designated application program. It worked its way through the queue of transaction requests, choosing the highest priority outstanding job and refreshing the queue from the card reader after each transaction was finished.
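The scheduling loop just described can be sketched as a priority queue that is topped up after every transaction. The Python below is a hedged illustration, not the Problem Controller's actual logic: `run_problem_controller`, the batch-per-read card reader, and the convention that a larger number means higher priority are all assumptions made for the example.

```python
import heapq
from itertools import count


def run_problem_controller(card_reader, programs):
    """Run queued requests one at a time, highest priority first.

    card_reader: iterator yielding lists of (priority, program_name, data) requests.
    programs:    mapping from program name to a callable taking the request data.
    """
    queue = []
    arrival = count()  # tie-breaker so equal priorities run in arrival order
    results = []

    def refresh():
        # Read whatever requests are waiting and add them to the queue.
        for priority, name, data in next(card_reader, []):
            heapq.heappush(queue, (-priority, next(arrival), name, data))

    refresh()
    while queue:
        _, _, name, data = heapq.heappop(queue)  # highest priority outstanding job
        results.append(programs[name](data))     # execute the designated program
        refresh()                                # top up the queue after each transaction
    return results
```

Because the queue is refreshed after every transaction, an urgent request that arrives late can overtake older low-priority work.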

The Problem Controller did not appear in later versions of IDS but did provide a basis for an early online transaction processing system. By 1965 an expanded version of the Problem

cluded its own system of paging data in and out of memory, to create a virtual memory capability transparent to the application programmer.

The concept of transactions is fundamental to modern database management systems. Programmers specify that a series of interconnected updates must take place together, so that if one fails or is undone they all are. IDS was also transaction oriented, though not in exactly the same sense. Bachman devised an innovative transaction processing system, which he called the Problem Controller. The Problem Controller and IDS were loaded when the computer was booted and together occupied 4,000 words of memory. They took control of the entire computer, which might have only 8,000 words of memory. The residual area in memory





























Figure 2. This drawing, from the 1962 presentation “IDS: The Information Processing Machine We Need,” shows the use of chains to connect records. The programmer looped through GET NEXT commands to navigate between related records until an end-of-set condition was detected.



a separate class of software from database management systems. By the 1970s report generation packages, still widely used, included optional modules to interface with data stored in database management systems.

IDS and CODASYL

After Bachman handed IDS over to a different team within General Electric in 1964 it was made available as a documented and supported software package for the company’s 200-series computers. In those days software packages from computer manufacturers were paid for by hardware sales and given to customers without an additional charge. Later versions supported its 400- and 600-series systems. New versions followed in the 1970s after Honeywell bought out General Electric’s computer business. IDS was a strong product, in many respects more advanced than IBM’s competing IMS, which appeared several years later. However, IBM machines so dominated the industry that software from other manufacturers was doomed to relative obscurity.

During the late 1960s the ideas Bachman created for IDS were taken up by the Database Task Group of CODASYL, a standards body for the data processing industry best known for its creation and promotion of the COBOL language. Its initial report, issued in 1969, drew heavily on IDS in defining a proposed standard for database management systems, in part thanks to Bachman’s own service on the committee.4 The report documented foundational concepts and vocabulary such as data definition language, data manipulation language, schemas, data independence, and program independence. It went beyond early versions of IDS by adding security features, including “privacy locks” and “sub-schemas,” roughly equivalent to views in modern systems, so that particular programs could be constrained to work with defined subsets of the database.

CODASYL’s definition of the architecture of a database management system and of its core capabilities was quite close to that included in textbooks to this day. In particular, it suggested that a database management system should support online, interactive applications as well as batch-driven applications and have separate interfaces. In retrospect, the committee’s work, and a related effort by CODASYL’s Systems Committee to evaluate existing systems within the new framework,5 were significant primarily for formulating and spreading the concept of a “data base management system.”

Although IBM itself refused to support the CODASYL approach, many other computer vendors endorsed the committee’s recommendations and eventually produced systems incorporating these features. The most successful CODASYL system, IDMS, came from an independent software company. It began as a port of IDS to IBM’s dominant System/360 mainframe platform.b

The Legacy of IDS

IDS and CODASYL systems did not use the relational data model, formulated years later by Ted Codd, which underlies today’s dominant SQL database management systems. Instead IDS introduced what would later be called the “network data model.” This encoded relationships between different kinds of records as a graph, rather than the strict hierarchy enforced by tape systems and some other software packages of the 1960s such as IBM’s later and widely used IMS. The network data model was widely used during the

b The importance of the database management system to the emerging packaged software industry is a major theme in M. Campbell-Kelly, From Airline Reservations to Sonic the Hedgehog: A History of the Software Industry. MIT Press, Cambridge, MA, 2003 and is explored in detail in T.J. Bergin and T. Haigh, “The Commercialization of Database Management Systems, 1969–1983.” IEEE Annals of the History of Computing 31, 4 (Oct.–Dec. 2009), 26–41.


sees himself above all as an engineer, retaining a professional engineer’s zest for the elegant solution of difficult problems and faith in the power of careful and rational analysis. As he wrote in a note at the end of the transcript of an oral history interview I conducted with him in 2004, “My work has been my play.”1

When database specialists look at IDS today they immediately see its limitations compared to modern systems. Its strengths are more difficult to recognize, because its huge influence on the nascent software industry meant that much of what was revolutionary about it in 1963 was soon taken for granted. Without IDS, or Bachman’s tireless championing of the ideas it contained, the very concept of a “database management system” might never have taken root. IDS did more than any other single piece of software to broaden the range of business problems to which computers could usefully be applied and so to usher in today’s “digital” world where every administrative transaction is realized through a flurry of database queries and updates rather than by completing, routing, and filing in triplicate a set of paper forms.

1. Bachman, C.W. Oral history interview by Thomas Haigh, September 25–26, 2004, Tucson, AZ. ACM Oral History Interviews collection. ACM Digital Library.

2. Bachman, C.W. The origin of the integrated data store (IDS): The first direct-access DBMS. IEEE Annals of the History of Computing 31, 4 (Oct.–Dec. 2009), 42–54.

3. Bachman, C.W. The programmer as navigator. Commun. ACM 16, 11 (Nov. 1973), 653–658.

4. CODASYL Data Base Task Group. CODASYL Data Base Task Group: October 1969 Report.

5. CODASYL Systems Committee. Survey of Generalized Data Base Management Systems, May 1969. Association for Computing Machinery, New York, 1969.

6. Haigh, T. Charles W. Bachman: Database software pioneer. IEEE Annals of the History of Computing 33, 4 (Oct.–Dec. 2011), 70–80.

7. Haigh, T. How data got its base: Generalized information storage software in the 1950s and 60s. IEEE Annals of the History of Computing 31, 4 (Oct.–Dec. 2009), 6–25.

8. Haigh, T. Inventing information systems: The systems men and the computer, 1950–1968. Business History Review 75, 1 (Spring 2001), 15–61.

Thomas Haigh is a visiting professor at Siegen University, an associate professor of information studies at the University of Wisconsin–Milwaukee, and immediate past chair of the SIGCIS group for historians of computing.

1970s and 1980s, and commercial database management systems based on this approach were among the most successful products of the mushrooming packaged software industry.

Bachman spoke memorably in his 1973 Turing Award lecture of the “Programmer as Navigator,” charting a path through the database from one record to another.3 The network approach used in IDS required programmers to work with one record at a time. Performing the same operation on multiple records meant retrieving a record, processing and if necessary updating it, and then moving on to the next record of interest to repeat the process. For some tasks this made programs longer and more cumbersome than the equivalent in a relational system, where a task such as deleting all records more than a year old or adding 10% to the sales price of every item could be performed with a single command.
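The contrast between the two styles can be shown with Python's standard `sqlite3` module. The sketch below raises every price by 10% twice: once with a record-at-a-time loop that only imitates the navigational pattern (it is not IDS code), and once with a single relational command. The table `item`, its columns, and the function names are assumptions invented for the example; prices are held in whole cents to keep the arithmetic exact.

```python
import sqlite3


def make_db():
    """Create a small in-memory table of items with prices in cents."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, price_cents INTEGER)")
    db.executemany(
        "INSERT INTO item (id, price_cents) VALUES (?, ?)",
        [(1, 10000), (2, 25000), (3, 4000)],
    )
    return db


def raise_prices_one_at_a_time(db):
    # Record-at-a-time style: fetch a record, process it, write it back, move on.
    rows = db.execute("SELECT id, price_cents FROM item ORDER BY id").fetchall()
    for item_id, price in rows:
        db.execute(
            "UPDATE item SET price_cents = ? WHERE id = ?",
            (price * 110 // 100, item_id),
        )


def raise_prices_set_at_a_time(db):
    # Relational style: one command applies to every qualifying record.
    db.execute("UPDATE item SET price_cents = price_cents * 110 / 100")


def prices(db):
    return [row[0] for row in db.execute("SELECT price_cents FROM item ORDER BY id")]
```

Both functions leave the table in the same state; the difference is in how much of the traversal the programmer has to spell out.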

IDS and other network systems encoded what we now think of as the “joins” between different kinds of records as part of the database structure rather than specifying them in each query and rebuilding them when the query is processed (see Figure 2). Bachman introduced a data structure diagramming notation, often called the “Bachman diagram,” to describe these relationships.c Hardcoding the relationships between record sets made IDS much less flexible than later relational systems, but also much simpler to implement and more efficient for routine operations.
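A chain of the kind shown in Figure 2 can be approximated in Python as a ring of records that starts and ends at an owner record. This is a simplified sketch under stated assumptions: each record carries a single `next_in_chain` pointer (so it can belong to only one chain here), and the names `Record`, `Chain`, `get_first`, and `get_next` are hypothetical stand-ins for the GET NEXT style of navigation, not the IDS command set.

```python
class Record:
    def __init__(self, kind, **fields):
        self.kind = kind
        self.fields = fields
        self.next_in_chain = None  # stored pointer to the next record in the chain


class Chain:
    """A ring of records: owner -> member -> ... -> back to the owner."""

    def __init__(self, owner):
        self.owner = owner
        owner.next_in_chain = owner  # an empty chain points back at its owner
        self.current = owner

    def insert(self, member):
        # Walk to the last record in the ring and link the new member after it.
        last = self.owner
        while last.next_in_chain is not self.owner:
            last = last.next_in_chain
        last.next_in_chain = member
        member.next_in_chain = self.owner

    def get_first(self):
        self.current = self.owner
        return self.get_next()

    def get_next(self):
        nxt = self.current.next_in_chain
        if nxt is self.owner:
            return None  # end-of-set: the ring has led back to the owner
        self.current = nxt
        return nxt


def order_numbers(chain):
    """Navigate the chain record by record, as a program would with GET NEXT."""
    found = []
    record = chain.get_first()
    while record is not None:
        found.append(record.fields["order_no"])
        record = chain.get_next()
    return found
```

The relationship between a customer and its orders is stored in the pointers themselves, so following it costs one pointer hop per record and no lookup at query time; the price is that a new kind of relationship needs a new chain in the database structure.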

IDS was a useful and practical tool for business use from the mid-1960s, while relational systems were not commercially significant until the early 1980s. Relational systems did not become feasible until computers were orders of magnitude more powerful than they had been in 1963 and some extremely challenging implementation issues had been overcome by pioneers such as IBM’s System R group and Berkeley’s INGRES team. Even after relational systems were commer-

c C.W. Bachman, “Data Structure Diagrams,” Data Base 1, 2 (Summer 1969), 4–10 was very influential in spreading the idea of data structure diagrams, but internal GE documents make clear he was using a similar technique as early as 1962.

cialized the two approaches were seen for some time as complementary, with network systems used for high-performance transaction-processing systems handling routine operations on large numbers of records (for example, credit card transaction processing or customer billing) and relational systems best suited for “decision support” analytical data crunching. IDMS, the successor to IDS, underpins some very large mainframe applications and is still being supported and enhanced by its current owner Computer Associates, most recently with release 18.5 in 2014. However, it and other database management systems based on Bachman’s network data model have long since been superseded for new applications and for mainstream computing needs.

Although by any standard a successful innovator, Bachman does not fit neatly into the “hackers, geniuses, and geeks” framework favored by Walter Isaacson. During his long career Bachman also founded a public company, played a leading role in formulating the OSI seven-layer model for data communications, and pioneered online transaction processing. In 2014, he visited the White House to receive from President Obama a National Medal of Technology and Innovation in recognition of his “fundamental inventions in database management, transaction processing, and software engineering.”d Bachman

d The 2012 medals were presented at the White House in November 2014.


