CS 185C/286: The History of Computing October 31 Class Meeting Department of Computer Science San Jose State University Fall 2011 Instructor: Ron Mak www.cs.sjsu.edu/~mak
Dec 20, 2015
Department of Computer Science | Fall 2011: October 31
CS 185C/286: History of Computing | © R. Mak
Don Chamberlin
History of Computing Speaker
    Wednesday, Nov. 2, 6:00-7:00 PM
    Auditorium ENGR 189
    Reception before the talk in ENGR 294 at 5:00 PM
“Fifty Years of Data: How Advances in Database Management Have Helped to Shape Our World”
Co-inventor of the SQL and XQuery database languages
What is a Database?
A collection of information that lasts over a long period of time
Can be accessed simultaneously by multiple instances of an application or by instances of many applications
Managed by a database management system (DBMS)
Database Management System (DBMS)
Users can create new databases
    Specify the structure of the data (schema)
Users can query (ask questions about) the data
Users can modify the data
Store large amounts (terabytes) of data
Store data for a long time (many years)
Ensure reliability
    Recover from errors and failures
Ensure data integrity
    Maintain proper relationships among data
DBMS, cont’d
Control access to data by multiple users and applications
Ensure data operations are completed (atomicity)
    Roll back partially completed operations
Maintain a data model, which determines:
    structure of the data
    operations on the data
    constraints on the data
Types of data models
    - hierarchical
    - relational
    - object-oriented
    - object-relational
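
The rollback of a partially completed operation can be sketched with Python's built-in sqlite3 module. The account table, its CHECK constraint, and the transfer() helper are hypothetical and are not part of the lecture; they only illustrate an all-or-nothing update.

```python
import sqlite3

# Hypothetical table and values, used only to illustrate rollback.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES (1, 100), (2, 50)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one all-or-nothing operation."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, 1, 2, 30)
failed = transfer(conn, 1, 2, 500)  # second UPDATE violates the CHECK
balances = conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
print(ok, failed, balances)
```

In the failed transfer the first UPDATE has already run when the second one is rejected, so the rollback is what keeps account 2 from being credited with money that never left account 1.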
The Relational Data Model
Data element: a value that is stored in the database
    Values are typed
    A value can be null
Entity: a group of data elements that together are meaningful for a person or an application
    Each data element is the value of an attribute of the entity
The Relational Data Model, cont’d
Table: a conceptual two-dimensional structure that contains entities of a particular type.
    Also called a relation
    Each row (also called a record) contains the attribute values of one entity.
    Each column (also called a field) holds an attribute value.
Equivalent terms:
    Table = relation
    Row = entity
    Rows and columns = records and fields
Logical Data Model: Initial version
Student
    id
    name
    which teachers
Teacher
    id
    name
    which classes taught
Class
    class code
    subject name
    classroom number

Each table has a primary key (PK) field whose value in each record uniquely identifies that record.

Student (PK: Id)
Id    Name           Teacher_id_1  Teacher_id_2  Teacher_id_3
1001  Doe, John      7003          7012          7008
1005  Novak, Tim     7012          7008          null
1009  Klein, Leslie  null          null          null
1014  Jane, Mary     7051          null          null
1021  Smith, Kim     7003          7012          7051

Teacher (PK: Id)
Id    Name           Class_code  Subject               Room
7003  Rogers, Tom    926         Java programming      101
7008  Thompson, Art  908         Data structures       114
7012  Lane, John     951         Software engineering  210
7012  Lane, John     974         Operating systems     109
7051  Flynn, Mabel   931         Compilers             222

John Lane teaches two classes.
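
What "uniquely identifies" means in practice can be sketched with sqlite3: once a column is declared as the primary key, the DBMS rejects a second record with the same value. The lowercase table and column names are an assumption of this sketch, not the lecture's schema definition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# id is declared as the primary key, so its values must be unique.
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO student VALUES (1001, 'Doe, John')")

try:
    # Reusing id 1001 for a different student violates the primary key.
    conn.execute("INSERT INTO student VALUES (1001, 'Novak, Tim')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(duplicate_rejected)
```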
Normalization
Relational tables need to be normalized.
    Improve the stability of the model
        More resilient to change
    Faster record insertions and updates
    Improve data quality
There are six normal forms, but we will only consider the first two.
    Each normal form includes the lower normal forms.
    Example: A database in second normal form is also in first normal form.
First Normal Form (1NF)
Separate multi-valued data elements.
    Break the name fields into last name and first name fields.

Student
Id    Last   First   Teacher_id_1  Teacher_id_2  Teacher_id_3
1001  Doe    John    7003          7012          7008
1005  Novak  Tim     7012          7008          null
1009  Klein  Leslie  null          null          null
1014  Jane   Mary    7051          null          null
1021  Smith  Kim     7003          7012          7051

Teacher
Id    Last      First  Class_code  Subject               Room
7003  Rogers    Tom    926         Java programming      101
7008  Thompson  Art    908         Data structures       114
7012  Lane      John   951         Software engineering  210
7012  Lane      John   974         Operating systems     109
7051  Flynn     Mabel  931         Compilers             222
First Normal Form, cont’d
Move repeating data elements to a new table.

Student
Id    Last   First
1001  Doe    John
1005  Novak  Tim
1009  Klein  Leslie
1014  Jane   Mary
1021  Smith  Kim

Student_Teacher (linking table)
Student_id  Teacher_id
1001        7003
1001        7012
1001        7008
1005        7012
1005        7008
1014        7051
1021        7003
1021        7012
1021        7051

Teacher
Id    Last      First  Class_code  Subject               Room
7003  Rogers    Tom    926         Java programming      101
7008  Thompson  Art    908         Data structures       114
7012  Lane      John   951         Software engineering  210
7012  Lane      John   974         Operating systems     109
7051  Flynn     Mabel  931         Compilers             222
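
Moving the repeating Teacher_id_1 to Teacher_id_3 columns into a linking table amounts to emitting one (student id, teacher id) pair per non-null value. A minimal sketch in plain Python, with None standing in for null; the variable names are illustrative only.

```python
# Student rows in the repeating-column layout (None stands for null).
students = [
    (1001, "Doe", "John", 7003, 7012, 7008),
    (1005, "Novak", "Tim", 7012, 7008, None),
    (1009, "Klein", "Leslie", None, None, None),
    (1014, "Jane", "Mary", 7051, None, None),
    (1021, "Smith", "Kim", 7003, 7012, 7051),
]

# The Student table keeps only the single-valued attributes.
student_rows = [(sid, last, first) for sid, last, first, *_ in students]

# The linking table gets one row per (student, teacher) pair.
student_teacher = [
    (sid, tid)
    for sid, _last, _first, *teacher_ids in students
    for tid in teacher_ids
    if tid is not None
]

for row in student_teacher:
    print(row)
```

Leslie Klein has no teachers, so she contributes a row to Student but none to Student_Teacher, and a student with a fourth teacher would need one more row rather than one more column.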
Problem!
Suppose Prof. Lane decides he doesn’t want to teach Operating Systems anymore and we delete that row.
What other information do we lose as a result?
    We lose the fact that the class is taught in Room 109.
The problem arises because the Teacher table really contains two separate sets of data: teacher data and class data.

Teacher
Id    Last      First  Class_code  Subject               Room
7003  Rogers    Tom    926         Java programming      101
7008  Thompson  Art    908         Data structures       114
7012  Lane      John   951         Software engineering  210
7012  Lane      John   974         Operating systems     109
7051  Flynn     Mabel  931         Compilers             222
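
The lost information can be reproduced with sqlite3. This is a sketch that loads the combined Teacher table shown above (lowercase column names are an assumption) and deletes the Operating Systems row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE teacher (id INTEGER, last TEXT, first TEXT, "
    "class_code INTEGER, subject TEXT, room INTEGER)"
)
conn.executemany(
    "INSERT INTO teacher VALUES (?, ?, ?, ?, ?, ?)",
    [
        (7003, "Rogers", "Tom", 926, "Java programming", 101),
        (7008, "Thompson", "Art", 908, "Data structures", 114),
        (7012, "Lane", "John", 951, "Software engineering", 210),
        (7012, "Lane", "John", 974, "Operating systems", 109),
        (7051, "Flynn", "Mabel", 931, "Compilers", 222),
    ],
)

room_query = "SELECT room FROM teacher WHERE class_code = 974"
room_before = conn.execute(room_query).fetchone()

# Prof. Lane stops teaching Operating Systems, so we delete that row.
conn.execute("DELETE FROM teacher WHERE id = 7012 AND class_code = 974")
room_after = conn.execute(room_query).fetchone()

print(room_before, room_after)
```

After the delete, no row mentions class 974 at all, so its room number cannot be recovered from the database.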
Second Normal Form
Keep related data together (cohesiveness).

Teacher
Primary key (PK): Id
Id    Last      First
7003  Rogers    Tom
7008  Thompson  Art
7012  Lane      John
7051  Flynn     Mabel

Class
Primary key (PK): Class_code
Foreign key (FK): Teacher_id
Class_code  Teacher_id  Subject               Room
908         7008        Data structures       114
926         7003        Java programming      101
931         7051        Compilers             222
951         7012        Software engineering  210
974         7012        Operating systems     109
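
With teacher data and class data in separate tables, the earlier problem goes away: a class can lose its teacher without losing its room. The sketch below uses sqlite3; the REFERENCES clause, the nullable teacher_id, and the class 999 row are assumptions made for illustration. SQLite only enforces foreign keys after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.execute("CREATE TABLE teacher (id INTEGER PRIMARY KEY, last TEXT, first TEXT)")
conn.execute(
    "CREATE TABLE class ("
    "class_code INTEGER PRIMARY KEY, "
    "teacher_id INTEGER REFERENCES teacher(id), "
    "subject TEXT, room INTEGER)"
)
conn.execute("INSERT INTO teacher VALUES (7012, 'Lane', 'John')")
conn.execute("INSERT INTO class VALUES (974, 7012, 'Operating systems', 109)")

# Prof. Lane stops teaching the class: clear the FK, keep the class row.
conn.execute("UPDATE class SET teacher_id = NULL WHERE class_code = 974")
room = conn.execute("SELECT room FROM class WHERE class_code = 974").fetchone()[0]

try:
    # Hypothetical class that points at a teacher id that does not exist.
    conn.execute("INSERT INTO class VALUES (999, 1234, 'Hypothetical', 1)")
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True

print(room, fk_enforced)
```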
Final Database Structure

Teacher
Id    Last      First
7003  Rogers    Tom
7008  Thompson  Art
7012  Lane      John
7051  Flynn     Mabel

Student
Id    Last   First
1001  Doe    John
1005  Novak  Tim
1009  Klein  Leslie
1014  Jane   Mary
1021  Smith  Kim

Class
Code  Teacher_id  Subject               Room
908   7008        Data structures       114
926   7003        Java programming      101
931   7051        Compilers             222
951   7012        Software engineering  210
974   7012        Operating systems     109

Student_Class
Student_id  Class_code
1001        926
1001        951
1001        908
1005        974
1005        908
1014        931
1021        926
1021        974
1021        931

John Doe takes Java programming, software engineering, and data structures.
The Java programming class has John Doe and Kim Smith.
Mabel Flynn teaches compilers.
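
One way to check the three statements above is to load the four tables into an in-memory SQLite database and query them. The column types, the REFERENCES clauses, and the composite primary key on student_class are assumptions of this sketch; the lecture gives only the table contents.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE teacher (id INTEGER PRIMARY KEY, last TEXT, first TEXT);
CREATE TABLE student (id INTEGER PRIMARY KEY, last TEXT, first TEXT);
CREATE TABLE class (
    code INTEGER PRIMARY KEY,
    teacher_id INTEGER REFERENCES teacher(id),
    subject TEXT,
    room INTEGER
);
CREATE TABLE student_class (
    student_id INTEGER REFERENCES student(id),
    class_code INTEGER REFERENCES class(code),
    PRIMARY KEY (student_id, class_code)
);
INSERT INTO teacher VALUES
    (7003, 'Rogers', 'Tom'), (7008, 'Thompson', 'Art'),
    (7012, 'Lane', 'John'), (7051, 'Flynn', 'Mabel');
INSERT INTO student VALUES
    (1001, 'Doe', 'John'), (1005, 'Novak', 'Tim'), (1009, 'Klein', 'Leslie'),
    (1014, 'Jane', 'Mary'), (1021, 'Smith', 'Kim');
INSERT INTO class VALUES
    (908, 7008, 'Data structures', 114), (926, 7003, 'Java programming', 101),
    (931, 7051, 'Compilers', 222), (951, 7012, 'Software engineering', 210),
    (974, 7012, 'Operating systems', 109);
INSERT INTO student_class VALUES
    (1001, 926), (1001, 951), (1001, 908), (1005, 974), (1005, 908),
    (1014, 931), (1021, 926), (1021, 974), (1021, 931);
""")

# Subjects taken by John Doe (student 1001).
doe_subjects = [row[0] for row in conn.execute("""
    SELECT subject FROM class, student_class
    WHERE code = class_code AND student_id = 1001
    ORDER BY subject""")]

# Last names of the students in the Java programming class.
java_students = [row[0] for row in conn.execute("""
    SELECT last FROM student, class, student_class
    WHERE subject = 'Java programming'
      AND code = class_code AND id = student_id
    ORDER BY last""")]

# Subjects taught by Mabel Flynn.
flynn_subjects = [row[0] for row in conn.execute("""
    SELECT subject FROM teacher, class
    WHERE id = teacher_id AND last = 'Flynn' AND first = 'Mabel'""")]

print(doe_subjects, java_students, flynn_subjects)
```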
SQL
Structured Query Language (SQL)
    An industry standard
    But has many proprietary extensions
Language for managing data in a relational database
    Create and drop (delete) databases
    Create, alter, and drop tables of a database
    Retrieve, insert, update, and delete data in the tables
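
The table and data operations listed above can be walked through with sqlite3. This is a sketch on a cut-down, hypothetical version of the Class table. SQLite has no CREATE DATABASE statement (connecting creates the database), so that step appears only as a comment.

```python
import sqlite3

# In SQLite, connecting creates the database; other DBMSs use CREATE DATABASE.
conn = sqlite3.connect(":memory:")

# Create and alter a table.
conn.execute("CREATE TABLE class (code INTEGER PRIMARY KEY, subject TEXT)")
conn.execute("ALTER TABLE class ADD COLUMN room INTEGER")

# Insert, update, retrieve, and delete data.
conn.execute("INSERT INTO class (code, subject) VALUES (926, 'Java programming')")
conn.execute("UPDATE class SET room = 101 WHERE code = 926")
rows = conn.execute("SELECT code, subject, room FROM class").fetchall()
conn.execute("DELETE FROM class WHERE code = 926")
remaining = conn.execute("SELECT COUNT(*) FROM class").fetchone()[0]

# Drop the table.
conn.execute("DROP TABLE class")
tables_left = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()

print(rows, remaining, tables_left)
```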
SQL Query Examples
What is the class code of the Java programming class?

Class
Code  Teacher_id  Subject               Room
908   7008        Data structures       114
926   7003        Java programming      101
931   7051        Compilers             222
951   7012        Software engineering  210
974   7012        Operating systems     109

Query:

    SELECT code
    FROM class
    WHERE subject = 'Java programming'

    Desired fields: code
    Source tables: class
    Selection criteria: subject = 'Java programming'

Results:

    +------+
    | code |
    +------+
    | 926  |
    +------+
SQL Query Examples, cont’d
Who is teaching Java programming?

    SELECT first, last
    FROM teacher, class
    WHERE id = teacher_id
    AND subject = 'Java programming'

    +-------+--------+
    | first | last   |
    +-------+--------+
    | Tom   | Rogers |
    +-------+--------+

Selecting from multiple tables is called a join.
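
The same join can also be written with the explicit JOIN ... ON syntax, which most SQL dialects accept alongside the comma form. A sketch with sqlite3 that runs both forms on the Teacher and Class data and compares the results; the column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE teacher (id INTEGER PRIMARY KEY, last TEXT, first TEXT);
CREATE TABLE class (
    code INTEGER PRIMARY KEY, teacher_id INTEGER, subject TEXT, room INTEGER
);
INSERT INTO teacher VALUES
    (7003, 'Rogers', 'Tom'), (7008, 'Thompson', 'Art'),
    (7012, 'Lane', 'John'), (7051, 'Flynn', 'Mabel');
INSERT INTO class VALUES
    (908, 7008, 'Data structures', 114), (926, 7003, 'Java programming', 101),
    (931, 7051, 'Compilers', 222), (951, 7012, 'Software engineering', 210),
    (974, 7012, 'Operating systems', 109);
""")

# Comma-separated tables with the join condition in WHERE (as on the slide).
implicit = conn.execute("""
    SELECT first, last
    FROM teacher, class
    WHERE id = teacher_id
    AND subject = 'Java programming'""").fetchall()

# The same join with explicit JOIN ... ON and qualified column names.
explicit = conn.execute("""
    SELECT teacher.first, teacher.last
    FROM teacher
    JOIN class ON teacher.id = class.teacher_id
    WHERE class.subject = 'Java programming'""").fetchall()

print(implicit, explicit)
```

The explicit form keeps the join condition apart from the selection criteria, which tends to help once more than two tables are involved.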
SQL Query Examples, cont’d
What classes does John Lane teach?

    SELECT code, subject
    FROM teacher, class
    WHERE last = 'Lane' AND first = 'John'
    AND id = teacher_id

    +------+----------------------+
    | code | subject              |
    +------+----------------------+
    | 951  | Software engineering |
    | 974  | Operating systems    |
    +------+----------------------+
SQL Query Examples, cont’d
Who is taking Java programming?

    SELECT id, last, first
    FROM student, class, student_class
    WHERE subject = 'Java programming'
    AND code = class_code AND id = student_id

    +------+-------+-------+
    | id   | last  | first |
    +------+-------+-------+
    | 1001 | Doe   | John  |
    | 1021 | Smith | Kim   |
    +------+-------+-------+
SQL Query Examples, cont’d
Who are John Lane’s students and in which subjects?

    SELECT student.first, student.last, subject
    FROM student, teacher, class, student_class
    WHERE teacher.last = 'Lane' AND teacher.first = 'John'
    AND teacher_id = teacher.id
    AND code = class_code AND student.id = student_id
    ORDER BY subject, student.last

    +-------+-------+----------------------+
    | first | last  | subject              |
    +-------+-------+----------------------+
    | Tim   | Novak | Operating systems    |
    | Kim   | Smith | Operating systems    |
    | John  | Doe   | Software engineering |
    +-------+-------+----------------------+
XML Data
Data can also be stored as XML
XQuery is designed to query XML data
    SQL : relational databases
    XQuery : XML
XQuery Examples

Query: What titles are in the bookstore?

    doc("books.xml")/bookstore/book/title

Results:

    <title lang="en">Everyday Italian</title>
    <title lang="en">Harry Potter</title>
    <title lang="en">XQuery Kick Start</title>
    <title lang="en">Learning XML</title>

books.xml:

<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="CHILDREN">
    <title lang="en">Harry Potter</title>
    <author>J.K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="WEB">
    <title lang="en">XQuery Kick Start</title>
    <author>James McGovern</author>
    <author>Per Bothner</author>
    <author>Kurt Cagle</author>
    <author>James Linn</author>
    <author>Vaidyanathan Nagarajan</author>
    <year>2003</year>
    <price>49.99</price>
  </book>
  <book category="WEB">
    <title lang="en">Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>
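
Python's standard library has no XQuery engine, but the path in this query is simple enough to mirror with xml.etree.ElementTree, which supports a limited XPath subset. The sketch below is an analogy, not an XQuery implementation; it embeds an abbreviated copy of books.xml (authors omitted) instead of reading a file.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of books.xml (authors omitted to keep the sketch short).
BOOKS_XML = """<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title><year>2005</year><price>30.00</price>
  </book>
  <book category="CHILDREN">
    <title lang="en">Harry Potter</title><year>2005</year><price>29.99</price>
  </book>
  <book category="WEB">
    <title lang="en">XQuery Kick Start</title><year>2003</year><price>49.99</price>
  </book>
  <book category="WEB">
    <title lang="en">Learning XML</title><year>2003</year><price>39.95</price>
  </book>
</bookstore>"""

bookstore = ET.fromstring(BOOKS_XML)  # root element is <bookstore>

# Path relative to the root, mirroring /bookstore/book/title.
titles = [title.text for title in bookstore.findall("book/title")]

for t in titles:
    print(t)
```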
XQuery Examples, cont’d

Query: Which books cost less than $30?

    doc("books.xml")/bookstore/book[price<30]

Results:

    <book category="CHILDREN">
      <title lang="en">Harry Potter</title>
      <author>J.K. Rowling</author>
      <year>2005</year>
      <price>29.99</price>
    </book>
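
ElementTree's XPath subset does not support numeric comparisons such as [price<30], so an equivalent filter has to be written in Python. A sketch on an abbreviated copy of books.xml (authors and years omitted), again as an analogy to the XQuery predicate and not an XQuery implementation.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of books.xml (authors and years omitted).
BOOKS_XML = """<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title><price>30.00</price>
  </book>
  <book category="CHILDREN">
    <title lang="en">Harry Potter</title><price>29.99</price>
  </book>
  <book category="WEB">
    <title lang="en">XQuery Kick Start</title><price>49.99</price>
  </book>
  <book category="WEB">
    <title lang="en">Learning XML</title><price>39.95</price>
  </book>
</bookstore>"""

bookstore = ET.fromstring(BOOKS_XML)

# Keep the <book> elements whose <price> is numerically below 30.
cheap_books = [
    book
    for book in bookstore.findall("book")
    if float(book.findtext("price")) < 30
]
cheap_titles = [book.findtext("title") for book in cheap_books]

print(cheap_titles)
```

Everyday Italian costs exactly 30.00, so the strict comparison excludes it, which matches the single result shown above.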