Chapter 3 Distributed Data Processing

Data processing is done on one or on a cluster of computers located in a central data processing facility Users transmit data to the centralized data.

Mar 30, 2015

Damian Deaton
Transcript
Page 1: Data processing is done on one or on a cluster of computers located in a central data processing facility Users transmit data to the centralized data.

Chapter 3: Distributed Data Processing

Data Centers

A facility that houses computer systems and their associated components including storage and telecommunication systems

Can occupy a single room in a building, one or more floors, or an entire building

Much of the equipment consists of servers mounted in rack cabinets that are placed in rows, forming corridors that give people access to both the front and the rear of each cabinet

Mainframe computers and storage devices are placed alongside the racks

Air-conditioning is used to control temperature and humidity

Centralized Data Processing

Data processing is done on one computer or on a cluster of computers located in a central data processing facility

Users transmit data to the centralized data processing facility, where it is processed by applications running on the computers located there

The data processing for an application does not take place on the user’s computing device

Centralized Data Processing

Centralized Computers
One or more computers are located in a central facility

Centralized Processing
All applications are run on computers in the central data processing facility

Centralized Data
Most data are stored in files and databases at the central facility

Centralized Control
The central facility is managed by a data processing or information security manager

Centralized Support Staff
Must include a technical support staff to operate and maintain the data center hardware and applications

Figure: Dallas County Information Systems Architecture

Distributed Data Processing (DDP)

Computers are dispersed throughout an organization

Objective is to process information in a way that is most effective based on operational, economic, and/or geographic considerations

May include a central data center plus satellite data centers or it may resemble a community of peer computing facilities

Various computers in the system must be connected to one another

A DDP facility involves the distribution of computers, processing, and data

Figure: Carnival Valor Wireless LAN

Table 3.1 Requirements for the Corporate Computing Function

Table 3.2 Potential Benefits of Distributed Data Processing (page 1 of 2)

Table 3.2 Potential Benefits of Distributed Data Processing (page 2 of 2)

Table 3.3 Potential Drawbacks of Distributed Data Processing

Table 3.4 Major Characteristics of Data Center Tiers

Data Center Computing and Storage Technologies

Mainframes

Sales continue to be strong, and mainframes are increasingly being used as a hub for enterprise infrastructure because of their potential to enhance security, ensure availability, and improve manageability

In-memory computing systems

Processors include terabyte-plus RAM capable of storing large data sets

Have the potential to revolutionize business intelligence (BI) by making it possible to bring the equivalent of a data warehouse into memory to enable real-time data mining and business analytics

Virtualization

The creation of a virtual (rather than actual) version of something

In computing this means creating virtual versions of operating systems, servers, storage devices, and networks

Categories:

• Operating system virtualization
• Server virtualization
• Storage virtualization
• Network virtualization

Client/Server Architecture (C/S)

Combines advantages of distributed and centralized computing

Cost-effective and achieves economies of scale by centralizing support for specialized functions

Flexibility is provided by the fact that the functional services provided by servers are not necessarily in a one-to-one relation with physical computers

Three Tier Enterprise System Architecture

Intranets

Provides users of client devices with applications associated with the Internet but isolated within the organization

Key features:

• Uses Internet-based standards such as HyperText Markup Language (HTML) and the Simple Mail Transfer Protocol (SMTP)
• Uses the TCP/IP protocol suite applications and services
• Includes wholly owned content that is not accessible to external users over the public Internet
• Such content can also be accessed by authorized internal users even though the corporation has Internet connections and runs a Web server on the Internet

Extranets

Makes use of TCP/IP protocols and applications, especially the Web

Distinguishing feature is that it gives authorized outside clients access to corporate resources

This outside access can be provided via the company’s connections to the Internet or through other data communications networks

Provides authorized outside clients with fairly extensive access to corporate resources

Typical model of operation is client/server

Application Service Provider (ASP)

Businesses that provide computer-based services to business subscribers over a network

Software that ASPs provide is called on-demand software or software as a service (SaaS)

Costs and complexities of sophisticated software can be reduced to levels that small and medium-size firms can afford

• Software is kept up to date
• 24x7x365 technical support is provided
• Physical and electronic security is provided

Service-level agreements

• Guarantee certain levels of service, such as availability

Cloud Computing

Encompasses any subscription-based or pay-per-use service that extends an organization’s existing IT capabilities over the Internet in real time

Enables businesses to increase capabilities or capacity without investing in new infrastructure, licensing new software, or training personnel

Forms of cloud computing:

• Software as a service (SaaS)
• Infrastructure as a service (IaaS)
• Platform as a service (PaaS)
• Managed service providers (MSP)

Distributed Applications

Two dimensions characterize the distribution of applications

Allocation of application functions within the network

• One application may be split up into components that are dispersed among multiple computers
• One application may be replicated on different computers
• Different applications may be distributed among different computers

Whether the distribution of the application is vertical or horizontal

Vertical Partitioning

Involves one application split up into components that are dispersed among a number of machines

Examples:

• Insurance
  • Branch office operations and head office operations
• Retail chains
  • Point-of-sale terminals
  • Office and sales personnel computers
• Process control
  • Each major operational area is controlled by a console or workstation that is fed information from distributed process-control microprocessors
• Web mashups
  • Created by integrating data from multiple sources to create a new application
  • Google, eBay, Amazon
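The retail-chain example can be sketched in a few lines of Python. This is only an illustration of the idea of one application split into cooperating components: the function names, the record layout, and the two stores are hypothetical and do not come from the slides. The store-level component reduces raw point-of-sale records to totals, and the head-office component merges what each store sends up.

```python
from collections import defaultdict


def summarize_store_sales(sales):
    """Store-level component (hypothetical): reduce raw point-of-sale
    records, given as (product, amount) pairs, to per-product totals."""
    totals = defaultdict(float)
    for product, amount in sales:
        totals[product] += amount
    return dict(totals)


def consolidate_at_head_office(store_summaries):
    """Head-office component (hypothetical): merge the summaries that
    each store sends up into chain-wide totals."""
    chain_totals = defaultdict(float)
    for summary in store_summaries.values():
        for product, amount in summary.items():
            chain_totals[product] += amount
    return dict(chain_totals)


# Each store processes its own transactions locally.
store_a = summarize_store_sales([("tea", 4.0), ("coffee", 3.0), ("tea", 2.0)])
store_b = summarize_store_sales([("coffee", 5.0)])

# Only the summaries travel to the head office.
chain = consolidate_at_head_office({"store_a": store_a, "store_b": store_b})
print(chain)
```

In this sketch only the small summaries cross the network, which is one plausible reason to split an application this way; the slides themselves do not prescribe where the split should fall.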

Horizontal Partitioning

Involves either one application replicated on a number of machines or a number of different applications distributed among a number of machines

Data processing is distributed among a number of computers that have a peer relationship

Computers normally operate autonomously

Examples:

• Small office/home office (SOHO) peer-to-peer networks
  • Users are linked together in peer-to-peer LANs
  • Access rights to sharable resources are governed by setting sharing permissions on the individual machines
• Air traffic control system
  • Each regional center for air traffic control operates autonomously of the other centers, performing the same set of applications
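The air traffic control example might be modeled as below. This is a toy sketch under stated assumptions: the `RegionalCenter` class, its methods, the region names, and the flight identifiers are all invented for illustration. The point it tries to show is that every peer runs the same application against its own data, and no peer depends on another to answer a query.

```python
class RegionalCenter:
    """One peer (hypothetical): runs the full application on local data."""

    def __init__(self, region):
        self.region = region
        self.flights = {}  # flight id -> altitude in feet

    def track(self, flight_id, altitude_ft):
        """Record or update a flight handled by this center."""
        self.flights[flight_id] = altitude_ft

    def flights_below(self, altitude_ft):
        """Answer a query using only this center's own data."""
        return sorted(f for f, alt in self.flights.items() if alt < altitude_ft)


# The same application is replicated at every center.
centers = {name: RegionalCenter(name) for name in ("north", "south")}
centers["north"].track("NA100", 31000)
centers["north"].track("NA200", 9000)
centers["south"].track("SA300", 8000)

# Each center operates autonomously of the others.
low_north = centers["north"].flights_below(10000)
low_south = centers["south"].flights_below(10000)
print(low_north, low_south)
```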

Other Forms of DDP

Distributed devices

• Automated teller machines (ATMs)
• Factory automation

Network management

• Centralized systems provide management and control of distributed nodes
• At least some of the computers in the distributed system must include some management and control logic to enable them to interact with the central network management system

Database Management Systems (DBMS)

Database

• A structured collection of data stored for use in one or more applications
• In addition to data, a database contains the relationships between data items and groups of data items

DBMS

• A suite of programs for constructing and maintaining the database and for offering ad hoc query capabilities to multiple users and applications

Query language

• Provides a uniform interface to the database
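As a small illustration of these three terms, the sketch below uses Python's standard `sqlite3` module, with SQLite standing in for a DBMS and SQL for the query language. The slides do not name a particular product, so that choice, along with the table names and sample rows, is an assumption made for the example.

```python
import sqlite3

# An in-memory database, so the example leaves nothing behind.
conn = sqlite3.connect(":memory:")

# The database holds data plus the relationship between data items:
# each employee row refers to a department row.
conn.execute("CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE employee ("
    "id INTEGER PRIMARY KEY, name TEXT, "
    "dept_id INTEGER REFERENCES department(id))"
)
conn.executemany(
    "INSERT INTO department VALUES (?, ?)", [(1, "Sales"), (2, "Support")]
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Ana", 1), (2, "Ben", 1), (3, "Cy", 2)],
)

# An ad hoc query: nothing about it was planned when the tables were built.
rows = conn.execute(
    "SELECT d.name, COUNT(*) FROM employee e "
    "JOIN department d ON e.dept_id = d.id "
    "GROUP BY d.name ORDER BY d.name"
).fetchall()
conn.close()
print(rows)
```

Any application that speaks the query language can ask a question like this one without knowing how the rows are stored, which is the sense in which the interface is uniform.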

Database Management Systems

Database Organization

Distributed database

A collection of several different databases, distributed among multiple computers, that looks like a single database to the user

DBMS controls access

Three ways of organizing data for use by an organization:

1. Centralized

2. Replicated

3. Partitioned
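One way to picture the three options is as three answers to the question "which sites hold this record?". The sketch below is an assumption-laden illustration, not a description of any real DBMS: the site names, the function names, and the checksum-based placement rule are all hypothetical, and it assumes full replication and non-overlapping partitions.

```python
import zlib

# Hypothetical sites that could hold data.
SITES = ["site_a", "site_b", "site_c"]


def centralized_sites(key):
    """Centralized: every record lives at the one central site."""
    return [SITES[0]]


def replicated_sites(key):
    """Replicated: a copy of every record is kept at each site
    (full replication is assumed here)."""
    return list(SITES)


def partitioned_sites(key):
    """Partitioned: each record lives at exactly one site. The site is
    picked with a CRC-32 checksum of the key, an arbitrary rule chosen
    only to make the placement deterministic."""
    index = zlib.crc32(key.encode("utf-8")) % len(SITES)
    return [SITES[index]]


for key in ("customer-17", "customer-42"):
    print(key, centralized_sites(key), replicated_sites(key), partitioned_sites(key))
```

A real partitioning scheme would more likely follow geography or business function than a checksum; the checksum only keeps the sketch short.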

Centralized Versus Distributed Databases

Centralized

• Housed in a central computer facility
• Users and applications can be at a remote location
• Desirable when the security and integrity of the data are paramount
• Often used with a vertical DDP organization

Distributed

• Design of data organization is more understandable and easier to implement
• Data can be stored locally under local control
• Confines the effects of a computer breakdown to its point of occurrence
• Collection of data and the number of users are not limited by a single computer’s size and processing power

Table 3.5 Replication Strategy Variants

Table 3.6 Advantages and Disadvantages of Database Distribution Methods

Table 3.7 Strategies for Database Organization

Networking Implications of DDP

Full Connectivity Using a Central Switch

Availability and Performance

Availability

• The percentage of time that a particular function or application is available for users
• Can be “desirable” or “essential”

High availability requirements

• Distributed system must be designed so that the failure of a single computer or device within the network does not deny access to the application
• Communications links and equipment must be highly available
• Some form of link and communication equipment redundancy and backup is needed

Performance

• Response time is critically important for highly interactive applications
• Network must have sufficient capacity and flexibility to provide the required response time
• If time is not critical, the major network performance concern is throughput
• The network must be designed to handle large volumes of data
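The case for link redundancy can be made with simple arithmetic. The sketch below assumes that components fail independently, which real networks only approximate, and the availability figures are made-up inputs rather than values from the slides. Components that must all work multiply their availabilities; redundant copies fail only if every copy fails.

```python
def series_availability(availabilities):
    """Availability of a chain in which every component must be up
    (assumes independent failures)."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result


def redundant_availability(a, copies):
    """Availability of `copies` identical components in parallel, where
    one working copy is enough (assumes independent failures)."""
    return 1.0 - (1.0 - a) ** copies


# Illustrative figures only.
single_link = 0.99
dual_links = redundant_availability(single_link, 2)

# A server reached over the redundant pair of links.
path = series_availability([0.999, dual_links])

print(f"single link: {single_link:.4%}")
print(f"two redundant links: {dual_links:.4%}")
print(f"server plus redundant links: {path:.4%}")
```

Under these assumptions a second link raises link availability from 99% to 99.99%, after which the server becomes the weaker element in the path.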

Table 3.8 Traditional Data Storage/Management Technologies

Summary

Centralized and distributed organization

Technical trends leading to distributed data processing

Management and organizational considerations

Data center evolution

Client/server architecture

Intranets and extranets

Web services and cloud computing

Distributed applications

Other forms of DDP

Database management systems

Centralized versus distributed databases

Replicated and partitioned databases

Networking implications of DDP

Big data infrastructure considerations