REQUIREMENTS ENGINEERING

1. Introduction

The objectives of this module are:
- To establish the importance / relevance of requirement specifications in software development
- To bring out the problems involved in specifying requirements
- To illustrate the use of modelling techniques to minimise problems in specifying requirements

Requirements can be defined as follows:
- A condition or capability needed by a user to solve a problem or achieve an objective.
- A condition or capability that must be met or possessed by a system to satisfy a contract, standard, specification, or other formally imposed document.

At a high level, requirements can be classified as user/client requirements and software requirements. Client requirements are usually stated in terms of business needs. Software requirements specify what the software must do to meet those business needs. For example, a stores manager might state his requirements in terms of efficiency in stores management, while a bank manager might state his requirements in terms of the time taken to service his customers. It is the analyst's job to understand these requirements and provide an appropriate solution. To be able to do this, the analyst must understand the client's business domain: who the stakeholders are, how they affect the system, what the constraints are, what can be altered, and so on. The analyst should not blindly assume that only a software solution will solve the client's problem; he should have a broader vision. Sometimes re-engineering of the business processes may be all that is needed to improve efficiency. After all this, if it is found that a software solution will add value, then a detailed statement of what the software must do to meet the client's needs should be prepared. This document is called the Software Requirements Specification (SRS).

Stating and understanding requirements is not an easy task. Let us look at a few examples:

"The counter value is picked up from the last record." Here the word 'last' is ambiguous. It could mean the last accessed record, which could be anywhere in a random access file, or it could be physically the last record in the file.

"Calculate the inverse of a square matrix 'M' of size 'n' such that LM = ML = In, where 'L' is the inverse matrix and 'In' is the identity matrix of size 'n'." This statement, though it appears to be complete, does not specify the type of the matrix elements. Are they integers, real numbers, or complex numbers? Depending on the answer to this question, the algorithm will be different.

"The software should be highly user friendly." How does one determine whether this requirement is satisfied or not?

"The output of the program shall usually be given within 10 seconds."


What are the exceptions to the 'usual 10 seconds' requirement? The statement of requirements, or SRS, should possess the following properties:
- All requirements must be correct. There should be no factual errors.
- All requirements should have only one interpretation. We have seen a few examples of ambiguous statements above.
- The SRS should be complete in all respects. It is difficult to achieve this objective; clients often change the requirements as the development progresses, or new requirements are added. The Agile development methodologies are specifically designed to take this factor into account: they partition the requirements into subsets called scenarios, and each scenario is implemented separately. However, each scenario should be complete.
- All requirements must be verifiable, that is, it should be possible to verify whether a requirement is met or not. Words like 'highly' and 'usually' should not be used.
- All requirements must be consistent and non-conflicting.
- As we have stated earlier, requirements do change, so the format of the SRS should be such that changes can be easily incorporated.

2. Understanding Requirements

2.1 Functional and Non-Functional Requirements

Requirements can be classified into two types, namely, functional requirements and non-functional requirements.

Functional requirements specify what the system should do. Examples are:
- Calculate the compound interest at the rate of 14% per annum on a fixed deposit for a period of three years
- Calculate tax at the rate of 30% on an annual income equal to or above Rs.2,00,000 but less than Rs.3,00,000
- Invert a square matrix of real numbers (maximum size 100 x 100)

Non-functional requirements specify the overall quality attributes the system must satisfy. The following is a sample list of quality attributes: portability, reliability, performance, testability, modifiability, security, presentation, reusability, understandability, acceptance criteria, interoperability.

Some examples of non-functional requirements are:
- The number of significant digits to which accuracy should be maintained in all numerical calculations is 10


- The response time of the system should always be less than 5 seconds
- The software should be developed using the C language on a UNIX based system
- A book can be deleted from the Library Management System by the Database Administrator only
- The matrix diagonalisation routine should zero out all off-diagonal elements which are equal to or less than 10^-3
- Experienced officers should be able to use all the system functions after a total training of two hours. After this training, the average number of errors made by experienced officers should not exceed two per day.

2.2 Other Classifications

Requirements can also be classified into the following categories:
- Satisfiability
- Criticality
- Stability
- User categories

Satisfiability: There are three types of satisfiability, namely, normal, expected, and exciting. Normal requirements are specific statements of user needs. The user satisfaction level is directly proportional to the extent to which these requirements are satisfied by the system. Expected requirements may not be stated by the users, but the developer is expected to meet them. If these requirements are met, the user satisfaction level may not increase, but if they are not met, users may be thoroughly dissatisfied. They are very important from the developer's point of view. Exciting requirements are not only unstated by the users, the users do not even expect them; but if the developer provides for them in the system, user satisfaction will be very high. The trend over the years has been that exciting requirements often become normal requirements, and some of the normal requirements become expected requirements. For example, as the story goes, the on-line help feature was first introduced in the UNIX system in the form of man pages. At that time, it was an exciting feature. Later, other users started demanding it as part of their systems. Nowadays, users do not ask for it, but the developer is expected to provide it.

Criticality: This is a form of prioritising the requirements. They can be classified as mandatory, desirable, and non-essential. This classification should be done in consultation with the users and helps in determining the focus in an iterative development model.

Stability: Requirements can also be categorised as stable and non-stable. Stable requirements don't change often, or at least the time period of change will be very long. Some requirements may change often. For example, if business process re-engineering is going on alongside the development, then the corresponding requirements may change till the process stabilises.


User categories: As was stated in the introduction, there will be many stakeholders in a system. Broadly, they are of two kinds: those who dictate the policies of the system and those who utilise the services of the system. All of them use the system. There can be further subdivisions among these classes depending on the information needs and services required. It is important that all stakeholders are identified and their requirements are captured.

3. Modelling Requirements

3.1 Overview

Every software system has the following essential characteristics:
- It has a boundary. The boundary separates what is within the system scope and what is outside.
- It takes inputs from external agents and generates outputs.
- It has processes which collaborate with each other to generate the outputs.
- These processes operate on data by creating, modifying, destroying, and querying it.
- The system may also use data stores to store data which has a life beyond the system.

In the following, we will be describing the artifacts used by the Structured Systems Analysis and Design Methodology (SSADM). It uses:
- Data Flow Diagram (DFD) for modelling processes and their interactions
- Entity Relationship Diagram (ERD) for modelling data and their relationships
- Data Dictionary to specify data
- Decision Tables and Decision Trees to model complex decisions
- Structured English to paraphrase process algorithms
- State Transition Diagram to model state changes of the system

3.2 Data Flow Diagram (DFD)

The data flow diagram focuses on the movement of data through the system and its transformations. It is divided into levels. Level 0, also known as the context diagram, defines the system scope. It consists of external agents, the system boundary, and the data flow between the external agents and the system. Level 1 is an explosion of Level 0, where all the major processes, data stores, and the data flows between them are shown. Level 2, Level 3, etc. show the details of individual processes.

The notation used is the following:

External agents: They are external to the system, but interact with the system. They must be drawn at level 0, but need not be drawn at level 2 onwards. Duplicates are to be identified. They must be given meaningful names.


Process: They indicate an information processing activity. They must be shown at all levels. At level 0, only a single process, depicting the system, is shown. On subsequent levels, the number of processes should be limited to 7 ± 2. No duplicates are allowed.

Data Stores: They are used to store information. They are not shown at level 0. All data stores should be shown at level 1. Duplicates must be indicated.

Data Flows: They indicate the flow of information. They must be shown at all levels and meaningful names must be given.

Examples:

1. Customer places sales orders. The system checks for availability of products and updates sales information.

2. Company receives applications. Checks for eligibility conditions. Invites all eligible candidates for interview. Maintains a list of all candidates called for interview. Updates the eligibility conditions as and when desired by the management.

Getting started:
- Identify the inputs or events which trigger the system and the outputs or responses from the system.
- Identify the corresponding sources and destinations (external agents).
- Produce a context diagram (Level 0). It should show the system boundary, the external agents, and the dataflows connecting the system and the external agents.
- Produce the Level 1 diagram. It must show all the external agents, all the major processes, all the data stores, and all the dataflows connecting the various artifacts. The artifacts should be placed based on logical precedence rather than temporal precedence. Avoid dataflow crossings.
- Refine the Level 1 diagram. Explode the individual processes as necessary.

Points to remember:

1) Remember to name every external agent, every process, every data store, and every dataflow.

2) Do not show how things begin and end.

3) Do not show loops and decisions.

4) Do not show dataflows between external agents. They are outside the scope of the system.


5) Do not show dataflow between an external agent and a data store. There should be a process in between.

6) Do not show dataflow between two data stores. There should be a process in between.

7) There should not be any unconnected external agent, process, or data store.

8) Beware of read-only or write-only data stores.

9) Beware of processes which take inputs without generating any outputs. Also, beware of processes which generate outputs spontaneously without taking any inputs.

10) Ensure that the data flowing into a process exactly matches the data flowing into the exploded view of that process. Similarly for the data flowing out of the process.

11) Ensure that the data flowing out of a data store matches data that has been stored in it before.

See the appendix for the complete data flow diagram of "Material Procurement System (Case Study)"


3.3 Entity Relationship Diagram (ERD)

The ERD complements the DFD. While the DFD focuses on processes and the data flow between them, the ERD focuses on data and the relationships between them. It helps to organise the data used by a system in a disciplined way. It helps to ensure completeness, adaptability and stability of data. It is an effective tool to communicate with senior management (what is the data needed to run the business), data administrators (how to manage and control data), and database designers (how to organise data efficiently and remove redundancies). It consists of three components.

Entity: It represents a collection of objects or things in the real world whose individual members or instances have the following characteristics:
- Each can be identified uniquely in some fashion.
- Each plays a necessary role in the system we are building.
- Each can be described by one or more data elements (attributes).

Entities generally correspond to persons, objects, locations, events, etc. Examples are employee, vendor, supplier, materials, warehouse, delivery, etc.

There are five types of entities.


Fundamental entity: It does not depend on any other entity for its existence, for example, materials.

Subordinate entity: It depends on another entity for its existence. For example, in an inventory management system, a purchase order can be an entity and it will depend on the materials being procured. Similarly, invoices will depend on purchase orders.

Associative entity: It depends on two or more entities for its existence. For example, student grades will depend on the student and the course.

Generalisation entity: It encapsulates the common characteristics of many subordinate entities. For example, a four wheeler is a type of vehicle, and a truck is a type of four wheeler.

Aggregation entity: It consists of, or is an aggregation of, other entities. For example, a car consists of an engine, chassis, gear box, etc. A vehicle can also be regarded as an aggregation entity, because a vehicle can be regarded as an aggregation of many parts.

Attributes: They express the properties of the entities.

Every entity will have many attributes, but only a subset that is relevant for the system under study will be chosen. For example, an employee entity will have professional attributes like name, designation, salary, etc. and also physical attributes like height, weight, etc. But only one set will be chosen, depending on the context.

Attributes are classified as entity keys and entity descriptors. Entity keys are used to uniquely identify instances of entities. Attributes having unique values are called candidate keys, and one of them is designated as the primary key. The domains of the attributes should be pre-defined. If 'name' is an attribute of an entity, then its domain is the set of strings of alphabetic characters of a predefined length.

Relationships: They describe the association between entities.

They are characterised by optionality and cardinality.

Optionality is of two types, namely, mandatory and optional. A mandatory relationship means that with every instance of the first entity there will be associated at least one instance of the second entity. An optional relationship means that there may be instances of the first entity which are not associated with any instance of the second entity. For example, the employee-spouse relationship has to be optional because there could be unmarried employees. It is not correct to make the relationship mandatory.

Cardinality is of three types: one-to-one, one-to-many, and many-to-many. A one-to-one relationship means an instance of the first entity is associated with only one instance of the second entity, and similarly, each instance of the second entity is related to one instance of the first entity. A one-to-many relationship means that one instance of the first entity is related to many instances of the second entity, while an instance of the second entity is associated with only one instance of the first entity.


In a many-to-many relationship, an instance of the first entity is related to many instances of the second entity, and the same is true in the reverse direction also.
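As a rough illustration (not part of the original text), cardinality often surfaces in record layouts; the Customer, Order, and Enrolment names below are hypothetical sketches in C, not entities from the case study:

/* Hypothetical record layouts illustrating cardinality. */
struct Customer {               /* the "one" side of a one-to-many link      */
    int  customer_id;           /* primary key                               */
    char customer_name[40];
};

struct Order {                  /* the "many" side: each order refers to     */
    int order_id;               /* exactly one customer                      */
    int customer_id;            /* key of the owning Customer instance       */
};

struct Enrolment {              /* a many-to-many link (student/course) is   */
    int student_id;             /* usually resolved through an associative   */
    int course_id;              /* entity such as this                       */
};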

Other types of relationships are multiple relationships between entities, relationships leading to associative entities, the relationship of an entity with itself, and EXCLUSIVE-OR and AND relationships.

ERD notation: There are two types of notation in use: the Peter Chen notation and the Bachman notation.

Not surprisingly, Peter Chen and Bachman are the inventors of these notations. The notation table, which shows the Peter Chen and Bachman symbols for an entity or object type (e.g. PURCHASE ORDER), relationship, cardinality, and optionality, is given as a figure.

5. A teacher teaches many students and a student is taught by many teachers. A teacher conducts examinations for many students and a student is examined by many teachers.

6. An extension of example-3 above is that student grades depend upon both the student and the course; hence they form an associative entity.

7. An employee can play the role of a manager. In that sense, an employee reports to another employee.

8. A tender is floated either for materials or services but not both.

9. A car consists of an engine and a chassis.


3.4 Data Dictionary

It contains:
- an organised list of data elements, data structures, data flows, and data stores
- mini specifications of the primitive processes in the system
- any other details which will provide useful information on the system

A data element is a piece of data which cannot be decomposed further in the current context of the system.

Examples are purchase_order_no., employee_name, interest_rate, etc. Each data element is a member of a domain. The dictionary entry of a data element should also specify the domain.

A data structure is composed of data elements or other data structures.

Examples are Customer_details, which may be composed of Customer_name and Customer_address; Customer_address in turn is a structure. Another example is Invoice, which may be composed of Invoice_identification, Customer_details, Delivery_address, and Invoice_details.

A data flow is composed of data structures and/or data elements. Definitions of the dependent data structures/data elements precede the definition of the data flow. While defining a data flow, the connecting points should be mentioned.

It is also useful to include the flow volume/frequency and growth rates.

A data store, like a data flow, is made up of a combination of data structures and/or data elements. The description is similar to that of data flows.

The notation elements used in the data dictionary are the following (see the sketch at the end of this subsection):
- [spouse_name] indicates that spouse_name is optional.
- {dependent_name, relationship} * (0 to 15) indicates that the data structure can be repeated 0 to 15 times.
- {expense_description, company_name, charge} * (1 to N) indicates that the data structure may be repeated 1 to N times, where N is not fixed.
- voter_identity_number / customer_account_number indicates that either of the elements will be present.

The data dictionary also contains mini specifications. They state the ways in which the data flows that enter a primitive process are transformed into the data flows that leave the process.


Only the broad outline is given, not the detailed steps. Mini specifications must exist for every primitive process. Structured English is used for stating them.
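As a rough C sketch of the notation elements listed above (all field names and sizes are illustrative assumptions, not taken from the case study):

/* Illustrative mapping of data dictionary notation to a concrete record. */
struct Dependent {
    char dependent_name[40];
    char relationship[20];
};

struct EmployeeRecord {
    char employee_name[40];

    int  has_spouse;                  /* [spouse_name]: optional element            */
    char spouse_name[40];

    int  dependent_count;             /* {dependent_name, relationship} * (0 to 15) */
    struct Dependent dependents[15];

    int  uses_voter_id;               /* voter_identity_number /                    */
    char voter_identity_number[20];   /* customer_account_number: exactly one of    */
    char customer_account_number[20]; /* the two identifiers is present             */
};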

Once the DFD, ERD, and the data dictionary are created, the three of them must be matched against each other. The DFD and ERD can be created independently and in parallel. Every data store in the DFD must correspond to at least one entity in the ERD. There should be processes in the DFD which create, modify, and delete instances of the entities in the ERD. For every relationship in the ERD there should be a process in the DFD which uses it. For every description in the data dictionary, there should be corresponding elements in the DFD and ERD.

3.5 Decision Tree and Decision Tables

A decision tree represents complex decisions in the form of a tree. Though it is visually appealing, it can soon get out of hand when the number and complexity of decisions increase. An example is given below: first the textual statement is given, and then the corresponding decision tree.

Rules for electricity billing are as below:

If the meter reading is "OK", calculate on consumption basis(i.e. meter reading)If the meter reading appears "LOW", then check if the house is occupiedIf the house is occupied, calculate on seasonal consumption basis otherwise calculate on consuption basis If the meter is damaged, calculate based on maximum possible electricity usage

There are two types of decision tables: binary-valued (yes or no) and multi-valued. An example follows:

ELECTRICITY BILL CALCULATION BASED ON CUSTOMER CLASS

If a customer uses electricity for domestic purposes and the consumption is less than 300 units per month, then bill with minimum monthly charges. Domestic customers with a consumption of 300 units or more per month are billed at a special rate. Non-domestic users are charged double that of domestic users (both the minimum and special rates are doubled).

BINARY-VALUED DECISION TABLE

Domestic customer                  Y  Y  N  N
Consumption < 300 units per month  Y  N  Y  N
Minimum rate                       Y  N  N  N
Special rate                       N  Y  N  N
Double minimum rate                N  N  Y  N
Double special rate                N  N  N  Y

MULTI-VALUED DECISION TABLE

Customer     D     D     N     N
Consumption  ≥300  <300  ≥300  <300
Rate         S     M     2S    2M

Like decision trees, binary-valued decision tables can grow large if the number of rules increases. Multi-valued decision tables have an edge. In the above example, if we add a new class of customers, called Academic, with the rules:

If the consumption is less than 300 units per month, then bill at concessional rates; otherwise bill at twice the concessional rates. The new tables will then look like the following:

BINARY-VALUED DECISION TABLE (three rows and two columns are added to deal with the extra class of customers)

Academic                       N  N  N  N  Y  Y
Domestic customer              Y  Y  N  N  N  N
Consumption < 300 units/month  Y  N  Y  N  Y  N
Minimum rate                   Y  N  N  N  N  N
Special rate                   N  Y  N  N  N  N
Twice minimum rate             N  N  Y  N  N  N
Twice special rate             N  N  N  Y  N  N
Concessional rate              N  N  N  N  Y  N
Twice concessional rate        N  N  N  N  N  Y

MULTI-VALUED DECISION TABLE (only two columns are added to deal with the extra class of customers)

Customer     Domestic  Domestic  Non-domestic   Non-domestic   Academic            Academic
Consumption  ≥300      <300      ≥300           <300           ≥300                <300
Rate         Special   Minimum   Twice special  Twice minimum  Twice concessional  Concessional
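As a rough C sketch, the multi-valued table (including the Academic class) maps almost row-for-row onto a small function; the enum names and the rate constants below are illustrative placeholders, not actual tariff values:

/* Rate selection based on the multi-valued decision table. */
typedef enum { DOMESTIC, NON_DOMESTIC, ACADEMIC } CustomerClass;

#define MINIMUM_RATE      1.00   /* illustrative values only */
#define SPECIAL_RATE      2.00
#define CONCESSIONAL_RATE 0.50

double billing_rate(CustomerClass customer, double units_per_month)
{
    int high = (units_per_month >= 300);                 /* consumption >= 300 units */
    switch (customer) {
    case DOMESTIC:     return high ? SPECIAL_RATE          : MINIMUM_RATE;
    case NON_DOMESTIC: return high ? 2 * SPECIAL_RATE      : 2 * MINIMUM_RATE;
    case ACADEMIC:     return high ? 2 * CONCESSIONAL_RATE : CONCESSIONAL_RATE;
    default:           return MINIMUM_RATE;
    }
}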

3.6 Structured English

Structured English is used to specify the processes (mini specifications).

It consists of:
- sequences of instructions (action statements)
- decisions (if-else)
- loops (repeat-until)
- case
- groups of instructions

Examples:

Loose, normal English:

In the case of 'Bill', a master file is updated with the bill (that is, the consumer account number and bill date). A control file is also to be updated with the 'total bill amount'. A similar treatment is to be given to 'Payment'.

Structured English:

If transaction is 'BILL' then
    update bill in the Accounts master file
    update total bill amount in the Control file

If transaction is 'PAYMENT' then
    update receipt in the Accounts master file
    update total receipt amount in the Control file
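The same logic, written as a rough C sketch (the file-update calls are replaced by hypothetical printf stubs, since the Accounts master and Control files are not defined here):

#include <stdio.h>
#include <string.h>

/* Stubs standing in for the Accounts master and Control file updates. */
static void update_accounts_master(const char *what, double amount) {
    printf("Accounts master: record %s of %.2f\n", what, amount);
}
static void update_control_file(const char *what, double amount) {
    printf("Control file: add %.2f to total %s amount\n", amount, what);
}

void process_transaction(const char *transaction, double amount)
{
    if (strcmp(transaction, "BILL") == 0) {
        update_accounts_master("bill", amount);
        update_control_file("bill", amount);
    } else if (strcmp(transaction, "PAYMENT") == 0) {
        update_accounts_master("receipt", amount);
        update_control_file("receipt", amount);
    }
}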

Another example:

If previous reading and new reading match then
    perform 'status-check'
If status is 'dead' then
    calculate bill based on average consumption
Else
    compute bill based on actual consumption

status-check

If meter does not register any change after switching on any electrical device then
    meter status is 'dead'
Else
    meter status is 'ok'

3.7 State Transition Diagram

Another useful diagram is the state transition diagram. It can be used to model the state changes of the system. A system is in a state and will remain in that state until a condition and an action force it to change state. See the following figure. The appendix contains another example.
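As a rough sketch of the idea in C (the states and triggering conditions below are illustrative; they are not taken from the figure):

/* A minimal state machine: the system stays in its current state until a
   condition forces a transition. */
typedef enum { IDLE, PROCESSING, DONE } State;

State next_state(State current, int input_ready, int work_finished)
{
    switch (current) {
    case IDLE:       return input_ready   ? PROCESSING : IDLE;
    case PROCESSING: return work_finished ? DONE       : PROCESSING;
    case DONE:
    default:         return DONE;    /* no transition out of the final state */
    }
}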


4. Conclusion

The output of the requirements engineering phase is the Software Requirements Specification (SRS) document. At a minimum, it should contain the DFD, ERD, data dictionary, and mini specifications. The other diagrams may be used as required. The standards body of the Institute of Electrical and Electronics Engineers (IEEE) has defined a set of recommended practices called "IEEE Recommended Practice for Software Requirements Specifications", IEEE Standard 830-1998. It can be used as a guideline document for the SRS.

SYSTEM DESIGN OVERVIEW

1. Introduction to Design

1.1 Introduction

Design is an iterative process of transforming the requirements specification into a design specification.

Consider an example where Mrs. & Mr. XYZ want a new house. Their requirements include:

- a room for two children to play and sleep
- a room for Mrs. & Mr. XYZ to sleep
- a room for cooking
- a room for dining
- a room for general activities

and so on. An architect takes these requirements and designs a house. The architectural design specifies a particular solution. In fact, the architect may produce several designs to meet the requirements. For example, one design may maximize the children's room, while another minimizes it to allow a large living room. In addition, the style of the proposed houses may differ: traditional, modern, or two-storied. All of the proposed designs solve the problem, and there may not be a "best" design.

Software design can be viewed in the same way. We use the requirements specification to define the problem and transform this into a solution that satisfies all the requirements in the specification.

Some definitions for Design:

“Devising artifacts to attain goals” [H.A. Simon, 1981].


“The process of defining the architecture, components, interfaces and other characteristics of a system or component” [IEEE 610.12].

“The process of applying various techniques and principles for the purpose of defining a device, a process or a system in sufficient detail to permit its physical realization” [Webster Dictionary].

Without design, the system will be:

- unmanageable since there is no concrete output until coding. Therefore it is difficult to monitor & control.

- inflexible since planning for long term changes was not given due emphasis.

- unmaintainable since standards & guidelines for design & construction are not used, and there is no reusability consideration. Poor design may result in tightly coupled modules with low cohesion. Loss of data integrity may also result.

- inefficient due to possible data redundancy and untuned code.

- not portable to various hardware / software platforms.

Design is different from programming. Design brings out a representation for the program, not the program or any component of it. The difference is tabulated below.

Design:
- Abstractions of operations and data ("what to do")
- Establishes interfaces
- Chooses between design alternatives
- Makes trade-offs with respect to constraints, etc.

Programming:
- Devises algorithms and data representations
- Considers run-time environments
- Chooses functions and the syntax of the language
- Devises the representation of the program
- Constructs the program

1.2 Qualities of a Good Design

Functional: It is a very basic quality attribute. Any design solution should work, and should be constructable.

Efficiency: This can be measured through:
- run time (time taken to undertake the whole of a processing task or transaction)
- response time (time taken to respond to a request for information)
- throughput (number of transactions per unit time)
- memory usage, size of executable, size of source, etc.


Flexibility: It is another basic and important attribute. The very purpose of doing design activities is to build systems that are modifiable in the event of any changes in the requirements.

Portability & Security: These are to be addressed during design - so that such needs are not “hard-coded” later.

Reliability: It tells the goodness of the design, that is, how successfully it works (more important for real-time, mission critical, and on-line systems).

Economy: This can be achieved by identifying re-usable components.

Usability: Usability is in terms of how the interfaces are designed (clarity, aesthetics, directness, forgiveness, user control, ergonomics, etc) and how much time it takes to master the system.

1.3 Design Constraints

Typical design constraints are:
- Budget
- Time
- Integration with other systems
- Skills
- Standards
- Hardware and software platforms

Budget and time cannot be changed. The problems with respect to integration with other systems (typically, the client may ask to use a proprietary database that he is already using) have to be studied and solutions found. 'Skills' is alterable (for example, by arranging appropriate training for the team). Mutually agreed upon standards have to be adhered to. Hardware and software platforms may remain a constraint.

The designer tries to answer the "How" of the "What" raised during the requirements phase. As such, the solution proposed should be contemporary, and to that extent a designer should know what is happening in technology. Large, central computer systems with proprietary architectures are being replaced by distributed networks of low cost computers in an open systems environment. We are moving away from conventional software development based on hand generation of code (COBOL, C) to integrated programming environments. Typical applications today are internet based.

1.4 Popular Design Methods

Popular design methods (Wasserman, 1995) include:

1) Modular decomposition: Based on assigning functions to components. It starts from the functions that are to be implemented and explains how each component will be organized and related to other components.

2) Event-oriented decomposition: Based on events that the system must handle. It starts with cataloging the various states and then describes how transformations take place.

3) Object-oriented design: Based on objects and their interrelationships. It starts with object types and then explores object attributes and actions.

Structured Design uses modular decomposition.

1.5 Transition from Analysis to Design

(source: Pressman, R.S. “Software Engineering: A Practitioner’s Approach”, fifth edition, McGraw-Hill, 2001)

The data design transforms the data model created during analysis into the data structures that will be required to implement the software.

The architectural design defines the relationship between major structural elements of the software, the design patterns that can be used to achieve the requirements and the constraints that affect the implementation.

The interface design describes how the software communicates within itself, and with humans who use it.

The procedural design (typically, Low Level Design) elaborates structural elements of the software into a procedural (algorithmic) description.

2. High Level Design Activities

Broadly, High Level Design includes Architectural Design, Interface Design and Data Design.

2.1 Architectural Design

Shaw and Garlan (1996) suggest that software architecture is the first step in producing a software design. Architecture design associates the system capabilities with the system components (like modules) that will implement them. The architecture of a system is a comprehensive framework that describes its form and structure, its components and how they interact together. Generally, a complete architecture plan addresses the functions that the system provides, the hardware and network that are used to develop and operate it, and the software that is used to develop and operate it. An architectural style involves its components, connectors, and constraints on combining components. Shaw and Garlan (1996) describe seven architectural styles. Commonly used styles include:
- Pipes and filters
- Call-and-return systems (main program / subprogram architecture)
- Object-oriented systems
- Layered systems
- Data-centered systems
- Distributed systems (client/server architecture)

In Pipes and Filters, each component (filter) reads streams of data on its inputs and produces streams of data on its output. Pipes are the connectors that transmit output from one filter to another.

e.g. Programs written in Unix shell
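As a small illustration, a filter can be a few lines of C that read a stream on stdin and write a transformed stream on stdout, so that the shell's pipe operator acts as the connector (the program name and pipeline below are hypothetical):

#include <ctype.h>
#include <stdio.h>

/* upper.c: a filter that copies stdin to stdout, converting to upper case.
   It can be composed in a pipeline, e.g.:  cat report.txt | ./upper | sort  */
int main(void)
{
    int c;
    while ((c = getchar()) != EOF)
        putchar(toupper(c));
    return 0;
}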

In Call-and-return systems, the program structure decomposes function into a control hierarchy where a “main” program invokes (via procedure calls) a number of program components, which in turn may invoke still other components.

e.g. Structure Chart is a hierarchical representation of main program and subprograms.

In Object-oriented systems, a component is an encapsulation of data and the operations that must be applied to manipulate the data. Communication and coordination between components is accomplished via message calls.

In Layered systems, each layer provides service to the one outside it, and acts as a client to the layer inside it. They are arranged like an “onion ring”.

e.g. the ISO OSI model.

Data-centered systems use repositories. A repository includes a central data structure representing the current state, and a collection of independent components that operate on the central data store. In a traditional database, the transactions, in the form of an input stream, trigger process execution.

e.g. Database

A popular form of distributed system architecture is the Client/Server where a server system responds to the requests for actions / services made by client systems. Clients access server by remote procedure call.

For a detailed description of architecture styles, read [1] (OPTIONAL)

The following issues are also addressed during architecture design:
- Security
- Data processing: centralized / distributed / stand-alone
- Audit trails
- Restart / recovery
- User interface
- Other software interfaces

2.2 User Interface Design

The design of user interfaces draws heavily on the experience of the designer. Pressman [2] (Refer Chapter 15) presents a set of Human-Computer Interaction (HCI) design guidelines that will result in a "friendly," efficient interface.

The three categories of HCI design guidelines are:
- General interaction
- Information display
- Data entry

General Interaction

Guidelines for general interaction often cross the boundary into information display, data entry and overall system control. They are, therefore, all-encompassing and are ignored at great risk. The following guidelines focus on general interaction.
- Be consistent: Use a consistent format for menu selection, command input, data display and the myriad other functions that occur in a HCI.
- Offer meaningful feedback: Provide the user with visual and auditory feedback to ensure that two-way communication (between the user and the interface) is established.
- Ask for verification of any nontrivial destructive action: If a user requests the deletion of a file, indicates that substantial information is to be overwritten, or asks for the termination of a program, an "Are you sure ..." message should appear.
- Permit easy reversal of most actions: UNDO or REVERSE functions have saved tens of thousands of end users from millions of hours of frustration. Reversal should be available in every interactive application.
- Reduce the amount of information that must be memorized between actions: The user should not be expected to remember a list of numbers or names so that he or she can re-use them in a subsequent function. Memory load should be minimized.
- Seek efficiency in dialog, motion, and thought: Keystrokes should be minimized, the distance a mouse must travel between picks should be considered in designing the screen layout, and the user should rarely encounter a situation where he or she asks, "Now what does this mean?"
- Forgive mistakes: The system should protect itself from errors that might cause it to fail (defensive programming).
- Categorize activities by function and organize screen geography accordingly: One of the key benefits of the pull-down menu is the ability to organize commands by type. In essence, the designer should strive for "cohesive" placement of commands and actions.


- Provide help facilities that are context sensitive.
- Use simple action verbs or short verb phrases to name commands: A lengthy command name is more difficult to recognize and recall. It may also take up unnecessary space in menu lists.

Information Display

If information presented by the HCI is incomplete, ambiguous or unintelligible, the application will fail to satisfy the needs of a user. Information is "displayed" in many different ways: with text, pictures and sound; by placement, motion and size; using color, resolution, and even omission. The following guidelines focus on information display.
- Display only information that is relevant to the current context: The user should not have to wade through extraneous data, menus and graphics to obtain information relevant to a specific system function.
- Don't bury the user with data; use a presentation format that enables rapid assimilation of information: Graphs or charts should replace voluminous tables.
- Use consistent labels, standard abbreviations, and predictable colors: The meaning of a display should be obvious without reference to some outside source of information.
- Allow the user to maintain visual context: If computer graphics displays are scaled up and down, the original image should be displayed constantly (in reduced form at the corner of the display) so that the user understands the relative location of the portion of the image that is currently being viewed.
- Produce meaningful error messages.
- Use upper and lower case, indentation, and text grouping to aid in understanding: Much of the information imparted by a HCI is textual, yet the layout and form of the text has a significant impact on the ease with which information is assimilated by the user.
- Use windows to compartmentalize different types of information: Windows enable the user to "keep" many different types of information within easy reach.
- Use "analog" displays to represent information that is more easily assimilated with this form of representation: For example, a display of holding tank pressure in an oil refinery would have little impact if a numeric representation were used. However, if a thermometer-like display were used, vertical motion and color changes could indicate dangerous pressure conditions. This would provide the user with both absolute and relative information.
- Consider the available geography of the display screen and use it efficiently: When multiple windows are to be used, space should be available to show at least some portion of each. In addition, the screen size (a system engineering issue) should be selected to accommodate the type of application that is to be implemented.

Data Input


Much of the user's time is spent picking commands, typing data and otherwise providing system input. In many applications, the keyboard remains the primary input medium, but the mouse, digitizer and even voice recognition systems are rapidly becoming effective alternatives. The following guidelines focus on data input:
- Minimize the number of input actions required of the user: Reduce the amount of typing that is required. This can be accomplished by using the mouse to select from pre-defined sets of input, using a "sliding scale" to specify input data across a range of values, and using "macros" that enable a single keystroke to be transformed into a more complex collection of input data.
- Maintain consistency between information display and data input: The visual characteristics of the display (e.g., text size, color, placement) should be carried over to the input domain.
- Allow the user to customize the input: An expert user might decide to create custom commands or dispense with some types of warning messages and action verification. The HCI should allow this.
- Interaction should be flexible but also tuned to the user's preferred mode of input: The user model will assist in determining which mode of input is preferred. A clerical worker might be very happy with keyboard input, while a manager might be more comfortable using a point-and-pick device such as a mouse.
- Deactivate commands that are inappropriate in the context of current actions: This protects the user from attempting some action that could result in an error.
- Let the user control the interactive flow: The user should be able to skip unnecessary actions, change the order of required actions (when possible in the context of an application), and recover from error conditions without exiting from the program.
- Provide help to assist with all input actions.
- Eliminate "Mickey Mouse" input: Do not ask the user to specify units for engineering input (unless there may be ambiguity), do not require the user to type .00 for whole-number dollar amounts, provide default values whenever possible, and never require the user to enter information that can be acquired automatically or computed within the program.

3. Structured Design Methodology

The two major design methodologies are based on:
- Functional decomposition
- Object-oriented approach

3.1 Structured Design

Structured design is based on functional decomposition, where the decomposition is centered around the identification of the major system functions and their elaboration and refinement in a top-down manner. It typically follows from the data flow diagram and associated process descriptions created as part of Structured Analysis. Structured design uses two strategies, transform analysis and transaction analysis, along with a few heuristics (like fan-in / fan-out, span of effect vs. scope of control, etc.), to transform a DFD into a software architecture (represented using a structure chart).

In structured design we functionally decompose the processes in a large system (as described in the DFD) into components (called modules) and organize these components in a hierarchical fashion (structure chart) based on the following principles:
- Abstraction (functional)
- Information hiding
- Modularity

Abstraction

"A view of a problem that extracts the essential information relevant to a particular purpose and ignores the remainder of the information." -- [IEEE, 1983]

"A simplified description, or specification, of a system that emphasizes some of the system's details or properties while suppressing others. A good abstraction is one that emphasizes details that are significant to the reader or user and suppress details that are, at least for the moment, immaterial or diversionary." -- [Shaw, 1984]

While decomposing, we consider the top level to be the most abstract, and as we move to lower levels, we give more details about each component. Such levels of abstraction provide flexibility to the code in the event of any future modifications.

Information Hiding

“... Every module is characterized by its knowledge of a design decision which it hides from all others. Its interface or definition was chosen to reveal as little as possible about its inner workings." --- [Parnas, 1972]

Parnas advocates that the details of the difficult and likely-to-change decisions be hidden from the rest of the system. Further, the rest of the system will have access to these design decisions only through well-defined, and (to a large degree) unchanging interfaces.

This gives a greater freedom to programmers. As long as the programmer sticks to the interfaces agreed upon, she can have flexibility in altering the component at any given point.

There are degrees of information hiding. For example, at the programming language level, C++ provides for public, private, and protected members, and Ada has both private and limited private types. In C language, information hiding can be done by declaring a variable static within a source file.
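A minimal sketch of the C case (the counter example is illustrative): the variable has file scope only, so other source files can reach it solely through the functions that this file chooses to expose.

/* counter.c: the design decision (how the count is stored) is hidden;
   only the two access functions form the interface. */
static int counter = 0;                  /* invisible to other source files */

void counter_increment(void) { counter = counter + 1; }
int  counter_value(void)     { return counter; }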


The difference between abstraction and information hiding is that the former (abstraction) is a technique that is used to help identify which information is to be hidden.

The concept of encapsulation as used in an object-oriented context is essentially different from information hiding. Encapsulation refers to building a capsule around some collection of things [Wirfs-Brock et al, 1990]. Programming languages have long supported encapsulation. For example, subprograms (e.g., procedures, functions, and subroutines), arrays, and record structures are common examples of encapsulation mechanisms supported by most programming languages. Newer programming languages support larger encapsulation mechanisms, e.g., "classes" in Simula, Smalltalk and C++, "modules" in Modula, and "packages" in Ada.

Modularity

Modularity leads to components that have clearly defined inputs and outputs, and each component has a clearly stated purpose. Thus, it is easy to examine each component separately from others to determine whether the component implements its required tasks. Modularity also helps one to design different components in different ways, if needed. For example, the user interface may be designed with object orientation while the security design might use a state-transition diagram.

3.2 Strategies for converting the DFD into Structure Chart

Steps [Page-Jones, 1988]:
- Break the system into suitably tractable units by means of transaction analysis
- Convert each unit into a good structure chart by means of transform analysis
- Link back the separate units into an overall system implementation

Transaction Analysis

The transaction is identified by studying the discrete event types that drive the system. For example, with respect to railway reservation, a customer may give the following transaction stimulus:

The three transaction types here are: Check Availability (an enquiry), Reserve Ticket (booking) and Cancel Ticket (cancellation). At any given time we will get customers interested in giving any of the above transaction stimuli. In a typical situation, any one stimulus may be entered through a particular terminal. The human user would inform the system of her preference by selecting a transaction type from a menu. The first step in our strategy is to identify such transaction types and draw the first-level breakup of modules in the structure chart, by creating a separate module to coordinate each transaction type. This is shown as follows:


Main(), which is an overall coordinating module, gets the information about which transaction the user prefers to do through TransChoice. The TransChoice is returned as a parameter to Main(). Remember, we are following our design principles faithfully in decomposing our modules. The actual details of how GetTransactionType() works are not relevant to Main(). It may, for example, refresh and print a text menu, prompt the user to select a choice, and return this choice to Main(). It will not affect any other components in our breakup, even when this module is changed later to return the same input through a graphical interface instead of a textual menu. The modules Transaction1(), Transaction2() and Transaction3() are the coordinators of transactions one, two and three respectively. The details of these transactions are to be exploded in the next levels of abstraction.
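A rough C sketch of this first level of the structure chart, using the module names from the text (the menu text and stub bodies are illustrative assumptions; main() stands for the Main() module):

#include <stdio.h>

/* Stub transaction coordinators; their details belong to the next level. */
static void Transaction1(void) { /* coordinate Check Availability */ }
static void Transaction2(void) { /* coordinate Reserve Ticket     */ }
static void Transaction3(void) { /* coordinate Cancel Ticket      */ }

/* May later be replaced by a graphical menu without affecting main(). */
static int GetTransactionType(void)
{
    int choice = 0;
    printf("1) Check Availability  2) Reserve Ticket  3) Cancel Ticket : ");
    if (scanf("%d", &choice) != 1)
        choice = 0;
    return choice;
}

int main(void)
{
    int TransChoice = GetTransactionType();
    switch (TransChoice) {
    case 1: Transaction1(); break;
    case 2: Transaction2(); break;
    case 3: Transaction3(); break;
    default: break;
    }
    return 0;
}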

We will continue to identify more transaction centers by drawing a navigation chart of all input screens that are needed to get various transaction stimuli from the user. These are to be factored out in the next levels of the structure chart (in exactly the same way as seen before), for all identified transaction centers.

Transform Analysis

Transform analysis is the strategy of converting each piece of the DFD (maybe from level 2 or level 3, etc.) for all the identified transaction centers. In case the given system has only one transaction (like a payroll system), then we can start the transformation from the level 1 DFD itself. Transform analysis is composed of the following five steps [Page-Jones, 1988]:
- Draw a DFD of a transaction type (usually done during the analysis phase)
- Find the central functions of the DFD
- Convert the DFD into a first-cut structure chart
- Refine the structure chart
- Verify that the final structure chart meets the requirements of the original DFD

Let us understand these steps through a payroll system example.

Identifying the central transform

The central transform is the portion of the DFD that contains the essential functions of the system and is independent of the particular implementation of the input and output. One way of identifying the central transform (Page-Jones, 1988) is to identify the centre of the DFD by pruning off its afferent and efferent branches. An afferent stream is traced from outside the DFD to a flow point inside, just before the input is transformed into some form of output (for example, a format or validation process only refines the input; it does not transform it). Similarly, an efferent stream is a flow point from where the output is formatted for better presentation. The processes between the afferent and efferent streams represent the central transform (marked within dotted lines above). In the above example, P1 is an input process, and P6 & P7 are output processes. The central transform processes are P2, P3, P4 & P5, which transform the given input into some form of output.

First-cut Structure Chart


To produce the first-cut (first draft) structure chart, we first have to establish a boss module. A boss module can be one of the central transform processes. Ideally, such a process has to be more of a coordinating process (encompassing the essence of the transformation). In case we fail to find a boss module within, a dummy coordinating module is created.

In the above illustration, we have a dummy boss module, "Produce Payroll", which is named in a way that indicates what the program is about. Having established the boss module, the afferent stream processes are moved to the leftmost side of the next level of the structure chart, the efferent stream processes to the rightmost side, and the central transform processes to the middle. Here, we moved a module to get a valid timesheet (an afferent process) to the left side (indicated in yellow). The two central transform processes are moved to the middle (indicated in orange). By grouping the other two central transform processes with the respective efferent processes, we have created two modules (in blue), essentially to print results, on the right side.

The main advantage of the hierarchical (functional) arrangement of modules is that it leads to flexibility in the software. For instance, if the "Calculate Deduction" module is to select deduction rates from multiple rates, the module can be split into two in the next level: one to get the selection and another to calculate. Even after this change, the "Calculate Deduction" module would return the same value.

Refine the Structure Chart

Expand the structure chart further by using the different levels of the DFD. Factor down till you reach modules that correspond to processes that access a source / sink or data stores. Once this is ready, other features of the software, like error handling, security, etc., have to be added. A module name should not be used for two different modules. If the same module is to be used in more than one place, it is demoted down such that "fan in" can be done from the higher levels. Ideally, the name should sum up the activities done by the module and its subordinates.

Verify the Structure Chart vis-à-vis the DFD

Because of the orientation towards the end-product, the software, the finer details of how data gets originated and stored (as they appear in the DFD) are not explicit in the structure chart. Hence the DFD may still be needed along with the structure chart to understand the data flow while creating the low-level design.

Constructing the Structure Chart (an illustration)

Some characteristics of the structure chart as a whole give clues about the quality of the system. Page-Jones (1988) suggests the following guidelines for a good decomposition of a structure chart:
- Avoid decision splits.
- Keep the span-of-effect within the scope-of-control: i.e. a module can affect only those modules which come under its control (all subordinates: immediate ones, modules reporting to them, etc.).
- An error should be reported from the module that both detects the error and knows what the error is.


- Restrict the fan-out (number of subordinates to a module) of a module to seven.
- Increase fan-in (number of immediate bosses for a module). High fan-in (in a functional way) improves reusability.

Refer [Page-Jones, 1988: Chapters 7 & 10] for more guidelines & illustrations on structure chart

3.3 How to measure the goodness of the design

To measure design quality, we use coupling (the degree of interdependence between two modules) and cohesion (the measure of the strength of functional relatedness of elements within a module). Page-Jones gives a good metaphor for understanding coupling and cohesion: Consider two cities A & B, each having a big soda plant C & D respectively. The employees of C live predominantly in city B and the employees of D in city A. What will happen to the highway traffic between cities A & B? Placing the employees associated with a plant in the city where the plant is situated improves the situation (reduces the traffic). This is the basis of cohesion (which also automatically 'improves' coupling).

COUPLING

Coupling is the measure of the strength of association established by a connection from one module to another [Stevens et al., 1974]. Minimizing connections between modules also minimizes the paths along which changes and errors can propagate into other parts of the system (the 'ripple effect'). The use of global variables can result in an enormous number of connections between the modules of a program. The degree of coupling between two modules is a function of several factors [Stevens et al., 1974]: (1) how complicated the connection is, (2) whether the connection refers to the module itself or something inside it, and (3) what is being sent or received. Table 1 summarizes the various types of coupling [Page-Jones, 1988].

Table 1: Types of coupling

NORMAL coupling (acceptable):
- DATA coupling: Two modules are data coupled if they communicate by parameters, each being an elementary piece of data. e.g. sin(theta) returning the sine value; calc_interest(amt, interest rate, term) returning the interest amount.
- STAMP coupling: Two modules are stamp coupled if one passes the other a composite piece of data (a piece of data with meaningful internal structure). e.g. calc_order_amt(PO_Details) returning the value of the order.
- CONTROL coupling: Two modules are control coupled if one passes the other a piece of information intended to control the internal logic of the other. e.g. print_report(what_to_print_flag).

COMMON (or GLOBAL) coupling (unacceptable): Two modules are common coupled if they refer to the same global data area. Instead of communicating through parameters, the two modules use global data.

CONTENT (or PATHOLOGICAL) coupling (forbidden): Two modules exhibit content coupling if one refers to the inside of the other in any way (if one module 'jumps' inside another module). Jumping inside a module violates all the design principles like abstraction, information hiding and modularity.


We aim for 'loose' coupling. We may come across a (rare) case of module A calling module B with no parameters passed between them (neither sent nor received). Strictly, this should be positioned at the zero point on the scale of coupling (lower than Normal Coupling itself) [Page-Jones, 1988]. Two modules A & B are normally coupled if A calls B, B returns to A, and all information passed between them is by means of parameters passed through the call mechanism. The other two types of coupling (Common and Content) are abnormal coupling and not desired. Even with Normal Coupling we should take care of the following issues [Page-Jones, 1988]:
Data coupling can become complex if the number of parameters communicated is large.
In stamp coupling there is always a danger of over-exposing irrelevant data to the called module. (Beware of the meaning of composite data: a name represented as an array of characters may not qualify as composite data. Whether data is composite is determined by the way it is used in the application, NOT by how it is represented in a program.)
"What-to-do flags" are not desirable when they come from a called module ('inversion of authority'): it is all right for the calling module (which, by virtue of the hierarchical arrangement, is a boss) to know the internals of the called module, but not the other way around.

In general, Page-Jones also warns of tramp data and hybrid coupling. When data is passed up and down merely to send it to a desired module, the data has no meaning at the intermediate levels; this leads to tramp data. Hybrid coupling results when different parts of a flag are used (misused?) to mean different things in different places (usually we may brand it as control coupling, but hybrid coupling complicates the connections between modules). Page-Jones advocates a way to distinguish data from control flags: data are named by nouns and control flags by verbs.

Two modules may be coupled in more than one way. In such cases, their coupling is defined by the worst coupling type they exhibit [Page-Jones, 1988].
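To make the scale concrete, here is a small, illustrative Python sketch (all function and variable names are invented for this example) showing data, stamp, control and common coupling in turn:

    # Data coupling: only elementary parameters cross the interface.
    def calc_interest(amount, rate, term):
        return amount * rate * term / 100.0

    # Stamp coupling: a composite structure is passed, exposing every field
    # of the purchase order to the called module even if only a few are needed.
    def calc_order_amount(purchase_order):
        return sum(line["qty"] * line["price"] for line in purchase_order["lines"])

    # Control coupling: a flag from the caller steers the callee's internal logic.
    def print_report(what_to_print_flag):
        if what_to_print_flag == "summary":
            print("summary report")
        else:
            print("detailed report")

    # Common (global) coupling: two modules communicate through a shared global area.
    GLOBAL_TOTALS = {"orders": 0}

    def record_order():
        GLOBAL_TOTALS["orders"] += 1

    def report_orders():
        print(GLOBAL_TOTALS["orders"])

The first function is the most loosely coupled to its callers; each step down the scale increases the number of ways a change in one module can break another.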

COHESION

Designers should aim for loosely coupled and highly cohesive modules. Coupling is reduced when the relationships among elements not in the same module are minimized. Cohesion on the other hand aims to maximize the relationships among elements in the same module. Cohesion is a good measure of the maintainability of a module [Stevens et al., 1974]. Stevens, Myers, Constantine, and Yourdon developed a scale of cohesion (from highest to lowest):
Functional Cohesion (Best)
Sequential Cohesion
Communicational Cohesion
Procedural Cohesion
Temporal Cohesion
Logical Cohesion
Coincidental Cohesion (Worst)


Let us create a module that calculates the average of the marks obtained by students in a class:

calc_stat(){
    // only pseudo code
    read (x[])
    a = average (x)
    print a
}

average(m){
    // m holds the marks of the N students
    sum = 0
    for i = 1 to N {
        sum = sum + m[i]
    }
    return (sum / N)
}

In average() above, all of the elements are related to the performance of a single function. Such functional binding (cohesion) is the strongest type of binding. Suppose we also need to calculate the standard deviation in the above problem; our pseudo code would then look like:

calc_stat(){
    // only pseudo code
    read (x[])
    a = average (x)
    s = sd (x, a)
    print a, s
}

average(m){
    // same as before
}

sd(m, y){
    // function to calculate the standard deviation of marks m, given their average y
}

Now, though average() and sd() are functionally cohesive, calc_stat() has a sequential binding (cohesion). Like a factory assembly line, the functions are arranged in sequence and the output from average() goes as an input to sd(). Suppose we make sd() calculate the average also; then calc_stat() has two functions related only by a reference to the same set of input data. This results in communicational cohesion.

Let us make calc_stat() into a procedure as below:

calc_stat(){
    sum = sumsq = 0
    for i = 1 to N {
        read (x[i])
        sum = sum + x[i]
        sumsq = sumsq + x[i]*x[i]
    }
    a = sum / N
    s = ...   // formula to calculate SD from sum, sumsq and N
    print a, s
}

Now, instead of binding functional units with data, calc_stat() binds activities through control flow: it has folded two statistical functions into a single procedure. Obviously, this arrangement affects reuse of the module in a different context (for instance, when we need to calculate only the average and not the standard deviation). Such cohesion is called procedural.


A good design for calc_stat () could be (data not shown):
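Since the figure is not reproduced here, a minimal Python sketch of one such functionally cohesive decomposition (splitting reading, averaging and standard deviation into separate modules; the names are invented for illustration) might be:

    import math

    def read_marks():
        # in a real system this would read from the user or a file
        return [55.0, 72.0, 64.0, 81.0]

    def average(marks):
        return sum(marks) / len(marks)

    def std_dev(marks, avg):
        return math.sqrt(sum((m - avg) ** 2 for m in marks) / len(marks))

    def calc_stat():
        # calc_stat only coordinates; each subordinate performs one well-defined function
        marks = read_marks()
        a = average(marks)
        s = std_dev(marks, a)
        print(a, s)

    calc_stat()

Each module can now be reused on its own, and calc_stat() itself is reduced to a small coordinating boss.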

In a temporally bound (cohesion) module, the elements are related in time. The best examples of modules in this type are the traditional “initialization”, “termination”, “housekeeping”, and “clean-up” modules.

A logically cohesive module contains a number of activities of the same kind. To use the module, we may have to send a flag to indicate what we want (forcing the various activities to share the interface). An example is a module that performs all input and output operations for a program. The activities in a logically cohesive module usually fall into the same category (validate all input, or edit all data), leading to sharing of common lines of code (a plate of spaghetti?). Suppose we have a module with all possible statistical measures (like average, standard deviation, mode, etc.). If we want to calculate only the average, the call to it would look like calc_all_stat (x[], flag1, flag2, para1, …). The flags are used to indicate our intent, and some parameters will be left blank.
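As a rough Python sketch of such a flag-driven interface (calc_all_stat and its flags are invented for illustration):

    def calc_all_stat(x, want_avg=False, want_sd=False, want_mode=False):
        # a logically cohesive "do everything" module: flags select the activity,
        # and every caller is exposed to the whole interface
        results = {}
        if want_avg or want_sd:
            avg = sum(x) / len(x)
            if want_avg:
                results["avg"] = avg
            if want_sd:
                results["sd"] = (sum((v - avg) ** 2 for v in x) / len(x)) ** 0.5
        if want_mode:
            results["mode"] = max(set(x), key=x.count)
        return results

    # a caller that needs only the average must still navigate the flag-laden interface
    print(calc_all_stat([3, 5, 5, 8], want_avg=True))

Splitting this into separate average(), sd() and mode() modules would restore functional cohesion.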

When there is no meaningful relationship among the elements in a module, we have coincidental cohesion.

Refer [Page-Jones, 1988: Chapters 5 & 6] for more illustrations and exercises on coupling and cohesion.

4. Data Design

The data design can start from the ERD and the Data Dictionary. The first choice the designer has to make is between a file-based system and a database system (DBMS). File-based systems are easy to design and implement, and processing speed may be higher. The major drawback of this arrangement is that it leads to isolated applications, each with its own data, and hence high redundancy across applications. On the other hand, a DBMS allows you to plug your application on top of an integrated, organization-wide data store. This arrangement ensures minimum redundancy across applications. Change is easier to implement without performance degradation, and a restart/recovery feature (the ability to recover from hardware/software failures) is usually part of a DBMS. On the flip side, because of all these built-in features, a DBMS is quite complex and may be slower.

A File is a logical collection of records. Typically the fields in the records are the attributes. Files can be accessed either sequentially or randomly; Files are organized using Sequential or Indexed or Indexed Sequential organization.

E.F. Codd introduced a technique called normalization, to minimize redundancy in a table.

Consider,

R (St#, StName, Major, {C#, CTitle, Faculty, FacLoc, Grade})


The following assumptions are given:
Every course is handled by only one faculty
Each course is associated with a particular faculty
Faculty is unique
No student is allowed to repeat a course

This table needs to be converted to 1NF as there is more than one value in a cell. This is illustrated below:

After removing repeating groups, R becomes

R1(St#, StName, Major) and R2(St#, C#, CTitle, Faculty, FacLoc, Grade), as illustrated below:

The primary key of the parent table needs to be included in the new table so as to ensure no loss of information.

Now, R1 is in 2NF as it has no composite key. However, R2 has a composite key of St# and C# which determine Grade. Also C# alone determines CTitle, Faculty, FacLoc. Hence, R2 has a partial dependency as follows:

In order to convert it into 2NF we have to remove the partial dependency. R2 now becomes

R21(St#, C#, Grade)
R22(C#, CTitle, Faculty, FacLoc)

Now, R21 & R22 are in 2NF which is illustrated as below:

Now, R1 and R21 are in 3NF also. However, in R22 there exists one transitive dependency, as Faculty determines FacLoc, shown as below:

In order to convert it into 3NF we have to remove this transitive dependency. R22 now becomes

R221(C#, CTitle, Faculty*)


R222(Faculty, FacLoc), assuming Faculty is unique.

Now R221 and R222 are in 3NF.

The final 3NF reduction for the given relation is:

Student (St#, StName, Major) – R1

Mark_List (St#, C#, Grade) - R21

Course (C#, CTitle, Faculty*) – R221 and

Faculty (Faculty, FacLoc) – R222

De-normalization can be done for improving performance (as against flexibility)

For example, Employee (Name, Door#, StreetName, Location, Pin) may become two tables during 3NF, with a separate table maintaining the Pin information. If the application does not change Pin-related data frequently, we can de-normalize these two tables back to the original form, accepting some redundancy in exchange for improved performance.
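As a rough Python illustration (sample data invented; assuming the Pin code determines the Location), the 3NF form keeps Location once per Pin in a separate table, while the de-normalized form repeats it in every Employee row:

    # 3NF: Location stored once per Pin, looked up on every read
    employees_3nf = [
        {"name": "Asha", "door": "12", "street": "MG Road", "pin": "560001"},
        {"name": "Ravi", "door": "7",  "street": "Park St", "pin": "560001"},
    ]
    pin_table = {"560001": "Bengaluru"}

    def location_3nf(emp):
        return pin_table[emp["pin"]]   # extra lookup (a join in a DBMS)

    # De-normalized: Location repeated (redundant) but read directly, no lookup
    employees_denorm = [
        {"name": "Asha", "door": "12", "street": "MG Road", "pin": "560001", "location": "Bengaluru"},
        {"name": "Ravi", "door": "7",  "street": "Park St", "pin": "560001", "location": "Bengaluru"},
    ]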

Guidelines for converting an ERD to tables:
Each simple entity --> one Table / File
For super-type / sub-type entities: either one table for all the entities, or separate tables for the sub-types and the super-type
If there is an M:M relationship between entities, create one Table / File for the relationship, with the primary keys of both entities forming a composite key
Add the inherited attribute (foreign key) depending on retrieval need (typically on the 'M' side of a 1:M relationship)

REVIEWS, WALKTHROUGHS & INSPECTIONS

1. Formal Definitions

Quality Control (QC)

A set of techniques designed to verify and validate the quality of work products and observe whether requirements are met.

Software Element

Every deliverable or in-process document produced or acquired during the Software Development Life Cycle (SDLC) is a software element.


Verification and validation techniques

Verification - “Is the task done correctly?”

Validation - “Is the correct task done?”

Static Testing

V&V is done on any software element.

Dynamic Testing

V&V is done by executing the software with pre-defined test cases.

2. Importance of Static Testing

Why Static Testing?

The benefit is clear once you think about it: if you can find a problem in the requirements before it turns into a problem in the system, that saves time and money. The following statistics are mind-boggling.

M.E. Fagan "Design and Code Inspections to Reduce Errors in Program Development", IBM Systems Journal, March 1976.

Systems Product

67% of total defects during the development found in Inspection

Applications Product

82% of all the defects found during inspection of design and code

A.F. Ackerman, L. Buchwald, and F. Lewski, "Software Inspections: An Effective Verification Process," IEEE Software, May 1989.

Operating System

Inspection decreased the cost of detecting a fault by 85%


Marilyn Bush, "Improving Software Quality: The Use of Formal Inspections at the Jet Propulsion Laboratory", Proceedings of the 12th

International Conference on Software Engineering, pages 196-199, IEEE Computer Society Press, Nice, France, March 1990.

Jet Propulsion Laboratory Project

Every two-hour inspection session results, on an average, in saving of $25,000

The following three ‘stories’ should communicate the importance of Static Testing:

When my daughter called me a 'cheat'
My failure as a programmer
Loss of the "Mars Climate Orbiter"
A Few More Software Failures - Lessons for others

The following diagram of Fagan (Advances in Inspections, IEEE Transactions on Software Engineering) captures the importance of Static Testing. The lesson learned could be summarized in one sentence - Spend a little extra earlier or spend much more later.

The ‘statistics’, the above stories and Fagan's diagram emphasize the need for Static Testing. It is appropriate to state that not all static testing involves people sitting at a table looking at a document. Sometimes automated tools can help. For C programmers, the lint program can help find potential bugs in programs. Java programmers can use tools like the JTest product to check their programs against a coding standard.

When to start the Static Testing?

To get value from static testing, we have to start at the right time. For example, reviewing the requirements after the programmers have finished coding the entire system may help testers design test cases. However, the significant return on the static testing investment is no longer available, as testers can't prevent bugs in code that's already written. For optimal returns, a static testing should happen as soon as possible after the item to be tested has been created, while the assumptions and inspirations remain fresh in the creator's mind and none of the errors in the item have caused negative consequences in downstream processes. Effective reviews involve the right people. Business domain experts must attend requirements reviews, system architects must attend design reviews, and expert programmers must attend code reviews. As testers, we can also be valuable participants, because we're good at spotting inconsistencies, vagueness, missing details, and the like. However, testers who attend review meetings do need to bring sufficient knowledge of the business domain, system architecture, and programming to each review. And everyone who attends a review, walkthrough or inspection should understand the basic ground rules of such events.


The following diagram of Sommerville (Software Engineering, 6th Edition) communicates where Static Testing starts.

3. Reviews

IEEE classifies Static Testing under three broad categories: Reviews, Walkthroughs and Inspections.

What is “Reviews?”

A meeting at which the software element is presented to project personnel, managers, users, customers or other interested parties for comment or approval. The software element can be Project Plans, URS, SRS, Design Documents, code, Test Plans, User Manual.

What are objectives of “Reviews”?

To ensure that:
The software element conforms to its specifications.
The development of the software element is being done as per plans, standards, and guidelines applicable for the project.
Changes to the software element are properly implemented and affect only those system areas identified by the change specification.

Reviews - Input
A statement of objectives for the technical review
The software element being examined
Software project management plan
Current anomalies or issues list for the software product
Documented review procedures
Earlier review report, when applicable
Review team members should receive the review materials in advance and come prepared for the meeting
Checklist for defects

Reviews – Meeting
Examine the software element for adherence to specifications and standards
Verify that changes to the software element are properly implemented and affect only the specified areas
Record all deviations
Assign responsibility for getting the issues resolved
Review sessions are not expected to find solutions to the deviations.


The areas of major concern, the status of previous feedback and the review days utilized are also recorded. The review leader shall verify, later, that the action items assigned in the meeting are closed.

Reviews - Outputs
List of review findings
List of resolved and unresolved issues found during the later re-verification

SRS - Checklist

4. Walkthrough

Walkthrough – Definition

A technique in which a designer or programmer leads the members of the development team and other interested parties through the segment of the documentation or code and the participants ask questions and make comments about possible errors, violation of standards and other problems.

Walkthrough - Objectives
To find defects
To consider alternative implementations
To ensure compliance to standards & specifications

Walkthrough – Input
A statement of objectives
The software element for examination
Standards for the development of the software
Distribution of materials to the team members before the meeting
Team members shall examine the material and come prepared for the meeting
Checklist for defects

Walkthrough – Meeting
Author presents the software element
Members ask questions and raise issues regarding deviations
Discuss concerns, perceived omissions or deviations from the specifications
Document the above discussions
Record the comments and decisions
The walkthrough leader shall verify, later, that the action items assigned in the meeting are closed

Walkthrough – Outputs
List of walkthrough findings
List of resolved and unresolved issues found during the later re-verification


C - Checklist

C++ - Checklist

5. Inspection

Inspection – Definition

A visual examination of software element to detect errors, violations of development standards and other problems. An inspection is very rigorous and the inspectors are trained in inspection techniques. Determination of remedial or investigative action for a defect is a mandatory element of a software inspection, although the solution should not be determined in the inspection meeting.

Inspection – Objectives
To verify that the software element satisfies the specifications & conforms to applicable standards
To identify deviations
To collect software engineering data like defect and effort
To improve the checklists, as a spin-off

Inspection – Input
A statement of objectives for the inspection
Having the software element ready
Documented inspection procedure
Current defect list
A checklist of possible defects
Arrange for all standards and guidelines
Distribution of materials to the team members, before the meeting
Team members shall examine and come prepared for the inspection

Inspection – Meeting
Introducing the participants and describing their roles (by the Moderator)
Presentation of the software element by a non-author
Inspectors raise questions to expose the defects
Recording defects - a defect list details the location, description and severity of each defect
Reviewing the defect list - specific questions to ensure completeness and accuracy
Making the exit decision:
- Accept with no or minor rework, without further verification
- Accept with rework verification (by the inspection team leader or a designated member of the inspection team)
- Re-inspect

Inspection - Output
Defect list, containing the defect location, description and classification


An estimate of rework effort and rework completion date

FAQ

6. Comparison of Reviews, Walk-Throughs and Inspections

Objectives:
Reviews - Evaluate conformance to specifications; ensure change integrity
Walkthroughs - Detect defects; examine alternatives; forum for learning
Inspections - Detect and identify defects; verify resolution

Group Dynamics:
Reviews: 3 or more persons; technical experts and peer mix
Walkthroughs: 2 to 7 persons; technical experts and peer mix
Inspections: 3 to 6 persons; documented attendance; formally trained inspectors

Decision Making & Change Control:
Reviews: Review team requests project team leadership or management to act on recommendations
Walkthroughs: All decisions made by the producer; change is the prerogative of the author
Inspections: Team declares the exit decision – Accept, Rework & Verify, or Rework & Re-inspect

Material Volume:
Reviews: Moderate to high
Walkthroughs: Relatively low
Inspections: Relatively low

Presenter:
Reviews: Software element representative
Walkthroughs: Author
Inspections: Someone other than the author

7. Advantages

Advantages of Static Methods over Dynamic Methods
Early detection of software defects
Static methods expose defects, whereas dynamic methods show only the symptom of the defect
Static methods expose a batch of defects, whereas dynamic methods usually expose them one by one
Some defects can be found only by Static Testing:
- Code redundancy (when logic is not affected)
- Dead code
- Violations of coding standards


8. Metrics for Inspections

Fault Density
Specification and Design: faults per page
Code: faults per 1000 lines of code

Fault Detection Rate

Faults detected per hour

Fault Detection Efficiency

Faults detected per person - hour

Inspection Efficiency

(Number of faults found during inspection) / (Total number of faults during development)
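Using invented numbers purely to show how these metrics are computed (a rough Python sketch, not data from the text):

    faults_found_in_inspection = 40
    total_faults_in_development = 50      # inspection plus later testing/operation
    pages_inspected = 20
    inspection_hours = 4.0
    inspectors = 3

    fault_density = faults_found_in_inspection / pages_inspected                  # 2.0 faults per page
    fault_detection_rate = faults_found_in_inspection / inspection_hours          # 10.0 faults per hour
    fault_detection_efficiency = faults_found_in_inspection / (inspection_hours * inspectors)  # faults per person-hour
    inspection_efficiency = faults_found_in_inspection / total_faults_in_development           # 0.8

    print(fault_density, fault_detection_rate, fault_detection_efficiency, inspection_efficiency)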

Maintenance Vs Inspection

"Number of corrections during the first six months of operational phase" and "Number of defects found in inspections" for different projects of comparable size

9. Common Issues for Reviews, Walk-throughs and Inspections

Responsibilities of Team Members
Leader: To conduct / moderate the review / walkthrough / inspection process
Reader: To present the relevant material in a logical fashion
Recorder: To document defects and deviations
Other Members: To critically question the material being presented

Communication Factors
Discussion is:
- Kept within bounds
- Not extreme in opinion
- Reasonable and calm
- Not directed at a personal level
- Concentrated on finding defects
- Not bogged down with disagreement
- Not about trivial issues
- Not involved in solution-hunting
The participants should be sensitive to each other, keeping the synergy of the meeting very high.

Being aware of, and correcting, any conditions, either physical or emotional, that are draining the participants' attention shall ensure that the meeting is fruitful – i.e. the maximum number of defects is found during the early stages of the software development life cycle.

TESTING AND DEBUGGING

1. Introduction to Testing

1.1 A Self-Assessment Test [1]

Take the following test before starting your learning:

“A program reads three integer values. The three values are interpreted as representing the lengths of the sides of a triangle. The program displays a message that states whether the given sides can make a scalene or isosceles or equilateral triangle (Triangle Program)”

On a sheet of paper, write a set of test cases that would adequately test this program.

Evaluate the effectiveness of your test cases using this list of common errors.

Testing is an important, mandatory part of software development; it is a technique for evaluating product quality and also for indirectly improving it, by identifying defects and problems [2].

What is common between these disasters?

Ariane 5, Explosion, 1996

AT&T long distance service fails for nine hours, 1990

Airbus downing during Iran-conflict, 1988

Shut down of Nuclear Reactors, 1979

...Software faults!!

1.2 How do we decide the correctness of software?


To answer this question, we first need to understand how software can fail. Failure, in the software context, is non-conformance to requirements. Failure may be due to one or more faults [3]:
Error or incompleteness in the requirements
Difficulty in implementing the specification in the target environment
Faulty system or program design
Defects in the code

From the variety of faults above (refer Pfleeger [3] for an excellent discussion of the various types of faults), it is clear that testing cannot be seen as an activity that starts after the coding phase – software testing is an activity that encompasses the whole development life cycle. We test a program to demonstrate the existence of faults. Contrary to what the term may suggest, testing is not a demonstration that the program works properly; it carries a negative, destructive connotation compared with our normal understanding. Myers' classic "The Art of Software Testing" (still regarded as the best fundamental book on testing) lists the following as the most important testing principles:
(Definition) Testing is the process of executing a program with the intent of finding errors
A good test case is one that has a high probability of detecting an as-yet undiscovered error
A successful test case is one that detects an as-yet undiscovered error

Myers [1] discusses more testing principles:
Test case definition includes expected output (a test oracle)
Programmers should avoid testing their own programs (third-party testing?)
Inspect the result of each test
Include test cases for invalid & unexpected input conditions
See whether the program does what it is not supposed to do ('error of commission')
Avoid throw-away test cases (test cases serve as documentation for future maintenance)
Do not assume that the program is bug-free (a destructive, non-developer mindset is needed)
The probability of more errors is proportional to the number of errors already found

Consider the following diagram:

If we abstract a software to be a mathematical function f, which is expected to produce various outputs for inputs from a domain (represented diagrammatically above), then it is clear from the definition of Testing by Myers that Testing cannot prove correctness of a program – it is just a series of experiments to find out errors (as f is usually a discrete function that maps the input domain to various outputs – that can be observed by executing the program)


There is no such thing as 100% error-free code: since it is not feasible to conduct exhaustive testing, proving 100% correctness of a program is not possible.

We need to develop an attitude of 'egoless programming' and keep a goal of eliminating as many faults as possible. Statistics on review effectiveness, and common sense, say that prevention is better than cure. We also need to have static testing in place to capture an error before it becomes a defect in the software.

Recent agile methodologies like extreme programming address these issues better, with practices like test-driven development and pair programming (to reduce the psychological pressure on individuals and to make review a part of coding) [4].

1.3 Testing Approaches

There are two major approaches to testing:
Black-box (or closed-box or data-driven or input/output-driven or behavioral) testing
White-box (or clear-box or glass-box or logic-driven) testing

If the component under test is viewed as a "black box", inputs are given and the behaviour (output) of the program is observed. In this case, it makes sense to give both valid and invalid inputs. The observed output is then matched with the expected result (from the specifications). The advantage of this approach is that the tester need not worry about the internal structure of the program. However, it is impossible to find all errors using this approach. For instance, if one tried three equilateral-triangle test cases for the triangle program, we cannot be sure whether the program will detect all equilateral triangles: the program may contain a hard-coded display of 'scalene triangle' for the values (300, 300, 300). To exhaustively test the triangle program, we would need to create test cases for all valid triangles up to the MAXIMUM integer size. This is an astronomical task – but still not exhaustive (why?). To be sure of finding all possible errors, we must test not only with valid inputs but with all possible inputs (including invalid ones like characters, floats, negative integers, etc.).
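As a small illustration, the following Python sketch shows how a handful of black-box test cases for the triangle program might be written; the function classify_triangle and its return strings are assumptions made for this example, not part of the original exercise:

    def classify_triangle(a, b, c):
        # assumed implementation under test (illustrative only)
        if a <= 0 or b <= 0 or c <= 0 or a + b <= c or b + c <= a or a + c <= b:
            return "not a triangle"
        if a == b == c:
            return "equilateral"
        if a == b or b == c or a == c:
            return "isosceles"
        return "scalene"

    # a few black-box cases derived purely from the specification
    cases = [
        ((3, 4, 5), "scalene"),
        ((5, 5, 8), "isosceles"),
        ((7, 7, 7), "equilateral"),
        ((1, 2, 3), "not a triangle"),   # degenerate: fails the triangle inequality
        ((0, 4, 5), "not a triangle"),   # invalid side length
    ]
    for sides, expected in cases:
        assert classify_triangle(*sides) == expected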

The "white-box" approach examines the internal structure of the program. This means a logical analysis of the software element, where testing is done to trace all possible paths of control flow. This method of testing exposes both errors of omission (errors due to a neglected specification) and errors of commission (something not defined by the specification). Let us look at the specification of a simple file handling program, given below.

“The program has to read employee details such as employee id, name, date of joining and department as input from the user, create a file of employees and display the records sorted in the order of employee ids. “

Examples of errors of omission: omission of the display module, records not displayed in sorted order of employee ids, a file created with fewer fields, etc.


Example of an error of commission: additional lines of code deleting some arbitrary records from the created file. (No amount of black-box testing can expose such errors of commission, as the method uses the specification as the reference for preparing test cases.)

However, it is often practically impossible to do complete white-box testing and trace all possible paths of control flow, as the number of paths can be astronomically large. For instance, for a program segment which has 5 different control paths (4 nested if-then-else constructs), if this segment is iterated 20 times, the number of unique paths would be 5^20 + 5^19 + … + 5^1 ≈ 10^14, or about 100 trillion [1]. If we were able to complete one test case every 5 minutes, it would take approximately one billion years to test every unique path. Due to dependency between decisions, not all control paths may be feasible; hence, in practice, we may not be testing all these paths. Even if we manage to do exhaustive testing of all possible paths, it does not guarantee that the program works as per the specification: instead of sorting in descending order (required by the specification), the program may sort in ascending order. Exhaustive path testing will not address missing paths and data-sensitive errors. A black-box approach would capture these errors!

In conclusion, it is not feasible to do exhaustive testing with either the black-box or the white-box approach. Neither approach is superior – one has to use both, as they really complement each other. Last, but not the least, static testing still plays a large role in software testing.

The challenge, then, lies in using a right mix of all these approaches and in identifying a subset of all possible test cases that have highest probability of detecting most errors. The details of various techniques under black and white box approach are covered in Test Techniques.

2. Levels of Testing

2.1 Overview

In developing a large system, testing usually involves several stages (Refer the following figure [2]).

Unit Testing Integration Testing System Testing Acceptance Testing


Initially, each program component (module) is tested independently verifying the component functions with the types of input identified by studying component’s design. Such a testing is called Unit Testing (or component or module testing). Unit testing is done in a controlled environment with a predetermined set of data fed into the component to observe what output actions and data are produced.

When collections of components have been unit-tested, the next step is ensuring that the interfaces among the components are defined and handled properly. This process of verifying the synergy of system components against the program Design Specification is called Integration Testing.

Once the system is integrated, the overall functionality is tested against the Software Requirements Specification (SRS). Then, the other non-functional requirements like performance testing are done to ensure readiness of the system to work successfully in a customer’s actual working environment. This step is called System Testing.

The next step is customer’s validation of the system against User Requirements Specification (URS). Customer in their working environment does this exercise of Acceptance Testing usually with assistance from the developers. Once the system is accepted, it will be installed and will be put to use.

2.2 Unit Testing

Pfleeger [2] advocates the following steps to address the goal of finding faults in modules (components):

Examining the code
Typically the static testing methods like Reviews, Walkthroughs and Inspections are used (refer the RWI course)

Proving code correct
o After the coding and review exercise, if we want to ascertain the correctness of the code, we can use formal methods. A program is correct if it implements the functions and data properly as indicated in the design, and if it interfaces properly with all other components. One way to investigate program correctness is to view the code as a statement of logical flow. Using mathematical logic, if we can formulate the program as a set of assertions and theorems, we can show that the truth of the theorems implies the correctness of the code.
o Use of this approach forces us to be more rigorous and precise in specification. Much work is involved in setting up and carrying out the proof. For example, the code for performing bubble sort is much smaller than its logical description and proof.

Testing program components (modules)


o In the absence of simpler methods and automated tools, "proving code correctness" will be an elusive goal for software engineers. Proving views programs in terms of classes of data and conditions, and the proof may not involve execution of the code. On the contrary, testing is a series of experiments to observe the behaviour of the program for various input conditions. While a proof tells us how a program will work in a hypothetical environment described by the design and requirements, testing gives us information about how a program works in its actual operating environment.
o To test a component (module), input data and conditions are chosen to demonstrate an observable behaviour of the code. A test case is a particular choice of input data to be used in testing a program. Test cases are generated by using either black-box or white-box approaches (refer Test Techniques).

2.3 Integration Testing

Integration is the process of assembling unit-tested modules. We need to test the following aspects that were not previously addressed while independently testing the modules:
Interfaces: To ensure "interface integrity", the transfer of data between modules is tested. When data is passed to another module by way of a call, there should not be any loss or corruption of data. Loss or corruption of data can happen due to mismatches or differences in the number or order of calling and receiving parameters.
Module combinations may produce different behaviour due to combinations of data that are not exercised during unit testing.
Global data structures, if used, may reveal errors due to unintended usage in some module.

Integration Strategies

Depending on design approach, one of the following integration strategies can be adopted:

· Big Bang approach

· Incremental approach
  - Top-down testing
  - Bottom-up testing
  - Sandwich testing

To illustrate, consider the following arrangement of modules:

Big Bang approach consists of testing each module individually and linking all these modules together only when every module in the system has been tested.


Though the Big Bang approach seems advantageous when we construct independent modules concurrently, it is quite challenging and risky, as we integrate all modules in a single step and test the resulting system. Locating interface errors, if any, becomes difficult here.

The alternative strategy is an incremental approach, wherein modules of a system are consolidated with already tested components of the system. In this way, the software is gradually built up, spreading the integration testing load more evenly through the construction phase. Incremental approach can be implemented in two distinct ways: Top-down and Bottom-up.

In Top-down testing, testing begins with the topmost module. A module will be integrated into the system only when the module which calls it has been already integrated successfully. An example order of Top-down testing for the above illustration will be:

The testing starts with M1. To test M1 in isolation, communications to modules M2, M3 and M4 have to be simulated by the tester, as these modules may not be ready yet. To simulate the responses of M2, M3 and M4 whenever they are invoked from M1, "stubs" are created. Simple applications may require stubs which simply return control to their superior modules. More complex situations demand stubs that simulate a full range of responses, including parameter passing. Stubs may be individually created by the tester (as programs in their own right) or they may be provided by a software testing harness, which is a piece of software specifically designed to provide a testing environment.

In the above illustration, M1 would require stubs to simulate the activities of M2, M3 and M4. The integration of M3 would require a stub or stubs (?!) for M5 and M4 would require stubs for M6 and M7. Elementary modules (those which call no subordinates) require no stubs.

Bottom-up testing begins with elementary modules. If M5 is ready, we need to simulate the activities of its superior, M3. Such a "driver" for M5 would simulate the invocation activities of M3. As with the stub, the complexity of a driver depends upon the application under test. The driver is responsible for invoking the module under test; it could also be responsible for passing test data (as parameters) and for receiving output data. Again, the driving function can be provided through a testing harness or may be created by the tester as a program. The following diagram shows the bottom-up testing approach for the above illustration:

For the above example, drivers must be provided for modules M2, M5, M6, M7, M3 and M4. There is no need for a driver for the topmost node, M1.
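As a rough Python sketch (module names M1, M3 and M5 follow the illustration; the function bodies are invented), a stub stands in for a subordinate that is not yet ready, while a driver stands in for a missing superior:

    # Top-down: M1 is under test, M3 is not ready, so a stub simulates M3's response.
    def m3_stub(order_id):
        return {"order_id": order_id, "status": "OK"}   # canned response, no real processing

    def m1_process(order_id, m3=m3_stub):
        # M1's real logic, calling whatever currently stands in for M3
        result = m3(order_id)
        return result["status"] == "OK"

    assert m1_process(42) is True

    # Bottom-up: M5 is under test, its superior M3 is not ready,
    # so a driver invokes M5 with test data and checks the output.
    def m5_compute_tax(amount):
        return round(amount * 0.18, 2)

    def m5_driver():
        for amount, expected in [(100.0, 18.0), (0.0, 0.0)]:
            assert m5_compute_tax(amount) == expected

    m5_driver()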


Myers [1] lists the advantages and disadvantages of Top-down testing and Bottom-up testing:

Top-down testing
Advantages: advantageous if major flaws occur toward the top of the program; an early skeletal program allows demonstrations and boosts morale.
Disadvantages: stub modules must be produced; test conditions may be impossible, or very difficult, to create; observation of test output is more difficult, as only simulated values are used initially, and for the same reason the impression of program correctness can be misleading.

Bottom-up testing
Advantages: advantageous if major flaws occur toward the bottom of the program; test conditions are easier to create; observation of test results is easier (as "live" data is used from the beginning).
Disadvantages: driver modules must be produced; the program as an entity does not exist until the last module is added.

To overcome the limitations and to exploit the advantages of Top-down and Bottom-up testing, a sandwich testing is used [2]: The system is viewed as three layers – the target layer in the middle, the levels above the target, and the levels below the target. A top-down approach is used in the top layer and a bottom-up one in the lower layer. Testing converges on the target layer, chosen on the basis of system characteristics and the structure of the code. For example, if the bottom layer contains many general-purpose utility programs, the target layer (the one above) will be components using the utilities. This approach allows bottom-up testing to verify the utilities’ correctness at the beginning of testing.

Choosing an integration strategy [2] depends not only on system characteristics, but also on customer expectations. For instance, the customer may want to see a working version as soon as possible, so we may adopt an integration schedule that produces a basic working system early in the testing process. In this way coding and testing can go concurrently.

2.4 System Testing

The objective of unit and integration testing was to ensure that the code implemented the design properly. In system testing, we need to ensure that the system does what the customer wants it to do. Initially the functions (functional requirements) performed by the system are tested. A function test checks whether the integrated system performs its functions as specified in the requirements.

After ensuring that the system performs the intended functions, the performance test is done. This non-functional requirement includes security, accuracy, speed, and reliability.

System testing begins with function testing. Since the focus here is on functionality, a black-box approach is taken (Refer Test Techniques). Function testing is performed in a controlled situation. Since function testing compares the system’s actual performance with its requirements, test cases are developed from requirements document (SRS). For


example [2], a word processing system can be tested by examining the following functions: document creation, document modification and document deletion. To test document modification, adding a character, adding a word, adding a paragraph, deleting a character, deleting a word, deleting a paragraph, changing the font, changing the type size, changing the paragraph formatting, etc. are to be tested.

Performance testing addresses the non-functional requirements. System performance is measured against the performance objectives set by the customer. For example, function testing may have demonstrated how the system handles deposit or withdraw transactions in a bank account package. Performance testing evaluates the speed with which calculations are made, the precision of the computation, the security precautions required, and the response time to user inquiry.

Types of Performance Tests [2]

1. Stress tests – evaluate the system when stressed to its limits. If the requirements state that a system is to handle up to a specified number of devices or users, a stress test evaluates system performance when all those devices or users are active simultaneously. This test brings out the performance during peak demand.

2. Volume tests – address the handling of large amounts of data in the system. This includes:
checking whether data structures have been defined large enough to handle all possible situations,
checking the size of fields, records and files to see whether they can accommodate all expected data, and
checking the system's reaction when data sets reach their maximum size.

3. Configuration tests – analyze the various software and hardware configurations specified in the requirements (e.g. a system that has to serve a variety of audiences).

4. Compatibility tests – are needed when a system interfaces with other systems (e.g. a system that retrieves information from a large database system).

5. Regression tests – are required when the system being tested is replacing an existing system (always used during phased development – to ensure that the new system's performance is at least as good as that of the old).

6. Security tests – ensure the security requirements (testing characteristics related to availability, integrity, and confidentiality of data and services).

7. Timing tests – include response time, transaction time, etc. Usually done with stress tests to see if the timing requirements are met even when the system is extremely active.

8. Environmental tests – look at the system's ability to perform at the installation site. If the requirements include tolerances to heat, humidity, motion, chemical presence, moisture, portability, electrical or magnetic fields, disruption of power, or any other environmental characteristics of the site, then our tests should ensure that the system performs under these conditions.

9. Quality tests – evaluate the system's reliability, maintainability, and availability. These tests include calculation of mean time to failure and mean time to repair, as well as average time to find and fix a fault.

10. Recovery tests – address the response to the loss of data, power, devices or services. The system is subjected to loss of system resources and tested to see if it recovers properly.

11. Maintenance tests – address the need for diagnostic tools and procedures to help in finding the source of problems. They verify the existence and functioning of aids like diagnostic programs, memory maps, traces of transactions, etc.

12. Documentation tests – ensure that documents like user guides, maintenance guides and technical documentation exist, and verify the consistency of the information in them.

13. Human factor (or Usability) tests – investigate user-interface-related requirements. Display screens, messages, report formats and other aspects are examined for ease of use.

2.5 Acceptance Testing

Acceptance testing is the customer (and user) evaluation of the system, primarily to determine whether the system meets their needs and expectations. Usually the acceptance test is done by the customer with assistance from the developers. Customers can evaluate the system either by conducting a benchmark test or by a pilot test [2]. In a benchmark test, the system performance is evaluated against test cases that represent typical conditions under which the system will operate when actually installed. A pilot test installs the system on an experimental basis, and the system is evaluated against everyday working.

Sometimes the system is piloted in-house before the customer runs the real pilot test. The in-house test, in such case, is called an alpha test, and the customer’s pilot is a beta test. This approach is common in the case of commercial software where the system has to be released to a wide variety of customers.

A third approach, parallel testing, is used when a new system is replacing an existing one or is part of a phased development. The new system is put to use in parallel with the previous version; this facilitates a gradual transition of users and allows the new system to be compared and contrasted with the old.

3. Test Techniques

We shall discuss Black Box and White Box approach.

3.1 Black Box Approach


- Equivalence Partitioning
- Boundary Value Analysis
- Cause Effect Analysis
- Cause Effect Graphing
- Error Guessing

I. Equivalence Partitioning

Equivalence partitioning is partitioning the input domain of a system into a finite number of equivalent classes, in such a way that testing one representative from a class is equivalent to testing any other value from that class. To put this in simpler words: since it is practically infeasible to do exhaustive testing, the next best alternative is to check whether the program extends similar behaviour or treatment to a certain group of inputs. If such a group of values can be found in the input domain, treat them together as one equivalent class and test one representative from it. This can be explained with the following example.

Consider a program which takes “Salary” as input with values 12000...37000 in the valid range. The program calculates tax as follows:

- Salary up to Rs. 15000 – No Tax
- Salary between 15001 and 25000 – Tax is 18% of Salary
- Salary above 25000 – Tax is 20% of Salary

Here, the specification contains a clue that certain groups of values in the input domain are treated “equivalently” by the program. Accordingly, the valid input domain can be divided into three valid equivalent classes as below:

c1: values in the range 12000...15000
c2: values in the range 15001...25000
c3: values > 25000

However, it is not sufficient that we test only valid test cases. We need to test the program with invalid data also as the users of the program may give invalid inputs, intentionally or unintentionally. It is easy to identify an invalid class “c4: values < 12000”. If we assume some maximum limit (MAX) for the variable Salary, we can modify the class c3 above to “values in the range 25001...MAX and identify an invalid class “c5: values > MAX”. Depending on the system, MAX may be either defined by the specification or defined by the hardware or software constraints later during the design phase.


If we further expand our discussion and assume that user or tester of the program may give any value which can be typed in through the keyboard as input, we can form the equivalence classes as explained below.

Since the input has to be “salary” it can be seen intuitively that numeric and non-numeric values are treated differently by the program. Hence we can form two classes

- class of non-numeric values
- class of numeric values

Since all non-numeric values are treated as invalid by the program class c1 need not be further subdivided.

Class of numeric values needs further subdivision as all elements of the class are not treated alike by the program. Again, within this class, if we look for groups of values meeting with similar treatment from the program the following classes can be identified:

- values < 12000
- values in the range 12000…15000
- values in the range 15001…25000
- values in the range 25001...MAX
- values > MAX

Each of these equivalent classes need not be further subdivided, as the program should treat all values within each class in a similar manner. Thus the equivalent classes identified for the given specification, along with a set of sample test cases designed using these classes, are shown in the following table (the Actual/Observed Result and Remarks columns, filled in during test execution, are omitted here):

Class | Input Condition (Salary) | Expected Result (Tax amount)
c1 - class of non-numeric values | A non-numeric value | Error Msg: "Invalid Input"
c2 - values < 12000 | A numeric value < 12000 | Error Msg: "Invalid Input"
c3 - values in the range 12000…15000 | A numeric value >= 12000 and <= 15000 | No Tax
c4 - values in the range 15001…25000 | A numeric value >= 15001 and <= 25000 | Tax = 18% of Salary
c5 - values in the range 25001...MAX | A numeric value >= 25001 and <= MAX | Tax = 20% of Salary
c6 - values > MAX | A numeric value > MAX | Error Msg: "Invalid Input"
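A minimal Python sketch of how these classes could drive executable tests (the function calculate_tax and the value of MAX are assumptions made for illustration):

    MAX = 10**7   # assumed upper limit for Salary

    def calculate_tax(salary):
        # hypothetical implementation under test, following the stated rules
        if not isinstance(salary, (int, float)) or salary < 12000 or salary > MAX:
            return "Invalid Input"
        if salary <= 15000:
            return 0.0
        if salary <= 25000:
            return 0.18 * salary
        return 0.20 * salary

    # one representative per equivalence class (c1..c6 of the table above)
    tests = [
        ("abc",    "Invalid Input"),   # c1: non-numeric
        (11000,    "Invalid Input"),   # c2: below the valid range
        (14000,    0.0),               # c3: 12000..15000 -> no tax
        (20000,    0.18 * 20000),      # c4: 15001..25000 -> 18%
        (30000,    0.20 * 30000),      # c5: 25001..MAX   -> 20%
        (MAX + 1,  "Invalid Input"),   # c6: above MAX
    ]
    for salary, expected in tests:
        assert calculate_tax(salary) == expected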

We can summarise this discussion as follows:

To design test cases using equivalence partitioning, for a range of valid input values, identify:
- one valid value within the range,
- one invalid value below the range, and
- one invalid value above the range.

Similarly, to design test cases for a specific set of values, identify:
- one valid case for each value belonging to the set
- one invalid value

E.g., test cases for Types of Account (Savings, Current) will be:
- Savings, Current (valid cases)
- Overdraft (invalid case)

It may be noted that we need fewer test cases if some test cases can cover more than one equivalent class.

II. Boundary Value Analysis

Even though the definition of equivalence partitioning states that testing one value from a class is equivalent to testing any other value from that class, we need to look at the boundaries of equivalent classes more closely. This is so since boundaries are more error prone.

To design test cases using boundary value analysis, for a range of values,

- two valid cases, one at each end of the range
- two invalid cases, just beyond the range limits

Consider the example discussed in the previous section. For the valid equivalence class "c2-2: values in the range 12000...15000" of Salary, the test cases using boundary value analysis are (Actual/Observed Result and Remarks columns omitted):

Input Condition (Salary) | Expected Result (Tax amount)
11999 | Invalid input
12000 | No Tax
15000 | No Tax
15001 | Tax = 18% of Salary

If we closely look at the Expected Result column we can see that for any two successive input values the expected results are always different. We need to perform testing using boundary value analysis to ensure that this difference is maintained.

The same guidelines need to be followed to check output boundaries also.


Other examples of test cases using boundary value analysis are:

- A compiler being tested with an empty source program
- Trigonometric functions like TAN being tested with values near π/2
- A function for deleting a record from a file being tested with an empty data file or a data file with just one record in it

Though the method may sound too simple, boundary value analysis is one of the most effective methods for designing test cases that reveal common errors made in programming.
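Continuing the salary example in Python (the compact calculate_tax below is a hypothetical implementation, restated here so the sketch is self-contained):

    def calculate_tax(salary):
        # same assumed rules as in the equivalence-partitioning sketch
        if not isinstance(salary, (int, float)) or salary < 12000:
            return "Invalid Input"
        if salary <= 15000:
            return 0.0
        return 0.18 * salary if salary <= 25000 else 0.20 * salary

    boundary_tests = [
        (11999, "Invalid Input"),   # just below the valid range
        (12000, 0.0),               # lower boundary of the no-tax class
        (15000, 0.0),               # upper boundary of the no-tax class
        (15001, 0.18 * 15001),      # just beyond: the 18% slab begins
    ]
    for salary, expected in boundary_tests:
        assert calculate_tax(salary) == expected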

III. Cause Effect Analysis

The main drawback of the previous two techniques is that they do not explore the combination of input conditions.

Cause effect analysis is an approach in which the specifications are studied carefully, the combinations of input conditions (causes) and their effects are identified in the form of a table, and test cases are designed from that table.

It is suitable for applications in which combinations of input conditions are few and readily visible.

IV. Cause Effect Graphing

This is a rigorous approach, recommended for complex systems only. In such systems the number of inputs and number of equivalent classes for each input could be many and hence the number of input combinations usually is astronomical. Hence we need a systematic approach to select a subset of these input conditions.

Guidelines for graphing :

– Divide specifications into workable pieces as it may be practically difficult to work on large specifications.

– Identify the causes and their effects. A cause is an input condition or an equivalence class of input conditions. An effect is an output condition or a system transformation.

– Link causes and effects in a Boolean graph which is the cause-effect graph.

– Make decision tables based on the graph. This is done by having one row each for a node in the graph. The number of columns will depend on the number of different combinations of input conditions which can be made.

– Convert the columns in the decision table into test cases.


Consider the following specification:

A program accepts Transaction Code - 3 characters as input. For a valid input the following must be true.

1st character (denoting issue or receipt)

+ for issue
- for receipt

2nd character - a digit

3rd character - a digit

To carry out cause effect graphing, the cause-effect (Boolean) graph is constructed as below. In the graph:

(1) or (2) must be true (V in the graph to be interpreted as OR)

(3) and (4) must be true (Λ in the graph to be interpreted as AND)

The Boolean graph has to be interpreted as follows:
- node (1) turns true if the 1st character is '+'
- node (2) turns true if the 1st character is '-' (both node (1) and node (2) cannot be true simultaneously)
- node (3) becomes true if the 2nd character is a digit
- node (4) becomes true if the 3rd character is a digit
- the intermediate node (5) turns true if (1) or (2) is true (i.e., if the 1st character is '+' or '-')
- the intermediate node (6) turns true if (3) and (4) are true (i.e., if the 2nd and 3rd characters are digits)
- the final node (7) turns true if (5) and (6) are true (i.e., if the 1st character is '+' or '-', and the 2nd and 3rd characters are digits)
- the final node will be true for any valid input and false for any invalid input

A partial decision table corresponding to the above graph (each column is one possible combination of node states):

Node (1): 0 1 1 1 0 1
Node (2): 0 0 0 0 0 0
Node (3): 0 0 0 1 1 1
Node (4): 0 0 1 0 1 1
Node (5): 0 1 1 1 0 1
Node (6): 0 0 0 0 1 1
Node (7): 0 0 0 0 0 1


Sample test case for each column: $xy, +ab, +a4, +2y, @45, +67

The sample test cases can be derived by giving values to the input characters such that the nodes turn true/false as given in the columns of the decision table.
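A small Python sketch of the validator and the decision-table-derived cases (the function name is invented; only the last test case corresponds to a valid code):

    def is_valid_transaction_code(code):
        # nodes (1)/(2): first character is '+' or '-'
        # nodes (3)/(4): second and third characters are digits
        return (len(code) == 3
                and code[0] in "+-"
                and code[1].isdigit()
                and code[2].isdigit())

    # test cases taken from the decision-table columns above
    cases = {"$xy": False, "+ab": False, "+a4": False,
             "+2y": False, "@45": False, "+67": True}
    for code, expected in cases.items():
        assert is_valid_transaction_code(code) == expected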

V. Error Guessing

Error guessing is a supplementary technique where test case design is based on the tester's intuition and experience. There is no formal procedure. However, a checklist of common errors could be helpful here.


3.2 White Box Approach

- Basis Path Testing

Basis Path Testing is a white-box testing method where we design test cases to cover every statement, every branch and every predicate (condition) in the code that has been written. Thus the method attempts statement coverage, decision coverage and condition coverage.

To perform Basis Path Testing:
Derive a logical complexity measure of the procedural design
Break the module into blocks delimited by statements that affect the control flow (e.g. statements like return, exit, jump, etc. and conditions)
Mark these out as nodes in a control flow graph
Draw connectors (arcs) with arrowheads to mark the flow of logic
Identify the number of regions (the Cyclomatic Number), which is equivalent to McCabe's number
Define a basis set of execution paths
Determine independent paths
Derive test cases to exercise (cover) the basis set

McCabe’s Number (Cyclomatic Complexity)
Gives a quantitative measure of the logical complexity of the module
Defines the number of independent paths


Provides an upper bound on the number of tests that must be conducted to ensure that all the statements are executed at least once.

The complexity of a flow graph G, V(G), is computed in one of three ways:
V(G) = No. of regions of G
V(G) = E - N + 2 (E: No. of edges, N: No. of nodes)
V(G) = P + 1 (P: No. of predicate nodes in G, i.e. the No. of conditions in the code)

McCabe’s Number = No. of regions (count the mutually exclusive closed regions and also the whole outer space as one region) = 2 in the above graph

Two other formulae as given below also define the above measure:

McCabe’s Number = E - N + 2 (= 6 - 6 + 2 = 2 for the above graph)

McCabe’s Number = P + 1 (= 1 + 1 = 2 for the above graph)

Please note that if the number of conditions is more than one in a single control structure, each condition needs to be separately marked as a node.

When McCabe's number is 2, it indicates that there are two linearly independent paths in the code, i.e., two different ways in which the graph can be traversed from the 1st node to the last node. The independent paths in the above graph are:

i) 1-2-3-5-6
ii) 1-2-4-5-6

The last step is to write test cases corresponding to the listed paths. This means choosing input conditions in such a way that the above paths are traced by the control of execution. The test cases for the paths listed here are shown in the following table.

Path   Input condition                 Expected result       Actual result   Remarks
i)     value of 'a' > value of 'b'     Increment 'a' by 1
ii)    value of 'a' <= value of 'b'    Increment 'b' by 1
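The flow graph referred to above is not reproduced here, but the listed paths and test cases suggest a module built around a single if-else construct. A hypothetical sketch of such a module (the function name and signature are assumptions, not from the original) is:

// Hypothetical module of the shape implied by the listed paths:
// one predicate node, so V(G) = P + 1 = 2.
void adjust(int& a, int& b) {
    if (a > b)        // predicate node (presumably node 2, where the two paths diverge)
        a = a + 1;    // path i):  executed when a > b
    else
        b = b + 1;    // path ii): executed when a <= b
}

int main() {
    int a = 5, b = 3;
    adjust(a, b);     // exercises path i):  a becomes 6
    int c = 2, d = 4;
    adjust(c, d);     // exercises path ii): d becomes 5
}

With a = 5, b = 3 the control takes path i); with a = 2, b = 4 it takes path ii), matching the two rows of the table.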

4. When to stop testing


The question arises because testing is never complete: we cannot scientifically prove that a software system contains no more errors.


Common Criteria Practiced

• Stop When Scheduled Time For Testing Expires

• Stop When All The Test Cases Execute Without Detecting Errors

Both criteria are meaningless and counterproductive: the first can be satisfied by doing absolutely nothing, and the second does not ensure the quality of the test cases.

Stop When

• All test cases, derived from equivalence partitioning, cause-effect analysis & boundary-value analysis, are executed without detecting errors.

Drawbacks

• Rather than defining a goal & allowing the tester to select the most appropriate way of achieving it, it does the opposite !!!

• Defined methodologies are not suitable for all occasions !!!

• No way to guarantee that the particular methodology is properly & rigorously used

• Depends on the abilities of the tester & no quantification is attempted !

Completion Criterion Based On The Detection Of Pre-Defined Number Of Errors

In this method the goal of testing is defined positively, as finding errors, and hence it is a more goal-oriented approach.

Eg.

- Testing of a module is not complete until 3 errors are discovered

- For a system test: detection of 70 errors or an elapsed time of 3 months, whichever comes later

How To Determine The "Number Of Predefined Errors" ?

Predictive Models

• Based on the history of usage / initial testing & the errors found


Defect Seeding Models

• Based on the initial testing & the ratio of detected seeded errors to detected unseeded errors

(Very critically depends on the quality of 'seeding')

Using this approach, as an example, we can say that testing is complete if 80% of the pre-defined number of errors are detected or the scheduled four months of testing is over, whichever comes later.
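As a hedged illustration (the numbers below are invented for this sketch), the usual seeding estimate scales the unseeded errors found by the fraction of seeded errors detected: estimated total errors = detected unseeded errors x seeded errors planted / seeded errors detected.

#include <iostream>

// Illustrative defect-seeding estimate with made-up numbers.
int main() {
    double seededPlanted  = 50;    // errors deliberately seeded
    double seededFound    = 40;    // seeded errors detected so far
    double unseededFound  = 120;   // real (unseeded) errors detected so far

    double estimatedTotal = unseededFound * seededPlanted / seededFound;  // about 150
    double stopTarget     = 0.8 * estimatedTotal;                         // the "80% detected" criterion

    std::cout << "Estimated total errors: " << estimatedTotal << '\n'
              << "Stop after detecting about " << stopTarget << " errors\n";
}

The quality of the estimate depends critically on how representative the seeded errors are of the real ones, as noted above.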

Caution !

The Above Condition May Never Be Achieved For The Following Reasons

• Over Estimation Of Predefined Errors

(The Software Is Too Good !!)

• Inadequate Test Cases

Hence the best completion criterion may be a combination of all the methods discussed:

Module Test

• Defining test case design methodologies (such as boundary value analysis...)

Function & System Test

• Based on finding the pre-defined number of defects

5. Debugging

Debugging occurs as a consequence of successful testing. It is an exercise to connect the external manifestation of an error with its internal cause.

Debugging techniques include use of:

• Breakpoints

A point in a computer program at which execution can be suspended to permit manual or automated monitoring of program performance or results

• Desk Checking


A technique in which code listings, test results or other documentation are visually examined, usually by the person who generated them, to identify errors, violations of development standards or other problems

• Dumps

1. A display of some aspect of a computer program’s execution state, usually the contents of internal storage or registers

2. A display of the contents of a file or device

• Single-Step Operation

In this debugging technique, a single computer instruction is executed in response to an external signal.

• Traces

A record of the execution of a computer program, showing the sequence of instructions executed, the names and values of variables, or both.
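As an illustrative sketch (not part of the original list), a crude trace of the kind described above can be produced by instrumenting the code. The TRACE macro below logs the source location, the traced expression, and its current value:

#include <iostream>

// Minimal tracing sketch: each TRACE line records where execution is
// and the current value of an expression, i.e. a crude execution trace.
#define TRACE(expr) \
    std::cerr << __FILE__ << ":" << __LINE__ << "  " << #expr << " = " << (expr) << '\n'

int sum(int n) {
    int total = 0;
    TRACE(n);
    for (int i = 1; i <= n; ++i) {
        total += i;
        TRACE(total);          // shows the sequence of values taken by 'total'
    }
    return total;
}

int main() {
    TRACE(sum(4));             // final value: 10
}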

OBJECT ORIENTED CONCEPTS

1. Introduction

Improvement of programmer productivity has been a primary preoccupation of the software industry since the early days of computers, so it is no surprise that it continues to be a dominant theme even today. However, the concerns arising out of this preoccupation have changed with time. As software engineering and software technology evolved, the concerns shifted from productivity during software development to the reliability of software. Hence the emphasis on extensibility and reusability of software modules, whose perceived benefits are a reduction in the effort needed to ensure the reliability of software and improved productivity of the software development process.

Software being inherently complex, the time-tested technique of decomposition, also known as "DIVIDE and CONQUER", has been applied from the early days of software development. Two types of decomposition have been used: algorithmic decomposition and object-oriented decomposition.

Algorithmic decomposition views software as

    Program = {Data Structures} + {Operations}

The program is viewed as a series of tasks to be carried out to solve the problem.

Object-oriented decomposition views it as

    Object = {Data Structures} + {Operations}
    Program = {Objects}


The program is viewed as a set of objects which cooperate with each other to solve the problem. For example, consider a typical banking system. Algorithmic decomposition views it as a series of tasks, not unlike the following:
- Open an account
- Deposit money
- Withdraw money
- Transfer money between accounts
- Close an account
- ...etc...

The object-oriented decomposition views it as a set of objects, like the following:
- Account object
    account number (data)
    current balance (data)
    get account number (operation)
    update balance (operation)
- Account holder object
    name (data)
    address (data)
    deposit money (operation)
    withdraw money (operation)
    ...etc
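A minimal sketch of the Account object above in a class-based language (C++ is used here purely for illustration; the member names follow the list above, and the values in main are made up):

#include <iostream>
#include <string>

// Sketch of the 'Account' object: data and operations packaged together.
class Account {
    std::string accountNumber;   // data
    double currentBalance;       // data
public:
    Account(const std::string& number, double openingBalance)
        : accountNumber(number), currentBalance(openingBalance) {}

    std::string getAccountNumber() const { return accountNumber; }   // operation
    void updateBalance(double amount) { currentBalance += amount; }  // operation
};

int main() {
    Account a("SB-001", 1000.0);
    a.updateBalance(250.0);
    std::cout << "Updated account " << a.getAccountNumber() << '\n';
}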

2. How object oriented decomposition helps software productivity

Data and operations are packaged together into objects. This facilitates independent and parallel development of code. Because of this packaging, most updates are localised, which makes it easier to maintain OO systems. It allows reuse and extension of behaviour, and it permits iterative development of large systems. Basically, these gains arise from the dominant themes in the OO paradigm. They are:

Data Abstraction: This provides a means by which a designer or programmer can focus on the essential features of her data. It avoids unnecessary attention being paid to the implementation of data.

Encapsulation: This helps in controlling the visibility of the internal details of objects. It improves the security and integrity of data.

Hierarchy: It is the basic means of providing extensibility of software modules and helps in increasing their reuse.

Polymorphism: It is the means by which an operation behaves differently in different contexts.


These themes are discussed further below against the backdrop of significant periods in the history of software engineering methodologies.

3. Historical Perspective

After early experience in software development, the software field hoped that structured programming ideas would provide solutions to its problems. Structured programming is a style of programming usually associated with languages such as C, Fortran, Pascal and so on. Using structured programming techniques, a problem is often solved using a divide and conquer approach. An initially large problem is broken into several smaller sub-problems. Each of these is then progressively broken into even smaller sub-problems, until the level of difficulty is considered to be manageable. At the lowest level, a solution is implemented in terms of data structures and procedures. This approach is often used with imperative programming languages that are not object-oriented languages, i.e. the data structures and procedures are not implemented as classes. It was looked upon as a technique to make programs easier to understand, modify, and extend. It was felt that use of the most appropriate control structures and data structures would make a program easy to understand. It was realised that branching (use of goto's) makes a program unstructured. It was felt that a program should be modular in nature.

The language PL/I was designed around this time. It is a language rich in data structures and control structures and promotes modularity. However, ideas of structured programming were devoid of a methodology. Given a program one could apply the definition of structuredness and say whether it was a structured program or not. But one did not know how to design a structured program. Similarly, guidelines for developing good modular structure for a program were conspicuously absent. In the absence of a methodology, a novice could obtain absurd results starting from a problem specification. For example, the rule that each module should be about 50 lines of code can be used to decide that each segment of 50 lines in a program should become a module.

Around this time it was also felt that having to use only the predefined data types of the language was very constraining, and hence prone to design errors and program bugs. So a methodology for program design was necessary. It should:
- make the task of program design simple and systematic
- provide effective means to control the complexity of program design
- provide easy means to help a user design his data

The PASCAL language was designed with a rich control structure and the means to support user-defined data. The designer of PASCAL, Niklaus Wirth, also popularised the methodology of stepwise refinement. It can be summarised as follows:
1. Write a clear statement of the problem in simple English. Avoid going to great depth. Also focus on what is to be done, not on how it is to be done.
2. Identify the fundamental data structures involved in the problem. This data belongs to the problem domain.


3. Identify the fundamental operations which, when performed on the data defined in the above step, would implement the problem specification. These operations are also domain-specific.
4. Design the data identified in step 2 above, and design the operations identified in step 3 above. If any of these operations are composed of other operations, apply steps 1-4 to each such operation.

Application of this methodology in a recursive manner leads to a hierarchical program structure. An operation being defined becomes a 'problem' to be solved at a lower level, leading to the identification of its own data and sub-operations. When a clear and unambiguous statement of the problem is written in simple English in the above stated manner, the data gets identified in abstract form: only the essential features of the problem get highlighted, and the other features are suppressed. Similarly, the essential features of an operation get highlighted. Such data and operations are said to be 'domain-specific'; they are directly meaningful in the problem domain. The use of domain-specific data and operations simplifies program design. It is more natural than the use of data or operations which have more or less generality than needed, and it is less prone to design or programming errors.

PASCAL permitted the programmer to define her own data types. These could be:
- simple data, viz. enumerated data types
- structured data, viz. array and record types

Operations on user-defined data were coded as procedures and functions. Thus, a user could define the set of values a data item could take, and could also define how the values were to be manipulated. The point to be noted here is that permitting a user to define her own data permits her to use the most appropriate data for a problem. For example, an inventory control program could start like this:

type
    code = (issue, receipt);
var
    trans_code : code;

begin
    ...
    if trans_code = issue then ...

The advantage of this is that programming is now more natural, and hence less prone to errors. PASCAL was also the first widely used language to emphasize the importance of compile-time validation of a program. It implies the following:
- Syntax errors should be detected by a compiler. Most compilers do this.
- Violations of language semantics should also be detected by a compiler.
- The semantic checks should also apply to user-defined data. Thus, invalid use of data should be detected by the compiler. This eliminates the debugging effort which would otherwise be required to detect and correct the error.

PASCAL does not fully succeed in implementing compile-time validation; however, stressing its importance is one of its achievements. PASCAL lacks features to define


legal operations on user-defined data and to ensure that only such operations are used on the data. This endangers data consistency. For example:

type
    complex_conjugate = record
        real1 : real;
        imag1 : real;
        real2 : real;
        imag2 : real;
    end;

begin
    ...
    real1 := real1 + 3;    {this violates the idea of complex conjugates}

The user-defined data is like any other data, and hence its components can be used in any manner consistent with their types. Complex conjugates should only be used in certain specific ways, but this is not known to the compiler; it will allow the component real1 to be used as an ordinary real variable.

SIMULA was the first language which introduced the concept of classes. In fact, it was the first language to introduce many of the object-oriented concepts that are taken for granted today. A class in SIMULA is a linguistic unit which contains:
- the definition of the legal values that data of the class can assume
- the definition of the operations that can be performed on the data

A user can create variables of a class in her program and can use the operations defined in the class on these variables. This way the compiler knows what operations are legal on the data; attempts to use any other operations will lead to compile-time errors. However, SIMULA did not provide encapsulation. Encapsulation implies 'sealing' of the internal details of data and operations. Use of encapsulation in the class promotes information hiding: only those data and operations which are explicitly declared as visible to the outside world will be visible, and the rest will be hidden. Encapsulation thus strengthens compile-time validation, ensuring that a program cannot perform any invalid operation on the data. For instance, illegal access to the imaginary part of a complex number can be prevented. This takes compile-time validation of a program one step further. (At this point you are encouraged to visit these links: [1], [2].)

Data abstraction coupled with encapsulation provides considerable advantages in the presence of modularity. Consider a program consisting of two modules, A and B. We can make the following observations:
- Module B defines its own data.
- The data of module B can only be manipulated by its operations, also defined in module B. Hence, module A cannot access the data of module B directly.
- The responsibility for correct manipulation of the data now rests only with module B.
- Changes made in module B do not affect module A. This simplifies debugging.


Now we are in a position to clarify the expressions:

    Object = {Data Structures} + {Operations}
    Program = {Objects}

An object is a collection of data items and the operations that manipulate those data items. An object-oriented program is a collection of objects which interact with each other. Each object is an instance of a class. A class defines a 'data type', and each object is a variable of that 'data type'; thus a class is a template and objects are its 'copies'. The data structures are declared in the class, and the operations are also defined in the class; each operation is called a method. Each object of the class contains its own copy of the data structures declared in the class; these are called the instance variables of the object. The operations in all objects of a class share the code of the class methods, so a single copy of the code exists in the program. In most present-day object-oriented systems, the execution of an object-oriented program is 'single threaded', i.e. only one object is active at a time. The program consists of one 'main' object which is active when the program is initiated.

4. Abstraction

Abstraction provides a well-defined conceptual boundary relative to the perspective of the viewer. When we define abstract data, we identify its essential characteristics; all other characteristics are unimportant. There are different types of abstraction:
- Entity abstraction: the object presents a useful model of an entity in the problem domain.
- Action abstraction: the object provides a generalised set of operations, all of which perform the same kind of function.
- Virtual machine abstraction: the object groups together operations that are all used by some superior level of control.
- Coincidental abstraction: the object packages operations that have no relation to each other.

Entity abstraction is considered to be the best form of abstraction, as it groups together the data and operations concerning an entity. For example, an object called salaried_employee could contain operations like compute_income_tax, compute_pf, etc. Thus information about an entity is localized. One often faces the question of how to identify useful objects from a problem specification. The answer is to look for nouns in the specification: they represent 'things' or entities. This leads to entity abstraction.

5. Encapsulation

While abstraction helps one to focus on the essential characteristics of an object, encapsulation enables one to expose only those details that are necessary to use the object effectively. This is achieved through information hiding and the "need to know" principle. One could declare some of the variables and methods of the object as private and the rest as public; only the public variables and methods will be accessible to other objects. By carefully selecting what is available to the outside world, the developer can prevent illegal use of objects. For example, consider a class called Person with an operation like 'compute_income_tax'. Details of a person's income are not essential to a viewer of this


class. Only the ability to compute the income-tax for each object is essential. Hence the data members of the class can be declared private, while 'compute_income_tax' can be public.
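A sketch of that Person class might look as follows (the field names and the tax rule are assumptions made for illustration):

#include <iostream>

// Sketch of the Person example: income details are hidden (private);
// only the tax computation is exposed (public).
class Person {
private:
    double annualIncome;   // not visible to clients of the class
    double deductions;     // not visible to clients of the class
public:
    Person(double income, double ded) : annualIncome(income), deductions(ded) {}

    // The only service the outside world needs.
    double compute_income_tax() const {
        double taxable = annualIncome - deductions;
        return taxable > 0 ? taxable * 0.10 : 0.0;   // illustrative flat 10% rate
    }
};

int main() {
    Person p(500000.0, 150000.0);
    std::cout << "Tax payable: " << p.compute_income_tax() << '\n';   // 35000
}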

6. Hierarchy

Hierarchy is a ranking or ordering of abstractions. The important hierarchies in an OO system are:
- the class structure hierarchy
- the object structure hierarchy

A class structure hierarchy is used for sharing behaviour and code between different classes of entities which have some features in common. This is achieved through inheritance, and it is called the 'is a' hierarchy. Features that are common to many classes are migrated to a common class, called the base class (or super-class, or parent class). Other classes may add, modify, or even hide some of these features; these are called derived classes (or sub-classes, or child classes). For example, vehicle can be a base class, and two-wheeler, three-wheeler, and four-wheeler can be derived classes. There is also the possibility of inheriting from more than one class; this is called multiple inheritance. For example, a two-in-one inherits the features of a radio and a tape recorder. However, multiple inheritance can raise the following complications:
- the possibility of name clashes
- repeated inheritance from two peer super-classes

The object structure hierarchy is also called the 'part of' hierarchy. It is implemented through aggregation, that is, one object becomes a part of another object. Aggregation permits the grouping of logically related structures. It is not unique to object-oriented systems; for example, in C a structure can be used to group logically related data elements and structures. In the OO context, a vehicle consists of many parts like the engine, wheels, chassis, etc. Hence it can be represented as an aggregation of many parts.
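Both hierarchies in the vehicle example can be sketched as follows (the class bodies are illustrative only, not taken from the original):

#include <iostream>
#include <vector>

class Engine { /* details omitted */ };
class Wheel  { /* details omitted */ };

// 'is a' hierarchy: behaviour common to all vehicles lives in the base class.
class Vehicle {
public:
    virtual int wheelCount() const = 0;
    virtual ~Vehicle() = default;
protected:
    Engine engine;               // 'part of' hierarchy: a vehicle is an
    std::vector<Wheel> wheels;   // aggregation of an engine, wheels, etc.
};

class TwoWheeler : public Vehicle {
public:
    int wheelCount() const override { return 2; }
};

class FourWheeler : public Vehicle {
public:
    int wheelCount() const override { return 4; }
};

int main() {
    FourWheeler car;
    const Vehicle& v = car;                    // a derived object viewed through its base class
    std::cout << v.wheelCount() << " wheels\n";
}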

7. Polymorphism

It is a concept wherein a name may denote instances of different classes as long as they are related by some common super class. Any such name is thus able to respond to some common set of operations in different ways. For example, let us say that we have declared a class called polygon with a function called draw(). We have derived three classes, rectangle, triangle, pentagon, each with its own redefinition of the draw() function. The version of the draw function to be used during runtime is decided by the object through which it is called. This is polymorphism. It assists in adopting a unified design approach.
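A sketch of the polygon example (purely illustrative; only the names polygon, rectangle, triangle, pentagon, and draw() come from the text above):

#include <iostream>
#include <memory>
#include <vector>

// The draw() actually executed is chosen at run time
// by the dynamic type of the object.
class Polygon {
public:
    virtual void draw() const { std::cout << "generic polygon\n"; }
    virtual ~Polygon() = default;
};

class Rectangle : public Polygon {
public:
    void draw() const override { std::cout << "rectangle\n"; }
};

class Triangle : public Polygon {
public:
    void draw() const override { std::cout << "triangle\n"; }
};

class Pentagon : public Polygon {
public:
    void draw() const override { std::cout << "pentagon\n"; }
};

int main() {
    std::vector<std::unique_ptr<Polygon>> shapes;
    shapes.push_back(std::make_unique<Rectangle>());
    shapes.push_back(std::make_unique<Triangle>());
    shapes.push_back(std::make_unique<Pentagon>());
    for (const auto& s : shapes)
        s->draw();    // polymorphic dispatch: rectangle, triangle, pentagon
}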

Refer Prof. Thomas Huckle’s site for a collection of software bugs:

http://wwwzenger.informatik.tu-muenchen.de/persons/huckle/bugse.html

and refer http://www.cs.tau.ac.il/~nachumd/verify/horror.html for software horror stories!


8. Modularity

Modularity creates a number of well-defined, documented boundaries within a system. A module typically clusters logically related abstractions. This is invaluable in understanding a system. Each module can be compiled separately, but has connections with other modules. Connections between modules are the assumptions which modules make about each other. Ideally, a module should be a single syntactic construct in the programming language.

9. Meyer's Criteria

Bertrand Meyer suggests five criteria for evaluating a design method's ability to achieve modularity and relates these to object-oriented design:
- Decomposability: the ability to decompose a large problem into sub-problems.
- Composability: the degree to which modules, once designed and built, can be reused to create other systems.
- Understandability: the ease of understanding program components by themselves, i.e. without referring to other components.
- Continuity: the ability to make incremental changes.
- Protection: a characteristic that reduces the propagation of side effects of an error in one module.

Based on the foregoing criteria, Meyer suggests five rules to be followed to ensure modularity:
- Direct mapping: the modular structure of the software system should be compatible with the modular structure devised in the process of modelling the problem domain.
- Few interfaces: a minimum number of interfaces between modules.
- Small interfaces: a minimum amount of information should move across an interface.
- Explicit interfaces: interfaces should be explicit. Communication through global variables violates this criterion.
- Information hiding: information concerning implementation is hidden from the rest of the program.

Based on the above rules and criteria, the following five design principles follow:
- Linguistic modular units principle: modules must correspond to syntactic units in the language used.
- Self-documentation principle: all information about a module should be part of the module itself.
- Uniform access principle: all services offered by a module should be available through a uniform notation.


- Open-closed principle: modules should be both open and closed; that is, it should be possible to extend a module while it is in use.
- Single choice principle: whenever a software system must support a set of alternatives, one and only one module in the system should know their exhaustive list.

A class in an OO language provides a linguistic modular unit. A class provides explicit interfaces and information hiding. This is what sets OO languages apart from early languages like PASCAL and SIMULA.

10. Characteristics of an object

From the perspective of human cognition, an object is one of the following:
- a tangible and/or visible thing
- something that may be apprehended intellectually
- something towards which thought or action is directed

Some objects have clear physical identities (e.g. a machine), while others may be intangible yet have crisp conceptual boundaries (e.g. a chemical process). However, the following statement is true for all objects: an object has state, behaviour, and identity. An object's behaviour is governed by its state. For example, a two-in-one cannot operate as a radio when it is in tape-recorder mode.

An operation is a service that an object offers to the rest of the system. There are five kinds of service:
- Modifier: the operation alters the state of the object.
- Selector: the operation accesses the state of an object but does not alter it.
- Iterator: the operation accesses parts of an object in some order.
- Constructor: the operation creates an object and initialises its state.
- Destructor: the operation destroys the object.

There are three categories of objects:
- Actor: the object acts upon other objects but is never acted upon.
- Server: the object is always acted upon by other objects.
- Agent: the object acts on other objects and is also acted upon by others.

There are three types of visibility into a class:
- Public: a feature declared public is accessible to the class itself and all its clients.
- Protected: a feature declared protected is accessible to the class itself, all its derived classes, and its friends.
- Private: a feature declared private is accessible only to the class itself and its friends.
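A small illustrative sketch of some of these ideas (the Counter class is an assumption; the iterator kind is omitted for brevity):

#include <iostream>

class Counter {
public:
    Counter() : value(0) {}                 // constructor: creates the object and initialises its state
    ~Counter() = default;                   // destructor: destroys the object
    void increment() { ++value; }           // modifier: alters the state
    int current() const { return value; }   // selector: reads the state without altering it
protected:
    int limit = 100;                        // protected: visible to derived classes (and friends)
private:
    int value;                              // private: visible only to the class itself (and friends)
};

int main() {
    Counter c;                          // a client can reach only the public features
    c.increment();
    std::cout << c.current() << '\n';   // prints 1
}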

11. Extensibility

The notions of abstraction and encapsulation provide us with the features of understandability, continuity, and protection. As we said in the beginning, one of the goals of the OO paradigm is 'extensibility'. It can be achieved in two ways: extension by scaling and semantic extension. Consider a banking application which supports two 'banking stations'. What changes are required to support additional stations? Let us say the application contains a class called BSTATION.


Each banking station is an instance of this class. To incorporate the extension:
- We create another instance of BSTATION. This instance will share the code for the methods with the other instances, but will have its own copies of the data.
- Use of the new station is integrated into the software, i.e. provision is made to direct some of the workload at the new station.

This is extension by scaling. It is made simple by the notion of a class: we can achieve scaling by merely declaring another variable of the class.

Semantic extension may involve changing the behaviour of existing entities in an application, or defining new kinds of behaviour for an entity. For example, we may want to compute the average age of students in years, months and days, or we may want to compute the average marks of students. Let us say these requirements necessitate changes to the entity 'Persons'. The classical process is to declare that the entity is under maintenance. This implies:
- Applications using 'Persons' must be held in abeyance. They cannot be used until the maintenance is complete.
- 'Persons' must then be modified.
- It must then be extensively tested.
- If necessary, each application using 'Persons' should be retested.
- Now existing applications using 'Persons' can be resumed.
- New applications using 'Persons' can be developed, tested, etc.

Thus, this form of maintenance for extension is obtrusive: it interrupts the running applications. Avoiding obtrusion leads to different versions of the software. Semantic extension using an object-oriented methodology does not suffer from this problem. Applications using an entity are not affected by the extension and can continue to run while the entity is being extended. Such applications need not be retested nor revalidated. This is achieved through the notion of inheritance.
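A rough sketch of extension by scaling (the BSTATION class body is assumed, not taken from the original application):

#include <vector>

class BSTATION { /* banking-station data and methods */ };

int main() {
    std::vector<BSTATION> stations(2);   // the original two banking stations
    stations.emplace_back();             // the extension: simply one more instance
    // Every instance shares the code of the class methods,
    // but each holds its own copy of the data.
}

Semantic extension, by contrast, is obtained by deriving a new class from the existing one, so the existing class and the applications that use it are left untouched.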

12. Conclusions

Thus the object-oriented approach has many advantages over the structured methodology. There are many languages which support object-oriented programming; some of the popular ones are Smalltalk, Eiffel, Ada, C++, and Java. The choice of language will usually be based on business requirements. The object-oriented approach itself is most suitable in business areas which are:
- rapidly changing
- in need of fast response to changes
- complex in requirements or implementation
- repeated across the company with variations
- built around long-life applications that must evolve
