Reliability Assessment of WEB Applications

V. S. Alagar, O. Ormandjieva
Department of Computer Science
Concordia University
Montreal, Quebec H3G 1M8, Canada
Phone: +1(514) 848-7810
{alagar,ormandj}@cs.concordia.ca

February 18, 2002

This work is supported by grants from Natural Sciences and Engineering Research Council, Canada and Concordia University Graduate Fellowships.

Abstract

The paper discusses a formal approach for specifying time-dependent Web applications and proposes a Markov model for reliability prediction. Measures for predicting reliability are calculated from the formal architectural specification and system configuration descriptions.

Keywords: reliability prediction, software measurement, Markov model.

1 Introduction
The reliability of a software system is defined in [IEEE90] as the ability to perform the required functionality
under stated conditions for a specified period of time. In this paper the software system under discussion is
a Web-based system. The Web is a large and complex distributed system whose heterogeneous components
interact in various ways to achieve the result of an application. Often, the performance of an application
initiated at a site is rated as good if the server at that site is robust and links are not broken. Such a rating
gives a subjective qualitative assessment, but does not provide a scientific quantitative measurement of the
reliability of the site. This paper proposes a methodology for assessing the quality of Web application
components through reliability prediction, when a formal model of the Web application can be specified
in an object-oriented formalism.
Many techniques exist to test and statistically analyze traditional software. However, these methods
cannot be readily applied to a Web environment. In a recent paper Kallepalli and Tian [KT2001] have
surveyed the characteristics of Web applications and usage and proposed a statistical testing method for
Web applications. Their approach relies on usage and failure information collected in the log files. Web
failure is defined as the inability to correctly deliver information or documents required by Web users.
Based on this definition of failure, they classify types of failures and provide a method for testing source
or content failures. We complement their work by offering a formal time-constrained model of the Web on
which testing and reliability analysis can be done.
Quality assurance and reliability assessment for Web applications should focus on the prevention of Web
failures or the reduction of chances for such failures. Consequently, we contend that early reliability
assessment is necessary for the reduction of testing efforts, and for ensuring a level of operational reliability.
We propose an early analysis on the formal architecture model of the Web application. This uses a Markov
model, which can adapt to changing system configurations that satisfy the architectural design. We may
view Web applications as Markov systems, in which state changes occur with certain probabilities. From
the Markov model of an application, we can calculate a predictive measurement of reliability. Markov
matrices for individual Web components can be constructed from log files, the source used for statistical
testing in [KT2001]. We give methods for calculating Markov matrices for synchronously interacting Web
components from the Markov matrices of individual components. The synchrony hypothesis is that two Web
components that interact on a shared message will change their status (states and associated information)
simultaneously. In a typical application, several Web components collaborate to achieve a task. It is
important to assess the reliability of every collaboration in an application. We provide a method to compute the
reliability of the whole system from the reliability measures of the collaborations in the system.
2 Web and Markov Models: Basic Concepts
The Web is a large network of interconnected components. Conceptually, it is a graph where each vertex
(node) is a computer system providing an interface to the other nodes in the network. A Web application
is multi-layered, with the user at the top of the layer and the information source at the bottom of the layer.
A user interacts with the nodes in the Web through a browser, which has links to the home pages at the
interfaces of the nodes in the network. The server at a node provides the services for controlled navigation
for accessing and retrieving information from the information sources at that node. We model the Web
components as User, Browser, and Server classes. Objects, instantiated from the classes, interact in a
meaningful way through messages. The behavior of objects in a class is captured by a hierarchical labeled
transition system with a finite number of states. A state represents an operational high-level unit. A transition
between two states may be labeled by a message shared by objects of different classes, or by an event internal
to the object. A state, when complex, is itself a hierarchical labeled transition system, with its substates and
transitions defined by the buttons and the whistles specific to that state. That is, a transition from a substate
$\theta$ to another substate $\theta'$ of an object is implicitly labeled by the link name on the page associated with
$\theta$.
Some Web applications may put time constraints on the navigation paths within their systems. This is
typical when secure information or time-varying information is to be made available. For instance, consider
the home page of a hypothetical on-line brokerage system HOBS. A user may be able to reach the home
page of HOBS using a browser. However, the user must be authenticated to get services at the site. Once
authenticated, the user may be authorized to do one or more business activities at each state of the server
object. A state may include secure information such as user-id, account-number, and account-balance.
Typical states of the HOBS system where time may play a role are Stock Trading, Mutual Fund Trading,
Account Overview, Positions, Quotes and Research, Financial Planning, and Services.
The architecture of the HOBS design may allow the user to explore the substates of a state or change to a
different state as specified by the links on the current page. For instance, the hierarchy of states rooted at
the state Stock Trading may impose timing constraints on the information displayed at its substates. After
reviewing an order at one of its substates, the user may be allowed to change, review or cancel the order.
If the activity at the state review is not completed within a certain amount of time, the backward transition
may be disabled. There are two reasons for this: (1) the page contains secure information, and (2) the
information, such as a stock price, is a time-dependent value. Another instance where time plays a role is
when the user fails to interact for a certain period of time in a state. The system, after waiting for a period
of time, may force a new login session. These instances illustrate the failure of the system to deliver the
information requested by the user. However, this type of failure is not a fault of the system. The behavior of
the system deviates from the user-expected behavior of the system, yet the system behaves according to the
time-constrained functionality imposed by system requirements. In order to model such applications and
their reliability our formal model includes time constraints.
2.1 Markov models
Markov models are one of the most powerful tools available to engineers and scientists for analyzing
complex systems. Analysis of Markov models yields results for both the time-dependent evolution of the system
and the steady state properties of the system. The Markov property states that given the current state of the
system, the future evolution of the system is independent of its history.
The Markov model of a Web component may be represented by a state diagram. The states represent the
stages in the Web component that are observable to the users and the transitions between states have assigned
probabilities. The probabilities are calculated from the usage and related failure information collected in the
log file maintained at the Web site. We may use this data as initial transition probabilities. An algebraic
representation of a Markov model is a matrix, called transition matrix, in which the rows and columns
correspond to the states, and the entry $p_{ij}$ in the $i$-th row, $j$-th column is the transition probability for being
in state $j$ at the stage following state $i$. We use the transition matrix representation in reliability calculation
algorithms.
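As an illustration of how initial transition probabilities might be derived from logged usage, the following minimal sketch counts observed state-to-state moves and normalizes each row. The function name `transition_matrix` and the two sample sessions are hypothetical; they are not taken from the paper or from [KT2001], and a real log would first have to be mapped onto the states of the model.

```python
def transition_matrix(sessions, states):
    """Estimate p_ij as (# observed moves i -> j) / (# observed moves out of i)."""
    index = {s: k for k, s in enumerate(states)}
    counts = [[0] * len(states) for _ in states]
    for session in sessions:
        # Each consecutive pair of logged states is one observed transition.
        for src, dst in zip(session, session[1:]):
            counts[index[src]][index[dst]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Assumption of this sketch: a state never left in the log keeps an all-zero row.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Hypothetical log data: two user sessions expressed as sequences of state names.
STATES = ["idle", "toAccess", "access", "leave"]
SESSIONS = [
    ["idle", "toAccess", "access", "leave", "idle"],
    ["idle", "toAccess", "idle"],  # made-up session in which access was not obtained
]

P = transition_matrix(SESSIONS, STATES)
for name, row in zip(STATES, P):
    print(name, row)
```

Each row of the resulting matrix sums to 1 for every state that has at least one outgoing transition in the log.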
2.2 Discussion
Initial transition probabilities, obtained from various sources including log files and other subjective opinions
of experts, cannot be used for predicting the reliability of the system. We contend that the reliability should
be calculated from the steady state of the Markov system. A steady state or equilibrium state is one in which
the probability of being in a state before and after transitions is the same as time progresses. Computing the
steady state vector for the transition matrix of a large system is hard. However, as in our approach, when the
system is modularly constructed it seems possible to partition the system into smaller components, which
might reduce the complexity of computing steady state vectors.
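A steady state vector $\pi$ satisfies $\pi P = \pi$ with entries summing to 1. One common way to approximate it, shown here only as a sketch and not as the method used in the paper, is power iteration: start from any distribution and apply the transition matrix repeatedly. The sketch assumes the chain is irreducible and aperiodic, since otherwise the iteration need not converge; the 2-state matrix is an invented example.

```python
def steady_state(P, tol=1e-12, max_iter=10_000):
    """Approximate pi with pi = pi * P by repeated multiplication (power iteration)."""
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    raise RuntimeError("power iteration did not converge")


# Invented 2-state example: rows are current states, columns are next states.
P_EXAMPLE = [
    [0.9, 0.1],
    [0.5, 0.5],
]
PI = steady_state(P_EXAMPLE)
print(PI)
```

For this example the exact solution of $\pi P = \pi$ is $\pi = (5/6, 1/6)$, which the iteration approaches. For large systems the cost is dominated by the repeated vector-matrix products, which is where partitioning into smaller components could help.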
The formal model of the Web that we discuss in the next section is based on timed labeled transition
system semantics. From the state machine description of a Web component, it is possible to construct the
Markov machine corresponding to that model.
The organization of this paper is as follows. A formal model of the Web is given in Section 3. Section 4
formally describes the method of modeling the Web application as a Markov system. Section 5 presents the
reliability prediction measures. Section 6 concludes the paper with a discussion on our ongoing research
configurations for two distinct applications in a system. In the rest of the paper we use the term object to
mean incarnation as well.
The behavior of objects in a class is specified by a finite state machine, augmented with state hierarchy,
logical assertions and timing constraints for transitions. A complex state is an encapsulation of a state
hierarchy, and hence another finite state machine, with an initial state, and which can include other complex
states. In our model, Web objects communicate using a synchronous message passing mechanism. An
external event in the system is either an input or an output event, which can only occur at an instance
of a specific port type. Events label the transitions between states. Logical assertions on the attributes
specify a port condition, an enabling condition, and a post condition on each transition. Local clocks are
defined to enforce time constraints associated with a transition. Both time constraints and functionality are
encapsulated in an object.
An abstract model of a Web system is specified as a collection of interacting Web components. A
Web component is an object instantiated from a generic class. A pair of objects in this collection interact
synchronously through shared messages. These messages occur at the compatible ports. Two ports in a
system are compatible if the set of input messages at one port is equal to the set of output messages at the other
port. A port link connects two compatible ports. A port link is an abstraction of the communication mechanism
between the objects associated with the ports. Since the signatures of ports are well-defined, the port links
effectively determine the set of all valid messages that can be exchanged among the objects in a subsystem.
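The compatibility rule for ports can be sketched as a small check on message sets. The `Port` class, its field names, and the choice to require the condition in both directions are assumptions of this sketch; the message names Get and Exit are the ones exchanged between the user and the browser, while Ping is a made-up message used only to show a mismatch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    """Hypothetical port description: a name plus input and output message sets."""
    name: str
    inputs: frozenset
    outputs: frozenset


def compatible(p, q):
    # Inputs at one port must equal outputs at the other; this sketch checks both directions.
    return p.inputs == q.outputs and q.inputs == p.outputs


user_port = Port("user", inputs=frozenset(), outputs=frozenset({"Get", "Exit"}))
browser_port = Port("browser", inputs=frozenset({"Get", "Exit"}), outputs=frozenset())
other_port = Port("other", inputs=frozenset({"Ping"}), outputs=frozenset())

print(compatible(user_port, browser_port))  # a port link may connect these two
print(compatible(user_port, other_port))    # no port link is allowed here
```

A port link would then be created only between pairs of ports for which `compatible` holds, so the set of valid messages in a subsystem follows from the links.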
3.1 Operational Semantics
Web objects communicate through messages. A message from an object to another object in the system is
called a signal and is represented by a tuple $\langle e, p, t \rangle$, denoting that the event $e$ occurs at time $t$,
at a port $p$. The status of an object at any time $t$ is the tuple $(\theta; a; R)$, where the current state $\theta$
is a simple state, $a$ is the assignment vector for attributes, and $R$ is the vector of outstanding reactions.
A computational step of an object occurs when the object with status $(\theta; a; R)$ receives a signal
$\langle e, p, t \rangle$ and there exists a transition specification that can change its status. A computation $c$
of an object $A$ is a sequence, possibly infinite, of alternating statuses and signals,
$OS_0, \langle e_0, p_0, t_0 \rangle, OS_1, \langle e_1, p_1, t_1 \rangle, \ldots$ Typically, the Web system is
non-terminating; consequently, a computation is in general an infinite sequence. The set of all computations
of an object $A$ is denoted by $Comp(A)$. The computation of the Web system is an infinite sequence of
system statuses and signals that effect status changes [AAM98]. A period is a finite subsequence of the Web
computation such that it starts with some initial state and finishes with its next appearance in the computation
sequence.
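The notion of a period can be illustrated on a simplified trace. The sketch below keeps only the state names of a computation (signals, attribute values and reactions are dropped, which is a simplification made here) and cuts the trace at each reappearance of the initial state. The function name `periods` and the sample trace are hypothetical.

```python
def periods(states, initial):
    """Split a state trace into periods: initial state ... next occurrence of initial state."""
    result, current = [], None
    for s in states:
        if s == initial:
            if current is not None:
                current.append(s)       # the period finishes with the next appearance
                result.append(current)
            current = [s]               # and a new period starts at the same state
        elif current is not None:
            current.append(s)
    return result                       # a trailing, unfinished period is not reported


# Hypothetical trace of a user object that completes two transactions and starts a third.
TRACE = ["idle", "toAccess", "access", "leave",
         "idle", "toAccess", "access", "leave",
         "idle", "toAccess"]
PERIODS = periods(TRACE, "idle")
for p in PERIODS:
    print(p)
```

Because a real computation is in general infinite, only finite prefixes such as logged traces can be split this way.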
3.2 A Simple Model of the Web
We abstract the multi-layered architecture of Web applications into three Web components: User, Browser,
and Server. This abstraction, although simple, is quite expressive and sufficient to illustrate the reliability
calculation. Extensions of our approach to more complex and detailed models are not difficult.
In our model, we assume that several users (clients) may use a browser independently and concurrently
to access information from a server. For simplicity, we assume that one browser is associated with a server.
Once again, this restriction is only for the sake of simplicity of exposition, and can be generalized. A user
chooses the server of his choice and initiates a request to a server. That is, the user sends a message to
the corresponding browser, which then commands the server to allow the connection. When the last user
requesting access to a server disconnects, the browser commands the server to close. During this period,
the user-browser-server interaction must work without fault. The security (expressed as a safety property)
requires that the operation of the system satisfies certain timing constraints, the server remains open, and
provides the requested information (not violating time constraints) during every period of transaction.

Figure 1: Class Diagram for User, Browser and Server Entities.
A high-level class structure diagram of the model in UML-based notation is shown in Figure 1. The
User class has one port type with signature {Get, Exit}. The Browser class has two port types: @P, with
signature {Get, Exit}, and a second port type, with signature {Permit, StopPermit}. The Server class has one port
type with signature {Permit, StopPermit}. The figure shows that a port type, modeled as a class, has
an aggregation relationship with the class for which it is intended. An association relationship between
compatible port types is shown. A port identifier is declared as a variable of a port type in the User class, and a
variable of type Set is introduced in the Browser class. Time constraints and functionality of objects of classes
are described in statechart diagrams. A formal specification includes structural and behavioral information.
User Model
The statechart diagram for User is shown in Figure 2(a). The significant states of a User object are idle,
toAccess, access, leave. At any instant, a user is in one of these states. In the idle state, a user has not
initiated any request. To access the server, the user, in state idle, sends the event Get to the browser it uses
and changes his state to toAccess. In state toAccess, the attribute cr is set to pid, the identifier of the
port where Get occurs. This transition is the constraining transition for two time constraints, labeled TCvar1
and TCvar2. Within 2 to 4 units of time of outputting the request (specified by TCvar1), the user accesses
the server. That is, the user changes his state to access by initiating the internal event In. The state leave
is reached when the user has retrieved the information requested, and this happens within 6 units of time
(specified by TCvar2) from the instant the user requested access to the server. The user sends the message Exit to the browser and reaches the initial state. The formal specification of the User class is shown in
Figure 5: System Configuration Models and Specifications
A system configuration specification defines objects instantiated from the three classes and their
interactions. Figure 5(a) is a collaboration diagram in UML style for a linear system with one user object, one
browser object, and one server object. The formal specification for this system is shown in Figure 5(b). A
more complex system, that is non-linear, consisting of five users, two browsers and two servers is shown
in Figure 5(c). In this configuration, user3 is allowed to access both browsers, while the other user objects
interact with only one browser. The formal specification for this subsystem configuration is shown in Figure
Figure 7: Markov Model and State Transition Matrix for Synchronous Product of User and Browser
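Step 7. If $x_1$ […] $x_2$:
    For each $(\cdot', \cdot') \in st_1$ do
        $M[(\cdot, \cdot)(\cdot', \cdot')]$ […] $M[(\cdot, \cdot)(\cdot', \cdot')]$ […] $N$
Step 8. If $x_1$ […] $x_2$ do
    For each $(\cdot', \cdot') \in st_2$ do
        $M[(\cdot, \cdot)(\cdot', \cdot')]$ […] $(1$ […] $N) \times M[(\cdot, \cdot)(\cdot', \cdot')]$ […] $N'$
Step 9. Fill in the matrix $M$ with 0 where there are no entries.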
The Markov model and transition probability matrix for the synchronous product of User and Browser
objects are shown in Figure 7.
4.3 Step 3: Markov Model for a System
A partitioning method given in [O2002] is the basis of our discussion in this section. A system
configuration, when partitioned, produces two types of subsystem components: (1) linear subsystem configuration,
as shown in Figure 8(a), and (2) non-linear subsystem configuration as shown in Figure 9(a).
4.4 Case 1: Linear System
In a linear system, objects synchronize in the past. If $o_1, \ldots, o_n$ are objects in the linear system and
$M_1, \ldots, M_n$ are respectively their transition matrices, then the transition matrix $M$ of the linear system
is computed as follows:
1. Compute $M = M_1 \otimes M_2$ (Apply Algorithm SPM)
2. for $i = 3$ to $n$ compute $M = M \otimes M_i$ (Apply Algorithm SPM)
The Markov model and the transition matrix for the linear system (Figure 5(a)) are shown in Figure 8(b).
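The two-step procedure for a linear system is a left fold of a binary product over the list of component matrices. The sketch below shows only that folding structure. The product is passed in as a parameter, and the Kronecker product used in the example is a stand-in that treats components as independent; it is not Algorithm SPM, which additionally accounts for synchronization on shared messages. The function names and the sample matrices are hypothetical.

```python
from functools import reduce


def kron(A, B):
    """Kronecker product of two matrices given as lists of rows (stand-in, not Algorithm SPM)."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def linear_system_matrix(matrices, product):
    """Compute M = (...((M1 (x) M2) (x) M3)...) (x) Mn by folding `product` from the left."""
    return reduce(product, matrices)


# Invented 2-state transition matrices for three components.
M1 = [[0.5, 0.5], [1.0, 0.0]]
M2 = [[0.2, 0.8], [0.6, 0.4]]
M3 = [[0.0, 1.0], [0.3, 0.7]]

M = linear_system_matrix([M1, M2, M3], kron)
print(len(M), "x", len(M[0]))
```

Replacing `kron` by an implementation of the synchronous product would leave `linear_system_matrix` unchanged, which is the point of separating the fold from the product.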
reconfigured system is less than the uncertainty that existed in the current system. The reliability
measurement will allow the reconfigured system to be deployed. However, if $Reliability(\cdot)$ […] $Reliability(\cdot_1)$,
then there is more uncertainty present in the reconfiguration. This would suggest determining the
subsystem(s) of […] that are responsible for lowering the overall reliability.
6 Conclusions and Research Directions
The main result of this paper is a formal approach to calculate the reliability of a time-dependent Web
application. The Web model discussed in the paper is simple, yet representative of the different Web layers.
The model can be generalized to include more Web components:
• a browser linked to several servers,
• users interacting with Agents, who in turn interact with browsers/servers, and
• servers protected by firewalls, and hence a model of the firewall will have to be included as well.
In a practical setting, the number of Web components and their interactions will be large. There are also other
factors such as resource constraints, load factor, and communication complexity. From a reliability point of
view, we require a good formal model which takes these factors into account. In the formal model proposed
in this paper the load factor and communication delays can be brought in as synchronization constraints, and
resources can be modeled within each class (such as the Set in Browser class) and timing constraints may be
imposed on database transactions. Calculation of transition probabilities for large evolving configurations
involves multiplying fairly large matrices. The density of the transition probability matrix of a system
depends on the number of transitions in the product matrix, which due to synchronization constraints, might
be sparse. The sparsity of the matrix and the availability of very fast powering and multiplication algorithms
for matrices may be used to speed up reliability calculation for changing configurations.
One of our goals is to empirically evaluate the reliability model. This is one aspect of our ongoing study
in metrics and measurements for real-time reactive systems.
References
[AAM98] V. S. Alagar, R. Achuthan, D. Muthiayen. TROMLAB: A Software Development Environment for Real-Time Reactive Systems. Technical Report (first version 1996, revised 1998), Concordia University, Montreal, Canada.
[IEEE90] IEEE Standard Glossary of Software Engineering Terminology. IEEE Std 610.12-1990.
[KT2001] Chaitanya Kallepalli, Jeff Tian. Measuring and Modeling Usage and Reliability for Statistical Web Testing. IEEE Transactions on Software Engineering, Vol. 27, No. 11, Nov. 2001, pp. 1023–1036.
[O2002] Olga Ormandjieva. Quality Measurement for Real-Time Reactive Systems. Ph.D. thesis, Department of Computer Science, Concordia University, Montreal, Canada, January 2002.