Intelligent Heart Disease Prediction System Using Naïve Bayes
Synopsis
A major challenge facing healthcare organizations (hospitals, medical centers) is
the provision of quality services at affordable costs. Quality service implies diagnosing
patients correctly and administering treatments that are effective. Poor clinical decisions
can lead to disastrous consequences which are therefore unacceptable. Hospitals must
also minimize the cost of clinical tests. They can achieve these results by employing
appropriate computer-based information and/or decision support systems.
Most hospitals today employ some sort of hospital information system to manage
their healthcare or patient data. These systems are designed to support patient billing,
inventory management and generation of simple statistics. Some hospitals use decision
support systems, but they are largely limited.
Clinical decisions are often made based on doctors’ intuition and experience
rather than on the knowledge-rich data hidden in the database. This practice leads to
unwanted biases, errors and excessive medical costs, which affect the quality of service
provided to patients.
The main objective of this research is to develop an Intelligent Heart Disease
Prediction System using a data mining modeling technique, namely Naïve Bayes. It
is implemented as a web-based questionnaire application. Based on the user's answers, it can
discover and extract hidden knowledge (patterns and relationships) associated with heart
disease from a historical heart disease database. It can answer complex queries for
diagnosing heart disease and thus assist healthcare practitioners in making intelligent
clinical decisions, which traditional decision support systems cannot. By providing
effective treatments, it also helps to reduce treatment costs.
HARDWARE CONFIGURATION
• Intel Pentium IV
• 256/512 MB RAM
• 1 GB Free disk space or greater
• 1 GB on Boot Drive
• 17” XVGA display monitor
• 1 Network Interface Card (NIC)
SOFTWARE CONFIGURATION
• MS Windows XP/2000
• MS IE Browser 6.0/later
• MS DotNet Framework 2.0
• MS Visual Studio.Net 2005
• Internet Information Server (IIS)
• MS SQL Server 2000
• Windows Installer 3.1
SOFTWARE FEATURES
C#.Net
C# is an object-oriented programming language developed by Microsoft as part of the
.Net initiative. C# is intended to be a simple, modern, general-purpose, object-oriented
programming language. Because software robustness, durability and programmer
productivity are important, the language should include strong type checking, array
bounds checking, detection of attempts to use uninitialized variables, source code
portability, and automatic garbage collection.
C# is intended to be suitable for writing applications for both hosted and embedded
systems, ranging from the very large that use sophisticated operating systems, down to
the very small having dedicated functions. C# applications are intended to be economical
with regard to memory and processing power requirements. Programmer portability is
a very important feature of C#.
A C# compiler could generate machine code like traditional compilers for C++ or
FORTRAN; in practice, all existing C# implementations target the Common Language
Infrastructure (CLI). C# is more type safe than C++. The only implicit conversions by
default are those which are considered safe, such as widening of integers and conversion
from a derived type to a base type.
C# is the programming language that most directly reflects the underlying Common
Language Infrastructure (CLI). Most of C#'s intrinsic types correspond to value types
implemented by the CLI framework. C# supports a strict boolean type, bool. Statements
that take conditions, such as while and if, require an expression of a boolean type.
FEATURES OF C#.NET
Visual Studio.Net is a tool-rich programming environment containing all the
functionality for handling C# projects.
The .Net integrated development environment provides enormous advantages for
programmers.
C# is directly related to C, C++ and Java. C# is a case-sensitive language and it is
designed to produce portable code.
C# includes features that directly support the constituents of components, such as
properties, methods and events.
C# is an object-oriented language which supports all object-oriented programming
(OOP) concepts such as encapsulation, polymorphism and inheritance.
Encapsulation is a programming mechanism that binds code and the data it
manipulates together, and keeps both safe from outside interference and misuse.
Polymorphism is the quality that allows one interface to access a general class of
actions.
Inheritance is the process by which one object can acquire the properties of
another object.
IntelliSense displays the names of all members of a class.
C# allows us to create both managed and unmanaged applications.
Interoperability, simplicity, performance, cross-language integration and language
independence are important features of C#.
Multiple inheritance is not supported, although a class can implement any number
of interfaces.
There are no global variables or functions. All methods and members must be
declared within classes.
FEATURES OF WINDOWS XP
The major features of Windows XP Professional:
Reliable
Easy to use and Maintain
Internet Ready
Windows File protection
Protects core system files from being overwritten by application installs. In the
event a file is overwritten, Windows File Protection will replace that file with the correct
version. By safeguarding system files in this manner, Windows XP mitigates many
common system failures found in earlier versions of Windows.
Driver Certification
Provides safeguards to assure that device drivers have not been tampered with,
reducing the risk of installing non-certified drivers.
Full 32-bit Operating System
Minimizes the chance of application failures and unplanned reboots.
Microsoft Installer
Works with the Windows Installer service, helping users install, configure, track,
upgrade, and remove software programs correctly, minimizing the risks of user error and
possible loss of productivity.
System Preparation Tool (SysPrep)
Helps administrators clone computer configurations, systems, and applications,
resulting in simpler, faster, more cost-effective deployment.
Remote OS Installation
Permits Windows XP Professional to be installed across the network (including
SysPrep images). Remote OS Installation saves time and reduces deployment costs.
Multilingual Support
Allows users to easily create, read, and edit documents in hundreds of languages.
Faster Performance
Provides 25 percent faster performance than Windows 95 and Windows 98 on
systems with 64 megabytes (MB) or more of memory.
Faster Multitasking
Uses a full 32-bit architecture, allowing you to run more programs and perform
more tasks at the same time than Windows 95 or Windows 98.
Scalable Memory and Processor Support
Supports up to 4 gigabytes (GB) of RAM and up to two symmetric
multiprocessors.
Peer-to-Peer Support for Windows 95/98 and Windows NT
Enables Windows 2000 Professional to interoperate with earlier versions of
Windows on a peer-to-peer level, allowing the sharing of all resources, such as folders,
printers and peripherals.
Internet Information Services (IIS) 5.0
Includes Web and FTP server support, as well as support for FrontPage
transactions, Active Server Pages, and database connections. Available as an optional
component, IIS 5.0 is installed automatically for those upgrading from versions of
Windows with Personal Web Server installed.
Search Bar
Helps users quickly search for different types of information, such as Web pages
or people's addresses, and choose which search engine to use, all from one
location.
History Bar
Helps users find their way back to sites viewed in the past. The History bar
tracks not only Web sites but also intranet sites, network servers and local folders.
Favorites
Helps users find and organize relevant information, whether it is stored in files,
folders or Web sites.
Strong Development Platform
Support for Dynamic HTML Behaviors and XML gives developers the broadest
range of options – with the fastest development time.
SQL SERVER 2000
The SQL Server Web Data Administrator enables us to easily manage
our SQL Server data wherever we are. Using its built-in features, we can administer
the database from Microsoft Internet Explorer or our favorite web browser. Microsoft SQL Server
2000 introduces several server improvements and new features:
XML Support
The relational database engine can return data as Extensible Markup Language (XML)
documents. Additionally, XML can also be used to insert, update, and delete values in the
database.
Federated Database Servers
SQL Server 2000 supports enhancements to distributed partitioned views that allow you
to partition tables horizontally across multiple servers. This allows you to scale out one
database server to a group of database servers that cooperate to provide the same
performance levels as a cluster of database servers. This group, or federation, of database
servers can support the data storage requirements of the largest Web sites and enterprise
data processing systems.
SQL Server 2000 introduces Net-Library support for Virtual Interface Architecture (VIA)
system-area networks that provide high-speed connectivity between servers, such as
between application servers and database servers.
Existing system
Clinical decisions are often made based on doctors’ intuition and experience
rather than on the knowledge-rich data hidden in the database.
This practice leads to unwanted biases, errors and excessive medical costs, which
affect the quality of service provided to patients.
There are many ways that a medical misdiagnosis can present itself. Whether a
doctor is at fault, or hospital staff, a misdiagnosis of a serious illness can have
very extreme and harmful effects.
The National Patient Safety Foundation reports that 42% of medical patients feel
they have experienced a medical error or missed diagnosis. Patient safety
is sometimes negligently given a back seat to other concerns, such as the cost
of medical tests, drugs, and operations.
Medical misdiagnoses are a serious risk to our healthcare profession. If they
continue, then people will fear going to the hospital for treatment. We can put an
end to medical misdiagnosis by informing the public and filing claims and suits
against the medical practitioners at fault.
Proposed Systems
The existing practice leads to unwanted biases, errors and excessive medical costs,
which affect the quality of service provided to patients.
We therefore propose that integrating clinical decision support with computer-based
patient records could reduce medical errors, enhance patient safety,
decrease unwanted practice variation, and improve patient outcomes.
This suggestion is promising because data modeling and analysis tools, e.g., data
mining, have the potential to generate a knowledge-rich environment which can
help to significantly improve the quality of clinical decisions.
The main objective of this research is to develop a prototype Intelligent Heart
Disease Prediction System (IHDPS) using three data mining modeling techniques,
namely, Decision Trees, Naïve Bayes and Neural Network.
By providing effective treatments, it also helps to reduce treatment costs. To
enhance visualization and ease of interpretation,
Modules:
Analyzing the Data set:
A data set (or dataset) is a collection of data, usually presented in tabular form.
Each column represents a particular variable. Each row corresponds to a given member of
the data set in question. It lists values for each of the variables, such as height and weight
of an object or values of random numbers. Each value is known as a datum. The data set
may comprise data for one or more members, corresponding to the number of rows.
The values may be numbers, such as real numbers or integers, for example
representing a person's height in centimeters, but may also be nominal data (i.e., not
consisting of numerical values), for example representing a person's ethnicity. More
generally, values may be of any of the kinds described as a level of measurement. For
each variable, the values will normally all be of the same kind. However, there may also
be "missing values", which need to be indicated in some way.
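The tabular layout described above can be sketched as follows. This is only an illustration in Python, not the project's implementation: the column names, the sample values and the choice of None as the missing-value marker are all assumptions made for the example.

```python
# Hypothetical example records; column names and values are illustrative only.
MISSING = None  # marker chosen here to flag a missing value

columns = ["height_cm", "weight_kg", "ethnicity"]  # one entry per variable
rows = [                                           # one entry per member
    [172.0, 68.5, "A"],
    [165.0, MISSING, "B"],
    [180.5, 82.0, "A"],
]

def column_values(rows, columns, name):
    """Return all values of one variable (one column of the table)."""
    idx = columns.index(name)
    return [row[idx] for row in rows]

def count_missing(rows, columns, name):
    """Count how many members lack a value for the given variable."""
    return sum(1 for v in column_values(rows, columns, name) if v is MISSING)
```

Here height_cm and weight_kg hold numerical values, ethnicity holds nominal values, and the second member has a missing weight.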
A total of 909 records with 15 medical attributes (factors) were obtained from
the Heart Disease database, which lists the attributes. The records were split equally into two
datasets: a training dataset (455 records) and a testing dataset (454 records). To avoid bias,
the records for each set were selected randomly.
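A random, near-equal split of this kind might look like the sketch below. It is written in Python purely for illustration; the function name split_records, the seed parameter and the integer stand-ins for patient records are assumptions, and the source does not say how the actual selection was performed.

```python
import random

def split_records(records, seed=None):
    """Shuffle a copy of the records and split it into two near-equal halves."""
    shuffled = list(records)                # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)   # random order avoids selection bias
    half = (len(shuffled) + 1) // 2         # first half gets the extra record if odd
    return shuffled[:half], shuffled[half:]

# Integers stand in for the 455 + 454 patient records mentioned in the text.
records = list(range(909))
training, testing = split_records(records, seed=42)
print(len(training), len(testing))  # prints "455 454"
```

Fixing the seed makes the split repeatable, which is convenient when results from different runs need to be compared.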
The attribute “Diagnosis” was identified as the predictable attribute with value
“1” for patients with heart disease and value “0” for patients with no heart disease. The
attribute “PatientID” was used as the key; the rest are input attributes. It is assumed that
problems such as missing data, inconsistent data, and duplicate data have all been
resolved.
In our project, the data set is stored in a .dat file. A file reader program reads the
records from this file and supplies them as input to the Naïve Bayes based mining process.
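The source does not describe the layout of the .dat file, so the following Python sketch simply assumes comma-separated text with a header row. The function name load_records and the columns Age and Sex are hypothetical; only PatientID (the key) and Diagnosis (the predictable attribute) come from the text.

```python
import csv
import io

def load_records(text, key="PatientID", target="Diagnosis"):
    """Parse delimited text into (key, input attributes, label) triples.

    Assumes a comma-separated layout with a header row; the real .dat
    format is not specified in the source.
    """
    reader = csv.DictReader(io.StringIO(text))
    records = []
    for row in reader:
        patient_id = row.pop(key)        # key attribute, not used for prediction
        label = int(row.pop(target))     # 1 = heart disease, 0 = no heart disease
        records.append((patient_id, dict(row), label))  # the rest are inputs
    return records

# Hypothetical file contents for illustration.
sample = "PatientID,Age,Sex,Diagnosis\n1,63,M,1\n2,41,F,0\n"
records = load_records(sample)
```

Separating the key and the predictable attribute at load time keeps the remaining columns available as input attributes for the mining step.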
Naïve Bayes Implementation in Mining:
For a more in-depth introduction to density estimation and the general use of Bayes
classifiers, with Naïve Bayes classifiers as a special case, we recommend Probability
For Data Mining. This section gives only a summary of learning and using Naïve Bayes
classifiers on categorical attributes.
Bayes' Theorem finds the probability of an event occurring given the probability
of another event that has already occurred. If B represents the dependent event and A
represents the prior event, Bayes' theorem can be stated as follows.
Bayes' Theorem:
Prob(B given A) = Prob(A and B)/Prob(A)
To calculate the probability of B given A, the algorithm counts the number of cases
where A and B occur together and divides it by the number of cases where A occurs
(with or without B).
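The counting rule above, and the way a Naïve Bayes classifier builds on it, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the system's actual implementation: the function names, the attributes chest_pain and smoker, and the five training cases are all hypothetical, and no Laplace correction is applied, so a zero count drives the whole product to zero.

```python
from collections import Counter

def cond_prob(cases, a, b):
    """Estimate Prob(b given a) as count(a and b) / count(a) over the cases."""
    count_a = sum(1 for c in cases if a(c))
    count_ab = sum(1 for c in cases if a(c) and b(c))
    return count_ab / count_a if count_a else 0.0

def naive_bayes_predict(train, example):
    """Return the class maximizing Prob(class) * product of Prob(attr=value given class)."""
    class_counts = Counter(label for _, label in train)
    best_label, best_score = None, -1.0
    for label, n_label in class_counts.items():
        score = n_label / len(train)  # prior Prob(class)
        for attr, value in example.items():
            # Attributes are treated as independent given the class (the "naive" step).
            score *= cond_prob(
                train,
                a=lambda c, label=label: c[1] == label,
                b=lambda c, attr=attr, value=value: c[0][attr] == value,
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical training cases: (input attributes, Diagnosis).
train = [
    ({"chest_pain": "yes", "smoker": "yes"}, 1),
    ({"chest_pain": "yes", "smoker": "no"}, 1),
    ({"chest_pain": "no", "smoker": "yes"}, 0),
    ({"chest_pain": "no", "smoker": "no"}, 0),
    ({"chest_pain": "yes", "smoker": "no"}, 0),
]
```

With these cases, three patients report chest pain and two of them have Diagnosis 1, so cond_prob estimates Prob(Diagnosis = 1 given chest pain) as 2/3. For a new patient with chest pain who smokes, the score for class 1 is 2/5 × 1 × 1/2 = 0.2, against 3/5 × 1/3 × 1/3 ≈ 0.067 for class 0, so the sketch predicts class 1.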
Applying Naïve Bayes to data with numerical attributes and using the Laplace
correction (data with some numerical
attributes), predict the class of the following new example using Naïve Bayes
classification: with some numerical attributes), predict the class of the following new