v DATABASE MANAGEMENT SYSTEM :
·
Definition:
A database-management system (DBMS) is a collection of
interrelated data and a set of programs to access those data. The collection of
data, usually referred to as the Database, contains information relevant to an enterprise. The primary
goal of a DBMS is
to provide a way to store and retrieve database information that is both convenient and efficient.
§ Database-management system is widely
use in Banking, Airlines, University, Credit Card Transaction,
Telecommunication, Finance, Manufacturing, Sales, Human Resources.
·
Database System Vs File System:
• Data
redundancy and inconsistency:
Since different programmers create the files and application programs
over a long period, the various files are likely to have different formats and
the programs may be written in several programming languages. Moreover, the
same information may be duplicated in several places (files).
Data inconsistency that is, the various copies of the same data may no longer
agree.
• Difficulty in accessing data:
The conventional file-processing environments do not allow
needed data to be retrieved in a convenient and efficient manner.
Example-
Suppose that one of
the bank officers needs to find out the names of all customers who live within
a particular postal-code area. The officer asks the data-processing department
to generate such a list. Because the designers of the original system did not
anticipate this request, there is no application program on hand to meet it.
There is, however, an application program to generate the list of all customers. The bank officer has now two choices: either obtains the list of all customers and
extract the needed
information manually or ask a system
programmer to write the necessary application program. Both
alternatives are obviously unsatisfactory.
• Data isolation:
As the data are
scattered in various files, and files may be in different formats, writing new
application programs to retrieve the appropriate data is difficult.
• Integrity problems:
The data values
stored in the database must satisfy certain types of consistency constraints. For example, the balance of a
bank account may never fall below a prescribed amount (say, $25). Developers
enforce these constraints in the system by adding appropriate code in the
various application programs. However, when new constraints are added, it is
difficult to change the programs to enforce them.
• Atomicity problems:
A computer
system, like any other mechanical or electrical device, is subject to failure. Consider a program to transfer $50 from account A to account B. If a system
failure occurs during the execution of the program, it is possible that the $50
was removed from account A but
was not credited to account B, resulting in an inconsistent database state. Clearly, it is
essential to database consistency that either both the credit and debit occur,
or that neither occur. That is, the funds transfer must be atomic. It
is difficult to ensure atomicity in a conventional file-processing system.
• Concurrent-access anomalies:
For the sake of overall performance of the system and faster response,
many systems allow multiple users to update the data simultaneously. In such an
environment, interaction of concurrent updates may result in inconsistent data.
Consider bank account A, containing $500. If two customers withdraw funds (say $50 and
$100 respectively) from account A at about the same time, the result of the concurrent executions
may leave the account in an incorrect (or inconsistent) state.
If the two programs run concurrently, they may both read the value $500,
and write back $450 and $400, respectively. Depending on which one writes the
value last, the account may contain either $450 or $400, rather than the
correct value of $350.
• Security problems:
Not every
user of the database system should be able to access all the data. For example,
in a banking system, payroll personnel need to see only that part of the
database that has information about the various bank employees. They do not
need access to information about customer accounts. But, since application
programs are added to the system in an ad hoc manner, enforcing such security
constraints is difficult.
- This is the measure disadvantage in
file processing system over a database management system.
·
Data Abstraction:
Many database
system users are not computer trend developers hide the complexity form users
throw severable table of abstraction to simplify users to interactions with the
system. This is called data abstractor.
• Physical level:
The lowest level of
abstraction describes how the data are actually stored. The physical level describes
complex low-level data structures in detail.
• Logical level:
The next-higher level of
abstraction describes what data are stored in the database, and what relationships exist
among those data. The logical level thus describes the entire database in terms
of a small number of relatively simple structures.
• View level:
The highest level of
abstraction describes only part of the entire database.
Many users of the database system do not need all this information; instead,
they need to access only a part of the database. The
view level of abstraction exists to simplify their interaction with the system.
The system may provide many views for the same database.
·
Data Model:
Data model is a collection of conceptual tools for describing data, data
relationships, data semantics, and consistency constraints.
Entity
Relationship Model:
The entity-relationship (E-R) data model is based on a perception of
real world that consist of a collection of basic objects call Entity and relationships among these objects.
v Database Legacy:
Flat file, Hierarchical & Network
database are usually referred as Legacy Database. They represent the ways.
People used to organize information in prehistoric times.
1.
Flat file Database:
The flat file database was probably one
of the earliest database management systems.
Problems of flat file database are: Data redundancy, slow processing speed,
error prone, retrieved.
Hierarchical Database
Network
Database Relational
Database
2. Hierarchical
Database:
It stores data in Hierarchical
structures similar to that of a family tree, organization chart, or pyramid. It is based on
“parents/child” paradigm in each parents could have many children but each
child has one and only one parents.
In the hierarchical database model redundancy is greatly reduce
compared with Flat file database model. The hierarchical database maintains one
to many relationships very well.
3. Network Database:
It
is also based on the concept of parents/child relationship but removes the
restriction of one child having one and only one parents.
v Data Integrity:
Data
integrity falls into three following categories:
1. Entity Integrity: Entity integrity ensures that row can be identifying by an
attribute called the Primary key. The
primary key cannot have a null value.
2. Domain Integrity: Domain integrity refers to the range of valid entities for a
given column. It ensure that there are only valid entities in the column.
3. Referential Integrity: Referential integrity ensures that for every value of Foreign
Key, there is matching value of primary key.
v Database Languages:
A database system
provides a data definition
language to specify the
database schema and a data manipulation
language to express
database queries and updates. In practice, the data definition and data
manipulation languages are not two separate languages; instead they simply form
parts of a single database language, such as the widely used SQL language.
Ø Data-Definition Language:
We specify a database schema by a set of definitions expressed by a
special language called a data-definition language (DDL).
For instance, the following statement in the SQL language defines the account table:
create table account
(account-number char(10),
balance integer)
Execution of the above DDL statement creates the account table. In addition, it updates a special set of tables
called the data dictionary or data directory.
A
data dictionary contains metadata—that is, data
about data. The schema of a table is an example of metadata. A database system
consults the data dictionary before reading or modifying actual data.
We
specify the storage structure and access methods used by the database system by
a set of statements in a special type of DDL called a data storage and
definition language. These
statements define the implementation details of the database schemas, which are
usually hidden from the users.
The
data values stored in the database must satisfy certain consistency constraints. For example, suppose the balance on
an account should not fall below $100. The DDL provides facilities to specify such constraints. The database systems
check these constraints every time the database is updated.
Ø Data-Manipulation Language:
Data manipulation is
• The retrieval of information stored in
the database
• The insertion of new information into
the database
• The deletion of information from the
database
• The modification of information stored
in the database
A data-manipulation language (DML) is a language that enables users to access or manipulate data as
organized by the appropriate data model. There are basically two types:
• Procedural DMLs require a user to specify what data are needed and how to get those data.
• Declarative DMLs (also referred to as nonprocedural DMLs) require a user to specify what data are needed without specifying how to get those data.
Declarative DMLs are usually easier to learn and use than are procedural DMLs. However, since a user does not have to specify how
to get the data, the database system has to figure out an efficient means of
accessing data. The DML component of the SQL language is nonprocedural.
v The Query Processor:
The query
processor components include
• DDL interpreter:
It
interprets DDL statements and records the definitions
in the data dictionary.
• DML compiler:
It
translates DML statements in a query language into an evaluation
plan consisting of low- level instructions that the query evaluation engine understands.
A query can usually be translated into any of a number of alternative
evaluation plans that all give the same result. The DML compiler also performs query optimization, that is, it picks the lowest cost
evaluation plan from among the alternatives.
• Query evaluation engine:
It executes
low-level instructions generated by the DML compiler.
v Database Users and Administrators:
A primary
goal of a database system is to retrieve information from and store new information
in the database. People who work with a database can be categorized as database
users or database administrators.
Ø Database Users and User Interfaces
There are
four different types of database-system users, differentiated by the way they
expect to interact with the system. Different types of user interfaces have
been designed for the different types of users.
• Naive users:
Naïve users
are
unsophisticated users who interact with the system by invoking one of the
application programs.
For example, a bank teller who needs to transfer $50 from account A to
account B invokes a program called transfer. This program asks the teller for the amount of money to be
transferred, the account from which the money is to be transferred, and the
account to which the money is to be transferred.
The typical user interface for naive users is a forms interface, where
the user can fill in appropriate fields of the form. Naive users may also
simply read reports generated from the database.
• Application programmers:
Application programmers are computer professionals who write
application programs. Application programmers can choose from many tools to
develop user interfaces. Rapid application development (RAD) tools are
tools that enable an application programmer to construct forms and reports
without writing a program.
• Sophisticated users:
Sophisticated users interact with the system without
writing programs. Instead, they form their requests in a database query
language. They submit each such query to a query processor, whose
function is to break down DML statements into instructions that the storage manager understands.
• Specialized users:
Specialized
users are sophisticated users who write specialized database applications that
do not fit into the traditional data-processing framework. Among these
applications are computer-aided design systems, knowledge base and expert systems, systems that store data with complex
data types (for example, graphics data and audio data), and
environment-modelling systems.
v Database Administrator:
One of
the main reasons for using DBMSs is to have
central control of both the data and the programs that access those data. A
person who has such central control over the system is called a database administrator (DBA). The functions of a DBA include:
• Schema definition:
The DBA creates the original database schema by executing a
set of data definition statements in the DDL.
• Storage structure
and access-method definition
• Schema and
physical-organization modification:
The DBA carries out changes to the schema and physical organization
to reflect the changing needs of the organization, or to alter the physical
organization to improve performance.
• Granting of
authorization for data access:
By granting different types of authorization,
the database administrator can regulate which parts of the database various
users can access. The authorization information is kept in a special system
structure that the database system consults whenever someone attempts to access
the data in the system.
• Routine maintenance: Examples of the
database administrator’s routine maintenance activities are:
ü Periodically backing up the database, either onto tapes or onto remote
servers, to prevent loss of data in case of disasters such as flooding.
ü Ensuring that enough free disk space is
available for normal operations, and upgrading disk space as required.
ü Monitoring jobs running on the database and ensuring that performance is
not degraded by very expensive tasks submitted by some users.
v Storage Manager:
A storage manager is a program module that provides the interface
between the low level data stored in the database and the application programs
and queries submitted to the system. The storage manager is responsible for the
interaction with the file manager. The raw data are stored on the disk using
the file system, which is usually provided by a conventional operating system.
The storage manager translates the various DML statements into low-level file-system commands. Thus, the storage manager
is responsible for storing, retrieving, and updating data in the database.
The storage manager components include:
• Authorization and
integrity manager:
It tests for the
satisfaction of integrity constraints and checks the authority of users to
access data.
• Transaction manager:
It ensures that the
database remains in a consistent (correct) state despite system failures, and
that concurrent transaction executions proceed without conflicting.
• File manager:
It manages the allocation of space on disk storage and the data
structures used to represent information stored on disk.
• Buffer manager:
It is responsible for
fetching data from disk storage into main memory, and deciding what data to
cache in main memory. The buffer manager is a critical part of the database
system, since it enables the database to handle data sizes that are much larger
than the size of main memory.
The storage manager implements several data structures
as part of the physical system implementation:
• Data files: It stores the database itself.
• Data dictionary: It stores metadata about the structure
of the database, in particular the schema of the database.
• Indices: It provides fast access to data items that hold
particular values.
v
Transaction Manager:
It ensures that
the database remains in a consistent (correct) state despite system failures
and that concurrent transaction execution proceed without conflicting.
·
File
Manager: It manages the allocation of space on
disk storage & the data structures used to represent information stored on
a disk
·
Buffer
Manger: It is responsible for fetching data from
disk storage into main memory & deciding what data to cache in the memory.
The buffer manager is a critical part of the database, since it enables the
database to handle data sizes that much larger than the size of main memory.
v Entity:
An entity
is a “thing” or “object” in the real world that is
distinguishable from all other objects. Entities
are the building block of database. We represent an entity using a rectangular box that contains the name of the entity.
v Attributes:
An attributes is a property of an entity
that differentiates from other entity
& provide information about the entity. In an E-R
diagram we represent attributes as ellipse.
• Simple and Composite attributes:
Simple attributes have been simple;
that is, they are not divided into subparts.
Composite attributes can
be divided into subparts (that is, other attributes).
For
example, an attribute name could be structured as a composite attribute consisting of first-name, middle-initial, and last-name. So name is a composite attributes.
• Single-valued and Multi-valued
attributes:
When an
attribute has a single value for a particular entity, it is called the single-value attribute.
When an attribute has a multiple value for a particular entity, it is
called the Multi-value attribute. As example, a person can have several phones
no. So phones no. attribute is Multi-valued
attribute.
• Derived attribute:
The value for this type of attribute can be
derived from the values of other related attributes or entities.
As example consider, the customer entity set has
an attribute age, which indicates the customer’s age. If the customer entity set also
has an attribute date-of-birth, we can calculate age from date-of-birth
and the current date.
Ø N.B: An
attribute takes a null value when an entity does not have a value for it. The null value may
indicate “not
applicable”—that
is, that the value does not exist for the entity. For example, one may have no
middle name. Null can also designate that an attribute value is unknown. An
unknown value may be either missing (the value does exist, but we do not have that information) or not known (we do not know
whether or not the value actually exists).
v Relationship:
Relationship is used to
establish a connection between a pair of logically related entities. It is a
association between entities.
Separate
entities can have relationships with each other. For example- if student study
various course the entity are student and course, while the relationship
between them is represented by a diamond type.
|
|
|||||
v Mapping
Cardinalities:
Mapping cardinalities, or
cardinality ratios, express the number of entities to which another entity can
be associated via a relationship set.
For
a binary relationship set R between entity sets A
and B, the mapping cardinality must be one of the following:
• One to one. An entity in A is associated with at
most one entity in B, and an entity in B is associated with at
most one entity in A. (See Figure 2.4a.)
• One to
many. An entity in A
is associated with any number (zero or
more) of entities in B. An entity in B, however, can be associated with at
most one entity in A. (See Figure 2.4b.)
• Many to
one. An entity in A
is associated with at most one entity in B. An entity in B, however, can
be associated with any number (zero or more) of entities in A. (See Figure
2.5a.)
• Many to
many. An entity in A
is associated with any number (zero or
more) of entities in B, and an entity in B is associated with any number (zero or more) of entities in A. (See Figure
2.5b.)
v Keys:
A
key allows
us to identify a set of attributes that suffice to distinguish entities from
each other. Keys also help uniquely identify relationships, and thus
distinguish relationships from each other. So keys plays an important role
maintain data integrity.
·
Candidate Key: It is important to have an attribute in a table that
uniquely identifies a row. So an attributes which identifies a row is called
Candidates Key. This attributes has values that are unique.
Consider the table Vehicle
VEHICLE:
Serial#
|
Regn#
|
Description
|
023451
|
5602
|
Leyland
|
023452
|
4502
|
Volvo
|
023453
|
4513
|
Toyota
|
The value of attribute Serial#, Regn# &
Description are unique in every row. Therefore all three are candidates key.
ü Candidates
key can also referred to as Surrogate
Key
·
Primary
Key:
The candidate key that we choose to identify each row uniquely
is called the Primary key In the
table Vehicle: If we choose Serial# to identify rows uniquely serial# is the
primary key.
·
Alternative
Key:
A candidate key that
is not chosen as a primary key is an alternative key. In the table VEHICLE, if
we choose Serial# as the primary key then Reg# & Description is the
alternative key.
·
N.B: It
is important that we understood that a primary key is the only sure way to
identify the row of a table. Hence an attribute key may have the null value. A null value is not to be permitted in a
primary key. Since it would be difficult to uniquely identify rows
containing null value.
·
Supper Key:
·
Composite
Key:
In a certain table, a single attribute cannot be used to
identify rows uniquely so a combination of two or more attributes is used as a
primary key. Such keys are called Composite
Key.
·
Foreign
Key:
When a primary key of
one table appears as an attribute in another table, it is called the Foreign Key. In the second table a foreign key is used to relate two tables.
v Weak
Entity Set:
An entity set may not have sufficient attributes to form a
primary key. Such an entity set is termed a weak
entity set. An entity set that has a primary key is
termed a strong entity set.
For a weak
entity set to be meaningful, it must be associated with another entity set,
called the identifying or
owner entity set. The relationship associating the weak entity set with the
identifying entity set is called the identifying
relationship.
Although a weak entity set does not have a primary key, we nevertheless
need a means of distinguishing among all those entities in the weak entity set
that depend on one particular strong entity. The discriminator
of a weak entity set is a set of
attributes that allows this distinction to be made. Hence the primary key of a
weak entity set is formed by the primary key of the identifying entity set,
plus the weak entity set’s discriminator.
v
Entity-Relationship Diagram:
An
E-R diagram
can express the overall logical structure
of a database graphically. E-R diagrams are simple and clear—qualities that may well account
in large part for the widespread use of the E-R
model. Such a diagram consists of the
following major components:
• Rectangles, which represent entity sets
• Ellipses, which represent attributes
• Diamonds, which represent relationship sets
• Lines, which link attributes to entity
sets and entity sets to relationship sets
• Double ellipses, which represent multi-valued
attributes
• Dashed ellipses, which denote derived attributes
• Double lines, which indicate total
participation of an entity in a relationship set
• Double
rectangles, which represent weak entity sets
No comments:
Post a Comment