Inss651 CH 12
Distributed databases
DDB:
A DB PHYSICALLY SPREAD ACROSS COMPUTERS AT MULTIPLE LOCATIONS CONNECTED
BY A NETWORK AND IN WHICH USERS CAN ACCESS DATA AT ANY SITE IN THE NETWORK
IS DEFINED AS DDB
DISTRIBUTED DBMS:
A DBMS CAPABLE OF SUPPORTING AND MANIPULATING DDB
DETERMINE LOCATION OF REQUESTED DATA
TRANSFER REQUEST FROM ONE NODE TO ANOTHER NODE
RECOVERY, CONCURRENCY, DEADLOCK etc
DDBMS
a DDBMS is like a DBMS with additional responsibilities of
communication and coordination of data storage and processing over
several distributed sites
Advantages of DDB:
data is located near the site of greatest demand
faster data access
faster data processing
growth facilitation
better communication
reduced operating costs
user-friendly interface
less danger of a single point of failure
DDB disadvantages
management & control
security
lack of standards
storage problem
Distributed processing
DB is located at one site
each site can update data
DDB:
DB is stored over two or more physically
independent sites
data is fragmented over many sites
DDB CONFIGURATION (PAGE 463)
single site PROCESSING, single site data (SPSD)
processing is NOT done at the user's end
TP (transaction processor) & DP (data processor) are part of DBMS at single site
Multiple site processing, single site data (MPSD)
DB at a single site
file server at the DB end
all record locking is done at the user level
entire file MUST travel through the network for processing at the
work station
Multiple site processing, multiple site data (MPMD):
Fully distributed processing
HOMOGENEOUS DDBMS:
computers on similar platforms
HETEROGENEOUS DDBMS:
different types of DBMS running on different platforms
KEY:
end user transparency
Transparency is defined as the user's unawareness of data
location, processing, replication, and performance
for users ALL DATA IS LOCAL
DD TRANSPARENCY (p 467)
1. DISTRIBUTION
FRAGMENTATION
REPLICATION
LOCATION
2. TRANSACTION
3. FAILURE
4. PERFORMANCE
5. HETEROGENEITY
DISTRIBUTION:
fragmentation transparency: users do not have to specify fragments in query
location transparency: users have to specify fragment name but not location
local mapping transparency: users have to specify both fragment name and location
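The three levels of distribution transparency can be pictured with a small sketch. Everything here is made up for illustration: the fragment names E1 and E2, the sites S1 and S2, and the function names. It only shows how much the caller has to name at each level.

```python
# Hypothetical setup: EMPLOYEE is split into fragments E1 and E2,
# stored at made-up sites S1 and S2.
FRAGMENTS = {
    "E1": {"site": "S1", "rows": [(1, "Ann"), (2, "Bob")]},
    "E2": {"site": "S2", "rows": [(3, "Cy")]},
}


def read_fragment(name, site):
    """Lowest level: the caller names the fragment AND its site."""
    frag = FRAGMENTS[name]
    if frag["site"] != site:
        raise LookupError(f"{name} is not stored at {site}")
    return list(frag["rows"])


def read_fragment_anywhere(name):
    """Middle level: the caller names the fragment; the system finds the site."""
    return read_fragment(name, FRAGMENTS[name]["site"])


def read_employee():
    """Highest level: the caller names only the table, never a fragment."""
    rows = []
    for name in sorted(FRAGMENTS):
        rows.extend(read_fragment_anywhere(name))
    return rows
```

A call to read_employee() hides both fragments and sites, read_fragment_anywhere("E2") hides only the site, and read_fragment("E1", "S1") hides nothing.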
TRANSACTION TRANSPARENCY:
This ensures that the transaction will be completed only when
ALL sites involved complete their part in the transaction
one site is named "PRIMARY"; once the update is applied at that site
the transaction is treated as updated, and it is the responsibility
of the primary site to update the other sites
REPLICATION TRANSPARENCY:
if data is replicated at different sites, users should be unaware of it
FAILURE TRANSPARENCY:
guarantees that either all the actions of each transaction are
committed or else none of them are committed
DISTRIBUTED CONCURRENCY CONTROL:
DDBMS must ensure that the parts of a transaction, at ALL sites, are completed
before a final COMMIT is done
deadlocks are common
TWO-PHASE COMMIT:
a final COMMIT should not be issued until all sites have
committed their parts of the transaction
DO-UNDO-REDO
DO..records before and after values in the transaction log
UNDO..reverses an operation, using the log entries of DO
REDO..redoes an operation, using DO portion of the sequence
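A minimal sketch of the DO-UNDO-REDO idea, assuming the database is a plain dict and the transaction log is a list. The function names and the log entry layout are illustrative, not taken from any particular DBMS.

```python
def do(db, log, key, new_value):
    """DO: record the before and after values in the log, then apply the change."""
    # Assumption: a "before" of None means the key did not exist yet.
    log.append({"key": key, "before": db.get(key), "after": new_value})
    db[key] = new_value


def undo(db, log):
    """UNDO: walk the log backwards, restoring each before value."""
    for entry in reversed(log):
        if entry["before"] is None:
            db.pop(entry["key"], None)
        else:
            db[entry["key"]] = entry["before"]


def redo(db, log):
    """REDO: walk the log forwards, reapplying each after value."""
    for entry in log:
        db[entry["key"]] = entry["after"]


db = {"balance": 100}
log = []
do(db, log, "balance", 70)
do(db, log, "fee", 5)
after_do = dict(db)      # state once both operations are applied
undo(db, log)
after_undo = dict(db)    # state restored from the before values
redo(db, log)
after_redo = dict(db)    # state rebuilt from the after values
```

Both UNDO and REDO work only from what DO wrote to the log, which is why the log entry has to be written before the change is applied.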
Phase 1:
coordinator site sends msg to all subordinate sites to prepare to COMMIT
subordinates write the transaction log and send back their state for COMMIT
if all subordinates are ready, go to phase 2
otherwise the transaction is aborted
Phase 2:
coordinator asks each site to COMMIT
sites update their database
reply to coordinator, whether COMMITTED or ROLLBACK
if even one site did not COMMIT, all changes are UNDONE
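The two phases can be sketched as follows. This is a simplified, single-process illustration: the Site class, its can_commit flag, and the return strings are assumptions, and real network failures and timeouts are not modeled. The abort case shown is a "not ready" vote in phase 1.

```python
class Site:
    """A hypothetical subordinate site holding one dict as its database."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # simulates a site that cannot prepare
        self.db = {}
        self.pending = None

    def prepare(self, updates):
        # Phase 1: record the pending changes (a stand-in for the log) and vote.
        if not self.can_commit:
            return False
        self.pending = dict(updates)
        return True

    def commit(self):
        # Phase 2: apply the changes recorded during prepare.
        self.db.update(self.pending)
        self.pending = None

    def rollback(self):
        # Discard anything prepared but not yet applied.
        self.pending = None


def two_phase_commit(sites, updates):
    """Coordinator: commit everywhere only if every site votes ready."""
    votes = [site.prepare(updates) for site in sites]  # phase 1
    if all(votes):
        for site in sites:                             # phase 2
            site.commit()
        return "COMMITTED"
    for site in sites:
        site.rollback()
    return "ABORTED"
```

With three healthy sites the call returns "COMMITTED" and every site holds the update; if any one site is created with can_commit=False, the call returns "ABORTED" and no site's database changes.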
DDB DESIGN
GLOBAL SCHEMA
DISTRIBUTED DATA DICTIONARY
DATA:
Data FRAGMENTATION (PAGE 477) (IMPORTANT)
HORIZONTAL PARTITION.
ROWS OF A TABLE ARE PUT ON DIFFERENT SITES
ROWS MAY BE LOCATED AT HOME SITES
EMPLOYEE(E_NUM, E_NAME, E_PHONE, SITE_NUMB)
DEFINE FRAGMENT F1 AS
SELECT *
FROM EMPLOYEE
WHERE SITE_NUMB = 'S1';
DATA IS CLOSE TO SITE
LOCAL OPTIMIZATION
DATA NOT RELEVANT TO A SITE ARE NOT STORED THERE
PROBLEMS:
DATA NEEDED FROM MULTIPLE SITES
IF A SITE IS DESTROYED, ITS ROWS ARE LOST (UNLESS REPLICATED)
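A small runnable sketch of horizontal partitioning, using Python's sqlite3 module with one in-memory database standing in for all sites. The sample rows and the fragment table names F_S1 and F_S2 are made up; DEFINE FRAGMENT is approximated with CREATE TABLE ... AS SELECT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE ("
    "E_NUM INTEGER PRIMARY KEY, E_NAME TEXT, E_PHONE TEXT, SITE_NUMB TEXT)"
)
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?)",
    [
        (1, "Ann", "555-0101", "S1"),
        (2, "Bob", "555-0102", "S2"),
        (3, "Cy", "555-0103", "S1"),
    ],
)

# One horizontal fragment per site: each holds only that site's rows.
for site in ("S1", "S2"):
    conn.execute(
        f"CREATE TABLE F_{site} AS "
        f"SELECT * FROM EMPLOYEE WHERE SITE_NUMB = '{site}'"
    )

# Rows held by the S1 fragment.
f1 = conn.execute("SELECT E_NUM FROM F_S1 ORDER BY E_NUM").fetchall()

# The whole table is rebuilt as the UNION of its horizontal fragments.
rebuilt = conn.execute(
    "SELECT * FROM F_S1 UNION ALL SELECT * FROM F_S2 ORDER BY E_NUM"
).fetchall()
original = conn.execute("SELECT * FROM EMPLOYEE ORDER BY E_NUM").fetchall()
```

The rebuild step is the "data needed from multiple sites" problem in miniature: a query over all employees has to touch every fragment.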
VERTICAL PARTITION.
DEFINE FRAGMENT F1 AS
SELECT E_NUM, E_NAME, SITE_NUMB
FROM EMPLOYEE
WHERE SITE_NUMB IN ('S1','S2');
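A sketch of a purely vertical split, again with sqlite3 and made-up sample rows. Note the fragment definition above also filters rows with WHERE, so it mixes vertical and horizontal partitioning; the sketch below splits columns only. The fragment names V1 and V2 are assumptions. Each fragment repeats the key E_NUM so the table can be rebuilt with a join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE ("
    "E_NUM INTEGER PRIMARY KEY, E_NAME TEXT, E_PHONE TEXT, SITE_NUMB TEXT)"
)
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?)",
    [
        (1, "Ann", "555-0101", "S1"),
        (2, "Bob", "555-0102", "S2"),
    ],
)

# Each vertical fragment keeps a subset of columns plus the key E_NUM.
conn.execute("CREATE TABLE V1 AS SELECT E_NUM, E_NAME FROM EMPLOYEE")
conn.execute("CREATE TABLE V2 AS SELECT E_NUM, E_PHONE, SITE_NUMB FROM EMPLOYEE")

# Column names stored in the first fragment.
v1_columns = [row[1] for row in conn.execute("PRAGMA table_info(V1)")]

# The whole table is rebuilt by joining the fragments on the shared key.
rebuilt = conn.execute(
    "SELECT V1.E_NUM, V1.E_NAME, V2.E_PHONE, V2.SITE_NUMB "
    "FROM V1 JOIN V2 ON V1.E_NUM = V2.E_NUM ORDER BY V1.E_NUM"
).fetchall()
original = conn.execute("SELECT * FROM EMPLOYEE ORDER BY E_NUM").fetchall()
```

Horizontal fragments are recombined with UNION, vertical fragments with JOIN on the key.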
Data Allocation:
data should be as close to the user as possible
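One simple way to act on "as close to the user as possible" is to place each fragment at the site that requests it most often. The sketch below is only a heuristic under that assumption, not an allocation method from the text; the function name and the access log format are made up.

```python
from collections import Counter


def pick_home_site(access_log):
    """Return, per fragment, the site that requested it most often.

    access_log is a list of (site, fragment) pairs, one per request.
    """
    counts = {}
    for site, fragment in access_log:
        counts.setdefault(fragment, Counter())[site] += 1
    return {frag: c.most_common(1)[0][0] for frag, c in counts.items()}


# Hypothetical request history: S1 asks for F1 twice, S2 once; only S2 asks for F2.
log = [("S1", "F1"), ("S1", "F1"), ("S2", "F1"), ("S2", "F2")]
placement = pick_home_site(log)
```

A real allocation decision would also weigh update cost, replication, and storage limits at each site.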
ISSUES IN DDB:
SOFTWARE COSTS & COMPLEXITY
COORDINATION AMONG SITES
DATA INTEGRITY
SLOW RESPONSE