INSS651 CH 12 DISTRIBUTED DATABASES DDB A DB PHYSICALLY








Inss651 CH 12




Distributed databases



DDB:


A DB PHYSICALLY SPREAD ACROSS COMPUTERS AT MULTIPLE LOCATIONS, CONNECTED

BY A NETWORK, IN WHICH USERS CAN ACCESS DATA AT ANY SITE IN THE NETWORK

DISTRIBUTED DBMS (DDBMS):


A DBMS CAPABLE OF SUPPORTING AND MANIPULATING DDB


DETERMINE LOCATION OF REQUESTED DATA


TRANSFER REQUEST FROM ONE NODE TO ANOTHER NODE


HANDLE RECOVERY, CONCURRENCY, DEADLOCK, ETC.


DDBMS


A DDBMS is like a DBMS with the additional responsibilities of

communication and coordination of data storage and processing across

several distributed sites
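
The two routing duties listed above (determine where the requested data lives, then transfer the request to that node) can be sketched as follows. This is a minimal illustration only: the catalog, the site names, and the fragment names are all hypothetical, and plain dictionaries stand in for remote nodes.

```python
# Hypothetical catalog: which site stores which fragment (illustrative names).
CATALOG = {"EMPLOYEE_S1": "site1", "EMPLOYEE_S2": "site2"}

# Hypothetical per-site storage, standing in for remote nodes.
SITES = {
    "site1": {"EMPLOYEE_S1": [("E1", "Ann")]},
    "site2": {"EMPLOYEE_S2": [("E2", "Bob")]},
}


def locate(fragment):
    """Determine the location of the requested data."""
    return CATALOG[fragment]


def request(local_site, fragment):
    """Serve the request locally, or transfer it to the node holding the data."""
    target = locate(fragment)
    # Record the path the request took so the forwarding step is visible.
    hops = [local_site] if target == local_site else [local_site, target]
    return SITES[target][fragment], hops


rows, hops = request("site1", "EMPLOYEE_S2")
```

The caller at site1 never names site2; the lookup in the catalog decides where the request goes.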



Advantages of DDB:



DDB disadvantages


management & control

security

lack of standards

storage problem


Distributed processing


DB is located at one site


each site can update data


DDB:

DB is stored over two or more physically

independent sites


data is fragmented over many sites


DDB CONFIGURATION (PAGE 463)


Single-site processing, single-site data (SPSD)


processing is NOT done at the user's end


TP (transaction processor) & DP (data processor) are part of the DBMS at a single site


Multiple-site processing, single-site data (MPSD)


DB at a single site


file server at the DB end


all record locking is done at the user level


the entire file MUST travel through the network for processing at the

workstation


Multiple-site processing, multiple-site data (MPMD):

Fully distributed processing


HOMOGENEOUS DDBMS:


the same type of DBMS running on computers on similar platforms



HETEROGENEOUS DDBMS:


different types of DBMS running on different platforms


KEY:


end user transparency


Transparency is defined as the user's unawareness of data

location, processing, replication, and performance


to users, ALL DATA APPEARS LOCAL



DD TRANSPARENCY (p 467)


1. DISTRIBUTION

   FRAGMENTATION

   REPLICATION

   LOCATION

2. TRANSACTION

3. FAILURE

4. PERFORMANCE

5. HETEROGENEITY

DISTRIBUTION:





TRANSACTION TRANSPARENCY:


This ensures that the transaction will be completed only when

ALL sites involved complete their part in the transaction


one site is named "PRIMARY"; once a transaction's update is applied at that site

it is assumed updated, and it is the responsibility of the

primary site to update the other sites
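
The primary-site idea can be sketched as below. This is a simplified illustration, not a real replication protocol: the `Site` class, the site names, and the sample key and value are hypothetical, and the propagation is done immediately in a loop, whereas a real system could defer it.

```python
class Site:
    """A hypothetical site holding a copy of the data as a key/value map."""

    def __init__(self, name):
        self.name = name
        self.data = {}


def update_via_primary(primary, other_sites, key, value):
    # The update counts as done once the primary site has applied it.
    primary.data[key] = value
    # The primary is then responsible for updating the other sites.
    for site in other_sites:
        site.data[key] = value


primary = Site("PRIMARY")
others = [Site("S2"), Site("S3")]
update_via_primary(primary, others, "E1", "555-0100")
```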



REPLICATION TRANSPARENCY:


if data is replicated at different sites, users should be unaware of it


FAILURE TRANSPARENCY:


guarantees that either all the actions of each transaction are

committed or else none of them are committed


DISTRIBUTED CONCURRENCY CONTROL:


DDBMS must ensure that the parts of a transaction, at ALL sites, are completed

before a final COMMIT is done


deadlocks are common


TWO-PHASE COMMIT:


a final COMMIT should not be issued until all sites have

committed their part of the transaction


DO-UNDO-REDO


DO: performs the operation and records before and after values in the transaction log


UNDO: reverses an operation, using the log entries written by DO


REDO: redoes an operation, using the log entries written by DO
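
A minimal sketch of the DO-UNDO-REDO idea, assuming DO applies the write as well as logging it. The in-memory `db` dictionary, the log layout, and the sample values are hypothetical; a before-value of `None` is taken to mean "the key did not exist", which is a simplification.

```python
db = {"balance": 100}
log = []  # transaction log entries: (key, before value, after value)


def do(key, new_value):
    """DO: perform the operation and record before/after values in the log."""
    log.append((key, db.get(key), new_value))
    db[key] = new_value


def undo():
    """UNDO: reverse logged operations, newest first, using the before values."""
    for key, before, _after in reversed(log):
        if before is None:
            db.pop(key, None)  # the key did not exist before the DO
        else:
            db[key] = before


def redo():
    """REDO: reapply logged operations, oldest first, using the after values."""
    for key, _before, after in log:
        db[key] = after


do("balance", 70)
undo()
after_undo = dict(db)
redo()
after_redo = dict(db)
```

Both UNDO and REDO read only the log, which is why DO must write the log entry for every change.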


Phase 1:


coordinator site sends a message to all subordinate sites to PREPARE TO COMMIT


subordinates write the transaction log and send back their state (ready or not ready to COMMIT)


if all subordinates are ready, go to phase 2;

otherwise the transaction is aborted


Phase 2:


coordinator asks each site to COMMIT


sites update their database


each site replies to the coordinator whether it COMMITTED or ROLLED BACK


if even one site did not COMMIT, all changes are UNDONE
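
The two phases can be sketched as a small simulation. This is an illustration under stated assumptions, not a production protocol: the `Subordinate` class, its flags `can_prepare` and `fail_on_commit`, and the state strings are hypothetical, and network failures and timeouts are ignored.

```python
class Subordinate:
    """A hypothetical subordinate site taking part in a two-phase commit."""

    def __init__(self, name, can_prepare=True, fail_on_commit=False):
        self.name = name
        self.can_prepare = can_prepare
        self.fail_on_commit = fail_on_commit
        self.log = []
        self.state = "INIT"

    def prepare(self):
        # Phase 1: write the transaction log, then report readiness.
        self.log.append("PREPARE")
        self.state = "READY" if self.can_prepare else "NOT READY"
        return self.can_prepare

    def commit(self):
        # Phase 2: update the database and reply to the coordinator.
        if self.fail_on_commit:
            self.state = "NOT COMMITTED"
            return False
        self.log.append("COMMIT")
        self.state = "COMMITTED"
        return True

    def undo(self):
        self.log.append("UNDO")
        self.state = "ROLLED BACK"


def two_phase_commit(subordinates):
    # Phase 1: ask every subordinate to prepare and collect every reply.
    votes = [s.prepare() for s in subordinates]
    if not all(votes):
        for s in subordinates:
            s.undo()
        return "ABORTED"
    # Phase 2: ask each site to commit and collect the replies.
    replies = [s.commit() for s in subordinates]
    if not all(replies):
        # Even one site failing to commit means all changes are undone.
        for s in subordinates:
            s.undo()
        return "ABORTED"
    return "COMMITTED"


good = [Subordinate("S1"), Subordinate("S2")]
good_result = two_phase_commit(good)

bad = [Subordinate("S1"), Subordinate("S2", fail_on_commit=True)]
bad_result = two_phase_commit(bad)
```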


DDB DESIGN


GLOBAL SCHEMA


DISTRIBUTED DATA DICTIONARY


DATA:

Data FRAGMENTATION (PAGE 477) (IMPORTANT)


HORIZONTAL PARTITION.


ROWS OF A TABLE ARE PUT ON DIFFERENT SITES


ROWS MAY BE LOCATED AT HOME SITES


EMPLOYEE(E_NUMB, E_NAME, E_PHONE, SITE_NUMB)



SELECT *
FROM EMPLOYEE
WHERE SITE_NUMB = 'S1';


DATA IS CLOSE TO SITE
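
A small runnable sketch of horizontal partitioning, using Python's built-in sqlite3 module with one in-memory database standing in for the separate sites. The fragment table names (EMPLOYEE_S1, EMPLOYEE_S2) and the sample rows are hypothetical; the employee-number column is written here as E_NUMB. The point shown is that each fragment holds the rows for one home site, and the union of the fragments gives back the whole table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (
    E_NUMB TEXT PRIMARY KEY, E_NAME TEXT, E_PHONE TEXT, SITE_NUMB TEXT
);
INSERT INTO EMPLOYEE VALUES
    ('E1', 'Ann', '555-0101', 'S1'),
    ('E2', 'Bob', '555-0102', 'S2'),
    ('E3', 'Cy',  '555-0103', 'S1');
-- One fragment per home site; in a real DDB each would live at its own site.
CREATE TABLE EMPLOYEE_S1 AS SELECT * FROM EMPLOYEE WHERE SITE_NUMB = 'S1';
CREATE TABLE EMPLOYEE_S2 AS SELECT * FROM EMPLOYEE WHERE SITE_NUMB = 'S2';
""")

# Rows stored at site S1 only.
s1_rows = conn.execute("SELECT E_NUMB FROM EMPLOYEE_S1 ORDER BY 1").fetchall()

# Reconstruct the full table as the union of the fragments.
rebuilt = conn.execute(
    "SELECT * FROM EMPLOYEE_S1 UNION ALL SELECT * FROM EMPLOYEE_S2 ORDER BY 1"
).fetchall()
original = conn.execute("SELECT * FROM EMPLOYEE ORDER BY 1").fetchall()
```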




PROBLEMS:




VERTICAL PARTITION.


COLUMNS OF A TABLE ARE PUT ON DIFFERENT SITES; EACH FRAGMENT KEEPS THE KEY


DEFINE FRAGMENT F1 AS
SELECT E_NUMB, E_NAME, SITE_NUMB
FROM EMPLOYEE
WHERE SITE_NUMB IN ('S1','S2');
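
A runnable sketch of the fragment above, again using sqlite3 with one in-memory database standing in for several sites. F1 follows the definition in the notes; F2 is a hypothetical companion fragment (not in the notes) that holds the remaining column plus the key, so the fragments can be rejoined. The sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (
    E_NUMB TEXT PRIMARY KEY, E_NAME TEXT, E_PHONE TEXT, SITE_NUMB TEXT
);
INSERT INTO EMPLOYEE VALUES
    ('E1', 'Ann', '555-0101', 'S1'),
    ('E2', 'Bob', '555-0102', 'S2'),
    ('E3', 'Cy',  '555-0103', 'S3');
-- F1 as defined in the notes: selected columns, rows for sites S1 and S2.
CREATE TABLE F1 AS
    SELECT E_NUMB, E_NAME, SITE_NUMB FROM EMPLOYEE
    WHERE SITE_NUMB IN ('S1', 'S2');
-- F2 (hypothetical): the remaining column, repeating the key E_NUMB.
CREATE TABLE F2 AS
    SELECT E_NUMB, E_PHONE FROM EMPLOYEE
    WHERE SITE_NUMB IN ('S1', 'S2');
""")

# Rejoin the vertical fragments on the shared key.
rebuilt = conn.execute("""
    SELECT F1.E_NUMB, F1.E_NAME, F2.E_PHONE, F1.SITE_NUMB
    FROM F1 JOIN F2 ON F1.E_NUMB = F2.E_NUMB
    ORDER BY F1.E_NUMB
""").fetchall()
original = conn.execute("""
    SELECT E_NUMB, E_NAME, E_PHONE, SITE_NUMB FROM EMPLOYEE
    WHERE SITE_NUMB IN ('S1', 'S2')
    ORDER BY E_NUMB
""").fetchall()
```

Because the WHERE clause also restricts rows, the employee at S3 appears in neither fragment.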


Data Allocation:


data should be as close to the user as possible
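
One simple way to read "as close to the user as possible" is to place each fragment at the site that accesses it most often. The sketch below assumes that rule; the function name, the access-log format, and the sample data are hypothetical, and real allocation would also weigh storage cost, replication, and update traffic.

```python
from collections import Counter


def allocate(access_log):
    """Map each fragment to the site that accessed it most often.

    access_log is an iterable of (fragment, site) pairs, one per access.
    """
    counts = {}
    for fragment, site in access_log:
        counts.setdefault(fragment, Counter())[site] += 1
    return {frag: c.most_common(1)[0][0] for frag, c in counts.items()}


accesses = [
    ("EMPLOYEE_S1", "S1"), ("EMPLOYEE_S1", "S1"), ("EMPLOYEE_S1", "S2"),
    ("EMPLOYEE_S2", "S2"), ("EMPLOYEE_S2", "S2"),
]
placement = allocate(accesses)
```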



ISSUES IN DDB:


SOFTWARE COSTS & COMPLEXITY


COORDINATION AMONG SITES


DATA INTEGRITY


SLOW RESPONSE




