Accessing Databases

What is Redundancy?

A key concept in comparing databases is the issue of redundancy. Many databases try to be "non-redundant". Unfortunately, biological data is too complex to fit a simple definition of redundancy. Are two alleles of the same locus redundant? Two isozymes in the same organism? The same locus in two closely related organisms? Hence, each "non-redundant" database has its own definition of redundancy. Some use automated measures, while others use manual culling; the former are amenable to large projects, the latter give higher quality. Other databases don't attempt to be non-redundant, but rather sacrifice this goal in favor of ensuring completeness.
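One way to picture an automated redundancy measure is a greedy filter that keeps a sequence only if it is not too similar to anything already kept. The sketch below is an illustration only, not the procedure of any database named here: the function names, the threshold values, and the use of Python's difflib ratio as a stand-in for a real alignment-based identity score are all assumptions.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Crude similarity in [0, 1]; a stand-in for a real alignment score."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def cull_redundant(sequences, threshold):
    """Greedily keep sequences whose similarity to every kept one is below threshold."""
    kept = []
    for seq in sequences:
        if all(similarity(seq, k) < threshold for k in kept):
            kept.append(seq)
    return kept


# Made-up example sequences; the first two differ only in the last residue.
sequences = ["MKTAYIAKQR", "MKTAYIAKQL", "GGGGSSSSPP"]

strict = cull_redundant(sequences, threshold=0.8)   # treats the near-duplicate as redundant
lenient = cull_redundant(sequences, threshold=0.95)  # keeps the near-duplicate
```

Changing the threshold changes which entries count as redundant, which is why two "non-redundant" databases built from the same input can differ.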

Databases

nr (NCBI)

The nr nucleotide database maintained by NCBI as a target for their BLAST search services is a composite of GenBank, GenBank updates, and EMBL updates.

GenBank / EMBL / DDBJ

In theory, GenBank, the EMBL Datalibrary, and the DNA Databank of Japan (DDBJ) are just names for the same database. In reality, small time lags in propagating data between the database centers cause minor differences among these databases. However, if one of these libraries is merged with the updates to all of these databases, a complete set of sequences is formed.
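The merge described above can be sketched as combining one base library with each center's update set, keyed by accession. This is a minimal illustration under stated assumptions: the function name, the dictionary layout, the accession strings, and the rule that an update replaces an older entry with the same accession are all hypothetical, not a description of how the database centers exchange data.

```python
def merge_with_updates(base, *update_sets):
    """Return base plus all updates; later entries replace earlier ones (assumed rule)."""
    merged = dict(base)
    for updates in update_sets:
        merged.update(updates)
    return merged


# Hypothetical accession -> sequence mappings.
base_library = {"X00001": "ACGT"}
updates_a = {"X00001": "ACGA"}   # revised entry
updates_b = {"X00002": "GGCC"}   # new entry not yet propagated

complete = merge_with_updates(base_library, updates_a, updates_b)
```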

dbEST (Boguski, Lowe, & Tolstoshev. Nature Genetics 4:332 1993) is a library of Expressed Sequence Tags (Science 252:1651), single-pass cDNA sequences generated from automated sequencers.

CAUTION: ESTs are blindly sequenced from cDNA libraries with little or no human intervention; they are therefore likely to contain sequencing errors and are frequently contaminated with heterologous sequences and transcribed repetitive elements.

nr (NCBI)

The nr protein database, maintained by NCBI as a target for their BLAST search services, is a composite of SwissProt, SwissProt updates, PIR, and PDB. Entries with absolutely identical sequences have been merged.
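Merging entries with identical sequences amounts to grouping identifiers by exact sequence. The following Python sketch shows the idea; it is not NCBI's implementation, and the function name, the identifiers, and the sequences are made up for illustration.

```python
def merge_identical(records):
    """Group (identifier, sequence) pairs so each distinct sequence appears once.

    Returns a list of (identifiers, sequence) tuples in order of first appearance.
    """
    groups = {}
    for identifier, sequence in records:
        groups.setdefault(sequence.upper(), []).append(identifier)
    return [(ids, seq) for seq, ids in groups.items()]


# Hypothetical entries from three source databases.
records = [
    ("sp|EXAMPLE1", "MKTAYIAKQR"),
    ("pir|EXAMPLE2", "MKTAYIAKQR"),
    ("pdb|EXAMPLE3", "MKTAYIAKQL"),
]
merged = merge_identical(records)
```

Note that this removes only exact duplicates: the third entry differs by a single residue and stays separate, so a database merged this way can still contain near-identical sequences.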

SwissProt

SwissProt is maintained by Amos Bairoch at the University of Geneva. SwissProt is a highly curated, highly cross-referenced, non-redundant database. Unfortunately, the cost of this labor-intensive quality enhancement process is that not every sequence is in SwissProt. If you wish to look up information about a sequence, SwissProt is the first place to look.

PIR

The Protein Identification Resource was originated by the late Margaret Dayhoff. It attempts to combine the advantages of a complete database with those of a non-redundant one.

PDB

The Protein Data Bank, maintained by Brookhaven National Laboratory (Long Island, New York, USA), contains all publicly available solved protein structures. Searches against PDB can be used to ask whether any known 3D structures are similar to your query protein.

OWL

Prot. Eng. 3:153

Prosite

Prosite is a database of protein motifs maintained by Amos Bairoch at the University of Geneva (NAR 19:2241, 1991). Each motif (defined by either a regular expression or a profile) is accompanied by a description of the motif and what is known about its biology, as well as a listing of the true positive, false negative, and false positive SwissProt entries for the pattern.
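Because the regular-expression motifs use a compact pattern syntax (elements separated by "-", x for any residue, [..] for allowed residues, {..} for forbidden residues, (n) or (n,m) for repeats), they can be translated into ordinary regular expressions. The Python sketch below is a simplified illustration and not the Prosite scanning software: the function names and the test sequence are made up, and rarer syntax such as an anchor placed inside brackets is not handled.

```python
import re


def prosite_to_regex(pattern):
    """Translate a Prosite-style pattern into a Python regular expression."""
    regex = pattern.rstrip(".").replace("-", "")
    regex = regex.replace("{", "[^").replace("}", "]")  # forbidden residues
    regex = regex.replace("(", "{").replace(")", "}")   # repeat counts
    regex = regex.replace("x", ".")                     # any residue
    regex = regex.replace("<", "^").replace(">", "$")   # terminus anchors
    return regex


def scan(pattern, sequence):
    """Return (start, matched_text) for every match, including overlapping ones."""
    lookahead = "(?=(" + prosite_to_regex(pattern) + "))"
    return [(m.start(), m.group(1)) for m in re.finditer(lookahead, sequence)]


# An N-glycosylation-style pattern scanned against a made-up sequence.
hits = scan("N-{P}-[ST]-{P}.", "MKNGSAAANPSA")
```

A match found this way says only that the sequence fits the pattern; the false positive listings in each entry are a reminder that short patterns also match by chance.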

BLOCKS

BLOCKS is a database developed by Steve Henikoff and colleagues. A block is a gap-free multiple alignment of sequences based on Prosite (Henikoff & Henikoff, NAR 19:6565 1991).
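The gap-free property can be stated as a simple check: every row of a block has the same length and contains no gap characters. The Python sketch below illustrates only that definition; the function name, the "-" and "." gap symbols, and the example rows are assumptions, and it says nothing about how BLOCKS entries are actually constructed.

```python
def is_gap_free_block(rows, gap_chars="-."):
    """True if rows form an ungapped alignment: equal lengths and no gap characters."""
    if not rows:
        return False
    width = len(rows[0])
    return all(
        len(row) == width and not any(ch in gap_chars for ch in row)
        for row in rows
    )


# Made-up aligned segments.
ungapped = is_gap_free_block(["GKST", "GKTT", "GKSS"])
gapped = is_gap_free_block(["GK-T", "GKTT"])
ragged = is_gap_free_block(["GKS", "GKTT"])
```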

This document is intended to serve as a guide to using certain bioinformatics programs. It cannot be guaranteed to be free of errors or completely up-to-date. If you know of errors or other shortcomings of this document, please mail them to Keith Robison (Church Lab, HMS Genetics).


