Bioinformatics Encyclopedia
 
 


Sequence Database




In bioinformatics, a sequence database is a large collection of DNA, protein, or other biological sequences stored on a computer. A database may be restricted to a single organism, such as one holding all the proteins of Saccharomyces cerevisiae, or it may include sequences from every organism whose DNA has been sequenced.
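As a rough illustration of what "a collection of sequences stored on a computer" can look like in practice, the sketch below loads records from FASTA-formatted text into a dictionary keyed by identifier. FASTA is a common plain-text exchange format, but real databases use far richer storage; the identifiers, descriptions, and sequences here are made up, and the helper names parse_fasta and load_database are hypothetical.

```python
from io import StringIO


def parse_fasta(handle):
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence data may be wrapped across several lines.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Hypothetical records, invented purely for illustration.
EXAMPLE_FASTA = """>seq1 hypothetical protein A
MKTAYIAKQR
QISFVKSHFS
>seq2 hypothetical protein B
MKTAYIAKQRQISF
"""


def load_database(text):
    """Build a dict mapping each record identifier to its sequence."""
    db = {}
    for header, seq in parse_fasta(StringIO(text)):
        identifier = header.split()[0]  # first token of the header line
        db[identifier] = seq
    return db


database = load_database(EXAMPLE_FASTA)
```

A dictionary is enough for a toy example; anything at the scale described above would need indexed, on-disk storage.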

Search issues

Sequence databases can be searched in a variety of ways. Probably the most common is to search for sequences similar to a target protein or gene whose sequence the user already knows. The BLAST program performs searches of this type.
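To make the idea of a similarity search concrete, here is a minimal sketch that ranks database entries by how many short subsequences (k-mers) they share with a query, using the Jaccard index. This is not the BLAST algorithm, which uses seeded local alignment with statistical scoring; it is only a simplified stand-in. The toy database, its entry names, and the functions kmers, jaccard, and search are assumptions made for illustration.

```python
def kmers(seq, k=3):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard index of two sets: size of intersection over size of union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def search(database, query, k=3, top=3):
    """Rank database entries by k-mer similarity to the query."""
    query_kmers = kmers(query, k)
    scored = [
        (jaccard(query_kmers, kmers(seq, k)), name)
        for name, seq in database.items()
    ]
    # Highest score first; ties broken alphabetically by name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(name, score) for score, name in scored[:top]]


# Hypothetical entries, invented for illustration.
TOY_DATABASE = {
    "entry_a": "ACGTACGTGA",
    "entry_b": "ACGTACGTGC",
    "entry_c": "TTTTGGGGCC",
}

hits = search(TOY_DATABASE, "ACGTACGTGA")
for name, score in hits:
    print(f"{name}\t{score:.2f}")
```

An exact match scores 1.0, a near match scores somewhat lower, and an unrelated sequence scores 0.0. Comparing the query against every entry does not scale to large databases, which is one reason dedicated tools index their data.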

Many inputs create inconsistencies

A major problem with all the large genetic sequence databases is that records are deposited in them from a wide range of sources, from individual researchers to large genome sequencing centers. As a result, the sequences themselves, and especially the biological annotations attached to them, vary tremendously in quality. There is also much redundancy, because multiple labs often submit sequences that are identical, or nearly identical, to others already in the databases.
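The redundancy described above can be detected, in its simplest form, by grouping records whose sequences are exactly the same. The sketch below does only that, treating upper and lower case as equivalent; finding nearly identical sequences would require an alignment or clustering step that is not shown. The record identifiers, sequences, and the function name group_identical are hypothetical.

```python
from collections import defaultdict


def group_identical(records):
    """Group record identifiers that share exactly the same sequence.

    Case is normalized so that 'acgt' and 'ACGT' count as identical.
    Only groups with more than one member are returned.
    """
    groups = defaultdict(list)
    for identifier, seq in records:
        groups[seq.upper()].append(identifier)
    return {seq: ids for seq, ids in groups.items() if len(ids) > 1}


# Hypothetical submissions from three labs, invented for illustration.
records = [
    ("lab1_0001", "ATGGCCATTG"),
    ("lab2_0417", "atggccattg"),  # same sequence, different case
    ("lab3_0033", "ATGGCCATTA"),  # differs in the last base
]

duplicates = group_identical(records)
for seq, ids in duplicates.items():
    print(seq, "submitted as", ", ".join(ids))
```

Here the first two records are reported as duplicates, while the third, which differs by a single base, is not caught by exact matching.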

Many annotations are based not on laboratory experiments but on the results of sequence similarity searches against previously annotated sequences. Once a sequence has been annotated by similarity to others and deposited in the database, it can in turn become the basis for future annotations. This leads to the transitive annotation problem: several such annotation transfers by sequence similarity may lie between a particular database record and actual wet lab experimental information, and an error at any step propagates to every record downstream. One must therefore regard the biological annotations in major sequence databases with a considerable degree of skepticism unless they can be verified by reference to published papers describing high-quality experimental data, or at least by reference to a human-curated sequence database.
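One way to picture the transitive annotation problem is to count how many similarity-based transfers separate a record from experimental evidence. The sketch below assumes a simplified provenance model in which each record stores either a marker for direct experimental support or the identifier of the record its annotation was copied from. Real databases do not necessarily record provenance this way; the data, the EXPERIMENT marker, and the function name transfers_to_experiment are all hypothetical.

```python
EXPERIMENT = "experiment"  # marker for direct wet lab evidence (assumed model)


def transfers_to_experiment(provenance, identifier):
    """Count annotation transfers between a record and experimental data.

    provenance maps each record identifier to EXPERIMENT or to the
    identifier of the record its annotation was transferred from.
    Returns None if the chain is broken or circular.
    """
    steps = 0
    seen = set()
    current = identifier
    while True:
        if current in seen:
            return None  # circular chain: no experimental basis reachable
        seen.add(current)
        source = provenance.get(current)
        if source is None:
            return None  # provenance unknown for this record
        if source == EXPERIMENT:
            return steps
        steps += 1
        current = source


# Hypothetical provenance table, invented for illustration.
provenance = {
    "P1": EXPERIMENT,  # characterized in the lab
    "P2": "P1",        # annotated by similarity to P1
    "P3": "P2",        # annotated by similarity to P2
    "P4": "P9",        # source record is not in the table
}

for name in ("P1", "P3", "P4"):
    print(name, transfers_to_experiment(provenance, name))
```

Under this model, the larger the count, the more opportunities there have been for an incorrect annotation to be passed along, and a result of None signals that no experimental basis can be traced at all.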


This article is licensed under the GNU Free Documentation License. It uses material from Wikipedia Encyclopedia article "Sequence Database"



Last updated: July 2008
Copyright © 2003-2008 Julian Rubin