Bioinformatics Data Formats

Part of the Bioinformatics For Dummies Cheat Sheet

When you're using the Internet to help with your bioinformatics project, you come across data in all sorts of different formats. The following table can help you understand common bioinformatics formats and what you can and cannot do with them.

Format Name          Description
RAW                  Sequence format that doesn't contain any header. Spaces and numbers are usually tolerated.
FASTA                The default format accepted by most sequence analysis tools. Sequence format that contains a header line followed by the sequence:
                     >name
                     AGCTGTGTGGGTTGGTGGGTT
PIR                  Sequence format that's similar to FASTA but less common
MSF                  Multiple sequence alignment format
CLUSTAL              Multiple sequence alignment format (works with T-Coffee)
TXT                  Plain text format
GIF, JPEG, PNG, PDF  Graphic formats. Do not use them to store important information.
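
To make the RAW and FASTA descriptions in the table above concrete, here is a minimal Python sketch. It is an illustration only, not part of any particular tool: the function names clean_raw and parse_fasta are hypothetical, and the code assumes that a RAW sequence is cleaned by keeping letters only and that a FASTA header is everything after the ">" on its line.

```python
def clean_raw(text):
    """Return a RAW sequence with spaces, digits, and line breaks removed.

    Assumption: only letters are kept, so gap or stop symbols such as
    '-' and '*' are dropped as well.
    """
    return "".join(ch for ch in text if ch.isalpha()).upper()


def parse_fasta(text):
    """Return a list of (name, sequence) pairs from FASTA-formatted text."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            # Sequence lines may be wrapped; join them into one string.
            chunks.append(clean_raw(line))
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


if __name__ == "__main__":
    print(clean_raw("1 agct gtgt 11\nGGGT"))
    print(parse_fasta(">name\nAGCTGTGTGGGTTGGTGGGTT\n"))
```

Sequence text that appears before the first header line is ignored by parse_fasta, because FASTA requires a header; pass such text to clean_raw instead to treat it as RAW.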