Bioinformatics Data Formats
When you use the Internet for a bioinformatics project, you come across data in many different formats. The following table can help you understand common bioinformatics formats and what you can and cannot do with them.
|RAW||Sequence format that doesn’t contain any header. Spaces and numbers are usually tolerated.|
|FASTA||This is the default format. Sequence format that contains a header line beginning with > (for example, >name), followed by the sequence.|
|PIR||Sequence format that’s similar to FASTA but less common|
|MSF||Multiple sequence alignment format|
|CLUSTAL||Multiple sequence alignment format (works with T-Coffee)|
|GIF, JPEG, PNG, PDF||Graphic formats. Do not use them to store important