Bioinformatics For Dummies Cheat Sheet

Jean-Michel Claverie

Cedric Notredame

Updated

2022-04-12 18:59:12

From the book

Bioinformatics For Dummies

Bioinformatics is the marriage of molecular biology and information technology. The websites listed here point you to basic bioinformatics data and to tools that help you analyze DNA/RNA and protein sequences.

All of this data comes at you in several formats, so becoming familiar with various format types helps you know how to interpret and store the data.

Where to find bioinformatics data

Bioinformatics combines information technology and molecular biology, so it makes sense that the internet is the main arena for pursuing bioinformatics information.

The following list offers links to helpful websites around the world and the areas that they specialize in.

Ensembl: Human genome
GenBank/DDBJ/EMBL: Nucleotide sequence
PubMed: Literature references
Swiss Institute of Bioinformatics: Annotated protein sequences
InterProScan: Protein domains
OMIM: Genetic diseases
GenomeNet: Metabolic pathways

Websites for analyzing DNA/RNA sequences

The bioinformatics websites in the following list help you analyze DNA and RNA sequences. In the marriage of information technology and molecular biology that is bioinformatics, this type of analysis is what it’s all about.

Webcutter: Restriction map
GenomeScan: Gene discovery
blastn, tblastn, blastx: Database search
The Genome Browser: Browse the ultimate data!
Mfold: RNA structure prediction

Websites for analyzing protein sequences

With bioinformatics you can explore molecular biology using information technology. The links to the websites in the following list focus on protein sequences. Some offer searchable databases, others help you investigate a single protein; all are helpful.

BLAST: Database homology search
Entrez: Database search
InterProScan: Find protein domains
ExPASy: Analyze a protein
ClustalW: Multiple sequence alignment
T-Coffee: Evaluate multiple alignment
Jalview: Multiple alignment editor
PSIPRED: Secondary structure prediction
Cn3D: Display and spin 3-D structures

Bioinformatics data formats

When you’re using the internet to help with your bioinformatics project, you come across data in all sorts of different formats. The following table can help you understand common bioinformatics formats and what you can and cannot do with them.

Format Name	Description
RAW	Sequence format that doesn’t contain any header. Spaces and numbers are usually tolerated.
FASTA	This is the default format. Sequence format that contains a header line starting with > and then the sequence on the line(s) below it, for example a header of >name followed by AGCTGTGTGGGTTGGTGGGTT
PIR	Sequence format that’s similar to FASTA but less common
MSF	Multiple sequence alignment format
CLUSTAL	Multiple sequence alignment format (works with T-Coffee)
TXT	Text format
GIF, JPEG, PNG, PDF	Graphic formats. Do not use them to store important information.
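
To make the difference between RAW and FASTA concrete, here is a minimal Python sketch. It is an illustration only, not a tool from the book: the function names (clean_raw, to_fasta, parse_fasta), the 60-character line width, and the choice to uppercase the sequence are all assumptions made for this example.

```python
def clean_raw(text):
    """Keep only letters from RAW input, dropping the spaces, digits,
    and line breaks that RAW format usually tolerates.
    Uppercasing is an assumption made for this sketch."""
    return "".join(ch for ch in text if ch.isalpha()).upper()


def to_fasta(name, sequence, width=60):
    """Build a FASTA record: a '>' header line, then the sequence
    wrapped at `width` characters per line (60 is an arbitrary choice)."""
    lines = [">" + name]
    for start in range(0, len(sequence), width):
        lines.append(sequence[start:start + width])
    return "\n".join(lines) + "\n"


def parse_fasta(text):
    """Read FASTA text into a dict mapping each header (without '>')
    to its sequence, joining sequence lines that were wrapped."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {key: "".join(parts) for key, parts in records.items()}


if __name__ == "__main__":
    raw = "1 agctgtgtgg 11 gttggtgggtt"
    record = to_fasta("name", clean_raw(raw))
    print(record, end="")
    print(parse_fasta(record))
```

Keeping sequences in a text format such as FASTA means they can be read back this way later, which is not possible once they are stored only as a graphic.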

About This Article

About the book author:

Jean-Michel Claverie is Professor of Medical Bioinformatics at the School of Medicine of the Université de la Méditerranée, and a consultant in genomics and bioinformatics. He is the founder and current head of the Structural & Genomic Information Laboratory, located in Marseilles, a sunny city on the Mediterranean coast of France. Using science as a pretext to travel, Jean-Michel has held positions in Paris (France), Sherbrooke (PQ, Canada), the Salk Institute (La Jolla, CA), the Pasteur Institute (Paris), Incyte Pharmaceuticals (Palo Alto, CA), and the National Center for Biotechnology Information (Bethesda, MD). He has used computers in biology since the early days: his Ph.D. work involved modeling biochemical reactions by programming an 8K Honeywell 516 computer right from the console switches! Although he has no clear recollection of it, he has been credited with introducing the French word “bioinformatique” in the late eighties, before involuntarily coining the catchy “bioinformatics” by mistranslating it while giving a talk in English!
Jean-Michel’s current research interests are in microbial and structural genomics, and in the development of bioinformatic methods for the prediction of gene function. He is the author or coauthor of more than 150 scientific publications, and a member of numerous international review panels and scientific councils. In his spare time, he enjoys the relaxed pace of life in Marseilles, with his wife Chantal and their two sons, Nicholas and Raphael.

Cedric Notredame is a researcher at the French National Centre for Scientific Research. Cedric has used and abused the facilities offered by science to wander around Europe. After a Ph.D. at EMBL (Heidelberg, Germany) and at the European Bioinformatics Institute (Cambridge, UK) under the supervision of Des Higgins (yes, the ClustalW guy), Cedric did a post-doc at the National Institute of Medical Research (London, UK), in the lab of Willie Taylor and under the supervision of Jaap Heringa. He then did a post-doc in Lausanne (Switzerland) with Phillip Bucher, and remained involved with the Swiss Institute of Bioinformatics for several years. Having had his share of rain, snow, and wind, Cedric has finally settled in Marseilles, where the sun and the sea are simply warmer than any other place he has lived in.
Cedric dedicates most of his research to the multiple sequence alignment problem and its many applications in biology. His friends claim that his entire life (past, present, future) is somehow stuffed into the T-Coffee multiple-sequence alignment package. When he is not busy dismantling T-Coffee and brewing new sequences, Cedric enjoys life in the company of his wife, Marita.

This article can be found in the category:

General Information Technology
