EMBOSS

EMBOSS (the European Molecular Biology Open Software Suite) is an open-source set of applications for sequence analysis.

1 Available Console Commands

The commands below are grouped by function for easy access. Some commands appear under more than one group.

1.1 ACD File Utilities

1.2 Merging Sequences to Make a Consensus

  • cons: Creates a consensus from multiple alignments
  • megamerger: Merge two large overlapping nucleic acid sequences
  • merger: Merge two overlapping sequences

1.3 Finding Differences Between Sequences

  • diffseq: Find differences between nearly identical sequences

1.4 Dot Plot Sequence Comparisons

  • dotmatcher: Displays a thresholded dotplot of two sequences
  • dotpath: Non-overlapping wordmatch dotplot of two sequences
  • dottup: Displays a wordmatch dotplot of two sequences
  • polydot: Displays all-against-all dotplots of a set of sequences

1.5 Global Sequence Alignment

  • est2genome: Align EST and genomic DNA sequences
  • needle: Needleman-Wunsch global alignment
  • stretcher: Finds the best global alignment between two sequences
  • esim4: Align an mRNA to a genomic DNA sequence
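
To make the idea behind needle concrete, here is a minimal Python sketch of Needleman-Wunsch global alignment scoring. It is not EMBOSS code: the function name, the match/mismatch scores and the linear gap penalty are assumptions chosen for illustration, whereas needle itself works with a substitution matrix and separate gap open and gap extension penalties.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of sequences a and b.

    Illustrative scoring only: a flat match/mismatch score and a linear
    gap penalty (assumed values, not the needle defaults).
    """
    # prev[j] is the best score for aligning the processed prefix of a
    # against b[:j]; the first row is all gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # first column: a[:i] aligned against nothing
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(prev[j - 1] + pair,   # align a[i-1] with b[j-1]
                            prev[j] + gap,        # gap in b
                            curr[j - 1] + gap))   # gap in a
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(needleman_wunsch_score("GATTACA", "GATTACA"))
```

Only two rows of the dynamic programming table are kept, which is enough for the score; recovering the alignment itself would need the full table or a traceback structure.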

1.6 Local Sequence Alignment

  • matcher: Finds the best local alignments between two sequences
  • seqmatchall: All-against-all comparison of a set of sequences
  • supermatcher: Match large sequences against one or more other sequences
  • water: Smith-Waterman local alignment
  • wordmatch: Finds all exact matches of a given size between 2 sequences
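
As a rough illustration of what water computes, the following Python sketch scores a Smith-Waterman local alignment. It is not EMBOSS code: the function name and the scoring values are assumptions, and a linear gap penalty stands in for the gap open and gap extension penalties that water takes.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Illustrative scoring only (assumed values, linear gap penalty).
    """
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            # Flooring at 0 lets an alignment start anywhere, which is
            # what makes the alignment local rather than global.
            cell = max(0,
                       prev[j - 1] + pair,
                       prev[j] + gap,
                       curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTT", "GGACGGG"))
```

The only differences from a global alignment are the floor at zero and taking the maximum over every cell instead of reading the final corner.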

1.7 Multiple Sequence Alignment

  • emma: Multiple alignment program - interface to ClustalW program
  • infoalign: Information on a multiple sequence alignment
  • plotcon: Plot quality of conservation of a sequence alignment
  • prettyplot: Displays aligned sequences, with colouring and boxing
  • showalign: Displays a multiple sequence alignment
  • tranalign: Align nucleic coding regions given the aligned proteins
  • mse: Multiple Sequence Editor

1.8 Publication-Quality Display

  • abiview: Reads an ABI file and displays the trace
  • cirdna: Draws circular maps of DNA constructs
  • lindna: Draws linear maps of DNA constructs
  • pepnet: Displays proteins as a helical net
  • pepwheel: Shows protein sequences as helices
  • prettyplot: Displays aligned sequences, with colouring and boxing
  • prettyseq: Output sequence with translated ranges
  • remap: Display sequence with restriction sites, translation etc
  • seealso: Finds programs sharing group names
  • showalign: Displays a multiple sequence alignment
  • showdb: Displays information on the currently available databases
  • showfeat: Show features of a sequence
  • showseq: Display a sequence with features, translation etc
  • sixpack: Display a DNA sequence with 6-frame translation and ORFs
  • textsearch: Search sequence documentation. Slow, use SRS and Entrez!

1.9 Enzyme Kinetics Calculations

  • findkm: Find Km and Vmax for an enzyme reaction

1.10 Manipulation and Display of Sequence Annotation

  • coderet: Extract CDS, mRNA and translations from feature tables
  • extractfeat: Extract features from a sequence
  • maskfeat: Mask off features of a sequence
  • showfeat: Show features of a sequence
  • twofeat: Finds neighbouring pairs of features in sequences

1.11 Information and General Help for Users

  • infoalign: Information on a multiple sequence alignment
  • infoseq: Displays some simple information about sequences
  • seealso: Finds programs sharing group names
  • showdb: Displays information on the currently available databases
  • textsearch: Search sequence documentation. Slow, use SRS and Entrez!
  • tfm: Displays a program's help documentation manual
  • whichdb: Search all databases for an entry
  • wossname: Finds programs by keywords in their one-line documentation

1.12 Menu Interface(s)

  • emnu: Simple menu of EMBOSS applications

1.13 Nucleic Acid Secondary Structure

1.14 Codon Usage Analysis

  • cai: CAI codon adaptation index
  • chips: Codon usage statistics
  • codcmp: Codon usage table comparison
  • cusp: Create a codon usage table
  • syco: Synonymous codon usage Gribskov statistic plot

1.15 Composition of Nucleotide Sequences

  • banana: Bending and curvature plot in B-DNA
  • btwisted: Calculates the twisting in a B-DNA sequence
  • chaos: Create a chaos game representation plot for a sequence
  • compseq: Count composition of dimer/trimer/etc words in a sequence
  • dan: Calculates DNA RNA/DNA melting temperature
  • freak: Residue/base frequency table or plot
  • isochore: Plots isochores in large DNA sequences
  • sirna: Finds siRNA duplexes in mRNA
  • wordcount: Counts words of a specified size in a DNA sequence
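
The word-counting idea behind wordcount and compseq can be sketched in a few lines of Python. This is only an illustration of the technique, not the EMBOSS implementation; the function name count_words is hypothetical, and the sketch counts overlapping words with no handling of ambiguity codes.

```python
from collections import Counter


def count_words(seq, size):
    """Count every overlapping word of length `size` in `seq`."""
    seq = seq.upper()
    # Slide a window of `size` residues along the sequence one position
    # at a time; a sequence shorter than `size` yields no words.
    return Counter(seq[i:i + size] for i in range(len(seq) - size + 1))


if __name__ == "__main__":
    for word, n in sorted(count_words("ACGTACGT", 4).items()):
        print(word, n)
```

With a word size of 2 or 3 this gives dimer or trimer composition for a single sequence.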

1.16 CpG Island Detection and Analysis

1.17 Predictions of Genes and Other Genomic Features

  • getorf: Finds and extracts open reading frames (ORFs)
  • marscan: Finds MAR/SAR sites in nucleic sequences
  • plotorf: Plot potential open reading frames
  • showorf: Pretty output of DNA translations
  • sixpack: Display a DNA sequence with 6-frame translation and ORFs
  • syco: Synonymous codon usage Gribskov statistic plot
  • tcode: Fickett TESTCODE statistic to identify protein-coding DNA
  • wobble: Wobble base plot

1.18 Nucleic Acid Motif Searches

  • dreg: Regular expression search of a nucleotide sequence
  • fuzznuc: Nucleic acid pattern search
  • fuzztran: Protein pattern search after translation
  • marscan: Finds MAR/SAR sites in nucleic sequences

1.19 Nucleic Acid Sequence Mutation

  • msbar: Mutate sequence beyond all recognition
  • shuffleseq: Shuffles a set of sequences maintaining composition
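
Shuffling while maintaining composition means the output contains exactly the same residues as the input, only in a different order. A minimal Python sketch of that idea follows; it is not the shuffleseq source, and the function name and the seed parameter are assumptions added so that runs can be repeated.

```python
import random


def shuffle_sequence(seq, seed=None):
    """Return `seq` with its residues in random order.

    The residue composition is unchanged because the residues are only
    permuted, never substituted. `seed` is an illustrative parameter for
    reproducible shuffles.
    """
    rng = random.Random(seed)
    residues = list(seq)
    rng.shuffle(residues)
    return "".join(residues)


if __name__ == "__main__":
    print(shuffle_sequence("ACGTACGTAAAA", seed=1))
```

Shuffled copies like this are commonly used as a null model when judging whether a score obtained on the real sequence is meaningful.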

1.20 Primer Prediction

  • eprimer3: Picks PCR primers and hybridization oligos
  • primersearch: Searches DNA sequences for matches with primer pairs
  • stssearch: Search a DNA database for matches with a set of STS primers

1.21 Nucleic Acid Profile Generation and Searching

  • profit: Scan a sequence or database with a matrix or profile
  • prophecy: Creates matrices/profiles from multiple alignments
  • prophet: Gapped alignment for profiles

1.22 Nucleic Acid Repeat Detection

1.23 Restriction Enzyme Sites in Nucleotide Sequences

  • recoder: Remove restriction sites but maintain same translation
  • redata: Search REBASE for enzyme name, references, suppliers etc
  • remap: Display sequence with restriction sites, translation etc
  • restover: Find restriction enzymes producing specific overhang
  • restrict: Finds restriction enzyme cleavage sites
  • showseq: Display a sequence with features, translation etc
  • silent: Silent mutation restriction enzyme scan

1.24 Transcription Factors, Promoters and Terminator Prediction

  • tfscan: Scans DNA sequences for transcription factors

1.25 Translation of Nucleotide Sequence to Protein Sequence

  • backtranambig: Back translate a protein sequence to ambiguous codons
  • backtranseq: Back translate a protein sequence
  • coderet: Extract CDS, mRNA and translations from feature tables
  • plotorf: Plot potential open reading frames
  • prettyseq: Output sequence with translated ranges
  • remap: Display sequence with restriction sites, translation etc
  • showorf: Pretty output of DNA translations
  • showseq: Display a sequence with features, translation etc
  • sixpack: Display a DNA sequence with 6-frame translation and ORFs
  • transeq: Translate nucleic acid sequences
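
The core step shared by the translation tools above is reading a nucleotide sequence three bases at a time and looking each codon up in a genetic code table. The Python sketch below shows that step for the three forward frames using the standard genetic code. It is an illustration, not transeq itself: the names CODON_TABLE and translate are hypothetical, unknown codons are written as X by assumption, and reverse-strand frames are left out.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(seq, frame=1):
    """Translate `seq` in forward frame 1, 2 or 3.

    Stop codons become '*', codons containing anything other than
    A, C, G, T/U become 'X', and a trailing partial codon is dropped.
    """
    if frame not in (1, 2, 3):
        raise ValueError("frame must be 1, 2 or 3")
    seq = seq.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(frame - 1, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


if __name__ == "__main__":
    for frame in (1, 2, 3):
        print(frame, translate("ATGGCCTAA", frame))
```

A six-frame view in the style of sixpack would apply the same lookup to the reverse complement as well.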

1.26 Protein Secondary Structure

  • garnier: Predicts protein secondary structure
  • helixturnhelix: Report nucleic acid binding motifs
  • hmoment: Hydrophobic moment calculation
  • pepcoil: Predicts coiled coil regions
  • pepnet: Displays proteins as a helical net
  • pepwheel: Shows protein sequences as helices
  • tmap: Displays membrane spanning regions

1.27 Protein Tertiary Structure

  • psiphi: Phi and psi torsion angles from protein coordinates

1.28 Composition of Protein Sequences

  • backtranambig: Back translate a protein sequence to ambiguous codons
  • backtranseq: Back translate a protein sequence
  • charge: Protein charge plot
  • checktrans: Reports STOP codons and ORF statistics of a protein
  • compseq: Count composition of dimer/trimer/etc words in a sequence
  • emowse: Protein identification by mass spectrometry
  • freak: Residue/base frequency table or plot
  • iep: Calculates the isoelectric point of a protein
  • mwcontam: Shows molwts that match across a set of files
  • mwfilter: Filter noisy molwts from mass spec output
  • octanol: Displays protein hydropathy
  • pepinfo: Plots simple amino acid properties in parallel
  • pepstats: Protein statistics
  • pepwindow: Displays protein hydropathy
  • pepwindowall: Displays protein hydropathy of a set of sequences

1.29 Protein Motif Searches

  • antigenic: Finds antigenic sites in proteins
  • digest: Protein proteolytic enzyme or reagent cleavage digest
  • epestfind: Finds PEST motifs as potential proteolytic cleavage sites
  • fuzzpro: Protein pattern search
  • fuzztran: Protein pattern search after translation
  • helixturnhelix: Report nucleic acid binding motifs
  • oddcomp: Find protein sequence regions with a biased composition
  • patmatdb: Search a protein sequence with a motif
  • patmatmotifs: Search a PROSITE motif database with a protein sequence
  • pepcoil: Predicts coiled coil regions
  • preg: Regular expression search of a protein sequence
  • pscan: Scans proteins using PRINTS
  • sigcleave: Reports protein signal cleavage sites

1.30 Protein Sequence Mutation

  • msbar: Mutate sequence beyond all recognition
  • shuffleseq: Shuffles a set of sequences maintaining composition

1.31 Protein Profile Generation and Searching

  • profit: Scan a sequence or database with a matrix or profile
  • prophecy: Creates matrices/profiles from multiple alignments
  • prophet: Gapped alignment for profiles

1.32 Database Installation

1.33 Database Indexing

  • dbiblast: Index a BLAST database
  • dbifasta: Database indexing for fasta file databases
  • dbiflat: Index a flat file database
  • dbigcg: Index a GCG formatted database
  • dbxfasta: Database b+tree indexing for fasta file databases
  • dbxflat: Database b+tree indexing for flat file databases
  • dbxgcg: Database b+tree indexing for GCG formatted databases

1.34 Utility Tools

  • embossdata: Finds or fetches data files read by EMBOSS programs
  • embossversion: Writes the current EMBOSS version number