EMBOSS
EMBOSS (the European Molecular Biology Open Software Suite) is an open-source set of applications for sequence analysis.
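1 Available Console Commands
The EMBOSS console commands are listed below, grouped by function for easy access. A program that belongs to several groups appears under each of them.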
1.1 ACD File Utilities
- acdc: ACD compiler
- acdcpretty: ACD pretty printing utility
- acdtable: Creates an HTML table from an ACD file
- acdtrace: ACD compiler on-screen trace
- acdvalid: ACD file validation
1.2 Merging Sequences to Make a Consensus
- cons: Creates a consensus from multiple alignments
- megamerger: Merge two large overlapping nucleic acid sequences
- merger: Merge two overlapping sequences
1.3 Finding Differences Between Sequences
- diffseq: Find differences between nearly identical sequences
1.4 Dot Plot Sequence Comparisons
- dotmatcher: Displays a thresholded dotplot of two sequences
- dotpath: Non-overlapping wordmatch dotplot of two sequences
- dottup: Displays a wordmatch dotplot of two sequences
- polydot: Displays all-against-all dotplots of a set of sequences
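The wordmatch dotplots listed above rest on finding identical words shared by two sequences. A minimal Python sketch of that idea follows; it is not EMBOSS code, and the function name `word_matches` and the default word size are assumptions made for illustration.

```python
def word_matches(a, b, wordsize=4):
    """Return (i, j) pairs (0-based) where a[i:i+wordsize] == b[j:j+wordsize]."""
    # Index every word of b by its start position.
    index = {}
    for j in range(len(b) - wordsize + 1):
        index.setdefault(b[j:j + wordsize], []).append(j)
    # Look up every word of a in that index; each hit is one dot of the plot.
    hits = []
    for i in range(len(a) - wordsize + 1):
        for j in index.get(a[i:i + wordsize], []):
            hits.append((i, j))
    return hits


print(word_matches("ACGTACGT", "TACG"))  # [(3, 0)]
```

Each returned pair corresponds to one dot; runs of dots along a diagonal mark longer shared regions.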
1.5 Global Sequence Alignment
- est2genome: Align EST and genomic DNA sequences
- needle: Needleman-Wunsch global alignment
- stretcher: Finds the best global alignment between two sequences
- esim4: Align an mRNA to a genomic DNA sequence
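To show the kind of computation behind a Needleman-Wunsch global alignment, here is a minimal Python sketch that returns only the optimal score. It is not the needle implementation: the name `nw_score`, the simple match/mismatch scores and the single linear gap penalty are assumptions for illustration, whereas needle works with a substitution matrix and separate gap opening and extension penalties.

```python
def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Optimal global alignment score of a and b with a linear gap cost."""
    # prev[j] is the best score for aligning the rows seen so far with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against an empty prefix of b
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


print(nw_score("ACGT", "AGT"))  # 1: three matches and one gap
```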
1.6 Local Sequence Alignment
- matcher: Finds the best local alignments between two sequences
- seqmatchall: All-against-all comparison of a set of sequences
- supermatcher: Match large sequences against one or more other sequences
- water: Smith-Waterman local alignment
- wordmatch: Finds all exact matches of a given size between 2 sequences
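A Smith-Waterman local alignment differs from a global one in that scores are never allowed to drop below zero, so the best-scoring pair of subsequences is found. The Python sketch below illustrates this under simplified assumptions (the name `sw_score`, fixed match/mismatch scores and a linear gap penalty are illustrative, not what water uses).

```python
def sw_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between any substrings of a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Clamping at 0 lets an alignment restart anywhere.
            curr.append(max(0, diag, prev[j] + gap, curr[j - 1] + gap))
            best = max(best, curr[j])
        prev = curr
    return best


print(sw_score("TTTACGTTT", "GGACGGG"))  # 6: the shared "ACG"
```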
1.7 Multiple Sequence Alignment
- emma: Multiple alignment program - interface to ClustalW program
- infoalign: Information on a multiple sequence alignment
- plotcon: Plot quality of conservation of a sequence alignment
- prettyplot: Displays aligned sequences, with colouring and boxing
- showalign: Displays a multiple sequence alignment
- tranalign: Align nucleic coding regions given the aligned proteins
- mse: Multiple Sequence Editor
1.8 Publication-Quality Display
- abiview: Reads an ABI file and displays the trace
- cirdna: Draws circular maps of DNA constructs
- lindna: Draws linear maps of DNA constructs
- pepnet: Displays proteins as a helical net
- pepwheel: Shows protein sequences as helices
- prettyplot: Displays aligned sequences, with colouring and boxing
- prettyseq: Output sequence with translated ranges
- remap: Display sequence with restriction sites, translation etc
- seealso: Finds programs sharing group names
- showalign: Displays a multiple sequence alignment
- showdb: Displays information on the currently available databases
- showfeat: Show features of a sequence
- showseq: Display a sequence with features, translation etc
- sixpack: Display a DNA sequence with 6-frame translation and ORFs
- textsearch: Search sequence documentation. Slow, use SRS and Entrez!
1.9 Enzyme Kinetics Calculations
- findkm: Find Km and Vmax for an enzyme reaction
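One standard way to estimate Km and Vmax from measured data is the Hanes-Woolf linearisation, S/v = S/Vmax + Km/Vmax, fitted by least squares. The Python sketch below illustrates that approach only; the function name `hanes_woolf` is an assumption, and this page does not say how findkm itself performs the fit.

```python
def hanes_woolf(substrate, velocity):
    """Estimate (Km, Vmax) by a least-squares line through S/v plotted against S."""
    xs = list(substrate)
    ys = [s / v for s, v in zip(substrate, velocity)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # slope = 1 / Vmax
    intercept = mean_y - slope * mean_x    # intercept = Km / Vmax
    vmax = 1.0 / slope
    km = intercept * vmax
    return km, vmax


# Synthetic data generated from Km = 2 and Vmax = 10 (made up for the example).
s = [1.0, 2.0, 4.0, 8.0]
v = [10.0 * x / (2.0 + x) for x in s]
km, vmax = hanes_woolf(s, v)
```

With noise-free data the fit recovers the generating values; real measurements will scatter around the line.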
1.10 Manipulation and Display of Sequence Annotation
- coderet: Extract CDS, mRNA and translations from feature tables
- extractfeat: Extract features from a sequence
- maskfeat: Mask off features of a sequence
- showfeat: Show features of a sequence
- twofeat: Finds neighbouring pairs of features in sequences
1.11 Information and General Help for Users
- infoalign: Information on a multiple sequence alignment
- infoseq: Displays some simple information about sequences
- seealso: Finds programs sharing group names
- showdb: Displays information on the currently available databases
- textsearch: Search sequence documentation. Slow, use SRS and Entrez!
- tfm: Displays a program's help documentation manual
- whichdb: Search all databases for an entry
- wossname: Finds programs by keywords in their one-line documentation
1.12 Menu Interface(s)
- emnu: Simple menu of EMBOSS applications
1.13 Nucleic Acid Secondary Structure
- einverted: Finds DNA inverted repeats
1.14 Codon Usage Analysis
- cai: CAI codon adaptation index
- chips: Codon usage statistics
- codcmp: Codon usage table comparison
- cusp: Create a codon usage table
- syco: Synonymous codon usage Gribskov statistic plot
1.15 Composition of Nucleotide Sequences
- banana: Bending and curvature plot in B-DNA
- btwisted: Calculates the twisting in a B-DNA sequence
- chaos: Create a chaos game representation plot for a sequence
- compseq: Count composition of dimer/trimer/etc words in a sequence
- dan: Calculates nucleic acid melting temperature
- freak: Residue/base frequency table or plot
- isochore: Plots isochores in large DNA sequences
- sirna: Finds siRNA duplexes in mRNA
- wordcount: Counts words of a specified size in a DNA sequence
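Counting words of a fixed size, as the composition tools above do, can be sketched in a few lines of Python. This is an illustration rather than EMBOSS code: the name `count_words` is an assumption, and the sketch counts overlapping words in a single pass.

```python
from collections import Counter


def count_words(seq, wordsize=2):
    """Count every overlapping word of length wordsize in seq."""
    seq = seq.upper()
    return Counter(seq[i:i + wordsize] for i in range(len(seq) - wordsize + 1))


print(count_words("ACGACG", 3))  # ACG occurs twice, CGA and GAC once each
```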
1.16 CpG Island Detection and Analysis
- cpgplot: Plot CpG rich areas
- cpgreport: Reports all CpG rich regions
- geecee: Calculates fractional GC content of nucleic acid sequences
- newcpgreport: Report CpG rich areas
- newcpgseek: Reports CpG rich regions
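Two quantities commonly used when looking for CpG-rich regions are the fractional GC content and the ratio of observed to expected CpG dinucleotides, where the expected count is (number of C x number of G) / length. The Python sketch below computes both for a whole sequence; the function names are assumptions, and the windowing and thresholds applied by the programs above are not reproduced.

```python
def gc_fraction(seq):
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: count(CG) * length / (count(C) * count(G))."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0  # no CpG is possible without both bases
    return seq.count("CG") * len(seq) / (c * g)


print(gc_fraction("ACGT"))  # 0.5
```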
1.17 Predictions of Genes and Other Genomic Features
- getorf: Finds and extracts open reading frames (ORFs)
- marscan: Finds MAR/SAR sites in nucleic sequences
- plotorf: Plot potential open reading frames
- showorf: Pretty output of DNA translations
- sixpack: Display a DNA sequence with 6-frame translation and ORFs
- syco: Synonymous codon usage Gribskov statistic plot
- tcode: Fickett TESTCODE statistic to identify protein-coding DNA
- wobble: Wobble base plot
1.18 Nucleic Acid Motif Searches
- dreg: Regular expression search of a nucleotide sequence
- fuzznuc: Nucleic acid pattern search
- fuzztran: Protein pattern search after translation
- marscan: Finds MAR/SAR sites in nucleic sequences
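A regular expression search of a nucleotide sequence can be illustrated with Python's standard `re` module. This sketch only shows the idea; the function name `regex_hits` is an assumption, and its report format (1-based start, end, matched text) is not the report format of dreg.

```python
import re


def regex_hits(seq, pattern):
    """Return (start, end, match) for each non-overlapping hit, 1-based inclusive."""
    return [
        (m.start() + 1, m.end(), m.group())
        for m in re.finditer(pattern, seq, re.IGNORECASE)
    ]


# Search for two restriction-site-like patterns in a made-up sequence.
print(regex_hits("ttGAATTCaaGGATCC", "GAATTC|GGATCC"))
# [(3, 8, 'GAATTC'), (11, 16, 'GGATCC')]
```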
1.19 Nucleic Acid Sequence Mutation
- msbar: Mutate sequence beyond all recognition
- shuffleseq: Shuffles a set of sequences maintaining composition
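Shuffling a sequence while maintaining its composition amounts to a random permutation of its residues. A minimal Python sketch, with the function name `shuffle_sequence` and the optional `seed` parameter as assumptions for illustration:

```python
import random


def shuffle_sequence(seq, seed=None):
    """Return a random permutation of seq; residue counts are unchanged."""
    rng = random.Random(seed)  # a private generator, so a seed gives repeatable output
    residues = list(seq)
    rng.shuffle(residues)
    return "".join(residues)


shuffled = shuffle_sequence("ACGTACGTAA", seed=1)
```

Such shuffled sequences are often used as a background when judging whether a motif or alignment score could arise by chance.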
1.20 Primer Prediction
- eprimer3: Picks PCR primers and hybridization oligos
- primersearch: Searches DNA sequences for matches with primer pairs
- stssearch: Search a DNA database for matches with a set of STS primers
1.21 Nucleic Acid Profile Generation and Searching
- profit: Scan a sequence or database with a matrix or profile
- prophecy: Creates matrices/profiles from multiple alignments
- prophet: Gapped alignment for profiles
1.22 Nucleic Acid Repeat Detection
- einverted: Finds DNA inverted repeats
- equicktandem: Finds tandem repeats
- etandem: Looks for tandem repeats in a nucleotide sequence
1.23 Restriction Enzyme Sites in Nucleotide Sequences
- recoder: Remove restriction sites but maintain same translation
- redata: Search REBASE for enzyme name, references, suppliers etc
- remap: Display sequence with restriction sites, translation etc
- restover: Find restriction enzymes producing specific overhang
- restrict: Finds restriction enzyme cleavage sites
- showseq: Display a sequence with features, translation etc
- silent: Silent mutation restriction enzyme scan
1.24 Transcription Factor, Promoter and Terminator Prediction
- tfscan: Scans DNA sequences for transcription factors
1.25 Translation of Nucleotide Sequence to Protein Sequence
- backtranambig: Back translate a protein sequence to ambiguous codons
- backtranseq: Back translate a protein sequence
- coderet: Extract CDS, mRNA and translations from feature tables
- plotorf: Plot potential open reading frames
- prettyseq: Output sequence with translated ranges
- remap: Display sequence with restriction sites, translation etc
- showorf: Pretty output of DNA translations
- showseq: Display a sequence with features, translation etc
- sixpack: Display a DNA sequence with 6-frame translation and ORFs
- transeq: Translate nucleic acid sequences
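Translation in a chosen reading frame, and in all six frames, can be sketched in Python with the standard genetic code. This is an illustration, not EMBOSS code: the function names are assumptions, only the standard code is included, and the negative frames here are simply frames 1 to 3 of the reverse complement, which may not match the frame numbering that transeq or sixpack use.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon in TCAG order; "*" marks a stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def revcomp(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq, frame=1):
    """Translate seq starting at base `frame` (1, 2 or 3); unknown codons give X."""
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(frame - 1, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(c, "X") for c in codons)


def six_frames(seq):
    """Map frame number (1, 2, 3, -1, -2, -3) to its translation."""
    rc = revcomp(seq)
    frames = {f: translate(seq, f) for f in (1, 2, 3)}
    frames.update({-f: translate(rc, f) for f in (1, 2, 3)})
    return frames


print(translate("ATGGCCTAA"))  # MA*
```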
1.26 Protein Secondary Structure
- garnier: Predicts protein secondary structure
- helixturnhelix: Report nucleic acid binding motifs
- hmoment: Hydrophobic moment calculation
- pepcoil: Predicts coiled coil regions
- pepnet: Displays proteins as a helical net
- pepwheel: Shows protein sequences as helices
- tmap: Displays membrane spanning regions
1.27 Protein Tertiary Structure
- psiphi: Phi and psi torsion angles from protein coordinates
1.28 Composition of Protein Sequences
- backtranambig: Back translate a protein sequence to ambiguous codons
- backtranseq: Back translate a protein sequence
- charge: Protein charge plot
- checktrans: Reports STOP codons and ORF statistics of a protein
- compseq: Count composition of dimer/trimer/etc words in a sequence
- emowse: Protein identification by mass spectrometry
- freak: Residue/base frequency table or plot
- iep: Calculates the isoelectric point of a protein
- mwcontam: Shows molwts that match across a set of files
- mwfilter: Filter noisy molwts from mass spec output
- octanol: Displays protein hydropathy
- pepinfo: Plots simple amino acid properties in parallel
- pepstats: Protein statistics
- pepwindow: Displays protein hydropathy
- pepwindowall: Displays protein hydropathy of a set of sequences
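A protein hydropathy profile is typically a sliding-window average of per-residue hydropathy values. The Python sketch below uses the Kyte-Doolittle scale as an example; the function name `hydropathy_profile` and the default window length are assumptions, and the scale and defaults used by the programs above may differ.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=7):
    """Mean hydropathy of each full window; raises KeyError on non-standard residues."""
    scores = [KD[aa] for aa in protein.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


profile = hydropathy_profile("IIIRRR", window=3)
```

Sustained positive stretches in such a profile point to hydrophobic regions, for example candidate membrane-spanning segments.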
1.29 Protein Motif Searches
- antigenic: Finds antigenic sites in proteins
- digest: Protein proteolytic enzyme or reagent cleavage digest
- epestfind: Finds PEST motifs as potential proteolytic cleavage sites
- fuzzpro: Protein pattern search
- fuzztran: Protein pattern search after translation
- helixturnhelix: Report nucleic acid binding motifs
- oddcomp: Find protein sequence regions with a biased composition
- patmatdb: Search a protein sequence with a motif
- patmatmotifs: Search a PROSITE motif database with a protein sequence
- pepcoil: Predicts coiled coil regions
- preg: Regular expression search of a protein sequence
- pscan: Scans proteins using PRINTS
- sigcleave: Reports protein signal cleavage sites
1.30 Protein Sequence Mutation
- msbar: Mutate sequence beyond all recognition
- shuffleseq: Shuffles a set of sequences maintaining composition
1.31 Protein Profile Generation and Searching
- profit: Scan a sequence or database with a matrix or profile
- prophecy: Creates matrices/profiles from multiple alignments
- prophet: Gapped alignment for profiles
1.32 Database Installation
- aaindexextract: Extract data from AAINDEX
- cutgextract: Extract data from CUTG
- printsextract: Extract data from PRINTS
- prosextract: Build the PROSITE motif database for use by patmatmotifs
- rebaseextract: Extract data from REBASE
- tfextract: Extract data from TRANSFAC
1.33 Database Indexing
- dbiblast: Index a BLAST database
- dbifasta: Database indexing for fasta file databases
- dbiflat: Index a flat file database
- dbigcg: Index a GCG formatted database
- dbxfasta: Database b+tree indexing for fasta file databases
- dbxflat: Database b+tree indexing for flat file databases
- dbxgcg: Database b+tree indexing for GCG formatted databases
1.34 Utility Tools
- embossdata: Finds or fetches data files read by EMBOSS programs
- embossversion: Writes the current EMBOSS version number