My software notes

January 25, 2006

Convert a protein sequence from one-letter to three-letter code for each residue

Filed under: NMRPipe and NMRview,xplor/xplor-nih/cns — kpwu @ 3:37 am

Copy the following lines into a shell script file and make sure the file is executable. So far it only works in one direction, from single-letter code to three-letter code; I plan to cover both directions and DNA/RNA in the future.

The script was designed for the NMRView sequence format, and it works for the Xplor sequence format, too.
--------------------------------------------------------------------------------

#!/bin/sh
# Convert a protein sequence from one-letter code to three-letter code.
# Pass your sequence file name as the first argument.
# Kuen-Phon Wu, 06/11/2004 version 0.2
# 06/17/2004 version 0.3
#
# This program ignores any characters that are not one of the 20 standard
# amino acids, such as B or O. Both lowercase and uppercase input can be
# processed by this program. Have fun!
#
# Just use as: ./seq1to3.sh [FILENAME] > [OUT_FILE]

usage="Usage: seq1to3.sh [inputfile] > [OUTPUT]"

if [ $# -lt 1 ] ; then
echo "$usage"
exit 1
fi

# Lowercase the input, turn punctuation, digits and newlines into 'b',
# drop everything that is not a standard residue, then emit one
# three-letter code per line (the "\n" in the replacement needs GNU sed).
tr 'A-Z' 'a-z' < "$1" | tr '[:punct:]' 'b' | tr '0-9' 'b' | tr '\n' 'b' | \
sed -e 's/[oxzujb ]//g' \
-e 's/a/ALA\n/g' -e 's/c/CYS\n/g' -e 's/d/ASP\n/g' \
-e 's/e/GLU\n/g' -e 's/f/PHE\n/g' -e 's/g/GLY\n/g' \
-e 's/h/HIS\n/g' -e 's/i/ILE\n/g' -e 's/k/LYS\n/g' \
-e 's/l/LEU\n/g' -e 's/m/MET\n/g' -e 's/p/PRO\n/g' \
-e 's/r/ARG\n/g' -e 's/q/GLN\n/g' -e 's/n/ASN\n/g' \
-e 's/s/SER\n/g' -e 's/t/THR\n/g' -e 's/w/TRP\n/g' \
-e 's/y/TYR\n/g' -e 's/v/VAL\n/g'
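For systems without GNU sed, the same one-letter to three-letter idea can be sketched with `fold` and a `case` lookup. This is only an illustrative variant, not the script above: the function name `one_to_three` is made up for this example, and it reads the sequence from standard input instead of a file argument.

```shell
#!/bin/sh
# Sketch (hypothetical helper): one-letter to three-letter conversion
# that does not rely on "\n" inside a sed replacement.
one_to_three() {
    # Keep letters only, uppercase them, then put one letter per line.
    tr -cd '[:alpha:]' | tr '[:lower:]' '[:upper:]' | fold -w 1 |
    while IFS= read -r aa || [ -n "$aa" ]; do
        case "$aa" in
            A) echo ALA ;; C) echo CYS ;; D) echo ASP ;; E) echo GLU ;;
            F) echo PHE ;; G) echo GLY ;; H) echo HIS ;; I) echo ILE ;;
            K) echo LYS ;; L) echo LEU ;; M) echo MET ;; N) echo ASN ;;
            P) echo PRO ;; Q) echo GLN ;; R) echo ARG ;; S) echo SER ;;
            T) echo THR ;; V) echo VAL ;; W) echo TRP ;; Y) echo TYR ;;
            # Any other letter (B, J, O, U, X, Z) is silently ignored.
        esac
    done
}

# Example: mixed case, a digit and a non-standard letter in the input.
printf 'MkT\nB1a\n' | one_to_three
# prints MET, LYS, THR, ALA (one per line)
```

The `|| [ -n "$aa" ]` guard is there because the last letter coming out of `fold` may not be followed by a newline, and a plain `read` loop would otherwise skip it.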


Comments

  1. You should warn people about encoding,
    since your HTML rendering wrongly converts
    characters such as ' and " into styled opening and closing quotes.
    To make it work under Unix, you should also make sure to save it with the appropriate line endings and clean ASCII encoding.

    The script below does the opposite. You may publish it on your page.
    Best wishes,
    Rogelio

    -------------
    #!/bin/sh
    # Convert a protein sequence from three-letter code to one-letter code.
    # Just write your sequence to an ASCII file as
    # three-letter words, upper or lower case,
    # separated by a punctuation sign such as a comma, space, dash or
    # other. Avoid headings, comments or other information.
    # Rogelio Rodríguez-Sotres, 24/09/2007 adapted from
    # Kuen-Phon Wu, 06/11/2004 version 0.3
    #
    # This program ignores any words that are not one of the 20 standard
    # amino acids. Both lowercase and uppercase input can be processed by
    # this program.
    #
    # Just use as: ./seq3to1.sh [FILENAME] > [OUT_FILE]
    usage='Usage: seq3to1.sh [inputfile] > [OUTPUT]'

    if [ $# -lt 1 ] ; then
    echo "$usage"
    exit 1
    fi

    # Put one lowercase word per line so that repeated residues and the
    # first word are matched too, convert whole words only, delete
    # anything that was not converted, then join the letters.
    tr '[:punct:][:digit:][:space:]' '\n' < "$1" | tr '[:upper:]' '[:lower:]' | \
    sed -e 's/^ala$/A/' -e 's/^cys$/C/' -e 's/^asp$/D/' \
    -e 's/^glu$/E/' -e 's/^phe$/F/' -e 's/^gly$/G/' \
    -e 's/^his$/H/' -e 's/^ile$/I/' -e 's/^lys$/K/' \
    -e 's/^leu$/L/' -e 's/^met$/M/' -e 's/^pro$/P/' \
    -e 's/^arg$/R/' -e 's/^gln$/Q/' -e 's/^asn$/N/' \
    -e 's/^ser$/S/' -e 's/^thr$/T/' -e 's/^trp$/W/' \
    -e 's/^tyr$/Y/' -e 's/^val$/V/' -e '/^[[:upper:]]$/!d' | tr -d '\n'
    echo

    Comment by rogelio rodriguez — August 24, 2007 @ 10:13 am | Reply

  2. Thanks for the comment.
    I know I am not a good programmer. I am glad to know other ways to make the script better and better.
    This script was the second or third shell script I wrote after I read “Teach yourself Shell Programming” in 2004. I just try to share some of my scripts. I will add such warnings about encoding when I release new scripts, or I will attach the script to the post.

    Kuen-Phon Wu

    Comment by kpwu — August 24, 2007 @ 11:48 am | Reply
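The encoding cleanup raised in the comments can be sketched as a small filter that turns styled quotes back into plain ASCII quotes and strips carriage returns from DOS-style line endings. This is a minimal sketch, not part of either script: the function name `clean_paste` and the file name `pasted.txt` are made up for this example, and it only handles the four common curly quote characters.

```shell
#!/bin/sh
# Sketch (hypothetical helper): normalise a script copied from a web page.
clean_paste() {
    # Replace curly single and double quotes with their ASCII forms,
    # then remove carriage returns left by DOS/Windows line endings.
    sed -e "s/‘/'/g" -e "s/’/'/g" \
        -e 's/“/"/g' -e 's/”/"/g' | tr -d '\r'
}

# Example use, assuming the pasted text was saved as pasted.txt:
#   clean_paste < pasted.txt > seq1to3.sh
#   chmod +x seq1to3.sh
```

Other substitutions made by the page, such as dashes or ellipses replacing ASCII characters, are not covered here and would still need to be checked by hand.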


