PCA and FEL through gromacs

 

Principal component analysis and Free energy landscapes of protein systems in gromacs.

>gmx covar -f md_t14cnopbc.xtc -s md_t14c.tpr -o eigenval.xvg -v eigenvect.trr -xpm covara.xpm

[least squares fit:c-alpha; covariance: c alpha]

>gmx xpm2ps -f covara.xpm -o covara.eps -do covar.m2p

>gmx anaeig -v eigenvect.trr -f md_t14cnopbc.xtc -s md_t14c.tpr -first 1 -last 2 -proj proj_eig.xvg -2d 2d_proj.xvg

>gmx sham -f 2d_proj.xvg -ls gibbs.xpm -notime #for free energy, "sham" module uses PC1 and PC2 file

gmx xpm2ps -f gibbs.xpm  -o gibbs.eps -rainbow red

python xpm2txt.py -f gibbs.xpm -o gibbs.txt 


#xpm2txt.py is mentioned here. Just copy and paste this script. Run in command line as mentioned above

#!/usr/bin/env python

import sys

"""

Utility tool to convert xpm files generated by GROMACS to a 3-column text file.

"""


USAGE = "USAGE: xpm2txt.py -f <input xpm file> -o <output txt file> [-s]\n"

USAGE+= "Options:\n"

USAGE+= "\t-s\t(int)\tSorts the output by a given column"

USAGE+= "\n" # always keep this line


# Parse arguments

read_input, read_output, sort = False, False, False

xpm_file, out_file, column_sort = None, None, None

for arg in sys.argv[1:]:

    if read_input:

        read_input = False

        xpm_file = arg

    elif read_output:

        read_output = False

        out_file = arg

    elif sort:

        sort = False

        column_sort = int(arg)

    if arg[0] == "-":

        if arg == "-f":

            read_input = True

            continue

        elif arg == "-o":

            read_output = True

            continue

        elif arg == "-s":

            sort = True

        else:

            print USAGE

            sys.stderr.write('ERROR: Option not recognized: %s\n' %arg)

            sys.exit(1)


if not xpm_file:

    print USAGE

    sys.stderr.write('ERROR: You forgot to provide an input file.\n')

    sys.exit(1)

if not out_file:

    out_file = "out.txt"


# Parse XPM file

xpm_handle = open(xpm_file)

xpm_data = []

x_axis, y_axis = [], []

letter_to_value = {}

for line in xpm_handle:

    if line.startswith("/* x-axis"):

        x_axis = map(float, line.split()[2:-2]) # We trim the last value


    if line.startswith("/* y-axis"):

        y_axis = map(float, line.split()[2:-2]) # We trim the last value


    if line.startswith('"') and x_axis and y_axis: # Read data

        xpm_data.insert(0, line.strip().strip(',')[1:-1])


    if line.startswith('"') and len(line.split()) > 4:

        letter = line.split()[0][1:]

        value = float(line.split()[-2][1:-1])

        letter_to_value[letter] = value

xpm_handle.close()


# Match x/y/data

txt_values = []

for y_index, data_value in enumerate(xpm_data):

    y_value = y_axis[y_index]

    for x_index, x_value in enumerate(x_axis):

        txt_values.append([x_value, y_value, letter_to_value[data_value[x_index]]])


# Apply sorting if requested

if column_sort:

    try:

        txt_values.sort(key=lambda x: x[column_sort-1])

    except IndexError:

        print USAGE

        sys.stderr.write('ERROR: Column not found (%s)\n' %(column_sort))

        sys.exit(1)


# Print to file

out_handle = open(out_file, 'w')

for x, y, z in txt_values:

    out_handle.write("%3.5f\t%3.5f\t%3.5f\n" %(x,y,z))

out_handle.close()

    

  For any queries regarding the script, kindly drop your message in the comments section below.

Comments

  1. Using the script you have provided, I am getting an error:

    Traceback (most recent call last):
    File "xpm2txt.py", line 72, in
    y_value = y_axis[y_index]
    IndexError: list index out of range

    What could be the issue.

    ReplyDelete
    Replies
    1. Hi, try to download this script from here: http://kangsgo.com/images/2018/8/xpm2txt.py
      and try to re-run again.
      check with different python versions also.. like i run with python2.7

      Delete
  2. Hi
    I have follow your work steps but when I try to excute python xpm2txt.py give me the following error:
    "line 16
    if read_input:
    ^
    SyntaxError: invalid character in identifier"

    When I use python3
    "SyntaxError: invalid non-printable character U+00A0"

    How can I fix this?

    Thank you

    ReplyDelete
  3. Hi, try to download this script from here: http://kangsgo.com/images/2018/8/xpm2txt.py
    and try to re-run again.
    check with different python versions also.. like i run with python2.7

    ReplyDelete
  4. Hi Sir/ Madam,
    I figured the lowest energy bin in shamlog.log and binindex.ndx.
    There are ~125 frames numbers (in ascending order) are there in lowest energy bin. How to find the lowest energy frame from that bin?
    Also in 2d_PCA.xvg, there are xy for 50ns while the result gibbs.xpm have 1000+ lines, which has different PCA1 and PCA2 for energy of 0 units, kindly help me to find the lower energy structure frames. Thanks in advance.

    ReplyDelete
  5. Hi.. We apologize for the late response.
    The low energy conformers can be extracted by:
    1. when u run the pymol script to generate the FEL image.. dont save directly in the folder by the script.. rather let the image open as a window so that u can save it manually..
    2. when image window opens.. mouse over to the basins of your interest, then somewhere on the bottom left side u could see the x and y coordinates of the basin.
    3. search those values in the pca12.dat that contains the time frames also.
    4. Now u wont find the exact values of the coordinate but somewhere near to it. x and y coordinates denotes to mode 1 (PC1) and mode 2 (PC2) values respectively.
    5. likewise u will get the approximate idea of the time frame (mentioned in the first column of the file).
    6. then using any frame extraction method .. u can extract that particular frame from the trajectory.

    ReplyDelete
    Replies
    1. Hi... the above mentioned pca12.dat, how could i get this file. Thanks in advance

      Delete
  6. Hello Sir
    I am facing issue while running this script:xpm2txt.py
    This is the error:
    File "xpm2txt.py", line 139, in
    y_value = y_axis[y_index]
    TypeError: 'map' object is not subscriptable
    Please help me resolve it.. I have downloaded this script from different sources but everytime i run it, am getting same error...

    ReplyDelete
    Replies

    1. The error you're encountering, "TypeError: 'map' object is not subscriptable," indicates that you are trying to access elements of a map object as if it were a list or an array, which is not possible.

      In your code, you're using the map function to convert elements to floats in lines 23 and 27. However, the map function returns a map object in Python 3, not a list. To fix this issue, you need to convert the map objects to lists.

      Here's how you can do it:
      x_axis = list(map(float, line.split()[2:-2])) # Convert map object to a list
      y_axis = list(map(float, line.split()[2:-2])) # Convert map object to a list

      Delete
  7. Hi, it may depend on the python version you are using. if you are using latest version, then try to convert this script into the latest version. there are various online converters for python 2 to python 3. Let us know if it works for after this step.

    ReplyDelete

Post a Comment

Popular posts from this blog

Calculation of RMSD values between two ligand poses using web servers/tools

AMBER TUTORIAL-2: How to simulate a protein- ligand system: basic steps for md simulation in amber (AMBER 16)

Python script to plot 3D FEL plots