PCA and FEL through gromacs

- June 05, 2021

Principal component analysis and Free energy landscapes of protein systems in gromacs.

>gmx covar -f md_t14cnopbc.xtc -s md_t14c.tpr -o eigenval.xvg -v eigenvect.trr -xpm covara.xpm

[least squares fit:c-alpha; covariance: c alpha]

>gmx xpm2ps -f covara.xpm -o covara.eps -do covar.m2p

>gmx anaeig -v eigenvect.trr -f md_t14cnopbc.xtc -s md_t14c.tpr -first 1 -last 2 -proj proj_eig.xvg -2d 2d_proj.xvg

>gmx sham -f 2d_proj.xvg -ls gibbs.xpm -notime #for free energy, "sham" module uses PC1 and PC2 file

gmx xpm2ps -f gibbs.xpm -o gibbs.eps -rainbow red

python xpm2txt.py -f gibbs.xpm -o gibbs.txt

#xpm2txt.py is mentioned here. Just copy and paste this script. Run in command line as mentioned above

#!/usr/bin/env python

import sys

"""

Utility tool to convert xpm files generated by GROMACS to a 3-column text file.

"""

USAGE = "USAGE: xpm2txt.py -f <input xpm file> -o <output txt file> [-s]\n"

USAGE+= "Options:\n"

USAGE+= "\t-s\t(int)\tSorts the output by a given column"

USAGE+= "\n" # always keep this line

# Parse arguments

read_input, read_output, sort = False, False, False

xpm_file, out_file, column_sort = None, None, None

for arg in sys.argv[1:]:

if read_input:

read_input = False

xpm_file = arg

elif read_output:

read_output = False

out_file = arg

elif sort:

sort = False

column_sort = int(arg)

if arg[0] == "-":

if arg == "-f":

read_input = True

continue

elif arg == "-o":

read_output = True

continue

elif arg == "-s":

sort = True

else:

print USAGE

sys.stderr.write('ERROR: Option not recognized: %s\n' %arg)

sys.exit(1)

if not xpm_file:

print USAGE

sys.stderr.write('ERROR: You forgot to provide an input file.\n')

sys.exit(1)

if not out_file:

out_file = "out.txt"

# Parse XPM file

xpm_handle = open(xpm_file)

xpm_data = []

x_axis, y_axis = [], []

letter_to_value = {}

for line in xpm_handle:

if line.startswith("/* x-axis"):

x_axis = map(float, line.split()[2:-2]) # We trim the last value

if line.startswith("/* y-axis"):

y_axis = map(float, line.split()[2:-2]) # We trim the last value

if line.startswith('"') and x_axis and y_axis: # Read data

xpm_data.insert(0, line.strip().strip(',')[1:-1])

if line.startswith('"') and len(line.split()) > 4:

letter = line.split()[0][1:]

value = float(line.split()[-2][1:-1])

letter_to_value[letter] = value

xpm_handle.close()

# Match x/y/data

txt_values = []

for y_index, data_value in enumerate(xpm_data):

y_value = y_axis[y_index]

for x_index, x_value in enumerate(x_axis):

txt_values.append([x_value, y_value, letter_to_value[data_value[x_index]]])

# Apply sorting if requested

if column_sort:

try:

txt_values.sort(key=lambda x: x[column_sort-1])

except IndexError:

print USAGE

sys.stderr.write('ERROR: Column not found (%s)\n' %(column_sort))

sys.exit(1)

# Print to file

out_handle = open(out_file, 'w')

for x, y, z in txt_values:

out_handle.write("%3.5f\t%3.5f\t%3.5f\n" %(x,y,z))

out_handle.close()

For any queries regarding the script, kindly drop your message in the comments section below.

Comments

LucyNovember 4, 2021 at 3:22 AM
Using the script you have provided, I am getting an error:

Traceback (most recent call last):
File "xpm2txt.py", line 72, in
y_value = y_axis[y_index]
IndexError: list index out of range

What could be the issue.
ReplyDelete
Replies
AANovember 9, 2021 at 12:47 AM
Hi
I have follow your work steps but when I try to excute python xpm2txt.py give me the following error:
"line 16
if read_input:
^
SyntaxError: invalid character in identifier"

When I use python3
"SyntaxError: invalid non-printable character U+00A0"

How can I fix this?

Thank you
ReplyDelete
Replies
DL COMP-BIO-TECHNovember 30, 2021 at 8:55 AM
Hi, try to download this script from here: http://kangsgo.com/images/2018/8/xpm2txt.py
and try to re-run again.
check with different python versions also.. like i run with python2.7
ReplyDelete
Replies
S.R. VigneshDecember 10, 2021 at 6:12 AM
Hi Sir/ Madam,
I figured the lowest energy bin in shamlog.log and binindex.ndx.
There are ~125 frames numbers (in ascending order) are there in lowest energy bin. How to find the lowest energy frame from that bin?
Also in 2d_PCA.xvg, there are xy for 50ns while the result gibbs.xpm have 1000+ lines, which has different PCA1 and PCA2 for energy of 0 units, kindly help me to find the lower energy structure frames. Thanks in advance.
ReplyDelete
Replies
DL COMP-BIO-TECHDecember 25, 2021 at 12:19 AM
Hi.. We apologize for the late response.
The low energy conformers can be extracted by:
1. when u run the pymol script to generate the FEL image.. dont save directly in the folder by the script.. rather let the image open as a window so that u can save it manually..
2. when image window opens.. mouse over to the basins of your interest, then somewhere on the bottom left side u could see the x and y coordinates of the basin.
3. search those values in the pca12.dat that contains the time frames also.
4. Now u wont find the exact values of the coordinate but somewhere near to it. x and y coordinates denotes to mode 1 (PC1) and mode 2 (PC2) values respectively.
5. likewise u will get the approximate idea of the time frame (mentioned in the first column of the file).
6. then using any frame extraction method .. u can extract that particular frame from the trajectory.
ReplyDelete
Replies
AkkiJanuary 18, 2023 at 2:20 AM
Hello Sir
I am facing issue while running this script:xpm2txt.py
This is the error:
File "xpm2txt.py", line 139, in
y_value = y_axis[y_index]
TypeError: 'map' object is not subscriptable
Please help me resolve it.. I have downloaded this script from different sources but everytime i run it, am getting same error...
ReplyDelete
Replies
DL COMP-BIO-TECHNovember 3, 2023 at 8:17 AM
Hi, it may depend on the python version you are using. if you are using latest version, then try to convert this script into the latest version. there are various online converters for python 2 to python 3. Let us know if it works for after this step.
ReplyDelete
Replies

Add comment

Search This Blog

DL COMP BIOTECH

PCA and FEL through gromacs

Comments

Post a Comment

Popular posts from this blog

Calculation of RMSD values between two ligand poses using web servers/tools

AMBER TUTORIAL-2: How to simulate a protein- ligand system: basic steps for md simulation in amber (AMBER 16)

Python script to plot 3D FEL plots