AMBER TUTORIAL-2: How to simulate a protein- ligand system: basic steps for md simulation in amber (AMBER 16)

BLOG #3

 Hi Guys,

        This is a new tutorial on how to simulate a protein-ligand system: basic steps using AMBER.

Before we begin, I hope you have read our previous post on AMBER TUTORIAL-1, since I will only be providing additional information with the ligand here, rather than repeating the previous instructions.

#To parameterize the ligand file, we start with Antechamber and then parmchck. 

  • These are a series of tools for creating files for organic molecules and some protein metal centers, which can then be read into LEaP. Junmei Wang created the Antechamber suite, which is intended to be used in conjunction with the general AMBER force field (GAFF) (gaff.dat).
  • This is the package's most crucial program. It can convert a variety of files and assign atomic charges and types to them as well. Sqm (or, alternatively, mopac or divcon), atomtype, am1bcc, bondtype, espgen, respgen, and prepgen are among the programs that antechamber runs in response to the input.
  • For more details regarding GAFF and ANTECHAMBER, you may read section 15 in amber 16 manual.

$AMBERHOME/bin/antechamber -i lig.pdb -fi pdb -o lig.mol2 -fo mol2 -c bcc -s 2

#here antechamber prepares ligand.pdb and outputs the prepared file to lig.mol2.

verbose on the terminal will appear like this. 

Running: /home/mitul/Desktop/16/amber16/bin/bondtype -j full -i ANTECHAMBER_BOND_TYPE.AC0 -o ANTECHAMBER_BOND_TYPE.AC -f ac

Running: /home/mitul/Desktop/16/amber16/bin/atomtype -i ANTECHAMBER_AC.AC0 -o ANTECHAMBER_AC.AC -p gaff

Total number of electrons: 336; net charge: 0

if there is any charge on the molecule, the program exists given error. You may use -nc flag as this flag specifies the net charge of the input molecule, otherwise, the net charge is read in from the input directly (such as gout, mopout, sqmout, sqmcrt, gcrt, etc.) or calculated by summing the partial charges

nc = total unperturbed charge on the molecule, and it can be 0, -1, -2, 1, 2 etc.

# for example

verbose on the terminal will appear like this. 

Running: /home/mitul/Desktop/16/amber16/bin/bondtype -j full -i ANTECHAMBER_BOND_TYPE.AC0 -o ANTECHAMBER_BOND_TYPE.AC -f ac

Running: /home/mitul/Desktop/16/amber16/bin/atomtype -i ANTECHAMBER_AC.AC0 -o ANTECHAMBER_AC.AC -p gaff

Total number of electrons: 335; net charge: 1

#then  use this command: $AMBERHOME/bin/antechamber -i lig.pdb -fi pdb -o lig.mol2 -fo mol2 -c bcc -s 2 -nc 1

$AMBERHOME/bin/parmchk -i lig.mol2 -f mol2 -o lig.frcmod

parmchk reads in an ac/mol2/prepi/prepc file, an atomtype similarity index file (the default is $AMBERHOME/dat/antechamber/PARMCHK.DAT) as well as a force field file (the default is $AMBERHOME/dat/leap/parm/gaff.dat). This command reads in lig.mol2 and finds the missing force field parameters listed in frcmod. It writes out a force field modification (frcmod) file containing any force field parameters that are needed for the molecule but not supplied by the force field (*.dat) file.

For more information you may read section 15.2 in AMBER16 manual.

$AMBERHOME/bin/tleap -f oldff/leaprc.ff14SB #Source leaprc file for ff14SB protein force field

source leaprc.gaff  #Source leaprc file for gaff (in case of ligands you need to source this force field) their default location: The leaprc.protein.ff14SB and leaprc.gaff files are under $AMBERHOME/dat/leap/cmd/ directory.

DRG=loadmol2 lig.mol2 #load prepared ligand file

check DRG #check function will identify missing bond and angle parameters and look for missing dihedrals

loadamberparams lig.frcmod #Load the frcmod file for ligand

saveoff DRG lig.lib #saves the ligand library file which will enable tleap to recognize this ligand in the future

set default PBRadii mbondi2 

saveamberparm DRG lig.prmtop lig.inpcrd #saving the ligand topology and coordinates files

rec=loadpdb pro_noh.pdb #load non-hydrogen protein file

set default PBRadii mbondi2

saveamberparm rec pro.prmtop pro.inpcrd #saving the protein topology and coordinates files

loadoff lig.lib #load the lib file we created above

complex=combine {rec DRG} #combining the receptor (protein) and ligand

if multiple ligands or protein chains are there to merge then you can combine here in this step like: complex=combine {rec DRG1 DRG2} or complex=combine {rec DRG peptide} or complex=combine {chainA chainB DRG} depending upon your system. If any confusion drop your message in the comments section.

savepdb complex comp.pdb  #saving the combined protein-ligand complex as PDB file to be visualized in any visualization software.

set default PBRadii mbondi2

saveamberparm complex comp.prmtop comp.inpcrd  #saving the protein-ligand complex topology and coordinates files

source leaprc.water.tip3p #Source leaprc file for TIP3P water model

charge complex #checking the charge on the complex 

addions2 complex Na+ 5 #adding ions for neutralizing the charges

charge complex #again checking the charges if they have neutralized to zero or not

solvatebox complex TIP3PBOX 10.0  ###or ## solvateoct complex TIP3PBOX 10.0 #### #solvating the complex with water in 10.0 angstrom box The minimum distance between any atom in solute and the edge of the periodic box is given by the distance parameter

set default PBRadii mbondi2

saveamberparm complex comp_solv.prmtop comp_solv.inpcrd  #saving the solvated protein-ligand complex topology and coordinates files

savepdb complex comp_solv.pdb #saving the solvated protein-ligand complex as PDB file to be visualised in any visualization software

quit #exit the tleap


  • The .in files for the upcoming commands are attached to the blog. These files are editable as per ur requirement. 
  • Here the commands will be running sander and pmemd on multiple processors for faster output using mpirun feature (in centos) for minimisation, heating, density and equilibration steps. 
  • If you want to run on single CPU then remove mpi from the command: example:- $AMBERHOME/bin/sander -O -i min.in -o min.out -p comp_solv.prmtop -c comp_solv.inpcrd -r min.rst -ref comp_solv.inpcrd


#perform 1st Minimization ( lines in min.in can be changed as per ur requirement:  maxcyc=2000,  ncyc=1000] Here, the systems will be minimized for 2000 steps, out of which 1000 steps will be used by the steepest descent method followed by 1000 steps using the conjugate gradient to remove bad contacts in the systems.  

mpirun -np 6 $AMBERHOME/bin/sander.MPI -O -i min.in -o min.out -p comp_solv.prmtop -c comp_solv.inpcrd -r min.rst -ref comp_solv.inpcrd

#output files: min.out and min.rst


#2nd Minimization [restraintmask=':1-254 & !@H='] please change the restrain mask as per the total residue number of solute (protein+ligand in this case).. example if ur protein contains 50 amino acids and ligand's residue number is 51, then restraintmask=':51& !@H=' all non-hydrogen atoms.  The position restraints of 10 and 2 kcal mol-1 Å-2, respectively, are placed on the entire protein system to relax the solvent molecules in the first and second minimization steps.

mpirun -np 6 $AMBERHOME/bin/sander.MPI -O -i min1.in -o min1.out -p comp_solv.prmtop -c min.rst -r min1.rst -ref min.rst

#output files: min1.out and min1.rst


#3rd Minimization [The third step of minimization will be carried out unrestrained. ]

mpirun -np 6 $AMBERHOME/bin/sander.MPI -O -i min2.in -o min2.out -p comp_solv.prmtop -c min1.rst -r min2.rst -ref min1.rst

#output files: min2.out and min2.rst


#Heating [The systems will be gradually heated for 20ps from 0K to 300K]..[nstlim=10000,  dt=0.002] the values in these can be changed if you want to run this step for more time. the calculation is a time of heating: nstlim*dt i.e. 10000*0.002 is 20ps. 

mpirun -np 6 $AMBERHOME/bin/sander.MPI -O -i heat.in -o heat.out -p comp_solv.prmtop -c min2.rst -r heat.rst -ref min2.rst -x heat.nc

#output files: heat.out, heat.rst, heat.nc and/or mdinfo


#Density ( change the restraintmast values here in density.in file also like in min1.in file: restraintmask=':1-51',)

mpirun -np 6 $AMBERHOME/bin/sander.MPI -O -i density.in -o density.out -p comp_solv.prmtop -c heat.rst -r density.rst -x density.nc -ref heat.rst

#output files: density.out, density.rst, density.nc and/or mdinfo


#Equilibration [Equilibration of the system for 500ps at 300K and 1atm pressure.]

mpirun -np 6 $AMBERHOME/bin/pmemd.MPI -O -i equil.in -o equil.out -p comp_solv.prmtop -c density.rst -r equil.rst -x equil.nc

#output files: equil.out, equil.rst, equil.nc and/or mdinfo


#Production or MD simulation (the prod.in file contains MD length of 1ns). 

$AMBERHOME/bin/pmemd.cuda -O -i prod.in -o md.out -p comp_solv.prmtop -c equil.rst -r md1ns.rst -x md1ns.nc 

#output files: md.out, md1ns.rst, md1ns.nc and/or mdinfo

#.nc file is a NETCDF trajectory file


#.out and mdinfo provides status of the running commands which can be seen in either text editors or by typing commands on terminal such as " tail -f mdinfo" [in another terminal to simultaneously view the status of the jobs.]

Some important calculations to remember:

length of simulation: nstlim*dt [ i.e. Number of MD steps in run *Time step in picoseconds (ps)(the time length of each MD step) ]

1000ps =1ns

to run for 100ns, change nstlim like : 50000000, then 0.002 (i.e. dt) is multiplied to get 100,000ps=100ns

ntpr=5000,  ntwx=5000: can be changed to write the trajectory

we have also provided a short automated script for running these minimisation, heating, density, equilibration and production commands step by step].. the script is job.sh... u can run it by command " sh job.sh. please don't forget to change the name of ur input and output files in this job.sh script before running it.

PS: to extend the simulations: please keep irest=1 in prod.in file, if u want to change the time, change the nstlim number, save] Also [PS: prod.in file contains the  ntwr=-2500 i.e the “restrt” file will be written, ensuring that recovery from a crash will not be so painful. this means that if you want to restart or extend the simulation from 0.5ns , you can use this md1ns.rst.2500 file in place of equil.rst for extension) You can change the ntwr number depending upon when to get the .rst file to be written.

use this command to extend the simulation from the last step of the md file:

$AMBERHOME/bin/pmemd.cuda -O -i prod.in -o md.out -p comp_solv.prmtop -c md1ns.rst -r md9ns.rst -x md9ns.nc 


Min.in file:


Minimize

 &cntrl

  imin=1,

  ntx=1,

  irest=0,

  maxcyc=2000,

  ncyc=1000,

  ntpr=100,

  ntwx=0,

  cut=16.0,

 /

Min1.in

initial minimisation whole system

 &cntrl

  imin   = 1,

  maxcyc = 2000,

  ncyc   = 1000,

  ntb    = 1,

  ntr    = 1,

  cut    = 8,

  restraint_wt = 2,

  restraintmask=':1-254 & !@H='

 /


min2.in

Unrestrained_minimization

 &cntrl

  imin   = 1,

  maxcyc = 2000,

  ncyc   = 1000,

  ntb    = 1,

  ntr    = 0,

  cut    = 8

 /


heat.in

Heat

 &cntrl

  imin=0,

  ntx=1,

  irest=0,

  nstlim=10000,

  dt=0.002,

  ntf=2,

  ntc=2,

  tempi=0.0,

  temp0=300.0,

  ntpr=100,

  ntwx=100,

  cut=8.0,

  ntb=1,

  ntp=0,

  ntt=3,

  gamma_ln=2.0,

  nmropt=1,

  ig=-1,

 /

&wt type='TEMP0', istep1=0, istep2=9000, value1=0.0, value2=300.0 /

&wt type='TEMP0', istep1=9001, istep2=10000, value1=300.0, value2=300.0 /

&wt type='END' /


density.in


 &cntrl

  imin=0,irest=1,ntx=5,

  nstlim=25000,dt=0.002,

  ntc=2,ntf=2,

  cut=8.0, ntb=2, ntp=1, taup=1.0,

  ntpr=500, ntwx=500,

  ntt=3, gamma_ln=2.0,

  temp0=300.0, ig=-1,

  ntr=1, restraintmask=':1-254',

  restraint_wt=2.0,

 /


equil.in

Equilibration

 &cntrl

  imin=0,

  ntx=5,

  irest=1,

  nstlim=250000,

  dt=0.002,

  ntf=2,

  ntc=2,

  temp0=300.0,

  ntpr=1000,

  ntwx=1000,

  cut=8.0,

  ntb=2,

  ntp=1,

  ntt=3,

  gamma_ln=2.0,

  ig=-1,

 /


prod.in

Production

 &cntrl

  imin=0,

  ntx=5,

  irest=1,

  nstlim=500000,

  dt=0.002,

  ntf=2,

  ntc=2,

  temp0=300.0,

  ntpr=5000,

  ntwx=5000,

 ntwr=-2500,

  cut=8.0,

  ntb=2,

  ntp=1,

  ntt=3,

  gamma_ln=2.0,

  ig=-1,

 /


job.sh

echo "1st Minimization"

mpirun -np 6 $AMBERHOME/bin/sander.MPI -O -i min.in -o min.out -p comp_solv.prmtop -c comp_solv.inpcrd -r min.rst -ref comp_solv.inpcrd

echo "Starting 2nd Minimization"

mpirun -np 6 $AMBERHOME/bin/sander.MPI -O -i min1.in -o min1.out -p comp_solv.prmtop -c min.rst -r min1.rst -ref min.rst

echo "Starting 3rd Minimization"

mpirun -np 6 $AMBERHOME/bin/sander.MPI -O -i min2.in -o min2.out -p comp_solv.prmtop -c min1.rst -r min2.rst -ref min1.rst

echo "Starting Heating"

mpirun -np 6 $AMBERHOME/bin/sander.MPI -O -i heat.in -o heat.out -p comp_solv.prmtop -c min2.rst -r heat.rst -ref min2.rst -x heat.nc

echo "Starting Density"

mpirun -np 6 $AMBERHOME/bin/sander.MPI -O -i density.in -o density.out -p comp_solv.prmtop -c heat.rst -r density.rst -x density.nc -ref heat.rst

echo "Starting Equilibration"

mpirun -np 6 $AMBERHOME/bin/pmemd.MPI -O -i equil.in -o equil.out -p comp_solv.prmtop -c density.rst -r equil.rst -x equil.nc

echo "Starting production"

$AMBERHOME/bin/pmemd.cuda -O -i prod.in -o md.out -p comp_solv.prmtop -c equil.rst -r md1ns.rst -x md1ns.nc 


References: 


1. Amber 16 user manual

2. https://ambermd.org/tutorials/basic/tutorial2/section6.htm 

3. http://ambermd.org/tutorials/basic/tutorial0/

4. http://ambermd.org/tutorials/pengfei/index.htm

5. https://ambermd.org/tutorials/advanced/tutorial1/section3.htm

6. Mittal, Lovika, Mitul Srivastava, Anita Kumari, Rajiv K. Tonk, Amit Awasthi, and Shailendra Asthana. "Interplay among Structural Stability, Plasticity, and Energetics Determined by Conformational Attuning of Flexible Loops in PD-1." Journal of Chemical Information and Modeling 61, no. 1 (2021): 358-384.


Please provide your suggestions, queries, and feedback related to our blogs in the comments section below. Also, do share and follow our blog for more updates related to Computational biology techniques.

 "Spend less time on techniques, more time on analysis."




Comments

Popular posts from this blog

Calculation of RMSD values between two ligand poses using web servers/tools

Python script to plot 3D FEL plots