Using FFGenOpt
Start by creating a directory under Molecules via
mkdir /path/to/FFGenOpt/Molecules/f3ccoo
Copy your .psf and coordinate files to this directory, as well as the output of
the Psi4 QM job. Since the initial parameters for trifluoroacetate are based on
DGenFF, you should also copy toppar_drude_master_protein_2019g_orig.str
from the Molecules/oac
directory.
If you plan on using CHARMM as your MD engine for the calculation of the
normal modes, then you will need drude_oac.inp
. If you plan on using OpenMM
instead, you will need freq.py
,
and normalmodeanalysis.py
from Molecules/bf4
.
In either case, make sure to modifiy the copied files so that the correct
coordinates and topologies are read.
If you have been using
ffparam (and don’t have access to CHARMM), you should also copy
psf_megre.py
, str_merge.py
and crd_merge.py
from bf4
.
These scripts will generate stream files, .psfs and coordinates with polarizable
atom types but without any extra particles such as lone pairs or Drudes,
respectively. Execute each of them via
python xxx_merge.py cgenff_file dgenff_file
substituting the command line arguments with the names of your files. The resulting new files will serve as the basis for the parametrization in the next paragraph.
Defining Force Constants
Now we can take a look at the config files that FFGenOpt uses to set up a
molecular system. You can use e.g. FF_GenOpt_conf/FF_GenOpt_bf4.cfg
as a
template for trifluoroacetate. The ready-to-deploy file might look like this:
EXTERN False # False: use OpenMM for MD freqs, True: use MDEXEC
MDEXEC python3 # charmm executable
PSF Molecules/f3ccoo/cgen_pol.psf # MD PSF file
CRD Molecules/f3ccoo/f3coo_mp2.crd # MD crd file
PARAMS Molecules/f3ccoo/ff.str # MD parameters
VARFILE Molecules/f3ccoo/var.str # parameterfiles with variables
MDINP Molecules/f3ccoo/freq.py # MD input file
MDOUT Molecules/f3ccoo/charmm_normalmode.out # MD run output
QMOUT Molecules/f3ccoo/f3coo_hess.out # QM reference
PARAM_FILENAME FF_GenOpt_results/f3coo/parameters.str # Write parameters here, they will be read by MDINP
POPULATION_FILENAME FF_GenOpt_results/f3coo/lastPopulation_f3coo.txt
LOG_FILENAME FF_GenOpt_results/f3coo/convergence_f3coo.log
GENERATIONS 100 #Number of generations to be created
POPULATION_SIZE 200 #Size of population
MUTATIONS_PER_GENERATION 50 #Crossovers and mutations per generation
STEP_SIZE 0.05 #Maximum step size for random algorithm
CROSSOVER_BLX 0.7 #Probability of choosing BLX method... has to sum up to 1
CROSSOVER_SBX 0.25 #Probability of choosing SBX method... has to sum up to 1
CROSSOVER_UNIFORM 0.05 #Probability of choosing uniform crossover... has to sum up to 1
MINIMUM_IMPROVEMENT 0.0001 #Minimum improvement to continue random algorithm direction in percent
MINIMUM_DIVERSITY 0.0001 #Minimum population diversity before a new initial population is created
#------------------------------------------------
# name type min max val
#------------------------------------------------
PARAMETER f3b1 bond 50 500 160.95
PARAMETER f3b2 bond 100 1000 525.00
PARAMETER f3b2 bond 100 1000 265.00
PARAMETER f3a1 angle 20 200 30.98
PARAMETER f3a2 angle 20 200 100.00
PARAMETER f3a3 angle 20 200 47.24
PARAMETER f3a4 angle 20 200 118.00
PARAMETER f3d1 dihedral 0.5 5 0.81
PARAMETER f3i1 improper 20 200 96.00
Tip
If you are stuck are overwhelmed by the number of files and how they should
be formatted, take a look at the tutorial_files
branch of the github
repo.
It contains the files created during this tutorial.
The first block defines the location of some crucial files and sets our MD executable to use python (you can also use CHARMM if you have access to it). The PSF and CRD lines are for the structure and coordinate files, respectively, while PARAMS is a text file containing the paths to the force field (toppar in CHARMM).
The VARFILE entry should be identical to the stream file generated by
str_merge.py
, with the values of the force constants substituted by unique
variable names (see the PARAMETER block).
MDINP is the name of the file containing the calculation of normal modes,
MDOUT are the results to be written (can also be redirected to /dev/null
if no debugging is necessary). QMOUT contains the results of the frequency
and normal mode calculations carried out at the start of the tutorial.
The next three lines give the locations for optimization results to be written,
so you should also create the directory specified here.
The second block contains the settings for the genetic algorithm, e.g. how many generations to compute and the number of members within a generation (population size).
The third block contains the names, types, bounds and starting values of the
force constants that should be optimized. The names have to be identical to
those in the VARFILE (minus the @
). The type has to be either bond
,
angle
, dihedral
or improper
. The next three numbers define the min
and max range that the force constants can take and an initial guess.
Starting FFGenOpt
If everything is set up correctly (i.e. all files exist and the paths point to the right location), you can start FFGenOpt from the main directory by
python FF_GenOpt.py /path/to/config_file
You should first see a message about no population found (if you are starting
the optimization for the first time) and that an initial population is being
generated. Once that is done, you will see a summary after each generation
detailing a best fitness score and average score along with the improvement
since the last generation, among other things. After the specified number of
optimization cycles (GENERATIONS in the config file) are through, you can
take a look at your resluts (or the logs if somehting went wrong) in the
directory specified in the config file. If everything looks sane, then
congratulations!
You can now substitute your old force field parameters by those in the
FF_GenOpt_results
directory and start computing new MD trajectories.