Running Simulations

See also

If you are unsure or need inspiration, the transformato/notebooks folder has a number of detailed examples.

After setting up a system, actually running it comprises two steps:

  • Creating the intermediate states

During this stage, you may also make final adjustments to the common core and dummy regions of your system.

  • Simulating the system

Here, you will essentially conduct small MD simulations for complex and waterbox in each of the intermediary states. This will generate the trajcetory files to be analysed later.

Step 3 - Creating intermediary states

We are now following the notebook shown here XXXXXXXXXXXXXXXX

Tip

We recommend using an ipython interpreter like jupyter-notebook or VSCode to run the notebooks, as this will enable you to easier visualize the common-core and make the necessary adjustments.

  1. Loading structures

    During this part, we load the required modules and tell transformato to load our config.yaml. Please note, input_dir and output_dir as additional arguments for load_config_yaml: These folders are where to find your structures and where to write your intermediate states, respectively.

    from transformato import load_config_yaml, SystemStructure, IntermediateStateFactory, ProposeMutationRoute
    from transformato.mutate import perform_mutations
    from IPython.display import SVG
    from transformato.utils import run_simulation, postprocessing
    import warnings
    warnings.filterwarnings("ignore", module='parmed')
    
    configuration = load_config_yaml(config='24to25.yaml',input_dir='../', output_dir='./longrun-norestraints-1/')
    
    load_config_yaml

    These parameters need to be adjusted by the user

    config

    path to your config yaml file

    input_dir

    The path where your input structures are stored

    output_dir

    The path to a folder where you want to write your intermediat states and later run the simulations

  2. Generating the mutation route and common core

    Now the two ligands will be processed and the alchemical mutation route connecting the two ligands to their common core structure via several intermediate states will be created.

    s1 = SystemStructure(configuration, "structure1")
    s2 = SystemStructure(configuration, "structure2")
    s1_to_s2 = ProposeMutationRoute(s1, s2)
    s1_to_s2.propose_common_core()
    
  3. Visualize the Common Core

    Now the common core suggested by \(\texttt{TRANSFORMATO}\) can be shown using the following command

    SVG(s1_to_s2.show_common_core_on_mol1())
    SVG(s1_to_s2.show_common_core_on_mol2())
    

    If one does not like the suggestion, one can manually interfere and either add atoms to the common core

    s1_to_s2.add_idx_to_common_core_of_mol1([idx1,idx2,...])
    s1_to_s2.add_idx_to_common_core_of_mol2([idx1,idx2,...])
    

    or remove atoms:

    s1_to_s2.remove_idx_to_common_core_of_mol1([idx1,idx2,...])
    s1_to_s2.remove_idx_to_common_core_of_mol2([idx1,idx2,...])
    

    In both cases, the idx of the respective atoms can be found in the graphic shown beforehand.

    Attention

    If you add or remove atoms for both structures, make sure you are using the correct (the same) order for the both structures

  4. Finalize the common core

    s1_to_s2.finish_common_core()
    

    \(\texttt{TRANSFORMATO}\) will add staged idxs and start attempting to scale charges. If you want, you may now repeat the SVG() command from above to see the changes you made.

  5. Create mutations and write intermediate states

    mutation_list = s1_to_s2.generate_mutations_to_common_core_for_mol1()
    i = IntermediateStateFactory(
    system=s1,
    configuration=configuration,
    )
    

    This will generate the necessary mutation list from one endstate to the common core.

    perform_mutations(nr_of_mutation_steps_charge=3, configuration=configuration, i=i, mutation_list=mutation_list)
    

    With this command, the actual intermediate state directories are written. After this has finished without errors, you may proceed to actually running the simulation.

    perform_mutations

    If needed, you can adjust the amount of intermediate steps here

    nr_of_mutation_steps_charge (default: 5)

    how many intermediate states should be created to scale the charges of the atoms in the dummy region

    nr_of_mutation_steps_cc (default: 5)

    how many intermediate states should be created to interpolate parameters between the two common core regions. This is only necessary for ligand 1!

  6. Running the simulation

    Now the simulation can be started locally with:

    run_simulation(i.output_files, engine="openMM")
    
    engine (["openMM","CHARMM"])

    you can decide whether you want to use openMM or CHARMM

    Note

    openMM is already available in the conda fep environment, CHARMM needs to be installed by oneself.

    Attention

    It is technically possible to run RBFE simulations locally but it can take a lot of time. For that reason, it is strongly suggested to offloade them to a supercomputer. Running RSFE on a local machine though, on a GPU can be done easily.

    If you take a look at the intst*/ directories now created (located at project-folder/replicate-folder/combinedstructure/singlestructure/intst*) you’ll two scripts: simulation.sh and simulation_charmm.sh

    Somewhat unsurprisingly, these are responsible for running the simulation as either openMM or CHARMM, containing the required information and arguments.

    Important

    You only need to run one of the options below. Please note, however, that CHARMM does not have the same features as openMM. If you need to modify these scripts for all future use in some way, you may find their sources in transformato/bin

    In each intermediate state directory (called intst1, intst2, …), there is a simulation.sh file for the use with openMM and a simulation_charmm.sh file for the use with CHARMM.

    Using openMM:

    For openMM, use simulation.sh:

    #!/bin/bash
    #SBATCH -p lgpu
    #SBATCH --gres=gpu
    
    
    source ~/anaconda3/etc/profile.d/conda.sh # you might need to source your conda environment
    conda activate fep
    
    path=$1
    
    cd ${path}
    pwd
    hostname
    
    input=lig_in_complex
    init=lig_in_complex
    pstep=lig_in_complex
    istep=lig_in_complex
    irst=lig_in_complex
    orst=lig_in_complex_rst
    python -u openmm_run.py -i ${input}.inp -t toppar.str -p ${init}.psf -c ${init}.crd -irst ${irst}.rst -orst ${irst} -odcd ${istep}.dcd &> complex_out.log
    
    input=lig_in_waterbox
    init=lig_in_waterbox
    pstep=lig_in_waterbox
    istep=lig_in_waterbox
    irst=lig_in_waterbox
    orst=lig_in_waterbox_rst
    python -u openmm_run.py -i ${input}.inp -t toppar.str -p ${init}.psf -c ${init}.crd -irst ${irst}.rst -orst ${irst} -odcd ${istep}.dcd  &> waterbox_out.log
    

    Essentially, this file just tells openMM what to run and then runs it, providing a bit of debug information along the way. Importantly, it takes the working directory as argument.

    So, if you want to simulate this intermediate state, you would run:

    ./simulation.sh /absolute/or/relative/path/to/that/intermediatestate
    

    Note

    Assuming your current working directory is that intermediate state, you can just supply ./ as argument.

    Using CHARMM:

    For CHARMM, use simulation_charmm.sh

    #!/bin/bash
    #SBATCH -p lgpu
    #SBATCH --gres=gpu
    path=$1
    SWITCH=$2
    
    cd ${path}
    pwd
    hostname
    
    
    run_complex () {
    input=charmm_run_complex
    OMP_NUM_THREADS=8 ${CHARMM} -i ${input}.inp > log_complex.out
    }
    
    run_waterbox () {
    input=charmm_run_waterbox
    OMP_NUM_THREADS=8 ${CHARMM} -i ${input}.inp > log_solv.out
    }
    
    
    case ${SWITCH} in
    1)
    run_complex
    ;;
    2)
    run_complex
    run_waterbox
    ;;
    esac
    

    Importantly, unlike openMM this takes two arguments: The path, same as before, and a CASE statement. You need to run both, so that will always be 2 for you. You also need to have the ${CHARMM} system variable defined.

    So, to run the simulation using CHARMM:

    ./simulation_charmm.sh /absolute/or/relative/path/to/your/intermediatestate 2
    
  7. Automation and offloading to a cluster

    There are two methods to automate running simulations:

    • Running it from the submit.ipynb

      Has the advantage of no extra step being necessary, but pythons multiprocessing facilities can be capricious. This variant also does not allow inspection of the intermediate states before submission.

      If you’d like this option, add this code to the bottom of your submit.ipynb:

      import os
      import glob
      import subprocess
      
      wd = os.getcwd()
      output_files = glob.glob(wd + f"/{folder}/*/*/intst*", recursive = True)
          # Whether {folder} is necessary depends on your folder setup
          # relative to your working directory -
          # in general, this should point to your intst** folders
      
      for path in sorted(output_files):
          print(f"Start sampling for: {path}")
          exe = subprocess.Popen(["ssh", "your clustername", "sbatch", f"{str(path)}/simulation.sh", str(path)], text=True, stdout=subprocess.PIPE )
          output = exe.stdout.read()
          print(output)
      

      You will have to replace your-clustername and modify the paths according to your setup.

    • Running it via script

      Has the advantage of allowing inspection beforehand and being more reliable, as well as allowing the modification of cluster instructions during submit.

      To use, create a script similar to this:

      for {i} in ./**/**/intst**/; #This assumes you start from the directory above the initial structure directory - you may change as necessary.
      do cd ${i}; # to prevent issues, it is preferred to switch to the script directory
      sbash simulation.sh; # sbash being the command for the workload manager slurm - you may need to replace as necessary.
      cd ../../..; # switch working directory back to original so next loop starts properly
      done;
      

      If you run this from the folder containing the replicate directories, any correctly built \(\texttt{TRANSFORMATO}\) replicate should be submitted to run. You can modify the loop glob to restrict the script to certain directories.