Fragment Libraries
General remarks
Fragments (templates) are used in Moloc for Template Matching or
Ab Ovo Building. They consist of structures equipped with extensions,
which are 'substituent points' indicating an atom of attachment and the
direction in which a possible attachment would take place. In Moloc,
extensions are simply atoms that carry a single-letter label that cannot
be mistaken for an atom symbol, followed by a number (e.g. L2). If an
extension has not been used in the building process, the atom
representing it is kept 'as is'.
A symmetric bond matrix, with rows and columns labeled by the extension
labels, indicates which extensions can be paired up in the building
process. The matrix elements may be zero (no bond allowed), one for
single bonds, etc.
Extension labels and the bond matrix are defined in a file.
Moloc looks for file names starting with 'LABEL'. A default and example
file, called LABEL_moloc, can be found in Moloc's 'dat' directory. It is
appropriate for templates generated with default parameters.
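The pairing rule encoded by the bond matrix can be illustrated with a small Python sketch. It does not reproduce the format of the LABEL file; the labels, the bond orders, and the function names are all made up for illustration. It only shows how a symmetric lookup by extension label behaves.

```python
# Hypothetical extension labels; real ones are defined in a LABEL file.
LABELS = ["L1", "L2", "L3"]

# Upper triangle only. Values are bond orders (0 = pairing not allowed).
# These numbers are invented for this example.
UPPER = {
    ("L1", "L1"): 1,
    ("L1", "L2"): 1,
    ("L1", "L3"): 0,
    ("L2", "L2"): 2,
    ("L2", "L3"): 0,
    ("L3", "L3"): 1,
}


def build_bond_matrix(labels, upper):
    """Expand an upper-triangle table into a full symmetric matrix."""
    matrix = {a: {b: 0 for b in labels} for a in labels}
    for (a, b), order in upper.items():
        matrix[a][b] = order
        matrix[b][a] = order  # symmetry: pairing a-b equals pairing b-a
    return matrix


def allowed_bond_order(matrix, ext_a, ext_b):
    """Return the bond order for pairing two extensions (0 = not allowed)."""
    return matrix[ext_a][ext_b]


BOND_MATRIX = build_bond_matrix(LABELS, UPPER)
```

Because the matrix is symmetric, `allowed_bond_order(BOND_MATRIX, "L1", "L2")` and `allowed_bond_order(BOND_MATRIX, "L2", "L1")` give the same answer, so the order in which two extensions are offered for pairing does not matter.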
'Template Matching' typically uses one or a few libraries.
'Ab Ovo Building' typically uses several libraries.
They contain fragments characterized by various properties: size, rings,
aromaticity, etc.
In addition, each library carries three parameters: the minimum and
maximum number of fragments to be built into the nascent structure from
the library, and a frequency parameter indicating how often the library
is considered relative to the other libraries (default values: 0, 31, 1).
These parameters can help to direct the course of structure building
along desired lines.
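How the three parameters could interact is sketched below in Python. This is not Moloc's selection algorithm, which is not described here. The sketch assumes that the frequency acts as a relative weight in a random draw and that a library is no longer offered once its maximum is reached. The class and function names are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class Library:
    name: str
    min_count: int = 0      # default values taken from the text: 0, 31, 1
    max_count: int = 31
    frequency: float = 1.0
    used: int = 0           # fragments taken from this library so far


def pick_library(libraries, rng):
    """Draw a library with probability proportional to its frequency.

    Libraries that have reached max_count are skipped (an assumption).
    Returns None when no library can be drawn any more.
    """
    candidates = [lib for lib in libraries if lib.used < lib.max_count]
    weights = [lib.frequency for lib in candidates]
    if not candidates or sum(weights) <= 0:
        return None
    chosen = rng.choices(candidates, weights=weights, k=1)[0]
    chosen.used += 1
    return chosen


def minimums_satisfied(libraries):
    """True when every library has contributed at least min_count fragments."""
    return all(lib.used >= lib.min_count for lib in libraries)
```

With, for example, `Library("rings", max_count=2, frequency=3)` and `Library("chains", min_count=1)`, the first library is drawn about three times as often as the second until it has contributed two fragments, after which only the second remains available.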
Furthermore, seed libraries may be defined; one of their fragments is
selected to start the building process. The coordinates of this fragment
are either held fixed or are constrained to their initial values.
Library parameters can either be modified interactively
(Dyn/o/l/i) or in a list file (.ils) in which several libraries can be
specified for input (see option 'Dyn/o/l/g' and its help text).
Generation of a Fragment (Template) Library
We start from 100 molecules from the test set (tst_100.sd in the
moloc/dat directory) and generate templates with at least two and at most
7 atoms and at most one cut-bond.
- Enter Moloc's template utility (lib/t). Select 'p' (set parameters).
Set A=7, a=1, c=1. (Clicking a slider description with the right mouse
button displays a description of the parameter.)
- Choose option 'g' and specify the tst_100 file name. After you have
specified a prefix for the template names, Moloc reads one molecule
at a time and cuts it into pieces according to the rules specified by
the parameters. It should end up with 131 templates.
- Store the library as a .mab file (e.g. mylib.mab) with option 's'.
- Remark: If many structures are to be shredded, it is advisable to
partition them into several chunks and merge the resulting libraries at
the end. This reduces the time spent looking for duplicates in a library
of increasing size.
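The partition-and-merge workflow from the remark can be sketched in Python as follows. The sketch only illustrates the order of operations, not Moloc's shredding or its duplicate test. The `shred` and `key` callables are hypothetical stand-ins supplied by the caller, and duplicates are detected through a hashable key, which is an assumption.

```python
from itertools import islice


def chunks(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while True:
        block = list(islice(it, size))
        if not block:
            return
        yield block


def dedupe(templates, key):
    """Keep the first template for every distinct key, preserving order."""
    seen = set()
    unique = []
    for template in templates:
        k = key(template)
        if k not in seen:
            seen.add(k)
            unique.append(template)
    return unique


def shred_in_chunks(structures, shred, key, size):
    """Shred each chunk into its own small library, then merge them all."""
    partial_libraries = []
    for block in chunks(structures, size):
        templates = [t for structure in block for t in shred(structure)]
        partial_libraries.append(dedupe(templates, key))
    merged = [t for library in partial_libraries for t in library]
    return dedupe(merged, key)  # final duplicate removal across chunks
```

Each partial library is cleaned while it is still small, and only the already reduced libraries are compared against each other in the final merge.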
The templates in this library contain as many extensions as were found
in the generating structures. To produce a library that contains only
single-extension templates, proceed as follows:
- In option 'p' set n=1. (A template with e.g. 3 extensions is expanded
into 3 single-extension templates.)
- Delete all structures in the current library (option 'd').
- Read in the old library (mylib.mab) with option 'r'. The program
reports: 354 templates.
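The expansion performed with n=1 can be pictured with the following Python sketch. A template is reduced here to a plain list of atom symbols, and the extensions that are not kept are capped with hydrogen. Both the data layout and the hydrogen capping are assumptions made for illustration and are not taken from Moloc.

```python
import re

# An extension label is a single letter followed by a number (e.g. L2).
EXTENSION = re.compile(r"[A-Za-z]\d+")


def is_extension(symbol):
    """True for labels such as 'L2' or 'L10', False for 'C', 'N', 'Cl'."""
    return EXTENSION.fullmatch(symbol) is not None


def expand_to_single_extension(symbols, cap="H"):
    """Return one copy of the template per extension it contains.

    In each copy exactly one extension is kept. All other extensions are
    replaced by `cap` (hydrogen here, which is an assumption).
    """
    positions = [i for i, s in enumerate(symbols) if is_extension(s)]
    expanded = []
    for keep in positions:
        copy = [
            cap if (i in positions and i != keep) else s
            for i, s in enumerate(symbols)
        ]
        expanded.append(copy)
    return expanded
```

A template with three extensions, such as `["C", "N", "L2", "L1", "L10"]`, yields three copies that each carry a single extension, which is why the library grows when it is read back in.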
Further options allow libraries to be examined and modified:
- Prune this library (option 'P') with all options set: duplicate
removal, energy minimization, atom labelling, and template rename
(you will be asked for a prefix). Now the library contains just 131
templates.
- With the browse option, templates can be viewed and, if unwanted,
removed.
- The two filter options 'f' (regular filter options, lib/f), and 'e'
(for extension properties) allow for further trimming of libraries.
- With option 'm', a library can be expanded to contain multiple
conformers. For our 131-template library this yields some 319 conformers.
The default procedure is to chop up structures at cut bonds of
single-bond type. Alternatively, one can specify a cut-bond library,
option 'c', which defines the bonds (substructures) at which cuts
should be made. Activate this mode with slider 'c' of the parameter
setting option 'p'.
A cut-bond library can be found in Moloc's data directory:
moloc/dat/shdr_bnd.mab. This file also contains comment lines on how
to specify the cuts.
Building a Seed Library interactively
- Start Moloc and read in the inhibitor entry '4dfr_B_MTX' from
the file 4dfr.pdb.
- In 'dTp/d' delete all atoms except NA2, C2, and N1,
which will yield the template, and N3 and C8A, which will become
extensions.
- Go to the fragment-library menu, 'Dyn/o/l'.
- Select 'i' to build (modify) a library interactively.
The program asks for an extensions file. Take Moloc's default 'LABEL_moloc'.
If none is specified, meaningful labels will have to be typed. The program
now asks for a library name. Then you are dropped into a new menu with an
empty current library.
- Choose option 'i' to insert our Moloc entry as a
member of the fragment library.
- Select option 'c' and pick atom N3 in order to
change it to an extension. From the extension selector take L2,
appropriate for an extension from an sp2 carbon. The unused extension
should mutate into hydrogen (next question asked by Moloc).
- Change atom C8A to extension type L10 (unused to hydrogen).
- Exit from the menu. Moloc now asks whether the fragments of the
library should be kept fixed, constrained, or be free. In this last case
the program asks for minimum and maximum occurrence and a relative
frequency with which the library should be chosen.
- Finally, the program asks whether the fragments should be minimized.
This step may not be meaningful for fixed or constrained fragments.
- Store the new (modified) library (option 'w').
Links
The following URLs lead to publicly available databases from which
structures for shredding into templates can be extracted:
- http://DrugBank.ca
- http://zinc.docking.org
- http://pubchem.ncbi.nlm.nih.gov