
Descriptors for Pharmacophoric Properties

General remarks

Pharmacophoric properties of molecules are described in Moloc mainly by topological pharmacophores (see theory). Two types of descriptors have been derived from these: moment descriptors and similarity descriptors. This tutorial illustrates how to calculate them, using as an example a set of substituted hydroxyquinolone compounds for which plasma protein binding data are known (J. Med. Chem. 40, 4053, 1997). The files containing the structures, hdrqn.sd, and the experimental data, hdrqn.lst, can be found in the moloc/dat directory.

Generation of Topological Moment Descriptors

These descriptors can be used to derive statistical models for various molecular properties which relate to pharmacophoric molecular features.

Topological Similarity Descriptors

To calculate similarity descriptors, a similarity matrix for a set of molecules has to be calculated beforehand. One example of generating such a matrix, based on the concept of topological pharmacophores, is provided in the companion tutorial on Diversity Analysis. It is assumed here that the reader is already familiar with that tutorial.
Keep in mind that the set of molecules providing the similarity matrix is an integral part of the similarity descriptors: the descriptors change with the set and can account for variations as they occur within it.

As of February 2005, the next sections are obsolete (see recommended new version).
A possibility to use old model files is given in the last section of this page.

Generation of Files with Similarity Descriptors

The two new files, dsc.tab and dsc.mtx, provide the basis for generating statistical models that rest on the molecular description used to calculate the similarity matrix (in our case, the topological pharmacophore description).

Calculate Model Parameters

Given the file of similarity descriptors, hdrqn_dsc.tab, and a set of corresponding target values, hdrqn.lst, for which we want to obtain a predictive model, we may use any of the standard tools, such as PCR (principal component regression), PLS (partial least squares), or a neural net algorithm, to produce such a model.
In most cases ordinary least squares is not well suited, because the large number of descriptors may lead to collinearity problems. Once a model has been obtained and its predictive capabilities have been established by cross-validation, the next step is to apply it to compounds not contained in the set that was used to generate the model. For this purpose the matrix hdrqn_dsc.mtx is useful. Furthermore, a few companion programs are provided with Moloc to facilitate prediction.
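As a rough illustration of the cross-validation step mentioned above, the following sketch computes r**2 and a leave-one-out q**2 in plain Python. It is not Moloc code: the function names, the invented toy data, and the one-descriptor least-squares fit (used here instead of PCR or PLS to keep the example short) are all assumptions. q**2 is computed against the overall mean of the target values, which is one common convention among several.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b for a single descriptor."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def r_squared(xs, ys):
    """Fraction of variance explained when fitting on all data."""
    a, b = fit_line(xs, ys)
    my = mean(ys)
    ss_res = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def q_squared_loo(xs, ys):
    """Leave-one-out q**2: each compound is predicted by a model
    that was fitted without it (PRESS relative to total variance)."""
    my = mean(ys)
    press = 0.0
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a * xs[i] + b)) ** 2
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - press / ss_tot


# Invented toy data, for illustration only (not the hydroxyquinolone set).
toy_x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
toy_y = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0]

r2 = r_squared(toy_x, toy_y)
q2 = q_squared_loo(toy_x, toy_y)
print(f"r**2 = {r2:.3f}, q**2 = {q2:.3f}")
```

For a least-squares fit, q**2 never exceeds r**2; a large gap between the two suggests that the model fits the training set better than it predicts new compounds.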

Calculate Predictions for a Linear Model

For our hydroxyquinolone example, a PLS analysis yielded r**2 = 0.784 and q**2 = 0.704. The corresponding coefficients are given in the file hdrqn.cfd (one coefficient per line, with the constant at the end). This file can be found in the data directory moloc/dat.
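To make the coefficient layout concrete (one coefficient per line, constant at the end), here is a minimal Python sketch of how a linear model stored this way could be applied to a descriptor vector. It does not reproduce what Mtprmp does internally: the function names, the in-memory text, and the numbers are hypothetical, and the layout of the descriptor values themselves is assumed to be a plain list in the same order as the coefficients.

```python
def parse_coefficients(text):
    """Split coefficient text into (coefficients, constant).

    Assumes one number per non-empty line, with the constant
    term on the last line, as described for the .cfd file.
    """
    values = [float(line) for line in text.splitlines() if line.strip()]
    if len(values) < 2:
        raise ValueError("need at least one coefficient and a constant")
    return values[:-1], values[-1]


def predict(descriptors, coefficients, constant):
    """Linear model: constant + sum of coefficient * descriptor."""
    if len(descriptors) != len(coefficients):
        raise ValueError("descriptor and coefficient counts differ")
    return constant + sum(d * c for d, c in zip(descriptors, coefficients))


# Hypothetical content with two coefficients and a constant
# (invented numbers, not taken from hdrqn.cfd).
example_text = "0.5\n-1.0\n2.0\n"
coefficients, constant = parse_coefficients(example_text)

# Hypothetical descriptor values for one molecule.
prediction = predict([2.0, 1.0], coefficients, constant)
print(prediction)
```

The length check matters in practice: because similarity descriptors depend on the reference set, a coefficient file is only meaningful together with descriptors generated against the same set.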
To calculate predictions for a set of structures, the structures must be provided in mol-format. In addition to this file, the associated Moloc program Mtprmp requires:
The calculations can be performed in two different formats (for simplicity, we just calculate the values for the molecules of the original training set):

Running the new version of Mtprmp with old model files