File formats used by Moloc



.mab Moloc's All Atom Format (MAB Force Field)

This is the minimal basic format that contains all topological information for Moloc's data structure.

line 1   Version 2 : This is the current version.
line 2   Entry int: int int char : entry number, number of atoms, number of monomers, name of entry
line 3   Spgp string : space group symbol (pdb convention), if available
line 4   Cell 6*float : unit cell parameters, if available
line 5   Ortg 1 4*float : first orthogonalization card, if available
line 6   Ortg 1 4*float : second orthogonalization card, if available
line 7   Ortg 1 4*float : third orthogonalization card, if available
next lines   Monomer char int int int char : chainid-insertion_code, monomer number, monomer code, type, identifier. One such line for each monomer.
atom lines : char int int int int float float float float int int int ints : atom name, entry number, monomer number, core number, number of silent H's, coordinates, charge, user set, automatic set, number of ligands, ligand numbers

For batch runs only:
control line   Cntl int int int float float float int int char : switch (-1/0/1 = protocol/minimization/dynamics), iterations, repetitions, accuracy, time step, temperature (kT), fixed-atom mask, dummy, protocol file name (if needed)
weights line   wgts int floats : mask for force-field switches, weights of energy terms (logarithmic)
A protocol file contains several sets of control and (optionally) weight lines.
written: by Moloc only
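
The field order of an atom line can be illustrated with a small reader. This is a minimal sketch, not Moloc's own parser: it assumes that the fields are separated by whitespace and that atom names contain no blanks, neither of which is stated above. The names `MabAtom` and `parse_mab_atom_line` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class MabAtom:
    name: str
    entry: int
    monomer: int
    core: int
    silent_h: int
    xyz: Tuple[float, float, float]
    charge: float
    user_set: int
    auto_set: int
    ligands: List[int]


def parse_mab_atom_line(line: str) -> MabAtom:
    """Split one atom line into the documented fields.

    Assumption: fields are whitespace-separated tokens.
    """
    tok = line.split()
    if len(tok) < 12:
        raise ValueError("atom line has fewer than 12 fields")
    n_ligands = int(tok[11])
    ligands = [int(t) for t in tok[12:12 + n_ligands]]
    if len(ligands) != n_ligands:
        raise ValueError("fewer ligand numbers than announced")
    return MabAtom(
        name=tok[0],
        entry=int(tok[1]),
        monomer=int(tok[2]),
        core=int(tok[3]),
        silent_h=int(tok[4]),
        xyz=(float(tok[5]), float(tok[6]), float(tok[7])),
        charge=float(tok[8]),
        user_set=int(tok[9]),
        auto_set=int(tok[10]),
        ligands=ligands,
    )
```

The ligand count is read before the ligand numbers, so a truncated line is reported instead of being silently accepted.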


.cif Moloc's Extended Format

This is a data-dictionary type format, analogous to the format of the Crystallographic Information Framework. However, it has a Moloc-specific dictionary, which allows storing auxiliary information of Moloc sessions, such as entry colors, activity states, atom sets, etc.
It may be used to store whole Moloc sessions. This is convenient, but large structures that do not change during a session (e.g. proteins) are then stored over and over again. This can be avoided by using the .ils format.


.ils Moloc input-list-of-files Format

Each line contains the path of an item to be read into Moloc. The files must have Moloc's default extensions.
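
As a minimal sketch of how such a list could be read outside Moloc: the function below returns one path per line. Skipping blank lines is an assumption, and `read_ils` is a hypothetical helper name.

```python
from pathlib import Path
from typing import List


def read_ils(text: str) -> List[Path]:
    """Return the paths listed in .ils content, one per non-empty line.

    Assumption: blank lines carry no meaning and are skipped.
    """
    return [Path(line.strip()) for line in text.splitlines() if line.strip()]
```

The file extension of each returned path (`Path.suffix`) can then be compared with Moloc's default extensions before the item is passed on.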

.pdb Brookhaven Protein Database Format

General reference   http://www.pdb.bnl.gov/

Comments:

FORMUL   Omitting this record for heterogens may result in incorrect atom type assignments.
CRYST1   Non-conventional space-group names may lead to incorrect space group assignment.
.pdb files come in various modified formats, generated by diverse programs. Some of these can be read by Moloc.
One possible problem with .pdb files is that the topology of heterogens is not properly defined. These entries must be checked for correct silent-H counts (in dTp/h).
written: by Protein Data Bank, many programs, Moloc (tries to adhere to the standard definitions.)


.pdb Moloc extension

Newer structures may contain more atoms or strands than allowed by the .pdb specification (99999, 26). In those cases structures are provided as .mmcif files. Moloc can read an extended .pdb format with limits (9999999, 676). Such files can be generated from .mmcif files with the command Mdfy -p.

Altered Records:

SEQRES: 13-17, number of residues; 11-12, chain id.
HET: 12-13 chain id.
ATOM: 5-11, atom number; 21-22, chain id
HETATM: The keyword is abbreviated to HETA, with the same changes as for the ATOM record.
CONECT: Keyword CONE; 5-11, 12-18 atom numbers; only one covalent bond per record.
written: by Mdfy, and Moloc (structures above the standard limits).
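
The altered ATOM and HETA columns can be read with plain string slicing. This is a minimal sketch that assumes the column ranges above are 1-based and inclusive, as in the standard .pdb specification. `parse_extended_atom` is a hypothetical name, and only the two altered fields are extracted.

```python
from typing import Tuple


def parse_extended_atom(record: str) -> Tuple[str, int, str]:
    """Return (keyword, atom_number, chain_id) from an extended ATOM/HETA record.

    Assumption: columns are counted from 1 and ranges are inclusive.
    """
    keyword = record[0:4]
    if keyword not in ("ATOM", "HETA"):
        raise ValueError("not an ATOM or HETA record")
    atom_number = int(record[4:11])      # columns 5-11
    chain_id = record[20:22].strip()     # columns 21-22
    return keyword, atom_number, chain_id
```

Seven digits for the atom number and two characters for the chain id correspond to the extended limits of 9999999 atoms and 676 (26*26) chains.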


.sfc for Surfaces, Positioned Comments, 3-d Graphs

line 1   identification
line 2   int float :  type of surface, solvent radius
line 3   int int int :  number of vertices (v), edges (e), faces (f)
line 4   int 6*float int :  global color, three scale factors (=1), 1.5/sqrt(dpa), face opaqueness, dummy (0), visibility
v lines   6*float int :  coordinates, normal unit vectors, color of vertices
e pairs of int   (about 6 per line) :  numbers of vertex pairs for each edge
f triples of int   (about 4 per line) :  numbers of vertex triples for each face

Comments:

surface type:   1: Lee-Richards, 2: Connolly, 3: polyhedral, 256: graph, 4096: positioned comments
color int:   534773760 + 128*128*127*blue + 128*127*green + 127*red (bit 0-6: red, bit 7-13: green, bit 14-20: blue, bit 21-27: alpha, bit 28 always set)
dpa:   dots per area
visibility:   bit 0: overall, bit 1: vertices, bit 2: edges, bit 3: faces
Comment files have e = f = 0. The text at the position of each vertex is added after its color integer (space separated).
written: by Moloc only
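
The color integer can be reproduced from the bit layout given above. The sketch below assumes that red, green and blue are fractions between 0 and 1, and that the green channel is shifted by 7 bits (a factor of 128), as the bit positions imply. The constant 534773760 equals 2^28 + 127*2^21, that is, bit 28 plus a fully set alpha field. The function names are hypothetical.

```python
from typing import Tuple


def encode_sfc_color(red: float, green: float, blue: float, alpha: int = 127) -> int:
    """Pack color channels into the .sfc color integer.

    Assumption: red, green and blue are fractions in [0, 1]; alpha is 0..127.
    """
    channels = []
    for value in (red, green, blue):
        if not 0.0 <= value <= 1.0:
            raise ValueError("channel values must lie in [0, 1]")
        channels.append(round(127 * value))
    r, g, b = channels
    if not 0 <= alpha <= 127:
        raise ValueError("alpha must lie in 0..127")
    # bit 28 always set, alpha in bits 21-27, blue 14-20, green 7-13, red 0-6
    return (1 << 28) | (alpha << 21) | (b << 14) | (g << 7) | r


def decode_sfc_color(value: int) -> Tuple[int, int, int, int]:
    """Return the 7-bit (red, green, blue, alpha) fields of a color integer."""
    return (
        value & 0x7F,
        (value >> 7) & 0x7F,
        (value >> 14) & 0x7F,
        (value >> 21) & 0x7F,
    )
```

With all three channels at zero the encoder returns 534773760, the offset of the formula above.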

Comments:

Lines starting with #   comments, ignored
written: by Moloc only


.tpr topological pharmacophores

initial lines starting with # are considered comments
line 1   identification of pharmacophore
line 2   int1 int2 int3 : number of H-binder agons, number of hydrophobic agons, number of data values per agon
line 3   float float float : characteristic distance for H-binders, characteristic distance for hydrophobs, dummy
line 4   (int1+int2)*int3 floats : strength values for the (int1+int2) agons
line 5   (int1+int2)*(int1+int2-1)/2 floats : values for the agon distance matrix
next 5 lines : next pharmacophore, etc.
written: by Moloc only
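
The expected number of values on lines 4 and 5 follows from the three integers on line 2. This is a minimal sketch of a consistency check, assuming that each logical line occupies one physical line with whitespace-separated values and that leading comment lines have already been removed. `parse_tpr_block` and the dictionary keys are hypothetical.

```python
from typing import Dict, List


def parse_tpr_block(lines: List[str]) -> Dict[str, object]:
    """Parse the five lines of one pharmacophore and check the value counts.

    Assumption: one logical line per physical line, whitespace-separated.
    """
    if len(lines) != 5:
        raise ValueError("a pharmacophore block has exactly 5 lines")
    n_hbinder, n_hydrophobic, n_values = (int(t) for t in lines[1].split()[:3])
    d_hbinder, d_hydrophobic = (float(t) for t in lines[2].split()[:2])
    n_agons = n_hbinder + n_hydrophobic
    strengths = [float(t) for t in lines[3].split()]
    distances = [float(t) for t in lines[4].split()]
    if len(strengths) != n_agons * n_values:
        raise ValueError("line 4: wrong number of strength values")
    if len(distances) != n_agons * (n_agons - 1) // 2:
        raise ValueError("line 5: wrong number of distance values")
    return {
        "identification": lines[0].strip(),
        "n_agons": n_agons,
        "characteristic_distances": (d_hbinder, d_hydrophobic),
        "strengths": strengths,
        "distances": distances,
    }
```

For example, 2 H-binder agons and 1 hydrophobic agon with 2 data values each give 6 strength values on line 4 and 3 distances on line 5.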


.sml similarity files

Similarity files come in two types, depending on whether the similarity is given for all possible pairs of the n elements (triangular matrix) or just for a fixed number m (reasonably m < n/4) of nearest neighbors:

- similarity matrix (triangular)

line 1   dimension, n, of matrix (number of components), optional comments
next n lines (k = 1 to n): identifier of component k, k numbers in the range between zero and one (the last having a value of one = self similarity)

- near neighbour list

line 1   n ... "(m most similar)", optional comments
next n lines: identifier of component, m pairs (label, similarity value). The first element has label 1.
Trivial example:
3 components: topological,(1 most similar)
M1 3 0.291951
M2 3 0.521621
M3 2 0.521621
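
A near neighbour list such as the trivial example above could be read as follows. This is a minimal sketch: it assumes that n is the first token of line 1, that m appears literally as "(m most similar)", and that the remaining fields are whitespace-separated. `parse_near_neighbours` is a hypothetical name.

```python
import re
from typing import Dict, List, Tuple


def parse_near_neighbours(text: str) -> Dict[str, List[Tuple[int, float]]]:
    """Map each component identifier to its m (label, similarity) pairs.

    Assumption: n is the first token of line 1 and m is given as
    "(m most similar)" on the same line.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n = int(lines[0].split()[0])
    match = re.search(r"\((\d+)\s+most similar\)", lines[0])
    if match is None:
        raise ValueError("line 1 does not contain '(m most similar)'")
    m = int(match.group(1))
    result: Dict[str, List[Tuple[int, float]]] = {}
    for ln in lines[1:1 + n]:
        tok = ln.split()
        # labels are 1-based: label k refers to the k-th component line
        result[tok[0]] = [
            (int(tok[1 + 2 * i]), float(tok[2 + 2 * i])) for i in range(m)
        ]
    return result
```

Applied to the example, M1 maps to the single pair (3, 0.291951): its most similar component is the third one, M3.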


.plt Input for Gr2d, the 2-d Plot Program

line 1   n int :  n = key, number of data lines
line 2   r 4*float :  r = key, xmin, xmax, ymin, ymax (range, optional)
line 3   s floats :  s = key, shifts for column values (optional)
line 4   m floats :  m = key, multipliers for column values (optional)
line 5   = x y = y N :  identifiers for columns (six in this case), = column is ignored, x column for x-axis, y column for y-axis, N label (alphanumeric)
next lines   data values conforming with the specification of line 5
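
An input file for Gr2d could be generated along the following lines. This is a sketch under stated assumptions, not a verified writer: it assumes that the optional r, s and m lines may simply be omitted, that values are separated by blanks, and that labels contain no blanks. `format_plt` is a hypothetical helper, and the s and m lines are not emitted.

```python
from typing import List, Optional, Sequence, Tuple


def format_plt(
    columns: Sequence[str],
    rows: Sequence[Sequence[object]],
    plot_range: Optional[Tuple[float, float, float, float]] = None,
) -> str:
    """Build .plt text from column identifiers ('=', 'x', 'y', 'N') and data rows.

    Assumption: optional lines can be left out; only the r line is supported here.
    """
    out: List[str] = [f"n {len(rows)}"]
    if plot_range is not None:
        xmin, xmax, ymin, ymax = plot_range
        out.append(f"r {xmin} {xmax} {ymin} {ymax}")
    out.append(" ".join(columns))
    for row in rows:
        if len(row) != len(columns):
            raise ValueError("each data row needs one value per column identifier")
        out.append(" ".join(str(v) for v in row))
    return "\n".join(out) + "\n"
```

The row-length check mirrors the requirement that the data lines conform to the column identifier line.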