
The conformational search process is a sampling over degrees of
freedom within a macromolecule. With the `CONGEN` command, the term
"degree of freedom" is used somewhat more freely than in the statistical
mechanical sense. It means any operation that determines any number of
atomic positions (including zero) and which can be iterated at least
once over some variable. The reason for this generalization is to allow
input, output, and energy evaluation operations to be incorporated into the
course of the search in a simple and powerful way.

The sampling process is a series of nested iterations applied
over all the degrees of freedom in the order specified by the user.
All of the variables are sampled discretely, although there are
provisions to solve for certain variables over a continuous range
where constraints may be applied.^{1}
Because the iterations are nested, the computer time required for a search
grows exponentially with the number of degrees of freedom. It is easy to set
up a run that could run for the age of the universe.
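
The growth in search size can be illustrated with a minimal Python sketch. It is not CONGEN code: the function names and the 30-degree torsion grid are assumptions chosen only for illustration. It treats each degree of freedom as a list of discrete samples and shows that nesting the iterations visits the product of the list lengths.

```python
from itertools import product
from math import prod


def search_size(samples_per_dof):
    """Number of complete conformations visited by fully nested iteration."""
    return prod(len(samples) for samples in samples_per_dof)


def enumerate_conformations(samples_per_dof):
    """Yield every combination of samples; the first degree of freedom is outermost."""
    return product(*samples_per_dof)


# Hypothetical grid: one torsion angle sampled every 30 degrees (12 values).
grid = list(range(-180, 180, 30))

# Three such degrees of freedom give 12 * 12 * 12 combinations.
three_dofs = [grid, grid, grid]
print(search_size(three_dofs))  # 1728
```

With the same hypothetical grid, every additional degree of freedom multiplies the count by 12, which is why adding a few more degrees of freedom can make a run impractical.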

There are several different methods for directing the search process. The simplest method is a depth-first search where the program tries every sample in turn using an algorithm that requires a minimum of temporary storage to keep track of its progress. There are also methods for sampling based on the quality of the partial conformations, and these techniques can result in better quality conformations being generated early in the search process. It is also possible to generate random structures. See Overview of Directed Searching, for more information.
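
The difference between the two styles of search can be sketched in Python. This is an illustration under stated assumptions, not CONGEN's algorithm: `depth_first`, `best_first`, and the `score` callback are hypothetical names, and a generic best-first expansion stands in for the quality-based methods, whose details are described elsewhere in this manual.

```python
import heapq
from itertools import count


def depth_first(dofs, partial=()):
    """Yield complete sample tuples in nested-loop order.

    Only the current partial tuple is kept at each nesting level,
    so temporary storage stays small.
    """
    if len(partial) == len(dofs):
        yield partial
        return
    for sample in dofs[len(partial)]:
        yield from depth_first(dofs, partial + (sample,))


def best_first(dofs, score):
    """Yield complete sample tuples, always expanding the lowest-scoring partial.

    `score` is a hypothetical quality function on partial tuples (lower is better).
    Every unexpanded partial is stored, so memory use is higher than depth-first.
    """
    tie = count()  # tie-breaker so the heap never has to compare sample tuples
    heap = [(score(()), next(tie), ())]
    while heap:
        _, _, partial = heapq.heappop(heap)
        if len(partial) == len(dofs):
            yield partial
            continue
        for sample in dofs[len(partial)]:
            child = partial + (sample,)
            heapq.heappush(heap, (score(child), next(tie), child))


# Toy example: two degrees of freedom, quality = sum of the sample values.
toy = [[3, 1, 2], [2, 0]]
print(next(depth_first(toy)))      # (3, 2)
print(next(best_first(toy, sum)))  # (1, 0)
```

In this toy case the depth-first search returns the first combination in input order, while the best-first search returns the lowest-scoring combination first, at the cost of holding a queue of partial results.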

The program as described in the *Biopolymers* 1987 paper was
originally designed to search the conformational space of a single
polypeptide segment within a protein. The version described here
provides that capability in a more general way, so that multiple
segments can be searched, the local environment around a segment can be
considered, or terminal segments can be sampled.

The degrees of freedom presently implemented can be divided into three categories: those which construct atoms within the system, those which do I/O, and one for the evaluation of the conformations.

There are three degrees of freedom involved with construction: Backbone, Chain Closure, and Sidechain. Together, they can search over a polypeptide segment. In addition, by creating new sidechain topology files, the Sidechain degree of freedom can be adapted for any molecule. See Sidechain Topology, for more information. The backbone and chain closure degrees of freedom work together to construct the backbone for an internal polypeptide segment. The sidechain degree of freedom is used for making the sidechains. See the menu below for more description of these degrees of freedom.

There are two degrees of freedom for I/O, `WRITE` and
`RBEST`. The `WRITE` degree of freedom writes to a CONGEN
conformation file the position of all atoms constructed up to that point
in the search along with the latest evaluation of the conformation,
see Conformation File. This file can be read back by the
`RBEST` degree of freedom, scanned for a particular
conformation using the `XCONF` command, merged with other
conformation files using the `MERGE` `CG` command (see CONGEN Related), and scanned with the `CMPLOOP` command (see Support Programs for Conformational Search). The `RBEST` degree of freedom
is used to read the best conformations from a CONGEN conformation file.
By using this degree of freedom, real space renormalization (see H.
Scheraga, *Biopolymers* (1983) **22**, 1-14) can be
implemented.

Finally, there is the `EVL` degree of freedom. `EVL` is
used to evaluate the conformation currently being constructed. Any type
of energy manipulation is possible, see Energy Manipulations, but
typically, only energy evaluation is done. The `EVL`
degree of freedom can also be used for comparing generated conformations
against a known structure, so that the theoretical limits of the
sampling can be assessed. Finally, the `EVL` option can invoke a
user-written evaluation function, or it can assign a random number to the
evaluation of the conformation.

Although CONGEN was written for searching protein segments, it can be applied to arbitrary molecules. The sidechain degree of freedom reads a topology file, see Sidechain Topology, which can be used to describe the conformational degrees of freedom in any molecule. The sidechain degree of freedom is capable of searching any subset of the degrees of freedom, and therefore, search protocols like those used for proteins can be executed where the central part of a small molecule is searched exhaustively and the peripheral moieties are searched iteratively.

Because long searches are common, CONGEN can periodically save the state of a search in a checkpoint file and restart the run from such a point. In addition, the status of the run can be periodically written to a file which can be typed by the user as the program is executing.

When CONGEN performs a search, it initializes the positions of all
atoms involved in the degrees of freedom. This prevents collisions between
newly constructed atoms and their prior positions, if any. If you are
planning several sequential searches, then you should initialize
the position of all the atoms involved (using a `COOR INIT` command,
see Function of Coordinate Manipulations).

In some cases, it is desirable to repeat a search over a particular degree of freedom. For example, consider the problem of finding structures which satisfy a set of NMR constraints. Because many of the constraints involve sidechain atoms, it is desirable to search sidechains after each backbone. However, since constraints bridge across multiple sidechains, it is desirable to rebuild all the sidechains after each new one is added. CONGEN will support this type of operation in general by examining if atoms in one degree of freedom are reconstructed by a later degree of freedom. If so, then these “overlapping” atoms will be removed just prior to the sampling of any degree of freedom which generates new atomic positions. Note that this approach can be quite inefficient, but it might be improved if this capability proves to be useful.
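
The overlap check described above can be pictured with a small Python sketch. It is only a sketch under assumptions: the function name, the atom labels, and the representation of each degree of freedom as a set of atom names are all hypothetical, and CONGEN's internal bookkeeping is not shown in this section.

```python
def overlapping_atoms(dof_atoms):
    """For each degree of freedom, return the atoms that a later one rebuilds.

    `dof_atoms` is a list, in search order, of the atom names that each
    degree of freedom constructs (a hypothetical representation).
    """
    overlaps = []
    for i, atoms in enumerate(dof_atoms):
        # Union of everything constructed by the degrees of freedom after this one.
        later = set().union(*dof_atoms[i + 1:])
        overlaps.append(set(atoms) & later)
    return overlaps


# Hypothetical search order: a backbone, one sidechain, then both sidechains again.
order = [
    {"N", "CA", "C"},        # backbone
    {"CB1", "CG1"},          # first sidechain
    {"CB1", "CG1", "CB2"},   # rebuild first sidechain and add the second
]
print(overlapping_atoms(order)[1] == {"CB1", "CG1"})  # True
```

In this toy ordering, the atoms of the first sidechain are flagged as overlapping because the third degree of freedom constructs them again, so they would need to be cleared before that later sampling step.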

There is a limited capability to treat part of the molecule as a rigid
body while other parts are being searched. The backbone and sidechain
degrees of freedom both have `FIX` options, see Backbone Degree of Freedom, and Sidechain Degree of Freedom, which specify that
atoms be constructed with the same bond lengths, bond angles, and
torsion angles that they had when the `CONGEN` command was
invoked. This can be used to explore how two domains interact with one
another when a linker joining them is flexible.

It is possible to include a cavity formation term in the energy function used by the conformational search. See Gepol Command, for more information.

[1] For example, the van der Waals avoidance in the sidechain construction will adjust a chi angle selection until a close contact is avoided. Also, the chain closure procedure generates torsion angles over the complete domain of angles. In addition, backbone and sidechain degrees of freedom can construct atoms using fixed torsion angles.