

The Implementation of CONGEN

The implementation of CONGEN is complex, because of the age of many parts of the code, the desire to preserve useful functions from CHARMM, the requirements of portability, and the need to provide as much functionality as possible. This chapter of the manual and the following one on storage management are essential to anyone interested in modifying the program.

CONGEN is implemented as a single program. As a result, it is big. However, because of the use of dynamic storage allocation, it requires less initial storage than many contemporary modeling programs. Keeping everything together makes modification more reliable, because errors introduced by a change are more likely to be noticed. This philosophy requires that testing be an integral part of the implementation of CONGEN. Ideally, there should be tests that exercise every line of code in CONGEN. As changes are made, the pre-existing tests are run to verify that changes were made correctly. As new features are added, new tests are constructed to ensure their correct operation.

CONGEN is implemented using Fortran, the FLECS preprocessor, and C. The reason for this mix is largely inertia. Early work on CHARMM was in Fortran because of its "easy" portability and convenience for numerical processing. Later, FLECS was used to make the program easier to read and modify. When the conformational search code was added, recursion became essential, which led to the use of C. There are no plans to rewrite CONGEN into one language because it is anticipated that others may wish to add Fortran code into the program.

The use of two languages in CONGEN requires an interlanguage interface. Most modern computers provide a mechanism for communication between C and Fortran. However, these interfaces are invariably machine dependent. In order to reduce the portability problems this entails, the program, wrapgen, has been written to localize the machine dependencies into one place, and free the CONGEN programmer from this problem while working on the regular code.

Besides the problem of the interlanguage interface, portability is a very important aspect in the implementation of CONGEN. In general, most operations in CONGEN are implemented using standard language features, but there are instances where machine dependent features are essential. In order to accommodate these machine dependencies, all the source code in CONGEN is run through a preprocessor. For the C code, the normal C preprocessor is used. For the FLECS/Fortran code, a modified version of the GNU Emacs C preprocessor is used, namely fcpp, the Fortran C PreProcessor. Certain C preprocessor variables are defined which indicate compilation on a particular machine, and these variables are used to determine which code to compile. A configuration file, `config.h', is used to determine the settings of generic preprocessor variables.

Binaries for different platforms can be kept under one directory tree and mounted using NFS. See section Installation of CONGEN on UNIX, for more information.

Please note that this section of the manual is intended to be an overview. There is no substitute for studying the source code carefully. Although it is an important goal to keep this documentation up to date, one should not trust this documentation too much since it is effectively a copy of information which is always being changed. If there are any doubts about accuracy, the source code is the final arbiter.

The Structure of CONGEN

CONGEN has a very simple organization. The main program is primarily a command dispatcher. It reads each command from the input file, parses the first word which is a command verb, and executes code to parse the remainder of the command and then perform its function. Data storage is simple in principle -- there are several dozen COMMON blocks which store information about the system being modeled, and the commands normally act to use or modify these COMMON blocks. Thus, many of the subroutines are independent of each other since they only depend on the data in the COMMON blocks.

Within CONGEN, there are also a large number of subroutines and functions that provide a convenient programming environment. For example, there are string manipulation routines, array manipulation routines, storage management capabilities, and operating system interfaces. Most of these routines are found in the source files `string.flx', `array.flx', `util.flx', `cutil.flx', and `cgutil.flx'; others are scattered throughout the program. The programming tool autodoc (see section Programming Tools) can be used to list the comments from the subprograms within FLECS files.

Programming Environment

CONGEN is implemented using Fortran 77, the FLECS Fortran preprocessor, and C. All code is passed through a C preprocessor in order to handle conditional compilation. In general, only standard features of the languages are used, but this rule is violated when there are strong reasons for doing so. (For example, the six letter maximum length of identifiers in Standard Fortran is far too onerous to bear, so the C limit of 31 characters is used, and there is a tool, makeshort, which can generate C preprocessor definitions to reduce the size of the variables to whatever machine dependent limits are necessary.)

The management of revisions is done using RCS, the Revision Control System. We use version 5.5 obtained from the Free Software Foundation. Every source file should be kept under RCS control.

Use of Fortran and FLECS

The main reasons for using Fortran as a base language are its widespread usage among the molecular modeling community and its wide availability. Most hardware manufacturers concentrate their compiler optimization capabilities into Fortran. However, Fortran lacks good control constructs, data structures, and operating system interfaces, so steps have been taken to circumvent these limitations. The control constructs are provided by using FLECS, data structures are provided through either an elaborate storage management scheme (see section Storage Management in Fortran) or by using C, and the operating system interfaces are generally provided using C.

FLECS is a Fortran preprocessor that allows us to use a much more robust set of control constructs than normally provided in Fortran. This allows for much more readable and understandable code than could be obtained via Fortran alone. In addition, the use of FLECS allows the development of new programs much more quickly because much less time must be spent working out the flow of control. It is described in detail in section `Introduction' in FLECS Manual.

Specifically, FLECS provides a block structured IF -- THEN -- ELSE clause, a structured iteration clause, WHILE and REPEAT WHILE clauses as well as their inverses, CONDITIONAL and SELECT clauses, and internal procedures. The internal procedures are especially valuable because they allow long subroutines to be broken into small pieces while leaving all variables accessible. In addition, one can give very long names to these procedures, which makes their purpose far clearer. For example, INTERPRET-COMMAND-AND-BUILD-CLAUSES tells you a lot more than CALL INTCBC or GOTO 285 does.

FLECS operates by translating any FLECS constructs into Fortran. Any non-FLECS constructs are merely copied. The Fortran compiler must then be invoked to compile the translated code into machine language.

To invoke FLECS, type FLECS file1 file2 .... Any number of files can be translated. The default file extension for FLECS programs is `.flx'. The Fortran translations will be produced in files whose extension is `.f' or `.for' depending on the machine FLECS is executed on. The FLECS listing files appear with extension, `.fli'.

Concerning the portability of FLECS, FLECS was written in itself. It was designed to be transported and has been modified to use the C preprocessor to encode machine dependencies.

Not all of CONGEN has been written using FLECS, as the preprocessor was not adopted until August 1979. All new code should be written using it, and old code in need of major modification should use FLECS as well.

Two non-standard features of most Fortran compilers are routinely used: long variable names and memory overlays. Variable names can be any length. There is a programming tool, makeshort, found in `$CGP', which can be used to translate all long names into short names. (In this day, six character variable names are really quite an anachronism!)

The dynamic storage allocation schemes, see section Storage Management in Fortran, depend on mapping arrays of one type onto arrays of another type. It is necessary that REAL variables occupy the same amount of space as INTEGER variables. Numeric arrays are not mixed with character arrays, which avoids many alignment problems.

Use of C

C was first used in CONGEN for implementing the conformational search algorithm. This algorithm required complicated data structure processing and recursion, for which the language is eminently suitable. Later on, C was used for operating system interfaces because many of the operating system calls in C are portable. For example, the storage management Fortran library uses malloc in the C library to obtain additional storage from the operating system without a requirement for machine dependent code.

One limitation in the use of C is I/O. The Fortran and C I/O libraries are not compatible. Therefore, all I/O from C must be done using Fortran routines. See the source code for for_printf, for_scanf, etc. in `cutil.c' for routines which provide this functionality.

FCPP -- The Fortran C Preprocessor

FCPP is a modified version of the GNU Emacs C Preprocessor. This preprocessor provides all the functionality of the old style C preprocessor as well as #elif and the macros __LINE__, __DATE__, __FILE__, and __TIME__. It also contains additional features for Fortran code, and it has been ported to the VAX/VMS operating system in addition to Unix platforms.

It is used on all FLECS code in CONGEN, and is described in greater detail in the fcpp manual page.

Note: since fcpp is derived from the GNU Emacs C Preprocessor, it is licensed under the GNU Emacs License, which provides for more freedom to redistribute it than is available with CONGEN itself. Please see the license text in the source file, `cccp.c', or in your CONGEN license agreement, which incorporates the GNU Emacs License.

WRAPGEN -- The Wrapper Generator

The program, wrapgen, is used to generate the interfaces between Fortran and C. It is described in greater detail in the wrapgen manual page. Briefly, wrapgen takes a file of prototypes which describe functions written in either C or Fortran, and it generates procedures which can be called from Fortran or C, respectively. The wrappers take care of the machine dependencies inherent in character string representations, character string lengths, call by value or reference, and naming rules, so that the programmer need not be concerned about these details. All source code must include a wrapper header file generated by wrapgen in order to correctly handle the substitution of wrapper function names.

MKPROTO -- Make C Function Prototypes

The program, mkproto, will generate ANSI C function prototypes for all the procedures in a C source file. It is described in greater detail in the mkproto manual page.

In CONGEN, mkproto is used to avoid duplicating all the function definitions when preparing prototypes. Currently, it is used for all C files, except for `tree23.c' and `noproto.c'. The file, `tree23.c', is not run through mkproto because it is anticipated that it will be released as a separate procedure. The file, `noproto.c', is used for procedures for which mkproto doesn't work correctly. Presently, this occurs only with files containing machine dependencies where function definitions require the declaration of structures that are specific to a machine. Any necessary prototypes from this file can be put directly into `funct.h'.

The Configuration File, `config.h'

In order to simplify the use of machine dependent features, preprocessor variables have been defined which describe particular features. For example, the declaration of double precision variables in Fortran is given by the symbol, DOUBLE_TYPE. On most machines, this translates into DOUBLE PRECISION, but on the Cray, it translates to REAL. The source code file, `config.h', contains all these configuration variables. The values are set using a small number of machine identifiers which are set by the makefiles used to build CONGEN.

Making CONGEN

CONGEN is constructed using either the make command on Unix machines or the MMS utility on VMS. There is a master `makefile' or `descrip.mms' file in `CG', which will invoke all the necessary `makefile's needed to build the entire system. These `makefile's work fine on most machines, but have trouble on the Convex because their make program is older.

In order to handle the different compiler names and flags necessary for each different machine, many of the CONGEN directories have a file named `makefile.gen' which is missing definitions for these macros. In the `CG' directory, there is a set of `*.make' files which contain the macros needed for each type of machine. The names of the `*.make' files are as follows:

`sgi...'
There are several files for Silicon Graphics workstations which are described in section Installation of CONGEN on UNIX. Currently, there are four versions of these files: sgi_r10k_i6.4_c7.1_m4_a64.make, sgi_r3k_i5.3_c5.3_m1_a32.make, sgi_r4k_i5.3_c5.3_m2_a32.make, and sgi_r8k_i6.1_c6.1_m4_a64.make.
`unicos'
Cray YMP running Unicos.
`rs6000'
IBM RS/6000 workstation running AIX.
`convex'
Convex computers.
`hpux'
Hewlett Packard 700 series workstations running HPUX.
`sparc'
Sun Sparcstations using gcc as the C compiler.
`alphaosf'
DEC Alpha machines running OSF.
`fujitsu'
Fujitsu VP240.

There is a program, fixmake, in the `CG' directory which will take one of these files and incorporate it into a `makefile'.

The selection of machine is done via the CGPLATFORM environment variable which is set in `$CG/cgdefs' or `$CG/cgprofile'.

The master `makefile' has several different targets that can be specified to build parts of the system. Use them with care. They are listed as follows:

prepare
Prepare for rebuilding the entire system. This will set up to rebuild everything including binary data files. It should be used only when porting CONGEN to a new machine, or when file structures are changed.
clean
Clean up unnecessary copies of executables, intermediate data files, objects, etc.
all
Make everything in place. Executables made by this target are not copied to the directories which are included into user PATH environment variables.
install
Make everything and put them where users will access them. Note that the install script from the X window system is used.(36)
tovax
Copy files to dino. This should be used only at Bristol-Myers Squibb; it has no effect at other sites unless the file name `/u/dino/fc/congen/...' leads someplace real.
setup
This will construct all the `makefile's from the `makefile.gen's.

In order to construct CONGEN from scratch on a brand new machine, the following steps must be followed:

  1. Modify `$CG/cgdefs' and `$CG/cgprofile' to reflect your directory structure and the hardware platform.
  2. Create a platform file in `$CG' which specifies the necessary switches for compilation.
  3. Set your default directory to `$CG'.
  4. Do a make prepare.
  5. Do a make install. Some of the utility programs like peer may not build correctly on all SGI platforms. This should not interfere with the installation of CONGEN.
  6. Compare the test cases found in `$CGT'.

To make a new directory tree for a different SGI platform, perform the following steps:

  1. Verify that the master tree name in $CGROOT/master_machine is correct.
  2. Add a new platform file in `CG' which specifies the necessary switches for compilation.
  3. Execute cd $CGROOT
  4. Execute ./update_tree new_platform_name
    The script, ./update_tree, copies all the files from the master directory stored in the file $CGROOT/master_machine into the directory specified above as new_platform_name. It sets up links for all RCS directories except the test case archive, which is duplicated for the new platform, because it is likely that the test values will change slightly.
  5. Execute source cgundef
  6. Follow the procedure for building CONGEN above starting at the cd $CG step.

Standards (Rules) for Writing CONGEN Code

The following set of rules is designed to help keep CONGEN readable and modifiable.

  1. All routines should have similar organization. Each Fortran subroutine should have the following structure:
          SUBROUTINE DOTHIS(ARG1,ARG2,....
    C
    C     A comment which describes the purpose of this subroutine.
    C     This comment is essential because it provides the only
    C     documentation for nearly all subroutines in CONGEN. The
    C     program, AUTODOC, can be used to get this comment from all
    C     subroutines.
    C     
          <Declarations>
    C
          <Code>
    
    The separation of the code from the declarations by a blank comment aids in reading the code. It becomes obvious where executed code begins. Each C procedure should be written like this:
    dothis(type par1, type par2,...)
    /*
    *   A comment which describes the purpose of the routine. This
    *   comment must come here so that automatic documentation can
    *   be implemented (a program similar to AUTODOC is planned.)
    */
    
    {
        local declarations
    
        code
    }
    
  2. Prototypes for all the C functions should be provided through the use of mkproto, see section MKPROTO -- Make C Function Prototypes. See the makefile for CONGEN for details of the mechanism. Note that not all files can be processed correctly, such as functions which are declared with structures or types that are machine dependent. All such functions should be placed in the file, `noproto.c'.
  3. All source files must have a copyright notice at the top. See any source file for the appropriate text.
  4. All C source files must include `config.h' and `wrappers_c.h' in that order. It is a good idea to use an existing source file to provide the copyright and includes to get started.
  5. All FLECS source files must include `config.h' and `wrappers_f.h'.
  6. All code should be written clearly. Since the code must be largely self-documenting, clarity should not be sacrificed for insignificant gains in efficiency. The use of C and the FLECS preprocessor is encouraged as it graphically illustrates the flow of control and allows for internal procedure calls. Variable names should be chosen with care so as to illustrate their purpose. Avoid using one or two letter variable names in any COMMON blocks. Comments should be used where the function of code is not obvious.
  7. All usages of integers, floats, and doubles in C code must use the F77_INTEGER, F77_REAL, and F77_DOUBLE macros defined in `config.h'. Boolean variables should use the BOOLEAN macro, and Fortran logical variables defined in Fortran should use the F77_LOGICAL macro. The F77_INTEGER macro should expand to a long int. If it does not, you must review calls to the different scanf and printf functions to ensure correct typing.
  8. Be careful to distinguish between Fortran 77 logical variables and C integers being used to hold Boolean variables. The testing conditions are machine dependent. Use the macros provided in `macros.h' for Fortran logicals.
  9. Be careful that the type of any numeric constant matches its usage across the various platforms on which CONGEN is implemented. For example, there may be problems with using a real constant in an intrinsic function call with multiple arguments in Fortran, e.g. SIGN(1.0,P). If P is DOUBLE_DECL, you will have a problem because DOUBLE_DECL can map to either REAL or DOUBLE PRECISION depending on the machine. In such cases, it is better to use a variable or parameter to store the constant.
  10. All usages of DOUBLE PRECISION variables in Fortran must be declared using the DOUBLE_DECL macro. This allows CONGEN to switch double precision variables to single precision on 64-bit computers.
  11. Any variable in Fortran code that holds a pointer to be used by the C code must be declared using the POINTER_DECL macro. On 64 bit architectures, this macro will expand to 64 bit integers. The equivalent type in C is given by the F77_POINTER macro.
  12. Any subroutine defined in C which can be called from FLECS code must have its prototype entered into the source file `wrap_cdef.proto'. Likewise, any subroutine defined in FLECS which can be called from C must have its prototype entered into the source file `wrap_fdef.proto'.
  13. Whenever Fortran common blocks are accessed within C, you must use the predefined macro for the common block name. The macro is the upper case name for the common block. Header files (suffix `.h') are defined for all common blocks used in C code. For example, if you want to refer to the X coordinate array in C, use COORD.x.
  14. There are a number of rules associated with input and output:
    1. All input commands should be free field. The command processor should check that the entire command is consumed.
    2. Short outputs, messages, warnings, and errors should be sent to unit 6 for output.
    3. All inputs should be echoed to unit 6. All values read by the command should also be output to unit 6.
    4. All warning and fatal messages should state which subroutine generated them, so that one can find the location in the source code where the problem arose.
    5. All data structures output with unformatted I/O statements must have a HDR, ICNTRL, and TITLE in the first two records. See any existing binary output subroutines for the exact format.
    6. Unformatted I/O file formats should remain upward compatible. Use an ICNTRL array element to indicate which version of CONGEN wrote the file. Such upward compatibility must be maintained only across production versions of CONGEN. In other words, a file format for the developmental version may be freely changed until a new version is generated, at which point all future versions must be able to read it.
    7. All I/O must be done through Fortran I/O. C I/O is not to be used. See the procedures in the source code file, `CUTIL.C', for useful analogs of C I/O functions to make this rule easy to follow.
  15. All error conditions must terminate with a CALL DIE. The subroutine, DIE, provides a traceback or core dump so the program statements causing the error can be seen.
  16. Large or variable storage requirements for Fortran code must be met on the stack or heap. In C, cgalloc and cgfree should be used for all variable storage needs.
  17. Array overflows must always be checked for when arrays are being written. This is especially important when the array being constructed might be dynamically allocated. Error checking in general should be as complete as feasible.
  18. The code should use a minimum of non-standard Fortran or C features. All non-standard features must be conditionally compiled so that any CONGEN programmer is informed that the code is special.
  19. In order to make subroutines callable from different contexts, parameter passing should be done through the subroutine call rather than through COMMON blocks.
  20. All common blocks which are shared between multiple subprograms are to be placed in files and #include'd into the program. The common blocks should have comments describing each variable in the common block so that new users will know what's there. No directory should be specified for the #include'd files, so that the -I option to the C preprocessor can be used to select the directory at will. If a common block is to be shared between C and Fortran code, use the existing code in the *.h files to implement the needed name equivalence.
  21. Avoid the use of static memory for initialization purposes. As more sections of CONGEN are implemented on parallel computers, making the subroutines reentrant is essential. Also, avoid the use of EQUIVALENCE and DATA statements in Fortran, since all storage referenced by these statements is allocated statically on the Iris.
  22. When using scanf functions in C, use only long int's or doubles for your I/O, and then convert to your type. This avoids the need to control for machine dependent variations in data lengths.

Programming Tools

Presently, there is one tool for assisting in the development of CONGEN besides the language tools described within this section.

The program, autodoc, will collect information on all the entry points in a large Fortran or FLECS program and write them out using several different methods. For each entry point, the program collects the module name if different from the entry name, the file that the entry point is in, the definition line of the entry point, and the first block of comments, which hopefully documents the function of the routine. The files to be scanned are specified on the command line, and the output is written to a file whose name is requested from the user when the program executes.

When autodoc is run, it will read the command line for files, and if none are found, it will ask you for files. Then, it will ask for an output file. It will then scan the files, and subsequently, it will ask you if you wish to sort the entry points by name. If not, the output will be in the order the files were read. Then it will ask if you want the short form of the listing. The short form is all the information on each entry except the comments. You will then be presented a list of subroutines which have no comments.

CONGEN Test Cases

The test cases may be found in `CGT' (as well as their developmental counterparts). All of these files generate output files which are to be compared with previous runs. In addition, some of the tests will generate other files which have the same file name, and these should be compared too. Scratch files have file names of `FOO' and file types which begin with the file name. For example, `DYNTEST1' generates a number of scratch files named `FOO.DYNTEST1_nn', where nn is the unit number. These files should be deleted when the runs complete. The CPU time listed below is given in minutes for version 2 of CONGEN running on a single CPU of a Silicon Graphics 4D/200 series workstation.

Test cases run on platforms other than the Iris can be found in subdirectories under `$CGT' whose names match the platforms. For example, the Cray test case outputs are found in `$CGT/unicos'.

All the tests are run using the equivalent of the RUNCG command. On Unix machines, there are makefiles in both directories for running the test cases, and on VMS machines, there is a `descrip.mms' file. A target of diffs will make difference files for all the test cases.

The differences are always run through the program, `ndiffpost' (see section ndiffpost -- Numerical Difference Postprocessor), and the output is written to files which have a suffix of `.dif'. The raw difference files are output to files which have a suffix of `.dif.raw'.

File Name    CPU Time* (min)  Purpose
AM94CYCLE    1.9              Test of AMBER94 using a cyclic peptide. Modified from CGCYCLE.
AM94GENER    0.1              Simple generation test for AMBER94.
AM94SPL      2.6              Test of splicing using AMBER94.
AM94TEST1    4.4              Construction of all major residues in the AMBER94 topology file.
AM94TEST2    3.1              Repeat of first AMBER 3 demonstration run, energy calculation of alpha-lytic protease.
AM94TEST3    0.7              Simple conformational search testing AMBER94 energy calculation.
AM94TEST4    0.4              Test minimized ring constructions for AMBER94.
AM94TEST5    0.1              Checks the backbone and sidechain degrees of freedom work correctly at the endpoints of chains and prolines. Modified from CGTEST9.
AM94TEST6    0.6              Test parser errors when AMBER94 is used. Modified from CGTEST1.
AM94TEST7    2.3              Test D amino acid construction. Modified from CGTEST14.
AM94TEST8    2.2              Test AMBER 94 amino acid constructions. Modified from CGTEST15.
AMTEST1      0.4              Amber test 1, check terminal charges, part 1
AMTEST2      0.3              Check terminal charges, part 2
AMTEST3      0.6              Check conformational search with DNA protein complex.
AMTEST4      0.1              Test multi-term torsion term and conformational search.
AMTEST5      0.1              Test hydrogen bond term.
AMTEST6      0.3              Test amino acid construction in conformational search.
AMTEST7      0.2              Test antibody loop construction using AMBER potential.
BRBTEST      0.1              Tests Builder, Newton-Raphson minimization, and vibrational analysis.
CGCYCLE      6.4              Tests construction of cyclic peptides.
CGFIX        0.2              Test fixed atom construction in CONGEN.
CGFIX2       1.2              Test mixture of fixed atom and regular construction in conformational search.
CGHBUILD     0.2              Tests partial sidechain construction in the context of rebuilding hydrogen bonds.
CGMERGE      0.2              Tests merging of conformation files.
CGPARA1      1.8              Tests parallel processing in searching. The time given is the elapsed time.
CGPBE        17.1             Tests use of Poisson-Boltzmann equation with conformational search.
CGPBE2       21.4             Tests parallel implementation of PBE with conformational search.
CGPBE3       3.7              Test parallel conformational search using serial PBE evaluation.
CGRAND       0.3              Tests random node evaluation.
CGRESTART    2.1              Tests restarting when directed searching is done and MIX strategy used. (Currently fails on the CONVEX in malloc. No real idea why).
CGRESTART2   1.6              Repeat of CGRESTART, but without restart step. Output should match CGRESTART except for command processing.
CGRESTART3   6.9              Tests restarting when depth first search is used.
CGRESTART4   6.8              Repeats CGRESTART3 without restarting. Output should match CGRESTART3 except for command processing.
CGTEST1      0.1              Checks the CGEN parser. Many error messages are tested and no conformation file is written.
CGTEST2      0.2              Check ALL and FIRST sidechain construction options
CGTEST3      0.2              Together, CGTEST3 and CGTEST4 check the optimization of the sidechain search for FIRST and ALL in the case where sidechains interact. CGTEST3 has the optimization, whereas CGTEST4 omits it. The CG files generated by both tests should match each other except for the first record, but CGTEST4 should take more CPU time.
CGTEST4      0.7              See CGTEST3.
CGTEST5      0.3              CGTEST5 and CGTEST6 verify that the CLSA optimization used with backbone degrees of freedom works correctly. The CG files should be the same, but CGTEST6 should take longer to get the results.
CGTEST6      0.5              See CGTEST5.
CGTEST7      0.6              Checks the energy calculations in the sidechain degree of freedom.
CGTEST8      0.5              Checks esoterica of CLSA and CLSD options
CGTEST9      1.9              Checks backbone termini processing and handling of prolines in both backbone and chain closure.
CGTEST10     0.9              Checks all sidechain construction options
CGTEST11     0.2              Tests van der Waals avoidance and Nosymmetry options in a single sidechain construction.
CGTEST12     2.6              Test of van der Waals avoidance in context of full search. Iterative option.
CGTEST13     0.9              Similar to CGTEST12 except Independent option used.
CGTEST14     6.4              Test of D amino acid construction and all amino acid sidechains
CGTEST15     6.4              Similar to CGTEST14, except we test the all hydrogen topology file.
CGTEST16     1.1              Simple test of overlapping degrees of freedom.
CGTEST17     0.5              Second test of overlapping degrees of freedom (sidechains).
CGTEST18     0.7              Test of coordinate writing and energy display filters.
CGTEST19     1.9              Test of ALLCISTRANS options.
CGTEST20     0.1              Test of other non-bonded energy calculations.
CGTEST21     0.1              Test RDEPTH search option.
CGTEST22     1.5              Test cavity energy calculation.
CGTEST23     1.7              Test combination of cavity and PBE energies.
CGTEST24     0.5              Test Worst RMS evaluation option.
CGTEST25     0.5              Test SGRID SELECT and AUTO options.
CGTEST26     4.2              Test cavity energy in evaluate degree of freedom.
CONGEN       0.3              A simple conformational search over five residues
CONGEN2      0.4              A two part conformational search over five residues
CORMANTST    0.1              Tests some coordinate manipulations.
CORTST1      0.1              A virtually worthless test of the correlation functions
DELTEST      0.1              Tests deletion by value in the analysis section
DJSTEST      0.1              Tests ABNER
DRAWTEST     0.1              Tests drawing capability of the program.
DYNTEST1     0.2              A series of tests on the dynamics algorithms. Not a complete test. Checks Gear and Verlet algorithms, SHAKE, ability to fix atoms in place. Also checks that the analysis facility can rotate a trajectory with respect to a fixed coordinate set. Some simple checks of dynamics analysis are also present.
GAUSSIAN     6.0              Test of interface to Gaussian 92.
GENERTEST    0.1              Tests some of the generation and patching routines.
GEPOL        0.1              Test GEPOL surface calculation.
GEPOL2       4.9              Test incremental GEPOL options.
  • GEPOL3 @tab 4.8 @tab Another incremental GEPOL test.
  • H2OTST @tab 0.1 @tab Runs a water dimer to convergence and a true minimum. Also tests TLIMIT option.
  • HBCOMP @tab 0.5 @tab A self comparison of hemoglobin. Tests the comparison command in the analysis facility
  • HBMBCOMP @tab 1.4 @tab A comparison of hemoglobin to myoglobin. Tests comparison command and construction of difference tables.
  • ICTEST @tab 0.1 @tab Tests the routines that deal with internal coordinates.
  • IMH2OTEST @tab 0.4 @tab Water with periodic boundaries
  • IMPTEST0 @tab 0.1 @tab Inducible multipole code, volume test.
  • IMPTEST1 @tab 0.1 @tab Inducible multipole code, dimer test.
  • IMPTEST2 @tab 0.1 @tab Small molecule test, analysis interface.
  • IMST2TEST @tab 0.5 @tab ST2 water with periodic boundaries.
  • IMTEST @tab 0.1 @tab Checks Images for a small system with C2 symmetry.
  • JTEST1 @tab 0.2 @tab J coupling calculations on one leucine.
  • JTEST2 @tab 0.2 @tab Ensemble averaging of J coupling calculations on two leucines.
  • JTEST3 @tab 0.2 @tab J coupling calculations on one leucine with J errors.
  • JTEST4 @tab 0.3 @tab Ensemble averaging of J coupling calculations on two leucines with joining with convergence tests.
  • JTEST5 @tab 0.1 @tab Four leucine J coupling, ensemble averaging test with real data.
  • NANATST1 @tab 0.8 @tab Tests most of the features of the analysis facility
  • NANATST2 @tab 0.7 @tab Tests more features of the analysis facility
  • NANATST3 @tab 1.5 @tab Tests the dynamic properties in the analysis facility
  • NOETEST @tab 0.2 @tab Tests NOE constraint calculations and calculation of energy derivatives.
  • NOETEST2 @tab 0.1 @tab Test NOE ensemble averaging on a three atom system
  • NOETEST3 @tab 0.1 @tab Test NOE ensemble averaging on a four atom system
  • NOETEST4 @tab 0.5 @tab Test NOE code on a larger system.
  • NOETEST5 @tab 0.1 @tab Test NOE code on beta hairpin using real data.
  • PBETEST @tab 9.1 @tab Test Poisson-Boltzmann electrostatics.
  • PBETEST2 @tab 1.5 @tab More PBE testing. Thorough testing of options.
  • PBETEST3 @tab 0.1 @tab Test of dielectric smoothing.
  • PBETEST4 @tab 0.1 @tab Test of dielectric cavity
  • PBETEST5 @tab 0.3 @tab Test of cavity in a Debye-Huckel fluid.
  • PBETEST6 @tab 4.2 @tab Test of molecular surface usage in PBE code.
  • PBETEST7 @tab 5.5 @tab Test of charge anti-aliasing
  • PBETEST8 @tab 1.1 @tab Test of dielectric smoothing in a protein.
  • PBETEST9 @tab 0.1 @tab Test of dielectric combination rules
  • PBETEST10 @tab 1.6 @tab Test of dielectric constant modification based on accessible surface.
  • PBETEST11 @tab 0.2 @tab Test of margin option.
  • PBETEST12 @tab 3.9 @tab Margin and origin test using BPTI
  • PDBTEST1 @tab 0.4 @tab Test #1 of Brookhaven Data Bank reading. Read tendamistat.
  • PDBTEST2 @tab 0.7 @tab Test #2 of Brookhaven Data Bank reading. Read Fab KOL.
  • READTEST @tab 0.1 @tab Incomplete test of coordinate reading.
  • READTEST2 @tab 0.1 @tab Test of sequence reading by atom.
  • READTEST3 @tab 0.1 @tab IDREAD reading in PDB.
  • READTEST4 @tab 0.1 @tab Test of alternate coordinate reading in PDB I/O.
  • READTEST5 @tab 0.1 @tab Test of alternate model reading in PDB I/O.
  • SEARCHNOE @tab 0.3 @tab Tests conformational search with NOE's and also runs some simple tests of All Hydrogen construction.
  • SPHERE @tab 2.7 @tab Rudimentary test of sphere drawing.
  • ST2TEST @tab 0.2 @tab ST2 water without boundary conditions.
  • SURFTST @tab 0.2 @tab Checks the accessible surface calculation
  • TEST @tab 0.1 @tab Short test that hits a lot of stuff. Must always be run.
  • TESTCONS @tab 0.7 @tab Tests the harmonic constraints.
  • TESTCONS2 @tab 1.9 @tab Tests the interaction of dihedral and J coupling constraints with the conformational search.
  • TESTHB @tab 0.1 @tab Test hydrogen bond calculations.
  • TESTHOM @tab 1.7 @tab Test homology finding code.
  • TESTPARM @tab 0.1 @tab Test AMBER parameter reading code.
  • TESTRTF @tab 0.4 @tab Tests the RTF I/O commands, and a simple test of PEER output
  • TESTRTF2 @tab 0.1 @tab Test of charge generation in the RTF code.
  • TESTRTF3 @tab 0.1 @tab Test automatic generation code on three and four membered rings.
  • TESTSEL @tab 0.4 @tab Tests the atom selection routines and use of wildcards in commands in the analysis section
  • TESTSPL @tab 0.2 @tab Tests SPLICE command
  • TRANSFORM @tab 1.4 @tab Tests coordinate transformation commands.
  • TWIST @tab 0.1 @tab Tests TWIST command in the analysis facility
  • VIBRTST @tab 0.1 @tab Tests vibrational analysis

  * The CPU time is for code not optimized by the compiler.

    Modifications to CONGEN

    The following steps should be taken when making a change to CONGEN. They are intended to ensure that the change will be maintained in the future and does not unwittingly affect other program functions.

    1. If you have not already done so, establish a directory of your own for working on the source, and set up a symbolic link to the source RCS directory, `$CGS/RCS'.
    2. Make your modifications and debug them. Please follow the guidelines in section Standards (Rules) for Writing CONGEN Code, so that the code will be consistent. Use either make (on Unix systems) or MMS (on VMS systems) to rebuild the program.
    3. Run the standard test case and conformational search test case and compare them. On a Unix machine using the C-shell, do the following:
      cd $CGT
      make test.dif congen.dif
      more test.dif congen.dif
      
      On VMS, do the following:
      $ SET DEFAULT CGT:
      $ MMS TEST.DIF,CONGEN.DIF
      $ TYPE TEST.DIF
      $ TYPE CONGEN.DIF
      
      The files should be identical except for the first four lines, version numbers or locations of files, and the last few lines giving the free list on the heap. If they are different in any other way, you must be able to prove that the results are correct. If you change any commands, the test case must be modified so that it will give the same results as before if possible. If you cannot duplicate the test case, you must eliminate your changes.
    4. Run all the test cases in `CGT'. Use either make diffs on Unix machines, or MMS DIFFS on VMS machines. Any significant changes must be accounted for.
    5. If your modification involves a new feature, you must either modify an existing test or make a new test to demonstrate and check its operation. See section CONGEN Test Cases, for a description of the tests currently available. If you add a new test, please update that node. WARNING: Any additions made without this will stop working as the entropy of programming randomizes your code without detection.
    6. Check in your change (using the RCS ci command), and enter a good descriptive log entry for what you have done.
    7. If your change involves adding or modifying a command or adding or modifying a feature, modify existing documentation or if none is available, make new documentation. Recreate the INFO file and the manual using the makefile in `CGD'.
    8. If you modify or add new energy functions, use the TEST command, see section TEST Command -- Test Internal Functions, to verify that the derivatives of your energy calculations are correct.
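The comparison described in step 3 can be partly automated. The following is a minimal sketch and is not part of the CONGEN distribution: the function names `strip_volatile` and `compare_runs`, and the default of five trailing lines for the heap free list, are assumptions made for illustration. It drops the first four lines and a caller-chosen number of trailing lines from each output before running diff. Differences in version numbers or file locations in the body of the output will still be reported and must be judged by eye.

```shell
#!/bin/sh
# Hypothetical helpers; these names are not part of CONGEN.

# strip_volatile FILE SKIP DROP
# Print FILE without its first SKIP lines and its last DROP lines.
strip_volatile() {
    awk -v skip="$2" -v drop="$3" '
        { line[NR] = $0 }
        END { for (i = skip + 1; i <= NR - drop; i++) print line[i] }
    ' "$1"
}

# compare_runs OLD NEW [DROP]
# Diff two test outputs, ignoring the first four lines and the last DROP
# lines (default 5, an assumed length for the heap free list report).
# Returns the exit status of diff: 0 if the remaining lines are identical.
compare_runs() {
    drop_lines=${3:-5}
    tmp_old=$(mktemp) || return 2
    tmp_new=$(mktemp) || return 2
    strip_volatile "$1" 4 "$drop_lines" > "$tmp_old"
    strip_volatile "$2" 4 "$drop_lines" > "$tmp_new"
    diff "$tmp_old" "$tmp_new"
    status=$?
    rm -f "$tmp_old" "$tmp_new"
    return $status
}
```

A call such as `compare_runs reference.out new.out 6` (both file names are placeholders) prints nothing and returns zero when the two runs agree outside the ignored lines.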

    Making New Versions of CONGEN

    This section describes the steps in generating a new version of the program. It is incomplete and constantly in flux, and is left as a guide for future work on the process of generating new versions.

    1. Make sure the version number and date in the opening output of CONGEN.FLX are correct for this new version.
    2. Relink CONGEN if necessary.
    3. Redo a make depend in those directories where it is supported.
    4. Run all the test cases and compare against previous versions.
    5. Recompile the program with optimization, and compare results again.
    6. Rebuild the documentation.
    7. Clean up all directories of garbage.
    8. Make the tar files for distribution.
    9. Do a global setting of symbolic version number for this release.
    10. Backup the directory tree for posterity.
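Step 9 assigns a symbolic version name to every file in the release. As a hedged illustration only: RCS symbolic names may not contain periods, so a version number such as 2.1 has to be mapped to a legal name first. The helper `rcs_symbol` below and its `V` prefix are assumptions, not a naming scheme taken from CONGEN.

```shell
#!/bin/sh
# Hypothetical helper; the naming scheme is an assumption, not CONGEN's.

# rcs_symbol VERSION
# Turn a dotted version number into a name RCS accepts as a symbol,
# e.g. 2.1 becomes V2_1 (periods are not allowed in RCS symbolic names).
rcs_symbol() {
    printf 'V%s\n' "$1" | tr '.' '_'
}

# Possible use, run from a working directory linked to the RCS directory.
# The form "-nNAME:" attaches NAME to the latest revision of each file.
#   rcs -n"$(rcs_symbol 2.1)": RCS/*,v
```

The `rcs` line is left as a comment because the actual release procedure may differ from this sketch.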

    Installation of CONGEN on VMS

    The installation of CONGEN on VAX/VMS is a very straightforward process. The files are organized so that CONGEN can be installed by either a system manager or an individual user without privilege. There are only a few steps to be taken:

    1. The tape on which CONGEN is shipped contains a single saveset, CONGEN.BAC. Restore the tape while preserving the directory structure into a directory of your own choosing, e.g.
      $ BACKUP MUA0:CONGEN.BAC [CONGEN...]
      
      You will need about 100000 blocks to restore the saveset.
    2. Modify the file, `[CONGEN.V2]CGDEFS.COM', to reflect your own directory structure.
    3. Change either the system site specific startup file, `SYS$MANAGER:SYSTARTUP.COM', or your `LOGIN.COM' file to include a call to `CGDEFS' to set up logical names. Use an argument of SYSNAM or JOBNAM as appropriate.
    4. Change either the system `LOGIN.COM' file or your `LOGIN.COM' file to include a call to CGDEFS to define commands, as follows: `@CG:CGDEFS COMMANDS'
    5. You may wish to delete rarely used files such as the bulk of the test cases or source code object files.
    6. Copy the INFO files (`congen', `congen-*', `flecsdoc', and `flecsdoc-*') in `CGD:' into the GNU Emacs manual directory and modify the INFO directory file so GNU Emacs can access the CONGEN documentation.
    7. It is helpful if GNU Emacs is installed so you can read the documentation online.

    Installation of CONGEN on UNIX

    The installation of CONGEN on UNIX is a very straightforward process. The files are organized so that CONGEN can be installed by either a system manager or an individual user without privilege. There are only a few steps to be taken:

    1. CONGEN is provided as a single gzip'ped tar file (you can get gzip from the Free Software Foundation by ftp to `prep.ai.mit.edu' or one of its mirrors). Restore the tar file while preserving the directory structure into a directory of your own choosing, e.g.
      zcat congen.tar.gz | tar xvfo -
      
      You will need about 600000 blocks to restore the tar file.
    2. Modify the files, `./congen/cgdefs' and `./congen/cgprofile', to reflect your own directory structure and machine usage. The files, `cgdefs' and `cgprofile', contain some special code for the definition of CGROOT which is used at our site to allow us to have two copies of CONGEN on different machines, and to switch based on the machines' availability. You can remove all the conditionals, and simply give it a definition. The machine designation is more complicated. The program, `./congen/setcgplatform', will determine what machine you are running on. For most machines, there is only one possible designation, but on the Silicon Graphics computers, there are many possibilities based on the type of processor, operating system, compiler, instruction set architecture, and application binary interface. The SGI machine string is encoded as sgi_r<chip>_i<O.S>_c<compiler>_m<architecture>_a<abi>, where <O.S> is the IRIX version and <architecture> is the MIPS instruction set level. The choices for each part are as follows:
      chip: 3k, 4k, 5k, 8k, 10k
      irix: 5.3, 6.1, 6.2, 6.3, 6.4
      compiler: 5.3, 6.1, 6.2, 7.0, 7.1, 7.2
      MIPS: 1, 2, 3, 4, 5
      abi: 32, n32, 64
      Most distributions of CONGEN contain only the most compatible version, which is found under the machine directory, sgi_r3k_i5.3_c5.3_m1_a32. See section Making CONGEN, for a description of how to build other versions of CONGEN which will give better performance on the newer architectures. The environment variable, CGPLATFORM, specifies which machine directory to use. The Perl program, ./congen/matchdir, finds the closest compatible machine directory for the environment you are running on. You can override the automatic selection by setting CGPLATFORM prior to the execution of cgdefs or at the beginning of cgdefs.
    3. Change either the system wide profile or `.cshrc' file or your own profile or `.cshrc' file to source `cgdefs' or `cgprofile'. The file, `cgdefs', is for the C shell, and the file, `cgprofile', is for the Bourne or Korn shells. Once these files are executed, all of the commands will work.
    4. Change your default directory to `$CGT', and run two tests as follows:
      make test.dif congen.dif
      
      Examine those two files. The only differences you should see are file names, version numbers, dates, allocations in the heap, and execution times.
    5. You may wish to delete rarely used files such as the bulk of the test cases or source code object files.
    6. Copy the INFO files (`congen', `congen-*', `flecsdoc', and `flecsdoc-*') in `$CGD' into the GNU Emacs manual directory and modify the INFO directory file so GNU Emacs can access the CONGEN documentation.
    7. It is helpful if TeX is installed. This will allow you to modify the documentation.
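The SGI machine string described in step 2 can be assembled mechanically from its five parts. The following is a small sketch for the Bourne or Korn shell; the function `sgi_platform` is a hypothetical helper and is not one of the programs shipped with CONGEN. It only shows how the parts combine into a directory name that could be assigned to CGPLATFORM to override the automatic selection.

```shell
#!/bin/sh
# Hypothetical helper; not part of the CONGEN distribution.

# sgi_platform CHIP IRIX COMPILER MIPS ABI
# Compose an SGI machine directory name from its five components.
sgi_platform() {
    if [ $# -ne 5 ]; then
        echo "usage: sgi_platform chip irix compiler mips abi" >&2
        return 1
    fi
    printf 'sgi_r%s_i%s_c%s_m%s_a%s\n' "$1" "$2" "$3" "$4" "$5"
}

# The most compatible build named in the text:
CGPLATFORM=$(sgi_platform 3k 5.3 5.3 1 32)
export CGPLATFORM
echo "$CGPLATFORM"    # prints sgi_r3k_i5.3_c5.3_m1_a32
```

C shell users would instead write `setenv CGPLATFORM sgi_r3k_i5.3_c5.3_m1_a32` before sourcing `cgdefs'.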

