Hybrid Code Description
This is a brief description the use of the program hybrid_exp3, and of the program's input and output files.
INPUT FILE
A sample input file, his14.dat, is included with the program. The first line is an integer giving the number of ionizable groups, N. The rest of the file is a series of multi-line descriptions for each group. For each description, the first line is the group's initial pKa, pKA(i); followed by z(i) (+1 for a base, -1 for an acid); then the group's self-energy, G(i), in kcal/mole; followed by the group's index (1...i...N). The self energy is actually the additional work of ionizing the group in its actual position in a protein, with all other ionizable groups assumed neutral. The group's "intrinsic pKa" is therefore pKa(i) - z(i) G(i)/(2.303RT).
The next set of lines gives the interaction energy, G(ij), in kcal/mole, of the ionized form of the group i with the other ionizable groups whose indices j are greater than i. Therefore, in the sample file with fourteen groups, the first group has 13 interaction energies listed; the second has 12; and so forth. This yields a symmetric interaction matrix for all the groups.
RUNNING THE PROGRAM
The program first requests a maximum cluster size. Ten is a good starting number. Forcing small clusters will tend to decrease accuracy, because inter-cluster interactions are treated less accurately than intra-cluster interactions. However, larger clusters lead to slower calculations.
The program then asks for "fixed cutoff theta", tcut. Theta is fractional ionization. The program needs to know the criterion for a "fixed" charge. Any group which, for this pH, will never have a fractional ionization greater than tcut, or will never have a fractional charge less than 1-tcut, will be set as "fixed". Such groups are treated as single-group clusters. A reasonable value for tcut is 0.05, following the reduced site method of Bashford and Karplus.
The program then asks for a starting pH, stopping pH, and pH step for the calculations. This permits generation of full titration curves.
You must then provide the names of the input file-- for example, his14.dat, which is provided-- and output file, which will be created.
The program then runs interactively. For each pH, it will print out the number of clusters, the number of groups in the largest cluster, and the largest value of Gc which was needed to subdivide the groups into clusters meeting your maximum cluster size limit. If every group turns out to be fixed, a note to this effect is generated.
The program then asks for another input file, so you may go straight to another input file, using the same calculation parameters.
Don't worry about 999's in the output. They simply indicate that all groups are "fixed", so there are no groups in "clusters".
OUTPUT FILE
The output file begins with the number of pHs examined. The top section of the file then provides for each pH: the average net system charge; the ionization energy, as defined in the reference; the difference between the calculated ionization energy and the ionization energy which would be obtained for the same set of groups in the absence of the perturbing protein environment; the approximate CPU time for the calculation; the largest value of Gc; the value of tcut; and the mean purely electrostatic energy, where the average is taken over all the ionization states.
The next section of the file gives computed pKas for all groups whose pKas are indeed bracketed by the range of pHs you requested. (Groups whose pKas are outside the pH range are given pKas of 999 to flag them.) For each group, the pKa is computed by linear interpolation between the two pHs which bracket a group charge of 0.5 for bases,
-0.5 for acids. Thus the precision depends a bit upon the pH step you use: the smaller the step, the better the estimate.
The lower part of the file provides the detailed average charges. It begins with the number of groups. The for each pH, the file gives, the pH, followed by the average charge of each group.
INITIAL TESTING
Test the program by sourcing the hereis file his14.inp, after you have compiled the program. The run will terminate with an error message... don't worry about it. The output
file, his14.out.test, should match the file his14.out, except for the CPU times. Also, if you redo the calculations with a maximum cluster size of two, the results should agree with those in Table II of the reference.
If you have problems with the code, I can try to help out: mgilson@ucsd.edu
REFERENCE:
Gilson,M.K. Multiple-Site Titration and Molecule Modelling: Two Rapid Methods for Computing Energies and Forces for Ionizable Groups in Proteins. Proteins: Structure, Function and Genetics, 15:266-282, 1993.