This page provides SDfiles for the compounds used in Robert Jorissen and Mike Gilson's paper on the use of a Support Vector Machine method for compound screening. These files are freely available for academic, commercial, or personal use. We do ask that you cite our reference in any publication that uses this information: Virtual Screening of Molecular Databases Using a Support Vector Machine, Jorissen & Gilson, J. Chem. Inf. Model. 45:549-561, 2005, DOI 10.1021/ci049641u.
This download contains:
Each SDfile begins with 125 known binders: 25 compounds for each protein target. The targets are as follows:
The subsequent compounds are "decoys" from the National Cancer Institute diversity set. Note that the decoys in compounds_1ST.sdf are the same as those in compounds_ODD.sdf, and the decoys in compounds_2ND.sdf are the same as those in compounds_EVEN.sdf.
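Because the binders and decoys are distinguished only by their position in each file, a short script can be used to separate them. The following is a minimal sketch in plain Python, not part of the original distribution. It relies on the standard SDfile convention that each record ends with a `$$$$` line and that the first line of a record is the molecule title. The function names are hypothetical, and grouping the binders into consecutive blocks of 25 per target is an assumption about ordering that this page does not state explicitly.

```python
from typing import Iterator, List, Tuple

N_BINDERS = 125          # known binders at the start of each SDfile (from this page)
BINDERS_PER_TARGET = 25  # binders per protein target (from this page)


def iter_sd_records(text: str) -> Iterator[str]:
    """Yield each record of an SDfile; a record ends at a '$$$$' line."""
    lines: List[str] = []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            yield "\n".join(lines)
            lines = []
        else:
            lines.append(line)


def split_binders_and_decoys(text: str) -> Tuple[List[str], List[str]]:
    """Return (binders, decoys): the first 125 records, then the remainder."""
    records = list(iter_sd_records(text))
    return records[:N_BINDERS], records[N_BINDERS:]


def group_binders_by_target(binders: List[str]) -> List[List[str]]:
    """Chunk binders into blocks of 25.

    ASSUMPTION: binders for each target are stored contiguously;
    verify this against the target list before relying on it.
    """
    return [
        binders[i:i + BINDERS_PER_TARGET]
        for i in range(0, len(binders), BINDERS_PER_TARGET)
    ]


def record_title(record: str) -> str:
    """Return the molecule title, i.e. the first line of an SD record."""
    return record.split("\n", 1)[0].strip()
```

To use it, read a file such as compounds_1ST.sdf into a string and pass that string to `split_binders_and_decoys`. Comparing the decoy titles from two files (for example compounds_1ST.sdf and compounds_ODD.sdf) is one way to confirm that they share the same decoys, provided the title lines are populated.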
The NCI diversity set compounds were filtered and prepared as follows: