The availability of crystallographic or NMR information regarding the active site of interest introduces the possibility of using fast docking methods for compound selection or filtering of virtual combinatorial libraries. The various algorithms for generating possible binding modes have been reviewed recently (27), and the technical details will not be repeated here. Currently, the most popular programs for high-throughput ligand docking are DOCK (28), FlexX (29), and Gold (30). These tools apply diverse technologies to the problem, i.e., rapid shape matching, incremental construction, and genetic algorithms, respectively. While most of the prominent tools for docking are adept at finding appropriate poses for a given ligand, considerable issues remain for rank-ordering a variety of possible ligands (31,32). Many researchers have come to rely upon consensus scoring methods to aid in the actual compound selection (33-36). Alternatively, one may derive 3D database queries from active-site features and steric constraints for use with programs such as Unity.
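The consensus scoring idea mentioned above can be illustrated with a small rank-by-vote sketch. This is only one of several possible schemes and is not necessarily the one used in the cited work (33-36). The function name `consensus_rank`, the `top_fraction` parameter, the compound identifiers, the scores, and the convention that lower scores are better are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

def consensus_rank(
    scores: Dict[str, Dict[str, float]],
    top_fraction: float = 0.5,
) -> List[Tuple[str, int]]:
    """Rank compounds by how many scoring functions place them in their top fraction.

    `scores` maps a scoring-function name to {compound_id: score}.
    Assumption: a lower (more negative) score means a better predicted fit.
    """
    votes: Dict[str, int] = {}
    for table in scores.values():
        ranked = sorted(table, key=table.get)          # best (lowest) score first
        n_top = max(1, int(len(ranked) * top_fraction))
        for cid in ranked:
            votes.setdefault(cid, 0)                   # every compound appears in the result
        for cid in ranked[:n_top]:
            votes[cid] += 1                            # one vote per function that likes it
    # Most votes first; ties broken alphabetically for a stable order
    return sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical scores from three unnamed scoring functions
scores = {
    "score_a": {"c1": -9.0, "c2": -7.5, "c3": -4.0, "c4": -8.0},
    "score_b": {"c1": -30.0, "c2": -12.0, "c3": -25.0, "c4": -28.0},
    "score_c": {"c1": -6.1, "c2": -6.5, "c3": -2.0, "c4": -5.9},
}
ranking = consensus_rank(scores)
print(ranking)
```

Voting on ranks rather than averaging raw scores avoids having to put scoring functions with different units and ranges on a common scale, which is one reason such schemes are attractive for compound selection.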
Sequential filters - Most virtual screening experiments that utilize active site information involve several stages of increasing complexity: large compound collections or virtual combinatorial libraries can be pre-filtered for desirable molecular properties or pharmacophoric elements before sophisticated docking techniques are applied.
Inhibition of carbonic anhydrase (CA) remains a target for the treatment of glaucoma, and several X-ray structures of potent inhibitors are available. The consistency in the binding site features attracted Gruneberg et al. to apply sequential filtering techniques to find novel leads in this mature therapeutic area (37). Significant pre-evaluation of the active site features included consideration of displaceable waters and receptor site probing for favorable areas of interaction. The search database consisted of ca. 100,000 compounds from the Maybridge and LeadQuest (38) compound collections; known CA inhibitors were included for validation. The database was first filtered for rule-of-five compliance (39) and the presence of precedented zinc-binding groups, using a 2D Unity search, with 5904 compounds passing these filters. The results of the receptor site probing were then combined with features of known inhibitors to define pharmacophoric centers. The derived Unity 3D query included two acceptors, one donor, and adjacent hydrophobe spheres to approximate an elliptical shape. The flexible database search retrieved 3314 compounds, which were ranked against two known inhibitors using computed similarity and volume superimposition to approximate active site volume. The 100 best-ranked hits were then docked into the active site using FlexX. Visual inspection and consideration of the various scores were used to select 13 compounds for biological testing via a photometric assay. Three of the compounds were subnanomolar (e.g., 13 and 14), one was nanomolar, and seven were micromolar inhibitors; although all of the hits are of the well-precedented sulfonamide class, they are not covered by existing patents.
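The first, inexpensive stages of such a funnel can be sketched as a chain of predicates applied in order, with the survivor count recorded after each stage. The following is a minimal illustration and not the Unity workflow used in the study: the `Compound` record, its precomputed property fields, the `has_zinc_binder` flag, and the `max_violations` tolerance of one rule-of-five violation are all assumptions, and the example compounds are invented.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple

@dataclass(frozen=True)
class Compound:
    cid: str
    mol_weight: float
    logp: float
    h_donors: int
    h_acceptors: int
    has_zinc_binder: bool  # assumed precomputed, e.g. by a 2D substructure search

def rule_of_five_violations(c: Compound) -> int:
    """Count violations of the four rule-of-five thresholds."""
    return sum([c.mol_weight > 500, c.logp > 5, c.h_donors > 5, c.h_acceptors > 10])

def passes_rule_of_five(c: Compound, max_violations: int = 1) -> bool:
    # Assumption: tolerate at most one violation; the study's exact criterion is not given here
    return rule_of_five_violations(c) <= max_violations

Stage = Tuple[str, Callable[[Compound], bool]]

def run_funnel(
    compounds: Iterable[Compound], stages: List[Stage]
) -> Tuple[List[Compound], List[Tuple[str, int]]]:
    """Apply filters in order and record how many compounds survive each stage."""
    survivors = list(compounds)
    counts = [("input", len(survivors))]
    for name, predicate in stages:
        survivors = [c for c in survivors if predicate(c)]
        counts.append((name, len(survivors)))
    return survivors, counts

# Invented example library
library = [
    Compound("a", 320.4, 1.2, 2, 5, True),
    Compound("b", 612.7, 6.3, 1, 4, True),    # two violations
    Compound("c", 280.3, 2.1, 1, 3, False),   # no zinc-binding group
    Compound("d", 540.0, 3.0, 2, 6, True),    # one violation, tolerated
    Compound("e", 410.5, 4.8, 6, 11, False),  # two violations
]
stages = [
    ("rule_of_five", passes_rule_of_five),
    ("zinc_binding_group", lambda c: c.has_zinc_binder),
]
survivors, counts = run_funnel(library, stages)
print(counts)
```

Ordering the stages from cheapest to most expensive means that costly steps, such as flexible 3D searching and docking, only see the compounds that have already passed the simpler tests.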
Structure-aided library design - A variety of techniques were combined in library design efforts to identify leads for the malarial aspartyl protease plasmepsin II (Plm II) (40). Owing to a 35% sequence similarity to Cathepsin D, a library previously designed for Cathepsin D was screened for Plm II activity. Of 1039 compounds, 13 showed >50% inhibition at 1 µM in a fluorogenic peptide substrate assay. Compounds 15 and 16 were resynthesized in purer form and were shown to be submicromolar inhibitors of Plm II (300 and 220 nM, respectively).
Six iterations of library design were conducted to optimize these leads, using three sidechain variation sites and two methods of monomer selection. For each optimization site, the selected reagents from the ACD collection were first filtered for synthetic considerations, molecular weight, and acceptable functionalities. One mode of sidechain selection was based on diversity metrics and hierarchical clustering. Alternatively, the scaffold was anchored in the active site of the X-ray crystal structure of Plm II (from a complex with pepstatin), and the monomers were "grown" into available space by iterative attachment of their constitutive fragments, with consideration of preferred torsions during the growth process. At each layer of growth, the existing portions were minimized and ranked, with the 25 top-scoring pieces advancing to the next growth stage. At the conclusion of growth, the best-scoring pose of each molecule was saved for comparison to others. The best compounds were evaluated for conformational accessibility and hydrogen-bonding potential; a subset of these was selected for hierarchical clustering, and the best-scoring compound from each cluster was selected. Improvements in activity were observed throughout the library iterations, with SAR that could be rationalized in terms of the modeled structures. This iterative procedure eventually produced 17, with a Ki of 4.3 nM, 15-fold selectivity over Cathepsin D, and significant improvements in molecular weight and ClogP. One notable complication was the difficulty in modeling sidechains in the tight S1' and S2' subsites. The authors presumed some induced fit capability on the part of the enzyme, which was substantiated by a subsequent X-ray structure of Plm II with one of the potent analogues.
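The layer-by-layer growth step, in which only the top-scoring partial structures advance, has the shape of a beam search. The sketch below shows that control flow only. It omits the 3D placement, torsion sampling, and minimization described above, and it replaces the real scoring function with an invented additive table (`toy_energy`). The function name `grow_with_beam`, the fragment labels, and the lower-is-better convention are assumptions.

```python
from typing import Callable, List, Sequence, Tuple

Partial = Tuple[str, ...]  # a partially grown sidechain as an ordered tuple of fragments

def grow_with_beam(
    layers: Sequence[Sequence[str]],
    score: Callable[[Partial], float],
    beam_width: int = 25,
) -> List[Partial]:
    """Grow structures one fragment layer at a time, keeping the best `beam_width`.

    Assumption: lower scores are better. Ties keep their generation order
    because Python's sort is stable.
    """
    beam: List[Partial] = [()]  # start from the bare anchored scaffold
    for options in layers:
        # Extend every surviving partial structure with every fragment in this layer
        candidates = [partial + (frag,) for partial in beam for frag in options]
        candidates.sort(key=score)
        beam = candidates[:beam_width]  # only the top-scoring pieces advance
    return beam

# Invented per-fragment contributions standing in for a real scoring function
toy_energy = {"CH2": -1.0, "NH": -2.0, "O": -0.5,
              "phenyl": -3.0, "methyl": -1.5, "OH": -2.5}

def toy_score(partial: Partial) -> float:
    return sum(toy_energy[frag] for frag in partial)

layers = [["CH2", "NH", "O"], ["phenyl", "methyl", "OH"]]
best = grow_with_beam(layers, toy_score, beam_width=2)
print(best)
```

Pruning at every layer keeps the number of structures to evaluate roughly proportional to the beam width times the number of fragment options, rather than growing combinatorially with the number of layers. The trade-off is that a partial structure that scores poorly early on is discarded even if it would have led to a good final molecule.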
Incorporating active site flexibility - The previous two examples highlight the value of having multiple structures of the active site. In the case of carbonic anhydrase, multiple structures were available to validate the consistency of the active site, while library design for Plm II was complicated by the lack of sufficient structural information at the outset. In the following examples, investigators were able to use structural information to effectively increase the number of valid binding possibilities.
Schapira et al. incorporated a generalized hypothesis for nuclear hormone receptor antagonism as a starting point for the identification of novel ligands for retinoic acid receptor-α (RARα) (41). Using the X-ray structure of the ligand-binding domain of the estrogen receptor-α (ERα) complexed with the antagonist tamoxifen as a guide, the authors repositioned the C-terminal helix of the RARα structure to create an antagonist-appropriate active site. A grid potential representation of the binding site was then used for a flexible ligand search of the ACD (153,000 compounds) using the program MolSoft (42). A generous scoring cutoff was used to select over 700 of these hits for minimization within the intact active site, with receptor side-chain and ligand flexibility included. Of the 500 top-scoring hits from this optimization step, 32 were selected via inspection for biological testing. Two novel antagonists, 18 and 19, showed inhibitory activity of 55% and 33% at 20 µM in an in vitro screen. The authors relied upon intuitive inspection of the results (i.e., retention of a particular hydrogen-bonding pattern and quality of fit) to guide their selection rather than relying purely upon scoring functions.