Structure validation: Difference between revisions

Conformation (dihedrals): protein & RNA: added wikipedia page on rotamer library
Added free to read link in citations with OAbot #oabot
Line 13:
Macromolecular crystallography was preceded by the older field of small-molecule [[X-ray crystallography]] (for structures with fewer than a few hundred atoms). Small-molecule [[diffraction]] data extend to much higher [[Resolution (electron density)|resolution]] than is feasible for macromolecules, and there is a very clean mathematical relationship between the data and the atomic model. The residual, or R-factor, measures the agreement between the experimental data and the values back-calculated from the atomic model. For a well-determined small-molecule structure the R-factor is nearly as small as the uncertainty in the experimental data (well under 5%). That one test by itself therefore provides most of the validation needed, but a number of additional consistency and methodology checks are performed by automated software<ref>{{Cite journal | vauthors = Spek AL |year=2003 |title=Single-crystal structure validation with the program PLATON |journal=Journal of Applied Crystallography |volume= 36 |pages=7–13 |doi=10.1107/S0021889802022112|doi-access=free }}</ref> as a requirement for small-molecule crystal structure papers submitted to the [[International Union of Crystallography]] (IUCr) journals such as [[Acta Crystallographica]] section B or C.
Atomic coordinates of these small-molecule structures are archived and accessed through the [[Cambridge Structural Database]] (CSD)<ref>{{cite journal | vauthors = Allen FH | title = The Cambridge Structural Database: a quarter of a million crystal structures and rising | journal = Acta Crystallographica Section B | volume = 58 | issue = Pt 3 Pt 1 | pages = 380–8 | date = June 2002 | pmid = 12037359 | doi = 10.1107/S0108768102003890 | doi-access = free }}</ref> or the [[Crystallography Open Database]] (COD).<ref>{{cite journal | vauthors = Gražulis S, Chateigner D, Downs RT, Yokochi AF, Quirós M, Lutterotti L, Manakova E, Butkus J, Moeck P, Le Bail A | display-authors = 6 | title = Crystallography Open Database - an open-access collection of crystal structures | journal = Journal of Applied Crystallography | volume = 42 | issue = Pt 4 | pages = 726–729 | date = August 2009 | pmid = 22477773 | pmc = 3253730 | doi = 10.1107/s0021889809016690 }}</ref>
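
As a rough illustration of the residual described above, the conventional R-factor can be written as the sum of the absolute differences between observed and calculated structure-factor amplitudes, divided by the sum of the observed amplitudes. The following is a minimal sketch under stated assumptions, not the procedure of any particular refinement program: the function name <code>r_factor</code> and the sample amplitudes are hypothetical, and the calculated amplitudes are assumed to be already on the same scale as the observed ones.

```python
def r_factor(f_obs, f_calc):
    """Conventional residual: sum(||Fobs| - |Fcalc||) / sum(|Fobs|).

    Assumes f_calc has already been scaled to f_obs (a simplification;
    real programs fit a scale factor first).
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    denominator = sum(abs(fo) for fo in f_obs)
    if denominator == 0:
        raise ValueError("sum of observed amplitudes is zero")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    return numerator / denominator


# Hypothetical amplitudes, chosen only to make the arithmetic easy to follow.
f_obs = [100.0, 50.0, 25.0, 25.0]
f_calc = [98.0, 51.0, 26.0, 24.0]
print(f"R = {r_factor(f_obs, f_calc):.3f}")  # differences sum to 5, amplitudes to 200
```

With these made-up numbers the residual is 5/200, or 2.5%, which falls in the "well under 5%" range quoted above for a well-determined small-molecule structure.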
 
The first macromolecular validation software was developed around 1990, for proteins. It included Rfree [[cross-validation (statistics)|cross-validation]] for model-to-data match,<ref name="Rfree">{{cite journal | vauthors = Brünger AT | title = Free R value: a novel statistical quantity for assessing the accuracy of crystal structures | journal = Nature | volume = 355 | issue = 6359 | pages = 472–5 | date = January 1992 | pmid = 18481394 | doi = 10.1038/355472a0 | author-link = Axel T. Brunger | bibcode = 1992Natur.355..472B | s2cid = 2462215 }}</ref> bond length and angle parameters for covalent geometry,<ref name="Engh">{{cite journal |vauthors=Engh RA, Huber R |year=1991 |title=Accurate bond and angle parameters for X-ray protein structure refinement |journal=Acta Crystallographica A |volume=47 |issue=4 |pages=392–400|doi=10.1107/s0108767391001071 }}</ref> and sidechain and backbone conformational criteria.<ref name="Ponder&Richards">{{cite journal |vauthors=Ponder JW, Richards FM |year=1987 |title=Tertiary templates for proteins. Use of packing criteria in the enumeration of allowed sequences for different structural classes |journal=Journal of Molecular Biology |volume=193 |issue=4 |pages=775–791 |doi=10.1016/0022-2836(87)90358-5|pmid=2441069 }}</ref><ref name="procheck">{{cite journal |vauthors=Laskowski RA, MacArthur MW, Moss DS, Thornton JM |author4-link=Janet Thornton |year=1993 |title=PROCHECK: a program to check the stereochemical quality of protein structures |journal=Journal of Applied Crystallography |volume=26 |issue=2 |pages=283–291 |doi=10.1107/s0021889892009944}}</ref><ref name="whatif">{{cite journal | vauthors = Hooft RW, Vriend G, Sander C, Abola EE | title = Errors in protein structures | journal = Nature | volume = 381 | issue = 6580 | pages = 272 | date = May 1996 | pmid = 8692262 | doi = 10.1038/381272a0 | bibcode = 1996Natur.381..272H | s2cid = 4368507 | doi-access = free }}</ref> For macromolecular structures, the atomic models are deposited in the [[Protein Data Bank]] (PDB), which remains the single archive for these data.
The PDB was established in the 1970s at [[Brookhaven National Laboratory]],<ref>{{cite journal | vauthors = Bernstein FC, Koetzle TF, Williams GJ, Meyer EF, Brice MD, Rodgers JR, Kennard O, Shimanouchi T, Tasumi M | display-authors = 6 | title = The Protein Data Bank: a computer-based archival file for macromolecular structures | journal = Journal of Molecular Biology | volume = 112 | issue = 3 | pages = 535–42 | date = May 1977 | pmid = 875032 | doi = 10.1016/s0022-2836(77)80200-3 | author7-link = Olga Kennard }}</ref> moved in 2000 to the [http://www.rcsb.org/pdb RCSB] (Research Collaboratory for Structural Bioinformatics) centered at [[Rutgers]],<ref>{{cite journal | vauthors = Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE | display-authors = 6 | title = The Protein Data Bank | journal = Nucleic Acids Research | volume = 28 | issue = 1 | pages = 235–42 | date = January 2000 | pmid = 10592235 | pmc = 102472 | doi = 10.1093/nar/28.1.235 | author8-link = Philip Bourne | author-link = Helen M. Berman }}</ref> and expanded in 2003 to become the [http://www.wwpdb.org/ wwPDB] (worldwide Protein Data Bank),<ref name="wwPDB">{{cite journal | vauthors = Berman H, Henrick K, Nakamura H | title = Announcing the worldwide Protein Data Bank | journal = Nature Structural Biology | volume = 10 | issue = 12 | pages = 980 | date = December 2003 | pmid = 14634627 | doi = 10.1038/nsb1203-980 | s2cid = 2616817 | author-link = Helen M. Berman | doi-access = free }}</ref> with access sites added in Europe ([http://pdbe.org PDBe]) and Asia ([http://www.pdbj.org PDBj]), and with NMR data handled at the [http://www.bmrb.wisc.edu BioMagResBank (BMRB)] in Wisconsin.
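
The Rfree cross-validation mentioned in this paragraph can be sketched as follows: a small random subset of reflections is flagged as a test set and excluded from refinement, and the residual is then reported separately for the working set (Rwork) and the test set (Rfree). This is only an illustrative sketch. The function names, the 10% test fraction, and the sample amplitudes are assumptions made here, and the refinement step itself is not shown.

```python
import random


def r_factor(f_obs, f_calc):
    """Conventional residual: sum(||Fobs| - |Fcalc||) / sum(|Fobs|)."""
    denominator = sum(abs(fo) for fo in f_obs)
    if denominator == 0:
        raise ValueError("sum of observed amplitudes is zero")
    return sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc)) / denominator


def assign_free_flags(n_reflections, free_fraction=0.1, seed=0):
    """Flag a random subset of reflections as the test ("free") set.

    free_fraction=0.1 is an assumed value used for illustration only.
    """
    rng = random.Random(seed)
    n_free = max(1, round(n_reflections * free_fraction))
    free_indices = set(rng.sample(range(n_reflections), n_free))
    return [i in free_indices for i in range(n_reflections)]


def r_work_and_r_free(f_obs, f_calc, free_flags):
    """Return (R_work, R_free) computed over the two disjoint subsets."""
    work = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, free_flags) if not free]
    test = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, free_flags) if free]
    r_work = r_factor([fo for fo, _ in work], [fc for _, fc in work])
    r_free = r_factor([fo for fo, _ in test], [fc for _, fc in test])
    return r_work, r_free


# Hypothetical amplitudes; the last reflection is held out as the test set.
f_obs = [100.0, 50.0, 25.0, 25.0]
f_calc = [98.0, 51.0, 26.0, 24.0]
r_work, r_free = r_work_and_r_free(f_obs, f_calc, [False, False, False, True])
print(f"R_work = {r_work:.4f}, R_free = {r_free:.4f}")
```

Because the test reflections never influence the model, a test-set residual much larger than the working-set residual is usually read as a sign of overfitting.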
 
Validation rapidly became standard in the field,<ref name="Kleywegt2000">{{cite journal | vauthors = Kleywegt GJ |year= 2000 |title= Validation of protein crystal structures |journal=Acta Crystallographica D |volume=56 |issue= Pt 3 |pages=18–19|doi= 10.1107/s0907444999016364 |pmid= 10713511 }}</ref> with further developments described below. ''Obviously needs expansion''