Carbohydrate Structure Database: Difference between revisions

OAbot (talk | contribs)
m Open access bot: doi added to citation with #oabot.
Toux (talk | contribs)
misprints, funding sources, references, etc.
Line 8:
|citation = Carbohydrate Structure Database <ref name="Merged_CSDB">{{cite journal| author=Toukach Ph.V.| author2=Egorova K.S.| date=2016|journal=Nucleic Acids Research - Database Issue |volume=44|issue=D1|pages=D1229–D1236 |title=Carbohydrate structure database merged from bacterial, archaeal, plant and fungal parts|doi=10.1093/nar/gkv840|pmid=26286194|pmc=4702937}}</ref>
|laboratory =
|author = Philip V. Toukach, Ksenia S. Egorova, Yuri A. Knirel, et al.
|pmid =
|released = 2005
Line 34:
}}
 
'''Carbohydrate Structure Database (CSDB)''' is a free curated database and service platform in [[glycoinformatics]], launched in 2005<ref name="first">{{cite journal| author=Toukach F.V.|author2=Knirel Y.A.| date=2005|journal=Glycoconjugate Journal|volume=22|issue=4–6|pages=216–217|title=New database of bacterial carbohydrate structures}}</ref> by a group of Russian scientists from the [http://zioc.ru/?lang=en N.D. Zelinsky Institute of Organic Chemistry], Russian Academy of Sciences. CSDB stores published structural, taxonomic, bibliographic and NMR-spectroscopic data on natural [[carbohydrates]] and carbohydrate-related molecules.
 
== Overview ==
Line 40:
The main data stored in CSDB are [[carbohydrate]] structures of bacterial, fungal, and plant origin. Each structure is assigned to an organism and is provided with the link(s) to the corresponding scientific publication(s), in which it was described. Apart from structural data, CSDB also stores [[Nuclear Magnetic Resonance Spectroscopy|NMR]] spectra, information on methods used to decipher a particular structure, and some other data.<ref name="Merged_CSDB" /><ref>{{cite journal| author=Harvey D.J.| date=2015|journal=Mass Spectrometry Reviews |title=Analysis of carbohydrates and glycoconjugates by matrix-assisted laser desorption/ionization mass spectrometry: An update for 2011-2012|doi=10.1002/mas.21471|pmid=26270629|volume=36| issue=3|pages=255–422}}</ref>
CSDB provides access to several carbohydrate-related research tools:
* Simulation of 1D and 2D [[NMR]] spectra of [[carbohydrates]] ([http://csdb.glycoscience.ru/database/index.html?help=nmr GODDESS: glycan-oriented database-driven empirical spectrum simulation]).<ref name="GODDESS">{{cite journal| author=Kapaev R.R.| author2=Egorova K.S.| author3=Toukach Ph.V.|date=2014|journal=Journal of Chemical Information and Modeling |volume=54|issue=9|pages=2594–2611 |title=Carbohydrate structure generalization scheme for database-driven simulation of experimental observables, such as NMR chemical shifts|doi = 10.1021/ci500267u|pmid=25020143}}</ref><ref name="GODESS_1H">{{cite journal| author=Kapaev R.R.| author2=Toukach Ph.V.|date=2015|journal=Analytical Chemistry |volume=87|pages=7006–7010 |title=Improved carbohydrate structure generalization scheme for <sup>1</sup>H and <sup>13</sup>C NMR simulations| issue=14|doi=10.1021/acs.analchem.5b01413|pmid=26087011}}</ref><ref name="GODESS_2D">{{cite journal| author=Kapaev R.R.| author2=Toukach Ph.V.|date=2016|journal=Journal of Chemical Information and Modeling |volume=56|pages=1100–1104 |title=Simulation of 2D NMR Spectra of Carbohydrates Using GODESS Software| issue=6|doi=10.1021/acs.jcim.6b00083|pmid=27227420}}</ref>
* Automated [[NMR]]-based structure elucidation ([http://csdb.glycoscience.ru/database/index.html?help=nmr#grass GRASS: generation, ranking and assignment of saccharide structures]).<ref name="GRASS">{{cite journal| author=Kapaev R.R.| author2=Toukach Ph.V.|date=2018|journal=Bioinformatics |volume=34|issue=6|pages=957–963 |title=GRASS: semi-automated NMR-based structure elucidation of saccharides|doi = 10.1093/bioinformatics/btx696|pmid=29092007|doi-access=free}}</ref>
* [[Statistical analysis]] of structural feature distribution in [[glycomes]] of living organisms<ref name="taxon_clustering">{{cite journal| author=Egorova K.S.|author2=Kondakova A.N.|author3=Toukach Ph.V.| date=2015|journal=Database |pages=ID bav073 |title=Carbohydrate structure database: tools for statistical analysis of bacterial, plant and fungal glycomes|doi=10.1093/database/bav073|pmid=26337239|pmc=4559136|volume=2015}}</ref><ref name="Statistics">{{cite journal| author=Herget S.| author2=Toukach Ph.V.| author3=Ranzinger R.| author4=Hull W.E.| author5=Knirel Y.| author6=von der Lieth C.-W.| date=2008|journal=BMC Structural Biology |volume=8|pages=ID 35 |title=Statistical analysis of the Bacterial Carbohydrate Structure Data Base (BCSDB): Characteristics and diversity of bacterial carbohydrates in comparison with mammalian glycans|doi=10.1186/1472-6807-8-35|pmid=18694500| pmc=2543016}}</ref>
* Generation of optimized atomic coordinates for an arbitrary [[saccharide]]<ref name="RESTLESS">{{cite journal| author=Chernyshov I.Y.| author2=Toukach Ph.V.|date=2018|journal=Bioinformatics |title=REStLESS: Automated Translation of Glycan Sequences from Residue-Based Notation to SMILES and Atomic Coordinates|doi = 10.1093/bioinformatics/bty168|pmid=29547883|volume=34| issue=15|pages=2679–2681|doi-access=free}}</ref> and a subdatabase of conformation maps.
* [[Taxon]] [[Cluster analysis|clustering]] based on similarities of [[glycomes]] (carbohydrate-based [[Tree of life (biology)|tree of life]])<ref name="taxon_clustering" />
* [[Glycosyltransferase]] subdatabase ([http://csdb.glycoscience.ru/gt.html GT-explorer])<ref name="CSDB_GT">{{cite journal| author= Toukach Ph.V.| author2=Egorova K.S. |date=2017|journal=Glycobiology |volume=27|title=CSDB_GT: a new curated database on glycosyltransferases| issue=4 | pages=285–290 |doi=10.1093/glycob/cww137| pmid=28011601 |doi-access=free}}</ref><ref name="CSDB_GT2">{{cite journal| author=Egorova K.S.| author2= Knirel Y.A.| author3= Toukach Ph.V. |date=2019|journal=Glycobiology |volume=29|title=Expanding CSDB_GT glycosyltransferase database with Escherichia coli| issue=4 | pages=285–287 |doi=10.1093/glycob/cwz006| pmid=30759212 |doi-access=free}}</ref>
 
==History and funding==
 
Until 2015, the [http://csdb.glycoscience.ru/bacterial/index.html Bacterial Carbohydrate Structure Database] (BCSDB) and the [http://csdb.glycoscience.ru/plant_fungal/index.html Plant&Fungal Carbohydrate Structure Database] (PFCSDB) existed in parallel. In 2015, they were merged into the single [http://csdb.glycoscience.ru/database/index.html Carbohydrate Structure Database] (CSDB).<ref name="Merged_CSDB">{{cite journal| author=Toukach Ph.V.| author2=Egorova K.S.| date=2016|journal=Nucleic Acids Research - Database Issue |volume=44|issue=D1|pages=D1229–D1236 |title=Carbohydrate structure database merged from bacterial, archaeal, plant and fungal parts|doi=10.1093/nar/gkv840|pmid=26286194|pmc=4702937}}</ref> The development and maintenance of CSDB have been funded by the [http://www.istc.int/en/ International Science and Technology Center] (2005–2007), the [http://grants.extech.ru Russian Federation President grant program] (2005–2006), the [http://www.rfbr.ru/rffi/eng Russian Foundation for Basic Research] (2005–2007, 2012–2014, 2015–2017, 2018–2020), [https://web.archive.org/web/20000419192143/http://www.dkfz.de/ Deutsches Krebsforschungszentrum] (short-term in 2006–2010), and the [https://www.rscf.ru/en/ Russian Science Foundation] (2018–2020).
 
== Data sources and coverage ==
 
The main sources of CSDB data are:
* Scientific publications indexed in the dedicated citation databases, including [https://www.ncbi.nlm.nih.gov/pubmed/ NCBI PubMed] and [http://webofknowledge.com/ Thomson Reuters Web of Science] (approx. 18000 records).
* CCSD (CarbBank<ref>{{cite journal|author=Doubet S.|author2=Albersheim P.| date=1992|journal=Glycobiology|volume=2|issue=6 |pages=505–507 |title=CarbBank|pmid=1472756|doi=10.1093/glycob/2.6.505}}</ref>) database (approx. 3000 records).
The data are selected and added to CSDB manually by browsing original scientific publications. The data originating from other databases are subject to error-correction and approval procedures.<ref name="Critical">{{cite journal| author=Egorova K.S.| author2=Toukach Ph.V.| date=2012|journal=Journal of Chemical Information and Modeling |volume=52|pages=2812–2814 |title=Critical analysis of CCSD data quality| issue=11|doi=10.1021/ci3002815|pmid=23025661}}</ref>
As of the beginning of 2017, CSDB covers ca. 80% of the [[bacteria]]l and [[archaea]]l carbohydrate structures published in the scientific literature from 1943 to 2015.<ref name="Merged_CSDB"/> The time lag between the publication of relevant data and their deposition into CSDB is about 18 months. Plants are covered up to 1997, and fungi up to 2012.<ref name="PFCSDB">{{cite journal| author=Egorova K.S.| author2=Toukach Ph.V.| date=2013|journal=Carbohydrate Research |volume=389|pages=112–114|title=Expansion of coverage of Carbohydrate Structure Database (CSDB)|doi=10.1016/j.carres.2013.10.009|pmid=24680503}}</ref>
CSDB does not cover data from the [[animalia]] domain, except [[Protozoa|unicellular metazoa]]. There are a number of dedicated databases on [[animal]] [[carbohydrates]], e.g. [http://www.unicarbkb.org/ UniCarbKB]<ref name="unicarbkb">{{cite journal|author=Campbell M.P.|author2=Packer N.H. | date=2016|journal=Biochimica et Biophysica Acta |volume=1860|issue=8 |pages=1669–1675 |title=UniCarbKB: New database features for integrating glycan structure abundance, compositional glycoproteomics data, and disease associations|doi=10.1016/j.bbagen.2016.02.016|pmid=26940363}}</ref> or [http://glycosciences.de GLYCOSCIENCES.de].<ref>{{cite journal|author=Lütteke T.|author2=Bohne-Lang A.|author3=Loss A.|author4=Goetz T.|author5=Frank M.|author6=von der Lieth C.-W.| date=2006|journal=Glycobiology|volume=16|issue=5 |pages=71R–81R |title=GLYCOSCIENCES.de: an Internet portal to support glycomics and glycobiology research|doi=10.1093/glycob/cwj049|pmid=16239495|doi-access=free}}</ref>
 
Line 65:
== Interrelation with other databases ==
 
CSDB is cross-linked to other [[glycomics]] databases,<ref>{{cite journal| author=Ranzinger R.|author2=Herget S.|author3=Wetter T.|author4=von der Lieth C.-W.| date=2008|journal=BMC Bioinformatics |volume=9 |pages=ID 384 |title=GlycomeDB - integration of open-access carbohydrate structure databases|doi=10.1186/1471-2105-9-384|pmid=18803830 |pmc=2567997}}</ref><ref name="Integration_1">{{cite journal| author=Toukach Ph.V.|author2=Joshi H.| author3=Ranzinger R.| author4=Knirel Y.| author5=von der Lieth C.-W.| date=2007|journal=Nucleic Acids Research - Database Issue |volume=35|pages=D280–D286|title=Sharing of worldwide distributed carbohydrate-related digital resources: online connection of the Bacterial Carbohydrate Structure DataBase and GLYCOSCIENCES.de|issue=Database issue|doi=10.1093/nar/gkl883|pmid=17202164| pmc=1899093}}</ref> such as [http://www.monosaccharidedb.org MonosaccharideDB], [http://glycosciences.de Glycosciences.DE], [https://www.ncbi.nlm.nih.gov/pubmed/ NCBI Pubmed], [https://www.ncbi.nlm.nih.gov/taxonomy NCBI Taxonomy], [https://www.ncbi.nlm.nih.gov/nlmcatalog NLM catalog], [https://www.who.int/classifications/icd/en/ International Classification of Diseases 11], etc. 
Besides a native notation, CSDB Linear,<ref>{{cite journal|author=Toukach Ph.V.|author2=Egorova K.S.| date=2020|journal=Journal of Chemical Information and Modeling |volume=60|issue=3 |pages=1276–1289 |title=New features of CSDB Linear, as compared to other carbohydrate notations|doi=10.1021/acs.jcim.9b00744|pmid=31790229}}</ref> structures are presented in multiple carbohydrate notations (SNFG,<ref>{{cite journal|author=Varki A.|display-authors=et al | date=2015|journal=Glycobiology |volume=25|issue=12 |pages=1323–1324 |title=Symbol Nomenclature for Graphical Representations of Glycans|doi=10.1093/glycob/cwv091|pmid=26543186|pmc=4643639}}</ref> SweetDB,<ref>{{cite journal|author=Loss A.|author2=Bunsmann P.|author3=Bohne A.|author4=Loss A.|author5=Schwarzer E.|author6=Lang E.|author7=von der Lieth C.-W. | date=2002|journal=Nucleic Acids Research |volume=30|issue=1 |pages=405–408 |title=SWEET-DB: an attempt to create annotated data collections for carbohydrates|pmid=11752350 |doi=10.1093/nar/30.1.405 |pmc=99123}}</ref> GlycoCT,<ref>{{cite journal|author=Herget S.|author2=Ranzinger R.|author3=Maass K.|author4=von der Lieth C.-W.| date=2008|journal=Carbohydrate Research |volume=343|issue=12 |pages= 2162–2171|title=GlycoCT - a unifying sequence format for carbohydrates|doi=10.1016/j.carres.2008.03.011|pmid=18436199}}</ref> [http://www.wurcs-wg.org WURCS],<ref>{{cite journal|author=Tanaka K.|author2=Aoki-Kinoshita K.F.|author3=Kotera M.|author4=Sawaki H.|author5=Tsuchiya S.|author6=Fujita N.|author7=Shikanai T.|author8=Kato M.|author9=Kawano S.|author10=Yamada I.|author11=Narimatsu H. 
| date=2014|journal=Journal of Chemical Information and Modeling |volume=54|issue=6 |pages=1558–1566 |title=WURCS: the Web3 unique representation of carbohydrate structures|doi=10.1021/ci400571e|pmid=24897372|doi-access=free}}</ref> [http://glycam.org GLYCAM],<ref>{{cite journal|author=Kirschner K.N.|author2=Yongye A.B.|author3=Tschampel S.M.|author4=González-Outeiriño J.|author5=Daniels C.R.|author6=Foley B.L.|author7=Woods R.J. | date=2008|journal=Journal of Computational Chemistry |volume=29|issue=4 |pages=622–655 |title=GLYCAM06: a generalizable biomolecular force field. Carbohydrates|doi=10.1002/jcc.20820|pmid=17849372|pmc=4423547}}</ref> etc.). CSDB is exportable as a [[Resource Description Framework]] (RDF) feed according to the [https://bioportal.bioontology.org/ontologies/GLYCORDF GlycoRDF] ontology.<ref name="Ontology">{{cite journal| author=Ranzinger R.| author2=Aoki-Kinoshita K.F.| author3=Campbell M.P.| author4=Kawano S.| author5=Lütteke T.| author6=Okuda S.| author7=Shinmachi D.| author8=Shikanai T.| author9=Sawaki H.| author10=Toukach Ph.V.| author11=Matsubara M.| author12=Yamada I.| author13=Narimatsu H.|date=2015|journal=Bioinformatics|volume=31|issue=6|pages=919–925|title=GlycoRDF: An ontology to standardize Glycomics data in RDF|doi=10.1093/bioinformatics/btu732|pmid=25388145| pmc=4380026}}</ref><ref name="Integration_2">{{cite journal| author=Aoki-Kinoshita K.F.| author2=Bolleman J.| author3=Campbell M.P.| author4=Kawano S.| author5=Kim J.| author6=Lütteke T.| author7=Matsubara M.| author8=Okuda S.| author9=Ranzinger R.| author10=Sawaki H.| author11=Shikanai T.| author12=Shinmachi D.| author13=Suzuki Y.| author14=Toukach Ph.V.| author15=Yamada I.| author16=Packer N.H.| author17=Narimatsu H.| date=2013|journal=Journal of Biomedical Semantics |volume=4|pages=ID 39 |title=Introducing glycomics data into the Semantic Web| issue=1|doi=10.1186/2041-1480-4-39|pmid=24280648| pmc=4177142}}</ref>
 
==External links==