User:Markf129/Earth sciences data format interoperability: Difference between revisions

{{Userspace draft|date=July 2010}}
When studying the Earth sciences by observation or with analytical [[model (abstract)|models]], both the users and the collectors of data face the challenge of how best to organize and store the vast amount of information available. Different organizations may have specific technical goals, timeline constraints, or model constraints that often, of necessity, drive new and unique file conventions, distribution techniques, and architectures. While developing new solutions sometimes meets short-term goals, it often causes more complex long-term problems when standards are not adhered to<ref>{{cite article
| title = Model Data Interoperability for the United States Integrated Ocean Observing System
| author = Richard P. Signell
</ref>.
 
Interoperability of observational or model data must be easy and transparent: users should be able to view, process, and analyze the data without having to reformat it, write special tools to read or extract it, or rely on specific proprietary software. If common formats are adhered to, several benefits follow. First, it promotes the exchange of models and relevant science data. Second, observational data can be scaled and compared more easily to models. Third, it eliminates confusion and unnecessary format conversions. The last is perhaps the most important, as considerable time can be spent converting between the different data formats<ref>{{cite article
| title = Background on BUFR and GRIB Formats
| author = Doug McLain
A [[file format]] defines how data is encoded for storage, using a defined structure such as chunk-based, directory-based, or unstructured. The file format can usually be identified from the file name extension (e.g. .jpg, .bufr). Thus, the data model describes how the data is organized, and the file format describes how the data is stored. Conventions, in turn, describe what data types, formats, and design principles are applied for a given data model and/or format (e.g. [[Climate and Forecast Metadata Conventions]]). By identifying these three elements, data can be accurately described.
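
The separation between data model, file format, and convention can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the dataset contents, the two toy encodings (<code>to_text</code> and <code>to_binary</code>), and the single convention rule checked by <code>follows_convention</code> are hypothetical and do not correspond to any real Earth science format or library. The attribute names <code>units</code>, <code>standard_name</code>, and <code>Conventions</code> are borrowed from the Climate and Forecast conventions purely as examples.

```python
import copy
import json
import struct

# Data model: how the data is organized (dimensions, variables, attributes).
# All names and values below are hypothetical.
dataset = {
    "dimensions": {"time": 3},
    "variables": {
        "sea_surface_temperature": {
            "dimensions": ["time"],
            "attributes": {"units": "K",
                           "standard_name": "sea_surface_temperature"},
            "data": [290.5, 291.0, 291.25],
        }
    },
    "attributes": {"Conventions": "CF-1.4"},
}


# File format 1 (toy): the whole dataset as UTF-8 JSON text.
def to_text(ds):
    return json.dumps(ds, sort_keys=True).encode("utf-8")


def from_text(blob):
    return json.loads(blob.decode("utf-8"))


# File format 2 (toy): a length-prefixed JSON header holding the metadata,
# followed by the bulk values packed as little-endian 64-bit floats.
def to_binary(ds):
    header = copy.deepcopy(ds)
    payload = b""
    for name in sorted(header["variables"]):
        values = header["variables"][name].pop("data")
        payload += struct.pack("<%dd" % len(values), *values)
    head = json.dumps(header, sort_keys=True).encode("utf-8")
    return struct.pack("<I", len(head)) + head + payload


def from_binary(blob):
    (head_len,) = struct.unpack_from("<I", blob, 0)
    ds = json.loads(blob[4:4 + head_len].decode("utf-8"))
    offset = 4 + head_len
    for name in sorted(ds["variables"]):
        var = ds["variables"][name]
        count = 1
        for dim in var["dimensions"]:
            count *= ds["dimensions"][dim]  # number of values to read
        var["data"] = list(struct.unpack_from("<%dd" % count, blob, offset))
        offset += 8 * count  # each float64 occupies 8 bytes
    return ds


# Convention (toy rule): every variable must declare its units.
def follows_convention(ds):
    return all("units" in var["attributes"]
               for var in ds["variables"].values())


# The same data model survives a round trip through either file format.
restored_text = from_text(to_text(dataset))
restored_binary = from_binary(to_binary(dataset))
```

In this sketch the two encodings produce different bytes on disk, yet both restore an identical structure of dimensions, variables, and attributes, and the convention check operates on that structure regardless of which encoding was used.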
 
For example, data models contain components such as dimensions, variables, types, and attributes. Some models can also organize these components logically into groups. Together, these components capture the meaning of the data and the relations among data fields in an array-oriented dataset. In contrast to variables, which are intended for bulk data, attributes are intended for ancillary data, or information about the data<ref>{{cite article
| title =
| url = http://www.unidata.ucar.edu/software/netcdf/docs/netcdf/index.html