{{Short description|Abstract model}}
The term '''data model''' can refer to two distinct but closely related concepts. Sometimes it refers to an abstract formalization of the [[Object (philosophy)|objects]] and relationships found in a particular application domain: for example, the customers, products, and orders found in a manufacturing organization. At other times it refers to the set of concepts used in defining such formalizations: for example, concepts such as entities, attributes, relations, or tables. So the "data model" of a banking application may be defined using the entity-relationship "data model". This article uses the term in both senses.
A '''data model''' is an [[abstract model]] that organizes elements of [[data]] and [[Standardization|standardizes]] how they relate to one another and to the properties of real-world [[Entity|entities]].<ref>{{cite web |url = https://cedar.princeton.edu/understanding-data/what-data-model |title = What is a Data Model? |website = princeton.edu |access-date = 29 May 2024}}</ref><ref>{{cite web|title= UML Domain Modeling - Stack Overflow|url= https://stackoverflow.com/a/3835214|website= Stack Overflow|publisher= Stack Exchange Inc.|access-date= 4 February 2017}}</ref> For instance, a data model may specify that the data element representing a car be composed of a number of other elements which, in turn, represent the color and size of the car and define its owner.
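The car example can be sketched in Python as a composite of smaller elements. The class and field names used here (<code>Car</code>, <code>Color</code>, <code>Size</code>, <code>Owner</code>, and the metre-based fields) are assumptions made for this illustration only and do not come from any standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Color:
    name: str  # e.g. "red"


@dataclass(frozen=True)
class Size:
    length_m: float  # hypothetical unit: metres
    width_m: float


@dataclass(frozen=True)
class Owner:
    full_name: str


@dataclass
class Car:
    """A data element composed of other elements (color, size, owner)."""
    color: Color
    size: Size
    owner: Owner


car = Car(Color("red"), Size(4.2, 1.8), Owner("A. Example"))
```

The model fixes which elements a car record must contain and how they relate, without saying anything yet about how the record is stored.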
[[File:Data modeling context.svg|thumb|360px| Overview of a data-modeling context: Data model is based on Data, Data relationship, Data semantic and Data constraint. A data model provides the details of [[information]] to be stored, and is of primary use when the final product is the generation of computer [[software code]] for an application or the preparation of a [[functional specification]] to aid a [[computer software]] make-or-buy decision. The figure is an example of the interaction between [[business process modeling|process]] and data models.<ref name="SS93">Paul R. Smith & Richard Sarfaty Publications, LLC 2009</ref>]]
The corresponding professional activity is generally called ''[[data modeling]]'' or, more specifically, ''[[database design]]''.
Data models are typically specified by a data expert, data specialist, data scientist, data librarian, or a data scholar.
A data [[modeling language]] and notation are often represented in graphical form as diagrams.<ref name="MRM99">
Michael R. McCaleb (1999). [http://nvl.nist.gov/pub/nistpubs/jres/104/4/html/j44mac.htm#apa "A Conceptual Data Model of Datum Systems"] {{Webarchive
|url= https://web.archive.org/web/20080921063005/http://nvl.nist.gov/pub/nistpubs/jres/104/4/html/j44mac.htm#apa |date= 2008-09-21
A data model can sometimes be referred to as a [[data structure]], especially in the context of [[programming language]]s. Data models are often complemented by [[function model]]s, especially in the context of [[enterprise model]]s.
A data model explicitly determines the ''structure of data''; conversely, '''structured data''' is data organized according to an explicit data model or data structure. Structured data is in contrast to ''[[unstructured data]]'' and ''[[semi-structured data]]''.
== Overview ==
Managing large quantities of structured and [[unstructured data]] is a primary function of [[information system]]s. Data models describe the structure, manipulation, and integrity aspects of the data stored in data management systems such as relational databases. They may also describe data with a looser structure, such as [[Word processor|word processing]] documents, [[Email|email messages]], pictures, digital audio, and video: [[XQuery and XPath Data Model|XDM]], for example, provides a data model for [[XML]] documents.
* "Business rules, specific to how things are done in a particular place, are often fixed in the structure of a data model. This means that small changes in the way business is conducted lead to large changes in computer systems and interfaces".<ref name="MW99"/>
* "Entity types are often not identified, or incorrectly identified. This can lead to replication of data, data structure, and functionality, together with the attendant costs of that duplication in development and maintenance".<ref name="MW99"/>
* "Data models for different systems are arbitrarily different. The result of this is that complex interfaces are required between systems that share data. These interfaces can account for between
* "Data cannot be shared electronically with customers and suppliers, because the structure and meaning of data has not been standardized. For example, engineering design data and drawings for process plant are still sometimes exchanged on paper".<ref name="MW99"/>
The reason for these problems is a lack of standards that will ensure that data models will both meet business needs and be consistent.<ref name="MW99"/>
[[File:4-2 ANSI-SPARC three level architecture.svg|thumb|320px|The ANSI/SPARC [[Three schema approach|three level architecture]]. This shows that a data model can be an external model (or view), a conceptual model, or a physical model. This is not the only way to look at data models, but it is a useful way, particularly when comparing models.<ref name="MW99"/>]]
A data model ''instance'' may be one of three kinds according to [[ANSI]] in 1975:<ref>American National Standards Institute. 1975. ''ANSI/X3/SPARC Study Group on Data Base Management Systems; Interim Report''. FDT (Bulletin of ACM SIGMOD) 7:2.</ref>
# [[Conceptual data model]]:
# [[Logical data model]]: describes the semantics, as represented by a particular data manipulation technology. This consists of descriptions of tables and columns, object oriented classes, and XML tags, among other things.
# [[Physical data model]]: describes the physical means by which data are stored. This is concerned with partitions, CPUs, tablespaces, and the like.
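The three kinds can be sketched for a single hypothetical <code>customer</code> entity using Python's built-in <code>sqlite3</code> module. The entity, table, and index names are invented for this illustration, and treating an index as the physical level is a deliberate simplification.

```python
import sqlite3
from dataclasses import dataclass


# Conceptual level: what a customer is, independent of any technology.
@dataclass
class Customer:
    customer_id: int
    name: str


conn = sqlite3.connect(":memory:")

# Logical level: the same concept expressed as a table with columns.
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)

# Physical level (simplified): storage-oriented choices such as an index.
conn.execute("CREATE INDEX idx_customer_name ON customer (name)")

alice = Customer(1, "Alice")
conn.execute(
    "INSERT INTO customer (customer_id, name) VALUES (?, ?)",
    (alice.customer_id, alice.name),
)
rows = conn.execute("SELECT customer_id, name FROM customer").fetchall()
```

Dropping or adding the index changes only the physical level; the table definition and the conceptual description stay the same.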
== History ==
One of the earliest pioneering works in modeling information systems was done by Young and Kent (1958),<ref>Young, J. W., and Kent, H. K. (1958). "Abstract Formulation of Data Processing Problems". In: ''Journal of Industrial Engineering''. Nov-Dec 1958. 9(6), pp.
In the 1960s data modeling gained more significance with the initiation of the [[management information system]] (MIS) concept. According to Leondes (2002), "during that time, the information system provided the data and information for management purposes. The first generation [[database system]], called [[Integrated Data Store]] (IDS), was designed by [[Charles Bachman]] at General Electric. Two famous database models, the [[network data model]] and the [[hierarchical data model]], were proposed during this period of time".<ref>Cornelius T. Leondes (2002). ''Database and Data Communication Network Systems: Techniques and Applications''. Page 7</ref> Towards the end of the 1960s, [[Edgar F. Codd]] worked out his theories of data arrangement, and proposed the [[relational model]] for database management based on [[first-order logic|first-order predicate logic]].<ref>''"Derivability, Redundancy, and Consistency of Relations Stored in Large Data Banks"'', E.F. Codd, IBM Research Report, 1969</ref>
In the 1970s [[
In the 1970s [[G.M. Nijssen]] developed the "Natural Language Information Analysis Method" (NIAM), and in the 1980s, in cooperation with [[Terry Halpin]], developed it further into [[
Bill Kent, in his 1978 book ''Data and Reality,''<ref>{{citation|title=Data and Reality |url=http://www.bkent.net/Doc/darxrp.htm}}</ref> compared a data model to a map of a territory, emphasizing that in the real world, "highways are not painted red, rivers don't have county lines running down the middle, and you can't see contour lines on a mountain". In contrast to other researchers who tried to create models that were mathematically clean and elegant, Kent emphasized the essential messiness of the real world, and the task of the data modeler to create order out of chaos without excessively distorting the truth.
In the 1980s, according to Jan L. Harrington (2000), "the development of the [[Object-oriented programming|object-oriented]] paradigm brought about a fundamental change in the way we look at data and the procedures that operate on data. Traditionally, data and procedures have been stored separately: the data and their relationship in a database, the procedures in an application program. Object orientation, however, combined an entity's procedure with its data."<ref name="JLH00">Jan L. Harrington (2000). ''Object-oriented Database Design Clearly Explained''. p.4</ref>
During the early 1990s, three Dutch mathematicians, Guido Bakema, Harm van der Lek, and JanPieter Zwart, continued to develop the work of [[G.M. Nijssen]]. They focused more on the communication part of the semantics. In 1997 they formalized the method as Fully Communication Oriented Information Modeling ([[FCO-IM]]).
=== Database model ===
{{
A database model is a specification describing how a database is structured and used.
: The hierarchical model is similar to the network model except that links in the hierarchical model form a tree structure, while the network model allows an arbitrary graph.
; [[Network model]]
:
; [[Relational model]]
: is a database model based on first-order predicate logic. Its core idea is to describe a database as a collection of predicates over a finite set of predicate variables, describing constraints on the possible values and combinations of values. The power of the relational data model lies in its mathematical foundations and a simple user-level paradigm.
; [[
: Similar to a relational database model, but objects, classes, and inheritance are directly supported in [[database schema]]s and in the query language.
; [[
: A method of data modeling that has been defined as "attribute free", and "fact-based". The result is a verifiably correct system, from which other common artifacts, such as ERD, UML, and semantic models may be derived. Associations between data objects are described during the database design procedure, such that normalization is an inevitable result of the process.
; [[Star schema]]
=== Data structure diagram ===
{{
[[File:Aggregate Data Structure Diagram.jpg|thumb|240px|Example of a Data Structure Diagram]]
A data structure diagram (DSD) is a [[diagram]] and data model used to describe [[Conceptual schema|conceptual data models]] by providing graphical notations which document [[entity class|entities]] and their [[Relational model|relationship]]s, and the [[Integrity constraints|constraint]]s that bind them. The basic graphic elements of DSDs are [[box]]es, representing entities, and [[arrow]]s, representing relationships. Data structure diagrams are most useful for documenting complex data entities.
Data structure diagrams are an extension of the [[
There are several styles for representing data structure diagrams, with the notable difference in the manner of defining [[Cardinality (data modeling)|cardinality]]. The choices are between arrow heads, inverted arrow heads ([[
[[File:B 5 1 IDEF1X Diagram.jpg|thumb|240px|left|Example of an [[IDEF1X]]
===
{{
An
There are several styles for representing data structure diagrams, with a notable difference in the manner of defining cardinality. The choices are between arrow heads, inverted arrow heads (crow's feet), or numerical representation of the cardinality.
=== Geographic data model ===
{{
A data model in [[geographic information system]]s is a mathematical construct for representing geographic objects or surfaces as data. For example,
* the [[vector graphics|vector]] data model represents geography as points, lines, and polygons;
* the raster data model represents geography as cell matrixes that store numeric values;
* and the [[Triangulated irregular network]] (TIN) data model represents geography as sets of contiguous, nonoverlapping triangles.<ref>Wade, T. and Sommer, S. eds. ''[http://store.esri.com/esri/showdetl.cfm?SID=2&Product_ID=868&Category_ID=49 A to Z GIS]''</ref>
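The three geographic data models listed above can be sketched as simple Python types. The class names, field names, and sample coordinates are assumptions made for this illustration; production GIS software uses considerably richer structures.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class VectorFeature:
    """Vector model: geometry stored as an ordered list of coordinates."""
    kind: str  # "point", "line", or "polygon"
    coords: List[Point]


@dataclass
class Raster:
    """Raster model: a matrix of cells, each storing a numeric value."""
    cell_size: float
    cells: List[List[float]]  # rows of cell values

    def value_at(self, row: int, col: int) -> float:
        return self.cells[row][col]


@dataclass
class Tin:
    """TIN model: shared vertices plus triangles given as vertex indices."""
    vertices: List[Tuple[float, float, float]]  # x, y, elevation
    triangles: List[Tuple[int, int, int]]


parcel = VectorFeature("polygon", [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)])
elevation = Raster(cell_size=10.0, cells=[[1.0, 2.0], [3.0, 4.0]])
surface = Tin(
    vertices=[(0.0, 0.0, 5.0), (1.0, 0.0, 6.0), (0.0, 1.0, 7.0), (1.0, 1.0, 8.0)],
    triangles=[(0, 1, 2), (1, 3, 2)],  # two triangles sharing an edge
)
```

The same terrain could be held in any of the three forms; the choice determines which questions (boundaries, cell values, surface slope) are cheap to answer.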
Image:NGMDB data model application.jpg|NGMDB data model applications<ref name= "DRS03"/>
Image:NGMDB databases linked together.jpg|NGMDB databases linked together<ref name= "DRS03"/>
Image:Representing three-dimensional map information.jpg|Representing 3D
</gallery>
=== Generic data model ===
{{
Generic data models are generalizations of conventional data models. They define standardized general relation types, together with the kinds of things that may be related by such a relation type. Generic data models are developed as an approach to solving some shortcomings of conventional data models. For example, different modelers usually produce different conventional data models of the same domain. This can lead to difficulty in bringing the models of different people together and is an obstacle for data exchange and data integration. Invariably, however, this difference is attributable to different levels of abstraction in the models and differences in the kinds of facts that can be instantiated (the semantic expression capabilities of the models). The modelers need to communicate and agree on certain elements that are to be rendered more concretely, in order to make the differences less significant.
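A minimal sketch of the idea of standardized relation types, assuming a small hypothetical vocabulary: each relation type declares which kinds of things it may relate, and a fact is accepted only if it conforms. The relation names, kinds, and the <code>relate</code> helper are invented for this illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Thing:
    name: str
    kind: str  # e.g. "person", "organization"


# Each hypothetical relation type lists the kinds it may relate (left, right).
RELATION_TYPES: Dict[str, Tuple[str, str]] = {
    "is employed by": ("person", "organization"),
    "is part of": ("organization", "organization"),
}

facts: List[Tuple[Thing, str, Thing]] = []


def relate(left: Thing, relation_type: str, right: Thing) -> None:
    """Record a fact only if the relation type permits these kinds."""
    allowed = RELATION_TYPES[relation_type]
    if (left.kind, right.kind) != allowed:
        raise ValueError(
            f"{relation_type!r} cannot relate {left.kind} to {right.kind}"
        )
    facts.append((left, relation_type, right))


ann = Thing("Ann", "person")
acme = Thing("Acme", "organization")
relate(ann, "is employed by", acme)
```

Because every fact has the same shape, two modelers who agree on the relation types can exchange facts without first reconciling table layouts.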
=== Semantic data model ===
{{
[[File:A2 4 Semantic Data Models.svg|thumb|320px|Semantic data models<ref name="FIPS184"/>]]
A semantic data model in software engineering is a technique to define the meaning of data within the context of its interrelationships with other data. A semantic data model is an abstraction that defines how the stored symbols relate to the real world.<ref name="FIPS184"/> A semantic data model is sometimes called a [[conceptual data model]].
=== Data architecture ===
{{
Data architecture is the design of data for use in defining the target state and the subsequent planning needed to hit the target state. It is usually one of several [[architecture domain]]s that form the pillars of an [[enterprise architecture]] or [[solution architecture]].
** ''cost'': the cost incurred in obtaining the data, and making it available for use.
=== Data organization ===
Another kind of data model describes how to organize data using a [[database management system]] or other data management technology. It describes, for example, relational tables and columns or object-oriented classes and attributes. Such a data model is sometimes referred to as the ''[[physical data model]]'', but in the original ANSI three schema architecture, it is called "logical". In that architecture, the physical model describes the storage media (cylinders, tracks, and tablespaces). Ideally, this model is derived from the more conceptual data model described above. It may differ, however, to account for constraints like processing capacity and usage patterns.
=== Data structure ===
{{
[[File:Binary tree.svg|thumb|240px|A [[binary tree]], a simple type of branching linked data structure]]
A data structure is a way of storing data in a computer so that it can be used efficiently. It is an organization of mathematical and logical concepts of data. Often a carefully chosen data structure will allow the most [[algorithmic efficiency|efficient]] [[algorithm]] to be used. The choice of the data structure often begins from the choice of an [[abstract data type]].
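As one illustration of how the choice of structure enables an efficient algorithm, the sketch below implements an ordered collection as a binary search tree. The choice of a search tree (rather than the plain binary tree in the figure) and the function names are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root: Optional[Node], value: int) -> Node:
    """Insert a value, keeping smaller values left and others right."""
    if root is None:
        return Node(value)
    if value < root.value:
        root.left = insert(root.left, value)
    else:
        root.right = insert(root.right, value)
    return root


def in_order(root: Optional[Node]) -> List[int]:
    """Return the values in ascending order via in-order traversal."""
    if root is None:
        return []
    return in_order(root.left) + [root.value] + in_order(root.right)


root = None
for v in [5, 2, 8, 1, 9]:
    root = insert(root, v)
```

Calling <code>in_order(root)</code> yields the values in sorted order. The abstract data type (an ordered collection) says what operations exist; the tree is one concrete structure that supports them.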
For example, in the [[relational model]], the structural part is based on a modified concept of the [[Relation (mathematics)|mathematical relation]]; the integrity part is expressed in [[first-order logic]] and the manipulation part is expressed using the [[relational algebra]], [[tuple calculus]] and [[domain calculus]].
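The three parts can be sketched in a few lines of Python, treating a relation as a set of tuples with named attributes. This is a teaching sketch under simplifying assumptions (no attribute types, no keys beyond the one check shown); the helper names and sample data are hypothetical.

```python
from typing import Callable, Dict, FrozenSet, Set, Tuple

# Structural part: a tuple is a set of (attribute, value) pairs,
# and a relation is a set of such tuples.
Row = FrozenSet[Tuple[str, object]]
Relation = Set[Row]


def row(**attrs: object) -> Row:
    return frozenset(attrs.items())


# Manipulation part: three relational-algebra operators.
def select(rel: Relation, pred: Callable[[Dict[str, object]], bool]) -> Relation:
    """Selection: keep the tuples that satisfy the predicate."""
    return {r for r in rel if pred(dict(r))}


def project(rel: Relation, *names: str) -> Relation:
    """Projection: keep only the named attributes (duplicates collapse)."""
    return {frozenset((k, v) for k, v in r if k in names) for r in rel}


def natural_join(a: Relation, b: Relation) -> Relation:
    """Natural join: combine tuples that agree on all shared attributes."""
    out: Relation = set()
    for ra in a:
        for rb in b:
            da, db = dict(ra), dict(rb)
            if all(da[k] == db[k] for k in da.keys() & db.keys()):
                out.add(frozenset({**da, **db}.items()))
    return out


customers = {row(cid=1, name="Ann"), row(cid=2, name="Bob")}
orders = {row(oid=10, cid=1), row(oid=11, cid=1), row(oid=12, cid=2)}

# Integrity part (a simple foreign-key check): every order names a customer.
assert project(orders, "cid") <= project(customers, "cid")

names_with_orders = project(natural_join(customers, orders), "name")
```

Because relations are sets, projecting the joined result onto <code>name</code> removes the duplicate that Ann's two orders would otherwise produce.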
A data model instance is created by applying a data model theory. This is typically done to solve some business enterprise requirement. Business requirements are normally captured by a semantic [[logical data model]]. This is transformed into a physical data model instance from which is generated a physical database. For example, a data modeler may use a data modeling tool to create an [[
=== Patterns ===
=== Data-flow diagram ===
{{
[[File:Data Flow Diagram Example.jpg|thumb|240px|Data-Flow Diagram example<ref>John Azzolini (2000). [http://ses.gsfc.nasa.gov/ses_data_2000/000712_Azzolini.ppt Introduction to Systems Engineering Practices]. July 2000.</ref>]]
A data-flow diagram (DFD) is a graphical representation of the "flow" of data through an [[information system]]. It differs from the [[flowchart]] as it shows the ''data'' flow instead of the ''control'' flow of the program. A data-flow diagram can also be used for the [[Data visualization|visualization]] of [[data processing]] (structured design). Data-flow diagrams were invented by [[Larry Constantine]], the original developer of structured design,<ref>W. Stevens, G. Myers, L. Constantine, "Structured Design", IBM Systems Journal, 13 (2),
It is common practice to draw a [[System context diagram|context-level data-flow diagram]] first which shows the interaction between the system and outside entities. The '''DFD''' is designed to show how a system is divided into smaller portions and to highlight the flow of data between those parts. This context-level data-flow diagram is then "exploded" to show more detail of the system being modeled.
=== Information model ===
{{
[[File:A 01 Audio compact disc collection.svg|thumb|320px|Example of an [[EXPRESS (data modeling language)|EXPRESS G]] [[
An information model is not a type of data model, but more or less an alternative model. Within the field of software engineering, both a data model and an information model can be abstract, formal representations of entity types that include their properties, relationships and the operations that can be performed on them. The entity types in the model may be kinds of real-world objects, such as devices in a network, or they may themselves be abstract, such as the entities used in a billing system. Typically, they are used to model a constrained domain that can be described by a closed set of entity types, properties, relationships and operations.
According to Lee (1999)<ref name="Lee99"/>
An information model provides formalism to the description of a problem domain without constraining how that description is mapped to an actual implementation in software. There may be many mappings of the information model. Such mappings are called data models, irrespective of whether they are [[object model]]s (e.g. using [[Unified Modeling Language|UML]]), [[
[[File:JKDOM.SVG|thumb|180px|left|[[Document Object Model]], a standard [[object model]] for representing [[HTML]] or [[XML]] ]]
=== Object model ===
{{
An object model in computer science is a collection of objects or classes through which a program can examine and manipulate some specific parts of its world. In other words, it is the object-oriented interface to some service or system. Such an interface is said to be the ''object model of'' the represented service or system. For example, the [[Document Object Model|Document Object Model (DOM)]] [http://www.w3.org/DOM/] is a collection of objects that represent a [[web page|page]] in a [[web browser]], used by [[scripting language|script]] programs to examine and dynamically change the page. There is a [[Microsoft Excel]] object model<ref>[http://msdn2.microsoft.com/en-us/library/wss56bz7.aspx Excel Object Model Overview<!-- Bot generated title -->]</ref> for controlling Microsoft Excel from another program, and the [[ASCOM (standard)|ASCOM]] Telescope Driver<ref>{{cite web|url=http://ascom-standards.org/Standards/Requirements.htm |title=ASCOM General Requirements|date=2011-05-13 |access-date=2014-09-25}}</ref> is an object model for controlling an astronomical telescope.
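The DOM case can be tried outside a browser with Python's standard <code>xml.dom.minidom</code> module, a minimal DOM implementation. The sketch below works on a small hypothetical document parsed from a string rather than on a live web page; the markup and attribute values are invented for the example.

```python
from xml.dom.minidom import parseString

# A small hypothetical page, parsed into a tree of DOM objects.
doc = parseString("<html><body><p id='greeting'>Hello</p></body></html>")

# Examine the document through the object model.
paragraph = doc.getElementsByTagName("p")[0]
original_text = paragraph.firstChild.data

# Change the document through the same objects.
paragraph.firstChild.data = "Hello, world"
paragraph.setAttribute("class", "updated")

updated_markup = paragraph.toxml()
```

The program never edits the markup text directly: it reads and modifies element and text objects, and the serialized form in <code>updated_markup</code> reflects those changes.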
In [[computing]] the term ''object model'' has a distinct second
===
{{Main|Object–role modeling}}
[[File:Schema for Geologic Surface.svg|thumb|320px|Example of the application of
The conceptual design may include data, process and behavioral perspectives, and the actual DBMS used to implement the design might be based on one of many logical data models (relational, hierarchic, network, object-oriented, etc.).<ref name = "msd">[http://msdn2.microsoft.com/en-us/library/aa290383(VS.71).aspx Object Role Modeling: An Overview (msdn.microsoft.com)]. Retrieved 19 September 2008.</ref>
== Further reading ==
* David C. Hay (1996). ''[https://books.google.com/books?id=eUQbAAAAQBAJ&
* Len Silverston (2001). ''The Data Model Resource Book'' Volume 1/2. John Wiley & Sons.
* Len Silverston & Paul Agnew (2008). ''The Data Model Resource Book: Universal Patterns for data Modeling'' Volume 3. John Wiley & Sons.
* Matthew West
{{Data model}}
{{Software engineering}}
{{Authority control}}
{{DEFAULTSORT:Data Model}}