Data modeling: Difference between revisions

No edit summary
Tried to clarify the distinction between conceptual data modeling of the business and logical data modeling of a database.
Line 1:
The term ''data modeling'' actually refers to two very different things. In the first sense, a data model is a description of the structure of an organization's data, and by implication of the underlying structure of the organization itself. It represents classes of things of significance about which a company wishes to hold information (''entity classes''), the nature of that information (''attributes''), and relationships among those things. Such a model describes the organization itself and is not concerned with how the data might be represented in a computer system.
In information system design, '''data modeling''' is the analysis and design of the information in the system, concentrating on the logical entities and the logical dependencies among these entities. Data modeling is an [[abstraction]] activity in that the details of the values of individual data observations are ignored in favor of the structure, relationships, names and formats of the data of interest, although a list of valid values is frequently recorded. The resulting [[data model]] defines not only the data's structure, but also its [[semantics]].
 
The entity classes represented can be the tangible things seen by the people in the business, but these tend to be very concrete and subject to change over time. A more robust approach is "conceptual": identifying more fundamental things of significance, of which the things the business sees are examples. For example, an entity class that should appear in every model is PERSON, representing all the people that the organization is concerned with. Entity classes like VENDOR and EMPLOYEE are not appropriate, because each of these describes a role played by a PERSON, not the person himself or herself.
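
The distinction between a PERSON and the roles that person plays can be sketched in code. The following is only an illustration under stated assumptions: the names <code>Role</code>, <code>roles_of</code> and the sample person "Pat" are hypothetical and do not come from any particular modeling method.

```python
from dataclasses import dataclass, field


@dataclass
class Role:
    """A role played by a person, e.g. VENDOR or EMPLOYEE (hypothetical class)."""
    kind: str


@dataclass
class Person:
    """Any person the organization is concerned with."""
    name: str
    roles: list[Role] = field(default_factory=list)


def roles_of(person: Person) -> list[str]:
    """Return the kinds of role a person plays, in alphabetical order."""
    return sorted(role.kind for role in person.roles)


# One person can play several roles at once without being modeled twice.
pat = Person("Pat")
pat.roles.append(Role("EMPLOYEE"))
pat.roles.append(Role("VENDOR"))
```

Because VENDOR and EMPLOYEE are values of a role rather than separate entity classes, a person who starts or stops playing a role changes only the list of roles, not the structure of the model.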
While ''data analysis'' is a common term for data modeling, the activity actually has more in common with the ideas and methods of [[synthesis]] than it does with taking things apart (the original meaning of ''[[analysis]]''). Data modeling strives to bring the data structures of interest together into a cohesive, inseparable whole by eliminating unnecessary data redundancies and by relating data structures with [[Relational model|relationships]].
 
Properly done, a conceptual data model describes the organization's semantics. It is a collection of assertions about the nature of the business. This requires the entity class names to be in English (or French or Polish or whatever), not techno-babble. It also requires discipline in naming relationships so that sentences can be formed from them that represent concrete assertions about the business. One such discipline makes the relationship names prepositional phrases (not verbs) so that they can appear in a sentence of the form "Each <<entity 1>> {must be|may be} <<relationship name>> {one and only one|one or more} <<entity 2>>." For example, "Each ORDER must be ''composed of'' one or more LINE ITEMS."
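
The sentence pattern can be applied mechanically. The following is a minimal sketch, assuming a hypothetical helper <code>relationship_sentence</code> whose parameter names are chosen here for illustration only.

```python
def relationship_sentence(entity_1: str, optional: bool, relationship_name: str,
                          many: bool, entity_2: str) -> str:
    """Build an assertion of the form
    'Each <entity 1> {must be|may be} <relationship name>
    {one and only one|one or more} <entity 2>.'
    """
    optionality = "may be" if optional else "must be"
    cardinality = "one or more" if many else "one and only one"
    return f"Each {entity_1} {optionality} {relationship_name} {cardinality} {entity_2}."


# The example from the text.
sentence = relationship_sentence("ORDER", False, "composed of", True, "LINE ITEMS")
print(sentence)  # Each ORDER must be composed of one or more LINE ITEMS.
```

Reading each generated sentence back to someone who knows the business is one way to check that the relationship name and its optionality and cardinality state what was intended.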
The process of developing the data model involves analyzing the kinds of data that will generally fit into the information system, and the relationships between different data elements within that system. The modeler must then produce representations of the data model that guide the software development process. In the early phases of a software development project, emphasis will be on the design of a [[conceptual schema|conceptual data model]]. This can be detailed into a [[logical data model]], sometimes called a [[functional data model]]. In later stages, this model may be translated into a [[physical data model]].
The second kind of data model describes the way data would be organized using a database management system or other data management technology. This describes, for example, relational tables and columns or object-oriented classes and attributes. This is sometimes referred to as the "physical" model, but in the original ANSI three-schema architecture it is called "logical". In that architecture, the physical model describes the storage media (cylinders, tracks, and tablespaces). Ideally, this model will be derived from the more conceptual one just described, if it is to be the basis for a system that will truly serve the organization. It may differ for good and valid reasons, however, since the system designer must now account for things like processing capacity and usage patterns.
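
As a rough illustration of this second kind of model, the ORDER and LINE ITEM entity classes might be rendered as relational tables. The sketch below uses Python's standard <code>sqlite3</code> module with an in-memory database; the table names, column names, sample rows and the helper function are all assumptions made for the example, not part of any prescribed design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customer_order (
    order_id  INTEGER PRIMARY KEY,
    placed_on TEXT NOT NULL
);
CREATE TABLE line_item (
    line_item_id INTEGER PRIMARY KEY,
    -- Each LINE ITEM must be part of one and only one ORDER.
    order_id     INTEGER NOT NULL REFERENCES customer_order(order_id),
    quantity     INTEGER NOT NULL
);
""")

# Illustrative sample rows.
conn.execute("INSERT INTO customer_order (order_id, placed_on) VALUES (1, '2024-01-01')")
conn.execute("INSERT INTO line_item (order_id, quantity) VALUES (1, 3)")


def line_item_without_order_is_rejected(connection: sqlite3.Connection) -> bool:
    """Return True if a line item pointing at a missing order is refused."""
    try:
        connection.execute("INSERT INTO line_item (order_id, quantity) VALUES (99, 1)")
    except sqlite3.IntegrityError:
        return True
    return False
```

The schema captures "one and only one ORDER" through the non-null foreign key. It does not enforce that an order has at least one line item, which is the kind of difference between the conceptual model and the database design that the paragraph above describes.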
 
A different approach is through the use of [[adaptive systems]] such as [[artificial neural networks]] that can autonomously create implicit models of data.
 
Several techniques have been developed for the design of data models. While these methodologies guide data modelers in their work, two different people using the same methodology will often come up with very different results. Most notable are:
 
* [[RM/T]]
* [[Bachman diagram]]s
* [[Business rules]] or [[business rules approach]]
* [[Entity-relationship diagram]]s
* [[Object Role Modeling]] (ORM) or Nijssen's Information Analysis Method (NIAM)
* [[Object-relationship modeling]]
* [[Artificial neural network]]s
 
Line 22 ⟶ 26:
 
==External links==
* [http://www.essentialstrategies.com/publications/modeling] for articles on the subject.
* [http://www.databaseanswers.com/modelling_tools.htm Data Modelling Tools] from DatabaseAnswers.com
* [http://www.silverrun.com/modelsphere_dm.html SILVERRUN] - tools for conceptual, logical and physical data modeling