Data-oriented design: Difference between revisions

Content deleted Content added
No edit summary
mNo edit summary
 
(117 intermediate revisions by 57 users not shown)
Line 1:
{{Short description|Program optimization approach in computing}}
In [[computing]], '''Data oriented design''' (not to be confused with [[data-driven design]]) is a software optimisation approach motivated by [[cache coherency]], used in [[video game]] development. The approach is to focus on the data layout, separating and sorting [[field (computing)|field]]s according to when they are needed, and to think about transformations of data.
{{Distinguish|Data-driven programming}}
{{More citations needed|date=July 2020}}
In [[computing]], '''data-oriented design''' is a [[program optimization]] approach motivated by efficient usage of the [[CPU cache]], often used in [[video game]] development.<ref name="Llopis">{{cite web|url=http://gamesfromwithin.com/data-oriented-design|title=Data-oriented design|last=Llopis|first=Noel|date=December 4, 2009|website=Data-Oriented Design (Or Why You Might Be Shooting Yourself in The Foot With OOP)|archive-url=|archive-date=|access-date=April 17, 2020}}</ref> The approach is to focus on the data layout, separating and sorting [[field (computing)|fields]] according to when they are needed, and to think about transformations of data. Proponents include Mike Acton,<ref>{{cite web|title=CppCon 2014: Mike Acton "Data-Oriented Design and C++"|website = [[YouTube]]| date=29 September 2014 |url=https://www.youtube.com/watch?v=rX0ItVEVjHc}}</ref> [[Scott Meyers]],<ref>{{cite web|title=code::dive conference 2014 - Scott Meyers: Cpu Caches and Why You Care|website = [[YouTube]]| date=5 January 2015 |url=https://www.youtube.com/watch?v=WDIkqP4JbkE}}</ref> and [[Jonathan Blow]].
 
The [[parallel array]] (or [[structure of arrays]]) is the main example of data-oriented design. It is contrasted with the ''array of structures'' typical of object-oriented designs.
Proponents include [[Mike Acton]].
 
The definition of data-oriented design as a [[programming paradigm]] can be seen as contentious as many believe that it can be used side by side with another paradigm,<ref>{{cite web|access-date=2023-12-20|title=Data-Oriented Design|author=Richard Fabian|date=October 8, 2018|url=https://www.dataorienteddesign.com/dodbook/|website=www.dataorienteddesign.com}}</ref> but due to the emphasis on data layout, it is also incompatible with most other paradigms.<ref name="Llopis"/>
=== Motivation ===
 
== Motives ==
These techniques became especially popular during the [[PS3]] and [[xbox 360]] console generation when the hazards of [[cache misses]] became pronounced due to their use of [[in-order processor]]s and high clock speeds.<ref>{{cite web|title = data oriented design|url=http://www.dice.se/wp-content/uploads/2014/12/Introduction_to_Data-Oriented_Design.pdf}}</ref>. In modern systems (even with [[out of order execution]], [[main memory]] is as many as hundreds of clock cycles away from the [[processing element]]s, consequently [[locality of reference]] issues dominate performance.
These methods became especially popular in the mid to late 2000s during the [[seventh generation of video game consoles]] that included the [[IBM]] [[PowerPC]] based [[PlayStation 3]] (PS3) and [[Xbox 360]] consoles. Historically, [[game console]]s often have relatively weak [[central processing unit]]s (CPUs) compared to the top-of-line desktop computer counterparts. This is a design choice to devote more power and [[transistor budget]] to the [[graphics processing unit]]s (GPUs). For example, the 7th generation CPUs were not manufactured with modern [[out-of-order execution]] processors, but instead use [[in-order processor]]s with high clock speeds and deep [[Pipeline (computing)|pipelines]]. In addition, most types of computing systems have [[main memory]] located hundreds of [[clock cycle]]s away from the [[processing element]]s. Furthermore, as CPUs have become faster alongside a large increase in main memory capacity, there is massive data consumption that increases the likelihood of [[cache misses]] in the [[system bus|shared bus]], otherwise known as [[Von Neumann architecture#Von Neumann bottleneck|Von Neumann bottlenecking]]. Consequently, [[locality of reference]] methods have been used to control performance, requiring improvement of [[memory access pattern]]s to fix bottlenecking. Some of the software issues were also similar to those encountered on the [[Itanium]], requiring [[loop unrolling]] for upfront scheduling.
 
=== Contrast with OOPobject orientation ===
{{Original research section|date=September 2021}}
The claim is that traditional [[object-oriented programming]] (OOP) design principles result in poor data locality,<ref>{{cite web
| title = INTEL ® HPC DEVELOPER CONFERENCE FUEL YOUR INSIGHT IMPROVE VECTORIZATION EFFICIENCY USING INTEL SIMD DATA LAYOUT TEMPLATE (INTEL SDLT)
| url = https://www.intel.com/content/dam/www/public/us/en/documents/presentation/improving-vectorization-efficiency.pdf
}}</ref><ref>{{cite journal
| title = SoAx: A generic C++ Structure of Arrays for handling particles in HPC codes
| author1 = Holger Homann
| author2 = Francois Laenen
| journal = Computer Physics Communications
| date = 2018
| volume = 224
| pages = 325–332
| doi = 10.1016/j.cpc.2017.11.015
| arxiv = 1710.03462
| bibcode = 2018CoPhC.224..325H
| s2cid = 2878169
| language = English
}}</ref> more so if runtime polymorphism ([[dynamic dispatch]]) is used (which is especially problematic on some processors).<ref>{{cite web|title=What's wrong with Object-Oriented Design? Where's the harm in it?|url=http://www.dataorienteddesign.com/dodmain/node17.html}}describes the problems with virtual function calls, e.g., i-cache misses</ref><ref name="Llopis"/> Although OOP appears to "organise code around data", it actually organises [[source code]] around [[data type]]s rather than physically grouping individual fields and arrays in an efficient format for access by specific functions. Moreover, it often hides layout details under [[abstraction layer]]s, while a data-oriented programmer wants to consider this first and foremost.
 
=== See also ===
* [[CPU cache]]
* [[Data-driven programming]]
* [[Entity component system]]
* [[memoryMemory access pattern]]
* [[Video game development]]
 
==References==
{{Reflist}}
 
[[Category:Software optimization]]
=== Contrast with OOP ===
[[Category:Video game development]]
 
[[Category:Programming paradigms]]
The claim is that traditional [[object-oriented]] design principles result in poor data locality, especially if [[runtime polymorphism]] is used (which itself is especially problematic on certain processors). Although OOP does on the surface seem to 'organise code around data', the practice is quite different. OOP is actually about organising [[source code]] around [[data types]], rather than making the layout of individual fields and arrays convenient for specific functions. It also frequently hides layout details under [[abstraction layer]]s, whilst a data-oriented programmer wants to think about this first and foremost.
 
 
 
 
{{stub}}
 
=== See also ===
 
* [[CPU cache]]
* [[AOS vs SOA]]
* [[memory access pattern]]
 
[[category:computing]]