'''Parallel rendering''' (or '''distributed rendering''') is the application of [[parallel programming]] to the computational ___domain of [[computer graphics]]. [[Rendering (computer graphics)|Rendering]] graphics can require massive computational resources for complex scenes that arise in [[scientific visualization]], [[medical visualization]], [[Computer-aided design|CAD]] applications, and [[virtual reality]]. Recent research has also suggested that parallel rendering can be applied to [[mobile gaming]] to decrease power consumption and increase graphical fidelity.<ref>{{Cite journal|last1=Wu|first1=C.|last2=Yang|first2=B.|last3=Zhu|first3=W.|last4=Zhang|first4=Y.|date=2017|title=Toward High Mobile GPU Performance through Collaborative Workload Offloading|journal=IEEE Transactions on Parallel and Distributed Systems|volume=PP|issue=99|pages=435–449|doi=10.1109/tpds.2017.2754482|issn=1045-9219|doi-access=free}}</ref> Rendering is an [[embarrassingly parallel]] workload in multiple domains (e.g., pixels, objects, frames) and thus has been the subject of much research.
== Workload distribution ==
There are two, often competing, reasons for using parallel rendering. Performance scaling allows frames to be rendered more quickly while data scaling allows larger data sets to be visualized. Different methods of distributing the workload tend to favor one type of scaling over the other. There can also be other advantages and disadvantages such as [[Latency (engineering)|latency]] and [[load balancing (computing)|load balancing]] issues. The three main options for primitives to distribute are entire frames, pixels, or objects (e.g. [[triangle mesh]]es).
Parallel rendering divides the work to be done and processes it concurrently. For example, a serial [[Ray casting|ray-casting]] application sends rays one by one to all the pixels in the [[Viewing frustum|view frustum]]. Instead, the frustum can be divided into a number of tiles, with one thread or process per tile sending its rays in parallel; a cluster of machines can render the tiles and then composite the results into the final image.
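The tile scheme just described can be sketched in a few lines. Everything here is a toy assumption: `render_pixel` stands in for the per-ray work of a real ray caster, and a thread pool stands in for the cluster of machines (a real renderer would use processes or cluster nodes, since the work is CPU-bound):

```python
from concurrent.futures import ThreadPoolExecutor

WIDTH, HEIGHT = 64, 48          # toy image size

def render_pixel(x, y):
    """Stand-in for casting one ray through pixel (x, y)."""
    return (x * 31 + y * 17) % 256

def render_tile(tile):
    """Render one horizontal band of rows [y0, y1)."""
    y0, y1 = tile
    return [[render_pixel(x, y) for x in range(WIDTH)] for y in range(y0, y1)]

def split_rows(height, n_tiles):
    """Split the image rows into n_tiles contiguous bands."""
    step = -(-height // n_tiles)               # ceiling division
    return [(y, min(y + step, height)) for y in range(0, height, step)]

def render_parallel(n_tiles=4):
    """Render the bands concurrently, then composite by concatenation."""
    with ThreadPoolExecutor(max_workers=n_tiles) as pool:
        bands = pool.map(render_tile, split_rows(HEIGHT, n_tiles))
    return [row for band in bands for row in band]
```

Because each band depends only on its own rays, compositing reduces to concatenating the bands in order; the distribution schemes below differ mainly in how this final combination step is done.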
=== Frame distribution ===
Each processing unit can render an entire frame from a different point of view or moment in time. The frames rendered from different points of view can improve image quality with anti-aliasing or add effects like depth-of-field and [[three-dimensional display]] output. This approach allows for good performance scaling but no data scaling.
Frame distribution is a clear example of an [[embarrassingly parallel]] workload: the frames to be rendered are simply divided among the available compute nodes, with each node rendering a frame independently, so multiple frames are processed simultaneously. Alternatively, a single frame can be distributed across multiple nodes, but this requires tightly coupled communication between the nodes to composite the partial results.
When rendering sequential frames in parallel there will be a lag for interactive sessions. The lag between user input and the action being displayed is proportional to the number of sequential frames being rendered in parallel.
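As a toy illustration of this proportionality, the schedule below hands sequential frames to identical render nodes round-robin; the node count and the one-unit render time are arbitrary assumptions made only for the sketch:

```python
def frame_schedule(n_frames, n_nodes, render_time=1.0):
    """Finish time of each sequential frame when frames are handed out
    round-robin to n_nodes identical render nodes."""
    node_free = [0.0] * n_nodes   # time at which each node becomes idle
    finish = []
    for f in range(n_frames):
        node = f % n_nodes
        start = node_free[node]
        node_free[node] = start + render_time
        finish.append(start + render_time)
    return finish

# With 4 nodes, 8 frames finish in 2 time units (4x throughput), but
# frames 0..3 are in flight together: user input arriving while they
# render can only influence the picture about n_nodes frames later.
```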
=== Pixel distribution ===
Sets of pixels in the screen space can be distributed among processing units in what is often referred to as sort-first rendering.<ref>Molnar, S., M. Cox, D. Ellsworth, and H. Fuchs. “[http://www.cs.unc.edu/~fuchs/publications/SortClassify_ParalRend94.pdf A Sorting Classification of Parallel Rendering].” IEEE Computer Graphics and Applications, pages 23–32, July 1994.</ref>
Distributing interlaced lines of pixels gives good load balancing but makes data scaling impossible. Distributing contiguous 2D tiles of pixels allows for data scaling by culling data with the [[view frustum]]. However, there is a data overhead from objects on frustum boundaries being replicated and data has to be loaded dynamically as the view point changes. Dynamic load balancing is also needed to maintain performance scaling.
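The bookkeeping behind the tile variant can be sketched as follows, assuming integer pixel coordinates and hypothetical half-open bounding boxes `(x0, y0, x1, y1)`: each object is assigned to every tile its screen-space bounding box overlaps, so boxes that straddle tile boundaries are replicated, which is exactly the data overhead described above.

```python
def tiles_for_bbox(bbox, tile_w, tile_h, n_cols, n_rows):
    """Tiles (col, row) that a half-open screen-space bbox overlaps."""
    x0, y0, x1, y1 = bbox
    c0, c1 = max(0, x0 // tile_w), min(n_cols - 1, (x1 - 1) // tile_w)
    r0, r1 = max(0, y0 // tile_h), min(n_rows - 1, (y1 - 1) // tile_h)
    return [(c, r) for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)]

# On a 2x2 grid of 100x100-pixel tiles, a small box stays on one
# processing unit, while a box on the shared corner is replicated to
# all four units.
```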
An example of sort-first rendering is a [[video wall]]: each computer in the wall renders a portion of the viewing volume, or [[viewing frustum]], and the final image is the combination of the images shown on the monitors that make up the wall. The [[speedup]] comes from the fact that graphics libraries ([[OpenGL]], for example) will [[Clipping (computer graphics)|clip]] away pixels that would appear outside of the viewing volume. This happens very early in the [[graphics pipeline]], which accelerates rendering by eliminating unneeded [[rasterization]] and post-processing of primitives that will not appear in the final image anyway.

=== Object distribution ===
Distributing objects among processing units is often referred to as sort-last rendering.<ref>Molnar, S., M. Cox, D. Ellsworth, and H. Fuchs. “[http://www.cs.unc.edu/~fuchs/publications/SortClassify_ParalRend94.pdf A Sorting Classification of Parallel Rendering].” IEEE Computer Graphics and Applications, pages 23–32, July 1994.</ref> It provides good data scaling and can provide good performance scaling, but it requires the intermediate images from processing nodes to be [[alpha compositing|alpha composited]] to create the final image. As the image resolution grows, the alpha compositing overhead also grows.
A load balancing scheme is also needed to maintain performance regardless of the viewing conditions. This can be achieved by over-partitioning the object space and assigning multiple pieces to each processing unit in a random fashion; however, this increases the number of alpha compositing stages required to create the final image. Another option is to assign a contiguous block to each processing unit and update it dynamically, but this requires dynamic data loading.
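For opaque geometry the compositing step can be a per-pixel depth comparison rather than a full alpha blend. The sketch below, with hypothetical `(depth, color)` pixel tuples, keeps for every pixel the fragment nearest the camera across all partial images:

```python
def composite_depth(partials):
    """Composite equal-sized partial renders: per pixel, keep the
    (depth, color) fragment with the smallest depth value."""
    return [min((image[i] for image in partials), key=lambda frag: frag[0])
            for i in range(len(partials[0]))]
```

In practice this reduction is itself parallelized (for example with direct-send or binary-swap exchange schemes), since the amount of pixel data to combine grows with the image resolution.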
In a sort-last configuration, one computer typically acts as the [[Master-slave (computers)|master]]: it receives the partial images created by the other computers, composites them into the final image, and displays it on its own monitor.

=== Hybrid distribution ===
The different types of distributions can be combined in a number of fashions. A couple of sequential frames can be rendered in parallel while also rendering each of those individual frames in parallel using a pixel or object distribution. Object distributions can try to minimize their overlap in screen space in order to reduce alpha compositing costs, or even use a pixel distribution to render portions of the object space.

=== Other distributions ===
Sub-pixel decomposition has every processing unit render the same scene from slightly modified camera positions, with the final pixels computed by blending the sub-pixel results, for example for full-screen anti-aliasing or depth-of-field effects. Because all rendering units render more or less the same view, no sorting of rendered primitives takes place; such decompositions are inherently load-balanced and well suited to purely fill-limited applications such as ray tracing and 3D [[volume rendering]].

DPlex rendering distributes full, alternating frames to the individual rendering nodes. It scales very well, but increases the latency between user input and final display, which is often irritating for the user. Stereo decomposition is used for immersive applications, where the individual eye passes are rendered by different rendering units; passive stereo systems are a typical example of this mode.

== Open source applications ==
The open source software package [[Chromium (computer graphics)|Chromium]] provides a parallel rendering mechanism for existing applications. It intercepts the [[OpenGL]] calls and processes them, typically to send them to multiple rendering units driving a [[video wall|display wall]].

Equalizer (http://www.equalizergraphics.com) is an open source rendering [[Software framework|framework]] and resource management system for multipipe applications. Equalizer provides an [[Application programming interface|API]] to write parallel, scalable visualization applications which are configured at run-time by a resource server.<ref>{{Cite web |url=http://www.equalizergraphics.com/ |title=Equalizer: Parallel Rendering |access-date=2020-04-30 |archive-url=https://web.archive.org/web/20080511163442/http://www.equalizergraphics.com/ |archive-date=2008-05-11 |url-status=dead }}</ref>

[[OpenSG]] (http://opensg.vrsource.org/trac) is an open source [[Scene graph|scenegraph]] system that provides parallel rendering capabilities, especially on clusters. It hides the complexity of parallel [[Thread (computer science)|multi-threaded]] and clustered applications and supports sort-first as well as sort-last rendering.<ref>{{Cite web |url=http://www.opensg.org/ |title=OpenSG |access-date=2020-04-30 |archive-url=https://web.archive.org/web/20170806213018/http://www.opensg.org/ |archive-date=2017-08-06 |url-status=dead }}</ref>

Golem is an open source [[decentralized application]] used for [[parallel computing]] that currently works with rendering in [[Blender_(software)|Blender]] and has plans to incorporate more uses.<ref>{{Cite web|title=Golem Network|url=https://golem.network/|access-date=2021-05-16|website=golem.network}}</ref>
==See also==
;Concepts
* [[Server farm]]
* [[Queue (data structure)|Queue]]
* [[Render farm]]
==External links==
* [https://web.archive.org/web/20070116035248/http://www.cs.princeton.edu/~rudro/cluster-rendering/ Cluster Rendering at Princeton University]
{{DEFAULTSORT:Parallel Rendering}}
{{Computer graphics}}
[[Category:3D computer graphics]]
[[Category:Applications of distributed computing]]
[[bs:Paralelno renderiranje]]