For [[shared-memory]] computers, managing write conflicts greatly slows down the individual execution speed of each computing unit, even though the units themselves can otherwise work perfectly well in parallel. Conversely, in the case of message exchange, each processor can work at full speed; on the other hand, when it comes to collective message exchange, all processors are forced to wait for the slowest ones before starting the communication phase.
In reality, few systems fall into exactly one of these categories. In general, the processors each have an internal memory to store the data needed for the next calculations and are organized in successive [[Computer cluster|clusters]]. Often, these processing elements are then coordinated through [[distributed memory]] and [[message passing]]. Therefore, the load balancing algorithm should be adapted specifically to the parallel architecture; otherwise, there is a risk that the efficiency of parallel [[problem solving]] will be greatly reduced.
====Hierarchy====
===Fault tolerance===
Especially in large-scale [[computing cluster]]s, it is not tolerable to execute a [[parallel algorithm]] that cannot withstand the failure of a single component. Therefore, [[fault tolerant]] algorithms are being developed that can detect outages of processors and recover the computation.<ref>{{cite book |last1=Punetha Sarmila |first1=G. |last2=Gnanambigai |first2=N. |last3=Dinadayalan |first3=P. |title=2015 2nd International Conference on Electronics and Communication Systems (ICECS) |chapter=Survey on fault tolerant — Load balancing algorithms in cloud computing |date=2015 |pages=1715–1720 |doi=10.1109/ECS.2015.7124879 |isbn=978-1-4799-7225-8 |s2cid=30175022 }}</ref>
==Approaches==
:www.example.org NS two.example.org
However, the [[zone file]] for {{mono|www.example.org}} on each server is different, such that each server resolves its own IP address as the A-record.<ref>{{Cite web|url=https://www.zytrax.com/books/dns/ch8/a.html|title=Chapter 8 - IPv4 Address (A) Record|website=www.zytrax.com}}</ref> On server ''one'' the zone file for {{mono|www.example.org}} reports:
:@ in a 192.0.2.1
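The effect of this scheme can be sketched in a few lines of Python. This is an illustrative model only, not actual resolver code: it assumes a second, hypothetical address (203.0.113.2) for server ''two'', and stands in for the fact that whichever authoritative name server a resolver happens to query determines the address the client receives.

```python
# Sketch of round-robin DNS: each authoritative name server answers
# the A query for www.example.org with its own address, so the server
# a resolver happens to ask determines which web server a client reaches.
# The address for "two" (203.0.113.2) is an illustrative assumption.
import random

# The A-record each (hypothetical) authoritative server advertises
ZONE_A_RECORD = {
    "one.example.org": "192.0.2.1",
    "two.example.org": "203.0.113.2",
}

def resolve(name: str) -> str:
    """Pick a name server at random, as a caching resolver effectively
    does, and return the address that server's zone file maps name to."""
    ns = random.choice(list(ZONE_A_RECORD))
    return ZONE_A_RECORD[ns]

# Over many lookups, clients split roughly evenly between the two IPs.
hits = [resolve("www.example.org") for _ in range(1000)]
assert set(hits) == {"192.0.2.1", "203.0.113.2"}
```

Because ordinary resolvers cache whichever answer they receive, the split is only approximate in practice, but over many independent clients the traffic divides between the servers.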
Load balancing is often used to implement [[failover]]—the continuation of service after the failure of one or more of its components. The components are monitored continually (e.g., web servers may be monitored by fetching known pages), and when one becomes unresponsive, the load balancer is informed and no longer sends traffic to it. When a component comes back online, the load balancer starts rerouting traffic to it. For this to work, there must be at least one component in excess of the service's capacity ([[N+1 redundancy]]). This can be much less expensive and more flexible than failover approaches where every single live component is paired with a single backup component that takes over in the event of a failure ([[dual modular redundancy]]). Some [[RAID]] systems can also utilize [[hot spare]] for a similar effect.<ref name="IBM">{{cite web |url=https://www.ibm.com/support/knowledgecenter/en/SSVJJU_6.4.0/com.ibm.IBMDS.doc_6.4/ds_ag_srv_adm_dd_failover_load_balancing.html |title=Failover and load balancing |website=IBM |accessdate=6 January 2019}}</ref>
This technique can increase [[fault tolerance]] by enabling quick substitutions for the most complicated, most failure-prone parts of a system. However, it can make the load balancer itself a [[single point of failure]].
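The monitoring-and-rerouting loop described above can be sketched as follows. This is a minimal illustration of the pattern, not the API of any particular load balancer; the back-end names are invented for the example.

```python
# Sketch of health-check-driven failover: back ends are probed by a
# monitor, unresponsive ones are removed from rotation, and traffic
# resumes once they come back online.
class LoadBalancer:
    def __init__(self, backends):
        self.backends = list(backends)   # all known components
        self.healthy = set(backends)     # currently in rotation
        self._next = 0

    def report_health(self, backend, ok: bool):
        """Called by the monitor after probing a known page."""
        if ok:
            self.healthy.add(backend)      # back online: reroute to it
        else:
            self.healthy.discard(backend)  # unresponsive: stop sending traffic

    def pick(self):
        """Round-robin over the healthy back ends only."""
        pool = [b for b in self.backends if b in self.healthy]
        if not pool:
            raise RuntimeError("no healthy back ends")
        backend = pool[self._next % len(pool)]
        self._next += 1
        return backend

# Three servers where two suffice for the load: N+1 redundancy.
lb = LoadBalancer(["srv1", "srv2", "srv3"])
lb.report_health("srv2", ok=False)       # srv2 stops receiving traffic
assert lb.pick() in {"srv1", "srv3"}
```

The key point is that any healthy component can absorb the failed one's share, rather than each live component needing its own dedicated standby.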
=== Data ingestion for AI model training ===
Increasingly, load balancing techniques are being used to manage high-volume data ingestion pipelines that feed [[artificial intelligence]] [[AI training|training]] and [[inference]] systems—sometimes referred to as "[[AI Factory|AI factories]]". These AI-driven environments require continuous processing of vast amounts of structured and unstructured data, placing heavy demands on networking, storage, and computational resources.<ref>{{Cite web |title=Optimize Traffic Management for AI Factory Data Ingest |url=https://www.f5.com/company/blog/ai-factory-traffic-management-data-ingest |access-date=2025-01-30 |website=F5, Inc. |language=en-US}}</ref> To maintain the necessary high throughput and low latency, organizations commonly deploy load balancing tools capable of advanced TCP optimizations, connection pooling, and adaptive scheduling. Such features help distribute incoming data requests evenly across servers or nodes, prevent congestion, and ensure that compute resources remain efficiently utilized.<ref>{{Cite web |title=Optimize, Scale, and Secure AI Interactions |url=https://www.f5.com/solutions/use-cases/optimize-scale-and-secure-ai |access-date=2025-01-30 |website=F5, Inc. |language=en-US}}</ref>
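One common form of the adaptive scheduling mentioned above is a least-connections policy, sketched below. This is a simplified illustration, not any vendor's implementation, and the node names are invented: each incoming ingest request goes to the node currently holding the fewest open connections, which keeps long-lived, high-volume data streams from piling up on a single server.

```python
# Sketch of least-connections scheduling for an ingest pipeline:
# route each new request to the node with the fewest open connections.
class LeastConnections:
    def __init__(self, nodes):
        # open-connection count per node, all starting idle
        self.counts = {n: 0 for n in nodes}

    def acquire(self):
        """Choose the least-loaded node and open a connection on it."""
        node = min(self.counts, key=self.counts.get)
        self.counts[node] += 1
        return node

    def release(self, node):
        """Close a connection when the data stream finishes."""
        self.counts[node] -= 1

sched = LeastConnections(["ingest-a", "ingest-b"])
first = sched.acquire()    # goes to an idle node
second = sched.acquire()   # goes to the other, still-idle node
assert {first, second} == {"ingest-a", "ingest-b"}
```

Unlike plain round-robin, this policy reacts to how long each transfer actually takes, which matters when ingest requests vary widely in size.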
==See also==
* [[Affinity mask]]
* [[Application
* [[Autoscaling]]
* [[Cloud computing]]
* [[Edge computing]]
* [[InterPlanetary File System]]
* [[Network
* [[Optimal job scheduling]] – the computational problem of finding an optimally balanced schedule
* [[SRV record]]
==References==