Load balancing in mesh-like computations using Prediction Binary Trees

ERRA, UGO;
2008-01-01

Abstract

We present a load-balancing technique that exploits temporal coherence among successive computation phases in mesh-like computations mapped onto a cluster of processors. Our method partitions the computation into balanced tasks and distributes them to independent processors through the Prediction Binary Tree (PBT). At each new phase, the current PBT is updated using each task's computing time in the previous phase as its cost estimate for the next phase. The PBT is designed to balance the load across tasks and to reduce dependencies among processors, for higher performance. Dependencies are reduced by using rectangular, almost-square tiles of the mesh (i.e., one dimension is at most twice the other). By reducing dependencies, one can reduce inter-processor communication or exploit local dependencies among tasks (such as data locality). We assessed our strategy on a significant problem, Parallel Ray Tracing. Our implementation shows good scalability and improves over coherence-oblivious implementations. We report measurements showing that task granularity is a key factor in the performance of our decomposition/mapping strategy.
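The idea of reusing previous-phase timings while keeping tiles almost square can be illustrated with a minimal Python sketch. It is not the paper's PBT update rule, which the abstract does not spell out. It assumes a simplified scheme in which each task's measured time is spread uniformly over its tile, and the mesh is then re-split recursively along the longer side at the cut that best balances the estimated cost. The names `estimate_cell_costs`, `split_tile`, and `rect_cost` are hypothetical, and the root tile is assumed to be almost square already.

```python
from math import ceil

def rect_cost(cost, x, y, w, h):
    """Sum of the estimated per-cell cost inside the rectangle (x, y, w, h)."""
    return sum(sum(row[x:x + w]) for row in cost[y:y + h])

def estimate_cell_costs(width, height, tiles, times):
    """Assumption: spread each task's previous-phase time evenly over its tile."""
    cost = [[0.0] * width for _ in range(height)]
    for (x, y, w, h), t in zip(tiles, times):
        per_cell = t / (w * h)
        for j in range(y, y + h):
            for i in range(x, x + w):
                cost[j][i] = per_cell
    return cost

def split_tile(cost, tile, depth):
    """Split an almost-square tile into up to 2**depth almost-square leaves."""
    x, y, w, h = tile
    if depth == 0:
        return [tile]
    cut_width = w >= h                      # always cut across the longer side
    long_side, short_side = (w, h) if cut_width else (h, w)
    lo = ceil(short_side / 2)               # smallest extent keeping ratio <= 2
    hi = long_side - lo                     # leaves at least `lo` for the sibling
    if lo > hi:                             # tile too small to split further
        return [tile]
    total = rect_cost(cost, x, y, w, h)

    def imbalance(a):
        # Cost difference between the two children for a cut at offset `a`.
        first = rect_cost(cost, x, y, a, h) if cut_width else rect_cost(cost, x, y, w, a)
        return abs(2 * first - total)

    a = min(range(lo, hi + 1), key=imbalance)
    if cut_width:
        first, second = (x, y, a, h), (x + a, y, w - a, h)
    else:
        first, second = (x, y, w, a), (x, y + a, w, h - a)
    return split_tile(cost, first, depth - 1) + split_tile(cost, second, depth - 1)

# Phase 0: no timings yet, so assume uniform cost on an 8x8 mesh, 4 tasks.
W = H = 8
uniform = [[1.0] * W for _ in range(H)]
tiles0 = split_tile(uniform, (0, 0, W, H), 2)

# Phase 1: made-up timings in which the first task was four times slower.
times0 = [16.0, 4.0, 4.0, 4.0]
cost1 = estimate_cell_costs(W, H, tiles0, times0)
tiles1 = split_tile(cost1, (0, 0, W, H), 2)
print(tiles1)
```

In this sketch the tree is rebuilt from scratch at every phase, which is simpler than updating an existing tree in place; the restriction of the cut to the range `lo..hi` is what keeps every child within the 2:1 aspect bound.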
Year: 2008
ISBN: 978-076953472-5
Files in this item:
File: cosenzaispdc08.pdf
Access: open access
Type: Pre-print document
License: DRM not defined
Size: 354.29 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11563/58924