Fault-adaptive Scheduling for Data Acquisition Networks

Stein, Eloise;Bramas, Quentin;Colombo, Tommaso;Pelsser, Cristel
(2023) 48th IEEE Conference on Local Computer Networks (LCN), Florida, USA (1 October 2023)

Files

main.pdf
  • Open Access
  • Adobe PDF
  • 314.67 KB

Details

Authors
  • Stein, Eloise
    Author
  • Bramas, Quentin
    Author
  • Colombo, Tommaso
    Author
  • Pelsser, Cristel
    Author
Abstract
Supporting an all-to-all traffic matrix is challenging because it can easily lead to congestion. Scheduling patterns are designed to avoid such congestion by spreading the communications over time: time is divided into phases, and the communications are distributed across those phases. However, current scheduling algorithms are not fault-tolerant. In this paper, we propose a fault-adaptive, congestion-free scheduling approach to support an all-to-all exchange in a fat-tree topology. Our approach consists of computing the minimum number of communication phases required to support the all-to-all exchange with the available links, and then scheduling the communications over these phases. It enables recovery from failures and makes optimal use of the remaining bandwidth. We show that our scheduling approach outperforms the most common approach, Linear-shift scheduling. Throughput improves by roughly 80% with our approach, even with only one link failure.
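For readers unfamiliar with phase-based scheduling, the following minimal sketch illustrates the Linear-shift baseline named in the abstract, as it is commonly described: in phase p, node i sends to node (i + p) mod N. It is not the fault-adaptive algorithm proposed in the paper. The function name `linear_shift_schedule` and the choice to omit the self-send phase (p = 0) are assumptions made for illustration, and the sketch ignores the fat-tree topology and link failures entirely.

```python
def linear_shift_schedule(n):
    """Build an illustrative Linear-shift schedule for n nodes.

    Returns a list of n - 1 phases. Each phase is a list of (src, dst)
    pairs in which node `src` sends to node (src + shift) % n.
    Hypothetical helper: the name and return layout are assumptions.
    """
    return [
        [(src, (src + shift) % n) for src in range(n)]
        for shift in range(1, n)  # shift 0 (self-send) is skipped here
    ]


if __name__ == "__main__":
    for phase, pairs in enumerate(linear_shift_schedule(4), start=1):
        print(f"phase {phase}: {pairs}")
```

Within each phase every node sends exactly once and receives exactly once, and over the N - 1 phases every ordered pair of distinct nodes communicates exactly once, which is how the pattern spreads an all-to-all exchange over time.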
Citations

Stein, E., Bramas, Q., Colombo, T., & Pelsser, C. (2023). Fault-adaptive Scheduling for Data Acquisition Networks. 48th IEEE Conference on Local Computer Networks (LCN), Florida, USA. https://doi.org/10.1109/LCN58197.2023.10223324