Exploiting Efficient and Scalable Shuffle Transfers in Future Data Center Networks - 2015
Distributed computing systems such as MapReduce transfer massive amounts of data across successive processing stages in data centers. Such shuffle transfers contribute most of the network traffic and make the network bandwidth a bottleneck. In many commonly used workloads, the data flows in such a transfer are highly correlated and are aggregated at the receiver side. To lower the network traffic and efficiently use the available network bandwidth, we propose to push the aggregation computation into the network and to parallelize the shuffle and reduce phases. In this paper, we first examine the gain and feasibility of in-network aggregation on BCube, a novel server-centric networking structure for future data centers. To exploit this gain, we model the in-network aggregation problem, which is NP-hard in BCube. We propose two approximate methods for building an efficient IRS-based incast aggregation tree and an SRS-based shuffle aggregation subgraph, based solely on the labels of their members and the data center topology. We further design scalable forwarding schemes based on Bloom filters to implement in-network aggregation over many concurrent shuffle transfers. Based on a prototype and large-scale simulations, we demonstrate that our approaches can significantly reduce the amount of network traffic and save data center resources. Our approaches for BCube can be adapted to other server-centric network structures for future data centers with minimal modifications.
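The traffic reduction the abstract describes can be illustrated with a minimal sketch (not the paper's implementation): when intermediate servers on the aggregation tree merge partial key-value results before forwarding, only distinct keys travel onward instead of every sender's pairs. The example below assumes a simple word-count-style workload with three senders and one in-network aggregation point.

```python
from collections import Counter

def aggregate_at_node(partitions):
    """Merge partial key-value results (e.g. word counts) at one server."""
    merged = Counter()
    for part in partitions:
        merged.update(part)
    return merged

# Three senders emit partial results over the same reducer key space.
senders = [
    Counter({"a": 3, "b": 1}),
    Counter({"a": 2, "c": 4}),
    Counter({"b": 5, "c": 1}),
]

# Without in-network aggregation: every key-value pair travels to the receiver.
naive_traffic = sum(len(p) for p in senders)

# With in-network aggregation: an intermediate server on the tree merges
# duplicate keys first, so only the distinct keys are forwarded.
merged = aggregate_at_node(senders)
aggregated_traffic = len(merged)

print(naive_traffic, aggregated_traffic)  # 6 3
```

The gain grows with key overlap across senders, which is why the paper targets workloads whose flows are highly correlated.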