dc.contributor.author: Rajasekaran, Sudarsanan
dc.contributor.author: Narang, Sanjoli
dc.contributor.author: Zabreyko, Anton A.
dc.contributor.author: Ghobadi, Manya
dc.date.accessioned: 2024-12-19T16:41:09Z
dc.date.available: 2024-12-19T16:41:09Z
dc.date.issued: 2024-11-18
dc.identifier.isbn: 979-8-4007-1272-2
dc.identifier.uri: https://hdl.handle.net/1721.1/157894
dc.description: HOTNETS ’24, November 18–19, 2024, Irvine, CA, USA [en_US]
dc.description.abstract: This paper argues that congestion control protocols in machine learning datacenters sit at a sweet spot between centralized and distributed flow scheduling solutions. We present MLTCP, a technique to augment today's congestion control algorithms to approximate an interleaved centralized flow schedule. At the heart of MLTCP lies a straight-forward principle based on a key conceptual insight: by scaling the congestion window size (or sending rate) based on the number of bytes sent at each iteration, MLTCP flows eventually converge into a schedule that reduces network contention. We demonstrate that MLTCP uses a gradient descent trend with a step taken at every training (or fine-tuning) iteration towards reducing network congestion among competing jobs. [en_US]
dc.publisher: ACM|The 23rd ACM Workshop on Hot Topics in Networks [en_US]
dc.relation.isversionof: 10.1145/3696348.3696878 [en_US]
dc.rights: Creative Commons Attribution [en_US]
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/ [en_US]
dc.source: Association for Computing Machinery [en_US]
dc.title: MLTCP: A Distributed Technique to Approximate Centralized Flow Scheduling For Machine Learning [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Rajasekaran, Sudarsanan, Narang, Sanjoli, Zabreyko, Anton A. and Ghobadi, Manya. 2024. "MLTCP: A Distributed Technique to Approximate Centralized Flow Scheduling For Machine Learning."
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science [en_US]
dc.identifier.mitlicense: PUBLISHER_CC
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/ConferencePaper [en_US]
eprint.status: http://purl.org/eprint/status/NonPeerReviewed [en_US]
dc.date.updated: 2024-12-01T08:53:35Z
dc.language.rfc3066: en
dc.rights.holder: The author(s)
dspace.date.submission: 2024-12-01T08:53:36Z
mit.license: PUBLISHER_CC
mit.metadata.status: Authority Work and Publication Information Needed [en_US]