invited-talk

Using Reinforcement Learning to Control Auto-Scaling of Distributed Applications

Author:
Gabriele Russo Russo

University of Rome Tor Vergata, Rome, Italy


0000-0001-8233-4570

ICPE '23 Companion: Companion of the 2023 ACM/SPEC International Conference on Performance Engineering, April 2023, Pages 137–138. https://doi.org/10.1145/3578245.3585427

Published: 15 April 2023


ABSTRACT

Modern distributed systems can benefit from the availability of large-scale and heterogeneous computing infrastructures. However, the complexity and dynamic nature of these environments also call for self-adaptation abilities, as guaranteeing efficient resource usage and acceptable service levels through static configurations is very difficult.

In this talk, we discuss a hierarchical auto-scaling approach for distributed applications, where application-level managers steer the overall process by supervising component-level adaptation managers. Following a bottom-up approach, we first discuss how to exploit model-free and model-based reinforcement learning to compute auto-scaling policies for each component. Then, we show how Bayesian optimization can be used to automatically configure the lower-level auto-scalers based on application-level objectives. As a case study, we consider distributed data stream processing applications, which process high-volume data flows in near real-time and cope with varying and unpredictable workloads.
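To make the component-level idea concrete, the following is a minimal sketch of model-free reinforcement learning (tabular Q-learning) applied to a toy auto-scaling problem. It is not the formulation presented in the talk: the state (replica count and discretized load), the three scaling actions, the cost weights, the capacity model, and the random-walk workload are all assumptions chosen for illustration, as are the function names.

```python
import random

# All constants below are hypothetical values chosen for illustration.
MAX_REPLICAS = 5             # upper bound on the component's parallelism
LOAD_LEVELS = [1, 2, 3, 4]   # discretized input rates
CAPACITY_PER_REPLICA = 1.0   # load units that one replica can serve
ACTIONS = (-1, 0, 1)         # scale in, do nothing, scale out

# Assumed cost weights: reconfiguration, resource usage, SLO violation.
W_RECONF, W_RES, W_SLO = 0.2, 0.3, 0.5


def valid_actions(replicas):
    """Actions that keep the replica count within [1, MAX_REPLICAS]."""
    return [a for a in ACTIONS if 1 <= replicas + a <= MAX_REPLICAS]


def step_cost(replicas, action, load):
    """Weighted cost of applying `action` while the component sees `load`."""
    new_replicas = replicas + action
    reconf = 1.0 if action != 0 else 0.0
    resource = new_replicas / MAX_REPLICAS
    violation = 1.0 if load > new_replicas * CAPACITY_PER_REPLICA else 0.0
    return W_RECONF * reconf + W_RES * resource + W_SLO * violation


def next_load(load, rng):
    """Toy workload: a clipped random walk over the load levels."""
    i = LOAD_LEVELS.index(load) + rng.choice((-1, 0, 1))
    return LOAD_LEVELS[min(max(i, 0), len(LOAD_LEVELS) - 1)]


def train(steps=200_000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn Q(state, action) as expected discounted cost (lower is better)."""
    rng = random.Random(seed)
    q = {(k, l, a): 0.0
         for k in range(1, MAX_REPLICAS + 1)
         for l in LOAD_LEVELS
         for a in valid_actions(k)}
    k, l = 1, LOAD_LEVELS[0]
    for _ in range(steps):
        acts = valid_actions(k)
        if rng.random() < epsilon:          # explore
            a = rng.choice(acts)
        else:                               # exploit: pick the cheapest action
            a = min(acts, key=lambda x: q[(k, l, x)])
        cost = step_cost(k, a, l)
        k2, l2 = k + a, next_load(l, rng)
        best_next = min(q[(k2, l2, x)] for x in valid_actions(k2))
        # Q-learning update, written for cost minimization.
        q[(k, l, a)] += alpha * (cost + gamma * best_next - q[(k, l, a)])
        k, l = k2, l2
    return q


def greedy_action(q, replicas, load):
    """Scaling decision suggested by the learned Q-table."""
    return min(valid_actions(replicas), key=lambda a: q[(replicas, load, a)])
```

After training, `greedy_action(train(), replicas, load)` returns the scaling decision for a given state. In a hierarchical setting like the one described above, the cost weights are the kind of lower-level parameters that a higher-level manager could tune; that outer loop is not shown here.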


Index Terms

1. Computer systems organization
  1. Architectures
    1. Distributed architectures
      1. Cloud computing
2. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Reinforcement learning


Published in
ICPE '23 Companion: Companion of the 2023 ACM/SPEC International Conference on Performance Engineering
April 2023
421 pages
ISBN:9798400700729
DOI:10.1145/3578245
General Chairs: Marco Vieira (University of Coimbra, Portugal), Valeria Cardellini (University of Rome Tor Vergata, Italy)
Program Chairs: Antinisca Di Marco (University of L'Aquila, Italy), Petr Tuma (Charles University, Czechia)
Copyright © 2023 Owner/Author
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 15 April 2023
Author Tags
reinforcement learning
auto-scaling
Qualifiers
- invited-talk

Acceptance Rates
Overall Acceptance Rate 208 of 674 submissions, 31%
Upcoming Conference
ICPE '23: ACM/SPEC International Conference on Performance Engineering
April 15 - 19, 2023
Coimbra, Portugal
Sponsors: SIGSOFT, SIGMETRICS