research-article

Extracting clean performance models from tainted programs

Authors:
Marcin Copik

ETH Zurich, Zurich, Switzerland

ETH Zurich, Zurich, Switzerland
View Profile

,
Alexandru Calotoiu

ETH Zurich, Zurich, Switzerland

ETH Zurich, Zurich, Switzerland
View Profile

,
Tobias Grosser

University of Edinburgh, Edinburgh, United Kindgom

University of Edinburgh, Edinburgh, United Kindgom
View Profile

,
Nicolas Wicki

ETH Zurich, Zurich, Switzerland

ETH Zurich, Zurich, Switzerland
View Profile

,
Felix Wolf

Technical University of Darmstadt, Darmstadt, Germany

Technical University of Darmstadt, Darmstadt, Germany
View Profile

,
Torsten Hoefler

ETH Zurich, Zurich, Switzerland

ETH Zurich, Zurich, Switzerland
View Profile

PPoPP '21: Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel ProgrammingFebruary 2021 Pages 403–417https://doi.org/10.1145/3437801.3441613

Published:17 February 2021Publication History

PPoPP '21: Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

Pages 403–417

ABSTRACT

Performance models are well-known instruments to understand the scaling behavior of parallel applications. They express how performance changes as key execution parameters, such as the number of processes or the size of the input problem, vary. Besides reasoning about program behavior, such models can also be automatically derived from performance data. This is called empirical performance modeling. While this sounds simple at the first glance, this approach faces several serious interrelated challenges, including expensive performance measurements, inaccuracies inflicted by noisy benchmark data, and overall complex experiment design, starting with the selection of the right parameters. The more parameters one considers, the more experiments are needed and the stronger the impact of noise. In this paper, we show how taint analysis, a technique borrowed from the domain of computer security, can substantially improve the modeling process, lowering its cost, improving model quality, and help validate performance models and experimental setups.

References

2018. Extra-P 3.0. http://www.scalasca.org/software/extra-p/download.html.Google Scholar
Sriram Aananthakrishnan, Greg Bronevetsky, and Ganesh Gopalakrishnan. 2013. Hybrid Approach for Data-flow Analysis of MPI Programs. In Proceedings of the 27th International ACM Conference on International Conference on Supercomputing (Eugene, Oregon, USA) (ICS '13). ACM, New York, NY, USA, 455--456.Google ScholarDigital Library
Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman. 1986. Compilers: Principles, Techniques, and Tools. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA.Google ScholarDigital Library
D. an Mey, S. Biersdorff, C. Bischof, K. Diethelm, D. Eschweiler, M. Gerndt, A. Knüpfer, D. Lorenz, A. D. Malony, W. E. Nagel, Y. Oleynik, C. Rössel, P. Saviankou, D. Schmidl, S. S. Shende, M. Wagner, B. Wesarg, and F. Wolf. 2012. Score-P: A Unified Performance Measurement System for Petascale Applications. In Proc. of the CiHPC: Competence in High Performance Computing, HPC Status Konferenz der Gauß-Allianz e.V., Schwetzingen, Germany, June 2010. Gauß-Allianz, Springer, 85--97. Google ScholarCross Ref
G. Bauer, S. Gottlieb, and T. Hoefler. 2012. Performance Modeling and Comparative Analysis of the MILC Lattice QCD Application su3 rmd. In Proc. of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid) (Ottawa, Canada). IEEE Computer Society, 652--659.Google Scholar
Claude Bernard, Michael C. Ogilvie, Thomas A. Degrand, Carleton E. Detar, Steven A. Gottlieb, A. Krasnitz, R.L. Sugar, and D. Toussaint. 1991. Studying Quarks and Gluons On Mimd Parallel Computers. Int. J. High Perform. Comput. Appl. 5, 4 (Dec. 1991), 61--70.Google Scholar
A. Bhattacharyya, G. Kwasniewski, and T. Hoefler. 2015. Using Compiler Techniques to Improve Automatic Performance Modeling. In 2015 International Conference on Parallel Architecture and Compilation (PACT). 468--479.Google Scholar
Alexandru Calotoiu, David Beckingsale, Christopher W. Earl, Torsten Hoefler, Ian Karlin, Martin Schulz, and Felix Wolf. 2016. Fast Multi-Parameter Performance Modeling. In Proc. of the 2016 IEEE International Conference on Cluster Computing (CLUSTER), Taipei, Taiwan. IEEE Computer Society, 172--181.Google ScholarCross Ref
Alexandru Calotoiu, Alexander Graf, Torsten Hoefler, Daniel Lorenz, Sebastian Rinke, and Felix Wolf. 2018. Lightweight Requirements Engineering for Exascale Co-design. IEEE. To appear in IEEE International Conference on Cluster Computing (Cluster'18).Google Scholar
Alexandru Calotoiu, Alexander Graf, Torsten Hoefler, Daniel Lorenz, Sebastian Rinke, and Felix Wolf. 2018. Lightweight Requirements Engineering for Exascale Co-design. In Proc. of the 2018 IEEE International Conference on Cluster Computing (CLUSTER), Belfast, UK. IEEE, 1--11.Google ScholarCross Ref
A. Calotoiu, T. Hoefler, M. Poke, and F. Wolf. 2013. Using Automated Performance Modeling to Find Scalability Bugs in Complex Codes. IEEE/ACM International Conference on High Performance Computing, Networking, Storage and Analysis (SC13).Google Scholar
Alexandru Calotoiu, Torsten Hoefler, Marius Poke, and Felix Wolf. 2013. Using Automated Performance Modeling to Find Scalability Bugs in Complex Codes. In Proc. of the ACM/IEEE Conference on Supercomputing (SC13), Denver, CO, USA. 1--12.Google ScholarDigital Library
Alonzo Church. 1936. An unsolvable problem of elementary number theory. American journal of mathematics 58, 2 (1936), 345--363.Google Scholar
James Clause, Wanchun Li, and Alessandro Orso. 2007. Dytan: a generic dynamic taint analysis framework. In Proceedings of the 2007 international symposium on Software testing and analysis. ACM, 196--206.Google ScholarDigital Library
James Clause, Wanchun Li, and Alessandro Orso. 2007. Dytan: A Generic Dynamic Taint Analysis Framework. In Proceedings of the 2007 International Symposium on Software Testing and Analysis (London, United Kingdom) (ISSTA '07). Association for Computing Machinery, New York, NY, USA, 196--206. Google ScholarDigital Library
dfsan 2019. Clang 9 Documentation - DataFlowSanitizer. https://clang.llvm.org/docs/DataFlowSanitizer.html.Google Scholar
X. Fu, Z. Chen, Y. Zhang, C. Huang, W. Dong, and J. Wang. 2015. MPISE: Symbolic Execution of MPI Programs. In 2015 IEEE 16th International Symposium on High Assurance Systems Engineering. 181--188.Google Scholar
Dan Gohman. 2009. ScalarEvolution and Loop Optimization. Talk at LLVM Developer's Meeting.Google Scholar
S. F. Goldsmith, A. S. Aiken, and D. S. Wilkerson. 2007. Measuring Empirical Computational Complexity. In Proc. of the the 6th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering (Dubrovnik, Croatia) (ESEC-FSE '07). ACM, New York, NY, USA, 395--404.Google Scholar
P. Gschwandtner, A. Hirsch, S. Benedict, and T. Fahringer. 2018. Towards Automatic Compiler-assisted Performance and Energy Modeling for Message Passing Parallel Programs. In ARCS Workshop 2018; 31th International Conference on Architecture of Computing Systems. 1--8.Google Scholar
J. Hammer, G. Hager, J. Eitzinger, and G. Wellein. 2015. Automatic Loop Kernel Analysis and Performance Modeling With Kerncraft. CoRR abs/1509.03778 (2015). http://arxiv.org/abs/1509.03778Google Scholar
Torsten Hoefler, William Gropp, William Kramer, and Marc Snir. 2011. Performance Modeling for Systematic Performance Tuning. In State of the Practice Reports (Seattle, Washington) (SC '11). ACM, New York, NY, USA, Article 6, 12 pages.Google Scholar
T. Hoefler and D. Moor. 2014. Energy, Memory, and Runtime Tradeoffs for Implementing Collective Communication Operations. Journal of Supercomputing Frontiers and Innovations 1, 2 (Oct. 2014), 58--75.Google Scholar
T. Hoefler, T. Schneider, and A. Lumsdaine. 2010. Characterizing the Influence of System Noise on Large-Scale Applications by Simulation. In International Conference for High Performance Computing, Networking, Storage and Analysis (SC'10).Google Scholar
Kashif Ilyas, Alexandru Calotoiu, and Felix Wolf. 2017. Off-Road Performance Modeling - How to Deal with Segmented Data. In Proc. of the 23rd Euro-Par Conference, Santiago de Compostela, Spain (Lecture Notes in Computer Science, Vol. 10417). Springer, 36--48. Google ScholarCross Ref
Engin Ipek, Bronis R. de Supinski, Martin Schulz, and Sally A. McKee. 2005. An Approach to Performance Prediction for Parallel Applications. In Proceedings of the 11th International Euro-Par Conference on Parallel Processing (Lisbon, Portugal) (Euro-Par'05). Springer-Verlag, Berlin, Heidelberg, 196--205.Google Scholar
A. Jayakumar, P. Murali, and S. Vadhiyar. 2015. Matching Application Signatures for Performance Predictions Using a Single Execution. In Proc. of the 29th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2015). 1161--1170.Google Scholar
Min Gyung Kang, Stephen McCamant, Pongsin Poosankam, and Dawn Song. 2011. DTA++: Dynamic Taint Analysis with Targeted Control-Flow Propagation. In Proceedings of the Network and Distributed System Security Symposium, NDSS 2011, San Diego, California, USA, 6th February - 9th February 2011. The Internet Society.Google Scholar
Ian Karlin, Jeff Keasler, and Rob Neely. 2013. LULESH 2.0 Updates and Changes. Technical Report LLNL-TR-641973. 1--9 pages.Google Scholar
Vasileios P. Kemerlis, Georgios Portokalidis, Kangkook Jee, and Angelos D. Keromytis. 2012. Libdft: Practical Dynamic Data Flow Tracking for Commodity Systems. In Proceedings of the 8th ACM SIGPLAN/SIGOPS Conference on Virtual Execution Environments (London, England, UK) (VEE '12). Association for Computing Machinery, New York, NY, USA, 121--132. Google ScholarDigital Library
D. J. Kerbyson, H. J. Alme, A. Hoisie, F. Petrini, H. J. Wasserman, and M. Gittings. 2001. Predictive Performance and Scalability Modeling of a Large-Scale Application. In SC '01: Proceedings of the 2001 ACM/IEEE Conference on Supercomputing. 39--39.Google Scholar
M. Kuhnemann, T. Rauber, and G. Runger. 2004. A source code analyzer for performance prediction. In 18th International Parallel and Distributed Processing Symposium, 2004. Proceedings. 262--.Google Scholar
C. Lattner and V. Adve. 2004. LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation. In Proc. of the International Symposium on Code Generation and Optimization: Feedback-directed and Runtime Optimization (Palo Alto, California) (CGO '04). IEEE Computer Society, Washington, DC, USA.Google Scholar
Benjamin C. Lee, David M. Brooks, Bronis R. de Supinski, Martin Schulz, Karan Singh, and Sally A. McKee. 2007. Methods of inference and learning for performance modeling of parallel applications. In Proc. of the 12th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (San Jose, California, USA) ((PPoPP '07)). ACM, 249--258.Google Scholar
Seyong Lee, Jeremy S. Meredith, and Jeffrey S. Vetter. 2015. COMPASS: A Framework for Automated Performance Modeling and Prediction. In Proceedings of the 29th ACM on International Conference on Supercomputing (Newport Beach, California, USA) (ICS '15). ACM, New York, NY, USA, 405--414.Google Scholar
Y. J. Lo, S. Williams, B. Van Straalen, T. J. Ligocki, M. J. Cordery, N. J. Wright, M. W. Hall, and L. Oliker. 2014. Roofline Model Toolkit: A practical tool for architectural and program analysis. In High Performance Computing Systems. Performance Modeling, Benchmarking, and Simulation. Springer, 129--148.Google Scholar
G. Marin and J. Mellor-Crummey. 2004. Cross-architecture performance predictions for scientific applications using parameterized models. SIGMETRICS Performance Eval. Review 32, 1 (June 2004), 2--13.Google Scholar
K. Meng and B. Norris. 2017. Mira: A Framework for Static Performance Analysis. In 2017 IEEE International Conference on Cluster Computing (CLUSTER). 103--113.Google Scholar
M. R. Meswani, L. Carrington, D. Unat, A. Snavely, S. Baden, and S. Poole. 2013. Modeling and Predicting Performance of High Performance Computing Applications on Hardware Accelerators. Int. J. High Perform. Comput. Appl. 27, 2 (May 2013), 89--108. Google ScholarDigital Library
Henry Gordon Rice. 1953. Classes of recursively enumerable sets and their decision problems. Trans. Amer. Math. Soc. 74, 2 (1953), 358--366.Google ScholarCross Ref
Marcus Ritter, Alexandru Calotoiu, Thorsten Reimann, Torsten Hoefler, and Felix Wolf. 2020. Performance Modeling at a Discount. IEEE. Accepted at the 34th IEEE International Parallel & Distributed Processing Symposium (IPDPS'20).Google Scholar
Marcus Ritter, Alexandru Calotoiu, Sebastian Rinke, Thorsten Reimann, Torsten Hoefler, and Felix Wolf. 2020. Learning Cost-Effective Sampling Strategies for Empirical Performance Modeling. In Proc. of the 34th IEEE International Parallel and Distributed Processing Symposium (IPDPS), New Orleans, LA, USA. IEEE Computer Society. (to appear).Google ScholarCross Ref
Philip C. Roth and Jeremy S. Meredith. 2014. Value Influence Analysis for Message Passing Applications. In Proceedings of the 28th ACM International Conference on Supercomputing (Munich, Germany) (ICS '14). ACM, New York, NY, USA, 145--154.Google Scholar
Marc Shapiro and Susan Horwitz. 1997. The effects of the precision of pointer analysis. In International Static Analysis Symposium. Springer, 16--34.Google ScholarCross Ref
Dongdong She, Yizheng Chen, Abhishek Shah, Baishakhi Ray, and Suman Jana. 2019. Neutaint: Efficient Dynamic Taint Analysis with Neural Networks. arXiv:1907.03756 [cs.CR]Google Scholar
Sergei Shudler, Alexandru Calotoiu, Torsten Hoefler, Alexandre Strube, and Felix Wolf. 2015. Exascaling Your Library: Will Your Implementation Meet Your Expectations?. In Proc. of the International Conference on Supercomputing (ICS), Newport Beach, CA, USA. 1--11.Google ScholarDigital Library
Norbert Siegmund, Alexander Grebhahn, Sven Apel, and Christian Kästner. 2015. Performance-influence Models for Highly Configurable Systems. In Proc. of the 2015 10th Joint Meeting on Foundations of Software Engineering (Bergamo, Italy) (ESEC/FSE 2015). ACM, New York, NY, USA, 284--294.Google ScholarDigital Library
Vítor Sousa, Daniel de Oliveira, Patrick Valduriez, and Marta Mattoso. 2018. DfAnalyzer: Runtime Dataflow Analysis of Scientific Applications using Provenance. Proceedings of the VLDB Endowment 11 (08 2018).Google Scholar
K. L. Spafford and J. S. Vetter. 2012. Aspen: A Domain Specific Language for Performance Modeling. In Proc. of the International Conference on High Performance Computing, Networking, Storage and Analysis (Salt Lake City, Utah) (SC '12). IEEE Computer Society Press, Los Alamitos, CA, USA, Article 84, 11 pages.Google Scholar
Yulei Sui and Jingling Xue. 2016. SVF: interprocedural static value-flow analysis in LLVM. In Proceedings of the 25th international conference on compiler construction. ACM, 265--266.Google ScholarDigital Library
N. R. Tallent and A. Hoisie. 2014. Palm: Easing the Burden of Analytical Performance Modeling. In Proc. of the 28th ACM International Conference on Supercomputing (Munich, Germany) (ICS '14). ACM, New York, NY, USA, 221--230. Google ScholarDigital Library
Rajeev Thakur, Rolf Rabenseifner, and William Gropp. 2005. Optimization of Collective Communication Operations in MPICH. Int. J. High Perform. Comput. Appl. 19, 1 (Feb. 2005), 49--66.Google ScholarDigital Library
Sebastian Unger and Frank Mueller. 2002. Handling irreducible loops: optimized node splitting versus DJ-graphs. ACM Transactions on Programming Languages and Systems (TOPLAS) 24, 4 (2002), 299--333.Google ScholarDigital Library
R. Vuduc, J. W. Demmel, and J. A. Bilmes. 2004. Statistical Models for Empirical Search-Based Performance Tuning. Int. J. High Perform. Comput. Appl. 18, 1 (Feb. 2004), 65--94.Google ScholarDigital Library
Xing Wu and Frank Müller. 2012. ScalaExtrap: Trace-Based Communication Extrapolation for SPMD Programs. ACM Transactions on Programming Languages and Systems 34, 1 (April 2012).Google ScholarDigital Library
Hengbiao Yu. 2018. Combining Symbolic Execution and Model Checking to Verify MPI Programs. In Proceedings of the 40th International Conference on Software Engineering: Companion Proceeedings (Gothenburg, Sweden) (ICSE '18). Association for Computing Machinery, New York, NY, USA, 527--530. Google ScholarDigital Library
D. Zaparanuks and M. Hauswirth. 2012. Algorithmic Profiling. SIGPLAN Not. 47, 6 (June 2012), 67--76.Google ScholarDigital Library
Angeliki Zavou, Georgios Portokalidis, and Angelos D. Keromytis. 2011. Taint-Exchange: A Generic System for Cross-Process and Cross-Host Taint Tracking. In Advances in Information and Computer Security, Tetsu Iwata and Masakatsu Nishigaki (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 113--128.Google Scholar
Jidong Zhai, Wenguang Chen, and Weimin Zheng. 2010. PHANTOM: predicting performance of parallel applications on large-scale parallel machines using a single node. SIGPLAN Notices 45, 5 (January 2010), 305--314.Google ScholarDigital Library

Index Terms

Extracting clean performance models from tainted programs
1. Computing methodologies
  1. Modeling and simulation
    1. Model development and analysis
      1. Model verification and validation
      2. Modeling methodologies
2. Theory of computation
  1. Models of computation
    1. Concurrency
  2. Semantics and reasoning
    1. Program semantics

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
PPoPP '21: Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
February 2021
507 pages
ISBN:9781450382946
DOI:10.1145/3437801
General Chair:
Jaejin Lee
Seoul National University, South Korea
,
Program Chair:
Erez Petrank
Technion, Israel
Copyright © 2021 ACM
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 17 February 2021
Permissions
Request permissions about this article.
Request Permissions

Check for updates
Badges
- Artifacts Evaluated & Functional / v1.1
- Artifacts Available / v1.1
Author Tags
taint analysis
high-performance computing
LLVM
compiler techniques
performance modeling
Qualifiers
- research-article
Conference

Acceptance Rates
PPoPP '21 Paper Acceptance Rate 31 of 150 submissions, 21%Overall Acceptance Rate 230 of 1,014 submissions, 23%
More
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 3
  Total Citations
  View Citations
- 274
  Total Downloads
- Downloads (Last 12 months)65
- Downloads (Last 6 weeks)5
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Extracting clean performance models from tainted programs

Save to Binder

PPoPP '21: Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

ABSTRACT

References

Cited By

Index Terms

Extracting clean performance models from tainted programs