Abstract
The severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) replication-transcription complex (RTC) is a multi-domain protein responsible for replicating and transcribing the viral mRNA inside a human cell. Attacking RTC function with pharmaceutical compounds is a pathway to treating COVID-19. Conventional tools, e.g., cryo-electron microscopy and all-atom molecular dynamics (AAMD), do not provide sufficiently high resolution or sufficiently long timescales to capture important dynamics of this molecular machine. Consequently, we develop an innovative workflow that bridges the gap between these resolutions, using mesoscale fluctuating finite element analysis (FFEA) continuum simulations and a hierarchy of AI methods that continually learn and infer features to maintain consistency between AAMD and FFEA simulations. We leverage a multi-site distributed workflow manager to orchestrate AI, FFEA, and AAMD jobs, providing optimal resource utilization across HPC centers. Our study provides unprecedented access to the SARS-CoV-2 RTC machinery, while providing a general capability for AI-enabled multi-resolution simulations at scale.
Intelligent resolution: Integrating Cryo-EM with AI-driven multi-resolution simulations to observe the severe acute respiratory syndrome coronavirus-2 replication-transcription machinery in action