Author: Bosilca, George : Search

24 Results for: Author: Bosilca, George

Searched The ACM Full-Text Collection (691,749 records)

Showing 1 - 20 of 24 Results

research-article
November 2022
Reshaping geostatistical modeling and prediction for extreme-scale environmental applications
SC '22: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, November 2022, Article No.: 2, pp 1–12

We extend the capability of space-time geostatistical modeling using algebraic approximations, illustrating application-expected accuracy worthy of double precision from majority low-precision computations and low-rank matrix approximations. We exploit ...
Metrics: Total Citations: 0; Total Downloads: 71; Last 12 Months: 71; Last 6 weeks: 20
Supplementary Material: reshaping_geostatistical_modeling_and_prediction_for_extreme-scale_environmental_applications.mp4 (1080p).mp4
research-article
November 2020
Task bench: a parameterized benchmark for evaluating parallel runtime performance
SC '20: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, November 2020, Article No.: 62, pp 1–15

We present Task Bench, a parameterized benchmark designed to explore the performance of distributed programming systems under a variety of application scenarios. Task Bench dramatically lowers the barrier to benchmarking and comparing multiple ...
Metrics: Total Citations: 0; Total Downloads: 162; Last 12 Months: 17; Last 6 weeks: 2
research-article
October 2020
Published By ACM
Using Advanced Vector Extensions AVX-512 for MPI Reductions
EuroMPI/USA '20: Proceedings of the 27th European MPI Users' Group Meeting, September 2020, pp 1–10, https://doi.org/10.1145/3416315.3416316

As the scale of high-performance computing (HPC) systems continues to grow, researchers have devoted themselves to exploring increasing levels of parallelism to achieve optimal performance. The modern CPU’s design, including its features of hierarchical ...
Metrics: Total Citations: 3; Total Downloads: 132; Last 12 Months: 36; Last 6 weeks: 3
research-article
June 2020
Published By ACM
Extreme-Scale Task-Based Cholesky Factorization Toward Climate and Weather Prediction Applications
PASC '20: Proceedings of the Platform for Advanced Scientific Computing Conference, June 2020, Article No.: 2, pp 1–11, https://doi.org/10.1145/3394277.3401846

Climate and weather can be predicted statistically via geospatial Maximum Likelihood Estimates (MLE), as an alternative to running large ensembles of forward models. The MLE-based iterative optimization procedure requires the solving of large-scale ...
Metrics: Total Citations: 17; Total Downloads: 373; Last 12 Months: 61; Last 6 weeks: 5
research-article
Public Access
June 2020
Published By ACM
FFT-based Gradient Sparsification for the Distributed Training of Deep Neural Networks
HPDC '20: Proceedings of the 29th International Symposium on High-Performance Parallel and Distributed Computing, June 2020, pp 113–124, https://doi.org/10.1145/3369583.3392681

The performance and efficiency of distributed training of Deep Neural Networks (DNN) highly depend on the performance of gradient averaging among participating processes, a step bound by communication costs. There are two major approaches to reduce ...
Metrics: Total Citations: 8; Total Downloads: 433; Last 12 Months: 129; Last 6 weeks: 12
Supplementary Material: 3369583.3392681.mp4
research-article
September 2019
Published By ACM
Runtime level failure detection and propagation in HPC systems
EuroMPI '19: Proceedings of the 26th European MPI Users' Group Meeting, September 2019, Article No.: 14, pp 1–11, https://doi.org/10.1145/3343211.3343225

As the scale of high-performance computing (HPC) systems continues to grow, mean-time-to-failure (MTTF) of these HPC systems is negatively impacted and tends to decrease. In order to efficiently run long computing jobs on these systems, handling system ...
Metrics: Total Citations: 8; Total Downloads: 163; Last 12 Months: 22; Last 6 weeks: 2
research-article
June 2018
Published By ACM
ADAPT: an event-based adaptive collective communication framework
HPDC '18: Proceedings of the 27th International Symposium on High-Performance Parallel and Distributed Computing, June 2018, pp 118–130, https://doi.org/10.1145/3208040.3208054

The increase in scale and heterogeneity of high-performance computing (HPC) systems predispose the performance of Message Passing Interface (MPI) collective communications to be susceptible to noise, and to adapt to a complex mix of hardware ...
Metrics: Total Citations: 12; Total Downloads: 361; Last 12 Months: 48; Last 6 weeks: 3
research-article
November 2017
Published By ACM
Dynamic task discovery in PaRSEC: a data-flow task-based runtime
ScalA '17: Proceedings of the 8th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, November 2017, Article No.: 6, pp 1–8, https://doi.org/10.1145/3148226.3148233

Successfully exploiting distributed collections of heterogeneous many-cores architectures with complex memory hierarchy through a portable programming model is a challenge for application developers. The literature is not short of proposals addressing ...
Metrics: Total Citations: 40; Total Downloads: 294; Last 12 Months: 79; Last 6 weeks: 15
research-article
October 2017
Published By ACM
Efficient Communications in Training Large Scale Neural Networks
Thematic Workshops '17: Proceedings of the on Thematic Workshops of ACM Multimedia 2017, October 2017, pp 110–116, https://doi.org/10.1145/3126686.3126749

We consider the problem of how to reduce the cost of communication that is required for the parallel training of a neural network. The state-of-the-art method, Bulk Synchronous Parallel Stochastic Gradient Descent (BSP-SGD), requires many collective ...
Metrics: Total Citations: 4; Total Downloads: 156; Last 12 Months: 9; Last 6 weeks: 1
research-article
Public Access
September 2017
Published By ACM
Using software-based performance counters to expose low-level open MPI performance information
EuroMPI '17: Proceedings of the 24th European MPI Users' Group Meeting, September 2017, Article No.: 7, pp 1–8, https://doi.org/10.1145/3127024.3127039

This paper details the implementation and usage of software-based performance counters to understand the performance of a particular implementation of the MPI standard, Open MPI. Such counters can expose intrinsic features of the software stack that are ...
Metrics: Total Citations: 8; Total Downloads: 255; Last 12 Months: 83; Last 6 weeks: 10
research-article
November 2016
Failure detection and propagation in HPC systems
SC '16: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, November 2016, Article No.: 27, pp 1–11

Building an infrastructure for Exascale applications requires, in addition to many other key components, a stable and efficient failure detector. This paper describes the design and evaluation of a robust failure detector, able to maintain and ...
Metrics: Total Citations: 4; Total Downloads: 228; Last 12 Months: 11; Last 6 weeks: 0
research-article
May 2016
Published By ACM
GPU-Aware Non-contiguous Data Movement In Open MPI
HPDC '16: Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing, May 2016, pp 231–242, https://doi.org/10.1145/2907294.2907317

Due to better parallel density and power efficiency, GPUs have become more popular for use in scientific applications. Many of these applications are based on the ubiquitous Message Passing Interface (MPI) programming paradigm, and take advantage of ...
Metrics: Total Citations: 11; Total Downloads: 239; Last 12 Months: 26; Last 6 weeks: 2
research-article
November 2015
Published By ACM
Practical scalable consensus for pseudo-synchronous distributed systems
SC '15: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, November 2015, Article No.: 31, pp 1–12, https://doi.org/10.1145/2807591.2807665

The ability to consistently handle faults in a distributed environment requires, among a small set of basic routines, an agreement algorithm allowing surviving entities to reach a consensual decision between a bounded set of volatile resources. This ...
Metrics: Total Citations: 12; Total Downloads: 216; Last 12 Months: 7; Last 6 weeks: 0
research-article
Open Access
September 2015
Published By ACM
Sliding Substitution of Failed Nodes
EuroMPI '15: Proceedings of the 22nd European MPI Users' Group Meeting, September 2015, Article No.: 14, pp 1–10, https://doi.org/10.1145/2802658.2802670

This paper considers the questions of how spare nodes should be allocated, how to substitute them for faulty nodes, and how much the communication performance is affected by such a substitution. The third question stems from the modification of the rank ...
Metrics: Total Citations: 6; Total Downloads: 279; Last 12 Months: 19; Last 6 weeks: 0
research-article
September 2015
Published By ACM
Plan B: Interruption of Ongoing MPI Operations to Support Failure Recovery
EuroMPI '15: Proceedings of the 22nd European MPI Users' Group Meeting, September 2015, Article No.: 11, pp 1–9, https://doi.org/10.1145/2802658.2802668

Advanced failure recovery strategies in HPC systems benefit tremendously from in-place failure recovery, in which the MPI infrastructure can survive process crashes and resume communication services. In this paper we present the rationale behind the ...
Metrics: Total Citations: 8; Total Downloads: 82; Last 12 Months: 2; Last 6 weeks: 1
research-article
Public Access
February 2015
Published By ACM
Algorithm-Based Fault Tolerance for Dense Matrix Factorizations, Multiple Failures and Accuracy
ACM Transactions on Parallel Computing (TOPC), Volume 1, Issue 2, January 2015, Article No.: 10, pp 1–28, https://doi.org/10.1145/2686892

Dense matrix factorizations, such as LU, Cholesky and QR, are widely used for scientific applications that require solving systems of linear equations, eigenvalues and linear least squares problems. Such computations are normally carried out on ...
Metrics: Total Citations: 17; Total Downloads: 529; Last 12 Months: 46; Last 6 weeks: 1
research-article
November 2014
PTG: an abstraction for unhindered parallelism
WOLFHPC '14: Proceedings of the Fourth International Workshop on Domain-Specific Languages and High-Level Frameworks for High Performance Computing, November 2014, pp 21–30

Increased parallelism and use of heterogeneous computing resources is now an established trend in High Performance Computing (HPC), a trend that, looking forward to Exascale, seems bound to intensify. Despite the evolution of hardware over the past ...
Metrics: Total Citations: 2; Total Downloads: 79; Last 12 Months: 8; Last 6 weeks: 2
research-article
October 2014
Published By ACM
A Multithreaded Communication Substrate for OpenSHMEM
PGAS '14: Proceedings of the 8th International Conference on Partitioned Global Address Space Programming Models, October 2014, Article No.: 16, pp 1–2, https://doi.org/10.1145/2676870.2676895

OpenSHMEM scalability is strongly dependent on the capability of its communication layer to efficiently handle multiple threads. In this paper, we present an early evaluation of the thread safety specification in the Unified Common Communication ...
Metrics: Total Citations: 1; Total Downloads: 35; Last 12 Months: 0; Last 6 weeks: 0
research-article
September 2014
Published By ACM
Optimizations to enhance sustainability of MPI applications
EuroMPI/ASIA '14: Proceedings of the 21st European MPI Users' Group Meeting, September 2014, pp 145–150, https://doi.org/10.1145/2642769.2642797

Ultrascale computing systems are likely to reach speeds of two or three orders of magnitude greater than today's computing systems. However, to achieve this level of performance, we need to design and implement more sustainable solutions for ultra-scale ...
Metrics: Total Citations: 8; Total Downloads: 112; Last 12 Months: 6; Last 6 weeks: 0
research-article
November 2013
Published By ACM
CPU-GPU hybrid bidiagonal reduction with soft error resilience
ScalA '13: Proceedings of the Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, November 2013, Article No.: 2, pp 1–5, https://doi.org/10.1145/2530268.2530270

Soft errors pose a real challenge to applications running on modern hardware as the feature size becomes smaller and the integration density increases for both the modern processors and the memory chips. Soft errors manifest themselves as bit-flips that ...
Metrics: Total Citations: 6; Total Downloads: 108; Last 12 Months: 2; Last 6 weeks: 1
