Department of Computer Science, The University of Tennessee, Knoxville, USA
,Department of Computer Science, The University of Tennessee, Knoxville, USA
,Department of Computer Science, The University of Tennessee, Knoxville, USA
Computer Science and Mathematics Division, Oak Ridge National Laboratory, USA
Computer Science and Mathematics Schools, The University of Manchester, UK
The number of processors embedded in high performance computing platforms is growing daily to solve larger and more complex problems. However, as the number of components increases, so does the probability of failure. The logical network ...
University of Tennessee
,Lawrence Berkeley National Laboratory
,Lawrence Berkeley National Laboratory
,University of Tennessee
,Lawrence Berkeley National Laboratory
,University of Tennessee
Many scientific applications rely on sparse direct solvers for their numerical robustness. However, performance optimization for these solvers remains a challenging task, especially on GPUs. This is due to workloads of small dense matrices that are ...
University of Tennessee
,King Abdullah University of Science and Technology, Thuwal, KSA
,King Abdullah University of Science and Technology, Thuwal, KSA
,University of Tennessee
,King Abdullah University of Science and Technology, Thuwal, KSA
,University of Tennessee
,University of Tennessee and The Oak Ridge National Laboratory and University of Manchester
,King Abdullah University of Science and Technology, Thuwal, KSA
,King Abdullah University of Science and Technology, Thuwal, KSA
,King Abdullah University of Science and Technology, Thuwal, KSA
,King Abdullah University of Science and Technology, Thuwal, KSA
We extend the capability of space-time geostatistical modeling using algebraic approximations, illustrating application-expected accuracy worthy of double precision from majority low-precision computations and low-rank matrix approximations. We exploit ...
University of Tennessee, Knoxville, USA
,University of Tennessee, Knoxville, USA
,University of Tennessee, Knoxville, USA
Oak Ridge National Laboratory, Oak Ridge, USA
University of Manchester, Manchester, UK
QR factorization of dense matrices is a ubiquitous tool in high performance computing (HPC). From solving linear systems and least squares problems to eigenvalue problems and singular value decompositions, the impact of a high performance QR ...
The University of Tennessee, 1122 Volunteer Blvd, Knoxville, TN 37996, United States of America
,The University of Tennessee, 1122 Volunteer Blvd, Knoxville, TN 37996, United States of America
,The University of Tennessee, 1122 Volunteer Blvd, Knoxville, TN 37996, United States of America
,The University of Tennessee, 1122 Volunteer Blvd, Knoxville, TN 37996, United States of America
The modern CPU’s design, including the deep memory hierarchies and SIMD/vectorization capability, has a more significant impact on algorithms’ efficiency than the modest frequency increase observed recently. The current introduction of ...
Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, Livermore, CA, USA
,Mathematics and Computer Science, Argonne National Laboratory, Lemont, IL, USA
Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA
Department of Mechanical Science and Engineering, University of Illinois at Urbana-Champaign, Urbana, IL, USA
,Mathematics and Computer Science, Argonne National Laboratory, Lemont, IL, USA
,Innovative Computing Laboratory, University of Tennessee, Knoxville, TN, USA
,Department of Computer Science, University of Colorado, Boulder, CO, USA
,Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, Livermore, CA, USA
,Department of Mathematics, Virginia Tech, Blacksburg, VA, USA
,Innovative Computing Laboratory, University of Tennessee, Knoxville, TN, USA
,Scientific Computation Research Center, Rensselaer Polytechnic Institute, Troy, NY, USA
,Innovative Computing Laboratory, University of Tennessee, Knoxville, TN, USA
,Department of Computer Science, University of Colorado, Boulder, CO, USA
,Innovative Computing Laboratory, University of Tennessee, Knoxville, TN, USA
,Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, Livermore, CA, USA
,AMD Research, Austin, TX, USA
,Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, Livermore, CA, USA
,Mechanical Engineering Department, Middle East Technical University, Ankara, Turkey
,Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, Livermore, CA, USA
,Mathematics and Computer Science, Argonne National Laboratory, Lemont, IL, USA
,Mathematics and Computer Science, Argonne National Laboratory, Lemont, IL, USA
,Occalytics LLC, Weehawken, NJ, USA
,Mathematics and Computer Science, Argonne National Laboratory, Lemont, IL, USA
Department of Nuclear Engineering, Penn State, PA, USA
,Mathematics and Computer Science, Argonne National Laboratory, Lemont, IL, USA
,Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, Livermore, CA, USA
,Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA
,Innovative Computing Laboratory, University of Tennessee, Knoxville, TN, USA
,Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA
,Pacific Northwest National Laboratory, WA, USA
,Department of Computer Science, University of Colorado, Boulder, CO, USA
,Mathematics and Computer Science, Argonne National Laboratory, Lemont, IL, USA
Department of Mechanical Engineering, Aristotle University of Thessaloniki, Greece
,Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, Livermore, CA, USA
Efficient exploitation of exascale architectures requires rethinking of the numerical algorithms used in many large-scale applications. These architectures favor algorithms that expose ultra fine-grain parallelism and maximize the ratio of floating point ...
Innovative Computing Laboratory - UT, 37996, Knoxville, TN, USA
,Innovative Computing Laboratory - UT, 37996, Knoxville, TN, USA
,Oak Ridge National Laboratory, 37830, Oak Ridge, TN, USA
,Innovative Computing Laboratory - UT, 37996, Knoxville, TN, USA
Oak Ridge National Laboratory, 37830, Oak Ridge, TN, USA
University of Manchester, M13 9PL, Manchester, UK
The fast Fourier transform (FFT) is one of the most important tools in mathematics, and it is widely required by several applications of science and engineering. State-of-the-art parallel implementations of the FFT algorithm, based on Cooley-Tukey ...
University of Tennessee, Knoxville, USA
,University of Tennessee, Knoxville, USA
Karlsruhe Institute of Technology, Karlsruhe, Germany
,Sandia National Lab, Albuquerque, USA
,Charles University, Prague, Czech Republic
,Karlsruhe Institute of Technology, Karlsruhe, Germany
,University of Tennessee, Knoxville, USA
Oak Ridge National Lab, Oak Ridge, USA
University of Manchester, Manchester, UK
,Lawrence Livermore National Lab, USA
,University of Tennessee, Knoxville, USA
,University of Manchester, Manchester, UK
,Lawrence Berkeley National Lab, Berkeley, USA
,Sandia National Lab, Albuquerque, USA
,University of Tennessee, Knoxville, USA
,University of Manchester, Manchester, UK
,Sandia National Lab, Albuquerque, USA
,Karlsruhe Institute of Technology, Karlsruhe, Germany
,Argonne National Lab, Argonne, USA
,National Renewable Energy Lab, Boulder, USA
,National Renewable Energy Lab, Boulder, USA
,University of Tennessee, Knoxville, USA
,University of Tennessee, Knoxville, USA
,Lawrence Livermore National Lab, USA
The efficient utilization of mixed-precision numerical linear algebra algorithms can offer attractive acceleration to scientific computing applications. Especially with the hardware integration of low-precision special-function units designed for machine ...
University of Tennessee, USA
,NVIDIA, USA
,University of Tennessee, Oak Ridge National Laboratory, and University of Manchester, USA
,University of Tennessee, USA
,NVIDIA, USA
,University of Manchester, UK
,University of Manchester, UK
,AMD, USA
,University of Tennessee, USA
,University of Tennessee, USA
,NAG Ltd., UK
This article describes a standard API for a set of Batched Basic Linear Algebra Subprograms (Batched BLAS or BBLAS). The focus is on many independent BLAS operations on small matrices that are grouped together and processed by a single routine, called a ...
The University of Tennessee, Knoxville, TN, USA
,The University of Tennessee, Knoxville, TN, USA
,The University of Tennessee, Knoxville, TN, USA
,The University of Tennessee, Knoxville, TN, USA
,The University of Tennessee, Knoxville, TN, USA
,Nvidia Corporation, Santa Clara, CA, USA
,Naval Research Laboratory, Washington, DC, USA
,The University of Tennessee, Knoxville, TN, USA
Oak Ridge National Laboratory, Oak Ridge, TN, USA
University of Manchester, Manchester, England, UK
With the acquisition and widespread use of more resources that rely on accelerator/wide vector–based computing, there has been a strong demand for science and engineering applications to take advantage of these latest assets. This, however, has been ...
The University of Tennessee, USA
,The University of Tennessee, USA
,The University of Tennessee, USA
,The University of Tennessee, USA
As the scale of high-performance computing (HPC) systems continues to grow, researchers are devoting themselves to exploring increasing levels of parallelism to achieve optimal performance. The modern CPU’s design, including its features of hierarchical ...
University of Tennessee
,University of Tennessee
,King Abdullah University of Science and Technology
,King Abdullah University of Science and Technology
,University of Tennessee
,King Abdullah University of Science and Technology
,King Abdullah University of Science and Technology
,University of Tennessee, Oak Ridge National Laboratory, and the University of Manchester, UK
Climate and weather can be predicted statistically via geospatial Maximum Likelihood Estimates (MLE), as an alternative to running large ensembles of forward models. The MLE-based iterative optimization procedure requires the solving of large-scale ...
Innovative Computing Laboratory, The University of Tennessee, Knoxville, TN, USA
,Innovative Computing Laboratory, The University of Tennessee, Knoxville, TN, USA
,Nvidia Corporation, Santa Clara, CA, USA
,Innovative Computing Laboratory, The University of Tennessee, Knoxville, TN, USA
Oak Ridge National Laboratory, Oak Ridge, USA
University of Manchester, Manchester, UK
Exascale computing aspires to meet the increasing demands from large scientific applications. Software targeting exascale is typically designed for heterogeneous architectures; hence, it is not only important to develop well-designed software, ...
Karlsruhe Institute of Technology, Germany and University of Tennessee, USA
,Karlsruhe Institute of Technology, Germany
,National Taiwan University, Taiwan
,University of Tennessee, Oak Ridge National Lab, and University of Manchester, UK
,University of Jaume I, Spain
,Karlsruhe Institute of Technology, Germany
,University of Tennessee, USA
,National Taiwan University, Taiwan
,National Taiwan University, Taiwan
Efficient processing of irregular matrices on Single Instruction, Multiple Data (SIMD)-type architectures is a persistent challenge. Resolving it requires innovations in the development of data formats, computational techniques, and implementations that ...
University of Tennessee
,University of Tennessee
,University of Tennessee
,University of Tennessee
,University of Tennessee and University of Manchester
The SLATE (Software for Linear Algebra Targeting Exascale) library is being developed to provide fundamental dense linear algebra capabilities for current and upcoming distributed high-performance systems, both accelerated CPU-GPU based and CPU based. ...
Electrical Engineering and Computer Science Department, University of Tennessee, Knoxville, TN, USA
Oak Ridge National Laboratory, Oak Ridge, TN, USA
University of Manchester, Manchester, UK
,Grenoble Computer Science Laboratory, Grenoble, France
Grenoble Alps University, Grenoble, France
,Sandia National Laboratories, Computer Science Research Institute, Albuquerque, NM, USA
,Supercomputing Research Division, Information Technology Center, The University of Tokyo, Tokyo, Japan
,Tokyo Institute of Technology, Global Scientific Information and Computing Center, Tokyo, Japan
,Department of Electrical Engineering and Computer Science, The University of Tennessee, Knoxville, TN, USA
We parallelize the LU factorization of a hierarchical low-rank matrix (H-matrix) on a distributed-memory computer. This is much more difficult than the H-matrix-vector multiplication due to the dataflow of the factorization, and it is much harder than the ...
University of Tennessee, 37996, Knoxville, TN, USA
,University of Tennessee, 37996, Knoxville, TN, USA
,University of Tennessee, 37996, Knoxville, TN, USA
,University of Tennessee, 37996, Knoxville, TN, USA
,University of Tennessee, 37996, Knoxville, TN, USA
,University of Tennessee, 37996, Knoxville, TN, USA
Oak Ridge National Laboratory, 37831, Oak Ridge, TN, USA
University of Manchester, M13 9PL, Manchester, UK
This work presents two implementations of linear solvers for distributed-memory machines with GPU accelerators—one based on the Cholesky factorization and one based on the LU factorization with partial pivoting. The routines are developed as part ...
University of Tennessee, Knoxville, Tennessee
,University of Tennessee, Knoxville, Tennessee
,University of Tennessee, Knoxville, Tennessee
,University of Tennessee, Knoxville, Tennessee
,University of Tennessee, Knoxville, Tennessee
This article presents an implementation of a distributed autotuning engine developed as part of the Bench-testing OpenN Software Autotuning Infrastructure project. The system is geared towards performance optimization of computational kernels for ...
University of Tennessee
,University of Tennessee
,University of Tennessee and Oak Ridge National Laboratory and University of Manchester
,University of Manchester
Low-precision floating-point arithmetic is a powerful tool for accelerating scientific computing applications, especially those in artificial intelligence. Here, we present an investigation showing that other high-performance computing (HPC) ...