Please login to be able to save your searches and receive alerts for new content matching your search criteria.
Support for Machine Learning (ML) applications in networking has significantly improved over the last decade. The availability of public datasets and programmable switching fabrics (including low-level languages to program them) presents a full-stack ...
Jointly utilizing multiple GPUs to train graph neural networks (GNNs) is crucial for handling large graphs and achieving high efficiency. However, we find that existing systems suffer from high communication costs and low GPU utilization due to improper ...
Temporal Graph Neural Networks are gaining popularity in modeling interactions on dynamic graphs. Among them, Temporal Graph Attention Networks (TGAT) have gained adoption in predictive tasks, such as link prediction, in a range of application domains. ...
Tucker decomposition is one of the SOTA CNN model compression techniques. However, unlike the FLOPs reduction, we observe very limited inference time reduction with Tucker-compressed models using existing GPU software such as cuDNN. To this end, we ...
We aim to automate decades of research and experience in register allocation, leveraging machine learning. We tackle this problem by embedding a multi-agent reinforcement learning algorithm within LLVM, training it with the state of the art ...
Auto-scheduling for tensor programs is a process where a search algorithm automatically explores candidate schedules (program transformations) for a given program on a target hardware platform to improve its performance. However this can be a very time ...
The Convolutional Neural Network (CNN) kernel is a fundamental building block for deep learning, which dominates the computational cost of deep learning pipelines for image analysis. The synthesis of high-performance GPU kernels for CNNs is thus of ...
Network slicing achieves cost-efficient slice customization to support heterogeneous applications and services. Configuring cross-domain resources to end-to-end slices based on service-level agreements, however, is challenging, due to the complicated ...
Over the most recent years, quantized graph neural network (QGNN) attracts lots of research and industry attention due to its high robustness and low computation and memory overhead. Unfortunately, the performance gains of QGNN have never been realized ...
An adaptive video object detection system selects different execution paths at runtime, based on video content and available resources, so as to maximize accuracy under a target latency objective (e.g., 30 frames per second). Such a system is well ...
Reinforcement learning (RL) is emerging as a powerful technique for solving complex code optimization tasks with an ample search space. While promising, existing solutions require a painstaking manual process to tune the right task-specific RL ...
Training models with massive inputs is a significant challenge in the development of Deep Learning pipelines to process very large digital image datasets as required by Whole Slide Imaging (WSI) in computational pathology and analysis of brain fMRI ...
This work reveals that memory-intensive computation is a rising performance-critical factor in recent machine learning models. Due to a unique set of new challenges, existing ML optimizing compilers cannot perform efficient fusion under complex two-...
Establishing a consistent normal orientation for point clouds is a notoriously difficult problem in geometry processing, requiring attention to both local and global shape characteristics. The normal direction of a point is a function of the local ...
Vector line art plays an important role in graphic design, however, it is tedious to manually create. We introduce a general framework to produce line drawings from a wide variety of images, by learning a mapping from raster image space to vector image ...
Despite the recent success of face image generation with GANs, conditional hair editing remains challenging due to the under-explored complexity of its geometry and appearance. In this paper, we present MichiGAN (Multi-Input-Conditioned Hair Image GAN), ...
Recently, deep generative adversarial networks for image generation have advanced rapidly; yet, only a small amount of research has focused on generative models for irregular structures, particularly meshes. Nonetheless, mesh generation and synthesis ...
We introduce a novel deep learning framework for data-driven motion retargeting between skeletons, which may have different structure, yet corresponding to homeomorphic graphs. Importantly, our approach learns how to retarget without requiring any ...
Motion synthesis in a dynamic environment has been a long-standing problem for character animation. Methods using motion capture data tend to scale poorly in complex environments because of their larger capturing and labeling requirement. Physics-based ...
In this paper, we introduce Point2Mesh, a technique for reconstructing a surface mesh from an input point cloud. Instead of explicitly specifying a prior that encodes the expected shape properties, the prior is defined automatically using the input ...