Search by Subject

Artificial Intelligence, Machine Learning, Computer Vision, Natural language processing

Searched The ACM Full-Text Collection (691,749 records)

Showing 1 - 20 of 573 Results

research-article
March 2023
MRCAug: Data Augmentation via Machine Reading Comprehension for Document-Level Event Argument Extraction
IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), Volume 30, 2022, pp 3160–3172. https://doi.org/10.1109/TASLP.2022.3210442
Document-level event argument extraction (EAE) is a critical event semantic understanding task that requires a model to identify an event's global arguments beyond the sentence level. Existing approaches to this problem are based on supervised ...
Total Citations: 0

research-article
December 2022
Refining History for Future-Aware Neural Machine Translation
IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), Volume 31, 2023, pp 500–512. https://doi.org/10.1109/TASLP.2022.3226332
Neural machine translation uses a decoder to generate target words auto-regressively by predicting the next target word conditioned on a given source sentence and its previously predicted target words, i.e., its translation history, which suffers from two ...
Total Citations: 0; Total Downloads: 1 (Last 12 Months: 1; Last 6 Weeks: 1)

research-article
December 2022
A Cross-Attention Fusion Based Graph Convolution Auto-Encoder for Open Relation Extraction
IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), Volume 31, 2023, pp 476–485. https://doi.org/10.1109/TASLP.2022.3226680
Open Relation Extraction (OpenRE) aims at clustering relation instances to extract relation types. By learning relation patterns between named entities, it clusters semantically equivalent patterns into a unified relation cluster. Existing clustering-...
Total Citations: 0

research-article
December 2022
Modularized Mutuality Network for Emotion-Cause Pair Extraction
IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), Volume 31, 2023, pp 539–549. https://doi.org/10.1109/TASLP.2022.3228129
Emotion-cause pair extraction (ECPE) is an emerging task born out of Emotion cause extraction (ECE), which aims to extract the emotion clause and the corresponding cause clause simultaneously. Previous methods decompose ECPE into multiple sub-tasks, ...
Total Citations: 0

research-article
December 2022
On Robustness and Sensitivity of a Neural Language Model: A Case Study on Italian L1 Learner Errors
IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), Volume 31, 2023, pp 426–438. https://doi.org/10.1109/TASLP.2022.3226333
In this paper, we propose a comprehensive linguistic study aimed at assessing the implicit behavior of one of the most prominent Neural Language Models (NLM) based on Transformer architectures, BERT (Devlin et al.), when dealing with a particular source of ...
Total Citations: 0

research-article
Open Access
December 2022
Audio-Visual Cross-Attention Network for Robotic Speaker Tracking
IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), Volume 31, 2023, pp 550–562. https://doi.org/10.1109/TASLP.2022.3226330
Audio-visual signals can be used jointly for robotic perception as they complement each other. Such multi-modal sensory fusion has a clear advantage, especially under noisy acoustic conditions. Speaker localization, as an essential robotic function, was ...
Total Citations: 0; Total Downloads: 1 (Last 12 Months: 1; Last 6 Weeks: 1)

research-article
November 2022
Audio Embedding-Aware Dialogue Policy Learning
IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), Volume 31, 2023, pp 525–538. https://doi.org/10.1109/TASLP.2022.3225658
Following the success of Natural Language Processing (NLP) transformers pretrained via self-supervised learning, similar models have been proposed recently for speech processing such as Wav2Vec2, HuBERT and UniSpeech-SAT. An interesting yet unexplored ...
Total Citations: 0; Total Downloads: 4 (Last 12 Months: 4; Last 6 Weeks: 4)

research-article
Open Access
November 2022
A Time-Frequency Attention Module for Neural Speech Enhancement
IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), Volume 31, 2023, pp 462–475. https://doi.org/10.1109/TASLP.2022.3225649
Speech enhancement plays an essential role in a wide range of speech processing applications. Recent studies on speech enhancement tend to investigate how to effectively capture the long-term contextual dependencies of speech signals to boost performance. ...
Total Citations: 0; Total Downloads: 1 (Last 12 Months: 1; Last 6 Weeks: 1)

research-article
November 2022
A Neighborhood Re-Ranking Model With Relation Constraint for Knowledge Graph Completion
- Yu Li,
- Bojie Hu,
- Jian Liu,
- Yufeng Chen,
- Jinan Xu
IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), Volume 31, 2023, pp 411–425. https://doi.org/10.1109/TASLP.2022.3225537
Knowledge graph completion (KGC) aims to predict missing links based on observed triples. However, current KGC models are still limited by the following two aspects. (1) the entity semantics is implicitly learned by neural network and merely depends on ...
Total Citations: 0

research-article
November 2022
Cross-Lingual Named Entity Recognition for Heterogenous Languages
IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), Volume 31, 2023, pp 371–382. https://doi.org/10.1109/TASLP.2022.3212698
Previous works on cross-lingual Named Entity Recognition (NER) have achieved great success. However, few of them consider the effect of language families between the source and target languages. In this study, we find that the cross-lingual NER ...
Total Citations: 0

research-article
November 2022
End-to-End Multi-Modal Speech Recognition on an Air and Bone Conducted Speech Corpus
IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), Volume 31, 2023, pp 513–524. https://doi.org/10.1109/TASLP.2022.3224305
Automatic speech recognition (ASR) has been significantly improved in the past years. However, most robust ASR systems are based on air-conducted (AC) speech, and their performances in low signal-to-noise-ratio (SNR) conditions are not satisfactory. Bone-...
Total Citations: 0

research-article
November 2022
Curriculum-Style Fine-Grained Adaption for Unsupervised Cross-Lingual Dependency Transfer
IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), Volume 31, 2023, pp 322–332. https://doi.org/10.1109/TASLP.2022.3224302
Unsupervised cross-lingual transfer has shown great potential for dependency parsing of low-resource languages when there is no annotated treebank available. Recently, the self-training method has received increasing interest because of its ...
Total Citations: 0

research-article
November 2022
Training a Singing Transcription Model Using Connectionist Temporal Classification Loss and Cross-Entropy Loss
- Jun-You Wang,
- Jyh-Shing Roger Jang
IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), Volume 31, 2023, pp 383–396. https://doi.org/10.1109/TASLP.2022.3224297
In this paper, we propose a method that uses a combination of the Connectionist Temporal Classification (CTC) loss and the cross-entropy loss to train a note-level singing transcription model. By considering the task as predicting a note sequence of the ...
Total Citations: 0; Total Downloads: 1 (Last 12 Months: 1; Last 6 Weeks: 1)

research-article
November 2022
STFF-SM: Steganalysis Model Based on Spatial and Temporal Feature Fusion for Speech Streams
IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), Volume 31, 2023, pp 277–289. https://doi.org/10.1109/TASLP.2022.3224295
The real-time detection of speech steganography in Voice-over-Internet-Protocol (VoIP) scenarios remains an open problem, as it requires steganalysis methods to perform for low-intensity embeddings and short-sample inputs, as well as provide rapid ...
Total Citations: 0

research-article
Open Access
November 2022
Meta-AF: Meta-Learning for Adaptive Filters
IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), Volume 31, 2023, pp 355–370. https://doi.org/10.1109/TASLP.2022.3224288
Adaptive filtering algorithms are pervasive throughout signal processing and have had a material impact on a wide variety of domains including audio processing, telecommunications, biomedical sensing, astrophysics and cosmology, seismology, and many more. ...
Total Citations: 0; Total Downloads: 2 (Last 12 Months: 2; Last 6 Weeks: 2)

research-article
November 2022
EmoInt-Trans: A Multimodal Transformer for Identifying Emotions and Intents in Social Conversations
IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), Volume 31, 2023, pp 290–300. https://doi.org/10.1109/TASLP.2022.3224287
In the natural language processing community, open-domain conversational agents, also known as chatbots, are gaining popularity. One of the difficulties is getting them to communicate in an emotionally intelligent manner. To generate dialogues, current ...
Total Citations: 0; Total Downloads: 1 (Last 12 Months: 1; Last 6 Weeks: 1)

research-article
Open Access
November 2022
Minimising Biasing Word Errors for Contextual ASR With the Tree-Constrained Pointer Generator
IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), Volume 31, 2023, pp 345–354. https://doi.org/10.1109/TASLP.2022.3224286
Contextual knowledge is essential for reducing speech recognition errors on high-valued long-tail words. This paper proposes a novel tree-constrained pointer generator (TCPGen) component that enables end-to-end ASR models to bias towards a list of long-...
Total Citations: 0; Total Downloads: 4 (Last 12 Months: 4; Last 6 Weeks: 4)

research-article
November 2022
STFT-Domain Neural Speech Enhancement With Very Low Algorithmic Latency
IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), Volume 31, 2023, pp 397–410. https://doi.org/10.1109/TASLP.2022.3224285
Deep learning based speech enhancement in the short-time Fourier transform (STFT) domain typically uses a large window length such as 32 ms. A larger window can lead to higher frequency resolution and potentially better enhancement. This however incurs an ...
Total Citations: 0

research-article
November 2022
Direction of Arrival Estimation of Sound Sources Using Icosahedral CNNs
IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), Volume 31, 2023, pp 313–321. https://doi.org/10.1109/TASLP.2022.3224282
In this paper, we present a new model for Direction of Arrival (DOA) estimation of sound sources based on an Icosahedral Convolutional Neural Network (CNN) applied over SRP-PHAT power maps computed from the signals received by a microphone array. This ...
Total Citations: 0; Total Downloads: 1 (Last 12 Months: 1; Last 6 Weeks: 1)

research-article
November 2022
Enhanced Multi-Domain Dialogue State Tracker With Second-Order Slot Interactions
IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), Volume 31, 2023, pp 265–276. https://doi.org/10.1109/TASLP.2022.3221044
Dialogue state tracking (DST) is often used to track the system's understanding of the user goal in task-oriented dialogue systems. Existing DST methods mainly fall into two categories according to their adopted model structure: non-hierarchical ...
Total Citations: 0
