In contemporary TV audience prediction, outliers are considered mere anomalies in the otherwise cyclical trend and seasonality components that can be used to make predictions. In the ReTV project, we want to provide more accurate audience predictions in ...
Automatic video captioning can be used to enrich TV programs with textual information on scenes. This information can be useful for visually impaired people, but can also be used to enhance indexing and search of TV records. Video captioning can be ...
Identifying persons using face recognition is an important task in applications such as media production, archiving and monitoring. Like other tasks, face recognition pipelines have recently shifted to Deep Convolutional Neural Network (DNN) based ...
In this paper we present our work on improving the efficiency of adversarial training for unsupervised video summarization. Our starting point is the SUM-GAN model, which creates a representative summary based on the intuition that such a summary should ...
This paper analyzes gender representation in four major corpora of French broadcast. Because these corpora are widely used within the speech processing community, they are a primary material for training automatic speech recognition (ASR) systems. As ...
In this paper we evaluate how discriminative behavior-based signals obtained from smartphone sensors are. The main aim is to evaluate these signals for person recognition. Recognition based on these signals increases the security of devices, but ...
The goal of this work is to segment, in a video sequence, the objects that are mentioned in a linguistic description of the scene. We have adapted an existing deep neural network that achieves state-of-the-art performance in semi-supervised video object ...
For most humans, understanding multimedia content is easy, and in many cases images and videos are a preferred means of augmenting and enhancing human interaction and communication. Given a video, humans can discern a great deal from this rich ...
Navigation is an important cognitive task that enables humans and animals to traverse, with or without maps, over long distances in the complex world. Such long-range navigation can simultaneously support self-localisation ("I am here") and a ...
Infants use exploratory behaviors to learn about the objects around them. Psychologists have theorized that behaviors such as grasping, touching, pressing, and lifting, coupled with the visual, tactile, haptic and auditory sensory modalities, enable ...
Dense captioning (DC), which provides a comprehensive context understanding of images by describing all salient visual groundings in an image, facilitates multimodal understanding and learning. As an extension of image captioning, DC is developed to ...
We present an end-to-end deep learning model for robot navigation from raw visual pixel input and natural text instructions. The proposed model is an LSTM-based sequence-to-sequence neural network architecture with attention, which is trained on ...
Monitoring population-level changes in diet could be useful for education and for implementing interventions to improve health. Research has shown that data from social media sources can be used for monitoring dietary behavior. We propose a scrape-by-...
The global rising trend of consumer health awareness has increasingly drawn attention towards the development of nutrition tracking applications. A key component to enable large scale, low cost and personalized nutrition analysis is a food image ...
In this paper, we study the novel problem of not only predicting ingredients from a food image, but also predicting the relative amounts of the detected ingredients. We propose two prediction-based models using deep learning that output sparse and dense ...
Automated face matching algorithms are used in a wide variety of societal applications ranging from access authentication, to criminal identification, to application customization. Hence, it is important for such algorithms to be equitable in their ...
Image analysis algorithms have become indispensable in the modern information ecosystem. Beyond their early use in restricted domains (e.g., military, medical), they are now widely used in consumer applications and social media. With the rise of the "...
Depression has been the leading cause of mental-health illness worldwide. Major depressive disorder (MDD) is a common mental health disorder that affects people both psychologically and physically, and it can lead to loss of life. Due to the lack of ...
Depression is a common but serious mental disorder that affects people all over the world. Besides providing an easier way of diagnosing the disorder, a computer-aided automatic depression assessment system is needed in order to reduce subjective ...
Cross-cultural emotion recognition has been a challenging research problem in the affective computing field. In this paper, we present our solutions for the Cross-cultural Emotion Sub-challenge (CES) of the Audio/Visual Emotion Challenge (AVEC) 2019. The ...