Yannis Assael

Yannis Assael
  • Staff Research Scientist
  • Researcher at DeepMind

About

30
Publications
28,388
Reads
How we measure 'reads'
A 'read' is counted each time someone views a publication summary (such as the title, abstract, and list of authors), clicks on a figure, or views or downloads the full-text. Learn more
3,788
Citations
Introduction
Staff Research Scientist, DeepMind.
Current institution
DeepMind
Current position
  • Researcher
Additional affiliations
DeepMind
Position
  • Researcher
October 2015 - present
University of Oxford
Position
  • DPhil candidate
Position
  • MRes

Publications

Publications (30)
Article
Full-text available
Ancient languages preserve the cultures and histories of the past. However, their study is fraught with difficulties, and experts must tackle a range of challenging text-based tasks, from deciphering lost languages to restoring damaged inscriptions, to determining the authorship of works of literature. Technological aids have long supported the stu...
Article
Full-text available
Ancient history relies on disciplines such as epigraphy—the study of inscribed texts known as inscriptions—for evidence of the thought, language, society and history of past civilizations1. However, over the centuries, many inscriptions have been damaged to the point of illegibility, transported far from their original location and their date of wr...
Article
Full-text available
How noncoding DNA determines gene expression in different cell types is a major unsolved problem, and critical downstream applications in human genetics depend on improved solutions. Here, we report substantially improved gene expression prediction accuracy from DNA sequences through the use of a deep learning architecture, called Enformer, that is...
Preprint
This work describes an interactive decoding method to improve the performance of visual speech recognition systems using user input to compensate for the inherent ambiguity of the task. Unlike most phoneme-to-word decoding pipelines, which produce phonemes and feed these through a finite state transducer, our method instead expands words in lockste...
Article
Full-text available
Introduction:. Machine learning (ML) is a set of models and methods that can detect patterns in vast amounts of data and use this information to perform various kinds of decision-making under uncertain conditions. This review explores the current role of this technology in plastic surgery by outlining the applications in clinical practice, diagnost...
Preprint
The next phase of genome biology research requires understanding how DNA sequence encodes phenotypes, from the molecular to organismal levels. How noncoding DNA determines gene expression in different cell types is a major unsolved problem, and critical downstream applications in human genetics depend on improved solutions. Here, we report substant...
Preprint
We describe a system for large-scale audiovisual translation and dubbing, which translates videos from one language to another. The source language's speech content is transcribed to text, translated, and automatically synthesized into target language speech using the original speaker's voice. The visual content is translated by synthesizing lip mo...
Preprint
This work presents a large-scale audio-visual speech recognition system based on a recurrent neural network transducer (RNN-T) architecture. To support the development of such a system, we built a large audio-visual (A/V) dataset of segmented utterances extracted from YouTube public videos, leading to 31k hours of audio-visual training content. The...
Preprint
Ancient history relies on disciplines such as epigraphy, the study of ancient inscribed texts, for evidence of the recorded past. However, these texts, "inscriptions", are often damaged over the centuries, and illegible parts of the text must be restored by specialists, known as epigraphists. This work presents Pythia, the first ancient text restor...
Preprint
Large-scale mobile communication systems tend to contain legacy transmission channels with narrowband bottlenecks, resulting in characteristic "telephone-quality" audio. While higher quality codecs exist, due to the scale and heterogeneity of the networks, transmitting higher sample rate audio with modern high-quality audio codecs can be difficult...
Preprint
Full-text available
We present a meta-learning approach for adaptive text-to-speech (TTS) with few data. During training, we learn a multi-speaker model using a shared conditional WaveNet core and independent learned embeddings for each speaker. The aim of training is not to produce a neural network with fixed weights, which is then deployed as a TTS system. Instead,...
Preprint
Full-text available
This work presents a scalable solution to open-vocabulary visual speech recognition. To achieve this, we constructed the largest existing visual speech recognition dataset, consisting of pairs of text and video clips of faces speaking (3,886 hours of video). In tandem, we designed and trained an integrated lipreading system, consisting of a video p...
Article
Full-text available
Designing tax policies that are effective in curbing tax evasion and maximize state revenues requires a rigorous understanding of taxpayer behavior. This work explores the problem of determining the strategy a self-interested, risk-averse tax entity is expected to follow, as it "navigates" - in the context of a Markov Decision Process - a governmen...
Preprint
Designing tax policies that are effective in curbing tax evasion and maximize state revenues requires a rigorous understanding of taxpayer behavior. This work explores the problem of determining the strategy a self-interested, risk-averse tax entity is expected to follow, as it "navigates" - in the context of a Markov Decision Process - a governmen...
Article
Full-text available
Cortical circuits exhibit intricate recurrent architectures that are remarkably similar across different brain areas. Such stereotyped structure suggests the existence of common computational principles. However, such principles have remained largely elusive. Inspired by gated-memory networks, namely long short-term memory networks (LSTMs), we intr...
Conference Paper
In this work, we demonstrate how interdisciplinary knowledge can provide solutions to elusive challenges and advance science. As an example, we used the application of the THW in the measurement of the thermal conductivity of solids. To obtain a solution of the equations by FEM, about 10 h were required. By employing tools from the field of machine...
Article
Full-text available
Lipreading is the task of decoding text from the movement of a speaker's mouth. Traditional approaches separated the problem into two stages: designing or learning visual features, and prediction. More recent deep lipreading approaches are end-to-end trainable (Wand et al., 2016; Chung & Zisserman, 2016a). All existing works, however, perform only...
Article
Full-text available
We propose Deep Optimistic Linear Support Learning (DOL) to solve high-dimensional multi-objective decision problems where the relative importances of the objectives are not known a priori. Using features from the high-dimensional inputs, DOL computes the convex coverage set containing all potential optimal solutions of the convex combinations of t...
Article
Full-text available
We consider the problem of multiple agents sensing and acting in environments with the goal of maximising their shared utility. In these environments, agents must learn communication protocols in order to share information that is needed to solve the tasks. By embracing deep neural networks, we are able to demonstrate end-to-end learning of protoco...
Article
Full-text available
We propose deep distributed recurrent Q-networks (DDRQN), which enable teams of agents to learn to solve communication-based coordination tasks. In these tasks, the agents are not given any pre-designed communication protocol. Therefore, in order to successfully communicate, they must first automatically develop and agree upon their own communicati...
Article
Full-text available
Data-efficient reinforcement learning (RL) in continuous state-action spaces using very high-dimensional observations remains a key challenge in developing fully autonomous systems. We consider a particularly important instance of this challenge, the pixels-to-torques problem, where an RL agent learns a closed-loop control policy ("torques") from p...
Article
Full-text available
A new portable absolute Transient Hot-Wire instrument for measuring the thermal conductivity of solids over a range of 0.2 to 4 Wm-1K-1 is presented. The new instrument is characterized by three novelties: a) an innovative two-wires sensor which provides robustness and portability, while at the same time employs a soft silicone layer to eliminate t...
Article
Full-text available
Optimising black-box functions is important in many disciplines, such as tuning machine learning models, robotics, finance and mining exploration. Bayesian optimisation is a state-of-the-art technique for the global optimisation of black-box functions which are expensive to evaluate. At the core of this approach is a Gaussian process prior that cap...
Article
Full-text available
Multiple matching algorithms are used to locate the occurrences of patterns from a finite pattern set in a large input string. Aho-Corasick and Wu-Manber, two of the most well known algorithms for multiple matching require an increased computing power, particularly in cases where large-size datasets must be processed, as is common in computational...
Article
Full-text available
This paper contains new, representative equations for the thermal conductivity of normal and parahydrogen. The equations are based in part upon a body of experimental data that has been critically assessed for internal consistency and for agreement with theory whenever possible. Although there are sufficient data at normal temperatures, data at ver...

Network

Cited By