- Interspeech 2023, 2023. To translate speech for automatic dubbing, machine translation needs to be isochronous, i.e. translated speech needs to be aligned with the source in terms of speech durations. We introduce target factors in a transformer model to predict durations jointly with target language phoneme sequences. We also introduce auxiliary counters to help the decoder to keep track of the timing information while generating
- ACL Findings 2023, 2023. Knowledge graph embeddings (KGE) have been extensively studied to embed large-scale relational data for many real-world applications. Existing methods have long ignored the fact that many KGs contain two fundamentally different views: high-level ontology-view concepts and fine-grained instance-view entities. They usually embed all nodes as vectors in one latent space. However, a single geometric representation
- SC23, 2023. Memory-based Temporal Graph Neural Networks are powerful tools in dynamic graph representation learning and have demonstrated superior performance in many real-world applications. However, their node memory favors smaller batch sizes to capture more dependencies in graph events and needs to be maintained synchronously across all trainers. As a result, existing frameworks suffer from accuracy loss when scaling
- ICCV 2023, 2023. Video amodal segmentation is a particularly challenging task in computer vision, which requires deducing the full shape of an object from its visible parts. Recently, some studies have achieved promising performance by using motion flow to integrate information across frames under a self-supervised setting. However, motion flow is clearly limited by two factors, moving cameras and object
- ICCV 2023, 2023. Amodal object segmentation is a challenging task that involves segmenting both visible and occluded parts of an object. In this paper, we propose a novel approach, called Coarse-to-Fine Segmentation (C2F-Seg), that addresses this problem by progressively modeling the amodal segmentation. C2F-Seg initially reduces the learning space from the pixel-level image space to the vector-quantized latent space. This
Related content
- July 13, 2022. Allowing separate tasks to converge on their own schedules and using knowledge distillation to maintain performance improves accuracy.
- July 12, 2022. Fun visual essays explain key concepts of machine learning.
- July 07, 2022. Walid’s 2010 paper on distributed caching algorithms for content distribution networks cited for its “significant impact on the research community”.
- July 06, 2022. Expanded program aimed at engineering undergraduate and graduate students builds on the success of the inaugural program.
- June 28, 2022. Amazon’s TabTransformer model is now available through SageMaker JumpStart and the official release of the Keras open-source library.
- June 24, 2022. Technique that mixes public and private training data can meet differential-privacy criteria while cutting the error increase by 60%-70%.