Authors
Ryo Fukuda Katsuhito Sudoh Satoshi Nakamura
Publisher
The Association for Natural Language Processing
Journal
自然言語処理 (ISSN:13407619)
Volume/Issue/Pages, Date
vol.29, no.2, pp.344-366, 2022 (Released:2022-06-15)
References
42

Recent studies consider knowledge distillation a promising method for speech translation (ST) with end-to-end models. However, its usefulness in cascade ST with automatic speech recognition (ASR) and machine translation (MT) models has not yet been clarified. An ASR output typically contains speech recognition errors, and an MT model trained only on human transcripts performs poorly on such error-containing ASR results. Thus, it should be trained considering the presence of ASR errors during inference. In this paper, we propose using knowledge distillation for training the MT model of a cascade ST system to achieve robustness against ASR errors. We distilled knowledge from a teacher model based on human transcripts to a student model based on erroneous transcriptions. Our experimental results showed that the proposed method improves the translation performance on erroneous transcriptions. Combining knowledge distillation with fine-tuning further improved the performance consistently on two different datasets: MuST-C English--Italian and Fisher Spanish--English.
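The distillation objective described above can be sketched as follows. This is a generic word-level knowledge-distillation loss in NumPy, not the authors' implementation; the function names, array shapes, and toy vocabulary are illustrative. The teacher's output distribution (computed from the clean human transcript) supervises the student, which conditions on the erroneous ASR transcription:

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the vocabulary axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def word_level_kd_loss(student_logits, teacher_probs):
    """Cross-entropy between the teacher's soft targets and the student's
    predicted distribution, averaged over target positions."""
    log_p = np.log(softmax(student_logits) + 1e-12)
    return float(-(teacher_probs * log_p).sum(axis=-1).mean())

# Toy example: 2 target positions, vocabulary of 4 tokens.
teacher = softmax(np.array([[2.0, 0.1, 0.1, 0.1],
                            [0.1, 2.0, 0.1, 0.1]]))
student_good = np.array([[2.0, 0.1, 0.1, 0.1],   # matches the teacher
                         [0.1, 2.0, 0.1, 0.1]])
student_bad = np.array([[0.1, 0.1, 2.0, 0.1],    # mass on the wrong tokens
                        [0.1, 0.1, 0.1, 2.0]])
loss_good = word_level_kd_loss(student_good, teacher)
loss_bad = word_level_kd_loss(student_bad, teacher)
```

A student that reproduces the teacher's distribution incurs a lower loss than one that diverges from it, which is what drives the robustness transfer.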
Authors
Kosuke Takahashi Katsuhito Sudoh Satoshi Nakamura
Publisher
The Association for Natural Language Processing
Journal
自然言語処理 (ISSN:13407619)
Volume/Issue/Pages, Date
vol.29, no.1, pp.3-22, 2022 (Released:2022-03-15)
References
23
Cited by
1

As the performance of machine translation has improved, the need for a human-like automatic evaluation metric has been increasing. The use of multiple reference translations against a system translation (a hypothesis) has been adopted as a strategy to improve the performance of such evaluation metrics. However, preparing multiple references is highly expensive and impractical. In this study, we propose an automatic evaluation method for machine translation that uses source sentences as additional pseudo-references. The proposed method evaluates a translation hypothesis via regression to assign a real-valued score. The model takes the paired source, reference, and hypothesis sentences together as input. A pre-trained large-scale cross-lingual language model encodes the input into sentence vectors, with which the model predicts a human evaluation score. Experimental results show that the proposed method consistently exhibited higher correlation with human judgements than baseline methods that depend solely on hypothesis and reference sentences, especially when the hypotheses were very high- or low-quality translations.
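The regression setup above can be sketched minimally as follows. The `encode` function is a deterministic stand-in for the pre-trained cross-lingual language model, and `score`, the vector dimension, and the example sentences are illustrative assumptions, not the paper's actual model:

```python
import hashlib
import numpy as np

def encode(sentence, dim=8):
    """Stand-in for a pre-trained cross-lingual encoder: maps a sentence
    to a deterministic pseudo-embedding (illustration only)."""
    seed = int.from_bytes(hashlib.sha256(sentence.encode()).digest()[:4], "big")
    return np.random.default_rng(seed).standard_normal(dim)

def score(src, ref, hyp, w, b=0.0):
    """Regression head over the concatenated source, reference, and
    hypothesis vectors, producing a real-valued quality score."""
    x = np.concatenate([encode(src), encode(ref), encode(hyp)])
    return float(w @ x + b)

w = np.random.default_rng(0).standard_normal(24)  # 3 * dim weights
s = score("Er ist müde.", "He is tired.", "He is sleepy.", w)
```

In the actual method, `w` and `b` would be learned by regressing encoder outputs onto human evaluation scores; including the source vector is what supplies the pseudo-reference signal.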
Authors
Yui Oka Katsuhito Sudoh Satoshi Nakamura
Publisher
The Association for Natural Language Processing
Journal
自然言語処理 (ISSN:13407619)
Volume/Issue/Pages, Date
vol.28, no.3, pp.778-801, 2021 (Released:2021-09-15)
References
25

Neural machine translation often suffers from an under-translation problem owing to its limited modeling of output sequence lengths. In this study, we propose a novel approach to training a Transformer model using length constraints based on length-aware positional encoding (PE). Because length constraints with exact target sentence lengths degrade translation performance, we add a random perturbation, drawn from a uniform distribution within a certain range, to the length constraints in the PE during training. In the inference step, we predict the output lengths from the input sequences using a length prediction model based on a large-scale pre-trained language model. Experimental results on Japanese-to-English and English-to-Japanese translation show that the proposed perturbation injection improves robustness against length prediction errors, particularly within a certain range.
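The length constraint and its perturbation can be sketched as follows: a sinusoidal positional encoding computed over the remaining length (constraint minus position), with a uniform integer jitter applied to the constraint during training. The function name, jitter range, and the exact length-aware PE variant are assumptions for illustration, not the paper's implementation:

```python
import numpy as np

def length_constrained_pe(length, d_model, max_perturb=0, rng=None):
    """Sinusoidal PE over the remaining length (constraint - position).
    During training, the length constraint is jittered by a uniform
    integer offset in [-max_perturb, max_perturb]."""
    if rng is not None and max_perturb > 0:
        length = length + int(rng.integers(-max_perturb, max_perturb + 1))
    pos = length - np.arange(length)  # remaining length at each step
    i = np.arange(d_model // 2)
    angles = pos[:, None] / (10000 ** (2 * i / d_model))[None, :]
    pe = np.empty((length, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

pe = length_constrained_pe(10, 8)                       # exact constraint
rng = np.random.default_rng(0)
pe_train = length_constrained_pe(10, 8, max_perturb=2, rng=rng)
```

Training against jittered constraints exposes the model to slightly wrong target lengths, so at inference it tolerates small errors from the length prediction model.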
Authors
Dan Han Pascual Martínez-Gómez Yusuke Miyao Katsuhito Sudoh Masaaki Nagata
Publisher
Information and Media Technologies Editorial Board
Journal
Information and Media Technologies (ISSN:18810896)
Volume/Issue/Pages, Date
vol.9, no.3, pp.272-301, 2014 (Released:2014-09-15)
References
44

In statistical machine translation, Chinese and Japanese are a well-known long-distance language pair that poses difficulties for word alignment techniques. Pre-reordering methods have been proven efficient and effective; however, they need reliable parsers to extract the syntactic structure of the source sentences. First, we propose a framework in which only part-of-speech (POS) tags and unlabeled dependency parse trees are used to minimize the influence of parse errors, and linguistic knowledge about structural differences is encoded in the form of reordering rules. We show significant improvements in translation quality of sentences in the news domain over state-of-the-art reordering methods. Second, we explore the relationship between dependency parsing and our pre-reordering method from two aspects: POS tags and dependencies. We observe the effects of different parse errors on reordering performance by combining empirical and descriptive approaches. In the empirical approach, we quantify the distribution of general parse errors along with reordering quality. In the descriptive approach, we extract seven influential error patterns and examine their correlations with reordering errors.
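As an illustration of pre-reordering over only POS tags and unlabeled dependencies, here is one toy rule; it is a hypothetical simplification, not one of the paper's actual rules. It moves a noun dependent before its verb head, mapping Chinese SVO order toward Japanese-like SOV order:

```python
def reorder_sov(tokens, pos_tags, heads):
    """Toy pre-reordering rule: move a noun that depends on a verb to its
    left from post-verbal to pre-verbal position (SVO -> SOV). heads[i] is
    the index of token i's head (-1 for the root). Only POS tags and
    unlabeled dependencies are used, mirroring the minimal-parse setting."""
    order = list(range(len(tokens)))
    for i, (tag, h) in enumerate(zip(pos_tags, heads)):
        # Noun dependent to the right of its verb head: move it before the verb.
        if tag == "NN" and h != -1 and pos_tags[h] == "VV" and i > h:
            order.remove(i)
            order.insert(order.index(h), i)
    return [tokens[j] for j in order]

# "He eats apples": 他(PN) depends on 吃(VV, root); 苹果(NN) depends on 吃.
src = ["他", "吃", "苹果"]
reordered = reorder_sov(src, ["PN", "VV", "NN"], [1, -1, 1])
```

The paper's framework encodes many such patterns as rules over the dependency tree; the point of the sketch is that no labeled syntactic structure beyond POS tags and head indices is required.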