Publications

IJCNLP-AACL
2023
Chunpeng Ma (Megagon Labs), Takuya Makino (Megagon Labs)
Detecting out-of-scope (OOS) utterances is crucial in task-oriented dialogue systems, but obtaining enough annotated OOS dialogues to train a binary classifier directly is difficult in practice. Existing data augmentation methods generate OOS dialogues automatically, but their performance usually depends on an external corpus. This dependence not only introduces uncertainty but also reduces the quality of the generated dialogues: specifically, all of the generated dialogues are out-of-domain (OOD). We propose SILVER, a self data augmentation method that does not use external data. It addresses these issues in previous research and improves the accuracy of OOS detection (false positive rate: 90.5% → 47.4%). Furthermore, SILVER generates high-quality in-domain (IND) OOS dialogues in terms of naturalness (percentage: 8% → 68%) and OOS correctness (percentage: 74% → 88%), as evaluated by human workers.
SIGDIAL
2023
Mai Omura (NINJAL), Aya Wakasa (Tohoku University), Hiroshi Matsuda (Megagon Labs), Masayuki Asahara (NINJAL)
In this study, we developed Universal Dependencies (UD) resources for spoken Japanese in the Corpus of Everyday Japanese Conversation (CEJC). The CEJC is a large corpus of spoken language that encompasses various everyday conversations in Japanese and includes word delimitation and part-of-speech annotation. We newly annotated Long Word Unit delimitation and Bunsetsu (Japanese phrase)-based dependencies, including Bunsetsu boundaries, for the CEJC. The Japanese UD resources were constructed from the CEJC in accordance with hand-maintained conversion rules, using two types of word delimitation, part-of-speech tags, and Bunsetsu-based syntactic dependency relations. Furthermore, we examined various issues pertaining to the construction of UD for the CEJC by comparing it with the written Japanese corpus and evaluating UD parsing accuracy.
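Association for Natural Language Processing (言語処理学会, NLP)
2023
Yuta Hayashibe (林部 祐太) (Recruit Co., Ltd., Megagon Labs, Tokyo, Japan)
We are building a dialogue system for recommending accommodations. The system proposes accommodations by matching the intent of customer utterances against information about accommodations extracted from reviews and other sources. This requires interpreting the intent of sentences in reviews and in customer utterances. We therefore perform intent interpretation by generating an "interpretation sentence" that concisely expresses the intent of a sentence, and then match on the basis of each sentence's topic. In this paper, we worked on improving conventional methods for interpretation sentence generation and topic classification.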