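Jun Rekimoto, Professor at the University of Tokyo and Director of the Kyoto Laboratory, and the Rekimoto Laboratory at the University of Tokyo presented two research papers at ACM CHI 2023, the international conference being held in Hamburg, Germany, and one of them received a Best Paper Award. ACM CHI is the top conference in the field of human-computer interaction. Both papers are outcomes of the "Human Augmentation / Human-AI Integration" research vision, in which technology extends human abilities and humans and AI work closely together.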
■ [Best Paper Award] "LipLearner: Customizable Silent Speech Interactions on Mobile Devices" (Zixiong Su, Shitao Fang, and Jun Rekimoto)
This study presents a silent speech interface technology that enables private communication in natural language. Using contrastive learning, LipLearner enables few-shot command customization with minimal effort and is highly robust under a variety of conditions. Users can define their own commands with high reliability, which makes silent speech interaction highly usable and easy to learn.
Reference: Zixiong Su, Shitao Fang, and Jun Rekimoto. 2023. LipLearner: Customizable Silent Speech Interactions on Mobile Devices. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI '23). Association for Computing Machinery, New York, NY, USA, Article 696, 1–21. https://doi.org/10.1145/3544548.3581465
ACM Digital Library: https://dl.acm.org/doi/10.1145/3544548.3581465
arXiv: https://arxiv.org/abs/2302.05907
Video: https://youtu.be/-WINFwzkPNc
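■ "WESPER: Zero-shot and Realtime Whisper to Normal Voice Conversion for Whisper-based Speech Interactions" (Jun Rekimoto)
This paper explores the possibilities of speech interaction by recognizing whispered speech and converting it to normal speech. WESPER, a zero-shot, real-time whisper-to-normal speech conversion mechanism based on self-supervised learning, addresses the limitations of conventional voice conversion techniques. The conversion is user- and language-independent, preserves the natural prosody of speech, and can improve the speech not only of typical speakers but also of people with speech or hearing impairments.
Reference: Jun Rekimoto. 2023. WESPER: Zero-shot and Realtime Whisper to Normal Voice Conversion for Whisper-based Speech Interactions. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI '23). Association for Computing Machinery, New York, NY, USA, Article 700, 1–12. https://doi.org/10.1145/3544548.3580706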
ACM Digital Library: https://dl.acm.org/doi/10.1145/3544548.3580706
arXiv: https://arxiv.org/abs/2303.01639
Video: https://www.youtube.com/watch?v=ZwDem8JZ0ug
Dr. J. Rekimoto and his research group at the University of Tokyo presented two research papers at the ACM CHI 2023 conference in Hamburg, Germany, and one of them received a Best Paper Award. ACM CHI is the top conference in the field of human-computer interaction. Both papers are the result of Rekimoto’s Human Augmentation / Human-AI Integration vision, in which technology extends human abilities and humans and AI collaborate closely.
[Best Paper Award] "LipLearner: Customizable Silent Speech Interactions on Mobile Devices" (Zixiong Su, Shitao Fang, and Jun Rekimoto)
The study introduces an innovative silent speech interface technology that enables private communications in natural language. Leveraging contrastive learning, LipLearner allows for few-shot command customization with minimal user effort and demonstrates high robustness in various conditions. Users can define their own commands with high reliability, making silent speech interactions highly usable and learnable.
Reference:
Zixiong Su, Shitao Fang, and Jun Rekimoto. 2023. LipLearner: Customizable Silent Speech Interactions on Mobile Devices. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI '23). Association for Computing Machinery, New York, NY, USA, Article 696, 1–21. https://doi.org/10.1145/3544548.3581465
ACM Digital Library: https://dl.acm.org/doi/10.1145/3544548.3581465
arXiv: https://arxiv.org/abs/2302.05907
Video: https://youtu.be/-WINFwzkPNc
"WESPER: Zero-shot and Realtime Whisper to Normal Voice Conversion for Whisper-based Speech Interactions" (Jun Rekimoto)
This paper explores the possibilities for speech interaction by recognizing whispered speech and converting it to normal speech. WESPER, a zero-shot, real-time whisper-to-normal speech conversion mechanism based on self-supervised learning, addresses the limitations of conventional speech conversion techniques. The conversion is user- and language-independent and preserves the natural prosody of speech. It can also improve speech quality for people with speech or hearing impairments.
Reference:
Jun Rekimoto. 2023. WESPER: Zero-shot and Realtime Whisper to Normal Voice Conversion for Whisper-based Speech Interactions. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI '23). Association for Computing Machinery, New York, NY, USA, Article 700, 1–12. https://doi.org/10.1145/3544548.3580706
ACM Digital Library: https://dl.acm.org/doi/10.1145/3544548.3580706
arXiv: https://arxiv.org/abs/2303.01639
Video: https://www.youtube.com/watch?v=ZwDem8JZ0ug
The success of these two papers at ACM CHI 2023 highlights the group's dedication to advancing the field of human-computer interaction and developing technologies that benefit individuals and society as a whole.