Joseph Weizenbaum. Eliza―a computer program for the study of natural language communication between man and machine. Communications of the ACM, Vol. 9, No. 1, pp. 36–45, 1966.
Terry Winograd. Procedures as a Representation for Data in a Computer Program for Understanding Natural Language. Technical report, MIT, 1971. (AITR-235).
Michael Johnston, et al. MATCH: An Architecture for Multimodal Dialogue Systems. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, pp. 376–383, 2002.
Timothy Bickmore and Justine Cassell. Relational agents: a model and implementation of building user trust. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 396–403, 2001.
Dan Bohus and Eric Horvitz. Models for Multiparty Engagement in Open-World Dialog. In Proceedings of the SIGDIAL 2009 Conference, pp. 225–234, 2009.
David DeVault, et al. Simsensei kiosk: a virtual human interviewer for healthcare decision support. In Proceedings of the 2014 International Conference on Autonomous Agents and Multi-Agent Systems, pp. 1061–1068, 2014.
Dian Yu, et al. Gunrock: A Social Bot for Complex and Engaging Long Conversations. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP): System Demonstrations, pp. 79–84, 2019.
Tom Brown, et al. Language Models are Few-Shot Learners. In Proceedings of the International Conference on Neural Information Processing Systems, pp. 1877–1901, 2020.
Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of NAACL, pp. 4171–4186, 2019.
Jared Kaplan, et al. Scaling Laws for Neural Language Models. arXiv preprint arXiv:2001.08361, 2020.
Ashish Vaswani, et al. Attention is All You Need. In Proceedings of the International Conference on Neural Information Processing Systems, pp. 5998–6008, 2017.
Laria Reynolds and Kyle McDonell. Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm. In Proceedings of the Conference on Human Factors in Computing Systems, pp. 1–7, 2021.
Long Ouyang, et al. Training language models to follow instructions with human feedback. In Proceedings of the International Conference on Neural Information Processing Systems, pp. 27730–27744, 2022.
Tatsuya Kawahara. IT Text 音声認識システム [IT Text: Speech Recognition Systems]. Ohmsha, 2016. (In Japanese).
Ryuichi Yamamoto and Shinnosuke Takamichi. Python で学ぶ音声合成 [Learning Speech Synthesis with Python]. Impress, 2021. (In Japanese).
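Katsuya Takanashi. 基礎から分かる会話コミュニケーションの分析法 [Methods for Analyzing Conversational Communication, Explained from the Basics]. Nakanishiya Shuppan, 2016. (In Japanese).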
Gabriel Skantze. Turn-taking in conversational systems and human-robot interaction: A review. Computer Speech & Language, Vol. 67, pp. 1–26, 2021.
Erik Ekstedt and Gabriel Skantze. Voice activity projection: Self-supervised learning of turn-taking events. In Proceedings of Interspeech, pp. 5190–5194, 2022.
Stephen Levinson and Francisco Torreira. Timing in turn-taking and its implications for processing models of language. Frontiers in Psychology, Vol. 6, pp. 10–26, 2015.
Anne Anderson, et al. The HCRC map task corpus. Language and Speech, Vol. 34, No. 4, pp. 351–366, 1991.
Koji Inoue, Bing’er Jiang, Erik Ekstedt, Tatsuya Kawahara, and Gabriel Skantze. Real-time and continuous turn-taking prediction using voice activity projection. arXiv preprint arXiv:2401.04868, pp. 1–10, 2024.
Aaron Powers, Sara Kiesler, Susan Fussell, and Cristen Torrey. Comparing a computer agent with a humanoid robot. In Proceedings of the ACM/IEEE International Conference on Human-Robot Interaction, pp. 145–152, 2007.
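Takanori Komatsu and Seiji Yamada. 適応ギャップがユーザのエージェントに対する印象変化に与える影響 [Effects of the Adaptation Gap on Changes in Users' Impressions of Agents]. Transactions of the Japanese Society for Artificial Intelligence, Vol. 24, No. 2, pp. 232–240, 2009. (In Japanese).
Joon Sung Park, et al. Generative Agents: Interactive Simulacra of Human Behavior. In Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology, 2023.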