Title: Towards a Human-like Open-Domain Chatbot
Authors: Adiwardana, Daniel; Luong, Minh-Thang; So, David R.; Hall, Jamie; Fiedel, Noah; Thoppilan, Romal; Yang, Zi; Kulshreshtha, Apoorv; Nemade, Gaurav; Lu, Yifeng; Le, Quoc V.
Dates: 2025-06-02; 2025-06-02; 2020-01-27
URI: https://arxiv.org/abs/2001.09977
URI: http://data.inu.ac.kr/handle/123456789/1961
Summary: Proposes the 2.6B-parameter Meena model and evaluates multi-turn conversational performance with a new evaluation metric, SSA (Sensibleness and Specificity Average). The best-performing model achieves an SSA of 79%, suggesting that human-level performance (86%) may be within reach.
Rights: ©2020 Google Research
Abstract: We present Meena, a multi-turn open-domain chatbot trained end-to-end on data mined and filtered from public domain social media conversations. This 2.6B parameter neural network is simply trained to minimize perplexity of the next token. We also propose a human evaluation metric called Sensibleness and Specificity Average (SSA), which captures key elements of a human-like multi-turn conversation. Our experiments show strong correlation between perplexity and SSA. The fact that the best perplexity end-to-end trained Meena scores high on SSA (72% on multi-turn evaluation) suggests that a human-level SSA of 86% is potentially within reach if we can better optimize perplexity. Additionally, the full version of Meena (with a filtering mechanism and tuned decoding) scores 79% SSA, 23% higher in absolute SSA than the existing chatbots we evaluated.
Language: en-US
Keywords: Meena; Open-Domain Chatbot; SSA; Multi-turn Conversation; Dialogue Systems
Type: Article
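
Note: The abstract names two quantities, SSA and next-token perplexity, without giving formulas. The following is a minimal sketch, assuming SSA is the plain mean of the sensibleness rate and the specificity rate over labeled responses, and that perplexity is the exponential of the mean negative log-probability per token. The function names and the 0/1 label format are hypothetical and are not taken from this record; the labeling protocol itself is not described here.

```python
import math


def ssa(sensible_labels, specific_labels):
    """Mean of the sensibleness rate and the specificity rate.

    Each argument is a list of 0/1 human labels, one per response
    (hypothetical input format, assumed for illustration).
    """
    if not sensible_labels or len(sensible_labels) != len(specific_labels):
        raise ValueError("label lists must be non-empty and of equal length")
    sensibleness = sum(sensible_labels) / len(sensible_labels)
    specificity = sum(specific_labels) / len(specific_labels)
    return (sensibleness + specificity) / 2


def perplexity(token_log_probs):
    """exp of the mean negative natural-log probability per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


if __name__ == "__main__":
    # Four responses: three judged sensible, two judged specific.
    print(ssa([1, 1, 0, 1], [1, 0, 0, 1]))
    # A model that assigns probability 0.25 to every token.
    print(perplexity([math.log(0.25)] * 4))
```

Under these assumptions, a lower perplexity means the model assigns higher probability to the observed next tokens, which is the quantity the abstract reports as strongly correlated with SSA.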