Guiding Task Choice in Japanese Voice Interfaces through Vocalization Cost: Click-based vs. Voice-based Selection

December 11, 2025

Slide overview

Intrinsic motivation is known to improve task performance when individuals make their own choices. However, when multiple tasks are available, people often choose easier ones even when more difficult or troublesome tasks may be more beneficial. This study investigates whether the phrasing of spoken options can influence such decisions in voice-based interfaces by leveraging the cognitive and articulatory effort required for vocalization. We conducted a controlled experiment with 40 participants, systematically varying the linguistic complexity of Japanese adverbial phrases in a pointing task and comparing Voice-based and Click-based selection. Results indicated a clear tendency in the voice condition to avoid the most complex phrase and revealed a modality-specific positional tendency in which left-positioned options were chosen more often and right-positioned options were avoided. To our knowledge, this is the first empirical study to demonstrate that vocalization cost can systematically bias task selection in Japanese voice interfaces. These findings suggest that carefully designed spoken language can subtly guide task selection, providing implications for fair and effective voice interface design.

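Satoshi Nakamura Laboratory, Department of Frontier Media Science, School of Interdisciplinary Mathematical Sciences, Meiji University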

Text of each page
1.

ACM Multimedia Asia 2025 (MMAsia 2025), Kuala Lumpur, Malaysia, December 11, 2025
Guiding Task Choice in Japanese Voice Interfaces through Vocalization Cost: Click-based vs. Voice-based Selection
Ryunosuke Shigematsu, Ryuto Ohishi, Yuki Nakagawa, Satoshi Nakamura (Meiji University)
Takeshi Torii, Hideyuki Takao (SUBARU CORPORATION)

2.

About our study
Please speak one option out loud.
  1. Perform the action as quickly as possible. ← Most people choose this.
  2. Perform the action as quickly as you can manage.
  3. Perform the action as quickly as feasible to the greatest extent possible.

3.

Contribution
• We introduced vocalization cost as an underexamined factor that affects user choices in voice interfaces.
• We explored how this cost can be used to subtly guide users toward better choices.

4.

Background
Intrinsic motivation improves focus and performance, especially when tasks are self-chosen.
• Extrinsic motivation (lower): “Mom: You need to study.”
• Intrinsic motivation (higher): “Me: I want to study!”

5.

Background
Question: Which subject would you study? Math, English, Science, or History?
• Extrinsic motivation: Under external pressure or rewards, all options tend to be done evenly.
• Intrinsic motivation: Selection behavior is influenced by the user’s skills, preferences, and past experience.

6.

Research Goal
Guide users toward more balanced choices while keeping the benefits of intrinsic motivation.
We investigate whether differences in vocalization cost can create selection bias in voice interfaces.
Vocalization cost = the effort needed to speak a phrase.

7.

Hypothesis
Options with higher vocalization cost are avoided, while low-cost options are more likely to be selected.
Example phrases (English examples; Japanese in experiment)
• Simple (dekirudake): Perform the action as quickly as possible.
• Formal (kanouna-kagiri): Perform the action as quickly as you can manage.
• Complex (yareruhannide-saidaigen-ni): Perform the action as quickly as feasible to the greatest extent possible.

8.

Experiment
Pointing task (based on Fitts’ Law)
• Participants clicked circles that appeared at random positions and with random sizes.
• Each trial was performed with a focus on one selected task-related element.
• Each trial consisted of 20 clicks.
• Each participant completed 20 trials.
• Total participants: 40 students
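The slides say only that the pointing task is based on Fitts' Law. As a reminder of what that law relates, here is a minimal sketch of the commonly used Shannon formulation, in which the index of difficulty grows with target distance and shrinks with target width. The function names and the coefficients `a` and `b` are illustrative assumptions, not values or code from the study.

```python
import math


def index_of_difficulty(distance: float, width: float) -> float:
    """Fitts' index of difficulty in bits (Shannon formulation): log2(D / W + 1)."""
    if distance < 0 or width <= 0:
        raise ValueError("distance must be >= 0 and width must be > 0")
    return math.log2(distance / width + 1)


def predicted_movement_time(distance: float, width: float, a: float, b: float) -> float:
    """Movement time MT = a + b * ID.

    a (intercept) and b (slope) are empirically fitted per device and user;
    any values passed here are hypothetical, not taken from the slides.
    """
    return a + b * index_of_difficulty(distance, width)
```

Under this formulation, a small circle far from the cursor has a higher index of difficulty than a large circle nearby, which is why randomizing position and size varies the difficulty of each click.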

9.

Experiment Design
Option construction
3 types of expressions
• Simple (low cost)
• Formal (medium cost)
• Complex (high cost)
5 task-related elements
• Be fast
• Be accurate
• Keep a steady rhythm
• Aim for the center
• Minimize movement
[Figure: example options labeled complex, simple, formal]
Construction rules:
• Expression order was randomized.
• 3 of the 5 elements were randomly selected.
Selection methods
We used a between-subjects design. Each modality was tested with 20 participants:
• Click-based selection
• Voice-based selection
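The construction rules above can be sketched as follows. This is one reading of the slide, not the authors' implementation: it assumes each trial shows three options, that each option pairs a distinct element with a distinct expression type, and that the options are laid out Left, Center, Right. The names `build_options`, `ELEMENTS`, `EXPRESSIONS`, and `POSITIONS` are hypothetical.

```python
import random

EXPRESSIONS = ["simple", "formal", "complex"]
ELEMENTS = [
    "Be fast",
    "Be accurate",
    "Keep a steady rhythm",
    "Aim for the center",
    "Minimize movement",
]
POSITIONS = ["Left", "Center", "Right"]


def build_options(rng: random.Random) -> list[dict[str, str]]:
    """Build one trial's three options (assumed pairing, see lead-in)."""
    # Rule: 3 of the 5 task-related elements are randomly selected.
    elements = rng.sample(ELEMENTS, k=3)
    # Rule: expression order is randomized across the three positions.
    expressions = rng.sample(EXPRESSIONS, k=len(EXPRESSIONS))
    return [
        {"position": pos, "element": elem, "expression": expr}
        for pos, elem, expr in zip(POSITIONS, elements, expressions)
    ]
```

Passing a seeded `random.Random` instance keeps a session reproducible, for example `build_options(random.Random(42))`.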

10.

Results: Selection Rates
[Figure: Selection rates for the three expressions (%); chance level 33.3]
• Click-based: simple 30.9, formal 33.3, complex 35.8
• Voice-based: simple 38.4, formal 37.4, complex 24.2
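To make the size of the effect easier to read, the sketch below computes how far each reported selection rate sits from the chance level of one in three. The rates are the ones shown on the slides (Click: 30.9, 33.3, 35.8; Voice: 38.4, 37.4, 24.2, for simple, formal, complex); the helper name `deviation_from_chance` is an assumption, and this is descriptive arithmetic only, not the statistical test used in the study.

```python
CHANCE = 100 / 3  # three options, so chance is about 33.3%

# Selection rates (%) for the three expressions, as reported on the slides.
RATES = {
    "Click": {"simple": 30.9, "formal": 33.3, "complex": 35.8},
    "Voice": {"simple": 38.4, "formal": 37.4, "complex": 24.2},
}


def deviation_from_chance(rates: dict[str, float]) -> dict[str, float]:
    """Difference from chance in percentage points, rounded to one decimal."""
    return {name: round(rate - CHANCE, 1) for name, rate in rates.items()}


def least_selected(rates: dict[str, float]) -> str:
    """Name of the expression with the lowest selection rate."""
    return min(rates, key=rates.get)
```

With these numbers, the complex expression in the voice condition is roughly 9 percentage points below chance, while every click-condition rate stays within about 2.5 points of it.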

11.

Results: Selection Rates
[Figure: Selection rates for the three expressions (%), Click-based; chance level 33.3]
• Click-based: simple 30.9, formal 33.3, complex 35.8
The three expressions were chosen at similar rates, around the chance level.

12.

Results: Selection Rates
[Figure: Selection rates for the three expressions (%), Voice-based; chance level 33.3]
• Voice-based: simple 38.4, formal 37.4, complex 24.2
Only the complex expression was selected much less frequently.

13.

Results: Selection Rates
[Figure: Selection rates for the three positions (%); chance level 33.3]
• Click-based: Left 33.3, Center 33.3, Right 33.4
• Voice-based: Left 36.0, Center 34.3, Right 29.8

14.

Results: Selection Rates
[Figure: Selection rates for the three positions (%), Click-based; chance level 33.3]
• Click-based: Left 33.3, Center 33.3, Right 33.4
No positional bias was observed.

15.

Results: Selection Rates
[Figure: Selection rates for the three positions (%), Voice-based; chance level 33.3]
• Voice-based: Left 36.0, Center 34.3, Right 29.8
Selection rates were higher on the Left and decreased toward the Right.

16.

Discussion
• Avoidance of high-cost phrases appeared: Users likely minimized vocal effort, leading them to avoid high-cost (complex) phrases.
• Positional bias appeared: Because Japanese is read left-to-right, participants likely scanned options in that order and made decisions before reaching the rightmost option.

17.

Practical Implications for Interface Design
In voice interfaces:
✓ Reduce vocalization cost for beneficial tasks
✓ Add vocalization cost to less beneficial tasks
✓ Place beneficial tasks toward the left
These manipulations help guide users toward better choices while keeping the benefits of intrinsic motivation.
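One way to picture these three guidelines together is a small layout helper that ranks tasks by benefit, then gives the most beneficial task the leftmost slot and the lowest-cost phrasing. This is an illustrative sketch only: the benefit scores, the function name `arrange_options`, and the strict one-to-one mapping from rank to position and expression are assumptions, not a procedure described in the slides.

```python
POSITIONS = ["Left", "Center", "Right"]          # left is favored in voice selection
EXPRESSIONS = ["simple", "formal", "complex"]    # ordered from low to high vocalization cost


def arrange_options(benefit_by_task: dict[str, float]) -> list[dict[str, str]]:
    """Assign position and phrasing so that more beneficial tasks are easier to pick.

    benefit_by_task maps a task name to a hypothetical benefit score
    (higher means the designer wants it chosen more often).
    """
    if len(benefit_by_task) != len(POSITIONS):
        raise ValueError("this sketch handles exactly three tasks")
    # Most beneficial first, so it receives "Left" and "simple".
    ranked = sorted(benefit_by_task, key=benefit_by_task.get, reverse=True)
    return [
        {"task": task, "position": pos, "expression": expr}
        for task, pos, expr in zip(ranked, POSITIONS, EXPRESSIONS)
    ]
```

For example, `arrange_options({"Be accurate": 0.9, "Be fast": 0.2, "Aim for the center": 0.5})` places "Be accurate" on the left with the simple phrasing and "Be fast" on the right with the complex phrasing.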

18.

Summary
• Background: Vocalization cost may bias choices in voice interfaces.
• Proposed Method: We manipulated the linguistic complexity of Japanese expressions to isolate vocalization cost.
• Results: High-cost phrases were avoided, and a left-to-right positional bias appeared only in voice selection.
• Future Work: Extend to other languages, broader populations, and real-world voice interactions.

19.

Thank you very much for listening!