Vid2Coach

Participants expressed a strong desire to use the system in their daily lives, noting that "externalized structure makes [tasks] feel step-by-step doable".

Because video creators generally do not design their videos with accessibility in mind, Vid2Coach uses a Retrieval-Augmented Generation (RAG) pipeline. It cross-references each step's instructions with a database of verified accessibility guidelines for blind and low-vision (BLV) users, then adds tailored non-visual strategies, such as suggesting a high-contrast cutting board for low-vision users or a plunge chopper for blind users.

4. Wearable Real-Time Monitoring
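To make the retrieval-augmented step that adds accessibility strategies more concrete, here is a minimal sketch. It is not Vid2Coach's implementation: the `Guideline` record, the keyword sets, and the `retrieve` and `augment` helpers are all hypothetical, and plain keyword overlap stands in for whatever retriever and language model the real system uses. Only the two example strategies come from the description of the pipeline.

```python
import re
from dataclasses import dataclass


@dataclass
class Guideline:
    text: str           # non-visual strategy to suggest
    keywords: set[str]  # terms that make the guideline relevant (assumed)
    audience: str       # "low_vision" or "blind"


# Hypothetical guideline database; keywords are illustrative assumptions.
GUIDELINES = [
    Guideline("Use a high-contrast cutting board.",
              {"slice", "slicing", "chop", "cut"}, "low_vision"),
    Guideline("Use a plunge chopper instead of a knife.",
              {"slice", "slicing", "chop", "dice"}, "blind"),
]


def retrieve(step: str, audience: str, top_k: int = 1) -> list[Guideline]:
    """Rank guidelines for one audience by keyword overlap with the step."""
    tokens = set(re.findall(r"[a-z]+", step.lower()))
    scored = [(len(g.keywords & tokens), g)
              for g in GUIDELINES if g.audience == audience]
    # Keep only guidelines that share at least one keyword with the step.
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [g for _, g in scored[:top_k]]


def augment(step: str, audience: str) -> str:
    """Append retrieved strategies to a step; return it unchanged if none match."""
    tips = retrieve(step, audience)
    if not tips:
        return step
    return step + " Tip: " + " ".join(g.text for g in tips)


print(augment("Slice the peppers into strips.", "blind"))
```

A production pipeline would more likely retrieve with embeddings and pass the matched guidelines to a language model to phrase the tip, but the shape is the same: retrieve guidelines relevant to the step and the user, then attach them to the instruction.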

Vid2Coach addresses this limitation by using AI to parse video content, map it to real-world tasks, and provide proactive verbal coaching.

How Vid2Coach Works: Core Architecture

🔗 Learn more about the research at Mina Huh's Vid2Coach Project Page or check out the full paper on arXiv.

Vid2Coach breaks down a how-to video into high-level steps. Using multimodal understanding, it adds detailed demonstration descriptions, such as specific tool usage or visual cues (e.g., "slicing peppers into 1/4 inch strips"), that might be shown but not narrated.

In studies, participants using Vid2Coach completed cooking tasks with 58.5% fewer errors compared to their typical workflows.

