[CHI '14]
Crowdsourcing Step-by-Step Information Extraction to Enhance Existing How-to Videos
Juho Kim Massachusetts Institute of Technology, Cambridge, MA, USA
Phu Tran Nguyen Massachusetts Institute of Technology, Cambridge, MA, USA
Sarah Weir Massachusetts Institute of Technology, Cambridge, MA, USA
Philip J. Guo University of Rochester & Massachusetts Institute of Technology, Rochester, NY, USA
Robert C. Miller Massachusetts Institute of Technology, Cambridge, MA, USA
Krzysztof Z. Gajos Harvard University, Cambridge, MA, USA
Millions of learners today use how-to videos to master new skills in a variety of domains. But browsing such videos is often tedious and inefficient because video player interfaces are not optimized for their unique step-by-step structure. This research aims to improve the learning experience of watching existing how-to videos with step-by-step annotations.
We first performed a formative study to verify that annotations are actually useful to learners. For this study, we created ToolScape, an interactive video player that displays step descriptions and intermediate result thumbnails in the video timeline. Learners in our study performed better and gained more self-efficacy using ToolScape versus a traditional video player.
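To make the annotation format concrete, here is a minimal sketch of the per-step record a ToolScape-style timeline could render. The class and field names are illustrative assumptions, not ToolScape's actual schema.

```python
# A sketch of the step annotations a ToolScape-style player could lay out
# along the video scrubber; field names are assumptions for illustration.

from dataclasses import dataclass

@dataclass
class StepAnnotation:
    start: float        # seconds into the video where the step begins
    description: str    # short text shown on the timeline, e.g. "apply blur"
    before_thumb: str   # path/URL of the frame just before the step
    after_thumb: str    # path/URL of the intermediate result after the step

# A player would render these along the timeline so a learner can jump
# directly to the step whose result thumbnail matches their goal.
steps = [
    StepAnnotation(34.0, "select the brush tool", "f34_pre.png", "f41_post.png"),
    StepAnnotation(72.5, "apply gaussian blur", "f72_pre.png", "f80_post.png"),
]
```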
To add the necessary step annotations to existing how-to videos at scale, we introduce a novel crowdsourcing workflow. It extracts step-by-step structure from an existing video, including step times, descriptions, and before and after images. We introduce the Find-Verify-Expand design pattern for temporal and visual annotation, which applies clustering, text processing, and visual analysis algorithms to merge crowd output. The workflow does not rely on domain-specific customization, works on top of existing videos, and recruits untrained crowd workers. We evaluated the workflow on Mechanical Turk with 75 cooking, makeup, and Photoshop videos from YouTube. Results show that the workflow extracts steps at a quality comparable to that of trained annotators across all three domains, with 77% precision and 81% recall.
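As a rough illustration of how crowd output from the Find stage might be merged, the sketch below clusters workers' candidate step timestamps along the timeline and keeps only clusters that enough workers agree on. The 10-second window, 2-worker agreement threshold, and the median/longest-description heuristics are assumptions for illustration, not the paper's actual parameters or algorithms.

```python
# A minimal sketch of merging crowd-submitted step candidates by temporal
# clustering. All parameters and heuristics here are assumptions, not the
# paper's actual implementation.

from dataclasses import dataclass

@dataclass
class Candidate:
    time: float        # seconds into the video where a worker marked a step
    description: str   # the worker's free-text label for that step

def merge_candidates(candidates, window=10.0, min_agreement=2):
    """Greedily chain candidates that fall within `window` seconds of the
    previous one into clusters; keep clusters with >= min_agreement votes."""
    clusters = []
    for c in sorted(candidates, key=lambda cand: cand.time):
        if clusters and c.time - clusters[-1][-1].time <= window:
            clusters[-1].append(c)
        else:
            clusters.append([c])
    steps = []
    for cluster in clusters:
        if len(cluster) >= min_agreement:
            # Median timestamp and longest description serve as a crude
            # stand-in for the paper's text-processing merge step.
            times = sorted(c.time for c in cluster)
            median = times[len(times) // 2]
            label = max((c.description for c in cluster), key=len)
            steps.append((median, label))
    return steps

if __name__ == "__main__":
    raw = [
        Candidate(12.0, "crack the eggs"),
        Candidate(13.5, "crack two eggs into the bowl"),
        Candidate(45.2, "whisk"),
        Candidate(46.0, "whisk until fluffy"),
        Candidate(90.0, "noise"),  # lone outlier, dropped by agreement check
    ]
    for t, desc in merge_candidates(raw):
        print(f"{t:6.1f}s  {desc}")
```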