Wednesday, May 28, 2014

(10) Cascade: Crowdsourcing Taxonomy Creation

[CHI '13]
Cascade: Crowdsourcing Taxonomy Creation

Lydia B. Chilton, University of Washington
Greg Little, oDesk Research
Darren Edge, Microsoft Research Asia
Daniel S. Weld, University of Washington
James A. Landay, University of Washington

Taxonomies are a useful and ubiquitous way of organizing information. However, creating organizational hierarchies is difficult because the process requires a global understanding of the objects to be categorized. Usually one is created by an individual or a small group of people working together for hours or even days. Unfortunately, this centralized approach does not work well for the large, quickly-changing datasets found on the web. Cascade is an automated workflow that creates a taxonomy from the collective efforts of crowd workers who spend as little as 20 seconds each. We evaluate Cascade and show that on three datasets its quality is 80-90% of that of experts. The cost of Cascade is competitive with expert information architects, despite taking six times more human labor. Fortunately, this labor can be parallelized such that Cascade can run in as little as five minutes instead of hours or days.

(9) Webzeitgeist: Design Mining the Web

[CHI '13]
Webzeitgeist: Design Mining the Web

Ranjitha Kumar, Stanford University
Arvind Satyanarayan, Stanford University
Cesar Torres, Stanford University
Maxine Lim, Stanford University
Salman Ahmad, Massachusetts Institute of Technology
Scott R. Klemmer, Stanford University
Jerry O. Talton, Intel Corporation

Advances in data mining and knowledge discovery have transformed the way Web sites are designed. However, while visual presentation is an intrinsic part of the Web, traditional data mining techniques ignore render-time page structures and their attributes. This paper introduces design mining for the Web: using knowledge discovery techniques to understand design demographics, automate design curation, and support data-driven design tools. This idea is manifest in Webzeitgeist, a platform for large-scale design mining comprising a repository of over 100,000 Web pages and 100 million design elements. This paper describes the principles driving design mining, the implementation of the Webzeitgeist architecture, and the new class of data-driven design applications it enables.

(8) Catalyst: Triggering Collective Action with Thresholds

[CSCW '14]
Catalyst: Triggering Collective Action with Thresholds

Justin Cheng, Stanford HCI Group, Computer Science Department, Stanford University
Michael S. Bernstein, Stanford HCI Group, Computer Science Department, Stanford University

The web is a catalyst for drawing people together around shared goals, but many groups never reach critical mass. It can thus be risky to commit time or effort to a goal: participants show up only to discover that nobody else did, and organizers devote significant effort to causes that never get off the ground. Crowdfunding has lessened some of this risk by only calling in donations when an effort reaches a collective monetary goal. However, it leaves unsolved the harder problem of mobilizing effort, time and participation. We generalize the concept into activation thresholds, commitments that are conditioned on others' participation. With activation thresholds, supporters only need to show up for an event if enough other people commit as well. Catalyst is a platform that introduces activation thresholds for on-demand events. For more complex coordination needs, Catalyst also provides thresholds based on time or role (e.g., a bake sale requiring commitments for bakers, decorators, and sellers). In a multi-month field deployment, Catalyst helped users organize events including food bank volunteering, on-demand study groups, and mass participation events like a human chess game. Our results suggest that activation thresholds can indeed catalyze a large class of new collective efforts. 
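The activation-threshold mechanism described above is simple enough to sketch in code. The following is a hypothetical simplification (the class, role names, and data model here are illustrative assumptions, not Catalyst's actual implementation): supporters commit to roles, and the event triggers only once every role's threshold is met.

```python
from collections import defaultdict

class ThresholdEvent:
    """An event that activates only when every role meets its commitment threshold."""

    def __init__(self, thresholds):
        # thresholds: role -> minimum number of committed participants,
        # e.g. {"baker": 3, "seller": 2} for a bake sale
        self.thresholds = thresholds
        self.commitments = defaultdict(set)

    def commit(self, role, person):
        # A commitment is conditional: it costs nothing unless the event activates.
        self.commitments[role].add(person)

    def is_activated(self):
        # Supporters are only called on once *all* role thresholds are reached.
        return all(len(self.commitments[role]) >= need
                   for role, need in self.thresholds.items())

bake_sale = ThresholdEvent({"baker": 2, "seller": 1})
bake_sale.commit("baker", "alice")
bake_sale.commit("seller", "bob")
print(bake_sale.is_activated())  # False: one more baker needed
bake_sale.commit("baker", "carol")
print(bake_sale.is_activated())  # True: the event triggers
```

The key design point is that `is_activated` is all-or-nothing: nobody is asked to show up until the collective condition holds, which removes the risk of committing first and finding nobody else did.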

(7) A Colorful Approach to Text Processing by Example


[UIST '13]
A Colorful Approach to Text Processing by Example

Kuat Yessenov Massachusetts Institute of Technology, Cambridge, USA
Shubham Tulsiani IIT, Kanpur, India
Aditya Menon University of California, San Diego, San Diego, USA
Robert C. Miller Massachusetts Institute of Technology, Cambridge, USA
Sumit Gulwani Microsoft, Redmond, USA
Butler Lampson Microsoft Research, Cambridge, USA
Adam Kalai Microsoft Research, Cambridge, USA

Text processing, tedious and error-prone even for programmers, remains one of the most alluring targets of Programming by Example. An examination of real-world text processing tasks found on help forums reveals that many such tasks, beyond simple string manipulation, involve latent hierarchical structures.

We present STEPS, a programming system for processing structured and semi-structured text by example. STEPS users create and manipulate hierarchical structure by example. In a between-subjects user study with fourteen computer scientists, STEPS compares favorably to traditional programming.

(6) Shepherding the Crowd Yields Better Work


[CSCW '12]
Shepherding the Crowd Yields Better Work

Steven Dow Carnegie Mellon University, Pittsburgh, Pennsylvania, USA
Anand Kulkarni University of California, Berkeley, Berkeley, California, USA
Scott Klemmer Stanford University, Stanford, California, USA
Björn Hartmann University of California, Berkeley, Berkeley, California, USA

Micro-task platforms provide massively parallel, on-demand labor. However, it can be difficult to reliably achieve high-quality work because online workers may behave irresponsibly, misunderstand the task, or lack necessary skills. This paper investigates whether timely, task-specific feedback helps crowd workers learn, persevere, and produce better results. We investigate this question through Shepherd, a feedback system for crowdsourced work. In a between-subjects study with three conditions, crowd workers wrote consumer reviews for six products they own. Participants in the None condition received no immediate feedback, consistent with most current crowdsourcing practices. Participants in the Self-assessment condition judged their own work. Participants in the External assessment condition received expert feedback. Self-assessment alone yielded better overall work than the None condition and helped workers improve over time. External assessment also yielded these benefits. Participants who received external assessment also revised their work more. We conclude by discussing interaction and infrastructure approaches for integrating real-time assessment into online work.

(5) Crowdsourcing Step-by-Step Information Extraction to Enhance Existing How-to Videos

[CHI '14]
Crowdsourcing Step-by-Step Information Extraction to Enhance Existing How-to Videos

Juho Kim Massachusetts Institute of Technology, Cambridge, MA, USA
Phu Tran Nguyen Massachusetts Institute of Technology, Cambridge, MA, USA
Sarah Weir Massachusetts Institute of Technology, Cambridge, MA, USA
Philip J. Guo University of Rochester & Massachusetts Institute of Technology, Rochester, NY, USA
Robert C. Miller Massachusetts Institute of Technology, Cambridge, MA, USA
Krzysztof Z. Gajos Harvard University, Cambridge, MA, USA

Millions of learners today use how-to videos to master new skills in a variety of domains. But browsing such videos is often tedious and inefficient because video player interfaces are not optimized for the unique step-by-step structure of such videos. This research aims to improve the learning experience of watching existing how-to videos with step-by-step annotations.
We first performed a formative study to verify that annotations are actually useful to learners. For this study, we created ToolScape, an interactive video player that displays step descriptions and intermediate result thumbnails in the video timeline. Learners in our study performed better and gained more self-efficacy using ToolScape versus a traditional video player.
To add the necessary step annotations to existing how-to videos at scale, we introduce a novel crowdsourcing workflow. It extracts step-by-step structure from an existing video, including step times, descriptions, and before and after images. We introduce the Find-Verify-Expand design pattern for temporal and visual annotation, which applies clustering, text processing, and visual analysis algorithms to merge crowd output. The workflow does not rely on domain-specific customization, works on top of existing videos, and recruits untrained crowd workers. We evaluated the workflow with Mechanical Turk, using 75 cooking, makeup, and Photoshop videos on YouTube. Results show that our workflow can extract steps with a quality comparable to that of trained annotators across all domains, with 77% precision and 81% recall.
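To give a feel for the clustering step that merges crowd output, here is a minimal sketch of how crowd-submitted step timestamps might be grouped. The function name, window size, and vote threshold are illustrative assumptions, not the paper's actual parameters or algorithm:

```python
def merge_timestamps(times, window=5.0, min_votes=2):
    """Group crowd-submitted step times that fall within `window` seconds
    of each other, keeping clusters supported by at least `min_votes` workers."""
    clusters = []
    for t in sorted(times):
        if clusters and t - clusters[-1][-1] <= window:
            clusters[-1].append(t)  # close enough: same candidate step
        else:
            clusters.append([t])    # too far: start a new candidate step
    # A cluster with enough agreement becomes one merged step time (the mean).
    return [sum(c) / len(c) for c in clusters if len(c) >= min_votes]

# Three workers marked a step near 0:12; one stray click at 0:40 lacks
# agreement and is dropped, leaving a single merged step around 12.2s.
print(merge_timestamps([11.0, 12.5, 13.0, 40.0]))
```

Requiring a minimum number of agreeing workers is what lets the workflow use untrained crowd workers: individual noise is filtered out while genuine steps, marked independently by several people, survive.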

(4) Cobi: A Community-Informed Conference Scheduling Tool


[UIST '13]
Cobi: A Community-Informed Conference Scheduling Tool

Juho Kim Massachusetts Institute of Technology, Cambridge, MA, USA
Haoqi Zhang Northwestern University, Evanston, IL, USA
Paul André Carnegie Mellon University, Pittsburgh, PA, USA
Lydia B. Chilton University of Washington, Seattle, WA, USA
Wendy Mackay INRIA, Orsay, France
Michel Beaudouin-Lafon Université Paris-Sud, Orsay, France
Robert C. Miller Massachusetts Institute of Technology, Cambridge, MA, USA
Steven P. Dow Carnegie Mellon University, Pittsburgh, PA, USA

Effectively planning a large multi-track conference requires an understanding of the preferences and constraints of organizers, authors, and attendees. Traditionally, the onus of scheduling the program falls on a few dedicated organizers. Resolving conflicts becomes difficult due to the size and complexity of the schedule and the lack of insight into community members' needs and desires. Cobi presents an alternative approach to conference scheduling that engages the entire community in the planning process. Cobi comprises (a) communitysourcing applications that collect preferences, constraints, and affinity data from community members, and (b) a visual scheduling interface that combines communitysourced data and constraint-solving to enable organizers to make informed improvements to the schedule. This paper describes Cobi's scheduling tool and reports on a live deployment for planning CHI 2013, where organizers considered input from 645 authors and resolved 168 scheduling conflicts. Results show the value of integrating community input with an intelligent user interface to solve complex planning tasks.
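One kind of conflict the communitysourced affinity data makes detectable can be sketched as follows. This is a hypothetical simplification (the function, schedule representation, and paper names are illustrative, not Cobi's actual data model): flag paper pairs that attendees want to see together but that are scheduled in parallel sessions.

```python
# Communitysourced affinity pairs are papers attendees want to see together.
# A conflict arises when such a pair is placed in different sessions that
# run in the same timeslot, forcing attendees to choose.
def find_conflicts(schedule, affinities):
    """schedule: paper -> (session, timeslot); affinities: set of paper pairs."""
    conflicts = []
    for a, b in affinities:
        session_a, slot_a = schedule[a]
        session_b, slot_b = schedule[b]
        if session_a != session_b and slot_a == slot_b:  # parallel, same slot
            conflicts.append((a, b))
    return conflicts

schedule = {"p1": ("Crowdsourcing", 1), "p2": ("Visualization", 1),
            "p3": ("Crowdsourcing", 1)}
affinities = {("p1", "p2"), ("p1", "p3")}
print(find_conflicts(schedule, affinities))  # [('p1', 'p2')]
```

Surfacing such conflicts visually, rather than solving the schedule fully automatically, is what lets organizers stay in the loop and make informed trade-offs.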