A lot of what we do at Locu is automated by machines, but a lot of it also requires a human touch. When we require a human touch at scale, we crowdsource the work to a set of trusty crowd workers. In this post, we’ll shed some light on how Locu designs tasks for our crowd workers. Through these tasks, we work with crowd workers all over the world to perform structured data extraction on price lists like menus.
The conventional wisdom on designing tasks for crowd workers is to create micro tasks: small, verifiable tasks to which multiple workers provide answers (e.g., “Is there a flower in this picture?”). Because a worker’s response to a micro task (e.g., “Yes” or “No”) is simple, we can verify response quality by seeing how much agreement each task garners (e.g., “Four of five workers said no flower appears in the picture”). What we’ll see by the end of this post is that the conventional wisdom does not always lead to the most appropriate task design. We found that in some situations, creating more complex tasks for trusted crowd workers to complete gives us higher-quality results.
Simplicity and Context
Say you’re writing a book. You likely won’t complete a cover-to-cover copy in a sitting. First, you might outline the character development, and identify the chapters you will write each week. You might even start each chapter with an outline of the sections and paragraphs to be written. This is known as task decomposition.
What if you wanted to hire some crowd workers to help write your book? You couldn’t just assign chapter 3 (the one where Harold finally falls in love) to a random worker and have them write that chapter. They would be lacking context: who has Harold been in a relationship with before, and what is the name of his good friend?
Task decomposition with crowd workers is a balance between breaking tasks down into easy-to-complete portions and providing enough context with each task to enable a worker to complete it well. Between response complexity and required context, you end up with a design space that looks like this:
Micro tasks like image labeling are in the bottom left of this space: tasks are simple to ask and answer, and require little context. Complex tasks are in the top right, as they require both a nuanced response from the worker, and a significant amount of context for the worker to complete them accurately. We’ll see that menu structuring falls into this second class. It’s unclear whether there are interesting task designs in the top left or bottom right of this plot. Intuitively, it feels like increasing the context required for a task increases the complexity and nuance of the response, but that correlation is not for us to claim!
At Locu, we use a mix of micro tasks and complex tasks. For business listings (e.g., “What is the phone number of this doctor’s office?”), micro tasks are the way to go. We can ask multiple workers to get us the phone number for Dr. Whatsername, and if we see significant overlap in their responses, we trust their answer. Because context-sensitive tasks have gotten less attention in the discussion of crowd-powered workflows, we’ll talk about those for the rest of this post. We’ll use restaurant menus as our example, although Locu’s platform processes more diverse datasets than that.
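As a rough illustration of the agreement check described above, here is a minimal Python sketch. It is not Locu's actual pipeline: the function names, the digit-only normalization, and the 60% agreement threshold are all assumptions made for the example.

```python
from collections import Counter


def normalize_phone(raw):
    """Keep digits only, so differently formatted copies of a number compare equal."""
    return "".join(ch for ch in raw if ch.isdigit())


def accept_by_agreement(responses, min_agreement=0.6):
    """Return the most common normalized answer if enough workers agree, else None.

    min_agreement is a hypothetical threshold: the fraction of non-empty
    responses that must match the top answer before we trust it.
    """
    normalized = [normalize_phone(r) for r in responses]
    normalized = [n for n in normalized if n]  # drop empty answers
    if not normalized:
        return None
    answer, votes = Counter(normalized).most_common(1)[0]
    return answer if votes / len(normalized) >= min_agreement else None
```

With five workers, four matching answers clear the threshold and the shared number is accepted; a one-to-one split between two workers does not, and the task would need more responses.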
A Seemingly Simple Slice of Pizza
Local merchant price lists, especially restaurant menus, have a complex structure. When we process a restaurant’s offerings, we have to consider several menus (e.g., lunch and dinner), titles, sections (e.g., appetizers and desserts), menu items, menu item descriptions, prices, and choices/additions that modify meals.
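To make that nesting concrete, here is one way the pieces listed above could be modeled in Python. This is a sketch under our own assumptions, not Locu's schema: every class and field name is hypothetical, and the example values are invented.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Option:
    name: str
    price: Optional[float] = None  # extra charge, if the menu lists one


@dataclass
class OptionGroup:
    label: str  # e.g., a heading that introduces choices or additions
    options: List[Option] = field(default_factory=list)


@dataclass
class MenuItem:
    name: str
    description: str = ""
    price: Optional[float] = None
    option_groups: List[OptionGroup] = field(default_factory=list)


@dataclass
class Section:
    title: str  # e.g., "Appetizers" or "Desserts"
    items: List[MenuItem] = field(default_factory=list)


@dataclass
class Menu:
    title: str  # e.g., "Lunch" or "Dinner"
    sections: List[Section] = field(default_factory=list)


def count_items(menus):
    """Count menu items across every section of every menu."""
    return sum(len(section.items) for menu in menus for section in menu.sections)


# Invented example data, for illustration only.
dinner = Menu("Dinner", [
    Section("Appetizers", [MenuItem("Soup of the day", price=5.0)]),
    Section("Desserts", [MenuItem("Cheesecake"), MenuItem("Sorbet")]),
])
```

Even this simplified model has five levels of nesting, which hints at why a single extraction task can be hard to cut into independent pieces.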
A naïve way to decompose the task of extracting the structured data from a menu would have us slicing a menu into multiple (micro) segments and asking workers to type up all of the menu items in one segment of a menu. This would keep individual tasks small and slightly more verifiable, but would give us context-related issues, as you can see in the following pizza menu:
Say we want to extract structured data in the menu above, and that our system decides to send only “Thin-slice” dishes to a crowd worker. Relative to the size of the menu, typing up just the menu items “Small,” “Large,” and “Slice” is quite the micro task. Sadly, this approach would not work: the worker would never know to identify the list of available toppings as potential additions to the pizza. We need to provide more context to our worker, in this case requiring us to increase the size of the task to include more of the information embedded in this menu.
We now have an example of context-sensitivity that forces our crowd workflow design away from micro tasks. Just how large a task should we generate for our workers? For this specific task (extracting the structure and detail of a restaurant menu), we found that we need to give workers quite a bit of context for them to be effective. For example, we will often uncover two versions of the same menu for a restaurant (e.g., this year’s holiday menu and last year’s). Asking different workers to process each menu will result in duplicated effort. We have found that having one worker process all of the menus for a given restaurant, identifying everything from complex menu design to duplicate menus, is our most effective way to achieve high-quality work from Locu’s crowd workers.
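The batching idea can be sketched in a few lines of Python. This is only an illustration of grouping menus so that one task carries everything for a restaurant; the function name, the task dictionary shape, and the identifiers are hypothetical.

```python
from collections import defaultdict


def build_restaurant_tasks(menus):
    """Group (restaurant_id, menu_id) pairs into one task per restaurant.

    Each task lists every menu found for that restaurant, so a single
    worker sees them side by side and can spot duplicates.
    """
    grouped = defaultdict(list)
    for restaurant_id, menu_id in menus:
        grouped[restaurant_id].append(menu_id)
    return [
        {"restaurant_id": rid, "menu_ids": menu_ids}
        for rid, menu_ids in grouped.items()
    ]


# Invented identifiers, for illustration only.
tasks = build_restaurant_tasks([
    ("r1", "holiday-this-year"),
    ("r2", "lunch"),
    ("r1", "holiday-last-year"),
])
```

Here the two holiday menus for restaurant "r1" land in the same task rather than going to two different workers.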
Challenges and Future Directions
We use micro tasks where we can, and larger tasks when worker context is required. Our context-sensitive design is not without its challenges. Crowd workflow design is an iterative process, and we’re still improving ours!
Our first challenge is that, because of task complexity, we don’t have multiple workers’ output on the same task to verify task correctness. With a single worker processing several menus for a restaurant, there is no good way to compare their output to that of another worker. To address this difficulty, we rely on a hierarchy of reviewers to verify and edit worker output. We’ll talk about this in a future post, but in short: trusted workers review the work of newer workers to ensure the data they extract from menus meets our standards. The benefit of this approach is that we can catch mistakes and train our workforce over time.
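Since the details are left for a future post, the following Python sketch is purely hypothetical. It shows one simple way to route a worker's output to a more trusted reviewer. The numeric trust_level and the rule of picking the least-senior worker who still outranks the author are assumptions for illustration, not a description of Locu's review system.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Worker:
    name: str
    trust_level: int  # hypothetical scale: higher means more trusted


def pick_reviewer(author: Worker, pool: List[Worker]) -> Optional[Worker]:
    """Return the least-senior worker who outranks the author, or None.

    Choosing the least-senior eligible reviewer is an assumed policy that
    keeps the most trusted workers free for the hardest reviews.
    """
    candidates = [w for w in pool if w.trust_level > author.trust_level]
    return min(candidates, key=lambda w: w.trust_level, default=None)
```

Under this rule, a worker at the top of the hierarchy has no eligible reviewer, so their output would need a different check, such as spot audits.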
Another challenge comes from worker fatigue. Longer tasks result in more fatigue, and with fatigue comes sloppiness. We’re exploring hybrid micro task/complex task designs in which micro tasks help identify duplicate menus and important contextual clues so that workers can process a subset of a venue’s price lists and stay fresh.
Locu uses a mix of micro tasks and larger context-sensitive designs to extract data from local venue listings. While micro tasks are easier to design, they have their failings, and we have to rely on high-quality workers to process larger tasks with a review process that catches mistakes and trains new workers. Perhaps more important than task design is worker selection: with a dependable workforce, you will know which workers you can rely on when a tough task arises. In the future, we’ll talk more about our worker analytics and review process!
Want to nerd out on crowds and data for a living? Come work with us!