Core Tenets
The way Mercor thinks about collecting high-quality human data for GenAI use cases:
- Data has become core product and IP for both foundation model and application-layer companies. It increasingly makes sense to build data collection and annotation processes in-house, both to improve core capabilities and to keep data secure.
- The quality of your annotator team is the biggest lever on the quality of your data. QA processes and tooling used to be the biggest lever when the work was non-specialized, but now annotators must truly be experts to push the frontier of model capabilities.
- Experimentation speed is critical for continuously improving your model. Data annotation should not be a blocker for your experimentation cycles.
Current State
Unfortunately, neither of the two options research teams have today for creating human data is perfect:
- Outsource the entire data collection and annotation process to external vendors, which results in:
  - Slower experimentation, as time is spent negotiating per-task prices and sharing feedback with the vendor.
  - Data security concerns, especially if the vendor prefers to use their own platform. Many vendors have already leaked data.
  - Lack of transparency around the cost and quality of data annotators, since vendors can be incentivized to keep costs low by obfuscating time spent and annotator backgrounds.
- Build a human data team themselves, which comes with:
  - Higher fixed costs, in the form of hiring a larger internal human data team and maintaining a team of annotators.
  - The need to constantly source, vet, hire, performance-manage, and terminate contractors for your annotator team.
  - Time and resources spent on operations that are not core competencies.
Black Box vs. Open Box
A more detailed comparison of the Black Box and Open Box approaches:

| Criteria | Black Box | Open Box |
|---|---|---|
| Experimentation speed | Spend time negotiating per-task prices with vendors for each iteration; play a game of telephone between your research team, the vendor, and annotators | No new pricing negotiations each time you kick off a new project, just a flat percentage; work directly with annotators when you need to iterate quickly |
| Data security | Risk of data being leaked by unknown annotators if it is stored or processed on the vendor’s platform | Retain control of data and IP by keeping work on your own platform if preferred; know exactly who handles your data |
| Cost structure | Unknown vendor margins | Full visibility into annotator pay |
| Management overhead | Vendor handles operations, but you will likely still need a small team to coordinate with research teams, manage vendors, and assess quality | Build an internal human data team, but outsource annotator sourcing and management, along with pipeline and process setup, to vendors |
How Mercor helps teams build the “Open Box” solution:
- Create a proposal on scope and methodology
  - We come in with a custom proposal based on our initial understanding of the team’s needs, then collect clarifying input on topics like data distribution, volume, and what they’re optimizing for.
  - There are no lengthy pricing and scoping discussions each time they kick off a new project. We just take a flat percentage of the annotators’ pay rates.
- Source and vet high-quality talent
  - We have 300k+ experts in our talent pool, and look at a combination of factors, including interviews, work experience, education, GitHub profiles, Google Scholar citations, and more, to find the best annotators for specific projects.
  - We surface all selected annotator profiles before moving forward, for full transparency.
- Design and set up pipelines
  - Teams can choose to use their own platform or ours. If they don’t have a platform to start with, we can set up the workflow and pipeline with our tooling.
  - Our Human Data Handbook provides documents and processes to get started with, including guidelines, style guides, rubrics, and pipeline designs, and we can help tailor these to custom needs.
  - We also help set up QA processes, from automated metrics to peer-review pipelines, tailored to the type of data and the quality bar required.
- Measure key metrics on an ongoing basis
  - We define a set of metrics to monitor the quality, volume, and cost efficiency of the data being created, in aggregate and for each annotator (a minimal sketch of such a rollup follows this list).
  - How custom these metrics are depends on the project. Quality in particular is usually quite context-specific, so we often create project-specific rubrics that define what counts as quality annotator work.
- Swap out talent based on performance and changing needs
  - When annotators are underperforming or no longer needed for a project, we proactively notify and off-board them based on these metrics.
  - If companies have new needs, we source, surface, and onboard new annotators within hours to days. Teams can also use our platform to search for specific top talent themselves.
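
To make the monitoring step concrete, here is a minimal sketch of a per-annotator metrics rollup and a simple automated QA check. It is illustrative only: the field names (`annotator_id`, `rubric_score`, `seconds_spent`, `hourly_rate`), the disagreement threshold, and the in-memory task log are hypothetical placeholders rather than Mercor’s actual schema or tooling; a real pipeline would read from the annotation platform’s data store.

```python
# Minimal sketch: per-annotator quality/volume/cost rollup plus a simple
# automated disagreement check. All field names and values are hypothetical.
from collections import defaultdict
from statistics import mean

# Hypothetical task log: one record per completed annotation.
task_log = [
    {"task_id": "t1", "annotator_id": "a1", "rubric_score": 4.5, "seconds_spent": 600, "hourly_rate": 60.0},
    {"task_id": "t2", "annotator_id": "a1", "rubric_score": 3.0, "seconds_spent": 900, "hourly_rate": 60.0},
    {"task_id": "t1", "annotator_id": "a2", "rubric_score": 2.5, "seconds_spent": 700, "hourly_rate": 75.0},
    {"task_id": "t3", "annotator_id": "a2", "rubric_score": 5.0, "seconds_spent": 500, "hourly_rate": 75.0},
]

def annotator_metrics(log):
    """Aggregate quality, volume, and cost efficiency for each annotator."""
    by_annotator = defaultdict(list)
    for record in log:
        by_annotator[record["annotator_id"]].append(record)

    report = {}
    for annotator_id, records in by_annotator.items():
        hours = sum(r["seconds_spent"] for r in records) / 3600
        cost = sum(r["seconds_spent"] / 3600 * r["hourly_rate"] for r in records)
        report[annotator_id] = {
            "tasks_completed": len(records),                                          # volume
            "mean_rubric_score": round(mean(r["rubric_score"] for r in records), 2),  # quality
            "tasks_per_hour": round(len(records) / hours, 2),                         # throughput
            "cost_per_task": round(cost / len(records), 2),                           # cost efficiency
        }
    return report

def disagreement_flags(log, threshold=1.0):
    """Automated QA check: flag double-annotated tasks whose rubric scores
    diverge by more than `threshold`, so they can be escalated to peer review."""
    scores_by_task = defaultdict(list)
    for record in log:
        scores_by_task[record["task_id"]].append(record["rubric_score"])
    return [
        task_id
        for task_id, scores in scores_by_task.items()
        if len(scores) > 1 and max(scores) - min(scores) > threshold
    ]

if __name__ == "__main__":
    for annotator_id, metrics in annotator_metrics(task_log).items():
        print(annotator_id, metrics)
    print("Tasks flagged for peer review:", disagreement_flags(task_log))
```

In practice, the quality signal would come from the project-specific rubrics described above, and flagged tasks would be routed into the peer-review pipeline rather than printed.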
