How the work gets done matters, a lot
Here is how Mercor thinks about collecting high-quality human data for GenAI use cases.
Unfortunately, neither of the two options research teams have today for creating human data (building an in-house data team or outsourcing to data vendors) is perfect.
Our goal at Mercor is to help our clients achieve the best of both worlds with an “Open Box” approach, as opposed to the “Black Box” approach commonly favored by data vendors.
The Open Box is characterized by direct access to annotators, full visibility into what they are paid, and client control over data and pipelines.
A more detailed comparison of the Black Box and Open Box approaches:
| Criteria | Black Box | Open Box |
|---|---|---|
| Experimentation speed | Spend time negotiating with vendors over price per task for each iteration. Play a game of telephone between your research team, the vendor, and annotators. | No new pricing negotiations every time you kick off a new project, just a flat percentage fee. Work directly with annotators when you need to iterate quickly. |
| Data security | Risk data being leaked by unknown annotators if it is stored or processed on the vendor’s platform. | Retain control of data and IP by keeping work on your own platform if preferred. Know exactly who handles your data. |
| Cost structure | Unknown vendor margins. | Full visibility into annotator pay (see the sketch after this table). |
| Management overhead | The vendor handles operations, but you will likely still need a small internal team to coordinate with research teams, manage vendors, and assess quality. | Build an internal human data team, and outsource annotator sourcing and management, data pipelines, and process setup to the vendor. |
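As a quick illustration of the flat-percentage cost model, here is a minimal sketch of what the client sees; the hourly rate and fee below are hypothetical examples, not Mercor’s actual pricing:

```python
# Hypothetical flat-percentage pricing sketch: the client sees the
# annotator's rate and the platform fee separately, so the vendor's
# margin is never a black box. All numbers are made up for illustration.

ANNOTATOR_RATE_USD_PER_HOUR = 50.0   # paid directly to the annotator
PLATFORM_FEE_PCT = 0.25              # flat percentage, fixed across projects

def client_cost_per_hour(annotator_rate: float, fee_pct: float) -> float:
    """Total hourly cost to the client under a flat-percentage fee."""
    return annotator_rate * (1 + fee_pct)

total = client_cost_per_hour(ANNOTATOR_RATE_USD_PER_HOUR, PLATFORM_FEE_PCT)
print(f"Annotator earns ${ANNOTATOR_RATE_USD_PER_HOUR:.2f}/hr; "
      f"client pays ${total:.2f}/hr; "
      f"platform margin is ${total - ANNOTATOR_RATE_USD_PER_HOUR:.2f}/hr.")
```

Because the fee is the same flat percentage on every project, there is nothing to renegotiate when a new iteration kicks off.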
It’s our hope that the Open Box approach gives research teams the best of both worlds: the control of building an in-house data team combined with the leverage of outsourcing to data vendors.