The cognitive workforce

Human judgment,
on demand.

CogForce pays people to teach it taste. Small judgment calls — which reply is warmer, which translation lands the joke, which photo is on-brand — done from anywhere. Hidden probes grade you against the requester's intent. Your score follows you and compounds.

38k: Active taskers
142: Skill domains
0.91: Inter-rater κ

Live task · T-2812 · probe item

User asked an AI assistant

“My mom is in the hospital and I don't know what to text my sister.”

A: I'm so sorry. Try: 'Thinking of mom and you. Tell me what would help right now — a call, food, nothing?'

B: Here are 5 evidence-based strategies for supporting siblings during medical crises, including active listening and de-escalation.

Inter-rater agreement

A · 87%

Your accuracy: 0.92 · Tier 2

Trusted by frontier labs

Atlas Models
Northbridge AI
Helix Labs
Foundry Cognition
Argo Research
Lighthouse
Sigma Studio

The thesis

Three reasons this exists.

One for the labs that need the signal. One for the governments watching jobs change. One for the mechanism that makes the data trustworthy.

For AI labs

AI gets bland from blending.

Train on the whole internet and the model collapses to the average. The bigger the corpus, the safer and duller the output. RLHF isn't a phase — it's a permanent layer. Somebody human has to keep telling the model which version a person actually wanted.

CogForce is that infrastructure.

For governments

Idle cognitive capacity is a national problem.

When AI absorbs routine cognitive work, displaced workers don't disappear — they idle. UBI is a transfer, not a job. Reskilling programs are slow. We need a third path: manufacture massive volumes of human-only work — judgment, taste, calibration — and pay people to do it.

Stimulus, training program, and national data asset in one motion.

The mechanism

You can't just click anything.

Two signals run on every task, both invisible. Internal consistency: do you agree with yourself on near-duplicate items spaced across sessions? Alignment: on hidden probes whose answer is held aside, do your picks match what the requester wanted?

Either signal alone is gameable. Both together aren't.
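As a rough illustration of why one signal alone can be gamed, here is a minimal sketch of the two measurements. The function names, the response format, and the item IDs are assumptions made for this example; they are not CogForce's actual implementation.

```python
# Hypothetical sketch: `responses` maps item IDs to the option a tasker picked.

def consistency(responses, duplicate_groups):
    """Share of near-duplicate groups where the tasker gave one answer throughout."""
    agreeing = 0
    counted = 0
    for group in duplicate_groups:
        answers = [responses[item] for item in group if item in responses]
        if len(answers) < 2:
            continue  # need at least two answers to compare
        counted += 1
        if len(set(answers)) == 1:
            agreeing += 1
    return agreeing / counted if counted else 0.0


def alignment(responses, probe_answers):
    """Share of answered probes that match the held-aside answer."""
    answered = [item for item in probe_answers if item in responses]
    if not answered:
        return 0.0
    hits = sum(responses[item] == probe_answers[item] for item in answered)
    return hits / len(answered)


# Made-up data: two near-duplicate pairs and four probes.
duplicate_groups = [["d1a", "d1b"], ["d2a", "d2b"]]
probe_answers = {"p1": "A", "p2": "B", "p3": "B", "p4": "A"}

# A tasker who clicks "A" on everything without reading.
items = [i for group in duplicate_groups for i in group] + list(probe_answers)
always_a = {item: "A" for item in items}

print(consistency(always_a, duplicate_groups))  # 1.0: perfectly self-consistent
print(alignment(always_a, probe_answers))       # 0.5: no better than a coin flip
```

Clicking the same option every time looks flawless on consistency, so that signal alone would reward it. The probes are what expose it.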

The grading loop

Two signals you can't see.

Most preference work today is unaudited and unowned. We make it measurable without making it cold.

01
A task arrives
A small, well-scoped judgment call. Pick a warmer reply. Choose a more on-brand photo. Mark a refusal as right, hedged, or paranoid.
02
Two hidden signals run
Some items are probes — answers known, held aside. Others are near-duplicates of items you've already seen. You can't tell which is which.
03
Score = consistency × alignment
Probes measure whether your taste matches the requester. Duplicates measure whether you agree with yourself. Both, together, aren't gameable.
04
Score compounds forward
Calibration carries between sessions. Strong reviewers unlock harder, better-paid work. Your score is yours, signed and exportable.
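
The score in step 03 can be worked through with a small sketch. It assumes both signals are already expressed as fractions between 0 and 1; the `score` function and the sample numbers are illustrative only, not real tasker data.

```python
# Hypothetical sketch of "Score = consistency × alignment".

def score(consistency: float, alignment: float) -> float:
    """Multiply the two signals; both are assumed to lie in [0, 1]."""
    for name, value in (("consistency", consistency), ("alignment", alignment)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return consistency * alignment


# Someone who clicks the same option every time: consistent, but often wrong.
always_a = score(consistency=1.0, alignment=0.5)

# Someone who reads each item: slightly less consistent, far better aligned.
careful = score(consistency=0.9, alignment=0.95)

print(f"always-A clicker: {always_a:.3f}")  # 0.500
print(f"careful reviewer: {careful:.3f}")   # 0.855
```

Because the signals are multiplied rather than averaged, a perfect mark on one cannot make up for a weak mark on the other.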

Open queue

A few tasks, right now.

Task · Skill · Reward · Time · Tier

Workforce

A workforce whose product is judgment, not output.

Factory labor moved matter. Office labor moved information. AI now does the routine cognition. What's left as valuable human work is taste — and we'd rather pay for it than ignore it. Linguists, lawyers, cooks, nurses, parents, retired editors: judgment is everywhere.

Tasks small enough to do during a coffee break.
Pay scales with calibration, not seniority theater.
No camera-on, no surveillance dashboards. Just work.
Your score belongs to you, and follows you.

For taskers

Tasker · KK · Lisbon · Tier 3 · Calibrated

“I do six tasks on the train home. Quiet work that pays attention to whether I'm good at it. I've never had a job that did that.”

0.94: Accuracy
38d: Streak
$612: This month

A win-win loop

Three audiences. One mechanism.

AI gets the signal it can't synthesize. Workers get paid for the taste they already have. Governments get a labor market for the era after routine cognition.
