Cast your vote — results update live for everyone in the room
Scale adapted from Nate Silver · On the Edge, 2024
"The majority of the disagreement is people who look at the current point and people who look at the current slope."
— Andrej Karpathy, Jan 2026 · Former AI lead at Tesla, co-founder of OpenAI
"AI makes mistakes. It can't do what I need. It's overhyped."
→ Looking at today's limitations
"Capabilities are doubling every few months. What's impossible today may be routine in a year."
→ Looking at the rate of change
An independent lab that gives AI agents real, open-ended tasks — and measures how complex those tasks can be before the AI starts failing.
AI is not uniformly capable. Ethan Mollick's "Centaurs and Cyborgs on the Jagged Frontier" coined the term: AI can solve hard problems yet fail on seemingly trivial ones. The line is jagged and unpredictable.
"I want to wash my car. The car wash is 50 metres from my house — should I walk or drive?" — AI often recommends walking. It misses the point that you need the car at the car wash.
| Concept | What it means | Why it matters for you |
|---|---|---|
| Token | The smallest unit the model processes — roughly ¾ of a word in English | Longer documents cost more and may hit limits |
| Context window | How much text the model can "see" at once — its working memory. Claude: 200k tokens ≈ 600 pages. Gemini: 2M tokens ≈ thousands of pages — an entire document archive. | Large documents may need chunking; context affects quality |
| Hallucination | When the model generates text that sounds correct but is invented. Not a bug — a fundamental property of how these models work. | Always verify facts, citations, and specific numbers |
| Training cut-off | The model only knows what existed in its training data. It cannot distinguish "I don't know" from "this hasn't happened yet." | Recent events are unknown; the model may confabulate |
| Temperature | Controls how creative vs. predictable the output is. Low = deterministic. High = varied. | Some tools let you adjust this; matters for creative vs. precise tasks |
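The ¾-of-a-word rule of thumb from the table can be turned into a quick back-of-envelope check. This is a rough heuristic, not a real tokenizer — the 4-characters-per-token ratio is only an approximation for English text:

```python
def estimate_tokens(text: str) -> int:
    """Rough English-text estimate: ~4 characters per token."""
    return max(1, len(text) // 4)

def fits_in_context(text: str, context_window: int = 200_000) -> bool:
    """Check whether a document plausibly fits a model's context window."""
    return estimate_tokens(text) <= context_window

doc = "word " * 150_000  # ~150k words, ~750k characters
print(estimate_tokens(doc))   # ~187,500 estimated tokens
print(fits_in_context(doc))   # True — just under a 200k-token window
```

For anything close to the limit, use the vendor's own token-counting tool; real tokenizers split text differently per model.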
The underlying AI engine. Examples: GPT-5, Claude Opus 4.6, Gemini 3 Pro.
This is the intelligence — built and maintained by AI labs.
The application you use. Examples: Microsoft Copilot, Claude.ai, ChatGPT, Gemini.
This is the product — with features, pricing, and data policies layered on top.
Why this matters: Microsoft Copilot can run on different models underneath — including Claude. The tool you're paying for and the model doing the thinking may be from different companies.
AI has no background knowledge about you, your company, your role, or your constraints. The more relevant context you provide, the better the result. Tell it who you are, what you're trying to achieve, and what the output will be used for.
Write a good summary of our Q1 performance. Make sure it is easy to read.
[attaches file]
I'm a divisional manager at a heavy equipment manufacturer. Write a one-page Q1 summary for our management board. Good means: data-led, 3 clear takeaways, no jargon, ends with a recommendation. Easy to read means: short paragraphs, headers, suitable for senior leaders who will skim it in under 2 minutes. Tone: direct and confident, not defensive.
[attaches: Q1 report · Q1 last year · last board summary · board template]
Useful framing: Think of AI as a highly capable new colleague who knows nothing about your organization, your industry, or your specific situation. Brief it accordingly.
Source: Anthropic
Show an example of the format you want. Few-shot examples dramatically improve output quality.
Ask it to flag uncertainty: "If you're unsure, say so." Verify facts, citations, and numbers independently.
Avoid leading questions. "Is this a good idea?" invites agreement. "Give me pros and cons" invites analysis.
"Explain for a senior executive without a technical background" vs "Write for a specialist." Radically different outputs.
Instead of writing the perfect prompt: "Ask me questions one by one until you have enough context to write the report."
"What complications might I be missing?" / "Argue against this plan." / "What should I ask that I haven't?"
Dump messy notes, voice transcripts, or bullet points and ask AI to structure them. It's excellent at finding order in chaos.
"I asked X and got Y, but wanted Z. Help me rewrite the prompt." AI is often better at writing prompts than humans.
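The techniques above — briefing context, few-shot examples, flagging uncertainty, flipping the interview — can be combined into a single prompt. A minimal sketch; the helper and its parameters are illustrative, not any particular tool's API:

```python
def build_prompt(role: str, task: str,
                 examples: list[tuple[str, str]],
                 ask_questions: bool = False) -> str:
    """Assemble a context-rich, few-shot prompt (illustrative helper)."""
    parts = [f"Context: {role}", f"Task: {task}"]
    # Few-shot: show the format you want, don't just describe it
    for i, (inp, out) in enumerate(examples, 1):
        parts.append(f"Example {i} input:\n{inp}\nExample {i} output:\n{out}")
    if ask_questions:
        # Flip the interview instead of writing the "perfect" prompt
        parts.append("Before answering, ask me questions one by one "
                     "until you have enough context.")
    parts.append("If you are unsure about any fact, say so explicitly.")
    return "\n\n".join(parts)

prompt = build_prompt(
    role="Divisional manager at a heavy equipment manufacturer",
    task="One-page Q1 summary for the management board",
    examples=[("Q1 raw figures...", "Last year's approved board summary...")],
    ask_questions=True,
)
print(prompt)
```

The point is the structure, not the code: context first, examples second, uncertainty flag last — the same briefing you would give a capable new colleague.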
Hallucination is not a bug that will be fixed — it's a fundamental property of how language models work. They generate plausible-sounding text. Plausible ≠ true.
Review the response I've pasted below. Flag any factual claims that could be wrong or that I should verify. List them as: Claim → Why to verify → How to verify.
The rule: Trust AI for reasoning, structure, and synthesis. Verify AI for facts, figures, and citations.
Coming up: AI tools, data security, agents, and change leadership
Click everything that applies — results are live.
Integrated into Microsoft 365 — Word, Excel, Teams, Outlook. The enterprise tool for organizations already on Microsoft. Strong for Office workflows, meeting summaries, email drafting.
The most widely known tool. Strong general capability, image generation, code. ChatGPT Enterprise meets corporate data requirements. Consumer tiers do not.
Known for nuanced writing, long document analysis, and careful reasoning. The Claude Team plan has strong data protections. No training on your data.
Largest context window (2M tokens). Deep integration with Google Workspace. Gemini for Workspace enterprise plan for data protection.
AI search engine with cited sources — reduces hallucination risk for current information. Good for research tasks where source verification matters.
Industry-specific tools: Harvey (legal), Glean (enterprise search), Notion AI (knowledge management), GitHub Copilot (code). Growing rapidly.
Practical starting point: Use it first in Teams meetings (meeting recap) and Outlook (draft reply). These are the fastest wins with the lowest risk of hallucination — it has access to the actual context.
| Tier | Training on your data? | Data handling | Example plans |
|---|---|---|---|
| Free / Consumer | Yes by default | Conversations may be reviewed by humans; used for training | Claude Free, ChatGPT Free, Gemini Free |
| Individual paid | Yes by default — can opt out | Still consumer-grade terms; opt-out is per-user | Claude Pro/Max, ChatGPT Plus, Copilot Pro |
| Business / Team | No | Commercial terms; data not used for training; admin controls | Claude Team, ChatGPT Team, Copilot M365 |
| Enterprise / API | No | DPA available; SSO; audit logs; data residency options | Claude Enterprise, ChatGPT Enterprise |
Critical: Paying for an individual plan (e.g. Claude Pro, ChatGPT Plus) does not automatically protect your data. Individual plans are consumer-grade. Business/Team tier is the minimum for work with sensitive or confidential information.
The question that stalls most AI projects:
"Can we use data X for use case Y in tool Z?"
Who owns the answer? Who is allowed to say yes? Without a clear data governance structure, every AI initiative stalls at the same gate.
The good news: there is no special "AI readiness."
AI-ready data is simply well-governed, well-structured, digitized data — the same destination as the data quality and digitization journey your organization already started. AI makes that journey more urgent, not different.
Practical question for your IT/Legal: "Do we have a DPA with our AI vendor? Where is data processed? What's the retention period?"
Today's examples — early-stage, powerful, but still requiring human oversight at every significant step:
What builders are deploying right now — not research, not demos. Real systems running on real tasks.
A personal AI infrastructure project running on a home server — built to explore what agents can do when given real tools and real tasks.
Send a Telegram message: "Research X, implement it, test it, then have another agent review the result."
The system researches → writes code → runs tests → spins up a separate model to review the output — all without further human input. The goal: eventually close the loop entirely.
The principle: Agents should automate the mundane and reversible before the important and irreversible. Start with read-only tasks. Add write access incrementally, with logging and approval gates.
Workers consistently report significant personal productivity gains when using AI. Tasks completed faster, quality improved, new capabilities unlocked.
Organizations report only modest gains in aggregate productivity. The individual benefits aren't translating upward.
Why the gap? — Ethan Mollick
Despite official adoption rates of around 20%, surveys consistently show that over 40% of workers are already using AI tools — without telling their managers. They're called "secret cyborgs."
Ethan Mollick's framework for organizational AI adoption — based on what actually works.
AI is only as good as the data it has access to. Organizations with poor data quality, fragmented systems, or unclear data ownership will struggle to capture AI value.
AI strategy is not the same as "buy some licenses." It requires answering:
The uncomfortable truth: Most organizations are not AI-ready — not because of the technology, but because of data quality, change management, and unclear ownership. Address these first.
"The challenge is not building AI. The challenge is building organizations that can use AI." — common theme across every serious AI adoption study.
Look at the slope, not just the point. AI will be more capable in 6 months than it is today. Decisions and strategies need to account for the rate of change, not just current limitations.
Context is everything in prompting. The single biggest improvement you can make is giving more relevant background information. Brief AI like a capable new colleague.
Use business-tier tools. Consumer plans do not offer adequate data protection for work involving confidential or sensitive information. Know your vendor's data policy.
The jagged frontier means you need to experiment. There's no substitute for trying. What AI can't do today, it may do in months. What it struggles with is non-obvious until you test it.
Your team is already using AI. If you haven't set policy and created a safe space for experimentation, you have secret cyborgs. Get ahead of it.
Data foundations determine your ceiling. AI is only as good as the data it can access. Poor quality limits what's possible — and unclear governance ("can we use this data in this tool for this purpose?") will block adoption faster than any technical limitation.
Technology is not the hard part. Change management and clear ownership determine whether organizations actually benefit. Address those first.