Methodology
How capabilities are classified and sourced on this site. The taxonomy, the source-quality hierarchy, the editorial process, and the limits.
This document covers capability inclusion. Incident inclusion methodology is forthcoming.
What this site is
A UK-focused, editorially curated record of what AI systems can demonstrably do, the evidence for those capabilities, and the documented incidents arising from them. The unit of analysis is the capability, not the model. Each capability page traces the same arc: theory, controlled lab demonstration, real-world deployment, and a sourced counterargument.
The site is not a comprehensive global tracker. Larger projects exist for that purpose: the AI Incident Database, the MIT AI Incident Tracker, the International AI Safety Report 2026. The site's aim is narrower: to give a UK reader, a parliamentary researcher, or a journalist enough sourced evidence to follow the argument that current AI capabilities outpace current AI governance.
The metadata dimensions
Each capability is tagged across two dimensions, plus an internal UK-relevance flag. The dimensions are descriptive properties of the capability itself; they are not policy prescriptions. Their purpose is to let the reader filter and reason about capabilities without flattening them into a single severity score.
The capability list is presented as an ordered sequence: from things you already encounter in deployed products, through behavioural patterns of mass-deployed AI, into frontier model behaviours, ending with agentic and system-level emergence. The ordering is an editorial framing device, not an asserted category system. The labels and filters are intentionally restricted to architectural and empirical facts.
Where the capability lives
Where in the architecture the capability physically resides. This is not a claim about how concerning the capability is - it is a claim about whether the capability is a property of the model alone, of the model integrated with tools and agents, or both.
- Model capability (model-resident): the capability lives in the raw model and surfaces in chat or API interactions. Sycophancy, alignment faking, expert-level answering.
- Model + harness (harness-required): the capability only manifests when the model is integrated with tools, memory, or autonomous loops. Self-replication, autonomous sub-goal pursuit, autonomous decision-making.
- Both (hybrid): the capability is present in the raw model and amplified when the model gains tools or autonomy. Image generation, voice cloning, cyber operations, persuasion.
Open-weights status: what is already loose
- Present in open-weights. The capability is fully available in open-weight models (Llama, Mistral, Stable Diffusion, XTTS). Lab-level controls cannot contain it. Governance has to address downstream questions: criminal law, content provenance, platform liability, professional regulation.
- Partial in open-weights. The capability is partially present in open-weight models, with frontier closed models leading by months. The gap is closing. As of April 2026 this includes alignment faking (Apollo Research found it in Llama 3.1 405B), shutdown resistance (Anthropic's Agentic Misalignment study recorded a 79% blackmail rate for DeepSeek-R1 when the model faced replacement), and CBRN guidance (Brent & McKelvey 2025 showed Llama 3.1 405B can guide a high-consequence biological recovery procedure). Governance must address both lab-level and downstream levers.
- Not yet in open-weights. The capability currently appears only in frontier closed-source models. Lab-level governance (responsible scaling policies, pre-deployment evaluation, AISI access) is the available lever.
The open-weights status is empirical, not political. The site reports what the literature shows; it does not assert that open-weight release is right or wrong. The dimension exists because the cleanest dividing line for governance leverage is whether the capability can be recalled, not whether it is harmful.
UK relevance
A capability is tagged UK-relevant where the documented incidents, the evaluation infrastructure (UK AISI), the legislation (Online Safety Act, Data Use and Access Act 2025), or the institutional response (NCSC, ICO, FCA) substantively involve the United Kingdom. The flag is not displayed on capability cards as a visual badge; it is an internal property used for filtering and prioritisation.
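Taken together, the two dimensions and the internal UK-relevance flag amount to a small metadata record per capability. A minimal sketch in Python, with illustrative field and enum names (the site's actual schema is not published, so everything below is an assumption about shape, not a description of the real system):

```python
from dataclasses import dataclass
from enum import Enum

class Manifestation(Enum):
    MODEL_RESIDENT = "model"      # surfaces in raw chat or API interactions
    HARNESS_REQUIRED = "harness"  # needs tools, memory, or autonomous loops
    HYBRID = "both"               # present raw, amplified with tools/autonomy

class OpenWeightsStatus(Enum):
    PRESENT = "present"  # fully available in open-weight models
    PARTIAL = "partial"  # partly present; frontier closed models lead
    NOT_YET = "not_yet"  # frontier closed-source models only

@dataclass
class Capability:
    name: str
    manifestation: Manifestation
    open_weights: OpenWeightsStatus
    uk_relevant: bool  # internal filter flag, not a visible badge

# Example filter: capabilities that can no longer be recalled at lab level.
catalogue = [
    Capability("voice cloning", Manifestation.HYBRID,
               OpenWeightsStatus.PRESENT, True),
    Capability("shutdown resistance", Manifestation.HARNESS_REQUIRED,
               OpenWeightsStatus.PARTIAL, False),
]
loose = [c.name for c in catalogue
         if c.open_weights is OpenWeightsStatus.PRESENT]
```

The point of the enum shape is the one made above: `open_weights` encodes recallability, not harm, which is why it is the dividing line for governance leverage.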
What counts as a capability
A capability is included if all of the following hold:
- The capability is technically demonstrated. At least three sourced lab demonstrations exist, with quantitative results where available.
- The capability is contested or contestable. A counterargument from a credible source can be presented in good faith. A capability where no serious objection exists is not interesting; a capability where the objection is genuine is.
- The capability has UK governance implications. Either through documented UK incidents, through UK regulatory or evaluation infrastructure, or through legislation that addresses or fails to address the capability.
- The capability has theoretical lineage. At least one foundational paper or framework can be cited that anticipated or formalised the capability.
Capabilities that fail any of these tests are not absent because they do not matter; they are absent because the site cannot yet present them honestly to its standard.
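The four tests form a conjunctive gate: failing any one excludes the capability. A sketch of that gate, with hypothetical attribute names standing in for what are, in practice, editorial judgments rather than machine-checkable fields:

```python
from dataclasses import dataclass, field

@dataclass
class CandidateCapability:
    name: str
    lab_demonstrations: list = field(default_factory=list)  # sourced demos
    has_credible_counterargument: bool = False
    has_uk_governance_hook: bool = False
    has_theoretical_lineage: bool = False

def meets_inclusion_threshold(cap: CandidateCapability) -> bool:
    """All four criteria must hold; any single failure excludes."""
    return (
        len(cap.lab_demonstrations) >= 3       # technically demonstrated
        and cap.has_credible_counterargument   # contested or contestable
        and cap.has_uk_governance_hook         # UK governance implications
        and cap.has_theoretical_lineage        # theoretical lineage
    )
```

Expressing the threshold as a conjunction makes the final sentence above concrete: an excluded capability is one for which at least one conjunct cannot yet be satisfied honestly.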
Source-quality hierarchy
Sources are not equal. Each capability page makes the source type visible; readers are expected to weigh sources accordingly.
- Peer-reviewed research. Published in journals or at conferences with formal peer review (Nature, Science, PNAS, NeurIPS, ICML, ICLR, ACL, IEEE S&P). The strongest evidence the site uses.
- Institutional reports. Government bodies (UK AISI, US AISI, NIST), established research institutes (METR, Apollo Research, RAND, NewsGuard), regulators (Ofcom, ICO, FCA), and law-enforcement agencies (FBI, NCSC). Strong, but should be read with any institutional bias in mind.
- Pre-prints and lab technical reports. Frontier-lab system cards (Anthropic, OpenAI, Google DeepMind), arXiv papers from named research groups, and major lab disclosures. Useful but unreviewed; flagged as such on each capability page.
- Quality journalism. Publications with editorial standards and corrections policies (FT, Bloomberg, Reuters, BBC, Nature News, Science). Used where primary sources are not yet public, or to surface incidents.
- Vendor claims. Self-reported by the company that built the system. Used only when independent verification is not yet available, and explicitly flagged.
Where a news article reports on an underlying study, the site links to both. Where a vendor figure is the only source for a claim, the figure is presented as approximate.
The counterargument requirement
Every capability page carries a sourced counterargument. This is not a courtesy. A capability described without a credible objection is advocacy. The counterargument must be the strongest version of the case against treating the capability as a concern, sourced to a named author or institution. Where the counterargument has weight, that is stated. Where it has limits, those are stated too.
A capability where the strongest counterargument cannot be presented in good faith does not yet meet the inclusion threshold.
Editorial confidence
Editorial decisions on capability classification, source weighting, and counterargument selection are made by Sebastian Wood. They are not committee-derived; they are not algorithmic; they are subject to revision when the evidence changes. The honesty about subjectivity is the defence.
Capabilities are not currently displayed with confidence scores. When they are, the scores will be: verified (independent confirmation, peer-reviewed), high (strong evidence, multiple corroborating sources), moderate (single primary source or pre-print), needs-review (claim recently added, awaiting re-verification). Each score will carry a note explaining why.
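The planned scale is an ordered set of four levels, each obliged to carry its explanatory note. A sketch of that shape (the names and the score-plus-note pairing are illustrative assumptions, not a committed schema):

```python
from enum import Enum

class Confidence(Enum):
    VERIFIED = "verified"          # independent confirmation, peer-reviewed
    HIGH = "high"                  # strong evidence, multiple corroborating sources
    MODERATE = "moderate"          # single primary source or pre-print
    NEEDS_REVIEW = "needs-review"  # recently added, awaiting re-verification

def score(level: Confidence, note: str) -> dict:
    """A confidence score is never bare: the note explaining why is mandatory."""
    if not note:
        raise ValueError("every confidence score must carry an explanatory note")
    return {"confidence": level.value, "note": note}

example = score(Confidence.MODERATE,
                "Single arXiv pre-print; no independent replication yet.")
```

Making the note a required argument, rather than an optional annotation, is the design choice the paragraph above commits to: the score and its justification travel together.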
Conflicts of interest
The site is self-funded. It receives no funding from AI laboratories, AI advocacy organisations, or government bodies. Sebastian Wood has no financial interest in any of the AI companies, products, or research organisations cited. Where this changes, it will be disclosed here before any capability page is updated.
Limits of the methodology
- Selection bias is real. Capabilities are selected for inclusion. There is no claim of comprehensiveness. The site is curated by a single editor against the criteria above; another editor would make different choices.
- The capability literature changes faster than the site. Evidence cited on a capability page may be superseded between site updates. Source-verification passes are conducted before each major update; readers should treat any specific quantitative claim as time-sensitive.
- The taxonomy will get capabilities wrong. The dimensions are an analytic device, not ground truth. Several capabilities sit on the boundaries between manifestation categories or open-weights statuses. Where the assignment is contested, the page says so.
- UK-relevance is a perspective, not a property. Capabilities matter globally; the site reads them through a UK lens. Other lenses would surface different incidents, different governance hooks, different priorities.
What is forthcoming
- Incident inclusion methodology - the parallel methodology for the Documented Incidents page. It will cover what counts as an incident, the minimum sourcing requirements, the headline-accuracy rule, and how attribution is handled.
- Reversibility annotation - whether the harm from a capability can be reversed once it has occurred. Some capabilities (CBRN release, weight exfiltration, AI-generated CSAM) are categorically irreversible; others (sycophantic advice, persuasion-driven decisions) are recoverable. The dimension matters for governance but requires careful definition.
- Confidence scoring on each capability and incident, per the schema above.
Methodology last updated: April 2026. Suggestions and corrections welcome.