Writing for an AI audience
The primary audience for our work might be an LLM. What does this mean for the way we publish our outputs? I read this tweet by Karpathy, and it reminded me of conversations with Rijesh and others as we were thinking about our new website design.
The primary audience of your thing (product, service, library, …) is now an LLM, not a human. LLMs don’t like to navigate, they like to scrape. LLMs don’t like to see, they like to read. LLMs don’t like to click, they like to curl. Etc etc.
I decided to ask this audience how we can optimise the various outputs we produce so they can be more effectively consumed by AI. Here are the key action items that were mentioned. I believe we are already considering some of their suggestions, but there are also useful points to reflect on.
Publish in Machine-Readable Formats: Move beyond PDFs by providing research outputs in structured formats like HTML, XML, or JSON, alongside human-readable versions.
Embed Rich Metadata: Use standardised metadata (e.g., Schema.org, JSON-LD) to describe documents, authors, topics, and key findings for better discoverability.
Provide High-Quality Transcripts for Multimedia: Offer accurate, structured transcripts for podcasts and videos, including speaker tags and topic timestamps.
Enable API and Bulk Data Access: Develop APIs and bulk download options so LLMs (and other tools) can access, query, and retrieve content programmatically.
Chunk and Structure Content: Break long documents into logical, well-labelled sections or “chunks” to facilitate targeted retrieval and summarisation by LLMs.
Standardise Document Structure: Use consistent headings, sections, and formats across all outputs to make parsing and extraction easier for machines.
Retain Human Accessibility: Ensure all machine-readable outputs remain clear and accessible to human readers, maintaining narrative flow and context.
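To make the metadata and chunking suggestions a little more concrete, here is a minimal sketch of what they could look like in practice. It is only an illustration under assumptions of my own: the function names `build_jsonld` and `chunk_by_heading` are hypothetical, the choice of the Schema.org `Report` type and its fields is one plausible option rather than a recommendation from the conversation, and the `## ` heading convention and all sample values are placeholders.

```python
import json
import re


def build_jsonld(title, authors, keywords, date_published):
    """Describe a publication as a Schema.org 'Report' in JSON-LD form.

    The field selection here is an assumption made for illustration.
    """
    return {
        "@context": "https://schema.org",
        "@type": "Report",
        "headline": title,
        "author": [{"@type": "Person", "name": name} for name in authors],
        "keywords": keywords,
        "datePublished": date_published,
    }


def chunk_by_heading(text):
    """Split text into labelled chunks, one per '## ' heading.

    Text before the first heading is labelled 'Introduction'.
    Headings with no body text underneath them are skipped.
    """
    chunks = []
    current = {"heading": "Introduction", "body": []}
    for line in text.splitlines():
        match = re.match(r"^##\s+(.*)", line)
        if match:
            # A new heading closes the chunk collected so far.
            if current["body"]:
                chunks.append(current)
            current = {"heading": match.group(1).strip(), "body": []}
        elif line.strip():
            current["body"].append(line.strip())
    if current["body"]:
        chunks.append(current)
    return [
        {"id": i, "heading": c["heading"], "text": " ".join(c["body"])}
        for i, c in enumerate(chunks)
    ]


# Placeholder content, not a real publication.
sample = """Opening summary of the report.
## Methods
How the work was carried out.
## Findings
What the work found.
"""

record = {
    "metadata": build_jsonld(
        title="Example report title",
        authors=["A. Author"],
        keywords=["example", "placeholder"],
        date_published="2025-01-01",
    ),
    "chunks": chunk_by_heading(sample),
}

print(json.dumps(record, indent=2))
```

The printed JSON keeps the human-readable text intact inside each chunk, so the same source can feed both a web page and a machine-readable download.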
Disclosure: the recommendations are a summary of a conversation with an AI.