The reference list for the llms.txt ecosystem.
Every entry validated. Categorised by domain. Tracked over time.
Updated as new sites publish their files.
Every llms.txt file we've found in production across SaaS, dev tools, content platforms, and infra providers.
CI runs on every PR. Each entry must have a reachable URL and a parseable file. Broken links don't merge.
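As a rough illustration of the two checks named above (reachable URL, parseable file), here is a minimal sketch. It is not the validator this list actually runs. It assumes the commonly described llms.txt shape: a required H1 title, an optional blockquote summary, and H2 sections containing markdown link lists. The names `parse_llms_txt`, `fetch_url`, and `validate_entry` are hypothetical.

```python
import re
import urllib.request
from typing import Callable

# Matches "- [name](url)" with an optional ": notes" suffix.
LINK_RE = re.compile(
    r"^-\s*\[(?P<name>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<notes>.*))?$"
)


def parse_llms_txt(text: str) -> dict:
    """Parse an H1 title, an optional blockquote summary, and H2 link sections.

    Raises ValueError when the required H1 title is missing.
    """
    title, summary, sections, current = None, None, {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("# ") and title is None:
            title = line[2:].strip()
        elif line.startswith("> ") and summary is None and current is None:
            summary = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                sections[current].append(match.groupdict())
    if not title:
        raise ValueError("missing H1 title")
    return {"title": title, "summary": summary, "sections": sections}


def fetch_url(url: str, timeout: float = 10.0) -> str:
    """Default fetcher: returns the body, raises on network or HTTP errors."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def validate_entry(url: str, fetch: Callable[[str], str] = fetch_url) -> tuple[bool, str]:
    """Return (ok, reason). An entry passes only if it is reachable and parses."""
    try:
        body = fetch(url)
    except Exception as exc:  # unreachable host, timeout, or HTTP error
        return False, f"unreachable: {exc}"
    try:
        parse_llms_txt(body)
    except ValueError as exc:
        return False, f"unparseable: {exc}"
    return True, "ok"
```

Passing the fetcher as a parameter is a design choice for this sketch: a CI job can use the real network call, while tests can substitute a function that returns a fixed string.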
Entries are grouped by domain (dev tools, AI infra, docs platforms, content sites). Use the grouping to see what your industry looks like.
llms.txt is a real signal — if you publish one,
you're telling agents how to consume your site. We wanted to know
who actually does it. So we made the list. It's free.
54 of 109 tracked tools publish a working llms.txt, across 20 categories. The rest are tracked so you can see the gaps.
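A coverage figure like the one above can be tallied per category from a flat list of entries. The sketch below assumes each entry is a `(category, has_working_llms_txt)` pair; that record shape and the function names are hypothetical, not the list's actual data model.

```python
from collections import defaultdict


def coverage_by_category(entries):
    """Return {category: (published, tracked)} from (category, published) pairs."""
    totals = defaultdict(lambda: [0, 0])  # category -> [published, tracked]
    for category, published in entries:
        totals[category][1] += 1
        if published:
            totals[category][0] += 1
    return {category: tuple(counts) for category, counts in totals.items()}


def overall_coverage(entries):
    """Return (published, tracked, share) across all categories."""
    per_category = coverage_by_category(entries)
    published = sum(p for p, _ in per_category.values())
    tracked = sum(t for _, t in per_category.values())
    share = published / tracked if tracked else 0.0
    return published, tracked, share
```

With the figures quoted above, `54 / 109` comes to roughly 49.5% of tracked tools publishing a working file.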
Desktop application for discovering, downloading, and running local LLMs with a polished UI and OpenAI-compatible server.
Run large language models locally via a single-binary server with a built-in model library.
Fast inference library for quantized LLMs optimized for consumer NVIDIA GPUs.
Foundational Python library for loading and running thousands of transformer models in PyTorch, TensorFlow, and JAX.
Open-source desktop ChatGPT alternative that runs local LLMs — privacy-first, no cloud, no account.
Single-binary llama.cpp wrapper with KoboldAI-style UI for chat, story-writing, and RP.
Reference C++ implementation for running LLaMA-family and other transformer models with GGUF quantization.
Self-hosted, OpenAI-compatible inference server for text, image, audio, and embedding models — runs anywhere.
Universal LLM deployment via compiled kernels — runs on iOS, Android, WebGPU, Vulkan, CUDA.
Self-hosted, feature-rich chat interface for local and cloud LLMs — the "ChatGPT clone" of the open-source world.
Fast LLM and VLM serving runtime with RadixAttention cache and structured output support.
Gradio-based web UI for local LLMs supporting GGUF, GPTQ, AWQ, ExLlamaV2.
High-throughput, memory-efficient LLM inference engine with PagedAttention and continuous batching.
Unified OpenAI-compatible proxy and SDK that routes calls across 100+ LLM providers with load balancing, fallbacks, and cost tracking.
High-performance multi-agent framework with memory, reasoning, and 20+ model integrations.
Python framework for orchestrating role-based multi-agent systems with sequential and hierarchical workflows.
Framework for programming rather than prompting LLMs — composable modules with optimizers.
Widely adopted framework for building LLM applications with chains, agents, retrievers, and memory.
Graph-based library for building stateful multi-agent workflows with explicit control flow and durability.
Agent framework built on Pydantic with type-safe tool use and structured responses.
TypeScript toolkit for building AI apps with unified APIs across providers and framework helpers.
Microsoft's framework for building multi-agent conversations with customizable agents and conversation patterns.
Type-safe Python library for building LLM-powered functions with structured outputs.
OpenAI's lightweight educational framework for multi-agent orchestration.
Open-source framework for running browser-automation agents with persistent profiles and human-in-the-loop review.
Official Anthropic client libraries for the Claude API in Python, TypeScript, Java, Go, and Ruby.
Anthropic's official SDK for building custom agents on top of Claude with tool use, subagents, and hooks.
Anthropic's terminal-first agentic coding assistant with deep tool use and codebase awareness.
Open-source AI coding assistant for VS Code and JetBrains — bring any model, any provider, customizable.
AI-first fork of VS Code with deep LLM integration, agent mode, and codebase-aware context.
GitHub's native AI coding assistant with chat, autocomplete, and agent mode across major IDEs.
AI-native IDE from Codeium with Cascade agent mode, deep indexing, and real-time code awareness.
AI pair programming in your terminal — edits code across your git repo with commit-per-change discipline.
AWS's AI coding assistant with deep integration into AWS services and enterprise compliance.
AI coding assistant with enterprise-grade code search context across massive codebases.
Node-based interface for building image, video, and audio generation workflows with any diffusion or multimodal model.
Open-source LLM app development platform with visual prompt IDE, RAG pipelines, and agent builder in one product.
Visual framework for building multi-agent and RAG applications with a node-based editor.
Fair-code workflow automation with native AI nodes, 500+ integrations, and first-class self-hosting.
Drag-and-drop UI for building LLM workflows and agents — open-source, self-hostable.
Offline voice-to-text desktop app with 36-language support and LoRA voice training.
Deep-learning toolkit for TTS with multi-speaker models and voice cloning.
Lightweight open-weight TTS model — surprisingly natural output at small model size.
Versatile instant voice cloning with cross-lingual synthesis and granular style control.
Fast, local neural text-to-speech with dozens of voices — optimized for Raspberry Pi.
C++ port of OpenAI Whisper for local speech-to-text — no Python, runs on CPU and many GPU backends.
Most widely used web UI for Stable Diffusion, with an extensive extension ecosystem.
Simplified Stable Diffusion UI focused on ease of use, offering a Midjourney-like experience locally.
Professional-grade Stable Diffusion with unified canvas, workflows, and team features.
Krita plugin for Stable Diffusion — inpaint, img2img, and generative layers inside Krita.
Advanced fork of SD WebUI with broader model support (Flux, Lumina, Kolors, more).
Open-source embedding database designed for LLM applications — runs embedded, as a server, or in the cloud.
Open-source cloud-native vector database built for billion-scale similarity search with separation of storage and compute.
Fully managed serverless vector database — the original SaaS option for production-scale vector search.
Open-source, Rust-written vector database built for production scale — rich filtering, hybrid search, and multi-tenancy.
Multi-model database written in Rust combining document, graph, key-value, time-series, and vector in one engine.
Open-source vector database with built-in ML modules, hybrid search, and first-class RAG tooling.
Serverless vector DB on the Lance columnar format — embedded or cloud, multimodal-ready.
Postgres extension adding vector similarity search — the "just use Postgres" option for RAG.
Production-oriented Python framework for building RAG, search, and agent pipelines with composable components.
Leading RAG framework for connecting LLMs to private data — document loaders, indexes, retrievers, and agents.
Persistent memory layer for AI agents — remembers user facts, preferences, and context across sessions.
All-in-one desktop and Docker RAG app — document ingestion, agents, multi-user.
Opinionated RAG framework: plug in your LLM, vector store, and files and get a chatbot.
Weaviate's open-source RAG chatbot — Golden RAGtriever reference implementation.
Leading open embedding models from BAAI — top of MTEB for multiple languages.
Lightweight, CPU-friendly embedding library from Qdrant — no torch dependency.
Python framework for state-of-the-art sentence, text, and image embeddings.
Open-source ML and LLM observability platform with OpenTelemetry-based tracing.
Open-source observability for LLM apps — traces, prompts, evaluations, usage analytics.
Open-source LLM engineering platform for tracing, evaluation, prompt management, and observability — self-host or cloud.
Commercial observability, debugging, and evaluation platform for LLM and agent applications.
Leading ML experiment tracking platform with dedicated LLM observability (Weave).
Open-source tool for testing, evaluating, and red-teaming LLM apps via config files.
Framework for evaluating RAG pipelines with metrics like faithfulness, answer relevance.
Hugging Face's library for post-training LLMs with supervised fine-tuning and preference or reinforcement-learning methods (SFT, DPO, PPO, KTO).
2x faster LLM fine-tuning with 70% less memory, as a drop-in replacement for Hugging Face's training stack.
Leading open-source toolkit for training LoRAs and fine-tunes on diffusion models — FLUX, SDXL, SD3, Qwen Image, and more.
YAML-configured fine-tuning framework supporting LoRA, QLoRA, full FT, DPO, and most modern LLM architectures.
WebUI-based fine-tuning framework supporting 100+ models with LoRA, QLoRA, DPO, and more.
ModelScope's fine-tuning framework supporting 350+ LLMs and 100+ multimodal models.
API access to Perplexity's search-augmented LLMs — Sonar models with live citations.
Scraping API for Google, Bing, DuckDuckGo and 15+ search engines — structured JSON results.
Independent web search API with no tracking — alternative to Google / Bing for agent use.
Neural search API built for AI agents — semantic search across the web with content retrieval.
IBM's document parsing toolkit — PDF, DOCX, images into structured JSON/markdown for RAG.
Fast, accurate PDF-to-markdown conversion — tables, equations, and structure preserved.
Library for ingesting PDF, HTML, DOCX, XLSX, and 25+ formats into RAG-ready chunks.
Production inference platform for open-source models with industry-leading speed for DeepSeek, Llama, Qwen.
Ultra-low-latency LLM inference on custom LPU silicon, with sub-second complete responses and an OpenAI-compatible API.
Run thousands of open-source ML models via simple API calls — image, video, audio, text — with per-second billing.
GPU cloud platform with on-demand instances, serverless endpoints, and a community GPU marketplace — priced for AI workloads.
Serverless inference for 200+ open-source models with OpenAI-compatible API — low latency, competitive pricing.
xAI's Grok API. Grok 4.1 Fast Reasoning and Non-reasoning are currently the best raw-intelligence-per-dollar offerings on the market.
Serverless cloud platform for Python with first-class GPU support — deploy LLMs, training jobs, and batch pipelines from code.
Anthropic's native desktop app for Claude — MCP server support, skills, agent mode, and deep OS integration.
Natural language interface to your computer — runs code locally to complete tasks from the CLI.
Command-line productivity tool powered by LLMs — generate shell commands, code, and configs.
Arch-based Linux distribution with performance-tuned kernels and first-class NVIDIA support; a popular choice for local AI / ML workloads.
Upstream of RHEL and the distro that drives most Linux desktop feature adoption (Wayland, PipeWire, systemd, ostree) — strong AI / ML packaging on top.
The default Linux baseline for ML tutorials and cloud VMs — widely documented, increasingly controversial due to Snap-store enforcement and Canonical's direction.
Open a PR with the URL — the validator runs, and if it parses, it's in. Not on GitHub? Email us the tool you think we missed. We want every receipt.