Using AI Without Leaving the Terminal: A Guide to llm

I first discovered the llm tool watching Simon Willison’s talk “Catching up on the weird world of LLMs” at North Bay Python 2023. Since then, it’s become an essential part of my development workflow. This guide covers everything you need to know to get started and use it effectively.
What is llm?
llm is a command-line tool that provides a unified interface to various language models. Instead of switching between different web interfaces, you can chat with GPT-4, Claude, Gemini, or local models directly from your terminal.
Key features:
- Universal interface: One command for all models
- Automatic logging: Every conversation saved to SQLite
- Plugin ecosystem: Extend functionality with 70+ plugins
- Pipe friendly: Works naturally with Unix pipes and command chains
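Because every conversation is logged to a single SQLite database, you can inspect your history with any SQLite tooling. To find out where that database lives on your machine (the exact path varies by platform):
llm logs path # prints the location of logs.db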
Installation and Setup
There are several ways to install llm. Choose the method that works best for your setup:
# Recommended: isolated environment with uv
uv tool install llm # Python ≥ 3.9 (installs into its own isolated environment)
# Quick one-off try-out (temporary env)
OPENAI_API_KEY=sk-... uvx llm "fun facts about skunks"
# Traditional options
pipx install llm # or
brew install llm
Tip: uvx spins up a throw-away virtualenv each run; switch to uv tool install when you’re ready for a permanent install.
Once installed, you’ll need to configure at least one AI provider. OpenAI is the most straightforward to start with:
llm keys set openai # paste your key
llm "Ten names for a pet pelican"
Add more providers at any time:
llm install llm-anthropic
llm keys set anthropic
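To confirm what’s installed and which models you can now call:
llm plugins # list installed plugins
llm models # list all available models, including those added by plugins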
Basic Usage Patterns
Here are the fundamental ways to interact with llm:
Prompting & system messages:
llm -m gpt-4o "Explain quantum computing in one tweet"
llm -s "You are an SRE" -f server.log "Find errors in this log"
Place flags before the prompt so shell tab completion stays happy.
Working with files: Instead of copy-pasting content, you can directly reference files or pipe content:
llm -f myscript.py "Summarise this code"
cat diff.patch | llm -s "Generate a conventional commit message"
Interactive chat: For longer conversations, use chat mode to maintain context across multiple exchanges:
llm chat # new conversation
llm chat -c # continue last one
llm chat -m claude-4-opus
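A few commands work inside the chat prompt itself (typed as messages, not shell commands):
!multi # start a multi-line message, finish it with !end
exit # or quit, to end the session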
Managing Conversations
All conversations are automatically saved to SQLite. Here’s how to search and manage your conversation history:
llm logs -n 10 # tail
llm logs -q "vector search" # search strings
llm -c "follow up question" # continue context
llm logs -u # include token & cost usage
llm logs --json > backup.json # export
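If you also have Datasette installed, you can browse the full history in your browser:
datasette "$(llm logs path)" # explore logs.db interactively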
Essential Plugins
The plugin ecosystem extends llm to work with different AI providers and add specialized functionality:
Category | Examples
---|---
Local models | llm-mlx, llm-gguf, llm-ollama
Remote APIs | llm-anthropic, llm-gemini, llm-mistral, llm-openrouter
Tools | llm-tools-quickjs, llm-tools-sqlite
Fragments | llm-fragments-github, llm-fragments-pdf
Embeddings | llm-sentence-transformers, llm-clip
Extras | llm-cmd, llm-jq, llm-markov
Install with llm install <plugin>; no restart is needed.
# Major AI providers
llm install llm-anthropic # Claude models
llm install llm-gemini # Google Gemini
llm install llm-ollama # Local models via Ollama
# Add API keys
llm keys set anthropic
llm keys set gemini
# Now use different models
llm -m claude-4-opus "Write a technical explanation"
llm -m gemini-2.0-flash "Quick calculation"
Running Local Models on Apple Silicon
For privacy or offline work, you can run models locally on Apple Silicon Macs using the llm-mlx plugin:
uv tool install llm --python 3.12
llm install llm-mlx # macOS 14.4+ only
# Download a 3B model (≈1.8 GB)
llm mlx download-model mlx-community/Llama-3.2-3B-Instruct-4bit
Model recommendations by RAM:
RAM | Model (4-bit)
---|---
8 GB | Llama 3.2 3B
16 GB | Mistral 7B
32 GB | Mistral Small 24B
Assign an alias for convenience:
llm aliases set local3b mlx-community/Llama-3.2-3B-Instruct-4bit
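The alias then works anywhere a model name does, for example:
llm -m local3b "Write a haiku about SQLite"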
Fragments for Long Context
Fragments let you feed huge inputs without copy-pasting:
# Install the GitHub fragments plugin first
llm install llm-fragments-github
# Summarise an entire GitHub repo
llm -f github:simonw/files-to-prompt "Key design decisions?"
# Use the symbex loader to extract a single Python symbol
llm install llm-fragments-symbex
symbex my_module:some_func | llm -s "Write pytest tests"
Tools & Agents (since 0.26)
Let models call functions you provide as tools:
llm --functions 'def sq(x:int)->int: return x*x' \
"What is 431²?" --td # shows the tool call
# Use a plugin tool
llm install llm-tools-quickjs
llm --tool quickjs "JSON.parse('[1,2,3]').reduce((a,b)=>a+b,0)"
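For anything longer than a one-liner, --functions also accepts a path to a Python file containing your function definitions (functions.py below is a hypothetical file name):
llm --functions functions.py "What is 431 squared?" --td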
Templates and Aliases
Create shortcuts and reusable prompts to speed up common tasks:
llm aliases set fast gpt-4o-mini
llm aliases set smart gpt-4o
llm -s "Explain like I'm five" --save eLI5 # save as template
llm -t eLI5 "Why is the sky blue?"
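Templates are stored as YAML files, and you can list and inspect them at any time:
llm templates list # show all saved templates
llm templates show eLI5 # print the YAML for a single template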
Set a default model for the session:
llm models default gpt-4o-mini
Embeddings & Semantic Search
Build searchable knowledge bases with embeddings:
# Embed text into a collection ("docs") in a SQLite DB, stored under an ID
# (-m clip needs the llm-clip plugin)
llm embed docs item-1 -m clip -c "hello world" -d embeds.db
# Find the rows in that collection most similar to a query
llm similar docs -c "vector databases" -d embeds.db -n 5
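To index more than a handful of items, llm embed-multi can embed a whole directory of files in one command (the docs/ path and collection name here are just examples):
# Embed every Markdown file under docs/ into the "docs" collection
llm embed-multi docs --files docs '**/*.md' -d embeds.db -m clip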
Development Workflow Integration
Code analysis with symbex:
The symbex tool pairs perfectly with llm for code analysis and documentation:
# Install companion tools
pip install symbex files-to-prompt
# Analyze specific functions
symbex my_function | llm -s "Explain this code"
# Generate tests
symbex my_function | llm -s "Write pytest tests"
# Document entire codebase
files-to-prompt . -e py | llm -s "Generate API documentation"
Git integration: Integrate AI into your Git workflow for better commit messages and code reviews:
# Generate commit messages
git diff --cached | llm -s "Generate a conventional commit message"
# Code reviews
git diff HEAD~1 | llm -s "Review these changes for potential issues"
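To turn the commit-message step into a one-word command, you could wrap it in a shell function; the name and prompt below are just suggestions:
# Add to ~/.bashrc or ~/.zshrc
aicommit() {
  git commit -m "$(git diff --cached | llm -s 'Write a one-line conventional commit message')"
}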
Cost Optimization
Monitor and control costs by using the right models for different tasks:
llm -u "Analyse this file" -f big.txt # prints tokens & cost
llm logs -u | ttok -s # summary of past usage
# Cheap model for simple tasks
llm -m gpt-4o-mini "Quick draft tweet"
# Continue conversations to reuse context
llm "Analyze this architecture" -f project/
llm -c "Now focus on security" # Reuses previous context
Automation Examples
Create scripts to automate repetitive AI tasks:
Batch summariser:
#!/usr/bin/env bash
# Requires strip-tags (pip install strip-tags) to pull article text out of HTML
for url in "$@"; do
  curl -s "$url" | strip-tags article | \
    llm -s "3 bullet summary" > "summary_$(basename "$url").md"
done
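Save it as, say, summarise.sh, make it executable, and pass it any number of URLs (the URLs below are placeholders):
chmod +x summarise.sh
./summarise.sh https://example.com/post-1 https://example.com/post-2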
Documentation generator:
files-to-prompt src/ -e py | \
llm -s "Generate API docs" > docs/api.md
Advanced Features
Structured output: Get responses in specific JSON formats for easier programmatic processing:
# Get JSON with specific schema
llm "Analyze sentiment" --schema '{
"type": "object",
"properties": {
"sentiment": {"type": "string"},
"confidence": {"type": "number"}
}
}'
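There is also a concise shorthand for simple schemas, so you don’t have to write the JSON by hand:
# Equivalent schema using the concise syntax: field names with optional types
llm "Analyze sentiment of: great product, slow shipping" --schema 'sentiment, confidence float'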
Multi-modal capabilities: Work with images and other media types using compatible models:
# Image analysis
llm "Describe this image" -a screenshot.png -m gpt-4o
# Extract text from images
llm "Extract all text" -a document.jpg -m gpt-4o
Best Practices
- Create aliases for favourite models
- Save reusable templates and system prompts
- Use fragments to feed large context instead of copy-pasting
- Pick the cheapest model that solves the task
- Combine with Unix pipes for powerful automation
- Turn logging off with llm logs off if working with sensitive data
Conclusion
Integrating AI directly into my command-line workflow has been transformative. Instead of context-switching between web interfaces, I can analyze code, generate documentation, or ask quick questions without leaving the terminal. The combination of universal model access, automatic conversation logging, and pipe-friendly design makes llm an essential tool for any developer working with AI.
For more detailed information and advanced features, check out the official documentation: https://llm.datasette.io/