Lesson 35: Project — Documentation Agent
Course: AI Code Agents | Duration: 55 minutes | Level: Intermediate
Learning Objectives
By the end of this lesson, you will be able to:
- Build an agent that generates and updates docstrings for Python functions
- Generate Mermaid architecture diagrams from code structure
- Update README files to reflect code changes
- Implement incremental documentation (only update what changed)
Prerequisites
- Lessons 33-34 of this section
Part 1: Documentation Agent Design
Documentation has a reputation as tedious. That makes it an excellent target for automation.
The documentation agent has three distinct tasks, each with different requirements:
Docstring generation: Read a function's signature, body, and usage context. Generate a Google-style or NumPy-style docstring that describes: what the function does, its parameters (types and meanings), return value, and exceptions raised. The hard part: inferring intent from code, not just describing what the code does.
Architecture diagram generation: Read the codebase structure (directory layout, import dependencies, class hierarchies). Generate a Mermaid diagram that visually represents the architecture at an appropriate level of abstraction. The hard part: choosing the right level of abstraction (not too detailed, not too abstract).
README updating: Read the existing README and the recent code changes. Update the README to reflect what changed: new functions, changed APIs, new dependencies, new usage examples. The hard part: maintaining the existing style and structure of the README while adding new content accurately.
Part 2: The Documentation Agent Implementation
#!/usr/bin/env python3
"""Documentation Agent — generates and updates documentation for Python projects."""
import anthropic
import subprocess
from pathlib import Path
import ast
import sys
client = anthropic.Anthropic()
SYSTEM_PROMPT = """You are a technical writer and Python expert who writes clear, accurate documentation.
Documentation standards:
- Docstrings: Google style format
- All public functions must have docstrings
- Include types in Args and Returns sections
- Include Raises section only when functions explicitly raise exceptions
- Architecture diagrams: Mermaid flowchart or class diagram format
- README: maintain existing structure and style; add new sections at appropriate locations
Never invent capabilities that are not evident in the code.
If a function's purpose is unclear, describe what it does, not what it might intend.
"""
TOOLS = [
{
"name": "read_file",
"description": "Read a source file.",
"input_schema": {
"type": "object",
"properties": {"path": {"type": "string"}},
"required": ["path"]
}
},
{
"name": "find_undocumented_functions",
"description": "Find functions in a file that have no docstring.",
"input_schema": {
"type": "object",
"properties": {"path": {"type": "string", "description": "Python file path"}},
"required": ["path"]
}
},
{
"name": "get_import_graph",
"description": "Analyze import dependencies between Python files in a directory.",
"input_schema": {
"type": "object",
"properties": {
"directory": {"type": "string", "description": "Directory to analyze"}
},
"required": ["directory"]
}
},
{
"name": "write_file",
"description": "Write content to a file.",
"input_schema": {
"type": "object",
"properties": {
"path": {"type": "string"},
"content": {"type": "string"}
},
"required": ["path", "content"]
}
},
{
"name": "edit_file",
"description": "Replace a specific string in a file with new content.",
"input_schema": {
"type": "object",
"properties": {
"path": {"type": "string"},
"old_text": {"type": "string", "description": "Exact text to replace"},
"new_text": {"type": "string", "description": "Replacement text"}
},
"required": ["path", "old_text", "new_text"]
}
}
]
def find_undocumented(path: Path) -> list[dict]:
"""Parse a Python file and find functions/methods without docstrings."""
source = path.read_text(encoding="utf-8")
try:
tree = ast.parse(source)
except SyntaxError as e:
return [{"error": str(e)}]
undocumented = []
for node in ast.walk(tree):
if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
if node.name.startswith("_"):
continue # Skip private
has_docstring = (
isinstance(node.body[0], ast.Expr)
and isinstance(node.body[0].value, ast.Constant)
and isinstance(node.body[0].value.value, str)
)
if not has_docstring:
undocumented.append({
"name": node.name,
"line": node.lineno,
"args": [a.arg for a in node.args.args]
})
return undocumented
def get_import_graph(directory: Path) -> dict[str, list[str]]:
"""Build a module import graph for the given directory."""
graph = {}
for py_file in directory.rglob("*.py"):
source = py_file.read_text(encoding="utf-8")
try:
tree = ast.parse(source)
imports = []
for node in ast.walk(tree):
if isinstance(node, ast.ImportFrom):
if node.module:
imports.append(node.module)
graph[str(py_file.relative_to(directory))] = imports
except SyntaxError:
continue
return graph
def execute_tool(name: str, args: dict, project_root: Path) -> str:
if name == "read_file":
p = project_root / args["path"]
return p.read_text(encoding="utf-8") if p.exists() else f"File not found: {args['path']}"
elif name == "find_undocumented_functions":
p = project_root / args["path"]
if not p.exists():
return f"File not found: {args['path']}"
results = find_undocumented(p)
if not results:
return "All public functions have docstrings."
return str(results)
elif name == "get_import_graph":
d = project_root / args["directory"]
graph = get_import_graph(d)
lines = [f"{module}: imports {', '.join(deps) or 'nothing'}"
for module, deps in graph.items()]
return "\n".join(lines)
elif name == "write_file":
p = project_root / args["path"]
p.parent.mkdir(parents=True, exist_ok=True)
p.write_text(args["content"], encoding="utf-8")
return f"Written: {p}"
elif name == "edit_file":
p = project_root / args["path"]
content = p.read_text(encoding="utf-8")
if args["old_text"] not in content:
return f"Error: old_text not found in {args['path']}"
new_content = content.replace(args["old_text"], args["new_text"], 1)
p.write_text(new_content, encoding="utf-8")
return f"Edited: {p}"
return f"Unknown tool: {name}"
def document_module(source_path: str, project_root: str = ".") -> str:
root = Path(project_root).resolve()
task = f"""Add missing docstrings to {source_path}.
Steps:
1. Find all undocumented functions in {source_path}
2. Read the file to understand the full context of each function
3. For each undocumented function, add a Google-style docstring that describes:
- What the function does (one-line summary)
- Args: each parameter with its type and meaning
- Returns: the return value type and meaning
- Raises: any exceptions the function explicitly raises
4. Edit the file to add the docstrings (use edit_file, not write_file — preserve existing content)
5. Report: how many docstrings were added"""
messages = [{"role": "user", "content": task}]
for _ in range(20):
response = client.messages.create(
model="claude-sonnet-4-5",
max_tokens=8192,
system=SYSTEM_PROMPT,
tools=TOOLS,
messages=messages
)
if response.stop_reason == "end_turn":
for block in response.content:
if hasattr(block, "text"):
return block.text
return "Documentation complete."
messages.append({"role": "assistant", "content": response.content})
tool_results = []
for block in response.content:
if block.type == "tool_use":
result = execute_tool(block.name, block.input, root)
tool_results.append({
"type": "tool_result",
"tool_use_id": block.id,
"content": result
})
messages.append({"role": "user", "content": tool_results})
return "Error: agent did not complete"
if __name__ == "__main__":
if len(sys.argv) < 2:
print("Usage: python doc_agent.py <source.py> [project_root]")
sys.exit(1)
print(document_module(sys.argv[1], sys.argv[2] if len(sys.argv) > 2 else "."))Part 3: Architecture Diagram Generation
For architecture diagram generation, extend the agent with a separate task:
def generate_architecture_diagram(src_dir: str, output_path: str, project_root: str = ".") -> str:
root = Path(project_root).resolve()
task = f"""Generate a Mermaid architecture diagram for the project.
Steps:
1. Get the import graph for {src_dir}
2. Read the main module files to understand what each module does
3. Generate a Mermaid flowchart diagram that shows:
- The main modules/layers as nodes
- Import dependencies as edges
- Group related modules with subgraphs if there are clear layers (api, services, repositories, models)
4. Write the Mermaid diagram to {output_path}
Format: Mermaid flowchart LR or TD (choose the one that looks better for this structure)
Keep it readable: do not show every file, show logical modules/layers"""
messages = [{"role": "user", "content": task}]
# ... same loop pattern as aboveKey Takeaways
- Documentation agent has three distinct tasks: docstring generation, architecture diagrams, README updates
- Use AST parsing (
find_undocumented_functions) to reliably identify missing docstrings — more precise than regex - Prefer
edit_fileoverwrite_filefor docstring insertion to avoid accidentally overwriting existing content - Architecture diagrams are generated from import graph data + module descriptions — not from free-form description
Next Lesson: In Lesson 36: Project — Refactoring Agent, we build an agent that identifies code smells and performs targeted, safe refactoring using the test suite as a safety net.