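5.3 Debugging Techniques

Introduction

Beginner's note: what is debugging?
Debugging means finding out where a program goes wrong.
Analogy: repairing a car
The car breaks down → you don't know which part failed
You open the hood and check each component → debugging
You discover the battery is dead → you found the bug
You replace the battery → you fixed the bug
Common debugging methods:

Method	When to use it	Analogy
`print` debugging	Quickly checking a variable's value	Shining a flashlight on one spot
`pdb` debugger	Stepping line by line for a deeper investigation	Slow-motion replay
Logging	Long-term monitoring, production environments	Installing a dashcam
Breakpoint debugging	Visual debugging in an IDE	Planting flags on a map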

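print Debugging

Beginner's note: print debugging
The simplest and most common debugging method: print variable values at key points.
python
def calculate(x, y):
    print(f"[DEBUG] x={x}, y={y}")  # inspect the incoming arguments
    result = x + y
    print(f"[DEBUG] result={result}")  # inspect the computed result
    return result
Pros: simple and fast.
Cons: print calls end up scattered through the code, so remember to remove them afterwards.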

python

def process_agent_response(response: dict) -> str:
    """Extract the message content from an agent response."""
    print(f"[DEBUG] received response: {response}")

    if "choices" in response:
        choices = response["choices"]
        print(f"[DEBUG] number of choices: {len(choices)}")

        if choices:
            message = choices[0].get("message", {})
            content = message.get("content", "")
            print(f"[DEBUG] content length: {len(content)}")
            return content

    # Reached when "choices" is missing or empty
    print("[DEBUG] unexpected response format")
    return ""

# Usage
response = {
    "choices": [
        {"message": {"content": "Hello!"}}
    ]
}

result = process_agent_response(response)
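
Two small refinements can make print debugging less error-prone. The sketch below reuses the `calculate` example and assumes Python 3.8 or later: the `f"{x=}"` form prints both the expression and its value, and writing to `sys.stderr` keeps debug lines separate from the program's normal output.

```python
import sys

def calculate(x, y):
    # f"{x=}" (Python 3.8+) expands to "x=<repr of x>", so label and value cannot drift apart
    print(f"[DEBUG] {x=}, {y=}", file=sys.stderr)  # stderr keeps debug noise out of stdout
    result = x + y
    print(f"[DEBUG] {result=}", file=sys.stderr)
    return result

total = calculate(2, 3)
```

Because the debug lines go to stderr, redirecting stdout (for example `python script.py > out.txt`) leaves them on the terminal.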

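pdb Debugger

Beginner's note: what is pdb?
pdb stands for Python Debugger, the debugger built into Python.
Analogy: pause and slow motion on a video player
Program running normally = video playing normally
pdb.set_trace() = pressing pause
After that you can go frame by frame (execute line by line)
Common pdb commands:

Command	Full name	What it does
`n`	next	Run the next line (stepping over function calls)
`s`	step	Step into a function
`c`	continue	Resume until the next breakpoint
`p x`	print	Print the value of variable x
`l`	list	Show the source around the current line
`q`	quit	Quit the debugger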

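python

import pdb

def calculate_cost(tokens: int, price_per_1k: float) -> float:
    """Compute the cost of an API call."""
    # Set a breakpoint: execution pauses here and drops into the (Pdb) prompt
    pdb.set_trace()

    cost = (tokens / 1000) * price_per_1k
    return cost

# At runtime the program pauses at set_trace().
# Common commands:
# - n (next): run the next line
# - s (step): step into a function
# - c (continue): resume execution
# - p variable: print a variable
# - l (list): show the source code
# - q (quit): quit the debugger

result = calculate_cost(1000, 0.002)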
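
To see what such a session looks like without typing at a prompt, the commands can be fed to pdb from a string. This is only an illustrative sketch: it relies on the `stdin` and `stdout` arguments of `pdb.Pdb` and on `runcall`, and the command sequence is one arbitrary choice.

```python
import io
import pdb

def calculate_cost(tokens: int, price_per_1k: float) -> float:
    cost = (tokens / 1000) * price_per_1k
    return cost

# The commands a person would type at the (Pdb) prompt, one per line:
# print tokens, run one line, print cost, then continue to the end.
commands = io.StringIO("p tokens\nn\np cost\nc\n")
transcript = io.StringIO()

# readrc=False ignores any personal .pdbrc; nosigint=True leaves signal handlers alone
debugger = pdb.Pdb(stdin=commands, stdout=transcript, nosigint=True, readrc=False)
result = debugger.runcall(calculate_cost, 1000, 0.002)  # pauses on the function's first line

print(transcript.getvalue())  # prompts, source locations, and the printed values
```

In everyday use you would type the same commands interactively; the scripted form is mainly handy for demonstrations.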

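breakpoint() Function (Python 3.7+)

Beginner's note: what is breakpoint()?
breakpoint() is a shorthand for pdb.set_trace().
Analogy:
pdb.set_trace() = setting every alarm-clock parameter by hand
breakpoint() = simply saying "wake me at 7 tomorrow morning"
By default both have the same effect; breakpoint() is just more concise.
python
# Older style (the only option in Python 3.6 and earlier; still works today)
import pdb
pdb.set_trace()

# Newer style (Python 3.7+)
breakpoint()  # a single line, no import needed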

python

def complex_calculation(data: list) -> float:
    """Sum numbers and string lengths."""
    result = 0.0

    for item in data:
        # Use breakpoint() instead of pdb.set_trace().
        # It sits inside the loop, so the program pauses on every iteration;
        # type c at the prompt to continue to the next one.
        breakpoint()

        if isinstance(item, (int, float)):
            result += item
        elif isinstance(item, str):
            result += len(item)

    return result

# Usage
data = [1, 2, "hello", 3.5]
total = complex_calculation(data)
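
Pausing on every iteration gets tedious for long inputs. One option, sketched here with a hypothetical `pause_on_strings` flag that is not part of the original example, is to guard the breakpoint with a condition so it only fires for the items you care about.

```python
def complex_calculation(data: list, pause_on_strings: bool = False) -> float:
    """Sum numbers and string lengths, optionally pausing on string items."""
    result = 0.0
    for item in data:
        # Conditional breakpoint: only stop when the flag is on AND the item is a string
        if pause_on_strings and isinstance(item, str):
            breakpoint()
        if isinstance(item, (int, float)):
            result += item
        elif isinstance(item, str):
            result += len(item)
    return result

# The flag is off here, so the call runs straight through
total = complex_calculation([1, 2, "hello", 3.5])
```

Separately, setting the environment variable `PYTHONBREAKPOINT=0` makes every `breakpoint()` call a no-op, which helps when a stray call was left in the code.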

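Decorator-Based Debugging Tools

Beginner's note: what is a debugging decorator?
A debugging decorator attaches a "monitor" to a function.
Analogy: the visitor counter at a shop entrance
A customer walks in → record "who came in"
A customer leaves → record "what they bought and how long they stayed"
@debug              ← attach a monitor
def my_function():  ← the monitored function
    ...

Calling my_function() automatically prints:
┌──────────────────────────────────┐
│ Calling my_function(arg1, arg2)  │
│   Returned "result"              │
│   Took 0.5s                      │
└──────────────────────────────────┘
Benefit: you can monitor how a function runs without changing its body.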

python

import functools
import time
from typing import Callable

def debug(func: Callable) -> Callable:
    """Debugging decorator: print call information for a function."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Format the arguments
        args_repr = [repr(a) for a in args]
        kwargs_repr = [f"{k}={v!r}" for k, v in kwargs.items()]
        signature = ", ".join(args_repr + kwargs_repr)

        print(f"Calling {func.__name__}({signature})")

        # Run the function; perf_counter() is a monotonic clock meant for measuring intervals
        start = time.perf_counter()
        result = func(*args, **kwargs)
        end = time.perf_counter()

        print(f"  Returned {result!r}")
        print(f"  Took {end - start:.4f}s")

        return result

    return wrapper

@debug
def call_llm(prompt: str, model: str = "gpt-4") -> str:
    """Call an LLM."""
    time.sleep(0.5)  # simulate an API call
    return f"Response from {model}"

# Usage
result = call_llm("Hello", model="gpt-3.5-turbo")
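
The decorator above prints nothing about the outcome if the wrapped function raises. A possible variant, shown as a sketch, also reports exceptions and accepts an `emit` callable (a name introduced here for illustration) so the report lines can go somewhere other than the console.

```python
import functools
import time

def debug(emit=print):
    """Decorator factory: `emit` receives each report line (print by default)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            parts = [repr(a) for a in args] + [f"{k}={v!r}" for k, v in kwargs.items()]
            emit(f"Calling {func.__name__}({', '.join(parts)})")
            start = time.perf_counter()
            try:
                result = func(*args, **kwargs)
            except Exception as exc:
                emit(f"  Raised {type(exc).__name__}: {exc}")
                raise  # re-raise so callers still see the error
            else:
                emit(f"  Returned {result!r}")
                return result
            finally:
                # Runs on both the success and the failure path
                emit(f"  Took {time.perf_counter() - start:.4f}s")
        return wrapper
    return decorator

lines = []

@debug(emit=lines.append)  # collect report lines in a list instead of printing
def divide(a, b):
    return a / b

divide(6, 3)
try:
    divide(1, 0)
except ZeroDivisionError:
    pass
```

After these two calls, `lines` holds three entries per call: the call line, the outcome (returned value or raised exception), and the elapsed time.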

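Performance Profiling

Beginner's note: what is performance profiling?
Profiling means finding out which part of a program is slow.
Analogy: a medical check-up
Program runs slowly = you feel unwell
Profiling = a full-body examination
The report says = "heart is fine, but blood pressure is high" (one function takes too long)
The cProfile module produces a report along these lines (simplified):
┌────────────────────────────────────────────┐
│ Function        Calls       Total time     │
│ process_data    1000        2.5 s          │ ← the bottleneck
│ save_file       100         0.1 s          │
│ load_config     1           0.01 s         │
└────────────────────────────────────────────┘
When to use it:
The program is slow but you don't know where the time goes
You want to optimize, but need to locate the bottleneck first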

python

import cProfile
import functools
import pstats
from io import StringIO
from typing import Callable

def profile_function(func: Callable) -> Callable:
    """Profiling decorator."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        profiler = cProfile.Profile()
        profiler.enable()

        result = func(*args, **kwargs)

        profiler.disable()

        # Build the report
        stream = StringIO()
        ps = pstats.Stats(profiler, stream=stream)
        ps.sort_stats('cumulative')
        ps.print_stats(10)  # show the 10 entries with the highest cumulative time

        print(stream.getvalue())

        return result

    return wrapper

@profile_function
def process_large_dataset(data: list) -> list:
    """Process a large dataset."""
    result = []
    for item in data:
        # Stand-in for some expensive processing
        processed = item ** 2
        result.append(processed)
    return result

# Usage
data = list(range(10000))
result = process_large_dataset(data)
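
When only a region of code needs profiling, `cProfile.Profile` can also be used as a context manager (Python 3.8+). The sketch below uses made-up functions `slow_part` and `fast_part` and filters the report down to those two names, so the bottleneck is easy to spot.

```python
import cProfile
import io
import pstats

def slow_part(n):
    total = 0
    for i in range(n):
        total += i * i
    return total

def fast_part():
    return 42

def main():
    fast_part()
    return slow_part(100_000)

# Everything executed inside the with-block is profiled
with cProfile.Profile() as profiler:
    main()

stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).strip_dirs().sort_stats("cumulative")
stats.print_stats("slow_part|fast_part")  # a string argument is a regex filter on function names
report = stream.getvalue()
print(report)
```

In the printed table, compare the cumulative-time column for the two functions; `slow_part` should dominate.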

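Memory Profiling

Beginner's note: what is memory profiling?
Memory profiling means checking how much memory a program uses.
Analogy: managing fridge space
Memory = the fridge's storage space
Variables and data = the food in the fridge
Memory profiling = finding out what is filling the fridge
Why care about memory?
Scenario: processing one million records

Without memory awareness:
data = [x for x in huge_file]  # loads everything at once and can exhaust memory 💥

With memory awareness:
for x in huge_file:  # handles one record at a time, memory stays flat
    process(x)
tracemalloc reports:
Current memory usage
Peak memory usage (the highest point reached)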

python

import functools
import tracemalloc
from typing import Callable

def memory_profile(func: Callable) -> Callable:
    """Memory-profiling decorator."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Start tracing allocations
        tracemalloc.start()

        result = func(*args, **kwargs)

        # Read the usage; "current" still includes the returned result
        current, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()

        print(f"{func.__name__} memory usage:")
        print(f"  current: {current / 1024 / 1024:.2f} MB")
        print(f"  peak: {peak / 1024 / 1024:.2f} MB")

        return result

    return wrapper

@memory_profile
def create_large_list(size: int) -> list:
    """Create a large list."""
    return [i for i in range(size)]

# Usage
large_list = create_large_list(1000000)
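
The "load everything at once" versus "one item at a time" contrast from the beginner's note can be measured directly. This sketch uses a hypothetical helper `peak_bytes` to compare the peak memory of summing a fully built list with summing a generator over the same values.

```python
import tracemalloc

def peak_bytes(func):
    """Run func() and return the peak number of bytes allocated while it ran."""
    tracemalloc.start()
    try:
        func()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak

# The list comprehension materializes all 200,000 values before sum() starts
list_peak = peak_bytes(lambda: sum([i * i for i in range(200_000)]))

# The generator expression hands values to sum() one at a time
gen_peak = peak_bytes(lambda: sum(i * i for i in range(200_000)))

print(f"list peak:      {list_peak / 1024:.0f} KiB")
print(f"generator peak: {gen_peak / 1024:.0f} KiB")
```

The exact numbers depend on the Python version and platform, but the generator's peak should be a small fraction of the list's.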

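Custom Debugging Context Manager

Beginner's note: debugging context managers
A context manager automates the "enter/exit" bookkeeping.
Analogy: a meeting-room booking system
with debug_context("API call"):  # enter the room (prints "enter" automatically)
    # hold the meeting (your code)
    response = call_api()
# leave the room (prints "exit" automatically)
Output:
[enter] API call
response: {...}
[exit] API call
Benefits:
Records the start and end of a code block automatically
Prints the error details automatically if something fails
The with syntax keeps the code tidy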

python

from contextlib import contextmanager
import traceback

@contextmanager
def debug_context(name: str, verbose: bool = True):
    """Debugging context: report entry, exit, and any error in the block."""
    if verbose:
        print(f"[enter] {name}")

    try:
        yield
    except Exception as e:
        # The last traceback entry is the frame where the error was actually raised
        origin = traceback.extract_tb(e.__traceback__)[-1]
        print(f"[error] {name}: {e}")
        print(f"  type: {type(e).__name__}")
        print(f"  location: {origin.filename}:{origin.lineno}")
        raise
    finally:
        if verbose:
            print(f"[exit] {name}")

# Usage
with debug_context("API call"):
    # Your code
    response = {"status": "success"}
    print(f"response: {response}")
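
The usage above only exercises the success path. The following sketch, a variant named `timed_context` with an `emit` parameter (both names are illustrative), shows the error path as well and adds the elapsed time to the exit line.

```python
from contextlib import contextmanager
import time

@contextmanager
def timed_context(name, emit=print):
    """Report entry, errors, and exit (with elapsed time) for a code block."""
    emit(f"[enter] {name}")
    start = time.perf_counter()
    try:
        yield
    except Exception as exc:
        emit(f"[error] {name}: {type(exc).__name__}: {exc}")
        raise  # the caller still has to handle the exception
    finally:
        emit(f"[exit] {name} after {time.perf_counter() - start:.4f}s")

lines = []
try:
    with timed_context("parse", emit=lines.append):
        int("not a number")  # raises ValueError inside the block
except ValueError:
    pass

print("\n".join(lines))
```

The exit line is emitted even though the block failed, because it lives in the `finally` clause.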

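Assertions and Validation

Beginner's note: what is an assertion (assert)?
An assertion is a "checkpoint" inside the program.
Analogy: airport security
python
assert has_ticket, "No boarding without a ticket"            # checkpoint 1
assert no_prohibited_items, "Dangerous items are not allowed"  # checkpoint 2
# Boarding is allowed only if both checks pass
Python assertion syntax:
python
assert condition, "error message shown when the condition fails"

# Equivalent to:
if not condition:
    raise AssertionError("error message shown when the condition fails")
When to use it:
Checking that function arguments are valid
Verifying that configuration is correct
Making sure data has the expected shape
Note: running Python with the -O flag removes assert statements, and production deployments may do so, so never rely on assert alone for important validation.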

python

from typing import Any, Dict

def validate_agent_config(config: Dict[str, Any]):
    """Validate an agent configuration."""
    # Quick checks with assertions (these disappear under python -O)
    assert "model" in config, "missing model setting"
    assert "temperature" in config, "missing temperature setting"

    # More detailed validation
    temperature = config["temperature"]
    assert 0 <= temperature <= 2, \
        f"temperature must be between 0 and 2, got: {temperature}"

    assert isinstance(config.get("tools", []), list), \
        "tools must be a list"

    print("Configuration is valid")

# Usage
try:
    config = {
        "model": "gpt-4",
        "temperature": 0.7,
        "tools": ["search", "calculator"]
    }
    validate_agent_config(config)
except AssertionError as e:
    print(f"Configuration error: {e}")
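
Since assertions can be disabled with `-O`, checks that must always run are usually written with explicit exceptions. A minimal sketch of the same validation in that style follows; the choice of `ValueError` and `TypeError` is an assumption, not something the original prescribes.

```python
from typing import Any, Dict

def validate_agent_config(config: Dict[str, Any]) -> None:
    """Validate an agent configuration; these checks run even under python -O."""
    for key in ("model", "temperature"):
        if key not in config:
            raise ValueError(f"missing required setting: {key}")

    temperature = config["temperature"]
    if not isinstance(temperature, (int, float)) or not 0 <= temperature <= 2:
        raise ValueError(f"temperature must be a number between 0 and 2, got {temperature!r}")

    if not isinstance(config.get("tools", []), list):
        raise TypeError("tools must be a list")

errors = []
candidates = (
    {"model": "gpt-4", "temperature": 0.7},  # valid
    {"model": "gpt-4", "temperature": 3},    # temperature out of range
)
for candidate in candidates:
    try:
        validate_agent_config(candidate)
    except (ValueError, TypeError) as exc:
        errors.append(str(exc))

print(errors)
```

Only the second configuration is rejected, so `errors` ends up with a single message about the temperature.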

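Debugging with Logs

Beginner's note: debugging with logs
Log-based debugging is an upgraded version of print debugging.
Why debug with logs?
During development: DEBUG level, see every detail
After release: WARNING level, see only what matters
One codebase, two modes

Aspect	print debugging	Log-based debugging
Output destination	Console (by default)	Console + file
Turning it off	Delete the calls by hand	Change one log level
Detail	Only what you wrote	Timestamp and location added by the log format
Production use	Not suitable	Suitable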

python

import logging

# Configure logging
logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler('debug.log', encoding='utf-8'),
        logging.StreamHandler()
    ]
)

logger = logging.getLogger(__name__)

class DebugAgent:
    """An agent that writes debug logs."""

    def __init__(self, name: str):
        self.name = name
        logger.info("Initializing agent: %s", name)

    def process(self, text: str) -> str:
        """Process the input."""
        logger.debug("Received input: %s", text)

        try:
            # Processing logic
            result = self._internal_process(text)
            logger.debug("Output: %s", result)
            return result

        except Exception as e:
            # exc_info=True appends the full traceback to the log record
            logger.error("Processing failed: %s", e, exc_info=True)
            raise

    def _internal_process(self, text: str) -> str:
        """Internal processing step."""
        logger.debug("Starting internal processing")

        # Simulated processing
        if not text:
            raise ValueError("Input must not be empty")

        return f"Result: {text.upper()}"

# Usage
agent = DebugAgent("TestBot")

try:
    result = agent.process("hello")
    print(result)

    # Trigger an error
    result = agent.process("")
except Exception as e:
    print(f"Error: {e}")
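
The "one codebase, two modes" idea can be demonstrated in isolation. In this sketch the logger name `demo.agent` and the in-memory stream are illustrative choices; the same `run()` function is executed once at DEBUG level and once at WARNING level.

```python
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)  # write log records into memory
handler.setFormatter(logging.Formatter("%(levelname)s:%(message)s"))

logger = logging.getLogger("demo.agent")
logger.addHandler(handler)
logger.propagate = False  # keep the demo output out of the root logger

def run():
    logger.debug("raw prompt: %s", "hello")
    logger.warning("API response was slow")

# Development mode: everything from DEBUG upwards is recorded
logger.setLevel(logging.DEBUG)
run()
dev_output = stream.getvalue()

# Production mode: same code, only WARNING and above is recorded
stream.seek(0)
stream.truncate()
logger.setLevel(logging.WARNING)
run()
prod_output = stream.getvalue()

print(dev_output)
print(prod_output)
```

No call site changes between the two runs; only the logger's level does.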

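VS Code Debug Configuration

Beginner's note: what is IDE debugging?

IDE debugging = a visual pdb

Comparison:

pdb: type n, s, c on the command line (you memorize commands)
VS Code debugger: click buttons and read panels (what you see is what you get)

The VS Code debug view:

┌───────────────────────────────────────────────────┐
│  ▶️ ⏭️ ⏹️  Debug control buttons                  │
├───────────────────────────────────────────────────┤
│ Variables panel    │ Code editor                  │
│ x = 10             │ 1  def foo():                │
│ name = "Zhang San" │ 2 → print(x)  ← breakpoint   │
│                    │ 3    return x                │
├───────────────────────────────────────────────────┤
│ Call stack │ Debug console                        │
└───────────────────────────────────────────────────┘

How to use it:

Click to the left of a line number → set a breakpoint (red dot)
Press F5 → start debugging
Press F10 → step over (do not enter functions)
Press F11 → step into a function

Create .vscode/launch.json. (Recent versions of the VS Code Python Debugger extension expect "type": "debugpy"; the "python" value used below is the older spelling and may be reported as deprecated.)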

json

{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Python: Current File",
            "type": "python",
            "request": "launch",
            "program": "${file}",
            "console": "integratedTerminal",
            "justMyCode": false
        },
        {
            "name": "Python: Agent Script",
            "type": "python",
            "request": "launch",
            "program": "${workspaceFolder}/agent_main.py",
            "args": ["--verbose", "--debug"],
            "console": "integratedTerminal",
            "env": {
                "DEBUG": "true"
            }
        },
        {
            "name": "Python: Attach",
            "type": "python",
            "request": "attach",
            "connect": {
                "host": "localhost",
                "port": 5678
            }
        }
    ]
}

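Remote Debugging (debugpy)

Beginner's note: what is remote debugging?
Remote debugging means the program runs on a server while you debug it from your own machine.
Analogy: telemedicine
The patient is in their hometown (the program is on the server)
The doctor is in Beijing (you are at your local machine)
The consultation happens over video (the debugger connects over the network)
Typical scenario:
A program deployed on a cloud server has a bug
Logging in to the server and debugging on the command line every time is impractical
Remote debugging lets your local VS Code debug the remote program directly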

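python

# Install: pip install debugpy
import debugpy

# Start the debug server.
# "0.0.0.0" accepts connections on every network interface, and anyone who can
# reach the port can run code in this process; restrict access accordingly.
debugpy.listen(("0.0.0.0", 5678))
print("Waiting for the debugger to attach...")
debugpy.wait_for_client()  # optional: block until a debugger attaches

# Your code
def main():
    print("Program started")
    # ...

if __name__ == "__main__":
    main()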
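
Leaving a debug listener enabled all the time is rarely desirable. One possible pattern, sketched here with a hypothetical environment variable `REMOTE_DEBUG_PORT`, starts debugpy only on request and binds to the loopback address so the port is reachable only from the machine itself (for example through an SSH tunnel).

```python
import os

def maybe_enable_remote_debug(env=os.environ) -> bool:
    """Start a debugpy listener only if REMOTE_DEBUG_PORT (a made-up name) is set."""
    port = env.get("REMOTE_DEBUG_PORT")
    if not port:
        return False  # normal runs never import or start debugpy

    import debugpy  # third-party (pip install debugpy); imported lazily so it stays optional
    debugpy.listen(("127.0.0.1", int(port)))  # loopback only
    print(f"debugpy listening on 127.0.0.1:{port}")
    return True

# With an empty mapping nothing is started, which is the default behaviour
enabled = maybe_enable_remote_debug({})
print(enabled)
```

Running the program with `REMOTE_DEBUG_PORT=5678` set would then match the port used by the "Python: Attach" configuration above.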

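Debugging in Unit Tests

Beginner's note: how do you debug inside tests?
A unit test is an automated check of your program.
Analogy: quality control in a factory
Product is built  → inspector checks it → ship it if it passes, send it back if it fails
Code is written   → tests run           → deploy if they pass, fix if they fail
Debugging techniques in tests:
Use breakpoint() to pause inside a test
Use self.assertEqual() to compare expected and actual values
When a test fails, read the detailed error message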

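python

import unittest

# Assumes the DebugAgent class from the logging section is defined or imported in this module.

class TestAgentLogic(unittest.TestCase):
    """Tests for the agent logic."""

    def setUp(self):
        """Prepare a fresh agent before each test."""
        self.agent = DebugAgent("TestAgent")

    def test_process_valid_input(self):
        """Valid input comes back upper-cased."""
        result = self.agent.process("hello")
        # Compare only the payload after the "label: " prefix
        self.assertEqual(result.split(": ", 1)[-1], "HELLO")

    def test_process_empty_input(self):
        """Empty input raises ValueError with a message."""
        with self.assertRaises(ValueError) as context:
            self.agent.process("")

        self.assertNotEqual(str(context.exception), "")

    def test_with_debug(self):
        """A test that pauses in the debugger."""
        input_value = "test"
        breakpoint()  # the debugger stops here; remove before committing
        result = self.agent.process(input_value)
        self.assertIsNotNone(result)

# Run the tests
if __name__ == '__main__':
    unittest.main()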
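
When one test loops over many inputs, a failure message is more useful if it names the input that failed. The sketch below uses `subTest` from `unittest` for that; `normalize` is a made-up function standing in for the code under test, and the runner is invoked programmatically only to keep the example self-contained.

```python
import io
import unittest

def normalize(text: str) -> str:
    """Toy function under test: trim whitespace and upper-case."""
    return text.strip().upper()

class NormalizeTests(unittest.TestCase):
    def test_many_inputs(self):
        cases = {"hello": "HELLO", "  hi ": "HI", "": ""}
        for raw, expected in cases.items():
            # Each subTest is reported separately, labelled with raw=...
            with self.subTest(raw=raw):
                self.assertEqual(normalize(raw), expected)

suite = unittest.defaultTestLoader.loadTestsFromTestCase(NormalizeTests)
outcome = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0).run(suite)
print(outcome.wasSuccessful())
```

If one case failed, the report would show its `raw` value and the remaining cases would still run.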

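Best Practices

Beginner's note: golden rules of debugging
Start simple: try print first, and move to pdb when that is not enough
Clean up: remove print and breakpoint() calls once you are done
Prefer logs: in real projects use logging rather than print
Use your tools: for multi-step investigations, the VS Code debugger is usually more convenient than the command line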

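1. Using Log Levels

python

import logging

logger = logging.getLogger(__name__)
x, y = 1, 2  # sample values for the DEBUG message

# DEBUG: detailed diagnostic information
logger.debug("variables: x=%s, y=%s", x, y)

# INFO: general information
logger.info("Agent started successfully")

# WARNING: something unexpected, but the program keeps working
logger.warning("API response is slow")

# ERROR: an operation failed
logger.error("API call failed", exc_info=True)

# CRITICAL: a severe failure
logger.critical("System crashed")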

2. Temporary Debugging Tricks

python

# Control debug output with an environment variable
import os

DEBUG = os.getenv("DEBUG", "false").lower() == "true"

def debug_print(*args, **kwargs):
    """Print only when debugging is enabled."""
    if DEBUG:
        print("[DEBUG]", *args, **kwargs)

# Usage
debug_print("This is shown only when DEBUG=true")

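3. Exception Tracebacks

python

import logging
import traceback

logger = logging.getLogger(__name__)

def risky_operation():
    """Placeholder for code that may fail."""
    raise RuntimeError("something went wrong")

try:
    # Code that may fail
    risky_operation()
except Exception:
    # Print the full traceback to stderr
    traceback.print_exc()

    # Or record it in the log instead
    logger.error("Exception details:", exc_info=True)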
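
Sometimes the traceback needs to be kept as text, for instance to attach it to an error report. A short sketch using `traceback.format_exc()`, which returns the same text that `print_exc()` would print; `risky_operation` is again only a placeholder.

```python
import traceback

def risky_operation():
    return 1 / 0  # always fails, for demonstration

try:
    risky_operation()
except ZeroDivisionError:
    # format_exc() returns the traceback of the exception being handled as a string
    details = traceback.format_exc()

print(details)
```

The string includes the file, line number, and function name of each frame, followed by the exception type and message.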

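Key Takeaways

Beginner's summary: choosing a debugging tool

Scenario	Recommended tool	Why
Quickly checking a variable	`print()`	The simplest option
Tracing a problem line by line	`pdb` / `breakpoint()`	Can pause and single-step
Long-term monitoring	`logging`	Can write to a file
Complex projects	VS Code debugger	Visual interface
Performance problems	`cProfile`	Finds the slow functions
Memory problems	`tracemalloc`	Shows where memory goes

Rules of thumb:
print for a quick look, pdb to pause and dig in
Logs keep a lasting record, the IDE debugger is the most convenient
Profiling finds the bottleneck, memory tracing prevents blow-ups

Pick the right tool: print is quick, pdb goes deep, logs persist
Breakpoint strategy: set breakpoints at key locations
Log levels: use the different levels deliberately
Profiling: measure to find the bottleneck
Unit tests: write tests to support debugging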

Next section: 5.4 Summary and Review
