Lecture 19: Performance Profiling and Optimization (cProfile, perf, and Optimization Techniques)

2026-03-03

Hi everyone, I'm 程序员晚枫 (Programmer Wanfeng), currently working hands-on with all kinds of AI projects.

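Measure first, then optimize! Use tools to find the real performance bottleneck instead of guessing. This lecture teaches a systematic, measurement-driven approach to performance analysis.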

📖 Opening: Don't guess, measure

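# Common optimization mistakes:
# 1. Optimizing code that accounts for 1% of the runtime while ignoring the bottleneck that accounts for 50%
# 2. Hand-optimizing code that the compiler already optimizes automatically
# 3. Polishing code style without improving performance

# The right approach:
# 1. Measure -> find the bottleneck
# 2. Optimize the bottleneck
# 3. Measure -> verify the improvement
# 4. Repeat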
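
The measure, optimize, verify loop above can be sketched in a few lines. This is only an illustration: `measure`, `count_hits_before`, and `count_hits_after` are hypothetical names invented for this example, and the "bottleneck" (membership tests against a list) is an assumed one, not something taken from a real program.

```python
import time

def measure(func, *args, repeats=5):
    """Return the best wall-clock time (seconds) over `repeats` calls to func(*args)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best

# Hypothetical bottleneck: each membership test against a list is O(n).
def count_hits_before(items, queries):
    return sum(1 for q in queries if q in items)

# Candidate fix: build a set once so each membership test is O(1) on average.
def count_hits_after(items, queries):
    lookup = set(items)
    return sum(1 for q in queries if q in lookup)

items = list(range(2_000))
queries = list(range(0, 4_000, 2))

before = measure(count_hits_before, items, queries)  # step 1: measure
after = measure(count_hits_after, items, queries)    # step 3: measure again
same_answer = count_hits_before(items, queries) == count_hits_after(items, queries)

print(f"before: {before:.6f}s  after: {after:.6f}s  same answer: {same_answer}")
```

Checking that both versions return the same answer is part of the verification step: a faster function that computes something different is not an optimization.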

🔍 cProfile (the most commonly used profiler)

Basic usage

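import cProfile
import pstats
import io

# Method 1: profile a single statement with cProfile.runctx()
# (runctx() executes the statement immediately; it cannot be used as a decorator)
def test_func():
    # your code here
    pass

cProfile.runctx('test_func()', globals(), {})

# Method 2: enable and disable a Profile object explicitly
# (since Python 3.8, `with cProfile.Profile() as profiler:` also works)
def my_heavy_function():
    # ... 10,000 lines of code ...
    pass

profiler = cProfile.Profile()
profiler.enable()

my_heavy_function()

profiler.disable()

# Write the report to a string
s = io.StringIO()
stats = pstats.Stats(profiler, stream=s)
stats.sort_stats('cumulative')  # sort by cumulative time
stats.print_stats(20)  # show the top 20 entries
print(s.getvalue())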
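
If you profile often, the enable/disable boilerplate can be wrapped in a small helper. The sketch below assumes Python 3.8 or newer (where `cProfile.Profile` works as a context manager); `profile_call` and `sum_of_squares` are hypothetical names made up for this example.

```python
import cProfile
import io
import pstats

def profile_call(func, *args, top=10, sort_key="cumulative", **kwargs):
    """Run func(*args, **kwargs) under cProfile; return (result, report_text)."""
    # cProfile.Profile can be used as a context manager on Python 3.8+.
    with cProfile.Profile() as profiler:
        result = func(*args, **kwargs)

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats(sort_key).print_stats(top)
    return result, buffer.getvalue()

# Hypothetical workload, only here to give the profiler something to record.
def sum_of_squares(n):
    return sum(i * i for i in range(n))

value, report = profile_call(sum_of_squares, 100_000)
print(value)
print(report)
```

Returning the report as a string rather than printing it inside the helper makes it easy to log the report or write it to a file.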

Reading the output

         1999993 function calls in 1.234 seconds

   Ordered by: cumulative time

   ncalls  tottime  cumtime  filename:lineno(function)
        1    0.001    1.234  test.py:12(my_heavy_function)
        1    0.000    0.800  json.py:456(load)
   100000    0.300    0.500  test.py:25(inner_loop)

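Column	Meaning
ncalls	Number of calls
tottime	Time spent in the function itself (excluding sub-calls)
cumtime	Cumulative time (including sub-calls)
filename:lineno	Location in the source code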
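
The difference between tottime and cumtime is easiest to see on a tiny example where one function does nothing but call another. This is a minimal sketch, assuming Python 3.9 or newer for `Stats.get_stats_profile()`; `parent` and `child` are hypothetical functions.

```python
import cProfile
import pstats

def child():
    # All the real work happens here, so child() accumulates the self time.
    total = 0
    for i in range(50_000):
        total += i * i
    return total

def parent():
    # parent() does almost nothing itself; it only calls child() three times.
    results = []
    for _ in range(3):
        results.append(child())
    return results

profiler = cProfile.Profile()
profiler.enable()
parent()
profiler.disable()

# get_stats_profile() (Python 3.9+) maps function names to records
# that expose ncalls, tottime, and cumtime as attributes.
profiles = pstats.Stats(profiler).get_stats_profile().func_profiles
for name in ("parent", "child"):
    p = profiles[name]
    print(f"{name:>6}: ncalls={p.ncalls} tottime={p.tottime:.3f} cumtime={p.cumtime:.3f}")
```

You should see that `parent` has a cumtime at least as large as `child`'s but a tottime close to zero: a high cumtime with a low tottime means the cost lives in something the function calls, not in the function itself.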

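Hotspot analysis

# Find the functions with the most self time
stats.sort_stats('tottime')
stats.print_stats(10)  # top 10 functions by self time

# Find the most frequently called functions
stats.sort_stats('ncalls')
stats.print_stats(10)  # top 10 functions by call count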
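
Profile data can also be saved to disk and re-sorted later without re-running the program. The sketch below uses `dump_stats()`, `strip_dirs()`, and a regex filter on `print_stats()`; the workload (`count_words`, `tokenize`) and the file name `profile.out` are assumptions chosen for illustration.

```python
import cProfile
import io
import os
import pstats
import tempfile

def tokenize(text):
    return text.split()

def count_words(text):
    counts = {}
    for word in tokenize(text):
        counts[word] = counts.get(word, 0) + 1
    return counts

profiler = cProfile.Profile()
profiler.enable()
count_words("spam eggs spam " * 50_000)
profiler.disable()

# Save the raw data so it can be analyzed later in a separate session.
path = os.path.join(tempfile.mkdtemp(), "profile.out")
profiler.dump_stats(path)

out = io.StringIO()
stats = pstats.Stats(path, stream=out)      # load the saved file
stats.strip_dirs()                          # drop directory prefixes from file names
stats.sort_stats(pstats.SortKey.TIME)       # same ordering as 'tottime'
stats.print_stats("count_words|tokenize")   # regex filter: only matching rows
report = out.getvalue()
print(report)
```

Passing a string to `print_stats()` treats it as a regular expression matched against each row, which is handy for hiding built-ins and library internals.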

🎯 line_profiler (line-level profiling)

pip install line_profiler

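# test.py
@profile  # add the decorator (kernprof injects `profile` at run time)
def slow_function():
    data = range(10000)
    result = []
    for x in data:
        if x % 2 == 0:
            result.append(x * x)
        else:
            result.append(x)
    return result

slow_function()  # call the function so kernprof has something to record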

kernprof -l -v test.py

Timer unit: 1e-06 s

File: test.py
Function: slow_function at line 2

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     2                                           @profile
     3                                           def slow_function():
     4         1       3000.0   3000.0     16.7      data = range(10000)
     5         1       1000.0   1000.0      5.6      result = []
     6     10000       4000.0      0.4     22.2      for x in data:
     7     10000       5000.0      0.5     27.8          if x % 2 == 0:
     8      5000       3000.0      0.6     16.7              result.append(x * x)
     9                                                   else:
    10      5000       2000.0      0.4     11.1              result.append(x)

⚡ perf (the Linux kernel profiler)

# On Linux, use perf
perf record -g python your_script.py
perf report

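perf can also report low-level hardware information such as CPU cache misses and branch mispredictions (for example via perf stat). Note that by default perf sees the interpreter's C functions rather than your Python functions; CPython 3.12 and later can expose Python frames to perf when started as python -X perf.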

🧠 Common optimization techniques

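Technique 1: Use local variables

import math

# Slow: looks up the global name `math` and its attribute `sin` on every iteration
def slow():
    for i in range(10**6):
        math.sin(i)

# Fast
def fast():
    sin = math.sin  # bind to a local variable to skip the repeated lookups
    for i in range(10**6):
        sin(i)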
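
Whether this trick is worth it depends on your interpreter version, so it is best checked rather than assumed. The following sketch times both variants with `timeit.repeat`; the function names and the value of `n` are arbitrary choices for this example, and on recent CPython releases the gap may be small.

```python
import math
import timeit

def with_global_lookup(n):
    total = 0.0
    for i in range(n):
        total += math.sin(i)   # resolves `math`, then `.sin`, on every pass
    return total

def with_local_alias(n):
    sin = math.sin             # resolved once, then read as a fast local
    total = 0.0
    for i in range(n):
        total += sin(i)
    return total

n = 100_000
t_global = min(timeit.repeat(lambda: with_global_lookup(n), number=5, repeat=3))
t_local = min(timeit.repeat(lambda: with_local_alias(n), number=5, repeat=3))
print(f"global lookup: {t_global:.4f}s  local alias: {t_local:.4f}s")
```

Taking the minimum of several repeats reduces the influence of background noise on the comparison.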

Technique 2: Prefer list comprehensions to explicit loops

data = range(1000)  # example input

# Slow
result = []
for x in data:
    result.append(x * 2)

# Fast
result = [x * 2 for x in data]
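
One way to see why the comprehension tends to be faster on CPython is to compare the bytecode of the two versions: the comprehension appends with a dedicated `LIST_APPEND` instruction, while the loop has to look up and call `result.append` each time. This is a rough sketch that assumes CPython; `opnames` is a hypothetical helper, and exact opcode names other than `LIST_APPEND` vary between versions.

```python
import dis

def with_loop(values):
    result = []
    for x in values:
        result.append(x * 2)
    return result

def with_comprehension(values):
    return [x * 2 for x in values]

def opnames(code):
    """Collect opcode names from a code object and any code objects nested in it."""
    names = [ins.opname for ins in dis.get_instructions(code)]
    for const in code.co_consts:
        if hasattr(const, "co_code"):  # nested code object (a comprehension before 3.12)
            names.extend(opnames(const))
    return names

loop_ops = opnames(with_loop.__code__)
comp_ops = opnames(with_comprehension.__code__)

print("loop uses LIST_APPEND:", "LIST_APPEND" in loop_ops)
print("comprehension uses LIST_APPEND:", "LIST_APPEND" in comp_ops)
```

Bytecode only explains where the saving comes from; how large it is for your data is still a question for timeit.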

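Technique 3: Use __slots__

import sys

# Without __slots__
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

# With __slots__ (given its own name so the two classes can be compared)
class SlotPoint:
    __slots__ = ('x', 'y')
    def __init__(self, x, y):
        self.x = x
        self.y = y

# __slots__ saves memory, and attribute access is slightly faster
p1 = Point(1, 2)
p2 = SlotPoint(1, 2)
print(sys.getsizeof(p1.__dict__))  # each Point carries a per-instance __dict__
print(hasattr(p2, '__dict__'))     # False: SlotPoint instances have no __dict__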
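
The saving only becomes noticeable when you create many instances. A rough way to measure it is with the standard-library `tracemalloc` module (used here purely as a measuring tool; it is not one of the profilers this lecture covers). `PlainPoint`, `SlotPoint`, and `bytes_for` are hypothetical names, and the exact numbers depend on your Python version.

```python
import tracemalloc

class PlainPoint:
    def __init__(self, x, y):
        self.x = x
        self.y = y

class SlotPoint:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y

def bytes_for(cls, count=50_000):
    """Approximate bytes allocated while building `count` instances of cls."""
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    points = [cls(i, i) for i in range(count)]  # kept alive until after the measurement
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before

plain_bytes = bytes_for(PlainPoint)
slot_bytes = bytes_for(SlotPoint)
print(f"plain: {plain_bytes:,} bytes  slotted: {slot_bytes:,} bytes")
```

Keep in mind the trade-off: a slotted class cannot gain new attributes at run time, so it suits small, fixed-shape record types best.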

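Technique 4: Use built-in functions

data = list(range(1_000_000))  # example input

# Slow: hand-written loop
total = 0
for x in data:
    total += x

# Fast: built-in function
total = sum(data)

# Fast: NumPy vectorization (pays off when the data already lives in an array;
# calling np.sum() on a plain list first has to convert the list)
import numpy as np
arr = np.array(data)
total = arr.sum()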

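Technique 5: Use generators

# Memory-hungry: builds the whole list up front
def squares_list(n):
    return [x * x for x in range(n)]

# Lazy: produces values on demand
def gen(n):
    for x in range(n):
        yield x * x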
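
The main benefit of a generator is memory rather than raw speed, which a quick size check makes visible. This sketch uses a generator expression in place of the `gen` function above; the value of `n` is an arbitrary choice.

```python
import sys

n = 1_000_000
as_list = [x * x for x in range(n)]   # every element exists in memory at once
as_gen = (x * x for x in range(n))    # elements are produced one at a time

list_size = sys.getsizeof(as_list)    # size of the list object itself (not its elements)
gen_size = sys.getsizeof(as_gen)      # small, and independent of n
print(f"list: {list_size:,} bytes  generator: {gen_size:,} bytes")

# A generator can be consumed only once and supports neither len() nor indexing.
total = sum(as_gen)
leftover = list(as_gen)               # already exhausted, so this is empty
print(total, leftover)
```

If you need to iterate over the values more than once or index into them, a list is still the right choice.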

📊 timeit (precise timing)

import timeit

# Time a small snippet
result = timeit.timeit(
    '[x*2 for x in range(1000)]',
    number=10000
)
print(f'List comprehension: {result:.4f}s')

# Time a snippet that needs setup
result = timeit.timeit(
    's.add(x)',
    setup='s = set(); x = 1',  # x must be defined in setup, or s.add(x) raises NameError
    number=10000
)
print(f'set.add: {result:.4f}s')
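
For comparisons, it helps to let timeit choose the loop count and to repeat each measurement. Here is one possible sketch comparing a list comprehension with `map()`, using `Timer.autorange()` and `Timer.repeat()`; the `double` helper and the size of `data` are assumptions made for the example.

```python
import timeit

data = list(range(1_000))

def double(x):
    return x * 2

candidates = {
    "list comprehension": lambda: [x * 2 for x in data],
    "map + function": lambda: list(map(double, data)),
}

best_per_call = {}
for label, func in candidates.items():
    timer = timeit.Timer(func)
    loops, _ = timer.autorange()                 # pick a loop count that runs for >= 0.2 s
    runs = timer.repeat(repeat=3, number=loops)  # three independent measurements
    best_per_call[label] = min(runs) / loops     # the minimum is least affected by noise
    print(f"{label:>20}: {best_per_call[label] * 1e6:.1f} µs per call")
```

Dividing by the loop count gives a per-call figure, which is easier to compare than the raw totals when the two candidates end up with different loop counts.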

🔬 memory_profiler (memory profiling)

pip install memory_profiler

@profile  # add the decorator
def memory_intensive():
    data = [x * x for x in range(10**7)]
    return data

memory_intensive()  # call the function so the profiler has something to measure

python -m memory_profiler test.py

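💡 Homework

Profile one of your own Python programs with cProfile and find its hotspots
Use line_profiler to find the slowest line inside a loop
Use timeit to compare the performance of a list comprehension and map()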

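🎯 Summary

cProfile: the most commonly used profiler; sort its report by cumulative, tottime, or ncalls.

line_profiler: line-level profiling that pinpoints the cost of each individual line.

Optimization techniques: local variables, list comprehensions, __slots__, built-in functions, generators.

timeit: precise timing for measuring small code snippets.

memory_profiler: memory profiling to find the biggest memory consumers.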
