[Python] The httpx Library in Detail: A Guide to HTTP Client Requests and Response Handling
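httpx is a modern HTTP client library for Python, designed for sending HTTP requests and handling responses. It offers both synchronous and asynchronous APIs and an interface largely compatible with requests, while adding features such as HTTP/2, connection pooling, streaming responses, and async support. httpx is a good fit for high-performance web clients, crawlers, API tooling, and applications that need asynchronous I/O.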
The sections below cover httpx in detail: its features, its usage, and some practical applications.
1. What httpx is for
Its API is similar to that of requests, so migration is easy.
2. Installation and requirements
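Core dependencies:
httpcore: core HTTP handling.
certifi: SSL certificate verification.
Optional dependencies: h2 (HTTP/2 support), brotli (Brotli compression).
Install:
pip install httpx
pip install "httpx[http2,cli]"
The async API needs an async environment (anyio or asyncio).
Verify the installation:
import httpx
print(httpx.__version__)  # example output: 0.27.2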
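3. Core features and usage
httpx provides a synchronous interface (httpx.Client) and an asynchronous interface (httpx.AsyncClient); Client and AsyncClient are its core classes. The main features and examples follow.
3.1 Basic requests (synchronous)
Send simple HTTP requests, much as you would with requests.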
import httpx

# Send a GET request
response = httpx.get("https://api.example.com/data")
print(response.status_code)  # e.g. 200
print(response.json())  # parsed JSON body

# Send a POST request with a JSON body
data = {"key": "value"}
response = httpx.post("https://api.example.com/submit", json=data)
print(response.text)
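Notes:
httpx.get, httpx.post, and the other top-level functions are convenience helpers that create a temporary client for each call.
response.json() parses a JSON response body.
These functions accept arguments such as params (query parameters), headers, and json.
3.2 Asynchronous requests
Use AsyncClient to make asynchronous HTTP requests.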
import asyncio

import httpx


async def main():
    async with httpx.AsyncClient() as client:
        response = await client.get("https://api.example.com/data")
        print(response.status_code)
        print(response.json())


# Run the async code
asyncio.run(main())
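Notes:
AsyncClient uses async/await syntax and suits highly concurrent workloads.
async with ensures the client is closed properly.
3.3 Using Client (connection pooling)
Reuse connections through Client or AsyncClient to improve performance.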
import asyncio

import httpx

# Synchronous client
with httpx.Client() as client:
    response1 = client.get("https://api.example.com/data1")
    response2 = client.get("https://api.example.com/data2")
    print(response1.json(), response2.json())


# Asynchronous client
async def main():
    async with httpx.AsyncClient() as client:
        response1 = await client.get("https://api.example.com/data1")
        response2 = await client.get("https://api.example.com/data2")
        print(response1.json(), response2.json())


asyncio.run(main())
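Notes:
Client maintains a connection pool, which cuts the overhead of repeatedly establishing connections.
The with statement ensures the client is closed properly.
3.4 Streaming responses
Handle large files or real-time data streams.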
import httpx

with httpx.stream("GET", "https://example.com/large-file") as response:
    for chunk in response.iter_bytes():
        print(chunk[:100])  # process each chunk

Asynchronous streaming response:

import asyncio

import httpx


async def main():
    async with httpx.AsyncClient() as client:
        async with client.stream("GET", "https://example.com/large-file") as response:
            async for chunk in response.aiter_bytes():
                print(chunk[:100])


asyncio.run(main())
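Notes:
stream enables streaming responses, which suits large downloads and real-time data.
iter_bytes / aiter_bytes read the response in chunks.
3.5 Custom configuration
httpx supports timeouts, headers, proxies, authentication, and more.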
import httpx

# Custom timeout and headers
headers = {"User-Agent": "MyApp/1.0"}
timeout = httpx.Timeout(10.0, connect=5.0)
response = httpx.get("https://api.example.com", headers=headers, timeout=timeout)
print(response.status_code)

# Proxy (the proxies= argument was deprecated in httpx 0.26 and removed in 0.28;
# proxy= takes a single proxy URL)
response = httpx.get("https://api.example.com", proxy="http://proxy.example.com:8080")

# Authentication
auth = httpx.BasicAuth(username="user", password="pass")
response = httpx.get("https://api.example.com", auth=auth)
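Notes:
Timeout: sets the connect, read, and other timeouts.
proxy (proxies in releases before 0.28): supports HTTP and HTTPS proxies.
auth: supports Basic authentication, Digest authentication, and others.
3.6 Error handling
Catch HTTP and network errors.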
import httpx

try:
    response = httpx.get("https://nonexistent.example.com", timeout=5.0)
    response.raise_for_status()  # raises HTTPStatusError on 4xx/5xx
except httpx.HTTPStatusError as e:
    print(f"HTTP error: {e}")
except httpx.RequestError as e:
    print(f"Request error: {e}")
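Notes:
HTTPStatusError: raised by raise_for_status() for 4xx/5xx responses.
RequestError: covers network errors such as timeouts and connection failures.
3.7 HTTP/2 support
Enable HTTP/2 to improve performance (requires the h2 package).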
import httpx

with httpx.Client(http2=True) as client:
    response = client.get("https://api.example.com")
    print(response.http_version)  # "HTTP/2" when the server negotiates it
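4. Performance and characteristics
The API is similar to requests, so migration cost is low.
Asynchronous support is built on asyncio or anyio.
HTTP/2 support requires h2.
5. Practical use cases
Example (asynchronous API crawler):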
import asyncio

import httpx


async def fetch_url(client, url):
    response = await client.get(url)
    return response.json()


async def main():
    urls = [
        "https://api.example.com/data1",
        "https://api.example.com/data2",
        "https://api.example.com/data3",
    ]
    async with httpx.AsyncClient() as client:
        tasks = [fetch_url(client, url) for url in urls]
        results = await asyncio.gather(*tasks)
        for url, result in zip(urls, results):
            print(f"{url}: {result}")


asyncio.run(main())
6. Deployment and CLI
httpx provides a command-line tool (requires installing httpx[cli]).
httpx https://api.example.com/data
Example output:
{
    "key": "value"
}
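7. Caveats
Asynchronous code must run via asyncio.run or inside an event loop.
Call raise_for_status() to check for HTTP errors.
Catch RequestError to handle network problems.
Set a timeout explicitly, for example:
httpx.get("https://api.example.com", timeout=5.0)
Install h2 to make sure HTTP/2 is available.
Check the version compatibility of httpcore and h2.
8. Comprehensive example
The following comprehensive example shows synchronous and asynchronous requests, a streaming download, and error handling: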
import asyncio

import httpx


# Synchronous request
def sync_request():
    try:
        response = httpx.get(
            "https://api.github.com/repos/python/cpython",
            headers={"User-Agent": "MyApp/1.0"},
            timeout=5.0,
        )
        response.raise_for_status()
        print("GitHub Repo:", response.json()["name"])
    except httpx.HTTPStatusError as e:
        print(f"HTTP error: {e}")
    except httpx.RequestError as e:
        print(f"Request error: {e}")


# Asynchronous streaming download (http2=True requires the h2 package)
async def download_file(url, filename):
    async with httpx.AsyncClient(http2=True) as client:
        async with client.stream("GET", url) as response:
            response.raise_for_status()
            with open(filename, "wb") as f:
                async for chunk in response.aiter_bytes():
                    f.write(chunk)
    print(f"Downloaded {filename}")


# Main entry point
async def main():
    # Run the synchronous request (this blocks the event loop while it runs)
    sync_request()

    # Download a file asynchronously
    await download_file(
        "https://example.com/sample.pdf",
        "sample.pdf",
    )

    # Request several APIs concurrently
    urls = [
        "https://api.example.com/data1",
        "https://api.example.com/data2",
    ]
    async with httpx.AsyncClient() as client:
        tasks = [client.get(url) for url in urls]
        responses = await asyncio.gather(*tasks, return_exceptions=True)
        for url, resp in zip(urls, responses):
            if isinstance(resp, httpx.Response):
                print(f"{url}: {resp.status_code}")
            else:
                print(f"{url}: Error - {resp}")


if __name__ == "__main__":
    asyncio.run(main())
9. Resources and documentation
Author: 彬彬侠