Code Collector Technical Tutorials 2025-05-18

Fickling: detecting and analyzing security problems in Python pickle serialized data

1. Introduction

Project repository: https://github.com/trailofbits/fickling

Fickling's core features are:

Decompilation: converts pickle bytecode into readable Python code.
Static analysis: checks for potential security risks, such as malicious exec or os.system calls.
Bytecode rewriting: lets you inject custom Python code into a pickle file.

Fickling can be used both as a Python library and as a CLI (command-line interface).

2. Installation

Fickling is known to work on Python 3.8 through 3.11:

python -m pip install fickling

Fickling's PyTorch-related modules (such as fickling.pytorch, used later in this tutorial) also require PyTorch, which is an optional dependency of Fickling:

python -m pip install fickling[torch]

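3. Usage

3.1 Securing AI/ML environments

Fickling can scan models automatically. It hooks the pickle module and validates the imports made while a model is being loaded.

It works by checking each import against an allowlist of imports from ML libraries that are considered safe, and it blocks any file that contains other imports.

To enable the Fickling safety checks, run the following once in your process, before loading any AI/ML model: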

import fickling
# This sets global hooks on pickle
fickling.hook.activate_safe_ml_environment()

To remove the protection:

fickling.hook.deactivate_safe_ml_environment()

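If a model you use contains imports that Fickling does not allow but you still want to load it, you can use the also_allow parameter to permit additional imports for your specific use case: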

fickling.hook.activate_safe_ml_environment(also_allow=[
    "some.import",
    "another.allowed.import",
])
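
The allowlist idea behind this hook can be sketched with the standard library alone, using the documented technique of overriding pickle.Unpickler.find_class. This is not Fickling's implementation. ALLOWED_IMPORTS, AllowlistUnpickler, safe_loads and Payload are hypothetical names, and the one-entry allowlist is only for demonstration.

```python
import io
import pickle
from collections import OrderedDict

# Hypothetical allowlist of (module, name) pairs considered safe.
ALLOWED_IMPORTS = {("collections", "OrderedDict")}


class AllowlistUnpickler(pickle.Unpickler):
    """Refuse any import that is not explicitly allowlisted."""

    def find_class(self, module, name):
        if (module, name) not in ALLOWED_IMPORTS:
            raise pickle.UnpicklingError(f"blocked import: {module}.{name}")
        return super().find_class(module, name)


def safe_loads(data: bytes):
    """Load a pickle, allowing only allowlisted imports."""
    return AllowlistUnpickler(io.BytesIO(data)).load()


class Payload:
    """Hypothetical malicious object: unpickling it would call eval()."""

    def __reduce__(self):
        return (eval, ("[5, 6, 7, 8]",))


# An allowlisted import loads normally.
print(safe_loads(pickle.dumps(OrderedDict(a=1))))

# Any other import is rejected before the callable is ever resolved.
try:
    safe_loads(pickle.dumps(Payload()))
except pickle.UnpicklingError as exc:
    print(exc)  # blocked import: builtins.eval
```

In this sketch, extending ALLOWED_IMPORTS plays the role that also_allow plays in the Fickling call above.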

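3.2 Generic malicious file detection

Fickling can be integrated into a codebase to detect malicious files and stop them from being loaded.

The examples below show different ways to use Fickling to enforce safety checks on pickle files. Under the hood, it hooks the pickle library to add safety checks, so that loading a pickle file raises an UnsafeFileError exception if malicious content is detected in the file.

Method 1 (recommended): check the safety of every pickle file that is loaded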

import pickle
import fickling

# This enforces safety checks every time pickle.load() is used
fickling.always_check_safety()

# Attempt to load an unsafe file now raises an exception
with open("file.pkl", "rb") as f:
    try:
        pickle.load(f)
    except fickling.UnsafeFileError:
        print("Unsafe file!")

Method 2: use a context manager

import pickle
import fickling

with fickling.check_safety():
    # All pickle files loaded within the context manager are checked for safety
    try:
        with open("file.pkl", "rb") as f:
            pickle.load(f)  # pass the open file object, not the path
    except fickling.UnsafeFileError:
        print("Unsafe file!")

# Files loaded outside of the context manager are NOT checked
with open("file.pkl", "rb") as f:
    pickle.load(f)
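
The hook-and-restore pattern that such a context manager relies on can be sketched as follows. This is a standard-library illustration under stated assumptions, not Fickling's code. UnsafePickleError, checked_pickle_load and the deliberately crude rule "reject any pickle that imports a global" are all hypothetical.

```python
import contextlib
import io
import pickle
import pickletools


class UnsafePickleError(Exception):
    """Hypothetical stand-in for fickling.UnsafeFileError."""


def _looks_unsafe(data: bytes) -> bool:
    # Crude demonstration rule: any pickle that imports something is rejected.
    return any(op.name in ("GLOBAL", "STACK_GLOBAL")
               for op, _arg, _pos in pickletools.genops(data))


@contextlib.contextmanager
def checked_pickle_load():
    """Temporarily replace pickle.load with a version that checks first."""
    original_load = pickle.load

    def guarded_load(file, **kwargs):
        data = file.read()
        if _looks_unsafe(data):
            raise UnsafePickleError("pickle imports a global")
        return original_load(io.BytesIO(data), **kwargs)

    pickle.load = guarded_load       # install the hook
    try:
        yield
    finally:
        pickle.load = original_load  # always restore the original


class Payload:
    """Hypothetical malicious object: unpickling it would call eval()."""

    def __reduce__(self):
        return (eval, ("[5, 6, 7, 8]",))


with checked_pickle_load():
    print(pickle.load(io.BytesIO(pickle.dumps([1, 2, 3]))))  # [1, 2, 3]
    try:
        pickle.load(io.BytesIO(pickle.dumps(Payload())))
    except UnsafePickleError:
        print("Unsafe file!")
```

The finally clause is what makes loads outside the with block unchecked again: the original pickle.load is restored even if the body raises.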

Method 3: check and load a single file

# Use fickling.load() in place of pickle.load() to check safety and load a single pickle file
try:
    fickling.load("file.pkl")
except fickling.UnsafeFileError as e:
    print("Unsafe file!")

Method 4: check a pickle file's safety without loading it

# Perform a safety check on a pickle file without loading it
if not fickling.is_likely_safe("file.pkl"):
    print("Unsafe file!")

Accessing the safety analysis results: the details of Fickling's safety analysis are available from the raised exception:

>>> try:
...     fickling.load("unsafe.pkl")
... except fickling.UnsafeFileError as e:
...     print(e.info)

{
    "severity": "OVERTLY_MALICIOUS",
    "analysis": "Call to `eval(b'[5, 6, 7, 8]')` is almost certainly evidence of a malicious pickle file. Variable `_var0` is assigned value `eval(b'[5, 6, 7, 8]')` but unused afterward; this is suspicious and indicative of a malicious pickle file",
    "detailed_results": {
        "AnalysisResult": {
            "OvertlyBadEval": "eval(b'[5, 6, 7, 8]')",
            "UnusedVariables": [
                "_var0",
                "eval(b'[5, 6, 7, 8]')"
            ]
        }
    }
}
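
A report like this can be condensed for logging. The sketch below assumes that e.info is a plain dict shaped like the output above (if it were a JSON string, it would need json.loads first). The summarize helper is a hypothetical name, and the "analysis" text is omitted from the sample for brevity.

```python
# Sample report copied from the output shown above ("analysis" omitted).
info = {
    "severity": "OVERTLY_MALICIOUS",
    "detailed_results": {
        "AnalysisResult": {
            "OvertlyBadEval": "eval(b'[5, 6, 7, 8]')",
            "UnusedVariables": ["_var0", "eval(b'[5, 6, 7, 8]')"],
        }
    },
}


def summarize(info: dict) -> str:
    """Return the severity plus the names of the checks that fired."""
    checks = info.get("detailed_results", {}).get("AnalysisResult", {})
    names = ", ".join(sorted(checks)) or "none"
    return f"{info.get('severity', 'UNKNOWN')}: {names}"


print(summarize(info))  # OVERTLY_MALICIOUS: OvertlyBadEval, UnusedVariables
```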

3.3 Checking pickle files from the CLI

If you are working in a language other than Python, you can still use Fickling's CLI to safety-check pickle files: fickling --check-safety -p pickled.data

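More advanced usage:

Tracing pickle execution:

Fickling's CLI can safely trace the execution of the Pickle virtual machine without executing any malicious code: fickling --trace file.pkl

Pickle code injection:

Fickling can inject arbitrary code into a pickle file. The injected code runs every time the file is loaded: fickling --inject "print('Malicious')" file.pkl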
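
For the tracing feature described above, the standard library has a much simpler relative: pickletools.dis prints the opcodes of a pickle without executing them. It is a plain disassembly, not Fickling's trace, and it is shown here only to illustrate what can be learned about a pickle without running it.

```python
import io
import pickle
import pickletools

data = pickle.dumps([1, 2, 3, 4])

# Write the disassembly into a string buffer instead of stdout.
listing = io.StringIO()
pickletools.dis(data, out=listing)

# Each line shows a byte offset, the opcode name and its argument, if any.
print(listing.getvalue())
```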

Pickle decompilation:

Fickling can decompile a pickle file for further analysis:

>>> import ast, pickle
>>> from fickling.fickle import Pickled
>>> fickled_object = Pickled.load(pickle.dumps([1, 2, 3, 4]))
>>> print(ast.dump(fickled_object.ast, indent=4))
Module(
    body=[
        Assign(
            targets=[
                Name(id='result', ctx=Store())],
            value=List(
                elts=[
                    Constant(value=1),
                    Constant(value=2),
                    Constant(value=3),
                    Constant(value=4)],
                ctx=Load()))],
    type_ignores=[])
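
To read a dump like this, it helps to know which source code it corresponds to. The sketch below uses only the standard ast module (Python 3.9 or later for ast.unparse and the indent argument). It parses the equivalent assignment and converts the tree back to source. It does not touch Fickling, and the exact dump text can differ slightly between Python versions.

```python
import ast

# Build the AST for the statement that the decompiled pickle represents.
tree = ast.parse("result = [1, 2, 3, 4]")

# The dump should closely resemble the one printed above.
print(ast.dump(tree, indent=4))

# Turn the tree back into Python source code.
print(ast.unparse(tree))  # result = [1, 2, 3, 4]
```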

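PyTorch polyglots:

PyTorch has several file formats, and these can be used to build polyglot files, that is, files that can be validly interpreted as more than one file format. Fickling supports the following PyTorch file formats:
  1. PyTorch v0.1.1: Tar file with sys_info, pickle, storages, and tensors
  2. PyTorch v0.1.10: Stacked pickle files
  3. TorchScript v1.0: ZIP file with model.json
  4. TorchScript v1.1: ZIP file with model.json and attributes.pkl
  5. TorchScript v1.3: ZIP file with data.pkl and constants.pkl
  6. TorchScript v1.4: ZIP file with data.pkl, constants.pkl, and version set to 2 or higher (2 pickle files and a folder)
  7. PyTorch v1.3: ZIP file containing data.pkl (1 pickle file)
  8. PyTorch model archive format [ZIP]: ZIP file that includes Python code files and pickle files

For example, to identify the format of a saved model: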
```
>>> import torch
>>> import torchvision.models as models
>>> from fickling.pytorch import PyTorchModelWrapper
>>> model = models.mobilenet_v2()
>>> torch.save(model, "mobilenet.pth")
>>> fickled_model = PyTorchModelWrapper("mobilenet.pth")
>>> print(fickled_model.formats)
Your file is most likely of this format:  PyTorch v1.3
['PyTorch v1.3']
```
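
The formats listed above differ mainly in their outer container (Tar, ZIP or bare pickle) and in which members they hold. A rough first-pass check can be sketched with the standard library. This is not Fickling's detection logic. sniff_container and its labels are hypothetical, and the bare-pickle test only recognizes pickle protocol 2 or later.

```python
import io
import tarfile
import zipfile


def sniff_container(data: bytes) -> str:
    """Very rough guess at the outer container of a model file."""
    buf = io.BytesIO(data)
    if zipfile.is_zipfile(buf):
        with zipfile.ZipFile(buf) as zf:
            names = zf.namelist()
        if any(n.endswith("data.pkl") for n in names):
            return "ZIP with data.pkl"
        return "ZIP"
    buf.seek(0)
    try:
        with tarfile.open(fileobj=buf):
            return "TAR"
    except tarfile.TarError:
        pass  # not a tar archive; fall through to the pickle check
    if data[:1] == b"\x80":  # PROTO opcode, present from protocol 2 onward
        return "raw pickle"
    return "unknown"


# Build a tiny ZIP in memory that mimics the "data.pkl inside a ZIP" layout.
zip_buf = io.BytesIO()
with zipfile.ZipFile(zip_buf, "w") as zf:
    zf.writestr("archive/data.pkl", b"placeholder")
print(sniff_container(zip_buf.getvalue()))  # ZIP with data.pkl
```

Because the function returns on the first match, it reports only one container. A polyglot file is exactly the case where more than one of these tests would pass, so detecting one would require running every test instead of stopping early.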