Code Collector (代码收藏家) Technical Tutorials 2023-12-08

Three Ways to Convert an HTML File to an Image with Python

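Converting the HTML file generated by pyecharts' Table component to an image

Automatically sending Python-processed data to a work group chat

Method 1: pyecharts' built-in snapshot_phantomjs

Prerequisites

Implementation

Method 2: aspose.words

Prerequisites

Implementation

Method 3: imgkit and pdfkit

Prerequisites

Implementation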

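Automatically sending Python-processed data to a work group chat

For work, I needed to convert the HTML file generated by pyecharts' Table component into an image. I tried three approaches, and only the third one succeeded.

Method 1: pyecharts' built-in snapshot_phantomjs

snapshot-phantomjs is a pyecharts extension that renders images through phantomjs. It supports formats such as png, jpeg, gif, pdf, and svg.

Prerequisites

Download and install phantomjs (download page: http://phantomjs.org/download.html). Pay attention to where phantomjs.exe is placed: most guides online assume it sits in a directory listed in the PATH environment variable.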
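A quick way to check the PATH requirement before going further is to ask Python where it would find the executable. The sketch below uses the standard library's shutil.which; the helper name find_executable is my own and is not part of pyecharts or phantomjs.

```python
import shutil
from typing import Optional


def find_executable(name: str) -> Optional[str]:
    """Return the full path of `name` if it is on PATH, otherwise None."""
    return shutil.which(name)


if __name__ == "__main__":
    location = find_executable("phantomjs")
    if location is None:
        print("phantomjs was not found on PATH; add its folder to PATH first")
    else:
        print(f"phantomjs found at {location}")
```

If the lookup returns None, adding the folder that contains phantomjs.exe to PATH and opening a new terminal is usually the first thing to try.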

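Install the snapshot-phantomjs package: pip install snapshot-phantomjs

At runtime you may hit the error OSError: ["ReferenceError: Can't find variable: echarts\n\n undefined:1\nnull\n"]. To work around it, download echarts.min.js (download page: https://echarts.apache.org/zh/download.html; I followed the dist link there to GitHub and downloaded echarts.min.js directly).

Implementation

Generate the HTML file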

from pyecharts import options as opts
from pyecharts.charts import Table
from pyecharts.render import make_snapshot
from snapshot_phantomjs import snapshot

from pyecharts.options import ComponentTitleOpts


table = Table()

headers = ["City name", "Area", "Population", "Annual Rainfall"]
rows = [
    ["Brisbane", 5905, 1857594, 1146.4],
    ["Adelaide", 1295, 1158259, 600.5],
    ["Darwin", 112, 120900, 1714.7],
    ["Hobart", 1357, 205556, 619.5],
    ["Sydney", 2058, 4336374, 1214.8],
    ["Melbourne", 1566, 3806092, 646.9],
    ["Perth", 5386, 1554769, 869.4],
]
table.add(headers, rows)
table.set_global_opts(
    title_opts=ComponentTitleOpts(title="Table - basic example", subtitle="I am a subtitle, and line breaks are supported")
)
table.render("table_base.html")

Convert the HTML file to an image format such as png

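import os
# Directory that holds the downloaded echarts.min.js, with a trailing slash
file_path = "{}/".format(os.path.dirname(os.path.abspath("/root/echarts.min.js")))
Table(init_opts=opts.InitOpts(js_host=file_path))  # raises the TypeError described below
make_snapshot(snapshot, table.render(), "table0.pdf")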

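This still failed, with TypeError: Table.__init__() got an unexpected keyword argument 'init_opts'
After some searching, I found that snapshot_phantomjs can export other chart types such as Bar, Grid, and Line this way, but the Table component is not supported.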
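For comparison, here is a minimal sketch of the same make_snapshot call with a Bar chart, one of the chart types this method does accept. It assumes phantomjs is on PATH and that echarts.min.js was downloaded to a local folder. The helper names js_host_for and export_bar_png, the output file name, and the sample data (populations taken from the table earlier in this post) are only for illustration.

```python
import os


def js_host_for(echarts_js_path: str) -> str:
    """Return the folder holding echarts.min.js, ending with a slash."""
    return "{}/".format(os.path.dirname(os.path.abspath(echarts_js_path)))


def export_bar_png(echarts_js_path: str, output: str = "bar0.png") -> None:
    """Render a small Bar chart and save it through phantomjs."""
    # Imported here so js_host_for stays usable without these packages.
    from pyecharts import options as opts
    from pyecharts.charts import Bar
    from pyecharts.render import make_snapshot
    from snapshot_phantomjs import snapshot

    # Unlike Table, Bar accepts init_opts, so js_host can point at the local copy.
    bar = Bar(init_opts=opts.InitOpts(js_host=js_host_for(echarts_js_path)))
    bar.add_xaxis(["Brisbane", "Adelaide", "Darwin"])
    bar.add_yaxis("Population", [1857594, 1158259, 120900])
    # bar.render() writes an HTML file and returns its path.
    make_snapshot(snapshot, bar.render(), output)
```

Calling export_bar_png("/root/echarts.min.js") should leave bar0.png in the working directory, provided the prerequisites above are in place.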

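Method 2: aspose.words

This uses the Aspose.Words for Python API, which reads and manipulates many kinds of documents from Python, such as Microsoft Word (DOC, DOCX, ODT), PDF, and web (HTML, Markdown) documents.

Prerequisites

Install the aspose-words package: pip install aspose-words

Implementation

Using jpeg as the example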

import aspose.words as aw
doc = aw.Document("table_base.html")
imageOptions = aw.saving.ImageSaveOptions(aw.SaveFormat.JPEG)
imageOptions.jpeg_quality = 10
imageOptions.horizontal_resolution = 72

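# Save each page as a separate JPG
for page in range(0, doc.page_count):
    extractedPage = doc.extract_pages(page, 1)
    extractedPage.save(f"C:\\Files\\Images\\Page_{page + 1}.jpg", imageOptions)

My first run failed with IndentationError: expected an indented block after 'for' statement on line 17, because the loop body was not indented; it has to be indented as shown above.
After some searching, I found that this approach only suits text-style pages, that is, HTML files that the Document class can load, such as papers.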

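Method 3: imgkit and pdfkit

These can convert HTML to an image or a PDF, with no restriction on the type of page.

Prerequisites

Install the imgkit and pdfkit packages: pip install imgkit and pip install pdfkit

Download and install wkhtmltopdf (download page: https://wkhtmltopdf.org/downloads.html). After installation there are two exe programs, wkhtmltoimage.exe and wkhtmltopdf.exe, used for converting to images and to PDF respectively.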
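Each of the two programs pairs with one of the two packages: imgkit drives wkhtmltoimage and pdfkit drives wkhtmltopdf. The sketch below shows how the two configuration objects could be built. The helpers resolve_tool and make_configs are hypothetical, and the default folder is simply the install location used later in this post, so adjust it to your own setup.

```python
import shutil


def resolve_tool(name: str, fallback: str) -> str:
    """Prefer an executable found on PATH; otherwise use the given path."""
    return shutil.which(name) or fallback


def make_configs(bin_dir: str = r"D:\Program Files\wkhtmltopdf\bin"):
    """Build the imgkit and pdfkit configuration objects."""
    # Imported here so resolve_tool works even without the two packages.
    import imgkit
    import pdfkit

    img_exe = resolve_tool("wkhtmltoimage", bin_dir + r"\wkhtmltoimage.exe")
    pdf_exe = resolve_tool("wkhtmltopdf", bin_dir + r"\wkhtmltopdf.exe")
    img_cfg = imgkit.config(wkhtmltoimage=img_exe)
    pdf_cfg = pdfkit.configuration(wkhtmltopdf=pdf_exe)
    return img_cfg, pdf_cfg
```

With pdf_cfg in hand, a call such as pdfkit.from_file("table_base.html", "table_base.pdf", configuration=pdf_cfg) would be the PDF counterpart of the image conversion; I only needed the image route for my own use case.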

Implementation

import imgkit
 
path_wkimg = r'D:\Program Files\wkhtmltopdf\bin\wkhtmltoimage.exe'  # path to the wkhtmltoimage tool
cfg = imgkit.config(wkhtmltoimage=path_wkimg)
# Optional parameters such as image size and language can be set through options
# options = {
#     "page-size": ""
# }
# Convert the HTML file to an image
imgkit.from_file('table_base.html', 'hellotable.jpg', config=cfg)

Output:
Loading page (1/2)
Rendering (2/2)
Done
True

The generated image can then be found in the directory the script was run from.
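The commented-out options dict in the script hints at tuning the output. As a hedged sketch: imgkit forwards an options dict to wkhtmltoimage as command-line flags, so keys such as format, quality, width, and encoding correspond to the tool's flags of the same names. The function names and default values below are my own choices, not something imgkit prescribes.

```python
def build_image_options(width: int = 1024, quality: int = 90) -> dict:
    """Options that imgkit passes through to wkhtmltoimage as --key value flags."""
    return {
        "format": "jpg",
        "width": str(width),
        "quality": str(quality),
        "encoding": "UTF-8",
    }


def html_to_image(html_path: str, out_path: str, exe_path: str) -> bool:
    """Convert an HTML file to an image using the given wkhtmltoimage executable."""
    # Imported here so build_image_options works even without imgkit installed.
    import imgkit

    cfg = imgkit.config(wkhtmltoimage=exe_path)
    return imgkit.from_file(
        html_path, out_path, options=build_image_options(), config=cfg
    )
```

Setting encoding explicitly tends to help when the table contains non-ASCII text, and width is worth adjusting if the rendered table leaves a lot of empty space.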