Code Collector Technical Tutorials 2023-12-05

Learning Python 3 Web Scraping: A Summary of Common Errors (continuously updated)

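1 Basic mistakes (typos and the like)

1.1 NameError:

1.2 AttributeError: misspelled attribute

2 Usage errors (misusing types or attributes)

2.1 TypeError: for example, a string concatenation error

2.2 AttributeError: using an attribute the object does not have

3 Module-related errors

3.1 ModuleNotFoundError: module not found

3.2 Errors raised by a module, such as OSError: [Errno 22] Invalid argument: from os

3.3 requests problem: requests.exceptions.InvalidSchema (invalid schema)

4 SyntaxError:

4.1 Function syntax error (missing parentheses)

4.2 Function syntax error (missing :)

4.3 String error: SyntaxError: unterminated string literal

4.4 ValueError: bad value or argument

5 Formatting errors

5.1 IndentationError:

5.2 Syntax error caused by spaces in copied code: SyntaxError: invalid non-printable character U+00A0

6 Not errors: warnings and reminders

6.1 Warning when BeautifulSoup(html1,"lxml") is called without the parser argument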

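1 Basic mistakes (typos and the like)

1.1 NameError:

print was mistyped as priint

NameError: name 'priint' is not defined. Did you mean: 'print'?

1.2 AttributeError: misspelled attribute

requests.get was mistyped as requests.gat

AttributeError: module 'requests' has no attribute 'gat'. Did you mean: 'get'?

Python even suggests the likely fix (the "Did you mean" hint appears in Python 3.10 and later).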
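To see both typo errors side by side, here is a small sketch that triggers them on purpose and captures the messages. It uses the standard-library math module as a stand-in for requests (an assumption, so the snippet runs without any third-party install), and sqt is a deliberate misspelling of sqrt.

```python
import math  # stand-in for requests, chosen only so the sketch needs no extra install

# Misspelled function name: Python cannot find anything called priint.
try:
    priint("hello")  # typo on purpose, should be print
except NameError as exc:
    name_error_message = str(exc)

# Misspelled attribute: the module exists, but it has no attribute sqt.
try:
    math.sqt(4)  # typo on purpose, should be math.sqrt
except AttributeError as exc:
    attribute_error_message = str(exc)

print(name_error_message)
print(attribute_error_message)
```

The "Did you mean" suggestion is added when the traceback is displayed, so it does not show up in str(exc) here.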

2 Usage errors (misusing types or attributes)

2.1 TypeError: for example, a string concatenation error

My original code contained this line:

print ("Status code returned by this page: "+res.status_code)

Running it raises:

TypeError: can only concatenate str (not "int") to str

res.status_code is an integer, and + can only join a string to another string, so converting res.status_code with str() fixes it.

Changed to:

print ("Status code returned by this page: "+str(res.status_code))
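Besides str(), there are a couple of other common ways to put a number into a message. A minimal sketch, assuming a hard-coded status_code of 200 in place of the real res.status_code:

```python
status_code = 200  # hypothetical stand-in for res.status_code

# Option 1: convert explicitly, as in the fix above.
line_a = "Status code returned by this page: " + str(status_code)

# Option 2: an f-string formats the integer for you.
line_b = f"Status code returned by this page: {status_code}"

# Option 3: pass the pieces to print() separately; print converts each one
# and puts a space between them.
print("Status code returned by this page:", status_code)

print(line_a)
print(line_b)
```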

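2.2 AttributeError: using an attribute the object does not have

AttributeError: 'str' object has no attribute 'text'

Cause

print (res.text) was in effect print (html1.text)

because at that point res = html1 = """ … """, a plain string literal

so the call amounted to print(string.text)

A str object has no .text attribute.

.text only works when the object is not a string but the response that requests.get() returns for a web page.

So this form is correct: print(requests.get(url1).text)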
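When the same variable sometimes holds raw HTML and sometimes a response, one defensive option is to check the type before touching .text. This is only a sketch: get_html_text is a made-up helper name, and SimpleNamespace is used as a fake response so the example runs offline.

```python
from types import SimpleNamespace


def get_html_text(obj):
    """Return the HTML as a str, whether obj is a str or a response-like object."""
    if isinstance(obj, str):
        return obj  # already text, there is no .text to read
    return obj.text  # response-like object, e.g. what requests.get() returns


html1 = "<html><body>hello</body></html>"
fake_response = SimpleNamespace(text=html1)  # hypothetical stand-in for requests.get(url1)

print(get_html_text(html1))
print(get_html_text(fake_response))
```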

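3 Module-related errors

3.1 ModuleNotFoundError: module not found

Error message: ModuleNotFoundError: No module named 'bs4'

A module has to be installed before it can be imported (bs4 is provided by the beautifulsoup4 package).

Importing a module that has not been installed raises this error.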
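To check up front whether a module is importable, the standard library offers importlib.util.find_spec. A minimal sketch; is_installed is a helper name invented for this example:

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


for name in ("json", "bs4"):
    if is_installed(name):
        print(name, "is available")
    else:
        print(name, "is missing, install it with pip first")
```

For bs4 the install command is pip install beautifulsoup4, since the package name differs from the import name.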

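3.2 Errors raised by a module, such as OSError: [Errno 22] Invalid argument: from os

Error: OSError: [Errno 22] Invalid argument:

Wrong

path1="E:\work\FangCloudV2\personal_space\2learn\python3\html0003.html"

soup1=BeautifulSoup(open(path1))

Correct

path1=r"E:\work\FangCloudV2\personal_space\2learn\python3\html0003.html"

soup1=BeautifulSoup(open(path1))

In an ordinary string literal the backslash \ starts an escape sequence, so a long Windows path such as path1 can be silently altered: here \2 was read as the control character \x02. Prefixing the literal with r makes it a raw string, in which backslashes are kept exactly as written.

Full error message

OSError: [Errno 22] Invalid argument: 'E:\\work\\FangCloudV2\\personal_space\x02learn\\python3\\html0003.html'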
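The effect of the r prefix is easy to check with repr(). The sketch below uses only a shortened fragment of the path (an assumption, to keep the example free of other escape sequences):

```python
plain = "personal_space\2learn"   # \2 is an octal escape for the character \x02
raw = r"personal_space\2learn"    # raw string: backslash and 2 stay as two characters

print(repr(plain))  # 'personal_space\x02learn'
print(repr(raw))    # 'personal_space\\2learn'

# The control character is present only in the non-raw version.
print("\x02" in plain, "\x02" in raw)  # True False
```

Writing the path with forward slashes is another common way to sidestep the problem, since open() on Windows accepts them too.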

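3.3 requests problem: requests.exceptions.InvalidSchema (invalid schema)

Wrong code 1

print (html1.text)

In this example html1 was already a string holding """ … """ content, not a page fetched from the web.

So string.text fails: print (html1.text) raises the AttributeError described in 2.2.

The InvalidSchema exception itself comes from handing that same HTML string to requests where a URL is expected. The message below quotes the whole HTML text as the "URL" for which no connection adapter was found: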

requests.exceptions.InvalidSchema: No connection adapters were found for '<html><head><title>The Dormouse\'s story</title></head>\n<body>\nThe Dormouse\'s story\n\nOnce upon a time there were three little sisters; and their names were\n<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,\n<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and\n<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;\nand they lived at the bottom of a well.\n\n…\n'
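A cheap guard against this mix-up is to check that a value really looks like an http(s) URL before passing it to requests. A minimal sketch using only the standard library; looks_like_http_url is a helper name made up for this example, and the html1 value is a shortened stand-in for the real string:

```python
from urllib.parse import urlparse


def looks_like_http_url(value):
    """Return True only for strings with an http/https scheme and a host."""
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


html1 = '<html><a href="http://example.com/elsie" id="link1">Elsie</a></html>'
url1 = "http://example.com/elsie"

for candidate in (url1, html1):
    if looks_like_http_url(candidate):
        print("safe to fetch:", candidate)
    else:
        print("not a URL, parse it directly instead of fetching")
```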

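3.4 re.error: unbalanced parenthesis at position 7

This comes from the re regular expression library: a literal parenthesis in the pattern was not escaped, or one half of a pair of parentheses is missing.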
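Here is a small sketch of the error and two ways around it. The patterns are invented for illustration: either escape the literal parenthesis by hand, or let re.escape() do it when the whole text should be matched literally.

```python
import re

# A closing parenthesis with no opening partner makes the pattern invalid.
try:
    re.compile("price)")
except re.error as exc:
    pattern_error = str(exc)
    print("re.error:", pattern_error)

# Fix 1: escape the literal parenthesis by hand.
escaped = re.compile(r"price\)")
print(escaped.search("(price)") is not None)

# Fix 2: re.escape() escapes every special character in a literal text.
auto = re.compile(re.escape("f(x)"))
print(auto.search("y = f(x)") is not None)
```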

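4 SyntaxError:

4.1 Function syntax error (missing parentheses)

SyntaxError: Missing parentheses in call to 'print'. Did you mean print(…)?

Here too Python suggests the fix.

In Python 3 print is a function, so its arguments must be written inside parentheses: print()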
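This usually shows up when pasting Python 2 code into Python 3. A minimal sketch that feeds an old-style print statement to compile(), so the SyntaxError can be caught and inspected instead of stopping the script:

```python
old_style_source = 'print "hello"'  # Python 2 style statement, invalid in Python 3

try:
    compile(old_style_source, "<example>", "exec")
except SyntaxError as exc:
    message = exc.msg
    print("SyntaxError:", message)

# Python 3 style: print is called like any other function.
print("hello")
```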

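4.2 Function syntax error (missing :)

Python syntax relies on colons and indentation: a with statement must end with a colon before its indented body.

Wrong

with open(path1 ,"a") as f

SyntaxError: expected ':'

Correct

with open(path1 ,"a") as f :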
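For reference, a complete with block in append mode might look like the sketch below. The path1 here is a throwaway temp file created for the example, not the path from my original script.

```python
import os
import tempfile

# Hypothetical throwaway file so the example does not touch real data.
path1 = os.path.join(tempfile.mkdtemp(), "example.txt")

# The header line ends with a colon; the body is indented underneath it.
with open(path1, "a", encoding="utf-8") as f:
    f.write("first line\n")

# Mode "a" appends, so a second write keeps the earlier content.
with open(path1, "a", encoding="utf-8") as f:
    f.write("second line\n")

with open(path1, encoding="utf-8") as f:
    lines = f.read().splitlines()

print(lines)  # ['first line', 'second line']
```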

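4.3 String error: SyntaxError: unterminated string literal

SyntaxError: unterminated string literal

The string was never closed.

The underlying cause is that the string literal is ambiguous: Python cannot tell where it ends.

For example, the quotation marks do not come in matching pairs.

Or an escape sequence is used incorrectly: a backslash right before the closing quote escapes that quote.

In the second example below, writing \\ instead of \ solves the problem.

Examples

Wrong: print('I'm a student')

Correct: print('Im a student'), which simply drops the apostrophe; print("I'm a student") keeps it

Wrong: with open(loc1+str(page)+'\'+p_name, 'wb') as f:

Correct: with open(loc1+str(page)+'\\'+p_name, 'wb') as f: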
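Both cases have alternatives that avoid the quoting trouble altogether. A minimal sketch; the values of loc1, page and p_name are made up for illustration:

```python
import os

# An apostrophe inside a string: switch the outer quotes, or escape it.
s1 = "I'm a student"
s2 = 'I\'m a student'
print(s1 == s2)  # True

# Hypothetical values standing in for the variables in the example above.
loc1, page, p_name = "page_", 3, "cover.jpg"

# Manual concatenation needs the doubled backslash...
manual = loc1 + str(page) + "\\" + p_name

# ...while os.path.join inserts the platform's separator for you.
joined = os.path.join(loc1 + str(page), p_name)

print(manual)
print(joined)
```

With os.path.join there is no backslash in the source at all, so there is no closing quote to escape by accident.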