A script template for scraping web pages with the requests library in Python

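I recently revisited Python web scraping and learned how to capture network traffic (with Firefox and Fiddler). I am also recording here the script template for scraping a web page with the requests library in Python. The code comes from the related Beijing Institute of Technology online course on MOOC: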


The requests library

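import requests

url = "...."

try:
    # Send a browser-like User-Agent; some sites reject the library default
    kv = {"user-agent": "Mozilla/5.0"}
    r = requests.get(url, headers=kv)
    # Raise requests.HTTPError if the status code is 4xx or 5xx
    r.raise_for_status()
    # Decode the body with the encoding guessed from its content
    r.encoding = r.apparent_encoding
    print(r.text[1000:2000])
except requests.RequestException:
    # Covers connection errors, timeouts, invalid URLs and bad status codes
    print("Scrape failed")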
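For reuse across several pages, the template above can be wrapped in a function. The following is only a minimal sketch: the name `fetch_text`, the 30-second `timeout` default, and the choice to return `None` on failure are my own assumptions for illustration and do not come from the course code.

```python
import requests


def fetch_text(url, timeout=30):
    """Return the decoded body of `url`, or None if the request fails.

    `fetch_text` and the 30-second default are hypothetical choices
    made for this sketch, not part of the original template.
    """
    headers = {"user-agent": "Mozilla/5.0"}
    try:
        # timeout keeps the call from hanging forever on a silent server
        r = requests.get(url, headers=headers, timeout=timeout)
        r.raise_for_status()
        r.encoding = r.apparent_encoding
        return r.text
    except requests.RequestException:
        # Invalid URLs, connection errors, timeouts and 4xx/5xx all land here
        return None


if __name__ == "__main__":
    text = fetch_text("....")  # placeholder URL, as in the template above
    print(text[:1000] if text is not None else "Scrape failed")
```

Returning `None` instead of printing inside the function leaves it to the caller to decide how to report a failure, which might be easier to handle when looping over many URLs.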