Crawler with Scrapy: A Detailed Guide to Crawling Two URLs and Downloading Their Page Content with the Scrapy Framework
To be updated later...
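    import scrapy


    class DmozSpider(scrapy.Spider):
        # Spider name used on the command line: scrapy crawl dmoz
        name = "dmoz"
        # Restrict the crawl to the host used in start_urls
        allowed_domains = ["dmoztools.net"]
        # The comma between the two URLs is required; without it Python
        # joins the adjacent string literals into a single URL.
        start_urls = [
            "https://dmoztools.net/Computers/Programming/Languages/Python/Resources/",
            "https://dmoztools.net/Computers/Programming/Languages/Python/Books/",
        ]

        def parse(self, response):
            # Name the file after the last path segment: "Resources" or "Books"
            filename = response.url.split("/")[-2]
            # response.body is bytes, so the file is opened in binary mode
            with open(filename, "wb") as f:
                f.write(response.body)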
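
The file-naming rule in `parse` can be checked without running a crawl. The sketch below is only an illustration: `filename_from_url` mirrors the `split("/")[-2]` expression from the spider above, while `filename_from_path`, `START_URLS`, and `MERGED` are hypothetical names introduced here, not part of the original code. It also shows why each URL in a list needs a trailing comma: Python silently concatenates adjacent string literals.

```python
from urllib.parse import urlsplit

# The two start URLs used by the spider above.
START_URLS = [
    "https://dmoztools.net/Computers/Programming/Languages/Python/Resources/",
    "https://dmoztools.net/Computers/Programming/Languages/Python/Books/",
]

# Pitfall: with no comma between them, adjacent string literals are
# concatenated, so this list holds ONE long (invalid) URL, not two.
MERGED = [
    "https://dmoztools.net/Computers/Programming/Languages/Python/Resources/"
    "https://dmoztools.net/Computers/Programming/Languages/Python/Books/"
]


def filename_from_url(url):
    """Same rule as parse(): the second-to-last '/'-separated piece.

    This relies on the URL ending with a trailing slash.
    """
    return url.split("/")[-2]


def filename_from_path(url):
    """Hypothetical alternative: last non-empty path segment.

    Works with or without a trailing slash; falls back to "index"
    when the URL has no path segments at all.
    """
    segments = [s for s in urlsplit(url).path.split("/") if s]
    return segments[-1] if segments else "index"


if __name__ == "__main__":
    for url in START_URLS:
        print(filename_from_url(url), filename_from_path(url))
    print(len(START_URLS), len(MERGED))
```

Inside a Scrapy project, the spider itself would typically be started with `scrapy crawl dmoz`, which should leave two files named `Resources` and `Books` in the working directory, assuming both pages are still reachable.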