1. At the command line, create a new Scrapy project:

```shell
scrapy startproject python123demo
```
2. Generate a spider named `demo` inside the project:

```shell
cd python123demo
scrapy genspider demo python123.io
```

3. Edit the spider file demo.py so it saves the fetched page to disk:

```python
import scrapy

class DemoSpider(scrapy.Spider):
    name = "demo"
    start_urls = ['http://python123.io/ws/demo.html']

    def parse(self, response):
        # Use the last segment of the URL as the local filename
        fname = response.url.split('/')[-1]
        with open(fname, 'wb') as f:
            f.write(response.body)
        self.log('Save file %s.' % fname)
```
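The filename logic inside `parse()` can be exercised on its own, without Scrapy. This is a minimal sketch (the helper name `filename_from_url` is ours, not part of the tutorial) showing how `response.url.split('/')[-1]` picks out the last path segment:

```python
# Hypothetical helper, mirroring the expression used in parse()
def filename_from_url(url):
    # Split on '/' and keep the last segment, e.g. 'demo.html'
    return url.split('/')[-1]

print(filename_from_url('http://python123.io/ws/demo.html'))  # demo.html
```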
4. Run the spider:

```shell
scrapy crawl demo
```
Note that before running this command you must switch the cmd working directory to the newly created python123demo directory (the one containing the scrapy.cfg file), otherwise Scrapy reports an error.

5. The demo.py above can also be written in an equivalent, more explicit form:
```python
import scrapy

class DemoSpider(scrapy.Spider):
    name = "demo"
    # Difference from the simplified version above
    start_urls = ['http://python123.io/ws/demo.html']

    def start_requests(self):  # newly added
        urls = ['http://python123.io/ws/demo.html']
        for url in urls:
            # yield suspends execution here and resumes on the next request
            yield scrapy.Request(url=url, callback=self.parse)

    def parse(self, response):
        fname = response.url.split('/')[-1]
        with open(fname, 'wb') as f:
            f.write(response.body)
        self.log('Save file %s.' % fname)
```
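The comment about "suspending and resuming" refers to generator semantics: `start_requests` is a generator, so Scrapy pulls requests from it one at a time instead of building them all up front. A plain-Python sketch of the same behavior, with no Scrapy dependency (the second URL is a made-up example):

```python
# Minimal generator mimicking start_requests: each yield pauses the
# function until the consumer asks for the next value.
def start_requests():
    urls = ['http://python123.io/ws/demo.html',
            'http://python123.io/ws/demo2.html']  # hypothetical second URL
    for url in urls:
        yield url  # execution stops here until next() is called again

gen = start_requests()
print(next(gen))  # http://python123.io/ws/demo.html
print(next(gen))  # http://python123.io/ws/demo2.html
```

This lazy evaluation is why `start_requests` scales to very large URL lists: only one request object exists at a time.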
Reposted from: http://uyhpn.baihongyu.com/