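So far, downloading files with Selenium driving Chrome in headless mode has been problematic: you click the download link, but no file is ever saved. As far as I know there is still no official fix for this bug (please correct me if it has been fixed), so you have to work out a download method yourself.
After some searching I finally found a workaround that lets headless Chrome download files into a directory you specify. I can no longer find the original source; the implementation is as follows:
Define a DriverBuilder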
import logging


class DriverBuilder():
    # The original snippet used self.logger without defining it; a
    # class-level logger is added here so the method runs as written.
    logger = logging.getLogger(__name__)

    def enable_download_in_headless_chrome(self, driver, download_dir):
        """
        There is currently a "feature" in chrome where headless does not allow file download:
        https://bugs.chromium.org/p/chromium/issues/detail?id=696481
        This method is a hacky work-around until chromedriver officially supports this.
        Requires chrome version 62.0.3196.0 or above.
        """
        # add missing support for chrome "send_command" to selenium webdriver
        driver.command_executor._commands["send_command"] = ("POST", '/session/$sessionId/chromium/send_command')
        params = {'cmd': 'Page.setDownloadBehavior',
                  'params': {'behavior': 'allow', 'downloadPath': download_dir}}
        command_result = driver.execute("send_command", params)
        self.logger.info("response from browser:")
        for key in command_result:
            self.logger.info("result:" + key + ":" + str(command_result[key]))
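The payload sent through send_command can be built and inspected on its own, without starting a browser. The sketch below is only an illustration: the helper name build_download_behavior_params is hypothetical (it is not part of Selenium or of the code above), and resolving the directory to an absolute path is a defensive assumption rather than something the original workaround requires.

```python
import os


def build_download_behavior_params(download_dir):
    """Return the send_command payload that allows downloads into download_dir.

    Hypothetical helper for illustration; not part of Selenium.
    """
    # Assumption: handing the browser an absolute path avoids any ambiguity
    # about which working directory a relative path would be resolved against.
    abs_dir = os.path.abspath(download_dir)
    return {
        "cmd": "Page.setDownloadBehavior",
        "params": {"behavior": "allow", "downloadPath": abs_dir},
    }


if __name__ == "__main__":
    print(build_download_behavior_params("downloads"))
```

The returned dictionary has the same shape as the params value passed to driver.execute("send_command", params) in the method above.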
Configure the download options
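import os
from selenium import webdriver

# Fragment from inside a class (note the self. references).
self.options = webdriver.ChromeOptions()
self.store_path = 'your_download_file'  # directory the files are saved to
if not os.path.exists(self.store_path):
    os.makedirs(self.store_path)
self.prefs = {'download.default_directory': self.store_path,
              'profile.default_content_settings.popups': 0}
self.options.add_experimental_option('prefs', self.prefs)
self.options.add_argument('--headless')
# options= replaces the deprecated chrome_options= keyword
self.driver = webdriver.Chrome(options=self.options)
# Apply the workaround from DriverBuilder to the new driver;
# without this call headless Chrome still refuses to download.
DriverBuilder().enable_download_in_headless_chrome(self.driver, self.store_path)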
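Once downloads are allowed, a script usually still needs to know when the file has actually arrived before it reads the file or quits the driver. The following is a minimal sketch of one way to do that by polling the download directory. The function wait_for_download is a hypothetical helper, not part of Selenium; it assumes the directory starts out empty and relies on Chrome giving unfinished downloads a .crdownload suffix.

```python
import os
import time


def wait_for_download(download_dir, timeout=30.0, poll_interval=0.5):
    """Poll download_dir until it holds a finished file, then return its path.

    Hypothetical helper. Assumes the directory starts out empty and that
    unfinished downloads carry a ".crdownload" suffix.
    Returns None if nothing finishes before the timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        names = os.listdir(download_dir) if os.path.isdir(download_dir) else []
        # Ignore files that Chrome is still writing.
        finished = [n for n in names if not n.endswith(".crdownload")]
        if finished:
            return os.path.join(download_dir, sorted(finished)[0])
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll_interval)
```

After triggering the download click, a call such as wait_for_download(self.store_path) would block until a file appears or the timeout expires. If the directory may already contain files, compare against a listing taken before the click instead.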