Java Selenium: using a BrowserMobProxy proxy (options.setProxy)

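If you want to monitor network requests and modify response content from Java Selenium, the BrowserMobProxy proxy is a good choice. For how it works, look up how proxy servers operate in general. The Selenium 4 alpha releases have started to add similar functionality, but it is not yet very complete. BrowserMobProxy is also easy to use from code, so let's go straight to the code.
When integrated with Selenium, BrowserMobProxy has a fairly high performance overhead; I have not found any other problems so far. On Git I saw a 2.1.6 version, but it has not been published. The latest released version is 2.1.5, and its code dates from 2017.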

The code has been uploaded to Gitee (码云): Gitee link

package com.watchmen.selenium;
import java.util.List;
import org.openqa.selenium.Proxy;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chrome.ChromeOptions;
import io.netty.handler.codec.http.HttpRequest;
import io.netty.handler.codec.http.HttpResponse;
import net.lightbody.bmp.BrowserMobProxy;
import net.lightbody.bmp.BrowserMobProxyServer;
import net.lightbody.bmp.client.ClientUtil;
import net.lightbody.bmp.core.har.Har;
import net.lightbody.bmp.core.har.HarEntry;
import net.lightbody.bmp.core.har.HarNameValuePair;
import net.lightbody.bmp.core.har.HarRequest;
import net.lightbody.bmp.core.har.HarResponse;
import net.lightbody.bmp.filters.RequestFilter;
import net.lightbody.bmp.proxy.CaptureType;
import net.lightbody.bmp.util.HttpMessageContents;
import net.lightbody.bmp.util.HttpMessageInfo;
/**
 * @author kk
 * @Description Using the BrowserMobProxy proxy with Selenium
 */
public class SeleniumBrowserMobProxy {
	public static void main(String[] args) {
		String webDriverDir = "path/to/browser/driver";
		// Tell Selenium where the driver executable is
		System.setProperty("webdriver.chrome.driver", webDriverDir);
		BrowserMobProxy browserMobProxy = new BrowserMobProxyServer();
		browserMobProxy.start();
		// Capture both request and response bodies in the HAR. Note that
		// setHarCaptureTypes(...) replaces the whole set, so calling it
		// afterwards would discard REQUEST_CONTENT.
		browserMobProxy.enableHarCaptureTypes(CaptureType.REQUEST_CONTENT, CaptureType.RESPONSE_CONTENT);
		browserMobProxy.newHar("kk");
		Proxy seleniumProxy = ClientUtil.createSeleniumProxy(browserMobProxy);
		// Configure the browser options
		ChromeOptions options = new ChromeOptions();
		options.setProxy(seleniumProxy);
		options.setAcceptInsecureCerts(true);
		options.setExperimentalOption("useAutomationExtension", false);
		// Create the driver
		WebDriver driver = new ChromeDriver(options);
		try {
			// Listen for network requests
			browserMobProxy.addRequestFilter(new RequestFilter() {
				@Override
				public HttpResponse filterRequest(HttpRequest request, HttpMessageContents contents,
						HttpMessageInfo messageInfo) {
					// Print the URL requested by the browser and its Cookie header
					System.out.println(request.getUri() + " --->> " + request.headers().get("Cookie"));
					// Returning null lets the request continue unchanged
					return null;
				}
			});
			// Open the page
			driver.get("https://www.baidu.com/");
			// Read the captured traffic from the HAR
			Har har = browserMobProxy.getHar();
			List<HarEntry> entries = har.getLog().getEntries();
			for (HarEntry harEntry : entries) {
				HarResponse response = harEntry.getResponse();
				HarRequest request = harEntry.getRequest();
				String url = request.getUrl();
				System.out.println(url + " -> " + response.getStatus());
				List<HarNameValuePair> headers = request.getHeaders();
				for (HarNameValuePair harp : headers) {
					System.out.println(harp.toString());
				}
			}
		} finally {
			// Release the browser and the proxy port
			driver.quit();
			browserMobProxy.stop();
		}
	}
}
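The listing above only observes traffic. The introduction also mentions modifying response content, which BrowserMobProxy exposes through `addResponseFilter`. The following is a minimal sketch, not the author's code: the class name `ResponseRewriteSketch`, the helper `rewriteBody`, the `baidu.com` host check, and the HTML comment it injects are all assumptions made for illustration. It needs the `browsermob-core` dependency on the classpath.

```java
import net.lightbody.bmp.BrowserMobProxy;

public class ResponseRewriteSketch {

	// Hypothetical rewrite rule, for illustration only: append an HTML
	// comment just before the closing body tag.
	static String rewriteBody(String body) {
		return body.replace("</body>", "<!-- seen by proxy --></body>");
	}

	// Registers a response filter on an already created proxy.
	// ResponseFilter has a single method, so a lambda can be used.
	static void install(BrowserMobProxy proxy) {
		proxy.addResponseFilter((response, contents, messageInfo) -> {
			String contentType = contents.getContentType();
			// Only touch HTML responses from the assumed target host
			if (contentType != null && contentType.startsWith("text/html")
					&& messageInfo.getOriginalUrl().contains("baidu.com")) {
				contents.setTextContents(rewriteBody(contents.getTextContents()));
			}
		});
	}
}
```

Under these assumptions, calling `ResponseRewriteSketch.install(browserMobProxy)` before `driver.get(...)` would apply the rule to every matching page. For HTTPS pages the proxy has to decrypt the traffic, which is why the browser is started with `setAcceptInsecureCerts(true)`.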
Maven dependencies
		<dependency>
			<groupId>net.lightbody.bmp</groupId>
			<artifactId>browsermob-core</artifactId>
			<version>2.1.5</version>
		</dependency>
		<dependency>
			<groupId>net.lightbody.bmp</groupId>
			<artifactId>browsermob-legacy</artifactId>
			<version>2.1.5</version>
		</dependency>
		<dependency>
			<groupId>org.seleniumhq.selenium</groupId>
			<artifactId>selenium-java</artifactId>
			<version>4.0.0-alpha-7</version>
		</dependency>