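Selenium currently does not support setting a Referer header, and I do not know why.
If you have a similar requirement, how do you handle it?
Making the client see our Referer
The main idea is to route Selenium's requests through a proxy and add the Referer in the proxy. mitmproxy serves as the proxy server.
Implementation:
1) Selenium first visits the referer page; the request is marked through its path
2) The proxy receives the request and checks for the marker: if it is present, the proxy returns a response directly; otherwise it forwards the request as usual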
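The decision in step 2 can be sketched as a small pure function. This is only an illustration under assumptions: the names MARKER_PAGE, has_marker and decide are hypothetical, and the marker is taken to be the last path segment redirect.html, which is the marker used by the scripts later in this post.

```python
from urllib.parse import urlsplit

# Assumption: the marker is the page name used later in this post.
MARKER_PAGE = "redirect.html"


def has_marker(url):
    """Return True when the last path segment of the URL is the marker page."""
    path = urlsplit(url).path
    return path.rsplit("/", 1)[-1] == MARKER_PAGE


def decide(url):
    """Return "reply" if the proxy should answer by itself, else "forward"."""
    return "reply" if has_marker(url) else "forward"


if __name__ == "__main__":
    print(decide("http://website.com/redirect.html?host=domain.com"))
    print(decide("http://domain.com/?from=www.website.com"))
```

Checking the parsed path rather than the whole URL avoids a false match when the marker string only shows up in the query string.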
mitmproxy is used as the proxy server. The first attempt was to install it on Python 2.7:
yum -y groupinstall "Development tools"
yum -y install zlib-devel bzip2-devel openssl-devel ncurses-devel sqlite-devel readline-devel tk-devel
cd ~
wget https://www.python.org/ftp/python/2.7.9/Python-2.7.9.tgz
tar zxvf Python-2.7.9.tgz
cd Python-2.7.9
./configure --prefix=/usr/local
make && make altinstall
mv /usr/bin/python /usr/bin/python2.6.6.old
ln -s /usr/local/bin/python2.7 /usr/bin/python
vi /usr/bin/yum
Change #!/usr/bin/python to #!/usr/bin/python2.6, because yum requires Python 2.6
wget --no-check-certificate https://bootstrap.pypa.io/ez_setup.py
python2.7 ez_setup.py
easy_install-2.7 pip
# Note: from here on use pip2.7, not pip.
pip2.7 install netlib pyopenssl pyasn1 urwid lxml flask
pip2.7 install pil --allow-external PIL --allow-unverified PIL
pip2.7 install pyamf protobuf
pip2.7 install nose pathod countershape
pip2.7 install mitmproxy
After installation it turned out that mitmproxy still could not be used with Python 2.7, so Python 3.x has to be installed:
wget https://www.python.org/ftp/python/3.6.0/Python-3.6.0a1.tar.xz
tar xvf Python-3.6.0a1.tar.xz
cd Python-3.6.0a1
./configure
make && make install
wget --no-check-certificate https://bootstrap.pypa.io/ez_setup.py
python3 ez_setup.py
easy_install-* pip
pip3 install netlib pyopenssl pyasn1 urwid lxml flask
pip3 install pil --allow-external PIL --allow-unverified PIL
pip3 install pyamf protobuf
pip3 install nose pathod countershape
pip3 install mitmproxy
Configuring the upstream proxy
mitmproxy -b 192.168.109.135 -p 443 -U http://192.168.109.130:8080
How to change the upstream proxy dynamically: https://github.com/mitmproxy/mitmproxy/blob/master/examples/complex/change_upstream_proxy.py
Custom response content:
https://github.com/mitmproxy/mitmproxy/blob/master/examples/simple/send_reply_from_proxy.py
For example, to return a response directly when the cookie contains a specific string, use the following script, change.py:
from mitmproxy import http


def request(flow: http.HTTPFlow) -> None:
    # Reply from the proxy itself when the Cookie header carries the marker value.
    if 'cookie' in flow.request.headers:
        print(flow.request.headers['cookie'])
        if flow.request.headers['cookie'].find("key=website80") > -1:
            #if flow.request.pretty_url == "http://example.com/path":
            flow.response = http.HTTPResponse.make(
                200,  # (optional) status code
                b"<html><body><website80></website80></body></html>",  # (optional) content
                {"Content-Type": "text/html"}  # (optional) headers
            )
    else:
        print('no cookie header in request')
Run the script above: mitmdump -s change.py
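The substring test on the raw Cookie header also matches names such as otherkey=website80. A stricter check could parse the header first. The sketch below uses only the standard library; the function name has_marker_cookie and its default arguments are assumptions for illustration, not part of mitmproxy.

```python
from http.cookies import CookieError, SimpleCookie


def has_marker_cookie(cookie_header, name="key", value="website80"):
    """Return True if the Cookie header has a cookie `name` equal to `value`."""
    jar = SimpleCookie()
    try:
        jar.load(cookie_header)
    except CookieError:
        # An unparsable header is treated as "no marker".
        return False
    return name in jar and jar[name].value == value


if __name__ == "__main__":
    print(has_marker_cookie("key=website80; lang=en"))
    print(has_marker_cookie("otherkey=website80"))
```

Inside change.py the call would presumably be has_marker_cookie(flow.request.headers['cookie']) in place of the find() test.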
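To sum up, two scripts are needed. The first is a mitmdump script: if the request path contains redirect.html, the request does not need to go any further and a response is returned directly: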
# This script demonstrates how mitmproxy can switch to a second/different upstream proxy
# in upstream proxy mode.
#
# Usage: mitmdump -U http://default-upstream-proxy.local:8080/ -s change_upstream_proxy.py
#
# If you want to change the target server, you should modify flow.request.host and flow.request.port
from mitmproxy import http


def proxy_address(flow):
    # Poor man's loadbalancing: route every second domain through the alternative proxy.
    # Both branches currently return the same upstream proxy.
    if hash(flow.request.host) % 2 == 1:
        return ("192.168.109.130", 8080)
    else:
        return ("192.168.109.130", 8080)


def request(flow):
    if flow.request.pretty_url.find("redirect.html") > -1:
        # Marker found: answer from the proxy without contacting any server.
        flow.response = http.HTTPResponse.make(
            200,
            b"<html><body><website80></website80></body></html>",
            {"Content-Type": "text/html"}
        )
    else:
        print("go to now")
        if flow.request.method == "CONNECT":
            # If the decision is done by domain, one could also modify the server address here.
            # We do it after CONNECT here to have the request data available as well.
            return
        address = proxy_address(flow)
        if flow.live:
            flow.live.change_upstream_proxy_server(address)
The second script is the Selenium code:
# -*- coding: utf-8 -*-
import json
import random
import sys
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time
import os
from datetime import datetime
__author__ = 'Administrator'
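'''
Requirement: a redirect page under our control that can jump to any URL, website.com/redirect.html?host={host}
    and a list of domains that have webmaster statistics installed.
    Iterate over the domain list and visit website.com/redirect.html?host=domain.com for each entry, which then jumps to domain.com
    The code for redirect.html is given at the end of this note.
Purpose: generate webmaster statistics hits that carry a referer
Implementation: 1. Run a web server locally
    2. Point the hosts of all referers at the local machine, see http://www.xker.com/page/e2015/05/188691.html
    3. Read the file one line at a time, one domain per line, and open the corresponding referer page
    4. Once the page has loaded, insert the redirecting JavaScript code and execute it
Build a static page that uses JS to create an <a> tag with id=redirecta
<a href="http://www.website80.com/Index" target="_blank">go</a>
<script type="text/javascript">
var a = document.createElement("a");
var node = document.createTextNode("link");
a.appendChild(node);
a.setAttribute("href","http://www.website.com/Index");
//a.setAttribute("target","_blank"); // do not open a new tab, otherwise it cannot be closed
a.setAttribute("id","redirecta");
document.body.appendChild(a);
</script>
Cookie handling:
http://www.cnblogs.com/fnng/p/3269450.html
'''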
# Using the default Firefox profile would avoid a login step. On a different OS the path must be changed.
#newFirefox = webdriver.FirefoxProfile(r'C:\Users\Administrator\AppData\Roaming\Mozilla\Firefox\Profiles\wed8a4gm.default')
newFirefox = webdriver.FirefoxProfile()
# send all HTTP traffic through the mitmproxy instance
newFirefox.set_preference("network.proxy.type", 1)
newFirefox.set_preference("network.proxy.http", '192.168.109.135')
newFirefox.set_preference("network.proxy.http_port", 8080)
#newFirefox.set_preference("general.useragent.override","whater_useragent")
newFirefox.update_preferences()
#firefox_profile=newFirefox
browser = webdriver.Firefox(firefox_profile=newFirefox)
# must match the marker tag <website80> returned by the proxy script
key = 'website80'
fd = open(r'51.la.txt')
count = 0
c = {}
#c['domain'] = 'website80.com'
c['name'] = 'key'
c['value'] = key
c['path'] = '/'
print(datetime.now())
for line in fd:
    line = line.strip()
    if line:
        count = count + 1
        url = 'http://website.com/redirect.html?host=' + line
        try:
            if url.find("website.com") > -1:
                browser.delete_all_cookies()
                #browser.add_cookie(c)
                #print(browser.get_cookies())
            browser.get(url)
            bfind = True
            # wait until the page answered by the proxy has loaded
            success = WebDriverWait(browser, 200).until(lambda browser: browser.find_element_by_tag_name("html"))
            if success:
                # the marker tag shows that the page came from the proxy
                addjs = browser.find_element_by_tag_name(key)
                if addjs:
                    destUrl = 'http://%s?from=www.website.com' % line
                    js = 'var a = document.createElement("a");'
                    js = js + 'var node = document.createTextNode("link");'
                    js = js + 'a.appendChild(node);'
                    js = js + 'a.setAttribute("id","redirecta");'
                    js = js + 'a.setAttribute("href","%s");' % destUrl
                    #js = js + 'a.setAttribute("target","_blank");'
                    js = js + 'document.body.appendChild(a);'
                    #print(js)
                    browser.execute_script(js)
                    a = browser.find_element_by_id("redirecta")
                    if a:
                        if a.get_attribute("href") != "www.website.com":
                            print(a.get_attribute("href"))
                            # clicking the link makes the browser send the referer itself
                            a.click()
                    else:
                        links = browser.find_elements_by_tag_name("a")
                        for link in links:
                            print(link.get_attribute("href"))
            if bfind:
                time.sleep(10)
                # close the other windows by switching to each of them
                nowhandle = browser.current_window_handle
                allhandles = browser.window_handles
                if len(allhandles) > 1:
                    for handle in allhandles:
                        if handle != nowhandle:
                            browser.switch_to.window(handle)
                            try:
                                browser.close()
                            except Exception as e:
                                print(e)
                    browser.switch_to.window(nowhandle)
                try:
                    # alert_is_present() returns the alert if one is open, otherwise False
                    alert = EC.alert_is_present()(browser)
                    if alert:
                        alert.accept()
                except Exception as e:
                    pass
        except Exception as e:
            print(str(e))
        if count % 1000 == 0:
            print(count)
fd.close()
print(datetime.now())
browser.close()
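The script builds its JavaScript by pasting the domain straight into a string literal, which breaks as soon as a line of the domain list contains a quote. One way around this is to let json.dumps produce the JavaScript string literals. The helper name build_link_js below is hypothetical and the sketch covers only the string building, not the Selenium calls.

```python
import json


def build_link_js(dest_url, link_id="redirecta"):
    """Return JavaScript that appends an <a> element pointing at dest_url."""
    # json.dumps yields a double-quoted, escaped literal that is also valid JavaScript.
    return (
        'var a = document.createElement("a");'
        'a.appendChild(document.createTextNode("link"));'
        'a.setAttribute("id", %s);'
        'a.setAttribute("href", %s);'
        'document.body.appendChild(a);'
    ) % (json.dumps(link_id), json.dumps(dest_url))


if __name__ == "__main__":
    print(build_link_js("http://domain.com?from=www.website.com"))
```

The result would be passed to browser.execute_script(js) exactly like the hand-built string. Selenium's execute_script also accepts extra arguments that the script can read as arguments[0], which avoids the escaping altogether.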
Making server-side JS see our Referer
The proxy approach is used here as well, in every
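For the server side, a proxy script along the lines of mitmproxy's add_header.py example (listed in the references) could set the header on each request. The following is a minimal sketch and not the author's script: the file name add_referer.py and the REFERER value are assumptions. It relies only on the request(flow) hook and on flow.request.headers, both of which the scripts earlier in this post already use.

```python
# add_referer.py (hypothetical file name); run with: mitmdump -s add_referer.py

# Assumption: the referer the server should see; replace it with your own.
REFERER = "http://www.website.com/"


def request(flow):
    # mitmproxy calls this hook for every client request that passes through the proxy.
    # Only fill in the header when the browser did not send one itself.
    if "Referer" not in flow.request.headers:
        flow.request.headers["Referer"] = REFERER
```

A header set this way is visible only to the server. The page's own JavaScript still reads the browser's real referrer, which is why the client-side case above goes through redirect.html instead.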
References:
Adding a referer in Selenium
https://stackoverflow.com/questions/20732463/setting-referer-in-selenium
Complete code for adding a referer in Selenium
https://github.com/j-bennet/selenium-referer
Installing mitmproxy
http://www.cnblogs.com/ShepherdIsland/p/4239052.html
mitmproxy on GitHub
https://github.com/mitmproxy/mitmproxy
mitmproxy usage example
https://github.com/brianwrf/NagaScan
mitmproxy example script
https://github.com/mitmproxy/mitmproxy/blob/master/examples/simple/add_header.py
Firefox release archive
https://download-installer.cdn.mozilla.net/pub/firefox/releases/