Building a free HTTP proxy with nginx and Heroku

2019-11-07 23:11:51, source: faicker

Run nginx as a forward proxy and, with the help of a free Heroku app that supports SSL, you get a working HTTP proxy.

First, create a free Heroku app for testing. Heroku assigns it a domain such as xxx.herokuapp.com and also serves it over SSL, which is the key point.

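A Heroku app cannot be used as a proxy directly, because a request to it roughly takes this path:
xxx.herokuapp.com resolves to Heroku's front-end nginx cluster, which then reverse-proxies to your own app. That nginx checks whether the Host header belongs to a Heroku app, and if it does not, it returns 404 Object Not Found.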
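
The Host check can be pictured with a small sketch. This is not Heroku's actual router, only an illustration that assumes the front end keeps a table of known app hostnames; `KNOWN_APPS` and `route` are hypothetical names.

```python
# Hypothetical illustration of Host-based routing; not Heroku's real router.
KNOWN_APPS = {"xxx.herokuapp.com": "backend-for-xxx"}  # assumed routing table


def route(host_header):
    """Return (status, backend) for a request carrying the given Host header."""
    host = host_header.split(":", 1)[0].lower()  # ignore an optional port
    backend = KNOWN_APPS.get(host)
    if backend is None:
        # A plain forward-proxy request carries the target site's Host,
        # which the front end does not know, so it is rejected.
        return 404, None
    return 200, backend


print(route("xxx.herokuapp.com"))  # known app: routed to its backend
print(route("www.google.com"))     # unknown Host: rejected with 404
```

This is why the target site has to travel somewhere other than the Host header.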

The idea is as follows.

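Embed the site you want to visit in the URL, for example http://xxx.herokuapp.com/p/www.google.com. The app then requests www.google.com and returns the result, including the response headers, so visiting http://xxx.herokuapp.com/p/www.google.com returns Google's content. The app could be hardened further to handle the Referer header, URLs inside the page and so on, until it is purely a Heroku app shell wrapped around another site's content.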
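
As a rough sketch of this URL embedding, the two directions of the mapping might look like the following. `PROXY_BASE`, `wrap` and `unwrap` are names invented for illustration and are not part of the app shown later.

```python
# Sketch only: PROXY_BASE and both helper names are assumptions for illustration.
PROXY_BASE = "http://xxx.herokuapp.com/p/"


def wrap(target):
    """Embed a target such as 'www.google.com/search' into the proxy URL."""
    return PROXY_BASE + target


def unwrap(path):
    """Turn the app-side path '/p/<host>/<rest>' back into the URL to fetch."""
    prefix = "/p/"
    if not path.startswith(prefix):
        raise ValueError("not a proxied path: %r" % path)
    return "http://" + path[len(prefix):]


print(wrap("www.google.com"))
print(unwrap("/p/www.google.com/search"))
```

The client side only ever talks to the Heroku hostname, and the app recovers the real target from the path.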

To be lazy and keep this simple, run nginx locally as a forward proxy that rewrites the Host header into the URL. (I first wrote a Flask program to do this, but nginx turned out to be simpler.)

Finally, configure the HTTP proxy in your browser and you are done.
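
The same proxy setting can be used from a script instead of a browser. The sketch below assumes the local nginx listens on 127.0.0.1:8080 and uses Python 3's `urllib.request`; `make_proxy_opener` is a hypothetical helper name. Only plain http:// URLs are routed through the proxy here, since this setup does not handle CONNECT tunnels for HTTPS.

```python
import urllib.request

# Assumption: the local nginx forward proxy listens on 127.0.0.1:8080.
LOCAL_PROXY = "http://127.0.0.1:8080"


def make_proxy_opener(proxy=LOCAL_PROXY):
    """Return an opener that sends plain-HTTP requests through the local proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy})
    return urllib.request.build_opener(handler)


opener = make_proxy_opener()
# With nginx and the Heroku app running, a request would look like:
#     body = opener.open("http://www.example.com/", timeout=10).read()
```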

PS:

One annoying thing about Heroku's free apps is that they are shut down automatically after being inactive for a while.

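Finally, here are the nginx configuration and the sample code.

The nginx server block is shown below; the rest of the configuration is omitted.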

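server {
    listen 8080;
    location / {
        # proxy_pass contains variables, so nginx needs a resolver for the upstream name
        resolver 223.5.5.5;
        # Move the requested host into the path handled by the Heroku app
        proxy_pass https://xxx.herokuapp.com/p/$http_host$uri$is_args$args;
    }
}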
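
To make the proxy_pass line concrete, here is a small Python sketch of the URL it assembles from the nginx variables. The function name `upstream_url` is made up for illustration, and details such as nginx's normalization of `$uri` are ignored.

```python
# Sketch of the URL that the proxy_pass line builds; names are illustrative.
UPSTREAM = "https://xxx.herokuapp.com/p/"


def upstream_url(http_host, uri, args=""):
    """Mimic $http_host$uri$is_args$args from the nginx config."""
    is_args = "?" if args else ""  # nginx sets $is_args to "?" only when args exist
    return UPSTREAM + http_host + uri + is_args + args


# A browser using the proxy sends: GET http://www.google.com/search?q=foo
print(upstream_url("www.google.com", "/search", "q=foo"))
```

The Host header of the original request ends up as the first path segment after /p/, which is exactly what the app's /p/ route expects.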

app.py uses Flask (adapted slightly from an example found online; other frameworks would work just as well).

cat Procfile

web: python app.py

cat requirements.txt (newer versions of Flask are available)

Flask==0.9
Jinja2==2.6
Werkzeug==0.8.3
wsgiref==0.1.2
requests==2.3.0

cat app.py

"""
A simple proxy server. Usage:
http://hostname:port/p/(URL to be proxied, minus protocol)
For example:
http://localhost:8080/p/www.google.com
"""
import os
from flask import Flask, request, abort, Response
from werkzeug.serving import WSGIRequestHandler
import requests
import logging
app = Flask(__name__.split('.')[0])
logging.basicConfig(level=logging.INFO)
LOG = logging.getLogger("main.py")
@app.route('/<path:url>')
def root(url):
    LOG.info("Root route, path: %s", url)
    # If referred from a proxy request, then redirect to a URL with the proxy prefix.
    # This allows server-relative and protocol-relative URLs to work.
    proxy_ref = proxy_ref_info(request)
    if proxy_ref:
        # The query string is not appended here: get_source_rsp() already
        # forwards it via request.args, so adding it would duplicate it.
        redirect_url = "%s/%s" % (proxy_ref[0], url)
        LOG.info("Redirecting referred URL to: %s", redirect_url)
        return proxy(redirect_url)
    abort(404)
@app.route('/p/<path:url>')
def proxy(url):
    """Fetches the specified URL and streams it out to the client.
    If the request was referred by the proxy itself (e.g. this is an image fetch for
    a previously proxied HTML page), then the original Referer is passed."""
    r = get_source_rsp(url)
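    LOG.info("Got %s response from %s", r.status_code, url)
    # requests has already decoded the body, so drop the headers that described
    # the upstream encoding and length and let the response set its own.
    # Header names are compared case-insensitively.
    skip = ('transfer-encoding', 'content-encoding', 'content-length')
    headers = dict((k, v) for k, v in r.headers.items() if k.lower() not in skip)
    return Response(r.content, status=r.status_code, headers=headers)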
def get_source_rsp(url):
    url = 'http://%s' % url
    LOG.info("Fetching %s", url)
    # Pass original Referer for subsequent resource requests
    proxy_ref = proxy_ref_info(request)
    headers = {"Referer": "http://%s/%s" % (proxy_ref[0], proxy_ref[1])} if proxy_ref else {}
    # Fetch the whole response body (stream=False) so it can be returned in one piece
    LOG.info("Fetching with headers: %s, %s", url, headers)
    return requests.get(url, stream=False, params=request.args, headers=headers)
def split_url(url):
    """Splits the given URL into a tuple of (protocol, host, uri)"""
    proto, rest = url.split(':', 1)
    rest = rest[2:].split('/', 1)
    host, uri = (rest[0], rest[1]) if len(rest) == 2 else (rest[0], "")
    return (proto, host, uri)
def proxy_ref_info(request):
    """Parses out Referer info indicating the request is from a previously proxied page.
    For example, if:
        Referer: http://localhost:8080/p/google.com/search?q=foo
    then the result is:
        ("google.com", "search?q=foo")
    """
    ref = request.headers.get('referer')
    if ref:
        _, _, uri = split_url(ref)
        if uri.find("/") < 0:
            return None
        first, rest = uri.split("/", 1)
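        # Compare against whole prefixes; `first in "pd"` was a substring test
        # that also matched "" and "pd".
        if first in ("p", "d"):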
            parts = rest.split("/", 1)
            r = (parts[0], parts[1]) if len(parts) == 2 else (parts[0], "")
            LOG.info("Referred by proxy host, uri: %s, %s", r[0], r[1])
            return r
    return None
@app.route('/')
def hello():
    return 'Hello World!'
if __name__ == '__main__':
    # Bind to PORT if defined, otherwise default to 5000.
    port = int(os.environ.get('PORT', 5000))
    WSGIRequestHandler.protocol_version = "HTTP/1.1"
    app.run(host='0.0.0.0', port=port, threaded=True)
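
To see what proxy_ref_info extracts without running the app, the same parsing can be sketched with the standard library. `parse_proxy_referer` is a hypothetical stand-alone helper, not part of app.py; it targets Python 3, uses `urlsplit` instead of the hand-written split_url, and only recognizes the /p/ prefix.

```python
from urllib.parse import urlsplit


def parse_proxy_referer(referer):
    """Return (host, uri) if the Referer points at a /p/ page, else None."""
    if not referer:
        return None
    parts = urlsplit(referer)
    path = parts.path.lstrip("/")
    if parts.query:
        path += "?" + parts.query
    first, _, rest = path.partition("/")
    if first != "p" or not rest:
        return None
    host, _, uri = rest.partition("/")
    return host, uri


print(parse_proxy_referer("http://localhost:8080/p/google.com/search?q=foo"))
print(parse_proxy_referer("http://localhost:8080/about"))
```

The first call yields the host and the remaining URI of the previously proxied page, which is what the root route needs in order to resolve server-relative links.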