Recently, I ran into a situation where I needed a piece of software running inside a corporate network to communicate with a backend service on the internet. The challenge was that, in order to reach the internet, the communication had to go through a forward proxy. However, the software did not have built-in proxy support. This post covers some options for solving that problem, although not all of them in the same depth.

Proxifying a connection in this context means giving a client component the ability to communicate through a forward proxy.

In order to proxify an HTTP connection, it helps to understand the differences between forward and reverse proxies, the HTTP CONNECT verb, and the role of the Host header when talking to a proxy (see Stack Overflow and RFC 7230).
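To make "speaking proxy" more concrete, the following minimal sketch builds the three request shapes involved: the origin-form request a client sends directly to a web server, the absolute-form request it sends to a forward proxy for plain HTTP, and the CONNECT request it sends to a forward proxy to open a tunnel. The function names are made up for illustration, and the requests carry only the bare minimum of headers.

```python
def direct_request(host: str, path: str = "/") -> bytes:
    """Origin-form request, as sent straight to the web server."""
    return f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n".encode("ascii")


def proxied_request(host: str, path: str = "/") -> bytes:
    """Absolute-form request, as sent to a forward proxy for plain HTTP."""
    return f"GET http://{host}{path} HTTP/1.1\r\nHost: {host}\r\n\r\n".encode("ascii")


def connect_request(host: str, port: int = 443) -> bytes:
    """CONNECT request asking a forward proxy to open a TCP tunnel to host:port."""
    return f"CONNECT {host}:{port} HTTP/1.1\r\nHost: {host}:{port}\r\n\r\n".encode("ascii")
```

A client without proxy support only ever produces the first form, which is why something between the client and the forward proxy has to bridge the gap.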

A simplified comparison of communication flows between a forward proxy and a reverse proxy:

forward vs reverse_proxy diagram

Forward Proxy

  • Traffic from and to the web server (HTTP request & response) goes through the proxy.
  • The proxy is usually non-transparent for the client, meaning the client is aware of the proxy.
  • Both transparent and non-transparent forward proxies exist; the latter is more common.

Reverse Proxy

  • Traffic from the web server (HTTP response) to the client may or may not go through the proxy.
  • The proxy is transparent for the client.
  • A load balancer is a common use case of a reverse proxy.

If the forward proxy is transparent, the client does not need to “speak proxy” and therefore does not need to be proxified.

If a client cannot talk to a forward proxy, it is possible to proxify its connection on either the network layer or the application layer.

Network Layer

Reverse Proxy

It is possible to put a reverse proxy between client and forward proxy in order to transparently proxify a connection:

proxification through reverse proxy

Imagine that “Web Server” is reachable at the domain name webserver.com.

In this example, Nginx is used as reverse proxy and requires minimal configuration:

server {
  listen 80;

  server_name webserver.com;

  access_log /var/log/nginx/access.log; # optional
  error_log /var/log/nginx/error.log; # optional

  location / {
    proxy_set_header Host webserver.com;  # proxification

    proxy_pass http://forward-proxy:8080; # configure forward proxy as upstream server
  }
}

On the client, it is necessary to redirect traffic intended for webserver.com to the reverse proxy. A simple way to achieve this is to force webserver.com to resolve to the IP address of the reverse proxy via the hosts file (/etc/hosts) of the client.
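As a quick sanity check of such an override, a small helper can look up which address a hosts file maps a name to. This is only a sketch: the helper name, the sample file content, and the address 10.0.0.10 for the reverse proxy are assumptions made for illustration.

```python
def resolve_from_hosts(hosts_text: str, hostname: str):
    """Return the first address a hosts file maps to hostname, or None."""
    wanted = hostname.lower()
    for line in hosts_text.splitlines():
        # Drop trailing comments, then split into "address name [alias ...]".
        entry = line.split("#", 1)[0].split()
        if len(entry) >= 2 and wanted in (name.lower() for name in entry[1:]):
            return entry[0]
    return None


# Hypothetical hosts file content; 10.0.0.10 stands in for the reverse proxy.
SAMPLE_HOSTS = """\
127.0.0.1   localhost
10.0.0.10   webserver.com   # reverse proxy
"""
```

On the client, calling resolve_from_hosts(open("/etc/hosts").read(), "webserver.com") should then return the address of the reverse proxy.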

All it takes to get the traffic through the forward proxy to the destination is setting the Host header: proxy_set_header Host webserver.com;.

This configuration would also work for an HTTPS connection between client and reverse proxy. However, it no longer works if the communication from the reverse proxy to the web server must be HTTPS as well. The reason is that almost all typical corporate forward proxies do SSL interception. To pass through the forward proxy, the HTTPS traffic from the reverse proxy must be tunneled using the HTTP CONNECT verb (RFC 7230). Nginx currently does not support that for communication with upstream servers (in this case, the forward proxy). Accordingly, this configuration does not work if the web server mandates HTTPS.
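For comparison, this is roughly what CONNECT tunneling looks like from a client that does support it. The sketch uses Python's standard http.client module; the function name is made up, and the proxy address in the usage comment reuses forward-proxy:8080 from the Nginx configuration above.

```python
import http.client


def fetch_via_connect(proxy_host, proxy_port, target_host, path="/", timeout=10):
    """GET https://target_host/path through an HTTP forward proxy using CONNECT."""
    # The TCP connection is opened to the proxy, not to the target.
    conn = http.client.HTTPSConnection(proxy_host, proxy_port, timeout=timeout)
    # Send "CONNECT target_host:443" first; the TLS handshake with the target
    # only starts after the proxy has confirmed the tunnel.
    conn.set_tunnel(target_host, 443)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()


# Usage (requires a reachable proxy):
# status, body = fetch_via_connect("forward-proxy", 8080, "webserver.com")
```

If the proxy refuses the tunnel, http.client raises an OSError before any TLS traffic is exchanged. With SSL interception in place, the client would additionally have to trust the CA used by the forward proxy.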

If the client intends to communicate with the web server using HTTPS, but the web server also supports HTTP, the following Nginx config does the job:

server {
  listen 443 ssl;

  server_name webserver.com;

  ssl_certificate /path/to/cert;
  ssl_certificate_key /path/to/key;

  access_log /var/log/nginx/access.log; # optional
  error_log /var/log/nginx/error.log; # optional

  location / {
    proxy_set_header Host webserver.com;  # proxification

    proxy_pass http://forward-proxy:8080; # configure forward proxy as upstream server
  }
}

Note that this is dangerous. The client thinks it establishes a secure connection to the web server, when in fact, it only talks to the reverse proxy in a secure way. The traffic between reverse proxy and web server remains unencrypted.

For this setup to work, the client must trust the CA certificate (or one of its intermediate CA certificates) that was used to issue the TLS certificate of the reverse proxy.

This is a quick and easy setup to proxify a connection without the client's knowledge. However, it comes with the drawback that the upstream server must support plain HTTP connections.

Transparent Forward Proxy

A number of powerful tools exist to transparently proxify connections. One of these tools is redsocks (https://github.com/darkk/redsocks). It has significantly more capabilities than the Nginx setup above, which come at the cost of complexity. In addition, redsocks requires manipulating iptables rules, which, depending on the infrastructure (for example, a containerized environment), might be challenging.

It might also be possible to use Squid (https://www.squid-cache.org) as a transparent forward proxy that forwards traffic to a Squid cache_peer (the corporate forward proxy). In this setup, Squid acts as a child proxy and the corporate proxy acts as the parent proxy.

Application Layer

On the application layer, too, multiple options exist to proxify connections (for both HTTP and SOCKS proxies).

Using LD_PRELOAD

A common tool for this is proxychains-ng (https://github.com/rofl0r/proxychains-ng). It achieves proxification using the “shared object hook” technique:

ProxyChains is a UNIX program, that hooks network-related libc functions in DYNAMICALLY LINKED programs via a preloaded DLL (dlsym(), LD_PRELOAD) and redirects the connections through SOCKS4a/5 or HTTP proxies.

Consequently, this technique does not work if:

  • the binary on the client is statically linked, or
  • the binary is written in Go (because Go uses its own syscall implementation rather than going through musl libc or glibc).
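A rough way to predict the first case is to check whether a binary asks for a dynamic loader at all, that is, whether its ELF program headers contain a PT_INTERP entry. The following sketch does this for 64-bit little-endian ELF files only. The function name is made up, and the result is a heuristic: a Go binary built with cgo can carry PT_INTERP and still bypass the hooked libc functions.

```python
import struct

PT_INTERP = 3  # program header type that names the dynamic loader


def has_dynamic_loader(elf_bytes: bytes) -> bool:
    """Return True if a 64-bit little-endian ELF image has a PT_INTERP header."""
    if elf_bytes[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if elf_bytes[4] != 2 or elf_bytes[5] != 1:
        raise ValueError("this sketch only handles 64-bit little-endian ELF")
    # ELF64 header: e_phoff at 0x20, e_phentsize at 0x36, e_phnum at 0x38.
    (e_phoff,) = struct.unpack_from("<Q", elf_bytes, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf_bytes, 0x36)
    for index in range(e_phnum):
        # p_type is the first 4-byte field of every program header.
        (p_type,) = struct.unpack_from("<I", elf_bytes, e_phoff + index * e_phentsize)
        if p_type == PT_INTERP:
            return True
    return False
```

For example, has_dynamic_loader(open("/usr/bin/wget", "rb").read()) returning False would suggest that LD_PRELOAD-based tools cannot hook that binary.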

If proxychains-ng is working as intended, the output looks similar to this:

~$ proxychains4 wget -qO- server.local
[proxychains] config file found: /etc/proxychains.conf
[proxychains] preloading /usr/lib/libproxychains4.so
[proxychains] DLL init: proxychains-ng 4.14-git-3-gde4460f
[proxychains] Strict chain  ...  192.168.0.5:3128  ...  server.local:80

Using ptrace

An alternative is graftcp (https://github.com/hmgle/graftcp). It also works for statically linked binaries as well as Go binaries. It does its magic by utilizing ptrace(2) in order to intercept and proxify traffic.

Be aware that ptrace is a Linux system call that lets one process (the tracer) observe and take control of another process (the tracee). In other words, the interception is enforced by the kernel, while graftcp itself runs in user space. graftcp consists of two components: the graftcp command (a nice CLI), which runs the target program under ptrace and redirects its connect(2) calls, and graftcp-local, a local service that accepts the redirected connections and forwards them through the proxy.

This scenario does not work within a Linux container because of PID namespaces (Wikipedia), which map PIDs between the host system and the container. Furthermore, ptrace might be disabled by security facilities such as SELinux (setsebool -P deny_ptrace on) or by withholding the Linux kernel capability CAP_SYS_PTRACE.