Bug 44806 - Set the IP address+port used for backend proxy requests.
Summary: Set the IP address+port used for backend proxy requests.
Status: RESOLVED FIXED
Alias: None
Product: Apache httpd-2
Classification: Unclassified
Component: mod_proxy (show other bugs)
Version: 2.5-HEAD
Hardware: All All
: P1 enhancement (vote)
Target Milestone: ---
Assignee: Apache HTTPD Bugs Mailing List
URL:
Keywords: PatchAvailable
Depends on:
Blocks:
 
Reported: 2008-04-10 20:21 UTC by D. Stussy
Modified: 2018-02-25 19:54 UTC (History)
1 user (show)



Attachments
patch for ProxyBindAddress [addr:port+range] (4.50 KB, patch)
2008-05-30 07:48 UTC, rahul
Details | Diff
Slightly better, staggers the starting index so that the number of tries for bind is less (4.91 KB, patch)
2008-05-30 09:33 UTC, rahul
Details | Diff
updated patch (5.03 KB, patch)
2008-05-30 12:01 UTC, rahul
Details | Diff
Updated patch according to comments from D.Stussy (5.75 KB, patch)
2008-06-02 03:14 UTC, rahul
Details | Diff
Updated patch as suggested (6.36 KB, patch)
2008-06-03 00:19 UTC, rahul
Details | Diff
updates (6.46 KB, patch)
2008-06-04 00:24 UTC, rahul
Details | Diff
PARTIAL patch - used during debugging. (7.31 KB, patch)
2008-06-04 15:54 UTC, D. Stussy
Details | Diff
Final Patch - basic "ProxyBindAddress" command (6.67 KB, patch)
2008-06-17 11:19 UTC, D. Stussy
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description D. Stussy 2008-04-10 20:21:26 UTC
After scanning the source for mod_proxy, I saw no command to fix either the outbound IP address or TCP port number used in the backend outbound request.  Being able to set these values may be of concern when it comes to certain system FIREWALL designs for multi-homed hosts (including those virtually multi-homed due to possessing multiple IP addresses per interface).

Command: ProxyAddress <address/hostname>
Context: server config, virtual host, "<Proxy>" section (including proxymatch)
Default: unspecified address (0.0.0.0 for IPv4; "::/128" for IPv6)
Status:  Extension

The command may appear multiple times within the same context as long as it is specifying a different address family than the other instances.

- If the parameter is [0-9.] only, it is an IPv4 address.
- If the parameter contains a ":", it is an IPv6 address.
- Otherwise, it is a hostname.

A hostname may specify ONLY one address per address family, but may have multiple address families (i.e. both an "A" and "AAAA" record for IPv4+IPv6).  It is an error to specify a hostname that resolves to more than one address in the same address family.

Whether the TCP port number should also be settable I shall leave to the maintainers to decide.  Whether this may be done with the above command (to parallel the "Listen" command) or a separate "ProxyPort" command, I also leave open.

This command is meant to correspond to like commands in other server products:  "query-source[-v6]" for BIND/named, or "bindaddress[v6]" for INN/innfeed.  It is an address used when the server needs to act as a client.
Comment 1 rahul 2008-05-30 02:05:48 UTC
*** Bug 42013 has been marked as a duplicate of this bug. ***
Comment 2 rahul 2008-05-30 07:48:50 UTC
Created attachment 22038 [details]
patch for ProxyBindAddress [addr:port+range]

A quite simple minded patch to bind to a particular address and port range
it does
1) bind to a give address:port+range
e.g
ProxyBindAddress 0.0.0.0:11111+10
binds to 0.0.0.0 at 11111 to 11120

or binds to any address given (without range and port)
ProxyBindAddress 0.0.0.0

It tries the ports sequentially until the range is exhausted for each request.
this is probably not optimal. - please do suggest if you know any other way of doing this.

As of now, it returns 404 rather than 503. this can be changed easily but needs modification all mod_proxy_*.c files that call ap_proxy_connect_backend
Comment 3 rahul 2008-05-30 09:33:09 UTC
Created attachment 22039 [details]
Slightly better, staggers the starting index so that the number of tries for bind is less
Comment 4 D. Stussy 2008-05-30 10:33:54 UTC
Interesting patch (looking at the revised one - 0933PST).  I had not considered a port range in my suggestion, but I note the other report did, but lacks setting the IP address.

Parsing failure:  IPv6
+++ modules/proxy/mod_proxy.c	(working copy)
@@ -1706,6 +1709,33 @@
+    port = ap_strstr_c(arg, ":");

Note that this will catch an IPv6 address literal mid-stream as ":" is a valid character for the literal.  "%" is used as the port separator in IPv6 literals.  Alternatively, use a "strrchr()"-type function - which returns the LAST occurance of ":" - which still might be part of the IPv6 literal if ONLY the literal is specified (i.e. no port range at all).

Since IPv6 literals have more than one colon, strchr() != strrchr() would be a valid test to detect them.

+    port = ap_strstr_c(arg, ":");
>    if (port != strrchr(arg, ":") port = ap_strstr_c(arg, "%");  //fixed?

Q:  Missing parameter?
+++ modules/proxy/mod_proxy.c	(working copy)
@@ -1706,6 +1709,33 @@
+    if (!port) {
+        psf->bind_addr = arg;
+        psf->bind_port = 0;
+        psf->bind_range = 1; /* when port=0, bind should not fail */
+        return NULL;
+    }

Shouldn't "psf->bindopt_set = 1;" also be set here too?  Else the IP address or hostname will be ignored.  I assume that "apr_sockaddr_info_get()" will resolve hostnames to addresses.  (see "rewrite" below)

"psf->bind_idx" is not set in this case either.  However, it's used in proxy_util.c even when port is left at "0+1".  I note that the effective "mod 1" makes it useless - but using undeclared variables is messy.  It looks as if this is to round-robin the ports, but what if all are in use?  "All ports in use" appears to return DECLINED.  Should we have a queue of waiting outbound requests, subject to the proxy timeout value?  (Maybe this is handled elsewhere in Apache?).


+++ modules/proxy/proxy_util.c
@@ -2349,6 +2349,35 @@
+                ap_log_error(APLOG_MARK, APLOG_ERR, 0, s, "proxy: %s: can not bound to %s:%d+%d",

Language issue:  "cannot bind to"

Also, what's with using "%d" for values that are inherently unsigned ("%u" should be used).  Port values and address families range 0-65535; no negative values.  Technically, the variables should be unsigned ints.


Rewrite - for efficiency:
+++ modules/proxy/mod_proxy.c	(working copy)
@@ -1706,6 +1709,33 @@
...
+    if (!port) {
+        psf->bind_addr = arg;
+        psf->bind_port = 0;
-
-
+*   } else {
+        psf->bind_addr = apr_pstrndup(parms->pool, arg, port-arg);
+        range = ap_strstr_c(port, "+");
+        if (!range)
+            return "ProxyBindAddress format is <addr>:<port>+<range>";
+*       psf->bind_port = atoi(++port);
+*       psf->bind_range = atoi(++range);   // Should this be atoi()+1 ???
>    }
+    psf->bind_idx = 0;
+
+    psf->bindopt_set = 1;
+    return NULL;

Using "++" instead of "+1" may be more efficient for some compilers that don't realize that the added value is only a constant of 1.  OPtimizing compilers should realize this, but some may not.

No error checking on parameters yet.  "addr:port+0" would be accepted but isn't valid - since that currently returns a "mod 0" -> division by zero operation.  Bump range to be "atoi()+1" to avoid this, but check for >=65536, which would be the same as setting the port to zero.

Also note that with this implementation, each proxy request has to resolve the hostname part.  Granted, it's cached, but shouldn't this be done once - while parsing the configuration.  Strange things could happen if the DNS data changes.
Comment 5 rahul 2008-05-30 12:01:39 UTC
Created attachment 22040 [details]
updated patch

Thanks for the detailed review,
the new patch incorporates the suggested changes as below,
1) new Format for ProxyBindAddress 
      ProxyBindAddress [addr] [port]+[range]
e.g
      ProxyBindAddress 0.0.0.0 11111+10
(avoid the issue of : in ipv6)

2) cache the host part of the address
3) set bind_idx and bindopt_set correctly in all cases
4) validate params
Comment 6 D. Stussy 2008-05-30 12:35:46 UTC
Looking further at IPv6:  apr_parse_addr_port() does most of the parsing already - and it seems that IPv6 literals need to be enclosed in brackets ("[]").  The only thing that it doesn't handle is the +# at the end of this (for the range) - so we would need to strip that off first.  Using this routine would probably be better as that way, parsing would be done in a consistent manner.  It also checks that the port is in the range 1-65535.

If not, then in the section reading:
+    if (!port) {
+        psf->bind_addr = arg;

shouldn't the arg also be duplicated?
+        psf->bind_addr = apr_pstrndup(parms->pool, arg, strlen(arg));
...or is it already in the pool?

Therefore, the command parsing part of the patch becomes:
@@ -1706,6 +1709,45 @@
     return NULL;
 }
 
+static const char*
+    set_proxy_bindaddr(cmd_parms *parms, void *dummy, const char *arg)
+{
+    char *host, *range, *scope_id;
+    apr_port_t port; 
+    apr_status_t rv;
+    unsigned int x;
+    
+    range = ap_strstr_c(arg, "+");
+    if (range) *range++ = 0;
+    x = range ? atoi(range) + 1 : 1;
+
+    if ((x < 1) || (x > 65534))
+        return "ProxyBindAddress:  Invalid range - format is <addr>:<port>+<range>";
+
+    rv = apr_parse_addr_port(&host, &scope_id, &port, arg, parms->pool);
+
+    if (rv != APR_SUCCESS)
+        return "ProxyBindAddress:  Invalid address or port - format is <addr>:<port>+<range>";
+
+    if (range && port)
+        return "ProxyBindAddress:  Range requires valid port - format is <addr>:<port>+<range>";
+
+    if (scope_id)
+        return "ProxyBindAddress:  IPv6 Scope not valid here - format is <addr>:<port>+<range>";
+
+    proxy_server_conf *psf =
+        ap_get_module_config(parms->server->module_config, &proxy_module);
+
+    if (!psf) return "ProxyBindAddress:  Cannot allocate memory for internal structure";
+
+    psf->bind_idx = 0;
+    psf->bind_addr = host;
+    psf->bind_port = (int) port;  // maybe unsigned int?
+    psf->bind_range = x;
+    psf->bindopt_set = 1;
+    return NULL;
+}
+
 static const char *add_member(cmd_parms *cmd, void *dummy, const char *arg)
 {
     server_rec *s = cmd->server;
Comment 7 D. Stussy 2008-05-30 14:06:10 UTC
Revision of Specification (for Manual) - based on my last parse routine:

Command: ProxyBindAddress [<hostname/address-literal>:]<port>[+<range>]
Context: server config, virtual host, "<Proxy>" section (including <ProxyMatch>)
Default: Address: unspecified address (0.0.0.0 for IPv4; "::/128" for IPv6)
         Port: unspecified port (0 => use any port available - OS choice)
         Range:  0 (use only the port specified, if any)
Status:  Extension

Specifying a hostname or an address-literal shall bind all outbound proxy requests to the IP address(es) specified or resolved.  DNS resolution is attempted at the time the proxy request is made, so if a hostname maps to multiple addresses, the address used may vary across requests.  If DNS resolution fails, the proxy request will fail.  As noted elsewhere, since IPv6 address literals contain colons, they must appear in brackets.

Specifying a non-zero port locks in that port as the one used, or if a range is specified, the first one used.  Specifying an optional range indicates how many consecutive port numbers beyond the first may be used.  Specifying a range of "+0" means that only the specified port is used.

Examples:
  ProxyBindAddress 192.0.2.1:10000+10

This sets the IP address to the IPv4 address of 192.0.2.1
There are 11 valid ports for this range:  10000-10010

  ProxyBindAddress [2001:df8::1]:49151+9

This sets the IP address to the IPv6 address of 2001:df8::1
There are 11 valid ports for this range:  49151-49160

Notes:

In the current implementation, it is not possible to specify separate port ranges for different addresses or address families.  It is also considered an error to specify an interface with an IPv6 address ("%" parameter).  Specifying port 0 with a range is invalid.  Specifying an address literal also locks the proxy server into the address family the literal belongs to.  Therefore, only a hostname produces an address family independent assignment assuming that both DNS A and AAAA records exist for the name.

-------------------------------------------------------------
Errata to parsing routine:

1)  Boundary condition - Check value - promote ">" to ">=":
+    if ((x < 1) || (x >= 65534)) ...
Specifying port "1+65534" in effect specifies all ports.  One should specify port 0 instead.

2)  Boundary condition - Value wraparound.
Add code to check for port range wraparound as only 0-65535 is valid:
+    if ((port + x) > 65535)
+        return "ProxyBindAddress:  Invalid port+range - port-wraparound to 0";
insert before "if (scope_id)".


To Rahul:  You seem to have an off-by-one issue with the range in your parsing routine.  In your latest version (noon today), you also seem to "psf->bindopt_set = 1" and overwrite some other variables before all your error checking is done.  I think you'll find my parser (with errors above corrected) is cleaner.
Comment 8 D. Stussy 2008-05-30 14:10:11 UTC
OOps:  Documentation change:

There are 10 valid ports for this range:  49151-49160

...Not 11.

Comment 9 rahul 2008-06-02 03:14:18 UTC
Created attachment 22051 [details]
Updated patch according to comments from D.Stussy

Some updates:
    If a single port is specified with out a range, only one client request at a time can happen. All other requests will be denied. This is not a sane scenario. So I have not allowed this in configuration parsing.

    Rather than checking the value of range and port separately, I have checked validity of port+range. This makes more sense (IMO and according to D.Stussy's later update)

    The looked up address is cached while parsing.
Comment 10 D. Stussy 2008-06-02 14:37:56 UTC
RE: Comment #9 version:

1)  "Range" value:  Checking for (r < 1) needs to be done(*).  It's possible that an admin could specify "ip:port+-range" and get a negative value.  Noting that a range of zero is "not useful," perhaps we need to set a minimum value for the range parameter?  In current parsing, not specifying a range at all does default it to a single port.  The question is:  What is a suitable default?  I'm thinking that it could be one of the following:
 - A fixed number, perhaps 8.
 - Some value derived from the "MaxClients" or "Min/MaxSpareServers" directive.
 - Some value derived from the Proxy Balancer code.
We need only check the lower value for "r" as the "port+r" code will check the upper value.  For the "port+r" comparison, do we have a "MAX_PORT" definition available (from an included file) so we don't have to hard-code the value?

* - Checking "r" against the minimum also will take care of cases where the admin specified non-digit values for the range (which atoi() converts to zero).  However, if the port remains as zero, a range value of zero (internally, 1) is appropriate.  Patch snippet below.


2)  "Address" value:  In the latest version, we're storing the address(es) returned by the apr_sockaddr_info_get() function.  By itself, I don't have a problem with that and thought that something like that should be done.  However, there are two issues with the step:

a)  We're fixing an address, but haven't checked to see if it was the ONLY address that the hostname resolved to.  In my original suggestion, we would need to make certain that DNS resolution produces a unique address (for a given address family).  I'm willing to DROP THAT REQUIREMENT, but then we must note in the documentation that should a hostname resolve to more than one address, all addresses returned may be used.  Dropping the uniqueness requirement per AF does make the parsing code simpler.  (If dropped, I think parsing might be finished.)

b)  Should there be multiple addresses because there's multiple address families (e.g. IPv4 and IPv6), we need to try an address in the same family as the destination.  It would be a problem if we store the IPv6 address (as first) and try to reach an IPv4-only site when we ignore the IPv4 address available for the hostname we're binding to because we're not looping through the alternatives.  Apr_sockaddr_info_get() is returning ALL matching addresses, but in proxy_util.c, it looks as if we're using only the first one and not checking to see if we have a reachable address family combination of source and destination, and when not, trying another address/AF combination.  Since at a point before the patch, we've already set "backend_addr" (the destination) and have "backend_addr->family" available, I think we need to compare these:

+                local_addr = conf->bind_addr;
>> if (backend_addr->family != local_addr->family) { /* connection won't work */
      TRY ANOTHER ADDRESS from bind_addr, if available.
We may need another loop beyond the port+range loop.  Inner loop or outer is your choice, but I think an inner loop is more efficient.

c)  Should we pass the port value to apr_sockaddr_info_get()?  I don't think it makes a difference, but if I remember getaddrinfo() correctly (called 2 routines deep from here), it may - in that it allows us to filter out certain returns of information.  (Cf. - bug reported for INN 2.4.3 that was returning each IP address 3 times - http://groups.google.com/group/news.software.nntp/browse_thread/thread/260a927ec3f5686c/bf6d6d2854dfb0ae?#bf6d6d2854dfb0ae ).  I note that the fix for the INN bug is already in the Apache code, but maybe the port number is also needed to avoid the duplication issue?


PARSING code changes suggested (double "++" for changes):

++ #define MIN_RANGE 8	       /* value subject to discussion */
++ #define MAX_PORTS 65535	/* may be defined from an include file */
	/* perhaps use "UINT16_MAX" from <stdint.h> ??? */
	/* perhaps use "USHRT_MAX"  from <limits.h> ??? */

+    if((apr_parse_addr_port(&host, &scope_id, &port, addr, parms->pool)
+                != APR_SUCCESS)
+            || scope_id               /* we dont know how to use scope_id */
+            || (!port && range)       /* only a combo [port+range] is valid */
++           || (port && (!range || (r < MIN_RANGE)))  /* or an invalid portnumber (p+r) */
++           || ((port + r) > MAX_PORTS) /* or a range that wraps around zero */
+      )
+        return "ProxyBindAddress:  Invalid address -"
+               " format is <addr>[:<port>+<range>]";
+
+    /* Preparse the address */
+    apr_sockaddr_info_get(&(psf->bind_addr), host, APR_UNSPEC, 0, 0, parms->pool);
+
+    psf->bind_port = port;
+    /* If there didn't exist a port then there was no range either. so we have the
+     * starting value 0 for r when no port was specified.*/
++   psf->bind_range = min( port ? MIN_RANGE : 1, r + 1 );
+    psf->bind_idx = 0;
+    psf->bindopt_set = 1;
+    return NULL;


From proxy_util.c:
+                    conf->bind_idx = i + 1;
+                    break;
+                } else {
+                    ap_log_error(APLOG_MARK, APLOG_DEBUG, 0, s, "proxy: %s: not bound to %s%u",
+                            proxy_function, conf->bind_addr, local_addr->port, NULL);
+                }

The ELSE isn't necessary, on account of the break.  It will be optimized out.
Comment 11 D. Stussy 2008-06-02 15:06:44 UTC
Fix the multiple address & address family match issue:

modules/proxy/proxy_util.c	(working copy)
+        if (conf->bindopt_set) {
+            const int idx = conf->bind_idx;
+            const int range = conf->bind_range;
+            const int start = conf->bind_port;
+            for(int i = 0; i < range; ++i) { /* loop until we can bind correctly*/
+                int port = start + ((idx + i) % range);
+                ap_log_error(APLOG_MARK, APLOG_DEBUG, 0, s, "proxy: %s: trying to bind to %s:%u",
+                        proxy_function, conf->bind_addr,port, NULL);
+
++		 for (local_addr = conf->bind_addr; local_addr ; local_addr = local_addr->next) {
+
++		     if (local_addr->family != backend_addr->family) continue;
+
+                    local_addr->sa.sin.sin_port = htons(port);
+                    local_addr->port = port;
+
+                    if ((rv = apr_socket_bind(newsock, local_addr)) == APR_SUCCESS) {
+                        ap_log_error(APLOG_MARK, APLOG_DEBUG, 0, s, "proxy: %s: bound to %s:%u",
+                                proxy_function, conf->bind_addr, local_addr->port, NULL);
+                        conf->bind_idx = i + 1;
++                       goto escape;	/* break from TWO loops */
++                   } 
+                    ap_log_error(APLOG_MARK, APLOG_DEBUG, 0, s, "proxy: %s: not bound to %s%u",
+                            proxy_function, conf->bind_addr, local_addr->port, NULL);
+                }
+            }
+            if (rv != APR_SUCCESS) {
+                ap_log_error(APLOG_MARK, APLOG_ERR, 0, s, "proxy: %s: can not bind to %s:%u+%u",
+                        proxy_function,conf->bind_addr, conf->bind_port, conf->bind_range, NULL);
+                return DECLINED;
+            }
+        }
++escape:

Instead of the goto, one could add "&& (rv != APR_SUCCESS)" to the conditional of both loops.
Comment 12 D. Stussy 2008-06-02 15:17:25 UTC
Parser - afterthought:

++   psf->bind_range = min( port ? MIN_RANGE : 1, r + 1 );
 should be:
++   psf->bind_range = port ? min( MIN_RANGE, r + 1 ) : 1;

If port == 0, r == 0 also, so we don't have to evaluate "min(1,1)".
Comment 13 rahul 2008-06-02 22:40:57 UTC
(In reply to comment #10)
> 1)  "Range" value:  Checking for (r < 1) needs to be done(*).  It's possible
> that an admin could specify "ip:port+-range" and get a negative value.  Noting
> that a range of zero is "not useful," perhaps we need to set a minimum value
> for the range parameter?  In current parsing, not specifying a range at all
> does default it to a single port.  The question is:  What is a suitable
> default?  I'm thinking that it could be one of the following:

My thinking is that if the admin did not specify a range, and a port is given, then it is a configuration error, as there is a good chance that the admin did not understand the implications of restricting the port to a single value. (Note that Admin can still force the issue by setting port+0 as range if he wants it.)

On the other hand, if a port was not specified, then the system selects a free port for us so a second try is not needed any more. This is the reason bind_range is set to 1 in this case.

 
> 2)  "Address" value:  In the latest version, we're storing the address(es)
> returned by the apr_sockaddr_info_get() function.  By itself, I don't have a
> problem with that and thought that something like that should be done. 
> However, there are two issues with the step:
> 
> a)  We're fixing an address, but haven't checked to see if it was the ONLY
> address that the hostname resolved to.  In my original suggestion, we would
> need to make certain that DNS resolution produces a unique address (for a given
> address family).  I'm willing to DROP THAT REQUIREMENT, but then we must note
> in the documentation that should a hostname resolve to more than one address,
> all addresses returned may be used.  Dropping the uniqueness requirement per AF
> does make the parsing code simpler.  (If dropped, I think parsing might be
> finished.)
> 
> b)  Should there be multiple addresses because there's multiple address
> families (e.g. IPv4 and IPv6), we need to try an address in the same family as
> the destination.  It would be a problem if we store the IPv6 address (as first)
> and try to reach an IPv4-only site when we ignore the IPv4 address available
> for the hostname we're binding to because we're not looping through the
> alternatives.  Apr_sockaddr_info_get() is returning ALL matching addresses, but
> in proxy_util.c, it looks as if we're using only the first one and not checking
> to see if we have a reachable address family combination of source and
> destination, and when not, trying another address/AF combination.  Since at a
> point before the patch, we've already set "backend_addr" (the destination) and
> have "backend_addr->family" available, I think we need to compare these:
> 
> +                local_addr = conf->bind_addr;
> >> if (backend_addr->family != local_addr->family) { /* connection won't work */
>       TRY ANOTHER ADDRESS from bind_addr, if available.
> We may need another loop beyond the port+range loop.  Inner loop or outer is
> your choice, but I think an inner loop is more efficient.
> 
> c)  Should we pass the port value to apr_sockaddr_info_get()?  I don't think it
> makes a difference, but if I remember getaddrinfo() correctly (called 2
> routines deep from here), it may - in that it allows us to filter out certain
> returns of information.  (Cf. - bug reported for INN 2.4.3 that was returning
> each IP address 3 times -
> http://groups.google.com/group/news.software.nntp/browse_thread/thread/260a927ec3f5686c/bf6d6d2854dfb0ae?#bf6d6d2854dfb0ae
> ).  I note that the fix for the INN bug is already in the Apache code, but
> maybe the port number is also needed to avoid the duplication issue?

From that post, it seems the fix is to specify the protocol? how is specifying port useful?

[snip]
> +
> +    psf->bind_port = port;
> +    /* If there didn't exist a port then there was no range either. so we have
> the
> +     * starting value 0 for r when no port was specified.*/
> ++   psf->bind_range = min( port ? MIN_RANGE : 1, r + 1 );

see above (top) explanation, I dont think we should use a MIN_RANGE (if at all, we should check the range value, and if less than MIN_RANGE, warn the user.)

> +    psf->bind_idx = 0;
> +    psf->bindopt_set = 1;
> +    return NULL;
> 
Comment 14 rahul 2008-06-02 23:20:59 UTC
(In reply to comment #12)
> Parser - afterthought:
> 
> ++   psf->bind_range = min( port ? MIN_RANGE : 1, r + 1 );
>  should be:
> ++   psf->bind_range = port ? min( MIN_RANGE, r + 1 ) : 1;
> 
> If port == 0, r == 0 also, so we don't have to evaluate "min(1,1)".
> 

the value r starts with '0' and does not change if port was not specified
so you dont need to check 'port? : xxxx : 1 , the value would be r+1 == 1
Comment 15 rahul 2008-06-03 00:19:25 UTC
Created attachment 22058 [details]
Updated patch as suggested

Checks address families to get a usable address.
Comment 16 D. Stussy 2008-06-03 00:44:15 UTC
RE: Comment #13

1)  "port value to apr_sockaddr_info_get()?"  I don't know if it's significant or not for most admins - but it is one of the input values, so perhaps there is a reason that hasn't revealed itself to us:  It's passed to find_addresses() which passes it to call_resolver() which uses it with the AIX OS to set/modify the "servname" parameter to getaddrinfo().  I don't use AIX, but it appears significant to that OS.  Therefore, for portability, it should be passed.

One thing that I did notice:  If we're allocating a ProxyBindAddress address list where one already exists (e.g. server reconfiguration via signal HUP), we should free any existing list before assigning a new value in the parser:

  if (psf->bind_addr != NULL) freeaddrinfo(psf->bind_addr);

Otherwise, we could cause a slow memory leak over time.  Reading getaddrinfo()'s man page reminded me of this.

2)  If you want to specifically allow the "ip:port+0" case, then MIN_RANGE may be set to equal 0.  I took your comment that setting the range so that only a single port was available as "not sane" to mean that you felt such should not be allowed.  I suggested a value of 8 so that there would always be a minimum of 8 worker ports available.  We still need to reject negative values for "r", so checking (r < some_value) still needs to be there.

RE: Comment #14

It's still faster to evaluate "1" over "r+1" even if they yield the same result (of 1) because r = 0.
Comment 17 D. Stussy 2008-06-03 01:04:31 UTC
RE:  Comment #15 - patch:

Code in modules/proxy/proxy_util.c	(working copy)
@@ -2349,6 +2373,21 @@

Does NOT cycle through all the addresses.  It tries the first address of the matching address family in the list, and if that fails, it fails the whole attempt.  It does not try other addresses in the same address family (e.g. multi-homed host, virtual hosts, etc.).  One address may work where another one fails due to interface, firewall, or routing table considerations.

Try this:
@@ -2349,6 +2373,19 @@
                      "proxy: %s: fam %d socket created to connect to %s",
                      proxy_function, backend_addr->family, worker->hostname);
 
+        if (conf->bindopt_set) {
+            for(apr_sockaddr_t *addr = conf->bind_addr; addr; addr = addr->next) {
+                if (addr->family != backend_addr->family) continue;
+                if (bind_to_addr(newsock, laddr, proxy_function, conf, s) == APR_SUCCESS)
+                    break;
+                ap_log_error(APLOG_MARK, APLOG_ERR, 0, s, "proxy: %s: can not bind to %s:%u+%u",
+                        proxy_function,conf->bind_addr, conf->bind_port, conf->bind_range, NULL);
+            }
+            if (!addr)
+                return DECLINED;
+            }
+        }
+
         /* make the connection out of the socket */
         rv = apr_socket_connect(newsock, backend_addr);
 
Comment 18 rahul 2008-06-03 01:43:37 UTC
> 1)  "port value to apr_sockaddr_info_get()?"  I don't know if it's significant
> or not for most admins - but it is one of the input values, so perhaps there is
> a reason that hasn't revealed itself to us:  It's passed to find_addresses()
> which passes it to call_resolver() which uses it with the AIX OS to set/modify
> the "servname" parameter to getaddrinfo().  I don't use AIX, but it appears
> significant to that OS.  Therefore, for portability, it should be passed.

The port we have is only one of the values going to be used (unless the range is set to 0) so I dont think it is correct to pass it.

From apr_sockaddr_info_get, it is used to work around a problem with AIX which
refuses a service "0" and the solution is to set it to "1". this hints that the port name is not important (read the comments above that line too.)

> One thing that I did notice:  If we're allocating a ProxyBindAddress address
> list where one already exists (e.g. server reconfiguration via signal HUP), we
> should free any existing list before assigning a new value in the parser:
> 
>   if (psf->bind_addr != NULL) freeaddrinfo(psf->bind_addr);
> 
> Otherwise, we could cause a slow memory leak over time.  Reading
> getaddrinfo()'s man page reminded me of this.

ok
 
> 2)  If you want to specifically allow the "ip:port+0" case, then MIN_RANGE may
> be set to equal 0.  I took your comment that setting the range so that only a
> single port was available as "not sane" to mean that you felt such should not
> be allowed.

It is not sane, so using a warning just like when MaxClients set to too low :)

>  I suggested a value of 8 so that there would always be a minimum
> of 8 worker ports available.  We still need to reject negative values for "r",
> so checking (r < some_value) still needs to be there.

This is there in the patch (r < 0)


> RE: Comment #14
> 
> It's still faster to evaluate "1" over "r+1" even if they yield the same result
> (of 1) because r = 0.

in your case, you have a ternery embedded, I suspect the cost would be the same in that case :)

Comment 19 rahul 2008-06-03 01:48:09 UTC
> >   if (psf->bind_addr != NULL) freeaddrinfo(psf->bind_addr);
> > 
> > Otherwise, we could cause a slow memory leak over time.  Reading
> > getaddrinfo()'s man page reminded me of this.
> 

call_resolver seems to be doing a freeaddrinfo on the original address list, and
returning back a duplicated list allocated from the pool. So I guess freeaddrinfo is incorrect in the pool-alloced mem.
Comment 20 D. Stussy 2008-06-03 12:29:33 UTC
One last thing that I noticed:

apr_sockaddr_info_get(&(psf->bind_addr), ...

If NO DNS records matched, psf->bind_addr would be set to NULL, and we should probably issue an error - and KEEP any prior list we happen to have (but I don't think we will have a prior list - wouldn't the [prior] config pool be released before the re-read?).  Should a second occurance of this command in the same context be flagged as an error or just overwrite the first occurance?

Code:
...
+    /* Preparse the address */
+    apr_sockaddr_info_get(&(psf->bind_addr), host, APR_UNSPEC, port, 0, parms->pool);
++   if (psf->bind_addr == NULL)
++       return "ProxyBindAddress:  Hostname did not resolve";
++  /* If an address literal was specified, that should have resolved. */
+
+    psf->bind_port = port;
...

Even if it's unnecessary, I don't see any harm in passing "port" to the resolver.  Granted that if non-zero, it is lowest numbered port to use, but zero vs. non-zero uses could be significant in the future.  You've questioned why pass it?  I say why not?

> call_resolver seems to be doing a freeaddrinfo ...

Apr_sockaddr_info_get() sets it to NULL before call_resolver ever sees it.  However, you seem to be correct in that the getaddrinfo() list does seem to be freed.  We should release the copy of it allocated to the pool, as we may have a memory leak - but it appears that such gets released only when the pool is terminated as there's no way to release individual items allocated from a pool.  If we know that the prior pool has already been released when the parser is called, then I don't see a problem.  (I wish I were more familiar with Apache's overall design.)
Comment 21 rahul 2008-06-04 00:24:30 UTC
Created attachment 22068 [details]
updates

1) pass port to info_get
2) look for addr resolution error or !addr
Comment 22 D. Stussy 2008-06-04 15:54:49 UTC
Created attachment 22074 [details]
PARTIAL patch - used during debugging.

RE:  Patch at comment #21

modules/proxy/mod_proxy.c
Parser:  Boundary condition.  If we're defining PORT_MAX_PORTS as 65536 (0x00010000), then "(ports+r) >= ..." on line 1741 is the error condition we need to check for.  Add an equals sign.

modules/proxy/proxy_util.c
@@ -2349,6 +2373,20 @@  Non-critical issue (cosmetic):
Note:  In your design, you're reporting the error of a failure to bind for the first address only when all fail, while in my version in comment #17, moving the ap_log_error() call inside the loop (and after the break) will report EACH address that fails, even if some address succeeds.  We also don't need to keep the status in "rv" as "addr" will be set on success and NULL on failure.  Difference:

Your version:  Reports the error only if all addresses & ports fail.
My version:  Reports an error for each address where all ports fail.

This issue does not affect the operation of the function, but depending on what the admin is doing (production vs. testing a configuration), it may be useful to have a log entry for each address failing as opposed to merely reporting the first.

If you want to keep the error message outside of the loop (as the inner function "bind_to_addr()" has sufficient debug statements, perhaps the message should read:

+                ap_log_error(APLOG_MARK, APLOG_ERR, 0, s, "proxy: %s: All addresses and ports in range specified failed to bind",
+                        proxy_function, NULL);
--------------
As adjusted, I attempted to install the patch and run it.  I found the following problems when patching against a stock version 2.2.8:

+++ modules/proxy/mod_proxy.c	(working copy)
@@ -35,6 +35,12 @@
Patch was rejected.  I manually inserted it.  Trailing text did not match.

All other items did install, but some had different line offsets.

Compilation errors:
My C compiler did not like the declarations inside the for() loops
modules/proxy/proxy_util.c: 2275:  "int i"
modules/proxy/proxy_util.c: 2378:  "apr_sockaddr_t *addr"
Moving those declarations outside of the loops onto their own line worked.

With those two changes, all compiles without error and without warnings.

Operation:
ProxyBindAddress allowed in main server configuration.
ProxyBindAddress allowed inside VirtualHost configuration.
ProxyBindAddress allowed inside <IfModule mod_proxy.c> configuration.
ProxyBindAddress NOT allowed inside <Proxy "*"> configuration - acceptable, but unexpected.

Configured it with a hostname only resolving to one IPv4 and one IPv6 address.
Did NOT see any binding attempts in logs.
Saw ONLY "[error] proxy: HTTP: All addresses and ports in range specified failed to bind"
Removed address-family comparison; added AF to debug string; recompiled.

Still saw it.  Inserted a debug-log entry and found conf->bind_addr is NULL.  Set port options in the config file - different options for the main configuration and a virtual host that has proxy service, but found that at the time of the call, port+range = "0+0" - NOT the same as set.

Are we certain that the configuration pointer value used when proxying is the same configuration pointer value that was used during configuration - i.e. same memory address?

Live log file:
[Wed Jun 04 22:31:30 2008] [debug] mod_proxy.c(855): Running scheme http handler (attempt 0)
[Wed Jun 04 22:31:30 2008] [debug] mod_proxy_http.c(1822): proxy: HTTP: serving URL http://lasvegas.en.craigslist.org/robots.txt
[Wed Jun 04 22:31:30 2008] [debug] proxy_util.c(1855): proxy: HTTP: has acquired connection for (*)
[Wed Jun 04 22:31:30 2008] [debug] proxy_util.c(1916): proxy: connecting http://lasvegas.en.craigslist.org/robots.txt to lasvegas.en.craigslist.org:80
[Wed Jun 04 22:31:31 2008] [debug] proxy_util.c(2015): proxy: connected /robots.txt to lasvegas.en.craigslist.org:80
[Wed Jun 04 22:31:31 2008] [debug] proxy_util.c(2197): proxy: HTTP: fam 2 socket created to connect to *
[Wed Jun 04 22:31:31 2008] [debug] proxy_util.c(2204): ProxyBindAddress: conf->bind_addr is NULL, port=0, range=0 (0825BCD8)
[Wed Jun 04 22:31:31 2008] [error] proxy: HTTP: All addresses and ports in range specified failed to bind

Just in case I made any accidental changes, I've included my differences, complete with DEBUGGING statements that should be removed before the final version is released and incorporated into the mainstream program.

File:  modules/proxy/mod_proxy.h - patch same as proposed, and therefore excluded.
Comment 23 rahul 2008-06-05 02:54:47 UTC
> 
> modules/proxy/mod_proxy.c
> Parser:  Boundary condition.  If we're defining PORT_MAX_PORTS as 65536
> (0x00010000), then "(ports+r) >= ..." on line 1741 is the error condition we
> need to check for.  Add an equals sign.

ok
 
> modules/proxy/proxy_util.c
> @@ -2349,6 +2373,20 @@  Non-critical issue (cosmetic):
> Note:  In your design, you're reporting the error of a failure to bind for the
> first address only when all fail, while in my version in comment #17, moving
> the ap_log_error() call inside the loop (and after the break) will report EACH
> address that fails, even if some address succeeds.  We also don't need to keep
> the status in "rv" as "addr" will be set on success and NULL on failure. 
> Difference:
> 
> Your version:  Reports the error only if all addresses & ports fail.
> My version:  Reports an error for each address where all ports fail.
 
Actually if you check my version of bind_to_addr, you will see that I am printing a log on each time an attempt is made to bind, with a log that reports when a binding is successful. I think there is no loss of information in my case too. (Any address that was tried but did not result in a bind is a failed attempt.)

> As adjusted, I attempted to install the patch and run it.  I found the
> following problems when patching against a stock version 2.2.8:
> 
> +++ modules/proxy/mod_proxy.c   (working copy)
> @@ -35,6 +35,12 @@
> Patch was rejected.  I manually inserted it.  Trailing text did not match.
> 
> All other items did install, but some had different line offsets.

My diffs were against 2.3 trunk. since this is valid in trunk also we should change the version to trunk ?

> Compilation errors:
> My C compiler did not like the declarations inside the for() loops
> modules/proxy/proxy_util.c: 2275:  "int i"
> modules/proxy/proxy_util.c: 2378:  "apr_sockaddr_t *addr"
> Moving those declarations outside of the loops onto their own line worked.

ok

> Operation:
> ProxyBindAddress allowed in main server configuration.
> ProxyBindAddress allowed inside VirtualHost configuration.
> ProxyBindAddress allowed inside <IfModule mod_proxy.c> configuration.
> ProxyBindAddress NOT allowed inside <Proxy "*"> configuration - acceptable, but
> unexpected.
> 
> Configured it with a hostname only resolving to one IPv4 and one IPv6 address.
> Did NOT see any binding attempts in logs.
> Saw ONLY "[error] proxy: HTTP: All addresses and ports in range specified
> failed to bind"
> Removed address-family comparison; added AF to debug string; recompiled.
> 
> Still saw it.  Inserted a debug-log entry and found conf->bind_addr is NULL. 
> Set port options in the config file - different options for the main
> configuration and a virtual host that has proxy service, but found that at the
> time of the call, port+range = "0+0" - NOT the same as set.

> Are we certain that the configuration pointer value used when proxying is the
> same configuration pointer value that was used during configuration - i.e. same
> memory address?

That is interesting, In my testing, it seems to work fine? could you check this against a 2.3 trunk checkout?

> Just in case I made any accidental changes, I've included my differences,
> complete with DEBUGGING statements that should be removed before the final
> version is released and incorporated into the mainstream program.
> 
> File:  modules/proxy/mod_proxy.h - patch same as proposed, and therefore
> excluded.

Comment 24 D. Stussy 2008-06-05 13:59:05 UTC
1)  If it works for you in the current source tree (v 2.3-prerelease), then so be it.  I found my way to "http://svn.apache.org/repos/asf/httpd/httpd/trunk/" but saw no way to download a current snapshot at once as an archive.  Is there someone else who can test it?  We should have someone else look at this anyway.

2)  Looking back at the code and the manual, I'm wondering if we should also try to implement a version at the proxy worker level (via the ProxyPass or PorxySet commands).  That is, different workers and different backend sites could have different settings than the main configuration.  Maybe it should only be allowed in "balancer" declarations.  I have not looked into how feasible this is - and maybe it should be opened as a separate feature request.  Your option.

3)  Documentation - Revision for syntax:

Command: ProxyBindAddress (hostname|address-literal)[:port+range]
Context: server config, virtual host
Comment 25 rahul 2008-06-07 02:51:28 UTC
Adding a proxypass like directive at worker level (I assume you are refering to allowing interface selection to be based on url) is probably better. I will take a look at how it can be done.
Comment 26 D. Stussy 2008-06-07 18:07:23 UTC
RE: comment #25

...Or a <proxy balancer:...> section too, if possible.

Let's publish the final version of our patch in the meantime as that would be the default if no proxy-worker-specific choice is made.
Comment 27 D. Stussy 2008-06-17 11:19:43 UTC
Created attachment 22134 [details]
Final Patch - basic "ProxyBindAddress" command

I believe that this is the final version of the patch for the basic command - without allowing this to be set on a per proxy-worker basis.  Does everyone agree?
Comment 28 D. Stussy 2008-06-24 22:22:20 UTC
Not having heard anything, I'm going to close the enhancement as resolved.  Should anyone want to take this further, such as to allow this to be settable by various proxy-worker threads, then please "clone the bug" and continue.  Meanwhile, let's commit this to the current distribution version.
Comment 29 rahul 2008-07-16 00:34:43 UTC
Proxy/Balancer Worker specific configuration added as a new bug at
https://issues.apache.org/bugzilla/show_bug.cgi?id=45405
Comment 30 Rainer Jung 2018-02-25 19:54:22 UTC
Undo spam change