Bug 48505 - Apache 2.2 not working with LDAP Fail Over Auth
Status: RESOLVED FIXED
Alias: None
Product: Apache httpd-2
Classification: Unclassified
Component: mod_authz_ldap
Version: 2.2.13
Hardware: PC Linux
Importance: P2 critical
Target Milestone: ---
Assignee: Apache HTTPD Bugs Mailing List
Keywords: FixedInTrunk
Reported: 2010-01-07 06:17 UTC by Muzi
Modified: 2010-12-28 16:19 UTC



Description Muzi 2010-01-07 06:17:44 UTC
Hi guys,

I am using Apache/2.2.13 (Unix) on Fedora 11, with LDAP for URL authentication. My configuration is in /etc/httpd/conf.d/mydomain.conf, with the values below.

Only the LDAP-related entries are shown here.


LDAPTrustedMode TLS
LDAPConnectionTimeout 4

<Directory "/var/www/html/test" >


AuthName "Testing Ldap fail over"
AuthType Basic

# The LDAP server(s)
AuthLDAPURL "ldap://ldap1.mydomain.com ldap2.mydomain.com/dc=mydomain,dc=com?uid??"
AuthBasicProvider ldap
AuthLDAPBindDN "uid=webcon,ou=WebAdmin,dc=mydomain,dc=com"
AuthLDAPBindPassword abxxyz

AuthLDAPGroupAttribute memberUid

Require ldap-group cn=WEBOU,dc=mydomain,dc=com
Order Allow,Deny
Options Indexes FollowSymLinks Multiviews
Allow from All

</Directory>


This works while the primary server, ldap1, is up. For testing I took ldap1 down and checked again: Apache did not forward the auth request to ldap2 for a long time. It took a very long time to connect to ldap2, I think maybe 15-20 minutes. I want immediate LDAP failover, so that when the primary ldap1 is down Apache connects to ldap2 after just a few seconds. For this I found the directive "LDAPConnectionTimeout" in the docs and tried setting it to 7 seconds, but it still fails to connect to ldap2.

So please kindly suggest what else I need to make LDAP failover effective.
Comment 1 Eric Covener 2010-01-07 06:37:50 UTC
Apache doesn't do anything with the two hostnames or the connect timeout (which boils down to the LDAP option LDAP_OPT_NETWORK_TIMEOUT); they are passed up-front to whatever LDAP library on your system Apache has been linked with.

If it takes 15-20 minutes to figure out you can't connect, you probably have a firewall that blocks the RST sent from the LDAP server to the webserver, which is what signals that the connection is actively refused.
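
To illustrate the distinction: an actively refused connection (RST received) fails almost immediately, while a host that silently drops packets fails only once some timeout expires. The following is a minimal Python sketch of host-by-host failover with a per-attempt connect timeout. It is not what mod_ldap or any LDAP library does internally; the name first_reachable and the 4-second default are assumptions made for illustration.

```python
import socket


def first_reachable(endpoints, timeout=4.0):
    """Return the first (host, port) that accepts a TCP connection, else None.

    Hypothetical helper: each attempt is bounded by `timeout` seconds, so an
    unreachable host costs at most that long before the next one is tried.
    """
    for host, port in endpoints:
        try:
            # create_connection applies the timeout to the connect() call.
            with socket.create_connection((host, port), timeout=timeout):
                return (host, port)
        except OSError:
            # Covers an active refusal (RST), a timeout on a silent host,
            # and name-resolution failures; move on to the next endpoint.
            continue
    return None
```

Without the explicit timeout, a connect to a powered-off or firewalled host is bounded only by the operating system's TCP retry behaviour, which can be far longer than a few seconds.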
Comment 2 Muzi 2010-01-07 07:10:10 UTC
(In reply to comment #1)
> Apache doesn't do anything with the two hostnames or the connect timeout (boils
> down to LDAP option  LDAP_OPT_NETWORK_TIMEOUT), they are passed up-front to
> whatever LDAP library on your system Apache has been linked with.
> 
> if it takes 15-20 minutes to figure out you can't connect, you probably have a
> firewall that blocks the RST sent from the LDAP server to the webserver, which
> is what signals that the connection is actively refused.

Hi,

No, in my firewall the LDAP ports (389, 636) are allowed for both incoming and outgoing traffic for all hosts. The timeout directive above is mentioned in the Apache 2.2 docs, but I don't know why Apache is not forwarding the request to the secondary ldap2 after the desired number of seconds.
Comment 3 Muzi 2010-01-07 07:20:09 UTC
I forgot to change the status.
Comment 4 Eric Covener 2010-01-07 07:31:23 UTC
I've clarified the documentation here:

http://svn.apache.org/viewvc?rev=896897&view=rev

But it's not an Apache bug if your LDAP client takes 15 minutes to fail to acquire a connection, and this is not a support forum.
Comment 5 Muzi 2010-01-07 08:06:05 UTC
(In reply to comment #4)
> I've clarified the documentation here:
> 
> http://svn.apache.org/viewvc?rev=896897&view=rev
> 
> But it's not an Apache bug if your LDAP client takes 15 minutes to fail to
> acquire a connection, and this is not a support forum.

My dear, on the same machine my clients take just 5 seconds for ssh/vsftpd auth with LDAP failover, but Apache takes far too much time. One more thing: leaving the time taken aside, my problem is simply that when my primary LDAP is down, Apache does not forward the request to ldap2. As for the timing I mentioned, I tested it again and it still fails after the 15th minute. So please suggest.

Thanks
Comment 6 Eric Covener 2010-01-07 08:09:43 UTC
http://httpd.apache.org/userslist.html
Comment 7 Eric Covener 2010-01-09 09:53:13 UTC
*** Bug 48515 has been marked as a duplicate of this bug. ***
Comment 8 Muzi 2010-01-09 10:43:37 UTC
It's not resolved yet, and the mailing list did not give me any info or update, so I have reopened it, on the assumption that I am in touch with professional people in this Bugzilla community.

Thanks
Comment 9 Eric Covener 2010-01-09 11:30:29 UTC
*** Bug 48515 has been marked as a duplicate of this bug. ***
Comment 10 Eric Covener 2010-01-09 11:36:45 UTC
Attach your ErrorLog, with LogLevel debug set, and a binary, unformatted, unlimited-capture-length packet capture (tcpdump, wireshark) of your LDAP port.

To reiterate, this timeout and failover is implemented by the LDAP library, not by Apache. Apache literally passes off the URL with two hostnames to the LDAP library in a single call.

If your network takes 15 minutes to figure out it can't connect, and your LDAP library doesn't use the network timeout for connect(), there's nothing Apache can do about it.
Comment 11 charlie 2010-01-22 14:49:11 UTC
OK, I'm going to document this exhaustively.  Please bear with me.

MY PLATFORM:

I am using the mod_authnz_ldap that ships with Red Hat Enterprise Linux v5.4, as part of their httpd-2.2.3-31.el5_4.2 package.  The problem I am seeing is consistent with the hundreds of similar problems easily found with Google from various releases and builds of Apache 2.2 with the ASF LDAP auth modules.

My underlying libraries are OpenLDAP and I have turned debugging up to the point of crippling the server, which allows me to see exactly what Apache is doing and why so many people are whining and complaining.

THE ISSUE:

The problem appears to be in the AuthLDAPURL directive, which is not compliant with RFC2255 (the current relevant RFCs are 4516 and 4510) as documented, because the RFCs do not specify host failover syntax as far as I can determine.  I have searched extensively, and while there is a "de facto standard" for failover specification used by nearly all LDAP capable software (except Apache) I could not find any RFC that explicitly addressed failover host specification syntax.  Embedding multiple host names the way Apache does in the midst of an otherwise RFC compliant string breaks compliance.

If Apache uses the de facto standard as used by IBM, HP, OpenLDAP, and PADL, there will be no more problems - the underlying libraries will be able to do whatever they are capable of doing instead of being restricted by Apache's ability to parse, and all the things people are trying to do will work.

If Apache continues to use the current syntax, users must make a choice of either efficiency or security - they cannot have both in a failover configuration.

APACHE DOCS:

From the apache module itself (viewed via mod_info.c) the spec is described as: 

ldap://host[:port]/basedn[?attrib[?scope[?filter]]]

the module doc additionally states:

"Host is the name of the LDAP server. Use a space separated list of hosts to specify redundant servers"

A second parameter of "LDAP connection mode" is allowed with permitted values of NONE, SSL, or TLS/STARTTLS.

WHY THIS IS SUCH A PROBLEM:

LDAP lookups frequently contain passwords, and ldap services frequently use dissimilar schema.

In real world LDAP deployments, system architects usually want to encrypt ldap lookups across networks for security, but do not wish to incur encryption overhead on ldap lookups using local secure channels (such as the loopback interface or named pipes or a separate network, depending on OS capabilities and site setup).

Less commonly, sites that have dual LDAP backends (typically OpenLDAP and Active Directory) may present a single replicated data set using different attributes and thus may require different filter or port specifications for different hosts specified as failovers.

AuthLDAPURL's syntax prevents this by forcing a single set of parameters across hosts, which is not required by the underlying libraries.
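
As a rough illustration of that constraint, the sketch below expands an AuthLDAPURL-style value into one URL per host. It is a hypothetical helper (expand_hosts is not part of Apache or of any LDAP library), written only to show that every host necessarily ends up with the same scheme, base DN, and query parameters.

```python
def expand_hosts(url):
    """Expand 'scheme://h1 h2/basedn?...' into one URL per host.

    Hypothetical helper for illustration: host tokens are copied as-is and
    everything after the host list is shared verbatim by every result.
    """
    scheme, sep, rest = url.partition("://")
    if not sep:
        raise ValueError(f"not a URL: {url!r}")
    # Everything up to the first '/' is the space-separated host list.
    hosts, _, tail = rest.partition("/")
    return [f"{scheme}://{host}/{tail}" for host in hosts.split()]


for expanded in expand_hosts(
    "ldap://ldap1.mydomain.com ldap2.mydomain.com/dc=mydomain,dc=com?uid??"
):
    print(expanded)
```

Because the scheme and the part after the first "/" appear only once in the input, there is no place to give the second host a different transport, port-specific filter, or attribute.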

EXAMPLES:

This is a commonly used LDAP failover configuration in PADL's pam_ldap and nss_ldap configurations (on Red Hat, both are in /etc/ldap.conf):

uri ldap://127.0.0.1:389/ ldaps://remotehost.example.com:636/

Note how the local loopback has no encryption, but the failover host is forced into an SSL tunnel.

Here's another, with a named pipe:

uri ldapi://%2fvar%2frun%2fldapi_sock/
# Note: %2f encodes the '/' used as directory separator

Here's one for a machine that runs Scalix; note the weird port:

uri ldapi://%2fvar%2frun%2fldapi_sock/ ldap://127.0.0.1:9009/ ldaps://failoverhost.example.com/

These are (host obscured) real-world examples from running machines using PADL's pam_ldap to access OpenLDAP's client libraries.  NONE of these configurations can be achieved with the Apache module's AuthLDAPURL syntax using the same libraries.  The limitation is not in the libraries, it's in the AuthLDAPURL syntax.
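
For illustration, a space-separated URI list like the ones above can be split into independent endpoints, each carrying its own scheme and port. The sketch below uses only Python's standard library; parse_uri_list is a hypothetical name, and the default ports (389 for ldap, 636 for ldaps) are the conventional ones, assumed here rather than read from any library.

```python
from urllib.parse import unquote, urlsplit

# Conventional default ports; an assumption for this sketch.
DEFAULT_PORTS = {"ldap": 389, "ldaps": 636}


def parse_uri_list(value):
    """Split a space-separated LDAP URI list into (scheme, target, port) tuples.

    For ldapi the target is the percent-decoded socket path and port is None.
    """
    endpoints = []
    for uri in value.split():
        parts = urlsplit(uri)
        scheme = parts.scheme
        if scheme == "ldapi":
            # The socket path is percent-encoded in the host position.
            endpoints.append((scheme, unquote(parts.netloc), None))
        elif scheme in DEFAULT_PORTS:
            port = parts.port or DEFAULT_PORTS[scheme]
            endpoints.append((scheme, parts.hostname, port))
        else:
            raise ValueError(f"unsupported scheme in {uri!r}")
    return endpoints


for endpoint in parse_uri_list(
    "ldapi://%2fvar%2frun%2fldapi_sock/ ldap://127.0.0.1:9009/ "
    "ldaps://failoverhost.example.com/"
):
    print(endpoint)
```

Each tuple is self-describing, which is the property the single-URL AuthLDAPURL form lacks: transport and port travel with the individual host rather than with the whole list.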

RECOMMENDED SOLUTION:

In order to avoid breaking current applications that are using any of the "hacks" found on the net, implement a new parameter, AuthLDAPURI (note URI rather than URL; this is consistent with LDAP v3 nomenclature as per the RFCs), which behaves exactly like the OpenLDAP & PADL syntax.  This syntax allows all the various combinations that users want and is completely compliant with both RFC2255 (historical) and RFC4516 (current) as well as LDAPS and STARTTLS.

AuthLDAPURI ldap[s]://host[:port]/basedn[?attrib[?scope[?filter]]]

"Use a space separated list of URIs to specify redundant servers"

CONCLUSION:

I hope this clears up the confusion about the problems users are encountering in the wild.  Google currently shows 23,100 hits when searching "ldap failover in apache 2.2", and they all seem to be complaints. There are clearly hundreds of sites struggling to find a solution to their misunderstanding of the Apache 2.2 LDAP limitations.  These limitations are not present in the underlying libraries.
Comment 12 Eric Covener 2010-01-22 15:23:05 UTC
Where does the lengthy post about ldapi:// tie back into the issue in this report? I think it belongs in a separate enhancement request.

For the OP, Stefan introduced a change to flip the option used by openldap for the connect or bind timeouts, which were not touched by the current mod_ldap timeout setting.

http://svn.apache.org/viewvc?rev=898102&view=rev
Comment 13 charlie 2010-01-25 11:28:39 UTC
Eric, it's not about ldapi, it's about the way Apache's broken AuthLDAPURL syntax is crippling systems by preventing easy access to underlying LDAP client libraries.

You require this for syntax:

transport://host host host host host:port/dn?filter

this is not RFC compliant, you cannot embed multiple hosts like that. This is a hack and it's not a good one.  It prevents many common and desirable configs.

This is the right way to do it:

transport://host:port/dn?filter transport://host:port/dn?filter transport://host:port/dn?filter

It's the way OpenLDAP's code does it, it's the way PADL's code does it, it's the only way I know of that conforms to the relevant RFCs.  The relevant RFCs were written by Kurt Zeilenga, and Kurt uses the form I've recommended in his own code.  All the people complaining about failover not working are trying to use this well-known format which Apache does not support.

In regards to ldapi, that is one single line of my previous post.  I was trying to illustrate that a proper syntax will allow access to *everything* the client libraries provide, including ldapi.  However, if that introduces confusion, please ignore the one single line that mentions ldapi, I did not mean to mislead you.
Comment 14 Eric Covener 2010-01-25 11:40:58 UTC
I understand what's desirable, I just don't see how it relates to the bug report it's attached to.  Stefan fixed the reported problem for modern openldap by using a timeout on non-connect operations, which is IMO orthogonal to not being able to specify two different transports.
Comment 15 Eric Covener 2010-01-25 11:51:42 UTC
Probably aiding in the confusion: the OP discussed this issue in multiple bugs and on the ML, and my discussion of the internal host and connect timeouts is actually OT for the issue he reported, which he fixed by setting a timeout in his SDK directly.

Charlie: Please open a new bug or enhancement request along with as much SDK info as you can gather.

Timeout issue fixed in trunk for contemporary OpenLDAP (and resolved by the OP via setting timeouts system-wide, see bug 48515):

http://svn.apache.org/viewvc?rev=898102&view=rev
Comment 16 charlie 2010-01-26 08:08:31 UTC
(In reply to comment #15)
> Charlie: Please open a new bug or enhancement request along with as much SDK
> info as you can gather.

So let it be written, so let it be done.

https://issues.apache.org/bugzilla/show_bug.cgi?id=48623

Eric, thank you for your help and advice!  I am not an experienced bugzilla user, and I haven't written anything in C in at least a decade, but I know LDAP very well.
Comment 17 Muzi 2010-01-26 10:49:12 UTC
I have figured out the network timeout issue and fixed it. As I mentioned, I had already defined two LDAP hosts, and failover works when the LDAP service is not running on the primary: Apache then switches to the secondary, ldap2, automatically within a few seconds. The problem arises when the primary LDAP host itself is down (powered off, etc.): then Apache hangs. The solution is to define the directive below in /etc/openldap/ldap.conf:

NETWORK_TIMEOUT 4 (a 4-second wait)

With this, Apache switches to ldap2 in case of a network failure on the primary ldap1. I tested it step by step by taking ldap1 down with a poweroff, unplugging the network cable, and blocking the port using iptables; it now works fine. :)

Muzi
Comment 18 Mark A. Ziesemer 2010-12-28 16:19:35 UTC
Relating to the ldapi:/// URLs, please refer to bug 44302.