Bug 55449 - AH01185: worker slotmem_create failed --> slotmem file name/ID not unique
Summary: AH01185: worker slotmem_create failed --> slotmem file name/ID not unique
Status: RESOLVED FIXED
Alias: None
Product: Apache httpd-2
Classification: Unclassified
Component: mod_proxy_balancer
Version: 2.4.6
Hardware: PC Linux
Importance: P2 major
Target Milestone: ---
Assignee: Apache HTTPD Bugs Mailing List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-08-19 09:48 UTC by Andre W.
Modified: 2016-01-18 14:29 UTC (History)



Description Andre W. 2013-08-19 09:48:09 UTC
Hello everybody,

I'm currently facing a problem with the creation of the shared memory file through mod_proxy_balancer when running multiple Apache instances on a single server. It has occurred since version 2.4.4.

In our environment we run multiple web servers (reverse proxies) on one machine under different users, and each is reloaded or restarted as its project requires. Since 2.4.4, these restarts have increasingly produced the following failure:

[Wed Jul 10 15:43:59.221748 2013] [proxy_balancer:emerg] [pid 25160:tid 140641178298112] (17)File exists: AH01185: worker slotmem_create failed
[Wed Jul 10 15:43:59.221858 2013] [:emerg] [pid 25160:tid 140641178298112] AH00020: Configuration Failed, exiting

After checking the code of mod_proxy, mod_proxy_balancer and slotmem_create, I noticed the following. Version 2.4.4 introduced a function that checks whether the requested shared memory file already exists and tries to reuse it (see mod_slotmem_shm.c, line 316ff). However, I don't think it checks whether the file belongs to the same user. If multiple users get the same "ID" for the shared memory file name, they won't be able to create the file, or to use it if it still exists, because no NEW file name/hash ID is generated inside the method. (See also mod_proxy_balancer.c, line 774ff, for the creation of the file ID.)

[mod_proxy_balancer.c, Line 793: rv = storage->create(&new, conf->id, ALIGNED_PROXY_BALANCER_SHARED_SIZE, conf->max_balancers, type, pconf);
 
--> The created file ID is handed over only once, and there is no check whether the ID already exists under a different user.]

Configuration items the hash of the file ID is based on:
(See mod_proxy_balancer.c, line 766ff)
server_scheme: same for all instances
server_hostname: same for all instances
port: differs
server_admin: same for all instances

So, as far as I understand it, there should be a check whether the existing file belongs to the current user. If the shared memory file was created and is used by another user, a new ID should be generated, because the current Apache process cannot use that file.

OS: SUSE Linux 11 SP2 / 64 Bit
Apache: 2.4.4/6, OpenSSL 1.0.1e, PCRE 8.33, OpenLDAP 2.4.35

Best regards,
André
Comment 1 Andre W. 2013-08-19 11:02:29 UTC
Hello everybody,

just wanted to add some additional comments:

Would it be an idea to add the name of the current OS user to the ID of a shared memory file name, i.e. ID --> OSUser-Hash, so that each user gets a kind of "namespace" for multiple shared memory files?

For example:

Root-12x85
Root-12c86
Marc-15078
Andy-15078
... 

Regards,
André
Comment 2 enteiser 2013-08-22 21:01:47 UTC
We experience semaphore collisions too when there are many Apache instances on a machine. Apache tries to create a semaphore which is already occupied by another user's Apache instance.

If I interpret the code correctly the username prefix for the ID won't work because the ID is passed to shmget, and key_t is an integer.

So various data is passed to a hash function to create a unique key_t. To me it seems it's not so unique and collisions are more likely than intended?

Isn't it possible to use IPC_PRIVATE?

For quick lookup:

AH01179 happens here in line 797:
http://svn.apache.org/viewvc/httpd/httpd/tags/2.4.6/modules/proxy/mod_proxy_balancer.c?revision=1503324&view=markup

conf->id is some hashed value of some data created in line 766, and is presumably passed as key_t to shmget at the end.
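To illustrate the IPC_PRIVATE question raised above, here is a minimal sketch using the System V calls shmget() and shmctl(). The helper names create_private_segment and remove_segment are hypothetical and do not come from httpd or APR.

```c
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/*
 * Create a segment with IPC_PRIVATE. The kernel always allocates a new
 * segment for this key, so two callers can never collide on it.
 * Returns the segment id, or -1 on error.
 */
int create_private_segment(size_t size)
{
    return shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
}

/* Mark a segment for removal. Returns 0 on success, -1 on error. */
int remove_segment(int shmid)
{
    return shmctl(shmid, IPC_RMID, NULL);
}
```

The trade-off is that a segment created this way cannot be looked up again by key. Another process can only attach to it if it inherits or is told the segment id, which may matter if the segment is supposed to be found again after a restart.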
Comment 3 Andre W. 2013-10-23 10:54:54 UTC
Hello everyone,

I investigated this topic a bit further, because the problem has been coming up more often.

During that investigation I also found this open bug, which describes nearly the same problem of creating a key that already exists:

https://issues.apache.org/bugzilla/show_bug.cgi?id=53996

I think the problem comes from the ftok(3) call inside unixd_set_shm_perms() (modules/slotmem/mod_slotmem_shm.c). The second argument there is the int proj_id, which is the constant value 1. Since the ftok key must not be random (the keys should be reproducible), there should be an additional option to vary the key (e.g. an additional startup parameter that can differ for each Apache instance, such as the port number: -c Listen <port>). Since only the least significant 8 bits of the proj_id are used, maybe the 16-bit port number could be hashed as well?

This would not solve the problem completely, but I think it would reduce the number of collisions.
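As a rough sketch of the idea of deriving the proj_id from the port, the hypothetical helper below folds a 16-bit port number into the 8 bits that ftok(3) actually uses. The function name and the XOR folding are assumptions made for illustration; nothing like this exists in httpd.

```c
#include <stdint.h>

/*
 * Hypothetical helper: fold a 16-bit port into an 8-bit proj_id by
 * XOR-ing the high and low bytes. ftok(3) requires the low 8 bits of
 * proj_id to be nonzero, so a result of 0 is mapped to 1.
 */
int port_to_proj_id(uint16_t port)
{
    int id = ((port >> 8) ^ (port & 0xff)) & 0xff;
    return id == 0 ? 1 : id;
}
```

With this folding, ports 8080 and 8081 map to 143 and 142, but ports 8080 and 36895 (0x901F) both map to 143, so it only lowers the collision rate and cannot eliminate collisions.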

The only workaround I have found so far is the following:

1) rename the existing *.shm files
2) restart Apache
3) if the slotmem_create failure still appears in the error log, go back to 1)
4) delete the old files after apache2 has started

Best regards,
André
Comment 4 Jim Jagielski 2013-11-06 14:50:24 UTC
Couldn't you simply use different locations for the .shm file? DefaultRuntimeDir allows this.

Also, are the different users using the *same* instances? I have no idea how that could even work. From what I can see below, you mention that the port differs, so since the hash uses that, the hash itself will be (should be) different, so how are they getting the same IDs?
Comment 5 Andre W. 2013-11-07 10:05:16 UTC
Hello Jim,

okay, let me explain in a bit more detail what is meant by different instances.

Currently, we run multiple different Apache HTTP servers (up to 100 or more individual instances) on a single machine. To achieve this, each Apache server is run by its own user inside its own workspace, to which only that user and root have access. Each server also has a well-defined port range that it is allowed to use.

For example, the main config looks like this:

APACHE_USER="<project_user>" #the user httpd runs as
APACHE_INSTANCE="<project_name>"
APACHE_PROJ="<project_path>" #passed as a start parameter

ServerRoot ${APACHE_INSTROOT} #installation path of the Apache executable
DefaultRuntimeDir /var/tmp/apache_${APACHE_INSTANCE} 
PidFile ${DEFAULTRUNTIMEDIR}/httpd.pid

As you can see, we already set a different DefaultRuntimeDir, because we previously had collisions between the different *.shm files. What we have now, however, are not collisions on the *.shm files but collisions in the shared memory slots of the Unix system. As far as I can tell, this failure is caused by generating a key that already exists in the system's shared memory table (see also my last post).
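The "(17)File exists" in the log is consistent with an exclusive create on a System V key that is already taken. The sketch below reproduces that behaviour with plain shmget(), assuming the segment is requested with IPC_CREAT | IPC_EXCL. Whether httpd/APR uses exactly these flags is not shown in this thread, and the function name is hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/*
 * Create a segment exclusively under `key`, then try to create it again
 * with the same key. Returns the errno of the second attempt (EEXIST is
 * expected), 0 if the second attempt unexpectedly succeeded, or -1 if
 * the first creation already failed.
 */
int second_create_errno(key_t key, size_t size)
{
    int first = shmget(key, size, IPC_CREAT | IPC_EXCL | 0600);
    if (first < 0)
        return -1;

    int result = 0;
    int second = shmget(key, size, IPC_CREAT | IPC_EXCL | 0600);
    if (second < 0)
        result = errno;                /* EEXIST: "File exists" */
    else
        shmctl(second, IPC_RMID, NULL);

    shmctl(first, IPC_RMID, NULL);     /* clean up the demo segment */
    return result;
}
```

The key does not depend on the directory of the *.shm file in any visible way, which is why separate runtime directories alone do not rule out this error.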

It is true that these collisions cannot be reduced to zero (the ftok(3) documentation says much the same, see below), but generating a more unique key should reduce them. One starting point is the proj_id, which is always set to 1. It could be derived from the port number, or be made configurable in httpd.conf or via a start parameter.

I think the collisions arise because the variance of the value ftok is called with is too low, so that the internal calculation inside ftok leads to a collision. (This is also described really well here: https://issues.apache.org/bugzilla/show_bug.cgi?id=53996)

Hopefully that helps,

Best regards,
André

From the ftok(3) man page:
       Of course no guarantee can be given that the resulting key_t is unique.
       Typically, a best effort attempt combines the given proj_id  byte,  the
       lower  16 bits of the i-node number, and the lower 8 bits of the device
       number into a 32-bit result.  Collisions may easily happen, for example
       between files on /dev/hda1 and files on /dev/sda1.
Comment 6 Andre W. 2013-11-15 11:50:30 UTC
Additional information provided.
Comment 7 Jim Jagielski 2013-11-16 22:08:37 UTC
I think I see what you are saying, but the fact that we are getting very different fname's for those ftok() calls means that the use of the constant '1' seems moot. I guess we *could* use the int equiv of the fname for the key; that would ensure some uniqueness...
Comment 8 Jim Jagielski 2013-11-16 22:33:53 UTC
Upon further review, there's nothing that can be done in httpd itself. Instead, this requires changes to and with APR
Comment 9 Andre W. 2013-12-03 15:11:58 UTC
The following bug has been marked as fixed: https://issues.apache.org/bugzilla/show_bug.cgi?id=53996. It belongs to APR. Does that fix also cover this issue?

Will there be a release of APR 1.5.1 in the near future?
Comment 10 Jeff Trawick 2013-12-03 20:19:32 UTC
>Will there be an release of the apr 1.5.1 in the near future?
There aren't currently plans.

Can you try this patch on top of apr 1.5.0?

http://svn.apache.org/viewvc/apr/apr/branches/1.5.x/shmem/unix/shm.c?r1=1534011&r2=1542731&sortby=date&view=patch