Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Trivial Trivial
    • Resolution: Won't Fix
    • Affects Version/s: 3.0
    • Fix Version/s: 3.1
    • Labels:
      None

      Description

      Since "Pair" is supposed to be an immutable class, couldn't we cache the "hashCode" value at construction? That would supposedly make it more efficient when used in maps.

        Activity

        Hide
        Gilles added a comment -

        Removed caching in revision 1336958.

        Show
        Gilles added a comment - Removed caching in revision 1336958.
        Hide
        Gilles added a comment -

        Reopening issue as caching is eventually not worth the risk of inconsistent behaviour.

        Show
        Gilles added a comment - Reopening issue as caching is eventually not worth the risk of inconsistent behaviour.
        Hide
        Gilles added a comment -

        I'm inclined to agree with James here.
        But if it is decided to keep the feature, then the Javadoc must be clarified.

        1. I put on this issue on someone else's behalf.
        2. Just after I clicked "submit issue", I realized the consistency problem: See my own first comment.[1]
        3. The discussion went on because you were in favour of caching the hash code, without allowing the current behaviour (no cache). IMO, not having the flag, and caching is the most dangerous option.

        As for the comments on the clarity of the Javadoc, feel free to make it clearer for you.
        IMHO, the class comment makes it quite obvious (for someone who knows what is meant by immutable is this context) what the problem could be.

        [1] (Initial) Question: Does someone see a way to make "Pair" truly immutable?
        (Expected) Answer: No. It is thus not safe to cache the hash code value.
        (Simple) Conclusion: "Won't fix".

        • I'll remove the cache-related code.
        • This issue at least served to point to the deficient behaviour of the class's "hashCode" method previous implementation. The current one is better I think. Please review...
        Show
        Gilles added a comment - I'm inclined to agree with James here. But if it is decided to keep the feature, then the Javadoc must be clarified. I put on this issue on someone else's behalf. Just after I clicked "submit issue", I realized the consistency problem: See my own first comment. [1] The discussion went on because you were in favour of caching the hash code, without allowing the current behaviour (no cache). IMO, not having the flag, and caching is the most dangerous option. As for the comments on the clarity of the Javadoc, feel free to make it clearer for you. IMHO, the class comment makes it quite obvious (for someone who knows what is meant by immutable is this context) what the problem could be. [1] (Initial) Question: Does someone see a way to make "Pair" truly immutable? (Expected) Answer: No. It is thus not safe to cache the hash code value. (Simple) Conclusion: "Won't fix". I'll remove the cache-related code. This issue at least served to point to the deficient behaviour of the class's "hashCode" method previous implementation. The current one is better I think. Please review...
        Hide
        Sebb added a comment -

        if you need to cache the hashcode values, just do it in the types that you put in the pair.

        This "optimization" is in the wrong place, it should be left to the participating types

        Very good points.

        I'm inclined to agree with James here.
        But if it is decided to keep the feature, then the Javadoc must be clarified.

        Show
        Sebb added a comment - if you need to cache the hashcode values, just do it in the types that you put in the pair. This "optimization" is in the wrong place, it should be left to the participating types Very good points. I'm inclined to agree with James here. But if it is decided to keep the feature, then the Javadoc must be clarified.
        Hide
        James Ring added a comment -

        Feel free to ignore me. I have no interest in this feature and I'd never use it, but:

        • if you need to cache the hashcode values, just do it in the types that you put in the pair. Some types that know they are immutable already do this (e.g. String)
        • the API documentation is vague: what do you mean by immutable? Does passing "false" mean I can mutate the pair itself? I think somebody looking at this API is going to have to dig into the source code to figure out exactly what you mean
        • it unnecessary pollutes a simple API. Also, why not cache toString()? Why single out hashCode()? This "optimization" is in the wrong place, it should be left to the participating types
        • it's a premature optimization: nobody has shown a demonstrable benefit and there is, I believe, a negative effect on the cleanliness of the API (but hey, why start making nice APIs now?)
        • somebody is going to look at the API and wonder if they should set this. If they choose incorrectly (which is possible because of the poor documentation), they get a surprising bug that they wouldn't otherwise have
        Show
        James Ring added a comment - Feel free to ignore me. I have no interest in this feature and I'd never use it, but: if you need to cache the hashcode values, just do it in the types that you put in the pair. Some types that know they are immutable already do this (e.g. String) the API documentation is vague: what do you mean by immutable? Does passing "false" mean I can mutate the pair itself? I think somebody looking at this API is going to have to dig into the source code to figure out exactly what you mean it unnecessary pollutes a simple API. Also, why not cache toString()? Why single out hashCode()? This "optimization" is in the wrong place, it should be left to the participating types it's a premature optimization: nobody has shown a demonstrable benefit and there is, I believe, a negative effect on the cleanliness of the API (but hey, why start making nice APIs now?) somebody is going to look at the API and wonder if they should set this. If they choose incorrectly (which is possible because of the poor documentation), they get a surprising bug that they wouldn't otherwise have
        Hide
        Gilles added a comment -

        Committed in revision 1336458.

        Show
        Gilles added a comment - Committed in revision 1336458.
        Hide
        Gilles added a comment -

        -1000 to this proposal

        The proposal is fully backwards-compatible.
        The API is just extended (one new constructor).

        What's the problem?

        Show
        Gilles added a comment - -1000 to this proposal The proposal is fully backwards-compatible. The API is just extended (one new constructor). What's the problem?
        Hide
        James Ring added a comment -

        -1000 to this proposal, unnecessary complicates the API and there's no demonstrated benefit.

        Why wouldn't the types in the Pair just cache their own hashcodes if performance is so critical?

        Show
        James Ring added a comment - -1000 to this proposal, unnecessary complicates the API and there's no demonstrated benefit. Why wouldn't the types in the Pair just cache their own hashcodes if performance is so critical?
        Hide
        Gilles added a comment -

        See my last sentence, I think your solution is fine.

        I had seen it

        Just making sure that your "[...] it would be fine I guess." becomes "it would be fine."

        Show
        Gilles added a comment - See my last sentence, I think your solution is fine. I had seen it Just making sure that your " [...] it would be fine I guess." becomes "it would be fine."
        Hide
        Thomas Neidhart added a comment -

        See my last sentence, I think your solution is fine.

        In my comment I wanted to point out that assuming and documenting immutability may still be dangerous, especially when all the millions of Pair implementations out there do it differently. An explicit flag is the way to go then if it is really needed for performance reasons imho.

        Show
        Thomas Neidhart added a comment - See my last sentence, I think your solution is fine. In my comment I wanted to point out that assuming and documenting immutability may still be dangerous, especially when all the millions of Pair implementations out there do it differently. An explicit flag is the way to go then if it is really needed for performance reasons imho.
        Hide
        Gilles added a comment -

        Do you have a use case at hand that requires this change?

        It's an optimization suggested by a developer who works with maps.

        I have no idea how much gain it provides; but it seems that for an application that calls "hashCode" a lot, that might be useful...

        I don't understand why you call it "premature" optimization.

        If the Javadoc states that the correct behaviour is up to the user, I don't see any real problem; we have other cases where a flag indicates that a reference is stored, and if the user does something he shouldn't, the same "problems" will be created.

        What I propose seems the perfect compromise between always assuming immutable (the two Seb's opinion) and never optimize "hashCode" (your opinion). The default being no optimization so that users are not bitten by surprise.
        Shall I apply the change?

        Show
        Gilles added a comment - Do you have a use case at hand that requires this change? It's an optimization suggested by a developer who works with maps. I have no idea how much gain it provides; but it seems that for an application that calls "hashCode" a lot, that might be useful... I don't understand why you call it "premature" optimization. If the Javadoc states that the correct behaviour is up to the user, I don't see any real problem; we have other cases where a flag indicates that a reference is stored, and if the user does something he shouldn't, the same "problems" will be created. What I propose seems the perfect compromise between always assuming immutable (the two Seb's opinion) and never optimize "hashCode" (your opinion). The default being no optimization so that users are not bitten by surprise. Shall I apply the change?
        Hide
        Thomas Neidhart added a comment -

        Do you have a use case at hand that requires this change?

        In general, I would oppose something like that as it sounds like premature optimization.
        Although documenting the behavior in javadoc is correct, it may still create problems and makes debugging more difficult.

        OTOH, if the default is "false" and only set explicitly to "true" in specific cases where a user / developer knows exactly what he/she is doing it would be fine I guess.

        Show
        Thomas Neidhart added a comment - Do you have a use case at hand that requires this change? In general, I would oppose something like that as it sounds like premature optimization. Although documenting the behavior in javadoc is correct, it may still create problems and makes debugging more difficult. OTOH, if the default is "false" and only set explicitly to "true" in specific cases where a user / developer knows exactly what he/she is doing it would be fine I guess.
        Hide
        Gilles added a comment -

        Of course the default value of the flag will be "true".

        Probably better to be safe, and thus set the default to "false"!

        I also add to the discussion that in most parts of Commons Math, we try to avoid dangerous assumptions (cf. defensive copies).

        Here we cannot make copies but still can offer both options (assume immutable or not). It is still indeed the users' responsibility to use the object consistently with his stated assumption.

        And, assuming mutability by default will also preserve compatibility with current behaviour (were the hash code is not cached).

        Show
        Gilles added a comment - Of course the default value of the flag will be "true". Probably better to be safe, and thus set the default to "false"! I also add to the discussion that in most parts of Commons Math, we try to avoid dangerous assumptions (cf. defensive copies). Here we cannot make copies but still can offer both options (assume immutable or not). It is still indeed the users' responsibility to use the object consistently with his stated assumption. And, assuming mutability by default will also preserve compatibility with current behaviour (were the hash code is not cached).
        Hide
        Gilles added a comment -

        Can we ensure there all applications, where one would mutate the underlying "key", will behave badly, even if the hash code is recomputed every time?

        If not, the proposal makes the class more flexible.
        Of course the default value of the flag will be "true".

        The point is that we don't have to force the user to "obey the Javadoc"; we can provide both possibilities and they have to use the chosen one consistently (or "bad things happen" etc.).

        Show
        Gilles added a comment - Can we ensure there all applications, where one would mutate the underlying "key", will behave badly, even if the hash code is recomputed every time? If not, the proposal makes the class more flexible. Of course the default value of the flag will be "true". The point is that we don't have to force the user to "obey the Javadoc"; we can provide both possibilities and they have to use the chosen one consistently (or "bad things happen" etc.).
        Hide
        Sébastien Brisard added a comment -

        I agree: no extra flag in my opinion!

        Show
        Sébastien Brisard added a comment - I agree: no extra flag in my opinion!
        Hide
        Sebb added a comment -

        If the Javadoc says that the user must not change the object, then the code can assume that the object is immutable.
        No need for an extra flag; the hash can be calculated once regardless.

        If the user does not obey the Javadoc, and bad things happen when the hashcode changes, that's their problem.

        Show
        Sebb added a comment - If the Javadoc says that the user must not change the object, then the code can assume that the object is immutable. No need for an extra flag; the hash can be calculated once regardless. If the user does not obey the Javadoc, and bad things happen when the hashcode changes, that's their problem.
        Hide
        Gilles added a comment -

        OK.

        Then assuming that it's the user's responsibility to not mutate the passed references, it seems reasonable to optionally allow the performance gain of computing the hash code at construction, by having a flag in the constructor's parameter list:

        public Pair(K k, V v, boolean assumeImmutable) {
            key = k;
            value = v;
            immutable = assumeImmutable;
            hashCode = computeHashCode();
        }
        

        Then, we'd have:

        public int hashCode() {
            return immutable ? hashCode : computeHashCode();
        }
        

        What do you think?

        Show
        Gilles added a comment - OK. Then assuming that it's the user's responsibility to not mutate the passed references, it seems reasonable to optionally allow the performance gain of computing the hash code at construction, by having a flag in the constructor's parameter list: public Pair(K k, V v, boolean assumeImmutable) { key = k; value = v; immutable = assumeImmutable; hashCode = computeHashCode(); } Then, we'd have: public int hashCode() { return immutable ? hashCode : computeHashCode(); } What do you think?
        Hide
        Sébastien Brisard added a comment -

        Is there really a way to make Pair immutable?
        How about we write a big warning in the Javadoc that it is the reponsibility of the user to make sure that objects passed to Pair are immutable.
        I think we must trust the user on this particular problem. As for internal uses of Pair it is our responsibility... I think we can take care of that!!!

        Show
        Sébastien Brisard added a comment - Is there really a way to make Pair immutable? How about we write a big warning in the Javadoc that it is the reponsibility of the user to make sure that objects passed to Pair are immutable. I think we must trust the user on this particular problem. As for internal uses of Pair it is our responsibility... I think we can take care of that!!!
        Hide
        Gilles added a comment -

        I already notice a problem with this proposal: the elements of the pair might well be mutable, and since we store references, not copies, of the objects passed to the constructor, we cannot ensure that "equals" will stay consistent with the cached "hashCode"!

        Does someone see a way to make "Pair" truly immutable?

        Show
        Gilles added a comment - I already notice a problem with this proposal: the elements of the pair might well be mutable, and since we store references, not copies, of the objects passed to the constructor, we cannot ensure that "equals" will stay consistent with the cached "hashCode"! Does someone see a way to make "Pair" truly immutable?

          People

          • Assignee:
            Gilles
            Reporter:
            Gilles
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development