Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.2.2
    • Fix Version/s: 0.2.4
    • Component/s: Blur, Blur Shell
    • Labels:
      None

      Description

      When using the terms call in Blur, the raw terms are returned. This works for string and text types. However, the numeric types, GIS types, etc. do not return human-readable values.

      This would be highly useful for typeahead lookups on numeric types.

        Activity

        Garrett Barton added a comment -

        I'll take a stab at this one.

        Current plan is to add a new method to FieldTypeDefinition:
        public abstract String readBytesRef(BytesRef ref)

        And then implement it for the built-in types. I will modify the IndexManager.terms() command to get the FieldTypeDefinition for the field being requested and pass that down into the deeper terms commands for use.
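A minimal sketch of the plan above, for illustration only. The names SketchFieldType, SketchStringFieldType, and SketchTerms are hypothetical stand-ins, not Blur classes, and a plain byte[] is used in place of Lucene's BytesRef so that the example has no dependencies. It only shows the shape of the idea: the terms lookup resolves the field type once and passes it down so each raw term can be converted.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for FieldTypeDefinition.
// The proposed method takes a Lucene BytesRef; byte[] is used here instead.
abstract class SketchFieldType {
  /** Converts a raw indexed term into a human-readable string. */
  abstract String readBytesRef(byte[] rawTerm);
}

// String and text terms are stored as UTF-8, so decoding them is enough.
class SketchStringFieldType extends SketchFieldType {
  @Override
  String readBytesRef(byte[] rawTerm) {
    return new String(rawTerm, StandardCharsets.UTF_8);
  }
}

class SketchTerms {
  // Stand-in for the terms lookup: the field type is resolved by the caller
  // and passed down, and at most `size` converted terms are returned.
  static List<String> terms(Iterable<byte[]> rawTerms, SketchFieldType type, int size) {
    List<String> result = new ArrayList<>();
    for (byte[] raw : rawTerms) {
      if (result.size() >= size) {
        break;
      }
      result.add(type.readBytesRef(raw));
    }
    return result;
  }
}
```

Numeric and spatial types would supply their own implementation of the conversion method in the same way.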

        Garrett Barton added a comment -

        So I settled on
        public abstract String readTerm(BytesRef ref)

        I implemented it for all the current types, though I'm not entirely sure yet whether the spatial ones make sense. I have been struggling to make numerics sensible, though.
        Lucene indexes many terms per number, as you know, and I personally think it's less useful to return all the internal terms than the real ones. My approach was to discard any term that has a shift associated with it, and that is indeed giving me the right originating numbers back. (IndexManager does a null check on the readTerm call; I return null if the term wasn't an original one.) The problem is that I'm worried about the time to return: for 3 doubles I have over 75k terms in the index to run through. I might be spinning for a really long time to pull back a decent number of terms with a real-sized index.
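To illustrate the shift filtering described above, here is a dependency-free sketch for long values. It is an assumption-laden approximation, not the code from the pull request: the class NumericTermSketch and its methods are hypothetical, byte[] stands in for BytesRef, and the encoding is a simplified reimplementation modelled on my understanding of Lucene 4.x prefix coding (a header byte holding a marker plus the shift, followed by 7 bits per byte). Doubles would need an extra conversion step that is not shown.

```java
import java.util.ArrayList;
import java.util.List;

class NumericTermSketch {
  // Marker added to the shift in the header byte of a long term
  // (assumed to mirror Lucene's NumericUtils.SHIFT_START_LONG).
  static final int SHIFT_START_LONG = 0x20;

  // Encodes a long at the given shift: one header byte, then 7 bits per byte.
  static byte[] encode(long value, int shift) {
    long sortable = (value ^ 0x8000000000000000L) >>> shift;
    int n = (63 - shift) / 7 + 1;
    byte[] out = new byte[n + 1];
    out[0] = (byte) (SHIFT_START_LONG + shift);
    for (int i = n; i > 0; i--) {
      out[i] = (byte) (sortable & 0x7f);
      sortable >>>= 7;
    }
    return out;
  }

  // All terms written for one value: one full-precision term (shift 0)
  // plus one lower-precision term per additional precision step.
  static List<byte[]> indexTerms(long value, int precisionStep) {
    List<byte[]> terms = new ArrayList<>();
    for (int shift = 0; shift < 64; shift += precisionStep) {
      terms.add(encode(value, shift));
    }
    return terms;
  }

  // Returns the original number for a full-precision term, or null for any
  // term that carries a shift, so the caller can skip it.
  static String readTerm(byte[] term) {
    int shift = term[0] - SHIFT_START_LONG;
    if (shift != 0) {
      return null;
    }
    long sortable = 0L;
    for (int i = 1; i < term.length; i++) {
      sortable = (sortable << 7) | (term[i] & 0x7f);
    }
    return Long.toString(sortable ^ 0x8000000000000000L);
  }
}
```

With a precision step of 4, each value produces 16 terms in this sketch and only the shift-0 term survives the null check, which is why the filter has to walk many internal terms for every number it returns.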

        Also, the Blur shell seems to hang with term queries; I'm not sure why. The client works fine, so I'm thinking that will be a separate bug to fix in the shell.

        Thoughts?

        Garrett Barton added a comment -

        Submitted a pull request to the Apache Blur GitHub repository. Please review and accept/comment.

        ASF GitHub Bot added a comment -

        Github user Humbedooh commented on the pull request:

        https://github.com/apache/incubator-blur/pull/2#issuecomment-51775286

        Testing GitHub integration once again, please do ignore

        ASF GitHub Bot added a comment -

        Github user williamstw commented on a diff in the pull request:

        https://github.com/apache/incubator-blur/pull/2#discussion_r16051539

        --- Diff: blur-core/src/main/java/org/apache/blur/manager/IndexManager.java ---
        @@ -972,6 +973,13 @@ public Long merge(BlurExecutorCompletionService<Long> service) throws BlurExcept
        LOG.error("Unknown error while trying to fetch index readers.", e);
        throw new BException(e.getMessage(), e);
        }
        +
        + TableContext tableContext = getTableContext(table);
        + FieldManager fieldManager = tableContext.getFieldManager();
        + //TODO: isn't there a util or something available to concat these?
        + final FieldTypeDefinition typeDefinition = fieldManager.getFieldTypeDefinition(columnFamily + "." + columnName);
        --- End diff --

        It's in the BaseFieldManager; maybe it should be moved up? (Mostly testing the GitHub comment integration here.)

        ASF GitHub Bot added a comment -

        Github user gbarton commented on a diff in the pull request:

        https://github.com/apache/incubator-blur/pull/2#discussion_r16068253

        --- Diff: blur-core/src/main/java/org/apache/blur/manager/IndexManager.java ---
        @@ -972,6 +973,13 @@ public Long merge(BlurExecutorCompletionService<Long> service) throws BlurExcept
        LOG.error("Unknown error while trying to fetch index readers.", e);
        throw new BException(e.getMessage(), e);
        }
        +
        + TableContext tableContext = getTableContext(table);
        + FieldManager fieldManager = tableContext.getFieldManager();
        + //TODO: isn't there a util or something available to concat these?
        + final FieldTypeDefinition typeDefinition = fieldManager.getFieldTypeDefinition(columnFamily + "." + columnName);
        --- End diff --

        Going to skip this recommendation, as it potentially gets dealt with in the blur-platform branch.

        ASF GitHub Bot added a comment -

        Github user gbarton commented on a diff in the pull request:

        https://github.com/apache/incubator-blur/pull/2#discussion_r16068322

        --- Diff: blur-core/src/main/java/org/apache/blur/manager/IndexManager.java ---
        @@ -1024,7 +1032,11 @@ public static long recordFrequency(IndexReader reader, String columnFamily, Stri

        BytesRef currentTermText = termEnum.term();
        do {
        - terms.add(currentTermText.utf8ToString());
        +
        + String readTerm = typeDef.readTerm(currentTermText);
        + System.out.println("Read term: " + readTerm);
        + if(readTerm != null)
        --- End diff --

        Fixed. At least I tried to; I could not determine the exact formatting style being used, as everything I tried to apply to IndexManager reformatted the whole file. Perhaps committing an Eclipse formatter style to the Blur project would help?


          People

          • Assignee:
            Unassigned
            Reporter:
            Aaron McCurry
          • Votes:
            0
            Watchers:
            3
