Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.8.0
    • Component/s: Function/UDF
    • Labels:
      None

      Description

      Function definition

      int4 locate(string text, substr text, [, pos])
      

      Description

      Returns the position of the first occurance of substr in str after position pos

      • The result is one-based index.
      • If string or substr is null, the result should be null.
      • pos is one-based index.
      • pos cannot be a negative integer.
      • If 0 is given to pos, the function considers that pos is 1.
      • If there is no pos, the function considers that pos is 1.
      • If there is no matched substring, the result should be 0.

      Example

      SELECT locate('foobarbar', 'bar',5);
      -> 7
      
      1. TAJO-343.2.patch
        9 kB
        KyoungBok Lee
      2. TAJO-343.patch
        8 kB
        KyoungBok Lee

        Activity

        Hide
        deepblue.kblee KyoungBok Lee added a comment -

        I can solve this issue. Assign me please.

        Show
        deepblue.kblee KyoungBok Lee added a comment - I can solve this issue. Assign me please.
        Hide
        hyunsik Hyunsik Choi added a comment -

        I just added you to a contributor group, and I just assigned this issue to you. Thank you for your participation.

        Show
        hyunsik Hyunsik Choi added a comment - I just added you to a contributor group, and I just assigned this issue to you. Thank you for your participation.
        Hide
        deepblue.kblee KyoungBok Lee added a comment - - edited

        Hyunsik Choi, I have a question.

        1. If pos is a negative integer, what should be returned? null or 0?
        2. If substr is the empty string(""), how process this case?

        ※ locate() seems to be mysql's function. right? Then,
        function signature is bellow.

        int4 locate(substr text, string text, [, pos])
        
        SELECT locate('bar', 'foobarbar', 5);
        -> 7
        

        And, result is bellow case by case.

        mysql> select locate('bar', 'foobarbar', 5) as col1;
        +------+
        | col1 |
        +------+
        |    7 |
        +------+
        1 row in set (0.00 sec)
        
        mysql> select locate('bar', 'foobarbar') as col1;
        +------+
        | col1 |
        +------+
        |    4 |
        +------+
        1 row in set (0.00 sec)
        
        mysql> select locate('', 'foobarbar') as col1;
        +------+
        | col1 |
        +------+
        |    1 |
        +------+
        1 row in set (0.00 sec)
        
        mysql> select locate('bar', 'foobarbar', -1) as col1;
        +------+
        | col1 |
        +------+
        |    0 |
        +------+
        1 row in set (0.00 sec)
        

        Anyway, I will implement as requested now.

        Show
        deepblue.kblee KyoungBok Lee added a comment - - edited Hyunsik Choi, I have a question. If pos is a negative integer, what should be returned? null or 0? If substr is the empty string(""), how process this case? ※ locate() seems to be mysql's function. right? Then, function signature is bellow. int4 locate(substr text, string text, [, pos]) SELECT locate('bar', 'foobarbar', 5); -> 7 And, result is bellow case by case. mysql> select locate('bar', 'foobarbar', 5) as col1; +------+ | col1 | +------+ | 7 | +------+ 1 row in set (0.00 sec) mysql> select locate('bar', 'foobarbar') as col1; +------+ | col1 | +------+ | 4 | +------+ 1 row in set (0.00 sec) mysql> select locate('', 'foobarbar') as col1; +------+ | col1 | +------+ | 1 | +------+ 1 row in set (0.00 sec) mysql> select locate('bar', 'foobarbar', -1) as col1; +------+ | col1 | +------+ | 0 | +------+ 1 row in set (0.00 sec) Anyway, I will implement as requested now.
        Hide
        deepblue.kblee KyoungBok Lee added a comment -

        Please review this.

        Show
        deepblue.kblee KyoungBok Lee added a comment - Please review this.
        Hide
        charsyam DaeMyung Kang added a comment - - edited

        KyoungBok Lee

        Please make patch using git diff with --no-prefix option

        it is not default patch type

        and upload again please

        Show
        charsyam DaeMyung Kang added a comment - - edited KyoungBok Lee Please make patch using git diff with --no-prefix option it is not default patch type and upload again please
        Hide
        deepblue.kblee KyoungBok Lee added a comment -

        Please review this again.

        DaeMyung Kang
        I missed that option.
        Thank you so much~

        Show
        deepblue.kblee KyoungBok Lee added a comment - Please review this again. DaeMyung Kang I missed that option. Thank you so much~
        Hide
        hyunsik Hyunsik Choi added a comment - - edited

        KyoungBok,

        I'm sorry for the late reply. You seems to solve your questions.

        +1 for this patch. I like the function specification that you completed. The patch looks great for me, and it includes very well written unit tests.

        Show
        hyunsik Hyunsik Choi added a comment - - edited KyoungBok, I'm sorry for the late reply. You seems to solve your questions. +1 for this patch. I like the function specification that you completed. The patch looks great for me, and it includes very well written unit tests.
        Hide
        hyunsik Hyunsik Choi added a comment - - edited

        I leave one comment. Actually, the proposed function specification is different from that of MySQL and Hive. The main reason is consistency. Most of string functions have the first parameter as an input string. However, locate function in Hive and MySQL has input string as the second parameter. Since locate function is not SQL standard function, I thought that we can redefine this function signature with more consistency.

        Show
        hyunsik Hyunsik Choi added a comment - - edited I leave one comment. Actually, the proposed function specification is different from that of MySQL and Hive. The main reason is consistency. Most of string functions have the first parameter as an input string. However, locate function in Hive and MySQL has input string as the second parameter. Since locate function is not SQL standard function, I thought that we can redefine this function signature with more consistency.
        Hide
        hyunsik Hyunsik Choi added a comment -

        committed this patch to master branch.

        The below link is the first your contribution log in Tajo repository.
        https://git-wip-us.apache.org/repos/asf?p=incubator-tajo.git;a=commit;h=3c2a6343fe9056c9e520442b744ba924704fc3dd

        Thank you for your contribution.

        Show
        hyunsik Hyunsik Choi added a comment - committed this patch to master branch. The below link is the first your contribution log in Tajo repository. https://git-wip-us.apache.org/repos/asf?p=incubator-tajo.git;a=commit;h=3c2a6343fe9056c9e520442b744ba924704fc3dd Thank you for your contribution.
        Hide
        sirpkt Keuntae Park added a comment -

        I agree with you about consistency on defining function signature.
        However, many people (including me) may have plan to migrate existing queries from Hive to Tajo.
        So, I think we need clear migration guide that includes all the redefined or renamed functon signatures.

        Show
        sirpkt Keuntae Park added a comment - I agree with you about consistency on defining function signature. However, many people (including me) may have plan to migrate existing queries from Hive to Tajo. So, I think we need clear migration guide that includes all the redefined or renamed functon signatures.
        Hide
        sirpkt Keuntae Park added a comment -

        I want to start that documentation work for my needs
        However, do I should look through all the issue descriptions or is there more efficient way to do that?

        Show
        sirpkt Keuntae Park added a comment - I want to start that documentation work for my needs However, do I should look through all the issue descriptions or is there more efficient way to do that?
        Hide
        hyunsik Hyunsik Choi added a comment -

        I thought that the number of those who already know locate function's signature are fewer than those who don't know. Probably, those who don't know
        the signature will read Tajo's guide or guess that locate function may be similar to other SQL string functions. So, I decided that redefine is better.

        In addition, Tajo also provides hive mode. It also would be helpful for some guys who want to use both systems at the same time. In hive mode, locate function can be the same to that of Hive.

        Anyway, the migration guide that you propose is very nice idea, and the guide would be very helpful for Hive users who want to start Tajo. Actually, I don't know what is efficient way, but I can suggest some important points:

        • Tajo supports ANSI SQL.
          • Almost users who use Hive already know SQL well.
          • So, the guide needs to focus on non-SQL parts, such as File Format.
            • For example, how to specify a file format of a table, or how to specify compression specification of a table.
        • Non-standard function map between HQL and Tajo SQL will be helpful for that.

        If you want to begin to write the migration guide, it would be great to make a separate jira issue.

        Show
        hyunsik Hyunsik Choi added a comment - I thought that the number of those who already know locate function's signature are fewer than those who don't know. Probably, those who don't know the signature will read Tajo's guide or guess that locate function may be similar to other SQL string functions. So, I decided that redefine is better. In addition, Tajo also provides hive mode. It also would be helpful for some guys who want to use both systems at the same time. In hive mode, locate function can be the same to that of Hive. Anyway, the migration guide that you propose is very nice idea, and the guide would be very helpful for Hive users who want to start Tajo. Actually, I don't know what is efficient way, but I can suggest some important points: Tajo supports ANSI SQL. Almost users who use Hive already know SQL well. So, the guide needs to focus on non-SQL parts, such as File Format. For example, how to specify a file format of a table, or how to specify compression specification of a table. Non-standard function map between HQL and Tajo SQL will be helpful for that. If you want to begin to write the migration guide, it would be great to make a separate jira issue.
        Hide
        hudson Hudson added a comment -

        FAILURE: Integrated in Tajo-trunk-postcommit #594 (See https://builds.apache.org/job/Tajo-trunk-postcommit/594/)
        TAJO-343: Implement locate function. (KyoungBok Lee via hyunsik) (hyunsik: https://git-wip-us.apache.org/repos/asf?p=incubator-tajo.git&a=commit&h=3c2a6343fe9056c9e520442b744ba924704fc3dd)

        • CHANGES.txt
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/TajoMaster.java
        • tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/function/TestStringOperatorsAndFunctions.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/string/Locate.java
        Show
        hudson Hudson added a comment - FAILURE: Integrated in Tajo-trunk-postcommit #594 (See https://builds.apache.org/job/Tajo-trunk-postcommit/594/ ) TAJO-343 : Implement locate function. (KyoungBok Lee via hyunsik) (hyunsik: https://git-wip-us.apache.org/repos/asf?p=incubator-tajo.git&a=commit&h=3c2a6343fe9056c9e520442b744ba924704fc3dd ) CHANGES.txt tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/TajoMaster.java tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/function/TestStringOperatorsAndFunctions.java tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/string/Locate.java
        Hide
        sirpkt Keuntae Park added a comment -

        Hyunsik Choi Thank you for the comment, I just created an issue for the documentation.

        Show
        sirpkt Keuntae Park added a comment - Hyunsik Choi Thank you for the comment, I just created an issue for the documentation.

          People

          • Assignee:
            deepblue.kblee KyoungBok Lee
            Reporter:
            hyunsik Hyunsik Choi
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development