Details

      Description

      We need to implement a system for passing and storing options at the global and session levels. Example options include the explain plan format, allowing quoted identifiers, and null handling (e.g., concat null yields null). This involves creating a notion of a user session, as previously nothing was stored per user session.

      1. Drill-381-Options-4-14-14.patch
        96 kB
        Jason Altekruse
      2. Drill-381-options-3_30_14.patch
        101 kB
        Jason Altekruse
      3. Drill-381-Options-Mar-24-14.patch
        82 kB
        Jason Altekruse
      4. 0001-start-of-storage-system-for-drill-options.patch
        61 kB
        Jason Altekruse

        Activity

        Jacques Nadeau added a comment -

        added in 08923cb

        Jason Altekruse added a comment -

        updated reviewboard patch

        Jason Altekruse added a comment -

        I attached the wrong patch file; the latest one, with 3_30 in the title, is correct.

        Jacques Nadeau added a comment -

        High level comments on initial patch:

        Can you move the reader to a new SystemStoragePlugin? You can then expose the options as a table in that storage plugin. (Register this plugin automatically rather than having the user configure it, similar to information schema.) The table columns would probably be something like [NAME, CATEGORY, DEFAULT, SESSION], where DEFAULT is the setting that a new session would get (a.k.a. global) and SESSION is the current value of the option. If you haven't set any options, the two values would be the same.

        For the options access interface, we should have the following things supported:

        QueryContext.setOption()
        QueryContext.getOption()

        FragmentContext.getOption()

        For getOption(), we should first consult the set of SESSION options. If we don't have a value there, we should use the DEFAULT option.
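
        The lookup order described above can be sketched roughly as follows. This is only an illustration: OptionResolver and its method names are hypothetical, plain in-memory maps stand in for the session store and the global defaults, and none of it is taken from the patch.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: resolve an option from SESSION values first, then DEFAULT (global). */
final class OptionResolver {
    // Stand-in for the cluster-wide DEFAULT (global) values.
    private final Map<String, String> defaults = new ConcurrentHashMap<>();
    // Stand-in for the values set within one user session.
    private final Map<String, String> session = new ConcurrentHashMap<>();

    void setDefault(String name, String value) {
        defaults.put(name, value);
    }

    void setSessionOption(String name, String value) {
        session.put(name, value);
    }

    /** Returns the session value if present, otherwise the default, otherwise empty. */
    Optional<String> getOption(String name) {
        String value = session.get(name);
        if (value == null) {
            value = defaults.get(name);
        }
        return Optional.ofNullable(value);
    }
}
```

        With this shape, a session that has not set anything sees the DEFAULT value, which matches the note that the two table columns would then be the same.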

        We should maintain the set of global options in the distributed cache across the cluster. For session options, these will be maintained on the node that owns the session. All session level options that are focused on query execution will be included in logical and physical plans.

        We should probably have a category for each option that describes what the option applies to. For example, we could have categories of:

        PARSING
        OPTIMIZATION
        EXECUTION

        An example might be:

        Session set:
        SET OPTIMIZATION PREFER_HASH_JOIN ON

        Global set:
        SET GLOBAL OPTIMIZATION PREFER_HASH_JOIN ON

        The syntax is made up here to describe the concepts.
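
        To make the shape of that made-up syntax concrete, here is a small sketch that splits a statement of the form SET [GLOBAL] <category> <name> <value> into its parts. The class and enum names are hypothetical, and the grammar is only the illustrative one from this comment, not anything Drill actually parses.

```java
import java.util.Locale;

/** The three example categories named in the comment. */
enum OptionCategory { PARSING, OPTIMIZATION, EXECUTION }

/** Hypothetical holder for a parsed "SET [GLOBAL] <category> <name> <value>" statement. */
final class SetStatement {
    final boolean global;
    final OptionCategory category;
    final String name;
    final String value;

    private SetStatement(boolean global, OptionCategory category, String name, String value) {
        this.global = global;
        this.category = category;
        this.name = name;
        this.value = value;
    }

    /** Parses the illustrative syntax; throws IllegalArgumentException on malformed input. */
    static SetStatement parse(String text) {
        String[] tokens = text.trim().split("\\s+");
        int i = 0;
        if (!tokens[i++].equalsIgnoreCase("SET")) {
            throw new IllegalArgumentException("expected SET");
        }
        // An optional GLOBAL keyword selects the global scope; otherwise session scope.
        boolean global = i < tokens.length && tokens[i].equalsIgnoreCase("GLOBAL");
        if (global) {
            i++;
        }
        if (tokens.length - i != 3) {
            throw new IllegalArgumentException("expected: SET [GLOBAL] <category> <name> <value>");
        }
        // valueOf throws IllegalArgumentException for an unknown category.
        OptionCategory category = OptionCategory.valueOf(tokens[i].toUpperCase(Locale.ROOT));
        return new SetStatement(global, category, tokens[i + 1], tokens[i + 2]);
    }
}
```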

        Jason Altekruse added a comment -

        I was hoping to get some feedback first. I was a bit torn about the format for the various value types over the wire. I knew that we would want to read these values in from some sort of config file for the initial value population when we actually make a release; right now they are just hard-coded. For this reason I reused the string representations that would be used in the config files for sending the values throughout the cluster (I used the standard toString and parsing methods for the numeric types, and a UTF-8 encoding for strings). It does mean that every time a value is received we have to call a parsing method. I wasn't sure we would ever define enough options for this to be an issue, or whether reinterpreting a raw byte stream as the appropriate Java values would give any significant performance improvement over parsing ASCII.
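
        For comparison, the two approaches being weighed look roughly like this. It is a sketch only: OptionWireFormat and its methods are hypothetical names rather than code from the patch, and the binary variant is shown for a long value alone.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of the string-based wire format, plus a binary alternative for long. */
final class OptionWireFormat {
    private OptionWireFormat() {}

    /** String approach: send the toString form of the value as UTF-8 bytes. */
    static byte[] encode(Object value) {
        return String.valueOf(value).getBytes(StandardCharsets.UTF_8);
    }

    // Each receive requires a parse call matching the option's declared type.
    static long decodeLong(byte[] bytes) {
        return Long.parseLong(new String(bytes, StandardCharsets.UTF_8));
    }

    static double decodeDouble(byte[] bytes) {
        return Double.parseDouble(new String(bytes, StandardCharsets.UTF_8));
    }

    static boolean decodeBoolean(byte[] bytes) {
        return Boolean.parseBoolean(new String(bytes, StandardCharsets.UTF_8));
    }

    static String decodeString(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    /** Binary alternative: a fixed 8-byte big-endian representation, no text parsing. */
    static byte[] encodeLongBinary(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    static long decodeLongBinary(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getLong();
    }
}
```

        The string form has the advantage that the same text can be read from a config file and sent between nodes; the binary form skips parsing but needs a separate encoding per type.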

        Timothy Chen added a comment -

        Are you hoping to merge this patch now, or wait until you have everything?

        Jason Altekruse added a comment - https://reviews.apache.org/r/18388/
        Jason Altekruse added a comment -

        The first patch posted contains a partial solution; it can handle session-level options. There is an outstanding problem with the distributed map implementation that is preventing global options from syncing values between drillbits.


          People

          • Assignee:
            Jason Altekruse
            Reporter:
            Jason Altekruse
          • Votes:
            0
            Watchers:
            4
