XMLWordPrintableJSON

Details

    • Sub-task
    • Status: Open
    • Major
    • Resolution: Unresolved
    • None
    • None
    • None
    • None

    Description

      From the review feedback, there was a comment about validating URL specified in the connector URL when it is created. Currently, there is no validation except for checking for empty/null value. This is by-design and the desired behavior, IMHO. But filing this to be discussed with wider audience.

      I tried creating a connector without the mysql JDBC URL specified properly and it went through,
      
      please see below,
      
      CREATE CONNECTOR mysql_test_2
      TYPE 'mysql'
      URL 'jdbc://'
      COMMENT 'test connector'
      WITH DCPROPERTIES (
      "hive.sql.dbcp.username"="hive1",
      "hive.sql.dbcp.password"="hive1");
      
      CREATE CONNECTOR mysql_test_3
      TYPE 'mysql'
      URL 'jdbc:derby://nightly1.apache.org:3306/hive1'
      COMMENT 'test connector'
      WITH DCPROPERTIES (
      "hive.sql.dbcp.username"="hive1",
      "hive.sql.dbcp.password"="hive1");
      
      I am not saying they are wrong, but we should probably call this out in the documentation. Document that URLs are not verified.
      
      Another thing I noticed is that the password is displayed in plain
      text on the command line. This used be considered a security problem
      in a product I worked in a past life. But I notice that an external
      table can be created with this semantics. I guess it is acceptable
      here.
      
      It is also stored in plain text in the metastore, please see below,
      
      CREATE TABLE DATACONNECTOR_PARAMS (
      NAME VARCHAR(128) NOT NULL,
      PARAM_KEY VARCHAR(180) NOT NULL,
      PARAM_VALUE VARCHAR(4000),
      PRIMARY KEY (NAME, PARAM_KEY),
      CONSTRAINT DATACONNECTOR_NAME_FK1 FOREIGN KEY (NAME) REFERENCES DATACONNECTORS (NAME) ON DELETE CASCADE
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
      
      Again I am not saying this is a problem, but I thought I can call this out to you.
      
       
      @nrg4878 nrg4878 24 minutes ago Author Member
      We check for null/empty values for URL. We error out in those cases. Other than that, any non-empty value is accepted. I dont think we should check for correctness of the URL or even can for that matter.
      a) The URL is meant to be a freeform value against dozens of datasource types (mysql, postgres, hive, AWS Glue, Redshift etc). For each such source type, there could be dozens of variations of the url (includes properties and other params specific to the source). So I dont think we can meaningfully detect incorrect URLs.
      For example, MySQL though the URL might look fine syntactically, we cannot confirm dbName1 or dbName2 exist without actually attempting to connect to the DB.
      jdbc:mysql://:3306/
      jdbc:mysql://:3306/
      b) The format for the URLs could be changing overtime as well. It is unnecessary burden for maintaining new formats in hive. We want to be able to plugin a new datasource type by simply adding a provider.
      
      c) To be able to validate the URL, we have to establish the connection to the datasource at the time of creation. We are trying to delay making that connection as long as possible. When actual show tables is called. We avoid using up extra resources and leak connections.
      
      d) Users can do "create connector" .. followed by "alter connector set url". So any incorrect URLS can be modified using alter. Also in this case, we would be checking the URL twice. Better to have the onus of configuring it correctly on the end user.
      
      Passwords can be secured using jceks files as described in the "Securing Password" section of the doc below.
      https://cwiki.apache.org/confluence/display/Hive/JDBC+Storage+Handler
      So users have an option of using non-CTVs
      

      Attachments

        Activity

          People

            ngangam Naveen Gangam
            ngangam Naveen Gangam
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated: