HIVE-3463: Add CASCADING to MySQL's InnoDB schema

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.9.0
    • Fix Version/s: None
    • Component/s: Metastore
    • Labels: None

      Description

      Cascading could help clean up dependent tables when a row referenced by a foreign key is deleted.
      http://dev.mysql.com/doc/refman/5.5/en/innodb-foreign-key-constraints.html
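
      For reference, this is the shape of constraint the linked page describes; a minimal sketch with hypothetical parent/child tables, not the actual metastore schema:

          CREATE TABLE parent (
              id INT NOT NULL,
              PRIMARY KEY (id)
          ) ENGINE=InnoDB;

          CREATE TABLE child (
              id INT NOT NULL,
              parent_id INT NOT NULL,
              PRIMARY KEY (id),
              -- InnoDB removes child rows automatically when the parent row is deleted
              FOREIGN KEY (parent_id) REFERENCES parent (id)
                  ON DELETE CASCADE
          ) ENGINE=InnoDB;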

        Activity

        Alexander Alten-Lorenz created issue -
        Alexander Alten-Lorenz added a comment -

        As Hive uses atomic transactions to perform individual operations, this causes issues during replication of the database. Hive does not have CASCADE on its foreign key constraints, so when the atomic transactions are performed, the binlog is written based on the completion time of each task rather than the time of issuance. This can reverse the statement order in the binlog, which breaks replication.
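
        To illustrate the difference, a sketch with a hypothetical SERDE_ID (SERDES and SERDE_PARAMS are existing metastore tables):

            -- Without ON DELETE CASCADE, the metastore must delete child rows itself,
            -- in separate transactions whose binlog order depends on commit time:
            DELETE FROM SERDE_PARAMS WHERE SERDE_ID = 42;
            DELETE FROM SERDES WHERE SERDE_ID = 42;

            -- With ON DELETE CASCADE on the child FK, a single statement removes
            -- parent and children, so there is no ordering to get wrong:
            DELETE FROM SERDES WHERE SERDE_ID = 42;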

        Alexander Alten-Lorenz added a comment -

        More insight:

        1. The DROP TABLE command deletes entries in different tables as individual unit transactions, so if there is a communication break or any other failure, the database could roll back one of those transactions, leaving the database in an inconsistent state.

        2. DROP TABLE is not a serialized action, which breaks replication.

        3. As CASCADE is not defined in the FK definitions during table creation, stale entries can remain under certain conditions.

        4. The database with id 511 is showing the tables of the default database, whose db_id is 1.

        5. A SELECT on any partitioned table with a LIMIT operator scans through all the partitions using unit transactions that query a single entry each. That is, if a table has 100,000 partitions, it issues 100,000 queries to the database before showing the result. Instead, it should issue one bulk query fetching all the partition details at once.

        Example:
        In our database we have 3,500 partitioned tables with a total of 950,000 partitions. Every day about 90% of the tables are hit, and 60% of the users report that queries take a long time to return. Even though the metastore-level (MySQL) query latency stays between 2 ms and 20 ms, the query time in Hive varies between a few minutes and a few hours. Even a simple USE statement takes up to 5 minutes in some cases. (I am bringing this up now because it took some time for me to confirm it.)

        6. Deadlocks occur on Hive tables during normal operations, and quite a number of rollbacks happen on the database side. What I noticed is a transaction serialization issue: a DELETE is issued before an UPDATE (see below for the detail on the deadlock). Since upgrading to MySQL 5.5 we are seeing it less frequently.

        One note: cascading for InnoDB is a great feature, especially since Hive doesn't support MySQL's binlog replication (transaction isolation level READ-COMMITTED in InnoDB is not safe for binlog mode STATEMENT). Digging a bit deeper, that could be avoided by executing SET GLOBAL binlog_format = 'ROW'; inside MySQL, but this needs to be confirmed first.

        I am raising the priority of this feature request.
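
        For reference, the binlog workaround mentioned above boils down to the following MySQL statements (still to be confirmed, as noted):

            -- Check the current settings:
            SHOW VARIABLES LIKE 'binlog_format';
            SELECT @@global.tx_isolation;

            -- Row-based logging is safe with READ-COMMITTED, unlike STATEMENT mode:
            SET GLOBAL binlog_format = 'ROW';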

        Alexander Alten-Lorenz made changes -
        Priority: Minor [ 4 ] → Major [ 3 ]
        Ashutosh Chauhan added a comment -

        Alexander,
        Good findings! Couple of comments:

        • I wonder about the same issues for the other backend DBs we support (Postgres/Derby/Oracle). I assume you didn't look into those, but I was wondering if you have any insight there.
        • Wherever possible we should drive the schema definitions from the DataNucleus ORM mapping (which lives in package.jdo) and avoid directly manipulating the schemas of individual DBs (in the scripts/ dir). Obviously, if DataNucleus doesn't expose a feature of the underlying DB that we need, we won't have a choice.
        Carl Steinbach made changes -
        Assignee: Alexander Alten-Lorenz [ alo.alt ]
        Carl Steinbach added a comment -

        I strongly second Ashutosh's second point. It looks like DN does support this feature. Check out the 'delete-action' attribute for the 'field' element described here:

        http://www.datanucleus.org/products/datanucleus/jdo/metadata_xml.html#field
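
        A rough sketch of what that could look like in package.jdo, based on the page linked above; the field and column names here are illustrative, not the actual metastore mapping:

            <field name="parameters" table="SERDE_PARAMS" delete-action="cascade">
              <join>
                <column name="SERDE_ID"/>
              </join>
            </field>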

        Alexander Alten-Lorenz added a comment -

        We have to decide for which tables we need cascading in order to create the appropriate schema.
        Example for SERDE (not really tested, only a design pattern):

        -- Split into a parent table and a child parameters table so the FK
        -- has a real target; VARCHAR lengths are required by MySQL.
        CREATE TABLE SERDES (
            SERDE_ID BIGINT NOT NULL,
            NAME VARCHAR(128) NOT NULL,
            SLIB VARCHAR(4000),
            PRIMARY KEY (SERDE_ID)
        ) ENGINE=InnoDB;

        CREATE TABLE SERDE_PARAMS (
            SERDE_ID BIGINT NOT NULL,
            PARAM_KEY VARCHAR(256) NOT NULL,
            PARAM_VALUE VARCHAR(4000),
            PRIMARY KEY (SERDE_ID, PARAM_KEY),
            FOREIGN KEY (SERDE_ID) REFERENCES SERDES (SERDE_ID)
                ON DELETE CASCADE
                ON UPDATE CASCADE
        ) ENGINE=InnoDB;
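
        If the sketch above holds, the cascade can be sanity-checked with illustrative values like these:

            INSERT INTO SERDES VALUES (1, 'test_serde', 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe');
            INSERT INTO SERDE_PARAMS VALUES (1, 'field.delim', ',');
            DELETE FROM SERDES WHERE SERDE_ID = 1;
            -- The parameter row is gone as well:
            SELECT COUNT(*) FROM SERDE_PARAMS WHERE SERDE_ID = 1;  -- returns 0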
        

        We need to know:

        • which tables and rows need to be in a relationship
        • where it makes sense to define a CASCADE
        • whether the schemas we deliver are good enough to implement bulk query fetching

        Thanks,
        Alex

        Carl Steinbach added a comment -

        @Alex: Are you suggesting that we fix this problem by modifying the schema DDL scripts?

        Alexander Alten-Lorenz added a comment -

        I think so, since the schema describes the design and DN only maps onto it. I am doing some tests at the moment.

        Carl Steinbach added a comment -

        @Alex: I think we need to investigate whether or not this can be fixed by modifying the package.jdo file first. Modifying the schema DDL scripts should be a last resort.

        Alexander Alten-Lorenz added a comment -

        Yes, we have to patch the upgrade scripts as well as the JDO package (since the DB will be created if not present). But we have to know for which relations we need to implement cascading. Any thoughts?

        Carl Steinbach added a comment -

        I'm opposed to only modifying the schema scripts (without updating JDO) unless we can prove that DN doesn't support cascading deletes via JDO. The relations that we need to tag with cascading deletes can be inferred by mapping the schema back to the original JDO file.

        Harsh J added a comment -

        DN does have cascading support: http://www.datanucleus.org/products/datanucleus/jdo/orm/cascading.html
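
        If those docs are read right, JDO-level cascading is expressed with the dependent flags rather than a database FK action; an illustrative mapping (not the actual metastore classes):

            <class name="Parent">
              <field name="children" persistence-modifier="persistent">
                <!-- dependent-element="true" tells DN to delete the elements
                     when the owning Parent is deleted -->
                <collection element-type="Child" dependent-element="true"/>
              </field>
            </class>

        Note this cascades in the ORM layer at runtime, whereas delete-action pushes the cascade into the database FK itself.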
        Alexander Alten-Lorenz made changes -
        Assignee: Alexander Alten-Lorenz [ alo.alt ] → Unassigned
        Alexander Alten-Lorenz added a comment -

        Since the switch to a new DN version requires substantial work, I am going ahead and marking this as Unassigned.


  People

  • Assignee: Unassigned
  • Reporter: Alexander Alten-Lorenz
  • Votes: 0
  • Watchers: 6
