Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-13523

Fix connection leak in ORC RecordReader and refactor for unit testing

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 2.0.0
    • 2.0.1, 2.1.0
    • ORC
    • None

    Description

      In RecordReaderImpl, a MetadataReaderImpl object was being created (opening a file), but never closed, causing a leak. This change closes the Metadata object in RecordReaderImpl, and does substantial refactoring to make RecordReaderImpl testable:

      • Created DataReaderFactory and MetadataReaderFactory (plus default implementations) so that the create() methods can be mocked to verify that the objects are actually closed in RecordReaderImpl.close()
      • Created MetadataReaderProperties and DataReaderProperties to clean up argument lists, making code more readable
      • Created a builder() for RecordReaderImpl to make the code more readable
      • DataReader and MetadataReader now extend closeable (there was no reason for them not to in the first place) so I can use the guava Closer interface: http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/io/Closer.html
      • Use the Closer interface to guarantee that regardless of if either close() call fails, both will be attempted (preventing further potential leaks)
      • Create builders for MetadataReaderProperties, DataReaderProperties, and RecordReaderImpl to help with code readability

      Attachments

        1. HIVE-13523.patch
          42 kB
          Thomas Poepping

        Activity

          LGTM, +1. Pending tests.

          prasanth_j Prasanth Jayachandran added a comment - LGTM, +1. Pending tests.

          Review board submission created: https://reviews.apache.org/r/46237/

          poeppt Thomas Poepping added a comment - Review board submission created: https://reviews.apache.org/r/46237/
          hiveqa Hive QA added a comment -

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12798852/HIVE-13523.patch

          SUCCESS: +1 due to 3 test(s) being added or modified.

          ERROR: -1 due to 45 failed/errored test(s), 9904 tests executed
          Failed tests:

          TestMiniTezCliDriver-dynpart_sort_optimization2.q-cte_mat_1.q-tez_bmj_schema_evolution.q-and-12-more - did not produce a TEST-*.xml file
          TestMiniTezCliDriver-schema_evol_text_nonvec_mapwork_table.q-vector_left_outer_join2.q-vector_outer_join5.q-and-12-more - did not produce a TEST-*.xml file
          TestMiniTezCliDriver-update_orig_table.q-vectorization_13.q-mapreduce2.q-and-12-more - did not produce a TEST-*.xml file
          TestSparkClient - did not produce a TEST-*.xml file
          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_coltype_literals
          org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
          org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testAddPartitions
          org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testFetchingPartitionsWithDifferentSchemas
          org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
          org.apache.hadoop.hive.metastore.hbase.TestHBaseImport.org.apache.hadoop.hive.metastore.hbase.TestHBaseImport
          org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.concurrencyFalse
          org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testDDLExclusive
          org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testDDLNoLock
          org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testDDLShared
          org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testDelete
          org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testLockTimeout
          org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testRollback
          org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testSingleReadPartition
          org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testSingleReadTable
          org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testSingleWriteTable
          org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testUpdate
          org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testWriteDynamicPartition
          org.apache.hadoop.hive.ql.security.TestClientSideAuthorizationProvider.testSimplePrivileges
          org.apache.hadoop.hive.ql.security.TestExtendedAcls.org.apache.hadoop.hive.ql.security.TestExtendedAcls
          org.apache.hadoop.hive.ql.security.TestFolderPermissions.org.apache.hadoop.hive.ql.security.TestFolderPermissions
          org.apache.hadoop.hive.ql.security.TestMetastoreAuthorizationProvider.testSimplePrivileges
          org.apache.hadoop.hive.ql.security.TestMultiAuthorizationPreEventListener.org.apache.hadoop.hive.ql.security.TestMultiAuthorizationPreEventListener
          org.apache.hadoop.hive.ql.security.TestStorageBasedClientSideAuthorizationProvider.testSimplePrivileges
          org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropDatabase
          org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropPartition
          org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropTable
          org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProvider.testSimplePrivileges
          org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProviderWithACL.testSimplePrivileges
          org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbFailure
          org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbSuccess
          org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableFailure
          org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableSuccess
          org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableSuccessWithReadOnly
          org.apache.hadoop.hive.thrift.TestHadoopAuthBridge23.testDelegationTokenSharedStore
          org.apache.hadoop.hive.thrift.TestHadoopAuthBridge23.testMetastoreProxyUser
          org.apache.hadoop.hive.thrift.TestHadoopAuthBridge23.testSaslWithHiveMetaStore
          org.apache.hive.hcatalog.api.repl.commands.TestCommands.org.apache.hive.hcatalog.api.repl.commands.TestCommands
          org.apache.hive.hcatalog.listener.TestDbNotificationListener.dropTable
          org.apache.hive.minikdc.TestJdbcWithDBTokenStore.org.apache.hive.minikdc.TestJdbcWithDBTokenStore
          org.apache.hive.service.TestHS2ImpersonationWithRemoteMS.org.apache.hive.service.TestHS2ImpersonationWithRemoteMS
          

          Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7629/testReport
          Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7629/console
          Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7629/

          Messages:

          Executing org.apache.hive.ptest.execution.TestCheckPhase
          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests exited with: TestsFailedException: 45 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12798852 - PreCommit-HIVE-TRUNK-Build

          hiveqa Hive QA added a comment - Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12798852/HIVE-13523.patch SUCCESS: +1 due to 3 test(s) being added or modified. ERROR: -1 due to 45 failed/errored test(s), 9904 tests executed Failed tests: TestMiniTezCliDriver-dynpart_sort_optimization2.q-cte_mat_1.q-tez_bmj_schema_evolution.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-schema_evol_text_nonvec_mapwork_table.q-vector_left_outer_join2.q-vector_outer_join5.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-update_orig_table.q-vectorization_13.q-mapreduce2.q-and-12-more - did not produce a TEST-*.xml file TestSparkClient - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_coltype_literals org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3 org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testAddPartitions org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testFetchingPartitionsWithDifferentSchemas org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping org.apache.hadoop.hive.metastore.hbase.TestHBaseImport.org.apache.hadoop.hive.metastore.hbase.TestHBaseImport org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.concurrencyFalse org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testDDLExclusive org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testDDLNoLock org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testDDLShared org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testDelete org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testLockTimeout org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testRollback org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testSingleReadPartition org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testSingleReadTable org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testSingleWriteTable org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testUpdate org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testWriteDynamicPartition org.apache.hadoop.hive.ql.security.TestClientSideAuthorizationProvider.testSimplePrivileges org.apache.hadoop.hive.ql.security.TestExtendedAcls.org.apache.hadoop.hive.ql.security.TestExtendedAcls org.apache.hadoop.hive.ql.security.TestFolderPermissions.org.apache.hadoop.hive.ql.security.TestFolderPermissions org.apache.hadoop.hive.ql.security.TestMetastoreAuthorizationProvider.testSimplePrivileges org.apache.hadoop.hive.ql.security.TestMultiAuthorizationPreEventListener.org.apache.hadoop.hive.ql.security.TestMultiAuthorizationPreEventListener org.apache.hadoop.hive.ql.security.TestStorageBasedClientSideAuthorizationProvider.testSimplePrivileges org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropDatabase org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropPartition org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropTable org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProvider.testSimplePrivileges org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProviderWithACL.testSimplePrivileges org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbFailure org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbSuccess org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableFailure org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableSuccess org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableSuccessWithReadOnly org.apache.hadoop.hive.thrift.TestHadoopAuthBridge23.testDelegationTokenSharedStore org.apache.hadoop.hive.thrift.TestHadoopAuthBridge23.testMetastoreProxyUser org.apache.hadoop.hive.thrift.TestHadoopAuthBridge23.testSaslWithHiveMetaStore org.apache.hive.hcatalog.api.repl.commands.TestCommands.org.apache.hive.hcatalog.api.repl.commands.TestCommands org.apache.hive.hcatalog.listener.TestDbNotificationListener.dropTable org.apache.hive.minikdc.TestJdbcWithDBTokenStore.org.apache.hive.minikdc.TestJdbcWithDBTokenStore org.apache.hive.service.TestHS2ImpersonationWithRemoteMS.org.apache.hive.service.TestHS2ImpersonationWithRemoteMS Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7629/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7629/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7629/ Messages: Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 45 tests failed This message is automatically generated. ATTACHMENT ID: 12798852 - PreCommit-HIVE-TRUNK-Build

          poeppt Although these test failures does not look related to me. Can you run these tests and confirm if these are related?

          prasanth_j Prasanth Jayachandran added a comment - poeppt Although these test failures does not look related to me. Can you run these tests and confirm if these are related?

          prasanth_j

          tl;dr on my local environment, every test that was reported to fail here succeeded ; both with and without my change. And I agree that none of these tests seem related. What's our next step?

          Of the 42 tests that failed, around half (20) were because of "connection refused", which is a failure that we've been seeing consistently in the automatic test runs. However, to be sure, I'll run each failed test with and without my change

          TestDbTxnManager succeeds for me both with and without my change
          TestClientSideAuthorizationProvider succeeds
          TestCommands succeeds
          TestDbNotificationListener succeeds
          TestExtendedAcls succeeds
          TestFolderPermissions succeeds
          TestHBaseImport succeeds
          TestHiveMetaStorePartitionSpecs succeeds
          TestJdbcWithDBTokenStore succeeds
          TestMetastoreAuthorizationProvider succeeds
          TestStorageBasedClientSideAuthorizationProvider succeeds
          TestStorageBasedMetastoreAuthorizationDrops succeeds
          TestStorageBasedMetastoreAuthorizationProvider succeeds
          TestStorageBasedMetastoreAuthorizationReads succeeds
          TestHadoopAuthBridge23 succeeds
          TestHS2ImpersonationWithRemoteMS succeeds
          TestMultiAuthorizationPreEventListener succeeds
          TestStorageBasedMetastoreAuthorizationProviderWithACL succeeds
          TestCliDriver with partition_coltype_literals.q succeeds
          TestCliDriver with index_bitmap3.q succeeds

          poeppt Thomas Poepping added a comment - prasanth_j tl;dr on my local environment, every test that was reported to fail here succeeded ; both with and without my change. And I agree that none of these tests seem related. What's our next step? Of the 42 tests that failed, around half (20) were because of "connection refused", which is a failure that we've been seeing consistently in the automatic test runs. However, to be sure, I'll run each failed test with and without my change TestDbTxnManager succeeds for me both with and without my change TestClientSideAuthorizationProvider succeeds TestCommands succeeds TestDbNotificationListener succeeds TestExtendedAcls succeeds TestFolderPermissions succeeds TestHBaseImport succeeds TestHiveMetaStorePartitionSpecs succeeds TestJdbcWithDBTokenStore succeeds TestMetastoreAuthorizationProvider succeeds TestStorageBasedClientSideAuthorizationProvider succeeds TestStorageBasedMetastoreAuthorizationDrops succeeds TestStorageBasedMetastoreAuthorizationProvider succeeds TestStorageBasedMetastoreAuthorizationReads succeeds TestHadoopAuthBridge23 succeeds TestHS2ImpersonationWithRemoteMS succeeds TestMultiAuthorizationPreEventListener succeeds TestStorageBasedMetastoreAuthorizationProviderWithACL succeeds TestCliDriver with partition_coltype_literals.q succeeds TestCliDriver with index_bitmap3.q succeeds

          poeppt Thanks for your analysis. That's what I suspected too. I will run these tests at my end too and if nothing suspicious I will go ahead and commit it.

          prasanth_j Prasanth Jayachandran added a comment - poeppt Thanks for your analysis. That's what I suspected too. I will run these tests at my end too and if nothing suspicious I will go ahead and commit it.

          Thanks poeppt for the contribution! I ran the tests locally and it ran fine. So went ahead and committed the patch to master and branch-2.0.

          prasanth_j Prasanth Jayachandran added a comment - Thanks poeppt for the contribution! I ran the tests locally and it ran fine. So went ahead and committed the patch to master and branch-2.0.
          omalley Owen O'Malley added a comment -

          Why did you make a builder for RecordReaderImpl, which is an internal class rather than just passing down the options object from the ReaderImpl?

          omalley Owen O'Malley added a comment - Why did you make a builder for RecordReaderImpl, which is an internal class rather than just passing down the options object from the ReaderImpl?

          I do pass the options object down from ReaderImpl. I could have kept RecordReaderImpl as a constructor with 11 arguments, but I thought that a builder would be more readable. RecordReaderImpl needs more than just the Options object passed from ReaderImpl, like fileSystem, path, codec, etc.

          If you mean that I could have passed a properties object for RecordReaderImpl from ReaderImpl, I could have created a properties object for RecordReaderImpl (the way I did for DataReader and MetadataReader) but I thought that a builder made more sense and was easier to use.

          poeppt Thomas Poepping added a comment - I do pass the options object down from ReaderImpl. I could have kept RecordReaderImpl as a constructor with 11 arguments, but I thought that a builder would be more readable. RecordReaderImpl needs more than just the Options object passed from ReaderImpl, like fileSystem, path, codec, etc. If you mean that I could have passed a properties object for RecordReaderImpl from ReaderImpl, I could have created a properties object for RecordReaderImpl (the way I did for DataReader and MetadataReader) but I thought that a builder made more sense and was easier to use.
          omalley Owen O'Malley added a comment -

          I've redone this in my HIVE-12159 patch. The RecordReaderImpl now takes ReaderImpl and Reader.Options. It is simple and direct.

          omalley Owen O'Malley added a comment - I've redone this in my HIVE-12159 patch. The RecordReaderImpl now takes ReaderImpl and Reader.Options. It is simple and direct.

          People

            poeppt Thomas Poepping
            poeppt Thomas Poepping
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: