Derby
  1. Derby
  2. DERBY-6136

Create a custom/optional tool for dumping the data in a corrupted database.

    Details

    • Type: Improvement Improvement
    • Status: Open
    • Priority: Major Major
    • Resolution: Unresolved
    • Affects Version/s: 10.11.0.0
    • Fix Version/s: None
    • Component/s: Tools
    • Labels:
      None
    • Urgency:
      Normal

      Description

      It would be useful to have a tool for dumping the data in a corrupted database. This could start out as a custom tool. After we debug the tool and get some experience with it, we can consider promoting it to be a (possibly undocumented) optional tool which we ship with the product. I think the tool should have the following behavior:

      1) The tool should not subvert the security of the corrupted database. If the corrupted database is password-protected, then you would need to present its DBO's credentials in order to use the tool. Naturally, an encryption key would have to be presented in order to decode an encrypted database.

      2) The tool should not stop reading a table when it hits a corrupt record. Instead, the tool should soldier on and collect a list of warnings on bad records.

      Such a tool would be useful in situations where some part of a heap table is corrupt but the following heap conglomerates are intact:

      i) SYSSCHEMAS
      ii) SYSTABLES
      iii) SYSCONGLOMERATES
      iv) SYSCOLUMNS
      v) property conglomerate

      Such a tool would be useful for some situations where data can't be dumped even after you delete the log files in order to short-circuit recovery.

      1. DataFileVTI.java
        59 kB
        Rick Hillegas
      2. DataFileVTI.java
        59 kB
        Rick Hillegas
      3. DataFileVTI.java
        63 kB
        Rick Hillegas
      4. DataFileVTI.java
        58 kB
        Rick Hillegas
      5. DataFileVTI.java
        57 kB
        Rick Hillegas
      6. dataFileVTI.sql
        4 kB
        Rick Hillegas
      7. RawDBReader.java
        22 kB
        Rick Hillegas
      8. RawDBReader.java
        22 kB
        Rick Hillegas

        Issue Links

          Activity

          Hide
          Rick Hillegas added a comment -

          Attaching a new version of DataFileVTI which preserves triggering Throwables in the SQLWarnings passed back to the user.

          Show
          Rick Hillegas added a comment - Attaching a new version of DataFileVTI which preserves triggering Throwables in the SQLWarnings passed back to the user.
          Hide
          Rick Hillegas added a comment -

          Attaching a new version of DataFileVTI. This version makes the following changes:

          1) Skips records which are marked as deleted.

          2) Turns errors into SQLWarnings so that problems will still be reported as reading continues. The warnings percolate up to the user only if you are using a codeline improved by derby-6151-01-aa-passBackWarnings.diff.

          3) Removed dead code and reorganized to improve readability.

          I have run DataFileVTI successfully against corrupt heaps. That is, against heaps which spew hex dumps when you try to select from them directly. Here is an ij example of using DataFileVTI to read a corrupt heap:

          call syscs_util.syscs_register_tool
          (
          'customTool',
          true,
          'RawDBReader',
          'CONTROL11',
          'RAW11_',
          '/Users/rh161140/derby/mainline/corruptEncryptedDB',
          'bootPassword=mysecretpassword',
          'APP',
          null
          );
          0 rows inserted/updated/deleted
          ij(CONNECTION2)> select * from raw11_app.t1 where a > 37 and a < 50;
          A |B |C
          -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
          38 |abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrs&|abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrs&
          39 |abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrs&|abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrs&
          48 |abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrs&|abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrs&
          WARNING (no SQLState): Error reading field data. Offset = 366, length = 290, datatype = VARCHAR(1000): Field 2 in record 0 on page 5 in file c490.dat: java.io.UTFDataFormatException: null
          WARNING (no SQLState): Error reading field data. Offset = 671, length = 1823514738, datatype = VARCHAR(1000): Field 1 in record 1 on page 5 in file c490.dat: java.io.IOException: Negative seek offset
          WARNING (no SQLState): Error reading field data. Offset = 685, length = 36, datatype = VARCHAR(1000): Field 2 in record 1 on page 5 in file c490.dat: java.io.EOFException: null
          WARNING (no SQLState): java.lang.ArrayIndexOutOfBoundsException: null
          49 |abcdefghijklmnopqrstuvwxyz0123456789 |abcdefghijklmnopqrstuvwxyz0123456789

          4 rows selected

          I think that these tools are ready for experimental use by others. Please tell me about the problems you find, so that I can improve these tools. Thanks.

          Show
          Rick Hillegas added a comment - Attaching a new version of DataFileVTI. This version makes the following changes: 1) Skips records which are marked as deleted. 2) Turns errors into SQLWarnings so that problems will still be reported as reading continues. The warnings percolate up to the user only if you are using a codeline improved by derby-6151-01-aa-passBackWarnings.diff. 3) Removed dead code and reorganized to improve readability. I have run DataFileVTI successfully against corrupt heaps. That is, against heaps which spew hex dumps when you try to select from them directly. Here is an ij example of using DataFileVTI to read a corrupt heap: call syscs_util.syscs_register_tool ( 'customTool', true, 'RawDBReader', 'CONTROL11', 'RAW11_', '/Users/rh161140/derby/mainline/corruptEncryptedDB', 'bootPassword=mysecretpassword', 'APP', null ); 0 rows inserted/updated/deleted ij(CONNECTION2)> select * from raw11_app.t1 where a > 37 and a < 50; A |B |C ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 38 |abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrs&|abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrs& 39 |abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrs&|abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrs& 48 |abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrs&|abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrs& WARNING (no SQLState): Error reading field data. Offset = 366, length = 290, datatype = VARCHAR(1000): Field 2 in record 0 on page 5 in file c490.dat: java.io.UTFDataFormatException: null WARNING (no SQLState): Error reading field data. Offset = 671, length = 1823514738, datatype = VARCHAR(1000): Field 1 in record 1 on page 5 in file c490.dat: java.io.IOException: Negative seek offset WARNING (no SQLState): Error reading field data. Offset = 685, length = 36, datatype = VARCHAR(1000): Field 2 in record 1 on page 5 in file c490.dat: java.io.EOFException: null WARNING (no SQLState): java.lang.ArrayIndexOutOfBoundsException: null 49 |abcdefghijklmnopqrstuvwxyz0123456789 |abcdefghijklmnopqrstuvwxyz0123456789 4 rows selected I think that these tools are ready for experimental use by others. Please tell me about the problems you find, so that I can improve these tools. Thanks.
          Hide
          Rick Hillegas added a comment -

          Attaching a new version of DataFileVTI. This version reads heaps with long rows which spill across multiple pages.

          Show
          Rick Hillegas added a comment - Attaching a new version of DataFileVTI. This version reads heaps with long rows which spill across multiple pages.
          Hide
          Rick Hillegas added a comment -

          Attaching new versions of DataFileVTI and RawDBReader. These versions change the bootPassword argument into a more general encryptionAttributes argument. I have tested these new versions with a database encrypted with an explicit encryption key and with a database encrypted with a bootpassword. Here are examples of both usages:

          call syscs_util.syscs_register_tool
          (
          'customTool',
          true,
          'RawDBReader',
          'CONTROL6',
          'RAW6_',
          '/Users/rh161140/derby/mainline/encryptedDB',
          'encryptionKey=abcd1234efab5678',
          'APP',
          null
          );

          call syscs_util.syscs_register_tool
          (
          'customTool',
          true,
          'RawDBReader',
          'CONTROL7',
          'RAW7_',
          '/Users/rh161140/derby/mainline/bootpasswordDB',
          'bootPassword=mysecretpassword',
          'APP',
          null
          );

          Show
          Rick Hillegas added a comment - Attaching new versions of DataFileVTI and RawDBReader. These versions change the bootPassword argument into a more general encryptionAttributes argument. I have tested these new versions with a database encrypted with an explicit encryption key and with a database encrypted with a bootpassword. Here are examples of both usages: call syscs_util.syscs_register_tool ( 'customTool', true, 'RawDBReader', 'CONTROL6', 'RAW6_', '/Users/rh161140/derby/mainline/encryptedDB', 'encryptionKey=abcd1234efab5678', 'APP', null ); call syscs_util.syscs_register_tool ( 'customTool', true, 'RawDBReader', 'CONTROL7', 'RAW7_', '/Users/rh161140/derby/mainline/bootpasswordDB', 'bootPassword=mysecretpassword', 'APP', null );
          Hide
          Rick Hillegas added a comment -

          Attaching the first rev of RawDBReader, a custom tool for dumping the data in a corrupt database. This tool makes use of DataFileVTI. The tool creates schemas, table functions, and wrapper views for reading all user tables in a corrupt database. To use the tool you must specify the bootpassword (if applicable) and the credentials of the owner of the corrupt database. The tool takes the following arguments:

          o The name of a control schema to create in your local database. The control schema will hold views against the core catalogs in t e corrupt database.

          o A prefix to prepend to the names of user schemas. For each schema in the corrupt database, the tool creates a corresponding lo al schema with this prefix prepended to its name.

          o The location of the corrupt database directory. That is, the directory which holds its service.properties file.

          o The bootpassword of the corrupt database.

          o The username of the owner of the corrupt database.

          o The password of the owner of the corrupt database.

          You load the tool like this:

          call syscs_util.syscs_register_tool
          (
          'customTool',
          true,
          'RawDBReader',
          'CONTROL',
          'RAW_',
          '/Users/corruptdbs/badDB',
          null,
          'FRED_THE_DBO',
          'fredspassword'
          );

          If the corrupt database has a table named T1 in schema S1, you can dump its data like this:

          select * from RAW_S1.T1;

          When you are done dumping the data, you can unload the tool as follows. This will drop all of the schema objects which were created when the tool was loaded:

          call syscs_util.syscs_register_tool
          (
          'customTool',
          false,
          'RawDBReader',
          'CONTROL',
          'RAW_'
          );

          Show
          Rick Hillegas added a comment - Attaching the first rev of RawDBReader, a custom tool for dumping the data in a corrupt database. This tool makes use of DataFileVTI. The tool creates schemas, table functions, and wrapper views for reading all user tables in a corrupt database. To use the tool you must specify the bootpassword (if applicable) and the credentials of the owner of the corrupt database. The tool takes the following arguments: o The name of a control schema to create in your local database. The control schema will hold views against the core catalogs in t e corrupt database. o A prefix to prepend to the names of user schemas. For each schema in the corrupt database, the tool creates a corresponding lo al schema with this prefix prepended to its name. o The location of the corrupt database directory. That is, the directory which holds its service.properties file. o The bootpassword of the corrupt database. o The username of the owner of the corrupt database. o The password of the owner of the corrupt database. You load the tool like this: call syscs_util.syscs_register_tool ( 'customTool', true, 'RawDBReader', 'CONTROL', 'RAW_', '/Users/corruptdbs/badDB', null, 'FRED_THE_DBO', 'fredspassword' ); If the corrupt database has a table named T1 in schema S1, you can dump its data like this: select * from RAW_S1.T1; When you are done dumping the data, you can unload the tool as follows. This will drop all of the schema objects which were created when the tool was loaded: call syscs_util.syscs_register_tool ( 'customTool', false, 'RawDBReader', 'CONTROL', 'RAW_' );
          Hide
          Rick Hillegas added a comment -

          Attaching the first rev of DataFileVTI, a table function which reads a heap conglomerate. Also attaching dataFileVTI.sql, a script which shows how to use this table function to read the contents of SYSSCHEMAS, SYSTABLES, SYSCONGLOMERATES, and SYSCOLUMNS. There is more work to be done on this table function, but I think that it's worth getting community feedback on this early version.

          The table function takes the following arguments:

          (
          databaseDirectoryName varchar( 32672 ),
          heapConglomerateFileName varchar( 32672 ),
          tableSignature varchar( 32672 ),
          bootpassword varchar( 32672 ),
          dboName varchar( 32672 ),
          dboPassword varchar( 32672 )
          )

          ...and requires that you declare the shape of the rows in the target heap. Here, for instance, is how you would declare a table function to read SYS.SYSSCHEMAS in a corrupt database:

          create function sysschemas
          (
          databaseDirectoryName varchar( 32672 ),
          dataFileName varchar( 32672 ),
          tableSignature varchar( 32672 ),
          bootpassword varchar( 32672 ),
          userName varchar( 32672 ),
          password varchar( 32672 )
          )
          returns table
          ( schemaID char(36), schemaname varchar(128), authorizationid varchar(128) )
          language java
          parameter style derby_jdbc_result_set
          no sql
          external name 'DataFileVTI.dataFileVTI';

          Typically, I think that you would wrap the table function in a view in order to hide the messy arguments:

          create view sysschemas
          as select * from table
          (
          sysschemas
          (
          '/Users/rh161140/derby/mainline/db1',
          'cc0.dat',
          '( schemaID char(36), schemaname varchar(128), authorizationid varchar(128) )',
          null,
          'APP',
          null
          )
          ) t;

          Then you would dump the data in the heap by selecting from the view:

          select * from sysschemas;

          You will get an error if the credentials or bootpassword are invalid or if the user name is not the name of the owner of the corrupt database. I have tested this logic when the following authentication schemes are specified by database-only properties in the corrupt database:

          a) NATIVE
          b) BUILTIN
          c) LDAP
          d) custom

          I am aware of the following defects:

          I) The tool will probably fail to run against really old databases which use BUILTIN authentication or password substitution.

          II) The tool will fail to read rows which spill across multiple heap pages.

          III) I have not stressed the unencryption argument yet. The logic is merely based on what's done in the DataFileReader program attached to DERBY-5201. I'm sure it doesn't handle the difference between bootpasswords and encryption keys.

          Show
          Rick Hillegas added a comment - Attaching the first rev of DataFileVTI, a table function which reads a heap conglomerate. Also attaching dataFileVTI.sql, a script which shows how to use this table function to read the contents of SYSSCHEMAS, SYSTABLES, SYSCONGLOMERATES, and SYSCOLUMNS. There is more work to be done on this table function, but I think that it's worth getting community feedback on this early version. The table function takes the following arguments: ( databaseDirectoryName varchar( 32672 ), heapConglomerateFileName varchar( 32672 ), tableSignature varchar( 32672 ), bootpassword varchar( 32672 ), dboName varchar( 32672 ), dboPassword varchar( 32672 ) ) ...and requires that you declare the shape of the rows in the target heap. Here, for instance, is how you would declare a table function to read SYS.SYSSCHEMAS in a corrupt database: create function sysschemas ( databaseDirectoryName varchar( 32672 ), dataFileName varchar( 32672 ), tableSignature varchar( 32672 ), bootpassword varchar( 32672 ), userName varchar( 32672 ), password varchar( 32672 ) ) returns table ( schemaID char(36), schemaname varchar(128), authorizationid varchar(128) ) language java parameter style derby_jdbc_result_set no sql external name 'DataFileVTI.dataFileVTI'; Typically, I think that you would wrap the table function in a view in order to hide the messy arguments: create view sysschemas as select * from table ( sysschemas ( '/Users/rh161140/derby/mainline/db1', 'cc0.dat', '( schemaID char(36), schemaname varchar(128), authorizationid varchar(128) )', null, 'APP', null ) ) t; Then you would dump the data in the heap by selecting from the view: select * from sysschemas; You will get an error if the credentials or bootpassword are invalid or if the user name is not the name of the owner of the corrupt database. I have tested this logic when the following authentication schemes are specified by database-only properties in the corrupt database: a) NATIVE b) BUILTIN c) LDAP d) custom I am aware of the following defects: I) The tool will probably fail to run against really old databases which use BUILTIN authentication or password substitution. II) The tool will fail to read rows which spill across multiple heap pages. III) I have not stressed the unencryption argument yet. The logic is merely based on what's done in the DataFileReader program attached to DERBY-5201 . I'm sure it doesn't handle the difference between bootpasswords and encryption keys.

            People

            • Assignee:
              Unassigned
              Reporter:
              Rick Hillegas
            • Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated:

                Development