Derby
  1. Derby
  2. DERBY-4710

Upgrade from 10.2 to 10.6 fails if existing database contains a large number of tables with similar names.

    Details

    • Type: Bug Bug
    • Status: Open
    • Priority: Major Major
    • Resolution: Unresolved
    • Affects Version/s: 10.5.1.1, 10.5.2.0, 10.5.3.0, 10.6.1.0
    • Fix Version/s: None
    • Component/s: SQL, Store
    • Environment:
      Windows XP SP3, or Windows Server 2003 R2 32-bit. Not tested on any other platform.
    • Urgency:
      Normal
    • Issue & fix info:
      Repro attached
    • Bug behavior facts:
      Performance, Seen in production

      Description

      Upgrade fails because of quick degradation in performance when trying to upgrade our existing Derby DB from version 10.2.2 to the latest build 10.6. Problem is that our database contains thousands of tables with the names starting from "SURVEY_xxxxx" where xxxxx can be any integer from 1 to 99999. Upgrade fails on this tables to the point that one cannot access any of them, because apparently it takes a very long time to open them.

      We staged a test in order to see how database handles creation of thousands of similarly named tables.

      Below we will try to describe how the test was conducted.

      Process

      • Create a new blank database in 10.6

      In a loop from 1 to 10000

      { although I only managed to get to 1510 over night}
      • Created a program that creates a table called SURVEY_X
      • Inserts 1/2 hour interval data from the range 2006-08-03 15:00:00 to 2009-01-15 00:00:00. 40,000 records.

      And this process repeats.

      Results

      • At the start (10:00 pm) a single cycle of create and insert was taking 2 seconds i.e Creation of SURVEY_1
      • Run overnight
      • In the morning 7:00am it had only got to 1510 table and insert creations, and was taking 2 minutes for every new table - i.e Creation of SURVEY_1510

      If I change the program (and use it on this database with the current 1510 tables in it) to create a table called T_SURVEY_X then it goes back to 2 seconds, although I suspect that if I left it running and we had 1500 tables called T_SURVEY_X we would have the same problem.

      The symptom is also present in SQLWorkbench/J where it takes 2 seconds to see table T_SURVEY_0 but 45 seconds to see SURVEY_1510 and even after it presents the data it still seems to lock up etc.

      So this explains why with 6000 tables that we seem to get no response at all. As you can see from the enclosed log performance starts really degrading after a 1000 tables.

      1. Derby4710.java
        3 kB
        Dag H. Wanvik
      2. Derby-test.log
        255 kB
        Ray Gala

        Activity

        Hide
        Dag H. Wanvik added a comment -

        My test program has now run this scenario for some hours (using insane jars trunk), and there is no degradation:

        :
        Table SURVEY_4412 time spent: 2 seconds, inserting 42980 rows
        Table SURVEY_4413 time spent: 2 seconds, inserting 42981 rows
        Table SURVEY_4414 time spent: 2 seconds, inserting 42980 rows
        Table SURVEY_4415 time spent: 2 seconds, inserting 42980 rows
        Table SURVEY_4416 time spent: 2 seconds, inserting 42980 rows
        Table SURVEY_4417 time spent: 2 seconds, inserting 42980 rows

        Unless we get more information on this problem scenario (modify my test program, perhaps, to show what the issue is), I suggest we close this as not reproducible.

        Show
        Dag H. Wanvik added a comment - My test program has now run this scenario for some hours (using insane jars trunk), and there is no degradation: : Table SURVEY_4412 time spent: 2 seconds, inserting 42980 rows Table SURVEY_4413 time spent: 2 seconds, inserting 42981 rows Table SURVEY_4414 time spent: 2 seconds, inserting 42980 rows Table SURVEY_4415 time spent: 2 seconds, inserting 42980 rows Table SURVEY_4416 time spent: 2 seconds, inserting 42980 rows Table SURVEY_4417 time spent: 2 seconds, inserting 42980 rows Unless we get more information on this problem scenario (modify my test program, perhaps, to show what the issue is), I suggest we close this as not reproducible.
        Hide
        Dag H. Wanvik added a comment -

        Uploading a test program which tries to do what's described in this issue. I have been running it just for 10 minutes now,
        but so far spent time and heap usage is flat, I'll report back tomorrow. Of course, the program may differ significantly from what the Ray's program, but this is a starting point.

        Show
        Dag H. Wanvik added a comment - Uploading a test program which tries to do what's described in this issue. I have been running it just for 10 minutes now, but so far spent time and heap usage is flat, I'll report back tomorrow. Of course, the program may differ significantly from what the Ray's program, but this is a starting point.
        Hide
        Mike Matrigali added a comment -

        Triaged for 10.9, no changes.

        Show
        Mike Matrigali added a comment - Triaged for 10.9, no changes.
        Hide
        Myrna van Lunteren added a comment -

        Although clearly there is an issue here in upgrading to Derby 10.6, in the community, we understand an 'upgrade' issue to be a problem with a database that is being/has been upgraded using the 'upgrade=true' (hard upgrade) or implicit 'soft' upgrade (just use new jars) functionality. In that context, this does not look like an issue with upgrade, but a performance regression from 10.2 to 10.6, especially since you see the same behavior with a newly created 10.6 database.

        The Apache DB community has recently released Derby 10.8.1.2.

        Although the 'repro attached' flag is checked, the repro in the description is impractical to follow, because there is no description of the original table created; no sql.

        Is it possible for you to:

        • verify the results with 10.8.1.2?
        • contribute the sql used for the test cases to duplicate the behavior, and your program that does the timed inserts?
        Show
        Myrna van Lunteren added a comment - Although clearly there is an issue here in upgrading to Derby 10.6, in the community, we understand an 'upgrade' issue to be a problem with a database that is being/has been upgraded using the 'upgrade=true' (hard upgrade) or implicit 'soft' upgrade (just use new jars) functionality. In that context, this does not look like an issue with upgrade, but a performance regression from 10.2 to 10.6, especially since you see the same behavior with a newly created 10.6 database. The Apache DB community has recently released Derby 10.8.1.2. Although the 'repro attached' flag is checked, the repro in the description is impractical to follow, because there is no description of the original table created; no sql. Is it possible for you to: verify the results with 10.8.1.2? contribute the sql used for the test cases to duplicate the behavior, and your program that does the timed inserts?
        Hide
        Ray Gala added a comment -

        Reason a new blank database was created in 10.6 was to create a test case and demonstrate performance issues. Initially this happened during upgrade from 10.2 to 10.6, and we spend a lot of time trying to figure out why we cannot access migrated tables. That's why this issue has been created with description saying the problem exists during upgrade.

        Show
        Ray Gala added a comment - Reason a new blank database was created in 10.6 was to create a test case and demonstrate performance issues. Initially this happened during upgrade from 10.2 to 10.6, and we spend a lot of time trying to figure out why we cannot access migrated tables. That's why this issue has been created with description saying the problem exists during upgrade.
        Hide
        Knut Anders Hatlen added a comment -

        The bug description says that a new blank database was created in 10.6, so it would appear to be a 10.2 vs 10.6 performance issue, and not an upgrade issue.

        Show
        Knut Anders Hatlen added a comment - The bug description says that a new blank database was created in 10.6, so it would appear to be a 10.2 vs 10.6 performance issue, and not an upgrade issue.
        Hide
        Mike Matrigali added a comment -

        not sure if this is a sql or store issue so marking both components. Should figure out if upgrade has anything to do with it, or if it is just a 10.2 vs. 10.6 performance issue. Possibilities could include:
        o) some new catalog that does not use right index for metadata
        o) some system specific problem with large directories, derby will create a new file for each new
        table. Names of these files don't really have anything to do with table names, so may also not
        be problem.

        Show
        Mike Matrigali added a comment - not sure if this is a sql or store issue so marking both components. Should figure out if upgrade has anything to do with it, or if it is just a 10.2 vs. 10.6 performance issue. Possibilities could include: o) some new catalog that does not use right index for metadata o) some system specific problem with large directories, derby will create a new file for each new table. Names of these files don't really have anything to do with table names, so may also not be problem.
        Hide
        Rick Hillegas added a comment -

        Unchecking the "Data corruption" box because the report does not indicate that data has been mangled. Unchecking "Deviation from standard" box because I do not see how this behavior breaks either SQL or JDBC compliance. Checking the "Seen in production" box.

        Show
        Rick Hillegas added a comment - Unchecking the "Data corruption" box because the report does not indicate that data has been mangled. Unchecking "Deviation from standard" box because I do not see how this behavior breaks either SQL or JDBC compliance. Checking the "Seen in production" box.
        Hide
        Ray Gala added a comment -

        Log file showing degradation in performance to a virtual halt after ~9 hours.

        Show
        Ray Gala added a comment - Log file showing degradation in performance to a virtual halt after ~9 hours.

          People

          • Assignee:
            Unassigned
            Reporter:
            Ray Gala
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:

              Development