Jackrabbit Content Repository
  1. Jackrabbit Content Repository
  2. JCR-1972

Preserving UUID and document version history on repository migration

    Details

    • Type: Wish Wish
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: core 1.4.8
    • Fix Version/s: 1.6
    • Component/s: jackrabbit-core
    • Labels:
      None

      Description

      I have been working I an migration utility for OpenKM and I performed some changes in jackrabit-core to enable version import, preserving
      the modification date. Also modified org.apache.jackrabbit.core.NodeImpl to preserve UUID in the migration process.

      This migration process is needed because there are changes in repository node definition, and Jackrabbit can't deal with this actually.

      I've attache a PDF with the changes needed in Jackrabbit-core. It works and there was no problems with the migrated repository.

      1. CheckinCalendarTest.java
        2 kB
        Paco Avila
      2. Jackrabbit_modifications.pdf
        29 kB
        Paco Avila
      3. JCR-1972_1.x.patch
        16 kB
        Jukka Zitting
      4. JCR-1972_1.x.patch
        28 kB
        Paco Avila
      5. JCR-1972_1.x.patch
        23 kB
        Paco Avila
      6. JCR-1972_1.x.patch
        17 kB
        Paco Avila
      7. JCR-1972.patch
        10 kB
        Paco Avila

        Activity

        Hide
        Paco Avila added a comment -

        This PDF shows the modifications needed in the jackrabbit core to preserve node UUID and document version history date in repository migrations.

        Show
        Paco Avila added a comment - This PDF shows the modifications needed in the jackrabbit core to preserve node UUID and document version history date in repository migrations.
        Hide
        Alexander Klimetschek added a comment -

        Could you provide the diff in a normal txt/patch format (svn diff > JCR-1972.patch)? A pdf is a rather unusual format for that

        Show
        Alexander Klimetschek added a comment - Could you provide the diff in a normal txt/patch format (svn diff > JCR-1972 .patch)? A pdf is a rather unusual format for that
        Hide
        Paco Avila added a comment - - edited

        Patch solicited by Alexander Klimetschek. This patch works with jackrabbit-core-1.4.6 and i'm not sure if it works for more recent versions.

        Show
        Paco Avila added a comment - - edited Patch solicited by Alexander Klimetschek. This patch works with jackrabbit-core-1.4.6 and i'm not sure if it works for more recent versions.
        Hide
        Jukka Zitting added a comment -

        If I read the patch correctly, it breaks the NodeImpl.checkin() method.

        Also, please use only spaces (no tabs) for indentation.

        Other than that I think the changes are reasonable. We may even want to consider adding the extra method signatures to jackrabbit-api.

        Show
        Jukka Zitting added a comment - If I read the patch correctly, it breaks the NodeImpl.checkin() method. Also, please use only spaces (no tabs) for indentation. Other than that I think the changes are reasonable. We may even want to consider adding the extra method signatures to jackrabbit-api.
        Hide
        Paco Avila added a comment -

        As I wrote some time ago, these modifications make two enhacements in the jackrabbit-core functionality:

        • Enable UUID parameter in addNode: this allow to preserve the unique node identifier across a repository migration. If you use references this change is fundamental.
        • Enable Calendar parameter in checkin: this change can be used to preserve checking date in document version history in a possible migration.

        This migration can be of two types:

        • Change the repository backend: you use the default Derby database, but you want to swith to MySQL, for example. Actually this is not possible in a easy way.
        • Change the repository node definitions: jackrabbit actually can't deal with these changes depending on their complexity. You have to write a program to make this migration.

        From my point of view, this patch introduces very neccessary funtionality into Jackrabbit. If you want to include these modifications in a near release, I can try to send a new patch that works with an specific jackrabbit-core version.

        Show
        Paco Avila added a comment - As I wrote some time ago, these modifications make two enhacements in the jackrabbit-core functionality: Enable UUID parameter in addNode: this allow to preserve the unique node identifier across a repository migration. If you use references this change is fundamental. Enable Calendar parameter in checkin: this change can be used to preserve checking date in document version history in a possible migration. This migration can be of two types: Change the repository backend: you use the default Derby database, but you want to swith to MySQL, for example. Actually this is not possible in a easy way. Change the repository node definitions: jackrabbit actually can't deal with these changes depending on their complexity. You have to write a program to make this migration. From my point of view, this patch introduces very neccessary funtionality into Jackrabbit. If you want to include these modifications in a near release, I can try to send a new patch that works with an specific jackrabbit-core version.
        Hide
        Jukka Zitting added a comment -

        We should protect against the case of adding a node with an UUID that already exists in the workspace. See the XML import functionality for potential ways of dealing with such cases.

        Show
        Jukka Zitting added a comment - We should protect against the case of adding a node with an UUID that already exists in the workspace. See the XML import functionality for potential ways of dealing with such cases.
        Hide
        Paco Avila added a comment -

        Updated patch for the 1.x branch

        Show
        Paco Avila added a comment - Updated patch for the 1.x branch
        Hide
        Paco Avila added a comment -

        I've updated the patch and also added two JUnit test (CheckinCalendarTest and added a new method at NodeImplTest).

        In addNode(..., UUID) I can test if the node has the MIX_REFERENCEABLE mixing but I'm not sure if this good because the mixin can be added after the node creation. This check is commented in the patch.

        Show
        Paco Avila added a comment - I've updated the patch and also added two JUnit test (CheckinCalendarTest and added a new method at NodeImplTest). In addNode(..., UUID) I can test if the node has the MIX_REFERENCEABLE mixing but I'm not sure if this good because the mixin can be added after the node creation. This check is commented in the patch.
        Hide
        Jukka Zitting added a comment -

        The patch is creating lots of duplicate code in the extra methods. Can we avoid that?

        Also, it would be good to have a check to prevent someone to add a new node with an id that already exists in the workspace.

        Show
        Jukka Zitting added a comment - The patch is creating lots of duplicate code in the extra methods. Can we avoid that? Also, it would be good to have a check to prevent someone to add a new node with an id that already exists in the workspace.
        Hide
        Paco Avila added a comment - - edited

        Updated patch -> Check for existing UUID.

        I have to duplicate some methods to add a new parameter with the UUID.

        Show
        Paco Avila added a comment - - edited Updated patch -> Check for existing UUID. I have to duplicate some methods to add a new parameter with the UUID.
        Hide
        Jukka Zitting added a comment -

        Thanks for the update! Still more comments:

        • It would be better if the UUID argument was a String instead of an instance of the UUID class. Also, it might be useful to rename the relevant methods to addNodeWithUUUID to avoid signature clashes with the already overloaded addNode method.
        • It would be good to have javadocs for all the new methods.
        • Please use only spaces to indent the code.
        • About the duplicate methods: It's OK to have extra method signatures for different purposes, but now you're duplicating also the entire method bodies. For example the new NodeImpl.checkin(Calendar) method is some 40 lines of code that's essentially identical to the code in NodeImpl.checkin(). If we later on encounter a bug and fix it in the checkin() method, how do we make sure that the fix also gets applied to checkin(Calendar)?
        Show
        Jukka Zitting added a comment - Thanks for the update! Still more comments: It would be better if the UUID argument was a String instead of an instance of the UUID class. Also, it might be useful to rename the relevant methods to addNodeWithUUUID to avoid signature clashes with the already overloaded addNode method. It would be good to have javadocs for all the new methods. Please use only spaces to indent the code. About the duplicate methods: It's OK to have extra method signatures for different purposes, but now you're duplicating also the entire method bodies. For example the new NodeImpl.checkin(Calendar) method is some 40 lines of code that's essentially identical to the code in NodeImpl.checkin(). If we later on encounter a bug and fix it in the checkin() method, how do we make sure that the fix also gets applied to checkin(Calendar)?
        Hide
        Paco Avila added a comment -

        Improvements by recommendation of Jukka

        Show
        Paco Avila added a comment - Improvements by recommendation of Jukka
        Hide
        Jukka Zitting added a comment -

        Thanks! I made some modifications to the patch (no tabs, less extra methods, even less code duplication; see attached patch for details) and committed it to the 1.x branch in revision 800385.

        I'll see what it takes to forward port this to the trunk.

        Show
        Jukka Zitting added a comment - Thanks! I made some modifications to the patch (no tabs, less extra methods, even less code duplication; see attached patch for details) and committed it to the 1.x branch in revision 800385. I'll see what it takes to forward port this to the trunk.
        Hide
        Jukka Zitting added a comment -

        One more thing... Your patch referenced a CheckinCalendarTest class, but the relevant java file wasn't included in the patch. Can you attach the file so we can include it in the test suite?

        Show
        Jukka Zitting added a comment - One more thing... Your patch referenced a CheckinCalendarTest class, but the relevant java file wasn't included in the patch. Can you attach the file so we can include it in the test suite?
        Hide
        Jukka Zitting added a comment -

        The new method signatures are now available also in trunk. Resolving as fixed for 1.6.0.

        Show
        Jukka Zitting added a comment - The new method signatures are now available also in trunk. Resolving as fixed for 1.6.0.
        Hide
        Paco Avila added a comment -

        Sorry for the missing test class. I did no review of the generated patch and thought it was included. I'm on holiday but i will try to get a decent Internet connection to attach the missing class.

        Show
        Paco Avila added a comment - Sorry for the missing test class. I did no review of the generated patch and thought it was included. I'm on holiday but i will try to get a decent Internet connection to attach the missing class.
        Hide
        Paco Avila added a comment -

        The missing test class

        Show
        Paco Avila added a comment - The missing test class

          People

          • Assignee:
            Jukka Zitting
            Reporter:
            Paco Avila
          • Votes:
            2 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development