Infrastructure / INFRA-3628

Import SourceForge repositories to Apache for incubator project Jena

    Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: Initial Clearing
    • Component/s: Subversion
    • Labels:
      None

      Description

      Jena is a project in incubation with an existing codebase, currently on SourceForge in 3 different repositories, SVN and CVS (2 SF projects). We have software grants on file at Apache for the majority of this code (in fact, for almost all of it). We'd appreciate help migrating it to Apache infrastructure.

      We have several repositories that we'd like to import as-is, ideally with history, to create a record of the software pre-Apache.

      Our SourceForge repositories are:

      CVS: jena.cvs.sourceforge.net:/cvsroot/jena
      CVS: joseki.cvs.sourceforge.net:/cvsroot/joseki
      SVN: https://jena.svn.sourceforge.net/svnroot/jena

      Our initial thoughts were to place each repository in its own area under a common root to denote the imported code:

      https://svn.apache.org/repos/asf/incubator/jena/import/...
        so
      https://svn.apache.org/repos/asf/incubator/jena/import/jena-cvs
      https://svn.apache.org/repos/asf/incubator/jena/import/jena-svn
      https://svn.apache.org/repos/asf/incubator/jena/import/joseki-cvs

      There are modules with the same name in the CVS and SVN repositories: some modules moved from CVS to SVN with the same name - we don't need to integrate them as part of the move.

      In case it helps, the rsync backups are:
       jena.cvs.sourceforge.net::cvsroot/jena/*
       jena.svn.sourceforge.net::svn/jena/*
       joseki.cvs.sourceforge.net::cvsroot/joseki/*

      They are: 1.2G, 2.3G and 409M respectively.
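
A minimal sketch of how the rsync backups listed above could be mirrored locally before any conversion work. The destination directory and its subdirectory names are assumptions for illustration; only the three `host::module/path` sources come from this ticket, and they may no longer be served.

```shell
#!/bin/sh
# Sketch: mirror the three SourceForge rsync backups into local directories.
# The default destination (./sf-backups) and the subdirectory names are
# assumptions, not part of the ticket.
mirror_backups() {
    dest="${1:-./sf-backups}"
    mkdir -p "$dest/jena-cvs" "$dest/jena-svn" "$dest/joseki-cvs" || return 1

    # host::module/path is rsync's daemon syntax, as quoted in the ticket.
    # The sources are single-quoted so the local shell does not expand '*'.
    rsync -a 'jena.cvs.sourceforge.net::cvsroot/jena/*'     "$dest/jena-cvs/"   || return 1
    rsync -a 'jena.svn.sourceforge.net::svn/jena/*'         "$dest/jena-svn/"   || return 1
    rsync -a 'joseki.cvs.sourceforge.net::cvsroot/joseki/*' "$dest/joseki-cvs/" || return 1
}
```

Calling `mirror_backups /some/local/dir` would pull all three backups; the function does nothing until it is called.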

      After the import, we will be extracting the active modules and building a new project structure, leaving the "import" area as the permanent record of our starting point at Apache. The import/ area effectively becomes a read-only archive.

      If this is not a sensible way for doing the imports, or there is better practice, or we're just being plain daft, please let us know.

          Andy
          jena-dev@incubator.apache.org


          Activity

          pctony Tony Stevenson added a comment -
          Import completed. Closing as fixed.
          pctony Tony Stevenson added a comment -
          Jena-CVS
          --------------

          Eris: ------- Committed new rev 1114174 (loaded from original rev 9456) >>>
          Harmonia: ------- Committed new rev 1114174 (loaded from original rev 9456) >>>


          Joseki-CVS
          -----------------

          Eris: ------- Committed new rev 1115276 (loaded from original rev 1102) >>>
          Harmonia: ------- Committed new rev 1115276 (loaded from original rev 1102) >>>


          Jena-SVN
          ---------------

          Eris: ------- Committed new rev 1124118 (loaded from original rev 8842) >>>
          Harmonia: ------- Committed new rev 1124118 (loaded from original rev 8842) >>>
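
The revision numbers above are consistent with the three dumps having been loaded back to back in the order listed. A small arithmetic check, using only the figures reported in this ticket (the variable names are illustrative, and the "first loaded rev" line assumes each original revision became exactly one new revision):

```shell
#!/bin/sh
# Final revisions come from the "Committed new rev" lines;
# revision counts come from the "loaded from original rev" values.
jena_cvs_last=1114174;  jena_cvs_revs=9456
joseki_last=1115276;    joseki_revs=1102
jena_svn_last=1124118;  jena_svn_revs=8842

# If the loads ran consecutively, each final revision equals the previous
# final revision plus the number of revisions in the next dump.
echo "Joseki-CVS expected last rev: $((jena_cvs_last + joseki_revs))"          # 1115276
echo "Jena-SVN expected last rev:   $((joseki_last + jena_svn_revs))"          # 1124118
echo "Jena-CVS first loaded rev:    $((jena_cvs_last - jena_cvs_revs + 1))"    # 1104719
```

Both expected values match the reported ones, so no other commits appear to have landed between the loads.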
          castagna Paolo Castagna added a comment -
          Thank you Andy, thank you Tony.
          pctony Tony Stevenson added a comment -
          Sure. I'll import them into the folders you created. I'll let you know once it is done. It is likely to be tomorrow AM (BST).

          andy.seaborne Andy Seaborne added a comment - Reporter
          Tony,

          They look fine.

          The only thing I noticed is that the folder in the Jena area currently created is "Import" with a capital "I". Either one will do.
          pctony Tony Stevenson added a comment -
          Andy,

          OK, so I have loaded the 3 files you gave us into a test repo, so that you can check them and make sure they look as expected before I go ahead and load them into the main repo. We need to do this to make sure the structure looks right, and so we can get an idea of how long it will take to load the data into the live repo. The timing matters because we have to take SVN offline and make it non-writable for the duration, owing to the number of revisions and the fact that we use the multi-site (write-through) setup.

          So as soon as you confirm these look right to you, I will need to schedule the 45 mins or so of downtime to load these in. Once they are in, they are in, so please check these tests carefully.
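
For context on the downtime figure, the wall-clock ("real") times of the three test loads reported elsewhere in this ticket can be summed as sketched below. The helper name `to_seconds` is hypothetical, and how the 45-minute estimate was actually derived is not stated in the ticket; presumably it leaves some margin over the measured total.

```shell
#!/bin/sh
# Convert a `time`-style value such as 17m7.417s into seconds.
to_seconds() {
    echo "$1" | awk -F'm' '{ sub(/s$/, "", $2); printf "%.3f\n", $1 * 60 + $2 }'
}

# "real" times of the Jena-SVN, Jena-CVS and Joseki test loads.
total=0
for t in 17m7.417s 15m12.001s 1m31.582s; do
    total=$(echo "$total $(to_seconds "$t")" | awk '{ printf "%.3f", $1 + $2 }')
done

# Print the total as minutes and seconds.
awk -v s="$total" 'BEGIN { printf "total: %dm%02ds\n", s / 60, s % 60 }'
```

The three test loads add up to just under 34 minutes of wall-clock time.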

          pctony Tony Stevenson added a comment -
          Jena-SVN import complete

          real 17m7.417s
          user 3m0.059s
          sys 6m32.120s

          https://svn-master.apache.org/repos/test/pctony/incubator/jena/import/Jena-SVN/
          pctony Tony Stevenson added a comment -
          Jena-CVS import complete

          real 15m12.001s
          user 2m40.971s
          sys 7m35.445s

          https://svn-master.apache.org/repos/test/pctony/incubator/jena/import/Jena-CVS/
          pctony Tony Stevenson added a comment -
          Joseki import complete

          real 1m31.582s
          user 0m12.989s
          sys 0m37.338s

          https://svn-master.apache.org/repos/test/pctony/incubator/jena/import/Joseki-CVS/

          pctony Tony Stevenson added a comment -
          Note,

          The checksum files seem to match.

          However, I have also noted that the data volume is quite large.

          -rw-r--r-- 1 pctony pctony 1.7G May 16 14:16 ASF-Jena-CVS.svn
          -rw-r--r-- 1 pctony pctony 3.5G May 16 14:18 ASF-Jena-SVN.svn
          -rw-r--r-- 1 pctony pctony 223M May 16 14:18 ASF-Joseki-CVS.svn

          Essentially 5.5GB.

          ASF-Jena-CVS == 9456 revs
          ASF-Jena-SVN == 8842 revs
          ASF-Joseki-CVS == 1102 revisions

          Andy, I will have to schedule downtime to do the live import for this, due to the size of the dataset. However, I will be running a test import first, to measure the time it takes to do each of them.
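
The checks described in this comment (checksums match, revision counts per dump) could be reproduced roughly as follows. This is a sketch under assumptions: the checksum file is assumed to be in `sha1sum` format (`<hash>  <filename>`), which the ticket does not confirm, and the function names are hypothetical.

```shell
#!/bin/sh
# Verify every file listed in a sha1sum-format checksum file.
# Runs in the checksum file's directory so relative names resolve.
verify_dumps() {
    sumfile="$1"
    ( cd "$(dirname "$sumfile")" && sha1sum -c "$(basename "$sumfile")" )
}

# Count "Revision-number:" records in an uncompressed dump file.
# A dump may also carry a record for revision 0, and file contents could in
# principle contain a matching line, so treat the result as an estimate.
count_revs() {
    grep -a -c '^Revision-number: ' "$1"
}
```

For example, `verify_dumps ./ASF-jena-import-sha1` followed by `count_revs ASF-Joseki-CVS.svn` would check the files and report a count to compare against the figures above.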
          andy.seaborne Andy Seaborne added a comment - Reporter
          From IRC chat, it might be easier if they are on the web, so I have put them in

          http://people.apache.org/~afs/....

          and left symlinks in the original place.
          andy.seaborne Andy Seaborne added a comment - Reporter
          Dumps ready.

          I have uploaded 3 svn repository dumps to people.apache.org:/home/andy

            ASF-Jena-CVS.svn.gz
            ASF-Jena-SVN.svn.gz
            ASF-Joseki-CVS.svn.gz

          + a SHA1 checksum file, ASF-jena-import-sha1

          Please load these as:

          ASF-Jena-CVS.svn.gz ==> incubator/jena/Import/Jena-CVS
          ASF-Jena-SVN.svn.gz ==> incubator/jena/Import/Jena-SVN
          ASF-Joseki-CVS.svn.gz ==> incubator/jena/Import/Joseki-CVS

          The destination folders already exist. There are folders with the same name in ASF-Jena-SVN and ASF-Jena-CVS, which is why they need to go into different places, hence the use of --parent-dir.

          I checked that they load and that the layout is right with the following commands, where REPO is the top of a local SVN repository:

          # Joseki-CVS
          gzip -d < Imports/ASF-Joseki-CVS.svn.gz | \
               svnadmin load --parent-dir incubator/jena/Import/Joseki-CVS $REPO

          # Jena-CVS
          gzip -d < Imports/ASF-Jena-CVS.svn.gz | \
               svnadmin load --parent-dir incubator/jena/Import/Jena-CVS $REPO

          # Jena-SVN
          gzip -d < Imports/ASF-Jena-SVN.svn.gz | \
               svnadmin load --parent-dir incubator/jena/Import/Jena-SVN $REPO


              Thanks
              Andy
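
To repeat this check locally, the scratch repository needs the target directories in place first, since `svnadmin load --parent-dir` loads into a directory that already exists. A minimal sketch, assuming an absolute path for the repository; the function name and commit messages are hypothetical:

```shell
#!/bin/sh
# Create a scratch repository plus the three parent directories used by the
# load commands in this ticket.
# $1 must be an absolute path so that the file:// URLs are valid.
prepare_scratch_repo() {
    repo="$1"
    svnadmin create "$repo" || return 1
    for d in Joseki-CVS Jena-CVS Jena-SVN; do
        # --parents creates any missing intermediate directories.
        svn mkdir --parents -m "Create import area for $d" \
            "file://$repo/incubator/jena/Import/$d" || return 1
    done
}
```

After `prepare_scratch_repo /path/to/scratch`, setting `REPO=/path/to/scratch` lets the `gzip -d ... | svnadmin load --parent-dir ...` commands from this comment run as written.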
          lsimons Leo Simons added a comment -
          INFRA-1672 is apparently the last time someone did some cvs2svn work and it was quite a struggle to get a reasonable svn dump file.
          lsimons Leo Simons added a comment -
          As per my e-mail to jena-dev, infra need you to provide a(n) svnadmin dump file(s) for them to import.

            People

            • Assignee: pctony Tony Stevenson
            • Reporter: andy.seaborne Andy Seaborne
            • Request participants: None
            • Votes: 0
            • Watchers: 5
