  1. Infrastructure
  2. INFRA-19439

Add a way to publish very large static HTML content (outside of the Pelican CMS) without checking it into Git



    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: None
    • Component/s: Other/Misc
    • Environment:
      Lucene website using Pelican
    • Project:


      We are in the process of converting our website from the old CMS to Pelican. The CMS part is already done (see LUCENE-8987). The results are looking fine:

      But there is still an open point: the website uses the extpath functionality to publish the documentation (Markdown + Javadocs with hundreds of files) and the Solr Refguide for each released version on the website.

      Previously we uploaded those files to the production tree in SVN and referenced them in the extpath of the old CMS. The total size of the files is about 16 to 20 GiB, consisting of hundreds to thousands of files (static HTML). The files are pushed exactly once to a new folder (whose name includes the version number) and never changed afterwards, because they are effectively a release of the documentation and therefore static. The committers were using sparse SVN checkouts for that.

      Of course we could commit those 16 GiB of small files to Git, but that won't scale at all. Building the website would also take very long (as Pelican does not work incrementally; please correct me if I am wrong).

      I talked with [~ke4qqq] at ApacheCon Europe about this. He said that the CMS should only contain the Markdown CMS files (so the dynamic website). Static content can still be committed to Subversion (where you also have the cool functionality of sparse checkouts). With Git you have to copy everything.

      The same approach is used for distributions (tar.gz), so we'd like to keep the static HTML stuff (Javadocs, Refguide) in SVN and link it into the main web page using .htaccess or similar: e.g., http://lucene.apache.org/core/8_3_0/ is static content, while http://lucene.apache.org/core/ is built by the CMS. So we'd like to host the stuff in another subdirectory using svnpubsub and add "Alias" directives for each released version (or maybe use a regex rewrite). The same applies for Solr and its Refguide: http://lucene.apache.org/solr/guide/8_3/ (everything below /solr/guide).
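
      To make the intended split concrete, here is a minimal sketch (in Python, not actual httpd configuration) of the path rule a regex rewrite would have to express. It assumes version folders always look like digits separated by underscores (as in 8_3_0) and that Solr has versioned folders like core does; the function name served_from is made up for illustration.

```python
import re

# Assumption: versioned documentation folders are digits separated by
# underscores (e.g. "8_3_0"), directly below /core/ or /solr/.
VERSIONED_DOCS = re.compile(r"^/(core|solr)/\d+(?:_\d+)+(?:/|$)")

# Everything below /solr/guide is static (the Solr Refguide).
REFGUIDE = re.compile(r"^/solr/guide(?:/|$)")


def served_from(path: str) -> str:
    """Return "static" for SVN-published content, "cms" for Pelican output."""
    if VERSIONED_DOCS.match(path) or REFGUIDE.match(path):
        return "static"
    return "cms"


if __name__ == "__main__":
    for p in ("/core/8_3_0/", "/core/", "/solr/guide/8_3/"):
        print(p, "->", served_from(p))
```

      If the version folders follow a different naming scheme, only the first pattern would need to change.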

      What options do we have for that? Should we keep the current SVN for that, and maybe just move the stuff somewhere else in the SVN tree after the switch to Pelican?

      We may move the stuff to another subdirectory so that the Alias deployment works better, and add permanent redirects. It's important that the old URLs stay stable, because the release TARs refer to them, too.
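
      As a rough illustration of the permanent-redirect idea (again a Python sketch, not server configuration): an old versioned URL maps to the same path under a new prefix with status 301, and all other paths are left alone. The prefix /docs-archive and the function name permanent_redirect are hypothetical; the version pattern assumes folders like 8_3_0.

```python
import re
from typing import Optional, Tuple

# Hypothetical new location of the static documentation tree.
NEW_PREFIX = "/docs-archive"

# Assumption: old versioned URLs look like /core/8_3_0/... or /solr/8_3_0/...
_VERSIONED = re.compile(r"^/(?:core|solr)/\d+(?:_\d+)+(?:/.*)?$")


def permanent_redirect(path: str) -> Optional[Tuple[int, str]]:
    """Return (301, new_path) for old versioned doc URLs, else None."""
    if _VERSIONED.match(path):
        # Keep the old path intact below the new prefix so every old URL
        # has exactly one stable target.
        return 301, NEW_PREFIX + path
    return None
```

      Because the old path is preserved below the new prefix, links embedded in already released TARs would keep resolving.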

      I have heard of other projects having the same problem: Maven, with its per-release websites generated from its build system, and OpenOffice.





              • Assignee:
                humbedooh Daniel Gruno
                uschindler Uwe Schindler
              • Votes:
                1
              • Watchers:
                5


                • Created:

                  Time Tracking

                  Original Estimate - Not Specified
                  Not Specified
                  Remaining Estimate - 0h
                  Time Spent - 50m