Solr / SOLR-11473

Make HDFSDirectoryFactory support other prefixes (besides hdfs:/)


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 6.6.1
    • Fix Version/s: 8.1, 9.0
    • Component/s: Hadoop Integration, hdfs
    • Labels: None

    Description

      Not sure if it's a bug or a missing feature, but I'm trying to make Solr work on Alluxio, as described by Timothy Potter in https://www.slideshare.net/thelabdude/running-solr-in-the-cloud-at-memory-speed-with-alluxio/1

      The problem I'm facing here is with autoAddReplicas. If I have replicationFactor=1 and the node with that replica dies, the node taking over incorrectly assigns the data directory. For example:

      before

      "dataDir":"alluxio://localhost:19998/solr/test/",

      after

      "dataDir":"alluxio://localhost:19998/solr/test/core_node1/alluxio://localhost:19998/solr/test/",

      The same happens for ulogDir. Apparently, this has to do with this bit from HDFSDirectoryFactory:

        public boolean isAbsolute(String path) {
          return path.startsWith("hdfs:/");
        }

      If I add "alluxio:/" in there, the paths are correct and the index is recovered.
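
      To make the symptom concrete, here is a simplified illustration of how a prefix check like the one above could yield the doubled path. It is not Solr's actual resolution code: the class name, the resolve helper, and the idea that a non-absolute dataDir is simply appended to the core's base directory are assumptions made for the sketch.

```java
// Simplified illustration only; this is NOT Solr's actual resolution code.
final class DataDirResolutionSketch {

    // Mirrors the check quoted above: only "hdfs:/" counts as absolute.
    static boolean isAbsolute(String path) {
        return path.startsWith("hdfs:/");
    }

    // Hypothetical helper: keep absolute paths as they are,
    // otherwise append the value to the core's base directory.
    static String resolve(String coreBase, String dataDir) {
        return isAbsolute(dataDir) ? dataDir : coreBase + dataDir;
    }

    public static void main(String[] args) {
        String coreBase = "alluxio://localhost:19998/solr/test/core_node1/";
        String dataDir = "alluxio://localhost:19998/solr/test/";
        // The alluxio:// value is treated as relative, so it is appended to the base.
        System.out.println(resolve(coreBase, dataDir));
    }
}
```

      Under those assumptions the output is the same doubled value shown in the "after" example, while an "hdfs:/" value would be returned unchanged.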

      I see a few options here:

      • add "alluxio:/" to the list there
      • add a regular expression along the lines of [a-z]*:/ (I hope that's not too expensive; I'm not sure how often this method is called)
      • don't do anything and expect alluxio to work with an "hdfs:/" path? I actually tried that and didn't manage to make it work
      • have a different DirectoryFactory or something else?
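
      As a rough sketch of the regular-expression option, the check could use a pattern that is compiled once, so each call only pays for a short prefix match. The class and field names are hypothetical, and the pattern is an assumption: it is slightly stricter and broader than [a-z]*:/, requiring at least one leading letter and also allowing digits, "+", "-" and "." so that schemes such as "s3a" would match too.

```java
import java.util.regex.Pattern;

// Sketch of a scheme-agnostic absolute-path check; names are hypothetical.
final class SchemePrefixSketch {

    // Compiled once and reused. A scheme here is a lowercase letter followed by
    // lowercase letters, digits, '+', '.' or '-', and then ":/".
    private static final Pattern SCHEME_PREFIX =
        Pattern.compile("[a-z][a-z0-9+.-]*:/");

    // lookingAt() matches only at the start of the input, like startsWith().
    static boolean isAbsolute(String path) {
        return SCHEME_PREFIX.matcher(path).lookingAt();
    }

    public static void main(String[] args) {
        System.out.println(isAbsolute("hdfs://namenode:8020/solr/test/"));
        System.out.println(isAbsolute("alluxio://localhost:19998/solr/test/"));
        System.out.println(isAbsolute("data/index"));
    }
}
```

      With this pattern both the "hdfs:/" and "alluxio:/" values are treated as absolute, while a plain relative path such as data/index is not. Whether the cost matters still depends on how often the method is called.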

      What do you think?

      Attachments

        1. SOLR-11473.patch
          1 kB
          Radu Gheorghe
        2. SOLR-11473.patch
          12 kB
          Kevin Risden
        3. SOLR-11473.patch
          13 kB
          Kevin Risden

          People

            Assignee: Kevin Risden (krisden)
            Reporter: Radu Gheorghe (radu0gheorghe)
            Votes: 1
            Watchers: 10

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

              Estimated: Not Specified
              Remaining: 0h
              Logged: 20m
