  1. Solr
  2. SOLR-6305

Ability to set the replication factor for index files created by HDFSDirectoryFactory

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 8.3
    • Component/s: Hadoop Integration, hdfs
    • Labels: None
    • Environment: hadoop-2.2.0

      Description

      HdfsFileWriter doesn't allow us to create files in HDFS with a different replication factor than the configured DFS default because it uses:
      FsServerDefaults fsDefaults = fileSystem.getServerDefaults(path);

      Since we have two forms of replication going on when using HDFSDirectoryFactory, it would be nice to be able to set the HDFS replication factor for the Solr directories to a lower value than the default. I realize this might reduce the chance of data locality but since Solr cores each have their own path in HDFS, we should give operators the option to reduce it.
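The requested behavior could be sketched roughly as below. This is a minimal illustration in plain Java, not the actual HdfsFileWriter code: the class `ReplicationChooser`, its `choose` method, and the idea of an operator-supplied override are all hypothetical names introduced here. It only shows the decision being asked for: use an explicit replication factor when one is configured, otherwise fall back to the value the server reports.

```java
import java.util.OptionalInt;

/** Hypothetical helper: picks the replication factor for a newly created index file. */
final class ReplicationChooser {
    private ReplicationChooser() {}

    /**
     * @param configured    operator-supplied override (a hypothetical setting); empty if unset
     * @param serverDefault the default reported by the file system, e.g. the value
     *                      HdfsFileWriter currently takes from FsServerDefaults
     * @return the replication factor to request when creating the file
     */
    static short choose(OptionalInt configured, short serverDefault) {
        if (!configured.isPresent()) {
            // No override: keep today's behavior and use the server default.
            return serverDefault;
        }
        int value = configured.getAsInt();
        if (value < 1 || value > Short.MAX_VALUE) {
            throw new IllegalArgumentException(
                "replication must be between 1 and " + Short.MAX_VALUE + ": " + value);
        }
        return (short) value;
    }

    public static void main(String[] args) {
        System.out.println(choose(OptionalInt.empty(), (short) 3)); // falls back to the default
        System.out.println(choose(OptionalInt.of(1), (short) 3));   // uses the override
    }
}
```

In real Hadoop code the chosen value would presumably be passed to the `FileSystem.create(Path, boolean, int, short, long)` overload, which accepts an explicit replication factor per file.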

      My original thinking was to just use Hadoop setrep to customize the replication factor, but that's a one-time shot and doesn't affect new files created. For instance, I did:

      hadoop fs -setrep -R 1 solr49/coll1

      My default dfs replication is set to 3; in the command above I'm setting it to 1 just as an example.

      Then I added some more docs to coll1 and ran:

      hadoop fs -stat %r solr49/hdfs1/core_node1/data/index/segments_3

      3 <-- should be 1

      So it looks like new files don't inherit the replication factor from their parent directory.
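The observation above can be mimicked with a toy in-memory model. This is not HDFS code and makes no Hadoop calls; the class `PerFileReplicationModel` and the file paths are invented for illustration. It assumes only what the experiment shows: the replication factor is stored per file, a recursive setrep rewrites files that already exist, and a file created afterwards gets the default again.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy stand-in for a file system that stores the replication factor per file. */
final class PerFileReplicationModel {
    private final short defaultReplication;
    private final Map<String, Short> files = new LinkedHashMap<>();

    PerFileReplicationModel(short defaultReplication) {
        this.defaultReplication = defaultReplication;
    }

    /** New files always start with the default; the parent directory is not consulted. */
    void create(String path) {
        files.put(path, defaultReplication);
    }

    /** Mimics a recursive setrep: only files that already exist under the prefix change. */
    void setReplicationRecursive(String dirPrefix, short replication) {
        for (Map.Entry<String, Short> entry : files.entrySet()) {
            if (entry.getKey().startsWith(dirPrefix + "/")) {
                entry.setValue(replication);
            }
        }
    }

    short replicationOf(String path) {
        Short replication = files.get(path);
        if (replication == null) {
            throw new IllegalArgumentException("no such file: " + path);
        }
        return replication;
    }

    public static void main(String[] args) {
        PerFileReplicationModel fs = new PerFileReplicationModel((short) 3);
        fs.create("coll1/index/segments_2");            // exists before setrep
        fs.setReplicationRecursive("coll1", (short) 1);
        fs.create("coll1/index/segments_3");            // created after setrep
        System.out.println(fs.replicationOf("coll1/index/segments_2")); // prints 1
        System.out.println(fs.replicationOf("coll1/index/segments_3")); // prints 3
    }
}
```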

      Not sure if we need to go as far as allowing different replication factor per collection but that should be considered if possible.

      I looked at the Hadoop 2.2.0 code to see if there was a way to work through this using the Configuration object but nothing jumped out at me ... and the implementation for getServerDefaults(path) is just:

      public FsServerDefaults getServerDefaults(Path p) throws IOException {
        return getServerDefaults();
      }

      The Path argument is ignored.

            People

            • Assignee: Kevin Risden (krisden)
            • Reporter: Timothy Potter (thelabdude)
            • Votes: 4
            • Watchers: 10
