  1. Apache Cassandra
  2. CASSANDRA-4886

Remote ColumnFamilyInputFormat

Details

    • Improvement
    • Status: Resolved
    • Normal
    • Resolution: Won't Fix
    • 3.11.5
    • None
    • None

    Description

      As written, the ColumnFamilyInputFormat does not have a great deal of fault tolerance.

      It only attempts to perform a read from a single replica, with an infinite timeout. If that replica is not available, then the Task fails, and must be retried on a different node.

      This is fine if the TaskTrackers are colocated with Cassandra nodes, but is very fragile when this is not possible. When the TaskTrackers are remote to Cassandra, the same rules that apply to any other client should apply here: there should be a strict (configurable) timeout, and the ability to retry a request on a different replica if a single request fails.
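
The behaviour requested above (a bounded wait per replica, then failover to the next one) could be sketched roughly as follows. This is only an illustration in Python, not the attached patch: `read_with_failover`, `AllReplicasFailed`, and the `fetch` callable are hypothetical names, and `fetch(replica)` stands in for whatever call performs the actual range read against one node.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


class AllReplicasFailed(Exception):
    """Raised when no replica returned a result (hypothetical name)."""


def read_with_failover(replicas, fetch, timeout_s):
    """Try each replica in order, waiting at most timeout_s for each one.

    `fetch(replica)` is an assumed callable that performs the read against
    a single node. It is not part of any Cassandra API.
    """
    errors = {}
    for replica in replicas:
        pool = ThreadPoolExecutor(max_workers=1)
        future = pool.submit(fetch, replica)
        try:
            # Strict timeout: stop waiting on this replica after timeout_s.
            return future.result(timeout=timeout_s)
        except FutureTimeout:
            errors[replica] = "timed out"
        except Exception as exc:  # connection refused, node down, ...
            errors[replica] = repr(exc)
        finally:
            # Do not block on a hung call. The abandoned read keeps running
            # in its worker thread; a real client would rely on a socket
            # timeout instead so the call itself is cut short.
            pool.shutdown(wait=False)
    raise AllReplicasFailed(errors)
```

With this shape the task fails only once every replica for the split has been tried, rather than on the first unavailable node.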

      It seems obvious that we'd want to support both types of architecture; to do that, we should probably have a configuration option which allows users to specify their architecture choice explicitly.
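
One possible shape for such a setting, sketched in Python for brevity. The property names `cassandra.input.topology` and `cassandra.input.read.timeout.ms`, the default of 10000 ms, and the `ReadPolicy` type are all assumptions made for illustration; none of them are taken from the patch or from existing Cassandra configuration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ReadPolicy:
    timeout_s: Optional[float]   # None means wait indefinitely
    try_other_replicas: bool


def policy_from_conf(conf):
    """Map an explicit architecture choice to a read policy.

    `conf` is any dict-like job configuration. All keys used here are
    hypothetical.
    """
    mode = conf.get("cassandra.input.topology", "colocated")
    if mode == "colocated":
        # Behaviour described in this issue: one replica, no timeout.
        return ReadPolicy(timeout_s=None, try_other_replicas=False)
    if mode == "remote":
        timeout_ms = int(conf.get("cassandra.input.read.timeout.ms", "10000"))
        return ReadPolicy(timeout_s=timeout_ms / 1000.0, try_other_replicas=True)
    raise ValueError("unknown topology: %r" % (mode,))
```

Defaulting to the colocated behaviour would leave existing jobs unchanged, while remote deployments opt in to timeouts and replica failover.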

      Attachments

        1. CASSANDRA-4886.patch
          36 kB
          Scott Fines

            People

              Assignee: Unassigned
              Reporter: Scott Fines (scottfines)
              Votes: 0
              Watchers: 4