Details

    • Hadoop Flags:
      Reviewed
    1. HDFS-3186.HDFS-3092.patch
      40 kB
      Brandon Li
    2. HDFS-3186.HDFS-3092.patch
      40 kB
      Brandon Li
    3. HDFS-3186.HDFS-3092.patch
      39 kB
      Brandon Li
    4. HDFS-3186.HDFS-3092.patch
      37 kB
      Brandon Li
    5. HDFS-3186.HDFS-3092.patch
      33 kB
      Brandon Li
    6. HDFS-3186.HDFS-3092.patch
      32 kB
      Brandon Li

      Activity

      Hide
      Hari Mankude added a comment -

      We need to have a mechanism for the newly added journal service to get the missing edit logs from the active journal service daemons.

      Show
      Hari Mankude added a comment - We need to have a mechanism for the newly added journal service to get the missing edit logs from the active journal service daemons.
      Hide
      Brandon Li added a comment -

      The basic mechanism is like the following:

      The journal service spawns a sync thread to download journal segments form other journal services. The sync thread contacts other journal services one by one until it can sync all the missed journal segments.
      After the synchronization is done, the sync thread changes the journal service state from SYNCING to IN_SYNC, and then blocks itself. The journal service wakes up the sync thread to do another round of synchronization when its state transits from WAITING_FOR_ROLL to SYNCING.

      Here is the journal service state auto-machine https://issues.apache.org/jira/secure/attachment/12524775/JNStates.png

      Show
      Brandon Li added a comment - The basic mechanism is like the following: The journal service spawns a sync thread to download journal segments form other journal services. The sync thread contacts other journal services one by one until it can sync all the missed journal segments. After the synchronization is done, the sync thread changes the journal service state from SYNCING to IN_SYNC, and then blocks itself. The journal service wakes up the sync thread to do another round of synchronization when its state transits from WAITING_FOR_ROLL to SYNCING. Here is the journal service state auto-machine https://issues.apache.org/jira/secure/attachment/12524775/JNStates.png
      Hide
      Suresh Srinivas added a comment -

      Preliminary comments:

      1. JournalService.java
        • JournalSync conf should be renamed as it hides outer class conf.
        • Remove unnecessary casting 3 to long
        • In startLogSegment the comment you have added is unnecessary
        • Why do you want to throw IllegalStateException in inSync() method?
        • Update the JournalService constructor javadoc for newly added parameters
        • Why should syncThread be a never ending thread?
        • There are bunch of TODO items
        • Why would syncThread be not null in run() method?
        • Please use space after // in comments. Also for method comments use /** */ comments.
        • runSync - startSync name better? valueSet may not be needed?
        • enableRunSync() - startSync() better name?
        • getRunSync() - waitForStartSync() better name?
        • Add javadoc to JouranalSync methods
        • rename getAllJournalSegments to syncAllJournalSegments?
        • "Sync trial failed for" to "Sync failed for"
      Show
      Suresh Srinivas added a comment - Preliminary comments: JournalService.java JournalSync conf should be renamed as it hides outer class conf. Remove unnecessary casting 3 to long In startLogSegment the comment you have added is unnecessary Why do you want to throw IllegalStateException in inSync() method? Update the JournalService constructor javadoc for newly added parameters Why should syncThread be a never ending thread? There are bunch of TODO items Why would syncThread be not null in run() method? Please use space after // in comments. Also for method comments use /** */ comments. runSync - startSync name better? valueSet may not be needed? enableRunSync() - startSync() better name? getRunSync() - waitForStartSync() better name? Add javadoc to JouranalSync methods rename getAllJournalSegments to syncAllJournalSegments? "Sync trial failed for" to "Sync failed for"
      Hide
      Brandon Li added a comment -

      Why do you want to throw IllegalStateException in inSync() method?

      I don't think I get the question. do we want to fail in a different way here?

      Why should syncThread be a never ending thread?

      It is because sync may be required anytime due to NN failover or journal service failure. If it's not a never ending thread, journal service needs to spawn sync thread on demand and lets the sync thread exit when sync is done. Both approaches need to handle the same set of situations, such as, the journal service state changes before sync finishes. The communication between main thread and sync thread is strait forward in both cases, and the never-ending-thread approach can avoid the cost of repeated thread creation.

      There are bunch of TODO items

      One cause of the TODOs is that, the NN is not updated to cooperate with journal service.

      Why would syncThread be not null in run() method?

      removed this check. I was over cautious there.

      runSync - startSync name better? valueSet may not be needed?

      changed name. removed valueSet since it's redundent with another veriable.

      Other comments are addressed.

      Show
      Brandon Li added a comment - Why do you want to throw IllegalStateException in inSync() method? I don't think I get the question. do we want to fail in a different way here? Why should syncThread be a never ending thread? It is because sync may be required anytime due to NN failover or journal service failure. If it's not a never ending thread, journal service needs to spawn sync thread on demand and lets the sync thread exit when sync is done. Both approaches need to handle the same set of situations, such as, the journal service state changes before sync finishes. The communication between main thread and sync thread is strait forward in both cases, and the never-ending-thread approach can avoid the cost of repeated thread creation. There are bunch of TODO items One cause of the TODOs is that, the NN is not updated to cooperate with journal service. Why would syncThread be not null in run() method? removed this check. I was over cautious there. runSync - startSync name better? valueSet may not be needed? changed name. removed valueSet since it's redundent with another veriable. Other comments are addressed.
      Hide
      Brandon Li added a comment -

      New patch has the added descriptions for the tests.

      Show
      Brandon Li added a comment - New patch has the added descriptions for the tests.
      Hide
      Brandon Li added a comment -

      rebase the patch with HDFS-3092 head

      Show
      Brandon Li added a comment - rebase the patch with HDFS-3092 head
      Hide
      Suresh Srinivas added a comment -

      Thanks for adding more descriptions to the tests. +1 for the patch.

      Show
      Suresh Srinivas added a comment - Thanks for adding more descriptions to the tests. +1 for the patch.
      Hide
      Suresh Srinivas added a comment -

      There are bunch of TODO comments in the patch, these should be addressed before merging this branch to trunk.

      Show
      Suresh Srinivas added a comment - There are bunch of TODO comments in the patch, these should be addressed before merging this branch to trunk.
      Hide
      Suresh Srinivas added a comment -

      I committed the patch. Thank you Brandon.

      Show
      Suresh Srinivas added a comment - I committed the patch. Thank you Brandon.
      Hide
      amith added a comment -

      Hi, sorry for jumping late

      if (minTxid == -1 || minTxid == -1) {
      

      I think one of the var should be maxTxid

      Show
      amith added a comment - Hi, sorry for jumping late if (minTxid == -1 || minTxid == -1) { I think one of the var should be maxTxid
      Hide
      Brandon Li added a comment -

      Thank you Amith! I will fix it in my next patch.

      Show
      Brandon Li added a comment - Thank you Amith! I will fix it in my next patch.

        People

        • Assignee:
          Brandon Li
          Reporter:
          Hari Mankude
        • Votes:
          0 Vote for this issue
          Watchers:
          2 Start watching this issue

          Dates

          • Created:
            Updated:
            Resolved:

            Development