Kafka / KAFKA-1860

File system errors are not detected unless Kafka tries to write

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.10.0.0
    • Component/s: None
    • Labels: None

      Description

      When the disk (the RAID holding the caches dir) dies on a Kafka broker, the filesystem typically gets remounted read-only, so when Kafka tries to write to the disk it gets a FileNotFoundException with the read-only errno set (EROFS).

      However, as long as no produce requests arrive, and hence no writes are attempted on the disks, Kafka does not exit on this fatal error. (When the disk starts working again, Kafka might think some files are gone, although they will reappear once the RAID comes back online.) Instead, it keeps logging exceptions like:

      2015/01/07 09:47:41.543 ERROR [KafkaScheduler] [kafka-scheduler-1] [kafka-server] [] Uncaught exception in scheduled task 'kafka-recovery-point-checkpoint'
      java.io.FileNotFoundException: /export/content/kafka/i001_caches/recovery-point-offset-checkpoint.tmp (Read-only file system)
      	at java.io.FileOutputStream.open(Native Method)
      	at java.io.FileOutputStream.<init>(FileOutputStream.java:206)
      	at java.io.FileOutputStream.<init>(FileOutputStream.java:156)
      	at kafka.server.OffsetCheckpoint.write(OffsetCheckpoint.scala:37)
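
The trace above shows the periodic checkpoint write failing while the broker keeps running. As a rough illustration of the behavior being asked for, here is a minimal sketch in which an I/O failure during a checkpoint write is handed to a fatal-error handler instead of being logged and retried forever. This is not the attached patch and not Kafka's code: the class CheckpointGuard, its write method, and the onFatal callback are hypothetical names invented for this example, and a real broker would presumably halt the process in the handler.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper for illustration only; not part of Kafka or KAFKA-1860.patch.
final class CheckpointGuard {
    // Invoked when the file system rejects a write; a broker might use
    // () -> Runtime.getRuntime().halt(1) here (an assumption, not Kafka's behavior).
    private final Runnable onFatal;

    CheckpointGuard(Runnable onFatal) {
        this.onFatal = onFatal;
    }

    /**
     * Writes content to "<target>.tmp" and then renames it over target.
     * Returns true on success. On any IOException (for example a read-only
     * file system) it calls the fatal handler and returns false.
     */
    boolean write(Path target, String content) {
        Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
        try {
            Files.write(tmp, content.getBytes(StandardCharsets.UTF_8));
            Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
            return true;
        } catch (IOException e) {
            // Escalate instead of swallowing the error in the scheduler thread.
            onFatal.run();
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("checkpoint-demo");
        CheckpointGuard guard =
            new CheckpointGuard(() -> System.err.println("fatal: log dir not writable"));
        boolean ok = guard.write(dir.resolve("recovery-point-offset-checkpoint"), "0\n0\n");
        System.out.println("checkpoint written: " + ok);
    }
}
```

Passing the handler in as a Runnable keeps the sketch testable: a test can record that the handler fired, while a production caller could choose to shut the broker down.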
      

        Attachments

        1. KAFKA-1860.patch
          4 kB
          Mayuresh Gharat

            People

            • Assignee: Mayuresh Gharat (mgharat)
            • Reporter: Guozhang Wang (guozhang)