Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Data Processors
    • Labels:
      None
    • Environment:

      Redhat 5.1, Java 6

      Description

      On older systems, iostat metrics contain random overflow. The plan is the put in a filter to remove the overflow in the demux parser to improve the data quality.

      1. CHUKWA-75-2.patch
        0.8 kB
        Eric Yang
      2. CHUKWA-75.patch
        1 kB
        Eric Yang

        Activity

        Hide
        Eric Yang added a comment -

        I just committed this, thanks for everyone's feedback.

        Show
        Eric Yang added a comment - I just committed this, thanks for everyone's feedback.
        Hide
        Mac Yang added a comment -

        Created CHUKWA-85 to track the general issue of not logging errors.

        Show
        Mac Yang added a comment - Created CHUKWA-85 to track the general issue of not logging errors.
        Hide
        Cheng added a comment -

        agree with mac, we should create a new jira to track it and move this jira forward.

        Show
        Cheng added a comment - agree with mac, we should create a new jira to track it and move this jira forward.
        Hide
        Cheng added a comment -

        usually i prefer not to use exception to control the flow. in this case, if the data is not a number, a lot of NumberFormatException will be thrown. it could slow down the system.

        Show
        Cheng added a comment - usually i prefer not to use exception to control the flow. in this case, if the data is not a number, a lot of NumberFormatException will be thrown. it could slow down the system.
        Hide
        Mac Yang added a comment -

        When an error condition occurred, it's very important to capture that information. I will create another jira to track it separately.

        Show
        Mac Yang added a comment - When an error condition occurred, it's very important to capture that information. I will create another jira to track it separately.
        Hide
        Eric Yang added a comment -

        Bump up the number to 1+e12, and skip the key/value pair, if it was a overflow. In today's disk, the highest throughput is ~100MB/second. It's unlike any number should reach 1+e12 in iostat. By skipping the key/value pair, we eliminate null pointer exception problem for the down stream application.

        Show
        Eric Yang added a comment - Bump up the number to 1+e12, and skip the key/value pair, if it was a overflow. In today's disk, the highest throughput is ~100MB/second. It's unlike any number should reach 1+e12 in iostat. By skipping the key/value pair, we eliminate null pointer exception problem for the down stream application.
        Hide
        Mac Yang added a comment -

        Is there an expectation that this record type will never contain null value? Given the wide range of input sources and the plugable nature of the system, seems to me the generic components like the demux pipeline and the DumpRecord tool should check for null value irregardless of what we decide to do here.

        If we use some other value instead of null then all the down stream components will need to check for that, in addition to checking for null. Besides, that value will need to be reserved to signify missing data point, and needs to be followed cross data sources.

        Show
        Mac Yang added a comment - Is there an expectation that this record type will never contain null value? Given the wide range of input sources and the plugable nature of the system, seems to me the generic components like the demux pipeline and the DumpRecord tool should check for null value irregardless of what we decide to do here. If we use some other value instead of null then all the down stream components will need to check for that, in addition to checking for null. Besides, that value will need to be reserved to signify missing data point, and needs to be followed cross data sources.
        Hide
        Jerome Boulon added a comment -

        If we don't know the column that we are processing how can we be sure that a value >1000000000L is an overflow?

        • Should we or not apply the overflow only on a specific list of headers that we are aware of?

        Also, can we have a validation that will not create any exception from Demux, DumpRecord, pig?
        Having a null there will force everyone than want to use this recordType to check for NULL (performance impact + check for null at query time),
        so in the worse case it should be documented but having a well defined value in place of NULL will be better.

        Show
        Jerome Boulon added a comment - If we don't know the column that we are processing how can we be sure that a value >1000000000L is an overflow? Should we or not apply the overflow only on a specific list of headers that we are aware of? Also, can we have a validation that will not create any exception from Demux, DumpRecord, pig? Having a null there will force everyone than want to use this recordType to check for NULL (performance impact + check for null at query time), so in the worse case it should be documented but having a well defined value in place of NULL will be better.
        Hide
        Cheng added a comment -

        +1 — null is better than '0'. please also make sure downstream app can handle null case.

        Show
        Cheng added a comment - +1 — null is better than '0'. please also make sure downstream app can handle null case.
        Hide
        Eric Yang added a comment -

        Submitted patch for overflow value set to null.

        Show
        Eric Yang added a comment - Submitted patch for overflow value set to null.
        Hide
        Eric Yang added a comment -

        Use null instead of 0 for overflowed values.

        Show
        Eric Yang added a comment - Use null instead of 0 for overflowed values.
        Hide
        Eric Yang added a comment -

        Reopen to change overflow value to null.

        Show
        Eric Yang added a comment - Reopen to change overflow value to null.
        Hide
        Eric Yang added a comment -

        The columns are mapped differently in different linux distro. The parser doesn't employ a strict type check, rather, it attempts to extract information by column separations. Hence, if the column is not a number, it must be a string, and we store the column as a string. This enable us to map columns by column header regardless how the column are rearranged.

        Instead of 0, it is better to store as null. Reopen to fix this again.

        Show
        Eric Yang added a comment - The columns are mapped differently in different linux distro. The parser doesn't employ a strict type check, rather, it attempts to extract information by column separations. Hence, if the column is not a number, it must be a string, and we store the column as a string. This enable us to map columns by column header regardless how the column are rearranged. Instead of 0, it is better to store as null. Reopen to fix this again.
        Hide
        Mac Yang added a comment -

        Agree that it does not make sense to remove the whole record, however, instead of setting the value to 0, it might be better to remove the offending value and set it to null. It's probably also a good idea to generate a log entry.

        As for the NumberFormatException, sounded like there might be a reason for why it's handled this way. It would be useful to elaborate on this a little so everyone is on the same page.

        Show
        Mac Yang added a comment - Agree that it does not make sense to remove the whole record, however, instead of setting the value to 0, it might be better to remove the offending value and set it to null. It's probably also a good idea to generate a log entry. As for the NumberFormatException, sounded like there might be a reason for why it's handled this way. It would be useful to elaborate on this a little so everyone is on the same page.
        Hide
        Eric Yang added a comment -

        Thanks Cheng, I just committed this.

        Show
        Eric Yang added a comment - Thanks Cheng, I just committed this.
        Hide
        Eric Yang added a comment -

        Only one of the column contains incorrect value, hence we can't remove the entire record.
        If it is a NumberFormatException, this means it is most likely a string. The handling code is correct.

        Show
        Eric Yang added a comment - Only one of the column contains incorrect value, hence we can't remove the entire record. If it is a NumberFormatException, this means it is most likely a string. The handling code is correct.
        Hide
        Jerome Boulon added a comment -

        Metrics Quality concern:

        • Instead of adding a record with a wrong value, shouldn't it be better to just remove this record?
        • Also if we have a NumberFormatException exception, shouldn't we do something more tangible?
        Show
        Jerome Boulon added a comment - Metrics Quality concern: Instead of adding a record with a wrong value, shouldn't it be better to just remove this record? Also if we have a NumberFormatException exception, shouldn't we do something more tangible?
        Hide
        Cheng added a comment -

        +1 minor change. looks good.

        Show
        Cheng added a comment - +1 minor change. looks good.
        Hide
        Eric Yang added a comment -

        Filter out value greater than 1+e10.

        Show
        Eric Yang added a comment - Filter out value greater than 1+e10.
        Hide
        Eric Yang added a comment -

        Filter out values that is greater than 1+e10.

        Show
        Eric Yang added a comment - Filter out values that is greater than 1+e10.

          People

          • Assignee:
            Eric Yang
            Reporter:
            Eric Yang
          • Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development