Hive / HIVE-3711

Create UDAF to calculate an array of Benford's Law

Details

    • Type: New Feature
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: UDF

    Description

      Benford's Law is a useful analytical tool for assessing whether a set of numbers was generated by a natural process rather than fabricated, by evaluating the relative proportions of their leading digits. It can be used to detect accounting, financial, and election fraud.

      Wikipedia's Benford's Law page has a good overview.

      Hive is well suited to calculate Benford's Law. The result should be a named struct with names 1-9 and values being the corresponding proportions of each digit.
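
A minimal sketch of the core computation in plain Java, not an actual Hive UDAF: it counts leading digits and turns the counts into proportions. The class and method names (LeadingDigitProportions, leadingDigit, proportions) are hypothetical, and the result is a plain array where index d - 1 holds the share of digit d, standing in for the named struct. Zero, NaN, and infinite values are skipped, which is an assumption the issue does not specify.

```java
import java.math.BigDecimal;

/** Hypothetical helper: proportions of leading digits 1-9 in a set of numbers. */
final class LeadingDigitProportions {

    /** Returns the first significant digit (1-9) of x, or 0 if x has none. */
    static int leadingDigit(double x) {
        if (Double.isNaN(x) || Double.isInfinite(x) || x == 0.0) {
            return 0; // assumption: such values are ignored
        }
        // The unscaled value drops the sign (via abs), decimal point and exponent,
        // so its first character is the first significant digit.
        String digits = BigDecimal.valueOf(x).unscaledValue().abs().toString();
        return digits.charAt(0) - '0';
    }

    /** Returns a 9-element array; element d - 1 is the proportion of leading digit d. */
    static double[] proportions(double[] values) {
        long[] counts = new long[9];
        long total = 0;
        for (double v : values) {
            int d = leadingDigit(v);
            if (d > 0) {
                counts[d - 1]++;
                total++;
            }
        }
        double[] result = new double[9];
        if (total == 0) {
            return result; // no usable input: all proportions stay 0
        }
        for (int i = 0; i < 9; i++) {
            result[i] = (double) counts[i] / total;
        }
        return result;
    }
}
```

Because the intermediate state is just nine counters, partial results from different tasks could be combined by adding the counters, which is the property an aggregate function needs.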

      An alternative is to calculate the deviations from Benford's Law for each digit. The structure of the result would be the same, but each value would be the difference between the actual proportion and the proportion given by the formula on Wikipedia.
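
The deviation variant can be sketched the same way. Benford's Law gives the expected proportion of leading digit d as log10(1 + 1/d), so the deviation is the observed proportion minus that value. The class and method names (BenfordDeviation, expected, deviations) are hypothetical, and the input is assumed to be a 9-element array where index d - 1 holds the observed share of digit d.

```java
/** Hypothetical helper: per-digit deviation of observed proportions from Benford's Law. */
final class BenfordDeviation {

    /** Expected proportion of leading digit d (1-9) under Benford's Law. */
    static double expected(int digit) {
        if (digit < 1 || digit > 9) {
            throw new IllegalArgumentException("digit must be in 1..9");
        }
        return Math.log10(1.0 + 1.0 / digit);
    }

    /** Returns observed minus expected for each digit; element d - 1 belongs to digit d. */
    static double[] deviations(double[] observed) {
        if (observed.length != 9) {
            throw new IllegalArgumentException("expected 9 proportions, one per digit");
        }
        double[] out = new double[9];
        for (int d = 1; d <= 9; d++) {
            out[d - 1] = observed[d - 1] - expected(d);
        }
        return out;
    }
}
```

A positive value means the digit appears more often than Benford's Law predicts, a negative value less often.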

          People

            Assignee: Unassigned
            Reporter: Erik Shilts (eshilts)