Flink / FLINK-2142

GSoC project: Exact and Approximate Statistics for Data Streams and Windows

Details

    Description

      The goal of this project is to implement basic statistics of data streams and windows (such as average, median, variance, and correlation) in a computationally efficient manner. This involves designing custom PreReducers.
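
      To illustrate what "computationally efficient" can mean for mean and variance, here is a minimal sketch of Welford's online update together with a merge step for combining partial aggregates. It is plain Python rather than Flink code; the class `RunningStats` and its methods are hypothetical names chosen for this example, and nothing here reflects the actual PreReducer interface.

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Hypothetical aggregate: count, mean, and sum of squared deviations (m2)."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0

    def add(self, x: float) -> None:
        # Welford's update: O(1) time and memory per element.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def merge(self, other: "RunningStats") -> "RunningStats":
        # Combine two partial aggregates, e.g. from two slices of a window.
        if self.n == 0:
            return RunningStats(other.n, other.mean, other.m2)
        if other.n == 0:
            return RunningStats(self.n, self.mean, self.m2)
        n = self.n + other.n
        delta = other.mean - self.mean
        mean = self.mean + delta * other.n / n
        m2 = self.m2 + other.m2 + delta * delta * self.n * other.n / n
        return RunningStats(n, mean, m2)

    @property
    def variance(self) -> float:
        # Population variance; defined as 0.0 for an empty aggregate.
        return self.m2 / self.n if self.n else 0.0
```

      Because `merge` only needs the three stored numbers, partial results for parts of a window could be combined without revisiting the elements. Statistics such as the median do not decompose this way and need a different approach.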

      The exact calculation of some statistics (e.g., frequencies or the number of distinct elements) requires memory proportional to the number of elements in the input (the window or the entire stream). However, there are efficient algorithms and data structures that compute the same statistics only approximately, with user-specified error bounds, while using much less memory.
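
      As one example of such a data structure, the following is a minimal sketch of a Count-Min sketch for approximate frequencies. The issue does not say which algorithms the project would use, so this is an illustration only. The class name `CountMinSketch` and the parameters `epsilon` and `delta` are assumptions made for this example, and salted BLAKE2b is used as a practical stand-in for the pairwise-independent hash family that the formal error analysis assumes.

```python
import hashlib
import math


class CountMinSketch:
    """Hypothetical approximate frequency counter with fixed memory."""

    def __init__(self, epsilon: float, delta: float) -> None:
        # Standard sizing: with these dimensions the overestimate is at most
        # epsilon * total with probability at least 1 - delta.
        self.width = math.ceil(math.e / epsilon)
        self.depth = max(1, math.ceil(math.log(1.0 / delta)))
        self.total = 0
        self.table = [[0] * self.width for _ in range(self.depth)]

    def _index(self, item: object, row: int) -> int:
        # One hash function per row, derived by salting BLAKE2b with the row number.
        digest = hashlib.blake2b(
            repr(item).encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, item: object, count: int = 1) -> None:
        # Increment one counter in every row.
        self.total += count
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item: object) -> int:
        # Collisions only inflate counters, so the minimum is the tightest estimate.
        return min(
            self.table[row][self._index(item, row)] for row in range(self.depth)
        )
```

      Memory use is `width * depth` counters regardless of how many elements are seen. With non-negative counts the estimate never falls below the true frequency; it can only overestimate.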

          People

            ggevay Gábor Gévay