FLINK-1729: Assess performance of classification algorithms

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Won't Do

    Description

      To validate Flink's classification algorithms (in terms of performance and accuracy), we should run them on publicly available classification data sets. This will not only serve as evidence for the correctness of the implementations but will also show how easily the machine learning library can be used.

      Bottou [1] published some results for the RCV1 dataset using SVMs for classification. The SVMs are trained using stochastic gradient descent. Thus, they would be a good baseline for comparison with the CoCoA-trained SVMs.
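As a rough illustration of the kind of measurement proposed here, the following sketch trains a linear SVM with stochastic gradient descent on the regularized hinge loss and reports held-out accuracy and training time. It is neither Bottou's code nor Flink's CoCoA implementation: the synthetic data stands in for RCV1, the step-size schedule is a Pegasos-style assumption, and all function names (`make_data`, `train_svm_sgd`, `run_benchmark`) are hypothetical.

```python
import random
import time


def make_data(n, dim, seed=0):
    """Synthetic, linearly separable points with labels in {-1, +1}."""
    rng = random.Random(seed)
    true_w = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    data = []
    while len(data) < n:
        x = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        score = sum(wi * xi for wi, xi in zip(true_w, x))
        if abs(score) < 0.1:
            continue  # drop points near the boundary to keep a clear margin
        data.append((x, 1 if score > 0 else -1))
    return data


def train_svm_sgd(data, dim, epochs=20, lam=0.001, seed=0):
    """Linear SVM via SGD on the L2-regularized hinge loss.

    Assumption: step size eta_t = 1 / (lam * t), as in Pegasos.
    """
    rng = random.Random(seed)
    w = [0.0] * dim
    t = 0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            x, y = data[i]
            t += 1
            eta = 1.0 / (lam * t)
            score = sum(wi * xi for wi, xi in zip(w, x))
            # Gradient step on the regularizer: shrink the weights.
            shrink = 1.0 - eta * lam
            w = [shrink * wi for wi in w]
            # Gradient step on the hinge loss, active only inside the margin.
            if y * score < 1.0:
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w


def predict(w, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1


def accuracy(w, data):
    correct = sum(1 for x, y in data if predict(w, x) == y)
    return correct / len(data)


def run_benchmark(n=1000, dim=5, train_fraction=0.8):
    """Return (held-out accuracy, training time in seconds)."""
    data = make_data(n, dim)
    split = int(n * train_fraction)
    train, test = data[:split], data[split:]
    start = time.perf_counter()
    w = train_svm_sgd(train, dim)
    elapsed = time.perf_counter() - start
    return accuracy(w, test), elapsed


acc, seconds = run_benchmark()
print(f"test accuracy: {acc:.3f}, training time: {seconds:.3f} s")
```

A real comparison would replace `make_data` with a loader for a public data set and run the same accuracy and timing measurement against each implementation under test.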

      More benchmark results and publicly available data sets can be found here [2].

      Resources:
      [1] http://leon.bottou.org/projects/sgd
      [2] https://github.com/BIDData/BIDMach/wiki/Benchmarks

            People

              Assignee: hoa@insightdatascience.com hoa nguyen
              Reporter: trohrmann Till Rohrmann
              Votes: 0
              Watchers: 3
