Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: SystemML 1.2
    • Component/s: None
    • Labels: None

      Description

      In the context of ML, it is more efficient to perform data partitioning in a distributed manner. This task implements data partitioning on Spark: the full dataset is first split among the workers, each worker then partitions its local data according to the configured scheme, and the partitioned data that remains on each worker can be passed directly to model training without materialization on HDFS.
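      The two-step flow above can be sketched in plain Python (a simulation of the Spark behavior, not the SystemML implementation; the function names are hypothetical, and round-robin stands in for whichever partitioning scheme is configured):

      ```python
      def distribute_to_workers(rows, num_workers):
          """Step 1: split the full dataset among workers (stands in for the
          Spark shuffle that assigns each row to one executor)."""
          shards = [[] for _ in range(num_workers)]
          for i, row in enumerate(rows):
              shards[i % num_workers].append(row)
          return shards

      def partition_on_worker(shard, num_slices):
          """Step 2: on each worker, partition the local shard (round-robin
          here) into slices that feed the local training tasks; nothing is
          written back to HDFS."""
          slices = [[] for _ in range(num_slices)]
          for i, row in enumerate(shard):
              slices[i % num_slices].append(row)
          return slices

      rows = list(range(10))  # toy dataset: 10 rows
      shards = distribute_to_workers(rows, num_workers=2)
      partitioned = [partition_on_worker(s, num_slices=2) for s in shards]
      ```

      Because the slices stay in worker memory, the training step can consume them directly instead of re-reading partitioned files from HDFS.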


              People

              • Assignee:
                Guobao LI
              • Reporter:
                Guobao LI
              • Votes: 0
              • Watchers: 2
