Apache Storm / STORM-2147

[Storm SQL] Support automatic spout parallelism based on DataSource metadata

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: storm-sql

      Description

      Storm SQL would benefit from receiving metadata from the data source, especially from a producer, since that metadata can carry hints for optimization.
      A notable kind of hint is the parallelism hint. In storm-kafka, it is normally best to set the spout parallelism equal to the topic's partition count so that the spouts can pull data from all partitions in parallel.

      This would be a starting point for applying non-query optimizations.
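
      To make the parallelism hint concrete, here is a minimal sketch in plain Java with no Storm or Kafka dependencies. Every name in it (`DataSourceMetadata`, `KafkaTopicMetadata`, `resolveSpoutParallelism`) is hypothetical and not part of storm-sql. It assumes a data source can optionally report a hint and that the planner falls back to a default when it does not.

```java
import java.util.OptionalInt;

public class SpoutParallelismSketch {

    /** Hypothetical metadata a data source could expose; not an existing storm-sql type. */
    interface DataSourceMetadata {
        /** Suggested spout parallelism, or empty when the source has no suggestion. */
        OptionalInt parallelismHint();
    }

    /** Hypothetical Kafka-backed metadata: the hint is the topic's partition count. */
    static final class KafkaTopicMetadata implements DataSourceMetadata {
        private final int partitionCount;

        KafkaTopicMetadata(int partitionCount) {
            if (partitionCount < 1) {
                throw new IllegalArgumentException("partitionCount must be >= 1");
            }
            this.partitionCount = partitionCount;
        }

        @Override
        public OptionalInt parallelismHint() {
            // One spout task per partition lets all partitions be read in parallel.
            return OptionalInt.of(partitionCount);
        }
    }

    /** Uses the source's hint when present, otherwise the supplied default. */
    static int resolveSpoutParallelism(DataSourceMetadata metadata, int defaultParallelism) {
        return metadata.parallelismHint().orElse(defaultParallelism);
    }

    public static void main(String[] args) {
        DataSourceMetadata kafka = new KafkaTopicMetadata(8);
        DataSourceMetadata noHint = () -> OptionalInt.empty();

        System.out.println(resolveSpoutParallelism(kafka, 1));  // uses the partition count
        System.out.println(resolveSpoutParallelism(noHint, 1)); // falls back to the default
    }
}
```

      In a real integration the partition count would have to come from the broker, for example from the size of the list returned by the Kafka consumer's `partitionsFor(topic)`. How storm-sql would wire that value into the spout declaration is left open here.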

              People

              • Assignee: Unassigned
              • Reporter: kabhwan Jungtaek Lim
              • Votes: 0
              • Watchers: 2
