Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-25874

Simplify abstractions in the K8S backend

    XMLWordPrintableJSON

Details

    • Umbrella
    • Status: Resolved
    • Major
    • Resolution: Done
    • 2.4.0
    • 3.0.0
    • Kubernetes, Spark Core
    • None

    Description

      I spent some time recently re-familiarizing myself with the k8s backend, and I think there is room for improvement. In the past, SPARK-22839 was added which improved things a lot, but it is still hard to follow the code, and it is still more complicated than it should be to add a new feature.

      I've worked on the main things that were bothering me and came up with these changes:
      https://github.com/vanzin/spark/commits/k8s-simple

      Now that patch (first commit of the branch) is a little large, which makes it hard to review and to properly assess what it is doing. So I plan to break it down into a few steps that I will file as sub-tasks and send for review independently.

      The commit message for that patch has a lot of the background of what I changed and why. Since I plan to delete that branch after the work is done, I'll paste it here:

      There are two main changes happening here.
      
      (1) Simplify the KubernetesConf abstraction.
      
      The current code around KubernetesConf has a few drawbacks:
      
      - it uses composition (with a type parameter) for role-specific configuration
      - it breaks encapsulation of the user configuration, held in SparkConf, by
        requiring that all the k8s-specific info is extracted from SparkConf before
        the KubernetesConf object is created.
      - the above is usually done by parsing the SparkConf info into k8s-backend
        types, which are then transformed into k8s requests.
      
      This ends up requiring a whole lot of code that is just not necessary.
      The type parameters make parameter and class declarations full of needless
      noise; the breakage of encapsulation makes the code that processes SparkConf
      and the code that builds the k8s descriptors live in different places, and
      the intermediate representation isn't adding much value.
      
      By using inheritance instead of the current model, role-specific
      specialization of certain config properties works simply by implementing some
      abstract methods of the base class (instead of breaking encapsulation), and
      there's no need anymore for parameterized types.
      
      By moving config processing to the code that actually transforms the config
      into k8s descriptors, a lot of intermediate boilerplate can be removed.
      
      This leads to...
      
      (2) Make all feature logic part of the feature step itself.
      
      Currently there's code in a lot of places to decide whether a feature
      should be enabled. There's code when parsing the configuration, building
      the custom intermediate representation in a way that is later used by
      different code in a builder class, which then decides whether feature A
      or feature B should be used.
      
      Instead, it's much cleaner to let a feature decide things for itself.
      If the config to enable feature A exists, it proceses the config and
      sets up the necessary k8s descriptors. If it doesn't, the feature is
      a no-op.
      
      This simplifies the shared code that calls into the existing features
      a lot. And does not make the existing features any more complicated.
      
      As part of this I merged the different language binding feature steps
      into a single step. They are sort of related, in the sense that if
      one is applied the others shouldn't, and merging them makes the logic
      to implement that cleaner.
      
      The driver and executor builders are now also much simpler, since they
      have no logic about what steps to apply or not. The tests were removed
      because of that, and some new tests were added to the suites for
      specific features, to verify what the old builder suites were testing.
      
      On top of the above I made a few minor changes (in comparison):
      
      - KubernetesVolumeUtils was modified to just throw exceptions. The old
        code tried to avoid throwing exceptions by collecting results in `Try`
        objects. That was not achieving anything since all the callers would
        just call `get` on those objects, and the first one with a failure
        would just throw the exception. The new code achieves the same
        behavior and is simpler.
      
      - A bunch of small things, mainly to bring the code in line with the
        usual Spark code style. I also removed unnecessary mocking in tests,
        unused imports, and unused configs and constants.
      
      - Added some basic tests for KerberosConfDriverFeatureStep.
      
      Note that there may still be leftover intermediate types that could
      potentially be removed. But at this point the change is already pretty
      large as it is.
      

      Attachments

        Issue Links

          Activity

            People

              Unassigned Unassigned
              vanzin Marcelo Masiero Vanzin
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: