As Zeppelin's notebook evolves for large-scale data analysis, multiple Zeppelin users are expected to connect to the same set of data repositories within an enterprise. Because Zeppelin notebooks can affect data, its state, and its lineage, it is important to separate users, provide them with appropriate sandboxes, and capture the right audit details. Further, an organization's IT department would prefer to run fewer Zeppelin instances (preferably one) to serve its customers. Therefore, the objectives of creating a multi-tenant Zeppelin are:
● Supporting workloads of multiple customers
● Supporting multiple LOBs (lines of business) on a single data system
● Supporting fine-grained audits
As a natural evolution of the Zeppelin Authentication and Authorization design, user awareness (at least in part) in downstream data systems such as Spark, Hive, and others is essential to achieving the objectives stated above.
Google Doc link for collaborating - https://docs.google.com/document/d/1AVGcviyVqWmmbHJmkgUo76ZDSwWAMjwHxmKBhZdAav4/edit?usp=sharing