Goals

  1. Authorize a subset of operations on CDAP entities using Apache Sentry

  2. Make the authorization system pluggable. Support the following two systems to begin with:

    1. Sentry based

    2. CDAP Dataset based

Checklist

  • User stories documented (Rohit/Bhooshan)
  • User stories reviewed (Nitin)
  • Design documented (Rohit/Bhooshan)
  • Design reviewed (Andreas)
  • Feature merged (Rohit/Bhooshan)
  • Examples and guides (Rohit)
  • Integration tests (Bhooshan) 
  • Documentation for feature (Rohit/Bhooshan)
  • Blog post 

User Stories

  • As a CDAP system, I should be able to integrate with Apache Sentry for fine-grained role-based access controls of select CDAP operations 
  • As a CDAP admin, I should be able to easily configure Sentry to work with CDAP on different types of clusters (e.g., CDH, CM clusters, etc.).
  • As a CDAP admin, I should be able to create/update/delete roles in Apache Sentry
  • As a CDAP admin, I should be able to add users/groups to roles in Apache Sentry
  • As a CDAP admin, I should be able to turn authorization on/off easily for entire CDAP instance
  • As a CDAP system, I should be able to authorize the following requests
    • Namespace create/update/delete
    • Application deployment
    • Program start/stop
    • Stream read/write  (Not Implemented in 3.4)
      These operations are a subset that represents the various 'kinds' of operations allowed in CDAP

Scenarios

Scenario #1

  • D-Rock is an IT-Admin extraordinaire who has just been tasked with adding authorization for access to entities in CDAP on the cluster he manages.
  • D-Rock is already familiar with Apache Sentry, since he has used it for authorization in other projects like Apache HDFS, Apache Hive, Apache Sqoop, etc. 
  • He would rather not learn a new authorization system. He would instead prefer that Apache Sentry be used to provide Role Based Access Control to CDAP entities as well.
  • As part of this, he would also like a streamlined installation and configuration experience with Apache Sentry and CDAP, including detailed instructions.

Scenario #2

  • D-Rock manages a variety of CDAP clusters in dev/smoke/qa/staging environments along with the prod environment.
  • For these environments, he would like to be able to turn authorization on/off easily with a switch for the CDAP instance, depending on the need at a given time.

Scenario #3

  • Ideally, D-Rock would like to be able to authorize all operations on all entities in CDAP. 
  • However, this can be rolled out in phases. In the initial phase, he would like to control who can:
    • Create/update/delete a namespace
      • Only users with WRITE permission on CDAP instance should be able to perform this operation.
      • A property in sentry-site.xml will decide a set of users who have admin permission on cdap instance. These admins can then later grant permissions to other users.
    • Deploy an application in a namespace
      • Only users with WRITE permission on the namespace should be able to perform this operation
      • Once the application is deployed, the user who deployed it becomes the ADMIN of the application.
    • Start/stop a program
      • Only users with READ permission on the namespace and application, and EXECUTE permission on the program should be able to perform this operation
      • Only users with ADMIN permission on the program can set preference for the program
      • Only users with WRITE permission can provide runtime args
    • Read/write to a stream
      • Only users with READ privilege on the namespace and READ permission on the stream should be able to read from the stream
      • Only users with READ privilege on the namespace and WRITE permission on the stream should be able to write to the stream
      • Note: We have decided not to handle views separately. A user has the same permissions on all views of a stream as on the stream itself.
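
The hierarchical rule for starting a program (READ on the namespace and application, EXECUTE on the program) can be sketched as below. This is a minimal illustration only: the class name, the string-based entity paths, and the in-memory grant map are assumptions made for the example and are not part of the CDAP design.

```java
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: hierarchical privilege check for starting a program. */
class ProgramStartCheckSketch {
  enum Action { READ, WRITE, EXECUTE, ADMIN }

  // user -> (entity path -> granted actions); an in-memory stand-in for a policy store
  private final Map<String, Map<String, Set<Action>>> grants = new HashMap<>();

  void grant(String user, String entityPath, Action action) {
    grants.computeIfAbsent(user, u -> new HashMap<>())
        .computeIfAbsent(entityPath, e -> EnumSet.noneOf(Action.class))
        .add(action);
  }

  boolean has(String user, String entityPath, Action action) {
    Map<String, Set<Action>> byEntity = grants.get(user);
    if (byEntity == null) {
      return false;
    }
    Set<Action> actions = byEntity.get(entityPath);
    return actions != null && actions.contains(action);
  }

  /** Requires READ on the namespace and application, plus EXECUTE on the program. */
  boolean canStartProgram(String user, String namespace, String app, String program) {
    String appPath = namespace + "/" + app;
    String programPath = appPath + "/" + program;
    return has(user, namespace, Action.READ)
        && has(user, appPath, Action.READ)
        && has(user, programPath, Action.EXECUTE);
  }

  public static void main(String[] args) {
    ProgramStartCheckSketch check = new ProgramStartCheckSketch();
    check.grant("alice", "ns1", Action.READ);
    check.grant("alice", "ns1/app1", Action.READ);
    check.grant("alice", "ns1/app1/prg1", Action.EXECUTE);
    System.out.println(check.canStartProgram("alice", "ns1", "app1", "prg1"));
    System.out.println(check.canStartProgram("bob", "ns1", "app1", "prg1"));
  }
}
```

Note that in this sketch a missing grant at any level of the hierarchy denies the operation, even if the user holds EXECUTE on the program itself.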

Entities, Operations and Privileges

| Entity | Operation | Required Privileges | Resultant Privileges |
| Namespace | create | ADMIN (Instance) | ADMIN (Namespace) |
| | update | ADMIN (Namespace) | |
| | list | READ (Instance) | |
| | get | READ (Namespace) | |
| | delete | ADMIN (Namespace) | |
| | set preference | WRITE (Namespace) | |
| | get preference | READ (Namespace) | |
| | search | READ (Namespace) | |
| Artifact | add | WRITE (Namespace) | ADMIN (Artifact) |
| | delete | ADMIN (Artifact) | |
| | get | READ (Artifact) | |
| | list | READ (Namespace) | |
| | write property | ADMIN (Artifact) | |
| | delete property | ADMIN (Artifact) | |
| | get property | READ (Artifact) | |
| | refresh | WRITE (Instance) | |
| | write metadata | ADMIN (Artifact) | |
| | read metadata | READ (Artifact) | |
| Application | deploy | WRITE (Namespace) | ADMIN (Application) |
| | get | READ (Application) | |
| | list | READ (Namespace) | |
| | update | ADMIN (Application) | |
| | delete | ADMIN (Application) | |
| | set preference | WRITE (Application) | |
| | get preference | READ (Application) | |
| | add metadata | ADMIN (Application) | |
| | get metadata | READ (Application) | |
| Programs | start/stop/debug | EXECUTE (Program) | |
| | set instances | ADMIN (Program) | |
| | list | READ (Namespace) | |
| | set runtime args | EXECUTE (Program) | |
| | get runtime args | READ (Program) | |
| | get instances | READ (Program) | |
| | set preference | ADMIN (Program) | |
| | get preference | READ (Program) | |
| | get status | READ (Program) | |
| | get history | READ (Program) | |
| | add metadata | ADMIN (Program) | |
| | get metadata | READ (Program) | |
| | emit logs | WRITE (question) (Program) | |
| | view logs | READ (Program) | |
| | emit metrics | WRITE (question) (Program) | |
| | view metrics | READ (Program) | |
| Streams | create | WRITE (Namespace) | ADMIN (Stream) |
| | update properties | ADMIN (Stream) | |
| | delete | ADMIN (Stream) | |
| | truncate | ADMIN (Stream) | |
| | enqueue, asyncEnqueue, batch | WRITE (Stream) | |
| | get | READ (Stream) | |
| | list | READ (Namespace) | |
| | read events | READ (Stream) | |
| | set preferences | ADMIN (Stream) | |
| | get preferences | READ (Stream) | |
| | add metadata | ADMIN (Stream) | |
| | get metadata | READ (Stream) | |
| | view lineage | READ (Stream) | |
| | emit metrics | WRITE (question) (Stream) | |
| | view metrics | READ (Stream) | |
| Datasets | list | READ (Namespace) | |
| | get | READ (Dataset) | |
| | create | WRITE (Namespace) | ADMIN (Dataset) |
| | update | ADMIN (Dataset) | |
| | drop | ADMIN (Dataset) | |
| | executeAdmin (exists/truncate/upgrade) | ADMIN (Dataset) | |
| | add metadata | ADMIN (Dataset) | |
| | get metadata | READ (Dataset) | |
| | view lineage | READ (Dataset) | |
| | emit metrics | WRITE (question) (Dataset) | |
| | view metrics | READ (Dataset) | |

NOTE: Cells marked green are in scope for 3.4
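
One way to picture the table is as a lookup from an operation to the privilege it requires and the privilege the caller gains afterwards. The sketch below encodes a handful of rows that way. The class, the `Rule` type, and the operation keys such as `application.deploy` are hypothetical names chosen for this example; the design does not prescribe this representation.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: a few rows of the privileges table encoded as data. */
class PrivilegeTableSketch {
  enum Action { READ, WRITE, EXECUTE, ADMIN }
  enum EntityType { INSTANCE, NAMESPACE, APPLICATION, PROGRAM, STREAM }

  /** One table row: what is required, and what the caller is granted on success. */
  static final class Rule {
    final Action required;
    final EntityType requiredOn;
    final Action resultant;        // null when the operation grants nothing
    final EntityType resultantOn;  // null when the operation grants nothing

    Rule(Action required, EntityType requiredOn, Action resultant, EntityType resultantOn) {
      this.required = required;
      this.requiredOn = requiredOn;
      this.resultant = resultant;
      this.resultantOn = resultantOn;
    }
  }

  // Operation keys are illustrative assumptions, not CDAP identifiers.
  private static final Map<String, Rule> RULES = new HashMap<>();
  static {
    RULES.put("namespace.create",
        new Rule(Action.ADMIN, EntityType.INSTANCE, Action.ADMIN, EntityType.NAMESPACE));
    RULES.put("application.deploy",
        new Rule(Action.WRITE, EntityType.NAMESPACE, Action.ADMIN, EntityType.APPLICATION));
    RULES.put("program.start",
        new Rule(Action.EXECUTE, EntityType.PROGRAM, null, null));
    RULES.put("stream.create",
        new Rule(Action.WRITE, EntityType.NAMESPACE, Action.ADMIN, EntityType.STREAM));
    RULES.put("stream.read-events",
        new Rule(Action.READ, EntityType.STREAM, null, null));
  }

  static Rule lookup(String operation) {
    Rule rule = RULES.get(operation);
    if (rule == null) {
      throw new IllegalArgumentException("No rule for operation: " + operation);
    }
    return rule;
  }

  public static void main(String[] args) {
    Rule deploy = lookup("application.deploy");
    System.out.println("deploy requires " + deploy.required + " on " + deploy.requiredOn
        + " and grants " + deploy.resultant + " on " + deploy.resultantOn);
  }
}
```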

Design

This feature can be broken down into the following main parts, in no specific order:

Authorization in CDAP

The authorization system in CDAP will be pluggable, and the backend can be provided by external systems like Apache Sentry/Ranger. It provides:

  • Authorization Enforcement hooks during various operations within CDAP, that throw AuthorizationException if the operation is not authorized.
  • ACL Management

This system exposes a set of interfaces defined below. 

AuthEnforcer

The AuthEnforcer interface provides a way to check if an operation is authorized. At various points in the CDAP code (NamespaceHttpHandler, AppLifecycleHttpHandler, ProgramLifecycleHttpHandler, StreamHandler in 3.4), this interface will be used to check if an operation is authorized.

AuthEnforcer Interface
interface AuthEnforcer {
  /**
   * Enforces authorization for the specified {@link Principal} for the specified {@link Action} on the specified {@link EntityId}.
   *
   * @param principal the principal that performs the actions. This could be a user, group or a role
   * @param entity the entity on which an action is being performed
   * @param action the action being performed
   * @throws AuthorizationException if the principal is not authorized to perform action on the entity
   */
  void enforce(Principal principal, EntityId entity, Action action) throws AuthorizationException;
}
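
As a rough illustration of an enforcement hook, the sketch below shows a handler-style method that calls an enforcer before doing any work and turns a failed check into an HTTP 403. Everything here is a simplified stand-in: the `Enforcer` interface uses plain strings instead of `Principal`, `EntityId` and `Action`, and `deployApp` and `allowOnly` are hypothetical helpers, not CDAP handler code.

```java
/** Hypothetical sketch: calling an enforcer at the start of a handler operation. */
class EnforcementHookSketch {

  static class AuthorizationException extends Exception {
    AuthorizationException(String message) {
      super(message);
    }
  }

  /** Simplified stand-in for the enforcement interface, using strings for brevity. */
  interface Enforcer {
    void enforce(String principal, String entity, String action) throws AuthorizationException;
  }

  /** Builds an enforcer that authorizes a single user for everything (demo only). */
  static Enforcer allowOnly(final String allowedUser) {
    return (principal, entity, action) -> {
      if (!allowedUser.equals(principal)) {
        throw new AuthorizationException(
            principal + " is not authorized to " + action + " " + entity);
      }
    };
  }

  /** Returns an HTTP-style status code: 403 if the check fails, 200 otherwise. */
  static int deployApp(Enforcer enforcer, String user, String namespace) {
    try {
      // Deploying an application requires WRITE on the namespace.
      enforcer.enforce(user, namespace, "write");
    } catch (AuthorizationException e) {
      return 403;
    }
    // The actual deployment would happen here.
    return 200;
  }

  public static void main(String[] args) {
    Enforcer enforcer = allowOnly("alice");
    System.out.println(deployApp(enforcer, "alice", "ns1"));
    System.out.println(deployApp(enforcer, "bob", "ns1"));
  }
}
```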

Authorizer

This interface allows CDAP admins to grant/revoke permissions for specific operations on specific CDAP entities to specified Principals. It will be used by the ACL Management module, which may or may not reside in CDAP for the purposes of integration with Apache Sentry (question) TBD.

Authorizer Interface
public interface Authorizer {
  /**
   * Initialize the {@link Authorizer}. Authorization extensions can use this method to access an
   * {@link AuthorizationContext} that allows them to interact with CDAP for operations such as creating and accessing
   * datasets, executing dataset operations in transactions, etc.
   *
   * @param context the {@link AuthorizationContext} that can be used to interact with CDAP
   */
  void initialize(AuthorizationContext context) throws Exception;

  /**
   * Enforces authorization for the specified {@link Principal} for the specified {@link Action} on the specified
   * {@link EntityId}.
   *
   * @param entity the {@link EntityId} on which authorization is to be enforced
   * @param principal the {@link Principal} that performs the actions
   * @param action the {@link Action} being performed
   * @throws UnauthorizedException if the principal is not authorized to perform action on the entity
   * @throws Exception if any other errors occurred while performing the authorization enforcement check
   */
  void enforce(EntityId entity, Principal principal, Action action) throws Exception;

  /**
   * Grants a {@link Principal} authorization to perform a set of {@link Action actions} on an {@link EntityId}.
   *
   * @param entity the {@link EntityId} to whom {@link Action actions} are to be granted
   * @param principal the {@link Principal} that performs the actions. This could be a user, or role
   * @param actions the set of {@link Action actions} to grant.
   */
  void grant(EntityId entity, Principal principal, Set<Action> actions) throws Exception;

  /**
   * Revokes a {@link Principal principal's} authorization to perform a set of {@link Action actions} on
   * an {@link EntityId}.
   *
   * @param entity the {@link EntityId} whose {@link Action actions} are to be revoked
   * @param principal the {@link Principal} that performs the actions. This could be a user, group or role
   * @param actions the set of {@link Action actions} to revoke
   */
  void revoke(EntityId entity, Principal principal, Set<Action> actions) throws Exception;

  /**
   * Revokes all {@link Principal principals'} authorization to perform any {@link Action} on the given
   * {@link EntityId}.
   *
   * @param entity the {@link EntityId} on which all {@link Action actions} are to be revoked
   */
  void revoke(EntityId entity) throws Exception;

  /**
   * Returns all the {@link Privilege} for the specified {@link Principal}.
   *
   * @param principal the {@link Principal} for which to return privileges
   * @return a {@link Set} of {@link Privilege} for the specified principal
   */
  Set<Privilege> listPrivileges(Principal principal) throws Exception;

  /********************************* Role Management: APIs for Role Based Access Control ******************************/
  /**
   * Create a role.
   *
   * @param role the {@link Role} to create
   * @throws RoleAlreadyExistsException if the role to be created already exists
   */
  void createRole(Role role) throws Exception;

  /**
   * Drop a role.
   *
   * @param role the {@link Role} to drop
   * @throws RoleNotFoundException if the role to be dropped is not found
   */
  void dropRole(Role role) throws Exception;

  /**
   * Add a role to the specified {@link Principal}.
   *
   * @param role the {@link Role} to add to the specified principal
   * @param principal the {@link Principal} to add the role to
   * @throws RoleNotFoundException if the role to be added to the principal is not found
   */
  void addRoleToPrincipal(Role role, Principal principal) throws Exception;

  /**
   * Delete a role from the specified {@link Principal}.
   *
   * @param role the {@link Role} to remove from the specified principal
   * @param principal the {@link Principal} to remove the role from
   * @throws RoleNotFoundException if the role to be removed from the principal is not found
   */
  void removeRoleFromPrincipal(Role role, Principal principal) throws Exception;

  /**
   * Returns a set of all {@link Role roles} for the specified {@link Principal}.
   *
   * @param principal the {@link Principal} to look up roles for
   * @return Set of {@link Role} for the specified {@link Principal}
   */
  Set<Role> listRoles(Principal principal) throws Exception;

  /**
   * Returns all available {@link Role}. Only a super user can perform this operation.
   *
   * @return a set of all available {@link Role} in the system.
   */
  Set<Role> listAllRoles() throws Exception;

  /**
   * Destroys an {@link Authorizer}. Authorization extensions can use this method to write any cleanup code.
   */
  void destroy() throws Exception;
}

Where Principal is the entity performing actions defined as below:

Principal
public class Principal {
	enum PrincipalType {
		USER,
		GROUP,
		ROLE
	}
 
	private final String name;
	private final PrincipalType type;
 
	public Principal(String name, PrincipalType type) {
		this.name = name;
		this.type = type;
	}
 
	public String getName() {
		return name;
	}
 
	public PrincipalType getType() {
		return type;
	}
}
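
To make the grant/revoke/enforce contract concrete, here is a minimal in-memory sketch. It is not the dataset-based or Sentry-based authorizer: entities are plain strings, the check returns a boolean instead of throwing, and the class name is hypothetical. It also assumes that a principal used as a map key needs `equals` and `hashCode`, which the class shown above does not define.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

/** Hypothetical sketch: an in-memory store with grant/revoke/check semantics. */
class InMemoryAuthorizerSketch {
  enum Action { READ, WRITE, EXECUTE, ADMIN }
  enum PrincipalType { USER, GROUP, ROLE }

  /** Principal with value equality so that it can be used as a map key. */
  static final class Principal {
    final String name;
    final PrincipalType type;

    Principal(String name, PrincipalType type) {
      this.name = name;
      this.type = type;
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof Principal)) {
        return false;
      }
      Principal other = (Principal) o;
      return name.equals(other.name) && type == other.type;
    }

    @Override
    public int hashCode() {
      return Objects.hash(name, type);
    }
  }

  // entity -> (principal -> granted actions)
  private final Map<String, Map<Principal, Set<Action>>> acls = new HashMap<>();

  void grant(String entity, Principal principal, Set<Action> actions) {
    acls.computeIfAbsent(entity, e -> new HashMap<>())
        .computeIfAbsent(principal, p -> EnumSet.noneOf(Action.class))
        .addAll(actions);
  }

  void revoke(String entity, Principal principal, Set<Action> actions) {
    Map<Principal, Set<Action>> byPrincipal = acls.get(entity);
    if (byPrincipal != null && byPrincipal.containsKey(principal)) {
      byPrincipal.get(principal).removeAll(actions);
    }
  }

  /** Revokes every principal's privileges on the entity. */
  void revoke(String entity) {
    acls.remove(entity);
  }

  boolean isAuthorized(String entity, Principal principal, Action action) {
    return acls.getOrDefault(entity, Collections.<Principal, Set<Action>>emptyMap())
        .getOrDefault(principal, Collections.<Action>emptySet())
        .contains(action);
  }

  public static void main(String[] args) {
    InMemoryAuthorizerSketch authorizer = new InMemoryAuthorizerSketch();
    Principal alice = new Principal("alice", PrincipalType.USER);
    authorizer.grant("ns1", alice, EnumSet.of(Action.READ, Action.WRITE));
    System.out.println(authorizer.isAuthorized("ns1", alice, Action.WRITE));
    authorizer.revoke("ns1", alice, EnumSet.of(Action.WRITE));
    System.out.println(authorizer.isAuthorized("ns1", alice, Action.WRITE));
  }
}
```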

Integration with Apache Sentry will be achieved by implementations of these interfaces that delegate to Apache Sentry.

Integration with Apache Sentry

Integration with Apache Sentry involves the development of three main modules:

CDAP Sentry Binding

Here we will bind CDAP to SentryGenericServiceClient and to the operations on the client.

SentryAuthorizer
public class SentryAuthorizer implements Authorizer {

  @Override
  public void grant(EntityId entity, Principal principal, Set<Action> actions) throws Exception {
    // do grant operation on sentry client with needed mapping/conversion
  }
  ...
  ...

  private SentryGenericServiceClient getClient() throws Exception {
    // create sentry client from Configuration
    return SentryGenericServiceClientFactory.create(conf);
  }
}

CDAP Sentry Model

The CDAP Sentry Model defines the CDAP entities for which access needs to be authorized via Apache Sentry. It will be based on the Sentry Generic Authorization Model. The CDAP Sentry Model will have the following components:

CDAPAuthorizable

This interface defines the CDAP entities that need to be authorized. It must extend Sentry's Authorizable interface.

CDAPAuthorizable
/**
 * This interface represents an authorizable resource in the CDAP component.
 */
public interface CDAPAuthorizable extends Authorizable {

  enum AuthorizableType {
    Instance,
    Namespace,
    Artifact,
    Application,
    Program,
    Dataset,
    Stream
  }

  AuthorizableType getAuthzType();
}

The CDAPAuthorizable interface will have to be implemented for each authorizable entity defined by the AuthorizableType enum above.

CDAPAction and CDAPActionFactory

These classes will implement BitFieldAction and BitFieldActionFactory to define the types of actions on CDAP entities. These classes also allow you to define implies relationships between actions.

TODO: Think about ALL, ADMIN_ALL

CDAPActions
public class CDAPActionConstants {
  public static final String READ = "read";
  public static final String EXECUTE = "execute";
  public static final String WRITE = "write";
  public static final String ADMIN = "admin"; // this is read + write + execute + admin (create/update/delete)
}
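
The "implies" relationship between actions can be illustrated with plain bit masks. This sketch deliberately does not use Sentry's BitFieldAction classes; the bit values and the `ADMIN_ONLY` bit are assumptions chosen so that ADMIN covers read, write, execute and admin, as described in the comment on the ADMIN constant.

```java
/** Hypothetical sketch: action codes as bit masks with an "implies" check. */
class ActionBitsSketch {
  static final int READ = 1;
  static final int WRITE = 1 << 1;
  static final int EXECUTE = 1 << 2;
  // The admin-specific bit (create/update/delete); the name is illustrative.
  static final int ADMIN_ONLY = 1 << 3;
  // ADMIN is read + write + execute + admin.
  static final int ADMIN = READ | WRITE | EXECUTE | ADMIN_ONLY;

  /** A granted action implies a requested one if it contains all of its bits. */
  static boolean implies(int granted, int requested) {
    return (granted & requested) == requested;
  }

  /** Maps the string constants used for actions to their bit codes. */
  static int codeOf(String name) {
    switch (name) {
      case "read":
        return READ;
      case "write":
        return WRITE;
      case "execute":
        return EXECUTE;
      case "admin":
        return ADMIN;
      default:
        throw new IllegalArgumentException("Unknown action: " + name);
    }
  }

  public static void main(String[] args) {
    System.out.println(implies(codeOf("admin"), codeOf("write")));
    System.out.println(implies(codeOf("read"), codeOf("write")));
  }
}
```

With this encoding, granting ADMIN on an entity makes a separate READ, WRITE or EXECUTE grant redundant, while none of the three weaker actions implies another.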

Sentry Policy Engine

Resource URIs

Using the above authorizable model, resource URIs for CDAP entities in the Sentry Policy Engine will be as follows:

| Entity | Sentry Resource URI |
| Instance | cdap:///instance=server1 |
| Namespace | cdap:///instance=server1/namespace=ns1 |
| Artifact | cdap:///instance=server1/namespace=ns1/artifact=art/artifactVersion=1 |
| Application | cdap:///instance=server1/namespace=ns1/application=app1 |
| Program | cdap:///instance=server1/namespace=ns1/application=app1/programType=pt1/programName=prg1 |
| Dataset | cdap:///instance=server1/namespace=ns1/dataset=ds1 |
| Stream | cdap:///instance=server1/namespace=ns1/stream=s1 |

The above URIs are internal Apache Sentry representations defined at SentryAuthorizationModelDesign. They are only mentioned here to convey how the CDAP entity hierarchy will be represented in Apache Sentry.
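
The hierarchy in these URIs is an ordered list of key=value segments. The following sketch builds such strings from an ordered map; the class and method names are hypothetical, and the code only reproduces the textual form shown above rather than calling any Sentry API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: rendering an entity hierarchy as a key=value path. */
class ResourceUriSketch {

  /** Joins the parts, in insertion order, as cdap:///k1=v1/k2=v2/... */
  static String toUri(Map<String, String> parts) {
    StringBuilder sb = new StringBuilder("cdap:///");
    boolean first = true;
    for (Map.Entry<String, String> part : parts.entrySet()) {
      if (!first) {
        sb.append('/');
      }
      sb.append(part.getKey()).append('=').append(part.getValue());
      first = false;
    }
    return sb.toString();
  }

  /** Convenience helper for the stream case. */
  static String streamUri(String instance, String namespace, String stream) {
    Map<String, String> parts = new LinkedHashMap<>();
    parts.put("instance", instance);
    parts.put("namespace", namespace);
    parts.put("stream", stream);
    return toUri(parts);
  }

  public static void main(String[] args) {
    // prints cdap:///instance=server1/namespace=ns1/stream=s1
    System.out.println(streamUri("server1", "ns1", "s1"));
  }
}
```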

Interaction Diagram

Use-case: App Deployment by an unauthorized user

Configuration

Sentry

| Property | Description | Value |
| sentry.service.allow.connect | List of users allowed to connect to the Sentry Server | cdap will be added to this list |
| sentry.cdap.provider | Authorization provider for the CDAP component in Sentry. This class defines the user-group mapping amongst other things. | org.apache.sentry.provider.common.HadoopGroupResourceAuthorizationProvider |
| sentry.cdap.provider.resource | The resource for creating the Sentry Provider Backend. This property seems unused, and always defaults to "". However, all data engines (hive, sqoop, kafka) define it. | "" |
| sentry.cdap.provider.backend | A class that implements ProviderBackend. This class uses a SentryServiceClient to communicate with the sentry service from the client side in Sentry. | org.apache.sentry.provider.db.generic.SentryGenericProviderBackend |
| sentry.cdap.policy.engine | Defines the Sentry Policy Engine for the cdap component. Must implement org.apache.sentry.policy.common.PolicyEngine | co.cask.cdap.security.authorization.sentry.policy.PolicyEngine (package name subject to change) |

CDAP

These properties will be defined in cdap-security.xml

| Property | Description | Default |
| security.authorization.enabled | Determines whether authorization should be enabled in CDAP. If false, a NoOpAuthorizer would be used for security.authorizer.class | false |
| security.authorizer.class | Fully qualified class name of the authorizer class. Must implement the Authorizer interface | co.cask.cdap.security.authorization.DatasetBasedAuthorizer |
| instance.name | Defines the instance name for the cdap component. | cdap |
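
The on/off switch and the pluggable class name could interact roughly as in the sketch below: when authorization is disabled a no-op authorizer is returned, otherwise the configured class is loaded reflectively. This is an assumption about how the two properties might be consumed, not the CDAP implementation. The simplified `Authorizer` interface, the `DenyAllAuthorizer` class and the use of `java.util.Properties` in place of the CDAP configuration object are all stand-ins, and no default class is applied here.

```java
import java.util.Properties;

/** Hypothetical sketch: choosing an authorizer from configuration properties. */
class AuthorizerSelectionSketch {

  /** Simplified stand-in for the authorizer contract. */
  interface Authorizer {
    boolean isAllowed(String principal, String entity, String action);
  }

  /** Used when authorization is disabled: everything is allowed. */
  static class NoOpAuthorizer implements Authorizer {
    @Override
    public boolean isAllowed(String principal, String entity, String action) {
      return true;
    }
  }

  /** Demo-only pluggable implementation that rejects everything. */
  static class DenyAllAuthorizer implements Authorizer {
    @Override
    public boolean isAllowed(String principal, String entity, String action) {
      return false;
    }
  }

  static Authorizer create(Properties conf) throws Exception {
    boolean enabled =
        Boolean.parseBoolean(conf.getProperty("security.authorization.enabled", "false"));
    if (!enabled) {
      return new NoOpAuthorizer();
    }
    String className = conf.getProperty("security.authorizer.class");
    if (className == null) {
      throw new IllegalStateException("security.authorizer.class is not set");
    }
    // Load the configured class by name; it must have a no-argument constructor.
    return (Authorizer) Class.forName(className).getDeclaredConstructor().newInstance();
  }

  public static void main(String[] args) throws Exception {
    Properties conf = new Properties();
    conf.setProperty("security.authorization.enabled", "true");
    conf.setProperty("security.authorizer.class", DenyAllAuthorizer.class.getName());
    System.out.println(create(conf).isAllowed("alice", "ns1", "read"));
  }
}
```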

Role Management

To support RBAC (Role Based Access Control) systems such as Apache Sentry, we will need to support role management through CDAP.

A user using RBAC should be able to:

  • Create a role
  • Delete a role
  • Add a role to a principal (where principal can be of type user or group)
  • Remove a role from a principal (where principal can be of type user or group)
  • List roles
  • List roles for principal
  • List privileges for role
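
The role operations listed above can be sketched with a small in-memory model. The class below is illustrative only: roles and principals are plain strings, the exception names mirror those in the Authorizer interface but are redefined locally, and nothing here talks to Sentry or to CDAP.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch: in-memory role management. */
class RoleManagementSketch {

  static class RoleAlreadyExistsException extends Exception {
    RoleAlreadyExistsException(String role) {
      super("Role already exists: " + role);
    }
  }

  static class RoleNotFoundException extends Exception {
    RoleNotFoundException(String role) {
      super("Role not found: " + role);
    }
  }

  // role name -> principals that hold the role
  private final Map<String, Set<String>> members = new HashMap<>();

  void createRole(String role) throws RoleAlreadyExistsException {
    if (members.containsKey(role)) {
      throw new RoleAlreadyExistsException(role);
    }
    members.put(role, new HashSet<String>());
  }

  void dropRole(String role) throws RoleNotFoundException {
    if (members.remove(role) == null) {
      throw new RoleNotFoundException(role);
    }
  }

  void addRoleToPrincipal(String role, String principal) throws RoleNotFoundException {
    requireRole(role).add(principal);
  }

  void removeRoleFromPrincipal(String role, String principal) throws RoleNotFoundException {
    requireRole(role).remove(principal);
  }

  /** Roles held by the given principal, sorted by name. */
  Set<String> listRoles(String principal) {
    Set<String> roles = new TreeSet<>();
    for (Map.Entry<String, Set<String>> entry : members.entrySet()) {
      if (entry.getValue().contains(principal)) {
        roles.add(entry.getKey());
      }
    }
    return roles;
  }

  Set<String> listAllRoles() {
    return new TreeSet<>(members.keySet());
  }

  private Set<String> requireRole(String role) throws RoleNotFoundException {
    Set<String> principals = members.get(role);
    if (principals == null) {
      throw new RoleNotFoundException(role);
    }
    return principals;
  }

  public static void main(String[] args) throws Exception {
    RoleManagementSketch roles = new RoleManagementSketch();
    roles.createRole("developers");
    roles.addRoleToPrincipal("developers", "alice");
    System.out.println(roles.listRoles("alice"));
  }
}
```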

We will need to support these role management operations through REST APIs and also through the CLI. The proposed APIs and CLI commands are:

Authorization API

Security CLI commands

ACL management

There are multiple options for ACL Management. For dataset-based authorizer, we will have to support ACL Management via the CDAP CLI.

For Apache Sentry based authorizer, there are multiple options. We should support this via the CDAP CLI because it should involve very little extra work. However, support should also be provided via the SentryShell as well as Hue.

Although supporting the Sentry Shell seems straightforward once the CDAP backend for Sentry is implemented, it's a relatively new feature added in Sentry 1.7 (SENTRY-749). CDH 5.5 ships Sentry 1.5 and there are no timelines on support for Sentry 1.7 (Cloudera Maven Repository).

After some digging, we found out that SentryShell is hardcoded to work with Hive, and it works only with Hive. At the time of writing, Kafka is adding support for SentryShell by making a copy of Hive's SentryShell. This seems to be the norm in Sentry for shell support, since there is no generic shell that can be used by the services being integrated with Sentry. Unless we have a strong reason, we should avoid supporting CDAP through SentryShell, especially since we are already working on supporting ACL management for CDAP in Sentry through Hue. See below.

For recognizing and listing CDAP entities in Hue, we will have to implement a CDAP Webapp for Hue. Hue is implemented entirely in Python using the Django framework. This integration is a risk for 3.4. More details on this TBD.

Hue Integration

Testing

For testing the Sentry integration, there are a couple of approaches. We can use the file-based policy store in Apache Sentry for tests. However, to simulate more realistic scenarios, we should explore whether it is easy to set up an in-memory database (HSQL, etc.) with the Sentry schema in tests.

Installation

Questions

  1. How does CDAP get sentry-site.xml? Path provided via cConf?
  2. Distinguishing Read/Write access is perhaps out of scope of 3.4, since we will need changes to Dataset Framework
  3. Can access to all entities be authorized in one go? If so, how? 
  4. How does hierarchy work? e.g. write to stream requires READ perms on namespace + write perms on stream
  5. In a secure/kerberos environment, what does it take to communicate with the Sentry Server?
  6. Given that Sentry has a slightly data-engine-based schema, will we need some updates to the policy store to contain CDAP specific tables for storing CDAP Privileges? SENTRY_CDAP_PRIVILEGE and SENTRY_CDAP_PRIVILEGE_MAP tables?
  7. What about instance-level authorization? Would users need to be authorized to a given CDAP instance as well, along with the namespace and entity?
  8. Do we need the EXECUTE operation just for the Program entity? Can we say that any user who has READ can run the program?

Discussion Bhooshan & Rohit 02/17

 

CDAP Specific | External Auth Service: Sentry | ACL Management
  1. Provide Authorization Hooks in CDAP
    1. Intercept all HTTP calls
    2. Thrift calls
    3. Access to data from programs
  1. Modules to implement
    1. Binding
    2. Model
    3. Policy
    4. E2E Tests
  1. Should CDAP do ACL Management
    1. CLI
    2. HTTP Handlers

    3. If we assume ACLs are set in Sentry through Sentry, what if we switch to a Dataset-based store?

2. Authorization Checks

Check
for a given user/group and type of access
	if allowed:
		perform operation
	else:
		throw AuthException

2. Figuring out how to interact with Sentry

    • SentryGenericServiceClient
    • How to know where Sentry is running?

 

 

 
3. We need an Authorization interface  

Discussion with Gokul 02/08

  • Push down ACLs  - No HBase support in Sentry
  • Custom datasets - how do you recognize read/writes
  • How do you distinguish between read/write
  • Sentry Integration - needs follow-ups
  • Performance (num RPC calls)
  • Sentry Persistent Storage - PolicyStoreProvider
  • Interactions with Auth system
  • Sentry web-app for UI may need customizations in Hue
  • How does switching between authorization enabled/disabled work

Out-of-scope User Stories (3.5 and beyond)

  1. As a CDAP admin, I should be able to authorize reads/writes to datasets
  2. As a CDAP admin, I should be able to authorize metadata changes to CDAP entities
  3. As a CDAP system, I should be able to push down ACLs to storage providers
  4. As a CDAP admin, I should be able to authorize reads/writes to custom datasets
  5. As a CDAP system, I should be able to judge, document and improve the performance impact of authorization
  6. As a CDAP authorization system, I should be able to interact with an external authentication system
  7. As a CDAP admin, I should be able to use external UIs like Hue for ACL Management
  8. As a CDAP admin, I should be able to see an audit log of all authorization-related changes in CDAP
  9. As a CDAP admin, I should be able to authorize all thrift-based traffic, so transaction management is also authorized.

References


37 Comments

    • Deploy an application in a namespace
      Only users with READ permission on the namespace, and ADMIN permission on the application should be able to perform this operation


    I am not sure if I understand this completely. When a user is trying to deploy an application, the application does not exist yet, and since the entity does not exist he/she cannot have any permission on it. Also, a user who just has read permission on the namespace should only be allowed to see the namespace and browse it, for example viewing all the applications deployed in the NS. If he/she also has read permission on a stream in a NS, he/she can view that too and run explore queries on it.

    But to deploy an application in a NS he/she should have write permission on the NS, not just read. It's very similar to creating a file inside a directory in Linux, where we can think of the NS as a directory and the application as a file.

    1. Per our discussion:

      1. Write perms on instance -> can create a namespace, becomes admin of that namespace, can give admin access to the namespace (or any other permission) to other roles.
      2. Write perms on namespace -> can deploy apps, becomes admin of those apps. Can redeploy, etc. Can also grant perms on the apps.

       

    • Create/update/delete a namespace
      Only users with ADMIN permission should be able to perform this operation.

    Admin permission on the CDAP Instance right ?

    Also, when an user creates a NS by default he gets admin permission on it right ?

    1. yes, that sounds better. also, such an admin can then grant admin rights to others.

    • Start/stop a program
      Only users with READ permission on the namespace and application, and ADMIN permission on the program should be able to perform this operation


    I think requiring admin permission to run/stop a program might be a little confusing. How about this: a user who has write permission on the program will be able to run/stop it, but only a user with admin permission will be able to change the runtime args/preferences.

     

    We also  talked about execute operation. Do you think it makes more sense to require execute permission to run/stop than read/write/admin ?

    1. yes, execute sounds better. also, runtime arguments should require the same permissions as start/stop. preferences may be admin, but i'm not sure

    • Read/write to a stream
      Only users with READ privilege on the namespace and READ permission on the stream should be able to read from the stream
      Only users with READ privilege on the namespace and WRITE permission on the stream should be able to write to the stream

    Also, we should think about explore queries on Streams. Insert into/delete will require write access on stream and we will have to handle that for explore queries too. 

    1. Authorizing stream read/writes will send 403 forbidden to explore if the operations are not authorized, right? 

  1.     /**
         * Grants a Principal authorization to perform all actions on an entity.
         *
         * @param entity the entity on which an action is being performed
         * @param principal the Principal that performs the actions. This could be a user, group or a role
         */
        void grant(EntityId entity, Principal principal, Set<Action> actions);


    Do you think we should have another method or boolean in this method itself to determine whether to give all (read/write/execute) or  admin_all (read, write, execute with admin)
    1. I think lets keep it simple. If you want to give read, write and execute, you pass in a Set with read, write and execute perms.

  2. We are keeping ACL out of Dataset from the scope of 3.4 as currently we don't have a way to know read/write on dataset from a user program. We will first need to solve this to provide access control. 

      • These operations are a subset that represents the various 'kinds' of operations allowed in CDAP

    Should we name all the operations that will eventually be supported? Access to datasets etc.? Scope of release 3.4 is independent of that, I think.

  3. he would like to be able to turn authorization on/off easily with a switch, depending on the need at a given time.

    Is that for an entire CDAP instance? Or a namespace? Or individual "entities"?

    1. It will be for entire cdap instance. Updated the sentence. 

      1. However, Andreas Neumann, are there use cases to do this more granularly? If so, we could add it as a future enhancement and keep in mind while implementing the first version as well.

  4. Only users with WRITE permission on CDAP instance

    What are the possible privileges? WRITE, READ, EXECUTE? 

    1. The set of actions are READ/WRITE/EXECUTE and ADMIN (create/update/delete) we have listed them in CDAPActionConstants class.

  5.  

      • One the application is deployed the the user who deployed becomes the owner

     

    What privilege does "owner" correspond to? WRITE? Can the owner delegate privileges to others?

    1. Owner will have the ADMIN permission and he can delegate it to others. I will update the design to include this. 

  6. A property in cdap-site.xml should decide a set of users who have admin permission on cdap instance.

    Seems that this would better be in cdap-security.xml? 

    I am actually wondering whether it should be entirely separate. Otherwise cdap-site.xml becomes an ACL- store.

    1. Yes good suggestion. We will keep it in cdap-security.xml

      For now, I think we just need 2-3 properties to be defined so we are planning to use the cdap-security.xml. If the required property list grows further we can have a separate configuration file for it. 

  7. Scenario #3

    I think it is difficult to list all operations and required permissions here. A tabular form would be much more concise. 

    I also think some of these are debatable. 

    1. Just created a table - CDAP3.4-Entities,OperationsandPrivileges. Please review again.

  8. Read/write to a stream

    • Only users with privilege

    How exactly does this work? If user A deploys an app, user B configures a program in that app to read the stream, and user C starts the program. Which of these three users needs the privileges?

    1. The user deploying the App which will create the dataset/stream will get ADMIN permission on the dataset/stream. All other user who wants to use these dataset/stream to read/write will need respective read/write permission. 

      In the above case: User A who is deploying the app will get ADMIN permission and user B will just need WRITE permission on the Namespace to configure the program but does not need any permission on the stream as it is not actively reading/writing to it. User C will need read permission to read from the Stream.  

  9. the Principal that performs the actions.

    similar here. If a program performs the action, what is the corresponding principal? 

    1. The intent here is to capture the user who is running the program. However we are not sure if that information is available in CDAP today. We will need to find this out. 

      Poorna ChandraAlvin Wang: Any thoughts ? 

  10. Integration with Apache Sentry will be achieved by implementations of these interfaces that delegate to Apache Sentry.

    In Scenario #1 you mention that Sentry should be used to implement management of authorization for CDAP. Here it seems that CDAP provided methods that delegate to Sentry, that is, D-Rock interacts with CDAP for managing the ACLs? And CDAP delegates to Sentry? Except for WRITE on the CDAP instance?

    1. Yes but small correction its not WRITE on the CDAP Instance but ADMIN. 

  11.   public enum AuthorizableType {
        Instance,
        Namespace,
        Artifact,
        Application,
        Program,
        Dataset,
        Stream,
        Stream_View
      };

    Stream_View is special enough to get its own type? Isn't a view simply backed by the ACLs of its underlying stream? Why would we give separate permissions to the view? 

    1. We added Stream_View as a separate entity because it's possible that a user creates different views from different columns of the stream and wants to give different users access to these views.

  12. cdap:///namespace=ns1/artifact=art1

    Why the x=y notation? Could we make it more RESTy, like cdap://namespaces/ns1/artifacts/art1 ?

    1. Its an internal Sentry representation defined in https://issues.apache.org/jira/secure/attachment/12663314/sentryAuthorizationModelDesign.pdf. Its probably too much of an implementation/internal detail to be on this doc. I had just added it to show the hierarchy, but it probably doesn't even belong here.

  13. public static final String ADMIN

    Does this include the right to manage the ACLs for the same entity?

    1. Yes, ADMIN can manage ACLs for an entity as well as do configuration and management operations. 

  14. Overall, this document contains decisions but not the reasons for these decisions. It would be good to include that. For example, why is the list of authorizable types the way it is? etc.

  15. Andreas Neumann: Thanks for the detailed review. You brought up some critical points which we haven't documented. We have addressed the comments and can go over them in design discussion.