
 

 

Goals

  1. Explore CDAP Entities in Hue

  2. Use Hue's admin interface to manage ACL for CDAP stored in Apache Sentry

Checklist

  • User stories documented (Shenggu)
  • User stories reviewed (Nitin)
  • Design documented (Shenggu)
  • Design reviewed (Andreas)
  • Feature merged (Shenggu)
  • Integration tests (Shenggu)
  • Documentation for feature (Shenggu)
  • Blog post (Shenggu)

User Stories

  • As a Hue admin, I should be able to easily configure CDAP as a plugin app in the Hue system
  • As a CDAP user or a CDAP admin, I should be able to explore the entities of CDAP (Namespace->Application->Program(->subprogram), Namespace->Stream/Dataset/Artifacts) in Cloudera Hue's UI.
  • As a CDAP user, I should be able to perform all the ACL management operations provided by Apache Sentry through Cloudera Hue's admin UI.
    • CDAP superusers can manage all the rules
    • A user or group that has ADMIN on an entity can grant ACLs on that entity to other users/groups

Scenarios

#Scenario 1

A user (typically a CDH user) uses Hue to explore and manage ACLs and to perform other operations for all the different services on their cluster. They would prefer to use Hue and its consistent UI to manage ACLs for CDAP from a central place rather than separately in the CDAP UI.

Design

The integration code will be part of Cloudera Hue and will communicate with CDAP and Apache Sentry through REST/Thrift to manage the ACLs. The Hue app itself does not store any state during this process.

 

Brief Introduction of Cloudera Hue

 

(from Hue's documentation: http://www.cloudera.com/documentation/archive/cdh/4-x/4-2-0/Hue-2-User-Guide/hue2.html)

 

| Hue is a set of web applications that enable you to interact with a CDH cluster. Hue applications let you browse HDFS and work with Hive and Cloudera Impala queries, MapReduce jobs, and Oozie workflows.

The Hue server is written in Python using the Django framework, and different systems, such as HBase or Impala, are configured as separate Django apps. Users can control these components on the cluster through the web interface. It is also possible to add custom apps to the Hue server to support additional systems.

Logic view of the system

There are two possible designs for the system. 

Design 1:

 

Design 2:

 

As shown in both diagrams above, CDAP and Sentry support is configured as a plugin app installed in the Hue system. Hue's front end is implemented in Django, which provides good isolation and extensibility for multiple apps running together in one web service. A separate panel section will be created in Hue's default UI for the related operations. This app will communicate with CDAP through CDAP's RESTful API. All live entities will be displayed in Hue's UI.
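As a rough illustration of how the app could assemble the entity hierarchy from CDAP's REST API, here is a minimal Python 3 sketch. The endpoint paths (/v3/namespaces, .../apps, .../streams, .../data/datasets) and the "name" field in each response are assumptions based on CDAP's v3 REST API, and the function names are hypothetical. Authentication is not handled.

```python
import json
from urllib.request import urlopen


def fetch_json(url):
    """Fetch a URL and decode its JSON body (no authentication handled)."""
    with urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))


def build_entity_tree(base_url, fetch=fetch_json):
    """Return {namespace: {"apps": [...], "streams": [...], "datasets": [...]}}.

    Assumption: every listing endpoint returns a JSON array of objects
    that each carry a "name" field.
    """
    tree = {}
    for ns in fetch(base_url + "/v3/namespaces"):
        name = ns["name"]
        prefix = "%s/v3/namespaces/%s" % (base_url, name)
        tree[name] = {
            "apps": [a["name"] for a in fetch(prefix + "/apps")],
            "streams": [s["name"] for s in fetch(prefix + "/streams")],
            "datasets": [d["name"] for d in fetch(prefix + "/data/datasets")],
        }
    return tree
```

Passing the fetch function as a parameter keeps the tree-building logic independent of the HTTP layer, so it can be exercised with canned responses.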

 

Communication with Apache Sentry goes through Sentry's Thrift service. When an admin grants or revokes privileges through the Hue UI, the change is propagated to Sentry and takes effect on subsequent requests coming from CDAP. In Design 1, Hue talks to Sentry directly, while Design 2 takes advantage of the Sentry client APIs built into CDAP. Although Design 2 requires less code, we will implement Design 1 because it is consistent with the behavior of the other Hue plugins (Hive/HDFS) and covers more cases (for instance, revoking privileges after a security breach while CDAP is shut down). Under Design 1, Hue itself talks to Sentry and needs a separate keytab file to authenticate with Kerberos.

UI Mockup

One possible UI layout is shown below. All CDAP entities are listed hierarchically on the left. When the user clicks an entity, they can view its detailed properties and manage the ACL rules associated with it. The actual UI may vary in colors and relative layout of elements but will stick to this concept.

Here are some other possible UI designs. The idea behind them is the same: we present a hierarchical entity structure to the user, with either a separate panel or a pop-up window for managing the ACLs.

We can present the ACL-adding form in a pop-up window to keep the user focused.

 

In this case, all the ACL management buttons are presented in the pop-up window. Entity descriptions can be displayed to the right of the entity name or shown when the mouse hovers over it.

 

Among all the UI layouts, we prefer to implement the first one, since displaying all the UI components on the same page involves less window open/close logic and is less confusing to end users. In addition, the description of each entity is generally short (fewer than 5 entries in the first layer), so it is possible to put the ACL-adding panel right under the description.

Configuration

To configure the CDAP app in Hue, copy the CDAP app source code into $HUE_ROOT and run the command below:

$HUE_ROOT/tools/app_reg/app_reg.py --install cdap --relative-paths

The setup script will automatically add all required fields to Hue's configuration file.

 

Note: As the project moves on, some custom settings (e.g. the root host address of CDAP's REST API) may move into Hue's configuration file (located at $HUE_ROOT/desktop/config.dist/hue.ini).

Currently no specific configuration is required on the CDAP side.

Routes

This section explains the routes defined in Hue's CDAP app. In Django (which Hue is written in), routes are defined in urls.py, which uses regular expressions to describe the URL formats. Mako is used as the HTML template engine.

| URL | Response |
|---|---|
| GET /cdap/ | index.mako (main page) |
| GET /cdap/details/path/to/entity/entity_id | json of entity properties |
| GET /cdap/acl/path/to/entity/entity_id | json of entity ACLs |
| POST /cdap/acl/add/entity_id/ --data {groupid, operations} | 200 ok / 500 error |
| POST /cdap/acl/revoke/entity_id/ --data {groupid, operations} | 200 ok / 500 error |
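The routes above could be expressed as regular expressions along the following lines. This is only a sketch using Python's standard re module so that it runs without Django. In a real urls.py each pattern would be paired with a Django view rather than a string name. The view names, the resolve helper, and the exact regexes are hypothetical.

```python
import re

# Hypothetical patterns mirroring the route table. Like Django URL regexes,
# they match the path without its leading slash.
ROUTES = [
    ("GET", re.compile(r"^cdap/$"), "index"),
    ("GET", re.compile(r"^cdap/details/(?P<path>.+)$"), "details"),
    ("GET", re.compile(r"^cdap/acl/(?P<path>.+)$"), "acl_list"),
    ("POST", re.compile(r"^cdap/acl/add/(?P<entity_id>.+?)/?$"), "acl_add"),
    ("POST", re.compile(r"^cdap/acl/revoke/(?P<entity_id>.+?)/?$"), "acl_revoke"),
]


def resolve(method, url_path):
    """Return (view_name, captured_groups) for the first matching route, else None."""
    for route_method, pattern, name in ROUTES:
        if route_method != method:
            continue
        match = pattern.match(url_path)
        if match:
            return name, match.groupdict()
    return None
```

For example, resolve("GET", "cdap/details/default/apps/MyApp") selects the details view and captures the entity path "default/apps/MyApp".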
  
  
 

 

The operations here include {READ | WRITE | EXECUTE | ADMIN | ALL}. Multiple operations can be granted/revoked at once.
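A grant or revoke handler would need to validate the requested operations before passing them on. The sketch below shows one way to do that. The function name is hypothetical, and collapsing a request that contains ALL down to just ALL is an assumption, not something the design specifies.

```python
VALID_OPERATIONS = ("READ", "WRITE", "EXECUTE", "ADMIN", "ALL")


def normalize_operations(operations):
    """Validate a list of operation names and return them upper-cased and sorted.

    Assumption: a request that includes ALL is collapsed to ["ALL"].
    Raises ValueError for an empty request or an unknown operation.
    """
    requested = {op.strip().upper() for op in operations}
    if not requested:
        raise ValueError("at least one operation is required")
    unknown = requested - set(VALID_OPERATIONS)
    if unknown:
        raise ValueError("unknown operations: %s" % ", ".join(sorted(unknown)))
    if "ALL" in requested:
        return ["ALL"]
    return sorted(requested)
```

Several operations can then be handled in one request, for example normalize_operations(["read", "WRITE"]).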

 

Out of Scope

In the above design, the system only supports listing all entities in CDAP and performing ACL management on them; there is no full support for managing the entities themselves. The unsupported cases are listed below and might be supported in the future.

  • Deploy/Start/Destroy a program
  • Creating/Deleting/Renaming an entity
  • List and Explore those entities that are not related to ACL management such as services, workflows
  • Change the properties of entities

 


 

 

45 Comments

  1. Explore CDAP entities and integrate them with Cloudera Hue

    Since we will just be exploring CDAP  Entities in Hue. We should say "Explore CDAP Entities in Hue". 

  2. Use Hue's admin interface to manage ACL for Apache Sentry

    for CDAP stored in Apache Sentry

  3. As a CDAP admin, I should be able to explore all the entities of CDAP (ex: Namespaces, Streams, Programs etc.) in Cloudera Hue's admin UI.

    CDAP Users should be able to explore too. 

  4.  

    • As a CDAP admin, I should be able to perform all the ACL management operations provided by Apache Sentry also in Cloudera Hue's admin UI.

    In CDAP, ACL is given by:

    • Superusers of CDAP
    • A user/groups who have ADMIN on entity can give ACL on that entity to other users/groups.
    • As a result of new entity being created.
  5. Also add some details as what hue is and how it works.

    Simple architecture diagram and one paragraph of definition just to give context to how we will do the integration. 

    1. Unknown User (shenggu@cask.co)

      Yes, I changed these.

  6. The system utilize the Cloudera Hue's interface

    What is system here ? 

     

    1. Unknown User (shenggu@cask.co) This comment is still not addressed.

  7. It will be nice to specifically mention in design that the integration is only to support listing/viewing CDAP entities in CDAP so that users can manage ACLs on these entities. Other feature, say running a program, exploring dataset/stream are out of scope currently. You can also list them in out of scope section.

  8. In the "Logic View of System": 

    You have a connection from Hue App to Sentry for ACL management. I don't think the ACL management will work like this. According to my understanding, Hue app will be just a UI on top of CDAP REST APIs. For ACL management actions too, Hue will hit the appropriate CDAP REST endpoints (we already have these implemented and backed by Sentry) and the SentryAuthorizer running in CDAP will be responsible for talking to Sentry to store the policies. Hue will not talk directly to Sentry. You should confirm this though.

  9. One possible UI layout is shown below.

    Can you also list different possible UI layout? It will be nice to have consensus on the UI layout which we pick. If making an UI mockup is time intensive then just draw it on a paper and add it here. Also, provide some details on why you are picking this layout and not others.

  10. When admin grants/ evokes certain privileges

    revokes

  11. Also, for the following user story: 

    As a Hue admin, I should be able to easily configure CDAP as a plugin app in the Hue system

    One task will be being able to add CDAP in Hue through Cloudera Manager. Keep this point in design. We need to figure this out  too. 

  12.  We currently preferred the 

    We prefer the 

  13. Here is some possible UI designs. 

    Here are some other possible ...

  14. List all the routes before sending for review. 

  15. In out of scope section list down all the different  known features which we are not doing like exploring, creating entities etc. 

     

     

  16. For UI layout say which one you are choosing and why

  17. The Design Doc should clearly state the purpose of integration (maybe in Scenario)

    A user (typically CDH users) who are using Hue for exploring and managing ACL for all the different services on their cluster will prefer to use Hue and the consistent UI to manage ACL for CDAP to from a central place rather than directly in CDAP. 

  18. One other story that we might want to add is the following:

    • Administrator is able to apply bulk operations on users. 
    1. Unknown User (shenggu@cask.co): Looks like this has not been addressed. 

  19. Hue has a lot more than ACL management UIs. I believe it has ways to browse data (HDFS and HBase), workflows etc. Are we planning to allow browsing CDAP datasets and seeing/configuring CDAP workflows and schedules? Or are we limiting it purely to ACL management?

    1. Now it is only limited to ACL management. Other possible use cases for CDAP are now listed in the out-of-scope section.

  20. what types of CDAP entities are we going to expose to Hue?

    1. Edited in the user story.

  21. Please also provide UI Design mockup to support role management operation. Since Sentry is RBAC we will need to support creating role, deleting role, listing roles, adding group to  role and removing group from a role. CDAP exposes endpoint for these all you need to do is to hit those endpoints from Hue. 

  22. Are entities in Hue going to be in the same hierarchy as in CDAP? (Namespace->Application->Program(->subprogram), Namespace->Stream/Dataset)?

    1. Yes that's the case. Edited in user stories.

  23. How is user/group/role management going to work? Also through Hue? 

    1. Yes. Will edit that in the user cases.

  24. How does Hue itself authenticate with CDAP? In order to fetch all entities, I guess it needs some kind of super user privileges? Or does it authenticate as the user that is signed into Hue?

    1. User has to provide superuser credentials somewhere in Hue's config. (Or pop up a window for authentication)

  25. Suppose I have a security breach and I shut down CDAP for that reason. Now, with CDAP down, I need to revoke privileges for users that I believe have been compromised. Is that possible with both designs 1 and 2?

    1. Andreas Neumann Thanks for bringing up an interesting point. We were inclining towards Design 2 because it goes through CDAP and the logic to talk to Sentry will be just in CDAP. But this design cannot serve the above use case. It can be addressed in Design 1 since Hue will directly talk to Sentry for CDAP. Do you think this is a critical requirement? If yes, then we will need to change the decision and go with Design 1. 

      1. What do other Hue plugins do? We should follow the established pattern.

        1. For built-in plugins, only hive and hdfs have sentry support. Both of them talk to sentry directly.

          1. That's design 1? Then I think we should follow.

          2. You should update the document to favor design 1 - if that is the consensus here. You also need to explain what are the changes required in CDAP and Hue that are implied by design 1. 

            1. Yes, I added it in the design.

  26. Hue appears to have a way to view and manipulate meta data for entities. Will we allow that for CDAP entities through our Hue extension?

    1. We display the meta data but not support modifications now.

  27. Regarding the routes. You have an API to grant a particular operation to a group. And an API revoke from a group, but that does not take the operation? is there a reason why? Also, you need to explain what operations are supported. Is there a way to specify multiple operations at once. Is there a way to grant all operations?

    1. It is because of some early design of UI. Edited in the document. Both grant & revoke should take the operation.

  28. In out of scope section: 

    • List and Explore those entities that are not related to ACL management such as services, workflows

     We are supporting listing and exploring all entities but we don't support explore queries.

    • As a CDAP user or a CDAPadmin, I should be able to explore the entities of CDAP (Namespace->Application->Program(->subprogram), Namespace->Stream/Dataset/Aritifacts) in Cloudera Hue's UI.

    There is no such entity as subprogram in CDAP. Workflow is a program which has other program inside it. Look at how we list programs in cdap and do the same.