- User Stories Documented
- User Stories Reviewed
- Design Reviewed
- APIs reviewed
- Release priorities assigned
- Test cases reviewed
- Blog post
CDAP allows you to create and manage many entities at the system level. There are system profiles, system preferences, system artifacts, and system apps. These are all managed in different ways. Profiles and preferences must be manually created and managed through the REST endpoints once CDAP is up and running. Artifacts are automatically loaded from a directory every time CDAP starts up. System apps are sometimes created manually, and sometimes automatically created by the backend on startup. Any additional instance-specific bootstrapping must be handled by the CDAP administrator. For example, if a CDAP admin wants to make sure the default amount of memory used by program containers is 4 GB, they need to manually set a system preference after CDAP has been installed and is running. In environments where multiple CDAP instances must be installed and bootstrapped, it is up to the administrator to perform the manual steps required to set everything up in a consistent fashion.
Provide a unified mechanism for bootstrapping a CDAP instance.
- An organization gives every developer their own CDAP sandbox for development. The company hosts much of their development infrastructure and data in the cloud. The administrator wants to ensure that every developer's CDAP instance is pre-configured with a cloud runtime profile as the system default instead of the native profile. They also want to make sure the dataprep app is deployed with a specific config such that the cloud connections are pre-defined.
- An organization runs many clusters for different use cases, each with its own CDAP instance. A new cluster is created every quarter. System administrators want to ensure that every new CDAP instance comes installed with a set of required system applications that run worker and service programs that ensure certain compliance requirements are enforced on each cluster.
- As a System Administrator, I want new CDAP instances to bootstrap themselves with pre-configured system profiles
- As a System Administrator, I want new CDAP instances to bootstrap themselves with pre-configured system preferences
- As a System Administrator, I want new CDAP instances to bootstrap themselves with pre-configured system applications
- As a System Administrator, I want new CDAP instances to automatically start system programs
- As a System Administrator, I want certain bootstrap actions to only be run once on a new CDAP instance
- As a System Administrator, I want certain bootstrap actions to be run every time CDAP is restarted
- As a System Administrator, if CDAP dies or is shut down before bootstrap can finish, I want bootstrap to be retried the next time CDAP starts
- As a System Administrator, I want to be able to manually re-run bootstrap operations at a later time
- As a System Administrator, I don't want a manual bootstrap operation to modify any existing CDAP entities
- As a System Administrator, I want to see log warnings when a bootstrap step fails
- As a System Administrator, I want to see an informational log message when a bootstrap step is skipped because an entity already exists
- As a System Administrator, I don't want a failed bootstrap step to block subsequent steps from running
- As a CDAP user, I want to be able to use CDAP normally even if the bootstrap fails
- As a CDAP user, I want certain system programs (like dataprep) to automatically start whenever the CDAP sandbox starts
We will add a new 'system.bootstrap.file' configuration setting. This points to a file that will specify what entities need to be bootstrapped.
Some bootstrap steps will only automatically happen once per CDAP instance. We will need to keep track of whether or not the instance has been bootstrapped by setting some value in a system table. The CDAP master leader will perform the bootstrap operations.
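The run-once behavior can be illustrated with a minimal sketch. It assumes each step carries a run condition and that the value kept in the system table is exposed as a simple boolean; the names `StartupStepSelector`, `RunCondition`, and `alreadyBootstrapped` are hypothetical and not part of CDAP.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: choosing which bootstrap steps run on a given startup. */
final class StartupStepSelector {
  /** Assumed run conditions; the names are illustrative only. */
  enum RunCondition { ONCE, ALWAYS }

  static final class Step {
    final String label;
    final RunCondition runCondition;

    Step(String label, RunCondition runCondition) {
      this.label = label;
      this.runCondition = runCondition;
    }
  }

  /**
   * Returns the labels of the steps to execute on this startup, in order.
   * 'alreadyBootstrapped' stands in for the value stored in a system table.
   */
  static List<String> select(List<Step> steps, boolean alreadyBootstrapped) {
    List<String> selected = new ArrayList<>();
    for (Step step : steps) {
      // ALWAYS steps run on every start; ONCE steps run only until bootstrap has completed.
      if (step.runCondition == RunCondition.ALWAYS || !alreadyBootstrapped) {
        selected.add(step.label);
      }
    }
    return selected;
  }
}
```

Because the flag would only be set after bootstrap completes, an instance that dies mid-bootstrap would still see every step selected on the next start.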
Bootstrap can be re-run by calling a new REST endpoint 'POST /v3/bootstrap'. When bootstrap is re-run, entities will only be created if they do not already exist. For example, if the bootstrap file contains a step to create a profile named 'ABC' and there is an existing profile named 'ABC', the bootstrap process will ignore that step. Existing entities will not be modified either.
If CDAP dies or is shut down in the middle of a bootstrap, the bootstrap will be retried the next time CDAP starts up. Conflicts will be handled the same way as in manual re-runs: if a bootstrap step targets an entity that already exists, the step will be skipped and the existing entity will not be modified.
If a bootstrap step fails with a non-retryable exception, it will be skipped and the next step will be executed.
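The skip and continue-on-failure rules above can be sketched as a simple loop. This is an illustration under stated assumptions, not the actual implementation: `BootstrapRunSketch`, `CreateAction`, and `Status` are hypothetical names, existing entities are modeled as a set of names, every exception is treated as non-retryable, and logging is reduced to `System.out`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of running bootstrap steps with skip and continue-on-failure rules. */
final class BootstrapRunSketch {
  enum Status { SUCCEEDED, SKIPPED, FAILED }

  /** Assumed callback that creates one named entity. */
  interface CreateAction {
    void create(String entityName) throws Exception;
  }

  static final class Step {
    final String label;
    final String entityName;
    final CreateAction action;

    Step(String label, String entityName, CreateAction action) {
      this.label = label;
      this.entityName = entityName;
      this.action = action;
    }
  }

  /** Runs every step in order; 'existing' models the entities already present in CDAP. */
  static List<Status> run(List<Step> steps, Set<String> existing) {
    List<Status> results = new ArrayList<>();
    for (Step step : steps) {
      if (existing.contains(step.entityName)) {
        // Existing entities are never modified; the step is skipped.
        System.out.println("INFO: skipping step '" + step.label + "': entity already exists");
        results.add(Status.SKIPPED);
        continue;
      }
      try {
        step.action.create(step.entityName);
        existing.add(step.entityName);
        results.add(Status.SUCCEEDED);
      } catch (Exception e) {
        // This sketch treats every exception as non-retryable: warn and move on.
        System.out.println("WARN: step '" + step.label + "' failed: " + e.getMessage());
        results.add(Status.FAILED);
      }
    }
    return results;
  }
}
```

Running three steps where the first targets an existing entity and the second throws would yield `SKIPPED`, `FAILED`, `SUCCEEDED`, so a failure never blocks later steps.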
There are many ways to represent the bootstrap file. JSON is a consistent choice, as many of our other config files are in JSON.
The bootstrap file is a list of steps, where each step has the same format. This makes the ordering unambiguous and gives a mostly standard format for each step. Each step has a short label, which is used when logging a warning that the step failed. Each step also has a type that determines what kind of action will be performed. Finally, each step defines whether it should be run every time CDAP starts up or just once for the entire CDAP instance.
The example above loads system artifacts, creates a profile named 'ABC', sets 'ABC' as the default profile for all of CDAP, creates the dataprep application configured with a default cloud storage connection, and finally starts the dataprep service if it is not already running.
The steps to load system artifacts and start the dataprep service are performed every time CDAP starts up. Creating the profile, setting it as the default profile, and creating the dataprep app happen only once for a CDAP instance.
The bootstrap file can be represented in Java as.
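As a rough illustration of such a representation, the sketch below models a bootstrap file as an ordered list of steps, each with a label, a type, and a run condition. All class, enum, and constant names are assumptions made for this example, and the type-specific arguments of each step (profile properties, app config, and so on) are left out.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical in-memory form of a bootstrap file. All names here are assumptions. */
final class BootstrapConfigSketch {
  /** Assumed action types, one per action in the described example. */
  enum Type {
    LOAD_SYSTEM_ARTIFACTS, CREATE_SYSTEM_PROFILE, SET_DEFAULT_PROFILE,
    CREATE_APPLICATION, START_PROGRAM
  }

  /** Whether a step runs on every startup or once per CDAP instance. */
  enum RunCondition { ONCE, ALWAYS }

  /** One step: a short label for log messages, a type, and a run condition. */
  static final class Step {
    final String label;
    final Type type;
    final RunCondition runCondition;

    Step(String label, Type type, RunCondition runCondition) {
      this.label = label;
      this.type = type;
      this.runCondition = runCondition;
    }
  }

  /** Steps in file order, which is also the execution order. */
  final List<Step> steps;

  BootstrapConfigSketch(List<Step> steps) {
    this.steps = Collections.unmodifiableList(steps);
  }

  /** Builds the five steps of the described example; type-specific arguments are omitted. */
  static BootstrapConfigSketch example() {
    return new BootstrapConfigSketch(Arrays.asList(
        new Step("Load system artifacts", Type.LOAD_SYSTEM_ARTIFACTS, RunCondition.ALWAYS),
        new Step("Create profile ABC", Type.CREATE_SYSTEM_PROFILE, RunCondition.ONCE),
        new Step("Set ABC as default profile", Type.SET_DEFAULT_PROFILE, RunCondition.ONCE),
        new Step("Create dataprep app", Type.CREATE_APPLICATION, RunCondition.ONCE),
        new Step("Start dataprep service", Type.START_PROGRAM, RunCondition.ALWAYS)));
  }
}
```

Keeping the steps in a single ordered list mirrors the file format, so execution order follows directly from file order.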
New Programmatic APIs
Deprecated Programmatic APIs
New REST APIs
|/v3/bootstrap||POST||Re-run the bootstrap steps|
200 - Steps were run, even if all of them failed
500 - An internal error prevented the steps from running
400 - Bootstrap file is malformed
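The status codes listed above could be derived from the outcome of a re-run roughly as follows. `BootstrapStatusSketch` and `Outcome` are hypothetical names used only for this sketch; the point is that individual step failures do not change the response code.

```java
/** Hypothetical mapping from a bootstrap re-run outcome to the HTTP status codes listed above. */
final class BootstrapStatusSketch {
  /** Assumed outcomes of handling 'POST /v3/bootstrap'. */
  enum Outcome { STEPS_RAN, MALFORMED_FILE, INTERNAL_ERROR }

  static int statusCode(Outcome outcome) {
    switch (outcome) {
      case STEPS_RAN:
        return 200; // steps were attempted, regardless of how many failed
      case MALFORMED_FILE:
        return 400; // the bootstrap file could not be parsed
      default:
        return 500; // an internal error prevented the steps from running
    }
  }
}
```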
Deprecated REST API
CLI Impact or Changes
UI Impact or Changes
What's the impact on Authorization and how does the design take care of this aspect
Impact on Infrastructure Outages
|Test ID||Test Description||Expected Results|
|1||Move existing logic that loads system artifacts and deploys apps to bootstrap framework||System artifacts loaded on restart, wrangler automatically deployed and started for sandbox on every restart.|
|2||Create bootstrap file that creates a profile and sets it as default profile||After starting CDAP, the profile should appear in the system profile list as the default profile|
|3||Create bootstrap file with 3 steps where the second step is guaranteed to fail||Steps 1 and 3 should complete and there should be a warning about step 2|
|4||Manually create a profile that has the same name as a bootstrap profile, but with different properties||Bootstrap step should be skipped|
|5||Manually set preferences that have the same keys as those in the bootstrap step, but different values||Bootstrap step should not overwrite existing preferences|
Potentially add more supported actions