Page tree
Skip to end of metadata
Go to start of metadata

You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 16 Next »

Checklist

  • User Stories Documented
  • User Stories Reviewed
  • Design Reviewed
  • APIs reviewed
  • Release priorities assigned
  • Test cases reviewed
  • Blog post

Introduction 

The current scheduler has limited trigger types and constraints. We wish to add more configuration by chaining pipelines and programs based on the states of programs, as well as impose more scheduler constraints.

Goals

  • Triggers: Other programs or pipelines should be started based on the program status of another program or pipeline.
  • Constraints: Additional scheduling constraints such as a specific time window, when a program has generated a specific amount of data, and number of concurrent runs, should be added.

User Stories 

Note: The "program" denoted below in the user stories could be a pipeline or any custom program. See the Assumptions section below for more details.

  1. As a user, I would like to be able to start program B if and only if program A successfully completed, and start program C if and only if program A failed.
  2. As a user, I would like to be able to start program B if and only if program A has failed, with a time window constraint for program B.
  3. As a user, I would like to be able to start program B if and only if program A successfully completed and dataset C has 100 MB of new data
  4. As a user, I would like to be able to start program B if and only if program A successfully completed and dataset C has new partitions.
  5. **As a user, I would like to start program B if program A resulted in a lot of errors. (Error metrics)

Design

Assumptions

  • The "program" specified in the user stories can be a pipeline in the same namespace, or any custom program.

  • We are assuming that that the program state based scheduling can take in all possible program states (Initialized, Running, Completed, Failed, Killed). The UI can later be configured to allow only the appropriate and meaningful program states.
  • All combinations of constraints and triggers that are meaningful are possible, where the UI can be configured to restrict a combination as needed.

Approach

  • Program State Based Scheduling will rely on the existing TMS system and follow a similar architecture to the existing Scheduler.
  • UI can populate the program type (One of flowsmapreduceservicessparkworkers, or workflows)
  • Constraints added to a ProgramStateTrigger will be applied to the current program, not the previous program (as seen in User Story #2).

Approach #1:

  • BasicCustomActionContext, BasicWorkflowContext, BasicMapReduceContext, BasicFlowletContext all extend from AbstractContext.
  • We let BasicSparkContext also extend from AbstractContext
  • Implement setState method in AbstractContext, where it sends a TMS ProgramStateNotification
  • All contexts then can simply use this method.

Potential Drawbacks:

  • Requires TMS to be running before program executes
  • Program execution may take longer than before because of these TMS message updates.
  • What happens if YARN container dies?
    • Will need to send the program status to failed/killed based on program timeout
    • This may need to be done from some other class that monitors all programs

TMS Message Design:

For a Program State message, the notification will contain:

  • namespace:
  • application:
  • applicationVersion
  • programType
  • program

API changes

New Programmatic APIs

New Java APIs introduced (both user facing and internal)

ProgramScheduleStatus
public enum ProgramScheduleStatus {
	RUNNING,
	COMPLETED,
	FAILED,
	KILLED
};
Triggers
public class ProgramStatusTrigger {
  private final String namespace;
  private final String application;
  private final String applicationVersion;
  private final ProgramType type;
  private final String program;
  private final ProgramScheduleStatus status;
 
  public ProgramStatusTrigger(String namespace, String application, String applicationVersion, 
		ProgramType type, String program, ProgramScheduleStatus status) {
	this.namespace = namespace;
	this.application = application;
	this.applicationVersion = applicationVersion;
	this.type = type;
	this.program = program;
	this.status = status;
  }
}
Trigger Definition
ScheduleCreationSpec triggerByProgramStatus(String namespace, String application, String applicationVersion, 
	ProgramType type, String program, ProgramScheduleStatus status);

 

Deprecated Programmatic APIs

New REST APIs

New or Update Existing API MethodPathMethodDescriptionRequest BodyResponse CodeResponse
Update
/v3/namespaces/<namespace-id>/apps/<app-id>/schedules/<schedule-id>

PUT

To add a schedule for a program to an application

Request
{
  "name": "<name of the schedule>",
  "description": "<schedule description>",
  "program": {
    "programName": "<name of the program>",
    "programType": "WORKFLOW"
  },
  "properties": {
    "<key>": "<value>",
    ...
  },
  "constraints": [
    {
      "type": "<constraint type>",
      "waitUntilMet": <boolean>,
      ...
    },
    ...
  ],
  "trigger": {
	"type": "PROGRAMSTATE",
	"program": {
		"namespace": "<namespace of the program>",
		"application": "<application of the program>",
		"applicationVersion": "<application version of the program>",
		"type": "<type of the program>",
		"programName": "<name of the program>",
	},
	"status": "<status of the program; one of above>"
  }

200 - On success

404 - When application is not available

409 - Schedule with the same name already exists

500 - Any internal errors

"OK"

New? Not sure if it exists now. To be used in the dropdown for selecting different programs./v3/namespaces/<namespace-id>/apps/<app-id>/programsGETGet a list of all programs configured in a certain namespace. 200 - On success
Response
[
 {
   "program": {
		"namespace": "<namespace of the program>",
		"application": "<application of the program>",
		"applicationVersion": "<application version of the program>",
		"type": "<type of the program>",
		"programName": "<name of the program>"
   }
 },
 ...
]

 

 

 

Deprecated REST API

PathMethodDescription
/v3/apps/<app-id>GETReturns the application spec for a given application

CLI Impact or Changes

  • Add CLI handlers for adding new ProgramSchedule
  • Impact #2
  • Impact #3

UI Impact or Changes

  • Add ProgramSchedule options to the "Schedule" panel, along with dropdowns for pipeline selection and program status selection
  • Impact #2
  • Impact #3

Mockup:

Security Impact 

What's the impact on Authorization and how does the design take care of this aspect

Impact on Infrastructure Outages

System behavior (if applicable - document impact on downstream [ YARN, HBase etc ] component failures) and how does the design take care of these aspect

Test Scenarios

Test IDTest DescriptionExpected Results
1Configure a schedule to run every 5 minutes and always complete successfully. Configure a second schedule to run if the first schedule has succeeded.Both schedules run and finish successfully.
2Configure a schedule to fail. Configure a second schedule to run only if the first schedule has succeeded.The first schedule should run. The second schedule should not run.
3Configure a schedule to run for 10 seconds before it completes. Configure a second schedule to run only if the first schedule has succeeded. 500 milliseconds before the first schedule completes, delete the second schedule.The first schedule should run and finish successfully.
4Add a Spark program (or any other type of program besides another workflow). Configure a schedule to run if the Spark program has succeeded. Run the Spark programThe Spark program runs, then the schedule starts.
   
   
   
   
   

Open Questions

  1. User-facing class to encapsulate a program? Rather than passing in namespace, application, applicationVersion, and programType, and program all to encapsulate a program?
  2. More granularity with program states? Programs have different states based on the program type. Not directly related as ProgramSchedulerStatus encompasses general states of all programs which is the primary use case, but accurate and detailed program states may still be something to consider.

Work Items

  • Notifications

    • Name topic in TMS for ProgramSchedulerStates

    • Send program state notifications through TMS

      • At central location for all programs for all ProgramSchedulerStates
  • API Specification
    • Parse new ProgramStateTrigger from request
  • Scheduler
    • Subscribe to ProgramStateNotifications from TMS and create schedule
    • Add to ProgramScheduleStore and ensure atomicity of events
    • Ensure event keys are still unique for Program State Triggers
  • Tests
    • Need long running test to test series of pipelines starting from statuses of each other
  • UI
    • Fetch all ProgramScheduleTypes
    • Fetch all programs for user to choose from in a pipeline
    • Design multiple tabs to pick between different triggers
  • Documentation
    • Add examples for how to build the trigger

Releases

Release X.Y.Z

Release X.Y.Z

Related Work

  • Work #1
  • Work #2
  • Work #3

Future work

  • No labels