Compilation pipeline#

A FAST-HEP workflow typically begins as a human-authored YAML file:

author.yaml

fasthep-flow then compiles this workflow through several stages:

flowchart TD
    subgraph Compile["Compilation and planning"]
        Author["author.yaml"]
        Profiles["profiles and registries"]
        Normalised["normalised workflow"]
        Dependency["dependency inference"]
        Plan["execution plan"]

        Author --> Normalised
        Profiles --> Normalised
        Normalised --> Dependency
        Dependency --> Plan
    end

    subgraph Execute["Runtime execution"]
        Runtime["runtime execution"]
        Outputs["artifacts and outputs"]

        Runtime --> Outputs
    end

    Plan --> Runtime

This separation between workflow intent and execution strategy is a core design principle of fasthep-flow.

Workflows can therefore be:

  • validated before execution

  • inspected and serialised

  • transformed into backend-specific plans

  • optimised independently of user workflow logic

  • executed on different runtime backends without changing analysis definitions


Compilation stages#

The compilation pipeline progressively transforms a human-authored workflow into a runtime-ready execution plan.

Author workflow#

The workflow begins as a user-authored YAML description:

analysis:
  stages:
    - id: BasicVars
      op: hep.define

At this stage the workflow focuses on readability and intent rather than runtime structure.


Profiles and registries#

Profiles are resolved and registries are loaded.

This stage makes operations, sources, sinks, hooks, and rendering implementations available to the workflow compiler.

For example:

use:
  profiles:
    - registry
    - fasthep_carpenter:registry

loads operations, whether from the FAST-HEP ecosystem or user-defined, into the active workflow.
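
As a rough illustration of what a registry provides, the sketch below maps operation names such as hep.define to Python callables. The OperationRegistry class, its register and resolve methods, and the define function are hypothetical names chosen for this example; they are not the fasthep-flow API.

```python
from typing import Callable, Dict


class OperationRegistry:
    """Hypothetical registry mapping operation names to callables."""

    def __init__(self) -> None:
        self._ops: Dict[str, Callable] = {}

    def register(self, name: str) -> Callable[[Callable], Callable]:
        # Decorator that stores the function under the given operation name.
        def decorator(func: Callable) -> Callable:
            if name in self._ops:
                raise ValueError(f"operation already registered: {name}")
            self._ops[name] = func
            return func

        return decorator

    def resolve(self, name: str) -> Callable:
        # Fail early when a stage names an operation nobody registered.
        try:
            return self._ops[name]
        except KeyError:
            raise KeyError(f"unknown operation: {name}") from None


registry = OperationRegistry()


@registry.register("hep.define")
def define(event: dict, name: str, value: float) -> dict:
    # Illustrative stand-in: return a copy of the event with one extra variable.
    return {**event, name: value}
```

Resolving names through one lookup point lets a compiler reject unknown operations before any data is processed.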


Normalised workflow#

The workflow is then transformed into a normalised internal representation.

This stage may:

  • resolve defaults

  • expand shorthand syntax

  • apply profile-provided defaults

  • resolve reusable styles

  • construct explicit workflow objects

The resulting workflow representation is more explicit and machine-oriented than the original author YAML.
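
The sketch below shows the flavour of such a transformation on a workflow held as a plain dictionary. The default values, the one-key shorthand form, and the normalise function are assumptions made for illustration; the real normalised schema is not shown in this document.

```python
from copy import deepcopy

# Hypothetical defaults; in practice a profile would supply these.
STAGE_DEFAULTS = {"params": {}, "enabled": True}


def normalise(author: dict) -> dict:
    """Return an explicit copy of an author workflow (illustrative only)."""
    stages = []
    for stage in author.get("analysis", {}).get("stages", []):
        # Expand the assumed shorthand {"StageId": "op.name"} into the long form.
        if len(stage) == 1 and "id" not in stage:
            (stage_id, op), = stage.items()
            stage = {"id": stage_id, "op": op}
        # Fill in defaults without overriding values the author set.
        stages.append({**deepcopy(STAGE_DEFAULTS), **stage})
    return {"analysis": {"stages": stages}}


author = {
    "analysis": {
        "stages": [
            {"id": "BasicVars", "op": "hep.define"},
            {"MoreVars": "hep.define"},  # assumed shorthand form
        ]
    }
}
normalised = normalise(author)
```

The author dictionary is left untouched, so the human-written form and the machine-oriented form can be compared side by side.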


Dependency inference#

fasthep-flow then infers workflow dependencies automatically.

For example:

expr: "sqrt(Muon_Px ** 2 + Muon_Py ** 2)"

implicitly depends on Muon_Px and Muon_Py.

The workflow engine constructs the corresponding dependency edges itself, so users do not have to wire execution graphs together by hand.
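
One possible way to extract such dependencies from an expression is to parse it with Python's standard ast module and collect the names it reads. This is a sketch of the idea only, not necessarily how fasthep-flow implements it; infer_dependencies is a hypothetical helper.

```python
import ast


def infer_dependencies(expr: str) -> set[str]:
    """Return the variable names an expression reads (illustrative only)."""
    tree = ast.parse(expr, mode="eval")
    # Names used as the function in a call, e.g. sqrt, are not data inputs.
    # Limitation: a variable sharing its name with a called function is dropped.
    called = {
        node.func.id
        for node in ast.walk(tree)
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
    }
    names = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    return names - called


deps = infer_dependencies("sqrt(Muon_Px ** 2 + Muon_Py ** 2)")
print(sorted(deps))  # ['Muon_Px', 'Muon_Py']
```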

Note

Dependency inference currently focuses primarily on workflow structure and operation relationships.

More advanced validation and semantic inspection tooling is still evolving as part of the rewrite.


Execution plans#

The final result of compilation is a serialisable execution plan:

plan.yaml

Execution plans contain:

  • resolved workflow structure

  • explicit dependencies

  • runtime configuration

  • backend information

  • execution ordering

  • operation metadata

Plans are intended to be:

  • inspectable

  • reproducible

  • serialisable

  • largely backend-independent

The execution plan acts as the boundary between workflow compilation and runtime execution.
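
To make explicit dependencies and execution ordering concrete, the sketch below derives an ordering with the standard library's graphlib.TopologicalSorter and serialises the result. The Selection and Histograms stage names and the plan field names are assumptions, and JSON is used instead of YAML only to keep the example within the standard library.

```python
import json
from graphlib import TopologicalSorter

# Hypothetical mapping of stage -> stages it depends on.
dependencies = {
    "BasicVars": set(),
    "Selection": {"BasicVars"},
    "Histograms": {"BasicVars", "Selection"},
}

# static_order() yields each stage only after all of its dependencies.
order = list(TopologicalSorter(dependencies).static_order())

plan = {
    "stages": [
        {"id": stage, "depends_on": sorted(dependencies[stage])} for stage in order
    ],
    "backend": {"name": "local"},  # assumed field names, for illustration
}

# A plan built from plain data types serialises and reloads without loss.
serialised = json.dumps(plan, indent=2, sort_keys=True)
assert json.loads(serialised) == plan
```

Because the ordering is computed once and stored, a runtime only needs to read the plan, not repeat the inference.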

Note

Execution plans may carry backend-specific runtime configuration and still remain largely backend-independent.

For example, the same workflow plan may be executed:

  • locally

  • with Dask

  • through workflow managers

  • with alternative runtime implementations

without changing the original author workflow.

Backend configuration may therefore be embedded, overridden, or replaced at runtime depending on the execution environment.
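
A minimal sketch of overriding embedded backend configuration at runtime follows. The plan layout, the backend keys (name, workers), and the with_backend helper are hypothetical and only illustrate the idea that the workflow part of a plan stays fixed while the backend part changes.

```python
from copy import deepcopy


def with_backend(plan: dict, override: dict) -> dict:
    """Return a copy of the plan with backend settings overridden."""
    updated = deepcopy(plan)
    # Shallow merge: override keys win, other embedded settings are kept.
    updated["backend"] = {**plan.get("backend", {}), **override}
    return updated


plan = {
    "stages": [{"id": "BasicVars", "op": "hep.define"}],
    "backend": {"name": "local", "workers": 1},
}
dask_plan = with_backend(plan, {"name": "dask", "workers": 8})
```

Returning a copy keeps the original plan reproducible, while the overridden copy records what was actually run.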


CLI usage#

Workflows are most commonly compiled and executed through the fasthep command-line interface.

Typical workflows move through several stages:

author.yaml
  → normalised workflow
  → execution plan
  → runtime execution

The CLI exposes commands for inspecting and interacting with these stages individually.


Workflow compilation#

Compile and execute a workflow directly:

fasthep run author.yaml

This performs:

  1. workflow loading

  2. profile resolution

  3. normalisation

  4. dependency inference

  5. execution planning

  6. runtime execution


Normalisation#

Inspect the normalised workflow representation:

fasthep normalise author.yaml

or:

fasthep normalize author.yaml

This expands defaults, resolves profiles, and produces a more explicit workflow representation.


Plan generation#

Generate a serialisable execution plan without executing the workflow:

fasthep make-plan author.yaml

or:

fasthep compile author.yaml

This produces:

plan.yaml

which can later be executed independently of the original author workflow.


Plan execution#

Execute a previously generated plan:

fasthep run-plan plan.yaml
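
For scripted use, the two-step sequence of plan generation and plan execution could be driven from Python with the standard subprocess module. This sketch assumes the fasthep executable is on PATH and that make-plan writes plan.yaml to the current directory; plan_then_run and execute are hypothetical helper names.

```python
import subprocess


def plan_then_run(author: str = "author.yaml", plan: str = "plan.yaml") -> list[list[str]]:
    """Build the two-step command sequence (illustrative only)."""
    return [
        ["fasthep", "make-plan", author],  # generate the plan without executing
        ["fasthep", "run-plan", plan],     # execute the generated plan
    ]


def execute(commands: list[list[str]]) -> None:
    # check=True raises CalledProcessError if a step exits with a non-zero status.
    for command in commands:
        subprocess.run(command, check=True)


commands = plan_then_run()
# execute(commands)  # uncomment where the fasthep CLI is installed
```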

Python API usage#

Workflows may also be compiled programmatically through Python APIs.

This is useful for:

  • notebooks

  • custom tooling

  • workflow services

  • testing

  • alternative runtimes

  • experimental optimisation pipelines

Conceptually, compiling a workflow looks like this:

from hepflow.api import load_workflow, compile_workflow

workflow = load_workflow("author.yaml")
plan = compile_workflow(workflow)

The exact APIs are still evolving during the rewrite.


Runtime execution#

Once compiled, execution plans may be evaluated by different runtime backends.

Current and planned backends include:

  • local execution

  • Dask-based distributed execution

  • workflow-manager orchestration

  • experimental optimisation and execution backends
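
The idea of interchangeable backends can be sketched with a small interface that every backend satisfies. The Backend protocol, LocalBackend class, execute function, plan layout, and the Muon_Pt value below are all hypothetical; they illustrate the separation, not the fasthep-flow runtime.

```python
from typing import Callable, Dict, Protocol

Ops = Dict[str, Callable[[dict], dict]]


class Backend(Protocol):
    """Hypothetical interface a runtime backend might satisfy."""

    def run(self, plan: dict, ops: Ops) -> dict: ...


class LocalBackend:
    """Run stages one after another in the current process."""

    def run(self, plan: dict, ops: Ops) -> dict:
        data: dict = {}
        # The plan is assumed to list stages in dependency order already.
        for stage in plan["stages"]:
            data = ops[stage["op"]](data)
        return data


def execute(plan: dict, backend: Backend, ops: Ops) -> dict:
    # The caller picks the backend; the plan itself is unchanged.
    return backend.run(plan, ops)


plan = {"stages": [{"id": "BasicVars", "op": "hep.define"}]}
ops: Ops = {"hep.define": lambda data: {**data, "Muon_Pt": 1.0}}
result = execute(plan, LocalBackend(), ops)
```

A distributed backend would implement the same run method with a different scheduling strategy, leaving the plan and the operations as they are.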

Backends are discussed in more detail in Execution Strategies and backends.


Next steps#

Continue with: