Resource Pools
Resource pools are the mechanism FAST-HEP uses to route work to appropriate workers.
They allow a single workflow to use multiple types of compute resources without splitting the analysis into separate jobs.
Motivation
Scientific workflows often contain stages with very different requirements.
Examples include:
lightweight filtering,
memory-intensive preprocessing,
machine learning inference,
GPU-accelerated calculations.
Traditionally these stages are executed as separate workflows or submitted to different queues manually.
FAST-HEP allows them to coexist within a single workflow.
Resources vs Pools
A useful way to think about the execution system is:
Resources describe what a stage needs.
Pools describe what workers exist.
Resources
Stages request resources:
analysis:
  stages:
    - id: BuildIndex
      op: custom.build_index
      execution:
        require: high_memory
    - id: Inference
      op: custom.inference
      execution:
        require: gpu
The stage does not know:
which batch system is used,
how many workers exist,
which machine will execute it.
It only describes what it needs.
Pools
Pools describe available worker groups:
execution:
  pools:
    default:
      workers: 20
      resources:
        cpus: 1
        memory: 4GB
    high_memory:
      workers: 2
      resources:
        cpus: 8
        memory: 128GB
    gpu:
      workers: 1
      resources:
        gpus: 1
        memory: 16GB
Each pool creates workers with specific capabilities.
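To make the pool definition concrete, the following Python sketch mirrors the YAML above as a dictionary and computes what each pool provides in total across its workers. The helper names (`memory_gb`, `pool_totals`) and the assumption that memory is always written as an integer followed by `GB` are illustrative only and are not part of FAST-HEP.

```python
import re

# Pool definitions copied from the YAML example above, as a plain dict.
POOLS = {
    "default": {"workers": 20, "resources": {"cpus": 1, "memory": "4GB"}},
    "high_memory": {"workers": 2, "resources": {"cpus": 8, "memory": "128GB"}},
    "gpu": {"workers": 1, "resources": {"gpus": 1, "memory": "16GB"}},
}


def memory_gb(text):
    """Parse a size such as '128GB' into gigabytes (assumed format)."""
    match = re.fullmatch(r"(\d+)\s*GB", text.strip())
    if match is None:
        raise ValueError(f"unsupported memory size: {text!r}")
    return int(match.group(1))


def pool_totals(pools):
    """Multiply each declared per-worker resource by the worker count."""
    totals = {}
    for name, pool in pools.items():
        workers = pool["workers"]
        declared = {}
        for key, value in pool.get("resources", {}).items():
            if key == "memory":
                declared["memory_gb"] = workers * memory_gb(value)
            else:
                declared[key] = workers * value
        totals[name] = declared
    return totals


TOTALS = pool_totals(POOLS)
```

Only resources that a pool actually declares are summed, so the `gpu` pool reports GPUs and memory but no CPU count.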
Routing
When a stage requests:
execution:
  require: high_memory
FAST-HEP schedules that stage onto workers from the corresponding pool.
Likewise:
execution:
  require: gpu
will only execute on workers belonging to the GPU pool.
Conceptually:
Stage
↓
Required resource
↓
Matching pool
↓
Worker
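The chain above can be sketched in Python as a simple lookup from a stage to a pool name. This is a minimal illustration, not FAST-HEP's scheduler: `route_stage` and `DEFAULT_POOL` are hypothetical names, and two behaviours are assumptions, namely that a stage without `require` falls back to the `default` pool and that an unknown pool name is treated as an error.

```python
DEFAULT_POOL = "default"  # assumed fallback for stages without `require`

# Pool names and worker counts from the Pools example.
POOLS = {
    "default": {"workers": 20},
    "high_memory": {"workers": 2},
    "gpu": {"workers": 1},
}


def route_stage(stage, pools):
    """Return the name of the pool whose workers should run this stage."""
    required = stage.get("execution", {}).get("require", DEFAULT_POOL)
    if required not in pools:
        # Assumption: a request for an undefined pool is a configuration error.
        raise KeyError(f"stage {stage['id']!r} requires unknown pool {required!r}")
    return required


STAGES = [
    {"id": "BuildIndex", "op": "custom.build_index",
     "execution": {"require": "high_memory"}},
    {"id": "Inference", "op": "custom.inference",
     "execution": {"require": "gpu"}},
    {"id": "MuonPt", "op": "hep.hist"},  # no explicit requirement
]

ROUTES = {stage["id"]: route_stage(stage, POOLS) for stage in STAGES}
```

Here `ROUTES` maps each stage id to a pool name. The stage definition itself never mentions a batch system or a specific machine.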
A Typical Example
Consider a workflow that:
builds an event index,
performs GPU inference,
creates histograms.
analysis:
  stages:
    - id: BuildIndex
      op: custom.index
      execution:
        require: high_memory
    - id: RunInference
      op: custom.inference
      execution:
        require: gpu
    - id: MuonPt
      op: hep.hist
With pools:
execution:
  pools:
    default:
      workers: 10
    high_memory:
      workers: 2
      resources:
        memory: 128GB
    gpu:
      workers: 1
      resources:
        gpus: 1
Execution proceeds automatically:
BuildIndex   → high_memory workers
RunInference → gpu workers
MuonPt       → default workers
No manual job splitting is required.
Pool Profiles
Many sites use the same resource configurations repeatedly.
Profiles can be used to avoid repetition.
For example:
execution:
  pools:
    default:
      use: standard_worker
    gpu:
      use: gpu_worker
where profiles define the actual worker configuration.
This allows sites to standardise resource definitions while analyses remain portable.
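One way to picture profile expansion is as a dictionary merge. The Python sketch below is hypothetical throughout: the source does not show where profiles are defined or how they are merged, so the `PROFILES` catalogue, its values, the `resolve_pool` helper, and the rule that keys set directly on a pool override the profile are all assumptions made for illustration.

```python
import copy

# Hypothetical site-level profile catalogue (contents are illustrative).
PROFILES = {
    "standard_worker": {"workers": 10, "resources": {"cpus": 1, "memory": "4GB"}},
    "gpu_worker": {"workers": 1, "resources": {"gpus": 1, "memory": "16GB"}},
}

# Pools as written in the analysis configuration above.
POOLS = {
    "default": {"use": "standard_worker"},
    "gpu": {"use": "gpu_worker"},
}


def resolve_pool(pool, profiles):
    """Expand a pool's `use:` reference into a full worker configuration."""
    name = pool.get("use")
    if name is None:
        return dict(pool)  # nothing to expand
    if name not in profiles:
        raise KeyError(f"unknown pool profile {name!r}")
    resolved = copy.deepcopy(profiles[name])
    # Assumption: keys set directly on the pool override the profile.
    resolved.update({k: v for k, v in pool.items() if k != "use"})
    return resolved


RESOLVED = {name: resolve_pool(pool, PROFILES) for name, pool in POOLS.items()}
```

Under these assumptions the analysis file only names a profile, and a site can change what `gpu_worker` means without touching the analysis.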
Pool-Specific Configuration
Pools may also carry backend-specific configuration.
For example:
execution:
  pools:
    high_memory:
      workers: 2
      resources:
        memory: 128GB
      config:
        walltime: 04:00:00
The exact configuration options depend on the selected execution strategy.
For example, HTCondor may expose scheduling options that differ from those of other backends.
Heterogeneous Execution
The most important feature of resource pools is heterogeneous execution.
A single workflow can use CPU workers, high-memory workers, and GPU workers simultaneously.
This allows workflows to express the natural structure of an analysis rather than forcing users to divide it into separate batch submissions.
Resource Pools and Dask
When using the Dask backend, pools are translated into worker groups.
Each pool advertises resources to Dask:
resource.default
resource.high_memory
resource.gpu
Stages are automatically annotated with the required resources.
Dask then routes tasks to matching workers.
The analysis author does not need to manage this routing manually.
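The matching rule can be illustrated without a running cluster. The Python sketch below builds the resource dictionaries that a pool's workers would advertise and that a stage's tasks would request, using the `resource.<pool>` names listed above. The helper names, the quantity `1`, and the fallback to `resource.default` for stages without `require` are assumptions; how FAST-HEP constructs these internally is not shown in this section.

```python
RESOURCE_PREFIX = "resource."  # matches the names listed above


def worker_resources(pool_name):
    """Resources a worker in this pool would advertise to the scheduler."""
    # The quantity 1 is a placeholder; Dask treats resources as abstract counts.
    return {RESOURCE_PREFIX + pool_name: 1}


def stage_resources(stage):
    """Resource request attached to each task of a stage."""
    pool_name = stage.get("execution", {}).get("require", "default")
    return {RESOURCE_PREFIX + pool_name: 1}


def can_run(stage, pool_name):
    """Mimic Dask's rule: a worker must offer every requested resource."""
    offered = worker_resources(pool_name)
    requested = stage_resources(stage)
    return all(offered.get(key, 0) >= amount for key, amount in requested.items())


# In Dask itself, the first dict corresponds to a worker's resources setting
# (for example the `--resources` option of `dask worker`), and the second to
# `dask.annotate(resources=...)` or `client.submit(..., resources=...)`.
INFERENCE = {"id": "Inference", "execution": {"require": "gpu"}}
MUON_PT = {"id": "MuonPt"}
```

With these definitions, `can_run(INFERENCE, "gpu")` is true and `can_run(INFERENCE, "default")` is false, which is the behaviour the routing relies on.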