61 changes: 47 additions & 14 deletions README.md
@@ -1,4 +1,4 @@
-# Task Workflow v3.1 — DAG-Based Task Scheduler
+# Task Workflow v3.2 — DAG-Based Task Scheduler

Intelligent task scheduling skill for AI coding agents. Uses dependency analysis,
complexity scoring, and topological sorting to produce optimal execution batches.
@@ -13,6 +13,8 @@ Tasks with dependencies → DAG analysis → Batch schedule (parallel where poss
- **Complexity Scoring**: 1-10 scale, lower complexity tasks execute first
- **Batch Grouping**: Independent tasks grouped for parallel execution
- **Dynamic Insertion**: Add tasks mid-execution without restart
- **Cross-session Persistence**: Daily backlog files with CST 00:00 auto-migration
- **Cycle Detection**: Refuses to schedule circular dependencies
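
The features above (batch grouping, complexity ordering, cycle detection) could fit together roughly as follows. This is a minimal sketch assuming Kahn's algorithm with level-by-level grouping; it is not the code in `scripts/task_scheduler.py`. The function name `schedule_batches`, the task dictionary shape, and the tie-break on task id are all hypothetical.

```python
from collections import defaultdict


def schedule_batches(tasks):
    """Group tasks into dependency-respecting batches (hypothetical helper).

    `tasks` maps task id -> {"complexity": int, "dependencies": [task ids]}.
    Raises ValueError when some tasks can never be scheduled (a cycle).
    """
    indegree = {tid: len(spec["dependencies"]) for tid, spec in tasks.items()}
    dependents = defaultdict(list)
    for tid, spec in tasks.items():
        for dep in spec["dependencies"]:
            if dep not in tasks:
                raise ValueError(f"unknown dependency {dep!r} for task {tid!r}")
            dependents[dep].append(tid)

    ready = [tid for tid, degree in indegree.items() if degree == 0]
    batches = []
    scheduled = 0
    while ready:
        # Lower complexity first; ties broken by id (an assumption of this sketch).
        ready.sort(key=lambda tid: (tasks[tid]["complexity"], tid))
        batches.append(ready)
        scheduled += len(ready)
        next_ready = []
        for tid in ready:
            for child in dependents[tid]:
                indegree[child] -= 1
                if indegree[child] == 0:
                    next_ready.append(child)
        ready = next_ready

    if scheduled != len(tasks):
        # Anything left is on a cycle or depends on a task that is.
        stuck = sorted(tid for tid, degree in indegree.items() if degree > 0)
        raise ValueError("circular dependency, unschedulable: " + ", ".join(stuck))
    return batches
```

Each batch holds only tasks whose dependencies all sit in earlier batches, so every task within a batch can run in parallel. Refusing to return a partial schedule when a cycle is found mirrors the "refuses to schedule" behavior described above.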

## Installation

@@ -29,32 +31,63 @@ git clone https://github.com/Charpup/openclaw-task-workflow.git ~/.claude/skills
Task-workflow integrates with [TriaDev](https://github.com/Charpup/triadev):

```
-planning-with-files (plan) → task-workflow (schedule) → tdd-sdd (implement)
+planning-with-files (plan) → task-workflow (schedule) → tdd-sdd-development (implement)
```

-Coordinates via `triadev-handoff.json` — reads extracted tasks, writes batch schedule.
+Coordinates via `triadev-handoff.json` — reads extracted tasks, writes batch schedule. Also works standalone: reads `task_plan.md` directly when no handoff.json is present.
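
The handoff round trip described above might look like the sketch below. The field names `planning.tasks_extracted`, `scheduling.batches`, and `scheduling.status` come from the eval cases in this PR; the helper names, the "return `None` in standalone mode" convention, and the shape of each batch (a plain list of task ids) are assumptions made for illustration.

```python
import json
from pathlib import Path


def load_extracted_tasks(handoff_path):
    """Return planning.tasks_extracted, or None when no handoff file exists.

    Returning None is this sketch's way of signalling standalone mode, where
    the caller would fall back to parsing task_plan.md instead.
    """
    path = Path(handoff_path)
    if not path.exists():
        return None
    handoff = json.loads(path.read_text(encoding="utf-8"))
    return handoff["planning"]["tasks_extracted"]


def write_schedule(handoff_path, batches):
    """Write batches into the handoff file and mark scheduling as complete."""
    path = Path(handoff_path)
    handoff = json.loads(path.read_text(encoding="utf-8"))
    scheduling = handoff.setdefault("scheduling", {})
    scheduling["batches"] = batches  # assumed shape: list of lists of task ids
    scheduling["status"] = "complete"
    path.write_text(json.dumps(handoff, indent=2), encoding="utf-8")
    return handoff
```

Only the `scheduling` section is touched, so whatever the planning and implementation phases wrote stays as it was.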

## What's New in v3.2

| Addition | Purpose |
|----------|---------|
| `examples/humanizer-skill-schedule/` | **GOLD** real-run reference — 21-task humanizer-skill project with input task_plan, full DAG output, and handoff snippet. Shows fan-out (T7 → 5 tasks), fan-in (T13 ← 5 tasks), critical path (12 tasks, complexity sum 33), max parallelism (5). |
| `evals/evals.json` (4 → 8 cases) | New cases: within-batch complexity ordering, cross-session migration (CST 00:00 behavior), standalone mode (no triadev-handoff), dynamic insertion mid-execution. ~80% deterministic assertions (sequence_order, json_path_*, file_exists). |
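
For a sense of what "deterministic assertion" means here, the sketch below evaluates four of the string-based types against a response. It is not the eval harness itself, which this PR does not show. The function name `check_assertion`, the case-sensitive substring matching, and the reading of `sequence_order` as "each value appears after the previous one" are assumptions.

```python
def check_assertion(assertion, response):
    """Evaluate one string-based assertion against a response (hypothetical)."""
    kind = assertion["type"]
    values = assertion.get("values", [])
    if kind == "contains_any":
        return any(value in response for value in values)
    if kind == "contains_all":
        return all(value in response for value in values)
    if kind == "does_not_contain":
        return not any(value in response for value in values)
    if kind == "sequence_order":
        # Scan left to right: each value must occur after the previous match.
        position = 0
        for value in values:
            found = response.find(value, position)
            if found == -1:
                return False
            position = found + len(value)
        return True
    raise ValueError(f"unsupported assertion type: {kind}")
```

Unlike `llm_judge`, these checks return the same result on every run for the same response, which is the property the 80% figure refers to.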

## Project Structure

```
openclaw-task-workflow/
-├── SKILL.md # Scheduling instructions
-├── scripts/
-│ ├── task_scheduler.py # DAG sort + batch grouping
-│ ├── task_persistence.py # State persistence
-│ └── task_index_manager.py # Cross-session index
-├── references/
-│ ├── file-format.md # Daily file format spec
-│ └── v3-migration.md # Migration guide
+├── SKILL.md # Scheduling workflow
+├── README.md # This file
+├── cli.py # Entry point
+├── pytest.ini # Test config
├── contracts/
-│ └── stack-handshake.json # TriaDev integration contract
+│ └── stack-handshake.json # TriaDev integration contract
+├── references/
+│ ├── file-format.md # Daily file format spec (v3)
+│ └── v3-migration.md # Migration guide from earlier versions
+├── scripts/
+│ ├── task_scheduler.py # DAG sort + batch grouping
+│ ├── task_persistence.py # File persistence + daily migration
+│ ├── task_index_manager.py # Cross-session index
+│ └── stack_contract.py # Contract validator
+├── config/
+│ └── cron.yaml # Auto-migration config (CST 00:00)
+├── examples/
+│ └── humanizer-skill-schedule/ # GOLD — real 21-task project
├── evals/
-│ └── evals.json # Test cases
-└── tests/ # Unit + integration tests
+│ └── evals.json # 8 cases
+└── tests/ # Unit + integration
```

## Working Example

See [`examples/humanizer-skill-schedule/`](examples/humanizer-skill-schedule/) for a real completed run — the humanizer-skill build (PR blader/humanizer#94 merged). Demonstrates:

- Mixed complexities (1 to 7) with real reasoning
- Fan-out: one high-complexity task (T7 merge 32 patterns) unlocks 5 downstream references
- Fan-in: the main `SKILL.md` draft (T13) requires all 5 reference files
- Critical path length 12, total complexity sum 33, max parallelism 5
- Both raw `output-schedule.json` format and the triadev-handoff slice
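
The critical-path and parallelism figures above can be illustrated on a toy DAG. The sketch assumes "critical path" means the dependency chain with the most tasks and "max parallelism" means the size of the largest dependency level; the README does not state how its own numbers were computed. The helper names and the six-task fan-out/fan-in graph are hypothetical and are not the humanizer-skill data. Both helpers assume the graph is acyclic.

```python
from collections import Counter
from functools import lru_cache


def critical_path(tasks):
    """Return (task ids, complexity sum) for the longest chain by task count."""
    @lru_cache(maxsize=None)
    def longest_to(tid):
        best = ()
        for dep in tasks[tid]["dependencies"]:
            candidate = longest_to(dep)
            if len(candidate) > len(best):
                best = candidate
        return best + (tid,)

    path = max((longest_to(tid) for tid in tasks), key=len)
    return list(path), sum(tasks[tid]["complexity"] for tid in path)


def max_parallelism(tasks):
    """Return the size of the largest dependency level (widest batch)."""
    @lru_cache(maxsize=None)
    def level(tid):
        deps = tasks[tid]["dependencies"]
        return 1 + max((level(dep) for dep in deps), default=0)

    return max(Counter(level(tid) for tid in tasks).values())


# Hypothetical toy graph: one root fans out to three tasks that fan back in.
toy = {
    "root": {"complexity": 3, "dependencies": []},
    "ref-1": {"complexity": 2, "dependencies": ["root"]},
    "ref-2": {"complexity": 2, "dependencies": ["root"]},
    "ref-3": {"complexity": 2, "dependencies": ["root"]},
    "merge": {"complexity": 4, "dependencies": ["ref-1", "ref-2", "ref-3"]},
    "ship": {"complexity": 1, "dependencies": ["merge"]},
}
print(critical_path(toy))    # (['root', 'ref-1', 'merge', 'ship'], 10)
print(max_parallelism(toy))  # 3
```

The fan-out step sets the parallelism (three tasks at one level), while the fan-in step forces the path through `merge`, which is the same shape the T7 and T13 tasks give the real example.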

## Changelog

### v3.2.0 (2026-04-18)
Round-2 standardization. Additive; no breaking changes.

- **New**: `examples/humanizer-skill-schedule/` — GOLD completed-run reference harvested from real humanizer-skill project. 4 files: README + input task_plan.md + output-schedule.json + handoff snippet.
- **Hardened**: `evals/evals.json` — 4 → 8 cases. New: complexity ordering within batches, cross-session migration behavior, standalone mode (no handoff.json), dynamic insertion. Shifted ~80% of assertions to deterministic types (`sequence_order`, `json_path_equals`, `file_exists`).

### v3.1.0 (2026-04-09)
- **New**: Prompt-centric SKILL.md with clear boundary rules
- **New**: Integration with triadev-handoff.json contract
159 changes: 145 additions & 14 deletions evals/evals.json
@@ -1,37 +1,65 @@
{
"skill_name": "task-workflow",
"eval_schema_version": "2.0",
"assertion_types_supported": {
"llm_judge": "Subjective check, prone to leniency. Use sparingly.",
"contains_any": "Deterministic. Response contains at least one value.",
"contains_all": "Deterministic. Response contains every value.",
"does_not_contain": "Deterministic. Response contains no value from values[].",
"file_exists": "Deterministic. File at path exists.",
"no_file_exists": "Deterministic. File at path does NOT exist.",
"json_path_equals": "Deterministic. JSON at file has path equal to expected.",
"json_path_in": "Deterministic. JSON at file has path whose value is one of values[].",
"json_path_length": "Deterministic. JSON array at path has expected length.",
"json_schema_valid": "Deterministic. JSON at file validates against schema_path.",
"sequence_order": "Deterministic. Response contains all values in the given order (for batch ordering checks)."
},
"evals": [
{
"id": "dag-basic-01",
"prompt": "I have 5 tasks for my project: research-api (no deps, complexity 2), design-schema (depends on research-api, complexity 4), implement-auth (depends on design-schema, complexity 6), write-tests (depends on implement-auth, complexity 5), deploy (depends on write-tests and implement-auth, complexity 3). Schedule these with dependency resolution.",
"expected_output": "Builds a DAG, performs topological sort, groups into batches respecting dependencies and ordered by complexity within each batch.",
"assertions": [
{
-"text": "Correct batch ordering respecting dependencies",
-"type": "llm_judge",
-"criteria": "Tasks are scheduled so that no task runs before its dependencies complete. research-api must be in an earlier batch than design-schema, etc."
+"text": "Batches announced in correct dependency order",
+"type": "sequence_order",
+"values": ["research-api", "design-schema", "implement-auth", "write-tests", "deploy"]
},
{
"text": "Mentions scheduling concepts",
"type": "contains_any",
"values": ["DAG", "dependency", "batch", "topological", "schedule"]
},
{
"text": "research-api is in first batch",
"type": "llm_judge",
"criteria": "The schedule output shows research-api in batch 1 (or the first batch group), since it has no dependencies."
}
]
},
{
"id": "handoff-integration-01",
"prompt": "Here's my triadev-handoff.json with tasks_extracted already populated:\n{\"version\":\"1.0.0\",\"project\":\"auth-api\",\"route\":\"extended\",\"current_phase\":\"scheduling\",\"planning\":{\"status\":\"complete\",\"tasks_extracted\":[{\"id\":\"create-schema\",\"name\":\"Create DB schema\",\"complexity\":3,\"dependencies\":[]},{\"id\":\"impl-model\",\"name\":\"Implement user model\",\"complexity\":5,\"dependencies\":[\"create-schema\"]},{\"id\":\"write-tests\",\"name\":\"Write integration tests\",\"complexity\":6,\"dependencies\":[\"impl-model\"]}]}}\nSchedule these tasks and update the handoff file.",
"prompt": "Here's my triadev-handoff.json with tasks_extracted already populated:\n{\"version\":\"1.0.0\",\"project\":\"auth-api\",\"route\":\"extended\",\"current_phase\":\"scheduling\",\"planning\":{\"status\":\"complete\",\"files\":[\"task_plan.md\"],\"tasks_extracted\":[{\"id\":\"create-schema\",\"name\":\"Create DB schema\",\"complexity\":3,\"dependencies\":[]},{\"id\":\"impl-model\",\"name\":\"Implement user model\",\"complexity\":5,\"dependencies\":[\"create-schema\"]},{\"id\":\"write-tests\",\"name\":\"Write integration tests\",\"complexity\":6,\"dependencies\":[\"impl-model\"]}]},\"scheduling\":{\"status\":\"pending\",\"batches\":[]},\"value_gate\":{\"status\":\"pending\",\"verdict\":null,\"review_path\":null},\"implementation\":{\"status\":\"pending\",\"completed\":[],\"current\":null,\"spec_path\":null,\"tdd_state_path\":null}}\nSchedule these tasks and update the handoff file.",
"expected_output": "Reads tasks from handoff.json planning.tasks_extracted. Produces batches. Writes scheduling.batches and scheduling.status to handoff file.",
"assertions": [
{
-"text": "Reads from handoff file",
-"type": "llm_judge",
-"criteria": "The response reads tasks from the triadev-handoff.json planning.tasks_extracted field, not from some other source."
+"text": "Updates handoff scheduling status to complete",
+"type": "json_path_equals",
+"file": "triadev-handoff.json",
+"path": "$.scheduling.status",
+"equals": "complete"
},
{
"text": "Writes exactly 3 batches",
"type": "json_path_length",
"file": "triadev-handoff.json",
"path": "$.scheduling.batches",
"length": 3
},
{
-"text": "Writes schedule back to handoff",
+"text": "First batch contains create-schema",
"type": "llm_judge",
-"criteria": "The response updates triadev-handoff.json with scheduling.batches and scheduling.status fields."
+"criteria": "scheduling.batches[0] contains 'create-schema' (the only dependency-free task)."
}
]
},
@@ -41,9 +69,19 @@
"expected_output": "Detects circular dependency (a→c→b→a) and reports the cycle. Does not produce a schedule.",
"assertions": [
{
-"text": "Detects and reports circular dependency",
-"type": "llm_judge",
-"criteria": "The response identifies the circular dependency between the three tasks and refuses to produce a schedule, explaining the cycle."
+"text": "Reports the cycle explicitly",
+"type": "contains_any",
+"values": ["circular", "cycle", "cyclic"]
},
{
"text": "Does not produce batches",
"type": "does_not_contain",
"values": ["Batch 1:", "Batch 2:", "scheduled into batches"]
},
{
"text": "Names the cycling tasks",
"type": "contains_all",
"values": ["task-a", "task-b", "task-c"]
}
]
},
@@ -53,9 +91,102 @@
"expected_output": "This is not a task scheduling request. Should not trigger task-workflow.",
"assertions": [
{
-"text": "Does not trigger DAG scheduling",
+"text": "Does not produce batches",
"type": "does_not_contain",
"values": ["Batch 1:", "DAG", "complexity score", "dependency resolution"]
},
{
"text": "No handoff file created",
"type": "no_file_exists",
"path": "triadev-handoff.json"
}
]
},
{
"id": "batch-ordering-complexity-01",
"prompt": "Schedule these tasks, all with zero dependencies: task-low (complexity 2), task-mid (complexity 5), task-high (complexity 8), task-trivial (complexity 1). They can all run in parallel, but order within the batch should reflect complexity.",
"expected_output": "All four tasks in Batch 1, ordered from lowest to highest complexity: task-trivial → task-low → task-mid → task-high.",
"assertions": [
{
"text": "All tasks in same batch (parallel)",
"type": "llm_judge",
"criteria": "All four tasks are in Batch 1 (or the single first batch). Having no dependencies places them all in the same batch."
},
{
"text": "Within-batch order is ascending by complexity",
"type": "sequence_order",
"values": ["task-trivial", "task-low", "task-mid", "task-high"]
}
]
},
{
"id": "cross-session-persistence-01",
"prompt": "Yesterday I created a task workflow file with 3 tasks. Two were running at CST 00:00 when auto-migration kicked in:\n\nYesterday's file (~/.openclaw/workspace/task_backlog/task-workflow-progress-2026-04-17.md):\n```\n| ID | Task | Complexity | Dependencies | Status | Batch |\n|----|------|-----------|--------------|--------|-------|\n| a | Task A | 3 | - | ✅ Completed | 1 |\n| b | Task B | 5 | - | 🔄 Running | 1 |\n| c | Task C | 4 | b | ⏳ Pending | 2 |\n```\n\nWhat should today's file (2026-04-18) contain?",
"expected_output": "Today's file has task b (running → pending, marked migrated) and task c (pending, marked migrated). Task a (completed) is NOT carried over.",
"assertions": [
{
"text": "Completed task 'a' not migrated",
"type": "llm_judge",
-"criteria": "The response handles this as a simple list request without invoking DAG scheduling, batch ordering, or task-workflow concepts."
+"criteria": "Task 'a' does NOT appear in today's migrated section (completed tasks stay in history)."
},
{
"text": "Running task 'b' migrated with reset status",
"type": "contains_all",
"values": ["b", "migrated"]
},
{
"text": "Pending task 'c' carried over",
"type": "contains_any",
"values": ["Task C", "task c"]
},
{
"text": "Status reset logic mentioned",
"type": "contains_any",
"values": ["reset", "pending", "restart"]
}
]
},
{
"id": "standalone-mode-01",
"prompt": "I have a task_plan.md but no triadev-handoff.json (I'm using task-workflow standalone). Schedule these:\n\n## Phase 2: Implementation\n- [ ] Setup database (id: setup-db, complexity: 2)\n- [ ] Write API routes (id: write-routes, deps: setup-db, complexity: 5)\n- [ ] Add validation (id: add-validation, deps: write-routes, complexity: 3)\n- [ ] Integration tests (id: int-tests, deps: add-validation, complexity: 6)",
"expected_output": "Works in standalone mode. Outputs schedule as formatted text. Does NOT create triadev-handoff.json (not using triadev).",
"assertions": [
{
"text": "Does not create handoff file in standalone mode",
"type": "no_file_exists",
"path": "triadev-handoff.json"
},
{
"text": "Outputs schedule as text",
"type": "contains_all",
"values": ["setup-db", "write-routes", "add-validation", "int-tests"]
},
{
"text": "Correct dependency order",
"type": "sequence_order",
"values": ["setup-db", "write-routes", "add-validation", "int-tests"]
}
]
},
{
"id": "dynamic-insertion-01",
"prompt": "I currently have this schedule running:\nBatch 1: [a, b] (completed)\nBatch 2: [c] (running)\nBatch 3: [d] (pending, depends on c)\n\nInsert a new task 'hotfix' with complexity 2 and no dependencies. Re-schedule the remaining work.",
"expected_output": "Re-runs DAG analysis on remaining tasks only (c, d, hotfix). Since hotfix has no deps and c is already running, hotfix joins the currently-running batch or inserts a new parallel batch.",
"assertions": [
{
"text": "Does not re-schedule completed tasks",
"type": "does_not_contain",
"values": ["re-scheduling a", "re-scheduling b", "batch 1 updated"]
},
{
"text": "Inserts hotfix into remaining schedule",
"type": "contains_any",
"values": ["hotfix"]
},
{
"text": "Announces schedule change",
"type": "contains_any",
"values": ["schedule change", "updated schedule", "inserting", "re-scheduling"]
}
]
}
51 changes: 51 additions & 0 deletions examples/humanizer-skill-schedule/README.md
@@ -0,0 +1,51 @@
# Example: humanizer-skill schedule (GOLD — real completed run)

## What this is

A real completed task-workflow run from `projects/humanizer-skill/` (March 2026).
A 21-task skill build that went planning → scheduling → implementation → PR merged.

Inputs are the **verbatim** `task_plan.md` from the project. Outputs are what
task-workflow's DAG analysis produces from that plan.

## Why GOLD

- ✅ Real completed project (PR blader/humanizer#94 landed)
- ✅ 20 of 21 tasks completed — realistic in-flight state
- ✅ Mix of task complexities (1–7), mix of dependency shapes (fan-out, fan-in, parallel)
- ✅ Shows both `tasks_extracted` (from handoff.json) and `batches` (scheduled output)
- ✅ Demonstrates within-batch complexity ordering (lowest complexity first)

## Files

| File | Role |
|------|------|
| `input-task_plan.md` | Input to task-workflow (extracted from project) |
| `output-schedule.json` | task-workflow output (batches + metadata) |
| `output-handoff-snippet.json` | How this schedule lives inside triadev-handoff.json |

## Characteristic lessons

1. **Complexity ordering within a batch**: Batch 1 contains 4 independent tasks
(T1, T2, T6, T16), ordered T1(1) → T2(1) → T16(1) → T6(4): ascending by
complexity, per the scheduler's rule.

2. **Fan-out from T7**: T7 (merge 32 patterns, complexity 7) is the single-point
dependency for 5 downstream tasks (T8–T12). Those 5 run in parallel after T7.

3. **Fan-in to T13**: T13 (SKILL.md) depends on all 5 reference files (T8–T12).
The scheduler correctly waits for the entire batch to complete before running T13.

4. **Linear tail (T17 → T21)**: Deployment phase is inherently sequential. No
parallelization possible.

5. **Standalone-style output**: This example uses task-workflow without triadev.
Output is formatted text + JSON, not written to triadev-handoff.json.
`output-handoff-snippet.json` shows how the same schedule maps into handoff.json
when triadev orchestrates.

## What would make this NOT an ideal reference

- Task complexity scores were assigned by the author, not derived from code
- No cross-session daily-file migration is shown (the project completed in one
week, mostly in 2-3 sessions, so no CST 00:00 migrations occurred)
- For a daily-file migration example, see `references/file-format.md` (spec only)
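
Since this example shows no migration, here is a minimal sketch of the carry-over rule that the `cross-session-persistence-01` eval case expects: completed tasks stay in yesterday's file, while running and pending tasks move to today's file with status reset to pending and a migrated marker. This is not the logic in `scripts/task_persistence.py`; the function name `migrate_tasks`, the row dictionaries, and the `migrated` flag are assumptions.

```python
def migrate_tasks(yesterday_rows):
    """Return the rows to carry into today's file (hypothetical helper).

    Each row is a dict with at least "id" and "status"; status strings follow
    the table in the eval prompt, e.g. "✅ Completed" or "🔄 Running".
    """
    carried = []
    for row in yesterday_rows:
        if "completed" in row["status"].lower():
            continue  # completed work stays in history
        moved = dict(row)  # copy so yesterday's rows are left untouched
        moved["status"] = "⏳ Pending"  # running tasks are reset to pending
        moved["migrated"] = True
        carried.append(moved)
    return carried
```

Resetting a running task to pending reflects that its in-flight work cannot be assumed to have survived the session boundary.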