[Pelis Agent Factory Advisor] Pelis Agent Factory Advisor - Agentic Workflow Maturity Report (2026-03-12) #1253
Closed
Replies: 3 comments
- The veiled auguries align: this smoke test agent has passed through, leaving a quiet mark in the ledger of signs.
- 🔮 The ancient spirits stir; the oracle records that the smoke test agent was here, and the omens are set in the ledger.
- This discussion was automatically closed because it expired on 2026-03-19T03:31:42.075Z.
📊 Executive Summary
`gh-aw-firewall` has a mature and sophisticated agentic workflow ecosystem with 21 compiled agentic workflows — well above most repositories. The security domain coverage is particularly strong, with three hourly secret-digger agents, a PR security guard, daily threat modeling, and dependency monitoring. The primary gaps are: automated issue triage (no labeling), a meta-agent layer to monitor workflow health, and a few high-ROI automation patterns (breaking change checker, PR fix command, changeset generator).
🎓 Patterns Learned from Pelis Agent Factory
From crawling the documentation at github.github.io/gh-aw and the series of "Meet the Workflows" posts, key patterns include:
- `/pr-fix`: on-demand CI repair
📋 Current Agentic Workflow Inventory
- build-test
- ci-doctor (triggered on `workflow_run: completed`)
- ci-cd-gaps-assessment
- cli-flag-consistency-checker
- dependency-security-monitor
- doc-maintainer
- issue-duplication-detector
- issue-monster
- pelis-agent-factory-advisor
- plan (`/plan` slash command for task breakdown)
- secret-digger-claude
- secret-digger-codex
- secret-digger-copilot
- security-guard
- security-review
- smoke-chroot
- smoke-claude
- smoke-codex
- smoke-copilot
- test-coverage-improver
- update-release-notes

Total: 21 agentic workflows (all compiled and active)
🚀 Actionable Recommendations
P0 — Implement Immediately
[P0] Issue Triage Agent
What: Automatically label incoming issues with `bug`, `feature`, `enhancement`, `security`, `documentation`, or `question` based on content analysis.
Why: Currently zero issues are being labeled automatically. The Pelis Agent Factory calls this the "hello world" of agentic workflows — practical, immediately useful, and simple. With `issue-monster` already dispatching issues to Copilot agents, good labeling will help it prioritize correctly. Security-relevant issues (iptables, container escape, credential exposure) should be labeled `security` automatically.
How: Add a new `issue-triage.md` workflow triggered on `issues: [opened, reopened]` with read-only permissions. Allow the safe outputs `add-labels` and `add-comment`. Include domain-specific categories: `security`, `firewall`, `docker`, `cli`, `documentation`, `bug`, `feature`.
Effort: Low — ~20 lines of YAML frontmatter plus natural-language instructions
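As a rough sketch of the scale involved, the frontmatter for such a workflow could look something like the following. The `safe-outputs`, `add-labels`, and `add-comment` field names are recalled from the gh-aw documentation and should be verified against the current schema at github.github.io/gh-aw before compiling:

```yaml
# issue-triage.md frontmatter (sketch; field names are assumptions, verify against gh-aw docs)
on:
  issues:
    types: [opened, reopened]
permissions:
  contents: read
  issues: read
safe-outputs:
  add-labels:
    allowed: [bug, feature, enhancement, security, documentation, question, firewall, docker, cli]
  add-comment:
```

The body of the markdown file below the frontmatter would carry the natural-language triage instructions (what signals map to which label).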
[P0] PR Fix Slash Command
What: Add a `/pr-fix` slash command that an agent can invoke on a failing PR to investigate CI failures and attempt fixes.
Why: Multiple PRs in the current backlog have failing CI (e.g., [WIP] Fix failing GitHub Actions workflow, smoke test failures). Developers currently have to investigate these manually. The PR Fix workflow from `githubnext/agentics` has proven very effective. Given that this repo runs 4 smoke engines and an integration test suite, a command to kick off automated repair is high-value.
How: Add `pr-fix.md` triggered by `slash_command: name: pr-fix` on `pull_request_review_comment` and `issue_comment`. The agent investigates the failing CI, reads logs, and proposes fixes as commits.
Effort: Low — can be adapted from `githubnext/agentics/workflows/pr-fix.md`
P1 — Plan for Near-Term
[P1] Breaking Change Checker
What: Monitor PRs and recent commits for backward-incompatible changes to the AWF public API (CLI flags, config file schema, container API, exit codes).
Why: AWF is used as infrastructure by other teams' agentic workflows. Breaking CLI flags (e.g., renaming `--allow-domains` or changing the config format) can silently break downstream workflows. A recent PR adding `--openai-api-target` and `--anthropic-api-target` is exactly the kind of addition to track. The Pelis Agent Factory's Breaking Change Checker has proven it catches real issues before users do.
How: Trigger on PRs modifying `src/cli.ts`, `src/types.ts`, or `action.yml`. Compare the current CLI interface against the last tagged release. Comment on the PR if backward-incompatible changes are detected (removed flags, changed semantics, modified env vars).
Effort: Medium — needs domain-specific knowledge of AWF's public surface area
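The core comparison can be sketched as simple set arithmetic over flag names. This is a minimal illustration, assuming the agent has already scraped the flag sets from the two versions (e.g., from `--help` output or by parsing `src/cli.ts`); the function name and record shape are hypothetical:

```python
# Sketch: classify CLI flag changes between two releases.
# Removed flags break downstream callers; new flags are backward compatible.
# (Changed semantics of an existing flag would need deeper diffing than this.)

def classify_flag_changes(old_flags: set[str], new_flags: set[str]) -> dict[str, list[str]]:
    return {
        "breaking": sorted(old_flags - new_flags),  # present before, gone now
        "additive": sorted(new_flags - old_flags),  # new, safe for existing callers
    }

changes = classify_flag_changes(
    {"--allow-domains", "--log-level"},
    {"--allow-domains", "--log-level", "--openai-api-target", "--anthropic-api-target"},
)
```

A real checker would also flag renamed flags (a removal plus an addition) as breaking rather than treating the two halves independently.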
[P1] Container Base Image Freshness Monitor
What: Daily check whether the `ubuntu/squid:latest` and `ubuntu:22.04` base images have newer security-patched versions available. Alert when images are stale.
Why: This is a domain-specific security gap unique to this repository. AWF's security posture depends on its container images being current. Unlike the npm dependencies (already monitored by `dependency-security-monitor`), the Docker base images are not monitored by any existing workflow. A compromised or outdated Squid image in the agent's security-critical container is a real risk.
How: Compare image digests, e.g. via `docker manifest inspect` (which resolves a tag's current digest without pulling) or the registry API. Create a GitHub issue when an image is more than N days old or its digest has changed, suggesting a rebuild and release.
Effort: Medium — requires Docker registry API calls or digest comparison logic
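The decision logic itself is small once the digests are in hand. A minimal sketch, assuming the current digest comes from `docker manifest inspect` (or a registry API call) and the previously seen digest and build date come from cache-memory; the function name and the 30-day threshold are illustrative:

```python
from datetime import date, timedelta

MAX_AGE = timedelta(days=30)  # the "N days" threshold; pick per policy

def needs_rebuild(last_seen_digest: str, current_digest: str,
                  image_built: date, today: date,
                  max_age: timedelta = MAX_AGE) -> bool:
    """Alert if upstream published a new digest, or our image is simply old."""
    digest_changed = last_seen_digest != current_digest
    too_old = (today - image_built) > max_age
    return digest_changed or too_old
```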
[P1] Workflow Health Manager
What: A meta-agent that monitors all other agentic workflows for signs of degraded health: stale last-run dates, high error rates, repeated failures, unexpected costs.
Why: With 21 workflows running, it's easy for one to silently start failing (wrong permissions, expired secrets, changed API, etc.). The Pelis Agent Factory's Workflow Health Manager created 40 issues and 34 PRs (14 merged) from monitoring the health of other workflows. Right now there's no agent watching the watchers.
How: Daily schedule. Use the `agentic-workflows` tool to get status and recent run history. Use `agenticworkflows-logs` to identify runs with high error rates or missed schedules. Create issues for unhealthy workflows. Can use `cache-memory` to track patterns over time.
Effort: Medium — uses existing tools, mainly needs good prompting
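The health heuristics can be expressed compactly. This sketch assumes run records shaped like `{"conclusion": ..., "started_at": ...}`; the actual schema returned by the log tooling may differ, and the thresholds are placeholders:

```python
from datetime import datetime, timedelta

def unhealthy(runs: list[dict], now: datetime,
              max_failure_rate: float = 0.5,
              max_staleness: timedelta = timedelta(days=7)) -> list[str]:
    """Return a list of reasons a workflow looks unhealthy (empty = healthy)."""
    if not runs:
        return ["never ran"]
    reasons = []
    failures = sum(1 for r in runs if r["conclusion"] == "failure")
    if failures / len(runs) > max_failure_rate:
        reasons.append("high failure rate")
    last_run = max(r["started_at"] for r in runs)
    if now - last_run > max_staleness:
        reasons.append("stale (missed schedule?)")
    return reasons
```

The agent would run this per workflow and open one issue per non-empty reason list, using cache-memory to avoid re-filing the same finding daily.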
[P1] Smoke Test Results Aggregator
What: Weekly summary report of all smoke test outcomes across all 4 engines (Claude, Codex, Copilot, Chroot) as a GitHub Discussion.
Why: Four smoke tests run every 12 hours, generating a lot of signal. Currently there's no consolidated view of smoke test health trends. A weekly aggregation showing pass/fail rates by engine, common failure patterns, and flakiness metrics would help identify reliability issues before they become critical.
How: Weekly schedule. Use `agenticworkflows-logs` to collect smoke test run history. Compute pass/fail rates. Post as a `[Smoke Test Report]` discussion. Use `cache-memory` to track trends across weeks.
Effort: Low-Medium — largely uses existing AWF log tooling
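The per-engine aggregation is a one-pass grouping. A minimal sketch, assuming run records carry an `engine` and a `conclusion` field (an assumption about what the log tooling returns):

```python
from collections import defaultdict

def pass_rates(runs: list[dict]) -> dict[str, float]:
    """Fraction of successful runs per engine."""
    totals: dict[str, int] = defaultdict(int)
    passes: dict[str, int] = defaultdict(int)
    for r in runs:
        totals[r["engine"]] += 1
        passes[r["engine"]] += r["conclusion"] == "success"
    return {engine: passes[engine] / totals[engine] for engine in totals}

runs = [
    {"engine": "claude", "conclusion": "success"},
    {"engine": "claude", "conclusion": "failure"},
    {"engine": "codex",  "conclusion": "success"},
]
```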
P2 — Consider for Roadmap
[P2] Changeset Generator
What: When a PR is merged to main, analyze the commit diff and propose a version bump (major/minor/patch) and changelog entry as a PR.
Why: The Pelis Agent Factory's Changeset workflow has a 78% merge rate (22 of 28 proposed PRs merged). Currently `update-release-notes` only runs at release time. Adding a changeset workflow would create a continuous changelog, so releases become simple "approve and tag" events. Given the active commit history (10+ open PRs with conventional commit messages), this would save significant release prep time.
Effort: Medium — needs to understand conventional commit semantics and semver rules
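The semver decision from conventional commits follows a simple precedence rule: any breaking change forces a major bump, any feature a minor bump, otherwise patch. A minimal sketch (the function name is hypothetical; a production version would also parse multi-line `BREAKING CHANGE:` footers properly):

```python
import re

def bump_for(commit_subjects: list[str]) -> str:
    """Conventional-commit subjects -> 'major' | 'minor' | 'patch'."""
    # `!` after the type/scope (e.g. "feat(cli)!:") marks a breaking change.
    if any(re.match(r"^\w+(\(.+\))?!:", c) or "BREAKING CHANGE" in c
           for c in commit_subjects):
        return "major"
    if any(c.startswith("feat") for c in commit_subjects):
        return "minor"
    return "patch"
```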
[P2] Audit Workflows / Agent Observability
What: A meta-agent that weekly audits all other agent runs for costs, error patterns, success rates, and identifies outliers (overly expensive runs, flaky agents, agents that never produce output).
Why: The Pelis Agent Factory's Audit Workflows became their most prolific discussion creator (93 discussions, 9 issues). With 21 workflows running, token costs and effectiveness vary widely. Some workflows may be consuming high token budgets with low impact. Having visibility into this helps prioritize which workflows to tune or disable.
Effort: Medium — uses the `agenticworkflows-logs` tool extensively
[P2] Issue Arborist
What: Periodically scan open issues for related ones and link them as sub-issues using GitHub's sub-issue feature, creating parent issues to group related work.
Why: This repository has active feature development with multiple related issues that share themes (e.g., multiple issues around API proxy, DNS handling, container security). The Issue Arborist pattern created 77 reports and 18 parent issues in the Pelis factory. With `issue-monster` dispatching issues to agents, organized sub-issues help agents work on related problems cohesively.
Effort: Low-Medium
[P2] CI Coach
What: Monthly analysis of CI pipeline performance — which workflows are slowest, which have the most failures, where there's redundancy — with optimization suggestions.
Why: The Pelis Agent Factory's CI Coach had a 100% merge rate (9 of 9 proposed PRs merged). This repo runs an extensive CI suite (integration tests, smoke tests for 4 engines, build tests, CodeQL, container scans). Identifying duplicated test runs, flaky tests that add noise, or test ordering improvements could meaningfully reduce CI time and cost.
Effort: Medium
P3 — Future Ideas
[P3] Daily Malicious Code Scan
What: Scan recent commits for suspicious patterns that might indicate supply chain compromise or malicious intent (unusual base64 blobs, curl|bash patterns, unexpected credential access, obfuscated code).
Why: The Pelis Agent Factory runs this as part of its security suite. For AWF specifically, since this is a security tool that handles iptables and credential management, any malicious code inserted into container scripts or the agent startup could have high-impact consequences. The existing `security-guard` focuses on newly opened PRs but doesn't do a retroactive daily scan.
Effort: Low — primarily uses the `bash` and `github` tools
[P3] Security Compliance SLA Tracker
What: Track security vulnerabilities (from `dependency-security-monitor`, CodeQL, and `security-review`) from detection through resolution, alerting on any CVEs exceeding SLA deadlines.
Why: The Pelis Factory's Security Compliance workflow "runs vulnerability campaigns with deadline tracking." Currently, security issues are opened by the dependency monitor, but there's no tracking of whether they're being resolved within acceptable timeframes.
Effort: Medium — needs SLA configuration and cache-memory state tracking
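The deadline check reduces to comparing each unresolved finding's age against its severity's window. A minimal sketch; the SLA windows below are illustrative placeholders, not an established policy for this repository, and the finding record shape is an assumption about what would be stored in cache-memory:

```python
from datetime import date, timedelta

SLA = {  # illustrative deadlines per severity
    "critical": timedelta(days=7),
    "high": timedelta(days=30),
    "medium": timedelta(days=90),
}

def overdue(findings: list[dict], today: date) -> list[str]:
    """IDs of unresolved findings that have blown past their SLA window."""
    return [
        f["id"] for f in findings
        if f.get("resolved") is None and today - f["detected"] > SLA[f["severity"]]
    ]

findings = [
    {"id": "CVE-2026-0001", "severity": "critical", "detected": date(2026, 3, 1), "resolved": None},
    {"id": "CVE-2026-0002", "severity": "high", "detected": date(2026, 3, 1), "resolved": None},
]
```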
[P3] Portfolio Analyst (Token Cost Optimizer)
What: Weekly analysis of which workflows are consuming the most tokens/compute, identifying over-engineered prompts or unnecessarily large context windows.
Why: With 21 workflows running on regular schedules, cost optimization matters. Secret diggers run every hour across 3 engines — understanding their per-run cost and whether their frequency is justified would be valuable.
Effort: Medium
📈 Maturity Assessment
Current Level: 4 / 5 — Advanced
The repository has an unusually mature agentic workflow ecosystem, particularly for security coverage. It is well ahead of most repositories in the ecosystem.
Target Level: 4.5 / 5
The primary gaps to close are meta-observability (watching the watchers), issue triage, and release automation polish.
🔄 Comparison with Pelis Agent Factory Best Practices
What This Repository Does Exceptionally Well
- The `security-guard` PR reviewer, `dependency-security-monitor`, and `security-review` are all specifically tuned for a security-critical codebase
- `issue-duplication-detector` correctly uses `cache-memory` for persistent state across runs — matching the Pelis pattern
What Could Improve
- `update-release-notes` is reactive (runs on release), while the Pelis approach (Changeset) is proactive — proposing version bumps as commits land
Unique Opportunities Given the Domain
The firewall/security domain creates unique automation opportunities not found in the Pelis factory.
Generated by Pelis Agent Factory Advisor • Run ID: 22985059821 • 2026-03-12