📊 Executive Summary
This repository is among the most mature agentic workflow repositories I've analyzed, running 21 specialized agentic workflows covering security, testing, documentation, CI/CD, and issue management. The domain (network firewall for AI agents) creates powerful opportunities for self-referential automation: running your own firewall to test the firewall. Key opportunities lie in meta-monitoring, issue triage, CI cost optimization, and supply-chain security scanning.
🎓 Patterns Learned from Pelis Agent Factory
Key Patterns Discovered
From the Pelis Agent Factory blog series and the githubnext/agentics repository:
Specialization over monoliths — Dozens of focused agents outperform one general-purpose agent. Each does one thing well.
Meta-agents are essential at scale — "Workflow Health Manager" and "Audit Workflows" monitor all other agents. Without meta-monitoring, cost and quality degradation go undetected.
Causal chains multiply impact — Analysis workflows create issues; issue-monster assigns them; coding agents fix them. The cascade produces more merged PRs than direct automation.
Trust but verify — Testing/validation workflows run continuously because what worked yesterday may fail silently today.
skip-if-match prevents redundant work — Use it aggressively to avoid duplicate PRs/issues.
Cache-memory enables longitudinal analysis — Storing history across runs enables trend detection that single-run agents miss.
Comparison to This Repository
| Pattern | Pelis Factory | This Repo |
| --- | --- | --- |
| Issue triage | ✅ Issue Triage Agent | ❌ Missing |
| Meta-monitoring | ✅ Workflow Health Manager | ❌ Missing |
| Metrics collection | ✅ Metrics Collector, Portfolio Analyst | ❌ Missing |
| CI fault investigation | ✅ CI Doctor | ✅ CI Doctor |
| Security scanning | ✅ Secrets, Malicious Code, Static Analysis | ⚠️ Secrets only (no malicious code scan) |
| Causal chain (issue → PR) | ✅ Issue Monster + Copilot coding agent | ✅ Issue Monster |
| Documentation maintenance | ✅ Multiple doc workflows | ✅ Doc Maintainer |
| Release automation | ✅ Changeset Generator | ⚠️ Update Release Notes only |
| Smoke testing | ✅ Firewall validation workflow | ✅ Multi-engine smoke tests |
📋 Current Agentic Workflow Inventory
| Workflow | Purpose | Trigger | Assessment |
| --- | --- | --- | --- |
| build-test | Build verification on PRs | PR events | ✅ Well-configured |
| ci-cd-gaps-assessment | Identifies CI/CD coverage gaps | Daily | ✅ Good meta-analysis |
| ci-doctor | Investigates CI failures | workflow_run (failed) | ⚠️ Missing smoke-copilot & smoke-codex in watch list |
| cli-flag-consistency-checker | Flags CLI/doc discrepancies | Weekly | ✅ Good hygiene |
| dependency-security-monitor | CVE detection & patch PRs | Daily | ✅ Strong security coverage |
| doc-maintainer | Syncs docs with code changes | Daily | ✅ Good cadence |
| issue-duplication-detector | Detects duplicate issues | On issue open | ✅ Useful gatekeeper |
| issue-monster | Assigns issues to Copilot agent | Hourly + On open | ✅ Core task dispatcher |
| pelis-agent-factory-advisor | This workflow | Daily | ✅ Meta-awareness |
| plan | Generates plans via /plan slash command | Slash command | ✅ Good ChatOps |
| secret-digger-claude | Red-team secrets scan (Claude) | Every hour | ⚠️ Redundant with 3 engines; costly |
| secret-digger-codex | Red-team secrets scan (Codex) | Every hour | ⚠️ Redundant with 3 engines; costly |
| secret-digger-copilot | Red-team secrets scan (Copilot) | Every hour | ⚠️ Redundant with 3 engines; costly |
| security-guard | PR security review | PR events | ✅ Excellent domain fit |
| security-review | Daily threat modeling | Daily | ✅ Comprehensive |
| smoke-chroot | Chroot integration smoke test | PR paths + reaction | ✅ Smart path-filtering |
| smoke-claude | End-to-end Claude smoke test | PR + every 12h | ✅ Multi-engine coverage |
| smoke-codex | End-to-end Codex smoke test | PR + every 12h | ✅ Multi-engine coverage |
| smoke-copilot | End-to-end Copilot smoke test | PR + every 12h | ✅ Multi-engine coverage |
| test-coverage-improver | Adds security-critical tests | Weekly | ✅ Right focus on security paths |
| update-release-notes | Updates release notes on publish | Release published | ✅ Good release hygiene |
🚀 Actionable Recommendations
P0 — Implement Immediately
[P0] Fix CI Doctor Missing Workflows
What: ci-doctor.md monitors workflow failures but is missing smoke-copilot and smoke-codex from its watch list. When those smoke tests fail, no investigation is triggered.
Why: CI Doctor is the primary fault-investigation mechanism. When workflows are missing from its watch list, their failures go uninvestigated. Because the smoke tests run every 12h, this is a daily blind spot.
How: Add the missing workflow names to the workflows: list in ci-doctor.md frontmatter.
Effort: Low (2-line change)
Fix:
# Add to the ci-doctor.md workflows list:
- "Smoke Copilot"
- "Smoke Codex"
# (Also verify "Test Coverage Improver" and "Update Release Notes" are present)
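A gap like this one can also be caught mechanically rather than by inspection. The following is a minimal sketch, assuming the watch list and the full set of workflow display names have already been loaded into Python lists; the helper name `missing_from_watch_list` and the sample names are hypothetical, and parsing of ci-doctor.md itself is not shown.

```python
def missing_from_watch_list(all_workflows, watched):
    """Return workflow names that exist but are not watched, sorted by name."""
    # Compare case-insensitively so "smoke codex" and "Smoke Codex" match.
    watched_normalized = {name.strip().lower() for name in watched}
    return sorted(
        name for name in all_workflows
        if name.strip().lower() not in watched_normalized
    )


# Hypothetical inputs: these display names are assumptions for illustration.
all_workflows = ["Smoke Claude", "Smoke Copilot", "Smoke Codex", "Build Test"]
watched = ["Smoke Claude", "Build Test"]

missing = missing_from_watch_list(all_workflows, watched)
print(missing)
```

Running a check of this kind whenever a workflow is added would keep the watch list from drifting again.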
[P0] Issue Triage Agent
What: Automatically label and respond to new issues (bug, feature, question, security, performance, documentation, good-first-issue).
Why: The issue tracker is active (open issues include performance reports, security concerns, and feature requests mixed together with no labels). Triage would help maintainers prioritize, and the custom labels could be domain-specific (e.g., container-security, domain-filtering, api-proxy).
How: Add a simple issue triage agent triggered on issues: [opened, reopened].
Effort: Low
Example:
---
on:
  issues:
    types: [opened, reopened]
permissions:
  issues: read
tools:
  github:
    toolsets: [issues, labels]
safe-outputs:
  add-labels:
    allowed: [bug, feature, enhancement, documentation, question, security, performance, good-first-issue]
  add-comment:
    max: 1
timeout-minutes: 5
---
# Issue Triage Agent
Analyze new issues in the gh-aw-firewall repository (network firewall for AI agents using Squid proxy and Docker).
Apply appropriate labels and leave a brief comment explaining the classification and how it might be addressed.
Domain-specific categories: container escape, domain filtering, DNS security, proxy configuration, API proxy.
P1 — Plan for Near-Term
[P1] Workflow Health Manager (Meta-Monitor)
What: A meta-agent that runs daily to inspect the health of all other agentic workflows — detecting failures, no-ops, cost anomalies, and stale workflows.
Why: With 21 agentic workflows, things can fail silently. Issue #1380 shows "No-Op Runs" accumulating. The Pelis factory learned this lesson and built a Workflow Health Manager that created 40 issues leading to 34 PRs. This is the most impactful pattern missing from this repo.
How: Daily workflow that uses agentic-workflows tool to inspect recent run logs, flag failures, detect cost outliers, and open issues for problems.
Effort: Medium
Source to adapt: gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/workflow-health-manager.md
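The daily inspection described above boils down to classifying recent runs. Here is a rough sketch of that classification, assuming run records have already been fetched into simple objects; the `Run` fields, the `cost_usd` value, and the 3x-median outlier threshold are all assumptions for illustration, not what the Workflow Health Manager or the agentic-workflows tool actually exposes.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Run:
    workflow: str
    conclusion: str       # e.g. "success" or "failure"
    outputs_created: int  # issues, PRs, or comments produced by the run
    cost_usd: float       # hypothetical per-run cost estimate


def health_findings(runs, cost_factor=3.0):
    """Flag failures, no-op runs, and runs far above the median cost."""
    findings = []
    costs = [r.cost_usd for r in runs]
    typical = median(costs) if costs else 0.0
    for r in runs:
        if r.conclusion != "success":
            findings.append((r.workflow, "failure"))
        elif r.outputs_created == 0:
            findings.append((r.workflow, "no-op"))
        if typical > 0 and r.cost_usd > cost_factor * typical:
            findings.append((r.workflow, "cost-outlier"))
    return findings


# Hypothetical sample data.
runs = [
    Run("doc-maintainer", "success", 1, 0.20),
    Run("smoke-codex", "failure", 0, 0.25),
    Run("plan", "success", 0, 0.15),
    Run("security-review", "success", 2, 1.50),
]
findings = health_findings(runs)
print(findings)
```

A successful run with no outputs is not always a problem, so a real monitor would likely look for repeated no-ops over several days before opening an issue.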
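[P1] Daily Malicious Code Scan
What: Reviews recent commits and PRs for suspicious code patterns: supply chain attacks, backdoors, unusual network calls, credential harvesting patterns.
Why: This repository implements a security tool used to sandbox AI agents. A supply chain attack here could compromise every repository using AWF. The Pelis factory runs this daily and considers it a core security guardrail.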
Effort: Low-Medium
Source to adapt: gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/daily-malicious-code-scan.md
[P1] Breaking Change Checker
What: Detects backward-incompatible changes in PRs to CLI flags, Docker Compose configuration, environment variables, and the WrapperConfig TypeScript interface.
Why: AWF is increasingly used as infrastructure by other teams and CI workflows. Breaking changes to --allow-domains, --enable-api-proxy, WrapperConfig, or container env vars can silently break user workflows. The Pelis factory's Breaking Change Checker directly addresses this pattern.
Effort: Medium
Source to adapt: gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/breaking-change-checker.md
[P1] CI Optimization Coach
What: Analyzes CI pipeline performance and proposes parallelization, caching, and test-splitting improvements.
Why: Issue #1376 explicitly reports integration tests taking 37+ minutes in CI (Domain & Network job). The Pelis factory's CI Coach achieved a 100% merge rate on 9 proposals. This is actionable right now with a known pain point.
Effort: Low-Medium
Source to adapt: gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/ci-coach.md
[P1] Rationalize Secret Digger Scheduling
What: Currently running 3 separate secret digger workflows (Claude, Codex, Copilot) each every hour = ~72 runs/day. Consider staggering to every 6h per engine, or rotating engines.
Why: Running 3 different engines hourly is expensive and likely produces redundant results. The cost could be redirected to higher-value workflows like meta-monitoring. The security value of 3 engines is real (different models may catch different patterns), but hourly cadence may be excessive.
How: Change schedules to every 6h with staggered offsets, or rotate daily (Monday=Claude, Tuesday=Codex, Wednesday=Copilot...). Preserve at least one daily run per engine.
Effort: Low
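The run counts behind this recommendation are simple arithmetic: three engines at an hourly cadence give 3 × 24 = 72 runs per day, while a 6-hour cadence gives 3 × 4 = 12. The sketch below computes both and generates staggered schedules as standard five-field cron expressions. The function names are hypothetical, and how each workflow's frontmatter expresses its schedule is not assumed here.

```python
ENGINES = ["claude", "codex", "copilot"]


def runs_per_day(engine_count, interval_hours):
    """Total daily runs when every engine runs once per interval."""
    return engine_count * (24 // interval_hours)


def staggered_crons(engines, interval_hours):
    """Spread engines evenly across the interval; returns {engine: cron}."""
    step = interval_hours // len(engines)
    crons = {}
    for i, engine in enumerate(engines):
        offset = i * step
        hours = ",".join(str(h) for h in range(offset, 24, interval_hours))
        crons[engine] = f"0 {hours} * * *"  # minute 0 of each listed hour
    return crons


print(runs_per_day(3, 1))   # current hourly cadence
print(runs_per_day(3, 6))   # proposed 6-hour cadence
print(staggered_crons(ENGINES, 6))
```

With a 6-hour interval the three engines start two hours apart, so some engine still scans every two hours while each engine keeps four runs per day.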
P2 — Consider for Roadmap
[P2] Metrics Collector / Portfolio Analyst
What: Daily workflow that aggregates token usage, cost estimates, run durations, and success rates across all 21 agentic workflows. Weekly workflow that identifies cost reduction opportunities.
Why: With 21 workflows, cost management becomes important. The Pelis factory learned that some agents were "way too chatty" — only discovered through Portfolio Analyst. AWF has a unique multi-engine setup (Claude, Codex, Copilot) making cross-engine cost comparison especially valuable.
Source: gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/metrics-collector.md
[P2] Schema Consistency Checker
What: Detects drift between src/types.ts (WrapperConfig interface), CLI flags in src/cli.ts, documentation in README.md/docs/, and action.yml input definitions.
Why: The codebase has 700+ lines of types and a complex CLI. Schema drift between the TypeScript interface, CLI flags, and documentation is a common source of bugs and user confusion. The cli-flag-consistency-checker covers some of this, but a deeper schema drift detector would catch TypeScript ↔ docs ↔ action.yml inconsistencies.
Effort: Medium
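At its core, a schema drift check is a set comparison across sources. A minimal sketch follows, assuming the option names have already been extracted from each source and normalized to one spelling; the extraction step, the `schema_drift` helper, and the sample option names (including `memoryLimit`) are hypothetical.

```python
def schema_drift(sources):
    """Map each option to the sources that lack it.

    sources: {source_name: set of option names}. Options present in
    every source are omitted from the result.
    """
    all_options = set().union(*sources.values())
    drift = {}
    for option in sorted(all_options):
        missing = sorted(
            name for name, options in sources.items() if option not in options
        )
        if missing:
            drift[option] = missing
    return drift


# Hypothetical extracted option names, normalized to camelCase.
sources = {
    "types.ts": {"allowDomains", "enableApiProxy", "memoryLimit"},
    "cli.ts": {"allowDomains", "enableApiProxy"},
    "docs": {"allowDomains", "memoryLimit"},
    "action.yml": {"allowDomains"},
}
drift = schema_drift(sources)
print(drift)
```

Reporting which sources lack an option, rather than a bare mismatch flag, tells the maintainer where the fix belongs.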
[P2] Container Security Scanner (Weekly Static Analysis)
What: Weekly workflow running zizmor, poutine, and actionlint on all workflow files and container Dockerfiles, posting a structured discussion with findings.
Why: The repo has a security-review workflow, but adding structured static analysis tooling (zizmor for workflow injection, poutine for supply chain, actionlint for correctness) would give more systematic coverage. The Pelis factory generates "57 analysis discussions" from this pattern.
Effort: Low-Medium
[P2] Changeset Generator
What: Automated semantic versioning and changelog generation when PRs are merged — determines version bump (major/minor/patch) from commit messages and generates a changelog entry.
Why: update-release-notes handles notes after publishing, but there's no automation for generating changelogs or managing version bumps before release. With the current Conventional Commits enforcement (commitlint), the signals are already there.
Source: gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/changeset.md
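Because commit messages already follow Conventional Commits, the version bump can be derived from them directly. The sketch below applies the usual mapping (a breaking change gives major, `feat` gives minor, `fix` gives patch). It is an illustration of the idea rather than how the Pelis Changeset Generator works, and the helper names are hypothetical.

```python
import re

# Conventional Commits header: type, optional (scope), optional "!", then ": ".
HEADER = re.compile(r"^(?P<type>\w+)(\([^)]*\))?(?P<bang>!)?: .+")
RANK = {"none": 0, "patch": 1, "minor": 2, "major": 3}


def bump_for(messages):
    """Return the highest bump level implied by the commit messages."""
    level = "none"
    for msg in messages:
        header = msg.splitlines()[0] if msg else ""
        match = HEADER.match(header)
        if not match:
            continue  # not a Conventional Commits header; ignore it
        if match.group("bang") or "BREAKING CHANGE:" in msg:
            candidate = "major"
        elif match.group("type") == "feat":
            candidate = "minor"
        elif match.group("type") == "fix":
            candidate = "patch"
        else:
            candidate = "none"
        if RANK[candidate] > RANK[level]:
            level = candidate
    return level


def next_version(version, level):
    """Apply a bump level to a MAJOR.MINOR.PATCH version string."""
    major, minor, patch = (int(part) for part in version.split("."))
    if level == "major":
        return f"{major + 1}.0.0"
    if level == "minor":
        return f"{major}.{minor + 1}.0"
    if level == "patch":
        return f"{major}.{minor}.{patch + 1}"
    return version


commits = ["fix: handle empty domain list", "feat(cli): add flag", "docs: typo"]
print(next_version("1.4.2", bump_for(commits)))
```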
P3 — Future Ideas
[P3] Issue Arborist
What: Links related issues as sub-issues (e.g., group all DNS-related issues, all API proxy issues, all container security issues).
Why: As the project matures and issue count grows, organizational automation helps. Currently, #1356 (chroot PATH), #1355 (memory limits), and similar infrastructure issues could benefit from grouping.
[P3] Firewall Policy Regression Detector
What: Agent that runs after each merge to main to compare the list of allowed/blocked domains and iptables rules against a known-good baseline, alerting on any policy regressions.
Why: This repository IS a firewall. Domain policy changes in src/squid-config.ts or iptables rules in containers/agent/setup-iptables.sh could silently weaken security. This is a unique, domain-specific workflow opportunity.
Effort: Medium-High (requires capturing baseline state)
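Once a baseline exists, the regression check itself is a diff. Here is a rough sketch, assuming the allowed domains and iptables rules have already been captured as plain strings. The capture step, the `policy_regressions` helper, and the sample domains and rules are hypothetical. Only changes that loosen policy are reported: a newly allowed domain, or a rule that has disappeared.

```python
def policy_regressions(baseline_allowed, current_allowed,
                       baseline_rules, current_rules):
    """Report policy changes that could weaken the firewall."""
    return {
        # Domains reachable now that were not reachable in the baseline.
        "newly_allowed_domains": sorted(
            set(current_allowed) - set(baseline_allowed)
        ),
        # Baseline rules that no longer appear, in baseline order.
        "removed_rules": [r for r in baseline_rules if r not in current_rules],
    }


# Hypothetical baseline and current state.
baseline_allowed = {"github.com", "api.github.com"}
current_allowed = {"github.com", "api.github.com", "example.org"}
baseline_rules = [
    "-A OUTPUT -p udp --dport 53 -j DROP",
    "-A OUTPUT -j REJECT",
]
current_rules = ["-A OUTPUT -j REJECT"]

report = policy_regressions(
    baseline_allowed, current_allowed, baseline_rules, current_rules
)
print(report)
```

Rule order also matters in iptables, so a fuller check would compare positions as well as membership.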
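[P3] Onboarding Guide Generator
What: On first issue or first PR from a new contributor, generate a personalized onboarding comment explaining relevant code areas, testing instructions, and contributing guidelines based on what they've touched.
Why: Lower contribution friction for an open-source security tool. The architecture (Docker, Squid, iptables, chroot) can be daunting for new contributors.
[P3] Performance Regression Monitor (AWF Startup Time)
What: Weekly benchmark of AWF startup time and container pull time, tracking trends and alerting on regressions.
Why: Issue #1376 mentions 37+ minute CI runs. A dedicated performance tracking workflow (using cache-memory for historical comparison) could catch performance regressions early.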
📈 Maturity Assessment
Release notes updated, but no changeset automation
Cost Management: 2/5 (secret diggers running 72 times/day; no portfolio analysis)
Overall Current Level: 4/5 — Highly mature, security-focused agentic workflow setup, well-suited to the domain. Top priority gaps are meta-monitoring/observability and issue triage.
Target Level: 4.5/5 — Achievable by adding meta-monitoring, issue triage, CI optimization, and rationalizing secret digger costs.
🔄 Comparison with Pelis Factory Best Practices
What This Repository Does Exceptionally Well
Domain-driven security workflows — The security-guard, security-review, secret-digger (x3), and dependency-security-monitor form a comprehensive security posture unique to this domain
Multi-engine smoke testing — Running Claude, Copilot, and Codex smoke tests every 12h is a pattern we haven't seen in other repos — it validates the firewall works with all supported engines
Self-referential automation — Using AWF to test AWF (smoke tests run inside the firewall) is elegant and trustworthy
skip-if-match usage — Workflows like doc-maintainer and test-coverage-improver avoid creating redundant PRs
Key Improvements to Make
Add meta-monitoring — With 21 workflows, a Workflow Health Manager is not optional
Issue triage — No-label issues are accumulating; triage adds immediate maintainer value
Cost rationalization — 72 secret-digger runs/day is the most obvious optimization target
Supply chain gap — Daily malicious code scan is missing for a security-critical tool
Unique Domain Opportunities
This repository has a special opportunity: the tool being built IS the security infrastructure. This enables:
Firewall policy regression detection (unique to this repo)
Multi-engine security validation as a first-class workflow
Performance benchmarking tied directly to agent startup costs
Run by: Pelis Agent Factory Advisor | Date: 2026-03-21 | Engine: Copilot