📊 Executive Summary
gh-aw-firewall has one of the most mature agentic workflow setups I've analyzed: 21 agentic .md workflows covering security, testing, CI/CD, documentation, and issue management. The security-domain focus is particularly impressive, with hourly red-team "secret digger" agents and a dedicated daily threat-modeling workflow. The top opportunities are: adding an Issue Triage Agent (high-ROI, low-effort), implementing a Firewall Escape Test Agent (referenced but missing!), and adding a Workflow Health Manager meta-agent to watch over the growing collection.
🎓 Patterns Learned from Peli's Agent Factory
From the Documentation Site
The factory emphasizes specialization over generalism — dozens of focused agents beat one monolithic agent. Key patterns observed:
| Pattern | Description |
| --- | --- |
| Triage-then-dispatch | Issue Triage → Issue Monster pipeline: triage first, then dispatch to coding agent |
| Trust but verify | Continuous testing agents re-verify what worked yesterday |
| Meta-agents | Workflow Health Manager watches all other workflows |
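From the githubnext/agentics Repository
Notable reference implementations available: issue-arborist.md, issue-triage.md, breaking-change-checker (via contribution-check.md), daily-malicious-code-scan.md, sub-issue-closer.md, ci-coach.md, grumpy-reviewer.md, daily-test-improver.md, weekly-issue-summary.md.
How This Repo Compares
✅ Ahead of factory norms: security automation depth (3x hourly red-team agents, PR security guard, daily threat modeling), smoke testing matrix across 3 AI engines, domain-specific security context in every workflow prompt.
⚠️ Behind factory norms: no issue triage, no meta-agent monitoring other workflows, no firewall-escape test agent (referenced but absent!), no breaking change detection, no sub-issue organization.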
📋 Current Agentic Workflow Inventory
| Workflow | Purpose | Trigger | Assessment |
| --- | --- | --- | --- |
| secret-digger-claude | Red-team: scan for secrets (Claude) | Every hour | ✅ Excellent: unique to security domain |
| secret-digger-codex | Red-team: scan for secrets (Codex) | Every hour | ✅ Excellent: multi-engine coverage |
| secret-digger-copilot | Red-team: scan for secrets (Copilot) | Every hour | ✅ Excellent: multi-engine coverage |
| security-guard | PR security review | PR open/sync | ✅ Excellent: domain-specific checks |
| security-review | Daily threat modeling + STRIDE analysis | Daily / dispatch | ✅ Excellent, but references missing escape agent |
| ci-doctor | CI failure investigator | Workflow failure | ✅ Good: monitors 26 workflows |
| dependency-security-monitor | Daily CVE detection + dependency updates | Daily | ✅ Good |
| doc-maintainer | Sync docs with 7-day code changes | Daily | ✅ Good |
| test-coverage-improver | Weekly coverage improvement PRs | Weekly | ✅ Good: security-focused priority |
| issue-monster | Dispatch issues to Copilot coding agent | Issue opened / hourly | ✅ Good: 9 draft PRs cap |
| cli-flag-consistency-checker | CLI doc consistency audit | Weekly | ✅ Good |
| ci-cd-gaps-assessment | CI/CD gap analysis | Daily | ⚠️ Redundant with this workflow? |
| update-release-notes | Enhance release notes on publish | Release published | ✅ Solid |
| smoke-claude/codex/copilot/chroot | AWF smoke tests | Push / dispatch | ✅ Core product validation |
| plan | /plan slash command handler | Issue/discussion comment | ✅ Good ChatOps pattern |
| issue-duplication-detector | Find duplicate issues | Issue opened | ✅ Useful |
| build-test | Integration build tests | Push/PR | ✅ Core |
| pelis-agent-factory-advisor | This workflow | Weekly | ✅ Meta-analysis |
🚀 Actionable Recommendations
P0 — Implement Immediately
🚨 P0.1: Create the Firewall Escape Test Agent
What: A dedicated agentic workflow that actively attempts to escape the AWF firewall — trying DNS tunneling, domain-fronting, SSRF, CONNECT hijacking, and other egress bypass techniques from inside the agent container.
Why: The daily security-review.md workflow has a Phase 1 that explicitly reads logs from "Firewall Escape Test Agent" — but this workflow does not exist. The security review is currently running with an empty Phase 1, missing the complementary red-team data it was designed to consume. This is the single highest-ROI gap in the entire workflow collection: it's a security-critical tool that should be continuously self-testing its own firewall.
How: Create .github/workflows/firewall-escape-test.md that:
- Runs awf with a minimal allowlist
- Attempts a curated set of egress bypass techniques (DNS over HTTPS to a non-whitelisted resolver, SSRF via an allowed API, high-numbered ports, direct IPv6, ICMP tunneling)
- Verifies each attempt is correctly blocked
- Creates a discussion with pass/fail results + evidence
- Creates an issue if any escape succeeds
Effort: Medium — requires domain expertise already present in the team
---
description: Actively tests AWF firewall by attempting egress bypass techniques to verify containment
on:
  schedule: daily
  workflow_dispatch:
...
safe-outputs:
  create-discussion:
    title-prefix: "[Firewall Escape Test] "
  create-issue:
    title-prefix: "🚨 Firewall Bypass Detected"
    labels: [security, critical]
---
# Firewall Escape Test Agent
You are a red-team agent testing the AWF firewall's containment...
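The reporting step described above (pass/fail per technique, and an issue only when something gets through) could be sketched roughly as follows. This is a minimal illustration, not the workflow's actual implementation: the `EscapeAttempt` record and the `summarize` helper are hypothetical names, and the evidence strings are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EscapeAttempt:
    """Hypothetical record of one egress bypass attempt."""
    technique: str  # e.g. "DoH to non-allowlisted resolver"
    blocked: bool   # True if the firewall stopped the attempt
    evidence: str   # short log excerpt or error message


def summarize(attempts: list[EscapeAttempt]) -> dict:
    """Build a pass/fail report; any unblocked attempt should trigger an issue."""
    escaped = [a for a in attempts if not a.blocked]
    lines = [
        f"{'PASS' if a.blocked else 'FAIL'} | {a.technique} | {a.evidence}"
        for a in attempts
    ]
    return {
        "report": "\n".join(lines),          # body for the discussion
        "create_issue": bool(escaped),       # only when an escape succeeded
        "escaped_techniques": [a.technique for a in escaped],
    }
```

In this sketch the discussion is always produced from `report`, while `create_issue` gates the critical issue, which mirrors the two safe outputs in the frontmatter above.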
🚨 P0.2: Add Issue Triage Agent
What: Automated labeling and first-response for new issues.
Why: The Issue Monster dispatches issues to Copilot, but there's no upstream triage. Issues arrive unlabeled, making it harder for Issue Monster to prioritize security vs. enhancement vs. bug. This is the "hello world" of agentic workflows with extremely high ROI.
How: Add .github/workflows/issue-triage.md triggered on issues: [opened, reopened]. Labels: bug, feature, enhancement, documentation, question, security, good-first-issue. Include AWF-specific context (Docker networking, iptables, Squid).
Effort: Low — can be remixed from githubnext/agentics/issue-triage.md
gh aw add-wizard githubnext/agentics/issue-triage
Then customize: add security label, add AWF-specific context, link to docs/troubleshooting.md in responses.
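To make the labeling idea concrete, here is a small keyword-based pre-labeler. It is only a sketch under stated assumptions: a real triage agent would rely on the model's judgment, and the keyword patterns and the `needs-triage` fallback label are hypothetical, not taken from the repository.

```python
import re

# Hypothetical keyword rules. "security" is listed first so it leads the result,
# since dicts preserve insertion order.
RULES = {
    "security": r"\b(bypass|leak|exfiltrat\w*|cve|vulnerab\w*)\b",
    "bug": r"\b(crash\w*|error|fails?|broken|regression)\b",
    "documentation": r"\b(docs?|readme|typo)\b",
    "enhancement": r"\b(feature request|support for|would be nice)\b",
}


def suggest_labels(title: str, body: str) -> list[str]:
    """Return candidate labels for an issue, or a fallback when nothing matches."""
    text = f"{title}\n{body}".lower()
    labels = [label for label, pattern in RULES.items() if re.search(pattern, text)]
    return labels or ["needs-triage"]  # hypothetical fallback label
```

Putting `security` first gives a downstream dispatcher such as Issue Monster a simple way to prioritize on the first label.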
P1 — Plan for Near-Term
P1.1: Workflow Health Manager (Meta-Agent)
What: A meta-agent that monitors the health of all 21+ agentic workflows, detects failures, staleness, or degraded output quality, and creates issues/PRs to fix problems.
Why: With 21 agentic workflows running continuously, managing "the fleet" becomes its own problem. The Workflow Health Manager at Peli's factory created 40 issues leading to 34 PRs — the highest causal chain value of any workflow category. This repo already has ci-doctor.md for CI failures, but no agent monitors the quality of agentic workflow outputs (e.g., is doc-maintainer creating PRs with no changes? Is dependency-security-monitor finding false positives?).
How: Create .github/workflows/workflow-health-manager.md that uses the agentic-workflows tool to analyze recent run history, check for unusual patterns (too many skips, empty outputs, escalating failures), and create diagnostic issues.
Effort: Medium
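The three patterns named above could be checked with something like the following. This is a rough sketch: the run-record fields (`conclusion`, `outputs`), the 50% skip threshold, and the use of "three consecutive failures" as a stand-in for "escalating failures" are all assumptions for illustration.

```python
from collections import Counter


def diagnose(runs: list[dict], skip_ratio: float = 0.5) -> list[str]:
    """Flag unhealthy patterns in a workflow's recent runs.

    `runs` is ordered oldest-first; each record is a hypothetical dict such as
    {"conclusion": "success", "outputs": 1}.
    """
    if not runs:
        return ["no recent runs (workflow may be stale)"]

    findings = []
    counts = Counter(r["conclusion"] for r in runs)
    if counts["skipped"] / len(runs) > skip_ratio:
        findings.append("too many skipped runs")
    if all(r["conclusion"] == "success" and r["outputs"] == 0 for r in runs):
        findings.append("runs succeed but produce no outputs")
    # Stand-in for "escalating failures": the three most recent runs all failed.
    if len(runs) >= 3 and all(r["conclusion"] == "failure" for r in runs[-3:]):
        findings.append("three consecutive failures")
    return findings
```

Each finding string could become the title of one diagnostic issue, keeping the meta-agent's output easy to deduplicate.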
P1.2: Static Analysis Report (Daily zizmor/poutine/actionlint)
What: Daily automated static analysis of all .lock.yml workflow files and TypeScript code using zizmor, poutine, and actionlint.
Why: The Peli factory's equivalent workflow produced 57 analysis discussions + 12 Zizmor security reports. This repo generates lock files that other organizations use — ensuring those files are clean is a product quality concern, not just an internal hygiene concern. Currently build.yml runs actionlint but there's no dedicated agentic analysis + reporting loop.
How: Create .github/workflows/static-analysis-report.md scheduled daily, running all three tools, creating a discussion with findings, and creating issues for HIGH severity items.
Effort: Low — tools already installed, just needs agentic wrapper
P1.3: Breaking Change Checker
What: Agent that monitors PRs for backward-incompatible changes to the CLI interface, Docker Compose schema, or public API surface.
Why: AWF is used as a GitHub Action and CLI by other teams. Breaking changes to --allow-domains format, Docker Compose config structure, or environment variable names could silently break downstream users. Currently there's no automated detection.
How: Create .github/workflows/breaking-change-checker.md triggered on PRs that modify src/cli.ts, src/docker-manager.ts, or src/types.ts. Check for flag renames, removed flags, changed default behaviors, schema changes.
Effort: Low-Medium — can be remixed from factory patterns
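As an illustration of the flag checks listed above, the sketch below compares two snapshots of a CLI's flags. It assumes the flags and their defaults have already been extracted into plain dicts (how that extraction from src/cli.ts works is not covered here), and apart from --allow-domains the flag names are hypothetical.

```python
def diff_flags(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Compare flag-name -> default-value maps from two CLI versions."""
    shared = set(old) & set(new)
    return {
        "removed": sorted(set(old) - set(new)),
        "added": sorted(set(new) - set(old)),
        "changed_defaults": sorted(f for f in shared if old[f] != new[f]),
    }


def is_breaking(report: dict[str, list[str]]) -> bool:
    """Removed flags and changed defaults can break existing callers."""
    return bool(report["removed"] or report["changed_defaults"])


# A rename shows up as one removed flag plus one added flag.
old = {"--allow-domains": "", "--log-level": "info"}
new = {"--allowed-domains": "", "--log-level": "debug"}
report = diff_flags(old, new)
```

Treating a rename as a removal plus an addition is deliberately conservative: it flags the change for review instead of trying to guess that two names refer to the same option.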
P1.4: Daily Malicious Code Scan
What: Agent that reviews recent commits for suspicious patterns — obfuscated code, hidden network calls, unusual base64 strings, suspicious shell commands in entrypoints.
Why: This is a supply chain security concern. The repo includes container entrypoints (containers/agent/entrypoint.sh, containers/squid/entrypoint.sh, containers/api-proxy/server.js) that execute in privileged contexts. A supply chain attack injecting malicious code into these files could compromise every AWF user. Peli's factory runs this daily.
How: Remix githubnext/agentics/daily-malicious-code-scan.md with AWF-specific focus on container scripts.
Effort: Low — direct remix available
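A deterministic pre-filter for the suspicious patterns mentioned above might look like this. It is a heuristic sketch only: the three regexes are illustrative assumptions, not the rules used by the agentics workflow, and they will produce false positives (for example, long paths can match the base64 rule).

```python
import re

# Illustrative heuristics; names and patterns are assumptions.
PATTERNS = {
    "pipe-to-shell": re.compile(r"\b(curl|wget)\b[^\n|]*\|\s*(ba)?sh\b"),
    "long-base64": re.compile(r"[A-Za-z0-9+/]{80,}={0,2}"),
    "decode-and-exec": re.compile(r"base64\s+(-d|--decode)[^\n]*\|\s*(ba)?sh\b"),
}


def scan(path: str, text: str) -> list[tuple[str, int, str]]:
    """Return (path, line number, pattern name) for every suspicious line."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((path, lineno, name))
    return hits
```

Hits from a scan like this could be handed to the agent as starting points, so the model spends its effort judging intent instead of searching for candidates.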
P2 — Consider for Roadmap
P2.1: Issue Arborist
What: Weekly agent that groups related issues as parent/sub-issue trees.
Why: With Issue Monster actively creating Copilot PRs from issues, the issue tracker can get cluttered. An Issue Arborist would organize related work (e.g., "IPv6 filtering improvements" → 3 related issues as sub-issues). Peli's factory created 18 parent issues this way.
Effort: Low-Medium (remix from githubnext/agentics/issue-arborist.md)
P2.2: CI Coach
What: Periodic agent that analyzes CI pipeline performance and suggests optimizations.
Why: The current CI has multiple test suites (unit, integration, chroot, smoke). A CI Coach could identify parallel execution opportunities, redundant steps, or slow tests. Peli's factory had 9 merged PRs out of 9 proposed (100% merge rate) — best precision of any workflow category.
Effort: Low (remix from githubnext/agentics/ci-coach.md)
P2.3: Weekly Issue Summary
What: Weekly digest of issue activity for async team members.
Why: With Issue Monster dispatching issues to Copilot, high-velocity teams may lose track of what's being worked on. A weekly summary discussion provides visibility.
Effort: Low (remix from githubnext/agentics/weekly-issue-summary.md)
P2.4: Grumpy Code Reviewer
What: General-purpose PR code quality reviewer (not just security-focused).
Why: security-guard.md only reviews for security regressions. A general reviewer could catch code style issues, missing tests, unclear variable names, and non-security logic errors. The grumpy-reviewer pattern from agentics has high signal-to-noise.
Effort: Low-Medium
P3 — Future Ideas
P3.1: Daily Repo Chronicle
A daily narrative summary of repository activity — what changed, what was discussed, what was merged. Useful for async contributors and onboarding. Remix: githubnext/agentics/daily-repo-chronicle.md.
P3.2: Docs Accessibility Tester
Given the Astro/Starlight docs site, a Playwright-based agent testing mobile/screen-reader accessibility of the docs. Particularly relevant for the security documentation that operators rely on. Remix: githubnext/agentics/daily-accessibility-review.md + daily-multi-device-docs-tester.md.
P3.3: Sub Issue Closer
Automatically closes sub-issues when parent issues are resolved. Keeps the tracker clean as Issue Monster creates more sub-issues. Remix: githubnext/agentics/sub-issue-closer.md.
P3.4: Container Image Freshness Monitor
An AWF-specific workflow that checks whether the base images (ubuntu/squid:latest, ubuntu:22.04) have been updated and creates PRs to pin newer versions. Critical for security patch uptake.
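One possible first step for such a monitor is finding base images that are not pinned by digest. The sketch below covers only that step and makes no registry calls; the `unpinned_images` helper is a hypothetical name, and it does not distinguish multi-stage aliases (such as `FROM builder`) from real images.

```python
import re

# Matches the image reference on a FROM line, skipping an optional --platform flag.
FROM_RE = re.compile(
    r"^\s*FROM\s+(?:--platform=\S+\s+)?(\S+)",
    re.IGNORECASE | re.MULTILINE,
)


def unpinned_images(dockerfile: str) -> list[str]:
    """Return base images that carry no '@sha256:' digest."""
    images = FROM_RE.findall(dockerfile)
    return [
        image for image in images
        if "@sha256:" not in image and image.lower() != "scratch"
    ]
```

Images returned here, such as a floating `:latest` tag, are the candidates whose current digest the workflow would need to look up before proposing a pin.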
📈 Maturity Assessment
| Dimension | Current | Target | Gap |
| --- | --- | --- | --- |
| Overall Level | 4/5 (Advanced) | 5/5 (Factory-grade) | Meta-agents, escape testing |
| Security Automation | 5/5 (Exceptional) | 5/5 | Missing escape test agent |
| Issue Management | 2/5 (Basic) | 4/5 | No triage, no arborist |
| Code Quality | 3/5 (Good) | 4/5 | No breaking change checker |
| Meta-observation | 2/5 (Minimal) | 4/5 | No health manager |
| Documentation | 3/5 (Good) | 4/5 | Docs maintainer exists but no changelog agent |
Current Level: 4/5 — This repository has significantly more agentic automation than a typical OSS project, with particularly sophisticated security workflows (hourly red-teaming, daily threat modeling, PR security guard). The AWF's own technology is being used to test itself, which is a great dogfooding pattern.
Target Level: 5/5 — To reach factory-grade, the key gaps are: the missing Firewall Escape Test Agent (which the security review already expects to exist), an issue triage pipeline to feed Issue Monster better, and a meta-agent to monitor the growing fleet.
🔄 Comparison with Best Practices
What This Repo Does Well
🏆 Security automation depth far exceeds factory norms — 3 engines × hourly red-teaming is unique
🏆 Domain-specific context — every workflow has detailed AWF architecture context in its prompt
🏆 Multi-engine smoke testing — validating AWF works with Claude, Codex, Copilot, and chroot mode
🏆 Self-referential testing — the AWF firewall is tested using the AWF firewall
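🏆 shared/ directory with reusable imports (mcp-pagination.md, secret-audit.md, etc.)
Unique Opportunities Given the Security Domain
Labels (iptables-bypass, squid-acl, container-escape) would enable better Issue Monster routing
📝 Notes for Future Runs
Run date: 2026-03-19. Workflow count: 21 agentic .md files. Key finding: security-review.md Phase 1 references a "Firewall Escape Test Agent" that doesn't exist; this should be tracked as the top priority. CI Doctor monitors 26 workflows but misses some newer ones. Cache-memory notes saved at /tmp/gh-aw/cache-memory/advisor-notes.md.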