[Pelis Agent Factory Advisor] Pelis Agent Factory Advisor Report – March 2026 #1391

2026-03-21T03:21:14Z

github-actions[bot]
bot Mar 21, 2026

📊 Executive Summary

This repository is among the most mature agentic workflow repositories I've analyzed, running 21 specialized agentic workflows that cover security, testing, documentation, CI/CD, and issue management. The domain (a network firewall for AI agents) creates powerful opportunities for self-referential automation: running the firewall to test the firewall. Key opportunities lie in meta-monitoring, issue triage, CI cost optimization, and supply-chain security scanning.

🎓 Patterns Learned from Pelis Agent Factory

Key Patterns Discovered

From the Pelis Agent Factory blog series and the githubnext/agentics repository:

Specialization over monoliths — Dozens of focused agents outperform one general-purpose agent. Each does one thing well.
Meta-agents are essential at scale — "Workflow Health Manager" and "Audit Workflows" monitor all other agents. Without meta-monitoring, cost and quality degradation go undetected.
Causal chains multiply impact — Analysis workflows create issues; issue-monster assigns them; coding agents fix them. The cascade produces more merged PRs than direct automation.
Trust but verify — Testing/validation workflows run continuously because what worked yesterday may fail silently today.
Observability is non-optional — Metrics Collector tracks daily performance; Portfolio Analyst catches over-spending agents.
skip-if-match prevents redundant work — Use it aggressively to avoid duplicate PRs/issues.
Cache-memory enables longitudinal analysis — Storing history across runs enables trend detection that single-run agents miss.

Comparison to This Repository

| Pattern | Pelis Factory | This Repo |
| --- | --- | --- |
| Issue triage | ✅ Issue Triage Agent | ❌ Missing |
| Meta-monitoring | ✅ Workflow Health Manager | ❌ Missing |
| Metrics collection | ✅ Metrics Collector, Portfolio Analyst | ❌ Missing |
| CI fault investigation | ✅ CI Doctor | ✅ CI Doctor |
| Security scanning | ✅ Secrets, Malicious Code, Static Analysis | ⚠️ Secrets only (no malicious code scan) |
| Causal chain (issue → PR) | ✅ Issue Monster + Copilot coding agent | ✅ Issue Monster |
| Documentation maintenance | ✅ Multiple doc workflows | ✅ Doc Maintainer |
| Release automation | ✅ Changeset Generator | ⚠️ Update Release Notes only |
| Smoke testing | ✅ Firewall validation workflow | ✅ Multi-engine smoke tests |

📋 Current Agentic Workflow Inventory

| Workflow | Purpose | Trigger | Assessment |
| --- | --- | --- | --- |
| `build-test` | Build verification on PRs | PR events | ✅ Well-configured |
| `ci-cd-gaps-assessment` | Identifies CI/CD coverage gaps | Daily | ✅ Good meta-analysis |
| `ci-doctor` | Investigates CI failures | workflow_run (failed) | ⚠️ Missing smoke-copilot & smoke-codex in watch list |
| `cli-flag-consistency-checker` | Flags CLI/doc discrepancies | Weekly | ✅ Good hygiene |
| `dependency-security-monitor` | CVE detection & patch PRs | Daily | ✅ Strong security coverage |
| `doc-maintainer` | Syncs docs with code changes | Daily | ✅ Good cadence |
| `issue-duplication-detector` | Detects duplicate issues | On issue open | ✅ Useful gatekeeper |
| `issue-monster` | Assigns issues to Copilot agent | Hourly + On open | ✅ Core task dispatcher |
| `pelis-agent-factory-advisor` | This workflow | Daily | ✅ Meta-awareness |
| `plan` | Generates plans via /plan slash command | Slash command | ✅ Good ChatOps |
| `secret-digger-claude` | Red-team secrets scan (Claude) | Every hour | ⚠️ Redundant with 3 engines; costly |
| `secret-digger-codex` | Red-team secrets scan (Codex) | Every hour | ⚠️ Redundant with 3 engines; costly |
| `secret-digger-copilot` | Red-team secrets scan (Copilot) | Every hour | ⚠️ Redundant with 3 engines; costly |
| `security-guard` | PR security review | PR events | ✅ Excellent domain fit |
| `security-review` | Daily threat modeling | Daily | ✅ Comprehensive |
| `smoke-chroot` | Chroot integration smoke test | PR paths + reaction | ✅ Smart path-filtering |
| `smoke-claude` | End-to-end Claude smoke test | PR + every 12h | ✅ Multi-engine coverage |
| `smoke-codex` | End-to-end Codex smoke test | PR + every 12h | ✅ Multi-engine coverage |
| `smoke-copilot` | End-to-end Copilot smoke test | PR + every 12h | ✅ Multi-engine coverage |
| `test-coverage-improver` | Adds security-critical tests | Weekly | ✅ Right focus on security paths |
| `update-release-notes` | Updates release notes on publish | Release published | ✅ Good release hygiene |

🚀 Actionable Recommendations

P0 — Implement Immediately

[P0] Fix CI Doctor Missing Workflows

What: ci-doctor.md monitors workflow failures but is missing smoke-copilot and smoke-codex from its watch list. When those smoke tests fail, no investigation is triggered.

Why: CI Doctor is the primary fault-investigation mechanism. A failure in a workflow that is missing from its watch list goes uninvestigated. Because the smoke tests run every 12h, this blind spot can recur every day.

How: Add the missing workflow names to the workflows: list in ci-doctor.md frontmatter.

Effort: Low (2-line change)

Fix:

# Add to ci-doctor.md workflows list:
- "Smoke Copilot"
- "Smoke Codex"
# (Also verify "Test Coverage Improver" and "Update Release Notes" are present)

[P0] Issue Triage Agent

What: Automatically label and respond to new issues (bug, feature, question, security, performance, documentation, good-first-issue).

Why: The issue tracker is active (open issues include performance reports, security concerns, and feature requests mixed together with no labels). Triage would help maintainers prioritize, and the custom labels could be domain-specific (e.g., container-security, domain-filtering, api-proxy).

How: Add a simple issue triage agent triggered on issues: [opened, reopened].

Effort: Low

Example:

---
on:
  issues:
    types: [opened, reopened]
permissions:
  issues: read
tools:
  github:
    toolsets: [issues, labels]
safe-outputs:
  add-labels:
    allowed: [bug, feature, enhancement, documentation, question, security, performance, good-first-issue]
  add-comment:
    max: 1
timeout-minutes: 5
---
# Issue Triage Agent
Analyze new issues in the gh-aw-firewall repository (network firewall for AI agents using Squid proxy and Docker).
Apply appropriate labels and leave a brief comment explaining the classification and how it might be addressed.
Domain-specific categories: container escape, domain filtering, DNS security, proxy configuration, API proxy.

P1 — Plan for Near-Term

[P1] Workflow Health Manager (Meta-Monitor)

What: A meta-agent that runs daily to inspect the health of all other agentic workflows — detecting failures, no-ops, cost anomalies, and stale workflows.

Why: With 21 agentic workflows, things can fail silently. Issue #1380 shows "No-Op Runs" accumulating. The Pelis factory learned this lesson and built a Workflow Health Manager that created 40 issues leading to 34 PRs. This is the most impactful pattern missing from this repo.

How: Daily workflow that uses agentic-workflows tool to inspect recent run logs, flag failures, detect cost outliers, and open issues for problems.

Effort: Medium

Source to adapt: gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/workflow-health-manager.md
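
To make the detection step concrete, here is a minimal sketch of how run records could be sorted into the three problem classes named above (failures, no-ops, cost anomalies). The `Run` fields, the `health_flags` helper, and the 3x-median cost threshold are all assumptions for illustration; they are not taken from the upstream workflow or from the agentic-workflows tool.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Run:
    """Hypothetical summary of one workflow run."""
    workflow: str
    conclusion: str       # e.g. "success" or "failure"
    outputs_created: int  # issues, PRs, or comments the run produced
    cost: float           # any consistent unit, e.g. tokens


def health_flags(runs: list[Run], cost_factor: float = 3.0) -> dict[str, set[str]]:
    """Map each workflow name to the set of problems seen in its runs."""
    flags: dict[str, set[str]] = {}
    costs = [run.cost for run in runs]
    typical = median(costs) if costs else 0.0
    for run in runs:
        if run.conclusion != "success":
            flags.setdefault(run.workflow, set()).add("failure")
        elif run.outputs_created == 0:
            # Succeeded but produced nothing: counted as a no-op.
            flags.setdefault(run.workflow, set()).add("no-op")
        if typical > 0 and run.cost > cost_factor * typical:
            flags.setdefault(run.workflow, set()).add("cost-outlier")
    return flags
```

A no-op is not always a defect (a scan that finds nothing is a valid result), so a real implementation would likely flag only repeated no-ops.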

[P1] Daily Malicious Code Scan

What: Reviews recent commits and PRs for suspicious code patterns — supply chain attacks, backdoors, unusual network calls, credential harvesting patterns.

Why: This repository implements a security tool used to sandbox AI agents. A supply chain attack here could compromise every repository using AWF. The Pelis factory runs this daily and considers it a core security guardrail.

Effort: Low-Medium

Source to adapt: gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/daily-malicious-code-scan.md

[P1] Breaking Change Checker

What: Detects backward-incompatible changes in PRs to CLI flags, Docker Compose configuration, environment variables, and the WrapperConfig TypeScript interface.

Why: AWF is increasingly used as infrastructure by other teams and CI workflows. Breaking changes to --allow-domains, --enable-api-proxy, WrapperConfig, or container env vars can silently break user workflows. The Pelis factory's Breaking Change Checker directly addresses this pattern.

Effort: Medium

Source to adapt: gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/breaking-change-checker.md
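
The core comparison behind such a checker can be sketched as a diff between the option surface before and after a PR. The sketch below assumes the options have already been extracted into plain dictionaries; the `breaking_changes` helper and the spec shape (`type`, `required`) are hypothetical, and how the specs are extracted from the CLI or WrapperConfig is out of scope here.

```python
def breaking_changes(old: dict[str, dict], new: dict[str, dict]) -> list[str]:
    """List changes that could break existing callers.

    `old` and `new` map an option name to a small spec dictionary.
    """
    problems = []
    for name, spec in old.items():
        if name not in new:
            problems.append(f"removed: {name}")
        elif new[name].get("type") != spec.get("type"):
            problems.append(f"type changed: {name}")
    for name, spec in new.items():
        # A brand-new option only breaks callers if they must supply it.
        if name not in old and spec.get("required", False):
            problems.append(f"new required option: {name}")
    return problems
```

Usage, with invented specs that do not reflect the real CLI:

```python
old = {"--allow-domains": {"type": "string"}, "--enable-api-proxy": {"type": "boolean"}}
new = {"--allow-domains": {"type": "list"}, "--hypothetical-flag": {"type": "string", "required": True}}
print(breaking_changes(old, new))
```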

[P1] CI Optimization Coach

What: Analyzes CI pipeline performance and proposes parallelization, caching, and test-splitting improvements.

Why: Issue #1376 explicitly reports integration tests taking 37+ minutes in CI (Domain & Network job). The Pelis factory's CI Coach achieved a 100% merge rate on 9 proposals. This is actionable right now with a known pain point.

Effort: Low-Medium

Source to adapt: gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/ci-coach.md

[P1] Rationalize Secret Digger Scheduling

What: Three separate secret digger workflows (Claude, Codex, Copilot) each run every hour, for about 72 scheduled runs/day (3 × 24). Consider staggering to every 6h per engine (12 runs/day in total), or rotating engines.

Why: Running 3 different engines hourly is expensive and likely produces redundant results. The cost could be redirected to higher-value workflows like meta-monitoring. The security value of 3 engines is real (different models may catch different patterns), but hourly cadence may be excessive.

How: Change schedules to every 6h with staggered offsets; this preserves at least one daily run per engine (four each). Alternatively, rotate engines by day (Monday=Claude, Tuesday=Codex, Wednesday=Copilot...), accepting that each engine then runs only every third day.

Effort: Low
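
The run-count arithmetic and the staggered offsets can be checked with a short sketch. Both helper names are hypothetical, and the cron strings assume the standard five-field syntax used by scheduled GitHub Actions workflows, with hours written as an explicit comma list.

```python
def runs_per_day(engines: int, interval_hours: int) -> int:
    """Total scheduled runs per day when every engine runs each `interval_hours`."""
    return engines * (24 // interval_hours)


def staggered_crons(engines: list[str], interval_hours: int) -> dict[str, str]:
    """Give each engine its own start hour so runs are spread across the interval."""
    step = interval_hours // len(engines)
    crons = {}
    for index, name in enumerate(engines):
        hours = range(index * step, 24, interval_hours)
        crons[name] = f"0 {','.join(str(hour) for hour in hours)} * * *"
    return crons


current = runs_per_day(3, 1)   # hourly, three engines
proposed = runs_per_day(3, 6)  # every 6h, three engines
schedule = staggered_crons(["claude", "codex", "copilot"], 6)
```

With a 6h interval and three engines the offsets land two hours apart, so some engine scans the repository every two hours while total runs drop from 72 to 12 per day.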

P2 — Consider for Roadmap

[P2] Metrics Collector / Portfolio Analyst

What: Daily workflow that aggregates token usage, cost estimates, run durations, and success rates across all 21 agentic workflows. Weekly workflow that identifies cost reduction opportunities.

Why: With 21 workflows, cost management becomes important. The Pelis factory learned that some agents were "way too chatty" — only discovered through Portfolio Analyst. AWF has a unique multi-engine setup (Claude, Codex, Copilot) making cross-engine cost comparison especially valuable.

Source: gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/metrics-collector.md
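
As a rough illustration of the aggregation step, the sketch below rolls individual run records up into per-workflow run counts, success rates, and average token usage. The `summarize` helper and the `(workflow, succeeded, tokens)` record shape are assumptions; the upstream Metrics Collector may gather and store its data differently.

```python
from collections import defaultdict


def summarize(runs):
    """Aggregate (workflow, succeeded, tokens) tuples into per-workflow metrics."""
    grouped = defaultdict(list)
    for workflow, succeeded, tokens in runs:
        grouped[workflow].append((succeeded, tokens))

    summary = {}
    for workflow, records in grouped.items():
        count = len(records)
        summary[workflow] = {
            "runs": count,
            "success_rate": sum(1 for ok, _ in records if ok) / count,
            "avg_tokens": sum(tokens for _, tokens in records) / count,
        }
    return summary
```

Keying the same summary by engine instead of by workflow would give the cross-engine cost comparison mentioned above.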

[P2] Schema Consistency Checker

What: Detects drift between src/types.ts (WrapperConfig interface), CLI flags in src/cli.ts, documentation in README.md/docs/, and action.yml input definitions.

Why: The codebase has 700+ lines of types and a complex CLI. Schema drift between the TypeScript interface, CLI flags, and documentation is a common source of bugs and user confusion. The cli-flag-consistency-checker covers some of this, but a deeper schema drift detector would catch TypeScript ↔ docs ↔ action.yml inconsistencies.

Effort: Medium
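
Once option names have been extracted from each source, the drift check itself reduces to a set comparison. The sketch below assumes that extraction and name normalization (for example camelCase versus kebab-case) have already happened; `find_drift` and the source labels are hypothetical.

```python
def find_drift(sources: dict[str, set[str]]) -> dict[str, set[str]]:
    """Map each option name to the sources that do not mention it.

    `sources` maps a label (e.g. "types", "cli", "docs", "action") to the
    set of normalized option names found in that source.
    """
    all_names = set().union(*sources.values())
    drift = {}
    for name in sorted(all_names):
        missing = {label for label, names in sources.items() if name not in names}
        if missing:
            drift[name] = missing
    return drift
```

An empty result means every option appears in every source; anything else is a candidate for an issue.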

[P2] Container Security Scanner (Weekly Static Analysis)

What: Weekly workflow running zizmor, poutine, and actionlint on all workflow files and container Dockerfiles, posting a structured discussion with findings.

Why: The repo has a security-review workflow, but adding structured static analysis tooling (zizmor for workflow injection, poutine for supply chain, actionlint for correctness) would give more systematic coverage. The Pelis factory generates "57 analysis discussions" from this pattern.

Effort: Low-Medium

[P2] Changeset Generator

What: Automated semantic versioning and changelog generation when PRs are merged — determines version bump (major/minor/patch) from commit messages and generates a changelog entry.

Why: update-release-notes handles notes after publishing, but there's no automation for generating changelogs or managing version bumps before release. With the current Conventional Commits enforcement (commitlint), the signals are already there.

Source: gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/changeset.md
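
Since the commit messages already follow Conventional Commits, the version-bump decision can be sketched in a few lines. This is a minimal illustration, not the upstream Changeset Generator: the `bump_for` helper is hypothetical, and it applies only the basic Conventional Commits mapping (`fix` to patch, `feat` to minor, `!` or a `BREAKING CHANGE:` footer to major).

```python
import re

# Header shape: type, optional "(scope)", optional "!", then ": ".
HEADER = re.compile(r"^(?P<type>[a-z]+)(\([^)]*\))?(?P<bang>!)?: ")

ORDER = {"none": 0, "patch": 1, "minor": 2, "major": 3}


def bump_for(commit_messages: list[str]) -> str:
    """Return the largest bump implied by the commits: major, minor, patch, or none."""
    level = "none"
    for message in commit_messages:
        match = HEADER.match(message)
        if match is None:
            continue  # Not a Conventional Commit; ignored in this sketch.
        if match["bang"] or "BREAKING CHANGE:" in message:
            candidate = "major"
        elif match["type"] == "feat":
            candidate = "minor"
        elif match["type"] == "fix":
            candidate = "patch"
        else:
            candidate = "none"
        if ORDER[candidate] > ORDER[level]:
            level = candidate
    return level
```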

P3 — Future Ideas

[P3] Issue Arborist

What: Links related issues as sub-issues (e.g., group all DNS-related issues, all API proxy issues, all container security issues).

Why: As the project matures and issue count grows, organizational automation helps. Currently, #1356 (chroot PATH), #1355 (memory limits), and similar infrastructure issues could benefit from grouping.

[P3] Firewall Policy Regression Detector

What: Agent that runs after each merge to main to compare the list of allowed/blocked domains and iptables rules against a known-good baseline, alerting on any policy regressions.

Why: This repository IS a firewall. Domain policy changes in src/squid-config.ts or iptables rules in containers/agent/setup-iptables.sh could silently weaken security. This is a unique, domain-specific workflow opportunity.

Effort: Medium-High (requires capturing baseline state)
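
The comparison against a baseline could look like the sketch below, shown for the domain allowlist only. It assumes the allowlist has already been rendered to a set of domain names, and it treats any newly allowed domain as a potential weakening; both the `diff_allowlist` helper and that policy choice are assumptions, and iptables rules would need their own normalization.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyDiff:
    newly_allowed: frozenset[str]
    no_longer_allowed: frozenset[str]

    @property
    def is_regression(self) -> bool:
        # Assumption: widening the allowlist is what weakens the firewall.
        return bool(self.newly_allowed)


def diff_allowlist(baseline: set[str], current: set[str]) -> PolicyDiff:
    """Compare the current allowed domains against a known-good baseline."""
    base = {domain.strip().lower() for domain in baseline}
    cur = {domain.strip().lower() for domain in current}
    return PolicyDiff(frozenset(cur - base), frozenset(base - cur))
```

Domains that disappear from the allowlist are reported but not treated as a regression here, since they tighten the policy; they may still break users and deserve a note in the alert.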

[P3] Onboarding Guide Generator

What: On first issue or first PR from a new contributor, generate a personalized onboarding comment explaining relevant code areas, testing instructions, and contributing guidelines based on what they've touched.

Why: Lower contribution friction for an open-source security tool. The architecture (Docker, Squid, iptables, chroot) can be daunting for new contributors.

[P3] Performance Regression Monitor (AWF Startup Time)

What: Weekly benchmark of AWF startup time and container pull time, tracking trends and alerting on regressions.

Why: Issue #1376 mentions 37+ minute CI runs. A dedicated performance tracking workflow (using cache-memory for historical comparison) could catch performance regressions early.
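
The trend check itself can stay simple. The sketch below compares the latest measurement with the median of earlier ones and flags it when it is more than a tolerance slower; the `is_regression` helper, the 20% tolerance, and the three-sample minimum are assumptions chosen for illustration, and the history list stands in for whatever is stored in cache-memory.

```python
from statistics import median


def is_regression(history: list[float], latest: float, tolerance: float = 0.2) -> bool:
    """Flag `latest` when it exceeds the historical median by more than `tolerance`.

    `history` holds earlier measurements (for example startup seconds).
    With fewer than three samples there is no stable baseline, so nothing is flagged.
    """
    if len(history) < 3:
        return False
    return latest > median(history) * (1 + tolerance)
```

Using the median rather than the mean keeps a single slow outlier in the history from shifting the baseline.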

📈 Maturity Assessment

| Dimension | Score | Notes |
| --- | --- | --- |
| Coverage | 4.5/5 | 21 workflows across all key categories |
| Security | 4.5/5 | Multiple security workflows, domain-specific security guard |
| Testing | 4/5 | Multi-engine smoke tests + coverage improver |
| Documentation | 4/5 | Daily doc maintainer |
| Observability | 2/5 | No metrics collection or meta-monitoring |
| Issue Management | 3/5 | Issue Monster + dedup, but no triage labeling |
| Release Automation | 3/5 | Release notes updated, but no changeset automation |
| Cost Management | 2/5 | Secret diggers running 72 times/day; no portfolio analysis |

Overall Current Level: 4/5 — Highly mature, security-focused agentic workflow setup, well-suited to the domain. Top priority gaps are meta-monitoring/observability and issue triage.

Target Level: 4.5/5 — Achievable by adding meta-monitoring, issue triage, CI optimization, and rationalizing secret digger costs.

🔄 Comparison with Pelis Factory Best Practices

What This Repository Does Exceptionally Well

Domain-driven security workflows — The security-guard, security-review, secret-digger (x3), and dependency-security-monitor form a comprehensive security posture unique to this domain
Multi-engine smoke testing — Running Claude, Copilot, and Codex smoke tests every 12h is a pattern we haven't seen in other repos — it validates the firewall works with all supported engines
Self-referential automation — Using AWF to test AWF (smoke tests run inside the firewall) is elegant and trustworthy
skip-if-match usage — Workflows like doc-maintainer and test-coverage-improver avoid creating redundant PRs

Key Improvements to Make

Add meta-monitoring — With 21 workflows, a Workflow Health Manager is not optional
Issue triage — No-label issues are accumulating; triage adds immediate maintainer value
Cost rationalization — 72 secret-digger runs/day is the most obvious optimization target
Supply chain gap — Daily malicious code scan is missing for a security-critical tool

Unique Domain Opportunities

This repository has a special opportunity: the tool being built IS the security infrastructure. This enables:

Firewall policy regression detection (unique to this repo)
Multi-engine security validation as a first-class workflow
Performance benchmarking tied directly to agent startup costs
Domain allowlist drift detection across releases

Run by: Pelis Agent Factory Advisor | Date: 2026-03-21 | Engine: Copilot

AI generated by Pelis Agent Factory Advisor

expires on Mar 28, 2026, 3:21 AM UTC
