AI-SDLC

AI-SDLC Progressive Autonomy Specification

Document type: Normative
Status: Draft
Spec version: v1alpha1


Table of Contents

  1. Introduction
  2. Autonomy Levels
  3. The Principle of Least Autonomy
  4. Complexity-Based Task Routing
  5. Promotion Criteria
  6. Demotion Triggers
  7. Framework Alignment

1. Introduction

The progressive autonomy system governs how much independence AI agents have within the development lifecycle. Autonomy is earned through demonstrated reliability, not granted by configuration alone. Every agent starts at a minimum autonomy level and advances through quantitative evidence of trustworthiness.

Autonomy policies are declared as AutonomyPolicy resources. The reconciliation loop continuously evaluates agent behavior against promotion criteria and demotion triggers.
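
A non-normative sketch of what such a resource might look like, expressed here as a Python dict. The `promotionCriteria` and `demotionTriggers` field names come from this specification; the Kubernetes-style `apiVersion`/`kind`/`metadata` envelope and the exact nested schema are assumptions for illustration only.

```python
# Illustrative AutonomyPolicy resource. The envelope and nested field
# shapes are assumptions; only promotionCriteria and demotionTriggers
# are field names taken from the specification.
autonomy_policy = {
    "apiVersion": "v1alpha1",
    "kind": "AutonomyPolicy",
    "metadata": {"name": "code-review-agent"},
    "spec": {
        "initialLevel": 1,  # MUST be 0 or 1 (Section 3)
        "promotionCriteria": {
            "minimumTasks": 50,
            "metrics": [
                {"name": "pr-approval-rate", "operator": ">=", "value": 0.90},
                {"name": "rollback-rate", "operator": "<=", "value": 0.02},
            ],
            "requiredApprovals": ["tech-lead", "security-officer"],
        },
        "demotionTriggers": [
            {"event": "critical-security-incident",
             "action": "demote-to-0",
             "cooldown": "30d"},
        ],
    },
}
```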


2. Autonomy Levels

The framework defines four autonomy levels (0-3). Each level specifies permissions, guardrails, and monitoring intensity.

2.1 Level 0 — Observer

Description: Read-only observation. The agent observes the development process and learns patterns but produces no artifacts.

Permissions:

  • Read: All accessible resources
  • Write: None
  • Execute: None

Guardrails:

  • All actions require approval (effectively no actions are possible)

Monitoring: Continuous

Minimum duration: 2 weeks (RECOMMENDED)

2.2 Level 1 — Junior

Description: The agent recommends changes; every action requires human approval before execution.

Permissions:

  • Read: All accessible resources
  • Write: Draft pull requests, comments
  • Execute: Test suites

Guardrails:

  • All changes require human approval
  • Maximum lines per PR: 200 (RECOMMENDED)
  • Blocked from security-sensitive paths (RECOMMENDED)

Monitoring: Continuous

Minimum duration: 4 weeks (RECOMMENDED)

2.3 Level 2 — Senior

Description: The agent executes within defined guardrails with real-time notification to humans. Only security-critical changes require pre-approval.

Permissions:

  • Read: All accessible resources
  • Write: Branches, pull requests, comments
  • Execute: Test suites, linters, builds

Guardrails:

  • Only security-critical changes require pre-approval
  • Maximum lines per PR: 500 (RECOMMENDED)
  • Resource/cost budget limits SHOULD be enforced
  • Reduced set of blocked paths

Monitoring: Real-time notification

Minimum duration: 8 weeks (RECOMMENDED)

2.4 Level 3 — Principal

Description: Autonomous within the agent's domain. Continuous validation replaces pre-approval. Audit trails provide accountability.

Permissions:

  • Read: All accessible resources
  • Write: Branches, pull requests, comments, merge non-critical changes
  • Execute: Test suites, linters, builds, deploy to staging

Guardrails:

  • Only architecture-level changes require pre-approval
  • Maximum lines per PR: 1000 (RECOMMENDED)

Monitoring: Audit log

Minimum duration: None (continuous validation)
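
The four levels above can be summarized as data. This is a non-normative sketch: the key names (`preApproval`, `maxLinesPerPr`, `monitoring`) are assumptions for illustration, while the values are the RECOMMENDED limits from Sections 2.1-2.4; `None` means no limit is specified at that level.

```python
# Summary of Section 2, keyed by autonomy level. Key names are
# illustrative assumptions; values mirror the RECOMMENDED limits above.
AUTONOMY_LEVELS = {
    0: {"name": "Observer",  "preApproval": "all",
        "maxLinesPerPr": None, "monitoring": "continuous"},
    1: {"name": "Junior",    "preApproval": "all",
        "maxLinesPerPr": 200,  "monitoring": "continuous"},
    2: {"name": "Senior",    "preApproval": "security-critical",
        "maxLinesPerPr": 500,  "monitoring": "real-time-notification"},
    3: {"name": "Principal", "preApproval": "architecture-level",
        "maxLinesPerPr": 1000, "monitoring": "audit-log"},
}
```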


3. The Principle of Least Autonomy

Agents MUST operate at the lowest autonomy level sufficient for their function. This extends the cybersecurity Principle of Least Privilege to agent autonomy.

  • Every agent MUST start at Level 0 or Level 1
  • Agents MUST NOT be initialized at Level 2 or Level 3 without first demonstrating competence through the promotion process
  • Implementations MUST enforce this by rejecting AutonomyPolicy resources that set an agent's initial level above 1
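
The third requirement can be enforced with a simple admission check. A minimal sketch, assuming the policy is represented as a dict with a `spec.initialLevel` field (that field name is an assumption for illustration):

```python
def validate_initial_level(policy: dict) -> None:
    """Reject AutonomyPolicy resources whose initial level exceeds 1.

    Section 3 requires every agent to start at Level 0 or Level 1.
    The 'spec.initialLevel' field name is an illustrative assumption.
    """
    level = policy["spec"]["initialLevel"]
    if not 0 <= level <= 1:
        raise ValueError(
            f"initialLevel {level} violates the Principle of Least Autonomy; "
            "agents MUST start at Level 0 or Level 1"
        )
```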

4. Complexity-Based Task Routing

Task complexity determines the minimum autonomy level and human involvement required. Complexity is scored on a 1-10 scale.

4.1 Complexity Tiers

  Tier      Score  Strategy          Human Role
  Low       1-3    fully-autonomous  None (post-hoc audit)
  Medium    4-6    ai-with-review    Reviewer
  High      7-8    ai-assisted       Collaborator
  Critical  9-10   human-led         Owner

4.2 Routing Strategies

fully-autonomous — The agent executes the task independently with automated quality gates. Human review occurs post-hoc through audit logs. This strategy MUST only be used for agents at Level 2 or Level 3.

ai-with-review — The agent generates the output (code, PR, etc.) and a human reviewer approves or requests changes before the output is accepted. This is the default strategy for Level 1 and Level 2 agents on medium-complexity tasks.

ai-assisted — A human leads the task with AI providing suggestions, drafts, and analysis. The human retains decision authority at every step.

human-led — A human performs the task with optional AI support for research, boilerplate generation, or documentation. The agent MUST NOT make autonomous changes.

4.3 Scoring

Complexity scoring MAY be performed by:

  • Static analysis of the task description
  • AI-based evaluation of task scope
  • Manual assignment by a human
  • Historical analysis of similar tasks

The scoring method is implementation-defined. However, the resulting score MUST be an integer in the range 1-10, and the routing strategy MUST be applied consistently based on the Pipeline resource's routing.complexityThresholds configuration.
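
A non-normative sketch of the routing step: validate that the score is an integer in 1-10, then map it to a strategy. The default tier boundaries follow Section 4.1; in a real deployment they would come from the Pipeline resource's routing.complexityThresholds configuration, whose exact shape is assumed here.

```python
def route_task(score, thresholds=None):
    """Map a complexity score (integer 1-10) to a routing strategy.

    The thresholds mapping gives each strategy's inclusive upper bound,
    in ascending order; its shape is an illustrative assumption for the
    Pipeline resource's routing.complexityThresholds configuration.
    """
    if not isinstance(score, int) or not 1 <= score <= 10:
        raise ValueError("complexity score MUST be an integer in the range 1-10")
    thresholds = thresholds or {
        "fully-autonomous": 3,   # Low: 1-3
        "ai-with-review": 6,     # Medium: 4-6
        "ai-assisted": 8,        # High: 7-8
        "human-led": 10,         # Critical: 9-10
    }
    for strategy, upper in thresholds.items():
        if score <= upper:
            return strategy
    raise ValueError("thresholds do not cover the full 1-10 range")
```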


5. Promotion Criteria

Promotion from one autonomy level to the next requires meeting quantitative criteria. Promotion criteria are declared in the AutonomyPolicy resource's promotionCriteria field.

5.1 Requirements

Every promotion MUST satisfy:

  1. Minimum task count — The agent MUST have completed at least minimumTasks tasks at the current level.
  2. Metric thresholds — All metric conditions MUST be met simultaneously.
  3. Required approvals — All roles listed in requiredApprovals MUST explicitly approve the promotion.
  4. Minimum duration — If the current level specifies a minimumDuration, the agent MUST have spent at least that duration at the current level.
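
The four requirements above can be sketched as a single predicate. This is non-normative: the `record` field names (`tasksCompleted`, `metrics`, `approvals`, `weeksAtLevel`) are assumptions for illustration, while `minimumTasks`, `metrics`, and `requiredApprovals` follow the promotionCriteria field defined by this specification.

```python
def promotion_eligible(record: dict, criteria: dict) -> bool:
    """Check the four promotion requirements of Section 5.1.

    `record` holds the agent's statistics at its *current* level only
    (Section 5.3). Its field names are illustrative assumptions.
    """
    ops = {">=": lambda a, b: a >= b,
           "<=": lambda a, b: a <= b,
           "==": lambda a, b: a == b}
    # 1. Minimum task count at the current level.
    if record["tasksCompleted"] < criteria["minimumTasks"]:
        return False
    # 2. All metric conditions MUST hold simultaneously.
    for m in criteria["metrics"]:
        if not ops[m["operator"]](record["metrics"][m["name"]], m["value"]):
            return False
    # 3. Every listed role MUST have explicitly approved.
    if not set(criteria["requiredApprovals"]) <= set(record["approvals"]):
        return False
    # 4. Minimum duration at the current level, if specified.
    return record["weeksAtLevel"] >= criteria.get("minimumDurationWeeks", 0)
```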

5.2 Recommended Metrics

The following metrics are RECOMMENDED for promotion evaluation:

  Metric                          Description                                              Typical Threshold
  recommendation-acceptance-rate  Rate at which the agent's recommendations are accepted   >= 0.90
  pr-approval-rate                Rate at which the agent's PRs are approved               >= 0.90 (L1-L2), >= 0.95 (L2-L3)
  rollback-rate                   Rate of changes that required rollback                   <= 0.02 (L1-L2), <= 0.01 (L2-L3)
  average-review-iterations       Average number of review rounds per PR                   <= 1.5
  security-incidents              Number of security incidents caused                      == 0
  code-coverage-maintained        Whether code coverage is maintained or improved          >= 0.80
  production-incidents-caused     Number of production incidents caused                    == 0

5.3 Evaluation

Implementations SHOULD evaluate promotion criteria periodically (e.g., weekly or on a configurable schedule). Evaluation MUST consider only tasks completed at the agent's current level — historical performance at previous levels MUST NOT be counted.

When all criteria are met, implementations SHOULD notify the required approvers and await explicit approval before advancing the agent's level.


6. Demotion Triggers

Demotion is automatic and immediate when a trigger event occurs. This ensures that trust is continuously verified.

6.1 Trigger Events

Demotion triggers are declared in the AutonomyPolicy resource's demotionTriggers field. Common triggers include:

  Trigger                          Recommended Action  Description
  critical-security-incident       demote-to-0         A security vulnerability was introduced or exploited
  rollback-rate-exceeds-5-percent  demote-one-level    Rollback rate exceeded threshold over evaluation window
  unauthorized-access-attempt      demote-to-0         Agent attempted to access a resource outside its permissions

6.2 Demotion Actions

  • demote-to-0 — Immediately set the agent's autonomy level to 0 (Observer). Used for serious incidents.
  • demote-one-level — Reduce the agent's autonomy level by one. Used for performance degradation.
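
The two actions reduce to a small level-transition function. A non-normative sketch:

```python
def apply_demotion(current_level: int, action: str) -> int:
    """Compute the post-demotion autonomy level (Section 6.2)."""
    if action == "demote-to-0":
        return 0
    if action == "demote-one-level":
        return max(0, current_level - 1)  # never drop below Observer
    raise ValueError(f"unknown demotion action: {action}")
```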

6.3 Cooldown Period

After a demotion, a cooldown period MUST be observed before the agent is eligible for re-promotion. The cooldown duration is specified per trigger in the cooldown field.

During the cooldown period:

  • The agent MUST operate at the demoted level
  • Promotion criteria evaluation MUST NOT start until the cooldown expires
  • Task counts for promotion MUST reset to zero
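
The second rule above amounts to a gate in front of promotion evaluation. A minimal sketch, assuming the implementation records the demotion timestamp and parses the trigger's cooldown field into a duration; task-count reset is assumed to happen separately when the demotion is applied:

```python
from datetime import datetime, timedelta

def promotion_evaluation_allowed(demoted_at: datetime,
                                 cooldown: timedelta,
                                 now: datetime) -> bool:
    """Promotion criteria evaluation MUST NOT start before the cooldown
    expires (Section 6.3)."""
    return now >= demoted_at + cooldown
```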

6.4 Notification

Implementations MUST notify relevant stakeholders when a demotion occurs. The notification MUST include:

  • Agent identity
  • Previous level and new level
  • Trigger event that caused the demotion
  • Cooldown duration
  • Timestamp
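
The five required fields can be assembled as a simple payload. A non-normative sketch: the specification mandates the content of the notification, not a wire format, so the key names here are illustrative assumptions.

```python
from datetime import datetime, timezone

def demotion_notification(agent: str, previous_level: int, new_level: int,
                          trigger: str, cooldown: str) -> dict:
    """Assemble the five REQUIRED notification fields of Section 6.4.

    Key names are illustrative; only the field content is mandated.
    """
    return {
        "agent": agent,
        "previousLevel": previous_level,
        "newLevel": new_level,
        "trigger": trigger,
        "cooldown": cooldown,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
```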

7. Framework Alignment

The AI-SDLC autonomy system synthesizes three independently converging frameworks:

  AI-SDLC Level  CSA ATF Analog  Knight-Columbia Analog       Key Characteristic
  0 (Observer)   —               L1 Operator                  Read-only; observe and learn
  1 (Junior)     Intern          L2 Collaborator              Recommend; all changes require approval
  2 (Senior)     Junior/Senior   L3 Consultant / L4 Approver  Execute within guardrails; real-time notification
  3 (Principal)  Principal       L5 Observer                  Autonomous within domain; audit trail
  • CSA Agentic Trust Framework (February 2026) — Applies Zero Trust principles to AI agents. Trust is earned through demonstrated behavior and continuously verified through monitoring.
  • Knight-Columbia Autonomy Levels (Feng, McDonald & Zhang, July 2025) — Frames autonomy as a deliberate design decision separate from capability.
  • Guided Autonomy / Principle of Least Autonomy — Extends the cybersecurity Principle of Least Privilege to agent autonomy.