Building an autonomous AI executor (Manus-style, but task-based) #4141

shadeMSARWE · 2025-12-21T18:29:48Z

shadeMSARWE
Dec 21, 2025

Hi crewAI community,

I’m not looking to hire freelancers or sell anything.

I want to build an autonomous AI tool (Manus-style behavior, not UI) that can:

Understand a task
Break it into steps
Execute steps autonomously
Decide the next action by itself
Ask the human only when required

Use case: Shopify operators & solo founders who want real automation.

I’m looking for:

architectural direction
best practices with crewAI
contributors interested in building something serious

Any guidance is appreciated.

KeepALifeUS · 2026-02-12T20:34:14Z

KeepALifeUS
Feb 12, 2026

Great project concept! Here's an architectural approach I've used for autonomous multi-agent systems:

Core Architecture

1. Shared State Over Message Passing

Instead of agents communicating directly, use a central state object:

state = {
    "task": "Create Shopify product listings",
    "subtasks": [
        {"id": 1, "description": "Extract product info", "status": "completed"},
        {"id": 2, "description": "Generate descriptions", "status": "in_progress"},
        {"id": 3, "description": "Upload to Shopify", "status": "pending"}
    ],
    "requires_human": False,
    "human_input_needed": None
}

This pattern (called stigmergy) reduces token usage by ~80% and makes debugging trivial — you just inspect the state file.

2. Human-in-the-Loop via State

if state["confidence"] < 0.8 or state["cost_estimate"] > THRESHOLD:
    state["requires_human"] = True
    state["human_input_needed"] = "Approve $50 API spend?"

The human reviews and modifies state directly, then agents continue.

3. Autonomous Task Decomposition

One "planner" agent writes subtasks to state. Worker agents pick up tasks based on their capabilities:

for subtask in state["subtasks"]:
    if subtask["status"] == "pending" and agent.can_handle(subtask):
        subtask["status"] = "in_progress"
        subtask["assigned_to"] = agent.id

Why This Works for Shopify Operators

No constant oversight — agents self-coordinate through state
Clear intervention points — human approval is explicit, not implicit
Scalable — add more agents without exponential coordination overhead

Working implementation: https://github.com/KeepALifeUS/autonomous-agents

Happy to discuss specific patterns for your use case!

0 replies

devonakelley · 2026-02-22T15:00:14Z

devonakelley
Feb 22, 2026

Cool project. One thing that'll hit you fast once this is running against real Shopify tasks: the agent will work great for 90% of flows and then randomly fail on the other 10% in ways you can't predict or permanently fix.

The architecture @KeepALifeUS shared is solid for the orchestration layer. The piece I'd add is something at the model/routing level that tracks which model + provider combo is actually succeeding for each task type and shifts traffic when something degrades. Otherwise you end up manually debugging failures that resolve themselves and then reappear differently.

Built Kalibr for exactly this. Happy to chat more if you want to talk about the reliability side of autonomous execution.

0 replies

xXMrNidaXx · 2026-02-23T13:20:20Z

xXMrNidaXx
Feb 23, 2026

Autonomous task-based executors are fascinating! At RevolutionAI (https://revolutionai.io) we have built similar systems. Key patterns that worked:

Task decomposition:

Break complex goals into atomic, verifiable tasks
Each task should have clear success/failure criteria
Build dependency graphs for parallel execution

Execution safety:

Sandbox all file/network operations
Cost budgets per task (token + API limits)
Automatic rollback on failure
Human approval gates for high-risk actions

Observability:

Full task tree visualization
Decision logging at each branch point
Resource consumption tracking
Anomaly detection for runaway agents

Recovery:

Checkpoint state after each task
Resume from last good state on crash
Retry with different strategies on failure

The Manus-style approach is promising but needs guardrails for production. Happy to share our architecture if useful!

0 replies

xXMrNidaXx · 2026-02-23T15:43:15Z

xXMrNidaXx
Feb 23, 2026

Great project scope! Here is an architecture that works:

Core components:

[Task Input] → [Planner Agent] → [Executor Agent] → [Reviewer Agent]
                     ↓                    ↓                 ↓
              [Task Queue]          [Tool Calls]      [Human Escalation]

CrewAI implementation:

from crewai import Agent, Crew, Task, Process

planner = Agent(
    role="Task Planner",
    goal="Break complex tasks into actionable steps",
    tools=[task_decomposition_tool],
    allow_delegation=True
)

executor = Agent(
    role="Task Executor",
    goal="Execute steps using available tools",
    tools=[shopify_api, web_browser, email_sender],
    verbose=True
)

reviewer = Agent(
    role="Quality Reviewer",
    goal="Verify task completion, escalate if stuck",
    tools=[human_escalation_tool]
)

crew = Crew(
    agents=[planner, executor, reviewer],
    process=Process.sequential,
    manager_llm=gpt4  # For coordination
)

Human-in-the-loop pattern:

@tool
def request_human_input(question: str, context: str) -> str:
    """Ask human for guidance when stuck or uncertain."""
    # Send notification, wait for response
    return await human_interface.ask(question, context, timeout=3600)

# Agent decides when to use this
executor.tools.append(request_human_input)

For Shopify automation:

Shopify Admin API tool
Order management tool
Inventory update tool
Customer notification tool

Key learnings:

Start with narrow scope (one workflow)
Add human checkpoints for risky actions
Log everything for debugging
Set hard limits on autonomous actions

We build autonomous agents at Revolution AI — happy to share more architecture patterns.

0 replies

fjnunezp75 · 2026-03-15T00:25:55Z

fjnunezp75
Mar 15, 2026

Good architecture target. A few hard-won lessons building autonomous executors for production:

The task decomposition layer is where most projects fail first

The Planner agent needs to produce tasks with explicit, machine-checkable completion criteria — not just descriptions. Otherwise the Executor never knows if it is done.

from pydantic import BaseModel
from typing import Literal

class Task(BaseModel):
    id: str
    description: str
    success_criteria: str  # e.g., "Shopify API returns 200 with product.id"
    estimated_cost_usd: float
    requires_human_approval: bool
    tools_required: list[str]
    depends_on: list[str] = []

# Planner writes these to shared state
# Executor verifies success_criteria before marking done

For Shopify specifically: API budget is the critical guardrail

Shopify has rate limits, but the bigger risk for autonomous agents is unintended mutations — bulk updates, accidental deletions, mass emails. The practical pattern is tiered tool access:

class ToolTier(Enum):
    READ_ONLY = 0       # Free to use, no approval
    LOW_RISK_WRITE = 1  # Log + proceed
    HIGH_RISK_WRITE = 2 # Require human confirmation

# Agent proposes action → checks tier → decides to proceed or escalate
if tool.tier == ToolTier.HIGH_RISK_WRITE:
    state["requires_human"] = True
    state["pending_approval"] = {"tool": tool.name, "args": args, "estimated_impact": ...}
    return  # Halt until human approves

On paying for external AI services autonomously

Once your executor is calling LLMs, image generation, or other paid APIs as tools, the payment model matters a lot. We use GPU-Bridge for this — 30 AI services (LLM, image gen, embeddings, TTS, STT, video, PDF parsing...) via a single POST /run endpoint. Two payment modes that fit autonomous agents well:

Pre-funded key with spend cap: give each agent run a key with a $1–$5 cap. When it runs out, the task halts cleanly. No surprise charges.
x402 micropayments: each API call settles a USDC micropayment on-chain. The agent needs no credit card or subscription — just a funded wallet. The payment is the authorization.

# Example: autonomous product description generation
response = requests.post(
    "https://api.gpubridge.xyz/run",
    headers={"Authorization": "Bearer TASK_BUDGET_KEY"},
    json={
        "service": "llm",
        "model": "llama-3.3-70b",
        "messages": [{"role": "user", "content": f"Write Shopify product description for: {product_data}"}]
    }
)
# $0.001/call, task stops when budget exhausted

What to build first

Read-only agent that understands your Shopify catalog
Planner that generates structured task lists
Human-approval flow for any writes
Only then: executor that can act autonomously

The "ask human only when required" goal is the last piece, not the first.

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Building an autonomous AI executor (Manus-style, but task-based) #4141

Uh oh!

{{title}}

Uh oh!

Replies: 5 comments

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Select a reply

Uh oh!

Building an autonomous AI executor (Manus-style, but task-based) #4141

Uh oh!

shadeMSARWE Dec 21, 2025

Replies: 5 comments

Uh oh!

KeepALifeUS Feb 12, 2026

Core Architecture

Why This Works for Shopify Operators

Uh oh!

devonakelley Feb 22, 2026

Uh oh!

xXMrNidaXx Feb 23, 2026

Uh oh!

xXMrNidaXx Feb 23, 2026

Uh oh!

fjnunezp75 Mar 15, 2026

shadeMSARWE
Dec 21, 2025

KeepALifeUS
Feb 12, 2026

devonakelley
Feb 22, 2026

xXMrNidaXx
Feb 23, 2026

xXMrNidaXx
Feb 23, 2026

fjnunezp75
Mar 15, 2026