Add ask_for_memory_deletion MCP tool (with human confirmation)

Done Story Medium

Created: Dec 23, 2025

Updated: 7 days ago

PR: View

Description

Add a safe memory deletion mechanism that prevents agents from arbitrarily deleting organizational knowledge.

## Problem

Agents should NOT be able to directly delete memories - this could lead to:

- Loss of important architectural decisions
- Removal of bug fixes/solutions
- Erasure of system knowledge
- Potential "rogue agent" scenarios

## Solution: Request-Approval Flow

Create `ask_for_memory_deletion` MCP tool that:

1. Agent requests deletion (provides memory_id and reason)
2. Request is logged/stored for human review
3. Human confirms/rejects via UI or separate approval flow
4. Only after human confirmation is memory actually deleted

## Tool Specification

**Tool Name:** `ask_for_memory_deletion`

**Parameters:**

- `memory_id` (required) - ID of memory to delete
- `reason` (required) - Why this memory should be deleted

**Returns:**

- Confirmation that deletion request was submitted
- Request ID for tracking
- Status: "pending_human_review"

**What it does:**

- Creates a `MemoryDeletionRequest` record
- Stores: memory_id, requesting_agent_id, reason, status, created_at
- Does NOT delete the actual memory
- Logs the request for human review

## Human Approval Flow (Separate)

Options for human approval:

1. **Rails UI** - `/admin/memory_deletion_requests` index page
2. **CLI command** - `rails tinker:memory_deletion:review`
3. **API endpoint** - `POST /api/v1/memory_deletion_requests/:id/approve`

## Database Changes

Create table:

```ruby
create_table :memory_deletion_requests do |t|
  t.integer :memory_id, null: false
  t.integer :requesting_agent_id, null: false
  t.text :reason, null: false
  t.string :status, default: 'pending'
  t.integer :reviewed_by_agent_id
  t.text :review_notes
  t.timestamps
end
```

## Acceptance Criteria

1. `ask_for_memory_deletion` tool creates request record (not deletion)
2. Request includes memory_id, agent_id, reason, timestamp
3. Human can view all pending deletion requests
4. Human can approve/reject deletion requests
5. Only approved deletions actually remove memory
6. All deletion attempts are logged (audit trail)
7. Agents cannot bypass this flow

## Security Considerations

- No direct `delete_memory` tool exists
- Only humans (or Orchestrator with special privilege?) can approve
- Audit log of all deletion requests and outcomes
- Optional: Require 2-factor approval for certain memory types (decision, error)

## Files to Create/Modify

**New:**

- `app/models/memory_deletion_request.rb`
- `app/controllers/api/v1/memory_deletion_requests_controller.rb`
- `spec/models/memory_deletion_request_spec.rb`
- `spec/controllers/api/v1/memory_deletion_requests_controller_spec.rb`

**Modify:**

- `db/schema.rb` - add memory_deletion_requests table
- `mcp-bridge/src/tools/index.ts` - add ask_for_memory_deletion tool
- Admin UI (optional) - deletion request review interface
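
## Illustrative Sketch: Request-Approval Flow

The request-approval flow described in this ticket can be sketched in plain Ruby, without Rails. This is a minimal illustration under stated assumptions, not the merged implementation: the in-memory `MEMORIES`, `REQUESTS`, and `AUDIT_LOG` stores, the `DeletionRequest` struct, and the `review_deletion_request` helper are all hypothetical stand-ins for the real tables and controller actions. It shows that the tool only records a request and that a memory is removed only after an approving review.

```ruby
require "securerandom"

# Hypothetical in-memory stand-ins for the memories table, the
# memory_deletion_requests table, and the audit log.
MEMORIES  = { 42 => "Use advisory locks when claiming tickets" }
REQUESTS  = {}
AUDIT_LOG = []

# Mirrors the fields listed under "What it does" (names are assumptions).
DeletionRequest = Struct.new(
  :id, :memory_id, :requesting_agent_id, :reason, :status, :created_at,
  keyword_init: true
)

# Sketch of the tool handler: records a request and never deletes.
def ask_for_memory_deletion(memory_id:, reason:, agent_id:)
  raise ArgumentError, "reason is required" if reason.to_s.strip.empty?
  raise ArgumentError, "unknown memory_id" unless MEMORIES.key?(memory_id)

  request = DeletionRequest.new(
    id: SecureRandom.uuid,
    memory_id: memory_id,
    requesting_agent_id: agent_id,
    reason: reason,
    status: "pending", # stored status, matching the table default
    created_at: Time.now.utc
  )
  REQUESTS[request.id] = request
  AUDIT_LOG << { event: "requested", request_id: request.id, agent_id: agent_id }

  # Status string returned to the agent, as given in the tool specification.
  { request_id: request.id, status: "pending_human_review" }
end

# Sketch of the separate review step. Authorization of the reviewer is
# deliberately out of scope here; only an approval removes the memory.
def review_deletion_request(request_id, approved:, reviewer_id:)
  request = REQUESTS.fetch(request_id)
  raise ArgumentError, "request already reviewed" unless request.status == "pending"

  request.status = approved ? "approved" : "rejected"
  MEMORIES.delete(request.memory_id) if approved
  AUDIT_LOG << { event: request.status, request_id: request.id, agent_id: reviewer_id }
  request
end
```

Calling `ask_for_memory_deletion(memory_id: 42, reason: "Superseded", agent_id: 7)` leaves `MEMORIES[42]` in place and returns a request ID. The memory disappears only after `review_deletion_request` is called with `approved: true`, and both steps append to the audit log.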

Recent Comments

tinker-reviewer 7 days ago

## REVIEWER AUDIT - RETURNED FOR SECURITY ISSUE

### Critical Security Vulnerability: Missing Authorization

The `approve` and `reject` endpoints can be called by **ANY authenticated agent**, including workers who could approve their own deletion requests.

**The Problem:**

```ruby
def approve
  request.approve!(reviewer: current_agent, notes: params[:notes]) # NO AUTH CHECK
```

**Ticket Requirement:**

> "Human approval flow"
> "Only humans (or Orchestrator with special privilege?) can approve"

**What's Needed:**

Add authorization to ensure only human agents can approve/reject:

```ruby
before_action :require_human_agent, only: [:approve, :reject]

private

def require_human_agent
  unless current_agent.agent_type == "human"
    render json: { error: "Unauthorized - only human agents can approve deletion requests" }, status: :forbidden
  end
end
```

### What IS Good ✓

- Model implementation is correct with proper validations
- Test coverage is excellent (42 passing specs)
- Audit logging is present
- Feature behavior matches acceptance criteria

### Action Required

1. Add `require_human_agent` before_action for approve/reject
2. Add tests verifying non-human agents cannot approve
3. Consider adding optional orchestrator approval if desired

The implementation is solid - it just needs the authorization check that was explicitly required by the ticket.
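
### Illustrative Sketch: Reviewer Authorization Rule

The rule the reviewer asks for can be expressed and checked outside the controller. The following is a sketch under assumptions, not the code from the PR: the `Agent` struct, the `NotAuthorized` error, the `can_review?` and `authorize_review!` helpers, and the `"worker"` and `"orchestrator"` type strings are hypothetical. Only the `"human"` agent type comes from the review comment. The `allow_orchestrator` flag models the optional orchestrator approval from action item 3 and is off by default.

```ruby
# Hypothetical stand-in for the agent record behind current_agent.
Agent = Struct.new(:id, :agent_type, keyword_init: true)

# Raised when an agent may not approve or reject a deletion request.
class NotAuthorized < StandardError; end

# True only for agent types permitted to review deletion requests.
# "orchestrator" is an assumed type name, enabled only on request.
def can_review?(agent, allow_orchestrator: false)
  allowed_types = ["human"]
  allowed_types << "orchestrator" if allow_orchestrator
  allowed_types.include?(agent.agent_type)
end

# Guard to call before approve/reject, analogous to a before_action.
def authorize_review!(agent, allow_orchestrator: false)
  return true if can_review?(agent, allow_orchestrator: allow_orchestrator)

  raise NotAuthorized, "only human agents can approve deletion requests"
end
```

With this shape, a worker that filed a request cannot approve it, because `authorize_review!` raises for any agent whose type is not in the allowed list. Keeping the predicate separate from the guard makes the "non-human agents cannot approve" tests cheap to write.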

Ticket Stats

Status: Done

Priority: Medium

Type: Story

Rework: 1x

Activity Timeline

- tinker-worker: State transition (7 days ago)
- tinker-orchestrator: Transition approve (7 days ago)
- tinker-worker: State transition (7 days ago)
- tinker-reviewer: Transition pass audit (7 days ago)
- tinker-worker: State transition (7 days ago)
- tinker-worker: Transition submit review (7 days ago)
- tinker-worker: State transition (7 days ago)
- tinker-worker: Transition start work (7 days ago)
- System: State transition (7 days ago)
- tinker-reviewer: Transition fail audit (7 days ago)
- tinker-reviewer: Add comment (7 days ago)
- System: State transition (7 days ago)
- tinker-worker: Transition submit review (7 days ago)
- tinker-worker: Update ticket (7 days ago)
- System: State transition (7 days ago)