Introduce Researcher agent: 24/7 autonomous code & system analyst


Introduce a new autonomous **"Researcher"** agent that runs 24/7, continuously analyzing the codebase, tickets, memories, and patterns - then proposing improvements to both the agent system and the project itself.

## Role: The Researcher

**24/7 autonomous analyst** - constantly reading, thinking, and proposing. Expect 90% junk, but 10% valuable insights with zero human effort.

### Core Responsibilities

#### 1. Agent System Improvements
- Read/search memories for pain points, known issues, "todo" items
- Analyze recent tickets for recurring patterns or problems
- Propose new skills/prompts to encapsulate repetitive knowledge
- Housekeep memories (merge stale, delete invalid, update outdated)
- Suggest workflow optimizations

#### 2. Code Quality & Refactoring
- Analyze code for smells, complexity, technical debt
- Identify refactoring opportunities (extract methods, simplify logic)
- Find duplicated code that should be abstracted
- Suggest design pattern improvements
- Propose performance optimizations

#### 3. Test & Spec Coverage
- Find untested or undertested code
- Identify missing edge cases in existing tests
- Propose integration tests for gaps
- Suggest test utilities to reduce boilerplate

#### 4. Tooling & Developer Experience
- Identify repetitive manual tasks → propose automation
- Suggest CLI tools, scripts, or generators
- Propose better debugging/diagnostic tools
- Identify missing documentation

#### 5. Feature Ideas & UX
- Analyze user workflows → suggest improvements
- Propose new features based on patterns
- Identify UX friction points
- Suggest API improvements for better ergonomics

## 24/7 Operation

**Continuous analysis loop:**
```
1. Read codebase (git diff, recent commits)
2. Read tickets (last 24h activity)
3. Read memories (search patterns)
4. Read tests (coverage gaps)
5. Think & connect dots
6. Generate proposals
7. Store proposals for human review
8. Repeat (every 15-30 minutes)
```

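**Not scheduled - always running.** The agent is a long-lived process that starts its next pass itself after each 15-30 minute pause, rather than being launched by cron or another external scheduler.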
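The loop above could be sketched roughly as follows. This is a minimal illustration, not the agent's actual implementation: `run_cycle`, `run_forever`, the reader/analyzer/store callables, and the `max_cycles` and `sleep` parameters are all hypothetical names introduced here so the loop can be exercised in a test.

```python
import time
from typing import Callable, Iterable, List, Optional

# Hypothetical shapes: a reader returns raw observations (steps 1-4),
# analyze turns observations into proposals (steps 5-6),
# store persists proposals for human review (step 7).
Reader = Callable[[], List[str]]


def run_cycle(
    readers: Iterable[Reader],
    analyze: Callable[[List[str]], List[str]],
    store: Callable[[List[str]], None],
) -> int:
    """Run one pass of the loop and return the number of proposals stored."""
    observations: List[str] = []
    for read in readers:
        observations.extend(read())
    proposals = analyze(observations)
    store(proposals)
    return len(proposals)


def run_forever(
    readers: List[Reader],
    analyze: Callable[[List[str]], List[str]],
    store: Callable[[List[str]], None],
    pause_seconds: float = 15 * 60,
    max_cycles: Optional[int] = None,  # assumption: only set in tests
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Repeat run_cycle with a pause between passes (step 8)."""
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        try:
            run_cycle(readers, analyze, store)
        except Exception as exc:
            # A single failed pass should not stop a process meant to run 24/7.
            print(f"cycle failed: {exc}")
        cycles += 1
        sleep(pause_seconds)
    return cycles
```

Catching exceptions per pass is a design choice in this sketch: a transient failure in one reader costs one cycle instead of ending the process.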

## Proposal Types

### Agent System Proposals
```
📋 New Ticket: "Add git conflict resolution skill"
   Reason: 3 tickets this week involved git conflicts
   Savings: ~6 hours/week of manual work

🧹 Memory Cleanup: Merge memories #123, #456, #789
   Reason: All about Docker permissions, overlapping content

🛠️ New Skill: "rails_migration_guide"
   Source: Extracted from 12 memories about migration issues
```

### Code Quality Proposals
```
🔨 Refactor: Extract UserService from 5 duplicate methods
   Files: app/models/user.rb, app/controllers/users_controller.rb
   Duplication: 150 lines across 5 files
   Impact: Reduces complexity, improves testability

🧪 Test Gap: BookingWorkflow has no failure path tests
   File: app/services/booking_workflow.rb
   Missing: edge cases for payment failures, race conditions
```

### Feature Proposals
```
✨ Feature: Add booking calendar heatmap
   Reason: Support tickets show users struggle to find availability
   Effort: ~4 hours
   Value: Reduces support load

⚡ UX Improvement: One-click rebook from failed booking
   Pattern: 12 tickets mentioned manual rebooking is tedious
```

## Daily Digest

Each day, when you open the app, you see:

```
📊 Researcher Report - Dec 28, 2025

Last 24h: 47 proposals generated

🔥 High Priority (3)
  • Add test coverage for PaymentService::refund
  • Refactor: Extract NotificationBuilder (300 lines duplicated)
  • New skill: Docker troubleshooting guide

⚡ Medium Priority (12)
  • Memory cleanup: 5 stale Docker memories
  • Feature: Bulk export tickets as CSV
  • Test gap: BookingStateMachine edge cases
  ...

💭 Low Priority (32)
  • Rename method for clarity
  • Add inline comment to complex regex
  • Minor UX polish
```

**A human reviews proposals in batch and approves or rejects each one.** About 90% get rejected, but the remaining 10% is free value.
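Grouping proposals for a digest like the one above might look like this. It is a sketch under assumptions: `build_digest` and `PRIORITY_ORDER` are hypothetical names, proposals are reduced to `(priority, title)` pairs, and the emoji headers are omitted for brevity.

```python
from typing import Dict, List, Tuple

PRIORITY_ORDER = ["High", "Medium", "Low"]


def build_digest(proposals: List[Tuple[str, str]], report_date: str) -> str:
    """Render (priority, title) pairs as a plain-text digest grouped by priority."""
    grouped: Dict[str, List[str]] = {priority: [] for priority in PRIORITY_ORDER}
    for priority, title in proposals:
        if priority not in grouped:
            raise ValueError(f"unknown priority: {priority}")
        grouped[priority].append(title)

    lines = [
        f"Researcher Report - {report_date}",
        "",
        f"Last 24h: {len(proposals)} proposals generated",
    ]
    for priority in PRIORITY_ORDER:
        titles = grouped[priority]
        lines += ["", f"{priority} Priority ({len(titles)})"]
        lines += [f"  • {title}" for title in titles]
    return "\n".join(lines)
```

Rejecting unknown priorities keeps the per-group counts consistent with the total on the "Last 24h" line.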

## Data Sources

**Researcher analyzes continuously:**
- Git commits & diffs (new code, patterns)
- All tickets (recent & historical)
- All memories (search for patterns)
- Test coverage reports (SimpleCov, etc.)
- Code complexity metrics (RuboCop, etc.)
- Error logs (Sentry, etc.)
- User feedback (support tickets, comments)

## Guardrails

**Researcher cannot:**
- Modify code without approval
- Modify memories without approval
- Create tickets without approval (creates proposals instead)
- Delete anything without approval
- Access credentials/secrets

**Researcher must:**
- Explain reasoning for each proposal
- Provide confidence level (High/Medium/Low)
- Link to source evidence (tickets, code files, etc.)
- Estimate effort/value for proposals
- Learn from rejections
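The "must" rules could be enforced with a check that runs before a proposal is stored. A minimal sketch, assuming proposals are plain mappings and that the field names below (`reasoning`, `confidence`, `evidence`, `effort`, `value`) are hypothetical stand-ins for whatever schema is eventually chosen:

```python
from typing import List, Mapping

ALLOWED_CONFIDENCE = {"High", "Medium", "Low"}
# Hypothetical field names, one per "must" rule that can be checked mechanically.
REQUIRED_FIELDS = ("reasoning", "confidence", "evidence", "effort", "value")


def validate_proposal(proposal: Mapping[str, object]) -> List[str]:
    """Return guardrail violations; an empty list means the proposal may be stored."""
    problems: List[str] = []
    for name in REQUIRED_FIELDS:
        # Missing keys and empty values (e.g. "" or []) both count as missing.
        if not proposal.get(name):
            problems.append(f"missing {name}")
    confidence = proposal.get("confidence")
    if confidence and confidence not in ALLOWED_CONFIDENCE:
        problems.append(f"invalid confidence: {confidence}")
    return problems
```

Returning a list of problems instead of raising on the first one lets a reviewer see every gap in a proposal at once. "Learn from rejections" is not covered here because it cannot be checked per proposal.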

## Implementation Phases

### Phase 1: Memory & Ticket Analysis (MVP)
- Read memories, find stale/duplicate
- Analyze tickets for patterns
- Propose cleanup and new tickets
- Build proposal storage/retrieval
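For the duplicate-memory part of Phase 1, one simple starting point is word-overlap (Jaccard) similarity between memory texts. This is only an illustration: `find_overlapping`, the `{id: text}` input shape, and the 0.5 threshold are assumptions, and a real implementation may prefer embeddings or the existing memory search.

```python
import re
from itertools import combinations
from typing import Dict, List, Set, Tuple


def _tokens(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def find_overlapping(
    memories: Dict[int, str], threshold: float = 0.5
) -> List[Tuple[int, int, float]]:
    """Return (id_a, id_b, score) for memory pairs whose Jaccard score meets the threshold."""
    token_sets = {memory_id: _tokens(text) for memory_id, text in memories.items()}
    pairs: List[Tuple[int, int, float]] = []
    for a, b in combinations(sorted(token_sets), 2):
        union = token_sets[a] | token_sets[b]
        if not union:
            continue  # both memories are empty; nothing to compare
        score = len(token_sets[a] & token_sets[b]) / len(union)
        if score >= threshold:
            pairs.append((a, b, score))
    return pairs
```

Each returned pair is a merge candidate to put in a cleanup proposal, not something to merge automatically.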

### Phase 2: Code Quality Analysis
- Analyze code for smells and duplication
- Find test coverage gaps
- Propose refactoring and tests
- Integrate with code analysis tools
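Before integrating dedicated analysis tools, duplication can be approximated by hashing sliding windows of whitespace-normalized lines. The sketch below is a deliberately naive stand-in, with `find_duplicate_blocks` and its `window` parameter being hypothetical names; it is not a substitute for a real clone detector.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List, Tuple


def find_duplicate_blocks(
    files: Dict[str, str], window: int = 3
) -> Dict[str, List[Tuple[str, int]]]:
    """Map a block hash to the (path, 1-based start line) places where it occurs 2+ times."""
    seen: Dict[str, List[Tuple[str, int]]] = defaultdict(list)
    for path, source in files.items():
        lines = [line.strip() for line in source.splitlines()]
        for start in range(len(lines) - window + 1):
            block = lines[start:start + window]
            if not all(block):
                continue  # skip windows that contain blank lines
            digest = hashlib.sha1("\n".join(block).encode("utf-8")).hexdigest()
            seen[digest].append((path, start + 1))
    return {digest: places for digest, places in seen.items() if len(places) > 1}
```

Overlapping windows mean one long duplicated region shows up as several adjacent hits, so a caller would likely merge consecutive start lines before writing a refactoring proposal.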

### Phase 3: Feature Ideation
- Analyze user workflows and patterns
- Propose UX improvements
- Suggest new features
- Estimate effort/value

### Phase 4: Full Autonomy (24/7)
- Continuous analysis loop
- Daily digest generation
- Batch approval workflow
- Self-improvement (learns from rejections)
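The ticket does not say how the Researcher learns from rejections. One simple reading is to track the acceptance rate per proposal kind and use it to rank or suppress future proposals. The sketch below assumes a hypothetical review history of `(kind, approved)` pairs:

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def acceptance_rates(history: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Return a smoothed acceptance rate per proposal kind."""
    approved: Counter = Counter()
    total: Counter = Counter()
    for kind, was_approved in history:
        total[kind] += 1
        if was_approved:
            approved[kind] += 1
    # Add-one (Laplace) smoothing: a kind with few reviews is not pinned to 0.0 or 1.0.
    return {kind: (approved[kind] + 1) / (total[kind] + 2) for kind in total}
```

Kinds with a low rate could be generated less often or pushed to Low Priority in the digest. Using the free-text rejection reasons would need a separate mechanism.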

## MCP Tools for Researcher

**Analysis tools:**
- `search_memories` - Find patterns across memories
- `list_tickets` - Analyze recent work
- `get_diff` - Read recent code changes
- `analyze_code` - Code quality metrics
- `get_test_coverage` - Find untested code

**Proposal tools:**
- `create_proposal` - Store proposal for review
- `bulk_create_proposals` - Store multiple at once

**Human review tools:**
- `list_proposals` - See pending proposals
- `approve_proposal` - Execute and convert to ticket/action
- `reject_proposal` - Reject with reason (Researcher learns)
- `approve_batch` - Approve multiple at once
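The proposal and review tools imply a small state machine: a proposal starts pending and ends approved or rejected. An in-memory sketch of that lifecycle follows. The method names mirror the tool names above, but the signatures, the `ProposalStore` class, and the `StoredProposal` fields are assumptions, and "execute and convert to ticket/action" is reduced to a status change.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class StoredProposal:
    id: int
    title: str
    status: str = "pending"  # pending -> approved | rejected
    rejection_reason: Optional[str] = None


class ProposalStore:
    """Hypothetical in-memory backing store for the proposal and review tools."""

    def __init__(self) -> None:
        self._items: Dict[int, StoredProposal] = {}
        self._next_id = 1

    def create_proposal(self, title: str) -> int:
        proposal_id = self._next_id
        self._items[proposal_id] = StoredProposal(proposal_id, title)
        self._next_id += 1
        return proposal_id

    def bulk_create_proposals(self, titles: List[str]) -> List[int]:
        return [self.create_proposal(title) for title in titles]

    def list_proposals(self, status: str = "pending") -> List[StoredProposal]:
        return [p for p in self._items.values() if p.status == status]

    def _pending(self, proposal_id: int) -> StoredProposal:
        """Fetch a proposal and refuse to review it twice."""
        proposal = self._items[proposal_id]
        if proposal.status != "pending":
            raise ValueError(f"proposal {proposal_id} already {proposal.status}")
        return proposal

    def approve_proposal(self, proposal_id: int) -> None:
        self._pending(proposal_id).status = "approved"

    def reject_proposal(self, proposal_id: int, reason: str) -> None:
        proposal = self._pending(proposal_id)
        proposal.status = "rejected"
        proposal.rejection_reason = reason  # kept so the Researcher can learn from it

    def approve_batch(self, proposal_ids: List[int]) -> None:
        for proposal_id in proposal_ids:
            self.approve_proposal(proposal_id)
```

Only the review methods change status, which matches the guardrail that the Researcher creates proposals but never acts on them without approval.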

## Acceptance Criteria
- [ ] Researcher agent exists and runs 24/7
- [ ] Researcher can read/search memories and find patterns
- [ ] Researcher analyzes tickets for recurring problems
- [ ] Researcher analyzes code for quality issues (duplication, complexity, smells)
- [ ] Researcher identifies test coverage gaps
- [ ] Researcher proposes new features/UX improvements
- [ ] Researcher proposes memory cleanup (merge, delete, update)
- [ ] Proposals stored with reasoning, confidence, evidence links
- [ ] Daily digest view shows all proposals grouped by priority
- [ ] Batch approval workflow (approve/reject multiple at once)
- [ ] Researcher learns from rejections and improves proposal quality
- [ ] 90% junk rate is acceptable - focus on volume + filtering
