[TASK-3-6] save_test_set(test_set)

May 23, 2025 by ADMIN

Overview

The save_test_set ADK tool is the final stage of the data processing pipeline: it persists extracted, structured, and tagged test question data to the Supabase database. This task covers implementing that tool.

Parent Epic

Epic 3: Tool Implementation (ADK Tool Functions)

Background / Why

The save_test_set tool stores question sets so that they can be retrieved later. Per mvp_breakdown.md (line 33), the tool uses the Supabase upsert method to insert or update rows in the database:

[ ] T6 save_test_set(test_set) ― Supabase upsert (onConflict=["part_id","number"])

What to do / How

Fetch and Understand ADK Documentation

Use the MCP tool Context7 to fetch the Agent Development Kit (ADK) documentation.
Understand how Tools and ToolContext work, particularly for interacting with external services like Supabase.

Implement `save_test_set` Tool

- Create/update the Python file for the tool (e.g., questions_extractor_agent/tools/database_tools.py or questions_extractor_agent/tools/save_tool.py, following the tools/**.py convention from mvp_breakdown.md).
- Define the save_test_set(test_set: dict, tool_context: ToolContext) function signature as specified in prd.md (Section 4.2).
  - Input: test_set (a dictionary containing the structured question data ready for DB insertion, conforming to the 7+2 table structure).
  - Input: tool_context (ADK ToolContext).
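A minimal skeleton of the signature, with basic input validation, might look like the sketch below. It assumes test_set is keyed by table name with a list of row dicts per table; that shape is an assumption, since the actual structure is defined in prd.md. `validate_test_set` is a hypothetical helper, and ToolContext is aliased to `Any` so the sketch runs without ADK installed.

```python
from typing import Any

# Assumption: in the real tool this would be the ADK class, typically
# imported with `from google.adk.tools import ToolContext`.
ToolContext = Any


def validate_test_set(test_set: Any) -> list:
    """Return a list of problems; an empty list means the shape looks usable."""
    if not isinstance(test_set, dict) or not test_set:
        return ["test_set must be a non-empty dict"]
    problems = []
    for table, rows in test_set.items():
        # Assumed shape: {"table_name": [ {row}, {row}, ... ]}
        if not isinstance(rows, list) or not all(isinstance(r, dict) for r in rows):
            problems.append(f"{table}: expected a list of row dicts")
    return problems


def save_test_set(test_set: dict, tool_context: ToolContext) -> dict:
    """Skeleton only: validates the input and writes nothing."""
    problems = validate_test_set(test_set)
    if problems:
        return {"status": "error", "message": "; ".join(problems), "rows_upserted": 0}
    # Supabase upserts would run here.
    return {"status": "success", "message": "validated, nothing written", "rows_upserted": 0}
```

Returning an error dict instead of raising keeps malformed input visible to the calling agent in the same format as a successful run.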

Supabase Integration

Utilize the supabase library to connect to the Supabase instance (credentials should be loaded from .env).
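One way to read the credentials is sketched below. The variable names SUPABASE_URL and SUPABASE_KEY and the helper `load_supabase_settings` are assumptions, not names taken from prd.md. In the real tool, python-dotenv's `load_dotenv()` would typically populate the environment from .env first, and the returned pair would be passed to `supabase.create_client(url, key)`.

```python
import os
from typing import Mapping, Optional, Tuple


def load_supabase_settings(env: Optional[Mapping[str, str]] = None) -> Tuple[str, str]:
    """Read the Supabase URL and key, failing early if either is missing."""
    env = os.environ if env is None else env
    url = env.get("SUPABASE_URL")  # assumed variable name
    key = env.get("SUPABASE_KEY")  # assumed variable name
    missing = [
        name
        for name, value in (("SUPABASE_URL", url), ("SUPABASE_KEY", key))
        if not value
    ]
    if missing:
        raise RuntimeError("Missing Supabase settings: " + ", ".join(missing))
    return url, key
```

Accepting an optional mapping makes the function easy to exercise in unit tests without touching the real environment.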
Implement upsert logic for each of the tables defined in prd.md (Section 5. Data Model):
- test_forms
- sections
- parts
- passage_sets
- passages
- questions (implement onConflict=["part_id","number"])
- choices (implement onConflict=["question_id","label"])
- tags
- question_tags
- Ensure data is inserted/updated in the correct order to satisfy foreign key constraints. Consider the relationships between tables (e.g., a question belongs to a part and passage_set).
- Group database operations logically, potentially by passage_set, and perform bulk upserts within database transactions for efficiency and atomicity.
- Adhere to the JSONB metadata size limit (less than 8MB per row) for tables like passage_sets and passages as noted in prd.md.
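The ordering, conflict targets, and size check above could be sketched as follows. The sketch makes several assumptions: test_set is keyed by table name, the JSONB column is called `metadata`, and parent tables are listed before their children in an order that matches the schema. The client is duck-typed; with the Python supabase library the call is `client.table(name).upsert(rows, on_conflict="col_a,col_b").execute()`, where the conflict target is a comma-separated string. `upsert_in_order` and `metadata_too_large` are hypothetical helper names.

```python
import json
from typing import Any

# Parents before children so foreign keys resolve (order is an assumption
# about the schema in prd.md, Section 5).
UPSERT_ORDER = [
    "test_forms", "sections", "parts", "passage_sets", "passages",
    "questions", "choices", "tags", "question_tags",
]

# Conflict targets named in this task; other tables use the client default.
ON_CONFLICT = {
    "questions": "part_id,number",
    "choices": "question_id,label",
}

MAX_METADATA_BYTES = 8 * 1024 * 1024  # "less than 8MB per row"


def metadata_too_large(row: dict) -> bool:
    """Approximate check: size of the JSON text, not the stored JSONB."""
    payload = json.dumps(row.get("metadata") or {}).encode("utf-8")
    return len(payload) >= MAX_METADATA_BYTES


def upsert_in_order(client: Any, test_set: dict) -> int:
    """Bulk-upsert each table in dependency order; return the rows reported back."""
    total = 0
    for table in UPSERT_ORDER:
        rows = test_set.get(table) or []
        if not rows:
            continue
        if table in ("passage_sets", "passages"):
            oversized = [r for r in rows if metadata_too_large(r)]
            if oversized:
                raise ValueError(f"{table}: {len(oversized)} row(s) exceed the metadata limit")
        kwargs = {"on_conflict": ON_CONFLICT[table]} if table in ON_CONFLICT else {}
        response = client.table(table).upsert(rows, **kwargs).execute()
        total += len(response.data or [])
    return total
```

Each upsert here is a separate request, so the sketch is not atomic across tables. Wrapping a whole passage_set in one transaction would likely require a Postgres function invoked through the client's rpc call, which is worth confirming against the Supabase documentation.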

Return Values

The tool must return a dictionary as specified in prd.md (Section 4.2): {"status": "success" | "error", "message": "Descriptive message", "rows_upserted": integer_count}.
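The return shape can be pinned down with a TypedDict, as in this sketch. `SaveResult` and `run_and_report` are hypothetical names; only the three keys and the two status values come from the specification above.

```python
from typing import Callable, Literal, TypedDict


class SaveResult(TypedDict):
    status: Literal["success", "error"]
    message: str
    rows_upserted: int


def run_and_report(write: Callable[[], int]) -> SaveResult:
    """Run a write step and convert its outcome into the documented dict."""
    try:
        count = write()
    except Exception as exc:  # broad on purpose: the tool reports instead of raising
        return {"status": "error", "message": f"Save failed: {exc}", "rows_upserted": 0}
    return {"status": "success", "message": f"Upserted {count} rows", "rows_upserted": count}
```

For example, `run_and_report(lambda: 5)` yields a success dict with rows_upserted set to 5, while a callable that raises yields an error dict with rows_upserted set to 0.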

Unit Testing

Create unit tests in tests/tools/test_save_test_set.py (or similar, matching the tool's file location).
Mock Supabase client interactions using unittest.mock or a similar library.
- Test successful data insertion for all tables.
- Test successful data updates (idempotency) due to onConflict clauses for questions and choices.
- Test correct handling of relationships and foreign keys.
- Test error handling (e.g., database errors, malformed test_set data).
- Verify the structure and content of the return values (status, message, rows_upserted).
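As an illustration of the mocking approach, the sketch below tests a small stand-in function rather than the real tool. `save_questions` is hypothetical and covers only the questions table. The MagicMock chain mirrors the `table(...).upsert(...).execute()` call pattern of the Python supabase client.

```python
from unittest.mock import MagicMock


def save_questions(client, rows):
    """Hypothetical stand-in for the part of save_test_set that writes questions."""
    try:
        response = (
            client.table("questions")
            .upsert(rows, on_conflict="part_id,number")
            .execute()
        )
    except Exception as exc:
        return {"status": "error", "message": str(exc), "rows_upserted": 0}
    return {"status": "success", "message": "ok", "rows_upserted": len(response.data)}


def test_upsert_uses_conflict_target():
    client = MagicMock()
    rows = [{"part_id": 1, "number": 1, "text": "dummy"}]
    # Configure the value returned at the end of the call chain.
    client.table.return_value.upsert.return_value.execute.return_value.data = rows
    result = save_questions(client, rows)
    client.table.assert_called_once_with("questions")
    client.table.return_value.upsert.assert_called_once_with(
        rows, on_conflict="part_id,number"
    )
    assert result == {"status": "success", "message": "ok", "rows_upserted": 1}


def test_database_error_is_reported():
    client = MagicMock()
    client.table.return_value.upsert.return_value.execute.side_effect = RuntimeError(
        "connection refused"
    )
    result = save_questions(client, [{"part_id": 1, "number": 1}])
    assert result["status"] == "error"
    assert result["rows_upserted"] == 0
```

Mock-based tests like these confirm that the conflict target is passed, but not that the database actually deduplicates; idempotency itself is better verified in an integration test against a real instance.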

Acceptance Criteria / AC

The save_test_set tool is implemented as an ADK Tool function.
The tool correctly upserts structured test data into all 9 Supabase tables (test_forms, sections, parts, passage_sets, passages, questions, choices, tags, question_tags).
onConflict constraints for questions (part_id, number) and choices (question_id, label) are correctly implemented and prevent duplicate entries while allowing updates.
The tool handles bulk operations efficiently, ideally grouped by passage_set and within transactions.
The tool returns the specified status, message, and rows_upserted count.
Unit tests for save_test_set are comprehensive and pass, covering successful operations, onConflict behavior, and error handling.
As per roadmap.md (Milestone Week 10), the implementation should support storing 1k dummy questions without duplicates, demonstrating robust onConflict and bulk handling.
Consider and test for pgBouncer connection limits if applicable during integration testing (as noted in mvp_breakdown.md for T6 tests).
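For the 1k dummy question criterion, it may help to deduplicate rows by conflict key before sending them, because PostgreSQL rejects a single ON CONFLICT DO UPDATE statement that would affect the same row twice. The sketch below shows one way to do that and to split the result into batches. `dedupe_by_key`, `chunked`, and the batch size of 250 are illustrative assumptions.

```python
from typing import Iterator, List, Sequence


def dedupe_by_key(rows: Sequence[dict], key_fields: Sequence[str]) -> List[dict]:
    """Keep the last row seen for each conflict key, in first-seen key order."""
    latest = {}
    for row in rows:
        latest[tuple(row[field] for field in key_fields)] = row
    return list(latest.values())


def chunked(rows: Sequence[dict], size: int) -> Iterator[Sequence[dict]]:
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


# 1,050 dummy rows, of which 50 repeat an earlier (part_id, number) pair.
dummy = [{"part_id": 1, "number": n % 1000, "version": n} for n in range(1050)]
unique = dedupe_by_key(dummy, ("part_id", "number"))
batches = list(chunked(unique, 250))  # 250 is an arbitrary batch size
```

Smaller batches also keep individual requests short, which may help when testing against pgBouncer connection limits.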

Predefined Checklist

Code style unified (ruff/black/isort)
Type check (mypy | pyright)
Docstring & comments added, explaining the logic, especially for data mapping and Supabase interactions.
Test cases added (as detailed in "What to do / How" and "Acceptance Criteria").
Documentation updated (if applicable, e.g., notes on data structure expectations for test_set).

Related Materials

PRD: docs/prd.md (especially Sections 3.3, 4.2, 5)
Roadmap: docs/roadmap.md (Phase 4: Supabase I/O, Milestone Week 10)
MVP Breakdown: docs/mvp_breakdown.md (Task T6, line 33)

Q&A: Implementing the save_test_set ADK Tool

Q: What is the purpose of the `save_test_set` ADK tool?

A: The save_test_set ADK tool is responsible for persisting extracted, structured, and tagged test question data into the Supabase database. This is a critical step in the data processing pipeline, enabling the storage and future retrieval of question sets.

Q: What is the expected input for the `save_test_set` tool?

A: The expected input for the save_test_set tool is a dictionary containing the structured question data ready for DB insertion, conforming to the 7+2 table structure, and an ADK ToolContext.

Q: How does the `save_test_set` tool interact with Supabase?

A: The save_test_set tool utilizes the supabase library to connect to the Supabase instance and perform upsert operations on the specified tables.

Q: What are the key features of the `save_test_set` tool?

A: The key features of the save_test_set tool include:

Correctly upserting structured test data into all 9 Supabase tables
Implementing onConflict constraints for questions and choices
Handling bulk operations efficiently, ideally grouped by passage_set and within transactions
Returning the specified status, message, and rows_upserted count

Q: What are the acceptance criteria for the `save_test_set` tool?

A: The acceptance criteria for the save_test_set tool include:

The tool is implemented as an ADK Tool function
The tool correctly upserts structured test data into all 9 Supabase tables
onConflict constraints for questions and choices are correctly implemented
The tool handles bulk operations efficiently
The tool returns the specified status, message, and rows_upserted count
Unit tests for save_test_set are comprehensive and pass

Q: What is the predefined checklist for implementing the `save_test_set` tool?

A: The predefined checklist for implementing the save_test_set tool includes:

Code style unified (ruff/black/isort)
Type check (mypy | pyright)
Docstring & comments added, explaining the logic, especially for data mapping and Supabase interactions
Test cases added (as detailed in "What to do / How" and "Acceptance Criteria")
Documentation updated (if applicable, e.g., notes on data structure expectations for test_set)

Q: What are the related materials for implementing the `save_test_set` tool?

A: The related materials for implementing the save_test_set tool include:

PRD: docs/prd.md (especially Sections 3.3, 4.2, 5)
Roadmap: docs/roadmap.md (Phase 4: Supabase I/O, Milestone Week 10)
MVP Breakdown: docs/mvp_breakdown.md (Task T6, line 33)

Q: What is the expected outcome of implementing the `save_test_set` tool?

A: The expected outcome of implementing the save_test_set tool is a robust and efficient data processing pipeline that can store and retrieve question sets correctly.