Update M5-M7 with new issues #28

Open
opened 2025-12-01 15:53:29 +00:00 by ber · 0 comments
Owner

Milestone 5 – Introduce QuickParse on the in‑memory chunk
Goal: While the chunk bytes are still in memory, run a fast “quick parse” step over them.
M5‑1: Define QuickParse task payload
Decide whether QuickParse:
Receives only refs (chunkId), or
Happens inline in the scrape worker with the raw ByteArray.
For a first step, keep it inline in scrapeChunk (no new queue yet).
M5‑2: Add a stub quickParseInMemory(bytes: ByteArray) function
Takes a buffer and returns either a dummy MatchRef list or a simple count of something.
For now, this can be a placeholder implementation.
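A minimal sketch of what such a stub could look like. The issue names MatchRef but does not define it, so the MatchRef fields, the QuickParseResult wrapper, and the choice to count non-zero bytes are all assumptions made for illustration.

```kotlin
// Hypothetical shape: the issue only mentions "MatchRef" by name.
data class MatchRef(val offset: Int, val length: Int)

// Hypothetical wrapper so the stub can return both a match list and a count.
data class QuickParseResult(val matches: List<MatchRef>, val nonZeroBytes: Int)

// Placeholder: finds no matches yet and only counts the non-zero bytes in the buffer.
fun quickParseInMemory(bytes: ByteArray): QuickParseResult {
    val nonZero = bytes.count { it != 0.toByte() }
    return QuickParseResult(matches = emptyList(), nonZeroBytes = nonZero)
}
```

Returning a small result object rather than Unit keeps the call site stable once real matching replaces the placeholder.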
M5‑3: Call quickParseInMemory inside scrape worker
After reading the chunk:
Call quickParseInMemory.
Log results.
Still mark chunk done at the end.
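The inline flow above could be sketched as follows. Chunk, ChunkStatus, and the function-typed parameters are hypothetical stand-ins for the real scrape worker's types; the reader and parser are passed in only so the sketch stays self-contained.

```kotlin
// Hypothetical chunk model; the real table and columns are not shown in this issue.
enum class ChunkStatus { PENDING, DONE }

class Chunk(val id: Long, var status: ChunkStatus = ChunkStatus.PENDING)

// Reads the chunk, quick-parses the bytes while they are in memory,
// logs the outcome, and marks the chunk done at the end.
fun scrapeChunk(
    chunk: Chunk,
    readBytes: (Chunk) -> ByteArray,
    quickParse: (ByteArray) -> Int,
    log: (String) -> Unit = { println(it) },
) {
    val bytes = readBytes(chunk)
    val result = quickParse(bytes)
    log("chunk ${chunk.id}: quick parse result = $result")
    chunk.status = ChunkStatus.DONE
}
```

This sketch does not decide what should happen when the quick parse throws; whether the chunk is still marked done in that case is left open.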
M5‑4: Add a flag / status to track “quick parsed”
Optionally:
Add a quickParsed boolean or status column to chunks, or
Just reuse status if that still makes sense for now.

Milestone 6 – Basic scan lifecycle & state reflection
Goal: Connect scan status/stage to what the engine is actually doing, ignoring UI details.
M6‑1: Update Scan.stage when chunks are created
After prepareChunks(scanId) finishes:
Set stage to SCRAPING (or similar).
M6‑2: Update Scan.status based on the ScanContext lifecycle
When ScanContext.launch() starts:
Set Scan.status = RUNNING.
When all chunks are DONE:
Set Scan.status = FINISHED.
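One way to express these transitions is as a pure function of whether the context has launched and the current chunk states. The PENDING value, the ChunkState enum, and the function name are assumptions; the issue only names RUNNING, FINISHED, and DONE.

```kotlin
// PENDING is an assumed initial value; RUNNING and FINISHED come from the issue text.
enum class ScanStatus { PENDING, RUNNING, FINISHED }

// Hypothetical chunk state enum; only DONE is named in the issue.
enum class ChunkState { PENDING, DONE }

// Derives the scan status from "has launch() started" and the chunk states.
fun deriveScanStatus(launched: Boolean, chunks: List<ChunkState>): ScanStatus = when {
    !launched -> ScanStatus.PENDING
    // An empty chunk list is treated as "not finished" to avoid a vacuous FINISHED.
    chunks.isNotEmpty() && chunks.all { it == ChunkState.DONE } -> ScanStatus.FINISHED
    else -> ScanStatus.RUNNING
}
```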
M6‑3: Add a simple getScanStatus(scanId) API
In ScanManager or a small repository:
Read and return (status, stage, doneChunks, totalChunks).
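A possible shape for that read path, using in-memory maps in place of the real storage. ScanRecord, ScanStatusView, ScanStatusRepository, and the string-typed status and stage fields are hypothetical names chosen for this sketch.

```kotlin
// Hypothetical models standing in for the persisted scan and chunk rows.
enum class ChunkStatus { PENDING, DONE }

data class ScanRecord(val id: Long, val status: String, val stage: String)

// The tuple from the issue, given a name: (status, stage, doneChunks, totalChunks).
data class ScanStatusView(
    val status: String,
    val stage: String,
    val doneChunks: Int,
    val totalChunks: Int,
)

class ScanStatusRepository(
    private val scans: Map<Long, ScanRecord>,
    private val chunksByScan: Map<Long, List<ChunkStatus>>,
) {
    // Returns null for an unknown scanId rather than throwing.
    fun getScanStatus(scanId: Long): ScanStatusView? {
        val scan = scans[scanId] ?: return null
        val chunks = chunksByScan[scanId].orEmpty()
        val done = chunks.count { it == ChunkStatus.DONE }
        return ScanStatusView(scan.status, scan.stage, done, chunks.size)
    }
}
```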
M6‑4: Small harness: print periodic progress
A loop that every second prints:
status, stage, x / y chunks done.
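A rough sketch of such a harness loop. ProgressSnapshot, the poll callback, and the choice to stop once the status reads FINISHED are assumptions; the issue only specifies the one-second interval and the printed fields.

```kotlin
// Hypothetical snapshot of the values the harness prints.
data class ProgressSnapshot(val status: String, val stage: String, val done: Int, val total: Int)

// Polls for progress and prints one line per tick until the scan reports FINISHED.
// The interval defaults to one second and is a parameter so it can be shortened.
fun printProgress(
    intervalMillis: Long = 1_000,
    out: (String) -> Unit = { println(it) },
    poll: () -> ProgressSnapshot,
) {
    while (true) {
        val p = poll()
        out("${p.status}, ${p.stage}, ${p.done} / ${p.total} chunks done")
        if (p.status == "FINISHED") return
        Thread.sleep(intervalMillis)
    }
}
```

In the real harness, poll would presumably wrap getScanStatus(scanId).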

Milestone 7 – Engine API refinement (towards frontends)
Goal: Have a small, clean engine API that a UI or CLI could call later.
M7‑1: Stabilize core EngineBackend commands
Define a minimal command set:
StartScanNew(target, chunkSize)
StartScanExisting(scanId)
StopScan(scanId)
Implement them in EngineBackend.submit by delegating to ScanManager.
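The command set could be modelled as a sealed hierarchy, roughly as below. The command names and parameters come from the list above; the parameter types, the EngineCommand interface, and the ScanManager method names are assumptions, since the issue does not show the existing signatures.

```kotlin
// Sealed hierarchy so the `when` in submit is checked for exhaustiveness.
sealed interface EngineCommand
data class StartScanNew(val target: String, val chunkSize: Long) : EngineCommand
data class StartScanExisting(val scanId: Long) : EngineCommand
data class StopScan(val scanId: Long) : EngineCommand

// Hypothetical ScanManager surface; the real method names may differ.
interface ScanManager {
    fun startNew(target: String, chunkSize: Long): Long
    fun startExisting(scanId: Long)
    fun stop(scanId: Long)
}

class EngineBackend(private val scanManager: ScanManager) {
    // Pure delegation: no scan logic lives in the backend itself.
    fun submit(command: EngineCommand) {
        when (command) {
            is StartScanNew -> scanManager.startNew(command.target, command.chunkSize)
            is StartScanExisting -> scanManager.startExisting(command.scanId)
            is StopScan -> scanManager.stop(command.scanId)
        }
    }
}
```

With a sealed type, adding a fourth command later produces a compile error in submit until it is handled.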
M7‑2: Expose a domain DTO for scan summary
Define a small data class ScanSummary (id, target, status, stage, progress).
Add fun listScans(): List<ScanSummary> in ScanManager or an engine service.
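A sketch of the DTO and the mapping behind listScans. The field names follow the issue; the field types, the ScanRow source model, and representing progress as a 0.0 to 1.0 fraction are assumptions. The rows are passed in as a parameter here only to keep the sketch self-contained.

```kotlin
// DTO with the fields named in the issue; the types are assumed.
data class ScanSummary(
    val id: Long,
    val target: String,
    val status: String,
    val stage: String,
    val progress: Double, // fraction of chunks done, 0.0..1.0 (assumed representation)
)

// Hypothetical internal row that the summary is built from.
data class ScanRow(
    val id: Long,
    val target: String,
    val status: String,
    val stage: String,
    val doneChunks: Int,
    val totalChunks: Int,
)

fun toSummary(row: ScanRow): ScanSummary {
    // Guard against division by zero for scans whose chunks are not prepared yet.
    val progress = if (row.totalChunks == 0) 0.0 else row.doneChunks.toDouble() / row.totalChunks
    return ScanSummary(row.id, row.target, row.status, row.stage, progress)
}

fun listScans(rows: List<ScanRow>): List<ScanSummary> = rows.map { toSummary(it) }
```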
M7‑3: (Later) Wire back into the UI: list scans + start/stop
Out of scope for now, but future card:
UI calls listScans and submit(StartScanNew/Existing).

Reference
ber/kotlin-ntfs-scraper#28