
Metrics Dashboard

Purpose

Metrics Dashboard is the iteration-level measurement summary. Its unique job is to compare current metric values against explicit baselines, interpret direction and tolerance, and produce a clear decision about improvement, regression, or noise.

It consumes Metric Definitions. It does not redefine metric formulas, command semantics, or labels. It informs the Improvement Backlog but does not decide implementation work by itself.

Example

---
ddx:
  id: example.metrics-dashboard.depositmatch.csv-import
  depends_on:
    - example.metric-definition.depositmatch.csv-import-validation-seconds
---

# Metrics Dashboard: DepositMatch CSV Import Pilot Readiness

**Review Window**: 2026-05-05 - 2026-05-12
**Baseline**: FEAT-001 pre-release benchmark floor from 2026-05-05
**Status**: complete

## Decision

The latest change improved CSV import validation time and remains within the
FEAT-001 performance target. No performance backlog item is required for this
metric. Continue monitoring during pilot rollout.

## Summary

The representative 10,000-row CSV import validation benchmark improved from
5.8 seconds to 4.4 seconds. The current value is below the 5-second target and
outside the 5% noise band, so the result counts as a meaningful improvement.
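The relative change is (5.8 - 4.4) / 5.8, roughly 24%, so the noise band cannot
account for the difference.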

## Metrics Table

| Metric | Baseline | Current | Direction | Result | Source |
|--------|----------|---------|-----------|--------|--------|
| `csv-import-validation-seconds` | 5.8 seconds | 4.4 seconds | lower | pass / improved | `docs/helix/06-iterate/metrics/csv-import-validation-seconds.yaml`; `pnpm metric:csv-import-validation -- --fixture fixtures/import/acme-10000-rows` |

## Interpretation Rules

- Values within 5% of baseline are treated as noise.
- Values above 5 seconds violate the FEAT-001 performance target and create an
  improvement backlog candidate.
- Values below target but worse than baseline by more than 5% create a watch
  item, not an automatic build task (see the sketch after this list).
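
To make the rules concrete, here is a minimal TypeScript sketch of the same
classification logic. The function name, signature, and `Verdict` type are
illustrative assumptions, not part of the DDx toolchain or the metric
definition; the target and tolerance values come from the rules above.

```ts
// Hypothetical helper (not part of the DDx toolchain): classifies one
// lower-is-better metric using the interpretation rules above.
type Verdict = "noise" | "improved" | "backlog-candidate" | "watch";

function classify(
  baseline: number, // e.g. 5.8 seconds
  current: number,  // e.g. 4.4 seconds
  target: number,   // e.g. 5.0, the FEAT-001 performance target
  noiseBand = 0.05, // 5% tolerance
): Verdict {
  // Values above the target violate FEAT-001 and create a backlog candidate.
  if (current > target) return "backlog-candidate";

  const relativeChange = (current - baseline) / baseline; // negative = faster

  // Values within 5% of baseline are treated as noise.
  if (Math.abs(relativeChange) <= noiseBand) return "noise";

  // Below target but worse than baseline by more than 5%: watch item.
  if (relativeChange > noiseBand) return "watch";

  return "improved"; // meaningfully better than baseline and within target
}

console.log(classify(5.8, 4.4, 5.0)); // "improved"
```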

## Trend Notes

- Validation improved after switching row checks to batched normalization before
  persistence.
- No corresponding increase in upload API 5xx rate was observed in staging.
- Pilot rollout should still watch p95 upload response time because production
  file sizes may differ from fixture data.

## Follow-Up

- No immediate improvement issue.
- Continue pilot monitoring through the deployment checklist signals.
- Re-run the benchmark after adding column mapping and row-level validation
  stories.

## Review Checklist

- [x] Baseline is explicit
- [x] Each metric cites a source
- [x] The summary states the decision implication

Reference

Activity: Iterate — Measure, align, and improve. Close the feedback loop back into the planning strand.
Default location: docs/helix/06-iterate/metrics-dashboard.md
Requires: None
Enables: None
Informs: Improvement Backlog
HELIX documents: docs/helix/06-iterate/metrics-dashboard.md
Generation prompt
# Metrics Dashboard Generation Prompt

Document the iteration-level metrics summary used to judge whether the latest changeset improved the system.

## Purpose

Metrics Dashboard is the iteration-level measurement summary. Its unique job is to compare current metric values against explicit baselines, interpret direction and tolerance, and produce a clear decision about improvement, regression, or noise.

It consumes Metric Definitions. It does not redefine metric formulas, command semantics, or labels. It informs the Improvement Backlog but does not decide implementation work by itself.

## Reference Anchors

Use this local resource summary as grounding:

- docs/resources/google-sre-monitoring-distributed-systems.md grounds dashboard summaries as interpreted quantitative signals with clear sources.

## Focus

- Start from the canonical metric definitions in docs/helix/06-iterate/metrics/.
- Compare the current measurement against the previous baseline or committed floor.
- State whether the change improved, regressed, or stayed within noise.
- Include only the metrics needed to support the decision.
- Cite the source of each metric and the measurement command or report.
- Keep raw observability setup and implementation details out of this artifact.

## Boundary Test

| If you are writing… | Put it in… |
|---------------------|------------|
| Metric unit, command, tolerance, and labels | Metric Definition |
| Current-vs-baseline interpretation for one iteration | Metrics Dashboard |
| Prioritized follow-up work | Improvement Backlog |
| Alerting or dashboard implementation details | Monitoring Setup |

## Completion Criteria

- Every metric cited has a source definition and current value.
- The comparison baseline is explicit.
- The conclusion is actionable and easy to hand to the next iteration.
Template
---
ddx:
  id: metrics-dashboard
---

# Metrics Dashboard: [iteration or release]

**Review Window**: [start date - end date]
**Baseline**: [previous iteration, ratchet floor, or benchmark]
**Status**: [draft | complete]

## Decision

[State whether the latest change improved the system and why.]

## Summary

[One concise paragraph that summarizes the result of the measured change.]

## Metrics Table

| Metric | Baseline | Current | Direction | Result | Source |
|--------|----------|---------|-----------|--------|--------|
| [name] | [value] | [value] | [higher/lower] | [pass/fail/noise] | [metric definition or report] |

## Interpretation Rules

- [How tolerance/noise is applied]
- [What creates a follow-up]

## Trend Notes

- [Trend or anomaly]
- [What changed relative to the baseline]

## Follow-Up

- [Tracker issue ID or next step]

## Review Checklist

- [ ] Baseline is explicit
- [ ] Each metric cites a source
- [ ] The summary states the decision implication