41140804 prediction

Score range is sane (no 17-0 etc.)

Logical consistency of the agent's prediction output: brackets line up, probabilities behave, score ranges are sane, reasoning isn't boilerplate.

Plan source

What TestSprite reads

The testing agent reads this JSON, opens the deployed URL in headless Chromium, executes each action step, and evaluates each assertion. The verdict is one of: passed, failed, blocked, or inconclusive.

{
  "projectId": "1ad26753-ee03-4689-8f0f-6fa5d67c5c72",
  "type": "frontend",
  "name": "Prediction consistency — scorelines stay in a sane range",
  "description": "Predicted regulation-time scores should fit in [0, 7] inclusive for each side (no team predicted to score 17 goals). Catches model overflow / unit-confusion bugs.",
  "priority": "p1",
  "metadata": {
    "category": "prediction",
    "stage": "all"
  },
  "planSteps": [
    {
      "type": "action",
      "description": "Navigate to /api/predict?team=BRA"
    },
    {
      "type": "assertion",
      "description": "Verify the predicted_score is in the form 'A-B' where both A and B are integers between 0 and 7 inclusive"
    },
    {
      "type": "action",
      "description": "Navigate to /api/predict?team=GER"
    },
    {
      "type": "assertion",
      "description": "Verify the predicted_score is in the form 'A-B' where both A and B are integers between 0 and 7 inclusive"
    },
    {
      "type": "action",
      "description": "Navigate to /api/predict?team=ARG"
    },
    {
      "type": "assertion",
      "description": "Verify the predicted_score is in the form 'A-B' where both A and B are integers between 0 and 7 inclusive"
    }
  ]
}
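
The assertion repeated in the plan above can be sketched as a small check. This is only an illustration of the rule, not how the testing agent evaluates it: the helper name `score_in_range` is hypothetical, and the sketch assumes `predicted_score` arrives as a plain string such as "2-1". The upper bound of 7 comes from the plan description.

```python
import re

# Assumed format: "A-B" with non-negative integers, e.g. "2-1".
SCORE_PATTERN = re.compile(r"([0-9]+)-([0-9]+)")
MAX_GOALS = 7  # upper bound taken from the plan description


def score_in_range(predicted_score: str, max_goals: int = MAX_GOALS) -> bool:
    """Return True if predicted_score is 'A-B' with 0 <= A, B <= max_goals."""
    match = SCORE_PATTERN.fullmatch(predicted_score.strip())
    if match is None:
        # Wrong separator, negative number, or non-numeric content.
        return False
    side_a, side_b = (int(group) for group in match.groups())
    return side_a <= max_goals and side_b <= max_goals
```

Under these assumptions, a scoreline such as "17-0" is rejected, as is anything that does not match the 'A-B' form at all, for example "2:1".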