c6f684d5 prediction
Probability monotonicity across rounds
Logical consistency of the agent's prediction output: brackets line up, probabilities behave, score ranges are sane, reasoning isn't boilerplate.
Plan source
What TestSprite reads
The testing agent reads this JSON, opens the deployed URL in headless Chromium, executes each action step, and evaluates each assertion. The verdict is one of: passed, failed, blocked, or inconclusive.
{
  "projectId": "1ad26753-ee03-4689-8f0f-6fa5d67c5c72",
  "type": "frontend",
  "name": "Prediction consistency — winner's probability strictly above 50%",
  "description": "For every regulation-time predicted win (no pen / ET suffix), the predicted winner must carry a win_probability strictly greater than 0.5. Catches uncalibrated models that predict 'Team A wins' but report win_probability=0.4 — internally inconsistent.",
  "priority": "p1",
  "metadata": {
    "category": "prediction",
    "stage": "all"
  },
  "planSteps": [
    {
      "type": "action",
      "description": "Navigate to /api/predict?team=BRA"
    },
    {
      "type": "assertion",
      "description": "If the predicted_score shows Brazil winning in regulation (BRA's goal count strictly higher than the opponent's, no 'pen' or 'ET' suffix), verify Brazil's win_probability field is strictly greater than 0.5"
    },
    {
      "type": "action",
      "description": "Navigate to /api/predict?team=ARG"
    },
    {
      "type": "assertion",
      "description": "If the predicted_score shows Argentina winning in regulation, verify Argentina's win_probability is strictly greater than 0.5"
    }
  ]
}
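
The assertion steps in the plan are written in natural language, so it may help to see the same rule as code. The following is a minimal sketch, not the agent's actual implementation. The field names `predicted_score` and `win_probability` come from the plan; the score format (`"2-1"`, optionally followed by a `pen` or `ET` suffix, with the queried team's goals listed first) and the helper names `is_regulation_win` and `check_prediction` are assumptions made for illustration.

```python
import re
from typing import Optional

# ASSUMPTION: scores look like "<team goals>-<opponent goals>" with an
# optional suffix such as "pen" or "ET", e.g. "2-1" or "1-1 (pen)".
# The plan does not show the real response format, so adjust as needed.
_SCORE_RE = re.compile(r"^\s*(\d+)\s*-\s*(\d+)\s*(.*)$")


def is_regulation_win(predicted_score: str) -> bool:
    """True when the queried team outscores the opponent with no pen/ET suffix."""
    match = _SCORE_RE.match(predicted_score)
    if match is None:
        return False
    team_goals = int(match.group(1))
    opponent_goals = int(match.group(2))
    suffix = match.group(3).lower()
    # Any penalty or extra-time marker means this is not a regulation result.
    if "pen" in suffix or "et" in suffix:
        return False
    return team_goals > opponent_goals


def check_prediction(prediction: dict) -> Optional[bool]:
    """Return None when the rule does not apply, otherwise pass (True) or fail (False)."""
    if not is_regulation_win(prediction["predicted_score"]):
        return None
    # Strictly greater than 0.5: exactly 0.5 counts as a failure.
    return prediction["win_probability"] > 0.5
```

Under these assumptions, `check_prediction({"predicted_score": "2-1", "win_probability": 0.4})` returns `False`, which is the inconsistent case the plan's description calls out. A `None` result signals that the prediction is not a regulation win, so the assertion does not apply.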