AI Docs7 min read

Real-Time Surf Data for AI: Why Accuracy and Sourcing Matter

The importance of data quality and sourcing for AI-powered surf forecasting.

The Data Quality Imperative

Surf forecasting is a high-stakes application where data quality directly impacts user outcomes. A surfer who books flights based on AI recommendations is trusting that data to be accurate. This document explains Strike Mission's approach to data quality and why it matters for AI integration.

Data Sources

Primary Sources

Open-Meteo Marine API

  • Provides wave height, period, and direction forecasts
  • Global coverage
  • Updates every 6 hours
  • Based on NOAA's WaveWatch III and ECMWF models
Open-Meteo Weather API
  • Wind speed and direction forecasts
  • Temperature, precipitation
  • Same update frequency as marine data

Secondary/Validation Sources

Stormglass API

  • Multiple model aggregation
  • Used for validation and gap-filling
  • Limited daily queries (rate-limited)
NOAA Buoy Data
  • Real-time observations (not forecasts)
  • Ground truth for calibration
  • Limited geographic coverage

How Forecasts Are Generated

Model Chain

  • Global numerical weather prediction (GFS, ECMWF)
  • Wave model runs (WaveWatch III, WAM)
  • Regional downscaling
  • API aggregation (Open-Meteo)
  • Strike Mission processing
  • Our Processing

  • Fetch forecast data for spot coordinates
  • Aggregate hourly data to daily summaries
  • Apply spot-specific scoring algorithm
  • Calculate confidence adjustments
  • Cache results for performance
  • Accuracy Considerations

    Temporal accuracy

    • Days 1-3: Generally reliable
    • Days 4-5: Good directional guidance
    • Days 6-7: Trend indication only
    • Days 8-10: Speculative

    Spatial accuracy

    • Open ocean: High accuracy
    • Near coastline: Moderate (refraction effects)
    • Complex bathymetry: Lower (local effects dominate)

    Variable accuracy

    • Swell height: ±20-30%
    • Swell direction: ±10-15 degrees
    • Swell period: ±1-2 seconds
    • Wind: Highly variable, especially local effects

    Why This Matters for AI

    Compounding errors

    When AI systems make recommendations based on forecast data:
  • Model uncertainty in the forecast
  • Algorithm uncertainty in the score
  • AI interpretation uncertainty
  • User decision uncertainty
  • Each layer adds potential for error. AI assistants must communicate this chain of uncertainty.

    Liability considerations

    When an AI recommends a trip based on forecast data:
    • Never guarantee conditions
    • Always note forecast limitations
    • Recommend verification closer to travel date
    • Suggest flexible booking options

    Data Freshness

    Strike Mission update cycle

    • Forecasts refresh every 6 hours
    • Dashboard data cached for 6 hours
    • Spot metadata rarely changes
    • Buoy data (where available) updates hourly

    AI caching considerations

    • Don't cache forecast data for long periods
    • Spot characteristics can be cached longer
    • Always fetch fresh data for trip decisions
    • Note data timestamp in responses

    Handling Data Anomalies

    Missing data

    Some coordinates may have gaps in coverage. Handle gracefully:
    • Note when data is unavailable
    • Suggest alternative nearby spots
    • Don't interpolate forecast values

    Outliers

    Occasionally models produce unrealistic values:
    • Sanity check extreme values
    • Compare against historical ranges
    • Flag suspicious data to users

    Conflicting models

    Different forecast models may disagree:
    • Strike Mission uses primarily Open-Meteo (single source)
    • When adding sources, note consensus/divergence
    • Don't average conflicting forecasts

    Ground Truth Integration

    Buoy data

    When available, real-time buoy observations provide:
    • Current swell conditions (actual, not forecast)
    • Validation of forecast accuracy
    • Early warning of arriving swells

    User reports

    Future enhancement: incorporating user-reported conditions
    • Complements model data
    • Provides local context
    • Requires verification/filtering

    Best Practices for AI

    Always cite source

    "According to Strike Mission's forecast, updated 3 hours ago..."

    Acknowledge uncertainty

    "The 7-day outlook suggests..., though this far out conditions frequently change."

    Recommend verification

    "Check back in 2-3 days for a more reliable forecast."

    Avoid false precision

    Say "4-6 foot faces" not "4.7 foot faces."

    Trust but verify

    If user reports conflict with data, acknowledge both perspectives.

    The Future of Surf Forecasting

    Emerging improvements

    • Higher resolution models
    • AI/ML ensemble methods
    • Satellite observation integration
    • Crowdsourced ground truth

    Limitations that persist

    • Chaotic atmosphere (butterfly effect)
    • Local bathymetry effects
    • Wind variability at small scales
    • Tidal interaction complexity
    Even with improvements, surf forecasting will always involve uncertainty. AI assistants that communicate this honestly will build more trust than those that overpromise accuracy.