Six Degrees of the World Cup

Find the shortest teammate chain between any two WC2026 players — every hop proven by club + season. Plus rosters, an 11k-edge interactive graph, leaderboards, and a test-verified derivation.

Description

What it is

SquadGraph: Six Degrees of the World Cup is a zero-backend explorer of the WC2026 "played together" graph. The 1,248 players at the tournament are linked through the club dressing rooms they shared, season by season: 11,035 edges, 8,998 of them crossing national-team lines. For a fan or journalist, it instantly answers the questions the brief poses: who was at this club in this season?, how is Player X connected to Player Y?, and which clubs forged the strongest cross-nation bonds of this World Cup? The signature feature is the Degrees of Separation path finder: pick any two players and get the shortest teammate chain, with every hop labeled with the exact club and season, so each link is auditable against the source data, not just asserted.

How it maps to the brief & rubric

  • Data accuracy and coverage — the canonical v1.0 dataset is used byte-for-byte (committed into the repo, SHA-256 recorded in the README; fetched from the pinned CDN commit). No fabricated enrichment. The in-app Data & Gaps page surfaces every documented limit from gaps.json (8 history-less players, 437 dateless memberships dropped, ±1-season boundary stints), so coverage gaps are visible, not hidden.
  • Graph correctness — the derivation implements the dataset's edge rule verbatim: group players by (club_id, season), joining on Wikidata QIDs only, never club names (PSG senior Q483020 stays distinct from its youth academy Q2945336). A 12-test Vitest suite (npm test) asserts the brief's published baselines: 1,248 players / 1,578 clubs / ~11k edges, the PSG 2023-24 trio (Vitinha, Nuno Mendes, Gonçalo Ramos) with João Neves correctly excluded until 2024-25, per-edge evidence soundness against raw stints, dedup, and parity with the brief's reference pairwise derivation.
  • Query and visualization usefulness — the core club-season roster query is front-and-center with typeahead search over all 1,578 clubs, and all four stretch goals are shipped: degrees of separation with explained hops, an interactive force-directed graph with filters (nation, club country as league proxy, season era, cross-nation-only), and strongest-connection leaderboards.
  • Code quality — TypeScript end-to-end; one small pure-function module (src/lib/derive.ts) holds the entire graph logic, shared by the build script, the tests, and the browser; clean page/component layout; no dead dependencies.
  • Write-up clarity — the README covers setup in three commands, the exact derivation rules, a verified-baselines table, architecture, and trade-offs; this page mirrors it.
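The edge rule described under "Graph correctness" can be sketched roughly as follows. This is a minimal illustration, not the contents of src/lib/derive.ts: the `Stint`, `Player`, and `Edge` shapes and the `deriveEdges` name are assumptions made for the example, and the real dataset schema may differ.

```typescript
// Hypothetical shapes for illustration; the real dataset schema may differ.
interface Stint {
  clubId: string; // Wikidata QID, e.g. "Q483020"
  season: string; // e.g. "2023-24"
}
interface Player {
  id: string;
  stints: Stint[];
}
interface Edge {
  a: string;
  b: string;
  evidence: Stint[]; // every club-season the pair shared
}

function deriveEdges(players: Player[]): Edge[] {
  // 1. Group player ids by (club QID, season). Club names are never consulted.
  const rooms = new Map<string, { stint: Stint; members: Set<string> }>();
  for (const p of players) {
    for (const s of p.stints) {
      const key = `${s.clubId}|${s.season}`;
      let room = rooms.get(key);
      if (!room) {
        room = { stint: { clubId: s.clubId, season: s.season }, members: new Set<string>() };
        rooms.set(key, room);
      }
      room.members.add(p.id); // a Set ignores duplicate stints for the same player
    }
  }

  // 2. Every pair inside a room is an edge; evidence accumulates per pair.
  const edges = new Map<string, Edge>();
  rooms.forEach(({ stint, members }) => {
    const ids = Array.from(members).sort();
    for (let i = 0; i < ids.length; i++) {
      for (let j = i + 1; j < ids.length; j++) {
        const key = `${ids[i]}|${ids[j]}`;
        let edge = edges.get(key);
        if (!edge) {
          edge = { a: ids[i], b: ids[j], evidence: [] };
          edges.set(key, edge);
        }
        edge.evidence.push(stint);
      }
    }
  });
  return Array.from(edges.values());
}
```

Because the grouping key is built from the QID, two players at Q483020 in 2023-24 become teammates, while a player at the academy QID Q2945336 in the same season stays unconnected to them even if the club names look alike.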

Key features

  • Club-Season Query (core requirement) — pick any club + season → every WC2026 player who was in that room, grouped by national team, with cross-nation rooms highlighted and the club's QID shown for verifiability.
  • Degrees of Separation — BFS shortest path between any two players; each hop renders the full evidence list ("teammates at Sporting CP 2020-21"), every club-season clickable through to its roster. Gracefully explains "no connection" (isolated nodes are a documented dataset reality).
  • Interactive Graph — canvas force-directed view of the network; nodes colored and sized by national team and degree, edge tooltips list every shared club-season, click-through to player profiles. Defaults to the readable cross-nation view with the full ~11k-edge hairball behind a toggle.
  • Strongest Connections — leaderboards of player pairs by shared club-seasons (with a cross-nation filter), the single club-seasons hosting the most future WC2026 players, and the clubs that hosted the most players overall.
  • Player profiles — career timeline where every stint links to its club-season roster, plus all WC2026 teammates ranked by bond strength.
  • Data & Gaps transparency page — provenance, derivation rules, the verified-baselines table, and the gaps.json coverage report rendered in-app.
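As a rough sketch of how the Degrees of Separation feature could work, the block below runs a breadth-first search over an adjacency map and returns each hop together with its evidence list. The `Adjacency`, `Hop`, `link`, and `shortestChain` names are hypothetical and chosen for this example; the project's own BFS may be structured differently.

```typescript
// Hypothetical types for illustration only.
type Evidence = { clubId: string; season: string };
type Adjacency = Map<string, Map<string, Evidence[]>>;
interface Hop {
  from: string;
  to: string;
  evidence: Evidence[];
}

// Adds an undirected edge carrying its shared club-seasons.
function link(adj: Adjacency, a: string, b: string, evidence: Evidence[]): void {
  if (!adj.has(a)) adj.set(a, new Map<string, Evidence[]>());
  if (!adj.has(b)) adj.set(b, new Map<string, Evidence[]>());
  adj.get(a)!.set(b, evidence);
  adj.get(b)!.set(a, evidence);
}

// Level-by-level BFS; returns null when the two players are not connected.
function shortestChain(adj: Adjacency, start: string, goal: string): Hop[] | null {
  if (start === goal) return [];
  const prev = new Map<string, string>();
  const seen = new Set<string>([start]);
  let frontier: string[] = [start];
  while (frontier.length > 0) {
    const next: string[] = [];
    for (const node of frontier) {
      const neighbours = adj.get(node);
      if (!neighbours) continue;
      for (const nb of Array.from(neighbours.keys())) {
        if (seen.has(nb)) continue;
        seen.add(nb);
        prev.set(nb, node);
        if (nb === goal) return rebuild(adj, prev, start, goal);
        next.push(nb);
      }
    }
    frontier = next;
  }
  return null;
}

// Walks the predecessor map backwards and attaches evidence to every hop.
function rebuild(adj: Adjacency, prev: Map<string, string>, start: string, goal: string): Hop[] {
  const hops: Hop[] = [];
  let cur = goal;
  while (cur !== start) {
    const p = prev.get(cur);
    if (p === undefined) throw new Error("broken predecessor chain");
    hops.push({ from: p, to: cur, evidence: adj.get(p)?.get(cur) ?? [] });
    cur = p;
  }
  return hops.reverse();
}
```

A `null` result corresponds to the "no connection" case, which the UI explains rather than treating as an error.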

Tech stack & architecture

  • Vite + React + TypeScript — static SPA; types mirror the dataset schema so the club-name-vs-QID trap is unrepresentable in the join code.
  • Build-time derivation: scripts/build-graph.ts (run via tsx on prebuild) converts the pinned data/players.json into compact integer-indexed artifacts in public/derived/: graph.json (players, clubs, seasons, edges with full evidence lists, ~714 KB), leaderboards.json, and a pass-through gaps.json. Artifacts are committed, so the deploy is reproducible and inspectable without running anything.
  • Pure graph core: src/lib/derive.ts contains grouping, edge derivation, adjacency building, and BFS as small pure functions; the same code is exercised by the build script, the Vitest suite, and the browser.
  • force-graph (canvas) — renders 1,200+ nodes / thousands of edges smoothly without WebGL complexity.
  • Vitest — 12 tests pinning the brief's sanity baselines; judges can verify correctness with one command.
  • No backend, no database — the SPA fetches the precomputed JSON once; every query (roster lookup, BFS, filtering) runs client-side in under a millisecond on a 1,248-node graph. A static Vercel deploy has zero runtime failure modes.
  • Minimal hash router — deep links like #/path?from=…&to=… work on any static host with no rewrite config and no router dependency.
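A dependency-free hash router of the kind mentioned in the last bullet might look like the sketch below. `parseHash`, `buildHash`, and the `Route` shape are hypothetical names for this example; only the `#/path?from=…&to=…` link shape comes from the project description.

```typescript
// Hypothetical route shape for illustration.
interface Route {
  path: string;
  params: Record<string, string>;
}

// Splits "#/path?key=value" into a path and decoded query parameters.
function parseHash(hash: string): Route {
  const raw = hash.charAt(0) === "#" ? hash.slice(1) : hash;
  const q = raw.indexOf("?");
  const path = (q === -1 ? raw : raw.slice(0, q)) || "/";
  const params: Record<string, string> = {};
  new URLSearchParams(q === -1 ? "" : raw.slice(q + 1)).forEach((value, key) => {
    params[key] = value;
  });
  return { path, params };
}

// Inverse of parseHash: builds a shareable deep link.
function buildHash(path: string, params: Record<string, string> = {}): string {
  const query = new URLSearchParams(params).toString();
  return query ? `#${path}?${query}` : `#${path}`;
}
```

In a browser, a router like this would typically call `parseHash(window.location.hash)` on load and again on every `hashchange` event. Since the fragment is never sent to the server, any static host serves the same index.html for every route.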

How to try it

Live demo: arena-the-squad-graph-fable.vercel.app — no login, no setup. Things to try:

  • Club-Season Query: search "Paris Saint-Germain", pick 2023-24 — the brief's sanity trio appears (and João Neves doesn't until 2024-25).
  • Degrees of Separation: pick any two players from rival nations and follow the evidence chain; the home page links the strongest cross-nation pair directly.
  • Graph: hover edges for shared club-seasons; filter to a single nation or league country.
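For readers who want to see what the Club-Season Query amounts to, here is a minimal sketch of a roster lookup grouped by national team. The `Player` shape, the `roster` and `isCrossNation` helpers, and the player ids are assumptions for illustration; the four sample rows reflect only the PSG example described above and are not the dataset.

```typescript
// Hypothetical player shape for illustration.
interface Player {
  id: string;
  name: string;
  nation: string;
  stints: { clubId: string; season: string }[];
}

// Returns the players at a club (by QID) in a season, grouped by national team.
function roster(players: Player[], clubId: string, season: string): Map<string, string[]> {
  const byNation = new Map<string, string[]>();
  for (const p of players) {
    const wasThere = p.stints.some((s) => s.clubId === clubId && s.season === season);
    if (!wasThere) continue;
    const names = byNation.get(p.nation) ?? [];
    names.push(p.name);
    byNation.set(p.nation, names);
  }
  return byNation;
}

// A room is "cross-nation" when it hosts players from more than one national team.
function isCrossNation(room: Map<string, string[]>): boolean {
  return room.size > 1;
}

// Illustrative excerpt only; ids are made up and stints are limited to the example.
const sample: Player[] = [
  { id: "p1", name: "Vitinha", nation: "Portugal", stints: [{ clubId: "Q483020", season: "2023-24" }] },
  { id: "p2", name: "Nuno Mendes", nation: "Portugal", stints: [{ clubId: "Q483020", season: "2023-24" }] },
  { id: "p3", name: "Gonçalo Ramos", nation: "Portugal", stints: [{ clubId: "Q483020", season: "2023-24" }] },
  { id: "p4", name: "João Neves", nation: "Portugal", stints: [{ clubId: "Q483020", season: "2024-25" }] },
];

const psg2324 = roster(sample, "Q483020", "2023-24");
```

With this excerpt, `psg2324` holds the trio under "Portugal" and leaves out João Neves, whose only sample stint is 2024-25.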

Code: github.com/layerx-labs/arena-the-squad-graph-fable. Run locally: npm install && npm test && npm run dev.

Challenges & what we learned

  • Evidence-carrying edges without bloat — keeping the full (club, season) evidence on all 11,035 edges naively explodes file size. Integer-indexing players/clubs/seasons and shipping tuple arrays kept graph.json at ~714 KB (≈186 KB over the wire gzipped), preserving full auditability at static-site cost.
  • Real data is dirty — a handful of clubs ship with null names in the source. The derivation falls back to the QID rather than dropping them, so their rosters stay queryable, and the choice is documented in code.
  • Correctness as a product feature — instead of claiming the graph is right, the test suite independently re-derives the edge set with the brief's reference algorithm and asserts parity, and the UI exposes QIDs and evidence everywhere so any judge can spot-check any edge.
  • Visualizing 11k edges — the full graph is an unreadable hairball; defaulting to cross-nation edges (the brief's most interesting slice) with filters made the visualization a tool rather than a screensaver.
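The integer-indexing idea from the first bullet can be illustrated with a small pack/unpack pair. This is a sketch under assumptions: the `PackedGraph` layout, the tuple order, and the helper names are invented for the example and are not claimed to match the actual graph.json format.

```typescript
// Hypothetical shapes for illustration.
interface Evidence {
  clubId: string;
  season: string;
}
interface Edge {
  a: string;
  b: string;
  evidence: Evidence[];
}

// [playerIndex, playerIndex, [clubIndex, seasonIndex][]]
type PackedEdge = [number, number, [number, number][]];
interface PackedGraph {
  players: string[];
  clubs: string[];
  seasons: string[];
  edges: PackedEdge[];
}

// Assigns each distinct string an integer index the first time it is seen.
function makeIndex(): { values: string[]; indexOf(value: string): number } {
  const values: string[] = [];
  const lookup = new Map<string, number>();
  return {
    values,
    indexOf(value: string): number {
      let i = lookup.get(value);
      if (i === undefined) {
        i = values.length;
        values.push(value);
        lookup.set(value, i);
      }
      return i;
    },
  };
}

// Replaces repeated ids and season strings with indexes into shared tables.
function pack(edges: Edge[]): PackedGraph {
  const players = makeIndex();
  const clubs = makeIndex();
  const seasons = makeIndex();
  const packed = edges.map((e): PackedEdge => [
    players.indexOf(e.a),
    players.indexOf(e.b),
    e.evidence.map((ev): [number, number] => [clubs.indexOf(ev.clubId), seasons.indexOf(ev.season)]),
  ]);
  return { players: players.values, clubs: clubs.values, seasons: seasons.values, edges: packed };
}

// Restores the verbose form so every edge remains auditable in the browser.
function unpack(g: PackedGraph): Edge[] {
  return g.edges.map(([a, b, evidence]) => ({
    a: g.players[a],
    b: g.players[b],
    evidence: evidence.map(([c, s]) => ({ clubId: g.clubs[c], season: g.seasons[s] })),
  }));
}
```

Each club QID and season label is stored once, and every edge refers to it by position, so no evidence is lost in the round trip while the repeated strings disappear from the file.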

What's next

  • Edge-weighted paths (prefer chains with more shared seasons) and an "all shortest paths" view.
  • Club ego-graphs: a club's entire WC2026 alumni network over time as an animated timeline.
  • Squad-vs-squad comparison: all cross-links between two national teams ahead of a fixture — a journalist's pre-match brief in one click.
  • Optional enrichment layer (transfer fees, competitions) kept strictly separate from the judged canonical dataset.