DA-ASSOC Cheatsheet — Databricks SQL: Joins, Windows, Dashboards & Quick Rules

Last-mile DA-ASSOC review: high-yield SQL patterns (joins, windows, CTEs), common pitfalls, and Databricks SQL dashboard/alert best practices.

Use this for last‑mile review. Pair it with the Syllabus for coverage and Practice to validate speed.


1) Join correctness (high-yield)

“Row explosion” checklist

  • Is the join key unique on the dimension side?
  • Did you accidentally join on the wrong granularity?
  • Did you filter after a LEFT join in a way that turns it into an INNER join?

Common join types

Join Keeps rows from Typical use
INNER both sides match enrich when match is required
LEFT left side always keep all facts and add dims when present
ANTI left rows without match find “missing” records

2) Window functions (the “analytics power tool”)

 1SELECT
 2  customer_id,
 3  order_ts,
 4  amount,
 5  SUM(amount) OVER (
 6    PARTITION BY customer_id
 7    ORDER BY order_ts
 8    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
 9  ) AS running_total
10FROM orders;

Rule: partition = “grouping”; order = “sequence”; frame = “range of rows.”


3) CTEs (clarity + correctness)

1WITH base AS (
2  SELECT * FROM events WHERE event_date >= current_date() - 7
3)
4SELECT event_type, COUNT(*) AS cnt
5FROM base
6GROUP BY event_type;

4) Dashboard hygiene (trustworthy analytics)

Practice Why it matters
Define metrics clearly reduces misinterpretation
Use consistent filters comparable slices
Show freshness avoids “stale data” surprises
Prefer parameterized queries reusable and safer

5) Fast troubleshooting pickers

  • Wrong counts after join: non-unique dim key or many-to-many join.
  • Window gives unexpected results: missing ORDER BY or wrong frame.
  • Slow query: missing filters, scanning too much data; reduce early and avoid unnecessary DISTINCT.