### ๐ธ Finding the Mismatches in eCommerce Financial Flows Hey there! If you have ever run an online shop, bought something from Shopify, or had to refund a customer, you know that keeping track of every single cent is a massive challenge. In a perfect world, if a customer buys a $10 item and a $5 item, their order invoice says $15. If they cancel the order, they get refunded $15. Simple, right? But in the real world of **high-volume database engineering**, systems can experience partial glitches, unrecorded refunds, double-discounts, or currency round-off discrepancies. These are what we call **"silent revenue leaks."** If you are running an online brand like **SQLMarrow**, a tiny leak of just 1% across 10,000 daily orders can result in thousands of dollars in lost revenue! Today, we're going to act as database financial auditors. We will write an advanced SQL query to compare what customers were *actually charged* (invoice total amount) with what their items *should have cost* (calculated items total). Let's go! --- ### ๐ง The Logic Behind the Audit To perform this check, we need to bring together three different tables in our `shopmart` database: 1. **`users`**: To find out who the customer is. 2. **`orders`**: To get the historical record of what the platform invoiced them (the `total_amount`). 3. **`order_items`**: To drill down into the line items of the order, multiplying `quantity` by `price` to see the true cost. We will combine these using **`INNER JOIN`** statements, aggregate the results using **`GROUP BY`**, and use a special **`HAVING`** clause to filter out order invoices that don't match. --- ### ๐ ๏ธ Breaking Down the Query Step-by-Step Here is how a senior database engineer builds this audit query: #### 1. Calculating the True Itemized Cost Every order can have multiple line items. For each item, we multiply how many were bought (`quantity`) by the price paid (`price`). We then use **`SUM()`** to total them up: ```sql SUM(oi.quantity * oi.price) AS calculated_items_total ``` #### 2. Calculating the Mismatch Amount To find out if there's a difference between what they were charged (`total_amount`) and what the items actually cost, we subtract them. We wrap this calculation in **`ABS()`** (Absolute Value) so that whether the difference is positive or negative, we get a clean absolute number. Finally, we use **`ROUND(..., 2)`** to keep the values looking like standard currency: ```sql ROUND(ABS(o.total_amount - SUM(oi.quantity * oi.price)), 2) AS discrepancy_amount ``` #### 3. Filtering the Aggregated Groups Since we are using an aggregate function (`SUM`), we cannot use a normal `WHERE` clause to filter on the results. Instead, we must use the **`HAVING`** clause. We want to show only discrepancies that are greater than $0.01: ```sql HAVING discrepancy_amount > 0.01 ``` --- ### ๐ก Pro-Tips for SQL Interviews If a FAANG interviewer asks you to design a transactional reconciliation system, impress them by mentioning these two design patterns: * **Pessimistic vs Optimistic Auditing**: A standard audit query runs asynchronously after hours (optimistic). A real-time ledger uses database constraints or triggers to prevent discrepant records from ever being committed (pessimistic). * **The Power of HAVING**: Always remind the interviewer that `WHERE` filters *before* grouping occurs, while `HAVING` filters *after* grouping is calculated. --- ### ๐โโ๏ธ Frequently Asked Questions (AEO - Answer Engine Optimized) #### Why do we use HAVING instead of WHERE to filter the discrepancies? In standard SQL order of execution, the `WHERE` clause filters raw rows before the database groups them. Because our discrepancy relies on an aggregated columnโ`SUM(oi.quantity * oi.price)`โthe final sum isn't calculated yet when `WHERE` runs. The `HAVING` clause runs after grouping and aggregation are complete, which is why it is the only place we can filter aggregate calculations. #### What is the difference between INNER JOIN and LEFT JOIN in a revenue audit? We use `INNER JOIN` here because we are only looking for active order transactions that have both invoice entries and corresponding item entries. If we used a `LEFT JOIN` and an order had zero items associated with it (due to a corrupt database write), the itemized columns would return `NULL`. To catch orphaned orders with zero items, an anti-join using `LEFT JOIN` and `WHERE oi.id IS NULL` is highly effective. #### How does ROUND() prevent floating-point calculation errors in SQLite? Computers store decimal numbers (real/double types) as floating-point values in binary. This can introduce tiny rounding errors (like $74.979999999998 instead of $74.98). Using `ROUND(val, 2)` forces the values to format exactly to two decimal places, preventing false discrepancy alarms on tiny floating-point decimals. --- ๐ **Want to practice more? Head over to our interactive [Lessons](/lessons) or try custom schemas in the [Playground](/playground)!**