MySQL Query Optimization: From 8 Seconds to 80ms
A real optimization story from a CRM dashboard query that was killing the server. The systematic process I used to identify bottlenecks and reduce query time by 99%.
Last year I inherited a CRM dashboard that took 8 seconds to load. Every click felt like submitting a form to the DMV. The culprit was a single query joining six tables and doing a full scan on two of them. Here's the systematic process I used to get it under 80ms — and the mental model I apply to every slow query now.
Step 1: EXPLAIN ANALYZE Is Your Starting Point
Never guess. Start with EXPLAIN to see the planned access paths, then EXPLAIN ANALYZE (MySQL 8.0.18+) to see where the time actually goes. In the tabular EXPLAIN output, look for 'type: ALL' (full table scan), high 'rows' estimates, and 'Using filesort' or 'Using temporary' in the Extra column. Each of these is a specific type of problem with a specific fix. In my case, I found two full scans on tables with 500k+ rows.
EXPLAIN ANALYZE
SELECT c.name, COUNT(o.id) as order_count, SUM(o.total) as revenue
FROM clients c
JOIN orders o ON o.client_id = c.id
JOIN order_items oi ON oi.order_id = o.id
WHERE o.created_at BETWEEN '2024-01-01' AND '2024-12-31'
AND c.region_id = 5
GROUP BY c.id
ORDER BY revenue DESC;
Step 2: Index the Filter and Join Columns
The first fix is almost always indexes. Index the columns in your WHERE, JOIN ON, and ORDER BY clauses — in that priority order. Composite indexes are used left-to-right (prefix order), so put equality-matched columns before range-matched ones. For this query, a composite index on orders(client_id, created_at) and a single index on clients(region_id) cut the scans from full-table to index range.
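The exact DDL depends on your schema, but for the column names in the query above, the two indexes would look like this (the index names here are my own):

```sql
-- Composite index: equality column (client_id) first, range column (created_at) second
CREATE INDEX idx_orders_client_created ON orders (client_id, created_at);

-- Single-column index for the region equality filter
CREATE INDEX idx_clients_region ON clients (region_id);
```

After creating them, re-run EXPLAIN and confirm the orders access changed from type: ALL to type: ref or range.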
Step 3: Eliminate Unnecessary Joins
The original query was joining order_items just to count lines per order, but that count wasn't even in the final SELECT. It had been added during development and never removed. Dropping that join halved the number of rows MySQL had to process. Always verify that every join in a query contributes to the output.
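For reference, here is the same query with the dead order_items join removed — the selected columns are unchanged:

```sql
SELECT c.name, COUNT(o.id) AS order_count, SUM(o.total) AS revenue
FROM clients c
JOIN orders o ON o.client_id = c.id
WHERE o.created_at BETWEEN '2024-01-01' AND '2024-12-31'
  AND c.region_id = 5
GROUP BY c.id
ORDER BY revenue DESC;
```

As a side benefit, dropping the join also fixes a correctness bug: joining order_items duplicated each order once per line item, silently inflating both COUNT(o.id) and SUM(o.total).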
Step 4: Push Aggregations Down with Subqueries
-- Instead of joining then aggregating on a huge result set:
SELECT c.name, o.order_count, o.revenue
FROM clients c
JOIN (
    SELECT client_id,
           COUNT(*) AS order_count,
           SUM(total) AS revenue
    FROM orders
    WHERE created_at BETWEEN '2024-01-01' AND '2024-12-31'
    GROUP BY client_id
) o ON o.client_id = c.id
WHERE c.region_id = 5
ORDER BY o.revenue DESC;
Step 5: Cache the Expensive Result
Even at 80ms, this dashboard query runs every time a user opens the page. With 50 concurrent users, that's 50 queries per second. I added a Redis cache with a 5-minute TTL and invalidation on order creation. Dashboard load time dropped to under 5ms for cached results. The 80ms query now runs at most 12 times per hour per region.
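The caching pattern is cache-aside with TTL plus explicit invalidation. A minimal sketch of that pattern, using an in-process dict as a stand-in for Redis so it stays self-contained (in production this would be Redis GET/SETEX with a 300-second TTL; the function and key names are my own):

```python
import time

CACHE_TTL = 300  # seconds (the 5-minute TTL from the text)
_cache = {}      # key -> (expires_at, value); stand-in for Redis

def get_dashboard_rows(region_id, fetch_from_db):
    """Return dashboard rows for a region, hitting the DB only on a miss."""
    key = f"dashboard:region:{region_id}"
    entry = _cache.get(key)
    if entry and entry[0] > time.monotonic():
        return entry[1]                       # cache hit: no query runs
    rows = fetch_from_db(region_id)           # the 80ms aggregate query
    _cache[key] = (time.monotonic() + CACHE_TTL, rows)
    return rows

def invalidate_region(region_id):
    """Call on order creation so the next read sees fresh numbers."""
    _cache.pop(f"dashboard:region:{region_id}", None)
```

The invalidation hook is what keeps a 5-minute TTL honest: without it, a sales rep who just closed a deal would stare at stale revenue for up to five minutes.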
The Optimization Hierarchy
1. Remove unnecessary joins and columns first (free wins)
2. Add targeted indexes based on EXPLAIN output
3. Restructure the query (subqueries, CTEs) to reduce intermediate result sets
4. Add application-level caching for expensive, rarely-changing results
5. Consider denormalization or materialized views as a last resort