Here’s a concise, interview-focused summary you can study from. Use it like a mini cheat sheet plus strategy guide.


1. Why SQL interviews feel hard

  • Data jobs are high-paying and very competitive, so companies use tough SQL rounds (online tests + live interviews) to filter candidates.
  • Expect timed platforms (HackerRank, Codility, CodeSignal) and on-the-spot questions from FAANG-type companies.
  • You’re evaluated not only on the final answer, but also on clarity of thinking, communication, and how you handle pressure.

2. Core SQL commands you must know

Filtering rows

  • WHERE to filter rows, combined with AND, OR, NOT.
  • LIKE with wildcards (%, _) for pattern matching (e.g., name LIKE 'A%').
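These filters are easy to try end to end; here is a minimal runnable sketch using SQLite via Python's sqlite3 module (the employees table and its rows are invented for illustration):

```python
import sqlite3

# Invented employees table, just to exercise WHERE + LIKE.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees(name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employees VALUES (?, ?)",
                 [("Alice", 90000), ("Anna", 60000), ("Bob", 70000)])

# % matches any run of characters, _ matches exactly one character.
rows = conn.execute(
    "SELECT name FROM employees WHERE name LIKE 'A%' AND salary > 65000"
).fetchall()
print(rows)  # [('Alice',)]
```

Anna starts with 'A' but fails the salary condition, so only Alice survives both predicates.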

Aggregation

  • GROUP BY to turn raw rows into summary rows per group (e.g., per city, per department).
  • Combine with aggregate functions: MIN, MAX, COUNT, SUM, AVG.

Example pattern (very common in interviews):

SELECT department_id,
       AVG(salary) AS avg_salary
FROM employees
GROUP BY department_id;

Joins (absolutely essential)

Know what each does conceptually and be able to explain with examples:

  • INNER JOIN: rows that match in both tables.
  • LEFT JOIN: all rows from left, matching from right (or NULL).
  • RIGHT JOIN: opposite of left.
  • FULL OUTER JOIN: all rows from both, match where possible, else NULLs.

Also:

  • SELF JOIN: joining a table to itself (e.g., employees and their managers, or generating combinations like 3-topping pizzas).
  • CROSS JOIN: Cartesian product, all combinations (e.g., all coffee types × all sizes).
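To see a CROSS JOIN produce every combination, here is a small runnable sketch using SQLite via Python's sqlite3 (the coffee_types and cup_sizes tables are made up for the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE coffee_types(name TEXT);
CREATE TABLE cup_sizes(size TEXT);
INSERT INTO coffee_types VALUES ('latte'), ('mocha');
INSERT INTO cup_sizes VALUES ('S'), ('M'), ('L');
""")

# CROSS JOIN needs no ON clause: every row pairs with every row.
menu = conn.execute(
    "SELECT c.name, s.size FROM coffee_types c CROSS JOIN cup_sizes s"
).fetchall()
print(len(menu))  # 6 = 2 coffee types x 3 sizes
```

The result size is always the product of the two row counts, which is why an accidental cross join (a missing ON clause) blows up so quickly on large tables.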

Set operators (how they differ from joins):

  • UNION / UNION ALL: stack rows from two queries vertically.
  • INTERSECT: common rows between two result sets.
  • EXCEPT / MINUS: rows in first set but not in second.

Be able to explain: “JOINs combine columns horizontally by matching keys; UNION/INTERSECT/EXCEPT combine rows vertically with same columns.”
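A runnable way to compare the set operators side by side, again using SQLite through sqlite3 (tables a and b are toy data; note SQLite spells the difference operator EXCEPT, while Oracle uses MINUS):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a(x INTEGER);
CREATE TABLE b(x INTEGER);
INSERT INTO a VALUES (1), (2), (3);
INSERT INTO b VALUES (2), (3), (4);
""")

q = lambda sql: conn.execute(sql).fetchall()
print(q("SELECT x FROM a UNION ALL SELECT x FROM b ORDER BY x"))  # 6 rows, duplicates kept
print(q("SELECT x FROM a UNION     SELECT x FROM b ORDER BY x"))  # [(1,), (2,), (3,), (4,)]
print(q("SELECT x FROM a INTERSECT SELECT x FROM b ORDER BY x"))  # [(2,), (3,)]
print(q("SELECT x FROM a EXCEPT    SELECT x FROM b ORDER BY x"))  # [(1,)]
```

All four require the two SELECTs to have the same number of compatible columns, which is exactly the "combine rows vertically" constraint above.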

Window functions (commonly tested for data roles)

Operate over a “window” of rows without collapsing them like GROUP BY:

  • Ranking: ROW_NUMBER(), RANK(), DENSE_RANK().
  • Offsets: LAG(), LEAD().
  • Running totals, moving averages with SUM() OVER (...), etc.

Typical pattern:

SELECT employee_id,
       department_id,
       salary,
       RANK() OVER (
           PARTITION BY department_id
           ORDER BY salary DESC
       ) AS salary_rank
FROM employees;

3. Join & set-operator interview patterns

Common join questions

You may be asked to:

  • Explain differences between join types, maybe with Venn-diagram-style reasoning.
  • Write queries combining multiple joins across several tables (users, orders, products, etc.).
  • Do SELF JOIN questions: e.g., generate all 3-topping pizza combinations using the toppings table joined to itself multiple times.
  • Use CROSS JOIN: generate all combinations of two lists (e.g., coffee_type × cup_size).

Be very clear on:

  • Why a FULL OUTER JOIN is not the same as UNION ALL.
    • FULL OUTER JOIN aligns matching rows into one row; UNION ALL just stacks results, even duplicates, without joining.
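The difference is easy to demonstrate. Below is a runnable sketch using SQLite via Python's sqlite3; because older SQLite versions lack FULL OUTER JOIN, the full join is emulated with the classic LEFT JOIN plus anti-joined right side (tables l and r are toy data):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE l(id INTEGER, lval TEXT);
CREATE TABLE r(id INTEGER, rval TEXT);
INSERT INTO l VALUES (1, 'a'), (2, 'b');
INSERT INTO r VALUES (2, 'B'), (3, 'C');
""")

# FULL OUTER JOIN (emulated): left join, plus right-side rows with no match.
# Matching ids are ALIGNED into a single row.
full = conn.execute("""
    SELECT l.id, l.lval, r.id, r.rval
    FROM l LEFT JOIN r ON l.id = r.id
    UNION ALL
    SELECT l.id, l.lval, r.id, r.rval
    FROM r LEFT JOIN l ON l.id = r.id
    WHERE l.id IS NULL
""").fetchall()
# Three rows (order may vary); (2, 'b', 2, 'B') carries columns from BOTH tables.

# UNION ALL just STACKS the two tables; id = 2 shows up twice, never aligned.
stacked = conn.execute("""
    SELECT id, lval AS val FROM l
    UNION ALL
    SELECT id, rval FROM r
""").fetchall()
print(len(full), len(stacked))  # 3 4
```

The full join yields 3 rows because the id = 2 rows collapse into one; UNION ALL yields 4 because nothing is matched, only stacked.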

Performance intuition (high level)

For more advanced roles you should know:

  • JOINs can be expensive on large tables; sometimes denormalizing (pre-combining into a wider table) improves performance.
  • Avoid SELECT * in large/critical queries; explicitly select needed columns.
  • Remove unnecessary joins and subqueries.

4. Example: “Top 3 cities by completed trades”

This is a classic pattern: join → filter → group → order → limit.

Given:

  • trades(trade_id, user_id, status, ... )
  • users(user_id, city, ...)

Goal: top 3 cities by number of completed trades.

Steps:

  1. Join trades with users:
FROM trades t
JOIN users u    -- use `JOIN` or `INNER JOIN` keywords
  ON t.user_id = u.user_id
  2. Filter to completed trades:
WHERE t.status = 'COMPLETED'
  3. Group by city and count:
SELECT u.city,
       COUNT(*) AS completed_trades
FROM trades t
JOIN users u              -- or INNER JOIN users u
  ON t.user_id = u.user_id
WHERE t.status = 'COMPLETED'
GROUP BY u.city
  4. Order and take top 3:
ORDER BY completed_trades DESC
LIMIT 3;
Pattern: join → filter → group → order → limit

Goal: count completed trades per city.
Problem: trades doesn’t have city; users has it.
Solution: join trades with users on user_id to combine the data.

So without a join, you cannot know which trade belongs to which city.

Memorize this pattern; many questions are variants of it.
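To verify the assembled query end to end, here is a runnable sketch that loads toy users and trades rows into SQLite via Python's sqlite3 (the sample data is invented; counts were chosen to be distinct so the ordering is deterministic):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users(user_id INTEGER, city TEXT);
CREATE TABLE trades(trade_id INTEGER, user_id INTEGER, status TEXT);
INSERT INTO users VALUES (1, 'NYC'), (2, 'NYC'), (3, 'SF'), (4, 'Boston'), (5, 'Austin');
INSERT INTO trades VALUES
    (10, 1, 'COMPLETED'), (11, 1, 'COMPLETED'), (12, 2, 'COMPLETED'),
    (13, 3, 'COMPLETED'), (14, 3, 'COMPLETED'),
    (15, 4, 'COMPLETED'), (16, 5, 'CANCELLED');
""")

top3 = conn.execute("""
    SELECT u.city, COUNT(*) AS completed_trades
    FROM trades t
    JOIN users u ON t.user_id = u.user_id
    WHERE t.status = 'COMPLETED'
    GROUP BY u.city
    ORDER BY completed_trades DESC
    LIMIT 3
""").fetchall()
print(top3)  # [('NYC', 3), ('SF', 2), ('Boston', 1)]
```

Note that Austin drops out entirely: its only trade fails the WHERE filter before grouping ever happens, which is the join → filter → group order in action.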


5. Six‑step framework for hard SQL questions

Use this on every non-trivial problem:

  1. Understand the question clearly

    • Restate in your own words.
    • Ask clarifying questions (definitions, inclusions/exclusions, edge cases).
  2. Identify relevant tables and columns

    • Ignore extra tables/columns that are not needed.
    • List exactly what columns you’ll use and how (join keys, filters, aggregations).
  3. Consider edge cases

    • NULLs.
    • Non-unique data.
    • Ties (e.g., what if two cities have same count for 3rd place?).
    • Days with no data, users with no orders, etc.
  4. Break the problem into sub-problems

    • Think in steps: first filter, then aggregate, then rank, etc.
    • Each step should be something you could run as a stand-alone query.
  5. Solve the sub‑problems with SQL

    • Start writing code only now.
    • Use CTEs (WITH ... AS (...)) or nested subqueries to isolate each step.
  6. Test and refine

    • Run each sub-step, check intermediate outputs.
    • Verify row counts, sample values, edge cases.

You can literally narrate this framework during the interview to show systematic thinking.


6. Example of breaking down a complex question

Problem: “Find the top 2 highest-grossing items in each category for year 2022.”

Given: product_spend(product_id, category, spend, txn_date, ...).

Thinking process (before coding):

  • “Top 2” per category → rank products within each category by total spend.
  • “Highest grossing” → need total spend per product per category → SUM(spend) with GROUP BY category, product_id.
  • “For 2022” → filter rows by year in a WHERE clause.

Step-by-step plan:

  1. Filter to 2022: WHERE YEAR(txn_date) = 2022 (MySQL/SQL Server; in Postgres use EXTRACT(YEAR FROM txn_date) = 2022).
  2. Group by category, product_id and SUM(spend) to get total per product.
  3. Use RANK() (or DENSE_RANK()) over PARTITION BY category ORDER BY total_spend DESC.
  4. Filter to ranks 1 and 2.

Sketch:

WITH product_totals AS (
    SELECT category,
           product_id,
           SUM(spend) AS total_spend
    FROM product_spend
    WHERE YEAR(txn_date) = 2022  -- MySQL/SQL Server; Postgres: EXTRACT(YEAR FROM txn_date)
    GROUP BY category, product_id
),
ranked AS (
    SELECT category,
           product_id,
           total_spend,
           RANK() OVER (
               PARTITION BY category
               ORDER BY total_spend DESC
           ) AS rnk
    FROM product_totals
)
SELECT category,
       product_id,
       total_spend
FROM ranked
WHERE rnk <= 2;
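The sketch uses MySQL-style YEAR(); a runnable SQLite version (via Python's sqlite3, with invented sample rows) swaps in strftime('%Y', ...) and confirms the top-2 logic, including that out-of-year rows are excluded:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_spend(product_id INTEGER, category TEXT, spend REAL, txn_date TEXT);
INSERT INTO product_spend VALUES
    (1, 'A', 60, '2022-01-05'), (1, 'A', 40, '2022-06-01'),
    (2, 'A', 80, '2022-03-10'), (3, 'A', 50, '2022-04-02'),
    (4, 'B', 200, '2022-02-14'), (5, 'B', 150, '2022-09-09');
INSERT INTO product_spend VALUES (1, 'A', 999, '2021-12-31');  -- outside 2022: must be excluded
""")

top2 = conn.execute("""
    WITH product_totals AS (
        SELECT category, product_id, SUM(spend) AS total_spend
        FROM product_spend
        WHERE strftime('%Y', txn_date) = '2022'  -- SQLite stand-in for YEAR(txn_date) = 2022
        GROUP BY category, product_id
    ),
    ranked AS (
        SELECT category, product_id, total_spend,
               RANK() OVER (PARTITION BY category
                            ORDER BY total_spend DESC) AS rnk
        FROM product_totals
    )
    SELECT category, product_id, total_spend
    FROM ranked
    WHERE rnk <= 2
    ORDER BY category, total_spend DESC
""").fetchall()
print(top2)  # [('A', 1, 100.0), ('A', 2, 80.0), ('B', 4, 200.0), ('B', 5, 150.0)]
```

Product 1 totals 100 rather than 1099 because the 2021 row is filtered out before aggregation, exactly the order of operations the step-by-step plan calls for.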

7. Example: odd vs even measurements by day

Problem: For each day, sum the values of odd-numbered and even-numbered measurements separately.

General strategy:

  1. Explore data with SELECT * and check columns: measurement_id, value, measurement_time.
  2. Clarify edge cases: what if a day has zero measurements?

Steps:

  1. Assign a sequence number to each measurement within each day using ROW_NUMBER().
  2. Mark rows as odd or even using modulo (row_num % 2).
  3. Group by day and odd/even flag and sum values, then pivot if needed.

Sketch:

WITH numbered AS (
    SELECT DATE(measurement_time) AS measurement_date,
           value,
           ROW_NUMBER() OVER (
               PARTITION BY DATE(measurement_time)
               ORDER BY measurement_time
           ) AS rn
    FROM measurements
),
tagged AS (
    SELECT measurement_date,
           value,
           CASE WHEN rn % 2 = 1 THEN 'odd' ELSE 'even' END AS parity
    FROM numbered
)
SELECT measurement_date,
       SUM(CASE WHEN parity = 'odd'  THEN value END) AS odd_sum,
       SUM(CASE WHEN parity = 'even' THEN value END) AS even_sum
FROM tagged
GROUP BY measurement_date;

This pattern (assign row numbers, classify, then aggregate) is very reusable.
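Here is a runnable version of the pattern using SQLite via sqlite3; it folds the two CTEs into one by testing rn % 2 directly inside the conditional sums (the measurements rows are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE measurements(measurement_time TEXT, value REAL);
INSERT INTO measurements VALUES
    ('2024-01-01 08:00', 10), ('2024-01-01 09:00', 20), ('2024-01-01 10:00', 30),
    ('2024-01-02 08:00', 5),  ('2024-01-02 09:00', 7);
""")

daily = conn.execute("""
    WITH numbered AS (
        SELECT DATE(measurement_time) AS measurement_date,
               value,
               ROW_NUMBER() OVER (PARTITION BY DATE(measurement_time)
                                  ORDER BY measurement_time) AS rn
        FROM measurements
    )
    SELECT measurement_date,
           SUM(CASE WHEN rn % 2 = 1 THEN value END) AS odd_sum,
           SUM(CASE WHEN rn % 2 = 0 THEN value END) AS even_sum
    FROM numbered
    GROUP BY measurement_date
    ORDER BY measurement_date
""").fetchall()
print(daily)  # [('2024-01-01', 40.0, 20.0), ('2024-01-02', 5.0, 7.0)]
```

On 2024-01-01 the measurements are numbered 1, 2, 3 within the day, so odd_sum is 10 + 30 and even_sum is 20; the numbering restarts for each new date thanks to PARTITION BY.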


8. What to do when you’re stuck

Behavior in tough moments can still impress the interviewer.

  • For conceptual questions you half-remember:
    • Start with what you do know, then reason forwards (e.g., explain WHERE clearly, then attempt to contrast with HAVING).
  • For topics you’ve never heard of:
    • Admit it honestly, ask for a quick explanation or to confirm terminology: “I haven’t used that term; is it similar to X?”
  • For coding questions:
    • Think out loud. Explain your plan, assumptions, and partial attempts.
    • This shows problem-solving ability and can earn partial credit or hints.
  • If you forget exact syntax or a dialect-specific function:
    • Politely ask if you can look up syntax or pseudo-code the idea: most interviewers care more about logic than commas and parentheses.

Key idea: don’t freeze; communicate.


9. From good to great at SQL interviews

Practical habits that move you into the top tier:

  • Time yourself when solving practice problems to simulate pressure.
  • Do lots of questions (not just 10–20). Aim for tens to hundreds across different patterns (joins, windows, date logic, tricky filters).
  • After each problem, reflect:
    • What pattern did this use (top‑k per group, running total, self join, date filter, etc.)?
    • How does it relate to others you’ve done?
  • Study multiple solutions to the same problem (CTEs vs subqueries, join vs window function, etc.).
    • Learn pros/cons: readability, performance, portability.

10. Advanced topics to cover later

Once you’re comfortable with basics:

  1. Writing clean SQL

    • Good alias names, consistent style, indentation.
    • Use CTEs for clarity; know when a subquery is simpler.
  2. Database design basics

    • Explain tables for a simple app (e.g., social media, e‑commerce):
      • Main entities, primary keys, foreign keys, how tables relate.
      • Which fields to index (IDs used in joins, frequently filtered columns).
  3. Query performance optimization (high level)

    • Avoid SELECT * in serious queries.
    • Don’t add unnecessary joins or calculations.
    • Understand that indexing and denormalizing can speed up frequent reads.

11. Suggested study plan with these notes

If you use this document to study, here’s a simple loop:

  1. Review sections 2–4 and rewrite key patterns from memory (especially joins, group‑by, and window examples).
  2. Pick any random SQL problem and force yourself through the 6‑step framework in section 5.
  3. After solving, categorize the problem type (e.g., “top‑k per group with RANK()”).
  4. Repeat daily with a timer and keep a list of patterns you’ve seen.

If you want, paste your own practice query attempts next time and I can help you refine them and add more interview-style commentary.

Generate all unique 3-topping pizza combinations using data from a single table.

To generate all unique 3-topping combinations from a single table, you need to use self-joins. The trick to ensuring the combinations are unique (meaning “Pepperoni, Mushroom, Olive” is treated the same as “Olive, Mushroom, Pepperoni”) is to join the table on itself using an inequality operator like <.

Assuming your table is named Toppings and has a column named ToppingName, here is the SQL query to get the job done:

The SQL Query

SELECT 
    t1.ToppingName AS Topping1, 
    t2.ToppingName AS Topping2, 
    t3.ToppingName AS Topping3
FROM Toppings t1
JOIN Toppings t2 ON t1.ToppingName < t2.ToppingName
JOIN Toppings t3 ON t2.ToppingName < t3.ToppingName;

Why this works

  • Self-Joins: We alias the same table three times (t1, t2, and t3) to represent each slot on the pizza.
  • The < Operator: This is the secret sauce. By enforcing that Topping1 < Topping2 < Topping3, we do two things:
    1. Prevent duplicates: it stops a topping from appearing more than once in a combination (e.g., no “Pepperoni, Pepperoni, Pepperoni”).
    2. Ensure uniqueness: it forces an alphabetical order. Since “Mushroom” comes before “Pepperoni,” the query will only ever output Mushroom, Pepperoni and never the reverse.

Example Output

If your table had Pepperoni, Mushroom, Onions, and Sausage, the result would look like this:

| Topping1 | Topping2  | Topping3  |
|----------|-----------|-----------|
| Mushroom | Onions    | Pepperoni |
| Mushroom | Onions    | Sausage   |
| Mushroom | Pepperoni | Sausage   |
| Onions   | Pepperoni | Sausage   |
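The query and its output can be checked directly; here is a runnable sketch using SQLite via Python's sqlite3 with those four toppings:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Toppings(ToppingName TEXT)")
conn.executemany("INSERT INTO Toppings VALUES (?)",
                 [("Pepperoni",), ("Mushroom",), ("Onions",), ("Sausage",)])

combos = conn.execute("""
    SELECT t1.ToppingName, t2.ToppingName, t3.ToppingName
    FROM Toppings t1
    JOIN Toppings t2 ON t1.ToppingName < t2.ToppingName
    JOIN Toppings t3 ON t2.ToppingName < t3.ToppingName
    ORDER BY 1, 2, 3
""").fetchall()
for combo in combos:
    print(combo)
# 4 rows: each 3-topping combination appears exactly once, alphabetically ordered
```

With 4 toppings you get C(4,3) = 4 rows; without the < conditions a triple self-join would return 4 x 4 x 4 = 64 rows full of repeats and reorderings.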

You can adapt this query to include the total price of each combination if you have a cost column.

My apologies for the extra commentary! I hear you—you need a compact, high-density study sheet for quick review.

Here is the technical breakdown in a “cheat sheet” format, optimized for fast reading.


🛡️ DBA 1 Interview Fast-Track (PA State Police)

I. Database Development (60%)

| Topic         | The “Must-Say” Technical Detail                                                      |
|---------------|--------------------------------------------------------------------------------------|
| Normalization | Use 3NF to eliminate redundancy; mention Foreign Key constraints for data integrity.   |
| Data Modeling | Start with ER Diagrams (Logical), then map to Physical tables and schemas.             |
| T-SQL         | Use set-based logic over cursors; use Stored Procedures for security and execution plan reuse. |
| ACID          | Atomicity, Consistency, Isolation, Durability—the foundation of reliable transactions. |

II. BI & Reporting (20%)

  • SSIS (ETL): Mention using Data Flow Tasks, Lookups, and Error Redirection to handle bad data without failing the whole job.
  • Power BI: Focus on the Star Schema (Fact and Dimension tables) for the most efficient report performance.
  • Gateway: Use the On-premises Data Gateway to keep PSP data secure behind the firewall while viewing reports.

III. Ops & Security (15%)

  • Performance: Check DMVs (like sys.dm_exec_requests) for blocking; use Clustered Indexes for primary lookups.
  • Recovery: Full Backups + Log Backups = Point-in-time recovery (essential for 24/7 law enforcement data).
  • Security: Use Role-Based Access Control (RBAC); never give “SA” rights to applications.
  • Azure: Be familiar with SQL Managed Instance (MI) for cloud scalability.

IV. The “State Police” Angle

  • High Availability: Mention Always On Availability Groups—if one server dies, the other takes over instantly.
  • Integrity: Law enforcement data is legal evidence; Data Audit Trails and Triggers are vital for tracking changes.
  • Availability: Willingness to be On-Call. If a dispatch system goes down at 3:00 AM, the DBA is the first responder.

💡 Quick Mock Q&A

Q: A query is slow. How do you fix it?
A: “I check the Execution Plan to find ‘Index Scans’ that should be ‘Index Seeks.’ I check for Parameter Sniffing or fragmented indexes.”

Q: Why use a Stored Procedure instead of raw SQL?
A: “Security (prevents SQL Injection), better performance (cached execution plans), and easier maintenance.”

Would you like me to condense this even further into a bulleted “Checklist” you can read 10 minutes before the interview?

Got it — you’re looking for a structured interview preparation sheet where each topic from the DBA 1 job description is turned into an example interview question and sample answer you can study from. Below is a full, ready-to-use guide you can review before your interview.


Technical Preparation Guide: DBA 1 – Pennsylvania State Police

Core SQL Server & Database Development (60%)

Normalization

Key Concepts: 1NF, 2NF, 3NF; reducing redundancy; enforcing data integrity; dependency preservation.

Example Question:
“How do you approach designing a new database schema from scratch?”

Sample Answer:

I start by gathering all business requirements to understand the entities and relationships involved. Then I create an ER diagram and apply normalization up to the third normal form to remove redundancy and ensure data consistency. For instance, I separate repeating groups into different tables and use foreign keys to maintain relationships. Finally, I review the design with end users and developers to ensure it supports business workflows efficiently without unnecessary complexity.


Data Modeling

Key Concepts: Logical vs. Physical design, ER Diagrams, Primary/Foreign Keys, data types, indexing for performance.

Example Question:
“Walk us through your process for creating a data model for a new application.”

Sample Answer:

I begin with a logical model that captures entities, attributes, and relationships independent of physical constraints. Once approved, I translate it into a physical model by defining key constraints, data types, and indexing strategies in SQL Server. I often use ERwin or Visio to visualize and document the model. The final model supports scalability, performance, and data integrity through well-defined primary and foreign key constraints.


T-SQL Development

Key Concepts: Stored Procedures, Views, Triggers, Query Optimization, Error Handling.

Example Question:
“What are some best practices you follow when writing stored procedures?”

Sample Answer:

I always start with clear and modular procedure designs, parameterized inputs, and proper error handling using TRY…CATCH blocks. I include comments for maintainability, use transactions for data consistency, and leverage SET NOCOUNT ON to reduce unnecessary traffic. Before deployment, I examine execution plans to confirm indexes are being used efficiently and optimize joins or subqueries where needed.


Data Integrity

Key Concepts: Constraints (CHECK, NOT NULL, UNIQUE), Transactions, Referential Integrity, ACID properties.

Example Question:
“How do you ensure data remains consistent across multiple related tables during a batch update?”

Sample Answer:

I use explicit transactions to group dependent operations and ensure atomicity. If one step fails, the transaction rolls back, maintaining consistency. I also rely on foreign key constraints for referential integrity and include validation logic in triggers or stored procedures when necessary. Additionally, I test for concurrency issues to ensure isolation in multi-user environments.


Data Analytics & Power BI (20%)

SSIS – ETL & Data Migration

Key Concepts: Extract, Transform, Load processes; package automation; error handling; logging.

Example Question:
“Describe your experience building SSIS packages for ETL.”

Sample Answer:

I design SSIS packages that pull data from multiple sources, transform it using Data Flow tasks, and load it into target tables or data warehouses. I implement error handling through event handlers and logging to record failures and performance metrics. I also use configuration files or parameters to make packages portable and easy to maintain across environments.


Power BI

Key Concepts: Data modeling, relationships, visualization best practices, workspace and gateway management.

Example Question:
“How do you manage Power BI reports and dashboards in a secure environment?”

Sample Answer:

I build optimized data models in Power BI Desktop, define proper relationships, and use measures in DAX for calculation. For publishing, I manage workspaces with role-based access control and configure the Power BI Gateway for scheduled refreshes. I work closely with stakeholders to ensure reports reflect real-time operational needs while maintaining compliance with data security policies.


DAX & Power Query

Key Concepts: Data shaping, custom columns, calculated measures, performance tuning in reports.

Example Question:
“How do you leverage DAX and Power Query in your reporting solutions?”

Sample Answer:

In Power Query, I clean and shape raw data to ensure consistency before loading it into the model. Then I use DAX to create calculated columns and measures that support dynamic reporting. This combination lets me design dashboards that are both interactive and performant, minimizing the need for complex queries on the source database.


Database Operations & Security (15%)

Performance Tuning

Key Concepts: Query execution plans, indexes (clustered vs. non-clustered), statistics, DMVs, performance baselining.

Example Question:
“How do you diagnose and improve slow SQL queries?”

Sample Answer:

I start by reviewing the execution plan to identify bottlenecks such as table scans or missing indexes. I also check wait statistics and use DMVs like sys.dm_exec_query_stats for insights. Based on findings, I might add or adjust indexes, rewrite queries for efficiency, and ensure statistics are up to date. I always validate improvements by comparing baseline performance metrics before and after tuning.


Backup & Recovery

Key Concepts: Full, Differential, Log backups; RPO/RTO; recovery modes; testing restore procedures.

Example Question:
“Can you explain your approach to database backup and recovery planning?”

Sample Answer:

I design backup strategies based on business RPO and RTO goals. For transactional systems, I configure full backups nightly, differential backups every few hours, and transaction log backups at frequent intervals. I regularly test restores in non-production environments to ensure reliability. Documenting the process is critical so the team can execute recovery quickly under stress.


Security

Key Concepts: RBAC, TDE (Transparent Data Encryption), SQL Injection prevention, security audits.

Example Question:
“How do you secure a SQL Server database in a multi-user environment?”

Sample Answer:

I apply role-based access to follow the principle of least privilege. I encrypt data at rest with TDE and use parameterized queries to prevent SQL injection. I schedule regular audits using SQL Server Audit to track permission changes and suspicious activity. For compliance, I also coordinate with the security team to align with organizational and state data protection standards.


Azure and Cloud Database Management

Key Concepts: Azure SQL Managed Instances, security configuration, monitoring, backups in the cloud.

Example Question:
“What’s your experience with Azure-based database systems?”

Sample Answer:

I have worked with Azure SQL Managed Instances for hosting and migrating on-prem databases. I configure geo-replication for high availability, manage security through Azure AD authentication, and use Azure Monitor for performance insights. Automated backups and retention policies help meet compliance and disaster recovery needs while minimizing manual effort.


Critical Operations Context

Example Question:
“How would you handle a critical database outage in a law enforcement environment?”

Sample Answer:

In a critical outage, I remain calm and follow a structured incident response plan. My first step is to identify the cause using monitoring tools or logs. I coordinate with the team to restore service using backups or failover mechanisms while communicating regularly with stakeholders. Once service is restored, I document the incident, implement preventive measures, and review performance thresholds to avoid recurrence.


Would you like me to format this into a one-page printable interview prep sheet (with columns for topic, question, and answer), or keep it in this detailed text version for studying digitally?