Data Analyst Interview Questions & Answers Guide (2026)
A hiring-manager’s interview kit for data analysts — with specific “what to look for” notes on every answer, red flags to watch, and a practical test.
Key facts
- Role: Data Analyst
- Technical questions: 14
- Behavioral: 7
- Role-fit: 5
- Red flags: 8
- Practical test: Included
How to use this guide
Pick 4-6 technical questions across difficulties, 2-3 behavioral, and 1-2 role-fit for a 45-minute interview. For senior roles, weight harder technical and role-fit higher. Always close with the practical test so you are hiring on evidence, not impressions. The “what to look for” notes are a scoring rubric: strong answers touch most points, weak answers miss them or replace them with platitudes.
Technical questions — Easy
1. Explain the difference between filtering a LEFT JOIN in the WHERE clause versus in the ON clause. Why does it matter?
What to look for: A filter in WHERE turns a LEFT JOIN into an effective INNER JOIN when the filter references the right-side table. A filter in ON preserves left rows with NULL on the right. Classic gotcha; if they do not know this, they have written bad joins.
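A minimal sketch of the gotcha, run against SQLite via Python's sqlite3; the users/orders tables and their columns are invented for illustration:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE users (id INTEGER, name TEXT);
    CREATE TABLE orders (user_id INTEGER, status TEXT);
    INSERT INTO users VALUES (1, 'ada'), (2, 'bob');
    INSERT INTO orders VALUES (1, 'paid');  -- bob has no orders
""")

# Filter in WHERE: rows where orders.status IS NULL are discarded,
# so the LEFT JOIN behaves like an INNER JOIN and bob disappears.
where_rows = con.execute("""
    SELECT u.name FROM users u
    LEFT JOIN orders o ON o.user_id = u.id
    WHERE o.status = 'paid'
    ORDER BY u.name
""").fetchall()

# Filter in ON: the condition is applied during the join, before
# unmatched left rows are preserved, so bob survives with NULLs.
on_rows = con.execute("""
    SELECT u.name, o.status FROM users u
    LEFT JOIN orders o ON o.user_id = u.id AND o.status = 'paid'
    ORDER BY u.name
""").fetchall()

print(where_rows)  # [('ada',)]
print(on_rows)     # [('ada', 'paid'), ('bob', None)]
```

A strong candidate can produce both variants on a whiteboard and say which row counts each returns.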
2. A stakeholder asks for "users who are active." How do you respond?
What to look for: Pushes back with three questions: what action counts as active, what time window, and what decision this will drive. Does not just write SELECT COUNT(DISTINCT user_id) and hope for the best.
3. When would you choose Mixpanel/Amplitude over writing SQL against the event table in the warehouse?
What to look for: Product analytics tools for PM self-serve, funnels, cohorts, and fast iteration on event schemas. Warehouse SQL for revenue reconciliation, joining to CRM, ML features, and auditable numbers. Should know both have a role.
Technical questions — Medium
1. Write a SQL query that returns, for each user, the date of their first purchase and the date of their third purchase, or NULL if they have fewer than three.
What to look for: ROW_NUMBER() window function partitioned by user_id and ordered by purchase_date, then conditional aggregation. Should handle NULLs cleanly and not accidentally filter out users with fewer than three purchases.
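One common shape of a correct answer, sketched in SQLite via sqlite3; the purchases table and its sample rows are invented:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE purchases (user_id INTEGER, purchase_date TEXT);
    INSERT INTO purchases VALUES
        (1, '2026-01-01'), (1, '2026-01-05'), (1, '2026-01-09'),
        (2, '2026-01-02');  -- user 2 has fewer than three purchases
""")

rows = con.execute("""
    WITH ranked AS (
        SELECT user_id,
               purchase_date,
               ROW_NUMBER() OVER (
                   PARTITION BY user_id ORDER BY purchase_date
               ) AS rn
        FROM purchases
    )
    SELECT user_id,
           MAX(CASE WHEN rn = 1 THEN purchase_date END) AS first_purchase,
           MAX(CASE WHEN rn = 3 THEN purchase_date END) AS third_purchase
    FROM ranked
    GROUP BY user_id   -- aggregation keeps users with < 3 purchases
    ORDER BY user_id
""").fetchall()

print(rows)  # [(1, '2026-01-01', '2026-01-09'), (2, '2026-01-02', None)]
```

The conditional aggregation is the key move: a `WHERE rn IN (1, 3)` filter alone would still keep user 2, but a `WHERE rn = 3` filter would silently drop them.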
2. A dashboard shows revenue up 40% week-over-week and the CEO is asking why. Walk me through how you would investigate.
What to look for: Check data freshness first (was a pipeline backfilled?), then slice by product/segment/geo/channel, check for one-time events (bulk order, refund reversal), reconcile against Stripe or the source of truth, and decompose the move into order count versus AOV. Real analysts check the pipeline before celebrating.
3. Design an A/B test for a new checkout flow. What is your primary metric, guardrails, sample size, and how long do you run it?
What to look for: Primary metric tied to the decision (conversion rate), guardrails (refund rate, support tickets, AOV), MDE and power analysis to get sample size, and a run of at least one full business cycle (usually 7-14 days) to control for weekly seasonality. Should mention peeking / sequential testing risks.
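A back-of-envelope version of the power analysis a candidate should be able to do, using the standard two-proportion normal approximation; the 5% baseline and 6% target are illustrative numbers, not from the question:

```python
import math

def sample_size_per_arm(p_base, p_target, z_alpha=1.96, z_beta=0.8416):
    """Approximate n per arm for alpha=0.05 (two-sided) and 80% power."""
    variance = p_base * (1 - p_base) + p_target * (1 - p_target)
    return math.ceil((z_alpha + z_beta) ** 2 * variance
                     / (p_target - p_base) ** 2)

# Detecting a lift from 5% to 6% conversion (a 20% relative MDE)
n = sample_size_per_arm(0.05, 0.06)
print(n)  # roughly 8,000+ users per arm -- small lifts need big samples
```

Candidates who reach for this formula (or its statsmodels equivalent) unprompted are signaling they size tests before launching them, not after.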
4. What is the difference between rolling retention and bounded retention, and when would you use each?
What to look for: Bounded retention = returned on day N exactly. Rolling / unbounded = returned on day N or later. Bounded is stricter and shows true habit; rolling smooths sparse usage and is better for weekly-use products. Should not confuse this with cohort vs classical retention.
5. Write a SQL query that calculates 7-day rolling active users on an events table with (user_id, event_date).
What to look for: Self-join on a date spine, or a window frame covering the preceding 6 days plus the current day, with COUNT(DISTINCT user_id). Watch for COUNT(DISTINCT) over windows not being universally supported (Snowflake supports it; Redshift historically did not).
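A sketch of the date-spine self-join answer in SQLite via sqlite3; the events table is invented, and for brevity the spine here is built from distinct event dates (a production answer would use a generated calendar so zero-event days still appear):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE events (user_id INTEGER, event_date TEXT);
    INSERT INTO events VALUES
        (1, '2026-03-01'), (2, '2026-03-03'), (1, '2026-03-09');
""")

rows = con.execute("""
    WITH spine AS (SELECT DISTINCT event_date AS d FROM events)
    SELECT s.d,
           COUNT(DISTINCT e.user_id) AS rolling_7d_active
    FROM spine s
    JOIN events e
      ON e.event_date BETWEEN date(s.d, '-6 days') AND s.d
    GROUP BY s.d
    ORDER BY s.d
""").fetchall()

print(rows)  # [('2026-03-01', 1), ('2026-03-03', 2), ('2026-03-09', 2)]
```

The self-join sidesteps the COUNT(DISTINCT)-over-windows limitation entirely, which is why it is the portable answer across warehouses.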
6. What does it mean for a metric to have "survivorship bias", and how would you spot it in a retention chart?
What to look for: Later cohorts look better because only engaged users survived to be measured. Fix by anchoring at cohort start, not the current date. Should give a concrete example, like "churn rate among active users" being meaningless.
7. We A/B tested a feature. Variant A: conversion 5.1%. Variant B: 5.3%. n=1,000 per arm. Is B better?
What to look for: Almost certainly not statistically significant at that sample size; should estimate the MDE or run the proportion test mentally. Red flag: calls B the winner without checking significance.
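The mental math a strong candidate is doing, written out as a pooled two-proportion z-test on the question's numbers:

```python
import math

n = 1000
p_a, p_b = 0.051, 0.053
pooled = (p_a * n + p_b * n) / (2 * n)          # 0.052
se = math.sqrt(pooled * (1 - pooled) * (2 / n))  # standard error of the diff
z = (p_b - p_a) / se
print(round(z, 2))  # about 0.2 -- far below the 1.96 needed at alpha=0.05
```

At n=1,000 per arm a 0.2-point gap is pure noise; detecting it reliably would take tens of thousands of users per arm.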
8. How do you handle time zones in a SQL query when event timestamps are UTC and the business reports on America/New_York?
What to look for: CONVERT_TIMEZONE in Snowflake, AT TIME ZONE in Postgres/Redshift. Aware of DST issues. Stores raw UTC, converts at report time. Red flag: shifts by a fixed offset and breaks twice a year.
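Why the fixed-offset shortcut breaks, shown with Python's stdlib zoneinfo: the same UTC hour lands on different New York wall-clock hours in winter (EST, UTC-5) and summer (EDT, UTC-4), so any hardcoded `- INTERVAL '5 hours'` is wrong for half the year:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

ny = ZoneInfo("America/New_York")

# Noon UTC on a winter date vs a summer date
winter = datetime(2026, 1, 15, 12, 0, tzinfo=timezone.utc).astimezone(ny)
summer = datetime(2026, 7, 15, 12, 0, tzinfo=timezone.utc).astimezone(ny)

print(winter.hour, summer.hour)  # 7 8 -- noon UTC is 7am EST but 8am EDT
```

A named IANA zone ("America/New_York") carries the DST rules; a fixed offset does not, which is exactly the bug the red flag describes.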
Technical questions — Hard
1. Walk me through how you would build a marketing attribution model when GA4, Meta Ads Manager, and our warehouse all show different conversion numbers.
What to look for: Pick one source of truth (usually the warehouse), reconcile UTM tagging, understand that platform-reported conversions include view-through and modeled conversions, and explain last-touch vs first-touch vs MTA trade-offs. Should be comfortable saying "they will never match exactly".
2. In LookML (or dbt), you discover that two derived metrics which should match disagree by 3%. How do you debug?
What to look for: Check the grain of the underlying models, look for fan-out from joins, check for filters on one side missing from the other, check NULL handling, check currency and timezone. Should trace lineage, not guess.
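Join fan-out is the classic culprit, so it is worth probing whether the candidate can demonstrate it. A minimal illustration in SQLite via sqlite3, with invented orders/items tables: joining order-grain revenue to a multi-row items table silently duplicates the revenue rows before the SUM.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE orders (order_id INTEGER, revenue INTEGER);
    CREATE TABLE items  (order_id INTEGER, sku TEXT);
    INSERT INTO orders VALUES (1, 100);
    INSERT INTO items  VALUES (1, 'a'), (1, 'b');  -- two items, one order
""")

correct = con.execute("SELECT SUM(revenue) FROM orders").fetchone()[0]
fanned = con.execute("""
    SELECT SUM(o.revenue)
    FROM orders o JOIN items i ON i.order_id = o.order_id
""").fetchone()[0]

print(correct, fanned)  # 100 200 -- the join doubled the revenue row
```

A candidate who checks row counts before and after each join in the lineage will find a 3% gap like this quickly; one who eyeballs the final numbers will not.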
3. Explain the bias-variance trade-off in the context of a simple forecasting model (not ML theory).
What to look for: A simpler model (trend + seasonality) has more bias and less variance: stable, but it misses recent changes. A complex model overfits recent noise. Practical answer: use Prophet/statsmodels with cross-validation, not "add more features."
Behavioral questions
1. Tell me about a time a stakeholder asked for a number that would have been misleading. What did you do?
What to look for: Pushed back with a better framing, explained the trap, landed on a number the stakeholder trusted. Did not just deliver the wrong thing because it was asked for.
2. Walk me through an analysis you shipped that changed a business decision.
What to look for: Specific story: the question, the data, the finding, the decision that changed. Can tell you the number. Not "I built a dashboard and people looked at it."
3. Describe a time you found a bug in a dashboard that was already in use. How did you handle the communication?
What to look for: Told stakeholders proactively, quantified the error, fixed the root cause in the model not the viz, followed up with a short write-up. Did not quietly patch it.
4. How do you prioritize when you have 15 ad-hoc requests and 3 planned projects?
What to look for: Has a triage system — impact on decision, blocker status, estimated hours. Pushes back on low-value asks, batches similar requests, protects project time.
5. Tell me about a time you disagreed with a product manager about what the data was saying.
What to look for: Brought the query, walked through the logic, stayed curious, changed their mind or got changed. Not "I was right, they were wrong."
6. Describe the messiest data quality problem you have investigated.
What to look for: Specific: duplicate records, double-counted orders, timezone bug, broken UTM. Traced to root cause, coordinated a fix, put a test in place to catch it next time.
7. How do you keep your skills current — SQL patterns, new BI tools, statistical techniques?
What to look for: Concrete sources: newsletters, books, communities, side projects. Not "I watch YouTube sometimes."
Role-fit questions
1. Why analytics specifically, not data engineering or data science?
What to look for: Has a real answer about enjoying stakeholder work, business problems, or translation between data and decisions. Not "I could not get a DS job."
2. Our BI tool is [Looker / Tableau / Metabase]. Have you used it, and if not, how would you ramp?
What to look for: Honest answer. If they have not used it, a concrete plan — docs, a small rebuild of an existing dashboard, pair with a senior.
3. How do you feel about owning recurring weekly business reviews where the exec team is in the room?
What to look for: Comfortable presenting, knows how to handle hostile questions, can defend a number live. Not panicked by live scrutiny.
4. What does a good working relationship with a PM look like to you?
What to look for: Shared backlog, early involvement in feature specs to define success metrics, honest reads on tests even when the PM is rooting for a win.
5. What would make you leave this role in a year?
What to look for: Honest signals — being used only for ad-hoc pulls with no strategic work, no path to senior, no connection to decisions. Tells you what motivates them.
Red flags
Any one of these is usually reason to pass, especially when combined with weak answers elsewhere.
- Writes SELECT * JOIN without thinking about grain, gets row-count surprises, and shrugs.
- Cannot explain what a p-value actually means.
- Calls an A/B test "significant" without mentioning sample size or test duration.
- Builds dashboards nobody asked for and measures success by dashboard count.
- Does not push back on vague requests, just churns out one-off pulls.
- Confuses correlation with causation in a live discussion.
- Has never reconciled two data sources that disagreed.
- Says "the data shows X" without being able to walk you through the query.
Practical test
3-hour take-home using a provided e-commerce dataset (orders, users, events, marketing spend). Deliverables: (1) a SQL script that computes weekly new-customer revenue, returning-customer revenue, and blended CAC by channel for the last 12 weeks; (2) a 1-page Looker Studio or Metabase dashboard sharing the top finding; (3) a 300-word written recommendation to the hypothetical CMO. We grade on SQL correctness (30%), analytical judgment (30%), visualization clarity (20%), and written communication (20%). Bonus for catching the seeded data quality bug in the orders table.
Scoring rubric
Score each answer 1-4: (1) Misses most of the rubric or gives platitudes; (2) Hits some points but cannot go deep when pressed; (3) Covers the rubric and can defend the answer under follow-ups; (4) Adds unprompted nuance, trade-offs, or real examples beyond the rubric. Hire at an average of 3.0+ across technical, behavioral, and role-fit, with zero red flags, and a pass on the practical test.
Written by Syed Ali
Founder, Remoteria
Syed Ali founded Remoteria after a decade building distributed teams across 4 continents. He has helped 500+ companies source, vet, onboard, and scale pre-vetted offshore talent in engineering, design, marketing, and operations.
- 10+ years building distributed remote teams
- 500+ successful offshore placements across US, UK, EU, and APAC
- Specialist in offshore vetting and cross-timezone team integration
Last updated: April 12, 2026