There are four main job titles in data, and they describe four genuinely different jobs. If you are thinking about a career in data but the titles all blur together, you are not missing something — the industry does a poor job of explaining what each role actually does. This post is the overview nobody gave you.
Start with a single analogy
Picture a high-end restaurant. Getting a meal from raw ingredients to a diner's plate requires different types of work done in sequence. No two of the people involved do the same job, but the restaurant only functions when all of them show up.
The Data Engineer is the person who builds the kitchen itself. They install the equipment, set up the plumbing, negotiate with suppliers, and make sure ingredients arrive reliably. Without them, nobody can cook.
The Analytics Engineer does the prep work — what chefs call mise en place. They wash, portion, and organise the raw ingredients into clean, labelled containers that any chef can pick up and use immediately. They bridge the gap between raw delivery and ready-to-cook.
The Data Analyst is the front-of-house manager who reads every customer review, studies the order history, and tells the kitchen: “the pasta is outselling everything, but three tables complained about the seasoning.” They turn the restaurant's numbers into decisions.
The Data Scientist is the experimental chef who studies food science, runs controlled tests, and predicts: “if we launch this new dish at this price, here is what our revenue will look like next quarter.” They use patterns in past data to forecast the future.
One kitchen. Four roles. All necessary, all distinct. Now let's go deeper on each one.
The Data Analyst
The data analyst's job is to answer business questions with data. A sales team wants to know which product is selling best this quarter. A marketing team wants to understand which campaign is converting. An operations team wants to know where orders are getting delayed. The analyst turns these questions into queries, pulls the numbers, and communicates what the data says.
Day to day, this means writing SQL to query databases, building dashboards in tools like Tableau or Power BI, investigating anomalies (“why did revenue drop last week?”), and presenting findings to non-technical stakeholders in plain language. The job requires more translation than most people expect — translating a business question into a data question, and then translating the answer back into something a non-analyst can act on.
The core skills are SQL (non-negotiable), data visualisation, and clear written and verbal communication. Most data analysts use a BI tool like Tableau, Power BI, or Looker. Some use Python for more complex analysis, but many do not need it regularly.
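To make the "business question to data question" translation concrete, here is a minimal sketch of the "which product is selling best this quarter?" question expressed as SQL. It runs the query through Python's built-in sqlite3 module so it is self-contained. The sales table, its columns, and the figures are all invented for illustration; a real analyst would run something similar against the company warehouse.

```python
import sqlite3

# Hypothetical sales data; table name, columns, and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, quarter TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("pasta", "2024-Q1", 1200.0),
        ("pizza", "2024-Q1", 900.0),
        ("pasta", "2024-Q1", 800.0),
        ("salad", "2024-Q1", 300.0),
        ("pizza", "2023-Q4", 5000.0),  # earlier quarter, excluded by the filter
    ],
)

# Business question: which product is selling best this quarter?
# Data question: total revenue per product, filtered to one quarter, sorted.
query = """
    SELECT product, SUM(revenue) AS total_revenue
    FROM sales
    WHERE quarter = ?
    GROUP BY product
    ORDER BY total_revenue DESC
"""
ranking = conn.execute(query, ("2024-Q1",)).fetchall()

# Translate the answer back into plain language for the sales team.
best_product, best_revenue = ranking[0]
print(f"Best seller this quarter: {best_product} ({best_revenue:.0f} in revenue)")
```

The query itself is the easy part. Deciding that "selling best" means revenue rather than units sold, and that "this quarter" means a specific date range, is where the translation work happens.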
This is the most accessible entry point into the data field. A background in business, economics, finance, marketing, or any field that involved working with spreadsheets and asking “what do the numbers say?” translates well. It is not a software engineering role — but it does require a genuine comfort with numbers and structured thinking.
The Data Engineer
Before anyone can analyse data, that data has to be collected, moved, stored, and made queryable. When your company's sales system, CRM, payment processor, and marketing platform all generate data separately, someone needs to build the pipelines that pull all of it together into one place. That person is the data engineer.
Day to day, this means writing Python code to build ETL pipelines (Extract, Transform, Load — the process of pulling data from sources, processing it, and storing it), setting up and maintaining cloud infrastructure, scheduling jobs with orchestration tools like Airflow, and ensuring that pipelines run reliably and recover cleanly when something goes wrong. The job is closer to software engineering than to analysis — it is about building systems, not answering questions.
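The extract, transform, load steps described above can be sketched in a few lines of Python. This is a toy version under stated assumptions: the CSV text stands in for a real source system, an in-memory SQLite database stands in for the warehouse, and the function and table names are hypothetical. A production pipeline would add scheduling, logging, and retries on top.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from a source system (stands in for an API or file drop).
RAW_CSV = """order_id,customer,amount
1,alice,19.99
2,bob,not_a_number
3,carol,5.00
"""

def extract(raw_text):
    """Extract: read rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(raw_text)))

def transform(rows):
    """Transform: cast types and set aside rows that cannot be parsed."""
    clean, rejected = [], []
    for row in rows:
        try:
            clean.append((int(row["order_id"]), row["customer"], float(row["amount"])))
        except ValueError:
            rejected.append(row)  # keep bad rows for inspection instead of crashing
    return clean, rejected

def load(conn, rows):
    """Load: write rows to the target table; re-running with the same ids is safe."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
clean_rows, rejected_rows = transform(extract(RAW_CSV))
load(conn, clean_rows)
load(conn, clean_rows)  # a second run does not create duplicate orders

loaded_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(f"loaded={loaded_count} rejected={len(rejected_rows)}")
```

Two design choices here reflect the "recover cleanly" part of the job: a malformed row is set aside rather than failing the whole run, and the load step can be repeated without duplicating data.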
The core skills are Python, cloud infrastructure (usually AWS, GCP, or Azure), and systems thinking — the ability to design something that runs reliably at scale and fails gracefully. SQL is important but secondary to programming depth.
This is typically the most technically demanding entry point. A computer science or software engineering background is the most common path in. Some data analysts who learn Python and infrastructure make the transition, but it requires building up genuine engineering fundamentals.
The Analytics Engineer
This is the newest of the four roles and often the most misunderstood — partly because the title did not widely exist until around 2019, when dbt (a SQL transformation tool) became popular enough to create demand for a new type of specialist.
Here is the problem the analytics engineer solves. The data engineer brings raw data into the warehouse. But that raw data is messy: inconsistent formats, cryptic column names, missing values, business logic scattered across dozens of spreadsheets. The analyst cannot just query the raw tables — they would spend most of their time cleaning and second-guessing the data rather than analysing it.
The analytics engineer's job is to turn that raw data into clean, reliable, well-documented tables that analysts can trust. They do this with SQL and dbt, applying business logic, writing automated tests to catch data quality issues, and maintaining clear documentation of what every table and column means.
Think of it concretely: a raw orders table might have columns named ord_sts, duplicate rows, and missing customer IDs. The analytics engineer writes the dbt model that produces a clean fct_orders table with proper column names, deduplication, and referential integrity — something any analyst can safely query without needing to understand the raw system behind it.
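As a rough sketch of that transformation, the example below builds a messy raw_orders table and derives a clean fct_orders table from it. In a dbt project the SELECT statement would live in a model file and dbt would handle materialising it; here it is run by hand through Python's sqlite3 module so the example is self-contained. The status codes and their meanings are invented, and dropping rows with a missing customer ID is an assumption made for brevity (a real team might quarantine them instead). Referential integrity against a customers table is not covered.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical raw table: cryptic names, a duplicate row, a missing customer ID.
conn.execute("CREATE TABLE raw_orders (ord_id INTEGER, cust_id INTEGER, ord_sts TEXT)")
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?)",
    [
        (1, 10, "SHP"),
        (1, 10, "SHP"),    # exact duplicate
        (2, None, "PND"),  # missing customer ID
        (3, 11, "CNL"),
    ],
)

# The SELECT is roughly what the body of a dbt model could look like.
# The code-to-label mapping is an assumption for this example.
conn.execute("""
    CREATE TABLE fct_orders AS
    SELECT DISTINCT
        ord_id  AS order_id,
        cust_id AS customer_id,
        CASE ord_sts
            WHEN 'SHP' THEN 'shipped'
            WHEN 'PND' THEN 'pending'
            WHEN 'CNL' THEN 'cancelled'
        END AS order_status
    FROM raw_orders
    WHERE cust_id IS NOT NULL
""")

# Simple data quality checks, similar in spirit to dbt's unique and not_null tests.
total_rows, distinct_ids = conn.execute(
    "SELECT COUNT(*), COUNT(DISTINCT order_id) FROM fct_orders"
).fetchone()
null_customers = conn.execute(
    "SELECT COUNT(*) FROM fct_orders WHERE customer_id IS NULL"
).fetchone()[0]

assert total_rows == distinct_ids, "order_id should be unique"
assert null_customers == 0, "customer_id should never be null"
print(conn.execute("SELECT * FROM fct_orders ORDER BY order_id").fetchall())
```

The checks at the end are the important habit: the clean table is not just produced once, it is verified every time the transformation runs.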
The core skills are SQL (at a level deeper than most analysts use), software engineering practices like version control with Git and automated testing, and an understanding of the business deep enough to model it correctly. Python is useful but not always required.
The most common paths in are a data analyst picking up engineering practices and dbt, or a software engineer moving toward analytics. If you love SQL and want to apply engineering discipline to data quality, this role is worth looking at closely.
The Data Scientist
Data analysts answer the question “what happened?” Data scientists work on harder questions: “what will happen?” and “why does this pattern exist?” The tools for answering those questions are statistics and machine learning.
In practice this means building predictive models (for example, a model that estimates which customers are likely to cancel their subscription next month), running controlled experiments like A/B tests (did this product change actually improve conversion, or was the improvement just random chance?), and analysing complex datasets to find non-obvious patterns. The job is closer to applied research than to either analysis or engineering.
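The "real improvement or random chance?" question is usually answered with a significance test. The sketch below uses a two-proportion z-test, one common choice among several, implemented with only the Python standard library. The visitor and conversion counts are made up, and the 0.05 threshold is a convention rather than a rule.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Probability of a difference at least this large if there is no real effect.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical experiment: 10,000 visitors per variant.
z, p_value = two_proportion_z_test(conv_a=500, n_a=10_000, conv_b=560, n_b=10_000)

verdict = "unlikely to be chance" if p_value < 0.05 else "could plausibly be chance"
print(f"z = {z:.2f}, p-value = {p_value:.3f}: the difference {verdict}")
```

A lift that looks convincing on a dashboard can still fail a test like this, which is why the statistical grounding matters more in this role than in the others.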
The core skills are statistics and probability (genuinely essential, not optional), Python, and the ability to communicate statistical findings clearly to people who have no statistics background. The main tools are Python libraries: pandas and NumPy for data manipulation, scikit-learn for classical ML, PyTorch or TensorFlow for deep learning, Jupyter notebooks for exploration, and MLflow for tracking experiments.
This role typically has the highest bar for quantitative knowledge. A background in mathematics, statistics, physics, or any field that required formal probability and modelling is a strong foundation. It is possible to get into data science without this background, but building the statistical intuition takes time and cannot be fully shortcut by a bootcamp.
How the four roles connect
These roles are not isolated. In a well-run data team, they form a chain: the data engineer's pipelines feed the analytics engineer's models, and both the data analyst and the data scientist build on those models.
The data engineer builds the pipelines that bring raw data from source systems into a central warehouse or lake. Without those pipelines, there is no data to work with.
The analytics engineer takes that raw data and models it into clean, reliable, tested tables. Without that work, analysts would spend most of their time cleaning data instead of analysing it.
The data analyst queries those clean tables to build dashboards, answer business questions, and surface insights. Without the analyst, the data exists but nothing gets decided.
The data scientist builds models on top of the same curated data to forecast outcomes and run experiments. Without clean, consistent data as input, ML models produce unreliable results.
At small companies, one or two people cover all of these functions. At larger organisations, these become distinct teams with their own managers. Either way, the chain is the same.
How to pick the right path
Three questions will get you most of the way there.
1. Do you prefer answering questions or building systems?
Data analysts and data scientists are primarily in the business of answering questions: what happened, what will happen, why did this occur. If the satisfaction of turning data into a clear answer appeals to you, both of those roles are worth exploring.
Data engineers and analytics engineers build the infrastructure that makes answers possible. If you are more energised by building something reliable and well-engineered — something other people depend on — then those roles are more likely to hold your interest long term.
2. How much do you want to write code?
Data analysts use the least code of the four roles. SQL is essential; Python is useful but not always required. Analytics engineers write a lot of SQL, sometimes alongside some Python, and apply software engineering practices to that SQL. Data engineers write significant Python and work with systems-level infrastructure. Data scientists write Python constantly, and their code is more mathematical in nature.
3. Where does your current background point?
Business, finance, marketing, or economics — Data Analyst is the most natural entry point. Strong SQL skills and a BI tool will get you a long way.
Software engineering or computer science — Data Engineer. Your existing skills in writing reliable, testable code transfer directly.
Deep SQL experience and an interest in data quality — Analytics Engineer. Learning dbt on top of strong SQL is a realistic and well-trodden path.
Mathematics, statistics, or academic research — Data Scientist. The quantitative intuition is the hardest thing to build from scratch; if you already have it, use it.
The most common starting point is data analyst. It is the most accessible entry, gives you the clearest view of what problems actually matter to a business, and naturally connects to all the other paths. Many analytics engineers and data scientists started as analysts. Many data engineers started in software and moved into data as a domain. There is no single correct trajectory — but if you are unsure, starting with SQL and business analysis is rarely a wrong move.