For decades, moving data between systems followed the same pattern: Extract it from the source, Transform it into the right shape, then Load it into the destination. ETL. Then cloud data warehouses arrived, compute got cheap, and the industry quietly flipped the last two steps. Today, most modern data teams use ELT — and the difference matters more than the one swapped letter suggests.
The analogy: restaurant prep vs. a well-stocked kitchen
ETL is like a restaurant kitchen before service. The chef preps everything in advance — vegetables chopped, sauces reduced, proteins portioned. By the time customers arrive, the food is ready to plate. Fast to serve, but you committed to a fixed menu before you knew what people wanted.
ELT is like a well-stocked home kitchen. The raw ingredients are all there — uncut, uncooked. You decide what to make when you’re hungry. More flexible, and if you want something different tomorrow, nothing is already committed. The trade-off: you need a decent kitchen (a powerful warehouse) to cook in.
What is ETL?
In a traditional ETL pipeline, data moves through three distinct stages:
- Extract — Pull raw data from source systems: databases, APIs, flat files.
- Transform — Apply business logic, clean the data, join tables, aggregate metrics — all in an external tool before the data reaches the warehouse. Tools like Informatica, Talend, SSIS, or custom Python scripts traditionally handle this step.
- Load — Write the already-transformed, clean data into the destination warehouse or database.
The critical detail: the transformation happens outside the warehouse, on a separate compute engine. The warehouse only receives finished, clean data.
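The three stages can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline: the source rows are invented, and sqlite3 stands in for the destination warehouse. The key point is that the cleaning happens in Python, on separate compute, before anything is loaded.

```python
import sqlite3

# Extract: hypothetical rows pulled from a source system (database, API, file).
raw_orders = [
    {"id": 1, "amount": "19.99", "country": "de "},
    {"id": 2, "amount": "5.00", "country": "FR"},
    {"id": 3, "amount": None, "country": "de"},  # dirty row
]

# Transform: clean, cast, and filter OUTSIDE the destination, ETL-style.
clean_orders = [
    (r["id"], float(r["amount"]), r["country"].strip().upper())
    for r in raw_orders
    if r["amount"] is not None  # drop rows that fail validation
]

# Load: only the finished, clean rows ever reach the warehouse
# (an in-memory sqlite3 database stands in for it here).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL, country TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", clean_orders)

totals = conn.execute("SELECT COUNT(*), SUM(amount) FROM orders").fetchone()
print(totals)
```

Note what the warehouse never sees: the dirty third row, the string-typed amounts, the unnormalised country codes. That is both ETL's strength (the destination stays clean) and its weakness (if the validation rule was wrong, the dropped data is gone).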
What is ELT?
ELT keeps the Extract and Load steps but moves transformation to the end — and, crucially, inside the destination warehouse:
- Extract — Pull raw data from sources (same as ETL).
- Load — Write the raw, untransformed data directly into a “raw” schema in the warehouse. Nothing is cleaned yet.
- Transform — Use the warehouse’s own compute (SQL, dbt) to clean, reshape, and model the data. The transformation runs inside BigQuery or Snowflake, using their processing power.
Why ETL made sense in the 2000s
ETL wasn’t a bad idea — it was the right answer for the constraints of the time:
- Warehouses were expensive. On-premise Oracle or Teradata charged by the CPU cycle. You absolutely did not want to run heavy transformations inside them.
- Storage was expensive. Storing raw, messy data took real money. You only loaded clean, valuable data.
- SQL at scale was slow. You needed specialist ETL tools to handle complex joins and aggregations before loading.
Why ELT won in the cloud era
Three things changed and made ETL’s assumptions obsolete:
- Cloud compute became cheap and elastic. BigQuery and Snowflake can run massive SQL transformations in seconds for pennies. There’s no reason to do that work outside the warehouse anymore.
- Storage got nearly free. Storing raw data in BigQuery or an S3-backed warehouse costs almost nothing. You can keep the full history of everything and re-transform whenever requirements change.
- SQL became the best transformation language. With tools like dbt, SQL transformations gained version control, testing, documentation, and lineage — everything that made ETL tools seem “professional”.
The result: the entire transformation layer moved inside the warehouse, powered by SQL and orchestrated by dbt. ELT tools like Fivetran and Airbyte handle the Extract and Load steps. dbt handles the Transform. The old ETL platforms became unnecessary for most use cases.
The modern data stack in one sentence: Fivetran (or Airbyte) loads raw data into BigQuery or Snowflake. dbt transforms it. A BI tool like Looker or Metabase reads it. That’s ELT in practice.
When ETL still makes sense
ELT isn’t universally superior. ETL still wins in specific situations:
- PII and compliance requirements. If personal data must be masked, tokenised, or filtered before it enters any storage system — because of GDPR, HIPAA, or legal constraints — ETL handles this at the pipeline layer. With ELT, raw PII enters the warehouse first.
- Complex non-SQL transformations. If your transformation requires machine learning models, external API calls, or logic that can’t be expressed in SQL, a Python-based ETL step outside the warehouse still makes sense.
- Legacy systems. Some organisations have years of investment in ETL tooling and processes. The switch to ELT isn’t free — rewriting pipelines takes time.
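The first point — masking PII before it reaches any storage — can be sketched as a small pre-load transform. This is an illustrative pattern, not a compliance recipe: `tokenise` is a hypothetical helper (a salted one-way hash), the user record is invented, and sqlite3 again stands in for the warehouse. Real deployments would use a managed secret for the salt and a vetted tokenisation service.

```python
import hashlib
import sqlite3

def tokenise(value: str, salt: str = "demo-salt") -> str:
    """Hypothetical masking step: salted one-way hash, so the raw
    value never reaches storage and cannot be read back out."""
    return hashlib.sha256((salt + value).encode()).hexdigest()[:16]

# Extract: a row containing PII from a hypothetical source system.
users = [{"id": 1, "email": "ada@example.com", "plan": "pro"}]

# Transform (the ETL part): mask the sensitive field BEFORE the load.
masked = [(u["id"], tokenise(u["email"]), u["plan"]) for u in users]

# Load: the warehouse only ever sees the token, never the address.
wh = sqlite3.connect(":memory:")
wh.execute("CREATE TABLE users (id INTEGER, email_token TEXT, plan TEXT)")
wh.executemany("INSERT INTO users VALUES (?, ?, ?)", masked)

stored = wh.execute("SELECT email_token FROM users").fetchone()[0]
print("ada@example.com" in stored)  # False — the raw address is not in the warehouse
```

Because the hash is deterministic, the token still joins across tables (the same email always produces the same token), which is usually all the analytics layer needs. Non-sensitive fields like `plan` can flow through untouched, ELT-style — which is exactly the hybrid described below.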
The practical takeaway
If you’re building a data stack from scratch today on a modern cloud warehouse, default to ELT. Load raw data first. Transform it with dbt inside the warehouse. Keep your raw data intact so you can re-model it whenever business logic changes — and it always will.
If you’re dealing with sensitive personal data, add a masking or filtering step before the load. That hybrid approach — ETL for the sensitive fields, ELT for everything else — is how most mature data teams handle compliance.