Familiar with graphs? How about DAGs? This is not a paradigm shift, but think of a DAG as a cool way for a tiny team in Austin/Dallas, Texas to build Extract, Transform, and Load (ETL) software!
The DAG engine gives this small team the ability to create ETL software with rules and futuristic features.
We are using the same principles as other Directed Acyclic Graph tools we know and love today, like Apache Spark, Apache Airflow, Apache Beam, Kubeflow Pipelines, MLflow, TensorFlow, Dagster, Prefect, Argo Workflows, Google Cloud Composer, Azure Data Factory, and many other cool tools.
We created our own custom DAG engine using JavaScript, which enables us to flow data downstream in a web application, and it streams. Data streaming in a no-code ETL tool, without a setup or install, feels like a big win for any ETL software.
In simple terms, acyclic means no looping: this diagram/graph shows no loops.
What is a graph?
From a data perspective, a graph is a non-linear data structure used to model and store information where the relationships between individual data points are as important as the data itself. Natively, a graph engine treats data as a first-class citizen, enabling real-time data processing and the ability to compute only what needs to be computed.
Unlike tables in a relational database, which store data in a fixed, row-and-column format, a graph is a flexible, interconnected network of entities and their relationships. With ET1, we fit this graph engine together so that it looks and feels like regular ETL software, enabling a lot of cool functionality and features that regular ETL software is unable to offer.
We don’t mean to appear as if we are reinventing the wheel, but rather to add a different style to the typical nodes and tools you have come to know and love.
No looping… Acyclic
ET1 is, by design, acyclic, meaning nothing can form part of a cycle. In a data world, that means no looping is possible: the application is unable to loop back on itself. This is a beneficial rule for ETL software.
In this diagram, 2 goes to 5 and then back to 1. This isn’t possible in ET1.
This diagram is not a DAG, due to the loop, and the rule base in the engine makes it impossible to build.
Imagine an ETL application that could easily loop on itself; that would be a negative because it would allow people to break their systems. A DAG is predictable and a great engine to use for flowing data downstream.
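To make the no-loop rule concrete, here is a minimal sketch of how an engine can refuse a cycle-creating connection. This is illustrative JavaScript with made-up names, not ET1’s actual source code.

```javascript
// Minimal sketch: reject any edge that would create a cycle.
// Hypothetical names; not ET1's actual implementation.
const edges = new Map(); // nodeId -> array of downstream nodeIds

function reaches(from, target, seen = new Set()) {
  if (from === target) return true;
  if (seen.has(from)) return false;
  seen.add(from);
  return (edges.get(from) || []).some((next) => reaches(next, target, seen));
}

function connect(source, dest) {
  // If dest can already reach source, adding source -> dest closes a loop.
  if (reaches(dest, source)) {
    throw new Error(`Connecting ${source} -> ${dest} would create a cycle`);
  }
  edges.set(source, [...(edges.get(source) || []), dest]);
}

connect(1, 2);
connect(2, 5);
// connect(5, 1); // rejected: 2 -> 5 -> 1 would loop back, like the diagram above
```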
Core Concept: No Play Button, Data Flows, DAG Guides, Ready?
The DAG (Directed Acyclic Graph) system is like a digital assembly line for your data, where each node is a workstation that processes data and passes it along. This changes how data is computed.
Instead of maxing out a few nodes because you’re querying all the data at once before starting a new node, each piece of your data is treated like a first-class citizen in ET1.
Here’s how it works:
Is this data ready?
Yes or no?
When you go climbing, you are always talking to your partner: “Ready or not?” Is the person keeping you safe ready for you to fall? Are you ready? The person keeping you safe should always be ready. ET1 is always ready, so data is always flowing.
Being “always ready” is the key; the DAG is the bumpers to fall within, and our guide. It enables things like streaming and processing only what’s necessary, and it makes branching off big ideas simple.
Key Components
Nodes – Individual processing units (like filters, joins, calculations)
Edges – Connections showing data flow between nodes
Data Streams – The actual data flowing through the system
How It Works
Automatic Updates
Change a node? The system only recalculates what’s needed downstream
No manual refreshing – updates happen in real-time
Smart Processing
Only processes changed data paths
Alteryx and KNIME users tired of unnecessary data processing will be excited about this feature
Avoids redundant calculations
The DAG engine lets you calculate only what changes, decreasing your compute and the time spent creating solutions, as sketched below
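Here is a rough sketch of that idea in JavaScript: when a node changes, mark everything downstream and recompute only those nodes. The structures and names are assumptions for illustration, not ET1’s actual engine.

```javascript
// Sketch: recompute only the nodes downstream of a change.
// Hypothetical structures and names; not ET1's actual engine.
const downstream = new Map([
  ["csvInput", ["filter"]],
  ["filter", ["groupBy"]],
  ["groupBy", []],
]);

// Depth-first walk collecting every node reachable from `start`.
function collectDownstream(start, dirty = new Set()) {
  for (const next of downstream.get(start) || []) {
    if (!dirty.has(next)) {
      dirty.add(next);
      collectDownstream(next, dirty);
    }
  }
  return dirty;
}

// Simplified: a real engine would also order recomputes topologically.
function onNodeChanged(nodeId, recompute) {
  recompute(nodeId);
  for (const id of collectDownstream(nodeId)) recompute(id);
}

onNodeChanged("filter", (id) => console.log("recomputing", id));
// -> recomputing filter, recomputing groupBy (csvInput is never re-read)
```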
Visual Flow
See your data transform step by step
Easy to spot where changes are needed
Intuitive drag-and-drop interface
Why ET1 is Better
No More Waiting: Only recalculates what’s necessary
Never get stuck waiting for data to re-run because you made a change; calculate only what matters. The graph enables the ability to calculate one thing at a time
Most products have to re-calculate the entire table before it’s ready to move forward
Mistake-Proof: Can’t create circular references, very helpful
Users are unable to make big mistakes like spamming their API in an infinite loop
No one will be able to increase their cloud costs because they made an easy mistake
Exploration has no penalties, crafting a sense of trust in non-technical users
Decrease stress and network strains by avoiding infinite loops
Visual Debugging: See exactly where data changes happen, a visual teacher
Created to help people visually understand their data processes
Highlight to quickly see and understand the data automation
Scalable: Handles simple to complex workflows with ease
Think of it like a factory conveyor belt system – each station (node) does its job and passes the product (data) to the next station, with the system automatically managing the flow and only processing what’s needed.
Competitive analysis
Instead of constantly recycling the same rows over and over, ET1 gives anyone the ability to compute only the rows that need to be updated, versus re-running each table unnecessarily.
This is how problem-solving tools like KNIME, Alteryx, Tableau, Power BI, and most BI platforms work.
In most software, if your pipeline changes, you have to re-run 100% of the records.
ET1 defeats this with its engine.
The DAG engine introduces what we feel is a great foundation for a powerful ETL tool that can scale in the future.
We believe only the data that matters should flow downstream, and a DAG natively supports that by design. Using this DAG engine, we are able to flow only what matters and make problem solving feel modern.
Future outlooks
We are not married to this engine but believe it’s very beneficial thus far. Our goal is not to become fixated on the engine, but rather on the features it can offer.
The graph means it’s easy for us to scale up to cloud or server-offloading situations in the future, and that’s the easy piece.
DAG systems are the backbone of many major big data appliances, so know that we are thinking bigger: big picture and next steps too.
If you have a use case that isn’t possible on your current machine, let us know.
Aggregation, what a classic. Aggregating your data is a landmark trait for any data steward, data wrangler, or data analyst. In ET1, you can easily aggregate your data.
The Power of Grouping with the Group By Node
Aggregations turn a sea of numbers into meaningful insights.
While consulting with The Nielsen Company, the product manager and I looked at the raw data together. The issue: the data was too big. I pointed out that the data was “split second,” meaning they were tracking every second as it happened.
So I asked, “can we group by month?”
The answer was yes, they only cared about monthly data.
This question, “can we group,” reduced the data by 99%, reducing big data costs. By writing a Group By into the SQL, I was able to save The Nielsen Company 99% of the time spent waiting for their dashboards to load.
Knowing how Group By works is important. This software will teach you how Group By works!
The Group By Node is the foundation of your aggregation
This lets you split the information across a non-aggregating column; otherwise, you’re creating a KPI (a single aggregated value).
Create your KPI, understand the number of records, and explore various ways to aggregate.
By default, aggregation starts with count_rows to enable faster development cycles.
The Essential Aggregations
Sum
Adds up all values
Perfect for: Sales totals, revenue, quantities
Average (Mean)
Finds the middle ground
Great for: Test scores, ratings, temperatures
Minimum/Maximum
Spot the extremes
Use for: Price ranges, performance metrics, outliers
Count
Simple but powerful
Tells you: How many? How often?
Number of records
by default, you will get the “number of records”
you’re welcome!
Count Distinct?
Well, count distinct is nice but…
If your distinct count is lower than your row count, it really means your data is duplicated!
Group By: The Game Changer
The real magic happens when you combine these with Group By:
Sales by Region: Group by Region, Aggregate Sum(Revenue)
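Under the hood, that example boils down to something like this minimal JavaScript sketch (made-up data and function names, not ET1’s actual implementation):

```javascript
// Sketch: group rows by Region and sum Revenue.
// Made-up data and names; not ET1's actual implementation.
const rows = [
  { Region: "North", Revenue: 120 },
  { Region: "South", Revenue: 80 },
  { Region: "North", Revenue: 50 },
];

function groupBySum(data, groupCol, sumCol) {
  const groups = new Map();
  for (const row of data) {
    const key = row[groupCol];
    const current = groups.get(key) || { [groupCol]: key, sum: 0, count_rows: 0 };
    current.sum += row[sumCol];
    current.count_rows += 1; // the default "number of records" aggregation
    groups.set(key, current);
  }
  return [...groups.values()];
}

console.log(groupBySum(rows, "Region", "Revenue"));
// -> [ { Region: "North", sum: 170, count_rows: 2 },
//      { Region: "South", sum: 80, count_rows: 1 } ]
```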
Real-World Examples
How would you aggregate in the real world?
E-commerce
Total sales per city
Average ticket sales per state
Average order value by customer segment
Education
Pass rates by subject
Sum of students per day
Average of students per month per class
Finance
Monthly expenses by category
Highest spending customers
Pro Tips
Start Simple – Try one aggregation at a time
Clean First – Make sure it’s just numbers or you’re not aggregating
Check Your Groups – Make sure your groups make sense, very powerful for data reduction
Aggregation needs to be simple. Let us know if it’s not.
Concat merges everything, and it doesn’t care about data types.
What it does: Merges text from different columns
Add a custom string between what you’re merging.
Perfect for:
Creating full names from first/last
Building addresses from components
Generating unique IDs or labels
Bringing together: State with City in 1 column.
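As a minimal sketch of what the Concat node does (illustrative JavaScript; the column names and function are made up, not ET1’s actual code):

```javascript
// Sketch: concatenate City and State into one column with a custom separator.
// Made-up column names; not ET1's actual implementation.
const rows = [
  { City: "Austin", State: "Texas" },
  { City: "Addison", State: "Texas" },
];

function concatColumns(data, cols, separator, newCol) {
  return data.map((row) => ({
    ...row,
    // String() means mixed data types are merged without complaint
    [newCol]: cols.map((c) => String(row[c])).join(separator),
  }));
}

console.log(concatColumns(rows, ["City", "State"], ", ", "Location"));
// -> [{ City: "Austin", State: "Texas", Location: "Austin, Texas" }, ...]
```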
Real-World Examples
Join:
Match customer emails with their support tickets
Combine product IDs with inventory details
Union:
Merge Q1, Q2, Q3 sales into one report
Combine survey responses from different regions
Concat:
Create “Last, First” name formats
Build URLs from domain + path components
Pro Tips
Joins work best with unique identifiers
Union requires matching column structures
Concat can add custom separators (spaces, dashes, etc.)
Remove duplicate records
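To illustrate the first two tips, here is a rough JavaScript sketch of a key-based join and a union that removes duplicates; the data and names are made up, not ET1’s actual code.

```javascript
// Sketch: join on a unique identifier, then union two tables and drop duplicates.
// Made-up data and names; not ET1's actual implementation.
const customers = [{ email: "a@x.com", name: "Ada" }];
const tickets = [{ email: "a@x.com", issue: "Login" }];

// Join: match rows that share the same key column
const joined = tickets.map((t) => ({
  ...t,
  ...customers.find((c) => c.email === t.email),
}));

// Union: stack tables with matching column structures, then dedupe
const q1 = [{ sku: "A1", sales: 10 }];
const q2 = [{ sku: "A1", sales: 10 }, { sku: "B2", sales: 5 }];
const union = [...q1, ...q2];
const deduped = [...new Map(union.map((r) => [JSON.stringify(r), r])).values()];

console.log(joined, deduped);
// joined  -> [{ email: "a@x.com", issue: "Login", name: "Ada" }]
// deduped -> [{ sku: "A1", sales: 10 }, { sku: "B2", sales: 5 }]
```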
No more copy-pasting between spreadsheets or writing complex formulas – just connect the dots and let the data flow! No strange joining tools in Excel, no learning the difference between joins, and just get your data wrangled already!
The filtering nodes help you reduce the number of rows, drill into the exact information needed, and create a data set that will add value VS confuse your audience.
When filtering, remember you’re reducing the amount of data coming through the node; you can swap between exclude and include.
Include, exclude, and ultimately work on your data.
The Filtering Nodes in ET1
1. Any Column Filter
The Swiss Army Knife
Search across all columns at once
Perfect for quick data exploration
No setup required – just type and filter
2. Column Filter
The Precision Tool
Filter specific columns with exact matches
Create multiple filter conditions
Ideal for structured, clean data
3. Measure Filter
The Number Cruncher
Filter numeric columns using conditions like:
Greater than/less than
Between ranges
Above/below average
Great for financial data, metrics, and KPIs
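A numeric filter like this boils down to simple predicates; here is a minimal JavaScript sketch with made-up data (not ET1’s actual code):

```javascript
// Sketch: numeric filter conditions over a column.
// Made-up data and names; not ET1's actual implementation.
const rows = [{ kpi: 4 }, { kpi: 9 }, { kpi: 15 }];
const values = rows.map((r) => r.kpi);
const average = values.reduce((a, b) => a + b, 0) / values.length;

const greaterThan = rows.filter((r) => r.kpi > 10);            // [{ kpi: 15 }]
const between = rows.filter((r) => r.kpi >= 5 && r.kpi <= 10); // [{ kpi: 9 }]
const aboveAverage = rows.filter((r) => r.kpi > average);      // avg ≈ 9.33 -> [{ kpi: 15 }]
```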
4. Wild Headers
Include or exclude headers based on wildcard
Easily clean wide tables
No-brainer approach to column filtering
The Column Filter is nice, but at times Wild Headers are king
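Here is a rough sketch of a wildcard header filter in JavaScript; the `*` convention and the function names are assumptions for illustration, not ET1’s actual code.

```javascript
// Sketch: keep only columns whose headers match a wildcard pattern.
// The "*" convention and names here are assumptions, not ET1's actual code.
function wildcardToRegex(pattern) {
  const escaped = pattern.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\\\*/g, ".*") + "$", "i");
}

function filterHeaders(row, pattern, mode = "include") {
  const regex = wildcardToRegex(pattern);
  return Object.fromEntries(
    Object.entries(row).filter(([header]) =>
      mode === "include" ? regex.test(header) : !regex.test(header)
    )
  );
}

const wideRow = { sales_q1: 10, sales_q2: 12, notes: "x", id: 1 };
console.log(filterHeaders(wideRow, "sales_*")); // -> { sales_q1: 10, sales_q2: 12 }
```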
Why This Beats Spreadsheet Hell
Visual Feedback: See filtered results instantly
Non-Destructive: Your original data stays safe
Never Recycle: You never filter data unnecessarily
Stackable: Chain multiple filters for complex queries
Reversible: Remove or modify filters anytime
Filtering Pro Tips
Be willing to test filters and create branches. Then right-click the beginning of a branch to duplicate the entire downstream operation. This lets you edit filters across multiple streams of data and see the difference between your filters!
Start with “Any Column” to explore strings and the Measure Filter to explore measures; then switch to specific Column Filters as you understand your data better, and Wild Headers for those edge cases where you have a lot of columns (but only a couple matter).
ET1 is built to easily filter (transform) your data. Remember, it’s like having a conversation with your dataset!
You know the drill: data tools are very similar, and it all starts with extracting your data.
But are you familiar with where your data lives? Start asking, documenting, and building your understanding of your data environment. This software will help you warehouse that information in a single canvas, without having to ask engineering for help.
Input Node Overview
The input nodes are essential for moving the needle in ET1; without data, we are using our feelings!
The CSV Input node is great for getting your comma-delimited files into ET1.
The JSON Input node is great for getting JSON into the app; your engineering team will be happy.
The GitHub CSV node is where you can get CSVs off the public internet. That’s fun. Enrich your data pipelines.
The Manual Table node is great: synthesize a few rows, add a table, make life easier.
The future of data inputs for ET1
We are eager to add more connections, but today we are keeping it simple by offering CSV, JSON, GitHub CSV, and manual tables.
Next? Excel input perhaps.
ETL Input Nodes – Simple as Pie
CSV Input
What it does: Loads data from CSV files or text
Why it’s cool:
Drag & drop any CSV file
Handles messy data with smart parsing
Preview before committing
No more: Fighting with Excel imports or command-line tools
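As a rough sketch of what a CSV loader does (deliberately simplified JavaScript; a real parser also has to handle quoted fields, escaped commas, and other messy cases):

```javascript
// Sketch: turn CSV text into row objects.
// Simplified on purpose: no quoted fields or escaped commas handled here.
function parseCsv(text) {
  const [headerLine, ...lines] = text.trim().split("\n");
  const headers = headerLine.split(",").map((h) => h.trim());
  return lines.map((line) => {
    const cells = line.split(",").map((c) => c.trim());
    return Object.fromEntries(headers.map((h, i) => [h, cells[i]]));
  });
}

console.log(parseCsv("City, State\nAustin, Texas\nAllen, Texas"));
// -> [{ City: "Austin", State: "Texas" }, { City: "Allen", State: "Texas" }]
```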
JSON Input
What it does: Imports JSON data from files or direct input
Why it’s cool:
Works with nested JSON structures
Automatically flattens complex objects
Great for API responses and config files
No more: Writing custom parsers for every JSON format
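Flattening nested JSON can be sketched like this (illustrative JavaScript; ET1’s actual flattening rules may differ):

```javascript
// Sketch: flatten nested JSON objects into dot-separated columns.
// Illustrative only; ET1's actual flattening rules may differ.
function flatten(obj, prefix = "", out = {}) {
  for (const [key, value] of Object.entries(obj)) {
    const path = prefix ? `${prefix}.${key}` : key;
    if (value !== null && typeof value === "object" && !Array.isArray(value)) {
      flatten(value, path, out); // recurse into nested objects
    } else {
      out[path] = value; // leaves (and arrays) become columns
    }
  }
  return out;
}

console.log(flatten({ user: { name: "Ada", address: { city: "Austin" } } }));
// -> { "user.name": "Ada", "user.address.city": "Austin" }
```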
Manual Table
What it does: Create data tables by hand
Why it’s cool:
Add/remove rows and columns on the fly
Perfect for quick mockups or small datasets
Edit cells like a spreadsheet
No more: Creating throwaway CSV files for tiny datasets
GitHub CSV
What it does: Pull CSV files directly from GitHub
Why it’s cool:
Point to any public GitHub CSV file
Auto-refreshes on URL change
A fetch button to ‘get’ it again
Great for GitHub collaboration
No more: Downloading data by hand – this node gets it for you.
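Under the hood, pulling a public CSV is just a fetch of the raw file URL. A minimal sketch (the URL is a placeholder, and parseCsv is the simplified parser sketched in the CSV Input section above):

```javascript
// Sketch: fetch a public CSV straight from GitHub's raw file host.
// Placeholder URL; reuses the simplified parseCsv sketched earlier.
async function fetchGithubCsv(rawUrl) {
  const response = await fetch(rawUrl);
  if (!response.ok) throw new Error(`Fetch failed: ${response.status}`);
  return parseCsv(await response.text());
}

fetchGithubCsv("https://raw.githubusercontent.com/user/repo/main/data.csv")
  .then((rows) => console.log(rows.length, "rows loaded"));
```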
The Best Part?
No coding required.
No complex setup.
Just point, click, and start transforming your data like a pro data engineer.
What used to take hours (and a computer science degree) now takes seconds, and it isn’t scary…
ET1 helps you extract, transform, and load data in a single user-friendly canvas.
Data automation in a single canvas – and this is the basic training, where we show you the basics so that you’re dangerous.
This is the only data automation tool that visually shows you what is happening in real time, allows you to open more than one tool at once, and has settings that aren’t scary.
If you’re familiar with music equipment, phones, or computers, you’re more than likely familiar with audio inputs and outputs.
Never loop or recycle data (audio analogy)
The DAG Streaming engine also means no ability to loop data/music infinitely, which means you never hurt your ears or your machine!
ET1 is a lot like an audio device and nodes help you change the audio.
Data flows through ET1 from output to input.
Getting Started
Drag a data source node (CSV, JSON, or Manual Table)
Click the node to configure it
Connect nodes by dragging from output to input circles
Drag from an output to connect or create
Core Concepts
Nodes: Each node, tool, or box does one or more things to your data
Connections: Arrows show how data flows, from output to input
Think of it like audio equipment that is always ready
Preview: See results instantly under each node
Audio is always on, instantly able to preview
Cell-level highlights!
Hover your mouse over cells to see and understand your data automation tools better.
Highlight a cell, see it highlight across nodes, branches and the canvas.
How to start ET1.
ET1 starts with a CSV Input Node open and available to begin slicing your data.
A CSV file looks like file.csv and opens in ET1.
CSV data usually looks like…
City, State
Austin, Texas
Texhoma, Texas
Addison, Texas
Allen, Texas
Carrollton, Texas
Connecting the dots, creating connectors, drawing connections, the node menu
When hovering over the right side of a node with an output, you will find a circle.
This creates an arrow that can connect to the input of other nodes.
I find this next piece, the node menu, is the most helpful. I can begin a process, and immediately get through everything with ease. — Tyler Garrett
The node menu
Clicking the circle creates an arrow that points at a scrollable node menu, and automatically places the node directly to the right of your node.
You don’t have to use the menu; to close the menu, simply click somewhere else to continue your work.
“Drag” the circle by clicking and dragging your mouse to create an arrow…
“Drop” the arrow by letting go of your mouse click into a blank canvas to create a scrollable node menu where you need a node to be placed.
Drag the circle and drop… not that crazy, right?
Save your favorite nodes in the six circles at the top. Drag and drop a node/tool into a circle to save it as a favorite, making for an easier development lifecycle.