Docs
An overview of all transformations in ETLBox
This article gives you an overview of all transformations that currently exist in ETLBox. If you already know what kind of transformation you are looking for, you can go directly to the article that covers it in more detail.
Instructions on how to read from or write to Access databases.
Access is partially supported in ETLBox. Most of the components may work, but there are some limitations when connecting to Access. These limitations are described in detail below. The good news is: you can make it work to integrate Access either as a source or as a destination.
Details about the Aggregation
The Aggregation class lets you aggregate your data with either your own aggregation function or a default function. This can be done on your whole data set or on particular groups of your data (similar to a GROUP BY in SQL).
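A minimal sketch of such a grouped aggregation, assuming the attribute-based configuration; the OrderRow/CustomerTotal classes, their property names and the attribute usage shown here are illustrative assumptions, so check the linked article for the exact syntax of your ETLBox version:

```csharp
public class OrderRow
{
    public string Customer { get; set; }
    public decimal Amount { get; set; }
}

public class CustomerTotal
{
    // group all incoming rows by the Customer property (similar to GROUP BY)
    [GroupColumn(nameof(OrderRow.Customer))]
    public string Customer { get; set; }

    // sum up the Amount property of every row within a group
    [AggregateColumn(nameof(OrderRow.Amount), AggregationMethod.Sum)]
    public decimal TotalAmount { get; set; }
}

var aggregation = new Aggregation<OrderRow, CustomerTotal>();
```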
Details about the Azure Cosmos DB connector
Azure Cosmos DB is a multi-model database, which can be used either as a source or a destination for an ETLBox data flow. This article gives you an overview of how to use the connector.
Details about the Azure Service Bus connector
Azure Service Bus is an enterprise message bus, which can also be connected to an ETLBox data flow. This article gives you an overview of how to use the connector.
Details about the Azure Tables connector
Azure Tables is a service that stores non-relational, structured data in a key-value store with a schemaless design. Access to Table storage data is fast and cost-effective for many types of applications, and is typically lower in cost than traditional SQL for similar volumes of data.
Details about the BatchTransformation
The BatchTransformation lets you transform batches of incoming data.
Details about the BlockTransformation
The BlockTransformation is a real blocking transformation. It will block processing until all records have arrived, using as much memory as needed to store the incoming rows. After that, all rows are written to the output.
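A minimal sketch of how this could look, assuming the block function receives the complete buffered list of rows and returns the (possibly modified) list; the MyRow class is an illustrative assumption:

```csharp
public class MyRow
{
    public int Id { get; set; }
    public string Value { get; set; }
}

// the delegate is invoked once, after all rows have been buffered in memory
var block = new BlockTransformation<MyRow>(allRows =>
{
    allRows.Sort((a, b) => a.Id.CompareTo(b.Id)); // e.g. reorder the whole data set
    return allRows;
});
```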
Details about the CachedBatchTransformation
The CachedBatchTransformation has the same functionality as the BatchTransformation, but additionally offers a cache manager object to access previously processed batches of data.
Details about the CachedRowTransformation
The CachedRowTransformation does basically the same as the RowTransformation, but has a cache to access previously processed data rows.
Details about the ColumnRename
This transformation lets you rename the property names of your incoming data. You can also remove columns from your flow.
Overview of shared properties and methods for all data flow components
All components in ETLBox share some properties and methods. This chapter describes the details.
Details about the ConcurrentMemoryDestination
The ConcurrentMemoryDestination is very similar to the MemoryDestination, but uses a thread-safe collection for storing data.
An overview of existing database connectors in ETLBox and their concepts.
There are numerous database connector packages that come with ETLBox. Add the connector package for the database that you want to connect to. So if you want to connect to SQL Server, add the ETLBox.SqlServer package to your project. For MySql, use ETLBox.MySql.
An overview of the control flow task concepts in ETLBox.
ControlFlow Tasks are an easy way to run database-independent SQL code or to avoid boilerplate code when you just want to execute a simple statement.
All ControlFlow tasks in detail
ETLBox comes with a fair set of ControlFlow Tasks that allow you to perform the most common tasks on a relational database. This article will give you an overview of these tasks.
Details about the Couchbase connector
Couchbase can be used as a document store and a key/value store. This article gives you an overview of the Couchbase connectors for ETLBox.
Details about the CrossJoin
The CrossJoin allows you to combine every record from one input with every record from the other input. This lets you simulate cross-join behavior as in SQL (also known as a Cartesian product).
Details about the CsvSource and CsvDestination
With the CsvSource you can send CSV-formatted data into a data flow, and the CsvDestination will produce CSV-formatted output.
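A minimal sketch of reading a CSV file and writing it back out; the file names and the MyRow class are illustrative assumptions, and the property names are matched against the CSV header:

```csharp
public class MyRow
{
    public int Id { get; set; }
    public string Value { get; set; }
}

var source = new CsvSource<MyRow>("input.csv");      // reads rows from input.csv
var dest = new CsvDestination<MyRow>("output.csv");  // writes rows to output.csv

source.LinkTo(dest);
```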
Details about the CustomSource and CustomDestination
ETLBox allows you to create your own implementation of a source or destination. This gives you high flexibility if you need to integrate systems that are not currently included in the list of default connectors.
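A minimal sketch, assuming the CustomSource is constructed from a read function plus a completion check, and the CustomDestination from an action that is invoked per row; the counter and the MyRow class are illustrative assumptions:

```csharp
public class MyRow
{
    public int Number { get; set; }
}

int readCount = 0;

// the first delegate produces one row per call, the second signals when reading is done
var source = new CustomSource<MyRow>(
    () => new MyRow { Number = ++readCount },
    () => readCount >= 5);

// the destination delegate receives every row that arrives
var dest = new CustomDestination<MyRow>(
    row => Console.WriteLine(row.Number));

source.LinkTo(dest);
```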
Details about the Distinct transformation
The Distinct transformation will filter out duplicate records.
Details about the ExcelSource
The ExcelSource allows you to send data from an Excel file into a data flow.
How to extend your DbContext with bulk inserts
ETLBox offers support for Entity Framework. This article gives you a brief overview of how to use bulk operations with Entity Framework's DbContext.
Details about the FilterTransformation
The FilterTransformation filters out rows that do not match a given predicate.
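A minimal sketch, assuming the predicate can be passed to the constructor (it may also be assigned via a property, see the linked article); per the description above, only rows matching the predicate continue through the flow. The MyRow class and the source/dest variables stand for components you have already created:

```csharp
public class MyRow
{
    public int Value { get; set; }
}

// rows where the predicate returns false are dropped from the flow
var filter = new FilterTransformation<MyRow>(row => row.Value > 0);

source.LinkTo(filter);
filter.LinkTo(dest);
```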
Details about the MemorySource and MemoryDestination
The MemorySource and MemoryDestination allow you to read or write data from/into an IEnumerable - so any list or collection of the .NET ecosystem can be used as a source or destination for an ETLBox data flow. The Memory connectors are available in the ETLBox core package.
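A minimal sketch, assuming the source data can be assigned via a Data property and the destination collects the received rows in its own Data collection; the MyRow class is an illustrative assumption:

```csharp
public class MyRow
{
    public string Value { get; set; }
}

var source = new MemorySource<MyRow>();
source.Data = new List<MyRow>
{
    new MyRow { Value = "A" },
    new MyRow { Value = "B" }
};

var dest = new MemoryDestination<MyRow>();

source.LinkTo(dest);
// after execution, dest.Data holds the rows that arrived at the destination
```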
Details about the JsonSource and JsonDestination
The JsonSource and JsonDestination allow you to read or write data from/into JSON format, either with a file or a web service.
Details on how to link the components of a data flow.
Before you can execute a data flow, you need to link your sources, transformations and destinations. The linking is quite easy - every source component and every transformation offers a LinkTo() method. This method accepts a link target, which is either another transformation or a destination.
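A minimal sketch of linking, assuming a source, one transformation and a destination have already been created (the variable names are illustrative):

```csharp
// build the processing chain: source -> transformation -> destination
source.LinkTo(transformation);
transformation.LinkTo(destination);
```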
Details about logging in ETLBox
On top of NLog, ETLBox offers you support to create simple but still powerful database logging, which is simple to set up and easy to maintain.
Details about the LookupTransformation
If you want to lookup some data from existing tables or other sources, the lookup transformation is the right choice. It allows you to enrich the incoming rows with data from the lookup source.
Details about the MergeJoin
The MergeJoin transformation joins the outcome of two sources or transformations into one data record. This allows you to merge the data of two inputs into one output.
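A minimal sketch, assuming the join function receives one record from each input and returns the combined record, and that the two inputs are linked to the join's left and right targets; the row classes are illustrative, and the member names used for the two inputs vary between ETLBox versions, so treat them as assumptions and check the linked article:

```csharp
public class LeftRow   { public int Id { get; set; } public string Value { get; set; } }
public class RightRow  { public int Id { get; set; } public string Value { get; set; } }
public class OutputRow { public int Id { get; set; } public string Combined { get; set; } }

var join = new MergeJoin<LeftRow, RightRow, OutputRow>(
    (left, right) => new OutputRow { Id = left.Id, Combined = left.Value + right.Value });

// link the two inputs to the join targets (member names may differ by version)
source1.LinkTo(join.LeftInput);
source2.LinkTo(join.RightInput);
join.LinkTo(dest);
```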
Details about the DbMerge connector
The following article describes how you can use the data from your data flow to insert new data into a destination table, update existing records, or delete removed ones.
MongoSource and MongoDestination in detail
A detailed overview of the MongoSource and MongoDestination.
Details about the Multicast
The Multicast is a component that basically clones your data and sends it to all connected targets. It has one input and can have two or more outputs.
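A minimal sketch, assuming a source and two destinations have already been created (the variable names are illustrative):

```csharp
var multicast = new Multicast<MyRow>();

source.LinkTo(multicast);
// every row arriving at the multicast is passed on to both targets
multicast.LinkTo(destination1);
multicast.LinkTo(destination2);
```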
Details on how to execute a data flow.
After you have created and linked your components, you are ready to execute your data flow.
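A minimal sketch of the classic execution pattern, assuming one source and one destination that are already linked; newer ETLBox versions also offer other ways to start and await a network, so check the linked article for the variant that matches your version:

```csharp
// start posting data from the source and block until the destination has received everything
source.Execute();
dest.Wait();
```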
General information about supported NoSql databases
ETLBox adds support for various relational and NoSql databases, as well as flat file formats and web services. This article gives an overview of the supported NoSql databases.
Details about the ParquetSource and ParquetDestination
The Parquet connector pack allows you to read or write data from or into Parquet files.
Quickstart: Basic usage of ETLBox.
Let's get started. This page gives you a brief overview of the basic concepts and usage.
How to read or write data from Redis
Redis is a very popular key/value store. ETLBox can connect to Redis with the RedisSource and RedisDestination.
Accessing relational databases with the DbSource and DbDestination
A detailed overview of the DbSource and DbDestination connector.
Details about the RowDuplication
The RowDuplication simply creates duplicates of the incoming rows. You can specify how many copies you want, or create copies only when a predicate evaluates to true.
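A minimal sketch, assuming the number of extra copies can be passed to the constructor; the predicate-based variant is described in the linked article, and the MyRow class is an illustrative assumption:

```csharp
// each incoming row is passed on together with the specified number of additional copies
var duplication = new RowDuplication<MyRow>(2);
```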
Details about the RowMultiplication
The RowMultiplication allows you to create multiple records out of one input record. It works like a RowTransformation - so it accepts an input and an output type - but instead of just modifying one record it can return an array of records (when you return an empty list, it will even remove the incoming row).
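A minimal sketch, assuming the multiplication function returns an enumerable of output records for each input record; the row classes and their properties are illustrative assumptions:

```csharp
public class InputRow  { public int Id { get; set; } public int Count { get; set; } }
public class OutputRow { public int Id { get; set; } public int Index { get; set; } }

// produce one output record per unit of Count; an empty list removes the row from the flow
var multiplication = new RowMultiplication<InputRow, OutputRow>(row =>
{
    var result = new List<OutputRow>();
    for (int i = 0; i < row.Count; i++)
        result.Add(new OutputRow { Id = row.Id, Index = i });
    return result;
});
```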
Details about the RowTransformation
The RowTransformation will apply a custom transformation function to each row of incoming data. This transformation is useful in many scenarios, as it allows you to apply any .NET code to your data.
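A minimal sketch, assuming the transformation function receives a row, modifies it (or creates a new one) and returns it; the MyRow class is an illustrative assumption:

```csharp
public class MyRow
{
    public int Id { get; set; }
    public string Value { get; set; }
}

// invoked once per row; any .NET code can run inside the delegate
var rowTrans = new RowTransformation<MyRow>(row =>
{
    row.Value = row.Value?.ToUpper();
    return row;
});
```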
Stream connectors allow you to read common formats either from a file source or via HTTP.
Stream connectors allow you to read data that is stored in a common format either from a file or via HTTP.
A simple data flow with ETLBox
The main part of ETLBox is the set of data flow components. They are the building blocks for the ETL part, and contain classes that help you to extract, transform and load data. This example will lead you through a simple data flow step by step.
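A minimal sketch of such a flow, reading from a database table, transforming each row and writing into another table; the connection string, table names and the MyRow class are illustrative assumptions, and execution is shown with the classic Execute()/Wait() pattern (newer versions offer other ways to run a network):

```csharp
public class MyRow
{
    public int Id { get; set; }
    public string Value { get; set; }
}

var connection = new SqlConnectionManager("Data Source=localhost;Initial Catalog=demo;Integrated Security=true;");

var source = new DbSource<MyRow>(connection, "SourceTable");
var trans = new RowTransformation<MyRow>(row =>
{
    row.Value = row.Value?.Trim();
    return row;
});
var dest = new DbDestination<MyRow>(connection, "DestinationTable");

source.LinkTo(trans);
trans.LinkTo(dest);

source.Execute();  // start reading and pushing rows into the flow
dest.Wait();       // block until all rows have arrived at the destination
```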
Details about the TextSource and TextDestination
The TextSource and TextDestination allow you to read or write data from/into a text file. The text connectors are part of the ETLBox core package.
Details about the VoidDestination
The VoidDestination can be used to discard records.
Details about the WaitTransformation
The WaitTransformation blocks execution until another component in the network has completed processing all records.
Details on how to execute, link and start/wait for a data flow.
ETLBox supports generic components that are typed to an object, but it also works well with dynamic objects. Some components also allow you to use an array as the type. This chapter will give you insight into how to operate on your data with different types.
Details about the XmlSource and XmlDestination
The XmlSource and XmlDestination allow you to read or write data from/into an XML file or web service.
Details about the XmlSchemaValidation
This transformation allows you to validate XML code in your incoming data against an XML schema definition.