New Entrants to the Data Integration Market
The last 10 years have seen a proliferation of new data integration tools and a much broader definition of the capabilities and features that fall under the term data integration. In the past, data integration mainly described ETL tools, data virtualization tools, and application messaging tools. These days the following scenarios and use cases are commonly supported:
Data Engineering – This includes building and deploying data pipelines to support business intelligence and data science workloads. These pipelines combine internal and external data sources into purpose-built environments for data lakes, data warehousing, and self-service analytics.
Cloud Enablement – These days organizations have data sources that span on-premises, multi-cloud, hybrid-cloud, and private cloud platforms.
Operational Integration – This includes supporting master data management, sharing data with external business partners, and ensuring data consistency across internal and external data silos (like syncing records between Salesforce and NetSuite).
Data Mesh – A data mesh is a new style of enterprise data architecture that is domain driven and seeks to address common pitfalls of legacy approaches to data integration that promote monolithic architectures. More on this concept can be found here.
The tools that support the use cases above fall into roughly three groups: some look like traditional ETL tools, some are entirely cloud based and follow an ELT paradigm, and some focus mainly on the Extract and Load steps of ELT and are fairly light on the Transformation step.
Some of the newer tools we have seen in production include Talend, Dell Boomi, Matillion, dbt, and Fivetran. This post focuses on Fivetran and discusses specific strengths and weaknesses of the product.
What is Fivetran?
Fivetran is classified as a Niche Player in Gartner’s Magic Quadrant for Data Integration. It focuses primarily on the Extract and Load steps of the ELT paradigm and offers a library of connectors that require virtually no coding or engineering expertise to connect source to target. It operates almost as a managed service for ELT, in that it has customer-facing teams who can assist with developing new connectors or providing Transformation capabilities using the open source tool dbt.
We frequently find Fivetran deployed alongside Snowflake for storage and compute and Looker for data visualization. The majority of our clients who adopt Fivetran are high-growth software companies that want to extract data from external SaaS services and load it into internal data lakes and data warehouses for data science and analytics. A typical use case would be connecting Salesforce, NetSuite, and an external order management system like Ordoro to an internal data warehouse built in Snowflake. In this scenario, Fivetran would manage the movement of data from these external sources to an internal data store for reporting and analytics.
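To make the extract-and-load pattern in this scenario concrete, here is a minimal sketch in Python. It does not use Fivetran's API or connectors. The sample records, the raw_ table naming, and the in-memory SQLite database standing in for Snowflake are all assumptions made for illustration.

```python
import json
import sqlite3

# Hypothetical sample records standing in for what each SaaS source might return.
SOURCES = {
    "salesforce": [{"id": "001", "account": "Acme", "stage": "Closed Won"}],
    "netsuite": [{"id": "INV-7", "customer": "Acme", "amount": 1200.0}],
    "ordoro": [{"id": "ORD-3", "customer": "Acme", "status": "shipped"}],
}


def load_raw(conn, source, records):
    """Load records unchanged into a per-source raw table (the E and L of ELT)."""
    table = f"raw_{source}"  # source names come from our own trusted dict above
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {table} (id TEXT PRIMARY KEY, payload TEXT)"
    )
    # INSERT OR REPLACE keeps repeated syncs from duplicating rows.
    conn.executemany(
        f"INSERT OR REPLACE INTO {table} (id, payload) VALUES (?, ?)",
        [(record["id"], json.dumps(record)) for record in records],
    )
    conn.commit()
    return len(records)


def run_sync(conn, sources):
    """Sync every source and report how many records each one delivered."""
    return {name: load_raw(conn, name, records) for name, records in sources.items()}


# An in-memory SQLite database stands in for the warehouse in this sketch.
warehouse = sqlite3.connect(":memory:")
counts = run_sync(warehouse, SOURCES)
print(counts)
```

No transformation happens here: each payload lands as-is, and shaping it for analytics is left to a later step. That is the division of labor described above.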
The following strengths are found within the platform:
Managed service approach – Fivetran is responsible for managing the various connectors in its offering on behalf of its customers. Whenever an API or endpoint changes, Fivetran is on the hook for ensuring that the existing connector does not break as a result. Given the diversity of its customer base, customers experience a network effect of continued improvement and advancement in the platform's features. Given the wide variety of off-the-shelf connectors, it is very easy to get up and running with source to target connectors for data extraction and loading.
Pre-built schemas for data analytics – Fivetran provides over 50 prebuilt data models for supporting common analytical scenarios like finance and digital marketing. Across our client base we find that teams are generally looking for similar insights, so prebuilt schemas are an excellent way to get up and running quickly with turning raw data into information and ultimately into actionable insights.
Low cost of ownership – A relatively small team of data engineers can accomplish a lot with Fivetran. It acts as a force multiplier for data engineers and allows teams to accomplish more with fewer resources.
The following weaknesses are found within the platform:
Limited Data Transformation Support – As mentioned previously, Fivetran relies on dbt to handle complex transformations for normalizing data into suitable data structures for analytics. As a result, teams may find themselves managing multiple tools and become frustrated that everything cannot be combined into a common interface or tool.
Lack of Enterprise Data Management Capabilities – Fivetran lacks the broader data management capabilities found in more enterprise-oriented tools, such as auto discovery of data semantics, data virtualization, data governance, and data quality capabilities.
Fivetran takes all of the data from your databases, events, applications, and files, and essentially replicates it into high-performance data lakes and data warehouses. It is an excellent fit for teams who want to quickly implement source to target data extraction and loading and are comfortable with using dbt or some other means (hello Python scripting!) to handle data transformations. For fast-growing software companies it is an excellent force multiplier that keeps your data engineers focused on higher-level tasks as opposed to managing source to target data movement.
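As a rough illustration of the Python scripting route to transformations, the sketch below cleans up raw rows after they have been loaded and rolls them up into an analytics-friendly summary. The row layout, field names, and the revenue_by_customer helper are hypothetical. They are not part of Fivetran or dbt.

```python
from collections import defaultdict

# Hypothetical raw invoice rows, as they might look after being loaded unchanged.
raw_invoices = [
    {"customer": " Acme ", "amount": "1200.00"},
    {"customer": "acme", "amount": "300.50"},
    {"customer": "Globex", "amount": "75"},
]


def revenue_by_customer(rows):
    """Normalize customer names and sum invoice amounts per customer."""
    totals = defaultdict(float)
    for row in rows:
        # Trim whitespace and unify casing so " Acme " and "acme" match.
        customer = row["customer"].strip().title()
        # Amounts arrive as strings in this sketch, so convert before summing.
        totals[customer] += float(row["amount"])
    return dict(totals)


summary = revenue_by_customer(raw_invoices)
print(summary)
```

In practice a step like this would more likely run as SQL inside the warehouse (for example as a dbt model), but the idea is the same: raw data lands first and is shaped afterwards.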