While the term “Analytics Engineering” might be new to your vocabulary, the concept is not. In the past, it has also been referred to as data quality, wrangling, modeling, or transformation. The core idea with each of these terms is the same: take raw data and turn it into something useful for data analysts and business users.
In this article, we break down what it means to be an analytics engineer, look at what the role involves day to day, and weigh the pros and cons of where to place this new role within your organization.
To better understand what exactly this term entails, let’s start by looking at the responsibilities that can fall under the role of an analytics engineer.
An analytics engineer's primary responsibility is to transform raw data into data sets that are useful to data analysts and business users. These data sets are usually fed into business intelligence software like Power BI or Tableau, or into machine learning models. This work requires both a good technical understanding of databases and a clear understanding of the end users' needs.
Data quality is a broad set of activities that revolves around ensuring the accuracy, completeness, and consistency of data. At its simplest, it involves correcting bad data where possible and excluding it where correction is not possible.
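As a rough illustration of "correct where possible, exclude where not," here is a minimal Python sketch. The field name `email`, the helper names `clean_email` and `clean_rows`, and the very simple validity rule are all assumptions made for this example, not a recommended production check.

```python
from typing import Optional


def clean_email(raw: Optional[str]) -> Optional[str]:
    """Return a repaired email address, or None when it cannot be repaired."""
    if raw is None:
        return None
    # Correctable problems: stray whitespace and inconsistent casing.
    candidate = raw.strip().lower()
    # Deliberately simple validity rule (an assumption for this sketch):
    # exactly one "@", text before it, and a dot in the domain.
    local, sep, domain = candidate.partition("@")
    if not sep or not local or "." not in domain or "@" in domain:
        return None
    return candidate


def clean_rows(rows):
    """Split rows into (kept, rejected) based on whether the email is usable."""
    kept, rejected = [], []
    for row in rows:
        email = clean_email(row.get("email"))
        if email is None:
            rejected.append(row)  # excluded, but retained for review
        else:
            kept.append({**row, "email": email})  # corrected in place
    return kept, rejected
```

Keeping the rejected rows in their own list, rather than silently dropping them, makes it easier to report back to the owners of the source system.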
Keeping track of entities (customers, products, employees, stores, etc.) across different systems is a challenge. Master data management (MDM) is the process of reconciling and consolidating these disparate lists of data. This work is supposed to be handled by specialist software and teams, but analytics engineers often find themselves acting in an impromptu MDM role.
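To make the reconciliation idea concrete, the sketch below merges customer lists from two hypothetical systems, labeled `crm` and `billing`, into one master list. The record fields (`id`, `name`, `postal_code`) and the exact-match rule on a normalized name plus postal code are assumptions for illustration. Dedicated MDM software typically uses much more sophisticated matching.

```python
def match_key(name: str, postal_code: str) -> tuple:
    """Build a comparison key that ignores case, punctuation, and extra spaces."""
    normalized = name.lower().replace(",", " ").replace(".", " ")
    return (" ".join(normalized.split()), postal_code.strip())


def reconcile(crm_customers, billing_customers):
    """Consolidate two customer lists into one master list.

    Each master record keeps the first name seen and the ID the customer
    has in every source system where it appears.
    """
    master = {}
    sources = (("crm", crm_customers), ("billing", billing_customers))
    for source, records in sources:
        for rec in records:
            key = match_key(rec["name"], rec["postal_code"])
            entry = master.setdefault(
                key, {"name": rec["name"].strip(), "source_ids": {}}
            )
            entry["source_ids"][source] = rec["id"]
    return list(master.values())
```

A customer that appears in both systems ends up as a single master record carrying both source IDs, which is the cross-reference downstream reports need.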
Data that is hard to access is almost as useless as a system cluttered with “bad” data. Analytics engineers can improve usability by naming or renaming tables and columns with clear, readable labels. Improving usability also includes organizing tables into facts and dimensions that allow easier ingestion into apps like Power BI.
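Both ideas in the paragraph above, readable labels and a fact/dimension layout, can be sketched in a few lines of Python. The cryptic source column names (`cust_no`, `cust_nm`, `ord_amt`, `ord_dt`) and their readable replacements are invented for this example.

```python
# Hypothetical mapping from cryptic source columns to readable labels.
COLUMN_LABELS = {
    "cust_no": "customer_id",
    "cust_nm": "customer_name",
    "ord_amt": "order_amount",
    "ord_dt": "order_date",
}


def rename_columns(row: dict) -> dict:
    """Apply readable labels; columns without a mapping keep their name."""
    return {COLUMN_LABELS.get(key, key): value for key, value in row.items()}


def split_star_schema(raw_rows):
    """Split flat order rows into a customer dimension and an order fact table."""
    dim_customer = {}
    fact_orders = []
    for raw in raw_rows:
        row = rename_columns(raw)
        # Dimension: one descriptive row per customer.
        dim_customer[row["customer_id"]] = {
            "customer_id": row["customer_id"],
            "customer_name": row["customer_name"],
        }
        # Fact: one row per order, holding measures and the dimension key.
        fact_orders.append(
            {
                "customer_id": row["customer_id"],
                "order_date": row["order_date"],
                "order_amount": row["order_amount"],
            }
        )
    return list(dim_customer.values()), fact_orders
```

The fact table carries only keys and measures, while descriptive attributes live once in the dimension, which is the shape most BI tools expect.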
It should come as no surprise that data governance is a key concern for analytics engineers. Analytics engineers view data through the lens of a software engineer. This includes using version control, unit tests, and proper documentation.
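As one small example of that software engineering mindset, a business rule can be written as a documented function and covered by unit tests. The rule itself (`net_revenue`, defined here as gross sales minus refunds) is a made-up example, and Python's built-in `unittest` module is just one of several testing options.

```python
import unittest


def net_revenue(gross: float, refunds: float) -> float:
    """Net revenue is gross sales minus refunds.

    Refunds must fall between 0 and the gross amount; anything else
    points to bad input data and raises ValueError.
    """
    if refunds < 0 or refunds > gross:
        raise ValueError("refunds must be between 0 and gross")
    return round(gross - refunds, 2)


class NetRevenueTest(unittest.TestCase):
    def test_subtracts_refunds(self):
        self.assertEqual(net_revenue(100.0, 12.5), 87.5)

    def test_rejects_impossible_refunds(self):
        with self.assertRaises(ValueError):
            net_revenue(50.0, 75.0)


def run_tests() -> bool:
    """Run the test case and report whether every test passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(NetRevenueTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

With the function, its docstring, and its tests all stored in version control, a change to the definition of net revenue becomes a reviewable, testable event rather than a silent edit to a report.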
The remaining question is where analytics engineers should reside in an organization. There are three major schools of thought: central IT, self-service, and data mesh. Any of the three can suit this role, so let’s review the pros and cons of each one.
The most obvious place for an analytics engineer is central IT. In this model, analytics are created and maintained by a team reporting up to the CIO, so the role sits alongside the team responsible for managing your organization’s data.
Placing this role directly under your CIO or Director of Data makes it easier to manage the people handling your organization’s data and to govern the tasks they’re completing. Additionally, central IT has a greater concentration of technical expertise, which allows employees within this framework to collaborate more effectively.
While central IT may seem like the most obvious arrangement for this role, it comes with its downsides. Organizations often see a gulf between IT departments and the rest of the business, which can be difficult to bridge and can hinder analytics engineers' ability to communicate fully with outside divisions. This approach can also create an overwhelming workload when the demand for analytics is much larger than one team can meet.
The initial reaction to the shortcomings of a central reporting team is often to adopt self-service BI. The core idea here is to empower business users to create their own reports and analytics.
Stationing analytics engineers within self-service BI seems like a clear choice because the individuals who directly use the analytics are most in tune with the organization’s data-related needs, meaning those who can fix problems will be the first to identify them. This school of thought can also increase the pool of people who can produce quality analytics.
Trouble can arise because data governance is often far more difficult to enforce in self-service BI. This placement can also cause obstacles when business users inevitably struggle with the technical aspects of analytics generation (Python, SQL, DAX, Tableau calculations, version control, etc.).
If neither of the first two placement options seems like the right fit, or if both seem suitable and you cannot decide between them, the third option, data mesh, offers a compromise. A centralized group in IT manages infrastructure and governance, while each business unit is expected to provide analytics related to its own business domain.
Placing your analytics engineers within this framework unites business and technical users in cross-functional teams, allowing for clear communication between them. Additionally, a data mesh can empower your organization to scale beyond what is possible with a central analytics team.
On the downside, placing your analytics engineers in a data mesh will require significant organizational restructuring. This type of change also raises the level of engineering discipline and maturity required across the business units.
Now that we have identified the placement options for your analytics engineer, we must decide which toolset will allow them the most success in your organization.
SQL and dbt are already widely known in the world of data analytics, and they are crucial for any analytics engineer who wants to leverage the power of cloud data warehouses to build an effective, accessible database for your organization. On the other hand, SQL and dbt can limit one’s ability to work with certain complex problems, due to the routine nature of these tools.
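For a feel of what SQL-based transformation looks like, the sketch below defines a summary view over a raw orders table. It uses Python's built-in `sqlite3` module as a stand-in for a cloud data warehouse, and the table, column, and view names (`raw_orders`, `customer_revenue`) are invented for the example. In dbt, a model is essentially a SELECT statement like the one inside the view, with dbt handling how it is materialized in the warehouse.

```python
import sqlite3

# The transformation itself: completed orders rolled up per customer.
MODEL_SQL = """
CREATE VIEW customer_revenue AS
SELECT customer_id,
       COUNT(*)    AS order_count,
       SUM(amount) AS total_revenue
FROM raw_orders
WHERE status = 'completed'
GROUP BY customer_id
"""


def build_model():
    """Load sample raw data, build the view, and return its rows."""
    conn = sqlite3.connect(":memory:")  # in-memory stand-in for a warehouse
    conn.execute(
        "CREATE TABLE raw_orders "
        "(order_id INTEGER, customer_id TEXT, amount REAL, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO raw_orders VALUES (?, ?, ?, ?)",
        [
            (1, "C1", 40.0, "completed"),
            (2, "C1", 60.0, "completed"),
            (3, "C2", 25.0, "cancelled"),
            (4, "C2", 15.0, "completed"),
        ],
    )
    conn.execute(MODEL_SQL)
    rows = conn.execute(
        "SELECT customer_id, order_count, total_revenue "
        "FROM customer_revenue ORDER BY customer_id"
    ).fetchall()
    conn.close()
    return rows
```

Calling `build_model()` returns one row per customer with the cancelled order filtered out, which is the kind of clean, analysis-ready table a BI tool would consume.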
Python is another commonly used tool in the world of data science. Its extreme versatility allows your analytics engineer to decide how to apply it to the problems in front of them. However, it is important to consider that Python requires significantly more tooling and expertise than SQL.
Visual design tools can be extremely useful to your organization because non-technical team members can use them easily. This promotes efficiency in problem solving by enlarging the pool of people who can navigate the system. Despite the flexibility of these tools, their software licensing can come with a hefty price tag, and they often struggle to scale to larger use cases.
Like the data science field that preceded it, analytics engineering is still in the process of defining its role and scope in the data ecosystem, but organizations everywhere are learning how to use it to their advantage.
In summary, we’ve explored what the role of an analytics engineer entails, reviewed the most reasonable schools of thought in which to place your analytics engineer, and identified the tools an analytics engineer needs to be successful.
If you’re interested in exploring how your organization can find and benefit from an analytics engineer, contact Onebridge to chat with one of our data experts.