Here we find ourselves in yet another election year. Technology continues to advance at a record pace – autonomous cars, a 5G Super Bowl Broadcast, cognitive services available in all major cloud providers for anyone to use, the list goes on. So why in the year 2020 do we still see such a gap in how we collect and manage polling data?
As a data consulting firm, we often see organizations put advanced analytics technology in place without a well-designed process for using it. The lack of a data strategy prevents them from getting clear answers to their business questions. Data governance and management are frequently a missing piece of the puzzle.
How much more efficient, secure, inclusive, and accurate could the process be? What can we learn from our failure to leverage modern data management and technology in this particular area, and how can we apply those lessons in other areas of our society? Let’s explore a bit.
The recent challenge with the Iowa caucus is a prime example of why data governance and management are so important. New technology actually exacerbated the problem it was designed to solve: an untested app was quickly rolled out to make it easier for Iowa precinct chairs to report their local election results to the state party and, unfortunately, made it impossible to do so due to connectivity issues. The backup reporting method of calling in results via telephone was pushed past its bandwidth and failed for many precincts. This led many candidates into the same grey area that many of today’s business managers live in – making educated guesses about next steps rather than informed decisions.
For data experts, there are a few easy-to-spot mistakes common to situations like this:
Not only has this happened many times before when technology has been used to simplify or otherwise improve the election process, but it’s also common in business. In any organization, until the basics of data governance are observed, there will always be a lack of clarity in results.
In 2012, the Republican Party in Iowa was plagued with similar delays and uncertainty. The results of several precincts were lost permanently, and inconsistencies in paperwork were found in over a hundred more. Mitt Romney was initially announced as the winner, with a retraction more than two weeks later declaring Rick Santorum the winner instead by a very narrow margin.
The phrasing of a question can also create problems for data analysis in elections, because each question is effectively a data field that yields results. Questions written by people unfamiliar with data governance may be worded in confusing ways, leading to misleading results. Errors in user input can become a problem as well, as in the case of government officials calling in election results to an automated system, punching in numbers on the phone with no safeguard in case of a mistaken entry. Whether it’s on paper, on the phone, or in an app, proper data collection methods and data management procedures must be followed to offer accurate results.
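To make the idea of a safeguard against mistaken entry concrete, here is a minimal sketch of the kind of check an intake system could run before accepting a called-in result. The function name, the shape of the report, and the attendance rule are all assumptions made for illustration, not a description of any system actually used in Iowa.

```python
def validate_report(votes, attendance):
    """Return a list of problems found in one precinct's reported counts.

    `votes` maps a candidate label to a reported count; `attendance` is the
    number of participants recorded for the precinct. Both are hypothetical
    inputs chosen for this sketch.
    """
    errors = []
    for candidate, count in votes.items():
        # Reject anything that is not a whole, non-negative number.
        if isinstance(count, bool) or not isinstance(count, int) or count < 0:
            errors.append(f"{candidate}: count must be a non-negative integer")
    # Only check the total once every individual count is well formed.
    if not errors and sum(votes.values()) > attendance:
        errors.append("total votes exceed recorded attendance")
    return errors


# A mistyped extra digit (400 instead of 40) is caught before it is stored.
print(validate_report({"Candidate A": 400, "Candidate B": 35}, attendance=80))
```

A check like this does not prove a result is right, but it lets the system ask the caller to re-enter an implausible number instead of silently recording it.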
The first major challenge in the management of voting data is in its collection. There is a big gap between natural language and clean data. It feels simple to ask, “Who do you vote for?” because our brains can naturally translate a wide variety of responses. To a computer, however, you must be very specific. “Candidate A,” “Joe Biden,” and “The Vice President” may all be the same answer to us, but they are three different answers when it comes to data. Any time we set out to collect data in politics or business, we have to define exactly what we are looking for and how we are going to collect it. In a good data governance strategy, defining exactly how we are going to ask for and collect data (called “semantic modeling”) is an essential step.
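As a rough illustration of closing that gap, the sketch below maps several ways of writing the same answer onto one canonical value before counting. The alias table, the identifier `candidate_a`, and the function names are hypothetical; a real semantic model would be agreed on and documented before any data is collected.

```python
from collections import Counter

# Hypothetical alias table: every accepted spelling maps to one canonical ID.
CANDIDATE_ALIASES = {
    "candidate_a": {"candidate a", "joe biden", "the vice president"},
}


def normalize_response(raw, aliases=CANDIDATE_ALIASES):
    """Return the canonical ID for a free-text answer, or None if unknown."""
    # Lowercase and collapse whitespace so trivial variations compare equal.
    cleaned = " ".join(raw.lower().split())
    for canonical, names in aliases.items():
        if cleaned in names:
            return canonical
    return None


def tally(responses):
    """Count recognized answers and set aside the ones that need review."""
    counts = Counter()
    unrecognized = []
    for response in responses:
        canonical = normalize_response(response)
        if canonical is None:
            unrecognized.append(response)  # flag for a human, do not guess
        else:
            counts[canonical] += 1
    return counts, unrecognized


counts, unrecognized = tally(
    ["Candidate A", " joe  biden ", "The Vice President", "someone else"]
)
print(counts, unrecognized)
```

The design choice worth noting is that unknown answers are kept separately rather than forced into a category, so the cleanup step never invents a vote.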
With the new app released to collect data in the Iowa caucus, the interface, and even the process of downloading the app, left users frustrated and confused. Paper was still used: paper ballots were sometimes required, and in other cases names and preferences were captured on cards to create a “paper trail.” Inconsistent data collection methods that also require some degree of manual tallying are prone to human error and reconciliation issues, and they inherently require more time to analyze while still producing imperfect results.
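When the same result arrives through more than one channel, reconciliation means comparing the copies and surfacing every difference. The sketch below assumes two hypothetical tallies for one precinct, one from an app and one from the paper trail, and simply lists where they disagree.

```python
def reconcile(app_tally, paper_tally):
    """Return {candidate: (app_count, paper_count)} for every mismatch.

    Both arguments are hypothetical dictionaries of candidate label to count;
    a candidate missing from one source is treated as a count of zero.
    """
    discrepancies = {}
    for candidate in sorted(set(app_tally) | set(paper_tally)):
        app_count = app_tally.get(candidate, 0)
        paper_count = paper_tally.get(candidate, 0)
        if app_count != paper_count:
            discrepancies[candidate] = (app_count, paper_count)
    return discrepancies


app = {"Candidate A": 41, "Candidate B": 35}
paper = {"Candidate A": 40, "Candidate B": 35, "Candidate C": 2}
print(reconcile(app, paper))
```

An empty result means the two sources agree; anything else is a list of items to investigate before the numbers are reported.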
From there, you have to address data aggregation and analysis, lineage, quality or trustworthiness, reconciliation, reporting, and remediation. All of this should be supported by a sound data security plan that employs secure transport methods and encryption both at rest and in transit, which is incredibly important given the implications for an election. Putting something so complex into place is a huge and hugely important task, and it ought to be approached in the proper way by experts in the field.
Issues with data collection and analysis in elections are poignant examples of the importance of proper data governance because voting seems like it should be such a simple process: do you choose this, or that? But as we’ve seen, even with the simplest of goals, data collection is a complex, multi-faceted process that requires a very specific approach. The disruption at the Iowa caucus demonstrates that new technology is seldom the cure unless it is accompanied by a well-thought-out, tested, and verified process and a user base fully equipped to use it. This can only be achieved with a comprehensive data strategy.