A Successful Journey with Data Science
Sometime around 1850, Jacob Waltz is said to have found a gold mine in the southwestern United States. He never told anyone where it was, and no one knows for sure whether he actually found one. Nevertheless, people are still searching for the mine to this day. Just as prospectors sought their fortune in gold, CIOs can seek theirs in their enterprise’s data, leveraging it more effectively to create greater value for their organizations.
Competitive advantage from technology usually accrues to the first to leverage it. Having seen other organizations achieve impressive early successes with big data, many CIOs are considering their own data science journey. Like finding a lost gold mine, it can seem an elusive undertaking, but I will provide an approach to get started. I guarantee you’ll have a much easier time using data science to solve pressing problems than finding Waltz’s lost mine, and the payoff will be as good or better.
At SAIC, we define data science as the discipline of managing, organizing, and extracting actionable insights from data to derive value for mission success. While there are many ways to establish a data science program, I recommend an incremental approach that lets your organization become progressively more data-driven.
In an ideal world, the CIO can build the foundation for a data science culture by starting with a strategic plan. But most CIOs live far from an ideal world, so they need an approach that lets them try out data science and, in so doing, solve a pressing problem. After a few successes, a strategic plan will make those successes easier to repeat. Start with an easily defined business problem that clearly depends on data and isn’t overly complex.
First, drive a shift in how the stakeholders of the chosen business problem view data. An organization’s leaders should treat their data as an asset to be leveraged, not as a mere byproduct of a business process or simply a means of measuring performance. Your organization should use data to significantly improve business decisions and outcomes. Let this principle guide your data science strategy for collecting, migrating, consolidating, archiving, and retiring existing datasets and systems.
To plan the project, define a clear vision for success and then assess the current environment from both the data and the business perspectives. To do data science, you need data, domain experts, data scientists, and tools. Pairing the tools and technology with people who have the right skill sets is very important. Because new tools and resources can be expensive, secure funding for tools and infrastructure, and empower the team by clearing any organizational and political hurdles.
The data science journey from initial planning to execution rests on three pillars: architecture planning, data lifecycle management, and operations. Establish a reference architecture to support developing repeatable solutions and building expertise in strategic areas. Introduce modern platforms only as needed so that the transition stays controlled. Address data lifecycle management by planning for governance, compliance, metadata, risk management, and security. These aspects should be in the roadmap from the beginning, since the cost of addressing them later can be quite high.
You can use your organization’s usual deployment process to take a data science project from the prototype stage to operations. However, consider taking a fresh approach and tapping into Agile and DevOps practices. Doing so promotes greater transparency and communication with all stakeholders and introduces automation, both of which can lead to significant cost savings in the long run. Finally, a successful data science effort measures the value added at various milestones and uses those measurements as a feedback loop for further tuning the roadmap.