Why data management is key in the age of big data
As data and its analysis have moved out of the hands of specialised data scientists into much more accessible, enterprise-wide settings, it has become ever more important to have a clear idea of data management strategy and the best practices it involves.
One such approach, DataOps, involves connecting data creators and data consumers. According to a report authored by data storage company Seagate and IDC, the majority of businesses consider DataOps to be at least very important, with 71%, 72% and 73% of those seeing a boost in customer loyalty, revenue and profit respectively. Rose Hiu, Vice President Global Marketing and PR at Seagate, says DataOps “means ensuring data classification is robust but flexible, and each data set has a designated purpose. Companies should view DataOps as the missing link in their data management, where integration takes the centre stage.”
----
Key takeaways from Seagate and IDC’s Rethink Data report
“Every day, businesses prove that data holds tremendous value when captured, stored, and leveraged to its full extent—an increasingly difficult task in a rapidly changing multi-cloud, multi-edge world. The explosion in data creation, coupled with the increasing need to mobilize and analyze it at unprecedented volume and speed provide a complex backdrop. Meanwhile, resource scarcity and technology limitations exacerbate enterprise pain points as IT architecture and data management practices evolve to capitalize on the enormous opportunity to put more data to work.”
----
Whichever approach is taken, there are a number of shared core tenets to managing data, such as ensuring a high level of quality is maintained. That wasn’t always the case, however, as Jean-Michel, Senior Director Data Governance, Talend, explains: “Not so long ago, most businesses believed data quality was not their concern. However, now, not only do businesses understand the value of data, but they also understand the risks and problems that can arise if data is not handled correctly. The idea that data is an asset is finally becoming mainstream.”
The right protocols
That level of care must be present from the beginning, or else the whole project will be built on shaky foundations. Hiu says: “We advise companies to identify and classify their information streams appropriately. They must ask where data will be stored, what type of data will they be acquiring, and whether it will be sales data, employee time logs, or IoT assets? Data management solutions can save significant amounts of time here via automating these processes, but it’s vital IT decision-makers determine the right protocols at the outset.”
That’s made easier if the ways of interacting with data themselves are demystified, as Alan Gibson, Vice President of EMEA, Alteryx, explains: “Without any hesitation, making data and analytics more accessible and intuitive needs to be the key priority as we move into 2021 and beyond. Organisations can only achieve effective data-driven transformation when they promote an analytics culture. This means bringing everyone in the company along on the journey by democratising data.”
Organisations stand to gain everything from making that leap, and indeed proper data management may be a prerequisite for success in increasingly digital economies. “The future is amplifying human intelligence through self-service data analytics platforms,” says Gibson. “Removing the need for a data science degree, these platforms mean anyone can be elevated to data analyst. Empowering data workers to quickly build repeatable AI-informed predictive models without the need for coding or performing complex statistics.”
Data demand
The increasing demand for data scientists shows, however, that plenty of data management challenges still require human ingenuity to fix. While machines might be getting smarter, they are only ever as good as the data they are trained on, all of which requires proper labelling. “The effort required at present to manipulate data into a meaningful and query-able format is a challenge,” says Simon Cole, CEO at Automated Intelligence. “80% of data a company holds is unstructured; this data is challenging to examine, bring structure to and extract value from.”
There’s also the question of so-called “legacy” data hanging around organisations, which may differ in format and pose a risk of making data sets unreliable if not properly cleaned. Along similar lines, the multiple sources of data available to a modern company must be brought together if they are to be of any use. “The biggest challenges come from data fragmentation,” says Tido Carriero, Chief Product Development Officer at Segment. “As the number of customer touchpoints balloons, it becomes harder for businesses to ensure that they have a complete and reliable understanding of their customers.”
Carriero emphasises that the biggest challenges all fall under this umbrella of improper coordination. “Similarly, data silos can prevent teams from getting the data they need, and create blind spots which can result in security and privacy issues. Other common challenges relate to having incomplete or unreliable data. Many companies are still using haphazard data capture methods that don’t update in real time, while others have no clear system for making sense of their data and putting it to effective use. Unless it can be activated, data is a useless resource, so this is a real missed opportunity.”
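To make the fragmentation problem concrete, here is a minimal Python sketch of what bringing sources together can involve. It is an illustration only, not a method described by any of the companies quoted: the field names, the two date formats, the sample records and the rule that earlier sources take precedence are all assumptions chosen for the example.

```python
from datetime import datetime

# Hypothetical date formats that a current and a legacy system might use.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def parse_date(text):
    """Try each known format in turn; return None if none match."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None


def consolidate(*sources):
    """Merge records from several sources into one profile per email address."""
    profiles = {}
    for records in sources:
        for record in records:
            # Normalise the key so "Ana@..." and "ana@..." are one customer.
            key = record["email"].strip().lower()
            profile = profiles.setdefault(key, {"email": key})
            for field, value in record.items():
                if field == "email" or value in (None, ""):
                    continue
                if field == "signup_date":
                    value = parse_date(value)
                    if value is None:
                        continue  # unreadable legacy value: leave the gap
                # Earlier sources take precedence; later ones only fill gaps.
                profile.setdefault(field, value)
    return profiles


def missing_fields(profile, required=("name", "signup_date")):
    """List required fields that no source could supply."""
    return [field for field in required if field not in profile]


# Made-up sample data standing in for two disconnected systems.
crm = [{"email": "Ana@example.com", "name": "Ana", "signup_date": "2020-03-01"}]
legacy = [
    {"email": "ana@example.com", "name": "", "signup_date": "01/03/2020"},
    {"email": "bo@example.com", "name": "Bo", "signup_date": "unknown"},
]

profiles = consolidate(crm, legacy)
for email, profile in profiles.items():
    print(email, "missing:", missing_fields(profile))
```

Even in this toy case, the two systems disagree on letter case and date format, and one record arrives with a value that cannot be interpreted at all. Real consolidation work involves many more sources and far messier rules.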
As society becomes ever more data-hungry, the correct management of that data will only become more important. Taming the vast volumes of data produced daily will be key to the continuing improvement of artificial intelligence, as Cole explains: “80% of the time spent by data scientists is in manipulating data to get it into a form that is useful. Repeatable automated and intelligent capability around data capture, lineage tracing, quality and insight for AI and machine learning are required to standardise data management in an open and transparent manner.”