Getting value out of your data doesn’t have to be that hard

The potential implications of continued global data growth stagger the imagination. According to a 2018 report, every person generates an average of 1.7 MB of data per second, and annual data creation has more than doubled since then, with projections showing it more than doubling again by 2025. A McKinsey Global Institute report estimates that skillful use of big data could generate an additional $3 trillion in economic activity, enabling applications as diverse as self-driving cars, personalized healthcare, and traceable food supply chains.

But all of this data also creates confusion about how to find, use, manage, and legally, securely, and efficiently share it. Where did a particular dataset come from? Who owns what? Who is allowed to see which parts? Where is it stored? Can it be shared? Can it be sold? Can people see how it has been used?

As data applications grow and become more ubiquitous, data producers, consumers, owners, and stewards are finding there is no playbook to follow. Consumers want to connect to data they trust so they can make better decisions. Producers need tools to share their data securely with those who need it. But technology platforms are fragmented, and there is no genuine shared source of truth connecting both sides.

How do we find data? When should we move it?

In an ideal world, data would flow freely, like a utility available to everyone. It could be packaged and sold as a raw material. It could be quickly and easily viewed by anyone with the right to see it. Its origins and movement would be traceable, eliminating any fear of misuse anywhere along the line.

Today’s world, of course, does not work that way. The tremendous growth in data volumes has created a long list of challenges that make it difficult to share even small pieces of information.

Since data is generated almost everywhere, both inside and outside the organization, the first challenge is to determine what will be shared and how to organize it so that it can be found.

Several obstacles stand in the way:

  • A lack of transparency and sovereignty over stored and processed data and infrastructure creates trust issues.
  • Moving data from multiple technology stacks to centralized locations is costly and inefficient.
  • The absence of open metadata standards and widely available APIs makes data hard to access and use.
  • Sector-specific data ontologies can make it difficult for people outside the sector to benefit from new data sources.
  • Multiple stakeholders and difficulty accessing existing data services make sharing hard without a governance model.
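
The metadata gap described above can be made concrete with a minimal sketch. Assuming a hypothetical in-memory catalog (all names and fields here are illustrative, not any real exchange's schema), a dataset descriptor might carry ownership, origin, and licensing fields so data can be found and governed without moving it:

```python
from dataclasses import dataclass, field

@dataclass
class DatasetDescriptor:
    """Minimal, hypothetical metadata record for a shareable dataset."""
    dataset_id: str
    owner: str
    origin: str            # where the data was produced
    license: str           # terms under which it may be shared or sold
    tags: list = field(default_factory=list)

# A toy in-memory catalog; a real exchange would expose this via an open API.
catalog = [
    DatasetDescriptor("telemetry-2024", "acme-motors", "factory-sensors",
                      "internal-only", tags=["iot", "manufacturing"]),
    DatasetDescriptor("claims-q3", "acme-insurance", "claims-portal",
                      "partner-share", tags=["finance"]),
]

def find(tag):
    """Findability: locate datasets by tag without touching the data itself."""
    return [d.dataset_id for d in catalog if tag in d.tags]
```

Searching such a catalog answers "where did this come from?" and "who owns it?" before any bytes change hands, which is the point of open metadata standards.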

Europe takes the lead

Despite the challenges, data-exchange projects are being implemented at scale. One, backed by the European Union and a non-profit group, is creating a collaborative data exchange called Gaia-X, where businesses can exchange data under the protection of strict European data-privacy laws. The exchange is conceived both as a means of sharing data across industries and as a repository of information about data-processing services related to artificial intelligence (AI), analytics, and the Internet of Things.

Hewlett Packard Enterprise recently announced a framework to support the participation of companies, service providers, and community organizations in Gaia-X. The data-spaces platform, currently in development and based on open standards and cloud technologies, democratizes access to data, analytics, and AI, making them more accessible to domain experts and ordinary users. It is a place where subject-matter experts can more easily identify reliable datasets and securely analyze operational data without always requiring costly data movement to centralized locations.

By leveraging this framework to integrate complex data sources into their IT landscapes, enterprises can achieve broad data visibility, so that everyone, data scientist or not, knows what data they have, how to access it, and how to use it in real time.

Data-sharing initiatives are also top of mind within enterprises. One of the main challenges businesses face is validating the data used to train internal AI and machine-learning models. AI and machine learning are already widely used across businesses and industries to continually improve everything from product development to recruiting to manufacturing. And we’re just getting started: IDC predicts that the global AI market will grow from $328 billion in 2021 to $554 billion in 2025.

To unlock the true potential of AI, governments and businesses need to better understand the collective lineage of all the data that underpins these models. How do AI models make decisions? Are they biased? How trustworthy are they? Could untrusted parties have accessed or altered the data on which the enterprise trained its model? Connecting data producers with data consumers more transparently and efficiently can help answer some of these questions.

Data maturity

Businesses aren’t going to figure out how to unlock all of their data overnight. But they can prepare by adopting technologies and governance concepts that help shape a data-sharing mindset. They can build the maturity to use or share data strategically and effectively, rather than doing so on a one-off basis.

Data producers can prepare for wider data sharing by taking a few steps. First, they need to understand where their data is and how it is collected. Then they need to make sure the people consuming the data can access the right datasets at the right time. That is the starting point.
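
The second step, matching consumers to the datasets they are entitled to, can be sketched with a minimal role-based check. The dataset names and roles below are hypothetical, chosen only to illustrate the idea:

```python
# Hypothetical access rules: which roles may read which datasets.
ACCESS = {
    "telemetry-2024": {"data-scientist", "plant-engineer"},
    "claims-q3": {"auditor"},
}

def can_read(role, dataset_id):
    """Return True if the given role is allowed to consume the dataset.

    Unknown datasets default to no access, which fails safe.
    """
    return role in ACCESS.get(dataset_id, set())
```

In practice these rules would live in a governance service rather than a dict, but the check itself, role in, dataset out, is the core of "right datasets at the right time."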

Then comes the tricky part. A data producer’s consumers, who may be inside or outside the organization, must be connected to the data. This is both an organizational and a technological problem. Many organizations need to manage how they communicate with other organizations. Democratizing data, or at least making it findable across organizations, is a matter of organizational maturity. How do they handle it?

Companies in the automotive industry actively exchange data with suppliers, partners, and subcontractors. It takes many parts, and a lot of coordination, to assemble a car, and partners are eager to share information over the Internet on everything from engines to tires to repair channels. More than 10,000 suppliers could participate in automotive data spaces. Other industries can be more insular: some large companies may be reluctant to share confidential information even within their own network of business units.

Creating a data mentality

Companies on both sides of the consumer-producer continuum can improve their data-sharing mindset by asking themselves the following strategic questions:

  • If a business is building AI and machine-learning solutions, where do its teams get their data? How do they connect to that data? And how do they track its history to ensure its reliability and provenance?
  • If the data has value to others, what monetization path is the team taking today to realize that value, and how can it be governed?
  • If a company is already sharing or monetizing data, can it enable a broader set of services across multiple platforms, both on-premises and in the cloud?
  • How do organizations that must communicate with vendors today keep those vendors aligned on the same datasets and updates?
  • Do producers want to replicate their data, or have consumers bring their models to the data? Datasets can be so large that copying them is impractical. Should a company instead host developers on the platform where its data lives and move models back and forth?
  • How can the people in a department who use data influence the way data producers in their organization work?

Taking action

The data revolution is creating business opportunities, along with a lot of confusion about how to find, collect, manage, and extract actionable information from data in a strategic way. Producers and consumers of data are increasingly drifting apart. HPE is building a platform that supports both on-premises and public-cloud environments, using open source as a foundation and solutions such as the HPE Ezmeral Software Platform to provide the common ground both parties need to make the data revolution work for them.

Read the original article on Enterprise.nxt

This content was produced by Hewlett Packard Enterprise. It was not written by the editors of MIT Technology Review.
