Trusted data, the Achilles heel of artificial intelligence

For several years now, the digital ecosystem, both academic and industrial, has been exploring so-called “trusted” artificial intelligence (AI). We are on the verge of regulation addressing the technological risks of deploying AI, with an emphasis on the explainability, robustness and transparency of algorithms. This is the Artificial Intelligence Act (AI Act), published by the European Commission in April 2021, which provides for graduated levels of risk according to the impact on humans.

The procedures for assessing compliance with the principles of trusted AI, which are mentioned in this bill and are the focus of industrial and standardization efforts, concern the assessment and observability of algorithmic deployments. But in any algorithmic decision, data is the fuel that powers the engine of algorithms. This is why an evaluation of the behavior of algorithms cannot be dissociated from an analysis of the state of the data, both for the purpose of operational efficiency and for evaluating conformity with the values that define a trustworthy AI.

The key question is: can we do trusted AI without trusted data? Due to the duality of data and algorithms, in essence two sides of the same coin, the answer is obviously no. However, this notion has not received the attention it deserves with regard to its impact on the overall algorithmic decision-making process.

How can we ensure confidence in the data? This notion goes beyond the usual aspects of data quality and integrity. We can, for example, consider the criterion of data eligibility, which differs from data quality. Even when the data is of high quality, including a particular subset may not be appropriate for solving a given problem and may in fact be a source of noise and bias with respect to the target problem.

Transformed data

In addition, data in companies is of varying quality and comes from different internal and external sources, in particular data brokers, which collect from a multitude of sources, including social networks. In fact, the data used in a decision process is composed and transformed data.

It is essential to control the level of trust over the entire life cycle of cross-functional data within a company, and not only in certain specific business segments. The structural foundations of the data must be ensured independently of their purpose of use, so that the data can be reused across functions. In concrete terms, this means, for example, making the reference data management models consistent within a large commercial brand. Subsequently, the choice of data models must be steered according to the “business” purpose sought, for example the different facets of consumer data.
