AI and ML are key drivers of digital transformation. Almost any modern product or service is expected to include capabilities based on advanced machine learning algorithms, providing value such as predictive maintenance in manufacturing or cyber incident detection.

Emerging technologies almost always bring privacy and confidentiality considerations, yet the scale and application of AI create unprecedented and unique challenges. Building ML models requires vast amounts of data; some of it may be business-sensitive, and some may be private information. Data scientists use this data to learn, develop adaptive models, and make actionable predictions, often without a transparent, explainable process.

The Cambridge Analytica scandal was an eye-opening global event. People started to take stock of their privacy, leading to the emergence of privacy legislation and regulation that is driving tectonic changes in the market and in customer expectations. With Google's forthcoming sunsetting of third-party cookies, marketers will have to find new ways to be effective in a world without this method of identification. This, combined with Apple and its fellow tech giants tightening privacy protections, has reinforced the primacy of first-party data.

Data has become a precious asset; some say it is more valuable than oil. It has been commoditized for some time, but as conditions change, obtaining customers' data, behavior, and profiles will become more complex and potentially more expensive, as companies will need to collect the data directly from the customers themselves.

Such changes will probably lead to new ways of building AI and implementing ML. Google is working on an alternative technology for customer profiling that implements ML at the edge, without ever consolidating raw user data in its data centers. Federated Learning of Cohorts (FLoC) groups users into "cohorts" based on their preferences and personal attributes, using computations executed directly on users' devices. The derived results are consolidated on central servers, where they are combined into a primary model that is propagated back to the edge for retraining.
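
To make the on-device grouping idea concrete, here is a toy sketch of a SimHash-style cohort assignment. Chrome's FLoC trial used a SimHash variant, but the interest vector, seed, and function names below are illustrative assumptions, not the actual algorithm:

```python
import numpy as np

def cohort_id(interests: np.ndarray, num_bits: int = 8, seed: int = 42) -> int:
    """Toy SimHash-style cohort assignment, computed entirely on-device.

    The interest vector is projected onto `num_bits` shared random
    hyperplanes; the pattern of sign bits becomes the cohort ID, so users
    with similar interests tend to land in the same cohort while their
    raw interest data never leaves the device.
    """
    rng = np.random.default_rng(seed)    # shared seed: every device derives the same planes
    planes = rng.standard_normal((num_bits, interests.size))
    bits = (planes @ interests) > 0      # which side of each hyperplane the user falls on
    return int(sum(int(b) << i for i, b in enumerate(bits)))

# Two users with similar interests usually share a cohort; only the small
# cohort ID, never the interest vector itself, would be reported.
alice = np.array([0.9, 0.1, 0.8, 0.0])
bob = np.array([0.8, 0.2, 0.9, 0.1])
print(cohort_id(alice), cohort_id(bob))
```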

Cohorts that categorize users by topic have not reached consensus yet; critics argue the method will not mitigate privacy concerns, and leading browsers have rejected it outright. But the underlying federated learning technology is undoubtedly worth noticing in this new era. Federated learning is a machine learning technique that trains an algorithm across multiple decentralized edge devices or servers holding local data samples, without exchanging those samples. It also accounts for the fact that local data samples are not necessarily identically distributed across devices.
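
To illustrate the mechanics, here is a minimal federated averaging (FedAvg) sketch, assuming simulated clients with deliberately unequal, non-identically distributed local datasets; the linear-regression task and all names are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical non-IID setup: three clients each hold a private (X, y) shard
# of a shared linear-regression problem; the shards never leave the client.
true_w = np.array([2.0, -1.0])
clients = []
for n in (30, 80, 150):                          # deliberately unequal sample counts
    X = rng.normal(size=(n, 2)) + rng.normal()   # per-client shift: non-identical distributions
    y = X @ true_w + rng.normal(scale=0.1, size=n)
    clients.append((X, y))

def local_update(w, X, y, lr=0.02, epochs=5):
    """One client's work: refine the global weights using local data only."""
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)    # gradient of mean squared error
        w = w - lr * grad
    return w

w_global = np.zeros(2)
for _ in range(30):                              # one federated round per iteration
    updates = [local_update(w_global, X, y) for X, y in clients]  # runs on each device
    sizes = [len(y) for _, y in clients]
    # The server only ever sees weight vectors: it averages them, weighted
    # by each client's sample count (the FedAvg rule), and broadcasts back.
    w_global = np.average(updates, axis=0, weights=sizes)

print("recovered weights:", np.round(w_global, 2))   # approximately [ 2. -1.]
```

The server never sees the raw local shards, only the trained weight vectors, and weighting the average by sample count is what lets unevenly sized, non-identically distributed clients contribute proportionally.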

This collaborative learning method allows multiple parties to build a robust, common machine learning model without compromising the sovereignty and confidentiality of each actor's data, thereby addressing the critical issues of data privacy, data security, and data access rights.

This technology is not yet a commodity in the way other machine learning techniques and tools are, readily available from public cloud providers or dedicated machine learning platforms. Some complicated problems need to be solved before federated learning can become a standard tool in the data scientist's arsenal. One example is vertical federated learning (as opposed to horizontal federated learning), where the collaborating parties do not hold the same feature space. This use case requires the parties to share the same ID space, which can introduce yet another data-sharing requirement.
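
A naive sketch can show why the vertical case reintroduces a sharing requirement. The parties, features, and hashed-ID matching below are illustrative assumptions; production systems use cryptographic private set intersection rather than plain hashing:

```python
import hashlib

def blind(customer_id: str) -> str:
    """Hash an ID so parties can compare membership without exposing raw IDs.

    A naive stand-in: production vertical FL uses cryptographic private
    set intersection (PSI), which also hides the non-overlapping IDs.
    """
    return hashlib.sha256(customer_id.encode()).hexdigest()

# Horizontal FL: both parties hold the SAME features for DIFFERENT customers.
bank_a = {"alice": {"income": 70, "age": 34}, "bob": {"income": 55, "age": 41}}
bank_b = {"carol": {"income": 90, "age": 29}}      # same feature space, new rows

# Vertical FL: overlapping customers, but each party holds DIFFERENT features,
# so training first requires aligning the shared ID space.
bank = {"alice": {"income": 70}, "bob": {"income": 55}}
retailer = {"alice": {"purchases": 12}, "dave": {"purchases": 3}}

shared = {blind(k) for k in bank} & {blind(k) for k in retailer}
aligned = [k for k in bank if blind(k) in shared]  # only "alice" is jointly trainable
print(aligned)                                     # ['alice']
```

Even this blinded matching reveals which IDs overlap, which is exactly the residual data-sharing requirement described above.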

Prediction is difficult, especially when it involves the future. But as technology evolves and these problems are solved, federated learning can become a new way to collaboratively create insights and value without sharing valuable, sensitive, or confidential data. Retailers can build customer preference models without sharing customer data. Cyber security platforms can create unified detection algorithms, and manufacturing execution systems can predict maintenance windows, without requiring any single party to aggregate the telemetry data.