Cracking Data Silos: A New Era of Federated Learning

In the realm of artificial intelligence, a significant barrier has long plagued researchers and developers: the data silo. This phenomenon, where organizations hoard their data due to privacy and security concerns, has hindered the potential of machine learning models. However, a recent surge in research has led to the development of innovative solutions that aim to break down these silos and unlock the true potential of data collaboration.

Transfer Learning

Professor Qiang Yang, Chair Professor at the Hong Kong University of Science and Technology and Chief Artificial Intelligence Officer of WeBank, has long been active in machine learning research. He has studied transfer learning extensively. Transfer learning uses knowledge gained from a large dataset to improve performance on a related task that has much less data. The approach has many applications, such as transferring loan risk-control strategies, recommendation systems, and public opinion analysis.

The Challenges of Federated Learning

While transfer learning has shown promising results, it has its limitations. In recent years, researchers have found that transfer learning can be applied to more complex problems, but these problems call for more sophisticated solutions. For instance, modern organizations hold vast amounts of data, yet each maintains its own database and cannot easily share that data directly. The European Union's General Data Protection Regulation (GDPR) has further tightened the requirements for protecting user privacy.

Federated Learning: A Solution to the Data Silo Problem

Professor Yang’s AI team at WeBank has led research into federated learning solutions that address these challenges. At the “New Generation of Artificial Intelligence Academician Summit” in December 2018, Professor Yang introduced two models of federated learning: vertical federated learning and horizontal federated learning. In horizontal federated learning, the parties hold the same features for different users; in vertical federated learning, they hold different features for the same users. Both models use homomorphic encryption to protect data privacy and security.
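To make the role of homomorphic encryption more concrete, the following is a minimal sketch of an additively homomorphic scheme. It assumes the Paillier cryptosystem, which the article does not name, and uses tiny hard-coded primes that are far too small to be secure. The function names (`keygen`, `encrypt`, `decrypt`, `add_encrypted`) are hypothetical and chosen only for illustration.

```python
import math
import random

def keygen(p=293, q=433):
    # Toy primes for illustration only (an assumption; real keys use primes
    # of 1024 bits or more).
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse, requires Python 3.8+
    return n, (lam, mu)

def encrypt(n, m):
    # Ciphertext c = (n+1)^m * r^n mod n^2, with random r coprime to n.
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(n, private_key, c):
    # m = L(c^lam mod n^2) * mu mod n, where L(x) = (x - 1) // n.
    lam, mu = private_key
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n

def add_encrypted(n, c1, c2):
    # Multiplying ciphertexts adds the underlying plaintexts.
    return (c1 * c2) % (n * n)

n, private_key = keygen()
c_sum = add_encrypted(n, encrypt(n, 17), encrypt(n, 25))
total = decrypt(n, private_key, c_sum)
print(total)
```

In this sketch, a party that holds only the public value `n` can add two encrypted numbers without learning either of them; only the holder of the private key can read the sum. This is the property that allows intermediate results to be combined across institutions without exposing raw data.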

Three Papers Detailing Federated Learning Solutions

The WeBank AI team has published three papers describing federated learning solutions that meet security requirements. The papers cover supervised learning, reinforcement learning, and decision tree models.

Secure Federated Transfer Learning

The first paper, “Secure Federated Transfer Learning” ([1812.03337] Secure Federated Transfer Learning), proposes a framework for federated transfer learning (FTL) that uses the data resources within a data federation to improve model performance. The approach keeps each member's data private from the other members while allowing complementary knowledge to be transferred across the network.

Federated Reinforcement Learning

The second paper, “Federated Reinforcement Learning” ([1901.08277] Federated Reinforcement Learning), introduces a new reinforcement learning framework that takes privacy requirements into account. The framework lets agents learn from each other’s experience without sharing their own observations, which makes it suitable for multi-agent reinforcement learning.

SecureBoost: A Secure Tree Model

The third paper ([1901.08755] SecureBoost: A Lossless Federated Learning Framework) proposes SecureBoost, a new lossless, privacy-preserving tree-boosting system. SecureBoost allows the learning process to be carried out jointly by multiple parties that share a common set of user samples but hold entirely different feature sets, which corresponds to a vertically partitioned dataset.
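The vertically partitioned setting can be illustrated with a small sketch. It assumes the standard gradient-boosting split gain with L2 regularization, a logistic loss with an initial prediction of 0.5, and made-up data. All names (`bucket_sums`, `split_gain`, `income`) are hypothetical. Encryption is deliberately omitted here: in a protocol of this kind the label holder would send encrypted gradient statistics and the feature holder would sum ciphertexts, whereas this sketch sums plaintext values only to show the data flow.

```python
def bucket_sums(feature_values, grads, hess, threshold):
    # Feature-holding party: aggregate gradient statistics for the samples
    # that fall on the left side of a candidate split. It never sees labels.
    g_left = sum(g for x, g in zip(feature_values, grads) if x <= threshold)
    h_left = sum(h for x, h in zip(feature_values, hess) if x <= threshold)
    return g_left, h_left

def split_gain(g_left, h_left, g_total, h_total, lam=1.0):
    # Label-holding party: score a split from the aggregates alone.
    def score(g, h):
        return g * g / (h + lam)
    g_right, h_right = g_total - g_left, h_total - h_left
    return 0.5 * (score(g_left, h_left) + score(g_right, h_right)
                  - score(g_total, h_total))

# Party A holds the labels for four shared users (illustrative data).
labels = [1, 1, 0, 0]
pred = 0.5                                   # initial prediction
grads = [pred - y for y in labels]           # first-order statistics
hess = [pred * (1 - pred) for _ in labels]   # second-order statistics

# Party B holds a feature for the same four users, but no labels.
income = [1.0, 2.0, 8.0, 9.0]

gains = {}
for threshold in (1.0, 2.0, 8.0):
    g_left, h_left = bucket_sums(income, grads, hess, threshold)
    gains[threshold] = split_gain(g_left, h_left, sum(grads), sum(hess))

best_threshold = max(gains, key=gains.get)
print(best_threshold, gains[best_threshold])
```

The point of the design is that only per-bucket aggregates cross the boundary between parties: the feature holder never receives labels and the label holder never receives raw feature values.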

Open Source Information

Both FTL and SecureBoost have been open-sourced as part of FATE (Federated AI Technology Enabler), a project of the WeBank AI team. FATE is an open source library that supports the federated AI ecosystem and its applications, and it can be deployed on a single machine or on a cluster of computers.

Epilogue

The solutions presented in these three papers demonstrate the importance of considering encryption and security when dealing with practical problems. By leveraging federated learning, organizations can break down data silos and unlock the true potential of data collaboration. As Professor Yang emphasized, the goal is to make data more useful, so that the data held by different institutions no longer sit in “data silos.” Federated learning can help significantly here, and the underlying technology is worth continued exploration.