Enabling privacy-aware and carbon-aware machine learning
The ever-increasing complexity of machine learning models is setting new records for energy usage month after month. Training modern image or language models like StyleGAN2-ada, GPT-3, or GLaM has an estimated energy consumption of about 325, 1,287, and 456 MWh, respectively. For reference, an average household in the EU consumes only 3.7 MWh per year.
These numbers are from training on highly energy-efficient GPU clusters in modern data centers with all data at hand. However, in many practical use cases it is simply not possible to collect all data in one central location for privacy reasons. As governments push data protection regulations, there is a growing need for systems that process data directly on end devices without transmitting it.
This is why Google introduced Federated Learning in 2016. In federated learning, a machine learning model is trained directly on end devices without ever transmitting any data - that is, in a secure and privacy-friendly manner. The big problem is: end devices are usually less energy-efficient, and federated training takes more time. The energy usage of AI is therefore expected to increase further once federated learning becomes widely deployed.
To reduce the carbon footprint of future federated learning applications, we developed a plugin, called Lowcarb, for the popular federated learning framework Flower that enables carbon-aware scheduling of training jobs on geographically distributed clients.
Rather than selecting participating clients randomly for each training round, which is the state of the art, Lowcarb picks clients that are currently in a window of relatively low carbon intensity. For this, we rely on marginal carbon intensity forecasts for each client's location provided by the Carbon Aware SDK. The selection is based on two factors:
- How often did the client participate in the past? We want all clients to contribute fairly to the training so as not to introduce bias. Hence, clients with fewer past participations are treated with higher urgency. This guarantees fairness, a highly important metric in federated learning.
- Is the carbon intensity at a client's location expected to drop in the future? Our selection algorithm prefers clients with the least potential to lower their associated carbon intensity later. This way, we postpone training on clients that will have access to greener energy in future rounds.
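The two selection criteria above can be sketched as a simple scoring function. The code below is a minimal illustration of the idea, not the actual Lowcarb implementation; all names, and the naive way the two terms are combined, are our own assumptions.

```python
def select_clients(clients, num_select, forecasts, participation, horizon=3):
    """Illustrative carbon-aware client selection (not the Lowcarb internals).

    clients:       list of client ids
    forecasts:     client id -> list of forecasted marginal carbon
                   intensities (gCO2/kWh), index 0 = current round
    participation: client id -> number of past rounds the client joined
    """
    def score(cid):
        f = forecasts[cid]
        # Potential to improve: how far the client's carbon intensity is
        # expected to drop within the forecast horizon. Clients with a
        # large upcoming drop should be postponed to a greener round.
        potential_drop = f[0] - min(f[:horizon])
        # Fairness: clients that participated less often are more urgent.
        # (A real implementation would weight the two terms carefully;
        # here they are simply added for illustration.)
        return potential_drop + participation[cid]

    # Lowest score first: little room for future improvement and few
    # past participations.
    return sorted(clients, key=score)[:num_select]
```

For example, a client whose local grid is forecast to become much greener over the next rounds receives a high score and is postponed, while a rarely selected client on a flat forecast is picked immediately.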
Our approach comes with batteries included and requires no complicated configuration. This enables carbon-aware federated learning for everyone and allows our approach to scale to many different problems and domains.
Although federated learning is a very young technology, there are already plenty of promising use cases as well as real-life deployments:
- Google is using federated learning to train language models directly on smartphones for their Android keyboard Gboard without transmitting any user data
- Several works propose applying federated learning in smart grids to avoid sending privacy-sensitive data on critical infrastructure over the public network
- Federated learning can be applied to the health sector, for example if multiple hospitals want to train a common model without sharing their patients' data
- Similarly, the financial sector can benefit from collaborating on AI models without leaking customer information
- Even low Earth orbit satellites can benefit from federated learning, as they produce massive amounts of data that can often not be transmitted to ground stations due to sparse connectivity
- One of the most promising use cases is training distributed models on future autonomous vehicle fleets, where thousands or millions of cars are collaborating on a common model
Case Study 1: Federated Learning in Hospitals
We benchmarked Lowcarb's impact by training an image classification model for thorax diseases on chest X-ray images. As described above, federated learning is likely to be deployed in such scenarios in the future due to the strict privacy regulations in health care.
In our example, we assumed 100 hospitals in 14 different locations around the world. Compared to random client selection, Lowcarb reduces the training's associated carbon emissions out-of-the-box by 13% without any sacrifices in training speed, final accuracy, or fairness in client selection - a very important metric in federated learning systems.
Case Study 2: Federated Autonomous Driving Fleet
Tesla’s Autopilot is currently one of the most advanced autonomous driving systems. At the moment, their Visual Transformer Neural Network is trained in a centralized manner on 5,760 high-performance GPUs on 1.5 petabytes of data. In the future, this training might become federated and distributed across the fleet. This would enable privacy-aware training on even more data, making the Autopilot more reliable.
Since there are no energy consumption specs for autonomous vehicle hardware available yet, we rely on sources estimating a compute power of at least 2,500 W. Assuming 5 h of participation from each car in each round, this totals 12.5 kWh, or about 6 kgCO2 (at the global average of 475 gCO2/kWh).
6 kg of CO2 per vehicle sounds harmless at first, but there could be 60-150 million autonomous vehicles worldwide by 2030. If manufacturers involve only 10% of these vehicles per round, each training round already adds up to 90,000 tons of CO2. Assuming continuous training of 100 rounds at the same 13% reduction as in the hospital use case, the savings achieved by Lowcarb are equivalent to the annual carbon emissions of about 200,000 EU households.
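The arithmetic behind these figures can be reproduced in a few lines. All inputs are the assumptions stated above; the fleet size uses the upper end of the 60-150 million range.

```python
# Back-of-envelope estimate of per-round and cumulative emissions for a
# federated autonomous-vehicle fleet, using the figures from the text.
COMPUTE_POWER_KW = 2.5      # estimated on-board compute power (2,500 W)
HOURS_PER_ROUND = 5         # assumed participation time per car per round
CARBON_INTENSITY = 475      # global average, gCO2/kWh
FLEET_SIZE = 150e6          # upper estimate of autonomous vehicles by 2030
PARTICIPATION_RATE = 0.10   # fraction of the fleet involved per round
ROUNDS = 100                # assumed continuous training duration
SAVINGS_RATE = 0.13         # reduction observed in the hospital case study

energy_per_car_kwh = COMPUTE_POWER_KW * HOURS_PER_ROUND        # 12.5 kWh
co2_per_car_kg = energy_per_car_kwh * CARBON_INTENSITY / 1000  # ~6 kg
co2_per_round_t = co2_per_car_kg * FLEET_SIZE * PARTICIPATION_RATE / 1000
co2_total_t = co2_per_round_t * ROUNDS
savings_t = co2_total_t * SAVINGS_RATE                         # ~1.16 Mt
```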
We have a ready-to-use implementation of Lowcarb which is available as a Python package on PyPI:
> pip install flwr-lowcarb
Our GitHub repository includes a tutorial that demonstrates how to implement Lowcarb. The plugin only requires minimal configuration and adaptations to existing applications, such as annotating clients with their respective location.
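As a rough idea of what such an integration looks like, consider the following purely illustrative pseudocode in the style of Flower's Python API. The plugin names and parameters below are hypothetical; the actual flwr-lowcarb interface may differ, so please refer to the tutorial in our repository.

```python
# Illustrative pseudocode only -- the plugin names below are hypothetical.
import flwr as fl
from flwr_lowcarb import LowcarbClientManager  # hypothetical import

# Annotate each client with its grid location so that carbon intensity
# forecasts can be fetched for it.
client_manager = LowcarbClientManager(
    locations={"client-0": "de", "client-1": "fr"},  # example region codes
)

# Hand the carbon-aware client manager to a regular Flower server.
fl.server.start_server(
    server_address="0.0.0.0:8080",
    config=fl.server.ServerConfig(num_rounds=100),
    client_manager=client_manager,
)
```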
Lowcarb is easy to use, introduces no disadvantages compared to randomized client selection, and does not put any responsibility on developers to take care of carbon-awareness themselves. Therefore, we think it will be well received by the community, provided it is widely advertised.
Going forward, we see three main ways to improve Lowcarb and enable widespread adoption:
- Advertisement: The most important factor determining the success of our solution is how many developers will actually integrate it into their applications. Federated learning is a very young technology that is just now moving from academia to actual use in production systems. We have the opportunity to shape this rapidly growing field by incorporating carbon awareness as a fundamental design paradigm. For this we need to heavily advertise our approach using different channels like the Green Software Foundation, the yearly Flower Summit, as well as academic and industry conferences.
- Improvements: Our current implementation requires some minimal configuration by users, such as providing the duration of training rounds. In the future, we would like to automatically determine the optimal parameterization of the plugin for all types of setups to increase usability and remove all responsibilities from developers. Furthermore, we believe there is still lots of potential in improving our algorithmic approach to yield even larger carbon savings at reduced training times, for example by combining it with novel, performance-based strategies like Oort.
- Extensions: Many devices suitable for federated learning have access to batteries or even renewable power generation, for example cars charging at a house with a solar power system. In addition to optimizing for low carbon intensity, our plugin could also take such factors into account to get one step closer to zero-carbon AI.
Given the increasing popularity and adoption of federated learning, we see Lowcarb as a major unexplored opportunity for leveraging the Carbon Aware SDK and hope it will raise awareness for carbon-aware design principles in the machine learning community.