Creating Energy-Efficient Deep ML Models

Undoubtedly A.I. has helped us make our world a better place, but all of this comes at a cost. A critical problem is its massive energy consumption and huge carbon footprint.

Artificial Intelligence has entered our lives like a storm, from solving crucial problems such as detecting cancer early to less important ones such as solving a Rubik’s cube. Undoubtedly, A.I. has helped us make our world a better place, but all of this comes at a cost.

One of the critical problems of this technology is its massive energy consumption and huge carbon footprint. Researchers at the University of Massachusetts Amherst reviewed a prominent architecture used for Natural Language Processing (NLP) applications. They found that training that model just once required more than 650,000 kWh of energy over eighty-four hours. This energy consumption has roughly the same carbon footprint as fifty-seven humans would have in one year of their lives.

Nobody can deny that A.I. is the present. We can see A.I. in our everyday business and personal lives. Therefore, we need to find solutions that can help us reduce the carbon footprint of A.I. and, in turn, fight climate change. 

The Causes

To solve the problem of Artificial Intelligence models consuming so much energy, we first have to figure out why they devour this much energy in the first place. 

  • Inefficient Training:

Model architectures such as Convolutional Neural Networks (CNNs), LSTMs, RNNs, and even simple ANNs all have many layers, and they have to be trained thoroughly before we can use them. This training is not a one-time event; as more data is collected and the models change, they have to be retrained periodically. Retraining is a computationally expensive process. So the question arises: what are the energy-efficient ways to train a model?

  • Finding the suitable model:

One more reason for this excessive training is that researchers are trying to find the ideal network topology by experimenting with parameters such as:

  • The number of neurons.
  • The number of connections between the neurons.
  • The learning rate (how much the parameters change at each update).

This testing is largely hit and miss. The more combinations are tested, the more likely the network is to attain a high level of accuracy. Human brains, on the other hand, do not need to search for an ideal structure since they come with one that has been fine-tuned by evolution.

The pressure is now on making this search more efficient. Just a 1% increase in accuracy on tough jobs like machine translation is substantial, resulting in positive P.R. and better products. However, to get that 1% increase, a researcher may have to train the model thousands of times, each time with a different structure, until the optimal one is discovered.
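To get a feel for how quickly such a search multiplies the training cost, here is a minimal sketch that counts the training runs of an exhaustive grid search. The search space and the energy per run are made-up values for illustration only, not figures from any of the cited studies.

```python
from itertools import product

# Hypothetical search space; none of these values come from the article.
neuron_counts = [64, 128, 256, 512]
connection_densities = [0.25, 0.5, 0.75, 1.0]
learning_rates = [1e-4, 3e-4, 1e-3, 3e-3, 1e-2]

# Assumed energy for one full training run, in kWh (illustrative only).
ENERGY_PER_RUN_KWH = 2.0


def grid_search_cost(*axes, energy_per_run=ENERGY_PER_RUN_KWH):
    """Return (number of training runs, total energy) for an exhaustive grid search."""
    runs = sum(1 for _ in product(*axes))  # one full training run per combination
    return runs, runs * energy_per_run


runs, energy = grid_search_cost(neuron_counts, connection_densities, learning_rates)
print(f"{runs} training runs, about {energy:.0f} kWh")
```

Every extra option on any axis multiplies the total, which is why even a modest search can end up costing far more than a single training run.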

The Solutions

Here are a few methods to save energy during training.

  • Know your target accuracy:

Dr. Vibhu Sharma and Vikrant Kaulgud [2] conducted a series of experiments at Accenture Labs on publicly available datasets and arrived at the following conclusions:

  • All ML models make a series of passes over a dataset, known as epochs. The accuracy of every model plateaued after a certain number of epochs, while the energy consumption kept growing exponentially.
  • For example, training one model to an accuracy of 96% consumed 964 joules.
  • To increase the accuracy by about 2.5 percentage points, to roughly 99%, the model required an additional 15,000 joules.
  • It was also found that more energy was required to train models on bigger datasets, and in some cases using a bigger dataset did not increase the model’s efficiency.
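A quick back-of-the-envelope calculation makes the diminishing returns concrete. It uses only the figures quoted above from [2]; the "joules per accuracy point" metric is my own simplification, not something the authors report.

```python
# Figures quoted above from the Accenture Labs experiments [2].
energy_to_96 = 964        # joules spent to reach 96% accuracy
extra_energy = 15_000     # additional joules for the final accuracy gain
accuracy_gain = 2.5       # percentage points gained, as quoted

total_energy = energy_to_96 + extra_energy

# Simplified metric (an assumption for illustration): energy per accuracy point.
joules_per_point_early = energy_to_96 / 96
joules_per_point_late = extra_energy / accuracy_gain
ratio = joules_per_point_late / joules_per_point_early

print(f"Total energy: {total_energy} J")
print(f"First 96 points: {joules_per_point_early:.1f} J per point")
print(f"Last {accuracy_gain} points: {joules_per_point_late:.0f} J per point")
print(f"The last points cost about {ratio:.0f} times more per point")
```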

You can save energy by looking at your use case before training the model. If you have to train a model that detects cancer cells, you should aim for high accuracy, because skimping on accuracy might have catastrophic effects. But if your application is not so critical, settling for a lower accuracy can help you save energy.
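In practice this can be as simple as stopping the training loop once the target accuracy is reached. The sketch below shows the idea with a toy learning curve; `simulated_accuracy`, the energy per epoch, and the targets are all hypothetical stand-ins, and for simplicity the energy per epoch is assumed constant.

```python
def simulated_accuracy(epoch):
    """Toy learning curve that rises quickly and then plateaus (not real data)."""
    return 0.99 * (1 - 0.6 ** epoch)


def train_until(target_accuracy, max_epochs=50, energy_per_epoch_j=100.0):
    """Stop as soon as the target is met; return (epochs, accuracy, energy in joules)."""
    accuracy = 0.0
    for epoch in range(1, max_epochs + 1):
        accuracy = simulated_accuracy(epoch)  # stand-in for one real training epoch
        if accuracy >= target_accuracy:
            return epoch, accuracy, epoch * energy_per_epoch_j
    # Target never reached: we paid for every epoch.
    return max_epochs, accuracy, max_epochs * energy_per_epoch_j


for target in (0.90, 0.96, 0.985):
    epochs, acc, energy = train_until(target)
    print(f"target={target:.3f}: {epochs} epochs, accuracy={acc:.4f}, energy={energy:.0f} J")
```

In a real project the same check would sit inside your training loop, usually measured on a validation set rather than on the training data.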

  • Opt for Transfer Learning:

Again, depending on your use case, you can use transfer learning to solve your problem. Transfer learning is a method in which you start from a pre-trained model. This saves both the time you would have spent training the model from scratch and the energy, since most of the training has already been done.
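The most common form of this is to keep the pre-trained part frozen and train only a small new "head" on top of it. The following is a toy sketch of that idea in plain Python; `pretrained_features` is a hypothetical stand-in for a real pre-trained network, and the target task is invented for the example.

```python
import random


def pretrained_features(x):
    """Stand-in for a frozen, pre-trained network: its 'weights' are never updated."""
    return [x, x * x]


def train_head(data, epochs=200, lr=0.05):
    """Fit only a small linear head on top of the frozen features with plain SGD."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            feats = pretrained_features(x)
            pred = sum(w * f for w, f in zip(weights, feats)) + bias
            err = pred - y
            # Only the head's parameters are updated; the feature extractor is untouched.
            weights = [w - lr * err * f for w, f in zip(weights, feats)]
            bias -= lr * err
    return weights, bias


# Toy target task (invented): y = 3x^2 - x + 0.5 on 20 random points in [-1, 1].
random.seed(0)
data = [(x, 3 * x * x - x + 0.5) for x in (random.uniform(-1, 1) for _ in range(20))]
weights, bias = train_head(data)


def predict(x):
    return sum(w * f for w, f in zip(weights, pretrained_features(x))) + bias


print("trainable parameters:", len(weights) + 1)
print("prediction at x=0.5:", round(predict(0.5), 3))
```

Only three numbers are trained here. With a real pre-trained network the saving comes from the same place: the expensive layers are reused as they are, and only a small part is trained.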

  • Once-For-All Network:

Researchers at MIT have been developing a “Once-for-All” (OFA) network. [3]

Their experiments produced the following results:

  • When used to train a computer vision model covering over ten quintillion architectural settings, the OFA approach proved far more efficient than spending hours training each sub-network separately.
  • This cumulative training through the OFA approach did not affect the accuracy or efficiency of the model. The model achieved state-of-the-art accuracy on Android and iOS devices. When tested against a common benchmark (ImageNet), the model proved to be 1.5 to 2.6 times faster at inference than leading classification systems.
  • The researchers also found that training the CV model produced roughly 1/1,300 of the carbon emissions of today’s popular model search techniques. According to John Cohn, an IBM fellow and a member of the MIT-IBM Watson AI Lab, “If rapid progress in A.I. is to continue, we need to reduce its environmental impact. The upside of developing methods to make A.I. models smaller and more efficient is that the models may also perform better.”

This kind of research opens doors to a world where A.I. can continue to positively impact our planet while having a minimal carbon footprint. You can follow this kind of research and reuse the published trained models for your own use cases, which is another application of transfer learning.
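A number like "ten quintillion" sounds implausible until you count the combinations. The sketch below does that for an assumed search space; the specific depth, kernel, and width choices are my own illustrative assumptions, not values taken from [3].

```python
# Illustrative search space; these choice sets are assumptions, not from the article.
DEPTH_CHOICES = (2, 3, 4)    # layers per unit
KERNEL_CHOICES = (3, 5, 7)   # kernel size per layer
WIDTH_CHOICES = (3, 4, 6)    # channel expansion ratio per layer
NUM_UNITS = 5                # number of units stacked in the network


def count_subnetworks(num_units=NUM_UNITS):
    """Count the distinct sub-networks that one shared network could yield."""
    per_layer = len(KERNEL_CHOICES) * len(WIDTH_CHOICES)
    # A unit can have any allowed depth; each of its layers picks kernel and width freely.
    per_unit = sum(per_layer ** depth for depth in DEPTH_CHOICES)
    return per_unit ** num_units


total = count_subnetworks()
print(f"{total:.2e} candidate sub-networks")
```

Training each of these separately is out of the question, which is why sharing one trained network across all of them saves so much energy.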

  • Use energy-efficient models (SNNs):

The energy cost of retraining a model can also be reduced by using more energy-efficient models. One example is the Spiking Neural Network (SNN). It imitates how the human brain works: we are always learning new things, so our model (the brain) is constantly being updated, yet we don’t require staggering amounts of energy to survive.

To understand the working of an SNN, we first have to understand the working of the human brain.

According to Olaf de Leeuw:

“Human brain is made up of neurons. These neurons communicate with each other via sequences of pulses. An action potential travels along the axon in a neuron and activates synapses. The synapses release neurotransmitters that arrive at the postsynaptic neuron. Here the action potential rises with each incoming pulse of neurotransmitters. If the action potential reaches a certain threshold, the postsynaptic neuron fires a spike itself.”

According to André Grüning and Sander M. Bohte [1], SNNs work as follows:

Spiking neural networks (SNNs) are Artificial Neural Networks (ANNs) that try to mimic natural neural behavior more closely. In addition to neuronal and synaptic state, SNNs incorporate time into their working model. The idea is that neurons in an SNN do not transmit information at the end of each propagation cycle (as they do in traditional multi-layer perceptron networks), but only when the membrane potential (a neuron’s intrinsic quality related to its membrane electrical charge) reaches a specific value, known as the threshold. [1]
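The threshold behavior described above can be sketched with a leaky integrate-and-fire neuron, one of the simplest spiking neuron models. This is a minimal illustration rather than the specific model discussed in [1]; the threshold, leak factor, and input values are arbitrary assumptions.

```python
def simulate_lif(input_current, threshold=1.0, leak=0.9, reset=0.0):
    """Leaky integrate-and-fire neuron: return the time steps at which it spikes."""
    potential = reset
    spikes = []
    for t, current in enumerate(input_current):
        potential = leak * potential + current  # leak a little, then integrate the input
        if potential >= threshold:              # threshold reached: emit a spike
            spikes.append(t)
            potential = reset                   # and reset the membrane potential
    return spikes


weak_input = [0.05] * 20    # settles around 0.5, so it never crosses the threshold
strong_input = [0.3] * 20   # accumulates past the threshold every few steps

print("weak input spikes at:", simulate_lif(weak_input))
print("strong input spikes at:", simulate_lif(strong_input))
```

The neuron stays silent for weak input and only communicates when enough input has accumulated. That sparse, event-driven activity is where the potential energy saving comes from.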

There is no denying that Artificial Intelligence is taking a toll on our planet. But there are solutions and methodologies available to mitigate this quite substantially. Research into more energy-efficient models like SNNs is ongoing. Also, it was predicted that the energy use of data centers would explode in recent years, but that has not happened, thanks to better hardware and cooling solutions.

There’s also a trade-off between the cost of training the models and the cost of utilizing them, so putting in more effort during training time to come up with a smaller model might potentially save you money and emissions in the long run. Because a model will be utilized several times throughout its life, this can save a significant amount of energy.
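A small break-even calculation shows how this trade-off plays out. All numbers below are hypothetical and only illustrate the arithmetic.

```python
def lifetime_energy(training_kwh, energy_per_inference_kwh, num_inferences):
    """Total energy over a model's life: one training cost plus every inference."""
    return training_kwh + energy_per_inference_kwh * num_inferences


def break_even_inferences(extra_training_kwh, saving_per_inference_kwh):
    """Inferences needed before the smaller model has repaid its extra training energy."""
    return extra_training_kwh / saving_per_inference_kwh


# Hypothetical numbers for illustration only.
large = dict(training_kwh=100.0, energy_per_inference_kwh=0.002)
small = dict(training_kwh=150.0, energy_per_inference_kwh=0.0005)

n = break_even_inferences(
    small["training_kwh"] - large["training_kwh"],
    large["energy_per_inference_kwh"] - small["energy_per_inference_kwh"],
)
print(f"break-even after about {n:,.0f} inferences")
print("large model, 100,000 inferences:", lifetime_energy(**large, num_inferences=100_000), "kWh")
print("small model, 100,000 inferences:", lifetime_energy(**small, num_inferences=100_000), "kWh")
```

Past the break-even point, every further inference makes the smaller model the cheaper choice overall.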

The A.I. community should put more effort into developing energy-efficient training methods in the future. Otherwise, A.I. risks becoming dominated by a small group of people who can afford to dictate what sorts of models are produced, what kind of data is used to train them, and what the models are utilized for.


[1] Spiking Neural Networks: Principles and Challenges, André Grüning (University of Surrey, United Kingdom) and Sander M. Bohte (CWI, Amsterdam, The Netherlands), ESANN 2014 proceedings, European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning. Bruges (Belgium), 23–25 April 2014, i6doc.com publ., ISBN 978–287419095–7.

[2] https://www.accenture.com/us-en/blogs/technology-innovation/sharma-kaulgud-developing-energy-efficient-machine-learning

[3] https://news.mit.edu/2020/artificial-intelligence-ai-carbon-footprint-0423

[4] https://theconversation.com/it-takes-a-lot-of-energy-for-machines-to-learn-heres-why-ai-is-so-power-hungry-151825

Lewis Lovejoy

10 March 2022

