Lean, Mean, Green Machines: Optimizing AI for Energy Efficiency

As AI technology advances, the industry is actively working to meet the increasing demand for electricity and water to support the servers powering this innovation.

An NVIDIA DGX computer, the gold standard for AI tasks, consumes more than 10 kW of power. Big Tech is expected to purchase millions of these systems this year, and together they would draw more power than New York City. With this comes the responsibility to find sustainable ways to manage energy consumption. Researchers and engineers are already working on creative solutions to mitigate the environmental impact.

Source: Midjourney

Cooling Challenges: Managing Heat in AI Servers

But it’s not just the electricity required to operate these computers. They also generate a significant amount of heat and therefore require cooling. Getting rid of that heat consumes twice as much power as the computer itself, so that 10 kW machine is effectively using 30 kW while in operation.
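
The arithmetic behind that figure can be sketched in a few lines of Python. The function name and the cooling_ratio parameter are illustrative assumptions; the 10 kW draw and the two-to-one cooling ratio are the figures quoted above.

```python
def total_power_kw(compute_kw: float, cooling_ratio: float) -> float:
    """Power drawn by the machine plus the power spent removing its heat.

    cooling_ratio is cooling power divided by compute power (assumed constant).
    """
    return compute_kw * (1.0 + cooling_ratio)


# Figures from the article: a 10 kW machine whose cooling costs twice its own draw.
print(total_power_kw(10.0, 2.0))  # 30.0
```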

These new servers will consume three times more electricity than the entire state of California used in 2022! To address this issue, server farms are exploring alternative cooling methods, such as water cooling, to minimize electricity consumption. While this approach shifts the burden from one resource to another, it also opens up opportunities to develop more efficient and environmentally friendly cooling technologies.

But water cooling spends our precious fresh water in order to cut electricity costs.

Strategies for Reducing AI’s Energy Footprint

AI’s appetite for power is a growing concern, and it will get worse. Is there a way we can address this issue? Luckily, researchers have already begun exploring more efficient approaches to creating and using AI. Model reuse, ReLoRA, MoE (mixture of experts), and quantization are all promising techniques that could help.

1. Model Reuse: Maximizing Efficiency

With model reuse, existing models are retrained for new tasks rather than trained from scratch, which saves time, energy, and resources while also improving performance. Meta and Mistral have both been releasing reusable models and are leaders in this area.

2. ReLoRA and LoRA: Reducing Computational Demand

ReLoRA and LoRA reduce the number of calculations required when retraining a model for a new purpose. This not only saves energy but also allows the use of smaller, less power-hungry computers. Consequently, instead of depending on high-energy systems such as NVIDIA’s DGX, a single graphics card is often sufficient for the retraining process.
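
To give a rough sense of where the savings come from, the sketch below counts trainable parameters for a single weight matrix. It assumes the usual LoRA formulation, in which the original matrix stays frozen and only two small low-rank factors are trained. The layer size (4096 by 4096) and the rank (8) are illustrative values, not figures from any particular model.

```python
def full_finetune_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole d_out x d_in matrix is updated."""
    return d_out * d_in


def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters when only the low-rank factors are updated.

    The update is B @ A, with B of shape (d_out, rank) and A of shape (rank, d_in).
    """
    return rank * (d_out + d_in)


# Hypothetical layer size and rank, chosen only for illustration.
d_out, d_in, rank = 4096, 4096, 8
full = full_finetune_params(d_out, d_in)
lora = lora_trainable_params(d_out, d_in, rank)
print(full, lora)            # 16777216 65536
print(f"{lora / full:.2%}")  # 0.39%
```

Fewer trainable parameters means fewer gradients and less optimizer state to keep in memory, which is what lets the retraining fit on more modest hardware.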

3. MoE Models: Smarter Energy Use

Mistral’s recently released MoE (mixture-of-experts) models use only a fraction of their parameters for any given input. They activate only the expert blocks they need, similar to turning off lights in unused rooms, which means fewer calculations and less energy consumption. As a result, there is a remarkable 65% reduction in energy usage.
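
The "lights off in unused rooms" idea can be illustrated with a toy top-k router. This is a minimal sketch, not Mistral's implementation: the eight stand-in experts, the router scores, and the choice of k=2 are all assumptions made for the example.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k=2):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:k]
    weights = softmax([router_logits[i] for i in chosen])
    return list(zip(chosen, weights))


def moe_layer(x, experts, router_logits, k=2):
    """Weighted sum of the selected experts' outputs; the others never run."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_logits, k))


# Eight toy experts: each one just scales its input by a different factor.
experts = [lambda x, s=s: x * s for s in range(1, 9)]
# Hypothetical router scores for one input.
logits = [0.1, 2.0, -1.0, 0.3, 1.5, 0.0, -0.5, 0.2]

print([i for i, _ in route_top_k(logits, k=2)])  # [1, 4]
print(moe_layer(1.0, experts, logits, k=2))
```

Only two of the eight experts are evaluated for this input, so the compute per input scales with k rather than with the total number of experts.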

4. Quantization: Compact and Efficient AI Models

Quantization is a method that decreases the size of AI models without significantly affecting their performance. Quantizing a model reduces the number of bits required to represent each parameter, which shrinks the model and enables the use of less powerful, more energy-efficient hardware.
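
As a concrete illustration, here is a minimal sketch of one common scheme, symmetric 8-bit quantization with a single scale per tensor. The weight values are made up, and production libraries typically use more refined variants (per-channel or per-group scales, 4-bit formats), so treat this as an outline of the idea only.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the stored integers."""
    return [q * scale for q in quantized]


# Hypothetical weights, chosen only for illustration.
weights = [0.82, -1.27, 0.05, 0.0, 0.64]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(q)
print(f"largest round-trip error: {max_error:.6f}")
```

Each weight now needs one byte instead of two or four, and the round-trip error stays within about half a quantization step.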

For example, a huge 40-billion-parameter model would typically need a power-hungry GPU system like the DGX to operate effectively. However, through quantization, this same model can be optimized to run on a low-power consumer GPU, such as those commonly found in laptops.
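
A back-of-the-envelope calculation shows why the bit width matters so much. The sketch below counts only the memory needed to store the weights of a 40-billion-parameter model. It ignores activations, caches, and runtime overhead, and the list of bit widths is an assumption for illustration.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Memory needed to store the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# 40 billion parameters, as in the example above.
for bits in (16, 8, 4):
    print(f"{bits}-bit: {weight_memory_gb(40e9, bits):.0f} GB")
# 16-bit: 80 GB
# 8-bit: 40 GB
# 4-bit: 20 GB
```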

While quantization may lead to a slight decrease in model accuracy in certain scenarios, for many practical purposes this compromise is minimal or hardly noticeable. Performance remains excellent while demanding only a fraction of the computing resources, which lets AI run on everyday consumer devices. By reducing hardware requirements and energy consumption, quantization is a crucial step towards scalable and sustainable AI.

Practical Implementation: Success Stories

As an example of what is possible, we repurposed a 47-billion-parameter MoE model using these four methods, retraining it for a client on a server that uses less than 1 kW of power and finishing the process in only 10 hours.

Additionally, the client can now run the model on regular Apple Mac computers equipped with energy-efficient M2 chips. At smartR AI, when developing and training new models, such as our generative AI loyal companion SCOTi® AI, we have been privileged to use the supercomputer at EPCC, Edinburgh University, which substantially reduces the time required for training: we trained a model from scratch in nearly one hour.

The Path to a Sustainable AI Future

As AI becomes more prevalent, we all need to start thinking more proactively about energy and water usage. Research into more efficient training and utilization methods is yielding promising results. But we also need to start using these methods actively; by integrating these new techniques into our tool flows, we not only benefit our clients but also contribute to a more sustainable future for our planet.

About the Author

Oliver King-Smith

At smartR AI, Oliver King-Smith spearheads innovative patent applications harnessing AI for societal impact, including advancements in health tracking, support for vulnerable populations, and resource optimization. Oliver is an innovator with expertise in Data Visualization, Statistics, Machine Vision, Robotics, and AI.

Also check out "Decoding AI adoption: 6 key rules to separate authentic innovation from the hype", written by Oliver, to learn more about navigating the AI landscape!

