
Jim__

(14,561 posts) Sun Oct 13, 2024, 03:55 PM

Integer addition algorithm could reduce energy needs of AI by 95%

16-bit, 8-bit floating point numbers defined in IEEE 754 and on various hardware for tensor computations, and the 16-bit integer. MSB stands for most significant bit and LSB stands for least significant bit. Credit: arXiv (2024). DOI: 10.48550/arxiv.2410.00907
___________________________________________________________________________

A team of engineers at AI inference technology company BitEnergy AI reports a method to reduce the energy needs of AI applications by 95%. The group has published a paper describing their new technique on the arXiv preprint server.

...

The new technique is basic—instead of using complex floating-point multiplication (FPM), the method uses integer addition. Apps use FPM to handle extremely large or small numbers, allowing applications to carry out calculations using them with extreme precision. It is also the most energy-intensive part of AI number crunching.

The researchers call their new method Linear-Complexity Multiplication—it works by approximating FPMs using integer addition. They claim that testing, thus far, has shown that the new approach reduces electricity demand by 95%.

The one drawback it has is that it requires different hardware than that currently in use. But the research team also notes that the new type of hardware has already been designed, built and tested.

a little bit more ...

7 replies



littlemissmartypants

(26,368 posts)

1. Thanks.

Reply to Jim__ (Original post)

Sun Oct 13, 2024, 04:30 PM


Original study:
Addition is All You Need for Energy-efficient Language Models
Hongyin Luo, Wei Sun
https://arxiv.org/abs/2410.00907
❤️

eppur_se_muova

(37,984 posts)

2. Screw AI -- how does this affect computational number theory and encryption?

Reply to Jim__ (Original post)

Sun Oct 13, 2024, 04:47 PM


Suppose large factorizations are suddenly 20x easier -- how does that affect Web security?

Jim__

(14,561 posts)

3. I'm not sure why it would affect computational number theory and encryption.

Reply to eppur_se_muova (Reply #2)

Sun Oct 13, 2024, 05:44 PM


My reading is not that they are changing any underlying algorithms, they are changing the implementation of the algorithms. Apparently AI algorithms perform an extremely high number of floating point multiplications. They are converting those floating point multiplications to integer addition, and that uses significantly less electricity.

The computations are less energy intensive. I don't believe they are any faster or any more powerful than the computations they are replacing.

Again, that's based on my reading; you may be seeing something different.
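To make the "multiplication becomes integer addition" idea concrete, here is a minimal sketch of a well-known bit trick: adding the raw IEEE 754 bit patterns of two positive fp32 values (and subtracting the bit pattern of 1.0) gives a rough approximation of their product. This is an illustration only, not the paper's method; the function names are made up for this example, and it assumes positive, normal inputs whose product stays in fp32 range.

```python
import struct

FP32_ONE_BITS = 0x3F800000  # bit pattern of 1.0 in IEEE 754 single precision


def float_to_bits(x: float) -> int:
    """Reinterpret an fp32 value as an unsigned 32-bit integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_float(b: int) -> float:
    """Reinterpret an unsigned 32-bit integer as an fp32 value."""
    return struct.unpack("<f", struct.pack("<I", b))[0]


def approx_mul(x: float, y: float) -> float:
    """Approximate x * y using only integer add/subtract on the bit patterns.

    Assumption: x and y are positive normal fp32 values and the product
    does not overflow. Exponent fields add exactly; mantissa fields add
    instead of multiplying, which is where the approximation error comes from.
    """
    return bits_to_float(float_to_bits(x) + float_to_bits(y) - FP32_ONE_BITS)


if __name__ == "__main__":
    for a, b in [(2.0, 4.0), (1.5, 1.5), (3.0, 5.0)]:
        print(a, b, "exact:", a * b, "approx:", approx_mul(a, b))
```

Powers of two come out exact because their mantissa bits are all zero; other inputs are off by up to roughly a tenth of the true product, which is the kind of error the paper's correction term is meant to shrink.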

eppur_se_muova

(37,984 posts)

4. Computational number theory does a great many giant-integer multiplies. The fastest way to do these ...

Reply to Jim__ (Reply #3)

Sun Oct 13, 2024, 08:48 PM


... (AFAIK) is something called the irrational base discrete weighted {fast Fourier} transform, which replaces the integer multiply with multiple floating-point calculations. Weirdly, this sounds like the reverse of the technique in the OP. Either one of these represents a really fundamental change in the algorithm -- it's not anything like fine-tuning. This link gives a nice short intro to what's involved, as well as a further development which does not seem to be practical just yet: https://theconversation.com/weve-found-a-quicker-way-to-multiply-really-big-numbers-114923

Detailed technical discussion is found in the Wikipedia articles https://en.wikipedia.org/wiki/Fast_Fourier_transform and https://en.wikipedia.org/wiki/Multiplication_algorithm. Also briefly discussed in Knuth (vol. 2) and in more detail in Crandall and Pomerance.
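As a rough illustration of replacing one big integer multiply with many floating-point operations, the sketch below multiplies two non-negative integers by running a floating-point FFT over their decimal digits. It is a plain FFT convolution, not the irrational base discrete weighted transform itself, and the function names are invented for this example. Rounding errors limit it to modest operand sizes.

```python
import cmath


def fft(values, invert=False):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], invert)
    odd = fft(values[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + twiddle * odd[k]
        out[k + n // 2] = even[k] - twiddle * odd[k]
    return out


def fft_multiply(x: int, y: int) -> int:
    """Multiply non-negative integers via floating-point FFT of their digits."""
    xd = [int(ch) for ch in reversed(str(x))]  # least significant digit first
    yd = [int(ch) for ch in reversed(str(y))]
    size = 1
    while size < len(xd) + len(yd):
        size *= 2
    fa = fft(xd + [0] * (size - len(xd)))
    fb = fft(yd + [0] * (size - len(yd)))
    # Pointwise product in the frequency domain equals digit convolution.
    conv = fft([p * q for p, q in zip(fa, fb)], invert=True)
    coeffs = [round(v.real / size) for v in conv]
    # Summing coeff * 10**i lets Python's big integers absorb the carries.
    return sum(c * 10**i for i, c in enumerate(coeffs))


if __name__ == "__main__":
    print(fft_multiply(123456789, 987654321))
```

The rounding step is what makes the result exact: the floating-point values only need to land within 0.5 of the true integer coefficients.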

Public-key cryptographic security is often based on the assumption that factoring very large integers is so resource-intensive that it will almost never be worth the effort involved to crack a protected file. As the algorithms to find factors of large integers become faster and more powerful, larger and larger keys are required.

Many cryptographic protocols are based on the difficulty of factoring large composite integers or a related problem—for example, the RSA problem. An algorithm that efficiently factors an arbitrary integer would render RSA-based public-key cryptography insecure.

https://en.wikipedia.org/wiki/Integer_factorization
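A toy example of why factoring matters for RSA: once the modulus n is factored, the private exponent follows directly. The numbers below are a tiny textbook-sized key chosen for illustration, the helper names are made up, and trial division stands in for a real factoring algorithm. It assumes Python 3.8 or later for the modular inverse via pow.

```python
def trial_factor(n: int):
    """Return (p, q) with p <= q for a composite n, using trial division."""
    p = 2
    while p * p <= n:
        if n % p == 0:
            return p, n // p
        p += 1
    raise ValueError("n has no nontrivial factor")


def recover_private_exponent(n: int, e: int) -> int:
    """Recover the RSA private exponent d from the public key (n, e).

    Assumption: n is a product of two distinct primes small enough to
    factor by trial division, which is never true for real keys.
    """
    p, q = trial_factor(n)
    phi = (p - 1) * (q - 1)
    return pow(e, -1, phi)  # modular inverse of e modulo phi


if __name__ == "__main__":
    n, e = 3233, 17                        # toy public key (n = 53 * 61)
    d = recover_private_exponent(n, e)
    ciphertext = pow(65, e, n)             # "encrypt" the message 65
    print(d, pow(ciphertext, d, n))        # decrypting recovers 65
```

Every step here uses exact integer arithmetic; an approximate product would give the wrong d.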

Jim__

(14,561 posts)

5. It sounds like they may be obverse problems.

Reply to eppur_se_muova (Reply #4)

Mon Oct 14, 2024, 06:59 AM


This post is based on quick reads of the AI paper and the papers you referenced. So, it's sort of food for thought.

In the AI problem, the number of multiplications is O(n²) where n is roughly based on the number of elements in a tensor. They are working with neural networks here, so my current understanding (I've read through the paper, but I need to read it more thoroughly to get a better understanding) is that this is based on the number of nodes (synapses) in the neural net. From the paper (my best attempt at copying this):

Multiplication operations are generally more complicated than additions, and FP operation are more costly than integers (Horowitz, 2014). Table 1 shows that multiplying two fp32 numbers consumes 37 times higher energy than adding two 32-bit integers. While the complexity of integer addition is O(n) where n is the number of bits used for representing the number, FP multiplication requires O(e) exponent addition, O(m²) mantissa multiplication, and rounding. Here e and m stand for the number of bits used for exponent and mantissa parts of the FP numbers.

Modern LLM training and inference involves a large number of FP calculations in tensor computation. Consider calculating the element-wise and dot products of two 2-D tensors:

Y₁ = A ◦ X, Y₂ = A · X^T; A, X ∈ R^(N,k)

Calculating Y₁ involves N² FP multiplications (Mul). If A and X are both fp32 tensors, A ◦ X consumes 37 times higher energy than adding two int32 matrices of the save [sic; presumably "same"] size. Similarly, calculating Y₂ involves (m × n × k) FP Mul and the same number of FP additions (Add). When A and X are fp32 tensors, each Mul-Add operation for two numbers consumes 0.9 + 3.7 = 4.6 (pJ) energy. If we replace the fp32 Mul with int32 Add, the energy cost becomes 0.1 + 0.9 = 1.0 (pJ), only 21.7% of the original cost. Similarly, if the inference is conducted in fp16, replacing fp16 Mul with int16 Add results in a 1 - (0.05 + 0.4) / (1.1 + 0.4) = 70% energy saving.
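The energy arithmetic in that passage can be rechecked in a few lines. The per-operation figures (in picojoules) are the ones quoted above; the dictionary keys and function name are just labels for this sketch.

```python
# Per-operation energy in picojoules, as quoted in the passage above.
ENERGY_PJ = {
    "fp32_mul": 3.7,
    "fp32_add": 0.9,
    "int32_add": 0.1,
    "fp16_mul": 1.1,
    "fp16_add": 0.4,
    "int16_add": 0.05,
}


def mul_add_cost_ratio(fp_mul: str, fp_add: str, int_add: str) -> float:
    """Energy of (integer add + FP add) relative to (FP mul + FP add)."""
    original = ENERGY_PJ[fp_mul] + ENERGY_PJ[fp_add]
    replaced = ENERGY_PJ[int_add] + ENERGY_PJ[fp_add]
    return replaced / original


if __name__ == "__main__":
    fp32_ratio = mul_add_cost_ratio("fp32_mul", "fp32_add", "int32_add")
    fp16_ratio = mul_add_cost_ratio("fp16_mul", "fp16_add", "int16_add")
    print(f"fp32: {fp32_ratio:.1%} of the original cost")
    print(f"fp16: {1 - fp16_ratio:.0%} energy saving")
```

Note that these ratios cover only the multiply-add step of a dot product, so they are per-operation figures rather than whole-system savings.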

...

We propose L-Mul, a FP multiplication algorithm with O(n) complexity, where n is the bit size of its FP operands. Consider two FP numbers x, y, whose exponents and fractions are x_e, y_e and x_m, y_m respectively, the vanilla FP Mul result is

Mul(x, y) = (1 + x_m) · 2^(x_e) · (1 + y_m) · 2^(y_e)
          = (1 + x_m + y_m + x_m · y_m) · 2^(x_e + y_e)

plus an xor operation ( ⊕ ) to decide the sign of the result. Assume x_m and y_m are mantissas of m bits. The O(m²) mantissa multiplication operation is the complexity bottleneck of this calculation. We remove this operation and introduce a new multiplication algorithm that processes mantissas with a computational complexity of O(m):

L-Mul(x, y) = (1 + x_m + y_m + 2^(-l(m))) · 2^(x_e + y_e),

l(m) = m if m ≤ 3,
       3 if m = 4,
       4 if m > 4.
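A minimal numeric sketch of the L-Mul formula as quoted, working on ordinary Python floats rather than on real low-precision hardware formats. The helper names, the use of math.frexp to split a number into exponent and fraction, and the choice to truncate the fraction to m bits are assumptions made for this illustration; the paper's actual kernel operates on the bit representation.

```python
import math


def l_of_m(m: int) -> int:
    """Offset exponent l(m) from the quoted definition."""
    if m <= 3:
        return m
    if m == 4:
        return 3
    return 4


def decompose(x: float, m: int):
    """Split nonzero x into (sign, exponent, fraction).

    |x| is approximately (1 + fraction) * 2**exponent, with the fraction
    truncated to m bits (truncation is an assumption of this sketch).
    """
    mant, exp = math.frexp(abs(x))          # abs(x) = mant * 2**exp, 0.5 <= mant < 1
    fraction = 2.0 * mant - 1.0             # rescale so that 0 <= fraction < 1
    fraction = math.floor(fraction * 2**m) / 2**m
    sign = -1 if x < 0 else 1
    return sign, exp - 1, fraction


def l_mul(x: float, y: float, m: int) -> float:
    """Approximate x * y: the x_m * y_m term is replaced by 2**(-l(m))."""
    if x == 0 or y == 0:
        return 0.0
    sx, xe, xm = decompose(x, m)
    sy, ye, ym = decompose(y, m)
    return sx * sy * (1 + xm + ym + 2.0 ** (-l_of_m(m))) * 2.0 ** (xe + ye)


if __name__ == "__main__":
    for a, b in [(1.5, 1.5), (-2.0, 3.0), (1.0, 1.0)]:
        print(a, b, "exact:", a * b, "L-Mul (m=4):", l_mul(a, b, 4))
```

For example, with m = 4 the product 1.5 × 1.5 comes out as 2.125 instead of 2.25, since the 0.25 cross term is replaced by the constant 2^(-3) = 0.125.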

In encryption and large-prime searches, by contrast, the O(n²) comes from the number of digits in the very large numbers involved. From the article in The Conversation that you linked (https://theconversation.com/weve-found-a-quicker-way-to-multiply-really-big-numbers-114923):

We’ve found a quicker way to multiply really big numbers

In 1960, Anatoly Karatsuba, a 23-year-old mathematics student in Russia, discovered a sneaky algebraic trick that reduces the number of multiplications needed.

For example, to multiply four-digit numbers, instead of needing 4² = 16 multiplications, Karatsuba’s method gets away with only nine. When using his method, twice as many digits means only three times as much work.

This stacks up to an impressive advantage as the numbers get bigger. For numbers with a thousand digits, Karatsuba’s method needs about 17 times fewer multiplications than long multiplication.
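The "nine instead of 16" count can be reproduced with a short sketch of Karatsuba's method that also counts single-digit multiplications. This version uses the subtractive form of the middle term, so the sub-operands never grow past half the digits; that choice, and the function name, are assumptions of this example rather than something taken from the article.

```python
def karatsuba(x: int, y: int):
    """Return (x * y, number of single-digit multiplications used).

    Works on non-negative integers, splitting on decimal digits.
    """
    if x < 10 and y < 10:
        return x * y, 1
    n = max(len(str(x)), len(str(y)))
    half = n // 2
    base = 10**half
    a, b = divmod(x, base)                  # x = a * base + b
    c, d = divmod(y, base)                  # y = c * base + d
    high, count_high = karatsuba(a, c)
    low, count_low = karatsuba(b, d)
    # Third product: (a - b) * (c - d), computed on absolute values.
    cross, count_cross = karatsuba(abs(a - b), abs(c - d))
    if (a < b) != (c < d):
        cross = -cross
    middle = high + low - cross             # equals a*d + b*c
    product = high * base * base + middle * base + low
    return product, count_high + count_low + count_cross


if __name__ == "__main__":
    print(karatsuba(1234, 5678))
```

Each level of recursion makes three half-size multiplications instead of four, which is where the "twice as many digits means three times as much work" scaling comes from.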

Can the solution to the AI problem have some impact on the problem of multiplying large numbers? In the long term there may be some connection between the solutions, but I don't think there is a currently recognized one.

muriel_volestrangler

(102,869 posts)

6. Encryption surely requires exact integer arithmetic, rather than approximate floating-point arithmetic

Reply to eppur_se_muova (Reply #4)

Mon Oct 14, 2024, 04:18 PM


The paper says their "L-Mul algorithm" achieves higher precision than 8-bit FP multiplication (though 8-bit is not very precise). But encryption works on integers, and I wouldn't think factorization can be "close enough".

hunter

(39,191 posts)

7. It reminds me of tricks that were used to get 3D gaming effects on slow 8 and 16 bit computers.

Reply to Jim__ (Original post)

Sat Oct 19, 2024, 04:33 PM


As with a lot of AI, these games didn't have to precisely reflect any physical model of reality; they just had to "look" right.

Of course "Natural Intelligence" may take such shortcuts too, which might partly explain why even highly intelligent people sometimes do stupid things.

It would be interesting to compare this energy efficient AI to an otherwise identical floating point or large integer AI.
