Project Euler Problem 16

Power Digit Sum

215 = 32768 and the sum of its digits is 3 + 2 + 7 + 6 + 8 = 26.

What is the sum of the digits of the number 21000?

Approach:

21000 is big. Really big. Compared to the lifespan of the Universe in years, about 13.68 billion, it is huge. That means that brute forcing this will not work. Let's break down the problem statement. We specifically need to figure out where and what digit is in which spot, without knowing the number. That means we will need some way of also knowing the number's length.

To start out, let's look at some more manageable numbers to test against, like 210 and 2100.

In [1]:
print(2**10)
1024
In [2]:
print(2**100)
1267650600228229401496703205376

What actually defines where a digit is placed?

Another way to think of numbers is in base 10, or the base that we use... That means that we can write 1024 like this:

1x103 + 0x102 + 2x101 + 4x100

Let's validate this in Python.

In [4]:
print((1 * 10**3) + (0 * 10**2) + (2 * 10**1) + (4 * 10**0))
1024

Now let's grab a single digit from 1024. To do so we can take advantage of the modulo operator. We know modulo produces the remainder of a division operation, so, if we were to mod 1024 by 10,000; it would return 1024. If we mod 1024 by 1000, it will return 24. Furthermore, if we subtract 24 from 1024, we get... 1000. To get the digit in the 103 place, we simply divide by 1000!

Let's validate this with python.

In [12]:
print((1024 % 10000), (1024 % 1000))
print(((1024 % 10000) - (1024 % 1000)) / 1000)
1024 24
1.0

Now let's write a python function for this logic.

In [14]:
import math
def modularizer(base, power, mod):
    for i in range(0, power):
        if (i == 0):
            temp = base
        else:
            temp = temp * base
        temp = temp % mod
    return temp

Let's test our function with 210 and get the digit in the 103 place.

In [21]:
high = modularizer(2, 10, 10**4)
low = modularizer(2, 10, 10**3)
divisor = 10**3
print((high - low)/divisor)
1.0

Awesome! our function appears to work for our test case. However, how are we supposed to know if the digit being returned is a 0 or a digit that does not exist? For example, let's look at what happens when we try to find a digit that does not exist in our test case.

In [22]:
high = modularizer(2, 10, 10**5)
low = modularizer(2, 10, 10**4)
divisor = 10**4
print((high - low)/divisor)
0.0

Ouch. We need a way to figure out how long our number is without knowing our number...

After a little research I found a neat way to find the length of a number in base 10. Logarithms! As an example, log, base 10, of 100 is 2. Log base 10 of 1000 is 3, etc... This means that we can simply add 1 to log base 10 of our number. Log base 10 does not always return an integer. To fix this we can floor our answer, and that should reliably return the length.

In [30]:
print(math.log10(100) + 1)
print(math.log10(1000) + 1)
print(math.log10(2**10) + 1)
print(math.floor(math.log10(2**10)) + 1)
3.0
4.0
4.0102999566398125
4

Another way we can optimize this equation is to take advantage of the nature of logarithms and rewrite 21000. The property we can use is log(ab) = b * log(a). The last step is to iterate through the length of the number and find what digit is in that spot using our "modularizer" function. As we get the digit, we will append it to a string, and then return a reversed version of the string.

In [33]:
def getDigits(base, power):
    tempStr = ""
    limit = math.floor(power * math.log10(base) + 1)
    for i in range(limit):
        low = 10 ** i
        high = 10 ** (i + 1)
        digit = int((modularizer(base, power, high) - modularizer(base, power, low))/low)
        tempStr += str(digit)
    return tempStr[::-1]
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 5.48 µs
In [34]:
%time
getDigits(2, 10)
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 6.68 µs
Out[34]:
'1024'

Awesome! It looks like our function is working. Now let's test against 2100. To speed things up, let's compare our string against the string for 2100.

In [35]:
myString = "1267650600228229401496703205376"
testString = getDigits(2, 100)
for i in range(len(myString)):
    if myString[i] == testString[i]:
        continue
    else:
        print("!", testString[i])

No Errors! Now let's try 21000, and then sum it's digits and check the answer!

In [38]:
%time
testString = getDigits(2, 1000)

mySum = 0
for i in range(len(testString)):
    mySum += int(testString[i])
print(mySum)
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 5.96 µs
1366

There we go. The answer is correct, we have no integer overflows, and the execution time is practically free! For fun, let's see how long it takes to calculate the sum of the digits in 210000. As a comparison, physicists estimate there are between 1078 and 1082 atoms in the universe, and 210000 is FAR bigger than those numbers, so let's see how our algorithm holds up.

In [43]:
import time

start = time.time()
testString = getDigits(2, 10000)

mySum = 0
for i in range(len(testString)):
    mySum += int(testString[i])
end = time.time()
print(mySum, (end - start))
13561 27.446865558624268

30 seconds still is pretty good execution time for this algorithm. Ways it could be improved would be to create a custom int class for integer overflows in other languages, but other than that I am happy with it. I'm chalking this one up as a win.