Code Along - Implementing GCD Algorithms in Python

Learning objectives

We are now ready to return to the two algorithms for finding the GCD of two integers that we presented as pseudocode in the core text. In TrivialGCD(), we consider every possible divisor p between 1 and the minimum of the two input integers a and b, and take the largest common divisor.

TrivialGCD(a, b)
    d ← 1
    m ← Min2(a,b)
    for every integer p between 1 and m
        if p is a divisor of both a and b
            d ← p
    return d

We then introduced EuclidGCD(), which iteratively subtracts the smaller of the two integers from the larger until they are both equal.

EuclidGCD(a, b)
    while a ≠ b
        if a > b
            a ← a − b
        else
            b ← b − a
    return a

We claimed that EuclidGCD() is a more efficient algorithm than TrivialGCD(). In this code along, we will implement these two algorithms in Python and then time them to test our hypothesis.

Code along summary

Set up

Create a new folder called gcd in your python/src directory, and then create a new file within the python/src/gcd folder called main.py, which we will edit in this lesson.

In main.py, add the following starter code.

def main():
    print("Implementing two GCD algorithms.")


if __name__ == "__main__":
    main()

Implementing the trivial GCD algorithm

We start with TrivialGCD(), reproduced below.

TrivialGCD(a: int, b: int) -> int
    d ← 1
    m ← Min2(a,b)
    for every integer p between 1 and m
        if p is a divisor of both a and b
            d ← p
    return d

Our introductory work has equipped us with most of what we need to know to implement this function, since we have discussed for loops and if statements, and we have already implemented a min_2() function that trivial_gcd() uses as a subroutine. We reproduce the Python implementation of min_2() below; make sure that it is present in main.py so that it will be visible to our implementation of trivial_gcd().

Note: Of course, instead of using min_2(), we could use the min() function that’s built into Python.

def min_2(a: int, b: int) -> int:
    """
    Takes two integers as input and returns their minimum.

    Parameters:
    - a (int): first integer
    - b (int): second integer

    Returns:
    int: minimum of a and b
    """
    if a < b:
        return a
    else:
        return b

We are nearly ready to implement trivial_gcd(), but we must first determine how to implement the line that tests whether p is a divisor of both input integers,

        if p is a divisor of both a and b

First, note that we can rewrite trivial_gcd() using two nested if statements instead of a single statement, as follows. This change allows us to focus solely on determining whether one integer is a divisor of another.

TrivialGCD(a: int,b: int) -> int
    d ← 1
    m ← Min2(a,b)
    for every integer p between 1 and m
        if p is a divisor of a
            if p is a divisor of b
                //if we make it here, we know that p is a divisor of both a and b
                d ← p
    return d

From the previous code along, we know how to test when p is a divisor of a: it is when a % p, the remainder of the division of a by p, is equal to zero.

def trivial_gcd(a: int, b: int) -> int:
    """
    Returns the GCD of two integers by applying a trivial algorithm attempting each possible divisor of a and b
    up to their minimum.

    Parameters:
    - a (int): first integer
    - b (int): second integer

    Returns:
    int: GCD of a and b
    """

    d = 1
    m = min_2(a, b)
    for p in range(1, m + 1):
        if a % p == 0:
            if b % p == 0:
                # if we make it here, we know that p is a divisor of both a and b
                d = p
    return d

STOP: Let’s call our function on a couple of integers. Type the following code into def main(). Then, open a command line terminal and navigate into the folder containing your code by executing the command cd python/src/gcd. Then, run the code by executing the command python3 main.py (macOS/Linux) or python main.py (Windows). You should see the correct value of 21 printed.

def main():
    x = 63
    y = 42
    print(trivial_gcd(x, y))

Click Run 👇 to try it!

Improving TrivialGCD(), and the “and” and “or” keywords

In practice, we do not need to use two nested if statements to test whether p is a divisor of both a and b. Instead, we can test both statements as part of a single if statement using an and keyword, which we encountered briefly when introducing conditionals. If x and y are two Boolean expressions (i.e., evaluating to True or False), then the expression x and y is True precisely when both x and y are True; it is False otherwise. We will use this idea to update our implementation of trivial_gcd() with a single if statement as follows.

def trivial_gcd(a: int, b: int) -> int:
    """
    Returns the GCD of two integers by applying a trivial algorithm attempting each possible divisor of a and b
    up to their minimum.

    Parameters:
    - a (int): first integer
    - b (int): second integer

    Returns:
    int: GCD of a and b
    """

    d = 1
    m = min_2(a, b)
    for p in range(1, m + 1):
        if a % p == 0 and b % p == 0:
            d = p
    return d

Note: For a given value of a and p, if a % p == 0 is False, then Python will not bother to check whether b % p is equal to 0, since there is no way that both statements can be true. This is called short circuiting.

Python also has an or keyword. The expression x or y is True precisely when at least one of x and y are True; it is False when both x and y are False. A truth table illustrating all four different outcomes for the and and or keywords is shown in the figure below.

`x`	`y`	`x and y`	`x or y`
`True`	`True`	`True`	`True`
`True`	`False`	`False`	`True`
`False`	`True`	`False`	`True`
`False`	`False`	`False`	`False`

Note: When checking the truth of x or y, Python uses short circuiting as well. If x is True, then x or y is automatically True, regardless of the value of y, and Python immediately considers x or y to be True.

Note: A common pitfall for introductory programmers is ensuring that both statements connected by a logical connective like and must be syntactically correct statements in the language. For example, the expression if a % p and b % p == 0: may produce an unexpected result in Python because although it may make sense to a human, Python views the expression a % p as an integer, which is always True unless it is equal to 0.

STOP: Say that we replaced the line if a % p == 0 and b % p == 0: in trivial_gcd() with the incorrect line if a % p == 0 or b % p == 0:. What would the function return instead of the GCD?

Click Run 👇 to try it!

STOP: Technically, trivial_gcd() contains a bug because according to the strictest mathematical definition, for any integer n, the GCD of 0 and n is equal to n. What does trivial_gcd() currently return in this situation, and how could we fix it?

Implementing Euclid’s nontrivial GCD algorithm

We now return to implementing Euclid’s algorithm. The pseudocode for EuclidGCD() is reproduced below.

EuclidGCD(a, b)
    while a ≠ b
        if a > b
            a ← a − b
        else
            b ← b − a
    return a

The only aspect of this function that we have not yet introduced is the test for inequality of a and b, while a ≠ b. In a previous lesson, we introduced a collection of comparison operators. In particular, we saw that ≠ can be represented in Python using the operator !=. We use this operator in our implementation of EuclidGCD() below.

def euclid_gcd(a: int, b: int) -> int:
    """
    Returns the GCD of two integers by applying Euclid's algorithm.

    Parameters:
    - a (int): first integer
    - b (int): second integer

    Returns:
    int: GCD of a and b
    """

    while a != b:
        if a > b:
            a = a - b
        else:
            b = b - a
    return a

We can now check that euclid_gcd() is working by calling it on a couple of real inputs; update main() as follows.

def main():
    x = 63
    y = 42
    print(euclid_gcd(x, y))

Click Run 👇 to try it!

STOP: As with trivial_gcd(), euclid_gcd(0, n) should return n for any positive integer n. What currently happens when we call euclid_gcd() in this situation, and how could we fix it?

Timing the two GCD algorithms

You may still be wondering what the big deal is. Is Euclid’s approach for finding a GCD really that much better? Aren’t modern computers so fast that they make this problem obsolete?

To answer this question, let’s time trivial_gcd() and euclid_gcd() on the same inputs. The code that we will use for timing will prove handy throughout our programming future for measuring the efficiency of different functions.

First, we will need to add a statement to the top of main.py to import Python’s time module; a module is a collection of pre-written functions that we can use without having to write from scratch.

import time

In def main(), we declare two positive integer variables and set them to values that are on the order of a million; it would be better to use randomly generated integers, but we will discuss how to generate random integers in a later chapter.

def main():
    x = 3782026
    y = 2731479

Next, we will time trivial_gcd(). First, we declare a variable named start that is equal to the current time using the built-in function time.time(). This function takes no inputs, and returns a float equal to the number of seconds that have elapsed since January 1, 1970, 00:00:00 (UTC). We can think of declaring start as clicking the “start” button on a stopwatch.

def main():
    x = 3782026
    y = 2731479

    # timing trivial_gcd
    start = time.time()

We then will call trivial_gcd(x, y). We do not need the output of trivial_gcd(x, y), and so we will not set a variable equal to this output. (Don’t worry, Python’s interpreter won’t get angry.)

def main():
    x = 3782026
    y = 2731479

    # timing trivial_gcd
    start = time.time()
    trivial_gcd(x, y)

Finally, we need to click the “stop” button on our conceptual stopwatch. To do so, we call the function time.time() again, subtracting the start variable from the result, which will tell us how long trivial_gcd() ran. We store the result of this subtraction in a variable elapsed.

def main():
    x = 3782026
    y = 2731479

    # timing trivial_gcd
    start = time.time()
    trivial_gcd(x, y)
    elapsed = time.time() - start

We now need to display the amount of time that has gone by. We could always simply call print(elapsed), but we will use this example as an opportunity to show how to format this output more nicely. Python provides a powerful feature called f-strings, which allow us to add formatting to certain parts of a string based on the values of variables. In this case, the line

print(f"trivial_gcd took {elapsed:.6f} seconds")

prints the value of elapsed to six decimal places. This command is similar to the printf command from other programming languages like C, which is why you’ll sometimes hear this style called “printf-style formatting.” We will adapt the following code any time we need to time functions.

def main():
    x = 3782026
    y = 2731479

    # timing trivial_gcd
    start = time.time()
    trivial_gcd(x, y)
    elapsed = time.time() - start
    print(f"trivial_gcd took {elapsed:.6f} seconds")

STOP: Run this code, either on your computer or by using the window below. What do you find?

Click Run 👇 to try it!

We now will repeat this process and time euclid_gcd(), updating def main() as shown below. Note that we are very careful to run euclid_gcd() on the exact same input integers that we used for trivial_gcd(), and that we place only the call to euclid_gcd() between the lines invoking the start time.time() and end time.time() so that we obtain accurate results.

def main():
    x = 3782026
    y = 2731479

    # timing trivial_gcd
    start = time.time()
    trivial_gcd(x, y)
    elapsed_trivial = time.time() - start
    print(f"trivial_gcd took {elapsed_trivial:.6f} seconds")

    # timing euclid_gcd
    start = time.time()
    euclid_gcd(x, y)
    elapsed_euclid = time.time() - start
    print(f"euclid_gcd took {elapsed_euclid:.6f} seconds")

Finally, we print the speedup, which is the ratio of the runtime of trivial_gcd() to that of euclid_gcd().

def main():
    x = 3782026
    y = 2731479

    # timing trivial_gcd
    start = time.time()
    trivial_gcd(x, y)
    elapsed_trivial = time.time() - start
    print(f"trivial_gcd took {elapsed_trivial:.6f} seconds")

    # timing euclid_gcd
    start = time.time()
    euclid_gcd(x, y)
    elapsed_euclid = time.time() - start
    print(f"euclid_gcd took {elapsed_euclid:.6f} seconds")

    # compute and print speedup
    if elapsed_euclid > 0:
        speedup = elapsed_trivial / elapsed_euclid
        print(f"Speedup: {speedup:.2f}x faster")

STOP: Run this code, either on your computer or by using the window below. How do the two approaches compare?

Click Run 👇 to try it!

Furthermore, let’s see what happens as we make the numbers larger.

STOP: Add a digit to each of x and y in your code. How does it affect the runtime of the two algorithms? What about if you add two digits to x and y? What is this demonstrating?

Click Run 👇 to try it!

The previous question is hinting at a critical theme of programming, which is that the benefits of a faster algorithm provides tend to increase as the size of the input parameters increases. Computing the GCD of two small numbers may be relatively fast with either trivial_gcd() or euclid_gcd(). However, as we increase the size of the input integers to the order of 10 million, trivial_gcd() takes several seconds on a typical computer, while euclid_gcd() finishes in about a thousandth of a second.

This phenomenon is illustrated in the following table, which illustrates the runtime of the two algorithms on input integers having different numbers of digits. We can quantify how much faster euclid_gcd() is by taking the ratio of the runtime of trivial_gcd() to euclid_gcd() on the same input; this ratio is called the speedup of euclid_gcd() compared to trivial_gcd(). As we continue to increase the size of the input integers, trivial_gcd() will start to exhaust considerable computational resources, whereas euclid_gcd() will finish in the blink of an eye.

STOP: Time the two approaches for numbers with a variety of differing numbers of digits. Are your results similar to those of the table below?

Number of Digits	`trivial_gcd()` Runtime	`euclid_gcd()` Runtime	Speedup
6	15.4 ms	43 μs	358
7	0.154 s	43 μs	3,580
8	1.56 s	42 μs	37,143
9	(Times out in online IDE)	43 μs	Unknown

Figure: A table showing sample runtimes of trivial_gcd() and euclid_gcd() for two integers having an indicated number of digits. The final column shows the “speedup” offered by euclid_gcd(), which is the ratio between the runtime of trivial_gcd() and euclid_gcd(). Observe that the speedup increases as the size of the input increases. The units ms and μs denote milliseconds (thousandths of seconds) and microseconds (millionths of seconds), respectively.

As we move toward the final code-along in this introductory chapter, we are now ready to consider the second nontrivial algorithm of this chapter, the sieve of Eratosthenes. Will we find that it provides a considerable speedup as well?

For the precocious: an even faster Euclid GCD algorithm

Say that we are taking the GCD of 50 and 375. Euclid’s algorithm would subtract 50 from 375 to obtain 325, concluding that GCD(50, 375) = GCD (50, 325). In turn, it would subtract 50 from 325 to obtain 275, concluding that GCD(50, 325) = GCD(50, 275). In this manner, it would by repeated subtractions obtain that the original GCD is the same as GCD(50, 75), which is equal to GCD(50, 25).

Yet remembering that repeated subtraction is just division, we note that we obtain 25 by taking the remainder after dividing 375 by 50. In other words, Euclid’s algorithm is relying on the fact that GCD(a, b) = GCD(a, b–a), but there is a stronger result that GCD(a, b) = GCD(a, b%a). By repeatedly applying this fact, we can obtain an optimized version of Euclid’s algorithm that we call faster_euclid_gcd(). In fact, the algorithm below is implemented by the gcd() function contained within Python’s math module.

def faster_euclid_gcd(a: int, b: int) -> int:
    """
    Computes the greatest common divisor (GCD) of two positive integers using
    an optimized version of Euclid's algorithm that replaces subtraction with
    the modulus operator.

    Parameters:
    - a (int): a positive integer
    - b (int): a positive integer

    Returns:
    int: the greatest common divisor of a and b
    """
    while a != 0 and b != 0:
        if a > b:
            a = a % b
        else:
            b = b % a
    if a == 0:
        return b
    else:
        return a

STOP: Time faster_euclid_gcd() against euclid_gcd(). What is the speedup that it provides for integers with 6, 7, 8, and 9 digits?

Click Run 👇 to try it!

Looking ahead

We have implemented one elegant mathematical algorithm from antiquity, but one remains: the sieve of Eratosthenes for finding prime numbers. To implement this algorithm and store multiple prime numbers, we will need to learn more about how Python implements arrays, which allow us to store more than one value at a time.

Next lesson

Check your work from the code along

We provide autograders in the window below (or via a direct link) allowing you to check your work for the following functions:

trivial_gcd()
euclid_gcd()