NumPy: A tale of for-loops, slow code, and how one library changed the course of ML forever

Programming foundations for ML and Data Science

Vizuara AI Labs

and

Sreedath Panat

Apr 12, 2025

Imagine this.

You are in 2003. Python is cool but still kind of niche.
Data scientists are rare mythical creatures.
You just want to multiply two lists.

You write:

a = [1, 2, 3]
b = [4, 5, 6]
result = [a[i]*b[i] for i in range(len(a))]  # Not efficient

You smile. You are proud.
But somewhere, your CPU is silently weeping.

Because this code… is slow.
Like dial-up-internet slow.
Like-waiting-for-your-crush-to-text-back slow.

And then came NumPy

With the elegance of a mathematician and the power of C/C++,
NumPy strolled in and said: “You shall not loop!”

So what is NumPy? It is a Python library, created in 2005 by bringing together two earlier array packages, Numeric and Numarray.

NumPy was built for speed, vectorization, and efficiency.
It was Python’s glow-up moment. Suddenly, Python could talk math like a native speaker.

Here is what NumPy brought to the scene:

✅ Multi-dimensional arrays (ndarray)
✅ Vectorized operations (bye-bye for-loops 👋)
✅ Linear algebra, stats, broadcasting
✅ The foundation for libraries like Pandas, Scikit-learn, and even early TensorFlow

And most of the heavy lifting?
Done by C and C++ under the hood.

Python stayed readable.
NumPy made it fast.
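To make that concrete, here is a minimal sketch of the list multiplication from the opening, rewritten with NumPy arrays. The variable names simply mirror the earlier snippet.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# One expression, no explicit Python loop: the multiplication
# runs element by element inside NumPy's compiled code.
result = a * b
print(result)  # [ 4 10 18]
```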

Python lists had the following issues
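Two issues commonly raised about lists for numerical work are easy to reproduce. The sketch below is illustrative, not a complete list: arithmetic operators on lists do not do element-wise math, and a list can mix types, so Python has to inspect every element at run time.

```python
import numpy as np

a = [1, 2, 3]
b = [4, 5, 6]

# On lists, + concatenates and * repeats; neither does arithmetic.
print(a + b)   # [1, 2, 3, 4, 5, 6]
print(a * 2)   # [1, 2, 3, 1, 2, 3]

# A list can hold any mix of objects.
mixed = [1, "two", 3.0]
print([type(x).__name__ for x in mixed])  # ['int', 'str', 'float']

# NumPy arrays hold a single dtype, and the same operators are element-wise.
arr_a = np.array(a)
arr_b = np.array(b)
print(arr_a + arr_b)  # [5 7 9]
print(arr_a * 2)      # [2 4 6]
```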

NumPy in action

Let us see the code snippets for some basic operations using NumPy arrays.

Array creation and operations? Easy

import numpy as np

a = np.array([1, 2, 3])            # Now we're talking
b = np.array([[1, 2], [3, 4]])     # 2D like your favorite matrix

a + 10         # [11 12 13] - broadcasting wizardry
a * 2          # [2 4 6] - strong
np.sqrt(a)     # [1.0, 1.41, 1.73] - enlightened
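Broadcasting is not limited to an array and a scalar. It also works between arrays of different but compatible shapes. A minimal sketch, where the `row` and `col` arrays are made up for illustration:

```python
import numpy as np

b = np.array([[1, 2], [3, 4]])
row = np.array([10, 20])        # shape (2,)
col = np.array([[100], [200]])  # shape (2, 1)

# row is stretched across each row of b
print(b + row)
# [[11 22]
#  [13 24]]

# col is stretched across each column of b
print(b + col)
# [[101 102]
#  [203 204]]
```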

Indexing? Smooth

a[0]            # first element
b[0, 1]         # row 0, column 1
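Beyond single elements, you will often reach for slices and boolean masks. A short sketch using the same `a` and `b` arrays as above:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([[1, 2], [3, 4]])

print(a[-1])      # 3, last element
print(a[0:2])     # [1 2], slice of the first two elements
print(b[:, 0])    # [1 3], first column of every row
print(b[1])       # [3 4], second row
print(a[a > 1])   # [2 3], boolean mask keeps elements greater than 1
```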

Reshaping? Done

c = np.arange(12).reshape(3, 4)
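It helps to see what that one-liner actually produces. The sketch below prints `c`, then shows two related behaviours of `reshape`: passing -1 to let NumPy infer a dimension, and the error you get when the element count does not fit. The array name `d` is only for illustration.

```python
import numpy as np

c = np.arange(12).reshape(3, 4)
print(c)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]
print(c.shape)  # (3, 4)

# -1 asks NumPy to infer that dimension from the total size
d = c.reshape(2, -1)
print(d.shape)  # (2, 6)

# The element count must match; 12 values cannot fill a 5 x 5 grid
try:
    c.reshape(5, 5)
except ValueError as err:
    print("reshape failed:", err)
```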

Aggregation? Like a boss

c.sum(axis=0)   # column-wise sum
c.mean(axis=1)  # row-wise mean
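The `axis` argument trips up almost everyone at first. One way to read it: `axis=0` collapses the rows, leaving one value per column, and `axis=1` collapses the columns, leaving one value per row. A small sketch with the same `c` array:

```python
import numpy as np

c = np.arange(12).reshape(3, 4)

print(c.sum(axis=0))   # [12 15 18 21]  one total per column
print(c.mean(axis=1))  # [1.5 5.5 9.5]  one mean per row
print(c.sum())         # 66             no axis: reduce over everything
print(c.max(axis=1))   # [ 3  7 11]     largest value in each row
```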

All good. But can we quantify how much faster NumPy is than Python lists?

Speed comparison: Python lists vs NumPy arrays

Let us say you want to multiply TWO ARRAYS of 100 million numbers each.

Just a simple, innocent element-wise multiplication.
Nothing fancy. Should take only a second, right?

Here is how to multiply two giant arrays

import time

# Define the number of elements
n = 100_000_000  # 100 million length

# Create two Python lists with values 0 to n - 1
a = list(range(n))
b = list(range(n))

# Start timing
start_time = time.time()

# Perform element-wise multiplication using a list comprehension (a Python-level loop)
result = [a[i] * b[i] for i in range(len(a))]

# End timing
end_time = time.time()

# Display results
print(f"Total time taken with Python lists: {end_time - start_time:.4f} seconds")

Time taken:
Enough to make a cup of tea, bake banana bread, finish a Netflix episode, and revisit your life decisions.

CPU status:
“Please… have mercy.”

You:
Watching your fan spin like a helicopter and wondering if it is morally wrong to still be using for-loops.

Jokes aside, this code took about 10.65 seconds in reality. The same operation using NumPy arrays took only 0.38 seconds, which is roughly 28 times faster.

Here is what NumPy does

import time
import numpy as np

# Define the number of elements
n = 100_000_000  # 100 million length

# Create two NumPy arrays with values 0 to n - 1
a = np.arange(n)
b = np.arange(n)

# Start timing
start_time = time.time()

# Perform element-wise multiplication using NumPy
result = a * b

# End timing
end_time = time.time()

# Display results
print(f"Total time taken with NumPy arrays: {end_time - start_time:.4f} seconds")

Boom! That is it.

Time taken:
Done before you even hit "Enter". (Just 0.38 seconds to be precise)

CPU status:
“Cool. What’s next?”

You:
Feeling like a wizard. Whispering “np.dot, my old friend…” into the terminal.
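A single `time.time()` measurement can be noisy. As a rough sketch of a more repeatable comparison, the snippet below uses the standard-library `timeit` module on a smaller input (one million elements, an arbitrary choice so that it finishes quickly) and keeps the best of five runs. The helper names are made up for illustration, and the absolute numbers will differ from the figures quoted above depending on your machine.

```python
import timeit
import numpy as np

n = 1_000_000  # smaller than the 100 million used above, chosen for a quick run

list_a = list(range(n))
list_b = list(range(n))
arr_a = np.arange(n, dtype=np.int64)
arr_b = np.arange(n, dtype=np.int64)

def multiply_lists():
    # Python-level loop over pairs of elements
    return [x * y for x, y in zip(list_a, list_b)]

def multiply_arrays():
    # Vectorized multiplication inside NumPy
    return arr_a * arr_b

# Best of 5 runs reduces the effect of background noise
t_list = min(timeit.repeat(multiply_lists, number=1, repeat=5))
t_numpy = min(timeit.repeat(multiply_arrays, number=1, repeat=5))

print(f"Python lists: {t_list:.4f} s")
print(f"NumPy arrays: {t_numpy:.4f} s")
print(f"Speed-up: {t_list / t_numpy:.1f}x")

# Both approaches compute the same values
print(multiply_lists() == multiply_arrays().tolist())  # True
```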


Let us compare Python lists vs NumPy for matrix multiplication

Let us compare the speeds for multiplication of matrices of different sizes. We will try with matrices of dimensions 10x10, 100x100 and 1000x1000. Here is the code. I will show you the results below the code.

import time
import numpy as np
import matplotlib.pyplot as plt

# Define matrix sizes to test
sizes = [10, 100, 1000]
list_times = []
numpy_times = []

def multiply_lists(n):
    # Create n x n matrices A and B filled with ones, and a zero matrix C, as nested lists
    A = [[1 for _ in range(n)] for _ in range(n)]
    B = [[1 for _ in range(n)] for _ in range(n)]
    C = [[0 for _ in range(n)] for _ in range(n)]

    start = time.time()
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
    end = time.time()
    return end - start

def multiply_numpy(n):
    A = np.ones((n, n))
    B = np.ones((n, n))

    start = time.time()
    C = A @ B  # Matrix multiplication
    end = time.time()
    return end - start

# Run benchmarks
for n in sizes:
    print(f"Running for size {n}x{n}...")
    t_list = multiply_lists(n)
    t_np = multiply_numpy(n)
    list_times.append(t_list)
    numpy_times.append(t_np)

# Plotting
plt.figure(figsize=(10,6))
plt.plot(sizes, list_times, marker='o', label='Python Lists', color='red')
plt.plot(sizes, numpy_times, marker='s', label='NumPy', color='green')
plt.xlabel('Matrix Size (n x n)')
plt.ylabel('Execution Time (seconds)')
plt.title('Matrix Multiplication: Python Lists vs NumPy')
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()
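Before trusting any benchmark, it is worth confirming that both implementations compute the same thing. A small sketch, assuming random 50x50 inputs (the size, the seed, and the helper name `matmul_lists` are arbitrary choices for illustration), that checks the triple loop against `@`:

```python
import numpy as np

def matmul_lists(A, B):
    """Naive triple-loop matrix product on nested lists."""
    n, m, p = len(A), len(B), len(B[0])
    C = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            for k in range(m):
                C[i][j] += A[i][k] * B[k][j]
    return C

rng = np.random.default_rng(0)
A = rng.random((50, 50))
B = rng.random((50, 50))

C_lists = matmul_lists(A.tolist(), B.tolist())
C_numpy = A @ B

# Floating-point sums can differ in the last digits, so compare with a tolerance
print(np.allclose(C_lists, C_numpy))  # True
```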

NumPy’s role in ML workflow

If you start working on ML/DS projects, you will inevitably use NumPy. Here is where you are most likely to encounter NumPy in the ML pipeline.
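As one illustrative example of that kind of usage, here is a sketch of two common steps: standardizing features and computing a linear model's predictions. The data, weights, and bias are all made up, and this is only one of many places NumPy shows up.

```python
import numpy as np

# Hypothetical feature matrix: 4 samples, 2 features
X = np.array([[1.0, 200.0],
              [2.0, 300.0],
              [3.0, 400.0],
              [4.0, 500.0]])

# Standardize each feature (column) to zero mean and unit variance
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

# Hypothetical weights and bias of a linear model
w = np.array([0.5, -0.25])
bias = 1.0

# One prediction per sample: matrix-vector product plus bias
y_pred = X_scaled @ w + bias

print(X_scaled.mean(axis=0))  # approximately zero for each feature
print(X_scaled.std(axis=0))   # approximately one for each feature
print(y_pred.shape)           # (4,)
```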

"If you know NumPy, you’re halfway to mastering machine learning in Python."

Limitations of NumPy

While NumPy is powerful, it also has some limitations:
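Two commonly cited limitations are easy to demonstrate. This is a sketch of examples I am confident about, not necessarily the full set: an array has one fixed dtype, and its size is fixed, so "appending" really builds a new array.

```python
import numpy as np

a = np.array([1, 2, 3])      # integer dtype inferred from the values
a[0] = 2.7                   # the float is truncated to fit the integer dtype
print(a)                     # [2 2 3]

# np.append does not grow the array in place; it returns a new copy
b = np.append(a, 4)
print(a.shape, b.shape)      # (3,) (4,)
print(b is a)                # False
```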

NumPy mini challenge

Scenario:

You are given the marks of 5 students in 3 subjects as a 2D array.

Your task is to:

Calculate the average marks for each student.
Identify the top-performing student.

I recommend that you try this yourself first. If you get stuck, you can refer to the code below.

import numpy as np

# Marks of 5 students in 3 subjects
marks = np.array([
    [85, 90, 88],    # Student 1
    [70, 75, 80],    # Student 2
    [92, 88, 95],    # Student 3
    [60, 65, 70],    # Student 4
    [78, 85, 82]     # Student 5
])

# 1. Calculate average marks for each student
average_marks = np.mean(marks, axis=1)

print("Average marks for each student:")
for i, avg in enumerate(average_marks, 1):
    print(f"Student {i}: {avg:.2f}")

# 2. Identify the top-performing student
top_student_index = np.argmax(average_marks)

print(f"\n🏆 Top-performing student is Student {top_student_index + 1} with an average of {average_marks[top_student_index]:.2f}")

Output:

Average marks for each student:
Student 1: 87.67
Student 2: 75.00
Student 3: 91.67
Student 4: 65.00
Student 5: 81.67

🏆 Top-performing student is Student 3 with an average of 91.67
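If you want to push the challenge a little further, here is a sketch of two natural follow-ups using the same `marks` array: ranking all students by average, and computing the average per subject with `axis=0`. The names `ranking` and `subject_avg` are just for illustration.

```python
import numpy as np

marks = np.array([
    [85, 90, 88],    # Student 1
    [70, 75, 80],    # Student 2
    [92, 88, 95],    # Student 3
    [60, 65, 70],    # Student 4
    [78, 85, 82],    # Student 5
])

average_marks = marks.mean(axis=1)

# argsort returns indices in ascending order; reverse for best-first
ranking = np.argsort(average_marks)[::-1]
print("Ranking (student numbers):", (ranking + 1).tolist())  # [3, 1, 5, 2, 4]

# axis=0 averages down the columns: one value per subject
subject_avg = marks.mean(axis=0)
print("Subject averages:", subject_avg.tolist())  # [77.0, 80.6, 83.0]
```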

Conclusion

NumPy didn’t just shape numerical computing. It raised a generation of powerful libraries.

Pandas, TensorFlow, PyTorch, JAX, CuPy: they all owe their roots to NumPy’s simple idea:

"Make numerical computing in Python easy... and fast."

The viral image below says it all.

Array Libraries Interoperability | Quansight Labs

While the newer libraries have grown into strong, capable tools…
NumPy still walks quietly behind them — the master, the teacher, the origin.

So the next time you call .mean() or np.dot() without thinking, maybe whisper a little “thank you” to the OG.

NumPy didn’t just teach Python to compute.
It taught it how to grow.

Interested in learning ML from the foundations? Check this out

ML foundations course

Lecture video

I have made a lecture video on this topic and hosted it on Vizuara’s YouTube channel. Do check this out. I hope you enjoy watching this lecture as much as I enjoyed making it.