NumPy (Numerical Python) is the foundational package for scientific computing in Python. It provides an optimized multi-dimensional array object, the ndarray, along with a large collection of high-level mathematical functions that operate on these arrays. Much of the scientific Python ecosystem, including pandas, Matplotlib, and scikit-learn, is built on top of it.
The key practical difference between a NumPy array and a standard Python list is performance. A list holds references to separate Python objects of any type, whereas a NumPy array stores values of a single type in one contiguous block of memory. That layout lets mathematical operations run in compiled code, often orders of magnitude faster than the equivalent loop over a Python list.
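As a rough illustration of the performance point, the sketch below computes the same sum of squares with a list and with an array, then times both. The names `py_list` and `np_array` and the size `n` are arbitrary choices for this example, and the timings depend on the machine, so no expected numbers are shown.

```python
import timeit

import numpy as np

n = 100_000
py_list = list(range(n))
# An explicit 64-bit dtype avoids overflow on platforms whose default integer is 32-bit.
np_array = np.arange(n, dtype=np.int64)

# Both approaches compute the same sum of squares.
list_result = sum(x * x for x in py_list)
array_result = int(np.sum(np_array * np_array))
print(f"Same result: {list_result == array_result}")

# Time 10 repetitions of each approach.
list_time = timeit.timeit(lambda: sum(x * x for x in py_list), number=10)
array_time = timeit.timeit(lambda: np.sum(np_array * np_array), number=10)
print(f"List: {list_time:.4f} s, array: {array_time:.4f} s")

# The array data sits in one contiguous buffer of fixed-size elements.
print(f"C-contiguous: {np_array.flags['C_CONTIGUOUS']}")
print(f"Bytes per element: {np_array.itemsize}, total bytes: {np_array.nbytes}")
```

On most machines the array version should be noticeably faster, though the exact ratio varies with array size and hardware.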
Key Concepts
- ndarray: The core data structure of NumPy. It's a grid of values, all of the same type, indexed by a tuple of non-negative integers.
- Vectorization: The ability to apply an operation to an entire array at once, without writing an explicit for loop. The loop still happens, but inside NumPy's compiled code, which makes the Python code both cleaner and significantly faster.
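The two concepts above can be sketched in a few lines, assuming NumPy is already installed. The arrays `grid`, `mixed`, and `prices` are made-up data for illustration only.

```python
import numpy as np

# An ndarray is a grid of values that all share one type.
grid = np.array([[1.5, 2.0, 3.5], [4.0, 5.5, 6.0]])
print(grid.dtype)  # float64
print(grid.shape)  # (2, 3)
print(grid.ndim)   # 2

# Elements are addressed by a tuple of indices: (row, column).
print(grid[1, 0])  # 4.0

# Mixing integers and floats upcasts everything to a single type.
mixed = np.array([1, 2, 3.0])
print(mixed.dtype)  # float64

# Vectorization: one expression replaces an explicit loop.
prices = np.array([10.0, 20.0, 30.0])
looped = np.array([p * 1.1 for p in prices])
vectorized = prices * 1.1
print(np.allclose(looped, vectorized))  # True
```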
To run these examples, you first need to install NumPy: pip install numpy
Code Examples
1. Creating NumPy Arrays
You can create arrays from Python lists or use built-in NumPy functions to generate arrays.
import numpy as np
# Create an array from a Python list
my_list = [1, 2, 3, 4, 5]
my_array = np.array(my_list)
print(f"Array from list: {my_array}")
# Create a 2D array (matrix)
my_2d_list = [[1, 2, 3], [4, 5, 6]]
my_2d_array = np.array(my_2d_list)
print("2D Array (Matrix):\n", my_2d_array)
# Create an array of zeros
zeros_array = np.zeros((2, 3)) # A 2x3 array of zeros
print("Zeros Array:\n", zeros_array)
# Create an array with a range of numbers
range_array = np.arange(0, 10, 2) # Start, stop (exclusive), step
print(f"Range Array: {range_array}")
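A few other built-in constructors are worth knowing alongside np.zeros and np.arange. The sketch below is a small sample rather than a complete list, and the variable names are chosen only for this example.

```python
import numpy as np

# A 2x2 array filled with ones
ones_array = np.ones((2, 2))
print("Ones Array:\n", ones_array)

# Five evenly spaced values from 0.0 to 1.0, endpoint included
evenly_spaced = np.linspace(0.0, 1.0, 5)
print(f"Evenly spaced: {evenly_spaced}")

# A 3x3 identity matrix
identity = np.eye(3)
print("Identity:\n", identity)

# Reshape a 1D range into a 2x3 matrix
reshaped = np.arange(6).reshape(2, 3)
print("Reshaped:\n", reshaped)
```

Note that np.linspace takes the number of points, whereas np.arange takes a step size and excludes the stop value.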
2. Array Mathematics (Vectorization)
This is where NumPy's power shines. You can perform mathematical operations on entire arrays without writing loops.
import numpy as np
a = np.array([10, 20, 30, 40])
b = np.array([1, 2, 3, 4])
# Element-wise operations
c = a - b
print(f"Subtraction: {c}") # Output: [ 9 18 27 36]
d = a * b
print(f"Multiplication: {d}") # Output: [ 10  40  90 160]
# Universal functions (ufuncs)
e = np.sin(a) # np.sin interprets the values as radians, not degrees
print(f"Sine of array 'a': {np.round(e, 2)}")
# Scalar operations
f = a + 100
print(f"Adding a scalar: {f}") # Output: [110 120 130 140]
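To make the link between vectorized code and explicit loops concrete, the sketch below reproduces the multiplication with a plain Python loop and checks that both give the same values. It also shows one way to take the sine of angles given in degrees, using np.deg2rad; the `degrees` array is an assumption added for illustration.

```python
import numpy as np

a = np.array([10, 20, 30, 40])
b = np.array([1, 2, 3, 4])

# The explicit loop that a * b replaces
loop_result = [int(x) * int(y) for x, y in zip(a, b)]
vectorized_result = a * b
print(loop_result == vectorized_result.tolist())  # True

# np.sin works in radians, so convert degrees first
degrees = np.array([0, 30, 90])
sines = np.sin(np.deg2rad(degrees))
print(f"Sine of 0, 30, 90 degrees: {np.round(sines, 2)}")
```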
3. Indexing and Slicing
Accessing elements in NumPy arrays is similar to Python lists but with more advanced capabilities, especially for multi-dimensional arrays.
import numpy as np
arr = np.arange(10) # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
# Get a single element
print(f"Element at index 3: {arr[3]}") # Output: 3
# Get a slice of the array (the stop index, 5, is not included)
print(f"Slice from index 2 to 5: {arr[2:5]}") # Output: [2 3 4]
# Boolean indexing (very powerful)
# Get all elements greater than 5
bool_arr = arr > 5
print(f"Boolean mask (> 5): {bool_arr}")
print(f"Elements greater than 5: {arr[bool_arr]}") # Output: [6 7 8 9]
# Indexing in a 2D array
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# Get element at row 1, column 2
print(f"Element at (1, 2): {arr2d[1, 2]}") # Output: 6
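Slicing and boolean indexing extend to two dimensions as well. The sketch below reuses the same 3x3 array; the names `first_row`, `last_column`, `top_left`, and `clipped` are chosen only for this example.

```python
import numpy as np

arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# A single index selects a whole row
first_row = arr2d[0]
print(f"First row: {first_row}")  # [1 2 3]

# A colon selects every row; -1 selects the last column
last_column = arr2d[:, -1]
print(f"Last column: {last_column}")  # [3 6 9]

# Slices on both axes select a sub-matrix
top_left = arr2d[:2, :2]
print("Top-left 2x2 block:\n", top_left)

# A boolean mask can also be used for assignment.
# Work on a copy so the original array is left unchanged.
clipped = arr2d.copy()
clipped[clipped > 5] = 5
print("Values capped at 5:\n", clipped)
```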
4. Aggregation Functions
NumPy provides fast functions to compute aggregations like sum, mean, min, max, etc., across an entire array or along a specific axis.
import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# Get the sum of all elements
print(f"Sum of all elements: {arr.sum()}") # Output: 45
# Get the mean of all elements
print(f"Mean of all elements: {arr.mean()}") # Output: 5.0
# Get the max value
print(f"Max value: {arr.max()}") # Output: 9
# Get the sum of each column (axis=0)
print(f"Sum of each column: {arr.sum(axis=0)}") # Output: [12 15 18]
# Get the sum of each row (axis=1)
print(f"Sum of each row: {arr.sum(axis=1)}") # Output: [ 6 15 24]
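The same axis argument works for the other aggregations. As a brief sketch on the same array, with variable names chosen for this example:

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Minimum of each column (axis=0)
col_min = arr.min(axis=0)
print(f"Min of each column: {col_min}")  # [1 2 3]

# Mean of each row (axis=1)
row_mean = arr.mean(axis=1)
print(f"Mean of each row: {row_mean}")  # [2. 5. 8.]

# Position of the largest value in the flattened array
flat_argmax = arr.argmax()
print(f"Flat index of max: {flat_argmax}")  # 8

# Standard deviation of all elements
std_all = arr.std()
print(f"Standard deviation: {std_all:.3f}")
```

A helpful way to remember the axis argument is that it names the dimension that gets collapsed: axis=0 collapses the rows and leaves one value per column, while axis=1 collapses the columns and leaves one value per row.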