
Mahotas


Unit:1 Foundations of Python and Its Applications in Machine Learning

Mahotas is a powerful Python library for image processing and computer vision, with a special emphasis on bioimage analysis. It's designed to be fast and efficient, with many of its algorithms implemented in C++ for high performance, while still providing a simple and clean Python interface.

While libraries like Pillow are great for general image manipulation (like cropping and resizing), Mahotas provides more advanced and scientifically-oriented functions. It's an excellent tool for researchers and developers working in fields like microscopy, biology, and materials science, where extracting quantitative features from images is crucial.

Key Features

Performance: Built for speed, making it suitable for processing large sets of images.
Bioimage Focus: Includes specialized functions for tasks common in biology, such as counting cells or analyzing textures.
NumPy Integration: Works directly on NumPy arrays, the standard container for numerical data in Python.
Simplicity: Despite its power, it maintains a clean and easy-to-use API.

To run these examples, install Mahotas, NumPy, and Matplotlib:

pip install mahotas numpy matplotlib

Code Examples

1. Loading Images and Basic Operations

Mahotas can load images directly into NumPy arrays. This example loads an image, converts it to grayscale, and applies a Gaussian filter to smooth it.

import mahotas
import mahotas.demos  # make sure the demos submodule is loaded
import numpy as np
import matplotlib.pyplot as plt

# Load a sample image bundled with the library: a photograph of
# Luis Pedro Coelho, the library's author. It is returned as a NumPy array.
image = mahotas.demos.load('luispedro')

# Convert the color image to grayscale by averaging the color channels.
# Grayscale simplifies many image analysis tasks.
image_gray = image.mean(axis=2).astype(np.uint8)

# Apply a Gaussian filter to reduce noise.
# The 'sigma' value controls the amount of blurring.
image_filtered = mahotas.gaussian_filter(image_gray, sigma=4)

# Display the original and filtered images side by side
fig, axes = plt.subplots(1, 2, figsize=(10, 5))

axes[0].imshow(image_gray, cmap='gray')
axes[0].set_title('Original Grayscale Image')
axes[0].axis('off')

axes[1].imshow(image_filtered, cmap='gray')
axes[1].set_title('Gaussian Filtered Image')
axes[1].axis('off')

plt.show()
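To make the role of sigma more concrete, here is a minimal NumPy-only sketch of a separable Gaussian blur. It is an illustration of the general technique, not Mahotas's implementation: the helper names gaussian_kernel_1d and blur_rows_then_columns, the truncation at 4 sigma, and the reflected borders are all assumptions chosen for this example.

```python
import numpy as np

def gaussian_kernel_1d(sigma, truncate=4.0):
    """Return a normalized 1D Gaussian kernel (hypothetical helper)."""
    radius = int(truncate * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()  # weights sum to 1, so brightness is kept

def blur_rows_then_columns(image, sigma):
    """Separable Gaussian blur: filter every row, then every column."""
    kernel = gaussian_kernel_1d(sigma)
    radius = len(kernel) // 2
    # Reflect the borders so the output has the same shape as the input.
    padded = np.pad(image.astype(float), radius, mode="reflect")
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")

# A single bright pixel shows how the filter spreads intensity.
impulse = np.zeros((21, 21))
impulse[10, 10] = 1.0

for sigma in (1.0, 2.0):
    blurred = blur_rows_then_columns(impulse, sigma)
    print(f"sigma={sigma}: peak value {blurred.max():.4f}")
```

A larger sigma spreads the same total intensity over a wider neighborhood, so the peak value drops as sigma grows. This is why a high sigma removes more noise but also more fine detail.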

2. Thresholding: Separating Objects from Background

Thresholding is a fundamental technique in image segmentation used to isolate objects of interest. Mahotas provides several automatic thresholding methods.

import mahotas
import mahotas.demos  # make sure the demos submodule is loaded
import numpy as np
import matplotlib.pyplot as plt

# Load a sample image of cell nuclei
image = mahotas.demos.load('nuclear')

# The demo image may be loaded with color channels; keep a single channel.
if image.ndim == 3:
    image = image[:, :, 0]

# Apply a Gaussian filter to smooth the image before thresholding.
# The filter returns floating-point values, so convert back to uint8,
# because Otsu's method expects an integer image.
image_filtered = mahotas.gaussian_filter(image, sigma=3).astype(np.uint8)

# Calculate a threshold value automatically using Otsu's method.
# This method finds an optimal threshold to separate the two classes
# of pixels (foreground and background).
threshold_value = mahotas.thresholding.otsu(image_filtered)
print(f"Otsu Threshold Value: {threshold_value}")

# Create a binary image by applying the threshold.
# Pixels above the threshold value become 'True' (white),
# and all other pixels become 'False' (black).
binary_image = (image_filtered > threshold_value)

# Display the results
fig, axes = plt.subplots(1, 2, figsize=(10, 5))

axes[0].imshow(image, cmap='gray')
axes[0].set_title('Original Image')
axes[0].axis('off')

axes[1].imshow(binary_image, cmap='gray')
axes[1].set_title('Thresholded (Binary) Image')
axes[1].axis('off')

plt.show()
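For readers curious about what "optimal" means in Otsu's method, the following is a minimal NumPy sketch of the textbook formulation: try every candidate threshold and keep the one that maximizes the between-class variance. The function otsu_threshold and the synthetic test image are assumptions made for this illustration; Mahotas's own routine may differ in details such as tie-breaking or whether the threshold itself counts as foreground.

```python
import numpy as np

def otsu_threshold(image):
    """Textbook Otsu: return the t that maximizes between-class variance.

    Pixels with value < t form one class, pixels >= t the other.
    Expects an unsigned 8-bit image (values 0..255).
    """
    hist = np.bincount(image.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    levels = np.arange(256)

    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0 = prob[:t].sum()          # fraction of pixels below t
        w1 = 1.0 - w0                # fraction of pixels at or above t
        if w0 == 0 or w1 == 0:
            continue                 # one class is empty; skip
        mu0 = (levels[:t] * prob[:t]).sum() / w0
        mu1 = (levels[t:] * prob[t:]).sum() / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t

# Synthetic image: dark background (40..60) with a bright square (190..210).
rng = np.random.default_rng(0)
synthetic = rng.integers(40, 61, size=(20, 20)).astype(np.uint8)
truth = np.zeros((20, 20), dtype=bool)
truth[5:12, 5:12] = True
synthetic[truth] = rng.integers(190, 211, size=truth.sum()).astype(np.uint8)

t = otsu_threshold(synthetic)
mask = synthetic >= t
print("threshold:", t, "| foreground pixels:", int(mask.sum()))
```

Because the two intensity groups in the synthetic image do not overlap, the chosen threshold falls in the gap between them and the mask recovers the bright square exactly. Real micrographs are noisier, which is why the example above smooths the image first.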

3. Feature Extraction: Describing Image Textures

A key application of Mahotas is extracting quantitative features from images. Haralick texture features are widely used to describe the texture of an image region, and the resulting feature vectors can be used to train machine learning classifiers.

import mahotas
import mahotas.demos     # make sure the demos submodule is loaded
import mahotas.features  # make sure the features submodule is loaded
import numpy as np

# Load a sample image
image = mahotas.demos.load('luispedro')

# Convert to grayscale
image_gray = image.mean(axis=2).astype(np.uint8)

# Calculate Haralick texture features for the entire image.
# This computes features like contrast, correlation, and energy,
# which describe the texture patterns in the image.
# For a 2D image the result is a 2D array: one row per direction
# (4 directions), each holding 13 feature values.
haralick_features = mahotas.features.haralick(image_gray)

# Average over the directions to get one 13-value descriptor for the image.
mean_haralick_features = haralick_features.mean(axis=0)

print("--- Haralick Texture Features (Mean) ---")

feature_names = [
    'Angular Second Moment', 'Contrast', 'Correlation', 'Variance',
    'Inverse Difference Moment', 'Sum Average', 'Sum Variance', 'Sum Entropy',
    'Entropy', 'Difference Variance', 'Difference Entropy',
    'Info Correlation 1', 'Info Correlation 2'
]

for name, value in zip(feature_names, mean_haralick_features):
    print(f"{name}: {value:.4f}")
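Haralick features are derived from a gray-level co-occurrence matrix (GLCM), which counts how often pairs of gray levels appear next to each other. The sketch below builds a GLCM for horizontal neighbors only and computes the first two features from the list above. It is a simplified illustration: the helpers glcm_horizontal and asm_and_contrast are hypothetical, and Mahotas's own computation covers several directions and may normalize differently.

```python
import numpy as np

def glcm_horizontal(image, levels):
    """Symmetric, normalized co-occurrence matrix for horizontal neighbors."""
    glcm = np.zeros((levels, levels), dtype=float)
    left = image[:, :-1].ravel()    # each pixel ...
    right = image[:, 1:].ravel()    # ... and its right-hand neighbor
    np.add.at(glcm, (left, right), 1)
    glcm = glcm + glcm.T            # count each pair in both directions
    return glcm / glcm.sum()        # turn counts into probabilities

def asm_and_contrast(p):
    """Angular second moment and contrast of a normalized GLCM."""
    i, j = np.indices(p.shape)
    asm = float((p ** 2).sum())                 # high for uniform textures
    contrast = float(((i - j) ** 2 * p).sum())  # high for abrupt changes
    return asm, contrast

flat = np.zeros((4, 4), dtype=int)     # perfectly uniform patch
stripes = np.tile([0, 1], (4, 2))      # columns alternate between 0 and 1

print("flat   :", asm_and_contrast(glcm_horizontal(flat, levels=2)))
print("stripes:", asm_and_contrast(glcm_horizontal(stripes, levels=2)))
```

The uniform patch gives the maximum angular second moment (1.0) and zero contrast, while the striped patch gives a lower angular second moment (0.5) and a contrast of 1.0, since every horizontal neighbor pair differs by one gray level.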

4. Watershed: Separating Touching Objects

The watershed algorithm is a powerful technique for separating objects that are touching each other, a common problem in cell biology.

import mahotas
import mahotas.demos  # make sure the demos submodule is loaded
import numpy as np
import matplotlib.pyplot as plt

# Load the nuclear image again and keep a single channel if it has several
image = mahotas.demos.load('nuclear')
if image.ndim == 3:
    image = image[:, :, 0]

# Smooth, convert back to uint8 (Otsu's method expects integers), threshold
image_filtered = mahotas.gaussian_filter(image, sigma=3.5).astype(np.uint8)
threshold_value = mahotas.thresholding.otsu(image_filtered)
binary_image = (image_filtered > threshold_value)

# Here the watershed is run on a "distance transform".
# This calculates how far each foreground pixel is from the background.
# The peaks in this transform correspond to the centers of objects.
distance = mahotas.distance(binary_image)

# Find the local maxima (peaks), which will be the "seeds" for the watershed.
local_maxima = mahotas.regmax(distance)
seeds, num_seeds = mahotas.label(local_maxima)
print(f"Number of seeds (candidate objects) found: {num_seeds}")

# Apply the watershed algorithm.
# The distance map is inverted so that object centers become basins;
# the algorithm then "floods" from the seeds until the regions meet.
labeled_objects = mahotas.cwatershed(distance.max() - distance, seeds)

# Display the result
plt.figure(figsize=(7, 7))
plt.imshow(labeled_objects, cmap=plt.cm.jet)
plt.title('Watershed Segmentation Result')
plt.axis('off')
plt.show()
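The idea that peaks of the distance transform mark object centers can be checked on a tiny example. The following brute-force NumPy sketch computes, for every foreground pixel, the squared Euclidean distance to the nearest background pixel. The function squared_distance_to_background is a hypothetical helper for illustration only; the squared-distance convention is assumed here to match what mahotas.distance returns by default, and a real implementation would use a far more efficient algorithm.

```python
import numpy as np

def squared_distance_to_background(binary):
    """Brute-force squared Euclidean distance to the nearest False pixel."""
    out = np.zeros(binary.shape, dtype=float)
    fg = np.argwhere(binary)     # coordinates of foreground pixels
    bg = np.argwhere(~binary)    # coordinates of background pixels
    if len(fg) == 0 or len(bg) == 0:
        return out
    # Pairwise coordinate differences, shape (n_fg, n_bg, 2)
    diff = fg[:, None, :] - bg[None, :, :]
    d2 = (diff ** 2).sum(axis=2)
    out[tuple(fg.T)] = d2.min(axis=1)
    return out

# Two 5x5 squares that touch through a one-pixel bridge.
blobs = np.zeros((7, 13), dtype=bool)
blobs[1:6, 1:6] = True
blobs[1:6, 7:12] = True
blobs[3, 6] = True

dist = squared_distance_to_background(blobs)
peaks = np.argwhere(dist == dist.max())
print("peak value:", dist.max())
print("peak positions:", peaks.tolist())
```

Although the two squares form a single connected region in the binary image, the distance map has two separate peaks, one at the center of each square, and only a low value on the bridge. Seeds placed at those peaks are what allow the watershed to split touching objects.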
