
Batch Image Processor

Role: Software Engineer


Overview

In this coding exercise, you will build a batch image processing system. You are given a set of images and transformation specifications in JSON format. Your task is to apply these transformations to each image and save the results to an output directory.

This is a practical coding challenge that tests your ability to:

  • Quickly research and use unfamiliar libraries - You are expected to search documentation during the interview

  • Parse and apply configuration from JSON files

  • Handle file I/O operations efficiently

  • Optimize for performance with parallel processing

Interview Format Notes

  • Documentation search is allowed and expected - The interviewer wants to see how you research and learn new APIs quickly

  • You may use any resources except AI-generated answers

  • Common library choices: Pillow (PIL), scikit-image, or OpenCV - consider familiarizing yourself with one beforehand

  • The interview involves testing on small images first, then optimizing for large images within a time target


Problem Setup

You are provided with four directories:

text
project/
├── small_images/      # Small test images for development
│   ├── image1.png
│   ├── image2.jpg
│   └── ...
├── large_images/      # Large images for performance testing
│   ├── photo1.png
│   ├── photo2.jpg
│   └── ...
├── transformations/   # JSON files defining transformations
│   ├── transform1.json
│   ├── transform2.json
│   └── ...
└── output/            # Directory to save processed images

Helper utilities are provided to:

  • List all files in each directory

  • Generate output file paths based on input image and transformation file


Transformation Specifications

Each JSON file in the transformations/ directory contains a list of transformations to apply sequentially. There are six types of transformations:

Transformations Without Parameters

| Type | Description |
| --- | --- |
| grayscale | Convert image to grayscale |
| flip_horizontal | Flip image horizontally (mirror) |
| flip_vertical | Flip image vertically |

Transformations With Parameters

| Type | Parameter | Description |
| --- | --- | --- |
| scale | factor (float) | Scale image by the given factor (e.g., 0.5 = half size, 2.0 = double size) |
| blur | radius (int) | Apply Gaussian blur with the specified radius |
| rotate | angle (float) | Rotate image by the specified angle in degrees |

Example Transformation JSON

json
{
  "transformations": [
    { "type": "grayscale" },
    { "type": "scale", "factor": 0.5 },
    { "type": "rotate", "angle": 90 }
  ]
}

This configuration would:

  • Convert the image to grayscale

  • Scale it to 50% of its original size

  • Rotate it 90 degrees counter-clockwise
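Before touching any image, it can help to check a configuration against the two tables above. The following is a minimal sketch using only the standard library; the names `REQUIRED_PARAMS` and `parse_transformations` are illustrative assumptions, not part of the provided helpers.

```python
import json

# Required parameter name and accepted Python types for each transformation,
# taken from the two tables above. None means "no parameter".
REQUIRED_PARAMS = {
    "grayscale": None,
    "flip_horizontal": None,
    "flip_vertical": None,
    "scale": ("factor", (int, float)),
    "blur": ("radius", (int,)),
    "rotate": ("angle", (int, float)),
}


def parse_transformations(json_text: str) -> list[dict]:
    """Parse a transformation config and check every step against the spec."""
    steps = json.loads(json_text).get("transformations", [])
    for index, step in enumerate(steps):
        kind = step.get("type")
        if kind not in REQUIRED_PARAMS:
            raise ValueError(f"step {index}: unknown type {kind!r}")
        required = REQUIRED_PARAMS[kind]
        if required is not None:
            name, types = required
            value = step.get(name)
            # bool is a subclass of int, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, types):
                raise ValueError(f"step {index}: {kind!r} needs a numeric {name!r}")
    return steps
```

Failing fast on a bad configuration avoids discovering the problem only after a long batch run has started.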


Requirements

Part 1: Basic Implementation

  • Choose an image processing library - Research and select a Python library capable of performing all six transformation types. Common choices include:

  • Pillow (PIL)

  • scikit-image

  • OpenCV

  • Implement transformation functions - Create functions for each of the six transformation types

  • Process images with transformations:

  • For each transformation JSON file

  • For each image in the source directory

  • Apply all transformations in the JSON file sequentially to the image

  • Save the result to the output directory using the provided path utility

  • Test with small images - Verify correctness using the small_images/ directory before moving to large images

Part 2: Performance Optimization

After verifying correctness with small images, process the large_images/ directory. You must complete processing within a target time limit (provided during the interview).

Key considerations:

  • Image processing is CPU-intensive

  • Each image can be processed independently

  • Consider parallelization strategies


Interface

python
from typing import Callable

def process_images(
    image_dir: str,
    transformation_dir: str,
    output_dir: str,
    get_output_path: Callable[[str, str], str]
) -> None:
    """
    Process all images with all transformation configurations.

    Args:
        image_dir: Path to directory containing source images
        transformation_dir: Path to directory containing transformation JSON files
        output_dir: Path to directory for saving processed images
        get_output_path: Utility function that generates output path
                        given (image_path, transform_json_path)
    """
    pass
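The real `get_output_path` utility is supplied during the interview. For local practice, a stand-in with the same call shape might look like the sketch below; the factory name `make_output_path_fn` and the `<image>__<transform>` naming scheme are assumptions made for illustration only.

```python
from pathlib import Path


def make_output_path_fn(output_dir: str):
    """Return a function with the (image_path, transform_json_path) -> str shape."""
    def get_output_path(image_path: str, transform_json_path: str) -> str:
        image = Path(image_path)
        transform = Path(transform_json_path)
        # Combining both stems keeps every (image, transform) pair unique,
        # e.g. output/image1__transform1.png
        name = f"{image.stem}__{transform.stem}{image.suffix}"
        return str(Path(output_dir) / name)
    return get_output_path
```

Keeping the source extension means the output format follows the input format; the helper provided in the interview may choose differently.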

Sample Solution

Disclaimer: This is a sample solution for reference. During the interview, you should develop your own approach and demonstrate your problem-solving process.

Choosing a Library

For this problem, Pillow (PIL) is a good choice because:

  • Simple API for common image operations

  • Built-in support for all required transformations

  • Well-documented and widely used

scikit-image is also a good option, especially if you're familiar with NumPy-based image processing.

Quick Pillow API Reference

When searching the documentation, look for these functions and methods:

| Transformation | Pillow Module/Method |
| --- | --- |
| Grayscale | PIL.ImageOps.grayscale() |
| Flip horizontal | PIL.ImageOps.mirror() |
| Flip vertical | PIL.ImageOps.flip() |
| Scale/Resize | Image.resize(size, resample) |
| Blur | PIL.ImageFilter.GaussianBlur(radius) |
| Rotate | Image.rotate(angle, expand=True) |

Basic Implementation with Pillow

python
from PIL import Image, ImageFilter, ImageOps
import json
from pathlib import Path

def load_transformations(json_path: str) -> list:
    """Load transformation specifications from a JSON file."""
    with open(json_path, 'r') as f:
        data = json.load(f)
    return data.get('transformations', [])

def apply_transformation(image: Image.Image, transform: dict) -> Image.Image:
    """Apply a single transformation to an image."""
    transform_type = transform['type']

    if transform_type == 'grayscale':
        # Convert to grayscale, then back to RGB to maintain consistent color mode
        # for subsequent transformations (some operations require RGB)
        return ImageOps.grayscale(image).convert('RGB')

    elif transform_type == 'flip_horizontal':
        return ImageOps.mirror(image)

    elif transform_type == 'flip_vertical':
        return ImageOps.flip(image)

    elif transform_type == 'scale':
        factor = transform['factor']
        # Clamp to at least 1 pixel; resize() rejects zero-sized dimensions
        new_size = (max(1, int(image.width * factor)),
                    max(1, int(image.height * factor)))
        return image.resize(new_size, Image.Resampling.LANCZOS)

    elif transform_type == 'blur':
        radius = transform['radius']
        return image.filter(ImageFilter.GaussianBlur(radius=radius))

    elif transform_type == 'rotate':
        angle = transform['angle']
        return image.rotate(angle, expand=True)

    else:
        raise ValueError(f"Unknown transformation type: {transform_type}")

def apply_all_transformations(image: Image.Image, transformations: list) -> Image.Image:
    """Apply a sequence of transformations to an image."""
    result = image.copy()
    for transform in transformations:
        result = apply_transformation(result, transform)
    return result

def process_images(
    image_dir: str,
    transformation_dir: str,
    output_dir: str,
    get_output_path
) -> None:
    """Process all images with all transformation configurations."""

    # Get all image and transformation files
    # Note: In the interview, file paths are typically provided by helper utilities
    image_files = [f for f in Path(image_dir).iterdir()
                   if f.suffix.lower() in ('.png', '.jpg', '.jpeg')]
    transform_files = list(Path(transformation_dir).glob('*.json'))

    for transform_file in transform_files:
        transformations = load_transformations(str(transform_file))

        for image_file in image_files:
            output_path = get_output_path(str(image_file), str(transform_file))
            # Create the target directory if the path helper has not done so
            Path(output_path).parent.mkdir(parents=True, exist_ok=True)

            # The context manager closes the source file even if a step raises
            with Image.open(str(image_file)) as image:
                result = apply_all_transformations(image, transformations)
                result.save(output_path)
                result.close()

Optimized Implementation with Parallel Processing

For large images, use ProcessPoolExecutor to parallelize the work. Key insight: image processing is CPU-bound, so running each job in its own process lets every core do useful work instead of having threads contend for Python's GIL.

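Important: On platforms that start worker processes with the spawn method (the default on Windows and macOS), the code that creates the executor must sit under an if __name__ == '__main__': guard. Without it, each worker re-runs the module's top-level code on import and tries to start a pool of its own.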

python
from concurrent.futures import ProcessPoolExecutor
from PIL import Image, ImageFilter, ImageOps
import json
from pathlib import Path
import os

# Note: apply_transformation must be defined at module level for pickling
def apply_transformation(image: Image.Image, transform: dict) -> Image.Image:
    """Apply a single transformation to an image."""
    transform_type = transform['type']

    if transform_type == 'grayscale':
        return ImageOps.grayscale(image).convert('RGB')
    elif transform_type == 'flip_horizontal':
        return ImageOps.mirror(image)
    elif transform_type == 'flip_vertical':
        return ImageOps.flip(image)
    elif transform_type == 'scale':
        factor = transform['factor']
        # Clamp to at least 1 pixel; resize() rejects zero-sized dimensions
        new_size = (max(1, int(image.width * factor)),
                    max(1, int(image.height * factor)))
        return image.resize(new_size, Image.Resampling.LANCZOS)
    elif transform_type == 'blur':
        radius = transform['radius']
        return image.filter(ImageFilter.GaussianBlur(radius=radius))
    elif transform_type == 'rotate':
        angle = transform['angle']
        return image.rotate(angle, expand=True)
    else:
        raise ValueError(f"Unknown transformation type: {transform_type}")

def process_single_image(args: tuple) -> str:
    """Process a single image with a transformation configuration.

    This function runs in a separate process.
    """
    image_path, transform_path, output_path = args

    # Load transformation config
    with open(transform_path, 'r') as f:
        data = json.load(f)
    transformations = data.get('transformations', [])

    # Load and process the image; the context manager closes the file on exit
    with Image.open(image_path) as image:
        result = image
        for transform in transformations:
            result = apply_transformation(result, transform)

        # Ensure the output directory exists ('' means the current directory)
        os.makedirs(os.path.dirname(output_path) or '.', exist_ok=True)
        result.save(output_path)

    return output_path

def process_images_parallel(
    image_dir: str,
    transformation_dir: str,
    output_dir: str,
    get_output_path,
    max_workers: int = None
) -> None:
    """Process all images in parallel using multiple processes."""

    # Collect all work items
    image_files = [f for f in Path(image_dir).iterdir()
                   if f.suffix.lower() in ('.png', '.jpg', '.jpeg')]
    transform_files = list(Path(transformation_dir).glob('*.json'))

    work_items = []
    for transform_file in transform_files:
        for image_file in image_files:
            output_path = get_output_path(str(image_file), str(transform_file))
            work_items.append((str(image_file), str(transform_file), output_path))

    # Process in parallel using multiple CPU cores
    with ProcessPoolExecutor(max_workers=max_workers) as executor:
        results = list(executor.map(process_single_image, work_items))

    print(f"Processed {len(results)} images")
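The `__main__` guard mentioned in the note before this implementation can be seen in isolation in the sketch below. It uses only the standard library; `square` is a hypothetical stand-in for `process_single_image`, and `run_parallel` for `process_images_parallel`.

```python
from concurrent.futures import ProcessPoolExecutor


def square(n: int) -> int:
    """Stand-in for process_single_image; defined at module level so it can be pickled."""
    return n * n


def run_parallel(values: list[int]) -> list[int]:
    """Fan the values out to worker processes and collect results in input order."""
    with ProcessPoolExecutor(max_workers=2) as executor:
        return list(executor.map(square, values))


if __name__ == "__main__":
    # Only the parent process enters this branch. Workers that re-import the
    # module skip it, so they never try to create pools of their own.
    print(run_parallel([1, 2, 3]))  # prints [1, 4, 9]
```

In the real script, the call to `process_images_parallel(...)` would take the place of `run_parallel(...)` inside the guarded branch.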

Follow-up Discussion

The most likely follow-up question is about parallelization strategy. Be prepared to discuss the trade-offs between threading and multiprocessing in Python.

Threading vs Multiprocessing (Primary Follow-up)

Question: Why did you choose ProcessPoolExecutor instead of ThreadPoolExecutor? When would threading be preferable?

Key Points to Discuss:

| Aspect | ProcessPoolExecutor | ThreadPoolExecutor |
| --- | --- | --- |
| Best for | CPU-bound tasks | I/O-bound tasks |
| GIL impact | Bypasses GIL (separate interpreters) | Limited by GIL |
| Memory | Separate memory per process | Shared memory |
| Overhead | Higher (process creation) | Lower (thread creation) |
| Data sharing | Requires serialization (pickling) | Direct access |

Why ProcessPoolExecutor for this problem:

  • Image transformations (blur, rotate, scale) are CPU-intensive numerical operations

  • Python's Global Interpreter Lock (GIL) prevents threads from running Python bytecode in parallel

  • Each process has its own Python interpreter, allowing true parallel execution across CPU cores

  • Images can be processed independently - no need for shared memory

When ThreadPoolExecutor would be better:

  • I/O-bound workloads: Network requests, database queries, file downloads

  • When tasks spend most time waiting (GIL is released during I/O waits)

  • When you need shared state between workers

  • When process creation overhead exceeds the computation time

The GIL Explained:

text
Thread 1: [Python code] [wait] [Python code] [wait]
Thread 2:      [wait]  [Python code]  [wait] [Python code]
                    ↑ Only one thread executes Python at a time

Process 1: [Python code] [Python code] [Python code]
Process 2: [Python code] [Python code] [Python code]
                    ↑ True parallelism with separate interpreters
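The diagram can be checked empirically with a small timing comparison. This is a rough sketch, assuming a standard GIL-enabled CPython build and a machine with several cores; `busy_sum` and `time_pool` are illustrative names, and the job sizes are arbitrary.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def busy_sum(n: int) -> int:
    """Pure-Python CPU-bound loop; it needs the GIL for every bytecode it runs."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def time_pool(pool_cls, jobs: list[int], workers: int = 4) -> float:
    """Return wall-clock seconds to run busy_sum over jobs with the given pool type."""
    start = time.perf_counter()
    with pool_cls(max_workers=workers) as pool:
        list(pool.map(busy_sum, jobs))
    return time.perf_counter() - start


if __name__ == "__main__":
    jobs = [1_000_000] * 8
    print(f"threads:   {time_pool(ThreadPoolExecutor, jobs):.2f}s")
    print(f"processes: {time_pool(ProcessPoolExecutor, jobs):.2f}s")
```

The exact numbers depend on the machine, so none are shown here. Libraries that do their heavy work in native code may behave differently from this pure-Python loop, which is one reason to measure the real workload as well.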

Nuanced Answer (shows depth of understanding):

  • This problem has both I/O (reading/writing images) and CPU (transformations) components

  • If images are on slow storage (network drive, HDD), I/O might dominate → threading could work

  • If images are on fast storage (SSD, RAM disk), CPU dominates → multiprocessing is better

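  • Profiling would reveal the actual bottleneck, including whether the library releases the GIL inside its native code, in which case threads may scale better than expected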

  • In practice, for image processing, CPU usually dominates → multiprocessing is the safer choice
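One lightweight way to find the bottleneck is to time the load, transform, and save phases separately. The sketch below uses only the standard library; `timed`, `phase_seconds`, and `report` are illustrative names, and the sleeps and the arithmetic loop are stand-ins for real image work.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated wall-clock seconds per phase name.
phase_seconds: dict[str, float] = defaultdict(float)


@contextmanager
def timed(phase: str):
    """Add the time spent inside the with-block to the total for `phase`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        phase_seconds[phase] += time.perf_counter() - start


def report() -> str:
    """Summarize each phase as a share of the total measured time."""
    total = sum(phase_seconds.values()) or 1.0
    return ", ".join(f"{name}: {secs / total:.0%}" for name, secs in phase_seconds.items())


# Stand-ins for the real load / transform / save steps.
with timed("load"):
    time.sleep(0.01)
with timed("transform"):
    sum(i * i for i in range(200_000))
with timed("save"):
    time.sleep(0.01)

print(report())
```

If load and save dominate, threads are worth testing; if transform dominates, processes are the more likely win.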

Other Potential Follow-ups

Memory Management: How to handle images too large for memory?

  • Process in tiles/chunks, use memory-mapped files, limit concurrent workers
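Limiting concurrent workers can be reduced to simple arithmetic: estimate the decoded size of one image and divide the memory budget by it. The sketch below is a rough heuristic, not a measurement; the function names and the `copies_per_worker` factor of 3 are assumptions.

```python
import os


def estimate_image_bytes(width: int, height: int, channels: int = 3) -> int:
    """Rough size of one decoded 8-bit-per-channel image held in memory."""
    return width * height * channels


def pick_worker_count(width: int, height: int, memory_budget_bytes: int,
                      copies_per_worker: int = 3) -> int:
    """Cap the worker count so that decoded images fit inside the memory budget.

    copies_per_worker is a guess: a worker may hold the source image plus
    intermediate results while a transformation runs.
    """
    per_worker = estimate_image_bytes(width, height) * copies_per_worker
    by_memory = max(1, memory_budget_bytes // per_worker)
    by_cpu = os.cpu_count() or 1
    return min(by_cpu, by_memory)
```

For example, an 8000x6000 RGB image decodes to about 144 MB, so a 1 GB budget with three copies per worker allows at most two workers. The result can be passed as `max_workers` to the executor.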

Error Handling: How to make this production-ready?

  • Validate JSON schema, handle corrupt images gracefully, add logging, support resume from failure
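Two of these ideas, graceful failure handling and resume from failure, fit in a small wrapper around the per-image worker. The sketch below uses only the standard library; `run_with_recovery` is a hypothetical name, and it assumes work items shaped like the `(image_path, transform_path, output_path)` tuples in the sample solution.

```python
import logging
from pathlib import Path
from typing import Callable, Iterable

logger = logging.getLogger("batch_images")

WorkItem = tuple[str, str, str]  # (image_path, transform_path, output_path)


def run_with_recovery(
    work_items: Iterable[WorkItem],
    process_one: Callable[[WorkItem], str],
) -> list[tuple[str, str]]:
    """Run process_one on each item and return (output_path, error) for failures."""
    failures = []
    for item in work_items:
        output_path = item[2]
        if Path(output_path).exists():
            # Resume support: an earlier run already produced this file.
            continue
        try:
            process_one(item)
        except Exception as exc:  # e.g. a corrupt image or malformed JSON
            logger.warning("failed %s: %s", output_path, exc)
            failures.append((output_path, str(exc)))
    return failures
```

One caveat: a file left half-written by a crash would be treated as finished. Writing to a temporary name and renaming on success is a common refinement.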

Scaling Further: What if you need to process millions of images?

  • Distribute across machines using message queues; consider GPU acceleration (for example, a CUDA-enabled OpenCV build or CuPy)
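A message queue is the more flexible design, but a simpler static alternative is to shard jobs deterministically so that each machine can decide on its own which jobs to take. The sketch below shows that alternative, not a queue; `shard_for` is a hypothetical helper and the hashing scheme is an assumption.

```python
import hashlib


def shard_for(image_path: str, transform_path: str, num_shards: int) -> int:
    """Deterministically map one (image, transform) job to a shard index."""
    key = f"{image_path}|{transform_path}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes as an integer; sha256 is stable across machines,
    # unlike Python's built-in hash(), which is salted per process for strings.
    return int.from_bytes(digest[:8], "big") % num_shards
```

Machine `k` of `n` would then process only the jobs where `shard_for(image, transform, n) == k`. Unlike a queue, this does not rebalance work when one machine is slower than the others.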