Batch Image Processor
Role: Software Engineer
Overview
In this coding exercise, you will build a batch image processing system. You are given a set of images and transformation specifications in JSON format. Your task is to apply these transformations to each image and save the results to an output directory.
This is a practical coding challenge that tests your ability to:
- Quickly research and use unfamiliar libraries - You are expected to search documentation during the interview
- Parse and apply configuration from JSON files
- Handle file I/O operations efficiently
- Optimize for performance with parallel processing
Interview Format Notes
- Documentation search is allowed and expected - The interviewer wants to see how you research and learn new APIs quickly
- You may use any resources except AI-generated answers
- Common library choices: Pillow (PIL) or scikit-image - consider familiarizing yourself with one beforehand
- The interview involves testing on small images first, then optimizing for large images within a time target
Problem Setup
You are provided with four directories:
project/
├── small_images/ # Small test images for development
│ ├── image1.png
│ ├── image2.jpg
│ └── ...
├── large_images/ # Large images for performance testing
│ ├── photo1.png
│ ├── photo2.jpg
│ └── ...
├── transformations/ # JSON files defining transformations
│ ├── transform1.json
│ ├── transform2.json
│ └── ...
└── output/           # Directory to save processed images

Helper utilities are provided to:
- List all files in each directory
- Generate output file paths based on input image and transformation file
Transformation Specifications
Each JSON file in the transformations/ directory contains a list of transformations to apply sequentially. There are six types of transformations:
Transformations Without Parameters
| Type | Description |
|---|---|
| grayscale | Convert image to grayscale |
| flip_horizontal | Flip image horizontally (mirror) |
| flip_vertical | Flip image vertically |
Transformations With Parameters
| Type | Parameter | Description |
|---|---|---|
| scale | factor (float) | Scale image by the given factor (e.g., 0.5 = half size, 2.0 = double size) |
| blur | radius (int) | Apply Gaussian blur with the specified radius |
| rotate | angle (float) | Rotate image by the specified angle in degrees |
Example Transformation JSON
{
  "transformations": [
    { "type": "grayscale" },
    { "type": "scale", "factor": 0.5 },
    { "type": "rotate", "angle": 90 }
  ]
}

This configuration would:
- Convert the image to grayscale
- Scale it to 50% of its original size
- Rotate it 90 degrees counter-clockwise
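The sequential semantics above can be sketched with Pillow; the dispatch code here is only illustrative (the synthetic image and inline config are stand-ins, not part of the required interface):

```python
import json
from PIL import Image, ImageOps

# Parse the example configuration and apply each step in order.
config = json.loads("""
{
  "transformations": [
    { "type": "grayscale" },
    { "type": "scale", "factor": 0.5 },
    { "type": "rotate", "angle": 90 }
  ]
}
""")

img = Image.new('RGB', (100, 80))  # synthetic stand-in for a source image
for t in config["transformations"]:
    if t["type"] == "grayscale":
        img = ImageOps.grayscale(img).convert('RGB')  # back to RGB for later steps
    elif t["type"] == "scale":
        img = img.resize((int(img.width * t["factor"]),
                          int(img.height * t["factor"])))
    elif t["type"] == "rotate":
        img = img.rotate(t["angle"], expand=True)

# 100x80 -> grayscale (same size) -> 50x40 after scaling -> 40x50 after rotating
```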
Requirements
Part 1: Basic Implementation
- Choose an image processing library - Research and select a Python library capable of performing all six transformation types. Common choices include:
  - Pillow (PIL)
  - scikit-image
  - OpenCV
- Implement transformation functions - Create functions for each of the six transformation types
- Process images with transformations:
  - For each transformation JSON file
  - For each image in the source directory
  - Apply all transformations in the JSON file sequentially to the image
  - Save the result to the output directory using the provided path utility
- Test with small images - Verify correctness using the small_images/ directory before moving to large images
Part 2: Performance Optimization
After verifying correctness with small images, process the large_images/ directory. You must complete processing within a target time limit (provided during the interview).
Key considerations:
- Image processing is CPU-intensive
- Each image can be processed independently
- Consider parallelization strategies
Interface
from typing import Callable

def process_images(
    image_dir: str,
    transformation_dir: str,
    output_dir: str,
    get_output_path: Callable[[str, str], str],
) -> None:
    """
    Process all images with all transformation configurations.

    Args:
        image_dir: Path to directory containing source images
        transformation_dir: Path to directory containing transformation JSON files
        output_dir: Path to directory for saving processed images
        get_output_path: Utility function that generates output path
            given (image_path, transform_json_path)
    """
    pass

Sample Solution
Disclaimer: This is a sample solution for reference. During the interview, you should develop your own approach and demonstrate your problem-solving process.
Choosing a Library
For this problem, Pillow (PIL) is a good choice because:
- Simple API for common image operations
- Built-in support for all required transformations
- Well-documented and widely used
scikit-image is also a good option, especially if you're familiar with NumPy-based image processing.
Quick Pillow API Reference
When searching documentation, look for these modules:
| Transformation | Pillow Module/Method |
|---|---|
| Grayscale | PIL.ImageOps.grayscale() |
| Flip horizontal | PIL.ImageOps.mirror() |
| Flip vertical | PIL.ImageOps.flip() |
| Scale/Resize | Image.resize(size, resample) |
| Blur | PIL.ImageFilter.GaussianBlur(radius) |
| Rotate | Image.rotate(angle, expand=True) |
Basic Implementation with Pillow
from PIL import Image, ImageFilter, ImageOps
import json
from pathlib import Path

def load_transformations(json_path: str) -> list:
    """Load transformation specifications from a JSON file."""
    with open(json_path, 'r') as f:
        data = json.load(f)
    return data.get('transformations', [])

def apply_transformation(image: Image.Image, transform: dict) -> Image.Image:
    """Apply a single transformation to an image."""
    transform_type = transform['type']
    if transform_type == 'grayscale':
        # Convert to grayscale, then back to RGB to maintain a consistent
        # color mode for subsequent transformations (some operations require RGB)
        return ImageOps.grayscale(image).convert('RGB')
    elif transform_type == 'flip_horizontal':
        return ImageOps.mirror(image)
    elif transform_type == 'flip_vertical':
        return ImageOps.flip(image)
    elif transform_type == 'scale':
        factor = transform['factor']
        new_size = (int(image.width * factor), int(image.height * factor))
        return image.resize(new_size, Image.Resampling.LANCZOS)
    elif transform_type == 'blur':
        radius = transform['radius']
        return image.filter(ImageFilter.GaussianBlur(radius=radius))
    elif transform_type == 'rotate':
        angle = transform['angle']
        return image.rotate(angle, expand=True)
    else:
        raise ValueError(f"Unknown transformation type: {transform_type}")

def apply_all_transformations(image: Image.Image, transformations: list) -> Image.Image:
    """Apply a sequence of transformations to an image."""
    result = image.copy()
    for transform in transformations:
        result = apply_transformation(result, transform)
    return result

def process_images(
    image_dir: str,
    transformation_dir: str,
    output_dir: str,
    get_output_path
) -> None:
    """Process all images with all transformation configurations."""
    # Get all image and transformation files
    # Note: in the interview, file paths are typically provided by helper utilities
    image_files = [f for f in Path(image_dir).iterdir()
                   if f.suffix.lower() in ('.png', '.jpg', '.jpeg')]
    transform_files = list(Path(transformation_dir).glob('*.json'))

    for transform_file in transform_files:
        transformations = load_transformations(str(transform_file))
        for image_file in image_files:
            # Load, transform, and save
            image = Image.open(str(image_file))
            result = apply_all_transformations(image, transformations)
            output_path = get_output_path(str(image_file), str(transform_file))
            result.save(output_path)
            # Close images to free memory
            image.close()
            result.close()

Optimized Implementation with Parallel Processing
For large images, use ProcessPoolExecutor to parallelize the work. Key insight: image processing is CPU-bound, so multiprocessing (not threading) is needed to bypass Python's GIL.
Important: On Windows, multiprocessing requires the if __name__ == '__main__': guard around the executor code to prevent infinite process spawning.
from concurrent.futures import ProcessPoolExecutor
from PIL import Image, ImageFilter, ImageOps
import json
from pathlib import Path
import os

# Note: apply_transformation must be defined at module level for pickling
def apply_transformation(image: Image.Image, transform: dict) -> Image.Image:
    """Apply a single transformation to an image."""
    transform_type = transform['type']
    if transform_type == 'grayscale':
        return ImageOps.grayscale(image).convert('RGB')
    elif transform_type == 'flip_horizontal':
        return ImageOps.mirror(image)
    elif transform_type == 'flip_vertical':
        return ImageOps.flip(image)
    elif transform_type == 'scale':
        factor = transform['factor']
        new_size = (int(image.width * factor), int(image.height * factor))
        return image.resize(new_size, Image.Resampling.LANCZOS)
    elif transform_type == 'blur':
        radius = transform['radius']
        return image.filter(ImageFilter.GaussianBlur(radius=radius))
    elif transform_type == 'rotate':
        angle = transform['angle']
        return image.rotate(angle, expand=True)
    else:
        raise ValueError(f"Unknown transformation type: {transform_type}")

def process_single_image(args: tuple) -> str:
    """Process a single image with a transformation configuration.

    This function runs in a separate process.
    """
    image_path, transform_path, output_path = args

    # Load transformation config
    with open(transform_path, 'r') as f:
        data = json.load(f)
    transformations = data.get('transformations', [])

    # Load and process image
    image = Image.open(image_path)
    result = image.copy()
    for transform in transformations:
        result = apply_transformation(result, transform)

    # Ensure output directory exists and save
    os.makedirs(os.path.dirname(output_path), exist_ok=True)
    result.save(output_path)
    image.close()
    result.close()
    return output_path

def process_images_parallel(
    image_dir: str,
    transformation_dir: str,
    output_dir: str,
    get_output_path,
    max_workers: int = None
) -> None:
    """Process all images in parallel using multiple processes."""
    # Collect all work items
    image_files = [f for f in Path(image_dir).iterdir()
                   if f.suffix.lower() in ('.png', '.jpg', '.jpeg')]
    transform_files = list(Path(transformation_dir).glob('*.json'))

    work_items = []
    for transform_file in transform_files:
        for image_file in image_files:
            output_path = get_output_path(str(image_file), str(transform_file))
            work_items.append((str(image_file), str(transform_file), output_path))

    # Process in parallel using multiple CPU cores
    with ProcessPoolExecutor(max_workers=max_workers) as executor:
        results = list(executor.map(process_single_image, work_items))
    print(f"Processed {len(results)} images")

Follow-up Discussion
The most likely follow-up question is about parallelization strategy. Be prepared to discuss the trade-offs between threading and multiprocessing in Python.
Threading vs Multiprocessing (Primary Follow-up)
Question: Why did you choose ProcessPoolExecutor instead of ThreadPoolExecutor? When would threading be preferable?
Key Points to Discuss:
| Aspect | ProcessPoolExecutor | ThreadPoolExecutor |
|---|---|---|
| Best for | CPU-bound tasks | I/O-bound tasks |
| GIL impact | Bypasses GIL (separate interpreters) | Limited by GIL |
| Memory | Separate memory per process | Shared memory |
| Overhead | Higher (process creation) | Lower (thread creation) |
| Data sharing | Requires serialization (pickling) | Direct access |
Why ProcessPoolExecutor for this problem:
- Image transformations (blur, rotate, scale) are CPU-intensive numerical operations
- Python's Global Interpreter Lock (GIL) prevents threads from running Python bytecode in parallel
- Each process has its own Python interpreter, allowing true parallel execution across CPU cores
- Images can be processed independently - no need for shared memory
When ThreadPoolExecutor would be better:
- I/O-bound workloads: network requests, database queries, file downloads
- When tasks spend most of their time waiting (the GIL is released during I/O waits)
- When you need shared state between workers
- When process creation overhead exceeds the computation time
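For contrast, here is a sketch of an I/O-bound analog where threads do help: reading many files concurrently. The throwaway temp files are made up for the example; the point is that the GIL is released while each read blocks.

```python
from concurrent.futures import ThreadPoolExecutor
import os
import tempfile

# Set up a handful of throwaway files to stand in for I/O-bound work.
tmpdir = tempfile.mkdtemp()
paths = []
for i in range(8):
    path = os.path.join(tmpdir, f"file{i}.txt")
    with open(path, "w") as f:
        f.write("x" * 1000)
    paths.append(path)

def read_file(path: str) -> int:
    # The GIL is released while the read blocks, so threads overlap the waits.
    with open(path) as f:
        return len(f.read())

with ThreadPoolExecutor(max_workers=4) as executor:
    sizes = list(executor.map(read_file, paths))
```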
The GIL Explained:
Thread 1: [Python code] [wait] [Python code] [wait]
Thread 2: [wait] [Python code] [wait] [Python code]
↑ Only one thread executes Python at a time
Process 1: [Python code] [Python code] [Python code]
Process 2: [Python code] [Python code] [Python code]
↑ True parallelism with separate interpreters

Nuanced Answer (shows depth of understanding):
- This problem has both I/O (reading/writing images) and CPU (transformations) components
- If images are on slow storage (network drive, HDD), I/O might dominate → threading could work
- If images are on fast storage (SSD, RAM disk), CPU dominates → multiprocessing is better
- Profiling would reveal the actual bottleneck
- In practice, for image processing, CPU usually dominates → multiprocessing is the safer choice
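A rough way to find the bottleneck is to time the CPU and I/O phases separately. This sketch uses a synthetic image and an in-memory PNG encode as a stand-in for saving to disk; the sizes and radius are arbitrary.

```python
import io
import time
from PIL import Image, ImageFilter

img = Image.new('RGB', (800, 600))

# CPU phase: a representative transformation.
t0 = time.perf_counter()
result = img.filter(ImageFilter.GaussianBlur(radius=5))
cpu_seconds = time.perf_counter() - t0

# I/O phase: encode to PNG in memory as a proxy for writing the output file.
t0 = time.perf_counter()
buf = io.BytesIO()
result.save(buf, format='PNG')
io_seconds = time.perf_counter() - t0
```

Comparing `cpu_seconds` to `io_seconds` on real images (and real storage) tells you whether threads or processes are the right tool.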
Other Potential Follow-ups
Memory Management: How to handle images too large for memory?
- Process in tiles/chunks, use memory-mapped files, limit concurrent workers
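A minimal tiling sketch with Pillow, assuming a purely local per-pixel operation (a channel inversion here, chosen for testability); blur-like filters would need overlapping tiles at the seams, which this sketch ignores.

```python
from PIL import Image

def process_in_tiles(img: Image.Image, tile_size: int = 64) -> Image.Image:
    """Apply a per-pixel operation one tile at a time to bound memory use."""
    out = Image.new(img.mode, img.size)
    for top in range(0, img.height, tile_size):
        for left in range(0, img.width, tile_size):
            box = (left, top,
                   min(left + tile_size, img.width),
                   min(top + tile_size, img.height))
            tile = img.crop(box)
            tile = tile.point(lambda v: 255 - v)  # invert each channel
            out.paste(tile, box)
    return out

result = process_in_tiles(Image.new('RGB', (150, 100), (10, 20, 30)))
```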
Error Handling: How to make this production-ready?
- Validate JSON schema, handle corrupt images gracefully, add logging, support resume from failure
Scaling Further: What if you need to process millions of images?
- Distribute across machines using message queues, consider GPU acceleration with OpenCV/cupy