Harnessing the Magic of Pytest and Mocking for Testing Transformers Models

1. Introduction

The landscape of Natural Language Processing (NLP) has undergone a dramatic transformation with the advent of transformers models. Popularized by Hugging Face’s Transformers library, these sophisticated models have ushered in a new era of AI applications, from language translation to sentiment analysis and beyond.

As the prominence of transformers models continues to grow, the need for rigorous testing methodologies becomes paramount. In this blog post, we will explore the vital role of testing in the world of NLP and AI development, focusing specifically on transformers models.

While Pytest simplifies testing, working with real transformers models can be challenging due to their resource-intensive nature. Loading large models and processing vast amounts of data can slow down test cycles. This is where the art of mocking with unittest.mock comes into play. By creating lightweight, simulated stand-ins for models, we can substitute time-consuming operations with mocked objects, achieving faster and more focused tests. This lets us confidently validate the behavior of our code without being blocked by resource constraints.

2. Understanding Pytest

What is Pytest?

Pytest is a feature-rich and widely used testing framework for Python. Known for its simplicity and ease of use, it allows you to write tests using a clean and intuitive syntax. It automatically discovers and runs all the test functions within your codebase, making testing effortless and efficient.

Getting Started with Pytest

To use Pytest in your project, you need to install it via pip:

pip install pytest

Create a file containing your test functions and name it following the test_*.py pattern. Pytest will automatically recognize this file as containing tests.

A simple test function using Pytest looks like this:

# test_example.py
def add(a, b):
    # The function under test: returns the sum of two numbers
    return a + b

def test_add():
    # Pytest collects any function whose name starts with test_
    result = add(2, 3)
    assert result == 5

Running Pytest will automatically discover and run the test functions:

pytest
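You can also point Pytest at a specific file or even a single test, and the -v flag prints one line per test:

pytest test_example.py -v          # run only this file, with verbose output
pytest test_example.py::test_add   # run a single test by its node ID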

3. Mocking Transformers Models

Why Mock Transformers Models?

Testing transformers models often involves time-consuming tasks, such as loading large models and processing data. Mocking allows us to replace these time-consuming operations with lightweight, simulated versions. This speeds up the test execution and allows us to focus on the functionality of the code being tested.

Using unittest.mock for Mocking

The unittest.mock module in Python provides the tools for creating mock objects, making it easy to mock models, tokenizers, and other dependencies.
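As a quick, self-contained illustration (separate from the sentiment example that follows), a MagicMock accepts any call, records how it was called, and returns whatever you configure:

from unittest import mock

fake_model = mock.MagicMock()
fake_model.predict_sentiment.return_value = "positive"

print(fake_model.predict_sentiment("any text"))  # prints "positive"
fake_model.predict_sentiment.assert_called_once_with("any text")  # passes

With that building block in mind, here’s how to mock a transformers model for sentiment analysis.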

Let’s begin by creating a basic SentimentAnalysisModel wrapper class that encapsulates a transformers model for sentiment analysis, providing a simple interface for predicting the sentiment of input texts.

# my_transformers_model.py
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch


class SentimentAnalysisModel:
    def __init__(self):
        # Load a pretrained multilingual sentiment model and its tokenizer
        self.tokenizer = AutoTokenizer.from_pretrained("nlptown/bert-base-multilingual-uncased-sentiment")
        self.model = AutoModelForSequenceClassification.from_pretrained("nlptown/bert-base-multilingual-uncased-sentiment")
        torch.manual_seed(15)

    def predict_sentiment(self, input_text: str) -> str:
        # Tokenize the input text into a batch of PyTorch tensors
        tokens = self.tokenizer.encode(input_text, return_tensors="pt")
        result = self.model(tokens)
        # The model outputs logits for star ratings from 1 to 5
        prediction = int(torch.argmax(result.logits)) + 1
        return "positive" if prediction >= 3 else "negative"
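As a quick sanity check, the wrapper can be used directly. This sketch is hypothetical usage, and the first run downloads the real model weights from the Hugging Face Hub, which is exactly the cost we will later avoid in tests:

# quick_check.py -- hypothetical usage; downloads the real model on first run
from my_transformers_model import SentimentAnalysisModel

model = SentimentAnalysisModel()
print(model.predict_sentiment("I love this product!"))  # expected: "positive"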

Testing the Sentiment Analysis Function

Now, let’s create a function in my_sentiment_analysis.py that uses the SentimentAnalysisModel class we just created:

# my_sentiment_analysis.py
from my_transformers_model import SentimentAnalysisModel


def analyze_sentiment(input_text: str) -> str:
    model = SentimentAnalysisModel()
    result = model.predict_sentiment(input_text)
    return result
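Calling the function is straightforward, but notice that it constructs a new SentimentAnalysisModel, and therefore loads the full model, on every call. This is exactly the kind of expensive setup we want to bypass in tests. A hypothetical one-off usage:

# hypothetical usage; loads the model on every call
from my_sentiment_analysis import analyze_sentiment

print(analyze_sentiment("Great experience!"))  # expected: "positive"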

Pytest Test Functions

With the analyze_sentiment() function and the SentimentAnalysisModel class in place, let’s write the Pytest test functions in test_my_sentiment_analysis.py:

# test_my_sentiment_analysis.py
from unittest import mock

import pytest

import my_sentiment_analysis


def test_analyze_sentiment() -> None:
    input_text = "I love this product! It's amazing."
    expected_result = "positive"

    # Mock the SentimentAnalysisModel class so that no real model is loaded
    with mock.patch.object(
        my_sentiment_analysis.SentimentAnalysisModel, "__init__", return_value=None
    ):
        with mock.patch.object(
            my_sentiment_analysis.SentimentAnalysisModel,
            "predict_sentiment",
            return_value=expected_result,
        ) as mocked_predict:
            # Test the function that uses the transformers model
            result = my_sentiment_analysis.analyze_sentiment(input_text)
    # Assertion
    assert result == expected_result
    mocked_predict.assert_called_once_with(input_text)


# Provide test cases with different inputs and expected outputs
@pytest.mark.parametrize(
    "input_text, expected_result",
    [
        ("I love it, it is amazing.", "positive"),
        ("This movie was terrible.", "negative"),
        ("Neutral statement.", "positive"),
    ],
)
def test_analyze_sentiment_multiple_cases(input_text, expected_result) -> None:
    result = my_sentiment_analysis.analyze_sentiment(input_text)
    assert result == expected_result

In the test_analyze_sentiment() function, we use the mock.patch.object context manager to patch the SentimentAnalysisModel class’s __init__ and predict_sentiment methods. Patching __init__ to return None prevents the real model from being loaded, and setting return_value on the patched predict_sentiment makes it return “positive” as the predicted sentiment. This allows us to test the analyze_sentiment() function with a mocked version of the model.

By using mock.patch.object, we don’t need to permanently replace the original class with a mock; the patch is applied only within the context and automatically reverted afterward. This keeps the code clean and ensures that the mock doesn’t interfere with other tests. The analyze_sentiment() function can be thoroughly tested using the mocked methods without accessing any external resources or running the actual model.
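To make that automatic clean-up concrete, here is a small sketch (assuming the modules above are importable) showing that the patch only lives inside the with block:

# patch_scope_demo.py -- sketch showing that patch.object reverts on exit
from unittest import mock

import my_sentiment_analysis

original = my_sentiment_analysis.SentimentAnalysisModel.predict_sentiment

with mock.patch.object(
    my_sentiment_analysis.SentimentAnalysisModel, "predict_sentiment"
):
    # Inside the block, the attribute is a MagicMock, not the real method
    assert (
        my_sentiment_analysis.SentimentAnalysisModel.predict_sentiment
        is not original
    )

# Outside the block, the original method has been restored automatically
assert my_sentiment_analysis.SentimentAnalysisModel.predict_sentiment is original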

The test_analyze_sentiment_multiple_cases() function, by contrast, demonstrates parametrized testing, allowing us to exercise the analyze_sentiment() function with multiple inputs and expected outputs. Unlike test_analyze_sentiment(), which replaced the real model with mocks, this test runs the actual sentiment analysis model defined in my_transformers_model.py, so it is slower and downloads the model weights on first run, but it verifies end-to-end behavior. The test cases cover various sentiments, ensuring that the function behaves correctly for different scenarios.
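If you want the broad coverage of parametrization without the cost of the real model, the two techniques combine naturally. The following sketch (the test name and cases are illustrative, not part of the original files) would live in the same test_my_sentiment_analysis.py, which already imports mock, pytest, and my_sentiment_analysis:

# Illustrative sketch: parametrized test cases with a mocked model
@pytest.mark.parametrize(
    "input_text, mocked_sentiment",
    [
        ("I love it, it is amazing.", "positive"),
        ("This movie was terrible.", "negative"),
    ],
)
def test_analyze_sentiment_mocked_cases(input_text, mocked_sentiment) -> None:
    with mock.patch.object(
        my_sentiment_analysis.SentimentAnalysisModel, "__init__", return_value=None
    ):
        with mock.patch.object(
            my_sentiment_analysis.SentimentAnalysisModel,
            "predict_sentiment",
            return_value=mocked_sentiment,
        ):
            assert my_sentiment_analysis.analyze_sentiment(input_text) == mocked_sentiment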

4. Conclusion

Testing transformers models is crucial for ensuring their reliability and correctness. By leveraging Pytest and the unittest.mock module, we can write comprehensive and efficient tests for them. Mocking the transformers model isolates the tests from time-consuming operations, leading to faster test runs.

With well-structured test functions and parametrized tests, we can thoroughly evaluate the behavior of the sentiment analysis function, mocking the model when speed matters and running the real one when end-to-end confidence is needed. As I continue my journey in NLP and AI development, mastering Pytest and mocking will enable me to deliver robust and reliable transformers models that excel in real-world applications. Happy testing!