Handling test data in pytest can get challenging when tests need complex data (imagine a deeply nested JSON document as the request body of an API call). If such tests also need to be parametrized, both code readability and test maintainability suffer. In real-world test suites, this happens very frequently.
Let us look at some sample code to understand the problem better. Consider tests that validate email server actions, like this:
import pytest

# send_email and InvalidEmailException come from the code under test.


@pytest.mark.parametrize("input", [
    {
        "fromEmail": "john@doe.com",
        "toEmail": "jane@doe.com",
        "description": "All fields are filled in with valid values",
        "message": {
            "subject": "Sick Leave",
            "body": "Not feeling well, taking the day off",
            "signature": {"firstname": "John", "lastName": "Doe", "salutations": "Regards"},
        },
    },
    {
        "fromEmail": "jane@doe.com",
        "toEmail": "john@doe.com",
        "description": "signature is null",
        "message": {
            "subject": "Nearing estimate due date",
            "body": "Hi\n, Please send your work estimation for the sprint in this thread by EOD.",
            "signature": None,
        },
    },
])
def test_send_email_happy_path(input):
    from_email = input["fromEmail"]
    to_email = input["toEmail"]
    message = input["message"]
    response = send_email(from_email, to_email, message)
    assert response.sent is True
    assert response.error is None


@pytest.mark.parametrize("input", [
    {
        "fromEmail": "john.com",
        "toEmail": "jane@doe.com",
        "description": "invalid from email",
        "message": {
            "subject": "Sick Leave",
            "body": "Not feeling well, taking the day off",
            "signature": {"firstname": "John", "lastName": "Doe", "salutations": "Regards"},
        },
    },
    {
        "fromEmail": None,
        "toEmail": "john@doe.com",
        "description": "email id is null",
        "message": {
            "subject": "Nearing estimate due date",
            "body": "Hi\n, Please send your work estimation for the sprint in this thread by EOD.",
            "signature": None,
        },
    },
])
def test_send_email_invalid_email_address(input):
    from_email = input["fromEmail"]
    to_email = input["toEmail"]
    message = input["message"]
    with pytest.raises(InvalidEmailException):
        send_email(from_email, to_email, message)

If we observe the code, we can see that test_send_email_happy_path validates the happy path scenario for sending an email, while test_send_email_invalid_email_address validates the exceptions thrown because of invalid email IDs.
This style of test is hard to maintain: as noted at the start, the bulky inline data hurts both the readability of the code and the maintainability of the tests.
To handle the problem outlined above, we can start by moving the test data into a separate JSON file:
{
  "happyPath": [
    {
      "from": "john@doe.com",
      "to": "jane@doe.com",
      "description": "All fields are filled in with valid values",
      "message": {
        "subject": "Sick Leave",
        "body": "Not feeling well, taking the day off",
        "signature": { "firstname": "John", "lastName": "Doe", "salutations": "Regards" }
      }
    },
    {
      "from": "jane@doe.com",
      "to": "john@doe.com",
      "description": "signature is null",
      "message": {
        "subject": "Nearing estimate due date",
        "body": "Hi\n, Please send your work estimation for the sprint in this thread by EOD.",
        "signature": null
      }
    }
  ],
  "invalidEmail": [
    {
      "from": null,
      "to": "jane@doe.com",
      "description": "email id is null",
      "message": {
        "subject": "Sick Leave",
        "body": "Not feeling well, taking the day off",
        "signature": { "firstname": "John", "lastName": "Doe", "salutations": "Regards" }
      }
    },
    {
      "from": "jane.com",
      "to": "jane@doe.com",
      "description": "invalid from email",
      "message": {
        "subject": "Sick Leave",
        "body": "Not feeling well, taking the day off",
        "signature": { "firstname": "John", "lastName": "Doe", "salutations": "Regards" }
      }
    },
    {
      "from": "john@doe.com",
      "to": "jane.com",
      "description": "invalid to email",
      "message": {
        "subject": "Sick Leave",
        "body": "Not feeling well, taking the day off",
        "signature": { "firstname": "John", "lastName": "Doe", "salutations": "Regards" }
      }
    }
  ],
  "invalidMessage": [
    {
      "from": "john@doe.com",
      "to": "jane@doe.com",
      "description": "Subject is null",
      "message": {
        "subject": null,
        "body": "Not feeling well, taking the day off",
        "signature": { "firstname": "John", "lastName": "Doe", "salutations": "Regards" }
      }
    },
    {
      "from": "john@doe.com",
      "to": "jane@doe.com",
      "description": "Body is null",
      "message": {
        "subject": "Sick Leave",
        "body": null,
        "signature": { "firstname": "John", "lastName": "Doe", "salutations": "Regards" }
      }
    }
  ]
}
By creating a separate JSON file for the test data, we get a good separation of concerns between the test methods and the test data. Adding and updating tests and test scenarios also becomes much easier.
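To make the shape of such a file concrete, here is a minimal sketch that parses a trimmed-down copy of the data and lists how many scenarios sit under each key. The JSON is inlined as a string only so the snippet runs on its own, and the variable names (`raw`, `scenario_counts`, `descriptions`) are illustrative rather than part of the original solution.

```python
import json

# Trimmed-down copy of the test data, inlined so the snippet is self-contained.
raw = """
{
  "happyPath": [
    {"from": "john@doe.com", "to": "jane@doe.com",
     "description": "All fields are filled in with valid values"},
    {"from": "jane@doe.com", "to": "john@doe.com",
     "description": "signature is null"}
  ],
  "invalidEmail": [
    {"from": null, "to": "jane@doe.com", "description": "email id is null"}
  ]
}
"""

test_data = json.loads(raw)

# Each top-level key is one group of scenarios; each list item is one test case.
scenario_counts = {group: len(cases) for group, cases in test_data.items()}
print(scenario_counts)

# Adding a scenario means adding one more object to a list in the file.
# Here that is mimicked in memory instead of editing a file on disk.
test_data["invalidEmail"].append(
    {"from": "jane.com", "to": "jane@doe.com", "description": "invalid from email"}
)
descriptions = [case["description"] for case in test_data["invalidEmail"]]
print(descriptions)
```

Note that JSON `null` arrives in Python as `None`, so the "null" scenarios need no special handling in the data file.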
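Next, a pair of utility functions reads the JSON file and converts its content into named tuples:

import json
import keyword
from collections import namedtuple
from pathlib import Path

# Assumption: test resources live in a "resources" directory next to this module.
# Adjust RESOURCES_PATH to match your project layout.
RESOURCES_PATH = Path(__file__).parent / "resources"


def _safe_field_name(name):
    """Append an underscore to Python keywords, so "from" becomes "from_"."""
    name = str(name)
    return name + "_" if keyword.iskeyword(name) else name


def create_namedtuple_from_dict(obj):
    """Recursively convert dicts to named tuples; a generic alternative to dataclasses."""
    if isinstance(obj, dict):
        # Map each usable attribute name to its (recursively converted) value.
        field_value_pairs = {
            _safe_field_name(field): create_namedtuple_from_dict(obj[field])
            for field in sorted(obj)
        }
        try:
            namedtuple_type = namedtuple("TestData", list(field_value_pairs))
            return namedtuple_type(**field_value_pairs)
        except (TypeError, ValueError):
            # Keys are not valid attribute names, so fall back to a plain dict.
            return field_value_pairs
    elif isinstance(obj, (list, set, tuple, frozenset)):
        return [create_namedtuple_from_dict(item) for item in obj]
    else:
        return obj


def inject_test_data(file):
    """
    Read a JSON file and convert its content to named tuples.
    Used to inject test data sets into tests, keeping the data separate from the test code.
    """
    path = RESOURCES_PATH.joinpath(file)
    with open(path) as f:
        raw_data = json.load(f)
    return create_namedtuple_from_dict(raw_data)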
These are one-time utility methods that can be reused for all the JSON files. The main reason for converting JSON to a named tuple is to use a class-like syntax to fetch the fields. In my opinion, this is much easier to read than accessing field values from dictionaries. Also, because of duck typing, we can pass named tuples as objects wherever required.
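As a rough illustration of that difference, the sketch below reads the same field once from a nested dict and once from nested named tuples, then passes the named tuple to a function that only cares about a `.subject` attribute. The types `Signature` and `Message` and the helper `subject_line` are hypothetical names made up for this example; they are not part of the utilities above.

```python
from collections import namedtuple

# Raw dict access, as in the inline-data version of the tests.
message = {
    "subject": "Sick Leave",
    "signature": {"firstname": "John", "lastName": "Doe"},
}
first_name_from_dict = message["signature"]["firstname"]

# The same data as nested named tuples (field names mirror the JSON keys).
Signature = namedtuple("Signature", ["firstname", "lastName"])
Message = namedtuple("Message", ["subject", "signature"])
message_nt = Message(
    subject="Sick Leave",
    signature=Signature(firstname="John", lastName="Doe"),
)
first_name_from_tuple = message_nt.signature.firstname

print(first_name_from_dict, first_name_from_tuple)


def subject_line(msg):
    """Duck typing: works with any object that exposes a .subject attribute."""
    return f"Subject: {msg.subject}"


print(subject_line(message_nt))
```

One caveat worth keeping in mind: named tuple fields must be valid Python identifiers and cannot be keywords, so a JSON key such as "from" cannot be used as a field name unchanged.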
With the helpers in place, the tests are parametrized directly from the loaded data:

import pytest

# send_email, InvalidEmailException and InvalidMessageBodyException come from the code under test;
# inject_test_data is the utility defined above.


class TestData:
    test_data = inject_test_data(file="email/emailTestSet.json")

    @pytest.mark.parametrize("input", test_data.happyPath)
    def test_email_happy_path(self, input):
        print(input.description)
        response = send_email(input.from_, input.to, input.message)
        assert response.sent
        assert response.error is None

    @pytest.mark.parametrize("input", test_data.invalidEmail)
    def test_email_validation_failed(self, input):
        print(input.description)
        with pytest.raises(InvalidEmailException):
            send_email(input.from_, input.to, input.message)

    @pytest.mark.parametrize("input", test_data.invalidMessage)
    def test_email_body_validation_failed(self, input):
        print(input.description)
        with pytest.raises(InvalidMessageBodyException):
            send_email(input.from_, input.to, input.message)
Now, if we observe, the tests are agnostic to the underlying test data. This separation of concerns helps in the maintainability of the tests. The complete example of the solution can be found here