ML Testing

This rubric enables assessment on a scale from Insufficient to Excellent.

Testing Framework

Cut-off requirement:

  • Ensure that a testing framework is set up in the project.
  • Document the single command to run the tests clearly in the project’s README or other relevant documentation.

Automated Tests

Sufficient
Test for non-determinism robustness.
Good
Test for non-determinism robustness and use data slices to test model capabilities. There is at least one test for each of the following angle: Feature and Data; Model Development; ML infrastructure; Monitoring tests.
Very Good
Test for non-determinism robustness and use data slices to test model capabilities. There is at least one test for each of the following angles: Feature and Data; Model Development; ML infrastructure; Monitoring tests. Non-functional requirements such as memory and performance are being tested. There is a test adequacy metric being reported. There is an implementation of mutamorphic testing.
Excellent
Test for non-determinism robustness and use data slices to test model capabilities. There is at least one test for each of the following angle: Feature and Data; Model Development; ML infrastructure; Monitoring tests. The cost of features is being tested. Non-functional requirements such as memory and performance are being tested. There is an appropriate test adequacy metric being reported. There is an implementation of mutamorphic testing with automatic inconsistency repair.

Continuous testing

Sufficient
Tests are triggered through the continuous training pipeline.
Good
Tests are triggered through the continuous training pipeline and test adequacy metrics are reported in the code repository.
Very Good
Tests are triggered by the continuous training pipeline. Test adequacy metrics and test results are reported in the code repository.
Excellent
Tests are triggered by the continuous training pipeline. Test adequacy metrics and test result status are reported in the code repository. Test results are reported automatically when a pull request comes in.

The peer review will also contain a free text form. Please use this form to summarise your assessment of all rubric criteria in this category, especially if you rated something as insufficient. In addition, explain what you think would be the next important step that needs to be realised.


Last modified on Jun 27, 2023 at 21:59.