Testing, testing, testing!

The most important feature of any large software project is its test suite, for a well-understood reason: without tests, the effort of keeping the project working grows in proportion to the number of files; with them, it stays at a manageable constant.

Put more simply, if you have great tests that cover your entire project, you rarely have to worry about something breaking unnoticed: a regression shows up as a failure in the test suite.

For those new to test-driven development, we recommend the section on testing from Hadley's Advanced R book.

Defining tests

Let's begin with a concrete example. Make sure you are back in the example.sy project from earlier sections.

# Place this in lib/mungebits/imputer.R
train <- function(data) {
  # Select the numeric columns that contain at least one missing value.
  numeric_columns <- vapply(data,
    function(x) is.numeric(x) && any(is.na(x)), logical(1))
  input$columns <- colnames(data)[numeric_columns]
  input$means <- list()
  data[numeric_columns] <- lapply(input$columns, function(column) {
    # Remember each column mean so that predict can reuse it.
    input$means[[column]] <- mean(data[[column]], na.rm = TRUE)
    col <- data[[column]]
    col[is.na(col)] <- input$means[[column]]
    col
  })
  data
}

predict <- function(data) {
  # Fill missing values with the means stored during training.
  data[input$columns] <- lapply(input$columns, function(column) {
    col <- data[[column]]
    col[is.na(col)] <- input$means[[column]]
    col
  })
  data
}

We can test that our imputer mungebit behaves as expected.

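mungebit <- project$resource("lib/mungebits/imputer")
iris[1, 1] <- NA
# Train first so that the column means are stored.
iris2 <- mungebit$train(iris)
# The missing value should be replaced by the mean of the remaining rows.
stopifnot(isTRUE(all.equal(mungebit$predict(iris)[1, 1], mean(iris[-1, 1]))))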
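
The train-then-predict behaviour can also be explored outside a Syberia project. The following is a minimal standalone sketch, not the mungebit machinery itself: it assumes that `input` can be mimicked by a plain environment, and the names `train_imputer` and `predict_imputer` are hypothetical.

```r
# `input` is a plain environment standing in for the state that the
# mungebit machinery normally provides (an assumption of this sketch).
input <- new.env()

# Hypothetical standalone version of the train function.
train_imputer <- function(data) {
  numeric_columns <- vapply(data,
    function(x) is.numeric(x) && anyNA(x), logical(1))
  input$columns <- colnames(data)[numeric_columns]
  input$means <- list()
  for (column in input$columns) {
    # Store the mean, then fill the missing values with it.
    input$means[[column]] <- mean(data[[column]], na.rm = TRUE)
    data[[column]][is.na(data[[column]])] <- input$means[[column]]
  }
  data
}

# Hypothetical standalone version of the predict function.
predict_imputer <- function(data) {
  for (column in input$columns) {
    data[[column]][is.na(data[[column]])] <- input$means[[column]]
  }
  data
}

training_data <- iris
training_data[1, 1] <- NA
trained <- train_imputer(training_data)
expected <- mean(iris[-1, 1])

# New data with a different missing entry is filled with the stored mean.
new_data <- iris
new_data[2, 1] <- NA
predicted <- predict_imputer(new_data)
```

Because the means live in `input`, prediction reuses the value computed during training instead of recomputing it from the new data.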

We have verified by hand that the imputer mungebit works for one particular example. To ensure that it remains valid even if data scientists modify the definition of the imputer later, we should place the above check in a test file.

# test/lib/mungebits/imputer.R
test_that("it imputes a simple example correctly during training", {
  mungebit <- resource()
  iris[1, 1] <- NA
  expect_equal(mungebit$train(iris)[1, 1], mean(iris[-1, 1]))
})

test_that("it imputes a simple example correctly during prediction", {
  mungebit <- resource()
  iris[1, 1] <- NA
  # Train first so that predict has stored means to impute with.
  mungebit$train(iris)
  expect_equal(mungebit$predict(iris)[1, 1], mean(iris[-1, 1]))
})

Note the resource helper available in test files: it is equivalent to calling project$resource on the current filename without the leading "test/". In other words, even testing enforces a convention: the test for a given resource lives at the same path as that resource, inside the test directory.
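
The path convention can be illustrated with a few lines of base R. This is only a sketch of the naming rule described above: `resource_path_for_test` is a hypothetical helper, not part of Syberia, and dropping the ".R" extension is an assumption based on the project$resource("lib/mungebits/imputer") call shown earlier.

```r
# Hypothetical helper: map a test file to the resource it tests by
# dropping the leading "test/" and the trailing ".R" extension.
resource_path_for_test <- function(test_file) {
  without_prefix <- sub("^test/", "", test_file)
  sub("\\.R$", "", without_prefix)
}

imputer_resource <- resource_path_for_test("test/lib/mungebits/imputer.R")
print(imputer_resource)
```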

Using the stest helper provided by the modeling engine, we can check that our mungebit behaves as expected.

# ..

In general, all resources in a project should have accompanying tests. If you find it difficult to test a resource, it probably indicates that you haven't broken it down enough or exposed sufficient mocking, stubbing and other test engineering ideas. (These are more advanced concepts you can Google, but applying them to Syberia projects is still an ongoing open question.)
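
As one illustration of stubbing in plain R, a resource can accept its external dependency as an argument so that a test can swap in a fake. The sketch below is generic R, not a Syberia convention, and every name in it (`average_age`, `read_from_database`, `stub_reader`) is hypothetical.

```r
# Hypothetical production dependency: unavailable while testing.
read_from_database <- function() {
  stop("No database connection available")
}

# The function under test takes its data source as an argument, so a
# test can replace it without touching a real database.
average_age <- function(reader = read_from_database) {
  mean(reader()$age)
}

# The stub returns a small, fixed data frame.
stub_reader <- function() {
  data.frame(age = c(20, 30, 40))
}

stubbed_result <- average_age(reader = stub_reader)
```

The design choice here is to expose the dependency as a parameter with a sensible default, which keeps production calls unchanged while making the function easy to test in isolation.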

Where do tests come from?

The magic behind tests is the same as that of any other Syberia file: tests are defined by a test controller in the base package, and thus tests are just another resource.

To check out how the base engine defines testing, see the test controller.

Testing models

Because models form complicated execution pipelines, testing them is not so straightforward. By default, the modeling engine comes with a models test controller. This test controller specifies that each model caches the last 100 rows of data after running the import stage and saves this heavily downsampled data frame to a directory in test/models/.registry.

If this sounds funky, note that it relies on a well-understood testing design pattern. Rather than re-pulling data when running the models in the full test suite, we store a tiny sample of the data and use only that to test the data pipeline. Otherwise it might take a very long time to run all the tests!
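
The caching idea can be sketched in base R. This is not the models test controller's actual implementation: `cache_sample` and `load_or_import` are hypothetical helpers, the file name is invented, and a temporary directory stands in for test/models/.registry.

```r
# Hypothetical helper: keep only the last `n` rows and save them to `path`.
cache_sample <- function(data, path, n = 100) {
  dir.create(dirname(path), recursive = TRUE, showWarnings = FALSE)
  saveRDS(utils::tail(data, n), path)
  invisible(path)
}

# Hypothetical helper: reuse the cached sample when it exists, and only
# fall back to the (slow) import otherwise.
load_or_import <- function(path, import) {
  if (file.exists(path)) readRDS(path) else import()
}

# A temporary directory stands in for test/models/.registry.
registry <- file.path(tempdir(), "test", "models", ".registry")
sample_path <- file.path(registry, "example_model.rds")

cache_sample(iris, sample_path)
sample_data <- load_or_import(
  sample_path,
  import = function() stop("The cached sample should have been used")
)
```

On later test runs only the small cached file is read, so the pipeline is exercised without repeating the expensive import.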

The other convention established by the models test controller is that it cuts off the stagerunner at the end of the data stage. We don't want to build a full model, even on a small sample of data, and anyway that should be handled by the classifier's tests.

Running the full test suite

To run all the tests in your test directory (your "test suite"), execute the following from the global R console after opening your project:


Next Steps

You have covered the fundamentals of the Syberia modeling engine and its dependencies. Let's proceed to a summary of everything we have learned.

Appendix: Test configuration

Sometimes you may wish to define functions that execute before and after you call test_project, or before and after every individual test. These are called test setup and test teardown hooks.
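
The general shape of setup and teardown hooks can be sketched in plain R. This illustrates the pattern only; it does not show how Syberia registers hooks, and `run_with_hooks` is a hypothetical function.

```r
# Hypothetical runner: call `setup`, then the test, and guarantee that
# `teardown` runs afterwards even if the test fails.
run_with_hooks <- function(test,
                           setup = function() NULL,
                           teardown = function() NULL) {
  setup()
  on.exit(teardown(), add = TRUE)
  test()
}

# Record the order in which the pieces run.
events <- character(0)
result <- run_with_hooks(
  test = function() {
    events <<- c(events, "test")
    TRUE
  },
  setup = function() events <<- c(events, "setup"),
  teardown = function() events <<- c(events, "teardown")
)
```

Using on.exit for the teardown means that cleanup still happens when the test body raises an error.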

Appendix: Continuous integration

The practice of continuous integration refers to running the test suite in the cloud using some tool like Travis every time that code changes are committed to the codebase.

Projects built on top of the Syberia modeling engine make it easy to add continuous integration to any modeling project. Follow the instructions on how to enable Travis for your repository and put the file below in ".travis.yml" at the root of your project. Continuous integration services other than Travis, such as a local Jenkins setup, may require a different configuration.

# Place this in .travis.yml in the root of your Syberia modeling project.
language: c
dist: trusty
sudo: false
addons:
  apt:
    sources:
      - r-packages-precise
    packages:
      - r-base-dev
      - r-recommended
      - pandoc
env:
  - global:
    - TRAVIS=true
    - R_LIBS_USER=~/.R/library
# Assumed phase name: these setup commands need to run before the script.
before_install:
  - rm -rf "/home/travis/.R/.syberia"
  - mkdir -p "$R_LIBS_USER"
script: "Rscript -e 'library(syberia); library(methods); devtools::with_options(list(stub = 1), force); syberia::syberia_engine(); quit(status = tryCatch({ syberia::test_engine(); 0 }, error = function(e) { message(e); message(bettertrace::stacktrace()); 1 }));'"
notifications:
  email:
    on_success: change
    on_failure: change