The most important feature of any large software project is its test suite, for a well-understood reason: it keeps the cost of change roughly constant instead of letting it grow linearly with the number of files in the project.
Put more simply, if you have great tests that cover your entire project, you never have to worry about whether something somewhere breaks: any breakage will surface as failures in the test suite.
For those new to test-driven development, we recommend the section on testing from Hadley's Advanced R book.
Let's begin with a concrete example. Make sure you are back in the example.sy project from earlier sections.
# Place this in lib/mungebits/imputer.R
train <- function(data) {
  # Record which numeric columns actually contain missing values.
  numeric_columns <- vapply(data,
    function(x) is.numeric(x) && any(is.na(x)), logical(1))
  input$columns <- colnames(data)[numeric_columns]
  input$means <- list()
  data[numeric_columns] <- lapply(input$columns, function(column) {
    # Store each column's mean so predict can reuse it on future data.
    input$means[[column]] <- mean(data[[column]], na.rm = TRUE)
    col <- data[[column]]
    col[is.na(col)] <- input$means[[column]]
    col
  })
  data
}

predict <- function(data) {
  # Impute using the means remembered during training.
  data[input$columns] <- lapply(input$columns, function(column) {
    col <- data[[column]]
    col[is.na(col)] <- input$means[[column]]
    col
  })
  data
}
We can test that our imputer mungebit behaves as expected.
mungebit <- project$resource("lib/mungebits/imputer")
iris[1, 1] <- NA
iris2 <- mungebit$train(iris)
stopifnot(mungebit$predict(iris)[1, 1] == mean(iris[-1, 1]))
We have verified by hand that the imputer mungebit works for one particular example. To ensure that it remains valid even if data scientists modify the definition of the imputer later, we should place the above check in a test file.
# test/lib/mungebits/imputer.R
test_that("it imputes a simple example correctly during training", {
  mungebit <- resource()
  iris[1, 1] <- NA
  expect_equal(mungebit$train(iris)[1, 1], mean(iris[-1, 1]))
})

test_that("it imputes a simple example correctly during prediction", {
  mungebit <- resource()
  iris[1, 1] <- NA
  mungebit$train(iris)
  expect_equal(mungebit$predict(iris)[1, 1], mean(iris[-1, 1]))
})
Note the resource helper available in test files: it is equivalent to calling project$resource on the current file's path without the leading "test/". In other words, even testing enforces a convention: the test for a given resource lives at that resource's filename inside the test directory.
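For example, sticking to the paths used above, the convention works out as follows (a small sketch restating the equivalence just described):
# Resource under test:  lib/mungebits/imputer.R
# Its test file:        test/lib/mungebits/imputer.R
#
# Inside that test file, resource() is shorthand for:
# project$resource("lib/mungebits/imputer")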
Using the stest helper provided by the modeling engine, we can check that our mungebit behaves as expected.
stest("lib/mungebits/imputer")
# ..
# DONE
In general, all resources in a project should have accompanying tests. If you find it difficult to test a resource, that is usually a sign that you haven't broken it down enough, or that you need mocking, stubbing, and other test engineering techniques. (These are more advanced concepts you can Google, but applying them to Syberia projects is still an ongoing open question.)
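As a generic illustration of the stubbing idea (plain R and testthat, not a Syberia-specific API), a resource that accepts its data source as an argument is easy to test with a tiny in-memory stand-in; the names below are hypothetical:
library(testthat)

# Hypothetical resource: it takes its data source as an argument,
# so tests can inject a stub instead of pulling real data.
row_counter <- function(fetch = function() read.csv("https://example.com/big.csv")) {
  nrow(fetch())
}

test_that("row_counter works against a stubbed data source", {
  stub <- function() head(iris, 5)  # tiny in-memory stand-in for the real pull
  expect_equal(row_counter(fetch = stub), 5)
})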
The magic behind tests is the same as that of any other Syberia file: tests are defined by a tests controller in the base package, and thus tests are just another resource.
To check out how the base engine defines testing, see the test controller.
Because models form complicated execution pipelines, testing them is not so straightforward. By default, the modeling engine comes with a models test controller. This test controller specifies that each model shall cache the last 100 rows of data after running the import stage and save the highly downsampled data frame to a directory in test/models/.registry.
If this sounds funky, note that it is a well-understood testing design pattern: rather than re-pulling data every time the models run in the full test suite, we store a tiny sample of the data and use only that sample to test the data pipeline. Otherwise, running all the tests could take a very long time.
The other convention established by the models test controller is that it cuts off the stagerunner at the end of the data stage. We don't want to build a full model, even on a small sample of data; that responsibility belongs to the classifier's tests anyway.
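For instance, assuming a model defined at models/dev/example1 (a hypothetical path following the layout used in earlier sections), its test could be run like any other resource test, reusing the cached sample instead of re-running the full import stage:
stest("models/dev/example1")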
To run all the tests in your test directory (your "test suite"), execute the following from the global R console after opening your project:
syberia::test_engine()
You have covered the fundamentals of the Syberia modeling engine and its dependencies. Let's proceed to a summary of everything we have learned.
Sometimes you may wish to define functions that execute before and after you call test_project, or before and after every individual test. These are called test setup and test teardown hooks.
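The exact hook interface is defined by the engine; purely as a sketch of the pattern (the function names below are illustrative, not Syberia's API), a setup/teardown pair wrapped around a test might look like:
# Illustrative pattern only; not the engine's actual hook interface.
setup_hook    <- function() dir.create("tmp_test_fixtures", showWarnings = FALSE)
teardown_hook <- function() unlink("tmp_test_fixtures", recursive = TRUE)

run_with_hooks <- function(test) {
  setup_hook()
  on.exit(teardown_hook(), add = TRUE)  # teardown runs even if the test errors
  test()
}

run_with_hooks(function() stopifnot(dir.exists("tmp_test_fixtures")))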
The practice of continuous integration refers to running the test suite in the cloud, using a tool like Travis, every time code changes are committed to the codebase.
Projects built on top of the Syberia modeling engine make it easy to add continuous integration to any modeling project. Follow the instructions on how to enable Travis for your repository and put the file below in ".travis.yml" at the root of your project. Non-Travis continuous integration services, like a local Jenkins setup, may require a different configuration.
# Place this in .travis.yml in the root of your Syberia modeling project.
language: c
dist: trusty
sudo: false
addons:
  apt:
    sources:
      - r-packages-precise
    packages:
      - r-base-dev
      - r-recommended
      - pandoc
env:
  global:
    - TRAVIS=true
    - R_LIBS_USER=~/.R/library
cache:
  directories:
    - ~/.R
before_script:
  - rm -rf "/home/travis/.R/.syberia"
  - mkdir -p "$R_LIBS_USER"
script: "Rscript -e 'library(syberia); library(methods); devtools::with_options(list(stub = 1), force); syberia::syberia_engine(); quit(status = tryCatch({ syberia::test_engine(); 0 }, error = function(e) { message(e); message(bettertrace::stacktrace()); 1 }));'"
notifications:
  email:
    on_success: change
    on_failure: change