Developer documentation¶
This section of the documentation is meant for those who wish to contribute to the Girder Worker platform.
Installing from source¶
Clone from git:
git clone https://github.com/girder/girder_worker.git
cd girder_worker
Install requirements:
pip install -r requirements.txt
# If you plan on developing the worker, you will also want to install system packages and Python requirements:
# Command for Ubuntu
sudo apt-get install libxml2-dev libxslt1-dev
pip install -r requirements-dev.txt
If you want to run the girder_worker as a remote worker from the source install directory, you’ll need to install it with pip in editable mode. If you don’t want to include any girder worker plugins:
pip install -U -e .
Test it:
To check that the setup works without complex dependencies:
python -m unittest tests.format_test
To check that the setup works with complex dependencies, including R and Mongo:
python -m unittest -v tests.table_test
python -m unittest -v tests.tree_test
If some of these tests fail, you may be missing optional dependencies. For example, install MongoDB and R, in addition to their Python bindings:
pip install pymongo rpy2 # may need sudo
You’ll need to get a MongoDB server listening on localhost by running mongod.
In R, you’ll need to install some packages too, currently just the ape package:
install.packages("ape")
Installing worker plugins from pip¶
An individual plugin can be installed through pip, in editable mode, like so:
pip install -U -e .[r]
You can run this command at any time to install dependencies of other plugins, even if the worker is already installed.
See also
For more information on the worker plugin system, see Application Plugins.
Creating a new release¶
Girder Worker releases are uploaded to PyPI for easy installation via pip. The recommended process for generating a new release is described here.
From the target commit, set the desired version number in the top level __init__.py. Create a new commit and note the SHA; this will become the release tag.
Ensure that all tests pass.
Clone the repository in a new directory and checkout the release SHA. (Packaging in an old directory could cause extraneous files to be mistakenly included in the source distribution.)
Run python setup.py sdist --dist-dir . to generate the distribution tarball in the project directory, which looks like girder-worker-x.y.z.tar.gz.
Create a new virtual environment and install the Python package into it. This should not be done in the repository directory because the wrong package will be imported.
mkdir test && cd test
virtualenv release
source release/bin/activate
pip install ../girder-worker-<version>.tar.gz
Once that finishes, you should be able to start the worker by simply running girder-worker.
When you are confident everything is working correctly, generate a new release on GitHub. You must be sure to use a tag version of v<version>, where <version> is the version number as it exists in setup.py. For example, v0.2.4. Attach the tarball you generated to the release.
Add the tagged version to readthedocs and make sure it builds correctly.
Finally, upload the release to PyPI with the following command:
python setup.py sdist upload
Note
The first time you create a release, you will need to register with PyPI before you can run the upload step. To do so, simply run python setup.py sdist register.
Task Plugins¶
This is an example plugin that demonstrates how to extend girder_worker by allowing it to run additional tasks. Plugins are implemented as separate pip-installable packages. To install this example plugin, check out this code base, change directories to examples/plugin_example/ and run
pip install .
This will add the gwexample plugin to girder_worker. If you then run girder_worker with a log level of ‘info’ (e.g. girder-worker -l info), you should see the following output:
(girder)$ girder-worker -l info
-------------- celery@minastirith v3.1.23 (Cipater)
---- **** -----
--- * *** * -- Linux-4.8.6-1-ARCH-x86_64-with-glibc2.2.5
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> app: girder_worker:0x7f69bfff1050
- ** ---------- .> transport: amqp://guest:**@localhost:5672//
- ** ---------- .> results: amqp://
- *** --- * --- .> concurrency: 32 (prefork)
-- ******* ----
--- ***** ----- [queues]
-------------- .> celery exchange=celery(direct) key=celery
[tasks]
. girder_worker.convert
. girder_worker.run
. girder_worker.validators
. gwexample.analyses.tasks.fibonacci
[2016-11-08 12:22:56,163: INFO/MainProcess] Connected to amqp://guest:**@127.0.0.1:5672//
[2016-11-08 12:22:56,184: INFO/MainProcess] mingle: searching for neighbors
[2016-11-08 12:22:57,198: INFO/MainProcess] mingle: all alone
[2016-11-08 12:22:57,218: WARNING/MainProcess] celery@minastirith ready.
Notice that the task gwexample.analyses.tasks.fibonacci is now available. With the girder-worker processes running, you should be able to execute python example_client.py in the current working directory. After a brief delay, this should print out 121393, the Fibonacci number for 26.
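As a quick sanity check of the expected value, the sketch below computes the same number locally without Celery or a running worker. It is a plain iterative function written for illustration, not the code in the gwexample package; it only assumes the same convention that the first two Fibonacci numbers are both 1.

```python
def fibonacci(n):
    """Return the n-th Fibonacci number, with fibonacci(1) == fibonacci(2) == 1."""
    a, b = 1, 1
    # Each iteration advances the pair (F(k), F(k+1)) by one position.
    for _ in range(n - 2):
        a, b = b, a + b
    return b


print(fibonacci(26))  # 121393
```

If the client prints a different number, the task that ran is probably not the one shipped with the example plugin.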
Writing your own plugin¶
Adding tasks to the girder_worker infrastructure takes three steps: (1) creating tasks, (2) creating a plugin class and (3) adding a girder_worker_plugins entry point to your setup.py.
Creating tasks follows the standard celery conventions. The only difference is that the celery application that decorates the function should be imported from girder_worker.app. E.g.:
from girder_worker.app import app

@app.task
def fibonacci(n):
    if n == 1 or n == 2:
        return 1
    return fibonacci(n-1) + fibonacci(n-2)
Each plugin must define a plugin class that inherits from girder_worker.GirderWorkerPluginABC. GirderWorkerPluginABC’s interface is simple. The class must define an __init__ function and a task_imports function. __init__ takes the girder_worker’s celery application as its first argument. This allows the plugin to store a reference to the application, or change configurations of the application as necessary. The task_imports function takes no arguments and must return a list of the package paths (i.e. importable strings) that contain the plugin’s tasks. As an example:
from girder_worker import GirderWorkerPluginABC

class GWExamplePlugin(GirderWorkerPluginABC):
    def __init__(self, app, *args, **kwargs):
        self.app = app
        # Update the celery application's configuration.
        # It is not necessary to change the application configuration;
        # this is simply included to illustrate that it is possible.
        self.app.conf.update({
            'TASK_TIME_LIMIT': 300
        })

    def task_imports(self):
        return ['gwexample.analyses.tasks']
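To make the contract between the host and a plugin concrete, here is a minimal, self-contained sketch of how a host could consume such plugin classes. PluginABC, ExamplePlugin, collect_task_imports and fake_app are all hypothetical stand-ins introduced for illustration; this is not girder_worker's actual loading code, only an assumption about the general shape of the interaction described above.

```python
from abc import ABC, abstractmethod


class PluginABC(ABC):
    """Hypothetical stand-in for girder_worker.GirderWorkerPluginABC."""

    @abstractmethod
    def __init__(self, app, *args, **kwargs):
        """Receive the Celery application as the first argument."""

    @abstractmethod
    def task_imports(self):
        """Return a list of importable module paths containing tasks."""


class ExamplePlugin(PluginABC):
    def __init__(self, app, *args, **kwargs):
        self.app = app

    def task_imports(self):
        return ['gwexample.analyses.tasks']


def collect_task_imports(app, plugin_classes):
    """Instantiate each plugin with the app and gather its task modules."""
    modules = []
    for plugin_class in plugin_classes:
        plugin = plugin_class(app)
        modules.extend(plugin.task_imports())
    return modules


fake_app = object()  # placeholder; a real host would pass its Celery app
task_modules = collect_task_imports(fake_app, [ExamplePlugin])
print(task_modules)
```

The point of the two-method interface is that the host never needs to know what a plugin contains: it hands over the application and receives a list of module paths to import.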
Finally, in order to make the plugin class discoverable, each plugin must define a custom entry point in its setup.py. For our example, this entry point looks like this:
from setuptools import setup

setup(name='gwexample',
      # ....
      entry_points={
          'girder_worker_plugins': [
              'gwexample = gwexample:GWExamplePlugin',
          ]
      },
      # ....
      )
Python Entry Points are a way for python packages to advertise classes and objects to other installed packages. Entry points are defined in the following way:
entry_points={
    'entry_point_group_id': [
        'entry_point_name = importable.package.name:class_or_object',
    ]
}
The girder_worker package introduces a new entry point group, girder_worker_plugins. This is followed by a list of strings which are parsed by setuptools. The strings must be in the form name = module:plugin_class, where name is an arbitrary string (by convention the name of the plugin), module is the importable path to the module containing the plugin class, and plugin_class is a class that inherits from GirderWorkerPluginABC.
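The module:object form can be explored with the standard library alone. The sketch below builds an entry point by hand with importlib.metadata.EntryPoint and resolves it; the name 'ordered', the group 'demo_group' and the target collections:OrderedDict are placeholders chosen so that the example runs without girder_worker installed. It illustrates the general entry point mechanism, not how girder_worker itself discovers plugins.

```python
import collections
from importlib.metadata import EntryPoint

# Equivalent to the setup.py string 'ordered = collections:OrderedDict'
# declared under a group named 'demo_group' (all placeholder values).
ep = EntryPoint(
    name='ordered',
    value='collections:OrderedDict',
    group='demo_group',
)

# load() imports the module part and returns the object named after the colon.
loaded = ep.load()
print(ep.name, loaded)
```

For a real plugin, the value would be something like gwexample:GWExamplePlugin, and loading it would return the plugin class.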
Final notes¶
With these three components (Tasks, Plugin Class, Entry Point) you should be able to add arbitrary tasks to the girder_worker client. By default, jobs created in girder using the ‘worker’ plugin run the girder_worker.run task. This can be overridden to call custom plugin tasks by generating jobs with a celeryTaskName defined in the job’s otherFields keyword argument. E.g.:
# Create a job to be handled by the worker plugin
job = jobModel.createJob(
    title='Some Job', type='some_type', handler='worker_handler',
    user=self.admin, public=False, args=(25,), kwargs={},
    otherFields={
        'celeryTaskName': 'gwexample.analyses.tasks.fibonacci'
    })
jobModel.scheduleJob(job)
This will schedule a job that runs gwexample.analyses.tasks.fibonacci(25) on the girder worker.
Finally, by default the core girder_worker tasks (run, convert, validate) are loaded along with any plugin tasks. If you wish to prevent these core tasks from being loaded inside the celery instance, you can configure core_tasks=false in worker.local.cfg under the [girder_worker] section of the configuration.
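The setting above amounts to a two-line fragment in worker.local.cfg. The sketch below embeds that fragment as a string and parses it with the standard configparser module, purely to show the expected section and key layout; how girder_worker itself reads the file is not shown here and may differ.

```python
import configparser

# The fragment as it would appear in worker.local.cfg.
CONFIG_TEXT = """
[girder_worker]
core_tasks=false
"""

parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)

# getboolean() accepts values such as true/false, yes/no, on/off, 1/0.
core_tasks = parser.getboolean('girder_worker', 'core_tasks')
print(core_tasks)  # False
```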
Writing cancelable tasks¶
girder_worker provides support for signaling that a task should be canceled using Celery’s revoke mechanism. For a task to be canceled cleanly, it must periodically check whether it has been canceled; if it has, it can do any necessary cleanup and return. girder_worker provides a task base class (girder_worker.utils.Task) that provides a property that can be used to check if the task has been canceled. An example of its use is shown below:
from girder_worker.app import app
from girder_worker.utils import Task

# bind=True passes the task instance as the first argument
@app.task(bind=True)
def my_cancellable_task(task):
    while not task.cancelled:
        # Do work
        pass
The Girder job model associated with the canceled task will be moved into the JobStatus.CANCELED state.
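The cooperative pattern described in this section can be tried without a broker or worker. In the sketch below, FakeTask is a hypothetical stand-in for a bound task: its cancelled property and revoke method are assumptions made for illustration and are backed by a threading.Event rather than Celery's revoke mechanism. The loop checks the flag between units of work and stops early once it is set.

```python
import threading


class FakeTask:
    """Hypothetical stand-in for a bound task exposing a cancellation flag."""

    def __init__(self):
        self._revoked = threading.Event()

    def revoke(self):
        self._revoked.set()

    @property
    def cancelled(self):
        return self._revoked.is_set()


def process_chunks(task, chunks):
    """Process chunks until finished or until the task is cancelled."""
    done = []
    for chunk in chunks:
        if task.cancelled:
            break  # a real task would clean up here before returning
        done.append(chunk * 2)
        if chunk == 3:
            task.revoke()  # simulate a cancel request arriving mid-run
    return done


result = process_chunks(FakeTask(), [1, 2, 3, 4, 5])
print(result)  # [2, 4, 6]
```

The shorter the unit of work between checks, the sooner a cancel request takes effect, at the cost of checking the flag more often.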