Contributing

Code contributions are handled through the git repository hosted at sciCORE, University of Basel: https://git.scicore.unibas.ch/schwede/ProMod3. Get in touch with the main developers if you have a fantastic new feature and need an account there. The following explains, in a coarse-grained manner, how to add new features to ProMod3. The most general advice is to use existing bits and pieces as examples and to be consistent with what you already find here.

How To Share Your Own Script

If you have a useful script using ProMod3 that you want to share, it should go as a subfolder into the extras/external_scripts folder. Make sure to describe the use and purpose of the script in a short README including working commands on how to use it.

How To Start Your Own Module

This is a walk-through of how the topics above work together when you start your own module. As a starting point, let's assume that you have already cloned the repository into a directory and changed into it.

All new features should branch off from the develop branch. That way, they work with all the other new features waiting for release right from the beginning. Therefore, you need to switch branches as a first step. Git reports the branch you switched to, or prints an error message if the switch failed.

$ git checkout develop
Switched to branch 'develop'

Now that you are on the right code base, create your own branch from it. As an example, your feature will go by the name of ‘sidechain’.

$ git checkout -b sidechain
Switched to a new branch 'sidechain'

This time, Git should tell you about going for a new branch.

Before starting to create anything for real, now is the perfect moment to install our very own Git hook to check some coding rules on commit.

$ cp extras/pre_commit/pre-commit .git/hooks/

With that in place, changes which break our coding standards will abort any commit.

Now create the directory structure where your project will live. Here is the list of directories which are likely to be used in every project.

$ mkdir -p sidechain/doc
$ mkdir -p sidechain/pymod
$ mkdir -p sidechain/tests

If you run git status at this point, you will see basically nothing, because Git does not track empty directories. Before you bring your module under version control, create a couple of files which are always needed.

$ touch sidechain/pymod/__init__.py
$ echo ":mod:\`~promod3.sidechain\` - ProMod3 side chain optimiser" \
       >> sidechain/doc/index.rst
$ echo "==========================================================" \
"======================" >> sidechain/doc/index.rst

Having an empty __init__.py is perfectly fine for Python; it just marks a directory as a module. A blank index.rst, however, can give Sphinx quite a headache, so you fill it with a headline for your documentation right away.

For integration with make, the build system needs to know about the new module and its members. This means setting up new CMake files and extending some existing ones around the directory root.
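The same skeleton can also be created from Python. The following is a minimal sketch, not part of ProMod3: the helper name scaffold_module is hypothetical, and unlike the shell commands above it overwrites index.rst instead of appending to it. It derives the length of the reST underline from the headline, since reST expects the underline to be at least as long as the title.

```python
from pathlib import Path


def scaffold_module(repo_root, name, title):
    """Create the doc/pymod/tests skeleton for a new module.

    Hypothetical helper, not part of ProMod3; it only mirrors the shell
    commands of this walk-through.
    """
    module_dir = Path(repo_root) / name
    for sub in ("doc", "pymod", "tests"):
        (module_dir / sub).mkdir(parents=True, exist_ok=True)

    # An empty __init__.py is enough to mark the directory as a package.
    (module_dir / "pymod" / "__init__.py").touch()

    # reST expects the underline to be at least as long as the title.
    headline = ":mod:`~promod3.{}` - {}".format(name, title)
    index_rst = module_dir / "doc" / "index.rst"
    index_rst.write_text(headline + "\n" + "=" * len(headline) + "\n")
    return module_dir
```

Calling scaffold_module(".", "sidechain", "ProMod3 side chain optimiser") from the repository root would produce the directories and files used in the rest of this walk-through.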

$ touch sidechain/CMakeLists.txt
$ touch sidechain/pymod/CMakeLists.txt
$ touch sidechain/doc/CMakeLists.txt

Each of those files still needs a bit of content. The simplest one comes from the module’s root, sidechain/CMakeLists.txt:

add_subdirectory(pymod)
add_subdirectory(doc)

Those two directives just tell CMake to go and look in directories pymod and doc below the current path for more CMake configurations. The next level in CMakeLists.txt magic comes for the doc directory:

set(SIDECHAIN_RST
index.rst
)

add_doc_source(NAME sidechain RST ${SIDECHAIN_RST})

add_doc_source is our custom CMake macro to register reST files for the documentation. On running make, those files are placed in a doc/source directory tree within the build directory. Each new submodule in your project should be covered by its own documentation entity, extending the list in RST. To maintain readability, it's good practice to store this list in a separate variable, called SIDECHAIN_RST here.

For the actual code, you should keep in mind that a Python module may be rather complex. There is for sure Python code, there could be a bit of C++ and conditional compilation. In rare cases you also want to modify the directory structure of the package. All this has to be declared in the pymod subtree. We cannot enumerate all specialities but there should be a couple of examples around in this repository. Here is the most basic CMakeLists.txt:

set(SIDECHAIN_PYMOD
__init__.py
)

pymod(NAME sidechain PY ${SIDECHAIN_PYMOD})

Source files should be again listed in a dedicated variable. Later, you probably add some C++ code and settings diverging from the defaults via the pymod macro. This is where things clutter up quite quickly. As set up here, your project would be added as a module sidechain in the ProMod3 Python package tree.

The final step towards CMake is to register your module’s directory in the top level CMakeLists.txt:

 1## <lots of cmake commands...>
 2
 3## sub dirs to be recognised by CMake
 4## e.g. add_subdirectory(src), subdirs have their own CMakeLists.txt
 5add_subdirectory(config)
 6add_subdirectory(core)
 7add_subdirectory(modelling)
 8add_subdirectory(sidechain)
 9add_subdirectory(loop)
10add_subdirectory(scripts)
11add_subdirectory(actions)
12add_subdirectory(extras)
13add_subdirectory(cmake_support)
14
15## <lots of cmake commands...>

All that needs to be done for CMake to recognise your module is adding its directory as shown in line 8.

This was the final step to set up the build system. Running CMake at this point would create the build environment in place. But building software inside your code repository has several drawbacks. First of all, it puts all kinds of new files in the directory tree, and git status would show them all. It is also very likely that manual intervention is needed after make clean. On top of that, an in-place build is very static. Imagine that at some point you want to switch on all debugging flags for your C++ code. You would either have to clean the whole repository and rebuild, or keep two separate repositories and copy code changes from A to B. The solution is to build ‘out of source’ instead of ‘in place’. You can still stay in your repository while being out of the source tree by using sub-directories. ProMod3 lists the dedicated prefix ‘build*’ in .gitignore, so directories such as build and build-dbg do not show up in git status.

$ mkdir build
$ cd build

To actually create all the makefiles and generated files, you may use one of the configuration scripts from the conf-scripts directory. Usually those scripts only need to be pointed to an OST staging tree. Even if you are on a system not covered by the available scripts, their code may help you assemble the CMake command. Once you have managed to conquer a new system, feel free to add a new configuration script. The following example assumes Fedora 19.

$ ../conf-scripts/fedora-19-conf ../../ost.git/stage

From this point, make should work and you could start adding your files to the repository using git add.

Up to now, we did not cover the tests branch of a new module. But it's good practice to develop new functionality along with tests right from the beginning. At some point, new code needs testing anyway to see if it does what it should, so just do this by writing unit tests. Test sources are stored in files with a prefix test_ and usually come per submodule instead of sporting a single monolithic test_sidechain_reconstruction.py.

Python code is evaluated using its own unit testing framework with a little help from OST (C++ uses the Boost Test Library). The basic scheme is to import your module, subclass unittest.TestCase and make the whole file runnable as script using the most common __name__ attribute. As an example we test the promod3.modelling.ReconstructSidechains() function:

import unittest
from promod3 import modelling
from ost import io,mol
import os

class ReconstructTests(unittest.TestCase):
    def testReconstruct(self):
        in_file = os.path.join('data', '1eye.pdb')
        ref_file = os.path.join('data', '1eye_rec.pdb')
        # get and reconstruct 1eye
        prot = io.LoadPDB(in_file)
        modelling.ReconstructSidechains(prot, keep_sidechains=False)
        # compare with reference solution
        prot_rec = io.LoadPDB(ref_file)
        self.assertEqual(prot.GetAtomCount(), prot_rec.GetAtomCount())

if __name__ == "__main__":
    from ost import testutils
    testutils.RunTests()

To hook up your tests with make codetest (and to create a test_reconstruct_sidechains.py_run target), everything has to be introduced to CMake. First, tell CMake to search tests for a CMakeLists.txt file by extending the list of sub-directories in sidechain/CMakeLists.txt:

add_subdirectory(pymod)
add_subdirectory(doc)
add_subdirectory(tests)

Then fill sidechain/tests/CMakeLists.txt with your new test script and make will recognise the changes next time it is run and fix the rest for you.

set(SIDECHAIN_UNIT_TESTS
  test_reconstruct_sidechains.py
)

set(SIDECHAIN_TEST_DATA
  data/1eye.pdb
  data/1eye_rec.pdb
)

promod3_unittest(MODULE sidechain
                 SOURCES "${SIDECHAIN_UNIT_TESTS}"
                 DATA "${SIDECHAIN_TEST_DATA}")

Note how we listed the test data that we require in the unit test by defining SIDECHAIN_TEST_DATA.

Now tests should be available by make check, make codetest and make test_reconstruct_sidechains.py_run.

How To Start Your Own Action

In ProMod3 we call scripts/programs ‘actions’. They are started by a launcher found in your staging directory at stage/bin/pm. The launcher sets up the shell environment so that your job runs properly. So usually you will start an action by

$ stage/bin/pm help

To start your own action, follow How To Start Your Own Module up to the point of creating a directory structure for a new module. Also use a dedicated branch for action development. There you can produce intermediate commits while other branches stay clean in case you have to do some work there which needs to be published.

After preparing your repository, it's time to create a file for the action. This works a bit differently than for modules. Assuming we are sitting in the repository’s root:

$ touch actions/pm-awesome-action
$ chmod +x actions/pm-awesome-action

Two things are important here. First, actions are prefixed with pm- so that they are recognised by the pm launcher. Second, action files need to be executable, and the permission change does not propagate if you make it after the first call to make.

To get the new action recognised by make to be placed in stage/libexec/promod3, it has to be registered with CMake in actions/CMakeLists.txt:

add_custom_target(actions ALL)
add_subdirectory(tests)

pm_action_init()
pm_action(pm-build-rawmodel actions)
pm_action(pm-help actions)
pm_action(pm-awesome-action actions)

Just add your action with its full filename with a call to pm_action() at the end of the file.

Before coding your action, let's set up unit tests for it. Usually when adding features, you will immediately try them, check that everything works as intended, etc. ProMod3 helps you automate those tests so that it's rather easy to check later whether code changes break anything. For actions, we are using test_actions.ActionTestCase instead of unittest.TestCase. Since testing has a lot in common for different actions, we decided to put a little wrapper around this subject. See the documentation of ActionTestCase for more information.

Now it's time to fill your action with code. Instead of reading a lot more explanations, it should be easy to go by the examples in the actions directory. There are only two really important points:

  • No shebang line (#! /usr/bin/python) in your action! Also no #! /usr/bin/env python or anything like this. This may lead to funny side effects, like calling a Python interpreter from outside a virtual environment or a different version of OST. Basically it may mess up the environment your action is running in. Actions are called by pm, that’s enough to get everything just right.

  • The action of your action happens in the __main__ branch of the script. Your action will have its own function definitions, variables and all the bells and whistles. Hiding behind __main__ keeps everything separated and makes things easier when it comes to debugging. So just after

    import alot
    
    def functions_specific_to_your_action(...):
    
    if __name__ == "__main__":
        <put together what your action should do here>
    

    start putting your action together.
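The two points above can be sketched as a small, self-contained script. This is only an illustration under stated assumptions: the --name option, build_greeting() and the use of the standard argparse module are hypothetical and not taken from an existing ProMod3 action, which may rely on project-specific helpers for argument parsing. Check the actions directory for the conventions actually in use.

```python
# Deliberately no shebang line: the pm launcher provides the interpreter.
import argparse
import sys


def parse_args(argv):
    """Parse the command line of this (hypothetical) action."""
    parser = argparse.ArgumentParser(prog="pm awesome-action")
    parser.add_argument("--name", default="world", help="whom to greet")
    return parser.parse_args(argv)


def build_greeting(name):
    """Function specific to this action; importable without side effects."""
    return "Hello, {}!".format(name)


def main(argv):
    """Put together what the action does and return an exit status."""
    opts = parse_args(argv)
    print(build_greeting(opts.name))
    return 0


if __name__ == "__main__":
    # Nothing above runs on import; the action itself only happens here.
    main(sys.argv[1:])
```

Keeping the work in functions and calling them only from the __main__ branch means a unit test can import the file and call build_greeting() or main() directly.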

How To Write Your Own Scorer

The scoring module contains several classes to make it easy to add new scorers. As usual, you can use existing bits and pieces as examples and try to be consistent with it. Here, we quickly give an overview of the separation of concerns:

  • BackboneScorer: Defines the scorer with all its parameters and energies and the functionality to compute scores. Scorers are set up by the user (or loaded from disk) if necessary.

    Scorers do not store any environment data. If needed, they can be linked via pointers to env. data kept and updated by the score env. They may also be linked to a score env. listener to handle specially organized data.

  • BackboneScoreEnv: Handles all model-specific data used by the scorers. The user sets up the environment and updates it whenever something changes.

    Residue-specific data is kept in arrays of fixed size (see IdxHandler for how the indexing is done). An array of bool-like integers can be accessed with “GetEnvSetData()” and used to determine whether env. data is available for a certain residue. The list of sequences handled by the env. is fixed as otherwise pointers into the data-storage would be invalidated.

  • BackboneScoreEnvListener: Allows score-specific data to be extracted from the model-specific data available in the score env. class. It is commonly used to define spatially organized structures to quickly access other atoms within a given radius.

    All score env. listeners are attached to a master score env. and are updated whenever the score env. is updated. Multiple scorers can use the same listener. Listeners are not accessible to anyone outside of the scorers and the score env. object responsible for them. Since the user doesn’t see them, there is no Python API for them.

  • IdxHandler: This takes care of translating chain indices (range [0, GetNumChains() - 1]) and residue numbers (range [1, GetChainSize(chain_idx)]) into the indexing used internally by the score env. (range [0, GetNumResidues() - 1]). The score env. takes care of this object and makes it accessible for scorers.
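To make the index translation concrete, here is a toy Python sketch. It is not the ProMod3 class: ToyIdxHandler, its method names and its error handling are illustrative assumptions, and it assumes zero-based chain indices, one-based residue numbers and a zero-based flat index in which the chains are stored one after the other.

```python
class ToyIdxHandler:
    """Toy stand-in for the index translation described above.

    Not the ProMod3 class: names and error handling are illustrative only.
    """

    def __init__(self, chain_sizes):
        self._chain_sizes = list(chain_sizes)
        # Start index of every chain in the flat, env-internal array.
        self._chain_starts = []
        total = 0
        for size in self._chain_sizes:
            self._chain_starts.append(total)
            total += size
        self._num_residues = total

    def get_num_chains(self):
        return len(self._chain_sizes)

    def get_chain_size(self, chain_idx):
        return self._chain_sizes[chain_idx]

    def get_num_residues(self):
        return self._num_residues

    def to_idx(self, resnum, chain_idx):
        """Map a 1-based residue number in a 0-based chain to a flat index."""
        if not 0 <= chain_idx < self.get_num_chains():
            raise IndexError("invalid chain index")
        if not 1 <= resnum <= self.get_chain_size(chain_idx):
            raise IndexError("invalid residue number")
        return self._chain_starts[chain_idx] + resnum - 1
```

With two chains of sizes 3 and 2, residue 1 of chain 1 maps to flat index 3 in this sketch, directly after the three residues of chain 0.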

As an example, let’s look at the CBPackingScorer:

  • it contains score-specific parameters and energies which can be either set manually or loaded from disk

  • it is linked to a score env. listener of type CBetaEnvListener, which provides a FindWithin() function to quickly access neighboring CB atoms (note that the same listener is also used by the CBetaScorer)

  • a pointer to the IdxHandler object of the score env. is extracted when the environment is attached and is used to get sequence-specific data when calculating the score
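The relationship between a score env., a listener and a neighbour query can be sketched in a few lines of Python. Everything here is a toy under stated assumptions: ToyEnv, ToyCBetaListener, on_update() and find_within() are hypothetical names, and the brute-force distance search merely stands in for whatever spatially organized structure the real listener uses.

```python
class ToyEnv:
    """Toy score environment: owns positions and notifies its listeners."""

    def __init__(self):
        self.positions = {}  # flat residue index -> (x, y, z)
        self._listeners = []

    def attach_listener(self, listener):
        self._listeners.append(listener)

    def set_position(self, res_idx, pos):
        # Update the env. data first, then forward the change.
        self.positions[res_idx] = pos
        for listener in self._listeners:
            listener.on_update(res_idx, pos)


class ToyCBetaListener:
    """Toy listener: keeps its own copy of the data for distance queries."""

    def __init__(self):
        self._points = {}

    def on_update(self, res_idx, pos):
        self._points[res_idx] = pos

    def find_within(self, pos, radius):
        """Return sorted residue indices within radius of pos (brute force)."""
        radius_sq = radius * radius
        hits = []
        for idx, point in self._points.items():
            dist_sq = sum((a - b) ** 2 for a, b in zip(point, pos))
            if dist_sq <= radius_sq:
                hits.append(idx)
        return sorted(hits)
```

Several toy scorers could hold a reference to the same ToyCBetaListener instance, which mirrors the note above that one listener can serve more than one scorer.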

As a second example, look at the PairwiseScorer:

  • it does not require any score-specific setup

  • it is linked to residue-specific CA/CB positions and the pairwise functions defined in the score env.

  • “GetEnvSetData()” of the score env. is used to determine if env. data is available for a given residue

Quick testing of ProMod3 features

High-level features of ProMod3 can be tested directly in an interactive Python shell. First, tell Python where to find the modules by setting the PYTHONPATH environment variable in your shell to include the lib64/python3.6/site-packages folders of the stage folders of ProMod3 and OST. For convenience, you can place the export command in your .bashrc or a similar startup file. Then you can import modules from promod3 and ost as in the example code shown in this documentation.
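As a rough sketch of how the PYTHONPATH value can be assembled, the following Python helpers build the two site-packages folders and prepend them to an existing value. The function names and the example stage locations are hypothetical; only the lib64/python3.6/site-packages layout is taken from the paragraph above, and the Python version may differ on your system.

```python
import os


def site_packages_dirs(promod3_stage, ost_stage, python_version="3.6"):
    """Return the site-packages folders of both stage folders."""
    sub = os.path.join("lib64", "python" + python_version, "site-packages")
    return [os.path.join(promod3_stage, sub), os.path.join(ost_stage, sub)]


def extend_pythonpath(current, new_dirs):
    """Prepend new_dirs to a PYTHONPATH string, skipping duplicates."""
    parts = [p for p in current.split(os.pathsep) if p]
    # Insert in reverse so that new_dirs keep their order at the front.
    for directory in reversed(new_dirs):
        if directory not in parts:
            parts.insert(0, directory)
    return os.pathsep.join(parts)
```

The string returned by extend_pythonpath(os.environ.get("PYTHONPATH", ""), dirs) is what you would put into the export command.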

To test low-level C++ features, you can copy the extras/test_code folder and adapt it for your purposes. First, you will have to fix the paths to ProMod3 and OST in the Makefile by changing the following lines:

# path to OST and ProMod3 stage
OST_ROOT = <DEFINEME>/ost/build/stage
PROMOD3_ROOT = <DEFINEME>/ProMod3/build/stage

Afterwards, you should be able to compile and run small sample codes that use ProMod3 and OST as in the test.cc example. You can compile your code by executing make and run it with make run. Also, remember to set the PROMOD3_SHARED_DATA_PATH variable if you moved the stage folder.

Unit Tests

Of course your code should contain tests. But we cannot give an elaborate tutorial on unit testing here. Again, have a look at how other modules treat this topic; there is also quite a lot of educational material to be found on the Internet. Nevertheless, here is a short list of the most important pieces of advice:

  • Tests go into dedicated scripts/source files in the tests directory

  • No external data dependencies, if tests need data, they find it in tests/data

  • If ‘exotic’ Python modules are used, consider making the test aware of the possibility that the module is not available

  • Tests do not fail on purpose

  • No failing tests that are tolerated because ‘this does not affect anything’
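One way to make a test aware of a possibly missing module is to skip it instead of letting it fail. The sketch below uses only the standard unittest and importlib modules; some_exotic_module is a placeholder name, and run_suite() is a hypothetical helper added so the example can be tried outside of make.

```python
import importlib.util
import unittest

# 'some_exotic_module' is a placeholder for an optional dependency.
HAVE_EXOTIC = importlib.util.find_spec("some_exotic_module") is not None


class ExoticFeatureTests(unittest.TestCase):
    @unittest.skipUnless(HAVE_EXOTIC, "some_exotic_module is not installed")
    def testWithExoticModule(self):
        import some_exotic_module  # only reached when the module exists
        self.assertTrue(hasattr(some_exotic_module, "__name__"))

    def testWithoutExoticModule(self):
        # Tests that do not need the optional module still run.
        self.assertEqual(1 + 1, 2)


def run_suite():
    """Run the tests above and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(ExoticFeatureTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

A skipped test is reported as skipped rather than failed, so the suite stays green on systems without the optional module.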

To run the whole test suite, make check is enough. This will also trigger the doctest and linkcheck targets. Alternatively you can run:

  • make codetest to run only unit tests from all modules in ProMod3. Note that make check does nothing more than invoke doctest, linkcheck and codetest as dependencies.

  • make check_xml to run tests without stopping after each failure. Failures are briefly reported on the command line and the result of each test is written to ‘PYTEST-<TestCaseName>.xml’ files in the ‘tests’ subfolders of your ‘build’ folder.

  • Run single tests: assuming you have your_module/tests/test_awesome_feature.py, CMake will provide you with a target test_awesome_feature.py_run. If your module has C++ tests, those will be available by test_suite_your_module_run.
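The naming scheme for single-test targets can be written down as two small Python helpers. This is only a sketch of the convention stated in the list above; the function names are hypothetical and nothing here is provided by ProMod3 itself.

```python
import os


def python_test_target(test_path):
    """Return the make target for a single Python test script."""
    return os.path.basename(test_path) + "_run"


def cpp_test_target(module_name):
    """Return the make target for the C++ tests of a module."""
    return "test_suite_{}_run".format(module_name)
```

For example, python_test_target("your_module/tests/test_awesome_feature.py") yields the target name to pass to make from within the build directory.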

Writing Documentation

To create documentation, we use Sphinx to go from reStructuredText (reST) files and API documentation in source files to HTML or man pages.

For each module, at least one reST document exists, that gives an idea of concepts and pulls in interfaces from source. Copying files to the build directory, issuing the Sphinx call and everything else that is needed to create the actual documentation is done by CMake and its makefiles. Hence, the CMakeLists.txt of the doc directory of a module is crucial. For documentation which does not relate to a particular module, the repository comes with a top-level doc directory.

If you write new functionality for ProMod3, or fix bugs, feel free to extend the CHANGELOG file. It will be automatically pulled into the documentation.

It is highly recommended to add code examples to your documentation. For that purpose, write a fully runnable script and place it in the doc/tests/scripts directory. The script must be runnable from within the doc/tests directory as pm SCRIPTPATH and may use data stored in the doc/tests/data directory. The script and any data it needs must then be referenced in the doc/tests/CMakeLists.txt file. Afterwards, they can be included in the documentation using the literalinclude directive. For instance, if you add a new example code loop_main.py, you would add it to your module documentation as follows:

.. literalinclude:: ../../../tests/doc/scripts/loop_main.py

If your example does not relate to a specific module and the documentation is in the top-level doc directory, you need to drop one of the .. as follows:

.. literalinclude:: ../../tests/doc/scripts/hello_world.py

To ensure that the code examples keep on working, a unit test has to be defined in doc/tests/test_doctests.py. Each example code is run by a dedicated test function. Usually, the codes are run and the return code is checked. Command-line output or resulting files can also be checked (see existing test codes for examples). A few more guidelines for example codes:

  • If it generates files as output, please delete them after checking them.

  • If it requires external modules or binaries, check for their availability. If the external dependencies are not available, output a warning and skip the test.

A copy of the generated HTML documentation is kept in doc/html so that there is no need to compile ProMod3 to read it. Our policy is to keep that folder in sync with the latest documentation at least on the master branch (i.e. for every release). You can use the following commands to do the update:
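The guidelines above can be sketched as a generic test function. This is not the content of doc/tests/test_doctests.py: the stand-in script, run_example() and the required_binary attribute are hypothetical, and sys.executable replaces the pm launcher so that the sketch runs without a ProMod3 installation.

```python
import os
import shutil
import subprocess
import sys
import tempfile
import unittest

# Stand-in example script; a real one would live in doc/tests/scripts.
EXAMPLE_CODE = (
    "import sys\n"
    "with open(sys.argv[1], 'w') as out:\n"
    "    out.write('done')\n"
    "print('ok')\n"
)


def run_example(script_path, args=()):
    """Run an example script and return (return code, stdout)."""
    # Assumption: sys.executable stands in for 'pm SCRIPTPATH'.
    cmd = [sys.executable, script_path] + list(args)
    proc = subprocess.run(cmd, stdout=subprocess.PIPE,
                          stderr=subprocess.PIPE, universal_newlines=True)
    return proc.returncode, proc.stdout


class ExampleCodeTests(unittest.TestCase):
    # Name of an external binary the example needs, or None.
    required_binary = None

    def setUp(self):
        if self.required_binary and shutil.which(self.required_binary) is None:
            self.skipTest("{} is not available".format(self.required_binary))

    def testExampleScript(self):
        workdir = tempfile.mkdtemp()
        try:
            script = os.path.join(workdir, "example.py")
            out_file = os.path.join(workdir, "result.txt")
            with open(script, "w") as handle:
                handle.write(EXAMPLE_CODE)
            return_code, stdout = run_example(script, [out_file])
            self.assertEqual(return_code, 0)
            self.assertEqual(stdout.strip(), "ok")
            self.assertTrue(os.path.exists(out_file))
        finally:
            # Delete generated files after checking them.
            shutil.rmtree(workdir)
```

The try/finally block removes the output even when an assertion fails, and the skip message serves as the warning for a missing dependency.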

$ cd <PROMOD3_PATH>/build
$ make html
$ rsync -iv -az --exclude=".*" --delete \
        "stage/share/promod3/html/" "../doc/html"

Third Party Contributions (License Issues)

For some tasks you may want to make use of third party contributions in your module, for example

  • calling/using the output of third party binaries

  • external libraries

  • smallish bits of source code included into the ProMod3 directory tree

  • Python modules not distributed as part of the Python standard library

Modules from the Python standard library are covered by the Python license, and licenses are what you have to watch out for with this subject. While the Python license is safe to use, several projects in the past became restrictive because of exclusive terms of use. Those issues often came from ‘academic licenses’, which allow use free of charge except for commercial entities. Preventing this is one reason for the existence of ProMod3. This means that before utilising external code, third party libraries, or basically anything not created within this project (including pictures, test data, etc.), you must check its licensing. Items without any license cannot be used at all. Those things are not ‘free’ but rather in a legally uncertain state. Licenses which exclude commercial entities are also forbidden.

There are a lot of rather permissive licenses out there, very often asking for acknowledgements. We definitely support this. Either go by example phrases suggested in the license itself or write a suitable paragraph yourself and place it in the documentation. We should also not forget to promote those contributions on web pages using ProMod3.
