pm3argparse - Parsing Command Lines

Introduction

Many ProMod3 actions share a set of command line parameters/arguments. For example, an input alignment is needed quite often, and for an alignment we usually need to know which sequence is the target, what identifies a template sequence and possibly a hint on the format. That means the same command line functionality is needed in several actions. PM3ArgumentParser simplifies this: it provides a set of standard arguments that you just need to activate for your action, and it comes with some verification functionality for the input.

"""
Place the description of your script right in the file and import
it via '__doc__' as description to the parser ('-h', '--help').
"""

from promod3.core import pm3argparse

# make sure we see output when passing '-h'
import ost
ost.PushVerbosityLevel(2) 

parser = pm3argparse.PM3ArgumentParser(__doc__)
parser.AddAlignment()
parser.AssembleParser()
opts = parser.Parse()

When the example above is run with pm and no additional arguments, the script exits with code 2. When it is run with the additional argument -h or --help, it exits with code 0 and displays the help message taken from the docstring of your script.
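
Both exit codes match standard argparse behaviour, which PM3ArgumentParser inherits as a subclass. The following is a minimal sketch with plain argparse as a stand-in, so it runs without ProMod3. The helper exit_code_for and the reduced option set are hypothetical, and the sketch assumes that one kind of alignment input is required.

```python
import argparse


def exit_code_for(argv):
    """Return the exit code a stand-in parser produces for argv."""
    parser = argparse.ArgumentParser(description="Stand-in for an action.")
    # Assumption: exactly one kind of alignment input must be given.
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("-f", "--fasta", action="append")
    group.add_argument("-c", "--clustal", action="append")
    try:
        parser.parse_args(argv)
    except SystemExit as exc:  # argparse exits instead of returning
        return exc.code
    return 0


print(exit_code_for([]))                 # required input is missing
print(exit_code_for(["--help"]))         # help is printed, then exit
print(exit_code_for(["-f", "a.fasta"]))  # parsing succeeds
```

Catching SystemExit like this is also a convenient way to check exit codes in unit tests for an action.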

Argument Parser

class promod3.core.pm3argparse.PM3ArgumentParser(description, action=True)

This class is a child of argparse.ArgumentParser. It provides a set of standard arguments which can be activated with Add*() methods and then assembled with AssembleParser(). This helps maintain a common naming scheme throughout all ProMod3 actions. As a real extension, this subclass checks input parameters on Parse(). Besides this, everything you can do with a ‘real’ ArgumentParser instance is possible here.

Attributes beyond argparse.ArgumentParser:

action

Indicates if the calling script is a ProMod3 action.

Type:

bool

__init__(description, action=True)

Create a new instance of PM3ArgumentParser.

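Parameters:

description (str) – Help text for the script, used as the description of the parser and shown with -h/--help.

action (bool) – Indicates if the calling script is a ProMod3 action.
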
Returns:

argparse.ArgumentParser.

AddAlignment(allow_multitemplate=False)

Commandline options for alignments.

Activate everything needed to load alignments to the argument parser. Command line arguments are then added in AssembleParser() and the input is post processed and checked in Parse().

Parameters:

allow_multitemplate (bool) – enable support for multitemplate alignments

Options/arguments added:

  • -f/--fasta <FILE> - Target-template alignment in FASTA format. Target sequence is either named “trg” or “target” or the first sequence is used. File can be plain or gzipped.

  • -c/--clustal <FILE> - Target-template alignment in CLUSTAL format. Target sequence is either named “trg” or “target” or the first sequence is used. File can be plain or gzipped.

  • -j/--json <OBJECT>|<FILE> - Alignments provided as JSON file/object. File can be plain or gzipped.

See here for details on the file formats.
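
The rule for picking the target sequence in FASTA and CLUSTAL input can be sketched as follows. The helper find_target_index is hypothetical and only illustrates the rule stated above; whether the name comparison is exact, as assumed here, is not specified in this section.

```python
def find_target_index(names):
    """Return the index of the target among the sequence names."""
    # Assumption: names are compared exactly against "trg" and "target".
    hits = [i for i, name in enumerate(names) if name in ("trg", "target")]
    if len(hits) > 1:
        # Corresponds to the case reported as exit code 17.
        raise ValueError("multiple target sequences found in alignment")
    # Fall back to the first sequence if no name matches.
    return hits[0] if hits else 0


print(find_target_index(["2abc.A", "target"]))
print(find_target_index(["query", "2abc.A"]))
```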

Attributes added to the namespace returned by Parse():

  • fasta - filled with the input of the --fasta option, a list of str (filenames).

  • clustal - filled with the input of the --clustal option, a list of str (filenames).

  • json - filled with the input of the --json option, a list of str, where each string may be a filename or a JSON object string.

  • alignments - ost.AlignmentList, same order as given. The first sequence of each alignment is the target sequence. If in doubt, check for the sequence roles TARGET or TEMPLATE.

  • aln_sources - list of str with the original source(s) of the alignment: may be filename(s) or JSON strings.

Exit codes related to alignment input:

  • 12 - a given alignment file does not exist

  • 13 - never raised (parameter for checking gzip files)

  • 14 - gzip file cannot be opened

  • 15 - found an empty alignment file

  • 16 - unsupported number of sequences in alignment: only 1 sequence or (unless allow_multitemplate = True) more than 2 sequences

  • 17 - multiple target sequences found in alignment

  • 18 - error when reading fasta/clustal file

  • 19 - problem with a JSON formatted file handed over to --json

  • 20 - JSON file could not be decoded into a JSON object

  • 21 - JSON object has no ‘alignmentlist’ key

  • 22 - JSON object has no ‘target’/’template’ in the ‘alignmentlist’

  • 23 - JSON string could not be decoded

  • 24 - JSON object ‘alignmentlist’ does not point to a list

  • 25 - JSON object ‘alignmentlist’ member is not a dictionary

  • 26 - JSON object ‘alignmentlist’ ‘target’/’template’ does not point to a dictionary

  • 27 - JSON object ‘alignmentlist’ ‘target’/’template’ does not have a needed key

  • 28 - JSON object ‘alignmentlist’ ‘target’/’template’ has a value of wrong type
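
The structural checks behind some of the JSON exit codes can be pictured as a chain of type and key tests. The sketch below uses only the standard json module. The function check_alignment_json is hypothetical, the order of the checks is an assumption, and codes 27 and 28 are left out because the needed keys and value types are not listed in this section.

```python
import json


def check_alignment_json(text):
    """Return 0 if text passes the structural checks, else an exit code."""
    try:
        obj = json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return 23
    # Assumption: a top-level value that is not an object counts as
    # "no 'alignmentlist' key".
    if not isinstance(obj, dict) or "alignmentlist" not in obj:
        return 21
    members = obj["alignmentlist"]
    if not isinstance(members, list):
        return 24
    for member in members:
        if not isinstance(member, dict):
            return 25
        for role in ("target", "template"):
            if role not in member:
                return 22
            if not isinstance(member[role], dict):
                return 26
    return 0


print(check_alignment_json('{"alignmentlist": [{"target": {}}]}'))
```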

AddFragments()

Commandline option for usage of Fragments

Activate everything needed to set up promod3.modelling.FraggerHandle objects in the argument parser. Command line arguments are then added in AssembleParser() and the input is post-processed and checked in Parse().

Options/arguments added:

  • -r/--use-fragments - Boolean flag whether to set up fragger handles.

Notes:

  • Fragger handles are set up to identify fragments in a promod3.loop.StructureDB.

  • If no profiles are provided as additional argument (-s/--seqprof <FILE>), fragments are identified based on BLOSUM62 sequence similarity.

  • If you provide profiles that are not in hhm format, fragments are identified based on BLOSUM62 sequence similarity, sequence profile scoring and structural profile scoring.

  • If you provide profiles in hhm format (optimal case), psipred predictions are fetched from there and fragments are identified based on secondary structure agreement, secondary structure dependent torsion probabilities, sequence profile scoring and structure profile scoring.
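
The three cases in the notes above amount to a small decision table. A minimal sketch, assuming the hhm format is recognized by the file ending as described for -s/--seqprof; the function fragment_score_terms is hypothetical and only returns labels, it does not set up any fragger handle.

```python
def fragment_score_terms(profile_path=None):
    """Return the score terms implied by the kind of profile given."""
    if profile_path is None:
        # No profile: sequence similarity only.
        return ["BLOSUM62 sequence similarity"]
    if profile_path.endswith((".hhm", ".hhm.gz")):
        # hhm profiles also carry psipred predictions (optimal case).
        return [
            "secondary structure agreement",
            "secondary structure dependent torsion probabilities",
            "sequence profile scoring",
            "structure profile scoring",
        ]
    # Any other profile format.
    return [
        "BLOSUM62 sequence similarity",
        "sequence profile scoring",
        "structural profile scoring",
    ]


for path in (None, "target.pssm", "target.hhm"):
    print(path, len(fragment_score_terms(path)))
```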

Attributes added to the namespace returned by Parse():

Exit codes related to fragments input:

  • 56 - cannot read psipred prediction from hhm file

AddProfile()

Commandline options for profiles

Activate everything needed to load profiles to the argument parser. Command line arguments are then added in AssembleParser() and the input is post processed and checked in Parse().

Options/arguments added:

  • -s/--seqprof <FILE> - Sequence profile in any format readable by the ost.io.LoadSequenceProfile() method. Format is chosen by file ending. Recognized file extensions: .hhm, .hhm.gz, .pssm, .pssm.gz. Consider using ost.bindings.hhblits.HHblits.A3MToProfile() if you have a file in a3m format at hand.

Notes:

  • profiles are mapped by exact match to the gapless target sequences, so one profile is mapped to several chains in the case of homo-oligomers

  • every profile must have a unique sequence to avoid ambiguities

  • all or nothing - you cannot provide profiles for only a subset of target sequences

Attributes added to the namespace returned by Parse():

Exit codes related to profile input:

  • 51 - a given profile file does not exist

  • 52 - failure to read a given profile file

  • 53 - a profile cannot be mapped to any target sequence

  • 54 - profile sequences are not unique

  • 55 - only subset of target sequences is covered by profile
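
The mapping rules and exit codes 53 to 55 can be sketched in plain Python on sequence strings. Everything named here (ProfileError, map_profiles) is hypothetical, the order of the checks is an assumption, and gaps are assumed to be written as "-".

```python
class ProfileError(Exception):
    """Hypothetical error carrying one of the documented exit codes."""

    def __init__(self, code, message):
        super().__init__(message)
        self.code = code


def map_profiles(target_seqs, profile_seqs):
    """Return, for each target sequence, the index of its profile."""
    if not profile_seqs:
        return []  # providing no profiles at all is allowed
    if len(set(profile_seqs)) != len(profile_seqs):
        raise ProfileError(54, "profile sequences are not unique")
    gapless = [seq.replace("-", "") for seq in target_seqs]
    for prof in profile_seqs:
        if prof not in gapless:
            raise ProfileError(53, "profile matches no target sequence")
    if any(seq not in profile_seqs for seq in gapless):
        raise ProfileError(55, "only a subset of targets is covered")
    return [profile_seqs.index(seq) for seq in gapless]


# Homo-dimer plus one other chain: the first two targets share a profile.
print(map_profiles(["AC-D", "ACD", "EF"], ["EF", "ACD"]))
```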

AddStructure(attach_views=False)

Commandline options for structures.

Activate everything needed to load structures to the argument parser. Command line arguments are then added in AssembleParser() and the input is post processed and checked in Parse().

Parameters:

attach_views (bool) – if True: attach views to alignments. Requires call to AddAlignment(). Chains for each sequence are identified based on the sequence name of the templates in the alignments (see here for details).

Options/arguments added:

  • -p/--pdb <FILE> - Structure in PDB format. File can be plain or gzipped.

  • -e/--entity <FILE> - Structure in any format readable by the ost.io.LoadEntity() method. Format is chosen by file ending. Recognized File Extensions: .ent, .pdb, .ent.gz, .pdb.gz, .cif, .cif.gz.

Notes:

  • one of the inputs must be given, and only one type of input is accepted

  • callable multiple times (structures appended in given order)

Attributes added to the namespace returned by Parse():

  • pdb - filled with the input of the --pdb option, a list of str (filenames).

  • entity - filled with the input of the --entity option, a list of str (filenames).

  • structures - list of ost.EntityHandle, same order as given.

  • structure_sources - list of str with the original filenames of the structures.

Exit codes related to structure input:

  • 32 - a given structure file does not exist

  • 33 - failure to read a given structure file

  • 34 - file ending is not a supported format
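
Exit codes 32 and 34 can be pictured as two checks that run before a structure is loaded. A minimal sketch: the helper check_structure_path, its exists parameter and the order of the checks are assumptions for illustration, and code 33 is left out because it needs the actual loader.

```python
import os

# File endings listed for the -e/--entity option.
SUPPORTED_ENDINGS = (".ent", ".pdb", ".ent.gz", ".pdb.gz", ".cif", ".cif.gz")


def check_structure_path(path, exists=os.path.isfile):
    """Return 0 if path looks loadable, else the matching exit code."""
    if not exists(path):
        return 32  # a given structure file does not exist
    if not path.endswith(SUPPORTED_ENDINGS):
        return 34  # file ending is not a supported format
    return 0


# The exists parameter allows checking the logic without real files.
print(check_structure_path("model.cif.gz", exists=lambda p: True))
print(check_structure_path("model.xyz", exists=lambda p: True))
```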

Exit codes if attach_views = True:

  • 41 - attach_views used without adding alignments

  • 42 - inconsistent offsets between seq. name and seq. in alignment

  • 43 - non-integer offset defined in seq. name

  • 44 - too many “|” in seq. name

  • 45 - chain to attach to sequence could not be identified

AssembleParser()

When adding options via the Add*() methods, call this after you are done. Everything before just tells the parser that it should contain those option sets but does not actually add anything. AssembleParser() will put everything in place, in the right order and with the right constraints.
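
The two-step pattern of requesting option sets first and adding them later can be sketched with a small argparse subclass. DeferredParser is a hypothetical stand-in, not the ProMod3 implementation, and it registers only one option per set.

```python
import argparse


class DeferredParser(argparse.ArgumentParser):
    """Hypothetical stand-in showing deferred option assembly."""

    def __init__(self, description):
        super().__init__(description=description)
        self._requested = set()

    def AddAlignment(self):
        self._requested.add("alignment")  # only record the request

    def AddStructure(self):
        self._requested.add("structure")

    def AssembleParser(self):
        # Fixed order, independent of the order of the Add*() calls.
        if "alignment" in self._requested:
            self.add_argument("-f", "--fasta", action="append")
        if "structure" in self._requested:
            self.add_argument("-p", "--pdb", action="append")


parser = DeferredParser("Stand-in for an action.")
parser.AddStructure()
parser.AddAlignment()
parser.AssembleParser()
opts = parser.parse_args(["-p", "model.pdb", "-f", "aln.fasta"])
print(opts.fasta, opts.pdb)
```

Deferring the add_argument() calls keeps the order of options in the help output the same across scripts, regardless of the order in which a script activates them.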

Parse(args=None)

Parse an argument string. See Add*() methods.

Options/arguments added by default: -h/--help shows usage.

General exit codes:

  • 1 - an unhandled exception was raised

  • 2 - arguments cannot be parsed or required arguments are missing

Parameters:

args (list) – The list of argument strings. By default, sys.argv is used.

Returns:

Namespace filled with attributes (see Add*() methods).
