.. The config module documentation master file, created by
   sphinx-quickstart on Wed Mar 3 19:15:42 2010.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

==========================================================
Configuring Python applications with the ``config`` module
==========================================================

:Release: |release|
:Date: May 11, 2010

.. module:: config
   :synopsis: A hierarchical configuration module for Python

.. moduleauthor:: Vinay Sajip
.. sectionauthor:: Vinay Sajip

.. toctree::
   :maxdepth: 2

Introduction
============

This document describes ``config``, a module for configuring Python programs
which aims to offer more power and flexibility than the existing
``ConfigParser`` module. Python programs which are designed as a hierarchy of
components can use ``config`` to configure their various components in a
uniform way. This module is expected to be used with Python versions >= 2.2.

A complete API is available_, and a test suite is included with the
distribution - see the "Download" section below for further details.

.. _available: http://www.red-dove.com/config/index.html

Simple Usage
============

The simplest scenario is, of course, "Hello, world". Let's look at a very
simple configuration file ``simple.cfg`` where a message to be printed is
configured::

    # The message to print (this is a comment)
    message: 'Hello, world!'

and the program which uses it::

    from config import Config

    # You can pass any file-like object; if it has a name attribute,
    # that name is used when file format error messages are printed
    f = file('simple.cfg')
    cfg = Config(f)
    print cfg.message

which results in the expected::

    Hello, world!

A configuration file is, at the top level, a list of key-value pairs. Each
value, as we'll see later, can be a sequence or a mapping, and these can be
nested without any practical limit.

In addition to attribute access (``cfg.message`` in the example above), you
can also access a value in the configuration using the ``getByPath`` method of
a configuration: ``cfg.getByPath('message')`` would be equivalent. The
parameter passed to ``getByPath`` is the path of the required value. The
``getByPath`` method is useful when the path is variable. It could even be
read from a configuration :-)

There is also a ``get`` method which acts like the dictionary method of the
same name - you can pass a default value which is returned if the value is not
found in the configuration. The ``get`` method works with dictionary keys or
attribute names, rather than paths. Hence, you may call
``cfg.getByPath('a.b')`` which is equivalent to ``cfg.a.b``, or you can call
``cfg.a.get('b', 1234)`` which will return ``cfg.a.b`` if it is defined, and
``1234`` otherwise.

Evaluating values
=================

So far, so obvious. Now suppose that, instead of always printing to
``stdout``, we need to print to either ``stdout`` or ``stderr``, depending on
the configuration. Then, the modified configuration file might look like
this::

    # The message to print (this is a comment)
    message: 'Hello, world!'
    # The stream to print to (comments are of course optional)
    stream: `sys.stderr`

Notice the use of backticks to indicate a special value. The corresponding
program would be::

    from config import Config

    # You can pass any file-like object; if it has a name attribute,
    # that name is used when error messages are printed
    f = file('simple.cfg')
    cfg = Config(f)
    # The cfg attributes correspond to the keys in the
    # configuration file
    print >> cfg.stream, cfg.message

with the same result as before::

    Hello, world!

Notice that the "``sys.stderr``" in backticks was apparently correctly
evaluated. This is not a special case, but a generalized mechanism; you can
provide any dotted-identifier expression in backticks and it will be evaluated
against a list of namespaces you specify.
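
The namespace lookup described above can be pictured with a short sketch. It
is not the module's implementation: ``resolve_dotted`` and ``default_ns`` are
hypothetical names used only for illustration, and the code is written for
Python 3, unlike the Python 2 listings in this document. It walks a dotted
identifier with ``getattr`` against each namespace in turn, so no ``eval`` is
involved.

```python
import os
import sys
from types import SimpleNamespace


def resolve_dotted(expr, namespaces):
    """Resolve a dotted identifier such as 'sys.stderr' (hypothetical helper).

    Each namespace is tried in order; the first one in which the whole
    attribute chain can be followed supplies the value.
    """
    parts = expr.split('.')
    if not all(part.isidentifier() for part in parts):
        raise ValueError('not a dotted identifier: %r' % expr)
    for ns in namespaces:
        value = ns
        try:
            for name in parts:
                value = getattr(value, name)
        except AttributeError:
            continue  # this namespace cannot satisfy the chain; try the next
        return value
    raise LookupError('unable to evaluate %r in the given namespaces' % expr)


# Assumption: a default namespace that exposes the sys and os modules.
default_ns = SimpleNamespace(sys=sys, os=os)
stream = resolve_dotted('sys.stderr', [default_ns])
separator = resolve_dotted('os.sep', [default_ns])
```

Anything that is not a plain chain of identifiers (a call, an index, an
operator) is rejected before any lookup takes place, which is the property
the restriction to dotted identifiers is meant to provide.
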
The reason for the dotted-identifier mechanism is to provide some security -
the system does **not** perform an unrestricted ``eval()``. By default, the
system supports the ``sys`` and ``os`` modules, which gives easy access to
environment variables (for example). If you change the configuration file
to::

    # The message to print (this is a comment)
    message: 'Hello, world!'
    # The stream to print to (comments are of course optional)
    stream: `sys.stderr`
    value: `handlers.DEFAULT_TCP_LOGGING_PORT`

and the program to::

    f = file('simple.cfg')
    cfg = Config(f)
    print >> cfg.stream, cfg.message, cfg.value

then running it as is would give rise to an error::

    config.ConfigResolutionError: unable to evaluate `handlers.DEFAULT_TCP_LOGGING_PORT` in the configuration's namespaces

because an appropriate namespace is not in the list. To rectify this, we
modify the program to::

    f = file('simple.cfg')
    cfg = Config(f)
    # Add lines to import a namespace and add it to the list of namespaces used
    import logging, logging.handlers
    cfg.addNamespace(logging)
    print >> cfg.stream, cfg.message, cfg.value

with a more satisfactory result::

    Hello, world! 9020

Dealing with repeating values and mappings
==========================================

The ``config`` module allows you to specify repeating values using syntax
which is very similar to Python's list syntax. You can also use syntax which
is almost identical to Python's dict syntax to specify mappings in the
configuration.

If the application is required to print a sequence of messages to
corresponding streams, you could use a configuration file like this::

    messages:
    [
      { stream : `sys.stderr`, message: 'Welcome' },
      { stream : `sys.stdout`, message: 'Welkom' },
      { stream : `sys.stderr`, message: 'Bienvenue' },
    ]

and the program would look like this::

    from config import Config

    f = file('simple.cfg')
    cfg = Config(f)
    for m in cfg.messages:
        print >> m.stream, m.message

Running the above would give what you would expect intuitively::

    Welcome
    Welkom
    Bienvenue

The preamble to the above example mentioned that the list and dict syntax of
the ``config`` module is very similar or almost identical to Python's. The
main differences are:

* You do not need to specify commas between dict and list entries - you can
  use either commas or line breaks to delineate the items.
* Strings need only be quoted if they contain characters other than
  identifier characters (alphanumerics and underscore).

The module is fairly liberal about whitespace and is not
indentation-sensitive::

    messages:
    [
      {
        stream : `sys.stderr`
        message: Welcome
        name: 'Harry'
      }
      {
        stream : `sys.stdout`
        message: Welkom
        name: 'Ruud'
      }
      {
        stream : `sys.stderr`
        message : Bienvenue
        name : Yves
      }
    ]

However, there is one area where whitespace can be significant; see below.

Handling cross-references
=========================

Sometimes there is a need to cross-reference one part of the configuration
from another. Suppose in the above configuration, the third message (the one
in French) needs to use the same stream as the English message, whatever
stream that might be. This can be expressed as follows::

    messages:
    [
      {
        stream : `sys.stderr`
        message: 'Welcome'
        name: 'Harry'
      }
      {
        stream : `sys.stdout`
        message: 'Welkom'
        name: 'Ruud'
      }
      {
        stream : $messages[0].stream
        message: 'Bienvenue'
        name: Yves
      }
    ]

The ``$`` syntax is used because the intent is similar to substitution:
``$messages[0].stream`` is replaced with the value to which it refers.

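
As a rough illustration of what following such a reference involves, the
sketch below resolves a path like ``messages[0].stream`` against plain Python
dicts and lists. It is only a stand-in under stated assumptions:
``resolve_reference`` is a hypothetical helper, not part of ``config``, the
streams are represented by strings, and the code targets Python 3.

```python
import re

# One path step: a name (optionally preceded by a dot) or an [index].
_STEP = re.compile(r'\.?([A-Za-z_]\w*)|\[(\d+)\]')


def resolve_reference(root, path):
    """Follow a path such as 'messages[0].stream' through dicts and lists."""
    value = root
    pos = 0
    while pos < len(path):
        match = _STEP.match(path, pos)
        if match is None:
            raise ValueError('bad reference path: %r' % path)
        name, index = match.groups()
        # A name selects a mapping key; a bracketed number a sequence item.
        value = value[name] if name is not None else value[int(index)]
        pos = match.end()
    return value


# Plain-data stand-in for the configuration shown above.
config = {
    'messages': [
        {'stream': 'stderr', 'message': 'Welcome', 'name': 'Harry'},
        {'stream': 'stdout', 'message': 'Welkom', 'name': 'Ruud'},
    ],
}
# The third entry reuses the first entry's stream instead of repeating it.
third_stream = resolve_reference(config, 'messages[0].stream')
```

Because the reference is followed when the value is needed, changing the
first entry's stream changes what the third entry sees as well.
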
The configuration with the ``$messages[0].stream`` reference works with the
program::

    from config import Config

    f = file('simple.cfg')
    cfg = Config(f)
    for m in cfg.messages:
        s = '%s, %s' % (m.message, m.name)
        try:
            print >> m.stream, s
        except IOError, e:
            print e

to give::

    Welcome, Harry
    Welkom, Ruud
    Bienvenue, Yves

However, if you change the file to::

    messages:
    [
      {
        stream : `sys.stdin`
        message: 'Welcome'
        name: 'Harry'
      }
      {
        stream : `sys.stdout`
        message: 'Welkom'
        name: 'Ruud'
      }
      {
        stream : $messages[0].stream
        message: 'Bienvenue'
        name: Yves
      }
    ]

and run the program again (note the change to ``sys.stdin``, which is bound to
cause an error if we try to write to it), you get two errors::

    (0, 'Error')
    Welkom, Ruud
    (0, 'Error')

This is because the stream for the third message is effectively the same as
that for the first message.

Note that in the above expression ``$messages[0].stream``, whitespace is
significant before the ``[``. This is so that we can distinguish between
``[ $a[1] ]`` (a sequence whose single element is the second element of the
sequence referenced as ``a``) and ``[ $a [1] ]`` (a two-element sequence whose
first element is the value referenced by ``a`` and whose second element is a
sequence with the single element which is the integer 1).

Using expressions
=================

Although calculations are not normally the preserve of configuration modules,
there are times when it is useful to express configuration values in terms of
others. For example, an overall time period may be specified and other
configuration values are fractions thereof. It may also be desirable to
perform other simple calculations declaratively, e.g. concatenation of
numerous file names to a base directory to get a final pathname. To support
this, the ``config`` module allows expressions involving ``+``, ``-``, ``*``,
``/`` and ``%`` to be used in a configuration. The ``+`` operator can be used
for string concatenation.
For example, the file::

    total_period : 100
    header_time: 0.3 * $total_period
    steady_time: 0.5 * $total_period
    trailer_time: 0.2 * $total_period
    base_prefix: '/my/app/'
    log_file: $base_prefix + 'test.log'

used with the program::

    from config import Config

    f = file('simple.cfg')
    cfg = Config(f)
    print "Header time: %d" % cfg.header_time
    print "Steady time: %d" % cfg.steady_time
    print "Trailer time: %d" % cfg.trailer_time
    print "Log file name: %s" % cfg.log_file

leads to the result::

    Header time: 30
    Steady time: 50
    Trailer time: 20
    Log file name: /my/app/test.log

Including configurations within others
======================================

You can include a configuration within another configuration at any point
where you would specify a value. The included configuration is treated as if
it were a dictionary at the inclusion point. Consider the configuration file
``app.cfg``::

    # application configuration
    app:
    {
      name : MyApplication
      base: '/path/to/app/logs/'
      # support team email address
      support_team: myappsupport
      mail_domain: '@my-company.com'
    }
    # logging for the app
    logging: @"logging.cfg"
    test: $logging.handler.email.from

The ``logging`` key in this configuration includes another file called
``logging.cfg``, which looks like this::

    # root logger configuration
    root:
    {
      level : DEBUG
      handlers : [$handlers.console, $handlers.file, $handlers.email]
    }
    # logging handlers
    handlers:
    {
      console: [
        # the class to instantiate
        StreamHandler,
        # how to configure the instance
        {
          # the logger level
          level : WARNING
          # the stream to use
          stream : `sys.stderr`
        }
      ]
      file: [ FileHandler, { filename: $app.base + $app.name + '.log', mode : 'a' } ]
      socket: [ `handlers.SocketHandler`, {
        host: localhost,
        # use this port for now
        port: `handlers.DEFAULT_TCP_LOGGING_PORT`} ]
      nt_eventlog: [`handlers.NTEventLogHandler`, { appname: $app.name, logtype : Application } ]
      email: [ `handlers.SMTPHandler`,
        { level: CRITICAL,
          host: localhost,
          port: 25,
          from: $app.name + $app.mail_domain,
          to: [$app.support_team + $app.mail_domain, 'QA' + $app.mail_domain, 'product_manager' + $app.mail_domain],
          subject: 'Take cover' } ]
    }
    # the loggers which are configured
    loggers:
    {
      "input" : { handlers: [$handlers.socket] }
      "input.xls" : { handlers: [$handlers.nt_eventlog] }
    }

Given the above, the program::

    from config import Config

    cfg = Config(file('app.cfg'))

    # save the whole configuration, then two of its parts
    # (named f so that the builtin ``file`` is not shadowed)
    f = open('test.txt', 'w')
    cfg.save(f)
    f.close()

    f = open('testlog.txt', 'w')
    cfg.logging.save(f)
    f.close()

    f = open('root.txt', 'w')
    cfg.logging.root.save(f)
    f.close()

    import logging, logging.handlers
    cfg.addNamespace(logging)

    print cfg.logging.loggers['input.xls'].handlers[0][0]
    print cfg.logging.handlers.console[1].stream
    print cfg['logging']['handlers']['console'][1]['stream']
    print cfg.logging.handlers.email[1]['from']
    x = cfg.logging.handlers.email[1].to
    print x
    for a in x:
        print a
    print x[0:2]
    print cfg.logging.handlers.file[1].filename

prints the following::

    logging.handlers.NTEventLogHandler
    <open file '<stderr>', mode 'w' at 0x0088E0A0>
    <open file '<stderr>', mode 'w' at 0x0088E0A0>
    MyApplication@my-company.com
    ['myappsupport@my-company.com', 'QA@my-company.com', 'product_manager@my-company.com']
    myappsupport@my-company.com
    QA@my-company.com
    product_manager@my-company.com
    ['myappsupport@my-company.com', 'QA@my-company.com']
    /path/to/app/logs/MyApplication.log

You will see from the code of the above program that there are a number of
ways of accessing portions of the configuration, and you will also see that
parts of the configuration have been written out.

Here is ``test.txt``, which was used when writing out the whole
configuration::

    # application configuration
    app :
    {
      name : 'MyApplication'
      base : '/path/to/app/logs/'
      # support team email address
      support_team : 'myappsupport'
      mail_domain : '@my-company.com'
    }
    # logging for the app
    logging :
    {
      # root logger configuration
      root :
      {
        level : 'DEBUG'
        handlers :
        [
          $handlers.console
          $handlers.file
          $handlers.email
        ]
      }
      # logging handlers
      handlers :
      {
        console :
        [
          # the class to instantiate
          StreamHandler
          # how to configure the instance
          {
            # the logger level
            level : 'WARNING'
            # the stream to use
            stream : `sys.stderr`
          }
        ]
        file :
        [
          FileHandler
          {
            filename : $app.base + $app.name + '.log'
            mode : 'a'
          }
        ]
        socket :
        [
          `handlers.SocketHandler`
          {
            host : 'localhost'
            # use this port for now
            port : `handlers.DEFAULT_TCP_LOGGING_PORT`
          }
        ]
        nt_eventlog :
        [
          `handlers.NTEventLogHandler`
          {
            appname : $app.name
            logtype : 'Application'
          }
        ]
        email :
        [
          `handlers.SMTPHandler`
          {
            level : 'CRITICAL'
            host : 'localhost'
            port : 25
            from : $app.name + $app.mail_domain
            to :
            [
              $app.support_team + $app.mail_domain
              'QA' + $app.mail_domain
              'product_manager' + $app.mail_domain
            ]
            subject : 'Take cover'
          }
        ]
      }
      # the loggers which are configured
      loggers :
      {
        input :
        {
          handlers :
          [
            $handlers.socket
          ]
        }
        'input.xls' :
        {
          handlers :
          [
            $handlers.nt_eventlog
          ]
        }
      }
    }
    test : $logging.handler.email.from

You will see that the entire configuration (including the included file) has
been written out, and that ordering and comments have been preserved.

If we examine ``testlog.txt``, into which the logging part of the
configuration was written, we see::

    # root logger configuration
    root :
    {
      level : 'DEBUG'
      handlers :
      [
        $handlers.console
        $handlers.file
        $handlers.email
      ]
    }
    # logging handlers
    handlers :
    {
      console :
      [
        # the class to instantiate
        StreamHandler
        # how to configure the instance
        {
          # the logger level
          level : 'WARNING'
          # the stream to use
          stream : `sys.stderr`
        }
      ]
      file :
      [
        FileHandler
        {
          filename : $app.base + $app.name + '.log'
          mode : 'a'
        }
      ]
      socket :
      [
        `handlers.SocketHandler`
        {
          host : 'localhost'
          # use this port for now
          port : `handlers.DEFAULT_TCP_LOGGING_PORT`
        }
      ]
      nt_eventlog :
      [
        `handlers.NTEventLogHandler`
        {
          appname : $app.name
          logtype : 'Application'
        }
      ]
      email :
      [
        `handlers.SMTPHandler`
        {
          level : 'CRITICAL'
          host : 'localhost'
          port : 25
          from : $app.name + $app.mail_domain
          to :
          [
            $app.support_team + $app.mail_domain
            'QA' + $app.mail_domain
            'product_manager' + $app.mail_domain
          ]
          subject : 'Take cover'
        }
      ]
    }
    # the loggers which are configured
    loggers :
    {
      input :
      {
        handlers :
        [
          $handlers.socket
        ]
      }
      'input.xls' :
      {
        handlers :
        [
          $handlers.nt_eventlog
        ]
      }
    }

which is just the logging configuration. If we examine ``root.txt``, we see
the portion relating to the root logger::

    level : 'DEBUG'
    handlers :
    [
      $handlers.console
      $handlers.file
      $handlers.email
    ]

Changing a configuration
========================

There's not much point in being able to save a configuration programmatically
if you can't make changes to it programmatically. This can be done using
standard attribute syntax.
For example, given the file::

    messages :
    [
      {
        stream : `sys.stdin`
        message : 'Welcome'
        name : 'Harry'
      }
      {
        stream : `sys.stdout`
        message : 'Welkom'
        name : 'Ruud'
      }
      {
        stream : $messages[0].stream
        message : 'Bienvenue'
        name : 'Yves'
      }
    ]

the following program could be used to modify the configuration and save the
changes::

    from config import Config

    f = file('simple.cfg')
    cfg = Config(f)
    cfg.written = 1234
    cfg.messages[2].surname = 'Montand'
    f = file('test.txt', 'w')
    cfg.save(f)
    print 'written' in cfg
    print 'writen' in cfg    # deliberately misspelt key
    print 'surname' in cfg.messages[2]
    print 'xyzzy' in cfg.messages[2]

With the printed output::

    True
    False
    True
    False

and the output file::

    messages :
    [
      {
        stream : `sys.stdin`
        message : 'Welcome'
        name : 'Harry'
      }
      {
        stream : `sys.stdout`
        message : 'Welkom'
        name : 'Ruud'
      }
      {
        stream : $messages[0].stream
        message : 'Bienvenue'
        name : 'Yves'
        surname : 'Montand'
      }
    ]
    written : 1234

Cascading configurations
========================

There may be times when you want to cascade configurations - e.g. at the
suite, program and user level. When a value is required, you could check the
user configuration first, then the configuration at program level, and finally
at program suite level. To do this, you can use the handy ``ConfigList``
class, as in the following example::

    from config import Config, ConfigList

    cfglist = ConfigList()
    cfglist.append(Config(file('/path/to/user.cfg')))
    cfglist.append(Config(file('/path/to/program.cfg')))
    cfglist.append(Config(file('/path/to/suite.cfg')))

To access a configuration value (e.g. ``verbosity``), you can say::

    cfglist.getByPath('verbosity')

and the value from the first configuration which defines ``verbosity`` will be
returned.

This technique can also be used where you want to override configuration
values with command-line values. See the section below entitled
"`Integrating with command-line options`_" for how the ``config`` module can
be used with the standard library's ``optparse`` module.

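
The "first configuration which defines the value wins" rule can be sketched
with plain dictionaries. This is an illustration of the lookup order only,
not the ``ConfigList`` implementation; ``get_by_path``, ``cascade_lookup`` and
the sample values are hypothetical, and the code is written for Python 3.

```python
_MISSING = object()  # sentinel: distinguishes "not defined" from a None value


def get_by_path(config, path):
    """Follow a dotted path through nested dicts; return _MISSING if absent."""
    value = config
    for key in path.split('.'):
        if not isinstance(value, dict) or key not in value:
            return _MISSING
        value = value[key]
    return value


def cascade_lookup(configs, path):
    """Return the value from the first configuration that defines ``path``."""
    for config in configs:
        value = get_by_path(config, path)
        if value is not _MISSING:
            return value
    raise KeyError(path)


# Hypothetical contents for the user, program and suite levels.
user = {'verbosity': 2}
program = {'verbosity': 1, 'logging': {'level': 'INFO'}}
suite = {'verbosity': 0, 'logging': {'level': 'WARNING', 'dir': '/var/log'}}

cascade = [user, program, suite]
verbosity = cascade_lookup(cascade, 'verbosity')   # defined at user level
level = cascade_lookup(cascade, 'logging.level')   # first defined by program
log_dir = cascade_lookup(cascade, 'logging.dir')   # only defined by suite
```

Placing a mapping of command-line values at the front of the list gives those
values priority over every file-based level.
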

Merging configurations
======================

There are two ways in which configurations can be merged:

* If there are no clashing keys, you can load multiple configuration files
  into the same ``Config`` instance.
* You can use the ``ConfigMerger`` class to merge two configurations.

To see how to use ``ConfigMerger``, suppose you have two files,
``merge1.cfg`` and ``merge2.cfg``, shown below::

    value1: True
    value3: [1, 2, 3]
    value5: [7]
    value6: { 'a' : 1, 'c' : 3 }

and::

    value2: False
    value4: [4, 5, 6]
    value5: ['abc']
    value6: { 'b' : 2, 'd' : 4 }

The following program::

    from config import Config, ConfigMerger

    f = file('merge1.cfg')
    cfg1 = Config(f)
    f = file('merge2.cfg')
    cfg2 = Config(f)
    merger = ConfigMerger()
    merger.merge(cfg1, cfg2)
    f = file('test.txt', 'w')
    cfg1.save(f)

results in the following file being saved::

    value1 : True
    value3 :
    [
      1
      2
      3
    ]
    value5 :
    [
      7
      abc
    ]
    value6 :
    {
      a : 1
      c : 3
      b : 2
      d : 4
    }
    value2 : False
    value4 :
    [
      4
      5
      6
    ]

As you can see, the keys have been merged, and the sequence elements have been
appended.

Starting with V0.3.6, ``ConfigMerger`` takes in its constructor an optional
resolver argument (a default resolver is provided which allows the behaviour
to be the same as in earlier versions). The resolver can be any callable which
is called with three arguments and returns a string. The arguments are
``map1``, ``map2`` and ``key``, where ``map1`` is the target mapping for the
merge, ``map2`` is the merge operand and ``key`` is the clashing key. If a
clash occurs (``key`` is in both ``map1`` and ``map2``), the resolver is
called to try to resolve the conflict. It can return one of several values:

* "merge" - merge two Mappings.
* "append" - append one sequence to another.
* "overwrite" - overwrite the value in the merge target with that in the
  merge operand.
* "mismatch" - call ``handleMismatch`` to handle the mismatch.

Care should be taken to return a value compatible with the objects being
merged.
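
A resolver following that contract might look like the sketch below. It is an
illustration under assumptions, not the default resolver: plain ``dict`` and
``list`` objects stand in for the module's mapping and sequence types, the
name ``type_aware_resolver`` is hypothetical, and overwriting when both values
are scalars is an arbitrary policy chosen for the example. The code is written
for Python 3.

```python
def type_aware_resolver(map1, map2, key):
    """Choose a merge action for a key present in both map1 and map2."""
    target, operand = map1[key], map2[key]
    if isinstance(target, dict) and isinstance(operand, dict):
        return "merge"        # two mappings: merge their keys
    if isinstance(target, list) and isinstance(operand, list):
        return "append"       # two sequences: append operand to target
    if isinstance(target, (dict, list)) or isinstance(operand, (dict, list)):
        return "mismatch"     # a container clashes with a different kind
    return "overwrite"        # two scalars: the operand's value wins


# Stand-ins for the clashing keys of two configurations (sample data).
target = {'value1': True, 'value5': [7],
          'value6': {'a': 1, 'c': 3}, 'value7': [1]}
operand = {'value1': False, 'value5': ['abc'],
           'value6': {'b': 2, 'd': 4}, 'value7': 'x'}

decisions = {key: type_aware_resolver(target, operand, key)
             for key in sorted(target) if key in operand}
```

With the real module, the type checks would be made against the module's own
container classes, and the callable would be supplied to the ``ConfigMerger``
constructor as described above.
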
For example, it doesn't make sense to return "merge" when dealing with two
sequences, or "append" when dealing with two mappings.

Integrating with command-line options
=====================================

It's fairly easy to integrate command line options with configurations read
from files. We use the standard library's excellent ``optparse`` module to
parse the command line for options, and make those options available to the
application through the ``config`` API. Here's an example configuration file
(``cmdline.cfg``)::

    cmdline_values:
    {
      verbose : `cmdline.verbose`
      file: `cmdline.filename`
    }
    other_config_items:
    {
      whatever : 'you want'
    }

The program which demonstrates ``optparse`` integration is below::

    from optparse import OptionParser
    from config import Config

    parser = OptionParser()
    parser.add_option("-f", "--file",
                      action="store", type="string", dest="filename",
                      help="write report to FILE", metavar="FILE")
    parser.add_option("-q", "--quiet",
                      action="store_false", dest="verbose", default=1,
                      help="don't print status messages to stdout")

    (options, args) = parser.parse_args()
    cfg = Config(file('cmdline.cfg'))
    cfg.addNamespace(options, 'cmdline')
    print "The verbose option value is %r" % cfg.cmdline_values.verbose
    print "The file name is %r" % cfg.cmdline_values.file

Once we've parsed the command-line options using ``optparse`` and loaded the
configuration, we add the parsed-options object as a namespace with name
``cmdline``. When we then fetch ``cfg.cmdline_values.verbose``, for example,
this causes evaluation of ``cmdline.verbose`` against the configuration's
namespaces, and fetches the appropriate value from the parsed-option object.

The program, when run with arguments ``-q -f test``, will print::

    The verbose option value is False
    The file name is 'test'

Uniform component configuration
===============================

You can use the ``config`` module to initialize a component hierarchy in a
uniform manner.
Typically, in a component, you initialize various attributes, some of which
are other components. Suppose you have a component of class
``NetworkHandler`` which contains a particular subcomponent which is either of
class ``HTTPHandler`` or of class ``FTPHandler``. You could have a
hierarchical configuration as follows::

    netHandler:
    {
      host: 'alpha'
      port: 8080
      protocol:
      {
        class: `HTTPHandler`
        config:
        {
          secure: True
          version: '1.1'
          keepAlive: True
        }
      }
    }

You could define the initialization of these classes as::

    class HTTPHandler:
        def __init__(self, config):
            self.secure = config.get('secure', False)
            self.version = config.get('version', '1.0')
            self.keepAlive = config.get('keepAlive', False)

    class NetworkHandler:
        def __init__(self, config):
            self.host = config.get('host', 'localhost')
            self.port = config.get('port', 80)
            protocolClass = config.protocol.get('class')
            if protocolClass is None:
                raise ValueError('NetworkHandler: protocol class not specified')
            protocolConfig = config.protocol.get('config', {})
            # keep the subcomponent so that it can be used after construction
            self.protocolHandler = protocolClass(protocolConfig)

and then a ``NetworkHandler`` could be initialized as follows::

    from config import Config

    def makeNetworkHandler():
        cfg = Config('network.cfg')
        return NetworkHandler(cfg.netHandler)

In this scheme, each class has a constructor which takes a single argument - a
configuration mapping. Subcomponents can be passed the appropriate mapping
without the constructing class needing to know its schema. In the above
example, ``NetworkHandler`` neither knows nor cares about the exact contents
of the mapping with path ``netHandler.protocol.config``. The creator of the
configuration file needs only ensure that the mapping makes sense to the class
being constructed - ``HTTPHandler`` in this case.

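
The convention can be reduced to one small step: read a class and a mapping
from the configuration, then call the class with the mapping. The sketch
below shows that step in isolation. It rests on assumptions: a plain ``dict``
stands in for the ``netHandler.protocol`` mapping, ``make_component`` is a
hypothetical helper that is not part of ``config``, and the code is written
for Python 3.

```python
class HTTPHandler:
    def __init__(self, config):
        self.secure = config.get('secure', False)
        self.version = config.get('version', '1.0')
        self.keepAlive = config.get('keepAlive', False)


def make_component(spec):
    """Instantiate spec['class'] with spec['config'] (hypothetical helper)."""
    component_class = spec.get('class')
    if component_class is None:
        raise ValueError('component class not specified')
    # The caller never inspects the mapping; only the class interprets it.
    return component_class(spec.get('config', {}))


# Plain-dict stand-in for the ``netHandler.protocol`` mapping.
protocol_spec = {
    'class': HTTPHandler,
    'config': {'secure': True, 'version': '1.1', 'keepAlive': True},
}
handler = make_component(protocol_spec)
```

Since the helper only passes the mapping through, a different class with a
different set of keys can be configured without changing the helper.
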
If FTP were required instead, the ``netHandler.protocol`` mapping would
perhaps look like this::

    protocol:
    {
      class: `FTPHandler`
      config:
      {
        maxSize: 1048576
      }
    }

which could be used with a class initialized like this::

    class FTPHandler:
        def __init__(self, config):
            self.maxSize = config.get('maxSize', 32768)

You can see a more complete example of this in the files ``logconfig.cfg`` and
``logconfig.py``, which configure logging using a scheme very like that
described above.

Unicode support
===============

Unicode support for reading files is provided through the
``ConfigInputStream`` class. This is used automatically by
``defaultStreamOpener``. ``ConfigInputStream`` automatically recognizes BOMs
(byte order marks) for UTF-8, UTF-16LE and UTF-16BE. If a BOM is present, it
is used to determine how the stream is to be decoded. If no BOM is
recognized, the stream is treated as a non-Unicode stream, assumed to be in
the correct encoding, and read without decoding. Example of use::

    from config import ConfigInputStream

    for filename in ['ANSI.txt', 'Unicode8.txt', 'UnicodeLE.txt', 'UnicodeBE.txt']:
        pathname = '/temp/' + filename
        stream = file(pathname, 'rb')
        print "- raw contents of %s:" % pathname
        print repr(stream.read(6))
        print repr(stream.readline())
        stream.close()
        stream = ConfigInputStream(file(pathname, 'rb'))
        print "- decoded contents of %s, encoding = %s:" % (pathname, stream.encoding)
        print repr(stream.read(6))
        print repr(stream.readline())
        stream.close()

which produces::

    - raw contents of /temp/ANSI.txt:
    'Test\r\n'
    'Line\r\n'
    - decoded contents of /temp/ANSI.txt, encoding = None:
    'Test\r\n'
    'Line\r\n'
    - raw contents of /temp/Unicode8.txt:
    '\xef\xbb\xbfTes'
    't\r\n'
    - decoded contents of /temp/Unicode8.txt, encoding = utf-8:
    u'Test\r\n'
    u'Line\r\n'
    - raw contents of /temp/UnicodeLE.txt:
    '\xff\xfeT\x00e\x00'
    's\x00t\x00\r\x00\n'
    - decoded contents of /temp/UnicodeLE.txt, encoding = utf-16le:
    u'Test\r\n'
    u'Line\r\n'
    - raw contents of /temp/UnicodeBE.txt:
    '\xfe\xff\x00T\x00e'
    '\x00s\x00t\x00\r\x00\n'
    - decoded contents of /temp/UnicodeBE.txt, encoding = utf-16be:
    u'Test\r\n'
    u'Line\r\n'

Unicode support for writing files is provided through the
``ConfigOutputStream`` class. Here is an example of how to use it::

    from config import Config, ConfigOutputStream

    cfg = Config('app.cfg')
    # the underlying stream is opened in binary mode
    stream = ConfigOutputStream(open('root.txt', 'wb'), 'utf-16be')
    cfg.save(stream)

If the encoding is one of ``utf-8``, ``utf-16le`` or ``utf-16be``, the
appropriate BOM is written to the output. Note that the underlying stream
should be opened in binary mode; newlines are automatically written as
``'\r\n'`` (Windows), ``'\r'`` (Mac) or ``'\n'`` (other).

Download
========

Here is the current version, 0.3.9, in tarball_, zip_ and Windows_ formats.

.. _tarball: ../config-0.3.9.tar.gz
.. _zip: ../config-0.3.9.zip
.. _Windows: ../config-0.3.9.win32.exe

Further work
============

The ``config`` module is in a very early state, though it is already quite
usable. The syntax is broadly fixed, though adjustments may be made to e.g.
backticks, the ``$`` notation for references, and the ``@`` notation for file
inclusion, depending on feedback. The use of builtin functions - e.g.
``include("x")``, ``get(an.attribute.chain)``, ``evaluate(sys.stderr)`` - has
been considered, and not yet completely ruled out. Unicode support could be
improved. No doubt there are bugs in the implementation, awaiting the
completion of a more comprehensive test suite. Some minor changes in the API
can be expected.

All feedback will be gratefully received; please send it to vinay_sajip at
red-dove.com or post it on the Python Wiki on the HierConfig page:
http://www.python.org/moin/HierConfig.

Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`