..
    :copyright: Copyright (c) 2013 Martin Pengelly-Phillips
    :license: See LICENSE.txt.

.. _assembly:

Assembly
========

.. module:: clique

As seen in the :ref:`tutorial`, Clique provides the high-level
:py:func:`assemble` function to support automatically assembling items into
relevant :ref:`collections <collection>` based on a common changing numerical
component::

    >>> import clique
    >>> collections = clique.assemble([
    ...     'file.0001.jpg', 'file.0002.jpg', 'file.0003.jpg',
    ...     'file.0001.dpx', 'file.0002.dpx', 'file.0003.dpx'
    ... ])
    >>> print collections
    [<Collection "file.%04d.dpx [1-3]">, <Collection "file.%04d.jpg [1-3]">]

However, as mentioned in the :ref:`introduction`, Clique has no understanding
of what a numerical component represents. Therefore, it takes a conservative
approach and considers **all** collections with a common changing numerical
component as valid. This can lead to surprising results at first::

    >>> collections = clique.assemble([
    ...     'file_v1.0001.jpg', 'file_v1.0002.jpg', 'file_v1.0003.jpg',
    ...     'file_v2.0001.jpg', 'file_v2.0002.jpg', 'file_v2.0003.jpg'
    ... ])
    >>> print collections
    [<Collection "file_v%d.0001.jpg [1-2]">,
     <Collection "file_v%d.0002.jpg [1-2]">,
     <Collection "file_v%d.0003.jpg [1-2]">,
     <Collection "file_v1.%04d.jpg [1-3]">,
     <Collection "file_v2.%04d.jpg [1-3]">]

Here, Clique returned more collections than might have been expected, but, as
can be seen, they are all valid collections. This is an important feature of
Clique: it doesn't attempt to guess. Instead, it is designed to be wrapped
easily with domain-specific logic to get the desired results.

There are a couple of ways to influence the returned result from the
:py:func:`~assemble` function:

* Pass a *minimum_items* argument.
* Pass custom *patterns*.

Minimum Items
-------------

By default, Clique will filter out any collection from the returned result of
:py:func:`~assemble` that has fewer than two items.
This value can be customised per :py:func:`~assemble` call by passing
*minimum_items* as a keyword::

    >>> print clique.assemble(['file.0001.jpg'])
    []
    >>> print clique.assemble(['file.0001.jpg'], minimum_items=1)
    [<Collection "file.%04d.jpg [1]">]

Patterns
--------

By default, Clique finds all groups of numbers in each item and creates
collections that have common :term:`head`, :term:`tail` and :term:`padding`
values.

Custom patterns can be used to tailor the process. Pass them as a list of
regular expressions (either strings or :py:class:`re.RegexObject` instances)::

    >>> items = [
    ...     'file.0001.jpg', 'file.0002.jpg', 'file.0003.jpg',
    ...     'file.0001.dpx', 'file.0002.dpx', 'file.0003.dpx'
    ... ]
    >>> print clique.assemble(items, patterns=[
    ...     '\.(?P<index>(?P<padding>0*)\d+)\.\D+\d?$'
    ... ])
    [<Collection "file.%04d.dpx [1-3]">, <Collection "file.%04d.jpg [1-3]">]

.. note::

    Each custom expression **must** contain the expression from
    :py:data:`DIGITS_PATTERN` exactly once. An easy way to do this is using
    Python's string formatting.

    So, instead of::

        '\.(?P<index>(?P<padding>0*)\d+)\.\D+\d?$'

    use::

        '\.{0}\.\D+\d?$'.format(clique.DIGITS_PATTERN)

Some common expressions are predefined in the :py:data:`~clique.PATTERNS`
dictionary (contributions welcome!)::

    >>> print clique.assemble(items, patterns=[clique.PATTERNS['frames']])
    [<Collection "file.%04d.dpx [1-3]">, <Collection "file.%04d.jpg [1-3]">]
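
To check what a custom pattern captures before passing it to
:py:func:`~assemble`, it can help to try the expression with the standard
:py:mod:`re` module first. The following is a minimal sketch, not part of
Clique itself: ``split_item`` is a hypothetical helper, and the local
``DIGITS_PATTERN`` constant is assumed to mirror the value of
:py:data:`DIGITS_PATTERN` so that the example runs without Clique installed.

```python
import re

# Assumption: this mirrors the value of clique.DIGITS_PATTERN. It is defined
# locally so the sketch runs without Clique installed.
DIGITS_PATTERN = r'(?P<index>(?P<padding>0*)\d+)'

# Build the custom expression with string formatting so that the digits
# expression appears exactly once.
frame_pattern = re.compile(r'\.{0}\.\D+\d?$'.format(DIGITS_PATTERN))


def split_item(item):
    """Return (head, index, padding, tail) for *item*, or None if no match.

    Hypothetical helper for illustration only. It reports the raw text
    captured by the named groups, not Clique's own interpretation of them.
    """
    match = frame_pattern.search(item)
    if match is None:
        return None

    head = item[:match.start('index')]
    tail = item[match.end('index'):]
    return head, match.group('index'), match.group('padding'), tail


# The version number in 'file_v1' is ignored because the expression only
# accepts digits that sit between two dots.
print(split_item('file_v1.0001.jpg'))

# No dot-delimited number is present, so there is nothing to match.
print(split_item('file_v1.jpg'))
```

Items for which the helper returns ``None`` do not match the expression at
all, which is a quick way to see why a custom pattern yields fewer collections
than the default behaviour.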