path: root/biology
Age  Commit message  Author  Files  Lines
2022-03-15  biology/biolibc-tools: Update to 0.1.2  bacon  3  -7/+11
New fastx-stats and ensembl2gene subcommands
Minor updates for biolibc 0.2.2 API changes
Minor bug fixes and enhancements
2022-03-15  biology/ad2vcf: Update to 0.1.5  bacon  3  -7/+7
Minor update for biolibc 0.2.2 API changes
2022-03-15  biology/biolibc: Update to 0.2.2  bacon  4  -21/+113
Numerous bug fixes and enhancements
Several new functions
3 new classes
Cleaned up some API slop
Changes: https://github.com/auerlab/biolibc/tags
2022-03-13  (biology/minimap2) Updated 2.18 to 2.24  mef  2  -12/+8
Release 2.24-r1122 (26 December 2021)
-------------------------------------

This release improves alignment around long poorly aligned regions. Older minimap2 may chain through such regions in rare cases, which may result in missing alignments later. The issue has become worse since the change of the chaining algorithm in v2.19. v2.23 implements an incomplete remedy. This release provides a better solution with an X-drop-like heuristic and by enabling two-bandwidth chaining in the assembly mode.

(2.24: 26 December 2021, r1122)

Release 2.23-r1111 (18 November 2021)
-------------------------------------

Notable changes:

 * Bugfix: fixed missing alignments around long inversions (#806 and #816). This bug affected v2.19 through v2.22.
 * Improvement: avoid extremely long mapping time for pathologic reads with highly repeated k-mers not in the reference (#771). Use --q-occ-frac=0 to disable the new heuristic.
 * Change: use --cap-kalloc=1g by default.

(2.23: 18 November 2021, r1111)

Release 2.22-r1101 (7 August 2021)
----------------------------------

When choosing the best alignment, this release uses logarithm gap penalty and query-specific mismatch penalty. It improves the sensitivity to long INDELs in repetitive regions.

Other notable changes:

 * Bugfix: fixed an indirect memory leak that may waste a large amount of memory given highly repetitive reference such as a 16S RNA database (#749). All versions of minimap2 have this issue.
 * New feature: added --cap-kalloc to reduce the peak memory. This option is not enabled by default but may become the default in future releases.

Known issue:

 * Minimap2 may take a long time to map a read (#771). So far it is not clear if this happens to v2.18 and earlier versions.

(2.22: 7 August 2021, r1101)

Release 2.21-r1071 (6 July 2021)
--------------------------------

This release fixed a regression in short-read mapping introduced in v2.19 (#776). It also fixed invalid comparisons of uninitialized variables, though these are harmless (#752). Long-read alignment should be identical to v2.20.

(2.21: 6 July 2021, r1071)

Release 2.20-r1061 (27 May 2021)
--------------------------------

This release fixed a bug in the Python module and improves the command-line compatibility with v2.18. In v2.19, if `-r` is specified with an `asm*` preset, users would get alignments more fragmented than v2.18. This could be an issue for existing pipelines specifying `-r`. This release resolves this issue.

(2.20: 27 May 2021, r1061)

Release 2.19-r1057 (26 May 2021)
--------------------------------

This release includes a few important improvements backported from unimap:

 * Improvement: more contiguous alignment through long INDELs. This is enabled by the minigraph chaining algorithm. All `asm*` presets now use the new algorithm. They can find INDELs up to 100kb and may be faster for chromosome-long contigs. The default mode and `map*` presets use this algorithm to replace the long-join heuristic.
 * Improvement: better alignment in highly repetitive regions by rescuing high-occurrence seeds. If the distance between two adjacent seeds is too large, attempt to choose a fraction of high-occurrence seeds in-between. Minimap2 now produces fewer clippings and alignment break points in long satellite regions.
 * Improvement: allow specifying an interval of k-mer occurrences with `-U`. For repeat-rich genomes, the automatic k-mer occurrence threshold determined by `-f` may be too large and make alignment impractically slow. The new option protects against such cases. Enabled for `asm*` and `map-hifi`.
 * New feature: added the `map-hifi` preset for mapping PacBio High-Fidelity (HiFi) reads.
 * Change to the default: apply `--cap-sw-mem=100m` for genomic alignment.
 * Bugfix: minimap2 could not generate an index file with `-xsr` (#734).

This release represents the most significant algorithmic change since v2.1 in 2017. With features backported from unimap, minimap2 now has similar power to unimap for contig alignment. Unimap will remain an experimental project and is no longer recommended over minimap2. Sorry for reverting the recommendation in such a short time.

(2.19: 26 May 2021, r1057)
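The options named in these release notes can be combined on one command line. The following is a minimal sketch, not part of the package or of minimap2 itself: the helper name `build_minimap2_command` and the file names are hypothetical, `-x` is the preset switch implied by `-xsr` in the notes, and `-a` (SAM output) is a standard minimap2 flag that the notes do not mention. The function only assembles the argument list and does not run anything.

```python
from typing import List, Optional


def build_minimap2_command(
    reference: str,
    reads: str,
    preset: str = "map-hifi",
    cap_kalloc: Optional[str] = "1g",
    q_occ_frac: Optional[float] = None,
) -> List[str]:
    """Assemble a minimap2 argument list (hypothetical helper, nothing is executed)."""
    cmd = ["minimap2", "-a", "-x", preset]
    if cap_kalloc is not None:
        # 1g is the default since 2.23 according to the notes; passing it is explicit, not required.
        cmd.append(f"--cap-kalloc={cap_kalloc}")
    if q_occ_frac is not None:
        # --q-occ-frac=0 disables the heuristic added in 2.23 for highly repeated k-mers.
        cmd.append(f"--q-occ-frac={q_occ_frac}")
    cmd += [reference, reads]
    return cmd


if __name__ == "__main__":
    print(" ".join(build_minimap2_command("ref.fa", "reads.fq", q_occ_frac=0)))
```

The resulting list can be handed to `subprocess.run` on a system where minimap2 2.23 or later is installed.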
2022-02-27  biology/kallisto: Update to 0.48.0  bacon  6  -8/+81
Long awaited bug fix release
Also unbundled htslib
Changes: https://github.com/pachterlab/kallisto/tags
2022-02-26  biology/bcftools: Update to 1.15  bacon  4  -8/+8
Several minor enhancements and bug fixes
Changes: https://github.com/samtools/bcftools/tags
2022-02-26  biology/samtools: Update to 1.15  bacon  4  -8/+10
Several minor enhancements and bug fixes
Changes: https://github.com/samtools/samtools/tags
2022-02-26  biology/htslib: Update to 1.15  bacon  4  -8/+8
Several minor enhancements and bug fixes
No API changes affecting existing packages
Changes: https://github.com/samtools/htslib/tags
2022-02-17  py-biopython: update to 1.79.  wiz  3  -105/+115
1 June 2021: Biopython 1.79
================================

This is intended to be our final release supporting Python 3.6. It also supports Python 3.7, 3.8 and 3.9, and has also been tested on PyPy3.6.1 v7.1.1.

The ``Seq`` and ``MutableSeq`` classes in ``Bio.Seq`` now store their sequence contents as ``bytes`` and ``bytearray`` objects, respectively. Previously, for ``Seq`` objects a string object was used, and a Unicode array object for ``MutableSeq`` objects. This was maintained during the transition from Python2 to Python3. However, a Python2 string object corresponds to a ``bytes`` object in Python3, storing the string as a series of 8-bit characters. While non-ASCII characters could be stored in Python2 strings, they were not treated as such. For example:

In Python2::

    >>> s = "Генетика"
    >>> type(s)
    <class 'str'>
    >>> len(s)
    16

In Python3::

    >>> s = "Генетика"
    >>> type(s)
    <class 'str'>
    >>> len(s)
    8

In Python3, storing the sequence contents as ``bytes`` and ``bytearray`` objects has the further advantage that both support the buffer protocol.

Taking advantage of the similarity between ``bytes`` and ``bytearray``, the ``Seq`` and ``MutableSeq`` classes now inherit from an abstract base class ``_SeqAbstractBaseClass`` in ``Bio.Seq`` that implements most of the ``Seq`` and ``MutableSeq`` methods, ensuring their consistency with each other.

For methods that modify the sequence contents, an optional ``inplace`` argument specifies whether a new sequence object should be returned with the new sequence contents (if ``inplace`` is ``False``, the default) or the sequence object itself should be modified (if ``inplace`` is ``True``). For ``Seq`` objects, which are immutable, using ``inplace=True`` raises an exception. For ``inplace=False``, the default, ``Seq`` objects and ``MutableSeq`` behave consistently.
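The character-versus-byte distinction behind this change can be reproduced with the standard library alone. This is a small sketch that does not use Biopython; the helper `to_sequence_bytes` is hypothetical and only mimics the idea of converting a string under an ASCII encoding.

```python
text = "Генетика"

# Python 3 counts characters in a str, but bytes in its encoded form.
utf8 = text.encode("utf-8")
n_chars = len(text)   # 8 characters
n_bytes = len(utf8)   # 16 bytes, two per Cyrillic letter in UTF-8


def to_sequence_bytes(data: str) -> bytes:
    """Hypothetical helper: encode sequence text as ASCII, rejecting anything else."""
    return data.encode("ascii")


dna = to_sequence_bytes("ACGT")

# Both bytes and bytearray support the buffer protocol, so they can be
# viewed and sliced without copying the underlying data.
view = memoryview(dna)
middle = view[1:3].tobytes()

try:
    to_sequence_bytes(text)
    non_ascii_rejected = False
except UnicodeEncodeError:
    non_ascii_rejected = True
```

The last step shows why an ASCII assumption is convenient for sequence data: one letter always maps to one byte, and anything outside that range fails loudly instead of silently changing the length.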
As before, ``Seq`` and ``MutableSeq`` objects can be initialized using a string object, which will be converted to a ``bytes`` or ``bytearray`` object assuming an ASCII encoding. Alternatively, a ``bytes`` or ``bytearray`` object can be used, or an instance of any class inheriting from the new ``SequenceDataAbstractBaseClass`` abstract base class in ``Bio.Seq``. This requires that the class implements the ``__len__`` and ``__getitem__`` methods that return the sequence length and sequence contents on demand. Initializing a ``Seq`` instance using an instance of a class inheriting from ``SequenceDataAbstractBaseClass`` allows the ``Seq`` object to be lazy, meaning that its sequence is provided on demand only, without needing to initialize the full sequence. This feature is now used in ``BioSQL``, providing on-demand sequence loading from an SQL database, as well as in a new parser for twoBit (.2bit) sequence data added to ``Bio.SeqIO``. This is a lazy parser that allows fast access to genome-size DNA sequence files by not having to read the full genome sequence.

The new ``_UndefinedSequenceData`` class in ``Bio.Seq`` also inherits from ``SequenceDataAbstractBaseClass`` to represent sequences of known length but unknown sequence contents. This provides an alternative to ``UnknownSeq``, which is now deprecated as its definition was ambiguous. For example, in these examples the ``UnknownSeq`` is interpreted as a sequence with well-defined sequence contents::

    >>> s = UnknownSeq(3, character="A")
    >>> s.translate()
    UnknownSeq(1, character='K')
    >>> s + "A"
    Seq("AAAA")

A sequence object with undefined sequence contents can now be created by using ``None`` when creating the ``Seq`` object, together with the sequence length. Trying to access its sequence contents raises an ``UndefinedSequenceError``::

    >>> s = Seq(None, length=6)
    >>> s
    Seq(None, length=6)
    >>> len(s)
    6
    >>> "A" in s
    Traceback (most recent call last):
    ...
    Bio.Seq.UndefinedSequenceError: Sequence content is undefined
    >>> print(s)
    Traceback (most recent call last):
    ...
    Bio.Seq.UndefinedSequenceError: Sequence content is undefined

Element assignment in Bio.PDB.Atom now returns "X" when the element cannot be unambiguously guessed from the atom name, in accordance with PDB structures.

Bio.PDB entities now have a ``center_of_mass()`` method that calculates either centers of gravity or geometry.

New method ``disordered_remove()`` implemented in Bio.PDB DisorderedAtom and DisorderedResidue to remove children.

New module Bio.PDB.SASA implements the Shrake-Rupley algorithm to calculate atomic solvent accessible areas without third-party tools.

Expected ``TypeError`` behaviour has been restored to the ``Seq`` object's string like methods (fixing a regression in Biopython 1.78).

The KEGG ``KGML_Pathway`` KGML output was fixed to produce output that complies with KGML v0.7.2.

Parsing motifs in ``pfm-four-rows`` format can now handle motifs with values in scientific notation.

Parsing motifs in ``minimal`` MEME format will use ``nsites`` when making the count matrix from the frequency matrix, instead of multiplying the frequency matrix by 1000000.

Bio.UniProt.GOA now parses Gene Product Information (GPI) files version 1.2; files can be downloaded from the EBI ftp site: ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/

4 September 2020: Biopython 1.78
================================

This release of Biopython supports Python 3.6, 3.7 and 3.8. It has also been tested on PyPy3.6.1 v7.1.1.

The main change is that ``Bio.Alphabet`` is no longer used. In some cases you will now have to specify expected letters, molecule type (DNA, RNA, protein), or gap character explicitly. Please consult the updated Tutorial and API documentation for guidance. This simplification has sped up many ``Seq`` object methods. See https://biopython.org/wiki/Alphabet for more information.
``Bio.SeqIO.parse()`` is faster with "fastq" format due to small improvements in the ``Bio.SeqIO.QualityIO`` module.

The ``SeqFeature`` object's ``.extract()`` method can now be used for trans-spliced locations via an optional dictionary of references.

As in recent releases, more of our code is now explicitly available under either our original "Biopython License Agreement", or the very similar but more commonly used "3-Clause BSD License". See the ``LICENSE.rst`` file for more details.

Additionally, a number of small bugs and typos have been fixed with additions to the test suite. There has been further work to follow the Python PEP8, PEP257 and best practice standard coding style, and all of the tests have been reformatted with the ``black`` tool to match the main code base.

25 May 2020: Biopython 1.77
===========================

This release of Biopython supports Python 3.6, 3.7 and 3.8. It has also been tested on PyPy3.6.1 v7.1.1-beta0.

**We have dropped support for Python 2 now.**

``pairwise2`` now allows the input of parameters with keywords and returns the alignments as a list of ``namedtuples``.

The codon tables have been updated to NCBI genetic code table version 4.5, which adds Cephalodiscidae mitochondrial as table 33.

Updated ``Bio.Restriction`` to the January 2020 release of REBASE.

A major contribution by Rob Miller to ``Bio.PDB`` provides new methods to handle protein structure transformations using dihedral angles (internal coordinates). The new framework supports lossless interconversion between internal and cartesian coordinates, which, among other uses, simplifies the analysis and manipulation of coordinates of protein structures.

As in recent releases, more of our code is now explicitly available under either our original "Biopython License Agreement", or the very similar but more commonly used "3-Clause BSD License". See the ``LICENSE.rst`` file for more details.

Additionally, a number of small bugs and typos have been fixed with further additions to the test suite. There has been further work to follow the Python PEP8, PEP257 and best practice standard coding style, and all the main code base has been reformatted with the ``black`` tool.

20 December 2019: Biopython 1.76
================================

This release of Biopython supports Python 2.7, 3.5, 3.6, 3.7 and 3.8. It has also been tested on PyPy2.7.13 v7.1.1 and PyPy3.6.1 v7.1.1-beta0.

We intend this to be our final release supporting Python 2.7 and 3.5.

As in recent releases, more of our code is now explicitly available under either our original "Biopython License Agreement", or the very similar but more commonly used "3-Clause BSD License". See the ``LICENSE.rst`` file for more details.

``PDBParser`` and ``PDBIO`` now support PQR format file parsing and input/output.

In addition to the mainstream ``x86_64`` aka ``AMD64`` CPU architecture, we now also test every contribution on the ``ARM64``, ``ppc64le``, and ``s390x`` CPUs under Linux thanks to Travis CI. Further post-release testing done by Debian and other packagers and distributors of Biopython also covers these CPUs.

``Bio.motifs.PositionSpecificScoringMatrix.search()`` method has been re-written: it now applies ``.calculate()`` to chunks of the sequence to maintain a low memory footprint for long sequences.

Additionally, a number of small bugs and typos have been fixed with further additions to the test suite. There has been further work to follow the Python PEP8, PEP257 and best practice standard coding style, and more of the code style has been reformatted with the ``black`` tool.

6 November 2019: Biopython 1.75
===============================

This release of Biopython supports Python 2.7, 3.5, 3.6, 3.7 and is expected to work on the soon to be released Python 3.8. It has also been tested on PyPy2.7.13 v7.1.1 and PyPy3.6.1 v7.1.1-beta0.

Note we intend to drop Python 2.7 support in early 2020.

The restriction enzyme list in ``Bio.Restriction`` has been updated to the August 2019 release of REBASE.

``Bio.SeqIO`` now supports reading and writing files in the native format of Christian Marck's DNA Strider program ("xdna" format, also used by Serial Cloner), as well as reading files in the native formats of GSL Biotech's SnapGene ("snapgene") and Textco Biosoftware's Gene Construction Kit ("gck").

``Bio.AlignIO`` now supports GCG MSF multiple sequence alignments as the "msf" format (work funded by the National Marrow Donor Program).

The main ``Seq`` object now has string-like ``.index()`` and ``.rindex()`` methods, matching the existing ``.find()`` and ``.rfind()`` implementations. The ``MutableSeq`` object retains its more list-like ``.index()`` behaviour.

The ``MMTFIO`` class has been added that allows writing of MMTF file format files from a Biopython structure object. ``MMTFIO`` has a similar interface to ``PDBIO`` and ``MMCIFIO``, including the use of a ``Select`` class to write out a specified selection. This final addition to read/write support for PDB/mmCIF/MMTF in Biopython allows conversion between all three file formats.

Values from mmCIF files are now read in as a list even when they consist of a single value. This change improves consistency and reduces the likelihood of making an error, but will require user code to be updated accordingly.

``Bio.motifs.meme`` has been updated to parse XML output files from MEME rather than the plain-text output file. The goal of this change is to parse a more structured data source with minimal loss of functionality upon future MEME releases.

``Bio.PDB`` has been updated to support parsing REMARK 99 header entries from PDB-style Astral files.

A new keyword parameter ``full_sequences`` was added to ``Bio.pairwise2``'s pretty print method ``format_alignment`` to restore the output of local alignments to the 'old' format (showing the whole sequences including the un-aligned parts instead of only showing the aligned parts).

A new function ``charge_at_pH(pH)`` has been added to ``ProtParam`` and ``IsoelectricPoint`` in ``Bio.SeqUtils``.

The ``PairwiseAligner`` in ``Bio.Align`` was extended to allow generalized pairwise alignments, i.e. alignments of any Python object, for example three-letter amino acid sequences, three-nucleotide codons, and arrays of integers.

A new module ``substitution_matrices`` was added to ``Bio.Align``, which includes an ``Array`` class that can be used as a substitution matrix. As the ``Array`` class is a subclass of a numpy array, mathematical operations can be applied to it directly, and C code that makes use of substitution matrices can directly access the numerical values stored in the substitution matrices. This module is intended as a replacement of ``Bio.SubsMat``, which is currently unmaintained.

As in recent releases, more of our code is now explicitly available under either our original "Biopython License Agreement", or the very similar but more commonly used "3-Clause BSD License". See the ``LICENSE.rst`` file for more details.

Additionally, a number of small bugs and typos have been fixed with further additions to the test suite, and there has been further work to follow the Python PEP8, PEP257 and best practice standard coding style. We have also started to use the ``black`` Python code formatting tool.
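The 1.75 note on generalized pairwise alignment (aligning any Python objects, not only characters) can be illustrated without Biopython. The following is a plain Needleman-Wunsch scoring sketch under assumed scores (match +1, mismatch -1, gap -1); the function name `global_alignment_score` is hypothetical and this is not how ``Bio.Align.PairwiseAligner`` is implemented.

```python
from typing import Sequence


def global_alignment_score(
    a: Sequence, b: Sequence, match: int = 1, mismatch: int = -1, gap: int = -1
) -> int:
    """Best global alignment score of two sequences of arbitrary comparable items."""
    # prev[j] holds the best score for aligning the processed prefix of a with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i, x in enumerate(a, start=1):
        curr = [i * gap]
        for j, y in enumerate(b, start=1):
            diagonal = prev[j - 1] + (match if x == y else mismatch)
            curr.append(max(diagonal, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


# Items are compared with ==, so three-letter residues, codons and integers all work.
residues = global_alignment_score(["Ala", "Gly", "Ser"], ["Ala", "Ser"])
codons = global_alignment_score(["ATG", "GCC", "TAA"], ["ATG", "GCC", "TAA"])
integers = global_alignment_score([1, 2, 3], [1, 2, 3])
```

Because the only operation applied to the items is an equality test, the same dynamic programme handles strings, tuples or integers without any alphabet definition.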
2022-02-17  py-mol: remove  wiz  5  -1028/+1
This package is from 2012, the current version is from 2020.
A replacement candidate is in wip/py-mol but needs more work.
One of the last users of py-numpy16 in pkgsrc.
2022-01-17  py-cutadapt: updated to 3.5  adam  3  -10/+17
v3.5 (2021-09-29)
-----------------

* :issue:`555`: Add support for dumping statistics in JSON format using ``--json``.
* :issue:`541`: Add a "Read fate breakdown" section heading to the report, and also add statistics for reads discarded because of ``--discard-untrimmed`` and ``--discard-trimmed``. With this, the numbers in that section should add up to 100%.
* Add option ``-Q``, which allows to specify a quality-trimming threshold for R2 that is different from the one for R1.
* :issue:`567`: Add ``noindels`` adapter-trimming parameter. You can now write ``-a "ADAPTER;noindels"`` to disallow indels for a single adapter only.
* :issue:`570`: Fix ``--pair-adapters`` not finding some pairs when reads contain more than one adapter.
* :issue:`524`: Fix a memory leak when using ``--info-file`` with multiple cores.
* :issue:`559`: Fix adjacent base statistics not being shown for linked adapters.
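As a rough illustration of the two new switches, the sketch below assembles a cutadapt argument list that applies ``noindels`` to a single adapter and requests a JSON report. The helper names, the adapter sequence and the file names are hypothetical; ``-o`` is cutadapt's usual output option, and the ``--json=FILE`` spelling is an assumption based on the option named above. Nothing is executed.

```python
from typing import List, Optional


def adapter_spec(sequence: str, noindels: bool = False) -> str:
    """Append the per-adapter 'noindels' parameter using the 'ADAPTER;noindels' syntax."""
    return f"{sequence};noindels" if noindels else sequence


def build_cutadapt_command(
    adapter: str,
    input_fastq: str,
    output_fastq: str,
    json_report: Optional[str] = None,
    noindels: bool = False,
) -> List[str]:
    """Assemble a cutadapt argument list (hypothetical helper)."""
    cmd = ["cutadapt", "-a", adapter_spec(adapter, noindels)]
    if json_report is not None:
        # Assumed spelling: --json takes the report file name as its value.
        cmd.append(f"--json={json_report}")
    cmd += ["-o", output_fastq, input_fastq]
    return cmd


if __name__ == "__main__":
    print(build_cutadapt_command("AGATCGGAAGAGC", "in.fastq", "out.fastq",
                                 json_report="report.json", noindels=True))
```

Passing the command as a list avoids shell quoting problems with the semicolon in the adapter specification.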
2022-01-11  biology/molsketch: fix broken build  pin  1  -2/+2
2022-01-10  py-mol: convert to egg.mk  wiz  2  -5/+4
2022-01-10  *: Recursive revbump from boost 1.78.0  ryoon  6  -12/+12
2022-01-05  python: egg.mk: add USE_PKG_RESOURCES flag  wiz  2  -4/+8
This flag should be set for packages that import pkg_resources and thus need setuptools after the build step.
Set this flag for packages that need it and bump PKGREVISION.
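Whether a package "imports pkg_resources" can be checked mechanically. The following is a minimal sketch of such a check using only the standard library; the function name `imports_pkg_resources` is hypothetical and this is not a pkgsrc tool. It inspects static imports in one source string and would miss dynamic imports.

```python
import ast


def imports_pkg_resources(source: str) -> bool:
    """Return True if the Python source statically imports pkg_resources."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # Covers "import pkg_resources" and "import pkg_resources.extern".
            if any(alias.name.split(".")[0] == "pkg_resources" for alias in node.names):
                return True
        elif isinstance(node, ast.ImportFrom):
            # level == 0 excludes relative imports such as "from .pkg_resources import x".
            if node.level == 0 and node.module and node.module.split(".")[0] == "pkg_resources":
                return True
    return False


if __name__ == "__main__":
    print(imports_pkg_resources("from pkg_resources import resource_filename\n"))
```

Running this over the installed `.py` files of a package gives a first hint about whether the flag is needed.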
2022-01-04  *: bump PKGREVISION for egg.mk users  wiz  4  -5/+8
They now have a tool dependency on py-setuptools instead of a DEPENDS
2021-12-30  Forget about Python 3.6  adam  2  -5/+3
2021-12-17  biology/vsearch: Update to 2.18.0  bacon  7  -32/+42
Add powerpc64 support
Numerous fixes and enhancements since 2.13
Changes: https://github.com/torognes/vsearch/tags
2021-12-17  biology/cdhit: Update to 4.8.1  bacon  3  -15/+21
Add support for gzipped input
Changes: https://github.com/weizhongli/cdhit/releases
2021-12-17  biology/bowtie2: Update to 2.4.4  bacon  5  -20/+14
Replaced TBB with C++ threads
Support for Apple M1
Several other fixes and enhancements
Changes: https://github.com/BenLangmead/bowtie2/tags
2021-12-17  biology/bcftools: Update to 1.14  bacon  4  -10/+8
Numerous fixes and enhancements since 1.12
Changes: https://github.com/samtools/bcftools/tags
2021-12-17  biology/samtools: Update to 1.14  bacon  4  -10/+10
Numerous fixes and enhancements since 1.12
Changes: https://github.com/samtools/samtools/tags
2021-12-17  biology/htslib: Update to 1.14  bacon  4  -11/+10
Numerous fixes and enhancements since 1.12
Changes: https://github.com/samtools/htslib/releases/tag/1.14
2021-12-14  biology/vcf2hap: Update to 0.1.4  bacon  3  -9/+9
Updates for evolving libxtend and biolibc APIs
Add --version flag
2021-12-14  biology/vcf-split: Update to 0.1.3.3  bacon  3  -9/+10
Transfer header from multi-sample input
Updates for evolving libxtend and biolibc APIs
Add --version flag
Numerous minor fixes and enhancements
2021-12-14  biology/peak-classifier: Update to 0.1.2  bacon  3  -9/+7
Mainly updates for evolving libxtend and biolibc APIs
A few minor fixes and enhancements
2021-12-14  biology/biolibc-tools: Update to 0.1.1  bacon  3  -15/+31
Make all programs subcommands of "blt"
Several new commands
Updates for evolving libxtend and biolibc APIs
Add --version flag
Numerous minor fixes and enhancements
Changes: https://github.com/auerlab/biolibc-tools/releases/tag/0.1.1
2021-12-14  biology/ad2vcf: Update to 0.1.4  bacon  3  -9/+9
Updates for evolving libxtend and biolibc APIs
Add --version flag
Filter out unused SAM fields on input
Numerous other minor fixes and enhancements
Changes: https://github.com/auerlab/ad2vcf/releases/tag/0.1.4
2021-12-14  biology/biolibc: Update to 0.2.1  bacon  5  -337/+158
Add orf.c with start/stop codon locators
Standardize BED and GFF APIs
Implement VCF input filtering
Eliminate mutator macros mirroring mutator functions
Numerous minor bug fixes and enhancements
Changes: https://github.com/auerlab/biolibc/releases/tag/0.2.1
2021-12-08  revbump for icu and libffi  adam  12  -24/+24
2021-11-11  *: Revbump for protobuf-3.19.0  kim  1  -2/+2
Fix for: Shared object "libprotobuf.so.29" not found
2021-10-26  biology: Replace RMD160 checksums with BLAKE2s checksums  nia  73  -150/+150
All checksums have been double-checked against existing RMD160 and SHA512 hashes
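For readers who want to reproduce such checksums by hand, both digests are available in Python's standard `hashlib`. This is a sketch only: the function name `distinfo_lines` is hypothetical, and the `ALGORITHM (file) = digest` line layout is modelled on pkgsrc `distinfo` entries as an assumption, not taken from this log.

```python
import hashlib
import os
from typing import List


def distinfo_lines(path: str, chunk_size: int = 1 << 20) -> List[str]:
    """Compute BLAKE2s and SHA512 digests plus size for one distfile (hypothetical helper)."""
    blake = hashlib.blake2s()
    sha = hashlib.sha512()
    size = 0
    with open(path, "rb") as handle:
        # Read in chunks so large distfiles do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            blake.update(chunk)
            sha.update(chunk)
            size += len(chunk)
    name = os.path.basename(path)
    return [
        f"BLAKE2s ({name}) = {blake.hexdigest()}",
        f"SHA512 ({name}) = {sha.hexdigest()}",
        f"Size ({name}) = {size} bytes",
    ]
```

Feeding both hash objects from the same read loop means the file is read only once, however many digests are wanted.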
2021-10-21  *: Revbump for protobuf-3.18.0  kim  1  -2/+2
Fix for: Shared object "libprotobuf.so.28" not found
2021-10-07  biology: Remove SHA1 hashes for distfiles  nia  73  -150/+73
2021-10-04  py-pydicom: updated to 2.2.2  adam  2  -7/+7
Version 2.2.0

Changes
-------

* Data elements with a VR of **AT** must now be set with values acceptable to :func:`~pydicom.tag.Tag`, and are always stored as a :class:`~pydicom.tag.BaseTag`. Previously, any Python type could be set.
* :meth:`BaseTag.__eq__()<pydicom.tag.BaseTag.__eq__>` returns ``False`` rather than raising an exception when the operand cannot be converted to :class:`~pydicom.tag.BaseTag` (:pr:`1327`)
* :meth:`DA.__str__()<pydicom.valuerep.DA.__str__>`, :meth:`DT.__str__()<pydicom.valuerep.DT.__str__>` and :meth:`TM.__str__()<pydicom.valuerep.TM.__str__>` return valid DICOM strings instead of the formatted date and time representations (:issue:`1262`)
* If comparing :class:`~pydicom.dataset.FileDataset` instances, the file metadata is now ignored. This makes it possible to compare a :class:`~pydicom.dataset.FileDataset` object with a :class:`~pydicom.dataset.Dataset` object.
* :func:`~pydicom.pixel_data_handlers.rle_handler.rle_encode_frame` is deprecated and will be removed in v3.0, use :meth:`~pydicom.dataset.Dataset.compress` or :attr:`~pydicom.encoders.RLELosslessEncoder` instead.
* :func:`~pydicom.filereader.read_file` is deprecated and will be removed in v3.0, use :func:`~pydicom.filereader.dcmread` instead.
* :func:`~pydicom.filewriter.write_file` is deprecated and will be removed in v3.0, use :func:`~pydicom.filewriter.dcmwrite` instead.
* Data dictionaries updated to version 2021b of the DICOM Standard
* :class:`~pydicom.dataset.Dataset` no longer inherits from :class:`dict`

Enhancements
------------

* Added a command-line interface for pydicom. Current subcommands are:

  * ``show``: display all or part of a DICOM file
  * ``codify`` to produce Python code for writing files or sequence items from scratch.

  Please see the :ref:`cli_guide` for examples and details of all the options for each command.
* A field containing an invalid number of bytes will result in a warning instead of an exception when :attr:`~pydicom.config.convert_wrong_length_to_UN` is set to ``True``.
* Private tags known via the private dictionary will now get the configured VR if read from a dataset instead of **UN** (:issue:`1051`).
* While reading explicit VR, a switch to implicit VR will be silently attempted if the VR bytes are not valid VR characters, and config option :attr:`~pydicom.config.assume_implicit_vr_switch` is ``True`` (default)
* New functionality to help with correct formatting of decimal strings (**DS**)

  * Added :func:`~pydicom.valuerep.is_valid_ds` to check whether a string is valid as a DICOM decimal string and :func:`~pydicom.valuerep.format_number_as_ds` to format a given ``float`` or ``Decimal`` as a DS while retaining the highest possible level of precision
  * If :attr:`~pydicom.config.enforce_valid_values` is set to ``True``, all **DS** objects created will be checked for the validity of their string representations.
  * Added optional ``auto_format`` parameter to the init methods of :class:`~pydicom.valuerep.DSfloat` and :class:`~pydicom.valuerep.DSdecimal` and the :func:`~pydicom.valuerep.DS` factory function to allow explicitly requesting automatic formatting of the string representations of these objects when they are constructed.
* Added methods to construct :class:`~pydicom.valuerep.PersonName` objects from individual components of names (``family_name``, ``given_name``, etc.). See :meth:`~pydicom.valuerep.PersonName.from_named_components` and :meth:`~pydicom.valuerep.PersonName.from_named_components_veterinary`.
* Added support for downloading the large test files with the `requests <https://docs.python-requests.org/en/master/>`_ package in addition to :mod:`urllib.request` (:pr:`1340`)
* Ensured :func:`~pydicom.pixel_data_handlers.util.convert_color_space` uses 32-bit floats for calculation, added `per_frame` flag to allow frame-by-frame processing and improved the speed by ~20-60% (:issue:`1348`)
* Optimisations for RLE encoding using *pydicom* (~40% faster).
* Added support for faster decoding (~4-5x) and encoding (~20x) of *RLE Lossless* *Pixel Data* via the `pylibjpeg-rle <https://github.com/pydicom/pylibjpeg-rle>`_ plugin (:pr:`1361`, :pr:`1372`).
* Added :func:`Dataset.compress()<pydicom.dataset.Dataset.compress>` function for compressing uncompressed pixel data using a given encoding format as specified by a UID. Only *RLE Lossless* is currently supported (:pr:`1372`)
* Added :mod:`~pydicom.encoders` module and the following encoders:

  * :attr:`~pydicom.encoders.RLELosslessEncoder` with 'pydicom', 'pylibjpeg' and 'gdcm' plugins
* Added `read` parameter to :func:`~pydicom.data.get_testdata_file` to allow reading and returning the corresponding dataset (:pr:`1372`)
* Handle decoded RLE segments with padding (:issue:`1438`)
* Add option to JSON functions to suppress exception and continue (:pr:`1332`)
* Allow searching :class:`~pydicom.fileset.FileSet` s for a list of elements (:pr:`1428`)
* Added hash function to SR :class:`~pydicom.sr.Code` (:pr:`1434`)

Fixes
-----

* Fixed pickling a :class:`~pydicom.dataset.Dataset` instance with sequences after the sequence had been read (:issue:`1278`)
* Fixed JSON export of numeric values
* Fixed handling of sequences of unknown length that switch to implicit encoding, and sequences with VR **UN** (:issue:`1312`)
* Do not load external data sources until needed - fixes problems with standard workflow if `setuptools` are not installed (:issue:`1341`)
* Fixed empty **PN** elements read from file being :class:`str` rather than :class:`~pydicom.valuerep.PersonName` (:issue:`1338`)
* Fixed handling of JPEG (10918-1) images compressed using RGB colourspace rather than YBR with the Pillow pixel data handler (:pr:`878`)
* Allow to deepcopy a :class:`~pydicom.dataset.FileDataset` object (:issue:`1147`)
* Fixed elements with a VR of **OL**, **OD** and **OV** not being set correctly when an encoded backslash was part of the element value (:issue:`1412`)
* Fixed expansion of linear segments with floating point steps in segmented LUTs (:issue:`1415`)
* Fixed handling of code extensions with person name component delimiter (:pr:`1449`)
* Fixed bug decoding RGB jpg with APP14 marker due to change in Pillow (:pr:`1444`)
* Fixed decoding for `FloatPixelData` and `DoubleFloatPixelData` via `pydicom.pixel_data_handlers.numpy_handler` (:issue:`1457`)
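The decimal string (**DS**) checks mentioned under Enhancements rest on two DICOM constraints: a DS value is at most 16 characters long and uses only digits, sign, decimal point, exponent marker and padding spaces. The sketch below is a simplified standalone approximation written for illustration; the names `is_valid_ds_sketch` and `format_number_as_ds_sketch` are hypothetical and this is not pydicom's implementation of ``is_valid_ds`` or ``format_number_as_ds``.

```python
import math
import re

# Optional padding spaces, optional sign, fixed-point or exponent notation.
_DS_PATTERN = re.compile(r" *[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)? *")
_DS_MAX_LENGTH = 16


def is_valid_ds_sketch(text: str) -> bool:
    """Approximate check that text fits the DS length and character rules."""
    return len(text) <= _DS_MAX_LENGTH and _DS_PATTERN.fullmatch(text) is not None


def format_number_as_ds_sketch(value: float) -> str:
    """Format a finite float with as many significant digits as fit in 16 characters."""
    if not math.isfinite(value):
        raise ValueError("DS cannot represent NaN or infinity")
    # Start at 17 significant digits (enough to round-trip a double) and reduce until it fits.
    for precision in range(17, 0, -1):
        text = f"{value:.{precision}g}"
        if len(text) <= _DS_MAX_LENGTH:
            return text
    raise ValueError("value cannot be formatted within 16 characters")


if __name__ == "__main__":
    print(format_number_as_ds_sketch(math.pi), is_valid_ds_sketch("1,5"))
```

Reducing precision step by step trades a little accuracy for a string that always passes the length check, which is the same compromise the release note describes as "retaining the highest possible level of precision".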
2021-09-29  revbump for boost-libs  adam  8  -15/+16
2021-09-18  biology/biolibc: Update to 0.2.0.11  bacon  4  -10/+10
Regenerate man pages with improved auto-c2man
Improved formatting and added missing return value sections
2021-09-03  biology/peak-classifier: Update to 0.1.1.21  bacon  3  -9/+9
Fix regression: Replace BL_BED_SET_STRAND() macro with bl_bed_set_strand(), which performs sanity checks
2021-09-03  biology/biolibc: Update to 0.2.0.1  bacon  4  -9/+10
Fix regression: Replace BL_BED_SET_STRAND() macro with bl_bed_set_strand(), which performs sanity checks
2021-09-01  py-pydicom: PLIST fix  adam  1  -1/+7
2021-08-31  biology/Makefile: Add biolibc-tools  bacon  1  -1/+2
2021-08-31  biology/biolibc-tools: import biolibc-tools-0.1.0.36  bacon  4  -0/+37
Biolibc-tools is a collection of simple, fast, and memory-efficient programs for processing biological data. These programs built on biolibc are not complex enough to warrant separate projects.
2021-08-29  py-pydicom: add ALTERNATIVES  adam  1  -0/+1
2021-08-29  py-pydicom: updated to 2.2.1  adam  3  -197/+313
Version 2.2.0 Changes Data elements with a VR of AT must now be set with values acceptable to Tag(), and are always stored as a BaseTag. Previously, any Python type could be set. BaseTag.__eq__() returns False rather than raising an exception when the operand cannot be converted to BaseTag DA.__str__(), DT.__str__() and TM.__str__() return valid DICOM strings instead of the formatted date and time representations If comparing FileDataset instances, the file metadata is now ignored. This makes it possible to compare a FileDataset object with a Dataset object. rle_encode_frame() is deprecated and will be removed in v3.0, use compress() or RLELosslessEncoder instead. read_file() is deprecated and will be removed in v3.0, use dcmread() instead. write_file() is deprecated and will be removed in v3.0, use dcmwrite() instead. Data dictionaries updated to version 2021b of the DICOM Standard Dataset no longer inherits from dict Enhancements Added a command-line interface for pydicom. Current subcommands are: show: display all or part of a DICOM file codify to produce Python code for writing files or sequence items from scratch. Please see the Command-line Interface Guide for examples and details of all the options for each command. A field containing an invalid number of bytes will result in a warning instead of an exception when convert_wrong_length_to_UN is set to True. 
Private tags known via the private dictionary will now get the configured VR if read from a dataset instead of UN While reading explicit VR, a switch to implicit VR will be silently attempted if the VR bytes are not valid VR characters, and config option assume_implicit_vr_switch is True (default) New functionality to help with correct formatting of decimal strings (DS) Added is_valid_ds() to check whether a string is valid as a DICOM decimal string and format_number_as_ds() to format a given float or Decimal as a DS while retaining the highest possible level of precision If enforce_valid_values is set to True, all DS objects created will be checked for the validity of their string representations. Added optional auto_format parameter to the init methods of DSfloat and DSdecimal and the DS() factory function to allow explicitly requesting automatic formatting of the string representations of these objects when they are constructed. Added methods to construct PersonName objects from individual components of names (family_name, given_name, etc.). See from_named_components() and from_named_components_veterinary(). Added support for downloading the large test files with the requests package in addition to urllib.request Ensured convert_color_space() uses 32-bit floats for calculation, added per_frame flag to allow frame-by-frame processing and improved the speed by ~20-60% Optimisations for RLE encoding using pydicom (~40% faster). Added support for faster decoding (~4-5x) and encoding (~20x) of RLE Lossless Pixel Data via the pylibjpeg-rle plugin Added Dataset.compress() function for compressing uncompressed pixel data using a given encoding format as specified by a UID. 
  Only RLE Lossless is currently supported.
* Added encoders module and the following encoders:
  * RLELosslessEncoder with ‘pydicom’, ‘pylibjpeg’ and ‘gdcm’ plugins
* Added read parameter to get_testdata_file() to allow reading and returning the corresponding dataset
* Handle decoded RLE segments with padding
* Add option to JSON functions to suppress exception and continue
* Allow searching FileSets for a list of elements
* Added hash function to SR Code

Fixes
* Fixed pickling a Dataset instance with sequences after the sequence had been read
* Fixed JSON export of numeric values
* Fixed handling of sequences of unknown length that switch to implicit encoding, and sequences with VR UN
* Do not load external data sources until needed - fixes problems with standard workflow if setuptools are not installed
* Fixed empty PN elements read from file being str rather than PersonName
* Fixed handling of JPEG (10918-1) images compressed using RGB colourspace rather than YBR with the Pillow pixel data handler
* Allow deepcopy of a FileDataset object
* Fixed elements with a VR of OL, OD and OV not being set correctly when an encoded backslash was part of the element value
* Fixed expansion of linear segments with floating point steps in segmented LUTs
* Fixed handling of code extensions with person name component delimiter
* Fixed bug decoding RGB JPEG with APP14 marker due to change in Pillow
* Fixed decoding for FloatPixelData and DoubleFloatPixelData via pydicom.pixel_data_handlers.numpy_handler

Version 2.1.1

Fixes
* Remove py.typed
* Fix ImportError with Python 3.6.0
* Fix converting Sequences with Bulk Data when loading from JSON

Version 2.1.0

Changelog
* Dropped support for Python 3.5 (only Python 3.6+ supported)

Enhancements
* Large testing data is no longer distributed within the pydicom package, with the aim of reducing the package download size. These test files will download on-the-fly whenever the tests are run or the file(s) are requested via the data manager functions.
  For example:
  * To download all files and get their paths on disk you can run pydicom.data.get_testdata_files().
  * To download an individual file and get its path on disk you can use pydicom.data.get_testdata_file(), e.g. for RG1_UNCI.dcm use pydicom.data.get_testdata_file("RG1_UNCI.dcm")
* Added a new pixel data handler based on pylibjpeg which supports all (non-retired) JPEG transfer syntaxes
* Added apply_rescale() alias
* Added apply_voi() and apply_windowing()
* Added prefer_lut keyword parameter to apply_voi_lut() and handle empty VOI LUT module elements
* Added ability to register external data sources for use with the functions in pydicom.data
* __contains__, __next__ and __iter__ implementations added to PersonName
* Added convenience constants for the MPEG transfer syntaxes to pydicom.uid
* Added support for decoding Waveform Data:
  * Added pydicom.waveforms module and generate_multiplex() and multiplex_array() functions.
  * Added Dataset.waveform_array() which returns an ndarray for the multiplex group at index within a Waveform Sequence element.
* When JPEG 2000 image data is unsigned and the Pixel Representation is 1 the image data is converted to signed
* Added keyword property for the new UID keywords in version 2020d of the DICOM Standard
* Added testing of the variable names used when setting Dataset attributes and INVALID_KEYWORD_BEHAVIOR config option to allow customizing the behavior when a camel case variable name is used that isn’t a known element keyword
* Added INVALID_KEY_BEHAVIOR config option to allow customizing the behavior when an invalid key is used with the Dataset in operator
* Implemented full support (loading, accessing, modifying, writing) of DICOM File-sets and their DICOMDIR files via the FileSet class
* Added AllTransferSyntaxes
* Added option to turn on pydicom future breaking behavior to allow user code to check itself against the next major version release.
  Set environment variable “PYDICOM_FUTURE” to “True” or call future_behavior()
* Added another signature to the bulk_data_uri_handler in from_json to allow for the communication of not just the URI but also the tag and VR to the handler. Previous handlers will work as expected, new signature handlers will get the additional information.
* pack_bits() can now be used with 2D or 3D input arrays and will pad the packed data to even length by default.
* Elements with the IS VR accept float strings that are convertible to integers without loss, e.g. “1.0”
* Added encapsulate_extended() function for use when an Extended Offset Table is required

Changes
* Reading and adding unknown non-private tags no longer raises an exception by default, only when enforce_valid_values is set
* Data dictionaries updated to version 2020d of the DICOM Standard
* Updated a handful of the SOP Class variable names in _storage_sopclass_uids to use the new UID keywords. Variables with Multiframe in them become MultiFrame, those with and in them become And, and DICOSQuadrupoleResonanceQRStorage becomes DICOSQuadrupoleResonanceStorage.
* The following UID constants are deprecated and will be removed in v2.2:
  * JPEGBaseline: use JPEGBaseline8Bit
  * JPEGExtended: use JPEGExtended12Bit
  * JPEGLossless: use JPEGLosslessSV1
  * JPEGLSLossy: use JPEGLSNearLossless
  * JPEG2000MultiComponentLossless: use JPEG2000MCLossless
  * JPEG2000MultiComponent: use JPEG2000MC
* In v3.0 the value for JPEGLossless will change from 1.2.840.10008.1.2.4.70 to 1.2.840.10008.1.2.4.57 to match its UID keyword
* The following lists of UIDs are deprecated and will be removed in v2.2:
  * JPEGLossyCompressedPixelTransferSyntaxes: use JPEGTransferSyntaxes
  * JPEGLSSupportedCompressedPixelTransferSyntaxes: use JPEGLSTransferSyntaxes
  * JPEG2000CompressedPixelTransferSyntaxes: use JPEG2000TransferSyntaxes
  * RLECompressedLosslessSyntaxes: use RLETransferSyntaxes
  * UncompressedPixelTransferSyntaxes: use UncompressedTransferSyntaxes
  * PILSupportedCompressedPixelTransferSyntaxes
* DicomDir and the dicomdir module are deprecated and will be removed in v3.0. Use FileSet instead
* pydicom.overlay_data_handlers is deprecated, use pydicom.overlays instead
* Removed transfer syntax limitations when converting overlays to an ndarray
* The overlay_data_handlers config option is deprecated, the default handler will always be used.
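The 2.1.0 UID renames listed above lend themselves to a mechanical rewrite of calling code. The following is only a sketch using the standard library: the mapping is copied from the notes, while the names RENAMED_UIDS and modernise_uid_names are hypothetical and not part of pydicom. PILSupportedCompressedPixelTransferSyntaxes is left out because the notes name no replacement for it.

```python
import re

# Deprecated name -> replacement, copied from the 2.1.0 notes.
# (Hypothetical helper table; pydicom itself does not ship this dict.)
RENAMED_UIDS = {
    "JPEGBaseline": "JPEGBaseline8Bit",
    "JPEGExtended": "JPEGExtended12Bit",
    "JPEGLossless": "JPEGLosslessSV1",
    "JPEGLSLossy": "JPEGLSNearLossless",
    "JPEG2000MultiComponentLossless": "JPEG2000MCLossless",
    "JPEG2000MultiComponent": "JPEG2000MC",
    "JPEGLossyCompressedPixelTransferSyntaxes": "JPEGTransferSyntaxes",
    "JPEGLSSupportedCompressedPixelTransferSyntaxes": "JPEGLSTransferSyntaxes",
    "JPEG2000CompressedPixelTransferSyntaxes": "JPEG2000TransferSyntaxes",
    "RLECompressedLosslessSyntaxes": "RLETransferSyntaxes",
    "UncompressedPixelTransferSyntaxes": "UncompressedTransferSyntaxes",
}

# Match whole identifiers only, trying longer names first so that
# JPEG2000MultiComponentLossless is not cut short at JPEG2000MultiComponent.
_NAMES = sorted(RENAMED_UIDS, key=len, reverse=True)
_PATTERN = re.compile(r"\b(" + "|".join(re.escape(n) for n in _NAMES) + r")\b")


def modernise_uid_names(source: str) -> str:
    """Replace deprecated UID identifiers in a piece of source text."""
    return _PATTERN.sub(lambda match: RENAMED_UIDS[match.group(1)], source)


if __name__ == "__main__":
    print(modernise_uid_names("from pydicom.uid import JPEGBaseline, JPEGLossless"))
```

Because the pattern matches whole identifiers, running the function on already updated code leaves it unchanged. Code that relies on the value of JPEGLossless deserves a manual check, since the notes say that value changes in v3.0.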
Fixes
* Dataset.copy() now works as expected
* Optimistically parse undefined length non-SQ data as if it’s encapsulated pixel data to avoid erroring out on embedded sequence delimiter
* Fixed get_testdata_file() and get_testdata_files() raising an exception if no network connection is available
* Fixed GDCM < v2.8.8 not returning the pixel array for datasets not read from a file-like
* Raise TypeError if dcmread() or dcmwrite() is called with wrong argument
* Gracefully handle empty Specific Character Set
* Fixed empty ambiguous VR elements raising an exception
* Allow apply_voi_lut() to apply VOI lookup to an input float array
* Fixed Dataset.setdefault() not working correctly when the default value is None and not adding private elements when enforce_valid_values is True

Version 2.0.0

Changelog
* Dropped support for Python 2 (only Python 3.5+ supported)
* Changes to Dataset.file_meta:
  * file_meta now shown by default in dataset str or repr output; pydicom.config.show_file_meta can be set False to restore previous behavior
  * new FileMetaDataset class that accepts only group 2 data elements
  * Deprecation warning given unless Dataset.file_meta set with a FileMetaDataset object (in pydicom 3, it will be required)
* Old PersonName class removed; PersonName3 renamed to PersonName. Classes PersonNameUnicode and PersonName3 are aliased to PersonName but are deprecated and will be removed in version 2.1
* dataelem.isMultiValue (previously deprecated) has been removed. Use dataelem.DataElement.VM instead.
Enhancements
* Allow PathLike objects for filename argument in dcmread, dcmwrite and Dataset.save_as
* Deflate post-file meta information data when writing a dataset with the Deflated Explicit VR Little Endian transfer syntax UID
* Added config.replace_un_with_known_vr to be able to switch off automatic VR conversion for known tags with VR “UN”
* Added config.use_DS_numpy and config.use_IS_numpy to have multi-valued data elements with VR of DS or IS return a numpy array

Fixes
* Fixed reading of datasets with an empty Specific Character Set tag
* Fixed failure to parse dataset with an empty LUT Descriptor or Red/Green/Blue Palette Color LUT Descriptor element.
* Made Dataset.save_as a wrapper for dcmwrite
* Removed 1.2.840.10008.1.2.4.70 - JPEG Lossless (Process 14, SV1) from the Pillow pixel data handler as Pillow doesn’t support JPEG Lossless.
* Fixed error when writing elements with a VR of OF
* Fixed improper conversion when reading elements with a VR of OF
* Fixed apply_voi_lut() and apply_modality_lut() not handling (0028,3006) LUT Data with a VR of OW
* Fixed access to private creator tag in raw datasets
* Fixed description of newly added known private tag
* Fixed update of private blocks after deleting private creator
* Fixed bug in updating pydicom.config.use_DS_Decimal flag in DS_decimal()
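The 2.2.0 notes in this commit message mention is_valid_ds() and format_number_as_ds() for DICOM decimal strings (DS). The sketch below is not pydicom's implementation. It uses only the standard library to illustrate the underlying constraint, assuming a DS value is a plain decimal or exponent-form number of at most 16 characters. The names looks_like_valid_ds and shorten_to_ds are hypothetical.

```python
import math
import re

# Assumption: a DS value is an optionally signed fixed-point or
# exponent-form number, optionally padded with spaces, at most 16 chars.
_DS_PATTERN = re.compile(r" *[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)? *")
_DS_MAX_LENGTH = 16


def looks_like_valid_ds(text: str) -> bool:
    """Return True if text fits the assumed DS length and syntax rules."""
    return len(text) <= _DS_MAX_LENGTH and _DS_PATTERN.fullmatch(text) is not None


def shorten_to_ds(value: float) -> str:
    """Format value in at most 16 characters, dropping precision as needed."""
    if not math.isfinite(value):
        raise ValueError("DS cannot represent NaN or infinity")
    text = repr(float(value))
    if len(text) <= _DS_MAX_LENGTH:
        return text
    # Reduce the number of significant digits until the text fits.
    for digits in range(16, 0, -1):
        text = f"{value:.{digits}g}"
        if len(text) <= _DS_MAX_LENGTH:
            return text
    raise ValueError(f"cannot format {value!r} as DS")


if __name__ == "__main__":
    print(looks_like_valid_ds(repr(math.pi)))  # repr(pi) has 17 characters
    print(shorten_to_ds(math.pi))
```

The loop trades precision for length one significant digit at a time, so the result keeps as many digits as the 16-character limit allows under this simple scheme.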
2021-08-28 | biology/peak-classifier: Update to 0.1.1.20 | bacon | 3 | -9/+10
Updates for libxtend and biolibc API changes
2021-08-28 | biology/vcf2hap: Update to 0.1.3.12 | bacon | 3 | -8/+9
Updates for libxtend and biolibc API changes
2021-08-28 | biology/vcf-split: Update to 0.1.2.14 | bacon | 3 | -8/+9
Updates for libxtend and biolibc API changes
2021-08-28 | biology/ad2vcf: Update to 0.1.3.31 | bacon | 3 | -8/+9
Updates for libxtend and biolibc API changes
Clean up and minor bug fixes
2021-08-28 | biology/biolibc: Update to 0.2.0 | bacon | 5 | -29/+565
Major API overhaul
New classes for FASTA and FASTQ
Generate accessor and mutator functions for all classes
Generate man pages for all functions and macros
Export delimiter-separated-value class to libxtend
2021-06-29 | py-numpy: "Python version >= 3.7 required." | nia | 1 | -1/+3