path: root/biology
Age | Commit message | Author | Files | Lines
2022-12-14 | biology/biolibc-tools: Update to 0.1.4.1 | bacon | 3 | -7/+8
deromanize: Check for NULL field to prevent crash in some situations
Take optional input file as second argument
2022-12-13 | samtools: updated to 1.16.1 | adam | 3 | -13/+14
1.16.1
Bug fixes:
- Fixed a bug with the template-coordinate sort which caused incorrect ordering when using threads, or processing large files that don't fit completely in memory.
- Fixed a crash that occurred when trying to use samtools merge in template-coordinate mode.
1.16
New work and changes:
- samtools reference command added. This subcommand extracts the embedded reference out of a CRAM file.
- samtools import now adds grouped by query-name to the header.
- Made samtools view read error messages more generic. Former error message would claim that there was a "truncated file or corrupt BAM index file" with no real justification. Also reset errno in stream_view which could lead to confusing error messages.
- Make samtools view -p also clear mqual, tlen and cigar.
- Add bedcov option -c to report read count.
- Add UMI/barcode handling to samtools markdup.
- Add a new template coordinate sort order to samtools sort and samtools merge. This is useful when working with unique molecular identifiers (UMIs).
- Rename mpileup --ignore-overlaps to --ignore-overlaps-removal or --disable-overlap-removal. The previous name was ambiguous and was often read as an option to enable removal of overlapping bases, while in reality this is on by default and the option turns off the ability to remove overlapping bases.
- The dict command can now read BWA's .alt file and add AH:* tags indicating reference sequences that represent alternate loci.
- The samtools index command can now accept multiple alignment filenames with the new -M option, and will index each of them separately. (Specifying the output index filename via out.index or the new -o option is currently only applicable when there is only one alignment file to be indexed.)
- Allow samtools fastq -T "*". This allows all tags from SAM records to be written to fastq headers. This is a counterpart to samtools import -T "*".
Bug Fixes:
- Re-enable --reference option for samtools depth. The reference is not used but this makes the command line usage compatible with older releases.
- Fix regex coordinate bug in samtools markdup.
- Fix divide by zero in plot-bamstats -m, on unmapped data.
- Fix missing RG headers when using samtools merge -r.
- Fix a possible unaligned access in samtools reference.
Documentation:
- Add documentation on CRAM compression profiles and some of the newer options that appear in CRAM 3.1 and above.
- Add sclen filter expression keyword documentation.
- Extend FILTER EXPRESSION man page section to match the changes made in HTSlib.
Non user-visible changes and build improvements:
- Ensure generated test files are ignored (by git) and cleaned (by make testclean)
2022-12-13 | htslib: updated to 1.16 | adam | 2 | -14/+12
1.16
- Make hfile_s3 refresh AWS credentials on expiry in order to make HTSlib work better with AWS IAM credentials, which have a limited lifespan.
- Allow BAM headers between 2GB and 4GB in size once more. This is not permitted in the BAM specification but was allowed in an earlier version of HTSlib. There is now a warning at 2GB and a hard failure at 4GB.
- Improve error message when failing to load an index.
- Permit MM (base modification) tags containing . and ? suffixes. These define implicit vs explicit coordinates. See the SAM tags specification for details.
- Warn if spaces instead of tabs are detected in a VCF file to prevent confusion.
- Add an sclen filter expression keyword. This is the length of a soft-clip, both left and right end. It may be combined with qlen (qlen-sclen) to obtain the number of bases in the query sequence that have been aligned to the genome, i.e. it provides a way to compare local-alignment vs global-alignment length.
- Improve error messages for CRAM reference mismatches. If the user specifies the wrong reference, the CRAM slice header MD5sum checks fail. We now report the SQ line M5 string too so it is possible to validate against the whole chr in the ref.fa file. The error message has also been improved to report the reference name instead of #num. Finally, we now hint at the likely cause, which counters the misleading samtools supplied error of "truncated or corrupt" file.
- Expose more of the CRAM API and add new functionality to extract the reference from a CRAM file.
- Improvements to the implementation of embedded references in CRAM where no external reference is specified.
- The CRAM writer now allows alignment records with RG:Z: aux tags that don't have a corresponding @RG ID in the file header. Previously these tags would have been silently dropped. HTSlib will complain whenever it has to add one though, as such tags do not conform to recommended practice for the SAM, BAM and CRAM formats.
- Set tab delimiter in man page for tabix GFF3 sort.
- When using libdeflate, the 1...9 scale of BGZF compression levels is now remapped to the 1...12 range used by libdeflate instead of being passed directly. In particular, HTSlib levels 8 and 9 now map to libdeflate levels 10 and 12, so it is possible to select the highest (but slowest) compression offered by libdeflate.
- The VCF variant API has been extended so that it can return separate flags for INS and DEL variants as well as the existing INDEL one. These flags have not been added to the old bcf_get_variant_types() interface as it could break existing users. To access them, it is necessary to use new functions bcf_has_variant_type() and bcf_has_variant_types().
- The missing, but trivial, le_to_u8() function has been added to hts_endian.
- bcf_format_gt() now works properly on big-endian platforms.
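The sclen keyword in the htslib 1.16 notes above can be illustrated without HTSlib itself. The following is a minimal Python sketch, assuming a plain CIGAR string as input and assuming that qlen counts the query-consuming operations M, I, S, = and X. The helper names (`query_length`, `soft_clip_length`, `aligned_query_length`) are hypothetical and are not part of HTSlib or samtools.

```python
import re

# One CIGAR operation: a count followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
CIGAR_FULL = re.compile(r"(\d+[MIDNSHP=X])+")

# Operations that consume query bases (assumption used for qlen here).
QUERY_CONSUMING = frozenset("MIS=X")


def parse_cigar(cigar):
    """Split a CIGAR string into (length, operation) pairs."""
    if not CIGAR_FULL.fullmatch(cigar):
        raise ValueError(f"not a valid CIGAR string: {cigar!r}")
    return [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]


def query_length(cigar):
    """qlen: number of query bases, soft-clipped bases included."""
    return sum(n for n, op in parse_cigar(cigar) if op in QUERY_CONSUMING)


def soft_clip_length(cigar):
    """sclen: soft-clipped bases, left and right ends combined."""
    return sum(n for n, op in parse_cigar(cigar) if op == "S")


def aligned_query_length(cigar):
    """qlen - sclen: query bases that take part in the alignment."""
    return query_length(cigar) - soft_clip_length(cigar)


if __name__ == "__main__":
    for cigar in ("100M", "5S90M5S", "3S10M2I5M4S"):
        print(cigar, query_length(cigar), soft_clip_length(cigar),
              aligned_query_length(cigar))
```

A read whose qlen-sclen is much smaller than qlen was aligned only locally, which is the local-versus-global comparison the release note describes.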
2022-12-13 | biology/biolibc-tools: Update to 0.1.4 | bacon | 3 | -7/+11
Add deromanize subcommand to convert Roman numeral chromosome IDs
fastx-stats: Report standard deviation for read length
Changes: https://github.com/auerlab/biolibc-tools/releases
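To make concrete what converting Roman numeral chromosome IDs involves, here is a minimal Python sketch. It is not the biolibc-tools implementation: the function names, the "chr" prefix convention, and the `exclude` parameter are all assumptions made for illustration.

```python
import re

ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

# Canonical Roman numerals from 1 to 3999.
VALID_ROMAN = re.compile(
    r"M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})"
)


def roman_to_int(numeral):
    """Convert a canonical upper-case Roman numeral to an integer."""
    if not numeral or not VALID_ROMAN.fullmatch(numeral):
        raise ValueError(f"not a Roman numeral: {numeral!r}")
    total = 0
    # Pair each symbol with its successor; a smaller value placed before
    # a larger one is subtracted (IV = 4, IX = 9).
    for cur, nxt in zip(numeral, numeral[1:] + " "):
        value = ROMAN_VALUES[cur]
        if nxt in ROMAN_VALUES and ROMAN_VALUES[nxt] > value:
            total -= value
        else:
            total += value
    return total


def deromanize_id(chrom_id, prefix="chr", exclude=("M",)):
    """Rewrite e.g. 'chrIV' as 'chr4'; return other IDs unchanged.

    `exclude` lists suffixes that look like Roman numerals but should be
    kept as-is (hypothetical default: 'M' for a mitochondrial sequence).
    """
    if not chrom_id.startswith(prefix):
        return chrom_id
    suffix = chrom_id[len(prefix):]
    if suffix in exclude:
        return chrom_id
    try:
        return prefix + str(roman_to_int(suffix))
    except ValueError:
        return chrom_id


if __name__ == "__main__":
    for name in ("chrI", "chrIV", "chrXVI", "chrM", "chr7"):
        print(name, "->", deromanize_id(name))
```

Note that an ID such as chrX is ambiguous: this sketch reads it as chromosome 10, which only makes sense for genomes that number chromosomes in Roman numerals throughout.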
2022-12-12 | biology/Makefile: Add rna-seq meta-package | bacon | 1 | -1/+2
2022-12-12 | biology/rna-seq: Core tools needed for RNA-Seq analysis | bacon | 2 | -0/+28
The rna-seq meta-package provides the core tools needed for performing a typical RNA-Seq differential gene expression analysis, including adapter trimming, quality control, alignment, and identification of differentially expressed genes. Researchers may want additional tools for data manipulation, gene ontology, etc.
2022-12-12 | biology/Makefile: Add fasda | bacon | 1 | -1/+2
2022-12-12 | biology/fasda: Fast and simple differential analysis | bacon | 4 | -0/+39
FASDA aims to provide a fast and simple differential analysis tool that just works and does not require any knowledge beyond basic Unix command-line skills. The code is written entirely in C to maximize efficiency and portability, and to provide a simple command-line user interface.
2022-12-11 | biology/fastq-trim: Update to 0.1.2 | bacon | 3 | -8/+7
Minor enhancements, fixes for SunOS
Changes: https://github.com/auerlab/fastq-trim/releases
2022-12-11 | biology/vcf-split: Update to 0.1.5.4 | bacon | 3 | -7/+8
Update for biolibc API changes
2022-12-11 | biology/peak-classifier: Update to 0.1.4-5 | bacon | 3 | -8/+8
Update for biolibc API changes
2022-12-11 | biology/biolibc: Update to 0.2.4 | bacon | 4 | -9/+9
Minor enhancements
Changes: https://github.com/auerlab/biolibc/releases
2022-12-03 | biology/balance-tui: update to 0.1.1 | pin | 3 | -28/+32
- Updated dependencies
2022-11-28 | Add balance-tui | pin | 1 | -1/+2
2022-11-28 | biology/balance-tui: import package | pin | 5 | -0/+412
Balance tui is a simple CLI program to balance chemical equations.
2022-11-23 | massive revision bump after textproc/icu update | adam | 7 | -13/+14
2022-11-16 | py-cutadapt: fix build with python 3.11 | wiz | 1 | -1/+8
2022-11-15 | py-pydicom: updated to 2.3.1 | adam | 2 | -6/+6
pydicom 2.3.1
Small fix to make 2.3.X compatible with Python 3.11.
2022-11-14 | py-dnaio: update to 0.9.1. | wiz | 3 | -11/+21
v0.9.1 (2022-08-01)
-------------------
* PR 85: macOS wheels are now also built as part of the release procedure.
* PR 81: API documentation improvements and minor code refactors for readability.
v0.9.0 (2022-05-17)
-------------------
* PR 79: Added a `records_are_mates` function to be used for checking whether three or more records are mates of each other (by checking the ID).
* PR 74, PR 68: Made FASTQ parsing faster by implementing the check for ASCII using SSE vector instructions.
* PR 72: Added a tutorial (https://dnaio.readthedocs.io/en/latest/tutorial.html).
v0.8.0 (2022-03-26)
-------------------
* Preliminary documentation is available at <https://dnaio.readthedocs.io/>.
* PR 53: Renamed `Sequence` to `SequenceRecord`. The previous name is still available as an alias so that existing code will continue to work.
* When reading a FASTQ file, there is now a check that ensures that all characters are ASCII.
* Function `record_names_match` is deprecated, use `SequenceRecord.is_mate` instead.
* Dropped Python 3.6 support as it is end-of-life.
v0.7.1 (2022-01-26)
-------------------
* PR 34: Fix parsing of FASTA files that just contain a comment and no reads
v0.7.0 (2022-01-17)
-------------------
* @rhpvorderman contributed many performance improvements in PR 15, PR 17, PR 18, PR 20, PR 21, PR 22, PR 23. Reading and writing FASTQ files and reading of paired-end FASTQ files were sped up significantly. For example, reading uncompressed FASTQ is 50% faster (!) than before.
* PR 28: Windows support added
v0.6.0 (2021-09-28)
-------------------
* PR 12: Improve FASTQ writing speed twofold (thanks to @rhpvorderman)
v0.5.2 (2021-09-07)
-------------------
* Issue 7: Ignore a trailing "3" in the read id
2022-11-08 | py-pydicom: updated to 2.3.0 | adam | 3 | -66/+60
Version 2.3.0
=================================
Changes
-------
* `pydicom.dataelem.DataElement.description` is deprecated and will be removed in v3.0, use `pydicom.dataelem.DataElement.name` instead
* Updated the private dictionary
* `pydicom.config.enforce_valid_values` is deprecated in favor of `pydicom.config.settings.reading_validation_mode`
* Added `download` parameter to `pydicom.data.get_testdata_file` to allow skipping downloading the file if missing locally (PR 1617)
Enhancements
------------
* Values are now validated for valid length, allowed character set and format on reading and writing. Depending on the value of `pydicom.config.settings.reading_validation_mode` and `pydicom.config.settings.writing_validation_mode` a warning is logged, an exception is raised, or the validation is skipped.
* Added `pydicom.valuerep.VR` enum (PR 1500)
* UIDs for all Storage SOP Classes have been added to the `uid` module (issue 1498)
* Use rle_handler as last resort handler for decoding RLE encoded data as it is the slowest handler (issue 1487)
* Added, enhanced, or removed a number of Mitra private dictionary entries (PR 1588)
* Added support for unpacking bit-packed data without using NumPy to `pydicom.pixel_data_handlers.util.unpack_bits` (PR 1594)
* Added `pydicom.pixel_data_handlers.util.expand_ybr422` for expanding uncompressed `YBR_FULL_422` data to `YBR_FULL` (PR 1593)
* Replacement of `UN` VR with `SQ` VR for undefined length data elements (introduced in 2.2.2), can now be configured via `pydicom.config.settings.infer_sq_for_un_vr`
* Updated dictionaries to DICOM 2022a
Fixes
-----
* Fixed odd-length OB values not being padded during write (issue 1511)
* Fixed Hologic private dictionary entry (0019xx43)
* Fixed Mitra global patient ID private dictionary entry (PR 1588)
* Fixed `pydicom.dataset.Dataset.compress` not setting the correct encoding for the rest of the dataset (issue 1565)
* Fixed `AttributeError` on deep copy of `pydicom.dataset.FileDataset` (issue 1571)
* Fixed an exception during pixel decoding if using GDCM < 2.8.8 on Windows (issue 1581)
* Fixed crashes on Windows and macOS when using the GDCM plugin to compress into RLE Lossless (issue 1581)
* Fixed `dir(Dataset())` not returning class attributes (issue 1599)
* Fixed bad DICOMDIR offsets when using `pydicom.fileset.FileSet.write()` with a Directory Record Sequence using undefined length items (issue 1596)
* Assigning a list of length one as tag value is now correctly handled as assigning the single value (issue 1606)
* Fixed an exception with multiple deferred reads with file-like objects (issue 1609)
2022-11-06 | biology/Makefile: Add gffread | bacon | 1 | -1/+2
2022-11-06 | biology/gffread: GFF/GTF format conversions, filtering, etc | bacon | 5 | -0/+110
GFF/GTF utility providing format conversions, filtering, FASTA sequence extraction and more. The program gffread can be used to validate, filter, convert and perform various other operations on GFF files. Because the program shares the same GFF parser code with Cufflinks, Stringtie, and gffcompare, it could be used to verify that a GFF file from a certain annotation source is correctly "understood" by these programs.
2022-11-06 | biology/Makefile: Add fastq-trim | bacon | 1 | -1/+2
2022-11-06 | biology/fastq-trim: Lightning fast sequence read trimmer | bacon | 4 | -0/+107
Fastq-trim is a lightning fast read trimming tool for QA of DNA and RNA reads prior to analyses such as RNA-Seq. It runs in a fraction of the time required by popular trimmers and uses only a few megabytes of RAM, so it will run almost entirely in cache. The design supports adding any number of alignment functions, so it can be easily adapted to any trimming needs.
2022-10-26 | *: bump PKGREVISION for libunistring shlib major bump | wiz | 2 | -3/+4
2022-10-18 | biology/molsketch: update to 0.7.3 | pin | 2 | -7/+6
- This is only a small release coming out in order to establish an automated build and publication pipeline. Some new bond types were added nevertheless.
2022-08-11 | Bump all dependent packages of wayland (belatedly) | gutteridge | 2 | -4/+4
The package changed with the addition of its libepoll-shim dependency. Otherwise, we can get:
ERROR: libepoll-shim>=0.0.20210418 is not installed; can't buildlink files.
2022-07-25 | *: remove pkg-config from tools where no buildlink3.mk file is included | wiz | 1 | -2/+1
Bulk build on NetBSD of these packages had the same result as before (build succeeds, no PLIST change).
2022-07-05 | htslib: updated to 1.15.1 | adam | 2 | -8/+7
1.15.1
- Security fix: Fixed broken error reporting in the sam_cap_mapq() function, due to a missing hts_log() parameter. Prior to this fix it was possible to abuse the log message format string by passing a specially crafted alignment record to this function.
- HTSlib now uses libhtscodecs release 1.2.2. This fixes a number of bugs where invalid compressed data could trigger usage of uninitialised values.
- Fixed excessive memory used by multi-threaded SAM output on long reads.
- Fixed a bug where tabix would misinterpret region specifiers starting at position 0. It will also now warn if the file being indexed is supposed to be 1-based but has positions less than or equal to 0.
- The VCF header parser will now issue a warning if it finds an INFO header with Type=Flag but Number not equal to 0. It will also ignore the incorrect Number so the flag can be used.
2022-06-30 | *: Revbump packages that use Python at runtime without a PKGNAME prefix | nia | 9 | -15/+18
2022-06-28 | *: recursive bump for perl 5.36 | wiz | 22 | -34/+44
2022-06-11 | biology/peak-classifier: Update to 0.1.4 | bacon | 3 | -7/+7
Update for bl_gff_t API streamlining
https://github.com/auerlab/peak-classifier/releases
2022-06-11 | biology/vcf2hap: Update to 0.1.6 | bacon | 3 | -7/+7
Update for bl_vcf_t API streamlining
Changes: https://github.com/auerlab/vcf2hap/releases
2022-06-11 | biology/ad2vcf: Update to 0.1.6 | bacon | 3 | -7/+7
Updates for bl_sam_t and bl_vcf_t API streamlining
Changes: https://github.com/auerlab/ad2vcf/releases
2022-06-11 | biology/vcf-split: Update to 0.1.5 | bacon | 3 | -7/+7
Use latest biolibc API
Minor build system improvements
https://github.com/auerlab/vcf-split/releases
2022-06-11 | biology/biolibc-tools: Update to 0.1.3 | bacon | 3 | -7/+9
Add vcf-downsample subcommand
Improvements to build system
extract-seq: Recurse to output subfeatures
Numerous other minor enhancements and fixes
Changes: https://github.com/auerlab/biolibc-tools/releases
2022-06-11 | biology/biolibc: Update to 0.2.3 | bacon | 4 | -15/+62
Expand use of tsv_read_field_malloc() to improve memory efficiency
Add SAM bit flag constants
Import SAM-GFF compare functions from diffanal
Updates for libxtend DSV API changes
Numerous minor bug fixes and enhancements
Changes: https://github.com/auerlab/biolibc/releases
2022-04-25 | stacks: needs -lsocket on SunOS | tnn | 1 | -1/+3
2022-04-25 | stacks: avoid ambiguous math functions | tnn | 7 | -1/+100
2022-04-18 | revbump for textproc/icu update | adam | 7 | -13/+14
2022-04-10 | biology/ncbi-blast+: Update to 2.13.0 | bacon | 6 | -51/+276
Several minor bug fixes and improvements since 2.11.0
Changes: https://www.ncbi.nlm.nih.gov/books/NBK131777/?report=reader
2022-04-05 | biology/Makefile: Add fastx-toolkit | bacon | 1 | -1/+2
2022-04-05 | biology/fastx-toolkit: CLI tools for FASTA/FASTQ files preprocessing | bacon | 7 | -0/+109
The FASTX-Toolkit is a collection of command line tools for Short-Reads FASTA/FASTQ files preprocessing.
2022-04-03 | revbump for devel/protobuf | adam | 1 | -2/+2
2022-03-17 | biology/stacks: Update to 2.60 | bacon | 3 | -63/+35
Numerous bug fixes and enhancements since 2.2
Unbreak build on Darwin
2022-03-17 | biology/hisat2: Update to 2.2.1 | bacon | 7 | -482/+198
pkgsrc fix: Unbreak build on Darwin
Add python3 support
Several bug fixes and enhancements
Changes: https://github.com/DaehwanKimLab/hisat2/tags
2022-03-16 | biology/ncbi-blast+: Disable MKPIE | bacon | 1 | -1/+2
Temporary fix to unbreak build on NetBSD with freeze approaching
2022-03-15 | biology/vcf2hap: Update to 0.1.5 | bacon | 3 | -7/+7
Minor update for biolibc 0.2.2 API changes
2022-03-15 | biology/vcf-split: Update to 0.1.4 | bacon | 3 | -8/+7
Minor update for biolibc 0.2.2 API changes
2022-03-15 | biology/peak-classifier: Update to 0.1.3 | bacon | 3 | -7/+7
Minor update for biolibc 0.2.2 API changes