Age | Commit message | Author | Files | Lines |
|
deromanize:
Check for NULL field to prevent crash in some situations
Take optional input file as second argument
|
|
1.16.1
Bug fixes:
Fixed a bug with the template-coordinate sort which caused incorrect ordering when using threads, or processing large files that don't fit completely in memory.
Fixed a crash that occurred when trying to use samtools merge in template-coordinate mode.
1.16
New work and changes:
samtools reference command added. This subcommand extracts the embedded reference out of a CRAM file.
samtools import now adds grouped by query-name to the header.
Made samtools view read error messages more generic. The former error message claimed that there was a "truncated file or corrupt BAM index file" with no real justification. Also reset errno in stream_view, where a stale value could lead to confusing error messages.
Make samtools view -p also clear mqual, tlen and cigar.
Add bedcov option -c to report read count.
Add UMI/barcode handling to samtools markdup.
Add a new template coordinate sort order to samtools sort and samtools merge. This is useful when working with unique molecular identifiers (UMIs).
Rename mpileup --ignore-overlaps to --ignore-overlaps-removal or --disable-overlap-removal. The previous name was ambiguous and was often read as an option to enable removal of overlapping bases, while in reality this is on by default and the option turns off the ability to remove overlapping bases.
The dict command can now read BWA's .alt file and add AH:* tags indicating reference sequences that represent alternate loci.
The samtools index command can now accept multiple alignment filenames with the new -M option, and will index each of them separately. (Specifying the output index filename via out.index or the new -o option is currently only applicable when there is only one alignment file to be indexed.)
Allow samtools fastq -T "*". This allows all tags from SAM records to be written to fastq headers. This is a counterpart to samtools import -T "*".
Bug Fixes:
Re-enable --reference option for samtools depth. The reference is not used but this makes the command line usage compatible with older releases.
Fix regex coordinate bug in samtools markdup.
Fix divide by zero in plot-bamstats -m, on unmapped data.
Fix missing RG headers when using samtools merge -r.
Fix a possible unaligned access in samtools reference.
Documentation:
Add documentation on CRAM compression profiles and some of the newer options that appear in CRAM 3.1 and above.
Add sclen filter expression keyword documentation.
Extend FILTER EXPRESSION man page section to match the changes made in HTSlib.
Non user-visible changes and build improvements:
Ensure generated test files are ignored (by git) and cleaned (by make testclean)
|
|
1.16
Make hfile_s3 refresh AWS credentials on expiry in order to make HTSlib work better with AWS IAM credentials, which have a limited lifespan.
Allow BAM headers between 2GB and 4GB in size once more. This is not permitted in the BAM specification but was allowed in an earlier version of HTSlib. There is now a warning at 2GB and a hard failure at 4GB.
Improve error message when failing to load an index.
Permit MM (base modification) tags containing . and ? suffixes. These define implicit vs explicit coordinates. See the SAM tags specification for details.
Warn if spaces instead of tabs are detected in a VCF file to prevent confusion.
Add an sclen filter expression keyword. This is the total soft-clipped length, counting both the left and right ends. It may be combined with qlen (qlen-sclen) to obtain the number of bases in the query sequence that have been aligned to the genome, i.e. it provides a way to compare local-alignment vs global-alignment length.
Improve error messages for CRAM reference mismatches. If the user specifies the wrong reference, the CRAM slice header MD5sum checks fail. We now report the SQ line M5 string too so it is possible to validate against the whole chr in the ref.fa file. The error message has also been improved to report the reference name instead of #num. Finally, we now hint at the likely cause, which counters the misleading samtools supplied error of "truncated or corrupt" file.
Expose more of the CRAM API and add new functionality to extract the reference from a CRAM file.
Improvements to the implementation of embedded references in CRAM where no external reference is specified.
The CRAM writer now allows alignment records with RG:Z: aux tags that don't have a corresponding @RG ID in the file header. Previously these tags would have been silently dropped. HTSlib will complain whenever it has to add one though, as such tags do not conform to recommended practice for the SAM, BAM and CRAM formats.
Set tab delimiter in man page for tabix GFF3 sort.
When using libdeflate, the 1...9 scale of BGZF compression levels is now remapped to the 1...12 range used by libdeflate instead of being passed directly. In particular, HTSlib levels 8 and 9 now map to libdeflate levels 10 and 12, so it is possible to select the highest (but slowest) compression offered by libdeflate.
The VCF variant API has been extended so that it can return separate flags for INS and DEL variants as well as the existing INDEL one. These flags have not been added to the old bcf_get_variant_types() interface as it could break existing users. To access them, it is necessary to use new functions bcf_has_variant_type() and bcf_has_variant_types().
The missing, but trivial, le_to_u8() function has been added to hts_endian.
bcf_format_gt() now works properly on big-endian platforms.
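The qlen-sclen arithmetic behind the sclen keyword in the notes above can be illustrated with a minimal sketch that derives both values from a CIGAR string. This is not HTSlib code: the helper name `cigar_lengths` is hypothetical, and it assumes that qlen counts every query-consuming operation (M, I, S, =, X), as defined in the SAM specification.

```python
import re

# One CIGAR element: a count followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_lengths(cigar):
    """Return (qlen, sclen) for a CIGAR string (hypothetical helper).

    qlen  : number of query bases, soft clips included (assumption).
    sclen : number of soft-clipped bases, both ends combined.
    """
    qlen = 0
    sclen = 0
    for count, op in CIGAR_OP.findall(cigar):
        n = int(count)
        if op in "MIS=X":  # operations that consume query bases
            qlen += n
        if op == "S":  # soft clip at either end
            sclen += n
    return qlen, sclen


qlen, sclen = cigar_lengths("5S90M5S")
aligned_bases = qlen - sclen  # query bases that take part in the alignment
print(qlen, sclen, aligned_bases)
```

For a read that is aligned end to end the two lengths agree (sclen is 0), so a large difference points at a local alignment.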
|
|
Add deromanize subcommand to convert Roman numeral chromosome IDs
fastx-stats: Report standard deviation for read length
Changes: https://github.com/auerlab/biolibc-tools/releases
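As an illustration of what converting Roman numeral chromosome IDs involves, here is a minimal Python sketch. It is not the deromanize implementation (which is part of biolibc-tools and written in C); the function names and the `chr` prefix handling are assumptions made for this example.

```python
import re

ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_to_int(numeral):
    """Convert a Roman numeral such as 'XIV' to an integer."""
    total = 0
    for i, ch in enumerate(numeral):
        value = ROMAN_VALUES[ch]
        # A smaller value placed before a larger one is subtracted (IV = 4).
        if i + 1 < len(numeral) and ROMAN_VALUES[numeral[i + 1]] > value:
            total -= value
        else:
            total += value
    return total


def deromanize_id(chrom_id, prefix="chr"):
    """Rewrite e.g. 'chrIV' as 'chr4'; leave other IDs unchanged.

    Caveat of this sketch: every all-Roman suffix is converted, so IDs
    such as 'chrX' or 'chrM' would be rewritten too. A real tool needs
    a policy for those cases.
    """
    if chrom_id.startswith(prefix):
        suffix = chrom_id[len(prefix):]
        if re.fullmatch(r"[IVXLCDM]+", suffix):
            return f"{prefix}{roman_to_int(suffix)}"
    return chrom_id


for name in ("chrIV", "chrXVI", "chr2", "scaffold_1"):
    print(name, "->", deromanize_id(name))
```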
|
|
|
|
The rna-seq meta-package provides the core tools needed for performing
a typical RNA-Seq differential gene expression analysis, including
adapter trimming, quality control, alignment, and identification of
differentially expressed genes. Researchers may want additional tools
for data manipulation, gene ontology, etc.
|
|
|
|
FASDA aims to provide a fast and simple differential analysis tool
that just works and does not require any knowledge beyond basic Unix
command-line skills. The code is written entirely in C to maximize
efficiency and portability, and to provide a simple command-line user
interface.
|
|
Minor enhancements, fixes for SunOS
Changes: https://github.com/auerlab/fastq-trim/releases
|
|
Update for biolibc API changes
|
|
Update for biolibc API changes
|
|
Minor enhancements
Changes: https://github.com/auerlab/biolibc/releases
|
|
- Updated dependencies
|
|
|
|
Balance tui is a simple CLI program to balance chemical equations.
|
|
|
|
|
|
pydicom 2.3.1
Small fix to make 2.3.X compatible with Python 3.11.
|
|
v0.9.1 (2022-08-01)
-------------------
* :pr:`85`: macOS wheels are now also built as part of the release procedure.
* :pr:`81`: API documentation improvements and minor code refactors for
readability.
v0.9.0 (2022-05-17)
-------------------
* :pr:`79`: Added a `records_are_mates` function to be used for checking whether
three or more records are mates of each other (by checking the ID).
* :pr:`74`, :pr:`68`: Made FASTQ parsing faster by implementing the check for
ASCII using SSE vector instructions.
* :pr:`72`: Added a `tutorial <https://dnaio.readthedocs.io/en/latest/tutorial.html>`_.
v0.8.0 (2022-03-26)
-------------------
* Preliminary documentation is available at
<https://dnaio.readthedocs.io/>.
* :pr:`53`: Renamed ``Sequence`` to `SequenceRecord`.
The previous name is still available as an alias
so that existing code will continue to work.
* When reading a FASTQ file, there is now a check that ensures that
all characters are ASCII.
* Function ``record_names_match`` is deprecated, use `SequenceRecord.is_mate` instead.
* Dropped Python 3.6 support as it is end-of-life.
v0.7.1 (2022-01-26)
-------------------
* :pr:`34`: Fix parsing of FASTA files that just contain a comment and no reads
v0.7.0 (2022-01-17)
-------------------
* @rhpvorderman contributed many performance improvements in :pr:`15`,
:pr:`17`, :pr:`18`, :pr:`20`, :pr:`21`, :pr:`22`, :pr:`23`. Reading
and writing FASTQ files and reading of paired-end FASTQ files was
sped up significantly. For example, reading uncompressed FASTQ is
50% faster (!) than before.
* :pr:`28`: Windows support added
v0.6.0 (2021-09-28)
-------------------
* :pr:`12`: Improve FASTQ writing speed twofold (thanks to @rhpvorderman)
v0.5.2 (2021-09-07)
-------------------
* :issue:`7`: Ignore a trailing "3" in the read id
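The mate checks mentioned in the entries above (comparing record IDs, ignoring a trailing "1", "2" or "3") can be sketched in plain Python. This is only an illustration under stated assumptions, not dnaio's implementation: the helper names are hypothetical, the ID is taken to be the header text up to the first whitespace, and a single trailing 1, 2 or 3 is dropped before comparing.

```python
def normalized_id(header):
    """Return the read ID with one trailing '1', '2' or '3' removed."""
    parts = header.split(maxsplit=1)
    read_id = parts[0] if parts else ""
    if read_id and read_id[-1] in "123":
        read_id = read_id[:-1]
    return read_id


def headers_are_mates(*headers):
    """Check whether two or more FASTQ headers share the same read ID."""
    if len(headers) < 2:
        raise ValueError("at least two headers are required")
    first = normalized_id(headers[0])
    return all(normalized_id(h) == first for h in headers[1:])


print(headers_are_mates("read7/1 lane=3", "read7/2", "read7/3"))
print(headers_are_mates("read7/1", "read8/2"))
```

Because only the last character is dropped, IDs such as `read1` and `read2` also compare equal in this sketch; that looseness is the price of accepting old-style `/1` and `/2` suffixes.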
|
|
Version 2.3.0
=================================
Changes
-------
* :meth:`DataElement.description<pydicom.dataelem.DataElement.description>` is
deprecated and will be removed in v3.0, use
:attr:`DataElement.name<pydicom.dataelem.DataElement.name>` instead
* Updated the private dictionary
* :attr:`~pydicom.config.enforce_valid_values` is deprecated in favor of
:attr:`~pydicom.config.settings.reading_validation_mode`
* Added `download` parameter to :func:`~pydicom.data.get_testdata_file`
to allow skipping the download if the file is missing locally (:pr:`1617`)
Enhancements
------------
* Values are now validated for valid length, allowed character set and format
on reading and writing. Depending on the value of
:attr:`~pydicom.config.settings.reading_validation_mode`
and :attr:`~pydicom.config.settings.writing_validation_mode`
a warning is logged, an exception is raised, or the validation is skipped.
* Added :class:`~pydicom.valuerep.VR` enum (:pr:`1500`)
* UIDs for all Storage SOP Classes have been added to the ``uid`` module
(:issue:`1498`)
* Use rle_handler as last resort handler for decoding RLE encoded data as it is
the slowest handler (:issue:`1487`)
* Added, enhanced, or removed a number of Mitra private dictionary entries (:pr:`1588`)
* Added support for unpacking bit-packed data without using NumPy to
:func:`~pydicom.pixel_data_handlers.util.unpack_bits` (:pr:`1594`)
* Added :func:`~pydicom.pixel_data_handlers.util.expand_ybr422` for expanding
uncompressed ``YBR_FULL_422`` data to ``YBR_FULL`` (:pr:`1593`)
* Replacement of ``UN`` VR with ``SQ`` VR for undefined length data elements
(introduced in 2.2.2), can now be configured via
:attr:`~pydicom.config.settings.infer_sq_for_un_vr`
* Updated dictionaries to DICOM 2022a
Fixes
-----
* Fixed odd-length **OB** values not being padded during write (:issue:`1511`)
* Fixed Hologic private dictionary entry (0019xx43)
* Fixed Mitra global patient ID private dictionary entry (:pr:`1588`)
* Fixed :meth:`~pydicom.dataset.Dataset.compress` not setting the correct
encoding for the rest of the dataset (:issue:`1565`)
* Fixed `AttributeError` on deep copy of :class:`~pydicom.dataset.FileDataset`
(:issue:`1571`)
* Fixed an exception during pixel decoding if using GDCM < 2.8.8 on Windows
(:issue:`1581`)
* Fixed crashes on Windows and MacOS when using the GDCM plugin to compress
into *RLE Lossless* (:issue:`1581`)
* Fixed ``dir(Dataset())`` not returning class attributes (:issue:`1599`)
* Fixed bad DICOMDIR offsets when using :meth:`FileSet.write()
<pydicom.fileset.FileSet.write>` with a *Directory Record Sequence* using
undefined length items (:issue:`1596`)
* Assigning a list of length one as tag value is now correctly handled as
assigning the single value (:issue:`1606`)
* Fixed an exception with multiple deferred reads with file-like objects
(:issue:`1609`)
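To make the "unpacking bit-packed data without using NumPy" enhancement listed above more concrete, here is a minimal pure-Python sketch of the idea. It is not pydicom's code: the function name `unpack_bits_pure` is hypothetical, and it assumes the DICOM convention for 1-bit data in which the first pixel is stored in the least significant bit of each byte.

```python
def unpack_bits_pure(data):
    """Expand each input byte into eight bytes holding 0 or 1.

    Bits are emitted least significant bit first (assumed DICOM
    packing order for data with Bits Allocated = 1).
    """
    bits = bytearray()
    for byte in data:
        for shift in range(8):
            bits.append((byte >> shift) & 1)
    return bytes(bits)


packed = b"\x05"  # binary 00000101
print(list(unpack_bits_pure(packed)))
```

A pure-Python loop like this is much slower than a vectorised NumPy call, so it is best seen as a fallback for environments where NumPy is not installed.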
|
|
|
|
GFF/GTF utility providing format conversions, filtering, FASTA sequence
extraction and more. The program gffread can be used to validate,
filter, convert and perform various other operations on GFF files.
Because the program shares the same GFF parser code with Cufflinks,
Stringtie, and gffcompare, it could be used to verify that a GFF file
from a certain annotation source is correctly "understood" by these
programs.
|
|
|
|
Fastq-trim is a lightning-fast read trimming tool for QA of DNA and RNA reads
prior to analyses such as RNA-Seq. It runs in a fraction of the time required
by popular trimmers and uses only a few megabytes of RAM, so it will run
almost entirely in cache. The design supports adding any number of alignment
functions, so it can be easily adapted to any trimming needs.
|
|
|
|
- This is only a small release coming out in order to establish an automated
build and publication pipeline.
Some new bond types were added nevertheless.
|
|
The package changed with the addition of its libepoll-shim dependency.
Otherwise, we can get:
ERROR: libepoll-shim>=0.0.20210418 is not installed; can't buildlink files.
|
|
Bulk build on NetBSD of these packages had the same result as before
(build succeeds, no PLIST change).
|
|
1.15.1
Security fix: Fixed broken error reporting in the sam_cap_mapq() function, due to a missing hts_log() parameter. Prior to this fix it was possible to abuse the log message format string by passing a specially crafted alignment record to this function.
HTSlib now uses libhtscodecs release 1.2.2. This fixes a number of bugs where invalid compressed data could trigger usage of uninitialised values.
Fixed excessive memory used by multi-threaded SAM output on long reads.
Fixed a bug where tabix would misinterpret region specifiers starting at position 0. It will also now warn if the file being indexed is supposed to be 1-based but has positions less than or equal to 0.
The VCF header parser will now issue a warning if it finds an INFO header with Type=Flag but Number not equal to 0. It will also ignore the incorrect Number so the flag can be used.
|
|
|
|
|
|
Update for bl_gff_t API streamlining
https://github.com/auerlab/peak-classifier/releases
|
|
Update for bl_vcf_t API streamlining
Changes: https://github.com/auerlab/vcf2hap/releases
|
|
Updates for bl_sam_t and bl_vcf_t API streamlining
Changes: https://github.com/auerlab/ad2vcf/releases
|
|
Use latest biolibc API
Minor build system improvements
https://github.com/auerlab/vcf-split/releases
|
|
Add vcf-downsample subcommand
Improvements to build system
extract-seq: Recurse to output subfeatures
Numerous other minor enhancements and fixes
Changes: https://github.com/auerlab/biolibc-tools/releases
|
|
Expand use of tsv_read_field_malloc() to improve memory efficiency
Add SAM bit flag constants
Import SAM-GFF compare functions from diffanal
Updates for libxtend DSV API changes
Numerous minor bug fixes and enhancements
Changes: https://github.com/auerlab/biolibc/releases
|
|
|
|
|
|
|
|
Several minor bug fixes and improvements since 2.11.0
Changes: https://www.ncbi.nlm.nih.gov/books/NBK131777/?report=reader
|
|
|
|
The FASTX-Toolkit is a collection of command line tools for
Short-Reads FASTA/FASTQ files preprocessing.
|
|
|
|
Numerous bug fixes and enhancements since 2.2
Unbreak build on Darwin
|
|
pkgsrc fix: Unbreak build on Darwin
Add python3 support
Several bug fixes and enhancements
Changes: https://github.com/DaehwanKimLab/hisat2/tags
|
|
Temporary fix to unbreak build on NetBSD with freeze approaching
|
|
Minor update for biolibc 0.2.2 API changes
|
|
Minor update for biolibc 0.2.2 API changes
|
|
Minor update for biolibc 0.2.2 API changes
|