path: root/math/py-pandas
Age | Commit message | Author | Files | Lines
2016-08-19 | Prefer egg.mk to distutils.mk. Clean up. Add missing dependency on py-sqlite3. | wiz | 2 | -10/+16
Add missing test dependency on py-nose. Add comments with links to bug reports about test failures. Bump PKGREVISION for dependency change.
2016-08-16 | Update py-pandas to 0.18.1 | maya | 3 | -100/+446
Highlights in changelog:

v0.18.1:
* .groupby(...) has been enhanced to provide convenient syntax when working with .rolling(..), .expanding(..) and .resample(..) per group.
* pd.to_datetime() has gained the ability to assemble dates from a DataFrame.
* Method chaining improvements.
* Custom business hour offset.
* Many bug fixes in the handling of sparse.
* Expanded the Tutorials section with a feature on modern pandas, courtesy of @TomAugsburger (GH13045).

v0.18.0:
* Moving and expanding window functions are now methods on Series and DataFrame, similar to .groupby.
* Adding support for a RangeIndex as a specialized form of the Int64Index for memory savings.
* API breaking change to the .resample method to make it more .groupby like.
* Removal of support for positional indexing with floats, which was deprecated since 0.14.0. This will now raise a TypeError.
* The .to_xarray() function has been added for compatibility with the xarray package.
* The read_sas function has been enhanced to read sas7bdat files.
* Addition of the .str.extractall() method, and API changes to the .str.extract() method and .str.cat() method.
* pd.test() top-level nose test runner is available (GH4327).

Update by K.I.A.Derouiche in PR pkg/51272. Slightly modified.
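As a rough illustration of two of the items above (window functions as methods, and pd.to_datetime() assembling dates from a DataFrame), here is a minimal sketch. It assumes a pandas release at least as new as 0.18.1; the sample values and variable names are made up for the example and are not part of the commit.

```python
import pandas as pd

# Window functions are methods on Series/DataFrame (0.18.0 and later).
s = pd.Series([1.0, 2.0, 3.0, 4.0])
rolling_mean = s.rolling(window=2).mean()  # first value is NaN: window not yet full

# pd.to_datetime() can assemble dates from columns named year/month/day
# (0.18.1 and later). The frame below is illustrative data only.
parts = pd.DataFrame({"year": [2016, 2016], "month": [8, 8], "day": [16, 19]})
dates = pd.to_datetime(parts)

print(rolling_mean)
print(dates)
```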
2016-07-15 | Do not include py-numexpr/bl3.mk, just DEPEND on it. | wiz | 1 | -2/+2
2016-06-08 | Switch to MASTER_SITES_PYPI. | wiz | 1 | -2/+2
2015-12-28 | Update py-pandas to 0.17.1. | wiz | 3 | -18/+120
0.17.1

This is a minor bug-fix release from 0.17.0 and includes a large number of bug fixes along with several new features, enhancements, and performance improvements. We recommend that all users upgrade to this version.

Highlights include:
* Support for Conditional HTML Formatting.
* Releasing the GIL on the csv reader & other ops.
* Fixed regression in DataFrame.drop_duplicates from 0.16.2, causing incorrect results on integer values (GH11376).

0.17.0

This is a major release from 0.16.2 and includes a small number of API changes, several new features, enhancements, and performance improvements along with a large number of bug fixes. We recommend that all users upgrade to this version.

Highlights include:
* Release the Global Interpreter Lock (GIL) on some cython operations.
* Plotting methods are now available as attributes of the .plot accessor.
* The sorting API has been revamped to remove some long-time inconsistencies.
* Support for a datetime64[ns] with timezones as a first-class dtype.
* The default for to_datetime will now be to raise when presented with unparseable formats; previously this would return the original input. Also, date parse functions now return consistent results.
* The default for dropna in HDFStore has changed to False, to store by default all rows even if they are all NaN.
* Datetime accessor (dt) now supports Series.dt.strftime to generate formatted strings for datetime-likes, and Series.dt.total_seconds to generate each duration of the timedelta in seconds.
* Period and PeriodIndex can handle multiplied freq like 3D, which corresponds to a 3-day span.
* Development installed versions of pandas will now have PEP440 compliant version strings (GH9518).
* Development support for benchmarking with the Air Speed Velocity library (GH8361).
* Support for reading SAS xport files.
* Documentation comparing SAS to pandas.
* Removal of the automatic TimeSeries broadcasting, deprecated since 0.8.0.
* Display format with plain text can optionally align with Unicode East Asian Width.
* Compatibility with Python 3.5 (GH11097).
* Compatibility with matplotlib 1.5.0 (GH11111).
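The datetime accessor additions and the stricter to_datetime default from 0.17.0 can be sketched as follows. This is only an illustration, assuming pandas 0.17.0 or newer; the input strings and variable names are invented for the example.

```python
import pandas as pd

# Series.dt.strftime formats datetime-like values as strings.
stamps = pd.Series(pd.to_datetime(["2015-12-28", "2015-11-03"]))
labels = stamps.dt.strftime("%Y/%m/%d")

# Series.dt.total_seconds gives each timedelta's duration in seconds.
durations = pd.Series(pd.to_timedelta(["1 days", "0 days 01:30:00"]))
seconds = durations.dt.total_seconds()

# to_datetime raises on unparseable input by default instead of
# returning the original value.
try:
    pd.to_datetime("not a date")
    raised = False
except ValueError:
    raised = True

print(labels.tolist(), seconds.tolist(), raised)
```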
2015-11-03 | Add SHA512 digests for distfiles for math category | agc | 1 | -1/+2
Problems found locating distfiles:
* Package dfftpack: missing distfile dfftpack-20001209.tar.gz
* Package eispack: missing distfile eispack-20001130.tar.gz
* Package fftpack: missing distfile fftpack-20001130.tar.gz
* Package linpack: missing distfile linpack-20010510.tar.gz
* Package minpack: missing distfile minpack-20001130.tar.gz
* Package odepack: missing distfile odepack-20001130.tar.gz
* Package py-networkx: missing distfile networkx-1.10.tar.gz
* Package py-sympy: missing distfile sympy-0.7.6.1.tar.gz
* Package quadpack: missing distfile quadpack-20001130.tar.gz

Otherwise, existing SHA1 digests verified and found to be the same on the machine holding the existing distfiles (morden). All existing SHA1 digests retained for now as an audit trail.
2015-07-21 | Update py-pandas to 0.16.2. | bad | 3 | -19/+73
Closes PR pkg/49958 by matthewd.

Changes since 0.14.1 (for a full list see http://pandas.pydata.org/pandas-docs/stable/whatsnew.html):

v0.16.2

This is a minor bug-fix release from 0.16.1 and includes a large number of bug fixes along with some new features (pipe() method), enhancements, and performance improvements. We recommend that all users upgrade to this version.

Highlights include:
* A new pipe method.
* Documentation on how to use numba with pandas.

v0.16.1

This is a minor bug-fix release from 0.16.0 and includes a large number of bug fixes along with several new features, enhancements, and performance improvements. We recommend that all users upgrade to this version.

Highlights include:
* Support for a CategoricalIndex, a category-based index.
* New section on how to contribute to pandas.
* Revised “Merge, join, and concatenate” documentation, including graphical examples to make each operation easier to understand.
* New method sample for drawing random samples from Series, DataFrames and Panels.
* The default Index printing has changed to a more uniform format.
* BusinessHour datetime-offset is now supported.
* Further enhancement to the .str accessor to make string operations easier.

v0.16.0 (March 22, 2015)

This is a major release from 0.15.2 and includes a small number of API changes, several new features, enhancements, and performance improvements along with a large number of bug fixes. We recommend that all users upgrade to this version.

Highlights include:
* DataFrame.assign method.
* Series.to_coo/from_coo methods to interact with scipy.sparse.
* Backwards incompatible change to Timedelta to conform the .seconds attribute with datetime.timedelta.
* Changes to the .loc slicing API to conform with the behavior of .ix.
* Changes to the default for ordering in the Categorical constructor.
* Enhancement to the .str accessor to make string operations easier.
* The pandas.tools.rplot, pandas.sandbox.qtpandas and pandas.rpy modules are deprecated. We refer users to external packages like seaborn, pandas-qt and rpy2 for similar or equivalent functionality.

v0.15.0 (October 18, 2014)

This is a major release from 0.14.1 and includes a small number of API changes, several new features, enhancements, and performance improvements along with a large number of bug fixes. We recommend that all users upgrade to this version.

Warning: pandas >= 0.15.0 will no longer support compatibility with NumPy versions < 1.7.0. If you want to use the latest versions of pandas, please upgrade to NumPy >= 1.7.0 (GH7711).

Highlights include:
* The Categorical type was integrated as a first-class pandas type.
* New scalar type Timedelta, and a new index type TimedeltaIndex.
* New datetimelike properties accessor .dt for Series, see Datetimelike Properties.
* New DataFrame default display for df.info() to include memory usage, see Memory Usage.
* read_csv will now by default ignore blank lines when parsing.
* API change in using Indexes in set operations.
* Enhancements in the handling of timezones.
* A lot of improvements to the rolling and expanding moment functions.
* Internal refactoring of the Index class to no longer sub-class ndarray, see Internal Refactoring.
* Dropping support for PyTables less than version 3.0.0, and numexpr less than version 2.1 (GH7990).
* Split indexing documentation into Indexing and Selecting Data and MultiIndex / Advanced Indexing.
* Split out string methods documentation into Working with Text Data.
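A few of the 0.15.0 to 0.16.2 additions named above (DataFrame.assign, the pipe method, the Categorical type, and the Timedelta scalar) might be used like this. It is a sketch under the assumption of pandas 0.16.2 or newer; the helper add_tax, the column names, and the data are hypothetical and exist only for this example.

```python
import pandas as pd

df = pd.DataFrame({"price": [10.0, 20.0, 30.0], "qty": [1, 2, 3]})

# DataFrame.assign (0.16.0) returns a new frame with an extra column.
with_total = df.assign(total=lambda d: d["price"] * d["qty"])


def add_tax(frame, rate):
    """Hypothetical helper: add a column with tax applied to 'total'."""
    return frame.assign(total_with_tax=frame["total"] * (1 + rate))


# pipe (0.16.2) passes the frame as the first argument of the function.
result = with_total.pipe(add_tax, rate=0.5)

# Categorical as a first-class dtype and the Timedelta scalar (0.15.0).
grades = pd.Series(["b", "a", "c", "a"], dtype="category")
gap = pd.Timedelta("1 days 2 hours")

print(result)
print(list(grades.cat.categories), gap.days, gap.seconds)
```

Note that gap.seconds reports only the sub-day part of the duration, in line with datetime.timedelta, which is the behaviour the 0.16.0 entry refers to.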
2014-07-19 | Update math/py-pandas to 0.14.1. | bad | 3 | -42/+204
This is two major releases since 0.12.0. Changes include API changes, new features, enhancements, and performance improvements along with a large number of bug fixes. For the detailed list of changes see http://pandas.pydata.org/pandas-docs/stable/whatsnew.html
2014-01-16 | Convert to use versioned_dependencies.mk. | wiz | 1 | -2/+4
2013-12-10 | Update pandas to 0.12.0. | bad | 3 | -14/+75
This is a major release from 0.11.0 and includes several new features and enhancements along with a large number of bug fixes. Highlights include a consistent I/O API naming scheme, routines to read HTML, write multi-indexes to CSV files, read and write STATA data files, read and write JSON format files, Python 3 support for HDFStore, filtering of groupby expressions via filter, and a revamped replace routine that accepts regular expressions. For detailed changes see: http://pandas.pydata.org/pandas-docs/stable/whatsnew.html
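Two of the 0.12.0 highlights, group filtering via filter and replace with regular expressions, can be sketched as below. This assumes pandas 0.12.0 or newer; the frame contents and names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"team": ["a", "a", "b"], "score": [1, 2, 3]})

# Keep only the groups for which the predicate returns True
# (here: teams with more than one row).
big_teams = df.groupby("team").filter(lambda g: len(g) > 1)

# replace accepts a regular expression when regex=True.
cleaned = pd.Series(["foo1", "bar22"]).replace(r"\d+", "", regex=True)

print(big_teams)
print(cleaned.tolist())
```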
2013-05-16 | Update py-pandas to 0.11.0. | bad | 3 | -22/+62
Summary of changes since 0.10.1:

This is a major release from 0.10.1 and includes many new features and enhancements along with a large number of bug fixes. The methods of Selecting Data have had quite a number of additions, and Dtype support is now full-fledged. There are also a number of important API changes that long-time pandas users should pay close attention to.

* New precision indexing fields loc, iloc, at, and iat, to reduce occasional ambiguity in the hitherto catch-all ix method.
* Expanded support for NumPy data types in DataFrame.
* NumExpr integration to accelerate various operator evaluation.
* Improved DataFrame to CSV exporting performance.

For a full list refer to the "what's new" page.

Also fixes PLIST errors introduced in last update.
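The four indexing fields introduced in 0.11.0 differ in whether they take labels or integer positions, and whether they select ranges or single scalars. A minimal sketch, assuming pandas 0.11.0 or newer and an illustrative one-column frame:

```python
import pandas as pd

df = pd.DataFrame({"x": [10, 20, 30]}, index=["a", "b", "c"])

by_label = df.loc["b", "x"]         # label-based selection
by_position = df.iloc[0, 0]         # integer-position selection
scalar_by_label = df.at["c", "x"]   # fast scalar access by label
scalar_by_position = df.iat[1, 0]   # fast scalar access by position

print(by_label, by_position, scalar_by_label, scalar_by_position)
```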
2013-02-16 | Update pandas to 0.10.1. | bad | 2 | -6/+6
Release date: 2013-01-22

New features:
* Add data interface to World Bank WDI pandas.io.wb (GH2592)

API Changes:
* Restored inplace=True behavior returning self (same object) with deprecation warning until 0.11 (GH1893)
* HDFStore
  - refactored HDFStore to deal with non-table stores as objects, will allow future enhancements
  - removed keyword compression from put (replaced by keyword complib to be consistent across library)
  - warn PerformanceWarning if you are attempting to store types that will be pickled by PyTables

Improvements to existing features:
* HDFStore
  - enables storing of multi-index dataframes (closes GH1277)
  - support data column indexing and selection, via data_columns keyword in append
  - support write chunking to reduce memory footprint, via chunksize keyword to append
  - support automagic indexing via index keyword to append
  - support expectedrows keyword in append to inform PyTables about the expected tablesize
  - support start and stop keywords in select to limit the row selection space
  - added get_store context manager to automatically import with pandas
  - added column filtering via columns keyword in select
  - added methods append_to_multiple/select_as_multiple/select_as_coordinates to do multiple-table append/selection
  - added support for datetime64 in columns
  - added method unique to select the unique values in an indexable or data column
  - added method copy to copy an existing store (and possibly upgrade)
  - show the shape of the data on disk for non-table stores when printing the store
  - added ability to read PyTables flavor tables (allows compatibility to other HDF5 systems)
* Add logx option to DataFrame/Series.plot (GH2327, GH2565)
* Support reading gzipped data from file-like object
* pivot_table aggfunc can be anything used in GroupBy.aggregate (GH2643)
* Implement DataFrame merges in case where set cardinalities might overflow 64-bit integer (GH2690)
* Raise exception in C file parser if integer dtype specified and have NA values. (GH2631)
* Attempt to parse ISO8601 format dates when parse_dates=True in read_csv for major performance boost in such cases (GH2698)
* Add methods neg and inv to Series
* Implement kind option in ExcelFile to indicate whether it's an XLS or XLSX file (GH2613)

Bug fixes:
* Fix read_csv/read_table multithreading issues (GH2608)
* HDFStore
  - correctly handle nan elements in string columns; serialize via the nan_rep keyword to append
  - raise correctly on non-implemented column types (unicode/date)
  - handle correctly Term passed types (e.g. index<1000, when index is Int64), (closes GH512)
  - handle Timestamp correctly in data_columns (closes GH2637)
  - contains correctly matches on non-natural names
  - correctly store float32 dtypes in tables (if not other float types in the same table)
* Fix DataFrame.info bug with UTF8-encoded columns. (GH2576)
* Fix DatetimeIndex handling of FixedOffset tz (GH2604)
* More robust detection of being in IPython session for wide DataFrame console formatting (GH2585)
* Fix platform issues with file:/// in unit test (GH2564)
* Fix bug and possible segfault when grouping by hierarchical level that contains NA values (GH2616)
* Ensure that MultiIndex tuples can be constructed with NAs (GH2616)
* Fix int64 overflow issue when unstacking MultiIndex with many levels (GH2616)
* Exclude non-numeric data from DataFrame.quantile by default (GH2625)
* Fix a Cython C int64 boxing issue causing read_csv to return incorrect results (GH2599)
* Fix groupby summing performance issue on boolean data (GH2692)
* Don't bork Series containing datetime64 values with to_datetime (GH2699)
* Fix DataFrame.from_records corner case when passed columns, index column, but empty record list (GH2633)
* Fix C parser-tokenizer bug with trailing fields. (GH2668)
* Don't exclude non-numeric data from GroupBy.max/min (GH2700)
* Don't lose time zone when calling DatetimeIndex.drop (GH2621)
* Fix setitem on a Series with a boolean key and a non-scalar as value (GH2686)
* Box datetime64 values in Series.apply/map (GH2627, GH2689)
* Upconvert datetime + datetime64 values when concatenating frames (GH2624)
* Raise a more helpful error message in merge operations when one DataFrame has duplicate columns (GH2649)
* Fix partial date parsing issue occurring only when code is run at EOM (GH2618)
* Prevent MemoryError when using counting sort in sortlevel with high-cardinality MultiIndex objects (GH2684)
* Fix Period resampling bug when all values fall into a single bin (GH2070)
* Fix buggy interaction with usecols argument in read_csv when there is an implicit first index column (GH2654)
2013-01-07 | Update pandas to 0.10.0. | bad | 4 | -32/+53
pkgsrc change: depend on math/py-pytables.

Changes since 0.9.1:

* Delimited file parsing engine rewritten to use a fraction of memory while being 40%+ faster.
  - Much-improved Unicode handling via the encoding option.
  - Column filtering (usecols)
  - Dtype specification (dtype argument)
  - Ability to specify strings to be recognized as True/False
  - Ability to yield NumPy record arrays (as_recarray)
  - High performance delim_whitespace option
  - Decimal format (e.g. European format) specification
  - Easier CSV dialect options: escapechar, lineterminator, quotechar, etc.
  - More robust handling of many exceptional kinds of files observed in the wild
* API changes
  - Deprecated DataFrame BINOP TimeSeries special case behavior
  - Altered resample default behavior
  - Infinity and negative infinity are no longer treated as NA by isnull and notnull.
  - Methods with the inplace option now all return None instead of the calling object.
  - pandas.merge no longer sorts the group keys (sort=False) by default.
  - The default column names for a file with no header have been changed.
  - Values like 'Yes' and 'No' are not interpreted as boolean by default.
  - The file parsers will not recognize non-string values arising from a converter function as NA.
  - Calling fillna on Series or DataFrame with no arguments is no longer valid code.
  - Series.apply will now operate on a returned value from the applied function.
  - New API functions for working with pandas options.
* New features
  - Wide DataFrame Printing.
  - Updated PyTables Support.
* Enhancements
  - added ability to use hierarchical keys.
  - added mixed-dtype support!
  - performance improvements on table writing.
  - support for arbitrarily indexed dimensions.
  - SparseSeries now has a density property.
* Bug fixes
  - added Term method of specifying where conditions.
  - del store['df'] now calls store.remove('df') for store deletion.
  - deleting of consecutive rows is much faster than before.
  - min_itemsize parameter can be specified in table creation to force a minimum size for indexing columns.
  - indexing support via create_table_index (requires PyTables >= 2.3)
  - appending on a store would fail if the table was not first created via put.
  - fixed issue with missing attributes after loading a pickled dataframe.
  - minor change to select and remove: require a table ONLY if where is also provided.
* Compatibility
  - 0.10 of HDFStore is backwards compatible for reading tables created in a prior version of pandas; however, query terms using the prior (undocumented) methodology are unsupported.
* N Dimensional Panels (Experimental)
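Several of the parser options listed for 0.10.0 (column filtering with usecols, dtype specification, the decimal format, and explicit True/False strings) can be combined in a single read_csv call. The sketch below assumes pandas 0.10.0 or newer; the semicolon-separated sample text and column names are invented for the example.

```python
import io

import pandas as pd

# Illustrative European-style data: ';' separator and ',' as decimal mark.
raw = "id;amount;flag;note\n1;3,5;Yes;x\n2;4,25;No;y\n"

df = pd.read_csv(
    io.StringIO(raw),
    sep=";",
    decimal=",",                        # decimal format specification
    usecols=["id", "amount", "flag"],   # column filtering: drop 'note'
    dtype={"id": "int64"},              # dtype specification
    true_values=["Yes"],                # 'Yes'/'No' are not booleans by default,
    false_values=["No"],                # so they are listed explicitly
)

print(df)
print(df.dtypes)
```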
2012-11-22 | Initial import of pandas 0.9.1. | bad | 5 | -0/+503
pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.