Discussion:
Bug#972033: python3.9, dask, pandas 1.1
Rebecca N. Palmer
2020-10-14 08:00:01 UTC
Control: severity 969648 serious
Control: tags 969650 pending
Control: tags 972033 pending

Python 3.9-related breakage has been declared RC, so if nobody objects,
I intend to upload pandas 1.1 to unstable (possibly tonight, though it
probably won't build until numpy and matplotlib have been binNMUed for
Python 3.9) despite the dask breakage.

As noted in #969648, dask can likely be fixed, but this has not been
tested in Debian. (dask's own tests also currently fail, for lack of a
Python 3.9 numpy.)
Debian Bug Tracking System
2020-10-14 08:00:02 UTC
Post by Rebecca N. Palmer
severity 969648 serious
Bug #969648 [python3-dask] dask: autopkgtest fail with pandas 1.1 - datetime issues
Ignoring request to change severity of Bug 969648 to the same value.
Post by Rebecca N. Palmer
tags 969650 pending
Bug #969650 [python3-pandas] transition: pandas 1.0 -> 1.1
Ignoring request to alter tags of bug #969650 to the same tags previously set
Post by Rebecca N. Palmer
tags 972033 pending
Bug #972033 [src:pandas] pandas ftbfs with python3.9 as supported python version
Bug #972015 [src:pandas] pandas ftbfs with python3 as supported python version
Ignoring request to alter tags of bug #972033 to the same tags previously set
Ignoring request to alter tags of bug #972015 to the same tags previously set
--
969648: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=969648
969650: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=969650
972015: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=972015
972033: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=972033
Debian Bug Tracking System
Contact ***@bugs.debian.org with problems
Debian Bug Tracking System
2020-10-14 08:00:02 UTC
Post by Rebecca N. Palmer
severity 969648 serious
Bug #969648 [python3-dask] dask: autopkgtest fail with pandas 1.1 - datetime issues
Severity set to 'serious' from 'normal'
Post by Rebecca N. Palmer
tags 969650 pending
Bug #969650 [python3-pandas] transition: pandas 1.0 -> 1.1
Added tag(s) pending.
Post by Rebecca N. Palmer
tags 972033 pending
Bug #972033 [src:pandas] pandas ftbfs with python3.9 as supported python version
Bug #972015 [src:pandas] pandas ftbfs with python3 as supported python version
Added tag(s) pending.
Added tag(s) pending.
--
969648: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=969648
969650: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=969650
972015: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=972015
972033: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=972033
Debian Bug Tracking System
Contact ***@bugs.debian.org with problems
Debian Bug Tracking System
2020-10-16 19:00:02 UTC
tags -1 - pending
Bug #972033 [src:pandas] pandas ftbfs with python3.9 as supported python version
Bug #972015 [src:pandas] pandas ftbfs with python3 as supported python version
Removed tag(s) pending.
Removed tag(s) pending.
--
972015: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=972015
972033: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=972033
Debian Bug Tracking System
Contact ***@bugs.debian.org with problems
Rebecca N. Palmer
2020-10-16 19:00:01 UTC
Control: tags -1 - pending

The package currently in Salsa doesn't work. test_statsmodels probably
fails because of a circular dependency and should be ignored for now;
the TestHDFStore failures are under investigation.

=================================== FAILURES ===================================
________________ TestHDFStore.test_append_frame_column_oriented ________________

self = <pandas.tests.io.pytables.test_store.TestHDFStore object at 0x7f753e5ba2e0>
setup_path = 'tmp.__PxTZkyR5w5__.h5'

    @pytest.mark.xfail(condition=is_crashing_arch,reason="https://bugs.debian.org/790925",strict=False,run=False)
    def test_append_frame_column_oriented(self, setup_path):
        with ensure_clean_store(setup_path) as store:

            # column oriented
            df = tm.makeTimeDataFrame()
            df.index = df.index._with_freq(None)  # freq doesnt round-trip

            _maybe_remove(store, "df1")
            store.append("df1", df.iloc[:, :2], axes=["columns"])
            store.append("df1", df.iloc[:, 2:])
            tm.assert_frame_equal(store["df1"], df)

            result = store.select("df1", "columns=A")
            expected = df.reindex(columns=["A"])
            tm.assert_frame_equal(expected, result)

            # selection on the non-indexable
            result = store.select("df1", ("columns=A", "index=df.index[0:4]"))
            expected = df.reindex(columns=["A"], index=df.index[0:4])
            tm.assert_frame_equal(expected, result)

            # this isn't supported
            store.select("df1", "columns=A and index>df.index[4]")

pandas/tests/io/pytables/test_store.py:1336:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pandas/io/pytables.py:876: in select
return it.get_result()
pandas/io/pytables.py:1930: in get_result
results = self.func(self.start, self.stop, where)
pandas/io/pytables.py:860: in func
return s.read(start=_start, stop=_stop, where=_where, columns=columns)
pandas/io/pytables.py:4483: in read
result = self._read_axes(where=where, start=start, stop=stop)
pandas/io/pytables.py:3682: in _read_axes
selection = Selection(self, where=where, start=start, stop=stop)
pandas/io/pytables.py:5167: in __init__
self.terms = self.generate(where)
pandas/io/pytables.py:5180: in generate
return PyTablesExpr(where, queryables=q, encoding=self.table.encoding)
pandas/core/computation/pytables.py:573: in __init__
self.terms = self.parse()
pandas/core/computation/expr.py:806: in parse
return self._visitor.visit(self.expr)
pandas/core/computation/expr.py:398: in visit
return visitor(node, **kwargs)
pandas/core/computation/expr.py:404: in visit_Module
return self.visit(expr, **kwargs)
pandas/core/computation/expr.py:398: in visit
return visitor(node, **kwargs)
pandas/core/computation/expr.py:407: in visit_Expr
return self.visit(node.value, **kwargs)
pandas/core/computation/expr.py:398: in visit
return visitor(node, **kwargs)
pandas/core/computation/expr.py:726: in visit_BoolOp
return reduce(visitor, operands)
pandas/core/computation/expr.py:720: in visitor
rhs = self._try_visit_binop(y)
pandas/core/computation/expr.py:715: in _try_visit_binop
return self.visit(bop)
pandas/core/computation/expr.py:398: in visit
return visitor(node, **kwargs)
pandas/core/computation/expr.py:699: in visit_Compare
return self.visit(binop)
pandas/core/computation/expr.py:398: in visit
return visitor(node, **kwargs)
pandas/core/computation/expr.py:520: in visit_BinOp
op, op_class, left, right = self._maybe_transform_eq_ne(node)
pandas/core/computation/expr.py:441: in _maybe_transform_eq_ne
right = self.visit(node.right, side="right")
pandas/core/computation/expr.py:398: in visit
return visitor(node, **kwargs)
pandas/core/computation/pytables.py:430: in visit_Subscript
return self.const_type(value[slobj], self.env)
pandas/core/indexes/extension.py:215: in __getitem__
result = self._data[key]
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <DatetimeArray>
['2000-01-03 00:00:00', '2000-01-04 00:00:00', '2000-01-05 00:00:00',
 '2000-01-06 00:00:00', '2000-01...2-08 00:00:00',
 '2000-02-09 00:00:00', '2000-02-10 00:00:00', '2000-02-11 00:00:00']
Length: 30, dtype: datetime64[ns]
key = 4

    def __getitem__(self, key):
        """
        This getitem defers to the underlying array, which by-definition can
        only handle list-likes, slices, and integer scalars
        """

        if lib.is_integer(key):
            # fast-path
            result = self._data[key]
            if self.ndim == 1:
                return self._box_func(result)
            return self._simple_new(result, dtype=self.dtype)

        if com.is_bool_indexer(key):
            # first convert to boolean, because check_array_indexer doesn't
            # allow object dtype
            if is_object_dtype(key):
                key = np.asarray(key, dtype=bool)

            key = check_array_indexer(self, key)
            key = lib.maybe_booleans_to_slice(key.view(np.uint8))
        elif isinstance(key, list) and len(key) == 1 and isinstance(key[0], slice):
            # see https://github.com/pandas-dev/pandas/issues/31299, need to allow
            # this for now (would otherwise raise in check_array_indexer)
            pass
        else:
            key = check_array_indexer(self, key)

        freq = self._get_getitem_freq(key)
        result = self._data[key]
E       IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

pandas/core/arrays/datetimelike.py:559: IndexError
__________________ TestHDFStore.test_append_with_data_columns __________________

self = <pandas.tests.io.pytables.test_store.TestHDFStore object at 0x7f75179bde20>
setup_path = 'tmp.__zAu5cCxZKU__.h5'

    def test_append_with_data_columns(self, setup_path):

        with ensure_clean_store(setup_path) as store:
            df = tm.makeTimeDataFrame()
            df.iloc[0, df.columns.get_loc("B")] = 1.0
            _maybe_remove(store, "df")
            store.append("df", df[:2], data_columns=["B"])
            store.append("df", df[2:])
            tm.assert_frame_equal(store["df"], df)

            # check that we have indices created
            assert store._handle.root.df.table.cols.index.is_indexed is True
            assert store._handle.root.df.table.cols.B.is_indexed is True

            # data column searching
            result = store.select("df", "B>0")
            expected = df[df.B > 0]
            tm.assert_frame_equal(result, expected)

            # data column searching (with an indexable and a data_columns)
            result = store.select("df", "B>0 and index>df.index[3]")
pandas/tests/io/pytables/test_store.py:1560:
[traceback and DatetimeArray.__getitem__ detail identical to the first failure above; key = 3]
pandas/core/arrays/datetimelike.py:559: IndexError
_______________________ TestHDFStore.test_invalid_terms ________________________

self = <pandas.tests.io.pytables.test_store.TestHDFStore object at 0x7f753e359430>
setup_path = 'tmp.__mFQj4Z0toA__.h5'

    def test_invalid_terms(self, setup_path):

        with ensure_clean_store(setup_path) as store:

            with catch_warnings(record=True):

                df = tm.makeTimeDataFrame()
                df["string"] = "foo"
                df.loc[df.index[0:4], "string"] = "bar"

                store.put("df", df, format="table")

                # some invalid terms
                with pytest.raises(TypeError):
                    Term()

                # more invalid
                store.select("df", "df.index[3]")
pandas/tests/io/pytables/test_store.py:2326:
[traceback and DatetimeArray.__getitem__ detail identical to the first failure above; key = 3]
pandas/core/arrays/datetimelike.py:559: IndexError
____________________ TestHDFStore.test_frame_select_complex ____________________

self = <pandas.tests.io.pytables.test_store.TestHDFStore object at 0x7f75242194c0>
setup_path = 'tmp.__utBSsB3aA1__.h5'

    def test_frame_select_complex(self, setup_path):
        # select via complex criteria

        df = tm.makeTimeDataFrame()
        df["string"] = "foo"
        df.loc[df.index[0:4], "string"] = "bar"

        with ensure_clean_store(setup_path) as store:
            store.put("df", df, format="table", data_columns=["string"])

            # empty
            result = store.select("df", 'index>df.index[3] & string="bar"')
pandas/tests/io/pytables/test_store.py:3327:
[traceback and DatetimeArray.__getitem__ detail identical to the first failure above; key = 3]
pandas/core/arrays/datetimelike.py:559: IndexError
_____________________ TestHDFStore.test_select_as_multiple _____________________

self = <pandas.tests.io.pytables.test_store.TestHDFStore object at 0x7f7517854160>
setup_path = 'tmp.__W7kEAOquWg__.h5'

    def test_select_as_multiple(self, setup_path):

        df1 = tm.makeTimeDataFrame()
        df2 = tm.makeTimeDataFrame().rename(columns="{}_2".format)
        df2["foo"] = "bar"

        with ensure_clean_store(setup_path) as store:

            # no tables stored
            with pytest.raises(Exception):
                store.select_as_multiple(None, where=["A>0", "B>0"], selector="df1")

            store.append("df1", df1, data_columns=["A", "B"])
            store.append("df2", df2)

            # exceptions
            with pytest.raises(Exception):
                store.select_as_multiple(None, where=["A>0", "B>0"], selector="df1")

            with pytest.raises(Exception):
                store.select_as_multiple([None], where=["A>0", "B>0"], selector="df1")

            msg = "'No object named df3 in the file'"
            with pytest.raises(KeyError, match=msg):
                store.select_as_multiple(
                    ["df1", "df3"], where=["A>0", "B>0"], selector="df1"
                )

            with pytest.raises(KeyError, match=msg):
                store.select_as_multiple(["df3"], where=["A>0", "B>0"], selector="df1")

            with pytest.raises(KeyError, match="'No object named df4 in the file'"):
                store.select_as_multiple(
                    ["df1", "df2"], where=["A>0", "B>0"], selector="df4"
                )

            # default select
            result = store.select("df1", ["A>0", "B>0"])
            expected = store.select_as_multiple(
                ["df1"], where=["A>0", "B>0"], selector="df1"
            )
            tm.assert_frame_equal(result, expected)
            expected = store.select_as_multiple(
                "df1", where=["A>0", "B>0"], selector="df1"
            )
            tm.assert_frame_equal(result, expected)

            # multiple
            result = store.select_as_multiple(
                ["df1", "df2"], where=["A>0", "B>0"], selector="df1"
            )
            expected = concat([df1, df2], axis=1)
            expected = expected[(expected.A > 0) & (expected.B > 0)]
            tm.assert_frame_equal(result, expected)

            # multiple (diff selector)
            result = store.select_as_multiple(
                ["df1", "df2"], where="index>df2.index[4]", selector="df2"
            )

pandas/tests/io/pytables/test_store.py:3819:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pandas/io/pytables.py:1070: in select_as_multiple
return it.get_result(coordinates=True)
pandas/io/pytables.py:1923: in get_result
where = self.s.read_coordinates(
pandas/io/pytables.py:4130: in read_coordinates
selection = Selection(self, where=where, start=start, stop=stop)
pandas/io/pytables.py:5167: in __init__
self.terms = self.generate(where)
pandas/io/pytables.py:5180: in generate
return PyTablesExpr(where, queryables=q, encoding=self.table.encoding)
pandas/core/computation/pytables.py:573: in __init__
self.terms = self.parse()
pandas/core/computation/expr.py:806: in parse
return self._visitor.visit(self.expr)
pandas/core/computation/expr.py:398: in visit
return visitor(node, **kwargs)
pandas/core/computation/expr.py:404: in visit_Module
return self.visit(expr, **kwargs)
pandas/core/computation/expr.py:398: in visit
return visitor(node, **kwargs)
pandas/core/computation/expr.py:407: in visit_Expr
return self.visit(node.value, **kwargs)
pandas/core/computation/expr.py:398: in visit
return visitor(node, **kwargs)
pandas/core/computation/expr.py:699: in visit_Compare
return self.visit(binop)
pandas/core/computation/expr.py:398: in visit
return visitor(node, **kwargs)
pandas/core/computation/expr.py:520: in visit_BinOp
op, op_class, left, right = self._maybe_transform_eq_ne(node)
pandas/core/computation/expr.py:441: in _maybe_transform_eq_ne
right = self.visit(node.right, side="right")
pandas/core/computation/expr.py:398: in visit
return visitor(node, **kwargs)
pandas/core/computation/pytables.py:430: in visit_Subscript
return self.const_type(value[slobj], self.env)
pandas/core/indexes/extension.py:215: in __getitem__
result = self._data[key]
[DatetimeArray.__getitem__ detail identical to the first failure above; key = 4]
E       IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

pandas/core/arrays/datetimelike.py:559: IndexError
=============================== warnings summary ===============================
[...]
=================================== FAILURES ===================================
_______________________________ test_statsmodels _______________________________

    @tm.network
    # Cython import warning
    @pytest.mark.filterwarnings("ignore:pandas.util.testing is deprecated")
    @pytest.mark.filterwarnings("ignore:can't:ImportWarning")
    @pytest.mark.filterwarnings(
        # patsy needs to update their imports
        "ignore:Using or importing the ABCs from 'collections:DeprecationWarning"
    )
    def test_statsmodels():

        statsmodels = import_module("statsmodels")  # noqa
        import statsmodels.api as sm

pandas/tests/test_downstream.py:86:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

/usr/lib/python3/dist-packages/statsmodels/api.py:11: in <module>
    from .regression.recursive_ls import RecursiveLS
/usr/lib/python3/dist-packages/statsmodels/regression/recursive_ls.py:14: in <module>
    from statsmodels.tsa.statespace.mlemodel import (
/usr/lib/python3/dist-packages/statsmodels/tsa/statespace/mlemodel.py:32: in <module>
    from .simulation_smoother import SimulationSmoother
/usr/lib/python3/dist-packages/statsmodels/tsa/statespace/simulation_smoother.py:9: in <module>
    from .kalman_smoother import KalmanSmoother
/usr/lib/python3/dist-packages/statsmodels/tsa/statespace/kalman_smoother.py:11: in <module>
    from statsmodels.tsa.statespace.representation import OptionWrapper
/usr/lib/python3/dist-packages/statsmodels/tsa/statespace/representation.py:9: in <module>
    from .tools import (
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

"""
Statespace Tools

Author: Chad Fulton
License: Simplified-BSD
"""
import numpy as np
from scipy.linalg import solve_sylvester
import pandas as pd

from statsmodels.compat.pandas import Appender
from statsmodels.tools.data import _is_using_pandas
from scipy.linalg.blas import find_best_blas_type
from . import (_initialization, _representation, _kalman_filter,
_kalman_smoother, _simulation_smoother,
_cfa_simulation_smoother, _tools)
E ImportError: cannot import name '_initialization' from
'statsmodels.tsa.statespace'
(/usr/lib/python3/dist-packages/statsmodels/tsa/statespace/__init__.py)

/usr/lib/python3/dist-packages/statsmodels/tsa/statespace/tools.py:14:
ImportError
Rebecca N. Palmer
2020-10-17 10:20:01 UTC
This bug is not specific to DatetimeIndex: the immediate cause is that
the subscript is passed to numpy as a
pandas.core.computation.pytables.Constant instead of an int:

$ python3.9
Python 3.9.0+ (default, Oct 16 2020, 17:57:59)
[GCC 10.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd; from pandas.io.pytables import HDFStore; s1 = HDFStore("tmp1.h5", "w"); df = pd.DataFrame([[1,2,3],[4,5,6]], columns=['A','B','C']); s1.append("d1", df, data_columns=["B"]); df2 = s1.select("d1", "index>df.index[0]"); print(type(df2.index[0]))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3/dist-packages/pandas/io/pytables.py", line 876, in select
    return it.get_result()
  [... same select / expression-visitor chain as in the test failures above ...]
  File "/usr/lib/python3/dist-packages/pandas/core/computation/pytables.py", line 430, in visit_Subscript
    return self.const_type(value[slobj], self.env)
  File "/usr/lib/python3/dist-packages/pandas/core/indexes/range.py", line 720, in __getitem__
    return super().__getitem__(key)
  File "/usr/lib/python3/dist-packages/pandas/core/indexes/base.py", line 4111, in __getitem__
    result = getitem(key)
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
>>> import pdb; pdb.pm()
> /usr/lib/python3/dist-packages/pandas/core/indexes/base.py(4111)__getitem__()
-> result = getitem(key)
(Pdb) p key
0
(Pdb) p type(key)
<class 'pandas.core.computation.pytables.Constant'>
(Pdb)
Rebecca N. Palmer
2020-10-17 11:40:01 UTC
Control: tags -1 patch

The underlying cause (and the reason this is 3.9-specific) appears to be
that Python 3.9 mostly removed ast.Index: a subscript's slice is now the
bare value rather than an Index node wrapping it.
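That change can be seen directly with the standard-library ast module (an
illustration added here, not part of the original mail):

```python
import ast
import sys

# Parse a subscript expression and look at the slice node the parser builds.
tree = ast.parse("value[0]", mode="eval")
slice_node = tree.body.slice

# Python <= 3.8 wraps the slice in ast.Index; Python >= 3.9 stores the
# bare value (here an ast.Constant) directly.
expected = "Constant" if sys.version_info >= (3, 9) else "Index"
assert type(slice_node).__name__ == expected
print(type(slice_node).__name__)
```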

This appears to fix it, though I'm not yet sure if it's a good idea:

--- a/pandas/core/computation/pytables.py
+++ b/pandas/core/computation/pytables.py
@@ -425,6 +425,10 @@ class PyTablesExprVisitor(BaseExprVisito
value = value.value
except AttributeError:
pass
+ try:
+ slobj = slobj.value
+ except AttributeError:
+ pass

try:
return self.const_type(value[slobj], self.env)
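In isolation, the guarded unwrap this patch adds is just the following
idiom (a sketch using a stand-in class, not the real pandas/ast node
types):

```python
# Stand-in for a wrapper node that carries a .value attribute (like the
# old ast.Index); hypothetical, for illustration only.
class Wrapped:
    def __init__(self, value):
        self.value = value

def unwrap(obj):
    # Take .value if the object has one, otherwise assume it is already
    # the bare value -- the same try/except AttributeError shape as the
    # patch above, so it works on both the old and the new node layout.
    try:
        return obj.value
    except AttributeError:
        return obj

print(unwrap(Wrapped(7)))  # 7 (old, wrapped shape)
print(unwrap(7))           # 7 (new, bare shape)
```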
Debian Bug Tracking System
2020-10-17 11:40:01 UTC
Post by Rebecca N. Palmer
tags -1 patch
Bug #972033 [src:pandas] pandas ftbfs with python3.9 as supported python version
Bug #972015 [src:pandas] pandas ftbfs with python3 as supported python version
Ignoring request to alter tags of bug #972033 to the same tags previously set
Ignoring request to alter tags of bug #972015 to the same tags previously set
--
972015: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=972015
972033: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=972033
Debian Bug Tracking System
Contact ***@bugs.debian.org with problems
Debian Bug Tracking System
2020-10-17 11:40:01 UTC
Post by Rebecca N. Palmer
tags -1 patch
Bug #972015 [src:pandas] pandas ftbfs with python3 as supported python version
Bug #972033 [src:pandas] pandas ftbfs with python3.9 as supported python version
Added tag(s) patch.
Added tag(s) patch.
--
972015: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=972015
972033: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=972033
Debian Bug Tracking System
Contact ***@bugs.debian.org with problems
Debian Bug Tracking System
2020-10-18 13:00:02 UTC
tags -1 pending
Bug #972015 [src:pandas] pandas ftbfs with python 3.9 as supported python version
Bug #972033 [src:pandas] pandas ftbfs with python 3.9 as supported python version
Added tag(s) pending.
Added tag(s) pending.
forwarded -1 https://github.com/pandas-dev/pandas/issues/37217
Bug #972015 [src:pandas] pandas ftbfs with python 3.9 as supported python version
Bug #972033 [src:pandas] pandas ftbfs with python 3.9 as supported python version
Changed Bug forwarded-to-address to 'https://github.com/pandas-dev/pandas/issues/37217' from 'https://github.com/pandas-dev/pandas/issues/36296'.
Changed Bug forwarded-to-address to 'https://github.com/pandas-dev/pandas/issues/37217' from 'https://github.com/pandas-dev/pandas/issues/36296'.
--
972015: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=972015
972033: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=972033
Debian Bug Tracking System
Contact ***@bugs.debian.org with problems
Debian Bug Tracking System
2020-10-18 13:00:02 UTC
tags -1 pending
Bug #972033 [src:pandas] pandas ftbfs with python 3.9 as supported python version
Bug #972015 [src:pandas] pandas ftbfs with python 3.9 as supported python version
Ignoring request to alter tags of bug #972033 to the same tags previously set
Ignoring request to alter tags of bug #972015 to the same tags previously set
forwarded -1 https://github.com/pandas-dev/pandas/issues/37217
Bug #972033 [src:pandas] pandas ftbfs with python 3.9 as supported python version
Bug #972015 [src:pandas] pandas ftbfs with python 3.9 as supported python version
Ignoring request to change the forwarded-to-address of bug#972033 to the same value
Ignoring request to change the forwarded-to-address of bug#972015 to the same value
--
972015: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=972015
972033: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=972033
Debian Bug Tracking System
Contact ***@bugs.debian.org with problems
Debian Bug Tracking System
2020-10-18 17:10:01 UTC
Your message dated Sun, 18 Oct 2020 17:05:41 +0000
with message-id <E1kUC7p-0007qF-***@fasolo.debian.org>
and subject line Bug#972015: fixed in pandas 1.1.3+dfsg-1
has caused the Debian Bug report #972015,
regarding pandas ftbfs with python 3.9 as supported python version
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact ***@bugs.debian.org
immediately.)
--
972015: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=972015
Debian Bug Tracking System
Contact ***@bugs.debian.org with problems
Debian Bug Tracking System
2020-10-18 17:10:02 UTC
Your message dated Sun, 18 Oct 2020 17:05:41 +0000
with message-id <E1kUC7p-0007qF-***@fasolo.debian.org>
and subject line Bug#972015: fixed in pandas 1.1.3+dfsg-1
has caused the Debian Bug report #972015,
regarding pandas ftbfs with python 3.9 as supported python version
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact ***@bugs.debian.org
immediately.)
--
972015: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=972015
Debian Bug Tracking System
Contact ***@bugs.debian.org with problems
Rebecca N. Palmer
2020-10-19 07:50:02 UTC
The upstream patch doesn't even apply as-is; this version does, but I
don't have time right now to actually test it.

There's also a circular dependency problem, as dask indirectly
build-depends on itself and my new pandas makes it uninstallable.

Description: pandas 1.1 compatibility

Origin: part of upstream f212b76fefeb93298205d7d224cbc1f7ed387ce9
Author: Tom Augspurger, Rebecca Palmer

diff --git a/dask/dataframe/core.py b/dask/dataframe/core.py
index 4a5c6d1f..cedd46fc 100644
--- a/dask/dataframe/core.py
+++ b/dask/dataframe/core.py
@@ -2487,7 +2487,7 @@ Dask Name: {name}, {task} tasks"""
else:
is_anchored = offset.isAnchored()

- include_right = is_anchored or not hasattr(offset, "_inc")
+ include_right = is_anchored or not hasattr(offset, "delta")

if end == self.npartitions - 1:
divs = self.divisions
@@ -4106,7 +4106,7 @@ class DataFrame(_Frame):
left_index=on is None,
right_index=True,
left_on=on,
- suffixes=[lsuffix, rsuffix],
+ suffixes=(lsuffix, rsuffix),
npartitions=npartitions,
shuffle=shuffle,
)
diff --git a/dask/dataframe/tests/test_dataframe.py b/dask/dataframe/tests/test_dataframe.py
index 64c15000..5e4f2bef 100644
--- a/dask/dataframe/tests/test_dataframe.py
+++ b/dask/dataframe/tests/test_dataframe.py
@@ -37,6 +37,9 @@ dsk = {
meta = make_meta({"a": "i8", "b": "i8"}, index=pd.Index([], "i8"))
d = dd.DataFrame(dsk, "x", meta, [0, 5, 9, 9])
full = d.compute()
+CHECK_FREQ = {}
+if dd._compat.PANDAS_GT_110:
+ CHECK_FREQ["check_freq"] = False


def test_dataframe_doc():
@@ -222,7 +225,18 @@ def test_index_names():
assert ddf.index.compute().name == "x"


-@pytest.mark.parametrize("npartitions", [1, pytest.param(2, marks=pytest.mark.xfail)])
+@pytest.mark.parametrize(
+ "npartitions",
+ [
+ 1,
+ pytest.param(
+ 2,
+ marks=pytest.mark.xfail(
+ not dd._compat.PANDAS_GT_110, reason="Fixed upstream."
+ ),
+ ),
+ ],
+)
def test_timezone_freq(npartitions):
s_naive = pd.Series(pd.date_range("20130101", periods=10))
    s_aware = pd.Series(pd.date_range("20130101", periods=10, tz="US/Eastern"))
@@ -385,12 +399,48 @@ def test_describe_numeric(method, test_values):
(None, None, None, ["c", "d", "g"]), # numeric + bool
    (None, None, None, ["c", "d", "f", "g"]),  # numeric + bool + timedelta
(None, None, None, ["f", "g"]), # bool + timedelta
- ("all", None, None, None),
- (["number"], None, [0.25, 0.5], None),
- ([np.timedelta64], None, None, None),
- (["number", "object"], None, [0.25, 0.75], None),
- (None, ["number", "object"], None, None),
- (["object", "datetime", "bool"], None, None, None),
+        pytest.param(
+            "all",
+            None,
+            None,
+            None,
+            marks=pytest.mark.xfail(PANDAS_GT_110, reason="upstream changes"),
+        ),
+        pytest.param(
+            ["number"],
+            None,
+            [0.25, 0.5],
+            None,
+            marks=pytest.mark.xfail(PANDAS_GT_110, reason="upstream changes"),
+        ),
+        pytest.param(
+            [np.timedelta64],
+            None,
+            None,
+            None,
+            marks=pytest.mark.xfail(PANDAS_GT_110, reason="upstream changes"),
+        ),
+        pytest.param(
+            ["number", "object"],
+            None,
+            [0.25, 0.75],
+            None,
+            marks=pytest.mark.xfail(PANDAS_GT_110, reason="upstream changes"),
+        ),
+        pytest.param(
+            None,
+            ["number", "object"],
+            None,
+            None,
+            marks=pytest.mark.xfail(PANDAS_GT_110, reason="upstream changes"),
+        ),
+        pytest.param(
+            ["object", "datetime", "bool"],
+            None,
+            None,
+            None,
+            marks=pytest.mark.xfail(PANDAS_GT_110, reason="upstream changes"),
+        ),
],
)
def test_describe(include, exclude, percentiles, subset):
@@ -2522,15 +2572,17 @@ def test_to_timestamp():
index = pd.period_range(freq="A", start="1/1/2001", end="12/1/2004")
    df = pd.DataFrame({"x": [1, 2, 3, 4], "y": [10, 20, 30, 40]}, index=index)
ddf = dd.from_pandas(df, npartitions=3)
- assert_eq(ddf.to_timestamp(), df.to_timestamp())
+ assert_eq(ddf.to_timestamp(), df.to_timestamp(), **CHECK_FREQ)
assert_eq(
ddf.to_timestamp(freq="M", how="s").compute(),
df.to_timestamp(freq="M", how="s"),
+ **CHECK_FREQ
)
assert_eq(ddf.x.to_timestamp(), df.x.to_timestamp())
assert_eq(
ddf.x.to_timestamp(freq="M", how="s").compute(),
df.x.to_timestamp(freq="M", how="s"),
+ **CHECK_FREQ
)


diff --git a/dask/dataframe/tests/test_extensions.py b/dask/dataframe/tests/test_extensions.py
index bc83784a..c69bcd06 100644
--- a/dask/dataframe/tests/test_extensions.py
+++ b/dask/dataframe/tests/test_extensions.py
@@ -41,7 +41,11 @@ def test_reduction():
dser = dd.from_pandas(ser, 2)
assert_eq(ser.mean(skipna=False), dser.mean(skipna=False))

-    assert_eq(ser.to_frame().mean(skipna=False), dser.to_frame().mean(skipna=False))
+    # It's unclear whether this can be reliably provided, at least with the
+    # current implementation, which uses pandas.DataFrame.sum(), returning a
+    # (homogenous) series which has potentially cast values.
+
+    # assert_eq(ser.to_frame().mean(skipna=False), dser.to_frame().mean(skipna=False))


def test_scalar():
diff --git a/dask/dataframe/tests/test_indexing.py b/dask/dataframe/tests/test_indexing.py
index 2348b89f..88939db4 100644
--- a/dask/dataframe/tests/test_indexing.py
+++ b/dask/dataframe/tests/test_indexing.py
@@ -19,6 +19,9 @@ dsk = {
meta = make_meta({"a": "i8", "b": "i8"}, index=pd.Index([], "i8"))
d = dd.DataFrame(dsk, "x", meta, [0, 5, 9, 9])
full = d.compute()
+CHECK_FREQ = {}
+if dd._compat.PANDAS_GT_110:
+ CHECK_FREQ["check_freq"] = False


def test_loc():
@@ -369,24 +372,35 @@ def test_loc_timestamp_str():
     assert_eq(df.loc["2011-01-02"], ddf.loc["2011-01-02"])
     assert_eq(df.loc["2011-01-02":"2011-01-10"], ddf.loc["2011-01-02":"2011-01-10"])
     # same reso, dask result is always DataFrame
-    assert_eq(df.loc["2011-01-02 10:00"].to_frame().T, ddf.loc["2011-01-02 10:00"])
+    assert_eq(
+        df.loc["2011-01-02 10:00"].to_frame().T,
+        ddf.loc["2011-01-02 10:00"],
+        **CHECK_FREQ
+    )

     # series
-    assert_eq(df.A.loc["2011-01-02"], ddf.A.loc["2011-01-02"])
-    assert_eq(df.A.loc["2011-01-02":"2011-01-10"], ddf.A.loc["2011-01-02":"2011-01-10"])
+    assert_eq(df.A.loc["2011-01-02"], ddf.A.loc["2011-01-02"], **CHECK_FREQ)
+    assert_eq(
+        df.A.loc["2011-01-02":"2011-01-10"],
+        ddf.A.loc["2011-01-02":"2011-01-10"],
+        **CHECK_FREQ
+    )

# slice with timestamp (dask result must be DataFrame)
assert_eq(
df.loc[pd.Timestamp("2011-01-02")].to_frame().T,
ddf.loc[pd.Timestamp("2011-01-02")],
+ **CHECK_FREQ
)
assert_eq(
df.loc[pd.Timestamp("2011-01-02") : pd.Timestamp("2011-01-10")],
ddf.loc[pd.Timestamp("2011-01-02") : pd.Timestamp("2011-01-10")],
+ **CHECK_FREQ
)
assert_eq(
df.loc[pd.Timestamp("2011-01-02 10:00")].to_frame().T,
ddf.loc[pd.Timestamp("2011-01-02 10:00")],
+ **CHECK_FREQ
)

df = pd.DataFrame(
diff --git a/dask/dataframe/tests/test_rolling.py b/dask/dataframe/tests/test_rolling.py
index 81d8f498..948e1fa5 100644
--- a/dask/dataframe/tests/test_rolling.py
+++ b/dask/dataframe/tests/test_rolling.py
@@ -4,6 +4,7 @@ import pandas as pd
import pytest
import numpy as np

+import dask.array as da
import dask.dataframe as dd
from dask.dataframe.utils import assert_eq, PANDAS_VERSION

@@ -139,6 +140,10 @@ rolling_method_args_check_less_precise = [
@pytest.mark.parametrize("window", [1, 2, 4, 5])
@pytest.mark.parametrize("center", [True, False])
 def test_rolling_methods(method, args, window, center, check_less_precise):
+ if dd._compat.PANDAS_GT_110:
+ check_less_precise = {}
+ else:
+ check_less_precise = {"check_less_precise": check_less_precise}
# DataFrame
prolling = df.rolling(window, center=center)
drolling = ddf.rolling(window, center=center)
@@ -150,7 +155,7 @@ def test_rolling_methods(method, args, window, center, check_less_precise):
assert_eq(
getattr(prolling, method)(*args, **kwargs),
getattr(drolling, method)(*args, **kwargs),
- check_less_precise=check_less_precise,
+ **check_less_precise,
)

# Series
@@ -159,7 +164,7 @@ def test_rolling_methods(method, args, window, center, check_less_precise):
assert_eq(
getattr(prolling, method)(*args, **kwargs),
getattr(drolling, method)(*args, **kwargs),
- check_less_precise=check_less_precise,
+ **check_less_precise,
)


@@ -264,6 +269,11 @@ def test_time_rolling_constructor():
)
@pytest.mark.parametrize("window", ["1S", "2S", "3S", pd.offsets.Second(5)])
def test_time_rolling_methods(method, args, window, check_less_precise):
+ if dd._compat.PANDAS_GT_110:
+ check_less_precise = {}
+ else:
+ check_less_precise = {"check_less_precise": check_less_precise}
+
# DataFrame
if method == "apply":
kwargs = {"raw": False}
@@ -274,7 +284,7 @@ def test_time_rolling_methods(method, args, window, check_less_precise):
assert_eq(
getattr(prolling, method)(*args, **kwargs),
getattr(drolling, method)(*args, **kwargs),
- check_less_precise=check_less_precise,
+ **check_less_precise,
)

# Series
@@ -283,7 +293,7 @@ def test_time_rolling_methods(method, args, window, check_less_precise):
assert_eq(
getattr(prolling, method)(*args, **kwargs),
getattr(drolling, method)(*args, **kwargs),
- check_less_precise=check_less_precise,
+ **check_less_precise,
)


diff --git a/dask/dataframe/tests/test_shuffle.py b/dask/dataframe/tests/test_shuffle.py
index 63a65737..39f5ccd7 100644
--- a/dask/dataframe/tests/test_shuffle.py
+++ b/dask/dataframe/tests/test_shuffle.py
@@ -36,6 +35,9 @@ dsk = {
meta = make_meta({"a": "i8", "b": "i8"}, index=pd.Index([], "i8"))
d = dd.DataFrame(dsk, "x", meta, [0, 4, 9, 9])
full = d.compute()
+CHECK_FREQ = {}
+if dd._compat.PANDAS_GT_110:
+ CHECK_FREQ["check_freq"] = False


shuffle_func = shuffle # conflicts with keyword argument
@@ -772,7 +774,7 @@ def test_set_index_on_empty():
ddf = ddf[ddf.y > df.y.max()].set_index("x")
expected_df = df[df.y > df.y.max()].set_index("x")

- assert assert_eq(ddf, expected_df)
+ assert assert_eq(ddf, expected_df, **CHECK_FREQ)
assert ddf.npartitions == 1


@@ -916,8 +918,8 @@ def test_set_index_timestamp():
assert ts1.value == ts2.value
assert ts1.tz == ts2.tz

- assert_eq(df2, ddf_new_div)
- assert_eq(df2, ddf.set_index("A"))
+ assert_eq(df2, ddf_new_div, **CHECK_FREQ)
+ assert_eq(df2, ddf.set_index("A"), **CHECK_FREQ)


@pytest.mark.parametrize("compression", [None, "ZLib"])
diff --git a/dask/dataframe/tests/test_utils_dataframe.py b/dask/dataframe/tests/test_utils_dataframe.py
index ffbebb69..fa6a6625 100644
--- a/dask/dataframe/tests/test_utils_dataframe.py
+++ b/dask/dataframe/tests/test_utils_dataframe.py
@@ -129,7 +129,7 @@ def test_meta_nonempty():
"E": np.int32(1),
"F": pd.Timestamp("2016-01-01"),
            "G": pd.date_range("2016-01-01", periods=3, tz="America/New_York"),
- "H": pd.Timedelta("1 hours", "ms"),
+ "H": pd.Timedelta("1 hours"),
"I": np.void(b" "),
"J": pd.Categorical([UNKNOWN_CATEGORIES] * 3),
},
@@ -147,7 +147,7 @@ def test_meta_nonempty():
assert df3["E"][0].dtype == "i4"
assert df3["F"][0] == pd.Timestamp("1970-01-01 00:00:00")
    assert df3["G"][0] == pd.Timestamp("1970-01-01 00:00:00", tz="America/New_York")
- assert df3["H"][0] == pd.Timedelta("1", "ms")
+ assert df3["H"][0] == pd.Timedelta("1")
assert df3["I"][0] == "foo"
assert df3["J"][0] == UNKNOWN_CATEGORIES

diff --git a/dask/dataframe/tseries/tests/test_resample.py b/dask/dataframe/tseries/tests/test_resample.py
index 327b4392..ee24313e 100644
--- a/dask/dataframe/tseries/tests/test_resample.py
+++ b/dask/dataframe/tseries/tests/test_resample.py
@@ -7,6 +7,10 @@ from dask.dataframe.utils import assert_eq, PANDAS_VERSION
from dask.dataframe._compat import PANDAS_GT_0240
import dask.dataframe as dd

+CHECK_FREQ = {}
+if dd._compat.PANDAS_GT_110:
+ CHECK_FREQ["check_freq"] = False
+

def resample(df, freq, how="mean", **kwargs):
return getattr(df.resample(freq, **kwargs), how)()
@@ -195,7 +199,7 @@ def test_series_resample_non_existent_datetime():
result = ddf.resample("1D").mean()
expected = df.resample("1D").mean()

- assert_eq(result, expected)
+ assert_eq(result, expected, **CHECK_FREQ)


@pytest.mark.skipif(PANDAS_VERSION <= "0.23.4", reason="quantile not in 0.23")
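A side note on the first hunk of the patch above: it swaps the private
_inc attribute (apparently gone in pandas 1.1) for the public delta one.
A rough sanity check of that distinction, assuming a pandas >= 1.1
environment (my example, not from the patch):

```python
import pandas as pd

# Fixed-length Tick offsets (Second, Minute, ...) expose a .delta;
# anchored calendar offsets (MonthEnd, ...) do not, which is what
# hasattr(offset, "delta") distinguishes in the patched code.
print(hasattr(pd.offsets.Second(5), "delta"))   # True
print(hasattr(pd.offsets.MonthEnd(), "delta"))  # False
```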
Rebecca N. Palmer
2020-10-19 19:00:02 UTC
Or maybe not an actual regression...it's a ~5e-7 difference and one of
the things the patch does (at around
dask/dataframe/tests/test_rolling.py:270) is _tighten_ the tolerance on
that test.

I have filed a separate bug (#972516) for the fsspec issues.
Stefano Rivera
2020-10-19 19:10:02 UTC
Hi Rebecca (2020.10.19_11:51:33_-0700)
Post by Rebecca N. Palmer
Or maybe not an actual regression...it's a ~5e-7 difference and one of the
things the patch does (at around dask/dataframe/tests/test_rolling.py:270)
is _tighten_ the tolerance on that test.
Hrm, I didn't see that failure. Testing again on a 32bit arch to be
sure...
Post by Rebecca N. Palmer
That looks like my earlier version, which fails with NameError.
Yeah, I applied it as-is first, and then followed up with the fixes,
after seeing the test failures.

SR
--
Stefano Rivera
http://tumbleweed.org.za/
+1 415 683 3272
Rebecca N. Palmer
2020-10-19 19:20:02 UTC
Post by Stefano Rivera
Hi Rebecca (2020.10.19_11:51:33_-0700)
Post by Rebecca N. Palmer
Or maybe not an actual regression...it's a ~5e-7 difference and one of the
things the patch does (at around dask/dataframe/tests/test_rolling.py:270)
is _tighten_ the tolerance on that test.
Hrm, I didn't see that failure. Testing again on a 32bit arch to be
sure...
My log is from amd64, but I don't know if it's reproducible.
Stefano Rivera
2020-10-19 19:30:01 UTC
Hi Rebecca (2020.10.19_12:07:08_-0700)
Post by Rebecca N. Palmer
Or maybe not an actual regression...it's a ~5e-7 difference and one of the
things the patch does (at around dask/dataframe/tests/test_rolling.py:270)
is _tighten_ the tolerance on that test.
Post by Stefano Rivera
Hrm, I didn't see that failure. Testing again on a 32bit arch to be
sure...
Aha. Reproduced.

And found https://github.com/dask/dask/pull/6502

SR
--
Stefano Rivera
http://tumbleweed.org.za/
+1 415 683 3272
Rebecca N. Palmer
2020-10-19 22:00:02 UTC
- python3-pandas:amd64 (>= 0.19.0)
- python3-distributed:amd64
- python3-dask:amd64 (>= 2.9.0)
- python3-dask:amd64 (< 2.11.0+dfsg-1.1~)
I removed the python3-distributed build-dependency to break this cycle,
but I've only tested that with my version. It can be added back once we
have an installable dask.

Debian Bug Tracking System
2020-10-19 19:00:01 UTC
tag -1 pending
Bug #969648 [python3-dask] dask: autopkgtest fail with pandas 1.1 - datetime issues
Added tag(s) pending.
--
969648: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=969648
Debian Bug Tracking System
Contact ***@bugs.debian.org with problems
Stefano Rivera
2020-10-19 19:00:02 UTC
Control: tag -1 pending

Hello,

Bug #969648 in dask reported by you has been fixed in the
Git repository and is awaiting an upload. You can see the commit
message below and you can check the diff of the fix at:

https://salsa.debian.org/python-team/packages/dask/-/commit/f37e48dc02aad3d34f9ea9acb7ca8d65fd107aab

------------------------------------------------------------------------
Patch: Compatibility with pandas 1.1, Thanks Rebecca Palmer. (Closes: #969648)
------------------------------------------------------------------------

(this message was generated automatically)
--
Greetings

https://bugs.debian.org/969648
Rebecca N. Palmer
2020-10-19 19:10:01 UTC
sorry, not any more.
That looks like my earlier version, which fails with NameError.
Debian Bug Tracking System
2020-10-19 21:10:04 UTC
Your message dated Mon, 19 Oct 2020 21:03:45 +0000
with message-id <E1kUcJl-000BgS-***@fasolo.debian.org>
and subject line Bug#969648: fixed in dask 2.11.0+dfsg-2
has caused the Debian Bug report #969648,
regarding dask: autopkgtest fail with pandas 1.1 - datetime issues
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact ***@bugs.debian.org
immediately.)
--
969648: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=969648
Debian Bug Tracking System
Contact ***@bugs.debian.org with problems