
GITHUB . COM {
}
Detected CMS Systems:
- Wordpress (2 occurrences)
Title:
BUG: Styling of a `DataFrame` with boolean column labels leaves out `False` and fails when adding non-numerical column Β· Issue #47838 Β· pandas-dev/pandas
Description:
Pandas version checks I have checked that this issue has not already been reported. I have confirmed this bug exists on the latest version of pandas. I have confirmed this bug exists on the main branch of pandas. Reproducible Example # %...
Website Age:
17 years and 8 months (reg. 2007-10-09).
Matching Content Categories {π}
- Technology & Computing
- Cryptocurrency
- Mobile Technology & AI
Content Management System {π}
What CMS is github.com built with?
Github.com is built with WORDPRESS.
Traffic Estimate {π}
What is the average monthly size of github.com audience?
ππ Tremendous Traffic: 10M - 20M visitors per month
Based on our best estimate, this website will receive around 10,000,019 visitors per month in the current month.
However, some sources were not loaded, we suggest to reload the page to get complete results.
check SE Ranking
check Ahrefs
check Similarweb
check Ubersuggest
check Semrush
How Does Github.com Make Money? {πΈ}
Subscription Packages {π³}
We've located a dedicated page on github.com that might include details about subscription plans or recurring payments. We identified it based on the word pricing in one of its internal links. Below, you'll find additional estimates for its monthly recurring revenues.How Much Does Github.com Make? {π°}
Subscription Packages {π³}
Prices on github.com are in US Dollars ($).
They range from $4.00/month to $21.00/month.
We estimate that the site has approximately 4,989,889 paying customers.
The estimated monthly recurring revenue (MRR) is $20,957,532.
The estimated annual recurring revenues (ARR) are $251,490,385.
Wordpress Themes and Plugins {π¨}
What WordPress theme does this site use?
It is strange but we were not able to detect any theme on the page.
What WordPress plugins does this website use?
It is strange but we were not able to detect any plugins on the page.
Keywords {π}
boolean, column, subset, pandas, issue, indexing, bug, false, labels, adding, bherwerth, sign, styling, dataframe, true, problem, names, columns, styler, case, dfloc, cols, projects, fails, default, loc, mask, solution, work, added, attack, change, navigation, pull, requests, actions, security, leaves, nonnumerical, closed, description, version, confirmed, exists, import, index, styled, displayhtmldfstylebackgroundgradienttohtml, indexerror, expected,
Topics {βοΈ}
df pandas/pandas/io/formats/style pandas/pandas/io/formats/style pandas/tests/io/formats/style/ styler conditional formatting personal information bug boolean column labels good idea generally comment metadata assignees issue description bherwerth mentioned mixing boolean names styler subset arg latest version pandas interprets bug exists styled dataframe renders type projects projects milestone triage issue column names pandas style type boolean mask boolean values boolean indexers boolean indexing visualize pd numerical columns column false boolean index indexerror display true column main branch requires running wrong length expected behavior include=np automatic selection series/frames suggested fix added unintended consequences milestone relationships adding column issue solution proposed proposed solution general df draft pr github
Payment Methods {π}
- Braintree
Questions {β}
- Already have an account?
Schema {πΊοΈ}
DiscussionForumPosting:
context:https://schema.org
headline:BUG: Styling of a `DataFrame` with boolean column labels leaves out `False` and fails when adding non-numerical column
articleBody:### Pandas version checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the [latest version](https://pandas.pydata.org/docs/whatsnew/index.html) of pandas.
- [X] I have confirmed this bug exists on the main branch of pandas.
### Reproducible Example
```python
# %%
import pandas as pd
from IPython import display
df = pd.DataFrame(
[
[1, 2],
[3, 4]
],
columns=[False, True],
index=[1, 2],
)
# column `False` not styled (requires running in Jupyter)
display.HTML(df.style.background_gradient().to_html())
# %%
df["C"] = ["a", "b"]
# fails with IndexError
display.HTML(df.style.background_gradient().to_html())
```
### Issue Description
Before adding `C`, the styled dataframe renders like this:

After adding `C`, I get `IndexError: Boolean index has wrong length: 2 instead of 3`.
<details>
<summary>Traceback</summary>
```
Traceback (most recent call last):
File "minimal.py", line 18, in <module>
display.HTML(df.style.background_gradient().to_html())
File "/home/pandas/pandas/io/formats/style.py", line 1372, in to_html
html = obj._render_html(
File "/home/pandas/pandas/io/formats/style_render.py", line 202, in _render_html
d = self._render(sparse_index, sparse_columns, max_rows, max_cols, " ")
File "/home/pandas/pandas/io/formats/style_render.py", line 166, in _render
self._compute()
File "/home/pandas/pandas/io/formats/style_render.py", line 254, in _compute
r = func(self)(*args, **kwargs)
File "/home/pandas/pandas/io/formats/style.py", line 1711, in _apply
data = self.data.loc[subset]
File "/home/pandas/pandas/core/indexing.py", line 1065, in __getitem__
return self._getitem_tuple(key)
File "/home/pandas/pandas/core/indexing.py", line 1254, in _getitem_tuple
return self._getitem_tuple_same_dim(tup)
File "/home/pandas/pandas/core/indexing.py", line 922, in _getitem_tuple_same_dim
retval = getattr(retval, self.name)._getitem_axis(key, axis=i)
File "/home/pandas/pandas/core/indexing.py", line 1287, in _getitem_axis
return self._getbool_axis(key, axis=axis)
File "/home/pandas/pandas/core/indexing.py", line 1089, in _getbool_axis
key = check_bool_indexer(labels, key)
File "/home/pandas/pandas/core/indexing.py", line 2561, in check_bool_indexer
return check_array_indexer(index, result)
File "/home/pandas/pandas/core/indexers/utils.py", line 552, in check_array_indexer
raise IndexError(
IndexError: Boolean index has wrong length: 2 instead of 3
```
</details>
The same problem occurs when using `style.bar` instead of `style.background_gradient`.
### Expected Behavior
First table:

After adding column "C":

I think the problem is that `subset` is by default taken to be the column names of the numerical columns of `df`
https://github.com/pandas-dev/pandas/blob/e8093ba372f9adfe79439d90fe74b0b5b6dea9d6/pandas/io/formats/style.py#L2826-L2827
Later, `df` is indexed with `loc`, which treats `subset` as a boolean mask:
https://github.com/pandas-dev/pandas/blob/e8093ba372f9adfe79439d90fe74b0b5b6dea9d6/pandas/io/formats/style.py#L1422-L1423
A solution could be to make `subset` by default a boolean mask:
```python
subset = (
self.data.columns.isin(
self.data.select_dtypes(include=np.number)
)
)
```
I tried this replacement and used it to produce the screenshots for the expected behavior above.
I suppose having boolean column labels is exotic and probably not a good idea generally. When indexing, for example `df[[False]]`, `pandas` interprets this as a boolean mask. Yet, I encountered this problem when trying to visualize `pd.crosstab(a, b)`, where `b` has boolean values.
If there is interest in fixing this, I'd be happy to work on a PR. Perhaps the very least, a warning should be shown when styling with boolean column labels.
### Installed Versions
<details>
/opt/conda/lib/python3.8/site-packages/_distutils_hack/__init__.py:33: UserWarning: Setuptools is replacing distutils.
warnings.warn("Setuptools is replacing distutils.")
INSTALLED VERSIONS
------------------
commit : a62897ae5f5c68360f478bbd5facea231dd55f30
python : 3.8.13.final.0
python-bits : 64
OS : Linux
OS-release : 5.10.104-linuxkit
Version : #1 SMP Thu Mar 17 17:08:06 UTC 2022
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : C.UTF-8
LANG : C.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.5.0.dev0+1189.ga62897ae5f
numpy : 1.22.4
pytz : 2022.1
dateutil : 2.8.2
setuptools : 63.2.0
pip : 22.2
Cython : 0.29.30
pytest : 7.1.2
hypothesis : 6.47.1
sphinx : 4.5.0
blosc : None
feather : None
xlsxwriter : 3.0.3
lxml.etree : 4.9.1
html5lib : 1.1
pymysql : 1.0.2
psycopg2 : 2.9.3
jinja2 : 3.1.2
IPython : 8.4.0
pandas_datareader: 0.10.0
bs4 : 4.11.1
bottleneck : 1.3.5
brotli :
fastparquet : 0.8.1
fsspec : 2022.5.0
gcsfs : 2022.5.0
matplotlib : 3.5.2
numba : 0.55.2
numexpr : 2.8.3
odfpy : None
openpyxl : 3.0.9
pandas_gbq : 0.17.6
pyarrow : 8.0.0
pyreadstat : 1.1.9
pyxlsb : 1.0.9
s3fs : 0.6.0
scipy : 1.8.1
snappy :
sqlalchemy : 1.4.39
tables : 3.7.0
tabulate : 0.8.10
xarray : 2022.6.0
xlrd : 2.0.1
xlwt : 1.3.0
zstandard : 0.18.0
</details>
author:
url:https://github.com/bherwerth
type:Person
name:bherwerth
datePublished:2022-07-24T15:35:10.000Z
interactionStatistic:
type:InteractionCounter
interactionType:https://schema.org/CommentAction
userInteractionCount:2
url:https://github.com/47838/pandas/issues/47838
context:https://schema.org
headline:BUG: Styling of a `DataFrame` with boolean column labels leaves out `False` and fails when adding non-numerical column
articleBody:### Pandas version checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the [latest version](https://pandas.pydata.org/docs/whatsnew/index.html) of pandas.
- [X] I have confirmed this bug exists on the main branch of pandas.
### Reproducible Example
```python
# %%
import pandas as pd
from IPython import display
df = pd.DataFrame(
[
[1, 2],
[3, 4]
],
columns=[False, True],
index=[1, 2],
)
# column `False` not styled (requires running in Jupyter)
display.HTML(df.style.background_gradient().to_html())
# %%
df["C"] = ["a", "b"]
# fails with IndexError
display.HTML(df.style.background_gradient().to_html())
```
### Issue Description
Before adding `C`, the styled dataframe renders like this:

After adding `C`, I get `IndexError: Boolean index has wrong length: 2 instead of 3`.
<details>
<summary>Traceback</summary>
```
Traceback (most recent call last):
File "minimal.py", line 18, in <module>
display.HTML(df.style.background_gradient().to_html())
File "/home/pandas/pandas/io/formats/style.py", line 1372, in to_html
html = obj._render_html(
File "/home/pandas/pandas/io/formats/style_render.py", line 202, in _render_html
d = self._render(sparse_index, sparse_columns, max_rows, max_cols, " ")
File "/home/pandas/pandas/io/formats/style_render.py", line 166, in _render
self._compute()
File "/home/pandas/pandas/io/formats/style_render.py", line 254, in _compute
r = func(self)(*args, **kwargs)
File "/home/pandas/pandas/io/formats/style.py", line 1711, in _apply
data = self.data.loc[subset]
File "/home/pandas/pandas/core/indexing.py", line 1065, in __getitem__
return self._getitem_tuple(key)
File "/home/pandas/pandas/core/indexing.py", line 1254, in _getitem_tuple
return self._getitem_tuple_same_dim(tup)
File "/home/pandas/pandas/core/indexing.py", line 922, in _getitem_tuple_same_dim
retval = getattr(retval, self.name)._getitem_axis(key, axis=i)
File "/home/pandas/pandas/core/indexing.py", line 1287, in _getitem_axis
return self._getbool_axis(key, axis=axis)
File "/home/pandas/pandas/core/indexing.py", line 1089, in _getbool_axis
key = check_bool_indexer(labels, key)
File "/home/pandas/pandas/core/indexing.py", line 2561, in check_bool_indexer
return check_array_indexer(index, result)
File "/home/pandas/pandas/core/indexers/utils.py", line 552, in check_array_indexer
raise IndexError(
IndexError: Boolean index has wrong length: 2 instead of 3
```
</details>
The same problem occurs when using `style.bar` instead of `style.background_gradient`.
### Expected Behavior
First table:

After adding column "C":

I think the problem is that `subset` is by default taken to be the column names of the numerical columns of `df`
https://github.com/pandas-dev/pandas/blob/e8093ba372f9adfe79439d90fe74b0b5b6dea9d6/pandas/io/formats/style.py#L2826-L2827
Later, `df` is indexed with `loc`, which treats `subset` as a boolean mask:
https://github.com/pandas-dev/pandas/blob/e8093ba372f9adfe79439d90fe74b0b5b6dea9d6/pandas/io/formats/style.py#L1422-L1423
A solution could be to make `subset` by default a boolean mask:
```python
subset = (
self.data.columns.isin(
self.data.select_dtypes(include=np.number)
)
)
```
I tried this replacement and used it to produce the screenshots for the expected behavior above.
I suppose having boolean column labels is exotic and probably not a good idea generally. When indexing, for example `df[[False]]`, `pandas` interprets this as a boolean mask. Yet, I encountered this problem when trying to visualize `pd.crosstab(a, b)`, where `b` has boolean values.
If there is interest in fixing this, I'd be happy to work on a PR. Perhaps the very least, a warning should be shown when styling with boolean column labels.
### Installed Versions
<details>
/opt/conda/lib/python3.8/site-packages/_distutils_hack/__init__.py:33: UserWarning: Setuptools is replacing distutils.
warnings.warn("Setuptools is replacing distutils.")
INSTALLED VERSIONS
------------------
commit : a62897ae5f5c68360f478bbd5facea231dd55f30
python : 3.8.13.final.0
python-bits : 64
OS : Linux
OS-release : 5.10.104-linuxkit
Version : #1 SMP Thu Mar 17 17:08:06 UTC 2022
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : C.UTF-8
LANG : C.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.5.0.dev0+1189.ga62897ae5f
numpy : 1.22.4
pytz : 2022.1
dateutil : 2.8.2
setuptools : 63.2.0
pip : 22.2
Cython : 0.29.30
pytest : 7.1.2
hypothesis : 6.47.1
sphinx : 4.5.0
blosc : None
feather : None
xlsxwriter : 3.0.3
lxml.etree : 4.9.1
html5lib : 1.1
pymysql : 1.0.2
psycopg2 : 2.9.3
jinja2 : 3.1.2
IPython : 8.4.0
pandas_datareader: 0.10.0
bs4 : 4.11.1
bottleneck : 1.3.5
brotli :
fastparquet : 0.8.1
fsspec : 2022.5.0
gcsfs : 2022.5.0
matplotlib : 3.5.2
numba : 0.55.2
numexpr : 2.8.3
odfpy : None
openpyxl : 3.0.9
pandas_gbq : 0.17.6
pyarrow : 8.0.0
pyreadstat : 1.1.9
pyxlsb : 1.0.9
s3fs : 0.6.0
scipy : 1.8.1
snappy :
sqlalchemy : 1.4.39
tables : 3.7.0
tabulate : 0.8.10
xarray : 2022.6.0
xlrd : 2.0.1
xlwt : 1.3.0
zstandard : 0.18.0
</details>
author:
url:https://github.com/bherwerth
type:Person
name:bherwerth
datePublished:2022-07-24T15:35:10.000Z
interactionStatistic:
type:InteractionCounter
interactionType:https://schema.org/CommentAction
userInteractionCount:2
url:https://github.com/47838/pandas/issues/47838
Person:
url:https://github.com/bherwerth
name:bherwerth
url:https://github.com/bherwerth
name:bherwerth
InteractionCounter:
interactionType:https://schema.org/CommentAction
userInteractionCount:2
interactionType:https://schema.org/CommentAction
userInteractionCount:2
External Links {π}(6)
- What is the earnings of https://github.blog?
- How much money does https://pandas.pydata.org/docs/whatsnew/index.html generate?
- What is the monthly revenue of https://user-images.githubusercontent.com/108834862/180653795-a7d5c010-ab5d-45e1-9799-d88f3059ec90.png?
- Financial intake of https://user-images.githubusercontent.com/108834862/180654325-e5f977c4-965d-4a6f-847d-8ad343f58502.png
- https://user-images.githubusercontent.com/108834862/180654342-64c5ce1a-64fa-4f46-ae2f-6c6927d3098c.png's revenue stream
- Earnings of https://www.githubstatus.com/
Analytics and Tracking {π}
- Site Verification - Google
Libraries {π}
- Clipboard.js
- D3.js
- Lodash
Emails and Hosting {βοΈ}
Mail Servers:
- aspmx.l.google.com
- alt1.aspmx.l.google.com
- alt2.aspmx.l.google.com
- alt3.aspmx.l.google.com
- alt4.aspmx.l.google.com
Name Servers:
- dns1.p08.nsone.net
- dns2.p08.nsone.net
- dns3.p08.nsone.net
- dns4.p08.nsone.net
- ns-1283.awsdns-32.org
- ns-1707.awsdns-21.co.uk
- ns-421.awsdns-52.com
- ns-520.awsdns-01.net