
Title:
BUG: SeriesGroupBy.value_counts - index name missing when applied on categorical column · Issue #44324 · pandas-dev/pandas
Description:
I have checked that this issue has not already been reported. I have confirmed this bug exists on the latest version of pandas. I have confirmed this bug exists on the master branch of pandas. Reproducible Example import pandas as pd df ...
Schema
DiscussionForumPosting:
context:https://schema.org
headline:BUG: SeriesGroupBy.value_counts - index name missing when applied on categorical column
articleBody:###
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the [latest version](https://pandas.pydata.org/docs/whatsnew/index.html) of pandas.
- [ ] I have confirmed this bug exists on the master branch of pandas.
### Reproducible Example
```python
import pandas as pd

df = pd.DataFrame({'gender': ['female', 'male', 'female', 'male', 'female', 'male'],
                   'education': ['low', 'medium', 'high', 'low', 'high', 'low'],
                   'country': ['US', 'FR', 'US', 'FR', 'FR', 'US']})

# Output shown in the comments is as reported on pandas 1.3.4.
print(df.dtypes)
# gender       object
# education    object
# country      object
# dtype: object

print(df.groupby('country')['gender'].value_counts())
# country  gender
# FR       male      2
#          female    1
# US       female    2
#          male      1
# Name: gender, dtype: int64

print(df.groupby('country')['gender'].value_counts().index.names)
# FrozenList(['country', 'gender'])
```
Now, if the gender column is converted to a categorical type, the column name is no longer kept in the index:
```python
# Continues from the example above; output is as reported on pandas 1.3.4.
df['gender'] = df['gender'].astype('category')

print(df.dtypes)
# gender       category
# education      object
# country        object
# dtype: object

print(df.groupby('country')['gender'].value_counts())
# country
# FR       male      2
#          female    1
# US       female    2
#          male      1
# Name: gender, dtype: int64

# The second index level has lost its name:
print(df.groupby('country')['gender'].value_counts().index.names)
# FrozenList(['country', None])
```
### Issue Description
The column name (here **gender**) is dropped when applying `value_counts()` on a column of type `category`.
It might be linked to other problems in handling categorical types: #44001
This worked as of pandas v1.2.3. I could reproduce the bug starting with v1.3.0, and it is still present in v1.3.4.
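
Until the behavior is fixed, one possible workaround is to set the index level names explicitly after calling `value_counts()`. The snippet below is only a sketch of that idea, not a fix in pandas itself; the level names `country` and `gender` are taken from the example above, and `MultiIndex.set_names` is used to restore them.

```python
import pandas as pd

df = pd.DataFrame({
    "gender": ["female", "male", "female", "male", "female", "male"],
    "country": ["US", "FR", "US", "FR", "FR", "US"],
})
df["gender"] = df["gender"].astype("category")

counts = df.groupby("country")["gender"].value_counts()

# Workaround: name both index levels explicitly.
# On versions where the name is already present, this changes nothing.
counts.index = counts.index.set_names(["country", "gender"])

print(list(counts.index.names))
```

Setting the names explicitly keeps downstream code such as `reset_index()` independent of which pandas version is installed.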
### Expected Behavior
The column name should be kept in the index for consistency with other types, as in the first example above with the object-type column.
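
The expectation can be expressed as a small comparison of the index level names for both dtypes. This is a sketch only; the helper name `value_counts_index_names` is hypothetical and not part of pandas.

```python
import pandas as pd


def value_counts_index_names(dtype):
    """Return the index level names of groupby(...).value_counts() for a dtype."""
    df = pd.DataFrame({
        "gender": ["female", "male", "female", "male", "female", "male"],
        "country": ["US", "FR", "US", "FR", "FR", "US"],
    })
    df["gender"] = df["gender"].astype(dtype)
    result = df.groupby("country")["gender"].value_counts()
    return list(result.index.names)


names_object = value_counts_index_names("object")
names_category = value_counts_index_names("category")

# Expected behavior per this report: both lists are ['country', 'gender'].
# On affected versions (1.3.0 to 1.3.4 per the report) the second comparison is False.
print(names_object)
print(names_category)
print(names_object == names_category)
```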
### Installed Versions
<details>
pd.show_versions()
INSTALLED VERSIONS
------------------
commit : 945c9ed766a61c7d2c0a7cbb251b6edebf9cb7d5
python : 3.9.6.final.0
python-bits : 64
OS : Darwin
OS-release : 21.1.0
Version : Darwin Kernel Version 21.1.0: Wed Oct 13 17:33:23 PDT 2021; root:xnu-8019.41.5~1/RELEASE_X86_64
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.UTF-8
pandas : 1.3.4
numpy : 1.21.4
pytz : 2021.3
dateutil : 2.8.2
pip : 21.3.1
setuptools : 57.0.0
Cython : None
pytest : 6.2.4
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.6.3
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.0.1
IPython : 7.26.0
pandas_datareader: None
bs4 : 4.9.3
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.4.3
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyxlsb : None
s3fs : None
scipy : 1.7.1
sqlalchemy : None
tables : None
tabulate : 0.8.9
xarray : 0.19.0
xlrd : None
xlwt : None
numba : None
</details>
author:
url:https://github.com/donok1
type:Person
name:donok1
datePublished:2021-11-05T14:24:46.000Z
interactionStatistic:
type:InteractionCounter
interactionType:https://schema.org/CommentAction
userInteractionCount:3
url:https://github.com/pandas-dev/pandas/issues/44324