
GITHUB.COM
Detected CMS Systems:
- Wordpress (2 occurrences)
Title:
BUG: unwanted type conversion when partial reassigning · Issue #50467 · pandas-dev/pandas
Description:
Pandas version checks I have checked that this issue has not already been reported. I have confirmed this bug exists on the latest version of pandas. I have confirmed this bug exists on the main branch of pandas. Reproducible Example fro...
Website Age:
17 years and 8 months (reg. 2007-10-09).
Matching Content Categories
- Technology & Computing
- Video & Online Content
- Social Networks
Content Management System
What CMS is github.com built with?
Github.com relies on WORDPRESS.
Traffic Estimate
What is the average monthly size of github.com audience?
Tremendous Traffic: 10M - 20M visitors per month
Based on our best estimate, this website will receive around 10,634,219 visitors per month in the current month.
How Does Github.com Make Money?
Subscription Packages
We've located a dedicated page on github.com that might include details about subscription plans or recurring payments. We identified it based on the word pricing in one of its internal links. Below, you'll find additional estimates for its monthly recurring revenues.
How Much Does Github.com Make?
Subscription Packages
Prices on github.com are in US Dollars ($).
They range from $4.00/month to $21.00/month.
We estimate that the site has approximately 5,306,347 paying customers.
The estimated monthly recurring revenue (MRR) is $22,286,656.
The estimated annual recurring revenues (ARR) are $267,439,867.
WordPress Themes and Plugins
What WordPress theme does this site use?
It is strange but we were not able to detect any theme on the page.
What WordPress plugins does this website use?
It is strange but we were not able to detect any plugins on the page.
Keywords
npnan, pandas, bug, issue, phofl, trycastint, type, petrov, dfiloc, sign, conversion, reassigning, return, dataframe, preprocessing, half, member, indexing, code, projects, partial, reproducible, import, getsampledf, apply, top, team, behavior, commented, navigation, pull, requests, actions, security, unwanted, closed, description, version, confirmed, exists, def, str, removed, applymaptrycastint, found, line, float, int, avoid, dftop,
Topics
phofl edits member np import pandas personal information bug pandas team comment metadata assignees issue description parse integer string custom function try_cast_int unwanted type conversion latest version type projects partial reassigning projects milestone bug exists triage issue apply `try_cast_int` apply try_cast_int type conversion code shorter def try_cast_int df = get_sample_df cast string conversion fails pandas complex preprocessing preprocessing strings place preprocessing main branch business reasons top half interesting fact 1st column context manager bottom half series/frames extra job bit overcomplicated makes sense milestone relationships return pd df_top = df issue bug float values github df concat df_bottom = df expected behavior return int reassigning
Payment Methods
- Braintree
Questions
- Any reason you are not using astype here?
- How can we avoid this behavior?
- I found a bug(?
- Pandas is not designed for this kind of task, right?
- Why does the type conversion happen?
Schema
DiscussionForumPosting:
context:https://schema.org
headline:BUG: unwanted type conversion when partial reassigning
articleBody:### Pandas version checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the [latest version](https://pandas.pydata.org/docs/whatsnew/index.html) of pandas.
- [X] I have confirmed this bug exists on the main branch of pandas.
### Reproducible Example
```python
from typing import Union

import numpy as np
import pandas as pd


def get_sample_df():
    return pd.DataFrame(
        [
            ["1", np.nan],
            ["2", np.nan],
            ["3", np.nan],
            ["4", np.nan],
        ],
        dtype="object",  # Reproduce the dataframe I'm working with.
    )


def try_cast_int(s: str) -> Union[int, str]:
    # some complex preprocessing here
    # I removed those to make my code shorter
    try:
        return int(s)
    except ValueError:
        return s


df = get_sample_df()
# Due to business reasons, apply `try_cast_int` only to the top half of `df`.
df.iloc[:2, :] = df.iloc[:2, :].applymap(try_cast_int)
```
### Issue Description
Hello pandas team!
I found a bug(?) today. I expected `df`, after the last line of the "Reproducible Example", to look like the dataframe below.
| | 0 | 1 |
|---:|----:|----:|
| 0 | 1 | np.NaN |
| 1 | 2 | np.NaN |
| 2 | 3 | np.NaN |
| 3 | 4 | np.NaN |
But what I got is shown below. I have no idea why I got `float` values such as `1.0` and `2.0` instead of `int`.
| | 0 | 1 |
|---:|----:|----:|
| 0 | 1.0 | np.NaN |
| 1 | 2.0 | np.NaN |
| 2 | 3 | np.NaN |
| 3 | 4 | np.NaN |
One interesting fact is that `df.iloc[:2, :].applymap(try_cast_int)`, evaluated before the reassignment, returns a dataframe like the one below.
| | 0 | 1 |
|---:|----:|----:|
| 0 | 1 | np.NaN |
| 1 | 2 | np.NaN |
It seems that the integers in the 1st column are converted into float values during the partial reassignment.
My questions are:
- Why does the type conversion happen?
- How can we avoid this behavior? (Is there a context manager or something similar for that?)
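A possible explanation for the first question, offered as a sketch rather than a confirmed diagnosis: the frame returned by `applymap` holds an integer column next to an all-NaN float column, and when such a frame is turned into a single 2D array, the common dtype of the two is `float64`. The snippet below only demonstrates that upcasting step on a small hypothetical frame (`rhs` is an illustrative name). Whether this is exactly what `iloc` assignment does internally is an assumption.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the result of `df.iloc[:2, :].applymap(try_cast_int)`:
# column 0 holds integers, column 1 holds only NaN (a float).
rhs = pd.DataFrame({0: [1, 2], 1: [np.nan, np.nan]})

# Each column keeps its own dtype inside the DataFrame.
print(rhs.dtypes)

# Converting the mixed int/float frame into one 2D array forces a common dtype,
# so the integers in column 0 come out as floats.
arr = rhs.to_numpy()
print(arr.dtype)  # float64
print(type(arr[0, 0]))
```

If this is the step that happens during the assignment, the values written into the object-dtype `df` would already be `1.0` and `2.0`, which matches the observed result.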
### Expected Behavior
After the last line of the "Reproducible Example", `df` should look like the dataframe below.
| | 0 | 1 |
|---:|----:|----:|
| 0 | 1 | np.NaN |
| 1 | 2 | np.NaN |
| 2 | 3 | np.NaN |
| 3 | 4 | np.NaN |
I guess that applying `try_cast_int` only to the top half of `df` is the cause of this issue. Pandas is not designed for this kind of task, right?
---
I found that I can avoid this behavior and get what I want by following the steps below.
1. Split the `df` into 2 before applying `try_cast_int`.
2. Apply `try_cast_int` to the top half of the `df`.
3. Do no processing on the bottom half of the `df`.
4. Concat them into 1.
Example code:
```python
df = get_sample_df()
df_top = df.iloc[:2, :]
df_top = df_top.applymap(try_cast_int)
df_bottom = df.iloc[2:, :]
df_full = pd.concat([df_top, df_bottom], axis=0)
```
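Another way to keep the integers, sketched under the assumption that only column `0` needs conversion: do the conversion on a plain Python list and write the whole column back as an object-dtype Series, so that no mixed int/float frame is ever assigned into a slice. The helper and sample data are redefined here so the snippet runs on its own; `values` is an illustrative name.

```python
from typing import Union

import numpy as np
import pandas as pd


def try_cast_int(s: str) -> Union[int, str]:
    try:
        return int(s)
    except ValueError:
        return s


df = pd.DataFrame(
    [["1", np.nan], ["2", np.nan], ["3", np.nan], ["4", np.nan]],
    dtype="object",
)

# Convert only the first two entries of column 0, outside of pandas.
values = df[0].tolist()
values[:2] = [try_cast_int(v) for v in values[:2]]

# Write the column back with an explicit object dtype so the Python ints
# and the remaining strings are stored unchanged.
df[0] = pd.Series(values, index=df.index, dtype="object")

# Show the Python type stored in each row of column 0.
print(df[0].map(type).tolist())
```

This trades the single `iloc` assignment for a full-column replacement, which may be less convenient when many columns are involved.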
Best of luck.
### Installed Versions
<details>
INSTALLED VERSIONS
------------------
commit : 66e3805b8cabe977f40c05259cc3fcf7ead5687d
python : 3.8.16.final.0
python-bits : 64
OS : Linux
OS-release : 5.10.133+
Version : #1 SMP Fri Aug 26 08:44:51 UTC 2022
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.3.5
numpy : 1.21.6
pytz : 2022.6
dateutil : 2.8.2
pip : 21.1.3
setuptools : 57.4.0
Cython : 0.29.32
pytest : 3.6.4
hypothesis : None
sphinx : 1.8.6
blosc : None
feather : 0.4.1
xlsxwriter : None
lxml.etree : 4.9.2
html5lib : 1.0.1
pymysql : None
psycopg2 : 2.9.5 (dt dec pq3 ext lo64)
jinja2 : 2.11.3
IPython : 7.9.0
pandas_datareader: 0.9.0
bs4 : 4.6.3
bottleneck : None
fsspec : 2022.11.0
fastparquet : None
gcsfs : None
matplotlib : 3.2.2
numexpr : 2.8.4
odfpy : None
openpyxl : 3.0.10
pandas_gbq : 0.17.9
pyarrow : 9.0.0
pyxlsb : None
s3fs : None
scipy : 1.7.3
sqlalchemy : 1.4.45
tables : 3.7.0
tabulate : 0.8.10
xarray : 2022.12.0
xlrd : 1.2.0
xlwt : 1.3.0
numba : 0.56.4
</details>
author:
url:https://github.com/petrov826
type:Person
name:petrov826
datePublished:2022-12-28T15:05:36.000Z
interactionStatistic:
type:InteractionCounter
interactionType:https://schema.org/CommentAction
userInteractionCount:3
url:https://github.com/pandas-dev/pandas/issues/50467
Person:
url:https://github.com/petrov826
name:petrov826
InteractionCounter:
interactionType:https://schema.org/CommentAction
userInteractionCount:3
External Links (3)
Analytics and Tracking
- Site Verification - Google
Libraries
- Clipboard.js
- D3.js
- Lodash
Emails and Hosting
Mail Servers:
- aspmx.l.google.com
- alt1.aspmx.l.google.com
- alt2.aspmx.l.google.com
- alt3.aspmx.l.google.com
- alt4.aspmx.l.google.com
Name Servers:
- dns1.p08.nsone.net
- dns2.p08.nsone.net
- dns3.p08.nsone.net
- dns4.p08.nsone.net
- ns-1283.awsdns-32.org
- ns-1707.awsdns-21.co.uk
- ns-421.awsdns-52.com
- ns-520.awsdns-01.net