GITHUB.COM

Detected CMS Systems:

Wordpress (2 occurrences)

We are analyzing https://github.com/pandas-dev/pandas/issues/50467.

Title:
BUG: unwanted type conversion when partial reassigning · Issue #50467 · pandas-dev/pandas
Description:
Pandas version checks I have checked that this issue has not already been reported. I have confirmed this bug exists on the latest version of pandas. I have confirmed this bug exists on the main branch of pandas. Reproducible Example fro...
Website Age:
17 years and 8 months (reg. 2007-10-09).

Matching Content Categories {📚}

Technology & Computing
Video & Online Content
Social Networks

Content Management System {📝}

What CMS is github.com built with?

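The scan matched WordPress signatures twice on this page. This is most likely a false positive: GitHub is not a WordPress site, and no WordPress theme or plugins were detected.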

Traffic Estimate {📈}

What is the average monthly size of github.com's audience?

🚀🌠 Tremendous Traffic: 10M - 20M visitors per month

Based on our best estimate, this website will receive around 10,634,219 visitors in the current month.

How Does Github.com Make Money? {💸}

Subscription Packages {💳}

We've located a dedicated page on github.com that might include details about subscription plans or recurring payments. We identified it based on the word pricing in one of its internal links. Below, you'll find additional estimates for its monthly recurring revenues.

How Much Does Github.com Make? {💰}

Subscription Packages {💳}

Prices on github.com are in US Dollars ($). They range from $4.00/month to $21.00/month.
We estimate that the site has approximately 5,306,347 paying customers.
The estimated monthly recurring revenue (MRR) is $22,286,656.
The estimated annual recurring revenue (ARR) is $267,439,867.
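
The three estimates above can be cross-checked with simple arithmetic. The sketch below assumes the conventional relation ARR = 12 × MRR; the report does not state the formula it actually used, so the variable names and the relation itself are assumptions for illustration.

```python
# Figures copied from the estimates above.
paying_customers = 5_306_347
reported_mrr = 22_286_656   # USD per month
reported_arr = 267_439_867  # USD per year

# Assumption: ARR is 12 times MRR (the report does not state its formula).
arr_from_mrr = 12 * reported_mrr

# Implied average monthly revenue per paying customer.
avg_per_customer = reported_mrr / paying_customers

print(arr_from_mrr)                 # 267439872
print(arr_from_mrr - reported_arr)  # 5
print(round(avg_per_customer, 2))   # 4.2
```

Under that assumption the figures are consistent to within $5, which would fit an MRR that was rounded before being displayed. The implied average of about $4.20 per customer sits near the bottom of the quoted $4.00 to $21.00 range.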

Wordpress Themes and Plugins {🎨}

What WordPress theme does this site use?

We were not able to detect any WordPress theme on the page.

What WordPress plugins does this website use?

We were not able to detect any WordPress plugins on the page.

Keywords {🔍}

npnan, pandas, bug, issue, phofl, trycastint, type, petrov, dfiloc, sign, conversion, reassigning, return, dataframe, preprocessing, half, member, indexing, code, projects, partial, reproducible, import, getsampledf, apply, top, team, behavior, commented, navigation, pull, requests, actions, security, unwanted, closed, description, version, confirmed, exists, def, str, removed, applymaptrycastint, found, line, float, int, avoid, dftop,

Topics {✒️}

phofl edits member np import pandas personal information bug pandas team comment metadata assignees issue description parse integer string custom function try_cast_int unwanted type conversion latest version type projects partial reassigning projects milestone bug exists triage issue apply `try_cast_int` apply try_cast_int type conversion code shorter def try_cast_int df = get_sample_df cast string conversion fails pandas complex preprocessing preprocessing strings place preprocessing main branch business reasons top half interesting fact 1st column context manager bottom half series/frames extra job bit overcomplicated makes sense milestone relationships return pd df_top = df issue bug float values github df concat df_bottom = df expected behavior return int reassigning

Payment Methods {📊}

Braintree

Questions {❓}

Any reason you are not using astype here?
How can we avoid this behavior?
I found a bug(?
Pandas are not designed to do this kind of task, right?
Why type conversion happens?

Schema {🗺️}

DiscussionForumPosting:
      context:https://schema.org
      headline:BUG: unwanted type conversion when partial reassigning
      articleBody:
         ### Pandas version checks

         - [X] I have checked that this issue has not already been reported.
         - [X] I have confirmed this bug exists on the [latest version](https://pandas.pydata.org/docs/whatsnew/index.html) of pandas.
         - [X] I have confirmed this bug exists on the main branch of pandas.

         ### Reproducible Example

         ```python
         from typing import Union

         import numpy as np
         import pandas as pd


         def get_sample_df():
             return pd.DataFrame(
                 [
                     ["1", np.nan],
                     ["2", np.nan],
                     ["3", np.nan],
                     ["4", np.nan]
                 ],
                 dtype="object"  # Reproduce the dataframe I'm working with.
             )


         def try_cast_int(s: str) -> Union[int, str]:
             # some complex preprocessing here
             # I removed those to make my code shorter
             try:
                 return int(s)
             except ValueError:
                 return s


         df = get_sample_df()

         # Due to business reasons, apply `try_cast_int` only to the top half of `df`.
         df.iloc[:2, :] = df.iloc[:2, :].applymap(try_cast_int)
         ```

         ### Issue Description

         Hello pandas team! I found a bug(?) today.

         I expect that the `df` after the last line in the "Reproducible Example" will be a dataframe like below.

         |    |   0 |      1 |
         |---:|----:|-------:|
         |  0 |   1 | np.NaN |
         |  1 |   2 | np.NaN |
         |  2 |   3 | np.NaN |
         |  3 |   4 | np.NaN |

         But what I got was like below. I have no idea why I got `float` values such as `1.0` and `2.0` instead of `int`.

         |    |   0 |      1 |
         |---:|----:|-------:|
         |  0 | 1.0 | np.NaN |
         |  1 | 2.0 | np.NaN |
         |  2 |   3 | np.NaN |
         |  3 |   4 | np.NaN |

         One interesting fact is that `df.iloc[:2, :].applymap(try_cast_int)` before reassigning will return a dataframe like below.

         |    |   0 |      1 |
         |---:|----:|-------:|
         |  0 |   1 | np.NaN |
         |  1 |   2 | np.NaN |

         It seems that integers in the 1st column are converted into float values when partially reassigning.

         My questions are

         - Why does type conversion happen?
         - How can we avoid this behavior? (Is there any context manager or something for that?)

         ### Expected Behavior

         The `df` after the last line in the "Reproducible Example" will be a dataframe like below.

         |    |   0 |      1 |
         |---:|----:|-------:|
         |  0 |   1 | np.NaN |
         |  1 |   2 | np.NaN |
         |  2 |   3 | np.NaN |
         |  3 |   4 | np.NaN |

         I guess that "apply `try_cast_int` to the top half of `df`" is a cause of this issue. Pandas is not designed to do this kind of task, right?

         ---

         I found that I can avoid this behavior and get what I want by following the steps below.

         1. split the `df` into 2 before applying `try_cast_int`
         2. apply `try_cast_int` to the top half of the `df`
         3. No processing is done on the bottom half of the `df`
         4. concat them into 1

         example code

         ```python
         df = get_sample_df()

         df_top = df.iloc[:2, :]
         df_top = df_top.applymap(try_cast_int)

         df_bottom = df.iloc[2:, :]

         df_full = pd.concat([df_top, df_bottom], axis=0)
         ```

         Best of luck.

         ### Installed Versions

         <details>

         INSTALLED VERSIONS
         ------------------
         commit : 66e3805b8cabe977f40c05259cc3fcf7ead5687d
         python : 3.8.16.final.0
         python-bits : 64
         OS : Linux
         OS-release : 5.10.133+
         Version : #1 SMP Fri Aug 26 08:44:51 UTC 2022
         machine : x86_64
         processor : x86_64
         byteorder : little
         LC_ALL : None
         LANG : en_US.UTF-8
         LOCALE : en_US.UTF-8

         pandas : 1.3.5
         numpy : 1.21.6
         pytz : 2022.6
         dateutil : 2.8.2
         pip : 21.1.3
         setuptools : 57.4.0
         Cython : 0.29.32
         pytest : 3.6.4
         hypothesis : None
         sphinx : 1.8.6
         blosc : None
         feather : 0.4.1
         xlsxwriter : None
         lxml.etree : 4.9.2
         html5lib : 1.0.1
         pymysql : None
         psycopg2 : 2.9.5 (dt dec pq3 ext lo64)
         jinja2 : 2.11.3
         IPython : 7.9.0
         pandas_datareader: 0.9.0
         bs4 : 4.6.3
         bottleneck : None
         fsspec : 2022.11.0
         fastparquet : None
         gcsfs : None
         matplotlib : 3.2.2
         numexpr : 2.8.4
         odfpy : None
         openpyxl : 3.0.10
         pandas_gbq : 0.17.9
         pyarrow : 9.0.0
         pyxlsb : None
         s3fs : None
         scipy : 1.7.3
         sqlalchemy : 1.4.45
         tables : 3.7.0
         tabulate : 0.8.10
         xarray : 2022.12.0
         xlrd : 1.2.0
         xlwt : 1.3.0
         numba : 0.56.4

         </details>
      author:
         url:https://github.com/petrov826
         type:Person
         name:petrov826
      datePublished:2022-12-28T15:05:36.000Z
      interactionStatistic:
         type:InteractionCounter
         interactionType:https://schema.org/CommentAction
         userInteractionCount:3
      url:https://github.com/pandas-dev/pandas/issues/50467
Person:
      url:https://github.com/petrov826
      name:petrov826
InteractionCounter:
      interactionType:https://schema.org/CommentAction
      userInteractionCount:3
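
The split-and-concat workaround described in the issue body can be sketched as a self-contained script. This is a minimal sketch, not the pandas team's recommended fix. It swaps the issue's `applymap` call for `DataFrame.apply` combined with `Series.map`, because `applymap` is deprecated in newer pandas releases. The final dtype check illustrates one plausible explanation for the float conversion (a frame holding an integer column and a float column collapses to a single float array); the source does not confirm that this is the cause.

```python
from typing import Union

import numpy as np
import pandas as pd


def get_sample_df() -> pd.DataFrame:
    # Same object-dtype frame as in the issue's reproducible example.
    return pd.DataFrame(
        [["1", np.nan], ["2", np.nan], ["3", np.nan], ["4", np.nan]],
        dtype="object",
    )


def try_cast_int(s: str) -> Union[int, str]:
    # int(np.nan) also raises ValueError, so NaN cells pass through unchanged.
    try:
        return int(s)
    except ValueError:
        return s


df = get_sample_df()

# Steps 1 and 2: take the top half and convert it element-wise.
# Assumption: apply + Series.map stands in for the issue's applymap call.
df_top = df.iloc[:2, :].apply(lambda col: col.map(try_cast_int))

# Step 3: the bottom half is left untouched.
df_bottom = df.iloc[2:, :]

# Step 4: stitch the halves back together instead of assigning through iloc.
df_full = pd.concat([df_top, df_bottom], axis=0)

# The converted top half has one integer column and one float column,
# so viewing it as a single array forces a common float dtype.
top_as_array_dtype = df_top.to_numpy().dtype

print(df_full[0].tolist())
print(top_as_array_dtype)
```

Concatenating leaves column `0` as an object column holding two integers followed by two strings, which matches the result the issue author says this workaround produces.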

External Links {🔗}(3)

Analytics and Tracking {📊}

Site Verification - Google

Libraries {📚}

Clipboard.js
D3.js
Lodash

Emails and Hosting {✉️}

Mail Servers:

aspmx.l.google.com
alt1.aspmx.l.google.com
alt2.aspmx.l.google.com
alt3.aspmx.l.google.com
alt4.aspmx.l.google.com

Name Servers:

dns1.p08.nsone.net
dns2.p08.nsone.net
dns3.p08.nsone.net
dns4.p08.nsone.net
ns-1283.awsdns-32.org
ns-1707.awsdns-21.co.uk
ns-421.awsdns-52.com
ns-520.awsdns-01.net
