Here's how GITHUB.COM makes money* and how much!

*Please read our disclaimer before using our estimates.

GITHUB.COM


  1. Analyzed Page
  2. Matching Content Categories
  3. CMS
  4. Monthly Traffic Estimate
  5. How Does Github.com Make Money
  6. How Much Does Github.com Make
  7. Wordpress Themes And Plugins
  8. Keywords
  9. Topics
  10. Payment Methods
  11. Questions
  12. Schema
  13. External Links
  14. Analytics And Tracking
  15. Libraries
  16. Hosting Providers

We are analyzing https://github.com/pandas-dev/pandas/issues/48510.

Title:
BUG: Appending or concatenating to empty ExtensionArray removes type information · Issue #48510 · pandas-dev/pandas
Description:
Pandas version checks I have checked that this issue has not already been reported. I have confirmed this bug exists on the latest version of pandas. I have confirmed this bug exists on the main branch of pandas. Reproducible Example imp...
Website Age:
17 years and 9 months (reg. 2007-10-09).

Matching Content Categories {📚}

  • Technology & Computing
  • Video & Online Content
  • Family & Parenting

Content Management System {📝}

What CMS is github.com built with?


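Github.com is reported as using WORDPRESS. This is most likely a false positive: GitHub is a custom-built platform, not a WordPress site.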

Traffic Estimate {📈}

What is the average monthly size of github.com audience?

🚀🌠 Tremendous Traffic: 10M - 20M visitors per month


Based on our best estimate, this website will receive around 10,666,346 visitors in the current month.


How Does Github.com Make Money? {💸}


Subscription Packages {💳}

We've located a dedicated page on github.com that might include details about subscription plans or recurring payments. We identified it by the word "pricing" in one of its internal links. Below, you'll find additional estimates of its monthly recurring revenue.

How Much Does Github.com Make? {💰}


Subscription Packages {💳}

Prices on github.com are in US Dollars ($). They range from $4.00/month to $21.00/month.
We estimate that the site has approximately 5,347,483 paying customers.
The estimated monthly recurring revenue (MRR) is $22,459,429.
The estimated annual recurring revenue (ARR) is $269,513,148.
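
The figures quoted here are consistent with one another: the ARR is twelve times the MRR, and dividing the MRR by the customer estimate gives an implied average of about $4.20 per paying customer per month, close to the lowest listed price. The short Python check below reproduces that arithmetic. The variable names are ours, and how the estimator actually derives its numbers is not stated, so treat this as a sanity check rather than a description of its method.

```python
# Figures quoted in the estimate.
paying_customers = 5_347_483
mrr_usd = 22_459_429  # estimated monthly recurring revenue

# ARR is twelve months of MRR.
arr_usd = mrr_usd * 12

# Implied average monthly revenue per paying customer.
avg_per_customer = mrr_usd / paying_customers

print(f"ARR: ${arr_usd:,}")                                     # ARR: $269,513,148
print(f"Average per customer: ${avg_per_customer:.2f}/month")   # Average per customer: $4.20/month
```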

WordPress Themes and Plugins {🎨}

What WordPress theme does this site use?

Strangely, we were not able to detect any theme on the page.

What WordPress plugins does this website use?

Strangely, we were not able to detect any plugins on the page.

Keywords {🔍}

issue, type, pandas, dtype, bug, empty, information, extensionarray, concatvalues, ssche, dfadtype, toconcat, unit, mroeschke, appending, intdtype, joinunits, needed, sign, concatenating, removes, arr, pddataframea, assert, column, dtypes, join, arrays, member, conversions, tests, projects, branch, pdconcatdf, concatenatejoinunits, units, axis, behavior, added, commented, navigation, pull, requests, actions, security, closed, description, version, confirmed, exists

Topics {✒️}

personal information bug ignores type information dtype conversions unexpected mroeschke added specific ea type comment metadata assignees ignore[call-overload] assert df2['a'] type projects dtype information import pandas ea handling branch ssche changed issue type tests unit test assess issue latest version dtype == df['a'] concatenating join units bug exists empty list type extensiondtype extensionarray prevent regressions projects milestone 2 empty column remains int64dtype pandas assert dtype df2['a'] triage issue df2 = pd df2 = df behavior needed join units concat_values = concat_compat dtype=pd concat_values = concat_values ea means ea values overload variant 0 closed empty concat_values = ensure_block_shape unit test df = pd merge/join type resulting dataframe' expected

Payment Methods {📊}

  • Braintree

Questions {❓}

  • Already have an account?
  • Can this be hit through another method?

Schema {🗺️}

DiscussionForumPosting:
      context:https://schema.org
      headline:BUG: Appending or concatenating to empty ExtensionArray removes type information
      articleBody:
### Pandas version checks

- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the [latest version](https://pandas.pydata.org/docs/whatsnew/index.html) of pandas.
- [ ] I have confirmed this bug exists on the main branch of pandas.

### Reproducible Example

```python
import pandas as pd

arr = []
df = pd.DataFrame({'a': pd.array(arr, dtype=pd.Int64Dtype())})
other = pd.DataFrame({'a': [1, 2]})
df2 = df.append(other)
# same issue for pd.concat(...)
# df2 = pd.concat([df, other])

assert df2['a'].dtype == df['a'].dtype
```

```
>   assert df2['a'].dtype == df['a'].dtype
E   AssertionError: assert dtype('O') == Int64Dtype()
E    + where dtype('O') = 0 1\n1 2\nName: a, dtype: object.dtype
E    + and Int64Dtype() = Series([], Name: a, dtype: Int64).dtype
```

### Issue Description

When appending a dataframe (`df_other`) to another dataframe (`df`) which has an empty column of type `ExtensionDtype` (in this case `Int64Dtype`, but the specific EA type doesn't matter), then the resulting dataframe's column (`df2['a']`) loses the dtype information and turns into an object dtype.

You can run the example with `arr = [1]` instead of the empty list (`arr = []`) and observe that - as expected - the type is not changed and remains `Int64Dtype`.

I traced the issue to `_concatenate_join_units` and `_get_empty_dtype` which ignores type information when the column is empty (`if not unit.is_na`). This in turn then fails to enter the `elif any(is_1d_only_ea_obj(t) for t in to_concat)` EA handling branch in `_concatenate_join_units`.

```
def _get_empty_dtype(join_units: Sequence[JoinUnit]) -> DtypeObj:
    ...
    dtypes = [unit.dtype for unit in join_units if not unit.is_na]
    if not len(dtypes):
        dtypes = [unit.dtype for unit in join_units if unit.block.dtype.kind != "V"]

    dtype = find_common_type(dtypes)
    ...
```

```
def _concatenate_join_units(
    join_units: list[JoinUnit], concat_axis: int, copy: bool
) -> ArrayLike:
    """
    Concatenate values from several join units along selected axis.
    """
    if concat_axis == 0 and len(join_units) > 1:
        # Concatenating join units along ax0 is handled in _merge_blocks.
        raise AssertionError("Concatenating join units along axis0")

    empty_dtype = _get_empty_dtype(join_units)

    has_none_blocks = any(unit.block.dtype.kind == "V" for unit in join_units)
    upcasted_na = _dtype_to_na_value(empty_dtype, has_none_blocks)

    to_concat = [
        ju.get_reindexed_values(empty_dtype=empty_dtype, upcasted_na=upcasted_na)
        for ju in join_units
    ]

    if len(to_concat) == 1:
        # Only one block, nothing to concatenate.
        concat_values = to_concat[0]
        if copy:
            if isinstance(concat_values, np.ndarray):
                # non-reindexed (=not yet copied) arrays are made into a view
                # in JoinUnit.get_reindexed_values
                if concat_values.base is not None:
                    concat_values = concat_values.copy()
            else:
                concat_values = concat_values.copy()

    elif any(is_1d_only_ea_obj(t) for t in to_concat):  # <-- this branch isn't entered
        # TODO(EA2D): special case not needed if all EAs used HybridBlocks
        # NB: we are still assuming here that Hybrid blocks have shape (1, N)
        # concatting with at least one EA means we are concatting a single column
        # the non-EA values are 2D arrays with shape (1, n)

        # error: No overload variant of "__getitem__" of "ExtensionArray" matches
        # argument type "Tuple[int, slice]"
        to_concat = [
            t if is_1d_only_ea_obj(t) else t[0, :]  # type: ignore[call-overload]
            for t in to_concat
        ]
        concat_values = concat_compat(to_concat, axis=0, ea_compat_axis=True)
        concat_values = ensure_block_shape(concat_values, 2)

    else:
        concat_values = concat_compat(to_concat, axis=concat_axis)

    return concat_values
```

### Expected Behavior

Type information remains as both types are compatible (the fact that one Series is empty shouldn't matter).

### Installed Versions

<details>

INSTALLED VERSIONS
------------------
commit           : ca60aab7340d9989d9428e11a51467658190bb6b
python           : 3.8.13.final.0
python-bits      : 64
OS               : Linux
OS-release       : 5.19.8-200.fc36.x86_64
Version          : #1 SMP PREEMPT_DYNAMIC Thu Sep 8 19:02:21 UTC 2022
machine          : x86_64
processor        : x86_64
byteorder        : little
LC_ALL           : None
LANG             : en_AU.UTF-8
LOCALE           : en_AU.UTF-8

pandas           : 1.4.4
numpy            : 1.23.2
pytz             : 2020.4
dateutil         : 2.8.1
setuptools       : 59.6.0
pip              : 22.2.2
Cython           : 0.29.32
pytest           : 7.1.2
hypothesis       : None
sphinx           : None
blosc            : None
feather          : None
xlsxwriter       : 0.9.6
lxml.etree       : None
html5lib         : None
pymysql          : None
psycopg2         : 2.8.6
jinja2           : 2.11.2
IPython          : None
pandas_datareader: None
bs4              : None
bottleneck       : 1.3.5
brotli           : None
fastparquet      : None
fsspec           : None
gcsfs            : None
markupsafe       : 1.1.1
matplotlib       : None
numba            : None
numexpr          : 2.8.1
odfpy            : None
openpyxl         : 3.0.9
pandas_gbq       : None
pyarrow          : 1.0.1
pyreadstat       : None
pyxlsb           : None
s3fs             : None
scipy            : 1.4.1
snappy           : None
sqlalchemy       : 1.3.23
tables           : 3.7.0
tabulate         : None
xarray           : None
xlrd             : 2.0.1
xlwt             : None
zstandard        : None

</details>
      author:
         url:https://github.com/ssche
         type:Person
         name:ssche
      datePublished:2022-09-12T03:48:07.000Z
      interactionStatistic:
         type:InteractionCounter
         interactionType:https://schema.org/CommentAction
         userInteractionCount:3
      url:https://github.com/pandas-dev/pandas/issues/48510
Person:
      url:https://github.com/ssche
      name:ssche
InteractionCounter:
      interactionType:https://schema.org/CommentAction
      userInteractionCount:3
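
For readers who run into the behavior described in the issue quoted in the schema above, one possible workaround is to align the dtypes explicitly before concatenating, so the result does not depend on how empty columns are handled. The sketch below is ours and is not the fix adopted by pandas. It uses `pd.concat` rather than the report's `DataFrame.append` (which was removed in pandas 2.0), and the name `aligned` is purely illustrative.

```python
import pandas as pd

# Empty column with a nullable extension dtype, as in the report.
df = pd.DataFrame({"a": pd.array([], dtype=pd.Int64Dtype())})
other = pd.DataFrame({"a": [1, 2]})

# Workaround sketch: cast the second frame to the first frame's dtypes
# before concatenating, so both inputs already share the extension dtype.
aligned = other.astype(df.dtypes.to_dict())
result = pd.concat([df, aligned], ignore_index=True)

print(result["a"].dtype)
```

Casting up front costs one extra copy of `other`, but it makes the intended column type explicit instead of relying on dtype inference during concatenation.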

Analytics and Tracking {📊}

  • Site Verification - Google

Libraries {📚}

  • Clipboard.js
  • D3.js
  • GSAP
  • Lodash

Emails and Hosting {✉️}

Mail Servers:

  • aspmx.l.google.com
  • alt1.aspmx.l.google.com
  • alt2.aspmx.l.google.com
  • alt3.aspmx.l.google.com
  • alt4.aspmx.l.google.com

Name Servers:

  • dns1.p08.nsone.net
  • dns2.p08.nsone.net
  • dns3.p08.nsone.net
  • dns4.p08.nsone.net
  • ns-1283.awsdns-32.org
  • ns-1707.awsdns-21.co.uk
  • ns-421.awsdns-52.com
  • ns-520.awsdns-01.net