GITHUB . COM {}

Detected CMS Systems:

WordPress (2 occurrences)

We are analyzing https://github.com/pandas-dev/pandas/issues/45585.

Title:
BUG: read_excel skiprows callable can result in infinite loop · Issue #45585 · pandas-dev/pandas
Description:
Pandas version checks I have checked that this issue has not already been reported. I have confirmed this bug exists on the latest version of pandas. I have confirmed this bug exists on the main branch of pandas. Reproducible Example imp...
Website Age:
17 years and 8 months (reg. 2007-10-09).

Matching Content Categories {📚}

Video & Online Content
Technology & Computing
Mobile Technology & AI

Content Management System {📝}

What CMS is github.com built with?

The scan flagged WordPress as the CMS for github.com.

Traffic Estimate {📈}

What is the average monthly size of github.com audience?

🚀🌠 Tremendous Traffic: 10M - 20M visitors per month

Based on our best estimate, this website will receive around 10,000,019 visitors in the current month.

How Does Github.com Make Money? {💸}

Subscription Packages {💳}

We found a dedicated page on github.com that may include details about subscription plans or recurring payments. We identified it by the word "pricing" in one of its internal links. Estimates of its monthly recurring revenue follow below.

How Much Does Github.com Make? {💰}

Subscription Packages {💳}

Prices on github.com are in US Dollars ($). They range from $4.00/month to $21.00/month.
We estimate that the site has approximately 4,989,889 paying customers.
The estimated monthly recurring revenue (MRR) is $20,957,532.
The estimated annual recurring revenue (ARR) is $251,490,385.
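
The figures above can be cross-checked with a little arithmetic. The sketch below assumes that ARR is simply twelve times MRR; the report does not state its formula, so treat that as an assumption. Under it, the reported ARR differs from 12 × MRR by $1, which is consistent with the MRR being rounded for display, and the implied average revenue per paying customer is about $4.20 per month, just above the $4.00 lower end of the price range.

```python
# Figures copied from the report above (USD).
mrr = 20_957_532            # estimated monthly recurring revenue
reported_arr = 251_490_385  # estimated annual recurring revenue
customers = 4_989_889       # estimated paying customers

# Assumption: ARR = 12 * MRR (the report does not state its formula).
arr_from_mrr = 12 * mrr
print(arr_from_mrr)                 # 251490384
print(reported_arr - arr_from_mrr)  # 1

# Implied average monthly revenue per paying customer.
print(round(mrr / customers, 2))    # 4.2
```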

WordPress Themes and Plugins {🎨}

What WordPress theme does this site use?

It is strange, but we were not able to detect any theme on the page.

What WordPress plugins does this website use?

It is strange, but we were not able to detect any plugins on the page.

Keywords {🔍}

bug, issue, pandas, loop, readexcel, skiprows, callable, row, sign, infinite, projects, closed, bram, import, fix, added, navigation, pull, requests, actions, security, result, description, version, confirmed, exists, pddataframerow, expecteddf, test, triage, reviewed, team, member, excel, toexcel, milestone, github, type, footer, skip, content, menu, product, solutions, resources, open, source, enterprise, pricing, search,

Topics {✒️}

read_excel skiprows callable personal information bug comment metadata assignees issue description false values coming skiprows callable latest version projects milestone 1 loop infinitely loop forever type projects skiprows=lambda bug exists triage issue index=false pandas main branch df = pd read_excel expected_df = pd unit test fix expected behavior test provided 5 closed 100% complete relationships bug issue github to_excel type row index source_df assert_frame_equal pd tempfile df expected_df to_excel sign row skip jump result checked reported confirmed reproducible dataframe namedtemporaryfile suffix=

Payment Methods {📊}

Braintree

Schema {🗺️}

DiscussionForumPosting:
      context:https://schema.org
      headline:BUG: read_excel skiprows callable can result in infinite loop
      articleBody:
### Pandas version checks

- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the [latest version](https://pandas.pydata.org/docs/whatsnew/index.html) of pandas.
- [X] I have confirmed this bug exists on the main branch of pandas.

### Reproducible Example

```python
import tempfile

import pandas as pd
from pandas.testing import assert_frame_equal

source_df = pd.DataFrame([{"row": 0}, {"row": 1}, {"row": 2}, {"row": 3}])

with tempfile.NamedTemporaryFile("w", suffix=".xlsx") as xlsx_fd:
    source_df.to_excel(xlsx_fd.name, index=False)
    df = pd.read_excel(
        xlsx_fd.name, skiprows=lambda x: x not in [0, 2]
    )  # infinite loop here

expected_df = pd.DataFrame([{"row": 1}, {"row": 3}])
assert_frame_equal(df, expected_df)
```

### Issue Description

In certain cases the `skiprows` callable on `read_excel` will loop infinitely. It's related to the sequence of `True` and `False` values coming out of the function. I've already written a unit test that triggers the issue and a fix to address it. Creating this bug to track the fix against.

### Expected Behavior

The test provided above should pass, where currently it will loop forever. To put it another way; `skiprows` callable should never be called with a row index that is greater than the length of the document.

### Installed Versions

<details>

❯ ls bug.py | entr -c python bug.py
/Users/bram/Code/nca/pandas/pandas/compat/_optional.py:149: UserWarning: Pandas requires version '1.3.1' or newer of 'bottleneck' (version '1.2.1' currently installed).
  warnings.warn(msg, UserWarning)

INSTALLED VERSIONS
------------------
commit : 04f5721617a622912f493f42f160139e073d0c2f
python : 3.9.9.final.0
python-bits : 64
OS : Darwin
OS-release : 21.2.0
Version : Darwin Kernel Version 21.2.0: Sun Nov 28 20:28:41 PST 2021; root:xnu-8019.61.5~1/RELEASE_ARM64_T6000
machine : arm64
processor : arm
byteorder : little
LC_ALL : None
LANG : en_GB.UTF-8
LOCALE : en_GB.UTF-8

pandas : 1.5.0.dev0+130.g04f5721617
numpy : 1.22.1
pytz : 2021.3
dateutil : 2.8.2
pip : 21.2.4
setuptools : 58.1.0
Cython : 0.29.26
pytest : 6.2.5
hypothesis : 6.36.0
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : 1.2.1
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : 3.0.9
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : 2.0.1
xlwt : None
zstandard : None

</details>
      author:
         url:https://github.com/bram2000
         type:Person
         name:bram2000
      datePublished:2022-01-24T10:34:58.000Z
      interactionStatistic:
         type:InteractionCounter
         interactionType:https://schema.org/CommentAction
         userInteractionCount:0
      url:https://github.com/45585/pandas/issues/45585
Person:
      url:https://github.com/bram2000
      name:bram2000
InteractionCounter:
      interactionType:https://schema.org/CommentAction
      userInteractionCount:0
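
The expected behavior quoted in the articleBody (the `skiprows` callable should never be called with a row index beyond the length of the document) can be illustrated with a small pure-Python sketch. This is not pandas code and does not reproduce the bug; `apply_skiprows` and the `rows` list are hypothetical stand-ins for a reader and a worksheet, and header handling is ignored. The only point shown is that a loop bounded by the number of rows always terminates and never asks the callable about an out-of-range index.

```python
from typing import Callable, List, Sequence


def apply_skiprows(rows: Sequence[str], skiprows: Callable[[int], bool]) -> List[str]:
    """Return the rows for which skiprows(index) is falsy.

    Hypothetical helper, not pandas' implementation: the loop is bounded by
    len(rows), so the callable never sees an index past the end of the data.
    """
    kept = []
    for index in range(len(rows)):
        if not skiprows(index):
            kept.append(rows[index])
    return kept


# A plain list stands in for the worksheet rows (assumption for illustration).
rows = ["r0", "r1", "r2", "r3"]
calls = []


def skip_all_but_0_and_2(index: int) -> bool:
    calls.append(index)         # record every index the callable is asked about
    return index not in [0, 2]  # same predicate as in the quoted example


kept = apply_skiprows(rows, skip_all_but_0_and_2)
print(kept)                     # ['r0', 'r2']
print(max(calls) < len(rows))   # True
```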

External Links {🔗}(3)

Analytics and Tracking {📊}

Site Verification - Google

Libraries {📚}

Clipboard.js
D3.js
Lodash

Emails and Hosting {✉️}

Mail Servers:

aspmx.l.google.com
alt1.aspmx.l.google.com
alt2.aspmx.l.google.com
alt3.aspmx.l.google.com
alt4.aspmx.l.google.com

Name Servers:

dns1.p08.nsone.net
dns2.p08.nsone.net
dns3.p08.nsone.net
dns4.p08.nsone.net
ns-1283.awsdns-32.org
ns-1707.awsdns-21.co.uk
ns-421.awsdns-52.com
ns-520.awsdns-01.net
