
GITHUB.COM
Detected CMS Systems:
- Wordpress (2 occurrences)
Title:
PERF: cythonize _return_parsed_timezone_results? · Issue #50107 · pandas-dev/pandas
Description:
Pandas version checks I have checked that this issue has not already been reported. I have confirmed this issue exists on the latest version of pandas. I have confirmed this issue exists on the main branch of pandas. Reproducible Example...
Website Age:
17 years and 8 months (reg. 2007-10-09).
Matching Content Categories
- Technology & Computing
- Careers
- Automotive
Content Management System
What CMS is github.com built with?
Github.com relies on WORDPRESS.
Traffic Estimate
What is the average monthly size of github.com audience?
Tremendous Traffic: 10M - 20M visitors per month
Based on our best estimate, this website will receive around 10,653,634 visitors in the current month.
How Does Github.com Make Money?
Subscription Packages
We've located a dedicated page on github.com that might include details about subscription plans or recurring payments. We identified it based on the word "pricing" in one of its internal links. Below, you'll find additional estimates for its monthly recurring revenues.
How Much Does Github.com Make?
Subscription Packages
Prices on github.com are in US Dollars ($).
They range from $4.00/month to $21.00/month.
We estimate that the site has approximately 5,316,035 paying customers.
The estimated monthly recurring revenue (MRR) is $22,327,346.
The estimated annual recurring revenues (ARR) are $267,928,149.
WordPress Themes and Plugins
What WordPress theme does this site use?
It is strange but we were not able to detect any theme on the page.
What WordPress plugins does this website use?
It is strange but we were not able to detect any plugins on the page.
Keywords
returnparsedtimezoneresults, issue, marcogorelli, pandas, performance, sign, perfcythonize, datetime, projects, tzresults, speed, navigation, pull, requests, actions, security, closed, version, confirmed, exists, python, outprof, spent, nparray, lukemanley, added, memory, execution, data, dtype, mentioned, github, type, milestone, footer, skip, content, menu, product, solutions, resources, open, source, enterprise, pricing, search, jump, pandasdev, public, notifications,
Topics
loops pandas/pandas/core/tools/datetimes personal information perf comment metadata assignees format='%y-%d-% latest version type projects projects milestone issue exists prof perf '%y-%d-% pandas cythonize _return_parsed_timezone_results main branch py snakeviz prof --server good candidate milestone relationships issue tz_results = np 329 seconds spent github speed utc=false perf pd _return_parsed_timezone_results dates tz_results spent python utc skip jump sign checked reported confirmed reproducible made date_range tz_localize strftime %z' tolist append to_datetime cprofile function pair 0cebd75
Payment Methods
- Braintree
Questions
- DEPR deprecate mixed timezone offsets with utc=False?
- Would this be a good candidate for Cython to speed it up?
Schema
DiscussionForumPosting:
context:https://schema.org
headline:PERF:cythonize _return_parsed_timezone_results?
articleBody:### Pandas version checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this issue exists on the [latest version](https://pandas.pydata.org/docs/whatsnew/index.html) of pandas.
- [X] I have confirmed this issue exists on the main branch of pandas.
### Reproducible Example
I made a file
```python
import pandas as pd
dates = pd.date_range('1900', '2000').tz_localize('+01:00').strftime('%Y-%d-%m %H:%M:%S%z').tolist()
dates.append('2020-01-01 00:00:00+02:00')
pd.to_datetime(dates, format='%Y-%d-%m %H:%M:%S%z')
```
I then ran
```
python -m cProfile -o out.prof perf.py
snakeviz out.prof --server
```
I can then see that, of the 0.329 seconds spent in `to_datetime`, a whole 0.248 are spent in `_return_parsed_timezone_results`. That's a whole 75%!
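For reference, the same kind of breakdown can be read straight from `pstats` instead of snakeviz. The sketch below is only an illustration: it profiles a stdlib stand-in (`parse_all`, a hypothetical helper built on `datetime.strptime`) rather than `pd.to_datetime`, and `make_dates` and `cumulative_seconds` are likewise made-up names, so the numbers it prints will not match the ones quoted above.

```python
import cProfile
import pstats
from datetime import datetime, timedelta, timezone

FMT = "%Y-%d-%m %H:%M:%S%z"


def make_dates(n=2000):
    """Build n strings with a +01:00 offset plus one with +02:00 (hypothetical helper)."""
    tz = timezone(timedelta(hours=1))
    start = datetime(1900, 1, 1, tzinfo=tz)
    dates = [(start + timedelta(days=i)).strftime(FMT) for i in range(n)]
    dates.append("2020-01-01 00:00:00+02:00")
    return dates


def parse_all(dates):
    """Stand-in workload: parse every string in a Python-level loop."""
    return [datetime.strptime(s, FMT) for s in dates]


def cumulative_seconds(stats, func_name):
    """Sum the cumulative time of every profiled function called func_name."""
    # stats.stats maps (file, line, name) -> (cc, nc, tottime, cumtime, callers)
    return sum(
        ct
        for (_, _, name), (_, _, _, ct, _) in stats.stats.items()
        if name == func_name
    )


dates = make_dates()
profiler = cProfile.Profile()
profiler.enable()
parsed = parse_all(dates)
profiler.disable()

stats = pstats.Stats(profiler)
total = stats.total_tt  # total time recorded by the profiler
inner = cumulative_seconds(stats, "parse_all")
share = inner / total if total else 0.0
print(f"parse_all: {inner:.3f}s of {total:.3f}s ({share:.0%})")
```

With a real `out.prof` file, `pstats.Stats("out.prof")` loads the saved profile and the same `cumulative_seconds` lookup could be pointed at `_return_parsed_timezone_results`.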
`_return_parsed_timezone_results` is a pair of Python for-loops:
https://github.com/pandas-dev/pandas/blob/0cebd7508e252aa4aef7be7590537dc3e20a0282/pandas/core/tools/datetimes.py#L316-L328
Would this be a good candidate for Cython to speed it up?
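To make the question concrete, here is a stdlib-only sketch of the per-element pattern described above: one Python-level loop that attaches each value's own offset, and an optional second loop that converts everything to UTC. It is an analogy, not the pandas source; `localize_each` and its parameters are hypothetical names, and the real function works on pandas/NumPy objects rather than `datetime` values.

```python
from datetime import datetime, timedelta, timezone


def localize_each(naive_values, offsets_minutes, utc=False):
    """Attach a per-element fixed offset, optionally converting to UTC (illustrative only)."""
    # First loop: every element gets its own tzinfo, one Python call at a time.
    localized = [
        value.replace(tzinfo=timezone(timedelta(minutes=offset)))
        for value, offset in zip(naive_values, offsets_minutes)
    ]
    if utc:
        # Second loop: every element is converted to UTC, again one by one.
        localized = [value.astimezone(timezone.utc) for value in localized]
    return localized


naive = [datetime(1900, 1, 1), datetime(2020, 1, 1)]
offsets = [60, 120]  # +01:00 and +02:00, mirroring the mixed offsets in the example

mixed = localize_each(naive, offsets)
as_utc = localize_each(naive, offsets, utc=True)
print(mixed)
print(as_utc)
```

Each element passes through interpreter-level calls in both loops, which is typically the kind of overhead that moving a loop into compiled code is meant to reduce.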
cc @lukemanley
### Installed Versions
<details>
INSTALLED VERSIONS
------------------
commit : e93ee07729afe0bc7661655755df6adad657c23b
python : 3.8.15.final.0
python-bits : 64
OS : Linux
OS-release : 5.10.102.1-microsoft-standard-WSL2
Version : #1 SMP Wed Mar 2 00:30:59 UTC 2022
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_GB.UTF-8
LOCALE : en_GB.UTF-8
pandas : 2.0.0.dev0+863.ge93ee07729
numpy : 1.23.5
pytz : 2022.6
dateutil : 2.8.2
setuptools : 65.5.1
pip : 22.3.1
Cython : 0.29.32
pytest : 7.2.0
hypothesis : 6.59.0
sphinx : 4.5.0
blosc : None
feather : None
xlsxwriter : 3.0.3
lxml.etree : 4.9.1
html5lib : 1.1
pymysql : 1.0.2
psycopg2 : 2.9.3
jinja2 : 3.0.3
IPython : 8.7.0
pandas_datareader: 0.10.0
bs4 : 4.11.1
bottleneck : 1.3.5
brotli :
fastparquet : 2022.11.0
fsspec : 2021.11.0
gcsfs : 2021.11.0
matplotlib : 3.6.2
numba : 0.56.4
numexpr : 2.8.3
odfpy : None
openpyxl : 3.0.10
pandas_gbq : 0.17.9
pyarrow : 9.0.0
pyreadstat : 1.2.0
pyxlsb : 1.0.10
s3fs : 2021.11.0
scipy : 1.9.3
snappy :
sqlalchemy : 1.4.44
tables : 3.7.0
tabulate : 0.9.0
xarray : 2022.11.0
xlrd : 2.0.1
zstandard : 0.19.0
tzdata : None
qtpy : None
pyqt5 : None
None
</details>
### Prior Performance
_No response_
author:
url:https://github.com/MarcoGorelli
type:Person
name:MarcoGorelli
datePublished:2022-12-07T17:34:54.000Z
interactionStatistic:
type:InteractionCounter
interactionType:https://schema.org/CommentAction
userInteractionCount:0
url:https://github.com/pandas-dev/pandas/issues/50107
Person:
url:https://github.com/MarcoGorelli
name:MarcoGorelli
InteractionCounter:
interactionType:https://schema.org/CommentAction
userInteractionCount:0
External Links (3)
Analytics and Tracking
- Site Verification - Google
Libraries
- Clipboard.js
- D3.js
- Lodash
Emails and Hosting
Mail Servers:
- aspmx.l.google.com
- alt1.aspmx.l.google.com
- alt2.aspmx.l.google.com
- alt3.aspmx.l.google.com
- alt4.aspmx.l.google.com
Name Servers:
- dns1.p08.nsone.net
- dns2.p08.nsone.net
- dns3.p08.nsone.net
- dns4.p08.nsone.net
- ns-1283.awsdns-32.org
- ns-1707.awsdns-21.co.uk
- ns-421.awsdns-52.com
- ns-520.awsdns-01.net