Title:
Slow collection time when tests are not in a relative folder to the current working folder · Issue #13420 · pytest-dev/pytest
**Created after this discussion** - https://github.com/pytest-dev/pytest/discussions/13413
## OSes, python and pytest versions
OS: macOS 15.4.1, Ubuntu 22.04
Python 3.12.8
Pytest 8.3.4
## Problem description
I need to run a large number of non-Python tests that are stored in deeply nested folders, and I found that pytest struggles during collection.
Some code context:
```python
from collections.abc import Iterable

import pytest


@pytest.hookimpl(wrapper=True)
def pytest_collection(session):
    # resolve_suites is a project-specific helper (not shown here).
    resolved_paths = resolve_suites(session)
    session.config.args.extend(resolved_paths)
    return (yield)


def pytest_collect_file(parent, file_path):
    if file_path.suffix == ".yaml":
        return YamlFile.from_parent(parent, path=file_path)


class YamlFile(pytest.File):
    def collect(self) -> Iterable[pytest.Item | pytest.Collector]:
        # YamlTestResolver is a leftover from the previous runner, but it resolves needed stuff.
        test_cases = YamlTestResolver().from_file(f"{self.path}")
        for tc in test_cases:
            yield YamlTest.from_parent(self, name=tc.name, tc_spec=tc)


class YamlTest(pytest.Item):
    def __init__(self, tc_spec, **kwargs) -> None:
        super().__init__(**kwargs)
        self.tc_spec = tc_spec
```
Consider this folder structure:
```
root_working_folder
├── framework_repo
│   └── framework_internal_folder
└── repo_with_tests
    └── tests
        ├── test_folder_1
        │   └── inner_folder
        └── test_folder_2
            └── inner_folder
                └── even_more_depth
```
The real `repo_with_tests` contains even more subfolders.
The pytest call is the following: `pytest --collect-only ${list with 1k non-python tests}` (1 test per file).
When I execute the above from `framework_internal_folder`, the execution time is 56 minutes with `cProfile` and 23 minutes without. When I make the same call from `root_working_folder` or `repo_with_tests`, the execution time is ~2 minutes with `cProfile` and 38 seconds without.
The most significant difference between the two calls is the cumulative time of this function: `nodes.py:546(_check_initialpaths_for_relpath)`
```
# from framework_internal_folder
ncalls tottime percall cumtime percall filename:lineno(function)
237033 85.508 0.000 3176.004 0.013 ../_pytest/nodes.py:546(_check_initialpaths_for_relpath)
# from root_working_folder
ncalls tottime percall cumtime percall filename:lineno(function)
135 0.051 0.000 1.772 0.013 ../_pytest/nodes.py:546(_check_initialpaths_for_relpath)
```
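For readers who want to gather comparable numbers, here is a minimal sketch of how per-function stats and caller information can be extracted with the standard-library `cProfile` and `pstats` modules. The functions `slow_helper` and `workload` are hypothetical stand-ins for the real collection run, not part of pytest.

```python
import cProfile
import io
import pstats


def slow_helper(n: int) -> int:
    # Hypothetical stand-in for an expensive function such as a path check.
    return sum(i * i for i in range(n))


def workload() -> int:
    # Hypothetical stand-in for a collection run that calls the helper repeatedly.
    return sum(slow_helper(200) for _ in range(50))


def profile_report(func_filter: str) -> str:
    profiler = cProfile.Profile()
    profiler.enable()
    workload()
    profiler.disable()

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time and restrict the output to matching function names.
    stats.sort_stats("cumulative").print_stats(func_filter)
    # Also list which functions called the matching ones.
    stats.print_callers(func_filter)
    return buffer.getvalue()


report = profile_report("slow_helper")
print(report)
```

The printed table has the same `ncalls tottime percall cumtime percall filename:lineno(function)` header as the stats quoted in this issue, so runs from different working folders can be compared line by line.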
According to the stats, when executing from `framework_internal_folder`, the most expensive function is this one:
```
ncalls tottime percall cumtime percall filename:lineno(function)
206063304 262.471 0.000 1580.672 0.000 ../_pytest/pathlib.py:990(commonpath)
# and stats for callers of that function:
Function was called by...
ncalls tottime cumtime
pathlib.py:990(commonpath) <- 205937164/3722648 262.308 28.844 nodes.py:546(_check_initialpaths_for_relpath)
```
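A quick back-of-the-envelope check on the call counts above, as a sketch. It only uses figures copied from the profiles in this issue. The interpretation (that each check loops over most of the initial paths) is an inference from the numbers, not something the profiler reports directly.

```python
# Figures copied from the profiles above.
relpath_calls_slow = 237_033      # _check_initialpaths_for_relpath, from framework_internal_folder
relpath_calls_fast = 135          # same function, from root_working_folder
commonpath_calls = 205_937_164    # commonpath calls made from that function (slow run)

# Average number of commonpath calls per relpath check in the slow run.
per_check = commonpath_calls / relpath_calls_slow

# How many more relpath checks the slow run performs.
check_ratio = relpath_calls_slow / relpath_calls_fast

print(f"commonpath calls per relpath check: {per_check:.1f}")
print(f"relpath checks, slow run vs fast run: {check_ratio:.0f}x")
```

The first figure comes out at roughly 869 `commonpath` calls per check, which is in the same range as the ~1k test paths passed on the command line. The slow run therefore seems to pay twice: far more checks, and each check scanning most of the initial paths.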
## Possible solutions
### Cache for `_check_initialpaths_for_relpath`
I experimented with adding `lru_cache` to `_check_initialpaths_for_relpath`:
```python
from functools import lru_cache  # needed for the decorator below


@lru_cache(maxsize=1000)
def _check_initialpaths_for_relpath(initialpaths: frozenset[Path], path: Path) -> str | None:
    for initial_path in initialpaths:
        if commonpath(path, initial_path) == initial_path:
            rel = str(path.relative_to(initial_path))
            return "" if rel == "." else rel
    return None
```
That change decreased the overall collection time to 4 minutes.
Stats are also impressive:
```
ncalls tottime percall cumtime percall filename:lineno(function)
5798 2.109 0.000 79.265 0.014 nodes.py:545(_check_initialpaths_for_relpath)
```
I'm not sure if `commonpath` needs caching as well.
### Anything else on the collection mechanism?
Are there other possible optimizations in directory/file collection?
Author: sashko1988 (https://github.com/sashko1988)
Published: 2025-05-12T16:00:56.000Z
Comments: 2
URL: https://github.com/pytest-dev/pytest/issues/13420