GITHUB.COM

Detected CMS Systems:

Wordpress (2 occurrences)

We are analyzing https://github.com/marcelm/cutadapt/issues/369.

Title:
Assertion error in few processesed files · Issue #369 · marcelm/cutadapt
Description:
I am running cutadapt as part of DADA2 pipeline to analyze paired-end MiSeq 16S sequencing data. Since my sequences contains 0-7 b long heterogeneity spacer sequence, as described in Fadrosh et al. 2014, in addition to primer sequences, ...
Website Age:
17 years and 8 months (reg. 2007-10-09).

Matching Content Categories {📚}

Technology & Computing
Mobile Technology & AI
Video & Online Content

Content Management System {📝}

What CMS is github.com built with?

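The scan flagged WordPress (2 occurrences), but no WordPress theme or plugins were detected on the page, so this is most likely a false positive: GitHub runs on its own platform, not on WordPress.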

Traffic Estimate {📈}

What is the average monthly size of github.com audience?

🚀🌠 Tremendous Traffic: 10M - 20M visitors per month

Based on our best estimate, this website will receive around 10,000,019 visitors this month.

How Does Github.com Make Money? {💸}

Subscription Packages {💳}

We've located a dedicated page on github.com that might include details about subscription plans or recurring payments. We identified it by the word "pricing" in one of its internal links. Below, you'll find additional estimates of its monthly recurring revenue.

How Much Does Github.com Make? {💰}

Subscription Packages {💳}

Prices on github.com are in US Dollars ($). They range from $4.00/month to $21.00/month.
We estimate that the site has approximately 4,989,889 paying customers.
The estimated monthly recurring revenue (MRR) is $20,957,532.
The estimated annual recurring revenues (ARR) are $251,490,385.
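
The report does not say how these figures are derived. As a rough consistency check, the sketch below assumes ARR is simply 12 times MRR and that MRR divided by the customer estimate gives an average monthly price; both relationships are assumptions, and the variable names are illustrative only.

```python
# Figures quoted in the report above.
reported_customers = 4_989_889
reported_mrr = 20_957_532    # US dollars per month
reported_arr = 251_490_385   # US dollars per year

# Assumption: ARR is MRR multiplied by 12 months.
arr_from_mrr = reported_mrr * 12

# Assumption: MRR is spread evenly over the estimated paying customers.
avg_revenue_per_customer = reported_mrr / reported_customers

print(f"12 x MRR = ${arr_from_mrr:,}")
print(f"Difference from reported ARR: ${reported_arr - arr_from_mrr:,}")
print(f"Implied average revenue per customer: ${avg_revenue_per_customer:.2f}/month")
```

Under these assumptions the reported ARR differs from 12 times MRR by one dollar, which would be consistent with rounding, and the implied average price sits near the bottom of the quoted $4.00 to $21.00 range.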

Wordpress Themes and Plugins {🎨}

What WordPress theme does this site use?

No WordPress theme was detected on the page.

What WordPress plugins does this website use?

No WordPress plugins were detected on the page.

Keywords {🔍}

read, file, marcelm, line, cutadapt, issue, error, files, running, sequences, adapter, sign, sequence, reads, matches, commented, code, lauripa, bases, version, cctacgggnggcwgcag, filepathpath, filtn, rflags, total, pairs, length, removed, pretty, call, homelaurilocallibpythonsitepackagescutadaptmodifierspy, owner, navigation, pull, requests, actions, security, assertion, processesed, closed, dada, pairedend, long, primer, fixed, things, command, python, fwd, rev

Topics {✒️}

/dada2analyysi/cutadapt/a059-7m-3-cagttcca-tgtctaag-paulamaki-run20190304r_s59_l001_r1_001 /dada2analyysi/cutadapt/a059-7m-3-cagttcca-tgtctaag-paulamaki-run20190304r_s59_l001_r2_001 /dada2analyysi/filtn/a059-7m-3-cagttcca-tgtctaag-paulamaki-run20190304r_s59_l001_r1_001 /dada2analyysi/filtn/a059-7m-3-cagttcca-tgtctaag-paulamaki-run20190304r_s59_l001_r2_001 6/site-packages/cutadapt/modifiers 6/site-packages/cutadapt/pipeline 6/site-packages/cutadapt/adapters 6/site-packages/cutadapt/main completed marcelm mentioned paired-end mode total basepairs processed local/bin/cutadapt marcelm closed local/lib/python3 comment metadata assignees src/cutadapt/_align skipped lengthy print pretty lengthy post cutadapt error cutting fixed number problematic file pairs ambiguous bases fnfs filtn/ subdirectory fnrs gz processing reads error message removed sequences code pairs written install cutadapt running cutadapt files succesfully dada2 pipeline suspiciously long command prompt error primer sequences pretty similar pretty good recent call call return adapter sequence persistent trial wsl ubuntu 18 multithread = false /home/lauri/ begun smoothly passing filters allowed errors skipped prints sample lost

Payment Methods {📊}

Braintree

Questions {❓}

Can you make one of the problematic file pairs available to me?

Schema {🗺️}

DiscussionForumPosting:
      context:https://schema.org
      headline:Assertion error in few processesed files
      articleBody:
         I am running cutadapt as part of a DADA2 pipeline to analyze paired-end MiSeq 16S sequencing data. Since my sequences contain a 0-7 b long heterogeneity spacer sequence, as described in Fadrosh et al. 2014, in addition to primer sequences, I had to remove them by using the primer sequences as a template instead of just cutting a fixed number of bases from each end.

         So, I am relatively new to running things on the command prompt, but after persistent trial and error, I managed to install cutadapt on my WSL Ubuntu 18.04. As DADA2 is an R program, I am naturally running it on the latest version of R, which is 3.5.3. Python is also updated to 3.6.7 and cutadapt is version 2.1.

         This is the code that I was running in R:

         > FWD <- "CCTACGGGNGGCWGCAG"
         > REV <- "GACTACHVGGGTATCTAATCC"
         >
         > # Removal of ambiguous bases
         > fnFs.filtN <- file.path(path, "filtN", basename(fnFs)) # Put N-filtered files in filtN/ subdirectory
         > fnRs.filtN <- file.path(path, "filtN", basename(fnRs))
         > filterAndTrim(fnFs, fnFs.filtN, fnRs, fnRs.filtN, maxN = 0, multithread = FALSE)
         >
         > cutadapt <- "/home/lauri/.local/bin/cutadapt"
         >
         > path.cut <- file.path(path, "cutadapt")
         > if(!dir.exists(path.cut)) dir.create(path.cut)
         > fnFs.cut <- file.path(path.cut, basename(fnFs))
         > fnRs.cut <- file.path(path.cut, basename(fnRs))
         >
         > FWD.RC <- dada2:::rc(FWD)
         > REV.RC <- dada2:::rc(REV)
         >
         > R1.flags <- paste("-g", FWD, "-a", REV.RC)
         > R2.flags <- paste("-G", REV, "-A", FWD.RC)
         for(i in seq_along(fnFs)) {
           system2(cutadapt, args = c(R1.flags, R2.flags, "-n", 2,
                                      "-o", fnFs.cut[i], "-p", fnRs.cut[i], # output files
                                      fnFs.filtN[i], fnRs.filtN[i])) # input files
         }

         Running this began smoothly and I got these prints as expected:

         > === Summary ===
         >
         > Total read pairs processed: 71,065
         > Read 1 with adapter: 70,393 (99.1%)
         > Read 2 with adapter: 70,102 (98.6%)
         > Pairs written (passing filters): 71,065 (100.0%)
         >
         > Total basepairs processed: 42,922,924 bp
         > Read 1: 23,166,984 bp
         > Read 2: 19,755,940 bp
         > Total written (filtered): 38,799,372 bp (90.4%)
         > Read 1: 21,163,432 bp
         > Read 2: 17,635,940 bp
         >
         > === First read: Adapter 1 ===
         >
         > Sequence: GGATTAGATACCCBDGTAGTC; Type: regular 3'; Length: 21; Trimmed: 3938 times.
         >
         > No. of allowed errors:
         > 0-9 bp: 0; 10-19 bp: 1; 20-21 bp: 2
         >
         > Bases preceding removed adapters:
         > A: 16.5%
         > C: 3.2%
         > G: 44.7%
         > T: 35.6%
         > none/other: 0.0%
         >
         > .... (I skipped lengthy print of removed sequences, some of which were suspiciously long)
         >
         > === First read: Adapter 2 ===
         >
         > Sequence: CCTACGGGNGGCWGCAG; Type: regular 5'; Length: 17; Trimmed: 70265 times.
         >
         > No. of allowed errors:
         > 0-9 bp: 0; 10-16 bp: 1
         >
         > Overview of removed sequences
         > length count expect max.err error counts
         > 3 25 1110.4 0 25
         > 11 3 0.0 1 1 2
         > 12 1 0.0 1 0 1
         > 14 5 0.0 1 4 1
         > 15 14 0.0 1 8 6
         > 16 282 0.0 1 87 195
         > 17 18379 0.0 1 18261 118
         > 18 324 0.0 1 62 262
         > 19 18035 0.0 1 17885 150
         > 20 305 0.0 1 61 244
         > 21 15572 0.0 1 15458 114
         > 22 575 0.0 1 292 283
         > 23 16654 0.0 1 16539 115
         > 24 86 0.0 1 29 57
         > 25 2 0.0 1 2
         > 27 1 0.0 1 1
         > 28 1 0.0 1 1
         > 67 1 0.0 1 1
         .... (skipped prints for the second read, they were pretty similar to this one)

         Things went pretty well until I got this error message:

         > This is cutadapt 2.1 with Python 3.6.7
         > Command line parameters: -g CCTACGGGNGGCWGCAG -a GGATTAGATACCCBDGTAGTC -G GACTACHVGGGTATCTAATCC -A CTGCWGCCNCCCGTAGG -n 2 -o /mnt/d/DADA2analyysi/cutadapt/A059-7M-3-CAGTTCCA-TGTCTAAG-Paulamaki-run20190304R_S59_L001_R1_001.fastq.gz -p /mnt/d/DADA2analyysi/cutadapt/A059-7M-3-CAGTTCCA-TGTCTAAG-Paulamaki-run20190304R_S59_L001_R2_001.fastq.gz /mnt/d/DADA2analyysi/filtN/A059-7M-3-CAGTTCCA-TGTCTAAG-Paulamaki-run20190304R_S59_L001_R1_001.fastq.gz /mnt/d/DADA2analyysi/filtN/A059-7M-3-CAGTTCCA-TGTCTAAG-Paulamaki-run20190304R_S59_L001_R2_001.fastq.gz
         > Processing reads on 1 core in paired-end mode ...
         > [ 8<---------] 00:00:06 30,000 reads @ 209.6 µs/read; 0.29 M reads/minute
         > Traceback (most recent call last):
         >   File "/home/lauri/.local/bin/cutadapt", line 11, in <module>
         >     sys.exit(main())
         >   File "/home/lauri/.local/lib/python3.6/site-packages/cutadapt/__main__.py", line 803, in main
         >     stats = runner.run()
         >   File "/home/lauri/.local/lib/python3.6/site-packages/cutadapt/pipeline.py", line 727, in run
         >     (n, total1_bp, total2_bp) = self._pipeline.process_reads(progress=self._progress)
         >   File "/home/lauri/.local/lib/python3.6/site-packages/cutadapt/pipeline.py", line 324, in process_reads
         >     read1, read2 = modifier(read1, read2, matches1, matches2)
         >   File "/home/lauri/.local/lib/python3.6/site-packages/cutadapt/modifiers.py", line 33, in __call__
         >     return self._modifier1(read1, matches1), self._modifier2(read2, matches2)
         >   File "/home/lauri/.local/lib/python3.6/site-packages/cutadapt/modifiers.py", line 179, in __call__
         >     match = AdapterCutter.best_match(self.adapters, trimmed_read)
         >   File "/home/lauri/.local/lib/python3.6/site-packages/cutadapt/modifiers.py", line 107, in best_match
         >     match = adapter.match_to(read)
         >   File "/home/lauri/.local/lib/python3.6/site-packages/cutadapt/adapters.py", line 841, in match_to
         >     alignment = self.aligner.locate(read_seq)
         >   File "src/cutadapt/_align.pyx", line 515, in cutadapt._align.Aligner.locate
         > AssertionError

         The code kept running after this and proceeded to process some more files successfully. Only 5 samples out of 30 were affected, and it looks like they are missing most of their reads. As an example, one sample lost over 70% of its reads. This ended up being a pretty lengthy post, but I wasn't sure what to include.
      author:
         url:https://github.com/LauriPa
         type:Person
         name:LauriPa
      datePublished:2019-03-19T12:36:05.000Z
      interactionStatistic:
         type:InteractionCounter
         interactionType:https://schema.org/CommentAction
         userInteractionCount:4
      url:https://github.com/marcelm/cutadapt/issues/369
Person:
      url:https://github.com/LauriPa
      name:LauriPa
InteractionCounter:
      interactionType:https://schema.org/CommentAction
      userInteractionCount:4
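
The issue body above builds its cutadapt flags from the two primers and their reverse complements. The R code does this with `dada2:::rc`. As an illustration only, the Python sketch below reproduces that step with a hand-written IUPAC complement table. The names `IUPAC_COMPLEMENT` and `reverse_complement` are hypothetical helpers, not part of cutadapt or DADA2.

```python
# Complement table for IUPAC nucleotide codes, including the ambiguity
# codes that appear in the primers (N, W, H, V, B, D).
IUPAC_COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "S": "S", "W": "W",
    "K": "M", "M": "K", "B": "V", "V": "B",
    "D": "H", "H": "D", "N": "N",
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an IUPAC nucleotide sequence."""
    return "".join(IUPAC_COMPLEMENT[base] for base in reversed(seq.upper()))


# Primers as given in the issue.
FWD = "CCTACGGGNGGCWGCAG"
REV = "GACTACHVGGGTATCTAATCC"

fwd_rc = reverse_complement(FWD)
rev_rc = reverse_complement(REV)

# Flags in the same order as the R code: 5' primer with -g/-G and the
# reverse complement of the opposite primer with -a/-A.
r1_flags = ["-g", FWD, "-a", rev_rc]
r2_flags = ["-G", REV, "-A", fwd_rc]

print(" ".join(r1_flags + r2_flags))
```

The printed flags can be compared against the "Command line parameters" line in the quoted error message to confirm that the primers were reverse-complemented as intended.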

External Links {🔗}(2)

Analytics and Tracking {📊}

Site Verification - Google

Libraries {📚}

Clipboard.js
D3.js
Lodash

Emails and Hosting {✉️}

Mail Servers:

aspmx.l.google.com
alt1.aspmx.l.google.com
alt2.aspmx.l.google.com
alt3.aspmx.l.google.com
alt4.aspmx.l.google.com

Name Servers:

dns1.p08.nsone.net
dns2.p08.nsone.net
dns3.p08.nsone.net
dns4.p08.nsone.net
ns-1283.awsdns-32.org
ns-1707.awsdns-21.co.uk
ns-421.awsdns-52.com
ns-520.awsdns-01.net
