Jan. 13 and 21, 2018. Comments on PubMed PMID 29280214: Thorough in silico and in vitro cDNA analysis of 21 putative BRCA1 and BRCA2 splice variants and a complex tandem duplication in BRCA2, allowing the identification of activated cryptic splice donor sites in BRCA2 exon 11.

We have posted a comment in PubMed Commons about Baert et al. “Thorough in silico and in vitro cDNA analysis of 21 putative BRCA1 and BRCA2 splice variants and a complex tandem duplication in BRCA2, allowing the identification of activated cryptic splice donor sites in BRCA2 exon 11.” (2017) (doi: 10.1002/humu.23390). The updated comments can be found at: https://www.ncbi.nlm.nih.gov/pubmed/29280214#comments. They have been highlighted twice by PubMed Commons as a “Top Comment”.

NB: We have exchanged views with Dr. Claes (senior author), who has inquired about our NGS pipeline for splicing mutation analysis, MutationForecaster (www.mutationforecaster.com):

Peter Rogan, 2018 Jan 12, 2:39 p.m. (edited). 2 of 2 people found this helpful.

Twenty-one BRCA1 and BRCA2 mRNA splice site variants were analyzed by semi-quantitative RT-PCR, with commercial software that scores putative splice sites by ad hoc methods, and with bioinformatic models based on Adaboost and Random Forest, which are general machine learning approaches. The authors cited our review on interpretation of splicing mutations (Caminsky N, 2014); however, the analytic approach described in that paper was not evaluated. As an update to our previous BRCA mutation study (Mucaki EJ, 2011), we carried out information theory-based splicing analysis of all potential splicing mutations listed in Supplemental Table S3. The splicing consequences of all variants were accurately predicted by information analysis. We also report results of exon definition-based mRNA splicing mutation analysis (Mucaki EJ, 2013), which infers the relative abundance of wild type and mutated splice isoforms from the total splicing information content of each prospective exon. Due to length limitations in the PubMed Commons commenting system, detailed results for each variant are described in: https://doi.org/10.5281/zenodo.1146708

Also, during our analysis, we noted some inconsistencies in mutation designation or interpretation in the paper: (1) The complex BRCA2 duplication described in this article (c.425+415_4780dup[insGATCGCAGTGA]) is sometimes referred to as “c.426-415_4780dup[insGATCGCAGTGA]” (e.g. the title of Figure 5, and Suppl. Table S3); the two designations do not describe the same mutation. The true mutation is likely the former, as the Figure 5 legend describes an mRNA splice form that includes 293nt of intron 4. If the duplication were c.426-415_4780dup[insGATCGCAGTGA], the intron inclusion would only be 205nt long. (2) We report an additional inconsistency regarding Figure 5: The legend of Figure 5E describes a splice form in which a truncated exon 11 forms a junction with the aforementioned 11nt insertion. However, the diagram and the electropherogram in Figure 5E show exon 11 (ending at c.2398) sharing a junction with the beginning of exon 5. The latter is most likely the correct isoform, as an acceptor is not predicted at the junction between c.4780 and the 11nt insertion.

  • Kathleen B M Claes, 2018 Jan 17, 10:44 a.m. 2 of 2 people found this helpful.

    Dear Dr. Rogan, thank you very much for your constructive comments. It is very interesting to learn that your exon definition-based mRNA splicing analyses are in agreement with our cDNA analyses for all variants we studied (an overview is provided in Suppl Table S1 of our paper – not S3). I read the detailed comments on the URL you referred to. How easily can this approach be implemented in an NGS data analysis pipeline? Can you define cut-offs in this program to indicate when cDNA analysis is warranted?

    I would also like to thank you for alerting us to the typing error for the multi-exon duplication in BRCA2 – the correct nomenclature for this duplication is indeed c.426+415_4780dup{insGATCGCAGTGA}. We corrected this in the final proofs.

    • Peter Rogan, 2018 Jan 21, 1:20 p.m. (edited). 1 of 1 people found this helpful.

      The results of the different bioinformatic methods reported in Table S1 were difficult for us to assess. For example, why were there no bioinformatic analyses for c.426+415_4780dup(insGATCGCAGTGA)? Our analysis includes this mutation. Model cutoffs for these bioinformatic methods are defined arbitrarily because they are based on underlying datasets with unpublished or unknown content; furthermore, the binding site models are not easily reproduced, in part because they are not actually based on binding site affinities (Rogan PK, 2013).

      The details of the methods and source data we use to derive our information weight matrices, and the matrices themselves, are available (Rogan PK, 2003). The information contents of splice recognition sites or exons are expressed in units of bits, which have been formally proven to be related to binding site affinity through the second law of thermodynamics (Schneider TD, 1997; Rogan PK, 1998). In fact, the relative entropy used by MaxEntScan violates the triangle inequality, which is a fundamental requirement of the second law (Schneider TD, 1999). These articles demonstrate that the cutoff for true binding sites is very close to the theoretical minimum of zero bits (Delta G = 0). We have also demonstrated that this thermodynamic threshold holds for other types of binding sites (Lu R, 2017).

      Our pipeline for NGS data analysis has been validated extensively (Shirley BC, 2013; Viner C, 2014; Dorman SN, 2014; Caminsky NG, 2016; Mucaki EJ, 2016; Yang XR, 2017; Dos Santos ES, 2017). The URL of the MutationForecaster pipeline is given in the document linked to our previous PubMed Commons post.
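
For readers unfamiliar with the bit-based scoring discussed in the thread, the following is a minimal sketch of how the individual information content (Ri) of a candidate splice site can be computed from a position frequency matrix, and how a change in Ri translates into a lower bound on the change in binding affinity. It is an illustration only and is not the MutationForecaster implementation. The aligned sites in `TOY_SITES` are invented for the example, the pseudocount is an assumption, the small-sample correction of the published method is omitted, and the 2^|ΔRi| minimum fold change is quoted from our understanding of the information-theory literature cited above.

```python
import math

BASES = "ACGT"

# Hypothetical aligned donor-like sites, invented for illustration only.
# Real analyses use weight matrices derived from large sets of validated sites.
TOY_SITES = [
    "CAGGTAAGT",
    "AAGGTGAGT",
    "CAGGTAAGA",
    "GAGGTAAGT",
    "AAGGTAAGG",
    "CTGGTGAGT",
]


def frequency_matrix(sites, pseudocount=1.0):
    """Per-position base frequencies f(b, l) from equal-length aligned sites."""
    matrix = []
    for pos in range(len(sites[0])):
        counts = {base: pseudocount for base in BASES}  # assumed pseudocount
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        matrix.append({base: counts[base] / total for base in BASES})
    return matrix


def individual_information(seq, freqs):
    """Ri in bits: sum over positions of 2 + log2 f(b, l).

    The small-sample correction term is omitted in this sketch.
    """
    return sum(2.0 + math.log2(freqs[pos][base]) for pos, base in enumerate(seq))


freqs = frequency_matrix(TOY_SITES)
wild_type_ri = individual_information("CAGGTAAGT", freqs)
mutant_ri = individual_information("CAGATAAGT", freqs)  # G>A at the first intronic base

delta_ri = mutant_ri - wild_type_ri
# Assumed relation: a change of delta_ri bits implies at least a
# 2**|delta_ri|-fold change in binding affinity.
min_fold_change = 2 ** abs(delta_ri)

print(f"wild type Ri = {wild_type_ri:.2f} bits")
print(f"mutant Ri    = {mutant_ri:.2f} bits")
print(f"delta Ri     = {delta_ri:.2f} bits")
print(f"minimum fold change in affinity = {min_fold_change:.1f}")
```

In this framing, a site whose Ri falls to or below zero bits after mutation would not be expected to be recognized, which is the threshold the reply above refers to.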

 
