Package 'RJafroc'

Title: Artificial Intelligence Systems and Observer Performance
Description: Analyzing the performance of artificial intelligence (AI) systems/algorithms characterized by a 'search-and-report' strategy. Historically observer performance has dealt with measuring radiologists' performances in search tasks, e.g., searching for lesions in medical images and reporting them, but the implicit location information has been ignored. The implemented methods apply to analyzing the absolute and relative performances of AI systems, comparing AI performance to a group of human readers or optimizing the reporting threshold of an AI system. In addition to performing historical receiver operating characteristic (ROC) analysis (localization information ignored), the software also performs free-response receiver operating characteristic (FROC) analysis, where lesion localization information is used. A book using the software has been published: Chakraborty DP: Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples, Taylor-Francis LLC; 2017: <https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840>. Online updates to this book, which use the software, are at <https://dpc10ster.github.io/RJafrocQuickStart/>, <https://dpc10ster.github.io/RJafrocRocBook/> and at <https://dpc10ster.github.io/RJafrocFrocBook/>. Supported data collection paradigms are the ROC, FROC and the location ROC (LROC). ROC data consists of a single rating per image, where a rating is the perceived confidence level that the image is that of a diseased patient. An ROC curve is a plot of true positive fraction vs. false positive fraction. FROC data consists of a variable number (zero or more) of mark-rating pairs per image, where a mark is the location of a reported suspicious region and the rating is the confidence level that it is a real lesion. 
LROC data consists of a rating and a location of the most suspicious region, for every image. Four models of observer performance, and curve-fitting software, are implemented: the binormal model (BM), the contaminated binormal model (CBM), the correlated contaminated binormal model (CORCBM), and the radiological search model (RSM). Unlike the binormal model, CBM, CORCBM and RSM predict 'proper' ROC curves that do not inappropriately cross the chance diagonal. Additionally, RSM parameters are related to search performance (not measured in conventional ROC analysis) and classification performance. Search performance refers to finding lesions, i.e., true positives, while simultaneously not finding false positive locations. Classification performance measures the ability to distinguish between true and false positive locations. Knowing these separate performances allows principled optimization of reader or AI system performance. This package supersedes Windows JAFROC (jackknife alternative FROC) software V4.2.1, <https://github.com/dpc10ster/WindowsJafroc>. Package functions are organized as follows. Data file related function names are preceded by 'Df', curve fitting functions by 'Fit', included data sets by 'dataset', plotting functions by 'Plot', significance testing functions by 'St', sample size related functions by 'Ss', data simulation functions by 'Simulate' and utility functions by 'Util'. Implemented are figures of merit (FOMs) for quantifying performance and functions for visualizing empirical or fitted operating characteristics: e.g., ROC, FROC, alternative FROC (AFROC) and weighted AFROC (wAFROC) curves. For fully crossed study designs significance testing of reader-averaged FOM differences between modalities is implemented via either Dorfman-Berbaum-Metz or the Obuchowski-Rockette methods. 
Also implemented is single modality analysis, which allows comparison of performance of a group of radiologists to a specified value, or comparison of AI to a group of radiologists interpreting the same cases. Crossed-modality analysis is implemented wherein there are two crossed modality factors and the aim is to determine performance in each modality factor averaged over all levels of the second factor. Sample size estimation tools are provided for ROC and FROC studies; these use estimates of the relevant variances from a pilot study to predict required numbers of readers and cases in a pivotal study to achieve the desired power. Utility and data file manipulation functions allow data to be read in any of the currently used input formats, including Excel, and the results of the analysis can be viewed in text or Excel output files. The methods are illustrated with several included datasets from the author's collaborations. This update includes improvements to the code, some as a result of user-reported bugs and new feature requests, and others discovered during ongoing testing and code simplification.
Authors: Dev Chakraborty [cre, aut, cph], Peter Phillips [ctb], Xuetong Zhai [aut], Lucy D'Agostino McGowan [ctb], Alejandro RodriguezRuiz [ctb]
Maintainer: Dev Chakraborty <[email protected]>
License: GPL-3
Version: 2.1.3
Built: 2024-10-29 11:43:00 UTC
Source: https://github.com/dpc10ster/rjafroc

Help Index


Artificial Intelligence Systems and Observer Performance

Description

RJafroc analyzes the performance of artificial intelligence (AI) systems/algorithms characterized by a search-and-report strategy. Historically observer performance has dealt with measuring radiologists' performances in search tasks, e.g., searching for lesions in medical images and reporting them, but the implicit location information has been ignored. The methods here apply to any task involving searching for and reporting arbitrary targets in images. The implemented methods apply to analyzing the absolute and relative performances of AI systems, comparing AI performance to a group of human readers or optimizing the reporting threshold of an AI system. In addition to performing historical receiver operating characteristic (ROC) analysis (localization information ignored), the software also performs free-response receiver operating characteristic (FROC) analysis, where the implicit lesion localization information is used. A book describing the underlying methodology and which uses the software has been published: Chakraborty DP: Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples, Taylor-Francis LLC; 2017: https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840. Online updates to this book, which use the software, are at https://dpc10ster.github.io/RJafrocQuickStart/, https://dpc10ster.github.io/RJafrocRocBook/ and at https://dpc10ster.github.io/RJafrocFrocBook/. Supported data collection paradigms are the ROC, FROC and the location ROC (LROC). ROC data consists of a single rating per image, where a rating is the perceived confidence level that the image is that of a diseased patient. An ROC curve is a plot of true positive fraction vs. false positive fraction. 
FROC data consists of a variable number (zero or more) of mark-rating pairs per image, where a mark is the location of a reported suspicious region and the rating is the confidence level that it is a real lesion. LROC data consists of a rating and a location of the most suspicious region, for every image. Four models of observer performance, and curve-fitting software, are implemented: the binormal model (BM), the contaminated binormal model (CBM), the correlated contaminated binormal model (CORCBM), and the radiological search model (RSM). Unlike the binormal model, CBM, CORCBM and RSM predict "proper" ROC curves that do not inappropriately cross the chance diagonal. Additionally, RSM parameters are related to search performance (not measured in conventional ROC analysis) and classification performance. Search performance refers to finding lesions, i.e., true positives, while simultaneously not finding false positive locations. Classification performance measures the ability to distinguish between true and false positive locations. Knowing these separate performances allows principled optimization of reader or AI system performance. This package supersedes Windows JAFROC (jackknife alternative FROC) software V4.2.1, https://github.com/dpc10ster/WindowsJafroc. Package functions are organized as follows. Data file related function names are preceded by Df, curve fitting functions by Fit, included data sets by dataset, plotting functions by Plot, significance testing functions by St, sample size related functions by Ss, data simulation functions by Simulate and utility functions by Util. Implemented are figures of merit (FOMs) for quantifying performance, functions for visualizing empirical operating characteristics: e.g., ROC, FROC, alternative FROC (AFROC) and weighted AFROC (wAFROC) curves. 
For fully crossed study designs significance testing of reader-averaged FOM differences between modalities is implemented via both Dorfman-Berbaum-Metz and the Obuchowski-Rockette methods. Also implemented are single modality analyses, allowing comparison of performance of a group of radiologists to a specified value, or comparison of AI to a group of radiologists/algorithms interpreting the same cases. Crossed-modality analysis is implemented wherein there are two crossed modality factors and the aim is to determine performance in each modality factor averaged over all levels of the second factor. Sample size estimation tools are provided for ROC and FROC studies; these use estimates of the relevant variances from a pilot study to predict required numbers of readers and cases in a pivotal study to achieve the desired power. Utility and data file manipulation functions allow data to be read in any of the currently used input formats, including Excel, and the results of the analysis can be viewed in text or Excel output files. The methods are illustrated with several included datasets from the author's collaborations. This update includes improvements to the code, some as a result of user-reported bugs and new feature requests, and others discovered during ongoing testing and code simplification. All changes are noted in NEWS.md.

Details

Package: RJafroc
Type: Package
Version: 2.1.3
Date: 2023-07-31
License: GPL-3
URL: https://dpc10ster.github.io/RJafroc/

Definitions and abbreviations

  • a: The separation or "a" parameter of the binormal model

  • AFROC curve: plot of LLF (ordinate) vs. FPF, where FPF is inferred using highest rating of NL marks on non-diseased cases

  • AFROC: alternative FROC, see Chakraborty 1989

  • AFROC1 curve: plot of LLF (ordinate) vs. FPF1, where FPF1 is inferred using highest rating of NL marks on ALL cases

  • alpha: The significance level α of the test of the null hypothesis of no modality effect

  • AUC: area under curve; e.g., ROC-AUC = area under ROC curve, an example of a FOM

  • b: The width or "b" parameter of the conventional binormal model

  • Binormal model: two unequal variance normal distributions, one at zero and one at μ, for modeling ROC ratings; σ is the std. dev. ratio of diseased to non-diseased distributions

  • CAD: computer aided detection algorithm

  • CBM: contaminated binormal model: two equal variance normal distributions for modeling ROC ratings; the diseased distribution is bimodal, with a peak at zero and one at μ; the integrated fraction at μ is α (not to be confused with the α of NH testing)

  • CI: The (1-α) confidence interval for the stated statistic

  • Crossed-modality: a dataset containing two modality (i.e., treatment) factors, with the levels of the two factors crossed, see paper by Thompson et al

  • DBM: Dorfman-Berbaum-Metz, a significance testing method for detecting a modality effect in MRMC studies, with Hillis suggested modification to ddf.

  • ddf: Denominator degrees of freedom of appropriate F-test; the corresponding ndf is I - 1

  • Empirical AUC: trapezoidal area under curve, same as the Wilcoxon statistic for ROC paradigm

  • FN: false negative, a diseased case classified as non-diseased

  • FOM: figure of merit, a quantitative measure of performance, performance metric

  • FP: false positive, a non-diseased case classified as diseased

  • FPF: number of FPs divided by number of non-diseased cases

  • FROC curve: plot of LLF (ordinate) vs. NLF

  • FROC: free-response ROC (a data collection paradigm where each image yields a random number, 0, 1, 2,..., of mark-rating pairs)

  • FRRC: Analysis that treats readers as fixed and cases as random factors

  • I: total number of modalities, indexed by i

  • image/case: used interchangeably; a case can consist of several images of the same patient in the same modality

  • iMRMC: A text file format used for ROC data by FDA/CDRH researchers

  • individual: A single-modality single-reader dataset.

  • Intrinsic: Used in connection with RSM; a parameter that is independent of the RSM μ parameter, but whose meaning may not be as transparent as the corresponding physical parameter

  • J: number of readers, indexed by j

  • JAFROC file format: A .xlsx format file, applicable to ROC, ROI, FROC and LROC paradigms

  • JAFROC: jackknife AFROC: Windows software for analyzing observer performance data: no longer updated, replaced by current package; the name is a misnomer as the jackknife is used only for significance testing; alternatively, the bootstrap could be used; what distinguishes FROC from ROC analysis is the use of the AFROC-AUC as the FOM. With this change, the DBM or the OR method can be used for significance testing

  • K: total number of cases, K = K1 + K2, indexed by k

  • K1: total number of non-diseased cases, indexed by k1

  • K2: total number of diseased cases, indexed by k2

  • LL: lesion localization i.e., a mark that correctly locates an existing localized lesion; TP is a special case, when the proximity criterion is lax (i.e., "acceptance radius" is large)

  • LLF: number of LLs divided by the total number of lesions

  • LROC: location receiver operating characteristic, a data collection paradigm where each image yields a single rating and one location

  • lrc/MRMC: A text file format used for ROC data by University of Iowa researchers

  • mark: the location of a suspected diseased region

  • maxLL: maximum number of lesions per case in dataset

  • maxNL: maximum number of NL marks per case in dataset

  • MRMC: multiple reader multiple case (each reader interprets each case in each modality, i.e. fully crossed study design)

  • ndf: Numerator degrees of freedom of appropriate F-test, usually number of treatments minus one

  • NH: The null hypothesis that all modality effects are zero; rejected if the p-value is smaller than α

  • NL: non-lesion localization, of which FP is a special case, i.e., a mark that does not correctly locate any existing localized lesion(s)

  • NLF: number of NLs divided by the total number of cases

  • Operating characteristic: A plot of normalized correct decisions on diseased cases along ordinate vs. normalized incorrect decisions on non-diseased cases

  • Operating point: A point on an operating characteristic, e.g., (FPF, TPF) represents an operating point on an ROC

  • OR: Obuchowski-Rockette, a significance testing method for detecting a modality effect in MRMC studies, with Hillis suggested modifications

  • Physical parameter: Used in connection with RSM; a parameter whose meaning is more transparent than the corresponding intrinsic parameter, but which depends on the RSM μ parameter

  • Proximity criterion / acceptance radius: Used in connection with FROC (or LROC data); the "nearness" criterion is used to determine if a mark is close enough to a lesion to be counted as a LL (or correct localization); otherwise it is counted as a NL (or incorrect localization)

  • p-value: the probability, under the null hypothesis, that the observed modality effects, or larger, could occur by chance

  • Proper: a proper fit does not inappropriately fall below the chance diagonal and does not display a "hook" near the upper right corner

  • PROPROC: Metz's binormal model based fitting of proper ROC curves

  • RSM, Radiological Search Model: two unit variance normal distributions for modeling NL and LL ratings; four parameters, μ, ν', λ' and ζ1

  • Rating: Confidence level assigned to a case; higher values indicate greater confidence in presence of disease; -Inf is allowed but NA is not allowed

  • Reader/observer/radiologist/CAD: used interchangeably

  • RJafroc: the current software

  • ROC: receiver operating characteristic, a data collection paradigm where each image yields a single rating and location information is ignored

  • ROC curve: plot of TPF (ordinate) vs. FPF, as threshold is varied; an example of an operating characteristic

  • ROCFIT: Metz software for binormal model based fitting of ROC data

  • ROI: region-of-interest (each case is divided into a number of ROIs and the reader assigns an ROC rating to each ROI)

  • RRFC: Analysis that treats readers as random and cases as fixed factors

  • RRRC: Analysis that treats both readers and cases as random factors

  • RSCORE-II: original software for binormal model based fitting of ROC data

  • RSM: Radiological search model, also method for fitting a proper ROC curve to ROC data

  • RSM-ζ1: Lowest reporting threshold, determines if suspicious region is actually marked

  • RSM-λ: Intrinsic parameter of RSM corresponding to λ', independent of μ

  • RSM-λ': Physical Poisson parameter of RSM, average number of latent NLs per case; depends on μ

  • RSM-μ: separation of the unit variance distributions of RSM

  • RSM-ν: Intrinsic parameter of RSM, corresponding to ν', independent of μ

  • RSM-ν': binomial parameter of RSM, probability that lesion is found

  • SE: sensitivity, same as TPF

  • Significance testing: determining the p-value of a statistical test

  • SP: specificity, same as 1-FPF

  • Threshold: Reporting criteria: if confidence exceeds a threshold value, report case as diseased, otherwise report non-diseased

  • TN: true negative, a non-diseased case classified as non-diseased

  • TP: true positive, a diseased case classified as diseased

  • TPF: number of TPs divided by number of diseased cases

  • Treatment/modality: used interchangeably, for example, computed tomography (CT) images vs. magnetic resonance imaging (MRI) images

  • wAFROC curve: plot of weighted LLF (ordinate) vs. FPF, where FPF is inferred using highest rating of NL marks on non-diseased cases ONLY

  • wAFROC1 curve: plot of weighted LLF (ordinate) vs. FPF1, where FPF1 is inferred using highest rating of NL marks on ALL cases

  • wAFROC1 FOM: weighted trapezoidal area under AFROC1 curve: only use if there are zero non-diseased cases
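The definitions of FPF, TPF, operating point and empirical AUC can be tied together with a short base R sketch. It does not use any RJafroc function; the ratings and the helper names (wilcoxon_auc, empirical_roc, trapezoid_auc) are hypothetical and chosen only for illustration. It shows that the trapezoidal area under the empirical ROC curve agrees with the Wilcoxon statistic.

```r
# Toy ROC ratings (hypothetical values, not taken from any RJafroc dataset)
fp <- c(1, 2, 2, 3, 1, 4)   # ratings on K1 = 6 non-diseased cases
tp <- c(3, 4, 2, 5, 5)      # ratings on K2 = 5 diseased cases

# Wilcoxon statistic: fraction of (non-diseased, diseased) pairs that are
# correctly ordered, with ties counted as one half
wilcoxon_auc <- function(fp, tp) {
  cmp <- outer(fp, tp, function(x, y) (x < y) + 0.5 * (x == y))
  mean(cmp)
}

# Empirical operating points (FPF, TPF): sweep the threshold over the
# observed ratings, from strictest to most lax; (0, 0) is prepended
empirical_roc <- function(fp, tp) {
  thr <- sort(unique(c(fp, tp)), decreasing = TRUE)
  fpf <- c(0, sapply(thr, function(z) mean(fp >= z)))
  tpf <- c(0, sapply(thr, function(z) mean(tp >= z)))
  data.frame(FPF = fpf, TPF = tpf)
}

# Trapezoidal area under the empirical ROC curve
trapezoid_auc <- function(roc) {
  sum(diff(roc$FPF) * (head(roc$TPF, -1) + tail(roc$TPF, -1)) / 2)
}

roc   <- empirical_roc(fp, tp)
auc_w <- wilcoxon_auc(fp, tp)
auc_t <- trapezoid_auc(roc)
```

Because the lowest threshold equals the smallest observed rating, the last operating point is (1, 1), so the curve spans the full FPF range and the two areas coincide.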

Dataset

A standard dataset object has 3 list elements: $ratings, $lesions and $descriptions, where:

  • dataset$ratings: contains 3 elements as sub-lists: $NL, $LL and $LL_IL; these describe the structure of the ratings;

  • dataset$lesions: contains 3 elements as sub-lists: $perCase, $IDs and $weights; these describe the structure of the lesions;

  • dataset$descriptions: contains 7 elements as sub-lists: $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID; these describe other characteristics of the dataset as detailed next.

Note: -Inf is used to indicate the ratings of unmarked lesions and/or missing values. As an example of the latter, if the maximum number of NLs in a dataset is 4, but some images have fewer than 4 NL marks, the corresponding "empty" positions would be filled with -Infs. Do not use NA to denote a missing rating.
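One plausible reason for preferring -Inf over NA is that ordinary comparisons and maxima remain well defined on padded ratings. The base R sketch below uses hypothetical ratings and variable names; it is an illustration of the convention, not code from the package.

```r
# Hypothetical NL ratings for one case in a dataset where maxNL = 4;
# only two marks were made, so the last two slots are padded with -Inf
nl <- c(2.5, 1.0, -Inf, -Inf)

n_marks     <- sum(nl > -Inf)   # number of actual marks on the case
highest     <- max(nl)          # highest rating on the case
marked_at_2 <- sum(nl >= 2)     # marks at or above a threshold of 2

# An unmarked case: every slot is -Inf, and max() is still well defined
unmarked         <- rep(-Inf, 4)
highest_unmarked <- max(unmarked)

# With NA padding the same operation returns NA unless na.rm is set
nl_na      <- c(2.5, 1.0, NA, NA)
highest_na <- max(nl_na)
```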

Note: A standard dataset always represents R object(s) with the following structure(s):

Data structure, e.g., dataset02, an ROC dataset, and dataset05, an FROC dataset.

  • ratings$NL: a float array with dimensions c(I, J, K, maxNL), containing the ratings of NL marks. The first K1 locations of the third index correspond to NL marks on non-diseased cases and the remaining locations correspond to NL marks on diseased cases. The 4th dimension allows for multiple NL marks on a case: the first index holds the first NL rating on the image, the second holds the second NL rating on the image, etc. The value of maxNL is determined by the case with the maximum number of NL marks in the dataset. For FROC datasets missing NL ratings are assigned the -Inf rating. For ROC datasets, FP ratings are assigned to the first K1 elements of NL[,,1:K1,1] and the remaining K2 elements of NL[,,(K1+1):K,1] are set to -Inf.

  • ratings$LL: for non-LROC datasets a float array with dimensions c(I, J, K2, maxLL) containing the ratings of LL marks. The value of maxLL is determined by the maximum number of lesions per case in the dataset. Unmarked lesions are assigned the -Inf rating. For ROC datasets TP ratings are assigned to LL[,,1:K2,1]. For LROC datasets it is a float array with dimensions c(I, J, K2, 1) containing the ratings of correct localizations, otherwise the rating is recorded in the incorrect localization array described next.

  • ratings$LL_IL: for LROC datasets the ratings of incorrect localization marks on abnormal cases. It is a float array with dimensions c(I, J, K2, 1). For non-LROC datasets this array is filled with NAs.

  • lesions$perCase: an integer array with length K2, the number of lesions on each diseased case. The maximum value of this array equals maxLL. For example, dataset05$lesions$perCase[4] is 2, meaning the 4th diseased case has two lesions.

  • lesions$IDs: an integer array with dimensions c(K2, maxLL), labeling (or naming) the lesions on the diseased cases. For example, dataset05$lesions$IDs[4,] is c(1,2,-Inf), meaning the 4th diseased case has two lesions, labeled 1 and 2.

  • lesions$weights: a floating point array with dimensions c(K2, maxLL), representing the relative importance of detecting each lesion. The weights for an abnormal case must sum to unity. For example, dataset05$lesions$weights[4,] is c(0.5,0.5, -Inf), corresponding to equal weights (0.5) assigned to each of the two lesions in the case.

  • descriptions$fileName: a character variable containing the file name of the source data for this dataset. This is generated automatically by the DfReadDataFile function used to read the file. For a simulated dataset it is set to "NA" (i.e., a character vector, not the variable NA).

  • descriptions$type: a character variable describing the data type: "ROC", "LROC", "ROI" or "FROC".

  • descriptions$name: a character variable containing the name of the dataset: e.g., "dataset02" or "dataset05". This is generated automatically by the DfReadDataFile function used to read the file.

  • descriptions$truthTableStr: a c(I, J, L, maxLL+1) object. For normal cases elements c(I, J, L, 1) are filled with 1s if the corresponding interpretations occurred or NAs otherwise. For abnormal cases elements c(I, J, L, 2:(maxLL+1)) are filled with 1s if the corresponding interpretations occurred or NAs otherwise. This object is necessary for analyzing more complex designs.

  • descriptions$design: a character variable: "FCTRL", corresponding to factorial design.

  • descriptions$modalityID: a character vector of length I, which labels/names the modalities in the dataset. For non-JAFROC data file formats, they must be unique integers.

  • descriptions$readerID: a character vector of length J, which labels/names the readers in the dataset. For non-JAFROC data file formats, they must be unique integers.
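The array shapes described above can be made concrete with a small hand-built list. This is only a sketch in base R under stated assumptions: the sizes and ratings are hypothetical, the descriptions element is deliberately incomplete (four of the seven fields), and a real dataset object would normally come from the package's own file-reading or simulation functions rather than from code like this.

```r
# Hypothetical sizes: 2 modalities, 3 readers, 4 non-diseased + 3 diseased cases
I <- 2; J <- 3; K1 <- 4; K2 <- 3; K <- K1 + K2
maxNL <- 2; maxLL <- 2

# Ratings arrays start fully padded with -Inf (no marks anywhere)
NL <- array(-Inf, dim = c(I, J, K, maxNL))
LL <- array(-Inf, dim = c(I, J, K2, maxLL))
NL[1, 1, 2, 1]      <- 1.5   # modality 1, reader 1: NL mark on non-diseased case 2
NL[1, 1, K1 + 1, 1] <- 0.7   # NL mark on the first diseased case
LL[1, 1, 1, 1]      <- 3.2   # first lesion of diseased case 1 was marked

# Lesion bookkeeping: unused slots are padded with -Inf
perCase <- c(1L, 2L, 1L)
IDs     <- matrix(-Inf, nrow = K2, ncol = maxLL)
weights <- matrix(-Inf, nrow = K2, ncol = maxLL)
for (k in seq_len(K2)) {
  n <- perCase[k]
  IDs[k, seq_len(n)]     <- seq_len(n)
  weights[k, seq_len(n)] <- 1 / n     # equal weights that sum to unity
}

toy <- list(
  ratings = list(NL = NL, LL = LL, LL_IL = NA),
  lesions = list(perCase = perCase, IDs = IDs, weights = weights),
  descriptions = list(type = "FROC", design = "FCTRL",
                      modalityID = as.character(seq_len(I)),
                      readerID = as.character(seq_len(J)))
)

# Per-case sum of the weights, ignoring the -Inf padding
row_sums <- apply(weights, 1, function(w) sum(w[is.finite(w)]))
```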

ROI data structure, example datasetROI

Only changes from the previously described structure are described below:

  • ratings$NL: a float array with dimensions c(I, J, K, Q) containing the ratings of each of Q quadrants for each non-diseased case.

  • ratings$LL: a float array with dimensions c(I, J, K2, Q) containing the ratings of quadrants for each diseased case.

  • lesions$perCase: this contains the locations, on abnormal cases, containing at least one lesion.

Crossed-modality dataset structure, example datasetXModality

Only changes from the previously described structure are described below:

  • dataset$ratings$NL: a float array with dimension c(I1, I2, J, K, maxNL) containing the ratings of NL marks. Note the existence of two modality indices.

  • dataset$ratings$LL: a float array with dimension c(I1, I2, J, K2, maxLL) containing the ratings of all LL marks. Note the existence of two modality indices.

  • dataset$descriptions$modalityID1: corresponding to first modality factor.

  • dataset$descriptions$modalityID2: corresponding to second modality factor.
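With two modality indices, performance for one modality factor is summarized by averaging over the levels of the other. The base R sketch below illustrates that averaging on a hypothetical array of figure-of-merit values with dimensions c(I1, I2, J); the values are random placeholders and this is not the package's crossed-modality analysis code.

```r
# Hypothetical sizes: 2 levels of factor 1, 3 levels of factor 2, 4 readers
I1 <- 2; I2 <- 3; J <- 4

# Placeholder FOM values indexed by (modality factor 1, modality factor 2, reader)
set.seed(1)
fom <- array(runif(I1 * I2 * J, min = 0.6, max = 0.9), dim = c(I1, I2, J))

# Average over the second factor: one FOM per (factor-1 level, reader)
fom_factor1 <- apply(fom, c(1, 3), mean)   # I1 x J matrix

# Average over the first factor: one FOM per (factor-2 level, reader)
fom_factor2 <- apply(fom, c(2, 3), mean)   # I2 x J matrix
```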

Df: Datafile related functions

Fitting Functions

  • FitBinormalRoc: Fit the binormal model to ROC data (R equivalent of ROCFIT or RSCORE).

  • FitCbmRoc: Fit the contaminated binormal model (CBM) to ROC data.

  • FitRsmRoc: Fit the radiological search model (RSM) to ROC data.

  • FitCorCbm: Fit the correlated contaminated binormal model (CORCBM) to paired ROC data.

Plotting Functions

Simulation Functions

Sample size Functions

  • SsPowerGivenJK: Calculate statistical power given numbers of readers J and cases K.

  • SsPowerTable: Generate a power table.

  • SsSampleSizeKGivenJ: Calculate number of cases K, for specified number of readers J, to achieve desired power for an ROC study.

Significance Testing Functions

  • St: Performs significance testing, DBM or OR, with factorial or crossed modalities.

  • StCadVsRad: Perform significance testing, CAD vs. radiologists.

Miscellaneous and Utility Functions

Author(s)

References

Basics of ROC

Metz, CE (1978). Basic principles of ROC analysis. In Seminars in nuclear medicine (Vol. 8, pp. 283–298). Elsevier.

Metz, CE (1986). ROC Methodology in Radiologic Imaging. Investigative Radiology, 21(9), 720.

Metz, CE (1989). Some practical issues of experimental design and data analysis in radiological ROC studies. Investigative Radiology, 24(3), 234.

Metz, CE (2008). ROC analysis in medical imaging: a tutorial review of the literature. Radiological Physics and Technology, 1(1), 2–12.

Wagner, R. F., Beiden, S. V, Campbell, G., Metz, CE, & Sacks, W. M. (2002). Assessment of medical imaging and computer-assist systems: lessons from recent experience. Academic Radiology, 9(11), 1264–77.

Wagner, R. F., Metz, CE, & Campbell, G. (2007). Assessment of medical imaging systems and computer aids: a tutorial review. Academic Radiology, 14(6), 723–48.

DBM/OR methods and extensions

Dorfman, DD, Berbaum, KS, & Metz, CE (1992). Receiver operating characteristic rating analysis: generalization to the population of readers and patients with the jackknife method. Investigative Radiology, 27(9), 723.

Obuchowski, NA, & Rockette, HE (1995). Hypothesis testing of diagnostic accuracy for multiple readers and multiple tests: an ANOVA approach with dependent observations. Communications in Statistics-Simulation and Computation, 24(2), 285–308.

Hillis, SL, Berbaum, KS, & Metz, CE (2008). Recent developments in the Dorfman-Berbaum-Metz procedure for multireader ROC study analysis. Academic Radiology, 15(5), 647–61.

Hillis, SL, Obuchowski, NA, & Berbaum, KS (2011). Power Estimation for Multireader ROC Methods: An Updated and Unified Approach. Acad Radiol, 18, 129–142.

Hillis, SL (2007). A comparison of denominator degrees of freedom methods for multiple observer ROC analysis. Statistics in Medicine, 26(3), 596–619.

FROC paradigm

Chakraborty DP. Maximum Likelihood analysis of free-response receiver operating characteristic (FROC) data. Med Phys. 1989;16(4):561–568.

Chakraborty, DP, & Berbaum, KS (2004). Observer studies involving detection and localization: modeling, analysis, and validation. Medical Physics, 31(8), 1–18.

Chakraborty, DP (2006). A search model and figure of merit for observer data acquired according to the free-response paradigm. Physics in Medicine and Biology, 51(14), 3449–62.

Chakraborty, DP (2006). ROC curves predicted by a model of visual search. Physics in Medicine and Biology, 51(14), 3463–82.

Chakraborty, DP (2011). New Developments in Observer Performance Methodology in Medical Imaging. Seminars in Nuclear Medicine, 41(6), 401–418.

Chakraborty, DP (2013). A Brief History of Free-Response Receiver Operating Characteristic Paradigm Data Analysis. Academic Radiology, 20(7), 915–919.

Chakraborty, DP, & Yoon, H.-J. (2008). Operating characteristics predicted by models for diagnostic tasks involving lesion localization. Medical Physics, 35(2), 435.

Thompson JD, Chakraborty DP, Szczepura K, et al. (2016) Effect of reconstruction methods and x-ray tube current-time product on nodule detection in an anthropomorphic thorax phantom: a crossed-modality JAFROC observer study. Medical Physics. 43(3):1265-1274.

Zhai X, Chakraborty DP. (2017) A bivariate contaminated binormal model for robust fitting of proper ROC curves to a pair of correlated, possibly degenerate, ROC datasets. Medical Physics. pp. 2207–2222. doi: 10.1002/mp.12263.

Hillis SL, Chakraborty DP, Orton CG. ROC or FROC? It depends on the research question. Medical Physics. 2017.

Chakraborty DP, Nishikawa RM, Orton CG. Due to potential concerns of bias and conflicts of interest, regulatory bodies should not do evaluation methodology research related to their regulatory missions. Medical Physics. 2017.

Dobbins III JT, McAdams HP, Sabol JM, Chakraborty DP, et al. (2016) Multi-Institutional Evaluation of Digital Tomosynthesis, Dual-Energy Radiography, and Conventional Chest Radiography for the Detection and Management of Pulmonary Nodules. Radiology. 282(1):236-250.

Warren LM, Mackenzie A, Cooke J, et al. Effect of image quality on calcification detection in digital mammography. Medical Physics. 2012;39(6):3202-3213.

Chakraborty DP, Zhai X. On the meaning of the weighted alternative free-response operating characteristic figure of merit. Medical physics. 2016;43(5):2548-2557.

Chakraborty DP. (2017) Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples. Taylor-Francis, LLC.


Compute the chi-square goodness of fit statistic for ROC fitting model

Description

Compute the chi-square goodness of fit statistic for a specified ROC data fitting model

Usage

ChisqrGoodnessOfFit(fpCounts, tpCounts, parameters, model, lesDistr)

Arguments

fpCounts

The FP counts table

tpCounts

The TP counts table

parameters

The parameters of the model including cutoffs, see details

model

The fitting model: "BINORMAL", "CBM" or "RSM"

lesDistr

The lesion distribution matrix; not needed for "BINORMAL" or "CBM" models. Array [1:maxLL,1:2]. The probability mass function of the lesion distribution for diseased cases. The first column contains the actual numbers of lesions per case. The second column contains the fraction of diseased cases with the number of lesions specified in the first column. The second column must sum to unity.
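A matrix with the layout described for lesDistr can be tabulated from a vector of lesion counts per diseased case. The base R sketch below assumes a hypothetical perCase vector; it only illustrates the two-column layout and the sum-to-unity requirement, and is not the package's own utility for this task.

```r
# Hypothetical number of lesions on each of 8 diseased cases
perCase <- c(1L, 2L, 1L, 1L, 3L, 2L, 1L, 1L)

# Count how many diseased cases have 1, 2, 3, ... lesions
counts <- table(perCase)

# Column 1: number of lesions per case; column 2: fraction of diseased cases
lesDistr <- cbind(as.integer(names(counts)),
                  as.numeric(counts) / length(perCase))
```

Here five of the eight cases have one lesion, two have two and one has three, so the second column is 0.625, 0.25 and 0.125.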

Details

For model = "BINORMAL" the parameters are c(a,b,zetas). For model = "CBM" the parameters are c(mu,alpha,zetas). For model = "RSM" the parameters are c(mu,lambda,nu,zetas). Due to the sparsity of the data, in most cases the goodness of fit statistic cannot be calculated as the criterion of at least 5 counts in each cell (TP and FP) is usually not met. An exception dataset is shown below.
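For the binormal case, the calculation can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the helper name binormal_chisq, the counts, and the parameter values are all hypothetical, and the degrees of freedom are taken as the number of rating bins minus 3 (2R cells, minus 2 fixed totals, minus 2 model parameters, minus R - 1 cutoffs), which is assumed rather than confirmed against the package source.

```r
# Chi-square goodness of fit for the conventional binormal model, where
# FPF(zeta) = pnorm(-zeta) and TPF(zeta) = pnorm(a - b * zeta)
binormal_chisq <- function(fpCounts, tpCounts, a, b, zetas) {
  stopifnot(length(fpCounts) == length(tpCounts),
            length(zetas) == length(fpCounts) - 1)
  cuts <- c(-Inf, zetas, Inf)

  # Model probability of each rating bin for the two truth states
  pFP <- diff(pnorm(cuts))
  pTP <- diff(pnorm(b * cuts - a))

  # Expected counts given the observed totals
  expFP <- sum(fpCounts) * pFP
  expTP <- sum(tpCounts) * pTP

  chisq <- sum((fpCounts - expFP)^2 / expFP) +
           sum((tpCounts - expTP)^2 / expTP)
  df    <- length(fpCounts) - 3          # assumed df, see lead-in
  pVal  <- pchisq(chisq, df, lower.tail = FALSE)
  list(chisq = chisq, pVal = pVal, df = df, expFP = expFP, expTP = expTP)
}

# Hypothetical 5-bin counts and parameters (not fitted to any real data)
fpCounts <- c(40, 25, 15, 12, 8)
tpCounts <- c(8, 12, 15, 25, 40)
res <- binormal_chisq(fpCounts, tpCounts, a = 1.2, b = 1,
                      zetas = c(-0.25, 0.4, 0.85, 1.4))

# The usual validity check: at least 5 expected counts in every cell
cells_ok <- all(res$expFP >= 5, res$expTP >= 5)
```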

Value

A list with the following elements:

chisq

The chi-square statistic

pVal

The p-value of the fit

df

The degrees of freedom

Examples

## Test with TONY data for which chisqr can be calculated
ds <- DfFroc2Roc(dataset01)
fit <- FitBinormalRoc(ds, 2, 3) # trt 2 and rdr 3
## fitted a,b and zeta parameters from preceding line were used to call the
## function as shown below:
fpCounts = c(119,  30,   9,  19,   7,   1)
tpCounts = c(10, 11,  7, 16, 29, 16)
gfit = ChisqrGoodnessOfFit(fpCounts, tpCounts, 
parameters = c(fit$a, fit$b, fit$zetas), model="BINORMAL")
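
To make the role of c(a,b,zetas) concrete, here is a minimal base R sketch of a chi-square goodness of fit calculation for the binormal model. It is not the RJafroc implementation: the function name binormalChisqSketch, the parameter values in the usage lines, and the degrees-of-freedom rule (number of bins minus 3, that is 2(R-1) free cells minus the 2 + (R-1) fitted parameters) are assumptions made for illustration, and no combining of sparse bins is attempted.

```r
# Sketch only: chi-square goodness of fit for a binormal ROC model.
# Assumes non-diseased ratings ~ N(0, 1) and diseased ratings ~ N(a/b, (1/b)^2),
# with counts ordered from the lowest to the highest rating bin and b > 0.
binormalChisqSketch <- function(fpCounts, tpCounts, a, b, zetas) {
  stopifnot(length(fpCounts) == length(tpCounts),
            length(zetas) == length(fpCounts) - 1, b > 0)
  cuts <- c(-Inf, zetas, Inf)
  fpProb <- diff(pnorm(cuts))          # bin probabilities, non-diseased cases
  tpProb <- diff(pnorm(b * cuts - a))  # bin probabilities, diseased cases
  fpExpected <- sum(fpCounts) * fpProb
  tpExpected <- sum(tpCounts) * tpProb
  chisq <- sum((fpCounts - fpExpected)^2 / fpExpected) +
    sum((tpCounts - tpExpected)^2 / tpExpected)
  df <- length(fpCounts) - 3           # assumed rule, see text above
  list(chisq = chisq,
       pVal = pchisq(chisq, df, lower.tail = FALSE),
       df = df,
       minExpected = min(fpExpected, tpExpected))
}

# Counts from the example above; a, b and zetas are hypothetical values,
# not the output of FitBinormalRoc
res <- binormalChisqSketch(fpCounts = c(119, 30, 9, 19, 7, 1),
                           tpCounts = c(10, 11, 7, 16, 29, 16),
                           a = 1.3, b = 0.6,
                           zetas = c(0.4, 1.0, 1.2, 1.8, 2.5))
```

The minExpected element gives a quick check of the "at least 5 counts in each cell" criterion mentioned in Details.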

TONY FROC dataset

Description

This is referred to in the book as the "TONY" dataset. It consists of 185 cases, 89 of which are diseased, interpreted in two treatments ("BT" = breast tomosynthesis and "DM" = digital mammography) by five radiologists using the FROC paradigm.

Usage

dataset01

Format

A list with 3 elements: $ratings, $lesions and $descriptions. $ratings contains 3 elements, $NL, $LL and $LL_IL; $lesions contains 3 elements, $perCase, $IDs and $weights; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID:

  • ratings$NL, num [1:2, 1:5, 1:185, 1:3], ratings of non-lesion localizations, NLs

  • ratings$LL, num [1:2, 1:5, 1:89, 1:2], ratings of lesion localizations, LLs

  • ratings$LL_IL, NA, this placeholder is used only for LROC data

  • lesions$perCase, int [1:89], number of lesions per diseased case

  • lesions$IDs, num [1:89, 1:2], numeric labels of lesions on diseased cases

  • lesions$weights, num [1:89, 1:2], weights (or clinical importances) of lesions

  • descriptions$fileName, chr, "dataset01", base name of dataset in 'data' folder

  • descriptions$type, chr "FROC", the data type

  • descriptions$name, chr "TONY", the name of the dataset

  • descriptions$truthTableStr, num [1:2, 1:5, 1:185, 1:3] 1 1 1 1 ..., truth table structure

  • descriptions$design, chr "FCTRL", study design, factorial dataset

  • descriptions$modalityID, chr [1:2] "BT" "DM", modality labels

  • descriptions$readerID, chr [1:5] "1" "2" "3" "4" ..., reader labels

References

Chakraborty DP, Svahn T (2011) Estimating the parameters of a model of visual search from ROC data: an alternate method for fitting proper ROC curves. PROC SPIE 7966.

Examples

res <- str(dataset01)
## PlotEmpiricalOperatingCharacteristics(dataset = dataset01, opChType = "wAFROC")$Plot

Van Dyke ROC dataset

Description

This is referred to in the book as the "VD" dataset. It consists of 114 cases, 45 of which are diseased, interpreted in two treatments ("0" = single spin echo MRI, "1" = cine-MRI) by five radiologists using the ROC paradigm. Each diseased case had an aortic dissection; the ROC paradigm generates one rating per case. It is often referred to in the ROC literature as the Van Dyke dataset, which, along with the Franken dataset, has been widely used to illustrate advances in ROC methodology. The example below displays the ROC plot for the first modality and first reader.

Usage

dataset02

Format

A list with 3 elements: $ratings, $lesions and $descriptions. $ratings contains 3 elements, $NL, $LL and $LL_IL; $lesions contains 3 elements, $perCase, $IDs and $weights; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID:

  • ratings$NL, num [1:2, 1:5, 1:114, 1], ratings of non-lesion localizations, NLs

  • ratings$LL, num [1:2, 1:5, 1:45, 1], ratings of lesion localizations, LLs

  • ratings$LL_IL, NA, this placeholder is used only for LROC data

  • lesions$perCase, int [1:45], number of lesions per diseased case

  • lesions$IDs, num [1:45, 1], numeric labels of lesions on diseased cases

  • lesions$weights, num [1:45, 1], weights (or clinical importances) of lesions

  • descriptions$fileName, chr, "dataset02", base name of dataset in 'data' folder

  • descriptions$type, chr "ROC", the data type

  • descriptions$name, chr "VAN-DYKE", the name of the dataset

  • descriptions$truthTableStr, num [1:2, 1:5, 1:114, 1:2] 1 1 1 1 ..., truth table structure

  • descriptions$design, chr "FCTRL", study design, factorial dataset

  • descriptions$modalityID, chr [1:2] "0" "1", modality labels

  • descriptions$readerID, chr [1:5] "0" "1" "2" ..., reader labels

References

Van Dyke CW, et al. Cine MRI in the diagnosis of thoracic aortic dissection. 79th RSNA Meetings. 1993.

Examples

res <- str(dataset02)
## PlotEmpiricalOperatingCharacteristics(dataset = dataset02, opChType = "ROC")$Plot

Franken ROC dataset

Description

This is referred to in the book as the "FR" dataset. It consists of 100 cases, 67 of which are diseased, interpreted in two treatments ("0" = conventional film radiographs, "1" = digitized images viewed on monitors) by four radiologists using the ROC paradigm. It is often referred to in the ROC literature as the Franken dataset, which, along with the Van Dyke dataset, has been widely used to illustrate advances in ROC methodology.

Usage

dataset03

Format

A list with 3 elements: $ratings, $lesions and $descriptions. $ratings contains 3 elements, $NL, $LL and $LL_IL; $lesions contains 3 elements, $perCase, $IDs and $weights; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID:

  • ratings$NL, num [1:2, 1:4, 1:100, 1], ratings of non-lesion localizations, NLs

  • ratings$LL, num [1:2, 1:4, 1:67, 1], ratings of lesion localizations, LLs

  • ratings$LL_IL, NA, this placeholder is used only for LROC data

  • lesions$perCase, int [1:67], number of lesions per diseased case

  • lesions$IDs, num [1:67, 1], numeric labels of lesions on diseased cases

  • lesions$weights, num [1:67, 1], weights (or clinical importances) of lesions

  • descriptions$fileName, chr, "dataset03", base name of dataset in 'data' folder

  • descriptions$type, chr "ROC", the data type

  • descriptions$name, chr "FRANKEN", the name of the dataset

  • descriptions$truthTableStr, num [1:2, 1:4, 1:100, 1:2], truth table structure

  • descriptions$design, chr "FCTRL", study design, factorial dataset

  • descriptions$modalityID, chr [1:2] "TREAT1" "TREAT2", modality labels

  • descriptions$readerID, chr [1:4] "READER_1" "READER_2" "READER_3" "READER_4", reader labels

References

Franken EA, et al. Evaluation of a Digital Workstation for Interpreting Neonatal Examinations: A Receiver Operating Characteristic Study. Investigative Radiology. 1992;27(9):732-737.

Examples

res <- str(dataset03)
## PlotEmpiricalOperatingCharacteristics(dataset = dataset03, opChType = "ROC")$Plot

Federica Zanca FROC dataset

Description

This is referred to in the book as the "FED" dataset. It consists of 200 mammograms, 100 of which contained one to three simulated microcalcifications, interpreted in five treatments (basically different image processing algorithms) by four radiologists using the FROC paradigm and a 5-point rating scale. The maximum number of NLs per case, over the entire dataset, was 7 and the dataset contained at least one diseased mammogram with 3 lesions. The Excel file containing this dataset is /inst/extdata/datasets/FZ_ALL.xlsx. The normal cases are labeled 100:199 while the diseased cases are labeled 0:99.

Usage

dataset04

Format

A list with 3 elements: $ratings, $lesions and $descriptions. $ratings contains 3 elements, $NL, $LL and $LL_IL; $lesions contains 3 elements, $perCase, $IDs and $weights; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID:

  • ratings$NL, num [1:5, 1:4, 1:200, 1:7], ratings of non-lesion localizations, NLs

  • ratings$LL, num [1:5, 1:4, 1:100, 1:3], ratings of lesion localizations, LLs

  • ratings$LL_IL, NA, this placeholder is used only for LROC data

  • lesions$perCase, int [1:100], number of lesions per diseased case

  • lesions$IDs, num [1:100, 1:3], numeric labels of lesions on diseased cases

  • lesions$weights, num [1:100, 1:3], weights (or clinical importances) of lesions

  • descriptions$fileName, chr, "dataset04", base name of dataset in 'data' folder

  • descriptions$type, chr "FROC", the data type

  • descriptions$name, chr "FEDERICA", the name of the dataset

  • descriptions$truthTableStr, num [1:5, 1:4, 1:200, 1:4], truth table structure

  • descriptions$design, chr "FCTRL", study design, factorial dataset

  • descriptions$modalityID, chr [1:5] "1" "2" "3" "4" "5", modality labels

  • descriptions$readerID, chr [1:4] "1" "3" "4" "5", reader labels

References

Zanca F et al. Evaluation of clinical image processing algorithms used in digital mammography. Medical Physics. 2009;36(3):765-775.

Examples

res <- str(dataset04)
## PlotEmpiricalOperatingCharacteristics(dataset = dataset04, opChType = "wAFROC")$Plot

John Thompson FROC dataset

Description

This is referred to in the book as the "JT" dataset. It consists of 92 cases, 47 of which are diseased, interpreted in two treatments ("1" = CT images acquired for attenuation correction, "2" = diagnostic CT images) by nine radiographers using the FROC paradigm. Each case was a slice of an anthropomorphic phantom; 47 of the slices had inserted nodular lesions (max 3 per slice). The maximum number of NLs per case, over the entire dataset, was 7.

Usage

dataset05

Format

A list with 3 elements: $ratings, $lesions and $descriptions. $ratings contains 3 elements, $NL, $LL and $LL_IL; $lesions contains 3 elements, $perCase, $IDs and $weights; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID:

  • ratings$NL, num [1:2, 1:9, 1:92, 1:7], ratings of non-lesion localizations, NLs

  • ratings$LL, num [1:2, 1:9, 1:47, 1:3], ratings of lesion localizations, LLs

  • ratings$LL_IL, NA, this placeholder is used only for LROC data

  • lesions$perCase, int [1:47], number of lesions per diseased case

  • lesions$IDs, num [1:47, 1:3], numeric labels of lesions on diseased cases

  • lesions$weights, num [1:47, 1:3], weights (or clinical importances) of lesions

  • descriptions$fileName, chr, "dataset05", base name of dataset in 'data' folder

  • descriptions$type, chr "FROC", the data type

  • descriptions$name, chr "THOMPSON", the name of the dataset

  • descriptions$truthTableStr, num [1:2, 1:9, 1:92, 1:4], truth table structure

  • descriptions$design, chr "FCTRL", study design, factorial dataset

  • descriptions$modalityID, chr [1:2] "1" "2", modality labels

  • descriptions$readerID, chr [1:9] "1" "2" "3" "4" ..., reader labels

References

Thompson JD, Hogg P, et al. (2014) A Free-Response Evaluation Determining Value in the Computed Tomography Attenuation Correction Image for Revealing Pulmonary Incidental Findings: A Phantom Study. Academic Radiology, 21 (4): 538-545.

Examples

res <- str(dataset05)
## PlotEmpiricalOperatingCharacteristics(dataset = dataset05, opChType = "wAFROC")$Plot

Magnus FROC dataset

Description

This is referred to in the book as the "MAG" dataset (after Magnus Bath, who conducted the JAFROC analysis). It consists of 89 cases, 42 of which are diseased (the counts given by the array dimensions under Format), interpreted in two treatments ("1" = conventional chest, "2" = chest tomosynthesis) by four radiologists using the FROC paradigm.

Usage

dataset06

Format

A list with 3 elements: $ratings, $lesions and $descriptions. $ratings contains 3 elements, $NL, $LL and $LL_IL; $lesions contains 3 elements, $perCase, $IDs and $weights; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID:

  • ratings$NL, num [1:2, 1:4, 1:89, 1:17], ratings of non-lesion localizations, NLs

  • ratings$LL, num [1:2, 1:4, 1:42, 1:15], ratings of lesion localizations, LLs

  • ratings$LL_IL, NA, this placeholder is used only for LROC data

  • lesions$perCase, int [1:42], number of lesions per diseased case

  • lesions$IDs, num [1:42, 1:15], numeric labels of lesions on diseased cases

  • lesions$weights, num [1:42, 1:15], weights (or clinical importances) of lesions

  • descriptions$fileName, chr, "dataset06", base name of dataset in 'data' folder

  • descriptions$type, chr "FROC", the data type

  • descriptions$name, chr "MAGNUS", the name of the dataset

  • descriptions$truthTableStr, num [1:2, 1:4, 1:89, 1:16], truth table structure

  • descriptions$design, chr "FCTRL", study design, factorial dataset

  • descriptions$modalityID, chr [1:2] "1" "2", modality labels

  • descriptions$readerID, chr [1:4] "1" "2" "3" "4", reader labels

References

Vikgren J et al. Comparison of Chest Tomosynthesis and Chest Radiography for Detection of Pulmonary Nodules: Human Observer Study of Clinical Cases. Radiology. 2008;249(3):1034-1041.

Examples

res <- str(dataset06)
## PlotEmpiricalOperatingCharacteristics(dataset = dataset06, opChType = "wAFROC")$Plot

Lucy Warren FROC dataset

Description

This is referred to in the book as the "OPT" dataset (for OptiMam). It consists of 162 cases, 81 of which are diseased, interpreted in five treatments (see reference, basically different ways of acquiring the images) by seven radiologists using the FROC paradigm.

Usage

dataset07

Format

A list with 3 elements: $ratings, $lesions and $descriptions. $ratings contains 3 elements, $NL, $LL and $LL_IL; $lesions contains 3 elements, $perCase, $IDs and $weights; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID:

  • ratings$NL, num [1:5, 1:7, 1:162, 1:4], ratings of non-lesion localizations, NLs

  • ratings$LL, num [1:5, 1:7, 1:81, 1:3], ratings of lesion localizations, LLs

  • ratings$LL_IL, NA, this placeholder is used only for LROC data

  • lesions$perCase, int [1:81], number of lesions per diseased case

  • lesions$IDs, num [1:81, 1:3], numeric labels of lesions on diseased cases

  • lesions$weights, num [1:81, 1:3], weights (or clinical importances) of lesions

  • descriptions$fileName, chr, "dataset07", base name of dataset in 'data' folder

  • descriptions$type, chr "FROC", the data type

  • descriptions$name, chr "LUCY-WARREN", the name of the dataset

  • descriptions$truthTableStr, num [1:5, 1:7, 1:162, 1:4], truth table structure

  • descriptions$design, chr "FCTRL", study design, factorial dataset

  • descriptions$modalityID, chr [1:5] "1" "2" "3" "4" ..., modality labels

  • descriptions$readerID, chr [1:7] "1" "2" "3" "4" ..., reader labels

References

Warren LM, Mackenzie A, Cooke J, et al. Effect of image quality on calcification detection in digital mammography. Medical Physics. 2012;39(6):3202-3213.

Examples

res <- str(dataset07)
## PlotEmpiricalOperatingCharacteristics(dataset = dataset07, opChType = "wAFROC")$Plot

Monica Penedo ROC dataset

Description

This is referred to in the book as the "PEN" dataset. It consists of 112 cases, 64 of which are diseased, interpreted in five treatments (basically different image compression algorithms) by five radiologists using the FROC paradigm (the inferred ROC dataset is included; the original FROC data is lost).

Usage

dataset08

Format

A list with 3 elements: $ratings, $lesions and $descriptions. $ratings contains 3 elements, $NL, $LL and $LL_IL; $lesions contains 3 elements, $perCase, $IDs and $weights; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID:

  • ratings$NL, num [1:5, 1:5, 1:112, 1], ratings of non-lesion localizations, NLs

  • ratings$LL, num [1:5, 1:5, 1:64, 1], ratings of lesion localizations, LLs

  • ratings$LL_IL, NA, this placeholder is used only for LROC data

  • lesions$perCase, int [1:64], number of lesions per diseased case

  • lesions$IDs, num [1:64, 1], numeric labels of lesions on diseased cases

  • lesions$weights, num [1:64, 1], weights (or clinical importances) of lesions

  • descriptions$fileName, chr, "dataset08", base name of dataset in 'data' folder

  • descriptions$type, chr "ROC", the data type

  • descriptions$name, chr "PENEDO", the name of the dataset

  • descriptions$truthTableStr, num [1:5, 1:5, 1:112, 1:2], truth table structure

  • descriptions$design, chr "FCTRL", study design, factorial dataset

  • descriptions$modalityID, chr [1:5] "0" "1" "2" "3" ..., modality labels

  • descriptions$readerID, chr [1:5] "0" "1" "2" "3" ..., reader labels

References

Penedo et al. Free-Response Receiver Operating Characteristic Evaluation of Lossy JPEG2000 and Object-based Set Partitioning in Hierarchical Trees Compression of Digitized Mammograms. Radiology. 2005;237(2):450-457.

Examples

res <- str(dataset08)
## PlotEmpiricalOperatingCharacteristics(dataset = dataset08, opChType = "ROC")$Plot

Nico Karssemeijer ROC dataset (CAD vs. radiologists)

Description

This is referred to in the book as the "NICO" dataset. It consists of 200 mammograms, 80 of which contain one malignant mass, interpreted by a CAD system and nine radiologists using the LROC paradigm. The first reader is CAD. The highest rating was used to convert this to an ROC dataset. The original LROC data is datasetCadLroc. Analyzing this data requires methods described in the book, implemented in the function StCadVsRad.

Usage

dataset09

Format

A list with 3 elements: $ratings, $lesions and $descriptions. $ratings contains 3 elements, $NL, $LL and $LL_IL; $lesions contains 3 elements, $perCase, $IDs and $weights; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID:

  • ratings$NL, num [1, 1:10, 1:200, 1], ratings of non-lesion localizations, NLs

  • ratings$LL, num [1, 1:10, 1:80, 1], ratings of lesion localizations, LLs

  • ratings$LL_IL, NA, this placeholder is used only for LROC data

  • lesions$perCase, int [1:80], number of lesions per diseased case

  • lesions$IDs, num [1:80, 1], numeric labels of lesions on diseased cases

  • lesions$weights, num [1:80, 1], weights (or clinical importances) of lesions

  • descriptions$fileName, chr, "dataset09", base name of dataset in 'data' folder

  • descriptions$type, chr "ROC", the data type

  • descriptions$name, chr "NICO-CAD-ROC", the name of the dataset

  • descriptions$truthTableStr, num [1, 1:10, 1:200, 1:2], truth table structure

  • descriptions$design, chr "FCTRL", study design, factorial dataset

  • descriptions$modalityID, chr "1", modality label(s)

  • descriptions$readerID, chr [1:10] "1" "2" "3" "4" ..., reader labels

References

Hupse R et al. Standalone computer-aided detection compared to radiologists' performance for the detection of mammographic masses. Eur Radiol. 2013;23(1):93-100.

Examples

res <- str(dataset09)
## PlotEmpiricalOperatingCharacteristics(dataset = dataset09, rdrs = 1:10, opChType = "ROC")$Plot

Mark Ruschin ROC dataset

Description

This is referred to in the book as the "RUS" dataset. It consists of 90 cases, 40 of which are diseased. The images were acquired at three dose levels, which can be regarded as treatments. Eight radiologists interpreted the cases using the FROC paradigm. These have been reduced to ROC data by using the highest ratings (the original FROC data is lost).

Usage

dataset10

Format

A list with 3 elements: $ratings, $lesions and $descriptions. $ratings contains 3 elements, $NL, $LL and $LL_IL; $lesions contains 3 elements, $perCase, $IDs and $weights; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID:

  • ratings$NL, num [1:3, 1:8, 1:90, 1], ratings of non-lesion localizations, NLs

  • ratings$LL, num [1:3, 1:8, 1:40, 1], ratings of lesion localizations, LLs

  • ratings$LL_IL, NA, this placeholder is used only for LROC data

  • lesions$perCase, int [1:40], number of lesions per diseased case

  • lesions$IDs, num [1:40, 1], numeric labels of lesions on diseased cases

  • lesions$weights, num [1:40, 1], weights (or clinical importances) of lesions

  • descriptions$fileName, chr, "dataset10", base name of dataset in 'data' folder

  • descriptions$type, chr "ROC", the data type

  • descriptions$name, chr "RUSCHIN", the name of the dataset

  • descriptions$truthTableStr, num [1:3, 1:8, 1:90, 1:2], truth table structure

  • descriptions$design, chr "FCTRL", study design, factorial dataset

  • descriptions$modalityID, chr [1:3] "1" "2" "3", modality label(s)

  • descriptions$readerID, chr [1:8] "1" "2" "3" "4" ..., reader labels

References

Ruschin M, et al. Dose dependence of mass and microcalcification detection in digital mammography: free response human observer studies. Med Phys. 2007;34:400 - 407.

Examples

res <- str(dataset10)
## PlotEmpiricalOperatingCharacteristics(dataset = dataset10, opChType = "ROC")$Plot
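
The reduction of FROC data to ROC data by taking the highest rating, mentioned in the description of this dataset, can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's conversion code: the small NL and LL matrices are hypothetical ratings for a single treatment and reader, unmarked locations are coded as -Inf, and the non-diseased cases are assumed to occupy the first rows of NL.

```r
# Hypothetical FROC ratings for one treatment and one reader; -Inf = no mark
K1 <- 3  # non-diseased cases
K2 <- 2  # diseased cases
NL <- rbind(c(1, -Inf), c(-Inf, -Inf), c(2, 3),  # NLs on non-diseased cases
            c(-Inf, -Inf), c(4, -Inf))           # NLs on diseased cases
LL <- rbind(c(5, -Inf), c(2, 3))                 # LLs on diseased cases

# Non-diseased case: the ROC rating is the highest NL rating on the case
fp <- apply(NL[1:K1, , drop = FALSE], 1, max)

# Diseased case: the ROC rating is the highest of all NL and LL ratings
tp <- pmax(apply(NL[K1 + 1:K2, , drop = FALSE], 1, max),
           apply(LL, 1, max))

fp  # 1 -Inf 3
tp  # 5 4
```

A case with no marks keeps the rating -Inf in this sketch; how such cases are coded in a stored ROC dataset is a separate choice.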

Dobbins 1 FROC dataset

Description

This is referred to in the book as the "DOB1" dataset. Dobbins et al. conducted a multi-institutional MRMC study to compare the performance of digital tomosynthesis (GE's VolumeRad device), dual-energy (DE) imaging, and conventional chest radiography for pulmonary nodule detection and management. All study images were obtained with a flat-panel detector developed by GE. The case set consisted of 158 subjects, of which 43 were non-diseased and the rest had 1 - 20 pulmonary nodules independently verified, using CT images, by 3 experts who did not participate in the observer study. Data were collected using the FROC paradigm. There are 4 treatments, labeled 1 - 4: conventional chest x-ray (CXR), CXR augmented with dual-energy (CXR+DE), VolumeRad digital tomosynthesis images, and VolumeRad augmented with DE (VolumeRad+DE).

Usage

dataset11

Format

A list with 3 elements: $ratings, $lesions and $descriptions. $ratings contains 3 elements, $NL, $LL and $LL_IL; $lesions contains 3 elements, $perCase, $IDs and $weights; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID:

  • ratings$NL, num [1:4, 1:5, 1:158, 1:4], ratings of non-lesion localizations, NLs

  • ratings$LL, num [1:4, 1:5, 1:115, 1:20], ratings of lesion localizations, LLs

  • ratings$LL_IL, NA, this placeholder is used only for LROC data

  • lesions$perCase, int [1:115], number of lesions per diseased case

  • lesions$IDs, num [1:115, 1:20], numeric labels of lesions on diseased cases

  • lesions$weights, num [1:115, 1:20], weights (or clinical importances) of lesions

  • descriptions$fileName, chr, "dataset11", base name of dataset in 'data' folder

  • descriptions$type, chr "FROC", the data type

  • descriptions$name, chr "DOBBINS-1", the name of the dataset

  • descriptions$truthTableStr, num [1:4, 1:5, 1:158, 1:21], truth table structure

  • descriptions$design, chr "FCTRL", study design, factorial dataset

  • descriptions$modalityID, chr [1:4] "1" "2" "3" "4", modality label(s)

  • descriptions$readerID, chr [1:5] "1" "2" "3" "4" ..., reader labels

References

Dobbins III JT et al. Multi-Institutional Evaluation of Digital Tomosynthesis, Dual-Energy Radiography, and Conventional Chest Radiography for the Detection and Management of Pulmonary Nodules. Radiology. 2016;282(1):236-250.

Examples

res <- str(dataset11)

Dobbins 2 ROC dataset

Description

This is referred to in the code as the "DOB2" dataset. It contains actionability ratings, i.e., "do you recommend further follow-up on the patient?", on a scale of 1 (definitely not) to 5 (definitely yes), which is effectively an ROC dataset using a 5-point rating scale.

Usage

dataset12

Format

A list with 3 elements: $ratings, $lesions and $descriptions. $ratings contains 3 elements, $NL, $LL and $LL_IL; $lesions contains 3 elements, $perCase, $IDs and $weights; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID:

  • ratings$NL, num [1:4, 1:5, 1:152, 1], ratings of non-lesion localizations, NLs

  • ratings$LL, num [1:4, 1:5, 1:88, 1], ratings of lesion localizations, LLs

  • ratings$LL_IL, NA, this placeholder is used only for LROC data

  • lesions$perCase, int [1:88], number of lesions per diseased case

  • lesions$IDs, num [1:88, 1], numeric labels of lesions on diseased cases

  • lesions$weights, num [1:88, 1], weights (or clinical importances) of lesions

  • descriptions$fileName, chr, "dataset12", base name of dataset in 'data' folder

  • descriptions$type, chr "ROC", the data type

  • descriptions$name, chr "DOBBINS-2", the name of the dataset

  • descriptions$truthTableStr, num [1:4, 1:5, 1:152, 1:2], truth table structure

  • descriptions$design, chr "FCTRL", study design, factorial dataset

  • descriptions$modalityID, chr [1:4] "1" "2" "3" "4", modality label(s)

  • descriptions$readerID, chr [1:5] "1" "2" "3" "4" ..., reader labels

References

Dobbins III JT et al. Multi-Institutional Evaluation of Digital Tomosynthesis, Dual-Energy Radiography, and Conventional Chest Radiography for the Detection and Management of Pulmonary Nodules. Radiology. 2016;282(1):236-250.

Examples

res <- str(dataset12)

Dobbins 3 FROC dataset

Description

This is referred to in the code as the "DOB3" dataset. This is a subset of DOB1 which includes data for lesions not visible on CXR, but visible to the truth panel on all treatments.

Usage

dataset13

Format

A list with 3 elements: $ratings, $lesions and $descriptions. $ratings contains 3 elements, $NL, $LL and $LL_IL; $lesions contains 3 elements, $perCase, $IDs and $weights; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID:

  • ratings$NL, num [1:4, 1:5, 1:158, 1:4], ratings of non-lesion localizations, NLs

  • ratings$LL, num [1:4, 1:5, 1:106, 1:15], ratings of lesion localizations, LLs

  • ratings$LL_IL, NA, this placeholder is used only for LROC data

  • lesions$perCase, int [1:106], number of lesions per diseased case

  • lesions$IDs, num [1:106, 1:15], numeric labels of lesions on diseased cases

  • lesions$weights, num [1:106, 1:15], weights (or clinical importances) of lesions

  • descriptions$fileName, chr, "dataset13", base name of dataset in 'data' folder

  • descriptions$type, chr "FROC", the data type

  • descriptions$name, chr "DOBBINS-3", the name of the dataset

  • descriptions$truthTableStr, num [1:4, 1:5, 1:158, 1:16], truth table structure

  • descriptions$design, chr "FCTRL", study design, factorial dataset

  • descriptions$modalityID, chr [1:4] "1" "2" "3" "4", modality label(s)

  • descriptions$readerID, chr [1:5] "1" "2" "3" "4" ..., reader labels

References

Dobbins III JT et al. Multi-Institutional Evaluation of Digital Tomosynthesis, Dual-Energy Radiography, and Conventional Chest Radiography for the Detection and Management of Pulmonary Nodules. Radiology. 2016;282(1):236-250.

Examples

res <- str(dataset13)

Federica Zanca real (as opposed to inferred) ROC dataset

Description

This is referred to in the book as the "FZR" dataset. It is a real ROC study, conducted on the same images and using the same radiologists, on treatments "4" and "5" of dataset04. This was compared to highest rating inferred ROC data from dataset04 to conclude, erroneously, that the highest rating assumption is invalid. See book Section 13.6 and run "~/GitHub/RJafroc/inst/InferredVsReal/InferredVsReal.R".

Usage

dataset14

Format

A list with 3 elements: $ratings, $lesions and $descriptions. $ratings contains 3 elements, $NL, $LL and $LL_IL; $lesions contains 3 elements, $perCase, $IDs and $weights; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID:

  • ratings$NL, num [1:2, 1:4, 1:200, 1], ratings of non-lesion localizations, NLs

  • ratings$LL, num [1:2, 1:4, 1:100, 1], ratings of lesion localizations, LLs

  • ratings$LL_IL, NA, this placeholder is used only for LROC data

  • lesions$perCase, int [1:100], number of lesions per diseased case

  • lesions$IDs, num [1:100, 1], numeric labels of lesions on diseased cases

  • lesions$weights, num [1:100, 1], weights (or clinical importances) of lesions

  • descriptions$fileName, chr, "dataset14", base name of dataset in 'data' folder

  • descriptions$type, chr "ROC", the data type

  • descriptions$name, chr "FEDERICA-REAL-ROC", the name of the dataset

  • descriptions$truthTableStr, num [1:2, 1:4, 1:200, 1:2], truth table structure

  • descriptions$design, chr "FCTRL", study design, factorial dataset

  • descriptions$modalityID, chr [1:2] "4" "5", modality label(s)

  • descriptions$readerID, chr [1:4] "1" "2" "3" "4", reader labels

References

Zanca F, Hillis SL, Claus F, et al (2012) Correlation of free-response and receiver-operating-characteristic area-under-the-curve estimates: Results from independently conducted FROC/ROC studies in mammography. Med Phys. 39(10):5917-5929.

Examples

res <- str(dataset14)

Binned dataset suitable for checking FitCorCbm; seed = 123

Description

A binned dataset suitable for analysis by FitCorCbm. It was generated by DfCreateCorCbmDataset by setting the seed variable to 123. Note the formatting of the data as a single-modality, two-reader dataset, even though the actual pairing might be different; see FitCorCbm. The dataset is intentionally large so as to demonstrate the asymptotic convergence of the ML estimates produced by FitCorCbm to the population values. The data were generated using the following argument values to DfCreateCorCbmDataset: seed = 123, K1 = 5000, K2 = 5000, desiredNumBins = 5, muX = 1.5, muY = 3, alphaX = 0.4, alphaY = 0.7, rhoNor = 0.3, rhoAbn2 = 0.8.

Usage

datasetBinned123

Format

A list with 3 elements: $ratings, $lesions and $descriptions. $ratings contains 3 elements, $NL, $LL and $LL_IL; $lesions contains 3 elements, $perCase, $IDs and $weights; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID:

  • ratings$NL, num [1, 1:2, 1:10000, 1], ratings of non-lesion localizations, NLs

  • ratings$LL, num [1, 1:2, 1:5000, 1], ratings of lesion localizations, LLs

  • ratings$LL_IL, NA, this placeholder is used only for LROC data

  • lesions$perCase, int [1:5000], number of lesions per diseased case

  • lesions$IDs, num [1:5000, 1], numeric labels of lesions on diseased cases

  • lesions$weights, num [1:5000, 1], weights (or clinical importances) of lesions

  • descriptions$fileName, chr, "datasetBinned123", base name of dataset in 'data' folder

  • descriptions$type, chr "ROC", the data type

  • descriptions$name, chr "SIM-CORCBM-SEED-123", the name of the dataset

  • descriptions$truthTableStr, NA, truth table structure

  • descriptions$design, chr "FCTRL-X-MOD", study design, factorial dataset

  • descriptions$modalityID, chr "1", modality label(s)

  • descriptions$readerID, chr [1:2] "1" "2", reader labels

References

Zhai X, Chakraborty DP (2017). A bivariate contaminated binormal model for robust fitting of proper ROC curves to a pair of correlated, possibly degenerate, ROC datasets. Medical Physics. 44(6):2207–2222.

Examples

res <- str(datasetBinned123)
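
For orientation, the following base R sketch samples ratings from a single (univariate) contaminated binormal model and bins them into 5 categories, using the K1, K2, muX and alphaX values quoted in the description. It is not the code of DfCreateCorCbmDataset: the correlation between the two readers (rhoNor, rhoAbn2) is not modeled, and the pooled-quantile binning is an assumption made only for illustration.

```r
# Univariate CBM sampling sketch (illustration only, no reader correlation)
set.seed(123)
K1 <- 5000      # number of non-diseased cases
K2 <- 5000      # number of diseased cases
mu <- 1.5       # separation parameter (muX in the description)
alpha <- 0.4    # fraction of diseased cases with a visible lesion (alphaX)

zNor <- rnorm(K1)                                 # non-diseased: N(0, 1)
visible <- runif(K2) < alpha                      # is the lesion visible?
zAbn <- rnorm(K2, mean = ifelse(visible, mu, 0))  # mixture of N(mu, 1) and N(0, 1)

# Bin all ratings into 5 ordered categories using pooled quintiles (assumed rule)
breaks <- quantile(c(zNor, zAbn), probs = seq(0, 1, by = 0.2))
fpCounts <- tabulate(cut(zNor, breaks, labels = FALSE, include.lowest = TRUE),
                     nbins = 5)
tpCounts <- tabulate(cut(zAbn, breaks, labels = FALSE, include.lowest = TRUE),
                     nbins = 5)
```

With this many cases every bin is well populated, which is the situation in which asymptotic properties of ML estimates are easiest to observe.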

Binned dataset suitable for checking FitCorCbm; seed = 124

Description

A binned dataset suitable for analysis by FitCorCbm. It was generated by DfCreateCorCbmDataset by setting the seed variable to 124. Otherwise similar to datasetBinned123.

Usage

datasetBinned124

Format

A list with 3 elements: $ratings, $lesions and $descriptions; $ratings contain 3 elements, $NL, $LL and $LL_IL as sub-lists; $lesions contain 3 elements, $perCase, $IDs and $weights as sub-lists; $descriptions contain 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID as sub-lists;

  • rating$NL, num [1, 1:2, 1:10000, 1], ratings of non-lesion localizations, NLs

  • rating$LL, num [1, 1:2, 1:5000, 1], ratings of lesion localizations, LLs

  • rating$LL_IL, NA, this placeholder is used only for LROC data

  • lesions$perCase, int [1:5000], number of lesions per diseased case

  • lesions$IDs, num [1:5000, 1] , numeric labels of lesions on diseased cases

  • lesions$weights, num [1:5000, 1], weights (or clinical importances) of lesions

  • descriptions$fileName, chr, "datasetBinned124", base name of dataset in 'data' folder

  • descriptions$type, chr "ROC", the data type

  • descriptions$name, chr "SIM-CORCBM-SEED-124", the name of the dataset

  • descriptions$truthTableStr, NA, truth table structure

  • descriptions$design, chr "FCTRL-X-MOD", study design, factorial dataset

  • descriptions$modalityID, chr "1", modality label(s)

  • descriptions$readerID, chr [1:2] "1" "2", reader labels

References

Zhai X, Chakraborty DP (2017). A bivariate contaminated binormal model for robust fitting of proper ROC curves to a pair of correlated, possibly degenerate, ROC datasets. Medical Physics. 44(6):2207–2222.

Examples

res <- str(datasetBinned124)

Binned dataset suitable for checking FitCorCbm; seed = 125

Description

A binned dataset suitable for analysis by FitCorCbm. It was generated by DfCreateCorCbmDataset by setting the seed variable to 125. Otherwise similar to datasetBinned123.

Usage

datasetBinned125

Format

A list with 3 elements: $ratings, $lesions and $descriptions; $ratings contains 3 elements, $NL, $LL and $LL_IL, as sub-lists; $lesions contains 3 elements, $perCase, $IDs and $weights, as sub-lists; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID, as sub-lists.

  • rating$NL, num [1, 1:2, 1:10000, 1], ratings of non-lesion localizations, NLs

  • rating$LL, num [1, 1:2, 1:5000, 1], ratings of lesion localizations, LLs

  • rating$LL_IL, NA, this placeholder is used only for LROC data

  • lesions$perCase, int [1:5000], number of lesions per diseased case

  • lesions$IDs, num [1:5000, 1] , numeric labels of lesions on diseased cases

  • lesions$weights, num [1:5000, 1], weights (or clinical importances) of lesions

  • descriptions$fileName, chr, "datasetBinned125", base name of dataset in 'data' folder

  • descriptions$type, chr "ROC", the data type

  • descriptions$name, chr "SIM-CORCBM-SEED-125", the name of the dataset

  • descriptions$truthTableStr, NA, truth table structure

  • descriptions$design, chr "FCTRL-X-MOD", study design, factorial dataset

  • descriptions$modalityID, chr "1", modality label(s)

  • descriptions$readerID, chr [1:2] "1" "2", reader labels

References

Zhai X, Chakraborty DP (2017). A bivariate contaminated binormal model for robust fitting of proper ROC curves to a pair of correlated, possibly degenerate, ROC datasets. Medical Physics. 44(6):2207–2222.

Examples

res <- str(datasetBinned125)

Nico Karssemeijer LROC dataset (CAD vs. radiologists)

Description

This is the actual LROC data corresponding to dataset09, which was the inferred ROC data. Note that the LL field is split into two, LL, representing true positives where the lesions were correctly localized, and LL_IL, representing true positives where the lesions were incorrectly localized. The first reader is CAD and the remaining readers are radiologists.

Usage

datasetCadLroc

Format

A list with 3 elements: $ratings, $lesions and $descriptions; $ratings contains 3 elements, $NL, $LL and $LL_IL, as sub-lists; $lesions contains 3 elements, $perCase, $IDs and $weights, as sub-lists; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID, as sub-lists.

  • rating$NL, num [1, 1:10, 1:200, 1], ratings of localizations on normal cases

  • rating$LL, num [1, 1:10, 1:80, 1], ratings of correct localizations on abnormal cases

  • rating$LL_IL, num [1, 1:10, 1:80, 1], ratings of incorrect localizations on abnormal cases

  • lesions$perCase, int [1:80], number of lesions per diseased case

  • lesions$IDs, num [1:80, 1] , numeric labels of lesions on diseased cases

  • lesions$weights, num [1:80, 1], weights (or clinical importances) of lesions

  • descriptions$fileName, chr, "datasetCadLroc", base name of dataset in 'data' folder

  • descriptions$type, chr "LROC", the data type

  • descriptions$name, chr "NICO-CAD-LROC", the name of the dataset

  • descriptions$truthTableStr, num [1:2, 1:4, 1:200, 1:2], truth table structure

  • descriptions$design, chr "FCTRL", study design, factorial dataset

  • descriptions$modalityID, chr "1", modality label(s)

  • descriptions$readerID, chr [1:10] "1" "2" "3" "4" ..., reader labels

References

Hupse R et al. Standalone computer-aided detection compared to radiologists' performance for the detection of mammographic masses. Eur Radiol. 2013;23(1):93-100.

Examples

res <- str(datasetCadLroc)

Simulated FROC CAD vs. RAD dataset

Description

Simulated FROC CAD vs. RAD dataset suitable for checking code. It was generated from datasetCadLroc using SimulateFrocFromLrocData.R. The LROC paradigm always yields a single mark per case. Therefore the equivalent FROC dataset will also have only one mark per case. The NL arrays of the two datasets are identical. The LL array is created by copying the LL (correct localization) array of the LROC dataset to the LL array of the FROC dataset, from diseased case index k2 = 1 to k2 = K2. Additionally, the LL_IL array of the LROC dataset is copied to the NL array of the FROC dataset, from case index k1 = K1+1 to k1 = K1+K2. Any zero ratings are replaced by -Inf. The equivalent FROC dataset has the same HrAuc as the original LROC dataset. See example. The main use of this dataset and function is to test the CAD significance testing functions using CAD FROC datasets, which I currently don't have.

Usage

datasetCadSimuFroc

Format

A list with 3 elements: $ratings, $lesions and $descriptions; $ratings contains 3 elements, $NL, $LL and $LL_IL, as sub-lists; $lesions contains 3 elements, $perCase, $IDs and $weights, as sub-lists; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID, as sub-lists.

  • rating$NL, num [1, 1:10, 1:200, 1], ratings of non-lesion localizations, NLs

  • rating$LL, num [1, 1:10, 1:80, 1], ratings of lesion localizations, LLs

  • rating$LL_IL, NA, this placeholder is used only for LROC data

  • lesions$perCase, int [1:80], number of lesions per diseased case

  • lesions$IDs, num [1:80, 1] , numeric labels of lesions on diseased cases

  • lesions$weights, num [1:80, 1], weights (or clinical importances) of lesions

  • descriptions$fileName, chr, "datasetCadSimuFroc", base name of dataset in 'data' folder

  • descriptions$type, chr "LROC", the data type

  • descriptions$name, chr "NICO-CAD-LROC", the name of the dataset

  • descriptions$truthTableStr, num [1:2, 1:4, 1:200, 1:2], truth table structure

  • descriptions$design, chr "FCTRL", study design, factorial dataset

  • descriptions$modalityID, chr "1", modality label(s)

  • descriptions$readerID, chr [1:10] "1" "2" "3" "4" ..., reader labels


Simulated degenerate ROC dataset (for testing purposes)

Description

A simulated degenerate dataset. A degenerate dataset is defined as one with no interior operating points on the ROC plot. Such data tend to be observed with expert-level radiologists. This dataset is used to illustrate the robustness of the CBM and RSM fitting models.
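The definition of degeneracy can be checked numerically. The following base-R sketch computes the empirical operating points of a binned ROC dataset and tests whether any of them lies strictly inside the unit square. The helper names `operatingPoints` and `hasInteriorPoint` and the toy ratings are illustrative assumptions; they are not part of RJafroc and are not the contents of datasetDegenerate.

```r
# Empirical ROC operating points for integer ratings: one (FPF, TPF) pair
# per threshold; a case is called positive if its rating >= threshold.
operatingPoints <- function(fp, tp) {
  # Drop the lowest rating: using it as threshold yields the trivial (1, 1) point.
  thresholds <- sort(unique(c(fp, tp)))[-1]
  data.frame(
    FPF = vapply(thresholds, function(t) mean(fp >= t), numeric(1)),
    TPF = vapply(thresholds, function(t) mean(tp >= t), numeric(1))
  )
}

# A point is interior if it lies strictly inside the unit square.
hasInteriorPoint <- function(pts) {
  any(pts$FPF > 0 & pts$FPF < 1 & pts$TPF > 0 & pts$TPF < 1)
}

# Toy degenerate data (made up): no non-diseased case is rated above 1,
# so the only non-trivial operating point sits on the FPF = 0 axis.
fpDegenerate <- rep(1, 15)
tpDegenerate <- c(rep(1, 4), rep(2, 6))
ptsDegenerate <- operatingPoints(fpDegenerate, tpDegenerate)

# Toy non-degenerate data (made up) for comparison.
fpRegular <- c(rep(1, 10), rep(2, 5))
tpRegular <- c(rep(1, 3), rep(2, 7))
ptsRegular <- operatingPoints(fpRegular, tpRegular)

hasInteriorPoint(ptsDegenerate)
hasInteriorPoint(ptsRegular)
```

Models that require at least one interior point to be identifiable cannot be fitted to data like the first toy example.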

Usage

datasetDegenerate

Format

A list with 3 elements: $ratings, $lesions and $descriptions; $ratings contains 3 elements, $NL, $LL and $LL_IL, as sub-lists; $lesions contains 3 elements, $perCase, $IDs and $weights, as sub-lists; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID, as sub-lists.

  • rating$NL, num [1, 1, 1:15, 1], ratings of non-lesion localizations, NLs

  • rating$LL, num [1, 1, 1:10, 1], ratings of lesion localizations, LLs

  • rating$LL_IL, NA, this placeholder is used only for LROC data

  • lesions$perCase, int [1:10], number of lesions per diseased case

  • lesions$IDs, num [1:10, 1] , numeric labels of lesions on diseased cases

  • lesions$weights, num [1:10, 1], weights (or clinical importances) of lesions

  • descriptions$fileName, chr, "datasetDegenerate", base name of dataset in 'data' folder

  • descriptions$type, chr "ROC", the data type

  • descriptions$name, chr "SIM-DEGENERATE", the name of the dataset

  • descriptions$truthTableStr, NA, truth table structure

  • descriptions$design, chr "FCTRL-X-MOD", study design, factorial dataset

  • descriptions$modalityID, chr "1", modality label(s)

  • descriptions$readerID, chr "1", reader labels

Examples

res <- str(datasetDegenerate)

Simulated FROC SPLIT-PLOT-C dataset

Description

Simulated FROC SPLIT-PLOT-C dataset

Usage

datasetFROCSpC

Format

A list with 3 elements: $ratings, $lesions and $descriptions; $ratings contains 3 elements, $NL, $LL and $LL_IL, as sub-lists; $lesions contains 3 elements, $perCase, $IDs and $weights, as sub-lists; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID, as sub-lists.

  • rating$NL, num [1:2, 1:4, 1:200, 1:7], ratings of non-lesion localizations, NLs

  • rating$LL, num [1:2, 1:4, 1:100, 1:3], ratings of lesion localizations, LLs

  • rating$LL_IL, NA, this placeholder is used only for LROC data

  • lesions$perCase, int [1:100], number of lesions per diseased case

  • lesions$IDs, num [1:100, 1:3] , numeric labels of lesions on diseased cases

  • lesions$weights, num [1:100, 1:3], weights (or clinical importances) of lesions

  • descriptions$fileName, chr, "datasetFROCSpC", base name of dataset in 'data' folder

  • descriptions$type, chr "FROC", the data type

  • descriptions$name, chr "SIM-FROC-SPLIT-PLOT-C", the name of the dataset

  • descriptions$truthTableStr, NA, truth table structure

  • descriptions$design, chr "FCTRL-X-MOD", study design, factorial dataset

  • descriptions$modalityID, chr [1:2] "4" "5", treatment label(s)

  • descriptions$readerID, chr [1:4] "1" "3" "4" "5", reader labels


Simulated ROI dataset

Description

TBA. Simulated ROI dataset: 4 ROIs per case, 5 readers, 50 non-diseased and 40 diseased cases are assumed.

Usage

datasetROI

Format

A list with 3 elements: $ratings, $lesions and $descriptions; $ratings contains 3 elements, $NL, $LL and $LL_IL, as sub-lists; $lesions contains 3 elements, $perCase, $IDs and $weights, as sub-lists; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID, as sub-lists.

  • rating$NL, num [1:2, 1:5, 1:90, 1:4], ratings of non-lesion localizations, NLs

  • rating$LL, num [1:2, 1:5, 1:40, 1:4], ratings of lesion localizations, LLs

  • rating$LL_IL, NA, this placeholder is used only for LROC data

  • lesions$perCase, int [1:40], number of lesions per diseased case

  • lesions$IDs, num [1:40, 1:4] , numeric labels of lesions on diseased cases

  • lesions$weights, num [1:40, 1:4], weights (or clinical importances) of lesions

  • descriptions$fileName, chr, "datasetROI", base name of dataset in 'data' folder

  • descriptions$type, chr "ROI", the data type

  • descriptions$name, chr "SIM-ROI", the name of the dataset

  • descriptions$truthTableStr, NA, truth table structure

  • descriptions$design, chr "FCTRL-X-MOD", study design, factorial dataset

  • descriptions$modalityID, chr [1:2] "1" "2", modality label(s)

  • descriptions$readerID, chr [1:5] "1" "2" "3" "4" ..., reader labels

Examples

res <- str(datasetROI)

John Thompson crossed modality FROC dataset

Description

This is a crossed modality dataset, see book Section 18.5. There are two modality factors. The first modality factor modalityID1 can be "F" or "I", which represent two CT reconstruction algorithms. The second modality factor modalityID2 can be "20" "40" "60" "80", which represent the mAs values of the image acquisition. The factors are fully crossed.

Usage

datasetX

Format

A list with 3 elements: $ratings, $lesions and $descriptions; $ratings contains 3 elements, $NL, $LL and $LL_IL, as sub-lists; $lesions contains 3 elements, $perCase, $IDs and $weights, as sub-lists; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID, as sub-lists.

  • rating$NL, num [1:2, 1:4, 1:11, 1:68, 1:5], ratings of non-lesion localizations, NLs

  • rating$LL, num [1:2, 1:4, 1:11, 1:34, 1:3], ratings of lesion localizations, LLs

  • rating$LL_IL, NA, this placeholder is used only for LROC data

  • lesions$perCase, int [1:34], number of lesions per diseased case

  • lesions$IDs, num [1:34, 1:3] , numeric labels of lesions on diseased cases

  • lesions$weights, num [1:34, 1:3], weights (or clinical importances) of lesions

  • descriptions$fileName, chr, "datasetX", base name of dataset in 'data' folder

  • descriptions$type, chr "FROC", the data type

  • descriptions$name, chr "THOMPSON-X-MOD", the name of the dataset

  • descriptions$truthTableStr, NA, truth table structure

  • descriptions$design, chr "FCTRL-X-MOD", study design, factorial dataset

  • descriptions$modalityID, chr [1:2] "F" "I", modality label(s)

  • descriptions$readerID, chr [1:4] "20" "40" "60" "80", reader labels

References

Thompson JD, Chakraborty DP, et al. (2016) Effect of reconstruction methods and x-ray tube current-time product on nodule detection in an anthropomorphic thorax phantom: a crossed-modality JAFROC observer study. Medical Physics. 43(3):1265-1274.

Examples

res <- str(datasetX)

Convert ratings arrays to an RJafroc dataset

Description

Converts ratings arrays, ROC or FROC, but not LROC, to an RJafroc dataset, thereby allowing the user to leverage the file I/O, plotting and analysis capabilities of RJafroc.

Usage

Df2RJafrocDataset(NL, LL, InputIsCountsTable = FALSE, ...)

Arguments

NL

Non-lesion localizations array (or FP array for ROC data).

LL

Lesion localizations array (or TP array for ROC data).

InputIsCountsTable

If TRUE, the NL and LL arrays are rating-counts tables, with common length equal to the number of ratings R. If FALSE, the default, they are arrays of lengths K1, the number of non-diseased cases, and K2, the number of diseased cases, respectively.

...

Other elements of RJafroc dataset that may, depending on the context, need to be specified. perCase must be specified if an FROC dataset is to be returned. It is a K2-length array specifying the numbers of lesions in each diseased case in the dataset.

Details

The function "senses" the data type (ROC or FROC) from the absence or presence of perCase.

  • ROC data can be NL[1:K1] and LL[1:K2] or NL[1:I,1:J,1:K1] and LL[1:I,1:J,1:K2].

  • FROC data can be NL[1:K1,1:maxNL] and LL[1:K2, 1:maxLL] or NL[1:I,1:J,1:K1,1:maxNL] and LL[1:I,1:J,1:K2,1:maxLL].

Here maxNL/maxLL = maximum numbers of NLs/LLs, per case, over entire dataset. Equal weights are assigned to every lesion (FROC data). Consecutive characters/integers starting with "1" are assigned to IDs, modalityID and readerID.
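The relation between the two input forms for ROC data (rating-counts tables versus one rating per case) can be sketched in base R. The helper `countsToRatings` is a hypothetical name used only for illustration, and how Df2RJafrocDataset handles counts tables internally is not stated here, so this should be read as an assumption about the equivalence of the two forms rather than as the package's code.

```r
# Rating-counts tables: element r is the number of cases given rating r.
K1t <- c(30, 19, 8, 2, 1)   # non-diseased cases, ratings 1..5
K2t <- c(5, 6, 5, 12, 22)   # diseased cases, ratings 1..5

# Expand a counts table into one integer rating per case.
countsToRatings <- function(counts) rep(seq_along(counts), times = counts)

NL <- countsToRatings(K1t)  # length K1 = sum(K1t)
LL <- countsToRatings(K2t)  # length K2 = sum(K2t)

# Tabulating the per-case ratings recovers the original counts table.
recoveredK1t <- tabulate(NL, nbins = length(K1t))
recoveredK2t <- tabulate(LL, nbins = length(K2t))
```

Both tables share the common length R = 5, while the expanded arrays have lengths K1 and K2, matching the description of InputIsCountsTable.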

Value

A dataset with the structure described in RJafroc-package.

Examples

## Input as ratings arrays
set.seed(1);NL <- rnorm(5);LL <- rnorm(7)*1.5 + 2
dataset <- Df2RJafrocDataset(NL, LL)

## Input as counts tables
K1t <- c(30, 19, 8, 2, 1)
K2t <- c(5,  6, 5, 12, 22)
dataset <- Df2RJafrocDataset(K1t, K2t, InputIsCountsTable = TRUE)

Returns a binned dataset

Description

Bins continuous (i.e., floating point) or quasi-continuous (e.g., integers 0-100) ratings in a dataset and returns the corresponding binned dataset, in which the ratings are integers 1, 2, ..., with higher values representing greater confidence in the presence of disease.

Usage

DfBinDataset(dataset, desiredNumBins = 7, opChType)

Arguments

dataset

The dataset to be binned, with structure as in RJafroc-package.

desiredNumBins

The desired number of bins. The default is 7.

opChType

The operating characteristic relevant to the binning operation: "ROC", "FROC", "AFROC", or "wAFROC".

Details

For small datasets the number of bins may be smaller than desiredNumBins. The algorithm needs to know the type of operating characteristic relevant to the binning operation. For ROC the bins are FP and TP counts, for FROC the bins are NL and LL counts, for AFROC the bins are FP and LL counts, and for wAFROC the bins are FP and wLL counts. Binning is generally employed prior to fitting a statistical model to the data, e.g., by maximum likelihood. This version chooses cutoffs so as to maximize empirical AUC (this yields a unique choice of cutoffs which gives the reader the maximum deserved credit).
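The effect of binning on ROC ratings can be illustrated with a base-R sketch. The cutoffs here are taken from pooled quantiles, which is a simpler stand-in and not the AUC-maximizing choice described above. The helper names `wilcoxonAuc` and `binRatings` and the simulated ratings are assumptions made for illustration.

```r
# Wilcoxon statistic (empirical AUC): P(tp > fp) + 0.5 * P(tp == fp).
wilcoxonAuc <- function(fp, tp) {
  mean(outer(tp, fp, ">") + 0.5 * outer(tp, fp, "=="))
}

# Bin ratings into integers 1, 2, ... using cutoffs at pooled quantiles.
# unique() may drop tied cutoffs, so small datasets can yield fewer bins.
binRatings <- function(fp, tp, desiredNumBins = 7) {
  probs <- seq(0, 1, length.out = desiredNumBins + 1)
  interior <- probs[-c(1, length(probs))]
  cutoffs <- unique(quantile(c(fp, tp), probs = interior, names = FALSE))
  list(fp = findInterval(fp, cutoffs) + 1L,
       tp = findInterval(tp, cutoffs) + 1L,
       cutoffs = cutoffs)
}

set.seed(1)
fp <- rnorm(50)              # continuous ratings, non-diseased cases
tp <- rnorm(70, mean = 1.5)  # continuous ratings, diseased cases

binned <- binRatings(fp, tp, desiredNumBins = 5)
aucRaw <- wilcoxonAuc(fp, tp)
aucBinned <- wilcoxonAuc(binned$fp, binned$tp)
```

Comparing aucRaw with aucBinned shows how much empirical AUC changes under a given choice of cutoffs, which is the quantity the package's cutoff selection is stated to maximize.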

Value

The binned dataset

References

Miller GA (1956) The Magical Number Seven, Plus or Minus Two: Some limits on our capacity for processing information, The Psychological Review 63, 81-97

Chakraborty DP (2017) Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples, CRC Press, Boca Raton, FL. https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840

Examples

binned <- DfBinDataset(dataset02, desiredNumBins = 3, opChType = "ROC")
binned <- DfBinDataset(dataset05, desiredNumBins = 4, opChType = "ROC")
binned <- DfBinDataset(dataset05, desiredNumBins = 4, opChType = "AFROC")
binned <- DfBinDataset(dataset05, desiredNumBins = 4, opChType = "wAFROC")
binned <- DfBinDataset(dataset05, opChType = "wAFROC", desiredNumBins = 1)
binned <- DfBinDataset(dataset05, opChType = "wAFROC", desiredNumBins = 2)
binned <- DfBinDataset(dataset05, opChType = "wAFROC", desiredNumBins = 3)
## etc.

 

## takes longer than 5 sec on OSX
dataset <- SimulateRocDataset(I = 2, J = 5, K1 = 50, K2 = 70, a = 1, b = 0.5, seed = 123)
datasetB <- DfBinDataset(dataset, desiredNumBins = 7, opChType = "ROC")
fomOrg <- as.matrix(UtilFigureOfMerit(dataset, FOM = "Wilcoxon"))
##print(fomOrg)
fomBinned <- as.matrix(UtilFigureOfMerit(datasetB, FOM = "Wilcoxon"))
##print(fomBinned)
##cat("mean, sd = ", mean(fomOrg), sd(fomOrg), "\n")
##cat("mean, sd = ", mean(fomBinned), sd(fomBinned), "\n")

Create paired dataset for testing FitCorCbm

Description

The paired dataset is generated using bivariate sampling; details are in the referenced publication.

Usage

DfCreateCorCbmDataset(
  seed = 123,
  K1 = 50,
  K2 = 50,
  desiredNumBins = 5,
  muX = 1.5,
  muY = 3,
  alphaX = 0.4,
  alphaY = 0.7,
  rhoNor = 0.3,
  rhoAbn2 = 0.8
)

Arguments

seed

The seed variable, default is 123; set to NULL for truly random seed

K1

The number of non-diseased cases, default is 50

K2

The number of diseased cases, default is 50

desiredNumBins

The desired number of bins; default is 5

muX

The CBM μ parameter in condition X

muY

The CBM μ parameter in condition Y

alphaX

The CBM α parameter in condition X

alphaY

The CBM α parameter in condition Y

rhoNor

The correlation of non-diseased case z-samples

rhoAbn2

The correlation of diseased case z-samples, when disease is visible in both conditions

Details

The ROC data is binned to 5 bins in each condition.

Value

The desired dataset suitable for testing FitCorCbm.

References

Zhai X, Chakraborty DP (2017) A bivariate contaminated binormal model for robust fitting of proper ROC curves to a pair of correlated, possibly degenerate, ROC datasets. Medical Physics. 44(6):2207–2222.

Examples

## seed <- 1 
## this gives unequal numbers of bins in X and Y conditions for 50/50 dataset
dataset <- DfCreateCorCbmDataset()


## this takes very long time!! used to show asymptotic convergence of ML estimates 
## dataset <- DfCreateCorCbmDataset(K1 = 5000, K2 = 5000)

Extract two arms of a pairing from an MRMC ROC dataset

Description

Extract a paired dataset from a larger dataset. The pairing could be two readers in the same modality, or different readers in different treatments, or the same reader in different treatments. If necessary, the data is binned to 5 bins in each condition.

Usage

DfExtractCorCbmDataset(dataset, trts = 1, rdrs = 1)

Arguments

dataset

The original dataset from which the pairing is to be extracted

trts

A vector, maximum length 2, containing the indices of the modality or treatments to be extracted

rdrs

A vector, maximum length 2, containing the indices of the reader or readers to be extracted

Details

The desired pairing is contained in the vectors trts and rdrs. If either has length one, the other must have length two and the pairing is implicit. If both are length two, then the pairing is that implied by the first treatment and the second reader, which is one arm, and the other arm is that implied by the second modality paired with the first reader. Using this method any allowed pairing can be extracted and analyzed by FitCorCbm. The utility of this software is in designing a ratings simulator that is statistically matched to a real dataset.

Value

A 1-modality 2-reader dataset

Examples

## Extract the paired data corresponding to the second and third readers in the first modality
## from the included ROC dataset
dataset11_23 <- DfExtractCorCbmDataset(dataset05, trts = 1, rdrs = c(2,3))

## Extract the paired data corresponding to the third reader in the first and second treatments
dataset12_33 <- DfExtractCorCbmDataset(dataset05, trts = c(1,2), rdrs = 3)

## Extract the data corresponding to the first reader in the first
## modality paired with the data
## from the third reader in the second modality
## (the bin indices are at different positions in the two arrays)
dataset12_13 <- DfExtractCorCbmDataset(dataset05,
trts = c(1,2), rdrs = c(1,3))

Extract a subset of treatments and readers from a dataset

Description

Extract a dataset consisting of a subset of treatments/readers from a larger dataset

Usage

DfExtractDataset(dataset, trts, rdrs)

Arguments

dataset

The original dataset from which the subset is to be extracted

trts

A vector containing the indices of the treatments to be extracted. If this parameter is not supplied, all treatments are extracted.

rdrs

A vector containing the indices of the readers to be extracted. If this parameter is not supplied, all readers are extracted.

Details

Note that trts and rdrs are vectors of indices, not IDs. For example, if the ID of the first reader is "0", the corresponding value in rdrs should be 1, not 0.
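The distinction between IDs and indices can be sketched in base R. The toy array and labels below are made up for illustration; the array layout [modality, reader, case, mark] follows the Format sections of the datasets, and the use of `match` to translate IDs into indices is a suggestion, not something the package is stated to do.

```r
# Toy ratings array with dimensions [modality, reader, case, mark].
readerID   <- c("0", "1", "2")   # reader labels (IDs) as stored in a dataset
modalityID <- c("A", "B")        # modality labels (IDs)
NL <- array(seq_len(2 * 3 * 4 * 1), dim = c(2, 3, 4, 1))

# Translate IDs into the 1-based indices that trts and rdrs expect.
rdrs <- match(c("0", "2"), readerID)   # reader "0" is index 1, reader "2" is index 3
trts <- match("B", modalityID)         # modality "B" is index 2

# Subset by index; drop = FALSE keeps all four dimensions.
NLsubset <- NL[trts, rdrs, , , drop = FALSE]
dim(NLsubset)
```

Passing the ID "0" itself as an index would select nothing, which is why the indices must be used.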

Value

A dataset containing only the specified treatments and readers that were extracted from the original dataset

Examples

## Extract the data corresponding to the second reader in the 
## first modality from an included ROC dataset
ds1 <- DfExtractDataset(dataset05, trts = 1, rdrs = 2)

## Extract the data of the first and third readers in all
## modalities from the included ROC dataset
ds2 <- DfExtractDataset(dataset05, rdrs = c(1, 3))

Simulates an "AUC-equivalent" LROC dataset from an FROC dataset

Description

Simulates a multiple-modality multiple-reader "AUC-equivalent" LROC dataset from a supplied FROC dataset.

Usage

DfFroc2Lroc(dataset)

Arguments

dataset

The FROC dataset to be converted to LROC.

Details

The FROC paradigm can have 0 or more marks per case. However, LROC is restricted to exactly one mark per case. For the NL array of the LROC data, for non-diseased cases, the highest rating of the FROC marks, or -Inf if there are no marks, is copied to case index k1 = 1 to k1 = K1 of the LROC dataset. For each diseased case, if the max LL rating exceeds the max NL rating, then the max LL rating is copied to the LL array, otherwise the max NL rating is copied to the LL_IL array. The max NL rating on each diseased case is then set to -Inf (since the LROC paradigm only allows one mark). The equivalent LROC dataset has the same Wilcoxon AUC as the HrAuc of the original FROC dataset. See example. The main use of this function is to test the significance testing functions using MRMC LROC datasets, which I currently don't have.
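The conversion rule can be sketched in base R for a single modality and reader. The function `froc2lrocSketch`, the matrix layout (one row per case, one column per mark, -Inf for unused slots) and the toy ratings are assumptions for illustration; filling the unused LL or LL_IL slot with -Inf is also an assumption, since the text does not say what it holds.

```r
# Wilcoxon statistic (empirical ROC AUC).
wilcoxonAuc <- function(fp, tp) {
  mean(outer(tp, fp, ">") + 0.5 * outer(tp, fp, "=="))
}

# NL: (K1 + K2) x maxNL matrix, LL: K2 x maxLL matrix, -Inf = no mark.
froc2lrocSketch <- function(NL, LL, K1) {
  K2 <- nrow(LL)
  maxNL <- apply(NL, 1, max)              # highest NL rating per case
  maxLL <- apply(LL, 1, max)              # highest LL rating per diseased case
  maxNLdis <- maxNL[K1 + seq_len(K2)]     # highest NL rating on diseased cases
  correct <- maxLL > maxNLdis             # TRUE: the single LROC mark is a correct localization
  list(NL    = maxNL[seq_len(K1)],
       LL    = ifelse(correct, maxLL, -Inf),
       LL_IL = ifelse(correct, -Inf, maxNLdis))
}

K1 <- 3
NL <- rbind(c(0.5, 1.7), c(-Inf, -Inf), c(0.3, -Inf),   # non-diseased cases
            c(2.0, -Inf), c(0.1, -Inf))                 # diseased cases
LL <- rbind(c(1.0, -Inf), c(1.5, 0.7))

lroc <- froc2lrocSketch(NL, LL, K1)

# Highest-rating AUC of the FROC data versus Wilcoxon AUC of the LROC data.
hrTp <- pmax(apply(LL, 1, max), apply(NL, 1, max)[K1 + 1:2])
aucFroc <- wilcoxonAuc(apply(NL, 1, max)[1:K1], hrTp)
aucLroc <- wilcoxonAuc(lroc$NL, pmax(lroc$LL, lroc$LL_IL))
```

In this toy case the first diseased case becomes an incorrect localization (its NL rating 2.0 exceeds its LL rating 1.0) and the second a correct localization, and the two AUC values agree.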

Value

The AUC-equivalent LROC dataset

Examples

lrocDataset <- DfFroc2Lroc(dataset05)
frocHrAuc <- UtilFigureOfMerit(dataset05, FOM = "HrAuc")   
lrocWilcoxonAuc <- UtilFigureOfMerit(lrocDataset, FOM = "Wilcoxon")
testthat::expect_equal(frocHrAuc, lrocWilcoxonAuc)

Convert an FROC dataset to an ROC dataset

Description

Convert an FROC dataset to a highest rating inferred ROC dataset

Usage

DfFroc2Roc(dataset)

Arguments

dataset

The FROC dataset to be converted, with structure as in RJafroc-package.

Details

The first member of the ROC dataset is NL, whose 3rd dimension has length (K1 + K2), the total number of cases. Ratings of cases (K1 + 1) through (K1 + K2) are -Inf. This is because in an ROC dataset FPs are only possible on non-diseased cases. The second member of the list is LL. Its 3rd dimension has length K2, the number of diseased cases. This is because TPs are only possible on diseased cases. For each case the inferred ROC rating is the highest of all FROC ratings on that case. If a case has no marks, a finite ROC rating, guaranteed to be smaller than the rating on any marked case, is assigned to it. The dataset structure is shown below:

  • NL Ratings array [1:I, 1:J, 1:(K1+K2), 1], of false positives, FPs

  • LL Ratings array [1:I, 1:J, 1:K2, 1], of true positives, TPs

  • perCase array [1:K2], number of lesions per diseased case

  • IDs array [1:K2, 1], labels of lesions on diseased cases

  • weights array [1:K2, 1], weights (or clinical importances) of lesions

  • dataType "ROC", the data type

  • modalityID [1:I] inherited modality labels

  • readerID [1:J] inherited reader labels
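The highest-rating inference can be sketched in base R for a single modality and reader, using the NL and LL values printed in the Examples output of this help page. The function `froc2rocSketch` is a hypothetical name. In this sketch unmarked cases are left at -Inf, as they appear in that printed output; the replacement by a finite rating mentioned above is not attempted.

```r
# NL: (K1 + K2) x maxNL matrix, LL: K2 x maxLL matrix, -Inf = no mark.
froc2rocSketch <- function(NL, LL, K1) {
  K2 <- nrow(LL)
  maxNL <- apply(NL, 1, max)   # highest NL rating on each case
  maxLL <- apply(LL, 1, max)   # highest LL rating on each diseased case
  list(
    # FPs exist only on non-diseased cases; diseased positions are -Inf.
    NL = c(maxNL[seq_len(K1)], rep(-Inf, K2)),
    # TP rating: highest of all marks (NL or LL) on each diseased case.
    LL = pmax(maxLL, maxNL[K1 + seq_len(K2)])
  )
}

# FROC ratings copied from the printed example output (K1 = 3, K2 = 5).
NL <- matrix(-Inf, nrow = 8, ncol = 4)
NL[1, 1:2] <- c(2.4046534, 0.7635935)
NL[3, 1]   <- 0.2522234
NL[4, 1]   <- 0.4356833
NL[8, 1:3] <- c(0.8041895, 0.3773956, 0.1333364)

LL <- matrix(-Inf, nrow = 5, ncol = 2)
LL[2, 1] <- 1.5036080
LL[3, 1] <- 0.8442045
LL[4, 1] <- 1.0467262

roc <- froc2rocSketch(NL, LL, K1 = 3)
roc$NL[1:3]   # FP ratings on the non-diseased cases
roc$LL        # TP ratings on the diseased cases
```

The TP ratings of the first and last diseased cases come from NL marks, since those cases have no LL marks.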

Value

An ROC dataset with finite ratings in NL[,,1:K1,1] and LL[,,1:K2,1].

Examples

rocDataSet <- DfFroc2Roc(dataset05)

## in the following example, because of the smaller number of cases, 
## it is easy to see the process at work:

set.seed(1);K1 <- 3;K2 <- 5
mu <- 1;nu <- 0.5;lambda <- 2;zeta1 <- 0
lambda_i <- Util2Intrinsic(mu,lambda,nu)$lambda_i
nu_i <- Util2Intrinsic(mu,lambda,nu)$nu_i
Lmax <- 2;Lk2 <- floor(runif(K2, 1, Lmax + 1))
frocDataRaw <- SimulateFrocDataset(mu, lambda_i, nu_i, zeta1, I = 1, J = 1, 
K1, K2, perCase = Lk2)
hrData <- DfFroc2Roc(frocDataRaw)

## print("frocDataRaw$ratings$NL[1,1,,] = ")
## print("hrData$ratings$NL[1,1,1:K1,] = ")
## print("frocDataRaw$ratings$LL[1,1,,] = ")
## print("hrData$ratings$LL[1,1,,] = ")

## following is the output

## [1] "frocDataRaw$ratings$NL[1,1,,] = "
## [,1]      [,2]      [,3] [,4]
## [1,] 2.4046534 0.7635935      -Inf -Inf
## [2,]      -Inf      -Inf      -Inf -Inf
## [3,] 0.2522234      -Inf      -Inf -Inf
## [4,] 0.4356833      -Inf      -Inf -Inf
## [5,]      -Inf      -Inf      -Inf -Inf
## [6,]      -Inf      -Inf      -Inf -Inf
## [7,]      -Inf      -Inf      -Inf -Inf
## [8,] 0.8041895 0.3773956 0.1333364 -Inf

## > ## print("hrData$ratings$NL[1,1,1:K1,] = ")
## [1] "hrData$ratings$NL[1,1,1:K1,] = "
## [1] 2.4046534      -Inf 0.2522234
## > ## print("frocDataRaw$ratings$LL[1,1,,] = ")
## [1] "frocDataRaw$ratings$LL[1,1,,] = "
## [,1] [,2]
## [1,]      -Inf -Inf
## [2,] 1.5036080 -Inf
## [3,] 0.8442045 -Inf
## [4,] 1.0467262 -Inf
## [5,]      -Inf -Inf
## > ## print("hrData$ratings$LL[1,1,,] = ")
## [1] "hrData$ratings$LL[1,1,,] = "
## [1] 0.4356833 1.5036080 0.8442045 1.0467262 0.8041895
## Note that rating of the first and the last diseased case came from NL marks

Simulates an "AUC-equivalent" FROC dataset from an LROC dataset

Description

Simulates a multiple-modality multiple-reader "AUC-equivalent" FROC dataset from a supplied LROC dataset, e.g., datasetCadLroc.

Usage

DfLroc2Froc(dataset)

Arguments

dataset

The LROC dataset to be converted to FROC.

Details

The LROC paradigm always yields a single mark per case. Therefore the equivalent FROC dataset will also have only one mark per case. The NL arrays of the two datasets are identical. The LL array is created by copying the LL (correct localization) array of the LROC dataset to the LL array of the FROC dataset, from diseased case index k2 = 1 to k2 = K2. Additionally, the LL_IL (incorrect localization) array of the LROC dataset is copied to the NL array of the FROC dataset, from case index k1 = K1+1 to k1 = K1+K2. Any zero ratings are replaced by -Inf. The equivalent FROC dataset has the same HrAuc as the original LROC dataset. See example. The main use of this function is to test the CAD significance testing functions using CAD FROC datasets, which I currently don't have.
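The copying rule can be sketched in base R for a single modality and reader. The function `lroc2frocSketch`, the use of plain vectors (NL of length K1, LL and LL_IL of length K2) and the toy ratings are assumptions for illustration; in particular, coding the unused LL or LL_IL slot as 0 in the toy input is assumed from the statement that zero ratings are replaced.

```r
lroc2frocSketch <- function(NL, LL, LL_IL) {
  frocNL <- c(NL, LL_IL)        # incorrect localizations become NLs on cases K1+1 .. K1+K2
  frocLL <- LL                  # correct localizations become LLs
  frocNL[frocNL == 0] <- -Inf   # a zero rating is treated as "no mark"
  frocLL[frocLL == 0] <- -Inf
  list(NL = frocNL, LL = frocLL)
}

lrocNL   <- c(1.8, 0.4, 0.3)   # K1 = 3 non-diseased cases
lrocLL   <- c(0, 1.5)          # K2 = 2 diseased cases: correct localizations
lrocLLIL <- c(2.0, 0)          # incorrect localizations

froc <- lroc2frocSketch(lrocNL, lrocLL, lrocLLIL)

# The highest rating on each diseased case is unchanged by the conversion.
K1 <- length(lrocNL)
K2 <- length(lrocLL)
hrLroc <- pmax(lrocLL, lrocLLIL)
hrFroc <- pmax(froc$LL, froc$NL[K1 + seq_len(K2)])
```

Because the highest rating on every case is preserved, any figure of merit computed from highest ratings is the same for both datasets.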

Value

The AUC-equivalent FROC dataset

Examples

frocDataset <- DfLroc2Froc(datasetCadLroc)
lrocAuc <- UtilFigureOfMerit(datasetCadLroc, FOM = "Wilcoxon")
frocHrAuc <- UtilFigureOfMerit(frocDataset, FOM = "HrAuc")

Convert an LROC dataset to a ROC dataset

Description

Converts an LROC dataset to an ROC dataset

Usage

DfLroc2Roc(dataset)

Arguments

dataset

The LROC dataset to be converted.

Details

For the diseased cases one takes the maximum rating on each diseased case, which could be an LL ("true positive" correct localization) or an LL_IL ("true positive" incorrect localization) rating, whichever has the higher rating. For non-diseased cases the NL arrays are identical.
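For a single modality and reader, the rule amounts to an element-wise maximum. The base-R lines below are a sketch on made-up ratings; the vector names and the use of -Inf for the unused slot are assumptions.

```r
NL    <- c(1.8, 0.4, 0.3)      # ratings on non-diseased cases
LL    <- c(-Inf, 1.5, 0.9)     # correct localizations on diseased cases
LL_IL <- c(2.0, -Inf, -Inf)    # incorrect localizations on diseased cases

rocFP <- NL                    # non-diseased ratings carry over unchanged
rocTP <- pmax(LL, LL_IL)       # per-case maximum of LL and LL_IL
```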

Value

An ROC dataset

Examples

rocDataSet <- DfLroc2Roc(datasetCadLroc)

Read a factorial data file (not SPLIT-PLOT)

Description

Read an Excel file and create an ROC, FROC or LROC dataset object from it.

Usage

DfReadDataFile(
  fileName,
  format = "JAFROC",
  newExcelFileFormat = FALSE,
  lrocForcedMark = NA,
  delimiter = ",",
  sequentialNames = FALSE
)

Arguments

fileName

A string specifying the name of the file. The file-extension must match the format specified below.

format

A string specifying the format of the data file. It can be "JAFROC", the default, which requires an .xlsx Excel file (not .xls), "MRMC" or "iMRMC". For "MRMC" the format is determined by the data file extension (.csv or .txt or .lrc) as specified in https://perception.lab.uiowa.edu/. For "iMRMC" the file extension is .imrmc and the format is described in https://code.google.com/archive/p/imrmc/. See note for important information about deprecation of the "MRMC" format.

newExcelFileFormat

Logical. Must be TRUE to read LROC data. This argument only applies to the "JAFROC" format. The default is FALSE. If TRUE the function accommodates 3 additional columns in the Truth worksheet. If FALSE, the original function (as in version 1.2.0) is used and the three extra columns, if present, throw an error.

lrocForcedMark

Logical: For LROC dataset only: is a forced mark required on every image? The default is NA. If a mark is not required, set it to FALSE otherwise to TRUE.

delimiter

The string delimiter to be used for the "MRMC" format ("," is the default), see https://perception.lab.uiowa.edu/. This parameter is not used when reading "JAFROC" or "iMRMC" data files.

sequentialNames

A logical variable: if TRUE, consecutive integers (starting from 1) will be used as the modality and reader IDs (i.e., names). Otherwise, modality and reader IDs in the original data file will be used.

Value

A dataset with the structure specified in RJafroc-package.

Note

The "MRMC" format is deprecated. For non-JAFROC formats four file extensions (.csv, .txt, .lrc and .imrmc) are possible, all of which are restricted to ROC data. Only the iMRMC format is now supported, i.e., files with extension .imrmc. Other formats (.csv, .txt, .lrc) are deprecated. Such files can still be read by this function and then saved to a JAFROC format file for further analysis within this package. For non-JAFROC data file formats, the readerID and modalityID fields must be unique integers.

This function is used only for factorial datasets. For SPLIT-PLOT datasets use function DfReadSP.

Examples

fileName <- system.file("extdata", "toyFiles/ROC/rocCr.xlsx",
package = "RJafroc", mustWork = TRUE)
rdrArr1D <- DfReadDataFile(fileName, newExcelFileFormat = TRUE)



fileName <- system.file("extdata", "Roc.xlsx",
package = "RJafroc", mustWork = TRUE)
RocDataXlsx <- DfReadDataFile(fileName)

fileName <- system.file("extdata", "RocData.csv",
package = "RJafroc", mustWork = TRUE)
RocDataCsv<- DfReadDataFile(fileName, format = "MRMC")

fileName <- system.file("extdata", "RocData.imrmc",
package = "RJafroc", mustWork = TRUE)
RocDataImrmc <- DfReadDataFile(fileName, format = "iMRMC")

fileName <- system.file("extdata", "Froc.xlsx",
package = "RJafroc", mustWork = TRUE)
FrocDataXlsx <- DfReadDataFile(fileName, sequentialNames = TRUE)

Read a SPLIT PLOT data file (not factorial)

Description

Read a disk file and create an ROC or FROC dataset object

Usage

DfReadSP(fileName)

Arguments

fileName

A string specifying the name of the file.

Value

A dataset with the structure specified in RJafroc-package.

Examples

fileName <- system.file("extdata", "toyFiles/ROC/rocCr.xlsx",
package = "RJafroc", mustWork = TRUE)
ds <- DfReadSP(fileName)

Read a crossed-modality data file

Description

Read a crossed-modality data file, in which the two modality factors are crossed

Usage

DfReadXModalities(fileName, sequentialNames = FALSE)

Arguments

fileName

A string specifying the name of the file that contains the dataset, which must be an extended-JAFROC format Excel file containing an additional modality factor.

sequentialNames

If TRUE, consecutive integers (starting from 1) will be used as the modality and reader IDs. Otherwise, modality and reader IDs in the original data file will be used. The default is FALSE.

Details

The data format is similar to the JAFROC format (see RJafroc-package). The difference is that there are two modality factors. An example and a reference to the relevant FROC book chapter are still to be added (TBA); see https://dpc10ster.github.io/RJafrocFrocBook/

Value

A dataset with the specified structure, similar to a standard RJafroc dataset (see RJafroc-package). Because of the extra modality factor, NL and LL are each five-dimensional arrays. There are also two modality IDs: modalityID1 and modalityID2.

References

Thompson JD, Chakraborty DP, Szczepura K, et al. (2016) Effect of reconstruction methods and x-ray tube current-time product on nodule detection in an anthropomorphic thorax phantom: a crossed-modality JAFROC observer study. Medical Physics. 43(3):1265-1274.

Chakraborty DP (2017) Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples, CRC Press, Boca Raton, FL. https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840


Save ROC dataset in different formats

Description

Save ROC dataset in other formats so it can be analyzed with alternate software

Usage

DfSaveDataFile(
  dataset,
  fileName,
  format = "MRMC",
  dataDescription = "RJafroc dataset converted to imrmc format"
)

Arguments

dataset

The dataset to be saved.

fileName

The file name of the output data file. The extension of the data file must match the corresponding format, see RJafroc-package

format

The format of the data file, which can be "MRMC" or "iMRMC", see RJafroc-package.

dataDescription

An optional string variable describing the data file; the default value is the variable name of dataset. The description appears on the first line of a *.lrc or *.imrmc data file. This parameter is not used when saving the dataset in other formats.

Examples

## DfSaveDataFile(dataset = dataset02, 
##    fileName = "rocData2.csv", format = "MRMC")
## DfSaveDataFile(dataset = dataset02, 
##    fileName = "rocData2.lrc", format = "MRMC", 
##     dataDescription = "ExampleROCdata1")
## DfSaveDataFile(dataset = dataset02, 
##    fileName = "rocData2.txt", format = "MRMC", 
##     dataDescription = "ExampleROCdata2")
##  DfSaveDataFile(dataset = dataset02, 
##    fileName = "dataset05.imrmc", format = "iMRMC", 
##    dataDescription = "ExampleROCdata3")

Save dataset object as a JAFROC format Excel file

Description

Save a dataset object as a JAFROC format Excel file

Usage

DfWriteExcelDataFile(dataset, fileName)

Arguments

dataset

The dataset object, see RJafroc-package.

fileName

The file name to save to; the extension of the data file must be .xlsx

Examples

##DfWriteExcelDataFile(dataset = dataset05, fileName = "rocData2.xlsx")

Fit the binormal model to selected modality and reader in an ROC dataset

Description

Fit the binormal model-predicted ROC curve for a dataset. This is the R equivalent of ROCFIT or RSCORE

Usage

FitBinormalRoc(dataset, trt = 1, rdr = 1)

Arguments

dataset

The ROC dataset

trt

The desired modality, default is 1

rdr

The desired reader, default is 1

Details

In the binormal model, ratings (more accurately, the latent decision variables) from diseased cases are sampled from N(a,1) while ratings for non-diseased cases are sampled from N(0,b^2). To avoid clutter, error bars are only shown for the lowest and uppermost operating points. An FROC dataset is internally converted to a highest-rating inferred ROC dataset. Too many bins containing zero counts will cause the algorithm to fail, so be sure to bin the data appropriately to fewer bins, where each bin has at least one count.
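
The binormal parameterization described above can be explored directly in base R. The following is a minimal sketch, not the fitting code used by FitBinormalRoc: the values of a and b are illustrative rather than fitted, and the curve and area formulas follow from the stated sampling distributions.

```r
# Binormal model as stated above:
# diseased ratings ~ N(a, 1), non-diseased ratings ~ N(0, b^2)
a <- 1.5   # illustrative value, not a fitted result
b <- 0.8   # illustrative value, not a fitted result

# Sweep the threshold zeta to trace the predicted ROC curve
zeta <- seq(-5, 8, by = 0.01)
FPF <- pnorm(-zeta / b)   # P(non-diseased rating > zeta)
TPF <- pnorm(a - zeta)    # P(diseased rating > zeta)

# Closed-form area under the binormal ROC curve
aucClosed <- pnorm(a / sqrt(1 + b^2))

# Numerical check: trapezoidal area under the (FPF, TPF) curve
ord <- order(FPF)
x <- FPF[ord]
y <- TPF[ord]
aucTrap <- sum(diff(x) * (y[-1] + y[-length(y)]) / 2)

c(closed = aucClosed, trapezoid = aucTrap)
```

The two areas should agree to within the discretization error of the threshold grid.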

Value

The returned value is a list with the following elements:

a

The mean of the diseased distribution; the non-diseased distribution is assumed to have zero mean

b

The standard deviation of the non-diseased distribution. The diseased distribution is assumed to have unit standard deviation

zetas

The binormal model cutoffs, zetas or thresholds

AUC

The binormal model fitted ROC-AUC

StdAUC

The standard deviation of AUC

NLLIni

The initial value of the negative log-likelihood

NLLFin

The final value of the negative log-likelihood

ChisqrFitStats

The chi-square goodness of fit results

covMat

The covariance matrix of the parameters

fittedPlot

A ggplot2 object containing the fitted operating characteristic along with the empirical operating points. Use print() to display the object

References

Dorfman DD, Alf E (1969) Maximum-Likelihood Estimation of Parameters of Signal-Detection Theory and Determination of Confidence Intervals - Rating-Method Data, Journal of Mathematical Psychology 6, 487-496.

Grey D, Morgan B (1972) Some aspects of ROC curve-fitting: normal and logistic models. Journal of Mathematical Psychology 9, 128-139.

Examples

## Test with an included ROC dataset
retFit <- FitBinormalRoc(dataset02);## print(retFit$fittedPlot)


## Test with an included FROC dataset; it needs to be binned
## as there are more than 5 discrete ratings levels
binned <- DfBinDataset(dataset05, desiredNumBins = 5, opChType = "ROC")
retFit <- FitBinormalRoc(binned);## print(retFit$fittedPlot)


## Test with single interior point data
fp <- c(rep(1,7), rep(2, 3))
tp <- c(rep(1,5), rep(2, 5))
dataset <- Df2RJafrocDataset(fp, tp)
retFit <- FitBinormalRoc(dataset);## print(retFit$fittedPlot)

## Test with two interior data points
fp <- c(rep(1,7), rep(2, 5), rep(3, 3))
tp <- c(rep(1,3), rep(2, 5), rep(3, 7))
dataset <- Df2RJafrocDataset(fp, tp)
retFit <- FitBinormalRoc(dataset);## print(retFit$fittedPlot)

## Test with TONY data for which chisqr can be calculated
ds <- DfFroc2Roc(dataset01)
retFit <- FitBinormalRoc(ds, 2, 3);## print(retFit$fittedPlot)
## retFit$ChisqrFitStats
 
## Test with included degenerate ROC data
retFit <- FitBinormalRoc(datasetDegenerate);## print(retFit$fittedPlot)

Fit the contaminated binormal model (CBM) to selected modality and reader in an ROC dataset

Description

Fit the CBM-predicted ROC curve for specified modality and reader

Usage

FitCbmRoc(dataset, trt = 1, rdr = 1)

Arguments

dataset

The dataset containing the data

trt

The desired modality, default is 1

rdr

The desired reader, default is 1

Details

In CBM, ratings from diseased cases are sampled from a mixture distribution with two components: (1) a normal distribution with mean mu and unit variance, with integrated area alpha, and (2) a unit-normal distribution with integrated area 1 - alpha. Ratings for non-diseased cases are sampled from a unit-normal distribution. The ChisqrFitStats consists of a list containing the chi-square value, the p-value and the degrees of freedom.
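
The mixture described above translates into a short base R sketch of the CBM-predicted ROC curve. This is an illustration derived from the stated distributions, not the code used by FitCbmRoc; the parameter values are illustrative.

```r
# CBM as stated above: non-diseased ratings ~ N(0, 1); diseased ratings come from
# N(mu, 1) with probability alpha (disease visible) and N(0, 1) otherwise
mu <- 2        # illustrative value
alpha <- 0.7   # illustrative value

zeta <- seq(-6, 9, by = 0.01)
FPF <- pnorm(-zeta)
TPF <- (1 - alpha) * pnorm(-zeta) + alpha * pnorm(mu - zeta)

# Area under the curve: the invisible component contributes chance-level 0.5,
# the visible component contributes the equal-variance binormal area
aucClosed <- 0.5 * (1 - alpha) + alpha * pnorm(mu / sqrt(2))

# Numerical check by the trapezoidal rule
ord <- order(FPF)
x <- FPF[ord]
y <- TPF[ord]
aucTrap <- sum(diff(x) * (y[-1] + y[-length(y)]) / 2)

c(closed = aucClosed, trapezoid = aucTrap)
```

Because TPF is never below FPF at any threshold, the predicted curve stays on or above the chance diagonal.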

Value

A list with the following elements:

mu

The mean of the visible diseased distribution (the non-diseased distribution has zero mean)

alpha

The proportion of diseased cases where the disease is visible

zetas

The cutoffs, zetas or thresholds

AUC

The AUC of the fitted ROC curve

StdAUC

The standard deviation of AUC

NLLIni

The initial value of the negative log-likelihood

NLLFin

The final value of the negative log-likelihood

ChisqrFitStats

The chi-square goodness of fit results

covMat

The covariance matrix of the parameters

fittedPlot

A ggplot2 object containing the fitted operating characteristic along with the empirical operating points. Use print() to display the object

Note

This algorithm is very robust, especially compared to the binormal model.

References

Dorfman DD, Berbaum KS (2000) A contaminated binormal model for ROC data: Part II. A formal model, Acad Radiol, 7:6, 427–437.

Examples

## CPU time 8.7 sec on Ubuntu (#13)
## Test with included ROC data
retFit <- FitCbmRoc(dataset02);## print(retFit$fittedPlot)

## Test with included degenerate ROC data (yes! CBM can fit such data)
retFit <- FitCbmRoc(datasetDegenerate);## print(retFit$fittedPlot)

## Test with single interior point data
fp <- c(rep(1,7), rep(2, 3))
tp <- c(rep(1,5), rep(2, 5))
dataset <- Df2RJafrocDataset(fp, tp)
retFit <- FitCbmRoc(dataset);## print(retFit$fittedPlot)

## Test with two interior data points
fp <- c(rep(1,7), rep(2, 5), rep(3, 3))
tp <- c(rep(1,3), rep(2, 5), rep(3, 7))
dataset <- Df2RJafrocDataset(fp, tp)
retFit <- FitCbmRoc(dataset);
## print(retFit$fittedPlot)

## Test with included ROC data (some bins have zero counts)
retFit <- FitCbmRoc(dataset02, 2, 1);## print(retFit$fittedPlot)

## Test with TONY data for which chisqr can be calculated
ds <- DfFroc2Roc(dataset01)
retFit <- FitCbmRoc(ds, 2, 3);## print(retFit$fittedPlot)
retFit$ChisqrFitStats

Fit CORCBM to a paired ROC dataset

Description

Fit the Correlated Contaminated Binormal Model (CORCBM) to a paired ROC dataset. The ROC dataset has to be formatted as a single modality, two-reader dataset, even though the actual pairing may be different, see details.

Usage

FitCorCbm(dataset)

Arguments

dataset

A paired ROC dataset

Details

The conditions (X, Y) can be two readers interpreting images in the same modality, the same reader interpreting images in different treatments, or different readers interpreting images in 2 different treatments. Function DfExtractCorCbmDataset can be used to construct a dataset suitable for FitCorCbm. With reference to the returned values, and assuming R bins in condition X and L bins in condition Y, FPCounts is the R x L matrix containing the counts for non-diseased cases and TPCounts is the R x L matrix containing the counts for diseased cases; muX, muY, alphaX, alphaY, rhoNor, rhoAbn2 are the CORCBM parameters; aucX, aucY are the AUCs in the two conditions; stdAucX, stdAucY are the corresponding standard errors; stdErr contains the standard errors of the parameters of the model; areaStat, areaPval, covMat are the area-statistic, the p-value and the covariance matrix of the parameters. If a parameter approaches a limit, e.g., rhoNor = 0.9999, it is held constant near the limiting value and the covariance matrix has one less dimension (along each edge) for each parameter that is held constant. The indices of the parameters held fixed are in fitCorCbmRet$fixParam.

Value

A list containing three objects:

fitCorCbmRet

list(FPCounts,TPCounts, muX,muY,alphaX,alphaY,rhoNor, rhoAbn2,zetaX,zetaY,covMat,fixParam)

stats

list(aucX,aucY,stdAucX, stdAucY,stdErr,areaStat,areaPval)

fittedPlot

The fitted plot with operating points, error bars, for both conditions

References

Zhai X, Chakraborty DP (2017) A bivariate contaminated binormal model for robust fitting of proper ROC curves to a pair of correlated, possibly degenerate, ROC datasets. Medical Physics. 44(6):2207–2222.


Fit the radiological search model (RSM) to an ROC dataset

Description

Fit an RSM-predicted ROC curve to a binned single-modality single-reader ROC dataset

Usage

FitRsmRoc(binnedRocData, lesDistr, trt = 1, rdr = 1)

Arguments

binnedRocData

A binned ROC dataset

lesDistr

The lesion distribution 1D array.

trt

The selected modality, default is 1

rdr

The selected reader, default is 1

Details

If the dataset is FROC, first convert it to ROC using DfFroc2Roc. MLE ROC algorithms require binned datasets; use DfBinDataset to perform the binning prior to calling this function. In the RSM: (1) The (random) number of latent NLs per case is Poisson distributed with mean parameter lambda, and the corresponding ratings are sampled from N(0,1). (2) The (random) number of latent LLs per diseased case is binomial distributed with success probability nu and trial size equal to the number of lesions in the case, and the corresponding ratings are sampled from N(mu,1). (3) A latent NL or LL is actually marked if its rating exceeds the lowest threshold zeta1. To avoid clutter, error bars are only shown for the lowest and uppermost operating points. Because of the extra parameter, and the requirement to have five counts, the chi-square statistic often cannot be calculated.
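
The three RSM assumptions listed above imply closed-form ROC coordinates, sketched here in base R. The helper names rsmFPF and rsmTPF are hypothetical, and lambda and nu are taken literally as the Poisson mean and binomial success probability stated above; the package functions RSM_FPF and RSM_TPF may use a different parameter convention, so numerical agreement with them is not claimed.

```r
# A case is a false (true) positive if at least one latent mark on a
# non-diseased (diseased) case has a rating above the threshold zeta.
rsmFPF <- function(zeta, lambda) {
  # P(no NL above zeta) for a Poisson number of NLs is exp(-lambda * P(z > zeta))
  1 - exp(-lambda * pnorm(-zeta))
}

rsmTPF <- function(zeta, mu, lambda, nu, lesDistr) {
  L <- seq_along(lesDistr)   # possible numbers of lesions per diseased case
  sapply(zeta, function(z) {
    pNoNL <- exp(-lambda * pnorm(-z))
    pNoLL <- (1 - nu * pnorm(mu - z))^L   # no LL above z, for each lesion count
    sum(lesDistr * (1 - pNoNL * pNoLL))
  })
}

# Illustrative parameter values (not fitted results)
mu <- 2; lambda <- 1; nu <- 0.8
lesDistr <- c(0.6, 0.4)   # 60% of diseased cases have 1 lesion, 40% have 2

zeta <- c(-Inf, seq(-3, 5, by = 0.5))
FPF <- rsmFPF(zeta, lambda)
TPF <- rsmTPF(zeta, mu, lambda, nu, lesDistr)
cbind(zeta, FPF, TPF)
```

At zeta = -Inf the predicted curve ends at FPF = 1 - exp(-lambda), short of (1,1), because some cases generate no latent marks at all.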

Value

A list with the following elements

mu

The mean of the diseased distribution relative to the non-diseased one

lambda

The Poisson parameter describing the distribution of latent NLs per case

nu

The binomial success probability describing the distribution of latent LLs per diseased case

zetas

The RSM cutoffs, zetas or thresholds

AUC

The RSM fitted ROC-AUC

StdAUC

The standard deviation of AUC

NLLIni

The initial value of the negative log-likelihood

NLLFin

The final value of the negative log-likelihood

ChisqrFitStats

The chi-square goodness of fit results

covMat

The covariance matrix of the parameters

fittedPlot

A ggplot2 object containing the fitted operating characteristic along with the empirical operating points. Use print to display the object

References

Chakraborty DP (2006) A search model and figure of merit for observer data acquired according to the free-response paradigm. Phys Med Biol 51, 3449-3462.

Chakraborty DP (2006) ROC Curves predicted by a model of visual search. Phys Med Biol 51, 3463–3482.

Chakraborty DP (2017) Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples, CRC Press, Boca Raton, FL. https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840

Examples

## Test with included ROC data (some bins have zero counts)
lesDistr <- UtilLesDistr(dataset02)$Freq
retFit <- FitRsmRoc(dataset02, lesDistr)
## print(retFit$fittedPlot)

## Test with included degenerate ROC data
lesDistr <- UtilLesDistr(datasetDegenerate)$Freq
retFit <- FitRsmRoc(datasetDegenerate, lesDistr)

## Test with single interior point data
fp <- c(rep(1,7), rep(2, 3))
tp <- c(rep(1,5), rep(2, 5))
binnedRocData <- Df2RJafrocDataset(fp, tp)
lesDistr <- UtilLesDistr(binnedRocData)$Freq
retFit <- FitRsmRoc(binnedRocData, lesDistr)

## Test with two interior data points
fp <- c(rep(1,7), rep(2, 5), rep(3, 3))
tp <- c(rep(1,3), rep(2, 5), rep(3, 7))
binnedRocData <- Df2RJafrocDataset(fp, tp)
lesDistr <- UtilLesDistr(binnedRocData)$Freq
retFit <- FitRsmRoc(binnedRocData, lesDistr)


## Test with three interior data points
fp <- c(rep(1,12), rep(2, 5), rep(3, 3), rep(4, 5))
tp <- c(rep(1,3), rep(2, 5), rep(3, 7), rep(4, 10))
binnedRocData <- Df2RJafrocDataset(fp, tp)
lesDistr <- UtilLesDistr(binnedRocData)$Freq
retFit <- FitRsmRoc(binnedRocData, lesDistr)

## Test with TONY data, i = 2 and j = 3;
## the only case permitting chi-square calculation
lesDistr <- UtilLesDistr(dataset01)$Freq
rocData <- DfFroc2Roc(dataset01)
retFit <- FitRsmRoc(rocData, lesDistr, trt = 2, rdr = 3)
## print(retFit$fittedPlot)
## retFit$ChisqrFitStats

Determine if a dataset is binned

Description

Determine if a dataset is binned

Usage

isBinnedDataset(dataset, maxUniqeRatings = 6)

Arguments

dataset

The dataset

maxUniqeRatings

For each modality-reader combination, the max number of unique ratings in order to be classified as binned, the default value for maxUniqeRatings is 6; if there are more unique ratings the modality-reader combination is classified as not binned.

Value

A logical [I x J] array: TRUE if the corresponding modality-reader combination is binned, i.e., has at most maxUniqeRatings unique ratings, FALSE otherwise.
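
The criterion applied to each modality-reader combination can be sketched in base R for a single vector of ratings. The helper isBinnedSketch is hypothetical, and ignoring non-finite values (such as -Inf placeholders for unmarked entries) is an assumption of this sketch rather than documented package behavior.

```r
# Hypothetical helper: binned if the number of distinct finite ratings
# does not exceed the limit
isBinnedSketch <- function(ratings, maxUniqueRatings = 6) {
  finite <- ratings[is.finite(ratings)]   # assumption: ignore -Inf placeholders
  length(unique(finite)) <= maxUniqueRatings
}

binnedRatings <- c(1, 2, 2, 3, 5, 5, 4, 1, -Inf)   # five distinct levels
set.seed(1)
continuousRatings <- rnorm(50)                     # effectively all distinct

isBinnedSketch(binnedRatings)
isBinnedSketch(continuousRatings)
```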

Examples

isBinnedDataset(dataset01)

Check the validity of a dataset for FOM and other input parameters

Description

Checks the validity of a specified dataset for FOM and other input parameters.

Usage

isValidDataset(
  dataset,
  FOM,
  method = "OR",
  covEstMethod = "jackknife",
  analysisOption = "RRRC"
)

Arguments

dataset

The dataset object to be checked.

FOM

The figure of merit.

method

The analysis method "OR" (default) or "DBM".

covEstMethod

The covariance estimation method "jackknife" (default), "bootstrap" or "DeLong" (for an ROC dataset).

analysisOption

Specification of the random factor(s): "RRRC" (default), "RRFC", or "FRRC".

Value

None.


Plot binormal fit

Description

Plot the binormal-predicted ROC curve with provided parameters

Usage

PlotBinormalFit(a, b)

Arguments

a

vector: the mean(s) of the diseased distribution(s).

b

vector: the standard deviation(s) of the non-diseased distribution(s); the diseased distribution(s) have unit standard deviation, as in FitBinormalRoc.

Details

a and b must have the same length. The predicted ROC curve for each a and b pair will be plotted.

Value

A ggplot2 object of the plotted ROC curve(s) is returned. Use the print function to display the saved object.

Examples

binormalPlot <- PlotBinormalFit(c(1, 2), c(0.5, 0.5))
## print(binormalPlot)

Plot CBM fitted curve

Description

Plot the CBM-predicted ROC curve with provided CBM parameters

Usage

PlotCbmFit(mu, alpha)

Arguments

mu

vector: the mean(s) of the z-samples of the diseased distribution(s) where the disease is visible

alpha

vector: the proportion(s) of the diseased distribution(s) where the disease is visible

Details

mu and alpha must have equal length. The predicted ROC curve for each mu and alpha pair will be plotted.

Value

A ggplot2 object of the plotted ROC curve(s)

References

Dorfman DD, Berbaum KS (2000) A contaminated binormal model for ROC data: Part II. A formal model, Acad Radiol 7, 427–437.

Examples

cbmPlot <- PlotCbmFit(c(1, 2), c(0.5, 0.5))
## print(cbmPlot)

Plot empirical operating characteristics, ROC, FROC or LROC

Description

Plot empirical operating characteristics (operating points connected by straight lines) for specified modalities and readers, or, if desired, plots (no operating points) averaged over specified modalities and / or readers.

Usage

PlotEmpiricalOperatingCharacteristics(
  dataset,
  trts = 1,
  rdrs = 1,
  opChType,
  legend.position = c(0.8, 0.3),
  maxDiscrete = 10
)

Arguments

dataset

Dataset object.

trts

List or vector: integer indices of modalities to be plotted. Default is 1.

rdrs

List or vector: integer indices of readers to be plotted. Default is 1.

opChType

Type of operating characteristic to be plotted: "ROC", "FROC", "AFROC", "wAFROC", "AFROC1", "wAFROC1", or "LROC".

legend.position

Where to position the legend. The default is c(0.8, 0.3), i.e., 0.8 rightward and 0.3 upward (the plot is a unit square).

maxDiscrete

Maximum number of operating points for the data to be considered discrete and displayed by symbols and connecting lines; with more points the data are regarded as continuous and the points are only connected by lines. The default is 10.

Details

The trts and rdrs are vectors or lists of integer indices, not the corresponding string IDs. For example, if the string ID of the first reader is "0", the value in rdrs should be 1 not 0. The legend will display the string IDs.

If both of trts and rdrs are vectors, all combinations of modalities and readers are plotted. See Example 1.

If both trts and rdrs are lists, they must have the same length. Only the combination of modality and reader at the same position in their respective lists are plotted. If some elements of the modalities and / or readers lists are vectors, the average operating characteristic over the implied modalities and / or readers are plotted. See Example 2.

For LROC datasets, opChType can be "ROC" or "LROC".
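
For readers unfamiliar with how empirical operating points arise from ratings, the following base R sketch computes them for a single modality-reader combination. It is an illustration of the general construction, not the package's plotting code; the fp and tp vectors are hypothetical ratings of non-diseased and diseased cases.

```r
fp <- c(rep(1, 7), rep(2, 3))   # hypothetical ratings of 10 non-diseased cases
tp <- c(rep(1, 5), rep(2, 5))   # hypothetical ratings of 10 diseased cases

# Candidate thresholds, highest first; the lowest one is dropped because it
# only reproduces the trivial (1, 1) point
thresholds <- sort(unique(c(fp, tp)), decreasing = TRUE)
thresholds <- thresholds[-length(thresholds)]

# Fraction of cases rated at or above each threshold
FPF <- sapply(thresholds, function(thr) mean(fp >= thr))
TPF <- sapply(thresholds, function(thr) mean(tp >= thr))

# Operating points connected by straight lines, including the end points
opPoints <- data.frame(FPF = c(0, FPF, 1), TPF = c(0, TPF, 1))

# Area under the straight-line plot (trapezoidal rule)
aucEmp <- with(opPoints, sum(diff(FPF) * (TPF[-1] + TPF[-length(TPF)]) / 2))
opPoints
aucEmp
```

With two rating levels there is a single interior operating point, at (0.3, 0.5).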

Value

A ggplot2 object containing the operating characteristic plot(s) and a data frame containing the points defining the operating characteristics.

Plot

ggplot2 object. For continuous or averaged data, operating characteristics curves are plotted without showing operating points. For binned (individual) data, both operating points and connecting lines are shown. To avoid clutter, if there are more operating points than maxDiscrete, they are not shown.

Points

Data frame with four columns: abscissa, ordinate, class (which codes modality and reader names) and type, which can be "D" for discrete ratings, "C" for continuous ratings, i.e., more than maxDiscrete operating points, or "A" for reader-averaged.

Examples

## Example 1
## Plot individual empirical ROC plots for all combinations of modalities
## 1 and 2 and readers 1, 2 and 3. Six operating characteristics are plotted.

ret <- PlotEmpiricalOperatingCharacteristics(dataset =
dataset02, trts = c(1:2), rdrs = c(1:3), opChType = "ROC")
## print(ret$Plot)

## Example 2
## Empirical wAFROC plots, consisting of
## three sub-plots:
## (1) sub-plot, red, with operating points, for the 1st modality (string ID "1") and the 2nd
## reader (string ID "3"), labeled "M:1 R:3"
## (2) sub-plot, green, no operating points, for the 2nd modality (string ID "2") AVERAGED
## over the 2nd and 3rd readers (string IDs "3" and "4"), labeled "M:2  R: 3 4"
## (3) sub-plot, blue, no operating points, AVERAGED over the first two modalities
## (string IDs "1" and "2") AND over the 1st, 2nd and 3rd readers
## (string IDs "1", "3" and "4"), labeled "M: 1 2  R: 1  3 4"

plotT <- list(1, 2, c(1:2))
plotR <- list(2, c(2:3), c(1:3))

ret <- PlotEmpiricalOperatingCharacteristics(dataset = dataset04, trts = plotT,
   rdrs = plotR, opChType = "wAFROC")
## print(ret$Plot)

## Example 3
## Correspondences between indices and string identifiers for modalities and
## readers in this dataset (apparently reader "2" did not complete the study).

## names(dataset04$descriptions$readerID)
## [1] "1" "3" "4" "5"

RSM predicted operating characteristics, ROC pdfs and AUCs

Description

Visualize RSM predicted ROC, AFROC, wAFROC and FROC curves, and ROC pdfs, given equal-length arrays of search model parameters: mu, lambda, nu and zeta1.

Usage

PlotRsmOperatingCharacteristics(
  mu,
  lambda,
  nu,
  zeta1,
  lesDistr = 1,
  relWeights = 0,
  OpChType = "ALL",
  legendPosition = "bottom",
  legendDirection = "horizontal",
  legendJustification = c(0, 1),
  nlfRange = NULL,
  llfRange = NULL,
  nlfAlpha = NULL
)

Arguments

mu

Array: the RSM mu parameter.

lambda

Array: the RSM lambda parameter.

nu

Array: the RSM nu parameter.

zeta1

Array, the lowest reporting threshold; if missing the default is an array of -Inf.

lesDistr

Array: the probability mass function of the lesion distribution for diseased cases. The default is 1. See UtilLesDistr.

relWeights

The relative weights of the lesions; a vector of length equal to length(maxLL). The default is zero, in which case equal weights are assumed.

OpChType

The type of operating characteristic desired: can be "ROC", "AFROC", "wAFROC", "FROC" or "pdfs" or "ALL". The default is "ALL".

legendPosition

The positioning of the legend: "right", "left", "top" or "bottom". Use "none" to suppress the legend.

legendDirection

Allows control on the direction of the legend; "horizontal", the default, or "vertical"

legendJustification

Where to position the legend, default is bottom right corner c(0,1)

nlfRange

This applies to FROC plot only. The x-axis range, e.g., c(0,2), for FROC plot. Default is "NULL", which means the maximum NLF range, as determined by the data.

llfRange

This applies to FROC plot only. The y-axis range, e.g., c(0,1), for FROC plot. Default is "NULL", which means the maximum LLF range, as determined by the data.

nlfAlpha

Upper limit of the integrated area under the FROC plot. Default is "NULL", which means the maximum NLF range is used (i.e., lambda). Attempt to integrate outside the maximum NLF will generate an error.

Details

RSM is the Radiological Search Model described in the book. This function is vectorized with respect to the first 4 arguments. The elements of lesDistr must sum to one. To indicate that all diseased cases contain 4 lesions, set lesDistr = c(0,0,0,1).
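
A lesDistr vector of the kind described above can be built from per-case lesion counts with base R. This is a sketch under the assumption that element L of lesDistr is the fraction of diseased cases with L lesions; perCase is a hypothetical vector (the package function UtilLesDistr is the documented route for real datasets).

```r
# Hypothetical numbers of lesions in each of 10 diseased cases
perCase <- c(1, 2, 2, 4, 1, 2, 4, 4, 2, 1)

# Fraction of diseased cases with 1, 2, ..., max(perCase) lesions
lesDistr <- tabulate(perCase, nbins = max(perCase)) / length(perCase)
lesDistr

# The probability mass function must sum to one
stopifnot(isTRUE(all.equal(sum(lesDistr), 1)))

# All diseased cases containing 4 lesions gives c(0, 0, 0, 1)
tabulate(rep(4, 5), nbins = 4) / 5
```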

Value

A list containing five ggplot2 objects (ROCPlot, AFROCPlot, wAFROCPlot, FROCPlot and PDFPlot) and four area measures (each of which can have up to two elements): the areas under the search model predicted ROC, AFROC, wAFROC and FROC curves, in up to two treatments.

  • ROCPlot The predicted ROC plots

  • AFROCPlot The predicted AFROC plots

  • wAFROCPlot The predicted wAFROC plots

  • FROCPlot The predicted FROC plots

  • PDFPlot The predicted ROC pdf plots, highest rating generated

  • aucROC The predicted ROC AUCs, highest rating generated

  • aucAFROC The predicted AFROC AUCs

  • aucwAFROC The predicted wAFROC AUCs

  • aucFROC The predicted FROC AUCs

References

Chakraborty DP (2006) A search model and figure of merit for observer data acquired according to the free-response paradigm, Phys Med Biol 51, 3449-3462.

Chakraborty DP (2006) ROC Curves predicted by a model of visual search, Phys Med Biol 51, 3463–3482.

Chakraborty, DP, Yoon, HJ (2008) Operating characteristics predicted by models for diagnostic tasks involving lesion localization, Med Phys, 35:2, 435.

Chakraborty DP (2017) Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples (CRC Press, Boca Raton, FL). https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840

Examples

## Following example is for mu = 2, lambda = 1, nu = 0.6, in one modality and
## mu = 3, lambda = 1.5, nu = 0.8, in the other modality. 20% of the diseased
## cases have a single lesion, 40% have two lesions, 10% have 3 lesions,
## and 30% have 4 lesions.

res <- PlotRsmOperatingCharacteristics(mu = c(2, 3), lambda = c(1, 1.5), nu = c(0.6, 0.8),
   lesDistr = c(0.2, 0.4, 0.1, 0.3), legendPosition = "bottom")

RSM predicted ROC-abscissa as function of z

Description

RSM predicted ROC-abscissa as function of z

Usage

RSM_FPF(z, lambda)

Arguments

z

The z-vector at which to evaluate the ROC-abscissa.

lambda

The scalar RSM lambda parameter.

Value

FPF, the abscissa of the ROC

Examples

RSM_FPF(c(-Inf,0.1,0.2,0.3),1)

RSM predicted FROC ordinate

Description

RSM predicted FROC ordinate

Usage

RSM_LLF(z, mu, nu)

Arguments

z

The z-vector value at which to evaluate the FROC ordinate.

mu

The scalar RSM mu parameter.

nu

The scalar RSM nu prime parameter.

Value

LLF, the ordinate of the FROC curve

Examples

RSM_LLF(c(1,2),1,0.5)

RSM predicted FROC abscissa

Description

RSM predicted FROC abscissa

Usage

RSM_NLF(z, lambda)

Arguments

z

The z-vector at which to evaluate the FROC abscissa.

lambda

The scalar RSM lambda parameter.

Value

NLF, the abscissa of the FROC curve

Examples

RSM_NLF(c(1,2),1)

RSM predicted ROC-rating pdf for diseased cases

Description

RSM predicted ROC-rating pdf for diseased cases

Usage

RSM_pdfD(z, mu, lambda, nu, lesDistr)

Arguments

z

The z-vector at which to evaluate the pdf.

mu

The scalar RSM mu parameter.

lambda

The scalar RSM lambda parameter.

nu

The scalar RSM nu parameter.

lesDistr

The lesion distribution 1D vector.

Value

pdf, density function for diseased cases

Examples

RSM_pdfD(c(1,2),1,1,0.9, c(0.5, 0.5))
RSM_pdfD(c(1,2),1,1,0.5, c(0.2, 0.3, 0.5))

RSM predicted ROC-rating pdf for non-diseased cases

Description

RSM predicted ROC-rating pdf for non-diseased cases

Usage

RSM_pdfN(z, lambda)

Arguments

z

The z-vector at which to evaluate the pdf.

lambda

The scalar RSM lambda parameter.

Value

pdf, density function for non-diseased cases

Examples

RSM_pdfN(c(1,2),1)

RSM predicted ROC-ordinate as function of z

Description

RSM predicted ROC-ordinate as function of z

Usage

RSM_TPF(z, mu, lambda, nu, lesDistr)

Arguments

z

The z-vector at which to evaluate the ROC-ordinate.

mu

The scalar RSM mu parameter.

lambda

The scalar RSM lambda parameter.

nu

The scalar RSM nu parameter.

lesDistr

The lesion distribution 1D vector.

Value

TPF, the ordinate of the ROC

Examples

lesDistr <- c(0.1,0.3,0.6)
RSM_TPF(c(-Inf,0.1,0.2,0.3), 1, 1, 0.9, lesDistr)

RSM predicted wAFROC ordinate, cpp code

Description

RSM predicted wAFROC ordinate, cpp code

Usage

RSM_wLLF(zeta, mu, nu, lesDistr, relWeights)

Arguments

zeta

The zeta-vector at which to evaluate the wAFROC ordinate.

mu

The scalar RSM mu parameter.

nu

The scalar RSM nu prime parameter.

lesDistr

Lesion distribution vector.

relWeights

The lesion weights matrix

Value

wLLF, the ordinate of the wAFROC curve

Examples

RSM_wLLF(1, 1, 0.9, lesDistr = c(0.5, 0.4, 0.1), relWeights = c(0.7, 0.2, 0.1)) 
## 0.34174

Simulate paired binned data for testing FitCorCbm

Description

Simulates a single-modality, 2-reader binned ROC dataset according to the CORCBM model, for the purpose of testing the fitting program FitCorCbm.

Usage

SimulateCorCbmDataset(
  seed = 123,
  K1 = 50,
  K2 = 50,
  desiredNumBins = 5,
  muX = 1.5,
  muY = 3,
  alphaX = 0.4,
  alphaY = 0.7,
  rhoNor = 0.3,
  rhoAbn2 = 0.8
)

Arguments

seed

The seed variable, default is 123; set to NULL for truly random seed

K1

The number of non-diseased cases, default is 50

K2

The number of diseased cases, default is 50

desiredNumBins

The desired number of bins; default is 5

muX

The CBM mu parameter in condition X

muY

The CBM mu parameter in condition Y

alphaX

The CBM alpha parameter in condition X

alphaY

The CBM alpha parameter in condition Y

rhoNor

The correlation of non-diseased case z-samples

rhoAbn2

The correlation of diseased case z-samples, when disease is visible in both conditions

Details

X and Y refer to the two arms of the pairing. muX and alphaX refer to the univariate CBM parameters in condition X (likewise muY and alphaY in condition Y), rhoNor is the correlation of ratings of non-diseased cases and rhoAbn2 is the correlation of ratings of diseased cases when disease is visible in both conditions. The ROC data is binned to desiredNumBins bins (5 by default) in each condition. See referenced publication.

Value

The desired dataset suitable for testing FitCorCbm

References

Zhai X, Chakraborty DP (2017) A bivariate contaminated binormal model for robust fitting of proper ROC curves to a pair of correlated, possibly degenerate, ROC datasets. Medical Physics. 44(6):2207–2222.

Examples

dataset <- SimulateCorCbmDataset()


## this takes very long
## dataset <- SimulateCorCbmDataset(K1 = 5000, K2 = 5000)

Simulates an MRMC uncorrelated FROC dataset using the RSM

Description

Simulates an uncorrelated MRMC FROC dataset for specified numbers of readers and treatments

Usage

SimulateFrocDataset(
  mu,
  lambda,
  nu,
  zeta1,
  I,
  J,
  K1,
  K2,
  perCase,
  seed = NULL,
  deltaMu = 0
)

Arguments

mu

mu parameter of the RSM

lambda

RSM lambda parameter

nu

RSM nu parameter

zeta1

Lowest reporting threshold

I

Number of treatments, default is 1

J

Number of readers

K1

Number of non-diseased cases

K2

Number of diseased cases

perCase

A K2 length array containing the numbers of lesions per diseased case

seed

Initial seed for random number generator, default NULL, for random seed.

deltaMu

Inter-modality increment in mu, default zero

Details

See book chapters on the Radiological Search Model (RSM) for details. In this code correlations between ratings on the same case are assumed to be zero.
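
The sampling scheme behind an RSM simulation can be sketched for a single case in base R. This follows the RSM description given for FitRsmRoc (Poisson number of latent NLs, binomial number of latent LLs, marks reported only above zeta1) and is not the code of SimulateFrocDataset; the helper simulateCaseSketch and the parameter values are hypothetical.

```r
set.seed(1)
mu <- 1; lambda <- 1; nu <- 0.9; zeta1 <- -1   # illustrative values

# Hypothetical helper: ratings of the marks reported on one case
simulateCaseSketch <- function(nLesions, mu, lambda, nu, zeta1) {
  nNL <- rpois(1, lambda)                        # number of latent NLs
  zNL <- rnorm(nNL)                              # latent NL ratings ~ N(0, 1)
  nLL <- rbinom(1, size = nLesions, prob = nu)   # number of latent LLs
  zLL <- rnorm(nLL, mean = mu)                   # latent LL ratings ~ N(mu, 1)
  # Only latent marks rated above the lowest threshold are actually reported
  list(NL = zNL[zNL > zeta1], LL = zLL[zLL > zeta1])
}

nonDiseased <- simulateCaseSketch(0, mu, lambda, nu, zeta1)   # no lesions
diseased <- simulateCaseSketch(2, mu, lambda, nu, zeta1)      # two lesions
str(nonDiseased)
str(diseased)
```

Each call is independent of the others, which mirrors the statement above that correlations between ratings on the same case are assumed to be zero.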

Value

An FROC dataset.

References

Chakraborty DP (2017) Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples, CRC Press, Boca Raton, FL. https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840

Examples

set.seed(1) 
K1 <- 5;K2 <- 7;
maxLL <- 2;perCase <- floor(runif(K2, 1, maxLL + 1))
mu <- 1;lambda <- 1;nu <- 0.99 ;zeta1 <- -1
I <- 2; J <- 5

frocDataRaw <- SimulateFrocDataset(
  mu = mu, lambda = lambda, nu = nu, zeta1 = zeta1,
  I = I, J = J, K1 = K1, K2 = K2, perCase = perCase )
  
## plot the data
ret <- PlotEmpiricalOperatingCharacteristics(frocDataRaw, opChType = "FROC")
## print(ret$Plot)

Simulates an "AUC-equivalent" FROC dataset from an LROC dataset

Description

Simulates a multiple-modality multiple-reader "AUC-equivalent" FROC dataset from a supplied LROC dataset, e.g., datasetCadLroc.

Usage

SimulateFrocFromLrocDataset(dataset)

Arguments

dataset

The LROC dataset to be converted to FROC.

Details

The LROC paradigm always yields a single mark per case. Therefore the equivalent FROC dataset also has only one mark per case. The NL arrays of the two datasets are identical. The LL array is created by copying the LLCl array of the LROC dataset to the LL array of the FROC dataset, from diseased case index k2 = 1 to k2 = K2. Additionally, the LLIl array of the LROC dataset is copied to the NL array of the FROC dataset, from case index k1 = K1+1 to k1 = K1+K2. Any zero ratings are replaced by -Inf. The equivalent FROC dataset has the same HrAuc as the original LROC dataset. See example. The main use of this function is to test the CAD significance testing functions using CAD FROC datasets, which I currently don't have.
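The role of the -Inf ratings can be illustrated with a small base-R sketch. It assumes that HrAuc is the Wilcoxon AUC of the highest rating on each case, with unmarked cases assigned -Inf; the helper names and the toy ratings are hypothetical and not taken from RJafroc.

```r
# Hypothetical helper: highest-rating inferred ROC rating for each case.
# `marks` is a list holding one numeric vector of mark ratings per case.
highestRating <- function(marks) {
  sapply(marks, function(r) if (length(r) == 0) -Inf else max(r))
}

# Empirical (Wilcoxon) AUC; ties, including -Inf vs -Inf, count one half
wilcoxonAuc <- function(nor, abn) {
  mean(outer(abn, nor, ">") + 0.5 * outer(abn, nor, "=="))
}

norMarks <- list(numeric(0), c(1.2), c(0.3, 2.1))  # marks on non-diseased cases
abnMarks <- list(c(2.5, 0.7), numeric(0), c(3.0))  # marks on diseased cases
hrNor <- highestRating(norMarks)   # -Inf, 1.2, 2.1
hrAbn <- highestRating(abnMarks)   # 2.5, -Inf, 3.0
hrAuc <- wilcoxonAuc(hrNor, hrAbn) # 6.5 / 9
```

With at most one mark per case, as in the converted dataset, the highest rating is simply that mark's rating, which is why the two AUCs agree.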

Value

The AUC-equivalent FROC dataset

Examples

frocDataset <- SimulateFrocFromLrocDataset(datasetCadLroc)
lrocAuc <- UtilFigureOfMerit(datasetCadLroc, FOM = "Wilcoxon")
frocHrAuc <- UtilFigureOfMerit(frocDataset, FOM = "HrAuc")   
testthat::expect_equal(lrocAuc, frocHrAuc)

Simulates an uncorrelated LROC dataset using the RSM

Description

Simulates an uncorrelated LROC dataset for specified numbers of readers and treatments

Usage

SimulateLrocDataset(mu, lambda, nu, zeta1, I, J, K1, K2, lesionVector)

Arguments

mu

The mu parameter of the RSM

lambda

The RSM lambda parameter

nu

The RSM nu parameter

zeta1

The lowest reporting threshold

I

The number of treatments

J

The number of readers

K1

The number of non-diseased cases

K2

The number of diseased cases

lesionVector

A K2 length array containing the numbers of lesions per diseased case

Details

See book chapters on the Radiological Search Model (RSM) for details. The approach is to first simulate an FROC dataset and then convert it to an LROC dataset. The correlations between FROC ratings on the same case are assumed to be zero.

Value

An LROC dataset.

References

Chakraborty DP (2017) Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples, CRC Press, Boca Raton, FL. https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840

Examples

set.seed(1)
  K1 <- 5; K2 <- 5; mu <- 2; lambda <- 1; lesionVector <- rep(1, 5); nu <- 0.8; zeta1 <- -3
  frocData <- SimulateFrocDataset(mu, lambda, nu, zeta1, I = 2, J = 5, K1, K2, lesionVector)
  lrocData <- DfFroc2Lroc(frocData)

Simulates a binormal model ROC dataset

Description

Simulates an uncorrelated binormal model ROC factorial dataset

Usage

SimulateRocDataset(I = 1, J = 1, K1, K2, a, deltaA = 0, b, seed = NULL)

Arguments

I

Number of modalities, default 1

J

The number of readers, default 1

K1

Number of non-diseased cases

K2

Number of diseased cases

a

The a parameter of the binormal model

deltaA

Inter-modality increment in the a parameter, default zero

b

The b parameter of the binormal model

seed

Initial seed, default is NULL, for random seed

Details

See book Chapter 6 for details
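For readers without the book to hand, the following base-R sketch shows one common parameterization of the binormal model for a single reader and modality: non-diseased z-samples from N(0,1) and diseased z-samples from N(a/b, 1/b^2), for which the population AUC is pnorm(a / sqrt(1 + b^2)). The helper names are hypothetical and the package's internal implementation may differ.

```r
# Hypothetical helper, not part of RJafroc: one reader, one modality.
simulateBinormal <- function(K1, K2, a, b) {
  list(
    nor = rnorm(K1, mean = 0, sd = 1),          # non-diseased z-samples
    abn = rnorm(K2, mean = a / b, sd = 1 / b)   # diseased z-samples
  )
}

# Empirical (Wilcoxon) AUC: P(abn > nor) + 0.5 * P(abn == nor)
wilcoxonAuc <- function(nor, abn) {
  mean(outer(abn, nor, ">") + 0.5 * outer(abn, nor, "=="))
}

set.seed(1)
z <- simulateBinormal(K1 = 2000, K2 = 2000, a = 1.5, b = 0.5)
empAuc <- wilcoxonAuc(z$nor, z$abn)
popAuc <- pnorm(1.5 / sqrt(1 + 0.5^2))   # binormal population AUC
c(empirical = empAuc, population = popAuc)
```

With a few thousand cases the empirical AUC should lie close to the population value; with the small case counts typical of observer studies the sampling variability is considerably larger.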

Value

An ROC dataset

References

Chakraborty DP (2017) Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples, CRC Press, Boca Raton, FL. https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840

Examples

K1 <- 5;K2 <- 7;a <- 1.5;b <- 0.5
rocDataRaw <- SimulateRocDataset(K1 = K1, K2 = K2, a = a, b = b)

Construct RSM NH model for FROC sample size estimation

Description

Construct RSM NH model for FROC sample size estimation

Usage

SsFrocNhRsmModel(dataset, lesDistr)

Arguments

dataset

The pilot dataset.

lesDistr

A 1D array containing the probability mass function of number of lesions per diseased case in the pivotal FROC study.

Value

A list containing:

  • mu, the RSM mu parameter of the NH model.

  • lambda, the RSM lambda parameter of the NH model.

  • nu, the RSM nu parameter of the NH model.

  • scaleFactor, the factor by which the ROC effect size must be multiplied to get the wAFROC effect size.

  • R2, the squared correlation of the wAFROC-AUC to ROC-AUC fit.


RSM fitted model for FROC sample size

Description

RSM fitted model for FROC sample size

Usage

SsFrocSampleSize(dataset, effectSizeROC, JPivot, KPivot, lesDistr)

Arguments

dataset

The pilot dataset.

effectSizeROC

The effect size in ROC-AUC units

JPivot

The number of readers in the pivotal study

KPivot

The number of cases in the pivotal study

lesDistr

A 1D array containing the probability mass function of number of lesions per diseased case in the pivotal FROC study.

Details

See https://dpc10ster.github.io/RJafrocQuickStart/froc-sample-size.html for explanation of the FROC sample size estimation procedure.

Value

A list containing:

  • effectSizeROC, the specified ROC effect size.

  • scaleFactor, the factor by which the ROC effect size must be multiplied to get the wAFROC effect size.

  • powerRoc, the ROC power.

  • powerFroc, the wAFROC power.

Examples

## Examples with CPU or elapsed time > 5s
## user system elapsed
## SsFrocSampleSize 8.102  0.023   8.135

## SsFrocSampleSize(DfExtractDataset(dataset04, trts = c(1,2)), 
## effectSizeROC = 0.03, JPivot = 5, KPivot = 100, lesDistr = c(0.69, 0.2, 0.11))

Statistical power for specified numbers of readers and cases

Description

Calculate the statistical power for specified numbers of readers J, cases K, analysis method and DBM or OR variance components

Usage

SsPowerGivenJK(
  dataset,
  ...,
  FOM,
  J,
  K,
  effectSize = NULL,
  method = "OR",
  covEstMethod = "jackknife",
  analysisOption = "RRRC",
  UseDBMHB2004 = FALSE,
  alpha = 0.05
)

Arguments

dataset

The pilot dataset. If set to NULL then variance components must be supplied.

...

Optional variance components: needed if dataset is not supplied.

FOM

The figure of merit.

J

The number of readers in the pivotal study.

K

The number of cases in the pivotal study.

effectSize

The effect size to be used in the pivotal study. Default is NULL, which uses the observed effect size in the pilot dataset. Must be supplied if dataset is set to NULL and variance components are supplied.

method

"OR" (the default) or "DBM" (but see UseDBMHB2004 option below).

covEstMethod

Specify the variance covariance estimation method(s): "jackknife" (the default), "bootstrap" or "DeLong" (for ROC datasets).

analysisOption

Specify the random factor(s): "RRRC" (the default), "RRFC" or "FRRC".

UseDBMHB2004

Logical, defaults to FALSE, which results in OR sample size method being used, even if DBM method is specified, as in Hillis 2011 & 2018 papers. If TRUE the method based on Hillis-Berbaum 2004 sample size paper is used.

alpha

The significance level, default is 0.05.

Details

The default effectSize uses the observed effect size in the pilot study. A numeric value overrides the default value. This argument must be supplied if dataset = NULL and variance components (the ... arguments) are supplied.

Value

The expected statistical power in pivotal study for the given conditions and J and K.

Note

The procedure is valid for ROC studies only; for FROC studies see Vignettes 19.

References

Hillis SL, Berbaum KS (2004). Power Estimation for the Dorfman-Berbaum-Metz Method. Acad Radiol, 11, 1260–1273.

Hillis SL, Obuchowski NA, Berbaum KS (2011). Power Estimation for Multireader ROC Methods: An Updated and Unified Approach. Acad Radiol, 18, 129–142.

Hillis SL, Schartz KM (2018). Multireader sample size program for diagnostic studies: demonstration and methodology. Journal of Medical Imaging, 5(04).

Examples

## EXAMPLE 1: RRRC power 
## specify 2-modality ROC dataset and force DBM alg.
res <- SsPowerGivenJK(dataset = dataset02, FOM = "Wilcoxon", effectSize = 0.05, 
J = 6, K = 251, method = "DBM", UseDBMHB2004 = TRUE) # RRRC is default  

## EXAMPLE 1A: FRRC power 
res <- SsPowerGivenJK(dataset = dataset02, FOM = "Wilcoxon", effectSize = 0.05, 
J = 6, K = 251, method = "DBM", UseDBMHB2004 = TRUE, analysisOption = "FRRC") 

## EXAMPLE 1B: RRFC power 
res <- SsPowerGivenJK(dataset = dataset02, FOM = "Wilcoxon", effectSize = 0.05, 
J = 6, K = 251, method = "DBM", UseDBMHB2004 = TRUE, analysisOption = "RRFC") 

## EXAMPLE 2: specify NULL dataset & DBM var. comp. & force DBM-based alg.
vcDBM <- UtilDBMVarComp(dataset02, FOM = "Wilcoxon")$VarCom
res <- SsPowerGivenJK(dataset = NULL, FOM = "Wilcoxon", J = 6, K = 251, 
effectSize = 0.05, method = "DBM", UseDBMHB2004 = TRUE, 
list( 
VarTR = vcDBM["VarTR","Estimates"], # replace rhs with actual values as in 5A
VarTC = vcDBM["VarTC","Estimates"], # ditto
VarErr = vcDBM["VarErr","Estimates"])) # ditto
                     
## EXAMPLE 3: specify 2-modality ROC dataset and use OR-based alg.
res <- SsPowerGivenJK(dataset = dataset02, FOM = "Wilcoxon", effectSize = 0.05, 
J = 6, K = 251)

## EXAMPLE 4: specify NULL dataset & OR var. comp. & use OR-based alg.
JStar <- length(dataset02$ratings$NL[1,,1,1])
KStar <- length(dataset02$ratings$NL[1,1,,1])
vcOR <- UtilORVarComp(dataset02, FOM = "Wilcoxon")$VarCom
res <- SsPowerGivenJK(dataset = NULL, FOM = "Wilcoxon", effectSize = 0.05, J = 6, 
K = 251, list(JStar = JStar, KStar = KStar, 
   VarTR = vcOR["VarTR","Estimates"], # replace rhs with actual values as in 4A
   Cov1 = vcOR["Cov1","Estimates"],   # ditto
   Cov2 = vcOR["Cov2","Estimates"],   # ditto
   Cov3 = vcOR["Cov3","Estimates"],   # ditto
   Var = vcOR["Var","Estimates"]))
   
## EXAMPLE 4A: specify NULL dataset & OR var. comp. & use OR-based alg.
res <- SsPowerGivenJK(dataset = NULL, FOM = "Wilcoxon", effectSize = 0.05, J = 6, 
K = 251, list(JStar = 5, KStar = 114, 
   VarTR = 0.00020040252,
   Cov1 = 0.00034661371,
   Cov2 = 0.00034407483,
   Cov3 = 0.00023902837,
   Var = 0.00080228827))
   
## EXAMPLE 5: specify NULL dataset & DBM var. comp. & use OR-based alg.
## The DBM var. comp. are converted internally to OR var. comp.
vcDBM <- UtilDBMVarComp(dataset02, FOM = "Wilcoxon")$VarCom
KStar <- length(dataset02$ratings$NL[1,1,,1])
res <- SsPowerGivenJK(dataset = NULL, J = 6, K = 251, effectSize = 0.05, 
method = "DBM", FOM = "Wilcoxon",
list(KStar = KStar,                # replace rhs with actual values as in 5A 
VarR = vcDBM["VarR","Estimates"], # ditto
VarC = vcDBM["VarC","Estimates"], # ditto
VarTR = vcDBM["VarTR","Estimates"], # ditto
VarTC = vcDBM["VarTC","Estimates"], # ditto
VarRC = vcDBM["VarRC","Estimates"], # ditto
VarErr = vcDBM["VarErr","Estimates"]))

## EXAMPLE 5A: specify NULL dataset & DBM var. comp. & use OR-based alg.
res <- SsPowerGivenJK(dataset = NULL, J = 6, K = 251, effectSize = 0.05, 
method = "DBM", FOM = "Wilcoxon",
list(KStar = 114,
VarR = 0.00153499935,
VarC = 0.02724923428,
VarTR = 0.00020040252,
VarTC = 0.01197529621,
VarRC = 0.01226472859,
VarErr = 0.03997160319))

Power given J, K and Dorfman-Berbaum-Metz variance components

Description

Power given J, K and Dorfman-Berbaum-Metz variance components

Usage

SsPowerGivenJKDbmVarCom(
  J,
  K,
  effectSize,
  VarTR,
  VarTC,
  VarErr,
  alpha = 0.05,
  analysisOption = "RRRC"
)

Arguments

J

The number of readers

K

The number of cases

effectSize

The effect size

VarTR

The modality-reader DBM variance component

VarTC

The modality-case DBM variance component

VarErr

The error-term DBM variance component

alpha

The size of the test (default = 0.05)

analysisOption

Specify the random factor(s): "RRRC", "FRRC", "RRFC"

Details

The variance components are obtained using St with method = "DBM".

Value

A list containing the estimated power and associated statistics for the specified random factor(s).

Examples

VarCom <- St(dataset02, FOM = "Wilcoxon", method = "DBM", 
   analysisOption = "RRRC")$ANOVA$VarCom
VarTR <- VarCom["VarTR",1]
VarTC <- VarCom["VarTC",1]
VarErr <- VarCom["VarErr",1]
ret <- SsPowerGivenJKDbmVarCom (J = 5, K = 100, effectSize = 0.05, VarTR, 
   VarTC, VarErr, analysisOption = "RRRC")
##cat("RRRC power = ", ret$powerRRRC)

Power given J, K and Obuchowski-Rockette variance components

Description

Power given J, K and Obuchowski-Rockette variance components

Usage

SsPowerGivenJKOrVarCom(
  J,
  K,
  KStar,
  effectSize,
  VarTR,
  Cov1,
  Cov2,
  Cov3,
  Var,
  alpha = 0.05,
  analysisOption = "RRRC"
)

Arguments

J

The number of readers in the pivotal study

K

The number of cases in the pivotal study

KStar

The number of cases in the pilot study

effectSize

The effect size

VarTR

The modality-reader OR variance component

Cov1

The OR Cov1 covariance

Cov2

The OR Cov2 covariance

Cov3

The OR Cov3 covariance

Var

The OR pure variance term

alpha

The size of the test (default = 0.05)

analysisOption

Specify the random factor(s): "RRRC", "FRRC", "RRFC"

Details

The variance components are obtained using St with method = "OR".
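To make the role of each argument concrete, the RRRC power calculation can be sketched in base R as below. This is an illustration under stated assumptions, not the package's code: it assumes two modalities and the non-central F formulation of Hillis, Obuchowski and Berbaum (2011), and the function name orPowerRRRC is hypothetical. The numerical inputs are the variance components listed in EXAMPLE 4A of SsPowerGivenJK.

```r
# Hypothetical helper, not the package's code: RRRC power, two modalities.
orPowerRRRC <- function(J, K, KStar, effectSize, VarTR, Cov1, Cov2, Cov3, Var,
                        alpha = 0.05) {
  scale <- KStar / K   # rescale pilot (co)variances from KStar to K cases
  den <- max(VarTR, 0) + scale * (Var - Cov1 + (J - 1) * max(Cov2 - Cov3, 0))
  ncp <- 0.5 * J * effectSize^2 / den   # non-centrality parameter
  ddf <- den^2 /
    ((max(VarTR, 0) + scale * (Var - Cov1 - max(Cov2 - Cov3, 0)))^2 / (J - 1))
  fCrit <- qf(1 - alpha, df1 = 1, df2 = ddf)   # critical value under the NH
  pf(fCrit, df1 = 1, df2 = ddf, ncp = ncp, lower.tail = FALSE)
}

powerRRRC <- orPowerRRRC(
  J = 6, K = 251, KStar = 114, effectSize = 0.05,
  VarTR = 0.00020040252, Cov1 = 0.00034661371, Cov2 = 0.00034407483,
  Cov3 = 0.00023902837, Var = 0.00080228827
)
powerRRRC
```

The KStar / K factor is what allows variance components estimated from a pilot study with KStar cases to be extrapolated to a pivotal study with K cases.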

Value

A list containing the estimated power and associated statistics for the specified random factor(s).

Examples

dataset <- dataset02 ## the pilot study
KStar <- length(dataset$ratings$NL[1,1,,1])
VarCom <- St(dataset, FOM = "Wilcoxon", 
method = "OR", analysisOption = "RRRC")$ANOVA$VarCom
VarTR <- VarCom["VarTR",1]
Cov1 <- VarCom["Cov1",1]
Cov2 <- VarCom["Cov2",1]
Cov3 <- VarCom["Cov3",1]
Var <- VarCom["Var",1]
ret <- SsPowerGivenJKOrVarCom (J = 5, K = 100, KStar = KStar,  
   effectSize = 0.05, VarTR, Cov1, Cov2, Cov3, Var, analysisOption = "RRRC")
    
##cat("RRRC power = ", ret$powerRRRC)

Generate a power table using the OR method

Description

Generate combinations of numbers of readers J and numbers of cases K for desired power and specified random factor(s)

Usage

SsPowerTable(
  dataset,
  FOM,
  effectSize = NULL,
  alpha = 0.05,
  desiredPower = 0.8,
  analysisOption = "RRRC"
)

Arguments

dataset

The pilot ROC dataset to be used to extrapolate to the pivotal study.

FOM

The figure of merit.

effectSize

The effect size to be used in the pivotal study, default value is NULL. See Details.

alpha

The size of the test, default is 0.05.

desiredPower

The desired statistical power, default is 0.8.

analysisOption

Specification of random factor(s): "RRRC" (the default), "FRRC", or "RRFC".

Details

The default effectSize uses the observed effect size in the pilot study. A supplied numeric value over-rides the default value.

Value

A list containing up to 3 (depending on analysisOption) data frames.

Each dataframe contains 3 arrays:

numReaders

The numbers of readers in the pivotal study.

numCases

The numbers of cases in the pivotal study.

power

The estimated statistical powers.

Note

The procedure is valid for ROC studies only; for FROC studies see online books.

Examples

## Examples with CPU or elapsed time > 5s
##              user    system elapsed
## SsPowerTable 20.033  0.037  20.077    

## Example of sample size calculation with OR method
## SsPowerTable(dataset02, FOM = "Wilcoxon")

Number of cases, for specified number of readers, to achieve desired power

Description

Number of cases to achieve the desired power, for specified number of readers J, and specified DBM or ORH analysis method

Usage

SsSampleSizeKGivenJ(
  dataset,
  ...,
  J,
  FOM,
  effectSize = NULL,
  method = "OR",
  alpha = 0.05,
  desiredPower = 0.8,
  analysisOption = "RRRC",
  UseDBMHB2004 = FALSE
)

Arguments

dataset

The pilot dataset. If set to NULL then variance components must be supplied.

...

Optional variance components, VarTR, VarTC and VarErr. These are needed if dataset is not supplied.

J

The number of readers in the pivotal study.

FOM

The figure of merit. Not needed if variance components are supplied.

effectSize

The effect size to be used in the pivotal study. Default is NULL. Must be supplied if dataset is set to NULL and variance components are supplied.

method

"OR" (default) or "DBM".

alpha

The significance level of the study, default is 0.05.

desiredPower

The desired statistical power, default is 0.8.

analysisOption

Specifies the random factor(s): "RRRC" (the default), "FRRC", or "RRFC".

UseDBMHB2004

Logical, default is FALSE, if TRUE the 2004 DBM method is used. Otherwise the OR method is used.

Details

effectSize = NULL uses the observed effect size in the pilot study. A numeric value overrides the default value. This argument must be supplied if dataset = NULL and variance components (the optional ... arguments) are supplied.
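Conceptually, the function looks for the smallest K at which the predicted power reaches desiredPower. A minimal base-R sketch of such a search is shown below; the linear scan, the helper name minCasesForPower and the stand-in power function toyPower are all assumptions made for illustration, and the package's actual search strategy may differ.

```r
# Hypothetical helper: smallest K in KRange whose power reaches desiredPower.
minCasesForPower <- function(powerFn, desiredPower = 0.8, KRange = 20:2000) {
  for (K in KRange) {
    pw <- powerFn(K)
    if (pw >= desiredPower) return(list(K = K, power = pw))
  }
  list(K = NA_integer_, power = NA_real_)   # target not reached within KRange
}

# Stand-in power function (two-sided z-test with standard error sigma / sqrt(K));
# in practice this would be the MRMC power for the chosen number of readers J.
toyPower <- function(K, effectSize = 0.05, sigma = 0.3, alpha = 0.05) {
  z <- effectSize * sqrt(K) / sigma
  pnorm(z - qnorm(1 - alpha / 2)) + pnorm(-z - qnorm(1 - alpha / 2))
}

ret <- minCasesForPower(toyPower, desiredPower = 0.8)
ret$K
ret$power
```

Because power increases with K for a fixed effect size, the first K that meets the target is also the minimum.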

Value

A list of two elements:

K

The minimum number of cases K in the pivotal study to just achieve the desired statistical power, calculated for each value of analysisOption.

power

The predicted statistical power.

Note

The procedure is valid for ROC studies only; for FROC studies see online books.

Examples

## the following two should give identical results
SsSampleSizeKGivenJ(dataset02, FOM = "Wilcoxon", effectSize = 0.05, J = 6, method = "DBM")

a <- UtilDBMVarComp(dataset02, FOM = "Wilcoxon")$VarCom
SsSampleSizeKGivenJ(dataset = NULL, J = 6, effectSize = 0.05, method = "DBM", UseDBMHB2004 = TRUE,
   list(VarTR = a["VarTR",1], 
   VarTC = a["VarTC",1], 
   VarErr = a["VarErr",1]))

## the following two should give identical results
SsSampleSizeKGivenJ(dataset02, FOM = "Wilcoxon", effectSize = 0.05, J = 6, method = "OR")

a <- UtilORVarComp(dataset02, FOM = "Wilcoxon")$VarCom
KStar <- length(dataset02$ratings$NL[1,1,,1])
SsSampleSizeKGivenJ(dataset = NULL, J = 6, effectSize = 0.05, method = "OR", 
   list(KStar = KStar, 
   VarTR = a["VarTR",1], 
   Cov1 = a["Cov1",1], 
   Cov2 = a["Cov2",1], 
   Cov3 = a["Cov3",1], 
   Var = a["Var",1]))

 
for (J in 6:10) {
 ret <- SsSampleSizeKGivenJ(dataset02, FOM = "Wilcoxon", J = J, analysisOption = "RRRC") 
 message("# of readers = ", J, " estimated # of cases = ", ret$K, 
 ", predicted power = ", signif(ret$powerRRRC,3), "\n")
}

DBM or OR significance testing for a one treatment factorial or two-treatment crossed modality factorial dataset (not SPLIT_PLOT)

Description

Performs DBM or OR significance testing for the dataset.

Usage

St(
  dataset,
  FOM,
  method = "OR",
  covEstMethod = "jackknife",
  analysisOption = "RRRC",
  alpha = 0.05,
  FPFValue = 0.2,
  nBoots = 200,
  seed = NULL,
  details = 0
)

Arguments

dataset

The dataset to be analyzed, see RJafroc-package. The dataset design can be "FCTRL" or "FCTRL-X-MOD".

FOM

The figure of merit, see UtilFigureOfMerit

method

The significance testing method to be used: "DBM" for the Dorfman-Berbaum-Metz method or "OR" for the Obuchowski-Rockette method (default).

covEstMethod

The covariance matrix estimation method in ORH analysis (for method = "DBM" the jackknife is always used).

  • "jackknife" (default),

  • "bootstrap", in which case nBoots is relevant, default 200,

  • "DeLong"; requires FOM = "Wilcoxon" or "ROI" or "HrAuc".

analysisOption

Determines which factors are regarded as random and which are fixed:

  • "RRRC" = random-reader random case (default),

  • "FRRC" = fixed-reader random case,

  • "RRFC" = random-reader fixed case.

alpha

The significance level (alpha) of the test of the null hypothesis that all modality effects are zero (default: alpha = 0.05).

FPFValue

Only needed for LROC data and FOM = "PCL" or "ALROC"; where to evaluate a partial curve based figure of merit. (default: FPFValue = 0.2).

nBoots

The number of bootstraps (defaults to 200), only needed if covEstMethod = "bootstrap" and method = "OR"

seed

For bootstraps the seed of the RNG (default: seed = NULL), only needed if method = "OR" and covEstMethod = "bootstrap".

details

Amount of explanation in the output: 0 (the default) for no explanations, 1 for explanations.

Value

A list containing the results of the analysis.

Note

details = 0 should suffice for factorial dataset analysis since the names of the output lists are self-explanatory. For cross-modality analysis details = 1 is suggested to better understand the output.
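The jackknife mentioned under covEstMethod can be illustrated for a single reader and modality with the base-R sketch below. It computes leave-one-case-out pseudovalues of the empirical (Wilcoxon) AUC and the corresponding jackknife variance estimate; the helper names are hypothetical, and the package applies the idea across all modality-reader combinations rather than to one set of ratings.

```r
# Empirical (Wilcoxon) AUC
wilcoxonAuc <- function(nor, abn) {
  mean(outer(abn, nor, ">") + 0.5 * outer(abn, nor, "=="))
}

# Hypothetical helper: leave-one-case-out jackknife pseudovalues of the AUC.
jackknifePseudovalues <- function(nor, abn) {
  K <- length(nor) + length(abn)
  theta <- wilcoxonAuc(nor, abn)
  thetaNor <- sapply(seq_along(nor), function(k) wilcoxonAuc(nor[-k], abn))
  thetaAbn <- sapply(seq_along(abn), function(k) wilcoxonAuc(nor, abn[-k]))
  K * theta - (K - 1) * c(thetaNor, thetaAbn)
}

set.seed(1)
nor <- rnorm(20)
abn <- rnorm(25, mean = 1.5)
pv <- jackknifePseudovalues(nor, abn)
theta <- wilcoxonAuc(nor, abn)
jackVar <- var(pv) / length(pv)   # jackknife variance estimate of the AUC
c(AUC = theta, meanPseudovalue = mean(pv), jackknifeVariance = jackVar)
```

For the Wilcoxon statistic the mean of the pseudovalues reproduces the original AUC estimate, which is a convenient check on an implementation.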

References

Dorfman DD, Berbaum KS, Metz CE (1992) ROC characteristic rating analysis: Generalization to the Population of Readers and Patients with the Jackknife method, Invest. Radiol. 27, 723-731.

Obuchowski NA, Rockette HE (1995) Hypothesis Testing of the Diagnostic Accuracy for Multiple Diagnostic Tests: An ANOVA Approach with Dependent Observations, Communications in Statistics: Simulation and Computation 24, 285-308.

Hillis SL (2014) A marginal-mean ANOVA approach for analyzing multireader multicase radiological imaging data, Statistics in medicine 33, 330-360.

Thompson JD, Chakraborty DP, Szczepura K, et al. (2016) Effect of reconstruction methods and x-ray tube current-time product on nodule detection in an anthropomorphic thorax phantom: a crossed-modality JAFROC observer study. Medical Physics. 43(3):1265-1274.

Chakraborty DP (2017) Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples, CRC Press, Boca Raton, FL. https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840

Examples

result <- St(dataset02,FOM = "Wilcoxon", method = "DBM")
result <- St(dataset02,FOM = "Wilcoxon", method = "OR")
result <- St(datasetX, FOM = "wAFROC", method = "OR", analysisOption = "RRRC")


result <- St(dataset05, FOM = "wAFROC")
result <- St(dataset05, FOM = "HrAuc", method = "DBM")

Significance testing: standalone CAD vs. radiologists

Description

Comparing standalone CAD vs. at least two radiologists interpreting the same cases; standalone CAD means that all the designer-level mark-rating pairs generated by the CAD algorithm are available to the analyst, not just the one or two marks per case displayed to the radiologist (the latter are marks whose ratings exceed a pre-selected threshold). At the very minimum, location-level information, such as in the LROC paradigm, should be used. Ideally, the FROC paradigm should be used. A severe statistical power penalty is paid if one uses the ROC paradigm. See Standalone CAD vs Radiologists chapter, available via download link at site https://github.com/dpc10ster/RJafrocBook/blob/gh-pages/RJafrocBook.pdf

Usage

StCadVsRad(
  dataset,
  FOM,
  FPFValue = 0.2,
  method = "1T-RRRC",
  alpha = 0.05,
  plots = FALSE
)

Arguments

dataset

The dataset to be analyzed; it must be a single-modality dataset with at least three readers, where the first reader is CAD.

FOM

The desired FOM; for ROC data it must be "Wilcoxon", for FROC data it can be any valid FOM, e.g., "HrAuc", "wAFROC", etc; for LROC data it must be "Wilcoxon", or "PCL" or "ALROC".

FPFValue

Only needed for LROC data and FOM = "PCL" or "ALROC"; where to evaluate a partial curve based figure of merit. The default is 0.2.

method

The desired analysis: "1T-RRFC", "1T-RRRC" (the default) or "2T-RRRC"; see manuscript for details.

alpha

Significance level of the test, defaults to 0.05.

plots

Flag, default is FALSE, i.e., a plot is not displayed. If TRUE, it displays the appropriate operating characteristic for all readers and CAD.

Details

  • PCL is the probability of a correct localization.

  • The LROC is the plot of PCL (ordinate) vs. FPF.

  • For LROC data, FOM = "PCL" means the interpolated PCL value at the specified FPFValue.

  • For FOM = "ALROC" the trapezoidal area under the LROC from FPF = 0 to FPF = FPFValue is used.

  • If method = "1T-RRRC" the first reader is assumed to be CAD.

  • If method = "2T-RRRC" the first modality is assumed to be CAD.

  • The NH is that the FOM of CAD equals the average of the readers.

  • The method = "1T-RRRC" analysis uses an adaptation of the single-modality multiple-reader Obuchowski Rockette (OR) model described in a paper by Hillis (2007), section 5.3. It is characterized by 3 parameters VarR, Var and Cov2, where the latter two are estimated using the jackknife.

  • For method = "2T-RRRC" the analysis replicates the CAD data as many times as necessary so as to form one "modality" of an MRMC pairing, the other "modality" being the radiologists. Then standard ORH analysis is applied. The method is described in Kooi et al. It gives exactly the same final results (F-statistic, ddf and p-value) as "1T-RRRC" but the intermediate quantities are meaningless.

Value

If method = "1T-RRRC" the return value is a list with the following elements:

fomCAD

The observed FOM for CAD.

fomRAD

The observed FOM array for the readers.

avgRadFom

The average FOM of the readers.

avgDiffFom

The mean of the difference FOM, RAD - CAD.

ciAvgDiffFom

The 95-percent CI of the average difference, RAD - CAD.

varR

The variance of the radiologists.

varError

The variance of the error term in the single-modality multiple-reader OR model.

cov2

The covariance of the error term.

tstat

The observed value of the t-statistic; its square is equivalent to an F-statistic.

df

The degrees of freedom of the t-statistic.

pval

The p-value for rejecting the NH.

Plots

If argument plots = TRUE, a ggplot object containing empirical operating characteristics corresponding to specified FOM. For example, if FOM = "Wilcoxon" an ROC plot object is produced where reader 1 is CAD. If an LROC FOM is selected, an LROC plot is displayed.

If method = "2T-RRRC" the return value is a list with the following elements:

fomCAD

The observed FOM for CAD.

fomRAD

The observed FOM array for the readers.

avgRadFom

The average FOM of the readers.

avgDiffFom

The mean of the difference FOM, RAD - CAD.

ciDiffFom

A data frame containing the statistics associated with the average difference, RAD - CAD.

ciAvgRdrEachTrt

A data frame containing the statistics associated with the average FOM in each "modality".

varR

The variance of the pure reader term in the OR model.

varTR

The variance of the modality-reader term in the OR model.

cov1

The covariance1 of the error term - same reader, different treatments.

cov2

The covariance2 of the error term - different readers, same modality.

cov3

The covariance3 of the error term - different readers, different treatments.

varError

The variance of the pure error term in the OR model.

FStat

The observed value of the F-statistic.

ndf

The numerator degrees of freedom of the F-statistic.

df

The denominator degrees of freedom of the F-statistic.

pval

The p-value for rejecting the NH.

Plots

See above.

References

Hillis SL (2007) A comparison of denominator degrees of freedom methods for multiple observer ROC studies, Statistics in Medicine. 26:596-619.

Chakraborty DP (2017) Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples, CRC Press, Boca Raton, FL. https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840

Hupse R, Samulski M, Lobbes M, et al (2013) Standalone computer-aided detection compared to radiologists performance for the detection of mammographic masses, Eur Radiol. 23(1):93-100.

Kooi T, Gubern-Merida A, et al. (2016) A comparison between a deep convolutional neural network and radiologists for classifying regions of interest in mammography. Paper presented at: International Workshop on Digital Mammography, Malmo, Sweden.

Examples

ret1M <- StCadVsRad (dataset09, 
FOM = "Wilcoxon", method = "1T-RRRC")

StCadVsRad(datasetCadLroc, 
FOM = "Wilcoxon", method = "1T-RRFC")

retLroc1M <- StCadVsRad (datasetCadLroc, 
FOM = "PCL", method = "1T-RRRC", FPFValue = 0.05)

## test with fewer readers
dataset09a <- DfExtractDataset(dataset09, rdrs = 1:7)
ret1M7 <- StCadVsRad (dataset09a, 
FOM = "Wilcoxon", method = "1T-RRRC")

datasetCadLroc7 <- DfExtractDataset(datasetCadLroc, rdrs = 1:7)
ret1MLroc7 <- StCadVsRad (datasetCadLroc7, 
FOM = "PCL", method = "1T-RRRC", FPFValue = 0.05)


## takes longer than 5 sec on OSX
## retLroc2M <- StCadVsRad (datasetCadLroc, 
## FOM = "PCL", method = "2T-RRRC", FPFValue = 0.05)

## ret2MLroc7 <- StCadVsRad (datasetCadLroc7, 
## FOM = "PCL", method = "2T-RRRC", FPFValue = 0.05)

Performs OR significance testing for SPLIT-PLOT A or C datasets

Description

Performs Obuchowski-Rockette (OR) significance testing for specified dataset.

Usage

StSP(dataset, FOM, alpha = 0.05, analysisOption = "RRRC")

Arguments

dataset

The dataset to be analyzed, see RJafroc-package. Must have two or more treatments and two or more readers. The dataset design must be "SPLIT-PLOT-A" or "SPLIT-PLOT-C".

FOM

The figure of merit

alpha

The significance level of the test, default is 0.05

analysisOption

Determines which factors are regarded as random vs. fixed:

  • "RRRC" = random-reader random case, the default,

  • "FRRC" = fixed-reader random case,

  • "RRFC" = random-reader fixed case.

Value

Results of the analysis

Examples

fileName <- system.file("extdata", "/toyFiles/ROC/rocSpAZP.xlsx", 
package = "RJafroc", mustWork = TRUE)
dsSpA <- DfReadSP(fileName)
ret <- StSP(dsSpA, FOM = "Wilcoxon")

ret <- StSP(datasetFROCSpC, FOM = "wAFROC")

Convert from physical to intrinsic RSM parameters

Description

Convert physical RSM parameters λ_i' and ν_i' to the intrinsic RSM parameters λ_i and ν_i. The physical parameters are more meaningful but they depend on μ. The intrinsic parameters are independent of μ. See book for details.

Usage

Util2Intrinsic(mu, lambda, nu)

Arguments

mu

The mean of the Gaussian distribution for the ratings of latent LLs, i.e. continuous ratings of lesions that were found by the search mechanism ~ N(μ, 1). The corresponding distribution for the ratings of latent NLs is N(0, 1).

lambda

The physical Poisson parameter λ_i', which describes the distribution of the random number of latent NLs (suspicious regions that do not correspond to actual lesions) per case; the mean of these random numbers asymptotically approaches lambda.

nu

The physical parameter ν_i'; it is the success probability of the binomial distribution describing the random number of latent LLs (suspicious regions that correspond to actual lesions) per diseased case.

Details

RSM is the Radiological Search Model described in the book. A latent mark becomes an actual mark if the corresponding rating exceeds the lowest reporting threshold zeta1. See also Util2Physical.
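The sampling scheme described by the arguments above can be sketched for a single case in base R. This is an illustration only: it treats lambda and nu as the physical Poisson and binomial parameters, samples one reader and one case at a time, and the helper name simulateRsmCase is hypothetical rather than part of RJafroc.

```r
# Hypothetical helper: RSM marks for a single case with `nLesions` lesions
# (nLesions = 0 for a non-diseased case). lambda and nu are the physical
# (Poisson and binomial) parameters.
simulateRsmCase <- function(mu, lambda, nu, zeta1, nLesions = 0) {
  nlRatings <- rnorm(rpois(1, lambda))                    # latent NLs ~ N(0, 1)
  llRatings <- rnorm(rbinom(1, nLesions, nu), mean = mu)  # latent LLs ~ N(mu, 1)
  list(
    NL = nlRatings[nlRatings > zeta1],   # only marks rated above zeta1 are reported
    LL = llRatings[llRatings > zeta1]
  )
}

set.seed(1)
marks <- simulateRsmCase(mu = 2, lambda = 1, nu = 0.9, zeta1 = -1, nLesions = 2)
marks
```

A case with no latent marks above zeta1 yields two empty vectors, which corresponds to an unmarked image.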

Value

A list containing λ_i and ν_i, the intrinsic RSM search parameters

References

Chakraborty DP (2006) A search model and figure of merit for observer data acquired according to the free-response paradigm, Phys Med Biol 51, 3449-3462.

Chakraborty DP (2006) ROC Curves predicted by a model of visual search, Phys Med Biol 51, 3463–3482.

Chakraborty DP (2017) Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples, CRC Press, Boca Raton, FL. https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840

Examples

mu <- 2;lambda <- 10;nu <- 0.9
lambda_i <- Util2Intrinsic(mu, lambda, nu)$lambda_i 
nu_i <- Util2Intrinsic(mu, lambda, nu)$nu_i 
## note that the intrinsic values are only constrained to be positive, e.g., nu_i is not constrained
## to be between 0 and 1.

Convert from intrinsic to physical RSM parameters

Description

Convert the intrinsic RSM parameters λ_i and ν_i to the corresponding physical RSM parameters λ and ν. The physical parameters are more meaningful but they depend on μ. The intrinsic parameters are independent of μ. See book for details.

Usage

Util2Physical(mu, lambda_i, nu_i)

Arguments

mu

The mean of the Gaussian distribution for the ratings of latent LLs, i.e. continuous ratings of lesions that were found by the search mechanism ~ N(μ, 1). The corresponding distribution for the ratings of latent NLs is N(0,1).

lambda_i

The intrinsic Poisson lambda_i parameter.

nu_i

The intrinsic Binomial nu_i parameter.

Details

RSM is the Radiological Search Model described in the book. See also Util2Intrinsic.

Value

A list containing lambda and nu, the physical RSM search parameters λ and ν.

References

Chakraborty DP (2006) A search model and figure of merit for observer data acquired according to the free-response paradigm, Phys Med Biol 51, 3449–3462.

Chakraborty DP (2006) ROC Curves predicted by a model of visual search, Phys Med Biol 51, 3463–3482.

Chakraborty DP (2017) Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples, CRC Press, Boca Raton, FL. https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840

Examples

mu <- 2;lambda_i <- 20;nu_i <- 1.1512925 
lambda <- Util2Physical(mu, lambda_i, nu_i)$lambda 
nu <- Util2Physical(mu, lambda_i, nu_i)$nu 
## note that the intrinsic values are only constrained to be positive;
## the physical variable nu must obey 0 <= nu <= 1
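
The two conversions are inverses of each other. The base-R sketch below does not call the package; it assumes the relations lambda_i = lambda * mu and nu_i = -log(1 - nu) / mu, which reproduce the numbers used in the examples above (mu = 2, lambda = 10, nu = 0.9 gives lambda_i = 20 and nu_i = 1.1512925). The helper names toIntrinsic and toPhysical are hypothetical.

```r
# Hypothetical helpers, not part of RJafroc. The formulas are an assumption
# chosen to agree with the example values shown above.
toIntrinsic <- function(mu, lambda, nu) {
  list(lambda_i = lambda * mu, nu_i = -log(1 - nu) / mu)
}
toPhysical <- function(mu, lambda_i, nu_i) {
  list(lambda = lambda_i / mu, nu = 1 - exp(-nu_i * mu))
}

mu <- 2
intr <- toIntrinsic(mu, lambda = 10, nu = 0.9)
phys <- toPhysical(mu, intr$lambda_i, intr$nu_i)

# Round trip: phys should recover the starting physical values
print(unlist(intr))
print(unlist(phys))
```

Under these assumed relations, 1 - exp(-nu_i * mu) stays below 1 for any positive nu_i and mu, which is why the intrinsic nu_i needs no upper bound while the physical nu does.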

RSM ROC/AFROC/wAFROC AUC calculator

Description

Returns the ROC, AFROC and wAFROC AUCs corresponding to specified RSM parameters. See also UtilAucPROPROC, UtilAucBIN and UtilAucCBM

Usage

UtilAnalyticalAucsRSM(mu, lambda, nu, zeta1 = -Inf, lesDistr, relWeights = 0)

Arguments

mu

The mean of the Gaussian distribution for the ratings of latent LLs (continuous ratings of lesions that are found by the search mechanism). The NLs are assumed to be distributed as N(0,1).

lambda

The RSM lambda parameter.

nu

The RSM nu parameter.

zeta1

The lowest reporting threshold, the default is -Inf.

lesDistr

The lesion distribution 1D array, i.e., the probability mass function (pmf) of the numbers of lesions for diseased cases.

relWeights

The relative weights of the lesions; a vector of length maxLL; if zero, the default, equal weights are assumed.

Value

The ROC, AFROC and wAFROC AUCs corresponding to the specified parameters

References

Chakraborty DP (2017) Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples, CRC Press, Boca Raton, FL. https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840

Chakraborty DP (2006) A search model and figure of merit for observer data acquired according to the free-response paradigm, Phys Med Biol 51, 3449-3462.

Chakraborty DP (2006) ROC Curves predicted by a model of visual search, Phys Med Biol 51, 3463–3482.

Examples

mu <- 1;lambda <- 1;nu <- 0.9
lesDistr <- c(0.9, 0.1) 
## i.e., 90% of dis. cases have one lesion, and 10% have two lesions
relWeights <- c(0.05, 0.95)
## i.e., lesion 1 has weight 5 percent while lesion two has weight 95 percent

UtilAnalyticalAucsRSM(mu, lambda, nu, zeta1 = -Inf, lesDistr)
UtilAnalyticalAucsRSM(mu, lambda, nu, zeta1 = -Inf, lesDistr, relWeights)

Binormal model AUC function

Description

Returns the Binormal model ROC-AUC corresponding to specified parameters. See also UtilAnalyticalAucsRSM, UtilAucPROPROC and UtilAucCBM

Usage

UtilAucBIN(a, b)

Arguments

a

The a parameter of the binormal model (separation of non-diseased and diseased pdfs)

b

The b parameter of the binormal model (std. dev. of the non-diseased pdf; the diseased pdf has unit std. dev.)

Value

Binormal model-predicted ROC-AUC

References

Dorfman DD, Alf E (1969) Maximum-Likelihood Estimation of Parameters of Signal-Detection Theory and Determination of Confidence Intervals - Rating-Method Data, Journal of Mathematical Psychology. 6:487-496.

Examples

a <- 2;b <- 0.7
UtilAucBIN(a,b)
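
For reference, the binormal ROC-AUC has the well-known closed form Φ(a / sqrt(1 + b^2)). The base-R sketch below evaluates that expression and cross-checks it by numerically integrating the binormal ROC curve, TPF = Φ(a + b Φ^{-1}(FPF)). The function names are hypothetical and the sketch does not call UtilAucBIN; whether the package uses this exact code path is not stated here.

```r
# Hypothetical helper: closed-form binormal AUC
aucBinormal <- function(a, b) pnorm(a / sqrt(1 + b^2))

# Binormal ROC curve: TPF as a function of FPF
rocBinormal <- function(fpf, a, b) pnorm(a + b * qnorm(fpf))

a <- 2; b <- 0.7
aucClosed <- aucBinormal(a, b)

# Independent check: area under the ROC curve by numerical integration
aucNumeric <- integrate(function(fpf) rocBinormal(fpf, a, b),
                        lower = 0, upper = 1)$value

print(c(closed = aucClosed, numeric = aucNumeric))
```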

CBM AUC function

Description

Returns the CBM ROC-AUC corresponding to specified parameters. See also UtilAnalyticalAucsRSM, UtilAucPROPROC and UtilAucBIN

Usage

UtilAucCBM(mu, alpha)

Arguments

mu

The mu parameter of CBM (separation of non-diseased and diseased pdfs)

alpha

The alpha parameter of CBM, i.e., the fraction of diseased cases on which the disease is visible

Value

CBM-predicted ROC-AUC for the specified parameters

References

Dorfman DD, Berbaum KS (2000) A contaminated binormal model for ROC data: Part II. A formal model, Acad Radiol 7:6 427–437.

Examples

mu <- 2;alpha <- 0.8
UtilAucCBM(mu,alpha)
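
As a sketch of what the two CBM parameters imply: if non-diseased ratings are N(0,1) and diseased ratings are a mixture, N(mu,1) with probability alpha (disease visible) and N(0,1) otherwise, then the AUC is 0.5 * (1 - alpha) + alpha * Φ(mu / sqrt(2)). The base-R code below evaluates this expression and checks it against numerical integration of the corresponding ROC curve. The function names are hypothetical and UtilAucCBM is not called.

```r
# Hypothetical helper: AUC of the two-component mixture described above
aucCbm <- function(mu, alpha) {
  (1 - alpha) * 0.5 + alpha * pnorm(mu / sqrt(2))
}

# ROC curve of the mixture: the invisible-disease component lies on the
# chance diagonal, the visible component is an equal-variance binormal curve
rocCbm <- function(fpf, mu, alpha) {
  (1 - alpha) * fpf + alpha * pnorm(mu + qnorm(fpf))
}

mu <- 2; alpha <- 0.8
aucClosed <- aucCbm(mu, alpha)
aucNumeric <- integrate(function(fpf) rocCbm(fpf, mu, alpha),
                        lower = 0, upper = 1)$value

print(c(closed = aucClosed, numeric = aucNumeric))
```

Setting alpha = 0 gives chance-level performance (AUC = 0.5), and alpha = 1 reduces to the equal-variance binormal model.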

PROPROC AUC function

Description

Returns the PROPROC ROC-AUC corresponding to specified parameters. See also UtilAnalyticalAucsRSM, UtilAucBIN and UtilAucCBM

Usage

UtilAucPROPROC(c1, da)

Arguments

c1

The c-parameter of the PROPROC model; it is named c1 because c is a built-in function in R.

da

The da-parameter of the PROPROC model.

Value

PROPROC model-predicted ROC-AUC for the specified parameters

References

Metz CE, Pan X (1999) Proper Binormal ROC Curves: Theory and Maximum-Likelihood Estimation, J Math Psychol 43(1):1-33.

Examples

c1 <- .2;da <- 1.5
UtilAucPROPROC(c1,da)

Convert from DBM to OR variance components

Description

UtilDBM2ORVarCom converts from DBM variance components to OR variance components

Usage

UtilDBM2ORVarCom(K, DBMVarCom)

Arguments

K

Total number of cases

DBMVarCom

DBM variance components, a data.frame containing VarR, VarC, VarTR, VarTC, VarRC and VarErr

Value

UtilDBM2ORVarCom returns the equivalent OR Variance components

Examples

DBMVarCom <- St(dataset02, FOM = "Wilcoxon", method = "DBM")$ANOVA$VarCom
UtilDBM2ORVarCom(114, DBMVarCom)

ORVarCom <- St(dataset02, FOM = "Wilcoxon", method = "OR")$ANOVA$VarCom
UtilOR2DBMVarCom(114, ORVarCom)

Utility for Dorfman-Berbaum-Metz variance components

Description

Utility for Dorfman-Berbaum-Metz variance components

Usage

UtilDBMVarComp(dataset, FOM, FPFValue = 0.2)

Arguments

dataset

The dataset object

FOM

The figure of merit

FPFValue

Only needed for LROC data and FOM = "PCL" or "ALROC"; where to evaluate a partial curve based figure of merit. The default is 0.2.

Value

A list containing the variance components.

Examples

result <- UtilDBMVarComp(dataset02, FOM = "Wilcoxon")

Calculate empirical figures of merit (FOMs) for factorial dataset, standard one-treatment or two-treatment cross-modality

Description

Calculate the specified empirical figure of merit for each modality-reader combination in a standard (1T) or cross-modality (2T) dataset

Usage

UtilFigureOfMerit(dataset, FOM = "wAFROC", FPFValue = 0.2)

Arguments

dataset

The dataset to be analyzed, see RJafroc-package.

FOM

The figure of merit; the default is "wAFROC"

FPFValue

Only needed for LROC data and FOM = "PCL" or "ALROC"; where to evaluate a partial curve based figure of merit. The default is 0.2.

Details

The allowed FOMs depend on the type of the dataset, i.e., the dataset$descriptions$type field (referred to as dataType below).

For dataset$descriptions$type = "ROC" only FOM = "Wilcoxon" is allowed. For dataset$descriptions$type = "FROC" the following FOMs are allowed:

  • FOM = "AFROC1" (use only if no non-diseased cases are available)

  • FOM = "AFROC"

  • FOM = "wAFROC1" (use only if no non-diseased cases are available)

  • FOM = "wAFROC" (the default)

  • FOM = "HrAuc"

  • FOM = "HrSe" (example of an end-point based FOM)

  • FOM = "HrSp" (another end-point based FOM)

  • FOM = "MaxLLF" (another end-point based FOM)

  • FOM = "MaxNLF" (another end-point based FOM)

  • FOM = "MaxNLFAllCases" (another end-point based FOM)

"MaxLLF" corresponds to the ordinate, and "MaxNLF" and "MaxNLFAllCases" to the abscissa, of the highest point on the FROC operating characteristic obtained by counting all the marks. Given the number of FOMs possible with FROC data, a recommendation is appropriate: use the wAFROC FOM whenever possible, and use the wAFROC1 FOM only if the dataset has no non-diseased cases.

For dataType = "ROI" dataset only FOM = "ROI" is allowed.

For dataType = "LROC" dataset the following FOMs are allowed:

  • FOM = "Wilcoxon" for ROC data inferred from LROC data

  • FOM = "PCL" the probability of correct localization at specified FPFValue

  • FOM = "ALROC" the area under the LROC from zero to specified FPFValue

FPFValue The FPF at which to evaluate PCL or ALROC; the default is 0.2; only needed for LROC data. For cross-modality analysis ROI and LROC datasets are not supported.

Value

For a standard (1T) dataset: a c(I, J) data frame, where the row names are the modalityIDs of the treatments and the column names are the readerIDs of the readers.

For a cross-modality (2T) dataset: a list containing two data frames:

  • a c(I2, J) data frame, FOMs averaged over the first modality, where the row names are the modality IDs of the second modality

  • a c(I1, J) data frame, FOMs averaged over the second modality, where the row names are the modality IDs of the first modality

For either 1T or 2T the column names are the readerIDs.

References

Chakraborty DP (2017) Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples, CRC Press, Boca Raton, FL. https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840

Examples

res <- UtilFigureOfMerit(dataset02, FOM = "Wilcoxon") # ROC data
res <- UtilFigureOfMerit(dataset01) # FROC dataset, default wAFROC FOM
res <- UtilFigureOfMerit(datasetX, FOM = "wAFROC")
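
To make the "Wilcoxon" FOM concrete, the following base-R sketch computes the empirical Wilcoxon statistic for a single modality-reader combination: the average, over all pairs of one diseased and one non-diseased case, of a kernel that is 1 if the diseased rating is higher, 0.5 if the ratings are tied, and 0 otherwise. The ratings and the function name wilcoxonFom are hypothetical; this is not the package implementation.

```r
# Hypothetical helper: empirical Wilcoxon statistic (empirical ROC-AUC)
wilcoxonFom <- function(ndRatings, dRatings) {
  # diffs[i, j] = rating of diseased case i minus rating of non-diseased case j
  diffs <- outer(dRatings, ndRatings, "-")
  mean((diffs > 0) + 0.5 * (diffs == 0))
}

nd <- c(1, 2, 3)  # hypothetical ratings of non-diseased cases
d  <- c(2, 3, 4)  # hypothetical ratings of diseased cases
fom <- wilcoxonFom(nd, d)
print(fom)  # 7 of 9 pair-credits, i.e., 7/9
```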

The lesionID distribution of a dataset or a supplied 1D-array

Description

The lesionID distribution of a dataset or of a supplied 1D-array. The lesionID field is described in the format of the Excel input file, see QuickStart.

Usage

UtilLesDistr(dsOrArr)

Arguments

dsOrArr

A dataset object from which the lesion distribution is extracted or a 1D-array specifying the lesion distribution.

Details

Apart from ratings two characteristics of an FROC dataset affect the FOM: the distribution of lesionIDs per case and the lesion weights. This function addresses the former. The latter is addressed in UtilLesWghts. The return value is a dataframe containing 2 equal length vectors: the lesionID labels and the corresponding fractions of lesions per diseased case in the dataset. For ROC or LROC data this vector is always c(1), since all diseased cases contain one lesion. For FROC data the first element of the dataframe, i.e., lesID, contains the numbers of lesions per diseased case. The second element, i.e., Freq, contains the fraction of dis. cases containing one lesion, the fraction containing two lesions, etc. See PlotRsmOperatingCharacteristics for a function that depends on LD.

Value

A data frame containing the number of lesionIDs per case, lesID, and the frequency distribution of the lesionIDs, Freq.

Examples

res <- UtilLesDistr(dataset01) # FROC dataset
##    lesID       Freq
## 1      1 0.93258427
## 2      2 0.06741573
## In the Excel input file, 93 percent of diseased cases have one lesion
## and the rest have two lesions


res <- UtilLesDistr(dataset02) # ROC dataset
##       lesID Freq
## 1         1    1
## In the Excel input file all dis. cases have one lesion

res <- UtilLesDistr(datasetCadLroc) # LROC dataset
##       lesID Freq
## 1         1    1
## In the Excel input file all dis. cases have one lesion

res <- UtilLesDistr(c(0.5, 0.3, 0.1, 0.1))
##      lesID Freq
## 1        1  0.5
## 2        2  0.3
## 3        3  0.1
## 4        4  0.1
## An example of array input; 50 percent of the diseased cases have one lesion,
## 30 percent have two lesions, etc.

res <- UtilLesDistr(dataset11) ## big froc dataset
##     lesID        Freq
## 1       1 0.217391304
## 2       2 0.200000000
## 3       3 0.113043478
## 4       4 0.086956522
## 5       5 0.043478261
## 6       6 0.095652174
## 7       7 0.052173913
## 8       8 0.069565217
## 9       9 0.017391304
## 10     10 0.026086957
## 11     11 0.026086957
## 12     12 0.026086957
## 16     16 0.017391304
## 20     20 0.008695652

## This dataset has lots of lesions (3D imaging for lung cancer).
## The lesionIDs range from 1 to 20 with a few missing,
## e.g., lesionID = 13 is not present in any diseased case.
## Cases with lesionID = 1 have frequency 0.217, those with lesionID = 16
## have frequency 0.0174, those with lesionID = 20 have frequency 0.00870, etc.
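
The kind of table described in Details can be sketched in base R from a vector giving the number of lesions on each diseased case. The input vector below is hypothetical, and the code only illustrates the idea of a lesion distribution; it does not read a dataset object or call UtilLesDistr.

```r
# Hypothetical input: number of lesions on each of 8 diseased cases
lesionsPerCase <- c(1, 1, 2, 1, 3, 1, 2, 1)

# Tabulate how many diseased cases have 1, 2, 3, ... lesions
counts <- table(lesionsPerCase)

# Fraction of diseased cases for each observed number of lesions
lesDistr <- data.frame(
  lesID = as.integer(names(counts)),
  Freq  = as.vector(counts) / length(lesionsPerCase)
)
print(lesDistr)
```

As in the dataset11 output above, a number of lesions that never occurs simply has no row, and the Freq column sums to one.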

Lesion weights distribution matrix

Description

Determine the lesion weights distribution 2D matrix of a dataset or manually specify the lesion weights distribution.

Usage

UtilLesWghtsDS(dsOrArr, relWeights = 0)

UtilLesWghtsLD(LDOrArr, relWeights = 0)

Arguments

dsOrArr

A dataset object or a 1D-array, see UtilLesDistr.

relWeights

The relative weights of the lesions: a unit sum vector of length equal to the maximum number of lesions per dis. case. For example, c(0.2, 0.4, 0.1, 0.3) means that on cases with one lesion the weight of the lesion is unity, on cases with two lesions the ratio of the weight of the first lesion to that of the second lesion is 0.2:0.4, i.e., lesion 2 is twice as important as lesion 1. On cases with 4 lesions the weights are in the ratio 0.2:0.4:0.1:0.3. The default is zero, in which case equal lesion weights are assumed.

LDOrArr

The lesion distribution (LD) dataframe object produced by UtilLesDistr or a 1D-array.

Details

Two characteristics of an FROC dataset, apart from the ratings, affect the FOM: the distribution of lesions per case and the distribution of lesion weights. This function addresses the weights. The distribution of lesions is addressed in UtilLesDistr. See PlotRsmOperatingCharacteristics for a function that depends on lesWghtDistr. The underlying assumption is that lesion 1 is the same type across all diseased cases, lesion 2 is the same type across all diseased cases, ..., etc. This allows assignment of weights independent of the case index.

Value

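The lesion weights distribution matrix. As shown in the Examples, each row corresponds to a number of lesions per diseased case: the first column holds that number, the remaining columns hold the weights of the individual lesions (summing to one in each row), and unused entries are filled with -Inf.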

Examples

UtilLesWghtsDS (dataset01) # FROC data

##      [,1] [,2] [,3]
##[1,]    1  1.0 -Inf
##[2,]    2  0.5  0.5

UtilLesWghtsDS (dataset02) # ROC data

##      [,1] [,2]
##[1,]    1  1

UtilLesWghtsDS(c(0.7,0.2,0.1)) # only frequencies supplied
## relWeights defaults to zero

## Dataset with 1 to 4 lesions per case, with frequency as per first argument

UtilLesWghtsLD (c(0.6, 0.2, 0.1, 0.1), c(0.2, 0.4, 0.1, 0.3))

##       [,1]  [,2]      [,3]      [,4]   [,5]
##[1,]    1 1.0000000      -Inf      -Inf -Inf 
##[2,]    2 0.3333333 0.6666667      -Inf -Inf
##[3,]    3 0.2857143 0.5714286 0.1428571 -Inf
##[4,]    4 0.2000000 0.4000000 0.1000000  0.3

## Explanation 
##> c(0.2)/sum(c(0.2))
##[1] 1 ## (weights for cases with 1 lesion)
##> c(0.2, 0.4)/sum(c(0.2, 0.4))
##[1] 0.3333333 0.6666667 ## (weights for cases with 2 lesions)
##> c(0.2, 0.4, 0.1)/sum(c(0.2, 0.4, 0.1))
##[1] 0.2857143 0.5714286 0.1428571 ## (weights for cases with 3 lesions)
##> c(0.2, 0.4, 0.1, 0.3)/sum(c(0.2, 0.4, 0.1, 0.3))
##[1] 0.2000000 0.4000000 0.1000000  0.3 ## (weights for cases with 4 lesions)

UtilLesWghtsLD (c(0.1, 0.7, 0.0, 0.2), c(0.4, 0.3, 0.2, 0.1)) 

## Weights are included for non-existent `lesionID` = 3 but corresponding frequency will be zero
##      [,1]       [,2]       [,3]       [,4] [,5]
## [1,]    1 1.00000000       -Inf       -Inf -Inf
## [2,]    2 0.57142857 0.42857143       -Inf -Inf
## [3,]    3 0.44444444 0.33333333 0.22222222 -Inf
## [4,]    4 0.40000000 0.30000000 0.20000000  0.1


UtilLesWghtsDS(dataset05, relWeights = c(0.78723404, 0.17021277, 0.04255319))
##      [,1]       [,2]       [,3]       [,4]
## [1,]    1 1.00000000       -Inf       -Inf
## [2,]    2 0.82222222 0.17777778       -Inf
## [3,]    3 0.78723404 0.17021277 0.04255319
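
The renormalization shown in the "Explanation" comments above can be sketched in a few lines of base R. The function name lesWeightsMatrix is hypothetical; the sketch covers only the case where explicit relative weights are supplied (it does not handle the relWeights = 0 default or use the lesion frequencies), so it is an illustration of the row-by-row normalization rather than the package code.

```r
# Hypothetical helper: build the weights matrix from relative weights.
# Row n holds the weights for cases with n lesions: the first n relative
# weights renormalized to sum to one; unused entries are -Inf.
lesWeightsMatrix <- function(relWeights) {
  maxLL <- length(relWeights)
  w <- matrix(-Inf, nrow = maxLL, ncol = maxLL + 1)
  for (n in seq_len(maxLL)) {
    w[n, 1] <- n
    w[n, 2:(n + 1)] <- relWeights[1:n] / sum(relWeights[1:n])
  }
  w
}

w <- lesWeightsMatrix(c(0.2, 0.4, 0.1, 0.3))
print(w)
```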

Calculate mean squares for factorial dataset

Description

Calculates the mean squares used in the DBM and ORH methods for factorial dataset

Usage

UtilMeanSquares(dataset, FOM = "Wilcoxon", FPFValue = 0.2, method = "DBM")

Arguments

dataset

The dataset to be analyzed, see RJafroc-package.

FOM

The figure of merit to be used in the calculation. The default is "Wilcoxon". See UtilFigureOfMerit.

FPFValue

Only needed for LROC data and FOM = "PCL" or "ALROC"; where to evaluate a partial curve based figure of merit. The default is 0.2.

method

The method for which the mean squares are calculated. The two valid choices are "DBM" (default) and "OR".

Details

For the DBM method, msT, msTR, msTC and msTRC are not available if the dataset contains only one modality (NAs are returned). Similarly, NAs are returned for msR, msTR, msRC and msTRC for a single-reader dataset. For the ORH method, msT, msR and msTR are returned for a multiple-reader multiple-modality dataset; msT is not available for a single-modality dataset and msR is not available for a single-reader dataset.

Value

A list containing the mean squares

Examples

result <- UtilMeanSquares(dataset02, FOM = "Wilcoxon")
result <- UtilMeanSquares(dataset05, FOM = "wAFROC", method = "OR")

Convert from OR to DBM variance components

Description

UtilOR2DBMVarCom converts from OR to DBM variance components.

Usage

UtilOR2DBMVarCom(K, ORVarCom)

Arguments

K

Total number of cases

ORVarCom

OR variance components, a data.frame containing VarR, VarTR, Cov1, Cov2, Cov3 and Var

Value

UtilOR2DBMVarCom returns the equivalent DBM variance components

Examples

DBMVarCom <- St(dataset02, FOM = "Wilcoxon", method = "DBM")$ANOVA$VarCom
UtilDBM2ORVarCom(114, DBMVarCom)

ORVarCom <- St(dataset02, FOM = "Wilcoxon", method = "OR")$ANOVA$VarCom
UtilOR2DBMVarCom(114, ORVarCom)

Obuchowski-Rockette variance components for dataset

Description

Obuchowski-Rockette variance components for dataset

Usage

UtilORVarComp(
  dataset,
  FOM,
  covEstMethod = "jackknife",
  FPFValue = 0.2,
  nBoots = 200,
  seed = NULL
)

Arguments

dataset

Factorial one-treatment or cross-modality two-treatment dataset

FOM

Figure of merit

covEstMethod

The covariance estimation method, "jackknife" (the default) or "bootstrap" or "DeLong" ("DeLong" is applicable only for FOM = "Wilcoxon").

FPFValue

Only needed for LROC data and FOM = "PCL" or "ALROC": the FPFValue at which to evaluate a partial curve based figure of merit. The default is 0.2.

nBoots

The number of bootstraps (default 200). Only needed for covEstMethod = "bootstrap".

seed

Only needed for the bootstrap covariance estimation method. The initial seed for the random number generator, the default is NULL, for random seed.

Details

The variance components are identical to those obtained using St with method = "OR".

Value

A list containing the following data.frames:

  • foms: the figures of merit for different modality-reader combinations

  • TRanova: the OR modality-reader ANOVA table

  • VarCom: the OR variance-components Cov1, Cov2, Cov3, Var and correlations rho1, rho2 and rho3

  • IndividualTrt: the individual modality mean-squares, Var and Cov2 values

  • IndividualRdr: the individual reader mean-squares, Var and Cov1 values

Examples

## use the default jackknife for covEstMethod
vc <- UtilORVarComp(dataset02, FOM = "Wilcoxon")

##UtilORVarComp(dataset02, FOM = "Wilcoxon", covEstMethod = "bootstrap", 
##nBoots = 2000, seed = 100)$VarCom 

##UtilORVarComp(dataset02, FOM = "Wilcoxon", covEstMethod = "DeLong")$VarCom

vc <- UtilORVarComp(datasetX, FOM = "wAFROC")

Pseudovalues for given factorial or crossed modality dataset and FOM

Description

Returns centered jackknife pseudovalues AND jackknife FOM values, for factorial study designs

Usage

UtilPseudoValues(dataset, FOM, FPFValue = 0.2)

Arguments

dataset

The dataset to be analyzed, see RJafroc-package.

FOM

The figure of merit to be used; it must be specified, as the usage above shows no default. See UtilFigureOfMerit.

FPFValue

Only needed for LROC data and FOM = "PCL" or "ALROC"; where to evaluate a partial curve based figure of merit. The default is 0.2.

Value

A list containing two arrays containing the pseudovalues and the jackknife FOM values of the datasets (a third returned value is for internal use).

Note

Each returned array has dimension c(I,J,K), where K depends on the FOM: K1 for FOMs that are based on normal cases only, K2 for FOMs that are based on abnormal cases only, and K for FOMs that are based on normal and abnormal cases.

Examples

result <- UtilPseudoValues(dataset05, FOM = "wAFROC")$jkFomValues[1,1,1:10]
result <- UtilPseudoValues(datasetX, FOM = "wAFROC")
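
The relationship between jackknife FOM values and pseudovalues can be illustrated with a base-R sketch for one modality-reader combination and the Wilcoxon FOM. It uses the textbook jackknife definition, pseudovalue_k = K * FOM - (K - 1) * FOM_(-k), where FOM_(-k) is the FOM with case k removed. The ratings and function names are hypothetical, and any additional centering applied by the package is not reproduced here.

```r
# Hypothetical helper: empirical Wilcoxon FOM
wilcoxon <- function(nd, d) {
  diffs <- outer(d, nd, "-")
  mean((diffs > 0) + 0.5 * (diffs == 0))
}

# Hypothetical ratings for one modality-reader combination
nd <- c(1, 2, 2, 3, 1)     # K1 = 5 non-diseased cases
d  <- c(2, 3, 4, 4, 3, 5)  # K2 = 6 diseased cases
K  <- length(nd) + length(d)
theta <- wilcoxon(nd, d)

# Jackknife FOM values: recompute the FOM with each case removed in turn
jkFom <- c(
  sapply(seq_along(nd), function(k) wilcoxon(nd[-k], d)),
  sapply(seq_along(d),  function(k) wilcoxon(nd, d[-k]))
)

# Textbook jackknife pseudovalues, one per case
pseudo <- K * theta - (K - 1) * jkFom
print(round(pseudo, 4))
```

Because the Wilcoxon FOM depends on both normal and abnormal cases, there is one pseudovalue for each of the K = K1 + K2 cases, in line with the Note above.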