Title: | Multi-Reader, Multi-Case Analysis Methods (ROC, Agreement, and Other Metrics) |
---|---|
Description: | This software does Multi-Reader, Multi-Case (MRMC) analyses of data from imaging studies where clinicians (readers) evaluate patient images (cases). What does this mean? ... Many imaging studies are designed so that every reader reads every case in all modalities, a fully-crossed study. In this case, the data is cross-correlated, and we consider the readers and cases to be cross-correlated random effects. An MRMC analysis accounts for the variability and correlations from the readers and cases when estimating variances, confidence intervals, and p-values. The functions in this package can treat arbitrary study designs and studies with missing data, not just fully-crossed study designs. An overview of this software, including references presenting details on the methods, can be found here: <https://www.fda.gov/medical-devices/science-and-research-medical-devices/imrmc-software-do-multi-reader-multi-case-statistical-analysis-reader-studies>. |
Authors: | Brandon Gallas [aut, cre] |
Maintainer: | Brandon Gallas <[email protected]> |
License: | CC0 |
Version: | 2.1.0 |
Built: | 2025-03-02 03:33:22 UTC |
Source: | https://github.com/cran/iMRMC |
Convert an MRMC data frame to a design matrix, dropping readers or cases with no observations
convertDFtoDesignMatrix(dfMRMC, modality = NULL, dropFlag = TRUE)
convertDFtoDesignMatrix(dfMRMC, modality = NULL, dropFlag = TRUE)
dfMRMC |
An MRMC data frame |
modality |
The score matrix depends on the modality. If more than one modality exists in the data frame, you must specify which modality to subset. |
dropFlag |
[logical] The default setting (TRUE) removes readers and cases that have no observations. Dropping them by default will speed up analyses. Leaving the levels (dropFlag = FALSE) is useful if you need the entire score or design matrix when comparing or doing analyses with two modalities. |
A matrix that is nCases by nReaders indicating which scores were reported for each reader and case
Convert an MRMC data frame to a score matrix, dropping readers or cases with no observations
convertDFtoScoreMatrix(dfMRMC, modality = NULL, dropFlag = TRUE)
convertDFtoScoreMatrix(dfMRMC, modality = NULL, dropFlag = TRUE)
dfMRMC |
An MRMC data frame |
modality |
The score matrix depends on the modality. If more than one modality exists in the data frame, you must specify which modality to subset. |
dropFlag |
[logical] The default setting (TRUE) removes readers and cases that have no observations. Dropping them by default will speed up analyses. Leaving the levels (dropFlag = FALSE) is useful if you need the entire score or design matrix when comparing or doing analyses with two modalities. |
A matrix that is nCases by nReaders of the scores each reader reported for each case
Assign a group label to items in a vector
createGroups(items, nG)
createGroups(items, nG)
items |
A vector of items |
nG |
The number of groups |
A data frame containing the items and their group labels
x <- paste("item", 1:10, sep = "") df <- createGroups(x, 3) print(df)
x <- paste("item", 1:10, sep = "") df <- createGroups(x, 3) print(df)
Convert a data frame with all needed factors to doIMRMC formatted data frame
createIMRMCdf( dFrame, keyColumns = list(readerID = "readerID", caseID = "caseID", modalityID = "modalityID", score = "score", truth = "truth"), truePositiveFactor = "cancer" )
createIMRMCdf( dFrame, keyColumns = list(readerID = "readerID", caseID = "caseID", modalityID = "modalityID", score = "score", truth = "truth"), truePositiveFactor = "cancer" )
dFrame |
This data frame includes columns for readerID, caseID, modalityID, score, and truth. These columns are not expected to be named as such and other columns may exist. |
keyColumns |
This list identifies the column names of the data frame to be used for the analysis. list(readerID = "***", caseID = "***", modalityID = "***", score = "***", truth="***") |
truePositiveFactor |
The true positive label, such as "cancer" or "1" |
output a doIMRMC formatted data frame: rows for truth and rows for data. The results will be an iMRMC formatted data frame, see dfMRMC_example
Delete a data frame column
deleteCol(df, colName)
deleteCol(df, colName)
df |
A data frame |
colName |
Column name or list of column names to be deleted |
The data frame without the deleted column or columns
An example data frame formatted for 'doIMRMC' and other iMRMC functions.
dfMRMC_example
dfMRMC_example
A data frame with 880 rows and 4 columns:
Factor with 5 levels like "reader1", "reader2", ... As well as the special reader "truth"
Factor with 80 levels like "case1", "case2", ...
Factor with 2 levels like "modality1", "modality2", ... As well as the special modality "truth"
Numeric reader score
Each row of this data frame corresponds to an observation. For every caseID, there must be a row corresponding to the truth observation. The readerID for a truth observation is "truth". The modalityID for a truth observation is "truth". The score for a truth observation must be either 0 (signal-absent) or 1 (signal-present).
# Create a sample configuration file config <- sim.gRoeMetz.config() # Simulate an MRMC ROC data set dfMRMC_example <- sim.gRoeMetz(config) # Analyze the MRMC ROC data result <- doIMRMC(dfMRMC_example)
# Create a sample configuration file config <- sim.gRoeMetz.config() # Simulate an MRMC ROC data set dfMRMC_example <- sim.gRoeMetz(config) # Analyze the MRMC ROC data result <- doIMRMC(dfMRMC_example)
Execute a Multi-Reader, Multi-Case (MRMC) analysis of ROC data from imaging studies where clinicians (readers) evaluate patient images (cases). An overview of this software, including references presenting details on the methods, can be found HERE or as an entry in the FDA/CDRH Regulatory Science Tool Catalog HERE.
doAUCmrmc(data, flagROC = FALSE)
doAUCmrmc(data, flagROC = FALSE)
data |
an iMRMC formatted data frame, see dfMRMC_example |
flagROC |
boolean indicating if ROC results should be computed. |
The MRMC analysis results as a list, below is a quick summary:
summaryMRMC, list
a list of summary study design information.
nM, num
number of modalities
nR, num
number of readers
nC.neg, num
number of signal-present caeses
nC.pos, num
number of signal-absent cases
modalites, char
names of modalities
readers, char
names of modalities
cases.neg, char
names of modalities
cases.pos, char
names of modalities
perReader.full, data.frame
this data frame contains the performance results for each reader and modality comparison.
The analysis returns the final AUC results and the moments, coefficients of those moments.
Key variables of this data frame are AUC.1 (where '.1' indicates the row's reader and
modality '.1' pair), AUC.2 ('.2' indicates the '.2' reader and modality pair), and covAUC.
Ustat.full, data.frame
this data frame contains the reader-average AUC performance results.
The analysis results are based on U-statistics.
Key variables of this data frame are AUC.1, AUC.2, AUC1minusAUC2 and the corresponding
variances, confidence intervals, degrees of freedom and p-values.
ROC, list
each object of this list is an object containing an ROC curve.
There is an ROC curve for every combination of reader and modality.
For every modality, there are also four average ROC curves. These are discussed in
Chen2014_Br-J-Radiol_v87p20140016.
The diagonal average averages the reader-specific ROC curves along y = -x + b for b in (0,1).
The horizontal average averages the reader specific ROC curves along y = b for b in (0,1).
The vertical average averages the reader specific ROC curves along x = b for b in (0,1).
The pooled average ignores readerID and pools all the scores together to create one ROC curve.
# Create a sample configuration file config <- sim.gRoeMetz.config() # Simulate an MRMC ROC data set dFrame.imrmc <- sim.gRoeMetz(config) # Analyze the MRMC ROC data and compute ROC curves aucResult <- doAUCmrmc(dFrame.imrmc, flagROC = TRUE)
# Create a sample configuration file config <- sim.gRoeMetz.config() # Simulate an MRMC ROC data set dFrame.imrmc <- sim.gRoeMetz(config) # Analyze the MRMC ROC data and compute ROC curves aucResult <- doAUCmrmc(dFrame.imrmc, flagROC = TRUE)
Execute a Multi-Reader, Multi-Case (MRMC) analysis
of ROC data from imaging studies where clinicians (readers) evaluate patient
images (cases). This function is a wrapper that executes
doAUCmrmc
and formats the output to generally match the output of
doIMRMC
version 1.2.5.
An overview of this software, including references presenting
details on the methods, can be found HERE
or as an entry in the FDA/CDRH Regulatory Science Tool Catalog
HERE.
doIMRMC(data)
doIMRMC(data)
data |
an iMRMC formatted data frame, see dfMRMC_example |
Unlike the legacy doIMRMC_java
, the 'varDecomp' results no
longer scale the covariance by a factor of 2. This scaling is needed when
calculating the total variance of the difference in modalities. The user
must scale this covariance by 2 manually now to achieve the total variance
of the difference in modalities result.
The MRMC analysis results, below is a quick summary:
perReader, data.frame
The performance results for each combination of reader and pair of modalities.
Key variables of this data frame are AUCA, AUCB, AUCAminusAUCB
and the corresponding variances.
When the modalities differ, the variance is understood to be the covariance between the modalities.
Ustat, data.frame
Reader-averaged performance results for each pair of modalities.
The analysis results are based on U-statistics.
Key variables of this data frame are AUCA, AUCB,
AUCAminusAUCB and the corresponding variances,
confidence intervals, degrees of freedom, and p-values.
When the modalities differ, the variance is understood to be the covariance between the modalities.
MLEstat, data.frame
Reader-average performance results for each pair of modalities.
The analysis results are based on V-statistics, which
approximates the true distribution with
the empirical distribution. The empirical distribution equals
the nonparametric MLE
estimate of the true distribution, which is also equivalent
to the ideal bootstrap estimate.
Key variables of this data frame are AUCA, AUCB,
AUCAminusAUCB and the corresponding
variances, confidence intervals, degrees of freedom, and p-values.
When the modalities differ, the variance is understood to be the covariance between the modalities.
varDecomp, list
list of data frames of the coefficient and components of
variance. The analysis includes variance decomposition based off both the
BDG and BCK MRMC methods, and Ustat and MLE statistical methods.
Each MRMC and statistical method combination is contained within this
list of lists.
ROC, list
each object of this list is an object containing an ROC curve.
There is an ROC curve for every combination of reader and modality.
For every modality, there are also four average ROC curves.
These are discussed in Chen2014_Br-J-Radiol_v87p20140016.
The diagonal average averages the reader-specific ROC curves
along y = -x + b for b in (0,1).
The horizontal average averages the reader specific ROC curves
along y = b for b in (0,1).
The vertical average averages the reader specific ROC curves
along x = b for b in (0,1).
The pooled average ignores readerID and pools all the scores
together to create one ROC curve.
full, list
This returns the same result as doAUCmrmc
.
# Create a sample configuration file config <- sim.gRoeMetz.config() # Simulate an MRMC ROC data set dFrame.imrmc <- sim.gRoeMetz(config) # Analyze the MRMC ROC data result <- doIMRMC(dFrame.imrmc)
# Create a sample configuration file config <- sim.gRoeMetz.config() # Simulate an MRMC ROC data set dFrame.imrmc <- sim.gRoeMetz(config) # Analyze the MRMC ROC data result <- doIMRMC(dFrame.imrmc)
doIMRMC_java takes ROC data as a data frame and runs a multi-reader multi-case analysis
based on U-statistics as described in the following papers
Gallas2006_Acad-Radiol_v13p353 (single-modality),
Gallas2008_Neural-Networks_v21p387 (multiple modalities, arbitrary study designs),
Gallas2009_Commun-Stat-A-Theor_v38p2586 (framework paper). This function is deprecated, please use doIMRMC
instead.
doIMRMC_java( data = NULL, fileName = NULL, workDir = NULL, iMRMCjarFullPath = NULL, stripDatesForTests = FALSE )
doIMRMC_java( data = NULL, fileName = NULL, workDir = NULL, iMRMCjarFullPath = NULL, stripDatesForTests = FALSE )
data |
an iMRMC formatted data frame, see dfMRMC_example |
fileName |
This character string identifies the location of an iMRMC input file. The input file is identical to data except there is a free text section to start, then a line with "BEGIN DATA:", then the data frame info. |
workDir |
This character string determines the directory where intermediate results are written. If this parameter is not set, the program writes the intermediate results to the directory specified by tempdir() and then deletes them. |
iMRMCjarFullPath |
This character string identifies the location of the iMRMC.jar file this jar file can be downloaded from https://github.com/DIDSR/iMRMC/releases this R program supports version iMRMC-v3p2.jar |
stripDatesForTests |
Since results include a date and time stamp, these need to be stripped out when doing the package tests. This parameter flags whether or not the dates should be stripped out. |
In detail, this procedure reads the name of an input file from the local file system, or takes a data frame and writes it to the local file system formatted for the iMRMC program (found at https://github.com/DIDSR/iMRMC/releases), it executes a java app, the iMRMC engine, which writes the results to the local files system, it reads the analysis results from the local file system, packs the analysis results into a list object, deletes the data and analysis results from the local file system, and returns the list object.
This software requires Java(>=8).
The examples took too long for CRAN to accept. So here is an example:
# Create a sample configuration file config <- sim.gRoeMetz.config() # Simulate an MRMC ROC data set dFrame.imrmc <- sim.gRoeMetz(config) # Analyze the MRMC ROC data result <- doIMRMC_java(dFrame.imrmc)
[list] iMRMC outputs. The objects of this list are described in detail in the iMRMC documentation which can be found at <http://didsr.github.io/iMRMC/000_iMRMC/userManualHTML/index.htm>
Here is a quick summary:
perReader
data.frame containing the performance results for each reader.
Key variables of this data frame are AUCA, AUCB, AUCAminusAUCB and the corresponding
variances, confidence intervals, degrees of freedom and p-values.
Ustat
data.frame containing the reader-average performance results.
The analysis results are based on U-statistics and the papers listed above.
Key variables of this data frame are AUCA, AUCB, AUCAminusAUCB and the corresponding
variances, confidence intervals, degrees of freedom and p-values.
MLEstat
data.frame containing the reader-average performance results.
The analysis results are based on V-statistics, which approximates the true distribution with
the empirical distribution. The empirical distribution equals the nonparametric MLE
estimate of the true distribution, which is also equivalent to the ideal bootstrap estimate.
Please refer to the papers listed above.
Key variables of this data frame are AUCA, AUCB, AUCAminusAUCB and the corresponding
variances, confidence intervals, degrees of freedom and p-values.
ROC
list containing ROC curves
There is an ROC curve for every combination of reader and modality.
For every modality, there are also four average ROC curves. These are discussed in
Chen2014_Br-J-Radiol_v87p20140016.
The diagonal average averages the reader-specific ROC curves along y = -x + b for b in (0,1).
The horizontal average averages the reader specific ROC curves along y = b for b in (0,1).
The vertical average averages the reader specific ROC curves along x = b for b in (0,1).
The pooled average ignores readerID and pools all the scores together to create one ROC curve.
varDecomp
list containing different decompositions of the total variance.
Please refer to Gallas2009_Commun-Stat-A-Theor_v38p2586 (framework paper).
The different decompositions are BCK, BDG, DBM, MS, OR.
Empirically average over multiple empirical ROC curves
doROCavg(ROC, direction = "SeSp")
doROCavg(ROC, direction = "SeSp")
ROC |
list of ROC curves. Each element of the list is a data frame with pairs of (fpf, tpf) operating points. |
direction |
the direction over which to average
|
data frame of an ROC curve
fpf
False-positive fractions (1-specificity)
tpf
True-positive fractions (sensitivity)
Create a standard set of ROC curves from an MRMC data frame
doROCcurveMRMC(mrmcAlternate)
doROCcurveMRMC(mrmcAlternate)
mrmcAlternate |
data frame
|
list of ROC curves corresponding to all reader x modality combinations, all modalities with the reader scores pooled, all modalities with the per-reader ROC curves, horizontally, diagonally, and vertically averaged.
Create empirical ROC curve
doROCxy(sa, sp)
doROCxy(sa, sp)
sa |
signal-absent scores |
sp |
signal-present scores |
data frame of an ROC curve
fpf
False-positive fractions (1-specificity)
tpf
True-positive fractions (sensitivity)
threshold
Threshold corresponding to each fpf, tpf
operating point
# Create a sample configuration file config <- sim.gRoeMetz.config() # Simulate an MRMC ROC data set dFrame.imrmc <- sim.gRoeMetz(config) # Isolate signal absent scores indexSA <- grep("negCase", dFrame.imrmc$caseID) sa <- dFrame.imrmc[indexSA, ]$score # Isolate signal present scores indexSP <- grep("posCase", dFrame.imrmc$caseID) sp <- dFrame.imrmc[indexSP, ]$score # Compute empirical ROC curve result <- doROCxy(sa, sp)
# Create a sample configuration file config <- sim.gRoeMetz.config() # Simulate an MRMC ROC data set dFrame.imrmc <- sim.gRoeMetz(config) # Isolate signal absent scores indexSA <- grep("negCase", dFrame.imrmc$caseID) sa <- dFrame.imrmc[indexSA, ]$score # Isolate signal present scores indexSP <- grep("posCase", dFrame.imrmc$caseID) sp <- dFrame.imrmc[indexSP, ]$score # Compute empirical ROC curve result <- doROCxy(sa, sp)
Create empirical ROC curve from an MRMC formatted data frame
doROCxyMRMC(mrmcAlternate)
doROCxyMRMC(mrmcAlternate)
mrmcAlternate |
data frame
|
data frame of an ROC curve
modalityID
readerID
fpf
False-positive fractions (1-specificity)
tpf
True-positive fractions (sensitivity)
threshold
Threshold corresponding to each fpf, tpf
Extract between-reader between-modality pairs of scores
extractPairedComparisonsBRBM( data0, modalities = c("testA", "testB"), keyColumns = list(readerID = "readerID", caseID = "caseID", modalityID = "modalityID", score = "score") )
extractPairedComparisonsBRBM( data0, modalities = c("testA", "testB"), keyColumns = list(readerID = "readerID", caseID = "caseID", modalityID = "modalityID", score = "score") )
data0 |
This data frame includes columns for readerID, caseID, modalityID, score. |
modalities |
The modalities (testA, testB) for the scores to be paired |
keyColumns |
This list identifies the column names of the data frame to be used for the analysis. list(readerID = "***", caseID = "***", modalityID = "***", score = "***", truth="***") |
A data frame of all paired observations. Each observation comes from a pair of readers evaluating a case in two modalities. The first column corresponds to one reader evaluating the case in testA. The second column corresonds to the other reader evaluating the case in testB.
Extract within-reader between-modality pairs of scores
extractPairedComparisonsWRBM( data0, modalities = "testA", keyColumns = list(readerID = "readerID", caseID = "caseID", modalityID = "modalityID", score = "score") )
extractPairedComparisonsWRBM( data0, modalities = "testA", keyColumns = list(readerID = "readerID", caseID = "caseID", modalityID = "modalityID", score = "score") )
data0 |
This data frame includes columns for readerID, caseID, modalityID, score. |
modalities |
The modalities (testA, testB) for the scores to be paired |
keyColumns |
This list identifies the column names of the data frame to be used for the analysis. list(readerID = "***", caseID = "***", modalityID = "***", score = "***", truth="***") |
A data frame of all paired observations. Each observation comes from a one reader evaluating a case in two modalities The first column corresponds to one reader evaluating the case in "testA". The second column corresonds to the same reader evaluating the case in "testB".
Get between-reader, between-modality paired data from an MRMC data frame
getBRBM(mcsData, modality.X, modality.Y)
getBRBM(mcsData, modality.X, modality.Y)
mcsData |
A data frame with the following columns: readerID, caseID, modalityID, score |
modality.X |
The name of one modality |
modality.Y |
The name of one modality. |
If modality.Y = modality.X, then the data would be between-reader, within-modality (BRWM).
The result of merging the modality.X and modality.Y subsets of mcsData by caseID for every pair of readers
Import MRMC dataset from the web (https://github.com/DIDSR/iMRMC/wiki/iMRMC-Datasets)
getMRMCdataset(dataset = "viperObs")
getMRMCdataset(dataset = "viperObs")
dataset |
Possible dataset options available:
|
desired dataset downloaded from the web as a csv
# Save Prostate MRI ground truth and reader data truthData <- getMRMCdataset("prostateTruth") rawData <- getMRMCdataset("prostateRawData")
# Save Prostate MRI ground truth and reader data truthData <- getMRMCdataset("prostateTruth") rawData <- getMRMCdataset("prostateRawData")
Get a score from an MRMC data frame
getMRMCscore(df, iR, iC, modality)
getMRMCscore(df, iR, iC, modality)
df |
An MRMC data frame |
iR |
The numeric index of the readerID |
iC |
The numeric index of the caseID |
modality |
The character description of the modalityID |
The score
Get within-reader, between-modality paired data from an MRMC data frame
getWRBM(mrmcDF, modality.X, modality.Y)
getWRBM(mrmcDF, modality.X, modality.Y)
mrmcDF |
A data frame with the following columns: readerID, caseID, modalityID, score |
modality.X |
The name of one modality |
modality.Y |
The name of one modality. This should be different from modality.X |
The result of merging the modality.X and modality.Y subsets of mrmcDF by readerID and caseID
See the documentation for the parallel package
.
If you require backwards compatibility, please run RNGversion("3.5.0")
.
init.lecuyerRNG(seed = 1, stream = 2)
init.lecuyerRNG(seed = 1, stream = 2)
seed |
This determines the position in each stream |
stream |
This determines the stream |
Nothing
These four functions calculate four types of Limits of Agreement using ANOVA:
Within-Reader Within-Modality(WRWM), Between-Reader Within-Modality(BRWM),
Within-Reader Between-Modality(WRBM), and Between-Reader Between-Modality(BRBM).
The 95% confidence interval of the mean difference is also provided. If the study is fully crossed, the ANOVA
methods are realized either by applying stats::aov
or by matrix multiplication. Otherwise, the SS in ANOVA are
computed as residual sums of squares of linear models. See details below about the model structure
and these references.
S. Wen and B. D. Gallas, “Three-Way Mixed Effect ANOVA to Estimate MRMC Limits of Agreement,” Statistics in Biopharmaceutical Research, 14, pp. 532–541, 2022, doi:10.1080/19466315.2022.2063169.
S. Wen and B. D. Gallas, “Expanding to Arbitrary Study Designs: ANOVA to Estimate Limits of Agreement for MRMC Studies,” arXiv, 2023, doi:10.48550/ARXIV.2312.16097.
laWRBM( df, modalitiesToCompare = c("testA", "testB"), keyColumns = c("readerID", "caseID", "modalityID", "score"), if.aov = TRUE, type = 1, reader.first = TRUE ) laBRWM( df, modality = c("testA"), keyColumns = c("readerID", "caseID", "modalityID", "score"), if.aov = TRUE, type = 1, reader.first = TRUE ) laWRWM( df, replicatesToCompare = c("testA", "testB"), keyColumns = c("readerID", "caseID", "modalityID", "score"), if.aov = TRUE, type = 1, reader.first = TRUE ) laBRBM( df, modalitiesToCompare = c("testA", "testB"), keyColumns = c("readerID", "caseID", "modalityID", "score"), if.aov = TRUE, type = 1, reader.first = TRUE, is.sparseQR = T )
laWRBM( df, modalitiesToCompare = c("testA", "testB"), keyColumns = c("readerID", "caseID", "modalityID", "score"), if.aov = TRUE, type = 1, reader.first = TRUE ) laBRWM( df, modality = c("testA"), keyColumns = c("readerID", "caseID", "modalityID", "score"), if.aov = TRUE, type = 1, reader.first = TRUE ) laWRWM( df, replicatesToCompare = c("testA", "testB"), keyColumns = c("readerID", "caseID", "modalityID", "score"), if.aov = TRUE, type = 1, reader.first = TRUE ) laBRBM( df, modalitiesToCompare = c("testA", "testB"), keyColumns = c("readerID", "caseID", "modalityID", "score"), if.aov = TRUE, type = 1, reader.first = TRUE, is.sparseQR = T )
df |
Data frame of observations, one per row. Columns identify random effects, fixed effects, and the observation. Namely,
|
modalitiesToCompare |
The factors identifying the modalities to compare. It should be length 2.
Default = |
keyColumns |
Identify the factors corresponding to the readerID (random effect), caseID (random effect),
modalityID (fixed effect), and score (observation).
Default = |
if.aov |
Boolean value to determine whether to use the 'stats::aov' function or to
calculate the ANOVA statistics explicitly. 'stats::aov' is only appropriate
for fully-crossed study only. This flag permits head-to-head comparisons of
the output from 'stats::aov' and the explicit calculations.
Default = |
type |
Identify how SS are computed in ANOVA for unbalanced study designs. The possible values are c(1,2,3), corresponding to the approaches introduced in the SAS package(Langsrud2003_Stat-Comput_v13p163). Default |
reader.first |
Boolean value to determine whether reader effect is added to the model before the case effect.
Default |
modality |
The factor identifying the modality for laBRWM. It should be length 1.
Default = |
replicatesToCompare |
The factors identifying the replicates to compare for |
is.sparseQR |
Boolean value to determine whether the 'base::qr' function assumes the input
data is sparse or not.
Default = |
Suppose the score from a reader j for case k under modality is
, then the difference score from the
same reader for the same case under two different modalities is
.
laWRBM
use two-way random effect ANOVA to analyze the difference scores . The model
is
, where
and
are random effects for readers
and cases. The variances of mean and individual observations are expressed as linear combinations of the MS
given by ANOVA.
laBRWM
use two-way random effect ANOVA to analyze the scores for a single modality.
The model is
, where
and
are random effects
for readers and cases. The variances of mean and individual observations are expressed as linear combinations
of the MS given by ANOVA.
laWRWM
use two-way random effect ANOVA to analyze the difference scores from the same
reader for the same cases under the same modality with different replicates
.
The model is
, where
and
are random effects for
readers and cases. The variances of mean and individual observations are expressed as linear combinations of
the MS given by ANOVA.
laBRBM
use three-way mixed effect ANOVA to analyze the scores . The model is given by
, where
and
are random effects for readers and cases,
is a fixed effect for modality, and the other terms
are interaction terms. The variances of mean and individual observations are expressed as linear combinations
of the MS given by ANOVA.
A list of two dataframes.
The first dataframe is limits.of.agreement
. It has one row. Each column is as follows:
The mean difference score.
The variance of the mean difference score.
The variance of the difference score.
Lower bound of 95% CI for the mean difference score. meanDiff+
1.96*sqrt(var.MeanDiff)
Upper bound of 95% CI for the mean difference score. meanDiff-
1.96*sqrt(var.MeanDiff)
Lower Limit of Agreement for the difference score. meanDiff+1.96*sqrt(var.1obs)
Upper Limit of Agreement for the difference score. meanDiff-1.96*sqrt(var.1obs)
The second dataframe is two.way.ANOVA
or three.way.ANOVA
shows the degrees of freedom,
sums of squares, and estimates of variance components for each source of variation
# Initialize the simulation configuration parameters config <- sim.NormalIG.Hierarchical.config(modalityID = c("testA", "testB")) # Initizlize the seed and stream of the random number generator init.lecuyerRNG() # Simulate an MRMC ROC data set dFrame <- sim.NormalIG.Hierarchical(config) # Compute Limits of Agreement laWRBM_result <- laWRBM(dFrame) print(laWRBM_result) laBRBM_result <- laBRBM(dFrame) print(laBRBM_result)
# Initialize the simulation configuration parameters config <- sim.NormalIG.Hierarchical.config(modalityID = c("testA", "testB")) # Initizlize the seed and stream of the random number generator init.lecuyerRNG() # Simulate an MRMC ROC data set dFrame <- sim.NormalIG.Hierarchical(config) # Compute Limits of Agreement laWRBM_result <- laWRBM(dFrame) print(laWRBM_result) laBRBM_result <- laBRBM(dFrame) print(laBRBM_result)
Rename a data frame column name or a list object name
renameCol(df, oldColName, newColName)
renameCol(df, oldColName, newColName)
df |
A data frame |
oldColName |
Old column name |
newColName |
New column name |
the data frame with the updated column name
Convert ROC data formatted for doIMRMC to TPF and FPF data formatted for doIMRMC
roc2binary(df.auc, threshold)
roc2binary(df.auc, threshold)
df.auc |
data frame of roc scores formatted for doIMRMC |
threshold |
The threshold for determining binary decisions |
a list of two data frames (df.tpf and df.fpf) both formatted for doIMRMC
# Create a sample configuration file config <- sim.gRoeMetz.config() # Simulate an MRMC ROC data set dFrame.imrmc <- sim.gRoeMetz(config) # Convert ROC MRMC data to TPF and FPF data frames result <- roc2binary(dFrame.imrmc, threshold = 0.9) # Analyze TPF data using doIMRMC tpf_result <- doIMRMC(result$df.tpf) # View(tpf_result$perReader)
# Create a sample configuration file config <- sim.gRoeMetz.config() # Simulate an MRMC ROC data set dFrame.imrmc <- sim.gRoeMetz(config) # Convert ROC MRMC data to TPF and FPF data frames result <- roc2binary(dFrame.imrmc, threshold = 0.9) # Analyze TPF data using doIMRMC tpf_result <- doIMRMC(result$df.tpf) # View(tpf_result$perReader)
This is a data frame containing the configuration parameters
used in Roe1997_Acad-Radiol_v4p298. Each row corresponds to one of the twelve configurations
appearing in Table 1 of that paper in a format that can be the input to sim.gRoeMetz
.
The columns of this data frame are as follows
Experiment labels and size
modalityID.A: [character] label modality A
modalityID.B: [character] label modality B
nR: [numeric] number of readers
nC.neg: [numeric] number of signal-absent cases
nC.pos: [numeric] number of signal-present cases
There are six fixed effects:
mu.neg: [numeric] signal-absent (neg, global mean)
mu.pos: [numeric] signal-present (pos, global mean)
mu.Aneg: [numeric] modality A signal-absent (Aneg, modality effect)
mu.Bneg: [numeric] modality B signal-absent (Bneg, modality effect)
mu.Apos: [numeric] modality A signal-present (Apos, modality effect)
mu.Bpos: [numeric] modality B signal-present (Bpos, modality effect)
There are six random effects that are independent of modality
var_r.neg: [numeric] variance of random reader effect
var_c.neg: [numeric] variance of random case effect
var_rc.neg: [numeric] variance of random reader by case effect
var_r.pos: [numeric] variance of random reader effect
var_c.pos: [numeric] variance of random case effect
var_rc.pos: [numeric] variance of random reader by case effect
There are six random effects that are specific to modality A
var_r.Aneg: [numeric] variance of random reader effect
var_c.Aneg: [numeric] variance of random case effect
var_rc.Aneg: [numeric] variance of random reader by case effect
var_r.Apos: [numeric] variance of random reader effect
var_c.Apos: [numeric] variance of random case effect
var_rc.Apos: [numeric] variance of randome reader by case effect
There are six random effects that are specific to modality B
var_r.Bneg: [numeric] variance of random reader effect
var_c.Bneg: [numeric] variance of random case effect
var_rc.Bneg: [numeric] variance of random reader by case effect
var_r.Bpos: [numeric] variance of random reader effect
var_c.Bpos: [numeric] variance of random case effect
var_rc.Bpos: [numeric] variance of randome reader by case effect
This procedure simulates an MRMC data set of an ROC experiment comparing two modalities. It is based on Gallas2014_J-Med-Img_v1p031006, which generalizes of the model in Roe1997_Acad-Radiol_v4p298 and Roe1997_Acad-Radiol_v4p587. Specifically, it allows the variance components to depend on the truth and the modality. For the simpler Roe and Metz model, you can enter the smaller set of parameters into sim.gRoeMetz.config and it will return a larger set of parameters that can be used with this function.
sim.gRoeMetz(config)
sim.gRoeMetz(config)
config |
[list] of simulation parameters:
|
The simulation is a linear model with six fixed effects related to modality and truth and 18 normally distributed independent random effects for readers, cases, and the interaction between the two. Here is the linear model:
L.mrct = mu.t + mu.mt
+ reader.rt + case.ct + readerXcase.rct
+ modalityXreader.mrt + modalityXcase.mct + modalityXreaderXcase.mrct
m=modality (levels: A and b)
t=truth (levels: neg and Pos)
mu.t is the global mean for t=neg and t=pos cases
mu.mt is the modality specific fixed effects for t=neg and t=pos cases
the remaining terms are the random effects: all independent normal random variables
dFrame.imrmc [data.frame] with (nC.neg + nC.pos)*(nR+1) rows including
readerID: [factor] w/ nR levels "reader1", "reader2", ...
caseID: [factor] w/ nC levels "case1", "case2", ...
modalityID: [factor] w/ 1 level config$modalityID
score: [numeric] reader score
Note that the first nC.neg + nC.pos rows specify the truth labels for each case. For these rows, the readerID must be "truth" and the score must be 0 for negative cases and 1 for positive cases.
This function creates a configuration object for the Roe & Metz simulation model to be used as input for the sim.gRoeMetz program. The default model returned when there are no arguments given to the function is the "HH" model from Roe1987_Acad-Radiol_v4p298. Following that paper, The user can specify three parameters related to experiment size (nR, nC.neg, nC.pos) and five parameters parameters specifying a linear model that does not depend on modality or truth (mu.neg, mu.pos, var_r, var_c, var_rc). The mu.pos is set to 1.0, which yields a reader averaged AUC of approximately 0.765.
sim.gRoeMetz.config( nR = 5, nC.neg = 40, nC.pos = 40, mu.neg = 0, mu.pos = 1, var_r = 0.03, var_c = 0.3, var_rc = 0.2 )
sim.gRoeMetz.config( nR = 5, nC.neg = 40, nC.pos = 40, mu.neg = 0, mu.pos = 1, var_r = 0.03, var_c = 0.3, var_rc = 0.2 )
nR |
Number of readers (default = 5) |
nC.neg |
Number of signal-absent cases (default = 25) |
nC.pos |
Number of signal-present cases (default = 25) |
mu.neg |
Mean fixed effect of signal-absent distribution (default = 0.0) |
mu.pos |
Mean fixed effect of signal-present distribution (default = 1.0) |
var_r |
Variance of reader random effect (default = 0.03) |
var_c |
Variance of case random effect (default = 0.30) |
var_rc |
Variance of reader.by.case random effect (default = 0.20) |
If no arguments, this function returns a default simulation configuration for sim.gRoeMetz
config [list] Refer to the sim.gRoeMetz input variable
This procedure simulates an MRMC data set for an MRMC agreement study comparing two modalities. It is a hierarchical model that consists of two interaction terms: reader-case interaction and modality-reader-case-replicate interaction. Both interaction terms are conditionally normally distributed, with the case(-related) factor contributing to the conditional mean and the reader(-related) factor contributing to the conditional variance. The case effect is normally distributed, while the reader effect is an inverse-gamma.
The Hierarchical Inverse-Gamma model is described in this paper:
S. Wen and B. D. Gallas, “Three-Way Mixed Effect ANOVA to Estimate MRMC Limits of Agreement,” Statistics in Biopharmaceutical Research, 14, pp. 532–541, 2022, doi:10.1080/19466315.2022.2063169
sim.NormalIG.Hierarchical( config, R = NULL, AR = NULL, BR = NULL, is.within = FALSE )
sim.NormalIG.Hierarchical( config, R = NULL, AR = NULL, BR = NULL, is.within = FALSE )
config |
[list] of simulation parameters:
|
R |
[vector] of size |
AR |
[vector] of size |
BR |
[vector] of size |
is.within |
[bol] whether the data are within-modality (A==B).
In this case the modality-reader and modality-case interaction terms
will be the same.
Default |
The model has the following structure: X.ijkl = mu + m.i + RC.jk + mRCE.ijkl
mu = grand mean
m.i = modalities (levels: A and B)
RC.jk given R.j,C.k ~ N(C.k, R.j) reader-case interaction term
mRCE.ijkl given mR.ij,mC.ik ~ N(mC.ik, mR.ij) modality-reader-case-replicate term
C.k and mC.ik are Normal/beta distributed
R.j and mR.ij are Inverse-Gamma distributed
df [data.frame] with nR x nC x 2 rows including
readerID: [Factor] w/ nR levels "reader1", "reader2", ...
caseID: [Factor] w/ nC levels "case1", "case2", ...
modalityID: [Factor] w/ 2 levels "testA" and "testB"
score: [num] reader score
This function creates a configuration object that sets the parameters
for the Hierarchical Inverse-Gamma simulation model. The configuration
object is an to the sim.NormalIG.Hierarchical
function.
sim.NormalIG.Hierarchical.config( nR = 5, nC = 100, modalityID = c("testA", "testA*"), C_dist = "normal", mu = 0, tau_A = 0, tau_B = 0, alpha_R = 10, beta_R = 1, sigma_C = 1, a_C = 0.8, b_C = 3, sigma_tauC = 1, alpha_tauR = 10, beta_tauR = 1, C_scale = 1, RC_scale = 1, tauC_scale = 1, tauRCE_scale = 1 )
sim.NormalIG.Hierarchical.config( nR = 5, nC = 100, modalityID = c("testA", "testA*"), C_dist = "normal", mu = 0, tau_A = 0, tau_B = 0, alpha_R = 10, beta_R = 1, sigma_C = 1, a_C = 0.8, b_C = 3, sigma_tauC = 1, alpha_tauR = 10, beta_tauR = 1, C_scale = 1, RC_scale = 1, tauC_scale = 1, tauRCE_scale = 1 )
nR |
[num] Number of readers. Default |
nC |
[num] Number of cases. Default |
modalityID |
[vector] List of modalityID. Default |
C_dist |
[chr] Distribution of the case. Default |
mu |
[num] grand mean. Default |
tau_A |
[num] modality A effect. Default |
tau_B |
[num] modality B effect. Default |
alpha_R |
[num] shape parameter for reader. Default |
beta_R |
[num] scale parameter for reader. Default |
sigma_C |
[num] std of case factor (if |
a_C |
[num] alpha for distribution of case (if |
b_C |
[num] beta for distribution of case (if |
sigma_tauC |
[num] std of modality-case (if |
alpha_tauR |
[num] shape parameter for modality-reader. Default |
beta_tauR |
[num] scale parameter for modality-reader. Default |
C_scale |
[num] weight for the case factor. Default |
RC_scale |
[num] weight for the reader-case interaction term. Default |
tauC_scale |
[num] weight for the modality-case term. Default |
tauRCE_scale |
[num] weight for the modality-reader-case-replicate interaction term. Default |
If no arguments, this function returns a default simulation
configuration for the sim.NormalIG.Hierarchical
function.
config [list] of input parameters for sim.NormalIG.Hierarchical
.
This program simulates observations from one set of readers scoring one set of cases. It produces one modality and one truth state of ROC data following Roe1997_Acad-Radiol_v4p298 and Roe1997_Acad-Radiol_v4p587. In order to produce an entire ROC data set, please use sim.gRoeMetz.
simMRMC(simMRMC.config)
simMRMC(simMRMC.config)
simMRMC.config |
[list] of simulation parameters:
|
The simulation is a linear model with one fixed effect and three normally distributed independent random effects corresponding to readers, cases, and an interaction between the two.
L.rc = mu + readerEffect.r + caseEffect.c + readerXcaseEffect.rc
L [data.frame] with nC*nR rows of 4 variables
L$modalityID [factor] determined by input modalityID
L$readerID [factor] determined by input readerIDs
L$caseID [factor] determined by input caseIDs
L$score [numeric] R.r + C.c + RC.rc
r = 1,2,...,nR
c = 1,2,...,nC
R.r ~ N(0,var_r)
C.c ~ N(0,var_c)
RC.rc ~ N(0,var_rc)
Simulates a sample MRMC ROC experiment
simRoeMetz.example()
simRoeMetz.example()
dFrame.imrmc [data.frame] Please refer to the description of the sim.gRoeMetz
return variable
Convert an MRMC data frame of successes to one formatted for doIMRMC
successDFtoROCdf(df)
successDFtoROCdf(df)
df |
Each row contains a success observation for one reader evaluating one case |
data frame ready for doIMRMC
Convert a doIMRMC formatted data frame to a standard data frame with all factors.
undoIMRMCdf(df.MRMC)
undoIMRMCdf(df.MRMC)
df.MRMC |
This data frame includes columns for readerID, caseID, modalityID, score. Each row is a reader x case x modality observation from the study In addition to observations from the study, this data frame requires rows specifying the truth for each caseID. For truth specifications, the readerID needs to equal "truth" or "-1", modalityID can be anything ("truth" is a good choice), and score should be 0 for signal-absent normal case, 1 for signal-present disease case. |
Delete rows specifying truth and put the truth information on every row.
output a data frame with columns readerID, caseID, modalityID, score, truth
These two functions calculate the mean and variance of a user-specified U-statistic kernel, which is a function of cross-correlated scores.
The motivation for this analysis is data collected in imaging studies where multiple readers read multiple cases in different modes or modalities. The goal is to evaluate the variance of a reader- and case-averaged endpoint, accounting for cross-correlated data arising from two random effects: the random reader skill and the random case difficulty. This analysis is sometimes referred to as an MRMC analysis. Of course, the random effects can be from sources other than readers and cases.
uStat11.jointD( df.input, modalitiesToCompare, kernelFlag = 1, keyColumns = c("readerID", "caseID", "modalityID", "score") ) uStat11.conditionalD( df.input, modalitiesToCompare, kernelFlag = 1, keyColumns = c("readerID", "caseID", "modalityID", "score") )
uStat11.jointD( df.input, modalitiesToCompare, kernelFlag = 1, keyColumns = c("readerID", "caseID", "modalityID", "score") ) uStat11.conditionalD( df.input, modalitiesToCompare, kernelFlag = 1, keyColumns = c("readerID", "caseID", "modalityID", "score") )
df.input |
an iMRMC formatted data frame, see dfMRMC_example |
modalitiesToCompare |
The factors identifying the modalities to compare. |
kernelFlag |
This determines the kernel function
|
keyColumns |
Identify the factors corresponding to the readerID, caseID, modalityID, and score (or alternative random and fixed effects). |
uStat11.conditionalD
is identical to uStat11.jointD
when the study is fully-crossed:
when every reader readers all the cases in both modalities. For arbitrary study designs
the two functions differ according to how the components of variance are estimated.
uStat11.conditionalD
follows Gallas2007_J-Opt-Soc-Am-A_v24pB70
<doi:10.1364/JOSAA.24.000B70> and estimates the components of variance
(which isolate combinations of different random effects) with nested conditional means.
uStat11.jointD
is analogous to the method in Gallas2008_Neural-Networks_v21p387
<doi:10.1016/j.neunet.2007.12.013> and estimates the components of variance
(which isolate combinations of different random effects) with a joint distribution over all
the observations giving equal weight to each one.
Both functions yield unbiased variance estimates.
Our simulations find that uStat11.conditionalD
is statistically more efficient than
uStat11.jointD
(its variance estimate is more precise), but it is slower.
Please refer to the tests/testthat folder of the package for examples using these functions.
This function calculates the mean and variance of the indicated U-statistic kernel, which is a function of the scores. For the identity kernel, we simply return the mean and variance of the scores.
The function returns a list of outputs. Many of these outputs have three elements.
If kernelFlag
= 1 == identity kernel, the first element corresponds to the mean score of
modality A, the second corresponds to mean score of modality B,
and the third corresponds to the mean of the difference in scores from modality A and B.
If kernelFlag
= 2 == difference kernel, the first element corresponds to the
mean difference in scores from modalities A and B, the second element corresponds to
the mean difference in scores from modalities C and D, and the third elements corresponds
to the difference of the just-mentioned differences.
There are 16 outputs:
mean:
See description above.
var:
The variance of the mean.
var.1obs:
The variance of one reader-case-modality observation.
meanPerR
The reader-specific means.
nR
The number of readers in the study.
nC
The number of cases in the study.
nCperR
The number of cases evaluated by each reader for each modality.
moments
The second order moments of the problem.
coeff
The coefficients corresponding to the second-order moments such that
the scalar product between the moments and coefficients yields the variance.
kernel.A
A matrix showing the kernel evaluated for each combination
of each reader and case for modality A (or AB).
design.A
A matrix showing the what data exists for each combination
of each reader and case for modality A (or AB).
kernel.B
A matrix showing the kernel evaluated for each combination
of each reader and case for modality B (or CD).
design.B
A matrix showing the what data exists for each combination
of each reader and case for modality B (or CD).
# Create an MRMC data frame # Refer to Gallas2014_J-Med-Img_v1p031006 simRoeMetz.config <- sim.gRoeMetz.config() # Simulate data df.MRMC <- sim.gRoeMetz(simRoeMetz.config) # Reformat data df <- undoIMRMCdf(df.MRMC) # Grab part of the data df <- droplevels(df[grepl("pos", df$caseID), ]) #### uStat11.jointD.identity #### # Calculate the reader- and case-averaged difference in scores from testA and testB # (kernelFlag = 1 specifies the U-statistics kernel to be the identity) result.jointD.identity <- uStat11.jointD( df, kernelFlag = 1, keyColumns = c("readerID", "caseID", "modalityID", "score"), modalitiesToCompare = c("testA", "testB")) cat("\n") cat("uStat11.jointD.identity \n") print(result.jointD.identity[1:2])
# Create an MRMC data frame # Refer to Gallas2014_J-Med-Img_v1p031006 simRoeMetz.config <- sim.gRoeMetz.config() # Simulate data df.MRMC <- sim.gRoeMetz(simRoeMetz.config) # Reformat data df <- undoIMRMCdf(df.MRMC) # Grab part of the data df <- droplevels(df[grepl("pos", df$caseID), ]) #### uStat11.jointD.identity #### # Calculate the reader- and case-averaged difference in scores from testA and testB # (kernelFlag = 1 specifies the U-statistics kernel to be the identity) result.jointD.identity <- uStat11.jointD( df, kernelFlag = 1, keyColumns = c("readerID", "caseID", "modalityID", "score"), modalitiesToCompare = c("testA", "testB")) cat("\n") cat("uStat11.jointD.identity \n") print(result.jointD.identity[1:2])
The kernel is the difference kernel
uStat11.diff( df.input, modalitiesToCompare, keyColumns = c("readerID", "caseID", "modalityID", "score") )
uStat11.diff( df.input, modalitiesToCompare, keyColumns = c("readerID", "caseID", "modalityID", "score") )
df.input |
Data frame of observations, one per row. Columns also identify random and fixed effects. |
modalitiesToCompare |
The factors identifying the modalities to compare |
keyColumns |
The required columns |
The kernel is the identity kernel
uStat11.identity( df.input, modalitiesToCompare, keyColumns = c("readerID", "caseID", "modalityID", "score") )
uStat11.identity( df.input, modalitiesToCompare, keyColumns = c("readerID", "caseID", "modalityID", "score") )
df.input |
Data frame of observations, one per row. Columns also identify random and fixed effects. |
modalitiesToCompare |
The factors identifying the modalities to compare |
keyColumns |
The required columns |