To compare functional magnetic resonance (MR) imaging for language mapping (hereafter, language functional MR imaging) with direct cortical stimulation (DCS) in patients with brain tumors and to assess factors associated with its accuracy.
Findings were pooled by using bivariate random-effects and hierarchic summary receiver operating characteristic curve models.
Meta-regression and subgroup analyses were performed to evaluate whether publication year, functional MR imaging paradigm, magnetic field strength, statistical threshold, and analysis software affected classification accuracy. Ten articles with a total of patients were included in the analysis. Results of this study showed moderate accuracy of language functional MR imaging when compared with intraoperative DCS, and the included studies displayed significant methodologic heterogeneity.
Online supplemental material is available for this article.
Functional magnetic resonance (MR) imaging by using blood oxygenation level-dependent contrast is one of the most commonly applied noninvasive techniques for presurgical mapping of the eloquent cortical areas and for identifying the language-dominant hemisphere (lateralization) (1-3). Functional MR imaging for language mapping (hereafter, language functional MR imaging) can assist neurosurgeons with planning surgical procedures near the eloquent cortex, with the aim of maximizing the extent of tumor resection while reducing the risk of postoperative dysphasia (3, 4).
Nonetheless, intraoperative direct cortical stimulation (DCS) during awake craniotomy remains the standard for brain mapping. As functional MR imaging has become more widely available, the question of whether language functional MR imaging is as accurate as DCS has attracted a great deal of attention. Another review confirmed these widely discrepant results and suggested that language functional MR imaging is not universally accepted as a standard of care for presurgical planning because its accuracy has not been consistently established. Because the utility of preoperative language functional MR imaging remains debatable, investigators have begun to examine the study characteristics and imaging parameters that may contribute to the wide variability in estimates of accuracy.
The wide variations in the results of comparisons of functional MR imaging and DCS likely stem from several factors, including the different functional MR imaging language tasks, image processing algorithms, statistical thresholds, tumor locations, and the effects of pathologic brain conditions (1, 7). Additionally, the language tasks used in the presurgical setting are not always equivalent, or even analogous, to those used intraoperatively, which makes direct comparison of mapping results difficult. To our knowledge, to date no systematic review and meta-analysis has evaluated presurgical language functional MR imaging in patients with brain tumors.
We performed a quantitative meta-analysis by using the bivariate random-effects and hierarchic summary receiver operating characteristic models to determine the language localization accuracy (sensitivity and specificity) of presurgical functional MR imaging, and the factors associated with its accuracy, by comparing it with DCS in patients with brain tumors near the eloquent center undergoing resection.
In addition, we manually reviewed the references cited in the retrieved articles. This study was conducted and the results were reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines. Our methodologic approach to evidence searching and synthesis, set out in a predefined review protocol, conformed to the diagnostic test accuracy methods of the Cochrane Collaboration. Studies that were selected for inclusion in the meta-analysis met the following criteria: (a) they had been published as full research articles (not a letter or an abstract); (b) they directly compared functional MR imaging with DCS; and (c) they either reported the rates of true-positive, false-negative, false-positive, and true-negative findings (or the sensitivities and specificities) or presented data allowing such calculations to be made.
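Inclusion criterion (c) relies on the standard relationship between the four cells of a 2 x 2 table and the two accuracy measures. The following is a minimal sketch of that calculation, not code from the study; the class name `TwoByTwo` and the counts in the example are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts of functional MR imaging findings against DCS (the reference standard)."""
    tp: int  # functional MR imaging positive, DCS positive
    fn: int  # functional MR imaging negative, DCS positive
    fp: int  # functional MR imaging positive, DCS negative
    tn: int  # functional MR imaging negative, DCS negative

    def sensitivity(self) -> float:
        # Proportion of DCS-positive sites (or patients) that imaging also flagged.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of DCS-negative sites (or patients) that imaging also cleared.
        return self.tn / (self.tn + self.fp)


# Hypothetical counts, for illustration only (not data from any included study).
example = TwoByTwo(tp=18, fn=6, fp=10, tn=30)
print(f"sensitivity = {example.sensitivity():.2f}")
print(f"specificity = {example.specificity():.2f}")
```

A study that reports only sensitivity and specificity, without the number of reference-positive and reference-negative cases, does not allow the four cells to be recovered, which is why criterion (c) asks for one or the other.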
The exclusion criteria were as follows: (a) studies of pediatric patients; (b) studies of animal experiments; (c) reviews of the literature; (d) studies in which the data reported were insufficient to enable extraction of the number of subjects per group; and (e) studies in which the data were presented in another study, in which case only the study with the larger sample size was included.
All searches were performed on September 20, The searches were not subject to language restrictions. The results were collected and de-duplicated by using EndNote software version 7. Titles and abstracts were screened independently by two authors H.
The full text of candidate articles was reviewed by two authors H. All data were coded in standardized form by two investigators H. We extracted demographic data from each article. Interrater disagreements 2 of 92, 2.
The study adhered to the recommendations of a recent Cochrane review on diagnostic test accuracy. The 14 items of the Quality Assessment of Diagnostic Accuracy Studies 2 tool were used to assess the quality of the included studies. The risk of bias was assessed in four domains: patient selection, execution of the index test, execution of the reference standard, and flow of patients (in particular, whether an appropriate interval existed between the index test and the reference standard).
A third investigator H. We determined the language-localization accuracy of functional MR imaging by comparing its identification of the eloquent cortex with that of DCS (the reference standard) across studies at both the per-person and per-tag (ie, cerebral site) levels. Hierarchic summary receiver operating characteristic curves and confidence and prediction regions were used to display the variation in diagnostic accuracy among studies and to determine the influence of threshold effects.
Diagnostic accuracy measures were pooled with the use of a unified model (21) that combined a bivariate random-effects model (22) for estimating point-summary values of sensitivity and specificity and a hierarchic summary receiver operating characteristic curve model (23) for determining test accuracy.
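To give a sense of what random-effects pooling on the logit scale involves, the sketch below pools a single proportion (for example, sensitivity) across studies with the DerSimonian-Laird estimator. This is a deliberate simplification and is not the unified bivariate model used in the study: it treats sensitivity and specificity separately and ignores the correlation between them that the bivariate model estimates. The function name `pool_logit_dl` and all counts are hypothetical.

```python
import math


def pool_logit_dl(events, totals):
    """Univariate DerSimonian-Laird random-effects pooling of proportions.

    events[i] / totals[i] is one study's proportion (e.g., TP / (TP + FN)).
    Returns (pooled proportion, between-study variance tau^2 on the logit scale).
    Simplification: no continuity correction, so zero cells are rejected.
    """
    logits, variances = [], []
    for e, n in zip(events, totals):
        if e <= 0 or e >= n:
            raise ValueError("zero cells need a continuity correction first")
        logits.append(math.log(e / (n - e)))
        variances.append(1.0 / e + 1.0 / (n - e))  # approximate variance of a logit

    # Fixed-effect (inverse-variance) estimate, used to compute Cochran's Q.
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, logits)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, logits))

    # Method-of-moments estimate of the between-study variance.
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (len(logits) - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled_logit = sum(wi * yi for wi, yi in zip(w_re, logits)) / sum(w_re)
    return 1.0 / (1.0 + math.exp(-pooled_logit)), tau2


# Hypothetical true-positive counts and DCS-positive totals for three studies.
pooled, tau2 = pool_logit_dl(events=[18, 9, 30], totals=[24, 20, 35])
print(f"pooled sensitivity = {pooled:.3f}, tau^2 = {tau2:.3f}")
```

In practice the bivariate and hierarchic summary receiver operating characteristic models are fitted with dedicated statistical software; the sketch only illustrates why studies with larger between-study variance end up weighted more evenly.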
A continuity correction of 0. Because high degrees of heterogeneity are expected in systematic reviews of diagnostic accuracy studies, we followed current recommendations and did not calculate Cochrane Q or Higgins I² statistics. These statistics are considered uninformative because they do not consider threshold effects.
Heterogeneity among studies was instead investigated by using subgroup analysis and multiple univariate meta-regression analysis to determine whether the heterogeneity was attributable to the covariates used.
To investigate whether a factor was associated with functional MR imaging accuracy, we first performed exploratory analyses by using visual inspection of coupled forest plots and hierarchic summary receiver operating characteristic plots for a V-shaped threshold effect. A Spearman correlation analysis was also performed to determine the coefficient between sensitivity and the false-positive rate (ie, 1 - specificity).
Generally a correlation coefficient greater than or equal to 0.
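The threshold-effect check described above can be sketched as a Spearman rank correlation between each study's sensitivity and its false-positive rate. The code below is an illustrative, dependency-free version and not the authors' analysis; the helper names and the per-study values are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)  # undefined (division by zero) if either input is constant


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-study estimates: sensitivity rises as specificity falls,
# which is the pattern a threshold effect produces.
sensitivity = [0.60, 0.70, 0.80, 0.90]
specificity = [0.90, 0.80, 0.70, 0.60]
false_positive_rate = [1 - s for s in specificity]
rho = spearman(sensitivity, false_positive_rate)
print(f"Spearman rho = {rho:.2f}")
```

A strongly positive coefficient means that studies gaining sensitivity are also gaining false-positive findings, which is the signature of differing positivity thresholds rather than differing test quality.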
We used the Deeks funnel plot asymmetry test, the recommended tool for assessing risk of publication bias in meta-analyses of diagnostic test accuracy. A P value less than. P values less than or equal to.
The Preferred Reporting Items for Systematic Reviews and Meta-Analyses search flowchart outlining the study identification and selection process is shown in Figure 1.
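The Deeks funnel plot asymmetry test named in the statistical methods regresses each study's log diagnostic odds ratio on the inverse square root of its effective sample size, weighting by the effective sample size; a slope that differs from zero suggests asymmetry. The sketch below assumes that standard formulation and is not the authors' code. The function names and the study counts are hypothetical, zero cells are rejected rather than corrected, and the P value (from a t distribution with k - 2 degrees of freedom) is left to statistical software.

```python
import math


def effective_sample_size(n_diseased, n_nondiseased):
    """ESS = 4 * n1 * n2 / (n1 + n2)."""
    return 4.0 * n_diseased * n_nondiseased / (n_diseased + n_nondiseased)


def weighted_linear_fit(x, y, w):
    """Weighted least squares for y = intercept + slope * x.

    Returns (intercept, slope, standard error of the slope).
    Requires at least three points.
    """
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    rss = sum(wi * (yi - intercept - slope * xi) ** 2 for wi, xi, yi in zip(w, x, y))
    slope_se = math.sqrt(rss / (len(x) - 2) / sxx)
    return intercept, slope, slope_se


def deeks_asymmetry(studies):
    """studies: iterable of (tp, fn, fp, tn). Returns (slope, se, t statistic)."""
    x, y, w = [], [], []
    for tp, fn, fp, tn in studies:
        if 0 in (tp, fn, fp, tn):
            raise ValueError("zero cells need a continuity correction first")
        ess = effective_sample_size(tp + fn, fp + tn)
        x.append(1.0 / math.sqrt(ess))
        y.append(math.log((tp * tn) / (fp * fn)))  # log diagnostic odds ratio
        w.append(ess)
    _, slope, se = weighted_linear_fit(x, y, w)
    return slope, se, slope / se


# Hypothetical 2 x 2 counts (tp, fn, fp, tn), for illustration only.
example_studies = [(18, 6, 10, 30), (9, 11, 4, 16), (30, 5, 12, 48), (12, 3, 9, 21)]
slope, se, t_stat = deeks_asymmetry(example_studies)
print(f"slope = {slope:.3f}, SE = {se:.3f}, t = {t_stat:.3f}")
```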
Ten articles fulfilled the inclusion criteria. Three of the articles presented data on a per-patient basis (ie, the mapping results were summarized at the level of the individual patient), and the others on a per-tag basis (ie, each DCS stimulation site or tag was considered a separate data point across all patients).
The data extracted from each study are shown in Table 1.
Figure 1: Flowchart of the search process.
The results of the assessment of study quality according to the Quality Assessment of Diagnostic Accuracy Studies 2 criteria are shown in Figure 2. This high risk for bias was attributable to the use of sequential selection of clinical cases and of intraoperative DCS as the reference comparison in the studies.
High-risk ratings for concerns regarding applicability were found for all studies in the patient selection domain and for nine studies in the index test domain. In contrast, low concerns regarding applicability were found for all studies in the reference standard domain.
Figure 2: Stacked bar charts show results of quality assessment for risk of bias and applicability of included studies.
The diagnostic performance results obtained from the meta-analysis are shown in Table 2. Data for the per-patient analysis were extracted from three articles involving seven language tasks and patients.
Forest plots of sensitivity and specificity on a per-person basis are shown in Figure 3a.
The hierarchic summary receiver operating characteristic curve demonstrated an area under the curve of 0. We were unable to conduct subgroup analyses because of the small number of articles and limited availability of patient demographic data. Data were derived from all studies and subgroups of studies performed on a per-patient and on a per-tag basis.
Figure 3: Forest plots of sensitivity and specificity of the studies included in the (a) per-patient and (b) per-tag data analyses show sensitivity (left) and specificity (right).
Each solid square represents an eligible study and its point estimates of sensitivity and specificity. Diamonds at the bottom and the vertical dotted line represent pooled summary estimates of sensitivity and specificity.
Figure 4: Hierarchic summary receiver operating characteristic (HSROC) curves depict the diagnostic accuracy of functional MR imaging on a (a) per-patient basis and (b) per-tag basis in patients with brain tumors. These HSROC curves illustrate the sensitivity that can be achieved at different specificities, with the optimal tradeoff between sensitivity and specificity located at the top left-hand corner.
Data for the per-tag analysis were extracted from seven articles that used 16 language tasks and involved a total of 88 patients. Forest plots of sensitivity and specificity on a per-tag basis are shown in Figure 3b.
The hierarchic summary receiver operating characteristic curve showed an area under the curve of 0. The meta-regression analyses for sensitivity and specificity regarding publication year, magnetic field strength, and analysis software showed no significant differences.
The type of language task used and mode of stimulus presentation were also nonsignificant in the meta-regression analyses for sensitivity. The Spearman correlation coefficient values of functional MR imaging were 0.
These results showed that considerable threshold effects were present in this meta-analysis. The impact of publication bias on the results of the meta-analysis was assessed by using Deeks funnel plots.
The shapes of the funnel plots for the pooled sensitivity and specificity of functional MR imaging revealed obvious symmetry.
This meta-analysis showed that presurgical language functional MR imaging was moderately sensitive and specific in both the per-patient and per-tag analyses, and that the methodology used in published studies was heterogeneous.
Several factors associated with accuracy were identified, which may aid the optimization and standardization of functional MR imaging protocols and paradigm design.
Previous studies have reported widely varying sensitivity and specificity values for language functional MR imaging on a per-patient basis (5, 6, 11) and on a per-tag basis (1, 7). The inconsistent results across studies limit confidence in the reliability and clinical utility of the technique.
Some authors have argued that lack of standardization in language functional MR imaging techniques across studies is a main contributor to the varied performance of the technique (4). Indeed, we identified several modifiable factors that contributed to the observed heterogeneity in accuracy across studies, including statistical threshold, type of language task, functional MR imaging parameter, and mode of stimulus presentation.
Additionally, DCS reveals only the necessary cortex, through stimulation of a spatially limited area of exposed brain surface, whereas presurgical language functional MR imaging displays all cortical areas that participate in language production (ie, essential and facilitatory regions) (7).
As such, the high false-positive rates at language functional MR imaging may reflect identification of the broader networks that are involved in, but not necessarily essential to, language functioning, which limits specificity.
Both the sensitivity and specificity of language functional MR imaging depend on the statistical threshold chosen for data evaluation, and testing of multiple thresholds is often applied for detection of activated networks. High sensitivity for the eloquent brain regions is desirable to ensure that no eloquent cortex is missed and no functional deficit is induced by surgery.
At low specificity, noncritical brain regions are also activated and visualized, imparting a risk that such areas will be unnecessarily avoided during surgery, resulting in a potentially less extensive resection than would have been possible. Unfortunately, the data available from our meta-analysis did not permit a more fine-grained examination of thresholding effects, and determination of an optimal threshold was not feasible. Although there may be no one-size-fits-all approach to statistical thresholding, we believe that determination of the significance threshold of activation should be data driven and that future work in this area is needed.
The selection of presurgical language tasks for functional MR imaging should be specific to the patient and depend on the location of the brain tumor and the functional networks suspected to be near the lesion.
Importantly, language tasks may differ in their sensitivity and specificity; this is another important factor to consider during paradigm selection. Indeed, we found that the use of expressive language paradigms was associated with greater specificity than was the use of receptive and semantic tasks. Although it is possible that the expressive tasks used in these studies are in fact more accurate than others, particularly for assessing dominant-hemisphere functions (34), closer inspection of the studies suggests that this finding is likely an artifact of the types of tasks used for the comparison with intraoperative DCS.
Specifically, most studies used expressive language tasks for DCS, most frequently of the object-naming variety. This likely explains the additional finding that visual presentation format of functional MR imaging stimuli was associated with higher specificity.
That is, most studies using a visual presentation included object-naming presurgical functional MR imaging paradigms, which directly correspond with the tasks commonly used intraoperatively (ie, object naming).
These findings point to an important limitation in the existing literature of surgical language mapping.