ORIGINAL ARTICLE

Comparison of objective criteria and expert visual interpretation to classify benign and malignant hilar and mediastinal nodes on 18-F FDG PET/CT

PHAN NGUYEN,1,2 MANOJ BHATT,2,3 FARZAD BASHIRZADEH,1,2 JUSTIN HUNDLOE,1,2 ROBERT WARE,4 DAVID FIELDING1,2 AND ARAVIND S. RAVI KUMAR2,3

Departments of 1Thoracic Medicine and 3Nuclear Medicine and Specialised PET Services Queensland, The Royal Brisbane and Women’s Hospital, 4Queensland Children’s Medical Research Institute, Herston and 2School of Medicine, Faculty of Health Sciences, University of Queensland, St Lucia, Queensland, Australia

ABSTRACT

Background and objective: There is widespread adoption of FDG-PET/CT in staging of lung cancer, but no universally accepted criteria for classifying thoracic nodes as malignant. Previous studies show high negative predictive values, but reporting criteria and positive predictive values vary. Using endobronchial ultrasound transbronchial needle aspiration (EBUS-TBNA) results as gold standard, we evaluated objective FDG-PET/CT criteria for interpreting mediastinal and hilar nodes and compared these with expert visual interpretation (EVI).

Methods: A retrospective review of all patients with lung cancer who had both FDG-PET/CT and EBUS-TBNA from 2008 to 2010 was performed. Scan interpretation was blinded to histology. Patients from 2008/2009 were used for the prediction set. The validation set analysed patients from 2010. Objective FDG-PET/CT criteria were SUVmax lymph node (SUVmaxLN), ratio SUVmaxLN/SUVmax primary lung malignancy, ratio SUVmaxLN/SUVaverage liver, ratio SUVmaxLN/SUVmax liver and ratio SUVmaxLN/SUVmax blood pool. A nuclear medicine physician reviewed all scans and classified nodal stations as benign or malignant.

Results: Eighty-seven malignant lymph nodes and 41 benign nodes were in the prediction set. All objective FDG-PET/CT criteria analysed were significantly higher in the malignant group (P < 0.0001). EVI correctly classified 122/128 nodes (95.3%). Thirty-four malignant nodes and 19 benign nodes were in the validation set. The new proposed cut-off values of the objective criteria from the prediction set correctly classified 44/53 (83.0%) nodes: 28/34 (82.4%) malignant nodes and 16/19 (84.2%) benign nodes. EVI had 91% accuracy: 33/34 (97.1%) malignant nodes and 15/19 (79.0%) benign nodes.

Conclusions: Objective analysis of 18-F FDG PET/CT can differentiate between malignant and benign nodes but is not superior to EVI.

Key words: lung cancer, lung cancer nodal staging, positron emission tomography, standardized uptake value criterion.

Abbreviations: aROC, area under the receiver operator curve; DOR, diagnostic odds ratio; EBUS-TBNA, endobronchial ultrasound transbronchial needle aspiration; EVI, expert visual interpretation; FDG PET/CT, fluoro-deoxy-glucose positron emission tomography/computed tomography; LR, likelihood ratio; NPV, negative predictive value; PPV, positive predictive value; ROI, region of interest; SUV, standardized uptake value; SUVmax, maximum standardized uptake value.

Correspondence: Phan Nguyen, Department of Thoracic Medicine, The Royal Brisbane and Women’s Hospital, Butterfield St., Herston, Qld. 4029, Australia. Email: [email protected]

Received 28 April 2014; invited to revise 8 June 2014; revised 4 July 2014; accepted 24 July 2014 (Associate Editor: David Feller-Kopman).

SUMMARY AT A GLANCE

FDG PET/CT is widely used for lung cancer mediastinal staging. Objective criteria for FDG PET/CT scan analysis have various published thresholds that are not well validated. We derived and validated objective criteria from patients at our institution and compared their performance with EVI. EBUS results were used as gold standard.

© 2014 Asian Pacific Society of Respirology. Respirology (2015) 20, 129–137. doi: 10.1111/resp.12409

INTRODUCTION

It is well established that 18-fluorine fluoro-deoxy-glucose positron emission tomography/computed tomography (18-F FDG PET/CT) is used for staging of lung cancer according to published guidelines and recommendations from the seventh edition of the lung cancer staging system.1,2 As increased FDG uptake is well known to occur in inflammatory conditions as well as malignancy, tissue diagnosis is still paramount. Endobronchial ultrasound transbronchial needle aspiration (EBUS-TBNA) has replaced mediastinoscopy as the investigation of choice to diagnose FDG-avid hilar and mediastinal lymphadenopathy in lung cancer.3 In a recent meta-analysis, EBUS-TBNA showed a pooled sensitivity of 93% and specificity of 100% for malignancy.4

Increased FDG uptake as measured by maximum standardized uptake value (SUVmax) alone may not be the most accurate parameter to judge malignant nodal involvement, particularly in the thorax where the nodes are chronically exposed to environmental pathogens and toxins. There are also many technical scanning variables that affect SUV calculations including uptake time, scan acquisition and image processing. Many centres with an 18-F FDG PET/CT service report on expert visual assessment of nodal FDG uptake to characterize lymph nodes as benign or malignant rather than using objective criteria.

Different objective criteria for scan interpretation have been proposed but are not well validated. Examples of previous published objective criteria for malignancy include:
• SUVmax of >2.5 for the lymph node5,6
• A ratio of 0.56 for SUVmax of the lymph node compared to the primary tumour7
• A ratio of >1.5 for SUVmax of the lymph node compared to SUV average of the liver8
• Assessment of SUVmax of the node compared to SUVmax of the blood pool9

Many of these studies did not assess their criteria by trying to reproduce their results in a validation set. In addition, we are not aware of any studies that have reported on a cohort of patients who had both 18-F FDG PET/CT and EBUS-TBNA for the assessment of unexplained FDG-avid lymphadenopathy. We aimed to establish our own objective cut point values in a cohort of patients who had EBUS-TBNA in our centre. Furthermore, we aimed to validate these cut point values in a further series of patients and compare these objective FDG PET/CT criteria to expert visual interpretation.

METHODS

This study was approved by the ethics committee and performed at The Royal Brisbane and Women’s Hospital. A retrospective review of all EBUS-TBNA patients from January 2008 to December 2010 inclusive was performed, and those who had 18-F FDG PET/CT scans as part of their diagnostic work up were included in the study. Patients from 2008 and 2009 were used in the prediction set, with patients from 2010 included in the validation set. Post-treatment patients, lymphoma patients and patients with non-lung malignancy-related lymphadenopathy were excluded.

The Olympus convex probe bronchoscope and Exera EU-C60 processor (Olympus Medical Corp., Tokyo, Japan) were used for the EBUS-TBNA procedures. Tissue diagnosis via cytology of EBUS-TBNA samples or histology of surgically resected lymph nodes was taken as gold standard to define malignant or benign aetiology. In addition, if TBNA was non-diagnostic, benign aetiology was defined as stability or resolution of lymphadenopathy on a repeat CT scan at ≥12 months.10

FDG PET image acquisition protocol

Patients fasted for a minimum of 4 h, and blood sugar levels were checked to ensure levels <10 mmol/L. 18F-FDG was injected intravenously at 4.5 MBq/kg and scans were acquired after approximately 60 min of uptake. Patients were scanned from base of skull to mid-thigh with arms up on a Philips Gemini GXL 16 PET/CT scanner (Philips Medical Systems, Cleveland, OH, USA). Emission images were acquired over 10–13 bed positions with 2 min (weight <90 kg) or 2.5 min (weight >90 kg) for each bed position.

CT scans were performed without use of contrast materials and were acquired during tidal breathing. Tube voltage was set at 140 kV, with variation in current according to the weight (<60 kg: 20 mA, 60–90 kg: 30 mA, >90 kg: 50 mA).

FDG PET image analysis

An experienced nuclear medicine physician and a senior nuclear medicine fellow, blinded to the final diagnosis, analysed the images using the following objective semiquantitative features:
• SUVmax of the lymph node (SUVmaxLN)
• SUVmax of the lymph node : SUVmax of the liver (SUVmaxLN : SUVmaxLiver)
• SUVmax of the lymph node : SUVaverage of the liver (SUVmaxLN : SUVavgLiver)
• SUVmax of the lymph node : SUVmax of the blood pool at the left atrium (SUVmaxLN : SUVmaxBP)
• SUVmax of the lymph node : SUVmax of the lung primary where a primary lesion was available (SUVmaxLN : SUVmaxPrimary).
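The five features listed above are simple ratios of region-of-interest readings. The following Python sketch shows one way they could be tabulated for a single node; the `NodeMeasurements` container, its field names and the example readings are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class NodeMeasurements:
    """SUV readings for one lymph node (hypothetical container, not from the paper)."""
    suvmax_ln: float                        # SUVmax of the lymph node
    suvmax_liver: float                     # SUVmax of the liver ROI
    suvavg_liver: float                     # SUVaverage of the liver ROI
    suvmax_bp: float                        # SUVmax of the blood pool at the left atrium
    suvmax_primary: Optional[float] = None  # None when no primary lesion is available


def objective_features(m: NodeMeasurements) -> Dict[str, float]:
    """Return the semiquantitative features as a name -> value mapping."""
    features = {
        "SUVmaxLN": m.suvmax_ln,
        "SUVmaxLN:SUVmaxLiver": m.suvmax_ln / m.suvmax_liver,
        "SUVmaxLN:SUVavgLiver": m.suvmax_ln / m.suvavg_liver,
        "SUVmaxLN:SUVmaxBP": m.suvmax_ln / m.suvmax_bp,
    }
    # The primary-tumour ratio is only defined when a primary lesion was measured.
    if m.suvmax_primary is not None:
        features["SUVmaxLN:SUVmaxPrimary"] = m.suvmax_ln / m.suvmax_primary
    return features


# Illustrative readings only.
example = objective_features(NodeMeasurements(9.0, 3.0, 2.0, 1.5, 12.0))
```

Omitting the primary-tumour ratio when no primary is visible mirrors the "where a primary lesion was available" condition in the list.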

Circular regions of interest (ROI) were placed over the lesions (primary malignancy or lymph nodes) and the maximum value obtained after scrolling over all axial levels of the lesion (Fig. 1a). For the liver, a large ROI was placed over the right lobe (Fig. 1b). For blood pool SUV calculation, a left atrial ROI was placed (Fig. 1c). Medview software (MedImage, Ann Arbor, MI, USA) was used for all PET/CT visual image interpretation and generation of semiquantitative data. SUV parameters were corrected for the patient’s body weight rather than lean body mass, as the former is very simple to measure. There is controversy about the best way to measure or estimate lean body mass, and there is evidence that correction for lean body mass may also not be accurate.11
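For readers unfamiliar with body-weight correction, the usual textbook definition of SUV divides the tissue activity concentration by the injected activity per gram of body weight. The sketch below applies that general definition; it is not a description of the Medview software, it assumes a tissue density of 1 g/mL and decay-corrected activities, and the function name and example values are hypothetical.

```python
def suv_body_weight(tissue_kbq_per_ml: float,
                    injected_mbq: float,
                    weight_kg: float) -> float:
    """Body-weight-normalised SUV (assumes 1 g/mL tissue and decay-corrected inputs)."""
    injected_kbq = injected_mbq * 1000.0   # MBq -> kBq
    weight_g = weight_kg * 1000.0          # kg -> g
    return tissue_kbq_per_ml / (injected_kbq / weight_g)


# Illustrative values: a 70 kg patient given 4.5 MBq/kg (315 MBq in total).
suv_example = suv_body_weight(tissue_kbq_per_ml=13.5, injected_mbq=4.5 * 70, weight_kg=70)
```

Because the injected activity in this protocol scales with weight, the denominator is 4.5 kBq/g in this illustration regardless of the patient's weight.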

The nuclear medicine specialist also classified the nodal stations as benign, malignant or equivocal based on visual interpretation of the images. Equivocal interpretation was classified as benign for the purposes of statistical analysis.

Lymph node stations were classified according to the International Association for the Study of Lung Cancer lymph node map. We included the following EBUS accessible lymph node stations: 2, 4, 7, 10 and 11.

Statistical analysis

Statistical analysis followed a similar methodology as previously published by members of our group.12


Figure 1 (a) Region of interest around primary lung cancer situated in the left upper lobe. Standardized uptake value (SUV) readings taken from the circular region of interest (ROI) as shown in the middle axial image. (b) ROI around right lobe of liver. SUV readings taken from the circular ROI as shown in the middle axial image. (c) ROI for blood pool at left atrium. SUV readings taken from the circular ROI as shown in the middle axial image.


Prediction set

Mann–Whitney U-tests compared the objective 18-F FDG PET/CT features between malignant and benign lymph nodes. The area under the receiver operator curve (aROC) for sensitivity versus specificity was calculated for each feature. An aROC of between 0.50 and 0.70 indicates low accuracy, between 0.70 and 0.90 moderate accuracy and greater than 0.90 high accuracy.13 We used an aROC of >0.70 as our cut-off to define significant objective 18-F FDG criteria to assess the validation set. These features had positive and negative likelihood ratios (LR) calculated at appropriate sensitivity and specificity cut point values along the ROC. We also calculated the diagnostic odds ratio (DOR), which measures test accuracy14:

$$\mathrm{DOR} = \frac{\text{sensitivity}/(1-\text{sensitivity})}{(1-\text{specificity})/\text{specificity}} \qquad (1)$$

Large DOR signifies high test accuracy, while a DOR close to 1 means that the test is no better than chance.14
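The quantities in this subsection can be computed with a few lines of code. The sketch below is illustrative only: it uses the standard pairwise-comparison definition of the area under the ROC curve, which is related to the Mann–Whitney U statistic, and standard formulas for the likelihood ratios alongside equation (1). The function names and input values are hypothetical, and the code is not the software used in the study.

```python
from typing import Sequence, Tuple


def area_under_roc(malignant: Sequence[float], benign: Sequence[float]) -> float:
    """aROC as the fraction of (malignant, benign) pairs ranked correctly; ties count half."""
    wins = 0.0
    for m in malignant:
        for b in benign:
            if m > b:
                wins += 1.0
            elif m == b:
                wins += 0.5
    return wins / (len(malignant) * len(benign))


def likelihood_ratios(sensitivity: float, specificity: float) -> Tuple[float, float]:
    """Return (positive LR, negative LR) for one cut point."""
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


def diagnostic_odds_ratio(sensitivity: float, specificity: float) -> float:
    """Equation (1): odds of a positive test in disease over odds in non-disease."""
    return (sensitivity / (1.0 - sensitivity)) / ((1.0 - specificity) / specificity)


# Illustrative inputs, not study data.
auc = area_under_roc([4.0, 6.0], [3.0, 5.0])
lr_pos, lr_neg = likelihood_ratios(0.9, 0.8)
dor = diagnostic_odds_ratio(0.9, 0.8)
```

With these definitions the DOR equals the positive LR divided by the negative LR, so the two measures carry the same information at a given cut point.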

Validation set

Lymph nodes in the validation set were analysed in the same manner as those from the prediction set. A combined LR was calculated for each lymph node by multiplying together LRs for the objective semiquantitative 18-F FDG PET/CT criteria with an aROC of >0.70. By convention, an LR >10 markedly increases disease probability, and an LR <0.1 markedly decreases disease probability.15 These cut-offs are based on Bayesian decision theory16 and equate to a 45% increase or decrease respectively in probability. We therefore considered LRs >10 as suspect for malignancy; LRs between 0.1–10 as equivocal (classified as benign) and LRs <0.1 as excluding malignancy.
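The combined-LR rule can be sketched as follows. The cut points and likelihood ratios are copied from Table 2 for the four criteria that the Results section reports using. The rule for choosing between the positive and negative LR of each criterion (positive at or above the cut point, negative below it) is an assumption made for this illustration, since the text does not spell it out, and the function names and example nodes are hypothetical.

```python
from typing import Dict

# criterion -> (cut point, positive LR, negative LR), values from Table 2.
# SUVmaxLN:SUVmaxPrimary is left out, as in the four-criterion analysis.
CRITERIA = {
    "SUVmaxLN": (3.9, 9.19, 0.11),
    "SUVmaxLN:SUVmaxLiver": (1.82, 34.4, 0.16),
    "SUVmaxLN:SUVavgLiver": (2.31, 33.46, 0.19),
    "SUVmaxLN:SUVmaxBP": (2.15, 11.65, 0.14),
}


def combined_lr(features: Dict[str, float]) -> float:
    """Multiply one LR per criterion (assumed: positive LR when value >= cut point)."""
    lr = 1.0
    for name, (cut_point, lr_positive, lr_negative) in CRITERIA.items():
        lr *= lr_positive if features[name] >= cut_point else lr_negative
    return lr


def classify(lr: float) -> str:
    """Apply the LR >10 / 0.1-10 / <0.1 convention described above."""
    if lr > 10:
        return "suspect for malignancy"
    if lr < 0.1:
        return "malignancy excluded"
    return "equivocal (classified as benign)"


# Hypothetical nodes: all criteria above, all below, and two above / two below.
high = {"SUVmaxLN": 9.5, "SUVmaxLN:SUVmaxLiver": 3.7,
        "SUVmaxLN:SUVavgLiver": 4.26, "SUVmaxLN:SUVmaxBP": 5.58}
low = {"SUVmaxLN": 3.0, "SUVmaxLN:SUVmaxLiver": 1.15,
       "SUVmaxLN:SUVavgLiver": 1.42, "SUVmaxLN:SUVmaxBP": 1.67}
mixed = {"SUVmaxLN": 4.0, "SUVmaxLN:SUVmaxLiver": 1.9,
         "SUVmaxLN:SUVavgLiver": 2.0, "SUVmaxLN:SUVmaxBP": 2.0}
labels = [classify(combined_lr(node)) for node in (high, low, mixed)]
```

Multiplying LRs treats the criteria as independent, which is a simplification because all four share the same numerator (SUVmaxLN).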

Fisher’s exact test was used to compare the performance of objective 18-F FDG PET/CT criteria versus expert visual interpretation (EVI).

RESULTS

In the prediction set, 112/268 (42%) patients had both EBUS-TBNA and 18-F FDG PET/CT scans as part of their diagnostic work up from 2008 to 2009. Eight non-lung cancer patients were excluded. This left 104 patients available for inclusion in the prediction set. Of these, 87 malignant lymph nodes were found in 75 patients and 41 benign lymph nodes from 21 patients. In the validation set, 70/231 (30%) patients who underwent EBUS-TBNA in 2010 also had 18-F FDG PET/CT scans. Two patients were excluded because of a diagnosis of lymphoma. Unfortunately, due to technical factors, raw data was lost in 20 patients. This left 34 malignant nodes from 34 patients and 19 benign lymph nodes from 14 patients for analysis in the validation set. In both datasets, there were no post treatment cases requiring exclusion.


Prediction set results

Table 1 summarizes the prediction set results. Malignant lymph nodes had higher values for all objective criteria analysed as outlined in Table 1 (P < 0.0001 for all criteria). EVI correctly classified 83/87 (95.4%) malignant lymph nodes and 39/41 (95.1%) benign lymph nodes. One lymph node in the malignant group was classified as equivocal, and 10 lymph nodes in the benign group were classified as equivocal.

Validation set

Table 2 summarizes the selected cut point values for each of these features based on the highest percentage correctly classified on the ROC. All five objective 18-F FDG PET/CT criteria had aROCs of >0.70 and were used to classify lymph nodes in the validation set. Four of the five criteria had an aROC >0.90 (Fig. 2a–e). Their corresponding sensitivities, specificities, positive and negative LRs for malignancy, and DORs are also presented. All DORs were >10 signifying high accuracy.
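Choosing a cut point by "highest percentage correctly classified" can be illustrated with a brute-force search over the observed values. This is a minimal sketch on made-up numbers, not the procedure or software used in the study; the function name, the tie-breaking rule and the data are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def best_cut_point(malignant: Sequence[float],
                   benign: Sequence[float]) -> Tuple[Optional[float], float]:
    """Return (cut point, fraction correctly classified) with the highest accuracy.

    A node is called malignant when its value is >= the cut point.
    Ties are resolved in favour of the lowest candidate (an arbitrary choice).
    """
    candidates = sorted(set(malignant) | set(benign))
    total = len(malignant) + len(benign)
    best_cut, best_fraction = None, -1.0
    for cut in candidates:
        correct = sum(v >= cut for v in malignant) + sum(v < cut for v in benign)
        fraction = correct / total
        if fraction > best_fraction:
            best_cut, best_fraction = cut, fraction
    return best_cut, best_fraction


# Made-up SUVmax values for illustration.
cut, fraction = best_cut_point([4.0, 6.0, 9.5], [2.5, 3.0, 5.0])
```

On this toy data two cut points (4.0 and 6.0) classify five of six nodes correctly, which shows why a tie-breaking rule or a secondary criterion is needed in practice.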

Using these selected cut point values, the combined LR was calculated for each lymph node in the validation set. Excluding SUVmax lymph node : SUVmax primary (which was the poorest performing objective criterion), the other four objective 18-F FDG PET/CT criteria correctly classified 44/53 (83.0%) of lymph nodes in the validation set. For malignant nodes 28/34 (82.4%) were correctly classified, and for benign lymph nodes 16/19 (84.2%) were correctly classified. The positive predictive value (PPV) was 90.3% and the negative predictive value (NPV) 72.7% in the validation set. In comparison, EVI correctly classified 48/53 (91%) of lymph nodes.
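The reported predictive values follow directly from the node counts above: 28 of 34 malignant nodes were called malignant (6 missed) and 16 of 19 benign nodes were called benign (3 false positives). The short sketch below reproduces the arithmetic; the function name is hypothetical.

```python
from typing import Dict


def diagnostic_summary(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Standard 2x2 diagnostic accuracy measures, returned as percentages."""
    return {
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
        "ppv": 100.0 * tp / (tp + fp),
        "npv": 100.0 * tn / (tn + fn),
        "accuracy": 100.0 * (tp + tn) / (tp + fn + tn + fp),
    }


# Validation set, four objective criteria: counts derived from 28/34 and 16/19.
summary = diagnostic_summary(tp=28, fn=6, tn=16, fp=3)
rounded = {name: round(value, 1) for name, value in summary.items()}
```

The PPV is 28/31 and the NPV is 16/22, so the six missed malignant nodes are what pull the NPV down.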

There were 33/53 lymph nodes in the validation set with an identifiable primary cancer. Using all five objective FDG PET-CT criteria, 28/33 (84.8%) were correctly classified compared with 31/33 (93.9%) by EVI. In this dataset, the five lymph nodes misclassified by objective criteria were four malignant lymph nodes and one benign lymph node. EVI misclassified one malignant lymph node and one benign lymph node.

There was no significant difference between objective 18-F FDG PET/CT criteria and EVI by Fisher’s exact test.
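A comparison of this kind can be sketched with a small pure-Python implementation of the two-sided Fisher's exact test. The 2x2 table used here (44 correct and 9 incorrect for objective criteria versus 48 correct and 5 incorrect for EVI, out of 53 validation nodes each) is an assumption about how the comparison was laid out, since the paper does not show the table, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denominator = comb(row1 + row2, col1)

    def probability(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    observed = probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    return sum(p for p in (probability(x) for x in range(lowest, highest + 1))
               if p <= observed * (1 + 1e-9))


# Assumed layout: rows = method, columns = (correct, incorrect).
p_value = fisher_exact_two_sided(44, 9, 48, 5)
```

Under this assumed layout the p-value is well above 0.05, which is in line with the statement that no significant difference was found. The test treats the two groups as independent samples, although both methods were applied to the same nodes.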

Table 1 Objective fluoro-deoxy-glucose positron emission tomography/computed tomography (FDG PET/CT) criteria results for the prediction set

| Criterion | Malignant lymph nodes, median (interquartile range) | Benign lymph nodes, median (interquartile range) | P-value |
| --- | --- | --- | --- |
| SUVmax lymph node | 9.5 (5.9–13.2) | 3 (2.5–3.6) | <0.0001 |
| SUVmax lymph node : SUVmax liver | 3.70 (2.21–5.14) | 1.15 (0.95–1.35) | <0.0001 |
| SUVmax lymph node : SUVavg liver | 4.26 (2.68–6.14) | 1.42 (1.14–1.62) | <0.0001 |
| SUVmax lymph node : SUVmax blood pool | 5.58 (3.40–8.31) | 1.67 (1.37–1.89) | <0.0001 |
| SUVmax lymph node : SUVmax primary (98 available nodes) | 0.81 (0.56–0.98) | 0.34 (0.23–0.60) | <0.0001 |

SUV, standardized uptake value; SUVmax, maximum standardized uptake value.

Table 2 Objective fluoro-deoxy-glucose positron emission tomography/computed tomography (FDG PET/CT) criteria used to assess the validation set

| Variable | Cut point value | Sensitivity (%) | Specificity (%) | Correctly classified (%) | Positive LR | Negative LR | aROC | DOR |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SUVmax lymph node | ≥3.9 | 89.66 | 90.24 | 89.84 | 9.19 | 0.11 | 0.95 | 104.98 |
| SUVmax lymph node : SUVmax liver | ≥1.82 | 83.91 | 97.56 | 88.28 | 34.4 | 0.16 | 0.94 | 219.08 |
| SUVmax lymph node : SUVavg liver | ≥2.31 | 81.61 | 97.56 | 86.72 | 33.46 | 0.19 | 0.93 | 186.42 |
| SUVmax lymph node : SUVmax blood pool | ≥2.15 | 87.36 | 92.5 | 88.98 | 11.65 | 0.14 | 0.94 | 177.78 |
| SUVmax lymph node : SUVmax primary (104 available nodes) | ≥0.64 | 73.02 | 77.14 | 74.49 | 3.20 | 0.35 | 0.77 | 19.14 |

aROC, area under the receiver operator curve; DOR, diagnostic odds ratio; LR, likelihood ratio; SUV, standardized uptake value; SUVmax, maximum standardized uptake value.

DISCUSSION

We have derived and validated objective 18-F FDG PET/CT criteria that can correctly classify FDG avid hilar and mediastinal lymphadenopathy with high accuracy. In our patient cohort, these criteria are different from the results of previous investigators, implying that population variables should be considered when applying these results more widely. Although they do not outperform EVI, objective criteria can still be important, especially in cases where 18-F FDG PET/CT scans are repeated to determine therapy response.17

18-F FDG PET/CT is well established in lung cancer staging with clear superiority to CT for mediastinal staging.18 Sensitivity of 91% and specificity of 86% have been reported for the detection of mediastinal lymph node metastases with high NPVs.19,20 However, PPV varies between 74% and 93%.21,22 Furthermore, there are no universally accepted criteria for a ‘positive’ node, with local practice and expertise often determining a trade-off between sensitivity and specificity. In our validation set, we managed to achieve a high PPV (90.3%) but a lower NPV (72.7%); ideally a larger validation set is required to assess our values further.

Figure 2 (a–e) Receiver operator curves with an area under the curve of greater than 0.70.

A unique feature of our study is the validation data set to confirm the prediction set results. Other strengths include the stringent application of multiple objective scan interpretation criteria (to maximize the potential accuracy of the objective scan criteria), use of a single PET/CT scanner with standardized image acquisition and processing protocols, and comprehensive clinicopathological follow up in a quaternary referral centre.

EVI performed at least as well as a combination of population-specific objective criteria. This potentially is explained by the fact that an experienced nuclear medicine physician can take a holistic approach, using a priori knowledge and experience as well as information from the whole scan rather than a restricted set of predetermined criteria. However, there is a learning curve, with studies showing that inter-observer variation decreases with increasing experience.23,24 Therefore, objective criteria would still be useful for those training in nuclear medicine and/or validating visual assessment in difficult cases.

Further difficulty with objective criteria can be encountered in recognizing benign FDG avid nodes that appear positive according to SUV criteria, with previous researchers demonstrating that symmetry can assist in detecting these.25 In particular, the ‘inverted Y pattern’ is often characteristic of a benign cause (Fig. 3). We suspect that visual interpretation of these patterns improves with increasing physician experience.

Our prediction set results showed very strong aROCs of >0.90 for all objective criteria apart from SUVmax lymph node : SUVmax of the primary, suggesting that these criteria individually could be used as a simple, single calculation in clinical practice to aid scan interpretation rather than calculating a combined LR. Validation in other clinical settings would be required.

The use of ratios as in our study is potentially a better comparator than SUVmax, as SUVmax can be difficult to compare between scanners in different institutions. Technical variables including measuring residual activities after tracer injection, difference in uptake time and difference in scanning time for each bed position play a role in difference in results.26 These would be less marked when using ratios rather than SUVmax values. We did not attempt correction of uptake values to nodal size (partial volume correction) as it is difficult to achieve accurately without the use of intravenous contrast in the CT component of the examination. There are also well known difficulties with the reproducibility of nodal size measurements on CT scans.26,27

A recent study found that lymph node : aorta SUV and lymph node : liver SUV ratios potentially improve the diagnostic accuracy of 18-F FDG PET/CT,28 but their reported aROC were lower than ours. From our experience, the left atrium as blood pool measurement gives more consistent SUV readings than the aorta because of false high readings from the latter due to incorporation of tracer uptake from the vessel wall rather than pure blood pool activity.

Figure 3 Inverted Y mediastinal pattern characteristic of benign disease.


Our cut point values are higher than those previously reported for each criterion. One possible explanation is that we had a high number of benign nodes in our series with inflammatory lymph nodes and sarcoidosis both present, which are known to give high SUV readings.29,30 Inflammatory nodules can have SUV values similar to those of malignant lesions.30 The prediction set results with very strong aROC could not be replicated in the validation set. This may reflect the heterogeneity seen in inflammatory conditions and the difficulty of using objective criteria to define benign disease in particular.

The weaknesses of our study are that not all nodal stations in all patients had tissue samples (impossible to achieve without a complete nodal surgical dissection), and nodal stations that were inaccessible by EBUS-TBNA were excluded. However, as the high NPV of FDG PET/CT is well validated in the literature, the likely bias induced with our study design would be a potential reduction in PPV by overrepresentation of benign FDG-avid nodes in the dataset.

In conclusion, our population-specific objective criteria were highly accurate and a useful adjunct for the interpretation of FDG avid hilar and mediastinal lymphadenopathy but did not readily replace EVI. Heterogeneity in previously published criteria and our findings likely reflect varying characteristics of the patient population studied. Further multicentre prospective studies with larger numbers of patients could lead to the derivation of universally accepted and reproducible objective FDG PET scan interpretation criteria.

Acknowledgements

The authors thank the Cancer Council of Queensland Australia, the Australian Lung Foundation and The Royal Brisbane and Women’s Hospital Foundation for their PhD scholarship support of Dr Phan Nguyen.

REFERENCES

1 Silvestri GA, Gould MK, Margolis ML, Tanoue LT, McCrory D,Toloza E, Detterbeck F. Noninvasive staging of non-small celllung cancer: ACCP evidenced-based clinical practice guidelines(2nd edition). Chest 2007; 132: 178S–201S.

2 Detterbeck FC, Boffa DJ, Tanoue LT. The new lung cancer stagingsystem. Chest 2009; 136: 260–71.

3 Herth FJ, Eberhardt R. Actual role of endobronchial ultrasound(EBUS). Eur. Radiol. 2007; 17: 1806–12.

4 Gu P, Zhao YZ, Jiang LY, Zhang W, Xin Y, Han BH. Endobronchialultrasound-guided transbronchial needle aspiration for stagingof lung cancer: a systematic review and meta-analysis. Eur. J.Cancer 2009; 45: 1389–96.

5 Okereke IC, Gangadharan SP, Kent MS, Nicotera SP, Shen C,DeCamp MM. Standard uptake value predicts survival in non-small cell lung cancer. Ann. Thorac. Surg. 2009; 88: 911–15,discussion 5–6.

6 Hellwig D, Graeter TP, Ukena D, Groeschel A, Sybrecht GW,Schaefers HJ, Kirsch CM. 18F-FDG PET for mediastinal staging oflung cancer: which SUV threshold makes sense? J. Nucl. Med.2007; 48: 1761–6.

7 Cerfolio RJ, Bryant AS. Ratio of the maximum standardizeduptake value on FDG-PET of the mediastinal (N2) lymph nodesto the primary tumor may be a universal predictor of nodal

malignancy in patients with nonsmall-cell lung cancer. Ann.Thorac. Surg. 2007; 83: 1826–9, discussion 9–30.

8 Tournoy KG, Maddens S, Gosselin R, Van Maele G, vanMeerbeeck JP, Kelles A. Integrated FDG-PET/CT does not makeinvasive staging of the intrathoracic lymph nodes in non-smallcell lung cancer redundant: a prospective study. Thorax 2007; 62:696–701.

9 Gupta NC, Graeber GM, Bishop HA. Comparative efficacy ofpositron emission tomography with fluorodeoxyglucose inevaluation of small (<1 cm), intermediate (1 to 3 cm), and large(>3 cm) lymph node lesions. Chest 2000; 117: 773–8.

10 Hirdes MM, Schwartz MP, Tytgat KM, Schlosser NJ, Sie-Go DM,Brink MA, Oldenburg B, Siersema PD, Vleggaar FP. Performanceof EUS-FNA for mediastinal lymphadenopathy: impact onpatient management and costs in low-volume EUS centers. Surg.Endosc. 2010; 24: 2260–7.

11 Erselcan T, Turgut B, Dogan D, Ozdemir S. Lean body mass-based standardized uptake value, derived from a predictiveequation, might be misleading in PET studies. Eur. J. Nucl. Med.Mol. Imaging 2002; 29: 1630–8.

12 Nguyen P, Bashirzadeh F, Hundloe J, Salvado O, Dowson N, WareR, Masters IB, Bhatt M, Kumar AR, Fielding D. Optical differen-tiation between malignant and benign lymphadenopathy bygrey scale texture analysis of endobronchial ultrasound convexprobe images. Chest 2012; 141: 709–15.

13 Akobeng AK. Understanding diagnostic tests 3: receiver operating characteristic curves. Acta Paediatr. 2007; 96: 644–7.

14 Williams GJ, Macaskill P, Chan SF, Turner RM, Hodson E, Craig JC. Absolute and relative accuracy of rapid urine tests for urinary tract infection in children: a meta-analysis. Lancet Infect. Dis. 2010; 10: 240–50.

15 Stengel D, Bauwens K, Sehouli J, Ekkernkamp A, Porzsolt F. A likelihood ratio approach to meta-analysis of diagnostic studies. J. Med. Screen. 2003; 10: 47–51.

16 Goodman SN. Toward evidence-based medical statistics. 2: the Bayes factor. Ann. Intern. Med. 1999; 130: 1005–13.

17 Ryu JS, Choi NC, Fischman AJ, Lynch TJ, Mathisen DJ. FDG-PET in staging and restaging non-small cell lung cancer after neoadjuvant chemoradiotherapy: correlation with histopathology. Lung Cancer 2002; 35: 179–87.

18 Dwamena BA, Sonnad SS, Angobaldo JO, Wahl RL. Metastases from non-small cell lung cancer: mediastinal staging in the 1990s—meta-analytic comparison of PET and CT. Radiology 1999; 213: 530–6.

19 van Tinteren H, Hoekstra OS, Smit EF, van den Bergh JH, Schreurs AJ, Stallaert RA, van Velthoven PC, Comans EF, Diepenhorst FW, Verboom P et al. Effectiveness of positron emission tomography in the preoperative assessment of patients with suspected non-small-cell lung cancer: the PLUS multicentre randomised trial. Lancet 2002; 359: 1388–93.

20 Antoch G, Stattaus J, Nemat AT, Marnitz S, Beyer T, Kuehl H, Bockisch A, Debatin JF, Freudenberg LS. Non-small cell lung cancer: dual-modality PET/CT in preoperative staging. Radiology 2003; 229: 526–33.

21 Pieterman RM, van Putten JW, Meuzelaar JJ, Mooyaart EL, Vaalburg W, Koeter GH, Fidler V, Pruim J, Groen HJ. Preoperative staging of non-small-cell lung cancer with positron-emission tomography. N. Engl. J. Med. 2000; 343: 254–61.

22 Vansteenkiste JF, Stroobants SG, De Leyn PR, Dupont PJ, Bogaert J, Maes A, Deneffe GJ, Nackaerts KL, Verschakelen JA, Lerut TE et al. Lymph node staging in non-small-cell lung cancer with FDG-PET scan: a prospective study on 690 lymph node stations from 68 patients. J. Clin. Oncol. 1998; 16: 2142–9.

23 Krabbe CA, Pruim J, Scholtens AM, Roodenburg JL, Brouwers AH, Phan TT, Agool A, Dijkstra PU. 18F-FDG PET in squamous cell carcinoma of the oral cavity and oropharynx: a study on inter- and intraobserver agreement. J. Oral and Maxillofac. Surg. 2010; 68: 21–7.

24 Hofman MS, Smeeton NC, Rankin SC, Nunan T, O’Doherty MJ. Observer variation in FDG PET-CT for staging of non-small-cell lung carcinoma. Eur. J. Nucl. Med. Mol. Imaging 2009; 36: 194–9.

25 Karam M, Roberts-Klein S, Shet N, Chang J, Feustel P. Bilateral hilar foci on 18F-FDG PET scan in patients without lung cancer: variables associated with benign and malignant etiology. J. Nucl. Med. 2008; 49: 1429–36.

26 Cademartiri F, Luccichenti G, Maffei E, Fusaro M, Palumbo A, Soliani P, Sianesi M, Zompatori M, Crisi G, Krestin GR. Imaging for oncologic staging and follow-up: review of current methods and novel approaches. Acta Biomed. 2008; 79: 85–91.

27 Buerke B, Puesken M, Muter S, Weckesser M, Gerss J, Heindel W, Wessling J. Measurement accuracy and reproducibility of semiautomated metric and volumetric lymph node analysis in MDCT. AJR Am. J. Roentgenol. 2010; 195: 979–85.

28 Kuo WH, Wu YC, Wu CY, Ho KC, Chiu PH, Wang CW, Chang CJ, Yu CT, Yen TC, Lin C. Node/aorta and node/liver SUV ratios from (18)F-FDG PET/CT may improve the detection of occult mediastinal lymph node metastases in patients with non-small cell lung carcinoma. Acad. Radiol. 2012; 19: 685–92.

29 Chowdhury FU, Sheerin F, Bradley KM, Gleeson FV. Sarcoid-like reaction to malignancy on whole-body integrated (18)F-FDG PET/CT: prevalence and disease pattern. Clin. Radiol. 2009; 64: 675–81.

30 Chun EJ, Lee HJ, Kang WJ, Kim KG, Goo JM, Park CM, Lee CH. Differentiation between malignancy and inflammation in pulmonary ground-glass nodules: the feasibility of integrated (18)F-FDG PET/CT. Lung Cancer 2009; 65: 180–6.


© 2014 Asian Pacific Society of Respirology Respirology (2015) 20, 129–137