Identification of Aberrant Railroad Wayside WILD and THD Detectors: Using Industry-wide Railroad Data
June 10, 2014
Agenda
• Background / Overview
• Solution Approach
• Groupings Criteria
  • Sites
  • Speed
  • Weight
  • Car Type
• Solution Architecture (with SAS)
• REPORTING: Dashboards to Analyze WILD Sites
WILD Detectors
WILD: WHEEL IMPACT LOAD DETECTOR
WILD Detectors Data Collection
[Diagram labels: TTX, Railinc]
WILD Detector Locations
158 WILD detectors in North America.
TTX receives 375,000 records per day.
Project Overview
Wayside Detectors
• Located at about 158 WILD sites in North America
• Detect wheel impacts
• Can exhibit aberrant behavior (e.g., due to flooding)
• An aberrant detector can lead to false detections and unnecessary wheel repair expenses
Business Problem
• An aberrant detector could lead to removal of 200-500 healthy wheels (costing $2-3K each) before it is identified (cost = $1M+)
• Once suspected, it takes at least 2-3 weeks of manual work to analyze data and verify it
Goals
• Build an interactive tool for rapid identification of aberrant wayside detectors / health diagnostics using INDUSTRY-WIDE DATA
• Evaluate readings grouped by Speed, Weight and Car type for better precision
• Advanced Enterprise Reporting system & Insights (Dashboards/UI, email alerts)
TTX-CARS ONLY WILD DATA
PREVIOUS WORK/BACKGROUND
Variables affecting Alerts from Detectors
Factors that influence Number of Alerts from a Detector
• Variability of measurement system
• Alerts differ across time (seasonality, winter temperature)
• Alerts differ across detectors (railcar types, location)
Metrics
• APW: Alerts per 10,000 wheels passed
• Defined for each alert level
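A minimal sketch of the APW metric, assuming alerts and wheel counts have already been tallied for one detector at one alert level. The function name `alerts_per_10k_wheels` is illustrative and not from the original; the counts are the three rows of the Site D Track 1 speed table that appears later in this deck.

```python
def alerts_per_10k_wheels(alerts: int, wheels: int) -> float:
    """Return APW: alerts per 10,000 wheels passed, for one alert level."""
    if wheels <= 0:
        raise ValueError("wheels must be positive")
    return alerts / wheels * 10_000


# (30-day alerts, 30-day wheels) from the Site D Track 1 speed table.
for alerts, wheels in [(379, 88_496), (5_608, 288_855), (42, 8_760)]:
    print(round(alerts_per_10k_wheels(alerts, wheels)))
```

Rounded to whole numbers, the three results match the "Alerts / 10k Wheels" column of that table (43, 194, 48).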
Variability is Normal in WILD Detectors
Temperature / Seasonal variation*
[Figure: APW over time, 1/1/2009 to 12/17/2011; y-axis: APW, 0 to 4. Annotations: 2009 T = -14 C, 2010 T = -9 C, 2011 T = -13 C, 2012 T = -7 C]
• More WILD alerts during winter months
• Colder winter leads to more alerts
*TTX-CARS ONLY WILD DATA
Grouping*
» Example of grouping sites based on similar environmental criteria, not by railroad.
Grouping*
Used to handle seasonality and temperature related effects
[Figure: two panels, Group 1 and Group 2; y-axis: APW, 0 to 35]
*TTX-CARS ONLY WILD DATA
Detector Specific Variations*
APW differs from detector to detector
[Figure: APW over time for individual detectors, January 2010 to May 2012; panel y-axes span 0 to 12, 0 to 35, and 0 to 6]
*TTX-CARS ONLY WILD DATA
Normalized APW
Normalization for comparison across sites
$$\text{Normalized APW} = \frac{\text{APW}}{\text{3-year Average APW}}$$
[Figure: time series, 1-Jan-10 to 30-May-12; y-axis 0 to 6]
*TTX-CARS ONLY WILD DATA
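The normalization can be sketched as follows. This is an illustration under stated assumptions, not the presenters' implementation: the baseline is taken as a plain arithmetic mean of a detector's own APW readings (the slides do not say whether the 3-year average is weighted by wheel count), and `normalize_apw` and the sample series are hypothetical.

```python
from statistics import fmean


def normalize_apw(apw_series: list[float], baseline: list[float]) -> list[float]:
    """Divide each APW reading by the detector's baseline (e.g. 3-year) average."""
    base_apw = fmean(baseline)  # assumption: plain arithmetic mean
    if base_apw <= 0:
        raise ValueError("baseline average APW must be positive")
    return [apw / base_apw for apw in apw_series]


# Hypothetical detectors with very different raw APW levels.
site_a = [10.0, 20.0, 30.0]
site_b = [1.0, 2.0, 3.0]
print(normalize_apw(site_a, site_a))  # [0.5, 1.0, 1.5]
print(normalize_apw(site_b, site_b))  # [0.5, 1.0, 1.5]
```

Both hypothetical sites follow the same relative trend, so they normalize to the same series even though their raw APW differs tenfold, which is what makes cross-site comparison possible.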
Group Average and Standard Deviation
[Figure: group Mean ± std. dev.; y-axis: Normalized APW]
*TTX-CARS ONLY WILD DATA
Group Average is the Benchmark
[Figure: group Mean ± std. dev. over time, 1-Jan-10 to 30-May-12; y-axis: Normalized APW, 0 to 6]
*TTX-CARS ONLY WILD DATA
Detector Scoring
SCORE: How far from the average is aberrant?
$$\text{Score} = \frac{\text{Observed} - \text{Group Avg}}{\text{Std. Dev.}}$$
Yellow: Score > 2
Red: Score > 3
[Figure: y-axis: Normalized APW]
*TTX-CARS ONLY WILD DATA
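A minimal sketch of the score and its color thresholds. The function names are illustrative, and the handling of scores exactly equal to 2 or 3 is an assumption, since the slides only give strict inequalities (the Solution Approach slide adds Green for Score < 2).

```python
def detector_score(observed: float, group_avg: float, group_std: float) -> float:
    """Score = (Observed - Group Avg) / Std. Dev."""
    if group_std <= 0:
        raise ValueError("group standard deviation must be positive")
    return (observed - group_avg) / group_std


def score_color(score: float) -> str:
    """Map a score to a status color.

    Assumption: a score of exactly 2 is green and exactly 3 is yellow.
    """
    if score > 3:
        return "red"
    if score > 2:
        return "yellow"
    return "green"


# Hypothetical reading: normalized APW of 2.6 against a group at 1.0 +/- 0.5.
print(score_color(detector_score(observed=2.6, group_avg=1.0, group_std=0.5)))  # red
```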
Example of Normal site
[Figure: y-axis: Normalized APW, 0 to 4]
*TTX-CARS ONLY WILD DATA
Example of Faulty Detector (1)
[Figure: y-axis: Normalized APW, 0 to 6]
*TTX-CARS ONLY WILD DATA
INDUSTRY WIDE DATA
Site Groupings* (Industry-Wide Data)
*Site groupings are for industry-wide data and are based on similarity in APW / Base APW (= S) trends, domain knowledge, geography, and environmental criteria.
Comparison of Data Size
Average Wheel Count Per Day for 2012
• TTX Only: ~ 375,000
• All Cars: ~ 1,650,000
Approx. 4X more data
Industry Wide Data Volume
• ~ 1.5 - 2M rows of data PER DAY (~ 200 MB size)
• ~ 200 - 220 GB of data over last three years
• Stripped Car Numbers and Car Initials for anonymity purposes
Solution Approach
Separating the signal from the noise - typical site
1. Compute S = APW / Base APW, where
$$\text{APW} = \frac{\text{Alerts}}{\text{Wheels}} \times 10000$$
2. One time: Group sites based on similarity in APW/Base APW (= S) trends, domain knowledge, geography
3. Determine benchmark = Group average
4. Compute the score:
$$\text{Score} = \frac{\text{Observed} - \text{Group Avg}}{\text{Std. Dev.}}$$
Green: Score < 2
Yellow: Score > 2
Red: Score > 3
[Figure: Detector Trend plotted against the Benchmark Trend (dashed line); y-axis: Normalized APW, 0 to 6]
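The benchmark and scoring steps above can be sketched for one group on one day. This is a sketch under assumptions the slides do not confirm: the benchmark and spread are taken as the mean and sample standard deviation of S across all sites in the group, with the scored site included. The name `score_group` and the site values are hypothetical.

```python
from statistics import fmean, stdev


def score_group(s_by_site: dict[str, float]) -> dict[str, float]:
    """Score every site in one group for a single day.

    Assumption: benchmark and spread are the mean and sample standard
    deviation of S across all sites in the group, scored site included.
    """
    values = list(s_by_site.values())
    benchmark = fmean(values)
    spread = stdev(values)  # needs at least two sites
    if spread == 0:
        return {site: 0.0 for site in s_by_site}
    return {site: (s - benchmark) / spread for site, s in s_by_site.items()}


# Hypothetical group: five sites near S = 1 and one site reading high.
group = {"A": 0.9, "B": 1.0, "C": 1.1, "D": 1.0, "E": 1.0, "F": 3.0}
scores = score_group(group)
print(max(scores, key=scores.get))  # F
```

One design point worth noting: when the scored site contributes to its own benchmark, a single outlier in a group of n sites cannot score above (n - 1) / sqrt(n), which is about 2.04 for six sites. Excluding the scored site from its own benchmark, or estimating the spread over a time window, are alternatives; the source does not say which it uses.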
TTX-ONLY CARS / INDUSTRY WIDE DATA COMPARISON
Parameter S – TTX Cars vs. All Cars
Group 1 (Site A)
* For Alert Level 1, Site Direction 1
The lines form a very tight band
(8/15/2011) 65% higher standard deviation.
Example from Group 6 (Site B)
[Figure: two panels, TTX Cars* and All Cars*; Normalized APW for Detector, 01-Jan-10 to 01-Apr-12; y-axis 0 to 4]
(8/15/2011) 360% higher standard deviation.
ANALYSIS FOR SPEED, WEIGHT AND CAR TYPE FACTORS
Site D Track 1 – All Cars (1)
Comparison of Observed APW for Site D Track 1 – By Speed*
Higher APW at higher speeds
[Figure: y-axis: APW]
* For Alert Level 1 and Site Direction 1
Site D Track 1 – All Cars (2)
Comparison of Data: Sample Data Point - May 1st, 2011*
Speed | 30 Day Alerts | 30 Day Wheels | Alerts / 10k Wheels
0 – 40 mph | 379 | 88,496 | 43
40 – 60 mph | 5,608 | 288,855 | 194
60 – 80 mph | 42 | 8,760 | 48
* For Alert Level 1 and Site Direction 1
Comparison of Z-Score for Site D Track 1 – By Speed*
Not Faulty at lower speeds
[Figure: y-axis: Z-score]
Weight Bucket Analysis - # of Wheels & Alerts
Mid-range (160,000 – 240,000 lbs) is further split into buckets of 20,000 lbs each
Weight Bucket | Avg. # of Wheels per Month per Site | Avg. # of Condemnable Alerts Per Month Per Site
Empty (0 - 160,000 lbs) | 149,854 | 7
Mid (160,000 - 180,000 lbs) | 6,196 | 4
Mid (180,000 - 200,000 lbs) | 4,323 | 4
Mid (200,000 - 220,000 lbs) | 4,350 | 6
Mid (220,000 - 240,000 lbs) | 8,422 | 14
Fully Loaded (240,000 - 320,000 lbs) | 105,460 | 298
Above Fully Loaded (320,000+ lbs) | 157 | < 1
* 2010-13 data for all sites
Site D Track 1 – Hopper & Tank Cars
Comparison of Data: Sample Data Point - May 1st, 2011*
Car Weight | 30 Day Alerts | 30 Day Wheels | Alerts per 10,000 Wheels (APW) | % of Total Wheels
Empty (0 – 160,000 lbs) | 6 | 67,148 | 1 | 50.3%
Mid (160,000 – 240,000 lbs) | 18 | 1,423 | 126 | 1.1%
Fully Loaded (240,000 – 320,000 lbs) | 4,041 | 64,685 | 625 | 48.5%
Above Fully Loaded (320,000 lbs +) | 11 | 117 | 940 | 0.1%
* For Alert Level 1, Site Direction 1, Speed Range 40 – 60 mph
Site D Track 1 – Empty Versus Fully Loaded
Comparison of Z-Score for Site D Track 1 – By Weight*
* For Alert Level 1, Site Direction 1, Speed Range 40 – 60 mph, Hopper & Tank Cars
Final Groupings Summary for All Data
CAR TYPE
Group | Car Umler Type
C1 | Box Cars
C2 | Covered Hopper / Tank Cars
C3 | Gondolas
C4 | Flat Cars
C5 | Equipped & Unequipped Hoppers / Gondolas-GT
C6 | Refrigerator Cars
C7 | Vehicular Flat Cars
C8 | Locomotives
C9 | Conventional, Intermodal & Stack Cars
X* | Caboose, Passenger, Containers, Trailers, Special Types
SPEED
Speed Group | Train Speed (mph)
S1 | 0-40
S2 | 40-60
S3 | 60-80
S4 | 80+
• 40 – 60 mph is the most stable speed range
• Speeds lower than 40 mph do not produce sufficient load impact
• The detector readings become unstable at higher speeds
WEIGHT
Weight Group | Car Weight
W1 | 0-160,000 lbs
W2 | 160,000-240,000 lbs
W3 | 240,000-320,000 lbs
W4 | 320,000 lbs +
• W1: Fully Empty / Nearly Empty
• W3: Fully Loaded / Almost Fully Loaded (+-10%)
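The groupings above could be applied to individual wheel readings roughly as follows. This is a sketch with assumptions labeled: the slides do not say which bucket a boundary value (for example exactly 40 mph) falls into, so each range is treated here as lower-inclusive; the record layout and all function names are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict

SPEED_EDGES = [40, 60, 80]                  # mph: S1 0-40, S2 40-60, S3 60-80, S4 80+
WEIGHT_EDGES = [160_000, 240_000, 320_000]  # lbs: W1, W2, W3, W4


def speed_group(mph: float) -> str:
    """Return S1..S4; a boundary value goes to the higher group (assumption)."""
    return f"S{bisect_right(SPEED_EDGES, mph) + 1}"


def weight_group(lbs: float) -> str:
    """Return W1..W4; a boundary value goes to the higher group (assumption)."""
    return f"W{bisect_right(WEIGHT_EDGES, lbs) + 1}"


def apw_by_group(readings):
    """Compute APW per (car group, speed group, weight group).

    readings: iterable of (car_group, mph, lbs, alerted), one per wheel.
    """
    wheels = defaultdict(int)
    alerts = defaultdict(int)
    for car_group, mph, lbs, alerted in readings:
        key = (car_group, speed_group(mph), weight_group(lbs))
        wheels[key] += 1
        alerts[key] += int(alerted)
    return {key: alerts[key] / wheels[key] * 10_000 for key in wheels}


# Hypothetical wheel readings: (car group, train speed, car weight, alert raised?)
sample = [
    ("C2", 45.0, 263_000, True),
    ("C2", 52.0, 286_000, False),
    ("C2", 35.0, 60_000, False),
    ("C3", 61.0, 250_000, False),
]
print(apw_by_group(sample))
```

Keying APW on all three groups means a detector is only compared against readings taken under similar conditions, which is the stated purpose of grouping by speed, weight and car type.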
PERFORMANCE OF CAR GROUPS
Site D Track 1 – By Car Type
Comparison of Data: Sample Data Point – Jan 1st, 2012*
Car Type | 30 Day Alerts | 30 Day Wheels | APW | % Total Wheels
Flat Cars | 56 | 1,112 | 503 | 1%
Hopper & Tank Cars | 2,007 | 67,386 | 298 | 76%
Gondolas | 374 | 10,560 | 354 | 12%
Vehicular Flat Cars | 0 | 0 | 0 | 0%
Box Cars & Refrigerator Cars | 23 | 616 | 373 | 1%
Stacked & Intermodal Cars | 34 | 9,071 | 37 | 10%
* For Alert Level 1, Site Direction 1, Speed Range 40 – 60 mph & Weight Range 240,000 - 320,000 lbs
Site D Track 1 – By Car Type
[Figure labels: Faulty, Not Faulty, SAME DAY]
* For Alert Level 1, Site Direction 1, Speed Range 40 – 60 mph & Weight Range 240,000 – 320,000 lbs
SOLUTION ARCHITECTURE
Investigation 1 – Aberrant Site
Investigation 1 – Site Dashboard
Investigation 1 – Comparison of Rails
Both rails are above the group average with Z-scores > 3.
[Figure panels: Rail 1, Rail 2]
Investigation 1 – Comparison of Cars
[Figure panels: Covered hoppers, Open hoppers, Gondolas]
All car types representing 90% of the traffic had high alerts.
Investigation 2 – No trouble found
Railroad asked TCCI to investigate a detector giving a large number of alerts.
No, the detector is not aberrant: the Z-score is 1.8.
• All Midwest detectors were higher due to the long, cold winter.
• The traffic pattern changed.
  • A lot more cars
  • More weight per car
Conclusions
Change the state of the art
1. From gut feel to clear statistical signals.
2. Automated weekly analysis of 11.5 million signals from 158 detectors, with notifications of abnormal readings.
3. Rapid process to investigate suspicious detectors.