“How good a match is it?”
Prototype Software To Render Quantitative CMS Statements
Nicholas D. K. Petraco, Michael Neel and James Hamby
Outline
• Possible sources of data
• Tables of CMS runs
• Can we use the information we have now to
come up with a reasonable quantitative
model?
• “Match” probability and weight of evidence
estimates from CMS data and Bayesian
Networks
• Where to get the model and software to make it
useful
2D Data Acquisition For Toolmarks
KM
Jerry Petillo
KNM
3D Data Acquisition For Toolmarks
Confocal Microscope
Focus Variation Microscope
Striation Patterns
LEA
Bullet base, 9mm Ruger Barrel
Looking down the optical dividing line on a comparison scope is like looking at a profile
Segment hills and valleys into “lines”:
Compare “Lines” Between Known Matches
KM1
KM2
Compare “lines”
Compare “lines” Between Known Non Matches
KNM2
KNM1
Use scores to count width of CMS runs
KM
KNM
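A minimal sketch of the counting step, assuming each line along the profile has already been scored as matching or not. The scoring itself is not described in the slides, so the boolean flags and the function name `cms_run_counts` are hypothetical. A run is taken here to be a maximal stretch of consecutive matching lines.

```python
from collections import Counter
from itertools import groupby


def cms_run_counts(matched):
    """Tally maximal runs of consecutive matching lines, keyed by run length."""
    counts = Counter()
    for is_match, group in groupby(matched):
        if is_match:
            counts[len(list(group))] += 1
    return counts


# Hypothetical per-line flags (True = line judged to match its counterpart).
flags = [True, True, False, True, True, True, False, False, True, True]
runs = cms_run_counts(flags)
print(dict(runs))  # {2: 2, 3: 1}
```

Each comparison then contributes one entry per run length (for example "two 2X runs, one 3X run") to a count table.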
Count Tables of CMS Runs
• 2007 Neel and Wells study tabulated “hand” counted
CMS runs for many KMs and KNMs:
914 KM comparisons
Number      CMS run lengths:
observed       2X     3X     4X    …
0             508    612    694    …
1             186    172    135    …
2             109     59     43    …
3              39     29     19    …
4              21     15     16    …
5              10      9      2    …
6               4      9      1    …
7              10      6      3    …
8              14      2      0    …
>8             13      1      1    …

1411 KNM comparisons
Number      CMS run lengths:
observed       2X     3X     4X    …
0             771   1239   1357    …
1             298    124     47    …
2             143     35      4    …
3              84     10      2    …
4              46      2      1    …
5              21      1      0    …
6              13      0      0    …
7              14      0      0    …
8               6      0      0    …
>8             15      0      0    …
Bayesian Match Probabilities from CMS
• Examiners and line matching algorithms can
supplement Neel and Wells data:
• Examiners: Keep a notebook of what you see and
perhaps even photo-documentation [Murdock]
• Algorithm: a hunk of our data set:
1393 KM comparisons
Number      CMS run lengths:
observed       2X     3X     4X    …
0             857    980   1010    …
1             350    323    312    …
2             128     71     64    …
3              39     17      7    …
4              11      2      0    …
5               5      0      0    …
6               2      0      0    …
7               1      0      0    …
8               0      0      0    …
>8              0      0      0    …

100533 KNM comparisons
Number      CMS run lengths:
observed       2X      3X      4X    …
0           47842   71599   85101    …
1           33699   23988   14104    …
2           13666    4386    1247    …
3            4037     500      76    …
4            1004      53       4    …
5             230       6       0    …
6              44       1       1    …
7               9       0       0    …
8               2       0       0    …
>8              0       0       0    …
Took ~1h to do all 101926 pair-wise comparisons between 452 toolmarks from 82 tools
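As a quick consistency check on these figures, the number of unordered pairs among 452 toolmarks can be compared with the KM and KNM totals quoted above. The variable names are illustrative only.

```python
from math import comb

n_toolmarks = 452
n_pairs = comb(n_toolmarks, 2)       # all unordered pairs of toolmarks
n_km, n_knm = 1393, 100533           # comparison totals quoted in the tables

print(n_pairs, n_km + n_knm)         # both are 101926
```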
Bayesian Statistics
• The basic Bayesian philosophy:
Prior Knowledge × Data = Updated Knowledge
(Updated Knowledge: a better understanding of the world)
Prior × Data = Posterior
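Written out, the slogan is Bayes' theorem. The symbols are notation introduced here, not taken from the slides: θ stands for the unknown quantities and x for the observed data.

```latex
p(\theta \mid x)
  \;=\; \frac{p(x \mid \theta)\, p(\theta)}{p(x)}
  \;\propto\;
  \underbrace{p(x \mid \theta)}_{\text{data (likelihood)}}
  \times
  \underbrace{p(\theta)}_{\text{prior}}
```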
Bayesian Match Probabilities from CMS
• Model CMS run length counts in each column with a
multinomial likelihood:
We have this: the counts
We need this: the cell probabilities
• Model the cell probabilities, before we’ve seen any data, with an
“uninformative” Dirichlet prior
• Use Bayes’ theorem to combine:
• “prior beliefs”: CMS run length probabilities
• “data”: CMS run length counts
• And get “updated” (posterior) CMS run length probabilities
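The update above can be sketched for a single column. Several choices here are assumptions, not statements of the authors' method: a symmetric Dirichlet prior with α = 1, the posterior mean as the reported probability, and simple pooling of the hand-counted and algorithm counts (suggested by the deck presenting the algorithm data as a supplement to the Neel and Wells data). The function name is hypothetical.

```python
def dirichlet_posterior_mean(counts, alpha=1.0):
    """Posterior mean of multinomial cell probabilities under a
    symmetric Dirichlet(alpha) prior: (n_k + alpha) / (N + K * alpha)."""
    total = sum(counts) + alpha * len(counts)
    return [(c + alpha) / total for c in counts]


# 2X column, KM comparisons (rows: 0, 1, ..., 8, >8 runs observed).
neel_wells_km_2x = [508, 186, 109, 39, 21, 10, 4, 10, 14, 13]  # 914 comparisons
algorithm_km_2x = [857, 350, 128, 39, 11, 5, 2, 1, 0, 0]       # 1393 comparisons

# Assumption: both sources are pooled before updating.
pooled = [a + b for a, b in zip(neel_wells_km_2x, algorithm_km_2x)]
posterior = dirichlet_posterior_mean(pooled, alpha=1.0)
print([round(p, 4) for p in posterior])
```

With these assumptions the values should land near the updated probabilities tabulated in the deck, though not necessarily digit for digit, since the slides do not state the prior parameters or the estimator used. Note that cells with zero counts still receive a small nonzero probability.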
Bayesian Match Probabilities from CMS
• Updated CMS run length probabilities:
KM comparisons
Number      CMS run lengths:
observed        2X       3X       4X     …
0            0.5921   0.6905   0.7391    …
1            0.2328   0.2150   0.1942    …
2            0.1032   0.0568   0.0468    …
3            0.0342   0.0204   0.0117    …
4            0.0143   0.0078   0.0074    …
5            0.0069   0.0043   0.0013    …
6            0.0030   0.0043   0.0009    …
7            0.0052   0.0030   0.0017    …
8            0.0065   0.0013   0.0004    …
>8           0.0061   0.0009   0.0009    …

KNM comparisons
Number      CMS run lengths:
observed        2X        3X        4X      …
0            0.47687   0.71450   0.84810    …
1            0.33350   0.23653   0.13882    …
2            0.13547   0.04338   0.01228    …
3            0.04043   0.00501   0.00077    …
4            0.01031   0.00055   0.00006    …
5            0.00247   0.00008   0.00001    …
6            0.00057   0.00002   0.00002    …
7            0.00024   0.00001   0.00001    …
8            0.00009   0.00001   0.00001    …
>8           0.00016   0.00001   0.00001    …
• So what can we use these for?
• Lots of stuff, but we put them into a Bayesian network:
• BN model for Match/Non-match probabilities given observed
numbers of CMS runs
Bayesian Networks
• A “scenario” is represented by a joint probability
function
• Contains variables relevant to a situation which represent
uncertain information
• Contains “dependencies” between variables that describe how
they influence each other.
• A graphical way to represent the joint probability
function is with nodes and directed lines
• Called a Bayesian Network [Pearl]
Most important: Lots of user-friendly software “to do the math”
Bayesian Networks
• What does this mean for CMS? [Biasotti, Buckleton, Neel]:
• The number of CMS runs counted for each run length
is affected by whether the comparison is between
“matching toolmarks” or “non-matching toolmarks”.
Network nodes: Match/Non-Match → # of nX CMS runs
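A minimal sketch of what such a two-layer network computes, written out by hand instead of using a Bayes net package. It assumes the run-count nodes are conditionally independent given the Match/Non-Match node, a 50/50 prior on match, and only the 2X to 4X columns (the remaining columns are elided in the deck's tables). The cell probabilities are copied from the updated-probability tables; the observation and the function name `match_posterior` are hypothetical.

```python
# P(number of nX runs = k | class); index k = 0..8, index 9 means ">8".
P_KM = {
    "2X": [0.5921, 0.2328, 0.1032, 0.0342, 0.0143, 0.0069, 0.0030, 0.0052, 0.0065, 0.0061],
    "3X": [0.6905, 0.2150, 0.0568, 0.0204, 0.0078, 0.0043, 0.0043, 0.0030, 0.0013, 0.0009],
    "4X": [0.7391, 0.1942, 0.0468, 0.0117, 0.0074, 0.0013, 0.0009, 0.0017, 0.0004, 0.0009],
}
P_KNM = {
    "2X": [0.47687, 0.33350, 0.13547, 0.04043, 0.01031, 0.00247, 0.00057, 0.00024, 0.00009, 0.00016],
    "3X": [0.71450, 0.23653, 0.04338, 0.00501, 0.00055, 0.00008, 0.00002, 0.00001, 0.00001, 0.00001],
    "4X": [0.84810, 0.13882, 0.01228, 0.00077, 0.00006, 0.00001, 0.00002, 0.00001, 0.00001, 0.00001],
}


def match_posterior(observed, prior_match=0.5):
    """Return (likelihood ratio, P(match | observed run counts))."""
    like_km = like_knm = 1.0
    for run_len, k in observed.items():
        like_km *= P_KM[run_len][k]      # P(count | match)
        like_knm *= P_KNM[run_len][k]    # P(count | non-match)
    lr = like_km / like_knm
    post_odds = lr * prior_match / (1.0 - prior_match)
    return lr, post_odds / (1.0 + post_odds)


# Hypothetical comparison: no 2X runs, no 3X runs, one 4X run.
observed = {"2X": 0, "3X": 0, "4X": 1}
lr, p_match = match_posterior(observed)
print(f"LR = {lr:.2f}, P(match | data) = {p_match:.3f}")
```

Because only three run lengths are used, the result is illustrative and will not reproduce the output of the full network.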
Bayesian Networks
“Prior” network based on historical/available count data
and multinomial-Dirichlet model for run length
probabilities:
GeNIe
Run Algorithm on Known vs. Unknown: runs of 6X, 5X, 6X and 4X found, i.e. 1-4X, 1-5X, 2-6X
Enter the observed run length data for the comparison
into the network and update “match” (same source)
odds:
0-2X, 0-3X, 1-4X, 1-5X, 2-6X, 0-7X, 0-8X, 0-9X, 0-10X, 0->10X
LR = 96/3.8 ≈ 25
The evidence “strongly supports” [Kass-Raftery] the proposition that the striation patterns were made by the same tool
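The arithmetic and the verbal label can be checked with a short sketch. The thresholds follow the Bayes factor categories in Kass and Raftery (1995); the function name is hypothetical, and the two numbers are taken as read off the slide.

```python
def kass_raftery_label(bf):
    """Verbal category for a Bayes factor >= 1 (Kass and Raftery, 1995)."""
    if bf < 1:
        raise ValueError("invert the ratio so that it is >= 1")
    if bf <= 3:
        return "not worth more than a bare mention"
    if bf <= 20:
        return "positive"
    if bf <= 150:
        return "strong"
    return "very strong"


lr = 96 / 3.8
print(round(lr, 1), kass_raftery_label(lr))  # 25.3 strong
```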
Where to Get the Model and Software
Bayes Net software: no cost for noncommercial/demo use
BayesFusion: http://www.bayesfusion.com/
SamIam: http://reasoning.cs.ucla.edu/samiam/
Hugin: http://www.hugin.com/
gR packages: http://people.math.aau.dk/~sorenh/software/gR/
Future Directions
• Test for dependence between CMS runs [Buckleton]
• More data needed, but probably not an issue.
• Make compatible with John Song’s Congruent
Matching Cells (CMC)
• Needed if you only have one or two “really good
matching lines”.
• Uncertainty for Bayesian Networks
• Models, parameters…
References
Petraco:
• https://github.com/npetraco/CMS-Network
Neel:
• Neel M and Wells M. “A Comprehensive Analysis of Striated Toolmark
Examinations. Part 1: Comparing Known Matches to Known Non-Matches”, AFTE
J 39(3):176-198 2007.
Buckleton:
• Wevers G, Neel M and Buckleton J. “A Comprehensive Statistical
Analysis of Striated Tool Mark Examinations Part 2: Comparing Known Matches
and Known Non-Matches using Likelihood Ratios”, AFTE J 43(2):1-9 2011.
• Buckleton J, Nichols R, Triggs C and Wevers G. “An Exploratory Bayesian Model
for Firearm and Tool Mark Interpretation”, AFTE J 37(4):352-359 2005.
Kass-Raftery:
• Kass RE and Raftery A. “Bayes Factors”, J Amer Stat Assoc 90(430):773-795
1995.
R-Core: https://www.r-project.org/
BayesFusion: http://www.bayesfusion.com/
SamIam: http://reasoning.cs.ucla.edu/samiam/
Hugin: http://www.hugin.com/
gR packages: http://people.math.aau.dk/~sorenh/software/gR/
Acknowledgements
Robert Thompson (NIST)
John Song (NIST)
John Murdock (CCC)
Scott Chumbley (Iowa State)
Max Morris (Iowa State)
Nick Matia
Steve Deady
Alan Zheng (NIST)
Ryan Lillien (Cadre)
Collaborations,
Reprints/Preprints:
[email protected]
http://jjcweb.jjay.cuny.edu/npetraco/
• Dr. James Hamby
• Ms. Diana Paredes
• Mr. Nick Natalie
• Mr. Nicholas Petraco
• Mr. Daniel Azevedo
• Mr. Mike Neel
• Ms. Stephanie Pollut
• Ms. Tatiana Batson
• Ms. Alison Hartwell, Esq.
• Dr. Jacqueline Speir
• Dr. Martin Baiker
• Mr. Robert McLean
• Dr. Peter Shenkin
• Ms. Julie Cohen
• Dr. Brooke Kammrath
• Mr. Peter Tytell
• Dr. Peter Diaczuk
• Mr. Chris Lucky
• Dr. Peter Zoon
• Mr. Antonio Del Valle
• Off. Patrick McLaughlin
• Ms. Carol Gambino
• Dr. Mecki Prinz
Research Team:
• Dr. Linton Mohammed