2016 Research Papers Competition Presented by:

Mixed Membership Martial Arts: Data-Driven Analysis of Winning Martial Arts Styles

Other Sports
Paper ID: 1575

Sean R. Hackett and John D. Storey
Lewis-Sigler Institute for Integrative Genomics
Princeton University, Princeton, NJ 08544, USA

Abstract
A major analytics challenge in Mixed Martial Arts (MMA) is understanding the differences between fighters that are essential for both establishing matchups and facilitating fan understanding. Here, we model ~18,000 fighters as mixtures of 10 data-defined prototypical martial arts styles, each with characteristic ways of winning. By balancing fighter-level data with broader trends in MMA, fighter behavior can be predicted even for inexperienced fighters. Beyond serving as an informative summary of a fighter, style is also a major determinant of success in MMA. This is reflected by the fact that champions of the sport conform to a narrow subset of successful styles.

1. Introduction
Early events in Mixed Martial Arts (MMA) were touted as a chance to determine which styles of pure martial arts were most effective for defeating an opponent [1, 2]. These events, epitomized by the mid-1990s events of the Ultimate Fighting Championship (UFC) and Vale Tudo, featured matchups of stylistically diverse martial artists representing sports such as Boxing, Judo and Karate [1, 3]. As the sport of MMA has evolved, fighters have become increasingly well-rounded [1, 2]. Modern MMA fighters need to be effective strikers who dictate where the fight occurs, either by taking down their opponents or forcing them to fight standing.

Each MMA fighter's skills are drawn from a mix of multiple pure martial arts that define his/her personalized fighting style. While modern MMA fighters are more stylistically mixed than their pure martial artist forebears, the mixtures of individual fighters differ. Some fighters would be referred to as strikers and others as grapplers who specialize in martial arts such as Brazilian Jiu-Jitsu (BJJ) [4]. These stylistic characterizations are important when establishing match-ups, but to date there is no quantitative procedure for characterizing the style of MMA fighters. Furthermore, because of the absence of such an approach, the impact of style on success in MMA has never been quantitatively investigated.

Every sport is enriched by the behaviors that distinguish its athletes and teams. Whether one is concerned with a pitcher's arsenal of pitches, the plays a football team employs or the locations from which a basketball player likes to shoot, style is an important, albeit often nebulous, concept in sports. To make style more accessible, data-driven modeling approaches will be invaluable. These models have enormous value: they can be used to identify prospects who are similar to superstars,
build a balanced team, and guide visualizations that inform casual fans of each athlete's strengths. To date, approaches that have aimed to define athletes' styles have generally attempted to identify latent factors that could serve as a representation of style [5, 6, 7]. These approaches have been restricted to spatial summaries of athletes (e.g., from where they score points). Consequently, their latent factors reflect common spatial patterns, like three-point shooting or dunking in basketball [6].
While latent variable based approaches have been successfully applied to sports where a large amount of athlete-specific performance data is available, such methods have the tendency to overfit patterns when little data is available or when complex patterns are sought (for example, if style were allowed to vary between seasons or between games). Most sports are not immune to such overfitting; for instance, the capacity for overfitting observed data has been well-demonstrated for the data-rich application of baseball batting averages [8, 9]. We would intuitively expect a batter with 300 hits out of 1000 at-bats to perform better in the future than a batter with 5 hits out of 10 at-bats, yet the second batter would have a higher batting average using the Maximum Likelihood Estimate (MLE) of batting averages. Indeed, it was demonstrated that the MLE was less accurate than the James-Stein estimate of batting averages, which shrinks observed batting averages towards the mean of all players based on the amount of individual-specific data [8]. Bayesian approaches of this sort have the capacity to balance athlete-specific data with overall behavior in order to improve prediction.
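The batting-average example can be made concrete with a short sketch. This is not the James-Stein estimator of [8]; it uses a simpler pseudo-count estimate that shrinks in the same direction, and the league mean of 0.260 and the prior strength of 100 at-bats are assumed values chosen only for illustration.

```python
def mle_average(hits, at_bats):
    """Maximum likelihood estimate: the raw batting average."""
    return hits / at_bats


def shrunken_average(hits, at_bats, league_mean=0.260, prior_at_bats=100):
    """Pull the raw average toward league_mean.

    league_mean and prior_at_bats are illustrative assumptions; the prior acts
    like prior_at_bats extra at-bats hit at the league-average rate.
    """
    return (hits + league_mean * prior_at_bats) / (at_bats + prior_at_bats)


veteran = (300, 1000)  # 300 hits out of 1000 at-bats
rookie = (5, 10)       # 5 hits out of 10 at-bats

for name, (hits, at_bats) in (("veteran", veteran), ("rookie", rookie)):
    print(name,
          round(mle_average(hits, at_bats), 3),
          round(shrunken_average(hits, at_bats), 3))
```

Under these assumed settings the batter with 10 at-bats ranks higher by the MLE but lower after shrinkage, because ten observations move the estimate only slightly away from the league mean.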
Characterizing the styles of MMA fighters is an unprecedented challenge because of both the amount and type of fighter-level data available. Most professional athletes compete in upwards of 100 games in a single season. In contrast, a successful MMA fighter may retire after only 20 bouts. With such limited data, a fighter's to-date behavior alone is insufficient to predict his/her future performance. Additionally, while the manner by which a fighter finishes a fight (e.g., finishes such as punches, kicks or chokes) is the purest depiction of his/her style [3], these metrics are categorical. Consequently, methods for defining spatially-oriented styles from other sports are not directly applicable. We address the above two challenges by applying latent Dirichlet allocation (LDA) [10, 11], a Bayesian mixed membership model that naturally accommodates categorical data and shares power across fighters, thereby balancing fighter-specific behavior with broader trends. The goal of this model is to estimate two mixtures that define a plausible underlying structure of fighters' styles (Figure 1): (i) Distinct martial arts (prototypes) have characteristic finishes and (ii) Fighters are a mixture of prototypes (with individualized weights summing to one).
[Figure 1: Schematic of the model. Example prototypes (1. Striker, 2. Balanced, 3. Grappler) each win using characteristic finishes (e.g., punches, elbows and kicks; armbar, choke and toehold; or a mix of both). The finishes in fighters' records are the signal used to identify covarying finishes (prototypes) and fighter styles.]
Using data from >250,000 fights, we applied the mixed membership martial arts (3MA) model to identify 10 prototypical martial arts styles, each composed of a set of characteristic finishes. The style of each of ~18,000 fighters was defined based on his/her individualized combination of these prototypes. These stylistic makeups greatly improved the prediction of held out results. Beyond their value as an implicit readout of a fighter's abilities, fighter styles are a major determinant of success in the cage. In particular, winning fights in the upper echelons of MMA requires dynamic striking and the ability to “go the distance” by winning fights through judges' decision. Defending UFC champions, the most successful athletes in the sport, generally conform to these styles, providing insights into how fighting strategy affects success in MMA.

2. Data Set
Using the Sherdog Fight Finder [12], we obtained raw summaries of 257,582 previous amateur and professional bouts among 151,746 fighters through October 29, 2016. Each bout was summarized based on the pair of fighters, the result of the bout (win, loss, draw, no contest) and the way in which the bout was finished (e.g. by punches or unanimous decision). Rare finishes were manually combined with the most similar high-frequency finish to generate 50 distinct high-frequency categories (shown in Figure 2). Only fights that ended in a win for one fighter were considered, leaving 230,126 bouts for this analysis.
3. Identifying patterns in finish usage
At their most basic level, martial arts are sets of techniques that are used to win fights. Each fight is won by using a single technique, but if we look at a fighter's career, patterns may emerge where sets of finishes show up repeatedly across a fighter's victories. If these patterns are due to a martial art that is shared among fighters, we would expect sets of finishes to co-occur not only in a single fighter but also in a recurring manner across many individuals. For example, Muay Thai fighters use both elbows and knees, so we expect that fighters who win with elbows tend to win by knees. Thus, elbows and knees should co-occur.
To determine whether any subsets of the 50 finishes are frequently used in conjunction, we quantified how often each pair of finishes co-occurs (i.e. they are used to win fights by the same fighter), aggregating across all fighters. To answer this question, we compared observed co-occurrences of each finish pair (number of pairs of a finish x and y in a fighter's record, summed over fighters) with the expectation under independence. Then, using permutation testing, we determined how often such large deviations were expected. Out of 1,225 pairs of finishes, 289 pairs co-occur more frequently than expected, and 292 pairs co-occur less frequently at a false discovery rate of 0.05 [13].
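The co-occurrence statistic and its permutation test can be sketched as follows. The null scheme used here (pooling all finishes and reassigning them at random while each fighter keeps the same number of wins) is an assumption, since the exact permutation scheme is not spelled out above, and the toy records are invented for illustration.

```python
import random
from collections import Counter


def cooccurrence(records, x, y):
    """Number of (x, y) pairs within each fighter's record, summed over fighters."""
    total = 0
    for wins in records:
        counts = Counter(wins)
        total += counts[x] * counts[y]
    return total


def permutation_p_value(records, x, y, n_perm=2000, seed=0):
    """One-sided p-value for x and y co-occurring more often than under shuffling.

    Assumed null: finishes are reassigned to fighters at random while each
    fighter keeps the same number of wins.
    """
    rng = random.Random(seed)
    observed = cooccurrence(records, x, y)
    pooled = [finish for wins in records for finish in wins]
    sizes = [len(wins) for wins in records]
    at_least_as_large = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if cooccurrence(shuffled, x, y) >= observed:
            at_least_as_large += 1
    # The add-one correction keeps the estimate away from exactly zero.
    return (at_least_as_large + 1) / (n_perm + 1)


# Toy records: strikers pair elbows with knees, grapplers pair armbars with chokes.
records = ([["elbows", "knees", "elbows", "knees"] for _ in range(20)]
           + [["armbar", "choke", "armbar", "choke"] for _ in range(20)])

print(cooccurrence(records, "elbows", "knees"))
print(permutation_p_value(records, "elbows", "knees"))
```

In the full analysis one p-value of this kind would be computed for each of the 1,225 finish pairs and then passed to a false discovery rate procedure such as the one in [13], which this sketch does not implement.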
Pairs of finishes that significantly co-occur are far from random; rather, they are organized into cliques of generally mutually correlated finishes (Figure 2). For example, kicks/knees form one tightly connected clique and leg submissions, another. From this analysis, we also see that patterns are more strongly influenced by the position from which a submission is applied, rather than by what type of submission is applied, per se. For example, the anaconda choke, applied from a front
4. Determining fighter-specific styles
By aggregating over all fighters, we demonstrated that finishes exist in co-occurring groups. Directly applying this information at the fighter-level is challenging, however. Although we collectively have a large amount of information about fighters' performances, we have a more modest amount of fighter-specific data. No fighter has won a fight with each of the 50 possible finishes. In fact, the vast majority of experienced fighters do not even have 50 wins to date. Because we only have an incomplete readout of a fighter's capabilities, we need to balance what we know about a fighter with what we do not, hedging the uncertainty in fighter-specific data by using broader patterns across fighters. Specifically, we use the behavior of similar fighters to define common finish patterns and we then characterize individual fighters in the context of how frequently they have fought and the classes of finishes they have employed.
In creating a model of fighter styles, our ultimate goal is to model the finish probabilities for each fighter. Statistically, this objective amounts to estimating fighter-specific finish probabilities: Pr(Finish j | Fighter i) for each finish j = 1, 2, …, 50 and each fighter i = 1, 2, …, 17,778 included in our
data set. (We restricted our analysis to fighters with four or more wins to date.) Our data is collected into an I = 17,778 by J = 50 matrix X, where we note that the (i, j) entry is the total number of wins for fighter i of finish type j. We will denote $\omega_{ij}$ = Pr(Finish j | Fighter i), which are the values we wish to estimate, and we collect these values into the I × J matrix $\Omega$.
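As a minimal sketch of how such a count matrix might be assembled, the snippet below tallies wins by finish type and applies the four-or-more-wins restriction. The function name, the (winner, finish) record layout and the toy bouts are hypothetical and are not taken from the Sherdog data.

```python
from collections import Counter, defaultdict


def build_count_matrix(bouts, finish_types, min_wins=4):
    """Return (fighters, X) where X[i][j] counts fighter i's wins by finish j.

    bouts is an iterable of (winner, finish) pairs; fighters with fewer than
    min_wins wins are dropped, mirroring the four-or-more-wins restriction.
    """
    wins = defaultdict(Counter)
    for winner, finish in bouts:
        wins[winner][finish] += 1
    fighters = sorted(f for f, c in wins.items() if sum(c.values()) >= min_wins)
    X = [[wins[f][finish] for finish in finish_types] for f in fighters]
    return fighters, X


# Hypothetical toy data; the fighter labels and bouts are made up.
finish_types = ["punches", "armbar", "unanimous decision"]
bouts = ([("A", "punches")] * 3 + [("A", "unanimous decision")] * 2
         + [("B", "armbar")] * 4 + [("C", "punches")] * 2)

fighters, X = build_count_matrix(bouts, finish_types)
print(fighters)  # ['A', 'B']
print(X)         # [[3, 0, 2], [0, 4, 0]]
```

Fighter C has only two wins and is therefore excluded, so the toy matrix has two rows and three columns.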
However, when used on a per-fighter basis, the MLE is susceptible to overfitting the observed data (particularly for fighters with a smaller number of observations), and the MLE estimates that unobserved finishes have zero probability. It also does not “borrow strength” across the many fighters' observed data.
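For reference, the per-fighter MLE of multinomial finish probabilities has the standard textbook form below. It is written here in terms of the counts $x_{ij}$ and probabilities $\omega_{ij}$ defined above, and the superscript label is our own notation.

```latex
\hat{\omega}_{ij}^{\mathrm{MLE}} = \frac{x_{ij}}{\sum_{j'=1}^{J} x_{ij'}}
```

Any finish with $x_{ij} = 0$ therefore receives an estimated probability of exactly zero, which is the overfitting behavior described above.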
Instead, in line with other latent variable based approaches (such as principal components analysis and factor analysis) [14], we assume a lower dimensional structure to $\Omega$. By doing so, we are assuming that fighters' individualized styles can be attributed to a set of prototypical winning styles which will be smaller than the number of J = 50 finish types. We will denote the number of prototypes by K, where K < J, and we will estimate the value of K from the data. We assume the following decomposition of $\Omega$:
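$$\Omega = \Theta \Phi \tag{1}$$

where $\sum_{k=1}^{K} \theta_{ik} = 1$ for all i and where $\phi_{kj} \in [0, 1]$ for all k and j. The interpretation of this model is that the “prototypes” are captured by $\phi_{kj}$ = Pr(Finish j | Prototype k). These prototypes are mapped to each individual fighter i through the values $\theta_{ik}$ where $\theta_{ik}$ = Pr(Prototype k | Fighter i), where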
This approach results in grouping of J = 50 finishes into a smaller number of prototypes K, and thus provides increased generalizability when predicting previously unobserved finishes. For example, submissions that target the legs (e.g., kneebar and ankle lock; Figure 2) should be associated with the same prototype. Because of this, a fighter with only kneebar submissions would have some weight in this “leg-submission” prototype. Therefore, we would predict that he/she also has the capacity to apply ankle locks. One limitation of this approach, however, is that if a fighter had never used leg submissions, he/she would have zero weight in the “leg-submission” prototype. As a result, we would predict that there is probability zero for him/her to perform any leg submission.
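The decomposition in model (1) and the zero-weight limitation can be illustrated with a toy example. The two prototypes, three finishes and all numerical values below are invented for illustration; only the matrix product itself follows the model.

```python
def mix_prototypes(theta, phi):
    """Return omega with omega[i][j] = sum over k of theta[i][k] * phi[k][j]."""
    n_prototypes = len(phi)
    n_finishes = len(phi[0])
    return [
        [sum(theta_i[k] * phi[k][j] for k in range(n_prototypes))
         for j in range(n_finishes)]
        for theta_i in theta
    ]


finishes = ["punches", "kneebar", "ankle lock"]

# phi[k][j] = Pr(finish j | prototype k); illustrative values, each row sums to one.
phi = [
    [1.0, 0.0, 0.0],  # hypothetical "striking" prototype
    [0.1, 0.5, 0.4],  # hypothetical "leg-submission" prototype
]

# theta[i][k] = Pr(prototype k | fighter i); each row sums to one.
theta = [
    [0.2, 0.8],  # fighter with weight in the leg-submission prototype
    [1.0, 0.0],  # fighter with zero weight in the leg-submission prototype
]

omega = mix_prototypes(theta, phi)
for row in omega:
    print([round(p, 3) for p in row])
```

The first fighter receives a nonzero ankle-lock probability through the shared prototype even if only kneebars were observed, while the second fighter is assigned probability zero for every leg submission.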
the earlier example of applying shrinkage to batting averages in baseball [8]. Latent Dirichlet allocation (LDA) is a popular Bayesian modeling approach for incorporating such a prior into the categorical matrix factorization shown in model (1), and furthermore, for fitting this model (1) to data through a technique called variational inference [10]. In LDA, Dirichlet priors with parameters $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_K)$ and $\beta = (\beta_1, \beta_2, \ldots, \beta_J)$ are placed on $\Theta$ and $\Phi$, respectively. LDA then utilizes variational inference to estimate the maximum a posteriori probability (MAP) of the posterior distributions of $\Theta$ and $\Phi$, $\Pr(\Theta \mid X)$ and $\Pr(\Phi \mid X)$. The estimates $\hat{\Theta}$ and $\hat{\Phi}$ are set to be these MAP estimates, and we finally form $\hat{\Omega} = \hat{\Theta} \hat{\Phi}$.
The values of $\beta$ do not greatly affect inference on $\Phi$ because $\Phi$ is a relatively small matrix (K × J) that is estimated by pooling data across all fighters [15]. In contrast, choice of $\alpha$ greatly impacts estimation of $\Theta$ [15]. Each component of $\alpha$, denoted by $\alpha_k$, can be interpreted as the number of effective observations (pseudocounts) [16] of a prototype k that supplements each fighter's history. The overall magnitude of $\alpha$, which we denote by $\alpha_\Sigma = \sum_{k=1}^{K} \alpha_k$, reflects the degree to which fighters' prototypes will be determined by their own records (low $\alpha_\Sigma$) versus the average of prototype frequencies across all fighters (high $\alpha_\Sigma$). Treating $\alpha$ as asymmetric (meaning the $\alpha_k$ values across prototypes may differ) is another important consideration because this allows larger and smaller prototypes to exist rather than arbitrarily enforcing that prototypes be equally prevalent.
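The pseudocount interpretation can be sketched numerically. The example makes a strong simplifying assumption that each win has already been assigned to a single prototype, whereas in LDA these assignments are latent and inferred jointly; the three-prototype population frequencies are also invented.

```python
def smoothed_mixture(prototype_counts, alpha):
    """Estimate (count_k + alpha_k) / (total count + alpha_sum) per prototype.

    Simplifying assumption: every win is already assigned to one prototype.
    """
    total = sum(prototype_counts) + sum(alpha)
    return [(c + a) / total for c, a in zip(prototype_counts, alpha)]


def scaled_alpha(proportions, alpha_sum):
    """Asymmetric alpha whose components sum to alpha_sum."""
    return [p * alpha_sum for p in proportions]


# Hypothetical population-level prototype frequencies (three prototypes).
population = [0.5, 0.3, 0.2]
counts = [0, 0, 4]  # a fighter whose four wins all came from the third prototype

for alpha_sum in (1, 16, 100):
    estimate = smoothed_mixture(counts, scaled_alpha(population, alpha_sum))
    print(alpha_sum, [round(v, 3) for v in estimate])
```

With a small overall magnitude the estimate stays close to the fighter's own record, and with a large magnitude it approaches the population frequencies, which is the trade-off described above.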
To approximate optimal values of $\alpha_\Sigma$ and K (since both the distribution of $\alpha$ and the values of $\beta$ can be adaptively estimated), possible combinations of $\alpha_\Sigma$ and K were compared to determine which set generated an estimate of $\Omega$ that best predicted held out finishes. Quantitatively, for each pair of $\alpha_\Sigma$ and K values, we used 20-fold cross validation to estimate $\Omega$ for each of 20 subsets of the full dataset using the software MALLET [17]. We then calculated the log-likelihood of observations that were held out from each data set:

$$\log L = \sum_{s=1}^{20} \sum_{i \in \mathcal{S}_s} \sum_{j \in \mathcal{S}_s} x_{ij} \log \omega_{ij}^{\mathcal{S}_s} \tag{2}$$

Parameter sets with high K generally resulted in held out log(L) similar in magnitude to those with moderate K, but high K models were less interpretable. Accordingly, we adopted a parsimonious approach to choosing a set of parameter values (small K, small $\alpha_\Sigma$) among those with good performance (at least 95% of the difference in held out log(L) between the best-performing parameter set and a multinomial likelihood model containing no fighter-specific information). This parsimonious approach resulted in setting K = 10 prototypes (two pairs of prototypes were combined because they contained the same majority finish) with $\alpha_\Sigma$ = 16.
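The held-out log-likelihood in (2) can be sketched for a single fold as below. The function name and the toy probabilities are hypothetical, and the sketch does not reproduce the MALLET fitting step; it only scores a given set of finish probabilities against held-out counts.

```python
import math


def held_out_log_likelihood(held_out_counts, omega):
    """Sum of x_ij * log(omega_ij) over held-out counts.

    Cells with a zero count contribute nothing. A held-out finish that was
    given zero probability drives the total to negative infinity.
    """
    total = 0.0
    for x_row, w_row in zip(held_out_counts, omega):
        for x, w in zip(x_row, w_row):
            if x == 0:
                continue
            if w == 0:
                return -math.inf
            total += x * math.log(w)
    return total


# One fighter, three finish types; the held-out win used the second finish,
# which the fighter had never used in the (hypothetical) training data.
held_out = [[0, 1, 0]]
mle_omega = [[1.0, 0.0, 0.0]]        # per-fighter MLE from training wins only
smoothed_omega = [[0.8, 0.15, 0.05]]  # illustrative estimate that borrows strength

print(held_out_log_likelihood(held_out, mle_omega))
print(held_out_log_likelihood(held_out, smoothed_omega))
```

The per-fighter MLE is penalized without bound for a single unseen finish, whereas the smoothed estimate receives a finite score, which is why held-out likelihood favors models that share information across fighters.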
4.2. Performance and generality of 3MA: wins vs. losses and amateurs vs. the elite
To evaluate the performance of 3MA and assess its broader applicability, we applied the above approach both to elite subsets of fighters and to datasets constructed from fighters' losses rather than their wins (Table 1).
Dataset          Type   Inclusion Criteria   N fighters   N bouts   K    $\alpha_\Sigma$
All Data
  all fighters   wins   4+ wins              17,778       142,775   10   16
Subsets of Data
Table 1: 3MA was applied to six datasets constructed based on the fighters included and whether wins or losses were investigated. We considered “top promotion” to be UFC, Bellator, World Series of Fighting, PRIDE, DREAM and Strikeforce.
Our estimates of K and $\alpha_\Sigma$ are similar regardless of whether all fighters or only elite fighters are investigated, suggesting that regardless of the tier of competition, similar move sets are employed and accordingly amateur fighters may be important to better understand the elite. To quantitatively evaluate this assertion, for each of the six datasets described in Table 1, prediction accuracy was calculated for both inexperienced and experienced fighters (Figure 3).
In all cases, 3MA greatly outperforms a multinomial, one-prototype model (where finish probabilities for each fighter are proportional to finish frequencies across all fighters). In general, it is easier to predict how a fighter wins than how he/she loses (losing is not something over which a fighter has control). Additionally, the finishes of experienced fighters (those with >20 wins/losses) can be more accurately predicted when amateur fighters are included rather than using a smaller dataset composed exclusively of elite fighters. This suggests that patterns of MMA prototypes are nearly universal; accordingly, information from amateurs can improve the accuracy of defining the styles of champions. Because the complete dataset results in the best prediction of finishes regardless of fighter experience, subsequent analyses will utilize all ~18,000 fighters.
strikes, submissions, and decisions. Because a fighter's weaknesses can be faithfully read off from his/her record while winning prototypes are constructed from nuanced combinations of finishes, we focus our remaining attention on winning styles derived from analyzing all fighters. These styles are reflected in the MALLET software estimates of both the fighter-specific prototypes ($\hat{\Theta}$) and prototype-specific finishes ($\hat{\Phi}$).
4.3. Prototypes as mixtures of finishes
By studying the estimated values of $\phi_{kj}$ = Pr(Finish j | Prototype k), we can characterize which finish types make up each prototype. We can also calculate the marginal probability Pr(Prototype k) = $\sum_{i=1}^{I} \theta_{ik} / I$ as a summary of each prototype's overall relevance in the data. Summaries of
• Prototypes P2 and P4 contain wins by judges' decision. P2 primarily reflects unanimous decision, cases in which a fighter dominated his/her opponent for the length of the fight and all judges agreed that he/she won. P4 has some weight in unanimous decision but also strongly represents split decisions: fights in which judges disagreed about who won.
• BJJ and submission grappling are represented by five generally small probability prototypes. P5 primarily contains the armbar and submissions that target the legs. P7 contains submissions from guard. P9 has some contribution from the rear naked choke but is primarily composed of submissions applied from side-control and front headlock. P8 and P3 each largely contain a single submission: the guillotine choke and rear naked choke, respectively. These submissions, while traditionally a part of BJJ, are known and practiced by most MMA fighters.
• The final and smallest prototype, P10, is a hodgepodge of unusual stoppages, generic terms and easily defended submissions that were relatively common in circa 2000 MMA but have decreased in frequency by over 10-fold since. Thus, P10 is a prototype which dates a fighter to the formative years of MMA.
This analysis does not include wrestling, judo and other martial arts that are specialized in achieving takedowns. Because these martial arts are not directly tied to MMA finishes, wins by their practitioners could manifest in many ways, such as through decision (P2) or submissions from strong control positions (P3, P8, P9).
4.4. Fighters as mixtures of prototypes
Prototypes exhibited a broad range of average frequencies, ranging from 5% to 22%. Individual fighters vary greatly about these averages, reflecting either their specialization or neglect of a given prototype and its associated finishes (Figure 5). Because all pairs of prototype loadings are negatively correlated, the mixtures of prototypes used by fighters are relatively independent; the weight in any one prototype does not necessarily entail higher weight in another prototype. While this trend would be partially expected due to tradeoffs between mixture components, the absence of strong associations between prototypes (as could be dealt with using correlated topic models [18]) suggests that the structure of finishes is appropriately discretized using LDA.

5. Not all prototypes are created equal: winning styles in MMA
MMA fighters employ a variety of approaches to defeat their opponents, but it is unclear whether all styles are equally viable or if some have found disproportionate success in the cage. The success of a style may also be contextual. MMA fans frequently talk about an opponent being a “good matchup” for a fighter, reflecting that a fighter's odds of winning might be influenced by both his/her own style as well as an opponent's style. By creating a quantitative representation of style, we are able to systematically assess whether some styles are superior to others and whether this success depends upon an opponent's style as well.

5.1. The role of matchups: prototypes form a strict dominance hierarchy
If some prototypes and their associated finishes are more important for winning MMA fights than others, we would expect that fighters who favor these finishes would win more frequently. The relative strength of a prototype may also depend on an opponent's style. In such a case, the relative strength of a prototype may be contextual, resulting in intransitive properties like those seen in rock-paper-scissors.
To investigate how the prototypes perform against each other, we can consider who tends to win when all pairs of prototypes are matched against each other. Since every fighter is represented by a prototype mixture rather than by a pure prototype, we cannot directly count who wins during matchups. Instead, we consider the set of bouts $\mathcal{B} = \{(b_1, b_2)\}$ where $b_1$ is the winning fighter and $b_2$ is the losing fighter (so $b_1, b_2 \in \{1, 2, \ldots, I\}$), and characterize their prototype mixtures for each pair of prototypes m (in winning fighters) and n (in losing fighters) where $m, n \in \{1, 2, \ldots, K\}$ and $m \neq n$. To the extent that $\theta_{b_1 m}$ and $\theta_{b_2 n}$ are large, prototype m has de facto defeated prototype n. More
To determine whether these dominance scores ($d_{mn}$) were significant, the observed value of each score was compared to 10,000 null values where winner and loser labels were permuted, and significant dominator-dominated prototype pairs were found at a false discovery rate of 0.05 [13].
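The label-permutation test can be sketched as follows. The score used here is a hypothetical stand-in, not the paper's $d_{mn}$: it sums, over bouts, the product of the winner's weight in prototype m and the loser's weight in prototype n, minus the same product with m and n exchanged. The toy mixtures and bouts are invented, and the false discovery rate step is omitted.

```python
import random


def dominance_score(bouts, theta, m, n):
    """Hypothetical score: positive when m tends to sit with winners facing n."""
    return sum(
        theta[w][m] * theta[l][n] - theta[w][n] * theta[l][m]
        for w, l in bouts
    )


def permutation_p_value(bouts, theta, m, n, n_perm=10_000, seed=0):
    """One-sided p-value from randomly swapping winner and loser labels per bout."""
    rng = random.Random(seed)
    observed = dominance_score(bouts, theta, m, n)
    hits = 0
    for _ in range(n_perm):
        shuffled = [(w, l) if rng.random() < 0.5 else (l, w) for w, l in bouts]
        # Small tolerance guards against floating-point ties with the observed score.
        if dominance_score(shuffled, theta, m, n) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy mixtures over two prototypes: fighters 0-1 favor prototype 0, fighters 2-3 prototype 1.
theta = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]
# (winner, loser) pairs in which the prototype-0 fighters usually win.
bouts = [(0, 2), (0, 3), (1, 2), (1, 3), (0, 2),
         (1, 3), (0, 3), (1, 2), (2, 1), (0, 2)]

print(round(dominance_score(bouts, theta, 0, 1), 3))
print(permutation_p_value(bouts, theta, 0, 1))
```

Because the stand-in score is antisymmetric in m and n, a significantly positive value for (m, n) corresponds to a significantly negative value for (n, m), which is one simple way to express a dominator-dominated pair.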
Visualizing the dominances between all pairs of prototypes (Figure 6) suggests that intransitivities are a relatively minor phenomenon. Instead, prototypes occupy five tiers of dominance, wherein prototypes assigned to higher tiers dominate all prototypes in tiers beneath them. The top, most successful tier of prototypes contains only P2 (unanimous decision), while the least successful tier includes P5 (armbar and leg submissions), P8 (guillotine choke) and P10 (circa 2000 submissions).
5.2. Effect of prototype mixtures on overall success
Because prototypes are either effective or ineffective without regard to an opponent's strategy, each fighter's performance can be explained based on his/her prototype mixture, setting aside attributes of his/her opponent. To investigate prototype performance from the perspective of individual fighters, we sought to evaluate the degree to which fighter win percentages could be explained by prototype frequencies, i.e., Pr(win | $\hat{\Theta}$). One challenge with this strategy is that, by construction, fighters with extreme values of any prototype component are inherently veterans; they have won enough fights that their observed finishes have overwhelmed the influence of the prior. By contrast, fighters whose prototype proportions are near the prior proportions generally have very few fights. Therefore, their performance is more representative of an “average fighter.” Because prototype values are partially conflated with experience, we need to tease out these two effects in order to determine the direct contribution of fighter styles on performance.
To separate the influence of experience from that of fighter styles ($\hat{\Theta}$), we used generalized additive models (GAMs) with a logistic link function: the win percentage of each fighter i, Pr(win)_i, was modeled based on the number of fights won (binned into intervals which include a similar number of fighters) and the 10 estimated prototype mixture components ($\hat{\theta}_{ik}$, k = 1, …, 10). Logistic regression was used because Pr(win) can be treated as Binomial successes (wins) and failures (losses); accordingly, inference is best performed on the log odds of wins versus losses [19, 20]. The use of generalized additive models provides flexibility, allowing for the identification of nonlinear relationships between each explanatory variable and the log-odds of winning [21]. Fights were binned and mixture components were log-transformed in order to approximate Normal distributed values so that prediction was not unduly influenced by a minority of fighters.
From the fitted model (Figure 7A), we can see that the odds of winning increase monotonically with the number of wins as expected, while the influences of prototype abundances on the odds of winning, though highly significant ($p < 10^{-200}$, ANOVA versus a model excluding prototypes), are more complicated. In Figure 6, we noted that prototypes form tiers of dominance. Here, we see that a fighter's success with a prototype greatly depends on the degree to which the fighter has invested in it. Based on dominance, P2 is the most important prototype for success overall, but its benefits actually peak once a fighter contains 25% of this style and its effectiveness decreases at higher values. Aside from P2, at high values of other prototypes, the ranking of prototype effects on the odds of winning is similar to the tiers of dominance (Figure 6). Prototypes P5, P8 and P10 still negatively impact the odds of winning, while the remaining prototypes are all generally positive. These positive prototypes either have strong effects but impact few fighters (e.g., extreme values of P4, P6 and P9) or relatively weak effects that impact many fighters (e.g., intermediate values of P2 and high values of P1). To provide a sense of the magnitude of these effects, considering style alone, a 50%:50% P6:P9 fighter and a 50%:50% P5:P7 fighter would translate to an expected win percentage of 60% and 43%, respectively.
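Since the model operates on log-odds, it can help to see the two quoted win percentages on that scale. The snippet below only converts the 60% and 43% figures; it does not fit a GAM, and the variable names are our own.

```python
import math


def log_odds(p):
    """Log-odds (logit) of a win probability p."""
    return math.log(p / (1 - p))


def inverse_log_odds(z):
    """Map a log-odds value back to a probability."""
    return 1 / (1 + math.exp(-z))


p_p6_p9 = 0.60  # expected win percentage quoted for a 50%:50% P6:P9 fighter
p_p5_p7 = 0.43  # expected win percentage quoted for a 50%:50% P5:P7 fighter

gap = log_odds(p_p6_p9) - log_odds(p_p5_p7)
print(round(log_odds(p_p6_p9), 3), round(log_odds(p_p5_p7), 3), round(gap, 3))
print(round(math.exp(gap), 2))  # odds ratio between the two style mixtures
```

Exponentiating the difference in log-odds gives the odds ratio between the two mixtures, which is the scale on which additive effects in a logistic model combine.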
Experience and prototypes together strongly predict fighters' true win frequencies, but the prediction is still quite remarkable with the role of experience removed (generating a model with experience and then correcting for this influence in the fit) (Figure 7B).
[Figure 7 panel titles: “Based on Experience and Prototypes”; “Based on Prototypes only”]
Because prototype specialization is an important predictor of how frequently each fighter wins, an interesting question is whether the most successful MMA fighters, UFC champions who have successfully defended their title at least once, utilize these effective prototypes. Looking at the predicted win frequencies of fighters (corrected for experience), defending UFC champions favor effective prototype mixtures (Figure 8A). Visualizing the most successful of these champions (those with the most title defenses), strong dynamic striking (represented by P6 and to a lesser extent P1) and the ability to “go the distance” by winning through decision (P2) are the major attributes that distinguish the best of the best (Figure 8B). These trends suggest that being a great fighter is not sufficient to become a champion; fighters must also embrace a style that potentiates their success.
5.3. Shedding light on success in MMA
Applying unsupervised approaches, we defined style in MMA and found that different elements of MMA fighters' games have realized varying success in the cage. Lessons emerge from this analysis.

Successful strategies: setting the pace and employing dynamic striking
• Being able to win by decision is a crucial component of every fighter's game. Winning by
• Winning by using kicks, elbows and knees is a greater indicator of success than victory by using punches, even though punches are the most frequent way of finishing a fight. This Muay Thai style could be advantageous for several reasons. Kicks, elbows and knees are important for controlling distance; they are generally acknowledged as being more powerful than punches, and they may also be an indicator of a top-level striker who would be dangerous with punches.
• The most successful submissions are the kimura, rear naked choke, and other chokes applied with the arms (excluding the guillotine). A common feature of these successful submissions is that they are applied from a position of very strong control (back, mount and side control). Because of this, if an opponent escapes from one of these submissions, the submitting fighter will still be left in a strong position.
• Guillotine chokes are generally applied by pulling guard (or being forced down during a takedown). As such, should the attack fail, an opponent is in a generally advantageous position to pass guard and secure a strong control position.
• Leg locks are also not applied from the traditional dominant grappling positions. Accordingly, when they fail, an opponent often has enough space to either secure a good position or to revert to a standing position.
• The most poisonous style was an ensemble of finishes that were common in MMA around the year 2000, but have since decreased greatly in prevalence. These finishes could be disadvantageous, but more likely, they simply date a fighter. If fighters have won by these finishes, it is likely that they are past the most competitive period of their careers and that this is reflected in their records.
striking before moving to the ground. Then, when on the ground, these fighters apply chokes from strong positions.

While the manner by which a fighter wins is the purest reflection of his/her individualized style, and this summary can be obtained for a massive number of fighters, a fighter's style contributes not only to how he/she wins a bout but also to how he/she performs throughout the fight. Such data are increasingly captured, at least for top competitors, through analysis of video footage by organizations such as FightMetric [22]. These data could improve the accuracy with which style is defined for top fighters as well as capture other elements of a fighter's style, such as wrestling ability, that do not directly lead to characteristic finishes.

6. Broader applications in sports analytics
A common element of latent variable based approaches that have sought to define athletes' styles has been to find trends across athletes that summarize each athlete's performance. One limitation of such methods is that they are directly fit to the data, and thus do not account for stylistic components that have yet to manifest (or have, by chance, only been weakly demonstrated) in an athlete's performance. When applying principles in stylistic inference to MMA, a sport that is an extreme case of little athlete-specific data, it is crucial to share power across fighters using inference grounded in Bayesian modeling. While other sports are likely more insulated from such pathologies by virtue of possessing more per-athlete data, even when a large amount of data is available, directly fitting to per-athlete observed data might not most accurately estimate the true properties of the athlete (as has been demonstrated for baseball batting averages).
By harnessing latent variable based methods, we characterized MMA fighters' styles based on prototypes that naturally group finishes. Our analysis can be applied to any categorical measure of performance, including to other martial arts where fighters score points or win by using an expressive set of possible moves (e.g., BJJ, Tae Kwon Do, Sumo, Judo and Fencing). In each of these sports, the 3MA approach could help to systematize the styles of competitors and reveal factors governing success.
References
[1] R. S. García and D. Malcolm, “Decivilizing, civilizing or informalizing? The international development of Mixed Martial Arts,” International Review for the Sociology of Sport, vol. 45, pp. 39–58, Mar. 2010.
[2] S. H. Bishop, P. La Bounty, and M. Devlin, “Mixed Martial Arts: A Comprehensive Review,” Journal of Sport and Human Performance, vol. 1, no. 1, pp. 28–42, 2013.
[3] G. J. Buse, “No holds barred sport fighting: a 10 year review of mixed martial arts competition,” British Journal of Sports Medicine, vol. 40, pp. 169–172, Feb. 2006.
[4] A. Hirose and K. K.-h. Pih, “Men Who Strike and Men Who Submit: Hegemonic and Marginalized Masculinities in Mixed Martial Arts,” Men and Masculinities, vol. 13, pp. 190–209, Sept. 2009.
[5] K. Goldsberry and E. Weiss, “The Dwight effect: A new ensemble of interior defense analytics for the NBA,” MIT Sloan Sports Analytics Conference, 2013.
[6] A. Miller, L. Bornn, R. Adams, and K. Goldsberry, “Factorized Point Process Intensities: A Spatial Analysis of Professional Basketball,” Proceedings of the 31st International Conference on Machine Learning, vol. 32, 2014.
[7] X. Wei, P. Lucey, S. Morgan, M. Reid, and S. Sridharan, “‘The Thin Edge of the Wedge’: Accurately Predicting Shot Outcomes in Tennis using Style and Context Priors,” MIT Sloan Sports Analytics Conference, 2015.
[8] B. Efron and C. Morris, “Data Analysis Using Stein's Estimator and its Generalizations,” Journal of the American Statistical Association, vol. 70, no. 350, pp. 311–319, 1975.
[9] S. T. Jensen, B. B. McShane, and A. J. Wyner, “Hierarchical Bayesian modeling of hitting performance in baseball,” Bayesian Analysis, vol. 4, pp. 631–652, 2009.
[10] D. M. Blei, A. Y. Ng, and M. I. Jordan, “Latent Dirichlet allocation,” Journal of Machine Learning Research, vol. 3, pp. 993–1022, Mar. 2003.
[11] D. M. Blei, “Build, Compute, Critique, Repeat: Data Analysis with Latent Variable Models,” Annual Review of Statistics and Its Application, vol. 1, no. 1, pp. 203–232, Jan. 2014.
[12] Sherdog, “Sherdog Fight Finder: http://www.sherdog.com/stats/fightfinder.”
[13] J. D. Storey and R. Tibshirani, “Statistical significance for genomewide studies,” Proceedings of the National Academy of Sciences of the United States of America, vol. 100, pp. 9440–9445, Aug. 2003.
[14] W. Hao, M. Song, and J. D. Storey, “Probabilistic models of genetic variation in structured populations applied to global human studies,” Bioinformatics, vol. 32, pp. 713–721, Mar. 2016.
[15] H. M. Wallach, D. M. Mimno, and A. McCallum, “Rethinking LDA: Why Priors Matter,” Advances in Neural Information Processing Systems, vol. 23, pp. 1973–1981, 2009.
[16] G. Heinrich, “Parameter estimation for text analysis,” Tech. rep., University of Leipzig, 2009.
[17] A. K. McCallum, MALLET: A Machine Learning for Language Toolkit, 2002.
[18] D. Blei and J. Lafferty, “A Correlated Topic Model of Science,” The Annals of Applied Statistics, vol. 1, no. 1, pp. 17–35, 2007.
[19] V. S. Y. Lo, J. Bacon-Shone, and K. Busche, “The Application of Ranking Probability Models to Racetrack Betting,” Management Science, vol. 41, no. 6, pp. 1048–1059, June 1995.
[20] S. M. Crowe and J. Middeldorp, “A Comparison of Leg Before Wicket Rates Between Australians and Their Visiting Teams for Test Cricket Series Played in Australia, 1977-94,” The Statistician, vol. 45, no. 2, p. 255, 1996.
[21] T. Hastie and R. Tibshirani, “Generalized Additive Models,” Statistical Science, vol. 1, no. 3, pp. 297–310, 1986.
[22] R. Genauer, “Fighting with Data: Creating Statistics from Scratch in Mixed Martial Arts,” in MIT Sloan Sports Analytics Conference, 2012.