
AUDIT: Practical Accountability of Secret Processes

Jonathan Frankle, Sunoo Park, Daniel Shaar, Shafi Goldwasser, Daniel J. Weitzner

Massachusetts Institute of Technology

Abstract

The US federal court system is exploring ways to improve the accountability of electronic surveillance, an opaque process often involving cases sealed from public view and tech companies subject to gag orders against informing surveilled users. One judge has proposed publicly releasing some metadata about each case on a paper cover sheet as a way to balance the competing goals of (1) secrecy, so the target of an investigation does not discover and sabotage it, and (2) accountability, to assure the public that surveillance powers are not misused or abused.

Inspired by the courts' accountability challenge, we illustrate how accountability and secrecy are simultaneously achievable when modern cryptography is brought to bear. Our system improves configurability while preserving secrecy, offering new tradeoffs potentially more palatable to the risk-averse court system. Judges, law enforcement, and companies publish commitments to surveillance actions, argue in zero-knowledge that their behavior is consistent, and compute aggregate surveillance statistics by multi-party computation (MPC).

We demonstrate that these primitives perform efficiently at the scale of the federal judiciary. To do so, we implement a hierarchical form of MPC that mirrors the hierarchy of the court system. We also develop statements in succinct zero-knowledge (SNARKs) whose specificity can be tuned to calibrate the amount of information released. All told, our proposal not only offers the court system a flexible range of options for enhancing accountability in the face of necessary secrecy, but also yields a general framework for accountability in a broader class of secret information processes.

1 Introduction

We explore the challenge of providing public accountability for secret processes. To do so, we design a system that increases transparency and accountability for one of the leading United States electronic surveillance laws, the Electronic Communications Privacy Act (ECPA) [2], which allows law enforcement agencies to request data about users from tech companies. The core accountability challenge in the operation of ECPA is that many of the official acts of the judges, law enforcement agencies, and companies remain hidden from public view (sealed), often indefinitely. Therefore, the public has limited information on which to base confidence in the system.

[Figure 1: The workflow of electronic surveillance. (1) Surveillance request from the law enforcement agency to the court; (2) court order; (3) data request to the company; (4) contesting of the data request; (5) modified court order; (6) requested data.]

To put this in perspective, in 2016 Google received 27,850 requests from US law enforcement agencies for data implicating 57,392 user accounts [4], and Microsoft received 9,907 requests implicating 24,288 users [7]. These numbers, taken from the companies' own voluntary transparency reports, are some of the only publicly available figures on the scope of law enforcement requests for data from technology companies under ECPA.

Underlying many of these requests is a court order. A court order is an action by a federal judge requiring a company to turn over data related to a target (i.e., a user) who is suspected of committing a crime; it is issued in response to a request from a law enforcement agency. ECPA is one of several electronic surveillance laws, and each follows somewhat different legal procedures; however, they broadly tend to follow the idealized workflow in Figure 1. First, a law enforcement agency presents a surveillance request to a federal judge (arrow 1). The judge can either approve or deny it. Should the judge approve the request, she signs an order authorizing the surveillance (arrow 2). A law enforcement agency then presents this order, describing the data to be turned over, to a company (arrow 3). The company either complies or contests the legal basis for the order with the judge (arrow 4). Should the company's challenge be accepted, the order could be narrowed (arrow 5) or eliminated; if not, the company turns over the requested data (arrow 6).

These court orders are the primary procedural marker that surveillance ever took place. They are often sealed, i.e., temporarily hidden from the public for a period of time after they are issued. In addition, companies are frequently gagged, i.e., banned from discussing the order with the target of the surveillance. These measures are vital for the investigative process: were a target to discover that she were being surveilled, she could change her behavior, endangering the underlying investigation.

According to Judge Stephen Smith, a federal magistrate judge whose role includes adjudicating requests for surveillance, gags and seals come at a cost. Openness of judicial proceedings has long been part of the common-law legal tradition, and court documents are presumed to be public by default. To Judge Smith, a court's public records are "the source of its own legitimacy" [37]. Judge Smith has noted several specific ways that gags and seals undermine the legal mechanisms meant to balance the powers of investigators and those investigated [37].

1. Indefinite sealing. Many sealed orders are ultimately forgotten by the courts which issued them, meaning ostensibly temporary seals become permanent in practice. To determine whether she was surveilled, a member of the public would have to somehow discover the existence of a sealed record, confirm the seal had expired, and request the record. Making matters worse, these records are scattered across innumerable courthouses nationwide.

2. Inadequate incentive and opportunity to appeal. Seals and gags make it impossible for a target to learn she is being surveilled, let alone contest or appeal the decision. Meanwhile, no other party has the incentive to appeal. Companies prefer to reduce compliance and legal costs by cooperating. A law enforcement agency would only consider appealing when a judge denies its request; however, Judge Smith explains that even then, agencies often prefer not to "risk an appeal that could make 'bad law'" by creating precedent that makes surveillance harder in the future. As a result, judges who issue these orders have "literally no appellate guidance."

3. Inability to discern the extent of surveillance. Judge Smith laments that lack of data means "neither Congress nor the public can accurately assess the breadth and depth of current electronic surveillance activity" [38]. Several small efforts shed some light on this process: wiretap reports by the Administrative Office of the US Courts [9] and the aforementioned "transparency reports" by tech companies [7, 4]. These reports, while valuable, clarify only the faintest outlines of surveillance.

The net effect is that electronic surveillance laws are not subject to the usual process of challenge, critique, and modification that keeps the legal system operating within the bounds of constitutional principles. This lack of scrutiny ultimately reduces public trust: we lack answers to many basic questions. Does surveillance abide by legal and administrative rules? Do agencies present authorized requests to companies, and do companies return the minimum amount of data to comply? To a public concerned about the extent of surveillance, credible assurances would increase trust. To foreign governments that regulate cross-border dataflows, such assurances could determine whether companies have to drastically alter data management when operating abroad. Yet today, no infrastructure for making such assurances exists.

To remedy these concerns, Judge Smith proposes that each order be accompanied by a publicly available cover sheet containing general metadata about an order (e.g., kind of data searched, crimes suspected, length of the seal, reasons for sealing) [38]. The cover sheet would serve as a visible marker of sealed cases; when a seal expires, the public can hold the court accountable by requesting the sealed document. Moreover, the cover sheet metadata enables the public to compute aggregate statistics about surveillance, complementing the transparency reports released by the government and companies.

Designing the cover sheet involves balancing two competing instincts: (1) for law enforcement to conduct effective investigations, some information about surveillance must be hidden, and (2) public scrutiny can hold law enforcement accountable and prevent abuses of power. The primary design choice available is the amount of information to release.

Our contribution: As a simple sheet of paper, Judge Smith's proposal is inherently limited in its ability to promote public trust while maintaining secrecy. Inspired by Judge Smith's proposal, we demonstrate the accountability achievable when the power of modern cryptography is brought to bear. Cryptographic commitments can indicate the existence of a surveillance document without revealing its contents. Secure multiparty computation (MPC) can allow judges to compute aggregate statistics about all cases—information currently scattered across voluntary transparency reports—without revealing data about any particular case. Zero-knowledge arguments can demonstrate that a particular surveillance action (e.g., requesting data from a company) follows properly from a previous surveillance action (e.g., a judge's order) without revealing the contents of either item. All of this information is stored on an append-only ledger, giving the courts a way to release information and the public a definitive place to find it. Courts can post additional information to the ledger, from the date that a seal expires to the entirety of a cover sheet. Together, these primitives facilitate a flexible accountability strategy that can provide greater assurance to the public while protecting the secrecy of the investigative process.

To show the practicality of these techniques, we evaluate MPC and zero-knowledge protocols that amply scale to the size of the federal judiciary.1 To meet our efficiency requirements, we design a hierarchical MPC protocol that mirrors the structure of the federal court system. Our implementation supports sophisticated aggregate statistics (e.g., "how many judges ordered data from Google more than ten times?") and scales to hundreds of judges who may not stay online long enough to participate in a synchronized multiround protocol. We also implement succinct zero-knowledge arguments about the consistency of data held in different commitments; the legal system can tune the specificity of these statements in order to calibrate the amount of information released. Our implementations apply and extend the existing libraries WebMPC [16, 29] and Jiff [5] (for MPC) and libsnark [34] (for zero-knowledge). Our design is not coupled to these specific libraries, however; an analogous implementation could be developed based on any suitable MPC and SNARK libraries. Thus, our design can straightforwardly inherit efficiency improvements of future MPC and SNARK libraries.

Finally, we observe that the federal court system's accountability challenge is an instance of a broader class of secret information processes, where some information must be kept secret among participants (e.g., judges, law enforcement agencies, and companies) engaging in a protocol (e.g., surveillance, as in Figure 1), yet the propriety of the participants' interactions is of interest to an auditor (e.g., the public). After presenting our system as tailored to the case study of electronic surveillance, we describe a framework that generalizes our strategy to any accountability problem that can be framed as a secret information process. Concrete examples include clinical trials, public spending, and other surveillance regimes.

In summary, we design a novel system achieving public accountability for secret processes while leveraging off-the-shelf cryptographic primitives and libraries. We call the system "AUDIT", which can be read as an acronym for "Accountability of Unreleased Data for Improved Transparency". The design is adaptable to new legal requirements, new transparency goals, and entirely new applications within the realm of secret information processes.

1 There are approximately 900 federal judges [10].

Roadmap: Section 2 discusses related work. Section 3 introduces our threat model and security goals. Section 4 introduces the system design of our accountability scheme for the court system, and Section 5 presents detailed protocol algorithms. Sections 6 and 7 discuss the implementation and performance of hierarchical MPC and succinct zero-knowledge. Section 8 generalizes our framework to a range of scenarios beyond electronic surveillance, and Section 9 concludes.

2 Related Work

Accountability: The term accountability has many definitions. [21] categorizes technical definitions of accountability according to the timing of interventions, information used to assess actions, and response to violations; [20] further formalizes these ideas. [31] surveys definitions from both computer science and law; [44] surveys definitions specific to distributed systems and the cloud.

In the terminology of these surveys, our focus is on detection ("The system facilitates detection of a violation" [21]) and responsibility ("Did the organization follow the rules?" [31]). Our additional challenge is that we consider protocols that occur in secret. Other accountability definitions consider how "violations [are] tied to punishment" [21, 28]; we defer this question to the legal system and consider it beyond the scope of this work. Unlike [32], which advocates for "prospective" accountability measures like access control, our view of accountability is entirely retrospective.

Implementations of accountability in settings where remote computers handle data (e.g., the cloud [32, 39, 40] and healthcare [30]) typically follow the transparency-centric blueprint of information accountability [43]: remote actors record their actions and make logs available for scrutiny by an auditor (e.g., a user). In our setting (electronic surveillance), we strive to release as little information as possible subject to accountability goals, meaning complete transparency is not a solution.

Cryptography and government surveillance: Kroll, Felten, and Boneh [27] also consider electronic surveillance, but focus on cryptographically ensuring that participants only have access to data when legally authorized. Such access control is orthogonal to our work. Their system includes an audit log that records all surveillance actions; much of their logged data is encrypted with a "secret escrow key". In contrast, motivated by concerns articulated directly by the legal community, we focus exclusively on accountability and develop a nuanced framework for public release of controlled amounts of information to address a general class of accountability problems, of which electronic surveillance is one instance.

Bates et al. [12] consider adding accountability to court-sanctioned wiretaps, in which law enforcement agencies can request phone call content. They encrypt duplicates of all wiretapped data in a fashion only accessible by courts and other auditors, and keep logs thereof such that they can later be analyzed for aggregate statistics or compared with law enforcement records. A key difference between [12] and our system is that our design enables the public to directly verify the propriety of surveillance activities, partially in real time.

Goldwasser and Park [23] focus on a different legal application: secret laws in the context of the Foreign Intelligence Surveillance Act (FISA) [3], where the operations of the court applying the law are secret. Succinct zero-knowledge is used to certify consistency of recorded actions with unknown judicial actions. While our work and [23] are similar in motivation and share some cryptographic tools, Goldwasser and Park address a different application. Moreover, our paper differs in its implementations demonstrating practicality and its consideration of aggregate statistics. Unlike this work, [23] does not model parties in the role of companies.

Other research that suggests applying cryptography to enforce rules governing access-control aspects of surveillance includes [25], which enforces privacy for NSA telephony metadata surveillance; [36], which uses private set intersection for surveillance involving joins over large databases; and [35], which uses the same technique for searching communication graphs.

Efficient MPC and SNARKs: libsnark [34] is the primary existing implementation of SNARKs. (Other libraries are in active development [1, 6].) More numerous implementation efforts have been made for MPC, under a range of assumptions and adversary models, e.g., [16, 29, 5, 11, 42, 19]. The idea of placing most of the workload of MPC on a subset of parties has been explored before (e.g., constant-round protocols by [18, 24]); we build upon this literature by designing a hierarchically structured MPC protocol specifically to match the hierarchy of the existing US court system.

3 Threat Model and Security Goals

Our high-level policy goals are to hold the electronic surveillance process accountable to the public by (1) demonstrating that each participant performs its role properly and stays within the bounds of the law, and (2) ensuring that the public is aware of the general extent of government surveillance. The accountability measures we propose place checks on the behavior of judges, law enforcement agencies, and companies. Such checks are important against oversight as well as malice, as these participants can misbehave in a number of ways. For example, as Judge Smith explains, forgetful judges may lose track of orders whose seals have expired. More maliciously, in 2016 a Brooklyn prosecutor was arrested for "spy[ing] on [a] love interest" and "forg[ing] judges' signatures to keep the eavesdropping scheme running for about a year" [22].

Our goal is to achieve public accountability even in the face of unreliable and untrustworthy participants. Next, we specify our threat model for each type of participant in the system and enumerate the security goals that, if met, will make it possible to maintain accountability under this threat model.

3.1 Threat model

Our threat model considers the three parties presented in Figure 1—judges, law enforcement agencies, and companies—along with the public. Their roles and the assumptions we make about each are described below. We assume all parties are computationally bounded.

Judges: Judges consider requests for surveillance and issue court orders that allow law enforcement agencies to request data from companies. We must consider judges in the context of the courts in which they operate, which include staff members and possibly other judges. We consider courts to be honest-but-curious: they will adhere to the designated protocols, but should not be able to learn internal information about the workings of other courts. Although one might argue that the judges themselves can be trusted with this information, we do not trust their staffs. Hereon, we use the terms "judge" and "court" interchangeably to refer to an entire courthouse.

In addition, when it comes to sealed orders, judges may be forgetful: as Judge Smith observes, judges frequently fail to unseal orders when the seals have expired [38].

Law enforcement agencies: Law enforcement agencies make requests for surveillance to judges in the context of ongoing investigations. If these requests are approved and a judge issues a court order, a law enforcement agency may request data from the relevant companies. We model law enforcement agencies as malicious; e.g., they may forge or alter court orders in order to gain access to unauthorized information (as in the case of the Brooklyn prosecutor [22]).

Companies: Companies possess the data that law enforcement agencies may request if they hold a court order. Companies may optionally contest these orders and, if the order is upheld, must supply the relevant data to the law enforcement agency. We model companies as malicious; e.g., they might wish to contribute to unauthorized surveillance while maintaining the outside appearance that they are not. Specifically, although companies currently release aggregate statistics about their involvement in the surveillance process [4, 7], our system does not rely on their honesty in reporting these numbers. Other malicious behavior might include colluding with law enforcement to release more data than a court order allows, or furnishing data in the absence of a court order.

The public: We model the public as malicious, as the public may include criminals who wish to learn as much as possible about the surveillance process in order to avoid being caught.2

Remark 3.1. Our system requires the parties involved in surveillance to post information to a shared ledger at various points in the surveillance process. Correspondence between logged and real-world events is an aspect of any log-based record-keeping scheme that cannot be enforced using technological means alone. Our system is designed to encourage parties to log honestly or report dishonest logging they observe (see Remark 4.1). Our analysis focuses on the cryptographic guarantees provided by the system, however, rather than a rigorous game-theoretic analysis of incentive-based behavior. Most of this paper therefore assumes that surveillance orders and other logged events are recorded correctly, except where otherwise noted.

3.2 Security Goals

In order to achieve accountability in light of this threat model, our system will need to satisfy three high-level security goals.

Accountability to the public: The system must reveal enough information to the public that members of the public are able to verify that all surveillance is conducted properly according to publicly known rules, and specifically that law enforcement agencies and companies (which we model as malicious) do not deviate from their expected roles in the surveillance process. The public must also have enough information to prompt courts to unseal records at the appropriate times.

Correctness: All of the information that our system computes and reveals must be correct. The aggregate statistics it computes and releases to the public must accurately reflect the state of electronic surveillance. Any assurances that our system makes to the public about the (im)propriety of the electronic surveillance process must be reported accurately.

2 By placing all data on an immutable public ledger and giving the public no role in our system besides that of observer, we effectively reduce the public to a passive adversary.

Confidentiality: The public must not learn information that could undermine the investigative process. None of the other parties (courts, law enforcement agencies, and companies) may learn any information beyond that which they already know in the current ECPA process and that which is released to the public.

For particularly sensitive applications, the confidentiality guarantee should be perfect (information-theoretic): this means confidentiality should hold unconditionally, even against arbitrarily powerful adversaries that may be computationally unbounded.3 A perfect confidentiality guarantee would be of particular importance in contexts where unauthorized breaks of confidentiality could have catastrophic consequences (such as national security). We envision that a truly unconditional confidentiality guarantee could catalyze the consideration of accountability systems in contexts involving very sensitive information, where decision-makers are traditionally risk-averse, such as the court system.

4 System Design

We present the design of our proposed system for accountability in electronic surveillance. Section 4.1 informally introduces four cryptographic primitives and their security guarantees.4 Section 4.2 outlines the configuration of the system—where data is stored and processed. Section 4.3 describes the workflow of the system in relation to the surveillance process summarized in Figure 1. Section 4.4 discusses the packages of design choices available to the court system, exploiting the flexibility of the cryptographic tools to offer a range of options that trade off between secrecy and accountability.

3 This is in contrast to computational confidentiality guarantees, which provide confidentiality only against adversaries that are efficient or computationally bounded. Even with the latter, weaker type of guarantee, it is possible to ensure confidentiality against any adversary with computing power within the realistically foreseeable future; computational guarantees are quite common in practice and widely considered acceptable for many applications. One reason to opt for computational guarantees over information-theoretic ones is that typically information-theoretic guarantees carry some loss in efficiency; however, this benefit may be outweighed in particularly sensitive applications or when confidentiality is desirable for a very long-term future where advances in computing power are not foreseeable.

4 For rigorous formal definitions of these cryptographic primitives, we refer to any standard cryptography textbook (e.g., [26]).

4.1 Cryptographic Tools

Append-only ledgers: An append-only ledger is a log containing an ordered sequence of data, consistently visible to anyone (within a designated system), to which data may be appended over time but whose contents may not be edited or deleted. The append-only nature of the ledger is key for the maintenance of a globally consistent and tamper-proof data record over time.

In our system, the ledger records credibly time-stamped information about surveillance events. Typically, data stored on the ledger will cryptographically hide some sensitive information about a surveillance event while revealing select other information about it for the sake of accountability. Placing information on the ledger is one means by which we reveal information to the public, facilitating the security goal of accountability from Section 3.
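To make the data structure concrete, the following is a minimal sketch of an append-only ledger as a hash-chained, timestamped log. It is an illustrative Python structure of our own (class and field names are assumptions), not the ledger implementation used by the system.

```python
import hashlib
import json
import time

class AppendOnlyLedger:
    """Entries may be appended and read, but never edited or deleted."""

    def __init__(self):
        self._entries = []

    def append(self, author, payload):
        # Chain each entry to its predecessor so any tampering is detectable.
        prev = self._entries[-1]["digest"] if self._entries else "genesis"
        entry = {"time": time.time(), "author": author,
                 "payload": payload, "prev": prev}
        entry["digest"] = hashlib.sha256(
            json.dumps(entry, sort_keys=True).encode()).hexdigest()
        self._entries.append(entry)
        return entry["digest"]

    def read(self):
        # Publicly readable, globally consistent view of all entries.
        return list(self._entries)

ledger = AppendOnlyLedger()
ledger.append("judge-17", "commitment to court order ...")
```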

Cryptographic commitments: A cryptographic commitment c is a string generated from some input data D, which has the properties of hiding and binding; i.e., c reveals no information about the value of D, and yet D can be revealed or "opened" (by the person who created the commitment) in such a way that any observer can be sure that D is the data with respect to which the commitment was made. We refer to D as the content of c.

In our system, commitments indicate that a piece of information (e.g., a court order) exists and that its content can credibly be opened at a later time. Posting commitments to the ledger also establishes the existence of a piece of information at a given point in time. Returning to the security goals from Section 3, commitments make it possible to reveal a limited amount of information early on (achieving a degree of accountability) without compromising investigative secrecy (achieving confidentiality). Later, when confidentiality is no longer necessary and information can be revealed (i.e., a seal on an order expires), the commitment can be opened by its creator to achieve full accountability.

Commitments can be perfectly (information-theoretically) hiding, achieving the perfect confidentiality goal of Section 3.2. A well-known commitment scheme that is perfectly hiding is the Pedersen commitment.5
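For concreteness, here is a minimal sketch of a Pedersen commitment in Python. The toy group (p = 23, q = 11, generators g = 4, h = 9) is our own choice purely for illustration; a real deployment would use a large prime-order group, and this sketch is not the paper's implementation.

```python
import secrets

# Toy parameters: g and h generate the order-q subgroup of Z_p*, and no one
# should know log_g(h).  Illustration only; real parameters are far larger.
p, q = 23, 11
g, h = 4, 9

def commit(m, omega=None):
    """Commit to message m; returns (commitment, randomness)."""
    omega = secrets.randbelow(q) if omega is None else omega
    return (pow(g, m % q, p) * pow(h, omega, p)) % p, omega

def open_commitment(c, m, omega):
    """Verify that (m, omega) opens commitment c."""
    return c == (pow(g, m % q, p) * pow(h, omega, p)) % p

c, omega = commit(7)
assert open_commitment(c, 7, omega)      # the creator can open c to m = 7
assert not open_commitment(c, 8, omega)  # an opening to a different m fails
```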

Zero-knowledge: A zero-knowledge argument6 allows a prover P to convince a verifier V of a fact without revealing any additional information about the fact in the process of doing so. P can provide to V a tuple (R, x, π) consisting of a binary relation R, an input x, and a proof π, such that the verifier is convinced that ∃w s.t. (x, w) ∈ R, yet cannot infer anything about the witness w. Three properties are required of zero-knowledge arguments: completeness, that any true statement can be proven by the honest algorithm P such that V accepts the proof; soundness, that no purported proof of a false statement (produced by any algorithm P*) should be accepted by the honest verifier V; and zero-knowledge, that the proof π reveals no information beyond what can be inferred just from the desired statement that (x, w) ∈ R.

5 While the Pedersen commitment is not succinct, we note that by combining succinct commitments with perfectly hiding commitments (as also suggested by [23]), it is possible to obtain a commitment that is both succinct and perfectly hiding.

6 Zero-knowledge proof is a more commonly used term than zero-knowledge argument. The two terms denote very similar concepts; the difference lies only in the nature of the soundness guarantee (i.e., that false statements cannot be convincingly attested to), which is computational for arguments and statistical for proofs.

In our system, zero-knowledge makes it possible to reveal how secret information relates to a system of rules, or to other pieces of secret information, without revealing any further information. Concretely, our implementation (detailed in Section 7) allows law enforcement to attest (1) knowledge of the content of a commitment c (e.g., to an email address in a request for data made by a law enforcement agency), demonstrating the ability to later open c, and (2) that the content of a commitment c is equal to the content of a prior commitment c′ (e.g., to an email address in a court order issued by a judge). In case even (2) reveals too much information, our implementation supports not specifying c′ exactly, and instead attesting that c′ lies in a given set S (e.g., S could include all judges' surveillance authorizations from the last month).
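To illustrate, the sketch below writes statement (2) as an explicit relation R over the toy Pedersen parameters from the commitment sketch above: the public input is the pair of commitments, and the witness is the shared content plus the two openings. In the actual system such a relation would be compiled into a SNARK circuit rather than evaluated on the witness in the clear; the function names here are our own.

```python
# Toy Pedersen parameters (as in the commitment sketch above); illustration only.
p, q, g, h = 23, 11, 4, 9

def pedersen(m, omega):
    return (pow(g, m % q, p) * pow(h, omega % q, p)) % p

def equal_contents_relation(public, witness):
    """R((c, c_prime), (m, omega, omega_prime)): both commitments open to the same m."""
    c, c_prime = public
    m, omega, omega_prime = witness
    return pedersen(m, omega) == c and pedersen(m, omega_prime) == c_prime
```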

In the terms of our security goals from Section 3, zero-knowledge arguments can demonstrate to the public that commitments can be opened and that proper relationships between committed information are preserved (accountability), without revealing any further information about the surveillance process (confidentiality). If these arguments fail, the public can detect when a participant has deviated from the process (accountability).

The SNARK construction [15] that we suggest for use in our system achieves perfect (information-theoretic) confidentiality,7 a goal stated in Section 3.2.

Secure multiparty computation (MPC): MPC allows a set of n parties p1, …, pn, each in possession of private data x1, …, xn, to jointly compute the output of a function y = f(x1, …, xn) on their private inputs; y is computed via an interactive protocol executed by the parties.

Secure MPC makes two guarantees: correctness and secrecy. Correctness means that the output y is equal to f(x1, …, xn). Secrecy means that any adversary that corrupts some subset S ⊂ {p1, …, pn} of the parties learns nothing about the honest parties' inputs {xi : pi ∉ S} beyond what can already be inferred given the adversarial inputs {xi : pi ∈ S} and the output y. Secrecy is formalized by stipulating that a simulator that is given only ({xi : pi ∈ S}, y) as input must be able to produce a "simulated" protocol transcript that is indistinguishable from the actual protocol execution run with all the real inputs (x1, …, xn).

7 In fact, [15] states their secrecy guarantee in a computational (not information-theoretic) form, but their unmodified construction does achieve perfect secrecy, and the proofs of [15] suffice unchanged to prove the stronger definition [41]. That perfect zero-knowledge can be achieved is also remarked in the appendix of [14].

[Figure 2: System configuration. Participants (rectangles: judge, law enforcement, company) read and write to a public ledger (cloud) and local storage (ovals: court orders, surveillance requests, user data). The public (diamond) reads from the ledger.]

In our system, MPC enables computation of aggregate statistics about the extent of surveillance across the entire court system through a computation among individual judges. MPC eliminates the need to pool the sensitive data of individual judges in the clear, or to defer to companies to compute and release this information piecemeal. In the terms of our security goals, MPC reveals information to the public (accountability) from a source we trust to follow the protocol honestly (the courts), without revealing the internal workings of courts to one another (confidentiality). It also eliminates the need to rely on potentially malicious companies to reveal this information themselves (correctness).

Secret sharing: Secret sharing facilitates our hierarchical MPC protocol. A secret sharing of some input data D consists of a set of strings (D1, …, DN) called shares, satisfying two properties: (1) any subset of N−1 shares reveals no information about D, and (2) given all the N shares, D can easily be reconstructed.8
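As a concrete example, the sketch below implements the N-out-of-N additive scheme just described over integers modulo a prime. The modulus and the choice of twelve shares are our own illustrative assumptions.

```python
import secrets

Q = 2**61 - 1  # a prime modulus comfortably larger than any count being shared

def share(D, n):
    """Split D into n additive shares; any n-1 of them reveal nothing about D."""
    shares = [secrets.randbelow(Q) for _ in range(n - 1)]
    shares.append((D - sum(shares)) % Q)
    return shares

def reconstruct(shares):
    """Recombine all n shares to recover D."""
    return sum(shares) % Q

shares = share(42, 12)            # e.g., one share per circuit court of appeals
assert reconstruct(shares) == 42
```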

Summary: In summary, these cryptographic tools support three high-level properties that we utilize to achieve our security goals:

1. Trusted records of events. The append-only ledger and cryptographic commitments create a trustworthy record of surveillance events without revealing sensitive information to the public.

2. Demonstration of compliance. Zero-knowledge arguments allow parties to provably assure the public that relevant rules have been followed, without revealing any secret information.

8 For simplicity, we have described so-called "N-out-of-N" secret sharing. More generally, secret sharing can guarantee that any subset of k ≤ N shares enables reconstruction, while any subset of at most k−1 shares reveals nothing about D.

3. Transparency without handling secrets. MPC enables the court system to accurately compute and release aggregate statistics about surveillance events without ever sharing the sensitive information of individual parties.

4.2 System Configuration

Our system is centered around a publicly visible append-only ledger where the various entities involved in the electronic surveillance process can post information. As depicted in Figure 2, every judge, law enforcement agency, and company contributes data to this ledger. Judges post cryptographic commitments to all orders issued. Law enforcement agencies post commitments to their activities (warrant requests to judges and data requests to companies) and zero-knowledge arguments about the requests they issue. Companies do the same for the data they deliver to agencies. Members of the public can view and verify all data posted to the ledger.

Each judge, law enforcement agency, and company will need to maintain a small amount of infrastructure: a computer terminal through which to compose posts, and local storage (the ovals in Figure 2) to store sensitive information (e.g., the content of sealed court orders). To attest to the authenticity of posts to the ledger, each participant will need to maintain a private signing key and publicize a corresponding verification key. We assume that public-key infrastructure could be established by a reputable party like the Administrative Office of the US Courts.
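As a sketch of this signing infrastructure, the snippet below signs and verifies a ledger post with an Ed25519 key pair using the pyca/cryptography library. The choice of Ed25519 and of this particular library is our assumption for illustration, not the paper's.

```python
from cryptography.hazmat.primitives.asymmetric import ed25519

# Each participant holds a private signing key and publishes the verification key.
signing_key = ed25519.Ed25519PrivateKey.generate()
verification_key = signing_key.public_key()

post = b"commitment to court order, seal expiry 2019-01-01"
signature = signing_key.sign(post)

# Anyone holding the verification key can check the authenticity of the post;
# verify() raises cryptography.exceptions.InvalidSignature on a bad signature.
verification_key.verify(signature, post)
```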

The ledger itself could be maintained as a distributed system among the participants in the process, a distributed system among a more exclusive group of participants with higher trustworthiness (e.g., the circuit courts of appeals), or by a single entity (e.g., the Administrative Office of the US Courts or the Supreme Court).

4.3 Workflow

Posting to the ledger: The workflow of our system augments the electronic surveillance workflow in Figure 1 with additional information posted to the ledger, as depicted in Figure 3. When a judge issues an order (step 2 of Figure 1), she also posts a commitment to the order and additional metadata about the case. At a minimum, this metadata must include the date that the order's seal expires; depending on the system configuration, she could post other metadata (e.g., Judge Smith's cover sheet). The commitment allows the public to later verify that the order was properly unsealed, but reveals no information about the commitment's content in the meantime, achieving a degree of accountability in the short term (while confidentiality is necessary) and full accountability in the long term (when the seal expires and confidentiality is unnecessary). Since judges are honest-but-curious, they will adhere to this protocol and reliably post commitments whenever new court orders are issued.

[Figure 3: Data posted to the public ledger as the protocol runs. Time moves from left to right. Each rectangle (commitment to order, case metadata, commitment to data request, request ZK argument, commitment to data response, response ZK argument) is a post to the ledger. Dashed arrows between rectangles indicate that the source of the arrow could contain a visible reference to the destination. The ovals (judge, law enforcement, company) contain the entities that make each post.]

The agency then uses this order to request data from a company (step 3 in Figure 1) and posts a commitment to this request, alongside a zero-knowledge argument that the request is compatible with a court order (and possibly also with other legal requirements). This commitment, which may never be opened, provides a small amount of accountability within the confines of confidentiality, revealing that some law enforcement action took place. The zero-knowledge argument takes accountability a step further: it demonstrates to the public that the law enforcement action was compatible with the original court order (which we trust to have been committed properly), forcing the potentially-malicious law enforcement agency to adhere to the protocol or make public its non-compliance. (Failure to adhere would result in a publicly visible invalid zero-knowledge argument.) If the company responds with matching data (step 6 in Figure 1), it posts a commitment to its response and an argument that it furnished (only) the data implicated by the order and data request. These commitments and arguments serve a role analogous to those posted by law enforcement.

This system does not require commitments to all actions in Figure 1. For example, it only requires a law enforcement agency to commit to a successful request for data (step 3), rather than any proposed request (step 1). The system could easily be augmented with additional commitments and proofs as desired by the court system.

The zero-knowledge arguments about relationships between commitments reveal one additional piece of information. For a law enforcement agency to prove that its committed data request is compatible with a particular court order, it must reveal which specific committed court order authorized the request. In other words, the zero-knowledge arguments reveal the links between specific actions of each party (dashed arrows in Figure 3). These links could be eliminated, reducing visibility into the workflow of surveillance. Instead, entities would argue that their actions are compatible with some court order among a group of recent orders.

Remark 4.1. We now briefly discuss other possible malicious behaviors by law enforcement agencies and companies involving inaccurate logging of data. Though, as mentioned in Remark 3.1, correspondence between real-world events and logged items is not enforceable by technological means alone, we informally argue that our design incentivizes honest logging and reporting of dishonest logging under many circumstances.

A malicious law enforcement agency could omit commitments, or commit to one surveillance request but send the company a different request. This action is visible to the company, which could reveal this misbehavior to the judge. This visibility incentivizes companies to record their actions diligently, so as to avoid any appearance of negligence, let alone complicity in the agency's misbehavior.

Similarly, a malicious company might fail to post a commitment, or post a commitment inconsistent with its actual behavior. These actions are visible to law enforcement agencies, who could report violations to the judge (and otherwise risk the appearance of negligence or complicity). To make such violations visible to the public, we could add a second law enforcement commitment that acknowledges the data received and proves that it is compatible with the original court order and law enforcement request.

However, even incentive-based arguments do not address the case of a malicious law enforcement agency colluding with a malicious company. These entities could simply withhold from posting any information to the ledger (or post a sequence of false but consistent information), thereby making it impossible to detect violations. To handle this scenario, we have to defer to the legal process itself: when this data is used as evidence in court, a judge should ensure that appropriate documentation was posted to the ledger and that the data was gathered appropriately.

Aggregate statistics: At configurable intervals, the individual courts use MPC to compute aggregate statistics about their surveillance activities.9 An analyst, such as the Administrative Office of the US Courts, receives the result of this MPC and posts it to the ledger. The particular kinds of aggregate statistics computed are at the discretion of the court system. They could include figures already tabulated in the Administrative Office of the US Courts' Wiretap Reports [9] (i.e., orders by state and by criminal offense) and in company-issued transparency reports [4, 7] (i.e., requests and number of users implicated, by company). Due to the generality of MPC,10 it is theoretically possible to compute any function of the information known to each of the judges. For performance reasons, we restrict our focus to totals and aggregated thresholds, a set of operations expressive enough to replicate existing transparency reports.

9 Microsoft [7] and Google [4] currently release their transparency reports every six months, and the Administrative Office of the US Courts does so annually [9]. We take these intervals to be our baseline for the frequency with which aggregate statistics would be released in our system, although releasing statistics more frequently (e.g., monthly) would improve transparency.

10 General MPC is a common term used to describe MPC that can compute arbitrary functions of the participants' data, as opposed to just restricted classes of functions.

[Figure 4: The flow of data as aggregate statistics are computed. Each lower-court judge calculates its component of the statistic and secret-shares it into 12 shares, one for each judicial circuit (illustrated by colors). The servers of the circuit courts then engage in an MPC to compute the aggregate statistic from the input shares.]

The statistics themselves are calculated using MPC. In principle, the hundreds of magistrate and district court judges could attempt to directly perform MPC with each other. However, as we find in Section 6, computing even simple functions among hundreds of parties is prohibitively slow. Moreover, the logistics of getting every judge online simultaneously with enough reliability to complete a multiround protocol would be difficult; if a single judge went offline, the protocol would stall.

Instead, we compute aggregate statistics in a hierarchical manner, as depicted in Figure 4. We exploit the existing hierarchy of the federal court system. Each of the lower-court judges is under the jurisdiction of one of twelve circuit courts of appeals. Each lower-court judge computes her individual component of the larger aggregate statistic (e.g., number of orders issued against Google in the past six months) and divides it into twelve secret shares, sending one share to (a server controlled by) each circuit court of appeals. Distinct shares are represented by separate colors in Figure 4. So long as at least one circuit server remains uncompromised, the lower-court judges can be assured—by the security of the secret-sharing scheme—that their contributions to the larger statistic are confidential. The circuit servers engage in a twelve-party MPC that reconstructs the judges' input data from the shares, computes the desired function, and reveals the result to the analyst. By concentrating the computationally intensive and logistically demanding part of the MPC process in twelve stable servers, this design eliminates many of the performance and reliability challenges of the flat (non-hierarchical) protocol. (Section 6 discusses performance.)
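The sketch below simulates this hierarchical flow for the special case of totals, where the circuit-level computation reduces to each server summing the shares it holds and the analyst adding the twelve partial sums. The judge counts, modulus, and function names are illustrative assumptions, not drawn from the implementation.

```python
import secrets

Q = 2**61 - 1          # modulus for additive secret sharing
NUM_CIRCUITS = 12      # one share per circuit court of appeals

def share(value, n):
    s = [secrets.randbelow(Q) for _ in range(n - 1)]
    s.append((value - sum(s)) % Q)
    return s

# Each judge's local component of the statistic (made-up numbers).
judge_counts = [3, 0, 7, 12, 1]

# Judges send one share to each circuit server; servers keep running sums.
circuit_sums = [0] * NUM_CIRCUITS
for count in judge_counts:
    for i, sh in enumerate(share(count, NUM_CIRCUITS)):
        circuit_sums[i] = (circuit_sums[i] + sh) % Q

# The analyst combines the twelve partial sums to obtain the total;
# no individual judge's count is ever revealed to any single server.
total = sum(circuit_sums) % Q
assert total == sum(judge_counts)
```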

This MPC strategy allows the court system to compute aggregate statistics (towards the accountability goal of Section 3.2). Since courts are honest-but-curious, and by the correctness guarantee of MPC, these statistics will be computed accurately on correct data (correctness of Section 3.2). MPC enables the courts to perform these computations without revealing any court's internal information to any other court (confidentiality of Section 3.2).

4.4 Additional Design Choices

The preceding section described our proposed system with its full range of accountability features. This configuration is only one of many possibilities. Although cryptography makes it possible to release information in a controlled way, the fact remains that revealing more information poses greater risks to investigative integrity. Depending on the court system's level of risk-tolerance, features can be modified or removed entirely to adjust the amount of information disclosed.

Cover sheet metadata: A judge might reasonably fear that a careful criminal could monitor cover sheet metadata to detect surveillance. At the cost of some transparency, judges could post less metadata when committing to an order. (At a minimum, the judge must post the date at which the seal expires.) The cover sheets integral to Judge Smith's proposal were also designed to supply certain information towards assessing the scale of surveillance; MPC replicates this outcome without releasing information about individual orders.

Commitments by individual judges: In some cases, posting a commitment might reveal too much. In a low-crime area, mere knowledge that a particular judge approved surveillance could spur a criminal organization to change its behavior. A number of approaches would address this concern. Judges could delegate the responsibility of posting to the ledger to the same judicial circuits that mediate the hierarchical MPC. Alternatively, each judge could continue posting to the ledger herself, but instead of signing the commitment under her own name, she could sign it as coming from some court in her judicial circuit, or nationwide, without revealing which one (group signatures [17] or ring signatures [33] are designed for this sort of anonymous signing within groups). Either of these approaches would conceal which individual judge approved the surveillance.

Aggregate statistics: The aggregate statistic mechanism offers a wide range of choices about the data to be revealed. For example, if the court system is concerned about revealing information about individual districts, statistics could be aggregated by any number of other parameters, including the type of crime being investigated or the company from which the data was requested.

5 Protocol Definition

We now define a complete protocol capturing the workflow from Section 4. We assume a public-key infrastructure and synchronous communication on authenticated (encrypted) point-to-point channels.

Preliminaries: The protocol is parametrized by
• a secret-sharing scheme Share,
• a commitment scheme C,
• a special type of zero-knowledge primitive SNARK,
• a multi-party computation protocol MPC, and
• a function CoverSheet that maps court orders to cover sheet information.

Several parties participate in the protocol:
• n judges J1, …, Jn,
• m law enforcement agencies A1, …, Am,
• q companies C1, …, Cq,
• r trustees T1, …, Tr,11
• P, a party representing the public,
• Ledger, a party representing the public ledger, and
• Env, a party called "the environment", which models the occurrence over time of exogenous events.

Ledger is a simple ideal functionality allowing any party to (1) append entries to a time-stamped append-only ledger and (2) retrieve ledger entries. Entries are authenticated except where explicitly anonymous.

Env is a modeling device that specifies the protocol behavior in the context of arbitrary exogenous event sequences occurring over time. Upon receipt of message clock, Env responds with the current time. To model the occurrence of an exogenous event e (e.g., a case in need of surveillance), Env sends information about e to the affected parties (e.g., a law enforcement agency).

11 In our specific case study, r = 12 and the trustees are the twelve US Circuit Courts of Appeals. The trustees are the parties which participate in the multi-party computation of aggregate statistics based on input data from all judges, as shown in Figure 4 and defined formally later in this subsection.

Next, we give the syntax of our cryptographic tools,12 and then define the behavior of the remaining parties.

A commitment scheme is a triple of probabilistic poly-time algorithms C = (Setup, Commit, Open), as follows:
• Setup(1^κ) takes as input a security parameter κ (in unary) and outputs public parameters pp.
• Commit(pp, m, ω) takes as input pp, a message m, and randomness ω. It outputs a commitment c.
• Open(pp, m′, c, ω′) takes as input pp, a message m′, and randomness ω′. It outputs 1 if c = Commit(pp, m′, ω′), and 0 otherwise.

pp is generated in an initial setup phase and is thereafter publicly known to all parties, so we elide it for brevity.

Algorithm 1 Law enforcement agency Ai

• On receipt of a surveillance request event e = (Surveil, u, s) from Env, where u is the public key of a company and s is the description of a surveillance request directed at u: send message (u, s) to a judge.13

• On receipt of a decision message (u, s, d) from a judge, where d ≠ reject:14 (1) generate a commitment c = Commit((s, d), ω) to the request and store (c, s, d, ω) locally; (2) generate a SNARK proof π attesting compliance of (s, d) with relevant regulations; (3) post (c, π) to the ledger; (4) send request (s, d, ω) to company u.

• On receipt of an audit request (c, P, z) from the public: generate decision b ← A^dp_i(c, P, z). If b = accept, generate a SNARK proof π attesting compliance of (s, d) with the regulations indicated by the audit request (c, P, z); else send (c, P, z, b) to a judge.13

• On receipt of an audit order (d, c, P, z) from a judge: if d = accept, generate a SNARK proof π attesting compliance of (s, d) with the regulations indicated by the audit request (c, P, z).

Agencies: Each agency Ai has an associated decision-making process A^dp_i, modeled by a stateful algorithm that maps audit requests to {accept} ∪ {0,1}*, where the output is either an acceptance or a description of why the agency chooses to deny the request. Each agency operates according to Algorithm 1, which is parametrized by its own A^dp_i. In practice, we assume A^dp_i would be instantiated by the agency's human decision-making process.

12 For formal security definitions beyond syntax, we refer to any standard cryptography textbook, such as [26].

13 For the purposes of our exposition, this could be an arbitrary judge. In practice, it would likely depend on the jurisdiction in which the surveillance event occurs and in which the law enforcement agency operates, and perhaps also on the type of case.

14 For simplicity of exposition, Algorithm 1 only addresses the case d ≠ reject and omits the possibility of appeal by the agency. The algorithm could straightforwardly be extended to encompass appeals by incorporating the decision to appeal into A^dp_i.

15 This is the step invoked by requests for unsealed documents.

Algorithm 2 Judge Ji

• On receipt of a surveillance request (u, s) from an agency Aj: (1) generate decision d ← J^dp1_i(s); (2) send response (u, s, d) to Aj; (3) generate a commitment c = Commit((u, s, d), ω) to the decision and store (c, u, s, d, ω) locally; (4) post (CoverSheet(d), c) to the ledger.

• On receipt of denied audit request information ζ from an agency Aj: generate decision d ← J^dp2_i(ζ) and send (d, ζ) to Aj and to the public P.

• On receipt of a data revelation request (c, z) from the public:15 generate decision b ← J^dp3_i(c, z). If b = accept, send to the public P the message and randomness (m, ω) corresponding to c; else if b = reject, send reject to P, with an accompanying explanation if provided.

Judges: Each judge Ji has three associated decision-making processes, J^dp1_i, J^dp2_i, and J^dp3_i. J^dp1_i maps surveillance requests to either a rejection or an authorizing court order; J^dp2_i maps denied audit requests to either a confirmation of the denial or a court order overturning the denial; and J^dp3_i maps data revelation requests to either an acceptance or a denial (perhaps along with an explanation of the denial, e.g., "this document is still under seal"). Each judge operates according to Algorithm 2, which is parametrized by the individual judge's (J^dp1_i, J^dp2_i, J^dp3_i).

Algorithm 3 Company Ci

• Upon receiving a surveillance request (s, d, ω) from an agency Aj: if the court order d bears the valid signature of a judge and Commit((s, d), ω) matches a corresponding commitment posted by law enforcement on the ledger, then (1) generate commitment c ← Commit(δ, ω) and store (c, δ, ω) locally; (2) generate a SNARK proof π attesting that δ is compliant with s and the judge-signed order d; (3) post (c, π) anonymously to the ledger; (4) reply to Aj by furnishing the requested data δ along with ω.

The public: The public P exhibits one main type of behavior in our model: upon receiving an event message e = (a, ξ) from Env (describing either an audit request or a data revelation request), P sends ξ to a (an agency or court). Additionally, the public periodically checks the ledger for validity of posted SNARK proofs and takes steps to flag any non-compliance detected (e.g., through the court system or the news media).

Companies and trustees: Algorithms 3 and 4 describe companies and trustees.

Algorithm 4 Trustee Ti

• Upon receiving an aggregate statistic event message e = (Compute, f, D1, …, Dn) from Env:

1. For each i′ ∈ [r] (such that i′ ≠ i), send e to Ti′.
2. For each j ∈ [n], send the message (f, Dj) to Jj. Let δ_{j,i} be the response from Jj.
3. With parties T1, …, Tr, participate in the MPC protocol MPC with input (δ_{1,i}, …, δ_{n,i}) to compute the composition of f with ReconInputs (i.e., first reconstruct the judges' inputs via ReconInputs, then apply f), where ReconInputs is defined as follows:

ReconInputs((δ_{1,i′}, …, δ_{n,i′})_{i′ ∈ [r]}) = (Recon(δ_{j,1}, …, δ_{j,r}))_{j ∈ [n]}

Let y denote the output from the MPC.16
4. Send y to Jj for each j ∈ [n].17

• Upon receiving an MPC initiation message e = (Compute, f, D1, …, Dn) from another trustee Ti′:

1. Receive a secret-share δ_{j,i} from each judge Jj, respectively.
2. With parties T1, …, Tr, participate in the MPC protocol MPC with input (δ_{1,i}, …, δ_{n,i}) to compute the composition of f with ReconInputs.

Companies execute judge-authorized instructions and log their actions by posting commitments on the ledger. Trustees run MPC to compute aggregate statistics from data provided in secret-shared form by judges; MPC events are triggered by Env.

6 Evaluation of MPC Implementation

In our proposal, judges use secure multiparty computation (MPC) to compute aggregate statistics about the extent and distribution of surveillance. Although in principle MPC can support secure computation of any function of the judges' data, full generality can come with unacceptable performance limitations. In order that our protocols scale to hundreds of federal judges, we narrow our attention to two kinds of functions that are particularly useful in the context of surveillance.

The extent of surveillance (totals): Computing totals involves summing values held by the parties without revealing information about any value to anyone other than its owner. Totals become averages by dividing by the number of data points. In the context of electronic surveillance, totals are the most prevalent form of computation on government and corporate transparency reports. How many court orders were approved for cases involving homicide, and how many for drug offenses? How long was the average order in effect? How many orders were issued in California [9]? Totals make it possible to determine the extent of surveillance.

The distribution of surveillance (thresholds): Thresholding involves determining the number of data points that exceed a given cut-off. How many courts issued more than ten orders for data from Google? How many orders were in effect for more than 90 days? Unlike totals, thresholds can reveal selected facts about the distribution of surveillance, i.e., the circumstances in which it is most prevalent. Thresholds go beyond the kinds of questions typically answered in transparency reports, offering new opportunities to improve accountability.
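For clarity, the functions below state the two functionalities in the clear: they are the functions f that the judges jointly evaluate under MPC, not the MPC protocols themselves, and the example inputs are invented.

```python
def total(values):
    """Extent of surveillance: sum of the judges' individual counts."""
    return sum(values)

def threshold_count(values, cutoff):
    """Distribution of surveillance: how many inputs exceed the cutoff."""
    return sum(1 for v in values if v > cutoff)

per_judge_google_orders = [2, 14, 0, 11, 7]
assert total(per_judge_google_orders) == 34
assert threshold_count(per_judge_google_orders, 10) == 2  # judges with >10 orders
```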

To enable totals and thresholds to scale to the size of the federal court system, we implemented a hierarchical MPC protocol, as described in Figure 4, whose design mirrors the hierarchy of the court system. Our evaluation shows the hierarchical structure reduces MPC complexity from quadratic in the number of judges to linear.

We implemented protocols that make use of totals andthresholds using two existing JavaScript-based MPC li-braries WebMPC [16 29] and Jiff [5] WebMPC is thesimpler and less versatile library we test it as a baselineand as a ldquosanity checkrdquo that its performance scales as ex-pected then move on to the more interesting experimentof evaluating Jiff We opted for JavaScript libraries to fa-cilitate integration into a web application which is suit-able for federal judges to submit information through afamiliar browser interface regardless of the differencesin their local system setups Both of these libraries aredesigned to facilitate MPC across dozens or hundredsof computers we simulated this effect by running eachparty in a separate process on a computer with 16 CPUcores and 64GB of RAM We tested these protocols onrandomly generated data containing values in the hun-dreds which reflects the same order of magnitude as datapresent in existing transparency reports Our implemen-tations were crafted with two design goals in mind

1 Protocols should scale to roughly 1000 partiesthe approximate size of the federal judiciary [10]performing efficiently enough to facilitate periodictransparency reports

2 Protocols should not require all parties to be onlineregularly or at the same time

In the subsections that follow we describe and evaluateour implementations in light of these goals

61 Computing Totals in WebMPC

WebMPC is a JavaScript-based library that can securely compute sums in a single round. The underlying protocol relies on two parties who are trusted not to collude with one another: an analyst, who distributes masking information to all protocol participants at the beginning of the process and receives the final aggregate statistic, and a facilitator, who aggregates this information together in masked form. The participants use the masking information from the analyst to mask their inputs and send them to the facilitator, who aggregates them and sends the result (i.e., a masked sum) to the analyst. The analyst removes the mask and uncovers the aggregated result. Once the participants have their masks, they receive no further messages from any other party; they can submit this masked data to the facilitator in an uncoordinated fashion and go offline immediately afterwards. Even if some anticipated participants do not send data, the protocol can still run to completion with those who remain.

Figure 5: Performance of MPC using the WebMPC library.
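To illustrate the structure of this one-round protocol, here is a minimal sketch of masked summation in the same spirit; it is our own illustration, not WebMPC's actual API, and it elides the library's handling of dropouts and transport.

    import random

    P = 2**61 - 1  # example modulus; masks and inputs are reduced mod P

    # Analyst: deal one random mask per participant and remember only their sum.
    def deal_masks(num_participants):
        masks = [random.randrange(P) for _ in range(num_participants)]
        return masks, sum(masks) % P

    # Participant: mask the private input and send it to the facilitator.
    def mask_input(x, mask):
        return (x + mask) % P

    # Facilitator: add up masked inputs; learns nothing about individual inputs.
    def aggregate(masked_inputs):
        return sum(masked_inputs) % P

    # Analyst: remove the aggregate mask to recover the true total.
    def unmask(masked_sum, mask_total):
        return (masked_sum - mask_total) % P

    inputs = [12, 5, 31]                       # e.g., per-judge order counts
    masks, mask_total = deal_masks(len(inputs))
    masked = [mask_input(x, m) for x, m in zip(inputs, masks)]
    assert unmask(aggregate(masked), mask_total) == sum(inputs)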

To make this protocol feasible in our setting, we need to identify a facilitator and analyst who will not collude. In many circumstances, it would be acceptable to rely on reputable institutions already present in the court system, such as the circuit courts of appeals, the Supreme Court, or the Administrative Office of the US Courts.

Although this protocol's simplicity limits its generality, it also makes it possible for the protocol to scale efficiently to a large number of participants. As Figure 5 illustrates, the protocol scales linearly with the number of parties. Even with 400 parties (the largest size we tested), the protocol still completed in just under 75 seconds. Extrapolating from the linear trend, it would take about three minutes to compute a summation across the entire federal judiciary. Since existing transparency reports are typically released just once or twice a year, it is reasonable to invest three minutes of computation (or less than a fifth of a second per judge) for each statistic.
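As a back-of-the-envelope check of that extrapolation, assuming roughly 900 federal judges [10]:

    75 s / 400 parties ≈ 0.19 s per judge;  900 judges × 0.19 s ≈ 170 s ≈ 3 minutes.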

6.2 Thresholds and Hierarchy with Jiff

To make use of MPC operations beyond totals, we turned to Jiff, another MPC library implemented in JavaScript. Jiff is designed to support MPC for arbitrary functionalities, although inbuilt support for some more complex functionalities is still under development at the time of writing. Most importantly for our needs, Jiff supports thresholding and multiplication in addition to sums. We evaluated Jiff on three different MPC protocols: totals (as with WebMPC), additive thresholding (i.e., how many values exceeded a specific threshold), and multiplicative thresholding (i.e., did all values exceed a specific threshold). In contrast to computing totals via summation, certain operations like thresholding require more complicated computation and multiple rounds of communication. By building on Jiff with our hierarchical MPC implementation, we demonstrate that these operations are viable at the scale required by the federal court system.

Figure 6: Flat total (red), additive threshold (blue), and multiplicative threshold (green) protocols in Jiff.

Figure 7: Hierarchical total (red), additive threshold (blue), and multiplicative threshold (green) protocols in Jiff. Note the difference in axis scales from Figure 6.

As a baseline, we ran sums, additive thresholding, and multiplicative thresholding benchmarks with all judges as full participants in the MPC protocol sharing the workload equally, a configuration we term the flat protocol (in contrast to the hierarchical protocol we present next). Figure 6 illustrates that the running time of these protocols grows quadratically with the number of judges participating. These running times quickly became untenable: while summation took several minutes among hundreds of judges, both thresholding benchmarks could barely handle tens of judges in the same time envelopes. These graphs illustrate the substantial performance disparity between summation and thresholding.

In Section 4, we described an alternative "hierarchical" MPC configuration to reduce this quadratic growth to linear. As depicted in Figure 4, each lower-court judge splits a piece of data into twelve secret shares, one for each circuit court of appeals. These shares are sent to the corresponding courts, who conduct a twelve-party MPC that performs a total or thresholding operation based on the input shares. If n lower-court judges participate, the protocol is tantamount to computing n twelve-party summations followed by a single n-input summation or threshold. As n increases, the amount of work scales linearly. So long as at least one circuit court remains honest and uncompromised, the secrecy of the lower court data endures by the security of the secret-sharing scheme.
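The following sketch shows the data flow that makes the hierarchical variant linear in the number of judges: each judge does only a constant amount of work (splitting one value into twelve shares), and only the twelve circuit courts run the expensive multi-round MPC among themselves. This is an illustration using additive sharing and a totals statistic, with names of our own choosing, not the Jiff-based implementation itself.

    import random

    P = 2**61 - 1
    NUM_CIRCUITS = 12

    def split_into_shares(value, num_shares=NUM_CIRCUITS):
        """A lower-court judge splits one value into additive shares, one per circuit court."""
        shares = [random.randrange(P) for _ in range(num_shares - 1)]
        shares.append((value - sum(shares)) % P)
        return shares

    def circuit_local_sum(shares_received):
        """Each circuit court locally adds the shares it received from all n judges."""
        return sum(shares_received) % P

    def circuits_combine(partial_sums):
        """Stand-in for the twelve-party MPC: combining the partial sums yields the total."""
        return sum(partial_sums) % P

    judge_values = [random.randrange(100) for _ in range(900)]   # e.g., 900 judges
    per_judge = [split_into_shares(v) for v in judge_values]
    per_circuit = [[per_judge[j][c] for j in range(len(judge_values))] for c in range(NUM_CIRCUITS)]
    partials = [circuit_local_sum(col) for col in per_circuit]
    assert circuits_combine(partials) == sum(judge_values) % P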

Figure 7 illustrates the linear scaling of the twelve-party portion of the hierarchical protocols; we measured only the computation time after the circuit courts received all of the additive shares from the lower courts. While the flat summation protocol took nearly eight minutes to run on 300 judges, the hierarchical summation scaled to 1000 judges in less than 20 seconds, besting even the WebMPC results. Although thresholding characteristically remained much slower than summation, the hierarchical protocol scaled to nearly 250 judges in about the same amount of time that it took the flat protocol to run on 35 judges. Since the running times for the threshold protocols were in the tens of minutes for large benchmarks, the linear trend is noisier than for the total protocol. Most importantly, both of these protocols scaled linearly, meaning that, given sufficient time, thresholding could scale up to the size of the federal court system. This performance is acceptable if a few choice thresholds are computed at the frequency at which existing transparency reports are published (footnote 18).

One additional benefit of the hierarchical protocols is that lower courts do not need to stay online while the protocol is executing, a goal we articulated at the beginning of this section. A lower court simply needs to send in its shares to the requisite circuit courts, one message per circuit court to a grand total of twelve messages, after which it is free to disconnect. In contrast, the flat protocol grinds to a halt if even a single judge goes offline. The availability of the hierarchical protocol relies on a small set of circuit courts, who could invest in more robust infrastructure.

7 Evaluation of SNARKs

We define the syntax of preprocessing zero-knowledge SNARKs for arithmetic circuit satisfiability [15].

A SNARK is a triple of probabilistic polynomial-time algorithms SNARK = (Setup, Prove, Verify), as follows:

• Setup(1^κ, R) takes as input the security parameter κ and a description of a binary relation R (an arithmetic circuit of size polynomial in κ), and outputs a pair (pkR, vkR) of a proving key and verification key.

• Prove(pkR, (x, w)) takes as input a proving key pkR and an input-witness pair (x, w), and outputs a proof π attesting to x ∈ LR, where LR = {x : ∃w s.t. (x, w) ∈ R}.

• Verify(vkR, (x, π)) takes as input a verification key vkR and an input-proof pair (x, π), and outputs a bit indicating whether π is a valid proof for x ∈ LR.

Footnote 18: Too high a frequency is also inadvisable due to the possibility of revealing too-granular information when combined with the timings of specific investigations' court orders.

Before participants can create and verify SNARKs, they must establish a proving key, which any participant can use to create a SNARK, and a corresponding verification key, which any participant can use to verify a SNARK so created. Both of these keys are publicly known. The keys are distinct for each circuit (representing an NP relation) about which proofs are generated, and can be reused to produce as many different proofs with respect to that circuit as desired. Key generation uses randomness that, if known or biased, could allow participants to create proofs of false statements [13]. The key generation process must therefore protect and then destroy this information.
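To summarize the lifecycle in code, here is a minimal interface-level sketch of how the three algorithms fit together; the type and function names are illustrative, this is not LibSNARK's actual API, and the bodies are deliberately left abstract.

    from typing import Callable, Tuple

    # Illustrative types only; in a real deployment these are opaque objects
    # produced by a SNARK library, and R is an arithmetic circuit.
    Statement = bytes                                   # public input x
    Witness = bytes                                     # secret witness w
    Relation = Callable[[Statement, Witness], bool]     # R(x, w) = 1 iff (x, w) is in R

    class ProvingKey: ...
    class VerificationKey: ...
    class Proof: ...

    def setup(security_parameter: int, relation: Relation) -> Tuple[ProvingKey, VerificationKey]:
        """One-time, per-circuit key generation; its internal randomness must be destroyed."""
        raise NotImplementedError("provided by the SNARK library")

    def prove(pk: ProvingKey, x: Statement, w: Witness) -> Proof:
        """Run by whoever knows the witness (e.g., a law enforcement agency)."""
        raise NotImplementedError("provided by the SNARK library")

    def verify(vk: VerificationKey, x: Statement, proof: Proof) -> bool:
        """Run by any auditor; requires only the public verification key."""
        raise NotImplementedError("provided by the SNARK library")

    # Intended flow (keys are generated once per relation, then reused):
    #   pk, vk = setup(128, R)
    #   pi = prove(pk, x, w)      # many proofs may be produced under the same pk
    #   assert verify(vk, x, pi)  # anyone holding vk can check the proof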

Using MPC to do key generation based on randomness provided by many different parties provides the guarantee that, as long as at least one of the MPC participants behaved correctly (i.e., did not bias his randomness and destroyed it afterward), the resulting keys are good (i.e., do not permit proofs of false statements). This approach has been used in the past, most notably by the cryptocurrency Zcash [8]. Despite the strong guarantees provided by this approach to key generation when at least one party is not corrupted, concerns have been expressed about the wisdom of trusting in the assumption of one honest party in the Zcash setting, which involves large monetary values and a system design inherently centered around the principles of full decentralization.

For our system, we propose key generation be done in a one-time MPC among several of the traditionally reputable institutions in the court system, such as the Supreme Court or Administrative Office of the US Courts, ideally together with other reputable parties from different branches of government. In our setting, the use of MPC for SNARK key generation does not constitute as pivotal and potentially risky a trust assumption as in Zcash, in that the court system is close-knit and inherently built with the assumption of trustworthiness of certain entities within the system. In contrast, a decentralized cryptocurrency (1) must, due to its distributed nature, rely for key generation on MPC participants that are essentially strangers to most others in the system, and (2) could be said to derive its very purpose from not relying on the trustworthiness of any small set of parties.

We note that since key generation is a one-time task for each circuit, we can tolerate a relatively performance-intensive process. Proving and verification keys can be distributed on the ledger.

7.1 Argument Types

Our implementation supports three types of arguments.

Argument of knowledge for a commitment (Pk). Our simplest type of argument attests the prover's knowledge of the content of a given commitment c, i.e., that she could open the commitment if required. Whenever a party publishes a commitment, she can accompany it with a SNARK attesting that she knows the message and randomness that were used to generate the commitment. Formally, this is an argument that the prover knows m and ω that correspond to a publicly known c such that Open(m, c, ω) = 1.

Argument of commitment equality (Peq). Our second type of argument attests that the content of two publicly known commitments c1, c2 is the same. That is, for two publicly known commitments c1 and c2, the prover knows m1, m2, ω1, and ω2 such that Open(m1, c1, ω1) = 1 ∧ Open(m2, c2, ω2) = 1 ∧ m1 = m2.

More concretely, suppose that an agency wishes to release relational information: that the identifier (e.g., email address) in the request is the same identifier that a judge approved. The judge and law enforcement agency post commitments c1 and c2, respectively, to the identifiers they used. The law enforcement agency then posts an argument attesting that the two commitments are to the same value (footnote 19). Since circuits use fixed-size inputs, an argument implicitly reveals the length of the committed message. To hide this information, the law enforcement agency can pad each input up to a uniform length.

Peq may be too revealing under certain circumstances: for the public to verify the argument, the agency (who posted c2) must explicitly identify c1, potentially revealing which judge authorized the data request and when.

Existential argument of commitment equality (P∃). Our third type of argument allows decreasing the resolution of the information revealed by proving that a commitment's content is the same as that of some other commitment among many. Formally, it shows that for publicly known commitments c, c1, ..., cN, respectively to secret values (m, ω), (m1, ω1), ..., (mN, ωN), ∃i such that Open(m, c, ω) = 1 ∧ Open(mi, ci, ωi) = 1 ∧ m = mi. We treat i as an additional secret input, so that for any value of N only two commitments need to be opened. This scheme trades off between resolution (number of commitments) and efficiency, a question we explore below.
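In cleartext, the three statements being attested are just the following checks over an Open predicate; a minimal sketch with illustrative names (in the real system the prover shows these in zero knowledge via a SNARK rather than by revealing m, ω, or i):

    def argument_pk(open_fn, m, c, omega):
        """Pk: the prover knows an opening (m, omega) of the public commitment c."""
        return open_fn(m, c, omega)

    def argument_peq(open_fn, m1, c1, omega1, m2, c2, omega2):
        """Peq: two public commitments c1, c2 open to the same message."""
        return open_fn(m1, c1, omega1) and open_fn(m2, c2, omega2) and m1 == m2

    def argument_pexists(open_fn, m, c, omega, i, commitments, m_i, omega_i):
        """P-exists: c opens to the same message as the i-th of N public commitments;
        the index i is itself a secret input, so only two openings are ever checked."""
        return open_fn(m, c, omega) and open_fn(m_i, commitments[i], omega_i) and m == m_i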

We have chosen these three types of arguments to implement, but LibSNARK supports arbitrary predicates in principle, and there are likely others that would be useful and run efficiently in practice. A useful generalization of Peq and P∃ would be to replace equality with more sophisticated domain-specific predicates: instead of showing that messages m1, m2 corresponding to a pair of commitments are equal, one could show p(m1, m2) = 1 for other predicates p (e.g., "less-than" or "signed by same court"). The types of arguments that can be implemented efficiently will expand as SNARK libraries' efficiency improves; our system inherits such efficiency gains.

Footnote 19: To produce a proof for Peq, the prover (e.g., the agency) needs to know both ω2 and ω1, but in some cases c1 (and thus ω1) may have been produced by a different entity (e.g., the judge). Publicizing ω1 is unacceptable, as it compromises the hiding of the commitment content. To solve this problem, the judge can include ω1 alongside m1 in secret documents that both parties possess (e.g., the court order).

7.2 Implementation

We implemented these zero-knowledge arguments with LibSNARK [34], a C++ library for creating general-purpose SNARKs from arithmetic circuits. We implemented commitments using the SHA256 hash function (footnote 20); ω is a 256-bit random string appended to the message before it is hashed. In this section, we show that useful statements can be proven within a reasonable performance envelope. We consider six criteria: the size of the proving key, the size of the verification key, the size of the proof statement, the time to generate keys, the time to create proofs, and the time to verify proofs. We evaluated these metrics with messages from 16 to 1232 bytes on Pk, Peq, and P∃ (N = 100, 400, 700, and 1000, large enough to obscure links between commitments), on a computer with 16 CPU cores and 64GB of RAM.
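The commitment scheme itself is simple enough to sketch directly; the following illustrates the commit/open structure described above (our own sketch of a SHA256-based commitment, not the circuit LibSNARK actually proves over):

    import hashlib
    import os

    def commit(message: bytes) -> tuple[bytes, bytes]:
        """Commit to `message`; returns (c, omega) where omega is 256 bits of fresh randomness."""
        omega = os.urandom(32)
        c = hashlib.sha256(message + omega).digest()
        return c, omega

    def open_commitment(message: bytes, c: bytes, omega: bytes) -> bool:
        """Open(m, c, omega) = 1 iff c was computed from m and omega."""
        return hashlib.sha256(message + omega).digest() == c

    # A judge and an agency commit to the same email address; in the real system the
    # agency would prove equality in zero knowledge rather than revealing m or omega.
    m = b"target@example.com".ljust(64, b"\x00")   # padded to a uniform length
    c1, omega1 = commit(m)
    c2, omega2 = commit(m)
    assert open_commitment(m, c1, omega1) and open_commitment(m, c2, omega2)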

Argument size. The argument is just 287 bytes. Accompanying each argument are its public inputs (in this case, commitments). Each commitment is 256 bits (footnote 21). An auditor needs to store these commitments anyway as part of the ledger, and each commitment can be stored just once and reused for many proofs.

Verification key size. The size of the verification key is proportional to the size of the circuit and its public inputs. The key was 106KB for Pk (one commitment as input and one SHA256 circuit) and 208.3KB for Peq (two commitments and two SHA256 circuits). Although P∃ computes SHA256 just twice, its smallest input, 100 commitments, is 50 times as large as that of Pk and Peq; the keys are correspondingly larger and grow linearly with the input size. For 100, 400, 700, and 1000 commitments, the verification keys were respectively 10MB, 41MB, 71MB, and 102MB. Since only one verification key is necessary for each circuit, these keys are easily small enough to make large-scale verification feasible.

Proving key size. The proving keys are much larger, in the hundreds of megabytes. Their size grows linearly with the size of the circuit, so longer messages (which require more SHA256 computations), more complicated circuits, and (for P∃) more inputs lead to larger keys. Figure 8a reflects this trend. Proving keys are largest for P∃ with 1000 inputs on 1.232KB messages, and shrink as the message size and the number of commitments decrease. Pk and Peq, which have simpler circuits, still have bigger proving keys for bigger messages. Although these keys are large, only entities that create each kind of proof need to store the corresponding key. Storing one key for each type of argument we have presented takes only about 1GB at the largest input sizes.

Footnote 20: Certain other hash functions may be more amenable to representation as arithmetic circuits and thus more "SNARK-friendly." We opted for a proof of concept with SHA256, as it is so widely used.

Footnote 21: LibSNARK stores each bit in a 32-bit integer, so an argument involving k commitments takes about 1024k bytes. A bit-vector representation would save a factor of 32.

Key generation time. Key generation time increased linearly with the size of the keys, from a few seconds for Pk and Peq on small messages to a few minutes for P∃ on the largest parameters (Figure 8b). Since key generation is a one-time process to add a new kind of proof in the form of a circuit, we find these numbers acceptable.

Argument generation time. Argument generation time increased linearly with proving key size and ranged from a few seconds on the smallest keys to a couple of minutes for the largest (Figure 8c). Since argument generation is a one-time task for each surveillance action, and the existing administrative processes for each surveillance action often take hours or days, we find this cost acceptable.

Argument verification time. Verifying Pk and Peq on the largest message took only a few milliseconds. Verification times for P∃ were larger and increased linearly with the number of input commitments: for 100, 400, 700, and 1000 commitments, verification took 40ms, 85ms, 243ms, and 338ms on the largest input. These times are still fast enough to verify many arguments quickly.

8 Generalization

Our proposal can be generalized beyond ECPA surveillance to encompass a broader class of secret information processes. Consider situations in which independent institutions need to act in a coordinated but secret fashion, and at the same time are subject to public scrutiny: they should be able to convince the public that their actions are consistent with relevant rules. As in electronic surveillance, accountability requires the ability to attest to compliance without revealing sensitive information.

Example 1 (FISA court). Accountability is needed in other electronic surveillance arenas. The US Foreign Intelligence Surveillance Act (FISA) regulates surveillance in national security investigations. Because of the sensitive interests at stake, the entire process is overseen by a US court that meets in secret. The tension between secrecy and public accountability is even sharper for the FISA court: much of the data collected under FISA may stay permanently hidden inside US intelligence agencies, while data collected under ECPA may eventually be used in public criminal trials. This opacity may be justified, but it has engendered skepticism. The public has no way of knowing what the court is doing, nor any means of assuring itself that the intelligence agencies under the authority of FISA are even complying with the rules of that court. The FISA court itself has voiced concern that it has no independent means of assessing compliance with its orders because of the extreme secrecy involved. Applying our proposal to the FISA court, both the court and the public could receive proofs of documented compliance with FISA orders, as well as aggregate statistics on the scope of FISA surveillance activity, to the full extent possible without incurring national security risk.

Figure 8: SNARK evaluation. (a) Proving key size. (b) Key generation time. (c) Argument generation time.

Example 2 (Clinical trials). Accountability mechanisms are also important to assess behavior of private parties, e.g., in clinical trials for new drugs. There are many parties to clinical trials, and much of the information involved is either private or proprietary. Yet regulators and the public have a need to know that responsible testing protocols are observed. Our system can achieve the right balance of transparency, accountability, and respect for privacy of those involved in the trials.

Example 3 (Public fund spending). Accountability in spending of taxpayer money is naturally a subject of public interest. Portions of public funds may be allocated for sensitive purposes (e.g., defense/intelligence), and the amounts and allocation thereof may be publicly unavailable due to their sensitivity. Our system would enable credible public assurances that taxpayer money is being spent in accordance with stated principles, while preserving secrecy of information considered sensitive.

8.1 Generalized Framework

We present abstractions describing the generalized version of our system and briefly outline how the concrete examples fit into this framework. A secret information process includes the following components:

• A set of participants interact with each other. In our ECPA example, these are judges, law enforcement agencies, and companies.

• The participants engage in a protocol (e.g., to execute the procedures for conducting electronic surveillance). The protocol messages exchanged are hidden from the view of outsiders (e.g., the public), and yet it is of public interest that the protocol messages exchanged adhere to certain rules.

• A set of auditors (distinct from the participants) seeks to audit the protocol by verifying that a set of accountability properties are met.

Abstractly, our system allows the controlled disclosure of four types of information.

Existential information reveals the existence of a piece of data, be it in a participant's local storage or the content of a communication between participants. In our case study, existential information is revealed with commitments, which indicate the existence of a document.

Relational information describes the actions participants take in response to the actions of others. In our case study, relational information is represented by the zero-knowledge arguments that attest that actions were taken lawfully (e.g., in compliance with a judge's order).

Content information is the data in storage and communication. In our case study, content information is revealed through aggregate statistics via MPC, and when documents are unsealed and their contents made public.

Timing information is a by-product of the other information. In our case study, timing information could include order issuance dates, turnaround times for data request fulfilment by companies, and seal expiry dates.

Revealing combinations of these four types of information with the specified cryptographic tools provides the flexibility to satisfy a range of application-specific accountability properties, as exemplified next.

Example 1 (FISA court). Participants are the FISA Court judges, the agencies requesting surveillance authorization, and any service providers involved in facilitating said surveillance. The protocol encompasses the legal process required to authorize surveillance, together with the administrative steps that must be taken to enact surveillance. Auditors are the public, the judges themselves, and possibly Congress. Desirable accountability properties are similar to those in our ECPA case study: e.g., attestations that certain rules are being followed in issuing surveillance orders, and release of aggregate statistics on surveillance activities under FISA.

Example 2 (Clinical trials). Participants are the institutions (companies or research centers) conducting clinical trials, comprising scientists, ethics boards, and data analysts; the organizations that manage regulations regarding clinical trials, such as the National Institutes of Health (NIH) and the Food and Drug Administration (FDA) in the US; and hospitals and other sources through which trial participants are drawn. The protocol encompasses the administrative process required to approve a clinical trial, and the procedure of gathering participants and conducting the trial itself. Auditors are the public, the regulatory organizations such as the NIH and the FDA, and possibly professional ethics committees. Desirable accountability properties include, e.g., attestations that appropriate procedures are respected in recruiting participants and administering trials, and release of aggregate statistics on clinical trial results without compromising individual participants' medical data.

Example 3 (Public fund spending). Participants are Congress (who appropriates the funding), defense/intelligence agencies, and service providers contracted in the spending of said funding. The protocol encompasses the processes by which Congress allocates funds to agencies and agencies allocate funds to particular expenses. Auditors are the public and Congress. Desirable accountability properties include, e.g., attestations that procurements were within reasonable margins of market prices and satisfied documented needs, and release of aggregate statistics on the proportion of allocated money used and broad spending categories.

9 Conclusion

We present a cryptographic answer to the accountability challenge currently frustrating the US court system. Leveraging cryptographic commitments, zero-knowledge proofs, and secure MPC, we provide the electronic surveillance process a series of scalable, flexible, and practical measures for improving accountability while maintaining secrecy. While we focus on the case study of electronic surveillance, these strategies are equally applicable to a range of other secret information processes requiring accountability to an outside auditor.

Acknowledgements

We are grateful to Judge Stephen Smith for discussion and insights from the perspective of the US court system, to Andrei Lapets, Kinan Dak Albab, Rawane Issa, and Frederick Joossens for discussion on Jiff and WebMPC, and to Madars Virza for advice on SNARKs and LibSNARK.

This research was supported by the following grants: NSF MACS (CNS-1413920), DARPA IBM (W911NF-15-C-0236), SIMONS Investigator Award Agreement Dated June 5th, 2012, and the Center for Science of Information (CSoI), an NSF Science and Technology Center, under grant agreement CCF-0939370.

References

[1] Bellman. https://github.com/ebfull/bellman.

[2] Electronic Communications Privacy Act, 18 U.S.C. 2701 et seq.

[3] Foreign Intelligence Surveillance Act, 50 U.S.C. ch. 36.

[4] Google transparency report. https://www.google.com/transparencyreport/userdatarequests/countries/?p=2016-12.

[5] Jiff. https://github.com/multiparty/jiff.

[6] Jsnark. https://github.com/akosba/jsnark.

[7] Law enforcement requests report. https://www.microsoft.com/en-us/about/corporate-responsibility/lerr.

[8] Zcash. https://z.cash.

[9] Wiretap report 2015. http://www.uscourts.gov/statistics-reports/wiretap-report-2015, December 2015.

[10] ADMINISTRATIVE OFFICE OF THE COURTS. Authorized judgeships. http://www.uscourts.gov/sites/default/files/allauth.pdf.

[11] ARAKI, T., BARAK, A., FURUKAWA, J., LICHTER, T., LINDELL, Y., NOF, A., OHARA, K., WATZMAN, A., AND WEINSTEIN, O. Optimized honest-majority MPC for malicious adversaries - breaking the 1 billion-gate per second barrier. In 2017 IEEE Symposium on Security and Privacy, SP 2017, San Jose, CA, USA, May 22-26, 2017 (2017), IEEE Computer Society, pp. 843–862.

[12] BATES, A., BUTLER, K. R., SHERR, M., SHIELDS, C., TRAYNOR, P., AND WALLACH, D. Accountable wiretapping -or- I know they can hear you now. NDSS (2012).

[13] BEN-SASSON, E., CHIESA, A., GREEN, M., TROMER, E., AND VIRZA, M. Secure sampling of public parameters for succinct zero knowledge proofs. In Security and Privacy (SP), 2015 IEEE Symposium on (2015), IEEE, pp. 287–304.

[14] BEN-SASSON, E., CHIESA, A., TROMER, E., AND VIRZA, M. Scalable zero knowledge via cycles of elliptic curves. IACR Cryptology ePrint Archive 2014 (2014), 595.

[15] BEN-SASSON, E., CHIESA, A., TROMER, E., AND VIRZA, M. Succinct non-interactive zero knowledge for a von Neumann architecture. In 23rd USENIX Security Symposium (USENIX Security 14) (San Diego, CA, Aug. 2014), USENIX Association, pp. 781–796.

[16] BESTAVROS, A., LAPETS, A., AND VARIA, M. User-centric distributed solutions for privacy-preserving analytics. Communications of the ACM 60, 2 (February 2017), 37–39.

[17] CHAUM, D., AND VAN HEYST, E. Group signatures. In Advances in Cryptology - EUROCRYPT '91, Workshop on the Theory and Application of Cryptographic Techniques, Brighton, UK, April 8-11, 1991, Proceedings (1991), D. W. Davies, Ed., vol. 547 of Lecture Notes in Computer Science, Springer, pp. 257–265.

[18] DAMGÅRD, I., AND ISHAI, Y. Constant-round multiparty computation using a black-box pseudorandom generator. In Advances in Cryptology - CRYPTO 2005, 25th Annual International Cryptology Conference, Santa Barbara, California, USA, August 14-18, 2005, Proceedings (2005), V. Shoup, Ed., vol. 3621 of Lecture Notes in Computer Science, Springer, pp. 378–394.

[19] DAMGÅRD, I., KELLER, M., LARRAIA, E., PASTRO, V., SCHOLL, P., AND SMART, N. P. Practical covertly secure MPC for dishonest majority - or: Breaking the SPDZ limits. In Computer Security - ESORICS 2013 - 18th European Symposium on Research in Computer Security, Egham, UK, September 9-13, 2013, Proceedings (2013), J. Crampton, S. Jajodia, and K. Mayes, Eds., vol. 8134 of Lecture Notes in Computer Science, Springer, pp. 1–18.

[20] FEIGENBAUM, J., JAGGARD, A. D., AND WRIGHT, R. N. Open vs. closed systems for accountability. In Proceedings of the 2014 Symposium and Bootcamp on the Science of Security (2014), ACM, p. 4.

[21] FEIGENBAUM, J., JAGGARD, A. D., WRIGHT, R. N., AND XIAO, H. Systematizing "accountability" in computer science. Tech. rep., Yale University, Feb. 2012. Technical Report YALEU/DCS/TR-1452.

[22] FEUER, A., AND ROSENBERG, E. Brooklyn prosecutor accused of using illegal wiretap to spy on love interest, November 2016. https://www.nytimes.com/2016/11/28/nyregion/brooklyn-prosecutor-accused-of-using-illegal-wiretap-to-spy-on-love-interest.html.

[23] GOLDWASSER, S., AND PARK, S. Public accountability vs. secret laws: Can they coexist? A cryptographic proposal. In Proceedings of the 2017 Workshop on Privacy in the Electronic Society, Dallas, TX, USA, October 30 - November 3, 2017 (2017), B. M. Thuraisingham and A. J. Lee, Eds., ACM, pp. 99–110.

[24] JAKOBSEN, T. P., NIELSEN, J. B., AND ORLANDI, C. A framework for outsourcing of secure computation. In Proceedings of the 6th edition of the ACM Workshop on Cloud Computing Security, CCSW '14, Scottsdale, Arizona, USA, November 7, 2014 (2014), G. Ahn, A. Oprea, and R. Safavi-Naini, Eds., ACM, pp. 81–92.

[25] KAMARA, S. Restructuring the NSA metadata program. In International Conference on Financial Cryptography and Data Security (2014), Springer, pp. 235–247.

[26] KATZ, J., AND LINDELL, Y. Introduction to Modern Cryptography. CRC Press, 2014.

[27] KROLL, J., FELTEN, E., AND BONEH, D. Secure protocols for accountable warrant execution, 2014. http://www.cs.princeton.edu/felten/warrant-paper.pdf.

[28] LAMPSON, B. Privacy and security: Usable security: How to get it. Communications of the ACM 52, 11 (2009), 25–27.

[29] LAPETS, A., VOLGUSHEV, N., BESTAVROS, A., JANSEN, F., AND VARIA, M. Secure MPC for analytics as a web application. In 2016 IEEE Cybersecurity Development (SecDev) (Boston, MA, USA, November 2016), pp. 73–74.

[30] MASHIMA, D., AND AHAMAD, M. Enabling robust information accountability in e-healthcare systems. In HealthSec (2012).

[31] PAPANIKOLAOU, N., AND PEARSON, S. A cross-disciplinary review of the concept of accountability: A survey of the literature, 2013. Available online at http://www.bic-trust.eu/files/2013/06/Paper_NP.pdf.

[32] PEARSON, S. Toward accountability in the cloud. IEEE Internet Computing 15, 4 (2011), 64–69.

[33] RIVEST, R. L., SHAMIR, A., AND TAUMAN, Y. How to leak a secret. In Advances in Cryptology - ASIACRYPT 2001, 7th International Conference on the Theory and Application of Cryptology and Information Security, Gold Coast, Australia, December 9-13, 2001, Proceedings (2001), C. Boyd, Ed., vol. 2248 of Lecture Notes in Computer Science, Springer, pp. 552–565.

[34] SCIPR LAB. libsnark: a C++ library for zkSNARK proofs. https://github.com/scipr-lab/libsnark.

[35] SEGAL, A., FEIGENBAUM, J., AND FORD, B. Privacy-preserving lawful contact chaining [preliminary report]. In Proceedings of the 2016 ACM Workshop on Privacy in the Electronic Society (New York, NY, USA, 2016), WPES '16, ACM, pp. 185–188.

[36] SEGAL, A., FORD, B., AND FEIGENBAUM, J. Catching bandits and only bandits: Privacy-preserving intersection warrants for lawful surveillance. In FOCI (2014).

[37] SMITH, S. W. Kudzu in the courthouse: Judgments made in the shade. The Federal Courts Law Review 3, 2 (2009).

[38] SMITH, S. W. Gagged, sealed & delivered: Reforming ECPA's secret docket. Harvard Law & Policy Review 6 (2012), 313–459.

[39] SUNDARESWARAN, S., SQUICCIARINI, A., AND LIN, D. Ensuring distributed accountability for data sharing in the cloud. IEEE Transactions on Dependable and Secure Computing 9, 4 (2012), 556–568.

[40] TAN, Y. S., KO, R. K., AND HOLMES, G. Security and data accountability in distributed systems: A provenance survey. In High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing (HPCC EUC), 2013 IEEE 10th International Conference on (2013), IEEE, pp. 1571–1578.

[41] VIRZA, M. Private communication, November 2017.

[42] WANG, X., RANELLUCCI, S., AND KATZ, J. Global-scale secure multiparty computation. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, CCS 2017, Dallas, TX, USA, October 30 - November 03, 2017 (2017), B. M. Thuraisingham, D. Evans, T. Malkin, and D. Xu, Eds., ACM, pp. 39–56.

[43] WEITZNER, D. J., ABELSON, H., BERNERS-LEE, T., FEIGENBAUM, J., HENDLER, J. A., AND SUSSMAN, G. J. Information accountability. Commun. ACM 51, 6 (2008), 82–87.

[44] XIAO, Z., KATHIRESSHAN, N., AND XIAO, Y. A survey of accountability in computer networks and distributed systems. Security and Communication Networks 9, 4 (2016), 290–315.


in Figure 1. First, a law enforcement agency presents a surveillance request to a federal judge (arrow 1). The judge can either approve or deny it. Should the judge approve the request, she signs an order authorizing the surveillance (arrow 2). A law enforcement agency then presents this order describing the data to be turned over to a company (arrow 3). The company either complies or contests the legal basis for the order with the judge (arrow 4). Should the company's challenge be accepted, the order could be narrowed (arrow 5) or eliminated; if not, the company turns over the requested data (arrow 6).

These court orders are the primary procedural marker that surveillance ever took place. They are often sealed, i.e., temporarily hidden from the public for a period of time after they are issued. In addition, companies are frequently gagged, i.e., banned from discussing the order with the target of the surveillance. These measures are vital for the investigative process: were a target to discover that she were being surveilled, she could change her behavior, endangering the underlying investigation.

According to Judge Stephen Smith, a federal magistrate judge whose role includes adjudicating requests for surveillance, gags and seals come at a cost. Openness of judicial proceedings has long been part of the common-law legal tradition, and court documents are presumed to be public by default. To Judge Smith, a court's public records are "the source of its own legitimacy" [37]. Judge Smith has noted several specific ways that gags and seals undermine the legal mechanisms meant to balance the powers of investigators and those investigated [37].

1. Indefinite sealing. Many sealed orders are ultimately forgotten by the courts which issued them, meaning ostensibly temporary seals become permanent in practice. To determine whether she was surveilled, a member of the public would have to somehow discover the existence of a sealed record, confirm the seal had expired, and request the record. Making matters worse, these records are scattered across innumerable courthouses nationwide.

2. Inadequate incentive and opportunity to appeal. Seals and gags make it impossible for a target to learn she is being surveilled, let alone contest or appeal the decision. Meanwhile, no other party has the incentive to appeal. Companies prefer to reduce compliance and legal costs by cooperating. A law enforcement agency would only consider appealing when a judge denies its request; however, Judge Smith explains that even then agencies often prefer not to "risk an appeal that could make 'bad law'" by creating precedent that makes surveillance harder in the future. As a result, judges who issue these orders have "literally no appellate guidance."

3. Inability to discern the extent of surveillance. Judge Smith laments that lack of data means "neither Congress nor the public can accurately assess the breadth and depth of current electronic surveillance activity" [38]. Several small efforts shed some light on this process: wiretap reports by the Administrative Office of the US Courts [9] and the aforementioned "transparency reports" by tech companies [7, 4]. These reports, while valuable, clarify only the faintest outlines of surveillance.

The net effect is that electronic surveillance laws are not subject to the usual process of challenge, critique, and modification that keeps the legal system operating within the bounds of constitutional principles. This lack of scrutiny ultimately reduces public trust; we lack answers to many basic questions. Does surveillance abide by legal and administrative rules? Do agencies present authorized requests to companies, and do companies return the minimum amount of data to comply? To a public concerned about the extent of surveillance, credible assurances would increase trust. To foreign governments that regulate cross-border dataflows, such assurances could determine whether companies have to drastically alter data management when operating abroad. Yet today, no infrastructure for making such assurances exists.

To remedy these concerns, Judge Smith proposes that each order be accompanied by a publicly available cover sheet containing general metadata about an order (e.g., kind of data searched, crimes suspected, length of the seal, reasons for sealing) [38]. The cover sheet would serve as a visible marker of sealed cases: when a seal expires, the public can hold the court accountable by requesting the sealed document. Moreover, the cover sheet metadata enables the public to compute aggregate statistics about surveillance, complementing the transparency reports released by the government and companies.

Designing the cover sheet involves balancing two competing instincts: (1) for law enforcement to conduct effective investigations, some information about surveillance must be hidden, and (2) public scrutiny can hold law enforcement accountable and prevent abuses of power. The primary design choice available is the amount of information to release.

Our contribution. As a simple sheet of paper, Judge Smith's proposal is inherently limited in its ability to promote public trust while maintaining secrecy. Inspired by Judge Smith's proposal, we demonstrate the accountability achievable when the power of modern cryptography is brought to bear. Cryptographic commitments can indicate the existence of a surveillance document without revealing its contents. Secure multiparty computation (MPC) can allow judges to compute aggregate statistics about all cases (information currently scattered across voluntary transparency reports) without revealing data about any particular case. Zero-knowledge arguments can demonstrate that a particular surveillance action (e.g., requesting data from a company) follows properly from a previous surveillance action (e.g., a judge's order) without revealing the contents of either item. All of this information is stored on an append-only ledger, giving the courts a way to release information and the public a definitive place to find it. Courts can post additional information to the ledger, from the date that a seal expires to the entirety of a cover sheet. Together, these primitives facilitate a flexible accountability strategy that can provide greater assurance to the public while protecting the secrecy of the investigative process.

To show the practicality of these techniques, we evaluate MPC and zero-knowledge protocols that amply scale to the size of the federal judiciary (footnote 1). To meet our efficiency requirements, we design a hierarchical MPC protocol that mirrors the structure of the federal court system. Our implementation supports sophisticated aggregate statistics (e.g., "how many judges ordered data from Google more than ten times?") and scales to hundreds of judges who may not stay online long enough to participate in a synchronized multiround protocol. We also implement succinct zero-knowledge arguments about the consistency of data held in different commitments; the legal system can tune the specificity of these statements in order to calibrate the amount of information released. Our implementations apply and extend the existing libraries WebMPC [16, 29] and Jiff [5] (for MPC) and LibSNARK [34] (for zero-knowledge). Our design is not coupled to these specific libraries, however; an analogous implementation could be developed based on any suitable MPC and SNARK libraries. Thus, our design can straightforwardly inherit efficiency improvements of future MPC and SNARK libraries.

Finally, we observe that the federal court system's accountability challenge is an instance of a broader class of secret information processes, where some information must be kept secret among participants (e.g., judges, law enforcement agencies, and companies) engaging in a protocol (e.g., surveillance, as in Figure 1), yet the propriety of the participants' interactions is of interest to an auditor (e.g., the public). After presenting our system as tailored to the case study of electronic surveillance, we describe a framework that generalizes our strategy to any accountability problem that can be framed as a secret information process. Concrete examples include clinical trials, public spending, and other surveillance regimes.

In summary, we design a novel system achieving public accountability for secret processes while leveraging off-the-shelf cryptographic primitives and libraries. We call the system "AUDIT," which can be read as an acronym for "Accountability of Unreleased Data for Improved Transparency." The design is adaptable to new legal requirements, new transparency goals, and entirely new applications within the realm of secret information processes.

Footnote 1: There are approximately 900 federal judges [10].

Roadmap. Section 2 discusses related work. Section 3 introduces our threat model and security goals. Section 4 introduces the system design of our accountability scheme for the court system, and Section 5 presents detailed protocol algorithms. Sections 6 and 7 discuss the implementation and performance of hierarchical MPC and succinct zero knowledge. Section 8 generalizes our framework to a range of scenarios beyond electronic surveillance, and Section 9 concludes.

    2 Related Work

Accountability. The term accountability has many definitions. [21] categorizes technical definitions of accountability according to the timing of interventions, information used to assess actions, and response to violations; [20] further formalizes these ideas. [31] surveys definitions from both computer science and law; [44] surveys definitions specific to distributed systems and the cloud.

In the terminology of these surveys, our focus is on detection ("The system facilitates detection of a violation" [21]) and responsibility ("Did the organization follow the rules?" [31]). Our additional challenge is that we consider protocols that occur in secret. Other accountability definitions consider how "violations [are] tied to punishment" [21, 28]; we defer this question to the legal system and consider it beyond the scope of this work. Unlike [32], which advocates for "prospective" accountability measures like access control, our view of accountability is entirely retrospective.

Implementations of accountability in settings where remote computers handle data (e.g., the cloud [32, 39, 40] and healthcare [30]) typically follow the transparency-centric blueprint of information accountability [43]: remote actors record their actions and make logs available for scrutiny by an auditor (e.g., a user). In our setting (electronic surveillance), we strive to release as little information as possible subject to accountability goals, meaning complete transparency is not a solution.

Cryptography and government surveillance. Kroll, Felten, and Boneh [27] also consider electronic surveillance, but focus on cryptographically ensuring that participants only have access to data when legally authorized. Such access control is orthogonal to our work. Their system includes an audit log that records all surveillance actions; much of their logged data is encrypted with a "secret escrow key." In contrast, motivated by concerns articulated directly by the legal community, we focus exclusively on accountability and develop a nuanced framework for public release of controlled amounts of information to address a general class of accountability problems, of which electronic surveillance is one instance.

Bates et al. [12] consider adding accountability to court-sanctioned wiretaps, in which law enforcement agencies can request phone call content. They encrypt duplicates of all wiretapped data in a fashion only accessible by courts and other auditors, and keep logs thereof such that they can later be analyzed for aggregate statistics or compared with law enforcement records. A key difference between [12] and our system is that our design enables the public to directly verify the propriety of surveillance activities, partially in real time.

Goldwasser and Park [23] focus on a different legal application: secret laws in the context of the Foreign Intelligence Surveillance Act (FISA) [3], where the operations of the court applying the law are secret. Succinct zero-knowledge is used to certify consistency of recorded actions with unknown judicial actions. While our work and [23] are similar in motivation and share some cryptographic tools, Goldwasser and Park address a different application. Moreover, our paper differs in its implementations demonstrating practicality and its consideration of aggregate statistics. Unlike this work, [23] does not model parties in the role of companies.

Other research that suggests applying cryptography to enforce rules governing access-control aspects of surveillance includes [25], which enforces privacy for NSA telephony metadata surveillance; [36], which uses private set intersection for surveillance involving joins over large databases; and [35], which uses the same technique for searching communication graphs.

Efficient MPC and SNARKs. LibSNARK [34] is the primary existing implementation of SNARKs (other libraries are in active development [1, 6]). More numerous implementation efforts have been made for MPC under a range of assumptions and adversary models, e.g., [16, 29, 5, 11, 42, 19]. The idea of placing most of the workload of MPC on a subset of parties has been explored before (e.g., constant-round protocols by [18, 24]); we build upon this literature by designing a hierarchically structured MPC protocol specifically to match the hierarchy of the existing US court system.

    3 Threat Model and Security Goals

Our high-level policy goals are to hold the electronic surveillance process accountable to the public by (1) demonstrating that each participant performs its role properly and stays within the bounds of the law, and (2) ensuring that the public is aware of the general extent of government surveillance. The accountability measures we propose place checks on the behavior of judges, law enforcement agencies, and companies. Such checks are important against oversight as well as malice, as these participants can misbehave in a number of ways. For example, as Judge Smith explains, forgetful judges may lose track of orders whose seals have expired. More maliciously, in 2016 a Brooklyn prosecutor was arrested for "spy[ing] on [a] love interest" and "forg[ing] judges' signatures to keep the eavesdropping scheme running for about a year" [22].

Our goal is to achieve public accountability even in the face of unreliable and untrustworthy participants. Next, we specify our threat model for each type of participant in the system and enumerate the security goals that, if met, will make it possible to maintain accountability under this threat model.

3.1 Threat model

Our threat model considers the three parties presented in Figure 1 (judges, law enforcement agencies, and companies) along with the public. Their roles and the assumptions we make about each are described below. We assume all parties are computationally bounded.

Judges. Judges consider requests for surveillance and issue court orders that allow law enforcement agencies to request data from companies. We must consider judges in the context of the courts in which they operate, which include staff members and possibly other judges. We consider courts to be honest-but-curious: they will adhere to the designated protocols, but should not be able to learn internal information about the workings of other courts. Although one might argue that the judges themselves can be trusted with this information, we do not trust their staffs. Hereon, we use the terms "judge" and "court" interchangeably to refer to an entire courthouse.

In addition, when it comes to sealed orders, judges may be forgetful: as Judge Smith observes, judges frequently fail to unseal orders when the seals have expired [38].

Law enforcement agencies. Law enforcement agencies make requests for surveillance to judges in the context of ongoing investigations. If these requests are approved and a judge issues a court order, a law enforcement agency may request data from the relevant companies. We model law enforcement agencies as malicious: e.g., they may forge or alter court orders in order to gain access to unauthorized information (as in the case of the Brooklyn prosecutor [22]).

Companies. Companies possess the data that law enforcement agencies may request if they hold a court order. Companies may optionally contest these orders, and if the order is upheld, must supply the relevant data to the law enforcement agency. We model companies as malicious: e.g., they might wish to contribute to unauthorized surveillance while maintaining the outside appearance that they are not. Specifically, although companies currently release aggregate statistics about their involvement in the surveillance process [4, 7], our system does not rely on their honesty in reporting these numbers. Other malicious behavior might include colluding with law enforcement to release more data than a court order allows, or furnishing data in the absence of a court order.

The public. We model the public as malicious, as the public may include criminals who wish to learn as much as possible about the surveillance process in order to avoid being caught (footnote 2).

Remark 3.1. Our system requires the parties involved in surveillance to post information to a shared ledger at various points in the surveillance process. Correspondence between logged and real-world events is an aspect of any log-based record-keeping scheme that cannot be enforced using technological means alone. Our system is designed to encourage parties to log honestly or report dishonest logging they observe (see Remark 4.1). Our analysis focuses on the cryptographic guarantees provided by the system, however, rather than a rigorous game-theoretic analysis of incentive-based behavior. Most of this paper therefore assumes that surveillance orders and other logged events are recorded correctly, except where otherwise noted.

3.2 Security Goals

In order to achieve accountability in light of this threat model, our system will need to satisfy three high-level security goals.

Accountability to the public. The system must reveal enough information to the public that members of the public are able to verify that all surveillance is conducted properly according to publicly known rules, and specifically that law enforcement agencies and companies (which we model as malicious) do not deviate from their expected roles in the surveillance process. The public must also have enough information to prompt courts to unseal records at the appropriate times.

Correctness. All of the information that our system computes and reveals must be correct. The aggregate statistics it computes and releases to the public must accurately reflect the state of electronic surveillance. Any assurances that our system makes to the public about the (im)propriety of the electronic surveillance process must be reported accurately.

Footnote 2: By placing all data on an immutable public ledger and giving the public no role in our system besides that of observer, we effectively reduce the public to a passive adversary.

Confidentiality. The public must not learn information that could undermine the investigative process. None of the other parties (courts, law enforcement agencies, and companies) may learn any information beyond that which they already know in the current ECPA process and that which is released to the public.

For particularly sensitive applications, the confidentiality guarantee should be perfect (information-theoretic): this means confidentiality should hold unconditionally, even against arbitrarily powerful adversaries that may be computationally unbounded (footnote 3). A perfect confidentiality guarantee would be of particular importance in contexts where unauthorized breaks of confidentiality could have catastrophic consequences (such as national security). We envision that a truly unconditional confidentiality guarantee could catalyze the consideration of accountability systems in contexts involving very sensitive information, where decision-makers are traditionally risk-averse, such as the court system.

    4 System Design

We present the design of our proposed system for accountability in electronic surveillance. Section 4.1 informally introduces four cryptographic primitives and their security guarantees (footnote 4). Section 4.2 outlines the configuration of the system: where data is stored and processed. Section 4.3 describes the workflow of the system in relation to the surveillance process summarized in Figure 1. Section 4.4 discusses the packages of design choices available to the court system, exploiting the flexibility of the cryptographic tools to offer a range of options that trade off between secrecy and accountability.

Footnote 3: This is in contrast to computational confidentiality guarantees, which provide confidentiality only against adversaries that are efficient, or computationally bounded. Even with the latter, weaker type of guarantee, it is possible to ensure confidentiality against any adversary with computing power within the realistically foreseeable future; computational guarantees are quite common in practice and widely considered acceptable for many applications. One reason to opt for computational guarantees over information-theoretic ones is that typically information-theoretic guarantees carry some loss in efficiency; however, this benefit may be outweighed in particularly sensitive applications, or when confidentiality is desirable for a very long-term future where advances in computing power are not foreseeable.

Footnote 4: For rigorous formal definitions of these cryptographic primitives, we refer to any standard cryptography textbook (e.g., [26]).

4.1 Cryptographic Tools

    Append-only ledgers An append-only ledger is a logcontaining an ordered sequence of data consistently visi-ble to anyone (within a designated system) and to whichdata may be appended over time but whose contents maynot be edited or deleted The append-only nature of theledger is key for the maintenance of a globally consistentand tamper-proof data record over time

    In our system the ledger records credibly time-stamped information about surveillance events Typi-cally data stored on the ledger will cryptographicallyhide some sensitive information about a surveillanceevent while revealing select other information about itfor the sake of accountability Placing information on theledger is one means by which we reveal information tothe public facilitating the security goal of accountabilityfrom Section 3

Cryptographic commitments. A cryptographic commitment c is a string generated from some input data D, which has the properties of hiding and binding; i.e., c reveals no information about the value of D, and yet D can be revealed or "opened" (by the person who created the commitment) in such a way that any observer can be sure that D is the data with respect to which the commitment was made. We refer to D as the content of c.

    In our system commitments indicate that a piece ofinformation (eg a court order) exists and that its con-tent can credibly be opened at a later time Posting com-mitments to the ledger also establishes the existence of apiece of information at a given point in time Returningto the security goals from Section 3 commitments makeit possible to reveal a limited amount of information earlyon (achieving a degree of accountability) without com-promising investigative secrecy (achieving confidential-ity) Later when confidentiality is no longer necessaryand information can be revealed (ie a seal on an orderexpires) then the commitment can be opened by its cre-ator to achieve full accountability

Commitments can be perfectly (information-theoretically) hiding, achieving the perfect confidentiality goal of Section 3.2. A well-known commitment scheme that is perfectly hiding is the Pedersen commitment.5
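As a concrete illustration, the following sketch shows the shape of a Pedersen-style commitment; the modulus and generators are toy placeholders chosen for readability, not values suitable for real use.

    # Illustrative Pedersen-style commitment sketch (not from the paper's code).
    # The group parameters here are placeholders; a real deployment would use a
    # standard prime-order group with independently derived generators.
    import secrets

    p = 2**127 - 1          # toy prime modulus (placeholder)
    q = p - 1               # exponents reduced mod q (illustrative simplification)
    g, h = 3, 7             # assumed generators with unknown discrete-log relation

    def commit(m, omega=None):
        # Commit(m; omega) = g^m * h^omega mod p
        if omega is None:
            omega = secrets.randbelow(q)
        return (pow(g, m % q, p) * pow(h, omega, p)) % p, omega

    def open_commitment(c, m, omega):
        # Accept iff c was formed from (m, omega)
        return c == (pow(g, m % q, p) * pow(h, omega, p)) % p

    c, omega = commit(42)
    assert open_commitment(c, 42, omega)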

    Zero-knowledge A zero-knowledge argument6 allowsa prover P to convince a verifier V of a fact without

    5While the Pedersen commitment is not succinct we note that bycombining succinct commitments with perfectly hiding commitments(as also suggested by [23]) it is possible to obtain a commitment thatis both succinct and perfectly hiding

6 Zero-knowledge proof is a more commonly used term than zero-knowledge argument. The two terms denote very similar concepts; the difference lies only in the nature of the soundness guarantee (i.e., that false statements cannot be convincingly attested to), which is computational for arguments and statistical for proofs.

revealing any additional information about the fact in the process of doing so. P can provide to V a tuple (R, x, π) consisting of a binary relation R, an input x, and a proof π such that the verifier is convinced that ∃w s.t. (x, w) ∈ R, yet cannot infer anything about the witness w. Three properties are required of zero-knowledge arguments: completeness, that any true statement can be proven by the honest algorithm P such that V accepts the proof; soundness, that no purported proof of a false statement (produced by any algorithm P*) should be accepted by the honest verifier V; and zero-knowledge, that the proof π reveals no information beyond what can be inferred just from the desired statement that (x, w) ∈ R.
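Stated slightly more formally (a standard presentation of these properties, included for reference rather than taken verbatim from this paper):

    % Completeness, soundness, and zero knowledge for a relation R with language L_R.
    \begin{itemize}
      \item Completeness: for every $(x,w) \in R$,
            $\Pr[\mathsf{V}(x, \mathsf{P}(x,w)) = \mathsf{accept}] = 1$.
      \item Soundness: for every $x \notin L_R$ and every efficient prover $\mathsf{P}^*$,
            $\Pr[\mathsf{V}(x, \mathsf{P}^*(x)) = \mathsf{accept}]$ is negligible.
      \item Zero knowledge: there is a simulator $\mathsf{Sim}$ such that for every $(x,w) \in R$,
            the distribution $\mathsf{Sim}(x)$ is indistinguishable from the proofs produced by $\mathsf{P}(x,w)$.
    \end{itemize}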

    In our system zero-knowledge makes it possible to re-veal how secret information relates to a system of rulesor to other pieces of secret information without revealingany further information Concretely our implementation(detailed in Section 7) allows law enforcement to attest(1) knowledge of the content of a commitment c (eg toan email address in a request for data made by a law en-forcement agency) demonstrating the ability to later openc and (2) that the content of a commitment c is equal tothe content of a prior commitment cprime (eg to an email ad-dress in a court order issued by a judge) In case even (2)reveals too much information our implementation sup-ports not specifying cprime exactly and instead attesting thatcprime lies in a given set S (eg S could include all judgesrsquosurveillance authorizations from the last month)

In the terms of our security goals from Section 3, zero-knowledge arguments can demonstrate to the public that commitments can be opened and that proper relationships between committed information are preserved (accountability), without revealing any further information about the surveillance process (confidentiality). If these arguments fail, the public can detect when a participant has deviated from the process (accountability).

The SNARK construction [15] that we suggest for use in our system achieves perfect (information-theoretic) confidentiality, a goal stated in Section 3.2.7

Secure multiparty computation (MPC). MPC allows a set of n parties p1, ..., pn, each in possession of private data x1, ..., xn, to jointly compute the output of a function y = f(x1, ..., xn) on their private inputs; y is computed via an interactive protocol executed by the parties.

Secure MPC makes two guarantees: correctness and secrecy. Correctness means that the output y is equal to f(x1, ..., xn). Secrecy means that any adversary that corrupts some subset S ⊂ {p1, ..., pn} of the parties learns nothing about {xi : pi ∉ S} beyond what can already be

    7In fact [15] states their secrecy guarantee in a computational (notinformation-theoretic) form but their unmodified construction doesachieve perfect secrecy and the proofs of [15] suffice unchanged toprove the stronger definition [41] That perfect zero-knowledge can beachieved is also remarked in the appendix of [14]

Figure 2: System configuration. Participants (rectangles) read and write to a public ledger (cloud) and local storage (ovals): judges store court orders, law enforcement stores surveillance requests, and companies store user data. The public (diamond) reads from the ledger.

inferred given the adversarial inputs {xi : pi ∈ S} and the output y. Secrecy is formalized by stipulating that a simulator that is given only ({xi : pi ∈ S}, y) as input must be able to produce a "simulated" protocol transcript that is indistinguishable from the actual protocol execution run with all the real inputs (x1, ..., xn).

    In our system MPC enables computation of aggregatestatistics about the extent of surveillance across the en-tire court system through a computation among individ-ual judges MPC eliminates the need to pool the sensitivedata of individual judges in the clear or to defer to com-panies to compute and release this information piece-meal In the terms of our security goals MPC revealsinformation to the public (accountability) from a sourcewe trust to follow the protocol honestly (the courts) with-out revealing the internal workings of courts to one an-other (confidentiality) It also eliminates the need to relyon potentially malicious companies to reveal this infor-mation themselves (correctness)

Secret sharing. Secret sharing facilitates our hierarchical MPC protocol. A secret sharing of some input data D consists of a set of strings (D1, ..., DN), called shares, satisfying two properties: (1) any subset of N − 1 shares reveals no information about D, and (2) given all N shares, D can easily be reconstructed.8
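A minimal sketch of the N-out-of-N additive scheme described above (our own illustration; the modulus is an arbitrary choice large enough for the statistics being shared):

    # Additive N-out-of-N secret sharing over the integers mod Q (illustrative).
    import secrets

    Q = 2**61 - 1   # assumed public modulus, comfortably larger than any statistic

    def share(value, n):
        # Any n-1 shares are uniformly random and reveal nothing about `value`.
        shares = [secrets.randbelow(Q) for _ in range(n - 1)]
        shares.append((value - sum(shares)) % Q)
        return shares

    def reconstruct(shares):
        # All n shares together recover the value.
        return sum(shares) % Q

    assert reconstruct(share(273, 12)) == 273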

    Summary In summary these cryptographic tools sup-port three high-level properties that we utilize to achieveour security goals

    1 Trusted records of events The append-only ledgerand cryptographic commitments create a trustwor-thy record of surveillance events without revealingsensitive information to the public

    2 Demonstration of compliance Zero-knowledge ar-guments allow parties to provably assure the public

8 For simplicity, we have described so-called "N-out-of-N" secret sharing. More generally, secret sharing can guarantee that any subset of k ≤ N shares enables reconstruction, while any subset of at most k − 1 shares reveals nothing about D.

    that relevant rules have been followed without re-vealing any secret information

    3 Transparency without handling secrets MPC en-ables the court system to accurately compute and re-lease aggregate statistics about surveillance eventswithout ever sharing the sensitive information of in-dividual parties

4.2 System Configuration

Our system is centered around a publicly visible append-only ledger where the various entities involved in the electronic surveillance process can post information. As depicted in Figure 2, every judge, law enforcement agency, and company contributes data to this ledger. Judges post cryptographic commitments to all orders issued. Law enforcement agencies post commitments to their activities (warrant requests to judges and data requests to companies) and zero-knowledge arguments about the requests they issue. Companies do the same for the data they deliver to agencies. Members of the public can view and verify all data posted to the ledger.

Each judge, law enforcement agency, and company will need to maintain a small amount of infrastructure: a computer terminal through which to compose posts, and local storage (the ovals in Figure 2) to store sensitive information (e.g., the content of sealed court orders). To attest to the authenticity of posts to the ledger, each participant will need to maintain a private signing key and publicize a corresponding verification key. We assume that public-key infrastructure could be established by a reputable party like the Administrative Office of the US Courts.
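For instance, authenticating a ledger post could look roughly like the following sketch, here using Ed25519 signatures via the pyca/cryptography library (one possible instantiation; the system does not prescribe a particular signature scheme, and the post content is hypothetical):

    # Sketch of signing and verifying a ledger post (illustrative instantiation).
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
    from cryptography.exceptions import InvalidSignature

    signing_key = Ed25519PrivateKey.generate()        # kept in the judge's local storage
    verification_key = signing_key.public_key()       # publicized via the court PKI

    post = b"commitment: 9f2c...; seal-expiry: 2019-06-01"   # hypothetical ledger entry
    signature = signing_key.sign(post)

    try:
        verification_key.verify(signature, post)      # raises on an invalid signature
        print("post accepted onto the ledger")
    except InvalidSignature:
        print("post rejected")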

    The ledger itself could be maintained as a distributedsystem among the participants in the process a dis-tributed system among a more exclusive group of partic-ipants with higher trustworthiness (eg the circuit courtsof appeals) or by a single entity (eg the AdministrativeOffice of the US Courts or the Supreme Court)

4.3 Workflow

Posting to the ledger. The workflow of our system augments the electronic surveillance workflow in Figure 1 with additional information posted to the ledger, as depicted in Figure 3. When a judge issues an order (step 2 of Figure 1), she also posts a commitment to the order and additional metadata about the case. At a minimum, this metadata must include the date that the order's seal expires; depending on the system configuration, she could post other metadata (e.g., Judge Smith's cover sheet). The commitment allows the public to later verify that the order was properly unsealed, but reveals no information about the commitment's content in the

Figure 3: Data posted to the public ledger as the protocol runs (commitment to order and case metadata from the judge; commitment to data request and request ZK argument from law enforcement; commitment to data response and response ZK argument from the company). Time moves from left to right; each rectangle is a post to the ledger. Dashed arrows between rectangles indicate that the source of the arrow could contain a visible reference to the destination. The ovals contain the entities that make each post.

meantime, achieving a degree of accountability in the short term (while confidentiality is necessary) and full accountability in the long term (when the seal expires and confidentiality is unnecessary). Since judges are honest-but-curious, they will adhere to this protocol and reliably post commitments whenever new court orders are issued.

    The agency then uses this order to request data from acompany (step 3 in Figure 1) and posts a commitment tothis request alongside a zero-knowledge argument thatthe request is compatible with a court order (and pos-sibly also with other legal requirements) This com-mitment which may never be opened provides a smallamount of accountability within the confines of confiden-tiality revealing that some law enforcement action tookplace The zero-knowledge argument takes accountabil-ity a step further it demonstrates to the public that thelaw enforcement action was compatible with the originalcourt order (which we trust to have been committed prop-erly) forcing the potentially-malicious law enforcementagency to adhere to the protocol or make public its non-compliance (Failure to adhere would result in a publiclyvisible invalid zero-knowledge argument) If the com-pany responds with matching data (step 6 in Figure 1) itposts a commitment to its response and an argument thatit furnished (only) the data implicated by the order anddata request These commitments and arguments serve arole analogous to those posted by law enforcement

    This system does not require commitments to all ac-tions in Figure 1 For example it only requires a law en-forcement agency to commit to a successful request fordata (step 3) rather than any proposed request (step 1)The system could easily be augmented with additionalcommitments and proofs as desired by the court system

    The zero-knowledge arguments about relationships

    between commitments reveal one additional piece of in-formation For a law enforcement agency to prove thatits committed data request is compatible with a particu-lar court order it must reveal which specific committedcourt order authorized the request In other words thezero-knowledge arguments reveal the links between spe-cific actions of each party (dashed arrows in Figure 3)These links could be eliminated reducing visibility intothe workflow of surveillance Instead entities would ar-gue that their actions are compatible with some court or-der among a group of recent orders

Remark 4.1. We now briefly discuss other possible malicious behaviors by law enforcement agencies and companies involving inaccurate logging of data. Though, as mentioned in Remark 3.1, correspondence between real-world events and logged items is not enforceable by technological means alone, we informally argue that our design incentivizes honest logging, and reporting of dishonest logging, under many circumstances.

    A malicious law enforcement agency could omit com-mitments or commit to one surveillance request but sendthe company a different request This action is visible tothe company which could reveal this misbehavior to thejudge This visibility incentivizes companies to recordtheir actions diligently so as to avoid any appearance ofnegligence let alone complicity in the agencyrsquos misbe-havior

    Similarly a malicious company might fail to post acommitment or post a commitment inconsistent with itsactual behavior These actions are visible to law en-forcement agencies who could report violations to thejudge (and otherwise risk the appearance of negligenceor complicity) To make such violations visible to thepublic we could add a second law enforcement com-mitment that acknowledges the data received and provesthat it is compatible with the original court order andlaw enforcement request

    However even incentive-based arguments do not ad-dress the case of a malicious law enforcement agencycolluding with a malicious company These entitiescould simply withhold from posting any information tothe ledger (or post a sequence of false but consistent in-formation) thereby making it impossible to detect viola-tions To handle this scenario we have to defer to thelegal process itself when this data is used as evidencein court a judge should ensure that appropriate docu-mentation was posted to the ledger and that the data wasgathered appropriately

    Aggregate statistics At configurable intervals the in-dividual courts use MPC to compute aggregate statisticsabout their surveillance activities9 An analyst such as

    9Microsoft [7] and Google [4] currently release their transparency

Figure 4: The flow of data as aggregate statistics are computed. Each lower-court judge calculates her component of the statistic and secret-shares it into 12 shares, one for each judicial circuit (DC Circuit, 1st Circuit, ..., 11th Circuit; illustrated by colors in the original figure). The servers of the circuit courts then engage in an MPC to compute the aggregate statistic from the input shares.

the Administrative Office of the US Courts, receives the result of this MPC and posts it to the ledger. The particular kinds of aggregate statistics computed are at the discretion of the court system. They could include figures already tabulated in the Administrative Office of the US Courts' Wiretap Reports [9] (i.e., orders by state and by criminal offense) and in company-issued transparency reports [4, 7] (i.e., requests and number of users implicated, by company). Due to the generality10 of MPC, it is theoretically possible to compute any function of the information known to each of the judges. For performance reasons, we restrict our focus to totals and aggregated thresholds, a set of operations expressive enough to replicate existing transparency reports.

    The statistics themselves are calculated using MPCIn principle the hundreds of magistrate and district courtjudges could attempt to directly perform MPC with eachother However as we find in Section 6 computingeven simple functions among hundreds of parties is pro-hibitively slow Moreover the logistics of getting everyjudge online simultaneously with enough reliability tocomplete a multiround protocol would be difficult if asingle judge went offline the protocol would stall

    Instead we compute aggregate statistics in a hierar-chical manner as depicted in Figure 4 We exploit theexisting hierarchy of the federal court system Eachof the lower-court judges is under the jurisdiction ofone of twelve circuit courts of appeals Each lower-

    reports every six months and the Administrative Office of the USCourts does so annually [9] We take these intervals to be our base-line for the frequency with which aggregate statistics would be re-leased in our system although releasing statistics more frequently (egmonthly) would improve transparency

    10General MPC is a common term used to describe MPC that cancompute arbitrary functions of the participantsrsquo data as opposed to justrestricted classes of functions

court judge computes her individual component of the larger aggregate statistic (e.g., the number of orders issued against Google in the past six months) and divides it into twelve secret shares, sending one share to (a server controlled by) each circuit court of appeals. Distinct shares are represented by separate colors in Figure 4. So long as at least one circuit server remains uncompromised, the lower-court judges can be assured, by the security of the secret-sharing scheme, that their contributions to the larger statistic are confidential. The circuit servers engage in a twelve-party MPC that reconstructs the judges' input data from the shares, computes the desired function, and reveals the result to the analyst. By concentrating the computationally intensive and logistically demanding part of the MPC process in twelve stable servers, this design eliminates many of the performance and reliability challenges of the flat (non-hierarchical) protocol (Section 6 discusses performance).
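The following sketch illustrates this data flow in the clear (a simplification: the circuit servers appear as local lists, and the final recombination step, which in the real system runs inside the twelve-party MPC, is shown directly):

    # Hierarchical aggregation sketch: judges secret-share counts to 12 circuits.
    import secrets

    Q = 2**61 - 1
    N_CIRCUITS = 12

    def share(value, n=N_CIRCUITS):
        shares = [secrets.randbelow(Q) for _ in range(n - 1)]
        shares.append((value - sum(shares)) % Q)
        return shares

    judge_counts = [3, 0, 7, 1, 4]                       # hypothetical per-judge totals
    shares_per_judge = [share(c) for c in judge_counts]

    # Each circuit server locally sums the shares it received from every judge;
    # individual judges' counts remain hidden from any single circuit.
    circuit_partials = [sum(s[i] for s in shares_per_judge) % Q
                        for i in range(N_CIRCUITS)]

    # Combining the twelve partial sums (inside MPC in the real system)
    # reveals only the aggregate statistic.
    assert sum(circuit_partials) % Q == sum(judge_counts)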

This MPC strategy allows the court system to compute aggregate statistics (towards the accountability goal of Section 3.2). Since courts are honest-but-curious, and by the correctness guarantee of MPC, these statistics will be computed accurately on correct data (correctness of Section 3.2). MPC enables the courts to perform these computations without revealing any court's internal information to any other court (confidentiality of Section 3.2).

4.4 Additional Design Choices

The preceding section described our proposed system with its full range of accountability features. This configuration is only one of many possibilities. Although cryptography makes it possible to release information in a controlled way, the fact remains that revealing more information poses greater risks to investigative integrity. Depending on the court system's level of risk-tolerance, features can be modified or removed entirely to adjust the amount of information disclosed.

Cover sheet metadata. A judge might reasonably fear that a careful criminal could monitor cover sheet metadata to detect surveillance. At the cost of some transparency, judges could post less metadata when committing to an order. (At a minimum, the judge must post the date at which the seal expires.) The cover sheets integral to Judge Smith's proposal were also designed to supply certain information towards assessing the scale of surveillance; MPC replicates this outcome without releasing information about individual orders.

    Commitments by individual judges In some casesposting a commitment might reveal too much In a low-crime area mere knowledge that a particular judge ap-proved surveillance could spur a criminal organizationto change its behavior A number of approaches would

    address this concern Judges could delegate the responsi-bility of posting to the ledger to the same judicial circuitsthat mediate the hierarchical MPC Alternatively eachjudge could continue posting to the ledger herself butinstead of signing the commitment under her own nameshe could sign it as coming from some court in her judi-cial circuit or nationwide without revealing which one(group signatures [17] or ring signatures [33] are de-signed for this sort of anonymous signing within groups)Either of these approaches would conceal which individ-ual judge approved the surveillance

    Aggregate statistics The aggregate statistic mechanismoffers a wide range of choices about the data to be re-vealed For example if the court system is concernedabout revealing information about individual districtsstatistics could be aggregated by any number of other pa-rameters including the type of crime being investigatedor the company from which the data was requested

5 Protocol Definition

We now define a complete protocol capturing the workflow from Section 4. We assume a public-key infrastructure and synchronous communication on authenticated (encrypted) point-to-point channels.

Preliminaries. The protocol is parametrized by:
• a secret-sharing scheme Share,
• a commitment scheme C,
• a special type of zero-knowledge primitive, SNARK,
• a multi-party computation protocol MPC, and
• a function CoverSheet that maps court orders to cover sheet information.

Several parties participate in the protocol:
• n judges J1, ..., Jn;
• m law enforcement agencies A1, ..., Am;
• q companies C1, ..., Cq;
• r trustees T1, ..., Tr;11
• P, a party representing the public;
• Ledger, a party representing the public ledger; and
• Env, a party called "the environment", which models the occurrence over time of exogenous events.

Ledger is a simple ideal functionality allowing any party to (1) append entries to a time-stamped, append-only ledger and (2) retrieve ledger entries. Entries are authenticated except where explicitly anonymous.

Env is a modeling device that specifies the protocol behavior in the context of arbitrary exogenous event sequences occurring over time. Upon receipt of message

11 In our specific case study, r = 12 and the trustees are the twelve US Circuit Courts of Appeals. The trustees are the parties which participate in the multi-party computation of aggregate statistics based on input data from all judges, as shown in Figure 4 and defined formally later in this subsection.

clock, Env responds with the current time. To model the occurrence of an exogenous event e (e.g., a case in need of surveillance), Env sends information about e to the affected parties (e.g., a law enforcement agency).

Next we give the syntax of our cryptographic tools12 and then define the behavior of the remaining parties.

A commitment scheme is a triple of probabilistic polynomial-time algorithms C = (Setup, Commit, Open), as follows:
• Setup(1^κ) takes as input a security parameter κ (in unary) and outputs public parameters pp.
• Commit(pp, m, ω) takes as input pp, a message m, and randomness ω. It outputs a commitment c.
• Open(pp, m′, c, ω′) takes as input pp, a message m′, and randomness ω′. It outputs 1 if c = Commit(pp, m′, ω′), and 0 otherwise.

pp is generated in an initial setup phase and is thereafter publicly known to all parties, so we elide it for brevity.
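A minimal hash-based instantiation of this interface, mirroring the implementation choice described in Section 7.2 (SHA256 over the message with 256 bits of appended randomness); Setup is trivial here, and the sketch is illustrative rather than the system's actual code:

    # Hash-based (Setup, Commit, Open) sketch; illustrative only.
    import hashlib, secrets

    def setup():
        return None                                   # no public parameters needed

    def commit(pp, m, omega=None):
        if omega is None:
            omega = secrets.token_bytes(32)           # 256-bit randomness, as in Sec. 7.2
        return hashlib.sha256(m + omega).digest(), omega

    def open_commitment(pp, m_prime, c, omega_prime):
        # Outputs 1 iff c = Commit(pp, m_prime, omega_prime)
        return 1 if hashlib.sha256(m_prime + omega_prime).digest() == c else 0

    pp = setup()
    c, omega = commit(pp, b"order no. 1234 (hypothetical)")
    assert open_commitment(pp, b"order no. 1234 (hypothetical)", c, omega) == 1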

Algorithm 1: Law enforcement agency Ai

• On receipt of a surveillance request event e = (Surveil, u, s) from Env, where u is the public key of a company and s is the description of a surveillance request directed at u: send message (u, s) to a judge.13

• On receipt of a decision message (u, s, d) from a judge, where d ≠ reject:14 (1) generate a commitment c = Commit((s, d), ω) to the request and store (c, s, d, ω) locally; (2) generate a SNARK proof π attesting compliance of (s, d) with relevant regulations; (3) post (c, π) to the ledger; (4) send request (s, d, ω) to company u.

• On receipt of an audit request (c, P, z) from the public: generate decision b ← A^dp_i(c, P, z). If b = accept, generate a SNARK proof π attesting compliance of (s, d) with the regulations indicated by the audit request (c, P, z); else send (c, P, z, b) to a judge.13

• On receipt of an audit order (d, c, P, z) from a judge: if d = accept, generate a SNARK proof π attesting compliance of (s, d) with the regulations indicated by the audit request (c, P, z).

Agencies. Each agency Ai has an associated decision-making process A^dp_i, modeled by a stateful algorithm that maps audit requests to {accept} ∪ {0,1}*, where the output is either an acceptance or a description of why the agency chooses to deny the request. Each agency

    12For formal security definitions beyond syntax we refer to anystandard cryptography textbook such as [26]

    13For the purposes of our exposition this could be an arbitrary judgeIn practice it would likely depend on the jurisdiction in which thesurveillance event occurs and in which the law enforcement agencyoperates and perhaps also on the type of case

14 For simplicity of exposition, Algorithm 1 only addresses the case d ≠ reject and omits the possibility of appeal by the agency. The algorithm could straightforwardly be extended to encompass appeals by incorporating the decision to appeal into A^dp_i.

15 This is the step invoked by requests for unsealed documents.

Algorithm 2: Judge Ji

• On receipt of a surveillance request (u, s) from an agency Aj: (1) generate decision d ← J^dp1_i(s); (2) send response (u, s, d) to Aj; (3) generate a commitment c = Commit((u, s, d), ω) to the decision and store (c, u, s, d, ω) locally; (4) post (CoverSheet(d), c) to the ledger.

• On receipt of denied audit request information ζ from an agency Aj: generate decision d ← J^dp2_i(ζ) and send (d, ζ) to Aj and to the public P.

• On receipt of a data revelation request (c, z) from the public:15 generate decision b ← J^dp3_i(c, z). If b = accept, send to the public P the message and randomness (m, ω) corresponding to c; else if b = reject, send reject to P, with an accompanying explanation if provided.

operates according to Algorithm 1, which is parametrized by its own A^dp_i. In practice, we assume A^dp_i would be instantiated by the agency's human decision-making process.

Judges. Each judge Ji has three associated decision-making processes, J^dp1_i, J^dp2_i, and J^dp3_i. J^dp1_i maps surveillance requests to either a rejection or an authorizing court order; J^dp2_i maps denied audit requests to either a confirmation of the denial or a court order overturning the denial; and J^dp3_i maps data revelation requests to either an acceptance or a denial (perhaps along with an explanation of the denial, e.g., "this document is still under seal"). Each judge operates according to Algorithm 2, which is parametrized by the individual judge's (J^dp1_i, J^dp2_i, J^dp3_i).

Algorithm 3: Company Ci

• Upon receiving a surveillance request (s, d, ω) from an agency Aj: if the court order d bears the valid signature of a judge and Commit((s, d), ω) matches a corresponding commitment posted by law enforcement on the ledger, then (1) generate commitment c ← Commit(δ, ω) and store (c, δ, ω) locally; (2) generate a SNARK proof π attesting that δ is compliant with s and the judge-signed order d; (3) post (c, π) anonymously to the ledger; (4) reply to Aj by furnishing the requested data δ along with ω.

The public. The public P exhibits one main type of behavior in our model: upon receiving an event message e = (a, ξ) from Env (describing either an audit request or a data revelation request), P sends ξ to a (an agency or court). Additionally, the public periodically checks the ledger for validity of posted SNARK proofs and takes steps to flag any non-compliance detected (e.g., through the court system or the news media).

Companies and trustees. Algorithms 3 and 4 describe companies and trustees. Companies execute judge-

Algorithm 4: Trustee Ti

• Upon receiving an aggregate statistic event message e = (Compute, f, D1, ..., Dn) from Env:

1. For each i′ ∈ [r] (such that i′ ≠ i), send e to Ti′.
2. For each j ∈ [n], send the message (f, Dj) to Jj. Let δj,i be the response from Jj.
3. With parties T1, ..., Tr, participate in the MPC protocol MPC with input (δ1,i, ..., δn,i) to compute the functionality that applies f to the output of ReconInputs, where ReconInputs is defined as follows:

   ReconInputs((δ1,i′, ..., δn,i′)i′∈[r]) = (Recon(δj,1, ..., δj,r))j∈[n]

   Let y denote the output from the MPC.16
4. Send y to Jj for each j ∈ [n].17

• Upon receiving an MPC initiation message e = (Compute, f, D1, ..., Dn) from another trustee Ti′:

1. Receive a secret share δj,i from each judge Jj, respectively.
2. With parties T1, ..., Tr, participate in the MPC protocol MPC with input (δ1,i, ..., δn,i) to compute the functionality that applies f to the output of ReconInputs.

authorized instructions and log their actions by posting commitments on the ledger. Trustees run MPC to compute aggregate statistics from data provided in secret-shared form by judges. MPC events are triggered by Env.

    6 Evaluation of MPC Implementation

    In our proposal judges use secure multiparty computa-tion (MPC) to compute aggregate statistics about the ex-tent and distribution of surveillance Although in princi-ple MPC can support secure computation of any func-tion of the judgesrsquo data full generality can come withunacceptable performance limitations In order that ourprotocols scale to hundreds of federal judges we narrowour attention to two kinds of functions that are particu-larly useful in the context of surveillance

    The extent of surveillance (totals) Computing totalsinvolves summing values held by the parties withoutrevealing information about any value to anyone otherthan its owner Totals become averages by dividing bythe number of data points In the context of electronicsurveillance totals are the most prevalent form of com-putation on government and corporate transparency re-ports How many court orders were approved for casesinvolving homicide and how many for drug offensesHow long was the average order in effect How manyorders were issued in California [9] Totals make it pos-sible to determine the extent of surveillance

The distribution of surveillance (thresholds). Thresholding involves determining the number of data points that exceed a given cut-off. How many courts issued more than ten orders for data from Google? How many orders were in effect for more than 90 days? Unlike totals, thresholds can reveal selected facts about the distribution of surveillance, i.e., the circumstances in which it is most prevalent. Thresholds go beyond the kinds of questions typically answered in transparency reports, offering new opportunities to improve accountability.
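To make the two statistic types concrete, the following sketch computes them in the clear on hypothetical per-judge counts; in the deployed system the same arithmetic runs inside MPC so that no individual judge's value is revealed:

    # Totals and thresholds, shown in the clear for illustration only.
    orders_per_judge = [12, 0, 3, 27, 8, 15]        # hypothetical inputs

    total = sum(orders_per_judge)                   # extent of surveillance
    average = total / len(orders_per_judge)

    cutoff = 10                                     # example cut-off
    count_above = sum(1 for v in orders_per_judge if v > cutoff)   # how many exceed it
    all_above = all(v > cutoff for v in orders_per_judge)          # do all exceed it

    print(total, average, count_above, all_above)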

    To enable totals and thresholds to scale to the size ofthe federal court system we implemented a hierarchi-cal MPC protocol as described in Figure 4 whose designmirrors the hierarchy of the court system Our evaluationshows the hierarchical structure reduces MPC complex-ity from quadratic in the number of judges to linear

    We implemented protocols that make use of totals andthresholds using two existing JavaScript-based MPC li-braries WebMPC [16 29] and Jiff [5] WebMPC is thesimpler and less versatile library we test it as a baselineand as a ldquosanity checkrdquo that its performance scales as ex-pected then move on to the more interesting experimentof evaluating Jiff We opted for JavaScript libraries to fa-cilitate integration into a web application which is suit-able for federal judges to submit information through afamiliar browser interface regardless of the differencesin their local system setups Both of these libraries aredesigned to facilitate MPC across dozens or hundredsof computers we simulated this effect by running eachparty in a separate process on a computer with 16 CPUcores and 64GB of RAM We tested these protocols onrandomly generated data containing values in the hun-dreds which reflects the same order of magnitude as datapresent in existing transparency reports Our implemen-tations were crafted with two design goals in mind

    1 Protocols should scale to roughly 1000 partiesthe approximate size of the federal judiciary [10]performing efficiently enough to facilitate periodictransparency reports

    2 Protocols should not require all parties to be onlineregularly or at the same time

    In the subsections that follow we describe and evaluateour implementations in light of these goals

6.1 Computing Totals in WebMPC

    WebMPC is a JavaScript-based library that can securelycompute sums in a single round The underlying proto-col relies on two parties who are trusted not to colludewith one another an analyst who distributes masking in-formation to all protocol participants at the beginning ofthe process and receives the final aggregate statistic anda facilitator who aggregates this information together in

    Figure 5 Performance of MPC using WebMPC library

masked form. The participants use the masking information from the analyst to mask their inputs and send them to the facilitator, who aggregates them and sends the result (i.e., a masked sum) to the analyst. The analyst removes the mask and uncovers the aggregated result. Once the participants have their masks, they receive no further messages from any other party; they can submit this masked data to the facilitator in an uncoordinated fashion and go offline immediately afterwards. Even if some anticipated participants do not send data, the protocol can still run to completion with those who remain.
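A sketch of the masking idea behind this one-round protocol (our own illustration of the approach, not WebMPC's code):

    # One-round masked summation sketch (illustrative, not WebMPC's implementation).
    import secrets

    Q = 2**61 - 1
    inputs = [5, 2, 9]                                   # hypothetical judges' values

    # Analyst: distribute one random mask per participant, remember only their sum.
    masks = [secrets.randbelow(Q) for _ in inputs]
    analyst_mask_total = sum(masks) % Q

    # Participants: each sends only its masked value to the facilitator, then may go offline.
    masked = [(x + m) % Q for x, m in zip(inputs, masks)]

    # Facilitator: sees only masked values and forwards their sum to the analyst.
    masked_sum = sum(masked) % Q

    # Analyst: removes the aggregate mask, learning only the total.
    assert (masked_sum - analyst_mask_total) % Q == sum(inputs)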

    To make this protocol feasible in our setting we needto identify a facilitator and analyst who will not colludeIn many circumstances it would be acceptable to rely onreputable institutions already present in the court systemsuch as the circuit courts of appeals the Supreme Courtor the Administrative Office of the US Courts

Although this protocol's simplicity limits its generality, it also makes it possible for the protocol to scale efficiently to a large number of participants. As Figure 5 illustrates, the protocol scales linearly with the number of parties. Even with 400 parties, the largest size we tested, the protocol still completed in just under 75 seconds. Extrapolating from the linear trend, it would take about three minutes to compute a summation across the entire federal judiciary. Since existing transparency reports are typically released just once or twice a year, it is reasonable to invest three minutes of computation (or less than a fifth of a second per judge) for each statistic.

6.2 Thresholds and Hierarchy with Jiff

To make use of MPC operations beyond totals, we turned to Jiff, another MPC library implemented in JavaScript. Jiff is designed to support MPC for arbitrary functionalities, although inbuilt support for some more complex functionalities is still under development at the time of writing. Most importantly for our needs, Jiff supports thresholding and multiplication in addition to sums. We evaluated Jiff on three different MPC protocols: totals (as with WebMPC), additive thresholding (i.e., how many values exceeded a specific threshold), and multiplicative thresholding (i.e., did all values exceed a specific threshold). In contrast to computing totals via summation, certain operations like thresholding require more compli-

Figure 6: Flat total (red), additive threshold (blue), and multiplicative threshold (green) protocols in Jiff.

Figure 7: Hierarchical total (red), additive threshold (blue), and multiplicative threshold (green) protocols in Jiff. Note the difference in axis scales from Figure 6.

    cated computation and multiple rounds of communica-tion By building on Jiff with our hierarchical MPC im-plementation we demonstrate that these operations areviable at the scale required by the federal court system

    As a baseline we ran sums additive thresholding andmultiplicative thresholding benchmarks with all judgesas full participants in the MPC protocol sharing theworkload equally a configuration we term the flat pro-tocol (in contrast to the hierarchical protocol we presentnext) Figure 6 illustrates that the running time of theseprotocols grows quadratically with the number of judgesparticipating These running times quickly became un-tenable While summation took several minutes amonghundreds of judges both thresholding benchmarks couldbarely handle tens of judges in the same time envelopesThese graphs illustrate the substantial performance dis-parity between summation and thresholding

    In Section 4 we described an alternative ldquohierarchi-calrdquo MPC configuration to reduce this quadratic growthto linear As depicted in Figure 4 each lower-court judgesplits a piece of data into twelve secret shares one foreach circuit court of appeals These shares are sent to thecorresponding courts who conduct a twelve-party MPCthat performs a total or thresholding operation based onthe input shares If n lower-court judges participatethe protocol is tantamount to computing n twelve-partysummations followed by a single n-input summation orthreshold As n increases the amount of work scales lin-early So long as at least one circuit court remains honestand uncompromised the secrecy of the lower court dataendures by the security of the secret-sharing scheme

Figure 7 illustrates the linear scaling of the twelve-party portion of the hierarchical protocols; we measured only the computation time after the circuit courts received all of the additive shares from the lower courts. While the flat summation protocol took nearly eight minutes to run on 300 judges, the hierarchical summation scaled to 1000 judges in less than 20 seconds, besting even the WebMPC results. Although thresholding characteristically remained much slower than summation, the hierarchical protocol scaled to nearly 250 judges in about the same amount of time that it took the flat protocol to run on 35 judges. Since the running times for the threshold protocols were in the tens of minutes for large benchmarks, the linear trend is noisier than for the total protocol. Most importantly, both of these protocols scaled linearly, meaning that, given sufficient time, thresholding could scale up to the size of the federal court system. This performance is acceptable if a few choice thresholds are computed at the frequency at which existing transparency reports are published.18

    One additional benefit of the hierarchical protocols isthat lower courts do not need to stay online while theprotocol is executing a goal we articulated at the begin-ning of this section A lower court simply needs to sendin its shares to the requisite circuit courts one messageper circuit court to a grand total of twelve messages af-ter which it is free to disconnect In contrast the flatprotocol grinds to a halt if even a single judge goes of-fline The availability of the hierarchical protocol relieson a small set of circuit courts who could invest in morerobust infrastructure

    7 Evaluation of SNARKs

We define the syntax of preprocessing zero-knowledge SNARKs for arithmetic circuit satisfiability [15].

A SNARK is a triple of probabilistic polynomial-time algorithms SNARK = (Setup, Prove, Verify), as follows:

• Setup(1^κ, R) takes as input the security parameter κ and a description of a binary relation R (an arithmetic circuit of size polynomial in κ), and outputs a pair (pk_R, vk_R) of a proving key and verification key.

• Prove(pk_R, (x, w)) takes as input a proving key pk_R and an input-witness pair (x, w), and outputs a proof π attesting to x ∈ L_R, where L_R = {x : ∃w s.t. (x, w) ∈ R}.

• Verify(vk_R, (x, π)) takes as input a verification key vk_R and an input-proof pair (x, π), and outputs a bit indicating whether π is a valid proof for x ∈ L_R.

18 Too high a frequency is also inadvisable, due to the possibility of revealing too-granular information when combined with the timings of specific investigations or court orders.

    Before participants can create and verify SNARKsthey must establish a proving key which any partici-pant can use to create a SNARK and a correspondingverification key which any participant can use to verifya SNARK so created Both of these keys are publiclyknown The keys are distinct for each circuit (represent-ing an NP relation) about which proofs are generatedand can be reused to produce as many different proofswith respect to that circuit as desired Key generationuses randomness that if known or biased could allowparticipants to create proofs of false statements [13] Thekey generation process must therefore protect and thendestroy this information

    Using MPC to do key generation based on randomnessprovided by many different parties provides the guaran-tee that as long as at least one of the MPC participants be-haved correctly (ie did not bias his randomness and de-stroyed it afterward) the resulting keys are good (ie donot permit proofs of false statements) This approach hasbeen used in the past most notably by the cryptocurrencyZcash [8] Despite the strong guarantees provided by thisapproach to key generation when at least one party is notcorrupted concerns have been expressed about the wis-dom of trusting in the assumption of one honest party inthe Zcash setting which involves large monetary valuesand a system design inherently centered around the prin-ciples of full decentralization

    For our system we propose key generation be donein a one-time MPC among several of the tradition-ally reputable institutions in the court system such asthe Supreme Court or Administrative Office of the USCourts ideally together with other reputable parties fromdifferent branches of government In our setting the useof MPC for SNARK key generation does not constituteas pivotal and potentially risky a trust assumption as inZcash in that the court system is close-knit and inher-ently built with the assumption of trustworthiness of cer-tain entities within the system In contrast a decentral-ized cryptocurrency (1) must due to its distributed na-ture rely for key generation on MPC participants that areessentially strangers to most others in the system and (2)could be said to derive its very purpose from not relyingon the trustworthiness of any small set of parties

    We note that since key generation is a one-time taskfor each circuit we can tolerate a relatively performance-intensive process Proving and verification keys can bedistributed on the ledger

7.1 Argument Types

Our implementation supports three types of arguments.

Argument of knowledge for a commitment (Pk). Our simplest type of argument attests the prover's knowledge of the content of a given commitment c, i.e., that

she could open the commitment if required. Whenever a party publishes a commitment, she can accompany it with a SNARK attesting that she knows the message and randomness that were used to generate the commitment. Formally, this is an argument that the prover knows m and ω that correspond to a publicly known c such that Open(m, c, ω) = 1.

Argument of commitment equality (Peq). Our second type of argument attests that the content of two publicly known commitments c1, c2 is the same. That is, for two publicly known commitments c1 and c2, the prover knows m1, m2, ω1, and ω2 such that Open(m1, c1, ω1) = 1 ∧ Open(m2, c2, ω2) = 1 ∧ m1 = m2.

More concretely, suppose that an agency wishes to release relational information: that the identifier (e.g., email address) in the request is the same identifier that a judge approved. The judge and law enforcement agency post commitments c1 and c2, respectively, to the identifiers they used. The law enforcement agency then posts an argument attesting that the two commitments are to the same value.19 Since circuits use fixed-size inputs, an argument implicitly reveals the length of the committed message. To hide this information, the law enforcement agency can pad each input up to a uniform length.

Peq may be too revealing under certain circumstances: for the public to verify the argument, the agency (who posted c2) must explicitly identify c1, potentially revealing which judge authorized the data request and when.

Existential argument of commitment equality (P∃). Our third type of commitment allows decreasing the resolution of the information revealed, by proving that a commitment's content is the same as that of some other commitment among many. Formally, it shows that for publicly known commitments c, c1, ..., cN, respectively to secret values (m, ω), (m1, ω1), ..., (mN, ωN), ∃i such that Open(m, c, ω) = 1 ∧ Open(mi, ci, ωi) = 1 ∧ m = mi. We treat i as an additional secret input, so that for any value of N only two commitments need to be opened. This scheme trades off between resolution (number of commitments) and efficiency, a question we explore below.
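In relation form, the statements attested by the three argument types can be summarized as follows (a restatement of the prose definitions above):

    % Public inputs are the commitments; everything existentially quantified is witness.
    \begin{align*}
    P_k:\quad       & \exists\, (m,\omega)\ \text{s.t.}\ \mathsf{Open}(m,c,\omega)=1\\
    P_{eq}:\quad    & \exists\, (m_1,\omega_1,m_2,\omega_2)\ \text{s.t.}\
                      \mathsf{Open}(m_1,c_1,\omega_1)=1 \wedge \mathsf{Open}(m_2,c_2,\omega_2)=1 \wedge m_1=m_2\\
    P_{\exists}:\quad & \exists\, (m,\omega,i,\omega_i)\ \text{s.t.}\
                      \mathsf{Open}(m,c,\omega)=1 \wedge \mathsf{Open}(m,c_i,\omega_i)=1
    \end{align*}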

    We have chosen these three types of arguments to im-plement but LibSNARK supports arbitrary predicates inprinciple and there are likely others that would be usefuland run efficiently in practice A useful generalizationof Peq and Pexist would be to replace equality with more so-phisticated domain-specific predicates instead of show-ing that messages m1m2 corresponding to a pair of com-

    19To produce a proof for Peq the prover (eg the agency) needs toknow both ω2 and ω1 but in some cases c1 (and thus ω1) may havebeen produced by a different entity (eg the judge) Publicizing ω1 isunacceptable as it compromises the hiding of the commitment contentTo solve this problem the judge can include ω1 alongside m1 in secretdocuments that both parties possess (eg the court order)

mitments are equal, one could show p(m1, m2) = 1 for other predicates p (e.g., "less-than" or "signed by same court"). The types of arguments that can be implemented efficiently will expand as SNARK libraries' efficiency improves; our system inherits such efficiency gains.

7.2 Implementation

We implemented these zero-knowledge arguments with LibSNARK [34], a C++ library for creating general-purpose SNARKs from arithmetic circuits. We implemented commitments using the SHA256 hash function;20 ω is a 256-bit random string appended to the message before it is hashed. In this section we show that useful statements can be proven within a reasonable performance envelope. We consider six criteria: the size of the proving key, the size of the verification key, the size of the proof statement, the time to generate keys, the time to create proofs, and the time to verify proofs. We evaluated these metrics with messages from 16 to 1232 bytes on Pk, Peq, and P∃ (N = 100, 400, 700, and 1000, large enough to obscure links between commitments), on a computer with 16 CPU cores and 64GB of RAM.

    Argument size The argument is just 287 bytes Accom-panying each argument are its public inputs (in this casecommitments) Each commitment is 256 bits21 An au-ditor needs to store these commitments anyway as part ofthe ledger and each commitment can be stored just onceand reused for many proofs

    Verification key size The size of the verification keyis proportional to the size of the circuit and its publicinputs The key was 106KB for Pk (one commitmentas input and one SHA256 circuit) and 2083KB for Peq(two commitments and two SHA256 circuits) AlthoughPexist computes SHA256 just twice its smallest input 100commitments is 50 times as large as that of Pk and Peqthe keys are correspondingly larger and grow linearlywith the input size For 100 400 700 and 1000 com-mitments the verification keys were respectively 10MB41MB 71MB and 102MB Since only one verificationkey is necessary for each circuit these keys are easilysmall enough to make large-scale verification feasible

    Proving key size The proving keys are much largerin the hundreds of megabytes Their size grows linearlywith the size of the circuit so longer messages (whichrequire more SHA256 computations) more complicatedcircuits and (for Pexist) more inputs lead to larger keys Fig-ure 8a reflects this trend Proving keys are largest for Pexist

    20Certain other hash functions may be more amenable to representa-tion as arithmetic circuits and thus more ldquoSNARK-friendlyrdquo We optedfor a proof of concept with SHA256 as it is so widely used

    21LibSNARK stores each bit in a 32-bit integer so an argument in-volving k commitments takes about 1024k bytes A bit-vector repre-sentation would save a factor of 32

with 1000 inputs on 1232-byte messages, and shrink as the message size and the number of commitments decrease. Pk and Peq, which have simpler circuits, still have bigger proving keys for bigger messages. Although these keys are large, only entities that create each kind of proof need to store the corresponding key. Storing one key for each type of argument we have presented takes only about 1GB at the largest input sizes.

    Key generation time Key generation time increasedlinearly with the size of the keys from a few secondsfor Pk and Peq on small messages to a few minutes for Pexiston the largest parameters (Figure 8b) Since key genera-tion is a one-time process to add a new kind of proof inthe form of a circuit we find these numbers acceptable

    Argument generation time Argument generation timeincreased linearly with proving key size and ranged froma few seconds on the smallest keys to a couple of minutesfor largest (Figure 8c) Since argument generation is aone-time task for each surveillance action and the exist-ing administrative processes for each surveillance actionoften take hours or days we find this cost acceptable

    Argument verification time Verifying Pk and Peq on thelargest message took only a few milliseconds Verifica-tion times for Pexist were larger and increased linearly withthe number of input commitments For 100 400 700and 1000 commitments verification took 40ms 85ms243ms and 338ms on the largest input These times arestill fast enough to verify many arguments quickly

    8 Generalization

    Our proposal can be generalized beyond ECPA surveil-lance to encompass a broader class of secret informationprocesses Consider situations in which independent in-stitutions need to act in a coordinated but secret fash-ion and at the same time are subject to public scrutinyThey should be able to convince the public that their ac-tions are consistent with relevant rules As in electronicsurveillance accountability requires the ability to attestto compliance without revealing sensitive information

    Example 1 (FISA court) Accountability is needed inother electronic surveillance arenas The US Foreign In-telligence Surveillance Act (FISA) regulates surveillancein national security investigations Because of the sensi-tive interests at stake the entire process is overseen by aUS court that meets in secret The tension between se-crecy and public accountability is even sharper for theFISA court much of the data collected under FISA maystay permanently hidden inside US intelligence agencieswhile data collected under ECPA may eventually be usedin public criminal trials This opacity may be justifiedbut it has engendered skepticism The public has no way

Figure 8: SNARK evaluation. (a) Proving key size. (b) Key generation time. (c) Argument generation time.

of knowing what the court is doing, nor any means of assuring itself that the intelligence agencies under the authority of FISA are even complying with the rules of that court. The FISA court itself has voiced concern that it has no independent means of assessing compliance with its orders because of the extreme secrecy involved. Applying our proposal to the FISA court, both the court and the public could receive proofs of documented compliance with FISA orders, as well as aggregate statistics on the scope of FISA surveillance activity, to the full extent possible without incurring national security risk.

    Example 2 (Clinical trials) Accountability mecha-nisms are also important to assess behavior of privateparties eg in clinical trials for new drugs There aremany parties to clinical trials and much of the informa-tion involved is either private or proprietary Yet regula-tors and the public have a need to know that responsibletesting protocols are observed Our system can achievethe right balance of transparency accountability and re-spect for privacy of those involved in the trials

    Example 3 (Public fund spending) Accountability inspending of taxpayer money is naturally a subject of pub-lic interest Portions of public funds may be allocated forsensitive purposes (eg defenseintelligence) and theamounts and allocation thereof may be publicly unavail-able due to their sensitivity Our system would enablecredible public assurances that taxpayer money is beingspent in accordance with stated principles while preserv-ing secrecy of information considered sensitive

8.1 Generalized Framework

We present abstractions describing the generalized version of our system and briefly outline how the concrete examples fit into this framework. A secret information process includes the following components:

• A set of participants interact with each other. In our ECPA example, these are judges, law enforcement agencies, and companies.

• The participants engage in a protocol (e.g., to execute the procedures for conducting electronic surveillance). The protocol messages exchanged are hidden from the view of outsiders (e.g., the public), and yet it is of public interest that the protocol messages exchanged adhere to certain rules.

• A set of auditors (distinct from the participants) seeks to audit the protocol by verifying that a set of accountability properties are met.

    Abstractly our system allows the controlled disclosureof four types of information

    Existential information reveals the existence of a pieceof data be it in a participantrsquos local storage or the contentof a communication between participants In our casestudy existential information is revealed with commit-ments which indicate the existence of a document

Relational information describes the actions participants take in response to the actions of others. In our case study, relational information is represented by the zero-knowledge arguments that attest that actions were taken lawfully (e.g., in compliance with a judge's order).

    Content information is the data in storage and com-munication In our case study content information is re-vealed through aggregate statistics via MPC and whendocuments are unsealed and their contents made public

    Timing information is a by-product of the other infor-mation In our case study timing information could in-clude order issuance dates turnaround times for data re-quest fulfilment by companies and seal expiry dates

    Revealing combinations of these four types of infor-mation with the specified cryptographic tools providesthe flexibility to satisfy a range of application-specificaccountability properties as exemplified next

Example 1 (FISA court). Participants are the FISA Court judges, the agencies requesting surveillance authorization, and any service providers involved in facilitating said surveillance. The protocol encompasses the legal process required to authorize surveillance, together with the administrative steps that must be taken to enact surveillance. Auditors are the public, the judges themselves, and possibly Congress. Desirable accountability properties are similar to those in our ECPA case study: e.g., attestations that certain rules are being followed in issuing surveillance orders, and release of aggregate statistics on surveillance activities under FISA.

Example 2 (Clinical trials). Participants are the institutions (companies or research centers) conducting clinical trials, comprising scientists, ethics boards, and data analysts; the organizations that manage regulations regarding clinical trials, such as the National Institutes of Health (NIH) and the Food and Drug Administration (FDA) in the US; and hospitals and other sources through which trial participants are drawn. The protocol encompasses the administrative process required to approve a clinical trial and the procedure of gathering participants and conducting the trial itself. Auditors are the public, the regulatory organizations such as the NIH and the FDA, and possibly professional ethics committees. Desirable accountability properties include, e.g., attestations that appropriate procedures are respected in recruiting participants and administering trials, and release of aggregate statistics on clinical trial results without compromising individual participants' medical data.

Example 3 (Public fund spending). Participants are Congress (which appropriates the funding), defense/intelligence agencies, and service providers contracted in the spending of said funding. The protocol encompasses the processes by which Congress allocates funds to agencies and agencies allocate funds to particular expenses. Auditors are the public and Congress. Desirable accountability properties include, e.g., attestations that procurements were within reasonable margins of market prices and satisfied documented needs, and release of aggregate statistics on the proportion of allocated money used and broad spending categories.

    9 Conclusion

We present a cryptographic answer to the accountability challenge currently frustrating the US court system. Leveraging cryptographic commitments, zero-knowledge proofs, and secure MPC, we provide the electronic surveillance process with a set of scalable, flexible, and practical measures for improving accountability while maintaining secrecy. While we focus on the case study of electronic surveillance, these strategies are equally applicable to a range of other secret information processes requiring accountability to an outside auditor.

    Acknowledgements

We are grateful to Judge Stephen Smith for discussion and insights from the perspective of the US court system; to Andrei Lapets, Kinan Dak Albab, Rawane Issa, and Frederick Joossens for discussion on Jiff and WebMPC; and to Madars Virza for advice on SNARKs and libsnark.

This research was supported by the following grants: NSF MACS (CNS-1413920), DARPA IBM (W911NF-15-C-0236), Simons Investigator Award Agreement Dated June 5th, 2012, and the Center for Science of Information (CSoI), an NSF Science and Technology Center, under grant agreement CCF-0939370.

References

[1] Bellman. https://github.com/ebfull/bellman.

[2] Electronic Communications Privacy Act, 18 U.S.C. 2701 et seq.

[3] Foreign Intelligence Surveillance Act, 50 U.S.C. ch. 36.

[4] Google transparency report. https://www.google.com/transparencyreport/userdatarequests/countries/?p=2016-12.

[5] Jiff. https://github.com/multiparty/jiff.

[6] Jsnark. https://github.com/akosba/jsnark.

[7] Law enforcement requests report. https://www.microsoft.com/en-us/about/corporate-responsibility/lerr.

[8] Zcash. https://z.cash.

[9] Wiretap report 2015. http://www.uscourts.gov/statistics-reports/wiretap-report-2015, December 2015.

[10] ADMINISTRATIVE OFFICE OF THE COURTS. Authorized judgeships. http://www.uscourts.gov/sites/default/files/allauth.pdf.

[11] ARAKI, T., BARAK, A., FURUKAWA, J., LICHTER, T., LINDELL, Y., NOF, A., OHARA, K., WATZMAN, A., AND WEINSTEIN, O. Optimized honest-majority MPC for malicious adversaries - breaking the 1 billion-gate per second barrier. In 2017 IEEE Symposium on Security and Privacy, SP 2017, San Jose, CA, USA, May 22-26, 2017 (2017), IEEE Computer Society, pp. 843-862.

[12] BATES, A., BUTLER, K. R., SHERR, M., SHIELDS, C., TRAYNOR, P., AND WALLACH, D. Accountable wiretapping - or - I know they can hear you now. NDSS (2012).

[13] BEN-SASSON, E., CHIESA, A., GREEN, M., TROMER, E., AND VIRZA, M. Secure sampling of public parameters for succinct zero knowledge proofs. In Security and Privacy (SP), 2015 IEEE Symposium on (2015), IEEE, pp. 287-304.

[14] BEN-SASSON, E., CHIESA, A., TROMER, E., AND VIRZA, M. Scalable zero knowledge via cycles of elliptic curves. IACR Cryptology ePrint Archive 2014 (2014), 595.

[15] BEN-SASSON, E., CHIESA, A., TROMER, E., AND VIRZA, M. Succinct non-interactive zero knowledge for a von Neumann architecture. In 23rd USENIX Security Symposium (USENIX Security 14) (San Diego, CA, Aug. 2014), USENIX Association, pp. 781-796.

[16] BESTAVROS, A., LAPETS, A., AND VARIA, M. User-centric distributed solutions for privacy-preserving analytics. Communications of the ACM 60, 2 (February 2017), 37-39.

[17] CHAUM, D., AND VAN HEYST, E. Group signatures. In Advances in Cryptology - EUROCRYPT '91, Workshop on the Theory and Application of Cryptographic Techniques, Brighton, UK, April 8-11, 1991, Proceedings (1991), D. W. Davies, Ed., vol. 547 of Lecture Notes in Computer Science, Springer, pp. 257-265.

[18] DAMGÅRD, I., AND ISHAI, Y. Constant-round multiparty computation using a black-box pseudorandom generator. In Advances in Cryptology - CRYPTO 2005, 25th Annual International Cryptology Conference, Santa Barbara, California, USA, August 14-18, 2005, Proceedings (2005), V. Shoup, Ed., vol. 3621 of Lecture Notes in Computer Science, Springer, pp. 378-394.

[19] DAMGÅRD, I., KELLER, M., LARRAIA, E., PASTRO, V., SCHOLL, P., AND SMART, N. P. Practical covertly secure MPC for dishonest majority - or: Breaking the SPDZ limits. In Computer Security - ESORICS 2013 - 18th European Symposium on Research in Computer Security, Egham, UK, September 9-13, 2013, Proceedings (2013), J. Crampton, S. Jajodia, and K. Mayes, Eds., vol. 8134 of Lecture Notes in Computer Science, Springer, pp. 1-18.

[20] FEIGENBAUM, J., JAGGARD, A. D., AND WRIGHT, R. N. Open vs. closed systems for accountability. In Proceedings of the 2014 Symposium and Bootcamp on the Science of Security (2014), ACM, p. 4.

[21] FEIGENBAUM, J., JAGGARD, A. D., WRIGHT, R. N., AND XIAO, H. Systematizing "accountability" in computer science. Tech. rep., Yale University, Feb. 2012. Technical Report YALEU/DCS/TR-1452.

[22] FEUER, A., AND ROSENBERG, E. Brooklyn prosecutor accused of using illegal wiretap to spy on love interest. November 2016. https://www.nytimes.com/2016/11/28/nyregion/brooklyn-prosecutor-accused-of-using-illegal-wiretap-to-spy-on-love-interest.html.

[23] GOLDWASSER, S., AND PARK, S. Public accountability vs. secret laws: Can they coexist? A cryptographic proposal. In Proceedings of the 2017 Workshop on Privacy in the Electronic Society, Dallas, TX, USA, October 30 - November 3, 2017 (2017), B. M. Thuraisingham and A. J. Lee, Eds., ACM, pp. 99-110.

[24] JAKOBSEN, T. P., NIELSEN, J. B., AND ORLANDI, C. A framework for outsourcing of secure computation. In Proceedings of the 6th edition of the ACM Workshop on Cloud Computing Security, CCSW '14, Scottsdale, Arizona, USA, November 7, 2014 (2014), G. Ahn, A. Oprea, and R. Safavi-Naini, Eds., ACM, pp. 81-92.

[25] KAMARA, S. Restructuring the NSA metadata program. In International Conference on Financial Cryptography and Data Security (2014), Springer, pp. 235-247.

[26] KATZ, J., AND LINDELL, Y. Introduction to Modern Cryptography. CRC Press, 2014.

[27] KROLL, J., FELTEN, E., AND BONEH, D. Secure protocols for accountable warrant execution. 2014. http://www.cs.princeton.edu/~felten/warrant-paper.pdf.

[28] LAMPSON, B. Privacy and security: Usable security: How to get it. Communications of the ACM 52, 11 (2009), 25-27.

[29] LAPETS, A., VOLGUSHEV, N., BESTAVROS, A., JANSEN, F., AND VARIA, M. Secure MPC for analytics as a web application. In 2016 IEEE Cybersecurity Development (SecDev) (Boston, MA, USA, November 2016), pp. 73-74.

[30] MASHIMA, D., AND AHAMAD, M. Enabling robust information accountability in e-healthcare systems. In HealthSec (2012).

[31] PAPANIKOLAOU, N., AND PEARSON, S. A cross-disciplinary review of the concept of accountability: A survey of the literature. 2013. Available online at http://www.bic-trust.eu/files/2013/06/Paper_NP.pdf.

[32] PEARSON, S. Toward accountability in the cloud. IEEE Internet Computing 15, 4 (2011), 64-69.

[33] RIVEST, R. L., SHAMIR, A., AND TAUMAN, Y. How to leak a secret. In Advances in Cryptology - ASIACRYPT 2001, 7th International Conference on the Theory and Application of Cryptology and Information Security, Gold Coast, Australia, December 9-13, 2001, Proceedings (2001), C. Boyd, Ed., vol. 2248 of Lecture Notes in Computer Science, Springer, pp. 552-565.

[34] SCIPR LAB. libsnark: a C++ library for zkSNARK proofs. https://github.com/scipr-lab/libsnark.

[35] SEGAL, A., FEIGENBAUM, J., AND FORD, B. Privacy-preserving lawful contact chaining [preliminary report]. In Proceedings of the 2016 ACM Workshop on Privacy in the Electronic Society (New York, NY, USA, 2016), WPES '16, ACM, pp. 185-188.

[36] SEGAL, A., FORD, B., AND FEIGENBAUM, J. Catching bandits and only bandits: Privacy-preserving intersection warrants for lawful surveillance. In FOCI (2014).

[37] SMITH, S. W. Kudzu in the courthouse: Judgments made in the shade. The Federal Courts Law Review 3, 2 (2009).

[38] SMITH, S. W. Gagged, sealed & delivered: Reforming ECPA's secret docket. Harvard Law & Policy Review 6 (2012), 313-459.

[39] SUNDARESWARAN, S., SQUICCIARINI, A., AND LIN, D. Ensuring distributed accountability for data sharing in the cloud. IEEE Transactions on Dependable and Secure Computing 9, 4 (2012), 556-568.

[40] TAN, Y. S., KO, R. K., AND HOLMES, G. Security and data accountability in distributed systems: A provenance survey. In High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing (HPCC EUC), 2013 IEEE 10th International Conference on (2013), IEEE, pp. 1571-1578.

[41] VIRZA, M. Private communication, November 2017.

[42] WANG, X., RANELLUCCI, S., AND KATZ, J. Global-scale secure multiparty computation. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, CCS 2017, Dallas, TX, USA, October 30 - November 03, 2017 (2017), B. M. Thuraisingham, D. Evans, T. Malkin, and D. Xu, Eds., ACM, pp. 39-56.

[43] WEITZNER, D. J., ABELSON, H., BERNERS-LEE, T., FEIGENBAUM, J., HENDLER, J. A., AND SUSSMAN, G. J. Information accountability. Commun. ACM 51, 6 (2008), 82-87.

[44] XIAO, Z., KATHIRESSHAN, N., AND XIAO, Y. A survey of accountability in computer networks and distributed systems. Security and Communication Networks 9, 4 (2016), 290-315.


Zero-knowledge arguments can demonstrate that a particular surveillance action (e.g., requesting data from a company) follows properly from a previous surveillance action (e.g., a judge's order) without revealing the contents of either item. All of this information is stored on an append-only ledger, giving the courts a way to release information and the public a definitive place to find it. Courts can post additional information to the ledger, from the date that a seal expires to the entirety of a cover sheet. Together, these primitives facilitate a flexible accountability strategy that can provide greater assurance to the public while protecting the secrecy of the investigative process.

To show the practicality of these techniques, we evaluate MPC and zero-knowledge protocols that amply scale to the size of the federal judiciary.1 To meet our efficiency requirements, we design a hierarchical MPC protocol that mirrors the structure of the federal court system. Our implementation supports sophisticated aggregate statistics (e.g., "how many judges ordered data from Google more than ten times?") and scales to hundreds of judges who may not stay online long enough to participate in a synchronized multiround protocol. We also implement succinct zero-knowledge arguments about the consistency of data held in different commitments; the legal system can tune the specificity of these statements in order to calibrate the amount of information released. Our implementations apply and extend the existing libraries WebMPC [16, 29] and Jiff [5] (for MPC) and libsnark [34] (for zero-knowledge). Our design is not coupled to these specific libraries, however; an analogous implementation could be developed based on any suitable MPC and SNARK libraries. Thus, our design can straightforwardly inherit efficiency improvements of future MPC and SNARK libraries.

Finally, we observe that the federal court system's accountability challenge is an instance of a broader class of secret information processes, where some information must be kept secret among participants (e.g., judges, law enforcement agencies, and companies) engaging in a protocol (e.g., surveillance, as in Figure 1), yet the propriety of the participants' interactions is of interest to an auditor (e.g., the public). After presenting our system as tailored to the case study of electronic surveillance, we describe a framework that generalizes our strategy to any accountability problem that can be framed as a secret information process. Concrete examples include clinical trials, public spending, and other surveillance regimes.

In summary, we design a novel system achieving public accountability for secret processes while leveraging off-the-shelf cryptographic primitives and libraries. We call the system "AUDIT", which can be read as an acronym for "Accountability of Unreleased Data for Improved Transparency". The design is adaptable to new legal requirements, new transparency goals, and entirely new applications within the realm of secret information processes.

1 There are approximately 900 federal judges [10].

Roadmap. Section 2 discusses related work. Section 3 introduces our threat model and security goals. Section 4 introduces the system design of our accountability scheme for the court system, and Section 5 presents detailed protocol algorithms. Sections 6 and 7 discuss the implementation and performance of hierarchical MPC and succinct zero knowledge. Section 8 generalizes our framework to a range of scenarios beyond electronic surveillance, and Section 9 concludes.

      2 Related Work

Accountability. The term accountability has many definitions. [21] categorizes technical definitions of accountability according to the timing of interventions, information used to assess actions, and response to violations; [20] further formalizes these ideas. [31] surveys definitions from both computer science and law; [44] surveys definitions specific to distributed systems and the cloud.

In the terminology of these surveys, our focus is on detection ("The system facilitates detection of a violation" [21]) and responsibility ("Did the organization follow the rules?" [31]). Our additional challenge is that we consider protocols that occur in secret. Other accountability definitions consider how "violations [are] tied to punishment" [21, 28]; we defer this question to the legal system and consider it beyond the scope of this work. Unlike [32], which advocates for "prospective" accountability measures like access control, our view of accountability is entirely retrospective.

Implementations of accountability in settings where remote computers handle data (e.g., the cloud [32, 39, 40] and healthcare [30]) typically follow the transparency-centric blueprint of information accountability [43]: remote actors record their actions and make logs available for scrutiny by an auditor (e.g., a user). In our setting (electronic surveillance), we strive to release as little information as possible subject to accountability goals, meaning complete transparency is not a solution.

Cryptography and government surveillance. Kroll, Felten, and Boneh [27] also consider electronic surveillance but focus on cryptographically ensuring that participants only have access to data when legally authorized. Such access control is orthogonal to our work. Their system includes an audit log that records all surveillance actions; much of their logged data is encrypted with a "secret escrow key". In contrast, motivated by concerns articulated directly by the legal community, we focus exclusively on accountability and develop a nuanced framework for public release of controlled amounts of information, to address a general class of accountability problems of which electronic surveillance is one instance.

Bates et al. [12] consider adding accountability to court-sanctioned wiretaps, in which law enforcement agencies can request phone call content. They encrypt duplicates of all wiretapped data in a fashion only accessible by courts and other auditors, and keep logs thereof such that they can later be analyzed for aggregate statistics or compared with law enforcement records. A key difference between [12] and our system is that our design enables the public to directly verify the propriety of surveillance activities, partially in real time.

Goldwasser and Park [23] focus on a different legal application: secret laws, in the context of the Foreign Intelligence Surveillance Act (FISA) [3], where the operations of the court applying the law are secret. Succinct zero-knowledge is used to certify consistency of recorded actions with unknown judicial actions. While our work and [23] are similar in motivation and share some cryptographic tools, Goldwasser and Park address a different application. Moreover, our paper differs in its implementations demonstrating practicality and its consideration of aggregate statistics. Unlike this work, [23] does not model parties in the role of companies.

Other research that suggests applying cryptography to enforce rules governing access-control aspects of surveillance includes [25], which enforces privacy for NSA telephony metadata surveillance; [36], which uses private set intersection for surveillance involving joins over large databases; and [35], which uses the same technique for searching communication graphs.

Efficient MPC and SNARKs. libsnark [34] is the primary existing implementation of SNARKs. (Other libraries are in active development [1, 6].) More numerous implementation efforts have been made for MPC, under a range of assumptions and adversary models, e.g., [16, 29, 5, 11, 42, 19]. The idea of placing most of the workload of MPC on a subset of parties has been explored before (e.g., constant-round protocols by [18, 24]); we build upon this literature by designing a hierarchically structured MPC protocol specifically to match the hierarchy of the existing US court system.

      3 Threat Model and Security Goals

Our high-level policy goals are to hold the electronic surveillance process accountable to the public by (1) demonstrating that each participant performs its role properly and stays within the bounds of the law, and (2) ensuring that the public is aware of the general extent of government surveillance. The accountability measures we propose place checks on the behavior of judges, law enforcement agencies, and companies. Such checks are important for guarding against oversights as well as malice, as these participants can misbehave in a number of ways. For example, as Judge Smith explains, forgetful judges may lose track of orders whose seals have expired. More maliciously, in 2016 a Brooklyn prosecutor was arrested for "spy[ing] on [a] love interest" and "forg[ing] judges' signatures to keep the eavesdropping scheme running for about a year" [22].

Our goal is to achieve public accountability even in the face of unreliable and untrustworthy participants. Next, we specify our threat model for each type of participant in the system and enumerate the security goals that, if met, will make it possible to maintain accountability under this threat model.

3.1 Threat model

Our threat model considers the three parties presented in Figure 1 (judges, law enforcement agencies, and companies), along with the public. Their roles and the assumptions we make about each are described below. We assume all parties are computationally bounded.

Judges. Judges consider requests for surveillance and issue court orders that allow law enforcement agencies to request data from companies. We must consider judges in the context of the courts in which they operate, which include staff members and possibly other judges. We consider courts to be honest-but-curious: they will adhere to the designated protocols, but should not be able to learn internal information about the workings of other courts. Although one might argue that the judges themselves can be trusted with this information, we do not trust their staffs. Hereon, we use the terms "judge" and "court" interchangeably to refer to an entire courthouse.

In addition, when it comes to sealed orders, judges may be forgetful: as Judge Smith observes, judges frequently fail to unseal orders when the seals have expired [38].

Law enforcement agencies. Law enforcement agencies make requests for surveillance to judges in the context of ongoing investigations. If these requests are approved and a judge issues a court order, a law enforcement agency may request data from the relevant companies. We model law enforcement agencies as malicious; e.g., they may forge or alter court orders in order to gain access to unauthorized information (as in the case of the Brooklyn prosecutor [22]).

Companies. Companies possess the data that law enforcement agencies may request if they hold a court order. Companies may optionally contest these orders and, if the order is upheld, must supply the relevant data to the law enforcement agency. We model companies as malicious; e.g., they might wish to contribute to unauthorized surveillance while maintaining the outside appearance that they are not. Specifically, although companies currently release aggregate statistics about their involvement in the surveillance process [4, 7], our system does not rely on their honesty in reporting these numbers. Other malicious behavior might include colluding with law enforcement to release more data than a court order allows, or furnishing data in the absence of a court order.

The public. We model the public as malicious, as the public may include criminals who wish to learn as much as possible about the surveillance process in order to avoid being caught.2

Remark 3.1. Our system requires the parties involved in surveillance to post information to a shared ledger at various points in the surveillance process. Correspondence between logged and real-world events is an aspect of any log-based record-keeping scheme that cannot be enforced using technological means alone. Our system is designed to encourage parties to log honestly or report dishonest logging they observe (see Remark 4.1). Our analysis focuses on the cryptographic guarantees provided by the system, however, rather than a rigorous game-theoretic analysis of incentive-based behavior. Most of this paper therefore assumes that surveillance orders and other logged events are recorded correctly, except where otherwise noted.

3.2 Security Goals

In order to achieve accountability in light of this threat model, our system will need to satisfy three high-level security goals.

Accountability to the public. The system must reveal enough information to the public that members of the public are able to verify that all surveillance is conducted properly, according to publicly known rules, and specifically that law enforcement agencies and companies (which we model as malicious) do not deviate from their expected roles in the surveillance process. The public must also have enough information to prompt courts to unseal records at the appropriate times.

Correctness. All of the information that our system computes and reveals must be correct. The aggregate statistics it computes and releases to the public must accurately reflect the state of electronic surveillance. Any assurances that our system makes to the public about the (im)propriety of the electronic surveillance process must be reported accurately.

2 By placing all data on an immutable public ledger and giving the public no role in our system besides that of observer, we effectively reduce the public to a passive adversary.

Confidentiality. The public must not learn information that could undermine the investigative process. None of the other parties (courts, law enforcement agencies, and companies) may learn any information beyond that which they already know in the current ECPA process and that which is released to the public.

For particularly sensitive applications, the confidentiality guarantee should be perfect (information-theoretic): this means confidentiality should hold unconditionally, even against arbitrarily powerful adversaries that may be computationally unbounded.3 A perfect confidentiality guarantee would be of particular importance in contexts where unauthorized breaks of confidentiality could have catastrophic consequences (such as national security). We envision that a truly unconditional confidentiality guarantee could catalyze the consideration of accountability systems in contexts involving very sensitive information, where decision-makers are traditionally risk-averse, such as the court system.

      4 System Design

We present the design of our proposed system for accountability in electronic surveillance. Section 4.1 informally introduces four cryptographic primitives and their security guarantees.4 Section 4.2 outlines the configuration of the system: where data is stored and processed. Section 4.3 describes the workflow of the system in relation to the surveillance process summarized in Figure 1. Section 4.4 discusses the packages of design choices available to the court system, exploiting the flexibility of the cryptographic tools to offer a range of options that trade off between secrecy and accountability.

3 This is in contrast to computational confidentiality guarantees, which provide confidentiality only against adversaries that are efficient, or computationally bounded. Even with the latter, weaker type of guarantee, it is possible to ensure confidentiality against any adversary with computing power within the realistically foreseeable future; computational guarantees are quite common in practice and widely considered acceptable for many applications. One reason to opt for computational guarantees over information-theoretic ones is that typically information-theoretic guarantees carry some loss in efficiency; however, this benefit may be outweighed in particularly sensitive applications, or when confidentiality is desirable for a very long-term future where advances in computing power are not foreseeable.

4 For rigorous formal definitions of these cryptographic primitives, we refer to any standard cryptography textbook (e.g., [26]).

4.1 Cryptographic Tools

Append-only ledgers. An append-only ledger is a log containing an ordered sequence of data, consistently visible to anyone (within a designated system), to which data may be appended over time but whose contents may not be edited or deleted. The append-only nature of the ledger is key for the maintenance of a globally consistent and tamper-proof data record over time.

In our system, the ledger records credibly time-stamped information about surveillance events. Typically, data stored on the ledger will cryptographically hide some sensitive information about a surveillance event while revealing select other information about it for the sake of accountability. Placing information on the ledger is one means by which we reveal information to the public, facilitating the security goal of accountability from Section 3.

Cryptographic commitments. A cryptographic commitment c is a string generated from some input data D, which has the properties of hiding and binding: i.e., c reveals no information about the value of D, and yet D can be revealed or "opened" (by the person who created the commitment) in such a way that any observer can be sure that D is the data with respect to which the commitment was made. We refer to D as the content of c.

In our system, commitments indicate that a piece of information (e.g., a court order) exists and that its content can credibly be opened at a later time. Posting commitments to the ledger also establishes the existence of a piece of information at a given point in time. Returning to the security goals from Section 3, commitments make it possible to reveal a limited amount of information early on (achieving a degree of accountability) without compromising investigative secrecy (achieving confidentiality). Later, when confidentiality is no longer necessary and information can be revealed (i.e., a seal on an order expires), the commitment can be opened by its creator to achieve full accountability.

Commitments can be perfectly (information-theoretically) hiding, achieving the perfect confidentiality goal of Section 3.2. A well-known commitment scheme that is perfectly hiding is the Pedersen commitment.5
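As an illustration, here is a minimal sketch of a Pedersen-style commitment over a toy group; the constants are illustrative only, and a real deployment would use a standard prime-order elliptic-curve group.

```python
# A toy Pedersen-style commitment: c = g^m * h^omega mod p, hiding and binding
# as long as nobody knows log_g(h). Constants are illustrative only.
import secrets

p, q = 23, 11   # toy modulus and subgroup order (real parameters are far larger)
g, h = 4, 9     # two generators of the order-q subgroup, with unknown log_g(h)

def commit(m, omega=None):
    """Commit to message m with randomness omega."""
    if omega is None:
        omega = secrets.randbelow(q)
    return (pow(g, m % q, p) * pow(h, omega, p)) % p, omega

def open_commitment(c, m, omega):
    """Check that (m, omega) is a valid opening of c."""
    return c == (pow(g, m % q, p) * pow(h, omega, p)) % p

c, omega = commit(7)                      # e.g., a judge commits to a count
assert open_commitment(c, 7, omega)       # the honest opening verifies
assert not open_commitment(c, 8, omega)   # a different message does not
```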

Zero-knowledge. A zero-knowledge argument6 allows a prover P to convince a verifier V of a fact without revealing any additional information about the fact in the process of doing so. P can provide to V a tuple (R, x, π) consisting of a binary relation R, an input x, and a proof π such that the verifier is convinced that there exists a witness w such that (x, w) ∈ R, yet cannot infer anything about the witness w. Three properties are required of zero-knowledge arguments: completeness, that any true statement can be proven by the honest algorithm P such that V accepts the proof; soundness, that no purported proof of a false statement (produced by any algorithm P*) should be accepted by the honest verifier V; and zero-knowledge, that the proof π reveals no information beyond what can be inferred just from the desired statement that (x, w) ∈ R.

5 While the Pedersen commitment is not succinct, we note that by combining succinct commitments with perfectly hiding commitments (as also suggested by [23]), it is possible to obtain a commitment that is both succinct and perfectly hiding.

6 Zero-knowledge proof is a more commonly used term than zero-knowledge argument. The two terms denote very similar concepts; the difference lies only in the nature of the soundness guarantee (i.e., that false statements cannot be convincingly attested to), which is computational for arguments and statistical for proofs.

In our system, zero-knowledge makes it possible to reveal how secret information relates to a system of rules, or to other pieces of secret information, without revealing any further information. Concretely, our implementation (detailed in Section 7) allows law enforcement to attest (1) knowledge of the content of a commitment c (e.g., to an email address in a request for data made by a law enforcement agency), demonstrating the ability to later open c; and (2) that the content of a commitment c is equal to the content of a prior commitment c′ (e.g., to an email address in a court order issued by a judge). In case even (2) reveals too much information, our implementation supports not specifying c′ exactly and instead attesting that c′ lies in a given set S (e.g., S could include all judges' surveillance authorizations from the last month).
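For intuition, the following sketch (plain Python, not libsnark circuit code) spells out the kind of NP relation such an argument attests to, in the set-membership variant: the agency's committed email address equals the address committed in some order among a candidate set S. The hash-based Commit and the witness layout are assumptions made only to keep the sketch self-contained.

```python
# Sketch of the NP relation R behind attestation (2) in its set-membership
# form. Public input x: the agency's commitment and the candidate order
# commitments. Witness w: the email address plus the randomness used in
# each commitment. Commit() is a stand-in, not the scheme the system deploys.
import hashlib

def Commit(message: bytes, randomness: bytes) -> bytes:
    return hashlib.sha256(randomness + message).digest()

def relation(x, w) -> bool:
    """(x, w) is in R iff the agency's committed address matches the address
    committed in some judge-issued order among the candidates."""
    c_agency, candidate_orders = x
    email, r_agency, r_order = w
    return (Commit(email, r_agency) == c_agency and
            any(Commit(email, r_order) == c for c in candidate_orders))

# A SNARK prover shows knowledge of a witness w with (x, w) in R; the verifier
# checks the proof against x alone and learns nothing about the witness.
```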

In the terms of our security goals from Section 3, zero-knowledge arguments can demonstrate to the public that commitments can be opened and that proper relationships between committed information are preserved (accountability), without revealing any further information about the surveillance process (confidentiality). If these arguments fail, the public can detect when a participant has deviated from the process (accountability).

The SNARK construction [15] that we suggest for use in our system achieves perfect (information-theoretic) confidentiality, a goal stated in Section 3.2.7

Secure multiparty computation (MPC). MPC allows a set of n parties p1, ..., pn, each in possession of private data x1, ..., xn, to jointly compute the output of a function y = f(x1, ..., xn) on their private inputs; y is computed via an interactive protocol executed by the parties.

Secure MPC makes two guarantees: correctness and secrecy. Correctness means that the output y is equal to f(x1, ..., xn). Secrecy means that any adversary that corrupts some subset S ⊂ {p1, ..., pn} of the parties learns nothing about the inputs {xi : pi ∉ S} beyond what can already be inferred given the adversarial inputs {xi : pi ∈ S} and the output y. Secrecy is formalized by stipulating that a simulator that is given only ({xi : pi ∈ S}, y) as input must be able to produce a "simulated" protocol transcript that is indistinguishable from the actual protocol execution run with all the real inputs (x1, ..., xn).

7 In fact, [15] states their secrecy guarantee in a computational (not information-theoretic) form, but their unmodified construction does achieve perfect secrecy, and the proofs of [15] suffice unchanged to prove the stronger definition [41]. That perfect zero-knowledge can be achieved is also remarked in the appendix of [14].

Figure 2: System configuration. Participants (rectangles) read and write to a public ledger (cloud) and local storage (ovals). The public (diamond) reads from the ledger.

In our system, MPC enables computation of aggregate statistics about the extent of surveillance across the entire court system, through a computation among individual judges. MPC eliminates the need to pool the sensitive data of individual judges in the clear, or to defer to companies to compute and release this information piecemeal. In the terms of our security goals, MPC reveals information to the public (accountability) from a source we trust to follow the protocol honestly (the courts), without revealing the internal workings of courts to one another (confidentiality). It also eliminates the need to rely on potentially malicious companies to reveal this information themselves (correctness).

Secret sharing. Secret sharing facilitates our hierarchical MPC protocol. A secret sharing of some input data D consists of a set of strings (D1, ..., DN) called shares, satisfying two properties: (1) any subset of N−1 shares reveals no information about D, and (2) given all the N shares, D can easily be reconstructed.8
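A minimal sketch of N-out-of-N additive secret sharing over a public modulus, matching the two properties above, might look as follows (the modulus is illustrative).

```python
# N-out-of-N additive secret sharing: any N-1 shares are uniformly random and
# reveal nothing about D; all N shares together reconstruct D.
import secrets

PRIME = 2_147_483_647  # public modulus, comfortably larger than any count

def share(D, N):
    shares = [secrets.randbelow(PRIME) for _ in range(N - 1)]
    shares.append((D - sum(shares)) % PRIME)
    return shares

def reconstruct(shares):
    return sum(shares) % PRIME

# e.g., a judge splits a count of 4 orders into 12 shares, one per circuit
shares = share(4, 12)
assert reconstruct(shares) == 4
```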

Summary. In summary, these cryptographic tools support three high-level properties that we utilize to achieve our security goals:

1. Trusted records of events. The append-only ledger and cryptographic commitments create a trustworthy record of surveillance events without revealing sensitive information to the public.

2. Demonstration of compliance. Zero-knowledge arguments allow parties to provably assure the public that relevant rules have been followed, without revealing any secret information.

3. Transparency without handling secrets. MPC enables the court system to accurately compute and release aggregate statistics about surveillance events without ever sharing the sensitive information of individual parties.

8 For simplicity, we have described so-called "N-out-of-N" secret sharing. More generally, secret sharing can guarantee that any subset of k ≤ N shares enables reconstruction, while any subset of at most k−1 shares reveals nothing about D.

4.2 System Configuration

Our system is centered around a publicly visible append-only ledger where the various entities involved in the electronic surveillance process can post information. As depicted in Figure 2, every judge, law enforcement agency, and company contributes data to this ledger. Judges post cryptographic commitments to all orders issued. Law enforcement agencies post commitments to their activities (warrant requests to judges and data requests to companies) and zero-knowledge arguments about the requests they issue. Companies do the same for the data they deliver to agencies. Members of the public can view and verify all data posted to the ledger.

Each judge, law enforcement agency, and company will need to maintain a small amount of infrastructure: a computer terminal through which to compose posts, and local storage (the ovals in Figure 2) to store sensitive information (e.g., the content of sealed court orders). To attest to the authenticity of posts to the ledger, each participant will need to maintain a private signing key and publicize a corresponding verification key. We assume that public-key infrastructure could be established by a reputable party like the Administrative Office of the US Courts.

The ledger itself could be maintained as a distributed system among the participants in the process, as a distributed system among a more exclusive group of participants with higher trustworthiness (e.g., the circuit courts of appeals), or by a single entity (e.g., the Administrative Office of the US Courts or the Supreme Court).
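As a sketch of the authentication step, assuming a PKI as described and using Ed25519 signatures from the pyca/cryptography package; the entry fields below are illustrative, not a fixed wire format.

```python
# Sketch of authenticated ledger posts. The judge keeps signing_key in local
# storage and publishes verification_key through the assumed PKI.
import json, time
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

signing_key = Ed25519PrivateKey.generate()
verification_key = signing_key.public_key()

def post(commitment_hex: str, seal_expiry: str) -> dict:
    """Build a signed ledger entry carrying a commitment and case metadata."""
    body = json.dumps({"commitment": commitment_hex,
                       "seal_expiry": seal_expiry,
                       "posted_at": int(time.time())}).encode()
    return {"body": body, "signature": signing_key.sign(body)}

def verify(entry: dict) -> bool:
    """Any member of the public can check an entry against the posted key."""
    try:
        verification_key.verify(entry["signature"], entry["body"])
        return True
    except InvalidSignature:
        return False

assert verify(post(commitment_hex="ab12cd", seal_expiry="2019-03-01"))
```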

4.3 Workflow

Posting to the ledger. The workflow of our system augments the electronic surveillance workflow in Figure 1 with additional information posted to the ledger, as depicted in Figure 3. When a judge issues an order (step 2 of Figure 1), she also posts a commitment to the order and additional metadata about the case. At a minimum, this metadata must include the date that the order's seal expires; depending on the system configuration, she could post other metadata (e.g., Judge Smith's cover sheet). The commitment allows the public to later verify that the order was properly unsealed, but reveals no information about the commitment's content in the meantime, achieving a degree of accountability in the short term (while confidentiality is necessary) and full accountability in the long term (when the seal expires and confidentiality is unnecessary). Since judges are honest-but-curious, they will adhere to this protocol and reliably post commitments whenever new court orders are issued.

Figure 3: Data posted to the public ledger as the protocol runs. Time moves from left to right. Each rectangle is a post to the ledger. Dashed arrows between rectangles indicate that the source of the arrow could contain a visible reference to the destination. The ovals contain the entities that make each post.

The agency then uses this order to request data from a company (step 3 in Figure 1) and posts a commitment to this request, alongside a zero-knowledge argument that the request is compatible with a court order (and possibly also with other legal requirements). This commitment, which may never be opened, provides a small amount of accountability within the confines of confidentiality, revealing that some law enforcement action took place. The zero-knowledge argument takes accountability a step further: it demonstrates to the public that the law enforcement action was compatible with the original court order (which we trust to have been committed properly), forcing the potentially-malicious law enforcement agency to adhere to the protocol or make public its non-compliance. (Failure to adhere would result in a publicly visible invalid zero-knowledge argument.) If the company responds with matching data (step 6 in Figure 1), it posts a commitment to its response and an argument that it furnished (only) the data implicated by the order and data request. These commitments and arguments serve a role analogous to those posted by law enforcement.

This system does not require commitments to all actions in Figure 1. For example, it only requires a law enforcement agency to commit to a successful request for data (step 3) rather than any proposed request (step 1). The system could easily be augmented with additional commitments and proofs as desired by the court system.

The zero-knowledge arguments about relationships between commitments reveal one additional piece of information. For a law enforcement agency to prove that its committed data request is compatible with a particular court order, it must reveal which specific committed court order authorized the request. In other words, the zero-knowledge arguments reveal the links between specific actions of each party (dashed arrows in Figure 3). These links could be eliminated, reducing visibility into the workflow of surveillance: instead, entities would argue that their actions are compatible with some court order among a group of recent orders.

Remark 4.1. We now briefly discuss other possible malicious behaviors by law enforcement agencies and companies involving inaccurate logging of data. Though, as mentioned in Remark 3.1, correspondence between real-world events and logged items is not enforceable by technological means alone, we informally argue that our design incentivizes honest logging, and reporting of dishonest logging, under many circumstances.

A malicious law enforcement agency could omit commitments, or commit to one surveillance request but send the company a different request. This action is visible to the company, which could reveal this misbehavior to the judge. This visibility incentivizes companies to record their actions diligently, so as to avoid any appearance of negligence, let alone complicity in the agency's misbehavior.

Similarly, a malicious company might fail to post a commitment, or post a commitment inconsistent with its actual behavior. These actions are visible to law enforcement agencies, who could report violations to the judge (and otherwise risk the appearance of negligence or complicity). To make such violations visible to the public, we could add a second law enforcement commitment that acknowledges the data received and proves that it is compatible with the original court order and law enforcement request.

However, even incentive-based arguments do not address the case of a malicious law enforcement agency colluding with a malicious company. These entities could simply withhold from posting any information to the ledger (or post a sequence of false but consistent information), thereby making it impossible to detect violations. To handle this scenario, we have to defer to the legal process itself: when this data is used as evidence in court, a judge should ensure that appropriate documentation was posted to the ledger and that the data was gathered appropriately.

Aggregate statistics. At configurable intervals, the individual courts use MPC to compute aggregate statistics about their surveillance activities.9 An analyst, such as the Administrative Office of the US Courts, receives the result of this MPC and posts it to the ledger. The particular kinds of aggregate statistics computed are at the discretion of the court system. They could include figures already tabulated in the Administrative Office of the US Courts' Wiretap Reports [9] (i.e., orders by state and by criminal offense) and in company-issued transparency reports [4, 7] (i.e., requests and number of users implicated, by company). Due to the generality10 of MPC, it is theoretically possible to compute any function of the information known to each of the judges. For performance reasons, we restrict our focus to totals and aggregated thresholds, a set of operations expressive enough to replicate existing transparency reports.

The statistics themselves are calculated using MPC. In principle, the hundreds of magistrate and district court judges could attempt to directly perform MPC with each other. However, as we find in Section 6, computing even simple functions among hundreds of parties is prohibitively slow. Moreover, the logistics of getting every judge online simultaneously, with enough reliability to complete a multiround protocol, would be difficult; if a single judge went offline, the protocol would stall.

Instead, we compute aggregate statistics in a hierarchical manner, as depicted in Figure 4. We exploit the existing hierarchy of the federal court system. Each of the lower-court judges is under the jurisdiction of one of twelve circuit courts of appeals. Each lower-court judge computes her individual component of the larger aggregate statistic (e.g., number of orders issued against Google in the past six months) and divides it into twelve secret shares, sending one share to (a server controlled by) each circuit court of appeals. Distinct shares are represented by separate colors in Figure 4. So long as at least one circuit server remains uncompromised, the lower-court judges can be assured, by the security of the secret-sharing scheme, that their contributions to the larger statistic are confidential. The circuit servers engage in a twelve-party MPC that reconstructs the judges' input data from the shares, computes the desired function, and reveals the result to the analyst. By concentrating the computationally intensive and logistically demanding part of the MPC process in twelve stable servers, this design eliminates many of the performance and reliability challenges of the flat (non-hierarchical) protocol (Section 6 discusses performance).

Figure 4: The flow of data as aggregate statistics are computed. Each lower-court judge calculates its component of the statistic and secret-shares it into 12 shares, one for each judicial circuit (illustrated by colors). The servers of the circuit courts then engage in an MPC to compute the aggregate statistic from the input shares.

9 Microsoft [7] and Google [4] currently release their transparency reports every six months, and the Administrative Office of the US Courts does so annually [9]. We take these intervals to be our baseline for the frequency with which aggregate statistics would be released in our system, although releasing statistics more frequently (e.g., monthly) would improve transparency.

10 General MPC is a common term used to describe MPC that can compute arbitrary functions of the participants' data, as opposed to just restricted classes of functions.
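The following sketch traces this flow for the simple case of totals, reusing additive secret sharing; in the deployed design the circuit servers would run a general MPC (e.g., over Jiff) so that richer statistics such as thresholds are also possible.

```python
# Hierarchical totals, mirroring Figure 4: each judge additively shares her
# count across the 12 circuits, each circuit sums the shares it holds, and
# the analyst adds the 12 partial sums. No circuit or analyst ever sees an
# individual judge's count.
import secrets

PRIME = 2_147_483_647
NUM_CIRCUITS = 12

def share(value):
    shares = [secrets.randbelow(PRIME) for _ in range(NUM_CIRCUITS - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares

judge_counts = [3, 0, 7, 2, 5]                       # illustrative per-judge counts
shares_per_judge = [share(c) for c in judge_counts]

# Circuit i receives the i-th share from every judge and aggregates locally.
partial_sums = [sum(js[i] for js in shares_per_judge) % PRIME
                for i in range(NUM_CIRCUITS)]

# The analyst combines the partial sums into the nationwide total.
total = sum(partial_sums) % PRIME
assert total == sum(judge_counts)
```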

This MPC strategy allows the court system to compute aggregate statistics (towards the accountability goal of Section 3.2). Since courts are honest-but-curious, and by the correctness guarantee of MPC, these statistics will be computed accurately on correct data (correctness of Section 3.2). MPC enables the courts to perform these computations without revealing any court's internal information to any other court (confidentiality of Section 3.2).

4.4 Additional Design Choices

The preceding section described our proposed system with its full range of accountability features. This configuration is only one of many possibilities. Although cryptography makes it possible to release information in a controlled way, the fact remains that revealing more information poses greater risks to investigative integrity. Depending on the court system's level of risk-tolerance, features can be modified or removed entirely to adjust the amount of information disclosed.

Cover sheet metadata. A judge might reasonably fear that a careful criminal could monitor cover sheet metadata to detect surveillance. At the cost of some transparency, judges could post less metadata when committing to an order. (At a minimum, the judge must post the date at which the seal expires.) The cover sheets integral to Judge Smith's proposal were also designed to supply certain information towards assessing the scale of surveillance; MPC replicates this outcome without releasing information about individual orders.

Commitments by individual judges. In some cases, posting a commitment might reveal too much. In a low-crime area, mere knowledge that a particular judge approved surveillance could spur a criminal organization to change its behavior. A number of approaches would address this concern. Judges could delegate the responsibility of posting to the ledger to the same judicial circuits that mediate the hierarchical MPC. Alternatively, each judge could continue posting to the ledger herself, but instead of signing the commitment under her own name, she could sign it as coming from some court in her judicial circuit, or nationwide, without revealing which one (group signatures [17] or ring signatures [33] are designed for this sort of anonymous signing within groups). Either of these approaches would conceal which individual judge approved the surveillance.

Aggregate statistics. The aggregate statistic mechanism offers a wide range of choices about the data to be revealed. For example, if the court system is concerned about revealing information about individual districts, statistics could be aggregated by any number of other parameters, including the type of crime being investigated or the company from which the data was requested.

      5 Protocol Definition

We now define a complete protocol capturing the workflow from Section 4. We assume a public-key infrastructure and synchronous communication on authenticated (encrypted) point-to-point channels.

Preliminaries. The protocol is parametrized by:
• a secret-sharing scheme Share,
• a commitment scheme C,
• a special type of zero-knowledge primitive, SNARK,
• a multi-party computation protocol MPC, and
• a function CoverSheet that maps court orders to cover sheet information.

Several parties participate in the protocol:
• n judges J1, ..., Jn,
• m law enforcement agencies A1, ..., Am,
• q companies C1, ..., Cq,
• r trustees T1, ..., Tr,11
• P, a party representing the public,
• Ledger, a party representing the public ledger, and
• Env, a party called "the environment", which models the occurrence over time of exogenous events.

Ledger is a simple ideal functionality allowing any party to (1) append entries to a time-stamped, append-only ledger and (2) retrieve ledger entries. Entries are authenticated except where explicitly anonymous.

Env is a modeling device that specifies the protocol behavior in the context of arbitrary exogenous event sequences occurring over time. Upon receipt of message clock, Env responds with the current time. To model the occurrence of an exogenous event e (e.g., a case in need of surveillance), Env sends information about e to the affected parties (e.g., a law enforcement agency).

11 In our specific case study, r = 12 and the trustees are the twelve US Circuit Courts of Appeals. The trustees are the parties which participate in the multi-party computation of aggregate statistics based on input data from all judges, as shown in Figure 4 and defined formally later in this subsection.

Next, we give the syntax of our cryptographic tools,12 and then define the behavior of the remaining parties.

A commitment scheme is a triple of probabilistic poly-time algorithms C = (Setup, Commit, Open), as follows:
• Setup(1^κ) takes as input a security parameter κ (in unary) and outputs public parameters pp.
• Commit(pp, m, ω) takes as input pp, a message m, and randomness ω. It outputs a commitment c.
• Open(pp, m′, c, ω′) takes as input pp, a message m′, and randomness ω′. It outputs 1 if c = Commit(pp, m′, ω′), and 0 otherwise.

pp is generated in an initial setup phase and is thereafter publicly known to all parties, so we elide it for brevity.
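For concreteness, a hash-based instantiation of this syntax might look as follows; it matches the interface only and is computationally (not perfectly) hiding, unlike the Pedersen-style scheme discussed in Section 4.1.

```python
# A hash-based instantiation of the (Setup, Commit, Open) syntax above.
import hashlib, secrets

def Setup(kappa: int = 128) -> dict:
    """Public parameters pp; here just the randomness length in bytes."""
    return {"rand_bytes": kappa // 8}

def Commit(pp: dict, m: bytes, omega: bytes) -> str:
    """Commitment to m under randomness omega."""
    return hashlib.sha256(omega + m).hexdigest()

def Open(pp: dict, m_prime: bytes, c: str, omega_prime: bytes) -> int:
    """1 if c = Commit(pp, m_prime, omega_prime), else 0."""
    return int(c == Commit(pp, m_prime, omega_prime))

pp = Setup()
omega = secrets.token_bytes(pp["rand_bytes"])
c = Commit(pp, b"court order, sealed until 2019-03-01", omega)
assert Open(pp, b"court order, sealed until 2019-03-01", c, omega) == 1
assert Open(pp, b"a different order", c, omega) == 0
```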

Algorithm 1: Law enforcement agency A_i

• On receipt of a surveillance request event e = (Surveil, u, s) from Env, where u is the public key of a company and s is the description of a surveillance request directed at u: send message (u, s) to a judge.13

• On receipt of a decision message (u, s, d) from a judge, where d ≠ reject:14 (1) generate a commitment c = Commit((s, d), ω) to the request and store (c, s, d, ω) locally; (2) generate a SNARK proof π attesting compliance of (s, d) with relevant regulations; (3) post (c, π) to the ledger; (4) send request (s, d, ω) to company u.

• On receipt of an audit request (c, P, z) from the public: generate decision b ← A^dp_i(c, P, z). If b = accept, generate a SNARK proof π attesting compliance of (s, d) with the regulations indicated by the audit request (c, P, z); else, send (c, P, z, b) to a judge.13

• On receipt of an audit order (d, c, P, z) from a judge: if d = accept, generate a SNARK proof π attesting compliance of (s, d) with the regulations indicated by the audit request (c, P, z).

Agencies. Each agency A_i has an associated decision-making process A^dp_i, modeled by a stateful algorithm that maps audit requests to {accept} ∪ {0,1}*, where the output is either an acceptance or a description of why the agency chooses to deny the request. Each agency operates according to Algorithm 1, which is parametrized by its own A^dp_i. In practice, we assume A^dp_i would be instantiated by the agency's human decision-making process.

Algorithm 2: Judge J_i

• On receipt of a surveillance request (u, s) from an agency A_j: (1) generate decision d ← J^dp1_i(s); (2) send response (u, s, d) to A_j; (3) generate a commitment c = Commit((u, s, d), ω) to the decision and store (c, u, s, d, ω) locally; (4) post (CoverSheet(d), c) to the ledger.

• On receipt of denied audit request information ζ from an agency A_j: generate decision d ← J^dp2_i(ζ) and send (d, ζ) to A_j and to the public P.

• On receipt of a data revelation request (c, z) from the public:15 generate decision b ← J^dp3_i(c, z). If b = accept, send to the public P the message and randomness (m, ω) corresponding to c; else, if b = reject, send reject to P, with an accompanying explanation if provided.

Judges. Each judge J_i has three associated decision-making processes, J^dp1_i, J^dp2_i, and J^dp3_i. J^dp1_i maps surveillance requests to either a rejection or an authorizing court order; J^dp2_i maps denied audit requests to either a confirmation of the denial or a court order overturning the denial; and J^dp3_i maps data revelation requests to either an acceptance or a denial (perhaps along with an explanation of the denial, e.g., "this document is still under seal"). Each judge operates according to Algorithm 2, which is parametrized by the individual judge's (J^dp1_i, J^dp2_i, J^dp3_i).

12 For formal security definitions beyond syntax, we refer to any standard cryptography textbook, such as [26].

13 For the purposes of our exposition, this could be an arbitrary judge. In practice, it would likely depend on the jurisdiction in which the surveillance event occurs and in which the law enforcement agency operates, and perhaps also on the type of case.

14 For simplicity of exposition, Algorithm 1 only addresses the case d ≠ reject and omits the possibility of appeal by the agency. The algorithm could straightforwardly be extended to encompass appeals by incorporating the decision to appeal into A^dp_i.

15 This is the step invoked by requests for unsealed documents.

Algorithm 3: Company C_i

• Upon receiving a surveillance request (s, d, ω) from an agency A_j: if the court order d bears the valid signature of a judge and Commit((s, d), ω) matches a corresponding commitment posted by law enforcement on the ledger, then (1) generate commitment c ← Commit(δ, ω) and store (c, δ, ω) locally; (2) generate a SNARK proof π attesting that δ is compliant with s and the judge-signed order d; (3) post (c, π) anonymously to the ledger; (4) reply to A_j by furnishing the requested data δ along with ω.

The public. The public P exhibits one main type of behavior in our model: upon receiving an event message e = (a, ξ) from Env (describing either an audit request or a data revelation request), P sends ξ to a (an agency or court). Additionally, the public periodically checks the ledger for validity of posted SNARK proofs and takes steps to flag any non-compliance detected (e.g., through the court system or the news media).

Companies and trustees. Algorithms 3 and 4 describe companies and trustees. Companies execute judge-authorized instructions and log their actions by posting commitments on the ledger. Trustees run MPC to compute aggregate statistics from data provided in secret-shared form by judges; MPC events are triggered by Env.

Algorithm 4: Trustee T_i

• Upon receiving an aggregate statistic event message e = (Compute, f, D_1, ..., D_n) from Env:
  1. For each i′ ∈ [r] (such that i′ ≠ i), send e to T_i′.
  2. For each j ∈ [n], send the message (f, D_j) to J_j. Let δ_{j,i} be the response from J_j.
  3. With parties T_1, ..., T_r, participate in the MPC protocol MPC with input (δ_{1,i}, ..., δ_{n,i}) to compute the functionality f ∘ ReconInputs (i.e., reconstruct the judges' inputs from their shares, then apply f), where ReconInputs is defined as follows:
     ReconInputs(((δ_{1,i′}, ..., δ_{n,i′}))_{i′∈[r]}) = (Recon(δ_{j,1}, ..., δ_{j,r}))_{j∈[n]}.
     Let y denote the output from the MPC.16
  4. Send y to J_j for each j ∈ [n].17

• Upon receiving an MPC initiation message e = (Compute, f, D_1, ..., D_n) from another trustee T_i′:
  1. Receive a secret-share δ_{j,i} from each judge J_j, respectively.
  2. With parties T_1, ..., T_r, participate in the MPC protocol MPC with input (δ_{1,i}, ..., δ_{n,i}) to compute the functionality f ∘ ReconInputs.

      6 Evaluation of MPC Implementation

In our proposal, judges use secure multiparty computation (MPC) to compute aggregate statistics about the extent and distribution of surveillance. Although in principle MPC can support secure computation of any function of the judges' data, full generality can come with unacceptable performance limitations. In order that our protocols scale to hundreds of federal judges, we narrow our attention to two kinds of functions that are particularly useful in the context of surveillance.

The extent of surveillance (totals). Computing totals involves summing values held by the parties without revealing information about any value to anyone other than its owner. Totals become averages by dividing by the number of data points. In the context of electronic surveillance, totals are the most prevalent form of computation on government and corporate transparency reports: How many court orders were approved for cases involving homicide, and how many for drug offenses? How long was the average order in effect? How many orders were issued in California [9]? Totals make it possible to determine the extent of surveillance.

The distribution of surveillance (thresholds). Thresholding involves determining the number of data points that exceed a given cut-off: How many courts issued more than ten orders for data from Google? How many orders were in effect for more than 90 days? Unlike totals, thresholds can reveal selected facts about the distribution of surveillance, i.e., the circumstances in which it is most prevalent. Thresholds go beyond the kinds of questions typically answered in transparency reports, offering new opportunities to improve accountability.
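In the clear, these two statistics are simple functions of the judges' values; the point of MPC is to compute exactly these outputs without exposing any individual input. A minimal Python sketch over toy data (the counts are illustrative):

def total(values):
    # Extent of surveillance: the sum of the parties' values
    # (an average follows by dividing by the number of data points).
    return sum(values)

def threshold_count(values, cutoff):
    # Distribution of surveillance: how many values exceed the cut-off.
    return sum(1 for v in values if v > cutoff)

counts = [12, 0, 3, 25, 7]          # per-judge order counts (toy data)
assert total(counts) == 47
assert threshold_count(counts, 10) == 2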

To enable totals and thresholds to scale to the size of the federal court system, we implemented a hierarchical MPC protocol, as described in Figure 4, whose design mirrors the hierarchy of the court system. Our evaluation shows the hierarchical structure reduces MPC complexity from quadratic in the number of judges to linear.

We implemented protocols that make use of totals and thresholds using two existing JavaScript-based MPC libraries: WebMPC [16, 29] and Jiff [5]. WebMPC is the simpler and less versatile library; we test it as a baseline and as a "sanity check" that its performance scales as expected, then move on to the more interesting experiment of evaluating Jiff. We opted for JavaScript libraries to facilitate integration into a web application, which is suitable for federal judges to submit information through a familiar browser interface, regardless of the differences in their local system setups. Both of these libraries are designed to facilitate MPC across dozens or hundreds of computers; we simulated this effect by running each party in a separate process on a computer with 16 CPU cores and 64GB of RAM. We tested these protocols on randomly generated data containing values in the hundreds, which reflects the same order of magnitude as data present in existing transparency reports. Our implementations were crafted with two design goals in mind:

1. Protocols should scale to roughly 1000 parties, the approximate size of the federal judiciary [10], performing efficiently enough to facilitate periodic transparency reports.

2. Protocols should not require all parties to be online regularly or at the same time.

In the subsections that follow, we describe and evaluate our implementations in light of these goals.

6.1 Computing Totals in WebMPC

WebMPC is a JavaScript-based library that can securely compute sums in a single round. The underlying protocol relies on two parties who are trusted not to collude with one another: an analyst, who distributes masking information to all protocol participants at the beginning of the process and receives the final aggregate statistic, and a facilitator, who aggregates this information together in

Figure 5: Performance of MPC using the WebMPC library.

masked form. The participants use the masking information from the analyst to mask their inputs and send them to the facilitator, who aggregates them and sends the result (i.e., a masked sum) to the analyst. The analyst removes the mask and uncovers the aggregated result. Once the participants have their masks, they receive no further messages from any other party; they can submit this masked data to the facilitator in an uncoordinated fashion and go offline immediately afterwards. Even if some anticipated participants do not send data, the protocol can still run to completion with those who remain.
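The masking idea can be illustrated in a few lines of Python. This is a sketch of the general technique rather than of WebMPC's API: the analyst sees only the unmasked total, the facilitator sees only masked values, and a participant who never submits (J4 below) simply drops out. All names, values, and the modulus are illustrative.

import secrets

M = 2**64  # all arithmetic is modulo a large public constant

# Analyst: deal one random mask per anticipated participant and remember it.
masks = {judge: secrets.randbelow(M) for judge in ["J1", "J2", "J3", "J4"]}

# Participants: mask their private values and submit whenever convenient.
private_values = {"J1": 12, "J2": 0, "J3": 7}            # J4 never submits
submissions = {j: (v + masks[j]) % M for j, v in private_values.items()}

# Facilitator: aggregates only the masked submissions it received.
masked_sum = sum(submissions.values()) % M

# Analyst: removes the masks of those who actually submitted.
total = (masked_sum - sum(masks[j] for j in submissions)) % M
assert total == sum(private_values.values())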

To make this protocol feasible in our setting, we need to identify a facilitator and analyst who will not collude. In many circumstances, it would be acceptable to rely on reputable institutions already present in the court system, such as the circuit courts of appeals, the Supreme Court, or the Administrative Office of the US Courts.

Although this protocol's simplicity limits its generality, it also makes it possible for the protocol to scale efficiently to a large number of participants. As Figure 5 illustrates, the protocol scales linearly with the number of parties. Even with 400 parties (the largest size we tested), the protocol still completed in just under 75 seconds. Extrapolating from the linear trend, it would take about three minutes to compute a summation across the entire federal judiciary. Since existing transparency reports are typically released just once or twice a year, it is reasonable to invest three minutes of computation (or less than a fifth of a second per judge) for each statistic.

6.2 Thresholds and Hierarchy with Jiff

To make use of MPC operations beyond totals, we turned to Jiff, another MPC library implemented in JavaScript. Jiff is designed to support MPC for arbitrary functionalities, although inbuilt support for some more complex functionalities is still under development at the time of writing. Most importantly for our needs, Jiff supports thresholding and multiplication in addition to sums. We evaluated Jiff on three different MPC protocols: totals (as with WebMPC), additive thresholding (i.e., how many values exceeded a specific threshold), and multiplicative thresholding (i.e., did all values exceed a specific threshold). In contrast to computing totals via summation, certain operations like thresholding require more compli-

Figure 6: Flat total (red), additive threshold (blue), and multiplicative threshold (green) protocols in Jiff.

Figure 7: Hierarchical total (red), additive threshold (blue), and multiplicative threshold (green) protocols in Jiff. Note the difference in axis scales from Figure 6.

cated computation and multiple rounds of communication. By building on Jiff with our hierarchical MPC implementation, we demonstrate that these operations are viable at the scale required by the federal court system.

As a baseline, we ran sums, additive thresholding, and multiplicative thresholding benchmarks with all judges as full participants in the MPC protocol, sharing the workload equally, a configuration we term the flat protocol (in contrast to the hierarchical protocol we present next). Figure 6 illustrates that the running time of these protocols grows quadratically with the number of judges participating. These running times quickly became untenable: while summation took several minutes among hundreds of judges, both thresholding benchmarks could barely handle tens of judges in the same time envelopes. These graphs illustrate the substantial performance disparity between summation and thresholding.

In Section 4 we described an alternative "hierarchical" MPC configuration to reduce this quadratic growth to linear. As depicted in Figure 4, each lower-court judge splits a piece of data into twelve secret shares, one for each circuit court of appeals. These shares are sent to the corresponding courts, which conduct a twelve-party MPC that performs a total or thresholding operation based on the input shares. If n lower-court judges participate, the protocol is tantamount to computing n twelve-party summations followed by a single n-input summation or threshold. As n increases, the amount of work scales linearly. So long as at least one circuit court remains honest and uncompromised, the secrecy of the lower-court data endures by the security of the secret-sharing scheme.
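The share-splitting step is sketched below for totals; thresholds would instead require the twelve circuit courts to run a genuine MPC over the shares they hold rather than publish partial sums. The modulus, the toy counts, and the helper names are illustrative assumptions, not part of our implementation.

import secrets

P = 2**61 - 1          # public modulus
NUM_CIRCUITS = 12      # one share per circuit court of appeals

def share(value, n=NUM_CIRCUITS, p=P):
    # Additively secret-share a value into n shares mod p; any n-1 shares
    # are uniformly random and reveal nothing about the value.
    shares = [secrets.randbelow(p) for _ in range(n - 1)]
    shares.append((value - sum(shares)) % p)
    return shares

# Each lower-court judge shares its count and sends share k to circuit court k.
judge_counts = [3, 0, 5, 2, 7]
by_circuit = list(zip(*(share(v) for v in judge_counts)))

# For a total, each circuit court sums the shares it received locally;
# combining the twelve partial sums reconstructs the overall total.
partials = [sum(col) % P for col in by_circuit]
assert sum(partials) % P == sum(judge_counts)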

Figure 7 illustrates the linear scaling of the twelve-party portion of the hierarchical protocols; we measured only the computation time after the circuit courts received all of the additive shares from the lower courts. While the flat summation protocol took nearly eight minutes to run on 300 judges, the hierarchical summation scaled to 1000 judges in less than 20 seconds, besting even the WebMPC results. Although thresholding characteristically remained much slower than summation, the hierarchical protocol scaled to nearly 250 judges in about the same amount of time that it took the flat protocol to run on 35 judges. Since the running times for the threshold protocols were in the tens of minutes for large benchmarks, the linear trend is noisier than for the total protocol. Most importantly, both of these protocols scaled linearly, meaning that, given sufficient time, thresholding could scale up to the size of the federal court system. This performance is acceptable if a few choice thresholds are computed at the frequency at which existing transparency reports are published.18

One additional benefit of the hierarchical protocols is that lower courts do not need to stay online while the protocol is executing, a goal we articulated at the beginning of this section. A lower court simply needs to send in its shares to the requisite circuit courts, one message per circuit court, for a grand total of twelve messages, after which it is free to disconnect. In contrast, the flat protocol grinds to a halt if even a single judge goes offline. The availability of the hierarchical protocol relies on a small set of circuit courts, which could invest in more robust infrastructure.

      7 Evaluation of SNARKs

We define the syntax of preprocessing zero-knowledge SNARKs for arithmetic circuit satisfiability [15].

A SNARK is a triple of probabilistic polynomial-time algorithms SNARK = (Setup, Prove, Verify), as follows:

• Setup(1^κ, R) takes as input the security parameter κ and a description of a binary relation R (an arithmetic circuit of size polynomial in κ), and outputs a pair (pk_R, vk_R) of a proving key and a verification key.

• Prove(pk_R, (x, w)) takes as input a proving key pk_R and an input-witness pair (x, w), and outputs a proof π attesting to x ∈ L_R, where L_R = {x : ∃w s.t. (x, w) ∈ R}.

• Verify(vk_R, (x, π)) takes as input a verification key vk_R and an input-proof pair (x, π), and outputs a bit indicating whether π is a valid proof for x ∈ L_R.

18Too high a frequency is also inadvisable due to the possibility of revealing too granular information when combined with the timings of specific investigations' court orders.

Before participants can create and verify SNARKs, they must establish a proving key, which any participant can use to create a SNARK, and a corresponding verification key, which any participant can use to verify a SNARK so created. Both of these keys are publicly known. The keys are distinct for each circuit (representing an NP relation) about which proofs are generated, and can be reused to produce as many different proofs with respect to that circuit as desired. Key generation uses randomness that, if known or biased, could allow participants to create proofs of false statements [13]. The key generation process must therefore protect and then destroy this information.

Using MPC to do key generation based on randomness provided by many different parties provides the guarantee that, as long as at least one of the MPC participants behaved correctly (i.e., did not bias his randomness and destroyed it afterward), the resulting keys are good (i.e., do not permit proofs of false statements). This approach has been used in the past, most notably by the cryptocurrency Zcash [8]. Despite the strong guarantees provided by this approach to key generation when at least one party is not corrupted, concerns have been expressed about the wisdom of trusting in the assumption of one honest party in the Zcash setting, which involves large monetary values and a system design inherently centered around the principles of full decentralization.

For our system, we propose that key generation be done in a one-time MPC among several of the traditionally reputable institutions in the court system, such as the Supreme Court or the Administrative Office of the US Courts, ideally together with other reputable parties from different branches of government. In our setting, the use of MPC for SNARK key generation does not constitute as pivotal and potentially risky a trust assumption as in Zcash, in that the court system is close-knit and inherently built with the assumption of trustworthiness of certain entities within the system. In contrast, a decentralized cryptocurrency (1) must, due to its distributed nature, rely for key generation on MPC participants that are essentially strangers to most others in the system, and (2) could be said to derive its very purpose from not relying on the trustworthiness of any small set of parties.

We note that since key generation is a one-time task for each circuit, we can tolerate a relatively performance-intensive process. Proving and verification keys can be distributed on the ledger.

7.1 Argument Types

Our implementation supports three types of arguments.

Argument of knowledge for a commitment (P_k). Our simplest type of argument attests the prover's knowledge of the content of a given commitment c, i.e., that she could open the commitment if required. Whenever a party publishes a commitment, she can accompany it with a SNARK attesting that she knows the message and randomness that were used to generate the commitment. Formally, this is an argument that the prover knows m and ω that correspond to a publicly known c such that Open(m, c, ω) = 1.

Argument of commitment equality (P_eq). Our second type of argument attests that the content of two publicly known commitments c_1, c_2 is the same. That is, for two publicly known commitments c_1 and c_2, the prover knows m_1, m_2, ω_1, and ω_2 such that Open(m_1, c_1, ω_1) = 1 ∧ Open(m_2, c_2, ω_2) = 1 ∧ m_1 = m_2.

More concretely, suppose that an agency wishes to release relational information: that the identifier (e.g., email address) in the request is the same identifier that a judge approved. The judge and law enforcement agency post commitments c_1 and c_2, respectively, to the identifiers they used. The law enforcement agency then posts an argument attesting that the two commitments are to the same value.19 Since circuits use fixed-size inputs, an argument implicitly reveals the length of the committed message; to hide this information, the law enforcement agency can pad each input up to a uniform length.

P_eq may be too revealing under certain circumstances: for the public to verify the argument, the agency (who posted c_2) must explicitly identify c_1, potentially revealing which judge authorized the data request and when.

Existential argument of commitment equality (P_∃). Our third type of argument allows decreasing the resolution of the information revealed, by proving that a commitment's content is the same as that of some other commitment among many. Formally, it shows that for publicly known commitments c, c_1, ..., c_N, respectively to secret values (m, ω), (m_1, ω_1), ..., (m_N, ω_N), there exists i such that Open(m, c, ω) = 1 ∧ Open(m_i, c_i, ω_i) = 1 ∧ m = m_i. We treat i as an additional secret input, so that for any value of N only two commitments need to be opened. This scheme trades off between resolution (number of commitments) and efficiency, a question we explore below.
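Read as ordinary predicates over commitment openings, the three statements are spelled out below; in our implementation the same logic is expressed as an arithmetic circuit, with the openings supplied as the secret witness. The sketch takes the Open algorithm as a parameter (open_fn), and the argument names are illustrative.

def p_k(open_fn, c, m, omega):
    # P_k: the prover knows an opening (m, omega) of the public commitment c.
    return open_fn(m, c, omega) == 1

def p_eq(open_fn, c1, c2, m1, omega1, m2, omega2):
    # P_eq: the public commitments c1 and c2 open to the same message.
    return (open_fn(m1, c1, omega1) == 1 and
            open_fn(m2, c2, omega2) == 1 and
            m1 == m2)

def p_exists(open_fn, c, cs, m, omega, i, m_i, omega_i):
    # P_∃: c opens to the same message as some cs[i]; the index i is part of
    # the secret witness, so only two commitments are ever opened.
    return (open_fn(m, c, omega) == 1 and
            open_fn(m_i, cs[i], omega_i) == 1 and
            m == m_i)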

We have chosen these three types of arguments to implement, but LibSNARK supports arbitrary predicates in principle, and there are likely others that would be useful and run efficiently in practice. A useful generalization of P_eq and P_∃ would be to replace equality with more sophisticated domain-specific predicates: instead of showing that messages m_1, m_2 corresponding to a pair of com-

19To produce a proof for P_eq, the prover (e.g., the agency) needs to know both ω_2 and ω_1, but in some cases c_1 (and thus ω_1) may have been produced by a different entity (e.g., the judge). Publicizing ω_1 is unacceptable, as it compromises the hiding of the commitment content. To solve this problem, the judge can include ω_1 alongside m_1 in secret documents that both parties possess (e.g., the court order).

mitments are equal, one could show p(m_1, m_2) = 1 for other predicates p (e.g., "less-than" or "signed by same court"). The types of arguments that can be implemented efficiently will expand as SNARK libraries' efficiency improves; our system inherits such efficiency gains.

7.2 Implementation

We implemented these zero-knowledge arguments with LibSNARK [34], a C++ library for creating general-purpose SNARKs from arithmetic circuits. We implemented commitments using the SHA256 hash function;20 ω is a 256-bit random string appended to the message before it is hashed. In this section, we show that useful statements can be proven within a reasonable performance envelope. We consider six criteria: the size of the proving key, the size of the verification key, the size of the proof statement, the time to generate keys, the time to create proofs, and the time to verify proofs. We evaluated these metrics with messages from 16 to 1232 bytes on P_k, P_eq, and P_∃ (N = 100, 400, 700, and 1000, large enough to obscure links between commitments), on a computer with 16 CPU cores and 64GB of RAM.
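Outside the circuit, the commitment convention just described amounts to a few lines of Python (shown only to pin down the format: the message with 256 bits of fresh randomness appended, then hashed). In the system itself the same computation is expressed as an arithmetic circuit for LibSNARK; the example identifier is fictitious.

import hashlib, os

def commit(message: bytes):
    # Commit(m): SHA256 over the message with a 256-bit random string appended.
    omega = os.urandom(32)
    return hashlib.sha256(message + omega).digest(), omega

def open_commitment(message: bytes, c: bytes, omega: bytes) -> int:
    # Open(m, c, omega): 1 if (m, omega) is a valid opening of c, else 0.
    return int(hashlib.sha256(message + omega).digest() == c)

c, omega = commit(b"target@example.com")
assert open_commitment(b"target@example.com", c, omega) == 1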

Argument size. The argument is just 287 bytes. Accompanying each argument are its public inputs (in this case, commitments). Each commitment is 256 bits.21 An auditor needs to store these commitments anyway as part of the ledger, and each commitment can be stored just once and reused for many proofs.

Verification key size. The size of the verification key is proportional to the size of the circuit and its public inputs. The key was 106KB for P_k (one commitment as input and one SHA256 circuit) and 208.3KB for P_eq (two commitments and two SHA256 circuits). Although P_∃ computes SHA256 just twice, its smallest input, 100 commitments, is 50 times as large as that of P_k and P_eq; the keys are correspondingly larger and grow linearly with the input size. For 100, 400, 700, and 1000 commitments, the verification keys were respectively 10MB, 41MB, 71MB, and 102MB. Since only one verification key is necessary for each circuit, these keys are easily small enough to make large-scale verification feasible.

Proving key size. The proving keys are much larger, in the hundreds of megabytes. Their size grows linearly with the size of the circuit, so longer messages (which require more SHA256 computations), more complicated circuits, and (for P_∃) more inputs lead to larger keys. Figure 8a reflects this trend. Proving keys are largest for P_∃

20Certain other hash functions may be more amenable to representation as arithmetic circuits and thus more "SNARK-friendly." We opted for a proof of concept with SHA256, as it is so widely used.

21LibSNARK stores each bit in a 32-bit integer, so an argument involving k commitments takes about 1024k bytes. A bit-vector representation would save a factor of 32.

with 1000 inputs on 1232-byte messages, and shrink as the message size and the number of commitments decrease. P_k and P_eq, which have simpler circuits, still have bigger proving keys for bigger messages. Although these keys are large, only entities that create each kind of proof need to store the corresponding key. Storing one key for each type of argument we have presented takes only about 1GB at the largest input sizes.

Key generation time. Key generation time increased linearly with the size of the keys, from a few seconds for P_k and P_eq on small messages to a few minutes for P_∃ on the largest parameters (Figure 8b). Since key generation is a one-time process to add a new kind of proof in the form of a circuit, we find these numbers acceptable.

Argument generation time. Argument generation time increased linearly with proving key size and ranged from a few seconds on the smallest keys to a couple of minutes for the largest (Figure 8c). Since argument generation is a one-time task for each surveillance action, and the existing administrative processes for each surveillance action often take hours or days, we find this cost acceptable.

Argument verification time. Verifying P_k and P_eq on the largest message took only a few milliseconds. Verification times for P_∃ were larger and increased linearly with the number of input commitments: for 100, 400, 700, and 1000 commitments, verification took 40ms, 85ms, 243ms, and 338ms on the largest input. These times are still fast enough to verify many arguments quickly.

      8 Generalization

Our proposal can be generalized beyond ECPA surveillance to encompass a broader class of secret information processes. Consider situations in which independent institutions need to act in a coordinated but secret fashion and, at the same time, are subject to public scrutiny: they should be able to convince the public that their actions are consistent with relevant rules. As in electronic surveillance, accountability requires the ability to attest to compliance without revealing sensitive information.

Example 1 (FISA court). Accountability is needed in other electronic surveillance arenas. The US Foreign Intelligence Surveillance Act (FISA) regulates surveillance in national security investigations. Because of the sensitive interests at stake, the entire process is overseen by a US court that meets in secret. The tension between secrecy and public accountability is even sharper for the FISA court: much of the data collected under FISA may stay permanently hidden inside US intelligence agencies, while data collected under ECPA may eventually be used in public criminal trials. This opacity may be justified, but it has engendered skepticism. The public has no way

Figure 8: SNARK evaluation. (a) Proving key size; (b) key generation time; (c) argument generation time.

of knowing what the court is doing, nor any means of assuring itself that the intelligence agencies under the authority of FISA are even complying with the rules of that court. The FISA court itself has voiced concern that it has no independent means of assessing compliance with its orders, because of the extreme secrecy involved. Applying our proposal to the FISA court, both the court and the public could receive proofs of documented compliance with FISA orders, as well as aggregate statistics on the scope of FISA surveillance activity, to the full extent possible without incurring national security risk.

Example 2 (Clinical trials). Accountability mechanisms are also important to assess the behavior of private parties, e.g., in clinical trials for new drugs. There are many parties to clinical trials, and much of the information involved is either private or proprietary. Yet regulators and the public have a need to know that responsible testing protocols are observed. Our system can achieve the right balance of transparency, accountability, and respect for privacy of those involved in the trials.

Example 3 (Public fund spending). Accountability in the spending of taxpayer money is naturally a subject of public interest. Portions of public funds may be allocated for sensitive purposes (e.g., defense/intelligence), and the amounts and allocation thereof may be publicly unavailable due to their sensitivity. Our system would enable credible public assurances that taxpayer money is being spent in accordance with stated principles, while preserving secrecy of information considered sensitive.

8.1 Generalized Framework

We present abstractions describing the generalized version of our system and briefly outline how the concrete examples fit into this framework. A secret information process includes the following components:

• A set of participants interact with each other. In our ECPA example, these are judges, law enforcement agencies, and companies.

• The participants engage in a protocol (e.g., to execute the procedures for conducting electronic surveillance). The protocol messages exchanged are hidden from the view of outsiders (e.g., the public), and yet it is of public interest that the protocol messages exchanged adhere to certain rules.

• A set of auditors (distinct from the participants) seeks to audit the protocol by verifying that a set of accountability properties are met.

Abstractly, our system allows the controlled disclosure of four types of information.

Existential information reveals the existence of a piece of data, be it in a participant's local storage or the content of a communication between participants. In our case study, existential information is revealed with commitments, which indicate the existence of a document.

Relational information describes the actions participants take in response to the actions of others. In our case study, relational information is represented by the zero-knowledge arguments that attest that actions were taken lawfully (e.g., in compliance with a judge's order).

Content information is the data in storage and communication. In our case study, content information is revealed through aggregate statistics via MPC, and when documents are unsealed and their contents made public.

Timing information is a by-product of the other information. In our case study, timing information could include order issuance dates, turnaround times for data request fulfilment by companies, and seal expiry dates.

Revealing combinations of these four types of information with the specified cryptographic tools provides the flexibility to satisfy a range of application-specific accountability properties, as exemplified next.

Example 1 (FISA court). Participants are the FISA Court judges, the agencies requesting surveillance authorization, and any service providers involved in facilitating said surveillance. The protocol encompasses the legal process required to authorize surveillance, together with the administrative steps that must be taken to enact surveillance. Auditors are the public, the judges themselves, and possibly Congress. Desirable accountability properties are similar to those in our ECPA case study: e.g., attestations that certain rules are being followed in issuing surveillance orders, and release of aggregate statistics on surveillance activities under FISA.

Example 2 (Clinical trials). Participants are the institutions (companies or research centers) conducting clinical trials, comprising scientists, ethics boards, and data analysts; the organizations that manage regulations regarding clinical trials, such as the National Institutes of Health (NIH) and the Food and Drug Administration (FDA) in the US; and hospitals and other sources through which trial participants are drawn. The protocol encompasses the administrative process required to approve a clinical trial, and the procedure of gathering participants and conducting the trial itself. Auditors are the public, the regulatory organizations such as the NIH and the FDA, and possibly professional ethics committees. Desirable accountability properties include, e.g., attestations that appropriate procedures are respected in recruiting participants and administering trials, and release of aggregate statistics on clinical trial results without compromising individual participants' medical data.

Example 3 (Public fund spending). Participants are Congress (who appropriates the funding), defense/intelligence agencies, and service providers contracted in the spending of said funding. The protocol encompasses the processes by which Congress allocates funds to agencies and agencies allocate funds to particular expenses. Auditors are the public and Congress. Desirable accountability properties include, e.g., attestations that procurements were within reasonable margins of market prices and satisfied documented needs, and release of aggregate statistics on the proportion of allocated money used and broad spending categories.

      9 Conclusion

We present a cryptographic answer to the accountability challenge currently frustrating the US court system. Leveraging cryptographic commitments, zero-knowledge proofs, and secure MPC, we provide the electronic surveillance process a series of scalable, flexible, and practical measures for improving accountability while maintaining secrecy. While we focus on the case study of electronic surveillance, these strategies are equally applicable to a range of other secret information processes requiring accountability to an outside auditor.

      Acknowledgements

We are grateful to Judge Stephen Smith for discussion and insights from the perspective of the US court system; to Andrei Lapets, Kinan Dak Albab, Rawane Issa, and Frederick Joossens for discussion on Jiff and WebMPC; and to Madars Virza for advice on SNARKs and LibSNARK.

This research was supported by the following grants: NSF MACS (CNS-1413920), DARPA IBM (W911NF-15-C-0236), SIMONS Investigator Award Agreement Dated June 5th, 2012, and the Center for Science of Information (CSoI), an NSF Science and Technology Center, under grant agreement CCF-0939370.

References

[1] Bellman. https://github.com/ebfull/bellman.

[2] Electronic Communications Privacy Act, 18 U.S.C. 2701 et seq.

[3] Foreign Intelligence Surveillance Act, 50 U.S.C. ch. 36.

[4] Google transparency report. https://www.google.com/transparencyreport/userdatarequests/countries?p=2016-12.

[5] Jiff. https://github.com/multiparty/jiff.

[6] Jsnark. https://github.com/akosba/jsnark.

[7] Law enforcement requests report. https://www.microsoft.com/en-us/about/corporate-responsibility/lerr.

[8] Zcash. https://z.cash.

[9] Wiretap report 2015. http://www.uscourts.gov/statistics-reports/wiretap-report-2015, December 2015.

[10] ADMINISTRATIVE OFFICE OF THE COURTS. Authorized judgeships. http://www.uscourts.gov/sites/default/files/allauth.pdf.

[11] ARAKI, T., BARAK, A., FURUKAWA, J., LICHTER, T., LINDELL, Y., NOF, A., OHARA, K., WATZMAN, A., AND WEINSTEIN, O. Optimized honest-majority MPC for malicious adversaries - breaking the 1 billion-gate per second barrier. In 2017 IEEE Symposium on Security and Privacy, SP 2017, San Jose, CA, USA, May 22-26, 2017 (2017), IEEE Computer Society, pp. 843–862.

[12] BATES, A., BUTLER, K. R., SHERR, M., SHIELDS, C., TRAYNOR, P., AND WALLACH, D. Accountable wiretapping -or- I know they can hear you now. NDSS (2012).

[13] BEN-SASSON, E., CHIESA, A., GREEN, M., TROMER, E., AND VIRZA, M. Secure sampling of public parameters for succinct zero knowledge proofs. In Security and Privacy (SP), 2015 IEEE Symposium on (2015), IEEE, pp. 287–304.

[14] BEN-SASSON, E., CHIESA, A., TROMER, E., AND VIRZA, M. Scalable zero knowledge via cycles of elliptic curves. IACR Cryptology ePrint Archive 2014 (2014), 595.

[15] BEN-SASSON, E., CHIESA, A., TROMER, E., AND VIRZA, M. Succinct non-interactive zero knowledge for a von Neumann architecture. In 23rd USENIX Security Symposium (USENIX Security 14) (San Diego, CA, Aug. 2014), USENIX Association, pp. 781–796.

[16] BESTAVROS, A., LAPETS, A., AND VARIA, M. User-centric distributed solutions for privacy-preserving analytics. Communications of the ACM 60, 2 (February 2017), 37–39.

[17] CHAUM, D., AND VAN HEYST, E. Group signatures. In Advances in Cryptology - EUROCRYPT '91, Workshop on the Theory and Application of Cryptographic Techniques, Brighton, UK, April 8-11, 1991, Proceedings (1991), D. W. Davies, Ed., vol. 547 of Lecture Notes in Computer Science, Springer, pp. 257–265.

[18] DAMGÅRD, I., AND ISHAI, Y. Constant-round multiparty computation using a black-box pseudorandom generator. In Advances in Cryptology - CRYPTO 2005, 25th Annual International Cryptology Conference, Santa Barbara, California, USA, August 14-18, 2005, Proceedings (2005), V. Shoup, Ed., vol. 3621 of Lecture Notes in Computer Science, Springer, pp. 378–394.

[19] DAMGÅRD, I., KELLER, M., LARRAIA, E., PASTRO, V., SCHOLL, P., AND SMART, N. P. Practical covertly secure MPC for dishonest majority - or: Breaking the SPDZ limits. In Computer Security - ESORICS 2013 - 18th European Symposium on Research in Computer Security, Egham, UK, September 9-13, 2013, Proceedings (2013), J. Crampton, S. Jajodia, and K. Mayes, Eds., vol. 8134 of Lecture Notes in Computer Science, Springer, pp. 1–18.

[20] FEIGENBAUM, J., JAGGARD, A. D., AND WRIGHT, R. N. Open vs. closed systems for accountability. In Proceedings of the 2014 Symposium and Bootcamp on the Science of Security (2014), ACM, p. 4.

[21] FEIGENBAUM, J., JAGGARD, A. D., WRIGHT, R. N., AND XIAO, H. Systematizing "accountability" in computer science. Tech. rep., Yale University, Feb. 2012. Technical Report YALEU/DCS/TR-1452.

[22] FEUER, A., AND ROSENBERG, E. Brooklyn prosecutor accused of using illegal wiretap to spy on love interest. November 2016. https://www.nytimes.com/2016/11/28/nyregion/brooklyn-prosecutor-accused-of-using-illegal-wiretap-to-spy-on-love-interest.html.

[23] GOLDWASSER, S., AND PARK, S. Public accountability vs. secret laws: Can they coexist? A cryptographic proposal. In Proceedings of the 2017 Workshop on Privacy in the Electronic Society, Dallas, TX, USA, October 30 - November 3, 2017 (2017), B. M. Thuraisingham and A. J. Lee, Eds., ACM, pp. 99–110.

[24] JAKOBSEN, T. P., NIELSEN, J. B., AND ORLANDI, C. A framework for outsourcing of secure computation. In Proceedings of the 6th edition of the ACM Workshop on Cloud Computing Security, CCSW '14, Scottsdale, Arizona, USA, November 7, 2014 (2014), G. Ahn, A. Oprea, and R. Safavi-Naini, Eds., ACM, pp. 81–92.

[25] KAMARA, S. Restructuring the NSA metadata program. In International Conference on Financial Cryptography and Data Security (2014), Springer, pp. 235–247.

[26] KATZ, J., AND LINDELL, Y. Introduction to Modern Cryptography. CRC Press, 2014.

[27] KROLL, J., FELTEN, E., AND BONEH, D. Secure protocols for accountable warrant execution. 2014. http://www.cs.princeton.edu/felten/warrant-paper.pdf.

[28] LAMPSON, B. Privacy and security: Usable security: How to get it. Communications of the ACM 52, 11 (2009), 25–27.

[29] LAPETS, A., VOLGUSHEV, N., BESTAVROS, A., JANSEN, F., AND VARIA, M. Secure MPC for analytics as a web application. In 2016 IEEE Cybersecurity Development (SecDev) (Boston, MA, USA, November 2016), pp. 73–74.

[30] MASHIMA, D., AND AHAMAD, M. Enabling robust information accountability in e-healthcare systems. In HealthSec (2012).

[31] PAPANIKOLAOU, N., AND PEARSON, S. A cross-disciplinary review of the concept of accountability: A survey of the literature. 2013. Available online at http://www.bic-trust.eu/files/2013/06/Paper_NP.pdf.

[32] PEARSON, S. Toward accountability in the cloud. IEEE Internet Computing 15, 4 (2011), 64–69.

[33] RIVEST, R. L., SHAMIR, A., AND TAUMAN, Y. How to leak a secret. In Advances in Cryptology - ASIACRYPT 2001, 7th International Conference on the Theory and Application of Cryptology and Information Security, Gold Coast, Australia, December 9-13, 2001, Proceedings (2001), C. Boyd, Ed., vol. 2248 of Lecture Notes in Computer Science, Springer, pp. 552–565.

[34] SCIPR LAB. libsnark: a C++ library for zkSNARK proofs. https://github.com/scipr-lab/libsnark.

[35] SEGAL, A., FEIGENBAUM, J., AND FORD, B. Privacy-preserving lawful contact chaining [preliminary report]. In Proceedings of the 2016 ACM on Workshop on Privacy in the Electronic Society (New York, NY, USA, 2016), WPES '16, ACM, pp. 185–188.

[36] SEGAL, A., FORD, B., AND FEIGENBAUM, J. Catching bandits and only bandits: Privacy-preserving intersection warrants for lawful surveillance. In FOCI (2014).

[37] SMITH, S. W. Kudzu in the courthouse: Judgments made in the shade. The Federal Courts Law Review 3, 2 (2009).

[38] SMITH, S. W. Gagged, sealed & delivered: Reforming ECPA's secret docket. Harvard Law & Policy Review 6 (2012), 313–459.

[39] SUNDARESWARAN, S., SQUICCIARINI, A., AND LIN, D. Ensuring distributed accountability for data sharing in the cloud. IEEE Transactions on Dependable and Secure Computing 9, 4 (2012), 556–568.

[40] TAN, Y. S., KO, R. K., AND HOLMES, G. Security and data accountability in distributed systems: A provenance survey. In High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing (HPCC_EUC), 2013 IEEE 10th International Conference on (2013), IEEE, pp. 1571–1578.

[41] VIRZA, M. Private communication, November 2017.

[42] WANG, X., RANELLUCCI, S., AND KATZ, J. Global-scale secure multiparty computation. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, CCS 2017, Dallas, TX, USA, October 30 - November 03, 2017 (2017), B. M. Thuraisingham, D. Evans, T. Malkin, and D. Xu, Eds., ACM, pp. 39–56.

[43] WEITZNER, D. J., ABELSON, H., BERNERS-LEE, T., FEIGENBAUM, J., HENDLER, J. A., AND SUSSMAN, G. J. Information accountability. Commun. ACM 51, 6 (2008), 82–87.

[44] XIAO, Z., KATHIRESSHAN, N., AND XIAO, Y. A survey of accountability in computer networks and distributed systems. Security and Communication Networks 9, 4 (2016), 290–315.


work for public release of controlled amounts of information to address a general class of accountability problems, of which electronic surveillance is one instance.

Bates et al. [12] consider adding accountability to court-sanctioned wiretaps, in which law enforcement agencies can request phone call content. They encrypt duplicates of all wiretapped data in a fashion only accessible by courts and other auditors, and keep logs thereof, such that they can later be analyzed for aggregate statistics or compared with law enforcement records. A key difference between [12] and our system is that our design enables the public to directly verify the propriety of surveillance activities, partially in real time.

Goldwasser and Park [23] focus on a different legal application: secret laws, in the context of the Foreign Intelligence Surveillance Act (FISA) [3], where the operations of the court applying the law are secret. Succinct zero-knowledge is used to certify consistency of recorded actions with unknown judicial actions. While our work and [23] are similar in motivation and share some cryptographic tools, Goldwasser and Park address a different application. Moreover, our paper differs in its implementations demonstrating practicality and its consideration of aggregate statistics. Unlike this work, [23] does not model parties in the role of companies.

Other research that suggests applying cryptography to enforce rules governing access-control aspects of surveillance includes [25], which enforces privacy for NSA telephony metadata surveillance; [36], which uses private set intersection for surveillance involving joins over large databases; and [35], which uses the same technique for searching communication graphs.

Efficient MPC and SNARKs. LibSNARK [34] is the primary existing implementation of SNARKs (other libraries are in active development [1, 6]). More numerous implementation efforts have been made for MPC, under a range of assumptions and adversary models, e.g., [16, 29, 5, 11, 42, 19]. The idea of placing most of the workload of MPC on a subset of parties has been explored before (e.g., constant-round protocols by [18, 24]); we build upon this literature by designing a hierarchically structured MPC protocol specifically to match the hierarchy of the existing US court system.

        3 Threat Model and Security Goals

Our high-level policy goals are to hold the electronic surveillance process accountable to the public by (1) demonstrating that each participant performs its role properly and stays within the bounds of the law, and (2) ensuring that the public is aware of the general extent of government surveillance. The accountability measures we propose place checks on the behavior of judges, law enforcement agencies, and companies. Such checks are important against oversight as well as malice, as these participants can misbehave in a number of ways. For example, as Judge Smith explains, forgetful judges may lose track of orders whose seals have expired. More maliciously, in 2016 a Brooklyn prosecutor was arrested for "spy[ing] on [a] love interest" and "forg[ing] judges' signatures to keep the eavesdropping scheme running for about a year" [22].

Our goal is to achieve public accountability even in the face of unreliable and untrustworthy participants. Next, we specify our threat model for each type of participant in the system and enumerate the security goals that, if met, will make it possible to maintain accountability under this threat model.

3.1 Threat model

Our threat model considers the three parties presented in Figure 1 (judges, law enforcement agencies, and companies), along with the public. Their roles and the assumptions we make about each are described below. We assume all parties are computationally bounded.

Judges. Judges consider requests for surveillance and issue court orders that allow law enforcement agencies to request data from companies. We must consider judges in the context of the courts in which they operate, which include staff members and possibly other judges. We consider courts to be honest-but-curious: they will adhere to the designated protocols, but should not be able to learn internal information about the workings of other courts. Although one might argue that the judges themselves can be trusted with this information, we do not trust their staffs. Hereon, we use the terms "judge" and "court" interchangeably to refer to an entire courthouse.

In addition, when it comes to sealed orders, judges may be forgetful: as Judge Smith observes, judges frequently fail to unseal orders when the seals have expired [38].

Law enforcement agencies. Law enforcement agencies make requests for surveillance to judges in the context of ongoing investigations. If these requests are approved and a judge issues a court order, a law enforcement agency may request data from the relevant companies. We model law enforcement agencies as malicious: e.g., they may forge or alter court orders in order to gain access to unauthorized information (as in the case of the Brooklyn prosecutor [22]).

Companies. Companies possess the data that law enforcement agencies may request if they hold a court order. Companies may optionally contest these orders and, if the order is upheld, must supply the relevant data to the law enforcement agency. We model companies as malicious: e.g., they might wish to contribute to unauthorized surveillance while maintaining the outside appearance that they are not. Specifically, although companies currently release aggregate statistics about their involvement in the surveillance process [4, 7], our system does not rely on their honesty in reporting these numbers. Other malicious behavior might include colluding with law enforcement to release more data than a court order allows, or furnishing data in the absence of a court order.

The public. We model the public as malicious, as the public may include criminals who wish to learn as much as possible about the surveillance process in order to avoid being caught.2

Remark 3.1. Our system requires the parties involved in surveillance to post information to a shared ledger at various points in the surveillance process. Correspondence between logged and real-world events is an aspect of any log-based record-keeping scheme that cannot be enforced using technological means alone. Our system is designed to encourage parties to log honestly or report dishonest logging they observe (see Remark 4.1). Our analysis focuses on the cryptographic guarantees provided by the system, however, rather than a rigorous game-theoretic analysis of incentive-based behavior. Most of this paper therefore assumes that surveillance orders and other logged events are recorded correctly, except where otherwise noted.

3.2 Security Goals

In order to achieve accountability in light of this threat model, our system will need to satisfy three high-level security goals.

Accountability to the public. The system must reveal enough information to the public that members of the public are able to verify that all surveillance is conducted properly, according to publicly known rules, and specifically that law enforcement agencies and companies (which we model as malicious) do not deviate from their expected roles in the surveillance process. The public must also have enough information to prompt courts to unseal records at the appropriate times.

Correctness. All of the information that our system computes and reveals must be correct. The aggregate

2By placing all data on an immutable public ledger and giving the public no role in our system besides that of observer, we effectively reduce the public to a passive adversary.

statistics it computes and releases to the public must accurately reflect the state of electronic surveillance. Any assurances that our system makes to the public about the (im)propriety of the electronic surveillance process must be reported accurately.

Confidentiality. The public must not learn information that could undermine the investigative process. None of the other parties (courts, law enforcement agencies, and companies) may learn any information beyond that which they already know in the current ECPA process and that which is released to the public.

For particularly sensitive applications, the confidentiality guarantee should be perfect (information-theoretic): this means confidentiality should hold unconditionally, even against arbitrarily powerful adversaries that may be computationally unbounded.3 A perfect confidentiality guarantee would be of particular importance in contexts where unauthorized breaches of confidentiality could have catastrophic consequences (such as national security). We envision that a truly unconditional confidentiality guarantee could catalyze the consideration of accountability systems in contexts involving very sensitive information, where decision-makers are traditionally risk-averse, such as the court system.

        4 System Design

We present the design of our proposed system for accountability in electronic surveillance. Section 4.1 informally introduces four cryptographic primitives and their security guarantees.4 Section 4.2 outlines the configuration of the system: where data is stored and processed. Section 4.3 describes the workflow of the system in relation to the surveillance process summarized in Figure 1. Section 4.4 discusses the packages of design choices available to the court system, exploiting the flexibility of the cryptographic tools to offer a range of options that trade off between secrecy and accountability.

3This is in contrast to computational confidentiality guarantees, which provide confidentiality only against adversaries that are efficient or computationally bounded. Even with the latter, weaker type of guarantee, it is possible to ensure confidentiality against any adversary with computing power within the realistically foreseeable future; computational guarantees are quite common in practice and widely considered acceptable for many applications. One reason to opt for computational guarantees over information-theoretic ones is that typically information-theoretic guarantees carry some loss in efficiency; however, this benefit may be outweighed in particularly sensitive applications, or when confidentiality is desirable for a very long-term future where advances in computing power are not foreseeable.

4For rigorous formal definitions of these cryptographic primitives, we refer to any standard cryptography textbook (e.g., [26]).

4.1 Cryptographic Tools

Append-only ledgers. An append-only ledger is a log containing an ordered sequence of data consistently visible to anyone (within a designated system), to which data may be appended over time, but whose contents may not be edited or deleted. The append-only nature of the ledger is key for the maintenance of a globally consistent and tamper-proof data record over time.

In our system, the ledger records credibly time-stamped information about surveillance events. Typically, data stored on the ledger will cryptographically hide some sensitive information about a surveillance event while revealing select other information about it for the sake of accountability. Placing information on the ledger is one means by which we reveal information to the public, facilitating the security goal of accountability from Section 3.

Cryptographic commitments. A cryptographic commitment c is a string generated from some input data D, which has the properties of hiding and binding: i.e., c reveals no information about the value of D, and yet D can be revealed or "opened" (by the person who created the commitment) in such a way that any observer can be sure that D is the data with respect to which the commitment was made. We refer to D as the content of c.

In our system, commitments indicate that a piece of information (e.g., a court order) exists and that its content can credibly be opened at a later time. Posting commitments to the ledger also establishes the existence of a piece of information at a given point in time. Returning to the security goals from Section 3, commitments make it possible to reveal a limited amount of information early on (achieving a degree of accountability) without compromising investigative secrecy (achieving confidentiality). Later, when confidentiality is no longer necessary and information can be revealed (i.e., a seal on an order expires), the commitment can be opened by its creator to achieve full accountability.

Commitments can be perfectly (information-theoretically) hiding, achieving the perfect confidentiality goal of Section 3.2. A well-known commitment scheme that is perfectly hiding is the Pedersen commitment.5

Zero-knowledge. A zero-knowledge argument6 allows a prover P to convince a verifier V of a fact without

5While the Pedersen commitment is not succinct, we note that by combining succinct commitments with perfectly hiding commitments (as also suggested by [23]), it is possible to obtain a commitment that is both succinct and perfectly hiding.

6Zero-knowledge proof is a more commonly used term than zero-knowledge argument. The two terms denote very similar concepts; the difference lies only in the nature of the soundness guarantee (i.e., that false statements cannot be convincingly attested to), which is computational for arguments and statistical for proofs.

revealing any additional information about the fact in the process of doing so: P can provide to V a tuple (R, x, π) consisting of a binary relation R, an input x, and a proof π such that the verifier is convinced that ∃w s.t. (x, w) ∈ R, yet cannot infer anything about the witness w. Three properties are required of zero-knowledge arguments: completeness, that any true statement can be proven by the honest algorithm P such that V accepts the proof; soundness, that no purported proof of a false statement (produced by any algorithm P*) should be accepted by the honest verifier V; and zero-knowledge, that the proof π reveals no information beyond what can be inferred just from the desired statement that (x, w) ∈ R.

In our system, zero-knowledge makes it possible to reveal how secret information relates to a system of rules, or to other pieces of secret information, without revealing any further information. Concretely, our implementation (detailed in Section 7) allows law enforcement to attest (1) knowledge of the content of a commitment c (e.g., to an email address in a request for data made by a law enforcement agency), demonstrating the ability to later open c, and (2) that the content of a commitment c is equal to the content of a prior commitment c′ (e.g., to an email address in a court order issued by a judge). In case even (2) reveals too much information, our implementation supports not specifying c′ exactly, and instead attesting that c′ lies in a given set S (e.g., S could include all judges' surveillance authorizations from the last month).

In the terms of our security goals from Section 3, zero-knowledge arguments can demonstrate to the public that commitments can be opened and that proper relationships between committed information are preserved (accountability), without revealing any further information about the surveillance process (confidentiality). If these arguments fail, the public can detect when a participant has deviated from the process (accountability).

The SNARK construction [15] that we suggest for use in our system achieves perfect (information-theoretic) confidentiality,7 a goal stated in Section 3.2.

Secure multiparty computation (MPC). MPC allows a set of n parties p_1, ..., p_n, each in possession of private data x_1, ..., x_n, to jointly compute the output of a function y = f(x_1, ..., x_n) on their private inputs; y is computed via an interactive protocol executed by the parties.

Secure MPC makes two guarantees: correctness and secrecy. Correctness means that the output y is equal to f(x_1, ..., x_n). Secrecy means that any adversary that corrupts some subset S ⊂ {p_1, ..., p_n} of the parties learns nothing about {x_i : p_i ∉ S} beyond what can already be

7In fact, [15] states their secrecy guarantee in a computational (not information-theoretic) form, but their unmodified construction does achieve perfect secrecy, and the proofs of [15] suffice unchanged to prove the stronger definition [41]. That perfect zero-knowledge can be achieved is also remarked in the appendix of [14].

Figure 2: System configuration. Participants (rectangles) read and write to a public ledger (cloud) and local storage (ovals): the judge stores court orders, law enforcement stores surveillance requests, and the company stores user data. The public (diamond) reads from the ledger.

inferred given the adversarial inputs {x_i : p_i ∈ S} and the output y. Secrecy is formalized by stipulating that a simulator that is given only ({x_i : p_i ∈ S}, y) as input must be able to produce a "simulated" protocol transcript that is indistinguishable from the actual protocol execution run with all the real inputs (x_1, ..., x_n).

In our system, MPC enables computation of aggregate statistics about the extent of surveillance across the entire court system, through a computation among individual judges. MPC eliminates the need to pool the sensitive data of individual judges in the clear, or to defer to companies to compute and release this information piecemeal. In the terms of our security goals, MPC reveals information to the public (accountability) from a source we trust to follow the protocol honestly (the courts), without revealing the internal workings of courts to one another (confidentiality). It also eliminates the need to rely on potentially malicious companies to reveal this information themselves (correctness).

Secret sharing. Secret sharing facilitates our hierarchical MPC protocol. A secret sharing of some input data D consists of a set of strings (D_1, ..., D_N), called shares, satisfying two properties: (1) any subset of N−1 shares reveals no information about D, and (2) given all N shares, D can easily be reconstructed.8

Summary. In summary, these cryptographic tools support three high-level properties that we utilize to achieve our security goals:

1. Trusted records of events. The append-only ledger and cryptographic commitments create a trustworthy record of surveillance events without revealing sensitive information to the public.

2. Demonstration of compliance. Zero-knowledge arguments allow parties to provably assure the public

8For simplicity, we have described so-called "N-out-of-N" secret sharing. More generally, secret sharing can guarantee that any subset of k ≤ N shares enables reconstruction, while any subset of at most k−1 shares reveals nothing about D.

that relevant rules have been followed, without revealing any secret information.

3. Transparency without handling secrets. MPC enables the court system to accurately compute and release aggregate statistics about surveillance events without ever sharing the sensitive information of individual parties.

4.2 System Configuration

Our system is centered around a publicly visible append-only ledger where the various entities involved in the electronic surveillance process can post information. As depicted in Figure 2, every judge, law enforcement agency, and company contributes data to this ledger. Judges post cryptographic commitments to all orders issued. Law enforcement agencies post commitments to their activities (warrant requests to judges and data requests to companies) and zero-knowledge arguments about the requests they issue. Companies do the same for the data they deliver to agencies. Members of the public can view and verify all data posted to the ledger.

Each judge, law enforcement agency, and company will need to maintain a small amount of infrastructure: a computer terminal through which to compose posts, and local storage (the ovals in Figure 2) to store sensitive information (e.g., the content of sealed court orders). To attest to the authenticity of posts to the ledger, each participant will need to maintain a private signing key and publicize a corresponding verification key. We assume that public-key infrastructure could be established by a reputable party like the Administrative Office of the US Courts.
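Any standard signature scheme suffices for authenticating posts; as an illustration only (the system does not prescribe a particular scheme or library), the fragment below sketches the keep-a-signing-key, publish-a-verification-key pattern with Ed25519 via the Python cryptography package.

```python
# Hypothetical illustration of signing ledger posts; the choice of Ed25519 and
# of the `cryptography` package is ours, not specified by the system design.
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

signing_key = Ed25519PrivateKey.generate()      # kept in the judge's local storage
verification_key = signing_key.public_key()     # published via the PKI

post = b"commitment || case metadata"           # placeholder for a ledger entry
signature = signing_key.sign(post)
verification_key.verify(signature, post)        # raises InvalidSignature if forged
```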

The ledger itself could be maintained as a distributed system among the participants in the process, as a distributed system among a more exclusive group of participants with higher trustworthiness (e.g., the circuit courts of appeals), or by a single entity (e.g., the Administrative Office of the US Courts or the Supreme Court).

4.3 Workflow

Posting to the ledger. The workflow of our system augments the electronic surveillance workflow in Figure 1 with additional information posted to the ledger, as depicted in Figure 3. When a judge issues an order (step 2 of Figure 1), she also posts a commitment to the order and additional metadata about the case. At a minimum, this metadata must include the date that the order's seal expires; depending on the system configuration, she could post other metadata (e.g., Judge Smith's cover sheet). The commitment allows the public to later verify that the order was properly unsealed, but reveals no information about the commitment's content in the meantime, achieving a degree of accountability in the short term (while confidentiality is necessary) and full accountability in the long term (when the seal expires and confidentiality is unnecessary). Since judges are honest-but-curious, they will adhere to this protocol and reliably post commitments whenever new court orders are issued.

Figure 3: Data posted to the public ledger as the protocol runs. Time moves from left to right. Each rectangle is a post to the ledger: the judge posts a commitment to the order and case metadata, law enforcement posts a commitment to its data request and a ZK argument, and the company posts a commitment to its data response and a ZK argument. Dashed arrows between rectangles indicate that the source of the arrow could contain a visible reference to the destination. The ovals contain the entities that make each post.

The agency then uses this order to request data from a company (step 3 in Figure 1) and posts a commitment to this request, alongside a zero-knowledge argument that the request is compatible with a court order (and possibly also with other legal requirements). This commitment, which may never be opened, provides a small amount of accountability within the confines of confidentiality, revealing that some law enforcement action took place. The zero-knowledge argument takes accountability a step further: it demonstrates to the public that the law enforcement action was compatible with the original court order (which we trust to have been committed properly), forcing the potentially-malicious law enforcement agency to adhere to the protocol or make public its non-compliance. (Failure to adhere would result in a publicly visible invalid zero-knowledge argument.) If the company responds with matching data (step 6 in Figure 1), it posts a commitment to its response and an argument that it furnished (only) the data implicated by the order and data request. These commitments and arguments serve a role analogous to those posted by law enforcement.
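To fix ideas, one hypothetical shape for a single ledger post is sketched below; the field names are ours and are not prescribed by the protocol definition in Section 5.

```python
# A hypothetical shape for a ledger post, gathering the fields discussed above;
# field names are illustrative and not taken from the paper's protocol.
from dataclasses import dataclass, field

@dataclass
class LedgerPost:
    author: str            # e.g., "Agency A_7", or an anonymous group-signature tag
    commitment: bytes      # Commit(m, ω) for the sealed document m
    zk_argument: bytes     # SNARK attesting compliance with the referenced order
    references: list[int] = field(default_factory=list)   # indices of related posts
    metadata: dict = field(default_factory=dict)           # e.g., {"seal_expires": "2019-01-01"}
    signature: bytes = b""                                  # signature over the fields above
```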

This system does not require commitments to all actions in Figure 1. For example, it only requires a law enforcement agency to commit to a successful request for data (step 3), rather than any proposed request (step 1). The system could easily be augmented with additional commitments and proofs as desired by the court system.

The zero-knowledge arguments about relationships between commitments reveal one additional piece of information. For a law enforcement agency to prove that its committed data request is compatible with a particular court order, it must reveal which specific committed court order authorized the request. In other words, the zero-knowledge arguments reveal the links between specific actions of each party (dashed arrows in Figure 3). These links could be eliminated, reducing visibility into the workflow of surveillance. Instead, entities would argue that their actions are compatible with some court order among a group of recent orders.

Remark 4.1. We now briefly discuss other possible malicious behaviors by law enforcement agencies and companies involving inaccurate logging of data. Though, as mentioned in Remark 3.1, correspondence between real-world events and logged items is not enforceable by technological means alone, we informally argue that our design incentivizes honest logging, and reporting of dishonest logging, under many circumstances.

A malicious law enforcement agency could omit commitments, or commit to one surveillance request but send the company a different request. This action is visible to the company, which could reveal this misbehavior to the judge. This visibility incentivizes companies to record their actions diligently, so as to avoid any appearance of negligence, let alone complicity in the agency's misbehavior.

Similarly, a malicious company might fail to post a commitment, or post a commitment inconsistent with its actual behavior. These actions are visible to law enforcement agencies, who could report violations to the judge (and otherwise risk the appearance of negligence or complicity). To make such violations visible to the public, we could add a second law enforcement commitment that acknowledges the data received and proves that it is compatible with the original court order and law enforcement request.

However, even incentive-based arguments do not address the case of a malicious law enforcement agency colluding with a malicious company. These entities could simply withhold from posting any information to the ledger (or post a sequence of false but consistent information), thereby making it impossible to detect violations. To handle this scenario, we have to defer to the legal process itself: when this data is used as evidence in court, a judge should ensure that appropriate documentation was posted to the ledger and that the data was gathered appropriately.

Aggregate statistics. At configurable intervals, the individual courts use MPC to compute aggregate statistics about their surveillance activities.9 An analyst, such as the Administrative Office of the US Courts, receives the result of this MPC and posts it to the ledger. The particular kinds of aggregate statistics computed are at the discretion of the court system. They could include figures already tabulated in the Administrative Office of the US Courts' Wiretap Reports [9] (i.e., orders by state and by criminal offense) and in company-issued transparency reports [4, 7] (i.e., requests and number of users implicated, by company). Due to the generality10 of MPC, it is theoretically possible to compute any function of the information known to each of the judges. For performance reasons, we restrict our focus to totals and aggregated thresholds, a set of operations expressive enough to replicate existing transparency reports.

Figure 4: The flow of data as aggregate statistics are computed. Each lower-court judge calculates its component of the statistic and secret-shares it into 12 shares, one for each judicial circuit (illustrated by colors). The servers of the circuit courts then engage in an MPC to compute the aggregate statistic from the input shares.

9 Microsoft [7] and Google [4] currently release their transparency reports every six months, and the Administrative Office of the US Courts does so annually [9]. We take these intervals to be our baseline for the frequency with which aggregate statistics would be released in our system, although releasing statistics more frequently (e.g., monthly) would improve transparency.

10 General MPC is a common term used to describe MPC that can compute arbitrary functions of the participants' data, as opposed to just restricted classes of functions.

The statistics themselves are calculated using MPC. In principle, the hundreds of magistrate and district court judges could attempt to directly perform MPC with each other. However, as we find in Section 6, computing even simple functions among hundreds of parties is prohibitively slow. Moreover, the logistics of getting every judge online simultaneously with enough reliability to complete a multiround protocol would be difficult; if a single judge went offline, the protocol would stall.

Instead, we compute aggregate statistics in a hierarchical manner, as depicted in Figure 4. We exploit the existing hierarchy of the federal court system. Each of the lower-court judges is under the jurisdiction of one of twelve circuit courts of appeals. Each lower-court judge computes her individual component of the larger aggregate statistic (e.g., the number of orders issued against Google in the past six months) and divides it into twelve secret shares, sending one share to (a server controlled by) each circuit court of appeals. Distinct shares are represented by separate colors in Figure 4. So long as at least one circuit server remains uncompromised, the lower-court judges can be assured, by the security of the secret-sharing scheme, that their contributions to the larger statistic are confidential. The circuit servers engage in a twelve-party MPC that reconstructs the judges' input data from the shares, computes the desired function, and reveals the result to the analyst. By concentrating the computationally intensive and logistically demanding part of the MPC process in twelve stable servers, this design eliminates many of the performance and reliability challenges of the flat (non-hierarchical) protocol (Section 6 discusses performance).
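The sketch below traces the data flow of Figure 4 for the simplest statistic, a total. It is a plain-Python simplification of ours, not the Jiff-based implementation evaluated in Section 6; for thresholds, the twelve circuit servers would run an actual MPC over the received shares rather than a local sum.

```python
# Illustrative sketch of the hierarchical totals computation: each judge
# additively shares her local count among twelve circuit servers, each server
# sums the shares it holds, and the twelve partial sums combine to the national
# total without exposing any individual judge's count.
import secrets

PRIME = 2**61 - 1
NUM_CIRCUITS = 12

def share(value, n=NUM_CIRCUITS):
    parts = [secrets.randbelow(PRIME) for _ in range(n - 1)]
    return parts + [(value - sum(parts)) % PRIME]

judge_counts = [3, 0, 7, 1]                      # hypothetical per-judge order counts
inboxes = [[] for _ in range(NUM_CIRCUITS)]      # one inbox per circuit server
for count in judge_counts:
    for inbox, s in zip(inboxes, share(count)):
        inbox.append(s)

partial_sums = [sum(inbox) % PRIME for inbox in inboxes]   # local work per circuit
total = sum(partial_sums) % PRIME                           # combine the 12 results
assert total == sum(judge_counts)
```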

This MPC strategy allows the court system to compute aggregate statistics (towards the accountability goal of Section 3.2). Since courts are honest-but-curious, and by the correctness guarantee of MPC, these statistics will be computed accurately on correct data (correctness of Section 3.2). MPC enables the courts to perform these computations without revealing any court's internal information to any other court (confidentiality of Section 3.2).

4.4 Additional Design Choices

The preceding section described our proposed system with its full range of accountability features. This configuration is only one of many possibilities. Although cryptography makes it possible to release information in a controlled way, the fact remains that revealing more information poses greater risks to investigative integrity. Depending on the court system's level of risk tolerance, features can be modified or removed entirely to adjust the amount of information disclosed.

Cover sheet metadata. A judge might reasonably fear that a careful criminal could monitor cover sheet metadata to detect surveillance. At the cost of some transparency, judges could post less metadata when committing to an order. (At a minimum, the judge must post the date at which the seal expires.) The cover sheets integral to Judge Smith's proposal were also designed to supply certain information towards assessing the scale of surveillance; MPC replicates this outcome without releasing information about individual orders.

Commitments by individual judges. In some cases, posting a commitment might reveal too much. In a low-crime area, mere knowledge that a particular judge approved surveillance could spur a criminal organization to change its behavior. A number of approaches would address this concern. Judges could delegate the responsibility of posting to the ledger to the same judicial circuits that mediate the hierarchical MPC. Alternatively, each judge could continue posting to the ledger herself, but instead of signing the commitment under her own name, she could sign it as coming from some court in her judicial circuit, or nationwide, without revealing which one (group signatures [17] or ring signatures [33] are designed for this sort of anonymous signing within groups). Either of these approaches would conceal which individual judge approved the surveillance.

Aggregate statistics. The aggregate statistic mechanism offers a wide range of choices about the data to be revealed. For example, if the court system is concerned about revealing information about individual districts, statistics could be aggregated by any number of other parameters, including the type of crime being investigated or the company from which the data was requested.

        5 Protocol Definition

We now define a complete protocol capturing the workflow from Section 4. We assume a public-key infrastructure and synchronous communication on authenticated (encrypted) point-to-point channels.

Preliminaries. The protocol is parametrized by:
• a secret-sharing scheme Share;
• a commitment scheme C;
• a special type of zero-knowledge primitive, SNARK;
• a multi-party computation protocol MPC; and
• a function CoverSheet that maps court orders to cover sheet information.

Several parties participate in the protocol:
• n judges J_1, ..., J_n;
• m law enforcement agencies A_1, ..., A_m;
• q companies C_1, ..., C_q;
• r trustees T_1, ..., T_r;11
• P, a party representing the public;
• Ledger, a party representing the public ledger; and
• Env, a party called "the environment," which models the occurrence over time of exogenous events.

Ledger is a simple ideal functionality allowing any party to (1) append entries to a time-stamped append-only ledger and (2) retrieve ledger entries. Entries are authenticated except where explicitly anonymous.

Env is a modeling device that specifies the protocol behavior in the context of arbitrary exogenous event sequences occurring over time. Upon receipt of message clock, Env responds with the current time. To model the occurrence of an exogenous event e (e.g., a case in need of surveillance), Env sends information about e to the affected parties (e.g., a law enforcement agency).

11 In our specific case study, r = 12 and the trustees are the twelve US Circuit Courts of Appeals. The trustees are the parties which participate in the multi-party computation of aggregate statistics based on input data from all judges, as shown in Figure 4 and defined formally later in this subsection.

Next, we give the syntax of our cryptographic tools12 and then define the behavior of the remaining parties.

A commitment scheme is a triple of probabilistic poly-time algorithms C = (Setup, Commit, Open), as follows:
• Setup(1^κ) takes as input a security parameter κ (in unary) and outputs public parameters pp.
• Commit(pp, m, ω) takes as input pp, a message m, and randomness ω. It outputs a commitment c.
• Open(pp, m′, c, ω′) takes as input pp, a message m′, a commitment c, and randomness ω′. It outputs 1 if c = Commit(pp, m′, ω′), and 0 otherwise.

pp is generated in an initial setup phase and is thereafter publicly known to all parties, so we elide it for brevity.
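As a concrete (if simplified) instantiation of this syntax, the sketch below uses the SHA256-based construction described later in Section 7.2, with Setup and pp elided as in the text; the helper names and example message are ours.

```python
# A minimal commitment sketch: ω is 256 bits of randomness appended to the
# message before hashing (following Section 7.2). Illustrative only.
import hashlib
import hmac
import secrets

def commit(message: bytes):
    omega = secrets.token_bytes(32)                    # 256-bit randomness ω
    c = hashlib.sha256(message + omega).digest()       # commitment c
    return c, omega

def open_commitment(message: bytes, c: bytes, omega: bytes) -> bool:
    # outputs True iff c = Commit(message, ω)
    return hmac.compare_digest(hashlib.sha256(message + omega).digest(), c)

c, omega = commit(b"order 17-mc-0123; seal expires 2019-01-01")
assert open_commitment(b"order 17-mc-0123; seal expires 2019-01-01", c, omega)
```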

Algorithm 1: Law enforcement agency A_i

• On receipt of a surveillance request event e = (Surveil, u, s) from Env, where u is the public key of a company and s is the description of a surveillance request directed at u: send message (u, s) to a judge.13

• On receipt of a decision message (u, s, d) from a judge, where d ≠ reject:14 (1) generate a commitment c = Commit((s, d), ω) to the request and store (c, s, d, ω) locally; (2) generate a SNARK proof π attesting compliance of (s, d) with relevant regulations; (3) post (c, π) to the ledger; (4) send request (s, d, ω) to company u.

• On receipt of an audit request (c, P, z) from the public: generate decision b ← A_i^dp(c, P, z). If b = accept, generate a SNARK proof π attesting compliance of (s, d) with the regulations indicated by the audit request (c, P, z); else, send (c, P, z, b) to a judge.13

• On receipt of an audit order (d, c, P, z) from a judge: if d = accept, generate a SNARK proof π attesting compliance of (s, d) with the regulations indicated by the audit request (c, P, z).

Agencies. Each agency A_i has an associated decision-making process A_i^dp, modeled by a stateful algorithm that maps audit requests to {accept} ∪ {0,1}*, where the output is either an acceptance or a description of why the agency chooses to deny the request. Each agency operates according to Algorithm 1, which is parametrized by its own A_i^dp. In practice, we assume A_i^dp would be instantiated by the agency's human decision-making process.

12 For formal security definitions beyond syntax, we refer to any standard cryptography textbook such as [26].

13 For the purposes of our exposition, this could be an arbitrary judge. In practice, it would likely depend on the jurisdiction in which the surveillance event occurs and in which the law enforcement agency operates, and perhaps also on the type of case.

14 For simplicity of exposition, Algorithm 1 only addresses the case d ≠ reject and omits the possibility of appeal by the agency. The algorithm could straightforwardly be extended to encompass appeals by incorporating the decision to appeal into A_i^dp.

15 This is the step invoked by requests for unsealed documents.

Algorithm 2: Judge J_i

• On receipt of a surveillance request (u, s) from an agency A_j: (1) generate decision d ← J_i^dp1(s); (2) send response (u, s, d) to A_j; (3) generate a commitment c = Commit((u, s, d), ω) to the decision and store (c, u, s, d, ω) locally; (4) post (CoverSheet(d), c) to the ledger.

• On receipt of denied audit request information ζ from an agency A_j: generate decision d ← J_i^dp2(ζ) and send (d, ζ) to A_j and to the public P.

• On receipt of a data revelation request (c, z) from the public:15 generate decision b ← J_i^dp3(c, z). If b = accept, send to the public P the message and randomness (m, ω) corresponding to c; else, if b = reject, send reject to P, with an accompanying explanation if provided.

Judges. Each judge J_i has three associated decision-making processes, J_i^dp1, J_i^dp2, and J_i^dp3. J_i^dp1 maps surveillance requests to either a rejection or an authorizing court order; J_i^dp2 maps denied audit requests to either a confirmation of the denial or a court order overturning the denial; and J_i^dp3 maps data revelation requests to either an acceptance or a denial (perhaps along with an explanation of the denial, e.g., "this document is still under seal"). Each judge operates according to Algorithm 2, which is parametrized by the individual judge's (J_i^dp1, J_i^dp2, J_i^dp3).

Algorithm 3: Company C_i

• Upon receiving a surveillance request (s, d, ω) from an agency A_j: if the court order d bears the valid signature of a judge and Commit((s, d), ω) matches a corresponding commitment posted by law enforcement on the ledger, then (1) generate commitment c ← Commit(δ, ω) and store (c, δ, ω) locally; (2) generate a SNARK proof π attesting that δ is compliant with the request s and the judge-signed order d; (3) post (c, π) anonymously to the ledger; (4) reply to A_j by furnishing the requested data δ along with ω.

The public. The public P exhibits one main type of behavior in our model: upon receiving an event message e = (a, ξ) from Env (describing either an audit request or a data revelation request), P sends ξ to a (an agency or court). Additionally, the public periodically checks the ledger for validity of posted SNARK proofs and takes steps to flag any non-compliance detected (e.g., through the court system or the news media).

Companies and trustees. Algorithms 3 and 4 describe companies and trustees. Companies execute judge-authorized instructions and log their actions by posting commitments on the ledger. Trustees run MPC to compute aggregate statistics from data provided in secret-shared form by judges; MPC events are triggered by Env.

Algorithm 4: Trustee T_i

• Upon receiving an aggregate statistic event message e = (Compute, f, D_1, ..., D_n) from Env:

  1. For each i′ ∈ [r] (such that i′ ≠ i), send e to T_i′.

  2. For each j ∈ [n], send the message (f, D_j) to J_j. Let δ_{j,i} be the response from J_j.

  3. With parties T_1, ..., T_r, participate in the MPC protocol MPC with input (δ_{1,i}, ..., δ_{n,i}) to compute the functionality ReconInputs ∘ f, i.e., reconstruct the judges' inputs from their shares and then apply f, where ReconInputs is defined as follows:

     ReconInputs((δ_{1,i′}, ..., δ_{n,i′})_{i′ ∈ [r]}) = (Recon(δ_{j,1}, ..., δ_{j,r}))_{j ∈ [n]}

     Let y denote the output from the MPC.16

  4. Send y to J_j for each j ∈ [n].17

• Upon receiving an MPC initiation message e = (Compute, f, D_1, ..., D_n) from another trustee T_i′:

  1. Receive a secret share δ_{j,i} from each judge J_j, respectively.

  2. With parties T_1, ..., T_r, participate in the MPC protocol MPC with input (δ_{1,i}, ..., δ_{n,i}) to compute the functionality ReconInputs ∘ f.

        6 Evaluation of MPC Implementation

In our proposal, judges use secure multiparty computation (MPC) to compute aggregate statistics about the extent and distribution of surveillance. Although in principle MPC can support secure computation of any function of the judges' data, full generality can come with unacceptable performance limitations. In order that our protocols scale to hundreds of federal judges, we narrow our attention to two kinds of functions that are particularly useful in the context of surveillance.

The extent of surveillance (totals). Computing totals involves summing values held by the parties without revealing information about any value to anyone other than its owner. Totals become averages by dividing by the number of data points. In the context of electronic surveillance, totals are the most prevalent form of computation on government and corporate transparency reports: How many court orders were approved for cases involving homicide, and how many for drug offenses? How long was the average order in effect? How many orders were issued in California [9]? Totals make it possible to determine the extent of surveillance.

The distribution of surveillance (thresholds). Thresholding involves determining the number of data points that exceed a given cut-off: How many courts issued more than ten orders for data from Google? How many orders were in effect for more than 90 days? Unlike totals, thresholds can reveal selected facts about the distribution of surveillance, i.e., the circumstances in which it is most prevalent. Thresholds go beyond the kinds of questions typically answered in transparency reports, offering new opportunities to improve accountability.
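Stated in the clear, the two statistic families are simple functions of the judges' values; the sketch below (with hypothetical inputs) merely pins down what the MPC must compute without ever pooling the underlying data.

```python
# The two statistic families in the clear, for reference only; in the system
# these functions are evaluated under MPC and the inputs are never pooled.
def total(values):
    return sum(values)

def threshold_count(values, cutoff):
    # e.g., "how many courts issued more than ten orders for data from Google?"
    return sum(1 for v in values if v > cutoff)

counts = [12, 3, 0, 25, 7]          # hypothetical per-court order counts
assert total(counts) == 47
assert threshold_count(counts, 10) == 2
```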

To enable totals and thresholds to scale to the size of the federal court system, we implemented a hierarchical MPC protocol, as described in Figure 4, whose design mirrors the hierarchy of the court system. Our evaluation shows the hierarchical structure reduces MPC complexity from quadratic in the number of judges to linear.

We implemented protocols that make use of totals and thresholds using two existing JavaScript-based MPC libraries: WebMPC [16, 29] and Jiff [5]. WebMPC is the simpler and less versatile library; we test it as a baseline and as a "sanity check" that its performance scales as expected, then move on to the more interesting experiment of evaluating Jiff. We opted for JavaScript libraries to facilitate integration into a web application, which is suitable for federal judges to submit information through a familiar browser interface regardless of the differences in their local system setups. Both of these libraries are designed to facilitate MPC across dozens or hundreds of computers; we simulated this effect by running each party in a separate process on a computer with 16 CPU cores and 64GB of RAM. We tested these protocols on randomly generated data containing values in the hundreds, which reflects the same order of magnitude as data present in existing transparency reports. Our implementations were crafted with two design goals in mind:

1. Protocols should scale to roughly 1000 parties, the approximate size of the federal judiciary [10], performing efficiently enough to facilitate periodic transparency reports.

2. Protocols should not require all parties to be online regularly or at the same time.

In the subsections that follow, we describe and evaluate our implementations in light of these goals.

6.1 Computing Totals in WebMPC

Figure 5: Performance of MPC using the WebMPC library.

WebMPC is a JavaScript-based library that can securely compute sums in a single round. The underlying protocol relies on two parties who are trusted not to collude with one another: an analyst, who distributes masking information to all protocol participants at the beginning of the process and receives the final aggregate statistic, and a facilitator, who aggregates this information together in masked form. The participants use the masking information from the analyst to mask their inputs and send them to the facilitator, who aggregates them and sends the result (i.e., a masked sum) to the analyst. The analyst removes the mask and uncovers the aggregated result. Once the participants have their masks, they receive no further messages from any other party; they can submit this masked data to the facilitator in an uncoordinated fashion and go offline immediately afterwards. Even if some anticipated participants do not send data, the protocol can still run to completion with those who remain.
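The following is a rough sketch of that masking idea, written for illustration rather than taken from the WebMPC codebase.

```python
# Illustrative sketch of the one-round masked-sum idea: the analyst hands each
# participant a random mask, the facilitator sees only masked inputs, and the
# analyst removes the total mask at the end. Analyst and facilitator must not collude.
import secrets

PRIME = 2**61 - 1
inputs = [5, 2, 9]                                     # participants' private values

masks = [secrets.randbelow(PRIME) for _ in inputs]     # analyst: one mask per party

# participants: mask their inputs, submit to the facilitator, then go offline
masked = [(x + m) % PRIME for x, m in zip(inputs, masks)]

masked_sum = sum(masked) % PRIME                       # facilitator: aggregate only

result = (masked_sum - sum(masks)) % PRIME             # analyst: remove total mask
assert result == sum(inputs)
```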

To make this protocol feasible in our setting, we need to identify a facilitator and analyst who will not collude. In many circumstances, it would be acceptable to rely on reputable institutions already present in the court system, such as the circuit courts of appeals, the Supreme Court, or the Administrative Office of the US Courts.

Although this protocol's simplicity limits its generality, it also makes it possible for the protocol to scale efficiently to a large number of participants. As Figure 5 illustrates, the protocol scales linearly with the number of parties. Even with 400 parties, the largest size we tested, the protocol still completed in just under 75 seconds. Extrapolating from the linear trend, it would take about three minutes to compute a summation across the entire federal judiciary. Since existing transparency reports are typically released just once or twice a year, it is reasonable to invest three minutes of computation (or less than a fifth of a second per judge) for each statistic.

6.2 Thresholds and Hierarchy with Jiff

To make use of MPC operations beyond totals, we turned to Jiff, another MPC library implemented in JavaScript. Jiff is designed to support MPC for arbitrary functionalities, although inbuilt support for some more complex functionalities is still under development at the time of writing. Most importantly for our needs, Jiff supports thresholding and multiplication in addition to sums. We evaluated Jiff on three different MPC protocols: totals (as with WebMPC), additive thresholding (i.e., how many values exceeded a specific threshold), and multiplicative thresholding (i.e., did all values exceed a specific threshold). In contrast to computing totals via summation, certain operations like thresholding require more complicated computation and multiple rounds of communication. By building on Jiff with our hierarchical MPC implementation, we demonstrate that these operations are viable at the scale required by the federal court system.

Figure 6: Flat total (red), additive threshold (blue), and multiplicative threshold (green) protocols in Jiff.

Figure 7: Hierarchical total (red), additive threshold (blue), and multiplicative threshold (green) protocols in Jiff. Note the difference in axis scales from Figure 6.

As a baseline, we ran sums, additive thresholding, and multiplicative thresholding benchmarks with all judges as full participants in the MPC protocol, sharing the workload equally, a configuration we term the flat protocol (in contrast to the hierarchical protocol we present next). Figure 6 illustrates that the running time of these protocols grows quadratically with the number of judges participating. These running times quickly became untenable: while summation took several minutes among hundreds of judges, both thresholding benchmarks could barely handle tens of judges in the same time envelopes. These graphs illustrate the substantial performance disparity between summation and thresholding.

In Section 4, we described an alternative "hierarchical" MPC configuration to reduce this quadratic growth to linear. As depicted in Figure 4, each lower-court judge splits a piece of data into twelve secret shares, one for each circuit court of appeals. These shares are sent to the corresponding courts, who conduct a twelve-party MPC that performs a total or thresholding operation based on the input shares. If n lower-court judges participate, the protocol is tantamount to computing n twelve-party summations followed by a single n-input summation or threshold. As n increases, the amount of work scales linearly. So long as at least one circuit court remains honest and uncompromised, the secrecy of the lower-court data endures, by the security of the secret-sharing scheme.

Figure 7 illustrates the linear scaling of the twelve-party portion of the hierarchical protocols; we measured only the computation time after the circuit courts received all of the additive shares from the lower courts. While the flat summation protocol took nearly eight minutes to run on 300 judges, the hierarchical summation scaled to 1000 judges in less than 20 seconds, besting even the WebMPC results. Although thresholding characteristically remained much slower than summation, the hierarchical protocol scaled to nearly 250 judges in about the same amount of time that it took the flat protocol to run on 35 judges. Since the running times for the threshold protocols were in the tens of minutes for large benchmarks, the linear trend is noisier than for the total protocol. Most importantly, both of these protocols scaled linearly, meaning that, given sufficient time, thresholding could scale up to the size of the federal court system. This performance is acceptable if a few choice thresholds are computed at the frequency at which existing transparency reports are published.18

One additional benefit of the hierarchical protocols is that lower courts do not need to stay online while the protocol is executing, a goal we articulated at the beginning of this section. A lower court simply needs to send in its shares to the requisite circuit courts, one message per circuit court for a grand total of twelve messages, after which it is free to disconnect. In contrast, the flat protocol grinds to a halt if even a single judge goes offline. The availability of the hierarchical protocol relies on a small set of circuit courts, who could invest in more robust infrastructure.

        7 Evaluation of SNARKs

We define the syntax of preprocessing zero-knowledge SNARKs for arithmetic circuit satisfiability [15].

A SNARK is a triple of probabilistic polynomial-time algorithms SNARK = (Setup, Prove, Verify), as follows:

• Setup(1^κ, R) takes as input the security parameter κ and a description of a binary relation R (an arithmetic circuit of size polynomial in κ), and outputs a pair (pk_R, vk_R) of a proving key and verification key.

• Prove(pk_R, (x, w)) takes as input a proving key pk_R and an input-witness pair (x, w), and outputs a proof π attesting to x ∈ L_R, where L_R = {x : ∃w s.t. (x, w) ∈ R}.

• Verify(vk_R, (x, π)) takes as input a verification key vk_R and an input-proof pair (x, π), and outputs a bit indicating whether π is a valid proof for x ∈ L_R.

18 Too high a frequency is also inadvisable, due to the possibility of revealing too-granular information when combined with the timings of specific investigations' court orders.

Before participants can create and verify SNARKs, they must establish a proving key, which any participant can use to create a SNARK, and a corresponding verification key, which any participant can use to verify a SNARK so created. Both of these keys are publicly known. The keys are distinct for each circuit (representing an NP relation) about which proofs are generated, and can be reused to produce as many different proofs with respect to that circuit as desired. Key generation uses randomness that, if known or biased, could allow participants to create proofs of false statements [13]. The key generation process must therefore protect and then destroy this information.

Using MPC to do key generation based on randomness provided by many different parties provides the guarantee that, as long as at least one of the MPC participants behaved correctly (i.e., did not bias his randomness and destroyed it afterward), the resulting keys are good (i.e., do not permit proofs of false statements). This approach has been used in the past, most notably by the cryptocurrency Zcash [8]. Despite the strong guarantees provided by this approach to key generation when at least one party is not corrupted, concerns have been expressed about the wisdom of trusting in the assumption of one honest party in the Zcash setting, which involves large monetary values and a system design inherently centered around the principles of full decentralization.

For our system, we propose key generation be done in a one-time MPC among several of the traditionally reputable institutions in the court system, such as the Supreme Court or Administrative Office of the US Courts, ideally together with other reputable parties from different branches of government. In our setting, the use of MPC for SNARK key generation does not constitute as pivotal and potentially risky a trust assumption as in Zcash, in that the court system is close-knit and inherently built with the assumption of trustworthiness of certain entities within the system. In contrast, a decentralized cryptocurrency (1) must, due to its distributed nature, rely for key generation on MPC participants that are essentially strangers to most others in the system, and (2) could be said to derive its very purpose from not relying on the trustworthiness of any small set of parties.

We note that, since key generation is a one-time task for each circuit, we can tolerate a relatively performance-intensive process. Proving and verification keys can be distributed on the ledger.

7.1 Argument Types

Our implementation supports three types of arguments.

Argument of knowledge for a commitment (P_k). Our simplest type of argument attests the prover's knowledge of the content of a given commitment c, i.e., that she could open the commitment if required. Whenever a party publishes a commitment, she can accompany it with a SNARK attesting that she knows the message and randomness that were used to generate the commitment. Formally, this is an argument that the prover knows m and ω that correspond to a publicly known c such that Open(m, c, ω) = 1.

Argument of commitment equality (P_eq). Our second type of argument attests that the content of two publicly known commitments c_1, c_2 is the same. That is, for two publicly known commitments c_1 and c_2, the prover knows m_1, m_2, ω_1, and ω_2 such that Open(m_1, c_1, ω_1) = 1 ∧ Open(m_2, c_2, ω_2) = 1 ∧ m_1 = m_2.

More concretely, suppose that an agency wishes to release relational information: that the identifier (e.g., email address) in the request is the same identifier that a judge approved. The judge and law enforcement agency post commitments c_1 and c_2, respectively, to the identifiers they used. The law enforcement agency then posts an argument attesting that the two commitments are to the same value.19 Since circuits use fixed-size inputs, an argument implicitly reveals the length of the committed message. To hide this information, the law enforcement agency can pad each input up to a uniform length.

P_eq may be too revealing under certain circumstances: for the public to verify the argument, the agency (who posted c_2) must explicitly identify c_1, potentially revealing which judge authorized the data request and when.

Existential argument of commitment equality (P_∃). Our third type of argument allows decreasing the resolution of the information revealed, by proving that a commitment's content is the same as that of some other commitment among many. Formally, it shows that for publicly known commitments c, c_1, ..., c_N, respectively to secret values (m, ω), (m_1, ω_1), ..., (m_N, ω_N), there exists i such that Open(m, c, ω) = 1 ∧ Open(m_i, c_i, ω_i) = 1 ∧ m = m_i. We treat i as an additional secret input, so that for any value of N, only two commitments need to be opened. This scheme trades off between resolution (number of commitments) and efficiency, a question we explore below.
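To summarize the three relations, the sketch below restates each statement as an ordinary predicate over the (normally secret) witnesses; this is only a restatement of the definitions above, not the arithmetic-circuit encoding used with LibSNARK.

```python
# The three statements as plain predicates; in the system they are compiled to
# arithmetic circuits and proven without revealing the witnesses (m, ω, i).
import hashlib

def open_ok(m: bytes, c: bytes, omega: bytes) -> bool:
    # the SHA256-based Open from Section 7.2
    return hashlib.sha256(m + omega).digest() == c

def P_k(c, m, omega):
    # knowledge of an opening of c
    return open_ok(m, c, omega)

def P_eq(c1, c2, m1, omega1, m2, omega2):
    # c1 and c2 open to the same message
    return open_ok(m1, c1, omega1) and open_ok(m2, c2, omega2) and m1 == m2

def P_exists(c, cs, m, omega, i, m_i, omega_i):
    # c opens to the same message as the i-th commitment in cs; i is secret
    return open_ok(m, c, omega) and open_ok(m_i, cs[i], omega_i) and m == m_i
```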

We have chosen these three types of arguments to implement, but LibSNARK supports arbitrary predicates in principle, and there are likely others that would be useful and run efficiently in practice. A useful generalization of P_eq and P_∃ would be to replace equality with more sophisticated domain-specific predicates: instead of showing that messages m_1, m_2 corresponding to a pair of commitments are equal, one could show p(m_1, m_2) = 1 for other predicates p (e.g., "less-than" or "signed by same court"). The types of arguments that can be implemented efficiently will expand as SNARK libraries' efficiency improves; our system inherits such efficiency gains.

19 To produce a proof for P_eq, the prover (e.g., the agency) needs to know both ω_2 and ω_1, but in some cases c_1 (and thus ω_1) may have been produced by a different entity (e.g., the judge). Publicizing ω_1 is unacceptable, as it compromises the hiding of the commitment content. To solve this problem, the judge can include ω_1 alongside m_1 in secret documents that both parties possess (e.g., the court order).

7.2 Implementation

We implemented these zero-knowledge arguments with LibSNARK [34], a C++ library for creating general-purpose SNARKs from arithmetic circuits. We implemented commitments using the SHA256 hash function;20 ω is a 256-bit random string appended to the message before it is hashed. In this section, we show that useful statements can be proven within a reasonable performance envelope. We consider six criteria: the size of the proving key, the size of the verification key, the size of the proof statement, the time to generate keys, the time to create proofs, and the time to verify proofs. We evaluated these metrics with messages from 16 to 1232 bytes on P_k, P_eq, and P_∃ (N = 100, 400, 700, and 1000, large enough to obscure links between commitments) on a computer with 16 CPU cores and 64GB of RAM.

Argument size. The argument is just 287 bytes. Accompanying each argument are its public inputs (in this case, commitments). Each commitment is 256 bits.21 An auditor needs to store these commitments anyway as part of the ledger, and each commitment can be stored just once and reused for many proofs.

Verification key size. The size of the verification key is proportional to the size of the circuit and its public inputs. The key was 106KB for P_k (one commitment as input and one SHA256 circuit) and 2083KB for P_eq (two commitments and two SHA256 circuits). Although P_∃ computes SHA256 just twice, its smallest input, 100 commitments, is 50 times as large as that of P_k and P_eq; the keys are correspondingly larger and grow linearly with the input size. For 100, 400, 700, and 1000 commitments, the verification keys were respectively 10MB, 41MB, 71MB, and 102MB. Since only one verification key is necessary for each circuit, these keys are easily small enough to make large-scale verification feasible.

Proving key size. The proving keys are much larger, in the hundreds of megabytes. Their size grows linearly with the size of the circuit, so longer messages (which require more SHA256 computations), more complicated circuits, and (for P_∃) more inputs lead to larger keys. Figure 8a reflects this trend. Proving keys are largest for P_∃ with 1000 inputs on 1232-byte messages, and shrink as the message size and the number of commitments decrease. P_k and P_eq, which have simpler circuits, still have bigger proving keys for bigger messages. Although these keys are large, only entities that create each kind of proof need to store the corresponding key. Storing one key for each type of argument we have presented takes only about 1GB at the largest input sizes.

20 Certain other hash functions may be more amenable to representation as arithmetic circuits, and thus more "SNARK-friendly." We opted for a proof of concept with SHA256, as it is so widely used.

21 LibSNARK stores each bit in a 32-bit integer, so an argument involving k commitments takes about 1024k bytes. A bit-vector representation would save a factor of 32.

Key generation time. Key generation time increased linearly with the size of the keys, from a few seconds for P_k and P_eq on small messages to a few minutes for P_∃ on the largest parameters (Figure 8b). Since key generation is a one-time process to add a new kind of proof in the form of a circuit, we find these numbers acceptable.

Argument generation time. Argument generation time increased linearly with proving key size and ranged from a few seconds on the smallest keys to a couple of minutes for the largest (Figure 8c). Since argument generation is a one-time task for each surveillance action, and the existing administrative processes for each surveillance action often take hours or days, we find this cost acceptable.

Argument verification time. Verifying P_k and P_eq on the largest message took only a few milliseconds. Verification times for P_∃ were larger and increased linearly with the number of input commitments: for 100, 400, 700, and 1000 commitments, verification took 40ms, 85ms, 243ms, and 338ms on the largest input. These times are still fast enough to verify many arguments quickly.

        8 Generalization

Our proposal can be generalized beyond ECPA surveillance to encompass a broader class of secret information processes. Consider situations in which independent institutions need to act in a coordinated but secret fashion and, at the same time, are subject to public scrutiny. They should be able to convince the public that their actions are consistent with relevant rules. As in electronic surveillance, accountability requires the ability to attest to compliance without revealing sensitive information.

Example 1 (FISA court). Accountability is needed in other electronic surveillance arenas. The US Foreign Intelligence Surveillance Act (FISA) regulates surveillance in national security investigations. Because of the sensitive interests at stake, the entire process is overseen by a US court that meets in secret. The tension between secrecy and public accountability is even sharper for the FISA court: much of the data collected under FISA may stay permanently hidden inside US intelligence agencies, while data collected under ECPA may eventually be used in public criminal trials. This opacity may be justified, but it has engendered skepticism. The public has no way of knowing what the court is doing, nor any means of assuring itself that the intelligence agencies under the authority of FISA are even complying with the rules of that court. The FISA court itself has voiced concern that it has no independent means of assessing compliance with its orders because of the extreme secrecy involved. Applying our proposal to the FISA court, both the court and the public could receive proofs of documented compliance with FISA orders, as well as aggregate statistics on the scope of FISA surveillance activity, to the full extent possible without incurring national security risk.

Figure 8: SNARK evaluation. (a) Proving key size. (b) Key generation time. (c) Argument generation time.

Example 2 (Clinical trials). Accountability mechanisms are also important to assess the behavior of private parties, e.g., in clinical trials for new drugs. There are many parties to clinical trials, and much of the information involved is either private or proprietary. Yet regulators and the public have a need to know that responsible testing protocols are observed. Our system can achieve the right balance of transparency, accountability, and respect for the privacy of those involved in the trials.

Example 3 (Public fund spending). Accountability in the spending of taxpayer money is naturally a subject of public interest. Portions of public funds may be allocated for sensitive purposes (e.g., defense/intelligence), and the amounts and allocation thereof may be publicly unavailable due to their sensitivity. Our system would enable credible public assurances that taxpayer money is being spent in accordance with stated principles, while preserving secrecy of information considered sensitive.

8.1 Generalized Framework

We present abstractions describing the generalized version of our system and briefly outline how the concrete examples fit into this framework. A secret information process includes the following components:

• A set of participants interact with each other. In our ECPA example, these are judges, law enforcement agencies, and companies.

• The participants engage in a protocol (e.g., to execute the procedures for conducting electronic surveillance). The protocol messages exchanged are hidden from the view of outsiders (e.g., the public), and yet it is of public interest that the protocol messages exchanged adhere to certain rules.

• A set of auditors (distinct from the participants) seeks to audit the protocol by verifying that a set of accountability properties are met.

Abstractly, our system allows the controlled disclosure of four types of information.

Existential information reveals the existence of a piece of data, be it in a participant's local storage or the content of a communication between participants. In our case study, existential information is revealed with commitments, which indicate the existence of a document.

Relational information describes the actions participants take in response to the actions of others. In our case study, relational information is represented by the zero-knowledge arguments that attest that actions were taken lawfully (e.g., in compliance with a judge's order).

Content information is the data in storage and communication. In our case study, content information is revealed through aggregate statistics via MPC, and when documents are unsealed and their contents made public.

Timing information is a by-product of the other information. In our case study, timing information could include order issuance dates, turnaround times for data request fulfillment by companies, and seal expiry dates.

Revealing combinations of these four types of information with the specified cryptographic tools provides the flexibility to satisfy a range of application-specific accountability properties, as exemplified next.

Example 1 (FISA court). Participants are the FISA Court judges, the agencies requesting surveillance authorization, and any service providers involved in facilitating said surveillance. The protocol encompasses the legal process required to authorize surveillance, together with the administrative steps that must be taken to enact surveillance. Auditors are the public, the judges themselves, and possibly Congress. Desirable accountability properties are similar to those in our ECPA case study: e.g., attestations that certain rules are being followed in issuing surveillance orders, and release of aggregate statistics on surveillance activities under FISA.

Example 2 (Clinical trials). Participants are the institutions (companies or research centers) conducting clinical trials, comprising scientists, ethics boards, and data analysts; the organizations that manage regulations regarding clinical trials, such as the National Institutes of Health (NIH) and the Food and Drug Administration (FDA) in the US; and hospitals and other sources through which trial participants are drawn. The protocol encompasses the administrative process required to approve a clinical trial, and the procedure of gathering participants and conducting the trial itself. Auditors are the public, the regulatory organizations such as the NIH and the FDA, and possibly professional ethics committees. Desirable accountability properties include, e.g., attestations that appropriate procedures are respected in recruiting participants and administering trials, and release of aggregate statistics on clinical trial results without compromising individual participants' medical data.

Example 3 (Public fund spending). Participants are Congress (who appropriates the funding), defense/intelligence agencies, and service providers contracted in the spending of said funding. The protocol encompasses the processes by which Congress allocates funds to agencies and agencies allocate funds to particular expenses. Auditors are the public and Congress. Desirable accountability properties include, e.g., attestations that procurements were within reasonable margins of market prices and satisfied documented needs, and release of aggregate statistics on the proportion of allocated money used and broad spending categories.

        9 Conclusion

We present a cryptographic answer to the accountability challenge currently frustrating the US court system. Leveraging cryptographic commitments, zero-knowledge proofs, and secure MPC, we provide the electronic surveillance process a series of scalable, flexible, and practical measures for improving accountability while maintaining secrecy. While we focus on the case study of electronic surveillance, these strategies are equally applicable to a range of other secret information processes requiring accountability to an outside auditor.

        Acknowledgements

We are grateful to Judge Stephen Smith for discussion and insights from the perspective of the US court system; to Andrei Lapets, Kinan Dak Albab, Rawane Issa, and Frederick Joossens for discussion on Jiff and WebMPC; and to Madars Virza for advice on SNARKs and LibSNARK.

This research was supported by the following grants: NSF MACS (CNS-1413920), DARPA IBM (W911NF-15-C-0236), SIMONS Investigator Award Agreement Dated June 5th, 2012, and the Center for Science of Information (CSoI), an NSF Science and Technology Center, under grant agreement CCF-0939370.

References

[1] Bellman. https://github.com/ebfull/bellman.

        [2] Electronic Communications Privacy Act 18 USC 2701 et seq

        [3] Foreign Intelligence Surveillance Act 50 USC ch 36

[4] Google transparency report. https://www.google.com/transparencyreport/userdatarequests/countries/?p=2016-12.

[5] Jiff. https://github.com/multiparty/jiff.

[6] Jsnark. https://github.com/akosba/jsnark.

[7] Law enforcement requests report. https://www.microsoft.com/en-us/about/corporate-responsibility/lerr.

[8] Zcash. https://z.cash.

[9] Wiretap report 2015. http://www.uscourts.gov/statistics-reports/wiretap-report-2015, December 2015.

[10] ADMINISTRATIVE OFFICE OF THE COURTS. Authorized judgeships. http://www.uscourts.gov/sites/default/files/allauth.pdf.

        [11] ARAKI T BARAK A FURUKAWA J LICHTER T LIN-DELL Y NOF A OHARA K WATZMAN A AND WEIN-STEIN O Optimized honest-majority MPC for malicious ad-versaries - breaking the 1 billion-gate per second barrier In2017 IEEE Symposium on Security and Privacy SP 2017 SanJose CA USA May 22-26 2017 (2017) IEEE Computer Soci-ety pp 843ndash862

[12] BATES, A., BUTLER, K. R., SHERR, M., SHIELDS, C., TRAYNOR, P., AND WALLACH, D. Accountable wiretapping - or - I know they can hear you now. NDSS (2012).

        [13] BEN-SASSON E CHIESA A GREEN M TROMER EAND VIRZA M Secure sampling of public parameters for suc-cinct zero knowledge proofs In Security and Privacy (SP) 2015IEEE Symposium on (2015) IEEE pp 287ndash304

        [14] BEN-SASSON E CHIESA A TROMER E AND VIRZA MScalable zero knowledge via cycles of elliptic curves IACR Cryp-tology ePrint Archive 2014 (2014) 595

        [15] BEN-SASSON E CHIESA A TROMER E AND VIRZA MSuccinct non-interactive zero knowledge for a von neumann ar-chitecture In 23rd USENIX Security Symposium (USENIX Se-curity 14) (San Diego CA Aug 2014) USENIX Associationpp 781ndash796

        [16] BESTAVROS A LAPETS A AND VARIA M User-centricdistributed solutions for privacy-preserving analytics Communi-cations of the ACM 60 2 (February 2017) 37ndash39

[17] CHAUM, D., AND VAN HEYST, E. Group signatures. In Advances in Cryptology - EUROCRYPT '91, Workshop on the Theory and Application of Cryptographic Techniques, Brighton, UK, April 8-11, 1991, Proceedings (1991), D. W. Davies, Ed., vol. 547 of Lecture Notes in Computer Science, Springer, pp. 257-265.

        [18] DAMGARD I AND ISHAI Y Constant-round multiparty com-putation using a black-box pseudorandom generator In Advancesin Cryptology - CRYPTO 2005 25th Annual International Cryp-tology Conference Santa Barbara California USA August 14-18 2005 Proceedings (2005) V Shoup Ed vol 3621 of Lec-ture Notes in Computer Science Springer pp 378ndash394

        [19] DAMGARD I KELLER M LARRAIA E PASTRO VSCHOLL P AND SMART N P Practical covertly secureMPC for dishonest majority - or Breaking the SPDZ limitsIn Computer Security - ESORICS 2013 - 18th European Sym-posium on Research in Computer Security Egham UK Septem-ber 9-13 2013 Proceedings (2013) J Crampton S Jajodia andK Mayes Eds vol 8134 of Lecture Notes in Computer ScienceSpringer pp 1ndash18

        [20] FEIGENBAUM J JAGGARD A D AND WRIGHT R N Openvs closed systems for accountability In Proceedings of the 2014Symposium and Bootcamp on the Science of Security (2014)ACM p 4

        [21] FEIGENBAUM J JAGGARD A D WRIGHT R N ANDXIAO H Systematizing ldquoaccountabilityrdquo in computer sci-ence Tech rep Yale University Feb 2012 Technical ReportYALEUDCSTR-1452

[22] FEUER, A., AND ROSENBERG, E. Brooklyn prosecutor accused of using illegal wiretap to spy on love interest. November 2016. https://www.nytimes.com/2016/11/28/nyregion/brooklyn-prosecutor-accused-of-using-illegal-wiretap-to-spy-on-love-interest.html.

        [23] GOLDWASSER S AND PARK S Public accountability vs se-cret laws Can they coexist A cryptographic proposal In Pro-ceedings of the 2017 on Workshop on Privacy in the ElectronicSociety Dallas TX USA October 30 - November 3 2017 (2017)B M Thuraisingham and A J Lee Eds ACM pp 99ndash110

        [24] JAKOBSEN T P NIELSEN J B AND ORLANDI C A frame-work for outsourcing of secure computation In Proceedings ofthe 6th edition of the ACM Workshop on Cloud Computing Se-curity CCSW rsquo14 Scottsdale Arizona USA November 7 2014(2014) G Ahn A Oprea and R Safavi-Naini Eds ACMpp 81ndash92

        [25] KAMARA S Restructuring the nsa metadata program In Inter-national Conference on Financial Cryptography and Data Secu-rity (2014) Springer pp 235ndash247

        [26] KATZ J AND LINDELL Y Introduction to modern cryptogra-phy CRC press 2014

[27] KROLL, J., FELTEN, E., AND BONEH, D. Secure protocols for accountable warrant execution, 2014. http://www.cs.princeton.edu/~felten/warrant-paper.pdf.

        [28] LAMPSON B Privacy and security usable security how to getit Communications of the ACM 52 11 (2009) 25ndash27

        [29] LAPETS A VOLGUSHEV N BESTAVROS A JANSEN FAND VARIA M Secure MPC for Analytics as a Web Ap-plication In 2016 IEEE Cybersecurity Development (SecDev)(Boston MA USA November 2016) pp 73ndash74

        [30] MASHIMA D AND AHAMAD M Enabling robust informationaccountability in e-healthcare systems In HealthSec (2012)

[31] PAPANIKOLAOU, N., AND PEARSON, S. A cross-disciplinary review of the concept of accountability: A survey of the literature, 2013. Available online at http://www.bic-trust.eu/files/2013/06/Paper_NP.pdf.

        [32] PEARSON S Toward accountability in the cloud IEEE InternetComputing 15 4 (2011) 64ndash69

        [33] RIVEST R L SHAMIR A AND TAUMAN Y How to leak asecret In Advances in Cryptology - ASIACRYPT 2001 7th Inter-national Conference on the Theory and Application of Cryptol-ogy and Information Security Gold Coast Australia December9-13 2001 Proceedings (2001) C Boyd Ed vol 2248 of Lec-ture Notes in Computer Science Springer pp 552ndash565

[34] SCIPR LAB. libsnark: a C++ library for zkSNARK proofs. https://github.com/scipr-lab/libsnark.

        [35] SEGAL A FEIGENBAUM J AND FORD B Privacy-preserving lawful contact chaining [preliminary report] In Pro-ceedings of the 2016 ACM on Workshop on Privacy in the Elec-tronic Society (New York NY USA 2016) WPES rsquo16 ACMpp 185ndash188

        [36] SEGAL A FORD B AND FEIGENBAUM J Catching ban-dits and only bandits Privacy-preserving intersection warrantsfor lawful surveillance In FOCI (2014)

        [37] SMITH S W Kudzu in the courthouse Judgments made in theshade The Federal Courts Law Review 3 2 (2009)

        [38] SMITH S W Gagged sealed amp delivered Reforming ecparsquossecret docket Harvard Law amp Policy Review 6 (2012) 313ndash459

        [39] SUNDARESWARAN S SQUICCIARINI A AND LIN D En-suring distributed accountability for data sharing in the cloudIEEE Transactions on Dependable and Secure Computing 9 4(2012) 556ndash568

        [40] TAN Y S KO R K AND HOLMES G Security and data ac-countability in distributed systems A provenance survey In HighPerformance Computing and Communications amp 2013 IEEE In-ternational Conference on Embedded and Ubiquitous Comput-ing (HPCC EUC) 2013 IEEE 10th International Conference on(2013) IEEE pp 1571ndash1578

        [41] VIRZA M November 2017 Private communication

        [42] WANG X RANELLUCCI S AND KATZ J Global-scale se-cure multiparty computation In Proceedings of the 2017 ACMSIGSAC Conference on Computer and Communications SecurityCCS 2017 Dallas TX USA October 30 - November 03 2017(2017) B M Thuraisingham D Evans T Malkin and D XuEds ACM pp 39ndash56

        [43] WEITZNER D J ABELSON H BERNERS-LEE T FEIGEN-BAUM J HENDLER J A AND SUSSMAN G J Informationaccountability Commun ACM 51 6 (2008) 82ndash87

        [44] XIAO Z KATHIRESSHAN N AND XIAO Y A survey ofaccountability in computer networks and distributed systems Se-curity and Communication Networks 9 4 (2016) 290ndash315


if the order is upheld, must supply the relevant data to the law enforcement agency. We model companies as malicious; e.g., they might wish to contribute to unauthorized surveillance while maintaining the outside appearance that they are not. Specifically, although companies currently release aggregate statistics about their involvement in the surveillance process [4, 7], our system does not rely on their honesty in reporting these numbers. Other malicious behavior might include colluding with law enforcement to release more data than a court order allows, or furnishing data in the absence of a court order.

The public. We model the public as malicious, as the public may include criminals who wish to learn as much as possible about the surveillance process in order to avoid being caught.2

Remark 3.1. Our system requires the parties involved in surveillance to post information to a shared ledger at various points in the surveillance process. Correspondence between logged and real-world events is an aspect of any log-based record-keeping scheme that cannot be enforced using technological means alone. Our system is designed to encourage parties to log honestly or to report dishonest logging they observe (see Remark 4.1). Our analysis focuses on the cryptographic guarantees provided by the system, however, rather than a rigorous game-theoretic analysis of incentive-based behavior. Most of this paper therefore assumes that surveillance orders and other logged events are recorded correctly, except where otherwise noted.

3.2 Security Goals

In order to achieve accountability in light of this threat model, our system will need to satisfy three high-level security goals.

Accountability to the public. The system must reveal enough information to the public that members of the public are able to verify that all surveillance is conducted properly according to publicly known rules, and specifically that law enforcement agencies and companies (which we model as malicious) do not deviate from their expected roles in the surveillance process. The public must also have enough information to prompt courts to unseal records at the appropriate times.

Correctness. All of the information that our system computes and reveals must be correct. The aggregate statistics it computes and releases to the public must accurately reflect the state of electronic surveillance. Any assurances that our system makes to the public about the (im)propriety of the electronic surveillance process must be reported accurately.

2 By placing all data on an immutable public ledger and giving the public no role in our system besides that of observer, we effectively reduce the public to a passive adversary.

Confidentiality. The public must not learn information that could undermine the investigative process. None of the other parties (courts, law enforcement agencies, and companies) may learn any information beyond that which they already know in the current ECPA process and that which is released to the public.

For particularly sensitive applications, the confidentiality guarantee should be perfect (information-theoretic); this means confidentiality should hold unconditionally, even against arbitrarily powerful adversaries that may be computationally unbounded.3 A perfect confidentiality guarantee would be of particular importance in contexts where unauthorized breaks of confidentiality could have catastrophic consequences (such as national security). We envision that a truly unconditional confidentiality guarantee could catalyze the consideration of accountability systems in contexts involving very sensitive information, where decision-makers are traditionally risk-averse, such as the court system.

          4 System Design

We present the design of our proposed system for accountability in electronic surveillance. Section 4.1 informally introduces four cryptographic primitives and their security guarantees.4 Section 4.2 outlines the configuration of the system: where data is stored and processed. Section 4.3 describes the workflow of the system in relation to the surveillance process summarized in Figure 1. Section 4.4 discusses the packages of design choices available to the court system, exploiting the flexibility of the cryptographic tools to offer a range of options that trade off between secrecy and accountability.

3 This is in contrast to computational confidentiality guarantees, which provide confidentiality only against adversaries that are efficient, or computationally bounded. Even with the latter, weaker type of guarantee, it is possible to ensure confidentiality against any adversary with computing power within the realistically foreseeable future; computational guarantees are quite common in practice and widely considered acceptable for many applications. One reason to opt for computational guarantees over information-theoretic ones is that information-theoretic guarantees typically carry some loss in efficiency; however, this benefit may be outweighed in particularly sensitive applications, or when confidentiality is desirable for a very long-term future where advances in computing power are not foreseeable.

4 For rigorous formal definitions of these cryptographic primitives, we refer to any standard cryptography textbook (e.g., [26]).

4.1 Cryptographic Tools

Append-only ledgers. An append-only ledger is a log containing an ordered sequence of data, consistently visible to anyone (within a designated system), to which data may be appended over time but whose contents may not be edited or deleted. The append-only nature of the ledger is key for the maintenance of a globally consistent and tamper-proof data record over time.

In our system, the ledger records credibly time-stamped information about surveillance events. Typically, data stored on the ledger will cryptographically hide some sensitive information about a surveillance event while revealing select other information about it for the sake of accountability. Placing information on the ledger is one means by which we reveal information to the public, facilitating the security goal of accountability from Section 3.

Cryptographic commitments. A cryptographic commitment c is a string generated from some input data D, which has the properties of hiding and binding; i.e., c reveals no information about the value of D, and yet D can be revealed or "opened" (by the person who created the commitment) in such a way that any observer can be sure that D is the data with respect to which the commitment was made. We refer to D as the content of c.

In our system, commitments indicate that a piece of information (e.g., a court order) exists and that its content can credibly be opened at a later time. Posting commitments to the ledger also establishes the existence of a piece of information at a given point in time. Returning to the security goals from Section 3, commitments make it possible to reveal a limited amount of information early on (achieving a degree of accountability) without compromising investigative secrecy (achieving confidentiality). Later, when confidentiality is no longer necessary and information can be revealed (i.e., a seal on an order expires), the commitment can be opened by its creator to achieve full accountability.

Commitments can be perfectly (information-theoretically) hiding, achieving the perfect confidentiality goal of Section 3.2. A well-known commitment scheme that is perfectly hiding is the Pedersen commitment.5
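As a concrete illustration, the following is a minimal Pedersen commitment sketch in Python. The group parameters below are toy values chosen for readability (an 11-bit safe prime), not vetted cryptographic parameters; a real deployment would use a standardized group of 2048 bits or more, with h generated so that no one knows log_g(h).

```python
import hashlib
import secrets

# Toy group parameters for illustration only -- far too small to be secure.
P = 2039   # safe prime: P = 2*Q + 1
Q = 1019   # prime order of the quadratic-residue subgroup
G = 4      # generator of the order-Q subgroup (a quadratic residue)

def derive_h() -> int:
    # Hash-to-subgroup so that log_G(H) is unknown to everyone
    # (a "nothing-up-my-sleeve" derivation).
    digest = hashlib.sha256(b"AUDIT Pedersen h").digest()
    return pow(int.from_bytes(digest, "big") % P, 2, P)

H = derive_h()

def commit(m: int, omega: int) -> int:
    # Pedersen commitment c = g^m * h^omega mod P; perfectly hiding because
    # h^omega is uniform in the subgroup when omega is uniform in Z_Q.
    return (pow(G, m % Q, P) * pow(H, omega % Q, P)) % P

def open_commitment(m: int, c: int, omega: int) -> bool:
    # Anyone can check an opening (m, omega) against a published commitment c.
    return commit(m, omega) == c

# Example: commit now, open when the seal expires.
m = 42                        # hypothetical numeric content of a sealed order
omega = secrets.randbelow(Q)  # randomness kept secret by the committer
c = commit(m, omega)
assert open_commitment(m, c, omega)
```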

Zero-knowledge. A zero-knowledge argument6 allows a prover P to convince a verifier V of a fact without revealing any additional information about the fact in the process of doing so. P can provide to V a tuple (R, x, π) consisting of a binary relation R, an input x, and a proof π, such that the verifier is convinced that ∃w s.t. (x, w) ∈ R, yet cannot infer anything about the witness w. Three properties are required of zero-knowledge arguments: completeness, that any true statement can be proven by the honest algorithm P such that V accepts the proof; soundness, that no purported proof of a false statement (produced by any algorithm P*) should be accepted by the honest verifier V; and zero-knowledge, that the proof π reveals no information beyond what can be inferred just from the desired statement that (x, w) ∈ R.

5 While the Pedersen commitment is not succinct, we note that by combining succinct commitments with perfectly hiding commitments (as also suggested by [23]), it is possible to obtain a commitment that is both succinct and perfectly hiding.

6 Zero-knowledge proof is a more commonly used term than zero-knowledge argument. The two terms denote very similar concepts; the difference lies only in the nature of the soundness guarantee (i.e., that false statements cannot be convincingly attested to), which is computational for arguments and statistical for proofs.

In our system, zero-knowledge makes it possible to reveal how secret information relates to a system of rules, or to other pieces of secret information, without revealing any further information. Concretely, our implementation (detailed in Section 7) allows law enforcement to attest (1) knowledge of the content of a commitment c (e.g., to an email address in a request for data made by a law enforcement agency), demonstrating the ability to later open c, and (2) that the content of a commitment c is equal to the content of a prior commitment c′ (e.g., to an email address in a court order issued by a judge). In case even (2) reveals too much information, our implementation supports not specifying c′ exactly, and instead attesting that c′ lies in a given set S (e.g., S could include all judges' surveillance authorizations from the last month).

In the terms of our security goals from Section 3, zero-knowledge arguments can demonstrate to the public that commitments can be opened and that proper relationships between committed pieces of information are preserved (accountability), without revealing any further information about the surveillance process (confidentiality). If these arguments fail, the public can detect when a participant has deviated from the process (accountability).

The SNARK construction [15] that we suggest for use in our system achieves perfect (information-theoretic) confidentiality, a goal stated in Section 3.2.7

Secure multiparty computation (MPC). MPC allows a set of n parties p_1, ..., p_n, each in possession of private data x_1, ..., x_n, to jointly compute the output of a function y = f(x_1, ..., x_n) on their private inputs; y is computed via an interactive protocol executed by the parties.

Secure MPC makes two guarantees: correctness and secrecy. Correctness means that the output y is equal to f(x_1, ..., x_n). Secrecy means that any adversary that corrupts some subset S ⊂ {p_1, ..., p_n} of the parties learns nothing about {x_i : p_i ∉ S} beyond what can already be inferred given the adversarial inputs {x_i : p_i ∈ S} and the output y. Secrecy is formalized by stipulating that a simulator that is given only ({x_i : p_i ∈ S}, y) as input must be able to produce a "simulated" protocol transcript that is indistinguishable from the actual protocol execution run with all the real inputs (x_1, ..., x_n).

7 In fact, [15] states their secrecy guarantee in a computational (not information-theoretic) form, but their unmodified construction does achieve perfect secrecy, and the proofs of [15] suffice unchanged to prove the stronger definition [41]. That perfect zero-knowledge can be achieved is also remarked in the appendix of [14].

Figure 2: System configuration. Participants (rectangles) read and write to a public ledger (cloud) and local storage (ovals). The public (diamond) reads from the ledger.

In our system, MPC enables computation of aggregate statistics about the extent of surveillance across the entire court system, through a computation among individual judges. MPC eliminates the need to pool the sensitive data of individual judges in the clear, or to defer to companies to compute and release this information piecemeal. In the terms of our security goals, MPC reveals information to the public (accountability) from a source we trust to follow the protocol honestly (the courts), without revealing the internal workings of courts to one another (confidentiality). It also eliminates the need to rely on potentially malicious companies to reveal this information themselves (correctness).

Secret sharing. Secret sharing facilitates our hierarchical MPC protocol. A secret sharing of some input data D consists of a set of strings (D_1, ..., D_N) called shares, satisfying two properties: (1) any subset of N−1 shares reveals no information about D, and (2) given all the N shares, D can easily be reconstructed.8
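To make the primitive concrete, here is a minimal N-out-of-N additive secret sharing sketch in Python; the field modulus and function names are our own choices for this example rather than part of the protocol.

```python
import secrets

PRIME = 2**61 - 1  # assumed share modulus for this sketch

def share(value: int, n: int) -> list[int]:
    # Split `value` into n additive shares; any n-1 of them reveal nothing.
    parts = [secrets.randbelow(PRIME) for _ in range(n - 1)]
    parts.append((value - sum(parts)) % PRIME)
    return parts

def recon(parts: list[int]) -> int:
    # Reconstruction requires all n shares.
    return sum(parts) % PRIME

# Example: a judge splits a per-court count into 12 shares, one per circuit.
assert recon(share(7, 12)) == 7
```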

Summary. In summary, these cryptographic tools support three high-level properties that we utilize to achieve our security goals:

1. Trusted records of events. The append-only ledger and cryptographic commitments create a trustworthy record of surveillance events without revealing sensitive information to the public.

2. Demonstration of compliance. Zero-knowledge arguments allow parties to provably assure the public that relevant rules have been followed, without revealing any secret information.

8 For simplicity, we have described so-called "N-out-of-N" secret sharing. More generally, secret sharing can guarantee that any subset of k ≤ N shares enables reconstruction, while any subset of at most k−1 shares reveals nothing about D.

3. Transparency without handling secrets. MPC enables the court system to accurately compute and release aggregate statistics about surveillance events without ever sharing the sensitive information of individual parties.

4.2 System Configuration

Our system is centered around a publicly visible, append-only ledger where the various entities involved in the electronic surveillance process can post information. As depicted in Figure 2, every judge, law enforcement agency, and company contributes data to this ledger. Judges post cryptographic commitments to all orders issued. Law enforcement agencies post commitments to their activities (warrant requests to judges and data requests to companies) and zero-knowledge arguments about the requests they issue. Companies do the same for the data they deliver to agencies. Members of the public can view and verify all data posted to the ledger.

Each judge, law enforcement agency, and company will need to maintain a small amount of infrastructure: a computer terminal through which to compose posts, and local storage (the ovals in Figure 2) to store sensitive information (e.g., the content of sealed court orders). To attest to the authenticity of posts to the ledger, each participant will need to maintain a private signing key and publicize a corresponding verification key. We assume that public-key infrastructure could be established by a reputable party like the Administrative Office of the US Courts.

The ledger itself could be maintained as a distributed system among the participants in the process, as a distributed system among a more exclusive group of participants with higher trustworthiness (e.g., the circuit courts of appeals), or by a single entity (e.g., the Administrative Office of the US Courts or the Supreme Court).

4.3 Workflow

Posting to the ledger. The workflow of our system augments the electronic surveillance workflow in Figure 1 with additional information posted to the ledger, as depicted in Figure 3. When a judge issues an order (step 2 of Figure 1), she also posts a commitment to the order and additional metadata about the case. At a minimum, this metadata must include the date that the order's seal expires; depending on the system configuration, she could post other metadata (e.g., Judge Smith's cover sheet). The commitment allows the public to later verify that the order was properly unsealed, but reveals no information about the commitment's content in the meantime, achieving a degree of accountability in the short term (while confidentiality is necessary) and full accountability in the long term (when the seal expires and confidentiality is unnecessary). Since judges are honest-but-curious, they will adhere to this protocol and reliably post commitments whenever new court orders are issued.

Figure 3: Data posted to the public ledger as the protocol runs. Time moves from left to right. Each rectangle is a post to the ledger. Dashed arrows between rectangles indicate that the source of the arrow could contain a visible reference to the destination. The ovals contain the entities that make each post.

The agency then uses this order to request data from a company (step 3 in Figure 1), and posts a commitment to this request alongside a zero-knowledge argument that the request is compatible with a court order (and possibly also with other legal requirements). This commitment, which may never be opened, provides a small amount of accountability within the confines of confidentiality, revealing that some law enforcement action took place. The zero-knowledge argument takes accountability a step further: it demonstrates to the public that the law enforcement action was compatible with the original court order (which we trust to have been committed properly), forcing the potentially-malicious law enforcement agency to adhere to the protocol or make public its non-compliance. (Failure to adhere would result in a publicly visible invalid zero-knowledge argument.) If the company responds with matching data (step 6 in Figure 1), it posts a commitment to its response and an argument that it furnished (only) the data implicated by the order and data request. These commitments and arguments serve a role analogous to those posted by law enforcement.

This system does not require commitments to all actions in Figure 1. For example, it only requires a law enforcement agency to commit to a successful request for data (step 3), rather than any proposed request (step 1). The system could easily be augmented with additional commitments and proofs as desired by the court system.

The zero-knowledge arguments about relationships between commitments reveal one additional piece of information. For a law enforcement agency to prove that its committed data request is compatible with a particular court order, it must reveal which specific committed court order authorized the request. In other words, the zero-knowledge arguments reveal the links between specific actions of each party (dashed arrows in Figure 3). These links could be eliminated, reducing visibility into the workflow of surveillance. Instead, entities would argue that their actions are compatible with some court order among a group of recent orders.

Remark 4.1. We now briefly discuss other possible malicious behaviors by law enforcement agencies and companies involving inaccurate logging of data. Though, as mentioned in Remark 3.1, correspondence between real-world events and logged items is not enforceable by technological means alone, we informally argue that our design incentivizes honest logging, and reporting of dishonest logging, under many circumstances.

A malicious law enforcement agency could omit commitments, or commit to one surveillance request but send the company a different request. This action is visible to the company, which could reveal this misbehavior to the judge. This visibility incentivizes companies to record their actions diligently, so as to avoid any appearance of negligence, let alone complicity in the agency's misbehavior.

Similarly, a malicious company might fail to post a commitment, or post a commitment inconsistent with its actual behavior. These actions are visible to law enforcement agencies, who could report violations to the judge (and otherwise risk the appearance of negligence or complicity). To make such violations visible to the public, we could add a second law enforcement commitment that acknowledges the data received and proves that it is compatible with the original court order and law enforcement request.

However, even incentive-based arguments do not address the case of a malicious law enforcement agency colluding with a malicious company. These entities could simply withhold from posting any information to the ledger (or post a sequence of false but consistent information), thereby making it impossible to detect violations. To handle this scenario, we have to defer to the legal process itself: when this data is used as evidence in court, a judge should ensure that appropriate documentation was posted to the ledger and that the data was gathered appropriately.

Aggregate statistics. At configurable intervals, the individual courts use MPC to compute aggregate statistics about their surveillance activities.9 An analyst, such as the Administrative Office of the US Courts, receives the result of this MPC and posts it to the ledger. The particular kinds of aggregate statistics computed are at the discretion of the court system. They could include figures already tabulated in the Administrative Office of the US Courts' Wiretap Reports [9] (i.e., orders by state and by criminal offense) and in company-issued transparency reports [4, 7] (i.e., requests and number of users implicated, by company). Due to the generality10 of MPC, it is theoretically possible to compute any function of the information known to each of the judges. For performance reasons, we restrict our focus to totals and aggregated thresholds, a set of operations expressive enough to replicate existing transparency reports.

9 Microsoft [7] and Google [4] currently release their transparency reports every six months, and the Administrative Office of the US Courts does so annually [9]. We take these intervals to be our baseline for the frequency with which aggregate statistics would be released in our system, although releasing statistics more frequently (e.g., monthly) would improve transparency.

10 General MPC is a common term used to describe MPC that can compute arbitrary functions of the participants' data, as opposed to just restricted classes of functions.

Figure 4: The flow of data as aggregate statistics are computed. Each lower-court judge calculates its component of the statistic and secret-shares it into 12 shares, one for each judicial circuit (illustrated by colors). The servers of the circuit courts then engage in an MPC to compute the aggregate statistic from the input shares.

The statistics themselves are calculated using MPC. In principle, the hundreds of magistrate and district court judges could attempt to directly perform MPC with each other. However, as we find in Section 6, computing even simple functions among hundreds of parties is prohibitively slow. Moreover, the logistics of getting every judge online simultaneously, with enough reliability to complete a multiround protocol, would be difficult; if a single judge went offline, the protocol would stall.

Instead, we compute aggregate statistics in a hierarchical manner, as depicted in Figure 4. We exploit the existing hierarchy of the federal court system. Each of the lower-court judges is under the jurisdiction of one of twelve circuit courts of appeals. Each lower-court judge computes her individual component of the larger aggregate statistic (e.g., number of orders issued against Google in the past six months) and divides it into twelve secret shares, sending one share to (a server controlled by) each circuit court of appeals. Distinct shares are represented by separate colors in Figure 4. So long as at least one circuit server remains uncompromised, the lower-court judges can be assured, by the security of the secret-sharing scheme, that their contributions to the larger statistic are confidential. The circuit servers engage in a twelve-party MPC that reconstructs the judges' input data from the shares, computes the desired function, and reveals the result to the analyst. By concentrating the computationally intensive and logistically demanding part of the MPC process in twelve stable servers, this design eliminates many of the performance and reliability challenges of the flat (non-hierarchical) protocol. (Section 6 discusses performance.)
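The arithmetic behind this hierarchical aggregation of totals can be sketched in a few lines of Python. The sketch below runs entirely in the clear and uses plain additive shares to stand in for the twelve-party MPC among the circuit servers; the per-judge counts are hypothetical.

```python
import secrets

PRIME = 2**61 - 1   # assumed share modulus for this sketch
NUM_CIRCUITS = 12

def share(value: int) -> list[int]:
    # One additive share per circuit court of appeals.
    parts = [secrets.randbelow(PRIME) for _ in range(NUM_CIRCUITS - 1)]
    parts.append((value - sum(parts)) % PRIME)
    return parts

judge_counts = [3, 0, 5, 2, 7, 1]   # hypothetical per-judge order counts

# Each circuit collects one share from every judge and adds them locally.
circuit_inputs = [[] for _ in range(NUM_CIRCUITS)]
for count in judge_counts:
    for circuit, s in enumerate(share(count)):
        circuit_inputs[circuit].append(s)
circuit_subtotals = [sum(parts) % PRIME for parts in circuit_inputs]

# In the real system the twelve subtotals are combined inside a twelve-party
# MPC; here we combine them in the clear to check the arithmetic.
assert sum(circuit_subtotals) % PRIME == sum(judge_counts)
```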

This MPC strategy allows the court system to compute aggregate statistics (towards the accountability goal of Section 3.2). Since courts are honest-but-curious, and by the correctness guarantee of MPC, these statistics will be computed accurately on correct data (correctness of Section 3.2). MPC enables the courts to perform these computations without revealing any court's internal information to any other court (confidentiality of Section 3.2).

4.4 Additional Design Choices

The preceding section described our proposed system with its full range of accountability features. This configuration is only one of many possibilities. Although cryptography makes it possible to release information in a controlled way, the fact remains that revealing more information poses greater risks to investigative integrity. Depending on the court system's level of risk-tolerance, features can be modified or removed entirely to adjust the amount of information disclosed.

Cover sheet metadata. A judge might reasonably fear that a careful criminal could monitor cover sheet metadata to detect surveillance. At the cost of some transparency, judges could post less metadata when committing to an order. (At a minimum, the judge must post the date at which the seal expires.) The cover sheets integral to Judge Smith's proposal were also designed to supply certain information towards assessing the scale of surveillance; MPC replicates this outcome without releasing information about individual orders.

Commitments by individual judges. In some cases, posting a commitment might reveal too much. In a low-crime area, mere knowledge that a particular judge approved surveillance could spur a criminal organization to change its behavior. A number of approaches would address this concern. Judges could delegate the responsibility of posting to the ledger to the same judicial circuits that mediate the hierarchical MPC. Alternatively, each judge could continue posting to the ledger herself, but instead of signing the commitment under her own name, she could sign it as coming from some court in her judicial circuit, or nationwide, without revealing which one (group signatures [17] or ring signatures [33] are designed for this sort of anonymous signing within groups). Either of these approaches would conceal which individual judge approved the surveillance.

Aggregate statistics. The aggregate statistic mechanism offers a wide range of choices about the data to be revealed. For example, if the court system is concerned about revealing information about individual districts, statistics could be aggregated by any number of other parameters, including the type of crime being investigated or the company from which the data was requested.

          5 Protocol Definition

We now define a complete protocol capturing the workflow from Section 4. We assume a public-key infrastructure and synchronous communication on authenticated (encrypted) point-to-point channels.

Preliminaries. The protocol is parametrized by
• a secret-sharing scheme Share,
• a commitment scheme C,
• a special type of zero-knowledge primitive SNARK,
• a multi-party computation protocol MPC, and
• a function CoverSheet that maps court orders to cover sheet information.

Several parties participate in the protocol:
• n judges J_1, ..., J_n,
• m law enforcement agencies A_1, ..., A_m,
• q companies C_1, ..., C_q,
• r trustees T_1, ..., T_r,11
• P, a party representing the public,
• Ledger, a party representing the public ledger, and
• Env, a party called "the environment," which models the occurrence over time of exogenous events.

Ledger is a simple ideal functionality allowing any party to (1) append entries to a time-stamped, append-only ledger and (2) retrieve ledger entries. Entries are authenticated except where explicitly anonymous.

Env is a modeling device that specifies the protocol behavior in the context of arbitrary exogenous event sequences occurring over time. Upon receipt of message clock, Env responds with the current time. To model the occurrence of an exogenous event e (e.g., a case in need of surveillance), Env sends information about e to the affected parties (e.g., a law enforcement agency).

11 In our specific case study, r = 12 and the trustees are the twelve US Circuit Courts of Appeals. The trustees are the parties which participate in the multi-party computation of aggregate statistics based on input data from all judges, as shown in Figure 4 and defined formally later in this subsection.

Next, we give the syntax of our cryptographic tools,12 and then define the behavior of the remaining parties.

A commitment scheme is a triple of probabilistic polynomial-time algorithms C = (Setup, Commit, Open), as follows:
• Setup(1^κ) takes as input a security parameter κ (in unary) and outputs public parameters pp.
• Commit(pp, m, ω) takes as input pp, a message m, and randomness ω. It outputs a commitment c.
• Open(pp, m′, c, ω′) takes as input pp, a message m′, a commitment c, and randomness ω′. It outputs 1 if c = Commit(pp, m′, ω′), and 0 otherwise.

pp is generated in an initial setup phase and is thereafter publicly known to all parties, so we elide it for brevity.
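For intuition, a hash-based instantiation of this syntax matching the description in Section 7.2 (SHA256 over the message with a 256-bit random string ω appended) can be sketched in Python as follows; this is an illustrative sketch, not the code used in our implementation.

```python
import hashlib
import secrets

def setup() -> None:
    # SHA256-based commitments need no public parameters, so pp is empty.
    return None

def commit(message: bytes, omega: bytes) -> bytes:
    # c = SHA256(message || omega), with omega a fresh 256-bit random string.
    return hashlib.sha256(message + omega).digest()

def open_commitment(message: bytes, c: bytes, omega: bytes) -> bool:
    # Outputs 1 (True) iff c = Commit(message, omega).
    return commit(message, omega) == c

# Example: a judge commits to a hypothetical order; only c goes on the ledger.
order = b"order: records for user@example.com; seal expires 2018-06-01"
omega = secrets.token_bytes(32)
c = commit(order, omega)
assert open_commitment(order, c, omega)
```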

Algorithm 1: Law enforcement agency A_i

• On receipt of a surveillance request event e = (Surveil, u, s) from Env, where u is the public key of a company and s is the description of a surveillance request directed at u: send message (u, s) to a judge.13

• On receipt of a decision message (u, s, d) from a judge, where d ≠ reject:14 (1) generate a commitment c = Commit((s, d), ω) to the request and store (c, s, d, ω) locally; (2) generate a SNARK proof π attesting compliance of (s, d) with relevant regulations; (3) post (c, π) to the ledger; (4) send request (s, d, ω) to company u.

• On receipt of an audit request (c, P, z) from the public: generate decision b ← A^dp_i(c, P, z). If b = accept, generate a SNARK proof π attesting compliance of (s, d) with the regulations indicated by the audit request (c, P, z); else, send (c, P, z, b) to a judge.13

• On receipt of an audit order (d, c, P, z) from a judge: if d = accept, generate a SNARK proof π attesting compliance of (s, d) with the regulations indicated by the audit request (c, P, z).

Agencies. Each agency A_i has an associated decision-making process A^dp_i, modeled by a stateful algorithm that maps audit requests to {accept} ∪ {0,1}*, where the output is either an acceptance or a description of why the agency chooses to deny the request. Each agency operates according to Algorithm 1, which is parametrized by its own A^dp_i. In practice, we assume A^dp_i would be instantiated by the agency's human decision-making process.

12 For formal security definitions beyond syntax, we refer to any standard cryptography textbook, such as [26].

13 For the purposes of our exposition, this could be an arbitrary judge. In practice, it would likely depend on the jurisdiction in which the surveillance event occurs and in which the law enforcement agency operates, and perhaps also on the type of case.

14 For simplicity of exposition, Algorithm 1 only addresses the case d ≠ reject and omits the possibility of appeal by the agency. The algorithm could straightforwardly be extended to encompass appeals by incorporating the decision to appeal into A^dp_i.

15 This is the step invoked by requests for unsealed documents.

Algorithm 2: Judge J_i

• On receipt of a surveillance request (u, s) from an agency A_j: (1) generate decision d ← J^dp1_i(s); (2) send response (u, s, d) to A_j; (3) generate a commitment c = Commit((u, s, d), ω) to the decision and store (c, u, s, d, ω) locally; (4) post (CoverSheet(d), c) to the ledger.

• On receipt of denied audit request information ζ from an agency A_j: generate decision d ← J^dp2_i(ζ) and send (d, ζ) to A_j and to the public P.

• On receipt of a data revelation request (c, z) from the public:15 generate decision b ← J^dp3_i(c, z). If b = accept, send to the public P the message and randomness (m, ω) corresponding to c; else, if b = reject, send reject to P, with an accompanying explanation if provided.

Judges. Each judge J_i has three associated decision-making processes: J^dp1_i, J^dp2_i, and J^dp3_i. J^dp1_i maps surveillance requests to either a rejection or an authorizing court order; J^dp2_i maps denied audit requests to either a confirmation of the denial or a court order overturning the denial; and J^dp3_i maps data revelation requests to either an acceptance or a denial (perhaps along with an explanation of the denial, e.g., "this document is still under seal"). Each judge operates according to Algorithm 2, which is parametrized by the individual judge's (J^dp1_i, J^dp2_i, J^dp3_i).

Algorithm 3: Company C_i

• Upon receiving a surveillance request (s, d, ω) from an agency A_j: if the court order d bears the valid signature of a judge and Commit((s, d), ω) matches a corresponding commitment posted by law enforcement on the ledger, then (1) generate commitment c ← Commit(δ, ω) and store (c, δ, ω) locally; (2) generate a SNARK proof π attesting that δ is compliant with s and the judge-signed order d; (3) post (c, π) anonymously to the ledger; (4) reply to A_j by furnishing the requested data δ along with ω.

The public. The public P exhibits one main type of behavior in our model: upon receiving an event message e = (a, ξ) from Env (describing either an audit request or a data revelation request), P sends ξ to a (an agency or court). Additionally, the public periodically checks the ledger for validity of posted SNARK proofs and takes steps to flag any non-compliance detected (e.g., through the court system or the news media).

Companies and trustees. Algorithms 3 and 4 describe companies and trustees. Companies execute judge-authorized instructions and log their actions by posting commitments on the ledger. Trustees run MPC to compute aggregate statistics from data provided in secret-shared form by judges. MPC events are triggered by Env.

Algorithm 4: Trustee T_i

• Upon receiving an aggregate statistic event message e = (Compute, f, D_1, ..., D_n) from Env:

1. For each i′ ∈ [r] (such that i′ ≠ i), send e to T_{i′}.
2. For each j ∈ [n], send the message (f, D_j) to J_j. Let δ_{ji} be the response from J_j.
3. With parties T_1, ..., T_r, participate in the MPC protocol MPC with input (δ_{1i}, ..., δ_{ni}) to compute the functionality f ∘ ReconInputs (i.e., first reconstruct the judges' inputs from their shares, then apply f), where ReconInputs is defined as follows:

   ReconInputs((δ_{1i′}, ..., δ_{ni′})_{i′∈[r]}) = (Recon(δ_{j1}, ..., δ_{jr}))_{j∈[n]}

   Let y denote the output from the MPC.16
4. Send y to J_j for each j ∈ [n].17

• Upon receiving an MPC initiation message e = (Compute, f, D_1, ..., D_n) from another trustee T_{i′}:

1. Receive a secret-share δ_{ji} from each judge J_j, respectively.
2. With parties T_1, ..., T_r, participate in the MPC protocol MPC with input (δ_{1i}, ..., δ_{ni}) to compute the functionality f ∘ ReconInputs.

          6 Evaluation of MPC Implementation

In our proposal, judges use secure multiparty computation (MPC) to compute aggregate statistics about the extent and distribution of surveillance. Although in principle MPC can support secure computation of any function of the judges' data, full generality can come with unacceptable performance limitations. In order that our protocols scale to hundreds of federal judges, we narrow our attention to two kinds of functions that are particularly useful in the context of surveillance.

The extent of surveillance (totals). Computing totals involves summing values held by the parties without revealing information about any value to anyone other than its owner. Totals become averages by dividing by the number of data points. In the context of electronic surveillance, totals are the most prevalent form of computation on government and corporate transparency reports: How many court orders were approved for cases involving homicide, and how many for drug offenses? How long was the average order in effect? How many orders were issued in California [9]? Totals make it possible to determine the extent of surveillance.

The distribution of surveillance (thresholds). Thresholding involves determining the number of data points that exceed a given cut-off: How many courts issued more than ten orders for data from Google? How many orders were in effect for more than 90 days? Unlike totals, thresholds can reveal selected facts about the distribution of surveillance, i.e., the circumstances in which it is most prevalent. Thresholds go beyond the kinds of questions typically answered in transparency reports, offering new opportunities to improve accountability.
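In the clear, the two families of functions are as simple as the Python sketch below (the per-judge counts are hypothetical); the point of the MPC protocols in this section is to evaluate exactly these functions without revealing any individual judge's input.

```python
def total(values: list[int]) -> int:
    # Extent of surveillance: e.g., orders issued nationwide.
    return sum(values)

def threshold_count(values: list[int], cutoff: int) -> int:
    # Distribution of surveillance: how many inputs exceed the cutoff.
    return sum(1 for v in values if v > cutoff)

per_judge_orders = [0, 2, 14, 3, 0, 27, 5]          # hypothetical per-judge counts
assert total(per_judge_orders) == 51
assert threshold_count(per_judge_orders, 10) == 2   # judges with more than ten orders
```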

To enable totals and thresholds to scale to the size of the federal court system, we implemented a hierarchical MPC protocol, as described in Figure 4, whose design mirrors the hierarchy of the court system. Our evaluation shows that the hierarchical structure reduces MPC complexity from quadratic in the number of judges to linear.

We implemented protocols that make use of totals and thresholds using two existing JavaScript-based MPC libraries: WebMPC [16, 29] and Jiff [5]. WebMPC is the simpler and less versatile library; we test it as a baseline and as a "sanity check" that its performance scales as expected, then move on to the more interesting experiment of evaluating Jiff. We opted for JavaScript libraries to facilitate integration into a web application, which is suitable for federal judges to submit information through a familiar browser interface, regardless of the differences in their local system setups. Both of these libraries are designed to facilitate MPC across dozens or hundreds of computers; we simulated this effect by running each party in a separate process on a computer with 16 CPU cores and 64GB of RAM. We tested these protocols on randomly generated data containing values in the hundreds, which reflects the same order of magnitude as data present in existing transparency reports. Our implementations were crafted with two design goals in mind:

1. Protocols should scale to roughly 1000 parties, the approximate size of the federal judiciary [10], performing efficiently enough to facilitate periodic transparency reports.

2. Protocols should not require all parties to be online regularly or at the same time.

In the subsections that follow, we describe and evaluate our implementations in light of these goals.

6.1 Computing Totals in WebMPC

Figure 5: Performance of MPC using the WebMPC library.

WebMPC is a JavaScript-based library that can securely compute sums in a single round. The underlying protocol relies on two parties who are trusted not to collude with one another: an analyst, who distributes masking information to all protocol participants at the beginning of the process and receives the final aggregate statistic, and a facilitator, who aggregates this information together in masked form. The participants use the masking information from the analyst to mask their inputs and send them to the facilitator, who aggregates them and sends the result (i.e., a masked sum) to the analyst. The analyst removes the mask and uncovers the aggregated result. Once the participants have their masks, they receive no further messages from any other party; they can submit this masked data to the facilitator in an uncoordinated fashion and go offline immediately afterwards. Even if some anticipated participants do not send data, the protocol can still run to completion with those who remain.
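The one-round masking idea can be modeled in a few lines of Python. This is a simplified sketch of the protocol structure only, with toy values and all parties in one process; the actual WebMPC library handles mask distribution, authentication, and participant dropouts.

```python
import secrets

PRIME = 2**61 - 1   # assumed modulus for masked values

# Analyst: one random mask per participant; remembers only the mask total.
masks = {judge: secrets.randbelow(PRIME) for judge in ["J1", "J2", "J3"]}
mask_total = sum(masks.values()) % PRIME

# Participants: mask their private counts and send them to the facilitator.
private_counts = {"J1": 4, "J2": 0, "J3": 9}
masked = {j: (private_counts[j] + masks[j]) % PRIME for j in private_counts}

# Facilitator: sees only masked values; forwards the masked sum to the analyst.
masked_sum = sum(masked.values()) % PRIME

# Analyst: removes the aggregate mask to recover the true total (13), which is
# safe as long as the analyst and facilitator do not collude.
assert (masked_sum - mask_total) % PRIME == sum(private_counts.values())
```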

To make this protocol feasible in our setting, we need to identify a facilitator and analyst who will not collude. In many circumstances, it would be acceptable to rely on reputable institutions already present in the court system, such as the circuit courts of appeals, the Supreme Court, or the Administrative Office of the US Courts.

Although this protocol's simplicity limits its generality, it also makes it possible for the protocol to scale efficiently to a large number of participants. As Figure 5 illustrates, the protocol scales linearly with the number of parties. Even with 400 parties (the largest size we tested), the protocol still completed in just under 75 seconds. Extrapolating from the linear trend, it would take about three minutes to compute a summation across the entire federal judiciary. Since existing transparency reports are typically released just once or twice a year, it is reasonable to invest three minutes of computation (or less than a fifth of a second per judge) for each statistic.

6.2 Thresholds and Hierarchy with Jiff

To make use of MPC operations beyond totals, we turned to Jiff, another MPC library implemented in JavaScript. Jiff is designed to support MPC for arbitrary functionalities, although inbuilt support for some more complex functionalities is still under development at the time of writing. Most importantly for our needs, Jiff supports thresholding and multiplication in addition to sums. We evaluated Jiff on three different MPC protocols: totals (as with WebMPC), additive thresholding (i.e., how many values exceeded a specific threshold?), and multiplicative thresholding (i.e., did all values exceed a specific threshold?). In contrast to computing totals via summation, certain operations like thresholding require more complicated computation and multiple rounds of communication. By building on Jiff with our hierarchical MPC implementation, we demonstrate that these operations are viable at the scale required by the federal court system.

Figure 6: Flat total (red), additive threshold (blue), and multiplicative threshold (green) protocols in Jiff.

Figure 7: Hierarchical total (red), additive threshold (blue), and multiplicative threshold (green) protocols in Jiff. Note the difference in axis scales from Figure 6.

As a baseline, we ran sums, additive thresholding, and multiplicative thresholding benchmarks with all judges as full participants in the MPC protocol, sharing the workload equally, a configuration we term the flat protocol (in contrast to the hierarchical protocol we present next). Figure 6 illustrates that the running time of these protocols grows quadratically with the number of judges participating. These running times quickly became untenable. While summation took several minutes among hundreds of judges, both thresholding benchmarks could barely handle tens of judges in the same time envelopes. These graphs illustrate the substantial performance disparity between summation and thresholding.

In Section 4, we described an alternative "hierarchical" MPC configuration to reduce this quadratic growth to linear. As depicted in Figure 4, each lower-court judge splits a piece of data into twelve secret shares, one for each circuit court of appeals. These shares are sent to the corresponding courts, who conduct a twelve-party MPC that performs a total or thresholding operation based on the input shares. If n lower-court judges participate, the protocol is tantamount to computing n twelve-party summations followed by a single n-input summation or threshold. As n increases, the amount of work scales linearly. So long as at least one circuit court remains honest and uncompromised, the secrecy of the lower-court data endures, by the security of the secret-sharing scheme.

Figure 7 illustrates the linear scaling of the twelve-party portion of the hierarchical protocols; we measured only the computation time after the circuit courts received all of the additive shares from the lower courts. While the flat summation protocol took nearly eight minutes to run on 300 judges, the hierarchical summation scaled to 1000 judges in less than 20 seconds, besting even the WebMPC results. Although thresholding characteristically remained much slower than summation, the hierarchical protocol scaled to nearly 250 judges in about the same amount of time that it took the flat protocol to run on 35 judges. Since the running times for the threshold protocols were in the tens of minutes for large benchmarks, the linear trend is noisier than for the total protocol. Most importantly, both of these protocols scaled linearly, meaning that, given sufficient time, thresholding could scale up to the size of the federal court system. This performance is acceptable if a few choice thresholds are computed at the frequency at which existing transparency reports are published.18

One additional benefit of the hierarchical protocols is that lower courts do not need to stay online while the protocol is executing, a goal we articulated at the beginning of this section. A lower court simply needs to send in its shares to the requisite circuit courts, one message per circuit court for a grand total of twelve messages, after which it is free to disconnect. In contrast, the flat protocol grinds to a halt if even a single judge goes offline. The availability of the hierarchical protocol relies on a small set of circuit courts, who could invest in more robust infrastructure.

          7 Evaluation of SNARKs

We define the syntax of preprocessing zero-knowledge SNARKs for arithmetic circuit satisfiability [15].

A SNARK is a triple of probabilistic polynomial-time algorithms SNARK = (Setup, Prove, Verify), as follows:

• Setup(1^κ, R) takes as input the security parameter κ and a description of a binary relation R (an arithmetic circuit of size polynomial in κ), and outputs a pair (pk_R, vk_R) of a proving key and verification key.

• Prove(pk_R, (x, w)) takes as input a proving key pk_R and an input-witness pair (x, w), and outputs a proof π attesting to x ∈ L_R, where L_R = {x : ∃w s.t. (x, w) ∈ R}.

• Verify(vk_R, (x, π)) takes as input a verification key vk_R and an input-proof pair (x, π), and outputs a bit indicating whether π is a valid proof for x ∈ L_R.

18 Too high a frequency is also inadvisable, due to the possibility of revealing too-granular information when combined with the timings of specific investigations' court orders.

Before participants can create and verify SNARKs, they must establish a proving key, which any participant can use to create a SNARK, and a corresponding verification key, which any participant can use to verify a SNARK so created. Both of these keys are publicly known. The keys are distinct for each circuit (representing an NP relation) about which proofs are generated, and can be reused to produce as many different proofs with respect to that circuit as desired. Key generation uses randomness that, if known or biased, could allow participants to create proofs of false statements [13]. The key generation process must therefore protect and then destroy this information.

Using MPC to do key generation based on randomness provided by many different parties provides the guarantee that, as long as at least one of the MPC participants behaved correctly (i.e., did not bias his randomness and destroyed it afterward), the resulting keys are good (i.e., do not permit proofs of false statements). This approach has been used in the past, most notably by the cryptocurrency Zcash [8]. Despite the strong guarantees provided by this approach to key generation when at least one party is not corrupted, concerns have been expressed about the wisdom of trusting in the assumption of one honest party in the Zcash setting, which involves large monetary values and a system design inherently centered around the principles of full decentralization.

For our system, we propose that key generation be done in a one-time MPC among several of the traditionally reputable institutions in the court system, such as the Supreme Court or the Administrative Office of the US Courts, ideally together with other reputable parties from different branches of government. In our setting, the use of MPC for SNARK key generation does not constitute as pivotal and potentially risky a trust assumption as in Zcash, in that the court system is close-knit and inherently built with the assumption of trustworthiness of certain entities within the system. In contrast, a decentralized cryptocurrency (1) must, due to its distributed nature, rely for key generation on MPC participants that are essentially strangers to most others in the system, and (2) could be said to derive its very purpose from not relying on the trustworthiness of any small set of parties.

We note that since key generation is a one-time task for each circuit, we can tolerate a relatively performance-intensive process. Proving and verification keys can be distributed on the ledger.

7.1 Argument Types

Our implementation supports three types of arguments.

Argument of knowledge for a commitment (P_k). Our simplest type of argument attests the prover's knowledge of the content of a given commitment c, i.e., that she could open the commitment if required. Whenever a party publishes a commitment, she can accompany it with a SNARK attesting that she knows the message and randomness that were used to generate the commitment. Formally, this is an argument that the prover knows m and ω that correspond to a publicly known c, such that Open(m, c, ω) = 1.

Argument of commitment equality (P_eq). Our second type of argument attests that the content of two publicly known commitments c_1, c_2 is the same. That is, for two publicly known commitments c_1 and c_2, the prover knows m_1, m_2, ω_1, and ω_2 such that Open(m_1, c_1, ω_1) = 1 ∧ Open(m_2, c_2, ω_2) = 1 ∧ m_1 = m_2.

More concretely, suppose that an agency wishes to release relational information: that the identifier (e.g., email address) in the request is the same identifier that a judge approved. The judge and law enforcement agency post commitments c_1 and c_2, respectively, to the identifiers they used. The law enforcement agency then posts an argument attesting that the two commitments are to the same value.19 Since circuits use fixed-size inputs, an argument implicitly reveals the length of the committed message. To hide this information, the law enforcement agency can pad each input up to a uniform length.

P_eq may be too revealing under certain circumstances: for the public to verify the argument, the agency (who posted c_2) must explicitly identify c_1, potentially revealing which judge authorized the data request and when.

Existential argument of commitment equality (P_∃). Our third type of argument allows decreasing the resolution of the information revealed, by proving that a commitment's content is the same as that of some other commitment among many. Formally, it shows that for publicly known commitments c, c_1, ..., c_N, respectively to secret values (m, ω), (m_1, ω_1), ..., (m_N, ω_N), ∃i such that Open(m, c, ω) = 1 ∧ Open(m_i, c_i, ω_i) = 1 ∧ m = m_i. We treat i as an additional secret input, so that for any value of N, only two commitments need to be opened. This scheme trades off between resolution (number of commitments) and efficiency, a question we explore below.
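Written as plain predicates over the SHA256-based commitments of Section 7.2, the three relations are straightforward; the Python sketch below checks them in the clear, whereas the SNARKs attest to the same checks without revealing the witnesses (m, ω, and, for P_∃, the index i).

```python
import hashlib

def commit(m: bytes, omega: bytes) -> bytes:
    return hashlib.sha256(m + omega).digest()

def relation_k(c: bytes, m: bytes, omega: bytes) -> bool:
    # P_k: the prover knows an opening (m, omega) of the public commitment c.
    return commit(m, omega) == c

def relation_eq(c1: bytes, c2: bytes,
                m1: bytes, o1: bytes, m2: bytes, o2: bytes) -> bool:
    # P_eq: the two public commitments c1 and c2 open to the same content.
    return commit(m1, o1) == c1 and commit(m2, o2) == c2 and m1 == m2

def relation_exists(c: bytes, others: list[bytes],
                    m: bytes, omega: bytes, i: int, m_i: bytes, o_i: bytes) -> bool:
    # P_exists: c's content equals that of some commitment in `others`;
    # the index i is part of the secret witness, so it is never revealed.
    return commit(m, omega) == c and commit(m_i, o_i) == others[i] and m == m_i
```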

We have chosen these three types of arguments to implement, but libsnark supports arbitrary predicates in principle, and there are likely others that would be useful and run efficiently in practice. A useful generalization of P_eq and P_∃ would be to replace equality with more sophisticated domain-specific predicates: instead of showing that messages m_1, m_2 corresponding to a pair of commitments are equal, one could show p(m_1, m_2) = 1 for other predicates p (e.g., "less-than" or "signed by same court"). The types of arguments that can be implemented efficiently will expand as SNARK libraries' efficiency improves; our system inherits such efficiency gains.

19 To produce a proof for P_eq, the prover (e.g., the agency) needs to know both ω_2 and ω_1, but in some cases c_1 (and thus ω_1) may have been produced by a different entity (e.g., the judge). Publicizing ω_1 is unacceptable, as it compromises the hiding of the commitment content. To solve this problem, the judge can include ω_1 alongside m_1 in secret documents that both parties possess (e.g., the court order).

7.2 Implementation

We implemented these zero-knowledge arguments with libsnark [34], a C++ library for creating general-purpose SNARKs from arithmetic circuits. We implemented commitments using the SHA256 hash function;20 ω is a 256-bit random string appended to the message before it is hashed. In this section, we show that useful statements can be proven within a reasonable performance envelope. We consider six criteria: the size of the proving key, the size of the verification key, the size of the proof statement, the time to generate keys, the time to create proofs, and the time to verify proofs. We evaluated these metrics with messages from 16 to 1232 bytes on P_k, P_eq, and P_∃ (N = 100, 400, 700, and 1000, large enough to obscure links between commitments), on a computer with 16 CPU cores and 64GB of RAM.

Argument size. The argument is just 287 bytes. Accompanying each argument are its public inputs (in this case, commitments). Each commitment is 256 bits.21 An auditor needs to store these commitments anyway as part of the ledger, and each commitment can be stored just once and reused for many proofs.

Verification key size. The size of the verification key is proportional to the size of the circuit and its public inputs. The key was 106KB for P_k (one commitment as input and one SHA256 circuit) and 208.3KB for P_eq (two commitments and two SHA256 circuits). Although P_∃ computes SHA256 just twice, its smallest input, 100 commitments, is 50 times as large as that of P_k and P_eq; the keys are correspondingly larger and grow linearly with the input size. For 100, 400, 700, and 1000 commitments, the verification keys were, respectively, 10MB, 41MB, 71MB, and 102MB. Since only one verification key is necessary for each circuit, these keys are easily small enough to make large-scale verification feasible.

Proving key size. The proving keys are much larger, in the hundreds of megabytes. Their size grows linearly with the size of the circuit, so longer messages (which require more SHA256 computations), more complicated circuits, and (for P∃) more inputs lead to larger keys. Figure 8a reflects this trend. Proving keys are largest for P∃ with 1000 inputs on 1232-byte messages, and shrink as the message size and the number of commitments decrease. Pk and Peq, which have simpler circuits, still have bigger proving keys for bigger messages. Although these keys are large, only entities that create each kind of proof need to store the corresponding key. Storing one key for each type of argument we have presented takes only about 1GB at the largest input sizes.

20 Certain other hash functions may be more amenable to representation as arithmetic circuits and thus more "SNARK-friendly." We opted for a proof of concept with SHA256, as it is so widely used.

21 LibSNARK stores each bit in a 32-bit integer, so an argument involving k commitments takes about 1024k bytes. A bit-vector representation would save a factor of 32.

Key generation time. Key generation time increased linearly with the size of the keys, from a few seconds for Pk and Peq on small messages to a few minutes for P∃ on the largest parameters (Figure 8b). Since key generation is a one-time process to add a new kind of proof in the form of a circuit, we find these numbers acceptable.

Argument generation time. Argument generation time increased linearly with proving key size and ranged from a few seconds on the smallest keys to a couple of minutes for the largest (Figure 8c). Since argument generation is a one-time task for each surveillance action, and the existing administrative processes for each surveillance action often take hours or days, we find this cost acceptable.

Argument verification time. Verifying Pk and Peq on the largest message took only a few milliseconds. Verification times for P∃ were larger and increased linearly with the number of input commitments: for 100, 400, 700, and 1000 commitments, verification took 40ms, 85ms, 243ms, and 338ms on the largest input. These times are still fast enough to verify many arguments quickly.

          8 Generalization

Our proposal can be generalized beyond ECPA surveillance to encompass a broader class of secret information processes. Consider situations in which independent institutions need to act in a coordinated but secret fashion and, at the same time, are subject to public scrutiny. They should be able to convince the public that their actions are consistent with relevant rules. As in electronic surveillance, accountability requires the ability to attest to compliance without revealing sensitive information.

Figure 8: SNARK evaluation. (a) Proving key size. (b) Key generation time. (c) Argument generation time.

Example 1 (FISA court). Accountability is needed in other electronic surveillance arenas. The US Foreign Intelligence Surveillance Act (FISA) regulates surveillance in national security investigations. Because of the sensitive interests at stake, the entire process is overseen by a US court that meets in secret. The tension between secrecy and public accountability is even sharper for the FISA court: much of the data collected under FISA may stay permanently hidden inside US intelligence agencies, while data collected under ECPA may eventually be used in public criminal trials. This opacity may be justified, but it has engendered skepticism. The public has no way of knowing what the court is doing, nor any means of assuring itself that the intelligence agencies under the authority of FISA are even complying with the rules of that court. The FISA court itself has voiced concern that it has no independent means of assessing compliance with its orders because of the extreme secrecy involved. Applying our proposal to the FISA court, both the court and the public could receive proofs of documented compliance with FISA orders, as well as aggregate statistics on the scope of FISA surveillance activity, to the full extent possible without incurring national security risk.

Example 2 (Clinical trials). Accountability mechanisms are also important to assess the behavior of private parties, e.g., in clinical trials for new drugs. There are many parties to clinical trials, and much of the information involved is either private or proprietary. Yet regulators and the public have a need to know that responsible testing protocols are observed. Our system can achieve the right balance of transparency, accountability, and respect for the privacy of those involved in the trials.

Example 3 (Public fund spending). Accountability in the spending of taxpayer money is naturally a subject of public interest. Portions of public funds may be allocated for sensitive purposes (e.g., defense/intelligence), and the amounts and allocation thereof may be publicly unavailable due to their sensitivity. Our system would enable credible public assurances that taxpayer money is being spent in accordance with stated principles, while preserving secrecy of information considered sensitive.

8.1 Generalized Framework

We present abstractions describing the generalized version of our system and briefly outline how the concrete examples fit into this framework. A secret information process includes the following components:

• A set of participants interact with each other. In our ECPA example, these are judges, law enforcement agencies, and companies.

• The participants engage in a protocol (e.g., to execute the procedures for conducting electronic surveillance). The protocol messages exchanged are hidden from the view of outsiders (e.g., the public), and yet it is of public interest that the protocol messages exchanged adhere to certain rules.

• A set of auditors (distinct from the participants) seeks to audit the protocol by verifying that a set of accountability properties are met.

Abstractly, our system allows the controlled disclosure of four types of information.

Existential information reveals the existence of a piece of data, be it in a participant's local storage or the content of a communication between participants. In our case study, existential information is revealed with commitments, which indicate the existence of a document.

Relational information describes the actions participants take in response to the actions of others. In our case study, relational information is represented by the zero-knowledge arguments that attest that actions were taken lawfully (e.g., in compliance with a judge's order).

Content information is the data in storage and communication. In our case study, content information is revealed through aggregate statistics via MPC, and when documents are unsealed and their contents made public.

Timing information is a by-product of the other information. In our case study, timing information could include order issuance dates, turnaround times for data request fulfilment by companies, and seal expiry dates.

Revealing combinations of these four types of information with the specified cryptographic tools provides the flexibility to satisfy a range of application-specific accountability properties, as exemplified next.

Example 1 (FISA court). Participants are the FISA Court judges, the agencies requesting surveillance authorization, and any service providers involved in facilitating said surveillance. The protocol encompasses the legal process required to authorize surveillance, together with the administrative steps that must be taken to enact surveillance. Auditors are the public, the judges themselves, and possibly Congress. Desirable accountability properties are similar to those in our ECPA case study, e.g., attestations that certain rules are being followed in issuing surveillance orders, and release of aggregate statistics on surveillance activities under FISA.

Example 2 (Clinical trials). Participants are the institutions (companies or research centers) conducting clinical trials, comprising scientists, ethics boards, and data analysts; the organizations that manage regulations regarding clinical trials, such as the National Institutes of Health (NIH) and the Food and Drug Administration (FDA) in the US; and hospitals and other sources through which trial participants are drawn. The protocol encompasses the administrative process required to approve a clinical trial, and the procedure of gathering participants and conducting the trial itself. Auditors are the public, the regulatory organizations such as the NIH and the FDA, and possibly professional ethics committees. Desirable accountability properties include, e.g., attestations that appropriate procedures are respected in recruiting participants and administering trials, and release of aggregate statistics on clinical trial results without compromising individual participants' medical data.

Example 3 (Public fund spending). Participants are Congress (who appropriates the funding), defense/intelligence agencies, and service providers contracted in the spending of said funding. The protocol encompasses the processes by which Congress allocates funds to agencies and agencies allocate funds to particular expenses. Auditors are the public and Congress. Desirable accountability properties include, e.g., attestations that procurements were within reasonable margins of market prices and satisfied documented needs, and release of aggregate statistics on the proportion of allocated money used and broad spending categories.

          9 Conclusion

We present a cryptographic answer to the accountability challenge currently frustrating the US court system. Leveraging cryptographic commitments, zero-knowledge proofs, and secure MPC, we provide the electronic surveillance process a series of scalable, flexible, and practical measures for improving accountability while maintaining secrecy. While we focus on the case study of electronic surveillance, these strategies are equally applicable to a range of other secret information processes requiring accountability to an outside auditor.

          Acknowledgements

We are grateful to Judge Stephen Smith for discussion and insights from the perspective of the US court system; to Andrei Lapets, Kinan Dak Albab, Rawane Issa, and Frederick Joossens for discussion on Jiff and WebMPC; and to Madars Virza for advice on SNARKs and LibSNARK.

This research was supported by the following grants: NSF MACS (CNS-1413920), DARPA IBM (W911NF-15-C-0236), SIMONS Investigator Award Agreement Dated June 5th, 2012, and the Center for Science of Information (CSoI), an NSF Science and Technology Center, under grant agreement CCF-0939370.

References

[1] Bellman. https://github.com/ebfull/bellman.

[2] Electronic Communications Privacy Act, 18 U.S.C. § 2701 et seq.

[3] Foreign Intelligence Surveillance Act, 50 U.S.C. ch. 36.

[4] Google transparency report. https://www.google.com/transparencyreport/userdatarequests/countries/?p=2016-12.

[5] Jiff. https://github.com/multiparty/jiff.

[6] jsnark. https://github.com/akosba/jsnark.

[7] Law enforcement requests report. https://www.microsoft.com/en-us/about/corporate-responsibility/lerr.

[8] Zcash. https://z.cash.

[9] Wiretap report 2015. http://www.uscourts.gov/statistics-reports/wiretap-report-2015, December 2015.

[10] ADMINISTRATIVE OFFICE OF THE COURTS. Authorized judgeships. http://www.uscourts.gov/sites/default/files/allauth.pdf.

[11] ARAKI, T., BARAK, A., FURUKAWA, J., LICHTER, T., LINDELL, Y., NOF, A., OHARA, K., WATZMAN, A., AND WEINSTEIN, O. Optimized honest-majority MPC for malicious adversaries - breaking the 1 billion-gate per second barrier. In 2017 IEEE Symposium on Security and Privacy, SP 2017, San Jose, CA, USA, May 22-26, 2017 (2017), IEEE Computer Society, pp. 843–862.

[12] BATES, A., BUTLER, K. R., SHERR, M., SHIELDS, C., TRAYNOR, P., AND WALLACH, D. Accountable wiretapping - or - I know they can hear you now. NDSS (2012).

[13] BEN-SASSON, E., CHIESA, A., GREEN, M., TROMER, E., AND VIRZA, M. Secure sampling of public parameters for succinct zero knowledge proofs. In Security and Privacy (SP), 2015 IEEE Symposium on (2015), IEEE, pp. 287–304.

[14] BEN-SASSON, E., CHIESA, A., TROMER, E., AND VIRZA, M. Scalable zero knowledge via cycles of elliptic curves. IACR Cryptology ePrint Archive 2014 (2014), 595.

[15] BEN-SASSON, E., CHIESA, A., TROMER, E., AND VIRZA, M. Succinct non-interactive zero knowledge for a von Neumann architecture. In 23rd USENIX Security Symposium (USENIX Security 14) (San Diego, CA, Aug. 2014), USENIX Association, pp. 781–796.

[16] BESTAVROS, A., LAPETS, A., AND VARIA, M. User-centric distributed solutions for privacy-preserving analytics. Communications of the ACM 60, 2 (February 2017), 37–39.

[17] CHAUM, D., AND VAN HEYST, E. Group signatures. In Advances in Cryptology - EUROCRYPT '91, Workshop on the Theory and Application of Cryptographic Techniques, Brighton, UK, April 8-11, 1991, Proceedings (1991), D. W. Davies, Ed., vol. 547 of Lecture Notes in Computer Science, Springer, pp. 257–265.

[18] DAMGARD, I., AND ISHAI, Y. Constant-round multiparty computation using a black-box pseudorandom generator. In Advances in Cryptology - CRYPTO 2005, 25th Annual International Cryptology Conference, Santa Barbara, California, USA, August 14-18, 2005, Proceedings (2005), V. Shoup, Ed., vol. 3621 of Lecture Notes in Computer Science, Springer, pp. 378–394.

[19] DAMGARD, I., KELLER, M., LARRAIA, E., PASTRO, V., SCHOLL, P., AND SMART, N. P. Practical covertly secure MPC for dishonest majority - or: Breaking the SPDZ limits. In Computer Security - ESORICS 2013 - 18th European Symposium on Research in Computer Security, Egham, UK, September 9-13, 2013, Proceedings (2013), J. Crampton, S. Jajodia, and K. Mayes, Eds., vol. 8134 of Lecture Notes in Computer Science, Springer, pp. 1–18.

[20] FEIGENBAUM, J., JAGGARD, A. D., AND WRIGHT, R. N. Open vs. closed systems for accountability. In Proceedings of the 2014 Symposium and Bootcamp on the Science of Security (2014), ACM, p. 4.

[21] FEIGENBAUM, J., JAGGARD, A. D., WRIGHT, R. N., AND XIAO, H. Systematizing "accountability" in computer science. Tech. rep., Yale University, Feb. 2012. Technical Report YALEU/DCS/TR-1452.

[22] FEUER, A., AND ROSENBERG, E. Brooklyn prosecutor accused of using illegal wiretap to spy on love interest, November 2016. https://www.nytimes.com/2016/11/28/nyregion/brooklyn-prosecutor-accused-of-using-illegal-wiretap-to-spy-on-love-interest.html.

[23] GOLDWASSER, S., AND PARK, S. Public accountability vs. secret laws: Can they coexist? A cryptographic proposal. In Proceedings of the 2017 Workshop on Privacy in the Electronic Society, Dallas, TX, USA, October 30 - November 3, 2017 (2017), B. M. Thuraisingham and A. J. Lee, Eds., ACM, pp. 99–110.

[24] JAKOBSEN, T. P., NIELSEN, J. B., AND ORLANDI, C. A framework for outsourcing of secure computation. In Proceedings of the 6th edition of the ACM Workshop on Cloud Computing Security, CCSW '14, Scottsdale, Arizona, USA, November 7, 2014 (2014), G. Ahn, A. Oprea, and R. Safavi-Naini, Eds., ACM, pp. 81–92.

[25] KAMARA, S. Restructuring the NSA metadata program. In International Conference on Financial Cryptography and Data Security (2014), Springer, pp. 235–247.

[26] KATZ, J., AND LINDELL, Y. Introduction to modern cryptography. CRC Press, 2014.

[27] KROLL, J., FELTEN, E., AND BONEH, D. Secure protocols for accountable warrant execution, 2014. http://www.cs.princeton.edu/felten/warrant-paper.pdf.

[28] LAMPSON, B. Privacy and security: Usable security: how to get it. Communications of the ACM 52, 11 (2009), 25–27.

[29] LAPETS, A., VOLGUSHEV, N., BESTAVROS, A., JANSEN, F., AND VARIA, M. Secure MPC for analytics as a web application. In 2016 IEEE Cybersecurity Development (SecDev) (Boston, MA, USA, November 2016), pp. 73–74.

[30] MASHIMA, D., AND AHAMAD, M. Enabling robust information accountability in e-healthcare systems. In HealthSec (2012).

[31] PAPANIKOLAOU, N., AND PEARSON, S. A cross-disciplinary review of the concept of accountability: A survey of the literature, 2013. Available online at http://www.bic-trust.eu/files/2013/06/Paper_NP.pdf.

[32] PEARSON, S. Toward accountability in the cloud. IEEE Internet Computing 15, 4 (2011), 64–69.

[33] RIVEST, R. L., SHAMIR, A., AND TAUMAN, Y. How to leak a secret. In Advances in Cryptology - ASIACRYPT 2001, 7th International Conference on the Theory and Application of Cryptology and Information Security, Gold Coast, Australia, December 9-13, 2001, Proceedings (2001), C. Boyd, Ed., vol. 2248 of Lecture Notes in Computer Science, Springer, pp. 552–565.

[34] SCIPR LAB. libsnark: a C++ library for zkSNARK proofs. https://github.com/scipr-lab/libsnark.

[35] SEGAL, A., FEIGENBAUM, J., AND FORD, B. Privacy-preserving lawful contact chaining [preliminary report]. In Proceedings of the 2016 ACM Workshop on Privacy in the Electronic Society (New York, NY, USA, 2016), WPES '16, ACM, pp. 185–188.

[36] SEGAL, A., FORD, B., AND FEIGENBAUM, J. Catching bandits and only bandits: Privacy-preserving intersection warrants for lawful surveillance. In FOCI (2014).

[37] SMITH, S. W. Kudzu in the courthouse: Judgments made in the shade. The Federal Courts Law Review 3, 2 (2009).

[38] SMITH, S. W. Gagged, sealed & delivered: Reforming ECPA's secret docket. Harvard Law & Policy Review 6 (2012), 313–459.

[39] SUNDARESWARAN, S., SQUICCIARINI, A., AND LIN, D. Ensuring distributed accountability for data sharing in the cloud. IEEE Transactions on Dependable and Secure Computing 9, 4 (2012), 556–568.

[40] TAN, Y. S., KO, R. K., AND HOLMES, G. Security and data accountability in distributed systems: A provenance survey. In High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing (HPCC EUC), 2013 IEEE 10th International Conference on (2013), IEEE, pp. 1571–1578.

[41] VIRZA, M. Private communication, November 2017.

[42] WANG, X., RANELLUCCI, S., AND KATZ, J. Global-scale secure multiparty computation. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, CCS 2017, Dallas, TX, USA, October 30 - November 03, 2017 (2017), B. M. Thuraisingham, D. Evans, T. Malkin, and D. Xu, Eds., ACM, pp. 39–56.

[43] WEITZNER, D. J., ABELSON, H., BERNERS-LEE, T., FEIGENBAUM, J., HENDLER, J. A., AND SUSSMAN, G. J. Information accountability. Commun. ACM 51, 6 (2008), 82–87.

[44] XIAO, Z., KATHIRESSHAN, N., AND XIAO, Y. A survey of accountability in computer networks and distributed systems. Security and Communication Networks 9, 4 (2016), 290–315.


4.1 Cryptographic Tools

Append-only ledgers. An append-only ledger is a log containing an ordered sequence of data, consistently visible to anyone (within a designated system), to which data may be appended over time but whose contents may not be edited or deleted. The append-only nature of the ledger is key for the maintenance of a globally consistent and tamper-proof data record over time.

In our system, the ledger records credibly time-stamped information about surveillance events. Typically, data stored on the ledger will cryptographically hide some sensitive information about a surveillance event while revealing select other information about it for the sake of accountability. Placing information on the ledger is one means by which we reveal information to the public, facilitating the security goal of accountability from Section 3.
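The sketch below is a minimal in-memory stand-in for this abstraction, written under the assumption of a toy single-machine setting; a real deployment would be a replicated, tamper-evident log maintained as discussed in Section 4.2, and the class and field names here are illustrative only.

```python
import time
from dataclasses import dataclass

@dataclass(frozen=True)
class LedgerEntry:
    author: str       # e.g., a judge's or agency's public identity
    payload: bytes    # e.g., a commitment, cover-sheet metadata, or a ZK argument
    timestamp: float

class AppendOnlyLedger:
    """Toy append-only ledger: entries can be appended and read, never edited or deleted."""

    def __init__(self) -> None:
        self._entries: list[LedgerEntry] = []

    def append(self, author: str, payload: bytes) -> int:
        entry = LedgerEntry(author=author, payload=payload, timestamp=time.time())
        self._entries.append(entry)
        return len(self._entries) - 1  # the index serves as a stable reference

    def read(self, index: int) -> LedgerEntry:
        return self._entries[index]

    def __len__(self) -> int:
        return len(self._entries)
```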

Cryptographic commitments. A cryptographic commitment c is a string generated from some input data D which has the properties of hiding and binding, i.e., c reveals no information about the value of D, and yet D can be revealed or "opened" (by the person who created the commitment) in such a way that any observer can be sure that D is the data with respect to which the commitment was made. We refer to D as the content of c.

In our system, commitments indicate that a piece of information (e.g., a court order) exists and that its content can credibly be opened at a later time. Posting commitments to the ledger also establishes the existence of a piece of information at a given point in time. Returning to the security goals from Section 3, commitments make it possible to reveal a limited amount of information early on (achieving a degree of accountability) without compromising investigative secrecy (achieving confidentiality). Later, when confidentiality is no longer necessary and information can be revealed (i.e., a seal on an order expires), the commitment can be opened by its creator to achieve full accountability.

Commitments can be perfectly (information-theoretically) hiding, achieving the perfect confidentiality goal in Section 3.2. A well-known commitment scheme that is perfectly hiding is the Pedersen commitment.5

Zero-knowledge. A zero-knowledge argument6 allows a prover P to convince a verifier V of a fact without revealing any additional information about the fact in the process of doing so. P can provide to V a tuple (R, x, π) consisting of a binary relation R, an input x, and a proof π, such that the verifier is convinced that ∃w s.t. (x, w) ∈ R, yet cannot infer anything about the witness w. Three properties are required of zero-knowledge arguments: completeness, that any true statement can be proven by the honest algorithm P such that V accepts the proof; soundness, that no purported proof of a false statement (produced by any algorithm P*) should be accepted by the honest verifier V; and zero-knowledge, that the proof π reveals no information beyond what can be inferred just from the desired statement that (x, w) ∈ R.

5 While the Pedersen commitment is not succinct, we note that by combining succinct commitments with perfectly hiding commitments (as also suggested by [23]), it is possible to obtain a commitment that is both succinct and perfectly hiding.

6 Zero-knowledge proof is a more commonly used term than zero-knowledge argument. The two terms denote very similar concepts; the difference lies only in the nature of the soundness guarantee (i.e., that false statements cannot be convincingly attested to), which is computational for arguments and statistical for proofs.

In our system, zero-knowledge makes it possible to reveal how secret information relates to a system of rules, or to other pieces of secret information, without revealing any further information. Concretely, our implementation (detailed in Section 7) allows law enforcement to attest (1) knowledge of the content of a commitment c (e.g., to an email address in a request for data made by a law enforcement agency), demonstrating the ability to later open c, and (2) that the content of a commitment c is equal to the content of a prior commitment c′ (e.g., to an email address in a court order issued by a judge). In case even (2) reveals too much information, our implementation supports not specifying c′ exactly and instead attesting that c′ lies in a given set S (e.g., S could include all judges' surveillance authorizations from the last month).

In the terms of our security goals from Section 3, zero-knowledge arguments can demonstrate to the public that commitments can be opened and that proper relationships between committed information are preserved (accountability), without revealing any further information about the surveillance process (confidentiality). If these arguments fail, the public can detect that a participant has deviated from the process (accountability).

The SNARK construction [15] that we suggest for use in our system achieves perfect (information-theoretic) confidentiality,7 a goal stated in Section 3.2.

Secure multiparty computation (MPC). MPC allows a set of n parties p1, ..., pn, each in possession of private data x1, ..., xn, to jointly compute the output of a function y = f(x1, ..., xn) on their private inputs; y is computed via an interactive protocol executed by the parties.

Secure MPC makes two guarantees: correctness and secrecy. Correctness means that the output y is equal to f(x1, ..., xn). Secrecy means that any adversary that corrupts some subset S ⊂ {p1, ..., pn} of the parties learns nothing about {xi : pi ∉ S} beyond what can already be inferred given the adversarial inputs {xi : pi ∈ S} and the output y. Secrecy is formalized by stipulating that a simulator that is given only ({xi : pi ∈ S}, y) as input must be able to produce a "simulated" protocol transcript that is indistinguishable from the actual protocol execution run with all the real inputs (x1, ..., xn).

7 In fact, [15] states their secrecy guarantee in a computational (not information-theoretic) form, but their unmodified construction does achieve perfect secrecy, and the proofs of [15] suffice unchanged to prove the stronger definition [41]. That perfect zero-knowledge can be achieved is also remarked in the appendix of [14].

Figure 2: System configuration. Participants (rectangles) read and write to a public ledger (cloud) and local storage (ovals): the judge stores court orders, law enforcement stores surveillance requests, and the company stores user data. The public (diamond) reads from the ledger.

In our system, MPC enables computation of aggregate statistics about the extent of surveillance across the entire court system, through a computation among individual judges. MPC eliminates the need to pool the sensitive data of individual judges in the clear, or to defer to companies to compute and release this information piecemeal. In the terms of our security goals, MPC reveals information to the public (accountability) from a source we trust to follow the protocol honestly (the courts), without revealing the internal workings of courts to one another (confidentiality). It also eliminates the need to rely on potentially malicious companies to reveal this information themselves (correctness).

Secret sharing. Secret sharing facilitates our hierarchical MPC protocol. A secret sharing of some input data D consists of a set of strings (D1, ..., DN), called shares, satisfying two properties: (1) any subset of N−1 shares reveals no information about D, and (2) given all N shares, D can easily be reconstructed.8
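A minimal sketch of the N-out-of-N case, assuming additive sharing over a public modulus (one standard way to realize this primitive; the modulus and helper names are illustrative):

```python
import secrets

MODULUS = 2**61 - 1  # any public modulus larger than the values being shared

def share(secret: int, n: int) -> list[int]:
    """Split `secret` into n additive shares; all n are needed to reconstruct."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n - 1)]
    shares.append((secret - sum(shares)) % MODULUS)
    return shares

def reconstruct(shares: list[int]) -> int:
    return sum(shares) % MODULUS

# Example: a judge splits a count of 7 issued orders into 12 shares, one per
# judicial circuit; any 11 of them reveal nothing about the count.
shares = share(7, 12)
assert reconstruct(shares) == 7
```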

Summary. In summary, these cryptographic tools support three high-level properties that we utilize to achieve our security goals:

1. Trusted records of events. The append-only ledger and cryptographic commitments create a trustworthy record of surveillance events without revealing sensitive information to the public.

2. Demonstration of compliance. Zero-knowledge arguments allow parties to provably assure the public that relevant rules have been followed, without revealing any secret information.

3. Transparency without handling secrets. MPC enables the court system to accurately compute and release aggregate statistics about surveillance events without ever sharing the sensitive information of individual parties.

8 For simplicity, we have described so-called "N-out-of-N" secret sharing. More generally, secret sharing can guarantee that any subset of k ≤ N shares enables reconstruction, while any subset of at most k−1 shares reveals nothing about D.

4.2 System Configuration

Our system is centered around a publicly visible append-only ledger where the various entities involved in the electronic surveillance process can post information. As depicted in Figure 2, every judge, law enforcement agency, and company contributes data to this ledger. Judges post cryptographic commitments to all orders issued. Law enforcement agencies post commitments to their activities (warrant requests to judges and data requests to companies) and zero-knowledge arguments about the requests they issue. Companies do the same for the data they deliver to agencies. Members of the public can view and verify all data posted to the ledger.

Each judge, law enforcement agency, and company will need to maintain a small amount of infrastructure: a computer terminal through which to compose posts, and local storage (the ovals in Figure 2) to store sensitive information (e.g., the content of sealed court orders). To attest to the authenticity of posts to the ledger, each participant will need to maintain a private signing key and publicize a corresponding verification key. We assume that public-key infrastructure could be established by a reputable party like the Administrative Office of the US Courts.

The ledger itself could be maintained as a distributed system among the participants in the process, as a distributed system among a more exclusive group of participants with higher trustworthiness (e.g., the circuit courts of appeals), or by a single entity (e.g., the Administrative Office of the US Courts or the Supreme Court).

4.3 Workflow

Figure 3: Data posted to the public ledger as the protocol runs. Time moves from left to right. Each rectangle is a post to the ledger: the judge posts a commitment to the order and case metadata; law enforcement posts a commitment to its data request and a ZK argument; the company posts a commitment to its data response and a ZK argument. Dashed arrows between rectangles indicate that the source of the arrow could contain a visible reference to the destination. The ovals contain the entities that make each post.

Posting to the ledger. The workflow of our system augments the electronic surveillance workflow in Figure 1 with additional information posted to the ledger, as depicted in Figure 3. When a judge issues an order (step 2 of Figure 1), she also posts a commitment to the order and additional metadata about the case. At a minimum, this metadata must include the date that the order's seal expires; depending on the system configuration, she could post other metadata (e.g., Judge Smith's cover sheet). The commitment allows the public to later verify that the order was properly unsealed, but reveals no information about the commitment's content in the meantime, achieving a degree of accountability in the short term (while confidentiality is necessary) and full accountability in the long term (when the seal expires and confidentiality is unnecessary). Since judges are honest-but-curious, they will adhere to this protocol and reliably post commitments whenever new court orders are issued.

The agency then uses this order to request data from a company (step 3 in Figure 1) and posts a commitment to this request, alongside a zero-knowledge argument that the request is compatible with a court order (and possibly also with other legal requirements). This commitment, which may never be opened, provides a small amount of accountability within the confines of confidentiality, revealing that some law enforcement action took place. The zero-knowledge argument takes accountability a step further: it demonstrates to the public that the law enforcement action was compatible with the original court order (which we trust to have been committed properly), forcing the potentially-malicious law enforcement agency to adhere to the protocol or make public its non-compliance. (Failure to adhere would result in a publicly visible invalid zero-knowledge argument.) If the company responds with matching data (step 6 in Figure 1), it posts a commitment to its response and an argument that it furnished (only) the data implicated by the order and data request. These commitments and arguments serve a role analogous to those posted by law enforcement.

This system does not require commitments to all actions in Figure 1. For example, it only requires a law enforcement agency to commit to a successful request for data (step 3), rather than any proposed request (step 1). The system could easily be augmented with additional commitments and proofs as desired by the court system.

The zero-knowledge arguments about relationships between commitments reveal one additional piece of information. For a law enforcement agency to prove that its committed data request is compatible with a particular court order, it must reveal which specific committed court order authorized the request. In other words, the zero-knowledge arguments reveal the links between specific actions of each party (dashed arrows in Figure 3). These links could be eliminated, reducing visibility into the workflow of surveillance: instead, entities would argue that their actions are compatible with some court order among a group of recent orders.

Remark 4.1. We now briefly discuss other possible malicious behaviors by law enforcement agencies and companies involving inaccurate logging of data. Though, as mentioned in Remark 3.1, correspondence between real-world events and logged items is not enforceable by technological means alone, we informally argue that our design incentivizes honest logging, and reporting of dishonest logging, under many circumstances.

A malicious law enforcement agency could omit commitments, or commit to one surveillance request but send the company a different request. This action is visible to the company, which could reveal this misbehavior to the judge. This visibility incentivizes companies to record their actions diligently, so as to avoid any appearance of negligence, let alone complicity in the agency's misbehavior.

Similarly, a malicious company might fail to post a commitment, or post a commitment inconsistent with its actual behavior. These actions are visible to law enforcement agencies, who could report violations to the judge (and otherwise risk the appearance of negligence or complicity). To make such violations visible to the public, we could add a second law enforcement commitment that acknowledges the data received and proves that it is compatible with the original court order and law enforcement request.

However, even incentive-based arguments do not address the case of a malicious law enforcement agency colluding with a malicious company. These entities could simply withhold from posting any information to the ledger (or post a sequence of false but consistent information), thereby making it impossible to detect violations. To handle this scenario, we have to defer to the legal process itself: when this data is used as evidence in court, a judge should ensure that appropriate documentation was posted to the ledger and that the data was gathered appropriately.

Figure 4: The flow of data as aggregate statistics are computed. Each lower-court judge calculates its component of the statistic and secret-shares it into 12 shares, one for each judicial circuit (illustrated by colors). The servers of the circuit courts then engage in an MPC to compute the aggregate statistic from the input shares.

Aggregate statistics. At configurable intervals, the individual courts use MPC to compute aggregate statistics about their surveillance activities.9 An analyst, such as the Administrative Office of the US Courts, receives the result of this MPC and posts it to the ledger. The particular kinds of aggregate statistics computed are at the discretion of the court system. They could include figures already tabulated in the Administrative Office of the US Courts' Wiretap Reports [9] (i.e., orders by state and by criminal offense) and in company-issued transparency reports [4, 7] (i.e., requests and number of users implicated, by company). Due to the generality10 of MPC, it is theoretically possible to compute any function of the information known to each of the judges. For performance reasons, we restrict our focus to totals and aggregated thresholds, a set of operations expressive enough to replicate existing transparency reports.

9 Microsoft [7] and Google [4] currently release their transparency reports every six months, and the Administrative Office of the US Courts does so annually [9]. We take these intervals to be our baseline for the frequency with which aggregate statistics would be released in our system, although releasing statistics more frequently (e.g., monthly) would improve transparency.

10 General MPC is a common term used to describe MPC that can compute arbitrary functions of the participants' data, as opposed to just restricted classes of functions.

The statistics themselves are calculated using MPC. In principle, the hundreds of magistrate and district court judges could attempt to directly perform MPC with each other. However, as we find in Section 6, computing even simple functions among hundreds of parties is prohibitively slow. Moreover, the logistics of getting every judge online simultaneously, with enough reliability to complete a multiround protocol, would be difficult; if a single judge went offline, the protocol would stall.

Instead, we compute aggregate statistics in a hierarchical manner, as depicted in Figure 4. We exploit the existing hierarchy of the federal court system. Each of the lower-court judges is under the jurisdiction of one of twelve circuit courts of appeals. Each lower-court judge computes her individual component of the larger aggregate statistic (e.g., number of orders issued against Google in the past six months) and divides it into twelve secret shares, sending one share to (a server controlled by) each circuit court of appeals. Distinct shares are represented by separate colors in Figure 4. So long as at least one circuit server remains uncompromised, the lower-court judges can be assured, by the security of the secret-sharing scheme, that their contributions to the larger statistic are confidential. The circuit servers engage in a twelve-party MPC that reconstructs the judges' input data from the shares, computes the desired function, and reveals the result to the analyst. By concentrating the computationally intensive and logistically demanding part of the MPC process in twelve stable servers, this design eliminates many of the performance and reliability challenges of the flat (non-hierarchical) protocol (Section 6 discusses performance).
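The data flow for totals under this hierarchical design can be sketched in plaintext as follows, assuming the additive N-out-of-N sharing sketched in Section 4.1; in the deployed protocol the twelve circuit servers combine their partial results inside MPC rather than opening them in the clear, which is what also allows thresholds and other functions to be computed.

```python
import secrets

MODULUS = 2**61 - 1
NUM_CIRCUITS = 12

def share(value: int, n: int = NUM_CIRCUITS) -> list[int]:
    """Additively share one judge's value into n shares, one per circuit server."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n - 1)]
    return shares + [(value - sum(shares)) % MODULUS]

def hierarchical_total(judge_values: list[int]) -> int:
    """Sketch of the hierarchical totals computation.

    1. Each judge splits her value into 12 shares, one per circuit server.
    2. Each circuit server adds up the shares it receives; any partial sum of
       shares held by a single circuit is uniformly random and reveals nothing.
    3. Combining the 12 per-circuit sums reconstructs only the final total.
    """
    per_circuit = [0] * NUM_CIRCUITS
    for value in judge_values:
        for circuit, s in enumerate(share(value)):
            per_circuit[circuit] = (per_circuit[circuit] + s) % MODULUS
    return sum(per_circuit) % MODULUS

# Example: 1000 judges each report a small count of orders issued.
counts = [secrets.randbelow(50) for _ in range(1000)]
assert hierarchical_total(counts) == sum(counts)
```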

This MPC strategy allows the court system to compute aggregate statistics (towards the accountability goal of Section 3.2). Since courts are honest-but-curious, and by the correctness guarantee of MPC, these statistics will be computed accurately on correct data (correctness of Section 3.2). MPC enables the courts to perform these computations without revealing any court's internal information to any other court (confidentiality of Section 3.2).

4.4 Additional Design Choices

The preceding section described our proposed system with its full range of accountability features. This configuration is only one of many possibilities. Although cryptography makes it possible to release information in a controlled way, the fact remains that revealing more information poses greater risks to investigative integrity. Depending on the court system's level of risk-tolerance, features can be modified or removed entirely to adjust the amount of information disclosed.

Cover sheet metadata. A judge might reasonably fear that a careful criminal could monitor cover sheet metadata to detect surveillance. At the cost of some transparency, judges could post less metadata when committing to an order. (At a minimum, the judge must post the date at which the seal expires.) The cover sheets integral to Judge Smith's proposal were also designed to supply certain information towards assessing the scale of surveillance; MPC replicates this outcome without releasing information about individual orders.

Commitments by individual judges. In some cases, posting a commitment might reveal too much. In a low-crime area, mere knowledge that a particular judge approved surveillance could spur a criminal organization to change its behavior. A number of approaches would address this concern. Judges could delegate the responsibility of posting to the ledger to the same judicial circuits that mediate the hierarchical MPC. Alternatively, each judge could continue posting to the ledger herself, but instead of signing the commitment under her own name, she could sign it as coming from some court in her judicial circuit, or nationwide, without revealing which one (group signatures [17] or ring signatures [33] are designed for this sort of anonymous signing within groups). Either of these approaches would conceal which individual judge approved the surveillance.

Aggregate statistics. The aggregate statistic mechanism offers a wide range of choices about the data to be revealed. For example, if the court system is concerned about revealing information about individual districts, statistics could be aggregated by any number of other parameters, including the type of crime being investigated or the company from which the data was requested.

            5 Protocol Definition

We now define a complete protocol capturing the workflow from Section 4. We assume a public-key infrastructure and synchronous communication on authenticated (encrypted) point-to-point channels.

Preliminaries. The protocol is parametrized by:
• a secret-sharing scheme Share,
• a commitment scheme C,
• a special type of zero-knowledge primitive, SNARK,
• a multi-party computation protocol MPC, and
• a function CoverSheet that maps court orders to cover sheet information.

Several parties participate in the protocol:
• n judges J_1, ..., J_n,
• m law enforcement agencies A_1, ..., A_m,
• q companies C_1, ..., C_q,
• r trustees T_1, ..., T_r,11
• P, a party representing the public,
• Ledger, a party representing the public ledger, and
• Env, a party called "the environment," which models the occurrence over time of exogenous events.

Ledger is a simple ideal functionality allowing any party to (1) append entries to a time-stamped, append-only ledger and (2) retrieve ledger entries. Entries are authenticated except where explicitly anonymous.

Env is a modeling device that specifies the protocol behavior in the context of arbitrary exogenous event sequences occurring over time. Upon receipt of message clock, Env responds with the current time. To model the occurrence of an exogenous event e (e.g., a case in need of surveillance), Env sends information about e to the affected parties (e.g., a law enforcement agency).

11 In our specific case study, r = 12 and the trustees are the twelve US Circuit Courts of Appeals. The trustees are the parties which participate in the multi-party computation of aggregate statistics based on input data from all judges, as shown in Figure 4 and defined formally later in this subsection.

Next we give the syntax of our cryptographic tools,12 and then define the behavior of the remaining parties.

A commitment scheme is a triple of probabilistic poly-time algorithms C = (Setup, Commit, Open), as follows:

• Setup(1^κ) takes as input a security parameter κ (in unary) and outputs public parameters pp.

• Commit(pp, m, ω) takes as input pp, a message m, and randomness ω. It outputs a commitment c.

• Open(pp, m′, c, ω′) takes as input pp, a message m′, a commitment c, and randomness ω′. It outputs 1 if c = Commit(pp, m′, ω′) and 0 otherwise.

pp is generated in an initial setup phase and is thereafter publicly known to all parties, so we elide it for brevity.

Algorithm 1: Law enforcement agency A_i

• On receipt of a surveillance request event e = (Surveil, u, s) from Env, where u is the public key of a company and s is the description of a surveillance request directed at u: send message (u, s) to a judge.13

• On receipt of a decision message (u, s, d) from a judge, where d ≠ reject:14 (1) generate a commitment c = Commit((s, d), ω) to the request and store (c, s, d, ω) locally; (2) generate a SNARK proof π attesting compliance of (s, d) with relevant regulations; (3) post (c, π) to the ledger; (4) send request (s, d, ω) to company u.

• On receipt of an audit request (c, P, z) from the public: generate decision b ← A_i^dp(c, P, z). If b = accept, generate a SNARK proof π attesting compliance of (s, d) with the regulations indicated by the audit request (c, P, z); else send (c, P, z, b) to a judge.13

• On receipt of an audit order (d, c, P, z) from a judge: if d = accept, generate a SNARK proof π attesting compliance of (s, d) with the regulations indicated by the audit request (c, P, z).

Agencies. Each agency A_i has an associated decision-making process A_i^dp, modeled by a stateful algorithm that maps audit requests to {accept} ∪ {0,1}*, where the output is either an acceptance or a description of why the agency chooses to deny the request. Each agency operates according to Algorithm 1, which is parametrized by its own A_i^dp. In practice, we assume A_i^dp would be instantiated by the agency's human decision-making process.

12 For formal security definitions beyond syntax, we refer to any standard cryptography textbook such as [26].

13 For the purposes of our exposition, this could be an arbitrary judge. In practice, it would likely depend on the jurisdiction in which the surveillance event occurs and in which the law enforcement agency operates, and perhaps also on the type of case.

14 For simplicity of exposition, Algorithm 1 only addresses the case d ≠ reject and omits the possibility of appeal by the agency. The algorithm could straightforwardly be extended to encompass appeals by incorporating the decision to appeal into A_i^dp.

15 This is the step invoked by requests for unsealed documents.

Judges. Each judge J_i has three associated decision-making processes, J_i^dp1, J_i^dp2, and J_i^dp3. J_i^dp1 maps surveillance requests to either a rejection or an authorizing court order; J_i^dp2 maps denied audit requests to either a confirmation of the denial or a court order overturning the denial; and J_i^dp3 maps data revelation requests to either an acceptance or a denial (perhaps along with an explanation of the denial, e.g., "this document is still under seal"). Each judge operates according to Algorithm 2, which is parametrized by the individual judge's (J_i^dp1, J_i^dp2, J_i^dp3).

Algorithm 2: Judge J_i

• On receipt of a surveillance request (u, s) from an agency A_j: (1) generate decision d ← J_i^dp1(s); (2) send response (u, s, d) to A_j; (3) generate a commitment c = Commit((u, s, d), ω) to the decision and store (c, u, s, d, ω) locally; (4) post (CoverSheet(d), c) to the ledger.

• On receipt of denied audit request information ζ from an agency A_j: generate decision d ← J_i^dp2(ζ) and send (d, ζ) to A_j and to the public P.

• On receipt of a data revelation request (c, z) from the public:15 generate decision b ← J_i^dp3(c, z). If b = accept, send to the public P the message and randomness (m, ω) corresponding to c; else if b = reject, send reject to P, with an accompanying explanation if provided.

Algorithm 3: Company C_i

• Upon receiving a surveillance request (s, d, ω) from an agency A_j: if the court order d bears the valid signature of a judge and Commit((s, d), ω) matches a corresponding commitment posted by law enforcement on the ledger, then (1) generate commitment c ← Commit(δ, ω) and store (c, δ, ω) locally; (2) generate a SNARK proof π attesting that δ is compliant with s and the judge-signed order d; (3) post (c, π) anonymously to the ledger; (4) reply to A_j by furnishing the requested data δ along with ω.

The public. The public P exhibits one main type of behavior in our model: upon receiving an event message e = (a, ξ) from Env (describing either an audit request or a data revelation request), P sends ξ to a (an agency or court). Additionally, the public periodically checks the ledger for validity of posted SNARK proofs and takes steps to flag any non-compliance detected (e.g., through the court system or the news media).

Companies and trustees. Algorithms 3 and 4 describe companies and trustees. Companies execute judge-authorized instructions and log their actions by posting commitments on the ledger. Trustees run MPC to compute aggregate statistics from data provided in secret-shared form by judges; MPC events are triggered by Env.

Algorithm 4: Trustee T_i

• Upon receiving an aggregate statistic event message e = (Compute, f, D_1, ..., D_n) from Env:

  1. For each i′ ∈ [r] (such that i′ ≠ i), send e to T_i′.
  2. For each j ∈ [n], send the message (f, D_j) to J_j. Let δ_ji be the response from J_j.
  3. With parties T_1, ..., T_r, participate in the MPC protocol MPC with input (δ_1i, ..., δ_ni) to compute the functionality ReconInputs composed with f (i.e., first reconstruct the judges' inputs from the shares, then apply f), where ReconInputs is defined as follows:

     ReconInputs((δ_1i′, ..., δ_ni′)_{i′∈[r]}) = (Recon(δ_j1, ..., δ_jr))_{j∈[n]}

     Let y denote the output from the MPC.16
  4. Send y to J_j for each j ∈ [n].17

• Upon receiving an MPC initiation message e = (Compute, f, D_1, ..., D_n) from another trustee T_i′:

  1. Receive a secret share δ_ji from each judge J_j, respectively.
  2. With parties T_1, ..., T_r, participate in the MPC protocol MPC with input (δ_1i, ..., δ_ni) to compute the functionality ReconInputs composed with f.

            6 Evaluation of MPC Implementation

In our proposal, judges use secure multiparty computation (MPC) to compute aggregate statistics about the extent and distribution of surveillance. Although in principle MPC can support secure computation of any function of the judges' data, full generality can come with unacceptable performance limitations. In order that our protocols scale to hundreds of federal judges, we narrow our attention to two kinds of functions that are particularly useful in the context of surveillance.

The extent of surveillance (totals). Computing totals involves summing values held by the parties without revealing information about any value to anyone other than its owner. Totals become averages by dividing by the number of data points. In the context of electronic surveillance, totals are the most prevalent form of computation on government and corporate transparency reports: How many court orders were approved for cases involving homicide, and how many for drug offenses? How long was the average order in effect? How many orders were issued in California? [9] Totals make it possible to determine the extent of surveillance.

The distribution of surveillance (thresholds). Thresholding involves determining the number of data points that exceed a given cut-off: How many courts issued more than ten orders for data from Google? How many orders were in effect for more than 90 days? Unlike totals, thresholds can reveal selected facts about the distribution of surveillance, i.e., the circumstances in which it is most prevalent. Thresholds go beyond the kinds of questions typically answered in transparency reports, offering new opportunities to improve accountability.
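In plaintext terms, the statistics of interest are simple functions of the judges' per-court values; the sketch below gives reference implementations of a total and of the two threshold variants benchmarked in Section 6.2 (how many values exceed a cutoff, and whether all values do). The role of MPC is to evaluate these same functions without any judge revealing her own value.

```python
def total(values: list[int]) -> int:
    """Extent of surveillance: the sum of per-judge counts."""
    return sum(values)

def additive_threshold(values: list[int], cutoff: int) -> int:
    """Distribution of surveillance: how many values exceed the cutoff."""
    return sum(1 for v in values if v > cutoff)

def multiplicative_threshold(values: list[int], cutoff: int) -> bool:
    """Did every value exceed the cutoff? (A product of 0/1 indicators.)"""
    return all(v > cutoff for v in values)
```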

To enable totals and thresholds to scale to the size of the federal court system, we implemented a hierarchical MPC protocol, as described in Figure 4, whose design mirrors the hierarchy of the court system. Our evaluation shows the hierarchical structure reduces MPC complexity from quadratic in the number of judges to linear.

We implemented protocols that make use of totals and thresholds using two existing JavaScript-based MPC libraries, WebMPC [16, 29] and Jiff [5]. WebMPC is the simpler and less versatile library; we test it as a baseline and as a "sanity check" that its performance scales as expected, then move on to the more interesting experiment of evaluating Jiff. We opted for JavaScript libraries to facilitate integration into a web application, which is suitable for federal judges to submit information through a familiar browser interface regardless of the differences in their local system setups. Both of these libraries are designed to facilitate MPC across dozens or hundreds of computers; we simulated this effect by running each party in a separate process on a computer with 16 CPU cores and 64GB of RAM. We tested these protocols on randomly generated data containing values in the hundreds, which reflects the same order of magnitude as data present in existing transparency reports. Our implementations were crafted with two design goals in mind:

1. Protocols should scale to roughly 1000 parties, the approximate size of the federal judiciary [10], performing efficiently enough to facilitate periodic transparency reports.

2. Protocols should not require all parties to be online regularly or at the same time.

In the subsections that follow, we describe and evaluate our implementations in light of these goals.

6.1 Computing Totals in WebMPC

Figure 5: Performance of MPC using the WebMPC library.

WebMPC is a JavaScript-based library that can securely compute sums in a single round. The underlying protocol relies on two parties who are trusted not to collude with one another: an analyst, who distributes masking information to all protocol participants at the beginning of the process and receives the final aggregate statistic, and a facilitator, who aggregates this information together in masked form. The participants use the masking information from the analyst to mask their inputs and send them to the facilitator, who aggregates them and sends the result (i.e., a masked sum) to the analyst. The analyst removes the mask and uncovers the aggregated result. Once the participants have their masks, they receive no further messages from any other party; they can submit this masked data to the facilitator in an uncoordinated fashion and go offline immediately afterwards. Even if some anticipated participants do not send data, the protocol can still run to completion with those who remain.
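The sketch below mimics this masking pattern in plain Python; it is not the WebMPC API (which we use as a black box), and the role comments mark which party would perform each step. The analyst knows the masks but never sees individual masked inputs, while the facilitator sees only masked inputs.

```python
import secrets

MODULUS = 2**61 - 1

def run_masked_sum(inputs: list[int]) -> int:
    """One-round masked summation in the style of the WebMPC protocol.

    Security rests on the analyst and facilitator not colluding.
    """
    # Analyst: distribute one random mask to each participant.
    masks = [secrets.randbelow(MODULUS) for _ in inputs]

    # Participants: mask their inputs and send them, in any order and at any
    # time, to the facilitator; they may then go offline.
    masked = [(x + r) % MODULUS for x, r in zip(inputs, masks)]

    # Facilitator: aggregate the masked inputs and forward the masked sum.
    masked_sum = sum(masked) % MODULUS

    # Analyst: remove the aggregate mask to recover the true sum.
    return (masked_sum - sum(masks)) % MODULUS

assert run_masked_sum([3, 10, 7]) == 20
```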

To make this protocol feasible in our setting, we need to identify a facilitator and analyst who will not collude. In many circumstances, it would be acceptable to rely on reputable institutions already present in the court system, such as the circuit courts of appeals, the Supreme Court, or the Administrative Office of the US Courts.

Although this protocol's simplicity limits its generality, it also makes it possible for the protocol to scale efficiently to a large number of participants. As Figure 5 illustrates, the protocol scales linearly with the number of parties. Even with 400 parties (the largest size we tested), the protocol still completed in just under 75 seconds. Extrapolating from the linear trend, it would take about three minutes to compute a summation across the entire federal judiciary. Since existing transparency reports are typically released just once or twice a year, it is reasonable to invest three minutes of computation (or less than a fifth of a second per judge) for each statistic.

6.2 Thresholds and Hierarchy with Jiff

Figure 6: Flat total (red), additive threshold (blue), and multiplicative threshold (green) protocols in Jiff.

Figure 7: Hierarchical total (red), additive threshold (blue), and multiplicative threshold (green) protocols in Jiff. Note the difference in axis scales from Figure 6.

To make use of MPC operations beyond totals, we turned to Jiff, another MPC library implemented in JavaScript. Jiff is designed to support MPC for arbitrary functionalities, although inbuilt support for some more complex functionalities was still under development at the time of writing. Most importantly for our needs, Jiff supports thresholding and multiplication in addition to sums. We evaluated Jiff on three different MPC protocols: totals (as with WebMPC), additive thresholding (i.e., how many values exceeded a specific threshold), and multiplicative thresholding (i.e., did all values exceed a specific threshold). In contrast to computing totals via summation, operations like thresholding require more complicated computation and multiple rounds of communication. By building on Jiff with our hierarchical MPC implementation, we demonstrate that these operations are viable at the scale required by the federal court system.
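For reference, the three functionalities we benchmark can be written in the clear as follows; the MPC protocols compute these same outputs without revealing any judge's individual value. (Whether the comparison is "at least" or "strictly greater" is an illustrative choice here.)

```javascript
// The three benchmarked functionalities, evaluated in the clear.
const total = (values) => values.reduce((a, b) => a + b, 0);

// Additive threshold: how many values meet or exceed the cut-off.
const additiveThreshold = (values, cutoff) =>
  values.reduce((count, v) => count + (v >= cutoff ? 1 : 0), 0);

// Multiplicative threshold: did *all* values meet the cut-off?
// (a product of 0/1 indicator bits, hence "multiplicative")
const multiplicativeThreshold = (values, cutoff) =>
  values.reduce((all, v) => all * (v >= cutoff ? 1 : 0), 1) === 1;

const ordersPerJudge = [3, 14, 0, 27];
console.log(total(ordersPerJudge));                      // 44
console.log(additiveThreshold(ordersPerJudge, 10));      // 2
console.log(multiplicativeThreshold(ordersPerJudge, 1)); // false
```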

As a baseline, we ran sums, additive thresholding, and multiplicative thresholding benchmarks with all judges as full participants in the MPC protocol, sharing the workload equally, a configuration we term the flat protocol (in contrast to the hierarchical protocol we present next). Figure 6 illustrates that the running time of these protocols grows quadratically with the number of judges participating. These running times quickly became untenable. While summation took several minutes among hundreds of judges, both thresholding benchmarks could barely handle tens of judges in the same time envelopes. These graphs illustrate the substantial performance disparity between summation and thresholding.

In Section 4, we described an alternative "hierarchical" MPC configuration to reduce this quadratic growth to linear. As depicted in Figure 4, each lower-court judge splits a piece of data into twelve secret shares, one for each circuit court of appeals. These shares are sent to the corresponding courts, who conduct a twelve-party MPC that performs a total or thresholding operation based on the input shares. If n lower-court judges participate, the protocol is tantamount to computing n twelve-party summations followed by a single n-input summation or threshold. As n increases, the amount of work scales linearly. So long as at least one circuit court remains honest and uncompromised, the secrecy of the lower-court data endures by the security of the secret-sharing scheme.
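A minimal sketch of this sharing step is below (for sums only, with an assumed public modulus; the final combination is shown as a plain addition, whereas in the system it runs as a twelve-party MPC). Each judge's count is split into twelve additive shares, any eleven of which are uniformly random.

```javascript
// Sketch of hierarchical aggregation for totals: judges additively
// secret-share their counts across the 12 circuits; each circuit locally
// sums the shares it holds, yielding one share of the overall total.
const crypto = require('crypto');

const P = 2n ** 61n - 1n;                                   // public modulus (assumed)
const rand = () => BigInt('0x' + crypto.randomBytes(8).toString('hex')) % P;

// Split x into r additive shares: r-1 random values plus a correction term.
function share(x, r) {
  const shares = Array.from({ length: r - 1 }, rand);
  const partial = shares.reduce((a, b) => (a + b) % P, 0n);
  shares.push(((x - partial) % P + P) % P);
  return shares;
}

const CIRCUITS = 12;
const judgeCounts = [5n, 0n, 11n, 2n];                      // per-judge order counts

// Judge j sends sharesByJudge[j][i] to circuit i.
const sharesByJudge = judgeCounts.map((x) => share(x, CIRCUITS));

// Circuit i locally adds the shares it received: one share of the total.
const circuitShares = Array.from({ length: CIRCUITS }, (_, i) =>
  sharesByJudge.reduce((a, s) => (a + s[i]) % P, 0n));

// Combining all 12 circuit shares reconstructs only the aggregate.
console.log(circuitShares.reduce((a, b) => (a + b) % P, 0n)); // 18n
```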

Figure 7 illustrates the linear scaling of the twelve-party portion of the hierarchical protocols; we measured only the computation time after the circuit courts received all of the additive shares from the lower courts. While the flat summation protocol took nearly eight minutes to run on 300 judges, the hierarchical summation scaled to 1,000 judges in less than 20 seconds, besting even the WebMPC results. Although thresholding characteristically remained much slower than summation, the hierarchical protocol scaled to nearly 250 judges in about the same amount of time that it took the flat protocol to run on 35 judges. Since the running times for the threshold protocols were in the tens of minutes for large benchmarks, the linear trend is noisier than for the total protocol. Most importantly, both of these protocols scaled linearly, meaning that, given sufficient time, thresholding could scale up to the size of the federal court system. This performance is acceptable if a few choice thresholds are computed at the frequency at which existing transparency reports are published (footnote 18).

One additional benefit of the hierarchical protocols is that lower courts do not need to stay online while the protocol is executing, a goal we articulated at the beginning of this section. A lower court simply needs to send in its shares to the requisite circuit courts, one message per circuit court for a grand total of twelve messages, after which it is free to disconnect. In contrast, the flat protocol grinds to a halt if even a single judge goes offline. The availability of the hierarchical protocol relies on a small set of circuit courts, who could invest in more robust infrastructure.

            7 Evaluation of SNARKs

We define the syntax of preprocessing zero-knowledge SNARKs for arithmetic circuit satisfiability [15].

A SNARK is a triple of probabilistic polynomial-time algorithms SNARK = (Setup, Prove, Verify), as follows:

• Setup(1^κ, R) takes as input the security parameter κ and a description of a binary relation R (an arithmetic circuit of size polynomial in κ) and outputs a pair (pkR, vkR) of a proving key and verification key.

• Prove(pkR, (x, w)) takes as input a proving key pkR and an input-witness pair (x, w), and outputs a proof π attesting to x ∈ LR, where LR = {x : ∃w s.t. (x, w) ∈ R}.

• Verify(vkR, (x, π)) takes as input a verification key vkR and an input-proof pair (x, π), and outputs a bit indicating whether π is a valid proof for x ∈ LR.

Footnote 18: Too high a frequency is also inadvisable due to the possibility of revealing overly granular information when combined with the timings of specific investigations' court orders.

Before participants can create and verify SNARKs, they must establish a proving key, which any participant can use to create a SNARK, and a corresponding verification key, which any participant can use to verify a SNARK so created. Both of these keys are publicly known. The keys are distinct for each circuit (representing an NP relation) about which proofs are generated and can be reused to produce as many different proofs with respect to that circuit as desired. Key generation uses randomness that, if known or biased, could allow participants to create proofs of false statements [13]. The key generation process must therefore protect and then destroy this information.

Using MPC to do key generation based on randomness provided by many different parties provides the guarantee that, as long as at least one of the MPC participants behaved correctly (i.e., did not bias his randomness and destroyed it afterward), the resulting keys are good (i.e., do not permit proofs of false statements). This approach has been used in the past, most notably by the cryptocurrency Zcash [8]. Despite the strong guarantees provided by this approach when at least one party is not corrupted, concerns have been expressed about the wisdom of trusting in the assumption of one honest party in the Zcash setting, which involves large monetary values and a system design inherently centered around the principles of full decentralization.

For our system, we propose that key generation be done in a one-time MPC among several of the traditionally reputable institutions in the court system, such as the Supreme Court or the Administrative Office of the US Courts, ideally together with other reputable parties from different branches of government. In our setting, the use of MPC for SNARK key generation does not constitute as pivotal and potentially risky a trust assumption as in Zcash, in that the court system is close-knit and inherently built with the assumption of trustworthiness of certain entities within the system. In contrast, a decentralized cryptocurrency (1) must, due to its distributed nature, rely for key generation on MPC participants that are essentially strangers to most others in the system, and (2) could be said to derive its very purpose from not relying on the trustworthiness of any small set of parties.

We note that since key generation is a one-time task for each circuit, we can tolerate a relatively performance-intensive process. Proving and verification keys can be distributed on the ledger.

7.1 Argument Types

Our implementation supports three types of arguments.

Argument of knowledge for a commitment (Pk). Our simplest type of argument attests the prover's knowledge of the content of a given commitment c, i.e., that she could open the commitment if required. Whenever a party publishes a commitment, she can accompany it with a SNARK attesting that she knows the message and randomness that were used to generate the commitment. Formally, this is an argument that the prover knows m and ω that correspond to a publicly known c such that Open(m, c, ω) = 1.

Argument of commitment equality (Peq). Our second type of argument attests that the content of two publicly known commitments c1, c2 is the same. That is, for two publicly known commitments c1 and c2, the prover knows m1, m2, ω1, and ω2 such that Open(m1, c1, ω1) = 1 ∧ Open(m2, c2, ω2) = 1 ∧ m1 = m2.

More concretely, suppose that an agency wishes to release relational information: that the identifier (e.g., email address) in the request is the same identifier that a judge approved. The judge and law enforcement agency post commitments c1 and c2, respectively, to the identifiers they used. The law enforcement agency then posts an argument attesting that the two commitments are to the same value (footnote 19). Since circuits use fixed-size inputs, an argument implicitly reveals the length of the committed message. To hide this information, the law enforcement agency can pad each input up to a uniform length.

Peq may be too revealing under certain circumstances: for the public to verify the argument, the agency (who posted c2) must explicitly identify c1, potentially revealing which judge authorized the data request and when.

Existential argument of commitment equality (P∃). Our third type of argument allows decreasing the resolution of the information revealed by proving that a commitment's content is the same as that of some other commitment among many. Formally, it shows that for publicly known commitments c, c1, ..., cN, committing respectively to secret values (m, ω), (m1, ω1), ..., (mN, ωN), there exists an i such that Open(m, c, ω) = 1 ∧ Open(mi, ci, ωi) = 1 ∧ m = mi. We treat i as an additional secret input, so that for any value of N only two commitments need to be opened. This scheme trades off between resolution (number of commitments) and efficiency, a question we explore below.

We have chosen these three types of arguments to implement, but LibSNARK supports arbitrary predicates in principle, and there are likely others that would be useful and run efficiently in practice. A useful generalization of Peq and P∃ would be to replace equality with more sophisticated domain-specific predicates: instead of showing that the messages m1, m2 corresponding to a pair of commitments are equal, one could show p(m1, m2) = 1 for other predicates p (e.g., "less-than" or "signed by the same court"). The types of arguments that can be implemented efficiently will expand as SNARK libraries' efficiency improves; our system inherits such efficiency gains.

Footnote 19: To produce a proof for Peq, the prover (e.g., the agency) needs to know both ω2 and ω1, but in some cases c1 (and thus ω1) may have been produced by a different entity (e.g., the judge). Publicizing ω1 is unacceptable, as it compromises the hiding of the commitment content. To solve this problem, the judge can include ω1 alongside m1 in secret documents that both parties possess (e.g., the court order).

7.2 Implementation

We implemented these zero-knowledge arguments with LibSNARK [34], a C++ library for creating general-purpose SNARKs from arithmetic circuits. We implemented commitments using the SHA256 hash function (footnote 20); ω is a 256-bit random string appended to the message before it is hashed. In this section we show that useful statements can be proven within a reasonable performance envelope. We consider six criteria: the size of the proving key, the size of the verification key, the size of the proof statement, the time to generate keys, the time to create proofs, and the time to verify proofs. We evaluated these metrics with messages from 16 to 1232 bytes on Pk, Peq, and P∃ (N = 100, 400, 700, and 1000, large enough to obscure links between commitments) on a computer with 16 CPU cores and 64GB of RAM.
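As a concrete illustration of this commitment scheme, the sketch below (Node.js, evaluating the relations in the clear rather than inside a SNARK circuit; the function names are ours, not LibSNARK's) commits to a message by hashing it together with 256 bits of fresh randomness, checks an opening, and then checks the Peq relation from Section 7.1 on two independently generated commitments to the same identifier.

```javascript
// SHA256-based commitment as described above: c = SHA256(m || omega), with
// omega a 256-bit random string. In the real system this relation is encoded
// as an arithmetic circuit; here it is simply evaluated in the clear.
const crypto = require('crypto');

function commit(message) {
  const omega = crypto.randomBytes(32);                        // 256-bit randomness
  const c = crypto.createHash('sha256')
    .update(Buffer.concat([Buffer.from(message), omega]))
    .digest('hex');
  return { c, omega };
}

function open(message, c, omega) {
  const recomputed = crypto.createHash('sha256')
    .update(Buffer.concat([Buffer.from(message), omega]))
    .digest('hex');
  return recomputed === c;                                     // Open(m, c, omega)
}

// The relation behind Peq, in the clear: both commitments open correctly
// and the committed identifiers match (example identifier is hypothetical).
const judge = commit('target@example.com');                   // c1, posted by the judge
const agency = commit('target@example.com');                  // c2, posted by the agency
console.log(
  open('target@example.com', judge.c, judge.omega) &&
  open('target@example.com', agency.c, agency.omega));        // true
```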

Argument size. The argument is just 287 bytes. Accompanying each argument are its public inputs (in this case, commitments). Each commitment is 256 bits (footnote 21). An auditor needs to store these commitments anyway as part of the ledger, and each commitment can be stored just once and reused for many proofs.

Verification key size. The size of the verification key is proportional to the size of the circuit and its public inputs. The key was 106KB for Pk (one commitment as input and one SHA256 circuit) and 208.3KB for Peq (two commitments and two SHA256 circuits). Although P∃ computes SHA256 just twice, its smallest input, 100 commitments, is 50 times as large as that of Pk and Peq; the keys are correspondingly larger and grow linearly with the input size. For 100, 400, 700, and 1000 commitments, the verification keys were respectively 10MB, 41MB, 71MB, and 102MB. Since only one verification key is necessary for each circuit, these keys are easily small enough to make large-scale verification feasible.

Proving key size. The proving keys are much larger, in the hundreds of megabytes. Their size grows linearly with the size of the circuit, so longer messages (which require more SHA256 computations), more complicated circuits, and (for P∃) more inputs lead to larger keys. Figure 8a reflects this trend. Proving keys are largest for P∃ with 1000 inputs on 1232-byte messages and shrink as the message size and the number of commitments decrease. Pk and Peq, which have simpler circuits, still have bigger proving keys for bigger messages. Although these keys are large, only entities that create each kind of proof need to store the corresponding key. Storing one key for each type of argument we have presented takes only about 1GB at the largest input sizes.

Footnote 20: Certain other hash functions may be more amenable to representation as arithmetic circuits and thus more "SNARK-friendly." We opted for a proof of concept with SHA256, as it is so widely used.

Footnote 21: LibSNARK stores each bit in a 32-bit integer, so an argument involving k commitments takes about 1024k bytes (256 bits x 4 bytes per bit). A bit-vector representation would save a factor of 32.

Key generation time. Key generation time increased linearly with the size of the keys, from a few seconds for Pk and Peq on small messages to a few minutes for P∃ on the largest parameters (Figure 8b). Since key generation is a one-time process to add a new kind of proof in the form of a circuit, we find these numbers acceptable.

Argument generation time. Argument generation time increased linearly with proving key size and ranged from a few seconds on the smallest keys to a couple of minutes for the largest (Figure 8c). Since argument generation is a one-time task for each surveillance action, and the existing administrative processes for each surveillance action often take hours or days, we find this cost acceptable.

Argument verification time. Verifying Pk and Peq on the largest message took only a few milliseconds. Verification times for P∃ were larger and increased linearly with the number of input commitments. For 100, 400, 700, and 1000 commitments, verification took 40ms, 85ms, 243ms, and 338ms, respectively, on the largest input. These times are still fast enough to verify many arguments quickly.

            8 Generalization

Our proposal can be generalized beyond ECPA surveillance to encompass a broader class of secret information processes. Consider situations in which independent institutions need to act in a coordinated but secret fashion and, at the same time, are subject to public scrutiny. They should be able to convince the public that their actions are consistent with relevant rules. As in electronic surveillance, accountability requires the ability to attest to compliance without revealing sensitive information.

Example 1 (FISA court). Accountability is needed in other electronic surveillance arenas. The US Foreign Intelligence Surveillance Act (FISA) regulates surveillance in national security investigations. Because of the sensitive interests at stake, the entire process is overseen by a US court that meets in secret. The tension between secrecy and public accountability is even sharper for the FISA court: much of the data collected under FISA may stay permanently hidden inside US intelligence agencies, while data collected under ECPA may eventually be used in public criminal trials. This opacity may be justified, but it has engendered skepticism. The public has no way of knowing what the court is doing, nor any means of assuring itself that the intelligence agencies under the authority of FISA are even complying with the rules of that court. The FISA court itself has voiced concern that it has no independent means of assessing compliance with its orders because of the extreme secrecy involved. Applying our proposal to the FISA court, both the court and the public could receive proofs of documented compliance with FISA orders, as well as aggregate statistics on the scope of FISA surveillance activity, to the full extent possible without incurring national security risk.

Figure 8: SNARK evaluation. (a) Proving key size; (b) key generation time; (c) argument generation time.

Example 2 (Clinical trials). Accountability mechanisms are also important to assess the behavior of private parties, e.g., in clinical trials for new drugs. There are many parties to clinical trials, and much of the information involved is either private or proprietary. Yet regulators and the public have a need to know that responsible testing protocols are observed. Our system can achieve the right balance of transparency, accountability, and respect for the privacy of those involved in the trials.

Example 3 (Public fund spending). Accountability in the spending of taxpayer money is naturally a subject of public interest. Portions of public funds may be allocated for sensitive purposes (e.g., defense/intelligence), and the amounts and allocation thereof may be publicly unavailable due to their sensitivity. Our system would enable credible public assurances that taxpayer money is being spent in accordance with stated principles, while preserving secrecy of information considered sensitive.

8.1 Generalized Framework

We present abstractions describing the generalized version of our system and briefly outline how the concrete examples fit into this framework. A secret information process includes the following components:

• A set of participants interact with each other. In our ECPA example, these are judges, law enforcement agencies, and companies.

• The participants engage in a protocol (e.g., to execute the procedures for conducting electronic surveillance). The protocol messages exchanged are hidden from the view of outsiders (e.g., the public), and yet it is of public interest that the protocol messages exchanged adhere to certain rules.

• A set of auditors (distinct from the participants) seeks to audit the protocol by verifying that a set of accountability properties are met.

Abstractly, our system allows the controlled disclosure of four types of information.

Existential information reveals the existence of a piece of data, be it in a participant's local storage or the content of a communication between participants. In our case study, existential information is revealed with commitments, which indicate the existence of a document.

Relational information describes the actions participants take in response to the actions of others. In our case study, relational information is represented by the zero-knowledge arguments that attest that actions were taken lawfully (e.g., in compliance with a judge's order).

Content information is the data in storage and communication. In our case study, content information is revealed through aggregate statistics via MPC, and when documents are unsealed and their contents made public.

Timing information is a by-product of the other information. In our case study, timing information could include order issuance dates, turnaround times for data request fulfilment by companies, and seal expiry dates.

Revealing combinations of these four types of information with the specified cryptographic tools provides the flexibility to satisfy a range of application-specific accountability properties, as exemplified next.

Example 1 (FISA court). Participants are the FISA Court judges, the agencies requesting surveillance authorization, and any service providers involved in facilitating said surveillance. The protocol encompasses the legal process required to authorize surveillance, together with the administrative steps that must be taken to enact surveillance. Auditors are the public, the judges themselves, and possibly Congress. Desirable accountability properties are similar to those in our ECPA case study, e.g., attestations that certain rules are being followed in issuing surveillance orders, and release of aggregate statistics on surveillance activities under FISA.

Example 2 (Clinical trials). Participants are the institutions (companies or research centers) conducting clinical trials, comprising scientists, ethics boards, and data analysts; the organizations that manage regulations regarding clinical trials, such as the National Institutes of Health (NIH) and the Food and Drug Administration (FDA) in the US; and hospitals and other sources through which trial participants are drawn. The protocol encompasses the administrative process required to approve a clinical trial and the procedure of gathering participants and conducting the trial itself. Auditors are the public, the regulatory organizations such as the NIH and the FDA, and possibly professional ethics committees. Desirable accountability properties include, e.g., attestations that appropriate procedures are respected in recruiting participants and administering trials, and release of aggregate statistics on clinical trial results without compromising individual participants' medical data.

Example 3 (Public fund spending). Participants are Congress (which appropriates the funding), defense/intelligence agencies, and service providers contracted in the spending of said funding. The protocol encompasses the processes by which Congress allocates funds to agencies and agencies allocate funds to particular expenses. Auditors are the public and Congress. Desirable accountability properties include, e.g., attestations that procurements were within reasonable margins of market prices and satisfied documented needs, and release of aggregate statistics on the proportion of allocated money used and broad spending categories.

            9 Conclusion

We present a cryptographic answer to the accountability challenge currently frustrating the US court system. Leveraging cryptographic commitments, zero-knowledge proofs, and secure MPC, we provide the electronic surveillance process a series of scalable, flexible, and practical measures for improving accountability while maintaining secrecy. While we focus on the case study of electronic surveillance, these strategies are equally applicable to a range of other secret information processes requiring accountability to an outside auditor.

            Acknowledgements

We are grateful to Judge Stephen Smith for discussion and insights from the perspective of the US court system; to Andrei Lapets, Kinan Dak Albab, Rawane Issa, and Frederick Joossens for discussion on Jiff and WebMPC; and to Madars Virza for advice on SNARKs and LibSNARK.

This research was supported by the following grants: NSF MACS (CNS-1413920), DARPA IBM (W911NF-15-C-0236), SIMONS Investigator Award Agreement dated June 5th, 2012, and the Center for Science of Information (CSoI), an NSF Science and Technology Center, under grant agreement CCF-0939370.

References

[1] Bellman. https://github.com/ebfull/bellman.

[2] Electronic Communications Privacy Act, 18 U.S.C. 2701 et seq.

[3] Foreign Intelligence Surveillance Act, 50 U.S.C. ch. 36.

[4] Google transparency report. https://www.google.com/transparencyreport/userdatarequests/countries/?p=2016-12.

[5] Jiff. https://github.com/multiparty/jiff.

[6] jsnark. https://github.com/akosba/jsnark.

[7] Law enforcement requests report. https://www.microsoft.com/en-us/about/corporate-responsibility/lerr.

[8] Zcash. https://z.cash.

[9] Wiretap report 2015. http://www.uscourts.gov/statistics-reports/wiretap-report-2015, December 2015.

[10] ADMINISTRATIVE OFFICE OF THE COURTS. Authorized judgeships. http://www.uscourts.gov/sites/default/files/allauth.pdf.

            [11] ARAKI T BARAK A FURUKAWA J LICHTER T LIN-DELL Y NOF A OHARA K WATZMAN A AND WEIN-STEIN O Optimized honest-majority MPC for malicious ad-versaries - breaking the 1 billion-gate per second barrier In2017 IEEE Symposium on Security and Privacy SP 2017 SanJose CA USA May 22-26 2017 (2017) IEEE Computer Soci-ety pp 843ndash862

[12] BATES, A., BUTLER, K. R., SHERR, M., SHIELDS, C., TRAYNOR, P., AND WALLACH, D. Accountable wiretapping -- or -- I know they can hear you now. In NDSS (2012).

            [13] BEN-SASSON E CHIESA A GREEN M TROMER EAND VIRZA M Secure sampling of public parameters for suc-cinct zero knowledge proofs In Security and Privacy (SP) 2015IEEE Symposium on (2015) IEEE pp 287ndash304

            [14] BEN-SASSON E CHIESA A TROMER E AND VIRZA MScalable zero knowledge via cycles of elliptic curves IACR Cryp-tology ePrint Archive 2014 (2014) 595

            [15] BEN-SASSON E CHIESA A TROMER E AND VIRZA MSuccinct non-interactive zero knowledge for a von neumann ar-chitecture In 23rd USENIX Security Symposium (USENIX Se-curity 14) (San Diego CA Aug 2014) USENIX Associationpp 781ndash796

            [16] BESTAVROS A LAPETS A AND VARIA M User-centricdistributed solutions for privacy-preserving analytics Communi-cations of the ACM 60 2 (February 2017) 37ndash39

            [17] CHAUM D AND VAN HEYST E Group signatures In Ad-vances in Cryptology - EUROCRYPT rsquo91 Workshop on the The-ory and Application of of Cryptographic Techniques BrightonUK April 8-11 1991 Proceedings (1991) D W DaviesEd vol 547 of Lecture Notes in Computer Science Springerpp 257ndash265

            [18] DAMGARD I AND ISHAI Y Constant-round multiparty com-putation using a black-box pseudorandom generator In Advancesin Cryptology - CRYPTO 2005 25th Annual International Cryp-tology Conference Santa Barbara California USA August 14-18 2005 Proceedings (2005) V Shoup Ed vol 3621 of Lec-ture Notes in Computer Science Springer pp 378ndash394

            [19] DAMGARD I KELLER M LARRAIA E PASTRO VSCHOLL P AND SMART N P Practical covertly secureMPC for dishonest majority - or Breaking the SPDZ limitsIn Computer Security - ESORICS 2013 - 18th European Sym-posium on Research in Computer Security Egham UK Septem-ber 9-13 2013 Proceedings (2013) J Crampton S Jajodia andK Mayes Eds vol 8134 of Lecture Notes in Computer ScienceSpringer pp 1ndash18

            [20] FEIGENBAUM J JAGGARD A D AND WRIGHT R N Openvs closed systems for accountability In Proceedings of the 2014Symposium and Bootcamp on the Science of Security (2014)ACM p 4

[21] FEIGENBAUM, J., JAGGARD, A. D., WRIGHT, R. N., AND XIAO, H. Systematizing "accountability" in computer science. Tech. Rep. YALEU/DCS/TR-1452, Yale University, Feb. 2012.

[22] FEUER, A., AND ROSENBERG, E. Brooklyn prosecutor accused of using illegal wiretap to spy on love interest, November 2016. https://www.nytimes.com/2016/11/28/nyregion/brooklyn-prosecutor-accused-of-using-illegal-wiretap-to-spy-on-love-interest.html.

            [23] GOLDWASSER S AND PARK S Public accountability vs se-cret laws Can they coexist A cryptographic proposal In Pro-ceedings of the 2017 on Workshop on Privacy in the ElectronicSociety Dallas TX USA October 30 - November 3 2017 (2017)B M Thuraisingham and A J Lee Eds ACM pp 99ndash110

            [24] JAKOBSEN T P NIELSEN J B AND ORLANDI C A frame-work for outsourcing of secure computation In Proceedings ofthe 6th edition of the ACM Workshop on Cloud Computing Se-curity CCSW rsquo14 Scottsdale Arizona USA November 7 2014(2014) G Ahn A Oprea and R Safavi-Naini Eds ACMpp 81ndash92

            [25] KAMARA S Restructuring the nsa metadata program In Inter-national Conference on Financial Cryptography and Data Secu-rity (2014) Springer pp 235ndash247

            [26] KATZ J AND LINDELL Y Introduction to modern cryptogra-phy CRC press 2014

[27] KROLL, J., FELTEN, E., AND BONEH, D. Secure protocols for accountable warrant execution, 2014. http://www.cs.princeton.edu/~felten/warrant-paper.pdf.

            [28] LAMPSON B Privacy and security usable security how to getit Communications of the ACM 52 11 (2009) 25ndash27

            [29] LAPETS A VOLGUSHEV N BESTAVROS A JANSEN FAND VARIA M Secure MPC for Analytics as a Web Ap-plication In 2016 IEEE Cybersecurity Development (SecDev)(Boston MA USA November 2016) pp 73ndash74

            [30] MASHIMA D AND AHAMAD M Enabling robust informationaccountability in e-healthcare systems In HealthSec (2012)

[31] PAPANIKOLAOU, N., AND PEARSON, S. A cross-disciplinary review of the concept of accountability: A survey of the literature, 2013. Available online at http://www.bic-trust.eu/files/2013/06/Paper_NP.pdf.

            [32] PEARSON S Toward accountability in the cloud IEEE InternetComputing 15 4 (2011) 64ndash69

            [33] RIVEST R L SHAMIR A AND TAUMAN Y How to leak asecret In Advances in Cryptology - ASIACRYPT 2001 7th Inter-national Conference on the Theory and Application of Cryptol-ogy and Information Security Gold Coast Australia December9-13 2001 Proceedings (2001) C Boyd Ed vol 2248 of Lec-ture Notes in Computer Science Springer pp 552ndash565

[34] SCIPR LAB. libsnark: a C++ library for zkSNARK proofs. https://github.com/scipr-lab/libsnark.

            [35] SEGAL A FEIGENBAUM J AND FORD B Privacy-preserving lawful contact chaining [preliminary report] In Pro-ceedings of the 2016 ACM on Workshop on Privacy in the Elec-tronic Society (New York NY USA 2016) WPES rsquo16 ACMpp 185ndash188

            [36] SEGAL A FORD B AND FEIGENBAUM J Catching ban-dits and only bandits Privacy-preserving intersection warrantsfor lawful surveillance In FOCI (2014)

            [37] SMITH S W Kudzu in the courthouse Judgments made in theshade The Federal Courts Law Review 3 2 (2009)

[38] SMITH, S. W. Gagged, sealed & delivered: Reforming ECPA's secret docket. Harvard Law & Policy Review 6 (2012), 313-459.

            [39] SUNDARESWARAN S SQUICCIARINI A AND LIN D En-suring distributed accountability for data sharing in the cloudIEEE Transactions on Dependable and Secure Computing 9 4(2012) 556ndash568

            [40] TAN Y S KO R K AND HOLMES G Security and data ac-countability in distributed systems A provenance survey In HighPerformance Computing and Communications amp 2013 IEEE In-ternational Conference on Embedded and Ubiquitous Comput-ing (HPCC EUC) 2013 IEEE 10th International Conference on(2013) IEEE pp 1571ndash1578

            [41] VIRZA M November 2017 Private communication

            [42] WANG X RANELLUCCI S AND KATZ J Global-scale se-cure multiparty computation In Proceedings of the 2017 ACMSIGSAC Conference on Computer and Communications SecurityCCS 2017 Dallas TX USA October 30 - November 03 2017(2017) B M Thuraisingham D Evans T Malkin and D XuEds ACM pp 39ndash56

            [43] WEITZNER D J ABELSON H BERNERS-LEE T FEIGEN-BAUM J HENDLER J A AND SUSSMAN G J Informationaccountability Commun ACM 51 6 (2008) 82ndash87

            [44] XIAO Z KATHIRESSHAN N AND XIAO Y A survey ofaccountability in computer networks and distributed systems Se-curity and Communication Networks 9 4 (2016) 290ndash315


Figure 2: System configuration. Participants (rectangles) read and write to a public ledger (cloud) and local storage (ovals). The public (diamond) reads from the ledger.

inferred given the adversarial inputs {xi : pi ∈ S} and the output y. Secrecy is formalized by stipulating that a simulator that is given only ({xi : pi ∈ S}, y) as input must be able to produce a "simulated" protocol transcript that is indistinguishable from the actual protocol execution run with all the real inputs (x1, ..., xn).

In our system, MPC enables computation of aggregate statistics about the extent of surveillance across the entire court system through a computation among individual judges. MPC eliminates the need to pool the sensitive data of individual judges in the clear, or to defer to companies to compute and release this information piecemeal. In the terms of our security goals, MPC reveals information to the public (accountability) from a source we trust to follow the protocol honestly (the courts), without revealing the internal workings of courts to one another (confidentiality). It also eliminates the need to rely on potentially malicious companies to reveal this information themselves (correctness).

Secret sharing. Secret sharing facilitates our hierarchical MPC protocol. A secret sharing of some input data D consists of a set of strings (D1, ..., DN), called shares, satisfying two properties: (1) any subset of N-1 shares reveals no information about D, and (2) given all N shares, D can easily be reconstructed (footnote 8).

Summary. In summary, these cryptographic tools support three high-level properties that we utilize to achieve our security goals:

1. Trusted records of events. The append-only ledger and cryptographic commitments create a trustworthy record of surveillance events without revealing sensitive information to the public.

2. Demonstration of compliance. Zero-knowledge arguments allow parties to provably assure the public that relevant rules have been followed, without revealing any secret information.

3. Transparency without handling secrets. MPC enables the court system to accurately compute and release aggregate statistics about surveillance events without ever sharing the sensitive information of individual parties.

Footnote 8: For simplicity, we have described so-called "N-out-of-N" secret sharing. More generally, secret sharing can guarantee that any subset of k <= N shares enables reconstruction, while any subset of at most k-1 shares reveals nothing about D.

4.2 System Configuration

Our system is centered around a publicly visible append-only ledger where the various entities involved in the electronic surveillance process can post information. As depicted in Figure 2, every judge, law enforcement agency, and company contributes data to this ledger. Judges post cryptographic commitments to all orders issued. Law enforcement agencies post commitments to their activities (warrant requests to judges and data requests to companies) and zero-knowledge arguments about the requests they issue. Companies do the same for the data they deliver to agencies. Members of the public can view and verify all data posted to the ledger.

Each judge, law enforcement agency, and company will need to maintain a small amount of infrastructure: a computer terminal through which to compose posts, and local storage (the ovals in Figure 2) to store sensitive information (e.g., the content of sealed court orders). To attest to the authenticity of posts to the ledger, each participant will need to maintain a private signing key and publicize a corresponding verification key. We assume that public-key infrastructure could be established by a reputable party like the Administrative Office of the US Courts.
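A minimal sketch of how a participant could sign ledger posts under this assumed PKI is below, using Ed25519 keys via Node's built-in crypto module; key distribution, the ledger itself, and the post format are assumptions for illustration only.

```javascript
// Minimal sketch of authenticated ledger posts under the assumed PKI: each
// participant signs its posts with a private key whose corresponding
// verification key is published (e.g., by the Administrative Office).
const crypto = require('crypto');

// One-time: a judge generates a signing/verification key pair.
const { publicKey, privateKey } = crypto.generateKeyPairSync('ed25519');

// Signing a post (here, a commitment plus seal-expiry metadata; the
// commitment value is a placeholder).
const post = JSON.stringify({
  commitment: 'placeholder-commitment-hex',
  sealExpires: '2019-03-01',
});
const signature = crypto.sign(null, Buffer.from(post), privateKey);

// Anyone holding the published verification key can check authenticity.
console.log(crypto.verify(null, Buffer.from(post), publicKey, signature)); // true
```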

The ledger itself could be maintained as a distributed system among the participants in the process, as a distributed system among a more exclusive group of participants with higher trustworthiness (e.g., the circuit courts of appeals), or by a single entity (e.g., the Administrative Office of the US Courts or the Supreme Court).

4.3 Workflow

Posting to the ledger. The workflow of our system augments the electronic surveillance workflow in Figure 1 with additional information posted to the ledger, as depicted in Figure 3. When a judge issues an order (step 2 of Figure 1), she also posts a commitment to the order and additional metadata about the case. At a minimum, this metadata must include the date that the order's seal expires; depending on the system configuration, she could post other metadata (e.g., Judge Smith's cover sheet). The commitment allows the public to later verify that the order was properly unsealed, but reveals no information about the commitment's content in the meantime, achieving a degree of accountability in the short term (while confidentiality is necessary) and full accountability in the long term (when the seal expires and confidentiality is unnecessary). Since judges are honest-but-curious, they will adhere to this protocol and reliably post commitments whenever new court orders are issued.

Figure 3: Data posted to the public ledger as the protocol runs. Time moves from left to right. Each rectangle is a post to the ledger. Dashed arrows between rectangles indicate that the source of the arrow could contain a visible reference to the destination. The ovals contain the entities that make each post.

The agency then uses this order to request data from a company (step 3 in Figure 1) and posts a commitment to this request alongside a zero-knowledge argument that the request is compatible with a court order (and possibly also with other legal requirements). This commitment, which may never be opened, provides a small amount of accountability within the confines of confidentiality, revealing that some law enforcement action took place. The zero-knowledge argument takes accountability a step further: it demonstrates to the public that the law enforcement action was compatible with the original court order (which we trust to have been committed properly), forcing the potentially malicious law enforcement agency to adhere to the protocol or make public its non-compliance. (Failure to adhere would result in a publicly visible invalid zero-knowledge argument.) If the company responds with matching data (step 6 in Figure 1), it posts a commitment to its response and an argument that it furnished (only) the data implicated by the order and data request. These commitments and arguments serve a role analogous to those posted by law enforcement.
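To make the shape of these posts concrete, the sketch below shows one plausible encoding of the ledger entries produced in this exchange; the field names are ours, not prescribed by the system, and the proof fields stand in for SNARKs produced as in Section 7.

```javascript
// One plausible shape for the ledger entries in this workflow (field names
// are illustrative). Proof fields would hold SNARKs; here they are placeholders.
const judgeEntry = {
  poster: 'judge',                                // signed by the issuing judge
  commitment: '<Commit(order)>',
  metadata: { sealExpires: '2019-03-01' },
};

const agencyEntry = {
  poster: 'law-enforcement',
  commitment: '<Commit(data request)>',
  proof: '<SNARK: request is compatible with the committed order>',
  authorizingOrder: judgeEntry.commitment,        // the link revealed by the argument
};

const companyEntry = {
  poster: 'company',                              // posted anonymously (Algorithm 3)
  commitment: '<Commit(delivered data)>',
  proof: '<SNARK: delivered data matches the order and request>',
};

console.log([judgeEntry, agencyEntry, companyEntry]);
```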

This system does not require commitments to all actions in Figure 1. For example, it only requires a law enforcement agency to commit to a successful request for data (step 3) rather than to any proposed request (step 1). The system could easily be augmented with additional commitments and proofs as desired by the court system.

The zero-knowledge arguments about relationships between commitments reveal one additional piece of information. For a law enforcement agency to prove that its committed data request is compatible with a particular court order, it must reveal which specific committed court order authorized the request. In other words, the zero-knowledge arguments reveal the links between specific actions of each party (dashed arrows in Figure 3). These links could be eliminated, reducing visibility into the workflow of surveillance: instead, entities would argue that their actions are compatible with some court order among a group of recent orders.

Remark 4.1. We now briefly discuss other possible malicious behaviors by law enforcement agencies and companies involving inaccurate logging of data. Though, as mentioned in Remark 3.1, correspondence between real-world events and logged items is not enforceable by technological means alone, we informally argue that our design incentivizes honest logging, and reporting of dishonest logging, under many circumstances.

A malicious law enforcement agency could omit commitments, or commit to one surveillance request but send the company a different request. This action is visible to the company, which could reveal this misbehavior to the judge. This visibility incentivizes companies to record their actions diligently so as to avoid any appearance of negligence, let alone complicity, in the agency's misbehavior.

Similarly, a malicious company might fail to post a commitment or post a commitment inconsistent with its actual behavior. These actions are visible to law enforcement agencies, who could report violations to the judge (and otherwise risk the appearance of negligence or complicity). To make such violations visible to the public, we could add a second law enforcement commitment that acknowledges the data received and proves that it is compatible with the original court order and law enforcement request.

However, even incentive-based arguments do not address the case of a malicious law enforcement agency colluding with a malicious company. These entities could simply withhold from posting any information to the ledger (or post a sequence of false but consistent information), thereby making it impossible to detect violations. To handle this scenario, we have to defer to the legal process itself: when this data is used as evidence in court, a judge should ensure that appropriate documentation was posted to the ledger and that the data was gathered appropriately.

Aggregate statistics. At configurable intervals, the individual courts use MPC to compute aggregate statistics about their surveillance activities (footnote 9). An analyst such as the Administrative Office of the US Courts receives the result of this MPC and posts it to the ledger. The particular kinds of aggregate statistics computed are at the discretion of the court system. They could include figures already tabulated in the Administrative Office of the US Courts' Wiretap Reports [9] (i.e., orders by state and by criminal offense) and in company-issued transparency reports [4, 7] (i.e., requests and number of users implicated, by company). Due to the generality (footnote 10) of MPC, it is theoretically possible to compute any function of the information known to each of the judges. For performance reasons, we restrict our focus to totals and aggregated thresholds, a set of operations expressive enough to replicate existing transparency reports.

The statistics themselves are calculated using MPC. In principle, the hundreds of magistrate and district court judges could attempt to directly perform MPC with each other. However, as we find in Section 6, computing even simple functions among hundreds of parties is prohibitively slow. Moreover, the logistics of getting every judge online simultaneously with enough reliability to complete a multi-round protocol would be difficult; if a single judge went offline, the protocol would stall.

Instead, we compute aggregate statistics in a hierarchical manner, as depicted in Figure 4. We exploit the existing hierarchy of the federal court system: each of the lower-court judges is under the jurisdiction of one of twelve circuit courts of appeals. Each lower-court judge computes her individual component of the larger aggregate statistic (e.g., the number of orders issued against Google in the past six months) and divides it into twelve secret shares, sending one share to (a server controlled by) each circuit court of appeals. Distinct shares are represented by separate colors in Figure 4. So long as at least one circuit server remains uncompromised, the lower-court judges can be assured, by the security of the secret-sharing scheme, that their contributions to the larger statistic are confidential. The circuit servers engage in a twelve-party MPC that reconstructs the judges' input data from the shares, computes the desired function, and reveals the result to the analyst. By concentrating the computationally intensive and logistically demanding part of the MPC process in twelve stable servers, this design eliminates many of the performance and reliability challenges of the flat (non-hierarchical) protocol (Section 6 discusses performance).

Figure 4: The flow of data as aggregate statistics are computed. Each lower-court judge calculates her component of the statistic and secret-shares it into 12 shares, one for each judicial circuit (illustrated by colors). The servers of the circuit courts then engage in an MPC to compute the aggregate statistic from the input shares.

Footnote 9: Microsoft [7] and Google [4] currently release their transparency reports every six months, and the Administrative Office of the US Courts does so annually [9]. We take these intervals to be our baseline for the frequency with which aggregate statistics would be released in our system, although releasing statistics more frequently (e.g., monthly) would improve transparency.

Footnote 10: General MPC is a common term used to describe MPC that can compute arbitrary functions of the participants' data, as opposed to just restricted classes of functions.

This MPC strategy allows the court system to compute aggregate statistics (towards the accountability goal of Section 3.2). Since courts are honest-but-curious, and by the correctness guarantee of MPC, these statistics will be computed accurately on correct data (correctness, Section 3.2). MPC enables the courts to perform these computations without revealing any court's internal information to any other court (confidentiality, Section 3.2).

4.4 Additional Design Choices

The preceding section described our proposed system with its full range of accountability features. This configuration is only one of many possibilities. Although cryptography makes it possible to release information in a controlled way, the fact remains that revealing more information poses greater risks to investigative integrity. Depending on the court system's level of risk tolerance, features can be modified or removed entirely to adjust the amount of information disclosed.

Cover sheet metadata. A judge might reasonably fear that a careful criminal could monitor cover-sheet metadata to detect surveillance. At the cost of some transparency, judges could post less metadata when committing to an order. (At a minimum, the judge must post the date at which the seal expires.) The cover sheets integral to Judge Smith's proposal were also designed to supply certain information towards assessing the scale of surveillance; MPC replicates this outcome without releasing information about individual orders.

Commitments by individual judges. In some cases, posting a commitment might reveal too much. In a low-crime area, mere knowledge that a particular judge approved surveillance could spur a criminal organization to change its behavior. A number of approaches would address this concern. Judges could delegate the responsibility of posting to the ledger to the same judicial circuits that mediate the hierarchical MPC. Alternatively, each judge could continue posting to the ledger herself, but instead of signing the commitment under her own name, she could sign it as coming from some court in her judicial circuit, or nationwide, without revealing which one (group signatures [17] or ring signatures [33] are designed for this sort of anonymous signing within groups). Either of these approaches would conceal which individual judge approved the surveillance.

Aggregate statistics. The aggregate-statistic mechanism offers a wide range of choices about the data to be revealed. For example, if the court system is concerned about revealing information about individual districts, statistics could be aggregated by any number of other parameters, including the type of crime being investigated or the company from which the data was requested.

              5 Protocol Definition

We now define a complete protocol capturing the workflow from Section 4. We assume a public-key infrastructure and synchronous communication on authenticated (encrypted) point-to-point channels.

Preliminaries. The protocol is parametrized by:
• a secret-sharing scheme Share,
• a commitment scheme C,
• a special type of zero-knowledge primitive SNARK,
• a multi-party computation protocol MPC, and
• a function CoverSheet that maps court orders to cover-sheet information.

Several parties participate in the protocol:
• n judges J1, ..., Jn;
• m law enforcement agencies A1, ..., Am;
• q companies C1, ..., Cq;
• r trustees T1, ..., Tr (footnote 11);
• P, a party representing the public;
• Ledger, a party representing the public ledger; and
• Env, a party called "the environment," which models the occurrence over time of exogenous events.

Ledger is a simple ideal functionality allowing any party to (1) append entries to a time-stamped, append-only ledger and (2) retrieve ledger entries. Entries are authenticated except where explicitly anonymous.

Env is a modeling device that specifies the protocol behavior in the context of arbitrary exogenous event sequences occurring over time. Upon receipt of the message clock, Env responds with the current time. To model the occurrence of an exogenous event e (e.g., a case in need of surveillance), Env sends information about e to the affected parties (e.g., a law enforcement agency).

Footnote 11: In our specific case study, r = 12 and the trustees are the twelve US Circuit Courts of Appeals. The trustees are the parties that participate in the multi-party computation of aggregate statistics based on input data from all judges, as shown in Figure 4 and defined formally later in this subsection.

Next, we give the syntax of our cryptographic tools (footnote 12) and then define the behavior of the remaining parties.

A commitment scheme is a triple of probabilistic polynomial-time algorithms C = (Setup, Commit, Open), as follows:
• Setup(1^κ) takes as input a security parameter κ (in unary) and outputs public parameters pp.
• Commit(pp, m, ω) takes as input pp, a message m, and randomness ω. It outputs a commitment c.
• Open(pp, m′, c, ω′) takes as input pp, a message m′, and randomness ω′. It outputs 1 if c = Commit(pp, m′, ω′) and 0 otherwise.

pp is generated in an initial setup phase and is thereafter publicly known to all parties, so we elide it for brevity.

Algorithm 1: Law enforcement agency Ai

• On receipt of a surveillance request event e = (Surveil, u, s) from Env, where u is the public key of a company and s is the description of a surveillance request directed at u: send message (u, s) to a judge (footnote 13).

• On receipt of a decision message (u, s, d) from a judge, where d ≠ reject (footnote 14): (1) generate a commitment c = Commit((s, d), ω) to the request and store (c, s, d, ω) locally; (2) generate a SNARK proof π attesting compliance of (s, d) with relevant regulations; (3) post (c, π) to the ledger; (4) send request (s, d, ω) to company u.

• On receipt of an audit request (c, P, z) from the public: generate decision b ← A_i^dp(c, P, z). If b = accept, generate a SNARK proof π attesting compliance of (s, d) with the regulations indicated by the audit request (c, P, z); else send (c, P, z, b) to a judge (footnote 13).

• On receipt of an audit order (d, c, P, z) from a judge: if d = accept, generate a SNARK proof π attesting compliance of (s, d) with the regulations indicated by the audit request (c, P, z).

Agencies. Each agency Ai has an associated decision-making process A_i^dp, modeled by a stateful algorithm that maps audit requests to {accept} ∪ {0,1}*, where the output is either an acceptance or a description of why the agency chooses to deny the request.

Footnote 12: For formal security definitions beyond syntax, we refer to any standard cryptography textbook, such as [26].

Footnote 13: For the purposes of our exposition, this could be an arbitrary judge. In practice, it would likely depend on the jurisdiction in which the surveillance event occurs and in which the law enforcement agency operates, and perhaps also on the type of case.

Footnote 14: For simplicity of exposition, Algorithm 1 only addresses the case d ≠ reject and omits the possibility of appeal by the agency. The algorithm could straightforwardly be extended to encompass appeals by incorporating the decision to appeal into A_i^dp.

Footnote 15: This is the step invoked by requests for unsealed documents.

Algorithm 2: Judge Ji

• On receipt of a surveillance request (u, s) from an agency A_j: (1) generate decision d ← J_i^dp1(s); (2) send response (u, s, d) to A_j; (3) generate a commitment c = Commit((u, s, d), ω) to the decision and store (c, u, s, d, ω) locally; (4) post (CoverSheet(d), c) to the ledger.

• On receipt of denied audit request information ζ from an agency A_j: generate decision d ← J_i^dp2(ζ) and send (d, ζ) to A_j and to the public P.

• On receipt of a data revelation request (c, z) from the public (footnote 15): generate decision b ← J_i^dp3(c, z). If b = accept, send to the public P the message and randomness (m, ω) corresponding to c; else if b = reject, send reject to P with an accompanying explanation, if provided.

Each agency operates according to Algorithm 1, which is parametrized by its own A_i^dp. In practice, we assume A_i^dp would be instantiated by the agency's human decision-making process.


              tiated by the agencyrsquos human decision-making process

Judges. Each judge Ji has three associated decision-making processes, J_i^dp1, J_i^dp2, and J_i^dp3. J_i^dp1 maps surveillance requests to either a rejection or an authorizing court order; J_i^dp2 maps denied audit requests to either a confirmation of the denial or a court order overturning the denial; and J_i^dp3 maps data revelation requests to either an acceptance or a denial (perhaps along with an explanation of the denial, e.g., "this document is still under seal"). Each judge operates according to Algorithm 2, which is parametrized by the individual judge's (J_i^dp1, J_i^dp2, J_i^dp3).

Algorithm 3: Company Ci

• Upon receiving a surveillance request (s, d, ω) from an agency A_j: if the court order d bears the valid signature of a judge and Commit((s, d), ω) matches a corresponding commitment posted by law enforcement on the ledger, then (1) generate commitment c ← Commit(δ, ω) and store (c, δ, ω) locally; (2) generate a SNARK proof π attesting that δ is compliant with s and the judge-signed order d; (3) post (c, π) anonymously to the ledger; (4) reply to A_j by furnishing the requested data δ along with ω.

The public. The public P exhibits one main type of behavior in our model: upon receiving an event message e = (a, ξ) from Env (describing either an audit request or a data revelation request), P sends ξ to a (an agency or court). Additionally, the public periodically checks the ledger for validity of posted SNARK proofs and takes steps to flag any non-compliance detected (e.g., through the court system or the news media).
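As one concrete piece of this public auditing, the sketch below checks that every ledger commitment whose seal has expired has been matched by an unsealed document that opens it (using the SHA256 commitments of Section 7.2). The entry fields are the illustrative ones used earlier, not a prescribed schema; SNARK verification is omitted here.

```javascript
// Illustrative public audit pass: every commitment whose seal has expired
// should by now have a revealed (message, randomness) pair that opens it.
const crypto = require('crypto');

const opens = (message, omegaHex, commitmentHex) =>
  crypto.createHash('sha256')
    .update(Buffer.concat([Buffer.from(message), Buffer.from(omegaHex, 'hex')]))
    .digest('hex') === commitmentHex;

function audit(ledger, unsealed, today) {
  return ledger
    .filter((entry) => entry.sealExpires <= today)        // seal has expired
    .filter((entry) => {
      const doc = unsealed.find((d) => d.commitment === entry.commitment);
      return !doc || !opens(doc.message, doc.omega, entry.commitment);
    });                                                    // entries to flag
}

// audit(ledger, unsealed, '2019-06-01') returns the entries the public
// should flag (e.g., through the court system or the news media).
```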

Companies and trustees. Algorithms 3 and 4 describe companies and trustees. Companies execute judge-

Algorithm 4: Trustee Ti

• Upon receiving an aggregate statistic event message e = (Compute, f, D1, ..., Dn) from Env:

1. For each i′ ∈ [r] (such that i′ ≠ i), send e to T_i′.

2. For each j ∈ [n], send the message (f, D_j) to J_j. Let

δ_{j,i} be the response from J_j.

3. With parties T1, ..., Tr, participate in the MPC pro-

tocol MPC with input (δ_{1,i}, ..., δ_{n,i}) to compute the functionality f ∘ ReconInputs, where ReconInputs is defined as follows:

ReconInputs((δ_{1,i′}, ..., δ_{n,i′})_{i′∈[r]}) = (Recon(δ_{j,1}, ..., δ_{j,r}))_{j∈[n]}




Let y denote the output from the MPC (footnote 16).

              4 Send y to J j for each j isin [n]17

              bull Upon receiving an MPC initiation message e =(Compute f D1 Dn) from another trustee Tiprime

              1 Receive a secret-share δ ji from each judge J j respec-tively

              2 With parties T1 Tr participate in the MPC pro-tocol MPC with input (δ1i δni) to compute thefunctionality ReconInputs f

Companies execute judge-authorized instructions and log their actions by posting commitments on the ledger. Trustees run MPC to compute aggregate statistics from data provided in secret-shared form by judges. MPC events are triggered by Env.
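To make the share-and-reconstruct notation of Algorithm 4 concrete, the following minimal sketch (illustrative Python; the deployed implementation uses JavaScript MPC libraries, and no party ever runs the reconstruction in the clear, since it is evaluated jointly inside the MPC) shows additive secret sharing over a prime field and the functionality f ∘ ReconInputs that the trustees compute.

import secrets

P = 2**61 - 1  # prime modulus for additive sharing; illustrative choice

def share(x, r):
    # Split integer x into r additive shares that sum to x modulo P.
    shares = [secrets.randbelow(P) for _ in range(r - 1)]
    shares.append((x - sum(shares)) % P)
    return shares

def recon(shares):
    # Recombine additive shares (the Recon of the text).
    return sum(shares) % P

def f_recon_inputs(f, shares_by_trustee):
    # f o ReconInputs: shares_by_trustee[i][j] is trustee T_i's share of
    # judge J_j's input. Shown in the clear here for illustration only.
    r, n = len(shares_by_trustee), len(shares_by_trustee[0])
    judge_inputs = [recon([shares_by_trustee[i][j] for i in range(r)])
                    for j in range(n)]
    return f(judge_inputs)

# Example: 3 judges, 12 trustees, f = sum of the judges' counts.
judge_counts = [4, 0, 7]
by_trustee = list(zip(*[share(x, 12) for x in judge_counts]))
assert f_recon_inputs(sum, by_trustee) == sum(judge_counts)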

              6 Evaluation of MPC Implementation

In our proposal, judges use secure multiparty computation (MPC) to compute aggregate statistics about the extent and distribution of surveillance. Although in principle MPC can support secure computation of any function of the judges' data, full generality can come with unacceptable performance limitations. In order that our protocols scale to hundreds of federal judges, we narrow our attention to two kinds of functions that are particularly useful in the context of surveillance.

The extent of surveillance (totals). Computing totals involves summing values held by the parties without revealing information about any value to anyone other than its owner. Totals become averages by dividing by the number of data points. In the context of electronic surveillance, totals are the most prevalent form of computation in government and corporate transparency reports. How many court orders were approved for cases involving homicide, and how many for drug offenses? How long was the average order in effect? How many orders were issued in California [9]? Totals make it possible to determine the extent of surveillance.

The distribution of surveillance (thresholds). Thresholding involves determining the number of data points that exceed a given cut-off. How many courts issued more than ten orders for data from Google? How many orders were in effect for more than 90 days? Unlike totals, thresholds can reveal selected facts about the distribution of surveillance, i.e., the circumstances in which it is most prevalent. Thresholds go beyond the kinds of questions typically answered in transparency reports, offering new opportunities to improve accountability.
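For reference, the plaintext semantics of the two statistic families are as simple as the following sketch (illustrative Python; in our system these functions are evaluated under MPC so that no individual court's value is revealed, and the per-court inputs below are made up):

def total(values):
    # Extent: sum of per-court counts; an average is total(values) / len(values).
    return sum(values)

def threshold_count(values, cutoff):
    # Distribution: how many courts' values exceed the cutoff.
    return sum(1 for v in values if v > cutoff)

orders_per_court = [12, 3, 0, 25, 7]           # hypothetical per-court order counts
print(total(orders_per_court))                 # 47
print(threshold_count(orders_per_court, 10))   # 2 courts issued more than ten orders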

To enable totals and thresholds to scale to the size of the federal court system, we implemented a hierarchical MPC protocol, as described in Figure 4, whose design mirrors the hierarchy of the court system. Our evaluation shows the hierarchical structure reduces MPC complexity from quadratic in the number of judges to linear.

We implemented protocols that make use of totals and thresholds using two existing JavaScript-based MPC libraries: WebMPC [16, 29] and Jiff [5]. WebMPC is the simpler and less versatile library; we test it as a baseline and as a "sanity check" that its performance scales as expected, then move on to the more interesting experiment of evaluating Jiff. We opted for JavaScript libraries to facilitate integration into a web application, which is suitable for federal judges to submit information through a familiar browser interface regardless of the differences in their local system setups. Both of these libraries are designed to facilitate MPC across dozens or hundreds of computers; we simulated this effect by running each party in a separate process on a computer with 16 CPU cores and 64GB of RAM. We tested these protocols on randomly generated data containing values in the hundreds, which reflects the same order of magnitude as data present in existing transparency reports. Our implementations were crafted with two design goals in mind:

1. Protocols should scale to roughly 1,000 parties, the approximate size of the federal judiciary [10], performing efficiently enough to facilitate periodic transparency reports.

2. Protocols should not require all parties to be online regularly or at the same time.

In the subsections that follow, we describe and evaluate our implementations in light of these goals.

6.1 Computing Totals in WebMPC

WebMPC is a JavaScript-based library that can securely compute sums in a single round. The underlying protocol relies on two parties who are trusted not to collude with one another: an analyst, who distributes masking information to all protocol participants at the beginning of the process and receives the final aggregate statistic, and a facilitator, who aggregates this information together in masked form. The participants use the masking information from the analyst to mask their inputs and send them to the facilitator, who aggregates them and sends the result (i.e., a masked sum) to the analyst. The analyst removes the mask and uncovers the aggregated result. Once the participants have their masks, they receive no further messages from any other party; they can submit this masked data to the facilitator in an uncoordinated fashion and go offline immediately afterwards. Even if some anticipated participants do not send data, the protocol can still run to completion with those who remain.

Figure 5: Performance of MPC using the WebMPC library.
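The masking idea can be summarized in a few lines (an illustrative Python sketch of the single-round structure, not WebMPC's actual code): the analyst sees only the masked sum, the facilitator sees only masked submissions, and an individual input is exposed only if the two collude. Handling dropouts simply requires the analyst to subtract only the masks of participants who actually submitted.

import secrets

P = 2**61 - 1  # modulus for additive masking; illustrative choice

def analyst_setup(n):
    # Analyst draws one random mask per participant and keeps their sum.
    masks = [secrets.randbelow(P) for _ in range(n)]
    return masks, sum(masks) % P

def participant_submit(value, mask):
    # Each participant masks its input, submits it, and may go offline.
    return (value + mask) % P

def facilitator_aggregate(masked_values):
    # Facilitator adds masked submissions; individual inputs stay hidden.
    return sum(masked_values) % P

def analyst_unmask(masked_sum, total_mask):
    return (masked_sum - total_mask) % P

masks, total_mask = analyst_setup(3)
submissions = [participant_submit(v, m) for v, m in zip([5, 0, 9], masks)]
assert analyst_unmask(facilitator_aggregate(submissions), total_mask) == 14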

To make this protocol feasible in our setting, we need to identify a facilitator and analyst who will not collude. In many circumstances, it would be acceptable to rely on reputable institutions already present in the court system, such as the circuit courts of appeals, the Supreme Court, or the Administrative Office of the US Courts.

Although this protocol's simplicity limits its generality, it also makes it possible for the protocol to scale efficiently to a large number of participants. As Figure 5 illustrates, the protocol scales linearly with the number of parties. Even with 400 parties (the largest size we tested), the protocol still completed in just under 75 seconds. Extrapolating from the linear trend, it would take about three minutes to compute a summation across the entire federal judiciary. Since existing transparency reports are typically released just once or twice a year, it is reasonable to invest three minutes of computation (or less than a fifth of a second per judge) for each statistic.
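The extrapolation is just linear scaling of the measured 400-party run, assuming the trend in Figure 5 continues to the roughly 1,000 judges of the federal judiciary:

\[
\frac{75\ \text{s}}{400\ \text{parties}} \approx 0.19\ \text{s per party},
\qquad
0.19\ \text{s} \times 1000 \approx 190\ \text{s} \approx 3\ \text{minutes}.
\]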

6.2 Thresholds and Hierarchy with Jiff

To make use of MPC operations beyond totals, we turned to Jiff, another MPC library implemented in JavaScript. Jiff is designed to support MPC for arbitrary functionalities, although inbuilt support for some more complex functionalities is still under development at the time of writing. Most importantly for our needs, Jiff supports thresholding and multiplication in addition to sums. We evaluated Jiff on three different MPC protocols: totals (as with WebMPC), additive thresholding (i.e., how many values exceeded a specific threshold), and multiplicative thresholding (i.e., did all values exceed a specific threshold). In contrast to computing totals via summation, certain operations like thresholding require more complicated computation and multiple rounds of communication. By building on Jiff with our hierarchical MPC implementation, we demonstrate that these operations are viable at the scale required by the federal court system.

Figure 6: Flat total (red), additive threshold (blue), and multiplicative threshold (green) protocols in Jiff.

Figure 7: Hierarchical total (red), additive threshold (blue), and multiplicative threshold (green) protocols in Jiff. Note the difference in axis scales from Figure 6.

As a baseline, we ran sums, additive thresholding, and multiplicative thresholding benchmarks with all judges as full participants in the MPC protocol, sharing the workload equally, a configuration we term the flat protocol (in contrast to the hierarchical protocol we present next). Figure 6 illustrates that the running time of these protocols grows quadratically with the number of judges participating. These running times quickly became untenable. While summation took several minutes among hundreds of judges, both thresholding benchmarks could barely handle tens of judges in the same time envelopes. These graphs illustrate the substantial performance disparity between summation and thresholding.

In Section 4, we described an alternative "hierarchical" MPC configuration to reduce this quadratic growth to linear. As depicted in Figure 4, each lower-court judge splits a piece of data into twelve secret shares, one for each circuit court of appeals. These shares are sent to the corresponding courts, who conduct a twelve-party MPC that performs a total or thresholding operation based on the input shares. If n lower-court judges participate, the protocol is tantamount to computing n twelve-party summations followed by a single n-input summation or threshold. As n increases, the amount of work scales linearly. So long as at least one circuit court remains honest and uncompromised, the secrecy of the lower-court data endures by the security of the secret-sharing scheme.
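A minimal sketch of this data flow (illustrative Python; in the deployed system the circuit-level step runs as a twelve-party Jiff MPC rather than in the clear): each judge additively shares its value across the twelve circuits, each circuit locally adds the shares it received, and the twelve partial sums recombine to the total, so no single circuit ever sees a judge's value. Thresholds cannot use this local-addition shortcut; for them the circuits run a genuine multi-round MPC, which is why thresholding remains slower than totals.

import secrets

P = 2**61 - 1
NUM_CIRCUITS = 12

def judge_share(value):
    # A lower-court judge splits its value into one additive share per circuit.
    shares = [secrets.randbelow(P) for _ in range(NUM_CIRCUITS - 1)]
    shares.append((value - sum(shares)) % P)
    return shares

def circuit_partial_sum(shares_received):
    # Each circuit adds the shares sent to it by all n judges (linear, local work).
    return sum(shares_received) % P

def combine(partial_sums):
    # Recombining the twelve partial sums yields the total.
    return sum(partial_sums) % P

judge_values = [3, 8, 0, 5, 11]                      # hypothetical per-judge counts
all_shares = [judge_share(v) for v in judge_values]  # all_shares[j][i]
per_circuit = [[all_shares[j][i] for j in range(len(judge_values))]
               for i in range(NUM_CIRCUITS)]
assert combine([circuit_partial_sum(s) for s in per_circuit]) == sum(judge_values)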

Figure 7 illustrates the linear scaling of the twelve-party portion of the hierarchical protocols; we measured only the computation time after the circuit courts received all of the additive shares from the lower courts. While the flat summation protocol took nearly eight minutes to run on 300 judges, the hierarchical summation scaled to 1,000 judges in less than 20 seconds, besting even the WebMPC results. Although thresholding characteristically remained much slower than summation, the hierarchical protocol scaled to nearly 250 judges in about the same amount of time that it took the flat protocol to run on 35 judges. Since the running times for the threshold protocols were in the tens of minutes for large benchmarks, the linear trend is noisier than for the total protocol. Most importantly, both of these protocols scaled linearly, meaning that, given sufficient time, thresholding could scale up to the size of the federal court system. This performance is acceptable if a few choice thresholds are computed at the frequency at which existing transparency reports are published.^18

One additional benefit of the hierarchical protocols is that lower courts do not need to stay online while the protocol is executing, a goal we articulated at the beginning of this section. A lower court simply needs to send in its shares to the requisite circuit courts, one message per circuit court, a grand total of twelve messages, after which it is free to disconnect. In contrast, the flat protocol grinds to a halt if even a single judge goes offline. The availability of the hierarchical protocol relies on a small set of circuit courts, who could invest in more robust infrastructure.

              7 Evaluation of SNARKs

We define the syntax of preprocessing zero-knowledge SNARKs for arithmetic circuit satisfiability [15].

A SNARK is a triple of probabilistic polynomial-time algorithms SNARK = (Setup, Prove, Verify), as follows:

• Setup(1^κ, R) takes as input the security parameter κ and a description of a binary relation R (an arithmetic circuit of size polynomial in κ), and outputs a pair (pk_R, vk_R) of a proving key and verification key.

• Prove(pk_R, (x, w)) takes as input a proving key pk_R and an input-witness pair (x, w), and outputs a proof π attesting to x ∈ L_R, where L_R = {x : ∃w s.t. (x, w) ∈ R}.

• Verify(vk_R, (x, π)) takes as input a verification key vk_R and an input-proof pair (x, π), and outputs a bit indicating whether π is a valid proof for x ∈ L_R.
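The calling pattern, though not the cryptography, can be fixed in a few lines. The stub below (Python) is only a schematic of who holds which key and what flows where; the bodies have no security properties whatsoever and do not reflect LibSNARK's actual API.

import secrets
from dataclasses import dataclass

@dataclass
class Keys:
    pk: bytes  # proving key: needed by whoever creates proofs for relation R
    vk: bytes  # verification key: published so that anyone can verify

def setup(kappa, relation_id):
    # Stand-in for Setup(1^kappa, R): one-time, per-circuit key generation.
    seed = secrets.token_bytes(kappa // 8)
    return Keys(pk=b"pk|" + relation_id + seed, vk=b"vk|" + relation_id + seed)

def prove(pk, x, w, relation):
    # Stand-in for Prove(pk_R, (x, w)): succeeds only when (x, w) is in R.
    if not relation(x, w):
        raise ValueError("(x, w) is not in the relation R")
    return b"proof|" + repr(x).encode()  # toy token; a real argument is ~287 bytes

def verify(vk, x, proof):
    # Stand-in for Verify(vk_R, (x, proof)).
    return proof == b"proof|" + repr(x).encode()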

18. Too high a frequency is also inadvisable due to the possibility of revealing too-granular information when combined with the timings of specific investigations' court orders.

Before participants can create and verify SNARKs, they must establish a proving key, which any participant can use to create a SNARK, and a corresponding verification key, which any participant can use to verify a SNARK so created. Both of these keys are publicly known. The keys are distinct for each circuit (representing an NP relation) about which proofs are generated, and can be reused to produce as many different proofs with respect to that circuit as desired. Key generation uses randomness that, if known or biased, could allow participants to create proofs of false statements [13]. The key generation process must therefore protect and then destroy this information.

Using MPC to do key generation based on randomness provided by many different parties provides the guarantee that, as long as at least one of the MPC participants behaved correctly (i.e., did not bias his randomness and destroyed it afterward), the resulting keys are good (i.e., do not permit proofs of false statements). This approach has been used in the past, most notably by the cryptocurrency Zcash [8]. Despite the strong guarantees provided by this approach to key generation when at least one party is not corrupted, concerns have been expressed about the wisdom of trusting in the assumption of one honest party in the Zcash setting, which involves large monetary values and a system design inherently centered around the principles of full decentralization.

For our system, we propose key generation be done in a one-time MPC among several of the traditionally reputable institutions in the court system, such as the Supreme Court or Administrative Office of the US Courts, ideally together with other reputable parties from different branches of government. In our setting, the use of MPC for SNARK key generation does not constitute as pivotal and potentially risky a trust assumption as in Zcash, in that the court system is close-knit and inherently built with the assumption of trustworthiness of certain entities within the system. In contrast, a decentralized cryptocurrency (1) must, due to its distributed nature, rely for key generation on MPC participants that are essentially strangers to most others in the system, and (2) could be said to derive its very purpose from not relying on the trustworthiness of any small set of parties.

We note that since key generation is a one-time task for each circuit, we can tolerate a relatively performance-intensive process. Proving and verification keys can be distributed on the ledger.

7.1 Argument Types

Our implementation supports three types of arguments.

Argument of knowledge for a commitment (P_k). Our simplest type of argument attests the prover's knowledge of the content of a given commitment c, i.e., that she could open the commitment if required. Whenever a party publishes a commitment, she can accompany it with a SNARK attesting that she knows the message and randomness that were used to generate the commitment. Formally, this is an argument that the prover knows m and ω that correspond to a publicly known c such that Open(m, c, ω) = 1.

Argument of commitment equality (P_eq). Our second type of argument attests that the content of two publicly known commitments c_1, c_2 is the same. That is, for two publicly known commitments c_1 and c_2, the prover knows m_1, m_2, ω_1, and ω_2 such that Open(m_1, c_1, ω_1) = 1 ∧ Open(m_2, c_2, ω_2) = 1 ∧ m_1 = m_2.

More concretely, suppose that an agency wishes to release relational information: that the identifier (e.g., email address) in the request is the same identifier that a judge approved. The judge and law enforcement agency post commitments c_1 and c_2, respectively, to the identifiers they used. The law enforcement agency then posts an argument attesting that the two commitments are to the same value.^19 Since circuits use fixed-size inputs, an argument implicitly reveals the length of the committed message. To hide this information, the law enforcement agency can pad each input up to a uniform length.

P_eq may be too revealing under certain circumstances: for the public to verify the argument, the agency (who posted c_2) must explicitly identify c_1, potentially revealing which judge authorized the data request and when.

Existential argument of commitment equality (P_∃). Our third type of argument allows decreasing the resolution of the information revealed, by proving that a commitment's content is the same as that of some other commitment among many. Formally, it shows that for publicly known commitments c, c_1, ..., c_N, respectively to secret values (m, ω), (m_1, ω_1), ..., (m_N, ω_N), ∃i such that Open(m, c, ω) = 1 ∧ Open(m_i, c_i, ω_i) = 1 ∧ m = m_i. We treat i as an additional secret input, so that for any value of N only two commitments need to be opened. This scheme trades off between resolution (number of commitments) and efficiency, a question we explore below.

We have chosen these three types of arguments to implement, but LibSNARK supports arbitrary predicates in principle, and there are likely others that would be useful and run efficiently in practice. A useful generalization of P_eq and P_∃ would be to replace equality with more sophisticated domain-specific predicates: instead of showing that messages m_1, m_2 corresponding to a pair of commitments are equal, one could show p(m_1, m_2) = 1 for other predicates p (e.g., "less-than" or "signed by same court"). The types of arguments that can be implemented efficiently will expand as SNARK libraries' efficiency improves; our system inherits such efficiency gains.

19. To produce a proof for P_eq, the prover (e.g., the agency) needs to know both ω_2 and ω_1, but in some cases c_1 (and thus ω_1) may have been produced by a different entity (e.g., the judge). Publicizing ω_1 is unacceptable, as it compromises the hiding of the commitment content. To solve this problem, the judge can include ω_1 alongside m_1 in secret documents that both parties possess (e.g., the court order).

7.2 Implementation

We implemented these zero-knowledge arguments with LibSNARK [34], a C++ library for creating general-purpose SNARKs from arithmetic circuits. We implemented commitments using the SHA256 hash function;^20 ω is a 256-bit random string appended to the message before it is hashed. In this section, we show that useful statements can be proven within a reasonable performance envelope. We consider six criteria: the size of the proving key, the size of the verification key, the size of the proof statement, the time to generate keys, the time to create proofs, and the time to verify proofs. We evaluated these metrics with messages from 16 to 1232 bytes on P_k, P_eq, and P_∃ (N = 100, 400, 700, and 1000, large enough to obscure links between commitments) on a computer with 16 CPU cores and 64GB of RAM.
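The statements being proven are easy to state outside the circuit. The sketch below (illustrative Python) gives the commitment scheme just described, SHA256 over the message with a 256-bit ω appended, and the plaintext meaning of the three relations; in the implementation these checks are compiled into arithmetic circuits for LibSNARK, and the witnesses (m, ω, and, for P_∃, the index i) remain secret.

import hashlib, secrets

def commit(message, omega=None):
    # c = SHA256(message || omega), with omega a 256-bit random string.
    omega = omega if omega is not None else secrets.token_bytes(32)
    return hashlib.sha256(message + omega).digest(), omega

def open_ok(message, c, omega):
    return hashlib.sha256(message + omega).digest() == c

def relation_Pk(c, m, omega):
    # P_k: the prover knows (m, omega) opening the public commitment c.
    return open_ok(m, c, omega)

def relation_Peq(c1, c2, m1, omega1, m2, omega2):
    # P_eq: two public commitments open to the same message.
    return open_ok(m1, c1, omega1) and open_ok(m2, c2, omega2) and m1 == m2

def relation_Pexists(c, cs, m, omega, i, m_i, omega_i):
    # P_exists: c opens to the same message as some c_i among the public list cs;
    # the index i is part of the secret witness, so only two openings are checked.
    return open_ok(m, c, omega) and open_ok(m_i, cs[i], omega_i) and m == m_i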

Argument size. The argument is just 287 bytes. Accompanying each argument are its public inputs (in this case, commitments). Each commitment is 256 bits.^21 An auditor needs to store these commitments anyway as part of the ledger, and each commitment can be stored just once and reused for many proofs.

Verification key size. The size of the verification key is proportional to the size of the circuit and its public inputs. The key was 106KB for P_k (one commitment as input and one SHA256 circuit) and 208.3KB for P_eq (two commitments and two SHA256 circuits). Although P_∃ computes SHA256 just twice, its smallest input, 100 commitments, is 50 times as large as that of P_k and P_eq; the keys are correspondingly larger and grow linearly with the input size. For 100, 400, 700, and 1000 commitments, the verification keys were respectively 10MB, 41MB, 71MB, and 102MB. Since only one verification key is necessary for each circuit, these keys are easily small enough to make large-scale verification feasible.

Proving key size. The proving keys are much larger, in the hundreds of megabytes. Their size grows linearly with the size of the circuit, so longer messages (which require more SHA256 computations), more complicated circuits, and (for P_∃) more inputs lead to larger keys. Figure 8a reflects this trend. Proving keys are largest for P_∃ with 1000 inputs on 1232-byte messages, and shrink as the message size and the number of commitments decrease. P_k and P_eq, which have simpler circuits, still have bigger proving keys for bigger messages. Although these keys are large, only entities that create each kind of proof need to store the corresponding key. Storing one key for each type of argument we have presented takes only about 1GB at the largest input sizes.

20. Certain other hash functions may be more amenable to representation as arithmetic circuits, and thus more "SNARK-friendly". We opted for a proof of concept with SHA256, as it is so widely used.

21. LibSNARK stores each bit in a 32-bit integer, so an argument involving k commitments takes about 1024k bytes. A bit-vector representation would save a factor of 32.

Key generation time. Key generation time increased linearly with the size of the keys, from a few seconds for P_k and P_eq on small messages to a few minutes for P_∃ on the largest parameters (Figure 8b). Since key generation is a one-time process to add a new kind of proof in the form of a circuit, we find these numbers acceptable.

Argument generation time. Argument generation time increased linearly with proving key size and ranged from a few seconds on the smallest keys to a couple of minutes for the largest (Figure 8c). Since argument generation is a one-time task for each surveillance action, and the existing administrative processes for each surveillance action often take hours or days, we find this cost acceptable.

Argument verification time. Verifying P_k and P_eq on the largest message took only a few milliseconds. Verification times for P_∃ were larger and increased linearly with the number of input commitments. For 100, 400, 700, and 1000 commitments, verification took 40ms, 85ms, 243ms, and 338ms on the largest input. These times are still fast enough to verify many arguments quickly.

              8 Generalization

Our proposal can be generalized beyond ECPA surveillance to encompass a broader class of secret information processes. Consider situations in which independent institutions need to act in a coordinated but secret fashion and, at the same time, are subject to public scrutiny. They should be able to convince the public that their actions are consistent with relevant rules. As in electronic surveillance, accountability requires the ability to attest to compliance without revealing sensitive information.

Figure 8: SNARK evaluation. (a) Proving key size; (b) key generation time; (c) argument generation time.

Example 1 (FISA court). Accountability is needed in other electronic surveillance arenas. The US Foreign Intelligence Surveillance Act (FISA) regulates surveillance in national security investigations. Because of the sensitive interests at stake, the entire process is overseen by a US court that meets in secret. The tension between secrecy and public accountability is even sharper for the FISA court: much of the data collected under FISA may stay permanently hidden inside US intelligence agencies, while data collected under ECPA may eventually be used in public criminal trials. This opacity may be justified, but it has engendered skepticism. The public has no way of knowing what the court is doing, nor any means of assuring itself that the intelligence agencies under the authority of FISA are even complying with the rules of that court. The FISA court itself has voiced concern that it has no independent means of assessing compliance with its orders because of the extreme secrecy involved. Applying our proposal to the FISA court, both the court and the public could receive proofs of documented compliance with FISA orders, as well as aggregate statistics on the scope of FISA surveillance activity, to the full extent possible without incurring national security risk.

Example 2 (Clinical trials). Accountability mechanisms are also important to assess the behavior of private parties, e.g., in clinical trials for new drugs. There are many parties to clinical trials, and much of the information involved is either private or proprietary. Yet regulators and the public have a need to know that responsible testing protocols are observed. Our system can achieve the right balance of transparency, accountability, and respect for privacy of those involved in the trials.

Example 3 (Public fund spending). Accountability in spending of taxpayer money is naturally a subject of public interest. Portions of public funds may be allocated for sensitive purposes (e.g., defense/intelligence), and the amounts and allocation thereof may be publicly unavailable due to their sensitivity. Our system would enable credible public assurances that taxpayer money is being spent in accordance with stated principles, while preserving secrecy of information considered sensitive.

8.1 Generalized Framework

We present abstractions describing the generalized version of our system and briefly outline how the concrete examples fit into this framework. A secret information process includes the following components:

• A set of participants interact with each other. In our ECPA example, these are judges, law enforcement agencies, and companies.

• The participants engage in a protocol (e.g., to execute the procedures for conducting electronic surveillance). The protocol messages exchanged are hidden from the view of outsiders (e.g., the public), and yet it is of public interest that the protocol messages exchanged adhere to certain rules.

• A set of auditors (distinct from the participants) seeks to audit the protocol by verifying that a set of accountability properties are met.

Abstractly, our system allows the controlled disclosure of four types of information.

Existential information reveals the existence of a piece of data, be it in a participant's local storage or the content of a communication between participants. In our case study, existential information is revealed with commitments, which indicate the existence of a document.

Relational information describes the actions participants take in response to the actions of others. In our case study, relational information is represented by the zero-knowledge arguments that attest that actions were taken lawfully (e.g., in compliance with a judge's order).

Content information is the data in storage and communication. In our case study, content information is revealed through aggregate statistics via MPC, and when documents are unsealed and their contents made public.

Timing information is a by-product of the other information. In our case study, timing information could include order issuance dates, turnaround times for data request fulfilment by companies, and seal expiry dates.

Revealing combinations of these four types of information with the specified cryptographic tools provides the flexibility to satisfy a range of application-specific accountability properties, as exemplified next.

Example 1 (FISA court). Participants are the FISA Court judges, the agencies requesting surveillance authorization, and any service providers involved in facilitating said surveillance. The protocol encompasses the legal process required to authorize surveillance, together with the administrative steps that must be taken to enact surveillance. Auditors are the public, the judges themselves, and possibly Congress. Desirable accountability properties are similar to those in our ECPA case study, e.g., attestations that certain rules are being followed in issuing surveillance orders, and release of aggregate statistics on surveillance activities under FISA.

Example 2 (Clinical trials). Participants are the institutions (companies or research centers) conducting clinical trials, comprising scientists, ethics boards, and data analysts; the organizations that manage regulations regarding clinical trials, such as the National Institutes of Health (NIH) and the Food and Drug Administration (FDA) in the US; and hospitals and other sources through which trial participants are drawn. The protocol encompasses the administrative process required to approve a clinical trial, and the procedure of gathering participants and conducting the trial itself. Auditors are the public, the regulatory organizations such as the NIH and the FDA, and possibly professional ethics committees. Desirable accountability properties include, e.g., attestations that appropriate procedures are respected in recruiting participants and administering trials, and release of aggregate statistics on clinical trial results without compromising individual participants' medical data.

Example 3 (Public fund spending). Participants are Congress (who appropriates the funding), defense/intelligence agencies, and service providers contracted in the spending of said funding. The protocol encompasses the processes by which Congress allocates funds to agencies and agencies allocate funds to particular expenses. Auditors are the public and Congress. Desirable accountability properties include, e.g., attestations that procurements were within reasonable margins of market prices and satisfied documented needs, and release of aggregate statistics on the proportion of allocated money used and broad spending categories.

              9 Conclusion

We present a cryptographic answer to the accountability challenge currently frustrating the US court system. Leveraging cryptographic commitments, zero-knowledge proofs, and secure MPC, we provide the electronic surveillance process a series of scalable, flexible, and practical measures for improving accountability while maintaining secrecy. While we focus on the case study of electronic surveillance, these strategies are equally applicable to a range of other secret information processes requiring accountability to an outside auditor.

              Acknowledgements

We are grateful to Judge Stephen Smith for discussion and insights from the perspective of the US court system, to Andrei Lapets, Kinan Dak Albab, Rawane Issa, and Frederick Joossens for discussion on Jiff and WebMPC, and to Madars Virza for advice on SNARKs and LibSNARK.

This research was supported by the following grants: NSF MACS (CNS-1413920), DARPA/IBM (W911NF-15-C-0236), Simons Investigator Award Agreement dated June 5th, 2012, and the Center for Science of Information (CSoI), an NSF Science and Technology Center, under grant agreement CCF-0939370.

References

[1] Bellman. https://github.com/ebfull/bellman.

[2] Electronic Communications Privacy Act, 18 U.S.C. 2701 et seq.

[3] Foreign Intelligence Surveillance Act, 50 U.S.C. ch. 36.

[4] Google transparency report. https://www.google.com/transparencyreport/userdatarequests/countries/?p=2016-12.

[5] Jiff. https://github.com/multiparty/jiff.

[6] Jsnark. https://github.com/akosba/jsnark.

[7] Law enforcement requests report. https://www.microsoft.com/en-us/about/corporate-responsibility/lerr.

[8] Zcash. https://z.cash.

[9] Wiretap report 2015. http://www.uscourts.gov/statistics-reports/wiretap-report-2015, December 2015.

[10] ADMINISTRATIVE OFFICE OF THE COURTS. Authorized judgeships. http://www.uscourts.gov/sites/default/files/allauth.pdf.

[11] ARAKI, T., BARAK, A., FURUKAWA, J., LICHTER, T., LINDELL, Y., NOF, A., OHARA, K., WATZMAN, A., AND WEINSTEIN, O. Optimized honest-majority MPC for malicious adversaries - breaking the 1 billion-gate per second barrier. In 2017 IEEE Symposium on Security and Privacy, SP 2017, San Jose, CA, USA, May 22-26, 2017 (2017), IEEE Computer Society, pp. 843–862.

[12] BATES, A., BUTLER, K. R., SHERR, M., SHIELDS, C., TRAYNOR, P., AND WALLACH, D. Accountable wiretapping - or - I know they can hear you now. NDSS (2012).

[13] BEN-SASSON, E., CHIESA, A., GREEN, M., TROMER, E., AND VIRZA, M. Secure sampling of public parameters for succinct zero knowledge proofs. In Security and Privacy (SP), 2015 IEEE Symposium on (2015), IEEE, pp. 287–304.

[14] BEN-SASSON, E., CHIESA, A., TROMER, E., AND VIRZA, M. Scalable zero knowledge via cycles of elliptic curves. IACR Cryptology ePrint Archive 2014 (2014), 595.

[15] BEN-SASSON, E., CHIESA, A., TROMER, E., AND VIRZA, M. Succinct non-interactive zero knowledge for a von Neumann architecture. In 23rd USENIX Security Symposium (USENIX Security 14) (San Diego, CA, Aug. 2014), USENIX Association, pp. 781–796.

[16] BESTAVROS, A., LAPETS, A., AND VARIA, M. User-centric distributed solutions for privacy-preserving analytics. Communications of the ACM 60, 2 (February 2017), 37–39.

[17] CHAUM, D., AND VAN HEYST, E. Group signatures. In Advances in Cryptology - EUROCRYPT '91, Workshop on the Theory and Application of Cryptographic Techniques, Brighton, UK, April 8-11, 1991, Proceedings (1991), D. W. Davies, Ed., vol. 547 of Lecture Notes in Computer Science, Springer, pp. 257–265.

[18] DAMGARD, I., AND ISHAI, Y. Constant-round multiparty computation using a black-box pseudorandom generator. In Advances in Cryptology - CRYPTO 2005, 25th Annual International Cryptology Conference, Santa Barbara, California, USA, August 14-18, 2005, Proceedings (2005), V. Shoup, Ed., vol. 3621 of Lecture Notes in Computer Science, Springer, pp. 378–394.

[19] DAMGARD, I., KELLER, M., LARRAIA, E., PASTRO, V., SCHOLL, P., AND SMART, N. P. Practical covertly secure MPC for dishonest majority - or: Breaking the SPDZ limits. In Computer Security - ESORICS 2013 - 18th European Symposium on Research in Computer Security, Egham, UK, September 9-13, 2013, Proceedings (2013), J. Crampton, S. Jajodia, and K. Mayes, Eds., vol. 8134 of Lecture Notes in Computer Science, Springer, pp. 1–18.

[20] FEIGENBAUM, J., JAGGARD, A. D., AND WRIGHT, R. N. Open vs. closed systems for accountability. In Proceedings of the 2014 Symposium and Bootcamp on the Science of Security (2014), ACM, p. 4.

[21] FEIGENBAUM, J., JAGGARD, A. D., WRIGHT, R. N., AND XIAO, H. Systematizing "accountability" in computer science. Tech. rep., Yale University, Feb. 2012. Technical Report YALEU/DCS/TR-1452.

[22] FEUER, A., AND ROSENBERG, E. Brooklyn prosecutor accused of using illegal wiretap to spy on love interest. November 2016. https://www.nytimes.com/2016/11/28/nyregion/brooklyn-prosecutor-accused-of-using-illegal-wiretap-to-spy-on-love-interest.html.

[23] GOLDWASSER, S., AND PARK, S. Public accountability vs. secret laws: Can they coexist? A cryptographic proposal. In Proceedings of the 2017 Workshop on Privacy in the Electronic Society, Dallas, TX, USA, October 30 - November 3, 2017 (2017), B. M. Thuraisingham and A. J. Lee, Eds., ACM, pp. 99–110.

[24] JAKOBSEN, T. P., NIELSEN, J. B., AND ORLANDI, C. A framework for outsourcing of secure computation. In Proceedings of the 6th edition of the ACM Workshop on Cloud Computing Security, CCSW '14, Scottsdale, Arizona, USA, November 7, 2014 (2014), G. Ahn, A. Oprea, and R. Safavi-Naini, Eds., ACM, pp. 81–92.

[25] KAMARA, S. Restructuring the NSA metadata program. In International Conference on Financial Cryptography and Data Security (2014), Springer, pp. 235–247.

[26] KATZ, J., AND LINDELL, Y. Introduction to Modern Cryptography. CRC Press, 2014.

[27] KROLL, J., FELTEN, E., AND BONEH, D. Secure protocols for accountable warrant execution. 2014. http://www.cs.princeton.edu/~felten/warrant-paper.pdf.

[28] LAMPSON, B. Privacy and security: Usable security: How to get it. Communications of the ACM 52, 11 (2009), 25–27.

[29] LAPETS, A., VOLGUSHEV, N., BESTAVROS, A., JANSEN, F., AND VARIA, M. Secure MPC for Analytics as a Web Application. In 2016 IEEE Cybersecurity Development (SecDev) (Boston, MA, USA, November 2016), pp. 73–74.

[30] MASHIMA, D., AND AHAMAD, M. Enabling robust information accountability in e-healthcare systems. In HealthSec (2012).

[31] PAPANIKOLAOU, N., AND PEARSON, S. A cross-disciplinary review of the concept of accountability: A survey of the literature. 2013. Available online at http://www.bic-trust.eu/files/2013/06/Paper_NP.pdf.

[32] PEARSON, S. Toward accountability in the cloud. IEEE Internet Computing 15, 4 (2011), 64–69.

[33] RIVEST, R. L., SHAMIR, A., AND TAUMAN, Y. How to leak a secret. In Advances in Cryptology - ASIACRYPT 2001, 7th International Conference on the Theory and Application of Cryptology and Information Security, Gold Coast, Australia, December 9-13, 2001, Proceedings (2001), C. Boyd, Ed., vol. 2248 of Lecture Notes in Computer Science, Springer, pp. 552–565.

[34] SCIPR LAB. libsnark: a C++ library for zkSNARK proofs. https://github.com/scipr-lab/libsnark.

[35] SEGAL, A., FEIGENBAUM, J., AND FORD, B. Privacy-preserving lawful contact chaining [preliminary report]. In Proceedings of the 2016 ACM on Workshop on Privacy in the Electronic Society (New York, NY, USA, 2016), WPES '16, ACM, pp. 185–188.

[36] SEGAL, A., FORD, B., AND FEIGENBAUM, J. Catching bandits and only bandits: Privacy-preserving intersection warrants for lawful surveillance. In FOCI (2014).

[37] SMITH, S. W. Kudzu in the courthouse: Judgments made in the shade. The Federal Courts Law Review 3, 2 (2009).

[38] SMITH, S. W. Gagged, sealed & delivered: Reforming ECPA's secret docket. Harvard Law & Policy Review 6 (2012), 313–459.

[39] SUNDARESWARAN, S., SQUICCIARINI, A., AND LIN, D. Ensuring distributed accountability for data sharing in the cloud. IEEE Transactions on Dependable and Secure Computing 9, 4 (2012), 556–568.

[40] TAN, Y. S., KO, R. K., AND HOLMES, G. Security and data accountability in distributed systems: A provenance survey. In High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing (HPCC EUC), 2013 IEEE 10th International Conference on (2013), IEEE, pp. 1571–1578.

[41] VIRZA, M. November 2017. Private communication.

[42] WANG, X., RANELLUCCI, S., AND KATZ, J. Global-scale secure multiparty computation. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, CCS 2017, Dallas, TX, USA, October 30 - November 03, 2017 (2017), B. M. Thuraisingham, D. Evans, T. Malkin, and D. Xu, Eds., ACM, pp. 39–56.

[43] WEITZNER, D. J., ABELSON, H., BERNERS-LEE, T., FEIGENBAUM, J., HENDLER, J. A., AND SUSSMAN, G. J. Information accountability. Commun. ACM 51, 6 (2008), 82–87.

[44] XIAO, Z., KATHIRESSHAN, N., AND XIAO, Y. A survey of accountability in computer networks and distributed systems. Security and Communication Networks 9, 4 (2016), 290–315.


                14For simplicity of exposition Algorithm 1 only addresses the cased 6= reject and omits the possibility of appeal by the agency Thealgorithm could straightforwardly be extended to encompass appealsby incorporating the decision to appeal into Adp

                i 15This is the step invoked by requests for unsealed documents

                Algorithm 2 Judge Ji

                bull On receipt of a surveillance request (us) from anagency A j (1) generate decision d larr Jdp1i (s) (2)send response (usd) to A j (3) generate a commit-ment c = Commit((usd)ω) to the decision and store(cusdω) locally (4) post (CoverSheet(d)c) to theledger

                bull On receipt of denied audit request information ζ froman agency A j generate decision d larr Jdp2i (ζ ) and send(dζ ) to A j and to the public P

                bull On receipt of a data revelation request (cz) from thepublic15generate decision blarr Jdp3i (cz) If b= acceptsend to the public P the message and randomness (mω)corresponding to c else if b = reject send reject toP with an accompanying explanation if provided


Judges. Each judge J_i has three associated decision-making processes J_i^dp1, J_i^dp2, and J_i^dp3. J_i^dp1 maps surveillance requests to either a rejection or an authorizing court order; J_i^dp2 maps denied audit requests to either a confirmation of the denial or a court order overturning the denial; and J_i^dp3 maps data revelation requests to either an acceptance or a denial (perhaps along with an explanation of the denial, e.g., "this document is still under seal"). Each judge operates according to Algorithm 2, which is parametrized by the individual judge's (J_i^dp1, J_i^dp2, J_i^dp3).

Algorithm 3: Company C_i

• Upon receiving a surveillance request (s, d, ω) from an agency A_j: if the court order d bears the valid signature of a judge and Commit((s, d), ω) matches a corresponding commitment posted by law enforcement on the ledger, then (1) generate a commitment c ← Commit(δ, ω) to the requested data δ and store (c, δ, ω) locally; (2) generate a SNARK proof π attesting that δ is compliant with s and the judge-signed order d; (3) post (c, π) anonymously to the ledger; (4) reply to A_j by furnishing the requested data δ along with ω.
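The company's initial gate, verifying the judge's signature on d and recomputing the agency's ledger commitment from (s, d, ω), can be sketched as follows. The signature verifier is passed in as a stand-in rather than a concrete scheme, and the (s, d) encoding mirrors the hypothetical one used in the agency sketch above.

```python
import hashlib
import json

def commitment_matches_ledger(s, d, omega: bytes, ledger_commitments) -> bool:
    # Recompute Commit((s, d), omega) and check that it appears on the ledger.
    m = json.dumps({"request": s, "order": d}).encode()
    return hashlib.sha256(m + omega).digest() in ledger_commitments

def accept_order(s, d, omega: bytes, ledger_commitments, verify_judge_signature) -> bool:
    # Act only on judge-signed orders whose commitment law enforcement already posted.
    # verify_judge_signature is a stand-in for a real signature check (e.g., Ed25519).
    return verify_judge_signature(d) and commitment_matches_ledger(s, d, omega, ledger_commitments)
```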

The public. The public P exhibits one main type of behavior in our model: upon receiving an event message e = (a, ξ) from Env (describing either an audit request or a data revelation request), P sends ξ to a (an agency or court). Additionally, the public periodically checks the ledger for validity of posted SNARK proofs and takes steps to flag any non-compliance detected (e.g., through the court system or the news media).

Companies and trustees. Algorithms 3 and 4 describe companies and trustees.

Algorithm 4: Trustee T_i

• Upon receiving an aggregate statistic event message e = (Compute, f, D_1, ..., D_n) from Env:

1. For each i′ ∈ [r] (such that i′ ≠ i), send e to T_{i′}.
2. For each j ∈ [n], send the message (f, D_j) to J_j. Let δ_{j,i} be the response from J_j.
3. With parties T_1, ..., T_r, participate in the MPC protocol MPC with input (δ_{1,i}, ..., δ_{n,i}) to compute the functionality f ∘ ReconInputs (i.e., first reconstruct the judges' inputs from their shares, then apply f), where ReconInputs is defined as follows:

   ReconInputs( ((δ_{1,i′}, ..., δ_{n,i′}))_{i′ ∈ [r]} ) = ( Recon(δ_{j,1}, ..., δ_{j,r}) )_{j ∈ [n]}.

   Let y denote the output from the MPC.¹⁶
4. Send y to J_j for each j ∈ [n].¹⁷

• Upon receiving an MPC initiation message e = (Compute, f, D_1, ..., D_n) from another trustee T_{i′}:

1. Receive a secret share δ_{j,i} from each judge J_j, respectively.
2. With parties T_1, ..., T_r, participate in the MPC protocol MPC with input (δ_{1,i}, ..., δ_{n,i}) to compute the functionality f ∘ ReconInputs.

Companies execute judge-authorized instructions and log their actions by posting commitments on the ledger. Trustees run MPC to compute aggregate statistics from data provided in secret-shared form by judges; MPC events are triggered by Env.
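Share and Recon above refer to the secret-sharing scheme that parametrizes the protocol. For intuition, the following is a minimal sketch of standard additive secret sharing over a prime field, which suffices for the aggregate statistics of Section 6; the particular modulus is an illustrative choice of ours, not one mandated by the system.

```python
import secrets

P = 2**61 - 1  # prime modulus; large enough for transparency-report counts

def share(value: int, r: int):
    # Split value into r additive shares modulo P; any r-1 shares reveal nothing.
    shares = [secrets.randbelow(P) for _ in range(r - 1)]
    shares.append((value - sum(shares)) % P)
    return shares

def recon(shares) -> int:
    # Reconstruct the value by summing all r shares modulo P.
    return sum(shares) % P

# A judge splits its count among the r = 12 circuit-court trustees.
assert recon(share(37, 12)) == 37
```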

                6 Evaluation of MPC Implementation

In our proposal, judges use secure multiparty computation (MPC) to compute aggregate statistics about the extent and distribution of surveillance. Although in principle MPC can support secure computation of any function of the judges' data, full generality can come with unacceptable performance limitations. In order that our protocols scale to hundreds of federal judges, we narrow our attention to two kinds of functions that are particularly useful in the context of surveillance.

The extent of surveillance (totals). Computing totals involves summing values held by the parties without revealing information about any value to anyone other than its owner. Totals become averages by dividing by the number of data points. In the context of electronic surveillance, totals are the most prevalent form of computation on government and corporate transparency reports: How many court orders were approved for cases involving homicide, and how many for drug offenses? How long was the average order in effect? How many orders were issued in California [9]? Totals make it possible to determine the extent of surveillance.

The distribution of surveillance (thresholds). Thresholding involves determining the number of data points that exceed a given cut-off: How many courts issued more than ten orders for data from Google? How many orders were in effect for more than 90 days? Unlike totals, thresholds can reveal selected facts about the distribution of surveillance, i.e., the circumstances in which it is most prevalent. Thresholds go beyond the kinds of questions typically answered in transparency reports, offering new opportunities to improve accountability.
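Written in the clear, the two functionalities reduce to a few lines; in the actual system they are evaluated under MPC on secret-shared inputs, so no individual judge's value is revealed. The sample values below are invented for illustration.

```python
def total(values):
    # Extent of surveillance: sum of the judges' individual counts.
    return sum(values)

def average(values):
    return total(values) / len(values)

def threshold_count(values, cutoff):
    # Additive thresholding: how many data points exceed the cutoff?
    return sum(1 for v in values if v > cutoff)

def threshold_all(values, cutoff):
    # Multiplicative thresholding: did all values exceed the cutoff?
    return all(v > cutoff for v in values)

counts = [12, 0, 45, 7, 103]            # e.g., orders issued per district (invented)
assert total(counts) == 167
assert threshold_count(counts, 10) == 3
assert threshold_all(counts, 10) is False
```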

To enable totals and thresholds to scale to the size of the federal court system, we implemented a hierarchical MPC protocol, as described in Figure 4, whose design mirrors the hierarchy of the court system. Our evaluation shows that the hierarchical structure reduces MPC complexity from quadratic in the number of judges to linear.

We implemented protocols that make use of totals and thresholds using two existing JavaScript-based MPC libraries: WebMPC [16, 29] and Jiff [5]. WebMPC is the simpler and less versatile library; we test it as a baseline and as a "sanity check" that its performance scales as expected, then move on to the more interesting experiment of evaluating Jiff. We opted for JavaScript libraries to facilitate integration into a web application, which is suitable for federal judges to submit information through a familiar browser interface regardless of the differences in their local system setups. Both of these libraries are designed to facilitate MPC across dozens or hundreds of computers; we simulated this effect by running each party in a separate process on a computer with 16 CPU cores and 64GB of RAM. We tested these protocols on randomly generated data containing values in the hundreds, which reflects the same order of magnitude as data present in existing transparency reports. Our implementations were crafted with two design goals in mind:

1. Protocols should scale to roughly 1000 parties (the approximate size of the federal judiciary [10]), performing efficiently enough to facilitate periodic transparency reports.

2. Protocols should not require all parties to be online regularly or at the same time.

In the subsections that follow, we describe and evaluate our implementations in light of these goals.

6.1 Computing Totals in WebMPC

WebMPC is a JavaScript-based library that can securely compute sums in a single round. The underlying protocol relies on two parties who are trusted not to collude with one another: an analyst, who distributes masking information to all protocol participants at the beginning of the process and receives the final aggregate statistic, and a facilitator, who aggregates this information together in masked form. The participants use the masking information from the analyst to mask their inputs and send them to the facilitator, who aggregates them and sends the result (i.e., a masked sum) to the analyst. The analyst removes the mask and uncovers the aggregated result. Once the participants have their masks, they receive no further messages from any other party; they can submit this masked data to the facilitator in an uncoordinated fashion and go offline immediately afterwards. Even if some anticipated participants do not send data, the protocol can still run to completion with those who remain.

Figure 5: Performance of MPC using the WebMPC library.
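The masking idea behind this protocol can be sketched in a few lines: the analyst hands each participant a random mask, the facilitator sees only masked inputs, and the analyst removes the aggregate mask at the end. The sketch below simulates all three roles in one process and is not the WebMPC API.

```python
import secrets

P = 2**61 - 1  # arithmetic modulo a prime, as in the sharing sketch above

def run_masked_sum(inputs):
    # Analyst: one random mask per participant (never shown to the facilitator).
    masks = [secrets.randbelow(P) for _ in inputs]
    # Participants: submit masked values to the facilitator, then go offline.
    masked = [(x + m) % P for x, m in zip(inputs, masks)]
    # Facilitator: aggregates masked values; individual inputs stay hidden.
    masked_sum = sum(masked) % P
    # Analyst: removes the aggregate mask to uncover the result.
    return (masked_sum - sum(masks)) % P

assert run_masked_sum([5, 11, 2, 40]) == 58
```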

To make this protocol feasible in our setting, we need to identify a facilitator and analyst who will not collude. In many circumstances, it would be acceptable to rely on reputable institutions already present in the court system, such as the circuit courts of appeals, the Supreme Court, or the Administrative Office of the US Courts.

Although this protocol's simplicity limits its generality, it also makes it possible for the protocol to scale efficiently to a large number of participants. As Figure 5 illustrates, the protocol scales linearly with the number of parties. Even with 400 parties (the largest size we tested), the protocol still completed in just under 75 seconds. Extrapolating from the linear trend, it would take about three minutes to compute a summation across the entire federal judiciary. Since existing transparency reports are typically released just once or twice a year, it is reasonable to invest three minutes of computation (or less than a fifth of a second per judge) for each statistic.

6.2 Thresholds and Hierarchy with Jiff

To make use of MPC operations beyond totals, we turned to Jiff, another MPC library implemented in JavaScript. Jiff is designed to support MPC for arbitrary functionalities, although inbuilt support for some more complex functionalities is still under development at the time of writing. Most importantly for our needs, Jiff supports thresholding and multiplication in addition to sums. We evaluated Jiff on three different MPC protocols: totals (as with WebMPC), additive thresholding (i.e., how many values exceeded a specific threshold), and multiplicative thresholding (i.e., did all values exceed a specific threshold). In contrast to computing totals via summation, certain operations like thresholding require more complicated computation and multiple rounds of communication. By building on Jiff with our hierarchical MPC implementation, we demonstrate that these operations are viable at the scale required by the federal court system.

Figure 6: Flat total (red), additive threshold (blue), and multiplicative threshold (green) protocols in Jiff.

Figure 7: Hierarchical total (red), additive threshold (blue), and multiplicative threshold (green) protocols in Jiff. Note the difference in axis scales from Figure 6.

As a baseline, we ran sums, additive thresholding, and multiplicative thresholding benchmarks with all judges as full participants in the MPC protocol, sharing the workload equally, a configuration we term the flat protocol (in contrast to the hierarchical protocol we present next). Figure 6 illustrates that the running time of these protocols grows quadratically with the number of judges participating. These running times quickly became untenable: while summation took several minutes among hundreds of judges, both thresholding benchmarks could barely handle tens of judges in the same time envelopes. These graphs illustrate the substantial performance disparity between summation and thresholding.

In Section 4 we described an alternative "hierarchical" MPC configuration to reduce this quadratic growth to linear. As depicted in Figure 4, each lower-court judge splits a piece of data into twelve secret shares, one for each circuit court of appeals. These shares are sent to the corresponding courts, who conduct a twelve-party MPC that performs a total or thresholding operation based on the input shares. If n lower-court judges participate, the protocol is tantamount to computing n twelve-party summations followed by a single n-input summation or threshold. As n increases, the amount of work scales linearly. So long as at least one circuit court remains honest and uncompromised, the secrecy of the lower-court data endures by the security of the secret-sharing scheme.
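For totals, the hierarchical structure composes directly with the additive sharing sketched in Section 5: each circuit adds up the shares it received from all judges, and the twelve partial sums reconstruct the nationwide total. The sketch below simulates this flow in the clear; in the deployed protocol the twelve circuit servers run an actual MPC over their shares (which is what thresholding requires), and no circuit ever learns an individual judge's count.

```python
import secrets

P = 2**61 - 1
R = 12  # one share per circuit court of appeals

def share(value: int):
    shares = [secrets.randbelow(P) for _ in range(R - 1)]
    return shares + [(value - sum(shares)) % P]

def hierarchical_total(judge_counts):
    # Each judge splits its count into 12 shares, one per circuit.
    all_shares = [share(v) for v in judge_counts]
    # Each circuit adds the shares it received (one per judge): work linear in n.
    circuit_partials = [sum(js[i] for js in all_shares) % P for i in range(R)]
    # The 12 partial sums reconstruct the nationwide total.
    return sum(circuit_partials) % P

counts = [secrets.randbelow(500) for _ in range(1000)]  # roughly the federal judiciary
assert hierarchical_total(counts) == sum(counts)
```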

Figure 7 illustrates the linear scaling of the twelve-party portion of the hierarchical protocols; we measured only the computation time after the circuit courts received all of the additive shares from the lower courts. While the flat summation protocol took nearly eight minutes to run on 300 judges, the hierarchical summation scaled to 1000 judges in less than 20 seconds, besting even the WebMPC results. Although thresholding characteristically remained much slower than summation, the hierarchical protocol scaled to nearly 250 judges in about the same amount of time that it took the flat protocol to run on 35 judges. Since the running times for the threshold protocols were in the tens of minutes for large benchmarks, the linear trend is noisier than for the total protocol. Most importantly, both of these protocols scaled linearly, meaning that, given sufficient time, thresholding could scale up to the size of the federal court system. This performance is acceptable if a few choice thresholds are computed at the frequency at which existing transparency reports are published.¹⁸

One additional benefit of the hierarchical protocols is that lower courts do not need to stay online while the protocol is executing, a goal we articulated at the beginning of this section. A lower court simply needs to send in its shares to the requisite circuit courts, one message per circuit court, for a grand total of twelve messages, after which it is free to disconnect. In contrast, the flat protocol grinds to a halt if even a single judge goes offline. The availability of the hierarchical protocol relies on a small set of circuit courts, who could invest in more robust infrastructure.

                7 Evaluation of SNARKs

We define the syntax of preprocessing zero-knowledge SNARKs for arithmetic circuit satisfiability [15].

A SNARK is a triple of probabilistic polynomial-time algorithms SNARK = (Setup, Prove, Verify), as follows:

• Setup(1^κ, R) takes as input the security parameter κ and a description of a binary relation R (an arithmetic circuit of size polynomial in κ), and outputs a pair (pk_R, vk_R) of a proving key and a verification key.

• Prove(pk_R, (x, w)) takes as input a proving key pk_R and an input-witness pair (x, w), and outputs a proof π attesting to x ∈ L_R, where L_R = {x : ∃w s.t. (x, w) ∈ R}.

• Verify(vk_R, (x, π)) takes as input a verification key vk_R and an input-proof pair (x, π), and outputs a bit indicating whether π is a valid proof for x ∈ L_R.

18. Too high a frequency is also inadvisable due to the possibility of revealing too granular information when combined with the timings of specific investigations' court orders.

Before participants can create and verify SNARKs, they must establish a proving key, which any participant can use to create a SNARK, and a corresponding verification key, which any participant can use to verify a SNARK so created. Both of these keys are publicly known. The keys are distinct for each circuit (representing an NP relation) about which proofs are generated, and can be reused to produce as many different proofs with respect to that circuit as desired. Key generation uses randomness that, if known or biased, could allow participants to create proofs of false statements [13]. The key generation process must therefore protect and then destroy this information.

Using MPC to do key generation based on randomness provided by many different parties provides the guarantee that, as long as at least one of the MPC participants behaved correctly (i.e., did not bias his randomness and destroyed it afterward), the resulting keys are good (i.e., do not permit proofs of false statements). This approach has been used in the past, most notably by the cryptocurrency Zcash [8]. Despite the strong guarantees provided by this approach to key generation when at least one party is not corrupted, concerns have been expressed about the wisdom of trusting in the assumption of one honest party in the Zcash setting, which involves large monetary values and a system design inherently centered around the principles of full decentralization.

For our system, we propose that key generation be done in a one-time MPC among several of the traditionally reputable institutions in the court system, such as the Supreme Court or the Administrative Office of the US Courts, ideally together with other reputable parties from different branches of government. In our setting, the use of MPC for SNARK key generation does not constitute as pivotal and potentially risky a trust assumption as in Zcash, in that the court system is close-knit and inherently built on the assumption of trustworthiness of certain entities within the system. In contrast, a decentralized cryptocurrency (1) must, due to its distributed nature, rely for key generation on MPC participants that are essentially strangers to most others in the system, and (2) could be said to derive its very purpose from not relying on the trustworthiness of any small set of parties.

We note that, since key generation is a one-time task for each circuit, we can tolerate a relatively performance-intensive process. Proving and verification keys can be distributed on the ledger.

7.1 Argument Types

Our implementation supports three types of arguments.

Argument of knowledge for a commitment (P_k). Our simplest type of argument attests the prover's knowledge of the content of a given commitment c, i.e., that she could open the commitment if required. Whenever a party publishes a commitment, she can accompany it with a SNARK attesting that she knows the message and randomness that were used to generate the commitment. Formally, this is an argument that the prover knows m and ω that correspond to a publicly known c such that Open(m, c, ω) = 1.

Argument of commitment equality (P_eq). Our second type of argument attests that the content of two publicly known commitments c_1, c_2 is the same. That is, for two publicly known commitments c_1 and c_2, the prover knows m_1, m_2, ω_1, and ω_2 such that Open(m_1, c_1, ω_1) = 1 ∧ Open(m_2, c_2, ω_2) = 1 ∧ m_1 = m_2.

More concretely, suppose that an agency wishes to release relational information: that the identifier (e.g., email address) in the request is the same identifier that a judge approved. The judge and law enforcement agency post commitments c_1 and c_2, respectively, to the identifiers they used. The law enforcement agency then posts an argument attesting that the two commitments are to the same value.¹⁹ Since circuits use fixed-size inputs, an argument implicitly reveals the length of the committed message; to hide this information, the law enforcement agency can pad each input up to a uniform length.

P_eq may be too revealing under certain circumstances: for the public to verify the argument, the agency (who posted c_2) must explicitly identify c_1, potentially revealing which judge authorized the data request and when.

Existential argument of commitment equality (P_∃). Our third type of argument allows decreasing the resolution of the information revealed by proving that a commitment's content is the same as that of some other commitment among many. Formally, it shows that for publicly known commitments c, c_1, ..., c_N, respectively to secret values (m, ω), (m_1, ω_1), ..., (m_N, ω_N), there exists i such that Open(m, c, ω) = 1 ∧ Open(m_i, c_i, ω_i) = 1 ∧ m = m_i. We treat i as an additional secret input, so that for any value of N only two commitments need to be opened. This scheme trades off between resolution (number of commitments) and efficiency, a question we explore below.
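Stated as NP relations over the SHA-256 commitments of Section 7.2, the three argument types check the predicates sketched below; in the system these checks are compiled into arithmetic circuits and proven in zero knowledge rather than evaluated on revealed witnesses.

```python
import hashlib

def open_ok(m: bytes, c: bytes, omega: bytes) -> bool:
    return hashlib.sha256(m + omega).digest() == c

# P_k: the prover knows an opening (m, omega) of the public commitment c.
def relation_Pk(c, m, omega):
    return open_ok(m, c, omega)

# P_eq: the public commitments c1 and c2 open to the same message.
def relation_Peq(c1, c2, m1, omega1, m2, omega2):
    return open_ok(m1, c1, omega1) and open_ok(m2, c2, omega2) and m1 == m2

# P_exists: c opens to the same message as some c_i among c_1..c_N;
# the index i is part of the secret witness, so only two openings are checked.
def relation_Pexists(c, cs, m, omega, i, m_i, omega_i):
    return open_ok(m, c, omega) and open_ok(m_i, cs[i], omega_i) and m == m_i
```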

We have chosen these three types of arguments to implement, but LibSNARK supports arbitrary predicates in principle, and there are likely others that would be useful and run efficiently in practice. A useful generalization of P_eq and P_∃ would be to replace equality with more sophisticated domain-specific predicates: instead of showing that messages m_1, m_2 corresponding to a pair of commitments are equal, one could show p(m_1, m_2) = 1 for other predicates p (e.g., "less-than" or "signed by the same court"). The types of arguments that can be implemented efficiently will expand as SNARK libraries' efficiency improves; our system inherits such efficiency gains.

19. To produce a proof for P_eq, the prover (e.g., the agency) needs to know both ω_2 and ω_1, but in some cases c_1 (and thus ω_1) may have been produced by a different entity (e.g., the judge). Publicizing ω_1 is unacceptable, as it compromises the hiding of the commitment content. To solve this problem, the judge can include ω_1 alongside m_1 in secret documents that both parties possess (e.g., the court order).

7.2 Implementation

We implemented these zero-knowledge arguments with LibSNARK [34], a C++ library for creating general-purpose SNARKs from arithmetic circuits. We implemented commitments using the SHA256 hash function;²⁰ ω is a 256-bit random string appended to the message before it is hashed. In this section we show that useful statements can be proven within a reasonable performance envelope. We consider six criteria: the size of the proving key, the size of the verification key, the size of the proof statement, the time to generate keys, the time to create proofs, and the time to verify proofs. We evaluated these metrics with messages from 16 to 1232 bytes on P_k, P_eq, and P_∃ (N = 100, 400, 700, and 1000, large enough to obscure links between commitments), on a computer with 16 CPU cores and 64GB of RAM.

Argument size. The argument itself is just 287 bytes. Accompanying each argument are its public inputs (in this case, commitments). Each commitment is 256 bits.²¹ An auditor needs to store these commitments anyway as part of the ledger, and each commitment can be stored just once and reused for many proofs.

Verification key size. The size of the verification key is proportional to the size of the circuit and its public inputs. The key was 106KB for P_k (one commitment as input and one SHA256 circuit) and 208.3KB for P_eq (two commitments and two SHA256 circuits). Although P_∃ computes SHA256 just twice, its smallest input, 100 commitments, is 50 times as large as that of P_k and P_eq; the keys are correspondingly larger and grow linearly with the input size. For 100, 400, 700, and 1000 commitments, the verification keys were, respectively, 10MB, 41MB, 71MB, and 102MB. Since only one verification key is necessary for each circuit, these keys are easily small enough to make large-scale verification feasible.

Proving key size. The proving keys are much larger, in the hundreds of megabytes. Their size grows linearly with the size of the circuit, so longer messages (which require more SHA256 computations), more complicated circuits, and (for P_∃) more inputs lead to larger keys. Figure 8a reflects this trend. Proving keys are largest for P_∃ with 1000 inputs on 1232-byte messages, and shrink as the message size and the number of commitments decrease. P_k and P_eq, which have simpler circuits, still have bigger proving keys for bigger messages. Although these keys are large, only entities that create each kind of proof need to store the corresponding key. Storing one key for each type of argument we have presented takes only about 1GB at the largest input sizes.

20. Certain other hash functions may be more amenable to representation as arithmetic circuits, and thus more "SNARK-friendly." We opted for a proof of concept with SHA256, as it is so widely used.

21. LibSNARK stores each bit in a 32-bit integer, so an argument involving k commitments takes about 1024k bytes. A bit-vector representation would save a factor of 32.

Key generation time. Key generation time increased linearly with the size of the keys, from a few seconds for P_k and P_eq on small messages to a few minutes for P_∃ on the largest parameters (Figure 8b). Since key generation is a one-time process to add a new kind of proof in the form of a circuit, we find these numbers acceptable.

Argument generation time. Argument generation time increased linearly with proving key size and ranged from a few seconds on the smallest keys to a couple of minutes for the largest (Figure 8c). Since argument generation is a one-time task for each surveillance action, and the existing administrative processes for each surveillance action often take hours or days, we find this cost acceptable.

Argument verification time. Verifying P_k and P_eq on the largest message took only a few milliseconds. Verification times for P_∃ were larger and increased linearly with the number of input commitments: for 100, 400, 700, and 1000 commitments, verification took 40ms, 85ms, 243ms, and 338ms, respectively, on the largest input. These times are still fast enough to verify many arguments quickly.

                8 Generalization

Our proposal can be generalized beyond ECPA surveillance to encompass a broader class of secret information processes. Consider situations in which independent institutions need to act in a coordinated but secret fashion and at the same time are subject to public scrutiny: they should be able to convince the public that their actions are consistent with relevant rules. As in electronic surveillance, accountability requires the ability to attest to compliance without revealing sensitive information.

Example 1 (FISA court). Accountability is needed in other electronic surveillance arenas. The US Foreign Intelligence Surveillance Act (FISA) regulates surveillance in national security investigations. Because of the sensitive interests at stake, the entire process is overseen by a US court that meets in secret. The tension between secrecy and public accountability is even sharper for the FISA court: much of the data collected under FISA may stay permanently hidden inside US intelligence agencies, while data collected under ECPA may eventually be used in public criminal trials. This opacity may be justified, but it has engendered skepticism. The public has no way of knowing what the court is doing, nor any means of assuring itself that the intelligence agencies under the authority of FISA are even complying with the rules of that court. The FISA court itself has voiced concern that it has no independent means of assessing compliance with its orders because of the extreme secrecy involved. Applying our proposal to the FISA court, both the court and the public could receive proofs of documented compliance with FISA orders, as well as aggregate statistics on the scope of FISA surveillance activity, to the full extent possible without incurring national security risk.

Figure 8: SNARK evaluation. (a) Proving key size; (b) key generation time; (c) argument generation time.

Example 2 (Clinical trials). Accountability mechanisms are also important to assess the behavior of private parties, e.g., in clinical trials for new drugs. There are many parties to clinical trials, and much of the information involved is either private or proprietary. Yet regulators and the public have a need to know that responsible testing protocols are observed. Our system can achieve the right balance of transparency, accountability, and respect for the privacy of those involved in the trials.

Example 3 (Public fund spending). Accountability in the spending of taxpayer money is naturally a subject of public interest. Portions of public funds may be allocated for sensitive purposes (e.g., defense/intelligence), and the amounts and allocation thereof may be publicly unavailable due to their sensitivity. Our system would enable credible public assurances that taxpayer money is being spent in accordance with stated principles, while preserving secrecy of information considered sensitive.

8.1 Generalized Framework

We present abstractions describing the generalized version of our system and briefly outline how the concrete examples fit into this framework. A secret information process includes the following components:

• A set of participants interact with each other. In our ECPA example, these are judges, law enforcement agencies, and companies.

• The participants engage in a protocol (e.g., to execute the procedures for conducting electronic surveillance). The protocol messages exchanged are hidden from the view of outsiders (e.g., the public), and yet it is of public interest that the protocol messages exchanged adhere to certain rules.

• A set of auditors (distinct from the participants) seeks to audit the protocol by verifying that a set of accountability properties are met.

Abstractly, our system allows the controlled disclosure of four types of information.

Existential information reveals the existence of a piece of data, be it in a participant's local storage or the content of a communication between participants. In our case study, existential information is revealed with commitments, which indicate the existence of a document.

Relational information describes the actions participants take in response to the actions of others. In our case study, relational information is represented by the zero-knowledge arguments that attest that actions were taken lawfully (e.g., in compliance with a judge's order).

Content information is the data in storage and communication. In our case study, content information is revealed through aggregate statistics via MPC, and when documents are unsealed and their contents made public.

Timing information is a by-product of the other information. In our case study, timing information could include order issuance dates, turnaround times for data request fulfilment by companies, and seal expiry dates.

Revealing combinations of these four types of information with the specified cryptographic tools provides the flexibility to satisfy a range of application-specific accountability properties, as exemplified next.

Example 1 (FISA court). Participants are the FISA Court judges, the agencies requesting surveillance authorization, and any service providers involved in facilitating said surveillance. The protocol encompasses the legal process required to authorize surveillance, together with the administrative steps that must be taken to enact surveillance. Auditors are the public, the judges themselves, and possibly Congress. Desirable accountability properties are similar to those in our ECPA case study, e.g., attestations that certain rules are being followed in issuing surveillance orders, and release of aggregate statistics on surveillance activities under FISA.

Example 2 (Clinical trials). Participants are the institutions (companies or research centers) conducting clinical trials, comprising scientists, ethics boards, and data analysts; the organizations that manage regulations regarding clinical trials, such as the National Institutes of Health (NIH) and the Food and Drug Administration (FDA) in the US; and hospitals and other sources through which trial participants are drawn. The protocol encompasses the administrative process required to approve a clinical trial, and the procedure of gathering participants and conducting the trial itself. Auditors are the public, the regulatory organizations such as the NIH and the FDA, and possibly professional ethics committees. Desirable accountability properties include, e.g., attestations that appropriate procedures are respected in recruiting participants and administering trials, and release of aggregate statistics on clinical trial results without compromising individual participants' medical data.

Example 3 (Public fund spending). Participants are Congress (who appropriates the funding), defense/intelligence agencies, and service providers contracted in the spending of said funding. The protocol encompasses the processes by which Congress allocates funds to agencies and agencies allocate funds to particular expenses. Auditors are the public and Congress. Desirable accountability properties include, e.g., attestations that procurements were within reasonable margins of market prices and satisfied documented needs, and release of aggregate statistics on the proportion of allocated money used and broad spending categories.

                9 Conclusion

We present a cryptographic answer to the accountability challenge currently frustrating the US court system. Leveraging cryptographic commitments, zero-knowledge proofs, and secure MPC, we provide the electronic surveillance process a series of scalable, flexible, and practical measures for improving accountability while maintaining secrecy. While we focus on the case study of electronic surveillance, these strategies are equally applicable to a range of other secret information processes requiring accountability to an outside auditor.

                Acknowledgements

We are grateful to Judge Stephen Smith for discussion and insights from the perspective of the US court system, to Andrei Lapets, Kinan Dak Albab, Rawane Issa, and Frederick Joossens for discussion on Jiff and WebMPC, and to Madars Virza for advice on SNARKs and LibSNARK.

This research was supported by the following grants: NSF MACS (CNS-1413920), DARPA IBM (W911NF-15-C-0236), SIMONS Investigator Award Agreement Dated June 5th, 2012, and the Center for Science of Information (CSoI), an NSF Science and Technology Center, under grant agreement CCF-0939370.

References

[1] Bellman. https://github.com/ebfull/bellman.
[2] Electronic Communications Privacy Act, 18 U.S.C. § 2701 et seq.
[3] Foreign Intelligence Surveillance Act, 50 U.S.C. ch. 36.
[4] Google transparency report. https://www.google.com/transparencyreport/userdatarequests/countries/?p=2016-12.
[5] Jiff. https://github.com/multiparty/jiff.
[6] jsnark. https://github.com/akosba/jsnark.
[7] Law enforcement requests report. https://www.microsoft.com/en-us/about/corporate-responsibility/lerr.
[8] Zcash. https://z.cash.
[9] Wiretap report 2015. http://www.uscourts.gov/statistics-reports/wiretap-report-2015, December 2015.
[10] Administrative Office of the Courts. Authorized judgeships. http://www.uscourts.gov/sites/default/files/allauth.pdf.
[11] Araki, T., Barak, A., Furukawa, J., Lichter, T., Lindell, Y., Nof, A., Ohara, K., Watzman, A., and Weinstein, O. Optimized honest-majority MPC for malicious adversaries - breaking the 1 billion-gate per second barrier. In 2017 IEEE Symposium on Security and Privacy, SP 2017, San Jose, CA, USA, May 22-26, 2017 (2017), IEEE Computer Society, pp. 843-862.
[12] Bates, A., Butler, K. R., Sherr, M., Shields, C., Traynor, P., and Wallach, D. Accountable wiretapping -or- I know they can hear you now. NDSS (2012).
[13] Ben-Sasson, E., Chiesa, A., Green, M., Tromer, E., and Virza, M. Secure sampling of public parameters for succinct zero knowledge proofs. In Security and Privacy (SP), 2015 IEEE Symposium on (2015), IEEE, pp. 287-304.
[14] Ben-Sasson, E., Chiesa, A., Tromer, E., and Virza, M. Scalable zero knowledge via cycles of elliptic curves. IACR Cryptology ePrint Archive 2014 (2014), 595.
[15] Ben-Sasson, E., Chiesa, A., Tromer, E., and Virza, M. Succinct non-interactive zero knowledge for a von Neumann architecture. In 23rd USENIX Security Symposium (USENIX Security 14) (San Diego, CA, Aug. 2014), USENIX Association, pp. 781-796.
[16] Bestavros, A., Lapets, A., and Varia, M. User-centric distributed solutions for privacy-preserving analytics. Communications of the ACM 60, 2 (February 2017), 37-39.
[17] Chaum, D., and van Heyst, E. Group signatures. In Advances in Cryptology - EUROCRYPT '91, Workshop on the Theory and Application of Cryptographic Techniques, Brighton, UK, April 8-11, 1991, Proceedings (1991), D. W. Davies, Ed., vol. 547 of Lecture Notes in Computer Science, Springer, pp. 257-265.
[18] Damgård, I., and Ishai, Y. Constant-round multiparty computation using a black-box pseudorandom generator. In Advances in Cryptology - CRYPTO 2005, 25th Annual International Cryptology Conference, Santa Barbara, California, USA, August 14-18, 2005, Proceedings (2005), V. Shoup, Ed., vol. 3621 of Lecture Notes in Computer Science, Springer, pp. 378-394.
[19] Damgård, I., Keller, M., Larraia, E., Pastro, V., Scholl, P., and Smart, N. P. Practical covertly secure MPC for dishonest majority - or: Breaking the SPDZ limits. In Computer Security - ESORICS 2013 - 18th European Symposium on Research in Computer Security, Egham, UK, September 9-13, 2013, Proceedings (2013), J. Crampton, S. Jajodia, and K. Mayes, Eds., vol. 8134 of Lecture Notes in Computer Science, Springer, pp. 1-18.
[20] Feigenbaum, J., Jaggard, A. D., and Wright, R. N. Open vs. closed systems for accountability. In Proceedings of the 2014 Symposium and Bootcamp on the Science of Security (2014), ACM, p. 4.
[21] Feigenbaum, J., Jaggard, A. D., Wright, R. N., and Xiao, H. Systematizing "accountability" in computer science. Tech. rep., Yale University, Feb. 2012. Technical Report YALEU/DCS/TR-1452.
[22] Feuer, A., and Rosenberg, E. Brooklyn prosecutor accused of using illegal wiretap to spy on love interest. November 2016. https://www.nytimes.com/2016/11/28/nyregion/brooklyn-prosecutor-accused-of-using-illegal-wiretap-to-spy-on-love-interest.html.
[23] Goldwasser, S., and Park, S. Public accountability vs. secret laws: Can they coexist? A cryptographic proposal. In Proceedings of the 2017 Workshop on Privacy in the Electronic Society, Dallas, TX, USA, October 30 - November 3, 2017 (2017), B. M. Thuraisingham and A. J. Lee, Eds., ACM, pp. 99-110.
[24] Jakobsen, T. P., Nielsen, J. B., and Orlandi, C. A framework for outsourcing of secure computation. In Proceedings of the 6th edition of the ACM Workshop on Cloud Computing Security, CCSW '14, Scottsdale, Arizona, USA, November 7, 2014 (2014), G. Ahn, A. Oprea, and R. Safavi-Naini, Eds., ACM, pp. 81-92.
[25] Kamara, S. Restructuring the NSA metadata program. In International Conference on Financial Cryptography and Data Security (2014), Springer, pp. 235-247.
[26] Katz, J., and Lindell, Y. Introduction to Modern Cryptography. CRC Press, 2014.
[27] Kroll, J., Felten, E., and Boneh, D. Secure protocols for accountable warrant execution. 2014. http://www.cs.princeton.edu/~felten/warrant-paper.pdf.
[28] Lampson, B. Privacy and security: Usable security: How to get it. Communications of the ACM 52, 11 (2009), 25-27.
[29] Lapets, A., Volgushev, N., Bestavros, A., Jansen, F., and Varia, M. Secure MPC for analytics as a web application. In 2016 IEEE Cybersecurity Development (SecDev) (Boston, MA, USA, November 2016), pp. 73-74.
[30] Mashima, D., and Ahamad, M. Enabling robust information accountability in e-healthcare systems. In HealthSec (2012).
[31] Papanikolaou, N., and Pearson, S. A cross-disciplinary review of the concept of accountability: A survey of the literature. 2013. Available online at http://www.bic-trust.eu/files/2013/06/Paper_NP.pdf.
[32] Pearson, S. Toward accountability in the cloud. IEEE Internet Computing 15, 4 (2011), 64-69.
[33] Rivest, R. L., Shamir, A., and Tauman, Y. How to leak a secret. In Advances in Cryptology - ASIACRYPT 2001, 7th International Conference on the Theory and Application of Cryptology and Information Security, Gold Coast, Australia, December 9-13, 2001, Proceedings (2001), C. Boyd, Ed., vol. 2248 of Lecture Notes in Computer Science, Springer, pp. 552-565.
[34] SCIPR Lab. libsnark: a C++ library for zkSNARK proofs. https://github.com/scipr-lab/libsnark.
[35] Segal, A., Feigenbaum, J., and Ford, B. Privacy-preserving lawful contact chaining [preliminary report]. In Proceedings of the 2016 ACM Workshop on Privacy in the Electronic Society (New York, NY, USA, 2016), WPES '16, ACM, pp. 185-188.
[36] Segal, A., Ford, B., and Feigenbaum, J. Catching bandits and only bandits: Privacy-preserving intersection warrants for lawful surveillance. In FOCI (2014).
[37] Smith, S. W. Kudzu in the courthouse: Judgments made in the shade. The Federal Courts Law Review 3, 2 (2009).
[38] Smith, S. W. Gagged, sealed & delivered: Reforming ECPA's secret docket. Harvard Law & Policy Review 6 (2012), 313-459.
[39] Sundareswaran, S., Squicciarini, A., and Lin, D. Ensuring distributed accountability for data sharing in the cloud. IEEE Transactions on Dependable and Secure Computing 9, 4 (2012), 556-568.
[40] Tan, Y. S., Ko, R. K., and Holmes, G. Security and data accountability in distributed systems: A provenance survey. In High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing (HPCC/EUC), 2013 IEEE 10th International Conference on (2013), IEEE, pp. 1571-1578.
[41] Virza, M. Private communication, November 2017.
[42] Wang, X., Ranellucci, S., and Katz, J. Global-scale secure multiparty computation. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, CCS 2017, Dallas, TX, USA, October 30 - November 03, 2017 (2017), B. M. Thuraisingham, D. Evans, T. Malkin, and D. Xu, Eds., ACM, pp. 39-56.
[43] Weitzner, D. J., Abelson, H., Berners-Lee, T., Feigenbaum, J., Hendler, J. A., and Sussman, G. J. Information accountability. Commun. ACM 51, 6 (2008), 82-87.
[44] Xiao, Z., Kathiresshan, N., and Xiao, Y. A survey of accountability in computer networks and distributed systems. Security and Communication Networks 9, 4 (2016), 290-315.

                  A commitment scheme is a triple of probabilistic poly-time algorithms C= (SetupCommitOpen) as followsbull Setup(1κ) takes as input a security parameter κ (in

                  unary) and outputs public parameters ppbull Commit(ppmω) takes as input pp a message m

                  and randomness ω It outputs a commitment cbull Open(ppmprimecω prime) takes as input pp a message

                  mprime and randomness ω prime It outputs 1 if c =Commit(ppmprimeω prime) and 0 otherwise

                  pp is generated in an initial setup phase and thereafterpublicly known to all parties so we elide it for brevity

                  Algorithm 1 Law enforcement agency Ai

                  bull On receipt of a surveillance request event e =(Surveilus) from Env where u is the public key of acompany and s is the description of a surveillance requestdirected at u send message (us) to a judge13

                  bull On receipt of a decision message (usd) from a judgewhere d 6= reject14(1) generate a commitment c =Commit((sd)ω) to the request and store (csdω) lo-cally (2) generate a SNARK proof π attesting complianceof (sd) with relevant regulations (3) post (cπ) to theledger (4) send request (sdω) to company u

                  bull On receipt of an audit request (cPz) from the publicgenerate decision blarr Adp

                  i (cPz) If b = accept gener-ate a SNARK proof π attesting compliance of (sd) withthe regulations indicated by the audit request (cPz) elsesend (cPzb) to a judge13

                  bull On receipt of an audit order (dcPz) from a judge ifd = accept generate a SNARK proof π attesting com-pliance of (sd) with the regulations indicated by the auditrequest (cPz)

                  Agencies Each agency Ai has an associated decision-making process Adp

                  i modeled by a stateful algorithm thatmaps audit requests to acceptcup01lowast where the out-put is either an acceptance or a description of why theagency chooses to deny the request Each agency oper-

                  12For formal security definitions beyond syntax we refer to anystandard cryptography textbook such as [26]

                  13For the purposes of our exposition this could be an arbitrary judgeIn practice it would likely depend on the jurisdiction in which thesurveillance event occurs and in which the law enforcement agencyoperates and perhaps also on the type of case

                  14For simplicity of exposition Algorithm 1 only addresses the cased 6= reject and omits the possibility of appeal by the agency Thealgorithm could straightforwardly be extended to encompass appealsby incorporating the decision to appeal into Adp

                  i 15This is the step invoked by requests for unsealed documents

                  Algorithm 2 Judge Ji

                  bull On receipt of a surveillance request (us) from anagency A j (1) generate decision d larr Jdp1i (s) (2)send response (usd) to A j (3) generate a commit-ment c = Commit((usd)ω) to the decision and store(cusdω) locally (4) post (CoverSheet(d)c) to theledger

                  bull On receipt of denied audit request information ζ froman agency A j generate decision d larr Jdp2i (ζ ) and send(dζ ) to A j and to the public P

                  bull On receipt of a data revelation request (cz) from thepublic15generate decision blarr Jdp3i (cz) If b= acceptsend to the public P the message and randomness (mω)corresponding to c else if b = reject send reject toP with an accompanying explanation if provided

                  ates according to Algorithm 1 which is parametrized byits own Adp

                  i In practice we assume Adpi would be instan-

                  tiated by the agencyrsquos human decision-making process

                  Judges Each judge Ji has three associated decision-making processes Jdp1i Jdp2i and Jdp3i Jdp1i mapssurveillance requests to either a rejection or an authoriz-ing court order Jdp2i maps denied audit requests to eithera confirmation of the denial or a court order overturn-ing the denial and Jdp3i maps data revelation requeststo either an acceptance or a denial (perhaps along withan explanation of the denial eg ldquothis document is stillunder sealrdquo) Each judge operates according to Algo-rithm 2 which is parametrized by the individual judgersquos(Jdp1i Jdp2i Jdp3i )

                  Algorithm 3 Company Ci

                  bull Upon receiving a surveillance request (sdω) from anagency A j if the court order d bears the valid signatureof a judge and Commit((sd)ω) matches a correspond-ing commitment posted by law enforcement on the ledgerthen (1) generate commitment clarr Commit(δ ω) andstore (cδ ω) locally (2) generate a SNARK proof π at-testing that δ is compliant with a s the judge-signed orderd (3) post (cπ) anonymously to the ledger (4) reply toA j by furnishing the requested data δ along with ω

                  The public The public P exhibits one main type of be-havior in our model upon receiving an event messagee = (aξ ) from Env (describing either an audit requestor a data revelation request) P sends ξ to a (an agencyor court) Additionally the public periodically checksthe ledger for validity of posted SNARK proofs and takesteps to flag any non-compliance detected (eg throughthe court system or the news media)

                  Companies and trustees Algorithms 3 and 4 describecompanies and trustees Companies execute judge-

Algorithm 4: Trustee T_i

• Upon receiving an aggregate statistic event message e = (Compute, f, D_1, ..., D_n) from Env:

1. For each i′ ∈ [r] (such that i′ ≠ i), send e to T_i′.
2. For each j ∈ [n], send the message (f, D_j) to J_j. Let δ_ji be the response from J_j.
3. With parties T_1, ..., T_r, participate in the MPC protocol MPC with input (δ_1i, ..., δ_ni) to compute the functionality f ∘ ReconInputs (i.e., first reconstruct the judges' inputs, then apply f), where ReconInputs is defined as follows:

   ReconInputs( ((δ_1i′, ..., δ_ni′))_{i′ ∈ [r]} ) = ( Recon(δ_j1, ..., δ_jr) )_{j ∈ [n]}

   Let y denote the output from the MPC.16
4. Send y to J_j for each j ∈ [n].17

• Upon receiving an MPC initiation message e = (Compute, f, D_1, ..., D_n) from another trustee T_i′:

1. Receive a secret-share δ_ji from each judge J_j, respectively.
2. With parties T_1, ..., T_r, participate in the MPC protocol MPC with input (δ_1i, ..., δ_ni) to compute the functionality f ∘ ReconInputs.

Companies execute judge-authorized instructions and log their actions by posting commitments on the ledger. Trustees run MPC to compute aggregate statistics from data provided in secret-shared form by judges. MPC events are triggered by Env.
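For intuition about Share, Recon, and the functionality the trustees compute, the following Python sketch shows additive secret sharing over a prime field and the reconstruct-then-evaluate step; in the real protocol this computation happens inside the MPC rather than in the clear, and the prime modulus and helper names are illustrative.

import random

P = 2**61 - 1  # public prime modulus (illustrative)

def share(x, r):
    # Split x into r additive shares whose sum mod P equals x.
    shares = [random.randrange(P) for _ in range(r - 1)]
    shares.append((x - sum(shares)) % P)
    return shares

def recon(shares):
    # Recon: recombine additive shares into the original value.
    return sum(shares) % P

def recon_inputs_then_f(f, shares_by_trustee):
    # shares_by_trustee[i'][j] is the share of judge j held by trustee i'.
    n = len(shares_by_trustee[0])
    inputs = [recon([t[j] for t in shares_by_trustee]) for j in range(n)]
    return f(inputs)

# Example: 3 judges, 12 trustees, f = sum of the judges' counts.
judges = [5, 9, 2]
shares_by_trustee = list(zip(*[share(x, 12) for x in judges]))  # per-trustee view
assert recon_inputs_then_f(lambda xs: sum(xs) % P, shares_by_trustee) == sum(judges)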

                  6 Evaluation of MPC Implementation

In our proposal, judges use secure multiparty computation (MPC) to compute aggregate statistics about the extent and distribution of surveillance. Although in principle MPC can support secure computation of any function of the judges' data, full generality can come with unacceptable performance limitations. In order that our protocols scale to hundreds of federal judges, we narrow our attention to two kinds of functions that are particularly useful in the context of surveillance.

The extent of surveillance (totals). Computing totals involves summing values held by the parties without revealing information about any value to anyone other than its owner. Totals become averages by dividing by the number of data points. In the context of electronic surveillance, totals are the most prevalent form of computation on government and corporate transparency reports. How many court orders were approved for cases involving homicide, and how many for drug offenses? How long was the average order in effect? How many orders were issued in California [9]? Totals make it possible to determine the extent of surveillance.

The distribution of surveillance (thresholds). Thresholding involves determining the number of data points that exceed a given cut-off. How many courts issued more than ten orders for data from Google? How many orders were in effect for more than 90 days? Unlike totals, thresholds can reveal selected facts about the distribution of surveillance, i.e., the circumstances in which it is most prevalent. Thresholds go beyond the kinds of questions typically answered in transparency reports, offering new opportunities to improve accountability.
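In plain terms, the two functionalities are the following; this Python fragment shows them on cleartext values purely for illustration, whereas under MPC no party ever sees an individual judge's input.

def total(values):
    # Extent of surveillance: the sum of all judges' reported counts.
    # Dividing by len(values) turns a total into an average.
    return sum(values)

def threshold_count(values, cutoff):
    # Distribution of surveillance: how many inputs exceed the cutoff.
    return sum(1 for v in values if v > cutoff)

# Example: days each order was in effect; how many exceeded 90 days?
durations = [30, 120, 95, 10]
print(total(durations))                # 255
print(threshold_count(durations, 90))  # 2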

To enable totals and thresholds to scale to the size of the federal court system, we implemented a hierarchical MPC protocol, as described in Figure 4, whose design mirrors the hierarchy of the court system. Our evaluation shows the hierarchical structure reduces MPC complexity from quadratic in the number of judges to linear.

We implemented protocols that make use of totals and thresholds using two existing JavaScript-based MPC libraries: WebMPC [16, 29] and Jiff [5]. WebMPC is the simpler and less versatile library; we test it as a baseline and as a "sanity check" that its performance scales as expected, then move on to the more interesting experiment of evaluating Jiff. We opted for JavaScript libraries to facilitate integration into a web application, which is suitable for federal judges to submit information through a familiar browser interface regardless of the differences in their local system setups. Both of these libraries are designed to facilitate MPC across dozens or hundreds of computers; we simulated this effect by running each party in a separate process on a computer with 16 CPU cores and 64GB of RAM. We tested these protocols on randomly generated data containing values in the hundreds, which reflects the same order of magnitude as data present in existing transparency reports. Our implementations were crafted with two design goals in mind:

1. Protocols should scale to roughly 1,000 parties, the approximate size of the federal judiciary [10], performing efficiently enough to facilitate periodic transparency reports.

2. Protocols should not require all parties to be online regularly or at the same time.

In the subsections that follow, we describe and evaluate our implementations in light of these goals.

6.1 Computing Totals in WebMPC

WebMPC is a JavaScript-based library that can securely compute sums in a single round. The underlying protocol relies on two parties who are trusted not to collude with one another: an analyst, who distributes masking information to all protocol participants at the beginning of the process and receives the final aggregate statistic, and a facilitator, who aggregates this information together in masked form.

Figure 5: Performance of MPC using WebMPC library.

The participants use the masking information from the analyst to mask their inputs and send them to the facilitator, who aggregates them and sends the result (i.e., a masked sum) to the analyst. The analyst removes the mask and uncovers the aggregated result. Once the participants have their masks, they receive no further messages from any other party; they can submit this masked data to the facilitator in an uncoordinated fashion and go offline immediately afterwards. Even if some anticipated participants do not send data, the protocol can still run to completion with those who remain.
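The structure of this single-round protocol is simple enough to sketch. The following Python fragment illustrates the masking idea only (it is not WebMPC's actual API); privacy rests on the analyst and the facilitator not colluding.

import random

P = 2**61 - 1  # arithmetic is modulo a public prime

def make_masks(n):
    # Analyst: one random mask per participant, plus their total for unmasking.
    masks = [random.randrange(P) for _ in range(n)]
    return masks, sum(masks) % P

def mask_input(x, mask):
    # Participant: masks its input and sends only the masked value onward.
    return (x + mask) % P

def facilitate(masked_inputs):
    # Facilitator: adds the masked inputs; it never sees the masks.
    return sum(masked_inputs) % P

def unmask(masked_sum, mask_total):
    # Analyst: removes the aggregate mask to recover the true sum.
    return (masked_sum - mask_total) % P

# Example with four participating judges reporting order counts.
inputs = [12, 7, 0, 31]
masks, mask_total = make_masks(len(inputs))
masked = [mask_input(x, m) for x, m in zip(inputs, masks)]
assert unmask(facilitate(masked), mask_total) == sum(inputs)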

To make this protocol feasible in our setting, we need to identify a facilitator and analyst who will not collude. In many circumstances it would be acceptable to rely on reputable institutions already present in the court system, such as the circuit courts of appeals, the Supreme Court, or the Administrative Office of the US Courts.

Although this protocol's simplicity limits its generality, it also makes it possible for the protocol to scale efficiently to a large number of participants. As Figure 5 illustrates, the protocol scales linearly with the number of parties. Even with 400 parties (the largest size we tested), the protocol still completed in just under 75 seconds. Extrapolating from the linear trend, it would take about three minutes to compute a summation across the entire federal judiciary. Since existing transparency reports are typically released just once or twice a year, it is reasonable to invest three minutes of computation (or less than a fifth of a second per judge) for each statistic.
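(Concretely: 75 seconds for 400 parties is just under 0.19 seconds per party, so roughly 1,000 judges would take on the order of 190 seconds, consistent with the three-minute figure.)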

6.2 Thresholds and Hierarchy with Jiff

To make use of MPC operations beyond totals, we turned to Jiff, another MPC library implemented in JavaScript. Jiff is designed to support MPC for arbitrary functionalities, although inbuilt support for some more complex functionalities is still under development at the time of writing. Most importantly for our needs, Jiff supports thresholding and multiplication in addition to sums. We evaluated Jiff on three different MPC protocols: totals (as with WebMPC), additive thresholding (i.e., how many values exceeded a specific threshold), and multiplicative thresholding (i.e., did all values exceed a specific threshold). In contrast to computing totals via summation, certain operations like thresholding require more complicated computation and multiple rounds of communication.

Figure 6: Flat total (red), additive threshold (blue), and multiplicative threshold (green) protocols in Jiff.

Figure 7: Hierarchical total (red), additive threshold (blue), and multiplicative threshold (green) protocols in Jiff. Note the difference in axis scales from Figure 6.

By building on Jiff with our hierarchical MPC implementation, we demonstrate that these operations are viable at the scale required by the federal court system.

As a baseline, we ran sums, additive thresholding, and multiplicative thresholding benchmarks with all judges as full participants in the MPC protocol, sharing the workload equally, a configuration we term the flat protocol (in contrast to the hierarchical protocol we present next). Figure 6 illustrates that the running time of these protocols grows quadratically with the number of judges participating. These running times quickly became untenable. While summation took several minutes among hundreds of judges, both thresholding benchmarks could barely handle tens of judges in the same time envelopes. These graphs illustrate the substantial performance disparity between summation and thresholding.

In Section 4 we described an alternative "hierarchical" MPC configuration to reduce this quadratic growth to linear. As depicted in Figure 4, each lower-court judge splits a piece of data into twelve secret shares, one for each circuit court of appeals. These shares are sent to the corresponding courts, who conduct a twelve-party MPC that performs a total or thresholding operation based on the input shares. If n lower-court judges participate, the protocol is tantamount to computing n twelve-party summations followed by a single n-input summation or threshold. As n increases, the amount of work scales linearly. So long as at least one circuit court remains honest and uncompromised, the secrecy of the lower court data endures by the security of the secret-sharing scheme.
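A simplified Python sketch of this arrangement, run in the clear here (in the real system the twelve circuit courts would combine their partial sums under MPC), makes the linear scaling in n visible: each judge does a constant amount of work, and each circuit court does work proportional to n.

import random

P = 2**61 - 1
NUM_CIRCUITS = 12

def split_into_circuit_shares(value):
    # Each lower-court judge sends one additive share to each circuit court.
    shares = [random.randrange(P) for _ in range(NUM_CIRCUITS - 1)]
    shares.append((value - sum(shares)) % P)
    return shares

def hierarchical_total(judge_values):
    # Phase 1: each circuit court locally accumulates the shares it received.
    circuit_sums = [0] * NUM_CIRCUITS
    for v in judge_values:
        for circuit, s in enumerate(split_into_circuit_shares(v)):
            circuit_sums[circuit] = (circuit_sums[circuit] + s) % P
    # Phase 2: a single twelve-party combination of the circuit courts' sums.
    return sum(circuit_sums) % P

assert hierarchical_total([3, 14, 8, 0, 22]) == 47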

Figure 7 illustrates the linear scaling of the twelve-party portion of the hierarchical protocols; we measured only the computation time after the circuit courts received all of the additive shares from the lower courts. While the flat summation protocol took nearly eight minutes to run on 300 judges, the hierarchical summation scaled to 1000 judges in less than 20 seconds, besting even the WebMPC results. Although thresholding characteristically remained much slower than summation, the hierarchical protocol scaled to nearly 250 judges in about the same amount of time that it took the flat protocol to run on 35 judges. Since the running times for the threshold protocols were in the tens of minutes for large benchmarks, the linear trend is noisier than for the total protocol. Most importantly, both of these protocols scaled linearly, meaning that, given sufficient time, thresholding could scale up to the size of the federal court system. This performance is acceptable if a few choice thresholds are computed at the frequency at which existing transparency reports are published.18

One additional benefit of the hierarchical protocols is that lower courts do not need to stay online while the protocol is executing, a goal we articulated at the beginning of this section. A lower court simply needs to send in its shares to the requisite circuit courts, one message per circuit court, for a grand total of twelve messages, after which it is free to disconnect. In contrast, the flat protocol grinds to a halt if even a single judge goes offline. The availability of the hierarchical protocol relies on a small set of circuit courts, who could invest in more robust infrastructure.

                  7 Evaluation of SNARKs

We define the syntax of preprocessing zero-knowledge SNARKs for arithmetic circuit satisfiability [15].

A SNARK is a triple of probabilistic polynomial-time algorithms SNARK = (Setup, Prove, Verify), as follows:

• Setup(1^κ, R) takes as input the security parameter κ and a description of a binary relation R (an arithmetic circuit of size polynomial in κ), and outputs a pair (pk_R, vk_R) of a proving key and verification key.

• Prove(pk_R, (x, w)) takes as input a proving key pk_R and an input-witness pair (x, w), and outputs a proof π attesting to x ∈ L_R, where L_R = {x : ∃w s.t. (x, w) ∈ R}.

• Verify(vk_R, (x, π)) takes as input a verification key vk_R and an input-proof pair (x, π), and outputs a bit indicating whether π is a valid proof for x ∈ L_R.
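Implicit in this syntax is the standard completeness requirement: for every (x, w) ∈ R and (pk_R, vk_R) ← Setup(1^κ, R), we have Verify(vk_R, (x, Prove(pk_R, (x, w)))) = 1, so honestly generated proofs always verify; soundness and zero-knowledge are the usual properties of such argument systems.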

18 Too high a frequency is also inadvisable due to the possibility of revealing overly granular information when combined with the timings of specific investigations' court orders.

Before participants can create and verify SNARKs, they must establish a proving key, which any participant can use to create a SNARK, and a corresponding verification key, which any participant can use to verify a SNARK so created. Both of these keys are publicly known. The keys are distinct for each circuit (representing an NP relation) about which proofs are generated, and can be reused to produce as many different proofs with respect to that circuit as desired. Key generation uses randomness that, if known or biased, could allow participants to create proofs of false statements [13]. The key generation process must therefore protect and then destroy this information.

Using MPC to do key generation based on randomness provided by many different parties provides the guarantee that, as long as at least one of the MPC participants behaved correctly (i.e., did not bias his randomness and destroyed it afterward), the resulting keys are good (i.e., do not permit proofs of false statements). This approach has been used in the past, most notably by the cryptocurrency Zcash [8]. Despite the strong guarantees provided by this approach to key generation when at least one party is not corrupted, concerns have been expressed about the wisdom of trusting in the assumption of one honest party in the Zcash setting, which involves large monetary values and a system design inherently centered around the principles of full decentralization.

For our system, we propose key generation be done in a one-time MPC among several of the traditionally reputable institutions in the court system, such as the Supreme Court or Administrative Office of the US Courts, ideally together with other reputable parties from different branches of government. In our setting, the use of MPC for SNARK key generation does not constitute as pivotal and potentially risky a trust assumption as in Zcash, in that the court system is close-knit and inherently built with the assumption of trustworthiness of certain entities within the system. In contrast, a decentralized cryptocurrency (1) must, due to its distributed nature, rely for key generation on MPC participants that are essentially strangers to most others in the system, and (2) could be said to derive its very purpose from not relying on the trustworthiness of any small set of parties.

We note that since key generation is a one-time task for each circuit, we can tolerate a relatively performance-intensive process. Proving and verification keys can be distributed on the ledger.

7.1 Argument Types

Our implementation supports three types of arguments.

Argument of knowledge for a commitment (P_k). Our simplest type of argument attests the prover's knowledge of the content of a given commitment c, i.e., that she could open the commitment if required. Whenever a party publishes a commitment, she can accompany it with a SNARK attesting that she knows the message and randomness that were used to generate the commitment. Formally, this is an argument that the prover knows m and ω that correspond to a publicly known c such that Open(m, c, ω) = 1.

Argument of commitment equality (P_eq). Our second type of argument attests that the content of two publicly known commitments c_1, c_2 is the same. That is, for two publicly known commitments c_1 and c_2, the prover knows m_1, m_2, ω_1, and ω_2 such that Open(m_1, c_1, ω_1) = 1 ∧ Open(m_2, c_2, ω_2) = 1 ∧ m_1 = m_2.

More concretely, suppose that an agency wishes to release relational information: that the identifier (e.g., email address) in the request is the same identifier that a judge approved. The judge and law enforcement agency post commitments c_1 and c_2, respectively, to the identifiers they used. The law enforcement agency then posts an argument attesting that the two commitments are to the same value.19 Since circuits use fixed-size inputs, an argument implicitly reveals the length of the committed message. To hide this information, the law enforcement agency can pad each input up to a uniform length.

P_eq may be too revealing under certain circumstances: for the public to verify the argument, the agency (who posted c_2) must explicitly identify c_1, potentially revealing which judge authorized the data request and when.

Existential argument of commitment equality (P_∃). Our third type of argument allows decreasing the resolution of the information revealed by proving that a commitment's content is the same as that of some other commitment among many. Formally, it shows that for publicly known commitments c, c_1, ..., c_N, respectively to secret values (m, ω), (m_1, ω_1), ..., (m_N, ω_N), ∃i such that Open(m, c, ω) = 1 ∧ Open(m_i, c_i, ω_i) = 1 ∧ m = m_i. We treat i as an additional secret input, so that for any value of N only two commitments need to be opened. This scheme trades off between resolution (number of commitments) and efficiency, a question we explore below.
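Spelled out as code, the NP relation underlying P_∃ is the following; this is a Python sketch using the SHA256-based commitments of Section 7.2, whereas the actual circuit is built in LibSNARK, and the index i travels as part of the prover's secret witness.

import hashlib

def open_ok(m: bytes, c: bytes, omega: bytes) -> bool:
    # Open(m, c, omega) = 1 iff c = SHA256(m || omega).
    return hashlib.sha256(m + omega).digest() == c

def exists_relation(c, cs, m, omega, i, m_i, omega_i) -> bool:
    # Public inputs: c and the list cs = [c_1, ..., c_N].
    # Secret witness: (m, omega, i, m_i, omega_i).
    # Only two commitments are opened, regardless of N.
    return open_ok(m, c, omega) and open_ok(m_i, cs[i], omega_i) and m == m_i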

We have chosen these three types of arguments to implement, but LibSNARK supports arbitrary predicates in principle, and there are likely others that would be useful and run efficiently in practice. A useful generalization of P_eq and P_∃ would be to replace equality with more sophisticated domain-specific predicates: instead of showing that messages m_1, m_2 corresponding to a pair of commitments are equal, one could show p(m_1, m_2) = 1 for other predicates p (e.g., "less-than" or "signed by same court"). The types of arguments that can be implemented efficiently will expand as SNARK libraries' efficiency improves; our system inherits such efficiency gains.

19 To produce a proof for P_eq, the prover (e.g., the agency) needs to know both ω_2 and ω_1, but in some cases c_1 (and thus ω_1) may have been produced by a different entity (e.g., the judge). Publicizing ω_1 is unacceptable, as it compromises the hiding of the commitment content. To solve this problem, the judge can include ω_1 alongside m_1 in secret documents that both parties possess (e.g., the court order).

7.2 Implementation

We implemented these zero-knowledge arguments with LibSNARK [34], a C++ library for creating general-purpose SNARKs from arithmetic circuits. We implemented commitments using the SHA256 hash function;20 ω is a 256-bit random string appended to the message before it is hashed. In this section we show that useful statements can be proven within a reasonable performance envelope. We consider six criteria: the size of the proving key, the size of the verification key, the size of the proof statement, the time to generate keys, the time to create proofs, and the time to verify proofs. We evaluated these metrics with messages from 16 to 1232 bytes on P_k, P_eq, and P_∃ (N = 100, 400, 700, and 1000, large enough to obscure links between commitments), on a computer with 16 CPU cores and 64GB of RAM.

Argument size. The argument is just 287 bytes. Accompanying each argument are its public inputs (in this case, commitments). Each commitment is 256 bits.21 An auditor needs to store these commitments anyway as part of the ledger, and each commitment can be stored just once and reused for many proofs.

Verification key size. The size of the verification key is proportional to the size of the circuit and its public inputs. The key was 106KB for P_k (one commitment as input and one SHA256 circuit) and 208.3KB for P_eq (two commitments and two SHA256 circuits). Although P_∃ computes SHA256 just twice, its smallest input, 100 commitments, is 50 times as large as that of P_k and P_eq; the keys are correspondingly larger and grow linearly with the input size. For 100, 400, 700, and 1000 commitments, the verification keys were, respectively, 10MB, 41MB, 71MB, and 102MB. Since only one verification key is necessary for each circuit, these keys are easily small enough to make large-scale verification feasible.

Proving key size. The proving keys are much larger, in the hundreds of megabytes. Their size grows linearly with the size of the circuit, so longer messages (which require more SHA256 computations), more complicated circuits, and (for P_∃) more inputs lead to larger keys. Figure 8a reflects this trend. Proving keys are largest for P_∃ with 1000 inputs on 1232-byte messages, and shrink as the message size and the number of commitments decrease. P_k and P_eq, which have simpler circuits, still have bigger proving keys for bigger messages. Although these keys are large, only entities that create each kind of proof need to store the corresponding key. Storing one key for each type of argument we have presented takes only about 1GB at the largest input sizes.

20 Certain other hash functions may be more amenable to representation as arithmetic circuits and thus more "SNARK-friendly". We opted for a proof of concept with SHA256, as it is so widely used.

21 LibSNARK stores each bit in a 32-bit integer, so an argument involving k commitments takes about 1024k bytes. A bit-vector representation would save a factor of 32.

Key generation time. Key generation time increased linearly with the size of the keys, from a few seconds for P_k and P_eq on small messages to a few minutes for P_∃ on the largest parameters (Figure 8b). Since key generation is a one-time process to add a new kind of proof in the form of a circuit, we find these numbers acceptable.

Argument generation time. Argument generation time increased linearly with proving key size and ranged from a few seconds on the smallest keys to a couple of minutes for the largest (Figure 8c). Since argument generation is a one-time task for each surveillance action, and the existing administrative processes for each surveillance action often take hours or days, we find this cost acceptable.

Argument verification time. Verifying P_k and P_eq on the largest message took only a few milliseconds. Verification times for P_∃ were larger and increased linearly with the number of input commitments. For 100, 400, 700, and 1000 commitments, verification took 40ms, 85ms, 243ms, and 338ms on the largest input. These times are still fast enough to verify many arguments quickly.

                  8 Generalization

Our proposal can be generalized beyond ECPA surveillance to encompass a broader class of secret information processes. Consider situations in which independent institutions need to act in a coordinated but secret fashion, and at the same time are subject to public scrutiny. They should be able to convince the public that their actions are consistent with relevant rules. As in electronic surveillance, accountability requires the ability to attest to compliance without revealing sensitive information.

Example 1 (FISA court). Accountability is needed in other electronic surveillance arenas. The US Foreign Intelligence Surveillance Act (FISA) regulates surveillance in national security investigations. Because of the sensitive interests at stake, the entire process is overseen by a US court that meets in secret. The tension between secrecy and public accountability is even sharper for the FISA court: much of the data collected under FISA may stay permanently hidden inside US intelligence agencies, while data collected under ECPA may eventually be used in public criminal trials. This opacity may be justified, but it has engendered skepticism. The public has no way of knowing what the court is doing, nor any means of assuring itself that the intelligence agencies under the authority of FISA are even complying with the rules of that court. The FISA court itself has voiced concern that it has no independent means of assessing compliance with its orders because of the extreme secrecy involved. Applying our proposal to the FISA court, both the court and the public could receive proofs of documented compliance with FISA orders, as well as aggregate statistics on the scope of FISA surveillance activity, to the full extent possible without incurring national security risk.

Figure 8: SNARK evaluation. (a) Proving key size; (b) Key generation time; (c) Argument generation time.

Example 2 (Clinical trials). Accountability mechanisms are also important to assess behavior of private parties, e.g., in clinical trials for new drugs. There are many parties to clinical trials, and much of the information involved is either private or proprietary. Yet regulators and the public have a need to know that responsible testing protocols are observed. Our system can achieve the right balance of transparency, accountability, and respect for privacy of those involved in the trials.

Example 3 (Public fund spending). Accountability in spending of taxpayer money is naturally a subject of public interest. Portions of public funds may be allocated for sensitive purposes (e.g., defense/intelligence), and the amounts and allocation thereof may be publicly unavailable due to their sensitivity. Our system would enable credible public assurances that taxpayer money is being spent in accordance with stated principles, while preserving secrecy of information considered sensitive.

8.1 Generalized Framework

We present abstractions describing the generalized version of our system and briefly outline how the concrete examples fit into this framework. A secret information process includes the following components:

• A set of participants interact with each other. In our ECPA example, these are judges, law enforcement agencies, and companies.

• The participants engage in a protocol (e.g., to execute the procedures for conducting electronic surveillance). The protocol messages exchanged are hidden from the view of outsiders (e.g., the public), and yet it is of public interest that the protocol messages exchanged adhere to certain rules.

• A set of auditors (distinct from the participants) seeks to audit the protocol by verifying that a set of accountability properties are met.

Abstractly, our system allows the controlled disclosure of four types of information.

Existential information reveals the existence of a piece of data, be it in a participant's local storage or the content of a communication between participants. In our case study, existential information is revealed with commitments, which indicate the existence of a document.

Relational information describes the actions participants take in response to the actions of others. In our case study, relational information is represented by the zero-knowledge arguments that attest that actions were taken lawfully (e.g., in compliance with a judge's order).

Content information is the data in storage and communication. In our case study, content information is revealed through aggregate statistics via MPC, and when documents are unsealed and their contents made public.

Timing information is a by-product of the other information. In our case study, timing information could include order issuance dates, turnaround times for data request fulfilment by companies, and seal expiry dates.

Revealing combinations of these four types of information with the specified cryptographic tools provides the flexibility to satisfy a range of application-specific accountability properties, as exemplified next.

Example 1 (FISA court). Participants are the FISA Court judges, the agencies requesting surveillance authorization, and any service providers involved in facilitating said surveillance. The protocol encompasses the legal process required to authorize surveillance, together with the administrative steps that must be taken to enact surveillance. Auditors are the public, the judges themselves, and possibly Congress. Desirable accountability properties are similar to those in our ECPA case study: e.g., attestations that certain rules are being followed in issuing surveillance orders, and release of aggregate statistics on surveillance activities under FISA.

Example 2 (Clinical trials). Participants are the institutions (companies or research centers) conducting clinical trials, comprising scientists, ethics boards, and data analysts; the organizations that manage regulations regarding clinical trials, such as the National Institutes of Health (NIH) and the Food and Drug Administration (FDA) in the US; and hospitals and other sources through which trial participants are drawn. The protocol encompasses the administrative process required to approve a clinical trial, and the procedure of gathering participants and conducting the trial itself. Auditors are the public, the regulatory organizations such as the NIH and the FDA, and possibly professional ethics committees. Desirable accountability properties include, e.g., attestations that appropriate procedures are respected in recruiting participants and administering trials, and release of aggregate statistics on clinical trial results without compromising individual participants' medical data.

Example 3 (Public fund spending). Participants are Congress (who appropriates the funding), defense/intelligence agencies, and service providers contracted in the spending of said funding. The protocol encompasses the processes by which Congress allocates funds to agencies and agencies allocate funds to particular expenses. Auditors are the public and Congress. Desirable accountability properties include, e.g., attestations that procurements were within reasonable margins of market prices and satisfied documented needs, and release of aggregate statistics on the proportion of allocated money used and broad spending categories.

                  9 Conclusion

We present a cryptographic answer to the accountability challenge currently frustrating the US court system. Leveraging cryptographic commitments, zero-knowledge proofs, and secure MPC, we provide the electronic surveillance process a series of scalable, flexible, and practical measures for improving accountability while maintaining secrecy. While we focus on the case study of electronic surveillance, these strategies are equally applicable to a range of other secret information processes requiring accountability to an outside auditor.

                  Acknowledgements

We are grateful to Judge Stephen Smith for discussion and insights from the perspective of the US court system, to Andrei Lapets, Kinan Dak Albab, Rawane Issa, and Frederick Joossens for discussion on Jiff and WebMPC, and to Madars Virza for advice on SNARKs and LibSNARK.

This research was supported by the following grants: NSF MACS (CNS-1413920), DARPA IBM (W911NF-15-C-0236), SIMONS Investigator Award Agreement dated June 5th, 2012, and the Center for Science of Information (CSoI), an NSF Science and Technology Center, under grant agreement CCF-0939370.

References

[1] Bellman. https://github.com/ebfull/bellman
[2] Electronic Communications Privacy Act, 18 U.S.C. 2701 et seq.
[3] Foreign Intelligence Surveillance Act, 50 U.S.C. ch. 36.
[4] Google transparency report. https://www.google.com/transparencyreport/userdatarequests/countries/?p=2016-12
[5] Jiff. https://github.com/multiparty/jiff
[6] Jsnark. https://github.com/akosba/jsnark
[7] Law enforcement requests report. https://www.microsoft.com/en-us/about/corporate-responsibility/lerr
[8] Zcash. https://z.cash
[9] Wiretap report 2015. http://www.uscourts.gov/statistics-reports/wiretap-report-2015, December 2015.
[10] ADMINISTRATIVE OFFICE OF THE COURTS. Authorized judgeships. http://www.uscourts.gov/sites/default/files/allauth.pdf
[11] ARAKI, T., BARAK, A., FURUKAWA, J., LICHTER, T., LINDELL, Y., NOF, A., OHARA, K., WATZMAN, A., AND WEINSTEIN, O. Optimized honest-majority MPC for malicious adversaries – breaking the 1 billion-gate per second barrier. In 2017 IEEE Symposium on Security and Privacy, SP 2017, San Jose, CA, USA, May 22-26, 2017 (2017), IEEE Computer Society, pp. 843–862.
[12] BATES, A., BUTLER, K. R., SHERR, M., SHIELDS, C., TRAYNOR, P., AND WALLACH, D. Accountable wiretapping – or – I know they can hear you now. NDSS (2012).
[13] BEN-SASSON, E., CHIESA, A., GREEN, M., TROMER, E., AND VIRZA, M. Secure sampling of public parameters for succinct zero knowledge proofs. In Security and Privacy (SP), 2015 IEEE Symposium on (2015), IEEE, pp. 287–304.
[14] BEN-SASSON, E., CHIESA, A., TROMER, E., AND VIRZA, M. Scalable zero knowledge via cycles of elliptic curves. IACR Cryptology ePrint Archive 2014 (2014), 595.
[15] BEN-SASSON, E., CHIESA, A., TROMER, E., AND VIRZA, M. Succinct non-interactive zero knowledge for a von Neumann architecture. In 23rd USENIX Security Symposium (USENIX Security 14) (San Diego, CA, Aug. 2014), USENIX Association, pp. 781–796.
[16] BESTAVROS, A., LAPETS, A., AND VARIA, M. User-centric distributed solutions for privacy-preserving analytics. Communications of the ACM 60, 2 (February 2017), 37–39.
[17] CHAUM, D., AND VAN HEYST, E. Group signatures. In Advances in Cryptology – EUROCRYPT '91, Workshop on the Theory and Application of Cryptographic Techniques, Brighton, UK, April 8-11, 1991, Proceedings (1991), D. W. Davies, Ed., vol. 547 of Lecture Notes in Computer Science, Springer, pp. 257–265.
[18] DAMGARD, I., AND ISHAI, Y. Constant-round multiparty computation using a black-box pseudorandom generator. In Advances in Cryptology – CRYPTO 2005, 25th Annual International Cryptology Conference, Santa Barbara, California, USA, August 14-18, 2005, Proceedings (2005), V. Shoup, Ed., vol. 3621 of Lecture Notes in Computer Science, Springer, pp. 378–394.
[19] DAMGARD, I., KELLER, M., LARRAIA, E., PASTRO, V., SCHOLL, P., AND SMART, N. P. Practical covertly secure MPC for dishonest majority – or: Breaking the SPDZ limits. In Computer Security – ESORICS 2013 – 18th European Symposium on Research in Computer Security, Egham, UK, September 9-13, 2013, Proceedings (2013), J. Crampton, S. Jajodia, and K. Mayes, Eds., vol. 8134 of Lecture Notes in Computer Science, Springer, pp. 1–18.
[20] FEIGENBAUM, J., JAGGARD, A. D., AND WRIGHT, R. N. Open vs. closed systems for accountability. In Proceedings of the 2014 Symposium and Bootcamp on the Science of Security (2014), ACM, p. 4.
[21] FEIGENBAUM, J., JAGGARD, A. D., WRIGHT, R. N., AND XIAO, H. Systematizing "accountability" in computer science. Tech. rep., Yale University, Feb. 2012. Technical Report YALEU/DCS/TR-1452.
[22] FEUER, A., AND ROSENBERG, E. Brooklyn prosecutor accused of using illegal wiretap to spy on love interest. November 2016. https://www.nytimes.com/2016/11/28/nyregion/brooklyn-prosecutor-accused-of-using-illegal-wiretap-to-spy-on-love-interest.html
[23] GOLDWASSER, S., AND PARK, S. Public accountability vs. secret laws: Can they coexist? A cryptographic proposal. In Proceedings of the 2017 Workshop on Privacy in the Electronic Society, Dallas, TX, USA, October 30 - November 3, 2017 (2017), B. M. Thuraisingham and A. J. Lee, Eds., ACM, pp. 99–110.
[24] JAKOBSEN, T. P., NIELSEN, J. B., AND ORLANDI, C. A framework for outsourcing of secure computation. In Proceedings of the 6th edition of the ACM Workshop on Cloud Computing Security, CCSW '14, Scottsdale, Arizona, USA, November 7, 2014 (2014), G. Ahn, A. Oprea, and R. Safavi-Naini, Eds., ACM, pp. 81–92.
[25] KAMARA, S. Restructuring the NSA metadata program. In International Conference on Financial Cryptography and Data Security (2014), Springer, pp. 235–247.
[26] KATZ, J., AND LINDELL, Y. Introduction to Modern Cryptography. CRC Press, 2014.
[27] KROLL, J., FELTEN, E., AND BONEH, D. Secure protocols for accountable warrant execution. 2014. http://www.cs.princeton.edu/~felten/warrant-paper.pdf
[28] LAMPSON, B. Privacy and security: Usable security: How to get it. Communications of the ACM 52, 11 (2009), 25–27.
[29] LAPETS, A., VOLGUSHEV, N., BESTAVROS, A., JANSEN, F., AND VARIA, M. Secure MPC for Analytics as a Web Application. In 2016 IEEE Cybersecurity Development (SecDev) (Boston, MA, USA, November 2016), pp. 73–74.
[30] MASHIMA, D., AND AHAMAD, M. Enabling robust information accountability in e-healthcare systems. In HealthSec (2012).
[31] PAPANIKOLAOU, N., AND PEARSON, S. A cross-disciplinary review of the concept of accountability: A survey of the literature. 2013. Available online at http://www.bic-trust.eu/files/2013/06/Paper_NP.pdf
[32] PEARSON, S. Toward accountability in the cloud. IEEE Internet Computing 15, 4 (2011), 64–69.
[33] RIVEST, R. L., SHAMIR, A., AND TAUMAN, Y. How to leak a secret. In Advances in Cryptology – ASIACRYPT 2001, 7th International Conference on the Theory and Application of Cryptology and Information Security, Gold Coast, Australia, December 9-13, 2001, Proceedings (2001), C. Boyd, Ed., vol. 2248 of Lecture Notes in Computer Science, Springer, pp. 552–565.
[34] SCIPR LAB. libsnark: a C++ library for zkSNARK proofs. https://github.com/scipr-lab/libsnark
[35] SEGAL, A., FEIGENBAUM, J., AND FORD, B. Privacy-preserving lawful contact chaining [preliminary report]. In Proceedings of the 2016 ACM Workshop on Privacy in the Electronic Society (New York, NY, USA, 2016), WPES '16, ACM, pp. 185–188.
[36] SEGAL, A., FORD, B., AND FEIGENBAUM, J. Catching bandits and only bandits: Privacy-preserving intersection warrants for lawful surveillance. In FOCI (2014).
[37] SMITH, S. W. Kudzu in the courthouse: Judgments made in the shade. The Federal Courts Law Review 3, 2 (2009).
[38] SMITH, S. W. Gagged, sealed & delivered: Reforming ECPA's secret docket. Harvard Law & Policy Review 6 (2012), 313–459.
[39] SUNDARESWARAN, S., SQUICCIARINI, A., AND LIN, D. Ensuring distributed accountability for data sharing in the cloud. IEEE Transactions on Dependable and Secure Computing 9, 4 (2012), 556–568.
[40] TAN, Y. S., KO, R. K., AND HOLMES, G. Security and data accountability in distributed systems: A provenance survey. In High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing (HPCC_EUC), 2013 IEEE 10th International Conference on (2013), IEEE, pp. 1571–1578.
[41] VIRZA, M. Private communication, November 2017.
[42] WANG, X., RANELLUCCI, S., AND KATZ, J. Global-scale secure multiparty computation. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, CCS 2017, Dallas, TX, USA, October 30 - November 03, 2017 (2017), B. M. Thuraisingham, D. Evans, T. Malkin, and D. Xu, Eds., ACM, pp. 39–56.
[43] WEITZNER, D. J., ABELSON, H., BERNERS-LEE, T., FEIGENBAUM, J., HENDLER, J. A., AND SUSSMAN, G. J. Information accountability. Commun. ACM 51, 6 (2008), 82–87.
[44] XIAO, Z., KATHIRESSHAN, N., AND XIAO, Y. A survey of accountability in computer networks and distributed systems. Security and Communication Networks 9, 4 (2016), 290–315.


                    address this concern Judges could delegate the responsi-bility of posting to the ledger to the same judicial circuitsthat mediate the hierarchical MPC Alternatively eachjudge could continue posting to the ledger herself butinstead of signing the commitment under her own nameshe could sign it as coming from some court in her judi-cial circuit or nationwide without revealing which one(group signatures [17] or ring signatures [33] are de-signed for this sort of anonymous signing within groups)Either of these approaches would conceal which individ-ual judge approved the surveillance

                    Aggregate statistics The aggregate statistic mechanismoffers a wide range of choices about the data to be re-vealed For example if the court system is concernedabout revealing information about individual districtsstatistics could be aggregated by any number of other pa-rameters including the type of crime being investigatedor the company from which the data was requested

                    5 Protocol Definition

                    We now define a complete protocol capturing the work-flow from Section 4 We assume a public-key infrastruc-ture and synchronous communication on authenticated(encrypted) point-to-point channels

                    Preliminaries The protocol is parametrized bybull a secret-sharing scheme Sharebull a commitment scheme Cbull a special type of zero-knowledge primitive SNARKbull a multi-party computation protocol MPC andbull a function CoverSheet that maps court orders to

                    cover sheet informationSeveral parties participate in the protocolbull n judges J1 Jnbull m law enforcement agencies A1 Ambull q companies C1 Cqbull r trustees T1 Tr11 andbull P a party representing the publicbull Ledger a party representing the public ledgerbull Env a party called ldquothe environmentrdquo which models

                    the occurrence over time of exogenous eventsLedger is a simple ideal functionality allowing any

                    party to (1) append entries to a time-stamped append-only ledger and (2) retrieve ledger entries Entries areauthenticated except where explicitly anonymousEnv is a modeling device that specifies the protocol

                    behavior in the context of arbitrary exogenous event se-quences occurring over time Upon receipt of message

                    11In our specific case study r = 12 and the trustees are the twelve USCircuit Courts of Appeals The trustees are the parties which participatein the multi-party computation of aggregate statistics based on inputdata from all judges as shown in Figure 4 and defined formally later inthis subsection

                    clock Env responds with the current time To modelthe occurrence of an exogenous event e (eg a case inneed of surveillance) Env sends information about e tothe affected parties (eg a law enforcement agency)

                    Next we give the syntax of our cryptographic tools12

                    and then define the behavior of the remaining parties

                    A commitment scheme is a triple of probabilistic poly-time algorithms C= (SetupCommitOpen) as followsbull Setup(1κ) takes as input a security parameter κ (in

                    unary) and outputs public parameters ppbull Commit(ppmω) takes as input pp a message m

                    and randomness ω It outputs a commitment cbull Open(ppmprimecω prime) takes as input pp a message

                    mprime and randomness ω prime It outputs 1 if c =Commit(ppmprimeω prime) and 0 otherwise

                    pp is generated in an initial setup phase and thereafterpublicly known to all parties so we elide it for brevity

                    Algorithm 1 Law enforcement agency Ai

                    bull On receipt of a surveillance request event e =(Surveilus) from Env where u is the public key of acompany and s is the description of a surveillance requestdirected at u send message (us) to a judge13

                    bull On receipt of a decision message (usd) from a judgewhere d 6= reject14(1) generate a commitment c =Commit((sd)ω) to the request and store (csdω) lo-cally (2) generate a SNARK proof π attesting complianceof (sd) with relevant regulations (3) post (cπ) to theledger (4) send request (sdω) to company u

                    bull On receipt of an audit request (cPz) from the publicgenerate decision blarr Adp

                    i (cPz) If b = accept gener-ate a SNARK proof π attesting compliance of (sd) withthe regulations indicated by the audit request (cPz) elsesend (cPzb) to a judge13

                    bull On receipt of an audit order (dcPz) from a judge ifd = accept generate a SNARK proof π attesting com-pliance of (sd) with the regulations indicated by the auditrequest (cPz)

                    Agencies Each agency Ai has an associated decision-making process Adp

                    i modeled by a stateful algorithm thatmaps audit requests to acceptcup01lowast where the out-put is either an acceptance or a description of why theagency chooses to deny the request Each agency oper-

                    12For formal security definitions beyond syntax we refer to anystandard cryptography textbook such as [26]

                    13For the purposes of our exposition this could be an arbitrary judgeIn practice it would likely depend on the jurisdiction in which thesurveillance event occurs and in which the law enforcement agencyoperates and perhaps also on the type of case

                    14For simplicity of exposition Algorithm 1 only addresses the cased 6= reject and omits the possibility of appeal by the agency Thealgorithm could straightforwardly be extended to encompass appealsby incorporating the decision to appeal into Adp

                    i 15This is the step invoked by requests for unsealed documents

                    Algorithm 2 Judge Ji

                    bull On receipt of a surveillance request (us) from anagency A j (1) generate decision d larr Jdp1i (s) (2)send response (usd) to A j (3) generate a commit-ment c = Commit((usd)ω) to the decision and store(cusdω) locally (4) post (CoverSheet(d)c) to theledger

                    bull On receipt of denied audit request information ζ froman agency A j generate decision d larr Jdp2i (ζ ) and send(dζ ) to A j and to the public P

                    bull On receipt of a data revelation request (cz) from thepublic15generate decision blarr Jdp3i (cz) If b= acceptsend to the public P the message and randomness (mω)corresponding to c else if b = reject send reject toP with an accompanying explanation if provided

                    ates according to Algorithm 1 which is parametrized byits own Adp

                    i In practice we assume Adpi would be instan-

                    tiated by the agencyrsquos human decision-making process

                    Judges Each judge Ji has three associated decision-making processes Jdp1i Jdp2i and Jdp3i Jdp1i mapssurveillance requests to either a rejection or an authoriz-ing court order Jdp2i maps denied audit requests to eithera confirmation of the denial or a court order overturn-ing the denial and Jdp3i maps data revelation requeststo either an acceptance or a denial (perhaps along withan explanation of the denial eg ldquothis document is stillunder sealrdquo) Each judge operates according to Algo-rithm 2 which is parametrized by the individual judgersquos(Jdp1i Jdp2i Jdp3i )

                    Algorithm 3 Company Ci

                    bull Upon receiving a surveillance request (sdω) from anagency A j if the court order d bears the valid signatureof a judge and Commit((sd)ω) matches a correspond-ing commitment posted by law enforcement on the ledgerthen (1) generate commitment clarr Commit(δ ω) andstore (cδ ω) locally (2) generate a SNARK proof π at-testing that δ is compliant with a s the judge-signed orderd (3) post (cπ) anonymously to the ledger (4) reply toA j by furnishing the requested data δ along with ω

                    The public The public P exhibits one main type of be-havior in our model upon receiving an event messagee = (aξ ) from Env (describing either an audit requestor a data revelation request) P sends ξ to a (an agencyor court) Additionally the public periodically checksthe ledger for validity of posted SNARK proofs and takesteps to flag any non-compliance detected (eg throughthe court system or the news media)

                    Companies and trustees Algorithms 3 and 4 describecompanies and trustees Companies execute judge-

                    Algorithm 4 Trustee Ti

                    bull Upon receiving an aggregate statistic event message e =(Compute f D1 Dn) from Env

                    1 For each iprime isin [r] (such that iprime 6= i) send e to Tiprime 2 For each j isin [n] send the message ( f D j) to J j Let

                    δ ji be the response from J j3 With parties T1 Tr participate in the MPC pro-

                    tocol MPC with input (δ1i δni) to compute thefunctionality ReconInputs f where ReconInputs isdefined as follows

                    ReconInputs((δ1iprime δniprime)

                    )iprimeisin[r] =(

                    Recon(δ j1 δ jr))

                    jisin[n]

                    Let y denote the output from the MPC16

                    4 Send y to J j for each j isin [n]17

                    bull Upon receiving an MPC initiation message e =(Compute f D1 Dn) from another trustee Tiprime

                    1 Receive a secret-share δ ji from each judge J j respec-tively

                    2 With parties T1 Tr participate in the MPC pro-tocol MPC with input (δ1i δni) to compute thefunctionality ReconInputs f

                    authorized instructions and log their actions by postingcommitments on the ledger Trustees run MPC to com-pute aggregate statistics from data provided in secret-shared form by judges MPC events are triggered by Env

                    6 Evaluation of MPC Implementation

                    In our proposal judges use secure multiparty computa-tion (MPC) to compute aggregate statistics about the ex-tent and distribution of surveillance Although in princi-ple MPC can support secure computation of any func-tion of the judgesrsquo data full generality can come withunacceptable performance limitations In order that ourprotocols scale to hundreds of federal judges we narrowour attention to two kinds of functions that are particu-larly useful in the context of surveillance

                    The extent of surveillance (totals) Computing totalsinvolves summing values held by the parties withoutrevealing information about any value to anyone otherthan its owner Totals become averages by dividing bythe number of data points In the context of electronicsurveillance totals are the most prevalent form of com-putation on government and corporate transparency re-ports How many court orders were approved for casesinvolving homicide and how many for drug offensesHow long was the average order in effect How manyorders were issued in California [9] Totals make it pos-sible to determine the extent of surveillance

                    The distribution of surveillance (thresholds) Thresh-olding involves determining the number of data pointsthat exceed a given cut-off How many courts issuedmore than ten orders for data from Google How manyorders were in effect for more than 90 days Unlike to-tals thresholds can reveal selected facts about the distri-bution of surveillance ie the circumstances in whichit is most prevalent Thresholds go beyond the kinds ofquestions typically answered in transparency reports of-fering new opportunities to improve accountability

                    To enable totals and thresholds to scale to the size ofthe federal court system we implemented a hierarchi-cal MPC protocol as described in Figure 4 whose designmirrors the hierarchy of the court system Our evaluationshows the hierarchical structure reduces MPC complex-ity from quadratic in the number of judges to linear

                    We implemented protocols that make use of totals andthresholds using two existing JavaScript-based MPC li-braries WebMPC [16 29] and Jiff [5] WebMPC is thesimpler and less versatile library we test it as a baselineand as a ldquosanity checkrdquo that its performance scales as ex-pected then move on to the more interesting experimentof evaluating Jiff We opted for JavaScript libraries to fa-cilitate integration into a web application which is suit-able for federal judges to submit information through afamiliar browser interface regardless of the differencesin their local system setups Both of these libraries aredesigned to facilitate MPC across dozens or hundredsof computers we simulated this effect by running eachparty in a separate process on a computer with 16 CPUcores and 64GB of RAM We tested these protocols onrandomly generated data containing values in the hun-dreds which reflects the same order of magnitude as datapresent in existing transparency reports Our implemen-tations were crafted with two design goals in mind

                    1 Protocols should scale to roughly 1000 partiesthe approximate size of the federal judiciary [10]performing efficiently enough to facilitate periodictransparency reports

                    2 Protocols should not require all parties to be onlineregularly or at the same time

                    In the subsections that follow we describe and evaluateour implementations in light of these goals

                    61 Computing Totals in WebMPC

                    WebMPC is a JavaScript-based library that can securelycompute sums in a single round The underlying proto-col relies on two parties who are trusted not to colludewith one another an analyst who distributes masking in-formation to all protocol participants at the beginning ofthe process and receives the final aggregate statistic anda facilitator who aggregates this information together in

                    Figure 5 Performance of MPC using WebMPC library

                    masked form The participants use the masking infor-mation from the analyst to mask their inputs and sendthem to the facilitator who aggregates them and sendsthe result (ie a masked sum) to the analyst The ana-lyst removes the mask and uncovers the aggregated re-sult Once the participants have their masks they receiveno further messages from any other party they can sub-mit this masked data to the facilitator in an uncoordinatedfashion and go offline immediately afterwards Even ifsome anticipated participants do not send data the pro-tocol can still run to completion with those who remain

                    To make this protocol feasible in our setting we needto identify a facilitator and analyst who will not colludeIn many circumstances it would be acceptable to rely onreputable institutions already present in the court systemsuch as the circuit courts of appeals the Supreme Courtor the Administrative Office of the US Courts

                    Although this protocolrsquos simplicity limits its general-ity it also makes it possible for the protocol to scale ef-ficiently to a large number of participants As Figure 5illustrates the protocol scales linearly with the numberof parties Even with 400 partiesmdashthe largest size wetestedmdashthe protocol still completed in just under 75 sec-onds Extrapolating from the linear trend it would takeabout three minutes to compute a summation across theentire federal judiciary Since existing transparency re-ports are typically released just once or twice a year itis reasonable to invest three minutes of computation (orless than a fifth of a second per judge) for each statistic

6.2 Thresholds and Hierarchy with Jiff

To make use of MPC operations beyond totals, we turned to Jiff, another MPC library implemented in JavaScript. Jiff is designed to support MPC for arbitrary functionalities, although inbuilt support for some more complex functionalities was still under development at the time of writing. Most importantly for our needs, Jiff supports thresholding and multiplication in addition to sums. We evaluated Jiff on three different MPC protocols: totals (as with WebMPC), additive thresholding (i.e., how many values exceeded a specific threshold), and multiplicative thresholding (i.e., did all values exceed a specific threshold). In contrast to computing totals via summation, certain operations like thresholding require more complicated computation and multiple rounds of communication. By building on Jiff with our hierarchical MPC implementation, we demonstrate that these operations are viable at the scale required by the federal court system.

Figure 6: Flat total (red), additive threshold (blue), and multiplicative threshold (green) protocols in Jiff.

Figure 7: Hierarchical total (red), additive threshold (blue), and multiplicative threshold (green) protocols in Jiff. Note the difference in axis scales from Figure 6.
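In the clear, the three functionalities we benchmark are simply the following (a Python illustration of their semantics; under MPC, each judge's value remains secret-shared and only the final output is revealed):

def total(values):
    # Extent of surveillance: the sum of the judges' counts.
    return sum(values)

def additive_threshold(values, cutoff):
    # How many values exceed the cutoff.
    return sum(1 for v in values if v > cutoff)

def multiplicative_threshold(values, cutoff):
    # Did all values exceed the cutoff? A product of 0/1 indicators.
    result = 1
    for v in values:
        result *= 1 if v > cutoff else 0
    return result

orders_per_judge = [12, 3, 25, 0]           # hypothetical counts
assert total(orders_per_judge) == 40
assert additive_threshold(orders_per_judge, 10) == 2
assert multiplicative_threshold(orders_per_judge, 10) == 0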

As a baseline, we ran sums, additive thresholding, and multiplicative thresholding benchmarks with all judges as full participants in the MPC protocol, sharing the workload equally, a configuration we term the flat protocol (in contrast to the hierarchical protocol we present next). Figure 6 illustrates that the running time of these protocols grows quadratically with the number of judges participating. These running times quickly became untenable. While summation took several minutes among hundreds of judges, both thresholding benchmarks could barely handle tens of judges in the same time envelope. These graphs illustrate the substantial performance disparity between summation and thresholding.

In Section 4, we described an alternative "hierarchical" MPC configuration to reduce this quadratic growth to linear. As depicted in Figure 4, each lower-court judge splits a piece of data into twelve secret shares, one for each circuit court of appeals. These shares are sent to the corresponding courts, who conduct a twelve-party MPC that performs a total or thresholding operation based on the input shares. If n lower-court judges participate, the protocol is tantamount to computing n twelve-party summations followed by a single n-input summation or threshold. As n increases, the amount of work scales linearly. So long as at least one circuit court remains honest and uncompromised, the secrecy of the lower-court data endures by the security of the secret-sharing scheme.
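A minimal sketch of the additive secret sharing underlying this hierarchy (a Python illustration of the summation case only; the deployed protocol runs in Jiff, and thresholds are computed inside the twelve-party MPC rather than in the clear):

import secrets

Q = 2**64          # public modulus
NUM_CIRCUITS = 12  # one share per circuit court of appeals

def share(value):
    # Split a judge's value into twelve additive shares that sum to it mod Q.
    shares = [secrets.randbelow(Q) for _ in range(NUM_CIRCUITS - 1)]
    shares.append((value - sum(shares)) % Q)
    return shares

def circuit_partial_sum(received_shares):
    # Each circuit court locally adds the shares it received from all judges.
    return sum(received_shares) % Q

def combine(partial_sums):
    # Adding the twelve partial results yields the overall total; no single
    # circuit court ever sees an individual judge's value.
    return sum(partial_sums) % Q

judge_values = [4, 7, 2, 9]
shares_by_circuit = list(zip(*[share(v) for v in judge_values]))
total = combine([circuit_partial_sum(s) for s in shares_by_circuit])
assert total == sum(judge_values)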

Figure 7 illustrates the linear scaling of the twelve-party portion of the hierarchical protocols; we measured only the computation time after the circuit courts received all of the additive shares from the lower courts. While the flat summation protocol took nearly eight minutes to run on 300 judges, the hierarchical summation scaled to 1000 judges in less than 20 seconds, besting even the WebMPC results. Although thresholding characteristically remained much slower than summation, the hierarchical protocol scaled to nearly 250 judges in about the same amount of time that it took the flat protocol to run on 35 judges. Since the running times for the threshold protocols were in the tens of minutes for large benchmarks, the linear trend is noisier than for the total protocol. Most importantly, both of these protocols scaled linearly, meaning that, given sufficient time, thresholding could scale up to the size of the federal court system. This performance is acceptable if a few choice thresholds are computed at the frequency at which existing transparency reports are published.18

One additional benefit of the hierarchical protocols is that lower courts do not need to stay online while the protocol is executing, a goal we articulated at the beginning of this section. A lower court simply needs to send in its shares to the requisite circuit courts, one message per circuit court for a grand total of twelve messages, after which it is free to disconnect. In contrast, the flat protocol grinds to a halt if even a single judge goes offline. The availability of the hierarchical protocol relies on a small set of circuit courts, who could invest in more robust infrastructure.

                    7 Evaluation of SNARKs

We define the syntax of preprocessing zero-knowledge SNARKs for arithmetic circuit satisfiability [15].

A SNARK is a triple of probabilistic polynomial-time algorithms SNARK = (Setup, Prove, Verify), as follows:

• Setup(1^κ, R) takes as input the security parameter κ and a description of a binary relation R (an arithmetic circuit of size polynomial in κ), and outputs a pair (pk_R, vk_R) of a proving key and verification key.

• Prove(pk_R, (x, w)) takes as input a proving key pk_R and an input-witness pair (x, w), and outputs a proof π attesting to x ∈ L_R, where L_R = {x : ∃w s.t. (x, w) ∈ R}.

• Verify(vk_R, (x, π)) takes as input a verification key vk_R and an input-proof pair (x, π), and outputs a bit indicating whether π is a valid proof for x ∈ L_R.

18 Too high a frequency is also inadvisable due to the possibility of revealing too granular information when combined with the timings of specific investigations and court orders.

Before participants can create and verify SNARKs, they must establish a proving key, which any participant can use to create a SNARK, and a corresponding verification key, which any participant can use to verify a SNARK so created. Both of these keys are publicly known. The keys are distinct for each circuit (representing an NP relation) about which proofs are generated, and can be reused to produce as many different proofs with respect to that circuit as desired. Key generation uses randomness that, if known or biased, could allow participants to create proofs of false statements [13]. The key generation process must therefore protect and then destroy this information.

Using MPC to do key generation based on randomness provided by many different parties provides the guarantee that, as long as at least one of the MPC participants behaved correctly (i.e., did not bias his randomness and destroyed it afterward), the resulting keys are good (i.e., do not permit proofs of false statements). This approach has been used in the past, most notably by the cryptocurrency Zcash [8]. Despite the strong guarantees provided by this approach to key generation when at least one party is not corrupted, concerns have been expressed about the wisdom of trusting in the assumption of one honest party in the Zcash setting, which involves large monetary values and a system design inherently centered around the principles of full decentralization.

For our system, we propose key generation be done in a one-time MPC among several of the traditionally reputable institutions in the court system, such as the Supreme Court or the Administrative Office of the US Courts, ideally together with other reputable parties from different branches of government. In our setting, the use of MPC for SNARK key generation does not constitute as pivotal and potentially risky a trust assumption as in Zcash, in that the court system is close-knit and inherently built on the assumption of trustworthiness of certain entities within the system. In contrast, a decentralized cryptocurrency (1) must, due to its distributed nature, rely for key generation on MPC participants that are essentially strangers to most others in the system, and (2) could be said to derive its very purpose from not relying on the trustworthiness of any small set of parties.

We note that, since key generation is a one-time task for each circuit, we can tolerate a relatively performance-intensive process. Proving and verification keys can be distributed on the ledger.

7.1 Argument Types

Our implementation supports three types of arguments.

Argument of knowledge for a commitment (Pk). Our simplest type of argument attests the prover's knowledge of the content of a given commitment c, i.e., that she could open the commitment if required. Whenever a party publishes a commitment, she can accompany it with a SNARK attesting that she knows the message and randomness that were used to generate the commitment. Formally, this is an argument that the prover knows m and ω that correspond to a publicly known c such that Open(m, c, ω) = 1.

Argument of commitment equality (Peq). Our second type of argument attests that the content of two publicly known commitments c1, c2 is the same. That is, for two publicly known commitments c1 and c2, the prover knows m1, m2, ω1, and ω2 such that Open(m1, c1, ω1) = 1 ∧ Open(m2, c2, ω2) = 1 ∧ m1 = m2.

More concretely, suppose that an agency wishes to release relational information: that the identifier (e.g., email address) in the request is the same identifier that a judge approved. The judge and law enforcement agency post commitments c1 and c2, respectively, to the identifiers they used. The law enforcement agency then posts an argument attesting that the two commitments are to the same value.19 Since circuits use fixed-size inputs, an argument implicitly reveals the length of the committed message. To hide this information, the law enforcement agency can pad each input up to a uniform length.
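A padding step of the kind described might look like the following (illustrative only; the fixed length is our own assumption):

MAX_LEN = 64  # assumed uniform input length, in bytes

def pad(identifier: bytes) -> bytes:
    # Pad with zero bytes so every committed identifier has the same length.
    assert len(identifier) <= MAX_LEN
    return identifier.ljust(MAX_LEN, b"\x00")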

Peq may be too revealing under certain circumstances: for the public to verify the argument, the agency (who posted c2) must explicitly identify c1, potentially revealing which judge authorized the data request and when.

Existential argument of commitment equality (P∃). Our third type of argument allows decreasing the resolution of the information revealed, by proving that a commitment's content is the same as that of some other commitment among many. Formally, it shows that for publicly known commitments c, c1, ..., cN, respectively to secret values (m, ω), (m1, ω1), ..., (mN, ωN), ∃i such that Open(m, c, ω) = 1 ∧ Open(mi, ci, ωi) = 1 ∧ m = mi. We treat i as an additional secret input so that, for any value of N, only two commitments need to be opened. This scheme trades off between resolution (number of commitments) and efficiency, a question we explore below.

We have chosen these three types of arguments to implement, but LibSNARK supports arbitrary predicates in principle, and there are likely others that would be useful and run efficiently in practice. A useful generalization of Peq and P∃ would be to replace equality with more sophisticated domain-specific predicates: instead of showing that messages m1, m2 corresponding to a pair of commitments are equal, one could show p(m1, m2) = 1 for other predicates p (e.g., "less-than" or "signed by the same court"). The types of arguments that can be implemented efficiently will expand as SNARK libraries' efficiency improves; our system inherits such efficiency gains.

19 To produce a proof for Peq, the prover (e.g., the agency) needs to know both ω2 and ω1, but in some cases c1 (and thus ω1) may have been produced by a different entity (e.g., the judge). Publicizing ω1 is unacceptable, as it compromises the hiding of the commitment content. To solve this problem, the judge can include ω1 alongside m1 in secret documents that both parties possess (e.g., the court order).

7.2 Implementation

We implemented these zero-knowledge arguments with LibSNARK [34], a C++ library for creating general-purpose SNARKs from arithmetic circuits. We implemented commitments using the SHA256 hash function;20 ω is a 256-bit random string appended to the message before it is hashed. In this section we show that useful statements can be proven within a reasonable performance envelope. We consider six criteria: the size of the proving key, the size of the verification key, the size of the proof itself, the time to generate keys, the time to create proofs, and the time to verify proofs. We evaluated these metrics with messages from 16 to 1232 bytes on Pk, Peq, and P∃ (N = 100, 400, 700, and 1000, large enough to obscure links between commitments) on a computer with 16 CPU cores and 64GB of RAM.
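The commitment scheme and the statements proven can be sketched as follows (a Python illustration of the relations only; the deployed arguments prove these statements in zero knowledge over LibSNARK arithmetic circuits, and the function names here are our own):

import hashlib, os

def commit(m: bytes, omega: bytes = None):
    # omega is a 256-bit random string appended to the message before hashing.
    omega = os.urandom(32) if omega is None else omega
    return hashlib.sha256(m + omega).digest(), omega

def open_ok(m: bytes, c: bytes, omega: bytes) -> bool:
    return hashlib.sha256(m + omega).digest() == c

# Pk: the prover knows (m, omega) opening the public commitment c.
def relation_k(c, m, omega):
    return open_ok(m, c, omega)

# Peq: two public commitments open to the same message.
def relation_eq(c1, c2, m1, o1, m2, o2):
    return open_ok(m1, c1, o1) and open_ok(m2, c2, o2) and m1 == m2

# P∃: c opens to the same message as some c_i among c_1, ..., c_N;
# the index i is part of the secret witness.
def relation_exists(c, cs, m, omega, i, m_i, o_i):
    return open_ok(m, c, omega) and open_ok(m_i, cs[i], o_i) and m == m_i

c1, o1 = commit(b"target@example.com")      # hypothetical identifier
c2, o2 = commit(b"target@example.com")
assert relation_eq(c1, c2, b"target@example.com", o1, b"target@example.com", o2)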

Argument size. The argument is just 287 bytes. Accompanying each argument are its public inputs (in this case, commitments). Each commitment is 256 bits.21 An auditor needs to store these commitments anyway as part of the ledger, and each commitment can be stored just once and reused for many proofs.

Verification key size. The size of the verification key is proportional to the size of the circuit and its public inputs. The key was 106KB for Pk (one commitment as input and one SHA256 circuit) and 2083KB for Peq (two commitments and two SHA256 circuits). Although P∃ computes SHA256 just twice, its smallest input, 100 commitments, is 50 times as large as that of Pk and Peq; the keys are correspondingly larger and grow linearly with the input size. For 100, 400, 700, and 1000 commitments, the verification keys were respectively 10MB, 41MB, 71MB, and 102MB. Since only one verification key is necessary for each circuit, these keys are easily small enough to make large-scale verification feasible.

Proving key size. The proving keys are much larger, in the hundreds of megabytes. Their size grows linearly with the size of the circuit, so longer messages (which require more SHA256 computations), more complicated circuits, and (for P∃) more inputs lead to larger keys. Figure 8a reflects this trend. Proving keys are largest for P∃ with 1000 inputs on 1232-byte messages, and shrink as the message size and the number of commitments decrease. Pk and Peq, which have simpler circuits, still have bigger proving keys for bigger messages. Although these keys are large, only entities that create each kind of proof need to store the corresponding key. Storing one key for each type of argument we have presented takes only about 1GB at the largest input sizes.

20 Certain other hash functions may be more amenable to representation as arithmetic circuits and thus more "SNARK-friendly". We opted for a proof of concept with SHA256, as it is so widely used.

21 LibSNARK stores each bit in a 32-bit integer, so an argument involving k commitments takes about 1024k bytes. A bit-vector representation would save a factor of 32.

Key generation time. Key generation time increased linearly with the size of the keys, from a few seconds for Pk and Peq on small messages to a few minutes for P∃ on the largest parameters (Figure 8b). Since key generation is a one-time process to add a new kind of proof in the form of a circuit, we find these numbers acceptable.

Argument generation time. Argument generation time increased linearly with proving key size and ranged from a few seconds on the smallest keys to a couple of minutes for the largest (Figure 8c). Since argument generation is a one-time task for each surveillance action, and the existing administrative processes for each surveillance action often take hours or days, we find this cost acceptable.

Argument verification time. Verifying Pk and Peq on the largest message took only a few milliseconds. Verification times for P∃ were larger and increased linearly with the number of input commitments. For 100, 400, 700, and 1000 commitments, verification took 40ms, 85ms, 243ms, and 338ms on the largest input. These times are still fast enough to verify many arguments quickly.

                    8 Generalization

Our proposal can be generalized beyond ECPA surveillance to encompass a broader class of secret information processes. Consider situations in which independent institutions need to act in a coordinated but secret fashion and at the same time are subject to public scrutiny. They should be able to convince the public that their actions are consistent with relevant rules. As in electronic surveillance, accountability requires the ability to attest to compliance without revealing sensitive information.

Example 1 (FISA court). Accountability is needed in other electronic surveillance arenas. The US Foreign Intelligence Surveillance Act (FISA) regulates surveillance in national security investigations. Because of the sensitive interests at stake, the entire process is overseen by a US court that meets in secret. The tension between secrecy and public accountability is even sharper for the FISA court: much of the data collected under FISA may stay permanently hidden inside US intelligence agencies, while data collected under ECPA may eventually be used in public criminal trials. This opacity may be justified, but it has engendered skepticism. The public has no way of knowing what the court is doing, nor any means of assuring itself that the intelligence agencies under the authority of FISA are even complying with the rules of that court. The FISA court itself has voiced concern that it has no independent means of assessing compliance with its orders because of the extreme secrecy involved. Applying our proposal to the FISA court, both the court and the public could receive proofs of documented compliance with FISA orders, as well as aggregate statistics on the scope of FISA surveillance activity, to the full extent possible without incurring national security risk.

Figure 8: SNARK evaluation. (a) Proving key size. (b) Key generation time. (c) Argument generation time.

Example 2 (Clinical trials). Accountability mechanisms are also important to assess the behavior of private parties, e.g., in clinical trials for new drugs. There are many parties to clinical trials, and much of the information involved is either private or proprietary. Yet regulators and the public have a need to know that responsible testing protocols are observed. Our system can achieve the right balance of transparency, accountability, and respect for the privacy of those involved in the trials.

Example 3 (Public fund spending). Accountability in spending of taxpayer money is naturally a subject of public interest. Portions of public funds may be allocated for sensitive purposes (e.g., defense/intelligence), and the amounts and allocation thereof may be publicly unavailable due to their sensitivity. Our system would enable credible public assurances that taxpayer money is being spent in accordance with stated principles, while preserving the secrecy of information considered sensitive.

8.1 Generalized Framework

We present abstractions describing the generalized version of our system and briefly outline how the concrete examples fit into this framework. A secret information process includes the following components:

• A set of participants interact with each other. In our ECPA example, these are judges, law enforcement agencies, and companies.

• The participants engage in a protocol (e.g., to execute the procedures for conducting electronic surveillance). The protocol messages exchanged are hidden from the view of outsiders (e.g., the public), and yet it is of public interest that the protocol messages exchanged adhere to certain rules.

• A set of auditors (distinct from the participants) seeks to audit the protocol by verifying that a set of accountability properties are met.

Abstractly, our system allows the controlled disclosure of four types of information.

Existential information reveals the existence of a piece of data, be it in a participant's local storage or the content of a communication between participants. In our case study, existential information is revealed with commitments, which indicate the existence of a document.

Relational information describes the actions participants take in response to the actions of others. In our case study, relational information is represented by the zero-knowledge arguments that attest that actions were taken lawfully (e.g., in compliance with a judge's order).

Content information is the data in storage and communication. In our case study, content information is revealed through aggregate statistics via MPC, and when documents are unsealed and their contents made public.

Timing information is a by-product of the other information. In our case study, timing information could include order issuance dates, turnaround times for data request fulfilment by companies, and seal expiry dates.

Revealing combinations of these four types of information with the specified cryptographic tools provides the flexibility to satisfy a range of application-specific accountability properties, as exemplified next.

Example 1 (FISA court). Participants are the FISA Court judges, the agencies requesting surveillance authorization, and any service providers involved in facilitating said surveillance. The protocol encompasses the legal process required to authorize surveillance, together with the administrative steps that must be taken to enact surveillance. Auditors are the public, the judges themselves, and possibly Congress. Desirable accountability properties are similar to those in our ECPA case study, e.g., attestations that certain rules are being followed in issuing surveillance orders, and release of aggregate statistics on surveillance activities under FISA.

Example 2 (Clinical trials). Participants are the institutions (companies or research centers) conducting clinical trials, comprising scientists, ethics boards, and data analysts; the organizations that manage regulations regarding clinical trials, such as the National Institutes of Health (NIH) and the Food and Drug Administration (FDA) in the US; and hospitals and other sources through which trial participants are drawn. The protocol encompasses the administrative process required to approve a clinical trial, and the procedure of gathering participants and conducting the trial itself. Auditors are the public, the regulatory organizations such as the NIH and the FDA, and possibly professional ethics committees. Desirable accountability properties include, e.g., attestations that appropriate procedures are respected in recruiting participants and administering trials, and release of aggregate statistics on clinical trial results without compromising individual participants' medical data.

Example 3 (Public fund spending). Participants are Congress (who appropriates the funding), defense/intelligence agencies, and service providers contracted in the spending of said funding. The protocol encompasses the processes by which Congress allocates funds to agencies and agencies allocate funds to particular expenses. Auditors are the public and Congress. Desirable accountability properties include, e.g., attestations that procurements were within reasonable margins of market prices and satisfied documented needs, and release of aggregate statistics on the proportion of allocated money used and broad spending categories.

                    9 Conclusion

We present a cryptographic answer to the accountability challenge currently frustrating the US court system. Leveraging cryptographic commitments, zero-knowledge proofs, and secure MPC, we provide the electronic surveillance process a series of scalable, flexible, and practical measures for improving accountability while maintaining secrecy. While we focus on the case study of electronic surveillance, these strategies are equally applicable to a range of other secret information processes requiring accountability to an outside auditor.

                    Acknowledgements

We are grateful to Judge Stephen Smith for discussion and insights from the perspective of the US court system, to Andrei Lapets, Kinan Dak Albab, Rawane Issa, and Frederick Joossens for discussion on Jiff and WebMPC, and to Madars Virza for advice on SNARKs and LibSNARK.

This research was supported by the following grants: NSF MACS (CNS-1413920), DARPA/IBM (W911NF-15-C-0236), Simons Investigator Award Agreement dated June 5th, 2012, and the Center for Science of Information (CSoI), an NSF Science and Technology Center, under grant agreement CCF-0939370.

References

[1] Bellman. https://github.com/ebfull/bellman.

[2] Electronic Communications Privacy Act, 18 U.S.C. 2701 et seq.

[3] Foreign Intelligence Surveillance Act, 50 U.S.C. ch. 36.

[4] Google transparency report. https://www.google.com/transparencyreport/userdatarequests/countries/?p=2016-12.

[5] Jiff. https://github.com/multiparty/jiff.

[6] Jsnark. https://github.com/akosba/jsnark.

[7] Law enforcement requests report. https://www.microsoft.com/en-us/about/corporate-responsibility/lerr.

[8] Zcash. https://z.cash.

[9] Wiretap report 2015. http://www.uscourts.gov/statistics-reports/wiretap-report-2015, December 2015.

[10] ADMINISTRATIVE OFFICE OF THE COURTS. Authorized judgeships. http://www.uscourts.gov/sites/default/files/allauth.pdf.

[11] ARAKI, T., BARAK, A., FURUKAWA, J., LICHTER, T., LINDELL, Y., NOF, A., OHARA, K., WATZMAN, A., AND WEINSTEIN, O. Optimized honest-majority MPC for malicious adversaries - breaking the 1 billion-gate per second barrier. In 2017 IEEE Symposium on Security and Privacy, SP 2017, San Jose, CA, USA, May 22-26, 2017 (2017), IEEE Computer Society, pp. 843–862.

[12] BATES, A., BUTLER, K. R., SHERR, M., SHIELDS, C., TRAYNOR, P., AND WALLACH, D. Accountable wiretapping - or - I know they can hear you now. In NDSS (2012).

[13] BEN-SASSON, E., CHIESA, A., GREEN, M., TROMER, E., AND VIRZA, M. Secure sampling of public parameters for succinct zero knowledge proofs. In Security and Privacy (SP), 2015 IEEE Symposium on (2015), IEEE, pp. 287–304.

[14] BEN-SASSON, E., CHIESA, A., TROMER, E., AND VIRZA, M. Scalable zero knowledge via cycles of elliptic curves. IACR Cryptology ePrint Archive 2014 (2014), 595.

[15] BEN-SASSON, E., CHIESA, A., TROMER, E., AND VIRZA, M. Succinct non-interactive zero knowledge for a von Neumann architecture. In 23rd USENIX Security Symposium (USENIX Security 14) (San Diego, CA, Aug. 2014), USENIX Association, pp. 781–796.

[16] BESTAVROS, A., LAPETS, A., AND VARIA, M. User-centric distributed solutions for privacy-preserving analytics. Communications of the ACM 60, 2 (February 2017), 37–39.

[17] CHAUM, D., AND VAN HEYST, E. Group signatures. In Advances in Cryptology - EUROCRYPT '91, Workshop on the Theory and Application of Cryptographic Techniques, Brighton, UK, April 8-11, 1991, Proceedings (1991), D. W. Davies, Ed., vol. 547 of Lecture Notes in Computer Science, Springer, pp. 257–265.

[18] DAMGARD, I., AND ISHAI, Y. Constant-round multiparty computation using a black-box pseudorandom generator. In Advances in Cryptology - CRYPTO 2005, 25th Annual International Cryptology Conference, Santa Barbara, California, USA, August 14-18, 2005, Proceedings (2005), V. Shoup, Ed., vol. 3621 of Lecture Notes in Computer Science, Springer, pp. 378–394.

[19] DAMGARD, I., KELLER, M., LARRAIA, E., PASTRO, V., SCHOLL, P., AND SMART, N. P. Practical covertly secure MPC for dishonest majority - or: Breaking the SPDZ limits. In Computer Security - ESORICS 2013 - 18th European Symposium on Research in Computer Security, Egham, UK, September 9-13, 2013, Proceedings (2013), J. Crampton, S. Jajodia, and K. Mayes, Eds., vol. 8134 of Lecture Notes in Computer Science, Springer, pp. 1–18.

[20] FEIGENBAUM, J., JAGGARD, A. D., AND WRIGHT, R. N. Open vs. closed systems for accountability. In Proceedings of the 2014 Symposium and Bootcamp on the Science of Security (2014), ACM, p. 4.

[21] FEIGENBAUM, J., JAGGARD, A. D., WRIGHT, R. N., AND XIAO, H. Systematizing "accountability" in computer science. Tech. rep., Yale University, Feb. 2012. Technical Report YALEU/DCS/TR-1452.

[22] FEUER, A., AND ROSENBERG, E. Brooklyn prosecutor accused of using illegal wiretap to spy on love interest, November 2016. https://www.nytimes.com/2016/11/28/nyregion/brooklyn-prosecutor-accused-of-using-illegal-wiretap-to-spy-on-love-interest.html.

[23] GOLDWASSER, S., AND PARK, S. Public accountability vs. secret laws: Can they coexist? A cryptographic proposal. In Proceedings of the 2017 Workshop on Privacy in the Electronic Society, Dallas, TX, USA, October 30 - November 3, 2017 (2017), B. M. Thuraisingham and A. J. Lee, Eds., ACM, pp. 99–110.

[24] JAKOBSEN, T. P., NIELSEN, J. B., AND ORLANDI, C. A framework for outsourcing of secure computation. In Proceedings of the 6th edition of the ACM Workshop on Cloud Computing Security, CCSW '14, Scottsdale, Arizona, USA, November 7, 2014 (2014), G. Ahn, A. Oprea, and R. Safavi-Naini, Eds., ACM, pp. 81–92.

[25] KAMARA, S. Restructuring the NSA metadata program. In International Conference on Financial Cryptography and Data Security (2014), Springer, pp. 235–247.

[26] KATZ, J., AND LINDELL, Y. Introduction to modern cryptography. CRC Press, 2014.

[27] KROLL, J., FELTEN, E., AND BONEH, D. Secure protocols for accountable warrant execution, 2014. http://www.cs.princeton.edu/~felten/warrant-paper.pdf.

[28] LAMPSON, B. Privacy and security: Usable security: how to get it. Communications of the ACM 52, 11 (2009), 25–27.

[29] LAPETS, A., VOLGUSHEV, N., BESTAVROS, A., JANSEN, F., AND VARIA, M. Secure MPC for analytics as a web application. In 2016 IEEE Cybersecurity Development (SecDev) (Boston, MA, USA, November 2016), pp. 73–74.

[30] MASHIMA, D., AND AHAMAD, M. Enabling robust information accountability in e-healthcare systems. In HealthSec (2012).

[31] PAPANIKOLAOU, N., AND PEARSON, S. A cross-disciplinary review of the concept of accountability: A survey of the literature, 2013. Available online at http://www.bic-trust.eu/files/2013/06/Paper_NP.pdf.

[32] PEARSON, S. Toward accountability in the cloud. IEEE Internet Computing 15, 4 (2011), 64–69.

[33] RIVEST, R. L., SHAMIR, A., AND TAUMAN, Y. How to leak a secret. In Advances in Cryptology - ASIACRYPT 2001, 7th International Conference on the Theory and Application of Cryptology and Information Security, Gold Coast, Australia, December 9-13, 2001, Proceedings (2001), C. Boyd, Ed., vol. 2248 of Lecture Notes in Computer Science, Springer, pp. 552–565.

[34] SCIPR LAB. libsnark: a C++ library for zkSNARK proofs. https://github.com/scipr-lab/libsnark.

[35] SEGAL, A., FEIGENBAUM, J., AND FORD, B. Privacy-preserving lawful contact chaining [preliminary report]. In Proceedings of the 2016 ACM Workshop on Privacy in the Electronic Society (New York, NY, USA, 2016), WPES '16, ACM, pp. 185–188.

[36] SEGAL, A., FORD, B., AND FEIGENBAUM, J. Catching bandits and only bandits: Privacy-preserving intersection warrants for lawful surveillance. In FOCI (2014).

[37] SMITH, S. W. Kudzu in the courthouse: Judgments made in the shade. The Federal Courts Law Review 3, 2 (2009).

[38] SMITH, S. W. Gagged, sealed & delivered: Reforming ECPA's secret docket. Harvard Law & Policy Review 6 (2012), 313–459.

[39] SUNDARESWARAN, S., SQUICCIARINI, A., AND LIN, D. Ensuring distributed accountability for data sharing in the cloud. IEEE Transactions on Dependable and Secure Computing 9, 4 (2012), 556–568.

[40] TAN, Y. S., KO, R. K., AND HOLMES, G. Security and data accountability in distributed systems: A provenance survey. In High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing (HPCC EUC), 2013 IEEE 10th International Conference on (2013), IEEE, pp. 1571–1578.

[41] VIRZA, M. Private communication, November 2017.

[42] WANG, X., RANELLUCCI, S., AND KATZ, J. Global-scale secure multiparty computation. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, CCS 2017, Dallas, TX, USA, October 30 - November 03, 2017 (2017), B. M. Thuraisingham, D. Evans, T. Malkin, and D. Xu, Eds., ACM, pp. 39–56.

[43] WEITZNER, D. J., ABELSON, H., BERNERS-LEE, T., FEIGENBAUM, J., HENDLER, J. A., AND SUSSMAN, G. J. Information accountability. Commun. ACM 51, 6 (2008), 82–87.

[44] XIAO, Z., KATHIRESSHAN, N., AND XIAO, Y. A survey of accountability in computer networks and distributed systems. Security and Communication Networks 9, 4 (2016), 290–315.



                      1 Protocols should scale to roughly 1000 partiesthe approximate size of the federal judiciary [10]performing efficiently enough to facilitate periodictransparency reports

                      2 Protocols should not require all parties to be onlineregularly or at the same time

                      In the subsections that follow we describe and evaluateour implementations in light of these goals

                      61 Computing Totals in WebMPC

                      WebMPC is a JavaScript-based library that can securelycompute sums in a single round The underlying proto-col relies on two parties who are trusted not to colludewith one another an analyst who distributes masking in-formation to all protocol participants at the beginning ofthe process and receives the final aggregate statistic anda facilitator who aggregates this information together in

                      Figure 5 Performance of MPC using WebMPC library

                      masked form The participants use the masking infor-mation from the analyst to mask their inputs and sendthem to the facilitator who aggregates them and sendsthe result (ie a masked sum) to the analyst The ana-lyst removes the mask and uncovers the aggregated re-sult Once the participants have their masks they receiveno further messages from any other party they can sub-mit this masked data to the facilitator in an uncoordinatedfashion and go offline immediately afterwards Even ifsome anticipated participants do not send data the pro-tocol can still run to completion with those who remain

                      To make this protocol feasible in our setting we needto identify a facilitator and analyst who will not colludeIn many circumstances it would be acceptable to rely onreputable institutions already present in the court systemsuch as the circuit courts of appeals the Supreme Courtor the Administrative Office of the US Courts

                      Although this protocolrsquos simplicity limits its general-ity it also makes it possible for the protocol to scale ef-ficiently to a large number of participants As Figure 5illustrates the protocol scales linearly with the numberof parties Even with 400 partiesmdashthe largest size wetestedmdashthe protocol still completed in just under 75 sec-onds Extrapolating from the linear trend it would takeabout three minutes to compute a summation across theentire federal judiciary Since existing transparency re-ports are typically released just once or twice a year itis reasonable to invest three minutes of computation (orless than a fifth of a second per judge) for each statistic

                      62 Thresholds and Hierarchy with JiffTo make use of MPC operations beyond totals we turnedto Jiff another MPC library implemented in JavaScriptJiff is designed to support MPC for arbitrary function-alities although inbuilt support for some more complexfunctionalities are still under development at the time ofwriting Most importantly for our needs Jiff supportsthresholding and multiplication in addition to sums Weevaluated Jiff on three different MPC protocols totals (aswith WebMPC) additive thresholding (ie how manyvalues exceeded a specific threshold) and multiplicativethresholding (ie did all values exceed a specific thresh-old) In contrast to computing totals via summationcertain operations like thresholding require more compli-

                      Figure 6 Flat total (red) additive threshold (blue) andmultiplicative thresholds (green) protocols in Jiff

                      Figure 7 Hierarchical total (red) additive threshold(blue) and multiplicative thresholds (green) protocols inJiff Note the difference in axis scales from Figure 6

                      cated computation and multiple rounds of communica-tion By building on Jiff with our hierarchical MPC im-plementation we demonstrate that these operations areviable at the scale required by the federal court system

                      As a baseline we ran sums additive thresholding andmultiplicative thresholding benchmarks with all judgesas full participants in the MPC protocol sharing theworkload equally a configuration we term the flat pro-tocol (in contrast to the hierarchical protocol we presentnext) Figure 6 illustrates that the running time of theseprotocols grows quadratically with the number of judgesparticipating These running times quickly became un-tenable While summation took several minutes amonghundreds of judges both thresholding benchmarks couldbarely handle tens of judges in the same time envelopesThese graphs illustrate the substantial performance dis-parity between summation and thresholding

                      In Section 4 we described an alternative ldquohierarchi-calrdquo MPC configuration to reduce this quadratic growthto linear As depicted in Figure 4 each lower-court judgesplits a piece of data into twelve secret shares one foreach circuit court of appeals These shares are sent to thecorresponding courts who conduct a twelve-party MPCthat performs a total or thresholding operation based onthe input shares If n lower-court judges participatethe protocol is tantamount to computing n twelve-partysummations followed by a single n-input summation orthreshold As n increases the amount of work scales lin-early So long as at least one circuit court remains honestand uncompromised the secrecy of the lower court dataendures by the security of the secret-sharing scheme

                      Figure 7 illustrates the linear scaling of the twelve-party portion of the hierarchical protocols we measuredonly the computation time after the circuit courts re-ceived all of the additive shares from the lower courtsWhile the flat summation protocol took nearly eight min-utes to run on 300 judges the hierarchical summationscaled to 1000 judges in less than 20 seconds bestingeven the WebMPC results Although thresholding char-acteristically remained much slower than summation thehierarchical protocol scaled to nearly 250 judges in aboutthe same amount of time that it took the flat protocol torun on 35 judges Since the running times for the thresh-old protocols were in the tens of minutes for large bench-marks the linear trend is noisier than for the total proto-col Most importantly both of these protocols scaled lin-early meaning thatmdashgiven sufficient timemdashthresholdingcould scale up to the size of the federal court systemThis performance is acceptable if a few choice thresholdsare computed at the frequency at which existing trans-parency reports are published18

                      One additional benefit of the hierarchical protocols isthat lower courts do not need to stay online while theprotocol is executing a goal we articulated at the begin-ning of this section A lower court simply needs to sendin its shares to the requisite circuit courts one messageper circuit court to a grand total of twelve messages af-ter which it is free to disconnect In contrast the flatprotocol grinds to a halt if even a single judge goes of-fline The availability of the hierarchical protocol relieson a small set of circuit courts who could invest in morerobust infrastructure

                      7 Evaluation of SNARKs

                      We define the syntax of preprocessing zero-knowledgeSNARKs for arithmetic circuit satisfiability [15]

                      A SNARK is a triple of probabilistic polynomial-timealgorithms SNARK= (SetupProveVerify) as follows

                      bull Setup(1κ R) takes as input the security parameterκ and a description of a binary relation R (an arith-metic circuit of size polynomial in κ) and outputs apair (pkRvkR) of a proving key and verification key

                      bull Prove(pkR(xw)) takes as input a proving keypkR and an input-witness pair (xw) and out-puts a proof π attesting to x isin LR where LR =x existw st (xw) isin R

                      bull Verify(vkR(xπ)) takes as input a verification keyvkR and an input-proof pair (xπ) and outputs a bitindicating whether π is a valid proof for x isin LR

                      18Too high a frequency is also inadvisable due to the possibility ofrevealing too granular information when combined with the timings ofspecific investigations court orders

                      Before participants can create and verify SNARKsthey must establish a proving key which any partici-pant can use to create a SNARK and a correspondingverification key which any participant can use to verifya SNARK so created Both of these keys are publiclyknown The keys are distinct for each circuit (represent-ing an NP relation) about which proofs are generatedand can be reused to produce as many different proofswith respect to that circuit as desired Key generationuses randomness that if known or biased could allowparticipants to create proofs of false statements [13] Thekey generation process must therefore protect and thendestroy this information

                      Using MPC to do key generation based on randomnessprovided by many different parties provides the guaran-tee that as long as at least one of the MPC participants be-haved correctly (ie did not bias his randomness and de-stroyed it afterward) the resulting keys are good (ie donot permit proofs of false statements) This approach hasbeen used in the past most notably by the cryptocurrencyZcash [8] Despite the strong guarantees provided by thisapproach to key generation when at least one party is notcorrupted concerns have been expressed about the wis-dom of trusting in the assumption of one honest party inthe Zcash setting which involves large monetary valuesand a system design inherently centered around the prin-ciples of full decentralization

                      For our system we propose key generation be donein a one-time MPC among several of the tradition-ally reputable institutions in the court system such asthe Supreme Court or Administrative Office of the USCourts ideally together with other reputable parties fromdifferent branches of government In our setting the useof MPC for SNARK key generation does not constituteas pivotal and potentially risky a trust assumption as inZcash in that the court system is close-knit and inher-ently built with the assumption of trustworthiness of cer-tain entities within the system In contrast a decentral-ized cryptocurrency (1) must due to its distributed na-ture rely for key generation on MPC participants that areessentially strangers to most others in the system and (2)could be said to derive its very purpose from not relyingon the trustworthiness of any small set of parties

                      We note that since key generation is a one-time taskfor each circuit we can tolerate a relatively performance-intensive process Proving and verification keys can bedistributed on the ledger

                      71 Argument TypesOur implementation supports three types of arguments

                      Argument of knowledge for a commitment (Pk) Oursimplest type of argument attests the proverrsquos knowl-edge of the content of a given commitment c ie that

                      she could open the commitment if required Whenevera party publishes a commitment she can accompany itwith a SNARK attesting that she knows the message andrandomness that were used to generate the commitmentFormally this is an argument that the prover knows mand ω that correspond to a publicly known c such thatOpen(mcω) = 1

                      Argument of commitment equality (Peq) Our secondtype of argument attests that the content of two pub-licly known commitments c1c2 is the same That is fortwo publicly known commitments c1 and c2 the proverknows m1 m2 ω1 and ω2 such that Open(m1c1ω1) =1andOpen(m2c2ω2) = 1andm1 = m2

                      More concretely suppose that an agency wishes torelease relational informationmdashthat the identifier (egemail address) in the request is the same identifier that ajudge approved The judge and law enforcemnet agencypost commitments c1 and c2 respectively to the identi-fiers they used The law enforcement agency then postsan argument attesting that the two commitments are tothe same value19 Since circuits use fixed-size inputs anargument implicitly reveals the length of the committedmessage To hide this information the law enforcementagency can pad each input up to a uniform length

                      Peq may be too revealing under certain circumstancesfor the public to verify the argument the agency (whoposted c2) must explicitly identify c1 potentially reveal-ing which judge authorized the data request and when

                      Existential argument of commitment equality (Pexist)Our third type of commitment allows decreasing the res-olution of the information revealed by proving that acommitmentrsquos content is the same as that of some othercommitment among many Formally it shows that forpublicly known commitments cc1 cN respectively tosecret values (mω)(m1ω1) (mN ωN) exist i such thatOpen(mcω) = 1andOpen(miciωi) = 1andm = mi Wetreat i as an additional secret input so that for any valueof N only two commitments need to be opened Thisscheme trades off between resolution (number of com-mitments) and efficiency a question we explore below

                      We have chosen these three types of arguments to im-plement but LibSNARK supports arbitrary predicates inprinciple and there are likely others that would be usefuland run efficiently in practice A useful generalizationof Peq and Pexist would be to replace equality with more so-phisticated domain-specific predicates instead of show-ing that messages m1m2 corresponding to a pair of com-

                      19To produce a proof for Peq the prover (eg the agency) needs toknow both ω2 and ω1 but in some cases c1 (and thus ω1) may havebeen produced by a different entity (eg the judge) Publicizing ω1 isunacceptable as it compromises the hiding of the commitment contentTo solve this problem the judge can include ω1 alongside m1 in secretdocuments that both parties possess (eg the court order)

                      mitments are equal one could show p(m1m2) = 1 forother predicates p (eg ldquoless-thanrdquo or ldquosigned by samecourtrdquo) The types of arguments that can be implementedefficiently will expand as SNARK librariesrsquo efficiencyimproves our system inherits such efficiency gains

                      72 ImplementationWe implemented these zero-knowledge arguments withLibSNARK [34] a C++ library for creating general-purpose SNARKs from arithmetic circuits We im-plemented commitments using the SHA256 hash func-tion20 ω is a 256-bit random string appended to themessage before it is hashed In this section we showthat useful statements can be proven within a reasonableperformance envelope We consider six criteria the sizeof the proving key the size of the verification key thesize of the proof statement the time to generate keys thetime to create proofs and the time to verify proofs Weevaluated these metrics with messages from 16 to 1232bytes on Pk Peq and Pexist (N = 100 400 700 and 1000large enough to obscure links between commitments) ona computer with 16 CPU cores and 64GB of RAM

                      Argument size The argument is just 287 bytes Accom-panying each argument are its public inputs (in this casecommitments) Each commitment is 256 bits21 An au-ditor needs to store these commitments anyway as part ofthe ledger and each commitment can be stored just onceand reused for many proofs

                      Verification key size The size of the verification keyis proportional to the size of the circuit and its publicinputs The key was 106KB for Pk (one commitmentas input and one SHA256 circuit) and 2083KB for Peq(two commitments and two SHA256 circuits) AlthoughPexist computes SHA256 just twice its smallest input 100commitments is 50 times as large as that of Pk and Peqthe keys are correspondingly larger and grow linearlywith the input size For 100 400 700 and 1000 com-mitments the verification keys were respectively 10MB41MB 71MB and 102MB Since only one verificationkey is necessary for each circuit these keys are easilysmall enough to make large-scale verification feasible

Proving key size. The proving keys are much larger, in the hundreds of megabytes. Their size grows linearly with the size of the circuit, so longer messages (which require more SHA256 computations), more complicated circuits, and (for P_∃) more inputs lead to larger keys. Figure 8a reflects this trend. Proving keys are largest for P_∃ with 1000 inputs on 1232-byte messages, and shrink as the message size and the number of commitments decrease. P_k and P_eq, which have simpler circuits, still have bigger proving keys for bigger messages. Although these keys are large, only entities that create each kind of proof need to store the corresponding key. Storing one key for each type of argument we have presented takes only about 1GB at the largest input sizes.

Figure 8: SNARK evaluation. (a) Proving key size. (b) Key generation time. (c) Argument generation time.

Footnote 20: Certain other hash functions may be more amenable to representation as arithmetic circuits and thus more "SNARK-friendly". We opted for a proof of concept with SHA256, as it is so widely used.

Footnote 21: LibSNARK stores each bit in a 32-bit integer, so an argument involving k commitments takes about 1024k bytes. A bit-vector representation would save a factor of 32.

Key generation time. Key generation time increased linearly with the size of the keys, from a few seconds for P_k and P_eq on small messages to a few minutes for P_∃ on the largest parameters (Figure 8b). Since key generation is a one-time process to add a new kind of proof in the form of a circuit, we find these numbers acceptable.

Argument generation time. Argument generation time increased linearly with proving key size and ranged from a few seconds on the smallest keys to a couple of minutes for the largest (Figure 8c). Since argument generation is a one-time task for each surveillance action, and the existing administrative processes for each surveillance action often take hours or days, we find this cost acceptable.

Argument verification time. Verifying P_k and P_eq on the largest message took only a few milliseconds. Verification times for P_∃ were larger and increased linearly with the number of input commitments: for 100, 400, 700, and 1000 commitments, verification took 40ms, 85ms, 243ms, and 338ms on the largest input. These times are still fast enough to verify many arguments quickly.

                      8 Generalization

Our proposal can be generalized beyond ECPA surveillance to encompass a broader class of secret information processes. Consider situations in which independent institutions need to act in a coordinated but secret fashion and, at the same time, are subject to public scrutiny. They should be able to convince the public that their actions are consistent with relevant rules. As in electronic surveillance, accountability requires the ability to attest to compliance without revealing sensitive information.

Example 1 (FISA court). Accountability is needed in other electronic surveillance arenas. The US Foreign Intelligence Surveillance Act (FISA) regulates surveillance in national security investigations. Because of the sensitive interests at stake, the entire process is overseen by a US court that meets in secret. The tension between secrecy and public accountability is even sharper for the FISA court: much of the data collected under FISA may stay permanently hidden inside US intelligence agencies, while data collected under ECPA may eventually be used in public criminal trials. This opacity may be justified, but it has engendered skepticism. The public has no way of knowing what the court is doing, nor any means of assuring itself that the intelligence agencies under the authority of FISA are even complying with the rules of that court. The FISA court itself has voiced concern that it has no independent means of assessing compliance with its orders because of the extreme secrecy involved. Applying our proposal to the FISA court, both the court and the public could receive proofs of documented compliance with FISA orders, as well as aggregate statistics on the scope of FISA surveillance activity, to the full extent possible without incurring national security risk.

Example 2 (Clinical trials). Accountability mechanisms are also important to assess the behavior of private parties, e.g., in clinical trials for new drugs. There are many parties to clinical trials, and much of the information involved is either private or proprietary. Yet regulators and the public have a need to know that responsible testing protocols are observed. Our system can achieve the right balance of transparency, accountability, and respect for the privacy of those involved in the trials.

Example 3 (Public fund spending). Accountability in the spending of taxpayer money is naturally a subject of public interest. Portions of public funds may be allocated for sensitive purposes (e.g., defense/intelligence), and the amounts and allocation thereof may be publicly unavailable due to their sensitivity. Our system would enable credible public assurances that taxpayer money is being spent in accordance with stated principles, while preserving secrecy of information considered sensitive.

8.1 Generalized Framework

We present abstractions describing the generalized version of our system and briefly outline how the concrete examples fit into this framework. A secret information process includes the following components; a short code sketch of these abstractions follows the list.

• A set of participants interact with each other. In our ECPA example, these are judges, law enforcement agencies, and companies.

• The participants engage in a protocol (e.g., to execute the procedures for conducting electronic surveillance). The protocol messages exchanged are hidden from the view of outsiders (e.g., the public), and yet it is of public interest that the protocol messages exchanged adhere to certain rules.

• A set of auditors (distinct from the participants) seeks to audit the protocol by verifying that a set of accountability properties are met.
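As a rough model of how these components fit together, the following stub is purely illustrative; the type, field, and function names are ours and not part of the system:

```python
from dataclasses import dataclass
from typing import Callable, List

Transcript = List[bytes]  # the protocol messages, hidden from outsiders

@dataclass
class SecretInformationProcess:
    participants: List[str]                # e.g., judges, agencies, companies
    protocol: Callable[..., Transcript]    # produces the (hidden) protocol messages
    auditors: List[str]                    # e.g., the public, oversight bodies
    accountability_properties: List[Callable[[Transcript], bool]]  # rules the messages must satisfy

def audit(process: SecretInformationProcess, transcript: Transcript) -> bool:
    # An auditor accepts only if every accountability property holds; in our system this
    # check is performed against commitments and zero-knowledge arguments rather than
    # against the transcript itself.
    return all(prop(transcript) for prop in process.accountability_properties)
```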

Abstractly, our system allows the controlled disclosure of four types of information.

Existential information reveals the existence of a piece of data, be it in a participant's local storage or the content of a communication between participants. In our case study, existential information is revealed with commitments, which indicate the existence of a document.

Relational information describes the actions participants take in response to the actions of others. In our case study, relational information is represented by the zero-knowledge arguments that attest that actions were taken lawfully (e.g., in compliance with a judge's order).

Content information is the data in storage and communication. In our case study, content information is revealed through aggregate statistics via MPC, and when documents are unsealed and their contents made public.

Timing information is a by-product of the other information. In our case study, timing information could include order issuance dates, turnaround times for data request fulfilment by companies, and seal expiry dates.

Revealing combinations of these four types of information with the specified cryptographic tools provides the flexibility to satisfy a range of application-specific accountability properties, as exemplified next.

Example 1 (FISA court). Participants are the FISA Court judges, the agencies requesting surveillance authorization, and any service providers involved in facilitating said surveillance. The protocol encompasses the legal process required to authorize surveillance, together with the administrative steps that must be taken to enact surveillance. Auditors are the public, the judges themselves, and possibly Congress. Desirable accountability properties are similar to those in our ECPA case study, e.g., attestations that certain rules are being followed in issuing surveillance orders, and release of aggregate statistics on surveillance activities under FISA.

Example 2 (Clinical trials). Participants are the institutions (companies or research centers) conducting clinical trials, comprising scientists, ethics boards, and data analysts; the organizations that manage regulations regarding clinical trials, such as the National Institutes of Health (NIH) and the Food and Drug Administration (FDA) in the US; and hospitals and other sources through which trial participants are drawn. The protocol encompasses the administrative process required to approve a clinical trial, and the procedure of gathering participants and conducting the trial itself. Auditors are the public, the regulatory organizations such as the NIH and the FDA, and possibly professional ethics committees. Desirable accountability properties include, e.g., attestations that appropriate procedures are respected in recruiting participants and administering trials, and release of aggregate statistics on clinical trial results without compromising individual participants' medical data.

Example 3 (Public fund spending). Participants are Congress (who appropriates the funding), defense/intelligence agencies, and service providers contracted in the spending of said funding. The protocol encompasses the processes by which Congress allocates funds to agencies, and agencies allocate funds to particular expenses. Auditors are the public and Congress. Desirable accountability properties include, e.g., attestations that procurements were within reasonable margins of market prices and satisfied documented needs, and release of aggregate statistics on the proportion of allocated money used and broad spending categories.

                      9 Conclusion

We present a cryptographic answer to the accountability challenge currently frustrating the US court system. Leveraging cryptographic commitments, zero-knowledge proofs, and secure MPC, we provide the electronic surveillance process a series of scalable, flexible, and practical measures for improving accountability while maintaining secrecy. While we focus on the case study of electronic surveillance, these strategies are equally applicable to a range of other secret information processes requiring accountability to an outside auditor.

                      Acknowledgements

We are grateful to Judge Stephen Smith for discussion and insights from the perspective of the US court system, to Andrei Lapets, Kinan Dak Albab, Rawane Issa, and Frederick Joossens for discussion on Jiff and WebMPC, and to Madars Virza for advice on SNARKs and LibSNARK.

This research was supported by the following grants: NSF MACS (CNS-1413920), DARPA IBM (W911NF-15-C-0236), Simons Investigator Award Agreement dated June 5th, 2012, and the Center for Science of Information (CSoI), an NSF Science and Technology Center, under grant agreement CCF-0939370.

References

[1] Bellman. https://github.com/ebfull/bellman.
[2] Electronic Communications Privacy Act, 18 U.S.C. 2701 et seq.
[3] Foreign Intelligence Surveillance Act, 50 U.S.C. ch. 36.
[4] Google transparency report. https://www.google.com/transparencyreport/userdatarequests/countries/?p=2016-12.
[5] Jiff. https://github.com/multiparty/jiff.
[6] Jsnark. https://github.com/akosba/jsnark.
[7] Law enforcement requests report. https://www.microsoft.com/en-us/about/corporate-responsibility/lerr.
[8] Zcash. https://z.cash.
[9] Wiretap report 2015. http://www.uscourts.gov/statistics-reports/wiretap-report-2015, December 2015.
[10] ADMINISTRATIVE OFFICE OF THE COURTS. Authorized judgeships. http://www.uscourts.gov/sites/default/files/allauth.pdf.
[11] ARAKI, T., BARAK, A., FURUKAWA, J., LICHTER, T., LINDELL, Y., NOF, A., OHARA, K., WATZMAN, A., AND WEINSTEIN, O. Optimized honest-majority MPC for malicious adversaries – breaking the 1 billion-gate per second barrier. In 2017 IEEE Symposium on Security and Privacy, SP 2017, San Jose, CA, USA, May 22-26, 2017 (2017), IEEE Computer Society, pp. 843–862.
[12] BATES, A., BUTLER, K. R., SHERR, M., SHIELDS, C., TRAYNOR, P., AND WALLACH, D. Accountable wiretapping – or – I know they can hear you now. In NDSS (2012).
[13] BEN-SASSON, E., CHIESA, A., GREEN, M., TROMER, E., AND VIRZA, M. Secure sampling of public parameters for succinct zero knowledge proofs. In Security and Privacy (SP), 2015 IEEE Symposium on (2015), IEEE, pp. 287–304.
[14] BEN-SASSON, E., CHIESA, A., TROMER, E., AND VIRZA, M. Scalable zero knowledge via cycles of elliptic curves. IACR Cryptology ePrint Archive 2014 (2014), 595.
[15] BEN-SASSON, E., CHIESA, A., TROMER, E., AND VIRZA, M. Succinct non-interactive zero knowledge for a von Neumann architecture. In 23rd USENIX Security Symposium (USENIX Security 14) (San Diego, CA, Aug. 2014), USENIX Association, pp. 781–796.
[16] BESTAVROS, A., LAPETS, A., AND VARIA, M. User-centric distributed solutions for privacy-preserving analytics. Communications of the ACM 60, 2 (February 2017), 37–39.
[17] CHAUM, D., AND VAN HEYST, E. Group signatures. In Advances in Cryptology – EUROCRYPT '91, Workshop on the Theory and Application of Cryptographic Techniques, Brighton, UK, April 8-11, 1991, Proceedings (1991), D. W. Davies, Ed., vol. 547 of Lecture Notes in Computer Science, Springer, pp. 257–265.
[18] DAMGÅRD, I., AND ISHAI, Y. Constant-round multiparty computation using a black-box pseudorandom generator. In Advances in Cryptology – CRYPTO 2005, 25th Annual International Cryptology Conference, Santa Barbara, California, USA, August 14-18, 2005, Proceedings (2005), V. Shoup, Ed., vol. 3621 of Lecture Notes in Computer Science, Springer, pp. 378–394.
[19] DAMGÅRD, I., KELLER, M., LARRAIA, E., PASTRO, V., SCHOLL, P., AND SMART, N. P. Practical covertly secure MPC for dishonest majority – or: Breaking the SPDZ limits. In Computer Security – ESORICS 2013 – 18th European Symposium on Research in Computer Security, Egham, UK, September 9-13, 2013, Proceedings (2013), J. Crampton, S. Jajodia, and K. Mayes, Eds., vol. 8134 of Lecture Notes in Computer Science, Springer, pp. 1–18.
[20] FEIGENBAUM, J., JAGGARD, A. D., AND WRIGHT, R. N. Open vs. closed systems for accountability. In Proceedings of the 2014 Symposium and Bootcamp on the Science of Security (2014), ACM, p. 4.
[21] FEIGENBAUM, J., JAGGARD, A. D., WRIGHT, R. N., AND XIAO, H. Systematizing "accountability" in computer science. Tech. rep., Yale University, Feb. 2012. Technical Report YALEU/DCS/TR-1452.
[22] FEUER, A., AND ROSENBERG, E. Brooklyn prosecutor accused of using illegal wiretap to spy on love interest, November 2016. https://www.nytimes.com/2016/11/28/nyregion/brooklyn-prosecutor-accused-of-using-illegal-wiretap-to-spy-on-love-interest.html.
[23] GOLDWASSER, S., AND PARK, S. Public accountability vs. secret laws: Can they coexist? A cryptographic proposal. In Proceedings of the 2017 Workshop on Privacy in the Electronic Society, Dallas, TX, USA, October 30 - November 3, 2017 (2017), B. M. Thuraisingham and A. J. Lee, Eds., ACM, pp. 99–110.
[24] JAKOBSEN, T. P., NIELSEN, J. B., AND ORLANDI, C. A framework for outsourcing of secure computation. In Proceedings of the 6th edition of the ACM Workshop on Cloud Computing Security, CCSW '14, Scottsdale, Arizona, USA, November 7, 2014 (2014), G. Ahn, A. Oprea, and R. Safavi-Naini, Eds., ACM, pp. 81–92.
[25] KAMARA, S. Restructuring the NSA metadata program. In International Conference on Financial Cryptography and Data Security (2014), Springer, pp. 235–247.
[26] KATZ, J., AND LINDELL, Y. Introduction to Modern Cryptography. CRC Press, 2014.
[27] KROLL, J., FELTEN, E., AND BONEH, D. Secure protocols for accountable warrant execution, 2014. http://www.cs.princeton.edu/~felten/warrant-paper.pdf.
[28] LAMPSON, B. Privacy and security: Usable security: how to get it. Communications of the ACM 52, 11 (2009), 25–27.
[29] LAPETS, A., VOLGUSHEV, N., BESTAVROS, A., JANSEN, F., AND VARIA, M. Secure MPC for Analytics as a Web Application. In 2016 IEEE Cybersecurity Development (SecDev) (Boston, MA, USA, November 2016), pp. 73–74.
[30] MASHIMA, D., AND AHAMAD, M. Enabling robust information accountability in e-healthcare systems. In HealthSec (2012).
[31] PAPANIKOLAOU, N., AND PEARSON, S. A cross-disciplinary review of the concept of accountability: A survey of the literature, 2013. Available online at http://www.bic-trust.eu/files/2013/06/Paper_NP.pdf.
[32] PEARSON, S. Toward accountability in the cloud. IEEE Internet Computing 15, 4 (2011), 64–69.
[33] RIVEST, R. L., SHAMIR, A., AND TAUMAN, Y. How to leak a secret. In Advances in Cryptology – ASIACRYPT 2001, 7th International Conference on the Theory and Application of Cryptology and Information Security, Gold Coast, Australia, December 9-13, 2001, Proceedings (2001), C. Boyd, Ed., vol. 2248 of Lecture Notes in Computer Science, Springer, pp. 552–565.
[34] SCIPR LAB. libsnark: a C++ library for zkSNARK proofs. https://github.com/scipr-lab/libsnark.
[35] SEGAL, A., FEIGENBAUM, J., AND FORD, B. Privacy-preserving lawful contact chaining [preliminary report]. In Proceedings of the 2016 ACM on Workshop on Privacy in the Electronic Society (New York, NY, USA, 2016), WPES '16, ACM, pp. 185–188.
[36] SEGAL, A., FORD, B., AND FEIGENBAUM, J. Catching bandits and only bandits: Privacy-preserving intersection warrants for lawful surveillance. In FOCI (2014).
[37] SMITH, S. W. Kudzu in the courthouse: Judgments made in the shade. The Federal Courts Law Review 3, 2 (2009).
[38] SMITH, S. W. Gagged, sealed & delivered: Reforming ECPA's secret docket. Harvard Law & Policy Review 6 (2012), 313–459.
[39] SUNDARESWARAN, S., SQUICCIARINI, A., AND LIN, D. Ensuring distributed accountability for data sharing in the cloud. IEEE Transactions on Dependable and Secure Computing 9, 4 (2012), 556–568.
[40] TAN, Y. S., KO, R. K., AND HOLMES, G. Security and data accountability in distributed systems: A provenance survey. In High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing (HPCC_EUC), 2013 IEEE 10th International Conference on (2013), IEEE, pp. 1571–1578.
[41] VIRZA, M. Private communication, November 2017.
[42] WANG, X., RANELLUCCI, S., AND KATZ, J. Global-scale secure multiparty computation. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, CCS 2017, Dallas, TX, USA, October 30 - November 03, 2017 (2017), B. M. Thuraisingham, D. Evans, T. Malkin, and D. Xu, Eds., ACM, pp. 39–56.
[43] WEITZNER, D. J., ABELSON, H., BERNERS-LEE, T., FEIGENBAUM, J., HENDLER, J. A., AND SUSSMAN, G. J. Information accountability. Commun. ACM 51, 6 (2008), 82–87.
[44] XIAO, Z., KATHIRESSHAN, N., AND XIAO, Y. A survey of accountability in computer networks and distributed systems. Security and Communication Networks 9, 4 (2016), 290–315.


                        Existential argument of commitment equality (Pexist)Our third type of commitment allows decreasing the res-olution of the information revealed by proving that acommitmentrsquos content is the same as that of some othercommitment among many Formally it shows that forpublicly known commitments cc1 cN respectively tosecret values (mω)(m1ω1) (mN ωN) exist i such thatOpen(mcω) = 1andOpen(miciωi) = 1andm = mi Wetreat i as an additional secret input so that for any valueof N only two commitments need to be opened Thisscheme trades off between resolution (number of com-mitments) and efficiency a question we explore below

                        We have chosen these three types of arguments to im-plement but LibSNARK supports arbitrary predicates inprinciple and there are likely others that would be usefuland run efficiently in practice A useful generalizationof Peq and Pexist would be to replace equality with more so-phisticated domain-specific predicates instead of show-ing that messages m1m2 corresponding to a pair of com-

                        19To produce a proof for Peq the prover (eg the agency) needs toknow both ω2 and ω1 but in some cases c1 (and thus ω1) may havebeen produced by a different entity (eg the judge) Publicizing ω1 isunacceptable as it compromises the hiding of the commitment contentTo solve this problem the judge can include ω1 alongside m1 in secretdocuments that both parties possess (eg the court order)

                        mitments are equal one could show p(m1m2) = 1 forother predicates p (eg ldquoless-thanrdquo or ldquosigned by samecourtrdquo) The types of arguments that can be implementedefficiently will expand as SNARK librariesrsquo efficiencyimproves our system inherits such efficiency gains

                        72 ImplementationWe implemented these zero-knowledge arguments withLibSNARK [34] a C++ library for creating general-purpose SNARKs from arithmetic circuits We im-plemented commitments using the SHA256 hash func-tion20 ω is a 256-bit random string appended to themessage before it is hashed In this section we showthat useful statements can be proven within a reasonableperformance envelope We consider six criteria the sizeof the proving key the size of the verification key thesize of the proof statement the time to generate keys thetime to create proofs and the time to verify proofs Weevaluated these metrics with messages from 16 to 1232bytes on Pk Peq and Pexist (N = 100 400 700 and 1000large enough to obscure links between commitments) ona computer with 16 CPU cores and 64GB of RAM

                        Argument size The argument is just 287 bytes Accom-panying each argument are its public inputs (in this casecommitments) Each commitment is 256 bits21 An au-ditor needs to store these commitments anyway as part ofthe ledger and each commitment can be stored just onceand reused for many proofs

                        Verification key size The size of the verification keyis proportional to the size of the circuit and its publicinputs The key was 106KB for Pk (one commitmentas input and one SHA256 circuit) and 2083KB for Peq(two commitments and two SHA256 circuits) AlthoughPexist computes SHA256 just twice its smallest input 100commitments is 50 times as large as that of Pk and Peqthe keys are correspondingly larger and grow linearlywith the input size For 100 400 700 and 1000 com-mitments the verification keys were respectively 10MB41MB 71MB and 102MB Since only one verificationkey is necessary for each circuit these keys are easilysmall enough to make large-scale verification feasible

                        Proving key size The proving keys are much largerin the hundreds of megabytes Their size grows linearlywith the size of the circuit so longer messages (whichrequire more SHA256 computations) more complicatedcircuits and (for Pexist) more inputs lead to larger keys Fig-ure 8a reflects this trend Proving keys are largest for Pexist

                        20Certain other hash functions may be more amenable to representa-tion as arithmetic circuits and thus more ldquoSNARK-friendlyrdquo We optedfor a proof of concept with SHA256 as it is so widely used

                        21LibSNARK stores each bit in a 32-bit integer so an argument in-volving k commitments takes about 1024k bytes A bit-vector repre-sentation would save a factor of 32

                        with 1000 inputs on 1232KB messages and shrink as themessage size and the number of commitments decreasePk and Peq which have simpler circuits still have biggerproving keys for bigger messages Although these keysare large only entities that create each kind of proof needto store the corresponding key Storing one key for eachtype of argument we have presented takes only about1GB at the largest input sizes

                        Key generation time Key generation time increasedlinearly with the size of the keys from a few secondsfor Pk and Peq on small messages to a few minutes for Pexiston the largest parameters (Figure 8b) Since key genera-tion is a one-time process to add a new kind of proof inthe form of a circuit we find these numbers acceptable

                        Argument generation time Argument generation timeincreased linearly with proving key size and ranged froma few seconds on the smallest keys to a couple of minutesfor largest (Figure 8c) Since argument generation is aone-time task for each surveillance action and the exist-ing administrative processes for each surveillance actionoften take hours or days we find this cost acceptable

                        Argument verification time Verifying Pk and Peq on thelargest message took only a few milliseconds Verifica-tion times for Pexist were larger and increased linearly withthe number of input commitments For 100 400 700and 1000 commitments verification took 40ms 85ms243ms and 338ms on the largest input These times arestill fast enough to verify many arguments quickly

                        8 Generalization

                        Our proposal can be generalized beyond ECPA surveil-lance to encompass a broader class of secret informationprocesses Consider situations in which independent in-stitutions need to act in a coordinated but secret fash-ion and at the same time are subject to public scrutinyThey should be able to convince the public that their ac-tions are consistent with relevant rules As in electronicsurveillance accountability requires the ability to attestto compliance without revealing sensitive information

                        Example 1 (FISA court) Accountability is needed inother electronic surveillance arenas The US Foreign In-telligence Surveillance Act (FISA) regulates surveillancein national security investigations Because of the sensi-tive interests at stake the entire process is overseen by aUS court that meets in secret The tension between se-crecy and public accountability is even sharper for theFISA court much of the data collected under FISA maystay permanently hidden inside US intelligence agencieswhile data collected under ECPA may eventually be usedin public criminal trials This opacity may be justifiedbut it has engendered skepticism The public has no way

                        (a) Proving key size (b) Key generation time (c) Argument generation time

                        Figure 8 SNARK evaluation

                        of knowing what the court is doing nor any means of as-suring itself that the intelligence agencies under the au-thority of FISA are even complying with the rules of thatcourt The FISA court itself has voiced concern aboutthat it has no independent means of assessing compliancewith its orders because of the extreme secrecy involvedApplying our proposal to the FISA court both the courtand the public could receive proofs of documented com-pliance with FISA orders as well as aggregate statisticson the scope of FISA surveillance activity to the full ex-tent possible without incurring national security risk

                        Example 2 (Clinical trials) Accountability mecha-nisms are also important to assess behavior of privateparties eg in clinical trials for new drugs There aremany parties to clinical trials and much of the informa-tion involved is either private or proprietary Yet regula-tors and the public have a need to know that responsibletesting protocols are observed Our system can achievethe right balance of transparency accountability and re-spect for privacy of those involved in the trials

                        Example 3 (Public fund spending) Accountability inspending of taxpayer money is naturally a subject of pub-lic interest Portions of public funds may be allocated forsensitive purposes (eg defenseintelligence) and theamounts and allocation thereof may be publicly unavail-able due to their sensitivity Our system would enablecredible public assurances that taxpayer money is beingspent in accordance with stated principles while preserv-ing secrecy of information considered sensitive

                        81 Generalized Framework

                        We present abstractions describing the generalized ver-sion of our system and briefly outline how the concreteexamples fit into this framework A secret informationprocess includes the following componentsbull A set of participants interact with each other In our

                        ECPA example these are judges law enforcementagencies and companies

                        bull The participants engage in a protocol (eg toexecute the procedures for conducting electronicsurveillance) The protocol messages exchanged are

                        hidden from the view of outsiders (eg the public)and yet it is of public interest that the protocol mes-sages exchanged adhere to certain rules

                        bull A set of auditors (distinct from the participants)seeks to audit the protocol by verifying that a setof accountability properties are met

                        Abstractly our system allows the controlled disclosureof four types of information

                        Existential information reveals the existence of a pieceof data be it in a participantrsquos local storage or the contentof a communication between participants In our casestudy existential information is revealed with commit-ments which indicate the existence of a document

                        Relational information describes the actions partici-pants take in response to the actions of others In ourcase study relational information is represented by thezero-knowledge arguments that attest that actions weretaken lawfully (eg in compliance with a judgersquos order)

                        Content information is the data in storage and com-munication In our case study content information is re-vealed through aggregate statistics via MPC and whendocuments are unsealed and their contents made public

                        Timing information is a by-product of the other infor-mation In our case study timing information could in-clude order issuance dates turnaround times for data re-quest fulfilment by companies and seal expiry dates

                        Revealing combinations of these four types of infor-mation with the specified cryptographic tools providesthe flexibility to satisfy a range of application-specificaccountability properties as exemplified next

                        Example 1 (FISA court) Participants are the FISACourt judges the agencies requesting surveillance autho-rization and any service providers involved in facilitat-ing said surveillance The protocol encompasses the le-gal process required to authorize surveillance togetherwith the administrative steps that must be taken to enactsurveillance Auditors are the public the judges them-selves and possibly Congress Desirable accountabilityproperties are similar to those in our ECPA case studyeg attestations that certain rules are being followedin issuing surveillance orders and release of aggregatestatistics on surveillance activities under FISA

                        Example 2 (Clinical trials) Participants are the insti-tutions (companies or research centers) conducting clin-ical trials comprising scientists ethics boards and dataanalysts the organizations that manage regulations re-garding clinical trials such as the National Institutesof Health (NIH) and the Food and Drug Administra-tion (FDA) in the US and hospitals and other sourcesthrough which trial participants are drawn The proto-col encompasses the administrative process required toapprove a clinical trial and the procedure of gatheringparticipants and conducting the trial itself Auditors arethe public the regulatory organizations such as the NIHand the FDA and possibly professional ethics commit-tees Desirable accountability properties include egattestations that appropriate procedures are respected inrecruiting participants and administering trials and re-lease of aggregate statistics on clinical trial results with-out compromising individual participantsrsquo medical data

                        Example 3 (Public fund spending) Participantsare Congress (who appropriates the funding) de-fenseintelligence agencies and service providers con-tracted in the spending of said funding The protocolencompasses the processes by which Congress allocatesfunds to agencies and agencies allocate funds to par-ticular expenses Auditors are the public and CongressDesirable accountability properties include eg attesta-tions that procurements were within reasonable marginsof market prices and satisfied documented needs and re-lease of aggregate statistics on the proportion of allocatedmoney used and broad spending categories

                        9 Conclusion

                        We present a cryptographic answer to the accountabil-ity challenge currently frustrating the US court sys-tem Leveraging cryptographic commitments zero-knowledge proofs and secure MPC we provide the elec-tronic surveillance process a series of scalable flexi-ble and practical measures for improving accountabil-ity while maintaining secrecy While we focus on thecase study of electronic surveillance these strategies areequally applicable to a range of other secret informationprocesses requiring accountability to an outside auditor

                        Acknowledgements

                        We are grateful to Judge Stephen Smith for discussionand insights from the perspective of the US court systemto Andrei Lapets Kinan Dak Albab Rawane Issa andFrederick Joossens for discussion on Jiff and WebMPCand to Madars Virza for advice on SNARKs and Lib-SNARK

                        This research was supported by the following grantsNSF MACS (CNS-1413920) DARPA IBM (W911NF-15-C-0236) SIMONS Investigator Award AgreementDated June 5th 2012 and the Center for Science of In-formation (CSoI) an NSF Science and Technology Cen-ter under grant agreement CCF-0939370

                        References[1] Bellman httpsgithubcomebfullbellman

                        [2] Electronic Communications Privacy Act 18 USC 2701 et seq

                        [3] Foreign Intelligence Surveillance Act 50 USC ch 36

                        [4] Google transparency report httpswwwgooglecom

                        transparencyreportuserdatarequestscountries

                        p=2016-12

                        [5] Jiff httpsgithubcommultipartyjiff

                        [6] Jsnark httpsgithubcomakosbajsnark

                        [7] Law enforcement requests report httpswwwmicrosoft

                        comen-usaboutcorporate-responsibilitylerr

                        [8] Zcash httpszcash

                        [9] Wiretap report 2015 httpwwwuscourtsgov

                        statistics-reportswiretap-report-2015 Decem-ber 2015

                        [10] ADMINSTRATIVE OFFICE OF THE COURTS Authorized judge-ships httpwwwuscourtsgovsitesdefaultfilesallauthpdf

                        [11] ARAKI T BARAK A FURUKAWA J LICHTER T LIN-DELL Y NOF A OHARA K WATZMAN A AND WEIN-STEIN O Optimized honest-majority MPC for malicious ad-versaries - breaking the 1 billion-gate per second barrier In2017 IEEE Symposium on Security and Privacy SP 2017 SanJose CA USA May 22-26 2017 (2017) IEEE Computer Soci-ety pp 843ndash862

                        [12] BATES A BUTLER K R SHERR M SHIELDS CTRAYNOR P AND WALLACH D Accountable wiretappingndashorndashi know they can hear you now NDSS (2012)

                        [13] BEN-SASSON E CHIESA A GREEN M TROMER EAND VIRZA M Secure sampling of public parameters for suc-cinct zero knowledge proofs In Security and Privacy (SP) 2015IEEE Symposium on (2015) IEEE pp 287ndash304

                        [14] BEN-SASSON E CHIESA A TROMER E AND VIRZA MScalable zero knowledge via cycles of elliptic curves IACR Cryp-tology ePrint Archive 2014 (2014) 595

                        [15] BEN-SASSON E CHIESA A TROMER E AND VIRZA MSuccinct non-interactive zero knowledge for a von neumann ar-chitecture In 23rd USENIX Security Symposium (USENIX Se-curity 14) (San Diego CA Aug 2014) USENIX Associationpp 781ndash796

                        [16] BESTAVROS A LAPETS A AND VARIA M User-centricdistributed solutions for privacy-preserving analytics Communi-cations of the ACM 60 2 (February 2017) 37ndash39

                        [17] CHAUM D AND VAN HEYST E Group signatures In Ad-vances in Cryptology - EUROCRYPT rsquo91 Workshop on the The-ory and Application of of Cryptographic Techniques BrightonUK April 8-11 1991 Proceedings (1991) D W DaviesEd vol 547 of Lecture Notes in Computer Science Springerpp 257ndash265

                        [18] DAMGARD I AND ISHAI Y Constant-round multiparty com-putation using a black-box pseudorandom generator In Advancesin Cryptology - CRYPTO 2005 25th Annual International Cryp-tology Conference Santa Barbara California USA August 14-18 2005 Proceedings (2005) V Shoup Ed vol 3621 of Lec-ture Notes in Computer Science Springer pp 378ndash394


Figure 6: Flat total (red), additive threshold (blue), and multiplicative threshold (green) protocols in Jiff.

Figure 7: Hierarchical total (red), additive threshold (blue), and multiplicative threshold (green) protocols in Jiff. Note the difference in axis scales from Figure 6.

cated computation and multiple rounds of communication. By building on Jiff with our hierarchical MPC implementation, we demonstrate that these operations are viable at the scale required by the federal court system.

As a baseline, we ran sums, additive thresholding, and multiplicative thresholding benchmarks with all judges as full participants in the MPC protocol, sharing the workload equally, a configuration we term the flat protocol (in contrast to the hierarchical protocol we present next). Figure 6 illustrates that the running time of these protocols grows quadratically with the number of judges participating. These running times quickly became untenable: while summation took several minutes among hundreds of judges, both thresholding benchmarks could barely handle tens of judges in the same time envelopes. These graphs illustrate the substantial performance disparity between summation and thresholding.

In Section 4 we described an alternative "hierarchical" MPC configuration to reduce this quadratic growth to linear. As depicted in Figure 4, each lower-court judge splits a piece of data into twelve secret shares, one for each circuit court of appeals. These shares are sent to the corresponding courts, who conduct a twelve-party MPC that performs a total or thresholding operation based on the input shares. If n lower-court judges participate, the protocol is tantamount to computing n twelve-party summations followed by a single n-input summation or threshold. As n increases, the amount of work scales linearly. So long as at least one circuit court remains honest and uncompromised, the secrecy of the lower-court data endures by the security of the secret-sharing scheme.
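To make the sharing step concrete, the following minimal Python sketch (an illustration of additive secret sharing, not the Jiff implementation; the modulus and the example counts are assumptions made for the example) shows each judge splitting a private count into twelve random shares that sum to the count modulo a prime, each circuit court adding the shares it receives, and the twelve partial sums combining to reveal only the overall total. In the deployed system that final combination would itself take place inside the twelve-party MPC.

    import secrets

    P = 2**61 - 1          # illustrative prime modulus for additive sharing
    NUM_CIRCUITS = 12      # one share per circuit court of appeals

    def share(value, n=NUM_CIRCUITS, p=P):
        """Split value into n additive shares that sum to value mod p."""
        shares = [secrets.randbelow(p) for _ in range(n - 1)]
        shares.append((value - sum(shares)) % p)
        return shares

    def aggregate_total(per_judge_counts):
        """Hierarchical total: each circuit court adds the shares it receives;
        combining the twelve partial sums reveals only the overall total."""
        circuit_totals = [0] * NUM_CIRCUITS
        for count in per_judge_counts:
            for court, s in enumerate(share(count)):
                circuit_totals[court] = (circuit_totals[court] + s) % P
        return sum(circuit_totals) % P

    # Example: 1000 judges, each reporting a small surveillance-order count.
    counts = [secrets.randbelow(5) for _ in range(1000)]
    assert aggregate_total(counts) == sum(counts) % P

Thresholding is not captured by this sketch: it additionally requires comparisons evaluated inside the MPC, which is what makes it so much costlier than summation in Figures 6 and 7.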

Figure 7 illustrates the linear scaling of the twelve-party portion of the hierarchical protocols; we measured only the computation time after the circuit courts received all of the additive shares from the lower courts. While the flat summation protocol took nearly eight minutes to run on 300 judges, the hierarchical summation scaled to 1000 judges in less than 20 seconds, besting even the WebMPC results. Although thresholding characteristically remained much slower than summation, the hierarchical protocol scaled to nearly 250 judges in about the same amount of time that it took the flat protocol to run on 35 judges. Since the running times for the threshold protocols were in the tens of minutes for large benchmarks, the linear trend is noisier than for the total protocol. Most importantly, both of these protocols scaled linearly, meaning that, given sufficient time, thresholding could scale up to the size of the federal court system. This performance is acceptable if a few choice thresholds are computed at the frequency at which existing transparency reports are published.18

One additional benefit of the hierarchical protocols is that lower courts do not need to stay online while the protocol is executing, a goal we articulated at the beginning of this section. A lower court simply needs to send its shares to the requisite circuit courts, one message per circuit court for a grand total of twelve messages, after which it is free to disconnect. In contrast, the flat protocol grinds to a halt if even a single judge goes offline. The availability of the hierarchical protocol relies on a small set of circuit courts, who could invest in more robust infrastructure.

                          7 Evaluation of SNARKs

We define the syntax of preprocessing zero-knowledge SNARKs for arithmetic circuit satisfiability [15].

A SNARK is a triple of probabilistic polynomial-time algorithms SNARK = (Setup, Prove, Verify), as follows:

• Setup(1κ, R) takes as input the security parameter κ and a description of a binary relation R (an arithmetic circuit of size polynomial in κ), and outputs a pair (pkR, vkR) of a proving key and a verification key.

• Prove(pkR, (x, w)) takes as input a proving key pkR and an input-witness pair (x, w), and outputs a proof π attesting to x ∈ LR, where LR = {x : ∃w s.t. (x, w) ∈ R}.

• Verify(vkR, (x, π)) takes as input a verification key vkR and an input-proof pair (x, π), and outputs a bit indicating whether π is a valid proof for x ∈ LR.
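The following non-cryptographic Python sketch only mirrors the call structure of these three algorithms so that the roles of the keys are easy to see; the "proof" it produces simply carries the witness, so it has none of the succinctness or zero-knowledge properties of a real SNARK (our implementation uses LibSNARK, as described in Section 7.2), and the toy relation at the end is an assumption made for the example.

    from typing import Any, Callable, NamedTuple, Tuple

    Relation = Callable[[Any, Any], bool]   # R(x, w) is True iff (x, w) is in R

    class ProvingKey(NamedTuple):
        relation: Relation

    class VerificationKey(NamedTuple):
        relation: Relation

    def setup(relation: Relation) -> Tuple[ProvingKey, VerificationKey]:
        # Per-circuit, one-time step; the real Setup consumes secret randomness
        # that must be protected and then destroyed (see the discussion below).
        return ProvingKey(relation), VerificationKey(relation)

    def prove(pk: ProvingKey, x: Any, w: Any) -> Any:
        assert pk.relation(x, w), "witness does not satisfy the relation"
        return w   # stand-in "proof": neither succinct nor zero-knowledge

    def verify(vk: VerificationKey, x: Any, proof: Any) -> bool:
        return vk.relation(x, proof)

    # Toy relation, for illustration only: "x is the square of the witness w".
    pk, vk = setup(lambda x, w: w * w == x)
    pi = prove(pk, 49, 7)
    assert verify(vk, 49, pi)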

18 Too high a frequency is also inadvisable due to the possibility of revealing overly granular information when combined with the timings of specific investigations' court orders.

Before participants can create and verify SNARKs, they must establish a proving key, which any participant can use to create a SNARK, and a corresponding verification key, which any participant can use to verify a SNARK so created. Both of these keys are publicly known. The keys are distinct for each circuit (representing an NP relation) about which proofs are generated, and can be reused to produce as many different proofs with respect to that circuit as desired. Key generation uses randomness that, if known or biased, could allow participants to create proofs of false statements [13]. The key generation process must therefore protect and then destroy this information.

Using MPC to do key generation based on randomness provided by many different parties provides the guarantee that, as long as at least one of the MPC participants behaved correctly (i.e., did not bias his randomness and destroyed it afterward), the resulting keys are good (i.e., do not permit proofs of false statements). This approach has been used in the past, most notably by the cryptocurrency Zcash [8]. Despite the strong guarantees provided by this approach to key generation when at least one party is not corrupted, concerns have been expressed about the wisdom of trusting in the assumption of one honest party in the Zcash setting, which involves large monetary values and a system design inherently centered around the principles of full decentralization.

For our system, we propose key generation be done in a one-time MPC among several of the traditionally reputable institutions in the court system, such as the Supreme Court or the Administrative Office of the US Courts, ideally together with other reputable parties from different branches of government. In our setting, the use of MPC for SNARK key generation does not constitute as pivotal and potentially risky a trust assumption as in Zcash, in that the court system is close-knit and inherently built with the assumption of trustworthiness of certain entities within the system. In contrast, a decentralized cryptocurrency (1) must, due to its distributed nature, rely for key generation on MPC participants that are essentially strangers to most others in the system, and (2) could be said to derive its very purpose from not relying on the trustworthiness of any small set of parties.

We note that since key generation is a one-time task for each circuit, we can tolerate a relatively performance-intensive process. Proving and verification keys can be distributed on the ledger.

7.1 Argument Types

Our implementation supports three types of arguments.

Argument of knowledge for a commitment (Pk). Our simplest type of argument attests to the prover's knowledge of the content of a given commitment c, i.e., that she could open the commitment if required. Whenever a party publishes a commitment, she can accompany it with a SNARK attesting that she knows the message and randomness that were used to generate the commitment. Formally, this is an argument that the prover knows m and ω that correspond to a publicly known c such that Open(m, c, ω) = 1.
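As a concrete point of reference, this minimal Python sketch spells out the commitment scheme described in Section 7.2 (SHA-256 over the message with 256 bits of appended randomness) and the relation that Pk attests to; the example message is hypothetical, and in the real system the relation is encoded as an arithmetic circuit and proven with LibSNARK rather than checked in the clear as here.

    import hashlib
    import os
    from typing import Tuple

    def commit(message: bytes) -> Tuple[bytes, bytes]:
        """c = SHA256(m || omega), with omega a fresh 256-bit random string."""
        omega = os.urandom(32)
        return hashlib.sha256(message + omega).digest(), omega

    def open_commitment(m: bytes, c: bytes, omega: bytes) -> bool:
        """Open(m, c, omega) = 1 iff c was computed from m and omega."""
        return hashlib.sha256(m + omega).digest() == c

    def relation_Pk(c: bytes, m: bytes, omega: bytes) -> bool:
        """P_k: public input c; secret witness (m, omega)."""
        return open_commitment(m, c, omega)

    c, omega = commit(b"example sealed order contents")   # hypothetical message
    assert relation_Pk(c, b"example sealed order contents", omega)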

Argument of commitment equality (Peq). Our second type of argument attests that the content of two publicly known commitments c1, c2 is the same. That is, for two publicly known commitments c1 and c2, the prover knows m1, m2, ω1, and ω2 such that Open(m1, c1, ω1) = 1 ∧ Open(m2, c2, ω2) = 1 ∧ m1 = m2.

More concretely, suppose that an agency wishes to release relational information: that the identifier (e.g., email address) in the request is the same identifier that a judge approved. The judge and law enforcement agency post commitments c1 and c2, respectively, to the identifiers they used. The law enforcement agency then posts an argument attesting that the two commitments are to the same value.19 Since circuits use fixed-size inputs, an argument implicitly reveals the length of the committed message. To hide this information, the law enforcement agency can pad each input up to a uniform length.

Peq may be too revealing under certain circumstances: for the public to verify the argument, the agency (who posted c2) must explicitly identify c1, potentially revealing which judge authorized the data request and when.

Existential argument of commitment equality (P∃). Our third type of argument allows decreasing the resolution of the information revealed by proving that a commitment's content is the same as that of some other commitment among many. Formally, it shows that for publicly known commitments c, c1, ..., cN, respectively to secret values (m, ω), (m1, ω1), ..., (mN, ωN), ∃i such that Open(m, c, ω) = 1 ∧ Open(mi, ci, ωi) = 1 ∧ m = mi. We treat i as an additional secret input so that, for any value of N, only two commitments need to be opened. This scheme trades off between resolution (number of commitments) and efficiency, a question we explore below.
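The two equality arguments can likewise be read as simple relations over such commitments. The Python sketch below states the relations behind Peq and P∃, reusing the same SHA-256 commitments (redefined here so the snippet stands alone); the padded identifier and the value N = 3 are assumptions made for the example, and the deployed system proves these relations inside a SNARK circuit rather than by opening the commitments.

    import hashlib
    import os

    def commit(m: bytes):
        omega = os.urandom(32)
        return hashlib.sha256(m + omega).digest(), omega

    def opened(m: bytes, c: bytes, omega: bytes) -> bool:
        return hashlib.sha256(m + omega).digest() == c

    def relation_Peq(c1, c2, m1, o1, m2, o2) -> bool:
        """P_eq: both public commitments open to the same secret message."""
        return opened(m1, c1, o1) and opened(m2, c2, o2) and m1 == m2

    def relation_Pexists(c, cs, i, m, omega, m_i, omega_i) -> bool:
        """P_exists: c opens to the same message as cs[i]; the index i is secret."""
        return opened(m, c, omega) and opened(m_i, cs[i], omega_i) and m == m_i

    # A fixed-size, padded identifier hides the true message length.
    target = b"target@example.com".ljust(64, b"\x00")

    c1, o1 = commit(target)   # posted by the judge
    c2, o2 = commit(target)   # posted by the agency
    assert relation_Peq(c1, c2, target, o1, target, o2)

    # P_exists: the agency's commitment matches one of N = 3 judge commitments
    # without revealing which one.
    decoys = [commit(b"other".ljust(64, b"\x00"))[0] for _ in range(2)]
    cs = [decoys[0], c1, decoys[1]]
    assert relation_Pexists(c2, cs, 1, target, o2, target, o1)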

We have chosen these three types of arguments to implement, but LibSNARK supports arbitrary predicates in principle, and there are likely others that would be useful and run efficiently in practice. A useful generalization of Peq and P∃ would be to replace equality with more sophisticated domain-specific predicates: instead of showing that messages m1, m2 corresponding to a pair of commitments are equal, one could show p(m1, m2) = 1 for other predicates p (e.g., "less-than" or "signed by same court"). The types of arguments that can be implemented efficiently will expand as SNARK libraries' efficiency improves; our system inherits such efficiency gains.

19 To produce a proof for Peq, the prover (e.g., the agency) needs to know both ω2 and ω1, but in some cases c1 (and thus ω1) may have been produced by a different entity (e.g., the judge). Publicizing ω1 is unacceptable, as it compromises the hiding of the commitment content. To solve this problem, the judge can include ω1 alongside m1 in secret documents that both parties possess (e.g., the court order).

7.2 Implementation

We implemented these zero-knowledge arguments with LibSNARK [34], a C++ library for creating general-purpose SNARKs from arithmetic circuits. We implemented commitments using the SHA256 hash function;20 ω is a 256-bit random string appended to the message before it is hashed. In this section we show that useful statements can be proven within a reasonable performance envelope. We consider six criteria: the size of the proving key, the size of the verification key, the size of the proof statement, the time to generate keys, the time to create proofs, and the time to verify proofs. We evaluated these metrics with messages from 16 to 1232 bytes on Pk, Peq, and P∃ (N = 100, 400, 700, and 1000, large enough to obscure links between commitments) on a computer with 16 CPU cores and 64GB of RAM.

Argument size. The argument is just 287 bytes. Accompanying each argument are its public inputs (in this case, commitments). Each commitment is 256 bits.21 An auditor needs to store these commitments anyway as part of the ledger, and each commitment can be stored just once and reused for many proofs.

Verification key size. The size of the verification key is proportional to the size of the circuit and its public inputs. The key was 106KB for Pk (one commitment as input and one SHA256 circuit) and 208.3KB for Peq (two commitments and two SHA256 circuits). Although P∃ computes SHA256 just twice, its smallest input, 100 commitments, is 50 times as large as that of Pk and Peq; the keys are correspondingly larger and grow linearly with the input size. For 100, 400, 700, and 1000 commitments, the verification keys were, respectively, 10MB, 41MB, 71MB, and 102MB. Since only one verification key is necessary for each circuit, these keys are easily small enough to make large-scale verification feasible.

Proving key size. The proving keys are much larger, in the hundreds of megabytes. Their size grows linearly with the size of the circuit, so longer messages (which require more SHA256 computations), more complicated circuits, and (for P∃) more inputs lead to larger keys. Figure 8a reflects this trend. Proving keys are largest for P∃ with 1000 inputs on 1232-byte messages, and shrink as the message size and the number of commitments decrease. Pk and Peq, which have simpler circuits, still have bigger proving keys for bigger messages. Although these keys are large, only entities that create each kind of proof need to store the corresponding key. Storing one key for each type of argument we have presented takes only about 1GB at the largest input sizes.

20 Certain other hash functions may be more amenable to representation as arithmetic circuits and thus more "SNARK-friendly". We opted for a proof of concept with SHA256, as it is so widely used.

21 LibSNARK stores each bit in a 32-bit integer, so an argument involving k commitments takes about 1024k bytes. A bit-vector representation would save a factor of 32.

Key generation time. Key generation time increased linearly with the size of the keys, from a few seconds for Pk and Peq on small messages to a few minutes for P∃ on the largest parameters (Figure 8b). Since key generation is a one-time process to add a new kind of proof in the form of a circuit, we find these numbers acceptable.

Argument generation time. Argument generation time increased linearly with proving key size and ranged from a few seconds on the smallest keys to a couple of minutes for the largest (Figure 8c). Since argument generation is a one-time task for each surveillance action, and the existing administrative processes for each surveillance action often take hours or days, we find this cost acceptable.

Argument verification time. Verifying Pk and Peq on the largest message took only a few milliseconds. Verification times for P∃ were larger and increased linearly with the number of input commitments: for 100, 400, 700, and 1000 commitments, verification took 40ms, 85ms, 243ms, and 338ms on the largest input. These times are still fast enough to verify many arguments quickly.

Figure 8: SNARK evaluation. (a) Proving key size. (b) Key generation time. (c) Argument generation time.

                          8 Generalization

Our proposal can be generalized beyond ECPA surveillance to encompass a broader class of secret information processes. Consider situations in which independent institutions need to act in a coordinated but secret fashion and, at the same time, are subject to public scrutiny. They should be able to convince the public that their actions are consistent with relevant rules. As in electronic surveillance, accountability requires the ability to attest to compliance without revealing sensitive information.

Example 1 (FISA court). Accountability is needed in other electronic surveillance arenas. The US Foreign Intelligence Surveillance Act (FISA) regulates surveillance in national security investigations. Because of the sensitive interests at stake, the entire process is overseen by a US court that meets in secret. The tension between secrecy and public accountability is even sharper for the FISA court: much of the data collected under FISA may stay permanently hidden inside US intelligence agencies, while data collected under ECPA may eventually be used in public criminal trials. This opacity may be justified, but it has engendered skepticism. The public has no way of knowing what the court is doing, nor any means of assuring itself that the intelligence agencies under the authority of FISA are even complying with the rules of that court. The FISA court itself has voiced concern that it has no independent means of assessing compliance with its orders because of the extreme secrecy involved. Applying our proposal to the FISA court, both the court and the public could receive proofs of documented compliance with FISA orders, as well as aggregate statistics on the scope of FISA surveillance activity, to the full extent possible without incurring national security risk.

Example 2 (Clinical trials). Accountability mechanisms are also important to assess behavior of private parties, e.g., in clinical trials for new drugs. There are many parties to clinical trials, and much of the information involved is either private or proprietary. Yet regulators and the public have a need to know that responsible testing protocols are observed. Our system can achieve the right balance of transparency, accountability, and respect for privacy of those involved in the trials.

Example 3 (Public fund spending). Accountability in spending of taxpayer money is naturally a subject of public interest. Portions of public funds may be allocated for sensitive purposes (e.g., defense/intelligence), and the amounts and allocation thereof may be publicly unavailable due to their sensitivity. Our system would enable credible public assurances that taxpayer money is being spent in accordance with stated principles, while preserving secrecy of information considered sensitive.

8.1 Generalized Framework

We present abstractions describing the generalized version of our system and briefly outline how the concrete examples fit into this framework. A secret information process includes the following components:

• A set of participants interact with each other. In our ECPA example, these are judges, law enforcement agencies, and companies.

• The participants engage in a protocol (e.g., to execute the procedures for conducting electronic surveillance). The protocol messages exchanged are hidden from the view of outsiders (e.g., the public), and yet it is of public interest that the protocol messages exchanged adhere to certain rules.

• A set of auditors (distinct from the participants) seeks to audit the protocol by verifying that a set of accountability properties are met.

Abstractly, our system allows the controlled disclosure of four types of information.

Existential information reveals the existence of a piece of data, be it in a participant's local storage or the content of a communication between participants. In our case study, existential information is revealed with commitments, which indicate the existence of a document.

Relational information describes the actions participants take in response to the actions of others. In our case study, relational information is represented by the zero-knowledge arguments that attest that actions were taken lawfully (e.g., in compliance with a judge's order).

Content information is the data in storage and communication. In our case study, content information is revealed through aggregate statistics via MPC, and when documents are unsealed and their contents made public.

Timing information is a by-product of the other information. In our case study, timing information could include order issuance dates, turnaround times for data request fulfilment by companies, and seal expiry dates.

Revealing combinations of these four types of information with the specified cryptographic tools provides the flexibility to satisfy a range of application-specific accountability properties, as exemplified next.
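Before turning to the examples, the rough Python sketch below (our own illustration; the field names are assumptions, not part of the system) records this vocabulary as data: a secret information process names its participants and auditors and, for each disclosure, which of the four information types is revealed and by what mechanism.

    from dataclasses import dataclass, field
    from enum import Enum, auto
    from typing import List

    class Info(Enum):
        EXISTENTIAL = auto()   # commitments: "this document exists"
        RELATIONAL = auto()    # zero-knowledge arguments: "this action matches that order"
        CONTENT = auto()       # MPC aggregates, or unsealed documents
        TIMING = auto()        # issuance dates, turnaround times, seal expiries

    @dataclass
    class Disclosure:
        info: Info
        mechanism: str         # e.g. "commitment", "SNARK argument", "MPC aggregate"

    @dataclass
    class SecretInformationProcess:
        participants: List[str]
        auditors: List[str]
        disclosures: List[Disclosure] = field(default_factory=list)

    # The ECPA case study, phrased in this vocabulary.
    ecpa = SecretInformationProcess(
        participants=["judges", "law enforcement agencies", "companies"],
        auditors=["the public"],
        disclosures=[
            Disclosure(Info.EXISTENTIAL, "commitments on the public ledger"),
            Disclosure(Info.RELATIONAL, "zero-knowledge arguments such as P_eq and P_exists"),
            Disclosure(Info.CONTENT, "aggregate statistics computed via MPC"),
            Disclosure(Info.TIMING, "issuance dates and turnaround times implied by the above"),
        ],
    )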

Example 1 (FISA court). Participants are the FISA Court judges, the agencies requesting surveillance authorization, and any service providers involved in facilitating said surveillance. The protocol encompasses the legal process required to authorize surveillance, together with the administrative steps that must be taken to enact surveillance. Auditors are the public, the judges themselves, and possibly Congress. Desirable accountability properties are similar to those in our ECPA case study, e.g., attestations that certain rules are being followed in issuing surveillance orders, and release of aggregate statistics on surveillance activities under FISA.

Example 2 (Clinical trials). Participants are the institutions (companies or research centers) conducting clinical trials, comprising scientists, ethics boards, and data analysts; the organizations that manage regulations regarding clinical trials, such as the National Institutes of Health (NIH) and the Food and Drug Administration (FDA) in the US; and hospitals and other sources through which trial participants are drawn. The protocol encompasses the administrative process required to approve a clinical trial and the procedure of gathering participants and conducting the trial itself. Auditors are the public, the regulatory organizations such as the NIH and the FDA, and possibly professional ethics committees. Desirable accountability properties include, e.g., attestations that appropriate procedures are respected in recruiting participants and administering trials, and release of aggregate statistics on clinical trial results without compromising individual participants' medical data.

Example 3 (Public fund spending). Participants are Congress (who appropriates the funding), defense/intelligence agencies, and service providers contracted in the spending of said funding. The protocol encompasses the processes by which Congress allocates funds to agencies and agencies allocate funds to particular expenses. Auditors are the public and Congress. Desirable accountability properties include, e.g., attestations that procurements were within reasonable margins of market prices and satisfied documented needs, and release of aggregate statistics on the proportion of allocated money used and broad spending categories.

                          9 Conclusion

We present a cryptographic answer to the accountability challenge currently frustrating the US court system. Leveraging cryptographic commitments, zero-knowledge proofs, and secure MPC, we provide the electronic surveillance process a series of scalable, flexible, and practical measures for improving accountability while maintaining secrecy. While we focus on the case study of electronic surveillance, these strategies are equally applicable to a range of other secret information processes requiring accountability to an outside auditor.

                          Acknowledgements

We are grateful to Judge Stephen Smith for discussion and insights from the perspective of the US court system; to Andrei Lapets, Kinan Dak Albab, Rawane Issa, and Frederick Joossens for discussion on Jiff and WebMPC; and to Madars Virza for advice on SNARKs and LibSNARK.

This research was supported by the following grants: NSF MACS (CNS-1413920), DARPA IBM (W911NF-15-C-0236), SIMONS Investigator Award Agreement Dated June 5th, 2012, and the Center for Science of Information (CSoI), an NSF Science and Technology Center, under grant agreement CCF-0939370.

References

[1] Bellman. https://github.com/ebfull/bellman.

[2] Electronic Communications Privacy Act, 18 U.S.C. 2701 et seq.

[3] Foreign Intelligence Surveillance Act, 50 U.S.C. ch. 36.

[4] Google transparency report. https://www.google.com/transparencyreport/userdatarequests/countries/?p=2016-12.

[5] Jiff. https://github.com/multiparty/jiff.

[6] Jsnark. https://github.com/akosba/jsnark.

[7] Law enforcement requests report. https://www.microsoft.com/en-us/about/corporate-responsibility/lerr.

[8] Zcash. https://z.cash.

[9] Wiretap report 2015. http://www.uscourts.gov/statistics-reports/wiretap-report-2015, December 2015.

[10] ADMINISTRATIVE OFFICE OF THE COURTS. Authorized judgeships. http://www.uscourts.gov/sites/default/files/allauth.pdf.

[11] ARAKI, T., BARAK, A., FURUKAWA, J., LICHTER, T., LINDELL, Y., NOF, A., OHARA, K., WATZMAN, A., AND WEINSTEIN, O. Optimized honest-majority MPC for malicious adversaries - breaking the 1 billion-gate per second barrier. In 2017 IEEE Symposium on Security and Privacy, SP 2017, San Jose, CA, USA, May 22-26, 2017 (2017), IEEE Computer Society, pp. 843-862.

[12] BATES, A., BUTLER, K. R., SHERR, M., SHIELDS, C., TRAYNOR, P., AND WALLACH, D. Accountable wiretapping - or - I know they can hear you now. NDSS (2012).

[13] BEN-SASSON, E., CHIESA, A., GREEN, M., TROMER, E., AND VIRZA, M. Secure sampling of public parameters for succinct zero knowledge proofs. In Security and Privacy (SP), 2015 IEEE Symposium on (2015), IEEE, pp. 287-304.

[14] BEN-SASSON, E., CHIESA, A., TROMER, E., AND VIRZA, M. Scalable zero knowledge via cycles of elliptic curves. IACR Cryptology ePrint Archive 2014 (2014), 595.

[15] BEN-SASSON, E., CHIESA, A., TROMER, E., AND VIRZA, M. Succinct non-interactive zero knowledge for a von Neumann architecture. In 23rd USENIX Security Symposium (USENIX Security 14) (San Diego, CA, Aug. 2014), USENIX Association, pp. 781-796.

[16] BESTAVROS, A., LAPETS, A., AND VARIA, M. User-centric distributed solutions for privacy-preserving analytics. Communications of the ACM 60, 2 (February 2017), 37-39.

[17] CHAUM, D., AND VAN HEYST, E. Group signatures. In Advances in Cryptology - EUROCRYPT '91, Workshop on the Theory and Application of Cryptographic Techniques, Brighton, UK, April 8-11, 1991, Proceedings (1991), D. W. Davies, Ed., vol. 547 of Lecture Notes in Computer Science, Springer, pp. 257-265.

[18] DAMGÅRD, I., AND ISHAI, Y. Constant-round multiparty computation using a black-box pseudorandom generator. In Advances in Cryptology - CRYPTO 2005, 25th Annual International Cryptology Conference, Santa Barbara, California, USA, August 14-18, 2005, Proceedings (2005), V. Shoup, Ed., vol. 3621 of Lecture Notes in Computer Science, Springer, pp. 378-394.

[19] DAMGÅRD, I., KELLER, M., LARRAIA, E., PASTRO, V., SCHOLL, P., AND SMART, N. P. Practical covertly secure MPC for dishonest majority - or: Breaking the SPDZ limits. In Computer Security - ESORICS 2013 - 18th European Symposium on Research in Computer Security, Egham, UK, September 9-13, 2013, Proceedings (2013), J. Crampton, S. Jajodia, and K. Mayes, Eds., vol. 8134 of Lecture Notes in Computer Science, Springer, pp. 1-18.

[20] FEIGENBAUM, J., JAGGARD, A. D., AND WRIGHT, R. N. Open vs. closed systems for accountability. In Proceedings of the 2014 Symposium and Bootcamp on the Science of Security (2014), ACM, p. 4.

[21] FEIGENBAUM, J., JAGGARD, A. D., WRIGHT, R. N., AND XIAO, H. Systematizing "accountability" in computer science. Tech. rep., Yale University, Feb. 2012. Technical Report YALEU/DCS/TR-1452.

[22] FEUER, A., AND ROSENBERG, E. Brooklyn prosecutor accused of using illegal wiretap to spy on love interest, November 2016. https://www.nytimes.com/2016/11/28/nyregion/brooklyn-prosecutor-accused-of-using-illegal-wiretap-to-spy-on-love-interest.html.

[23] GOLDWASSER, S., AND PARK, S. Public accountability vs. secret laws: Can they coexist? A cryptographic proposal. In Proceedings of the 2017 Workshop on Privacy in the Electronic Society, Dallas, TX, USA, October 30 - November 3, 2017 (2017), B. M. Thuraisingham and A. J. Lee, Eds., ACM, pp. 99-110.

[24] JAKOBSEN, T. P., NIELSEN, J. B., AND ORLANDI, C. A framework for outsourcing of secure computation. In Proceedings of the 6th edition of the ACM Workshop on Cloud Computing Security, CCSW '14, Scottsdale, Arizona, USA, November 7, 2014 (2014), G. Ahn, A. Oprea, and R. Safavi-Naini, Eds., ACM, pp. 81-92.

[25] KAMARA, S. Restructuring the NSA metadata program. In International Conference on Financial Cryptography and Data Security (2014), Springer, pp. 235-247.

[26] KATZ, J., AND LINDELL, Y. Introduction to Modern Cryptography. CRC Press, 2014.

[27] KROLL, J., FELTEN, E., AND BONEH, D. Secure protocols for accountable warrant execution, 2014. http://www.cs.princeton.edu/felten/warrant-paper.pdf.

[28] LAMPSON, B. Privacy and security: Usable security: How to get it. Communications of the ACM 52, 11 (2009), 25-27.

[29] LAPETS, A., VOLGUSHEV, N., BESTAVROS, A., JANSEN, F., AND VARIA, M. Secure MPC for analytics as a web application. In 2016 IEEE Cybersecurity Development (SecDev) (Boston, MA, USA, November 2016), pp. 73-74.

[30] MASHIMA, D., AND AHAMAD, M. Enabling robust information accountability in e-healthcare systems. In HealthSec (2012).

[31] PAPANIKOLAOU, N., AND PEARSON, S. A cross-disciplinary review of the concept of accountability: A survey of the literature, 2013. Available online at http://www.bic-trust.eu/files/2013/06/Paper_NP.pdf.

[32] PEARSON, S. Toward accountability in the cloud. IEEE Internet Computing 15, 4 (2011), 64-69.

[33] RIVEST, R. L., SHAMIR, A., AND TAUMAN, Y. How to leak a secret. In Advances in Cryptology - ASIACRYPT 2001, 7th International Conference on the Theory and Application of Cryptology and Information Security, Gold Coast, Australia, December 9-13, 2001, Proceedings (2001), C. Boyd, Ed., vol. 2248 of Lecture Notes in Computer Science, Springer, pp. 552-565.

[34] SCIPR LAB. libsnark: a C++ library for zkSNARK proofs. https://github.com/scipr-lab/libsnark.

[35] SEGAL, A., FEIGENBAUM, J., AND FORD, B. Privacy-preserving lawful contact chaining [preliminary report]. In Proceedings of the 2016 ACM Workshop on Privacy in the Electronic Society (New York, NY, USA, 2016), WPES '16, ACM, pp. 185-188.

[36] SEGAL, A., FORD, B., AND FEIGENBAUM, J. Catching bandits and only bandits: Privacy-preserving intersection warrants for lawful surveillance. In FOCI (2014).

[37] SMITH, S. W. Kudzu in the courthouse: Judgments made in the shade. The Federal Courts Law Review 3, 2 (2009).

[38] SMITH, S. W. Gagged, sealed & delivered: Reforming ECPA's secret docket. Harvard Law & Policy Review 6 (2012), 313-459.

[39] SUNDARESWARAN, S., SQUICCIARINI, A., AND LIN, D. Ensuring distributed accountability for data sharing in the cloud. IEEE Transactions on Dependable and Secure Computing 9, 4 (2012), 556-568.

[40] TAN, Y. S., KO, R. K., AND HOLMES, G. Security and data accountability in distributed systems: A provenance survey. In High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing (HPCC/EUC), 2013 IEEE 10th International Conference on (2013), IEEE, pp. 1571-1578.

[41] VIRZA, M. Private communication, November 2017.

[42] WANG, X., RANELLUCCI, S., AND KATZ, J. Global-scale secure multiparty computation. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, CCS 2017, Dallas, TX, USA, October 30 - November 03, 2017 (2017), B. M. Thuraisingham, D. Evans, T. Malkin, and D. Xu, Eds., ACM, pp. 39-56.

[43] WEITZNER, D. J., ABELSON, H., BERNERS-LEE, T., FEIGENBAUM, J., HENDLER, J. A., AND SUSSMAN, G. J. Information accountability. Commun. ACM 51, 6 (2008), 82-87.

[44] XIAO, Z., KATHIRESSHAN, N., AND XIAO, Y. A survey of accountability in computer networks and distributed systems. Security and Communication Networks 9, 4 (2016), 290-315.


                            [23] GOLDWASSER S AND PARK S Public accountability vs se-cret laws Can they coexist A cryptographic proposal In Pro-ceedings of the 2017 on Workshop on Privacy in the ElectronicSociety Dallas TX USA October 30 - November 3 2017 (2017)B M Thuraisingham and A J Lee Eds ACM pp 99ndash110

                            [24] JAKOBSEN T P NIELSEN J B AND ORLANDI C A frame-work for outsourcing of secure computation In Proceedings ofthe 6th edition of the ACM Workshop on Cloud Computing Se-curity CCSW rsquo14 Scottsdale Arizona USA November 7 2014(2014) G Ahn A Oprea and R Safavi-Naini Eds ACMpp 81ndash92

                            [25] KAMARA S Restructuring the nsa metadata program In Inter-national Conference on Financial Cryptography and Data Secu-rity (2014) Springer pp 235ndash247

                            [26] KATZ J AND LINDELL Y Introduction to modern cryptogra-phy CRC press 2014

                            [27] KROLL J FELTEN E AND BONEH D Secure protocolsfor accountable warrant execution 2014 httpwwwcs

                            princetonedufeltenwarrant-paperpdf

                            [28] LAMPSON B Privacy and security usable security how to getit Communications of the ACM 52 11 (2009) 25ndash27

                            [29] LAPETS A VOLGUSHEV N BESTAVROS A JANSEN FAND VARIA M Secure MPC for Analytics as a Web Ap-plication In 2016 IEEE Cybersecurity Development (SecDev)(Boston MA USA November 2016) pp 73ndash74

                            [30] MASHIMA D AND AHAMAD M Enabling robust informationaccountability in e-healthcare systems In HealthSec (2012)

                            [31] PAPANIKOLAOU N AND PEARSON S A cross-disciplinaryreview of the concept of accountability A survey of the litera-ture 2013 Available online at httpwwwbic-trusteufiles201306Paper_NPpdf

                            [32] PEARSON S Toward accountability in the cloud IEEE InternetComputing 15 4 (2011) 64ndash69

                            [33] RIVEST R L SHAMIR A AND TAUMAN Y How to leak asecret In Advances in Cryptology - ASIACRYPT 2001 7th Inter-national Conference on the Theory and Application of Cryptol-ogy and Information Security Gold Coast Australia December9-13 2001 Proceedings (2001) C Boyd Ed vol 2248 of Lec-ture Notes in Computer Science Springer pp 552ndash565

                            [34] SCIPR LAB libsnark a C++ library for zkSNARK proofshttpsgithubcomscipr-lablibsnark

                            [35] SEGAL A FEIGENBAUM J AND FORD B Privacy-preserving lawful contact chaining [preliminary report] In Pro-ceedings of the 2016 ACM on Workshop on Privacy in the Elec-tronic Society (New York NY USA 2016) WPES rsquo16 ACMpp 185ndash188

                            [36] SEGAL A FORD B AND FEIGENBAUM J Catching ban-dits and only bandits Privacy-preserving intersection warrantsfor lawful surveillance In FOCI (2014)

                            [37] SMITH S W Kudzu in the courthouse Judgments made in theshade The Federal Courts Law Review 3 2 (2009)

                            [38] SMITH S W Gagged sealed amp delivered Reforming ecparsquossecret docket Harvard Law amp Policy Review 6 (2012) 313ndash459

                            [39] SUNDARESWARAN S SQUICCIARINI A AND LIN D En-suring distributed accountability for data sharing in the cloudIEEE Transactions on Dependable and Secure Computing 9 4(2012) 556ndash568

                            [40] TAN Y S KO R K AND HOLMES G Security and data ac-countability in distributed systems A provenance survey In HighPerformance Computing and Communications amp 2013 IEEE In-ternational Conference on Embedded and Ubiquitous Comput-ing (HPCC EUC) 2013 IEEE 10th International Conference on(2013) IEEE pp 1571ndash1578

                            [41] VIRZA M November 2017 Private communication

                            [42] WANG X RANELLUCCI S AND KATZ J Global-scale se-cure multiparty computation In Proceedings of the 2017 ACMSIGSAC Conference on Computer and Communications SecurityCCS 2017 Dallas TX USA October 30 - November 03 2017(2017) B M Thuraisingham D Evans T Malkin and D XuEds ACM pp 39ndash56

                            [43] WEITZNER D J ABELSON H BERNERS-LEE T FEIGEN-BAUM J HENDLER J A AND SUSSMAN G J Informationaccountability Commun ACM 51 6 (2008) 82ndash87

                            [44] XIAO Z KATHIRESSHAN N AND XIAO Y A survey ofaccountability in computer networks and distributed systems Se-curity and Communication Networks 9 4 (2016) 290ndash315

                            • Introduction
                            • Related Work
                            • Threat Model and Security Goals
                              • Threat model
                              • Security Goals
                                • System Design
                                  • Cryptographic Tools
                                  • System Configuration
                                  • Workflow
                                  • Additional Design Choices
                                    • Protocol Definition
                                    • Evaluation of MPC Implementation
                                      • Computing Totals in WebMPC
                                      • Thresholds and Hierarchy with Jiff
                                        • Evaluation of SNARKs
                                          • Argument Types
                                          • Implementation
                                            • Generalization
                                              • Generalized Framework
                                                • Conclusion

mitments are equal, one could show p(m1, m2) = 1 for other predicates p (e.g., "less-than" or "signed by same court"). The types of arguments that can be implemented efficiently will expand as SNARK libraries' efficiency improves; our system inherits such efficiency gains.

7.2 Implementation

We implemented these zero-knowledge arguments with LibSNARK [34], a C++ library for creating general-purpose SNARKs from arithmetic circuits. We implemented commitments using the SHA256 hash function.20 ω is a 256-bit random string appended to the message before it is hashed. In this section we show that useful statements can be proven within a reasonable performance envelope. We consider six criteria: the size of the proving key, the size of the verification key, the size of the proof statement, the time to generate keys, the time to create proofs, and the time to verify proofs. We evaluated these metrics with messages from 16 to 1232 bytes on Pk, Peq, and P∃ (N = 100, 400, 700, and 1000, large enough to obscure links between commitments), on a computer with 16 CPU cores and 64GB of RAM.
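To make the commitment format and the relations behind these arguments concrete, here is a minimal illustrative sketch in Python. The helper names are ours, and the real system encodes the corresponding checks as LibSNARK arithmetic circuits rather than running them directly; this only shows what a prover attests to without revealing the openings.

```python
import hashlib
import os

def commit(message: bytes, omega: bytes) -> bytes:
    """Commitment as described above: SHA256 over the message with a
    256-bit random string omega appended."""
    assert len(omega) == 32
    return hashlib.sha256(message + omega).digest()

def relation_Pk(c: bytes, m: bytes, omega: bytes) -> bool:
    """Relation behind Pk: the prover knows an opening (m, omega) of the
    public commitment c."""
    return commit(m, omega) == c

def relation_Ppred(c1: bytes, c2: bytes,
                   m1: bytes, omega1: bytes,
                   m2: bytes, omega2: bytes,
                   p=lambda a, b: a == b) -> bool:
    """Relation behind Peq and its predicate variants: both public
    commitments open correctly and a public predicate p (equality by
    default; e.g., "less-than") holds on the hidden messages."""
    return (commit(m1, omega1) == c1 and
            commit(m2, omega2) == c2 and
            p(m1, m2))

# Usage: commit to the same message under two fresh random strings and
# check the equality predicate on the openings -- the statement a
# zero-knowledge argument would prove without revealing m, omega1, omega2.
m = b"example sealed order"
omega1, omega2 = os.urandom(32), os.urandom(32)
c1, c2 = commit(m, omega1), commit(m, omega2)
assert relation_Ppred(c1, c2, m, omega1, m, omega2)
```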

Argument size. The argument is just 287 bytes. Accompanying each argument are its public inputs (in this case, commitments). Each commitment is 256 bits.21 An auditor needs to store these commitments anyway as part of the ledger, and each commitment can be stored just once and reused for many proofs.
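As a rough illustration (our own arithmetic based on the figures above): a P∃ argument over N = 1000 commitments consists of the 287-byte argument plus its public inputs, 1000 commitments of 32 bytes each, i.e., about 32KB of public input; since those commitments already appear on the ledger, the marginal storage cost of each additional argument is just the 287 bytes.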

Verification key size. The size of the verification key is proportional to the size of the circuit and its public inputs. The key was 106KB for Pk (one commitment as input and one SHA256 circuit) and 2083KB for Peq (two commitments and two SHA256 circuits). Although P∃ computes SHA256 just twice, its smallest input, 100 commitments, is 50 times as large as that of Pk and Peq; the keys are correspondingly larger and grow linearly with the input size. For 100, 400, 700, and 1000 commitments, the verification keys were respectively 10MB, 41MB, 71MB, and 102MB. Since only one verification key is necessary for each circuit, these keys are easily small enough to make large-scale verification feasible.

Proving key size. The proving keys are much larger, in the hundreds of megabytes. Their size grows linearly with the size of the circuit, so longer messages (which require more SHA256 computations), more complicated circuits, and (for P∃) more inputs lead to larger keys. Figure 8a reflects this trend. Proving keys are largest for P∃ with 1000 inputs on 1232-byte messages, and shrink as the message size and the number of commitments decrease. Pk and Peq, which have simpler circuits, still have bigger proving keys for bigger messages. Although these keys are large, only entities that create each kind of proof need to store the corresponding key. Storing one key for each type of argument we have presented takes only about 1GB at the largest input sizes.

20 Certain other hash functions may be more amenable to representation as arithmetic circuits and thus more "SNARK-friendly". We opted for a proof of concept with SHA256, as it is so widely used.

21 LibSNARK stores each bit in a 32-bit integer, so an argument involving k commitments takes about 1024k bytes. A bit-vector representation would save a factor of 32.

Key generation time. Key generation time increased linearly with the size of the keys, from a few seconds for Pk and Peq on small messages to a few minutes for P∃ on the largest parameters (Figure 8b). Since key generation is a one-time process to add a new kind of proof in the form of a circuit, we find these numbers acceptable.

Argument generation time. Argument generation time increased linearly with proving key size and ranged from a few seconds on the smallest keys to a couple of minutes for the largest (Figure 8c). Since argument generation is a one-time task for each surveillance action, and the existing administrative processes for each surveillance action often take hours or days, we find this cost acceptable.

Argument verification time. Verifying Pk and Peq on the largest message took only a few milliseconds. Verification times for P∃ were larger and increased linearly with the number of input commitments. For 100, 400, 700, and 1000 commitments, verification took 40ms, 85ms, 243ms, and 338ms on the largest input. These times are still fast enough to verify many arguments quickly.

                              8 Generalization

Our proposal can be generalized beyond ECPA surveillance to encompass a broader class of secret information processes. Consider situations in which independent institutions need to act in a coordinated but secret fashion and, at the same time, are subject to public scrutiny: they should be able to convince the public that their actions are consistent with relevant rules. As in electronic surveillance, accountability requires the ability to attest to compliance without revealing sensitive information.

[Figure 8: SNARK evaluation. (a) Proving key size; (b) key generation time; (c) argument generation time.]

Example 1 (FISA court). Accountability is needed in other electronic surveillance arenas. The US Foreign Intelligence Surveillance Act (FISA) regulates surveillance in national security investigations. Because of the sensitive interests at stake, the entire process is overseen by a US court that meets in secret. The tension between secrecy and public accountability is even sharper for the FISA court: much of the data collected under FISA may stay permanently hidden inside US intelligence agencies, while data collected under ECPA may eventually be used in public criminal trials. This opacity may be justified, but it has engendered skepticism. The public has no way of knowing what the court is doing, nor any means of assuring itself that the intelligence agencies under the authority of FISA are even complying with the rules of that court. The FISA court itself has voiced concern that it has no independent means of assessing compliance with its orders because of the extreme secrecy involved. Applying our proposal to the FISA court, both the court and the public could receive proofs of documented compliance with FISA orders, as well as aggregate statistics on the scope of FISA surveillance activity, to the full extent possible without incurring national security risk.

Example 2 (Clinical trials). Accountability mechanisms are also important to assess the behavior of private parties, e.g., in clinical trials for new drugs. There are many parties to clinical trials, and much of the information involved is either private or proprietary. Yet regulators and the public have a need to know that responsible testing protocols are observed. Our system can achieve the right balance of transparency, accountability, and respect for the privacy of those involved in the trials.

Example 3 (Public fund spending). Accountability in the spending of taxpayer money is naturally a subject of public interest. Portions of public funds may be allocated for sensitive purposes (e.g., defense/intelligence), and the amounts and allocation thereof may be publicly unavailable due to their sensitivity. Our system would enable credible public assurances that taxpayer money is being spent in accordance with stated principles, while preserving the secrecy of information considered sensitive.

8.1 Generalized Framework

We present abstractions describing the generalized version of our system and briefly outline how the concrete examples fit into this framework. A secret information process includes the following components:

• A set of participants interact with each other. In our ECPA example, these are judges, law enforcement agencies, and companies.

• The participants engage in a protocol (e.g., to execute the procedures for conducting electronic surveillance). The protocol messages exchanged are hidden from the view of outsiders (e.g., the public), and yet it is of public interest that the protocol messages exchanged adhere to certain rules.

• A set of auditors (distinct from the participants) seeks to audit the protocol by verifying that a set of accountability properties are met.

Abstractly, our system allows the controlled disclosure of four types of information.

Existential information reveals the existence of a piece of data, be it in a participant's local storage or the content of a communication between participants. In our case study, existential information is revealed with commitments, which indicate the existence of a document.

Relational information describes the actions participants take in response to the actions of others. In our case study, relational information is represented by the zero-knowledge arguments that attest that actions were taken lawfully (e.g., in compliance with a judge's order).

Content information is the data in storage and communication. In our case study, content information is revealed through aggregate statistics via MPC, and when documents are unsealed and their contents made public.

Timing information is a by-product of the other information. In our case study, timing information could include order issuance dates, turnaround times for data request fulfilment by companies, and seal expiry dates.

Revealing combinations of these four types of information with the specified cryptographic tools provides the flexibility to satisfy a range of application-specific accountability properties, as exemplified next.
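To make this abstraction concrete, the sketch below models a secret information process and its controlled disclosures as simple data types. It is purely illustrative: all class and field names are ours, it is not part of our implementation, and it only mirrors the roles that commitments, zero-knowledge arguments, MPC outputs, and timestamps play in our case study.

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Disclosure:
    """One controlled disclosure about a protocol message or stored record."""
    existential: bytes                                    # e.g., a commitment attesting that a document exists
    relational: List[str] = field(default_factory=list)   # e.g., zero-knowledge claims such as "complies with order X"
    content: dict = field(default_factory=dict)           # e.g., values contributed to aggregate statistics via MPC
    timestamp: str = ""                                   # timing information (issuance date, seal expiry, ...)

@dataclass
class SecretInformationProcess:
    participants: List[str]                               # e.g., judges, law enforcement agencies, companies
    auditors: List[str]                                   # e.g., the public, oversight bodies
    ledger: List[Disclosure] = field(default_factory=list)
    accountability_checks: List[Callable[[List[Disclosure]], bool]] = field(default_factory=list)

    def publish(self, d: Disclosure) -> None:
        """A participant posts a disclosure to the public ledger."""
        self.ledger.append(d)

    def audit(self) -> bool:
        """An auditor verifies every accountability property against the ledger."""
        return all(check(self.ledger) for check in self.accountability_checks)
```

In this sketch an auditor never sees protocol messages themselves; it only evaluates accountability properties over the public ledger of disclosures, which is the role the cryptographic tools play in the examples that follow.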

Example 1 (FISA court). Participants are the FISA Court judges, the agencies requesting surveillance authorization, and any service providers involved in facilitating said surveillance. The protocol encompasses the legal process required to authorize surveillance, together with the administrative steps that must be taken to enact surveillance. Auditors are the public, the judges themselves, and possibly Congress. Desirable accountability properties are similar to those in our ECPA case study, e.g., attestations that certain rules are being followed in issuing surveillance orders, and release of aggregate statistics on surveillance activities under FISA.

Example 2 (Clinical trials). Participants are the institutions (companies or research centers) conducting clinical trials, comprising scientists, ethics boards, and data analysts; the organizations that manage regulations regarding clinical trials, such as the National Institutes of Health (NIH) and the Food and Drug Administration (FDA) in the US; and hospitals and other sources through which trial participants are drawn. The protocol encompasses the administrative process required to approve a clinical trial, and the procedure of gathering participants and conducting the trial itself. Auditors are the public, the regulatory organizations such as the NIH and the FDA, and possibly professional ethics committees. Desirable accountability properties include, e.g., attestations that appropriate procedures are respected in recruiting participants and administering trials, and release of aggregate statistics on clinical trial results without compromising individual participants' medical data.

Example 3 (Public fund spending). Participants are Congress (who appropriates the funding), defense/intelligence agencies, and service providers contracted in the spending of said funding. The protocol encompasses the processes by which Congress allocates funds to agencies and agencies allocate funds to particular expenses. Auditors are the public and Congress. Desirable accountability properties include, e.g., attestations that procurements were within reasonable margins of market prices and satisfied documented needs, and release of aggregate statistics on the proportion of allocated money used and broad spending categories.

                              9 Conclusion

We present a cryptographic answer to the accountability challenge currently frustrating the US court system. Leveraging cryptographic commitments, zero-knowledge proofs, and secure MPC, we provide the electronic surveillance process a series of scalable, flexible, and practical measures for improving accountability while maintaining secrecy. While we focus on the case study of electronic surveillance, these strategies are equally applicable to a range of other secret information processes requiring accountability to an outside auditor.

                              Acknowledgements

We are grateful to Judge Stephen Smith for discussion and insights from the perspective of the US court system; to Andrei Lapets, Kinan Dak Albab, Rawane Issa, and Frederick Joossens for discussion on Jiff and WebMPC; and to Madars Virza for advice on SNARKs and LibSNARK.

This research was supported by the following grants: NSF MACS (CNS-1413920), DARPA IBM (W911NF-15-C-0236), Simons Investigator Award Agreement dated June 5th, 2012, and the Center for Science of Information (CSoI), an NSF Science and Technology Center, under grant agreement CCF-0939370.

References

[1] Bellman. https://github.com/ebfull/bellman.

[2] Electronic Communications Privacy Act, 18 U.S.C. § 2701 et seq.

[3] Foreign Intelligence Surveillance Act, 50 U.S.C. ch. 36.

[4] Google transparency report. https://www.google.com/transparencyreport/userdatarequests/countries?p=2016-12.

[5] Jiff. https://github.com/multiparty/jiff.

[6] Jsnark. https://github.com/akosba/jsnark.

[7] Law enforcement requests report. https://www.microsoft.com/en-us/about/corporate-responsibility/lerr.

[8] Zcash. https://z.cash.

[9] Wiretap report 2015. http://www.uscourts.gov/statistics-reports/wiretap-report-2015, December 2015.

[10] ADMINISTRATIVE OFFICE OF THE COURTS. Authorized judgeships. http://www.uscourts.gov/sites/default/files/allauth.pdf.

[11] ARAKI, T., BARAK, A., FURUKAWA, J., LICHTER, T., LINDELL, Y., NOF, A., OHARA, K., WATZMAN, A., AND WEINSTEIN, O. Optimized honest-majority MPC for malicious adversaries: breaking the 1 billion-gate per second barrier. In 2017 IEEE Symposium on Security and Privacy, SP 2017, San Jose, CA, USA, May 22-26, 2017 (2017), IEEE Computer Society, pp. 843–862.

[12] BATES, A., BUTLER, K. R., SHERR, M., SHIELDS, C., TRAYNOR, P., AND WALLACH, D. Accountable wiretapping, or, I know they can hear you now. NDSS (2012).

[13] BEN-SASSON, E., CHIESA, A., GREEN, M., TROMER, E., AND VIRZA, M. Secure sampling of public parameters for succinct zero knowledge proofs. In Security and Privacy (SP), 2015 IEEE Symposium on (2015), IEEE, pp. 287–304.

[14] BEN-SASSON, E., CHIESA, A., TROMER, E., AND VIRZA, M. Scalable zero knowledge via cycles of elliptic curves. IACR Cryptology ePrint Archive 2014 (2014), 595.

[15] BEN-SASSON, E., CHIESA, A., TROMER, E., AND VIRZA, M. Succinct non-interactive zero knowledge for a von Neumann architecture. In 23rd USENIX Security Symposium (USENIX Security 14) (San Diego, CA, Aug. 2014), USENIX Association, pp. 781–796.

[16] BESTAVROS, A., LAPETS, A., AND VARIA, M. User-centric distributed solutions for privacy-preserving analytics. Communications of the ACM 60, 2 (February 2017), 37–39.

[17] CHAUM, D., AND VAN HEYST, E. Group signatures. In Advances in Cryptology - EUROCRYPT '91, Workshop on the Theory and Application of Cryptographic Techniques, Brighton, UK, April 8-11, 1991, Proceedings (1991), D. W. Davies, Ed., vol. 547 of Lecture Notes in Computer Science, Springer, pp. 257–265.

[18] DAMGÅRD, I., AND ISHAI, Y. Constant-round multiparty computation using a black-box pseudorandom generator. In Advances in Cryptology - CRYPTO 2005, 25th Annual International Cryptology Conference, Santa Barbara, California, USA, August 14-18, 2005, Proceedings (2005), V. Shoup, Ed., vol. 3621 of Lecture Notes in Computer Science, Springer, pp. 378–394.

[19] DAMGÅRD, I., KELLER, M., LARRAIA, E., PASTRO, V., SCHOLL, P., AND SMART, N. P. Practical covertly secure MPC for dishonest majority, or: Breaking the SPDZ limits. In Computer Security - ESORICS 2013 - 18th European Symposium on Research in Computer Security, Egham, UK, September 9-13, 2013, Proceedings (2013), J. Crampton, S. Jajodia, and K. Mayes, Eds., vol. 8134 of Lecture Notes in Computer Science, Springer, pp. 1–18.

[20] FEIGENBAUM, J., JAGGARD, A. D., AND WRIGHT, R. N. Open vs. closed systems for accountability. In Proceedings of the 2014 Symposium and Bootcamp on the Science of Security (2014), ACM, p. 4.

[21] FEIGENBAUM, J., JAGGARD, A. D., WRIGHT, R. N., AND XIAO, H. Systematizing "accountability" in computer science. Tech. rep., Yale University, Feb. 2012. Technical Report YALEU/DCS/TR-1452.

[22] FEUER, A., AND ROSENBERG, E. Brooklyn prosecutor accused of using illegal wiretap to spy on love interest. November 2016. https://www.nytimes.com/2016/11/28/nyregion/brooklyn-prosecutor-accused-of-using-illegal-wiretap-to-spy-on-love-interest.html.

[23] GOLDWASSER, S., AND PARK, S. Public accountability vs. secret laws: Can they coexist? A cryptographic proposal. In Proceedings of the 2017 Workshop on Privacy in the Electronic Society, Dallas, TX, USA, October 30 - November 3, 2017 (2017), B. M. Thuraisingham and A. J. Lee, Eds., ACM, pp. 99–110.

[24] JAKOBSEN, T. P., NIELSEN, J. B., AND ORLANDI, C. A framework for outsourcing of secure computation. In Proceedings of the 6th edition of the ACM Workshop on Cloud Computing Security, CCSW '14, Scottsdale, Arizona, USA, November 7, 2014 (2014), G. Ahn, A. Oprea, and R. Safavi-Naini, Eds., ACM, pp. 81–92.

[25] KAMARA, S. Restructuring the NSA metadata program. In International Conference on Financial Cryptography and Data Security (2014), Springer, pp. 235–247.

[26] KATZ, J., AND LINDELL, Y. Introduction to Modern Cryptography. CRC Press, 2014.

[27] KROLL, J., FELTEN, E., AND BONEH, D. Secure protocols for accountable warrant execution. 2014. http://www.cs.princeton.edu/~felten/warrant-paper.pdf.

[28] LAMPSON, B. Privacy and security: Usable security: How to get it. Communications of the ACM 52, 11 (2009), 25–27.

[29] LAPETS, A., VOLGUSHEV, N., BESTAVROS, A., JANSEN, F., AND VARIA, M. Secure MPC for analytics as a web application. In 2016 IEEE Cybersecurity Development (SecDev) (Boston, MA, USA, November 2016), pp. 73–74.

[30] MASHIMA, D., AND AHAMAD, M. Enabling robust information accountability in e-healthcare systems. In HealthSec (2012).

[31] PAPANIKOLAOU, N., AND PEARSON, S. A cross-disciplinary review of the concept of accountability: A survey of the literature. 2013. Available online at http://www.bic-trust.eu/files/2013/06/Paper_NP.pdf.

[32] PEARSON, S. Toward accountability in the cloud. IEEE Internet Computing 15, 4 (2011), 64–69.

[33] RIVEST, R. L., SHAMIR, A., AND TAUMAN, Y. How to leak a secret. In Advances in Cryptology - ASIACRYPT 2001, 7th International Conference on the Theory and Application of Cryptology and Information Security, Gold Coast, Australia, December 9-13, 2001, Proceedings (2001), C. Boyd, Ed., vol. 2248 of Lecture Notes in Computer Science, Springer, pp. 552–565.

[34] SCIPR LAB. libsnark: a C++ library for zkSNARK proofs. https://github.com/scipr-lab/libsnark.

[35] SEGAL, A., FEIGENBAUM, J., AND FORD, B. Privacy-preserving lawful contact chaining [preliminary report]. In Proceedings of the 2016 ACM Workshop on Privacy in the Electronic Society (New York, NY, USA, 2016), WPES '16, ACM, pp. 185–188.

[36] SEGAL, A., FORD, B., AND FEIGENBAUM, J. Catching bandits and only bandits: Privacy-preserving intersection warrants for lawful surveillance. In FOCI (2014).

[37] SMITH, S. W. Kudzu in the courthouse: Judgments made in the shade. The Federal Courts Law Review 3, 2 (2009).

[38] SMITH, S. W. Gagged, sealed & delivered: Reforming ECPA's secret docket. Harvard Law & Policy Review 6 (2012), 313–459.

[39] SUNDARESWARAN, S., SQUICCIARINI, A., AND LIN, D. Ensuring distributed accountability for data sharing in the cloud. IEEE Transactions on Dependable and Secure Computing 9, 4 (2012), 556–568.

[40] TAN, Y. S., KO, R. K., AND HOLMES, G. Security and data accountability in distributed systems: A provenance survey. In High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing (HPCC EUC), 2013 IEEE 10th International Conference on (2013), IEEE, pp. 1571–1578.

[41] VIRZA, M. Private communication, November 2017.

[42] WANG, X., RANELLUCCI, S., AND KATZ, J. Global-scale secure multiparty computation. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, CCS 2017, Dallas, TX, USA, October 30 - November 03, 2017 (2017), B. M. Thuraisingham, D. Evans, T. Malkin, and D. Xu, Eds., ACM, pp. 39–56.

[43] WEITZNER, D. J., ABELSON, H., BERNERS-LEE, T., FEIGENBAUM, J., HENDLER, J. A., AND SUSSMAN, G. J. Information accountability. Commun. ACM 51, 6 (2008), 82–87.

[44] XIAO, Z., KATHIRESSHAN, N., AND XIAO, Y. A survey of accountability in computer networks and distributed systems. Security and Communication Networks 9, 4 (2016), 290–315.

                              • Introduction
                              • Related Work
                              • Threat Model and Security Goals
                                • Threat model
                                • Security Goals
                                  • System Design
                                    • Cryptographic Tools
                                    • System Configuration
                                    • Workflow
                                    • Additional Design Choices
                                      • Protocol Definition
                                      • Evaluation of MPC Implementation
                                        • Computing Totals in WebMPC
                                        • Thresholds and Hierarchy with Jiff
                                          • Evaluation of SNARKs
                                            • Argument Types
                                            • Implementation
                                              • Generalization
                                                • Generalized Framework
                                                  • Conclusion

                                (a) Proving key size (b) Key generation time (c) Argument generation time

                                Figure 8 SNARK evaluation

                                of knowing what the court is doing nor any means of as-suring itself that the intelligence agencies under the au-thority of FISA are even complying with the rules of thatcourt The FISA court itself has voiced concern aboutthat it has no independent means of assessing compliancewith its orders because of the extreme secrecy involvedApplying our proposal to the FISA court both the courtand the public could receive proofs of documented com-pliance with FISA orders as well as aggregate statisticson the scope of FISA surveillance activity to the full ex-tent possible without incurring national security risk

                                Example 2 (Clinical trials) Accountability mecha-nisms are also important to assess behavior of privateparties eg in clinical trials for new drugs There aremany parties to clinical trials and much of the informa-tion involved is either private or proprietary Yet regula-tors and the public have a need to know that responsibletesting protocols are observed Our system can achievethe right balance of transparency accountability and re-spect for privacy of those involved in the trials

                                Example 3 (Public fund spending) Accountability inspending of taxpayer money is naturally a subject of pub-lic interest Portions of public funds may be allocated forsensitive purposes (eg defenseintelligence) and theamounts and allocation thereof may be publicly unavail-able due to their sensitivity Our system would enablecredible public assurances that taxpayer money is beingspent in accordance with stated principles while preserv-ing secrecy of information considered sensitive

                                81 Generalized Framework

                                We present abstractions describing the generalized ver-sion of our system and briefly outline how the concreteexamples fit into this framework A secret informationprocess includes the following componentsbull A set of participants interact with each other In our

                                ECPA example these are judges law enforcementagencies and companies

                                bull The participants engage in a protocol (eg toexecute the procedures for conducting electronicsurveillance) The protocol messages exchanged are

                                hidden from the view of outsiders (eg the public)and yet it is of public interest that the protocol mes-sages exchanged adhere to certain rules

                                bull A set of auditors (distinct from the participants)seeks to audit the protocol by verifying that a setof accountability properties are met

                                Abstractly our system allows the controlled disclosureof four types of information

                                Existential information reveals the existence of a pieceof data be it in a participantrsquos local storage or the contentof a communication between participants In our casestudy existential information is revealed with commit-ments which indicate the existence of a document

                                Relational information describes the actions partici-pants take in response to the actions of others In ourcase study relational information is represented by thezero-knowledge arguments that attest that actions weretaken lawfully (eg in compliance with a judgersquos order)

                                Content information is the data in storage and com-munication In our case study content information is re-vealed through aggregate statistics via MPC and whendocuments are unsealed and their contents made public

                                Timing information is a by-product of the other infor-mation In our case study timing information could in-clude order issuance dates turnaround times for data re-quest fulfilment by companies and seal expiry dates

                                Revealing combinations of these four types of infor-mation with the specified cryptographic tools providesthe flexibility to satisfy a range of application-specificaccountability properties as exemplified next

                                Example 1 (FISA court) Participants are the FISACourt judges the agencies requesting surveillance autho-rization and any service providers involved in facilitat-ing said surveillance The protocol encompasses the le-gal process required to authorize surveillance togetherwith the administrative steps that must be taken to enactsurveillance Auditors are the public the judges them-selves and possibly Congress Desirable accountabilityproperties are similar to those in our ECPA case studyeg attestations that certain rules are being followedin issuing surveillance orders and release of aggregatestatistics on surveillance activities under FISA

                                Example 2 (Clinical trials) Participants are the insti-tutions (companies or research centers) conducting clin-ical trials comprising scientists ethics boards and dataanalysts the organizations that manage regulations re-garding clinical trials such as the National Institutesof Health (NIH) and the Food and Drug Administra-tion (FDA) in the US and hospitals and other sourcesthrough which trial participants are drawn The proto-col encompasses the administrative process required toapprove a clinical trial and the procedure of gatheringparticipants and conducting the trial itself Auditors arethe public the regulatory organizations such as the NIHand the FDA and possibly professional ethics commit-tees Desirable accountability properties include egattestations that appropriate procedures are respected inrecruiting participants and administering trials and re-lease of aggregate statistics on clinical trial results with-out compromising individual participantsrsquo medical data

                                Example 3 (Public fund spending) Participantsare Congress (who appropriates the funding) de-fenseintelligence agencies and service providers con-tracted in the spending of said funding The protocolencompasses the processes by which Congress allocatesfunds to agencies and agencies allocate funds to par-ticular expenses Auditors are the public and CongressDesirable accountability properties include eg attesta-tions that procurements were within reasonable marginsof market prices and satisfied documented needs and re-lease of aggregate statistics on the proportion of allocatedmoney used and broad spending categories

                                9 Conclusion

                                We present a cryptographic answer to the accountabil-ity challenge currently frustrating the US court sys-tem Leveraging cryptographic commitments zero-knowledge proofs and secure MPC we provide the elec-tronic surveillance process a series of scalable flexi-ble and practical measures for improving accountabil-ity while maintaining secrecy While we focus on thecase study of electronic surveillance these strategies areequally applicable to a range of other secret informationprocesses requiring accountability to an outside auditor

                                Acknowledgements

                                We are grateful to Judge Stephen Smith for discussionand insights from the perspective of the US court systemto Andrei Lapets Kinan Dak Albab Rawane Issa andFrederick Joossens for discussion on Jiff and WebMPCand to Madars Virza for advice on SNARKs and Lib-SNARK

                                This research was supported by the following grantsNSF MACS (CNS-1413920) DARPA IBM (W911NF-15-C-0236) SIMONS Investigator Award AgreementDated June 5th 2012 and the Center for Science of In-formation (CSoI) an NSF Science and Technology Cen-ter under grant agreement CCF-0939370

                                References[1] Bellman httpsgithubcomebfullbellman

                                [2] Electronic Communications Privacy Act 18 USC 2701 et seq

                                [3] Foreign Intelligence Surveillance Act 50 USC ch 36

                                [4] Google transparency report httpswwwgooglecom

                                transparencyreportuserdatarequestscountries

                                p=2016-12

                                [5] Jiff httpsgithubcommultipartyjiff

                                [6] Jsnark httpsgithubcomakosbajsnark

                                [7] Law enforcement requests report httpswwwmicrosoft

                                comen-usaboutcorporate-responsibilitylerr

                                [8] Zcash httpszcash

                                [9] Wiretap report 2015 httpwwwuscourtsgov

                                statistics-reportswiretap-report-2015 Decem-ber 2015

                                [10] ADMINSTRATIVE OFFICE OF THE COURTS Authorized judge-ships httpwwwuscourtsgovsitesdefaultfilesallauthpdf

                                [11] ARAKI T BARAK A FURUKAWA J LICHTER T LIN-DELL Y NOF A OHARA K WATZMAN A AND WEIN-STEIN O Optimized honest-majority MPC for malicious ad-versaries - breaking the 1 billion-gate per second barrier In2017 IEEE Symposium on Security and Privacy SP 2017 SanJose CA USA May 22-26 2017 (2017) IEEE Computer Soci-ety pp 843ndash862

                                [12] BATES A BUTLER K R SHERR M SHIELDS CTRAYNOR P AND WALLACH D Accountable wiretappingndashorndashi know they can hear you now NDSS (2012)

                                [13] BEN-SASSON E CHIESA A GREEN M TROMER EAND VIRZA M Secure sampling of public parameters for suc-cinct zero knowledge proofs In Security and Privacy (SP) 2015IEEE Symposium on (2015) IEEE pp 287ndash304

                                [14] BEN-SASSON E CHIESA A TROMER E AND VIRZA MScalable zero knowledge via cycles of elliptic curves IACR Cryp-tology ePrint Archive 2014 (2014) 595

                                [15] BEN-SASSON E CHIESA A TROMER E AND VIRZA MSuccinct non-interactive zero knowledge for a von neumann ar-chitecture In 23rd USENIX Security Symposium (USENIX Se-curity 14) (San Diego CA Aug 2014) USENIX Associationpp 781ndash796

                                [16] BESTAVROS A LAPETS A AND VARIA M User-centricdistributed solutions for privacy-preserving analytics Communi-cations of the ACM 60 2 (February 2017) 37ndash39

                                [17] CHAUM D AND VAN HEYST E Group signatures In Ad-vances in Cryptology - EUROCRYPT rsquo91 Workshop on the The-ory and Application of of Cryptographic Techniques BrightonUK April 8-11 1991 Proceedings (1991) D W DaviesEd vol 547 of Lecture Notes in Computer Science Springerpp 257ndash265

                                [18] DAMGARD I AND ISHAI Y Constant-round multiparty com-putation using a black-box pseudorandom generator In Advancesin Cryptology - CRYPTO 2005 25th Annual International Cryp-tology Conference Santa Barbara California USA August 14-18 2005 Proceedings (2005) V Shoup Ed vol 3621 of Lec-ture Notes in Computer Science Springer pp 378ndash394

                                [19] DAMGARD I KELLER M LARRAIA E PASTRO VSCHOLL P AND SMART N P Practical covertly secureMPC for dishonest majority - or Breaking the SPDZ limitsIn Computer Security - ESORICS 2013 - 18th European Sym-posium on Research in Computer Security Egham UK Septem-ber 9-13 2013 Proceedings (2013) J Crampton S Jajodia andK Mayes Eds vol 8134 of Lecture Notes in Computer ScienceSpringer pp 1ndash18

                                [20] FEIGENBAUM J JAGGARD A D AND WRIGHT R N Openvs closed systems for accountability In Proceedings of the 2014Symposium and Bootcamp on the Science of Security (2014)ACM p 4

                                [21] FEIGENBAUM J JAGGARD A D WRIGHT R N ANDXIAO H Systematizing ldquoaccountabilityrdquo in computer sci-ence Tech rep Yale University Feb 2012 Technical ReportYALEUDCSTR-1452

                                [22] FEUER A AND ROSENBERG E Brooklyn prosecutor accusedof using illegal wiretap to spy on love interest November2016 httpswwwnytimescom20161128nyregionbrooklyn-prosecutor-accused-of-using-illegal-wiretap-to-spy-on-love-interesthtml

                                [23] GOLDWASSER S AND PARK S Public accountability vs se-cret laws Can they coexist A cryptographic proposal In Pro-ceedings of the 2017 on Workshop on Privacy in the ElectronicSociety Dallas TX USA October 30 - November 3 2017 (2017)B M Thuraisingham and A J Lee Eds ACM pp 99ndash110

                                [24] JAKOBSEN T P NIELSEN J B AND ORLANDI C A frame-work for outsourcing of secure computation In Proceedings ofthe 6th edition of the ACM Workshop on Cloud Computing Se-curity CCSW rsquo14 Scottsdale Arizona USA November 7 2014(2014) G Ahn A Oprea and R Safavi-Naini Eds ACMpp 81ndash92

                                [25] KAMARA S Restructuring the nsa metadata program In Inter-national Conference on Financial Cryptography and Data Secu-rity (2014) Springer pp 235ndash247

                                [26] KATZ J AND LINDELL Y Introduction to modern cryptogra-phy CRC press 2014

                                [27] KROLL J FELTEN E AND BONEH D Secure protocolsfor accountable warrant execution 2014 httpwwwcs

                                princetonedufeltenwarrant-paperpdf

                                [28] LAMPSON B Privacy and security usable security how to getit Communications of the ACM 52 11 (2009) 25ndash27

                                [29] LAPETS A VOLGUSHEV N BESTAVROS A JANSEN FAND VARIA M Secure MPC for Analytics as a Web Ap-plication In 2016 IEEE Cybersecurity Development (SecDev)(Boston MA USA November 2016) pp 73ndash74

                                [30] MASHIMA D AND AHAMAD M Enabling robust informationaccountability in e-healthcare systems In HealthSec (2012)

                                [31] PAPANIKOLAOU N AND PEARSON S A cross-disciplinaryreview of the concept of accountability A survey of the litera-ture 2013 Available online at httpwwwbic-trusteufiles201306Paper_NPpdf

                                [32] PEARSON S Toward accountability in the cloud IEEE InternetComputing 15 4 (2011) 64ndash69

                                [33] RIVEST R L SHAMIR A AND TAUMAN Y How to leak asecret In Advances in Cryptology - ASIACRYPT 2001 7th Inter-national Conference on the Theory and Application of Cryptol-ogy and Information Security Gold Coast Australia December9-13 2001 Proceedings (2001) C Boyd Ed vol 2248 of Lec-ture Notes in Computer Science Springer pp 552ndash565

                                [34] SCIPR LAB libsnark a C++ library for zkSNARK proofshttpsgithubcomscipr-lablibsnark

                                [35] SEGAL A FEIGENBAUM J AND FORD B Privacy-preserving lawful contact chaining [preliminary report] In Pro-ceedings of the 2016 ACM on Workshop on Privacy in the Elec-tronic Society (New York NY USA 2016) WPES rsquo16 ACMpp 185ndash188

                                [36] SEGAL A FORD B AND FEIGENBAUM J Catching ban-dits and only bandits Privacy-preserving intersection warrantsfor lawful surveillance In FOCI (2014)

                                [37] SMITH S W Kudzu in the courthouse Judgments made in theshade The Federal Courts Law Review 3 2 (2009)

                                [38] SMITH S W Gagged sealed amp delivered Reforming ecparsquossecret docket Harvard Law amp Policy Review 6 (2012) 313ndash459

                                [39] SUNDARESWARAN S SQUICCIARINI A AND LIN D En-suring distributed accountability for data sharing in the cloudIEEE Transactions on Dependable and Secure Computing 9 4(2012) 556ndash568

                                [40] TAN Y S KO R K AND HOLMES G Security and data ac-countability in distributed systems A provenance survey In HighPerformance Computing and Communications amp 2013 IEEE In-ternational Conference on Embedded and Ubiquitous Comput-ing (HPCC EUC) 2013 IEEE 10th International Conference on(2013) IEEE pp 1571ndash1578

                                [41] VIRZA M November 2017 Private communication

                                [42] WANG X RANELLUCCI S AND KATZ J Global-scale se-cure multiparty computation In Proceedings of the 2017 ACMSIGSAC Conference on Computer and Communications SecurityCCS 2017 Dallas TX USA October 30 - November 03 2017(2017) B M Thuraisingham D Evans T Malkin and D XuEds ACM pp 39ndash56

                                [43] WEITZNER D J ABELSON H BERNERS-LEE T FEIGEN-BAUM J HENDLER J A AND SUSSMAN G J Informationaccountability Commun ACM 51 6 (2008) 82ndash87

                                [44] XIAO Z KATHIRESSHAN N AND XIAO Y A survey ofaccountability in computer networks and distributed systems Se-curity and Communication Networks 9 4 (2016) 290ndash315

                                • Introduction
                                • Related Work
                                • Threat Model and Security Goals
                                  • Threat model
                                  • Security Goals
                                    • System Design
                                      • Cryptographic Tools
                                      • System Configuration
                                      • Workflow
                                      • Additional Design Choices
                                        • Protocol Definition
                                        • Evaluation of MPC Implementation
                                          • Computing Totals in WebMPC
                                          • Thresholds and Hierarchy with Jiff
                                            • Evaluation of SNARKs
                                              • Argument Types
                                              • Implementation
                                                • Generalization
                                                  • Generalized Framework
                                                    • Conclusion

                                  Example 2 (Clinical trials) Participants are the insti-tutions (companies or research centers) conducting clin-ical trials comprising scientists ethics boards and dataanalysts the organizations that manage regulations re-garding clinical trials such as the National Institutesof Health (NIH) and the Food and Drug Administra-tion (FDA) in the US and hospitals and other sourcesthrough which trial participants are drawn The proto-col encompasses the administrative process required toapprove a clinical trial and the procedure of gatheringparticipants and conducting the trial itself Auditors arethe public the regulatory organizations such as the NIHand the FDA and possibly professional ethics commit-tees Desirable accountability properties include egattestations that appropriate procedures are respected inrecruiting participants and administering trials and re-lease of aggregate statistics on clinical trial results with-out compromising individual participantsrsquo medical data

                                  Example 3 (Public fund spending) Participantsare Congress (who appropriates the funding) de-fenseintelligence agencies and service providers con-tracted in the spending of said funding The protocolencompasses the processes by which Congress allocatesfunds to agencies and agencies allocate funds to par-ticular expenses Auditors are the public and CongressDesirable accountability properties include eg attesta-tions that procurements were within reasonable marginsof market prices and satisfied documented needs and re-lease of aggregate statistics on the proportion of allocatedmoney used and broad spending categories

                                  9 Conclusion

                                  We present a cryptographic answer to the accountabil-ity challenge currently frustrating the US court sys-tem Leveraging cryptographic commitments zero-knowledge proofs and secure MPC we provide the elec-tronic surveillance process a series of scalable flexi-ble and practical measures for improving accountabil-ity while maintaining secrecy While we focus on thecase study of electronic surveillance these strategies areequally applicable to a range of other secret informationprocesses requiring accountability to an outside auditor

                                  Acknowledgements

                                  We are grateful to Judge Stephen Smith for discussionand insights from the perspective of the US court systemto Andrei Lapets Kinan Dak Albab Rawane Issa andFrederick Joossens for discussion on Jiff and WebMPCand to Madars Virza for advice on SNARKs and Lib-SNARK

                                  This research was supported by the following grantsNSF MACS (CNS-1413920) DARPA IBM (W911NF-15-C-0236) SIMONS Investigator Award AgreementDated June 5th 2012 and the Center for Science of In-formation (CSoI) an NSF Science and Technology Cen-ter under grant agreement CCF-0939370

                                  References[1] Bellman httpsgithubcomebfullbellman

                                  [2] Electronic Communications Privacy Act 18 USC 2701 et seq

                                  [3] Foreign Intelligence Surveillance Act 50 USC ch 36

                                  [4] Google transparency report httpswwwgooglecom

                                  transparencyreportuserdatarequestscountries

                                  p=2016-12

                                  [5] Jiff httpsgithubcommultipartyjiff

                                  [6] Jsnark httpsgithubcomakosbajsnark

                                  [7] Law enforcement requests report httpswwwmicrosoft

                                  comen-usaboutcorporate-responsibilitylerr

                                  [8] Zcash httpszcash

                                  [9] Wiretap report 2015 httpwwwuscourtsgov

                                  statistics-reportswiretap-report-2015 Decem-ber 2015

                                  [10] ADMINSTRATIVE OFFICE OF THE COURTS Authorized judge-ships httpwwwuscourtsgovsitesdefaultfilesallauthpdf

                                  [11] ARAKI T BARAK A FURUKAWA J LICHTER T LIN-DELL Y NOF A OHARA K WATZMAN A AND WEIN-STEIN O Optimized honest-majority MPC for malicious ad-versaries - breaking the 1 billion-gate per second barrier In2017 IEEE Symposium on Security and Privacy SP 2017 SanJose CA USA May 22-26 2017 (2017) IEEE Computer Soci-ety pp 843ndash862

                                  [12] BATES A BUTLER K R SHERR M SHIELDS CTRAYNOR P AND WALLACH D Accountable wiretappingndashorndashi know they can hear you now NDSS (2012)

                                  [13] BEN-SASSON E CHIESA A GREEN M TROMER EAND VIRZA M Secure sampling of public parameters for suc-cinct zero knowledge proofs In Security and Privacy (SP) 2015IEEE Symposium on (2015) IEEE pp 287ndash304

                                  [14] BEN-SASSON E CHIESA A TROMER E AND VIRZA MScalable zero knowledge via cycles of elliptic curves IACR Cryp-tology ePrint Archive 2014 (2014) 595

                                  [15] BEN-SASSON E CHIESA A TROMER E AND VIRZA MSuccinct non-interactive zero knowledge for a von neumann ar-chitecture In 23rd USENIX Security Symposium (USENIX Se-curity 14) (San Diego CA Aug 2014) USENIX Associationpp 781ndash796

                                  [16] BESTAVROS A LAPETS A AND VARIA M User-centricdistributed solutions for privacy-preserving analytics Communi-cations of the ACM 60 2 (February 2017) 37ndash39

[17] CHAUM, D., AND VAN HEYST, E. Group signatures. In Advances in Cryptology - EUROCRYPT '91, Workshop on the Theory and Application of Cryptographic Techniques, Brighton, UK, April 8-11, 1991, Proceedings (1991), D. W. Davies, Ed., vol. 547 of Lecture Notes in Computer Science, Springer, pp. 257-265.

[18] DAMGÅRD, I., AND ISHAI, Y. Constant-round multiparty computation using a black-box pseudorandom generator. In Advances in Cryptology - CRYPTO 2005, 25th Annual International Cryptology Conference, Santa Barbara, California, USA, August 14-18, 2005, Proceedings (2005), V. Shoup, Ed., vol. 3621 of Lecture Notes in Computer Science, Springer, pp. 378-394.

[19] DAMGÅRD, I., KELLER, M., LARRAIA, E., PASTRO, V., SCHOLL, P., AND SMART, N. P. Practical covertly secure MPC for dishonest majority - or: Breaking the SPDZ limits. In Computer Security - ESORICS 2013 - 18th European Symposium on Research in Computer Security, Egham, UK, September 9-13, 2013, Proceedings (2013), J. Crampton, S. Jajodia, and K. Mayes, Eds., vol. 8134 of Lecture Notes in Computer Science, Springer, pp. 1-18.

[20] FEIGENBAUM, J., JAGGARD, A. D., AND WRIGHT, R. N. Open vs. closed systems for accountability. In Proceedings of the 2014 Symposium and Bootcamp on the Science of Security (2014), ACM, p. 4.

[21] FEIGENBAUM, J., JAGGARD, A. D., WRIGHT, R. N., AND XIAO, H. Systematizing "accountability" in computer science. Tech. rep., Yale University, Feb. 2012. Technical Report YALEU/DCS/TR-1452.

[22] FEUER, A., AND ROSENBERG, E. Brooklyn prosecutor accused of using illegal wiretap to spy on love interest, November 2016. https://www.nytimes.com/2016/11/28/nyregion/brooklyn-prosecutor-accused-of-using-illegal-wiretap-to-spy-on-love-interest.html.

[23] GOLDWASSER, S., AND PARK, S. Public accountability vs. secret laws: Can they coexist? A cryptographic proposal. In Proceedings of the 2017 Workshop on Privacy in the Electronic Society, Dallas, TX, USA, October 30 - November 3, 2017 (2017), B. M. Thuraisingham and A. J. Lee, Eds., ACM, pp. 99-110.

[24] JAKOBSEN, T. P., NIELSEN, J. B., AND ORLANDI, C. A framework for outsourcing of secure computation. In Proceedings of the 6th edition of the ACM Workshop on Cloud Computing Security, CCSW '14, Scottsdale, Arizona, USA, November 7, 2014 (2014), G. Ahn, A. Oprea, and R. Safavi-Naini, Eds., ACM, pp. 81-92.

[25] KAMARA, S. Restructuring the NSA metadata program. In International Conference on Financial Cryptography and Data Security (2014), Springer, pp. 235-247.

[26] KATZ, J., AND LINDELL, Y. Introduction to Modern Cryptography. CRC Press, 2014.

[27] KROLL, J., FELTEN, E., AND BONEH, D. Secure protocols for accountable warrant execution, 2014. http://www.cs.princeton.edu/~felten/warrant-paper.pdf.

[28] LAMPSON, B. Privacy and security: Usable security: How to get it. Communications of the ACM 52, 11 (2009), 25-27.

[29] LAPETS, A., VOLGUSHEV, N., BESTAVROS, A., JANSEN, F., AND VARIA, M. Secure MPC for Analytics as a Web Application. In 2016 IEEE Cybersecurity Development (SecDev) (Boston, MA, USA, November 2016), pp. 73-74.

[30] MASHIMA, D., AND AHAMAD, M. Enabling robust information accountability in e-healthcare systems. In HealthSec (2012).

[31] PAPANIKOLAOU, N., AND PEARSON, S. A cross-disciplinary review of the concept of accountability: A survey of the literature, 2013. Available online at http://www.bic-trust.eu/files/2013/06/Paper_NP.pdf.

[32] PEARSON, S. Toward accountability in the cloud. IEEE Internet Computing 15, 4 (2011), 64-69.

[33] RIVEST, R. L., SHAMIR, A., AND TAUMAN, Y. How to leak a secret. In Advances in Cryptology - ASIACRYPT 2001, 7th International Conference on the Theory and Application of Cryptology and Information Security, Gold Coast, Australia, December 9-13, 2001, Proceedings (2001), C. Boyd, Ed., vol. 2248 of Lecture Notes in Computer Science, Springer, pp. 552-565.

[34] SCIPR LAB. libsnark: a C++ library for zkSNARK proofs. https://github.com/scipr-lab/libsnark.

[35] SEGAL, A., FEIGENBAUM, J., AND FORD, B. Privacy-preserving lawful contact chaining [preliminary report]. In Proceedings of the 2016 ACM Workshop on Privacy in the Electronic Society (New York, NY, USA, 2016), WPES '16, ACM, pp. 185-188.

[36] SEGAL, A., FORD, B., AND FEIGENBAUM, J. Catching bandits and only bandits: Privacy-preserving intersection warrants for lawful surveillance. In FOCI (2014).

[37] SMITH, S. W. Kudzu in the courthouse: Judgments made in the shade. The Federal Courts Law Review 3, 2 (2009).

[38] SMITH, S. W. Gagged, sealed & delivered: Reforming ECPA's secret docket. Harvard Law & Policy Review 6 (2012), 313-459.

[39] SUNDARESWARAN, S., SQUICCIARINI, A., AND LIN, D. Ensuring distributed accountability for data sharing in the cloud. IEEE Transactions on Dependable and Secure Computing 9, 4 (2012), 556-568.

[40] TAN, Y. S., KO, R. K., AND HOLMES, G. Security and data accountability in distributed systems: A provenance survey. In High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing (HPCC_EUC), 2013 IEEE 10th International Conference on (2013), IEEE, pp. 1571-1578.

[41] VIRZA, M. Private communication, November 2017.

[42] WANG, X., RANELLUCCI, S., AND KATZ, J. Global-scale secure multiparty computation. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, CCS 2017, Dallas, TX, USA, October 30 - November 3, 2017 (2017), B. M. Thuraisingham, D. Evans, T. Malkin, and D. Xu, Eds., ACM, pp. 39-56.

[43] WEITZNER, D. J., ABELSON, H., BERNERS-LEE, T., FEIGENBAUM, J., HENDLER, J. A., AND SUSSMAN, G. J. Information accountability. Commun. ACM 51, 6 (2008), 82-87.

[44] XIAO, Z., KATHIRESSHAN, N., AND XIAO, Y. A survey of accountability in computer networks and distributed systems. Security and Communication Networks 9, 4 (2016), 290-315.
