United States v. Miller
- Appeals Court Questions Government on Reliability of Google Scanning Algorithm: This week a federal appellate judge pressed the government about the reliability of a Google scanning algorithm that provided the basis for the warrantless search of a private email. EPIC raised concerns about the scanning technique in an amicus brief for the appeals court. In United States v. Wilson, EPIC argued that "because neither Google nor the Government explained how the image matching technique actually works or presented evidence establishing accuracy and reliability, the Government's search was unreasonable." Judge Watford told the government attorney that he "would like to hear your defense of the evidentiary record" because what we have "is this declaration from the Google person," and "I would need far more explanation of how reliable the hash matching technology is before I could validate this search." EPIC filed an amicus brief in a similar case in United States v. Miller. EPIC routinely submits amicus briefs on the privacy implications of new investigative techniques. EPIC has also long promoted algorithmic transparency to ensure accountability for AI-based decision making. (Nov. 18, 2019)
- EPIC Files Amicus in Case Concerning Government Searches and Google's Email Screening Practices: EPIC has filed an amicus brief with the U.S. Court of Appeals for the Sixth Circuit in United States v. Miller, arguing that the Government must prove the reliability of Google's email screening technique. The lower court held that law enforcement could search any images that Google's algorithm had flagged as apparent child pornography. EPIC explained that a search is unreasonable when the government cannot establish the reliability of the technique. EPIC also warned that the government could use this technique "to determine if files contain religious viewpoints, political opinions, or banned books." EPIC has promoted algorithmic transparency for many years. EPIC routinely submits amicus briefs on the application of the Fourth Amendment to investigative techniques. EPIC previously urged the government to prove the reliability of investigative techniques in Florida v. Harris. (Oct. 18, 2018)
This case follows the prosecution of an individual based on the discovery of two illegal images that he uploaded via Gmail. These images were automatically flagged by Google’s “product abuse detection system” based on the company’s proprietary hashing technology and automatically relayed to government investigators without human review of a matching image. The defendant has argued that the searches of his e-mail data were unreasonable under the Fourth Amendment. The lower court found that the "private search" doctrine applied and exempted these actions from Fourth Amendment scrutiny. Neither Google nor the Government has produced the underlying algorithm used to scan the images, and the Government has not established that the investigative technique is accurate or reliably identifies only contraband images.
The Fourth Amendment only protects against searches by the government, not private entities. In United States v. Jacobsen, 466 U.S. 109, 131 (1984), the Supreme Court decided that government searches that follow private searches and are within the scope of the private search are reasonable. In Jacobsen, the Court held that the Government’s warrantless inspection and testing of the contents of a package that had been previously searched by FedEx was permissible because “there was a virtual certainty” that the law enforcement officer’s search would not reveal “anything more than he had already been told.”
The question in this case is whether the Government has provided sufficient evidence to establish that there was "virtual certainty" that the files that Google sent to NCMEC in a CyberTipline report, and that police ultimately opened, were the same as those a Google employee had previously viewed.
Google maintains a proprietary image matching system that automatically scans files uploaded to Google products, including Gmail, to search for child pornography. The defendant uploaded two images to Google's e-mail system, which flagged the images as "apparent child pornography." Google's system flagged the defendant's images, and then automatically generated and submitted “CyberTip Report # 5778397” to the National Center for Missing and Exploited Children (“NCMEC”) with the following information:
- the date and time of the incident;
- the e-mail address associated with the user account that uploaded the file;
- the IP address associated with the upload;
- a list of IP addresses used to access the user account (which can go as far back as the original account registration date);
- the filename;
- the "categorization" of the image based on an existing rubric; and
- copies of the image file(s).
Google was required by law to submit this CyberTipline report once it became aware of apparent child pornography. 18 U.S.C. § 2258A.
When NCMEC received Google's CyberTipline report, NCMEC staff initiated a web search for the email and IP addresses associated with the report without opening the images sent by Google to confirm that they were contraband. NCMEC identifies information associated with the user’s IP address(es): Country, Region, City, Metro Code, Postal Code, Area Code, Latitude/Longitude, and Internet Service Provider or Organization. NCMEC staff also collect "data gathered from searches on publicly-available, open-source websites" using the account and user identifying information provided by the CyberTipline report. This information can include social media profiles, websites, addresses, and other personal data.
After NCMEC staff collected this information on the defendant, the report was referred to the Kentucky State Police and the Kenton County Police Department (KCPD) for potential investigation. A KCPD detective opened the images attached to the CyberTipline report and confirmed they were child pornography.
What is known about Google’s use of the hashing technology is limited to the declaration of Cathy McGoff, a Senior Manager for Law Enforcement and Information Security at Google:
4. Based on [Google’s] private non-government interests, since 2008, Google has been using its own proprietary hashing technology to tag confirmed child sexual abuse images. Each offending image, after it is viewed by at least one Google employee is given a digital fingerprint (“hash”) that our computers can automatically recognize and is added to our repository of hashes of apparent child pornography as defined in 18 USC § 2256. Comparing these hashes to hashes of content uploaded to our services allows us to identify duplicate images of apparent child pornography to prevent them from continuing to circulate on our products.
5. We also rely on users who flag suspicious content they encounter so we can review it and help expand our database of illegal images. No hash is added to our repository without a corresponding image first having been visually confirmed by a Google employee to be apparent child pornography.
6. Google trains a team of employees on the legal obligation to report apparent child pornography. The team is trained by counsel on the federal statutory definition of child pornography and how to recognize it on our products and services. Google makes reports in accordance with that training.
7. When Google’s product abuse detection system encounters a hash that matches a hash of a known child sexual abuse image, in some cases Google automatically reports the user to NCMEC without re-reviewing the image. In other cases, Google undertakes a manual, human review, to confirm that the image contains apparent child pornography before reporting it to NCMEC.
While Google describes its algorithm as assigning each image in its repository a "digital fingerprint," it provides no information on the type of hash function used to assign that fingerprint. This is important because file hashing functions work differently than image hashing functions. File hashing functions produce a hash value that is, for practical purposes, unique to a file's exact contents; changing even one bit of data changes the hash value. File hashing is a method of demonstrating that two files are the same, bit-for-bit, without comparing each bit to the corresponding bit of the other file, which would be time- and resource-intensive. In contrast, image hashing algorithms are designed to match images even after they have been altered slightly, which means that by design they also match files that do not have the same file-hash values.
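The difference between the two kinds of hashing can be illustrated with a minimal sketch. The record does not disclose which function Google actually uses, so everything here is an assumption made for illustration: SHA-256 stands in for a file hashing function, a toy "average hash" stands in for an image hashing function, and the names `file_hash`, `average_hash`, and `hamming_distance` are hypothetical.

```python
import hashlib


def file_hash(data: bytes) -> str:
    """File hash (SHA-256 as a stand-in): digest of the exact bytes."""
    return hashlib.sha256(data).hexdigest()


def average_hash(pixels: list[list[int]]) -> int:
    """Toy image hash: one bit per pixel, set when the pixel is
    brighter than the mean brightness of the whole image."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two image hashes differ."""
    return bin(a ^ b).count("1")


# Hypothetical 4x4 grayscale image (values 0-255).
original = [
    [200, 210, 30, 20],
    [190, 220, 40, 10],
    [35, 25, 205, 215],
    [15, 45, 195, 225],
]
# A slightly brightened copy: every pixel raised by 3.
altered = [[min(255, p + 3) for p in row] for row in original]

orig_bytes = bytes(p for row in original for p in row)
alt_bytes = bytes(p for row in altered for p in row)

# The file hashes differ because the underlying bytes differ.
file_match = file_hash(orig_bytes) == file_hash(alt_bytes)

# The image hashes are compared by distance; 0 means identical hashes.
image_distance = hamming_distance(average_hash(original), average_hash(altered))

print("file hashes match:", file_match)
print("image hash distance:", image_distance)
```

In this sketch the two inputs are different files, so a file hash treats them as distinct, while the toy image hash treats them as the same image. A match under an image hashing scheme therefore does not by itself show that the matched file is bit-for-bit identical to one a person previously viewed.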
Detective Aaron Schihil of the Kenton County Police Department received the information from NCMEC and “opened the attachments [and] viewed the relevant images, which he confirmed to be child pornography.” After confirming they were child pornography, Detective Schihil obtained a search warrant for several categories of data held by Google and associated with the Defendant’s account. Detective Schihil later obtained a search warrant for the Defendant’s residence and a separate search warrant for various electronic devices seized at the Defendant’s residence.
The Defendant was charged in the U.S. District Court for the Eastern District of Kentucky and subsequently filed a motion to suppress the evidence obtained by Detective Schihil. He argued that Google's search was that of a government actor in this case and that it was therefore an unreasonable warrantless search under the Fourth Amendment, or in the alternative that the Detective's search exceeded the scope of Google's private search. The district court disagreed. In denying the motion to suppress, the district court found that Google's search was a private search and that the police did not exceed the scope of the private search because there was a "virtual certainty" that a Google employee had previously viewed the images before the police did so. The district court relied upon Google's representation that its algorithm assigns each image in its database a "digital fingerprint" that is "uniquely associated with the input data." The Defendant was subsequently convicted on several counts and appealed to the U.S. Court of Appeals for the Sixth Circuit.
EPIC seeks to ensure that Fourth Amendment protections keep pace with advances in technology. For instance, EPIC filed an amicus brief before the Supreme Court in Carpenter v. United States arguing that the technological changes justified broader Fourth Amendment protections. The Court declined to extend the “third party doctrine” to permit the warrantless collection of cell site location information. Here, EPIC has an interest in ensuring that the Government does not conduct warrantless searches based on proprietary and potentially unreliable algorithmic search techniques.
This case also implicates questions about the standard of proof required to demonstrate the validity of a new investigative technique, an issue EPIC has advised the courts on previously. EPIC advised the Supreme Court about this issue as amicus curiae in Florida v. Harris, arguing that the government should bear the burden of establishing the reliability of investigative techniques in criminal cases.
U.S. Court of Appeals for the Sixth Circuit, No. 18-5578
- Brief of Appellant (Oct. 10, 2018)
- Amicus Brief of Electronic Privacy Information Center in Support of Appellant (Oct. 17, 2018)
- Brief of the United States (Dec. 19, 2018)
- Amicus Brief of Discord, Dropbox, Facebook, Google, Microsoft, Pinterest, Reddit, Snap, and Twitter in Support of Appellee (Dec. 26, 2018)
- Miller Reply Brief (Jan. 18, 2019)
U.S. District Court for the Eastern District of Kentucky, No. 16-047
- Memorandum Order Accepting Report and Recommendation (June 23, 2017)
- United States' Response to Defendant's Motion to Suppress (Feb. 28, 2017)
- Declaration of Cathy McGoff (Senior Manager, Law Enforcement and Information Security at Google, Inc.)
- CyberTip Report 5778397
- Declaration of John Shehan (Vice President at the National Center for Missing and Exploited Children)
- Search Warrant for Records Held by Google, Inc.
- Search Warrant for Defendant's Residence
- Search Warrant for Electronic Media Seized at Defendant's Residence
- EPIC: Algorithmic Transparency
- EPIC Amicus: Florida v. Harris
- EPIC Amicus: Carpenter v. United States
- EPIC v. DOJ (criminal justice algorithms)
- EPIC v. CBP (Analytical Framework for Intelligence)
- EPIC v. DHS (FAST Program)