Perceptual hashing: Difference between revisions

Added reference to PhotoDNA which was introduced in 2009
case fix
Line 6:
 
In 2009, [[Microsoft Corporation]] developed PhotoDNA in collaboration with [[Hany Farid]], professor at [[Dartmouth College]].
PhotoDNA is a perceptual hashing capability developed to combat the distribution of [[child sexual abuse material]] (CSAM) online. Provided by Microsoft at no cost, PhotoDNA remains a critical tool used by major software companies, NGOs and law enforcement agencies around the world.<ref name="nytpdna">{{cite news |last1=Lohr |first1=Steve |title=Microsoft Tackles the Child Pornography Problem |date= December 2009 |publisher= New York Times |url=https://archive.nytimes.com/bits.blogs.nytimes.com/2009/12/16/microsoft-tackles-the-child-pornography-problem/}}</ref>
 
The July 2010 thesis of Christoph Zauner is a well-written introduction to the topic.<ref name="zauner10">{{cite book |last1=Zauner |first1=Christoph |title=Implementation and Benchmarking of Perceptual Image Hash Functions |date= July 2010 |publisher=Upper Austria University of Applied Sciences, Hagenberg Campus |url=https://www.phash.org/docs/pubs/thesis_zauner.pdf}}</ref>
Line 26:
A Chinese team reported in July 2019 that they had discovered a perceptual hash for [[speech encryption]] which proved to be effective. They were able to create a system in which the encryption was not only more accurate, but more compact as well.<ref name=zhang19>{{cite journal |last1=Zhang |first1=Qiu-yu |last2=Zhou |first2=Liang |last3=Zhang |first3=Tao |last4=Zhang |first4=Deng-hai |title=A retrieval algorithm of encrypted speech based on short-term cross-correlation and perceptual hashing |journal=Multimedia Tools and Applications |date=July 2019 |volume=78 |issue=13 |pages=17825–17846 |doi=10.1007/s11042-019-7180-9 |s2cid=58010160 }}</ref>
 
[[Apple Inc]] reported as early as August 2021 a [[child sexual abuse material]] (CSAM) detection system that it calls [[NeuralHash]]. A technical summary document, which explains the system with copious diagrams and example photographs, states that "Instead of scanning images [on corporate] [[iCloud]] [servers], the system performs on-device matching using a database of known CSAM image hashes provided by [the [[National Center for Missing and Exploited Children]]] (NCMEC) and other child-safety organizations. Apple further transforms this database into an unreadable set of hashes, which is securely stored on users' devices."<ref name="apcsam">{{cite news |title=CSAM Detection - Technical Summary |url=https://www.apple.com/child-safety/pdf/CSAM_Detection_Technical_Summary.pdf |publisher=Apple Inc |date=August 2021}}</ref>
 
In an essay entitled "The Problem With Perceptual Hashes", Oliver Kuederle produces a startling collision generated by a piece of commercial [[neural net]] software, of the NeuralHash type. A photographic portrait of a real woman (Adobe Stock #221271979) reduces through the test algorithm to a hash similar to that of a photograph of a butterfly painted in watercolor (from the "deposit photos" database). Both sample images are in commercial databases. Kuederle is concerned with collisions like this. "These cases will be manually reviewed. That is, according to Apple, an Apple employee will then look at your (flagged) pictures... Perceptual hashes are messy. When such algorithms are used to detect criminal activities, especially at Apple scale, many innocent people can potentially face serious problems... Needless to say, I’m quite worried about this."<ref name="rafok">{{cite news |last1=Kuederle |first1=Oliver |title=THE PROBLEM WITH PERCEPTUAL HASHES |url=https://rentafounder.com/the-problem-with-perceptual-hashes/ |access-date=23 May 2022 |publisher=rentafounder.com |date=n.d.}}</ref>
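
How such a collision can arise may be illustrated with a much simpler scheme than the ones discussed above. The following minimal sketch uses a basic "average hash" (not NeuralHash, PhotoDNA, or the software tested by Kuederle) on tiny hand-made grayscale grids. The function names, the pixel values and the match threshold are all hypothetical choices made for illustration only.

```python
def average_hash(pixels):
    """Return a bit string with '1' where a pixel is brighter than the image mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if p > mean else "0" for p in flat)


def hamming_distance(hash_a, hash_b):
    """Count the bit positions at which two equal-length hashes differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b))


def is_match(hash_a, hash_b, threshold=2):
    """Treat two hashes as the same image if they differ in few bits.

    The threshold of 2 is an arbitrary value chosen for this sketch.
    """
    return hamming_distance(hash_a, hash_b) <= threshold


# Two made-up 4x4 grayscale "images" with quite different pixel values
# but the same coarse layout: bright on the left, dark on the right.
portrait = [
    [200, 210, 40, 30],
    [190, 220, 35, 25],
    [180, 205, 45, 20],
    [195, 215, 50, 10],
]
butterfly = [
    [120, 130, 90, 80],
    [125, 140, 85, 70],
    [110, 135, 95, 60],
    [115, 128, 88, 75],
]
# A third image with a different layout: bright on top, dark below.
landscape = [
    [200, 200, 200, 200],
    [200, 200, 200, 200],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]

h_portrait = average_hash(portrait)
h_butterfly = average_hash(butterfly)
h_landscape = average_hash(landscape)

print(h_portrait, h_butterfly, hamming_distance(h_portrait, h_butterfly))
print(is_match(h_portrait, h_butterfly))   # the two distinct images collide
print(is_match(h_portrait, h_landscape))   # a different layout does not match
```

Because the hash keeps only the coarse brightness layout, the first two grids map to the same bit string even though no pixel value is shared between them. Real perceptual hashes are far more elaborate, but they rest on the same trade-off between tolerance to small changes and the risk of false matches.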