Revision as of 04:37, 11 November 2022 edit Lambiam (talk \| contribs) Autopatrolled, Extended confirmed users, Page movers 65,062 edits →Locality-preserving hashing: {{cn}} ← Previous edit		Revision as of 23:21, 3 January 2023 edit undo Citation bot (talk \| contribs) Bots 5,863,824 edits Alter: issue, pages. Add: pages, website, s2cid, authors 1-1. Removed proxy/dead URL that duplicated identifier. Removed parameters. Formatted dashes. Some additions/deletions were parameter name changes. \| Use this bot. Report bugs. \| Suggested by SemperIocundus \| #UCB_webform 1353/2500 Next edit →
Line 51: This is in contrast to [[cryptography\|cryptographic]] hash functions and [[checksum]]s, which are designed to have [[Avalanche effect\|random output difference between adjacent inputs]]. The first family of locality-preserving hash functions was devised as a way to facilitate [[Pipeline (computing)\|data pipelining]] in implementations of [[Parallel RAM\|parallel random-access machine (PRAM)]] algorithms that use [[universal hashing]] to reduce memory [[Resource_contention\|contention]] and [[network congestion]].<ref name=Chin1991>{{cite thesis \|last=Chin \|first=Andrew \|date=1991 \|title=Complexity Issues in General Purpose Parallel Computing \|pages=~~87-95~~87–95 \|type=DPhil \|publisher=University of Oxford \|url=https://perma.cc/E47H-WCVP}}</ref><ref name=Chin1994>{{cite journal \|last1=Chin \|first1=Andrew \|date=1994 \|title=Locality-Preserving Hash Functions for General Purpose Parallel Computation \|url=http://unclaw.com/chin/scholarship/hashfunctions.pdf \|journal=Algorithmica \|volume=12 \|issue=~~2-3~~2–3 \|pages=~~170-181~~170–181 \|doi=10.1007/BF01185209\|s2cid=18108051 }}</ref> Locality preserving hashes are related to [[space-filling curve]]s.{{how\|date=March 2020}} Line 123: \| journal = 4th Cybercrime and Trustworthy Computing Workshop \| year = 2013 \| pages = 7–13 \| url = https://www.academia.edu/7833902 \| doi = 10.1109/CTC.2013.9 \| isbn = 978-1-4799-3076-0 Line 206: Testing performed in the paper on a range of file types identified the Nilsimsa hash as having a significantly higher false positive rate when compared to other similarity digest schemes such as TLSH, Ssdeep and Sdhash. An implementation of TLSH is available as [[open-source software]].<ref>{{cite web\|url=https://github.com/trendmicro/tlsh \|title=TLSH \|website=[[GitHub]] \|access-date=2014-04-10}}</ref> ===Random projection=== Line 268: ===Semantic hashing=== Semantic hashing is a technique that attempts to map input items to addresses such that closer inputs have higher [[semantic similarity]].<ref>{{Cite journal\|~~last~~last1=Salakhutdinov\|~~first~~first1=Ruslan\|last2=Hinton\|first2=Geoffrey\|date=2008\|title=Semantic hashing~~\|url=https://linkinghub.elsevier.com/retrieve/pii/S0888613X08001813~~\|journal=International Journal of Approximate Reasoning\|language=en\|volume=50\|issue=7\|pages=969–978\|doi=10.1016/j.ijar.2008.11.006\|doi-access=free}}</ref> The hashcodes are found via training of an [[artificial neural network]] or [[graphical model]].{{cn\|date=September 2021}} ==LSH algorithm for nearest neighbor search==

Locality-sensitive hashing: Difference between revisions