
In computer science, a fingerprinting algorithm is a procedure that maps an arbitrarily large data item (such as a computer file) to a much shorter bit string, its fingerprint, that uniquely identifies the original data for all practical purposes, just as human fingerprints uniquely identify people for practical purposes. Fingerprints may be used for data deduplication. The technique is also referred to as file fingerprinting, data fingerprinting, or structured data fingerprinting.

Fingerprints are typically used to avoid the comparison and transmission of bulky data. For instance, a web browser or proxy server can efficiently check whether a remote file has been modified by fetching only its fingerprint and comparing it with that of the previously fetched copy.
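
As a minimal sketch of that check, the snippet below compares the fingerprint of a cached copy with a fingerprint reported by the remote side. The helper names `fingerprint` and `needs_refetch` are hypothetical, and SHA-256 is used only as a convenient stand-in for whichever fingerprint function a real system would choose.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return a short, fixed-size digest standing in for the file's fingerprint."""
    return hashlib.sha256(data).hexdigest()


def needs_refetch(cached_copy: bytes, remote_fingerprint: str) -> bool:
    """Decide whether the remote file changed, using only its fingerprint.

    Only the short fingerprint crosses the network, not the bulky file itself.
    """
    return fingerprint(cached_copy) != remote_fingerprint


# Illustrative data: the server publishes the fingerprint of its current version.
cached = b"version 1 of the document"
unchanged_remote = fingerprint(b"version 1 of the document")
modified_remote = fingerprint(b"version 2 of the document")

print(needs_refetch(cached, unchanged_remote))
print(needs_refetch(cached, modified_remote))
```

The two calls cover the unchanged and the modified case; in the first, the cached copy can be reused without transferring the file again.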

Fingerprint functions may be seen as high-performance hash functions used to uniquely identify substantial blocks of data where cryptographic hash functions may be unnecessary.

Special algorithms exist for audio and video fingerprinting.

Properties

Virtual uniqueness

To serve its intended purposes, a fingerprinting algorithm must be able to capture the identity of a file with virtual certainty. In other words, the probability of a collision (two files yielding the same fingerprint) must be negligible compared to the probability of other unavoidable causes of fatal errors (such as the system being destroyed by war or by a meteorite): say, 10^-20 or less.

This requirement is somewhat similar to that of a checksum function, but is much more stringent. To detect accidental data corruption or transmission errors, it is sufficient that the checksums of the original file and any corrupted version differ with near certainty, given some statistical model for the errors. In typical situations, this goal is easily achieved with 16- or 32-bit checksums. In contrast, file fingerprints need to be at least 64 bits long to guarantee virtual uniqueness in large file systems (see birthday attack).
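
The effect of fingerprint length can be estimated with the usual birthday approximation, p ≈ 1 − exp(−n(n−1)/2^(b+1)), for n items and b-bit fingerprints. The sketch below assumes fingerprints behave like independent, uniformly random bit strings, which is an idealization; the function name and the file count of one billion are illustrative only.

```python
import math


def collision_probability(n_items: int, bits: int) -> float:
    """Approximate probability that at least two of n_items share a fingerprint.

    Uses the birthday approximation 1 - exp(-n(n-1) / 2^(bits+1)) and assumes
    uniformly distributed fingerprints (an idealization).
    """
    exponent = n_items * (n_items - 1) / 2.0 ** (bits + 1)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-exponent)


# Illustrative comparison for a collection of one billion files.
for bits in (32, 64, 128):
    print(bits, collision_probability(10**9, bits))
```

Under these assumptions a 32-bit fingerprint almost surely collides somewhere in such a collection, while each additional bit roughly halves the collision probability once it is small.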

When proving the above requirement, one must take into account that files are generated by highly non-random processes that create complicated dependencies among files. For instance, in a typical business network, one usually finds many pairs or clusters of documents that differ only by minor edits or other slight modifications. A good fingerprinting algorithm must ensure that such "natural" processes generate distinct fingerprints, with the desired level of certainty.

Compounding

Computer files are often combined in various ways, such as concatenation (as in archive files) or symbolic inclusion (as with the C preprocessor's #include directive). Some fingerprinting algorithms allow the fingerprint of a composite file to be computed from the fingerprints of its constituent parts. This "compounding" property may be useful in some applications, such as detecting when a program needs to be recompiled.

Algorithms

Rabin's algorithm

Rabin's fingerprinting algorithm is the prototype of the class.[1] It is fast and easy to implement, allows compounding, and comes with a mathematically precise analysis of the probability of collision. Namely, the probability of two strings r and s yielding the same w-bit fingerprint does not exceed max(|r|,|s|)/2^(w-1), where |r| denotes the length of r in bits. The algorithm requires the prior choice of a w-bit internal "key", and this guarantee holds as long as the strings r and s are chosen without knowledge of the key.

Rabin's method is not secure against malicious attacks. An adversarial agent can easily discover the key and use it to modify files without changing their fingerprint.
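
The polynomial arithmetic behind this scheme can be sketched as follows: the input bits are read as a polynomial over GF(2), and the fingerprint is its remainder modulo an irreducible polynomial that plays the role of the key. This is a sketch under stated assumptions, not Rabin's published procedure. The modulus here is the fixed degree-8 polynomial x^8 + x^4 + x^3 + x + 1, chosen only so the arithmetic is easy to follow, whereas the scheme calls for a randomly chosen irreducible polynomial of degree w and a much larger w. The function names are hypothetical, and practical details such as the handling of leading zero bits are ignored.

```python
# Assumption for illustration: a fixed, tiny modulus instead of a random key.
DEGREE = 8
POLY = 0x11B  # x^8 + x^4 + x^3 + x + 1, irreducible over GF(2)


def shift_mod(fp: int, bits: int, poly: int = POLY, degree: int = DEGREE) -> int:
    """Multiply the residue fp by x^bits modulo poly, one bit at a time."""
    for _ in range(bits):
        fp <<= 1
        if fp >> degree:  # degree overflowed: subtract (XOR) the modulus
            fp ^= poly
    return fp


def rabin_fingerprint(data: bytes, poly: int = POLY, degree: int = DEGREE) -> int:
    """Remainder of the message polynomial modulo poly, most significant bit first."""
    fp = 0
    for byte in data:
        for i in range(7, -1, -1):
            fp = (fp << 1) | ((byte >> i) & 1)
            if fp >> degree:
                fp ^= poly
    return fp


def compound(fp_a: int, fp_b: int, len_b_bits: int,
             poly: int = POLY, degree: int = DEGREE) -> int:
    """Fingerprint of A followed by B, computed from the two fingerprints alone."""
    return shift_mod(fp_a, len_b_bits, poly, degree) ^ fp_b


part_a, part_b = b"archive part one", b"part two"
whole = rabin_fingerprint(part_a + part_b)
combined = compound(rabin_fingerprint(part_a), rabin_fingerprint(part_b), 8 * len(part_b))
print(whole == combined)
```

The `compound` helper shows the compounding property for concatenation: because reduction modulo the key polynomial is linear, the fingerprint of a concatenation follows from the parts' fingerprints and the bit length of the second part. The bit-by-bit shift is linear in that length; repeated squaring would be the usual way to speed it up.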

Cryptographic hash functions

Mainstream cryptographic grade hash functions generally can serve as high-quality fingerprint functions, are subject to intense scrutiny from cryptanalysts, and have the advantage that they are believed to be safe against malicious attacks.

A drawback of cryptographic hash algorithms such as MD5 and SHA is that they take considerably longer to execute than Rabin's fingerprint algorithm. They also lack proven guarantees on the collision probability. Some of these algorithms, notably MD5, are no longer recommended for secure fingerprinting. They are still useful for error checking, where purposeful data tampering is not a primary concern.

Perceptual hashing

Perceptual hashing is the use of a fingerprinting algorithm that produces a snippet, hash, or fingerprint of various forms of multimedia.[2][3] A perceptual hash is a type of locality-sensitive hash: its outputs are similar when the features of the multimedia are similar. This is in contrast to cryptographic hashing, which relies on the avalanche effect, whereby a small change in the input value creates a drastic change in the output value. Perceptual hash functions are widely used in finding cases of online copyright infringement as well as in digital forensics, because hashes of similar data are correlated and so similar data can be found (for instance, copies that differ only in a watermark).
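
To make the contrast concrete, the sketch below implements an "average hash", one of the simplest perceptual hashes, on a tiny grayscale grid and compares it with SHA-256 on the same data. The choice of average hash, the 2×2 images, and the function names are assumptions made for illustration; production libraries use larger grids and more robust transforms.

```python
import hashlib


def average_hash(pixels: list[list[int]]) -> int:
    """One bit per pixel: 1 if the pixel is brighter than the image mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def crypto_hash(pixels: list[list[int]]) -> str:
    """SHA-256 over the raw pixel bytes, for comparison."""
    return hashlib.sha256(bytes(p for row in pixels for p in row)).hexdigest()


original = [[10, 200], [220, 30]]
slightly_altered = [[12, 198], [223, 28]]  # e.g. mild noise or recompression
different = [[200, 10], [30, 220]]

print(hamming_distance(average_hash(original), average_hash(slightly_altered)))
print(hamming_distance(average_hash(original), average_hash(different)))
print(crypto_hash(original) == crypto_hash(slightly_altered))
```

The slightly altered image keeps the same perceptual hash, while its cryptographic hash no longer matches the original's; similarity search therefore compares perceptual hashes by Hamming distance rather than by equality.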

Application examples

NIST distributes a software reference library, the American National Software Reference Library, that uses cryptographic hash functions to fingerprint files and map them to software products. The HashKeeper database, maintained by the National Drug Intelligence Center, is a repository of fingerprints of "known to be good" and "known to be bad" computer files, for use in law enforcement applications (e.g. analyzing the contents of seized disk drives).

Content similarity detection

Fingerprinting is currently the most widely applied approach to content similarity detection. This method forms representative digests of documents by selecting a set of multiple substrings (n-grams) from them. The sets represent the fingerprints and their elements are called minutiae.[4][5]

A suspicious document is checked for plagiarism by computing its fingerprint and querying its minutiae against a precomputed index of fingerprints for all documents of a reference collection. Minutiae that match those of other documents indicate shared text segments and suggest potential plagiarism if they exceed a chosen similarity threshold.[6] Computational resources and time are limiting factors for fingerprinting, which is why this method typically compares only a subset of minutiae to speed up the computation and allow for checks in very large collections, such as the Internet.[4]
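
A minimal sketch of this procedure, assuming word 3-grams as the substrings, truncated SHA-256 values as minutiae, and Jaccard overlap as the similarity measure (none of which is prescribed by the cited work), could look like the following. The `modulus` parameter is a hypothetical selection rule that keeps only a subset of minutiae.

```python
import hashlib


def minutiae(text: str, n: int = 3, modulus: int = 1) -> set[int]:
    """Hash every word n-gram and keep those whose hash is divisible by modulus.

    modulus=1 keeps everything; larger values keep a smaller subset (assumed rule).
    """
    words = text.lower().split()
    selected = set()
    for i in range(len(words) - n + 1):
        gram = " ".join(words[i:i + n])
        digest = hashlib.sha256(gram.encode("utf-8")).digest()
        value = int.from_bytes(digest[:8], "big")
        if value % modulus == 0:
            selected.add(value)
    return selected


def resemblance(a: set[int], b: set[int]) -> float:
    """Jaccard overlap of two fingerprints (sets of minutiae)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


suspicious = "the quick brown fox jumps over the lazy dog"
reference = "a quick brown fox jumps over a sleeping cat"

score = resemblance(minutiae(suspicious), minutiae(reference))
THRESHOLD = 0.2  # illustrative value only
print(score, score >= THRESHOLD)
```

In a real system the reference fingerprints would be stored in an inverted index from minutia to document, so that a query touches only documents sharing at least one minutia.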

References

  1. ^ Rabin, M. O. (1981). "Fingerprinting by random polynomials". Center for Research in Computing Technology Harvard University Report TR-15-81.
  2. ^ Buldas, Ahto; Kroonmaa, Andres; Laanoja, Risto (2013). "Keyless Signatures' Infrastructure: How to Build Global Distributed Hash-Trees". In Riis, Nielson H.; Gollmann, D. (eds.). Secure IT Systems. NordSec 2013. Lecture Notes in Computer Science. Vol. 8208. Berlin, Heidelberg: Springer. doi:10.1007/978-3-642-41488-6_21. ISBN 978-3-642-41487-9.
  3. ^ Klinger, Evan; Starkweather, David. "pHash.org: Home of pHash, the open source perceptual hash library". pHash.org. Retrieved 2025-08-05.
  4. ^ a b Hoad, Timothy; Zobel, Justin (2003), "Methods for Identifying Versioned and Plagiarised Documents" (PDF), Journal of the American Society for Information Science and Technology, 54 (3): 203–215, CiteSeerX 10.1.1.18.2680, doi:10.1002/asi.10170, archived from the original (PDF) on 30 April 2015, retrieved 14 October 2014
  5. ^ Stein, Benno (July 2005), "Fuzzy-Fingerprints for Text-Based Information Retrieval", Proceedings of the I-KNOW '05, 5th International Conference on Knowledge Management, Graz, Austria (PDF), Springer, Know-Center, pp. 572–579, archived from the original (PDF) on 2 April 2012, retrieved 7 October 2011
  6. ^ Brin, Sergey; Davis, James; Garcia-Molina, Hector (1995), "Copy Detection Mechanisms for Digital Documents", Proceedings of the 1995 ACM SIGMOD International Conference on Management of Data (PDF), ACM, pp. 398–409, CiteSeerX 10.1.1.49.1567, doi:10.1145/223784.223855, ISBN 978-1-59593-060-6, S2CID 8652205, archived from the original (PDF) on 18 August 2016, retrieved 7 October 2011