Byte-order mark

From Wikipedia, the free encyclopedia

The byte-order mark (BOM) is a particular usage of the special Unicode character code, U+FEFF ZERO WIDTH NO-BREAK SPACE, whose appearance as a magic number at the start of a text stream can signal several things to a program reading the text:[1]

  • the byte order, or endianness, of the text stream in the cases of 16-bit and 32-bit encodings;
  • the fact that the text stream's encoding is Unicode, to a high level of confidence;
  • which Unicode character encoding is used.

BOM use is optional. Its presence interferes with the use of UTF-8 by software that does not expect non-ASCII bytes at the start of a file but that could otherwise handle the text stream.

Unicode can be encoded in units of 8-bit, 16-bit, or 32-bit integers. For the 16- and 32-bit representations, a computer receiving text from arbitrary sources needs to know which byte order the integers are encoded in. The BOM is encoded in the same scheme as the rest of the document and becomes a noncharacter Unicode code point if its bytes are swapped. Hence, the process accessing the text can examine these first few bytes to determine the endianness, without requiring some contract or metadata outside of the text stream itself. Generally, the receiving computer swaps the bytes to its own endianness if necessary, and no longer needs the BOM for processing.

The byte sequence of the BOM differs per Unicode encoding (including ones outside the Unicode standard such as UTF-7, see table below), and none of the sequences is likely to appear at the start of text streams stored in other encodings. Therefore, placing an encoded BOM at the start of a text stream can indicate that the text is Unicode and identify the encoding scheme used. This use of the BOM is called a "Unicode signature".[2]

Usage

The BOM is, simply, the Unicode codepoint U+FEFF ZERO WIDTH NO-BREAK SPACE, encoded in the current encoding. A text file beginning with the bytes FE FF suggests that the file is encoded in big-endian UTF-16.
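As a small illustration of "U+FEFF encoded in the current encoding", the following Python sketch encodes that single code point with several standard-library codecs. The helper name `bom_bytes` is hypothetical; the explicit-endianness codec names (`utf-16-be` and so on) are used because they do not prepend a BOM of their own.

```python
# Minimal sketch: the BOM is just U+FEFF run through the encoder.
BOM_CHAR = "\ufeff"

def bom_bytes(encoding: str) -> bytes:
    """Return the byte sequence U+FEFF takes in the given encoding.

    `bom_bytes` is an illustrative helper, not a standard-library function.
    """
    return BOM_CHAR.encode(encoding)

if __name__ == "__main__":
    for name in ("utf-8", "utf-16-be", "utf-16-le", "utf-32-be", "utf-32-le"):
        print(f"{name:10} {bom_bytes(name).hex(' ').upper()}")
```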

If U+FEFF appears in the middle of a data stream, the name ZWNBSP should be used for it. Unicode says it should then be interpreted as a normal code point (namely a word joiner), not as a BOM. Since Unicode 3.2, this usage has been deprecated in favor of U+2060 WORD JOINER.[1]

The Unicode 1.0 name for this codepoint is also BYTE ORDER MARK.[3]

UTF-8

The UTF-8 representation of the BOM is the (hexadecimal) byte sequence EF BB BF.

The Unicode Standard permits the BOM in UTF-8,[4] but does not require or recommend its use.[5] UTF-8 always has the same byte order,[6] so the BOM's only use in UTF-8 is to signal at the start that the text stream is encoded in UTF-8, or that it was converted to UTF-8 from a stream that contained an optional BOM. The standard also does not recommend removing a BOM when it is there, so that round-tripping between encodings does not lose information, and so that code that relies on it continues to work.[7][8] The IETF recommends that if a protocol either (a) always uses UTF-8, or (b) has some other way to indicate what encoding is being used, then it "SHOULD forbid use of U+FEFF as a signature."[9] An example of not following this recommendation is the IETF Syslog protocol, which requires text to be in UTF-8 and also requires the BOM.[10]
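A reader that wants to accept UTF-8 with or without the signature can be sketched in Python as follows. The function name `decode_utf8` and its `keep_bom` parameter are assumptions made for this example; `utf-8-sig` is the standard-library codec that drops one leading BOM on decoding.

```python
def decode_utf8(data: bytes, keep_bom: bool = False) -> str:
    """Decode UTF-8 bytes, optionally preserving a leading BOM.

    With keep_bom=False the 'utf-8-sig' codec silently drops one leading
    EF BB BF if present; with keep_bom=True the BOM survives as U+FEFF,
    which matters for lossless round-tripping.
    """
    codec = "utf-8" if keep_bom else "utf-8-sig"
    return data.decode(codec)

if __name__ == "__main__":
    raw = b"\xef\xbb\xbfhello"
    print(repr(decode_utf8(raw)))                  # BOM stripped
    print(repr(decode_utf8(raw, keep_bom=True)))   # BOM kept as U+FEFF
```

Whether to strip or keep the BOM depends on the consumer: a tool that writes the text back out may prefer `keep_bom=True` so the signature is not lost.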

Not using a BOM allows text to be backwards-compatible with software designed for extended ASCII. For instance, many programming languages permit non-ASCII bytes in string literals but not at the start of the file.

A BOM is unnecessary for detecting UTF-8 encoding.[citation needed] UTF-8 is a sparse encoding: a large fraction of possible byte combinations do not result in valid UTF-8 text. Binary data and text in any other encoding are likely to contain byte sequences that are invalid as UTF-8, so existence of such invalid sequences indicates the file is not UTF-8, while lack of invalid sequences is a very strong indication the text is UTF-8. Practically the only exception is text containing only ASCII-range bytes, as this may be a non-ASCII 7-bit encoding, but this is unlikely in any modern data and even then the difference from ASCII is minor (such as changing '\' to '¥').
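The validity heuristic described above can be sketched in a few lines of Python. The name `looks_like_utf8` is hypothetical, and the check is only a heuristic: pure ASCII input also passes, as the paragraph notes.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Heuristic: treat data as UTF-8 if it decodes without errors.

    Text in most other 8-bit encodings contains byte sequences that are
    invalid in UTF-8, so a strict decode failure is strong evidence
    against UTF-8.
    """
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True

if __name__ == "__main__":
    print(looks_like_utf8("héllo".encode("utf-8")))    # valid UTF-8
    print(looks_like_utf8("héllo".encode("latin-1")))  # lone 0xE9 is invalid
```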

Microsoft compilers[11] and interpreters, and many pieces of software on Microsoft Windows such as Notepad (prior to Windows 10 Build 1903[12]), treat the BOM as a required magic number rather than use heuristics. These tools add a BOM when saving text as UTF-8, and cannot interpret UTF-8 unless the BOM is present or the file contains only ASCII. Windows PowerShell (up to 5.1) adds a BOM when it saves UTF-8 XML documents. However, PowerShell Core 6 added a utf8NoBOM value for the -Encoding parameter of some cmdlets, so that documents can be saved without a BOM. Google Docs also adds a BOM when converting a document to a plain text file for download.

UTF-16

In UTF-16, a BOM (U+FEFF) may be placed as the first bytes of a file or character stream to indicate the endianness (byte order) of all the 16-bit code units of the file or stream. If an attempt is made to read this stream with the wrong endianness, the bytes will be swapped, thus delivering the character U+FFFE, which is defined by Unicode as a "noncharacter" that should never appear in the text.

  • If the 16-bit units are represented in big-endian byte order ("UTF-16BE"), the BOM is the (hexadecimal) byte sequence FE FF
  • If the 16-bit units use little-endian order ("UTF-16LE"), the BOM is the (hexadecimal) byte sequence FF FE
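The two cases above can be illustrated with a short Python sketch that reads the first 16-bit unit as big-endian and checks whether it is U+FEFF or the byte-swapped noncharacter U+FFFE. The function name `utf16_order_from_bom` is an assumption made for this example.

```python
def utf16_order_from_bom(data: bytes):
    """Return 'big', 'little', or None based on the first two bytes.

    The first unit is read as big-endian: 0xFEFF means the stream really
    is big-endian, while 0xFFFE (a noncharacter) means the bytes are
    swapped, so the stream is little-endian.
    """
    first_unit = int.from_bytes(data[:2], "big")
    if first_unit == 0xFEFF:
        return "big"
    if first_unit == 0xFFFE:
        return "little"
    return None  # no BOM present

if __name__ == "__main__":
    # Decoding a big-endian BOM with the wrong byte order yields U+FFFE.
    swapped = "\ufeff".encode("utf-16-be").decode("utf-16-le")
    print(hex(ord(swapped)))
    print(utf16_order_from_bom(b"\xff\xfeA\x00"))
```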

For the IANA registered charsets UTF-16BE and UTF-16LE, a byte-order mark should not be used because the names of these character sets already determine the byte order.

Clause D98 of conformance (section 3.10) of the Unicode standard states, "The UTF-16 encoding scheme may or may not begin with a BOM. However, when there is no BOM, and in the absence of a higher-level protocol, the byte order of the UTF-16 encoding scheme is big-endian." Whether or not a higher-level protocol is in force is open to interpretation. Files local to a computer for which the native byte ordering is little-endian, for example, might be argued to be encoded as UTF-16LE implicitly. Therefore, the presumption of big-endian is widely ignored. The W3C/WHATWG encoding standard used in HTML5 specifies that content labelled either "utf-16" or "utf-16le" is to be interpreted as little-endian "to deal with deployed content".[13] However, if a byte-order mark is present, then that BOM is to be treated as "more authoritative than anything else".[14]
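One way to express both policies in code is a decoder that honours a BOM when present and otherwise falls back to a configurable default. This is a sketch only: `decode_utf16` and its `default` parameter are hypothetical names, with the default set to big-endian to mirror clause D98; passing `"utf-16-le"` would approximate the little-endian behaviour described for HTML5.

```python
def decode_utf16(data: bytes, default: str = "utf-16-be") -> str:
    """Decode UTF-16, treating a leading BOM as authoritative.

    Without a BOM, fall back to `default` (big-endian per clause D98
    unless the caller chooses otherwise).
    """
    if data[:2] == b"\xfe\xff":
        return data[2:].decode("utf-16-be")  # BOM consumed, not returned
    if data[:2] == b"\xff\xfe":
        return data[2:].decode("utf-16-le")
    return data.decode(default)

if __name__ == "__main__":
    print(decode_utf16(b"\xff\xfeh\x00i\x00"))                 # BOM says LE
    print(decode_utf16(b"\x00h\x00i"))                         # no BOM: BE default
    print(decode_utf16(b"h\x00i\x00", default="utf-16-le"))    # caller override
```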

Without a BOM, it is still fairly reliable to detect whether text is UTF-16 and what byte order it is in. The characters U+0001 through U+00FF, which are far more common than others, have a NUL high byte. If a very large number of NUL bytes occur only at even (or only at odd) offsets in the file, then it is likely to be big-endian (or little-endian) UTF-16.[citation needed]
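The NUL-position heuristic might be sketched as below. The function name `guess_utf16_order` and the `threshold` value are arbitrary choices for illustration, not figures taken from any standard, and real detectors typically combine several signals.

```python
def guess_utf16_order(data: bytes, threshold: float = 0.4):
    """Guess UTF-16 byte order from where NUL bytes fall.

    In big-endian UTF-16 the high byte comes first, so characters below
    U+0100 put a NUL at even offsets; in little-endian, at odd offsets.
    Returns 'utf-16-be', 'utf-16-le', or None if the evidence is weak.
    """
    if len(data) < 2:
        return None
    even, odd = data[0::2], data[1::2]
    even_nul = even.count(0) / len(even)
    odd_nul = odd.count(0) / len(odd)
    # Require many NULs on one side and almost none on the other.
    if even_nul >= threshold and odd_nul < threshold / 4:
        return "utf-16-be"
    if odd_nul >= threshold and even_nul < threshold / 4:
        return "utf-16-le"
    return None

if __name__ == "__main__":
    print(guess_utf16_order("hello world".encode("utf-16-le")))
    print(guess_utf16_order(b"hello world"))
```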

UTF-32

Although a BOM could be used with UTF-32, this encoding is rarely used for transmission. Otherwise the same rules as for UTF-16 are applicable.

The BOM for little-endian UTF-32 is the same pattern as a little-endian UTF-16 BOM followed by a UTF-16 NUL character, an unusual example of the BOM being the same pattern in two different encodings. Programmers using the BOM to identify the encoding will have to decide whether UTF-32 or UTF-16 with a NUL first character is more likely. UTF-32 is easily detected without a BOM because every 4th byte is NUL.
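A BOM sniffer has to resolve this ambiguity somehow. The sketch below makes one possible choice, preferring UTF-32 by testing the longer signatures first; the names `sniff_bom` and `_BOMS` are hypothetical, while the `codecs.BOM_*` constants come from the Python standard library.

```python
import codecs

# Longer signatures first, so FF FE 00 00 is reported as UTF-32LE
# rather than as a UTF-16LE BOM followed by a NUL character.
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]

def sniff_bom(data: bytes):
    """Return (encoding_name, bom_length), or (None, 0) if no BOM matches."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0

if __name__ == "__main__":
    print(sniff_bom(b"\xff\xfe\x00\x00A\x00\x00\x00"))  # taken as UTF-32LE
    print(sniff_bom(b"\xff\xfeA\x00"))                  # taken as UTF-16LE
```

A caller that considers UTF-16 text starting with a NUL character more likely than UTF-32 could reorder the table accordingly.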

Byte-order marks by encoding

This table illustrates how the BOM is represented as a byte sequence in various encodings and how those sequences might appear in a text editor that is interpreting each byte as a legacy encoding (Windows-1252 and caret notation for the C0 controls):

Encoding | Representation (hexadecimal) | Representation (decimal) | Bytes interpreted as Windows-1252
UTF-8[a] | EF BB BF | 239 187 191 | ï»¿
UTF-16 (BE) | FE FF | 254 255 | þÿ
UTF-16 (LE) | FF FE | 255 254 | ÿþ
UTF-32 (BE) | 00 00 FE FF | 0 0 254 255 | ^@^@þÿ (^@ is the null character)
UTF-32 (LE) | FF FE 00 00 | 255 254 0 0 | ÿþ^@^@ (^@ is the null character)
UTF-7[a] | 2B 2F 76[b][16][17] | 43 47 118 | +/v
UTF-1[a] | F7 64 4C | 247 100 76 | ÷dL
UTF-EBCDIC[a] | DD 73 66 73 | 221 115 102 115 | Ýsfs
SCSU[a] | 0E FE FF[c] | 14 254 255 | ^Nþÿ (^N is the "shift out" character)
BOCU-1[a] | FB EE 28 | 251 238 40 | ûî(
GB18030[a] | 84 31 95 33 | 132 49 149 51 | „1•3

  a. This is not literally a "byte order" mark, since a code unit in these encodings is one byte and therefore cannot have bytes in a "wrong" order. Nevertheless, the BOM can be used to indicate the encoding of the text that follows it.[6][15]
  b. Followed by 38, 39, 2B, or 2F (ASCII 8, 9, + or /), depending on what the next character is.
  c. SCSU allows other encodings of U+FEFF; the form shown is the signature recommended in UTR #6.[18]

References

  1. ^ a b "FAQ - UTF-8, UTF-16, UTF-32 & BOM". Unicode.org. Retrieved 28 January 2017.
  2. ^ "The Unicode® Standard Version 9.0" (PDF). The Unicode Consortium.
  3. ^ "Zero Width No-Break Space (U+FEFF)".
  4. ^ "The Unicode Standard 5.0, Chapter 2:General Structure" (PDF). p. 36. Retrieved 29 March 2009. Table 2-4. The Seven Unicode Encoding Schemes
  5. ^ "The Unicode Standard 5.0, Chapter 2:General Structure" (PDF). p. 36. Retrieved 30 November 2008. Use of a BOM is neither required nor recommended for UTF-8, but may be encountered in contexts where UTF-8 data is converted from other encoding forms that use a BOM or where the BOM is used as a UTF-8 signature
  6. ^ a b "FAQ - UTF-8, UTF-16, UTF-32 & BOM: Can a UTF-8 data stream contain the BOM character (in UTF-8 form)? If yes, then can I still assume the remaining UTF-8 bytes are in big-endian order?". Unicode.org. Retrieved 4 January 2009.
  7. ^ "Re: pre-HTML5 and the BOM from Asmus Freytag on 2025-08-05 (Unicode Mail List Archive)". Unicode.org. Retrieved 14 July 2012.
  8. ^ "Bug ID: JDK-6378911 UTF-8 decoder handling of byte-order mark has changed". Bugs.java.com. Retrieved 14 October 2021.
  9. ^ Yergeau, Francois (November 2003). UTF-8, a transformation format of ISO 10646. IETF. doi:10.17487/RFC3629. RFC 3629. Retrieved 15 May 2014.
  10. ^ Gerhards, Rainer (March 2009). "MSG". The Syslog Protocol. IETF. sec. 6.4. doi:10.17487/RFC5424. RFC 5424.
  11. ^ Alf P. Steinbach (2011). "Unicode part 1: Windows console i/o approaches". Retrieved 24 March 2012. However, since the C++ source code was encoded as UTF-8 without BOM (as is usual in Linux), the Visual C++ compiler erroneously assumed that the source code was encoded as Windows ANSI.
  12. ^ "Windows 10 Notepad is Getting Better UTF-8 Encoding Support". BleepingComputer. Retrieved 7 March 2023.
  13. ^ "UTF-16LE". Encoding Standard. WHATWG.
  14. ^ "Decode". Encoding Standard. WHATWG.
  15. ^ Yergeau, François (8 November 2003). "RFC 3629 - UTF-8, a transformation format of ISO 10646". IETF Datatracker. Retrieved 28 January 2017.
  16. ^ Honermann, Tom (2 January 2021). "Clarify guidance for use of a BOM as a UTF-8 encoding signature" (PDF). Unicode.
  17. ^ "SDL Documentation".
  18. ^ Markus Scherer. "UTS #6: Compression Scheme for Unicode". Unicode.org. Retrieved 28 January 2017.