php – A better explanation of $convmap in mb_encode_numericentity()

印度阿三17 2019-06-23

The PHP manual's description of the convmap parameter of mb_encode_numericentity is vague to me. Could someone explain it better, or "dumb it down" if that is what it takes? What do the array elements used in this parameter mean? Example 1 on the manual page has:

<?php
$convmap = array (
 int start_code1, int end_code1, int offset1, int mask1,
 int start_code2, int end_code2, int offset2, int mask2,
 ........
 int start_codeN, int end_codeN, int offsetN, int maskN );
// Specify Unicode value for start_codeN and end_codeN
// Add offsetN to value and take bit-wise 'AND' with maskN, then
// it converts value to numeric string reference.
?>

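This is helpful, but then I see many usage examples such as array(0x80, 0xffff, 0, 0xffff); which throw me off. Does this mean the offset is 0 and the mask is 0xffff? If so, does the offset mean the number of characters into the string at which conversion starts, and what does the mask mean in this context?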

Solution:

After going down the rabbit hole, it looks like the comments in the documentation for mb_encode_numericentity are accurate, although somewhat cryptic.

The four major parts to the convmap appear to be:

start_code: The map affects items starting from this character code.
end_code: The map affects items up to this character code.
offset: Add a specific offset amount (positive or negative) for this character code.
mask: Value to be used for mask operation (character code bitwise AND mask value).
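
The four parts above can be sketched as a small function. This is only an illustration of the rule quoted from the manual comment (add the offset, then bitwise AND with the mask); convmap_entity() is a hypothetical helper, not part of mbstring, and it assumes the first row whose range contains the code point is the one applied.

```php
<?php
// Hypothetical helper (not an mbstring function): applies the convmap rule
// to a single code point and returns the entity, or null if no row matches.
function convmap_entity(int $code, array $convmap): ?string
{
    // Each row of the convmap is four integers: start, end, offset, mask.
    foreach (array_chunk($convmap, 4) as [$start, $end, $offset, $mask]) {
        if ($code >= $start && $code <= $end) {
            // Add the offset first, then AND with the mask.
            return '&#' . (($code + $offset) & $mask) . ';';
        }
    }
    return null; // outside every range: the character would be left as-is
}

echo convmap_entity(162, [0x00, 0xff, 0, 0xff]), "\n"; // &#162;
echo convmap_entity(162, [0x00, 0xff, 1, 0xff]), "\n"; // &#163;
echo convmap_entity(162, [0x00, 0xff, 0, 0xf0]), "\n"; // &#160;
var_dump(convmap_entity(65, [0x80, 0xff, 0, 0xff]));    // NULL
```

Note that the offset is a number added to the character code, not a position in the string.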

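Character codes can be looked up in a character table, such as this Codepage Layout example for the ISO-8859-1 encoding. (ISO-8859-1 is the encoding used in Example #2 of the original PHP documentation.) Looking at this encoding table, we can see that the convmap is meant to affect only character codes from 0x80 (which appears to be blank in this particular encoding) through the last character of the encoding, 0xff (which appears to be ÿ).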
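
As a rough illustration of that range, the sketch below runs a convmap of 0x80 to 0xff over a short string. It assumes the mbstring extension is loaded and uses UTF-8 input with \u{...} escapes so the result does not depend on the source file's encoding; the sample string is made up for this example.

```php
<?php
// Only code points 0x80..0xff are converted; plain ASCII is left untouched.
$convmap = [0x80, 0xff, 0, 0xff];

// "\u{00A2}" is the cent sign (162), "\u{00FF}" is y with diaeresis (255).
$encoded = mb_encode_numericentity("5\u{00A2} for \u{00FF}", $convmap, "UTF-8");

echo $encoded, "\n"; // 5&#162; for &#255;
```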

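To better understand the offset and mask features of the convmap, here are some examples of how the offset and mask affect the character code (in the examples below, our character has a character code of 162):

Simple Example: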

<?php
// "¢" is U+00A2 (decimal 162), which lies inside the 0x00-0xff range below.
$original_str = "¢";
$convmap = array(0x00, 0xff, 0, 0xff);
$converted_str = mb_encode_numericentity($original_str, $convmap, "UTF-8");
echo "original:  $original_str\n";
echo "converted: $converted_str\n";
?>

Result:

original:  ¢
converted: &#162;

Offset Example:

<?php
// An offset of 1 is added to the character code (162) before masking.
$original_str = "¢";
$convmap = array(0x00, 0xff, 1, 0xff);
$converted_str = mb_encode_numericentity($original_str, $convmap, "UTF-8");
echo "original:  $original_str\n";
echo "converted: $converted_str\n";
?>

Result:

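original:  ¢
converted: &#163;

Notes:

The offset appears to allow finer control over the current start_code to end_code range of items being converted. For example, you might have some special reason to add an offset for one row of character codes in the convmap, while leaving the offset at 0 for another row.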
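
A minimal sketch of per-row offsets, assuming the mbstring extension is loaded. The two ranges and the sample string are invented for illustration: the first row shifts its characters by one, the second row leaves them alone.

```php
<?php
$convmap = [
    0x80, 0xbf, 1, 0xff, // row 1: codes 128..191, offset 1
    0xc0, 0xff, 0, 0xff, // row 2: codes 192..255, offset 0
];

// 162 falls in row 1 and becomes 163; 255 falls in row 2 and stays 255.
$result = mb_encode_numericentity("\u{00A2}\u{00FF}", $convmap, "UTF-8");

echo $result, "\n"; // &#163;&#255;
```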

Mask Example:

<?php
// Mask Example 1
$original_str = "¢"; // U+00A2, character code 162
$convmap = array(0x00, 0xff, 0, 0xf0);
$converted_str = mb_encode_numericentity($original_str, $convmap, "UTF-8");
echo "original:  $original_str\n";
echo "converted: $converted_str\n\n";

// Mask Example 2
$convmap = array(0x00, 0xff, 0, 0x0f);
$converted_str = mb_encode_numericentity($original_str, $convmap, "UTF-8");
echo "original:  $original_str\n";
echo "converted: $converted_str\n\n";

// Mask Example 3
$convmap = array(0x00, 0xff, 0, 0x00);
$converted_str = mb_encode_numericentity($original_str, $convmap, "UTF-8");
echo "original:  $original_str\n";
echo "converted: $converted_str\n";
?>

Result:

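original:  ¢
converted: &#160;

original:  ¢
converted: &#2;

original:  ¢
converted: &#0;

Notes:

This answer does not intend to cover masking in great detail, but masking can help keep or remove certain bits from a given value.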

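Mask Example 1

In the first mask example, 0xf0, the f indicates that we want to keep the bits on the left side of the binary value. Here, f has a binary value of 1111 and 0 has a binary value of 0000; together they become the value 11110000.

Then, when we perform a bitwise AND with our character code (in this case 162, which has a binary value of 10100010), the operation looks like this: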

  11110000
& 10100010
----------
  10100000

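When converted back to a decimal value, 10100000 is 160.

Therefore, we have effectively kept the "left side" bits of the original character code value and gotten rid of the "right side" bits.

Mask Example 2

In the second mask example, the mask 0x0f (which has a binary value of 00001111) gives the following binary result in the bitwise AND operation: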

  00001111
& 10100010
----------
  00000010

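When converted back to a decimal value, this is 2.

Therefore, we have effectively kept the "right side" bits of the original character code value and gotten rid of the "left side" bits.

Mask Example 3

Finally, the third mask example shows what happens when a mask of 0x00 (00000000 in binary) is used in the bitwise AND operation: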

  00000000
& 10100010
----------
  00000000

The result is 0.
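
The three bitwise AND operations above can be checked with plain PHP, without mbstring. This is just a sketch of the arithmetic; the variable names are chosen for this example.

```php
<?php
$code = 162; // character code used throughout the examples (binary 10100010)
$results = [];

foreach ([0xf0, 0x0f, 0x00] as $mask) {
    $masked = $code & $mask; // keep only the bits that are set in the mask
    $results[] = $masked;
    printf("%08b & %08b = %08b (%d)\n", $mask, $code, $masked, $masked);
}
// 11110000 & 10100010 = 10100000 (160)
// 00001111 & 10100010 = 00000010 (2)
// 00000000 & 10100010 = 00000000 (0)
```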

Source: https://www./content-1-260651.html
