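PHP: Converting HTML Entities (&#xxxx;) to UTF-8
Codes of the form &#xxxx; are called HTML entities, and they have to be taken into account when converting text that contains UTF-8. PHP offers at least two simple ways to turn them back into ordinary characters: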
// Works well for large amounts of text with HTML entities mixed in: only the entities are converted to UTF-8
// http://php.net/manual/en/function.html-entity-decode.php
$content = html_entity_decode($content, ENT_COMPAT, 'UTF-8');
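The call above can be wrapped in a small helper to see what it does to mixed input. This is only a sketch: the function name `decode_entities` is hypothetical, and the flags `ENT_QUOTES | ENT_HTML5` are an assumption that differs from the `ENT_COMPAT` used above, which leaves single-quote entities such as `&#039;` undecoded.

```php
<?php
// Hypothetical helper, not part of the original post.
function decode_entities(string $text): string
{
    // ENT_QUOTES also decodes single-quote entities (&#039;, &apos;);
    // ENT_HTML5 selects the HTML5 named-entity table (PHP 5.4+).
    return html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

// Numeric and named entities are decoded; plain text passes through unchanged.
echo decode_entities('&#20013;&#25991; &amp; caf&eacute;'), "\n";
```

Characters that are already UTF-8 are not touched, which is why this function suits text where entities and ordinary characters are mixed.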
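// Converts the whole string from HTML-ENTITIES to UTF-8 piece by piece; ordinary characters may come out garbled,
// so it is best combined with a regular expression that targets only the HTML entities.
// Note: the 'HTML-ENTITIES' mbstring encoding is deprecated as of PHP 8.2.
// http://php.net/manual/en/function.mb-convert-encoding.php
$content = mb_convert_encoding($content, 'UTF-8', 'HTML-ENTITIES');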
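// With a regular expression, so that only the entities are passed to the converter.
// The original version used create_function() for compatibility with PHP older than 5.3;
// that function was deprecated in PHP 7.2 and removed in PHP 8.0, so an anonymous function is used here:
$content = preg_replace_callback('/&[^;]+;/', function ($matches) {
    return mb_convert_encoding($matches[0], 'UTF-8', 'HTML-ENTITIES');
}, $content);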
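Since the `'HTML-ENTITIES'` encoding of mbstring is deprecated as of PHP 8.2, the regular-expression approach can also be sketched with `html_entity_decode()` as the callback. The helper name `decode_numeric_entities` and the stricter pattern (numeric entities only, decimal or lowercase-`x` hex) are assumptions made for illustration, not part of the original snippet.

```php
<?php
// Hypothetical helper: decodes only numeric entities such as &#25991; or &#x4E2D;.
function decode_numeric_entities(string $text): string
{
    return preg_replace_callback(
        // Matches &#NNNN; (decimal) or &#xHHHH; (hex); named entities do not match.
        '/&#(?:x[0-9a-fA-F]+|[0-9]+);/',
        function (array $m): string {
            return html_entity_decode($m[0], ENT_QUOTES | ENT_HTML5, 'UTF-8');
        },
        $text
    );
}

// Named entities like &amp; and &lt; are deliberately left as they are.
echo decode_numeric_entities('&#x4E2D;&#25991; &amp; &lt;b&gt;'), "\n";
```

Restricting the pattern this way keeps markup-significant entities escaped, which may be preferable when the result is still going to be output as HTML.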
