Wednesday, 21 June 2017

Fix incorrectly displayed encoding on an HTML document with PHP



Is there a way to fix the characters that display improperly after running this HTML markup through phpQuery::newDocument? The original document has slanted (curly) double quotes around "Classics with modern Woman", and they display improperly once the new document is created with phpQuery.




    // Original document is UTF-8 encoded
    $raw_html = '

    Mr. Smith of Bangkok celebrated the “Classics with modern Woman”.

    ';
    print($raw_html);

    $aNew_document = phpQuery::newDocument($raw_html);
    print($aNew_document);


Original output:

    Mr. Smith of Bangkok celebrated the “Classics with modern Woman”.

New document output:

    Mr. Smith of Bangkok celebrated the �Classics with modern Woman.


Answer




  1. Save the script file as UTF-8 without a BOM.

  2. Add this header at the top of your script, before any output is sent:



    header("Content-Type: text/html; charset=UTF-8");
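Both points can be sanity-checked from inside PHP. A string literal holds whatever bytes the script file was saved with, so if the file is not really UTF-8, the curly quotes in `$raw_html` are not UTF-8 either. The following is a minimal sketch, assuming the mbstring extension is available; `is_valid_utf8` is a hypothetical helper name, and Windows-1252 is used only as an example of a non-UTF-8 encoding, not as a diagnosis of the asker's file.

```php
<?php
// Hypothetical helper (not part of phpQuery): reports whether the bytes in
// $markup form valid UTF-8.
function is_valid_utf8(string $markup): bool
{
    return mb_check_encoding($markup, 'UTF-8');
}

// Curly quotes written as explicit UTF-8 bytes, so this sample does not
// depend on how this file itself was saved.
$utf8_markup = "the \xE2\x80\x9CClassics with modern Woman\xE2\x80\x9D.";

// The same quotes as single Windows-1252 bytes, which is what a string
// literal would contain if the script were saved in that encoding.
$cp1252_markup = "the \x93Classics with modern Woman\x94.";

// Send the header before any output; header() cannot take effect afterwards.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
}

var_dump(is_valid_utf8($utf8_markup));   // bool(true)
var_dump(is_valid_utf8($cp1252_markup)); // bool(false)
```

If the check returns false for your own `$raw_html`, the file encoding is the likely culprit and the header alone will not fix the output.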





[EDIT]: How to save files as UTF-8 without BOM:



At the OP's request, here's how to do it on Windows:




  1. Download Notepad++. It is an awesome text editor that you should be using.

  2. Install it.

  3. Open the PHP script that contains this code in Notepad++, that is, the file on your computer where you are doing all the coding.

  4. In Notepad++, from the Encoding menu at the top, select "Convert to UTF-8 without BOM".

  5. Save the file.


  6. Upload to your webserver by FTP or whatever you use.

  7. Now, run that script.
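If you want to confirm from PHP whether a file still carries a BOM, the check is a comparison against the three bytes EF BB BF at the start of the file. This is a rough sketch only; `strip_utf8_bom` is a hypothetical helper name, and the sample strings stand in for real file contents.

```php
<?php
// Hypothetical helper: returns $bytes without a leading UTF-8 BOM (EF BB BF).
// Input that does not start with a BOM is returned unchanged.
function strip_utf8_bom(string $bytes): string
{
    $bom = "\xEF\xBB\xBF";
    if (strncmp($bytes, $bom, 3) === 0) {
        return substr($bytes, 3);
    }
    return $bytes;
}

// Stand-ins for file contents saved with and without a BOM.
$with_bom = "\xEF\xBB\xBF<html></html>";
$without_bom = "<html></html>";

var_dump(strip_utf8_bom($with_bom) === $without_bom);    // bool(true)
var_dump(strip_utf8_bom($without_bom) === $without_bom); // bool(true)
```

To test a real script, you could pass the result of `file_get_contents()` on its path to the helper and compare lengths before and after; a difference of three bytes means a BOM was present.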

