Imagine that you have been developing a web application in your local environment, and the day has come to put the system into production on a shared hosting server. Besides checking that every feature works as expected, one concern in this scenario is whether the charset is properly configured in the application and in the database.
What is charset?
A charset (character set) is the set of characters used to create documents, databases, websites, and so on. Every charset has a list of available characters, each represented by a reference position.
Check out some characters available in ASCII charset.
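As a small illustration of the "reference position" idea, the sketch below prints a few ASCII characters next to their positions. The helper name asciiPositions is hypothetical; str_split() and ord() are standard PHP functions.

```php
<?php
// Hypothetical helper: pair each character of $chars with its ASCII position.
function asciiPositions(string $chars): array
{
    $rows = [];
    foreach (str_split($chars) as $char) {
        // ord() returns the numeric value of the (single-byte) character.
        $rows[] = [$char, ord($char)];
    }
    return $rows;
}

foreach (asciiPositions('AZaz09 ~') as [$char, $position]) {
    printf("'%s' => %d\n", $char, $position);
}
```

For example, 'A' sits at position 65, 'a' at 97, and the space character at 32.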
What is the importance of charset?
The charset of a document tells the browser which encoding was used, so that the document can be interpreted and its information displayed correctly to the user. Any mismatch between the content, the declared charset, and the charset used to save the document in your editor (Eclipse, for example) may compromise the display, causing problems such as encoding errors in the document or incorrect characters being displayed in the application.
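The mismatch can be reproduced in a few lines. This is a minimal sketch that assumes the mbstring extension is enabled; it takes the UTF-8 bytes of "é" and shows what a reader that wrongly assumes ISO-8859-1 would make of them.

```php
<?php
// "é" in UTF-8 is the two bytes 0xC3 0xA9. Byte escapes are used so the
// example does not depend on how this file itself was saved.
$utf8 = "\xC3\xA9";

// Simulate a reader that assumes ISO-8859-1: every byte is taken as one
// character, then re-encoded as UTF-8 for display.
$misread = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

echo bin2hex($utf8), "\n";    // c3a9
echo bin2hex($misread), "\n"; // c383c2a9, which a browser displays as "Ã©"
```

One character becomes two, which is the typical "Ã©" garbage seen on pages with a wrong charset declaration.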
Setting charset of a PHP application
There are many character sets available for use on the Internet, and the most used are ISO-8859-1 and UTF-8. If you are developing content, you have to decide which encoding to use. UTF-8 is the recommended charset, because it covers almost all the characters and symbols in the world.
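UTF-8 achieves this coverage by using a variable number of bytes per character. The sketch below, which assumes the mbstring extension is available, compares the byte count (strlen) with the character count (mb_strlen) for a few sample characters; the array keys are only illustrative labels.

```php
<?php
// Sample characters written as UTF-8 byte escapes.
$samples = [
    'latin-a' => "a",            // U+0061, 1 byte
    'e-acute' => "\xC3\xA9",     // U+00E9, 2 bytes
    'euro'    => "\xE2\x82\xAC", // U+20AC, 3 bytes
    'cjk'     => "\xE4\xB8\xAD", // U+4E2D, 3 bytes
];

foreach ($samples as $label => $char) {
    // strlen() counts bytes; mb_strlen() counts characters in the given encoding.
    printf("%-8s bytes=%d characters=%d\n", $label, strlen($char), mb_strlen($char, 'UTF-8'));
}
```

Because byte count and character count differ, string functions that are not multibyte-aware can give surprising results on UTF-8 text.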
Check the recommended steps to configure the charset of your application.
Recommendations to configure the browser
Send the charset at the beginning of the script, in the HTTP header, together with the content type, in this case HTML.
<?php header("Content-Type: text/html; charset=utf-8"); ?>
HTML meta tag
Declare the charset through the meta tag in the head of the HTML code.
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
Declare the charset in form declarations as well, if the page has any.
<form accept-charset="utf-8" ...>
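The three declarations above can be combined in a single script. The following is only a sketch: the function name renderPage, the form field "name", and the page title are assumptions made for illustration, and the null coalescing operator requires PHP 7 or later.

```php
<?php
// Hypothetical helper that renders a small page declared as UTF-8.
function renderPage(string $name): string
{
    // Escape user input as UTF-8 so multibyte characters are preserved.
    $safeName = htmlspecialchars($name, ENT_QUOTES, 'UTF-8');

    return <<<HTML
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Charset example</title>
</head>
<body>
<form method="post" accept-charset="utf-8">
<input type="text" name="name" value="{$safeName}" />
<input type="submit" value="Send" />
</form>
</body>
</html>
HTML;
}

// The HTTP header has to be sent before any output.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}

echo renderPage($_POST['name'] ?? '');
```

Passing 'UTF-8' explicitly to htmlspecialchars keeps the escaping consistent with the declared charset, whatever the default of the PHP version in use.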
Recommendations for setting up database
Verify that the tables and character fields are properly configured to use the utf8_general_ci collation, and also declare the charset when opening the connection to the database.
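One possible way to do this check from PHP is to query MySQL's information_schema. The sketch below assumes an open PDO connection; both helper names (findMismatchedTables and buildConvertStatement) are hypothetical.

```php
<?php
// Hypothetical helper: list tables of $schema whose collation differs from
// the expected one, as an array of table name => current collation.
function findMismatchedTables(PDO $pdo, string $schema, string $collation = 'utf8_general_ci'): array
{
    $sql = 'SELECT TABLE_NAME, TABLE_COLLATION
            FROM information_schema.TABLES
            WHERE TABLE_SCHEMA = :schema AND TABLE_COLLATION <> :collation';
    $statement = $pdo->prepare($sql);
    $statement->execute([':schema' => $schema, ':collation' => $collation]);

    return $statement->fetchAll(PDO::FETCH_KEY_PAIR);
}

// Hypothetical helper: build the statement that converts one table.
function buildConvertStatement(string $table): string
{
    // Identifiers cannot be bound as parameters, so the name is quoted with
    // backticks and any backtick inside it is doubled.
    $quoted = '`' . str_replace('`', '``', $table) . '`';

    return "ALTER TABLE {$quoted} CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci";
}

echo buildConvertStatement('users'), "\n";
```

Converting a table rewrites its data, so it is safer to back up the database before running the generated statements.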
$handle = new PDO("mysql:host=localhost;dbname=dbname", 'username', 'password', array(PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES utf8"));
return array(
    'db' => array(
        'driver' => 'Pdo',
        'dsn' => 'mysql:dbname=dbname;host=localhost',
        'driver_options' => array(
            PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES utf8",
        ),
    ),
);
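As an alternative to SET NAMES, the PDO MySQL driver accepts a charset option in the DSN (PHP 5.3.6 and later). The sketch below is one way to use it; the helper names buildDsn and connect are hypothetical, and the host, database name, and credentials are the same placeholders used above.

```php
<?php
// Hypothetical helper: build a MySQL DSN that names the connection charset.
function buildDsn(string $host, string $dbname, string $charset = 'utf8'): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $dbname, $charset);
}

// Hypothetical helper: open the connection and report errors as exceptions.
function connect(string $dsn, string $username, string $password): PDO
{
    return new PDO($dsn, $username, $password, [
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
    ]);
}

echo buildDsn('localhost', 'dbname'), "\n";
// Usage, with placeholder credentials:
// $handle = connect(buildDsn('localhost', 'dbname'), 'username', 'password');
```

Note that MySQL's utf8 charset stores at most three bytes per character; if four-byte characters such as emoji are needed, utf8mb4 can be passed as the third argument, provided the tables use a matching utf8mb4 collation.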
If you use the legacy mysql extension, set the charset with the mysql_set_charset function (mysqli_set_charset in the mysqli extension).
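The old mysql extension was removed in PHP 7, so on current PHP versions the equivalent call lives in mysqli. The following sketch assumes the mysqli extension is installed; the helper name connectWithCharset is hypothetical and the connection parameters are placeholders.

```php
<?php
// Hypothetical helper: open a mysqli connection and set its charset.
function connectWithCharset(
    string $host,
    string $username,
    string $password,
    string $dbname,
    string $charset = 'utf8'
): mysqli {
    $mysqli = new mysqli($host, $username, $password, $dbname);
    if ($mysqli->connect_errno) {
        throw new RuntimeException('Connection failed: ' . $mysqli->connect_error);
    }

    // set_charset() is the object-oriented form of mysqli_set_charset().
    if (!$mysqli->set_charset($charset)) {
        throw new RuntimeException('Could not set charset: ' . $mysqli->error);
    }

    return $mysqli;
}

// Usage, with placeholder credentials:
// $mysqli = connectWithCharset('localhost', 'username', 'password', 'dbname');
```

Setting the charset through the API, rather than sending SET NAMES as a query, also lets the client library know the encoding when escaping strings.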
I hope this saves you some of the time I spent researching this topic. Although the article focuses on a PHP application, the steps are similar in other scenarios. Also check Rob Allen's article about UTF-8, PHP and MySQL; it contains valuable tips.
Before you deploy your application, test it separately on the production server.
Allen, Rob. UTF-8, PHP and MySQL. http://akrabat.com/php/utf8-php-and-mysql/ Accessed on February 20, 2015.
Charsets characters. http://www.w3.org/International/getting-started/ Accessed on February 21, 2015.