lafanga14
Joined: 18 Mar 2008 Posts: 4
Posted: Tue Mar 18, 2008 6:57 am Post subject: Character Encoding for retrieved results
Hi,
I have a production system running Solaris 10 / PHP 5.2.5 / MDB2 2.4.1 / PEAR 1.7.1 and a test system running Windows XP / PHP 5.2.3 / MDB2 2.4.1 / PEAR 1.7.1.
Both systems talk to the same database: Oracle 10.1.x.x (production).
I am using the MDB2 oci8 driver.
When I run the following code segment on both systems, I get two different results.
I am retrieving a single row from Oracle using queryRow($qry, $types):
// $row = retrieved data as an associative array
// switch/case statement to check clob/blob types
    case "clob":
    case "blob":
        $clob = $row[$fieldname];
        if (!PEAR::isError($clob) && is_resource($clob)) {
            $clob_value = '';
            // read the LOB stream in 8 KB chunks
            while (!feof($clob)) {
                $clob_value .= fread($clob, 8192);
            }
            $this->mdblocal->datatype->destroyLOB($clob);
            $encoding = mb_detect_encoding($clob_value);
            $row[$fieldname] = $clob_value;
            echo "<br />encoding: $encoding value: <br /> $clob_value <br />";
        }
        break;
// end of switch statement
My problem is:
On the Windows system the encoding shows "UTF-8" and $clob_value is correct.
On the Solaris box the encoding shows "ASCII" and $clob_value is an ASCII representation of text that should be UTF-8.
I have checked php.ini on both machines. They are almost identical; both systems have mbstring / iconv set to default values.
So why am I seeing two different values for my query?
Is there a way I can specify that I want a UTF-8 character set before retrieving the data?
Any help will be sincerely appreciated.
Cheers!
kumar..
mark

Joined: 07 Jan 2007 Posts: 1011
Posted: Tue Mar 18, 2008 12:02 pm
You'll need to set the right encoding for your connection to the database. I'm not familiar with Oracle, but Google should be able to tell you how this can be done with Oracle.
lafanga14
Joined: 18 Mar 2008 Posts: 4
Posted: Tue Mar 18, 2008 3:26 pm
I am able to get both ASCII and UTF-8 on my Windows setup.
The character set is detected based on the content of the field.
For example (a sample of retrieved results):
RECORD_ID
encoding=ASCII
value: 9094
SYNOPSIS
encoding=UTF-8
value:
ÐÂÄóÄó»áÂä Each 'fang' had own clan hall. Role of clan temple; its management.
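
Seeing a different detected encoding per field is expected on its own: a string containing only 7-bit characters is valid ASCII and valid UTF-8 at the same time, so mb_detect_encoding() reports ASCII for it. A minimal sketch of that behaviour, assuming the mbstring extension is loaded; the sample strings and variable names are made up for illustration and are not the thread's data:

```php
<?php
// Hypothetical sample values (not taken from the database in the thread).
$plain = "9094";              // digits only: 7-bit characters
$accented = "caf\xC3\xA9";    // "\xC3\xA9" is the UTF-8 byte sequence for "é"

// Candidate encodings, checked in strict mode.
$order = array('ASCII', 'UTF-8');

$plainEncoding = mb_detect_encoding($plain, $order, true);
$accentedEncoding = mb_detect_encoding($accented, $order, true);

// A pure ASCII string also passes a UTF-8 validity check.
$plainIsUtf8 = mb_check_encoding($plain, 'UTF-8');

echo "plain: $plainEncoding, accented: $accentedEncoding\n";
echo "plain is valid UTF-8: " . ($plainIsUtf8 ? 'yes' : 'no') . "\n";
```

So "ASCII" for RECORD_ID is harmless. It is only a warning sign when a field that should contain non-ASCII characters comes back as ASCII, as on the Solaris box.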
mark

Joined: 07 Jan 2007 Posts: 1011
Posted: Tue Mar 18, 2008 3:33 pm
lafanga14 wrote:
I am able to get both ASCII and UTF-8 on my Windows setup.
The character set is detected based on the content of the field.

Well, don't you want to have *one* encoding for every field?
lafanga14
Joined: 18 Mar 2008 Posts: 4
Posted: Tue Mar 18, 2008 3:39 pm
Yes. If I am able to get all fields in UTF-8, that solves my problem as well.
I have been googling for a while but have been unable to find a clue.
mark

Joined: 07 Jan 2007 Posts: 1011
Posted: Tue Mar 18, 2008 3:52 pm
lafanga14 wrote:
Yes. If I am able to get all fields in UTF-8, that solves my problem as well.
I have been googling for a while but have been unable to find a clue.

Okay, a first hint was already in my first answer.
According to http://www.php.net/oci_connect, there is a parameter named 'charset'. MDB2 supports this parameter via the 'charset' key in the DSN string/array. I don't know if simply 'utf-8' is accepted here, but this should be easy to figure out.
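
A rough sketch of what the 'charset' key could look like in an array DSN. The user name, password and host string are placeholders, and 'AL32UTF8' is an assumption: it is Oracle's own name for UTF-8, and oci_connect expects an Oracle character set name for its charset argument rather than 'utf-8'. Whether the installed MDB2 oci8 driver version passes the key through is worth verifying.

```php
<?php
// Placeholder connection details (hypothetical, replace with real values).
$dsn = array(
    'phptype'  => 'oci8',
    'username' => 'scott',
    'password' => 'tiger',
    'hostspec' => 'dbhost/ORCL',
    // Assumed value: Oracle's character set name for UTF-8.
    'charset'  => 'AL32UTF8',
);

// Only attempt the connection when PEAR MDB2 can be found on the include path.
if (@include_once 'MDB2.php') {
    $mdb2 = MDB2::connect($dsn);
    if (PEAR::isError($mdb2)) {
        echo 'Connection failed: ' . $mdb2->getMessage() . "\n";
    }
}
```

With the charset fixed on the connection, every field should arrive in the same encoding regardless of the client machine's defaults.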
lafanga14
Joined: 18 Mar 2008 Posts: 4
Posted: Thu Mar 20, 2008 8:12 am
Hi Mark,
Thanks a lot! I was able to fix my problem by setting the correct NLS_LANG parameter for the Oracle client.
Thanks a lot for pointing me in the right direction.
k..
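
For readers landing here later: the post does not say which NLS_LANG value was used or where it was set. A minimal sketch, assuming a UTF-8 client character set is wanted and that the variable can be set from PHP before the first Oracle connection is opened. The value below is an assumption in the usual language_territory.charset form. On many web server setups the variable has to be set in the server's own environment before PHP starts, so treat putenv() as something to test rather than rely on.

```php
<?php
// Assumed value; the thread does not give the exact setting that was used.
$nlsLang = 'AMERICAN_AMERICA.AL32UTF8';

// Set it for the current process before any Oracle connection is made.
putenv('NLS_LANG=' . $nlsLang);

// Read it back to confirm what this process now has in its environment.
$effective = getenv('NLS_LANG');
echo "NLS_LANG=$effective\n";
```

When NLS_LANG is unset, the Oracle client typically falls back to a 7-bit ASCII character set, which would match the "ASCII" result reported on the Solaris box.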