Using WebLogic Server with Multibyte Environments
Points of Caution for WebLogic Server 8.1 SP6
The following sections describe points of caution for using WebLogic Server 8.1 SP6 in multibyte environments such as Japanese, Chinese, and Korean, together with the known problems and corrected problems as of this release.
Installation on Itanium HP-UX11.23 and IBM AIX environment (CR200086, CR206783)
When using an Asian version of an installer (such as an installer for Japanese, Korean, or Simplified Chinese), you need to increase the maximum heap size to 256 MB by specifying the -Xmx256m argument on the command line. For example, if you are using a Japanese installer, enter the following command:
$ java -Xmx256m -jar pj_platform816_ja_generic.jar -mode=console
UTF-8 Encoding Support with Public Key Certificates (CR090467)
Conforming to RFC3280, WebLogic Server supports UTF-8 encoding with public key certificates. For details of RFC3280, see Internet X.509 Public Key Infrastructure: Certificate and CRL Profile.
Cautions Regarding Use of ISO-2022-JP Encoding for WebLogic Server SP3 Operating on AIX (CR131694)
When WebLogic Server (SP3) runs on AIX with IBM JDK 1.4.2, a patch is required to output Japanese from JSPs in ISO-2022-JP encoding. Contact customer support for information on how to obtain the patch.
Adding Settings for Language of Preference in Administration Console (CR173345)
As of the WebLogic Server 8.1 SP3 release, languages have been added to the language selection options in the Administration Console preferences. The possible selections for SP3 are as follows.
Chinese Simplified/GB18030
Chinese Simplified/GB2312
Chinese Simplified/GBK
Chinese Simplified/UTF-8
Chinese Traditional/Big5
Chinese Traditional/Big5-HKSCS
Chinese Traditional/UTF-8
English
English/UTF-8
Japanese/EUC-JP
Japanese/Shift_JIS
Japanese/UTF-8
Korean/EUC-KR
Korean/UTF-8
Logging into Administration Console for Users Containing Multibyte Characters (CR171053)
As of the WebLogic Server 8.1 SP3 release, users whose user names contain multibyte characters can log in to the Administration Console.
BEA Oracle Driver (Type 4) 'codePageOverride' Property (To be noted only when Japanese language is used)
As of the release of WebLogic Server 8.1 SP3, the BEA Oracle Driver (Type 4) provides a 'codePageOverride' property. The behavior with and without this property is described below.
In the Case of Using codePageOverride Property
For each character set, the Oracle database has a map between Unicode and the code points used in the database. This map is used when characters are stored in or retrieved from the database. For example, with the Oracle Thin driver, the Oracle database server uses the map to convert between Unicode and the database code points.
The WebLogic Type4 Driver for Oracle provides a property called codePageOverride that performs this conversion with the JDK converter map instead. The property can be used only when the character set of the destination database is JA16SJIS, JA16SJISTILDE, or JA16SJISYEN. The possible values of the codePageOverride property and the resulting behavior are as follows:
Using codePageOverride=SJIS: Conversion is assured for the map that matches the JDK converter for SJIS, among all the maps that can be used for the character set of the destination database. Conversion is not assured when the map does not match.
Using codePageOverride=MS932: Conversion is assured for the map that matches the JDK converter for MS932, among all the maps that can be used for the character set of the destination database. Conversion is not assured when the map does not match.
The difference between specifying codePageOverride=SJIS and codePageOverride=MS932 is exactly the difference between the SJIS converter and the MS932 converter. For example, it affects the handling of symbols such as ~ (Wave Dash) and ¢ (Cent Sign), which the two converters map to different Unicode code points. Choose the setting that meets the requirements of the system being built. See Countermeasure for Garbled Characters Caused by Unicode Definition and Java Converter (To be noted only when Japanese language is used), etc.
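The converter difference described above can be observed with the JDK alone. The following is a minimal sketch, assuming a JDK that ships the Shift_JIS (SJIS) and windows-31j (MS932) charsets; the class name ConverterDifference is hypothetical, and the code illustrates only the JDK converters, not the JDBC driver itself.

```java
import java.nio.charset.Charset;

class ConverterDifference {
    // Decode the same Shift_JIS-family bytes with the named JDK converter.
    static String decode(byte[] bytes, String charsetName) {
        return new String(bytes, Charset.forName(charsetName));
    }

    // Format the first character of s as a Unicode code point, e.g. "U+301C".
    static String codePoint(String s) {
        return String.format("U+%04X", s.codePointAt(0));
    }

    public static void main(String[] args) {
        byte[] waveDash = {(byte) 0x81, (byte) 0x60}; // Wave Dash position
        byte[] centSign = {(byte) 0x81, (byte) 0x91}; // Cent Sign position
        for (String cs : new String[] {"Shift_JIS", "windows-31j"}) {
            System.out.println(cs
                    + ": 0x8160 -> " + codePoint(decode(waveDash, cs))
                    + ", 0x8191 -> " + codePoint(decode(centSign, cs)));
        }
    }
}
```

On such a JDK, the SJIS converter is expected to yield U+301C and U+00A2, while the MS932 converter yields U+FF5E and U+FFE0 for the same bytes.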
In the Case of Omitting codePageOverride Property
In WebLogic Server 8.1 SP5 and later, when the codePageOverride property is omitted, characters stored in the database are handled in the same way as with the Oracle Thin Driver, provided that the character set of the destination database is JA16SJIS, JA16SJISTILDE, or JA16SJISYEN. See About codePageOverride Property of BEA WebLogic Type4 JDBC Driver for Oracle for details of the change and notes on upgrading from earlier versions.
Notes on Migration from jDriver for Oracle
If you use jDriver for Oracle with a database that uses the JA16SJIS character set and encounter a garbled ~ (Wave Dash) after migrating to the WebLogic Type4 Oracle Driver, you can solve the problem by changing the database character set to JA16SJISTILDE or by specifying codePageOverride=MS932.
Notes on Avitek Medical Record Development Tutorial (only for Japanese version installer)
The Japanese sample code installed under %WL_HOME%\samples\server\medrec\src_ja is the same as in WebLogic Server 8.1 SP4. If you need the latest version of the sample code, use the code under %WL_HOME%\samples\server\medrec\src.
Problems Involving the Use of ISO-2022-JP Encoding for Sun JDK 1.4.2_04
A java.nio.BufferOverflowException can occur when ISO-2022-JP encoding is used with Sun JDK 1.4.2_04 or JRockit 8.1 SP3, the JVMs supported with SP3. This is a Sun JDK bug, reported at http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=5017922. If ISO-2022-JP encoding is to be used, use Sun JDK 1.4.2_05 or a compatible version of the JRockit VM, or use SP4, which supports them.
(Supplement) Specifically, the exception occurs when the following code is executed.
byte[] b = "[1] ".getBytes("ISO-2022-JP");
Input Encoding Specification for Form-Based Authentication (CR123333) (Not J2EE-compliant)
Since the release of SP2, the input encoding for form-based authentication can be specified within the form. Specify the encoding name as the value of j_character_encoding, as shown below. This function is not part of the J2EE specification; it is a WebLogic Server extension.
<form method="POST" action="j_security_check">
Username: <input type="text" name="j_username">
Password: <input type="password" name="j_password">
<input type="hidden" name="j_character_encoding" value="Shift_JIS">
<input type="submit" value="Login">
<input type="reset" value="Reset">
</form>
XML Encoding Attached with SOAP Messages
Any encoding can be specified in the XML header when attaching XML (javax.xml.transform.Source objects) to Web services SOAP messages. However, when such a message is received and the attached XML data is retrieved, the encoding in the header becomes UTF-8.
Locales for Installing to UNIX
When using the Japanese-version installer, switch the locale of the command shell that starts the installer to a Shift_JIS locale. If a locale other than Shift_JIS (EUC-JP, UTF-8, C, etc.) is used, the Japanese text files included in the sample code are not installed properly. When using the Korean, Simplified Chinese, or Traditional Chinese version of the installer, switch the locale of the command shell that starts the installer to an appropriate locale, other than UTF-8, for the language used.
When installing, start up the installer under the following locale environments.
For Japanese:
Solaris : ja_JP.PCK
HP : ja_JP.SJIS
Linux : ja_JP.SJIS
Example:
$ setenv LANG ja_JP.SJIS
If an SJIS locale cannot be used in your Linux environment (that is, ja_JP.SJIS is not listed when you run locale -a), prepare the locale through the following procedure.
# su
# localedef -f SHIFT_JIS -i ja_JP ja_JP.SJIS
Changed Alias for 'Shift_JIS' Encoding in JDK 1.4.1 and Later Versions
From JDK 1.4.1, 'Shift_JIS' is handled as the 'SJIS' encoding.
WebLogic Server 8.1 SP1 and later service packs use JDK 1.4.1 (or later versions) and are therefore affected by the Shift_JIS alias change. In the JDK (JDK 1.3) used through WebLogic Server 7.0, the Java encoding name 'Shift_JIS' was an alias for 'MS932'.
In the IANA-Java mapping in WebLogic Server, the IANA charset name 'Shift_JIS' is handled as the Java encoding name Shift_JIS. Therefore, when Shift_JIS is used in JSPs, Servlets, or Web services, the behavior differs from that of earlier releases. For example, characters specific to MS932 ('@', etc.) become '?' characters. To continue using MS932 as before, use the IANA name 'Windows-31J'. Use either Method 1 or Method 2 below to use MS932.
Method 1 --- Rewriting the program file of the JSP/Servlet, etc.
Change the charset name specified in the JSP or Servlet from 'Shift_JIS' to 'Windows-31J'. 'Windows-31J' is the character set name officially registered with IANA and corresponds to Microsoft code page 932. In Java, MS932 is the encoding name for Microsoft code page 932, and 'Windows-31J' also exists as an alias for MS932. Java encoding names appear to be moving toward consistency with IANA names, so it is strongly recommended that 'Windows-31J' be used from now on for the character set corresponding to Microsoft code page 932.
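The effect of the charset name on MS932-specific characters can be checked with a short program. This is a sketch under stated assumptions: it requires a JDK that provides both charsets, the class name Cp932Check is hypothetical, and the test character U+2460 (CIRCLED DIGIT ONE) is chosen here as an example of a character that exists in Microsoft code page 932 but not in JIS X 0208.

```java
import java.nio.charset.Charset;

class Cp932Check {
    // Encode text with the named charset and return the resulting bytes as hex.
    // String.getBytes(Charset) replaces unmappable characters with the
    // charset's replacement bytes instead of throwing an exception.
    static String encodeHex(String text, String charsetName) {
        StringBuilder hex = new StringBuilder();
        for (byte b : text.getBytes(Charset.forName(charsetName))) {
            hex.append(String.format("%02X", b & 0xFF));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        String circledOne = "\u2460"; // assumed example of an MS932-specific character
        System.out.println("Shift_JIS   : " + encodeHex(circledOne, "Shift_JIS"));
        System.out.println("Windows-31J : " + encodeHex(circledOne, "Windows-31J"));
    }
}
```

With Shift_JIS the character is expected to come out as 3F (the '?' character), whereas Windows-31J produces the two-byte code 8740.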
Method 2 --- Changing the mapping in weblogic.xml (not J2EE compliant)
The IANA name Shift_JIS can be forced to map to the Java name Windows-31J in the weblogic.xml deployment descriptor file. With this method, the application operates with Windows-31J without any rewriting of JSP or Servlet code. Include the following entry in weblogic.xml, and re-deploy the web application.
<!DOCTYPE weblogic-web-app PUBLIC
  "-//BEA Systems, Inc.//DTD Web Application 8.1//EN"
  "http://www.bea.com/servers/wls810/dtd/weblogic810-web-jar.dtd">
<weblogic-web-app>
  <charset-params>
    <charset-mapping>
      <iana-charset-name>Shift_JIS</iana-charset-name>
      <java-charset-name>Windows-31J</java-charset-name>
    </charset-mapping>
  </charset-params>
</weblogic-web-app>
This method is, however, proprietary to WebLogic Server and is not J2EE-compliant. In other words, it is not interoperable with other J2EE Servlet containers. The IANA name 'Shift_JIS' denotes a character set equal to JIS X 0201 + JIS X 0208, so it is not appropriate to treat it as Microsoft code page 932. Use this method only when it is difficult for some reason to correct the JSP or Servlet code.
In WebLogic Server 8.1, a Global IANA-Java Charset Map can be used. Until now, a number of components each held their own mapping between IANA charset names and Java encoding names. Collecting these mappings in one place keeps the mapping between IANA names and Java names consistent across components.
Compliance with SOAP 1.2 Media Types
The media type used for SOAP 1.2 messages over HTTP is 'application/soap+xml'. Its handling is generally the same as that of 'text/xml', the media type used for SOAP 1.1, but differs when no charset is specified in the contentType of the HTTP header.
[HTTP, no contentType charset specified]
text/xml (SOAP 1.1): The default character set is us-ascii. The encoding specified in the XML header is ignored. In WebLogic Server 8.1, the encoding of SOAP 1.1 messages is handled in compliance with RFC2376.
application/soap+xml (SOAP 1.2): The encoding specified in the XML header is valid. UTF-8 is the default character set when no encoding is specified in the XML header. In WebLogic Server 8.1, the encoding of SOAP 1.2 messages is handled in compliance with RFC3023.
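The rules above can be summarized as a small decision function. This is only an illustrative sketch, not WebLogic Server code: the class and method names are hypothetical, and it assumes that a charset given explicitly in the HTTP Content-Type takes precedence for both media types.

```java
class SoapCharsetRule {
    /**
     * Returns the character set that applies to a SOAP message body.
     *
     * @param mediaType          "text/xml" (SOAP 1.1) or "application/soap+xml" (SOAP 1.2)
     * @param contentTypeCharset charset parameter of the HTTP Content-Type, or null if absent
     * @param xmlDeclEncoding    encoding from the XML header, or null if absent
     */
    static String effectiveCharset(String mediaType, String contentTypeCharset,
                                   String xmlDeclEncoding) {
        if (contentTypeCharset != null) {
            // Assumption: an explicit HTTP charset wins for both media types.
            return contentTypeCharset;
        }
        if ("text/xml".equalsIgnoreCase(mediaType)) {
            // SOAP 1.1: the XML header encoding is ignored.
            return "us-ascii";
        }
        if ("application/soap+xml".equalsIgnoreCase(mediaType)) {
            // SOAP 1.2: the XML header encoding is honored, UTF-8 otherwise.
            return xmlDeclEncoding != null ? xmlDeclEncoding : "UTF-8";
        }
        throw new IllegalArgumentException("Unsupported media type: " + mediaType);
    }

    public static void main(String[] args) {
        System.out.println(effectiveCharset("text/xml", null, "Shift_JIS"));
        System.out.println(effectiveCharset("application/soap+xml", null, "Shift_JIS"));
        System.out.println(effectiveCharset("application/soap+xml", null, null));
    }
}
```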
Encoding Specification for SOAP Messages Generated by WebLogic Server
When WebLogic Server generates SOAP messages, the default character encoding is UTF-8. In WebLogic Server 8.1, the encoding of generated SOAP messages can be specified to suit the environment in use.
There are three ways to specify the encoding: forcing the encoding of the messages that the Web service generates; having the client specify the 'Accept-Charset' parameter in the HTTP header of its request, so that the message is generated in the requested encoding; and specifying a default when the server starts up.
Methods for specifying the encoding with the generated SOAP message
Note: The server default is UTF-8
For details, refer to the internationalization section of the Web services documentation.
http://e-docs.bea.com/wls/docs81/webserv/i18n.html
Using Multibyte Characters in URLs (CR092089)
Servlet containers (web containers) can now handle multibyte characters in URLs. Configure the server as necessary for the User agent in use (web browser, etc.).
For example, if the following type of HTTP request is received,
http://myHostName:port/myContextPath/myRequest/?myRequestParameter
the myContextPath and myRequest portions are handled as described below.
Note: The myRequestParameter portion comes after the URL and is decoded using the encoding specified with the Servlet's setCharacterEncoding() or with input-charset in weblogic.xml. For the myHostName portion, an internationalized domain name that meets the IESG standards is recommended.
Default Operations (URL Decoding as UTF-8-Based Encoding)
If nothing is set, WebLogic Server 8.1 handles requests as follows.
For example, if the User agent (web browser) is MS IE (Microsoft Internet Explorer), by default the multibyte characters entered into the address bar are first encoded as UTF-8 and the result is then URL encoded. By default, WebLogic Server 8.1 correctly decodes a URL sent in this way as UTF-8.
Note: In IE's "Internet Options", on the "Advanced" tab, there is an option called "Always send URLs as UTF-8 (requires restart)", and this option must be ON (checked).
Method for Specifying Character Encoding when Decoding URLs
When the User agent is Netscape
When the User agent is Netscape, the characters in the address bar are encoded with the character set of the environment in which Netscape is running, and the resulting byte string is then URL encoded and sent to the server. For example, on Japanese Windows, a character string encoded as Windows-31J is URL encoded. WebLogic Server 8.1 can receive this request properly if it is set to decode the byte stream as Windows-31J after URL decoding. The encoding used for URL decoding can be changed with the following WebLogic Server startup option.
-Dweblogic.http.URIDecodeEncoding=Windows-31J (The default is UTF-8)
However, only one such setting is possible per server instance.
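Why the decoding charset must match the one used by the User agent can be seen with the standard java.net classes. The sketch below is not WebLogic Server code; the class name UriDecodeDemo is hypothetical, the sample text is an assumed example, and a JDK that provides the Windows-31J charset is assumed.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class UriDecodeDemo {
    // Percent-encode text after converting it to bytes in the given encoding,
    // as a browser using that encoding would do.
    static String encode(String text, String encoding) {
        try {
            return URLEncoder.encode(text, encoding);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Percent-decode and interpret the resulting bytes in the given encoding,
    // as the server does with its configured URI decoding charset.
    static String decode(String encoded, String encoding) {
        try {
            return URLDecoder.decode(encoded, encoding);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String original = "\u65E5\u672C"; // two kanji, used as sample path text
        String sent = encode(original, "Windows-31J");
        System.out.println("sent: " + sent);
        System.out.println("decoded as Windows-31J matches: "
                + decode(sent, "Windows-31J").equals(original));
        System.out.println("decoded as UTF-8 matches: "
                + decode(sent, "UTF-8").equals(original));
    }
}
```

Decoding with the charset that the sender used restores the original string; decoding the same bytes as UTF-8 does not.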
When Using a Proprietary User Agent
When multibyte characters are needed in the request URI, convert the character string to a UTF-8 byte string, URL encode it, and then send it to WebLogic Server.
The W3C recommends that URIs be created by URL encoding on a UTF-8 basis (http://www.w3.org/TR/charmod/#sec-URIs).
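For a custom User agent written in Java, one way to obtain a UTF-8 based, URL-encoded request URI is the standard java.net.URI class. This is a minimal sketch; the class name, host, port, and path are hypothetical placeholders.

```java
import java.net.URI;
import java.net.URISyntaxException;

class Utf8RequestUri {
    // Build an http URI and return it in US-ASCII form. URI.toASCIIString()
    // percent-encodes non-ASCII characters using their UTF-8 bytes.
    static String toRequestUri(String host, int port, String path) {
        try {
            return new URI("http", null, host, port, path, null, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // The path contains two kanji written as Unicode escapes.
        System.out.println(toRequestUri("localhost", 7001, "/myContextPath/\u65E5\u672C"));
    }
}
```

A URI built this way matches the default UTF-8 URL decoding of the server, so no startup option is needed.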
Operational Changes according to the JSP J2EE Specification (Since WebLogic Server 7.0)
In the JSP 1.2 specification, a 'fatal translation error' results when an attribute of the page directive is defined more than once. (CR066562)
The page directive defines a number of page dependent properties and communicates these to the JSP container.
A translation unit (JSP source file and any files included via the include directive) can contain more than one instance of the page directive, all the attributes will apply to the complete translation unit (i.e. page directives are position independent). However, there shall be only one occurrence of any attribute/value defined by this directive in a given translation unit with the exception of the import attribute; multiple uses of this attribute are cumulative (with ordered set union semantics). Other such multiple attribute/value (re)definitions result in a fatal translation error.
This is to say that a fatal translation error results if the same page directive attribute (other than import) is defined more than once in one translation unit. In WebLogic Server 7.0, JSP behavior was changed to comply with this specification.
This change causes a problem when a separate JSP file is included with a static include (<%@ include file=. %>) and both the including file and the included file have page directives that define the same attribute. For a static include, the JSP container evaluates the including file and all included files as one translation unit, so the attribute is defined more than once and a 'fatal translation error' occurs.
To avoid this problem, WebLogic Server 7.0 provides the following option as a new weblogic.xml parameter.
Code List 1-2: New Parameters Added to weblogic.xml
<jsp-param>
<param-name>backwardCompatible</param-name>
<param-value>true</param-value>
</jsp-param>
With this setting, no error occurs even if page directives occur multiple times in one translation unit, as long as the encoding is the same.
J2EE Default Encoding Specification (Since WebLogic Server 7.0)
With weblogic-application.xml, the default character encoding for requests and responses can be specified for the entire J2EE enterprise application. (CR065921)
The default encoding used for requests and responses can be set by specifying either of the following parameters, webapp.encoding.default or webapp.encoding.usevmdefault, in weblogic-application.xml.
Note: The value specified with webapp.encoding.default is the Java encoding name, not the IANA character set name.
If the above two options are both set, webapp.encoding.usevmdefault is used.
The request and response values can also be set individually. These options apply only to requests and responses; they do not apply to the encoding used to read JSP files during JSP compilation. For details on how to set the request and response encodings individually, as well as the JSP file encoding, see programming.
Code List 1-1: Usage Example of webapp.encoding.usevmdefault (weblogic-application.xml)
<application-param>
<description>webapp.usevmdefault</description>
<param-name>webapp.encoding.usevmdefault</param-name>
<param-value>true</param-value>
</application-param>
Code List 1-4: Usage Example of webapp.encoding.default (weblogic-application.xml)
<application-param>
<description>default encoding</description>
<param-name>webapp.encoding.default</param-name>
<param-value>SJIS</param-value>
</application-param>
Encoding Specifications for WTC TUXEDO Domains (CR052022) (Since WebLogic Server 7.0)
The encoding used by WTC (WebLogic Tuxedo Connector) can be specified for TUXEDO domains. Specify the following parameter at startup by changing the server's start script (StartWebLogic.cmd file, etc.).
-Dweblogic.wtc.encoding=Java encoding name
This encoding specification is valid for the entire TUXEDO domain.
XML -- StreamParser's Multibyte Character Handling (Since WebLogic Server 7.0)
To add encoding information to the XML header generated with the XML Streaming API, use the createStartDocument() method of the ElementFactory class as shown below.
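import java.io.FileOutputStream;
import java.io.OutputStreamWriter;
import weblogic.xml.stream.*;
// fname is the path of the output file.
XMLOutputStreamFactory factory = XMLOutputStreamFactory.newInstance();
// The writer encoding must match the encoding declared in the XML header.
XMLOutputStream output = factory.newOutputStream(
    new OutputStreamWriter(new FileOutputStream(fname), "Shift_JIS"));
output.add(ElementFactory.createStartDocument("Shift_JIS", "1.0"));
output.flush();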
As with the Xerces parser, note the following point when parsing XML documents containing Japanese with the XML Streaming API.
Passing the input as a byte stream enables the parser's automatic XML encoding detection. The document is then parsed properly because the parser internally creates a stream appropriate for the encoding specified in the XML header.
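The same principle can be demonstrated with the standard JAXP parser in the JDK, used here in place of the WebLogic XML Streaming API so that the example is self-contained. This is a sketch under that assumption; the class name ByteStreamParse and the sample document are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.Charset;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class ByteStreamParse {
    // Parse XML supplied as raw bytes. Because the parser receives a byte
    // stream, it reads the encoding from the XML header itself.
    static String parseRootText(byte[] xmlBytes) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xmlBytes));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String text = "\u65E5\u672C\u8A9E"; // sample Japanese text
        String xml = "<?xml version=\"1.0\" encoding=\"Shift_JIS\"?><doc>" + text + "</doc>";
        // Store the document in the encoding that its header declares.
        byte[] bytes = xml.getBytes(Charset.forName("Shift_JIS"));
        System.out.println(parseRootText(bytes).equals(text));
    }
}
```

If the same bytes were first wrapped in a Reader with a different charset, the parser would receive already-decoded characters and could not correct the mismatch.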
Handling Deployment Descriptor File Encoding
When a deployment descriptor file is edited and saved from WebLogic Builder or the Administration Console, the encoding of the original deployment descriptor is preserved. If there is no encoding attribute in the XML declaration of the deployment descriptor file, the file is handled as UTF-8.
Confirmed Problems Involving WebLogic Server
Confirmed Problems Involving Administration Console
Confirmed Problems Involving the Configuration Wizard
Avoidance Measure: Correct the SERVER_NAME value in the start script to the proper value, and then re-save the script in the encoding that matches the locale in which the server is launched.
Confirmed Problems Involving Installation
Confirmed Problems Involving JDBC
Avoidance Measure: Unify the Oracle server-side character set and the client-side character set, or change to Oracle Database Server version 9.2.0.
Confirmed Problems Involving WebLogic Tools
Avoidance Measure: Use an English name for the application module.
This occurs when deploying a CMP which specified a container field's definition.
Existing files cannot be updated properly.
Confirmed Problems Involving Web Services
Corrected Problems Involving WebLogic Server
Corrected Problems Involving Administration Console
Corrected Problems Involving Web Service
Corrected Problems Involving JRockit