Package oracle.sql

Class CharacterSet

  • Direct Known Subclasses:
    CharacterSetWithConverter

    public abstract class CharacterSet
    extends Object
    This class encapsulates the methods and attributes of the character sets defined by Oracle. It also defines constants for the character set IDs whose conversions are supported by Oracle JDBC.

    Most methods convert between character representations.

    There are no public constructors. To create a CharacterSet, use oracle.sql.CharacterSetFactory. There is no notion of an "unsupported" character set: a CharacterSet can be created with any oracleId. However, there is a notion of unsupported conversions, and the current implementation is limited to the small number of character sets for which constants are defined in this class.

    Some operations come in two variants (e.g. convert vs. convertUnshared); the plain version is the fast (but possibly unsafe) one.

    The descriptions of methods in this class use the phrase "bytes in oracleId representation". What this means is that the bytes can be interpreted as a sequence of characters in the character set defined by oracleId. Both what characters are available and how they are represented as sequences of bytes is determined by oracleId.
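    The idea of "bytes in some representation" can be illustrated with the JDK alone. The following is a minimal sketch that uses java.nio.charset.Charset in place of an oracleId; the class and method names are hypothetical and nothing here calls the Oracle driver. It shows that one and the same character yields different byte sequences depending on the character set.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class RepresentationDemo {
    // Encodes text in the given charset and renders the resulting bytes as hex.
    static String hex(String text, Charset cs) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(cs)) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String eAcute = "\u00E9"; // LATIN SMALL LETTER E WITH ACUTE
        System.out.println(hex(eAcute, StandardCharsets.ISO_8859_1)); // E9
        System.out.println(hex(eAcute, StandardCharsets.UTF_8));      // C3A9
        System.out.println(hex(eAcute, StandardCharsets.UTF_16BE));   // 00E9
    }
}
```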

    • Method Detail

      • make

        public static CharacterSet make​(int oracleId)
        Factory. A factory is used rather than a constructor because CharacterSet is abstract.
        Parameters:
        oracleId - the number of the Oracle character set. A list of official Oracle character sets is maintained by ...
        Returns:
        CharacterSet for oracleId.
      • toString

        public String toString()
        The official name of the character set.
        Overrides:
        toString in class Object
        Returns:
        the name of the character set
      • isLossyFrom

        public abstract boolean isLossyFrom​(CharacterSet from)
        A conversion loses information if the mapping is not invertible. (A mathematician would say that the map of characters from the from character set to this one is not injective.)
        Parameters:
        from - a CharacterSet being tested for compatibility with this CharacterSet.
        Returns:
        true if characters in the from character set can be mapped uniquely to characters in oracleId representation.
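        What "not injective" means in practice can be sketched with JDK charsets. This is an illustration only, with hypothetical helper names; it does not use or reproduce isLossyFrom. It relies on String.getBytes(Charset) substituting the charset's replacement byte for unmappable characters, so two distinct inputs can collapse to the same output.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class LossyDemo {
    // True when encoding text in cs and decoding it again gives back the same text.
    static boolean roundTrips(String text, Charset cs) {
        return new String(text.getBytes(cs), cs).equals(text);
    }

    // True when two different strings encode to identical bytes in cs,
    // i.e. the mapping into cs is not injective on these inputs.
    static boolean collide(String a, String b, Charset cs) {
        return !a.equals(b) && Arrays.equals(a.getBytes(cs), b.getBytes(cs));
    }

    public static void main(String[] args) {
        System.out.println(roundTrips("caf\u00E9", StandardCharsets.UTF_8));        // true
        System.out.println(roundTrips("caf\u00E9", StandardCharsets.US_ASCII));     // false
        System.out.println(collide("\u00E9", "\u00E8", StandardCharsets.US_ASCII)); // true
    }
}
```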
      • isConvertibleFrom

        public abstract boolean isConvertibleFrom​(CharacterSet source)
        Reports whether conversions are supported.
        Parameters:
        source - a CharacterSet to inquire about
        Returns:
        true if conversion from source to oracleId is supported. If it is not supported, attempts to convert will always throw exceptions.
      • isUnicode

        public boolean isUnicode()
        Reports whether this is a Unicode character set.
        Returns:
        true if this CharacterSet is an encoding of Unicode
      • getOracleId

        public int getOracleId()
        The integer that identifies the character set.
        Returns:
        Oracle character set ID
      • equals

        public boolean equals​(Object rhs)
        Two CharacterSets are equal when their oracleIds are equal.
        Overrides:
        equals in class Object
        Parameters:
        rhs - the target character set
        Returns:
        true if the given CharacterSet object is equal to this object
      • hashCode

        public int hashCode()
        Implements a hash based on oracleId
        Overrides:
        hashCode in class Object
        Returns:
        a hash code
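        The equals/hashCode contract described above (identity determined solely by the numeric ID) can be sketched as follows. IdKeyedCharset is a hypothetical stand-in class, and the ID and labels are arbitrary values chosen for the example.

```java
import java.util.HashSet;
import java.util.Set;

final class IdKeyedCharset {
    private final int id;       // plays the role of oracleId
    private final String label; // deliberately ignored by equals and hashCode

    IdKeyedCharset(int id, String label) {
        this.id = id;
        this.label = label;
    }

    @Override
    public boolean equals(Object rhs) {
        return rhs instanceof IdKeyedCharset && ((IdKeyedCharset) rhs).id == id;
    }

    @Override
    public int hashCode() {
        return Integer.hashCode(id);
    }

    @Override
    public String toString() {
        return label + "(" + id + ")";
    }

    public static void main(String[] args) {
        Set<IdKeyedCharset> set = new HashSet<>();
        set.add(new IdKeyedCharset(42, "first"));
        set.add(new IdKeyedCharset(42, "second")); // same id, so treated as a duplicate
        System.out.println(set.size()); // 1
    }
}
```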
      • toStringWithReplacement

        public abstract String toStringWithReplacement​(byte[] bytes,
                                                       int offset,
                                                       int count)
        Convert bytes in oracleId representation to a String. If a character has no Unicode representation the effect is unspecified. The conversion might omit it, or replace it with a special character. The preferred result is replacement by a single character, but it is not guaranteed. If the conversion isn't supported at all, the result may be a fixed string.
        Parameters:
        bytes - an array containing characters represented in this character set.
        offset - the index of the first byte of the characters
        count - the number of bytes to be converted.
        Returns:
        the String resulting from converting to UCS-2.
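        The difference between decoding with replacement and decoding that fails on bad input can be sketched with the JDK. This is not the Oracle implementation; the class and method names are hypothetical, and UTF-8 stands in for the oracleId representation. new String(byte[], int, int, Charset) always replaces malformed input, while a CharsetDecoder configured with CodingErrorAction.REPORT throws instead.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class DecodeReplacementDemo {
    // Lenient decoding: malformed input becomes U+FFFD, no exception is thrown.
    static String decodeWithReplacement(byte[] bytes, int offset, int count) {
        return new String(bytes, offset, count, StandardCharsets.UTF_8);
    }

    // Strict decoding: malformed or unmappable input raises an exception.
    static String decodeStrict(byte[] bytes, int offset, int count)
            throws CharacterCodingException {
        return StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes, offset, count))
                .toString();
    }

    public static void main(String[] args) {
        byte[] bad = {'a', (byte) 0xFF, 'b'}; // 0xFF can never occur in UTF-8
        System.out.println(decodeWithReplacement(bad, 0, bad.length).length()); // 3
        try {
            decodeStrict(bad, 0, bad.length);
        } catch (CharacterCodingException e) {
            System.out.println("strict decoding rejected the input");
        }
    }
}
```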
      • toString

        public String toString​(byte[] bytes,
                               int offset,
                               int count)
                        throws SQLException
        Convert bytes in oracleId representation to a String. The difference between toStringInvertible and plain toString is that toStringInvertible will throw an exception when toString would make some replacement.
        Parameters:
        bytes - an array containing characters represented in this character set.
        offset - the index of the first byte of the characters
        count - the number of bytes to be converted.
        Returns:
        the String resulting from converting to UCS-2.
        Throws:
        SQLException - when conversion is not supported.
      • convert

        public abstract byte[] convert​(String s)
                                throws SQLException
        Convert a String to bytes in oracleId representation.
        Returns:
        an array containing the sequence of bytes in oracleId representation that represent the sequence of Unicode characters in String.
        Throws:
        SQLException - when the oracleId does not support conversion from Unicode.
        SQLException - when s contains a character that cannot be converted.
      • convertWithReplacement

        public abstract byte[] convertWithReplacement​(String s)
        Convert a String to bytes in oracleId representation. A byte array is always produced, even when the conversion isn't supported or the String contains characters that do not have a representation in oracleId. The usual conversion is to replace characters that don't have a representation with some fixed character, but that is not guaranteed.
        Returns:
        an array containing the sequence of bytes in oracleId representation that represent the sequence of Unicode characters in String.
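        The contrast between a strict String-to-bytes conversion and one that substitutes a replacement can be sketched with JDK charsets. The helper names are hypothetical and US-ASCII stands in for the target character set; the replacement byte shown ('?') is the JDK's choice for that charset, not a statement about what Oracle uses.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class EncodeReplacementDemo {
    // Lenient: unmappable characters are replaced, a result is always produced.
    static byte[] encodeWithReplacement(String s, Charset cs) {
        return s.getBytes(cs);
    }

    // Strict: an unmappable character raises an exception instead.
    static byte[] encodeStrict(String s, Charset cs) throws CharacterCodingException {
        ByteBuffer out = cs.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .encode(CharBuffer.wrap(s));
        byte[] result = new byte[out.remaining()];
        out.get(result);
        return result;
    }

    public static void main(String[] args) {
        String euro = "a\u20ACb"; // the euro sign has no US-ASCII representation
        byte[] lenient = encodeWithReplacement(euro, StandardCharsets.US_ASCII);
        System.out.println((char) lenient[1]); // ?
        try {
            encodeStrict(euro, StandardCharsets.US_ASCII);
        } catch (CharacterCodingException e) {
            System.out.println("strict encoding rejected the input");
        }
    }
}
```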
      • convertWithReplacement

        public byte[] convertWithReplacement​(char[] chars,
                                             int charOffset,
                                             byte[] bytes,
                                             int byteOffset,
                                             int[] nchars)
        Similar to convertWithReplacement(String s), but instead of a String it converts the contents of a char[], starting at charOffset, with the length stored in nchars[0].
        Returns:
        an array containing the sequence of bytes in oracleId representation that represent the sequence of Unicode characters in the char[]. nchars[0] holds the length in bytes.
      • convert

        public abstract byte[] convert​(CharacterSet from,
                                       byte[] source,
                                       int offset,
                                       int count)
                                throws SQLException
        Converts bytes in some representation to oracleId representation. Note that the input is not guaranteed to be different from the output. If a copy is always wanted then use convertUnshared.
        Parameters:
        from - the character set of the input bytes
        source - an array of bytes containing the bytes to be converted
        offset - the index of the first byte to be converted
        count - the number of bytes to be converted
        Returns:
        a byte array in the Oracle character set
        Throws:
        SQLException - if the conversion is not supported
        SQLException - if some character cannot be converted. This exception is not guaranteed to be thrown. For some conversions a replacement character may be used instead.
      • convertUnshared

        public byte[] convertUnshared​(CharacterSet from,
                                      byte[] source,
                                      int offset,
                                      int count)
                               throws SQLException
        Converts bytes in some representation to oracleId representation. This is identical to convert except that it always returns a copy of its input.
        Parameters:
        from - the character set of the input bytes
        source - an array of bytes containing the bytes to be converted
        offset - the index of the first byte to be converted
        count - the number of bytes to be converted
        Returns:
        an array containing a representation as an oracleId of characters in the source.
        Throws:
        SQLException - if the conversion is not supported.
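        Why the plain variant is "possibly unsafe" can be sketched as follows. This is a hypothetical model of the aliasing hazard, not the driver's code: the fast path is assumed to hand back the caller's own array when no conversion is needed, and the unshared variant clones it in that case. JDK Charset objects stand in for CharacterSet.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class SharingDemo {
    // Hypothetical fast variant: if no conversion is needed and the whole array
    // is requested, the caller's own array is returned without copying.
    static byte[] convert(Charset from, Charset to, byte[] source, int offset, int count) {
        if (from.equals(to) && offset == 0 && count == source.length) {
            return source;
        }
        return new String(source, offset, count, from).getBytes(to);
    }

    // Safe variant: same bytes, but the result never aliases the caller's array.
    static byte[] convertUnshared(Charset from, Charset to, byte[] source, int offset, int count) {
        byte[] result = convert(from, to, source, offset, count);
        return result == source ? result.clone() : result;
    }

    public static void main(String[] args) {
        Charset ascii = StandardCharsets.US_ASCII;
        byte[] input = "abc".getBytes(ascii);
        byte[] shared = convert(ascii, ascii, input, 0, input.length);
        byte[] copy = convertUnshared(ascii, ascii, input, 0, input.length);
        input[0] = 'X'; // mutate the caller's array afterwards
        System.out.println((char) shared[0]); // X, because shared aliases input
        System.out.println((char) copy[0]);   // a, because copy is independent
    }
}
```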
      • UTFToString

        public static final String UTFToString​(byte[] bytes,
                                               int offset,
                                               int nbytes,
                                               boolean useReplacementChar)
                                        throws SQLException
        Converts a sequence of bytes in UTF8 to a String. This function allocates the chars array.
        Parameters:
        bytes - containing the UTF8 string
        nbytes - the number of bytes
        useReplacementChar - if true invalid characters are replaced by replacement characters.
        Returns:
        the converted String
        Throws:
        SQLException
      • UTFToString

        public static final String UTFToString​(byte[] bytes,
                                               int offset,
                                               int nbytes)
                                        throws SQLException
        Converts a sequence of bytes in UTF8 to a String. This function allocates the chars array.
        Parameters:
        bytes - containing the UTF8 string
        nbytes - the number of bytes
        Returns:
        the converted String
        Throws:
        SQLException
      • UTFToJavaChar

        public static final char[] UTFToJavaChar​(byte[] bytes,
                                                 int offset,
                                                 int count)
                                          throws SQLException
        Convert a sequence of bytes in UTF8 to an array of chars. This differs from the official UTF-8 in that it does not support surrogate characters. To support surrogates, use AL32UTF8. The primary use of this code is to create a string. Note that this method is "true" UTF8. That is, in the input, nulls may appear encoding themselves.
        Parameters:
        bytes - the array holding the UTF8 bytes
        offset - the index of the first byte
        count - the number of bytes in the UTF8 sequence.
        Returns:
        an array of char's equivalent to the UTF8 sequence.
        Throws:
        SQLException - if any error occurs
      • UTFToJavaChar

        public static final char[] UTFToJavaChar​(byte[] bytes,
                                                 int offset,
                                                 int count,
                                                 boolean useReplacementChar)
                                          throws SQLException
        Convert a sequence of bytes in UTF8 to an array of chars. This differs from the official UTF-8 in that it does not support surrogate characters. To support surrogates, use AL32UTF8. The primary use of this code is to create a string. Note that this method is "true" UTF8. That is, in the input, nulls may appear encoding themselves.
        Parameters:
        bytes - the array holding the UTF8 bytes
        offset - the index of the first byte
        count - the number of bytes in the UTF8 sequence.
        useReplacementChar - if true invalid characters are replaced by replacement characters.
        Returns:
        an array of char's equivalent to the UTF8 sequence.
        Throws:
        SQLException - if any error occurs
      • UTFToJavaCharWithReplacement

        public static final char[] UTFToJavaCharWithReplacement​(byte[] bytes,
                                                                int offset,
                                                                int count)
        Convert a sequence of bytes in UTF8 to an array of chars. This differs from the official UTF-8 in that it does not support surrogate characters. To support surrogates, use AL32UTF8. The primary use of this code is to create a string. Note that this method is "true" UTF8. That is, in the input, nulls may appear encoding themselves.
        Parameters:
        bytes - the array holding the UTF8 bytes
        offset - the index of the first byte
        count - the number of bytes in the UTF8 sequence.
        Returns:
        an array of char's equivalent to the UTF8 sequence.
        Throws:
        IllegalStateException - if any error occurs
      • convertUTFBytesToJavaChars

        public static final int convertUTFBytesToJavaChars​(byte[] bytes,
                                                           int offset,
                                                           char[] chars,
                                                           int chars_offset,
                                                           int[] countArr,
                                                           boolean convertWithReplacement)
                                                    throws SQLException
        Convert a sequence of bytes in UTF8 to an array of chars. This differs from the official UTF-8 in that the maximum length of a character is 3 bytes, so a surrogate pair is represented as 6 bytes (2 times 3 bytes). To support surrogate pairs as 4 bytes, use AL32UTF8. Note that this method is "true" UTF8. That is, in the input, nulls may appear encoding themselves.
        Parameters:
        bytes - the array holding the UTF8 bytes
        offset - the index of the first byte
        chars - the array that receives the UTF-16 chars
        chars_offset - the index of the first char that will be written
        countArr - IN/OUT parameter. countArr[0](IN) contains the number of bytes in the UTF8 sequence that need to be converted.
        convertWithReplacement - set to true to use replacement character for illegal sequences
        Returns:
        the number of chars written. countArr[0](OUT) contains the number of bytes in the bytes[] array that have been ignored because the rest of the sequence is missing (can be up to 3) or because the char[] was too short (can be more than 3).
        Throws:
        SQLException - if invalid, illegal UTF data is given
      • convertUTFBytesToJavaChars

        public static final int convertUTFBytesToJavaChars​(byte[] bytes,
                                                           int offset,
                                                           char[] chars,
                                                           int chars_offset,
                                                           int[] countArr,
                                                           boolean convertWithReplacement,
                                                           int charSize)
                                                    throws SQLException
        Convert a sequence of bytes in UTF8 to an array of chars. This differs from the official UTF-8 in that the maximum length of a character is 3 bytes, so a surrogate pair is represented as 6 bytes (2 times 3 bytes). To support surrogate pairs as 4 bytes, use AL32UTF8. Note that this method is "true" UTF8. That is, in the input, nulls may appear encoding themselves. Same as convertUTFBytesToJavaChars(byte[],int,char[],int,int[],boolean) with an additional argument charSize, which is the number of chars available in the char array. Note that if chars_offset+charSize>chars.length, then an ArrayIndexOutOfBoundsException will be thrown. This method has been optimized for speed: 1) the switch-case has been replaced with a series of logical shifts and zero comparisons; 2) an internal loop has been added that attempts to let array bounds checks be optimized away.
        Parameters:
        bytes - the array holding the UTF8 bytes
        offset - the index of the first byte
        chars - the array that receives the UTF-16 chars
        chars_offset - the index of the first char that will be written
        countArr - IN/OUT parameter. countArr[0](IN) contains the number of bytes in the UTF8 sequence that need to be converted.
        convertWithReplacement - set to true to use replacement character for illegal sequences
        Returns:
        the number of chars written. countArr[0](OUT) contains the number of bytes in the bytes[] array that have been ignored because the rest of the sequence is missing (can be up to 3) or because the char[] was too short (can be more than 3).
        Throws:
        SQLException - if invalid, illegal UTF data is given
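        The "3 bytes maximum, 6 bytes per surrogate pair" layout described above can be reproduced with a small JDK-only encoder. This is a sketch of the byte layout as the description states it, not the driver's code; the class and method names are hypothetical. Each UTF-16 code unit is encoded on its own, so a supplementary character takes 6 bytes, whereas the JDK's standard UTF-8 uses 4.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class SurrogateEncodingDemo {
    // Encodes every UTF-16 code unit independently as a 1-, 2- or 3-byte sequence.
    // Surrogates get no special treatment, so a pair occupies 2 x 3 = 6 bytes.
    static byte[] encodeCharByChar(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            int c = s.charAt(i);
            if (c < 0x80) {
                out.write(c);
            } else if (c < 0x800) {
                out.write(0xC0 | (c >> 6));
                out.write(0x80 | (c & 0x3F));
            } else {
                out.write(0xE0 | (c >> 12));
                out.write(0x80 | ((c >> 6) & 0x3F));
                out.write(0x80 | (c & 0x3F));
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String supplementary = new String(Character.toChars(0x1F600)); // one surrogate pair
        System.out.println(encodeCharByChar(supplementary).length);                 // 6
        System.out.println(supplementary.getBytes(StandardCharsets.UTF_8).length);  // 4
    }
}
```

        For text without surrogate pairs the two encodings produce identical bytes; they diverge only for supplementary characters.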
      • stringToUTF

        public static final byte[] stringToUTF​(String str)
        Converts str to a byte array in UTF8 representation.
      • convertJavaCharsToUTFBytes

        public static final int convertJavaCharsToUTFBytes​(char[] chars,
                                                           int chars_offset,
                                                           byte[] bytes,
                                                           int bytes_begin,
                                                           int chars_count)
        Convert char's to the UTF8 representation. No validation is performed.
        Parameters:
        chars - a source string in an array of chars
        chars_offset - an offset to start copying in the source string
        chars_count - a length to copy from the source string
        bytes - a destination byte array
        bytes_begin - an offset to start copying in the destination byte array
        Returns:
        the length copy operation was performed
      • stringUTFLength

        public static final int stringUTFLength​(String s)
        Returns the number of bytes in the UTF8 representation of a String
        Parameters:
        s - a Java string
        Returns:
        The number of bytes in the UTF8 encoding
      • AL32UTF8ToString

        public static final String AL32UTF8ToString​(byte[] bytes,
                                                    int offset,
                                                    int nbytes)
        Convert a sequence of bytes in AL32UTF8 format to a String. This function allocates the memory for the returned String.
        Parameters:
        bytes - containing the AL32UTF8 string
        offset - an offset to start conversion
        nbytes - the number of bytes
        Returns:
        the converted String
      • AL32UTF8ToString

        public static final String AL32UTF8ToString​(byte[] bytes,
                                                    int offset,
                                                    int nbytes,
                                                    boolean useReplacementCharacter)
      • AL32UTF8ToJavaChar

        public static final char[] AL32UTF8ToJavaChar​(byte[] bytes,
                                                      int offset,
                                                      int count,
                                                      boolean useReplacementCharacter)
                                               throws SQLException
        Converts an AL32UTF8 byte array to an array of char. This function will allocate a char array for holding the returning result.
        Parameters:
        bytes - an AL32UTF8 byte array
        offset - an offset to start conversion
        count - number of bytes to be converted.
        Returns:
        an array of char data
        Throws:
        SQLException
      • convertAL32UTF8BytesToJavaChars

        public static final int convertAL32UTF8BytesToJavaChars​(byte[] bytes,
                                                                int offsetBytes,
                                                                char[] chars,
                                                                int offsetChars,
                                                                int[] countArr,
                                                                boolean convertWithReplacement)
                                                         throws SQLException
        Convert a sequence of bytes in AL32UTF8 to an array of char's. A char in AL32UTF8 can be represented with up to 4 bytes. The only difference between UTF8 (Oracle's) and AL32UTF8 is the representation of a surrogate pair. In AL32UTF8 a surrogate pair is represented with 4 bytes instead of 6 bytes in UTF8.
        Parameters:
        bytes - the array holding the AL32UTF8 bytes
        offsetBytes - the index of the first byte
        chars - the array that receives the UTF-16 chars
        offsetChars - the index of the first char that will be written
        countArr - IN/OUT parameter. countArr[0](IN) contains the number of bytes in the UTF8 sequence that need to be converted.
        convertWithReplacement - set to true to use replacement character for illegal sequences
        Returns:
        the number of chars written. countArr[0](OUT) contains the number of bytes in the bytes[] array that have been ignored because the rest of the sequence is missing (can be up to 4) or because the char[] was too short (can be more than 4).
        Throws:
        SQLException - if invalid, illegal UTF data is given
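        The IN/OUT contract of countArr (bytes to convert on entry, bytes left unconverted on return) resembles what java.nio.charset.CharsetDecoder offers for chunked input. The sketch below models that contract with the JDK's standard UTF-8 decoder, which uses 4 bytes per surrogate pair as the description says of AL32UTF8. decodeChunk is a hypothetical helper, not the Oracle method, and always decodes with replacement.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class PartialDecodeDemo {
    // On entry countArr[0] is the number of bytes to convert; on return it is the
    // number of bytes left unconsumed, either because a trailing sequence is
    // incomplete or because chars has no room left. Returns the chars written.
    static int decodeChunk(byte[] bytes, int offsetBytes,
                           char[] chars, int offsetChars, int[] countArr) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        ByteBuffer in = ByteBuffer.wrap(bytes, offsetBytes, countArr[0]);
        CharBuffer out = CharBuffer.wrap(chars, offsetChars, chars.length - offsetChars);
        // endOfInput=false: more bytes may follow, so a truncated tail is not an error.
        decoder.decode(in, out, false);
        countArr[0] = in.remaining();
        return out.position() - offsetChars;
    }

    public static void main(String[] args) {
        byte[] truncated = {'a', (byte) 0xE2, (byte) 0x82}; // 'a' plus 2 of the 3 euro-sign bytes
        char[] chars = new char[4];
        int[] count = {truncated.length};
        int written = decodeChunk(truncated, 0, chars, 0, count);
        System.out.println(written);  // 1
        System.out.println(count[0]); // 2
    }
}
```

        A caller would typically carry the leftover bytes over to the front of the next chunk before decoding again.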
      • convertAL32UTF8BytesToJavaChars

        public static final int convertAL32UTF8BytesToJavaChars​(byte[] bytes,
                                                                int offsetBytes,
                                                                char[] chars,
                                                                int offsetChars,
                                                                int[] countArr,
                                                                boolean convertWithReplacement,
                                                                int charSize)
                                                         throws SQLException
        Same as convertAL32UTF8BytesToJavaChars(byte[],int,char[],int,int[],boolean) with an additional argument charSize, which is the number of chars available in the char array. Note that if offsetChars+charSize>chars.length, then an ArrayIndexOutOfBoundsException will be thrown. This method has been optimized for speed: 1) the switch-case has been replaced with a series of logical shifts and zero comparisons; 2) an internal loop has been added that attempts to let array bounds checks be optimized away.
        Throws:
        SQLException
      • stringToAL32UTF8

        public static final byte[] stringToAL32UTF8​(String str)
      • convertJavaCharsToAL32UTF8Bytes

        public static final int convertJavaCharsToAL32UTF8Bytes​(char[] chars,
                                                                int chars_offset,
                                                                byte[] bytes,
                                                                int bytes_begin,
                                                                int chars_count)
        Convert chars to the UTF-8 representation. No validation is performed except on surrogate pairs.
        Parameters:
        chars - a source string in an array of chars
        chars_offset - an offset to start copying in the source string
        bytes - a destination byte array
        bytes_begin - an offset to start copying in the destination byte array
        chars_count - a length to copy from the source string
        Returns:
        the length copy operation was performed
      • string32UTF8Length

        public static final int string32UTF8Length​(String s)
        Returns the number of bytes in the UTF-8 representation of a String

        This method checks for neither invalid nor illegal UTF sequences.

        Parameters:
        s - a UTF-16 string to count the number of bytes in UTF8 format
        Returns:
        The number of bytes in the UTF-8 representation of a string
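        Counting the encoded length without building the byte array can be sketched as below. This is a JDK-only illustration with a hypothetical helper name; it assumes the standard 4-bytes-per-surrogate-pair layout and, in the spirit of the note above, performs no validation (an unpaired surrogate is simply counted as 3 bytes).

```java
import java.nio.charset.StandardCharsets;

final class Utf8LengthDemo {
    // Returns the number of bytes a standard UTF-8 encoding of s would need,
    // computed by inspecting code units only; nothing is allocated.
    static int utf8Length(String s) {
        int n = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                n += 1;
            } else if (c < 0x800) {
                n += 2;
            } else if (Character.isHighSurrogate(c)
                    && i + 1 < s.length()
                    && Character.isLowSurrogate(s.charAt(i + 1))) {
                n += 4; // a well-formed pair becomes one 4-byte sequence
                i++;    // skip the low surrogate
            } else {
                n += 3;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        String s = "a\u00E9\u20AC" + new String(Character.toChars(0x1F600));
        System.out.println(utf8Length(s));                             // 10
        System.out.println(s.getBytes(StandardCharsets.UTF_8).length); // 10
    }
}
```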
      • AL16UTF16BytesToString

        public static final String AL16UTF16BytesToString​(byte[] bytes,
                                                          int nbytes)
        Convert a sequence of bytes in AL16UTF16 to a String. This function allocates a chars array.
        Parameters:
        bytes - containing the AL16UTF16 string
        nbytes - the number of bytes
        Returns:
        a newly generated String
      • AL16UTF16BytesToJavaChars

        public static final int AL16UTF16BytesToJavaChars​(byte[] bytes,
                                                          int nbytes,
                                                          char[] chars)
        Convert a sequence of bytes in AL16UTF16 to an array of chars. The caller needs to allocate the chars array.
        Parameters:
        bytes - containing the AL16UTF16 string
        nbytes - the number of bytes
        chars - char array in which the UCS2 string will be returned
        Returns:
        the number of chars written to the chars array
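        The caller-allocates pattern for this kind of conversion can be sketched as follows, assuming AL16UTF16 is big-endian UTF-16 (the sketch checks itself against the JDK's UTF_16BE charset, not against the driver). The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

final class Utf16BeDemo {
    // Combines each big-endian byte pair into one char. The caller supplies a
    // chars array of at least nbytes / 2 elements. Returns the chars written.
    static int bytesToChars(byte[] bytes, int nbytes, char[] chars) {
        int n = nbytes / 2;
        for (int i = 0; i < n; i++) {
            int high = bytes[2 * i] & 0xFF;
            int low = bytes[2 * i + 1] & 0xFF;
            chars[i] = (char) ((high << 8) | low);
        }
        return n;
    }

    public static void main(String[] args) {
        byte[] bytes = {0x00, 0x41, 0x20, (byte) 0xAC}; // 'A' and the euro sign
        char[] chars = new char[bytes.length / 2];
        int written = bytesToChars(bytes, bytes.length, chars);
        System.out.println(written); // 2
        System.out.println(new String(chars, 0, written)
                .equals(new String(bytes, StandardCharsets.UTF_16BE))); // true
    }
}
```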
      • convertAL16UTF16BytesToJavaChars

        public static final int convertAL16UTF16BytesToJavaChars​(byte[] bytes,
                                                                 int offset,
                                                                 char[] chars,
                                                                 int chars_offset,
                                                                 int count,
                                                                 boolean convertWithReplacement)
                                                          throws SQLException
        Converts a sequence of bytes in AL16UTF16 to an array of char's.
        Parameters:
        bytes - the array holding the AL16UTF16 bytes
        offset - the index of the first byte
        chars - the array that receives the UTF-16 chars
        chars_offset - the index of the first char
        count - the number of bytes in the AL16UTF16 sequence.
        convertWithReplacement - set to true to use replacement character for illegal sequences
        Returns:
        the number of chars written
        Throws:
        SQLException - if invalid, illegal UTF data is given
      • convertAL16UTF16LEBytesToJavaChars

        public static final int convertAL16UTF16LEBytesToJavaChars​(byte[] bytes,
                                                                   int offset,
                                                                   char[] chars,
                                                                   int chars_offset,
                                                                   int count,
                                                                   boolean convertWithReplacement)
                                                            throws SQLException
        Converts a sequence of bytes in AL16UTF16LE to an array of char's.
        Parameters:
        bytes - the array holding the AL16UTF16LE bytes
        offset - the index of the first byte
        chars - the array that receives the UTF-16 chars
        chars_offset - the index of the first char
        count - the number of bytes in the AL16UTF16LE sequence.
        convertWithReplacement - set to true to use replacement character for illegal sequences
        Returns:
        the number of chars written
        Throws:
        SQLException - if invalid, illegal UTF data is given
      • stringToAL16UTF16Bytes

        public static final byte[] stringToAL16UTF16Bytes​(String str)
        Convert a String to an array of bytes. This function allocates the bytes array.
        Parameters:
        str - containing the UCS2 string
        Returns:
        the AL16UTF16 byte array
      • javaCharsToAL16UTF16Bytes

        public static final int javaCharsToAL16UTF16Bytes​(char[] chars,
                                                          int nchars,
                                                          byte[] bytes)
        Convert a sequence of chars in UCS2 to an array of bytes. The caller needs to allocate the bytes array.
        Parameters:
        chars - containing the UCS2 string
        nchars - the number of chars
        bytes - byte array in which the AL16UTF16 string will be returned
        Returns:
        the number of bytes written to the bytes array
      • convertJavaCharsToAL16UTF16Bytes

        public static final int convertJavaCharsToAL16UTF16Bytes​(char[] chars,
                                                                 int chars_offset,
                                                                 byte[] bytes,
                                                                 int bytes_offset,
                                                                 int nchars)
      • stringToAL16UTF16LEBytes

        public static final byte[] stringToAL16UTF16LEBytes​(String str)
        Convert a String to an array of bytes. This function allocates the bytes array.
        Parameters:
        str - containing the UCS2 string
        Returns:
        the AL16UTF16LE byte array
      • javaCharsToAL16UTF16LEBytes

        public static final int javaCharsToAL16UTF16LEBytes​(char[] chars,
                                                            int nchars,
                                                            byte[] bytes)
        Convert a sequence of chars in UCS2 to an array of bytes. The caller needs to allocate the bytes array.
        Parameters:
        chars - containing the UCS2 string
        nchars - the number of chars
        bytes - byte array in which the AL16UTF16LE string will be returned
        Returns:
        the number of bytes written to the bytes array
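        The little-endian direction can be sketched in the same caller-allocates style, assuming AL16UTF16LE corresponds to the JDK's UTF_16LE (low byte first). The names are hypothetical and the JDK charset is used only to cross-check the result.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class Utf16LeDemo {
    // Writes each char as two bytes, low byte first. The caller supplies a bytes
    // array of at least 2 * nchars elements. Returns the number of bytes written.
    static int charsToBytesLE(char[] chars, int nchars, byte[] bytes) {
        for (int i = 0; i < nchars; i++) {
            bytes[2 * i] = (byte) chars[i];            // low 8 bits
            bytes[2 * i + 1] = (byte) (chars[i] >> 8); // high 8 bits
        }
        return 2 * nchars;
    }

    public static void main(String[] args) {
        char[] chars = "A\u20AC".toCharArray();
        byte[] bytes = new byte[2 * chars.length];
        int written = charsToBytesLE(chars, chars.length, bytes);
        System.out.println(written); // 4
        System.out.println(Arrays.equals(bytes,
                "A\u20AC".getBytes(StandardCharsets.UTF_16LE))); // true
    }
}
```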
      • convertJavaCharsToAL16UTF16LEBytes

        public static final int convertJavaCharsToAL16UTF16LEBytes​(char[] chars,
                                                                   int chars_offset,
                                                                   byte[] bytes,
                                                                   int bytes_offset,
                                                                   int nchars)
      • convertASCIIBytesToJavaChars

        public static final int convertASCIIBytesToJavaChars​(byte[] bytes,
                                                             int bytes_offset,
                                                             char[] chars,
                                                             int chars_offset,
                                                             int count)
                                                      throws SQLException
        Convert a byte array in ASCII to a Java char array. The caller needs to allocate the buffer of chars.
        Parameters:
        bytes - input bytes
        bytes_offset - the starting position to convert
        chars - output Java char array (buffer is allocated by the caller)
        chars_offset - starting position to store the Java char array
        count - number of characters in chars
        Returns:
        the number of Java characters written into chars[]
        Throws:
        SQLException - if errors occurred
      • convertJavaCharsToASCIIBytes

        public static final int convertJavaCharsToASCIIBytes​(char[] chars,
                                                             int chars_offset,
                                                             byte[] bytes,
                                                             int bytes_offset,
                                                             int nchars)
                                                      throws SQLException
        Convert a Java char array to a byte array in ASCII. The caller needs to allocate the buffer of bytes.
        Parameters:
        chars - input Java char array
        chars_offset - input the starting position to convert
        bytes - output the converted byte array in ascii
        bytes_offset - input the starting position to hold the returning bytes
        Returns:
        the number of chars converted
        Throws:
        SQLException - if errors occurred
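        A chars-to-ASCII conversion into a caller-supplied buffer might look like the sketch below. The strict flag and the '?' substitute are this sketch's own assumptions about how non-ASCII input could be handled; the documentation above does not say what the library does in that case, and the names are hypothetical.

```java
final class AsciiDemo {
    // Copies nchars chars into bytes as 7-bit ASCII, starting at the given offsets.
    // Non-ASCII chars either raise an exception (strict) or become '?'.
    // Returns the number of chars converted.
    static int charsToAscii(char[] chars, int charsOffset,
                            byte[] bytes, int bytesOffset,
                            int nchars, boolean strict) {
        for (int i = 0; i < nchars; i++) {
            char c = chars[charsOffset + i];
            if (c > 0x7F) {
                if (strict) {
                    throw new IllegalArgumentException(
                            "non-ASCII char at index " + (charsOffset + i));
                }
                c = '?';
            }
            bytes[bytesOffset + i] = (byte) c;
        }
        return nchars;
    }

    public static void main(String[] args) {
        char[] in = "a\u00E9b".toCharArray();
        byte[] out = new byte[in.length];
        int converted = charsToAscii(in, 0, out, 0, in.length, false);
        System.out.println(converted);     // 3
        System.out.println((char) out[1]); // ?
    }
}
```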
      • convertJavaCharsToASCIIBytes

        public static final int convertJavaCharsToASCIIBytes​(char[] chars,
                                                             int chars_offset,
                                                             byte[] bytes,
                                                             int bytes_offset,
                                                             int nchars,
                                                             boolean strictConversion)
                                                      throws SQLException
        Throws:
        SQLException
      • convertJavaCharsToISOLATIN1Bytes

        public static final int convertJavaCharsToISOLATIN1Bytes​(char[] chars,
                                                                 int chars_offset,
                                                                 byte[] bytes,
                                                                 int bytes_offset,
                                                                 int nchars)
                                                          throws SQLException
        Throws:
        SQLException
      • stringToASCII

        public static final byte[] stringToASCII​(String str)
        Convert a String to a byte array in ASCII. This method allocates the byte array.
        Parameters:
        str - input the String to be converted
        Returns:
        the byte array in ascii
      • convertUTF32toUTF16

        public static final long convertUTF32toUTF16​(long ucs4ch)
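        This method carries no description, so how its long result is laid out is not stated here. As background only, the standard arithmetic for turning a UTF-32 code point into UTF-16 code units can be sketched with the JDK; toUtf16 is a hypothetical helper that returns a char[] rather than a packed long.

```java
import java.util.Arrays;

final class SurrogateSplitDemo {
    // Returns one char for a BMP code point, or a high/low surrogate pair
    // for a supplementary code point (U+10000 and above).
    static char[] toUtf16(int codePoint) {
        if (codePoint < 0x10000) {
            return new char[] {(char) codePoint};
        }
        int v = codePoint - 0x10000;            // 20 significant bits remain
        char high = (char) (0xD800 + (v >> 10)); // top 10 bits
        char low = (char) (0xDC00 + (v & 0x3FF)); // bottom 10 bits
        return new char[] {high, low};
    }

    public static void main(String[] args) {
        char[] pair = toUtf16(0x1F600);
        System.out.println(Integer.toHexString(pair[0])); // d83d
        System.out.println(Integer.toHexString(pair[1])); // de00
        System.out.println(Arrays.equals(pair, Character.toChars(0x1F600))); // true
    }
}
```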
      • encodedByteLength

        public int encodedByteLength​(String s)
                              throws SQLException
        Returns the length of the byte array that would result if the String were encoded in this character set.
        Parameters:
        s - is a Java String
        Returns:
        the length of the encoded bytes
        Throws:
        SQLException
      • encodedByteLength

        public int encodedByteLength​(char[] carray)
        Return the length of the byte array which would result if the char array were encoded in this character set
        Parameters:
        carray - is a char array
        Returns:
        the length of the encoded bytes
      • getConnectionDuringExceptionHandling

        protected oracle.jdbc.internal.OracleConnection getConnectionDuringExceptionHandling()
      • isUnknown

        public boolean isUnknown()