26 JDBC Support for Database Sharding
This chapter describes JDBC support for Database Sharding in the following sections:
26.1 Overview of Database Sharding for JDBC Users
Modern web applications face new scalability challenges with huge volumes of data. A commonly accepted solution to this problem is sharding. Sharding is a data tier architecture, where data is horizontally partitioned across independent databases. Each database in such a configuration is called a shard. All shards together make up a single logical database, which is referred to as a sharded database (SDB). Sharding is a shared-nothing database architecture because shards do not share physical resources such as CPU, memory, or storage devices.
Sharding uses Global Data Services (GDS), where GDS routes a client request to an appropriate database based on parameters such as availability, load, network latency, and replication lag. A GDS pool is a set of replicated databases that offer the same global service. The databases in a GDS pool can be located in multiple data centers across different regions. A sharded GDS pool contains all shards of a sharded database and their replicas, and appears as a single sharded database to database clients.
Starting from Oracle Database 12c Release 2 (12.2.0.1), Oracle JDBC supports database sharding. The JDBC driver recognizes the specified sharding key and super sharding key and connects to the relevant shard that contains the data. Once the connection is established to a shard, then any database operations, such as DMLs, SQL queries and so on, are supported and executed in the usual way. The following section describes the sharding terminologies used in this guide:
See Also:
Sharding, Shard, and Sharded Database
Sharding is a data tier architecture where data is horizontally partitioned across independent databases. Each database in such configuration is called a shard. All shards together make up a single logical database which is referred to as a sharded database (SDB).
Sharding Key, Composite Sharding Key, and Super Sharding Key
A sharding key is a partitioning key used in single-level sharding by range, list, or consistent hash. All sharding keys together are referred to as the composite sharding keys. A super-sharding key is the partitioning key used in composite sharding for the top-level sharding by range or list. Both the sharding key and the super sharding key can contain one or more columns that determine the shard where each row is stored. A sharding key can be of type VARCHAR2, CHAR, DATE, NUMBER, TIMESTAMP and so on.
For JDBC users, it is recommended that sharding keys and super sharding keys must be passed while obtaining connections from the database. However, Sharding Keys can be provided in the connection string as a separate attribute under CONNECT_DATA
. Passing sharding key in the connection string restricts the connections only to one shard. So, it is not recommended to use this approach. Following code snippet shows how you can provide Sharding Keys as a separate attribute under CONNECT_DATA
in the connection string:
(DESCRIPTION=(…)(CONNECT_DATA=(SERVICE_NAME=ORCL (SHARDING_KEY=…) (SUPER_SHARDING_KEY=...)))
Note:
You must provide the sharding key compliant to the NLS formatting that is specified in the database.
Multi Shard Queries
Multi Shard Queries enable routing and processing of queries and transactions that access data stored on multiple shards. Multi Shard Queries are executed without a sharding key. Multi Shard Operations are used for simple aggregation of data and reporting across shards.
Shard Catalog
Shard Catalog is a special database that is used for storing sharded database and supporting multi shard queries. It also helps in centralized management of a sharded database.
Shard Director
A shard director is a specific implementation of a global service manager (GSM) that acts as a regional listener for clients that connect to an SDB and maintains a current topology map of the SDB. Based on the sharding key passed during a connection request, it routes the connections to the appropriate shard.
Shard Topology
Shard Topology is the sharding key range mappings stored in a particular shard. Universal Connection Pool (UCP) can cache shard topology, which enables it to bypass shard director while establishing connections to shards. So, applications that you built using UCP get fast path for shards.
Chunk
A chunk is a single partition from each table of a table family. It is a unit of data migration between shards.
Chunk Split
Chunk Split is a process that is required when chunks become too big or only part of a chunk needs to be migrated to another shard.
Chunk Migration
Chunk migration is the process of moving a chunk from one shard to another, when data or workload skew occurs without any change in the number of shards. It is initiated by DBA to eliminate hot spots.
Resharding
Resharding is the process of redistributing data between shards triggered by a change in the number of shards. Chunks are moved between shards for even distribution of chunks across shards. However, content of chunks does not change, that is, no rehashing takes place during Resharding.
26.2 About Building the Sharding Key
The shard aware applications must identify and build the sharding key and the super sharding key, which are required to establish a connection to the sharded database. For achieving this, the shard aware applications must use the OracleShardingKey
and the OracleShardingKeyBuilder
interfaces.
The OracleShardingKeyBuilder
uses the following builder method for supporting compound keys with different data types:
subkey(Object subkey, java.sql.SQLTYPE subkeyDataType)
There are multiple invocations of the subkey
method on the builder for building a compound sharding key, where each subkey can be of different data types. The data type can be defined using the oracle.jdbc.OracleType
enum or java.sql.JDBCType
.
Example 26-1 Building a Sharding Key
The following example shows how to build a sharding key:
import java.sql.Connection;
import java.sql.Date;
import java.sql.SQLException;
import java.sql.Statement;
import oracle.jdbc.OracleShardingKey;
import oracle.jdbc.OracleType;
import oracle.ucp.jdbc.PoolDataSource;
import oracle.ucp.jdbc.PoolDataSourceFactory;
public class ShardExample
{
public static void main(String[] args) throws SQLException
{
String url = "jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS=(HOST=myhost)(PORT=3216)(PROTOCOL=tcp))(CONNECT_DATA=(SERVICE_NAME=myservice)(REGION=east)))";
String user="testuser1";
String pwd = "password";
PoolDataSource pds = PoolDataSourceFactory.getPoolDataSource();
pds.setURL(url);
pds.setUser(user);
pds.setPassword(pwd);
pds.setConnectionFactoryClassName("oracle.jdbc.pool.OracleDataSource");
pds.setInitialPoolSize(5);
pds.setMinPoolSize(5);
pds.setMaxPoolSize(20);
// build the sharding key object
Date shardingKeyVal = new java.sql.Date(0L);
OracleShardingKey sdkey = pds.createShardingKeyBuilder()
.subkey(shardingKeyVal, OracleType.DATE)
.build();
Connection conn = pds.createConnectionBuilder()
.shardingKey(sdkey)
.build();
Statement stmt = conn.createStatement();
stmt.execute("... SQL statement here ...");
stmt.close();
conn.close();
}
}
The following code snippet shows how to build a compound sharding key that consists of String and Date data types:
...
Date shardingKeyVal = new java.sql.Date(0L);
...
OracleShardingKey shardingKey = datasource.createShardingKeyBuilder()
.subkey("abc@xyz.com", JDBCType.VARCHAR)
.subkey(shardingKeyVal, OracleType.DATE)
.build();
...
Note:
-
There is a fixed set of data types that are valid and supported. If any unsupported data types are used as keys, then exceptions are thrown. The following list specifies the supported data types:
-
OracleType.VARCHAR2/JDBCType.VARCHAR
-
OracleType.CHAR/JDBCType.CHAR
-
OracleType.NVARCHAR/JDBCType.NVARCHAR
-
OracleType.NCHAR/JDBCType.NCHAR
-
OracleType.NUMBER/JDBCType.NUMERIC
-
OracleType.FLOAT/ JDBCType.FLOAT
-
OracleType.DATE/ JDBCType.DATE
-
OracleType.TIMESTAMP/JDBCType.TIMESTAMP
-
OracleType.TIMESTAMP_WITH_LOCAL_TIME_ZONE
-
OracleType.RAW
-
-
You must provide a sharding key that is compliant to the NLS formatting specified in the database.
26.3 APIs for Database Sharding Support
Oracle Database 12c Release 2 (12.2.0.1) introduced a set of APIs for implementing database sharding. The following sections discuss these APIs in details:
26.3.1 The OracleShardingKey Interface
This interface indicates that the current object represents an Oracle sharding key that is to be used with Oracle sharded database.
Syntax
public interface OracleShardingKey extends Comparable <OracleShardingKey>
26.3.2 The OracleShardingKeyBuilder Interface
OracleShardingKeyBuilder
provides the interface to build the compound sharding key with subkeys of various supported data types. This interface uses the new JDK 8 builder pattern for building a sharding key.
Syntax
public interface OracleShardingKeyBuilder
Example 26-2 Creating the Sharding Key
OracleDataSource ods = new OracleDataSource();
...
//set datasource properties..
...
OracleShardingKey shardingKey = ods.createShardingKeyBuilder()
.subkey("Customer_Name_XYZ", JDBCType.VARCHAR)
.subkey(94002, JDBCType.NUMERIC)
.build();
26.3.3 The OracleConnectionBuilder Interface
The OracleConnectionBuilder
is used for building connection objects with additional parameters, other than user name and password. For creating a connection, the builder methods need to be called for each parameter that needs to be part of the connection request, followed by a build()
method. The order in which the builder methods are called is not important. However, if the same builder attribute is applied more than once, then only the most recent value is considered while building the connection. The build()
method of the builder can be called only once on a builder object.
Syntax
public interface OracleConnectionBuilder
Example 26-3 Creating the Connection Builder
...
OracleDataSource ods=new OracleDataSource();
...
OracleConnection conn = ods.createConnectionBuilder()
.shardingKey(shardingKey)
.superShardingKey(superShardingKey)
.build();
26.3.4 Other New Classes and Methods for Database Sharding Support
This section describes the rest of the new classes and methods introduced for implementing database sharding support.
New Methods in OracleDataSource Class
The createConnectionBuilder
and createShardingKeyBulider
methods have been introduced in OracleDataSource
class for database sharding support.
OracleConnectionBuilder createConnectionBuilder() throws SQLException;
OracleShardingKeyBuilder createShardingKeyBuilder()
New Methods in OracleXADataSource Class
The createConnectionBuilder
method has been introduced in OracleXADataSource
class for database sharding support.
OracleConnectionBuilder createConnectionBuilder() throws SQLException;
New Methods in OracleConnection Class
The setShardingKeyIfValid
and setShardingKey
methods have been introduced in OracleConnection
class for database sharding support.
boolean setShardingKeyIfValid(OracleShardingKey shardingKey, OracleShardingKey superShardingKey, int timeout) throws SQLException;
void setShardingKey(OracleShardingKey shardingKey, OracleShardingKey superShardingKey) throws SQLException;
New Methods in OracleXAConnection Class
The setShardingKeyIfValid
and setShardingKey
methods have been introduced in OracleConnection
class for database sharding support.
boolean setShardingKeyIfValid(OracleShardingKey shardingKey, OracleShardingKey superShardingKey, int timeout) throws SQLException;
void setShardingKey(OracleShardingKey shardingKey, OracleShardingKey superShardingKey) throws SQLException;
26.4 JDBC Sharding Example
The following code snippet shows how to use JDBC sharding APIs:
Example 26-4 JDBC Sharding Example
OracleDataSource ods = new OracleDataSource();
ods.setURL("jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS=(HOST=myhost)(PORT=1521)(PROTOCOL=tcp))(CONNECT_DATA=(SERVICE_NAME=myorcldbservicename)))");
ods.setUser("hr");
ods.setPassword("hr");
// Employee name is the sharding Key in this example.
// Build the Sharding Key using employee name as shown below.
OracleShardingKey employeeNameShardKey = ods.createShardingKeyBuilder()
.subkey("Mary", JDBCType.VARCHAR)// First Name
.subkey("Claire", JDBCType.VARCHAR)// Last Name
.build();
OracleShardingKey locationSuperShardKey = ods.createShardingKeyBuilder() // Building a super sharding key using location as the key
.subkey("US", JDBCType.VARCHAR)
.build();
OracleConnection connection = ods.createConnectionBuilder()
.shardingKey(employeeNameShardKey)
.superShardingKey(locationSuperShardKey)
.build();
26.5 Overview of the Sharding Data Source
Oracle Database Release 21c introduces a new Java data source that enables Java connectivity to a sharded database without the need for an application to furnish a sharding key. So, you do not need to identify and build the sharding key and the super sharding key to establish a connection.
The new data source scales out to sharded databases transparently as it does not involve any change to the application code. For using this data source, you must set the connection property oracle.jdbc.useShardingDriverConnection
to true
in the following way:
Properties prop = new Properties();
prop.setProperty("oracle.jdbc.useShardingDriverConnection", "true");
The default value of this connection property is false
.
26.5.1 Benefits of the Sharding Data Source
Following are the benefits of the new sharding data source:
- You do not need to use the sharding APIs to pass the sharding key because the sharding data source derives the sharding key from the SQL statement.
- You do not need to configure the Universal Connection Pool (UCP) because the sharding data source uses the auto tune feature of UCP.
- You do not need to check-in or check-out a physical connection for every new sharding key because the sharding data source does it automatically.
- You do not need to separate cross-shard statements from single-shard statements and create separate connection pools for them because the sharding data source maintains those connections pools.
- The sharding data source enables the prepared statement caching and routes the connection to the direct shard based on the key used in the SQL statement.
- The sharding data source simplifies applications and optimizes application performance without any code change.
26.5.2 Example: Sharding Data Source
Following is an example that shows how to use the sharding data source.
Example 26-5 Sharding Data Source
public class ShardingDriverSample {
public static void main(String[] args) throws SQLException {
ShardingDriverSample sample = new ShardingDriverSample();
sample.bindQuerySample();
}
private void bindQuerySample() throws SQLException {
OracleConnection conn = getGsmConnection();
executeQueryWithBindAndReadRows(conn, "SELECT * FROM MY_SHARD_TABLE where ID = ?", 10);
System.out.println("Direct shard execution percentage:" + conn.getPercentageQueryExecutionOnDirectShard());
}
private OracleConnection getGsmConnection() throws SQLException {
Properties prop = new Properties();
prop.setProperty("oracle.jdbc.useShardingDriverConnection", "true");
...
return ds.getConnection();
}
private void executeQueryWithBindAndReadRows(Connection dbConnection, String sql, int noOfTime)
throws SQLException {
while (noOfTime-- > 0) {
PreparedStatement statement = dbConnection.prepareStatement(sql);
statement.setInt(1, 15);
ResultSet rs = statement.executeQuery();
...
}
}
}
26.5.3 Limitations of the Sharding Data Source
Following are the limitations of the sharding data source:
- The sharding data source supports only the JDBC Thin driver. It does not support the JDBC OCI driver or the KPRB driver.
- The sharding data source does not support some Oracle JDBC extension APIs such as Direct Path Load, Implicit Connection Cache Cache, JDBC Dynamic Monitoring Service (DMS) metrics, and so on.
- The sharding data source does not support transactions spanning across multiple shards.
- If the data source property
allowSingleShardTxn
is set toTRUE
, then the sharding data source supports local transactions against a single shard, whenAUTO COMMIT
is set to OFF. - The sharding data source supports PL/SQL execution only through the catalog database.