Sharded Database Schema Design Considerations
Design of the database schema has a big impact on the performance and scalability of a sharded database. An improperly designed schema can lead to unbalanced distribution of data and workload across shards and large percentage of multi-shard operations.
The data model should be a hierarchical tree structure with a single root table. Oracle Sharding supports any number of levels within the hierarchy.
To obtain the benefits of sharding, the schema of a sharded database should be designed in a way that maximizes the number of database requests executed on a single shard.
A sharded database schema consists of a sharded table family and duplicated tables with the following characteristics.
Sharded table family
- A set of tables which are equi-partitioned by the sharding key.
- Related data is always stored and moved together.
- Joins and integrity constraint checks are done within a shard.
- The sharding method and key are based on the application's requirements.
- The sharding key must be included in the primary key.
Duplicated tables
- Non-sharded tables which are replicated to all shards.
- Usually contain common reference data.
- Can be read and updated on each shard.
Planning a Sharded Database Schema Design
Once the sharded database is populated with data, it is impossible to change many attributes of the schema, such as whether a table is sharded or duplicated, sharding key, and so on. Therefore, the following points should be carefully considered before deploying a sharded database.
-
Which tables should be sharded?
-
Which tables should be duplicated?
-
Which sharded table should be the root table?
-
What method should be used to link other tables to the root table?
-
Which sharding method should be used?
-
Which sharding key should be used?
-
Which super sharding key should be used (if the sharding method is composite)?
- Choosing Sharding Keys
Sharded table partitions are distributed across shards at the tablespace level, based on a sharding key. Examples of keys include customer ID, account number, and country ID. - Primary Key and Foreign Key Constraints
In a sharding environment, the primary key constraints and foreign key constraints are controlled by the following rules. - Indexes on Sharded Tables
Only local indexes can be created on sharded tables. Unique local indexes on sharded tables must contain the sharding key.
Parent topic: Sharded Database Schema Design
Choosing Sharding Keys
Sharded table partitions are distributed across shards at the tablespace level, based on a sharding key. Examples of keys include customer ID, account number, and country ID.
Sharding keys must adhere to the following characteristics.
-
The sharding key should be very stable; its value should almost never change.
-
The sharding key must be present in all of the sharded tables. This allows the creation of a family of equi-partitioned tables based on the sharding key.
-
Joins between tables in a table family should be performed using the sharding key.
Sharding Keys for System-Managed Sharded Databases
For the system-managed sharding method, the sharding key must be based on a column that has high cardinality; the number of unique values in this column must be much bigger than the number of shards. Customer ID, for example, is a good candidate for the sharding key, while a United States state name is not.
A sharding key can be a single column or multiple columns. When multiple columns are present, the hash of the columns are concatenated to form the sharding key.
The following examples create a sharded table called Customers and specify that columns
cust_id
and name
form the sharding keys for the
table.
CREATE SHARDED TABLE customers
(cust_id NUMBER NOT NULL
, name VARCHAR2(50)
, address VARCHAR2(250)
, region VARCHAR2(20)
, class VARCHAR2(3)
, signup DATE,
CONSTRAINT cust_pk PRIMARY KEY(cust_id, name))
PARTITION BY CONSISTENT HASH (cust_id,name)
PARTITIONS AUTO
TABLESPACE SET ts1;
CREATE SHARDED TABLE Orders
( OrderNo NUMBER NOT NULL
, CustNo NUMBER NOT NULL
, Name VARCHAR2(50) NOT NULL
, OrderDate DATE
, CONSTRAINT OrderPK PRIMARY KEY (CustNo, Name, OrderNo)
, CONSTRAINT CustFK FOREIGN KEY (CustNo, Name) REFERENCES Customers(Cust_ID, Name)
)
PARTITION BY REFERENCE (CustFK);
Sharding Keys for Composite Sharded Databases
Composite sharding enables two levels of sharding - one by list or range and another by consistent hash. This is accomplished by the application providing two keys: a super sharding key and a sharding key.
Composite sharding does not support multi-column LIST
partitionsets, as
shown here.
CREATE SHARDED TABLE customers (
cust_id NUMBER NOT NULL,
Name VARCHAR2(50) NOT NULL,
class VARCHAR2(3) NOT NULL ,
class2 number not null,
CONSTRAINT cust_pk PRIMARY KEY(cust_id,name,class))
PARTITIONSET BY LIST (class, class2)
PARTITION BY CONSISTENT HASH (cust_id,name)
PARTITIONS AUTO (
PARTITIONSET silver VALUES (('SLV',1),('BRZ',2)) TABLESPACE SET ts1
PARTITIONSET gold VALUES (('GLD',3),('OTH',4)) TABLESPACE SET ts2);
PARTITION BY CONSISTENT HASH (cust_id,name)
*
ERROR at line 8:
ORA-02514: list PARTITIONSET method expects a single partitioning column
Multi-column RANGE
partitionsets are supported, as shown below.
CREATE SHARDED TABLE customers (
cust_id NUMBER NOT NULL,
Name VARCHAR2(50) NOT NULL,
class number NOT NULL ,
class2 number not null,
CONSTRAINT cust_pk PRIMARY KEY(cust_id,name,class))
PARTITIONSET BY RANGE (class, class2)
PARTITION BY CONSISTENT HASH (cust_id,name)
PARTITIONS AUTO (
PARTITIONSET silver VALUES LESS THAN (10,100) TABLESPACE SET ts1,
PARTITIONSET gold VALUES LESS THAN (20,200) TABLESPACE SET ts2);
Table created.
In both of the above cases, the sharding key (not the partitionset key) can be multi-column.
Sharding Keys for User-Defined Sharded Databases
For partition by list in user-defined sharding, Oracle Sharding expects a single sharding key column. An error is thrown when multiple columns are specified for a list-partitioned sharded table.
CREATE SHARDED TABLE accounts
( id NUMBER
, account_number NUMBER
, customer_id NUMBER
, branch_id NUMBER
, state VARCHAR(2) NOT NULL
, state2 VARCHAR(2) NOT NULL
, status VARCHAR2(1)
)
PARTITION BY LIST (state,state2)
( PARTITION p_northwest VALUES ('OR', 'WA') TABLESPACE ts1
, PARTITION p_southwest VALUES ('AZ', 'UT', 'NM') TABLESPACE ts2
, PARTITION p_northcentral VALUES ('SD', 'WI') TABLESPACE ts3
, PARTITION p_southcentral VALUES ('OK', 'TX') TABLESPACE ts4
, PARTITION p_northeast VALUES ('NY', 'VM', 'NJ') TABLESPACE ts5
, PARTITION p_southeast VALUES ('FL', 'GA') TABLESPACE ts6
);
ERROR at line 1:
ORA-03813: list partition method expects a single partitioning column in
user-defined sharding
For a range-partitioned sharded table, you can specify multiple columns as sharding key columns.
CREATE SHARDED TABLE accounts
( id NUMBER
, account_number NUMBER
, customer_id NUMBER
, branch_id NUMBER
, state NUMBER NOT NULL
, state2 NUMBER NOT NULL
, status VARCHAR2(1)
)
PARTITION BY RANGE (state, state2)
( PARTITION p_northwest VALUES LESS THAN(10, 100) TABLESPACE ts1
, PARTITION p_southwest VALUES LESS THAN(20,200) TABLESPACE ts2);
Table created.
But in both cases, the sharding key (not the partitionset key) can be multi-column.
Sharding Key Type Support
The following data types are supported for the sharding key.
-
NUMBER
-
INTEGER
-
SMALLINT
-
RAW
-
(N)VARCHAR
-
(N)VARCHAR2
-
(N)CHAR
-
DATE
-
TIMESTAMP
Parent topic: Sharded Database Schema Design Considerations
Primary Key and Foreign Key Constraints
In a sharding environment, the primary key constraints and foreign key constraints are controlled by the following rules.
-
For primary keys, there are unique constraints and unique indexes on sharded tables; the column list must contain the sharding key columns. In earlier Oracle releases the restriction was that the sharding key must be a prefix of such columns, but this rule is now more relaxed.
-
Foreign keys from one sharded table to another sharded table also must contain the sharding key. This is automatically enforced because a foreign key refers to either the primary key or unique columns of the referenced table.
-
Foreign keys on sharded tables must be within the same table family. This is required because different table families have different sharding key columns.
-
Foreign keys in sharded tables referencing local tables are not allowed.
-
Foreign keys in sharded tables referencing duplicated tables are not allowed.
-
Foreign keys in duplicated table referencing sharded tables are not allowed.
Parent topic: Sharded Database Schema Design Considerations
Indexes on Sharded Tables
Only local indexes can be created on sharded tables. Unique local indexes on sharded tables must contain the sharding key.
Global indexes on sharded tables are not allowed because they can compromise the performance of online chunk movement.
The following example creates a local index named id1
for the
id
column of the account
table.
CREATE INDEX id1 ON account (id) LOCAL;
The following example creates a local unique index named id2
for
the id
and state
columns of the
account
table.
CREATE UNIQUE INDEX id2 ON account (id, state) LOCAL;
Parent topic: Sharded Database Schema Design Considerations