fbpx

Exadata HCC Compression

Hybrid Columnar Compression (HCC) is an advanced compression technology introduced by Oracle, particularly designed for environments like Oracle Exadata.

Unlike traditional row-major format databases, HCC organizes data in a column-major format within storage units known as Compression Units (CUs).

This format allows Oracle to compress data more efficiently, improving both storage utilization and query performance, especially for read-heavy operations such as data warehousing and archiving.

HCC comes in four compression types:

  1. Query Low: Uses LZO (Lempel–Ziv–Oberhumer) compression, offering fast decompression but lower compression ratios. It’s best for scenarios where speed is essential, especially for row-level access.
  2. Query High: Utilizes ZLIB (Gzip) compression, providing better compression ratios than Query Low but requiring more CPU for decompression.
  3. Archive Low: Uses a more aggressive version of the ZLIB algorithm, optimizing for higher compression than Query High but at the cost of even higher CPU overhead.
  4. Archive High: This is the most aggressive form of compression, using the Bzip2 algorithm. It provides the highest compression ratio, but also incurs the greatest CPU cost and is best suited for data that is rarely accessed.

To use HCC, it must be enabled on Exadata or certain Oracle storage systems (like Oracle ZFS Storage).

You can enable HCC either when creating a table or by altering an existing one.

Here’s an example:

CREATE TABLE my_table 
COLUMN STORE COMPRESS FOR ARCHIVE HIGH
AS SELECT * FROM source_table;

Compression can also be applied to partitions of a table.

Note that HCC compression primarily benefits workloads with large read-only datasets, and it’s less suited for frequently updated data due to potential performance degradation.

Additionally, direct-path operations are necessary for inserting data into HCC-compressed tables.

Updates to HCC-compressed rows are stored separately, often requiring more storage and reducing the benefits of the original compression.

1. The 4 HCC compression types

Here are examples of how to create tables using the four different Hybrid Columnar Compression (HCC) types in Oracle:

1. Query Low Compression

This type of compression provides a balance between performance and compression ratio, suitable for frequent access and faster query response times.

CREATE TABLE my_table_query_low
COLUMN STORE COMPRESS FOR QUERY LOW
AS SELECT * FROM source_table;

2. Query High Compression

This compression provides a higher compression ratio than Query Low, but with more overhead for decompression.

It’s suitable for datasets that are read more often but don’t require the highest performance.

CREATE TABLE my_table_query_high
COLUMN STORE COMPRESS FOR QUERY HIGH
AS SELECT * FROM source_table;

3. Archive Low Compression

This compression is used when you need to store data for long periods with better compression than Query High, but still with decent performance when queried.

CREATE TABLE my_table_archive_low
COLUMN STORE COMPRESS FOR ARCHIVE LOW
AS SELECT * FROM source_table;

4. Archive High Compression

This is the most aggressive compression, offering the highest compression ratio.

It is ideal for rarely accessed or archived data.

CREATE TABLE my_table_archive_high
COLUMN STORE COMPRESS FOR ARCHIVE HIGH
AS SELECT * FROM source_table;

In these examples, each table is created from an existing table (source_table), and the specific HCC compression type is applied to optimize storage based on the use case.

The COLUMN STORE COMPRESS clause applies the desired HCC compression level.

5. How to create the source_table

Here is a simple CREATE TABLE statement for the source_table, which we can use as a reference table for the HCC compression examples.

This table contains sample columns that represent typical data types.

CREATE TABLE source_table (
    id            NUMBER PRIMARY KEY,
    first_name    VARCHAR2(50),
    last_name     VARCHAR2(50),
    email         VARCHAR2(100),
    created_at    DATE,
    amount        NUMBER(10, 2)
);

This source_table contains a few basic fields: an id, first_name, last_name, email, created_at (for date of entry), and amount.

You can populate this table with data, and then use it as the basis for applying the different Hybrid Columnar Compression (HCC) types.

6. How to populate the source_table

To populate the source_table with millions of rows in Oracle, you can use a PL/SQL block combined with loops and the DBMS_RANDOM package to generate random data.

Here’s an example that will insert 1 million rows into the table:

BEGIN
   FOR i IN 1..1000000 LOOP
      INSERT INTO source_table (
         id, first_name, last_name, email, created_at, amount
      )
      VALUES (
         i,
         DBMS_RANDOM.STRING('U', 10),   -- Random first name
         DBMS_RANDOM.STRING('U', 10),   -- Random last name
         DBMS_RANDOM.STRING('U', 10) || '@example.com',  -- Random email
         SYSDATE - DBMS_RANDOM.VALUE(0, 365),   -- Random date in the past year
         ROUND(DBMS_RANDOM.VALUE(1, 10000), 2)  -- Random amount between 1 and 10000
      );
      
      -- Commit every 10,000 rows for performance reasons
      IF MOD(i, 10000) = 0 THEN
         COMMIT;
      END IF;
   END LOOP;
   
   -- Final commit in case the loop ends without hitting a commit
   COMMIT;
END;
/

7. How to check the compression type of each table

To verify the type of compression applied to each table, you can query Oracle’s data dictionary views.

Specifically, you can use the DBA_TABLES view or ALL_TABLES view to check the COMPRESSION and COMPRESS_FOR columns.

Here’s how you can check the compression type for each table:

SELECT table_name, compression, compress_for
FROM dba_tables
WHERE table_name IN ('MY_TABLE_QUERY_LOW', 'MY_TABLE_QUERY_HIGH', 'MY_TABLE_ARCHIVE_LOW', 'MY_TABLE_ARCHIVE_HIGH');

Example output:

TABLE_NAMECOMPRESSIONCOMPRESS_FOR
MY_TABLE_QUERY_LOWENABLEDQUERY LOW
MY_TABLE_QUERY_HIGHENABLEDQUERY HIGH
MY_TABLE_ARCHIVE_LOWENABLEDARCHIVE LOW
MY_TABLE_ARCHIVE_HIGHENABLEDARCHIVE HIGH

2. How to check the benefits of HCC compression

To see how much benefit you’re getting from Hybrid Columnar Compression (HCC), you can use several approaches to measure the space savings and compare the storage used before and after compression.

Below are some methods to assess compression benefits:

1. Query DBA_SEGMENTS to Compare Table Sizes

You can check the actual size of the table in terms of segments (i.e., the storage used) and compare the sizes of tables with and without compression.

SELECT segment_name, segment_type, (bytes)/1024/1024 AS size_mb
FROM dba_segments
WHERE segment_name IN ('MY_TABLE_QUERY_LOW', 'MY_TABLE_QUERY_HIGH', 'MY_TABLE_ARCHIVE_LOW', 'MY_TABLE_ARCHIVE_HIGH');

This query will show the size (in MB) for each table, allowing you to compare how much storage space each compressed table occupies.

2. DBMS_COMPRESSION Package

Oracle provides the DBMS_COMPRESSION package to analyze the actual space savings from compression. You can use this package to gather detailed information about compression ratios and benefits. Here’s how you can use it:

DECLARE
   blk_compress_ratio   NUMBER;
   blk_comp_ops         NUMBER;
   blk_comp_ratio_rows  NUMBER;
   blk_uncomp_ops       NUMBER;
   blk_uncomp_ratio_rows NUMBER;
BEGIN
   DBMS_COMPRESSION.GET_COMPRESSION_RATIO(
      schema_name         => 'YOUR_SCHEMA_NAME',
      object_name         => 'MY_TABLE_QUERY_LOW',
      subobject_name      => NULL,
      object_type         => DBMS_COMPRESSION.TABLE,
      blk_compress_ratio  => blk_compress_ratio,
      blk_comp_ops        => blk_comp_ops,
      blk_comp_ratio_rows => blk_comp_ratio_rows,
      blk_uncomp_ops      => blk_uncomp_ops,
      blk_uncomp_ratio_rows => blk_uncomp_ratio_rows
   );
   
   DBMS_OUTPUT.PUT_LINE('Compression Ratio: ' || blk_compress_ratio);
   DBMS_OUTPUT.PUT_LINE('Uncompressed Size: ' || blk_uncomp_ops);
   DBMS_OUTPUT.PUT_LINE('Compressed Size: ' || blk_comp_ops);
END;
/

3. Use the Compression Advisor (DBMS_COMPRESSION.ADVISOR)

Oracle’s DBMS_COMPRESSION.ADVISOR provides estimates on the compression benefits you can get from HCC.

You can use this package to check the expected space savings.

Here’s an example of how to use it:

DECLARE
   comp_ratio NUMBER;
   comp_method VARCHAR2(100);
BEGIN
   comp_method := DBMS_COMPRESSION.COMP_FOR_QUERY_HIGH;

   comp_ratio := DBMS_COMPRESSION.ADVISOR (
      schema_name => 'YOUR_SCHEMA',
      table_name  => 'MY_TABLE_QUERY_HIGH',
      compress_for => comp_method
   );
   
   DBMS_OUTPUT.PUT_LINE('Expected Compression Ratio for ' || comp_method || ': ' || comp_ratio);
END;
/

4. Check Actual Row Compression

You can get the average row length before and after compression to assess how much space each row occupies:

Before Compression:

SELECT AVG(vsize(id) + vsize(first_name) + vsize(last_name) + vsize(email) + vsize(created_at) + vsize(amount)) AS avg_row_length
FROM source_table;

After Compression:

SELECT AVG(vsize(id) + vsize(first_name) + vsize(last_name) + vsize(email) + vsize(created_at) + vsize(amount)) AS avg_row_length
FROM MY_TABLE_QUERY_HIGH;

5. Monitor Storage Usage Over Time

You can periodically capture storage usage metrics before and after loading data into compressed tables and compare how storage grows for compressed vs uncompressed tables.

By using these methods, you can assess the actual space savings and benefits provided by Hybrid Columnar Compression in your environment.

The DBA_SEGMENTS view helps you see actual sizes, and the DBMS_COMPRESSION package provides a detailed breakdown of compression ratios.

Leave a Reply

Your email address will not be published. Required fields are marked *