APPROX_COUNT_DISTINCT function in Oracle 12C

Oracle 12C has a new function APPROX_COUNT_DISTINCT which helps us to calculate the approximate number of distinct values from a table. As the name indicate it returns a approximate value and not the exact count. This is how this new feature works in Oracle 12C.

SQL> create table my_tab
2  (
3  Name varchar2(50),
4  id integer,
5  dept_no integer
6  );

Table created.

SQL> insert into my_tab
2  select ‘James’, level, mod(level, 10)
3  from dual connect by level <= 100000;

100000 rows created.

SQL> exec dbms_stats.gather_table_stats(‘<Schema_name>’,’my_tab’);

PL/SQL procedure successfully completed.

SQL> select count(distinct name), count(distinct id), count(distinct dept_no) from my_tab;

COUNT(DISTINCTNAME) COUNT(DISTINCTID) COUNT(DISTINCTDEPT_NO)
——————-                  —————–              ———————-
1                                            100000                             10

Now lets check what will be results of the APPROX_COUNT_DISTINCT against the same table and data.

SQL> select APPROX_COUNT_DISTINCT(name) Name, APPROX_COUNT_DISTINCT(id) id, APPROX_COUNT_DISTINCT(dept_no) dept_no from my_tab;

NAME             ID             DEPT_NO
———-         ———- ———-
1                       101766     10

The value is close and i don’t know what algorithm Oracle is using for this calculation but response time is very less compared to COUNT (DISTINCT) command.

SQL> select APPROX_COUNT_DISTINCT(id) id from my_tab;

Execution Plan
———————————————————-
Plan hash value: 2076930067

--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 5 | 102 (0)| 00:00:01 |
| 1 | SORT AGGREGATE APPROX| | 1 | 5 | | |
| 2 | TABLE ACCESS FULL | MY_TAB | 100K| 488K| 102 (0)| 00:00:01 |
--------------------------------------------------------------------------------

Oracle is not taking this value from the metadata tables and i think it may be using a small sample and derive this vale from that sample.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s