Wednesday, August 13, 2008

Two Famous Pseudocolumns: rowid and rownum

A pseudocolumn is a column that looks like a column but really is not a column. What! Stick around and find out what a pseudocolumn really is.

The best way to describe what a pseudocolumn is would be to take two of Oracle's most popular pseudocolumns, ROWID and ROWNUM, and explain their use. While use of these two pseudocolumns has been wide spread, they have also been misused or misunderstood. This article should shed some light on them and provide some use cases that you can make use of.

ROWID

Probably the most famous of all Oracle pseudocolumns is the ROWID pseudocolumn. This returns for each row selected the unique internal row address for the physical row. The rowid of a row contains such information as the object_id, file_id, block_id, and row number that uniquely identifies a row in the table. Moreover, since index lookups only evaluate to a rowid and then go after the data, accessing a row by rowid is the fastest way to extract information. If you care to explore these values just look at the DBMS_ROWID package in the Oracle manuals.

SQL> SELECT ROWID, dog_origin.* FROM dog_origin;
ROWID COUNTRY BREED BREED_SIZE
------------------ --------------- ------------------------ ---------------
AAAQlCAAEAAAAKQAAA Country Breed Breed_size
AAAQlCAAEAAAAKQAAB Germany German Shepherd Dog Big
AAAQlCAAEAAAAKQAAC Germany Dobermann Big
AAAQlCAAEAAAAKQAAD Germany Rottweiler Big
AAAQlCAAEAAAAKQAAE USA Siberian Husky Medium
AAAQlCAAEAAAAKQAAF USA Alaskan Malamute Medium
AAAQlCAAEAAAAKQAAG USA American Bulldog Big
AAAQlCAAEAAAAKQAAH Switzerland Bernese Mountain Dog Big
AAAQlCAAEAAAAKQAAI Switzerland Saint Bernard Dog Big
AAAQlCAAEAAAAKQAAJ Switzerland Entlebuch Cattle Dog Medium
AAAQlCAAEAAAAKQAAK Australia Australian Cattle Dog Medium
AAAQlCAAEAAAAKQAAL Australia Jack Russell Terrier Small

Probably the most famous use for the ROWID is to get rid of duplicates in a table. To do this you need only construct a SQL statement such as this. Basically what happens here is that you delete from the table dog_origin where its rowid is not equal to the MAX(rowid) on the related columns. This particular query example compares all columns in the table dog_origin, but depending on your requirements to scrub the data you may compare a subset of the rows.

SQL> DELETE FROM dog_origin a
WHERE rowid <> ( SELECT MAX(rowid) FROM dog_origin b
WHERE a.country = b.country
AND a.breed = b.breed
AND a.breed_size = b.breed_size;

ROWNUM

The ROWNUM pseudocolumn returns, for each row, a number that indicates the order in which the row was selected from the table. For instance, we can select all rows from the dog_origin table and ROWNUM is assigned in the following manner.

SQL> SELECT rownum, country, breed, breed_size FROM dog_origin;
ROWNUM COUNTRY BREED BREED_SIZE
---------- --------------- ------------------------- ----------
1 Country Breed Breed_size
2 Germany German Shepherd Dog Big
3 Germany Dobermann Big
4 Germany Rottweiler Big
5 USA Siberian Husky Medium
6 USA Alaskan Malamute Medium
7 USA American Bulldog Big
8 Switzerland Bernese Mountain Dog Big
9 Switzerland Saint Bernard Dog Big
10 Switzerland Entlebuch Cattle Dog Medium
11 Australia Australian Cattle Dog Medium
12 Australia Jack Russell Terrier Small

The major misconception in using ROWNUM is that many believe it can be used to rank a row just by ordering the results of table. As can be seen in the next query, when the rows are ordered by BREED, the ROWNUM does not provide a true ranking of the order by breed name.

SQL> SELECT rownum, country, breed, breed_size FROM dog_origin ORDER BY breed;
ROWNUM COUNTRY BREED BREED_SIZE
---------- --------------- ------------------------- ----------
6 USA Alaskan Malamute Medium
7 USA American Bulldog Big
11 Australia Australian Cattle Dog Medium
8 Switzerland Bernese Mountain Dog Big
1 Country Breed Breed_size
3 Germany Dobermann Big
10 Switzerland Entlebuch Cattle Dog Medium
2 Germany German Shepherd Dog Big
12 Australia Jack Russell Terrier Small
4 Germany Rottweiler Big
9 Switzerland Saint Bernard Dog Big
5 USA Siberian Husky Medium

In order to get around this limitation of ROWNUM for ranking a sort order, you can use the following subquery to first order the rows and then assign a ROWNUM to the rows. What you should notice is that ROWNUM is assigned and dependent upon the order in which a row is selected from a table or ordered from within a subquery of a SQL statement. Sometimes an index may be employed to enforce the order of selection but should not be relied upon as statistics or data patterns may change that render the index useless.

SQL> SELECT rownum, country, breed, breed_size
FROM (SELECT country, breed, breed_size FROM dog_origin ORDER BY breed)
ROWNUM COUNTRY BREED BREED_SIZE
---------- --------------- ------------------------- --------------------
1 USA Alaskan Malamute Medium
2 USA American Bulldog Big
3 Australia Australian Cattle Dog Medium
4 Switzerland Bernese Mountain Dog Big
5 Country Breed Breed_size
6 Germany Dobermann Big
7 Switzerland Entlebuch Cattle Dog Medium
8 Germany German Shepherd Dog Big
9 Australia Jack Russell Terrier Small
10 Germany Rottweiler Big
11 Switzerland Saint Bernard Dog Big
12 USA Siberian Husky Medium

So now, we can ask, and provide, a query to return the top-3 breeds in alphabetical order.

SQL> SELECT rownum, country, breed, breed_size
FROM (SELECT country, breed, breed_size FROM dog_origin ORDER BY breed)
WHERE rownum <
4

ROWNUM COUNTRY BREED BREED_SIZE
---------- --------------- ------------------------- --------------------
1 USA Alaskan Malamute Medium
2 USA American Bulldog Big
3 Australia Australian Cattle Dog Medium

When using the ROWNUM, one must be cautious when trying to evaluate the ROWNUM to a variable and the greater than (>) or equality (=) predicate. For instance, the following query produces a result set where the ROWNUM is equal to 1.

SQL> SELECT * FROM dog_origin WHERE rownum = 1;
COUNTRY BREED BREED_SIZE
--------------- ------------------------- -----------------
Country Breed Breed_size

However, when we issue the following two SQL statements there is returned NO ROWS!

SQL> SELECT * FROM dog_origin WHERE rownum = 2;
no rows selected
SQL> SELECT * FROM dog_origin WHERE rownum > 1;
no rows selected

No rows are returned because the rownum is evaluated during row retrieval and assigned if passing the WHERE clause test. The first row returned is assigned a rownum of 1, it does not pass either test (equal to 2 or greater than 1), and so is not part of the result set. The next row returned is again assigned the rownum of 1 and again does not pass the test. This happens for every row and thus no rows are returned.

ROW_NUMBER

Do not confuse the previous ROWNUM with the function ROW_NUMBER. ROWNUM is a true pseudocolumn while ROW_NUMBER is a function that also assigns a unique number to a row, based on an order by clause. This is the true method, in Oracle, to produce a ranking of rows from a SQL statement. An example is presented here only to contrast the previous ROWNUM method and because it is a problem that faces many SQL practitioners. It is much easier to understand then implementing a subquery.

SQL> SELECT rownum, country, breed, breed_size, row_number()
OVER (ORDER BY breed) as row_number
FROM dog_origin
ROWNUM COUNTRY BREED BREED_SIZE ROW_NUMBER
---------- --------------- ------------------------- ---------- ----------
6 USA Alaskan Malamute Medium 1
7 USA American Bulldog Big 2
11 Australia Australian Cattle Dog Medium 3
8 Switzerland Bernese Mountain Dog Big 4
1 Country Breed Breed_size 5
3 Germany Dobermann Big 6
10 Switzerland Entlebuch Cattle Dog Medium 7
2 Germany German Shepherd Dog Big 8
12 Australia Jack Russell Terrier Small 9
4 Germany Rottweiler Big 10
9 Switzerland Saint Bernard Dog Big 11
5 USA Siberian Husky Medium 12

Both ROWID and ROWNUM have been around Oracle for quite some time. They have been used for many purposes from assigning unique keys to rows during the modeling process to maintenance tasks that allow you to 'tag' rows that need deletion or other updates applied. Understanding how to use these two pseudocolumns and their limitations will enable you to use them with more efficiency and exactness in your daily activities.

==p.s.==


select * from
(select rownum as position, exact_match, close_match, persona_id from
(select * from ldb_player_scores where week_id=82 ORDER by (close_match+ exact_match) desc)
)
where persona_id=110

is equivalent to

select * from
(select
row_number() over (ORDER by close_match+ exact_match desc) as position, exact_match,
close_match, persona_id from ldb_player_scores where week_id=82) where persona_id=110

No comments:

Post a Comment