Datespart of SQL for Web Nerds by Philip Greenspun, updated June 13, 2003 |
timestamp
and interval
. Earlier versions of
Oracle represented points in time with the date
datatype,
which is precise to within one second, and time intervals as numbers
(where 1 = one day).
We strongly recommend that you use the newly available ANSI data types when building new applications. These are cleaner and more powerful than the older Oracle-specific way of doing things and further will make it easier to port your application to another RDBMS if necessary.
If you are stuck using an older version of Oracle or are writing queries and transactions to an older data model, please visit http://philip.greenspun.com/sql/dates-pre-9.
Let's query for people who registered during the last day:create table users ( user_id integer primary key, first_names varchar(50), last_name varchar(50) not null, email varchar(100) not null unique, -- we encrypt passwords using operating system crypt function password varchar(30) not null, -- we only need precision to within one second registration_date timestamp(0) ); -- add some sample data insert into users (user_id, first_names, last_name, email, password, registration_date) values (1,'schlomo','mendelowitz','schlomo@mendelowitz.com','67xui2', to_timestamp('2003-06-13 09:15:00','YYYY-MM-DD HH24:MI:SS')); insert into users (user_id, first_names, last_name, email, password, registration_date) values (2,'George Herbert Walker','Bush','former-president@whitehouse.gov','kl88q', to_timestamp('2003-06-13 15:18:22','YYYY-MM-DD HH24:MI:SS'));
Note how the registration date comes out in a non-standard format that won't sort lexicographically and that does not have a full four digits for the year. You should curse your database administrator at this point for not configuring Oracle with a more sensible default. You can fix the problem for yourself right now, however:column email format a35 column registration_date format a25 select email, registration_date from users where registration_date > current_date - interval '1' day; EMAIL REGISTRATION_DATE ----------------------------------- ------------------------- schlomo@mendelowitz.com 13-JUN-03 09.15.00 AM former-president@whitehouse.gov 13-JUN-03 03.18.22 PM
You can query for shorter time intervals:alter session set nls_timestamp_format = 'YYYY-MM-DD HH24:MI:SS'; select email, registration_date from users where registration_date > current_date - interval '1' day; EMAIL REGISTRATION_DATE ----------------------------------- ---------------------- schlomo@mendelowitz.com 2003-06-13 09:15:00 former-president@whitehouse.gov 2003-06-13 15:18:22
You can be explicit about how you'd like the timestamps formatted:select email, registration_date from users where registration_date > current_date - interval '1' hour; EMAIL REGISTRATION_DATE ----------------------------------- ------------------------- former-president@whitehouse.gov 2003-06-13 15:18:22 select email, registration_date from users where registration_date > current_date - interval '1' minute; no rows selected select email, registration_date from users where registration_date > current_date - interval '1' second; no rows selected
Oops. Oracle pads some of these fields by default so that reports will be lined up and neat. We'll have to trim the strings ourselves:select email, to_char(registration_date,'Day, Month DD, YYYY') as reg_day from users order by registration_date; EMAIL REG_DAY ----------------------------------- ----------------------------- schlomo@mendelowitz.com Friday , June 13, 2003 former-president@whitehouse.gov Friday , June 13, 2003
select email, trim(to_char(registration_date,'Day')) || ', ' || trim(to_char(registration_date,'Month')) || ' ' || trim(to_char(registration_date,'DD, YYYY')) as reg_day from users order by registration_date; EMAIL REG_DAY ----------------------------------- ---------------------------- schlomo@mendelowitz.com Friday, June 13, 2003 former-president@whitehouse.gov Friday, June 13, 2003
alter session set nls_date_format = 'YYYY-MM-DD'; -- old select add_months(to_date('2003-07-31','YYYY-MM-DD'),-1) from dual; ADD_MONTHS ---------- 2003-06-30 -- new select to_timestamp('2003-07-31','YYYY-MM-DD') - interval '1' month from dual; ERROR at line 1: ORA-01839: date not valid for month specified -- old select to_date('2003-07-31','YYYY-MM-DD') - 100 from dual; TO_DATE('2 ---------- 2003-04-22 -- new (broken) select to_timestamp('2003-07-31','YYYY-MM-DD') - interval '100' day from dual; ERROR at line 1: ORA-01873: the leading precision of the interval is too small -- new (note the extra "(3)") select to_timestamp('2003-07-31','YYYY-MM-DD') - interval '100' day(3) from dual; TO_TIMESTAMP('2003-07-31','YYYY-MM-DD')-INTERVAL'100'DAY(3) ------------------------------------------------------------- 2003-04-22 00:00:00
Let's add a few more rows to our users table to see how this works.
Suppose that we're interested in the average length of time between registrations. With so few rows we could just query all the data out and eyeball it:insert into users (user_id, first_names, last_name, email, password, registration_date) values (3,'Osama','bin Laden','50kids@aol.com','dieusa', to_timestamp('2003-06-13 17:56:03','YYYY-MM-DD HH24:MI:SS')); insert into users (user_id, first_names, last_name, email, password, registration_date) values (4,'Saddam','Hussein','livinlarge@saudi-online.net','wmd34', to_timestamp('2003-06-13 19:12:43','YYYY-MM-DD HH24:MI:SS'));
If we have a lot of data, however, we'll need to do a self-join.select registration_date from users order by registration_date; REGISTRATION_DATE ------------------------- 2003-06-13 09:15:00 2003-06-13 15:18:22 2003-06-13 17:56:03 2003-06-13 19:12:43
Notice that to find the "next row" for the pairing we are using thecolumn r1 format a21 column r2 format a21 select u1.registration_date as r1, u2.registration_date as r2 from users u1, users u2 where u2.user_id = (select min(user_id) from users where registration_date > u1.registration_date) order by r1; R1 R2 --------------------- --------------------- 2003-06-13 09:15:00 2003-06-13 15:18:22 2003-06-13 15:18:22 2003-06-13 17:56:03 2003-06-13 17:56:03 2003-06-13 19:12:43
user_id
column, which we know to be sequential and unique,
rather than the registration_date column, which may not be unique
because two users could register at exactly the same time.
Now that we have information from adjacent rows paired up in the same report we can begin to calculate intervals:
The interval for each row of the report has come back as days, hours, minutes, and seconds. At this point you'd expect to be able to average the intervals:column reg_gap format a21 select u1.registration_date as r1, u2.registration_date as r2, u2.registration_date-u1.registration_date as reg_gap from users u1, users u2 where u2.user_id = (select min(user_id) from users where registration_date > u1.registration_date) order by r1; R1 R2 REG_GAP --------------------- --------------------- --------------------- 2003-06-13 09:15:00 2003-06-13 15:18:22 +000000000 06:03:22 2003-06-13 15:18:22 2003-06-13 17:56:03 +000000000 02:37:41 2003-06-13 17:56:03 2003-06-13 19:12:43 +000000000 01:16:40
Oops. Oracle isn't smart enough to aggregate time intervals. And sadly there doesn't seem to be an easy way to turn a time interval into a number of seconds, for example, that would be amenable to averaging. If you figure how out to do it, please let me know!select avg(reg_gap) from (select u1.registration_date as r1, u2.registration_date as r2, u2.registration_date-u1.registration_date as reg_gap from users u1, users u2 where u2.user_id = (select min(user_id) from users where registration_date > u1.registration_date)) ERROR at line 1: ORA-00932: inconsistent datatypes: expected NUMBER got INTERVAL
Should we give up? If you have a strong stomach you can convert the timestamps to old-style Oracle dates through character strings before creating the intervals. This will give us a result as a fraction of a day:
If we're going to continue using this ugly query we ought to create a view:select avg(reg_gap) from (select u1.registration_date as r1, u2.registration_date as r2, to_date(to_char(u2.registration_date,'YYYY-MM-DD HH24:MI:SS'),'YYYY-MM-DD HH24:MI:SS') - to_date(to_char(u1.registration_date,'YYYY-MM-DD HH24:MI:SS'),'YYYY-MM-DD HH24:MI:SS') as reg_gap from users u1, users u2 where u2.user_id = (select min(user_id) from users where registration_date > u1.registration_date)) AVG(REG_GAP) ------------ .13836034
Now we can calculate the average time interval in minutes:create view registration_intervals as select u1.registration_date as r1, u2.registration_date as r2, to_date(to_char(u2.registration_date,'YYYY-MM-DD HH24:MI:SS'),'YYYY-MM-DD HH24:MI:SS') - to_date(to_char(u1.registration_date,'YYYY-MM-DD HH24:MI:SS'),'YYYY-MM-DD HH24:MI:SS') as reg_gap from users u1, users u2 where u2.user_id = (select min(user_id) from users where registration_date > u1.registration_date)
select 24*60*avg(reg_gap) as avg_gap_minutes from registration_intervals; AVG_GAP_MINUTES --------------- 199.238889
to_char
function an GROUP BY
to generate a report of sales by calendar quarter:
This is a hint that Oracle has all kinds of fancy date formats (covered in their online documentation). We're using the "Q" mask to get the calendar quarter. We can see that this product started shipping in Q2 1998 and that revenues trailed off in Q4 1998.select to_char(shipped_date,'YYYY') as shipped_year, to_char(shipped_date,'Q') as shipped_quarter, sum(price_charged) as revenue from sh_orders_reportable where product_id = 143 and shipped_date is not null group by to_char(shipped_date,'YYYY'), to_char(shipped_date,'Q') order by to_char(shipped_date,'YYYY'), to_char(shipped_date,'Q'); SHIPPED_YEAR SHIPPED_QUARTER REVENUE -------------------- -------------------- ---------- 1998 2 1280 1998 3 1150 1998 4 350 1999 1 210
You state: >> no way in standard SQL to refer to "the value of this column from the previous row in the report".At least in Oracle 8i SQL, there is a way in to refer to this, I'm sure it isn't standard, but useful nonetheless, and so I present it here.
It is called an Analytic Function. There are several, but the one demonstrated in this example is LAST_VALUE.
SELECT r1, r2, r2 - r1 reg_gap FROM (SELECT u1.update_date AS r1, LAST_VALUE (update_date) OVER (ORDER BY update_date ASC ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING) AS r2 FROM users u1 WHERE u1.user_id > 100000) WHERE r1 <> r2 ORDER BY r1
From the inside out, I take the update_date from the users table, and using the LAST_VALUE function, I ask for the last update_date value, including in the window the current row and the next chronologically ordered row.
I used a higher level query to do the difference simply to avoid repeating the long function, but I could have done it in one.
The results are the same:
"R1" "R2" "REG_GAP" 11/10/2003 5:19:00 PM 11/10/2003 8:23:24 PM 0.128055555555556 11/10/2003 8:23:24 PM 11/12/2003 7:53:10 AM 1.47900462962963 11/12/2003 7:53:10 AM 2/13/2004 3:44:47 PM 93.3275115740741
Although, as I said, I'm using 8i so I don't have the interval type.
To find out more about Analytic Functions, check out the Oracle Documentation SQL Reference.
KSF
-- K SF, September 1, 2004
"Some Profoundly Painful Things -- Calculating time intervals between rows in a table" is very useful, thank you. Some people may need the following technique to establish a sequential numeric identifier. (In the example you assume "user_id column, which we know to be sequential and unique")declare @tmp (registration_date datetime)
insert @tmp
select identity(int,1,1) as Sequence, registration_date into #x from users order by registration_date
... (now use #x instead of users in the example)
drop table #x
-- Steve Davis, January 29, 2006
You say: "Oops. Oracle pads some of these fields by default so that reports will be lined up and neat. We'll have to trim the strings ourselves." Not quite: one can use FM modifier in format string to instruct Oracle to trim whitespace from resulting string automatically, like this:SQL> select to_char(sysdate,'Day, Month DD, YYYY') from dual; TO_CHAR(SYSDATE,'DAY,MONTHDD, ----------------------------- Monday , May 22, 2006 SQL> select to_char(sysdate,'FMDay, Month DD, YYYY') from dual; TO_CHAR(SYSDATE,'FMDAY,MONTHD ----------------------------- Monday, May 22, 2006Note that FM is a switch - second FM in format string negates the effect of the first.
-- Vladimir Zakharychev, May 22, 2006
Not pretty at all but it works...CREATE OR REPLACE FUNCTION interval_to_seconds(x INTERVAL DAY TO SECOND ) RETURN NUMBER IS s VARCHAR2(26); days_s VARCHAR2(26); time_s VARCHAR2(26); N NUMBER(10,6); BEGIN s := TO_CHAR(x); days_s := SUBSTR(s,2,INSTR(s,' ')-2); time_s := SUBSTR(s,2+LENGTH(days_s)+1); N := 86400*TO_NUMBER(days_s) + 3600*TO_NUMBER(SUBSTR(time_s,1,2)) + 60*TO_NUMBER(SUBSTR(time_s,4,2)) + TO_NUMBER(SUBSTR(time_s,7)); IF SUBSTR(s,1,1) = '-' THEN N := - N; END IF; RETURN N; END;
-- Andre Mostert, June 20, 2006
1.Find the first monday on every quater based on date ?
Select Next_day(trunc(to_date(sysdate,'DD-MON-YYYY'), 'Q')-1,'Monday') from dual
-- Mohamed Kaleel, April 13, 2007
Computing number of seconds in an interval:FUNCTION seconds_from_interval(invInterval IN INTERVAL DAY TO SECOND) RETURN NUMBER IS BEGIN
RETURN EXTRACT (DAY FROM invInterval) * 86400 +
EXTRACT (HOUR FROM invInterval) * 3600 +
EXTRACT (MINUTE FROM invInterval) * 60 +
EXTRACT (SECOND FROM invInterval);
END seconds_from_interval;
-- Bob Jarvis, March 4, 2008