Generate Dates in PostgreSQL

PostgreSQL has shipped with the generate_series() function for quite some
time. This function is often used to generate series of numeric data. For
instance, in my post, Understanding Common Table Expressions with
FizzBuzz
,
I used it to generate a series of integers from 1 to 100.

As of PostgreSQL 8.4, support for generating series of timestamp data was
added. I don't see this used often, so let's take a look at it.

We could take a peak at the online documentation to see the function
signature, or we could pull it up in psql using \df.

> \df generate_series()
                                                               List of functions
   Schema   |      Name       |         Result data type          |                        Argument data types                         |  Type
------------+-----------------+-----------------------------------+--------------------------------------------------------------------+--------
 pg_catalog | generate_series | SETOF bigint                      | bigint, bigint                                                     | normal
 pg_catalog | generate_series | SETOF bigint                      | bigint, bigint, bigint                                             | normal
 pg_catalog | generate_series | 
integer | integer, integer | normal pg_catalog | generate_series | SETOF integer | integer, integer, integer | normal pg_catalog | generate_series | SETOF numeric | numeric, numeric | normal pg_catalog | generate_series | SETOF numeric | numeric, numeric, numeric | normal pg_catalog | generate_series | SETOF timestamp with time zone | timestamp with time zone, timestamp with time zone, interval | normal pg_catalog | generate_series | SETOF timestamp without time zone | timestamp without time zone, timestamp without time zone, interval | normal

The last two records are what we are looking for — support for timestamps
with and without time zones.

Notice that each requires three arguments. The first two are the lower and
upper bound timestamps of the series to be generated. The interval specifies
what amount of spacing to put between each timestamp when generating the
series.

Let's try it out by generating the series of days in 2017.

> select generate_series(
           (date '2017-01-01')::timestamp,
           (date '2017-12-31')::timestamp,
           interval '1 day'
         );

   generate_series
---------------------
 2017-01-01 00:00:00
 2017-01-02 00:00:00
 2017-01-03 00:00:00
 2017-01-04 00:00:00
 2017-01-05 00:00:00
 2017-01-06 00:00:00
 2017-01-07 00:00:00
 2017-01-08 00:00:00
 2017-01-09 00:00:00
 2017-01-10 00:00:00
 2017-01-11 00:00:00
 2017-01-12 00:00:00
 ...

Take note that we have to satisfy the function signature, so we create a
date using date and then coerce it to a timestamp. There are 365
results, so I've truncate them a bit.

PostgreSQL understands the calendar, so you can even count on proper
handling of concepts like a leap year.

> select * from (
    select generate_series(
             (date '2020-01-01')::timestamp,
             (date '2020-12-31')::timestamp,
             interval '1 day'
           )
    ) as twenty_twenty(d)
  where date_part('month', twenty_twenty.d) = 2;

          d
---------------------
 2020-02-01 00:00:00
 2020-02-02 00:00:00
 2020-02-03 00:00:00
 2020-02-04 00:00:00
 2020-02-05 00:00:00
 2020-02-06 00:00:00
 2020-02-07 00:00:00
 2020-02-08 00:00:00
 2020-02-09 00:00:00
 2020-02-10 00:00:00
 2020-02-11 00:00:00
 2020-02-12 00:00:00
 2020-02-13 00:00:00
 2020-02-14 00:00:00
 2020-02-15 00:00:00
 2020-02-16 00:00:00
 2020-02-17 00:00:00
 2020-02-18 00:00:00
 2020-02-19 00:00:00
 2020-02-20 00:00:00
 2020-02-21 00:00:00
 2020-02-22 00:00:00
 2020-02-23 00:00:00
 2020-02-24 00:00:00
 2020-02-25 00:00:00
 2020-02-26 00:00:00
 2020-02-27 00:00:00
 2020-02-28 00:00:00
 2020-02-29 00:00:00

We repurposed our previous query for the year 2020. After wrapping it in a
subquery, we are able to filter what we are looking at, in this case, just
the month of February (i.e. when the month is 2). Notice it has 29 days!

We've only look at intervals of '1 day' so far, but we can do whatever
interval we want. How about every 3 days?

> select generate_series(
           (date '2017-01-01')::timestamp,
           (date '2017-12-31')::timestamp,
           interval '3 days'
         );

   generate_series
---------------------
 2017-01-01 00:00:00
 2017-01-04 00:00:00
 2017-01-07 00:00:00
 2017-01-10 00:00:00
 2017-01-13 00:00:00
 2017-01-16 00:00:00
 2017-01-19 00:00:00
 2017-01-22 00:00:00
 ...

Or every month?

> select generate_series(
           (date '2017-01-01')::timestamp,
           (date '2017-12-31')::timestamp,
           interval '1 month'
         );

   generate_series
---------------------
 2017-01-01 00:00:00
 2017-02-01 00:00:00
 2017-03-01 00:00:00
 2017-04-01 00:00:00
 2017-05-01 00:00:00
 2017-06-01 00:00:00
 2017-07-01 00:00:00
 2017-08-01 00:00:00
 2017-09-01 00:00:00
 2017-10-01 00:00:00
 2017-11-01 00:00:00
 2017-12-01 00:00:00

We can even use a really odd interval of time.

> select generate_series(
           (date '2017-01-01')::timestamp,
           (date '2017-12-31')::timestamp,
           interval '1 month 1 day 1 hour'
         );

   generate_series
---------------------
 2017-01-01 00:00:00
 2017-02-02 01:00:00
 2017-03-03 02:00:00
 2017-04-04 03:00:00
 2017-05-05 04:00:00
 2017-06-06 05:00:00
 2017-07-07 06:00:00
 2017-08-08 07:00:00
 2017-09-09 08:00:00
 2017-10-10 09:00:00
 2017-11-11 10:00:00
 2017-12-12 11:00:00

PostgreSQL's date and timestamp capabilities are quite powerful. What we've
looked at above can be put to great use in report generation. For instance,
if you need aggregate data for certain intervals of time, you can join your
timestamped data against any generate_series table. Instead of pulling
large amounts of data into application land for processing, it can be done
in the database. Your CPU and your users will thank you.


Cover image by bady qb on Unsplash.com

Leave a Reply

Your email address will not be published. Required fields are marked *