
Stata Class Notes 3.0
The Dating Game


Introduction

Stata handles dates by storing them as the number of days since January 1, 1960. Dates can be entered as date strings or as separate month, day and year numeric values. Consider the following dataset containing two dates for each observation:

use http://www.ats.ucla.edu/stat/stata/notes3/dgame, clear

or

input str17 bdate rmon rday ryear
"7-11-1948"            12  13  1990
"1/21/52"               9   4  1993
"Aug 12, 1963"          3  26  1995
"Nov 2 1974"            6  15  1989
"8/24/1988"            11   9  1994
"11-15-1990"            4  30  1991
"December 16, 1957"     7  12  1990
"January 1 1960"        5   8  1992
"1/15/2005"             4  24  1996
end

list

                 bdate       rmon       rday      ryear 
  1.         7-11-1948         12         13       1990  
  2.           1/21/52          9          4       1993  
  3.      Aug 12, 1963          3         26       1995  
  4.        Nov 2 1974          6         15       1989  
  5.         8/24/1988         11          9       1994  
  6.        11-15-1990          4         30       1991  
  7. December 16, 1957          7         12       1990  
  8.    January 1 1960          5          8       1992  
  9.         1/15/2005          4         24       1996 

As you can see, bdate is entered as a string variable. Each of the dates is entered in a slightly different format; Stata can parse all of them except the one with a two-digit year, which needs extra information about the century. The three numeric variables beginning with "r" (rmon, rday and ryear) hold the month, day and year of the second date.
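
As a quick check of the elapsed-date idea from the introduction, the mdy function can be called directly with display. This is a minimal sketch that uses only literal dates, so it does not depend on the dataset above.

```stata
* Elapsed dates count days from January 1, 1960.
* The base date itself is day 0:
display mdy(1, 1, 1960)
* One day after the base date is 1:
display mdy(1, 2, 1960)
* One day before the base date is -1:
display mdy(12, 31, 1959)
```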

Converting to Elapsed Dates

We will use the date function for the string variable and the mdy function for the numeric variables to convert these data into the number of days since January 1, 1960.


generate brthday = date(bdate,"MDY")
generate rdate = mdy(rmon,rday,ryear)
drop bdate - ryear
list
       brthday      rdate 
  1.     -4191      11304  
  2.         .      12300  
  3.      1319      12868  
  4.      5419      10758  
  5.     10463      12731  
  6.     11276      11442  
  7.      -746      11150  
  8.         0      11816  
  9.     16451      13263 

Note that both brthday and rdate are expressed as elapsed days. The brthday values for observations 1 and 7 are negative because those dates fall before January 1, 1960. Also note that brthday for the date 1/21/52 (observation 2) is set to missing. This date contains a two-digit year, and with the "MDY" mask Stata does not know which century to use. In general, Stata can handle dates from 01jan0100 to 31dec9999.
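
One possible way to recover a date with a two-digit year is to tell Stata which century to assume. The sketch below shows two forms that the date function accepts: a mask such as "MD19Y", which reads two-digit years as 19xx, and an optional third argument giving the largest year that a two-digit year may map to. The cutoff of 1999 is an assumption chosen for this dataset, and the example uses a literal string because bdate has already been dropped.

```stata
* Option 1: the mask "MD19Y" reads a two-digit year as 19xx
display %td date("1/21/52", "MD19Y")

* Option 2: a third argument caps two-digit years at the given year,
* so with 1999 the year 52 is read as 1952 rather than 2052
display %td date("1/21/52", "MDY", 1999)
```

Both lines should display 21jan1952. If it were run before the drop command, generate brthday = date(bdate, "MDY", 1999) would fill in observation 2 while leaving the dates with four-digit years unchanged.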

Although Stata stores dates as elapsed days it does not have to display them this way. Using the format command we can control the display format of the dates.

format brthday rdate %dD_m_CY

list

         brthday        rdate 
  1. 11 Jul 1948  13 Dec 1990  
  2.           .  04 Sep 1993  
  3. 12 Aug 1963  26 Mar 1995  
  4. 02 Nov 1974  15 Jun 1989  
  5. 24 Aug 1988  09 Nov 1994  
  6. 15 Nov 1990  30 Apr 1991  
  7. 16 Dec 1957  12 Jul 1990  
  8. 01 Jan 1960  08 May 1992  
  9. 15 Jan 2005  24 Apr 1996 
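
Other display layouts are available as well. As a rough sketch, the lines below show one elapsed date under two formats by using display, so the formats already assigned to brthday and rdate are left alone. The %td prefix is the name that newer Stata releases document for daily date formats; the particular layouts are only illustrations.

```stata
* Default daily format: shows 11jul1948
display %td mdy(7, 11, 1948)

* Two-digit month and day with a four-digit year: shows 07/11/1948
display %tdNN/DD/CCYY mdy(7, 11, 1948)
```

Either format can be attached to a variable in the same way as before, for example format brthday rdate %tdNN/DD/CCYY. Changing the format changes only how the values are displayed, not the elapsed days that are stored.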

Computing with Dates

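Let's find the number of days between brthday and rdate and then convert that into weeks, months and years. The month and year conversions are approximations: they divide by an average month length of 30.5 days and an average year length of 365.25 days.
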
generate diff = rdate - brthday
generate weeks = diff/7
generate months = diff/30.5
generate years = diff/365.25
list
         brthday        rdate       diff      weeks     months      years 
  1. 11 Jul 1948  13 Dec 1990      15495   2213.572   508.0328     42.423  
  2.           .  04 Sep 1993          .          .          .          .  
  3. 12 Aug 1963  26 Mar 1995      11549   1649.857   378.6557   31.61944  
  4. 02 Nov 1974  15 Jun 1989       5339   762.7143   175.0492   14.61738  
  5. 24 Aug 1988  09 Nov 1994       2268        324   74.36066   6.209445  
  6. 15 Nov 1990  30 Apr 1991        166   23.71428   5.442623   .4544832  
  7. 16 Dec 1957  12 Jul 1990      11896   1699.429   390.0328   32.56947  
  8. 01 Jan 1960  08 May 1992      11816       1688   387.4099   32.35044  
  9. 15 Jan 2005  24 Apr 1996      -3188  -455.4286  -104.5246  -8.728269 

Note that diff, weeks, months, and years for observation 9 are negative. This is because brthday (15 Jan 2005) falls after rdate (24 Apr 1996).
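
Dividing by 365.25 gives fractional years. If completed calendar years are wanted instead (for example, age at last birthday), one possible approach is to compare calendar components using the year, month and day functions. The variable name age below is hypothetical, and the sketch assumes brthday, rdate and years still exist as created above.

```stata
* Completed years between brthday and rdate: take the difference in
* calendar years, then subtract one when rdate falls earlier in the
* calendar year than the month and day of brthday
generate age = year(rdate) - year(brthday) - (100*month(rdate) + day(rdate) < 100*month(brthday) + day(brthday))
list brthday rdate years age
```

For observation 1 this gives 42, in line with the 42.423 fractional years above. Observation 2 stays missing because brthday is missing.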


These pages are Copyrighted (c) by UCLA Academic Technology Services