UCLA Academic Technology Services

Using dates in SPSS in the year 2000 and beyond
(Handling the Y2K problem in SPSS)

SPSS can handle all Gregorian calendar dates from the year 1582 to the year 9999.  However, as these examples will illustrate, your data and SPSS programs may need some mending to be prepared for the year 2000 and beyond.  This page contains example programs focusing on two main problems: data files which use 2 digits to specify the year (e.g., 12/25/98), and displaying dates using only 2 digits for the year (e.g., 25DEC98).

To change the SPSS default for the century cut-off, please see Customizing SPSS.


The example data file

Imagine that it is the summer of 2001 and you would like to create a very simple SPSS data file containing the names and birthdays of your friends.  The names and birthdates of your friends are...

Noel was born December 25, 1903 (Christmas Day)
Hank was born February 29, 1956 (A leap year)
Mary was born December 31, 1999 (New Year's Eve, before the year 2000)
Eric was born  January  1, 2000 (New Year's Day, year 2000)
Jane was born      July 4, 2001 (Born on the 4th of July)

OK, so it is a short list, and a few of your friends (Mary, Eric and Jane) are a little bit young, but this list will help demonstrate problems which can arise when dealing with dates in the Year 2000 and beyond.  We start with an example where everything is fine: the data uses 4 digits to indicate the year of birth, and the date of birth is displayed using 4 digits for the year.  If your data files and programs are like this example, your data and SPSS programs may be fully ready for the Year 2000.


Example 1.  Everything is Fine.

DATA LIST / name 1-4 (A) bday 6-15 (ADATE).

BEGIN DATA.
Noel 12/25/1903
Hank 02/29/1956
Mary 12/31/1999
Eric 01/01/2000
Jane 07/04/2001
END DATA.

FORMATS bday (ADATE10) .

LIST name bday.

Output From Example 1.

NAME       BDAY
Noel 12/25/1903
Hank 02/29/1956
Mary 12/31/1999
Eric 01/01/2000
Jane 07/04/2001


In Example 1 above, SPSS is used to read the names and birthdates of your friends, and then LIST is used to display their names and birthdates.  This example has two nice features.

  1. The birthdays are indicated using 4 digit years (e.g. 12/25/1903, NOT 12/25/03)
  2. LIST uses 4 digits to display the year of birth.  The FORMATS statement explicitly indicated that bday should be displayed as an American date with a width of 10, i.e., mm/dd/yyyy, even though SPSS may have used this format anyway.

By using 4 digit years to store the birthdays, and by displaying the birthdates using 4 digit years, this program is ready for the Year 2000.  In fact, this program is ready for the year 2100, 2200, all the way up to the year 9999. However, if a different SPSS FORMAT is used for displaying the birth dates, you may not be sure when some of your friends were born, as shown in Example 2 below.


Example 2.  Displaying Dates Using 2 Digit Years.

DATA LIST / name 1-4 (A) bday 6-15 (ADATE).

BEGIN DATA.
Noel 12/25/1903
Hank 02/29/1956
Mary 12/31/1999
Eric 01/01/2000
Jane 07/04/2001
END DATA.

FORMATS bday (ADATE8) .

LIST name bday.

Output From Example 2.

NAME     BDAY

Noel 12/25/03
Hank 02/29/56
Mary 12/31/99
Eric ********
Jane ********


Example 2 is just like Example 1, except that the ADATE8 format is used to display the birthdays.  As you can see, the FORMATS (ADATE8) command told SPSS to display the birth dates as mm/dd/yy, which leaves no room for a 4 digit year.  As a result, SPSS displayed the dates from the 1900s correctly, but displayed ******** for the birthdays which are from the year 2000 and beyond.  When dealing with dates which are in the year 2000 and beyond, it is important to choose a display format which will display dates using 4 digits for the year (e.g., ADATE10). However, there is a greater problem if the data only includes 2 digits for the year of birth, as shown in Example 3 below.
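The loss of information in a 2 digit display is not specific to SPSS.  As a rough illustration outside SPSS, the Python sketch below formats two dates exactly 100 years apart.  The function names show_mmddyy and show_mmddyyyy are hypothetical helpers written for this page, not SPSS functions.  Note one difference: this sketch silently drops the century, whereas the SPSS output above shows asterisks for the dates it cannot display.

```python
from datetime import date


def show_mmddyy(d: date) -> str:
    """Format a date as mm/dd/yy; the century is simply dropped."""
    return f"{d.month:02d}/{d.day:02d}/{d.year % 100:02d}"


def show_mmddyyyy(d: date) -> str:
    """Format a date as mm/dd/yyyy, keeping all 4 digits of the year."""
    return f"{d.month:02d}/{d.day:02d}/{d.year:04d}"


early = date(1903, 12, 25)
late = date(2003, 12, 25)  # illustrative date, exactly 100 years later

# With a 2 digit year the two dates look identical.
print(show_mmddyy(early), show_mmddyy(late))      # 12/25/03 12/25/03
# With a 4 digit year they stay distinct.
print(show_mmddyyyy(early), show_mmddyyyy(late))  # 12/25/1903 12/25/2003
```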


Example 3.  Inputting Dates Using 2 Digit Years.

DATA LIST / name 1-4 (A) bday 6-15 (ADATE) .

BEGIN DATA .
Noel 12/25/03
Hank 02/29/56
Mary 12/31/99
Eric 01/01/00
Jane 07/04/01
END DATA .

FORMATS bday (ADATE10) .

LIST name bday.

Output From Example 3.

NAME       BDAY
Noel 12/25/1903
Hank 02/29/1956
Mary 12/31/1999
Eric 01/01/1900
Jane 07/04/1901


Example 3 demonstrates a problem with inputting dates using only a 2 digit year.  For example, Eric was born on Jan 1, 2000 but his birthday is input as 01/01/00.  With a 2 digit year like this, SPSS (under the century cut-off in effect for this example) ASSUMES that the century portion is 19.  As you can see in the output, Eric is incorrectly assigned a birthday of Jan 1, 1900.  In this simple example we could enter the data all over again using 4 digits for the year of birth.  However, you may have data files with thousands or millions of records using dates with 2 digit years.  Example 4, shown below, illustrates a possible solution to this problem by telling SPSS when to treat a birthday as coming from the 1900s and when to treat a birthday as coming from the 2000s.
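The effect of a fixed "century is 19" assumption can be reproduced outside SPSS.  The Python sketch below is only an illustration of that assumption, not a description of how SPSS reads dates internally; the function name parse_mmddyy_assume_1900s is hypothetical.

```python
from datetime import date


def parse_mmddyy_assume_1900s(text: str) -> date:
    """Parse mm/dd/yy, always assuming the missing century is 19."""
    month, day, yy = (int(part) for part in text.split("/"))
    return date(1900 + yy, month, day)


# Same input values as Example 3.
birthdays = {
    "Noel": "12/25/03",
    "Hank": "02/29/56",
    "Mary": "12/31/99",
    "Eric": "01/01/00",
    "Jane": "07/04/01",
}

for name, text in birthdays.items():
    # Eric and Jane come out as 1900 and 1901, a century too early.
    print(name, parse_mmddyy_assume_1900s(text).isoformat())
```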


Example 4.  Inputting Dates Using 2 Digit Years With an IF Statement.

DATA LIST / NAME 1-4 (A) bday 6-13 (ADATE) .

BEGIN DATA .
Noel 12/25/03
Hank 02/29/56
Mary 12/31/99
Eric 01/01/00
Jane 07/04/01
END DATA .

IF (XDATE.YEAR(bday) LE 1902)
  bday=DATE.MDY(XDATE.MONTH(bday), XDATE.MDAY(bday), XDATE.YEAR(bday)+100 ).

FORMATS bday (ADATE10) .

LIST name bday.

Output From Example 4.

NAME       BDAY
Noel 12/25/1903
Hank 02/29/1956
Mary 12/31/1999
Eric 01/01/2000
Jane 07/04/2001


Example 4 demonstrates using an IF statement to deal with dates which have 2 digit years.  The IF statement instructs SPSS to add 100 years to the year of birth IF the year of birth is 1902 or earlier.  This IF statement attempts to draw a line at a certain year (in this case 1902).  Years after that line are treated as being from the 1900s (i.e., 1903 to 1999 stay 1903-1999), but years at or before it (1900-1902) are treated as coming from the 2000s (2000-2002).  As you can see in this output, this seems to have mended our problem with the birthdays using 2 digit years.  Eric and Jane are now properly understood to have a birthday in the years 2000 and 2001 respectively.  However, Example 5 below shows a major weakness in this strategy, when a 2 digit year could mean 19xx or 20xx.
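For readers who want to check the cutoff logic away from SPSS, here is a minimal Python sketch of the same rule: dates whose year is at or before the cutoff are moved forward 100 years.  The name apply_cutoff and its default of 1902 are assumptions chosen to mirror the IF statement in Example 4.

```python
from datetime import date


def apply_cutoff(d: date, cutoff_year: int = 1902) -> date:
    """Move a date forward 100 years if its year is <= cutoff_year.

    Caveat: date.replace raises ValueError for a Feb 29 date whose
    shifted year is not a leap year (e.g. 2000 -> 2100).
    """
    if d.year <= cutoff_year:
        return d.replace(year=d.year + 100)
    return d


# Dates as they look after being read with the century assumed to be 19.
as_read = {
    "Noel": date(1903, 12, 25),
    "Mary": date(1999, 12, 31),
    "Eric": date(1900, 1, 1),
    "Jane": date(1901, 7, 4),
}

for name, d in as_read.items():
    print(name, apply_cutoff(d).isoformat())
```

Only Eric and Jane fall at or before the cutoff, so only their dates are shifted into the 2000s.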


Example 5.  Problems Inputting Dates Using 2 Digit Years With an IF Statement.

DATA LIST / name 1-4 (A) bday 6-13 (ADATE) .

BEGIN DATA .
Noel 12/25/03
Hank 02/29/56
Mary 12/31/99
Eric 01/01/00
Jane 07/04/01
Will 10/31/03
END DATA .

IF (XDATE.YEAR(bday) LE 1902)
  bday=DATE.MDY(XDATE.MONTH(bday), XDATE.MDAY(bday), XDATE.YEAR(bday)+100 ).

FORMATS bday (ADATE10) .

LIST name bday.

Output From Example 5.

NAME       BDAY
Noel 12/25/1903
Hank 02/29/1956
Mary 12/31/1999
Eric 01/01/2000
Jane 07/04/2001
Will 10/31/1903


Example 5 demonstrates the major weakness of using an IF statement for solving the problems with 2 digit years. It is now Winter 2003 and you have a new friend, Will, born on October 31, 2003 (Halloween 2003).  As you can see, Noel, born in 1903, and Will, born in 2003, both have a 2 digit birth year of 03.  The IF statement cannot differentiate between Noel and Will, and in this case both are treated as being born in the 1900s.
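Before relying on a cutoff year, it can help to check whether the true years in your data span more than 99 years, because any two years exactly a century apart collapse to the same 2 digit value.  The Python sketch below is one hypothetical way to run that check when the true years are known; the function name ambiguous_two_digit_years is an assumption made for this illustration.

```python
def ambiguous_two_digit_years(true_years):
    """Return {yy: [full years]} for 2 digit years shared by several full years."""
    by_yy = {}
    for year in true_years:
        by_yy.setdefault(year % 100, set()).add(year)
    return {yy: sorted(years) for yy, years in by_yy.items() if len(years) > 1}


# True birth years of the six friends in Example 5.
years = [1903, 1956, 1999, 2000, 2001, 2003]

# Noel (1903) and Will (2003) share the 2 digit year 03, so no single
# cutoff year can assign both of them to the correct century.
print(ambiguous_two_digit_years(years))  # {3: [1903, 2003]}
```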

Using an IF statement is only useful when you can clearly specify a cutoff year which divides years which should be treated as 19xx from years which should be treated as 20xx.  However, when this line becomes blurred, this solution fails.  You can permanently solve your problem by revising your data file to use 4 digit years (e.g. as shown in Example 1), but this could be very costly and time consuming, requiring you to entirely restructure your data files and shift column locations for all other variables.  Example 6 shows a compromise solution by using a new variable to indicate the century portion of the date.


Example 6.  Inputting Dates Using 2 Digit Years Using an Additional Century Variable.

DATA LIST / name 1-4 (A) bday 6-13 (ADATE) bday_yy 15-16.

BEGIN DATA .
Noel 12/25/03
Hank 02/29/56
Mary 12/31/99
Eric 01/01/00 20
Jane 07/04/01 20
Will 10/31/03 20
END DATA .

IF (bday_yy EQ 20)
  bday=DATE.MDY(XDATE.MONTH(bday), XDATE.MDAY(bday), XDATE.YEAR(bday)+100 ).

FORMATS bday (ADATE10) .

LIST name bday.

Output From Example 6.

NAME       BDAY

Noel 12/25/1903
Hank 02/29/1956
Mary 12/31/1999
Eric 01/01/2000
Jane 07/04/2001
Will 10/31/2003


Example 6 solves the problem of the dates with 2 digit years by creating a separate variable indicating the century portion of the date.  As you can see in the output, everyone is correctly assigned the proper birthdate because the data EXPLICITLY indicates which birthdates should have a 20 prefixed to the year (using the bday_yy variable).  Here are some important points about this program.

  1. This strategy does not require that you change your existing data for dates that remain in the 1900s.  You only need to include the bday_yy variable for the records with dates (birthdays) in the 2000s.
  2. This strategy does not require that you change the column locations for your existing variables as long as you place the century variable(s) to the right of all existing data.  Note that bday_yy comes to the right of all other variables.
  3. The IF statement checks to see if the century (i.e., bday_yy) is 20.  If bday_yy is 20, the XDATE.MONTH, XDATE.MDAY and XDATE.YEAR functions are used to extract the Month, Day and Year, and 100 is added to the year to move it from 19xx to 20xx.  Finally, the DATE.MDY function is used to combine the Month, Day, and modified Year to create the birthdate with the year 20xx.
  4. Note that this example contained only one date variable.  Your data file may contain more than one date variable, so you may need to have a century variable for each of the dates in your data file.  You would then need a separate IF statement corresponding to each of the date variables.
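The century-variable logic described in the points above can also be sketched outside SPSS.  In the Python illustration below, each record carries an optional century value (None when the field is blank, as for Noel, Hank and Mary), and only records marked 20 are shifted.  The name apply_century is hypothetical, and the input dates are assumed to have already been read with the century taken as 19.

```python
from datetime import date
from typing import Optional


def apply_century(d: date, century: Optional[int]) -> date:
    """Shift a 19xx date to 20xx only when the record is explicitly marked 20."""
    if century == 20:
        return d.replace(year=d.year + 100)
    return d  # blank (None) or any other value: leave the date alone


# (name, date as read with century 19, century field from the data)
records = [
    ("Noel", date(1903, 12, 25), None),
    ("Hank", date(1956, 2, 29), None),
    ("Mary", date(1999, 12, 31), None),
    ("Eric", date(1900, 1, 1), 20),
    ("Jane", date(1901, 7, 4), 20),
    ("Will", date(1903, 10, 31), 20),
]

for name, d, century in records:
    print(name, apply_century(d, century).isoformat())
```

Because the century is stated per record rather than inferred from the year, Noel and Will end up in different centuries even though both were entered with the year 03.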

Conclusion

These examples illustrate some of the problems which will arise when using SPSS to process dates for the Year 2000 and beyond.  For more information, please see the links on our Statistical Computing and the Year 2000 page.  For assistance solving Year 2000 problems in statistical computing, feel free to use the Statistical Consulting Services provided by the UCLA Academic Technology Services.


These pages are Copyrighted (c) by UCLA Academic Technology Services


The content of this web site should not be construed as an endorsement of any particular web site, book, or software product by the University of California