|
|
|
||||
|
|
|||||
input str5 cust case str11 day str20 status "brad" 1 "23 Oct 2003" "dropped off" "brad" 1 "25 Oct 2003" "fixed" "brad" 1 "27 Oct 2003" "picked up" "nikki" 2 "23 Oct 2003" "dropped off" "nikki" 2 "23 Oct 2003" "fixed" "nikki" 2 "28 Oct 2003" "picked up" "chloe" 3 "30 Oct 2003" "dropped off" "chloe" 3 "05 Nov 2003" "fixed" "chloe" 3 "06 Nov 2003" "picked up" "brad" 4 "23 Nov 2003" "dropped off" "brad" 4 "24 Nov 2003" "fixed" "brad" 4 "29 Nov 2003" "picked up" end
Next we want to change the string variable day into a numeric time variable. The first line of syntax below will be different depending on which version of Stata you are using. In Stata 9 and earlier the date function is as you see it below date(day,"dmy"). However, in Stata 10, the date function requires that the "dmy" be upper case. So if you are using Stata 10, the correct form is date(day, "DMY") .
Before we start, we need to make two assumption about the layout of the data. The first is that the levels of status have the same sequential ordering ("fixed" follows "dropped off," and "picked up" follows "fixed" ) and the second is that there needs to be a unique identifier for each case, we don't care if we have repeat customers, we just need something to differentiate between them.gen date = date(day, "dmy") format date %d drop day
sort case
by case: gen etime = date[2]-date[1] + 1 if status == "dropped off"
by case: replace etime = date[3]-date[2] + 1 if status == "fixed"
list, sep(12)
+------------------------------------------------+
| cust case status date etime |
|------------------------------------------------|
1. | brad 1 dropped off 23oct2003 3 |
2. | brad 1 fixed 25oct2003 3 |
3. | brad 1 picked up 27oct2003 . |
4. | nikki 2 dropped off 23oct2003 1 |
5. | nikki 2 fixed 23oct2003 6 |
6. | nikki 2 picked up 28oct2003 . |
7. | chloe 3 dropped off 30oct2003 7 |
8. | chloe 3 fixed 05nov2003 2 |
9. | chloe 3 picked up 06nov2003 . |
10. | brad 4 dropped off 23nov2003 2 |
11. | brad 4 fixed 24nov2003 6 |
12. | brad 4 picked up 29nov2003 . |
+------------------------------------------------+
Now that we know how many days are between the adjacent levels of status for
each case, we can use the expand option to expand our elapsed time
variable, etime. We sorted the data by case and status, and listed our first twelve
observations just to check that our program has worked to this point.
expand etime
sort case status
list in 1/12, sep(12)
+------------------------------------------------+
| cust case status date etime |
|------------------------------------------------|
1. | brad 1 dropped off 23oct2003 3 |
2. | brad 1 dropped off 23oct2003 3 |
3. | brad 1 dropped off 23oct2003 3 |
4. | brad 1 fixed 25oct2003 3 |
5. | brad 1 fixed 25oct2003 3 |
6. | brad 1 fixed 25oct2003 3 |
7. | brad 1 picked up 27oct2003 . |
8. | nikki 2 dropped off 23oct2003 1 |
9. | nikki 2 fixed 23oct2003 6 |
10. | nikki 2 fixed 23oct2003 6 |
11. | nikki 2 fixed 23oct2003 6 |
12. | nikki 2 fixed 23oct2003 6 |
+------------------------------------------------+
Our next and final step is to update the dates to reflect the elapsed
time. To do this, we need a counter of days elapsed not only for each case, but
also for the level of status, this is done by specifying case and status within
the by statement. Once we have our new day counter, t, we can replace our
date variable by adding date and t together.
sort case status
by case status: gen t = _n-1
replace date = date + t
list
+----------------------------------------------------+
| cust case status date etime t |
|----------------------------------------------------|
1. | brad 1 dropped off 23oct2003 3 0 |
2. | brad 1 dropped off 24oct2003 3 1 |
3. | brad 1 dropped off 25oct2003 3 2 |
4. | brad 1 fixed 25oct2003 3 0 |
5. | brad 1 fixed 26oct2003 3 1 |
|----------------------------------------------------|
6. | brad 1 fixed 27oct2003 3 2 |
7. | brad 1 picked up 27oct2003 . 0 |
8. | nikki 2 dropped off 23oct2003 1 0 |
9. | nikki 2 fixed 23oct2003 6 0 |
10. | nikki 2 fixed 24oct2003 6 1 |
|----------------------------------------------------|
11. | nikki 2 fixed 25oct2003 6 2 |
12. | nikki 2 fixed 26oct2003 6 3 |
13. | nikki 2 fixed 27oct2003 6 4 |
14. | nikki 2 fixed 28oct2003 6 5 |
15. | nikki 2 picked up 28oct2003 . 0 |
|----------------------------------------------------|
16. | chloe 3 dropped off 30oct2003 7 0 |
17. | chloe 3 dropped off 31oct2003 7 1 |
18. | chloe 3 dropped off 01nov2003 7 2 |
19. | chloe 3 dropped off 02nov2003 7 3 |
20. | chloe 3 dropped off 03nov2003 7 4 |
|----------------------------------------------------|
21. | chloe 3 dropped off 04nov2003 7 5 |
22. | chloe 3 dropped off 05nov2003 7 6 |
23. | chloe 3 fixed 05nov2003 2 0 |
24. | chloe 3 fixed 06nov2003 2 1 |
25. | chloe 3 picked up 06nov2003 . 0 |
|----------------------------------------------------|
26. | brad 4 dropped off 23nov2003 2 0 |
27. | brad 4 dropped off 24nov2003 2 1 |
28. | brad 4 fixed 24nov2003 6 0 |
29. | brad 4 fixed 25nov2003 6 1 |
30. | brad 4 fixed 26nov2003 6 2 |
|----------------------------------------------------|
31. | brad 4 fixed 27nov2003 6 3 |
32. | brad 4 fixed 28nov2003 6 4 |
33. | brad 4 fixed 29nov2003 6 5 |
34. | brad 4 picked up 29nov2003 . 0 |
+----------------------------------------------------+
UCLA Researchers are invited to our Statistical Consulting Services
We recommend others to our list of Other Resources for Statistical Computing Help
These pages are Copyrighted (c) by UCLA Academic Technology Services