|
|
|
||||
|
|
|||||
This module shows the use of if with common Stata commands.
Let's use the auto data file.
use auto
For this module, we will focus on make, rep78, foreign, mpg, and price We can use the keep command to keep just these variables.
keep make rep78 foreign mpg price
Let's make a table of rep78 by foreign to look at the repair histories of the foreign and domestic cars.
tabulate rep78 foreign| foreign rep78 | 0 1 | Total -----------+----------------------+---------- 1 | 2 0 | 2 2 | 8 0 | 8 3 | 27 3 | 30 4 | 9 9 | 18 5 | 2 9 | 11 -----------+----------------------+---------- Total | 48 21 | 69
Suppose we wanted to focus on just the cars with repair histories of four or better. We can use if suffix to do this.
tabulate rep78 foreign if (rep78 >=4)| foreign rep78 | 0 1 | Total -----------+----------------------+---------- 4 | 9 9 | 18 5 | 2 9 | 11 -----------+----------------------+---------- Total | 11 18 | 29
Let's make the above table using the column and nofreq options. Note that the column and nofreq come after the comma. These are options on the tabulate command and options need to be placed after a comma.
tabulate rep78 foreign if (rep78 >=4), column nofreq| foreign rep78 | 0 1 | Total -----------+----------------------+---------- 4 | 81.82 50.00 | 62.07 5 | 18.18 50.00 | 37.93 -----------+----------------------+---------- Total | 100.00 100.00 | 100.00
The use of if is not limited to the tabulate command. Here, we use it with the list command.
list if (rep78 >= 4)make price mpg rep78 foreign 3. AMC Spirit 3799 22 . 0 5. Buick Electra 7827 15 4 0 7. Buick Opel 4453 26 . 0 15. Chev. Impala 5705 16 4 0 20. Dodge Colt 3984 30 5 0 24. Ford Fiesta 4389 28 4 0 29. Merc. Bobcat 3829 22 4 0 30. Merc. Cougar 5379 14 4 0 33. Merc. XR-7 6303 14 4 0 35. Olds 98 8814 21 4 0 38. Olds Delta 88 4890 18 4 0 43. Plym. Champ 4425 34 5 0 45. Plym. Sapporo 6486 26 . 0 47. Pont. Catalina 5798 18 4 0 51. Pont. Phoenix 4424 19 . 0 53. Audi 5000 9690 17 5 1 55. BMW 320i 9735 25 4 1 56. Datsun 200 6229 23 4 1 57. Datsun 210 4589 35 5 1 58. Datsun 510 5079 24 4 1 59. Datsun 810 8129 21 4 1 61. Honda Accord 5799 25 5 1 62. Honda Civic 4499 28 4 1 63. Mazda GLC 3995 30 4 1 64. Peugeot 604 12990 14 . 1 66. Subaru 3798 35 5 1 67. Toyota Celica 5899 18 5 1 68. Toyota Corolla 3748 31 5 1 69. Toyota Corona 5719 18 5 1 70. VW Dasher 7140 23 4 1 71. VW Diesel 5397 41 5 1 72. VW Rabbit 4697 25 4 1 73. VW Scirocco 6850 25 4 1 74. Volvo 260 11995 17 5 1
Did you see the values of rep78 that had a value of .? Those are missing values. For example, the value of rep78 for the AMC Spirit were missing. Stata treats a missing value as positive infinity, the highest number possible. So, when we said list if (rep78 >= 4) Stata included the observations where rep78 was . as well.
If we wanted to include just the valid observations that are greater than or equal to 4, we can do the following to tell Stata we want rep78 >= 4 and rep78 not missing.
list if (rep78 >= 4) & !missing(rep78)make price mpg rep78 foreign 5. Buick Electra 7827 15 4 0 15. Chev. Impala 5705 16 4 0 20. Dodge Colt 3984 30 5 0 24. Ford Fiesta 4389 28 4 0 29. Merc. Bobcat 3829 22 4 0 30. Merc. Cougar 5379 14 4 0 33. Merc. XR-7 6303 14 4 0 35. Olds 98 8814 21 4 0 38. Olds Delta 88 4890 18 4 0 43. Plym. Champ 4425 34 5 0 47. Pont. Catalina 5798 18 4 0 53. Audi 5000 9690 17 5 1 55. BMW 320i 9735 25 4 1 56. Datsun 200 6229 23 4 1 57. Datsun 210 4589 35 5 1 58. Datsun 510 5079 24 4 1 59. Datsun 810 8129 21 4 1 61. Honda Accord 5799 25 5 1 62. Honda Civic 4499 28 4 1 63. Mazda GLC 3995 30 4 1 66. Subaru 3798 35 5 1 67. Toyota Celica 5899 18 5 1 68. Toyota Corolla 3748 31 5 1 69. Toyota Corona 5719 18 5 1 70. VW Dasher 7140 23 4 1 71. VW Diesel 5397 41 5 1 72. VW Rabbit 4697 25 4 1 73. VW Scirocco 6850 25 4 1 74. Volvo 260 11995 17 5 1
We can use if with most Stata commands. Here, we get summary statistics for price for cars with repair histories of 1 or 2. Note the == represents IS EQUAL TO and | represents OR.
summarize price if (rep78 == 1) | (rep78 == 2)Variable | Obs Mean Std. Dev. Min Max ---------+----------------------------------------------------- price | 10 5687 3216.375 3667 14500
A simpler way to say this would be...
summarize price if (rep78 <= 2)Variable | Obs Mean Std. Dev. Min Max ---------+----------------------------------------------------- price | 10 5687 3216.375 3667 14500
Likewise, we can do this for cars with repair history of 3, 4 or 5.
summarize price if (rep78 == 3) | (rep78 == 4) | (rep78 == 5)Variable | Obs Mean Std. Dev. Min Max ---------+----------------------------------------------------- price | 59 6223.847 2880.454 3291 15906
Let's simplify this by saying rep78 >= 3.
summarize price if (rep78 >= 3)Variable | Obs Mean Std. Dev. Min Max ---------+----------------------------------------------------- price | 64 6239.984 2925.843 3291 15906
Did you see the mistake we made? We accidentally included the missing values because we forgot to exclude them. We really needed to say.
summarize price if (rep78 >= 3) & !missing(rep78)Variable | Obs Mean Std. Dev. Min Max ---------+----------------------------------------------------- price | 59 6223.847 2880.454 3291 15906
Most Stata commands can be followed by if, for example
Summarize if rep78 equals 2
summarize if (rep78 == 2)
Summarize if rep78 is greater than or equal to 2
summarize if (rep78 >= 2)
Summarize if rep78 greater than 2
summarize if (rep78 > 2)
Summarize if rep78 less than or equal to 2
summarize if (rep78 <= 2)
Summarize if rep78 less than 2
Summarize if rep78 not equal to 2summarize if (rep78 <2)
summarize if (rep78 != 2)
If expressions can be connected with
|
for OR
& for AND
Missing Values
Missing values are represented as . and are the highest value possible. therefore, when values are missing, be careful with commands like
to omit missing values, usesummarize if (rep78 > 3)summarize if (rep78 >= 3)summarize if (rep78 != 3)
summarize if (rep78 > 3) & !missing(rep78)summarize if (rep78 >= 3) & !missing(rep78)summarize if (rep78 != 3) & !missing(rep78)
UCLA Researchers are invited to our Statistical Consulting Services
We recommend others to our list of Other Resources for Statistical Computing Help
These pages are Copyrighted (c) by UCLA Academic Technology Services