|
|
|
||||
|
|
|||||
LANDVAL IMPROVAL TOTVAL SALEPRIC SALTOAPR
30000 64831 94831 118500 1.25
30000 50765 80765 93900 .
46651 18573 65224 . 1.16
45990 91402 . 184000 1.34
42394 . 40575 168000 1.43
. 3351 51102 169000 1.12
63596 2182 65778 . 1.26
56658 53806 10464 255000 1.21
51428 72451 . . 1.18
93200 . 4321 422000 1.04
76125 78172 54297 290000 1.14
. 61934 16294 237000 1.10
65376 34458 . 286500 1.43
42400 . 57446 . .
40800 92606 33406 168000 1.26
Then you can run tabmiss for one or more variables as illustrated below.
tabmiss landval improval totval salepric saltoapr
landval | Freq. Percent Cum.
------------+-----------------------------------
nomissing | 13 86.67 86.67
missing | 2 13.33 100.00
------------+-----------------------------------
Total | 15 100.00
improval | Freq. Percent Cum.
------------+-----------------------------------
nomissing | 12 80.00 80.00
missing | 3 20.00 100.00
------------+-----------------------------------
Total | 15 100.00
totval | Freq. Percent Cum.
------------+-----------------------------------
nomissing | 12 80.00 80.00
missing | 3 20.00 100.00
------------+-----------------------------------
Total | 15 100.00
salepric | Freq. Percent Cum.
------------+-----------------------------------
nomissing | 11 73.33 73.33
missing | 4 26.67 100.00
------------+-----------------------------------
Total | 15 100.00
saltoapr | Freq. Percent Cum.
------------+-----------------------------------
nomissing | 13 86.67 86.67
missing | 2 13.33 100.00
------------+-----------------------------------
Total | 15 100.00
Now we know the number of missing values in each variable. For instance, variable
salepric
has four and saltoapr has two missing values. This will help us to identify
variables that may have a large number of missing values and perhaps we may want
to exclude those from analysis.Now let's create our new variable.
egen nmis=rmiss2( landval improval totval salepric saltoapr)
tab nmis
nmis | Freq. Percent Cum.
------------+-----------------------------------
0 | 4 26.67 26.67
1 | 9 60.00 86.67
2 | 1 6.67 93.33
3 | 1 6.67 100.00
------------+-----------------------------------
Total | 15 100.00
Now we see that there are four observations with no missing values, one observation with one missing value in variable saltoapr, two observations with missing value in variable salepric and one observation with missing value in both variable totval and salepric, etc. If we want to delete some observations from the original data set, we have a better idea now on which observation to delete, e.g. the observation corresponding to the 7th observation below.egen land1=rmiss2(landval) egen impr1=rmiss2(improval) egen totv1=rmiss2(totval) egen sale1=rmiss2(salepric) egen salt1=rmiss2(saltoapr) gen id=_n collapse (count) freq=id, by (land1 impr1 sale1 totv1 salt1)
list
land1 impr1 totv1 sale1 salt1 freq
1. 0 0 0 0 0 4
2. 0 0 0 0 1 1
3. 0 0 1 0 0 2
4. 0 0 0 1 0 2
5. 0 0 1 1 0 1
6. 0 1 0 0 0 2
7. 0 1 0 1 1 1
8. 1 0 0 0 0 2
UCLA Researchers are invited to our Statistical Consulting Services
We recommend others to our list of Other Resources for Statistical Computing Help
These pages are Copyrighted (c) by UCLA Academic Technology Services