UCLA Academic Technology Services

Statistical Computing Seminars
Statistical Writing

Research, from hypothesis development through finished manuscript, is a process.  Hence, the results section of the manuscript is the product of all of the earlier stages of the research.  The better the quality of these earlier stages, the better the quality of the results section. 

Before you get to the results section

Let's start off with a general discussion of what needs to happen before you start writing the results section.  To write a good results section, you have to have first carefully thought out each aspect of your study.  A power analysis (if done well) is much more than simply determining how many subjects you will need.  Rather, it is a very detailed planning process.  This is the planning not only of what you will do, but how you will do it (including what software will be used), and how long each step might take.  Even if you don't do a formal power analysis, a very detailed research plan will serve you well.  The real purpose of this planning is to reduce the frequency, intensity and duration of panic.  Now, panic can be caused by all sorts of things, including realizing that the software package with which you are familiar will not do the analysis that you need to do, that you don't have enough subjects to run the model that you intended to run, that you have to stop and learn all about methods for handling missing data, etc.

The point about planning your time is particularly important, because, especially if you are early in your research career, you have had less time to learn all of the tools you will need.  For example, if you are planning on doing a structural equation model (SEM), and you have never done one before, you will find that you likely have to take a class on the subject.  There are some techniques, like SEM, multilevel modeling, etc., that you probably can't pick up by reading a couple of books about them.  Also, techniques like SEM are typically not available in standard statistical packages, so you will probably have to learn a new package.  It is very difficult to learn a new technique and a new package at the same time, so you will probably want to do that sequentially.  As you can see, this is going to take some time.  Now I am definitely not suggesting that anyone only stick with techniques and packages that they already know.  Rather, I think that graduate school is a wonderful opportunity to learn new techniques and packages, and these things look great on a CV or a resume.

Another point to bear in mind is how long it has been since your last statistics class.  Many people find it unnerving when they realize that it has been one, two, three or more years since their last stats class, and they feel very unprepared to do their data analysis, much less explain their results to others.  Even if you recently had a stats class that you aced, you may well find that you are not as prepared as you would like for doing real-life data analysis.  Real data has a way of being real messy, and issues that never came up in your stats class will arise as you do your analyses.  My point here is not to scare anyone, but to reassure you that almost everyone experiences a certain amount of angst while doing and writing about analyses.  Don't be afraid to ask questions, even if they seem like "baby" questions.  If you feel uncomfortable asking your advisor (which you shouldn't), other good resources include our website, especially our online seminars, and web pages from other universities.

Four potentially problematic issues

If we have time at the end, I will come back and discuss four issues that are often problematic (both in terms of time and panic) for our clients.
1.)  Missing data:  Missing data issues and the possible ways of handling them can take a lot of time.  You not only have to learn about the pros and cons of various possible techniques, but then you have to decide which one is most appropriate for your situation.  You will find that hard-and-fast rules are rare in this area, and there is lots of disagreement among "experts".  Once you have decided on a technique, you will have to determine if the package with which you are familiar will do that, or if you will then have to find and learn a package that will do that.  Next, you need to determine if the package that you want to use for the analysis will handle that type of imputation.  For example, let's say that you were doing a multiple linear regression in SPSS.  That was fine until you decided to use multiple imputation to handle your missing data.  Not only will SPSS not create the multiply imputed data files (as of version 16), it can't analyze multiply imputed data files either. 
2.)  Small sample sizes:  For most applied research, small sample sizes are problematic, usually for many reasons.  For one, many common statistical procedures are not appropriate for small sample sizes.  Even if the researcher decides to use the modeling technique, the model may not run for numeric reasons.  For example, the likelihood may not converge, a matrix may not be positive definite, etc.  Even if the model does run successfully, the assumptions of the test may not be met.  Any of these problems can cause the researcher to either modify the model until it does run, or "fall back" to a simpler statistical technique.  This can really complicate things because now you have to ask a modified form of your research question, then the flow of the research is disrupted, etc.  In other words, your hypotheses are necessarily tied to your statistical analyses, and you usually cannot modify one without modifying the other.  Also, issues of fair and accurate reporting of what you have done become pertinent. 
3.)  Survey data:  Many researchers who have never used survey data before believe that analyzing survey data is just like analyzing data from experiments.  This isn't true.  The probability weights need to be used to adjust the estimates for the sampling plan, and the standard errors need to be adjusted to account for the non-independence of the observations (i.e., PSUs and/or strata or replicate weights need to be used).  For some researchers, this simply means using different commands in the stat package that they are already using (such as Stata).  For others, it means learning a new stat package.
4.)  Correlated data:  Now, technically, most survey data are correlated data.  However, there are lots of types of correlated data that are not survey data.  For example, patients or doctors in hospitals, people in neighborhoods, partners in couples, etc.  There are several ways to analyze correlated data, and it is often a judgment call on the part of the analyst as to which technique to use.  Again, if you are not familiar with the various ways to analyze correlated data you will have to stop and learn at least enough about the various methods so that you can select which method you feel is most appropriate to use.  When writing about the analysis, you will have to justify why you selected this technique over others.  Also, you may end up having to analyze the data using more than one technique so that you can have confidence in your results.

Methods and results

The results section is an extension of the methods section, so if the methods section was not well thought out and well written, the results section will be very difficult to do well.  You can't have a great results section and a lousy methods section.  Of those people who find the results section to be the most difficult section of a paper to write, many have this dilemma because they did not have a clear and precise analysis plan before they began collecting data (or before they began running analyses if they are using secondary data).  In many cases, they were unable to conduct the analyses that they had planned to conduct.  Instead, they had to turn to some other analysis, with which they may not have been familiar.  Even under these circumstances, understanding the theory and assumptions of the statistical test you use is part of the research process.  You need to learn and understand this stuff, as well as the software necessary to run the analyses, just as you need to learn and understand your substantive area.  Also, there is a lot of work between getting the appropriate output from a statistical program and being able to write the results section, especially for new researchers, and there is usually more to explaining your results than simply reporting a statistically significant p-value.  In other words, don't be tricked into thinking that just because you have the output that you need that you are necessarily ready to write.  It may take quite a while to fully digest what those coefficients really mean.

Let's take an example here.  Suppose that you have an ANOVA or regression model with two continuous predictors and their interaction, and that all three of these terms are statistically significant.  You will likely want to say that the predicted value increases by the amount of the positive coefficient as the predictor variable increases by one unit.  With the interaction in the model, however, that statement holds only when the other predictor equals zero, because the effect of each predictor changes with the value of the other.  Describing what an interaction between two continuous variables means substantively will take a fair amount of work; you will have to understand it very well if you are going to write about it clearly and have your audience understand what is happening in your model and in your data.

OK, so that's the general lead up to writing the results section.  Now let's talk specifics.  First off, writing up the results will be much easier if you set up the research question clearly and precisely.  This helps to establish the reader's expectation for what he is about to read.  If the question is worded in a vague manner, the reader may go into the section with very different expectations than you intend, and he is likely to be confused.  The next question is, how to decide what to include and what to leave out?  In most cases, you should tell a story, and include only those points that are relevant to your story.  As easy as this may sound, this can be quite painful.  Some things that took weeks to do will be reduced to a sentence or two (e.g., power analyses, assumption checking, alternate analyses).  In other words, there is no relationship between the amount of time it took you to do something and the amount of space on the page its write-up gets.  Remember that there is a careful balance between enough detail to replicate the experiment and space limitations imposed by the journal.  This is a point to which we will return a little later on.

In general, there are at least four "levels" of writing about results:
1.)  what analysis technique you used (often a fine line between being overly technical and not giving enough specifics)
2.)  the statistical significance of the model, perhaps amount of variance accounted for
3.)  coefficients, maybe which are "more important", but this is often dangerous
4.)  real-world meaning (often a combination of points 2 and 3)

Where to start

In the first part of the results section, you will want to describe your data.  I strongly urge people to resist putting p-values in this part of the results section for two reasons.  The first reason is purpose and the second is alpha inflation.  You will remember back to your first statistics class when you learned that there were two types of frequentist statistics:  descriptive and inferential.  Describing your data is just that:  a description; you are not testing any hypotheses.  Since p-values are only associated with hypothesis testing, they do not belong in a description of the data.  The second reason to avoid including p-values in the description of the data is the issue of alpha inflation.  Alpha inflation is a phenomenon that happens when you conduct more and more significance tests on the same data set.  I am going to use an extreme example to illustrate the problem.  Let's say that you run only one significance test on your data and that you have set alpha equal to .05.  This means that, five times out of 100, you will get a statistically significant result when, in fact, there is no effect in the population to be found.  In other words, you have a 5% chance of rejecting the null hypothesis when it is true.  Now let's say that you ran 10 tests.  The formula for the chance of making at least one Type I error across all of the tests (the familywise error rate, assuming the tests are independent) is:  $1 - (1 - \alpha)^{x}$, where $x$ is the number of tests that you run.  So we have $1 - (1 - .05)^{10} = .40$.  This means that there is a 40% chance that you will get at least one Type I error (a.k.a. a false alarm), not a 5% chance.  To address this problem, many researchers use alpha correction procedures (which can create their own set of problems), but you can see that you want to run as few significance tests as possible to minimize this problem.  This topic also ties back to our earlier discussion about planning.  You want to know ahead of time how many significance tests you will be running.  
There is also an issue of fair and accurate reporting of what you have done here.  You want to run only the tests that you planned to run, and not go fishing for statistically significant results.  As an extreme example, you would not want to run 100 t-tests and report only the few that were statistically significant.  The reader of your article or dissertation assumes that you have reported all relevant aspects of what you have done, and omitting the fact that you ran 97 more significance tests than you reported is an important omission, as your results should be interpreted very differently in light of how many tests you ran.  Remember that the reproducibility of published results is of paramount importance to the advancement of any discipline, and accuracy about the type and quantity of analyses performed is an important aspect of reproducibility of your results.
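
The alpha-inflation arithmetic above is easy to check for any number of tests.  The following is a minimal sketch in Python (the function name is made up for illustration, and the tests are assumed to be independent, which real analyses on one data set often are not):

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one Type I error across n_tests independent tests."""
    return 1 - (1 - alpha) ** n_tests


# How quickly the chance of a false alarm grows with the number of tests.
for n_tests in (1, 10, 100):
    rate = familywise_error_rate(0.05, n_tests)
    print(f"{n_tests:>3} tests: {rate:.3f}")
```

With alpha at .05, ten tests give the roughly 40% figure quoted above, and the rate keeps climbing as more tests are added.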

OK, let's quickly cover a few last points.  It is often a good idea to order your analyses from most to least important so that you have the most power for your most important hypotheses (sequential gate-keeping).  Running only the significance tests that you planned can also help avoid confusion over what a specific p-value means; such confusion is common (for example, whether the p-value refers to a single parameter or to the model as a whole).  Finally, there are two other traps you want to avoid.  One is false precision, in the number of decimal places reported for both coefficients and p-values.  The other is confusing statistical significance with clinical or practical relevance.  To help avoid this, report effect sizes.  In fact, many journals are now requiring researchers to report effect sizes.

Examples

After I gave this seminar last time, I found that what most people in the audience wanted was specifics, especially what to say and what not to say in the results section.  In fact, many people said they wanted to be shown an output, say of a regression analysis, and then an example of how to write it up.  However, this is nearly impossible to do, and I will show you why in just a moment.  Besides, this "cookie-cutter" approach is usually a very bad way to go.  I don't like to see people doing statistics this way, and this approach is even worse when you are writing results.  The best way to write a clear, concise results section is to thoroughly understand the statistical techniques that you used to analyze your data.  Another good strategy is to look at articles that report similar analyses for ideas about the exact terminology to use.  This is a particularly good idea because the write-ups of similar analyses can be very different in different fields.  Also, some journals require much more precise language than other journals, so you might want to look at some articles in the journal in which you want to publish.  You can also find examples in our Data Analysis Example pages, our annotated output pages, and Regression Models for Categorical Dependent Variables Using Stata, Second Edition by Long and Freese (2006).  Even if you are not analyzing your data with Stata, this is a great resource.

Let's start off with a couple of examples of why you can't just look at a piece of output and write about it.  After that, we will look at some examples of some common pitfalls encountered when writing up the results of seemingly simple analyses.

So, here is a regression table.  The variable gender is dichotomous, and the variable read is continuous.  What could be difficult about interpreting this?

The difficulty has to do with the way the dichotomous variable gender is coded.  If gender is coded 0/1, then the intercept is the predicted value for the group coded 0 (when read is zero).  If gender is coded 1/2, then the intercept is the predicted value for the group coded 1 (again when read is zero) minus the coefficient (the B, 5.487) for gender.  Now, let's take this example one step further.  Let's say that I create a variable called female, which is 1 for females and 0 otherwise (i.e., 0 for males).  Let's replace gender with female, and let's also include the interaction between female and read.

How would you interpret these results?  Well, the interaction, fr, is not statistically significant, so there isn't much we can say about that.  So let's go on to female and read.  Or can we?  The answer is no, we can't interpret any of the other (lower order) effects, because the dichotomous variable is coded 0/1.  Because the dichotomous variable is coded 0/1, it is not independent of the interaction term.  If you wanted to interpret the lower order effects, you would recode female to be -1/1.  This is called effect coding, and it is how categorical variables are coded in ANOVA.  Let's rerun the model without the interaction term.

When writing about this output, you could say that "the effect of female is positive and statistically significant", but you can't say that "the main effect of female is positive and statistically significant".  You don't have a main effect of female because you have coded that variable 0/1.  Rather, you have a simple effect of the variable female, which means that the effect of the variable read is .636 when female is 0 (in other words, the effect of read is .636 for males).  You could also phrase it as "the simple effect of female" or "the linear effect of female".  To get a main effect for the variable female, you would have to code female -1/1.  I have created a new variable called fme (for female main effect) and used that in the model instead of female.  Notice that both the coefficient for fme and the coefficient for the constant have changed.
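
To see how the coding of a dichotomous variable changes the intercept and the coefficient, here is a small sketch in Python.  It is deliberately simpler than the models above: there is only one predictor (no read and no interaction), the scores are made up, and the helper function simple_ols is a name invented for this illustration.

```python
from statistics import mean


def simple_ols(x, y):
    """Return (intercept, slope) for a one-predictor least-squares fit."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


# Hypothetical scores: the first three people are male, the last three female.
scores = [48.0, 50.0, 52.0, 54.0, 56.0, 58.0]
is_female = [0, 0, 0, 1, 1, 1]

# The same grouping variable under three coding schemes.
codings = {
    "0/1": [float(f) for f in is_female],
    "1/2": [float(f + 1) for f in is_female],
    "-1/1": [float(2 * f - 1) for f in is_female],
}

fits = {name: simple_ols(x, scores) for name, x in codings.items()}
for name, (intercept, slope) in fits.items():
    print(f"{name:>5}: intercept = {intercept:.2f}, slope = {slope:.2f}")
```

In this toy data the male mean is 50 and the female mean is 56.  Under 0/1 coding the intercept is the male mean; under 1/2 coding it is the male mean minus the coefficient; under -1/1 coding it is the average of the two group means, and the coefficient is half the difference between them.  The fitted values are identical in all three cases; only the meaning of the numbers in the table changes.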

Another common error is to refer to the model above as a multivariate regression instead of a multiple regression.  A multivariate regression is a regression model with more than one outcome variable; a multiple regression is a regression with more than one predictor variable.

The point here is that simply looking at the output is often not enough when trying to do interpretation and writing.  Rather, you need to know lots of things, and seemingly small details can greatly affect the meaning.  This is why the "cookie-cutter" approach to interpretation doesn't work well.  Now let's go on to some other examples of places where people often have difficulty in writing about results.

Example:  Categorical predictor variables

Now let's look at a model that includes a categorical variable that has more than two levels.  In this example, we have included the variable race, which has four levels.  Because race has four levels, we have included three dummy variables in the regression.  The dummy variable for the second level, called _Irace_2, sits right at the conventional threshold for statistical significance (p = 0.050), while the other dummy variables are not statistically significant.  What can we say about this?

xi: regress write read math female i.race
i.race            _Irace_1-4          (naturally coded; _Irace_1 omitted)

      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  6,   193) =   37.46
       Model |  9619.24508     6  1603.20751           Prob > F      =  0.0000
    Residual |  8259.62992   193    42.79601           R-squared     =  0.5380
-------------+------------------------------           Adj R-squared =  0.5237
       Total |   17878.875   199   89.843593           Root MSE      =  6.5419

------------------------------------------------------------------------------
       write |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        read |    .320763   .0612872     5.23   0.000     .1998843    .4416416
        math |   .3652081    .067842     5.38   0.000     .2314011    .4990151
      female |   5.287456    .937736     5.64   0.000      3.43793    7.136983
    _Irace_2 |   4.838573    2.45403     1.97   0.050    -.0015891    9.678734
    _Irace_3 |   .9289412   1.989441     0.47   0.641    -2.994896    4.852778
    _Irace_4 |   2.490295   1.493206     1.67   0.097    -.4548022    5.435392
       _cons |   11.74903   2.984052     3.94   0.000     5.863487    17.63457
------------------------------------------------------------------------------

Until we run the 3-degree-of-freedom test that "pulls" the variable race back together, we can't say anything.

test  _Irace_2 _Irace_3 _Irace_4

 ( 1)  _Irace_2 = 0
 ( 2)  _Irace_3 = 0
 ( 3)  _Irace_4 = 0

       F(  3,   193) =    1.67
            Prob > F =    0.1757

Because this test is not statistically significant, you can't say anything about _Irace_2.  Now let's change the model a little bit (replace math with socst) and see what happens.

xi:  regress write read socst female i.race
i.race            _Irace_1-4          (naturally coded; _Irace_1 omitted)

      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  6,   193) =   38.06
       Model |  9689.26202     6    1614.877           Prob > F      =  0.0000
    Residual |  8189.61298   193  42.4332279           R-squared     =  0.5419
-------------+------------------------------           Adj R-squared =  0.5277
       Total |   17878.875   199   89.843593           Root MSE      =  6.5141

------------------------------------------------------------------------------
       write |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        read |   .3307708   .0592551     5.58   0.000     .2139001    .4476414
       socst |   .3074725   .0553338     5.56   0.000      .198336    .4166091
      female |   4.690728   .9393554     4.99   0.000     2.838008    6.543449
    _Irace_2 |    7.55963   2.399498     3.15   0.002     2.827024    12.29224
    _Irace_3 |   .2886157   1.981522     0.15   0.884    -3.619603    4.196834
    _Irace_4 |   3.043909    1.47917     2.06   0.041     .1264957    5.961323
       _cons |   14.17782   2.780192     5.10   0.000     8.694361    19.66128
------------------------------------------------------------------------------

test _Irace_2 _Irace_3 _Irace_4

 ( 1)  _Irace_2 = 0
 ( 2)  _Irace_3 = 0
 ( 3)  _Irace_4 = 0

       F(  3,   193) =    4.26
            Prob > F =    0.0061

Now the overall test of race is statistically significant, so you can look at the individual dummy variables.  For those of you with some ANOVA training, this is akin to running an ANOVA, seeing that there is an effect of a categorical predictor, and then doing follow-up tests to see where the differences are.  With regression output, though, things may seem a little backwards, in that you get the tests between the different groups (the dummy variables) before you see if the overall variable is statistically significant.  As with ANOVA, it doesn't make sense to talk about where differences may be if the overall variable is not statistically significant.  When writing about the dummy variables, you will want to make clear what type of coding system was used (e.g., effect coding, Helmert coding, orthogonal polynomial coding, etc.), as well as what the reference group is.  Both of these will affect the interpretation of the dummy variables.  Also, you don't want to leave out dummy variables that are not statistically significant; for example, you would not want to rerun the above model without _Irace_3.  If you did that, your reference group would be a combination of the first and third levels of race, and that is not likely to make substantive sense.
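
For readers who want to see what the joint test of the three race dummies is doing, the following formula may help.  For an OLS model with the default standard errors, the F statistic reported by the test command is equivalent to comparing the residual sum of squares of the full model with that of a reduced model that omits the dummies.  The symbols are notation introduced here, not output from Stata:

```latex
F = \frac{\left(\mathrm{RSS}_{\text{reduced}} - \mathrm{RSS}_{\text{full}}\right) / q}
         {\mathrm{RSS}_{\text{full}} / (n - p)}
```

Here $q$ is the number of dummy variables being tested (3 for race), $n - p$ is the residual degrees of freedom of the full model (193 in the output above), and $\mathrm{RSS}_{\text{full}}$ is the Residual SS shown in the regression table.  The reduced model would have to be run separately to obtain $\mathrm{RSS}_{\text{reduced}}$.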

For more information about coding systems, please see chapter 5 of our Regression with SAS, Stata and SPSS web books.

Example:  Logistic regression

If you have conducted a logistic regression, you can describe your results in several different ways.  You could discuss the logits (log odds), odds ratios or the predicted probabilities.  Which metric you choose is a matter of personal preference and convention in your field.  Most of the information in this section is quoted from Regression Models for Categorical Dependent Variables Using Stata, Second Edition by Long and Freese (2006), pages 177-181.  If you are running a logistic regression model, an ordered logit model, a multinomial logit model, a Poisson model or a negative binomial model, I strongly suggest that you borrow or buy a copy of this book and read up on the particular type of model that you are running.  Most people find this book very helpful, even if they are using a statistics package other than Stata.

When interpreting the output in the logit metric, "... for a unit change in $x_k$, we expect the logit to change by $\beta_k$, holding all other variables constant."  "This interpretation does not depend on the level of the other variables in the model."

When interpreting the output in the metric of odds ratios, "For a unit change in $x_k$, the odds are expected to change by a factor of $\exp(\beta_k)$, holding all other variables constant."  "When interpreting the odds ratios, remember that they are multiplicative.  This means that positive effects are greater than one and negative effects are between zero and one.  Magnitudes of positive and negative effects should be compared by taking the inverse of the negative effect (or vice versa)."  "For $\exp(\beta_k) > 1$, you could say that the odds are '$\exp(\beta_k)$ times larger'; for $\exp(\beta_k) < 1$, you could say that the odds are '$\exp(\beta_k)$ times smaller'."
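
The conversion from logit coefficients to odds ratios described above, and the advice about inverting a negative effect before comparing magnitudes, can be sketched in a few lines of Python.  The two coefficients are made up for illustration and do not come from any model in this seminar.

```python
import math


def odds_ratio(beta: float) -> float:
    """Convert a logit coefficient to an odds ratio."""
    return math.exp(beta)


# Hypothetical logit coefficients of equal size and opposite sign.
beta_positive = 0.693
beta_negative = -0.693

or_positive = odds_ratio(beta_positive)  # greater than one
or_negative = odds_ratio(beta_negative)  # between zero and one

# To compare magnitudes, invert the negative effect.
inverted_negative = 1 / or_negative

print(f"positive effect: {or_positive:.3f}")
print(f"negative effect: {or_negative:.3f} (inverse: {inverted_negative:.3f})")
```

On the odds-ratio scale the two effects look very different (roughly 2 versus roughly 0.5), but after inverting the negative effect they are the same size, which is what the equal-and-opposite logit coefficients imply.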

Now if you are having difficulty understanding what a unit change in the log odds really means, and odds ratios aren't as clear as you thought, you might want to consider describing your results in the metric of predicted probabilities.  Many audiences, and indeed, many researchers, find this to be a more intuitive metric in which to understand the results of a logistic regression.  While the relationship between the outcome variable and the predictor variables is linear in the logit metric, the relationship is not linear in the probability metric.  Remember that "... a constant factor change in the odds does not correspond to a constant change or a constant factor change in the probability."  This nonlinearity means that you will have to be very precise about the values at which the other variables in the model are held.
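
The nonlinearity in the probability metric can be seen with a little arithmetic.  The sketch below, in Python, doubles the odds (a hypothetical odds ratio of 2) at three different starting probabilities; the function name and starting values are chosen only for illustration.

```python
def probability_after_odds_factor(p: float, factor: float) -> float:
    """Probability that results from multiplying the odds of p by factor."""
    new_odds = (p / (1 - p)) * factor
    return new_odds / (1 + new_odds)


# The same doubling of the odds, applied at different starting probabilities.
changes = {}
for start in (0.1, 0.5, 0.9):
    end = probability_after_odds_factor(start, 2.0)
    changes[start] = end - start
    print(f"start = {start:.2f}, end = {end:.3f}, change = {end - start:.3f}")
```

The same odds ratio produces a change in probability of about .08 when starting from .1, about .17 when starting from .5, and about .05 when starting from .9.  This is why a write-up in the probability metric has to state the values at which the other variables are held.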

I hope that this example makes clear why I say that in order to write a clear and coherent results section, you really need to understand the statistical tests that you are running.

Our next example concerns confidence intervals, so let's jump ahead a little bit and talk about confidence intervals in logistic regression output.  "If you report the odds ratios instead of the untransformed coefficients, the 95% confidence interval of the odds ratio is typically reported instead of the standard error.  The reason is that the odds ratio is a nonlinear transformation of the logit coefficient, so the confidence interval is asymmetric."
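
To see why the confidence interval of an odds ratio is asymmetric, consider the following sketch in Python.  It builds the usual symmetric interval on the logit scale and then exponentiates the endpoints.  The coefficient and standard error are made up, and 1.96 is the usual normal-approximation multiplier for a 95% interval.

```python
import math

# Hypothetical logit coefficient and standard error.
coefficient = 0.5
std_error = 0.2

# Symmetric interval on the logit scale.
logit_lower = coefficient - 1.96 * std_error
logit_upper = coefficient + 1.96 * std_error

# Exponentiate the estimate and both endpoints to get the odds-ratio scale.
odds_ratio = math.exp(coefficient)
or_lower = math.exp(logit_lower)
or_upper = math.exp(logit_upper)

print(f"odds ratio = {odds_ratio:.3f}, 95% CI = ({or_lower:.3f}, {or_upper:.3f})")
print(f"distance below = {odds_ratio - or_lower:.3f}")
print(f"distance above = {or_upper - odds_ratio:.3f}")
```

The interval extends farther above the odds ratio than below it, so a single "plus or minus" standard error would misdescribe it; reporting the two endpoints does not.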

Example:  Confidence intervals

Many journals are pushing for confidence intervals to be included in the results section.  But what does the confidence interval tell you?  Problematic interpretations include:  "We are 95% confident that the true parameter for reading score lies between .209 and .456."  "There is a 95% chance that the true parameter lies between .209 and .456."  Rather, the confidence interval is produced by a procedure such that, if the experiment were run many times (e.g., 10,000 times) and an interval computed each time, about 95% of those intervals would contain the true parameter.  Most of the time, there is little reason to comment on the confidence interval:  it is what it is.  One situation in which you might want to comment on the confidence interval is when you are conducting a study in order to get a precise estimate of a particular parameter, e.g., the mean age of people in a particular population.
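
The repeated-experiment reading of a confidence interval can be illustrated with a small simulation.  The sketch below, in Python, makes several simplifying assumptions that are not part of the seminar:  the data are normally distributed, the population standard deviation is known (so a z interval is used), and the population values are invented.

```python
import random


def ci_coverage(n_experiments=10_000, n=30, true_mean=50.0, sd=10.0, seed=1):
    """Share of 95% z intervals (known sd) that contain the true mean."""
    rng = random.Random(seed)
    half_width = 1.96 * sd / n ** 0.5
    hits = 0
    for _ in range(n_experiments):
        sample_mean = sum(rng.gauss(true_mean, sd) for _ in range(n)) / n
        # The interval is sample_mean +/- half_width; check whether it
        # covers the true mean.
        if abs(sample_mean - true_mean) <= half_width:
            hits += 1
    return hits / n_experiments


print(f"Share of intervals containing the true mean: {ci_coverage():.3f}")
```

The share should come out close to .95.  Note what varies from run to run:  the interval moves around while the true mean stays fixed, which is why statements about the "chance that the true parameter lies in" one particular interval are problematic.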

Example:  Interaction terms

Many researchers have difficulty interpreting and understanding the meaning of interaction terms in statistical models, so this is often one of the most challenging parts of the results section to write.  If you are going to include an interaction term in your model, be sure that it is testing a hypothesis of interest to you; don't include interactions "just because".  Also, plan on spending extra time exploring and graphing the interaction.  This is one term in your model that you are going to have to understand really, really well before you will be able to write about it clearly.  Also, some statistical software packages are better than others for creating the graphs of interactions, so you may need to switch packages to make the graph.  Graphs are often a necessary part of understanding the interaction, even if the graph won't be included in the final manuscript.

The simplest form of interaction to interpret is the interaction of two dichotomous variables.  It is fairly easy to get the cell means, see how the coefficients are calculated, and obtain a graph.  The situation becomes more complicated when you have a dichotomous by continuous interaction.  In this situation, graphs are usually very helpful in understanding what is happening.  When you have a continuous by continuous interaction, the graph is three dimensional, and you are looking at the warping of a plane.  The situation becomes even more complex if you have more than one interaction in the model or three-way (or higher) interactions.  Please remember that if you have interaction terms in your model, you almost always need to have the lower-order effects in the model as well.  For example, if you have a three-way interaction of xyz, you will need to include in the model the three two-way interactions, xy, yz and xz, as well as x, y and z.  If all of the lower-order terms are not included in the model, the three-way interaction will likely be uninterpretable.
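
One way to start exploring a continuous by continuous interaction before graphing it is to compute the slope of one predictor at several values of the other.  The sketch below, in Python, assumes a model of the form y = b0 + b_x*x + b_z*z + b_xz*x*z; the coefficients are made up for illustration.

```python
def simple_slope(b_x: float, b_xz: float, z: float) -> float:
    """Slope of x at a given value of z in y = b0 + b_x*x + b_z*z + b_xz*x*z."""
    return b_x + b_xz * z


# Hypothetical coefficients for x and for the x-by-z interaction.
b_x = 0.5
b_xz = -0.1

slopes = {z: simple_slope(b_x, b_xz, z) for z in (0, 5, 10)}
for z, slope in slopes.items():
    print(f"z = {z:>2}: slope of x = {slope:.2f}")
```

With these numbers, the effect of x is positive when z is 0, zero when z is 5, and negative when z is 10.  The coefficient for x in the output table describes only the first of these cases, which is why the interaction needs to be explored rather than read off the table.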

For more information regarding the use and interpretation of interactions in regression, please see the last few chapters of our OLS Regression with SAS, Stata and SPSS web books.  For more information on interactions in logistic regression, please see our seminar Visualizing Main Effects and Interactions for Binary Logit Models in Stata with movies.

Example:  Bivariate tests

For our last example, let's talk about the clarity of specifying which statistical test was conducted.  Looking at the output above, a researcher might write,  "We did a bivariate analysis between the variables, and the result was significant (p = .01)."  However, this is problematic for a couple of reasons.  First of all, a "bivariate" analysis can refer to any analysis that involves only two variables.  Examples of bivariate analyses include chi-square, correlation, simple OLS regression, simple logistic regression, t-test, one-way ANOVA, etc.  Second, the write-up should be specific about which variables are used in each analysis.  Perhaps a better way to write this would be:  "We conducted a chi-square test with gender and favorite flavor of ice cream, and the result was statistically significant ($\chi^2(2) = 9.269$, p < .05)."  Depending on the rest of the paragraph, you might also want to include the number of cases used in this analysis, the number of cases in each cell, and/or that the assumption that each expected count was five or greater was met.
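
The pieces of a chi-square write-up mentioned above (the statistic, its degrees of freedom, and the expected-count check) can all be computed directly from the table of counts.  The sketch below, in Python, uses made-up counts for gender by favorite flavor; they are not the data behind the result quoted above.  The p-value helper relies on the fact that, for exactly 2 degrees of freedom, the chi-square upper-tail probability is exp(-x/2); other degrees of freedom need a statistics package.

```python
import math


def chi_square(table):
    """Return (statistic, df, expected counts) for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df, expected


def p_value_df2(statistic: float) -> float:
    """Upper-tail p-value for a chi-square statistic with exactly 2 df."""
    return math.exp(-statistic / 2)


# Hypothetical counts: rows are gender, columns are three ice cream flavors.
counts = [[30, 20, 10],
          [20, 20, 20]]

statistic, df, expected = chi_square(counts)
smallest_expected = min(min(row) for row in expected)

print(f"chi-square({df}) = {statistic:.2f}, p = {p_value_df2(statistic):.3f}")
print(f"smallest expected count = {smallest_expected:.1f}")
```

The same helper shows that a chi-square of 9.269 with 2 degrees of freedom corresponds to a p-value of about .01, which matches the p-value in the first, vaguer write-up.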

Words of caution

While I can't tell you exactly what words to use in your results section, we have come up with a partial list of words that you should use with great care.  One of the problems with many of these words is that they have at least two meanings:  a meaning in common parlance and a specific statistical meaning (and sometimes more than one statistical meaning).

- chance
- odds
- risk
- probability
- significance (statistical or clinical, parameter or model)
- likelihood
- standardized (variable, coefficient, test scores)
- normal
- controlling for (this is an idea that is in the analyst's head, not the program analyzing the data)
- covariates
- robust (regression, standard errors, findings)
- nested, hierarchical (models, data)
- random (variables, intercepts, slopes, effects)

Tables and graphs

Returning to the point about space issues, tables and graphs are two ways to convey a lot of information in a relatively small amount of space.  However, creating useful tables is often more difficult than it seems.  Almost everyone has had the experience of reading a journal article and being mystified about what exactly is in a particular table or how some values were calculated.  It is often tempting and easy to put too much information in a single table; the old adage  "Less is more" is often true.  Graphs can also be an important part of a write-up, but care must be taken to make them as clear as possible.  Axes and lines should be clearly labeled, for example.  Good references for improving the quality of graphs include The Visual Display of Quantitative Information by Edward R. Tufte, Visual and Statistical Thinking:  Displays of Evidence for Making Decisions by Edward R. Tufte, Visual Explanations by Edward R. Tufte, Visualizing Data by William S. Cleveland, Displaying Your Findings:  A Practical Guide for Creating Figures, Posters and Presentations by Adelheid A. M. Nicol and Penny M. Pexman, and for improving the quality of tables Presenting Your Findings:  A Practical Guide for Creating Tables by Adelheid A. M. Nicol and Penny M. Pexman.  (Please note that all of these books are available for loan from our Statistics Books for Loan.)

What goes where

A final point that should be made here is that there should be a clear distinction between what goes in the results section and what goes in the discussion section.  Please keep in mind that different disciplines have different guidelines about what should be covered in the results section and what should be included in the discussion section.  However, a common reason that the results section is "too long" is that some of the write-up that should be in the discussion section has ended up in the results section.  An example of this involves the generalizability of the results.  Issues of generalizability have a way of sneaking into the results section, although they usually belong in the discussion section.  (Part of this will likely also be in the methods section when you describe the population from which you drew your sample.)

Some things to avoid

There are a couple of things that you want to avoid in your results section.  One is concluding that one result is "more significant" than another result because, for example, one p-value is .02 and the other is .0001.  There is no such thing as one result being "more significant" than the other.  If you are interested in relative importance, you want to look at effect sizes or perhaps omega-squareds, but certainly not p-values.  Another pitfall to avoid is claiming that a result is "almost significant" or "nearly significant" when the p-value is .055 or so.  These terms are just different ways of saying non-significant.  Also, according to Murphy's Law, the p-value of .055 will be associated with the variable in which you are most interested.  Please avoid "adjusting" your model so that you get the p-value that you want (one that is less than or equal to .05).  You can say that a result with a p-value of .055 is suggestive and that future research may want to follow up on this, but not significant is not significant, and you have to consider the role random chance played in producing that p-value.  While we are on the topic of non-significant results, a good way to save space in your results (and discussion) section is to not spend time speculating why a result is not statistically significant.  Because of the logic underlying hypothesis tests, you really have no way of knowing why a result is not statistically significant.  Once you find that something is statistically non-significant, there is usually nothing else you can do, so don't waste your time or space there; rather, move on and talk about something else.  Some really persistent analysts try to do post-hoc power analyses when faced with non-significant results, but there is a large literature explaining why these are neither appropriate nor useful.  Excellent summaries can be found in Hoenig and Heisey (2001) The Abuse of Power:  The Pervasive Fallacy of Power Calculations for Data Analysis and Levine and Ensom (2001) Post Hoc Power Analysis:  An Idea Whose Time Has Passed?.  As Hoenig and Heisey show, power is mathematically directly related to the p-value; hence, calculating power once you know the p-value associated with a statistic adds no new information.  Furthermore, as Levine and Ensom clearly explain, the logic underlying post-hoc power analysis is fundamentally flawed.
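The link between the p-value and post-hoc power can be seen in a small Python sketch.  It assumes the simplest possible setting, a two-sided z-test in which the observed effect is treated as if it were the true effect; the function name `observed_power` is hypothetical, and the sketch illustrates the general point rather than reproducing the calculations in either paper.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution, mean 0 and sd 1


def observed_power(p_value, alpha=0.05):
    """Post-hoc ("observed") power of a two-sided z-test, computed from
    nothing but the p-value and the significance level."""
    z_observed = STD_NORMAL.inv_cdf(1 - p_value / 2)  # |z| implied by p
    z_critical = STD_NORMAL.inv_cdf(1 - alpha / 2)    # cutoff for alpha
    # Chance of landing beyond either cutoff if the true effect equals z_observed
    return (STD_NORMAL.cdf(z_observed - z_critical)
            + STD_NORMAL.cdf(-z_observed - z_critical))


for p in (0.01, 0.05, 0.055, 0.20, 0.50):
    print(f"p = {p:.3f} -> observed power = {observed_power(p):.3f}")
```

Under these assumptions the function needs only the p-value as input, so the "power" it reports is just the p-value on a different scale:  a p-value right at .05 always corresponds to observed power of about one half, and larger p-values always correspond to lower observed power.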

Possible future trends

The final topic that I want to discuss today has to do with possible future trends in research and how they might affect you.  Two trends in particular are the increasing sophistication of statistical analyses and the use of the web.  Now the increasing sophistication of the statistical analyses is a trend that is well under way and is not likely to subside any time soon.  What do I mean by "increasing sophistication"?  For much of the history of statistical data analysis, researchers were aware that there were some shortcomings with a particular analysis, but the theory and/or the software necessary to do the analysis correctly was not available.  For example, in the education literature, there were many lengthy discussions of the "unit of analysis" issue, in other words, was the appropriate unit of analysis students or classrooms?  Most researchers agreed that both should be taken into consideration in the analysis, but no one knew how to do that.  Now, multilevel models are used with students at level 1 and classrooms at level 2.  In the past, if you had students nested in classrooms, you could still do an ANOVA and get it published.  Today, a reviewer would likely tell you to redo your analysis using multilevel modeling.  As another example, just a few years ago, most researchers in applied linguistics used between-subjects ANOVA models, even though they really had repeated measures data.  Now I hear that it is getting difficult to get articles published that don't account for the correlated nature of the data.  I want to stress here that a "fancier" model is not necessarily a better model, nor is a more complicated technique always superior to a simpler technique.  However, you want to have a good match between your data and the analysis technique, meaning that all of the relevant aspects of the data should be represented in the model.

A third, and possibly the most dramatic, example of the quick advancement in statistics is in the area of missing data techniques.  As you can imagine, many (perhaps most) researchers have a problem with missing data.  The "preferred" method for handling this problem is changing so quickly that the advice that we gave to clients just two years ago is outdated, and we now have new (and hopefully more appropriate) methods available.  Also, many of the newer statistical techniques are now quickly finding their way into standard or common software packages.  While this is really nice in that you don't have to buy lots of specialized software, it means that all researchers have easy access to these techniques, and the standard for handling missing data in publication-quality research keeps getting raised (which isn't a bad thing).  The take-home point here is that statistics and statistical software are constantly evolving and improving, and sometimes you need to make an effort to keep up.

Now let's talk about the other trend in research, the one that involves the use of the web.  Some researchers have started making their data sets, codebooks and syntax available on their web sites.  In a similar vein, some journals are asking for copies of data sets and making them available on their web sites so that other researchers can use them as secondary data sets or to confirm published results.  Either way, this trend means that there may be much closer scrutiny of data sets and their analysis.  We always suggest that researchers use syntax (as opposed to point-and-click) to run their analyses.  There are at least two good reasons for this.  Such syntax files can be very useful if you get a revise and resubmit ("R&R") or for posting on a web site.  This will also document your data transformations, analyses and thought process.  Even if you are not planning on making your data set publicly available, you should keep careful notes about each step in your research and data analysis, including how and why each step is done.  Also, it will help to keep you honest about the number of significance tests that you have run on your data, so that you will have some sense of the potential alpha inflation problem.
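To get a rough sense of the alpha inflation problem, the Python sketch below computes the chance of at least one false positive across a set of tests.  It assumes the tests are independent and that every null hypothesis is true, which is rarely exactly the case in practice, and the function name `familywise_error_rate` is hypothetical.

```python
def familywise_error_rate(num_tests, alpha=0.05):
    """Probability of at least one false positive among `num_tests`
    independent tests, each run at level `alpha`, when all nulls are true."""
    return 1 - (1 - alpha) ** num_tests


for num_tests in (1, 5, 10, 20):
    rate = familywise_error_rate(num_tests)
    print(f"{num_tests:2d} tests -> chance of at least one false positive = {rate:.3f}")
```

Under these assumptions, ten tests at the .05 level already carry roughly a 40% chance of at least one spurious "significant" result, which is why it helps to have a record of how many tests you have actually run.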

Additional resources

I hope that these tips will help make the writing of the results sections of papers easier.  If you are interested in viewing the resources mentioned in this presentation, the links are:

ATS Statistical Computing

  • Data Analysis Examples
  • Annotated output
  • Web books
  • Online Seminars
  • Statistics Books for Loan
  • Applied Statistics Courses Offered at UCLA
  • Walk-in and email consulting is available to UCLA graduate students who are working on their thesis, dissertation or to-be-published paper; please see Statistical Consulting Schedule for location and hours.  Also, please review our Statistical Consulting Services to learn more about what services we provide.  Please note that we cannot read over your entire results section and make comments.  Rather, we can answer specific questions that you might have about interpretation, wording, etc.  If you would like to hire a statistics tutor, we have a list of people that we can share with you.

    The talking-point notes for this seminar are here.


    These pages are Copyrighted (c) by UCLA Academic Technology Services


    The content of this web site should not be construed as an endorsement of any particular web site, book, or software product by the University of California