Analyzing N dimensions

Log-linear analysis is used to analyze, not just two-dimensional tables, but contingency tables of more than two dimensions.  As such, it can be regarded as an N-dimensional extension of the chi-square analyses of introductory statistics courses.  Among the standard references are Bishop, Fienberg, and Holland (1975) and Fienberg (1980), although more accessible alternatives are Bakeman and Robinson (1994), Kennedy (1992), and Wickens (1989), with Wickens being especially thorough and clear.

A typical log-linear analysis begins by defining a series of hierarchical models.  A series of models is hierarchic when higher-level models include all terms present in lower-level models, and each lower-level model omits one or more terms from the model immediately preceding it.  As terms are deleted—deleting higher-order terms before lower-order ones—more parsimonious models result; that is, each model in the series is less complex than the one before it.

The goal of log-linear analysis is to identify the simplest model that still provides an acceptable fit to the data.  Thus the best-fitting model combines parsimony and information.  Models generate expected counts for the cells of the contingency table.  Less complex models, having fewer terms, are less constrained and so have more degrees of freedom—and consequently the counts they generate fit the observed data less well.  The question is, how much less?  How bad is a less well-fitting model?

Model fit

A model’s goodness-of-fit—more accurately, badness-of-fit—is assessed with the likelihood ratio chi-square (symbolized G²), an alternative computation for the more familiar Pearson chi-square (χ²) that, for technical reasons associated with its decomposition, is preferred in log-linear analysis.  Both G² and χ² express the discrepancy between the data collected and a hypothesized model that indicates how variables are related; in other words, they reflect the difference between the observed counts and the expected counts generated by a particular model.  The greater the difference between the observed and expected counts, the larger G² and χ² become.

The bigger G² is, the worse the model fits.  The first model in the hierarchic series—the saturated model—generates expected frequencies that match the observed ones exactly and, for that reason, fits the data perfectly:  Its G² = 0 with 0 degrees of freedom.  The question then becomes whether a more parsimonious model, one with a larger G², will still fit acceptably.

A common criterion for a tolerably fitting model is the significance of its G²:  If G² is small and not significant, p > .05, the discrepancies between the observed cell counts and those generated by the model must be relatively small, and so we conclude that the model fits acceptably.  However, given large counts, this criterion may be too strict because even relatively small deviations from expected counts will result in a G² significantly different from zero.
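
For readers who want to see the arithmetic behind these statistics, the following Python sketch (not part of ILOG) computes G² and the Pearson χ² from observed and model-generated expected counts and applies the p > .05 criterion; the function names and the small 2x2 example are illustrative only.

    import numpy as np
    from scipy.stats import chi2

    def g_squared(observed, expected):
        """Likelihood ratio chi-square: G2 = 2 * sum(O * ln(O / E)).
        Cells with an observed count of zero contribute nothing to the sum."""
        o = np.asarray(observed, dtype=float)
        e = np.asarray(expected, dtype=float)
        nonzero = o > 0
        return 2.0 * np.sum(o[nonzero] * np.log(o[nonzero] / e[nonzero]))

    def pearson_chi2(observed, expected):
        """Pearson chi-square: sum of (O - E)^2 / E."""
        o = np.asarray(observed, dtype=float)
        e = np.asarray(expected, dtype=float)
        return np.sum((o - e) ** 2 / e)

    # Illustrative 2x2 table; the expected counts are those of the independence model
    observed = np.array([[30.0, 10.0], [20.0, 40.0]])
    expected = np.array([[20.0, 20.0], [30.0, 30.0]])
    df = 1                               # cells minus parameters for the model tested
    g2 = g_squared(observed, expected)
    p = chi2.sf(g2, df)                  # the model fits tolerably if p > .05
    print(f"G2 = {g2:.2f}, chi2 = {pearson_chi2(observed, expected):.2f}, p = {p:.4f}")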

A second criterion, useful when counts are especially large, is the magnitude of Q², which is a comparative-fit or reduction-in-error index analogous to the R² of multiple regression.  Knoke and Burke (1980) suggested that any model whose Q² is greater than .90 provides satisfactory fit, even if its G² differs significantly from zero.  Q² is the proportion of a specified base model’s G² that is accounted for by the model in question, specifically:

    Q² = (base model’s G² – model’s G²) / base model’s G²

In other words, Q² indicates how much initial failure to fit is reduced by a particular model.  When the terms in this model account for over 90% of the baseline model’s failure to fit we conclude that the model fit is acceptable and that the terms deleted to form the model are not consequential.  Selecting an appropriate baseline model can be something of an art. Absent a rationale, a safe choice is the equiprobable or null model, the model that predicts that all cells will have the same count.  When one factor is clearly regarded as the outcome, another choice is the outcome-and-design model, as described shortly.
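
A minimal Python sketch of this computation; the q_squared name is my own, and the worked value reuses the base-model comparison given later in this section, (6.71 – 5.81)/6.71.

    def q_squared(g2_base, g2_model):
        """Proportion of the base model's failure to fit (its G2)
        accounted for by the model in question."""
        return (g2_base - g2_model) / g2_base

    # Worked value from the base-model discussion later in this section:
    # base model [PA][R] with G2 = 6.71, model [RP][RA][PA] with G2 = 5.81
    print(round(q_squared(6.71, 5.81), 2))   # 0.13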

Bracket notation

An especially convenient way to specify log-linear models is with bracket notation.   As an example, consider the prior possession study whose four dimensions are A, D, P, and R (for Age, Dominance, Prior Possession, and Resistance).  As previously noted, the most complete model is called the saturated model.  Using bracket notation, it can be represented with the single 4-way term [RPDA]—here we have reversed the order of the factors so that, as in a multiple regression equation, the presumed outcome variable is listed first (ILOG automatically reverses the order of factors when making models).

The saturated model is conventionally shown as a single term—letters representing each of the factors enclosed in square brackets.  In fact, it also implicitly includes all possible lower-order terms, in this case the four 3-way terms—[RPD][RPA][RDA][PDA]; the six 2-way terms—[RP][RD][RA][PD][PA][DA]; the four 1-way terms—[R][P][D][A]; and a constant, indicated here as the null model—[0], the model that predicts equal counts for each cell and so maximizes G² (i.e., failure to fit).
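
This implicit expansion is easy to reproduce.  The short Python sketch below (an illustration, not how ILOG stores models) lists all lower-order terms of the saturated [RPDA] model using combinations.

    from itertools import combinations

    factors = "RPDA"                       # factors of the saturated model [RPDA]
    for k in range(len(factors), 0, -1):   # 4-way terms down to 1-way terms
        terms = ["[" + "".join(c) + "]" for c in combinations(factors, k)]
        print(f"{k}-way: " + " ".join(terms))
    print("null: [0]")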

Specifying models

ILOG lets you specify models in three different ways.  As an example, imagine that we used the modify table procedure to delete the dominance factor from the initial 2x2x2x2 table, resulting in a 2x2x2, Age by Prior Possession by Resistance table for analysis.  To begin analyzing log-linear models in ILOG, select Run > Specify Log-Linear Models.

The Specify Models window that would open for the 3-dimensional table just described is shown below.  Initially, the first line would indicate the saturated model (without brackets), but all other lines would be blank.  One way to specify a new model is to select (i.e., left-click) the first blank cell in the Model column; a list of terms will be displayed and a new model will be built from the terms selected.  A second way is to select a cell in the Delete column; a list of terms will be displayed and a new model will be built, but with the selected terms deleted.  A third way is more automatic.  Selecting Run > Make All-Level Models generates a list of all possible models, from the saturated to the null, as shown below, whereas selecting Run > Make Next-Level Models generates just models for the next level.  For example, if only the 3-way saturated model is listed (and selected), Make Next-Level Models generates all models involving 2-way terms up to and including the model that consists of all 1-way terms (lines 2–5 below).

Specifically, for this example we first opened a 2x2x2, Age by Prior Possession by Resistance contingency table based on the Bakeman and Brownlee data given earlier.  After checking Compute Q², we then selected Run > Make All-Level Models and Run > Compute Model Stats.  The figure shows the G², degrees of freedom, and approximate probability for each model and, for models after the saturated model, the term deleted from the previous model to create it, along with the ΔG², its degrees of freedom, and its approximate p-value for the chi-square difference test.
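
If you want to check such statistics outside ILOG, expected counts for a hierarchical model can be generated with iterative proportional fitting, a standard approach in the log-linear literature (how ILOG computes them internally is not documented here).  The sketch below assumes Python with numpy and scipy; the ipf function, the margins shown, and the 2x2x2 counts are all illustrative, not the study data.

    import numpy as np
    from scipy.stats import chi2

    def ipf(observed, margins, tol=1e-8, max_iter=1000):
        """Iterative proportional fitting: expected counts for a hierarchical
        log-linear model specified by its highest-order margins (axis tuples)."""
        obs = np.asarray(observed, dtype=float)
        fitted = np.full(obs.shape, obs.sum() / obs.size)   # start at the null model
        for _ in range(max_iter):
            previous = fitted.copy()
            for axes in margins:
                others = tuple(i for i in range(obs.ndim) if i not in axes)
                obs_margin = obs.sum(axis=others, keepdims=True)
                fit_margin = fitted.sum(axis=others, keepdims=True)
                with np.errstate(divide="ignore", invalid="ignore"):
                    ratio = np.where(fit_margin > 0, obs_margin / fit_margin, 0.0)
                fitted = fitted * ratio
            if np.max(np.abs(fitted - previous)) < tol:
                break
        return fitted

    # Illustrative 2x2x2 table (axes 0, 1, 2); these are not the study counts
    observed = np.array([[[ 9., 11.], [19.,  9.]],
                         [[11., 11.], [ 8., 19.]]])
    # The model with all 2-way terms but no 3-way term: margins (0,1), (0,2), (1,2)
    expected = ipf(observed, margins=[(0, 1), (0, 2), (1, 2)])
    nz = observed > 0
    g2 = 2.0 * np.sum(observed[nz] * np.log(observed[nz] / expected[nz]))
    print(f"G2 = {g2:.2f}, df = 1, p = {chi2.sf(g2, 1):.3f}")   # df = 1 for this model in a 2x2x2 table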

You probably see a pattern here, especially if you recall combination formulas or Pascal’s triangle (Bakeman & Robinson, 2005).  With two dimensions (factors) and including the saturated and the null, 4 hierarchical models are possible—[YX], [Y][X], [X], and [0]; with three dimensions (as here) there are 8, with four there would be 16, and generally with N factors there are 2 raised to the power N possible hierarchical models.  The number of n-way terms follows the binomial expansion (Pascal’s triangle).  For N = 2 the coefficients are 1, 2, 1; for N = 3 they are 1, 3, 3, 1; for N = 4 they are 1, 4, 6, 4, 1; and so on, where the coefficients are the numbers of n-way terms, from the saturated to the null model (e.g., for N = 4, one 4-way, four 3-way, six 2-way, four 1-way, and one null term).
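
These counts are easy to confirm with binomial coefficients; a short, purely illustrative Python check:

    from math import comb

    for n in (2, 3, 4):
        coefficients = [comb(n, k) for k in range(n, -1, -1)]   # n-way, ..., 1-way, null
        print(f"N = {n}: {coefficients} -> {2 ** n} hierarchical models")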

Two points should be emphasized.  First, although you can generate 2 to the power N hierarchical models, you are only interested in finding the most parsimonious one—that is, the last model in the hierarchic series that still fits tolerably well.  Second, the order in which terms are deleted from models that include all n-way terms (e.g., lines 2 and 5 in Figure 3) is arbitrary.  You can accept the default order or specify your own, based on whatever order makes the most conceptual sense to you.  By default, ILOG deletes terms in the left-to-right order you might expect:  for example, for [ABC], first [AB], then [AC], then [BC].  But if you prefer a different order, you can simply select a term in the Delete column and change the terms deleted as described earlier.

After specifying a series of hierarchic models, you would next select Run > Compute Model Stats, which causes a variety of statistics to be displayed in the Specify Models window.  Of immediate interest is the G² for each model, which is displayed along with its degrees of freedom and approximate p-value.  Typically, your interest is in locating the last model in the series with a non-significant p-value, that is, one for which p > .05.  Alternatively, especially if the number of tallies is large, you may prefer to locate the last model in the series whose Q² is at least .90.  Other entries indicate, for each model, the terms deleted from the previous model and the degrees of freedom and approximate p-value for the deleted term or terms.  This constitutes a chi-square difference test (partial G² or ΔG²—labelled ^G² in ILOG), appropriate when one model is nested in another, and indicates whether removing the deleted terms caused the fit of the model to deteriorate significantly.
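
A sketch of the chi-square difference test itself, assuming Python with scipy; the function name and the G² values plugged in are hypothetical, not taken from the figure.

    from scipy.stats import chi2

    def delta_g2_test(g2_reduced, df_reduced, g2_full, df_full):
        """Chi-square difference (partial G2) test for nested hierarchical models:
        does deleting terms from the fuller model significantly worsen its fit?"""
        delta_g2 = g2_reduced - g2_full
        delta_df = df_reduced - df_full
        return delta_g2, delta_df, chi2.sf(delta_g2, delta_df)

    # Hypothetical G2 values for two adjacent models in a hierarchic series
    dg2, ddf, p = delta_g2_test(g2_reduced=9.45, df_reduced=3, g2_full=2.10, df_full=2)
    print(f"deltaG2 = {dg2:.2f}, delta df = {ddf}, p = {p:.3f}")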

Interpretation

Earlier we noted that the association between Prior Possession and Resistance was significant for preschoolers (p = .015) but not toddlers (p = .47), but that the piecemeal analyses could not tell us whether the magnitude of the association differed between them.  The log-linear analysis shown above—after checking the Compute Q² box (for illustration; the number of tallies here is not especially large) and after selecting Run > Compute Model Stats—provides an answer.  We would conclude that only the saturated [RPA] model fits because its p-value is > .05 and the p-value for the next model in the series is less than .05 (assuming an alpha level of .05).  In other words, this 3-dimensional table cannot be simplified.  The association between Prior Possession and Resistance differed by Age.
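
If you want to reproduce this kind of comparison outside ILOG, hierarchical log-linear models can also be fit as Poisson regressions on the cell counts, in which case a model’s deviance is its G² relative to the saturated model.  The sketch below assumes Python with pandas and statsmodels; the counts shown are hypothetical placeholders, not the Bakeman and Brownlee data.

    import pandas as pd
    import statsmodels.api as sm
    import statsmodels.formula.api as smf
    from scipy.stats import chi2

    # Hypothetical 2x2x2 cell counts (one row per cell), not the study data
    cells = pd.DataFrame({
        "Age":    ["toddler"] * 4 + ["preschool"] * 4,
        "Prior":  ["yes", "yes", "no", "no"] * 2,
        "Resist": ["yes", "no"] * 4,
        "count":  [11, 9, 19, 9, 11, 11, 8, 19],
    })

    # The analogue of [RP][RA][PA]: all two-way terms, no three-way term
    fit = smf.glm("count ~ (Age + Prior + Resist) ** 2",
                  data=cells, family=sm.families.Poisson()).fit()

    # Deviance = G2 relative to the saturated model; a small p suggests the
    # three-way term is needed, i.e., the table cannot be simplified
    g2, df = fit.deviance, int(fit.df_resid)
    print(f"G2 = {g2:.2f}, df = {df}, p = {chi2.sf(g2, df):.3f}")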

Recasting these results in more familiar analysis of variance terms is helpful.  If we identify one factor as the outcome, an N-dimensional contingency table can be described as an (N–1)-way analysis of variance.  Imagine the factors are A, B, and Y, with Y the outcome, so that the saturated model is [YBA]; B could be a predictor and A could be a moderator variable (like Age or Gender).  The [YB] term indicates a main effect for factor B, the [YA] term a main effect for factor A, the [BA] term simply indicates the design, and the [YBA] term indicates a B×A interaction.  In particular, given three factors—one an outcome, one a predictor, and one a moderator—log-linear analysis provides an answer to a common question:  Is the association between the outcome and predictor different for different groups, that is, is it moderated by group membership?

Base model

Thinking in analysis of variance terms also helps us determine an appropriate base model.  When one variable is regarded as the outcome and others as design variables, an appropriate base model, using the notation of the previous paragraph, is [BA][Y]—this is the outcome-and-design model we mentioned earlier.  Including [BA] in the base model signals that we are interested in associations between outcome and design variables, not in associations within the design variables; after all, they are often determined by the investigator, as when gender is one factor and we recruit equal numbers of males and females.  For example, if we specified [PA][R] as the base model for the analyses shown earlier (and not the null model as shown), the Q² for the [RP][RA][PA] model would be (6.71 – 5.81)/6.71 = .13—which is further evidence that this is an ill-fitting model.

If you checked Compute Q², which requires that a base model be specified, ILOG assumes the base model is the last model listed in the hierarchic series.  ILOG also lets you state your base model explicitly (upper-left edit box in the Specify Models window), which is useful as a reminder if your base model is something other than the null.  If the number of factors is 2 or 3, automatic model generation will stop with the specified base model, but if the number of factors is 4 or more, you will need to select terms to delete to ensure that the last model specified is, in fact, your desired base model.

Structural zeros, empirical zeros, and low counts

Cells may contain zero for different reasons.  For example, if one factor is gender (male or female) and the other pregnant (yes or no), one of the cells will necessarily be zero—this is called a structural zero (any other value is logically impossible) and is indicated in ILOG, not with a 0, but with an asterisk.  Structural zeros reduce degrees of freedom and ILOG makes the appropriate adjustments; for any given model, the degrees of freedom adjustment is displayed after the model number.

However, cells may also contain zeros simply because no cases were observed; usually such empirical zeros are not problematic if they are few in number.  But if many cells contain zeros or low counts, the expected frequencies computed may be low.  If expected frequencies are less than 1 for more than 5% of the cells for any given model, ILOG displays the percentage after the model number.  Guidelines vary, but Wickens (1989) suggested that, for large two-way tables, it may be acceptable, if not desirable, for as many as 20% of the cells to have expected frequencies less than one—but that if this guideline is violated, the test should be abandoned.  For multidimensional tables, you should remain wary if many cells are zero.
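
A short sketch of this kind of screen, assuming Python with numpy; the pct_low_expected name is illustrative, and the expected array would come from whatever model you have fitted (for example, the ipf sketch shown earlier).

    import numpy as np

    def pct_low_expected(expected, cutoff=1.0):
        """Percentage of cells whose model-generated expected frequency
        falls below the cutoff (1 by default)."""
        return 100.0 * np.mean(np.asarray(expected, dtype=float) < cutoff)

    # expected = ...  # expected counts from a fitted model
    # print(f"{pct_low_expected(expected):.1f}% of cells have expected counts below 1")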

Empirical zeros can cause another problem.  Depending on where they occur, they can cause some cells of the marginal tables used to compute expected frequencies to be zero and thereby reduce degrees of freedom (Wickens, 1989, pp. 120–124).  As Wickens writes, “A pre-packaged computer program may or may not make these corrections automatically—one should check to be sure” (p. 120).  ILOG does make these adjustments and notes the number by which degrees of freedom were reduced (displayed after the model number of any affected models).

Deviant cells

A small G² for a model indicates that most cells fit well, that is, the expected frequencies generated by the model are fairly close to those observed, whereas a large G² indicates just the opposite.  For the present example, the saturated model [RPA] fit the observed counts exactly, but it could be useful to examine why the [RP][RA][PA] model failed to fit.

Selecting a particular model in the Specify Models window and then selecting Run > Examine Selected Model opens a window that lets you examine, for each cell, the difference between the observed frequency and the frequency expected by the model.  For this example, the largest standardized residual (+1.36) is for prior possession without resistance for preschoolers: 11 instances were observed but only 7.31 were expected by the [RP][RA][PA] model.  For toddlers, the opposite was true; again 11 instances were observed but 14.7 were expected and, consequently, the standardized residual was negative (–0.96).  Thus it is not surprising that the log-linear analysis suggested an interaction—a difference in the prior possession–resistance association between toddlers and preschoolers.
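
For reference, standardized residuals of this kind are simply (observed – expected) / √expected.  The Python sketch below (names illustrative) computes them for a whole table; the first check reproduces the +1.36 reported above from its rounded inputs.

    import numpy as np

    def standardized_residuals(observed, expected):
        """Standardized (Pearson) residuals: (O - E) / sqrt(E)."""
        o = np.asarray(observed, dtype=float)
        e = np.asarray(expected, dtype=float)
        return (o - e) / np.sqrt(e)

    # Check against the cell reported above: 11 observed, 7.31 expected
    print(round(float(standardized_residuals(11, 7.31)), 2))   # 1.36

    # For a full table of observed and expected counts:
    # res = standardized_residuals(observed, expected)
    # worst = np.unravel_index(np.argmax(np.abs(res)), res.shape)
    # print("largest |residual|:", round(res[worst], 2), "at cell", worst)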

Explicating terms

Once you have settled on a tolerably fitting model to interpret, next steps include explicating its terms.  As with the significant main effects and interactions of analysis of variance, the included terms indicate how you should explicate the data.  For the present example, because you decided that age moderates the prior possession–resistance association—i.e., you accepted the [RPA] model—you would report that preschoolers were less likely than toddlers to resist when the taker had prior possession (58% vs. 76%).  However, if the data had been different and the [RP][RA][PA] model had fit tolerably well, then you would report main effects for Age and Prior Possession (no interaction), along with resistance percentages for Age and for Prior Possession.

One final comment:  A strength of ILOG is its ability to re-order models in a hierarchic series.  In the Specify Models window shown earlier, by default ILOG deleted the [RP] term before [RA], but if you wished to delete the [RA] term first, for whatever reason, you would select the [RP] term in the Delete column and select the [RA] term from the options presented.  If the resulting series is hierarchic, ILOG will then re-compute statistics, as appropriate.  The advantage of this ability to change the order in which terms are deleted in a hierarchic series becomes more apparent with 4-dimensional tables, for which there are four 3-way terms and six 2-way terms.

As with any interactive computer program, exactly how this all works is understood best as you explore the program (and its various capabilities) with your own data.