Winnowing N dimensions

A particularly useful feature of ILOG’s Examine Two-Way Tables procedure is its winnowing ability (see Bakeman & Gottman, 1997, pp. 119–120; Bakeman & Quera, 2011, pp. 129–130 and 143–144).  When the χ² or G² associated with a two-way table (either a simple two-way table or one created by pooling over levels of other factors) is large and its p value small (e.g., < .05), we say the test of independence (of rows and columns) has failed; only the saturated model fits.  To interpret this result when there are more than two levels for the row and column factors, we could examine adjusted residuals (Bakeman & Quera, 2011); large ones indicate cells whose observed values deviate significantly from expected.  However, adjusted residuals in a two-dimensional table form an interrelated web.  If some are large, others necessarily must be small, and so which do we interpret?  All those, for example, that exceed 1.96 in absolute value?
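To make the adjusted residuals concrete, here is a minimal Python sketch.  It assumes the usual form, (observed − expected) divided by the square root of expected × (1 − row proportion) × (1 − column proportion).  The function name and the counts are hypothetical, and this is not ILOG’s own code.

```python
import math

def adjusted_residuals(table):
    """Return adjusted residuals for a two-way table of counts.

    Assumes every row and column total is greater than zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    residuals = []
    for i, row in enumerate(table):
        out_row = []
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            # Estimated variance of (observed - expected).
            variance = (expected
                        * (1 - row_totals[i] / n)
                        * (1 - col_totals[j] / n))
            out_row.append((observed - expected) / math.sqrt(variance))
        residuals.append(out_row)
    return residuals

# Hypothetical 3 x 3 counts, for illustration only.
counts = [[60, 20, 30],
          [20, 40, 60],
          [30, 60, 90]]
for row in adjusted_residuals(counts):
    print([round(z, 2) for z in row])
```

In a 2 × 2 table all four adjusted residuals have the same absolute value, which is the simplest case of the interrelated web described above.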

Winnowing offers a more economical approach to interpretation.  We identify those cells that cause fit to fail; almost always there are fewer of these than there are cells with large adjusted residuals, which makes for a more parsimonious interpretation.  Winnowing consists of iteratively replacing selected cells (in an order you determine) with structural zeros until we find a table that fits tolerably.  We then assume that the cells we replaced caused the bad fit of the table.
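The winnowing loop just described can be sketched in Python as follows.  This is an illustration under stated assumptions, not ILOG’s implementation: expected counts for a table with structural zeros are obtained by iterative proportional fitting, each structural zero is assumed to cost one degree of freedom (which holds only while no row or column is emptied and the table stays connected), and "fits tolerably" is taken to mean p ≥ .05.  All function names and the counts are hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail chi-square probability via the incomplete-gamma recurrence."""
    z = max(x, 0.0) / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1)
        a += 1.0
    return q

def fit_quasi_independence(table, zeros, iterations=200):
    """Observed and expected counts with the cells in `zeros` excluded."""
    n_rows, n_cols = len(table), len(table[0])
    obs = [[0 if (i, j) in zeros else table[i][j] for j in range(n_cols)]
           for i in range(n_rows)]
    exp = [[0.0 if (i, j) in zeros else 1.0 for j in range(n_cols)]
           for i in range(n_rows)]
    for _ in range(iterations):
        for i in range(n_rows):              # match row totals
            current = sum(exp[i])
            if current > 0:
                factor = sum(obs[i]) / current
                exp[i] = [e * factor for e in exp[i]]
        for j in range(n_cols):              # match column totals
            current = sum(exp[i][j] for i in range(n_rows))
            if current > 0:
                factor = sum(obs[i][j] for i in range(n_rows)) / current
                for i in range(n_rows):
                    exp[i][j] *= factor
    return obs, exp

def g_squared(obs, exp):
    """Likelihood-ratio statistic; cells with zero observed counts add nothing."""
    return 2.0 * sum(o * math.log(o / e)
                     for o_row, e_row in zip(obs, exp)
                     for o, e in zip(o_row, e_row) if o > 0)

def winnow(table, order, alpha=0.05):
    """Replace cells in `order` with structural zeros until the table fits."""
    base_df = (len(table) - 1) * (len(table[0]) - 1)
    zeros = set()
    for cell in [None] + list(order):
        if cell is not None:
            zeros.add(cell)
        df = base_df - len(zeros)            # assumed: one df per structural zero
        if df < 1:
            raise ValueError("no degrees of freedom left")
        g2 = g_squared(*fit_quasi_independence(table, zeros))
        p = chi2_sf(g2, df)
        if p >= alpha:                       # fits tolerably, so stop
            break
    return sorted(zeros), g2, df, p

# Hypothetical counts: independent except for an inflated top-left cell.
counts = [[60, 20, 30],
          [20, 40, 60],
          [30, 60, 90]]
zeros, g2, df, p = winnow(counts, order=[(0, 0), (1, 1)])
print(zeros, round(g2, 3), df, round(p, 3))
```

Supplying the cells in order of decreasing absolute adjusted residual is one plausible choice for `order`; the procedure itself leaves the order to the analyst.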

As an example, we consider unpublished data from a study of dinner conversation provided by Clotilde Pontecorvo (University of Rome).  Turns of talk were coded for Speaker (Father, Mother, Target Child, or Sibling) and Action (uses Knowledge, Relates, Entertains, Controls, or Manages) for six families.  Only the saturated model fit; thus we concluded that no common pattern joined Speaker and Action.  Instead, different speakers were associated with different functions in different families.  We had thought that fathers generally controlled and mothers managed, but these hypotheses were not supported by the data.

In one family, G²(12, N = 330) = 78.3, p < .001, indicating that different speakers favored different actions:  7 of the 20 adjusted residuals exceeded 1.96 in absolute value (see above).  Using the Examine Two-Way Tables procedure, and replacing just four cells with structural zeros, produced a fitting model, G²(8, N = 212) = 14.716, p = .064 (see below).  Each structural zero removes one degree of freedom (12 − 4 = 8), and N drops because the 118 turns in the four replaced cells are no longer counted.

Specifically, to replace cells with structural zeros, first select Examine as Two-Way Tables in the main ILOG window.  Then, in the Examine as Two-Ways window, select the two-way table to examine (here, Speaker as row, Action as column, and Family as list, selecting the particular family).  Position the mouse over the table, right click, and select Let click make cell a structural zero from the context menu.  Then left click each of the four cells:  Father with Relates and Controls, Mother with Relates, and Target Child with uses Knowledge.  Their counts will be replaced with asterisks (see below), indicating structural zeros, and the table chi-square will no longer be statistically significant.  We conclude that these four associations adequately capture this family’s unique pattern.

More generally, winnowing is an economical way to identify those particular cells in any two-dimensional table that cause fit to fail and is easily effected in ILOG (clicking on particular cells replaces them with structural zeros and re-computes G²).

Conclusion

We have presented a relatively brief introduction to log-linear analysis.  It is by no means complete; several book-length treatments describe log-linear analysis with much more breadth and depth (e.g., Wickens, 1989).  Our intent is to provide readers with enough of a sense of what log-linear analysis can do that they can then decide if it would serve them and if they want to learn more.

Throughout we have noted how log-linear analysis and other analyses of contingency tables can be effected with an interactive computer program, ILOG.  We find that the analysis of hierarchical log-linear models works best when approached interactively, which is the approach ILOG takes.  We encourage interested readers to enter their own data into ILOG, or use the Bakeman and Brownlee data given earlier, and then try running the various procedures.  Exploring the various options that a computer program permits is often an excellent way to learn more about both the analysis performed and the program’s capabilities.

ILOG has several advantages.  As noted earlier, standard statistical packages typically have one or two log-linear analysis routines.  Often they produce many pages of output.  As noted here, log-linear analysis typically proceeds by comparing models in a hierarchic series, searching for a model to interpret.  This is inherently an interactive process, a process that an interactive program like ILOG greatly facilitates.

The exploration required for interpretation of log-linear results likewise is more efficient when approached interactively, and again this is facilitated by the procedures ILOG provides (Examine the Selected Model, Examine as Two-Way Tables).  Finally, ILOG lets you import tables that were exported by spreadsheet or statistical package programs as tab-delimited files, manipulate and modify contingency tables with considerable flexibility (Modify This Table), export initial or modified tables as tab-delimited files that can be imported into spreadsheet and statistical analysis programs, and read or paste its tab-delimited output into a standard spreadsheet program for further manipulation and analysis.