A research synthesis of this scope, with numerous
operational and procedural decisions, will have aspects that will
be questioned by other researchers and will leave many important
questions and issues untouched. We, too, felt constrained in not
being able to discuss the results and methodological factors in
greater detail. Thus, we limit our discussion of limitations and
suggested additional research to four issues.
First, the dominance of WJ-R and WJ III
studies (94% of analyses were based on a WJ battery) suggests the results
and implications are best characterized as a referendum on WJ CHC
COG-ACH relations. Generalization of the review results to other
CHC-based intelligence batteries (DAS-II; KABC-II; SB-V) or
assessment approaches (cross-battery assessments) should be
approached with caution. Similarly classified broad and narrow CHC
measures from other batteries, particularly the narrow CHC test
classifications (which are primarily based on logical expert
consensus methods), cannot be assumed to display the same CHC
COG-ACH relations patterns reported here. For example, although
empirically classified (as per a CHC-designed WISC-III + WJ III
cross-battery CFA study; Phelps, McGrew, Knopik & Ford, 2005)
as narrow Gsm measures of working memory (Gsm-MW), the reported MW
factor loadings for WJ III Numbers Reversed (.65), WJ III Auditory
Working Memory (.59), and WISC-III Digit Span (.70) tests suggest
they are not interchangeable MW measures. More importantly, other
intelligence battery tests or composites that may be classified the
same (either empirically or logically) as a WJ III measure may not
necessarily display the same strength of relation with achievement
domains. For example, in the Phelps et al. (2005) dataset, the WJ
III Visual Matching (Gs-P) test correlated .42 with WJ III
Letter-Word Identification and .40 with WJ III Passage
Comprehension, while the similarly Gs-P classified tests (see
Flanagan et al., 2006) of WJ III Cross Out (.35; .27) and WISC-III
Symbol Search (.32; .27) correlated at lower levels. Empirical
support for CHC COG-ACH interpretations beyond the WJ batteries is
limited to non-existent.
Second, as previously discussed, our operational
criterion for the significance consistency classifications was
admittedly post hoc and arbitrary. Given the lack of a prior
systematic CHC COG-ACH research synthesis, we erred on the side of
leniency as we viewed the current review as exploratory and
suggestive in nature. This was our intent—to identify
possible significant COG-ACH relations warranting further study and
discussion.
Third, a number of interesting CHC COG-ACH
relations were classified as tentative/speculative.
Additional research with the same or similar measures of these
tentatively identified abilities is needed. There have been 20 years
of CHC COG-ACH research, but most of it has consisted of analysis
of the WJ-R and WJ III battery measures and norm data. Additional
research is needed with other measures and in different norm
samples.
Fourth, space did not allow for analyses by
methodological factors. Most important was the possibility of
different conclusions when comparing manifest variable (MV) versus
latent variable (LV) research (see Table 1) and, more importantly,
what the MV/LV→ACH
differential findings suggest for future research and current
assessment practice. Inspection of the complete set of on-line
summary coding tables reveals an obvious trend for MV studies to
report more significant COG-ACH relations than LV studies. For
example, across all achievement domains, CHC IVs, and ages, MV
analyses were significant approximately 1.5 times more frequently
than LV analyses (MV = 40.1%; LV = 26.2%). Disentangling the
possible MV/LV-by-ACH domain-by-age
group interactions requires separate analyses and a separate manuscript.
We provide the on-line summary tables in hopes others will explore
these methodological nuances and their implications for practice.
Although we did not undertake such detailed exploration, we believe
that the MV > LV COG-ACH
significance finding is most likely due to the absence (MV) or
presence (LV) of a general intelligence (g) factor in the research
designs. This topic deserves greater deliberation, analysis,
discussion and debate than we can offer here.
The combined limitations make one over-arching
conclusion clear. The extant CHC COG-ACH literature of the past 20
years has been restricted to a mosaic of methodological approaches
that have been primarily applied to samples that frequently have
not been independent (i.e., WJ-R and WJ III standardization sample
subjects). We believe that the salient COG-ACH relations
reported are those that are the most robust—they managed to
“bubble to the surface” despite the methodological
twists and turns across research studies.