I'm working through this paper and noting it up in Mendeley. Here I'm going to note some (perhaps tangential) observations.
Linking the test of those wanting visas or citizenship to the CEFR effectively lets the test designer off the hook. If you ask them, "How does your test support the inference that the test taker is fit for leave to remain?", they are liable to reply, "We are saying no such thing; we're saying that the test taker has B1 level speaking in accordance with the CEFR." And they may be able to point to research which validates the test in those terms.
And so we need to turn to the test's principal stakeholder, the UK government, and ask how CEFR level B1 can be interpreted (in effect) as a score which measures fitness for leave to remain. Where is the validity argument? It's unlike, for example, an IELTS score of 6.5, which, it is generally assumed, fits an L2 undergraduate to study hard science in an English-medium university.
Collective responsibility for teaching many thousands of L2 students over a decade or more tells university admissions administrators that the inference to be drawn from a 6.5 score is that the undergraduate can write papers, read texts, take part in seminars, and understand lectures and instruction. It's not a simple construct, but it's comprehensible. Partaking in UK life, by contrast, cannot be so quickly delineated.
[NB Kane (2013) cited by 220 in Google Scholar. Need to have a look at other databases.]
"First, state the claims that are being made in a proposed interpretation or use (the IUA), and second, evaluate these claims (the validity argument)." (p9) Very succinct.
There's nothing in principle to make Argument-Based Validity (Kane, 2013) inconsistent with Consequential Validity (Bachman, 2010). That is: IUA > Validity Argument > Consequential Validity.
Extrapolation is the most salient aspect of Kane (2013); it needs more work from me.
"validation always involves the specification (the IUA) and evaluation (the validity argument) of the proposed interpretations and uses of the scores." (Kane, 2013, p16).