WELCOME TO MY BLOG, ENJOY AND HAPPY LEARNING

Saturday, 02 August 2014

VALIDITY, RELIABILITY, AND AUTHENTICITY

1.    VALIDITY
Definition of Validity
Validity is an overall evaluative judgment of the degree to which empirical evidence and theoretical rationales support the adequacy and appropriateness of interpretations and actions based on test scores or other modes of assessment (Messick, 1989).
Validity is the degree to which a test measures what it is supposed to measure, or can be used successfully for the purposes for which it is intended (Richards, Jack C. and Schmidt, 2002).

Types of Validity
·   Content validity : a type of validity that is based on the extent to which a test adequately and sufficiently measures the particular skills or behavior it sets out to measure.
·   Construct validity : a type of validity that is based on the extent to which the items in a test reflect the essential aspects of the theory on which the test is based (i.e., the construct).
For example, the greater the relationship that can be demonstrated between a test of communicative competence in a language and the theory of communicative competence, the greater the construct validity of the test.
Ø   Convergent validity : a type of construct validity that is based on the extent to which two or more tests that are claimed to measure the same underlying construct are in fact doing so.
Ø   Discriminant validity : a type of construct validity that is based on the extent to which two or more tests that are claimed to measure different underlying constructs are in fact doing so.
·   Criterion validity : a type of validity that is based on the extent to which a new test correlates with an established external criterion measure.
Ø   Concurrent validity : a type of validity that is based on the extent to which a test correlates with some other test that is aimed at measuring the same skill, or with some other comparable measure of the skill being tested.
Ø   Predictive validity : a type of validity that is based on the degree to which a test accurately predicts future performance. A language aptitude test, for example, should have predictive validity, because the results of the test should predict the ability to learn a second or foreign language.
·   Consequential validity : a type of validity that is based on the extent to which the uses and interpretations of a test that may have an impact on society result in fair and positive social consequences for all stakeholders, including test takers.
·   Face validity : the degree to which a test appears to measure the knowledge or abilities it claims to measure, based on the subjective judgment of an observer. For example, if a test of reading comprehension contains many dialect words that might be unknown to the test takers, the test may be said to lack face validity.
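
Criterion validity is described above in terms of how well a new test correlates with an established measure. One common way to put a number on that is the Pearson correlation coefficient. The short Python sketch below is only an illustration under that assumption: the function name pearson_r and the scores are made up for the example and do not come from the sources cited here.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares around each mean.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cross / sqrt(ss_x * ss_y)


# Hypothetical scores of five test takers (illustrative data only).
new_test_scores = [52, 61, 70, 78, 90]    # the new test being validated
criterion_scores = [50, 65, 68, 80, 88]   # an established criterion measure

validity_coefficient = pearson_r(new_test_scores, criterion_scores)
print(round(validity_coefficient, 2))
```

A coefficient close to 1 would suggest that the new test ranks test takers in much the same way as the criterion measure. For predictive validity the same calculation could be used, with the second list holding a measure of later performance.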


2.    RELIABILITY
Definition of Reliability
Reliability is a measure of the degree to which a test gives consistent results. A test is said to be reliable if it gives the same results when it is given on different occasions or when it is used by different people (Richards, Jack C. and Schmidt, 2002).
Reliability refers to the consistency of a measure. A test is considered reliable if we get the same result repeatedly. Unfortunately, it is impossible to calculate reliability exactly, but there are several different ways to estimate reliability.

Types of Reliability
·         Equivalent Form Reliability (Parallel Form Reliability)
Parallel-forms reliability is gauged by comparing two different tests that were created using the same content. This is accomplished by creating a large pool of test items that measure the same quality and then randomly dividing the items into two separate tests. The two tests should then be administered to the same subjects at the same time.
·         Inter-Rater Reliability
This type of reliability is assessed by having two or more independent judges score the test. The scores are then compared to determine the consistency of the raters’ estimates.
In other words, inter-rater reliability is the degree to which different examiners or judges making subjective ratings of ability (e.g., of L2 writing proficiency) agree in their evaluations of that ability.
·         Test-Retest Reliability
To gauge test-retest reliability, the test is administered twice at two different points in time. This kind of reliability is used to assess the consistency of a test across time. This type of reliability assumes that there will be no change in the quality or construct being measured.
It is an estimate of the reliability of a test determined by the extent to which the test gives the same results if it is administered at two different times.
·         Internal Consistency Reliability
This form of reliability is used to judge the consistency of results across items on the same test. Essentially, you are comparing test items that measure the same construct to determine the test's internal consistency. When you see a question that seems very similar to another test question, it may indicate that the two questions are being used to gauge reliability. Because the two questions are similar and designed to measure the same thing, the test taker should answer both questions the same way, which would indicate that the test has internal consistency (Richards, Jack C. and Schmidt, 2002).
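
Three of the ideas above can be sketched in a few lines of Python: randomly dividing an item pool into two parallel forms, checking how often two raters agree, and estimating internal consistency. This is a rough illustration only. Percent agreement and Cronbach's alpha are common estimates that I have chosen for the example; they are not named in the sources cited here, and all function names and data are hypothetical.

```python
import random
from statistics import pvariance


def split_item_pool(items, seed=0):
    """Randomly divide a pool of items into two forms of (almost) equal size."""
    pool = list(items)
    random.Random(seed).shuffle(pool)  # seeded so the split can be reproduced
    half = len(pool) // 2
    return pool[:half], pool[half:]


def percent_agreement(rater_a, rater_b):
    """Share of test takers who received the identical score from both raters."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty score lists of equal length")
    matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return matches / len(rater_a)


def cronbach_alpha(item_scores):
    """Cronbach's alpha; item_scores has one row per test taker, one column per item."""
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("need at least two items")
    columns = list(zip(*item_scores))
    sum_item_variances = sum(pvariance(col) for col in columns)
    total_variance = pvariance([sum(row) for row in item_scores])
    if total_variance == 0:
        raise ValueError("total scores must vary")
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)


# Hypothetical data (illustrative only).
form_a, form_b = split_item_pool(range(1, 21))   # item numbers 1..20
agreement = percent_agreement([3, 4, 5, 2], [3, 4, 4, 2])
alpha = cronbach_alpha([[1, 1, 1], [2, 2, 3], [3, 3, 3], [4, 5, 4]])

print(len(form_a), len(form_b))
print(agreement)
print(round(alpha, 2))
```

Percent agreement is the simplest possible check and does not correct for agreement that happens by chance, so it should be read as a first look rather than a full inter-rater analysis.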

Ways to Reach Reliability
·      Do not allow candidates too much freedom
·      Write unambiguous items
·      Provide clear and explicit instructions
·      Ensure that tests are well laid out and perfectly legible
·      Make sure candidates are familiar with the format and testing techniques
·      Provide uniform and non-distracting conditions of administration
·      Use items that permit scoring which is as objective as possible
·      Provide a detailed scoring key
·      Train scorers
·      Identify candidates by number, not name
·      Employ multiple, independent scoring
(Hughes, 1996: 36-42)

Factors Affecting Reliability
·      Student-related reliability
a) Temporary illness
b) Fatigue
c) A bad day
d) Strategy
·      Rater reliability
a) Human error
b) Subjectivity
c) Lack of attention to scoring criteria
d) Inexperience
·      Test administration reliability
a) Street noise
b) Photocopying variations
c) Poor light
d) Temperature variations
e) Chair and table condition
·      Test reliability
a) Timed test
b) Ambiguity of test items

3.    AUTHENTICITY
More about Authenticity
Authentic Materials:
·         Nunan (1989): “any material which has not been specifically produced for the purpose of language teaching” (as cited in Macdonald, Badger & White, 2000).
·         Bacon & Finnemann (1990): “authentic materials are texts produced by native speakers for a non-pedagogical purpose.”

Comparison of Authentic vs. Non-Authentic Materials

Authentic Materials
·      Language data produced for real life communication purposes.
·      They may contain false starts, and incomplete sentences.
·      They are useful for improving the communicative aspects of the language.

Non-Authentic Materials
·      They are specially designed for learning purposes.
·      The language used in them is artificial, and they always contain well-formed sentences.
·      They are useful for teaching grammar.


