ScanExam-II — Advanced Topics
Compiled from Scan Exam-II HELP topics, October 2003
Scan Exam supports two exam formats. These two formats cannot be mixed together in one DAT file. The most common format is the Standard Multiple Choice exam consisting of a 5-answer choice (A-E) format and uses the University of Western Ontario 180 question Scantron Form No. F-13209-UWO. The second is the Extended Multiple Responses exam consisting of a 20-answer choice (A-T choices) format and uses the 45 question Scantron Form No. F-13622-UWO. Western no longer orders the Extended Multiple Responses exam Scantrons so you cannot use this version unless you have them in stock.
Standard Multiple Choice
The standard multiple choice format offers up to 5 answer choices (A through E) per question, and the student may record only one answer per question. If the student marks a question with two or more responses, it is interpreted as ambiguous and recorded as a question mark (?). Blank and multiply marked questions are highlighted by Scan Exam as potential student coding errors. If left unchanged, a multiple response to a question is marked as incorrect. A question that is left blank is considered neither correct nor incorrect. By default (see the other marking methods), a student’s score is calculated as the sum of the individual question weights of all correct questions, and this score is expressed as a percentage of a perfect exam score. Where a question has one or more alternate correct answers defined for it, a student’s answer is marked correct if it matches one of the alternate choices, and the score is adjusted using the weight assigned to the corresponding alternate answer.
Extended Multiple Response
This version of the ScanTron can no longer be ordered but can be used if you have it in stock. The extended multiple response exam format is similar to the standard multiple choice format except that each question has up to 20 choices A through T per question and the student may record multiple answer choices as necessary to answer the question. This format is very powerful and often a preferred format for testing in the clinical sciences. If a student answers a question with more responses than the correct answer permits then this is interpreted as an excess multiple response. Blank and excess multiple responses are highlighted by Scan Exam as potential student coding errors. If left unchanged, excess multiple responses are marked as incorrect while blanks are treated as neither correct nor incorrect. A student’s exam score is calculated as the sum of the question weights of correct questions and is expressed as a percentage of a perfect exam score. Where a question has defined for it one or more alternate correct answers a student’s answer is marked correct if it matches one of the alternate choices and the score is adjusted using the weight assigned to the corresponding alternate answer.
When a student’s answer does not match exactly the master answer and any alternate answers or contains more responses than the master answer, it is marked incorrect. When a student answer is incorrect and not excessively marked it will receive partial marks for each response that matches a portion of the master answer. The partial mark is a proportional fraction of the master answer weight. For example, if a master answer has a weight of 1.0 and has a three letter master answer that defines it, then each of the letter choices in the correct answer is worth 1/3 of a mark. Hence a student may well see their answer marked incorrect but receive partial marks for it. Partials are only considered against the master answer and never against alternate answers. A sample question is given below: (correct answer is E,J,K)
|A 6 year-old girl has cystic fibrosis. She has been taking no medications. Select 3 supplements.|
|A. Calcium||E. Vitamin A||I. Vitamin C|
|B. Fluoride||F. Vitamin B1 (thiamine)||J. Vitamin D|
|C. Folic acid||G. Vitamin B6||K. Vitamin E|
|D. Iron||H. Vitamin B12 (cyanocobalamin)|
Sample responses with resulting scores (assumes weight of 1 for the question).
- Answer EJK scores 1.0
- Answer AFG scores 0 (wrong)
- Answer AJK scores 0.66 (wrong, but 2 of the 3 items are correct; assigned partial)
- Answer K scores 0.33 (wrong, but 1 of the 3 items is correct; assigned partial)
- Answer AEJK scores 0 (excessively marked, assigned zero)
Note that Marking Methods 2 and 3 are not supported for the Extended Multiple Response exam format.
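The partial-mark rules described above for the Extended Multiple Response format can be sketched in a few lines of Python. This is only an illustration of the stated rules, not Scan Exam's own code: the function name score_extended_response is hypothetical, alternate answers are left out, and a blank is returned as None to reflect "neither correct nor incorrect".

```python
def score_extended_response(student, master, weight=1.0):
    """Score one Extended Multiple Response answer against the master answer.

    Alternate answers are deliberately left out of this sketch.
    """
    student_set, master_set = set(student), set(master)
    if not student_set:
        return None                     # blank: neither correct nor incorrect
    if student_set == master_set:
        return weight                   # exact match earns the full weight
    if len(student_set) > len(master_set):
        return 0.0                      # excess multiple response earns zero
    # Incorrect but not excessive: each matching letter earns weight / len(master)
    return weight * len(student_set & master_set) / len(master_set)


# The sample responses listed above, scored against the master answer EJK
for answer in ["EJK", "AFG", "AJK", "K", "AEJK"]:
    print(answer, score_extended_response(answer, "EJK"))
```

The AJK case works out to two thirds of a mark, which the list above shows truncated to 0.66.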
Marking Method-1 (no penalty; sum of weights)
This is the default marking method for both the standard 5 choice (A-E) exams and the 20 choice (A-T) extended multiple response exams. A student's score is calculated as the sum of question weights for all correctly answered questions. There is no penalty for answering a question incorrectly. Blank answers are discarded. A full description of the marking logic employed for each of these two formats is described in the Format Menu section which should be reviewed carefully.
Marking Method-2 (penalty right-wrong weights)
This is commonly used for True/False exams. A student's score is calculated as the sum of question weights for all correctly answered questions minus the sum of question weights for all incorrectly answered questions. Blank answers are discarded and negative scores are set to zero. This marking method is not available for the Extended Multiple Responses exam format.
Marking Method-3 (penalty weight factors)
A student's score is calculated as the sum of question weights for all correctly answered questions minus the sum of penalty weights for all incorrectly answered questions. Each penalty weight is a fraction of the question weight, calculated from the number of answer choices: penalty = question weight * 1/(number of choices - 1). For example, if the exam is based upon five choices (A,B,C,D,E) per question then the penalty for each incorrect answer is 0.25 * question weight. Blank answers are discarded and negative scores are set to zero. This marking method is not available for the Extended Multiple Responses exam format.
A frequently asked question about the penalty method is how to force a 1/3 or 1/2 penalty for wrong answers on a 5-choice exam. Answer: when prompted for the number of choices, simply respond with 4 to achieve a 1/3 penalty or with 3 to achieve a 1/2 penalty. Responding with a value that is less than the actual number of choices used for the exam has no effect on other aspects of the scoring logic.
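The three marking methods can be compared side by side with a short Python sketch. It is an illustration under assumptions rather than Scan Exam's implementation: the function name exam_score is hypothetical, a blank is represented by None, an ambiguous multiple mark by "?", and alternate answers are ignored.

```python
def exam_score(answers, key, weights, method=1, choices=5):
    """Return a raw score under marking method 1, 2 or 3.

    answers and key are equal-length lists of single letters; a blank answer
    is None and an ambiguous multiple mark is "?" (counted as incorrect).
    """
    total = 0.0
    for given, correct, weight in zip(answers, key, weights):
        if given is None:
            continue                          # blank answers are discarded
        if given == correct:
            total += weight                   # all methods credit correct answers
        elif method == 2:
            total -= weight                   # method 2: right minus wrong
        elif method == 3:
            total -= weight / (choices - 1)   # method 3: fractional penalty
    return max(total, 0.0)                    # negative scores are set to zero


key = list("ABCDE")
answers = ["A", "B", "D", None, "?"]          # two right, two wrong, one blank
weights = [1.0] * 5
for method, choices in [(1, 5), (2, 5), (3, 5), (3, 4), (3, 3)]:
    print(method, choices, exam_score(answers, key, weights, method, choices))
```

Passing choices=4 or choices=3 under method 3 reproduces the 1/3 and 1/2 penalties described in the answer above.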
Mapped Exams is Scan Exam's term for using multiple version scrambled exams. The default for Scan Exam is that multiple version exams are not in use and that a single exam is given to all students. Although simple to construct, single version exams are discouraged because they are too susceptible to cheating. It is recommended that 4 scrambled versions of an exam be used with a seating pattern that ensures each student is surrounded by other students all writing different versions of the exam, for example:
Exam Seating Pattern:
Version 1 Version 2 Version 1 Version 2 Version 1 etc.
Version 3 Version 4 Version 3 Version 4 Version 3 etc.
Version 1 Version 2 Version 1 Version 2 Version 1 etc.
Version 3 Version 4 Version 3 Version 4 Version 3 etc.
Version 1 Version 2 Version 1 Version 2 Version 1 etc.
Version 3 Version 4 Version 3 Version 4 Version 3 etc.
Version 1 Version 2 Version 1 Version 2 Version 1 etc.
Version 3 Version 4 Version 3 Version 4 Version 3 etc.
Two methods are provided for using scrambled exams. One involves scrambling the question order and the other involves scrambling the answer choices. Only one of the methods can be used at a time, i.e., question and choice scrambling mixed is not supported. Scan Exam requires that an appropriate map table of the scramble be provided. From the map table, Scan Exam is able to determine the correct answers, alternate answers, custom weights, eliminated questions, and report all statistics and item analysis relative to the master exam answer key question and choice orders. There is no need to construct separate answer keys to correspond to the different versions of the exam. The instructor need only think in terms of the master exam when making adjustments, analysing question performance or analysing student performance question by question.
Although most instructors use multiple versions of their exam, some choose to mark these versions as if each were a separate exam with its own answer key. This approach, although conceptually straightforward, ultimately results in more work and provides less useful results than taking the time at the outset to use the capabilities of Scan Exam to create the mapping tables, generate the multiple scrambled versions of the exams and process them as a single set. Scan Exam refers to this concept as Mapped Exams, which requires that the instructor adopt specific Exam Codes (111, 222, ..., 999) and use mapping tables. The benefits of this approach are:
- No need to separate the exams into their separate versions; merely collect them and submit them for scanning.
- No need to create separate answer keys for each version; there is only one master answer key.
- Only one DAT file to mark; not one for each version of the exam, which also means that there will only be one transfer file for grades from Scan Exam to MMS, not one file for each version.
- Question analysis is more significant when all the versions are processed together. The interpretation of the question analyses is simplified because there is just one set of results to examine, conveniently related to the master exam question order.
- Post exam changes such as removing questions, adding multiple correct answers, changing a question into a bonus, changing question weights or simply fixing an error in the master answer key are made once, and those changes ripple automatically through the multiple versions.
- Cheating analysis is much enhanced when all the versions are processed together because this increases the size of the analysis, which in turn improves the statistical accuracy for revealing possible cheating.
- Complete subset analysis at the version and section level of the exam is provided.
- It is possible to have Scan Exam prepare the mapping tables automatically and with these tables Scan Exam will construct the scrambled version exams using your previously prepared bank of questions. Refer to Create Mapped Exams for more information.
Custom Question Mapping
Options > Custom Question Mapping
It is common practice to produce multiple versions of the master exam where question ordering is scrambled to discourage cheating. In this circumstance the corresponding multiple answer keys are always derived from the master answer key automatically by Scan Exam using question mapping rules that are user specified. When this option is selected, a drop down list appears showing in the left most column the normal sequential (1,2,3,4, etc.) exam question ordering that a student sees on their examination pages regardless of which scrambled version they may be writing. The master exam order reserves the Code 000 (or blank) and all of its questions are always sequential 1, 2, 3, 4, etc. and these cannot be altered. The other Code columns (111 to 999) are used to refer to one or more scrambled exams; however, 111 usually matches the master exam order. Under the selected Code, the numbers refer to questions taken from the master exam question order as follows:
Given an example of a 10 question master exam and four scrambled versions (Code111 to Code444), this is one possible custom mapping:
It's easiest to think of the exam mapping from the perspective of the student who is looking at her exam. If she has exam Code222, where would she find her question 1 on the master exam (Code 000/111)? The above entries under Code222 specify that its question 1 uses question 8 from the master exam, its question 2 uses question 9 from the master exam, its question 3 uses question 10 from the master exam, and so on. Under exam Code333, its question 1 uses question 7 from the master exam, its question 2 uses question 8 from the master exam, its question 3 uses question 9 from the master exam, and so on. When you have finished entering the custom mapping, click OK and Scan Exam checks that a complete, correctly specified, non-duplicate mapping appears for every question on the master exam. If an error is detected, a message appears and custom mapping is turned off until the problem is corrected. Refer to the use of Preset Mapping (described below), which may eliminate your need to specify the mapping sequences.
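The mapping check and the translation back to master question order can be sketched as follows. Both function names are hypothetical, and the Code222 list is an assumed rotation chosen only to agree with the description above (its questions 1, 2 and 3 use master questions 8, 9 and 10); it is not taken from an actual Scan Exam table.

```python
def validate_question_map(mapping, n_questions):
    """A version's map must use every master question exactly once."""
    return sorted(mapping) == list(range(1, n_questions + 1))


def to_master_order(version_answers, mapping):
    """Reorder a student's answers so position i holds master question i+1.

    mapping[k] is the master question number printed as question k+1
    on the scrambled version.
    """
    master = [None] * len(mapping)
    for version_pos, master_q in enumerate(mapping):
        master[master_q - 1] = version_answers[version_pos]
    return master


# Hypothetical rotation consistent with the Code222 description above
code222 = [8, 9, 10, 1, 2, 3, 4, 5, 6, 7]

print(validate_question_map(code222, 10))
print(to_master_order(list("ABCDEABCDE"), code222))
```

Once every answer sheet is expressed in master order, a single master answer key is enough to mark all versions.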
Tip: Research shows that students perform slightly better on exams where questions are presented in an order that corresponds to the sequence that course material is taught (often the way a master exam is constructed). By using scrambled versions for all students it evens the average outcome for all students. One option is that you could scramble question order only within topic areas, thus facilitating student recall. Another option is to keep topic areas together, but scramble the order of the topics for all four exams.
Preset Question Mapping
Options > Preset Question Mapping
Preset question mapping serves the same purpose as Custom Mapping except that Scan Exam produces the question scrambling for nine possible exams (111, 222, etc.). Any one or more of the scrambled sets may be used to set your exams. Preset mapping is particularly useful if you use the Create Mapped Exams tool, which will assemble your exams from a bank of prepared questions while applying the Preset or Custom mapping rules. Preset can also be used simply to generate initial sets of scrambles, which can then be moved into the Custom tables for customization (refer to Clipboard Functions).
Preset question scrambling patterns are based upon the number of questions in the exam. It is imperative that you do not change the number of questions in the exam after you have used Preset Mapping to establish the alternate versions. An increase or decrease in questions will result in different question patterns.
Answer Choice Mapping
Choose Options > Custom Choices Mapping
Some exams are best structured when the questions remain in a chronological and/or thematic order for all students. This layout is also necessary when multiple questions refer to a common diagram or picture. In this situation the scrambling of answer choices for every question is the best method for constructing alternate exams and this is referred to as Answer Choice Mapping in Scan Exam. When this option is selected, a drop down list appears showing in the left most column the normal sequential (1,2,3,4, etc.) exam question ordering. The master exam reserves Code 000 (or blank) while exam codes (111 to 999) are used to specify one or more scrambled exams using answer choice mapping codes. Mapping codes are simple letter strings that direct Scan Exam how the master answer choices are scrambled for each question.
Given an example of a 10 question master exam and four scrambled versions (Code111 to Code444), this is one possible custom mapping:
For example, in the above, under Code222, the answer choice map code for question one is EABDC indicating the following scramble:
- the A answer choice on the 222 exam will map to the E choice on the master exam
- the B answer choice on the 222 exam will map to the A choice on the master exam
- the C answer choice on the 222 exam will map to the B choice on the master exam
- the D answer choice on the 222 exam will map to the D choice on the master exam
- the E answer choice on the 222 exam will map to the C choice on the master exam
After all the mapping codes are entered, click OK and Scan Exam will verify that a complete mapping appears for every question. If an error is detected, a message appears and custom mapping is turned off until the problem is corrected. Answer choice mapping is rigid in that every question must provide a full 5 letter map code specification even if it partially duplicates the original order of the master exam, i.e. ABCDE, and even if there are fewer than 5 choices per question.
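Reading a map code such as EABDC can be sketched in Python as below. The helper names are hypothetical; the sketch only restates the rule above that the letter in position k of the code is the master choice shown under the k-th letter of the scrambled exam.

```python
LETTERS = "ABCDE"


def is_valid_choice_map(code):
    """A map code must be a full 5-letter rearrangement of ABCDE."""
    return sorted(code) == list(LETTERS)


def to_master_choice(version_choice, code):
    """Translate a letter marked on a scrambled exam to the master letter."""
    return code[LETTERS.index(version_choice)]


code = "EABDC"   # the Code222 question 1 example above
print(is_valid_choice_map(code))
for letter in LETTERS:
    print(letter, "on the 222 exam maps to", to_master_choice(letter, code))
```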
The layout and assembly of multiple exams using question or answer choice scrambling requires considerable care, especially for choice scrambling where there is so much precise answer choice shuffling with re-lettering required. If possible, you should consider using the Create Mapped Exams tool provided in Scan Exam to automate this task.
Preset Answer Choice Mapping
Choose Options > Preset Choices Mapping
The creation of answer choice maps is tedious at best and if done manually also risks construction of maps which overlap in their use of scrambles. For example, if mapping codes inadvertently overlap the placement of the choices for the same question then a student who cheats by copying has an increased chance for selecting the exact same content choice as the person from whom they copy. By using the Preset option, map codes are created in blocks of 4 exams 111-444 and 555-888 in a manner that ensures each map for the same question is unique and that it never overlaps any of the choice placements, i.e. the placement of the A choice on the 111 exam will never correspond to the same scramble location of the A choice on 222, 333, or 444 exams. Any one or more of the scrambled sets may be used to set your exams. Preset choice mapping is particularly useful if you use the Create Mapped Exams tool which will assemble your exams from a bank of prepared questions while applying the Preset or Custom mapping rules. Preset can be used to simply generate initial sets of scrambles which can then be moved into the Custom tables for customization (refer to Clipboard Functions).
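The non-overlap property described above can be checked mechanically. The sketch below assumes a hypothetical helper, placements_overlap, that receives the map codes of several versions for one question; the sample codes are illustrative rotations, not the scrambles that Scan Exam's Preset option actually generates.

```python
def placements_overlap(codes):
    """True if any two map codes put the same master choice in the same position."""
    for position in range(len(codes[0])):
        letters_here = [code[position] for code in codes]
        if len(set(letters_here)) < len(letters_here):
            return True                 # two versions share a choice placement
    return False


# Illustrative block of four codes for one question (simple rotations)
block = ["ABCDE", "BCDEA", "CDEAB", "DEABC"]
print(placements_overlap(block))

# ABCDE and EABDC both leave master choice D in the fourth position
print(placements_overlap(["ABCDE", "EABDC"]))
```

A check of this kind is mainly useful for hand-built Custom maps, where overlaps are easy to introduce by accident.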
Provided that exam questions exist as individual questions, each one in its own file, then the Create Mapped Exams Tool may be used to construct the master exam (exam code 000) and up to 9 scrambled versions (exam codes 111, 222, … 999). The exams are generated using either the custom or preset map tables for questions or choice mappings. Although this tool will process simple text versions of questions prepared using Notepad with no formatting or graphics, the most flexible and recommended way to use this tool is to prepare each of the intended exam questions as separate files using Microsoft Word with the question preparation templates provided. The template files were prepared in MS Word but are saved in MS Word web html format (do not save as MS Word doc format).
Create Individual Question Files
Use MS Word and the question preparation templates that come with Scan Exam for authoring questions. These template files are found in the Scan Exam program install folder (default location shown).
Never edit these template files inside the Scan Exam program folder; always copy them into a separate exam preparation folder and then edit the copies using MS Word. See below for an example of a complex question construction that made use of the 5ChoiceGraphicTemplate.htm file. A question file once prepared using MS Word with these templates must continue to be saved in MS Word web html format. It is only this text-like html format that Scan Exam is capable of processing.
The name of each template file identifies the format of the intended multiple choice and the text content of the template provides instructions for appropriate editing. Below is the content of the 5ChoiceDownTemplate.htm file.
###. Question template for Standard multiple-choice format with support for 5 choices (no graphics) using MS Word. All of the text in this template is for explanation purposes only and is to be replaced with the text of your exam question. The choices table below must not be deleted, merely edit the example choices appropriately for your question. The ### must not be deleted, it is used for automatic numbering of the questions as they are assembled into the exam.
(1) Formatting uses a table with the borders set to none.
(2) Choices must be specified alphabetically down A, B, C, D, E in the cells provided.
(3) All 5 choices must be provided regardless if the question uses fewer. When there are less than 5 choices, the unused choices must be retained but must be set to hidden so they do not print. It is also important to adjust the scramble code within Scan Exam for such questions. For example, CDABE scramble code will not attempt to scramble the E choice.
A) 4.5 * 10^-8
B) 1.2 * 10^-7
C) 3.6 * 10^-8
D) 7.3 * 10^-10
E) 2.1 * 10^-7
Create Question List File
Once all of the individual exam questions have been authored as separate question files you must create a master list of the names and locations of those files. This list of master question files is used by Scan Exam to construct the exam files. The list must be prepared as a text file prepared with an editor such as Notepad. If prepared using MS Word, then it must be saved as a text file, not as doc file and not as a web html file. The syntax for preparing this list is rigid and follows the convention question.###=location and name of file, e.g.:
The question list file can be saved with a meaningful name, e.g. BioFinal2002QuestionList.txt, and when Scan Exam prompts for the name of this file it will subsequently use the list of questions to direct its assembly of the exam files. For constructing question mapped exams, Scan Exam uses this master list to concatenate the question files according to the scramble sequence given by the Scan Exam question map tables as defined in the Options. Similarly, when constructing choice mapped exams, question files are concatenated in sequence; however, the content of the lettered choices is scrambled as defined by the choice map tables in the Options. See below for more details.
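The question.###=file convention is simple enough to validate before handing the list to Scan Exam. The following Python sketch is an assumption-laden illustration: parse_question_list is a hypothetical helper, ### is taken to be exactly three digits, and the first path in the sample is invented for the example.

```python
import re

# Assumes ### is a three-digit question number, as in question.020
LINE_PATTERN = re.compile(r"^question\.(\d{3})=(.+)$")


def parse_question_list(text):
    """Map master question numbers to file locations."""
    questions = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue                      # tolerate blank lines
        match = LINE_PATTERN.match(line)
        if match is None:
            raise ValueError(f"not in question.###=file form: {line!r}")
        questions[int(match.group(1))] = match.group(2).strip()
    return questions


sample = r"""
question.001=C:\exams\bio\q001.htm
question.020=C:\inverterbratequestions\bio065.htm
"""
print(parse_question_list(sample))
```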
Create Question Mapped Exams
Custom or Preset Question mapping assembles the exams by simply concatenating the question files as per the master question list file and scrambling the order as per the question map tables. For example, if the question map table indicates that master exam question 20 is to be used as question 1 for assembling exam code 111, then the question file associated with question.020=C:\inverterbratequestions\bio065.htm would appear as the first question in the composite exam file. The assembled exam files will comprise all exam questions appropriately scrambled. The exam files must subsequently be edited using Word to obtain final paginations and cover pages.
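The concatenation order for one scrambled version follows directly from the question map. As a rough sketch (assembly_order is a hypothetical name, and the file names are placeholders), the new question numbers and their source files could be listed like this:

```python
def assembly_order(question_files, mapping):
    """List (new question number, file) pairs for one scrambled exam.

    question_files maps master question number -> file location;
    mapping[k] is the master question used as question k+1 on the version.
    """
    return [
        (new_number, question_files[master_q])
        for new_number, master_q in enumerate(mapping, start=1)
    ]


# Placeholder file names for a three-question master exam
files = {1: "q001.htm", 2: "q002.htm", 3: "q003.htm"}
for number, path in assembly_order(files, [3, 1, 2]):
    print(number, path)
```

The sketch stops at the ordering step; reading the Word html files and replacing the ### numbering placeholder is left to Scan Exam.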
Create Choice Mapped Exams
Custom or Preset Choice mapping also assembles exams by concatenating the individual question files as per the master question list file; however, the content of each question file is processed to scramble and re-letter the answer choices as given by the choice mapping codes specified for each question. Some questions may not be appropriate for scrambling the order of the choices, for example, where the choices include items such as All of the Above, None of the Above, etc. where the wording is critical to the order of appearance. Whenever choice mapping for a question must remain unchanged from the master choices, the mapping code for that question must be specifically set to ABCDE, which is equivalent to a scramble that leaves the choices unchanged. The assembled exam files comprise all exam questions with their choices appropriately scrambled as per the mapping codes. The exam files must subsequently be edited using Word to obtain desired pagination, cover/title pages, etc.
Successful choice scrambling requires that question choices be labelled with an upper case choice letter followed immediately by a period or right parenthesis, for instance A. or A). For example,
A) a 0.1 M solution of ammonium chloride, NH4Cl(s)
B) a 0.1 M solution of formic acid, HCOOH
C) a 0.01 M solution of formic acid, HCOOH
D) a 0.1 M solution of acetic acid, CH3COOH
E) a 0.01 M solution of acetic acid, CH3COOH
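The labelling rule matters because the choice labels are what a scrambler has to find and re-letter. The sketch below works on plain text lines only, whereas Scan Exam processes Word html files; the regular expression and the scramble_choices name are assumptions made for illustration.

```python
import re

# Upper case letter followed immediately by "." or ")"
CHOICE_LABEL = re.compile(r"^([A-E])([.)])\s*(.*)$")
LETTERS = "ABCDE"


def scramble_choices(lines, code):
    """Return choice lines for a scrambled exam given a 5-letter map code.

    Position k of the code names the master choice whose content is shown
    under the k-th letter on the scrambled exam.
    """
    content, closer = {}, {}
    for line in lines:
        match = CHOICE_LABEL.match(line.strip())
        if match:
            letter, mark, text = match.groups()
            content[letter], closer[letter] = text, mark
    return [
        f"{new}{closer[old]} {content[old]}"
        for new, old in zip(LETTERS, code)
    ]


master = [
    "A) a 0.1 M solution of ammonium chloride, NH4Cl(s)",
    "B) a 0.1 M solution of formic acid, HCOOH",
    "C) a 0.01 M solution of formic acid, HCOOH",
    "D) a 0.1 M solution of acetic acid, CH3COOH",
    "E) a 0.01 M solution of acetic acid, CH3COOH",
]
for line in scramble_choices(master, "EABDC"):
    print(line)
```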
Experience has shown that composition and formatting of question choices in Word is generally made simpler if the choices are placed inside cells of a table, in particular a table which has the borders option set to none so that the table lines do not print but will appear as faint lines on the screen for easier composition and formatting. The template files that are provided all use and rely upon tables.
Graphics or pictures may be used as the choices themselves but these graphics must pre-exist as jpeg or gif files which are inserted as a picture from a file using the MS Word Insert menu. For example
A) insert picture file
Using graphics or pictures as actual choices requires special attention: the question composition must include a reference to a dummy graphic file. This reference is provided for you if you use the 5ChoiceGraphicTemplate.htm, where it appears just above the top of the choices table. Do not delete this transparent graphic. This seemingly useless graphic is necessary because MS Word produces its html external file location information as part of the first graphic file reference and it is important that this location information does not get anchored to one of the choice graphics. All choice graphics must remain free to move into the other choice positions without consequence. The transparent graphic may be removed from the finished exam files if you prefer to do so. Below is an example of how complex questions might be constructed in MS Word, using jpeg graphics for both the question portion and the choices. When saved in html format, this question is compatible with automatic choice scrambling by Scan Exam.
This analysis provides a tabular display of each question's difficulty, point biserial, number of students answering correctly and incorrectly, the correct answer, the distribution of answer choice selections and Cronbach’s alpha (coefficient of reliability). A graphical display of the question difficulty and point biserial is also provided. If multiple exams (i.e. mapped exams) are used then split out analysis is possible, and the split out question statistics are always reported consistently in terms of the master exam question order. The following is given in a table format for each question.
Difficulty Rating = W/(C+W)
C is Number of Students Answering Question Correctly
W is Number of Students Answering Question Incorrectly
Note: blank answers are discarded for this calculation
Difficulty values range from 0 (very easy) to 1 (very difficult). Scan Exam uses an increasing value scale to represent increasing difficulty of a question, which permits a consistent graphing of this value with the Point Biserials. See Point Biserial for further details.
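As a small worked illustration of the Difficulty Rating formula, the Python sketch below counts C and W for one question and discards blanks, represented here by None. The function name is hypothetical.

```python
def difficulty(answers, correct):
    """Difficulty = W / (C + W), with blank answers (None) discarded."""
    marked = [a for a in answers if a is not None]
    if not marked:
        return None                     # nobody answered: rating undefined
    wrong = sum(1 for a in marked if a != correct)
    return wrong / len(marked)


# Two correct, two incorrect, one blank
print(difficulty(["A", "B", None, "A", "C"], "A"))
```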
Point Biserial = (Mp - Mq)/sd * SquareRoot(p*q)
Mp is mean score of students answering question correctly
Mq is mean score of students answering question incorrectly
sd is the standard deviation of the exam scores
p is the proportion of students answering question correctly
q is the proportion of students answering question incorrectly
Note: blank answers are considered incorrect only for the purpose of assigning scores to Mp, Mq and counts to p and q.
The Point Biserial can be thought of as the product moment correlation where one variable is dichotomous i.e. the item is answered correctly or incorrectly and the other variable is continuous and equal to the test scores. The Point Biserial values range from -1 (answered best by students with low scores) to +1 (answered best by students with high scores). The following illustrates how a question with a near perfect point biserial (+1.0) might appear if it were graphed. Note the complete absence of any graph points (students) corresponding to high score and incorrect answer or low score and correct answer.
In general any item that correlates near zero with test scores should be carefully inspected. It is possible for an item to correlate near zero and still be a valid item but more often than not the item is excessively difficult or easy. Unless there are grounds for keeping such items they should probably be considered for discard (see note below). Relatively high point biserials are desirable, above +.30 is good. Consideration of question difficulty scores (0 very easy to +1 very difficult) in conjunction with the Point Biserial is useful for distinguishing questions that are good. In general, a reasonably high Difficulty Score occurring in conjunction with a relatively high Point Biserial is optimal. A graph appears beneath the question analysis, which plots the Point Biserial and Difficulty for each question.
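The Point Biserial formula given earlier in this section, (Mp - Mq)/sd * SquareRoot(p*q), can be sketched in Python as follows. The source does not say which standard deviation Scan Exam uses; this sketch assumes the population standard deviation, which makes the result equal to the ordinary product moment correlation. The function name is hypothetical, and blank answers should be passed in as incorrect.

```python
from math import sqrt
from statistics import mean, pstdev


def point_biserial(scores, is_correct):
    """(Mp - Mq) / sd * sqrt(p * q); blanks should be passed in as incorrect."""
    right = [s for s, ok in zip(scores, is_correct) if ok]
    wrong = [s for s, ok in zip(scores, is_correct) if not ok]
    sd = pstdev(scores)                  # assumption: population standard deviation
    if not right or not wrong or sd == 0:
        return None                      # undefined without both groups or spread
    p = len(right) / len(scores)
    q = 1 - p
    return (mean(right) - mean(wrong)) / sd * sqrt(p * q)


# The two highest scorers answered correctly, the two lowest did not
print(point_biserial([50, 60, 70, 80], [False, False, True, True]))
```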
Considerations for Removing Questions:
Questions that produce a negative point biserial have poor testing characteristics and as a simple rule of thumb should be removed from the scoring by clicking them to blank on the Master Answer Key. Removal of even one question will cause individual student grades to fluctuate up and down. Before deciding to remove a question, look closely to see if there is an explanation for its behaviour. First, examine the question to determine if the master answer is indeed the correct answer, i.e. rule out an instructor coding error. Second, ascertain if the question may actually contain two legitimate correct answer choices, in which case the question may be marked using two correct answers (refer to Alternate Answers under the Options menu). Third, consider whether the question is good by all other criteria and its poor statistical showing is perhaps more related to inconsistent coverage of course material due to different instructors. If a decision is made to retain a bad question for those that got it correct but remove it for those that got it wrong, please refer to the HELP for details on using the Bonus Question option.
A tabular view provides question information very similar to the general item analysis described in the preceding section. The essential difference is that under the distribution of answer choices for each question is the calculated mean grade of the students who selected the various choices for the question. Every question has a correct answer from the choice set (A,B,C,D,E) and the incorrect choices are called distracters. Having a question with approximately equal proportions of students choosing the distracters is an indication that the distracters are all equally good at pulling students off the correct answer. However, if a distracter tends to pull “good” students away from the correct answer then it is probably a bad distracter or perhaps even a question construction error. Typically distracters should have their mean grade less than the mean grade for the correct answer. The corresponding bar graph allows a quick visual check of questions to see if the means for the distracters are less than the correct answer mean. The correct answer bar is dark blue while the distracters are red bars.
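The check described above, that each distracter's mean grade should fall below the mean grade for the correct answer, can be sketched as follows. The function names and the sample data are hypothetical, and blanks (None) are simply left out.

```python
from collections import defaultdict
from statistics import mean


def mean_grade_by_choice(choices, grades):
    """Mean exam grade of the students who selected each answer choice."""
    groups = defaultdict(list)
    for choice, grade in zip(choices, grades):
        if choice is not None:
            groups[choice].append(grade)
    return {choice: mean(vals) for choice, vals in sorted(groups.items())}


def suspect_distracters(choices, grades, correct):
    """Distracters whose mean grade is not below the correct answer's mean."""
    means = mean_grade_by_choice(choices, grades)
    if correct not in means:
        return []
    return [c for c, m in means.items() if c != correct and m >= means[correct]]


choices = ["A", "A", "B", "B", "C"]     # invented responses to one question
grades = [80, 90, 60, 50, 95]           # invented exam grades
print(mean_grade_by_choice(choices, grades))
print(suspect_distracters(choices, grades, "A"))
```

In this invented data, choice C attracts a student with a higher grade than the average of those who chose the correct answer A, so it is flagged for a closer look.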
Sometimes it is useful to identify items that are more or less effective in discriminating students at a particular point in the distribution of students. For example, what items best discriminate the upper 30% of the class from the lower 70%? In the linear model, the most discriminating item for any division of the distribution is the one that correlates highest with that division. Based upon the point of discrimination selected in the Options menu (default is 50%), Scan Exam orders all of the students from lowest to highest score and then partitions them according to the defined split point. A table is displayed containing for each question its overall difficulty level, its discrimination correlation coefficient and the corresponding distribution of answer choices for the upper and lower partitioning. The discrimination correlation coefficient is in fact the phi coefficient calculation where the dichotomy is the split point in the distribution of students ordered by score. The highest phi coefficients are the most discriminating items at the selected point. A graph of the discrimination correlation coefficients is provided for quick visual identification of items which stand out from the others in terms of their discriminating ability at the selected split point.
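The phi coefficient for a single question can be computed from the 2 x 2 table of upper/lower group against correct/incorrect. The sketch below uses the standard phi formula; how Scan Exam rounds the split, breaks ties between equal scores and treats blanks is not stated in the source, so those details (and the function name and the upper_fraction parameter) are assumptions.

```python
from math import sqrt


def phi_at_split(scores, is_correct, upper_fraction=0.5):
    """Phi coefficient between 'in upper group' and 'answered correctly'."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])     # lowest to highest
    n_upper = round(n * upper_fraction)
    upper = set(order[n - n_upper:])
    a = sum(1 for i in upper if is_correct[i])            # upper, correct
    b = len(upper) - a                                    # upper, incorrect
    c = sum(1 for i in range(n)
            if i not in upper and is_correct[i])          # lower, correct
    d = n - len(upper) - c                                # lower, incorrect
    denom = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return None if denom == 0 else (a * d - b * c) / denom


scores = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
correct = [True, False, False, False, False, False, True, True, True, True]
print(phi_at_split(scores, correct, upper_fraction=0.5))
```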
Including a short survey as part of the exam sheet processing is a convenient way to gather class opinions on the course content, social and political issues, etc. A typical use would have students record their answers to the survey questions beginning at the next question after the last test question. Students would be informed that their participation in the survey was optional and that the survey questions do not form part of their test score (see Survey Tips below for other possible scenarios). The Quick Survey tool will calculate the percentage distributions across the answer categories (A-E) of each survey question, as well as means and standard deviations based on a 5 point rating scale where A is 5, B is 4, C is 3, D is 2 and E is 1. Blank or bad marks are treated as N/A and are excluded. Questions that use rating scales typically pose questions of opinion, e.g. (5-Strongly Agree to 1-Strongly Disagree), (5-Exceptional to 1-Poor), (5-High Priority to 1-Low Priority), etc. Quick Surveys will also calculate a matrix of Pearson Correlation Coefficients based on the above 5 point rating scale so that survey questions can be examined for the presence or absence of linear relationships, i.e. do respondents rate consistently on a particular pair of questions. Please refer to the HELP for more details.
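The rating-scale calculations can be illustrated with a short Python sketch. The A=5 to E=1 mapping and the exclusion of blank or bad marks come from the description above; everything else is an assumption made for the example, in particular the function names, the use of the sample (n-1) standard deviation, and dropping a respondent from a correlation when either of the two answers is N/A.

```python
import math

SCALE = {"A": 5, "B": 4, "C": 3, "D": 2, "E": 1}

def ratings(answers):
    """Map answer letters to ratings; anything else (blank, '?') becomes None."""
    return [SCALE.get(a) for a in answers]

def mean_sd(answers):
    """Mean and sample standard deviation of one survey question."""
    vals = [v for v in ratings(answers) if v is not None]
    n = len(vals)
    if n == 0:
        return None, None
    mean = sum(vals) / n
    if n < 2:
        return mean, None
    var = sum((v - mean) ** 2 for v in vals) / (n - 1)  # assumed n-1 divisor
    return mean, math.sqrt(var)

def pearson(answers_x, answers_y):
    """Pearson correlation between two survey questions (pairwise deletion)."""
    pairs = [(x, y) for x, y in zip(ratings(answers_x), ratings(answers_y))
             if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        return None
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        return None  # no variation on one question; correlation undefined
    return sxy / math.sqrt(sxx * syy)

print(mean_sd(["A", "B", "C", "", "?"]))
print(pearson(["A", "B", "C"], ["C", "B", "A"]))
```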
A tool using Answer Choice Match Analysis is provided to assist with the investigation of cheating.
This tool may be used to guide an instructor in the investigation of cheating but it cannot prove cheating. The statistical criterion used to detect and support assertions of cheating has a default setting that is quite high in order to avoid false suggestions. Nonetheless, the process may still flag pairs of students as suspicious, and instructors are cautioned to disregard such results in the absence of other compelling evidence.
In general, same answer choice selection is expected to increase as the grades of students being compared increase because the incidence of correct answers necessarily rises. The above bird’s eye view graph illustrates this, even suggesting that a linear relationship exists between grade and number of matches, which comes as no real surprise. When comparing lower to middle scoring students, a large number of same answer choices is unusual because of the natural variability that often occurs in wrong answer choices as well as the natural variability for which questions are answered correctly. This natural variability is established by the students themselves and is not pre-determined in any way. The empirical nature of this method is sensitive to how students actually respond on a particular exam and it is more sensitive to identification of systematic rather than opportunistic cheating. Extreme values often suggest that students are cheating in a very co-operative manner. Where a single version of a test is used instead of multiple scrambled versions (refer to mapped exams in Scan Exam) this type of blatant cheating is quite feasible through the use of simple prearranged hand and finger signals. It becomes especially feasible if an electronic communications device is used. Unfortunately, even multiple scrambled exams can be overcome if the students have the ability to choose their own seating. Forced random seating placements can address this problem.
Provided the distribution of match counts is normal, the associated probabilities for observing match count Z values as extreme as 4 and greater are given below.
Z=4 occurs 3 in 100 thousand.
Z=4.7 occurs 1 in 1 million.
Z=5 occurs 3 in 10 million.
Z=6 occurs 1 in 1 billion.
Z=7 occurs 1 in 1 trillion.
Z=8 occurs 6 in 10 quadrillion (10**16).
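The figures above agree with the upper-tail (one-sided) probability of the standard normal distribution, which can be checked with the Python standard library. The sketch below is for illustration only; the function name upper_tail is hypothetical and nothing here is taken from Scan Exam itself.

```python
import math

def upper_tail(z):
    """Probability that a standard normal value is z or greater (one-sided)."""
    return 0.5 * math.erfc(z / math.sqrt(2))

for z in (4, 4.7, 5, 6, 7, 8):
    print(f"Z={z}: {upper_tail(z):.2e}")
```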
Case Study for Students 454545454 and 676767676
In this pair, one student had 80 correct answers and the other had 77 correct. When the class file was explored for the possibility of cheating it was subjected to tests for all within group pairings, all between group pairings and all groupxAll student pairings. This student pair was identified as having a match count Z value that exceeded the default Z Limit for Cheating=4.7 on at least one of the test methods performed.
Test Method #1
All students in group 80 were compared with all students in group 77 yielding 42 student pairs including the pair under investigation. A group of this size is small but sufficient for suggesting the possibility of cheating. The average match count for this between group comparison (labelled 080x077) was 66 with a 10.0 standard deviation. The 122 (Z=5.6) match count for the suspect pair is high within the context of this grouping. The probability for observing a match count this high is 1 in 100 million.
Test Method #2
All students in group 80 were compared amongst themselves and with all other students in the entire class yielding 2,709 student pairs including the pair under investigation. The average match count for this group comparison (labelled 080xAll) was 67 with a 9.4 standard deviation. The 122 (Z=5.9) match count for the suspect pair is high within the context of this grouping. The probability for observing a match count this high is 2 in 1 billion.
Test Method #3
All students in group 77 were compared amongst themselves and with all other students in the entire class yielding 3,157 student pairs including the pair under investigation. The average match count for this group comparison (labelled 077xAll) was 65 with an 8.8 standard deviation. The 122 (Z=6.5) match count for the suspect pair is very high within the context of this grouping. The probability for observing a match count this high is 4 in 100 billion.
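The Z values quoted for the three test methods follow from the usual standardisation, Z = (match count - group mean) / group standard deviation. The Python sketch below reproduces them from the figures in the case study; the function name is hypothetical, and the rounded means and standard deviations printed in the text are used as inputs.

```python
def match_z(match_count, group_mean, group_sd):
    """Standardised match count for a student pair within a comparison group."""
    return (match_count - group_mean) / group_sd

# (label, mean match count, standard deviation) as reported in the case study
tests = [("080x077", 66, 10.0), ("080xAll", 67, 9.4), ("077xAll", 65, 8.8)]
for label, mean, sd in tests:
    print(label, round(match_z(122, mean, sd), 1))
```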
Scan Exam will perform two tests for the normal distribution assumption when the N option button is clicked (see example below). The first test is a graph of the frequency of Z values for the test distribution using an 18 point axis ranging from below -4 to above +4 in 0.5 increments. Superimposed on that graph is the expected normal distribution for a sample of equal size to the test distribution. These superimposed graphs provide a very useful visual confirmation. In addition, a ChiSquare goodness of fit test is performed using a 2x6 cell representation of the cumulative percentage distribution (-3 to +3) of the actual versus the expected. An observed ChiSquare value less than the critical table value of 15.086 (df=5, 1% significance level) allows the test distribution to be accepted as sufficiently normal for assigning probability estimates to the outliers. Once a distribution meets this criterion, the closeness of the fit can be estimated further. The smaller the observed ChiSquare the better the fit, and a percentage fit estimate is printed on the graph along with the ChiSquare. Keep in mind that the percentage figure is a secondary estimate for one’s own information about the distribution and is not the criterion for acceptance or rejection of the normality assumption; only a ChiSquare less than 15.086 is required. The above graph shows the normal tests for one of the case examples. All three distributions are adequate to satisfy the normal distribution assumption.
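The general idea of the ChiSquare goodness of fit test can be sketched in Python as follows. This is not Scan Exam's calculation: the text does not spell out how the 2x6 cells are built from the cumulative percentages, so the sketch assumes six cells one Z unit wide, folds any values beyond -2 and +2 into the two outer cells, and compares plain observed counts with the counts expected under a normal distribution. All names are hypothetical; only the 15.086 critical value (df=5) comes from the text.

```python
import math
from bisect import bisect_left

EDGES = [-2.0, -1.0, 0.0, 1.0, 2.0]  # assumed cell boundaries (six cells)
CRITICAL = 15.086                     # ChiSquare critical value, df=5

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2)))

def expected_probs():
    """Probability of each cell under the standard normal distribution."""
    cdf = [0.0] + [normal_cdf(e) for e in EDGES] + [1.0]
    return [hi - lo for lo, hi in zip(cdf, cdf[1:])]

def cell_counts(z_values):
    """Number of Z values falling in each of the six cells."""
    counts = [0] * (len(EDGES) + 1)
    for z in z_values:
        counts[bisect_left(EDGES, z)] += 1
    return counts

def chi_square(observed, probs):
    """Sum of (observed - expected)^2 / expected over all cells."""
    n = sum(observed)
    return sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, probs))

def sufficiently_normal(z_values):
    return chi_square(cell_counts(z_values), expected_probs()) < CRITICAL

print(cell_counts([-2.5, -1.5, -0.5, 0.5, 1.5, 2.5]))
print(sufficiently_normal([2.5] * 100))  # every value in one tail cell
```

A set of Z values piled into a single tail cell, as in the last line, produces a very large ChiSquare and fails the test.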
The presence of a Z greater than or equal to 4.7 for just one of the above tests is sufficient to suspect cheating. If one or more of the remaining tests also yield Z values greater than 4.0 then this represents a strong statistical assertion that cheating occurred. The above pair of students had Z values of 5.6, 5.9, and 6.5. Very high Z values on all three tests leave little room for making a false accusation against the pair.
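The rule stated above can be written as a small Python function. This is one reading of the rule, offered as a sketch: the function name, the parameter names and the returned labels are hypothetical, and the 4.7 and 4.0 limits are the values quoted in the text.

```python
def assess_pair(z_values, suspect_limit=4.7, support_limit=4.0):
    """Classify a student pair from the Z values of the tests performed."""
    if not any(z >= suspect_limit for z in z_values):
        return "not flagged"
    remaining = list(z_values)
    remaining.remove(max(remaining))  # set aside the test that triggered the flag
    if any(z > support_limit for z in remaining):
        return "strong"
    return "suspect"

print(assess_pair([5.6, 5.9, 6.5]))
```

Whatever label is returned, the earlier caution still applies: the result only guides an investigation and should be disregarded in the absence of other compelling evidence.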