About NREMT Examinations
The National Registry cognitive exam item development process is extensive and takes approximately one year to complete. Item development follows the same process for all five levels of certification.
The computer based cognitive examinations consist of items drawn from the National Registry's item bank. NREMT computer based exams are constructed to ensure that each candidate receives a distribution of items from six major categories: Airway & Breathing, Cardiology, Medical, Trauma, OB/Gyn/Peds, and Operations. The number of items from each category is determined by an examination test plan (also known as a blueprint) which has been approved by the NREMT Board of Directors.
The NREMT examination test plan is developed based upon the results of the EMT-Basic, EMT-Intermediate and EMT-Paramedic Practice Analysis conducted at five year intervals (1995, 1999, and 2004). The NREMT randomly surveys hundreds of practicing NREMT-Basics, NREMT-Intermediates and NREMT-Paramedics. The individuals are asked to provide input on the important tasks of the job of an EMT. Importance was defined as a balance between frequency and potential of harm. A committee comprised of national experts reviewed the results of the data and developed a test plan which was approved by the Board of Directors.
The NREMT examinations are developed so that they measure the important aspects of pre-hospital care practice. Items are developed in relation to tasks identified in the practice analysis. The domain of practice that limits therapy addressed in an item is based upon national standard curricula developed by the National Highway Traffic Safety Administration. EMT education programs are encouraged to review the NREMT Practice Analysis when teaching courses and as a part of the final review of the abilities of students to deliver the tasks necessary for competent patient practice.
Individual examination items are developed by members of the EMS community serving on Item Writing Committees convened by the NREMT. Item Writing Committees typically have 10 to 15 national EMS experts as members. They meet over a three to five day period to review, rewrite, and reconstruct drafted items. Consensus by the committee must be gained that each question is in direct reference to the tasks in the practice analysis, that the correct answer is the one and only correct answer, that each distracter option has some plausibility, and that the answer can be found within commonly available EMT textbooks. Controversial questions are discarded and not placed within the item banks. Items are reviewed for reading level and to ensure that no bias exists related to race, gender or ethnicity.
The NREMT completed a project utilizing experts in racial and cultural issues who reviewed every word in the NREMT item bank to help assure that examinations do not discriminate on the basis of race or ethnicity. Item writing committees continue to review items to reduce possibilities of this type of discrimination.
Following completion of the item-writing phase, all items are then pilot tested. Pilot items are administered to candidates during computer adaptive exams. Pilot items are indistinguishable from scored items; however, they do not count for or against the candidate. Extensive analysis of the performance of those items is conducted, with those functioning properly under high stakes pilot testing being added into future live test banks. Item analysis is then completed and items are checked to determine if they are functioning properly and are psychometrically sound.
About the Cognitive Exam
Candidates for NREMT examinations at three levels (First Responder, EMT-Basic, and EMT-Paramedic) take Computer Adaptive Tests (often referred to as C-A-T or CAT). An adaptive test is an algorithm-delivered exam. This means the computer is programmed to select items in a specific, logical manner. The decision regarding passing or failing an algorithm exam is the same as with pencil-paper examinations: has the candidate reached the level of entry-level competency (pass) or has the candidate not yet reached that level (fail)?
This same method is used to develop all NREMT test items used in CAT exams. First, an item (or question) is drafted. Then it is pilot tested in a high stakes atmosphere by being placed in CAT exam test pools. The test pool is a ‘bank’ of test questions that the computer can draw from when delivering an exam. Pilot items are placed in test pools to be calibrated, that is, to determine where on the scale of difficulty they will be placed. When the draft item is being pilot tested, it does not count towards the pass/fail score of the candidate being examined. In order for an item to be placed in a “live” test pool (where the item counts toward pass/fail), it must meet strict calibration requirements. The difficulty statistic of an item identifies the “ability” necessary to answer an item correctly. Some items require low ability to answer correctly while others may require a moderate or high level of ability.
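The relationship between item difficulty and candidate ability can be illustrated with a short sketch. It assumes a Rasch-type (one-parameter logistic) model, in which ability and difficulty sit on the same scale and the chance of a correct answer depends only on the gap between them. The function name and the sample values are hypothetical, and this is not a description of the NREMT's actual calibration procedure.

```python
import math

def probability_correct(ability: float, difficulty: float) -> float:
    """Rasch-type probability that a candidate answers an item correctly.

    Both arguments are on the same (assumed) logit scale.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

# Hypothetical candidate of middling ability facing easy, matched, and hard items.
candidate_ability = 0.0
for item_difficulty in (-1.0, 0.0, 1.0):
    p = probability_correct(candidate_ability, item_difficulty)
    print(f"difficulty {item_difficulty:+.1f}: P(correct) = {p:.2f}")
```

Under this assumption, an item whose difficulty equals the candidate's ability is answered correctly about half the time, easier items more often, and harder items less often.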
The CAT Exam is Structured Differently than a Pencil-Paper Exam
Since CAT exams are delivered in a completely different manner than pencil-paper exams, they will “feel” more difficult. Candidates should not be concerned about the ability level of an item on the exam because their ability is being ‘measured’ in a different manner. This works by placing all items on a standard scale in order to identify where the candidate falls within the scale. As a result, candidates should answer all items to the best of their ability. Let’s use an example to explain this:
Suppose that a middle-school athlete is trying out for the high jump squad of the track team. The coach, after many years of experience as a middle-school coach, knows that in order to score any points at a middle-school track meet, his individual jumpers need to jump over a bar placed at four feet above the ground. This is the “competency” standard. If he enters jumpers who can jump three feet, he knows these jumpers will rarely, if ever, score points for his team during a track meet. Those who jump four feet on the first day of try-outs can, after training and coaching, not only jump four feet (the minimum) but may later, through additional education, learn to jump five or more feet. The coach knows that it will be worth his time and effort to coach these try-out jumpers to greater heights. Therefore he tells those who jump over four feet at try-outs that they are members of the high jump team (because they have met the entry-level or competency standard).
Since the coach knows the competency standard, he can hold a try-out to see who meets entry-level competency. The coach will likely set the bar at or near 3 feet 6 inches for the first jump attempt. Those who make it over this bar will then progress to perhaps 3 feet 9 inches to test their ability at that height. After a group has passed 3 feet 9 inches the coach will again raise the bar to 4 feet and have the successful jumpers attempt to clear it. A smart coach will likely not tell the team the necessary height so that he can learn the maximum ability of each try-out jumper. At the 4 foot level the coach may find that seven of ten athletes clear the bar. He will then raise the bar to 4 feet 3 inches and later to 4 feet 6 inches. He will increase the height of the bar until he determines the maximum individual ability of each try-out jumper. If he has four slots on his team, he will select the top four or five jumpers and begin the coaching process to help them reach even greater heights. In this manner, the coach has learned about the ability of the try-out jumpers based upon a standard scale (feet and inches). The coach then sets a standard (4 feet) for membership on the team, based upon his knowledge of what is necessary to score points at track meets (the competency standard).
CAT Exams Are Different for Every Candidate
The above illustration can describe the way a CAT exam works. Every item within a live item pool has been calibrated to determine its level of difficulty. Now the computer adaptive test must learn the ability level of the candidate. Here is how it works: The test typically starts with an item being administered that is slightly below the passing standard. The item may be from any subject area in the test plan (or ‘blueprint’: airway, cardiology, trauma, medicine, OB/Peds, or Operations.) After the candidate gets a short series of these items correct, the computer will choose items of a higher ability, perhaps near entry-level competency. These items will also be taken from a variety of content areas of the test plan. If the candidate answers most of the questions in this series of items correctly, then the computer will choose new items that are at a higher ability level. Again, if the candidate answers many of these items correctly the computer will again present the candidate with items of an even higher ability level. Eventually every candidate will reach his or her maximum ability level. In this way, the computer learns whether or not the individual is above the standard (entry-level competency) in these content areas, and the examination will end.
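The item-selection loop described above can be sketched in a few lines. This is a deliberately simplified illustration, not the NREMT's algorithm: it assumes each item has a single calibrated difficulty, always picks the unused item closest to the current ability estimate, and nudges the estimate up or down by a fixed step. Real adaptive tests use statistical estimation and also balance content areas, which this sketch omits. All names and numbers are hypothetical.

```python
def next_item(pool, used, estimate):
    """Return the index of the unused item whose difficulty is closest to the estimate."""
    candidates = [i for i in range(len(pool)) if i not in used]
    return min(candidates, key=lambda i: abs(pool[i] - estimate))

def run_adaptive_test(pool, answers_correctly, start, step=0.5, max_items=10):
    """Administer items one at a time, raising the bar after a correct answer
    and lowering it after an incorrect one."""
    estimate = start
    used = set()
    history = []
    for _ in range(min(max_items, len(pool))):
        i = next_item(pool, used, estimate)
        used.add(i)  # an item is never shown twice
        correct = answers_correctly(pool[i])
        estimate += step if correct else -step
        history.append((pool[i], correct, estimate))
    return estimate, history

# Hypothetical pool of item difficulties from -2.0 to 2.0 on an arbitrary scale.
pool = [x * 0.5 for x in range(-4, 5)]
true_ability = 1.0  # simulated candidate clears every item at or below this level

final_estimate, history = run_adaptive_test(
    pool, lambda difficulty: difficulty <= true_ability, start=-0.5, max_items=6
)
for difficulty, correct, estimate in history:
    print(difficulty, correct, estimate)
```

As with the high jumper, the simulated candidate clears each bar until the difficulty exceeds his or her ability, at which point the estimate stops climbing.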
95% Confidence is Necessary to Pass or Fail a CAT Exam
The high achiever who is able to answer most of the questions correctly will find that the computer ends the exam early. Many candidates worry that something is wrong because the exam was so short. In reality, the computer was able to determine that the candidate jumped far higher than the standard level, or was well above the level of competency. In a CAT exam, the computer stops the exam when it is 95% confident that the individual candidate has reached the level of competency.
As mentioned before, the length of a CAT exam is variable. Sometimes a candidate can demonstrate a level of competency in as few as 60 test items. Sometimes, after 60 questions, the candidate has shown to be close to entry-level competency but the computer has not determined within the 95% confidence requirement that the candidate is either above or below the entry-level competency standard. In cases when the computer is not 95% confident, the test continues to provide additional items. This provides more information in determining whether or not a candidate is at entry-level competency. Regardless of the length of the test, items will still vary over the content domain (airway, cardiology, etc.). When (and if) the candidate reaches the maximum length of an examination, the ability estimate of that candidate will be most precise. Using the high jumper example, the computer will be able to distinguish those who jump 3 feet 11 inches from those who jump 4 feet 1 inch. Those who clear 4 feet more times than they miss 4 feet will pass. Those who jump 3 feet 11 inches but fail to clear 4 feet enough times will fail and be required to repeat the test. Some candidates won’t even be able to jump close to four feet. These candidates are below or well below the entry-level of competency. This too can be determined fairly quickly and these candidates may have their examination ended quickly. When the examination is near 70 questions and a candidate fails, he or she has demonstrated within 95% confidence that he or she cannot reach the entry-level of competency.
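One common way to express a 95% confidence stopping rule is sketched below. It assumes the test maintains an ability estimate with a standard error, treats the estimate plus or minus 1.96 standard errors as the 95% interval, and stops once that interval lies entirely above or below the cut score. The 60-item minimum echoes the figure in the text; the 120-item maximum, the function name, and the sample values are hypothetical, and the NREMT's actual rule may differ.

```python
def decision(estimate, standard_error, cut_score, items_given,
             min_items=60, max_items=120):
    """Return 'pass', 'fail', or 'continue' under an assumed 95% confidence rule."""
    z = 1.96  # two-sided 95% interval for a normal distribution
    lower = estimate - z * standard_error
    upper = estimate + z * standard_error

    if items_given >= min_items:
        if lower > cut_score:
            return "pass"   # confident the candidate is above the standard
        if upper < cut_score:
            return "fail"   # confident the candidate is below the standard

    if items_given >= max_items:
        # Out of items: decide on the final, most precise estimate (assumed tie rule).
        return "pass" if estimate >= cut_score else "fail"

    return "continue"       # not yet 95% confident either way

print(decision(1.0, 0.3, 0.0, items_given=60))   # well above the standard
print(decision(0.1, 0.3, 0.0, items_given=60))   # near the standard
print(decision(-1.0, 0.3, 0.0, items_given=60))  # well below the standard
```

The middle case corresponds to the jumper who clears 4 feet only some of the time: more attempts are needed before the computer can decide with confidence.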
Because of the structure of the CAT exam, the candidate needs to answer every question to the best of his or her ability. The CAT exam provides the candidate with more than adequate opportunity to demonstrate his or her ability, and is able to provide precision, efficiency, and confidence that a successful candidate can become an NREMT.
Results are posted on the NREMT’s password-secure website through an individual’s login account--typically within the next day. Those candidates who pass the exam will be sent National EMS Certification credentials by the NREMT. Because the candidate has met the criteria, a breakdown of results is not necessary.
Candidates who fail to meet entry-level competency will be sent information sheets regarding their testing experience. This information is useful for identifying areas to concentrate study in preparation for the next attempt. The information sheets indicate if a candidate is “above,” “near,” or “below,” the level of entry-level competency in the various content areas. Candidates who are “above” the standard can be somewhat confident they have sufficient knowledge in that content area, allowing them to pass the exam. However, failure to review the material in that content area can result in failing the exam again. Candidates who are “near” the standard can be slightly above or slightly below the standard and should certainly study these areas. Being “near” does not indicate pass or fail but it can be interpreted as an area to study. Candidates who are “below” the standard need to enhance their study in this area. Candidates who fail the examination will have test items for future attempts “masked.” This means a masked item will not appear on future exams taken by that candidate. Studying examination items to prepare to do the job of an EMT is not helpful. Studying the tasks and the job of an EMT provides the best preparation. Candidates who memorize items in hopes of “getting them right,” the next time are wasting their time because masking items prevents them from seeing the same item again.
A CAT examination is very precise in determining a candidate’s level of competency. Candidates who fail the exam and do not study for their next attempt will most likely be measured at the same level as when they took the exam the first time. Failing candidates who do not change their ability level (be able to jump higher) will again be measured the same. The best way to improve ability is to practice—in this case, study.
The NREMT produced a DVD video, which explains the purpose of the NREMT, how computer adaptive testing works and how to register for the examination. All candidates are urged to watch the video.
Purpose of the NREMT
Learn More about Computer Based Testing
Step-by-Step Instructions for Applying Online for the NREMT Test
(Requires high speed Internet connection)
The NREMT staff is dedicated to customer service and can help candidates through the examination process. Candidates should understand, however, that the NREMT is the National EMS Certification for EMS providers and not an educational agency. Therefore, the NREMT regrets they are not able to advise candidates on how to pass the test or what to study.
The NREMT Board of Directors determines the pass/fail score of the examination as guided by psychometric consultants. The National Registry conducts regular detailed analysis of every item in the test bank to ensure that they are functioning properly. The item and statistical data are reviewed by the National Registry to assure that each item and the examination are functioning properly. Statistics are generated for training sites, states, and/or national results.
All items have one correct or best answer as agreed upon by the Item Writing Committee. All items have been reviewed for reading level. No items are “K type” items. All items have been reviewed to prevent regional bias. All items have been reviewed to assure they cover current clinical therapy. All items relate to the practice of out-of-hospital care and when not in a practice case scenario have answers that are available in common EMS textbooks. The entire process has been devised and directed by a psychometrician with a PhD in Educational Measurement.
Establishing the Pass Fail Score
The National Registry establishes the cognitive exam pass/fail score (cut score) based upon the definition of entry-level competence to safely and effectively practice. Entry-level competency is the NREMT standard. Establishing a cut score requires the NREMT to follow an established, peer-reviewed and published psychometric formula and process. Psychometrics is a science that revolves around measurement of the human mind via mathematical processes. All examinations that are valid involve psychometrics. Validity of an examination centers around the meaning of the cut score and interpretations (judgments) made about that score. A test with good questions from known educational materials may have content validity, but unless the cut score is derived from an approved process the test is not valid. Development of a cut score and the psychometrics involved in a computer adaptive test (CAT) is very complex and requires experts in the field of psychometrics to lead subject matter experts (EMS providers) through the process.
The NREMT examination cut score begins with a nationally selected committee of EMS providers, known as the Angoff committee. (The Angoff process is a criterion-referenced evaluation procedure which is used for standard setting). These committee members have varying backgrounds and include active EMS physician medical directors, EMS licensing officials who have clinical expertise, and EMS providers. The field providers are certified at the level of the test. For example, EMT-Basics set the Basic cut score and Paramedics set the Paramedic cut score. The field providers are experienced in a variety of services including fire, volunteer, paid private and 3rd party services. The gender and racial mix of the group is also important. The committee also includes individuals who recently passed the National Registry examination.
The Angoff committee members attend a meeting hosted by a Ph.D. psychometrician who leads the group through the discussion. They begin, as a group, to write a short essay describing the abilities and expectations of an entry-level provider. After the essay is completed and a consensus is reached regarding the definition of entry-level, the group begins to judge potential test items. The judgments are based upon the committee’s opinion of what percent of newly trained entry-level providers would get an individual question correct. Feedback is given to the group from the psychometrician and the process continues until a minimum of 120 items have been reviewed. Items are from all six topical areas covered in the EMS test plan. Once the committee has completed its judgments, the psychometricians then use empirical data obtained on the items via pilot testing and apply a Rasch model mathematical formula to the judgments to arrive at a difficulty scale for individual items.
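The judging step can be illustrated with the classic Angoff arithmetic: each judge estimates the percentage of entry-level providers who would answer an item correctly, the estimates are averaged per item, and the item averages are averaged again to give an expected proportion correct for a borderline candidate. The sketch below shows only that averaging step with made-up ratings from three hypothetical judges; it does not reproduce the subsequent conversion to a difficulty scale that the psychometricians perform.

```python
from statistics import mean

def angoff_item_ratings(judgments):
    """Average the judges' percent estimates for each item, as a proportion (0 to 1)."""
    return {item: mean(ratings) / 100.0 for item, ratings in judgments.items()}

def angoff_raw_cut(judgments):
    """Expected proportion of items an entry-level candidate would answer correctly."""
    item_ratings = angoff_item_ratings(judgments)
    return sum(item_ratings.values()) / len(item_ratings)

# Hypothetical ratings: percent of entry-level providers expected to answer correctly.
judgments = {
    "item_a": [70, 80, 75],
    "item_b": [50, 60, 55],
    "item_c": [90, 85, 95],
}

print(angoff_item_ratings(judgments))
print(round(angoff_raw_cut(judgments), 3))
```

A real committee would rate at least the 120 items mentioned above, and the result would be a recommendation for further review rather than a final standard.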
Following completion of a cut score report a meeting is held of the NREMT Standards and Examination Committee of the Board of Directors. This committee reviews the report, considers it along with their own experience with entry-level competent EMS providers and then establishes a recommended cut score. Once the Standards and Examination Committee completes its process a recommendation is made to the Board of Directors who adopts the standard. The standard adopted by the Board becomes the NREMT standard and is the definition of entry-level competency.
This description provides an overview of the cut score process. The process is very complicated, legally defensible, and psychometrically sound, and it follows the American Psychological Association's Standards for Educational and Psychological Testing and the standards of the National Commission for Certifying Agencies (NCCA). The NCCA has accredited all NREMT certifications and has reviewed the process the NREMT uses to develop its examinations and establish its cut score.
Understanding NREMT Cognitive Exam Scores
The NREMT written examination is a “criterion referenced” exam intended to identify individuals who have the knowledge necessary to provide safe and effective pre-hospital care. As a criterion referenced exam the passing standard is predetermined. The test is not “graded on a curve”, it is not intended to identify the highest performers, nor does it have a predefined percentage of candidates that must fail. The exam is constructed so that anybody taking the exam that is able to meet the standard can pass.
This is different from many other tests. Most students who progress through typical education programs take “norm-referenced” examinations. It is important to know the different characteristics of norm-referenced examinations compared to criterion-based examinations.
Criterion-based examinations like the National Registry have only one score that counts: did the candidate meet the criteria (pass) or did the candidate not meet the criteria (fail). When taking a criterion-based examination, candidates are in competition with the criteria, not other candidates. Candidates who take NREMT examinations are trying to demonstrate they have enough knowledge so that they can safely and effectively practice.
Norm-referenced tests are typically designed to measure achievement, not competency. They are designed to answer the question, “Who is the best?” This answer is beneficial for example in a promotional examination or an admissions test. Norm referenced exams are also commonly used in a classroom where a teacher wants to award “A’s” to the best students. The Scholastic Aptitude Test (SAT) is also an achievement test. The SAT helps college admission committees make decisions as to which applicants learned the most in high school and provides them guidance to selecting the types of students who they want to attend their college.
People who take norm-referenced tests are in competition with each other. They are trying to “beat” out other candidates and achieve a high score. These tests are usually timed, so that most candidates have to hurry through the test, answering as many questions correctly as possible in a short time period. They may begin with simple questions and rapidly progress to very difficult questions. Some may be answerable only by the very best candidates. The tests are designed to “spread” the students out so they can distinguish between high, moderate and low achievers, usually along a "bell-shaped" distribution curve.
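The difference between the two kinds of test can be made concrete with a small sketch that applies both decision rules to the same set of scores. The candidate names, scores, standard, and number of slots are all hypothetical and chosen only for illustration.

```python
def criterion_referenced(scores, standard):
    """Everyone who meets the predefined standard passes, however many that is."""
    return {name: score >= standard for name, score in scores.items()}

def norm_referenced(scores, slots):
    """Only the top-ranked candidates are selected, regardless of any fixed standard."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    selected = set(ranked[:slots])
    return {name: name in selected for name in scores}

scores = {"Avery": 82, "Blake": 78, "Casey": 74, "Devon": 65}

print(criterion_referenced(scores, standard=70))
print(norm_referenced(scores, slots=2))
```

Under the criterion rule, three of the four hypothetical candidates pass because each met the standard on their own; under the norm rule, only the top two are selected because candidates are compared with each other.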
Educators are often under pressure resulting in a well documented phenomenon known as “grade inflation.” Some schools may have a policy that says scores above 90% equal an “A”, and those between 80 and 89% equal a “B” and so on. In this case, a teacher must develop a test that ranks students, gives them a sense of accomplishment and rewards those who learn the most. Setting the pass/fail score under this type of policy is difficult. They have learned that too many difficult questions will lower the scores and too many easy questions will raise the scores. Students too have interpretations of these scores. A high score makes them feel confident and well learned. When a student takes a well-constructed test, this is true. But when the student takes a test that has too many easy questions, a false sense of confidence can arise. In fact some students use tests to guide their learning behavior. They study for the first test, perhaps three hours, and if they get an “A” on an easy test, they use this experience to guide their study for the next test, perhaps maybe as little as one hour. They’ll judge their entire learning experience based upon the results of their scores on teacher made tests. This can be dangerous and at times detrimental to the student, if the teacher develops poor tests. The other end of the perspective is also possible but less likely to occur.
Difficult teacher made tests can cause some students to fail the course who should pass. Every teacher knows how difficult it is to construct good test questions; it is even more difficult to define a pass/fail score for the entire test. Teachers are often attempting to develop tests that have “predictive validity,” that is, tests that can predict if the student will do well on a different test with known validity, such as the NREMT examination. It is improper to think scores on NREMT examinations are like teacher made, norm-referenced, achievement tests. A score of 85% on the NREMT examination is not a "B", or "good job". The NREMT examination doesn't measure achievement, but measures if the candidate meets the criteria of entry-level competency.
Finally, norm-referenced examinations provide a ranking of scores, from the highest to the lowest score. However, criterion-referenced tests are not designed to “rank” people. They are designed to identify those who have met the predefined standard. The purpose of the NREMT exam is not to identify the best, but to identify who is "competent." We want to know this so we can pass this information along to you, to your State EMS Office so you can get a license, to a future employer, and most importantly to the patients you will take care of. The NREMT is not trying to assure EMTs are “experts”. We’re saying EMTs are good enough to work at the entry-level. We know EMTs need experience to be the best. We know EMTs in great EMS systems that expect quality patient care will get better with experience. We know attitude affects care. We know EMTs who work in progressive systems will excel. Our mission is to certify and register EMTs throughout their careers by a valid and uniform process that assesses the knowledge and skills for competent practice. Passing the NREMT examination demonstrates the potential EMT has met the criteria of entry-level competence. It enables you to start a career in EMS. Your learning has only just begun.
In 1996, the Board of Directors of the National Registry of Emergency Medical Technicians approved funding and support for NREMT’s dedication of in-kind services for assisting with the revisions of the EMT-Intermediate and EMT-Paramedic National Standard Curricula. Our support was primarily in the development of written and practical examination materials to be utilized throughout the pilot testing phases of the curricula. A Practical Examination Revision Committee was selected that possessed broad national input based on the expertise and previous experience with NREMT practical examination development. The format for the practical examination was carried over from the committee’s previous work in 1990-1 as well as meeting NHTSA requirements for the contractor to follow the format of the 1994 EMT-Basic National Standard Curriculum. The committee then developed numerous skill evaluation instruments to address the skills inclusive in drafts of the EMT-Intermediate and EMT-Paramedic National Standard Curricula that would be tested in the pilot phase of development.
Several drafts of the revised instruments were circulated and reviewed by committee members and other interested parties. Following further revision, the instruments were pilot tested at several sites throughout the country. National Registry Representatives, Examination Coordinators, and Skill Examiners provided critical review and suggestions for improvement in the revised instruments. In November of 1999, the Standards and Examination Committee recommended the revised Advanced Level Practical Examination be adopted, having reviewed each skill in relationship to criticality, frequency of use in everyday out-of-hospital care, and public safety. These skills were further evaluated to distinguish between those that would be essential and included for mandatory testing to verify minimal competence, and other skills that could be evaluated on a random basis.
The revised Advanced Level Practical Examination consists of skills presented in a scenario-type format to approximate the abilities of the NREMT-Intermediate and NREMT-Paramedic to function in the out-of-hospital setting. All skills have been developed in accordance with the 1994 EMT-Basic National Standard Curriculum, the behavioral and skill objectives of the 1999 EMT-Intermediate and EMT-Paramedic National Standard Curricula, and current American Heart Association guidelines for Basic Cardiac Life Support (BCLS) and Advanced Cardiac Life Support (ACLS) that are updated as necessary. The process is a formal verification of the candidate's "hands-on" abilities and knowledge, rather than a teaching, coaching, or remedial training session. The NREMT will not explain any specific errors in any performance. A candidate’s attendance at a scheduled examination does not guarantee eligibility for National Registration. The candidate is warned that he/she assumes all risks and consequences of testing inappropriate skills if testing at a site where his/her name is not read as part of the official examination roster.
All forms were designed to evaluate terminal performance expectations of an entry level candidate upon successful completion of the state-approved training program and were not designed as "teaching" instruments. To fully understand the whys, hows, and sequencing of all steps in each skill, a solid cognitive and psychomotor foundation must be established throughout the educational process. After a minimal level of competence begins to develop, the candidate should refer to the appropriate skill evaluation instrument for self-assessment in identifying areas of weakness. If indicated, remedial training and practice over the entire skill with the educational institution is strongly encouraged. Once skill mastery has been achieved in this fashion, the candidate should be prepared for the examination.