College of Business


Assessment Terms Glossary Index





Assessment: A method for analyzing and describing student learning outcomes or program achievement of objectives. Many assessments are not tests. For students, a reading miscue analysis is an assessment, a direct observation of student behavior can be an assessment, and a student conference can be an assessment. For programs, a senior exit interview can be an assessment, and an employer survey of satisfaction with graduates can be an assessment. Good assessment requires feedback to those who are being assessed so that they can use that information to make improvements. A good assessment program requires using a variety of assessment instruments, each one designed to discover unique aspects of student learning outcomes and achievement of program objectives.

Assessment of programs: Uses the department or program as the level of analysis. Can be quantitative or qualitative, formative or summative, standards-based or value added, and used for improvement or for accountability. Ideally, program goals and objectives would serve as a basis for the assessment. Example: how sophisticated a close reading of texts senior English majors can accomplish (if used to determine value added, would be compared to the ability of newly declared majors).

Assessment plan: A document that outlines the student learning outcomes and program objectives, the direct and indirect assessment methods used to demonstrate the attainment of each outcome/objective, a brief explanation of the assessment methods, an indication of which outcome(s)/objectives is/are addressed by each method, the intervals at which evidence is collected and reviewed, and the individual(s) responsible for the collection/review of evidence.


Backload  (--ed, --ing): Amount of effort required after the data collection.

Behavioral observations: Measuring the frequency, duration, topology, etc. of student actions, usually in a natural setting with non-interactive methods, for example, formal or informal observations of a classroom. Observations are most often made by an individual and can be augmented by audio or videotape.


Capstone Courses: could be a senior seminar or designated assessment course. Program learning outcomes can be integrated into assignments.

Case Studies: involve a systematic inquiry into a specific phenomenon, e.g. individual, event, program, or process. Data are collected via multiple methods often utilizing both qualitative and quantitative approaches.

Classroom Assessment: is often designed for individual faculty who wish to improve their teaching of a specific course. Data collected can be analyzed to assess student learning outcomes for a program.

College Portrait: is a web-based tool with which users can obtain information about state colleges' and universities' students, programs, degrees awarded, financial aid, admissions, undergraduate success and progress rates, etc. College Portrait is the product of the Voluntary System of Accountability (VSA).

Commercial, norm-referenced, standardized exams: Group administered, mostly or entirely multiple-choice, "objective" tests in one or more curricular areas. Scores are based on comparison with a reference or norm group. Typically must be purchased from a private vendor.

Competency: (1) Level at which performance is acceptable.

Competency:  (2) A group of characteristics, native or acquired, which indicate an individual's ability to acquire skills in a given area.

Content Analysis: is a procedure that categorizes the content of written documents. The analysis begins with identifying the unit of observation, such as a word, phrase, or concept, and then creating meaningful categories to which each item can be assigned. For example, a student’s statement that “I learned that I could be comfortable with someone from another culture” could be assigned to the category of “Positive Statements about Diversity.” The number of times this type of response occurs can then be quantified and compared with neutral or negative responses addressing the same category.
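The counting step described above can be sketched in a few lines of Python. This is a minimal illustration only: the codebook, its keywords, and the function names are hypothetical, and in practice trained human coders, not keyword matching, usually assign statements to categories.

```python
import re
from collections import Counter

# Hypothetical codebook: a statement is assigned to the first category
# whose keyword set shares at least one word with the statement.
CODEBOOK = {
    "positive": {"comfortable", "enjoyed", "valued"},
    "negative": {"uncomfortable", "avoided", "disliked"},
}


def code_statement(statement: str) -> str:
    """Assign one statement to a category, or 'neutral' if nothing matches."""
    words = set(re.findall(r"[a-z']+", statement.lower()))
    for category, keywords in CODEBOOK.items():
        if words & keywords:
            return category
    return "neutral"


def tally(statements) -> Counter:
    """Count how many statements fall into each category."""
    return Counter(code_statement(s) for s in statements)


statements = [
    "I learned that I could be comfortable with someone from another culture",
    "I avoided group work with people I did not know",
    "The course met on Tuesdays",
]
counts = tally(statements)
print(counts)
```

Matching on whole words rather than substrings keeps "uncomfortable" from being counted as "comfortable".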

Course-embedded assessment:  Course-embedded assessment refers to techniques that can be utilized within the context of a classroom (one class period, several or over the duration of the course) to assess students' learning, as individuals and in groups. Course-embedded assessments can be formative or summative. When used in conjunction with other assessment tools, course-embedded assessment can provide valuable information at specific points of a program. For example, faculty members teaching multiple sections of an introductory course might include a common pre-test to determine student knowledge, skills and dispositions in a particular field at program admission. There are literally hundreds of classroom assessment techniques, limited only by the instructor's imagination (see also embedded assessment).


Direct assessment methods: These methods involve students' displays of knowledge and skills (e.g. test results, written assignments, presentations, classroom assignments) resulting from learning experiences in the class/program.


Embedded assessment: A means of gathering information about student learning that is built into, and a natural part of, the teaching-learning process. Often used for assessment purposes in classroom assignments that are evaluated to assign students a grade. Can assess individual student performance or aggregate the information to provide information about the course or program; can be formative or summative, quantitative or qualitative. Example: as part of a course, expecting each senior to complete a research paper that is graded for content and style, but is also assessed for advanced ability to locate and evaluate Web-based information (as part of a college-wide outcome to demonstrate information literacy).

Evaluation: (1) Depending on the context, evaluation may mean either assessment or test. Many test manufacturers and teachers use these three terms interchangeably, which means you have to pay close attention to how the terms are being used and why they are being used that way. For instance, tests that do not provide any immediate, helpful feedback to students and teachers should never be called “assessments,” but many testing companies and some administrators use this term to describe tests that return only score numbers to students and/or teachers.

Evaluation:  (2) When used for most educational settings, evaluation means to measure, compare, and judge the quality of student work, schools, or specific educational programs.

Evaluation:  (3) A value judgment about the results of assessment data. For example, evaluation of student learning requires that educators compare student performance to a standard to determine how the student measures up. Depending on the result, decisions are made regarding whether and how to improve student performance.

Exit and other interviews: Asking individuals to share their perceptions of their own attitudes and/or behaviors or those of others, evaluating student reports of their attitudes and/or behaviors in a face-to-face dialogue.

External Assessment: Use of criteria (rubric) or an instrument developed by an individual or organization external to the one being assessed.

External examiner: Using an expert in the field from outside your program, usually from a similar program at another institution, to conduct, evaluate, or supplement assessment of your students. Information can be obtained from external evaluators using many methods including surveys, interviews, etc.

External validity: External validity refers to the extent to which the results of a study are generalizable or transferable to other settings. Generalizability is the extent to which assessment findings and conclusions from a study conducted on a sample population can be applied to the population at large. Transferability is the ability to apply the findings in one context to another similar context.


Fairness: (1) Assessment or test that provides an even playing field for all students. Absolute fairness is an impossible goal because all tests privilege some test takers over others; standardized tests provide one kind of fairness while performance tests provide another. The highest degree of fairness can be achieved when students can demonstrate their understanding in a variety of ways.

Fairness:  (2) Teachers, students, parents and administrators agree that the instrument has validity, reliability, and authenticity, and they therefore have confidence in the instrument and its results.

Focus groups: Typically conducted with 7-12 individuals who share certain characteristics that are related to a particular topic, area or assessment question. Group discussions are conducted by a trained moderator with participants to identify trends/patterns in perceptions. The moderator's purpose is to provide direction and set the tone for the group discussion, encourage active participation from all group members, and manage time. Moderators must not allow their own biases to enter, verbally or nonverbally. Careful and systematic analysis of the discussions provides information that can be used to assess and/or improve the desired outcome.

Forced-choice: The respondent only has a choice among given responses (e.g., very poor, poor, fair, good, very good).

Formative assessment: The gathering of information about student learning during the progression of a course or program, usually repeatedly, to improve the learning of those students. Assessment feedback is short term in duration. Example: reading the first lab reports of a class to assess whether some or all students in the group need a lesson on how to make them succinct and informative.

Frontload (--ed, --ing): Amount of effort required in the early stage of assessment method development or data collection.


Generalization (generalizability): The extent to which assessment findings and conclusions from a study conducted on a sample population can be applied to the population at large.


High stakes test: A test whose results have important, direct consequences for the examinees, programs, or institutions tested.


Indirect assessment of learning: Gathers reflection about the learning or secondary evidence of its existence. Example: a student survey about whether a course or program helped develop a greater sensitivity to issues of diversity.

Inter-rater reliability: The degree to which different raters/observers give consistent estimates of the same phenomenon.
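Two common ways to put a number on rater consistency are simple percent agreement and Cohen's kappa, which corrects agreement for chance. The glossary does not prescribe either statistic; the sketch below is an illustration under that assumption, and the function names and rating data are hypothetical.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b) -> float:
    """Fraction of items on which two raters gave the same rating."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohens_kappa(rater_a, rater_b) -> float:
    """Agreement between two raters, corrected for chance agreement."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters pick the same category
    # if each rated independently at their own observed rates.
    expected = sum(
        counts_a[k] * counts_b[k] for k in counts_a.keys() | counts_b.keys()
    ) / (n * n)
    if expected == 1:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1 - expected)


# Hypothetical rubric scores (1-3) given by two raters to eight papers.
rater_a = [3, 2, 3, 1, 2, 3, 1, 2]
rater_b = [3, 2, 2, 1, 2, 3, 1, 3]
agreement = percent_agreement(rater_a, rater_b)
kappa = cohens_kappa(rater_a, rater_b)
print(agreement, round(kappa, 3))
```

Here the raters agree on six of eight papers, but kappa is lower than the raw agreement because some of that agreement would be expected by chance.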

Internal validity: Internal validity refers to (1) the rigor with which the study was conducted (e.g., the study's design, the care taken to conduct measurements, and decisions concerning what was and wasn't measured) and (2) the extent to which the designers of a study have taken into account alternative explanations for any causal relationships they explore.

Interviews: are conversations or direct questioning with an individual or group of people. The interviews can be conducted in person or on the telephone. The length of an interview can vary from 20 minutes to over an hour. Interviewers should be trained to follow agreed-upon procedures (protocols).




Local assessment: Means and methods that are developed by an institution's faculty based on their teaching approaches, students, and learning goals. Is an antonym for “external assessment.” Example: one college's use of nursing students' writing about the “universal precautions” at multiple points in their undergraduate program as an assessment of the development of writing competence.

Locally developed exams: Objective and/or subjective tests designed by faculty of the program or course sequence being evaluated.

Longitudinal studies: Data collected from the same population at different points in time.


Matrices: are used to summarize the relationship between program objectives and courses, course assignments, or course syllabus objectives to examine congruence and to ensure that all objectives have been sufficiently structured into the curriculum.
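A curriculum matrix of this kind can be represented as a mapping from courses to the objectives they address, which makes gaps easy to find. The sketch below is illustrative only; the course codes, objective labels, and function names are hypothetical.

```python
# Hypothetical curriculum matrix: each course lists the program
# objectives it addresses.
CURRICULUM_MAP = {
    "ACCT 101": {"Objective 1"},
    "MGMT 250": {"Objective 1", "Objective 2"},
    "MKTG 310": {"Objective 2"},
}
OBJECTIVES = ["Objective 1", "Objective 2", "Objective 3"]


def coverage(curriculum_map, objectives):
    """For each objective, list the courses that address it."""
    return {
        objective: sorted(
            course
            for course, addressed in curriculum_map.items()
            if objective in addressed
        )
        for objective in objectives
    }


def uncovered(curriculum_map, objectives):
    """Objectives that no course in the matrix addresses."""
    return [
        objective
        for objective, courses in coverage(curriculum_map, objectives).items()
        if not courses
    ]


print(coverage(CURRICULUM_MAP, OBJECTIVES))
print(uncovered(CURRICULUM_MAP, OBJECTIVES))
```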


Norm (--ative): A set standard of development usually derived from the average or median achievement of a large group.


Observations: can be of any social phenomenon, such as student presentations, students working in the library, or interactions at student help desks. Observations can be recorded as a narrative or in a highly structured format, such as a checklist, and they should be focused on specific program objectives.

Open-ended: Assessment questions that are designed to permit spontaneous and unguided responses.

Operational (--ize): Defining a term or object so that it can be measured. Generally states the operations or procedures used that distinguish it from others.

Oral examination: An assessment of student knowledge levels through a face-to-face dialogue between the student and examiner, usually faculty.

Outcomes: When used in the context of student learning, outcomes are what students will know, be able to do, and value at the end of their degree program.


Performance appraisals: A competency-based method whereby abilities are measured in the most direct, real-world approach. Systematic measurement of overt demonstration of acquired skills.

Performance assessment: A method for assessing how well students use their knowledge and skills in order to do something. Music students performing a new piece of music before a panel of judges are undergoing performance assessment; students who are expected to demonstrate an understanding of basic grammar, spelling, and organizational skills while writing a paper are undergoing performance assessment; business students asked to write a proposal to solve a problem presented in a case study are undergoing performance assessment.

Program review: The administrative (college and provost's staff) and peer (Academic Planning Council) review of academic programs conducted on an eight-year cycle, the results of which are reported to the NIU Board of Trustees and the IBHE. This review includes a comprehensive analysis of the structure, processes, and outcomes of the program. The outcomes reported in the program reviews include program outcomes (e.g. costs, degrees awarded) as well as student learning outcomes (i.e. what students know and can do at the completion of the program).


Qualitative methods of assessment: Methods that rely on descriptions rather than numbers. Examples: ethnographic field studies, logs, journals, participant observations, open-ended questions on interviews and surveys.

Quantitative methods of assessment: Methods that rely on numerical scores or ratings. Examples: surveys, inventories, institutional/departmental data, departmental/course-level exams (locally constructed, standardized, etc.).


Reflective Essays: generally are brief (five to ten minute) essays on topics related to identified learning outcomes, although they may be longer when assigned as homework. Students are asked to reflect on a selected issue. Content analysis is used to analyze results.

Reliability: The extent to which an experiment, test or any measuring procedure yields the same result on repeated trials.

Rubrics: A set of categories that define and describe the important components of the work being completed, critiqued or assessed. Each category contains a gradation of levels of completion or competence with a score assigned to each level and a clear description of what criteria need to be met to attain the score at each level.
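The structure described above (categories, levels, a score per level, and a description per level) maps naturally onto a nested dictionary. The sketch below is one possible representation, not a prescribed format; the category names, level descriptions, and function names are hypothetical.

```python
# Hypothetical rubric: category -> {score: description of that level}.
RUBRIC = {
    "organization": {
        1: "No discernible structure",
        2: "Structure present but uneven",
        3: "Clear, logical structure throughout",
    },
    "evidence": {
        1: "Claims are unsupported",
        2: "Some claims are supported",
        3: "All major claims are supported",
    },
}


def score_work(rubric, ratings) -> int:
    """Total score for one piece of work, given a level for every category."""
    missing = set(rubric) - set(ratings)
    if missing:
        raise ValueError(f"no rating given for: {sorted(missing)}")
    for category, level in ratings.items():
        if category not in rubric or level not in rubric[category]:
            raise ValueError(f"invalid rating {level!r} for {category!r}")
    return sum(ratings.values())


def feedback(rubric, ratings):
    """Description of the level attained in each category."""
    return {category: rubric[category][level] for category, level in ratings.items()}


ratings = {"organization": 3, "evidence": 2}
total = score_work(RUBRIC, ratings)
print(total, feedback(RUBRIC, ratings))
```

Returning the level descriptions alongside the total keeps the score tied to the criteria that produced it.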


Salience: A striking point or feature.

Simulations: A competency-based measure where a person's abilities are measured in a situation that approximates a "real world" setting. Simulation is primarily used when it is impractical to observe a person performing a task in a real-world situation (e.g. on the job).

Stakeholder: Anyone who has a vested interest in the outcome of the program/project. In a high stakes standardized test (a graduation requirement, for example), when students' scores are aggregated and published in the paper by school, the stakeholders include students, teachers, parents, school and district administrators, lawmakers (including the governor), and even real estate agents. It is always interesting to note which stakeholders seem to have the most at risk and which stakeholders seem to have the most power; these groups are seldom the same.

Standard: The performance level associated with a particular rating or grade on a test. For instance, 90% may be the standard for an A in a particular course; on a standardized test, a cutting score or cut point is used to determine the difference between one standard and the next.
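The role of cut scores can be illustrated with a small lookup. Only the 90% threshold for an A comes from the definition above; the other cut scores, the grade labels, and the function name are assumptions made for the example.

```python
from bisect import bisect_right

# Cut scores separating one standard from the next. Only the 90 -> "A"
# boundary comes from the glossary entry; the rest are hypothetical.
CUT_SCORES = [60, 70, 80, 90]
GRADES = ["F", "D", "C", "B", "A"]


def grade_for(score: float) -> str:
    """Return the grade whose standard the score meets."""
    # bisect_right counts how many cut scores are at or below the score,
    # so a score exactly on a cut point earns the higher grade.
    return GRADES[bisect_right(CUT_SCORES, score)]


for score in (59, 60, 89.9, 90):
    print(score, grade_for(score))
```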

Standard-based assessment:  A standard-based assessment assesses learner achievement in relation to set standards.

Standardized test:  This kind of test (sometimes called “norm-referenced”) is used to measure the performance of a group against that of a larger group. Standardized tests are often used in large-scale assessment projects, where the overall results of the group are more important than specific data on each individual client. Standardized tests are not authentic. They are most useful for reporting summative information, and are least useful for classroom diagnosis and formative purposes.

Standards: Widely recognized models of excellence; term commonly used to describe achievement goals. Standards are always prescriptive because they tell us what “should be.”

Status report: A description of the implementation of the plan's assessment methods, the findings (evidence) from assessment methods, how the findings were used in decisions to maintain or improve student learning (academic programs) or unit outcomes (support units), the results of previous changes to improve outcomes, and the need for additional information and/or resources to implement an approved assessment plan or gather additional evidence.

Summative assessment: Assessment that is done at the conclusion of a course or some larger instructional period (e.g., at the end of the program). The purpose is to determine success or to what extent the program/project/course met its goals.

Surveys: are commonly used with open-ended and closed-ended questions. Closed-ended questions require respondents to answer the question from a provided list of responses. Typically, the list is a progressive scale ranging from low to high, or strongly agree to strongly disagree.


Test: A formal assessment of student achievement. Teacher-made tests can take many forms; external tests are always standardized. A portfolio can be used as a test, as can a project or exhibition.

Third party: Person(s) other than those directly involved in the educational process (e.g., employers, parents, consultants).

Topology: Mapping of the relationships among subjects.



Validity: Validity refers to the degree to which a study accurately reflects or assesses the specific concept that the researcher is attempting to measure. Validity has three components:

  • relevance - the option measures your educational objective as directly as possible
  • accuracy - the option measures your educational objective as precisely as possible
  • utility - the option provides formative and summative results with clear implications for educational program evaluation and improvement

Value added: The increase in learning that occurs during a course, program, or undergraduate education. Can either focus on the individual student (how much better a student can write, for example, at the end than at the beginning) or on a cohort of students (whether senior papers demonstrate more sophisticated writing skills, in the aggregate, than freshmen papers). Requires a baseline measurement for comparison.
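Both views of value added reduce to a comparison against a baseline. The sketch below shows one simple way to compute each, assuming scores on a common scale; the scores, student identifiers, and function names are hypothetical, and a raw difference is only one of several ways to estimate gain.

```python
from statistics import mean


def individual_gains(baseline, final):
    """Per-student gain for students measured at both points in time."""
    return {
        student: final[student] - baseline[student]
        for student in baseline
        if student in final
    }


def cohort_gain(freshman_scores, senior_scores) -> float:
    """Difference between the mean senior score and the mean freshman score."""
    return mean(senior_scores) - mean(freshman_scores)


# Hypothetical writing scores on a 1-4 rubric scale.
baseline = {"s1": 2, "s2": 3}
final = {"s1": 4, "s2": 3}
gains = individual_gains(baseline, final)
aggregate = cohort_gain([2, 3, 2, 3], [4, 3, 4, 3])
print(gains, aggregate)
```

The individual view requires the same students at both points; the cohort view compares two groups in the aggregate and does not.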

Variable (variability): Observable characteristics that vary among individuals' responses.


Written surveys/questionnaires: Asking individuals to share their perceptions about the study target, e.g., their own or others' skills/attitudes/behavior, or program/course qualities and attributes.




References and definitions adopted from the following links: