Factors that Can Bias Student Evaluation of Teaching (SET) Results

Instructor gender

A great deal of research has examined the relationship between gender and SET results. Many studies find no statistically significant differences based on gender when controlling for other possible sources of bias (Centra & Gaubatz, 1998; Feldman, 1992; Theall & Franklin, 2001; Benton & Cashin, 2012; Wright & Jenkins-Guarnieri, 2012). Many others find a consistent bias against women faculty (Amin, 1994; Sprague & Massoni, 2005; Abel & Meltzer, 2007; MacNell et al., 2015; Boring, Ottoboni, & Stark, 2016). Still others find that female faculty members receive higher SET ratings than male faculty members (Wigington, Tollefson, & Rodriguez, 1989; Rowden & Carlson, 1996; Whitworth, Price, & Randall, 2002). Some researchers suggest complicated correlations between the gender of the students completing SET and the gender of the instructor (Basow & Distenfield, 1985; Basow & Silberg, 1987; Atamian & Ganguli, 1993; Goldberg & Callahan, 1991; Bachen, McLoughlin, & Garcia, 1999).

Instructor ethnicity

Although instructor ethnicity has not been studied as extensively as other factors, research suggests that students of the same ethnicity as the instructor may rate that instructor slightly higher (Centra, 1993). However, research also indicates that, overall, non-white instructors receive lower ratings than their white colleagues (McPherson & Jewell, 2007).

Instructor being a non-native speaker of English

Students with English as their first language give slightly lower ratings to non-native English-speaking instructors. Gender and language also appear to be related: male non-native English-speaking instructors receive slightly lower ratings than female non-native English-speaking instructors (Hamermesh & Parker, 2005; Huston, 2005).

Instructor physical appearance

Research suggests there may be a correlation between perceived attractiveness or personal appearance and higher ratings. Students, especially undergraduates, rate instructors higher if they perceive the instructors as physically attractive or simply as presenting a well-kept personal appearance through demeanor, dress, health, etc. (Hamermesh & Parker, 2005; Abrami, Rosenfield, & Dedic, 2007).

Instructor age

Older faculty receive lower ratings than do younger faculty (Feldman, 1983). Some research suggests that students reward both youthfulness and the "seasoned" instructor over 55, with lower ratings between those two poles (McPherson & Jewell, 2007). First-year and very early-career instructors consistently receive lower ratings (Benton & Cashin, 2012).

Instructor confidence and enthusiasm

Two factors of instructor personality positively influence ratings: positive self-esteem and energy or enthusiasm (Feldman, 1986; Benton & Cashin, 2012; Davidovitch & Soen, 2009).

Academic field

Research shows differences in ratings by field. Arts and humanities instructors frequently receive higher ratings than social science and math instructors (Feldman, 1978). Increasing evidence supports this disparity by field, but the reason it occurs remains unclear. Some speculate that certain subjects are more difficult to teach, while others wonder whether the disparity reflects a larger trend of shifting capacities among students, i.e., that students more easily grasp or respond to arts and humanities than to social sciences and math (Feldman, 1978; Centra, 1993, 2009).

Undergraduate vs. graduate course

Graduate students tend to rate instructors more favorably than undergraduate students (Aleamoni & Hexner, 1980; Goldberg & Callahan, 1991).

Relevancy and difficulty of work required in course

If students deem assignments and activities unnecessary or unrelated to the course (i.e., “busy work”), ratings are lower. Students tend to give higher ratings to instructors who require demanding work that directly relates to instructional objectives (Bain, 2004; Benton & Cashin, 2012; Kegan, 1998).

Amount learned in course

There are consistently high correlations between students' ratings of the "amount learned" in the course and their overall ratings of the teacher and the course. The students who performed the best on final exams also gave the highest ratings (Theall & Franklin, 2001).

Class size

Instructors with smaller classes receive higher ratings than do instructors with larger classes (Benton & Cashin, 2012; Feldman, 1984; Hoyt & Lee, 2002).

Student interest in the course topic

Instructors receive higher ratings in courses that students had a prior interest in, such as courses directly related to their major, or that students were taking as an elective (Marsh & Dunkin, 1992; Aleamoni, 1981).

Instructor's presence

Ratings will be higher if the instructor stays in the classroom while students fill out the ratings form (Braskamp & Ory, 1994; Centra, 1993; Feldman, 1979; Marsh & Dunkin, 1992).


Signed ratings tend to be higher than anonymous ones (Braskamp & Ory, 1994; Centra, 1993; Feldman, 1979; Marsh & Dunkin, 1992). Online evaluations raise anonymity concerns for students, who worry that digital evaluations will be easier to track (Benton & Cashin, 2012).


References

Abrami, P. C., Rosenfield, S., & Dedic, H. (2007). The dimensionality of student ratings of instruction: An update on what we know, do not know, and need to do. In R. Perry & J. C. Smart (Eds.), The scholarship of teaching and learning in higher education: An evidence-based perspective (pp. 446–456). Netherlands: Springer.

Aleamoni, L. M. (1981). Student ratings of instruction. In J. Millman (Ed.), Handbook of teacher evaluation (pp. 110-145). Beverly Hills, CA: Sage Publications.

Aleamoni, L. M. (1987). Techniques for evaluating and improving instruction. San Francisco: Jossey-Bass.

Aleamoni, L. M. (1999). Student rating myths versus research facts from 1924 to 1998. Journal of Personnel Evaluation in Education, 13, 153-166.

Aleamoni, L. M., & Hexner, P. Z. (1980). A review of the research on student evaluation and a report on the effect of different sets of instructions on student course and instructor evaluation. Instructional Science, 9, 67-81.

Arreola, R. A. (1994). Developing a comprehensive faculty evaluation system. Boston:


Atamian, R., & Ganguli, G. (1993). Teacher popularity and teaching effectiveness: Viewpoint of accounting students. Journal of Education for Business, 68(3), 163–169.

Bain, K. (2004). What the best college teachers do. Cambridge, MA: Harvard University


Basow, S. A., & Distenfield, S. (1985). Teacher expressiveness: More important for male teachers than female teachers? Journal of Educational Psychology, 77(1), 45-62.

Basow, S. A., & Silberg, N. T. (1987). Student evaluations of college professors: Are female and male professors rated differently? Journal of Educational Psychology, 79(3), 308-314.

Benton, S. L., & Cashin, W. E. (2012). IDEA Paper No. 50: Student ratings of teaching: A summary of research and literature. Manhattan, KS: The IDEA Center.

Braskamp, L. A., & Ory, J. C. (1994). Assessing faculty work. San Francisco: Jossey-Bass.


Centra, J. A. (1976). The influence of different directions on student ratings of instruction. Journal of Educational Measurement, 13, 277-282.

Centra, J. A. (1979). Determining Faculty Effectiveness. San Francisco: Jossey-Bass.

Centra, J. A. (1993). Reflective faculty evaluation: Enhancing teaching and determining faculty effectiveness. San Francisco: Jossey-Bass.

Centra, J. A. (2003). Will teachers receive higher student evaluations by giving higher grades and less course work? Research in Higher Education, 44, 495-518.

Centra, J. A. (2009). Differences in responses to the Student Instructional Report: Is it bias? Princeton, NJ: Educational Testing Service.

Centra, J. A., & Gaubatz, N. B. (1998). Is there gender bias in student ratings of instruction? Journal of Higher Education, 70, 17-33.

Davidovitch, N., & Soen, D. (2009). Myths and facts about student surveys of teaching the links between students’ evaluations of faculty and course grades. Journal of College Teaching and Learning, 6(7), 41-49.

Feldman, K. A. (1978). Course characteristics and college students' ratings of their teachers: What we know and what we don't. Research in Higher Education, 9, 199-242.

Feldman, K. A. (1979). The significance of circumstances for college students' ratings of their teachers and courses. Research in Higher Education, 10, 149-172.

Feldman, K. A. (1983). Seniority and experience of college teachers as related to evaluations they receive. Research in Higher Education, 18, 3-124.

Feldman, K. A. (1986). The perceived instructional effectiveness of college teachers as related to their personality and attitudinal characteristics: A review and synthesis. Research in Higher Education, 24, 129-213.

Feldman, K. A. (1992). College students' views of male and female college teachers: Part I—evidence from the social laboratory and experiments. Research in Higher Education, 33, 317-375.

Frey, P. W. (1976). Validity of student instructional ratings: Does timing matter? Journal of Higher Education, 47(3), 327-336.

Goldberg, G., & Callahan, J. (1991). Objectivity of students’ evaluations of instructors. Journal of Education for Business, 66, 377-378.

Greenwald, A. G., & Gillmore, G. M. (1997). Grading lenience is a removable contaminant of student ratings. American Psychologist, 52(11), 1209-1217.

Hamermesh, D., & Parker, A. (2005). Beauty in the classroom: Instructors' pulchritude and putative pedagogical productivity. Economics of Education Review, 24, 369-376.

Hayes, J. R. (1971). Research, teaching and faculty fate. Science, 172, 227-230.

Huston, T.A. (2005). Research report: race and gender bias in student evaluations of

Kegan, R. (1998). In over our heads: The mental demands of modern life. Boston: Harvard University Press.

McPherson, M. A., & Jewell, R. T. (2007). Leveling the playing field: Should student evaluation scores be adjusted? Social Science Quarterly, 88, 868-881.

Marsh, H. W., & Dunkin, M. J. (1992). Students’ evaluations of university teaching: A multidimensional perspective. Higher Education: Handbook of Theory and Research, 8, 143-233. New York: Agathon Press.

Melland, H. I. (1996). Great researcher . . . good teacher? Journal of Professional Nursing, 12(1), 31-38.

Perry, R. P., Abrami, P. C., & Leventhal, L. (1979). Educational seduction: The effect of instructor expressiveness and lecture content on student ratings and achievement. Journal of Educational Psychology, 71, 107-116.

Perry, R. P., Magnusson, J. L., Parsonson, K. L., & Dickens, W. J. (1986). Perceived control in the college classroom: Limitations in instructor expressiveness due to noncontingent feedback and lecture content. Journal of Educational Psychology, 78, 96-107.

Theall, M., & Franklin, J. (Eds.). (1990). Student ratings of instruction: Issues for improving practice. New Directions for Teaching and Learning, 43. San Francisco: Jossey-Bass.

Theall, M., & Franklin, J. (2001). Looking for bias in all the wrong places: A search of truth or a witch hunt in student ratings of instruction? New Directions for Institutional Research, 109, 45-56.

Walker, B. D. (1969). An investigation of selected variables relative to the manner in which a population of junior college students evaluate their teachers. Dissertation Abstracts, 29(9-B), 3474.


Student Evaluation of Teaching