The Teaching Problem

The reason most teaching is bad is that most teaching follows a demonstrably bad model.

Higher education is much in the news, but the central challenge that colleges and universities are failing to meet is receiving hardly any attention. The big stories have been student loans, protests and demonstrations, affirmative action, ratings and rankings, racial and ethnic equity, funding and costs. Most recently, higher-ed news has been hijacked by the Trump administration’s incoherent and destructive attack on universities and foreign students. This is serious, and we can only hope that it doesn’t last long. But it is also a distraction from the most important long-term challenge facing higher education.

The main weakness of most colleges and universities is that they don’t do a very good job of educating most students. Most teachers do a mediocre job of teaching, and so most of their students do a mediocre job of learning. Many of these teachers are smart and talented, spend most of their time on teaching, and want to teach well, but because of the design of their institutions and the incentives built into their work, most fail. Most teaching follows a demonstrably bad model: the teacher tells the students what he wants them to know and then tests their ability to recall the information he has provided. This is one of the worst possible ways to get anyone to learn anything. Students don’t learn much even from excellent lecturers and remember very little from being told something once.

This is not a new discovery. When Charles William Eliot assumed the presidency of Harvard in 1869, he observed that “lectures alone are too often a useless expenditure of force. The lecturer pumps laboriously into sieves. The water may be wholesome but it runs through. A mind must work to grow.” Truer words have seldom been spoken and then widely ignored (even during Eliot’s forty years as president at Harvard). In the last several decades, scholars have observed and studied the learning process in detail, and their conclusions undermine nearly every premise behind the most common practices in college teaching.

The term active learning is used to distinguish an array of pedagogical approaches from conventional lectures. Consider two physics classes: the first is a conventional class taught by lecture-demonstration. The professor explains a principle or concept to his class, writing the relevant formulas on the board, perhaps using a physical demonstration to illustrate the point. The students go home, read over their notes, and prepare for the test. Most of those who do well came into the class with a reasonably clear framework for the physical and mathematical principles involved and already had a head start. They were able to teach themselves, at least to the extent of memorising the algorithms that would lead to the correct answer. The ones who have little background and foundational understanding will not do so well. “Studying more” will probably not help. Few students will seek out the professor’s office.

The second class uses “peer instruction,” a technique developed by Eric Mazur of Harvard. Mazur was a “successful” teacher, gave popular lectures, and received good student evaluations. Then he came across a test that sought to assess students’ conceptual understanding of physics. When he included some of these conceptual questions in his exams, he discovered two things: “First, it is possible for students to do well on conventional problems by memorizing algorithms without understanding the underlying physics. Second, as a result of this, it is possible for a teacher, even an experienced one, to be completely misled into thinking that students have been taught effectively.” So, he devised peer instruction.

Imagine the same lecture hall as above. The teacher has assigned reading to be completed before class. He spends a few minutes outlining the key principles that his students have already read about, then he gives them a problem with four possible answers. Students have clickers with which they can send remote responses to the professor’s computer. They have a minute or two to answer the question. Then the teacher tells the students to explain to their neighbours why they answered the question as they did. Then all students answer the question a second time. Only then does the teacher display a chart of the responses before and after the brief discussion. Most of the time, there is a considerable shift in the responses towards the correct answer. Finally, the teacher asks for a volunteer to explain why the correct answer is correct.

In the first class, the teacher talks and the students listen; the work of recalling and processing the information falls entirely on the students, and nothing in class tests their understanding or asks them to apply it. In the second class, the teacher talks for a while, the students apply their understanding of what they have been taught, then they explain their reasoning, get immediate feedback, and reflect on how it went right or wrong. There are many other ways of implementing active learning, such as well-designed group discussion, small-group projects, Socratic questioning, writing, and combining these in a “flipped” classroom where students do reading and interactive prompts before class and then discuss and defend their ideas face-to-face in class. But this example shows the essential distinction.

Active learning is not a startlingly new idea. It draws on the same principles that have generally guided vocational training and professional education. In the Socratic approach to legal education, the studio system in art and architecture, rounds in medical school, and parallel processes in most professional training, the student is not simply an observer but an agent, called upon regularly to evaluate, diagnose, explain, and defend. Students in such settings never have the option of simply withdrawing and observing. 

Which approach works better? Some science and mathematics teachers have been arguing against the conventional means of teaching science since, well, Socrates. But in recent decades, researchers have tested the competing approaches empirically. In 2014, Scott Freeman, a biologist at the University of Washington, and six colleagues conducted a massive and careful meta-analysis of 225 studies comparing student performance in classes taught by traditional lectures and classes taught using active learning techniques in eight different science, technology, engineering, and mathematics (STEM) disciplines. The difference was dramatic. Students in active learning courses did significantly better on exams than students in lecture courses. The lecture format increased failure rates by 55 percent. Reviewing the evidence, they concluded:

If the experiments analyzed here had been conducted as randomized controlled trials of medical interventions, they may have been stopped for benefit—meaning that enrolling patients in the control condition might be discontinued because the treatment being tested was clearly more beneficial.

Particularly interesting was Freeman and colleagues’ finding that “active learning confers disproportionate benefits for STEM students from disadvantaged backgrounds and for female students in male-dominated fields.” This suggests there may be a way of addressing a problem with which higher education has been unsuccessfully grappling for years—how to improve the performance of underperforming black and Hispanic students.

In 2020, Elli J. Theobald, another biologist from the University of Washington, led a team of scientists in a meta-analysis of fifteen studies specifically focused on students from groups that were underrepresented in STEM disciplines. They began by pointing out that students from some racial and ethnic minorities drop out of STEM programs at shockingly high rates. The problem is not that they don’t start programs; it’s that they don’t finish them. Surveys show that students who start out in STEM programs do so with high hopes and a high level of interest. After six years, 52 percent of Asian-Americans and 43 percent of Caucasian students have completed their programs. But the African-American completion rate is just 22 percent. For Latino/Latina Americans it is 29 percent, and for Native Americans it is 25 percent. Similar disparities emerge from contrasting low-income and high-income students. Many drop out early in programs, often after failing the first course. 

How does better teaching affect this? Theobald and colleagues found that in a variety of courses “active learning reduced the gap in probability of passing ... by 45%.” Bear in mind that many under-performing students have little prior experience thinking and working with the kinds of ideas, data, and processes that such courses require. They didn’t learn this stuff in high school and haven’t been exposed to it at home. Given these disadvantages, reducing the achievement gap by 45 percent is, frankly, breathtaking. So, it turns out, these students are not disabled just by their class, but by their classes. Their parents aren’t entirely at fault. To a significant degree, their teachers are. How well might they do if they had access to excellent teaching throughout their education?

This evidence comes from research in STEM fields, but in 2022, Anastassis Kozanitis and Lucian Nenciovici of the University of Quebec undertook a similar meta-analysis using 104 studies directly comparing active learning and traditional teaching in the social sciences and the humanities. They found active learning to be superior to traditional instruction by the same margin that Freeman’s study found for STEM fields. Students who are studying history or literature or psychology will learn less if their teachers simply lecture to them than if they engage those students in ongoing discussion, writing, and performance.

Are We Changing What Needs to Change?

All of which indicates that methods of teaching need to change if colleges are to fulfil their stated purpose. Have they? Well, a little bit. But the evidence is not encouraging.

In 2018, four years after Freeman’s study, Marilyne Stains, associate professor of chemistry at the University of Nebraska, Lincoln, gathered about two dozen of her colleagues and travelled around the country to confirm directly how teachers were really teaching in the STEM disciplines. They observed 709 courses taught by 549 instructors at 24 research universities and one college. They found that lecture was the most common pedagogy. Fifty-five percent of the courses used a “didactic” style, which meant that more than eighty percent of class time consisted of lecturing; 27 percent employed “interactive lecture”; and eighteen percent were “student-centered.” The researchers concluded: “Didactic practices are prevalent throughout the undergraduate STEM curriculum despite ample evidence for the limited impact of these practices and substantial interest on the part of institutions and national organizations in education reform.” In other words, most science teachers aren’t following the approach suggested by the evidence.

For several years, concluding in 2023, Corbin M. Campbell, a professor at American University, directed the College Educational Quality study. She reports her conclusions in her 2023 book Great College Teaching: Where It Happens and How to Foster It Everywhere. She and her team observed 732 courses at nine different colleges and universities using a sophisticated rubric of teaching methodologies. The first construct in the research-based rubric is active learning. The study was not limited to STEM disciplines but covered the whole range of college courses. She and her colleagues found that while the vast majority of college and university teachers lecture (87.6 percent), many of them (64.4 percent) also include some sort of activity. This indicates that teaching is probably changing, at least for the teachers in this study. But she also reports that “when we examined the quality of active learning more deeply ... then most courses faltered.” Many “activities” may amount to little more than window dressing, without altering the basic pedagogy of the course. Many teachers have heard of active learning and are making some effort to experiment with it, but slowly and inconsistently.

One of the study’s more striking conclusions is that the quality of teaching at regional public universities is generally higher than at the flagship research university they studied. (This is not a surprising finding, for reasons we will address in a moment.) Campbell’s overall assessment of college teaching is that it is “middling”—neither impressive nor disastrous, better in some places than in others, but tending to mediocrity.

Why Is Teaching Mediocre?

The problem is that teaching well does little or nothing to advance a faculty member’s career, and teaching badly does little or nothing to limit it. Why? Because almost all the information that colleges and universities collect and pay attention to in recruiting, hiring, promoting, and rewarding their faculty has little or nothing to do with teaching performance. The reason for that is quite straightforward: there is no credible evidence of the quality of teaching performance. Nobody measures it, and at most institutions, nobody even describes or defines excellent teaching.

What nearly every institution does define, describe, and measure is research performance. In 2009, two economists, Dahlia Remler and Elda Pema, produced a paper for the National Bureau of Economic Research with the provocative title “Why Do Institutions of Higher Education Reward Research while Selling Education?” Assessing the research of candidates for faculty positions and faculty members seeking promotion and tenure, they found, is easy because there is a vast infrastructure for doing so: “Specifically, there is an extensive existing system of peer-review for research, including journal rankings, academic presses, and grant review agencies.” Faculty members, they point out, devote considerable time to reviewing submissions to academic journals and reviewing grant proposals as well as writing papers and grant proposals. When hiring or promotion committees look at a resume, they can consider not only how many articles, chapters, and books a candidate has published but the “impact” numbers: how often the publication has been cited by others, and in what other publications.

The effort and resources that go into producing, publishing, evaluating, and critiquing academic research continue to increase. Doctoral students hoping for faculty jobs know that their prospects depend on being able to do research, publish research, and document their contributions to the discipline. And they know that the further they rise in the faculty reward system, the stricter the scrutiny will be. Research is obviously a central and vitally important function of universities, which is why the Trump administration’s attack on research funding is such a serious threat to national welfare. But research is not the only function of colleges and universities, and institutions collect almost no evidence about the teaching accomplishments of candidates for hiring or promotion. In part, this is due to a tacit assumption that good researchers will necessarily be good teachers. As the late James Duderstadt, emeritus president of the University of Michigan, put it in his 2000 book A University for the 21st Century:

Teaching and scholarship are integrally related and mutually reinforcing and their blending is key to the success of the American system of higher education. Student course evaluations suggest that more often than not, our best scholars are also our best teachers.

This claim is false on almost every level. Student evaluations do not show that the best scholars are the best teachers. And when the claim that teaching and research are “mutually reinforcing” has been tested, it has been squashed like a bug by a large volume of evidence. In 1996, four years before President Duderstadt addressed the topic, John Hattie and Herbert Marsh conducted a meta-analysis of the best studies of the relationship between teaching and research. “The evidence,” they concluded, “suggests a zero relationship.” So then why do so many brilliant scholars believe otherwise? “We must conclude,” Hattie and Marsh write, “that the common belief that research and teaching are inextricably entwined is an enduring myth. At best, research and teaching are very loosely coupled.”

The means of evaluating teaching used by almost all colleges and universities, as Duderstadt suggested, is the student evaluation of teaching (SET), the forms that students fill out near the end of the semester asking how the teacher did. But a number of studies indicate that these evaluations tell us little of value about how or how well teachers teach. One such study was conducted from 2000 to 2007 at the US Air Force Academy. Unlike other four-year colleges and universities, the Air Force Academy does not let students choose their own teachers. Instead, students are randomly assigned to teachers in the required introductory courses and then randomly assigned to follow-on courses. All sections of a given course use the same texts and tests. The elements of self-selection and interference are thereby eliminated, making this much closer to a random trial. The authors of the study, Scott E. Carrell of UC Davis and James E. West of the Air Force Academy, found that the higher the ratings students gave their professors in the introductory courses, the worse the students did in the follow-on courses. “Student evaluations,” they noted, “are positively correlated with contemporaneous professor value-added and negatively correlated with follow-on student achievement.” Students got high grades from the professors they rated highly in the introductory classes, but they performed worse in subsequent classes.

This is just one of several studies that should undermine our confidence in student evaluations. In 2017, three Canadian scholars (Bob Uttl, Carmela A. White, and Daniela Wong Gonzalez) conducted a thorough meta-analysis of decades of research on SET, which concluded:

Despite more than 75 years of sustained effort, there is presently no evidence supporting the widespread belief that students learn more from professors who receive higher SET ratings. If anything, the latest large sample studies show that students who were taught by highly rated professors in prerequisites perform more poorly in follow up courses.

The system for evaluating the quality of teaching tends to report the opposite of the truth about the effectiveness of teachers at advancing student learning. Colleges and universities have created large, expensive, and interconnected systems for continuously supporting and monitoring the quality of faculty research, but they have no reliable system at all for monitoring the quality of teaching. The SET system survives because it is easy to administer and cheap to execute, and it provides institutions with a fig leaf of credibility as to their educational mission. But nobody who is paying attention takes it seriously.

The reputations of prestigious universities are based almost entirely on their research accomplishments. They can show, prove, advertise, and promote their research accomplishments to the world. They can, and do, claim to have great teachers, but with no common standards and no public evidence of teaching accomplishment, they certainly can’t prove it. Recall that Corbin Campbell’s study of teaching found that the quality was generally higher at regional universities than at the flagship research university they studied. That is in large part because research universities put even more emphasis on research, and proportionately less on teaching. “Exceptional teaching,” Campbell concludes, “largely, does not currently bring institutions increased reputation, increased enrolments, increased rankings, or increased funding. Exceptional teaching, largely, does not help individual faculty garner prestige in their careers.” Why don’t teachers teach better? Because no one would know, and therefore no one would care, if they did.

What Can Be Done?

There is another reason why the evaluation of research is a serious process, and the evaluation of teaching is not. In the realm of faculty research in any given discipline, there are explicit criteria for excellence. In the realm of teaching, by and large, there are no such criteria. I am not, of course, suggesting that those research criteria are always defensible. But even where the standards are subjective and open to question, there are standards. Yet when it comes to teaching, it is very hard to identify any commonly accepted standards. What are the minimum requirements for a teacher? Show up to class and assign grades at the end of the term. Almost all college teachers do a great deal more than that. But they do not share a common understanding of either what they are trying to do or how they are trying to do it.

Carl Wieman received the Nobel Prize in physics in 2001. He has devoted much of his attention since to raising the standards of science teaching. His starting point is the absence of any coherent existing standards: “The lack of agreed-upon standards for teaching quality allows everyone to consider themselves to be a good teacher by some standard, and most do.” Now at Stanford University, he did the foundational work for his new approach at the University of British Columbia (UBC) in Canada. He has sought to develop a means of evaluation that is valid (correlates with desired student outcomes), fair (judges teachers by factors under their control), and guides improvement (shows teachers how to teach better). “Student course evaluations,” he points out, “fail badly at meeting any of these criteria.”

Along with Sarah Gilbert from UBC, he developed the Teaching Practices Inventory (TPI). This is a questionnaire filled out by the teacher that takes ten minutes or so to complete for a given course. It asks for information about many aspects of that class: out-of-class assignments, quizzes and tests, discussions and group work, class activities. Since it was first developed at the Carl Wieman Science Education Initiative (CWSEI) at UBC in 2014, the TPI has been tested extensively at several institutions. Wieman summarises the results this way: “The TPI shows a high degree of discrimination across a typical sample of university faculty, with the highest scoring faculty also having very high measures of student learning outcomes. TPI results allow meaningful comparisons to be made across faculty, departments, and institutions.”

But even if we can assess teaching, can we improve it? One of the most pernicious and destructive myths colleges use to evade responsibility for the quality of their work is that good teachers are born, not made. This is nonsense. Teaching is a skill like bowling or driving, not an inherited trait like height or hair colour. It can therefore be taught and learned. Wieman reports: “I have seen that faculty can reach a respectable level of teaching expertise in something in the range of fifty hours of training; less time than is required to complete most university courses. That is sufficient to allow faculty members to switch from teaching by traditional lecture and exams to research-based methods.” Many of those who take four to six years to complete their doctoral degrees receive no serious preparation for teaching at all.

While the TPI was developed for use in university science classes, it has been adapted to other disciplines. It is not a comprehensive solution, but it is an example that demonstrates the possibility of meaningful teaching evaluation. And it can provide a basis and framework for what would be the gold standard in quality improvement: peer-review of teaching, in which teachers review one another’s teaching by common standards, much as they currently review one another’s research. We can identify, with a high degree of confidence, what good teaching looks like. Leaders like Corbin Campbell have advocated powerfully for reform. Jonathan Zimmerman of the University of Pennsylvania, in his delightful if depressing history of college teaching, The Amateur Hour, has traced the failure of teaching reform back over a century and found little evidence of progress. He advocates widespread adoption of peer-review of teaching.

I wish those advocates the best of luck. But I do not see how we will make sustained progress until we bring the problem into the light. Today, the information that guides decisions about teaching is almost all information about things other than teaching, chiefly research. Current arrangements all but guarantee that colleges and universities act in ignorance of how their actions affect teaching. Teaching is a core activity of the college, but it lives in a shadow realm where decision-makers cannot see it clearly. Corbin Campbell reports speaking to a conference in 2016 about her research on teaching quality:

A senior administrator at an Ivy League institution asked to meet with me after and told me directly, “I’m not sure why you are pursuing this agenda. Highly ranked universities should not have to focus on teaching—their students don’t need it and knowledge production would suffer.”

That administrator was certifiably and definitively wrong. The students do need it, and the evidence to support that claim would take many pages to even summarise. The advent of artificial intelligence will change education, but it will not change the fundamental educational equation: Learning is a product of what students do. As Charles William Eliot wrote over a century and a half ago, “A mind must work to grow.” Add AI to lecture and you have a prescription for disaster. But engage students in active learning in which they must face one another and remember, think, and engage, and we might have a chance. Getting the present leadership and management of higher education to change won’t be easy. They seek equilibrium. I’ve written a book about how hard this is and why. But today, colleges and universities are failing their students. So far, their main approach to this failure has been to change the subject, hide the evidence, and pretend that everything is fine. It isn’t. 

Every student thinking about going to college and every parent of every student planning to go to college should ask every representative of every college two questions: “How do your teachers teach?” And “How do you know?” Until colleges can answer those questions, they are not worth the money we’re paying for them. Forking out a king’s ransom to attend the University of Going-Through-The-Motions makes less and less sense every year. Excellent teaching should be the first and essential project of every institution of higher learning. Learning is what college is for. And the students do need it.