>> Stuart Schechter: So many a great computer scientist has stood at this podium. It’s not uncommon
to hear our most distinguished speakers referred to as the rock stars in our field. Yet the
rock star metaphor breaks down without the words “in our field” there as a proviso. After all, how many
computer scientists do you know who go from recording to recording, surrounded by an entourage and
paparazzi? How many have an international fan base that includes the key demographics that we want
to market to, the teens and tweens? How many of our so-called rock stars have such a big following that
they’ve had to abandon rooms like this and classrooms, and now play to sellout audiences in theaters?
Well, today I have the privilege of introducing a speaker who’s worked so hard at broadening interest in
our field that he can basically be described as a rock star without “in our field” as a proviso. A speaker
who performs for crowds at Harvard’s Sanders Theater and by video around the world, a speaker who,
to borrow from the words of Nigel Tufnel of Spinal Tap, has taken CS50
enrollments at Harvard and amped them up to eleven. I ask you to put your hands together, but please,
ladies and gentlemen, keep your undergarments to yourselves, for our rock star, Professor David Malan.
[laughter]
[applause]
>> David Malan: Thank you. That was a really sweet introduction. I appreciate it.
[laughter]
My name is David Malan. Stuart and I were actually grad students together. Let’s see, is there a way to
lower my voice in the room a bit? Let’s do this, there we go. That’s better. Stuart and I were grad
students together for several years. We shared the same advisor, a fellow named Mike Smith. The way
my story at Harvard begins is I graduated in two thousand and seven, a few years after Stuart. Mike, our
advisor, was the former instructor of CS50. He got promoted to Dean of the Faculty of Arts and Sciences.
There was this vacuum, with no one lined up to teach CS50 that fall. They originally had chatted with a
few of the actual faculty there. Since I had barely just finished taking off my cap and gown, I somehow
talked my way into this position. The plan was just to fill in for a year. Somehow, eight years later, it’s
been this amazing experience.
It truly has been this dream job in large part because I took CS50 myself. In fact the way we start a lot
of the CS50 lectures or semesters these days is that I tell students that I started off life at Harvard as a
government major or concentrator. That was largely because when I was in high school I kind of knew
what I liked. I liked history. I really liked Constitutional Law. When I got to college, in this really thick
course catalog, which was still a book at the time, it was just kind of natural for me to gravitate to
something very familiar.
In fact even in high school I still remember walking down the hallway of the Computer Science Lab or
the Computer Lab where the kids taking APCS or whatever it was were. I really viewed them as like the
geeks in the school. I had no genuine interest in computer science, even though as a kid I definitely
gravitated toward video games. I had my Commodore 64 and Macs, and PCs, and tinkering.
But the sort of academic interest was never there. I think that was in large part because I never really
understood what it was or appreciated that it was more than just holing yourself up in a computer lab
and coding. When I inherited the reins of CS50 in two thousand seven, really my goal was to try to turn
the course’s reputation on its head. It was always very well reputed. It was a good course but it was also
a course to beware. Something that was very daunting, not unlike computer science itself, for quite a few
people.
I would say that the prevailing theme for us over the past several years has been accessibility. Not
dumbing down the course, not chipping away at its historical workload or rigor, but trying to make the
onramp a lot smoother for those we now describe as less comfortable, students for whom computer
science is in fact daunting and who might not otherwise even set foot in the classroom without a bit of
encouragement, ultimately creating an exit ramp for everyone, so that they really feel that the delta between
week zero and week twelve is significant for them.
It’s really nice to see a lot of familiar faces. I hope to chat more personally later on. But for those
unfamiliar, CS50 is Harvard’s introduction to the intellectual enterprises of computer science and the art
of programming, which for us means it’s an amalgam of what most places would call CS1 plus CS2.
It’s a particularly rigorous semester where the workload for most students averages twelve, fifteen, or
more hours per week outside of the class.
But one of the commitments we make to students in the syllabus and verbally, and consistently
throughout the term, is a theme along these lines. What ultimately matters in the course is not so much
where you end up relative to your classmates but where you end up relative to yourself in week
twelve versus week zero.
Indeed these days we take into account students’ prior background or lack thereof. We have different
tracks within the course which I’ll address in some part today. Ultimately, students feel that they’re not
in competition with any other students, and that indeed they belong, so when they look out into the
audience they’re assured, hopefully, that they’re not in fact the only one there who knows very little
about computers or computer science.
In terms of the course itself it’s a fairly traditional syllabus. In fact I think the underlying curriculum
hasn’t fundamentally changed all that much. We start in week zero with a programming language called
Scratch which is one of these graphical programming environments from MIT’s Media Lab. But what’s
nice about it is that we can talk about loops and conditions, and procedures, and events, and even
threads in the very first week of the class. Then students for their first problem set are challenged to
just go make something of interest to them. Then they can share a URL with their family or friends, or
roommates and genuinely have that sense of accomplishment super early on.
What we hope is that, much like a ten or twelve year old for whom Scratch is typically intended, they
have this sense of empowerment, that they understand the logic, that these ideas that might otherwise
be expressed very cryptically with more traditional code and text are actually very familiar ideas of
looping and doing things conditionally.
We then very quickly transition to C, where we spend most of the semester these days, so CS50 remains
somewhat unique out there in that we’ve retained C as the core part of our curriculum. That’s because
we can have particularly low level conversations. We very quickly cover syntax and some of the
fundamentals of C, which itself is a pretty small language. We introduce a couple of weeks in the easiest of
data structures, arrays.
But we also start to talk about memory management, what’s actually going on underneath the hood in
terms of the stack and the heap. We introduce ideas like buffer overflows, exploits, and how certain
underlying attacks can be waged. We then transition to some algorithms and data structures fairly
common in a CS2 class, various searches and various sorts.
But we very quickly accelerate even to relatively sophisticated data structures. We look not only at trees
but also hash tables with singly linked chains, linked lists of course, and also tries. In fact one of the most
challenging problem sets for most students is mid-semester when they have to implement a spell checker
using a data structure of their choice.
But then toward semester’s end we context switch. We have them implement their own TCP/IP web
server these days, which bridges the gap between a command line environment and a web world, in
which we then use a bit of PHP and JavaScript, SQL, HTML, CSS. That is an off ramp for students.
Indeed half of the students in CS50 traditionally and today take no further computer science. They’re
exposed to a web environment and, dare I say, a more familiar and a more useful world, at least so far as
a lot of their own personal interests go.
Even though we spend maybe ten percent, twenty percent of the semester formally doing anything web
related, some eighty or ninety percent of students’ final projects these days are perhaps appreciatively
web based.
What I thought I’d do today is give you a sense of some of the very deliberate decisions we’ve made
over the past few years. Please interject at any point with any questions. But what I thought I’d do first
is paint the picture with some music and video that our Production Team put together in
anticipation of prefrosh weekend, a.k.a. Visitas, just a few weeks ago, which is when the high school
seniors who’ve been accepted come to town to see if they’d like to attend Harvard, but in our
case take CS50. This is what CS50 looks and feels, and sounds like today in words and pictures.
[video]
[music]
This is CS50, Harvard University’s introduction to the intellectual enterprises of computer science and the
art of programming, dedicated, and passionate, and committed.
>>: Now Duckface, Duckface, David.
[laughter]
>>: This is Chang at the Hackathon. I’m here today with about four thousand students and we’re going
to find out what they’re up to working on their final projects.
>>: At this point in the night we’re starting to hit the coffee pretty hard.
>>: [indiscernible]
>>: We are here at the twentieth consecutive hour of the CS50 Hackathon [indiscernible].
>>: Why study computer science? Because it is the most important subject.
>>: Really be mentally prepared because you’re going to love it.
[video ended]
>> David Malan: I guess a couple of notes. One, Steve Ballmer kindly came to town last fall and gave a
guest lecture in CS50 and kindly shot some of that material with us. The little teaser there is that CS50,
for those of you from Harvard, will also be offered at a place called Yale this coming fall in parallel, but
more on that in just a moment.
[boo]
Oh, this is a good thing.
[boo]
It would be boo if their CS class were being offered at Harvard, I think.
[laughter]
Glad this is on video, isn’t it?
[laughter]
We, like universities everywhere, have really benefited from a burst of excitement in recent years around
computer science. But there’s been a few compelling inflection points for us in particular that we can
see in some of the data. For those of you who took CS50 or TF’d it some years ago, you might recall
some of these numbers. Back when the course first debuted in nineteen eighty-nine we had about a
hundred students or two hundred students in a typical semester.
The year I happened to take it myself was when a very famous computer scientist named Brian Kernighan
was actually moonlighting at Harvard and took the reins of the course that semester. The course
peaked at some four hundred students that semester. But we, like most places, began to see a sort of
decline in interest, such that the course flatlined around a hundred students again after the so-called
dot-com era.
But around two thousand seven is when we began to introduce a number of changes to the course:
changes to the course’s tone, to its messaging of accessibility, and also ultimately its support structures.
Between two thousand six and two thousand seven alone we went from some one hundred thirty-two
students to two hundred eighty-six over just the course of a year.
These were numbers that we saw even just after a week or two of the course in which we introduced a
bit of Scratch in week zero. But again, only in week zero. We then very quickly transitioned the next
week to a more traditional environment. But this was in stark contrast, for instance, to what we did for
many years, which was introducing students to a bit of assembly in the very first days of the class. Really
working your way from the ground up, so compelling perhaps as a narrative, but not necessarily the most
encouraging or accessible message to students.
Instead we introduce something like that much later now in the term. What was striking is that in two
thousand eight enrollment began to creep up a bit more. Since then, for those of you who are familiar with
Harvard’s undergrad curriculum, for the first time we overtook Ec10 this past year. CS50 is Harvard’s
largest class with some eight hundred students.
That too has been an exciting inflection point. What class looks like in the very first week of school now
is a little something like this. This is Sanders Theater, to which Stuart referred earlier. Now in fairness
by mid-semester I would say there’s plenty of seating in the balcony. People tend to watch the course
increasingly online, for instance, but that too has been a very deliberate pedagogical measure.
Indeed we film absolutely everything. We live stream absolutely everything. In fact even though the
class’s lectures are Mondays and Wednesdays at one p.m., there’s a non-trivial number of students who
watch it live on video from home, in bed, at one p.m. We nonetheless are connecting with
students in any number of ways.
The demographics too have been shifting. For quite some time the class was dominated by
sophomores. This was in part I think because Harvard tended to have a lot of requirements for incoming
freshmen, language requirements, writing requirements, those kinds of things. But also because I think
coming off the heels of high school, wherein APCS has never been particularly compelling, I think, at least
in recent years, and where most students just have a very bland impression of computer science, as I did, as
programming, it takes a good year or so for students to become curious. Much like I did some years ago.
It’s increasingly becoming more and more freshman based. But we have a hundred or two hundred
students among the sophomores, juniors, seniors as well. We have a hundred and fifty plus extension
school students. That’s Harvard’s continuing ed program. For the first time three years ago HBS
agreed to start giving MBA students credit for taking CS50 as a technical elective. We now have
students from across the river.
We have, as part of a historical outreach program over the past few years that will be massively
accelerated this fall, a dozen or so students from local high schools taking CS50, coming to campus for
sections led by a Harvard TF but watching lectures and other resources perhaps online, and then some
cross registrants from Harvard’s other schools.
But what’s also changed since two thousand six and prior is the composition of the course’s
demographics. What this chart represents is three buckets of students, those less comfortable, those
more comfortable, and those somewhere in between. We have no formal definitions of these terms.
But at the start of the semester we ask students to self-describe. Are you less comfy, more comfy, or
are you unsure, somewhere in between?
If you’re less comfortable with the idea of taking a computer science class, you know it if you are. What
that means is that you’ll be put into a section or recitation with only fifteen or so students with the same
comfort level and background. You don’t feel at any point in the term that you’re
competing with those more comfortable.
What’s been changing here is that, if this is the more comfortable demographic and this the somewhere in
between, in two thousand eight these were the less comfortable students. It’s this demographic, and I
only had the data here handy up through two thousand twelve, by which the course is increasingly taken:
those less comfortable. Students from all walks of life across campus, well beyond the STEM fields, who
are interested in a little bit of computer science, and hopefully bringing that back to their own domain.
This was a poster we put up a year ago to emphasize that message, that contrary to what most
students think here, and dare I say quite a few other places, when you look to your left and you look to
your right it is not in fact the case that both of those people know more than you. In fact seventy-eight
percent of students last year who took CS50 had no prior background.
What we hope to have done over the past few years is introduce a significantly more robust support
structure for students. Sections again are these opportunities for students to meet more intimately
than lectures in Sanders Theater allow, for Q&A, more examples, and the like. We do have different
tracks now for those students. Walkthroughs are something we introduced some time ago, along with
changing the narratives that we use in the course’s problem sets. I dare say for many
years the problem sets themselves were fairly succinct technical documents, but they assumed I think a
sophistication.
You might say, you know, change into your pset2 directory and download this file. But for those less
comfortable, and indeed for a majority of the students, that glosses over so many arguably mundane
steps, like cd and ls and cp, and all of the sort of Linux commands that some of us might gravitate to
quite readily.
But for students it’s not been all that clear. The problem sets now are significantly more detailed. They hold
students’ hands significantly through the minutiae, the stuff that’s not at all intellectually interesting. But
embedded throughout the problem sets as well are short videos that have been shot by some
of the course’s best teaching fellows, embedded right there seamlessly for a couple of minutes here
and there, that punctuate the narrative of the text and hold students’ hands through possible design decisions.
They provide hints but certainly never spoilers or solutions. With a problem set that might
be around data structures we might propose, well, you might go down this road, or you might go
down that road, and then it’s left to the student to decide. But they’re meant to address this FAQ when
they’re faced with some twenty page specification of a problem set these days: Where do I begin? How
do I improve my code iteratively?
Then the problem sets themselves were a target in our sights in two thousand seven. Back in my day one of
CS50’s problem sets was a prefix tree, which is a wonderful opportunity to explore tree based data
structures and recursion, and various algorithms thereon. But it’s hard to get excited, I would say, about
a prefix tree, even though I marginally got a little bit excited about these things. But surely we can
explore some of those same ideas in a more exciting or more familiar context.
Over the past several years we have rebooted all of the course’s problem sets, such that the first one is
indeed Scratch. The challenge there is just go design something of interest with a few dozen or so
puzzle pieces, so to speak. It’s this drag and drop language where pieces interlock together if it makes
logical sense to do so.
For the first problem set in actual C we have them do some fairly small programs. But they’re faced for the
first time with fairly arcane syntax, for loops, and while loops, and variables. But they implement like
Mario’s pyramid, sort of implementing something that’s akin to a video game they might have seen. Or
some number of other small projects.
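To give a feel for the kind of small, loop-driven program described here, the following is a minimal sketch in C of a Mario-style pyramid. The talk does not specify the pyramid's exact shape or the assignment's interface, so the right-aligned layout and the function names `pyramid_row` and `print_pyramid` are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

// Build one row of a right-aligned pyramid into `out`.
// `row` runs from 1 (top) to `height` (bottom); `out` must hold height + 1 chars.
void pyramid_row(int height, int row, char *out)
{
    assert(height >= 1 && row >= 1 && row <= height);
    memset(out, ' ', (size_t) (height - row));       // leading spaces
    memset(out + (height - row), '#', (size_t) row); // then the bricks
    out[height] = '\0';
}

// Print the whole pyramid, one row per line, using a for loop.
void print_pyramid(int height)
{
    char line[64];
    assert(height >= 1 && height < (int) sizeof line);
    for (int row = 1; row <= height; row++)
    {
        pyramid_row(height, row, line);
        printf("%s\n", line);
    }
}
```

Calling `print_pyramid(4)` from a `main` function would print four rows, each one brick wider than the row above it.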
But then we begin to introduce domain specific problem sets. For the Crypto problem set we have a
supermajority of students implement the standard edition of the problem set, which is to implement a
few ciphers, like a Caesar cipher, a Vigenère cipher, like rotational ciphers. Where A becomes B, B
becomes C, and so forth. But for those particularly more comfortable we also have throughout the term
what we call hacker editions of most problem sets, as opposed to the standard edition. These are not
offered as extra credits. They’re not just additional work they’re completely distinct problem sets that
in spirit cover the same topics, but from a much more sophisticated angle.
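The rotational idea, where A becomes B and B becomes C, can be sketched in a few lines of C. This is only an illustration of a Caesar cipher, not the course's own specification; the function names `rotate_char` and `caesar_encrypt` and the choice to leave non-letters untouched are assumptions.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

// Rotate a single letter by `key` positions, wrapping around the alphabet.
// Non-letters are returned unchanged (an assumption of this sketch).
char rotate_char(char c, int key)
{
    int k = ((key % 26) + 26) % 26; // normalize so negative keys also work
    if (isupper((unsigned char) c))
    {
        return (char) ('A' + (c - 'A' + k) % 26);
    }
    if (islower((unsigned char) c))
    {
        return (char) ('a' + (c - 'a' + k) % 26);
    }
    return c;
}

// Encrypt `plain` into `out`, which must hold strlen(plain) + 1 chars.
void caesar_encrypt(const char *plain, int key, char *out)
{
    assert(plain != NULL && out != NULL);
    size_t n = strlen(plain);
    for (size_t i = 0; i < n; i++)
    {
        out[i] = rotate_char(plain[i], key);
    }
    out[n] = '\0';
}
```

Decryption is the same operation with the key negated. A Vigenère cipher would differ only in choosing the key per character from a keyword rather than using one fixed shift.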
For instance in the Crypto pset, for those more comfortable, the hacker edition has students be handed
an /etc/passwd file that has usernames and hashed passwords. They need to write code that cracks
those passwords and make intellectual arguments as to why their technique might solve the problem
more quickly than say brute force alone might allow. We hand them a dictionary with some tens of
thousands of words that they can employ if they might want to try to reverse engineer those passwords.
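The argument for a dictionary over brute force alone is that it tries a few tens of thousands of likely candidates first instead of every possible string. A minimal sketch of that idea in C follows. Real password files use salted, crypt-style hashes, which the talk does not detail, so a toy djb2-style hash stands in here; `toy_hash` and `dictionary_attack` are hypothetical names, and the hash is not cryptographically secure.

```c
#include <assert.h>
#include <stddef.h>

// Toy stand-in for a real password hash (djb2-style); NOT secure,
// used only so the sketch is self-contained.
unsigned long toy_hash(const char *s)
{
    unsigned long h = 5381;
    while (*s != '\0')
    {
        h = h * 33 + (unsigned char) *s++;
    }
    return h;
}

// Hash every dictionary word and compare against the target hash.
// Returns the matching word, or NULL if no word in the list matches.
const char *dictionary_attack(unsigned long target, const char *words[], size_t count)
{
    assert(words != NULL);
    for (size_t i = 0; i < count; i++)
    {
        if (toy_hash(words[i]) == target)
        {
            return words[i];
        }
    }
    return NULL;
}
```

If the dictionary pass returns `NULL`, a cracker would fall back to enumerating candidate strings by brute force, which is where the cost grows exponentially with password length.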
We then draw upon the world of Forensics. Every semester we either go around campus taking actual
photos of people, places, or things. Or we troll around Facebook these days and just simulate as much.
Then we put them onto a media card that every semester I accidentally corrupt or format. Then we use
dd, a Linux command line utility, to make a forensic image of the card, literally just copying all the bytes
off that card into a file that can then be given to all eight hundred students.
They have to write C code with which to recover the JPEGs that were once on that memory card. Even
though JPEG itself is a fairly sophisticated file format, it turns out that if you see a
pattern of four well defined bytes here, and here, and here in a sequence of zeros and ones, then with high
probability those demarcate the start of JPEGs.
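The signature check at the heart of such a recovery program can be sketched as follows. The talk does not name the four bytes, so this sketch assumes the common JPEG pattern of `0xff 0xd8 0xff` followed by a byte whose high nibble is `0xe`. It also assumes the image is scanned in 512-byte blocks. The names `looks_like_jpeg_start` and `count_jpeg_starts` are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 512 // assumed block size of the forensic image

// True if the first four bytes of `block` match the assumed JPEG signature:
// 0xff 0xd8 0xff, then a byte in the range 0xe0 to 0xef.
bool looks_like_jpeg_start(const uint8_t *block)
{
    return block[0] == 0xff && block[1] == 0xd8 && block[2] == 0xff
           && (block[3] & 0xf0) == 0xe0;
}

// Walk the image one block at a time and count blocks that begin a JPEG.
size_t count_jpeg_starts(const uint8_t *image, size_t length)
{
    assert(image != NULL);
    size_t found = 0;
    for (size_t offset = 0; offset + BLOCK_SIZE <= length; offset += BLOCK_SIZE)
    {
        if (looks_like_jpeg_start(image + offset))
        {
            found++;
        }
    }
    return found;
}
```

A full recovery program would, instead of counting, open a new output file at each detected start and keep writing blocks to it until the next signature appears.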
Once the students get this code just right they go from nothing in their directory other than their C code
to an entirely filled directory with fifty images that are generally photos of computer scientists on
campus. I’ve kind of learned how best to do this over the years. One year we thought it
would be a good idea to put photos of CS faculty at Harvard in the image, because the
icing on the cake of this pset is a scavenger hunt, whereby students are challenged, and we give
them some fabulous prize to incentivize this, to go find as many of the computer scientists that they
had recovered and take a photograph with them.
Well, it turns out if you forget to apprise your colleagues of this in advance it leads to some very
awkward encounters and knocks on the door.
[laughter]
We’ve retooled; now we use the course’s own staff. Now we just get creepy late night knocks on the
door since we’ve been escalating how good the prizes are. But the goal is to collect as many CS photos
or selfies as you can with people.
But in the same problem set we give students, for instance, a bitmap file, a BMP file, that has a lot of red
noise. There’s like a hidden message that you might remember from childhood. I remember pulling like
a sheet of red plastic out of a cereal box or something with which you can then see the hidden message.
But we have students implement or simulate exactly that optical effect by taking that noisy image and
somehow reducing the noise, by writing C code that iterates over the bytes in the file, understanding
that some of these represent red, some green, some blue. So very quickly do we have
them traversing a file system and understanding how data underneath the hood can be represented.
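One possible way to simulate the red-plastic effect is sketched below in C. It treats the image as an array of 24-bit pixels stored in blue, green, red order, as in a 24-bit BMP, and turns every pure-red pixel white so that whatever is not red noise stands out. Header parsing and row padding are omitted. The struct `pixel` and function `filter_red_noise` are hypothetical names, and this is only one of several filters a student might choose.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// One 24-bit pixel in the byte order used by 24-bit BMP files.
typedef struct
{
    uint8_t blue;
    uint8_t green;
    uint8_t red;
} pixel;

// Replace every pure-red pixel with white, leaving all other pixels alone,
// so the non-red message is no longer hidden among the red noise.
void filter_red_noise(pixel *pixels, size_t count)
{
    assert(pixels != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
    {
        if (pixels[i].red == 0xff && pixels[i].green == 0x00 && pixels[i].blue == 0x00)
        {
            pixels[i].green = 0xff;
            pixels[i].blue = 0xff;
        }
    }
}
```

In a complete program the pixels would be read from the file after its headers, filtered, and written back out along with the unchanged headers.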
Then there’s this Mispellings pset, deliberately misspelled here, wherein students are given an English text file
of some hundred and fifty thousand words and they’re challenged to load it into memory somehow as
quickly as they can, minimizing their RAM use, minimizing their CPU time. Then implement another
function that checks given words, like large corpuses of text, against the dictionary they have built.
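Since students pick their own data structure, there is no single solution, but a hash table with singly linked chains is one of the options mentioned in the talk. The sketch below shows that option in C under stated assumptions: the bucket count, the maximum word length, the djb2-style hash, and the names `add_word`, `check_word`, and `unload` are all illustrative choices, not the course's distribution code.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define BUCKETS 1024 // arbitrary table size chosen for this sketch
#define MAX_WORD 45  // assumed upper bound on word length

typedef struct node
{
    char word[MAX_WORD + 1];
    struct node *next;
} node;

static node *table[BUCKETS]; // each bucket heads a singly linked chain

// Copy `src` into `dst` in lowercase; false if the word is too long.
bool to_lower_copy(const char *src, char *dst)
{
    size_t len = strlen(src);
    if (len > MAX_WORD)
    {
        return false;
    }
    for (size_t i = 0; i <= len; i++) // <= also copies the terminator
    {
        dst[i] = (char) tolower((unsigned char) src[i]);
    }
    return true;
}

// djb2-style string hash reduced to a bucket index.
unsigned int bucket_of(const char *lower)
{
    unsigned long h = 5381;
    for (; *lower != '\0'; lower++)
    {
        h = h * 33 + (unsigned char) *lower;
    }
    return (unsigned int) (h % BUCKETS);
}

// Insert a word at the head of its bucket's chain.
bool add_word(const char *word)
{
    assert(word != NULL);
    node *n = malloc(sizeof(node));
    if (n == NULL || !to_lower_copy(word, n->word))
    {
        free(n);
        return false;
    }
    unsigned int b = bucket_of(n->word);
    n->next = table[b];
    table[b] = n;
    return true;
}

// Case-insensitive lookup: walk only the chain the word hashes to.
bool check_word(const char *word)
{
    assert(word != NULL);
    char lower[MAX_WORD + 1];
    if (!to_lower_copy(word, lower))
    {
        return false;
    }
    for (node *n = table[bucket_of(lower)]; n != NULL; n = n->next)
    {
        if (strcmp(n->word, lower) == 0)
        {
            return true;
        }
    }
    return false;
}

// Free every node so the program releases all the memory it allocated.
void unload(void)
{
    for (int b = 0; b < BUCKETS; b++)
    {
        while (table[b] != NULL)
        {
            node *next = table[b]->next;
            free(table[b]);
            table[b] = next;
        }
    }
}
```

The trade-off students face is visible in `BUCKETS`: more buckets mean shorter chains and faster checks but more memory, while a trie would trade memory for lookups that depend only on word length.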
What we do is focus not on the theoretical asymptotic running time of their choice of algorithms, but rather
the actual CPU time and the actual RAM usage, the real world costs that they incur. We have a little
leader board on the course’s website that if students opt into they can then see themselves either atop
or below their roommates.
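Measuring actual CPU time rather than asymptotic running time can be done in standard C with `clock()`. The talk does not say how the course's leader board takes its measurements, so the sketch below is only one plausible approach. `cpu_seconds` is a hypothetical helper, and `sample_work` is a stand-in workload rather than a real dictionary load.

```c
#include <assert.h>
#include <time.h>

// Hypothetical workload standing in for loading a dictionary or checking a text.
void sample_work(void)
{
    volatile unsigned long sum = 0; // volatile so the loop is not optimized away
    for (unsigned long i = 0; i < 1000000UL; i++)
    {
        sum += i;
    }
}

// Return the processor time, in seconds, consumed by one call to `work`.
double cpu_seconds(void (*work)(void))
{
    assert(work != NULL);
    clock_t start = clock();
    work();
    clock_t end = clock();
    return (double) (end - start) / CLOCKS_PER_SEC;
}
```

Because `clock()` reports processor time used by the program rather than wall-clock time, two runs of the same code can still differ slightly, which is one reason a leader board might average several runs. Memory usage would need a separate measurement.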
Remarkably, even though this assignment is fairly challenging unto itself, it’s led to this fascinating side
effect, where invariably a roommate gets to appear on the big board, as we call it, and then he or she goes
to dinner or something like that. Then comes back and realizes that his or her roommate has edged
them out.
It’s remarkable for incentivizing students to put in all the more time, so that you get this sort of arms
race as to who can best whom, and we end up with particularly advanced techniques. We also eventually
figure out who the future TFs are, as in who figures out how to game the system entirely, sort of using
zero RAM and zero time, which we sometimes have to manually inspect. But it’s been a fun way of adding a
bit of competition on an opt in basis to make something that’s otherwise perhaps the most challenging
C pset into something that’s a little more alluring.
Then towards the end of the semester we introduce a couple of web based problem sets. C$50 Finance
has students implement in PHP and SQL, with HTML and CSS, an e-trade.com-like website, where they
have to simulate buys and sells of stocks by querying Yahoo Finance for semi-real time
data, so that you can actually see what a share of Microsoft or some other company might cost. Here
too we see who the clever CS folks or economists are, whereby there’s always someone who quickly
realizes that Yahoo Finance isn’t really real time data.
Here too we often have a leader board where we give all the students in the class ten thousand virtual
dollars. Then we encourage very reckless behavior for a week as to who can grow that ten K into the most
money possible. Invariably there’s a non-zero number of students who realize, well, if Yahoo Finance is
like five minutes delayed and that’s how we drive the leader board, they can just turn on Bloomberg or
some web equivalent, see fifteen minutes into the future, then trade penny stocks and double their
money very rapidly. I think the leader the last time we did this had turned the ten K
into four quadrillion dollars in seven days.
[laughter]
There are some disclaimers that this couldn’t quite happen in the real world. But it also gives us a
veritable list of potential future TF candidates. Then more recently we have rolled out Breakout, for
instance. You might know this game from yesteryear where you bang the ball against the bricks. But we
have a GUI based problem set these days. In fact this is one of the harder things about using C still, in
that it’s really hard to do really cool things or really interesting visual things early on.
But thanks to Eric Roberts at Stanford, one of the faculty there, who put together something they call the
Stanford Portable Library, which essentially is a layer that allows you to write C code. Then behind the
scenes it talks to a Java VM that actually handles, using Swing graphics, an actual GUI. We actually have
students use a virtual machine these days whether they have a Mac or PC, or Linux box. Everyone runs a
uniform Ubuntu environment so we can take advantage of desktop displays as well.
This web server is written in C, as I alluded to earlier. That bridges our command line world to the web
world. Then finally this year the last problem set introduced students to a bit of JavaScript by way of a
Mashup. Taking Google News, taking Google Maps, mashing them together so to speak, so as to have a
clickable client side experience where you can click on a city, or search for a city, and get all of the news for
90210 or 02138, or any number of cities in the world.
This is to say that underneath the hood are all of the ideas that the course has traditionally explored. We
hope, and very deliberately have intended, for them to be packaged up all the more excitingly. But office
hours too are, dare I say, now a thing. Some of you might remember from Harvard or elsewhere that
office hours are too often for CS classes in computer labs in the basement of really depressing
buildings with fluorescent lights and cubicles. I’m remembering a very specific place at Harvard right
now.
We decided a few years ago to move these for a couple of reasons. One, we realized that around two
thousand eight, two thousand nine the undergrads were coming to office hours, to the computer lab,
pushing the workstation keyboard out of the way, and putting down their laptop, which kind of
suggested they didn’t need the workstation, let alone the computer lab.
But also we wanted to make them much more social not only for the students in the class. But we really
wanted to raise the profile of the class and the students, and their work by actually putting them along
side non-STEM students, non-CS students, so actually putting them in our case in the residential dining
halls or cafeterias at night.
What we’ve started doing on Mondays, Tuesdays, Wednesdays, and Thursdays is holding office hours
for three plus hours a night, this year from nine p.m. to midnight, most recently in Annenberg Hall, which is
the largest dining space on campus. This is a tragically representative photo of office hours these days,
whereby as of this past year we had mins of fifty students at office hours, early in the week shall we say.
Toward the end of the week we had maxes of three hundred fifty students coming to office hours per
night.
To support these students the course’s staff has grown significantly. We have about one hundred of us
now on the team. Myself as the instructor, generally about sixty Teaching Fellows who lead sections,
who grade work, hold office hours, and more. Also an army of Course Assistants or CAs, who are
wonderful volunteers and alumni of the course, who very graciously contribute a few hours a week. They’re
full-fledged members of the staff, but only hold office hours and work with their successors in the class.
What’s nice though is that as daunting as this might be we nowadays have some twenty to thirty or
more staff on duty at once. It’s also a much more energetic and much more collaborative experience,
within the bounds of the course’s syllabus, than it ever was back in the day. Indeed nowadays
students come to office hours during those time slots because it’s a place to go do your work, even if
you don’t have questions.
>>: Do you have any students that really need a computer lab? Like maybe they don’t have a computer.
>> David Malan: Very few these days, so like ninety-nine point something percent of students will have
their own laptops. But when we’ve encountered corner cases we’ve quietly supported the student with
hardware of their own. We’ve given it to them as needed. But Harvard’s good about providing financial
aid for equipping students with laptops. But that is something we’ve been sensitive to. This is among
the photos that we collected of office hours these days that do capture, I think, the energy, though the
energy is a little different on Thursday nights shortly before problem sets are due.
[laughter]
But we also introduced a number of cultural aspects to the course. None of these were all at once. All
of this is over the course of some eight years. But we now each year have, at the very start of the semester
during Harvard’s so-called shopping week, CS50 Puzzle Day, which is meant to send the message to
campus that CS is not about programming. It’s more generally about problem solving, about getting to
know other students and working collaboratively.
We invite students to register for this event working in teams of two or three, or four. Then we spend
three hours over lunch together working through a packet of problems. We typically have our friends
from Facebook come out who very graciously write the puzzles every year. Then students are
challenged for some fabulous prizes to solve as many puzzles as they can using their laptops if they
want. But absolutely no aspect of it requires code or necessarily requires a computer.
This has been a nice way I thing during the very start of the semester to help send the message of this
very collective, very shared experience. That at the end of the day puts everyone on the same playing
field. If you like to solve problems this is a place for you. More socially on Fridays we go to a place
called Fire and Ice. There’s a few of these around the country. But it’s sort of a fusion kind of place
where it’s a great place for lunch. We invite every Friday some fifty students to join myself and the
course’s TFs and CAs for lunch just to make hopefully a very big class seem small. We generally invite
friends from industry or alumni. In fact Microsoft joins us at least once a term every fall just to chat up
students in a recruiting capacity and sort of a life advice capacity, just generally talking about life outside
of Harvard and CS. Yeah?
>>: How are the students selected for those lunches? Is it based on like grades or work, or anything, or is
it totally random?
>> David Malan: It’s based on who RSVPs first. We generally announce it during lecture or at slightly
unpredictable times to round things up. But these days we rarely turn people away.
We used to have like thirty students; now we’ve grown to fifty students. It just kind of
works, and you get to know them too. It’s funny, there’s these accidental filters or signals of future staff, the
students who are really into the course or CS more generally because you often see a lot of frequent
faces. We don’t limit how many times you can come. We start to see some of the same faces again and
again. I see one face today of whom we have a wonderful photo from a CS50 lunch, who’s now cringing
in his seat. But I didn’t bring that with me today.
[laughter]
But the CS50 Hackathon is an event that we introduced a few years ago at the tail end of the semester.
The goal being to give students an immersive opportunity to work on their final projects. This is the last
project in the class, for which the sky is the limit, subject to their TF’s advice and approval, but students
are challenged to implement most anything of interest to them.
This is an experience that begins in the last week or so of the semester at seven p.m. and ends at seven
a.m. The goal is just to bring as many students together as they want to work on these final projects.
We serve generally pizza at nine p.m., Chinese food from a place called the Hong Kong sometimes at one
a.m. Those still standing as the video hinted are driven by Harvard shuttle buses to IHOP, the local
pancake place in the morning.
In fact this tradition started thanks to Microsoft a few years ago at NERD in Kendall Square in
Cambridge, the Research Center there. It’s a beautiful space that fits some two hundred, three hundred
students. Tragically, only in the past couple years have we outgrown NERD itself. We now use a space at
Harvard’s Business School that holds as many as nine hundred students. Indeed this past year we
had some six hundred plus students out of the eight hundred attend.
It too within CS50 has become a thing to do. Frankly it’s adorable. Some of my favorite
memories are of this event. It starts at seven p.m., and around eleven p.m. or midnight we see some of the
students come dressed in full formal wear, tuxes and gowns and all, from some house or residential
dance or formal, to then come to the CS50 Hackathon of all places for their after party.
[laughter]
It’s been really nice to see that level of interest honestly in CS and in the experience thereof. But of
course by three or four a.m. photos like these are easy to capture. Then lastly culturally we have the
CS50 Fair. We introduced this in two thousand eight, in large part because the final project
presentations that we had in two thousand seven and prior were just really boring. They were section
based. There were twelve or so students in the room. The TF would invite everyone to present their
project to their classmates.
I as the instructor would try to bounce around to as many of the classrooms as I could. But that just
wasn’t physically possible. But it wasn’t really a thing anyone wanted to do. We were just going
through these motions. In two thousand eight we decided to build something and see if people would
come. We invited not only our own students but all faculty and students, and staff on campus to come
see our students’ final projects, complete with music and popcorn, and food, and fabulous prizes, and
the like. Nowadays we have some two thousand plus attendees every year at the end of the
semester coming of all things to see students’ accomplishments in a computer science class.
This has been really gratifying to see, the interest not only in and among the students presenting but
among the students, and faculty, and staff coming to an event like this. In fact you can’t quite make
them out here but along the back row we began in two thousand eight a tradition of inviting folks from
Microsoft and Google, and Facebook, and VMware, and the like, some of the biggest names, to chat up
students about recruiting opportunities too, whether they’re just finishing CS50 or they’re juniors or
seniors and they’ve come back to check out the fair.
It’s been a nice synergy and energy, but where the focus ultimately is on students and balloons, and
popcorn. But looks like these are quite gratifying to see as students present their final projects to all of
the attendees. This is actually one of my favorite moments captured ever.
[laughter]
I thought I’d share just two final themes that have driven some of the course’s decision making over the
past few years. One has been data, our collection of it, and our response thereto. The other is a number of
academic policies. As CS50 has grown in size, we have begun to appreciate that we carry some weight
hopefully on campus with which we can try to push ahead what we or what I sometimes feel are in the
students’ best interest, but not necessarily consistent with certain policies.
Toward that end one of the things we look at is attendance. No, attendance is not factored into grades or required in
any capacity. But we look at it and track it either with counters occasionally or with just self reported
numbers. This is a chart that shows lectures from the start of the term to the end of the term. The y
axis shows the percentage of students who are taking advantage of in person attendance in the yellow
line or neither in the blue line, or who are watching the assets online by a live stream or on demand.
You can see there’s one trend that really jumps out at you. This is not surprising. It’s not unique to CS
or CS50. But attendance does kind of start to fall over time. That’s a pretty consistent pattern. But a
fun fact is to ask why it is so consistently spiky? Would anyone like to hypothesize what explains the
spike, yeah?
>>: Quizzes.
>> David Malan: What’s that?
>>: You have weekly quizzes.
>> David Malan: No, we do not have weekly quizzes. It’s even simpler than that, yeah?
>>: Mondays.
>> David Malan: Yes, so if you add labels to the chart revealing what days are Mondays, what days are
Wednesdays? You see consistently that Wednesdays are super less popular but god forbid we hold one
lecture the whole term on a Friday.
[laughter]
Attendance really tanks on that one Friday. But other data that we also look at and that guides some of
our decision making is this. For years we’ve used one of the course’s self developed tools called CS50
Discuss, a web based Q&A environment akin to Gmail’s UI, by which students can ask questions
on problem sets. Each of these numbers represents problem set zero, one, all the way through eight.
We have a more quantitative sense these days of the volume of questions, which often
correlates with the difficulty of the problem set or the poorness with which its text was written by me.
It’s helped us reveal where there’s opportunities for improvement. Students have relatively few
questions about problem set zero. Students had relatively fewer questions about problem set eight in
this semester. In large part because P set eight was intentionally or unintentionally a little easier than
we intended. It was the result of rolling out a new problem set.
But the fact that we’re getting some six hundred, eight hundred questions per week per problem set
also helps us gauge and schedule some of the course support structures, human and otherwise. This
chart’s a little more to [indiscernible] visually. But what you see here is again P set, zero, one, two
through eight. Then you see in different colors the wait times for students at office hours.
You have Monday, Tuesday, Wednesday, Thursday, so the theme and the worrisome theme here is that
the red bars become the tallest eventually. You can infer from this that problem sets are indeed due
around Thursdays or in fact Fridays. But this has been unacceptable and this data thankfully is from a
few years ago now.
But we were having students come to those office hours which, for as much excitement as there
hopefully was in my voice, was a horrible place to have to wait for help quite often. We were able to
track this data by way of not just using hands in the air or more casually answering students’ questions.
But when they arrived in the dining hall if they had a question they would open up the course’s website,
log in, and click a button to virtually raise their hand.
Then we had a CS50 greeter not unlike the genius bar in an Apple Store, then dispatching the staff to the
students or vice versa. Via this mechanism we were in theory able to load balance more effectively and
monitor things. But we were also able to collect a rich amount of data which was increasingly
worrisome over time.
In fact what began to happen we realized was that if there’s some inflection point of a few minutes wait
time, or if students already know they have to wait for n minutes upon arrival, the behavior evolved into
upon arrival you raise your hand, even if you don’t have a question. Effectively what happened to
what we thought was this brilliant software solution to a human problem is it just evolved into everyone
always being in the queue at every moment in time. It just didn’t really work in the way that we
intended.
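
The dynamic described above can be sketched with a toy first-come, first-served queue. This is only an illustration under stated assumptions, not the course’s actual software: the function name `wait_times`, the two staff members, and the ten-minute visits are all hypothetical.

```python
import heapq


def wait_times(requests, staff_count):
    """Return each request's wait in minutes under first-come, first-served.

    requests: list of (arrival_minute, service_minutes) tuples (hypothetical data).
    staff_count: number of staff answering questions in parallel.
    """
    # Each heap entry is the minute at which one staff member is next free.
    free_at = [0] * staff_count
    heapq.heapify(free_at)
    waits = []
    # sorted() is stable, so students arriving in the same minute keep their order.
    for arrival, service in sorted(requests, key=lambda r: r[0]):
        soonest = heapq.heappop(free_at)
        start = max(arrival, soonest)
        waits.append(start - arrival)
        heapq.heappush(free_at, start + service)
    return waits


# Assumption: four students with real questions, ten minutes each, two staff.
only_askers = wait_times([(0, 10)] * 4, staff_count=2)
# Assumption: four more students raise a hand on arrival "just in case".
everyone = wait_times([(0, 10)] * 8, staff_count=2)
print(max(only_askers), max(everyone))
```

In this toy model the longest wait grows from ten to thirty minutes once every arriving student joins the queue, which in turn gives the next arrival even more reason to raise a hand immediately.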
Indeed we even tried things like a fast lane so to speak for questions, where if you had a short
question, a quick question, we would dispatch you to a different set of TFs to help unblock you. In reality,
fun fact there is never a quick question in a computer science class; no matter how many times a
student prefaces it with that.
We even tried answering more questions electronically which there too felt like the logical solution to a
problem. But there, there was the sort of social side effect where we were perceived that one semester
as being much colder towards students, at least within a limited demographic: that we wouldn’t even
want to talk to them, we wanted to only answer their questions online.
There’s this social aspect of sort of selling students on the same vision, so that even if
we sort of bring our own brains to bear on a problem, there’s a lot of these unanticipated issues that
data have helped us calibrate for over the years and appreciate the effects of the knobs that we’re
turning.
>>: Why do you think that there’s a time to start only on week three?
>> David Malan: Because of late days, so we offer students a finite number of late days which are
essentially extensions via which you can give yourself a twenty-four hour extension. Most students
though conserve those early in the semester thinking they’re going to want to use them later. Most
students submit a day earlier by the Thursday deadline instead of the Friday deadline.
Then as soon as they realize mathematically, well I have as many late days as there are P sets left, they
just start to spend them. But this too is very deliberate. This too sort of fun pedagogical fact, I am
perfectly content with letting problem sets be due Fridays for everyone throughout the semester. But it
turns out if we were to do that we know from having surveyed students it would be perceived as a step
backwards. Because we would be taking away the late day policy by effectively giving everyone n late
days where n equals the number of problem sets. We very deliberately offer n over two late days.
[laughter]
Because again these are kind of human issues. This is not the theme of today, the sort of irrational
behavior of humans. But these are the kinds of things that we’ve learned by turning various knobs over
the semesters, and we have begun to appreciate too how best, especially as the class swells in size, to
cater to different students’ different learning styles, and the like.
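
The late day arithmetic described above can be made concrete with a small sketch. It is an illustration only: the function name is hypothetical, the hoarding rule (spend nothing until there are as many late days as problem sets left) is a simplification of the behavior described, and rounding n over two down is an assumption, since the talk gives no exact count.

```python
def first_pset_submitted_late(num_psets, late_days):
    """Index of the first problem set on which a student who hoards late days
    would start spending them, or None if that never happens."""
    for pset in range(num_psets):
        psets_left = num_psets - pset  # includes the current one
        # The student spends a late day only once there are enough
        # late days to cover every remaining problem set.
        if late_days > 0 and late_days >= psets_left:
            return pset
    return None


num_psets = 9  # problem sets zero through eight, as in the charts
print(first_pset_submitted_late(num_psets, num_psets // 2))
print(first_pset_submitted_late(num_psets, num_psets))
```

Under these assumptions, nine problem sets with four late days means the hoarding student starts submitting late at problem set five, while n late days means every problem set, starting with problem set zero, is effectively due on Friday.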
Lastly, I thought I’d share one other data point that’s been instructive for us. We use a web based tool
via which students receive feedback, sort of an online annotation tool that not only keeps track of the
word count and character count of the TFs and the time they spend grading, at least in the environment,
but also the number of students who are actually looking at that feedback, and how much time they’re
spending on it.
We know for instance that for problem set five, which was the Forensics P set this one year, some fourteen
percent of students didn’t even look at the feedback that their TF had provided. This too has been a
helpful feedback loop for us to appreciate. Well, on which students should we really be spending more
time in terms of providing that feedback? Or maybe there’s a fundamental problem if they’re not even
looking at it whatsoever, since grading unfortunately, while wonderfully useful pedagogically or instructively,
is a huge time consumption for the staff. Keeping this in mind helps us adjust.
These are the academic policies to which I alluded earlier. A few years ago the course had a policy of
allowing like a lot of classes at Harvard pass/fail, which means you can sign up not for a letter grade but
to get a P or an F on your transcript. The bar was a D minus or higher is a pass, otherwise it’s a fail. But
the logistics of this, at least at Harvard, and this is not unlike some other places, is you needed my or any
instructor’s permission.
What this meant mechanically is that if a student did want to explore a field unfamiliar to him or her, at
the very beginning of the term he or she would have to get up the nerve to get literally this pink slip
of paper from the Registrar, walk up to me at a place like Sanders Theater, presumably while hundreds
of their classmates are exiting behind them, and ask me for my signature. This was a horrible I think
way to require students to sort of put their toe in the water if we’re raising this bar. Just for logistical
reasons putting them in an environment where they might otherwise feel that they might fail.
Therefore this is why they want this crutch.
What we instead introduced a few years ago, after years of pushing for this, is something that in terms
of implementation is called SAT/UNS, satisfactory or unsatisfactory, which actually raises the bar,
where a SAT is a C minus or higher, which frankly I think is a better signal of success than a D minus or
higher. But it’s also self service, so students do it entirely online without my knowing, without their TF
knowing. This has been really just a way of signaling to students and backing up what we say, that
there indeed should be no fear of failure in a class like this. Indeed we take into account their lack of
prior background or background and normalize ultimately across the semester. In fact whereas we used
to have some three percent of the class taking the class pass/fail, we’re now up to about fifteen percent
taking it SAT/UNS.
I will be thrilled if in a few years’ time we can crank it up to one hundred percent and do away with letter
grades entirely, but more on that some other day. A regret clause, so academic honesty is a challenge
for CS courses more generally, not because CS students are more dishonest than students taking other
courses, but because we CS people have tools and techniques and hopefully the inclination to look for it in the
interest of fairness to other students.
Indeed every semester we ad board or discipline in some form about three percent of CS50’s student
body. This is usually ten students or maybe as many as forty students this past fall.
This generally is because they have crossed some line. They’ve outright copied some code from a
GitHub repo or from a classmate, and submitted it in some form as their own. Sometimes with clearly
deliberate modifications.
This itself is an interesting research project. There’s a couple tools out there, MOSS from Stanford,
ETector from Princeton that we rely on as the first ingredients in detecting this. But ultimately we use
human eyes to re-adjudicate this. But we’ve long felt and I’ve personally felt that so many of these
transgressions are really the result of students at four a.m. making a really stupid decision. They’re
super stressed, super tired, exhausted. They have got other deadlines or responsibilities to meet.
They’re making a poor decision and then having to hold their breath and live with it, hoping not to get
caught.
We introduced a regret clause into CS50’s syllabus this year which said that essentially if you commit such
a transgression but come forward to me or one of the course’s head TFs within seventy-two hours, we’ll
handle discipline internally. You might still get a zero on the P set. It won’t be without consequences.
But we won’t escalate it to the highest levels where the outcome could be much worse like expulsion
from Harvard for a year or more.
In the end we had some eighteen students come forward under this policy for exactly those reasons
described, almost always these late night moments of poor decision making. In
eleven of the eighteen cases we did do some internal discipline, zeroed the problem set, but then I had a
long chat myself with the student for ten minutes or an hour. Then we put it behind us and moved on,
hopefully leaving it as a teachable moment. In six or so of the cases the students were overly
concerned. They had indeed committed no wrong so we nonetheless had a good chat about where the
line is. Then in one case we had a student just redo some part of a problem set.
What was interesting though was that this had no impact on CS50’s rate of detection of academic
dishonesty, which is to say we still had about three percent of the class ad boarded at the end of the
semester. None of those students that self reported and put themselves on our radar would have
appeared on our radar by way of these automated detection tools that we use.
My takeaway from this was that we identified this sort of teachable subset of the population who
either had crossed some line or hadn’t, but with whom we had eighteen unprecedented heartfelt
conversations about academic honesty and, frankly, in a couple of cases about what was going on problematically
in their life, personally, or with their family. We involved some of Harvard’s like Resident Advisors or
Resident Deans to have those conversations.
But it also meant, for the three percent of the class who were in fact holding their breath committing
these transgressions, that we’d given them this out so to speak and they didn’t take it. We were much more
comfortable as a course frankly coming down all the more harshly on that particular demographic this
year. This didn’t go over well I should say administratively. This was not consistent with Harvard policy.
But this too is I think an example of some of the measures we’ve tried to roll out given the course’s
size and the attention we have. The down side is we can’t do anything without being noticed. But the
upside is it affects so many students. But now, this will with very high probability become part of the
college’s policy in the coming year or two, some form of this kind of regret clause.
Lastly, simultaneous enrollment. MOOCs are all the rage these days, online courses. We’ve been filming
CS50 since two thousand seven. It’s been freely available as open courseware since two thousand
seven. But we also stream the course live these days. It certainly occurred to me over time, having
taught classes online in some form since nineteen ninety-nine, that even though it’s a nice thing to be
able to physically attend a class, for many students the next best thing is surely being able to take it
asynchronously later in the day, or being able to rewind and fast forward, and review material.
Philosophically, we are perfectly fine with students not physically attending. Indeed the overarching
goal in CS50 is to no longer define the course, or courses more generally primarily by this fairly
traditional view, that a course is defined by its lectures at a certain day and a certain time. CS50 these
days is the only class at the moment at Harvard that you are indeed allowed to simultaneously enroll in
alongside another, which means you can take an overlapping class and never join us in Sanders Theater for
better, for worse.
But the reality is I dare say we are asymptotically approaching the point where taking this kind of course
is superior online, because you can pace yourself a little more effectively. You can indeed rewind and
fast forward. You can have full text search availability of the transcripts. You can hyperlink to other
resources. As much value as I do think there still is emotionally and I think collectively in being in a
beautiful space like Sanders Theater, and coming up on stage and having these memorable moments,
I do think the experience we now offer students on video is a close approximation of that. Meanwhile
they have a very robust support structure underneath it all.
>>: Do you have any statistics for how students do with their [indiscernible]?
>> David Malan: Yes, we do. No impact whatsoever. But that’s conditioned on the data on
which that’s based over the past several years, where we have as many as a hundred plus students
simultaneously enrolling in CS50 and another class, which means they never come to lectures.
Because of the paperwork that in prior years up until this past year was involved, there was a bit of a
hurdle, so we’ve hypothesized that doing simultaneous enrollment correlates with students who are
particularly sort of go getters and who are able to balance that kind of commitment. It might skew
toward that particular demographic.
This coming year there will be no such constraints. But something tells me we’re not going to see a
difference. Especially, when we have forty percent plus of the class not physically attending anyway, so
the fact that there are these fairly dated policies in place is also a bit frustratingly irrational. But, no,
what sold them on tweaking the policy was exactly that lack of evidence among the data of any
negative impact.
Ultimately a few final words on culture. This has been a particular focus for us. In the
course’s end of semester evaluations we grep through looking for mentions of culture and community
and the like. There’s nice things like “class that has a culture” that appear amongst students’ remarks,
“more of a culture than a class” and “a big community,” which was nice to see.
But then you also start to see evidence of this when you expand your expression a bit. “Almost cultish,”
“cult like,” “maybe a bit cultish,” and then the definitive “it’s a cult.”
[laughter]
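
The searching described above can be sketched as follows. This is a minimal illustration, not the course’s actual tooling: the two patterns and the helper `matching` are assumptions, and the sample comments are just the phrases quoted in the talk.

```python
import re

# Hypothetical stand-ins for evaluation comments, using phrases quoted in the talk.
comments = [
    "more of a culture than a class",
    "a big community",
    "almost cultish",
    "it's a cult",
]

# The narrow search looks only for the whole words "culture" and "community".
narrow = re.compile(r"\b(?:culture|community)\b", re.IGNORECASE)
# The expanded search accepts any word beginning with "cult".
expanded = re.compile(r"\b(?:cult\w*|community)\b", re.IGNORECASE)


def matching(pattern, lines):
    """Return the lines in which the pattern occurs, as grep would print them."""
    return [line for line in lines if pattern.search(line)]


print(matching(narrow, comments))
print(matching(expanded, comments))
```

The narrow pattern finds only the first two comments, while the expanded one, because it matches any word starting with "cult", picks up all four.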
But this has been a very deliberate design decision too. In fact I was just chatting with some of our
friends at Microsoft the other day about what’s guided our decision making over the years. Because I
do hope we’ve maintained the course’s historic rigor and its reputation, that’s also allowed us to get
away if you will with having I think a lot of playfulness in the course, a lot of quirkiness to it, and this
takes the form physically of even sort of the branding aspect of stress balls that we hand to students
around exam time, the CS50 shades that they can now wear, the fact that the shuttle buses when
they pick us up for the Hackathon literally say this is CS50 on there.
But I think the coolest and the proudest moment is how many of the students are inclined to then wear
these T-shirts we give them at the end of the semester. Now, granted they’re free T-shirts so they’re
bound to be worn anyway. But the very simple messaging that says I took CS50 which is the thing we
give to students now at that CS50 Fair, pretty much at the terminus of the course. That they hopefully
then wear proudly on campus in some form to say that they either happily took it, proudly took it, or at
least took it, and finished taking it. But nonetheless this has certainly helped raise the profile over the
years.
In fact, I thought I’d conclude with something a little Microsoft centric. First we need to roll back a few
years to a video that can be found on YouTube these days. Some of you might remember it from the
very early, one of the earliest versions of Windows. But here’s a familiar face, so let me have you page
this back into memory for just a moment.
[video]
How much do you think this advanced operating environment is worth? Wait just one minute before you answer.
Watch as Windows [indiscernible] Lotus one, two, three with Miami Vice. Now, we can take this Ferrari
and paste it right into Windows Write. Now, how much do you think Microsoft Windows is worth?
Don’t answer; wait until you see Windows Write and Windows Paint. Listen to what else you get at no
extra charge: the MS-DOS Executive, an appointment calendar, a card file, a notepad, a clock, a control panel,
a terminal, print spooler, RAM drive. Can you believe it, Reversi, that’s right, all these features and Reversi
all for just, how much did you guess? Five hundred, a thousand, even more, no it’s just ninety-nine
dollars, that’s right. It’s ninety-nine dollars. It’s an incredible value but it’s true, it’s Windows from
Microsoft, order today, P.O. Box 286, DOS, except in Nebraska.
[video ends]
[laughter]
Steve kindly helped us shoot the two thousand fourteen version of that which I’ll share with you here.
[video]
How much do you think this is worth? CS50 is Harvard University’s introduction to the intellectual
enterprises of computer science and the art and science of programming. Topics include abstraction,
algorithms, data structures, encapsulation, resource management, security, software engineering, web
development, and even more. Now how much do you think that shirt is worth? Don’t answer.
Languages include C, PHP, JavaScript, plus SQL, CSS, HTML, and more. Problem sets inspired by real
world domains of biology, cryptography, finance, forensics, gaming, and still more. Can you believe it,
designed for CS majors and non-majors alike with or without programming experience. This is CS50.
[video ends]
[laughter]
He’s an amazingly good sport.
[laughter]
Where does that leave us? There’s a number of opportunities now for outreach and ways actually to get
involved, as folks here at Microsoft. One, the course is offered as part of edX, which is Harvard and MIT,
and other universities’ massive open online course initiative. In fact we have quite a few students taking
the course by edx.org.
We also have as I alluded to earlier an opportunity for the first time for students to take the course at
Yale University. This is a very exciting and fairly unprecedented opportunity for two universities to
collaborate, whereby students at Harvard and Yale this fall will be able to take CS50. Most of the
lectures will continue for logistical reasons to be shot at Harvard. But we’ll do a couple of lectures in
New Haven and stream them back to Cambridge.
But we’ve recruited some forty TFs and CAs in New Haven. In fact, remarkably
Yale’s faculty voted to change their undergraduate policy, allowing undergrads for the first time in their
history to TF a class, which is a tradition we’ve had at Harvard for some forty years. That first such class
will be CS50. They’ll have their own sections, their own office hours, their own CS50 Fair. Then just for
fun we’re going to do a co-located CS50 Puzzle Day and CS50 Hackathon in Cambridge, to which both
campuses are invited.
That’s on the horizon for us. But then more close to home here we’ve been partnering with our friends
Natcho here and others at Microsoft on CS50 AP, which is our initiative to make all the more teacher
support resources available to high school teachers. Thus far, most every aspect of CS50’s open
courseware has been very student centric, enabling anyone on the internet to take the course but not
necessarily teach the course, not necessarily providing them with insights, for better, for worse, into the
course’s pedagogy or why we teach doubly linked lists in a certain way, or providing them with solution
sets to problem sets, or tools that they can use to assess or provide feedback to their students.
As part of this effort then teachers will be coming to Microsoft this Thursday and Friday for a Boot Camp
of sorts, wherein we’ll present them with the opportunities that await them. This group of thirty or
fifty or so teachers has agreed to teach some form of CS50 in their classrooms this fall. Next fall, two
thousand sixteen, is when AP CS Principles will be debuting, which is a new curriculum
from the College Board, and it will be a new AP offering which CS50 ultimately is intended to satisfy as
well.
If you’d like to get involved in this kind of initiative do let us know. The easiest way is to drop by a
reception tomorrow at five p.m. in the Visitor’s Center here at Microsoft. Or certainly chat with me or
Natcho, or reach out via email at any point if you’d like to get involved in a TF, in a mentor, or in just a
hello style capacity.
I think we have one question from the internet here. Has this made any impact on the percent of
women graduating with a CS degree? I wouldn’t necessarily attribute it just to CS50 certainly. But the
number of women in the concentration as well as in the course has been inching up over time. CS50
does vary year to year. The peak that we had a year ago was thirty-nine percent. It dropped a little bit
this past year as the course’s enrollment grows a bit more. But it has been steadily increasing
downstream in, rather, the department’s concentration, or major. I don’t know the exact
number off hand. But it’s definitely been inching up. It’s still far from fifty/fifty but increasing each year.
>>: Can you give me a better idea of where exactly the course fits in the larger CS curriculum at Harvard?
>> David Malan: Sure, so at Harvard we have three intro classes, CS50 which is this one, CS51 which is
more of a PL class or functional programming class with a bit of OO. Then there is CS61 which is a lower
level machine language class. Students who want to major in Computer Science have to take any two of
those three. Most students will indeed take CS50. Even among those students who have taken AP CS A,
which exists already, about seventy-five percent of them will still take CS50, in part because of the
experience and in part because of the exposure to C and the background therein.
In part too, because we also cater to that demographic as well with the hacker editions and more
comfortable sections. Thereafter a student would typically take like a theory of computation class,
maybe an algorithms class, or any number of higher level classes.
>>: Do you know how many students, has this class been, I know it attracts a lot of non-majors as well.
Does it actually convert people to majors?
>> David Malan: In general yes. In fact, I mean this is starting to change slowly but surely in part
because Harvard now has an Engineering School as of a few years ago. In part frankly because of the
Facebook Movie and everyone knows Mark Zuckerberg went to Harvard and did CS for a couple of
years.
But in reality, historically most CS concentrators at Harvard, I dare say, are
converts. They come to Harvard like I did not thinking they were going to major in CS. They put their
toe in the water and they take fifty, maybe one or two other classes, and realize that they didn’t in fact
want to do government as in my case, or pre-med, or some other field.
In fact that’s still very much the case. I mean students still don’t equate Harvard with computer science
though the messaging is slowly changing. Any other questions, yeah?
>>: You have a pretty small enrollment relative to the total that are coming from the graduate
community at Harvard. Is there any interest in like expanding that presence or attracting more graduate
students?
>> David Malan: Good question. Short answer, yes, so I taught a prototype of, a version of CS50 for
Harvard’s Business School this year that was separate, for the thirty or so MBA students who took it.
That was a little more discussion oriented, a little more high level, focusing sort of top down instead of
bottom up. We thought about rolling something like that out to the graduate students.
The catch is it’s marketed to graduate students. It’s available to them. But generally it doesn’t satisfy
credits that they might have. It involves a non-trivial amount of time. In fact even at HBS for the MBAs
our workload is like three times the typical workload. Just cramming that into their own graduate
program is non-trivial.
But I still think we can do a bit better with outreach. That’s why we’ve been exploring another version
potentially. Yeah, another question online, do Mechanical Engineering students at Harvard have their
own intro to programming, a class which typically uses MATLAB? Do you know if any MechE students
have been taking CS50?
They don’t have their own intro to programming class. We don’t teach MATLAB. I do think it’s
incorporated into one or more classes. I don’t know the number of Mechanical Engineering students
taking CS50. But our biggest demographics among concentrators are CS students, engineering
students of which MechE would be a subset, applied math students, economics students, government
students actually, and then every other concentration at Harvard. Alright, yeah David?
>>: You mentioned in passing that you were interested in getting rid of all grades at CS50. Can you
expand on that and what the thought is around that?
>> David Malan: Yeah, so this is more of a philosophical thing. I would be thrilled if Harvard University
and places like it honestly did away with grades at that level. Since I think there’s far too much interest
among these very high achieving demographics in sort of achieving for the sake of achievement, or
accumulating merit badges whether that’s academically motivated, whether that’s the number of
extracurriculars students are doing.
I think there’s a missed opportunity especially in some of these kinds of environments to really pursue
as naively as this might be or sound, sort of learning for the sake of learning. Not worrying nearly as
much about meeting the college’s somewhat arbitrary thirty-two course requirement, or
meeting the somewhat arbitrary requirements that you might have within the department. But allowing them
freely to go well beyond what they learned or had access to in high school.
Part of that for me is eliminating that sort of extrinsic motivator. For us it’s been hard because within
the CS50 population only fifteen percent of students are electing SAT/UNSAT. Frustratingly, when we
survey students a majority of students do not want us to eliminate letter grades. I dare say, somewhat
strong-headedly perhaps, that I do think we have to address this beyond CS50. Because the problem is
we now exist in a vacuum. I think it is very rational behavior for a student taking a class that’s only
SAT/UNSAT and three other letter-graded courses to prioritize the things that might have a more functional
impact.
What I hope we can do before long and among the policies I alluded to earlier is do something much
more like MIT, where the freshman fall or the freshman year is effectively ungraded. Or it’s effectively
pass/fail. Or MIT also has a sophomore exploratory option which means that in your sophomore year
you can take any class you want across the university pass/fail, with or without the instructor’s
permission, both fall and spring.
It’s exactly for use cases like CS50 where if you are a Gov major you don’t want to fail or you don’t want
to hurt your GPA, which is a very common and maybe valid concern. They don’t put their foot in the
water otherwise. That’s the kind of institutional change that I would like to see. The sort of pinnacle of
that I think would be the elimination of those labels altogether. Yeah?
>>: In the meantime while you cannot abolish grades and with the increase in student population taking
CS50 as well as like a number of TFs that you need. How do you ensure that grading is standardized
across the board?
>> David Malan: It’s a good question. We have a lot of data on this too. In fact one of the challenges at
the end of the term is to do the normalization that we promise to do. By that I mean we try to
normalize as best we can the variance of grades across teaching fellows and across
sections.
What we try to do for instance is we certainly take into account the comfort level of the student, less
comfortable, more, and in between. We know statistically that on exams or quizzes if you’re more
comfortable you typically get twenty percentage points higher than a student who is less comfortable,
and in between is about ten percentage points. That helps us know historically about how much to
curve those things even though it’s not curving per se. It’s only done at the end of the semester and has
no direct mapping to letter grades.
What we also do is look at the variance among the TFs. We know historically that the older you are the
more likely you are to have lower grades or be a harsher grader. Presumably that comes from
experience and sort of perspective. We therefore know how to sort of normalize the averages within
sections. It’s certainly imperfect. And it makes the process frighteningly manual.
But the last step in the process too is to ask every one of our sixty-two TFs for their thoughts on Alice, and
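The section-level normalization described here can be sketched roughly as follows. This is only an illustration under stated assumptions, not the course's actual procedure: the function name `normalize_sections`, the sample scores, and the choice to shift every section to the course-wide mean are all hypothetical. It also assumes sections are comparable in ability, which is exactly why the comfort-level adjustment and the manual review by TFs still matter.

```python
from statistics import mean


def normalize_sections(scores_by_section):
    """Shift each section's scores so its mean matches the course-wide mean.

    Hypothetical sketch: a harsh grader's section is shifted up and a
    lenient grader's section is shifted down by the same kind of offset.
    """
    all_scores = [s for scores in scores_by_section.values() for s in scores]
    course_mean = mean(all_scores)
    adjusted = {}
    for section, scores in scores_by_section.items():
        # Offset is positive for sections graded below the course mean.
        offset = course_mean - mean(scores)
        adjusted[section] = [round(s + offset, 2) for s in scores]
    return adjusted


# Made-up scores for two sections graded by different TFs.
sections = {
    "lenient_tf": [90.0, 94.0, 98.0],
    "harsh_tf": [70.0, 74.0, 78.0],
}
normalized = normalize_sections(sections)
```

After the shift both sections share the same mean while the ordering of students within each section is preserved, which leaves room for the per-student manual tweaks mentioned in the talk.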
Bob, and Charlie, and every student in their section. To weigh in positively or negatively on what the
sort of default grade seems to be for them. We very often make manual tweaks. It’s very painfully
manual but I do think it’s worth it. The course is not so mechanized that it’s just a program that spits
out the results.
>>: It sounds like you have a pretty rigorous grading process. Is it highly prescribed for the TFs to grade
students?
>> David Malan: To some extent it’s highly prescribed. We grade along three, four primary axes: scope,
how much of the P set did you bite off; correctness, how much of it is bug free; design, which is much
more subjective and qualitative, how well written is it; and style, how good does it look, variables,
indenting, and such. Within that we try to have as few buckets as possible. Most of those axes are
graded on three or maybe five point scales.
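One way to picture the four axes is as a small record with a bounded score per axis. The sketch below is an assumption-laden illustration: the class name `RubricScore`, the five-point scale, and the unweighted total are hypothetical, since the talk does not say how the axes are combined.

```python
from dataclasses import dataclass

AXES = ("scope", "correctness", "design", "style")


@dataclass(frozen=True)
class RubricScore:
    scope: int
    correctness: int
    design: int
    style: int
    max_points: int = 5  # assumed five-point scale; some axes may use three

    def __post_init__(self):
        # Reject scores outside the small number of allowed buckets.
        for axis in AXES:
            value = getattr(self, axis)
            if not 1 <= value <= self.max_points:
                raise ValueError(
                    f"{axis} must be between 1 and {self.max_points}, got {value}"
                )

    def total(self):
        # Unweighted sum, purely for illustration; real weights are not given.
        return sum(getattr(self, axis) for axis in AXES)


score = RubricScore(scope=5, correctness=4, design=3, style=5)
```

Keeping the buckets few, as described, makes it easier for graders to agree on what a given number means.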
What we try to do in TF training and meeting over the course of the term is get more and more
reasonable people to agree what a three is, what a four is. But invariably there’s just significant variance
even among the eldest of the staff. There too we try to take statistics and data into account. But it is
indeed very laborious deliberately.
We’re trying to get to the point software-wise where scope and correctness, and style are entirely
automated. But design will still remain human graded. Indeed that’s what the students at Harvard and
Yale get that, say, the students taking the course as a MOOC online don’t get: that same human
support structure and higher touch feedback. Yeah?
>>: Speaking of feedback, it sounds like you’re working really hard to keep that individual one on one,
like that works really well in this context. What are you doing to sort of like provide training into providing
good feedback and giving good teaching methodologies to your TFs?
>> David Malan: Regarding training at the very start of the semester we spend two immersive days with
the entire hundred person staff. Talking about grading philosophy, practicing grading, talking about
teaching, and practicing teaching. We have a weekly staff meeting that everyone’s expected to come to,
as well.
This past spring at Yale we actually had a Senior Teaching Fellow, who’s now moving to
New Haven, of all places, to live full time and work full time running CS50 with one of the faculty there.
He’s actually been doing I think an amazing job, better than we’ve done in Cambridge thus far, at giving
them a spring term training experience as well, really just helping to give people practice and comfort
with teaching. Talking about the philosophy of teaching but also going through the mechanics of
grading and doing the P sets themselves.
One of the things we’ll be doing with our AP Outreach Program too is exactly those kinds of things.
Providing a support network for teachers as simply as email, as richly as web based interface for folks,
but also resources, and access to fellow humans so that we sort of collectively reach some reasonable
norms together. But this is very much an experiment too. What we’re hoping to do toward that end is
build up as much of a support structure frankly of alumni, friends of Microsoft back at Harvard as well,
to do exactly that, a mentorship program. Yeah?
>>: Can you quantify how much like benefit of the human labor in like grading and feedback is providing
compared to just like the automated stuff you have on the MOOC?
>> David Malan: That’s a good question. Have we quantified how much value there is, the extreme
here is how valuable is an on campus residential college experience or just that support structure? Not
very well yet. A lot of the online support structure is very new. The automated tools we only just
rolled out a couple of years ago out of sheer necessity of scale. That’s the question for us to noodle
on.
I mean the narrative at colleges everywhere that are doing these online courses is that surely there is
value. I think it’s probably more conditional. For some students there is surely value. We certainly
know that by having this sort of apprenticeship model there’s someone to get you through the hurdles
and sort of keep you motivated and keep you engaged.
I think it’s probably that human aspect that’s particularly valuable. The reality is too, our tools are
imperfect. We humans can spot a bug far quicker than even the most sophisticated tool right now can if
we have the practice for it. I think we need to figure out how best to measure that. Yeah?
>>: This Rebooting of CS50 is great. Do you see the impact of this effort on the other classes, CS classes
at Harvard? Is there very boring ones at Harvard?
[laughter]
>> David Malan: Good question.
[laughter]
Well Ec10 is much smaller now. But the most immediate answer I see to that, because
the tradeoff isn’t between Ec10 and CS50, is there’s a non-trivial number of fairs at Harvard now, which I
think is the most visible impact. Like there’s an Applied Math Fair at the end of the term. There’s an
Applied Physics 50 Fair. There’s a, I think an Engineering Science, there’s an Engineering School Design
Fair.
None of those fairs existed before two thousand eight. I’d like to think as a side effect of what we’ve
been able to do with the CS class. That there’s an appreciation or more interest I think in the sort of
collective experience, the exhibition of students’ work not in any sort of competitive way but showing
where everyone has exited a semester.
There’s been that. Then I would say that you sort of get to know certain faculty who are particularly
teaching-centric over the years. We certainly have I think juicier conversations behind the scenes, especially
on some of the policy issues. I think we have good friends in statistics right now, and the life sciences, where I’m
hoping a few of us with big classes will actually roll out some of these sort of campus wide changes.
A question from online, how important is choice of language, Java, C#, C++, HTML, Python in
effectiveness of teaching? Probably not very much, I mean HTML is the outlier there among those
languages. But I always disclaim that we could probably teach CS50 in almost the same way maybe even
a better way using some other language. This very quickly evolves into a scary religious debate with
most people.
[laughter]
We have very deliberately clung to C because I think one, I like the fact that we’re among the very few
courses that use this as the introductory language. It absolutely introduces some non-trivial stumbling
blocks for students, myself included back in the day, pointers come to mind for instance.
But it allows us to really give students this bottom up appreciation of what’s going on underneath the
hood. It pains me when so many students are starting with higher level languages whether it’s Java or
Python, or any number of other languages these days. Never really unless they take like a systems class
understand like what is a computer, and what is memory.
The fact that we can have, with introductory students, seventy-eight percent of whom have no prior
background, discussions of buffer overflow exploits, talking about the stack and the heap, and context
switching from C to PHP, whereby in one problem set they write hundreds of lines of code to implement
a fairly fast hash table for dictionary lookups, to implementing that exact same principle in one line of
code by declaring an associative array in PHP, I think is a really compelling narrative so that students
really do understand the abstractions that have been built up.
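The contrast between a hand-built hash table and a built-in associative array can be sketched as below. The course makes this point with C and PHP; Python is used here only as a stand-in, and the class name `ChainedHashTable`, the bucket count, the multiply-by-33 string hash, and the sample words are all assumptions for illustration, not the problem set's actual code.

```python
class ChainedHashTable:
    """A tiny hash table with separate chaining, storing lowercase words."""

    def __init__(self, num_buckets=1024):
        # One list ("chain") per bucket holds every word that hashes there.
        self.buckets = [[] for _ in range(num_buckets)]

    def _index(self, word):
        # Simple multiply-by-33 string hash, reduced to a bucket index.
        h = 5381
        for ch in word.lower():
            h = (h * 33 + ord(ch)) % (2 ** 32)
        return h % len(self.buckets)

    def add(self, word):
        bucket = self.buckets[self._index(word)]
        if word.lower() not in bucket:
            bucket.append(word.lower())

    def contains(self, word):
        # Only one chain is scanned, which is what keeps lookups fast.
        return word.lower() in self.buckets[self._index(word)]


words = ["apple", "banana", "cherry"]

# The long way: build and fill the table by hand.
table = ChainedHashTable(num_buckets=8)
for w in words:
    table.add(w)

# The short way: one line with the language's built-in associative array.
builtin = dict.fromkeys(words, True)
```

Both structures answer the same membership question; the second simply hides the hashing and the collision handling that the first spells out.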
I wouldn’t go so far down as assembly since I do think that probably goes a little too low level. But C for me
feels just about right. We cover almost the entirety of the language. That too feels like a completeness
on top of which we can do everything else. Frankly, what I think has been a super testament to
just how well this seems to work, at least for our students, is how many students are going off, and we
spent literally one, two weeks on PHP and SQL, and JavaScript, and HTML, and CSS, some of which need a
little more care than the others.
We don’t certainly go into depth into any of them. But the fact that students are able to so comfortably
and so en masse bootstrap themselves, I do think is testament to how well syntactically the language
allows them to bootstrap to other syntactically similar languages.
Short answer, I’m sure we could teach an effective class with other languages. I’m sure other people
do. But this is I think a more exciting way to do it even though it comes at a cost. Kevin?
>>: How many of your TFs are CS concentrators? Or do you get a variety of TFs who have you know a CS
minor but are an Econ that brings a different perspective?
>> David Malan: Good question. I’d have to double check the data. I think it’s a majority happen to be
CS concentrators. That’s not deliberate it just tends to correlate with those who are applying and
expressing a particular interest.
But we quite often have had staff who’ve just taken CS50 and one other course. In fact the only bar we
really have is we expect, we generally expect a staff member to have at least taken CS50 and ideally
some other course. In fact, we very commonly have sophomores TFing the class. Our head TF for the
past two years started out, we hired him as a TF, as a freshman because of CS50’s open courseware. He
actually taught a version of CS50, translating all of our handouts and slides, and P sets into Portuguese.
Teaching it literally to fifty of his Brazilian classmates one year and then I think to a hundred of his
Brazilian classmates the next year.
He too adopted our video production methodologies. He had his dad filming with the camcorder in the
back of the class. He has a corpus of all of the videos. We hired him as a freshman, there’s no strict
requirements, so it’s not necessarily concentrators.
>>: How much time do the TA’s put in per week helping run all the needs of office hours?
>> David Malan: Min of twelve hours, average of probably eighteen or twenty, and maxes of more.
[laughter]
Most of that is not because of our formal expectations. It’s because of personality, it’s their interest.
For many of the staff it becomes their primary extracurricular. So much so that we do tend to burn the
staff out so that alone means high turnover which is expensive. But it allows us to maintain the support
structure that we have. Yeah?
>>: How much time does the class tend to ask students of? Like is there’s something that is like the
biggest class that a student is taking in a particular semester, or?
>> David Malan: Very often. It depends I think on what your field is and what year in school you are.
The average is about twelve hours, super high variance. The more comfortable students might finish a P
set in four or six hours. But a more common case is twelve, fifteen, twenty, twenty-five, or more hours.
For other students in CS50, I mean, the narrative tends to be: if you’re taking CS50 I wouldn’t take too
many other classes that have non-trivial workloads. Yeah?
>>: I’m wondering if you’ve ever experimented with other [indiscernible]. Sort of this philosophy that
you have has this sort of implicit assumption that kids or the audience can understand that oh, okay this
is a computer. This is [indiscernible] that they B to like C. I understand the whole process and then I
have a picture of the whole thing and it’s easy. But, I mean you have that luxury at Harvard and places
like Yale. But at a lot of other schools where kids haven’t gotten to that level they don’t have that
confidence.
Have you ever had sort, do you have to change your methodology of teaching? Do you have to change
your approach to be able to reach the kids that haven’t sort of gotten to the level, but I can actually
understand this?
>> David Malan: Yeah, it’s a good question. Short answer I think we have a lot of data that suggests that
the curriculum does not assume a Harvard or Yale, or comparable demographic. I mean Harvard’s GPA
definitely skews toward the right.
But the reality is since two thousand seven we’ve had fifty, a hundred, two hundred extension school
students who have had the opportunity to do the Continuing Ed version of CS50, which is identical
curricularly and in terms of deadlines to the college version. It’s completely open enrollment so we have
sixteen year olds who are particularly high achievers. We have older students who are just doing it for
professional development, or to fill in gaps in their knowledge, or for personal edification.
As part of the APCS Program that will presumably skew somewhat towards a demographic of students
who are more high achieving or at least more willing to push themselves. But the reality is that through
CS50x, which is the MOOC version for which we have four hundred fifty thousand registrants right now.
Not all of whom have engaged but at least half of whom have engaged with the class in some form.
I think the key ingredient for a broader demographic is just a bit more time. I don’t think CS50 is as
difficult as it is simply time consuming. It’s definitely challenging. But really, what I think is superior
about the online experience, in fact, is that the MOOC students have twelve months to do the class. Frankly,
even then they can restart the class the next year and keep going.
I think the advantage the high school students will have is very comparable, whereby the CS50 AP
curriculum will be spread not over three months but over six or so months, the fall and spring semester
of a typical high school. I think that’s the key ingredient is a bit more time because frankly doing one P
set a week at this pace, at this rigor is tough for anyone.
In fact, even among our adult learners in the Continuing Ed Program attrition is about fifty percent,
which isn’t because they’re struggling with the course per se. But because they have kids and they have
work, and they have other life commitments, which college students tend not to have.
Another sample size would be Miami Dade College. We’ve started working with a group of folks there
who’ve been wonderfully generous with their time. Essentially trying to help students in the Miami
Dade area get technical jobs and they happen to have chosen CS50’s curriculum as the vehicle via which
to get that demographic, which itself is very broad as a community college, from zero to sixty over the
course of just a few months. Then helping them find entry level technical jobs. The same thing has
been happening with some volunteers in St. Louis.
I think that we expect more of students per unit of time at a place like Harvard and Yale. That comes
with significant costs frankly to their own schedules and you know sanity. But I think with a bit more
time it’s absolutely accessible to a very broad demographic by design. Okay, one more question?
>>: Sure, given that you said now it seems to be like you would suggest preferable or in some ways a
better experience to take it on the MOOC. What are you doing to sort of extend the culture of the class,
the cult like status of it if you were taking it online and can’t get the T-shirt?
>> David Malan: Let’s be clear that was a side effect not an intent.
[laughter]
You know this has been accidental and organic, I think is the answer. With the MOOC version, which
had a couple thousand students register the first time and now more than that, we initially tried to use the
course’s own discussion tool and the same software tool.
Then the next year we sort of put aside all of those. We embraced literally Facebook and Twitter, and
StackExchange, our own dedicated StackExchange, and more recently Slack the chat tool. To sort of
build things and see if people would come, and indeed now one of the most active groups among the
online students is the Facebook group which has some fifty-six thousand people in it right now. It’s a
horrible tool for sort of pedagogical use.
Right, things just bubble down really quickly and you can’t format code. Things just look like a mess.
But the reality is you can tag people. Some of us are already spending a little too much time there
anyway. You get pulled into the conversations in a way that you would otherwise have to sort of take a
deep breath in and dive into the discussion board.
We use our own subreddit as well. We’ve actually seen a remarkable like interest in the community.
You know posts asking where can I get my own I took CS50 T-shirt. What we’ve started to do too is
when we see sort of interesting stories or backgrounds of people. We’ll send some of those stress balls
in the mail to India.
We have a photograph of a fellow on the streets of Pakistan who like tattooed with a marker like CS50
into his arm, which I think qualifies as a cult status at that point.
[laughter]
But I do think that’s testament to sort of the connections that remarkably have been possible online. I
don’t even think the on campus experience is requisite.
>>: I saw the Facebook on your birthday. I just wanted to congratulate you on the cult of personality
not just for [indiscernible] in class.
>> David Malan: On my birthday, yeah.
>>: But the devotion people baking you cakes all around the world, it’s wonderful.
>> David Malan: Yeah, thank you.
[laughter]
Why don’t we officially adjourn here and I’ll certainly stick around for questions. But thank you so much
this is a lot of fun.
[applause]
Thank you.