My Thoughts on the UBC CS Curriculum

Summary

EDIT: Here’s a much requested summary

  • Insufficient depth of content in:
    • Math and Proofs: UBC CS does not encourage students to practice proofs enough, and does not do enough to introduce foundational notions like sets, relations, functions, etc.
    • Theory of Computation: We teach significantly less content on computability, complexity, and formal language theory than other universities.
    • Systems: Systems content that is standard and expected of CS grads is glossed over or completely skipped. We are missing content on concurrency, memory management, isolation levels, and process management. Students don’t understand linking, the stages of a compiler, or how their build tools work.
  • Courses are not cohesive, and the curriculum is poorly planned: Courses like CPSC 121, 213, and 313 combine unrelated content. This makes them cram and gloss over content, and they fail to deliver a solid understanding of the material. There is a lack of cohesive course structure, which leads to redundant content through the systems track and makes the content harder for students to digest.
  • PrairieLearn is not a replacement for short answer questions: Students often reverse engineer their understanding of a topic through PrairieLearn questions, leading to misconceptions and a shallow understanding of topics. Short answer questions should be brought back to challenge students’ understanding of the material. Explaining is the best way to learn.
  • I have suggestions on how to address these in the short term and long term

Introduction

My name is Oussama. I recently graduated from UBC and am now a software engineer at Databricks. Having finally graduated, I’ve been reflecting on my time as an undergrad. I’m grateful for the many opportunities UBC has provided me, none more than the chance to work with and learn from our wonderful professors like Margo, Reto, Steve, and many more.

I initially transferred to UBC from McMaster University 2 years into my degree. At the time, I was looking for greener educational pastures, hoping to get a stronger foundation in computing. McMaster didn’t have the elective options that UBC did. Had I stayed, I would only have been able to choose 3(!!) electives throughout my degree. I also felt that my courses at McMaster didn’t deliver the academic rigor I wanted. UBC also boasts an awesome roster of professors who’ve made big impacts in their fields. These were the people I wanted to learn from! I packed my bags and set off for the other side of the country.

Three years later, I’m once again reflecting on the education I’ve received. I look back and get a sense that my time at university and in my courses could’ve been better spent, that there’s so much more to learn. In many of my courses, it felt that I had to study two orthogonal things: what I was tested on, and what I saw as valuable to learn. I often felt that I couldn’t show my mastery of the content through pages of multiple-choice and fill-in-the-blank questions. Perhaps UBC has pushed me over the Dunning-Kruger hump, and the real learning starts after graduation? While one degree can’t teach all of computing, I was left wondering: could we design it to be better?

I did some digging to better understand my feelings. I did a lot of talking with friends, students I was TAing, and poor passers-by in the cube just looking to heat their lunches. I researched other computer science programs at top universities to see what they consider core CS experience and knowledge. I studied the curricula of UIUC, UofT, Waterloo, and CMU. I even put out a survey to get a sense for how people feel about the UBC CS curriculum, garnering 260 respondents! Needless to say, I care a lot about this, and I’ve put a lot of work and energy into understanding CS curricula and how we can make the experience at UBC even better. I’ve got a lot to say, so strap in! I’ll be providing my review and analysis of the UBC CS curriculum, and will present ways that I see it improving.

One note before we jump in. I primarily use Waterloo and UofT as the comparison in this article because:

  1. People get scared off at the mention of schools like MIT, Stanford, and CMU.
  2. UofT and Waterloo’s semester schedule is closest to ours. Other universities may have a January break, or have a 16 week semester compared to our 12 weeks.
  3. This analysis is long enough as it is. Check out this for a much broader view of curricula.

We Lack Core Content

A computer science program is meant to deliver a core set of skills and knowledge to every student that successfully completes it. Fundamentally the curricular decision making comes down to balancing trade offs between the depth and breadth of content, the difficulty, and the flexibility of the program. The expected outcomes of the major depend on which of these we prioritize. UBC’s CS program provides excellent flexibility since it requires few courses, and allows students to tailor their university experience to their interests. Students can tune how challenging the program is with the flexibility to manage their schedules. While we deliver on flexibility in spades, I argue that our CS program undervalues depth and breadth of foundational knowledge, and that this hurts students more than it helps them. Indeed, lacking the foundations can make catching up an uphill battle. I personally felt that I had to do so much catching up in systems and my programming skills to even compete with other students from MIT, Waterloo, Berkeley, and elsewhere who’ve been taught this material as a matter of course, while I had to actively seek it out. The program should deliver on the level of knowledge that is expected of new graduates from both academia and from industry. After looking through many other universities’ content, there is definitely a common set of content that’s expected and that we lack. In today’s intensely competitive market, UBC students are worse off than students elsewhere, both in software engineering and in grad school applications. I would go as far as to say that the notion that this program would prepare you for the workforce or for research is dubious. It leaves behind students who are less connected or less knowledgeable about the industry. This also leaves behind students who don’t have the financial freedom to spend all their time outside of school doing projects and learning tools. 
In its current state, this program prefers those who have relatives and friends already in the industry, those who have money and time to throw at improving their skills outside of school. To many students, among them my friends and those I’ve taught as a TA, UBC does not provide a complete CS education, and it hurts their chances at success.

One of the major tradeoffs that curriculum designers have to consider is how challenging the curriculum is. I often get the question “Wouldn’t the extra depth and breadth make the curriculum more difficult?” I think that covering more content inevitably makes the curriculum tougher, but it doesn’t have to be linearly tougher. Indeed, the way we teach our classes makes it harder for students and leaves them learning less than at other universities! The progression of our introductory courses forces students to figure many concepts out on their own, and the courses progress far too quickly. A simple and rough metric for this is the number of “core courses” we require compared to other universities. The standard at both Waterloo and the University of Toronto is 3 CS courses in first year, 5 CS courses in second year, and 2 courses in third year, totalling 10 courses. Compare that to UBC’s 2 first year courses, 3 second year courses, and 3 third year courses, totalling 8 courses. At a very rough scale, we’re already missing 2 courses of content! Even McMaster had me do 1 CS course in my first year, and 8 in my second year, far outpacing our curriculum. When you try to teach that much content in so few courses, lectures end up looking like lightning talks in comparison. I’ll leave more detailed discussion on course flow and course structuring to an upcoming section on curriculum design.

Another common complaint I hear is that we don’t have the calibre of students that UofT and Waterloo have. I’d like to refer you to our department’s claim that “We are tied for #1 in Canada.” Our program is getting top-notch students, as shown by our impressively high specialization grade requirements. We’re getting really talented people! The fact that students and professors believe that we don’t meet a certain calibre shows how much UBC students are being let down by the curriculum.

Here’s another one I often get: “not everybody wants to learn all the standard CS content”. “I’ve never used proofs in my job,” “we should learn web frameworks in school,” and, “no one will ever build an operating system at their job, so why learn about it?” We need to agree on some amount of content, otherwise we’d devolve to making everything optional. Should we teach the minimum set of content to get a job? Some may say yes! To those folks, I’d like to point you to the countless software engineering bootcamps out there. I’m here to receive a foundation in computer science, not become the minimum viable engineer. To those who feel that proofs aren’t useful or useful enough, I’d like to point you to my section on mathematical background. For those who feel that operating systems and hardware aren’t necessary for your jobs, go check out my section on systems.

In the following few sections, I’ll discuss where we should draw the line for “core content”. I show how our program fails to achieve a comparable level of knowledge to other programs in terms of mathematical maturity, theory of computation, and in systems. Then, I’ll provide my opinion on how the program should proceed in the long term. While these changes may take years to implement, I think that identifying these shortcomings can inform the choices we make about courses today.

We Don’t Deliver Enough Mathematical Background

I believe mathematical maturity should be a core learning goal of any CS program. All graduates should be comfortable writing and reading proofs, as they’re an excellent exercise in technical communication and argumentation. In a proof, we must convince the reader of a result, citing our reasoning and the steps we took to reach that conclusion. Even if a graduate never writes a formal proof, the structured and logical approach of proofs is widely applicable in an engineer’s duties. How can I convince my colleagues that my code has no data races? Will my data structure ever leak memory? Am I certain this code has no deadlocks? We talk a lot about pre- and postconditions in CPSC 210. Given some preconditions, how can I be sure that the postconditions are met? The answer to all of these is to prove to yourself (or your inquisitive coworkers) that they are the case. Regular software engineers practice proofs whether they realize it or not. Moreover, graduates should be able to apply proof techniques to CS areas such as algorithms, compilers, or distributed systems. If they can’t understand the basic proofs of these fields, then they can’t really specialize or work in them. Part of the value proposition of a university over a bootcamp is that we treat the material with more mathematical rigor, and can deal with these correctness problems as we encounter them in the wild.

Personally, I’ve grown a lot from my additional background in proofs and I attribute most of my problem solving skills to the proof courses I’ve taken. I’ve even had to use proof skills during one of my internships where we were expected to read and apply the (mathematically dense) literature on the Three Generals Problem, Byzantine Fault Tolerant algorithms, and the FLP impossibility result.

In my role at Databricks, I came across proofs in my coworker’s comments showing how his row group data skipping would work in our Parquet reader. To summarize, we want to skip groups of database rows by looking only at their statistics. SQL defines very specific semantics when evaluating filters over statistics, and these comments proved for each case that we preserve the correct semantics. Without the written proofs, we’d be stuck with an incorrect implementation or a piece of code no one would dare touch for fear of breaking it. Proof is technical communication.

// Given `col != val`:
// skip if `val` equals _every_ value in [min, max], implies
// skip if `val == min AND val == max` implies
// skip if `val <= min AND min <= val AND val <= max AND max <= val` implies
// skip if `val <= min AND max <= val` implies
// keep if `NOT(val <= min AND max <= val)` implies
// keep if `val > min OR max > val` implies
// keep if `min < val OR max > val`
NotEqual => min_max_disjunct(Ordering::Less, Ordering::Greater, false),
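
The derivation in that comment is small enough to check mechanically. Here is a minimal Rust sketch of that check, under simplifying assumptions of mine: integer columns, no NULLs, and statistics with `min <= max`. The function names are hypothetical and this is not the actual reader code (it does not reproduce `min_max_disjunct`). It compares the derived predicate `min < val OR max > val` against a brute-force reading of "some value in [min, max] differs from val".

```rust
/// Derived rule for the filter `col != val`: keep (read) the row group
/// if `min < val OR max > val`, using only the group's min/max statistics.
fn keep_for_not_equal(min: i32, max: i32, val: i32) -> bool {
    min < val || max > val
}

/// Brute-force reference: a row could satisfy `col != val` exactly when
/// some value permitted by the statistics, i.e. in [min, max], differs from `val`.
fn some_row_could_match(min: i32, max: i32, val: i32) -> bool {
    (min..=max).any(|v| v != val)
}

/// Exhaustively compares the derived rule with the reference on small ranges.
fn derivation_holds_on_small_ranges() -> bool {
    for min in -4..=4 {
        for max in min..=4 {
            for val in -6..=6 {
                if keep_for_not_equal(min, max, val) != some_row_could_match(min, max, val) {
                    return false;
                }
            }
        }
    }
    true
}

fn main() {
    assert!(derivation_holds_on_small_ranges());
    // Every row equals 7, so `col != 7` matches nothing and the group can be skipped.
    assert!(!keep_for_not_equal(7, 7, 7));
    // The range [1, 9] contains values other than 5, so the group must be kept.
    assert!(keep_for_not_equal(1, 9, 5));
    println!("derived predicate agrees with brute force on all tested ranges");
}
```

An exhaustive test over small ranges is not a proof, but it is a cheap way to catch a slip in a derivation like the one above. The written argument is still what covers every case.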

While the communication and argumentation aspect of proofs is vital, it comes with other benefits! Proofs are an excellent conduit for students to learn about mathematical concepts and abstractions. Mathematics rigorously encodes notions of sets, relations, functions, the cross product, images, pre-images, graphs, trees, and a bunch more. Network problems can be reduced to graph problems. Relations are core to the way we interact with our databases. Map-reduce pipelines can be reduced to sets and operations on them. Leveraging properties like commutativity can make huge performance gains and are core to how we share data. If abstraction is the language of computer science, proofs are the exercise to learn the language.

I hope I’ve convinced you that proofs are important and relevant to computer scientists and software engineers. It may feel silly to list out the motivations for proof in a computer science education, but I’ve gotten feedback from many students (and some professors) who just don’t see the point of proofs!

Despite the importance of a proof education, UBC CS does not give students the foundation they need to study advanced fields and work with mathematical rigor. UBC students feel this themselves: in my survey, nearly 40% of respondents said that they do not get enough experience writing proofs! Interestingly, before taking CPSC 320, 34% of students feel like they don’t have enough proof experience. This jumps to 44.9% after taking CPSC 320. There is clearly something missing, and students feel it.

[Figure: survey results on students’ proof-writing experience (proof_stats)]

Let’s take a look at what UBC CS teaches as far as proofs. We introduce proofs in CPSC 121. In addition to proofs, it covers predicate logic, circuits, automata, and complexity theory. It only teaches proofs in modules 7-10 out of 11. This isn’t much time to introduce and practice a very difficult skill. Proofs are hard, and CPSC 121 just doesn’t give students enough time to practice. Moreover, some of the written proofs in CPSC 121 have been replaced with fill-in-the-blank or drag-and-drop proofs. I argue that this style of teaching and assessment doesn’t deliver the value of proofs to students. They don’t get practice building a logical argument to convince the reader of their work. In contrast, synthesizing a proof from a blank page forces students to practice creativity and to apply math to solve problems. It gets them very familiar with the tools of abstraction that math presents, like sets, relations, and functions.

What’s even more alarming is that students are never formally introduced to notions of sets, relations, functions, images/pre-images, or cardinality to begin with! These ideas are present in any basic Intro to Proofs course, and yet CPSC 121 fails to deliver on these basic structures and properties.

Let’s consider a student’s next encounter with proofs: CPSC 221. CPSC 221 alternates its homework format each week, so only half of the homework lets students exercise their proof skills. This is a generous estimate, since the written assignments typically aren’t fully proof assignments. Altogether, CPSC 121 and 221 don’t give students enough time to develop as mathematicians. The lack of time and the quality of approach lay a poor foundation for CPSC 320 and any future encounters with mathematics or its applications to computing. As someone who’s taken MATH 220, 223, and 320, I can say there just isn’t a replacement for exposure and experience. I contrast that with my experience TAing for CPSC 320, where many of my students expressed a lack of confidence in their proofs, and felt that CPSC 320 was the biggest hurdle in their degree. Some of them had failed CPSC 320 a couple of times. Other students admitted to putting down several ideas and key words on the homework, hoping to scrape by with some rubric points; their proofs weren’t purposeful and structured. My experience grading assignments confirms this. Students were familiar with the words of mathematics, but couldn’t string these ideas into coherent proofs. I reiterate: 40% of the respondents on the survey claimed to not have enough experience writing proofs. This astounding number shows that we need to make a change.

I’ll now consider what other universities do. UofT requires that their students take 6 credits of calculus with proofs in their first year [1], approximately equivalent to taking UBC’s MATH 120 and 121. In addition to the proof-based calculus, UofT supplements them with 3 credits worth of proof exercises in their 9 credit full year first year course: CSC110Y1: Foundations of Computer Science I / CSC111H1: Foundations of Computer Science II [2]. It covers propositional logic, first order logic, number theory, function specification, correctness proofs, runtime analysis, correctness of cryptographic algorithms (Diffie-Hellman), induction/recurrences, and proofs on graph algorithms. Quite the busy first year!

Waterloo requires a proof-based algebra and number theory course and a proof-based linear algebra course in first year. Right off the bat, their students get 2 courses worth of proof exercises. Their algebra and number theory course, Math 1[34]5: Algebra, introduces proof techniques, set theory, induction, relations, injective/surjective functions, and number theory, among many other topics. It covers roughly a hybrid of UBC’s MATH 220 and MATH 312. Their linear algebra course is roughly equivalent to UBC’s honours linear algebra, MATH 223.

All in all, students at UofT and Waterloo come out of their 1st year with at least 2 courses worth of proof experience. In addition, both UofT and Waterloo provide a proof-heavy 2nd year course on CS theory (CSC236H1: Introduction to the Theory of Computation and CS 245: Logic and Computation respectively), and a 3rd year algorithms course just like us: CSC373H1: Algorithm Design, Analysis & Complexity and CS 341: Algorithms respectively. After a total of 12 credits of working with proofs in their program, UofT and Waterloo students can confidently tackle proofs that come up later in their careers, whether in algorithms, theory of computation, distributed systems, formal methods, compilers, or just convincing a coworker that their code is correct.

Taking inspiration from Waterloo and UofT, I believe we should introduce basic number theory, introduce all the basic tools of mathematics like sets, relations, functions and their properties, and tie these ideas back to computer science with recurrences, analyzing algorithms, cryptography, and theory of computation. There are many ways to go about this, but the key is to get students to practice the skill. I would argue that 2 courses that focus on rigorous proofs give students the requisite experience communicating their ideas through formal mathematics. Students will feel very comfortable writing and reading proofs after 2 courses. This would also bring a UBC student to near equal footing with the average Waterloo or UofT student’s experience writing proofs in their first 2 years. A perk of moving some of the theoretical algorithms to a 2nd year proofs course is that universities like UofT manage to fit all the content up to CPSC 420 into their first three years. Other universities manage to fit randomized algorithms into their second year as well. A second year theory course would also form a solid foundation on which future courses like CPSC 311/411, 320, 421, 416, 436S, and 436R can build and rely. How should one use two courses worth of proof? I propose teaching an equivalent of MATH 220 in first year, where students learn the basic tools of mathematics. Despite 220 being a second year course, its content is elementary and does not depend on anything besides high school mathematics. Math students are even petitioning for MATH 220 to become a 1st year course! Such a course introduces divisibility, sets, relations, functions, cardinality, induction, and other proof techniques. Students come out with a tool set and a basic understanding of how to apply these tools in a formal proof. I’d like to build on that with a 2nd year course that applies these tools to computing.
Such a course would study recurrences, apply number theory to cryptography, discuss algorithmic complexity, and theory of computation. By approaching these topics with a strong math background, students can more rigorously analyze and reason about algorithms and models of computation than they can presently. Such a 2nd year course would also bridge the math taught in 1st year, to the math expected in CPSC 320, thus reinforcing the skill and keeping it fresh in students’ minds. I believe this sequence of courses would serve students well in their journeys through computation.

Theory of Computation

Theory of Computation (TOC) is not a glamorous topic, nor one that excites most students. Here I discuss what UBC teaches in terms of TOC, I compare it against other universities, and present my case on why we should expand our treatment of the subject. This one’s a more controversial take, but I hope I can do TOC justice by showing its importance :)

UBC’s required CS courses cover theory of computation (TOC) in CPSC 121 and CPSC 320. In CPSC 121, we discuss regular expressions, finite automata (both deterministic and non-deterministic) and briefly touch on Turing machines. All of this content is covered in a total of 2 lectures in CPSC 121, notably, before proofs are introduced. In CPSC 320, we teach complexity classes and reductions to prove that problems belong to certain classes. While we cover some of the basics of TOC, our required courses don’t teach as much content as other universities. I’ll start off by highlighting what other universities cover, then make an argument as to why we should incorporate more theory of computation in our core curriculum, and even add a required 2nd year course. Let’s take a tour of Theory of Computation in Waterloo and UofT.

Starting with Waterloo, they cover TOC concepts in both CS 245: Logic and Computation and CS 241: Foundations of Sequential Programs. Waterloo’s CS 241 goes over the basics of programming language and compiler construction. Motivated by compilers, it covers DFAs and NFAs, regular languages, context-free languages, and context-sensitive languages. Additionally, they apply these concepts to understand both bottom-up and top-down parsing and code generation, and ultimately explain the internals of a C-style compiler. Not only does Waterloo expand on the theory of languages and grammars, they also show the motivation to study this field and apply it to compiler construction. Additionally, Waterloo’s CS 245: Logic and Computation goes further than UBC on first order logic by also studying the concept of proof systems and formal verifiers. They showcase algorithms for formal deduction in propositional and first order logic (i.e. given a set of truths, prove or disprove a logical expression). This course also introduces the concepts of soundness and completeness and leads into discussions on automated theorem provers and unification. The course finishes off by proving Gödel’s Incompleteness Theorem and discussing Turing machines, decidability, and the Halting Problem. Waterloo definitely delivers content beyond the realm of TOC, but what it does deliver leaves students equipped to conquer any future encounters with computability, logic, and programming languages.

UofT certainly doesn’t cover as much as Waterloo, but still exceeds UBC’s content through their TOC course CSC236H1: Introduction to the Theory of Computation. This course goes into more depth on induction, recurrences, and algorithm complexity than CPSC 121. It also covers formal language theory for 3 weeks. They show the equivalence of DFAs, NFAs, and regular expressions, formally define regular and non-regular languages, and introduce the Pumping Lemma. While they miss out on Turing machines, computability, the Halting Problem, and Gödel’s Incompleteness Theorem, their students still get a more rigorous study of automata and languages than we do.

We notice that both Waterloo’s and UofT’s curricula require more content on formal language theory, and do so in their 2nd year courses. Formal languages, automata, and computation models are foundational for compilers, language theory, and algorithms. With their extra language theory background, UofT and Waterloo students are better equipped to handle problems involving code generation and parsing, which show up all the time in computing. Waterloo students go so far as understanding how their entire C compiler is built through the lens of formal language theory. Speaking personally, I’ve had to do parsing and code generation for both my internship and my research project at Systopia. Finite automata are useful tools that help model many problems in distributed systems, networking, programming languages, and even operating systems. Furthermore, Turing machines, decidability, and the Halting Problem are fundamental to notions of computability and program verifiability, and to deciding whether you should give up on the problem you’re trying to solve.

I’ve discussed above that introducing a 2nd year proofs-based course would benefit students’ ability to reason and work with mathematical rigor. Such a course would be a perfect opportunity to teach computability and formal language theory. Students would get the chance to exercise proofs on structures of computation. This sets students up to model their problems using structures like state machines or Turing machines, and to use the tools they learned in a 1st year proofs course to reason about these structures and models. Proofs on state machines prepare students to study verification of distributed systems algorithms. Computability theory sets the stage for further discussion on complexity classes in CPSC 320, and is essential to formal verification. Formal language theory prepares students for a rigorous understanding of programming languages and parsers. Admittedly, expanding the TOC content would not be well received by everybody. I maintain that a second year math course is imperative, and adding more theory-based algorithms content instead of TOC would also be a good option. The TOC content will be a judgement call by curriculum designers, and I hope I’ve presented my case as to why we should expand it.

Systems

I think that the systems track is the most under-served track in UBC CS. There are many facets to systems: it could include programming languages like C and C++, assembly, computer architecture, assemblers, compilers, operating systems, disk and network IO, and much more. Of course, it is impossible to require expert level knowledge in all these topics, but I’d like to make an argument for why we should give systems more attention than it currently gets.

Some students and professors have doubts as to whether knowing operating systems or basics of compilers will be useful for students in industry. They argue that the majority of students will never work on an OS or a compiler, so why bother learning that content? “Most students end up working on either the backend or frontend of some web application anyway”. This in my opinion is a crude reduction of what software engineers do at large tech companies. Here are some operating systems and compiler concepts that show up in “application engineering”:

  • Understanding your threading model and async runtime is valuable when choosing a programming language or deciding how to tackle parallel or asynchronous tasks. How does spawning a new thread compare to a goroutine in Go?
  • OS knowledge shapes the way you fetch resources over the network and from disk. Why does buffering help reduce the cost of IO? If I want to persist data on the disk, is using a write sufficient? (hint: it’s not)
  • What are the effects of context switches on latency-sensitive systems?
  • How do I protect my customers’ data on shared machines using different isolation mechanisms?
  • Isn’t JavaScript slow? How does it run so fast? The answer is just-in-time compilation.
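
To make the buffering and persistence questions in the list above concrete, here is a minimal Rust sketch. The helper `write_durably` and the file name are hypothetical, chosen only for illustration. `BufWriter` batches many small writes into fewer system calls, `flush` hands the buffered bytes to the kernel, and `File::sync_all` asks the OS to push the data to the storage device, which a plain write does not guarantee.

```rust
use std::fs::{self, File};
use std::io::{self, BufWriter, Write};
use std::path::Path;

/// Hypothetical helper: writes each line to `path` and returns only after
/// asking the OS to flush the file's data and metadata to the storage device.
fn write_durably(path: &Path, lines: &[&str]) -> io::Result<()> {
    let file = File::create(path)?;
    // Buffering: many small writes are collected in user space and issued
    // to the kernel as a few large write system calls.
    let mut writer = BufWriter::new(file);
    for line in lines {
        writeln!(writer, "{line}")?;
    }
    // flush() moves the buffered bytes into the kernel, which may still
    // hold them in memory for a while.
    writer.flush()?;
    // sync_all() (fsync on Unix) asks the OS to write them to the device.
    writer.get_ref().sync_all()?;
    Ok(())
}

fn main() -> io::Result<()> {
    let path = std::env::temp_dir().join("durable_write_example.txt");
    write_durably(&path, &["first record", "second record"])?;
    let contents = fs::read_to_string(&path)?;
    assert_eq!(contents, "first record\nsecond record\n");
    fs::remove_file(&path)?;
    Ok(())
}
```

This is only a sketch. On many systems, making a newly created file survive a crash also requires syncing its parent directory, which is exactly the kind of detail an OS course makes visible.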

These skills are even explicitly in demand for positions that don’t involve building an OS. I recently had to tinker with kernel options to get my web server to take more concurrent TCP connections (and pending SYN messages) when benchmarking a thread pool, which is nowhere near kernel development… These are just a few examples, but the point is that you don’t need to be an operating systems researcher/builder to make use of OS and compiler knowledge. To say that students don’t need to know systems is to say they’re better off without being able to apply to production engineering positions, devops positions, and many (widely desired) software engineering positions at companies like Databricks, Jane Street, Google, and elsewhere. Students shouldn’t need to take a 4th year OS course like CPSC 436A to apply to these positions. Instead, I believe it should be core content.

To further emphasize my point, let’s take a look at what other universities do. UofT’s second year systems course is entirely dedicated to interfacing with Unix systems using C. File descriptors, network IO, makefiles, memory layout, processes, forking, and threads are all introduced. Waterloo’s second year course, dubbed “baby compilers,” introduces assembly languages, parsing, the assembler, linker, and compiler. Meanwhile, our CPSC 213 barely scratches the surface on Unix tools and philosophy. It teaches assembly content that is redundant with CPSC 313. Students come out not understanding what linking is, or what the different segments of a process’s address space are (relevant for understanding read/write privileges of your data). They barely know what a Makefile does besides being “the thing that lets me run make all.”

While they cover different topics, both UofT and Waterloo cover significantly more than us.

For their third years, both UofT and Waterloo use the standard undergraduate OS textbook “Operating Systems: Three Easy Pieces”, dedicating an entire course to it. CPSC 313 attempts (and fails) to teach the same content in a third of the time! As a former TA for CPSC 313, I can assure you that students do not recognize the performance implications of a context switch. They barely know how interrupt tables work. They do not understand the significance of the filesystem chapter or the importance of journaling for persistence. They never open a network socket. They never use concurrency. The significance of memory mapping is lost: mmap was mentioned once at the end of a tutorial. What was the point of the virtual memory section? In addition, they never pick up on the Unix philosophy of representing everything as a file, from network streams to IPC. They get none of the benefits of a university education in computer science.

Not only is the content for a course like 313 lacking, its assignments and exercises are too. Unfortunately, UofT and Waterloo offer little public visibility into their courses. Anecdotally, I’ve been told that 313 is laughably easy compared to UofT’s OS course. My observation is that our assignments would be at a 2nd year level at UIUC. Universities like UIUC and Stanford, whose course materials are more public, provide assignments that are far more involved, interesting, and relevant to operating systems than those in CPSC 313. They involve implementing memory allocators, implementing a filesystem, and building a full shell (and not a 5-line baby shell).

My curriculum survey pitches projects and labs similar to UIUC’s CS 341 for CPSC 313. Here’s what students have to say:

In CPSC 313, do you want to transition from the current tutorials to weekly labs that include programming exercises? Here are example lab exercises from another university:

  • Build data structures that are safe to use with multiple threads, like a thread-safe queue.
  • Implement a small version of Valgrind by tracking calls to malloc and free to find memory leaks.
  • Implement a library for parallel data processing using threads. This would let you transform and accumulate data in parallel.
  • Build a chat room server that sets up connections between clients so they can message each other.

[Charts: survey responses on weekly programming labs in CPSC 313]
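To give a taste of the first exercise in that list, here is a minimal sketch of a bounded thread-safe queue. It is written in Python rather than C to keep it short, and the class name `BlockingQueue` and its `push`/`pop` methods are my own illustrative choices, not taken from any university's lab handout.

```python
import threading
from collections import deque


class BlockingQueue:
    """A bounded FIFO queue that is safe to share between threads (illustrative sketch)."""

    def __init__(self, capacity):
        self._items = deque()
        self._capacity = capacity
        lock = threading.Lock()
        # Two condition variables share one lock: one signals "there is
        # something to pop", the other signals "there is room to push".
        self._not_empty = threading.Condition(lock)
        self._not_full = threading.Condition(lock)

    def push(self, item):
        with self._not_full:
            # Re-check the predicate in a loop: a wakeup does not guarantee
            # that space is still available by the time we run.
            while len(self._items) >= self._capacity:
                self._not_full.wait()
            self._items.append(item)
            self._not_empty.notify()

    def pop(self):
        with self._not_empty:
            while not self._items:
                self._not_empty.wait()
            item = self._items.popleft()
            self._not_full.notify()
            return item
```

The interesting part for a lab is the `while` loop around each `wait()`: students have to reason about why an `if` is not enough once several producers and consumers share the queue.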

In CPSC 313, would you like to see fewer, but larger systems programming assignments? Here are example assignments from another university:

  • Implement a small filesystem like V6. You would implement reading and writing to disk, directory entry management, and inode management.
  • Build your own commandline shell like bash/zsh. You would track the history of commands, manage processes (kill, get info, etc), and use fork/exec to execute binaries.
  • Implement the make command. You would parse a (simplified) Makefile, create a dependency graph, and run the compiler (with fork and exec) in the right order to compile a project. You would then improve it by compiling independent parts in parallel.

[Charts: survey responses on larger systems programming assignments in CPSC 313]

This shows that students have an appetite for expanded systems content done well.
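For a sense of what the core of the make assignment looks like, here is a rough sketch of the dependency ordering step, using Python's standard `graphlib` module. The function name `build_batches` and the tiny example project are hypothetical. A real solution would be in C and would also parse the Makefile and fork/exec the compiler.

```python
from graphlib import TopologicalSorter


def build_batches(deps):
    """Group targets into batches in dependency order.

    `deps` maps each target to the set of targets it depends on.
    Everything in a batch depends only on earlier batches, so the
    members of one batch could be built in parallel.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        batches.append(ready)
        sorter.done(*ready)
    return batches


# Hypothetical project: app links main.o and util.o, each built from a .c file.
example_deps = {
    "app": {"main.o", "util.o"},
    "main.o": {"main.c"},
    "util.o": {"util.c"},
}

if __name__ == "__main__":
    for batch in build_batches(example_deps):
        print(batch)
```

Here the two object files land in the same batch, which is exactly the "compile independent parts in parallel" extension the assignment asks for.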

Designing Cohesive Courses and a Curriculum with Flow

Course and curriculum design at UBC has seen a lot of change over the years. The curriculum’s evolution has uncovered structural pitfalls that we should seek to avoid moving forward.

The first issue I see is that we cram too much unrelated content into our courses (dubbed “Franken-courses” by one professor). The second issue I’ll address is a lack of holistic vision or design in our curriculum. The combination of the cramming and lack of planning leaves students working harder for poorer results. I’ve had several professors see students struggling and doubt the students’ abilities to handle the CS curriculum I propose. These structural issues are the reason students needlessly struggle.

Course Content Cramming

The issue I identify here is twofold: We are placing unrelated content into a single course, and we are trying to fit too much content into the courses we do have. To provide some perspective, consider the other Canadian universities. Waterloo and UofT both have a total of 11 required CS courses throughout the degree. In contrast, UBC has a total of 8. Our curriculum tries to achieve the results that UofT and Waterloo do with 3 fewer courses. US universities have a longer 16 week semester, so their course counts do not directly translate. To show you the effects of fitting 11 courses into 8, let’s take a look at CPSC 121 and 313.

Let’s start with CPSC 121. It aims to introduce students to proof techniques, automata, regular languages, logic, algorithm analysis, digital logic, basic CPU design, and assembly. This course is trying to cover a lot of ground: it is the only introduction to proof that students get, it is half the theory of computation students get, and it is half the digital logic and computer architecture students get. I reiterate: this is a single course. This content would be taught in 2 or 3 courses elsewhere. It is clear to me that part of the reason for our poor maths is that we’ve chosen to cram all these disparate topics into a single course. Let’s consider an example. We teach the idea of pipelining and assembly in CPSC 121 through the paper computer lab. Despite that, we teach CPU design from scratch in CPSC 313. Nothing stuck, and we wasted valuable student time. The nuance, motivation, impact, and design decisions are entirely lost because you can’t deliver that in a single lab. Another example: students spend a total of 2 lectures on automata, Turing machines, and regular languages. They haven’t proven any results, and in fact they do not know how to prove anything by then. I guarantee that the average student does not understand the significance of computability and languages after only two lectures. This level of knowledge does not stick.

Now consider 313, which covers CPU design, caching, file systems, virtual memory, and kernel basics. 313 is equivalent to teaching a computer architecture course and an operating systems course in one. Because of this, 313 is forced to rush much of the OS content at the end, and misses so much core systems content. As a 313 TA, I heard from my students that they did not understand the role of the kernel, how OS systems like virtual memory work, how OSs affect their programs, or how to use OS abstractions to their advantage. These insights are lost because it is impossible to deliver an OS course in a month.

Can a course incorporate several areas into one? Absolutely! In fact, UofT’s entire first year is a 9 credit course that combines 6 credits of intro to programming and 3 credits of proof (CSC110Y1: Foundations of Computer Science I and CSC111H1: Foundations of Computer Science II). Waterloo combines parts of compilers and theory of computation in their CS 241: Foundations of Sequential Programs. The common thread is that combining a variety of topics requires that you give each topic its due time, and that the topics reinforce and tie in to each other.

How did CPSC 121 and 313 come to exist, and why are they still around? Having spoken to professors, these courses seem to have resulted from a merge of computer architecture and discrete mathematics many years ago. Today, they remain a sticky fixture on our curriculum due to the department’s resource constraints and lack of manpower. There are also mountains of bureaucracy to get through in the Department of Computer Science and the Faculty of Science to make any changes to the curriculum. Historically, if UBC CS wants to make a change to the curriculum, it is easier and cheaper to shuffle topics around our current course structure, leading to this mess. We are currently trying to take a complete CS education that would take 10 required courses in Waterloo or UofT, and fit it into our 8 required courses.

I believe our long-term goal should be to have the standard 10 required CS courses, with clear delineation between what each course delivers. I have been assured by several professors that it will be a very long time before we see major curriculum changes. Until we get there, how can we make cohesive courses with the current 8 course structure? If we are to improve our curriculum while staying within the 8 course structure, we must first align the content to be cohesive. We need to amalgamate common topics together. Instead of teaching assembly languages in 121, 213, and 313, keep it to just a single course. Merge the software engineering content that’s needlessly spread across 2 courses, when it takes a single course at every other university I’ve looked at.

Holistic Curriculum Design

The second structural issue I see in UBC CS is our lack of holistic curriculum planning. It feels like the courses were each designed individually, without considering their role within the entire curriculum. This manifests in both the content delivered and the way content is delivered to the students. For example, we briefly introduce concurrency in CPSC 213, but we fail to follow up on it in 313. 213 teaches an assembly language, and 313 reteaches a completely different assembly language as if it were the first time. 213 and 313 weren’t made with one another in mind. These cases are easier to spot, but there are more subtle failures in curriculum design. The one I’ll be focusing on is CPSC 210, and how it does not fulfill its role within the curriculum. This is despite it being a cohesive, decently structured course. I’ll also briefly touch on our introduction to systems programming languages, and how we fail to cohesively teach memory and memory management.

CPSC 210 sits at an important point in our curriculum. After 110, students have just been introduced to programming but they haven’t built anything substantial with code. Then they step into CPSC 210, where they are faced with the mountain of abstraction and complexity that is software design. They’re suddenly thrown into inheritance hierarchies, the tradeoffs of interfaces and abstract classes, and these arcane UML diagrams that model data and behaviour! Meanwhile they can’t build a simple linked list 🤦. As the adage goes, learn to crawl before you design complex software systems. Indeed, in every other university I have reviewed, one feature remained consistent: There are two introduction to programming courses. This is the case at Carnegie Mellon University, University of Illinois Urbana-Champaign, University of Toronto, Waterloo, MIT, Stanford, and more. These introductory programming courses tend to focus on basic control flow and building data structures. Notably, none of these universities attempt to teach software engineering principles or UML diagrams in their first two courses. This approach lets students get very comfortable implementing algorithms, solving problems, and manipulating data before they move on to bigger and better things.

I also want to comment on programming languages and paradigms. A 110 student has only ever known functional programming. Not a loop or mutation in sight! 210 dives headfirst into Java and, in my opinion, doesn’t spend enough time talking about language paradigms. I want to highlight CMU and Waterloo because their 2 first year intro to programming courses are a functional programming course and an imperative programming course. Waterloo starts off like us with Racket in their first semester. Their next course is in C, where their students spend a few weeks bridging the gap between Racket and C. They discuss programming paradigms, and even show how to code Racket in a more imperative way with (set! ...). Throughout their 2nd course, students practice implementing simple data structures and algorithms, acclimating to the new programming paradigm and gradually increasing the complexity of their projects. The gradual ramp of complexity is the key here.
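To make the paradigm gap concrete, here is a small sketch of the same computation written both ways. It is in Python purely for illustration (neither Racket nor C), and the function names are my own.

```python
from functools import reduce


def total_functional(numbers):
    """Functional style: no variable is ever reassigned; the result is a fold."""
    return reduce(lambda acc, n: acc + n, numbers, 0)


def total_imperative(numbers):
    """Imperative style: a loop repeatedly mutates an accumulator."""
    total = 0
    for n in numbers:
        total += n  # in-place update, the kind of thing (set! ...) expresses in Racket
    return total
```

Both return the same answer. What changes is how you reason about them: the first by substitution, the second by tracking state over time. That shift is what a bridging course has to teach explicitly.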

Finally, there’s the issue of systems programming languages. At UBC, we teach C in CPSC 213 and C++ in CPSC 221; neither is a prerequisite of the other. Because of this setup, neither course takes responsibility for teaching students how to work with a systems programming language. They both loosely mention that you should free your memory, perhaps use valgrind, and just run make in your terminal to build the code. Both courses are doing a balancing act of avoiding being mutually redundant, while teaching the minimum needed to run programming assignments. In the end, you need to piece together what you learned in these two different courses to get an idea of how memory management and pointers even work. The result is that 3rd year students barely know the difference between stack and heap allocated data. They don’t know how to use out parameters in C. And they’re afraid to touch their Makefiles. On the other hand, Waterloo and UofT both have a set, specific course to teach these things.

One interesting historical note is that 221 was ported from UIUC’s CS 225, which has a C++ intro to programming II course as a prerequisite. 221 is not meant to be taught without a systems programming background. Because of this, UIUC’s version is much more advanced than UBC’s.

To summarize, a cohesive course isn’t enough. You need to consider a course (like 210) in context with the rest of the program. The recipe is to both teach the right thing, and teach it at the right time. My proposal is to move all the software engineering content like UML diagrams, design patterns, and inheritance to CPSC 310. Make 210 the 2nd introductory CS course it was meant to be, and focus on building basic data structures and algorithms. I would go so far as to say that CPSC 210 should be in C and be the single point of entry for memory management and systems programming so that 213 and 221 can be more focused.

The Dangers of Automated Grading

Next on the chopping block is the use of PrairieLearn (PL) and PrairieTest to do our assessments in UBC. I truly believe PL as it is implemented right now is the single most damaging thing to happen to UBC CS. The pitch sounds great: students get instant feedback, TAs don’t have to mark assignments anymore, grading is so much more consistent, and it’s so much cheaper for the (extremely resource-constrained) department! PrairieLearn has successfully integrated into most of the core courses at UBC, and there have been a lot of investments into opening more computer-based testing facilities. The train has left the station long ago, but I want to deliver the reality of PrairieLearn from a student’s perspective so that we can at least use PrairieLearn better.

Consider a typical student: Bob. Bob goes to all the lectures, and does most of his studying by practicing PrairieLearn problems. Oftentimes, Bob gets these questions wrong and tries to figure out how to correct his mistake. He usually tries looking on Piazza, asking a friend, or asking a TA for help. The response typically goes like this: “Oh for problems like these you have to solve it this way: …”. With the help of his friend, Bob figures it out! He just needs to remember that if “LRU” shows up, he uses this handy equation his friend gave him. Seems that “LRU” caches always kick data out the other end. Poor Bob didn’t quite understand or notice one of the question’s assumptions: “an array larger than the cache”. When the exam rolls around and the assumption is instead “the array is smaller than the cache”, his handy equation fails him! His intuition told him that an LRU cache kicks data out if it comes back around to the same slot. Bob feels very frustrated because he felt that he studied really hard. After all, he’d put in so much practice!
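The assumption Bob glossed over, whether the array fits in the cache, changes the answer completely. Here is a minimal simulation to show it, assuming a fully associative LRU cache and an array that is scanned front to back twice. The setup and the names `lru_hits` and `scan_twice` are my own simplification, not an actual PL question.

```python
from collections import OrderedDict


def lru_hits(cache_lines, accesses):
    """Count hits for a fully associative LRU cache holding `cache_lines` blocks."""
    cache = OrderedDict()  # ordered from least to most recently used
    hits = 0
    for block in accesses:
        if block in cache:
            hits += 1
            cache.move_to_end(block)  # mark as most recently used
        else:
            if len(cache) >= cache_lines:
                cache.popitem(last=False)  # evict the least recently used block
            cache[block] = True
    return hits


def scan_twice(array_blocks):
    """Access pattern for two sequential passes over an array of `array_blocks` blocks."""
    return list(range(array_blocks)) * 2


if __name__ == "__main__":
    print(lru_hits(4, scan_twice(3)))  # array fits in the cache
    print(lru_hits(4, scan_twice(5)))  # array is one block too large
```

With a 4-block cache, the 3-block array hits on every access of the second pass (3 hits), while the 5-block array gets 0 hits, because each block is evicted just before it is needed again. A memorized equation for one case says nothing about the other.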

The unfortunate reality is that I and many students I TAed have studied for courses exactly like Bob. Bob didn’t understand the ramifications of the assumption. Bob built a false understanding from reverse engineering a PL question. The best analogy I can come up with is that of AI training. PL and the way students use it is the same as an ML training dataset, where they iterate over the training data and try their luck on the final exam. Here’s what can happen if training is done wrong: There was a model that was being trained to identify tumors. It noticed that there would always be a ruler next to malignant samples, so it learned to become a ruler classifier! True story.

I believe that open ended, written questions are a necessity for truly learning the content. You’d be surprised at how little you know when you’re asked to explain something! When you’re asked to discuss tradeoffs, or describe the design or motivation of something, you come face to face with what you know and don’t know. Advocates of PL would argue that open ended written problems lead to too much TA overhead and inconsistent grading. That is 100% the case. But I believe the lessons you learn are worth so so much more than the consistency of your grade. I came to UBC to learn, not be assigned a grade.

Finally, proponents would argue that UIUC, a top rated school, uses PL, so it should be fine for us to use it. That is true, but the way they use PL is completely different. Consider their first year proofs course, which assesses students on PrairieLearn. Their PL exams and assignments involve students writing LaTeX proofs, with TAs manually grading the proofs. From a TA marking perspective, this is a lot better than handwritten work! From a student perspective, they get to exercise their proof skills. Compare that with UBC’s 121, which mostly involves fill in the blank questions, drag and drop, and multiple choice. Same platform, completely different execution.

To close, UBC ought to ask some more thought-provoking problems, or its students will suffer like Bob.

Other Miscellaneous Notes

This post has gotten a lot larger than I’d anticipated. I’ll give some more brief, quick comments on the curriculum:

Students don’t know how to use the terminal

Students all the way in CPSC 313 don’t know what chmod is. From the survey:

  • 38% of students feel they have a low competence in setting up ssh and using ssh keys.
  • 41% have low competence in identifying staged/unstaged commits and merge conflicts.
  • 33% do not feel confident in their ability to use a debugger like gdb to find a bug.
  • Finally, a striking 75% have low competence writing build configurations such as Make, Gradle, or CMake.

[Charts: survey responses on self-reported competence with developer tools]

I advocate for making these tools part of the critical path for assignments. Students aren’t going to learn these tools by being told about them. They need to use them, and the assignments should require that. Check out Robert’s CPSC 436S for a great example of putting tools in the critical path.

This proposal would be popular among students too! 72% of students would take a 1-2 credit course on it. While this isn’t the only way to deliver this content, it shows the desire for such content.

[Charts: survey responses on taking a 1-2 credit developer tools course]

Some argue that university shouldn’t be teaching these topics. I would say that all students are going to have to learn them, but not everyone gets an equal nudge to learn. Computing is a stereotypically male hobby. People who have connections in industry have mentors to show them the tools they should know. I think you either teach tools and make them part of the critical path, or you are leaving students behind.

Students don’t get enough programming experience

I think that students at UBC don’t get enough experience working on the kind of challenging programming projects that you would find at other schools. There seems to be widespread interest in these sorts of assignments. I’ve already shown the deep interest in better programming labs and assignments in 313. Here’s another statistic showing that people want more programming exercises in CPSC 210, which fits with the pitch to move 210’s focus to basic data structure and algorithm implementation.

Do you want more open ended programming exercises that challenge you to apply design patterns and data structures in CPSC 210?

[Charts: survey responses]

Here’s another one showing that students aren’t convinced they’re getting experience building complex software:

Do you feel like you’ve had experience designing complex software in your courses?

[Charts: survey responses]

Hard university level programming exercises are the reason why a degree has value. If you’ve built a page walker, an interpreter, or a log structured database, you’re likely set for any programming challenge ahead of you in your career. You get the added bonus of understanding the concepts really well once you build them.

People want a single computer architecture course

Do you want a new Computer Architecture course that covers assembly, digital logic (circuits, AND/OR gates, etc), the CPU, pipelining, IO, DMA, and hardware caching? This would replace the equivalent content in CPSC 121, 213, and 313.

[Charts: survey responses]

First Generation Students aren’t Enthusiastic about Systems

Folks who do not have family that attended university were 5 times more likely to say no to the proposed changes in CPSC 313 above. While the sample is small, the result is statistically significant. There is also a statistically significant skew towards not favoring a new computer architecture course in this group. While my survey isn’t the peak of statistical analysis and data collection, my view is that our program is under-serving first generation students. The result is they’re being led away from “harder” parts of CS.

210 and 310 just aren’t working

Consider these two results from the survey:

In CPSC 310, do you feel that you applied the topics learned in class to your project?

[Charts: survey responses]

Would you like to combine CPSC 210 and CPSC 310 into a single Software Engineering course?

[Charts: survey responses]

These results show that 310 and 210 just aren’t providing the value they’re supposed to. 44% of students want to merge these courses, and an astounding 66% of students felt they did not apply 310 content to their project, while only 17% did.

Proposal and Conclusion

Before looking to the future and concluding, let me take you to UBC back in the 90s. Professors Manis and Little had written a textbook called The Schematics of Computation. They taught students all about the different programming language paradigms, and had them build a virtual machine, small database, and other cool things. They also did this in 2 required first year courses. Our curriculum is very different from what it was back then, and it shows that our state right now is not set in stone. Here are my goals for both the short term and the long term:

Short term goals

These are goals that can be achieved without upending the entire course structure.

  • Revamp assignments to be more open ended
  • Tune PrairieLearn to engage more critical thinking through open ended questions and problems
    • Bring back normal proofs to 121, or use LaTeX on PL
    • Introduce short answer questions again. These should challenge students to consider design tradeoffs and apply concepts to new scenarios. Below are some examples:
      • What kind of workload for a software cache would be better for a write through cache than a writeback cache?
      • When should you use abstract classes vs interfaces?
      • What are the problems with linked lists with respect to caching? How would you change a linked list implementation to be more cache-friendly?
  • Shuffle topics in 210 and 310 so that 210 focuses more on implementing data structures and exercising coding. Have 310 focus more on software engineering topics and abstraction.
  • Integrate more developer tooling into the curriculum
    • Instead of PL assignments, have assignments submitted through GitHub, use Docker containers instead of department servers, and incorporate more build tool usage into the curriculum
  • Move all computer architecture to CPSC 213. Remove this content from 313 and 121.
  • Move all concurrency to 313 and just follow OSTEP.
  • Add the suggested assignments/labs to 313.
  • Make CPSC 121 a proper theory of computation/proofs course with similar content to Math 220. Get rid of the computer architecture and digital logic.

This final one is controversial, but I think it is essential to making all this work:

  • Make 210 a C++ or a C course. This ensures that we introduce memory management in a single place so that 221 and 213 don’t have to teach it. 213 can focus on being a computer architecture course. This works nicely because 210 is a prerequisite to both 213 and 221.

Long term goals

Just use a standard schedule like UofT or Waterloo. Tie each course to a CS textbook so professors don’t get too creative and you have a source of truth for content in the course.

Finale

If you even skimmed this, thank you! This ending is a little rushed, but I thought I’d rather get this out than let it rot in an old directory. While I’m on a new chapter of life, I still care deeply about UBC’s CS curriculum. If you want to discuss or be proactive in the department, please do so! If you have questions, you can reach out to me on any of my socials.

Footnotes


  1. UofT requires this for their CS specialist program, which is the closest program to UBC’s current CS program.

  2. This is a 9 credit course which historically was 3 separate courses: 2 intro to programming courses, and 1 discrete math course. About a third of this content is proof exercises.

 

oussama

my blog where I post stuff


2024-09-29