Last month, I presented a paper entitled “Are Undergraduate Creative Coders Clean Coders? A Correlation Study” at the 53rd ACM Technical Symposium on Computer Science Education. Read the full paper here (open access). It was a fun diversion from my main research topics and it proved to be interesting enough to generate some attention. In this post, I’ll summarize our findings.
I’ve published and written about creativity before, for example explaining our “exploring creativity for software engineers” focus group study. While evaluating first-year student projects with an open-ended assignment, I saw some really cool and unique approaches and some really mundane ones. When I opened up the code, the results were similarly diverse, but not necessarily neatly linked: one student’s creative project turned out to be a complete mess on the inside.
Hence, we wondered: is there a possible correlation between the creativity of student projects and the code quality? During our CS1 course, which is taught in Java, we pay too little attention to the concept of clean code. There’s simply no room in that course! Furthermore, these are first-year engineering students who might end up specializing in chemistry. And yes, they all need to learn to program, so it makes little sense to start hammering in test-driven development (we tried).
Other studies, primarily in cognitive psychology, already hint at the potential relationship between quality and creativity. Kaufman defined creativity as something that is original, of high quality, and relevant to the task at hand. Jordanous mentioned in a 2018 paper on computational creativity systems:
These two concepts are highly interrelated, to the point that it is difficult (and perhaps inappropriate) to define creativity without incorporating quality judgements into that definition (of creativity).
Gathering the data
First problem: how do you “measure” creativity? As I wrote in Creativity Self-Assessment Is Nonsense, this is a deceptively complex problem! What will you measure, the creativity of the code? The UI or graphics? Who will measure it? Does this measure everything? And so forth. To cut a long story short, we kind of side-stepped the problem by using Amabile’s Consensual Assessment Technique (CAT). The CAT system allows several judges to simply issue a score: 1 is not creative and 10 is very. That’s it. By using the average of multiple judges' scores (and debating if the standard deviation is too far off), the global score does say something about the creativity of every aspect of the project.
The average creativity of the projects was 5.92 out of 10. The average total score for the projects was 13 out of 20. There’s a normal distribution included in the paper for the curious.
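As a minimal sketch of the averaging step described above, the following Python snippet computes a mean score per project and flags projects where the judges disagree strongly. The function name, the disagreement threshold of 2.0, and the toy scores are all assumptions made for illustration; they are not taken from the paper.

```python
from statistics import mean, stdev

# Hypothetical threshold: flag a project for discussion among the judges
# when the standard deviation of their scores exceeds this value.
DISAGREEMENT_THRESHOLD = 2.0


def aggregate_cat_scores(scores_by_project, threshold=DISAGREEMENT_THRESHOLD):
    """Average each project's judge scores and flag high disagreement."""
    results = {}
    for project, scores in scores_by_project.items():
        if any(not 1 <= s <= 10 for s in scores):
            raise ValueError(f"{project}: CAT scores must lie between 1 and 10")
        # A single judge gives no spread to measure.
        spread = stdev(scores) if len(scores) > 1 else 0.0
        results[project] = {
            "creativity": mean(scores),
            "spread": spread,
            "needs_debate": spread > threshold,
        }
    return results


# Invented toy scores from three judges, not data from the study.
toy_scores = {
    "project_a": [6, 7, 8],
    "project_b": [2, 9, 5],
}
summary = aggregate_cat_scores(toy_scores)
for name, row in summary.items():
    print(name, round(row["creativity"], 2), round(row["spread"], 2), row["needs_debate"])
```

In this toy input the judges roughly agree on the first project, while the second one would be put up for debate before its average is accepted.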
Second problem: how do you measure the clean code quality of a programming project? That’s an easier question to answer, as there are static code analysis tools that do exactly that, such as PMD, the one we used. PMD analyses Java code—it also supports other languages—and simply reports the code quality issues in a certain format. It’s highly configurable too, and you can create your own rules if you don’t like the “quickstart” guidelines. It reports things like:
- Flow problems (Cyclomatic Complexity, empty if/catch statements, …);
- Idiom problems (resources not closed, unused local variables or fields, …);
- Expression problems (simplifiable boolean expressions, confusing ternary operators, …);
- Decomposition problems (singular fields, copy-paste detector for x lines, …);
- Modularization problems (High coupling, too many fields/methods, …).
We slightly altered a set of rules based on previous research by Keuning et al. At first, I was a bit skeptical about the PMD results, so we added an additional cross-check: I manually gave a subset of projects a “clean code score” and correlated that with the PMD results. The result was a strong negative correlation, which is exactly what we’d want: the more issues PMD reports, the lower the manual clean code score should be.
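A cross-check like this one can be sketched in a few lines of Python. The post does not say which correlation coefficient was used, so the Spearman rank correlation below is an assumption, chosen because a manual score is ordinal by nature. The helper names and the toy values are invented for illustration.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(vx * vy)


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Invented toy values, not data from the study.
manual_scores = [9, 7, 8, 4, 5, 2]          # hypothetical manual clean code scores
pmd_issue_counts = [3, 10, 12, 25, 19, 40]  # hypothetical PMD issue totals
rho = spearman(manual_scores, pmd_issue_counts)
print(f"Spearman rho = {rho:.3f}")
```

A value close to -1 means the manual ranking and the PMD issue counts order the projects in nearly opposite ways, which is the agreement one would hope for.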
Relating creativity to code quality
We found almost moderate positive correlations between lines of code and evaluated creativity, between unique code quality issues and evaluated creativity, and between total code quality issues and evaluated creativity. Although the correlations are not strong, they do seem to suggest that more creative projects contain more code quality issues. Of course, a correlation says nothing about the causal relationship between two entities, so more research is still needed to figure out the why. The figure below shows a scatter plot and trend line of the relationship results.
Remember that these are first-year CS1 student projects in higher education. We did gather projects from two academic years (110 projects in total), but it’s still just data from our local faculty, so the results will very likely change if other universities are taken into account. Each course has its own style, and each professor and assistant has their own teaching style, possibly altering the above scores.
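For readers who want to try this on their own course data, here is a minimal sketch of how a correlation coefficient and a least-squares trend line of the kind discussed above could be computed. It assumes a Pearson correlation with the creativity score on the x-axis and the issue count on the y-axis; the post does not confirm either choice, and the function name and toy values are invented.

```python
from math import sqrt


def trend_and_correlation(xs, ys):
    """Return (slope, intercept, r) for a least-squares line through (xs, ys)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("undefined for constant input")
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Invented toy values, not data from the study.
creativity = [3, 5, 6, 8]        # hypothetical averaged CAT scores
total_issues = [10, 14, 12, 20]  # hypothetical PMD issue totals
slope, intercept, r = trend_and_correlation(creativity, total_issues)
print(f"issues = {slope:.2f} * creativity + {intercept:.2f}, r = {r:.2f}")
```

Running the same function with lines of code or unique issue counts as the second argument would cover the other two relationships mentioned above.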
Some PMD issues correlate more strongly with creativity than others. For example, the correlation for Cyclomatic Complexity is twice as high as the one for an unused private field. That perhaps makes sense: a complex project with lots of sloppy copy-pasted code, for enemy logic perhaps, can yield a more original project (and thus a higher CAT score), but also more complexity issues as reported by PMD. What we learn from this is that some issues are more relevant than others, and that some might be important enough to incorporate in our course, even if just mentioned as a “best practice”.
What’s the takeaway of this study? There is some evidence that the more creative student projects are, the more code quality issues arise, and hence the less clean the submitted code is. This potentially creates another problem: if we start enhancing students' creativity, we might end up with much messier code! We think it’s also vital to pay the needed attention to clean code principles, together with the conventional contents of CS1, and of course creativity and creative problem-solving itself.
We do admit that by using CAT as a measure of creativity, we went in a specific direction that might or might not coincide with others' interpretations of the term “creativity” in the context of student programming projects. Still, it would be interesting to see whether or not this study can be replicated in last-year courses. Perhaps those students have matured enough, and the correlation would be inverted? Or perhaps the problem has worsened, and the correlation has only become stronger…