Sweller’s Cognitive Load Theory in action: takeaways, thoughts & questions

In 2017 Dylan Wiliam wrote on Twitter: “I’ve come to the conclusion that Sweller’s Cognitive Load Theory is the single most important thing for teachers to know.” That’s a big statement from a big educational researcher and thinker. It sounds like teachers should get their heads around Cognitive Load Theory (CLT). Fortunately, Oliver Lovell has written a very readable book about it, which not only explains the theory but gives a range of examples of it being put into practice in the classroom.

This book is definitely worth reading. That’s the case for teachers, but it’s also the case for anyone who wants to communicate or to learn. I’ve only gone through it once, but I think it may be one that I keep returning to. Lovell writes well and Oliver Caviglioli’s illustrations are, as usual, brilliant.

Perhaps ironically, I tried to read and make notes on some of it while keeping half an eye on the updates from a football match – that’s not mentioned in the book, but I’m pretty sure it’s in the list of things not to do if you’re trying to process something.

On a couple of occasions Lovell suggests going through the theory and strategies in this book in a department meeting, considering them in relation to the way you design your resources and structure your lessons. I think this is a brilliant idea – I could imagine working through this book as a department and using it as both an auditing and a developmental tool.

After just my first read through, however, here are the key things I took away. Yes, I know this is quite a detailed post, but there’s an awful lot in this book, and the problem is it applies to almost everything we do in the classroom, so it all feels relevant! Some of it confirmed what I already knew and try to do, some of it refined my thinking, some of it was new and some of it raised questions. If you already know all about CLT and/or have read the book, you may find it interesting to skip to the end and see my questions and thoughts.

Cognitive what?

If you don’t know anything about CLT then you are much better off reading the book than trying to piece it together from my amateur comments in this post. I really appreciated that the book spent section 1 laying out the theory before getting into the practice. Lovell says in the introduction that readers are welcome to skip this section, but I’d strongly suggest working through it.

But, in case it helps, here’s Lovell’s one line summary of the practical conclusion of CLT.

“The fundamental recommendation of Cognitive Load Theory: In order to increase learning, reduce extraneous load and optimise intrinsic load.” (p17)

Just in that definition it is clear why someone like Wiliam would argue this is so important. It impacts learning, and therefore is at the heart of what goes on in the classroom.

Takeaways/key points

“Working memory is the bottleneck of our thinking.” (p19) – We can draw upon an unlimited amount of information from our environment, and researchers haven’t found a limit to our long-term memory. But our working memory is limited to a small number of pieces of information (perhaps 4-7 pieces). The curb to our learning is our working memory. Therefore, we need to maximise the use of this working memory by reducing extraneous load and optimising intrinsic load.

‘Chunking’ is really important – I wrote ‘chunk!’ on page after page of the book. It comes into play in a whole load of the strategies suggested. It’s obvious when you think about it: if our working memory is limited and the cognitive load is too high, then breaking that load into chunks is a straightforward step. Too often in my experience this is limited to a strategy for learners with SEN, but it will benefit pretty much everyone.

Knowledge matters, and is the main difference between novices and experts – “The primary difference between novices and experts in a given domain is that experts possess a greater amount of relevant domain-specific knowledge.” (p28). When experts encounter situations it isn’t so much that they draw on their problem-solving initiative, but that they can problem-solve because they have, in their memory, a ‘large collection of situations and associated actions’, along with the reasons why these actions are the best actions to take. Lovell uses the examples of stockbrokers spotting patterns or authors knowing how to construct a paragraph or chapter. I think we could apply it to teachers dealing with pastoral or behavioural issues, or school leaders dealing with a crisis: knowledge from previous experiences plays a huge role in how these are dealt with.

Flowing from this, “worked examples are better for novice, and problem solving is better for experts.” (p59). This is linked to the expertise-reversal effect, and the fact that students need different levels of support based on where they are on the novice >>> expert scale.

Curriculum sequencing is essential to optimising intrinsic load – “Importantly, the element interactivity of a task depends upon both the inherent nature of the task, and the background knowledge of the learner.” (p33). I think this is particularly interesting to consider when looking at the first lesson in a scheme of work. What do the pupils already know? If they know very little, teachers need to be very careful about how much they ask of them in tasks, because the risk is the load will become too big because of the range of new knowledge they are encountering. But although this language may not be familiar to all teachers planning schemes of work, I think the basic thrust of effective sequencing will fit with their current practice.

Breaking down complex tasks into simpler ones (p46) – This ‘segmentation’ approach to progressive skill building is something we try to do with essay writing in RE. Our use of scaffolding, sentence starters, and a general approach to stepping up the complexity of essay structures as boys make progress is in line with the recommendations of CLT. The nature of the RS GCSE questions is helpful for this – the 4 and 5 mark questions, if written in a certain style, can act as building blocks for the 12 mark question.

The signs of overload and underload in the classroom – This may be totally obvious, and I guess I sense this fairly naturally in the classroom, but Lovell’s description that the signs of overload are confusion and the signs of underload are boredom was a brilliantly clear summary (p39). It’s definitely something to look out for, both in teaching and when observing others teach. It goes beyond ‘they weren’t engaged’ to highlighting what the problem may have been.

The effect of pre-teaching on cognitive load –

Pre-teaching vocab (including ‘bullet-proof’ definitions) (p40) can reduce cognitive overload. I’ve attempted versions of this at various points, with variable success. It’s mostly down to routine – I think it needs to happen in pretty much every lesson, following up from every homework. The definition also needs to be used consistently across a department, otherwise a student changing classes can create a lot of confusion.
Pre-teaching overviews and timelines is something we’ve already been thinking about as a department. I’ve been very conscious that pupils at A-level, for instance, dive into various elements of philosophy, theology and ethics, encountering various scholars and ideas and movements. The problem is that they’ve never been given a chronology to slot them into. Doing that at the start of the course means that specific knowledge that is taught in lessons can be slotted into this schema, and there is less burden on their thinking as they try to make connections in lessons.

“The reason for reducing extraneous load is to free up working memory capacity for increased intrinsic load, and therefore more learning.” (p55) – We should always aim to reduce extraneous load. There are really very few reasons for not reducing it. But the aim isn’t to make learning easy for students. If it’s too easy it will lead to boredom and they probably aren’t making as much progress as they should. Interleaving content feels like it increases the cognitive load, but that isn’t a bad thing (interleaving is effective) as long as the extraneous load is low enough to allow this, and as long as students have understood the content or can do each of the tasks individually before interleaving occurs.

The split-attention effect – “Information that must be combined should be placed together in time and space”. In other words, if you have an image and some labels for parts of the image, putting them on different slides, or even in different places on one slide, is just increasing the cognitive load (p72). See below for some thoughts on whole-school expectations for slides/presentations. This may be one of those cases where it’s worth having an expectation.

I didn’t understand the music example Lovell gives for split-attention (p80). Surely sheet music, by its nature, is very good at avoiding the split-attention effect?

Transience – Because there is a limit to our working memory, students need to either have information in their long-term memories or have it in front of them. Lovell gives the helpful example of a series of slides with a task at the end, but without the information on the previous slides. (p91). I’ve definitely had far too many students asking me ‘sir, can you go back a slide’ during a task!

IKEA instructions? – CLT suggests that we shouldn’t add in unnecessary information: if something can be understood with just the image or symbol, there’s no need to add words to explain it. This immediately made me think of IKEA booklets! Have IKEA been reading up on CLT?

Confirmations and refinements – It was interesting seeing CLT confirm some of the things we do well as a department. For example, we already employ versions of the ‘fading’ strategy. We already use a lot of model/worked examples, and it was great to see the effectiveness of this referred to a number of times in the book. We sometimes use the ‘alternation’ strategy (p107) but not for homework, so that is something I may ask my HoD about.

Questions and thoughts on applications

There is clearly still quite a lot of research and thinking to do in the area of CLT and cognitive science more widely. Not everything is nailed down, and not everything has been understood or explained. The practical outworkings haven’t all been implemented. Lovell refers to some of these areas in the book, but here are a few that sprang to mind for me.

Redundancy and Modality. I’m still not clear…

I think I understand the redundancy effect. “When information is presented simultaneously in written and spoken form, both sources of information are vying for the same working memory resource, and therefore interfering with each other.” (p62).

But in that case, is there ever a good reason for having text on the screen that you are talking about? Should you always explain (perhaps with images) then put the text up?

Part of my confusion comes from the fact that when I’ve watched talks by people explaining CLT, dual-coding, etc, they invariably put up text on the screen, whether it is a title or bullet-points or a quote from a piece of research. Is there a line somewhere between what is useful and what strays into redundancy?

For example, is the line crossed when the speaker is using the same words as are on the screen? Or when they are using different words but with the same level of detail?

I’m going to say below that CLT has a much wider application than the classroom. With the redundancy effect I’m particularly interested in how it plays out in assemblies when things are projected onto the screen. I lead assemblies every morning and often put a quote or a heading on the screen. I try hard to not put detailed text that I am just reading through (although sometimes Bible passages can stray into this) but should I not put any text up? But does transience mean that bullet-point headings do need to be on the screen?

It may be me, but I’m looking for a line that I can’t quite see at the moment.

With the modality effect you can “present information via auditory and visual channels in tandem to eliminate visual split-attention and expand working memory capacity.” (p94). I can understand the modality effect, and it’s something I’ve always tried to do in presentations or in teaching, by using text and a relevant image. But it seems to contradict the recommendation Lovell makes on p63: “Another common form of redundancy is to present the same information in both written and pictorial form.” I feel like there’s something I haven’t quite got my head around here. Are words and pictures helpful, not helpful or only helpful if presented in a certain way?

(I realise this is probably exposing my ignorance of cognitive science, although remember (!) that I did stupidly try reading some of this while watching football, which may explain my confusion…)

Netflix & transience

The suggestion is that all information delivery should be segmented, including videos. One study suggested 6 minutes was the limit for how much watchers could effectively take in. (p91). But how can TV shows, podcasts, etc, regularly run to 30-60mins in length? Is it because they are not focused on the watcher ‘learning’, therefore the expectations for the impact are lower? (I.e. it doesn’t matter if someone watching a drama on Netflix or MOTD on TV remembers all/some/any of it).

The importance of handouts

One of my big takeaways was that it could be helpful to have information handouts for every lesson. The information summarises what you have gone through, to overcome the transience effect. You have them ready to give out but you don’t give them out until you have finished the explanation stage. Or you have them face down on the desk until they are needed. We definitely do that for some lessons (and I’m quite good at making sure students aren’t looking at different information while I’m explaining) but we don’t do it all the time.

Training others

If understanding CLT is so important, how do you equip others with an understanding of it? It was interesting to see that Lovell ended the book by discussing students getting their heads around it. I can see why that would be so important for effective revision. It’s the kind of thing students should be taught about at the start of a year or key stage, rather than when in Y11/13 revising for exams.

But staff also need to know. This book is a great introduction, but even with Lovell’s straightforward, clear yet engaging style I’d imagine some teachers getting lost in it, and others not reading it. Schools need a strategy for this, as they do for all CPD.

School-wide expectations for presentations + CLT for educating adults.

It strikes me that if CLT is right that some ways of presenting information are more effective and some are less effective, then schools should have set expectations on this. This would involve any slides that are used in lessons. Obviously Maths and History slides would be very different, but the CLT principles would be the same.

And any staff briefings, INSET, assemblies, open evening talks or a whole host of other things should follow the same principles. Employing it in the classroom but not in an INSET doesn’t seem to make any sense. And using it when educating under 18s (pupils) in a Science lesson but not over 18s (staff) in safeguarding training also doesn’t make sense.

‘Good memory’? Cramming?

I’m still not sure what it means to have a ‘good memory’. Cognitive science seems to say (although I may be confused on this) that learners need to go through similar processes for embedding information in their long-term memories, and that working memory is limited. I don’t fully understand how this relates to people who seem to have much, much better memories than others, and not because they have practised any of the CLT strategies, retrieval practice, etc. My daughter, for instance, has a ridiculously good memory that just seems to absorb whatever she sees, reads, hears, overhears (you need to be careful!), etc.

I think I’ve seen research that suggests that last minute ‘cramming’ does seem to have some impact on an exam the next day, but is a really poor way of embedding things in long-term memory (which is why it’s all forgotten a week later). But how can last minute cramming be ‘less’ embedded in long-term memory, but beyond working memory? Where is it? Working memory? Long-term memory? Somewhere in between?

Wider applications? Displays? Textbooks?

This book was about CLT in the classroom. It seems to me that if the theory and the strategies are correct, they have an impact on a wide range of things: in fact, any time anyone wants to communicate something to another person. (Or is there a difference between communicating and wanting them to learn it?) For example, the way TV shows are produced, magazines are laid out, presentations and assemblies are delivered.

One particular example is displays and posters in schools. I’m sure I’ve seen some research about them not being effective (but don’t quote me on that). Do they create more cognitive overload in a classroom? If you want students to engage with them, do they need to be laid out in a certain kind of way? Another is textbooks – are textbooks laid out in the most efficient way?

Lots to think about!

You can buy the book from John Catt Ed here.
