Arguing with Grant Wiggins and Dan Meyer…

During Dan’s series of lesson makeovers last summer, I intended to write up a critique of the Least Common Multiple makeover he and Dave Major created.  Dan’s write-up seemed to illustrate his view of the respective roles of inquiry and direct instruction, and I’ve always thought it showed a misinterpretation of the cognitive research that cautions against unguided inquiry.  I didn’t bother chiming in until Grant Wiggins wrote this blog post, bringing up the issue of cognition, working memory, and inquiry vs. direct instruction again.  So here are my thoughts.

For context, Shipping Routes shows students the clip below and asks whether the two boats will ever get back to port at the same time:

Students are then sent to a simulator programmed by Dave Major, allowing them to choose different round-trip times for the two boats and showing their cycles back and forth.  Students can discover for themselves when the boats get back in sync, and what that has to do with least common multiples.  For example, boats with round-trip times of 2 min and 5 min will be back in sync after 10 min.  Sadly, Dave’s simulator seems to be offline right now, so you can’t try it for yourself.
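As a rough sketch of the arithmetic behind the simulator (this is not Dave's code, and the function names are made up for illustration), the first time both boats are back at port is the least common multiple of their round-trip times:

```python
from math import gcd

def first_sync_time(trip_a, trip_b):
    """Minutes until both boats are back at port together (whole-number times only)."""
    return trip_a * trip_b // gcd(trip_a, trip_b)

def port_times(trip, until):
    """Every minute mark, up to `until`, at which a boat with this round-trip time is at port."""
    return list(range(trip, until + 1, trip))

sync = first_sync_time(2, 5)
print(sync)                  # 10
print(port_times(2, sync))   # [2, 4, 6, 8, 10]
print(port_times(5, sync))   # [5, 10]
```

The two lists of port times share their first common entry at 10, which is the pattern students are meant to notice.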

Dan’s approach

In his write-up, Dan shies away from direct instruction right from the get-go:

I could tell students what to look for here and how to approach the problem. I could show a few worked examples… 

Two problems there:

  1. Some students will need more than just three examples to determine a pattern.
  2. My selection of those particular examples – that is, my decomposition of the entire solution space into just three categories – did a lot of the intellectual heavy lifting for my students. They need to decide on those three categories and come up with a rule that takes them all into account.

Regular readers of Dan’s blog (cuz, yeah, there really aren’t regular readers of my blog) may know that worked examples are an interest of mine.  Here’s how the stuff I’ve learned applies to Shipping Routes.

Science of Memory and Learning

Humans have 2 kinds of memory, working memory and long-term memory, and they function completely differently.  Retrieving information from long-term memory is almost effortless, and we can process huge amounts of information from long-term memory simultaneously.  In contrast, working memory can only hold on to 4-7 chunks of information, and only for about 30 seconds or so.  The word “chunks” is important.  For a demonstration, watch the first minute and a half of this video:

There are various theories about how information gets encoded in long-term memory, but the main idea is that the information has to stay in working memory for long enough to get practiced/recalled several times before the working memory dumps it–each practice opportunity strengthens its foothold in long-term memory.  Working memory will dump it when new information comes in, so if you keep throwing new information at someone who’s still encoding the previous information, you make it very hard for them to form long-term memories, even if they seem to be processing what you’re saying as you’re saying it.  It could have made sense to them when you said it and then disappeared from their mind when you said the next thing.  (Sound familiar to any teachers?)

For that reason, inquiry instruction can make it very hard to encode information into long-term memory.  As Sweller et al (2006) say in a paper that spawned much controversy, “Inquiry-based instruction requires the learner to search a problem space for problem-relevant information. All problem-based searching makes heavy demands on working memory. Furthermore, that working memory load does not contribute to the accumulation of knowledge in long-term memory because while working memory is being used to search for problem solutions, it is not available and cannot be used to learn.”

Grant Wiggins really does not like this paper–much of his most recent post (which inspired me to dust off my keyboard here) is spent picking it apart, particularly the “ludicrous” emphasis Sweller et al place on novice vs. expert learners.  I think Wiggins misunderstands the authors on that point.  Granted, I’m a layman, but I think the difference between novice and expert learners is pretty simple: isn’t it the difference between someone who’s encoded the information being studied into long-term memory and someone who’s still processing it from working memory?

The strong opinions (not limited to Wiggins by any means) reflect the fact that this theory of memory and cognition points out shortcomings of inquiry learning.  As I’ll describe at the bottom of this post, it does not actually mean that you should never use inquiry, but some people on both sides of the debate say it does, particularly when the theory itself is misunderstood.

Critiquing Shipping Routes

How would this play out in Shipping Routes?  In order to be successful with Dan’s lesson, students need to try different pairs of round-trip times on the simulator to discover what governs when the boats are back in sync.  Students can set times to the tenth of a second, e.g., 3.2 min for the first boat and 4 min for the second boat.  These boats would be in sync for the first time after 16 minutes.  Would they notice how 16 arises from 3.2 and 4 as the least common multiple?
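To see where the 16 comes from, here is a minimal sketch (the function name is hypothetical) that treats the decimal times as exact fractions. For fractions in lowest terms, the least common multiple is the LCM of the numerators divided by the GCD of the denominators:

```python
from fractions import Fraction
from math import gcd

def sync_time(trip_a, trip_b):
    """Least common multiple of two round-trip times given as decimal strings."""
    a, b = Fraction(trip_a), Fraction(trip_b)   # "3.2" becomes exactly 16/5
    lcm_of_numerators = a.numerator * b.numerator // gcd(a.numerator, b.numerator)
    return Fraction(lcm_of_numerators, gcd(a.denominator, b.denominator))

print(sync_time("3.2", "4"))   # 16
```

Here 3.2 = 16/5 and 4 = 4/1, so the answer is lcm(16, 4) / gcd(5, 1) = 16: five trips for the first boat and four for the second.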

Let’s give the lesson the benefit of the doubt and say that a teacher would suggest (or have a student suggest) that everyone try whole-number times to start.  So one group of students might, for example, try 2 min and 6 min, and discover that the answer, 6 min, is the larger number.  At this point, Dan would have the teacher challenge this group to see if that rule always works, with an eye toward finding counterexamples. Dan says,

If a student just tries the first example and says, “It’s easy. It’s always the longer of the two times.” I can then say, “Great. But try that on several more examples and make sure it works.” (It won’t.) Or I can suggest one of the other two categories. But I’d rather not offer those categories before the student has even considered why she might need them, or even the fact that there are different categories.

The other categories are coprime numbers (like 2 and 5) or numbers with a common factor (like 6 and 10).  Students have to identify the three situations and find the rule for each, with minimal guidance.

Do you see how this might overtax working memory to the point of inhibiting long-term memory formation…how students might successfully find one or two of the three rules on Monday and then walk in on Tuesday having forgotten what they’d discovered?  Or find all three rules, but be unable to remember the first one by the time they find the last one?  If you don’t think this would tax their memories, keep in mind that many students have to stop and think or count on their hands just to remember that (3)(6) = 18.

Modifying Shipping Routes

How could Shipping Routes be adapted so it accounted for working memory limitations?  Here’s what I would do.  Whatever rule students discover first, before encouraging them to find cases where their new rule DOESN’T work and where they need to create a new rule, I’d have them fully digest the math behind the rule that they have found.  They’ve just found this rule.  It’s fragile, sitting in their memory buffer temporarily–this is our chance to have them encode it into their long-term memory and harden it by connecting it to other conceptual schema already in their brains.  (Yes, I do believe strongly in developing conceptual understanding in addition to mere skill development).  If we have them switch to a new search for a new rule, we throw away most of the benefit of the discovery they just made.

How would I help them practice it and connect it to other concepts?  After students played around with the simulator for a while, I’d present them with text input boxes like this and ask students to fill in the blanks:

Boat 1: 2 min

Boat 2: ______

Time until they’re in sync: _____

A computer could easily analyze student responses and figure out which rule they’ve discovered.  Let’s take the example of a student who discovered that for coprime numbers like 2 and 5, you multiply the numbers.  Now I’d want to have the students really dig into that rule.
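A minimal sketch of what that analysis might look like, assuming whole-number times. The rule names and the `diagnose` function are invented for illustration; a real system would need more care:

```python
from math import gcd

def diagnose(boat1, boat2, student_answer):
    """Check a student's sync time and list the simple rules it is consistent with."""
    correct = boat1 * boat2 // gcd(boat1, boat2)
    simple_rules = {
        "larger number": max(boat1, boat2),   # works when one time divides the other
        "product": boat1 * boat2,             # works when the times are coprime
    }
    consistent = [name for name, value in simple_rules.items() if value == student_answer]
    return {"correct": student_answer == correct, "consistent_rules": consistent}

print(diagnose(2, 5, 10))   # correct, and consistent with the "product" rule
print(diagnose(6, 10, 60))  # consistent with "product", but not correct here
```

The second call is the interesting one: a student who over-applies the product rule to 6 and 10 gets flagged, which is exactly the moment to prompt for a counterexample.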

For example, I could ask them to re-represent the combination they came up with on a double number line:

Double Number Line

Then I’d ask them to self-explain the connection via drop-down menus, as in Martina Rau’s paper finding that multiple representations didn’t help students learn fractions unless students were prompted to self-explain the connections between the representations.  (The picture shows drop-down menus for self-explaining adding fractions, but you could do something similar for explaining why 2 and 5 have a common multiple at 10):

Martina Rau

Then I’d have students practice this rule a minimum of 5-10 times, perhaps following the approach of this other paper by Martina Rau.

And then, finally, I’d have the online lesson challenge these students to find a case that breaks their rule, and the learning cycle would begin again.

So When & How Should We Use Inquiry?

 As I said above, understanding working memory limitations doesn’t mean you should never use inquiry.  Rather than spelling out my views on this here, I’ll just point you all again to Dan’s posting of my thoughts on this topic over at his blog.  Incidentally, I did try to convince some researchers in the learning sciences to use Dan’s makeover lessons as a test-bed for studying instructional principles, and I still think that would be a great idea (see my first blog post ever).  I mean, researchers, if you want teachers to actually pay attention to your work, why not conduct your studies in the context of the lessons we’re all talking about online?

 

Teaching Direct Variation Conceptually (as a foundation for slope)

OK, the following may rate as one of my best test questions ever.  It might take you a second to see the point:

Question_1

If you don’t get it yet, compare it to this alternate version:

Question_2

By now you’ve probably seen what I’m getting at: the answer to #1 is “Not enough information”, because there’s a jar there, so the weight per candy is not 7 grams.

Let me share the struggles that led to this idea, and the way this approach will underpin my teaching of linear functions next year:

First of all, I want to continue to make my math real-world (even if nobody at Dan Meyer’s blog can articulate why), but too often that means students have to wade through lots of WORDS to understand a question or task.  This is especially daunting for English Language Learners, and I’m working on meeting their needs better.  So I need to find ways to capture the essence of concepts and story problems with mostly pictures and a minimum of text.

Secondly, my students often have a hard time seeing why rates are usually calculated as (y₂ – y₁)/(x₂ – x₁) instead of y/x. With a good context like Domino Effect by Mathalicious, students will discover the idea for themselves, and they’ll be able to memorize it and apply it in real-life situations that obviously give them 2 points.  But I’ve always taught direct variation at the end of linear functions, as a special case after the main concept…and after that lesson, some students revert to calculating slope as y/x.  They can’t distinguish cases when you need to subtract first from cases when you can just divide.
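To make the distinction concrete, here is a small sketch with made-up numbers (a hypothetical 50 g jar and 5 g candies; none of these figures come from the test questions above). Dividing y/x only recovers the rate when the data passes through the origin:

```python
import math

def rate_by_division(point):
    """Rate as y/x, which is only meaningful for direct variation."""
    x, y = point
    return y / x

def rate_by_differences(p1, p2):
    """Rate as (y2 - y1)/(x2 - x1), which works for any linear situation."""
    (x1, y1), (x2, y2) = p1, p2
    return (y2 - y1) / (x2 - x1)

def is_direct_variation(points):
    """True when every point gives the same y/x ratio."""
    ratios = [rate_by_division(p) for p in points]
    return all(math.isclose(r, ratios[0]) for r in ratios)

# (number of candies, total grams); all values are hypothetical
candies_only = [(10, 50), (20, 100)]
candies_in_jar = [(10, 100), (20, 150)]   # same candies plus a 50 g jar

print(is_direct_variation(candies_only))       # True
print(is_direct_variation(candies_in_jar))     # False
print(rate_by_differences(*candies_in_jar))    # 5.0 grams per candy
```

With the jar, y/x gives 10 for one point and 7.5 for the other, and neither is the weight of a candy. Subtracting first removes the jar from the calculation.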

So instead of teaching direct variation last, next year I’ll teach it first, as the introduction to linear functions.  Since they won’t know (y₂ – y₁)/(x₂ – x₁) yet, we’ll just start by sorting situations into 2 categories: ones in which you can find the rate (not “slope”–that word won’t exist for us yet) by dividing y/x, and situations in which that makes no sense.  We’ll make predictions for new situations in the first category by writing proportions, writing equations of the form y=kx, drawing graphs that pass through the origin, etc.  And mixed in with all of that practice will be situations in which direct variation and proportionality don’t work:

Question_3

(The bar itself weighs something here).

Once we’ve really hammered the idea that there are 2 categories of linear scenarios, ones in which the rate is just y/x and ones in which it’s not, we can finally do Domino Effect and figure out how to deal with the second category.  As an aside, I do need to find more 3-Acts that are simple linear scenarios and are still enticing for students.  If anybody knows of some, please tell me in the comments.  I tried Dan Meyer’s Pencil Sharpener task, which he seems to have taken down from 101q’s, but it gave me a really poor prediction.  That’s no good, because early in the year I want my students to feel impressed by their new ability to predict things.

As a final note, I think even before starting the direct variation lesson, we might use Simpsons Sunblock to get students thinking about proportions.  In the past, they’ve tended to make their predictions by writing and solving a proportion (or at least informally by doubling the x- and y-values from a particular point in their data table to predict the needed values).  Then when we formally introduce the concept of proportionality/direct variation, we can return to Simpsons Sunblock and talk about how the method they already used there was an example of it.  And we can translate the proportions they used into direct variation equations.

Yay!  Why is it that March is always when I realize what I should have been doing in September/October?

Cryptography and Horizontal Function Transformations

This lesson has gone well enough for me that I think it might be worth putting out there for others to use.  I’ve been troubled for years by the lame explanations I had to give for why horizontal transformations are backwards.  Why does y=(x – 3)² get shifted to the right?  Every explanation I’d give was met with the polite “Oh, okay” that’s student for “I have no idea what you just said.”

So here’s a taste of how I taught the topic using the context of cryptography.  I gave students the following function and asked them something to set a low barrier to entry: if someone sends you the message “8-9”, then what are they saying to you?

f(x)

(The answer is “Hi”, of course).  But that code is too obvious, so then I introduced the idea of transforming the code to something more secret.  For example, you and a co-conspirator could agree that to read a message, you’d first add 3 to each number, and then look up the corresponding letter on the graph.  So if I send you the number 1, what letter did I send?  (The answer is that you first do 1 + 3 to get 4, so the correct letter is D).  That’s simple enough, so we practiced it and built up to something more complicated:

Easy Transformations

The answer to question e) is 55.  This is not easy to see without doing some calculations, and those calculations already get you thinking in the backwards logic of horizontal transformations.  Since “S” is the number 19, you need (x+2)/3 to come out to 19.  So you first multiply 19 by 3, and then subtract 2.  Students should show work like this: f[(55+2)/3] = f(57/3) = f(19) = S.
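Here is a minimal sketch of the code in action, assuming the graph of f simply maps 1 to A, 2 to B, and so on up to 26 for Z (which matches “8-9” being “Hi” and 19 being S). The helper names are made up:

```python
import string

def f(n):
    """Base code, assumed to be 1 -> A, 2 -> B, ..., 26 -> Z."""
    if n != int(n) or not 1 <= n <= 26:
        raise ValueError(f"{n} is not a letter position")
    return string.ascii_uppercase[int(n) - 1]

def decode(x, transform):
    """Read a received number: apply the agreed transformation, then look up the letter."""
    return f(transform(x))

def number_for_letter(letter):
    """Position of a letter in the base code, e.g. S -> 19."""
    return ord(letter.upper()) - ord("A") + 1

print(decode(1, lambda x: x + 3))          # D
print(decode(55, lambda x: (x + 2) / 3))   # S

# Sending S with the code f((x + 2) / 3) means undoing the expression in reverse:
x_to_send = number_for_letter("S") * 3 - 2
print(x_to_send)                           # 55
```

The last two lines are the whole point: the receiver adds 2 and then divides by 3, so the sender has to multiply by 3 and then subtract 2.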

After that discussion, it’s time to let the students have some fun by sending a secret message to a classmate with the code f(2x-3).

By then, we’ve noticed that it’s annoying to have to recalculate each letter you want to send.  Wouldn’t it be nicer to just notice the patterns, and adjust the entire code of f(x) at once?   We look for patterns in how the points on the graph are transformed, like this:

Graphical transformation

(I’ve put the answers in the graph in red font).  Here you can already see the backwards thinking in effect: although the fact that 0.25 < 1 might make you think that f(0.25x) is compressed, in fact it’s stretched.  And you understand the logic–you need a larger value of x, because the 0.25 is going to shrink it, and you’re still trying to come out to the same letters of A, B, and C.

What I really like about this approach is that it also makes clear why the order of operations is reversed for horizontal function transformations.  For example, in the function y=sin(2x+pi), you first shift everything to the left by pi, and then compress by a factor of 0.5.  Why are these transformations applied in the reverse order of operations?  Because in essence, you’re solving an equation, just like we solved (x+2)/3 = 19, above…and solving an equation always involves “undoing” the expression in the reverse order in which it was built up.

Students apply that reasoning, first by encoding the letters of the word “JAMES” using the function f(0.5x + 6), which has two transformations.  Then, looking at the pattern of the steps they took in each case (first subtracting 6 from both sides of an equation, and then multiplying by 2 to cancel out the 0.5), they apply those transformations graphically to another word, “BOND”, in the same order.  First subtracting 6 from each x-value means moving it to the left 6, and then multiplying by 2 means stretching it horizontally by a factor of 2.  Students know that the order matters here, because of some exploration we did in the intro section of this activity.
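A quick sketch of why the order matters, again assuming the base code is A = 1 through Z = 26 (the function names are invented for illustration). Encoding with f(0.5x + 6) means subtracting 6 and then multiplying by 2. Doing those steps in the other order produces numbers that decode to the wrong letters:

```python
def letter_number(letter):
    """Position of a letter in the assumed base code (A = 1, ..., Z = 26)."""
    return ord(letter.upper()) - ord("A") + 1

def encode(word):
    """Solve 0.5x + 6 = n for each letter: subtract 6 first, then multiply by 2."""
    return [(letter_number(ch) - 6) * 2 for ch in word]

def encode_wrong_order(word):
    """Same two steps in the wrong order: multiply by 2 first, then subtract 6."""
    return [letter_number(ch) * 2 - 6 for ch in word]

def decode(numbers):
    """What the receiver does: compute 0.5x + 6 and look up the letter."""
    return "".join(chr(int(0.5 * x + 6) - 1 + ord("A")) for x in numbers)

print(encode("JAMES"))                      # [8, -10, 14, -2, 26]
print(decode(encode("JAMES")))              # JAMES
print(decode(encode_wrong_order("JAMES")))  # not JAMES
```

On the graph, the same two steps are a shift left by 6 followed by a horizontal stretch by a factor of 2.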

13.12.04_Double_Transformation

Students then translate their understanding to actual mathematical graphs:

13.12.04_Final_pic

This should set my students up nicely to understand period and phase shift.  Think again about the function y=sin(2x+pi).  When transforming graphs, it’s usually easier to stretch/compress before you translate.  Since order of operations is reversed, if we want to compress first, we have to rewrite it as y=sin[2(x + pi/2)].  This way, you compress by 0.5 first, and then just shift everything by pi/2 to the left. 
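As a numerical sanity check (a sketch, not part of the activity), the two descriptions of y=sin(2x+pi) should move each key point of y=sin(x) to the same place. The function names below are made up:

```python
import math

def shift_then_compress(x0):
    """Read off sin(2x + pi): shift left by pi, then compress by a factor of 0.5."""
    return (x0 - math.pi) * 0.5

def compress_then_shift(x0):
    """Read off sin[2(x + pi/2)]: compress by a factor of 0.5, then shift left by pi/2."""
    return x0 * 0.5 - math.pi / 2

# Key x-values on one period of y = sin(x)
key_points = [0, math.pi / 2, math.pi, 3 * math.pi / 2, 2 * math.pi]

for x0 in key_points:
    new_x = shift_then_compress(x0)
    # Both orders land on the same new x-value...
    assert math.isclose(new_x, compress_then_shift(x0), abs_tol=1e-12)
    # ...and the transformed function at new_x matches sin at the original point.
    assert math.isclose(math.sin(2 * new_x + math.pi), math.sin(x0), abs_tol=1e-12)

print("Both orders agree on every key point.")
```

The period shrinks from 2·pi to pi, and the point that used to sit at x = 0 now sits at x = -pi/2, which is the phase shift.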

And, if you’ve gotten all the way down to this part of the blog post, here’s a link to the activity: https://www.dropbox.com/s/gam9uzwqrnl91l8/Introduction%20to%20Functions%20and%20Cryptography.pdf

For some context, see Kate Nowak’s related musings: http://function-of-time.blogspot.com/2013/09/building-functions-clarified.html

Teaching is hard

I’m in my 10th year as a teacher. I’m a much better teacher than I was in my first couple years, and a much better teacher than I was before I found the MTBoS. But I struggle with keeping a growth mindset–I keep wondering when I will finally be the teacher I want to be. Part of this is that my standards go up every year. Lessons that seem okay this year would have struck me as awesome 5 years ago. And part of it is that this is only the second year ever that I haven’t had either a brand new prep or a new school to adjust to. But still, teaching is hard. So for all of us out here busting our butts over summers, weekends, etc…keep it up, and remember that a growth mindset is the most important thing.

Rough idea of how I want to use Penny Circle

Been trying to figure out how to use Penny Circle, by Dan Meyer and Desmos.  I think the activity is terrific, but like L Hodge (in the comments on Dan’s blog), I think the activity does some of the intellectual work for students, so they won’t be challenged to think as hard as I’d like.  (However, it’s a great first experience of the IDEA of modeling, so I’m not criticizing it…I just want to take it further).

I only have a second right now (will flesh this out later), but I think I’m going to have students do the activity as intended, and then ask them to create an equation for diameter of the penny circle after Dan has been putting down pennies for t seconds.  The Desmos activity generates the function n(d), where n is the number of pennies and d is the diameter of the circle. My students will have to find the function d(n), and then use some video of Dan putting pennies down to find n(t).  Then they can use composition of functions to get d(t).  I think there is some video of Dan putting pennies down in the 3-Act.  Anyways, back to work, but if anyone has comments or especially criticisms/warnings, I’d love to hear them.
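A rough sketch of the composition, with placeholder models. The quadratic form of n(d), the coefficient, and the pennies-per-second rate are all assumptions made for illustration, not values from the Desmos activity or the video:

```python
import math

K = 0.8      # hypothetical coefficient in the model n(d) = K * d**2
RATE = 1.5   # hypothetical number of pennies placed per second

def n_of_d(d):
    """Model from the activity (assumed quadratic): pennies that fit in a circle of diameter d."""
    return K * d ** 2

def d_of_n(n):
    """Inverse of n_of_d: diameter of the circle that n pennies fill."""
    return math.sqrt(n / K)

def n_of_t(t):
    """Assumed constant placing rate: pennies down after t seconds."""
    return RATE * t

def d_of_t(t):
    """Composition d(n(t)): diameter of the penny circle after t seconds."""
    return d_of_n(n_of_t(t))

print(d_of_t(60))
```

Under these assumptions d(t) grows like a square root, so the circle's diameter increases quickly at first and then slows down, which could be a nice thing for students to predict before they compute it.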

Frustration in trying to use Khan Academy exercises

The teacher reporting tools just don’t work.  So it’s taking me FOREVER to check students’ progress.  Literally, there are exercises on their site, which I have assigned to students, that are not in their list of exercises in the teacher reporting tools.  To check student progress, I have to go to each student, click on their name, and CONTROL + F to search the page for the name of the exercise, then look at their score, and finally type it into my gradebook.  And, because student proficiency changes constantly based on the mini retention quizzes KA gives (called mastery challenges), I will have to re-check student scores on each exercise every couple weeks.  KA, you say Shipping Beats Perfection doesn’t mean you ship things that are broken, but this is clearly broken!

UPDATE: Khan Academy Lead Developer Ben Kamens comments on this bug and on their current focus in building out more teacher reporting tools.  See comments section.

UPDATE: I found a new workaround, and I’m pretty sure it wasn’t there yesterday.  Perhaps KA fixed something already.  You can use CTRL + F to search the table when you view by table.  Exercises that weren’t in the table yesterday seem to be there today.  You can’t type their names into the search box above the table, but you can use your browser’s search function.  That’s a start.

UPDATE 9/22/13: Now that searching in the grid is working, what’s not working is searching in a student’s skill progress while viewing their profile.  So, for example, when a student has completed “Multiplying Expressions 0.5”, I can see their work in the grid or in the coach report by student, but if we view the student’s “skill progress” from his/her own profile, Multiplying Expressions 0.5 is not listed as an exercise.  In essence, KA’s back-end system for tracking performance seems to be having a hard time keeping up with its front end improvements this summer, so teachers should be prepared to potentially spend extra time figuring things out.

UPDATE 9/29/13: The issues with the teacher reporting have been completely fixed, and–even better–they have added a feature which lets you recommend exercises to students as a whole group rather than one student at a time.  Nice job, KA.  Exactly what I was hoping for.

How I’m Covering the Distributive Property in Precalc

I teach a class called Math Analysis/Trigonometry, which is close enough to Precalc that students who get A’s in my class are allowed to skip to calc. I’m spending the first several class periods covering a combination of Carol Dweck mindset stuff and the idea of distributing.  Distributing?  In Math Analysis/Trigonometry?  Yes.  I actually think many students, even post-Algebra 2, don’t understand it that well, and I think it’s what leads them to make some of their most common mistakes.

Misstakes

Later in this post, I’ll describe how I’m trying to talk about distribution in a way that’s conceptual and allows for Accountable Talk–I certainly don’t want to lead off the year with a bunch of drill-and-kill.  And to my surprise, distribution seems to be fertile ground for conjecture and discussion.  But first, here are the top 2 mistakes.  Recognize them?

  1. In (x + 2)/x , you can cancel out the x’s.
  2. (x + 5)² is just x² + 25

With mistake #1, students just don’t see that it’s distribution.  If it said x(x+2), they wouldn’t give the answer of x² + 2, but they don’t see that distribution happens with division, too.  Somewhere in their brains, they know it, but that knowledge must be only loosely connected to everything else.

With mistake #2, students mistakenly think that you can “distribute” an exponent across addition.
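One quick way to convince a skeptical student is to plug in numbers. The sketch below (the helper name is made up) compares each mistaken simplification with the original expression at a few sample values. For mistake #1, the cancelled result is taken to be 2, which is one common version of the error:

```python
import math

def agree_everywhere(lhs, rhs, samples=(1, 2, 3, -4, 0.5)):
    """True only if the two expressions give the same value at every sample x."""
    return all(math.isclose(lhs(x), rhs(x)) for x in samples)

# Mistake #1: "cancelling the x's" in (x + 2)/x
print(agree_everywhere(lambda x: (x + 2) / x, lambda x: 2))            # False
print(agree_everywhere(lambda x: (x + 2) / x, lambda x: 1 + 2 / x))    # True

# Mistake #2: "distributing" the square in (x + 5)^2
print(agree_everywhere(lambda x: (x + 5) ** 2, lambda x: x ** 2 + 25))            # False
print(agree_everywhere(lambda x: (x + 5) ** 2, lambda x: x ** 2 + 10 * x + 25))   # True
```

Agreeing at a handful of points doesn't prove an identity, but a single disagreement is enough to disprove one.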

There are other big mistakes relating to distribution, too.  Have you ever seen someone try to distribute the 2 in 2(3 * x)?  Usually, they won’t do it for 2(3x), but if I put a multiplication sign and some big spaces inside the parentheses, I usually get more than a handful who distribute.  This whole topic is closely connected to the ability to see whether an expression has one term or many, but the number-of-terms idea doesn’t unify it completely, because you can “distribute” the square in (3x)², even though 3x is a single term.  So let me share what I’ve done.  Curious how you would have done it–surely other teachers must have to do battle with the same misconceptions.  And you may object to the use of the term “distributing” when talking about an exponent.  I used to discourage students from talking about it like that, but I now think allowing it as long as they have a deep understanding of distribution is better.

Accountable Talk Discussion Plan

Discussion was grounded in this student handout. The question of why the distributive property works led to some crickets chirping in class, so I had to take a student’s example from #1 — the example was 3(x + 2) — and ask WHY does it equal 3x + 6.  The first 3 students to respond gave variations on the theme that you have to multiply the 3 by both terms inside.  Right, but WHY is that rule correct?  I think I ended up priming the pump by writing “3x = x + x + x” and asking how you could write a similar statement for 3(x + 2).  A student explained that (x + 2) + (x + 2) + (x + 2) = 3x + 6.  Still, lots of students were either tuning out (“Isn’t it obvious that it’s 3x + 6?  I mean, you just distribute.”  Grrr) or not getting it.  After 3 more students paraphrased and small groups checked in with each of their members, everyone seemed to understand the point we were making.

So then, the set-up for the big discussion: #3 on the handout.  To clarify the question, I said, “There are other situations where you can distribute.  For example, what could you put inside the parentheses in the expression (     )² so that you could distribute the square?”  And 90-100% of the class wrote something like (x + y)².  Success!  This was exactly the misconception I wanted to address, and instead of just presenting it from out of nowhere, we’d created a natural context in which to debate it.  I grabbed a bunch of student responses and put them under the document camera.  One student actually wrote out his work: (x + 5)² = x² + 5² = x² + 25.  I showed that response last, and a handful of students recognized that it was incorrect, but the bell was about to ring.

Bell rings, end of class, next class begins.  We picked apart what was wrong with (x + 5)² = x² + 25, and I asked them to try #3 again.  What can you put in the parentheses so you can distribute the square in (     )²?  Now a couple of students tried a monomial in the parentheses.  We proved that it worked using associativity and commutativity, and I asked several students to summarize.  So far, the class thinks the rule is that you can distribute an exponent across different things as long as there’s no addition or subtraction between those things.  That’s not the rule I want them to eventually come up with, but it’s a good enough start to let us move on to #4.
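For reference, here is one way the board work might look when written out. This is a sketch of the reasoning, not a transcript of what the class wrote. The first line uses multiplication as repeated addition, the second uses associativity and commutativity of multiplication, and the third shows why the same move fails across addition:

```latex
\begin{align*}
3(x + 2) &= (x + 2) + (x + 2) + (x + 2) = (x + x + x) + (2 + 2 + 2) = 3x + 6 \\
(3x)^2   &= (3x)(3x) = (3 \cdot 3)(x \cdot x) = 3^2 x^2 = 9x^2 \\
(x + 5)^2 &= (x + 5)(x + 5) = x^2 + 5x + 5x + 25 = x^2 + 10x + 25 \neq x^2 + 25
\end{align*}
```

The 10x in the last line is what gets lost when the square is “distributed” across the addition.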

Check out #4 on the handout (here it is again).  I really hope we get some good mistakes there, and some good debate.  The rule I want them to generalize is that you can distribute what’s outside the parentheses if the operation/function outside is the repeated version of the operation inside.  Since raising to a power is repeated multiplication, an exponent distributes across multiplication but not addition.  Since multiplication is repeated addition, it distributes across addition (or subtraction) but nothing else.

Actually, I’m not sure if this is technically correct.  Distribution is a property of rings from abstract algebra, and there may be some exotic versions of rings where the “multiplication” operation doesn’t represent repeating the “addition” operation.  I got some help on Twitter from Dave Radcliffe (@daveinstpaul), who was telling me about power sets forming a commutative ring.  If anyone can help me understand whether I’m making a big mistake talking about distribution this way, I’d really appreciate it!

Khan Academy for Review Practice?

Meanwhile, my students have a bunch of factoring and quadratics exercises to do on Khan Academy.  (Yes, on Khan Academy.  These kids won’t be harmed by boring videos about factoring.  They learned factoring years ago.  They just need to gain fluency, and I need data on who needs remediation with what.)

UPDATE 9/11/13: When we did #4a today, students’ initial answers on whether to distribute in 2(3 * x) were evenly split.  Half the students got 12x, and half the students got 6x, until we discussed it.  Should get to the final summarization of the principle I’m looking for tomorrow.  (We don’t spend all class on this stuff, so it’s getting spread out over more days).