# Progressive Summarization by Tiago Forte
## A Practical Technique for Designing Discoverable Notes
Tiago Forte Dec 27, 2017
Modern digital tools make it easy to “capture” information from a wide variety of sources. We know how to snap a picture, type out some notes, record a video, or scan a document. Getting this content from the outside world into the digital world is trivial.
It’s even easier to get content that is already digital from one app to another.
We know how to copy and paste text, save an image from a webpage, archive an email attachment, or import a video file.
What is difficult is not transferring content from place to place, but *transferring it through time*.
You know what I mean: you read a book, investing hours of mental labor in understanding the ideas it presents. You finish the book with a feeling of triumph that you’ve gained a valuable body of knowledge...
**But then what?**
You may try to apply the science-based methods the book recommends, only to realize it’s not quite as clear-cut as you thought. You may try to change the way you eat, exercise, communicate, or work, trusting in the power of habits.
But then the everyday demands of life come rushing back, and you forget what motivated you in the first place.
At this point, people take different paths. Some give up, labeling all “self-help” books a waste of time. Others decide it’s just a problem of remembering everything they read, and invest in fancy memorization techniques. And many people become “infovores,” force-feeding themselves endless books, articles, and courses, in the hope that something will stick.
I want to suggest an alternative to all the approaches above: what you read is good and useful and very important, *you’re just reading it at the wrong time*.
You’re reading about time management techniques now, but they will only be useful two years from now, when you become a manager and have much greater demands on your time.
You’re watching YouTube videos on online marketing now, but that knowledge can only be put to use in 9 months, when your new online course gets off the ground.
You’re talking to a prospect about his goals and challenges now, but when you could really use that information is next year, when he is taking bids for a huge new contract.
The challenge of knowledge is not acquiring it. In our digital world, you can acquire almost any knowledge at almost any time.
The challenge is knowing which knowledge is worth acquiring. And then building a system to *forward bits of it through time*, to the future situation or problem or challenge where it is most applicable, and most needed.
At that future point, when you’re applying that knowledge directly to a real-world challenge, you won’t have to worry about memorizing it, integrating it, or even fully understanding it. You will only have to apply it, and any gaps in your understanding will very quickly reveal themselves. By the time you’re done solving a real problem with it, book knowledge has become experiential knowledge. And experiential knowledge is something you carry with you forever.
This is the job of a “second brain” — an external, integrated digital repository for the things you learn and the resources from which they come. It is a storage and retrieval system, packaging bits of knowledge into discrete packets that can be forwarded to various points in time to be reviewed, utilized, or deleted.
In the 4-part P.A.R.A. series, I described a universal system for organizing any kind of digital information from any source. It is a “good enough” system, maintaining notes according to their *actionability* (which takes just a moment to determine), instead of their *meaning* (which is ambiguous and depends on the context).
The four top-level categories of P.A.R.A. — Projects, Areas, Resources, and Archives — are designed to facilitate this process of forwarding knowledge through time.
- By placing a note in a project folder, you are essentially scheduling it for review on the **short time horizon** of an individual project
- Notes in area folders are scheduled for **less frequent review**, whenever you evaluate that area of your work or life
- Notes in resource folders stand ready for review **if and when you decide to take action** on that topic
- And notes in archive folders are in **“cold storage,”** available if needed but not scheduled for review at any particular time

Note that we have re-created the tickler file, except instead of strict time-based horizons (daily, weekly, monthly, annually), they are scheduled *contingently*: if X happens, when Y arrives, if I want to do Z, *etc.*
Planning in terms of contingencies gives us all the benefits of planning and researching, without locking us into rigid routines. We have the ability to massively accelerate, using our repository of accumulated notes as rocket fuel. But the actual decision of whether or not to accelerate, and critically, *in which direction*, we leave to our Future Self, who is older and wiser.
P.A.R.A. answers how these “packets of knowledge” are **organized**: in discrete notes, sorted into 4 categories according to actionability, and resurfaced using RandomNote.
But now we turn to a more fundamental question: how are these packets made? Once we capture something, how do we structure the note so that it’s easily discoverable *and* usable in the future? How do we make sure what we’re saving today adds value to future projects, even when we can’t predict or even imagine what those projects might be?
That is the job of Progressive Summarization.
## **Note-first knowledge management**
There are two primary schools of thought on how to organize a note-taking program (or really any body of information, but I’ll use terms specific to note-taking apps):
**Tagging-first approaches** argue that there should be no explicit hierarchy of notes, notebooks, and stacks. Notes are envisioned as an ever-changing, virtual matrix of interconnected, free-floating ideas. Because many tags can be applied to one note, there are multiple pathways to discover any given note. Locating notes in specific notebooks and folders is seen as limiting and static.
Although tags have their uses, I don’t believe they work as a primary organizational system. In my experience, relying on tagging is too fragile and requires too much maintenance, spreading attention too uniformly across all notes whether or not they are truly valuable. The virtual matrix sounds cool and futuristic, but our minds are not made to work well with such abstract concepts — we understand placing one thing in one place intuitively and automatically.
The second conventional approach to organizing notes is **notebook-first**.
This basically translates how we organize things in the physical world — in a series of discrete containers — into the digital world.
Notebook-first is better than tagging-first, in my opinion, mostly because it stays out of the way. It doesn’t try to automate and encroach upon the deeply intuitive act of making connections and seeing patterns. P.A.R.A. on its own is a notebook-first system.
But if we stopped there, it would still be woefully inadequate for an economy based on creative output. As the tagging enthusiasts correctly point out, notebooks and folders actually suppress the serendipity and randomness that is at the heart of a creative lifestyle.
I propose a way to break the impasse: a **note-first approach**.
I propose we make the ***design of individual notes*** the primary factor, instead of tags or notebooks. This has many advantages:

- It works well with *any* other **organizational system**, without depending on them (including but not limited to tags and notebooks, if you want to use those)
- It makes all work you do on your notes **value-added**, because you’re spending close to 100% of the time engaging directly with the content itself
- It can more easily **survive migrations** to other devices, storage locations, and even programs, because note content is much more likely to be preserved than overarching structure
- It **cultivates skills** (succinct communication, finding the core of an idea, visual thinking, etc.) that are inherently valuable and highly transferrable to other activities
- It makes your notes more **legible and useful to others** (unlike your internal notebook structure, which is only for your use), promoting collaboration and sharing

With a note-first approach, your notes become like individual **atoms**, each with its own unique properties, but ready to be assembled into **elements, molecules, and compounds** that are far more powerful.
## **Designing discoverable notes**
A note-first approach to knowledge management means we have to think about design. You are, in a very real sense, designing a product for a demanding customer — Future You.
Future You doesn’t necessarily trust that everything Past You put into your notes is valuable. Future You is impatient and skeptical, demanding proof upfront that the time they spend reviewing notes will be worthwhile. You’ve gotta “sell them” on the idea of reviewing a given note, including all the stages any salesperson has to master: gaining **attention**, inspiring **interest**, establishing **credibility**, stoking **desire**, and making a case for **action** NOW.
As if all that wasn’t intimidating enough, you have to do this for every single note without spending *any extra time*. You don’t have extra time, do you?
Let’s start at the beginning: at the heart of every design, we are trying to balance priorities. You want one thing, but it has to be balanced against something else that you *also* want.
You want a vehicle to protect its occupants, but you can’t just add layers and layers of titanium armor plating. You have to balance safety against weight and cost.
You want a phone to have the longest possible battery life, but you can’t just give it a 10-pound brick of a battery. You have to balance battery life against size and usability.
In the case of notes, I believe the two priorities we are trying to balance are **discoverability** and **understanding**.
Making a note **discoverable** involves making it small, simple, and easy to digest. We accomplish this using **compression**: creating highly condensed summaries, without all the fluff.
But we also want to make our notes **understandable**. This involves including all the **context**: the details, the examples, and cited sources to be sure nothing falls through the cracks.
This is a difficult tradeoff because *you cannot compress something without losing some of its context*.
You cannot summarize an article without discarding most of its points. You cannot make a highlight reel of a video without cutting out most of the footage. You cannot give an 18-minute TED talk without leaving out most of your ideas.
In making decisions about what to keep, you are inevitably making decisions about what to throw away.
## **Compression vs. context**
There’s a natural tension between the two, compression and context.
To communicate anything, you have to compress it, like communicating a huge amount of life experience in a wise saying. But in doing so, you lose a lot of the context that made that wisdom valuable in the first place.
Let’s look at some examples.
If we compress a note too much, in other words, if we make a summary that is too brief, it loses its context and with it all meaning. In the note above, for example, the information it contains is highly discoverable: I can get the gist of it with just a glance.
But if I come across this note a year from now, I’ll have no idea what it means or why it’s important. It’s too compressed.
But we can go too far in the opposite direction too. If we make something totally understandable, in other words, if we include every little detail and bit of context, it loses its discoverability.
The example above is my notes on the task management software Jira. It has *lots* of context, making it highly understandable. But it’s not discoverable at all. It would probably take me a couple hours and tremendous mental effort to read through this note and remember enough context to decide whether or not it’s useful. The main points and key insights are hidden somewhere in the noise.
Getting the balance between compression and context right is not a trivial matter. When the time comes for Future You to decide whether or not to review this note, *seconds count*. Because Future You will likely be looking for a solution to a problem, not casual reading, they will be making snap decisions on a tight timeline. Faced with a wall of text of questionable value, they are unlikely to take the risk of committing time for review.
This means that all the summarizing work your Past Self did on this note is wasted. It didn’t pay off back then, and it doesn’t pay off in the future. You successfully sent a packet of information forward through time, but not in a state where it could survive the journey.
## **Opportunistic compression**
I’ve found that most people do just fine on the context side of the equation.
We know how to take exhaustive notes on a book, a presentation, or a class.
Progressive Summarization focuses therefore on rebalancing the equation. It is a method for *opportunistic compression*: summarizing and condensing a piece of information in small spurts, spread across time, in the course of other work, and only doing as much or as little as the information deserves.
If you remember, compression is a means to improving discoverability. So our design challenge when creating a note is:
**“How do I make what I’m consuming right now easily discoverable for my future self?”**
This isn’t an easy question to answer, because you have no idea what Future You remembers, is interested in, or is working on. You have to summarize the note *without knowing what it will be used for*. It is general purpose summarization, a much greater challenge than extracting takeaways for just one specific project.
Progressive Summarization works in “layers” of summarization. Layer 0 is the original, full-length source text.
Layer 1 is the content that I initially bring into my note-taking program. I don’t have an explicit set of criteria on what to keep. I just capture anything that feels insightful, interesting, or useful.
This can include virtually any type of media, but for this article I will focus on text. There are *many* ways of doing this:
- Copying a paragraph of **text** from a PDF I’m reading, and pasting it into the Evernote menu bar helper
- Typing my **random thoughts** into a new note on the Evernote mobile app
- Dropping a **Word document** onto the Evernote icon in the dock on my Mac, which adds it to a note as an attachment
- Downloading all my **Kindle highlights** from a book using Bookcision, and then copying and pasting them into a new note
- Forwarding an **email** with useful information to my personal import address, which automatically imports the whole email to a note
- Highlighting the best passages of an **online article** using the web highlighter Liner, which exports directly to Evernote

The examples above are from my recommended program Evernote (iOS, Android, Mac, Windows, browsers), but all the major note-taking platforms support the above functionality in one way or another: Bear (Mac and iOS), Simplenote (iOS, Android, Mac, Windows, Linux), Microsoft OneNote (iOS, Android, Mac, Windows), and Google Keep (browsers, iOS, Android).
Layer 1 is the starting point of Progressive Summarization, like the bedrock on which everything else is built:
Layer 2 is the first round of true summarization, in which I bold only the best parts of the passages I’ve imported. Again, I have no explicit criteria. I look for keywords, key phrases, and key sentences that I feel represent the core or essence of the idea being discussed.
I do this bolding layer at a later time, when I’m already reviewing this note anyway. I’m essentially using the attention I’m already spending for a dual purpose: to “buy” the information I need for the project at hand, and also to summarize the note for future use. If you have to *pay* attention to something, it comes in handy to be able to double-spend.
For Layer 3, I switch to highlighting, so I can make out the smaller number of highlighted passages among all the bolded ones. This time, I’m looking for the “best of the best,” only highlighting something if it is truly unique or valuable. And again, I’m only adding this third layer when I’m already reviewing the note anyway.
For Layer 4, I’m still summarizing, but going beyond highlighting the words of others, to recording my own. For a small number of notes that are the most insightful, I summarize layers 2 and 3 in an informal executive summary at the top of the note, restating the key points in my own words.
Note that all the previous layers are preserved in context, giving you the freedom to leave things out without worrying that you’ll lose them.
Summarization is risky — you may be making the wrong decision about what’s important. But with the safety net of multiple layers of preserved notes, you can strike out decisively on daring intellectual expeditions.
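One way to picture layers that are "preserved in context" is as markup nested inside a single note. The sketch below is only an illustration under stated assumptions: it supposes Layer 2 is written as Markdown-style `**bold**` and Layer 3 as `==highlight==` inside the bold passages. Those markers, and the function names `layer_2`, `layer_3`, and `layer_count`, are hypothetical and are not taken from Evernote or any other tool mentioned here.

```python
import re

# Assumption: Layer 2 is marked with **bold** and Layer 3 with ==highlight==
# nested inside a bold passage. These markers are illustrative only.
BOLD = re.compile(r"\*\*(.+?)\*\*", re.DOTALL)
HIGHLIGHT = re.compile(r"==(.+?)==", re.DOTALL)


def layer_2(note):
    """Return the bolded passages, with any highlight markers stripped."""
    return [passage.replace("==", "") for passage in BOLD.findall(note)]


def layer_3(note):
    """Return only the highlighted passages (the 'best of the best')."""
    return HIGHLIGHT.findall(note)


def layer_count(note, summary=""):
    """Return the highest layer (1-4) this note appears to have reached."""
    count = 1  # any captured note is at least Layer 1
    if layer_2(note):
        count = 2
    if layer_3(note):
        count = 3
    if summary.strip():
        count = 4  # an executive summary in my own words
    return count


note = (
    "Compression makes a note discoverable. **You cannot compress something "
    "==without losing context==**, so every summary is a tradeoff. "
    "**Keep the original text** underneath."
)
print(layer_2(note))
print(layer_3(note))
print(layer_count(note))
```

Because every layer is just markup on top of the original text, nothing is deleted when a passage is left unbolded or unhighlighted; the full Layer 1 text is still there to read.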
And finally, for a tiny minority of sources, the ones that are so powerful and exciting I want them to become part of how I think and work *immediately*, I remix them. After pulling them apart and dissecting them from every angle in layers 1–4, I add my own personality and creativity and turn them into something else.
This could include a blog post interpreting, critiquing, or extending the argument an author is making, such as in Strategically Constrained, The Inner Game of Work, and Supersizing the Mind.
But it doesn’t have to be difficult or time-consuming. It could even be…(gasp) fun! Making a sketch, designing a slide, recording a short video on your phone, and sharing on social media are all forms of wrestling deeply with information.
The first tweet in a tweetstorm I wrote about the book *Toyota Kata*

*In Part II, we’ll look at some examples of Progressive Summarization in action.*
***Subscribe to Praxis, our members-only blog exploring the future of productivity, for just $10/month. Or follow us for free content via Twitter, Facebook, LinkedIn, or YouTube.***
# Progressive Summarization II: Examples and Metaphors
Tiago Forte Dec 27, 2017
Let’s look at how a single source can proceed through the layers of progressive summarization.
These are Layer 1 notes I took on an article on postrationalism, a topic I’m interested in. This is 373 words, which would take about 2 minutes to read at an average reading speed. 2 minutes doesn’t seem like much, but when you consider that these notes could have no relevance to the task at hand, it’s a lot of attention to pay for nothing. Especially considering this is dense, challenging material.
For Layer 2, I bolded what I thought were the key points:
We’re down to 181 words now, about half as long. But it’s the best half, which means a lot of value has been added. It will only take me a minute to get the gist of this note now, and spacing out the bold passages also makes reading faster.
For Layer 3, I highlighted the best of the best parts:
Now we’re down to just 60 words in 3 sections, which I can scan in 10–20 seconds. This is **6–12 times faster** than the full, Layer 1 notes (not to mention the full original article). Imagine what you could accomplish consuming sources 6–12 times faster, OR consuming 6–12 times as many sources in the same amount of time. While *at the same time*, repackaging those insights in a form that Future You can easily find and use.
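The figures above can be checked with a little arithmetic. The sketch below is a back-of-the-envelope check, not a measurement: the word counts, the 2-minute reading estimate, and the 10–20 second scan time all come from the text, while the function and variable names are illustrative assumptions.

```python
# Word counts quoted in the text for each layer of the postrationalism note.
LAYER_WORDS = {"layer_1": 373, "layer_2": 181, "layer_3": 60}

# The text estimates about 2 minutes to read the Layer 1 notes
# and 10-20 seconds to scan the Layer 3 highlights.
LAYER_1_SECONDS = 2 * 60
LAYER_3_SCAN_SECONDS = (10, 20)


def implied_words_per_minute(words, seconds):
    """Reading speed implied by a word count and a reading time."""
    return words / (seconds / 60)


def speedup_range(full_seconds, scan_seconds):
    """Return (smallest, largest) speedup of scanning versus full reading."""
    fast_scan, slow_scan = min(scan_seconds), max(scan_seconds)
    return full_seconds / slow_scan, full_seconds / fast_scan


wpm = implied_words_per_minute(LAYER_WORDS["layer_1"], LAYER_1_SECONDS)
low, high = speedup_range(LAYER_1_SECONDS, LAYER_3_SCAN_SECONDS)
kept_in_layer_2 = LAYER_WORDS["layer_2"] / LAYER_WORDS["layer_1"]

print(f"Implied reading speed: {wpm:.1f} words per minute")
print(f"Layer 2 keeps {kept_in_layer_2:.0%} of the Layer 1 words")
print(f"Scanning Layer 3 is {low:.0f}-{high:.0f} times faster")
```

The 6–12x range comes from comparing reading times (120 seconds against 10–20 seconds), which is why it holds up even though the word count alone shrinks by a factor of about 6.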
Let’s do an experiment: in about 20 seconds, scan just those 3 highlighted passages. Are they enough to give you the gist of this article? I’ll wait…
The answer is, probably not! They are enough for me only because of my personal context, prior reading, and experience summarizing this note. This is why this summarization process has to be done manually and individually.
There are some layers and some sources where external tools can help, but until we develop not only artificial intelligence, but extended artificial cognition, summarization won’t be something we can fully outsource.
By the way, this example illustrates why tags will never be adequate for long-term, open-ended research. The following are words and terms found somewhere in this text. Can you imagine how complex your tagging system would have to be to cover even part of this list?
The text itself is its own best tagging system. I think we should focus on surfacing the key terms already found there, instead of dreaming up tags.
Because here’s the catch: even if you used tags to find this note, you would *still* have to evaluate it quickly for relevance.
Here’s another example of Layer 3 notes, from the book *The Future of Work* by Jacob Morgan, which you can view in their entirety here:
I didn’t find the book particularly insightful. It was mostly a listing of major trends I already knew about. But it does contain lots of good research findings, which I can use for my own purposes, including in some cases to *make opposing arguments*. This is why usefulness is such a, well, useful thing to look for: it’s like walking through a junkyard for pieces of junk cars you can rip out and repurpose. You aren’t limited to the perspective or knowledge of the author. You start your own thinking at the level of the best thinking out there, instead of at ground level.
With these most useful parts highlighted, Future Me’s attention is brought directly to them. He will be able to quickly scan the highlighted parts, and load up those juicy tidbits without having to re-read the whole book, which would be a waste of time.
Here’s an example of Layer 4 notes, from the book *The Race* by Eliyahu Goldratt:
This book was so influential on my thinking that my Layer 1 notes were practically a book unto themselves. I wanted to capture every fascinating example and detail, so it’s heavy on context, at the price of discoverability.
To improve its discoverability, I added an executive summary (Layer 4) at the top. By scanning this summary I can remember the gist of the book, and if anything is unclear, I can easily do a search of the full notes just below.
This note was originally created in July of 2016 when I read the book, received 4 layers of summarization between July and December, and in January of 2017 was incorporated into one of my all-time favorite posts, The Throughput of Learning. That one phrase highlighted in the screenshot above was the key breakthrough I needed to make that argument possible. Who could have ever predicted that the history of inventory turnover would have anything to do with an article on learning and personal growth?
One final example, of a Layer 5 note:
I read the book *Ask Questions, Get Sales*, and found the question-based sales technique very compelling. I knew just highlighting the text wouldn’t be enough to work it into my thinking, so I decided to create “sketchnotes” of the book’s key points.
Making the ideas visual required me to think about them more deeply, to decide what I agreed with and didn’t, to emphasize parts that applied most directly to my work, and to experiment with how best to represent abstract concepts. All of these helped me integrate the knowledge much more deeply, as something I’ve interacted with, not just passively consumed. And by the way, Evernote’s text recognition works pretty well on handwriting, so these images are searchable.
That was a lot of information, so in the spirit of compression, let me give you the bottom line.
With Progressive Summarization, we are summarizing our notes, and then summarizing *that* summary, then summarizing *that* summary, distilling the ideas into smaller and smaller layers each time.
With these layers exposed, we can do a flyby in the “airplane of discoverability,” quickly scanning the peaks, to decide if this mountain has anything to do with what we’re looking for.
But we’ve also preserved all the layers in context, so if we see a peak that looks promising, we can dive right in with our “parachute of understanding,” drilling down as deep as we need to:
## **Jungles vs. islands**
Most people’s notes are like a dense jungle. There’s lots of interesting stuff in there, but who would ever know? The gems are hidden and obscured. There’s no discoverability.
Progressive Summarization turns the jungle into an archipelago of islands. It reveals your personal information landscape — the unique topography of your goals, values, interests, and pursuits. With a clear landscape, you gain the ability to steer. Toward what you like, or don’t. Toward what makes you comfortable, or what doesn’t. Toward what you need, or what you want. You are the pilot, so you decide.
## **Mind as staging ground**
With a method in hand for rapidly capturing knowledge and compressing it into multi-layered packets for easy retrieval, the whole game of learning is transformed.
Not only is it okay if I don’t remember the vast majority of what I read, ***that actually becomes my goal***. I want to offload what I’ve learned as quickly as possible, so I can forget it as quickly as possible. I want my mind to be an empty vessel, a staging ground where ideas briefly stop in their journey from the outside world, to my second brain. This is what it looks like to be fully present, and to also reach my intellectual potential. To leave my mind free and clear for each new experience, while also building something that lasts.
Here’s the crucial thing to understand: I don’t summarize notes on any sort of schedule, in any particular order, or as a part of a workflow. I summarize them completely opportunistically, ***when I’m already reviewing the note for some other purpose anyway***.
This seems to be very hard for most people to understand, or to trust. It runs contrary to everything we’ve been taught about knowledge as some sort of fungible commodity, like oil or soybeans. Your second brain is not a factory where every piece has to flow through each layer, in sequence, on a strict timetable. It is a network — with local neighborhoods, city centers, and superhighways all changing in different ways, at different speeds, as it should be.
A given note may not be summarized until months or years after it’s been captured. Many notes may never be summarized. This is not just acceptable, it is absolutely fundamental to focusing your attention primarily on your most valuable notes. This is how your first brain works: use it or lose it. You have to be comfortable not only letting things fall through the cracks, but placing the cracks strategically so notes that don’t end up being useful automatically recede from your attention.
The great power of digital technology, that it never forgets anything, is a curse in an era of attention deficit. We need to add to our “digital cognition” mechanisms for forgetting. The best way to do this, I believe, is to build mechanisms for directing attention. But it takes attention to direct attention.
Progressive Summarization is an alliance between your current and future selves, a pact in which your current self pays forward any insights they encounter, in exchange for your future self capitalizing on them at some point in the future.
But alliances across time are risky business. Current Self can’t afford to do big, heavy lifts meticulously organizing hundreds of notes upfront. Who knows if that work will ever pay off? I’ve been burned by such speculative investments before: after years of painstakingly organizing, categorizing, and labeling the songs in my iTunes library, Spotify came out, and I never opened iTunes again.
Progressive Summarization is really a method for creating value in an environment of uncertainty. You do concrete, relatively easy work now instead of speculative, difficult work for later. You pull time-consuming, but risk-free activities (reading, highlighting, summarizing) as early in time as possible, and push quick but risky activities (execution, decision making, delivery) as far into the future as possible. This way, you have all the ammunition you need ready and waiting at a moment’s notice, while waiting until the eve of battle to decide which target to attack.
If you follow the simple rule of summarizing a note every time you touch it, you’ll organically create a collection where the layers of summarization correspond to how integral that note is to your work. You can see at a glance how important a note is, without reading a word, just by how many layers it contains. Projects that fall apart won’t be as traumatic, because you’ll have lots of notes you summarized and packaged along the way. Once again, you get all the benefits of planning and organization, while spending virtually no time planning or organizing.
*In Part III, I’ll give you a few guidelines and principles to make your summarization as effective as possible.*
# Progressive Summarization III: Guidelines and Principles
Tiago Forte Dec 28, 2017
In Part I, I explained Progressive Summarization, a method for easily creating highly discoverable notes. In Part II, I gave you many examples and metaphors of the method in action.
In Part III, I will give you further guidelines on how to make Progressive Summarization (PS) a part of your daily work. They have been gathered from several years of using the technique in my own projects, and teaching it in my workshops and courses.
There are 4 guidelines:
1. Don’t apply all layers to all notes
2. Use resonance as your criteria
3. Design a system for the laziest version of yourself
4. Keep your notes glanceable
**1. Don’t apply all layers to all notes**
This is perhaps the biggest mistake I see as people adopt PS. I think it has something to do with the type of person who finds productivity, organization, and research attractive in the first place. They tend to be meticulous, detail-oriented, and perfectionistic. They like perfectly enclosed, universal systems that leave no room for interpretation.
You’re going to have to let go of that.
PS is universal in that it can be applied to any kind of media (we’re still focusing on text but I’ll soon explore others). But it is NOT universal in how it’s applied. It is absolutely NOT the goal to stuff every single note through all the layers of summarization. This isn’t a funnel, where the more notes reach the bottom of the funnel, the better:
There is no “preferred” level of summarization. More summarization is *not* better. Instead, you want to *calibrate* the amount of attention you’re paying to any given note, to correspond with how *valuable* that note is.
Take the most common case: a note that is average in its insight and usefulness. This is the kind of note you will most often be dealing with, statistically speaking. You want to capture the best parts of the source as Layer 1, put it in the appropriate notebook based on its actionability according to P.A.R.A., and then you want to *leave it alone*. It may be months or years before you see this note again. That is not only perfectly fine, it is your goal: to put a strict filter on what is allowed to pop back into your attention. To make that note “earn its keep” by having some relevance to a real project at some point in the future. If that relevance never happens, then you don’t want to spend one extra second summarizing it.
I believe that notes follow a power law in terms of their value: As in the graphic on the right, a very small proportion of notes (at the left edge) contains the great majority of the insight and usefulness. The rest of your notes (the right trailing edge), contain much less value. You still want to keep them around for unexpected uses, but focus your summarizing attention primarily on the high-value notes.
Here’s a breakdown of approximately how many layers I apply to my notes:
Starting at the bottom:
I only save any notes at all on about 50% of the sources I consume. This completely eliminates 1 out of 2 sources from any future consideration, which is wonderful
Half of those Layer 1 notes (or 25% of the total originally consumed) make it to Layer 2 bolding
4 out of 5 Layer 2 notes (20% of the original) make it to Layer 3 highlighting, since highlighting is relatively easy
1 out of 4 Layer 3 notes (5% of the original) make it to Layer 4 executive summary, since that takes much more energy
And I would estimate that less than 1 out of 5 Layer 4 notes (<1% of the original) make it to Layer 5 remix, since that takes a LOT of time and energy
Can you see how crazy it would be to “require” every note to make it to Layer 5? Doing so would require spending a lot more time on sources I find less interesting, at the expense of sources I find more interesting.
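The arithmetic behind this breakdown can be checked with a few lines of Python. This is only a sketch of the percentages quoted above: the layer labels and the `cumulative_shares` helper are illustrative names, and the Layer 5 rate uses 1 in 5 as an upper bound because the text says “less than 1 out of 5.”

```python
from fractions import Fraction

# Pass-through rates taken from the breakdown above. The Layer 5 rate is
# stated as "less than 1 out of 5", so 1/5 is used here as an upper bound.
PASS_RATES = [
    ("Layer 1 (saved notes)", Fraction(1, 2)),
    ("Layer 2 (bolding)", Fraction(1, 2)),
    ("Layer 3 (highlighting)", Fraction(4, 5)),
    ("Layer 4 (executive summary)", Fraction(1, 4)),
    ("Layer 5 (remix)", Fraction(1, 5)),
]


def cumulative_shares(pass_rates):
    """Return (layer, share of all sources consumed) for each layer."""
    shares = []
    remaining = Fraction(1)
    for layer, rate in pass_rates:
        remaining *= rate  # only this fraction survives to the next layer
        shares.append((layer, remaining))
    return shares


shares = cumulative_shares(PASS_RATES)
for layer, share in shares:
    print(f"{layer}: {float(share):.0%} of sources")
```

Multiplying the pass-through rates shows how quickly the share shrinks: at most about 1 source in 100 gets the full Layer 5 treatment.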
**2. Use resonance as your criterion**
Notice that there are no explicit criteria for deciding what to include at each layer. I’ve seen attempts to create them, but I think it’s a fool’s errand.
Applying such criteria would require System 2 analytical thinking, which is slow and tiring. Instead, we want to recruit our fast, intuitive System 1 thinking, by using “what resonates” as our criterion.
This is very difficult for Type A organizers to accept. It seems terribly vague and error-prone. And it is. But it also takes advantage of one of the few things your first brain does better than any computer: pattern-matching.
Resonance is a diffuse form of attention, a sort of emotional-intuitive perception that can scan for multiple kinds of patterns at once:
what is **surprising** or **counter-intuitive**
what we **know to be true**, but we never quite thought of it that way
what aligns with and helps us **interpret past experience**
what is **inspiring, moving, or meaningful**
what helps us **simplify** and **interpret** other, more complex ideas
what **tickles us** in some inexplicable way, often becoming clear only much later
what speaks to our deepest **goals, values, priorities, and questions**
what breaks or challenges **mental models**, conscious or unconscious
what is **rare and interesting**, and could potentially be useful in the future
You could scan for each of these patterns one at a time using the analytical System 2, but it would take forever. Only our intuition, like a spidey-sense attuned to anything abnormal in our surroundings, can handle so much unstructured information at once.
Don’t slip into analysis or interpretation mode, trying to figure out how to categorize what you’re reading, what it “means,” or what topic it falls under.
Instead, your only job is to *expose the semantic hooks* already found in the text itself. And to leave the job of figuring out how to put it to use to Future You.
**3. Design a system for the laziest version of yourself**
I often see people get excited about PS, and start adding “features.” Flush with motivation, they begin inventing labeling systems, tagging hierarchies, naming conventions, and taxonomies. They try to run before they’ve learned to walk, overengineering the summarization layers with a table of contents for every note, or drawing a picture for every note, or making the formatting more consistent for every note.
This is the biggest pitfall, and the thing most likely to make you fall off the wagon.
You have to remember that you’re not designing a system for the best version of yourself — the one that is motivated, relaxed, with lots of time and energy.
You’re designing a system for the worst version of yourself, as Alan Cooper says in his book on interaction design, *About Face*:
**Design an interaction model for the worst version of yourself, the one that’s tired, lazy, unmotivated, frazzled, because that’s the one that usually shows up when you need a solid workflow to fall back on.**
You have to limit yourself, most of the time, to only what you’re willing to do *for every single note*, consistently and far into the future. I find that my laziest self only wants to read what interests him. By using a method where I’m interacting with my favorite ideas almost all the time, I never feel repelled from my notes. Because bolding or highlighting as I read takes almost no energy, I can do it mindlessly, even meditatively, as a sort of knowledge ritual. Knowing that I can add real value to my notes at any time, in any place, in any state of mind, makes reading and note-taking tremendously addicting.
How do we design a note-taking method for the worst version of ourselves?
As Dee Hock puts it:
**Simple, clear purposes and principles give rise to complex and intelligent behavior. Complex rules and regulations give rise to simple and stupid behavior.**
We want simple purposes and principles. For that reason, PS is nothing but a few loose formatting tips and one overarching rule:
**Spend more time on things that interest you**
Aligning your attention with what interests you makes this process enjoyable, thus sustainable.
## **4. Keep your notes glanceable**
One of the key ideas you’ll need to grasp is that more highlights are not better. Generally speaking, more highlights just *dilute* the discoverability, and thus the value, of all the others.
The principle here is that you want to preserve the “glanceability” of your notes. It’s critical that you’re able to, with just a quick glance, get the gist of a paragraph. The most useful sources of information in our lives have this quality:
With a page of Google results, you don’t have to carefully read every word: just keep your vision unfocused and casually scan for keywords that grab your attention
With a photo album, you don’t need to individually examine each element in each photo: your eyes can effortlessly scan dozens of photos looking for a face, a place, or an occasion
Glance around the room you’re in: you can effortlessly take in the whole situation, without paying critical attention to any single part of it
Glanceability is crucial because our brain is designed to quickly grasp the gist of a scene, not to individually analyze every object like a computer. As soon as you give the brain too many details, it has to drop into analytical mode. As a general rule, the more selective and picky you can be about what to keep, the better. A major reason for preserving earlier layers is to give you a safety net so you can feel comfortable boldly eliminating huge portions of text.
## **Recognition over recall**
The above guidelines work because of a deep principle in our cognition: we are far, *far* better at *recognition* than *recall*.
You’ve probably had the experience of reading an article, and not finding anything particularly surprising. You basically agree with everything the author says, even if you never thought of it exactly that way before. You can *recognize* the validity of what they wrote as something you “already knew.”
But this in no way means you could have *recalled* and written it down yourself.
But if you already knew it, where does that knowledge live? In the vast space of things you can recognize, but not recall:
The brain is simply not designed to recall things. You memorize a phone number and minutes later it’s gone. All the housekeeping features in your brain are constantly clearing out old memories, and weakening unused connections.
On the other hand, your brain’s recognition abilities are astounding: we can recognize faces in an instant, after many years. We can read just a couple of sentences and tell that we’ve read them before. Even situations only vaguely similar to past experiences trigger a déjà vu feeling.
By relying on recognition instead of recall, we gain access to much greater bandwidth and memory in our creative pursuits. Here’s the catch: recognition doesn’t work in isolation. It requires some sort of outside stimulus for us to recognize. It requires a concrete medium for our senses to push up against.
This is the true purpose of Progressive Summarization: not to exhaustively catalogue every idea like bugs in a collection, but to create an environment of rich triggers, prompts, and hooks to spark memories, connections, and even more new ideas. Like a digital self-portrait, this environment helps you recognize what you already know and have already thought, by presenting back to you only the most highly distilled nuggets of insight. It is a digital environment for self-inspection, like a hall of mirrors allowing you to see yourself from every possible angle.
## **Asking better questions**
When we read, we are usually trying to do many things at once: reading the words, assigning meanings to the words, interpreting the sentences, tracking the flow of the argument, remembering the overall structure of the book, deciding whether we agree or not, fitting the ideas into our existing mental models, thinking about related things we’ve read, *etc.*
In other words, we are “always/already reading.” It’s exhausting, juggling so many strands of thought at once. This is why reading is so taxing for most people.
What we are doing with Progressive Summarization is “unbundling” your thinking. We’re taking the types of thinking that are relatively effortless (absorbing the words, noticing what resonates, highlighting those parts) and pulling them as early in time as possible. This helps make reading pleasurable, and has us performing work that will be of value no matter how or when it’s used.
And we are taking the types of thinking that require more effort and depend on a specific context (analyzing, interpreting, understanding, comparing, contrasting, synthesizing) and scheduling them for a time in the future when that thinking will be most accurate and useful. But we’re not scheduling them on a calendar, for a certain day or time. Once again, we are scheduling them *contingently*: when X happens, if Y is needed, if I want to do Z.
Why is leaving these thinking jobs to Future You a good idea? Because you have no idea what a given note means, how to interpret it, how it should be categorized, or how it’s going to be useful. Its meaning is determined by the lens you use to examine it. You could use many interesting lenses, but the best one is a current project, problem, or question.
Using a current project as a lens, you might find your notes on a book are most useful as a model of how to structure arguments. Using a different problem as a lens, you might find those same notes are valuable for one minor point you hardly even noticed before. And using a question as a lens, you might find the author provided answers without even knowing it.
Having a specific project as a lens gives you tremendous motivation, clarity, direction, and focus. It helps you cut away the good ideas to focus on the great ones. A real project provides a state of mind optimized for finding a solution.
Instead of just taking notes to answer questions you have *now*, package them up as potential answers to future questions you can’t even imagine.
The truth is that your note-taking system is not meant for finding answers.
You have Google for that. Answers are kinda boring anyway: they are correct one minute, and then no longer apply the next minute as conditions change.
The purpose of your note-taking system is to help you ask better questions, which no computer can do. The best questions don’t just seek what’s true, but what’s *interesting*. The best questions cleave a mental model in two, exposing its inner workings. The best questions open up new avenues of thinking, expanding outward in a generative space of possibilities. And they do all this with a humble and generous spirit of curiosity.
*In Part IV, we’ll apply compression principles to non-text media, including drawing, music, social media, and many others.*
### Progressive Summarization IV: Compressing All Types of Media | Praxis 1/7/19, 3(04 PM
## **Progressive Summarization IV: Compressing All Types of Media**
Tiago Forte Dec 29, 2017
Reading through the previous three parts, a question probably popped into your mind: does this apply only to text?
It’s an important one, because we are becoming a less text-based society.
Ubiquitous cameras, real-time video chats, and visual displays of information have become the norm. Which means expressions of creativity will increasingly take on these forms.
But the principle of compression is not at all limited to text. It is a universal feature of all information, and by extension, all media. It falls to us, however, to understand how it works and apply it to our medium of choice.
The story begins with this paper, one of the most influential works in the study of cross-disciplinary creativity in recent years.
The paper explains curiosity as “…the desire to create or discover more regular data that is novel and surprising…in the sense that it allows for **compression progress** because its regularity was not yet known.”
Basically, your brain prizes efficiency. If it can remember one thing instead of ten things, it’s happy. For example, learning that fruits are tasty is one piece of information, which is much easier to remember than individually memorizing each kind of fruit, and whether it’s tasty or not. This has obvious value for survival: the brain that is only trying to remember a few, vital pieces of information will outperform the brain trying to remember a bunch of inconsequential details.
The drive that has evolved to reward us for seeking out simpler kinds of data is called *curiosity*. Curiosity tends to get directed toward areas in which we don’t know something:
You know traffic patterns fluctuate, but your brain makes you curious about whether there is an easy-to-remember pattern
You know the sky is usually blue, but not always, and the brain wants a simple rule to know what to expect
You know people react negatively when you talk about your accomplishments, but your brain wants a simple heuristic to remember what to say or not say
The curiosity drive is pushing you to learn a pattern, a rule, or a heuristic that will help you *compress* your knowledge of that area into something easier to remember.
It doesn’t have to be a perfect rule. Those are rarely found, and never last.
What your brain is looking for is *progress* in its ability to compress its knowledge. Thus, *compression progress*. Any new rule that is better at compressing previously accumulated experience than the last one qualifies:
Traffic is worse during rush hour when people are traveling to and from work, which tends to be in the morning and late afternoon
The sky is always blue due to its chemical makeup, but can often be obscured by clouds, or darkened at nighttime
People react negatively when I talk about my accomplishments because they think I’m bragging, which implies I think I’m better than them
All these are rules or patterns we learned first as children, which got progressively refined as we grew older and wiser.
The brain rewards compression progress with *interestingness*, defined as the steepness of the learning curve from the previous rule to the new one. You learn as a baby that things fall when you drop them. A baby finds the discovery of gravity, which helps explain a wide range of things, tremendously *interesting* (thus all the objects they try dropping to test their new theory).
As an adult, when I learned about the concept of compound interest, I found it very *interesting*. It explained everything from the existence of banks, to why certain people are so much richer than others, to why people advise saving early for retirement. My brain can drop all those puzzle pieces it’s trying to fit together, and hold in its stingy memory just the one concept, compound interest.
The steeper the learning curve — the greater the improvement from the old rule to the new one — the more interesting you find a piece of data. Think back to the times your mind was totally blown. It’s likely that that new information was remapping something pretty deep in your understanding of the world. And it was a good feeling — your brain rewarding you for a job well done.
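As a rough illustration of why regular data is cheaper to remember, the sketch below uses Python’s standard `zlib` compressor as a stand-in for the brain’s rule-finding. This is an analogy under stated assumptions, not the formal measure used in the paper: the sample strings, the `description_length` helper, and the “bytes saved” score are all invented for the example.

```python
import random
import zlib


def description_length(data: bytes) -> int:
    """Bytes needed to store `data` after zlib compression (a crude proxy)."""
    return len(zlib.compress(data, 9))


# Hypothetical "observations": one stream with a hidden regularity, one without.
rng = random.Random(0)
patterned = b"morning jam, midday clear, evening jam; " * 50
noise = bytes(rng.getrandbits(8) for _ in range(len(patterned)))

# "Compression progress" in this toy setting: bytes saved once the regularity
# is exploited, compared with memorizing every observation verbatim.
progress_patterned = len(patterned) - description_length(patterned)
progress_noise = len(noise) - description_length(noise)

print("patterned:", progress_patterned, "bytes saved")
print("noise:", progress_noise, "bytes saved")
```

The patterned stream shrinks dramatically because a short rule describes it, while the random stream offers no rule to find. In the vocabulary of this section, only the first one leaves room for compression progress.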
Compression progress is at the heart of both science and art.
Scientists are always analyzing one tiny part of the world, trying to find a law that explains previous data more simply than the previously known law. They can’t disregard old data. They have to find a program that compresses (or explains) all the previous data better than the previously best known program.
Compression progress goes far beyond storage efficiency. Since short and simple explanations tend to reflect some repetitive regularity in the environment, they help predict the future as well. Scientists study past data to find a new theory that predicts the results of new experiments.
Getting better at compression progress not only helps you survive, it potentially helps you predict the future. Planning and preparing for an uncertain future becomes much easier when you’ve compressed your life experience into fundamental principles of how the world works. Some might call this *wisdom*.
But compression progress is also fundamental to art. Beauty is the artistic equivalent of interestingness. Studies have shown that our brains store a representative model of a human face, and then perceive a new face by looking at only the differences between it and this model. New faces that don’t deviate much from the model, for example, by being symmetrical or having simple proportions, are more easily compressed, and thus appear to us as more beautiful.
Jokes also work this way: the premise of the joke sets up a mental model with some sort of flawed assumption or hidden meaning. The punchline both reveals the flawed assumption, and corrects it with a simpler one very efficiently, often in just a few words. Our brain rewards this efficiency with the feeling of “funny.”
What you find when you look out on the world is compression everywhere.
Communication is compression, packaging up tangled thoughts into neat little words with agreed-upon meanings. Love is compression, fusing a series of experiences, memories, feelings, and thoughts into an exhilarating state of mind. Supply chains are compression, adding value at each stage and refining the product into the most elegant expression of utility possible.
And as we’ve discussed, note-taking in all its forms is compression. Let me show you some examples from other media.
**Picasso’s Bull**
One of the most famous and clear examples of compression is Picasso’s Bull, which he created as a series of 11 lithographs in 1945. His goal, it seems, was to find the “essence” of the beast in a series of progressively simpler images.
The first image is a lively and realistic drawing in full ink. In the second and third images, he adds expression and power, making them even more evocative of the spirit of the animal.
And then he stops building, and begins to dissect the creature along lines of force that follow the contours of its muscles and skeleton. Not unlike a butcher would carve meat. He starts to abstract the bull in the subsequent images, simplifying and outlining the major planes of its anatomy.
Ten years earlier Picasso had said “A picture used to be a sum of additions. In my case a picture is a sum of destructions.” We see this at work here, as he starts to erase sections of the bull to redistribute the balance between the front and rear. By enlarging the eye and flattening its horns, for example, he moves the focal point to the front of the animal.
But Picasso is also creating as much as he is destroying. He collapses the strength of the animal’s shoulders with a bold white line running diagonally in image 5, and then counterbalances it with an equally strong black line running parallel to the shoulders in image 6. The point where these lines intersect suggests the bull’s center of balance.
The compression continues in the final 5 images. As Picasso starts to understand the balance of form in the animal, how weight is distributed between the front and the back, he starts to remove structural lines of support that are no longer needed. Like a builder removing the external scaffolding of a building once it’s almost finished.
Picasso finishes the drawing with a final image, encasing what he has discovered are the most essential elements in a taut, nearly continuous outline. Along the way, even the head has been dropped as a means to emphasizing the horns, and the genitals are left in place only to confirm the animal’s gender.
What we are left with is a stunningly simple line drawing that somehow still manages to capture the fundamental spirit of a bull. It’s important to note that he probably couldn’t have achieved this in one step: the learning curve would have been *too* steep. Picasso once described this process as “charging up” his arm with the essence of the bull. He often wouldn’t keep the whole sequence, turning the canvas upside down and painting over it at each stage.
But what’s critical for us to understand is that *each of the stages was still necessary*. Each step contained its own discoveries, its own lessons, its own mistakes. He could have gotten discouraged, or taken a break to work on his art certification, or gone back to “perfect” his first try. But he took a different path: creative destruction. Breaking down, then building up. Diverging, then converging. Compressing, then decompressing.
If even an artistic genius like Picasso needed a step-by-step onramp to creativity, I think you can give yourself a break.
## **Progressive Summarization in music**
One of the participants in my online course *Building a Second Brain*, from which this series is drawn, applied P.S. to music. Here are the layers (or stages) he came up with:
Look at columns 3 and 4
You see the same principles at work in columns 3 and 4: a large volume of unknown quality comes in at Layer 0. He needs to select a subset for further listening and study, but also wants to preserve it all in case something slips through the cracks.
At Layer 1, he has a list of shortlisted tracks. At Layer 2, he’s purchased some of them, investing in their further development. At Layer 3, he’s tagged and starred a smaller subset, a further investment of attention. Layer 4 songs are ready to be played in live sets, and Layer 5 includes only the most valuable and important songs, that he’s actually played or recorded. By this point, these songs are part of his corpus of work and maybe even his identity.
These layers (or stages) define this person’s workflow: the sequence of steps in which value is added, including decision points, milestones, and outputs at each stage.
How he actually implements this workflow is up to him. The layers could be purely conceptual, a mental model of how ideas progress to recordings. He could track which layer each song is at on a kanban board, on a whiteboard, in a notebook, or with small pieces of colored tape. He could use tags and labels in a software program, or invent a naming convention using abbreviations, or designate separate folders for each layer.
Do you see how there are few principles, but infinite implementations? Leave room for your own creativity in the organizing process.
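One possible implementation, sketched in Python, stores the current layer of each track as a number and moves tracks up one layer at a time. The layer names paraphrase the description above; the track names and the `promote` helper are hypothetical, and a spreadsheet or tag system would serve equally well.

```python
from collections import Counter

# Layer names paraphrased from the music workflow described above.
LAYERS = {
    0: "incoming",
    1: "shortlisted",
    2: "purchased",
    3: "tagged and starred",
    4: "ready for live sets",
    5: "played or recorded",
}


def promote(library: dict[str, int], track: str) -> int:
    """Move a track up one layer, stopping at the top layer."""
    library[track] = min(library[track] + 1, max(LAYERS))
    return library[track]


# Hypothetical library: every track starts at whatever layer it has earned.
library = {"track-a": 0, "track-b": 0, "track-c": 2}
promote(library, "track-a")  # shortlisted after a first listen
promote(library, "track-c")  # tagged and starred after purchase

# How many tracks currently sit at each layer.
counts = Counter(LAYERS[layer] for layer in library.values())
for name, total in counts.items():
    print(f"{name}: {total}")
```

Nothing is ever deleted in this sketch: tracks that never get promoted simply stay at their current layer, which mirrors the idea of preserving everything in case something slips through the cracks.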
**Progressive Summarization in a tweetstorm**
Here’s a final example that’s more feasible for anyone to try. It comes from a tweetstorm (a series of threaded tweets on a topic, on Twitter) I came across one day.
Nicholas hasn’t taken my course, but I could see the progressive layers: Layer 0 is the original source, which is linked to at the end of the thread.
Layer 1 is the screenshot he took and attached to his tweet. Layer 2 is the section he highlighted. And Layer 3 is the body of the tweet, where he paraphrased one idea to further emphasize it.
Can you see what he’s done? Instead of doing what most people do, simply sharing a link with no context or explanation, he’s provided us with a finely graded onramp to understanding this source. If the body of his tweet catches my eye, I may decide to read the highlighted text. If that still resonates, I’ll read the text screenshot and the rest of the thread. If I want even more, the link to the full source is included at the end.
From my perspective, this is a service to humanity. I’m not forced to make an all-or-nothing decision to read the whole source, or nothing at all. He’s not teasing me with a comment like “What a great article!”, only for me to discover 20 minutes in that it has no relevance to me. He’s making use of the precious attention he’s spending reading and taking notes, to also make those notes more accessible to me as a prospective reader.
Imagine if we all did this. Social media could become a platform for learning and collaboration, instead of gossip and controversy. We could each immerse ourselves in the field we’re most passionate about, surfacing and distilling the best nuggets of knowledge for others to incorporate as building blocks in their own work.
Instead of everyone reading the same best-selling books, creating a market of a few superstar writers while the vast majority can’t make a living, we could democratize the craft. We could diversify our reading, into the niches and subcultures that rarely see the light of day, but probably hold world-changing insights.
This is a world worth creating. Building a second brain isn’t just about fortifying your own knowledge empire. It’s about becoming an authority, a distributor, and an ambassador of the knowledge you are uniquely positioned to understand and create.
## **Other examples**
I’ll leave you with a collection of examples from my own notes, from a wide variety of different kinds of media. They should illustrate that any kind of media can be progressively summarized:
Mindgym Performance Mgmt notes (**white paper**)
Psychological Capital paper notes (**academic paper PDF**)
Google Uses Structured Interviews to Improve the Hiring Process and Retention Rates (**online article**)
Critical Chain PM podcast (**podcast**)
Toyota Kata notes (**book**)
Premature Synchronization is the Root of All Evil (**tweetstorm**)
Baumol’s cost disease (**Wikipedia article**)
Notes on Demistifying the MOOC (NYT) (**New York Times article**)
Habit Mapping (**paper notebook**)
DOES San Francisco 2016 w/ John Willis notes (**YouTube video**)
Good Strategy, Bad Strategy notes (**audiobook**)
Giles Bowkett: Why Scrum Should Basically Just Die In A Fire (**blog post**)
Kirkpatrick’s Four-Level Training Evaluation Model — MindTools.com (**output from Liner app**)
Stratechery: Jeff Bezos’ Annual Letter, Facebook Messenger and Payments, Facebook Instant Articles Fizzing? (**email**)
## **Progressive Summarization V: The Faster You Forget, The Faster You Learn**
Tiago Forte Mar 5, 2018
In Part I, I introduced Progressive Summarization, a method for easily creating highly discoverable notes. In Part II, I gave you examples and metaphors of the method in action. Part III included my top recommendations for how to perform it effectively. Part IV showed how to apply the technique to non-text media.
In Part V, I’ll show you how Progressive Summarization directly contributes to the ultimate outcome we’re seeking with our information consumption: learning.
## **The burden of perfect memory**
In traditional schooling, the ability to recall something from memory is taken as the clearest evidence that someone has learned something. This is the regurgitation model of learning — the more accurately you are able to reproduce it, without adding any of your own interpretation or creativity, the higher your mark.
But in the real world, perfect recall is far from ideal.
This New York Times article tells the fascinating story of the 60 or so people known to have a condition called **Highly Superior Autobiographical Memory (HSAM)**. They can remember most of the days of their lives as clearly as the rest of us remember yesterday. Ask one of them what they were doing on the afternoon of March 16, 1996, and within just a few seconds they’ll be able to describe that day in vivid detail.
These are people who have achieved the holy grail of recall — perfect memory.
And yet, they often describe it as a burden:
**“Everyone has those forks in the road, ‘If I had just done this and gone here, and nah nah nah,’ everyone has those,” she told me. “Except everyone doesn’t remember every single one of them.” Her memory is a map of regrets, other lives she could have lived. “I do this a lot: what would be, what would have been, or what would be today,” she said… “I’m paralysed, because I’m afraid I’m going to fuck up another whole decade,” she said. She has felt this way since 30 March, 2005, the day her husband, Jim, died at the age of 42. Price bears the weight of remembering their wedding on Saturday, 1 March 2003, in the house she had lived in for most of her life in Los Angeles, just before her parents sold it, as heavily as she remembers seeing Jim’s empty, wide-open eyes after he suffered a major stroke, had fallen into a coma and been put on life support on Friday, 25 March 2005.**
It seems that perfect memory isn’t quite the blessing you’d expect.
## **The importance of forgetting**
I propose that forgetting is just as important to the process of learning as recall. As the world changes faster and more unpredictably, attachment to ideas and paradigms of the past becomes more and more of a liability.
Contrast this with most books and courses on “accelerated learning,” which tend to offer two kinds of approaches:
**#1 Increase the flow of information entering the brain**
This leads to techniques like spritzing, listening to audiobooks on 2x speed, speed reading, focusing on already highly condensed sources, blocking distractions, deep focus, and binaural beats.
**#2 Improve memory and recall of this information**
This leads to techniques like spaced repetition, memory palaces, mnemonics, music and rhyming, acronyms, and mindmapping.
All these techniques work. And they completely miss the point. They both operate with the same misguided metaphor: the mind as an empty vessel. You fill it with information like filling a jug with water, which you can then retrieve and put to use later. With this framing, your goal is to maximize how much you can get in, and how much you can take out.
But there’s a fundamental difference between a mind and a static container like a jug of water or a filing cabinet: a mind can not just *store* things; it can *take action*. And taking action is where true learning actually takes place.
Here’s the problem: the more we optimize for *storage*, the more we interfere with *action*. The more information we try to consume, meticulously catalogue, and obsessively review, the less time and space remain for the actions that matter: application, implementation, experimentation, conversation, immersion, experience, collaboration, making mistakes.
Learning is not an activity, process, or outcome that you can dial in and optimize to perfection. It is an emergent phenomenon, like consciousness, attention, or love. These states become harder and harder to achieve by trying to force them, a phenomenon known as hyper-intention.
The truth is, we don’t need to “accelerate” or “improve” the way our mind learns — that is what it evolved to do. All day, all night, whether you’re working or resting, talking or listening, focused or mind-wandering — your brain never stops drawing relationships, making connections, and noticing correlations. You couldn’t stop learning if you wanted to.
Knowing that our brain is continuously collecting information, our goal switches from *remembering* as much as possible, to *forgetting* as much as possible.
## **The information bottleneck**
Contrast this dim view of perfect memory with this article on new deep learning techniques in artificial intelligence. Specifically, a new theory called the “information bottleneck.”
The basic question researchers were trying to answer was, how do you decide which are the most relevant features of a given piece of information? When you hear someone speak a sentence, how do you know to ignore their accent, breathing sounds, background noise, and even words you didn’t quite catch, and still receive the gist of the message? It is a problem fundamental to artificial intelligence research, since computers will tend to give equal weight to all these inputs, and thus end up thoroughly confused.
It turns out, our highly constrained bandwidth for absorbing information is not a hindrance, but key to our ability to perform this feat. What our brain does is discard as much of the incoming noisy data as possible, reducing the amount of data it has to track and process. In other words, our brain’s ability to “forget” as much information as quickly as possible is what allows us to focus on the core message.
This is also how advanced new deep learning techniques work. Take for example an algorithm being trained to recognize images of dogs. A set of training data (thousands of dog photos) is fed into the algorithm, and a cascade of firing activity sweeps upward through layers of artificial neurons.
When the signal reaches the top layer, the final firing pattern is compared to a correct label for the image — “dog” or “no dog.” Any differences between the final pattern and the correct pattern are “back-propagated” down the layers.
Like a teacher correcting an exam and handing it back, the algorithm strengthens or weakens the network’s connections to make it better at producing the correct label next time.
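The correct-and-adjust loop described above can be sketched in a few lines. This is only a toy illustration, not the researchers' setup: it uses a single artificial neuron instead of many layers, and the two input features (“has fur” and “house in background”), the four training samples, and the names `train` and `predict` are all assumptions made up for the example.

```python
import math

# Hypothetical toy data: (has_fur, house_in_background) -> 1 for "dog", 0 for "no dog".
SAMPLES = [
    ((1, 0), 1),
    ((1, 1), 1),
    ((0, 1), 0),
    ((0, 0), 0),
]


def sigmoid(z: float) -> float:
    """Squash a raw score into a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, features) -> float:
    """Forward pass: weighted sum of the inputs, then the sigmoid."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(score)


def train(samples, epochs: int = 2000, learning_rate: float = 0.5):
    """Repeatedly compare predictions to labels and adjust the connections."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in samples:
            # Difference between what the neuron said and the correct label.
            error = predict(weights, bias, features) - label
            # Strengthen or weaken each connection in proportion to the error
            # (the gradient of the cross-entropy loss for a sigmoid neuron).
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


weights, bias = train(SAMPLES)
for features, label in SAMPLES:
    print(features, label, round(predict(weights, bias, features), 3))
```

In this toy data the label depends only on the “fur” feature, so training should push that connection up while the “house” connection has little to hold onto, a small-scale version of keeping the strong correlation and letting the weak one fade.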
This process is divided into two parts: in an initial “fitting” phase, the algorithm “memorizes” as much of the training data as possible. It tries to learn as much as possible about how to assign the correct labels. This is followed by a much longer compression phase, during which it gets better at generalizing what it has learned to new images it hasn’t seen before.
The key to this compression phase is the rapid shedding of noisy data, holding onto only the strongest correlations. For example, over time the algorithm will weaken connections between photos of dogs and houses, since most photos don’t contain both. It might at the same time strengthen connections between “dogs” and “fur,” since that is a stronger correlation. It is the “forgetting of the specifics,” the researchers argue, that enables the algorithm to learn general concepts, not just memorize millions of photos.
Experiments show that deep learning algorithms rapidly improve their performance at generalization only in the compression phase.
The key to generalizing the information we consume — to learning — is strictly limiting the incoming flow of information we consume in the first place, AND then forgetting as much of the extraneous detail as soon as we can. Sure, we lose some detail, but detail is not what the brain is best at anyway. It is best at making meaning, at finding order in chaos, at seeing the signal in the noise.
This paper on the role of forgetting in learning used problem-solving algorithms to determine exactly how much forgetting was optimal. Using a series of experiments testing different hypotheses, they found that the optimal strategy involved learning a large body of knowledge initially, followed by random forgetting of approximately 90% of the knowledge acquired. In other words, performance improved as knowledge was forgotten, right up until the 90% mark, after which it rapidly deteriorated.
Strikingly, they found that this was true even if that 90% included problem-solving routines known to be correct and useful. Trying to “forget” only the least useful knowledge also didn’t help — random forgetting performed far better. The researchers used these results to argue for the existence of “knowledge of negative value” — forgetting it actually adds value.
### Progressive Summarization is not a method for remembering as much as possible — it is a method for forgetting as much as possible. For offloading as much of your thinking as possible, leaving room for imagination, creativity, and mind-wandering. Preserving the lower layers provides a safety net that gives you the confidence to reduce a text by an order of magnitude with each pass. You are free to strike out boldly on the trail of a hidden core message, knowing that you can walk it back to previous layers if you make a mistake or get lost.
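The “safety net” idea can be pictured as a note that stacks each summarization pass on top of the previous one instead of overwriting it. The sketch below is one possible way to model that, not a description of any particular note-taking app; the class `LayeredNote` and its methods `add_layer`, `current`, and `walk_back` are hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class LayeredNote:
    """A note that keeps every summarization layer (hypothetical model)."""

    layers: list = field(default_factory=list)

    def add_layer(self, passages):
        """Add a more compressed layer on top of the existing ones.

        Each passage must appear somewhere in the layer below, so every
        highlight can be traced back to its source text.
        """
        if self.layers:
            below = self.layers[-1]
            for passage in passages:
                if not any(passage in source for source in below):
                    raise ValueError(
                        f"passage not found in previous layer: {passage!r}"
                    )
        self.layers.append(list(passages))

    def current(self):
        """Return the most compressed layer."""
        return self.layers[-1]

    def walk_back(self, steps=1):
        """Return an earlier, more detailed layer; nothing is deleted."""
        index = max(0, len(self.layers) - 1 - steps)
        return self.layers[index]


note = LayeredNote()
note.add_layer([
    "I propose that forgetting is just as important to the process of "
    "learning as recall.",
    "Attachment to ideas and paradigms of the past becomes more and more "
    "of a liability.",
])
note.add_layer(["forgetting is just as important to the process of learning as recall"])
note.add_layer(["forgetting is just as important"])

print(note.current())     # the boldest summary so far
print(note.walk_back())   # one layer more detailed
```

Because lower layers are never modified, an over-aggressive pass costs nothing: `walk_back` simply returns to the fuller text.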
## **Minimizing cognitive load**
How does Progressive Summarization help you offload as much of your thinking as possible? By minimizing the cognitive burden of interacting with information at all stages — initial consumption, review, and retrieval.
Cognitive load theory (CLT) was developed in the late 1980s by John Sweller, while studying problem solving and learning in children. He looked at how different kinds of tasks placed different demands on people’s working memory. The more complex and difficult the task, the higher the “cognitive load” it placed on the learner, and the greater the perceived mental effort required to complete it. He believed the design of educational materials could greatly reduce the cognitive load on learners, contributing to great advances in instructional design.
CLT proposes that there are three kinds of cognitive load when it comes to learning:
**Inherent** cognitive load: the inherent difficulty of the topic (adding 2+2 vs. solving a differential equation, for example)
**Extraneous** cognitive load: the design or presentation of instructional materials (showing a student a picture of a square vs. trying to explain it verbally, for example)
**Germane** cognitive load: effort put into creating a permanent store of knowledge (such as notes, outlines, diagrams, categories, or lists)
Instructional design, inspired by CLT, focuses on two goals:
**Reducing** inherent load by breaking information into small parts which can be learned in isolation, and then reassembled into larger wholes
**Redirecting** extraneous load into germane load (i.e. focusing the learner’s attention on the construction of *permanent stores of knowledge*)
Progressive Summarization accomplishes both objectives.
It *reduces* the inherent difficulty of the topic you’re reading about by eliminating the necessity of understanding it completely upfront. It instead treats each paragraph as a small, self-contained unit. Your only goal is to surface the key point in each “chunk” — each chapter, section, paragraph, and sentence — leaving it to your future self to figure out how to string those insights together.
It also helps *redirect* extraneous load into germane load, by saving all these chunks in a permanent store of knowledge, like a software program. You no longer have to hold in your head all the previous points in a text, and fit each new point into that structure on the fly. You dedicate your effort to constructing small chunks of permanent knowledge, which will be saved for later review.
But reducing cognitive load isn’t just about making learning *easier*. As learning becomes easier, it also becomes *faster, better, deeper, and stronger*.
## **Recall as inhibition**
Why is minimizing cognitive load so important to making learning deeper and stronger?
Because new learning can be impaired when a reader is trying to remember too many things at once. The more bandwidth being used for remembering and memorizing, the less bandwidth is available for understanding, analyzing, interpreting, contextualizing, questioning, and absorbing in any given period of time. Just as a bursting hard drive slows down a computer with even the fastest RAM, a brain crammed full of facts and figures starts to slow down even the smartest person.
This blog post describes recent research on what is known as “proactive inhibition of memory formation.” Offloading our thinking to an external tool lowers the brain’s workload as it encounters new information. In those experiments, telling participants they didn’t have to remember a list of items enhanced their memory for a second list of items.
At first, offloading your thinking seems to cause you to remember *less*. Especially if you do it immediately, as you read, such as with highlighting. The ideas seem to jump directly from the page to your notes, barely touching your brain. But in the long run, you actually end up remembering *more*.
Being able to frictionlessly hand off highlighted passages to an external tool, free of the anxiety that comes with keeping many balls in the air, you’re free to encounter the next idea with an empty mind. If it’s compelling, it will stick, regardless of any fancy memorization techniques you may think you need.
The more you try to memorize what’s in any given book, the less bandwidth is left over for seeing the patterns across them.
Your attachment to what you already know may actually interfere with your ability to understand new ideas. Clinging to our notecards, diagrams, and memorization schemes, we may be missing out on simply being present.
Carefree immersion is, after all, how children learn. And they are the best learners in the world.
## **Training your intuition**
Technology has given us the ability to “remember everything.” Coming from a legacy of information scarcity, this feels like a huge blessing. But it’s clear the blessing has become a curse. Our brains and our bodies are breaking under the strain of constant, high-volume, 24/7 information flows.
We must transition from knowledge hoarders to knowledge curators. We must learn how to frame our options about what to read, watch, and review in a way that restricts what we pay attention to, so we can see clearly instead of being overwhelmed.
What is being called into question is the very purpose of learning. What is learning *for*, now that we can access any knowledge on demand?
Learning is no longer about accumulating data points, but *training our algorithm*. Our algorithm is our intuition — our felt sense about what matters, what is relevant, what is interesting, and what is important, even if we’ve never seen it before and can’t explain why we like it.
What’s interesting is that, just like the deep learning experiments mentioned above, we *still* need massive amounts of data for the initial training phase. In other words, we need diverse, intense, personal experience. But 90% of the data we collect through these experiences can be ignored, discarded, or forgotten. What is left over is wisdom — the distilled nuggets of insight that, when deployed in the real world by someone who knows how to use them, can uncompress into dazzling feats of accomplishment. These nuggets of wisdom apply across a wide range of situations, can be communicated from person to person, and even last for centuries as timeless works of art.
### Progressive Summarization is about using the information you consume as training data for your intuition. You can consume a lot more, because you’re able to continuously offload it. But more importantly, even if you lost all that data, you would still be left with the greatest prize: who you’ve become and what you’re sensitive to as a result of the diversity and depth of your personal experience.
The new purpose of learning is to enable you to adapt, as the pace of change continues to accelerate and the amount of uncertainty in the world continues to spiral upward. This occurs at every level: adapting your lifestyle to fit changing societal conditions; adapting your productivity to fit changing workplace norms; adapting your communication style to fit new kinds of collaboration; adapting your thinking process to fit new ways of solving problems. It applies right down to the most narrow tasks — the hardest part about writing this article was the mental gymnastics I had to perform to not get stuck on my assumptions about what I was trying to say.
Making a dent in a universe that keeps changing shape increasingly requires working on projects and problems that are FAR bigger than you can hold in your head. The challenges of our time are vast and cross the disciplinary boundaries that experts limit themselves to. We need people who can hold the context of two or more completely different fields in their heads at once, and then apply their highly trained intuition to finding patterns and hidden connections.
A lot of people sense this intuitively, but their attempts to memorize and to recall all this context are futile. There’s simply way too much to know. And in the meantime you get frazzled, overwhelmed, and isolated attempting to do so. This is how we are missing some of our best and brightest minds, lost in their organizational systems as the world falls to pieces.
What we need is people who know how to recruit networks to “know” for them. Networks of people, objects, images, computers, communities, relationships, and places. To connect, unite, inspire, and facilitate collaboration between these networks.
And what does that take? It takes courage, to let go of the security of knowing everything ourselves. It takes vulnerability, to depend on others for our progress and success. It takes presence, noticing what we notice and being willing to bet on it before we know exactly why. It takes curiosity, being willing to ask questions that don’t yet have answers, or any reasonable path to an answer. It takes pushing through our assumptions about how learning should look to get what we know in the hands of someone who needs it, *right now*.
### Progressive Summarization VI: Core Principles of Knowledge Capture | Praxis 1/7/19, 3:07 PM
## **Progressive Summarization VI: Core Principles of Knowledge Capture**
Tiago Forte Jul 3, 2018
It might seem absurd that something as simple as a method of highlighting could be so important to a person’s productivity and learning. Even I’m surprised that’s turned out to be the case.
But as testimonials and stories have streamed in from people putting it to use around the world, I’ve become convinced that it is the beginning of a sea change in how we consume information. Just as mindless materialism has given way to mindful consumption, as we’ve realized that more is not always better, I believe we’re starting to see a parallel shift in our attitude toward information consumption. We’re learning that making is often more satisfying than consuming.
**“Economic development is based not on the ability of a pocket of the economy to consume but on the ability of people to turn their dreams into reality”**
**–Cesar Hidalgo, *Why Information Grows***
College students have told me they will never take notes any other way (“You mean my class notes could be useful even *after* I graduate?!”). Elite consultants have used it to help their clients make sense of the massive amount of data they have at their disposal. I’ve been happily surprised to hear people with many years of educational experience tell me that Progressive Summarization has reinvigorated their reading and note-taking.
https://praxis.fortelabs.co/progressive-summarization-vi-core-principles-of-knowledge-capture/
Even if you decide not to adopt the summarization method as I’ve described it in this series, I want to outline what I believe to be the universal principles of knowledge capture in the digital age.
In no particular order:
1. Interaction over consumption
2. Balance detail with discoverability
3. Opportunistic compression
4. Intuition over analysis
5. Focus most of your attention on the most valuable information
6. Tacit knowledge over explicit knowledge
7. Value questions over answers