A Theory of Consciousness

This article explains a theory of consciousness that is new to my admittedly-imperfect knowledge. The core idea is that consciousness arises from a feedback loop in which a mind's ability to externally communicate an idea connects internally with the same mind's ability to receive ideas from others. I'll argue that this kind of feedback loop adds value because it makes nascent thoughts easier to remember, easier to communicate, and more likely to generalize to new situations.

I do not consider the ideas in this article to be a complete theory of consciousness, as many details remain to be resolved. For example, exactly which pieces of the brain perform which tasks, and how are those tasks performed? How can we write a program that learns the way the brain does? Despite not answering these questions in detail, this article does attempt to provide a high-level conceptual framework which may make it easier to attack problems in this domain.

My background in this area

I'm not an expert in most fields closely related to studying consciousness. I'm painfully aware of the fact that I may accidentally be restating ideas that are considered old within those fields. However, throughout my life I've been interested in the question of understanding human consciousness, and I have read many accounts which I find less satisfying than the ideas outlined here. This makes me think that either the good ideas are not well-publicized, or perhaps there is some novelty here.

My background is primarily in machine learning, which is the study of algorithms that can improve with experience. Recently, a great deal of progress has been made in what can be done with algorithms based on neural networks with many layers, often associated with the phrase deep learning. Although most deep learning research is not aimed at directly mimicking the human brain, the inspiration for these approaches is rooted in how the brain works.

In turn, some of the principles that are coalescing within machine learning appear applicable to the way humans learn and think. As an example, there is a precise mathematical sense in which simplifying an idea corresponds to being able to better understand that idea. (In the context of machine learning, I'm referring to the idea of using regularization as a kind of "simplicity constraint" on a model, which helps avoid overfitting.) In the world of human brains, perhaps this mathematical principle corresponds to the idea that:

"if you can't explain it to a six-year-old, you don't understand it,"

often attributed (disputably) to Richard Feynman. To explicitly connect those principles, mathematical compression can be seen as a way to represent an idea in a minimalistic way, which is perhaps the simplest kind of representation available — that is, an expression that would be friendly to young minds.
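To make the regularization analogy concrete, here is a minimal sketch of my own (a toy illustration, not taken from any particular training system) of an L1-style simplicity constraint: soft-thresholding, the proximal operator of an L1 penalty, shrinks a dense vector of model weights and zeroes out the small components entirely, leaving a sparse, simpler representation of the same model.

```python
import numpy as np

# Soft-thresholding: shrink every weight toward zero by `penalty`, and set
# weights whose magnitude falls below the penalty to exactly zero. This is
# how an L1 regularizer produces sparse (simpler) models.
def soft_threshold(weights, penalty):
    return np.sign(weights) * np.maximum(np.abs(weights) - penalty, 0.0)

dense = np.array([2.5, -0.1, 0.05, -1.8, 0.2])
sparse = soft_threshold(dense, penalty=0.3)

print(sparse)  # the small components are now exactly zero
print(np.count_nonzero(dense), "->", np.count_nonzero(sparse))
```

The sparse result keeps only the two dominant weights, which is the mathematical sense of "simplifying" referred to above.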

As we'll see below, several intuitive principles from machine learning provide evidence that this article's theory adds value to how brains work, and could have evolved naturally.

What is consciousness?

A clear and precise definition of consciousness would obviate the need for a theory to explain it. Unfortunately, such a definition appears to be elusive. As I write this, Wikipedia's opening sentence on consciousness is:

"Consciousness is difficult to define, though many people seem to think they know intuitively what it is."

I'm going to state, confessing to ambiguity, that:

Consciousness is what it's like to know what's happening.

Each component phrase here is doing a fair amount of work. The phrase "what it's like" indicates an internal mental state that we experience, but that is difficult to share. The phrase "to know" encapsulates the difficulties of comprehending knowledge and perception. Finally "what's happening" has double duty, indicating relevant world events as well as mental events like thinking.

So we've arrived at a fuzzy definition: one that's good enough to share the idea of consciousness with another human, but that isn't quite good enough to answer many questions about it. Let's take a more detailed look at some of the key questions that seem difficult at this point.

Why consciousness feels mysterious

If you and I were talking about a photo, we could both physically point to the photo and rely on the fact that we were receiving a similar internal representation that we could discuss. This ability to connect ideas across minds transcends the merely physical; consider optical illusions. In the well-known grid illusion, a human mind perceives fuzzy dark circles at the intersections of the grid lines, even though no dark circles are printed there.

We can both point to the spots where the illusion appears. We're not pointing to anything we see in the physical world, but rather to something that we both perceive internally.

Discussing consciousness can feel similar in that we both experience it, yet there is nothing obvious in the physical world to point to. In fact, consciousness is more challenging to study because personal mental states are less deterministic than optical illusions. For example, if I read a short story and then ask you to read it, there's no guarantee that we'll have the same internal reactions to reading that story.

Imagine we each have a locked safe. Somehow, only you can see the contents of your safe, and only I can see what's in my safe. Inside each is an object with a new color never before perceived by either of us. I call my color "blurple," and try to describe it to you. You describe the color you see in your safe. We can guess that we perceive the same new color, but we can't really know. If our descriptions differed, it would be hard to interpret those differences.

This same difficulty in communication arises for consciousness. We're talking about internal mental states that are apparently extremely difficult to reproduce across minds. The hidden, individualized nature of our subject makes it harder to know if my consciousness feels to me the same as your consciousness feels to you. A number of interesting questions arise, all difficult: for example, which animals are conscious, to what degree, and how could we measure that?

I won't directly answer these questions, but I hope that some of them feel more approachable by the end of the article. For example, if we want to measure degrees of consciousness in animals, and we understand the pieces that compose consciousness, then we could ask which of the pieces the animal has, and how its abilities compare to those of a human brain.

The theory

A motivation for this theory is the fact that many thoughts take the form of an internal dialogue. From this, we can guess that consciousness might be like talking to yourself, except that your mind takes a shortcut to avoid saying anything out loud.

Let's make this idea more precise.

We can model the brain as executing a cycle that repeatedly chooses a goal, and then acts out a protocol toward that goal. One part of the brain is responsible for choosing goals, while another takes action on those goals. Some goals, such as exploring curiosity, stand out as feeling less directed than others. Nonetheless, those goals can still fit into this model. In the case of curiosity, the goal can be given as satisfying a lower-level desire to attain knowledge, perhaps akin to a kind of mental hunger.
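The goal-and-protocol cycle can be written down as a toy sketch. All names below are hypothetical illustrations of the model, not claims about actual brain modules: one component ranks competing drives and chooses a goal, and another looks up and acts out a stored protocol for that goal.

```python
# Hypothetical sketch of the cycle: choose a goal, then act out its protocol.
def choose_goal(drives):
    # Pick the most urgent drive, e.g. from {"hunger": 0.8, "curiosity": 0.3}.
    return max(drives, key=drives.get)

# Each goal maps to a protocol: a stored sequence of primitive actions.
PROTOCOLS = {
    "hunger": ["locate_food", "approach", "eat"],
    "curiosity": ["pick_topic", "explore", "update_knowledge"],
}

def run_cycle(drives):
    goal = choose_goal(drives)
    return goal, PROTOCOLS[goal]

goal, actions = run_cycle({"hunger": 0.8, "curiosity": 0.3})
print(goal, actions)
```

Note that even "undirected" drives like curiosity fit this shape once they are treated as a lower-level desire, as described above.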

In simple species, this relationship between goals and actions is easier to analyze. Neutrophils are a kind of white blood cell that can attack bacterial infections. Without ascribing awareness to these cells, we can still consider them to have a goal: destroying certain bacterial cells; and a sequence of actions to achieve that goal: essentially, ingesting the offending cells. Taking a few steps up the evolutionary ladder, the behavior of ants can likewise be understood in simple terms, such as seeking food, transporting food, or defending the colony.

As animals become social, some of their available actions focus on expressing an idea to another member of their species. On the receiving end of that expression is the ability to internalize and react to the idea communicated. To simplify, there is both social output and input; giving and receiving.

How can these abilities be internalized? Perhaps there is a kind of forum in the mind — a place where the different specialized components of thought come together to integrate their functions. When we walk up stairs, we receive visual, tactile, and balance-related perceptions about those stairs, and use those to guide the motion of our legs and feet to continue upwards. When we engage in conversation, we hear speech, filter out noise, think about the ideas being discussed, choose a reply, translate that reply into words, and speak those words aloud.

And if there is such a forum, then this is a place where one component of the brain, such as that which hears speech, is re-used to hear an internal voice. In fact, there is no need for the internal perception to be based on speech. It could be based on visual, tactile, or any other type of input we are familiar with.

Let's suppose that consciousness uses such a feedback loop. Then this would explain why thinking can take the form of talking in your head, drawing a mental picture, internally hearing a song, or imagining spatial relationships, such as those on a map.

So far, I've suggested a feedback loop that can account for some attributes of consciousness. A number of questions immediately arise. I'm going to focus on the two I find most interesting.

Question 1: Why would such a feedback loop be useful?

This is where things relate back to machine learning. When we think of ideas, we seem to have a relatively large immediate working memory. The act of forcing ourselves to translate that thought into speech corresponds to a conversion of a general data type into a simplified data type built with familiar concepts. In machine learning terminology, I would say that a raw working thought is a high-dimensional vector, whereas a speech-processed thought has been reduced to a simplified model — perhaps considered as a sparse vector.

By simplifying the model used to represent that thought, it becomes easier to remember or to communicate. Since the output of the translation forces the new idea into the context of previously-learned concepts, it also allows the new idea to be related to those concepts. This gives us triggers to recall and use the new idea. Finally, I argue that a simplified version of an idea is more likely to generalize in applications. This last thought is akin to using Occam's razor to choose a good explanation, or choosing a machine learning model with few parameters to reduce the risk of overfitting to data.
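One way to picture the translation step, under my own assumption (for illustration only) that familiar concepts can be modeled as vectors: score a raw, high-dimensional thought against each known concept and keep only the few best matches, producing a sparse description built from familiar building blocks.

```python
import numpy as np

# Hypothetical sketch: a raw thought is a dense vector; known concepts form a
# dictionary of vectors. Translation keeps only the top-k matching concepts.
rng = np.random.default_rng(0)
concepts = {name: rng.normal(size=16) for name in
            ["food", "music", "danger", "friend", "color"]}

def translate(thought, concepts, k=2):
    # Score each familiar concept by its similarity to the raw thought.
    scores = {name: float(np.dot(thought, vec)) for name, vec in concepts.items()}
    # Keep only the k best matches: a sparse, communicable description.
    return sorted(scores, key=scores.get, reverse=True)[:k]

# A "new thought" that is mostly a familiar concept plus some noise.
thought = concepts["music"] + 0.2 * rng.normal(size=16)
print(translate(thought, concepts))
```

The output is a short list of familiar concept names, which is far easier to store, recall, and communicate than the original dense vector.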

Question 2: For this idea to work, is it necessary to think in words?

So far, I've illustrated the feedback loop in terms of internal speech. But the fundamental mechanism doesn't have to rely on words. Instead, we can suppose that the translation portion of the feedback loop relies on familiar concepts that may or may not take the form of traditional language.

One model of the brain gives it a language center which is capable of learning to speak, hear, and write a language. (Language processing is now understood to take place across multiple areas of the brain which perform various specialized subtasks.) Some thinkers have posited that thought is limited by the language we speak. Indeed, Ludwig Wittgenstein once said:

"The limits of my language mean the limits of my world."

But I'm not convinced that this is the whole picture of language and thought.

In particular, animals are perfectly capable of choosing actions before they have learned a language. They clearly have concepts that are learned over time, such as a monkey that prefers a certain kind of food, or an infant that coos when she recognizes one of her parents. I suggest that these concepts, verbal or not, may be stored internally in a mind in the same way that words are.

The feedback loop described here is not based on words so much as it is based on familiar concepts, which we could alternatively think of as abstract words.

This leaves room for many kinds of thinkers — deaf people thinking in sign language, musicians thinking in song, mathematicians thinking of graphs, or anyone at all thinking about ideas they know well, yet have no words for.

The emergence of self-awareness

One interesting consequence of this theory is a way in which we may become aware of ourselves without external actions like looking in a mirror. When you or I talk to another person, there is some part of our brain that sees where they are, recognizes who they are, recalls what we know of them, and in general pulls together an internal picture of personhood for them.

Many of the externally-based signals that trigger this picture of personhood are missing when considering the self. Without a mirror, I don't see myself the way I see other people. By default, I don't usually hear my speech directed at myself. However, the mental feedback loop of speech can give me some recognition of the self that I wouldn't otherwise have. The same parts of my brain that actualize a picture of personhood for others can suddenly realize the same for myself. Similarly, other ideas, such as mental images, can be internally perceived, activating some of the same parts of the brain that recognize their external analogues; in other words, I am able to be aware of my own thinking process in much the same way that I'm aware of what I perceive of the external world.

Some possible details

This theory includes the idea that how we think about words overlaps heavily with how we think about familiar actions. That is, words can be considered as a specialized kind of action-building-block. A final result built of words is often created via the act of speech or of typing.

This gives a path of investigation: first we asked about consciousness; next, we considered an internal feedback loop based around translation of new ideas into a framework of abstract words; finally, we see abstract words as the building blocks of actions we perform. To tie everything back together, the actions I have in mind here are exactly the protocols mentioned earlier in the model of animal behavior as a cycle of goals and protocols.

If we wanted to simulate the abstract language functionality of the brain, a good starting point would be to work on a system that could learn and adapt action-based protocols to new situations. How would such a system represent its ever-growing alphabet of action-building-blocks? How would it learn? How would it perform lookups of appropriate actions? How would it adapt to new situations?

Perhaps actions are like functions in a programming language. We seem to be able to sequence action building blocks together, as well as to repeat similar actions, possibly with variable-like inputs. When I type, my finger motions are all similar, and we might factor out the idea of which letter is being pressed. At the highest level, I am thinking in the ideas I want to express. Below that is a breakdown into specific words. Closer to my fingers, I suspect that there is a muscle memory associated with common words like the or of. When I type an infrequent word, such as discombobulated, I find myself having to momentarily consider it letter-by-letter as I type it. To me, this means that I have internal functions that encapsulate typing the or of, while I don't have such a function for discombobulated, although such a function is forming as I type this article!
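The typing example above can be sketched as hierarchical action functions. The names and the caching scheme below are my own hypothetical illustration: every word can be typed letter-by-letter, but a frequently-practiced word gets its own compiled routine, loosely mirroring muscle memory.

```python
# Finest-grained action building block: pressing a single key.
press_log = []

def press_key(letter):
    press_log.append(letter)

# "Muscle memory": common words compiled into single stored routines.
muscle_memory = {}

def type_word(word):
    if word in muscle_memory:
        muscle_memory[word]()      # a practiced word runs as one unit
        return
    for letter in word:            # an unfamiliar word is spelled out
        press_key(letter)

# Practice compiles a common word like "the" into its own function.
muscle_memory["the"] = lambda: [press_key(c) for c in "the"]

type_word("the")
type_word("discombobulated")
print("".join(press_log))
```

Both calls produce the same kind of key presses, but the familiar word is dispatched as a single higher-level action, while the rare word falls back to its letter-by-letter recipe.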

The idea of functions allows us to consider most familiar actions as mere recipes, or lists of more finely-grained actions. Programmers, however, will be quick to realize that many behaviors can't be universally achieved without adding abilities like looping or conditional execution. When I think about conditional behavior in human minds, I suspect that conditionals are usually set up as trigger-based reactions. That is, the human brain seems to be primarily asynchronous in that many things happen in parallel, and our thoughts can be interrupted by incoming signals that suggest something more important may be happening. For example, if I'm working for many hours in a row, I don't stop every 10 minutes to ask, should I eat now? Rather, hunger occurs spontaneously and intrudes upon my thoughts as needed.
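Here is a toy sketch of trigger-based conditionals: the main task never polls "should I eat now?"; instead, an accumulating background signal interrupts it once a threshold is crossed. All names and thresholds are invented for illustration.

```python
# A trigger pairs a firing condition with a reaction.
triggers = {"hunger": (lambda level: level > 0.7, "stop and eat")}

def work(steps, hunger_per_step=0.15):
    hunger, log = 0.0, []
    for _ in range(steps):
        hunger += hunger_per_step          # background signals accumulate
        for fires, reaction in triggers.values():
            if fires(hunger):              # a trigger interrupts the main task
                log.append(reaction)
                hunger = 0.0
                break
        else:
            log.append("work")             # no interrupt: keep working
    return log

print(work(10))
```

The conditional lives in the trigger, not in the work loop itself, which matches the asynchronous, interrupt-driven picture described above.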

Since many actions require self-adjustment, there may also be a part of the brain that is able to watch an action unfold, and to provide immediate feedback on how things are going. This kind of action-awareness not only makes actions more robust, but it could also allow us to learn a new kind of action by watching someone else perform it. This mechanism could be the basis of translation of new ideas into familiar concepts.

To summarize the line of thinking here, these are some specific thoughts on how abstract language may work in the human brain: words overlap heavily with familiar actions, serving as action-building-blocks; actions behave like functions that can be sequenced, repeated, and given variable-like inputs; conditional behavior is handled by asynchronous, trigger-based reactions rather than explicit checks; and a self-monitoring component watches actions unfold, providing corrective feedback.

One of the consequences of these ideas is that we only truly understand something by doing it, although abstract concepts can be executed purely internally, such as solving a math problem simply by thinking about it. This matches the Chinese proverb:

I hear and I forget. I see and I remember. I do and I understand.

Why this theory makes sense

Here are several key ways we could evaluate a theory of consciousness: consistency with pathological brain states; consistency with internal observations of the brain, such as fMRI studies; usefulness as a guide toward building machine intelligence; agreement with personal experience; and plausibility as a product of evolution.

Throughout the rest of this section, I'll consider each of the above points in order.

Although it would be worthwhile, I'm not going to consider other explanations of consciousness here, nor the observational differences between them and this theory.

Let's take a look at pathological brain states. I think of this as a kind of external reverse engineering in the sense that we're finding clues about how things work without taking anything apart.

Consider the phenomenon of eidetic memory, which is the carefully studied variant of what is popularly known as photographic memory. A fraction of young children possess an impressive ability to recall details of something they've seen for a very brief period of time. Generally, this ability disappears by age 13. I see two interesting points here. First, in cases that have been carefully documented, these memories do not last forever; that is, they might better be called medium-term memories than long-term. Second, there is a relative sparsity of well-documented eidetic memory in adults; evidence suggests that the mechanism enabling eidetic memory in children tends to stop working by adulthood.

In terms of consciousness as a feedback loop based on translation into concepts, it makes sense that highly detailed memories would correspond with a perception that has been captured but not processed. Its internal representation is not sparse, and therefore would not be stored easily. It seems reasonable to guess that detailed images are typically not used much. As the images are not stored for long, this also matches the idea that abstract words are formed or solidified with use. Finally, it also makes sense that detailed, unprocessed mental images would eventually stop being stored in adult memories as they are replaced with translations into the more useful component abstract words.
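As a toy model of this storage argument (all numbers invented for illustration), imagine memories competing for a fixed maintenance budget each cycle: a dense, unprocessed image costs far more to keep than a sparse, translated summary, so it is dropped first.

```python
# Each memory is (name, storage_cost); cheap (sparse) memories fit the
# maintenance budget first, so expensive (dense) ones get dropped.
def retain(memories, budget, cycles):
    for _ in range(cycles):
        kept, spent = [], 0
        for name, cost in sorted(memories, key=lambda m: m[1]):
            if spent + cost <= budget:
                kept.append((name, cost))
                spent += cost
        memories = kept
    return [name for name, _ in memories]

memories = [("dense_image", 90), ("sparse_summary", 5), ("another_summary", 8)]
print(retain(memories, budget=20, cycles=3))
```

Under this caricature, the unprocessed image behaves like an eidetic memory: present at first, but not retained, while translated summaries persist.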

I imagine there are other pathologies that would shed further light on this line of thinking, but I'm going to move on to internal reverse engineering, by which I mean understanding how the brain works by looking at different elements within it. This can be achieved, for example, by looking at fMRI-based images of brains performing certain actions. I don't have the means to perform fMRI experiments at the moment, but such experiments could be designed to test the hypothesis that some parts of the brain active during speech may also be active during internal speech, as well as some being active during non-verbal protocol execution, such as when trying to teach an action — like riding a bike — to another person.

The next consideration is the creation of machine intelligence. One could attempt to build a kind of artificial general intelligence that builds on linking actions and abstract words. This work could attempt to create a learning system that represented ideas as program-like functions, including the ability to translate from new protocols into familiar functions, as well as to execute those functions within a self-adjusting feedback loop. Some recent progress has been made in the world of reinforcement learning, allowing algorithms to learn how to play video games given very little starting knowledge. This work matches the current article in that reinforcement learning fits well into the model of behavior as a goal-protocol cycle, although a more detailed analysis would be needed to check for finer-grained supporting details.
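To illustrate how reinforcement learning fits the goal-protocol picture, here is a minimal tabular Q-learning sketch on a toy five-state chain world. This is the standard textbook algorithm, not something from this article or any specific system: the reward encodes the goal, and the learned greedy policy is the protocol the agent acts out.

```python
import random

random.seed(0)

# States 0..4 on a line; actions move left (-1) or right (+1).
# Reaching state 4 yields reward 1.0 and ends the episode.
q = {(s, a): 0.0 for s in range(5) for a in (-1, +1)}

def step(state, action):
    nxt = min(max(state + action, 0), 4)
    return nxt, (1.0 if nxt == 4 else 0.0)

for episode in range(200):
    state = 0
    while state != 4:
        # Epsilon-greedy: mostly exploit the best-known action, sometimes explore.
        if random.random() < 0.2:
            action = random.choice((-1, +1))
        else:
            action = max((-1, +1), key=lambda a: q[(state, a)])
        nxt, reward = step(state, action)
        best_next = max(q[(nxt, -1)], q[(nxt, +1)])
        # Standard Q-learning update: learning rate 0.5, discount 0.9.
        q[(state, action)] += 0.5 * (reward + 0.9 * best_next - q[(state, action)])
        state = nxt

# The learned "protocol" for each non-terminal state.
policy = [max((-1, +1), key=lambda a: q[(s, a)]) for s in range(4)]
print(policy)
```

The agent starts with essentially no knowledge and ends up with a stored action sequence toward its goal, which is the goal-protocol cycle in miniature.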

In terms of my personal experience, this theory matches well. I sometimes think in internally-spoken words, sometimes in images or sounds. I occasionally think in non-verbal abstract words, especially when thinking about a concept in math that is most easily understood through an example visualization or structure. When I analyze my own action protocols, they do feel like hierarchical programs that are generally asynchronous — I don't usually stop to check for conditionals — composed of loops and smaller programs, and equipped with variables that make them adaptable to new situations. I cannot speak to your personal experiences of consciousness, but I invite you to make similar comparisons.

Finally, there is the question of evolution. It seems natural to me that simple species have goals and corresponding actions to work toward those goals. Next comes the ability to learn actions by watching others perform them, perhaps triggered by an internal mechanism designed to give self-adjusting feedback based on perception of one's own in-progress motions. After that — or perhaps evolving in parallel — is the ability to explicitly teach ideas to others. When all of these pieces are in place, a mind is already in a position to become smarter by experimenting in a mental forum rather than just in reality. It seems natural that the mental forum — the nexus of perception and action protocols — would already exist. The last piece to fall into place would be the use of these components to close the internal feedback loop.

Closing the loop

I'll summarize the key ideas.

Consciousness is what it's like to know what's happening. It's an internal experience we all have, yet cannot share in detail. Its details are highly individual, ephemeral, and not easily captured exactly in words. As far as I know, there is no widely-accepted, detailed explanation of how consciousness works.

The idea presented here is that consciousness arises from a process in which a new idea occurs, is translated into familiar concepts, and those familiar concepts are perceived similarly to the perception of external stimuli such as speech. These familiar concepts can take the form of spoken words, known visual ideas, or other previously-understood ideas. The ideas themselves are an internalization of actions; these actions are used to move us toward accomplishing our goals.

Abstract words take the form of programming functions with variables, sequences, and the ability to call other functions. Conditionals are based on asynchronous triggers to take branching actions.

The process of translating new ideas into familiar concepts is useful because it simplifies the model of the idea, relates it to previous ideas, and adds trigger points to bring the new idea to mind. Simplified models support memory, generalization to new applications, and the ability to communicate the idea to others. These justifications of the translation idea can also be viewed as an explanation for why teaching helps us to learn, or for why explaining a problem to a silent audience can still help us make progress on that problem.

These ideas can explain self-awareness: the same brain component which builds a picture of personhood for others can build a picture of personhood for ourselves by perceiving our internal concepts, such as internal speech, as if they had been observed externally, coming from another person.

I have not attempted to carefully support this theory in terms of other theories, nor in terms of experiments, although I think that would be worthwhile. I've argued in favor of these ideas based on qualities of my personal experiences of consciousness, which I am guessing may be shared by other humans. I've also listed some ideas based on pathological neurology and evolution that support the ideas of this article.

I'd be curious to learn of other detailed theories of consciousness, as well as about other forms of evidence that might support this theory, or provide hints about how these ideas could be improved.