Understanding Narrative Transportation Using NotebookLM

In a previous post I discussed Embracing Narrative Transportation as a way to enhance the impact of your personal stories. While I enjoy writing articles about storytelling, I’ve also been exploring AI (Artificial Intelligence) to broaden my research capabilities.

I have to say, it’s amazing to watch these chatbots in action. They are fast and comprehensive in their reach, and they can locate and analyze information faster than any human. But they don’t replace us, as they lack the lived human experience that we bring to our stories.

So I decided to see what some of the latest chatbots had to say about Narrative Transportation as it relates to crafting impactful personal stories. In part two, I asked NotebookLM what it thought. It’s a long article, but if you have an interest in AI, it’s an interesting read.

Input:

Google’s NotebookLM answers queries using the sources that have been uploaded to a specific notebook, so I added 42 sources that each discussed some aspect of narrative transportation. I then asked 8 questions, which are listed in the following text along with the output from NotebookLM.

Output:

Here’s what NotebookLM had to say:

Overview

The provided texts collectively explore the multifaceted nature and impact of narrative. Monika Fludernik’s work offers a comprehensive introduction to narratology, defining narrative elements and theoretical frameworks for understanding storytelling across various media. Lisa Libby and Melanie Green examine the psychological phenomenon of narrative transportation, investigating how engagement with stories influences beliefs, attitudes, and self-perception, even in the face of conflicting information.

Helena Bilandzic focuses on measuring narrative engagement, identifying its components and relationship to concepts like identification. Martin Payne’s “Narrative Therapy” discusses the application of storytelling principles in therapeutic contexts, emphasizing the power of personal narratives in shaping identity and facilitating change. Finally, Lucas Bietti and Kurt Braddock, along with Michael Dahlstrom and William Buchanan, investigate the adaptive functions of storytelling and the persuasive effects of narrative on beliefs and cognitive processes like recall and comprehension, highlighting the immersive “experience” of transportation into a story world.

Questions

(1) What are the core components and definitions of Narrative Transportation as described in the provided documents?

Narrative transportation is an experiential state of immersion in which all mental processes are concentrated on the events occurring in the narrative. It’s the process that enables narrative persuasion, occurring when a person becomes so immersed in a story that they feel as if they are truly experiencing it.

Here are some core components and definitions of narrative transportation as described in the sources:

  • Definition: Narrative transportation is defined as a convergent mental process where all mental systems and capacities become focused on events occurring in the narrative. It involves individuals becoming absorbed into a story world. This psychological state varies depending on the text or media product, the recipient, the situation, and interactions between these factors.
  • Key Components: Several key cognitive and emotional components contribute to narrative transportation:
    • Focused Attention: Receivers become so focused on the story that they do not think about distractions in their environment. A high level of focus or attention to the narrative is considered a necessary but not sufficient condition for transportation.
    • Emotional Engagement (including Empathy): There is a strong emotional response to the story with empathy for the characters. Transportation involves empathy, where story receivers try to understand the experience of a story character, knowing and feeling the world in the same way.
    • Mental Imagery: Receivers have a mental image of the story in their heads or try to predict what might happen in the future of the story. Story receivers generate vivid images of the story plot, feeling as though they are experiencing the events themselves.
    • Cognitive Detachment from Reality (Suspended Reality/Loss of Access to the External World): When transported, story receivers lose track of reality in a physiological sense. They mentally leave the real world behind and become entirely focused on the story world, although recent perspectives emphasize this is a dynamic process. This also includes a lack of awareness of surroundings.
  • Related Concepts and Distinctions:
    • Narrative transportation is distinct from a narrative itself, which is an antecedent of narrative transportation. Narratives are defined as a connected series of events that typically include the actions and experiences of one or more protagonists that are causally related.
    • While similar to other experiential constructs like absorption, flow, and immersion, narrative transportation has specific characteristics. Absorption is a personality trait, while transportation is temporary. Flow is more general, while transportation specifically entails empathy and mental imagery. Immersion is more related to aesthetic and visual elements, whereas transportation relies on a story with plot and characters. Identification emphasizes involvement with characters, while transportation concerns involvement with the entire narrative.
  • Outcomes: Narrative transportation can lead to several outcomes:
    • Belief and Attitude Shift: Transportation can influence receivers’ beliefs, attitudes, intentions, and behaviors, aligning them with the story’s content.
    • Reduced Counterarguing: When engrossed in a narrative, the receiver is less likely to challenge the story’s content.
    • Story-consistent beliefs and favorable evaluations of protagonists.
  • Factors Influencing Transportation: The degree of transportation depends on various factors related to the story (e.g., narrativity, quality, coherence, characters), the recipient (e.g., transportability, need for affect), the medium, and the situation. Narrativity, which includes content dimensions and structural features like a sequence of causally related events unfolding over time, contributes to transportation by enabling an easy flow of comprehension.

In essence, narrative transportation is a psychological state of deep engagement with a story, involving focus, emotion, imagination, and a temporary detachment from the real world, which can subsequently influence beliefs and attitudes.

(2) According to these documents, what factors contribute to or hinder an individual’s experience of Narrative Transportation?

According to the documents, several factors contribute to or hinder an individual’s experience of Narrative Transportation. These factors can be broadly categorized into story characteristics, recipient traits and individual differences, situational factors, and medium characteristics.

  1. Story/Narrative Characteristics: These are features inherent to the narrative itself that can influence how easily a person becomes transported.
  • Narrativity and Literariness: Stories with high narrativity, meaning they include protagonists with goals, ensuing conflicts, and follow canonical affective trajectories, tend to elicit stronger narrative transportation. Well-crafted, high-quality narratives are generally more transporting. Literariness can also contribute to transportation.
  • Coherence: A story that provides clear and sensible links between story events, character actions, and other elements is more transporting.
  • Character Development: Well-developed characters contribute to higher transportation. Identifiable characters, where the story receiver can clearly pinpoint the personas, facilitate transportation by allowing for vicarious experience of beliefs and emotions.
  • Emotional Intensity and Range: Stories with a greater emotional range or intensity can be more transporting.
  • Suspense and Dramatic Tension: The presence of suspense or dramatic tension can increase transportation.
  • Psychological Realism (Believability): Even in fantasy settings, the plot and characters should be plausible and act like real people to enhance transportation. Narratives that lack verisimilitude (realism or the appearance of being true) tend to be less transporting. However, this realism refers more to psychological realism than literal similarity to the real world.
  • Clarity and Ease of Comprehension: Narratives that are easy to read and understand facilitate transportation. Conversely, written narratives that are difficult to read, understand, or imagine tend to be less transporting.
  • Vividness and Detail: Rich detail may make stories more transporting. Stories containing rich, concrete imagery can increase levels of transportation.
  • Novelty: Foregrounding in a narrative can increase perceived novelty, potentially contributing to transportation.
  • Story Line Appeal and Quality of Production: These are suggested as important text factors influencing transportation.
  • Unobtrusiveness of Persuasive Subtext: When persuasive messages are presented subtly, transportation may be higher.
  • Similarity of Characters with the Self (Homophily): Characters who are similar to the reader can increase transportation. Personally relevant stories also tend to be more transporting.
  • Narrative Perspective: While third-person perspective stories can be transporting, there is some evidence that the first-person perspective might be particularly effective.
  • Coherence of Narrative Elements: A series of interrelated events that constitute a developed narrative, where the reader relies on the author, and the author relies on the reader’s ability to interpret, contributes to the experience.
  • Causal Information: Information that is part of the causal chain of the story may lead to greater engagement.
  2. Recipient Traits and Individual Differences: These are characteristics of the individual reading or experiencing the narrative.
  • Transportability: This is a generalized tendency toward being transported by a narrative. Individuals high in transportability are more likely to become transported. This trait involves subdimensions like suspense/curiosity propensity, emotional engageability, and ease of accepting fictional worlds.
  • Need for Affect: Individuals with a higher need for affect (a tendency to seek out and enjoy emotion-inducing situations) are more likely to be transported into narratives.
  • Imaginative Engagement: A higher capacity for imaginative engagement is linked to increased transportation.
  • Narrative Comprehension Ability: How well a person understands the story, characters, and conflicts can affect transportability.
  • Tendencies Toward Mental Imagery Production: Individuals who are better able to imagine the events and characters in a narrative are more likely to be transported. However, individuals with a low ability to generate vivid mental imagery might be more transported by visual media like films than by texts.
  • Tendency Toward Emotional Responding: Individuals with a greater tendency toward emotional responding may have higher transportability.
  • Prior Knowledge: Familiarity with a story topic can increase transportation. Prior knowledge that makes it easier to understand and connect with characters can also make transportation more likely. However, if personal experience contradicts the story, transportation can be disrupted as individuals may adopt a critical mode of processing.
  • Attention: Paying attention to a story is a prerequisite for transportation.
  • Self-Concept Accessibility: The level of accessibility of an individual’s self-concepts upon entering a narrative world can influence experience-taking, which is related to transportation. High self-concept accessibility did not appear to increase distraction or reduce absorption but was associated with lower levels of experience-taking.
  • Literary Expertise: Readers with literary expertise may process texts differently due to their memory representations, including genre expectations.
  • Demographic Factors: Some research suggests that younger individuals and females may experience more narrative transportation. Education level has also been linked to transportation.
  3. Situational Factors: These are elements of the environment or context in which the narrative is experienced.
  • Pre-Narrative Instructions: Instructions given to readers or viewers before engaging with the narrative can affect transportation. For example, instructions to relax and enjoy the narrative can increase transportation, while instructions to focus on surface aspects like grammar can decrease it.
  • Prior Expectations (e.g., Reviews): Positive expectations created by prior positive reviews of a story can encourage greater transportation. Conversely, negative reviews might hinder it.
  • Distraction and Cognitive Load: Anything that distracts the individual or increases cognitive load can interfere with transportation.
  • Co-viewing: The presence and reactions of other people while experiencing a narrative (e.g., watching a movie) can influence transportation and its effects.
  4. Medium Characteristics: The format in which the narrative is presented can also play a role.
  • Type of Medium: Different media (e.g., written, audio, video, virtual reality) can influence the extent of narrative transportation. For example, virtual reality stories may evoke stronger physiological responses.
  • Features of the Medium: Elements such as music in video narratives can affect transportation. The sensory capabilities of a medium (e.g., VR haptic interfaces) can influence how stories can be told and experienced.
  • Visual Narrativity: For visual narratives, the presence of setting, a focal actor, and elements for viewer resonance contribute to visual narrative transportation.

It’s important to note that these factors can interact with each other, and the experience of narrative transportation is often a dynamic process that can fluctuate over time.

(3) How does Narrative Transportation, as explained in these texts, affect an audience’s beliefs, attitudes, and emotions in relation to a story?

Narrative Transportation, as explained in these texts, significantly affects an audience’s beliefs, attitudes, and emotions in relation to a story by immersing them in the narrative world, leading to various psychological and persuasive outcomes. This immersion, characterized by focused attention, emotional engagement, and vivid mental imagery, has several key effects:

  1. Belief Change:
  • Reduced Counterarguing: When individuals are transported into a narrative, they are often less likely to critically evaluate the information presented and may have a reduced tendency to counterargue claims made in the story. This can lead to the acceptance of story-consistent beliefs.
  • Perceived Realism: Higher levels of transportation can lead story receivers to perceive the story and its plot as more realistic and truthful, even if they know it is fictional. This perception of truthfulness can then prompt them to adopt story-consistent beliefs.
  • Influence of Characters’ Beliefs: Strong feelings and connections formed with story characters during transportation can enhance the influence of those characters’ experiences and beliefs on the audience’s own beliefs.
  • Memory Reinforcement: The mental imagery formed during transportation can be recalled later, potentially re-evoking the original communication and reinforcing story-relevant beliefs over time.
  • Generalization of Lessons: Audiences tend to generalize lessons learned from transported narratives and apply them to real-life situations, even if the stories focus on specific individuals and events.
  • Causal Information Acceptance: Information presented at causal locations within a narrative tends to be perceived as more truthful regarding the real world, and this effect can persist over time.
  2. Attitude Change:
  • Favorable Evaluations: Transportation can lead to more positive feelings and evaluations of sympathetic characters within the narrative.
  • Story-Consistent Attitudes: The engrossing experience of narrative transportation can result in more positive attitudes towards the story itself and story-consistent attitudes toward the experiences described within it.
  • Mediation by Affect and Cognition: Narrative transportation can cause both affective (emotional) and cognitive responses, which in turn mediate changes in attitudes. Higher transportation is associated with increased story-consistent affective responses and narrative thoughts, and decreased critical thoughts.
  • Persuasion: Narrative transportation is a key mechanism of narrative persuasion, influencing changes in consumers’ attitudes and intentions. Individuals more transported into narratives often show more positive attitudes towards related topics, such as products or social policies.
  • Reduced Resistance: Stories, particularly when transporting, can be less threatening than direct persuasive messages, helping to overcome resistance to attitude change.
  3. Emotional Impact:
  • Emotional Engagement: Transportation inherently involves emotional engagement with the narrative. Transporting narratives are effective partly because they evoke emotions.
  • Vicarious Experience: Through transportation, audience members can vicariously experience the emotions of the characters, leading to emotional responses aligned with the narrative events.
  • Emotional Flow: The pattern of emotional experiences throughout a story, including shifts in emotional tone and the creation and resolution of suspense, can help maintain audience interest and increase transportation.
  • Intensity of Response: Greater levels of narrative transportation can cause story receivers to perceive the story as more realistic, leading to more intense and story-consistent emotional responses.
  • Connection to Characters: Emotional connections formed with characters during transportation contribute to the overall emotional impact of the story.

In summary, Narrative Transportation acts as a powerful process that allows audiences to become deeply involved in a story world. This immersion reduces critical processing, enhances emotional engagement, and fosters connections with characters and the narrative, ultimately leading to shifts in their beliefs, attitudes, and emotional states that are often consistent with the story’s content and implications. This effect is distinct from more analytical forms of persuasion.

(4) Based on the uploaded documents, what are the key principles of Narrative Transportation that can be applied to crafting impactful personal stories?

Based on the uploaded documents and our conversation history, several key principles of Narrative Transportation can be applied to crafting impactful personal stories:

  • Narrative Quality and Coherence: A well-crafted personal story should have a clear and logical flow of events. The connections between different parts of the story, including cause-and-effect relationships, should be sensible and easy to follow. Ensuring that the actions and motivations of the storyteller (and any other characters involved) are consistent will enhance the coherence.
  • High Narrativity: Personal stories should possess strong narrativity, which encompasses both the content (the sequence of events and the individuals involved) and the discourse (how the story is told using literary devices). A compelling personal story will have identifiable elements of a story, such as a beginning, rising action, climax, falling action, and resolution (even if implicit).
  • Vivid Imagery: To facilitate transportation, a personal story should incorporate rich and descriptive language that allows the audience to form vivid mental images of the events, settings, and people involved. Engaging sensory details (what was seen, heard, felt, etc.) can significantly enhance this aspect.
  • Emotional Engagement: Impactful personal stories evoke emotions in the audience. Sharing your own genuine emotions and describing events in a way that resonates emotionally with listeners or readers is crucial. The emotional arc of your story, including moments of tension, joy, sadness, or reflection, can contribute to greater transportation.
  • Focused Attention: A well-crafted personal story will capture and maintain the audience’s attention. This can be achieved through compelling plot points, intriguing details, and a pace that keeps the audience engaged without overwhelming them.
  • Identifiable Characters (Especially Yourself): Even in a personal story where you are the central figure, making yourself relatable and allowing the audience to understand your experiences and emotions will foster empathy and connection. Sharing your inner thoughts, motivations, and vulnerabilities can help the audience identify with you.
  • Psychological Realism: While personal stories are factual, presenting them in a way that feels psychologically plausible – where the emotions and reactions feel genuine and understandable – can enhance transportation. Even when recounting unusual events, grounding the narrative in relatable human experiences is important.
  • Personal Relevance and Similarity: When crafting a personal story, consider how it might resonate with your intended audience. Highlighting themes, experiences, or emotions that the audience can relate to or find relevant to their own lives can increase their level of transportation. Similarities between the storyteller and the audience can also lead to higher levels of transportation.
  • Perspective: While not explicitly stated as a necessity, using the first-person perspective (“I”) in a personal story can naturally enhance the feeling of direct experience and foster a closer connection between the storyteller and the audience, potentially increasing transportation.

By thoughtfully applying these principles, individuals can craft personal stories that are not just accounts of events, but immersive experiences that can affect an audience’s beliefs, attitudes, and emotions, making the stories more impactful and memorable.

(5) What narrative elements or techniques, identified in these documents, are most effective in fostering Narrative Transportation in personal storytelling?

Based on the uploaded documents and our conversation history, several narrative elements and techniques are identified as being most effective in fostering Narrative Transportation in personal storytelling:

  • Coherence and Narrative Quality: A personal story that flows logically, with clear connections between events, character actions, and other elements, is more likely to be transporting. When the cause-and-effect links are sensible, the audience can more easily become immersed.
  • Strong Narrativity: This involves both the what (narrative content – the sequence of events and characters) and the how (narrative discourse – the literary devices used to tell the story). A well-defined sequence of events with relatable characters provides a foundation for transportation.
  • Vivid Mental Imagery: Using rich and descriptive language that appeals to the senses helps the audience create mental images of the story world and events, a key component of transportation. Concrete details and sensory information make the story more engaging and immersive.
  • Emotional Engagement: Stories that evoke genuine emotions in the audience are more transporting. Sharing your own emotions authentically and describing experiences in an emotionally resonant way allows the audience to connect with the story on a deeper level. The emotional arc of the story, including moments of suspense, joy, or reflection, also plays a crucial role.
  • Focused Attention: While the storyteller primarily influences this through the above elements, crafting a story that captures and holds the audience’s attention is vital for transportation. Avoiding unnecessary digressions and maintaining a compelling pace can aid in this.
  • Identifiable Characters (Especially the Teller): Making yourself and any other individuals in your personal story relatable and understandable fosters empathy and identification, which are closely linked to transportation. Sharing thoughts, feelings, and motivations can help the audience connect with your experience.
  • Psychological Realism: Even when recounting extraordinary personal events, grounding the story in believable human psychology – ensuring that actions and reactions feel plausible – enhances transportation. The audience should be able to understand the characters’ (including your own) motivations and emotional states.
  • Personal Relevance and Similarity: When the audience perceives the personal story as relevant to their own lives or identifies similarities between themselves and the storyteller or the situation, transportation is more likely. Highlighting universal themes or relatable experiences can bridge this gap.
  • Perspective (First-Person): While not explicitly stated as a definitive rule in the sources, using the first-person perspective (“I”) in personal storytelling naturally creates a sense of immediacy and direct experience, potentially drawing the audience more directly into your narrative and fostering transportation.

In essence, crafting impactful personal stories that foster Narrative Transportation involves telling a well-structured and coherent story with vivid details that evoke emotions, allowing the audience to connect with the storyteller and experience the narrative in a way that feels real and relevant to them.

(6) Are there any limitations or potential drawbacks of relying on Narrative Transportation to create impactful personal stories, according to these sources?

Based on the uploaded documents and our conversation history, there are several limitations or potential drawbacks of relying on Narrative Transportation to create impactful personal stories:

  • Distraction by Irrelevant Details: While vivid details are crucial for transportation, an overabundance of them, or details that are irrelevant to the main point of the personal story, can be a significant drawback. These details might obscure the relevant elements of the story and detract from the intended impact. Listeners or readers could become focused on less important aspects, potentially missing the core message or the intended emotional or attitudinal shift.
  • Potential for Misinterpretation and Misapplication: If the audience becomes highly transported but fixates on tangential similarities between the story and their own lives, they might draw incorrect or unintended conclusions. This “distraction by details” could lead to decisions or perspectives that are not aligned with the storyteller’s intention.
  • Risk of Perceived Manipulation: If a personal story is perceived as being overly crafted or manipulative in its attempt to induce transportation, it could backfire. Audiences might become aware of the persuasive intent, which can lead to increased skepticism and reduced effectiveness of the story. The sincerity and authenticity often expected in personal stories are crucial to avoid this perception.
  • Ethical Considerations: Depending on the content of the personal story, high levels of transportation could raise ethical concerns. For instance, if the story inadvertently normalizes or glorifies harmful behaviors, a transported audience might be more susceptible to these negative implications. Accuracy, potential for manipulation, and the portrayal of oneself and others with respect are important ethical considerations.
  • Focus on the Story Over the Message: The very elements that make a personal story highly transporting – engaging plot, vivid characters, and emotional resonance – might overshadow the specific belief, attitude, or behavior the storyteller intends to influence. The audience might remember the story and their emotional experience but not necessarily internalize the intended lesson or perspective.
  • Individual Differences in Transportability: Not everyone is equally susceptible to narrative transportation. Factors like “transportability” (a person’s general tendency to become absorbed in narratives), prior experiences, and the situational context can influence how deeply someone is transported. Therefore, a personal story that is highly transporting for one person might have less impact on another.
  • The Role of Critical Thinking: While transportation often reduces counterarguing, if the personal story contains inconsistencies or elements that strongly contradict the audience’s existing knowledge or beliefs, high transportation might not entirely suppress critical thinking. Instead, these inconsistencies might disrupt transportation and reduce the story’s impact.

In summary, while Narrative Transportation is a powerful tool for crafting impactful personal stories, storytellers need to be mindful of the potential for distraction through details, the risk of appearing manipulative, ethical implications, the possibility of the story overshadowing the message, and individual differences in audience responses. Carefully balancing the elements that foster transportation with a clear focus on the intended message and maintaining authenticity are key to mitigating these potential drawbacks.

(7) Can you identify any examples or case studies of successful personal storytelling within these documents that illustrate the principles of Narrative Transportation?

Based on the uploaded documents, there aren’t explicit, detailed examples or case studies of successful personal storytelling that are thoroughly analyzed to illustrate the specific principles of Narrative Transportation we have discussed. While the sources touch upon the power of narratives and their persuasive effects, the emphasis is often on broader categories of stories, such as public narratives, fictional narratives used in research, or the use of autobiographical storytelling in therapeutic contexts without specific examples dissected for their Narrative Transportation elements.

However, we can infer some potential examples and connect them to the principles based on the information provided:

  • The “Shopping Trolley Soldier” (John Mulligan): This is referred to in one source as a dramatic example of someone benefiting from writing therapy. While the specifics of Mulligan’s story and its telling are not detailed, its effectiveness in resolving emotional issues suggests that it likely employed several principles of Narrative Transportation. For example, to have a “dramatic” impact, the story likely had a degree of narrative quality and coherence, presented events in a way that evoked emotional engagement, and perhaps allowed for some level of identification or psychological realism for the reader to connect with Mulligan’s experiences. However, without more details, this remains an inferred example.
  • The use of autobiographical stories in therapy: One text outlines how clients can be asked to choose significant autobiographical stories and clarify them through writing and discussion. The potential for these stories to be “helpful” and for clients to gain insights about their lives by reflecting on their own narratives suggests that when these stories are shared (even if primarily with a therapist), they can be impactful. For Narrative Transportation to occur in this context, the client’s ability to articulate their experiences with coherence, including vivid details of memories, and to convey the emotional significance of these events would likely be crucial. The therapist’s questions to help clarify the story could also enhance these elements. Again, no specific story is provided as a case study.
  • Stories used in health communication research: While not strictly “personal” storytelling in the sense of an individual sharing their own life story, the reference to using storytelling strategies in health messages and the example of a video story about cervical cancer screening having positive effects implies that well-crafted narratives, even if based on others’ experiences or fictionalized for a purpose, can be impactful. These stories likely leverage principles like narrative quality, emotional engagement (perhaps through relatable characters), and the potential for identification to achieve their persuasive goals.

In summary, while the documents highlight the importance and effects of narratives, they do not offer specific, detailed examples of successful personal storytelling that are then explicitly analyzed to showcase how principles of Narrative Transportation were employed. We can infer that instances of impactful storytelling, whether in therapy or health communication, likely benefited from these principles, but the sources do not provide the detailed case studies requested.

(8) How do the different terms provided (Narrative Transportation Theory, Narrative Theory, Transportation Theory, and Narratology) relate to or differ from each other within the context of these documents?

Based on the provided documents and our conversation history, here’s how the different terms relate to or differ from each other:

  • Narrative Theory is the broadest term, encompassing a wide range of approaches to texts that are considered to be, partially or wholly, narrative. It is described as a set of approaches to understand, analyze, and evaluate narratives across various forms, including texts, images, spectacles, and events. Narrative theory draws upon various disciplines like linguistics, semiotics, psychology, and cultural studies to study the forms, functions, and effects of narrative.
  • Narratology is a specific branch or discipline within Narrative Theory. It is often referred to as the theory or science of narrative. Narratology aims to develop systematic frameworks and tools for analyzing narrative structures, elements (like story and discourse), and categories. The sources indicate that narratology has its roots in structuralism and has evolved into various schools of thought and approaches, including classical and postclassical narratologies. Mieke Bal defines narratology as “the ensemble of theories of narratives, narrative texts, images, spectacles, events; cultural artifacts that ‘tell a story’”.
  • Transportation Theory (also referred to as Narrative Transportation Theory) is a psychological theory that focuses specifically on the experience of being immersed or “transported” into a narrative world. This theory examines the cognitive and emotional processes involved when individuals become highly engaged with a story, experiencing a sense of “going into” the narrative world. Key aspects of transportation include focused attention, imagery, and emotional engagement.
  • The relationship between Narrative Theory and Narratology is that narratology is a core component of narrative theory. While narrative theory is the overarching field of study, narratology provides many of the analytical tools, concepts, and frameworks used within this field to understand how narratives function.
  • Narrative Transportation Theory is a specific theory within the broader field of Narrative Theory, but it comes primarily from psychology and communication studies rather than literary studies, which have historically been central to narratology. While narratology might analyze the structural features of a story, Narrative Transportation Theory focuses on the psychological impact of engaging with that story, particularly its ability to transport the audience and potentially lead to persuasion.
  • Transportation Theory and Narrative Transportation Theory appear to be used interchangeably in the provided sources. Both terms refer to the same psychological construct of being immersed in a narrative.

In simpler terms:

  • Think of Narrative Theory as the entire field of studying stories.
  • Narratology is a key set of theories and methods within narrative theory, focusing on the structure and workings of narratives.
  • Narrative Transportation Theory (or Transportation Theory) is a specific psychological theory within narrative theory that explains how people become immersed in stories and the effects of this immersion, such as persuasion.

Therefore, while all these terms relate to the study of narrative, they operate at different levels of generality and come from different disciplinary perspectives. Narratology offers tools for analyzing narratives, while Narrative Transportation Theory explains a particular psychological experience of engaging with narratives.


Understanding Narrative Transportation Using Manus AI

In my last post I talked about Embracing Narrative Transportation as a way to enhance the impact of your personal stories. While I enjoy writing articles about storytelling, I’ve also been exploring AI (Artificial Intelligence) as a way to broaden my research capabilities.

I have to say, it’s amazing to watch these chatbots in action. They are fast and comprehensive in their reach, and they can locate and analyze information faster than any human. But they don’t replace us, as they lack the lived human experience that we bring to our stories.

So I decided to see what some of the latest chatbots had to say about Narrative Transportation as it relates to crafting impactful personal stories. In part one, I asked Manus AI what it thought. Below is the prompt that I used, its initial response, and the output it generated. Fair warning: it’s a long article, but if you have an interest in AI, it’s an interesting read.

Prompt:

I’m trying to understand the topic of Narrative Transportation in depth, as it relates to storytelling in general, but more specifically, how we can use Narrative Transportation to create impactful personal narratives.

Response:

I’ll help you understand Narrative Transportation in depth, both in general storytelling and specifically for creating impactful personal narratives. I’ll research this topic thoroughly and provide you with comprehensive information. Let me get started right away.

Output:

Here’s what Manus AI had to say:

Understanding Narrative Transportation: Creating Impactful Personal Narratives

Introduction

Narrative transportation is a powerful psychological phenomenon that occurs when individuals become completely immersed in a story. This comprehensive guide explores the concept of narrative transportation, its application in storytelling, and specifically how it can be leveraged to create impactful personal narratives. By understanding the mechanisms of narrative transportation and implementing specific techniques, you can craft personal narratives that deeply engage audiences, foster connections, and potentially transform beliefs and attitudes.

Part I: Understanding Narrative Transportation

The Concept of Narrative Transportation

Narrative transportation refers to the psychological experience of being completely immersed in a story. When a person is transported into a narrative, they become absorbed cognitively, emotionally, and in the imagery of the story, regardless of the medium (print, film, podcasts, video games, etc.).

The concept was first formally defined by Richard Gerrig in his 1993 book “Experiencing Narrative Worlds,” where he used travel as a metaphor for reading. He described it as a state in which the reader—referred to as the traveler—becomes detached from their original reality due to deep engagement with the story.

Melanie Green and Timothy Brock later expanded on this idea, describing narrative transportation as “a distinct mental process, an integrative melding of attention, imagery, and feelings.” They conceptualized it as the experience of being carried away by a narrative.

Key Features of Narrative Transportation

  1. Focused Attention – Receivers become so focused on the story they do not think about distractions in their environment
  2. Emotional Engagement – There is a strong emotional response to the story with empathy for the characters
  3. Mental Imagery – Receivers have a mental image of the story in their heads or try to predict what might happen in the future of the story
  4. Cognitive Detachment from Reality – Receivers feel like they are really in the story and not in their real-world environment
  5. Belief and Attitude Shift – Narrative transportation can influence receivers’ beliefs, attitudes, and behaviors, aligning them with the story’s content
  6. Reduced Counterarguing – When transported, the receiver is less likely to challenge the story’s content
  7. Long-term Efficacy – The receiver is impacted even after the story ends, affecting long-term behavior and attitude change

Psychological Mechanisms of Narrative Transportation

Reduced Counterarguing

Transporting stories create a more open, accepting mindset for individuals. When readers are transported in an entertaining narrative, they may not want to interrupt their enjoyment to argue with the story. This reduced counterarguing can occur even when the persuasive intent is obvious.

Connections with Characters

Audience members may:

  • See themselves in story characters
  • Come to see characters as friends
  • Admire the characters

Through these connections, audiences may change their attitudes and beliefs to align with those of narrative characters. Identification with and liking of story characters has been shown to increase the adoption of beliefs advocated by the character.

Mental Imagery

The transportation-imagery model highlights the role of visual imagery in transportation-based belief change. The experience of being transported into a story links vivid images with beliefs implied by the story. This connection between images and beliefs may be one basis for the power of narrative persuasion.

Emotional Engagement

Stories are particularly powerful when they evoke strong emotions. The pattern of emotional experience that occurs as readers progress through a story (emotional flow) can help maintain interest, particularly for longer narratives.

Reminding Readers of Personal Experiences

Stories can prompt self-examination and change by reminding readers of experiences in their own lives that relate to those in the narrative. Individuals who are more immersed in a story show greater generalization of the beliefs implied by the story, but additionally, whether the story brings to mind events from the readers’ real lives seems to be important in determining narrative impact.

Factors Affecting Narrative Transportation

Audience Factors:

  • World view (e.g., pro-environmental)
  • Prior knowledge and experience
  • Attitudes and beliefs
  • Empathy
  • Political ideology
  • Religiosity
  • Transportability (individual tendency to become immersed)
  • Need for affect (enjoyment of emotional experiences)
  • Need for cognition (enjoyment of thinking)

Story Factors:

  • Fiction vs nonfiction
  • First-person vs third-person
  • Well-developed characters
  • Intent of character
  • Personally relevant stories
  • Stories that stay engaging
  • Coherence and structure
  • Emotional intensity
  • Suspense and dramatic tension
  • Psychological realism
  • Cultural appropriateness

Part II: Narrative Transportation in Personal Narratives

Unique Aspects of Personal Narratives

Personal narratives differ from fictional narratives in several key ways when it comes to narrative transportation:

Authenticity and Vulnerability

  1. Authenticity as a Bridge: In personal narratives, authenticity serves as a bridge between the storyteller’s world and the audience’s. When authors share their genuine experiences, emotions, and vulnerabilities, they create a unique connection that can be more immediate than in fictional narratives.
  2. Vulnerability Creates Intimacy: Personal narratives that expose vulnerability create a level of intimacy that fictional narratives may struggle to achieve. Research by Hagmann et al. (2024) found that “self-revealing personal narratives create vulnerability by exposing the speaker to others’ negative judgment.” This vulnerability signals authenticity and honesty to listeners, increasing trust and connection.
  3. Real-life Consequences: Unlike fictional narratives, personal narratives carry real-life consequences for the storyteller, which can heighten the audience’s emotional investment. The willingness to incur potential interpersonal costs through self-disclosure signals that the speaker values honesty.

Identification and Social Identity

  1. Shared Experience Identification: In personal narratives, identification often occurs through shared experiences rather than character traits. Research on patient identification in cancer narratives found that shared experiences can create powerful connections regardless of demographic differences.
  2. Social Identity Influence: Personal narratives can activate social identities that influence transportation. According to social identity theory, perceived risk to a patient’s identity increases identification with salient social groups who possess a similar identity.
  3. Transportability Across Divides: Personal narratives have shown the ability to build trust across ideological divides. Research has found that people judge ideological opponents as more trustworthy when opposing opinions are expressed through a self-revealing personal narrative than through either data or stories about third parties.

Mechanisms of Narrative Transportation in Personal Narratives

Personalization and Intimacy

  1. Personalization Effect: Research suggests that personal narratives create a sense of “personalness” that drives effectiveness. Personalness is defined as “the level of intimacy delivered when narrating an experience, or the capability of eliciting a sense of intimacy with the ‘other’.”
  2. Mental Representation: This sense of intimacy motivates engagement with the content and enables the mental representation of the narrative (narrative transportation), consequently increasing the likelihood of accepting the recommendation or message.

Self-Related Control Beliefs

  1. Empowerment Through Transportation: Research has found that “narratives in both written text and video form with protagonists displaying high versus low self-efficacy can, at least temporarily, affect recipients’ own self-related control beliefs when they experience strong transportation into the stories.”
  2. Mediation Through Transportation and Identification: The effect of narrative manipulation on self-related control beliefs was mediated by experienced transportation and identification, suggesting that personal narratives that facilitate transportation can influence how readers view their own capabilities.

Reduced Counterarguing in Personal Contexts

  1. Disarming Effect: Personal narratives may reduce counterarguing more effectively than fictional narratives because questioning someone’s lived experience can seem inappropriate or insensitive.
  2. Trust Building: Research has found that “trust does not suffer when explanations grounded in self-revealing personal narratives are augmented with data, suggesting that our results are not driven by quantitative aversion.”

Impact of Personal Narratives

Building Trust and Changing Beliefs

  1. Trust Across Differences: Personal narratives have shown the ability to build trust across ideological divides. People are more willing to collaborate with ideological opponents who support their views by embedding data in a self-revealing personal narrative, rather than relying on data-only explanations.
  2. Changing Health Behaviors: Research has found that “identification with the main character in the vignettes was a significant predictor of intentions to participate in cancer research, but only when the mediating role of narrative transportation was considered.”
  3. Self-Concept Changes: Personal narratives can influence how readers view themselves. Research has demonstrated that stories featuring strong protagonists who display high self-efficacy can increase recipients’ own self-related control beliefs when they are highly transported into the stories.

Profound Connections with Audiences

  1. Solace and Belonging: Stories that expose the raw, unfiltered emotions of their creators have the power to resonate with readers in a way that is impossible to replicate. In the storyteller’s struggles and triumphs, we recognize a reflection of ourselves, and it’s in those moments that we find solace, encouragement, and a sense of belonging.
  2. Shared Human Experience: Personal narratives remind us that we are not alone in our experiences. They create a sense of shared humanity that can be particularly powerful for individuals facing challenges or difficult situations.

Part III: Techniques for Creating Impactful Personal Narratives

Structural Techniques

1. Clear Narrative Structure

A well-structured personal narrative guides readers through your emotional journey with ease. The basic structure includes:

  • Beginning: Establish your core idea and hook your readers
  • Middle: Develop your central idea using descriptive language to define your story, setting, characters, and plot
  • End: Summarize lessons learned, relate your core theme back to readers’ lives, and include a call to action

The ABDCE storytelling framework (from Anne Lamott’s book “Bird by Bird”) provides an excellent structure:

  • Action: Drop the reader right into the action. No need to “set the scene.”
  • Background: Once you have the reader’s attention, tell them why it matters.
  • Development: Build the story up. Where is the story going? What’s at risk?
  • Climax: This is the tipping point. The “aha!” The “oh no!” The “SURPRISE!”
  • Ending: Explain what it was all for. What was the lesson?

2. Start with a Strong Hook

Begin your personal narrative with something that immediately captures attention. This could be:

  • A surprising statement
  • A vivid scene
  • A provocative question
  • A moment of tension

The hook should relate to the core message of your narrative and give readers a reason to keep reading.

3. Focus on Transformation

Personal narratives are most impactful when they demonstrate transformation. Structure your narrative around:

  • Who you were before
  • The challenge or catalyst for change
  • The process of transformation
  • Who you became as a result
  • What this means for the reader

Emotional Connection Techniques

1. Vulnerability and Authenticity

The most powerful element in a personal narrative is the emotional connection you establish with readers. To create this:

  • Be honest and open about how experiences have impacted you
  • Include all relevant details, even embarrassing or painful ones
  • Share your feelings as well as your pain
  • Never be afraid to show vulnerability—this gives readers permission to accept their own
  • Write from a foundation of authenticity rather than trying to persuade or sell ideas

2. Descriptive Language and Sensory Details

Use descriptive language to paint a vivid picture in your reader’s mind:

  • Help readers visually experience colors and shapes
  • Create sensations of temperature, texture, and physical feelings
  • Incorporate scents, tastes, and sounds
  • Use all five senses to connect with readers’ emotions
  • Don’t limit your narrative to a sequence of causes and effects

3. Emotional Flow

The pattern of emotional experience throughout your narrative helps maintain interest:

  • Create and then resolve suspense about outcomes
  • Include changes in emotional tone across the course of the story
  • Balance difficult emotions with moments of insight or relief
  • Build emotional intensity toward key realizations
  • End with emotional resolution that feels satisfying

Connection with Readers

1. Make the Reader the Main Character

Your story isn’t really about you—the reader is the main character. To achieve this:

  • Have clarity around why your personal story is interesting to the READER
  • Connect your perspective with universal truths
  • Ask yourself: What’s the problem you want the reader to solve?
  • Consider: What’s the realization you want the reader to have?
  • Focus on: What’s the obstacle you want to help the reader overcome?

2. Identify the Takeaway

The takeaway explains why your story is important and creates a deeper emotional tie with the reader:

  • Connect your experience to something bigger
  • Identify the universal truth at play
  • Clarify what your experience says about you, life, or the world
  • Determine if your story is about frustration, injustice, loyalty, hope, etc.
  • Make the connection between your story and the message explicit

3. Relevance to Reader’s Life

Make your personal narrative relevant to readers’ lives:

  • Give readers a “heads up” so they can avoid pain you experienced
  • Provide relief if they’re currently going through a similar experience
  • Show how your lessons apply to their situations
  • Create points of similarity between your experience and theirs
  • Focus on shared human experiences that transcend specific circumstances

Finding Story-Worthy Material

1. Daily Reflection Practice

Use the “Homework for Life” framework (from Matthew Dicks’ book “Storyworthy”; a simple digital version of the log is sketched after this list):

  • Take five minutes daily to reflect on your day
  • Ask: If you had to tell a story from today, what would it be?
  • Write one sentence about it
  • Look for difficult times, moments of growth, and proud or awkward experiences
  • Create two columns: date and story
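
If you’d rather keep this log in a file than on paper, below is a minimal sketch of the two-column practice as a small Python script. This is my own illustration, not something Dicks prescribes; the filename, prompt wording, and CSV format are all hypothetical choices.

```python
# homework_for_life.py -- a minimal sketch (not from "Storyworthy" itself)
# of the two-column "Homework for Life" log: one row per day, one sentence per row.
import csv
from datetime import date
from pathlib import Path

LOG = Path("homework_for_life.csv")  # hypothetical filename

def add_entry(story: str) -> None:
    """Append today's date and a one-sentence story to the log."""
    is_new = not LOG.exists()
    with LOG.open("a", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        if is_new:
            writer.writerow(["date", "story"])  # the two columns described above
        writer.writerow([date.today().isoformat(), story.strip()])

if __name__ == "__main__":
    add_entry(input("If you had to tell a story from today, what would it be? "))
```

Run once a day, the CSV becomes the running two-column list of story-worthy moments that the framework describes.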

2. Identify Emotional Significance

Pay attention to events that leave a strong impression:

  • What moments do you replay in your mind?
  • What makes you uncomfortable?
  • What reveals your vulnerability?
  • What forces you to confess a character flaw?
  • What challenges your core beliefs about yourself?

3. Choose Meaningful Experiences

Select experiences that have been transformational:

  • Events that caused you to rethink or reconsider your point of view
  • Circumstances that resulted in personal or professional growth
  • Painful situations that forced you to change your beliefs
  • Unique life experiences that taught you important lessons
  • Moments where you overcame adversity

Refinement Techniques

1. Editing and Revision

After completing your first draft:

  • Edit for grammar, spelling, and punctuation errors
  • Add any extra details you may have forgotten
  • Remove irrelevant information that doesn’t serve the core message
  • Ensure transitions between sections are smooth
  • Check that your narrative has a clear beginning, middle, and end

2. Seek Feedback

Get perspective from others:

  • Ask friends or family members to read your narrative
  • Be clear about what kind of feedback you’re looking for
  • Listen to their responses without taking criticism personally
  • Consider whether readers connect with your story as intended
  • Revise based on feedback to improve clarity and impact

3. Maintain an Enjoyable Process

While writing a personal narrative can be emotionally challenging:

  • Have fun with the process
  • Relax and tell your story naturally
  • Remember that the more relaxed you are, the more authentic your narrative will feel
  • View the process as potentially therapeutic and liberating
  • Recognize that sharing your story can benefit both you and your readers

Part IV: Practical Applications of Personal Narratives

Therapeutic and Personal Growth

Personal narratives can be used for:

  • Processing difficult experiences
  • Gaining clarity about life events
  • Discovering patterns in your behavior or thinking
  • Building self-awareness
  • Creating meaning from challenging circumstances

Professional and Leadership Contexts

Personal narratives are valuable in:

  • Building trust with teams or clients
  • Demonstrating values in action
  • Illustrating problem-solving approaches
  • Creating connection in professional relationships
  • Establishing credibility through lived experience

Educational and Inspirational Purposes

Share personal narratives to:

  • Teach important life lessons
  • Inspire others facing similar challenges
  • Provide practical guidance based on experience
  • Create empathy for different perspectives
  • Motivate positive change in others

Part V: A Framework for Applying Narrative Transportation in Personal Narratives

Step 1: Understand Your Purpose

Before crafting your personal narrative, clarify:

  • What you want to achieve with your story
  • Who your audience is
  • What change you hope to inspire
  • What beliefs or attitudes you want to influence
  • What emotional response you want to evoke

Step 2: Select the Right Experience

Choose an experience that:

  • Has transformed you in some way
  • Contains universal themes or lessons
  • Evokes strong emotions
  • Demonstrates vulnerability and authenticity
  • Has clear stakes and consequences

Step 3: Structure for Maximum Impact

Organize your narrative to:

  • Begin with a compelling hook
  • Establish context and background
  • Build tension through challenges and obstacles
  • Reach a climactic moment of realization or change
  • Resolve with reflection and meaning-making

Step 4: Enhance Transportation Elements

Deliberately incorporate elements that facilitate transportation:

  • Rich sensory details to create mental imagery
  • Emotional authenticity to create connection
  • Vulnerability to build trust
  • Universal themes to create relevance
  • Transformation to inspire hope

Step 5: Connect to the Reader

Make your narrative relevant by:

  • Explicitly stating the takeaway or lesson
  • Connecting your experience to broader human experiences
  • Addressing how your story might help the reader
  • Creating points of similarity between your journey and theirs
  • Inviting the reader to reflect on their own experiences

Step 6: Refine and Test

Improve your narrative by:

  • Getting feedback from trusted readers
  • Assessing whether it creates the intended emotional response
  • Checking if the core message is clear
  • Ensuring the narrative flows smoothly
  • Confirming that it feels authentic and honest

Conclusion

Narrative transportation offers a powerful framework for understanding how stories captivate us and influence our beliefs, attitudes, and behaviors. By applying the principles of narrative transportation to personal narratives, you can create stories that not only engage readers but potentially transform them.

Personal narratives have unique strengths in creating transportation through authenticity, vulnerability, and shared human experience. When crafted with intention and skill, they can build trust, foster connection, and inspire change in ways that other forms of communication cannot.

The techniques outlined in this guide provide practical approaches to enhancing narrative transportation in your personal stories. By focusing on structure, emotional connection, reader relevance, and meaningful content, you can create personal narratives that transport readers into your experiences and leave lasting impressions.

Whether you’re writing for personal growth, professional development, or to inspire others, understanding narrative transportation will help you craft more impactful personal narratives that resonate deeply with your audience.

References

  • Gerrig, R. J. (1993). Experiencing narrative worlds: On the psychological activities of reading.
  • Green, M. C., & Brock, T. C. (2000). The role of transportation in the persuasiveness of public narratives.
  • Hagmann, D., Ho, E. H., & Loewenstein, G. (2024). Personal narratives build trust.
  • Isberner, M. B., Richter, T., Schreiner, C., Eisenbach, Y., Sommer, C., & Appel, M. (2020). Empowering stories: Transportation into narratives with strong protagonists increases self-related control beliefs.
  • Lamott, A. (1994). Bird by Bird: Some Instructions on Writing and Life.
  • Neil, J. M., Gough, A., Kee, F., George, T. J., Pufahl, J., & Krieger, J. L. (2020). The influence of patient identification and narrative transportation on intentions to participate in cancer research.
  • Strange, J. J., & Leung, C. C. (1999). How anecdotal accounts in news and in fiction can influence judgments of a social problem’s urgency, causes, and cures.
  • Valenzuela, A., & Galli, L. (2024). The personalization effect: How personal narratives drive electronic word-of-mouth effectiveness.
  • Van Laer, T., De Ruyter, K., Visconti, L. M., & Wetzels, M. (2014). The extended transportation-imagery model: A meta-analysis of the antecedents and consequences of consumers’ narrative transportation.


Santa Fe Institute – Nature of Intelligence – Complexity Wrap Up

Hopefully you have been along for the ride and have listened to all six episodes. It’s been a lot to digest and a lot to think about. While the field of neuroscience has made great strides, there’s still so much to learn when it comes to human intelligence, which is why I’ve appreciated this podcast.

And now we have AI entering the picture. Will it augment our IQ, or surpass us, to our detriment? It’s a mystery. There’s so much upside, yet there’s a dark side to how AI can be used by bad actors operating behind the scenes.

Since a key thread of this series is an exploration of AI, I asked Google’s NotebookLM to summarize the key points that were explored over the series. Does this synopsis align with your impressions? Here’s the cast of characters:

  • Melanie Mitchell (host) – Professor at the Santa Fe Institute working on artificial intelligence and cognitive science. In the final episode, she is interviewed about her background, views on AI, AGI, and the future of the field.
  • Abha Eli Phoboo (host) – Abha is a writer and an obsessive rewriter. Interested in the arts and sciences, she explores the weak interaction between the two. A CERN Press Officer, she translates physics into English and helps scientists communicate their research to the world.
  • Alison Gopnik – Professor of psychology and philosophy, member of the Berkeley AI Research group, external professor with the Santa Fe Institute, who studies how children learn.
  • John Krakauer – Professor of neurology, neuroscience, physical medicine, and rehabilitation at Johns Hopkins University School of Medicine, who researches intelligence and physical movement in animals, machines, and humans.
  • Ev Fedorenko – Featured in the second episode discussing the relationship between language and thought. Her work includes using fMRI brain scans to examine the relationship between language and other forms of cognition.
  • Steve Piantadosi – Featured in the second episode discussing the relationship between language and thought. He provides examples of how language can make learning more efficient.
  • Gary Lupyan – Featured in the second episode discussing the relationship between language and thought. He believes language is one of the major reasons for human intelligence, potentially more of a cause than a result.
  • Murray Shanahan – Professor of cognitive robotics at Imperial College London and principal research scientist at Google DeepMind.
  • Tomer Ullman – Psychologist at Harvard University studying computation, cognition, and development.
  • Linda Smith – Chancellor’s Professor of Psychological and Brain Sciences at Indiana University, a developmental psychologist and pioneer of head-mounted camera research with infants.
  • Mike Frank – Professor of psychology at Stanford, who studies how children learn and uses large datasets and new methodologies.
  • Erica Cartmill – Professor of cognitive science, animal behavior, anthropology, and psychology at Indiana University, who studies cognition and communication across a wide range of species, including great apes and human children.
  • Ellie Pavlick – Discusses how we assess intelligence, particularly in machines, and the challenges of applying human tests to AI. She also talks about the difficulty of understanding how LLMs work internally.

Santa Fe Institute Complexity Podcast

AI Summary via NotebookLM

This podcast series explores the complex question: What is intelligence? It highlights that defining intelligence is difficult and that there is no single, simple definition; it’s more like a “suitcase word” packed with various capabilities. The series draws on insights from cognitive scientists, child development specialists, animal researchers, and AI experts.

Human intelligence involves many facets. It includes learning about cause and effect by experimenting and interacting with the world. Humans are good at generalizing knowledge and making analogies, applying what they learn in one situation to new ones without needing vast amounts of retraining. Common sense, which relies on innate understandings of the physical world and flexibility in thinking, is also crucial.

Language is seen as a backbone of human culture and a powerful tool for sharing information and ideas, enabling us to learn without direct experience and understand abstract concepts. There is debate, however, on whether language is a cause or a result of human intelligence, and whether language and thought are fundamentally separate or intertwined. Some evidence suggests they can be separate, at least in adults. Human intelligence also relies heavily on our social nature, drive to collaborate, and the unique role of caregiving in development.

Large Language Models (LLMs) like ChatGPT are a focus of the series. These systems are trained on enormous amounts of human-generated text data from the internet. They work by finding statistical correlations in language and predicting the most likely next word or “token”. While LLMs can produce sophisticated and sometimes creative language, there are significant differences compared to human intelligence.
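
To make “predicting the most likely next word” concrete, here is a toy sketch in Python. It is an illustration of the statistical idea only, not how production LLMs work: real systems use neural networks over subword tokens trained on vastly more data, and every name below is made up for the example.

```python
# Toy next-word predictor built from bigram counts.
# Illustrative only: real LLMs learn statistical structure with neural
# networks over subword tokens, not a simple lookup table like this.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat and the cat slept".split()

# Count how often each word follows each other word.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word: str) -> str:
    """Return the most frequent next word observed after `word`."""
    candidates = following.get(word)
    if not candidates:
        return "<unknown>"
    return candidates.most_common(1)[0][0]

print(predict_next("the"))  # prints 'cat', since "cat" follows "the" most often
```

The same predict-the-next-token loop, scaled up enormously and made probabilistic, is the core mechanic the series keeps returning to.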

LLMs learn passively from data, unlike humans who learn actively through interaction with the world. They lack an inherent drive to explore or understand the world. There is debate on whether LLMs truly “understand” language in a meaningful sense or simply know how to use words based on patterns. They also cannot engage with the world to update “beliefs” and sometimes make things up, a behavior called “hallucinating”.

Assessing the intelligence of LLMs is challenging. Applying tests designed for humans, like the SAT, might not mean the same thing for a machine. Some researchers suggest LLMs might be learning how to pass the test rather than exhibiting general reasoning ability. Understanding how LLMs actually work internally (“mechanistic understanding”) is seen as crucial but is still a nascent area of research. Some propose thinking of LLMs as sophisticated “role-players” rather than entities with beliefs or consciousness. LLMs might also be better understood as reflecting collective knowledge rather than a single agent’s intelligence.

The concept of Artificial General Intelligence (AGI), often described as human-level intelligence in machines, is discussed, but its definition remains vague and debated. The current path to building powerful AI is seen by some as unsustainable due to the immense data and energy requirements, suggesting that future AI might need to be more “embodied” and learn more like humans or animals.

Beyond theoretical fears, the series highlights real, present risks of AI, including the spread of deepfakes and disinformation, which can erode trust and make it harder to find reliable information online. The unauthorized use of human-generated data for training AI is also raised as an ethical concern.

Top Five Audience Takeaways

  1. Defining “Intelligence” is Surprisingly Difficult. Instead of being a simple, single thing we can measure, intelligence is like a “suitcase word” packed with many different abilities and ways of being. Researchers across various fields agree that there’s no easy, complete definition of what makes something intelligent, whether it’s a person, an animal, or a machine.
  2. Human Intelligence is Deeply Tied to Active Experience and Social Interaction. Humans don’t just passively absorb information; we learn by actively exploring the world, doing “little experiments,” and figuring out cause and effect. Our ability to generalize knowledge to new situations with limited examples is crucial. Furthermore, language, our drive to collaborate, and the unique role of caregiving are fundamental to how our intelligence develops and functions.
  3. Today’s Powerful AI, like ChatGPT (LLMs), Works Very Differently from Human Intelligence. These systems are trained on enormous amounts of text data from the internet, learning by finding statistical patterns and predicting the next word. Unlike humans, they learn passively, lack an inherent drive to explore the world, don’t have beliefs, and can sometimes “hallucinate” or make things up. While they can produce impressive language, there’s a significant debate about whether they truly “understand” in a human sense or are just very sophisticated at using patterns.
  4. Testing AI Intelligence Using Human Standards is Tricky. Applying tests designed for humans, like the SAT or theory-of-mind tasks, to LLMs might not accurately reflect their capabilities. LLMs might simply be learning how to pass the specific test through pattern matching from their vast training data, rather than exhibiting genuine reasoning or understanding. Understanding how these AI systems arrive at their answers – looking “under the hood” – is a crucial but difficult area of research. We also need to be mindful that our human-centric view can limit how we assess intelligence in other entities, including animals.
  5. Current AI Approaches Face Significant Challenges and Present Real Risks. The reliance on massive data and energy to build powerful AI systems may not be sustainable or efficient in the long run. Beyond theoretical fears about Artificial General Intelligence (AGI), there are immediate concerns like the spread of deepfakes and misinformation, which can erode trust and make finding reliable information difficult. There are also ethical questions about using vast amounts of human-generated data to train AI without permission or benefit to the creators. Some researchers suggest future AI development might need to take a different path, perhaps learning more like babies or animals, to be more sustainable and genuinely intelligent.


Nature of Intelligence – Episode Six – AI’s changing seasons

In this final episode of the Complexity podcast, Melanie Mitchell provides us with a bit of her backstory, how she became interested in the topic of AI, and the path she’s been on in the 35 years since she got her PhD. She shares the little-known fact that AI wasn’t always the hot topic it’s been in recent memory, having been through a few up and down cycles along the way.

The world of AI has gone through several cycles of huge optimism and people thinking that true AI is just around the corner, just a few years away. And then disappointment because the methods that AI is using at the time don’t actually turn out to be as promising as people thought. ~ Melanie Mitchell

When she mentions that “cognitive scientists have been trying to understand what human level intelligence is for a century now,” it’s a stark reminder that it doesn’t make sense to compare human intelligence to artificial intelligence if we’re not sure what’s going on in our own minds.

Intelligence, as we’ve seen throughout the podcast is not a well-defined sort of rigorously mathematically defined notion. It’s what Marvin Minsky, the AI pioneer, called a suitcase word. And by that he meant that it’s like a suitcase that’s packed full of a jumble of different things, some of which are related and some of which aren’t. ~ Melanie Mitchell

And there’s no single thing that intelligence is. It’s a whole bunch of different capabilities and ways of being that perhaps are not just one single thing that you could either have more of or less of, or get to the level of something. It’s just not that kind of simple thing. It’s much more of a complex notion. ~ Melanie Mitchell

The dark side of AI is also brought to light, with mention of deep fakes and voice cloning, alongside the perils of misinformation and disinformation. As for what’s on the horizon, a big worry is that AI impersonating humans will become commonplace. The bottom line: as AI gets more intelligent, there’s an upside and a downside.

Hopefully this podcast series gave you some insight as to how the story of our common humanity, and your own story, may unfold.

Transcript

Abha Eli Phoboo: From the Santa Fe Institute, this is Complexity

Melanie Mitchell: I’m Melanie Mitchell

Abha: And I’m Abha Eli Phoboo

Abha: Melanie, it’s so wonderful to be able to sit down and ask you questions this time. Could we maybe get started with, you know, how you got into the business of AI, could you maybe tell us a little bit about that?

Melanie: Yeah, so I majored in math in college. And after college, I worked as a math teacher in a high school in New York City. But while I was there, I didn’t really know what I wanted to do. I knew I didn’t want to teach forever. So I was reading a lot. And I happened to read a book called Gödel, Escher, Bach by Douglas Hofstadter.

And it was a book about, well, Gödel, the mathematician, Escher, the artist, and Bach, the composer, obviously. But it was really much more. It was about how intelligence can emerge from non-intelligent substrate, either in biological systems or perhaps in machines. And it was about the nature of thinking and consciousness. And it just grabbed me like nothing else ever had in my whole life. And I was just so excited about these ideas.

So I decided I wanted to go into AI, which is what Hofstadter himself was working on. So I contacted him. He was at Indiana University and I never heard back. In the meantime, I moved to Boston for a job there and was hanging around on the MIT campus and saw a poster advertising a talk by Douglas Hofstadter. I was so excited.

So I went to the talk and I tried to talk to him afterwards, but there was a huge crowd of people around him. His book was extremely famous and had a big cult following. So then I tried to call him at his office. He was on sabbatical at MIT, it turned out, and left messages and never heard back. So finally I figured out he’s never at his office during the day, so he must be there at night.

So I tried to call him at 10 in the evening and he answered the phone and was in a very good mood and very friendly and invited me to come talk to him. So I did and I ended up being an intern in his group and then going to graduate school to work with him. So that was the story of how I got to my PhD program.

It was actually at the University of Michigan, where he was moving to, and I worked with him for my PhD on how people make analogies and how a machine might be able to make analogies in a similar way.

Abha: That’s so interesting. I mean, you were very tenacious, you kept not giving up.

Melanie: Yeah, exactly. That was the key.

Abha: So when you graduated, I’ve heard you mention before that you were discouraged from mentioning AI in your job search. Could you maybe tell a little bit about what the world of AI was like at that point?

Melanie: Yeah, so the world of AI has gone through several cycles of huge optimism and people thinking that true AI is just around the corner, just a few years away. And then disappointment because the methods that AI is using at the time don’t actually turn out to be as promising as people thought.

And so these are called sort of the AI springs and AI winters. And in 1990, when I got my PhD, AI was in the winter phase. I was advised not to use the term artificial intelligence on my job applications. I was advised to use something more like intelligent systems or machine learning or something like that, but the term AI itself was not looked well upon.

Abha: So what do you think now of the fact that the Nobel Prize just recently went to people working in AI? The one for physics went to John Hopfield and Geoffrey Hinton for their work in machine learning. And then Demis Hassabis for chemistry. What do you think of that?

Melanie: Well, obviously we’re in an AI spring or summer right now and the field is very hot and people are again predicting that we’re going to have, you know, general human-level machine intelligence any day now. I think it’s really interesting that the Nobel prizes this year were sort of, you know, the AI sweep.

There were a lot of people joking that ChatGPT would get the literature prize. But I was a little surprised at the physics prize, not so much at the chemistry prize. You know, the chemistry prize was for AlphaFold, which is a program from Google DeepMind, which is better than anything that ever came before in predicting protein structure. That was obviously a huge, huge success and incredible achievement.

So I think that was not surprising to me at all that the DeepMind people got that award. The physics award, you know, Hopfield is a physicist and the work that he did on what are now called Hopfield networks was very inspired by physics. Hinton I was a little more confused about, just because I didn’t really see the physics connection so much. I think it is just more the impact that machine learning is having on physics. And machine learning today is all about neural networks, and Hinton was obviously a big pioneer in that field. So I think that’s the thinking behind that. But I know a lot of physicists who have grumbled that that’s not physics.

Abha: Yes, it’s been very interesting to see that debate in the physics community. You and I, you know, we’ve talked to so many researchers over the course of the season, and I wanted to ask if there was something you were hoping to learn when we first started building this podcast together?

Melanie: Well, I think one reason I was excited to do this podcast was because I wanted to talk to people, not just in AI, but also in cognitive science. The voices of cognitive science and AI haven’t been given as much sort of airtime as people who are at big AI companies or big AI labs. I think that they’ve been missing a key element, which is, what is this thing we’re calling intelligence?

What is the goal of something like general AI or AGI? What’s the thing we’re trying to get to when we talk about human level intelligence? Cognitive scientists have been trying to understand what human level intelligence is for a century now. The ideas that these people have about intelligence seem to be very different from those of people sort of leading the pack in the AGI world. So I think that’s an interesting contrast.

Abha: I agree. I think I learned a lot too. And John Krakauer, one of the first guests we had in the first episode of the season, you and he are currently going through a three-year discussion project to understand the nature of intelligence. And I’m curious about what you’ve learned. I know you had your first meeting. So, what did you learn in that first meeting, and why do you think it is so important to run this exercise for a number of years, not just a couple of sessions that end in a month or two?

Melanie: Well, I think there are several aspects to this. So John Krakauer and I have been talking for years about intelligence and AI and learning, and we finally decided that we should really have a set of very focused workshops that include people from all these different fields, similar to this podcast, about the nature of intelligence. AI and machine learning, it’s a very fast moving field.

You hear about new progress every day. There’s many, many new papers that are published or submitted to preprint servers. And it’s just overwhelming. It’s very fast. But there’s not a lot of more slow thinking, more long-term, more in-depth thinking about what it is that we’re actually trying to do here. What is this thing called intelligence? And what are its implications, especially if we imbue machines with it?

So that’s what we decided we would do, kind of slow thinking rather than the very fast research that is taking over the machine learning and AI fields. And that’s what, in some sense, SFI, the Santa Fe Institute, is really all about: trying to foster this kind of very in-depth thinking about difficult topics. And that’s one of the reasons we wanted to have it here at the Santa Fe Institute.

Abha: It almost seems counterintuitive to think of AI now in slower terms because the world of AI is moving at such speed and people are trying to figure out what it is. But going back to our original question in this podcast, what do we know about intelligence right now?

Melanie: Well, intelligence, as we’ve seen throughout the podcast is not a well-defined sort of rigorously mathematically defined notion. It’s what Marvin Minsky, the AI pioneer, called a suitcase word. And by that he meant that it’s like a suitcase that’s packed full of a jumble of different things, some of which are related and some of which aren’t.

And there’s no single thing that intelligence is. It’s a whole bunch of different capabilities and ways of being that perhaps are not just one single thing that you could either have more of or less of, or get to the level of something. It’s just not that kind of simple thing. It’s much more of a complex notion. There’s a lot of different hallmarks that people think of. For me, it’s generalization, the ability to generalize, to not just understand something specific, but to be able to take what you know and apply it in new situations without having to be retrained with vast numbers of examples.

So just as an example, AlphaGo, the program that is so good at playing Go. If you wanted to teach it to play a different game, it would have to be completely retrained. It really wouldn’t be able to use its knowledge of Go, or its knowledge of sort of game playing, to apply to a new kind of game. But we humans take our knowledge and we apply it to new situations. And that’s generalization, that’s to me one of the hallmarks of intelligence.

Abha: Right. I’d like to go into your research now, if you could tell us a little bit about the work you’ve done in conceptual abstraction, analogy making, and visual recognition in AI systems. The problems you’re working on right now, could you tell us a little bit about that?

Melanie: Sure. So I started my career working on analogy making. And when I got to Doug Hofstadter’s group, he was working on building a computer system that could make analogies in a very idealized domain, what he called letter string analogies. So I’ll give you one. If the string ABC changes to the string ABD, what did the string IJK change to?

Abha: IJL.

Melanie: Okay, very good. So you could have said, ABC changes to ABD, that means change the last letter to a D, and you would say IJD. Or you could have said, ABC changes to ABD, but there’s no Cs or Ds in IJK, so just leave it alone. But instead, you looked at a more abstract description. You said, okay, the last letter changed to its alphabetic successor.

That’s more abstract. That’s sort of ignoring the details of what the letters are and so on and applying that rule to a new situation, a new string. And so people are really good at this. You can make up thousands of these little letter string problems that do all kinds of transformations and people get the rules instantly.

But how do you get a machine to do that? How do you get a machine to perceive things more abstractly and apply what they perceive to some new situation? That’s sort of the key of analogy. And it turned out it’s quite difficult because machines don’t have the kind of abstraction abilities that we humans have. So that was back when I was first starting my PhD, that was back in the 1980s.

So that was a long time ago in AI years. But even now, we see that even the most advanced AI systems like ChatGPT still have trouble with these kinds of analogies, and there’s a new kind of idealized analogy benchmark that was recently developed called the Abstraction and Reasoning Corpus, which features more visual analogies, but similar to the ones that I just mentioned.

You have to try and figure out what the rule is and apply it to a new situation. And there’s no machine that’s able to do these anywhere near as well as people. The organizers of this benchmark have offered a prize, right now it’s at $600,000 for anybody who can write a program or build some kind of machine learning system that can get to the level of humans on these tasks. And that prize is still unclaimed.

Abha: I hope one of our listeners will work on it. It would be very cool to have that solved.

Melanie: We’ll put the information in the show notes.

Abha: So can you tell me now, how do you go about testing these abilities?

Melanie: So the key for the letter string analogies and also for the abstraction and reasoning corpus problems that’s abbreviated to ARC is to show a few demonstrations of a concept. So like when I said ABC changes to ABD, the concept is, change the rightmost letter to its successor.

Okay, and so I showed you an example and now say, here’s a new situation. Do the same thing. Do something analogous. And the issue is, I haven’t shown you millions of examples, I’ve just shown you one example or sometimes with these problems you can give two or three examples. That’s not something that machine learning is built to do. Machine learning is built to pick up patterns after seeing hundreds to millions to billions of examples, not just one to three examples. So this is what’s called few-shot learning or few-shot generalization.

The few-shot being you just get a few examples. And this is really the key to a lot of human intelligence, is being able to look at a few examples, and then figure out what’s going on and apply that to new kinds of situations. And this is something that machines still haven’t been able to do in any general way.
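
To make the letter-string task concrete, here is a minimal Python sketch. It is an editorial illustration only, not the research systems discussed above: the abstract rule (“change the rightmost letter to its alphabetic successor”) is hard-coded here, whereas the actual research challenge is discovering that rule from the single ABC-to-ABD demonstration.

```python
# One letter-string analogy rule, hard-coded for illustration:
# "change the rightmost letter to its alphabetic successor."
# The hard research problem is inferring this rule from one example,
# not applying it once you already have it.

def successor(ch: str) -> str:
    """Next letter in the alphabet, wrapping Z back around to A."""
    return chr((ord(ch) - ord("A") + 1) % 26 + ord("A"))

def apply_rule(s: str) -> str:
    """Apply the abstract rule to a new string."""
    return s[:-1] + successor(s[-1])

assert apply_rule("ABC") == "ABD"  # reproduces the demonstration pair
print(apply_rule("IJK"))           # prints 'IJL', the analogous answer
```

A human solver extracts the rule from one example; as Mitchell notes, machine learning systems typically need far more, which is exactly the few-shot gap she describes.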

Abha: So say, if a child sees a dog, right, of a certain kind, but then it sees a Dalmatian, which has different kinds of spots, they can still tell it’s a dog and not a cow, even though they’ve seen a cow with those kinds of patterns on their bodies before. So when you do that in machines, what do you actually find out? What have you found out in your testing of the ARC?

Melanie: We found out that machines are very bad at this kind of abstraction. We’ve tested both humans and machines on these problems. And humans tend to be quite good and are able to explain what the rule is they’ve learned and how they apply it to a new task. And machines are not good at figuring out what the rule is or how to apply a rule to a new task.

That’s what we found so far. Why can’t machines do this well? That’s a big question. And what do they need to do it well? That’s another big question that we’re trying to figure out. And there’s a lot of research on this. Obviously, people always love it when there’s a competition and a prize. So there’s a lot of people working on this. But I don’t think the problem has been solved in any general way yet.

Abha: I want to ask about another workshop you’ve done quite frequently, the understanding workshop, which actually came out of the barrier of meaning. If you could tell a little bit about what the idea of understanding there was, I thought that was fascinating. Could you maybe recount a little bit?

Melanie: Yeah, so, many decades ago, the mathematician Gian-Carlo Rota wrote an essay about AI. This was long before I was even in AI. And he asked: When will AI crash the barrier of meaning? And by that he meant that, for we humans, language and visual data and auditory data mean something to us. We seem to be able to abstract meaning from these inputs.

But his point was that machines don’t have this kind of meaning. They don’t live in the world, they don’t experience the world, and therefore they don’t get the kind of meaning that we get and he thought of this as a barrier, this is their barrier to general intelligence.

So we had a couple of workshops called AI and the barrier of meaning, because I kind of like that phrase, about what it would take for machines to understand, and what understand even means. And we heard from many different people in many different kinds of fields. And, it turns out, the word understand itself is another one of those suitcase words that I mentioned.

Words that can mean many different things to different people in different contexts. And so we’re still trying to nail down exactly what it is we want to mean when we say, do machines understand? And I don’t think we’ve come to any consensus yet, but it certainly seems that there are some features of understanding that people want machines to have and that are still missing: this idea of abstraction, this idea of being able to predict what’s gonna happen in the world, this idea of being able to explain oneself, explain one’s own thinking processes, and so on.

So understanding is still kind of this ill-defined word that we use to mean many different things and we have to really understand in some sense what we mean by understanding.

Abha: Right. Another question that you asked one of our guests, you posed it to Tomer and Murray. Some AI researchers are worried about what’s known as the alignment problem, as in, if we have an AI system that is told to, for example, fix global warming, what’s to stop it from deciding that humans are the problem and the best solution is to kill us all? What’s your take on this, and are you worried?

Melanie: Well, I find it… mysterious when people pose this kind of question, because often the way it’s posed is, imagine you had a super intelligent AI system, one that’s smarter than humans across the board, including in theory of mind and understanding other people and so on. Because it’s super intelligent, you give it some intractable problem like fix climate change.

And then it says, okay, humans are the source of the problem. Therefore, let’s kill all the humans. Well, this is a popular science fiction trope, right? We’ve seen this in different science fiction movies. But does it even make sense to say that something could be super intelligent across the board and yet try to solve a problem for humans in a way that it knows humans would not support?

So, there’s so much packed into that. There’s so many assumptions packed into that, that I really want to question a lot of the assumptions about whether intelligence could work that way. I mean, it’s possible. We’ve certainly seen machines do unintended things. Remember a while ago, there was the stock market flash crash, which was due to allowing machines to do trading and them doing very unintended things.

But the assumption that you could do that with a super intelligent machine, that you would be willing to hand over control of the world and say, go fix climate change, do whatever you want, here’s all the resources of the world to do it, and then have it lack that kind of understanding or, in some sense, common sense. It really seems strange to me.

So every time I talk about this with people who worry about this, they say things like, well, the machine doesn’t care what we want. It’s just going to try and maximize its reward. And its reward is, does it achieve its goal? And so it will try and create sub goals to achieve its reward. The sub-goal might be, kill all the humans, and it doesn’t care because it’s going to try and achieve its reward in any way possible.

I don’t think that’s how intelligence works or could work. And I guess it’s all speculation right now. And the question is, how likely is that to happen? And should we really put a whole lot of resources into preventing that kind of scenario? Or is that incredibly far-fetched, and should we put our resources into much more concrete and known risks of AI?

And this was a debate going on, for instance, just in California recently with a California Senate bill to regulate AI. And it was very much influenced by this notion of existential threat to humanity. And it was vetoed by the California governor, and one of the reasons was that the assumptions that it was based on, he felt, were too speculative.

Abha: What do you think are the real risks of the way we would function with AI if it continues flourishing in the world at the pace it is?

Melanie: Well, we’re already seeing all kinds of risks of AI happening right now. We have deep fakes in both visual and auditory modalities. We have voice cloning, AI voices that can convince you that they are actually a real person or even a real person that you personally know. And this has led to scams and spread of disinformation and all kinds of terrible consequences. And I think it’s just gonna get worse.

We’ve also seen that AI can flood the internet with what people are calling slop, which is just AI generated content that then things like Google search engine picks up on and returns as the answer to somebody’s search, even though it was generated by AI and it’s totally untrue. We see AI being used, for instance, to undress women in photographs.

You can take a photograph of a woman, run it through a particular AI system, and she comes out looking naked. And people are using this online. And it’s just lots and lots of current risks. You know, Daniel Dennett, the late philosopher, wrote an article very shortly before he died about the risks of artificial people.

The idea that AI impersonating humans and convincing other humans that it is human, and then people kind of believing it and trusting it and giving it the kind of agency it doesn’t have and shouldn’t have. These are the real risks of AI.

Abha: Is there any way to keep the quality of information at a certain standard, even with AI in the loop?

Melanie: I fear not. I really worry about this. The quality of information, for instance, online never has been great. It’s always been hard to know who to trust. One of the whole purposes of Google in the first place was to have a search algorithm that used methods that allowed us to trust the results.

This was the whole idea of what they called PageRank, trying to rank web pages in terms of how much we should trust their results, how good they were and how trustworthy they were. But that’s really fallen apart through the commercialization of the internet, I think, and also the motivation for spreading disinformation. But I think that it’s getting even worse with AI and I’m not sure how we can fix that, to be honest.
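
For readers curious about the PageRank idea mentioned above, here is a minimal sketch of the underlying computation on a made-up four-page web graph. It is only an illustration of the core mechanism; the production algorithm involved far more (crawling, dangling-page handling, and many later refinements).

```python
# Minimal PageRank via power iteration. A page's score is spread evenly
# over the pages it links to; the damping factor models a surfer who
# occasionally jumps to a random page. The graph below is invented.

links = {          # hypothetical web: page -> pages it links to
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}

damping = 0.85
pages = list(links)
rank = {p: 1.0 / len(pages) for p in pages}  # start with equal scores

for _ in range(50):  # iterate until the scores stabilize
    new_rank = {p: (1 - damping) / len(pages) for p in pages}
    for page, outlinks in links.items():
        share = damping * rank[page] / len(outlinks)
        for target in outlinks:
            new_rank[target] += share
    rank = new_rank

# "C" ends up ranked highest: three of the four pages link to it.
print(sorted(rank.items(), key=lambda kv: -kv[1]))
```

The intuition matches what Mitchell describes: a page earns trust to the extent that other trusted pages point to it.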

Abha: Let’s go back to the idea of intelligence. A lot of people talk about the importance of embodiment. Also, you know, our guests mentioned this to be able to function as intelligent beings in the world because of the input we receive and experiences we have. Why is it important to think of this as a factor?

Melanie: Well, the history of AI has been a history of disembodied intelligence. Even at the very beginning, the idea was that we could somehow sift off intelligence or rationality or any of these things and implement it in a computer. You could upload your intelligence into a computer without having any body or any direct interaction with the world.

So that has gone very far with today’s large language models, which don’t have direct interaction with the world except through conversing with people, and are clearly disembodied. But some people, I guess including myself, think that there’s only so far that that can go, that there is something unique about being able to actually do things in the world and interact with the real world in a way that we humans do that machines don’t, that forms our intelligence in a very deep way.

Now it’s possible, with vast, almost infinite amounts of training data and compute power, that machines could come close to getting the knowledge that would approximate what humans do. And we’re seeing that kind of happening now with these systems that are trained on everything online, everything digitized, and companies like Microsoft and Google are now building nuclear power plants to power their systems because there’s not enough energy currently to power these systems.

But that’s a crazy, inefficient, and non-sustainable way to get to intelligence, in my opinion. And so I think that if you have to train your system on everything that’s ever been written, and get all the power in the world, and even, like Sam Altman says, have to get to nuclear fusion energy in order to get to sort of human level intelligence, then you’re just doing it wrong. You’re not achieving intelligence in any way that’s sustainable, and we humans are able to do so much with so little energy compared to these machines that we really should be thinking about different ways to approach intelligence and AI.

And I think that’s what some of our guests have said, that there are other ways to do it. For instance, Alison Gopnik is looking at how to train machines in the way that children learn. And this is what Linda Smith and Mike Frank and others are looking at too: aren’t there better ways to get systems to be able to exhibit intelligent behavior?

Abha: Right. So let’s move on to AGI. There are a lot of mixed opinions out there about what it is and how it could come into being. What in your view is artificial general intelligence?

Melanie: I think the term has always been a bit vague. It was first coined to mean something like human-like intelligence. The idea is that in the very early days of AI, the pioneers of AI like Minsky and McCarthy, their goal was to have something like the AI we see in the movies, robots that can do everything that people do. But then AI became much more focused on particular specific tasks, like driving a car or translating between languages or diagnosing diseases.

These systems could do a particular task, but they weren’t the sort of general purpose robots that we saw in the movies that we really wanted. And that’s what AGI was meant to capture was that vision. So AGI was a movement in AI back in the early 2000s. It had conferences, they had papers and discussions and stuff, but it was kind of a fringe movement. But it’s now come back in a big way because now AGI is at the center of the goals of all of the big AI companies.

But they define it in different ways. For instance, I think DeepMind defines it as a system that could do all what they call cognitive tasks as well as or better than humans. So that notion of a robot that can do everything has now been narrowed into, oh well, we don’t mean all that physical stuff, but only the cognitive stuff, as if those things could be separated. Again, the notion of disembodiment of intelligence.

OpenAI defined it as a system that can do all economically valuable tasks. That’s how they have it on their website, which is kind of a strange notion, because it’s sort of unclear what is and what isn’t an economically valuable task. You might not be getting paid to raise your child, but raising a child seems to be something of economic value eventually. So I don’t know, I think that it’s ill defined, that people have an idea of what they want, but it’s not clear what exactly the target is or how we’ll know when we get there.

Abha: So do you think we will ever get to the point of AGI in that definition of the ability to do general things?

Melanie: In some sense, we already have machines that can do some degree of general things. You know, ChatGPT can write poetry, it can write essays, it can solve math problems, it can do lots of different things. It can’t do them all perfectly for sure.

And it’s not necessarily trustworthy or robust, but it certainly is in some sense more general than anything we’ve seen before. But I wouldn’t call it AGI. I think the problem is, you know, AGI is one of those things that might get defined into existence, if you will. That is, the definition of it will keep changing until, okay, we have AGI. Sort of like now we have self-driving cars.

Of course, they can’t drive everywhere and in every condition. And if they do run into problems, we have people who can operate them remotely to get them out of trouble. Do we want to call that autonomous driving? To some extent, yeah. To some extent, no. But I think the same thing is happening with AI, that we’re going to keep redefining what we mean by this. And finally, it’ll be there just because we defined it into existence.

Abha: Going back to the Nobel Prize in physics, physics has a theoretical component that proposes different theories and hypotheses, which groups of experimentalists then go and test to see if they’re true. In AI so far, the tech industry seems to be hurtling ahead without any theoretical component to it necessarily. How do you think academia and industry could work together?

Melanie: There’s a lot of people trying to do what you say, trying to kind of come up with a more theoretical understanding of AI and of intelligence more generally. It’s difficult because the term intelligence, as I said, isn’t that rigorously defined. I think academia and industry are working together especially in the field of applying AI systems to scientific problems.

But one problem is that it’s going much more in the big data direction than in the theoretical direction. So we talked about AlphaFold, which basically won the chemistry prize. AlphaFold is a big data system. It learns from huge amounts of data about proteins and the evolutionary histories of different proteins and similarity between proteins. And nobody can look at AlphaFold’s results and explain exactly how it got there or reduce it to some kind of theory about protein folding and why certain proteins fold the way they do.

So it’s kind of a black box, big data method to do science. And I fear in a way that that’s the way a lot of science is going to go. That some of the problems that we have in science are going to be solved, not because we have a deep theoretical understanding, but more because we throw lots and lots of data at these systems and they are able to do prediction, but aren’t able to do explanation in any way that would be sort of theoretically useful for human understanding.

So maybe we’ll lose that quality of science that is human understanding in favor of just big data prediction.

Abha: That sounds incredibly tragic.

Melanie: Well, maybe the next generation won’t care so much. If you could cure cancer, let’s say, as we’ve been promised by people like Sam Altman that AI is going to do. Do we need to understand why these things work? You know, some kind of magic medicine for curing cancer? Do we need to understand why it works? Well, I don’t know. Lots of medications, we don’t totally understand how they work. So that may be something lost to AI is the human understanding of nature.

Abha: Right. Ted Chiang wrote an article in the New Yorker, which I think you must have read, about the pursuit of art and what art is and how AI approaches it versus how we approach it. And even though art does not have the same kind of impact as curing cancer would, it does have a purpose in our human existence.

And to have AI take that away, you must have seen the memes coming out about these things, that one had expected artificial intelligence to take care of the housework, but it’s gone and taken away our creative work instead.

How do you look at that? Does that mean that as humans, we continue trying to pursue these artistic endeavors of understanding, or understanding more deeply, things that we feel have meaning for our lives? Or do we just give that over to AI?

Melanie: That sounds even more tragic to me than giving science over to AI. Ted Chiang wrote that he didn’t think AI generated art was really art because to make art, he said you need to be able to make choices and AI systems don’t really make choices in the human-like sense.

Well, that’s gotten a lot of pushback, as you would imagine. People don’t buy it. I don’t think that art will be taken over by AI, at least not any time soon, because a big part of art is the artist being able to judge what it is that they created and decide whether it’s good or not, decide whether it conveys the meaning that they want it to convey. And I don’t think AI can do that.

And I don’t think it will be able to do that anytime soon, maybe in the very far future. It may be that AI will be something that artists use as a tool. I think that’s very likely already true. Now, one big issue about AI art is that it works by having been trained on huge amounts of human-generated art. And unfortunately, the training data mostly came without permission from the artists. And the artists didn’t get paid for having their artwork being used as training data. They’re still not getting paid.

And I think that’s a moral issue that we really have to consider when thinking about using AI as a tool. To what extent are we willing to have it be trained on human-generated content without the permission of the humans who generated the content, and without them getting any benefit?

Abha: Right, and I think something like that was done by AI with your own book, right?

Melanie: Yeah, my book, which is called Artificial Intelligence: A Guide for Thinking Humans. Well, like many books, someone used an AI system to generate a book with the same title, that really was pretty terrible, but was for sale on Amazon.

Abha: So if you’re looking to buy that book, make sure you get the correct one.

Melanie: I put in a message to Amazon saying, please take this off, it’s plagiarized. And nothing happened until I got interviewed by a reporter from Wired Magazine about it. And then Amazon deleted that other book. But this is a broad problem.

We’re getting more and more AI generated books that are for sale that either have related content to an actual human-generated book or whatever content. When you buy a book, you don’t know it’s generated by AI. And often these books are quite bad. And so this is part of the so-called slop from AI that’s just sort of littering all of our digital spaces.

Abha: Littering is a good word for this phenomenon, I think. I want to go into the idea of complexity science and AI research. You’ve also written a book on complexity science, and you’ve had a long history with the Santa Fe Institute. You’ve been with us for many years now in different capacities. Why do you think AI is a complex system? And what keeps you in the complexity realm with this research?

Melanie: Well, I think AI, at many different levels and dimensions, is a complex system. One is just the systems themselves. Something like ChatGPT is a big neural network that is very complex, and we don’t understand how it works. People claim that it has so-called emergent behavior, which is a buzzword in complex systems.

And it’s something that complex systems people who think about large networks and large systems with emergent behavior might be able to put some insight in. The first notion of emergence came from physics, and now AI is part of physics, it’s won a Nobel Prize.

So I think these things are all tied up together. But also another dimension is sort of the interaction of AI and society. And clearly that’s a socio-technological complex system of the kind that many people here at the SFI are interested in studying.

So I think there’s many ways in which AI relates to complex systems research. I think SFI in particular is a great place for people to take this slower approach to thinking about these complex problems rather than the more quick incremental improvements that we see in the machine learning literature without very much deep thinking about how it all works and what it all means. So that’s what I’m hoping that SFI will be able to contribute to this whole discussion.

And I think, my colleague David Krakauer here at the SFI and I wrote a paper about the notion of understanding in AI that I think is influential because it really laid out the complexities of the topic. I do think that we people in complex systems do have a lot to contribute to this field.

Abha: So Melanie, we’ve talked about, you know, AI as a complex adaptive system. We’ve talked about AGI, the possibility and where we stand. Where do you think the research will lead us, eventually, say in another 10 years, having seen the progress we’ve made in the last 10 years?

Melanie: I think that one of the big things I mentioned is that the current approach to AI is just not sustainable in terms of the amount of data it requires, the amount of energy it requires. And what we’ll see in the next 10 years is ways to try and reduce the amount of data needed and reduce the amount of energy needed.

And that I think will take some ideas from the way people learn or the way animals learn. And it may even require AI systems to get more embodied. So that might be an important direction that AI takes, I think, in the next decade so that we can reduce this ridiculous dependence on so much data, so much energy, and make it a lot more sustainable and ecologically friendly.

Abha: Great. Thank you so much, Melanie. This has been wonderful as a season, and to have you as a co-host was such a privilege. I’ve really enjoyed working with you, and I hope we continue to discuss this over time. Maybe we’ll have another season when you and John have finished your workshop that’s going to run for the next three years.

Melanie: Yeah, that would be great. It’s been an incredible experience doing a podcast. I never thought I would do this, but it’s been fantastic and I’ve loved working with you. So thanks, Abha.

Abha: Likewise. Thank you, Melanie.

Complexity is the official podcast of the Santa Fe Institute. This episode was produced by Katherine Moncure. Our theme song is by Mitch Mignano, and additional music from Blue Dot Sessions. I’m Abha, thanks for listening.


Nature of Intelligence – Episode Five – How do we assess intelligence?

I don’t know about you, but my brain is starting to hurt, but in a good way. What seems clear to me was summed up when Abha Eli Phoboo informed us in this episode that we “don’t fully understand human intelligence or animal intelligence.”

And there’s much discussion regarding how we’re trying to evaluate machines — and associated LLMs — based on measurements that we use on humans. It may feel ridiculous on one level, but at the moment humans can only understand the world through the lens of being human.

We use medicines all the time that we don’t understand the mechanisms that they work on. And that’s true. And I don’t think we cannot deploy LLMs until we understand how they work under the hood. ~ Ellie Pavlick

But is understanding what LLMs are, or how they operate, all that important? As Ellie Pavlick reminds us, there’s much about the world we don’t fully understand. We just know whether something works or not.

But I found the discussion comparing humans to animals to be just as fascinating. Even if you don’t own a pet, I’m sure you’ve been around a number of animals at various times in your life. Did they seem “intelligent”, in one way or another? Did you feel they possessed a personality? I have a friend who has owned horses most of her life, and when I hear her talking to folks at the stables, they describe each horse as though it were human. Will we describe LLM personas in the same way someday?

Transcript

Abha Eli Phoboo: The voices you’ll hear were recorded remotely across different countries, cities and work spaces.

Erica Cartmill: I often think that humans are very egotistical as a species, right? So we’re very good at particular things and we tend to place more value on the things that we’re good at.

Abha: From the Santa Fe Institute, this is Complexity

Melanie Mitchell: I’m Melanie Mitchell

Abha: And I’m Abha Eli Phoboo

Melanie: As we enter our fifth episode of this season on intelligence, we’ve explored quite a few complicated and controversial ideas. But one thing has become really clear: intelligence is a murky concept. And that’s the point of this series — it’s something that we think we know when we see it, but when we break it down, it’s difficult to define rigorously.

Abha: Today’s episode is about how we assess intelligence. When it comes to testing humans, we have all kinds of standardized measures: IQ tests, the SAT, and so on. But these tests are far from perfect, and they’ve even been criticized as limited and discriminatory.

Melanie: To understand where our desire to test intelligence comes from — and also the way we talk about it as an inherent personality trait — it’s useful to look at the history of intelligence in Western society. In ancient Greece, the concept was described as “reason” or “rationality,” which then evolved into “intelligence” more broadly when the discipline of psychology arose. Philosophers like Socrates, Plato, and Aristotle highly valued one’s ability to think. And at first glance, that seems like a noble perspective.

Abha: But Aristotle took this a step further. He used the quote unquote “rational element” as justification for a social hierarchy. He placed European, educated men at the top, and women, other races, and animals below them.

Melanie: Other Western philosophers like Descartes and Kant embraced this hierarchy too, and they even placed a moral value on intelligence. By claiming that a person or an animal wasn’t intelligent, it became morally acceptable to subjugate them. And we know how the rest of that European expansion story goes.

Abha: So today’s notions about intelligence can be traced in part to the ways men distinguished themselves from… non-men.

Melanie: Or, to give the philosophers a more generous interpretation, the history of thought around intelligence centers on the idea that it is a fundamentally human quality.

Abha: So if intelligence, in theory, stems from humanity, how do we decide the degree to which other entities, like animals and large language models, are intelligent? Can we rely on observations of their behavior? Or do we need to understand what’s going on under the hood — inside their brains or software circuits?

Melanie: One scientist trying to tackle such questions is Erica Cartmill.

Erica: So my name is Erica Cartmill. I’m a professor of cognitive science, animal behavior, anthropology, and psychology at Indiana University. You know, I really study cognition, particularly social cognition, and the kinds of cognition that allow communication to happen across a wide range of species.

Abha: Erica has extensive experience observing intelligent behavior in beings that are very different from humans.

Erica: So I got the animal bug when I was a kid. And we had a whole range of different kinds of animals. It’s sort of a menagerie. We had horses, we had dogs, we had a turtle, we had a parrot. And I was always out watching lizards and butterflies and birds, mice in our barn. And sometimes I would catch a lizard, put it in a terrarium for two days, observe it, let it go again.

And that kind of wanting to observe the natural world and then have an opportunity to more closely observe it, under you might say controlled circumstances, even as a child, and then release it back into its natural environment is really something that I’ve continued to do as an adult in my scientific career. And that’s what I do mostly with my lab now, kind of split between studying great apes and human children.

But I’ve done work on a range of other species as well, Darwin’s finches in the Galapagos. I’m doing a project now that also includes dolphins and dogs and kea, which is a New Zealand parrot. And I’m starting a dog lab at IU. So I’m excited about some of those other species, but I would say the core of my work really focuses on comparing the cognitive and communicative abilities of great apes and humans.

Melanie: Much of Erica’s research has been on the evolution of language and communication. As we’ve said before, complex language is unique to our species. But other animals communicate in many ways, so researchers have been trying to narrow down what exactly makes our language so distinct.

Erica: So I think humans have always been really focused on this question of what separates us from other species. And for a long time, answers to that question centered around language as the defining boundary. And a lot of those arguments about language really focused on the structural features of language.

And if you look at sort of the history of these arguments, you would see that every time a linguist proposed a feature of language that say, human language is different because X, then people would go out and study animals and they would say, “Well, starlings have that particular feature” or, “A particular species of monkey has that feature.” And then linguists would sort of regroup and say, “Okay, well, actually this other feature is the real dividing line.”

And I think probably the boring answer or interesting answer, depending on how you look at it, is that there probably isn’t one feature. It’s the unique constellation of features combined with a constellation of cognitive abilities that make language different and make it so powerful. But I will say in recent years, the focus of these arguments about “language is unique because” has shifted from language is unique because of some particular structural feature to language is unique because it is built on a very rich social understanding of other minds.

It’s built on inferences about others’ goals, about what others know and don’t know. It’s built on what we call pragmatics in linguistics. So actually it’s very unlike a structured program that you can sort of apply and run anywhere. It’s actually something that relies on rich inferences about others’ intentions.

Melanie: When we humans communicate, we’re often trying to convey our own internal thoughts and feelings, or we’re making inferences about someone else’s internal state. We naturally connect external behavior with internal processes. But when it comes to other beings, our ability to make judgments about intelligence isn’t as straightforward.

Abha: So today we’re going to first look at what we can learn from external behavior, and from applying human notions of intelligence to animals and machines, which can pass tests at levels that are deceptively similar to humans.

Abha: Part 1: Assessing Intelligence in Humans, Animals, and Machines

Abha: If you have a pet at home, you’ve probably had moments when you’ve wanted to know what it’s trying to say when it barks, meows, or squawks. We anthropomorphize pets all the time, and one of the ways we do that is by envisioning them saying things like, “I’m hungry!” or “I want to go outside!” Or we might wonder what they say to each other.

Melanie: Animals most definitely communicate with one another. But there’s been a lot of debate about how sophisticated their communications are. Does a chimp’s hoot or a bird’s squawk always mean the same thing? Or are these signals flexible, like human words, communicating different meanings depending on context, including the animal’s understanding of the state of its listeners’ minds? In her work, Erica has critiqued the assumptions people often make in experiments testing animal communication.

She’s noted that the methods used won’t necessarily reveal the possible meaning of both vocal and other kinds of signals, especially if those meanings depend on particular contexts.

Erica: Authors recently, ranging from cognitive scientists to philosophers to linguists have argued that human communication is unique because it relies on these very rich psychological properties that underlie it. But this in turn has now led to new arguments about the dividing line between humans and other animals.

Which is that animals use communication that is very code-like, that one animal will produce a signal and another animal will hear that signal or see that signal and decode its meaning. And that it doesn’t rely on inferences about another’s intentions or goals, that the signals can be read into and out of the system. If you record, say, an auditory signal, like a bird call, and then you hide a speaker in a tree, and you play that call back, and you see how other birds respond. So this is called the playback method, unsurprisingly.

And that’s been one of the strongest things in the toolkit that animal communication researchers have to demonstrate that those calls in fact have particular meanings. That they’re not just, I’m singing because it’s beautiful, but that this call means go away and this other call means come and mate with me, and this other call means there’s food around, et cetera, et cetera.

And so decontextualizing those signals and then presenting them back to members of the species to see how they respond is the dominant method by which scientists demonstrate that a call has a particular meaning. That’s been incredibly important in arguing that animals really are communicating things. But that method, and the underlying model that is used to design experiments to ask questions about animal communication, is also very limiting.

Abha: An auditory signal taken out of context, whether a word or an animal call, is a very narrow slice of all the different ways animals — and humans — communicate with each other.

Erica: So it’s very good at demonstrating one thing, but it also closes off doors about the kinds of inferences that animals might be making. If Larry makes this call and I’m friends with Larry, versus Bob makes that call and I’m enemies with Bob, how do I respond? Does Bob know that I’m there? Can he see me? Is he making that call because I am there and he sees me and he’s directing that call to me? Versus, is he making that call to someone else and I’m eavesdropping on it?

Those are kinds of inferences that animals can make. I’m not saying all animals in all cases, but the ways that we ask questions about animal communication afford certain kinds of answers.

And we need, I think, to be more, I don’t know, humble is the right word? But we need to recognize the ways in which they limit the conclusions that we can draw, because this is very different from the way that we ask questions about human language.

And so when we draw conclusions about the difference between human language and animal communication based on the results of studies that are set up to ask fundamentally different questions, I think that leaves a lot to be desired.

Abha: And focusing on abilities that are relevant to humans’ intelligence might mislead us in how we think about animal intelligence.

Erica: I often think that humans are very egotistical as a species, right? So we’re very good at particular things and we tend to place more value on the things that we’re good at. And I think that in many cases, that’s fine, that’s one of our unique quirks as a species. But it also often limits the way that we ask questions and attribute kinds of intelligence to other species.

So it can be quite difficult, I think, for humans to think outside of the things that we’re good at or indeed outside of our own senses. I mean, sort of five senses, biological senses. So elephants… we’ve known for a long time that elephants are able to converge at a particular location, showing up far away at this tree on this day at this time from different starting points. And people really didn’t know how they were doing it.

They were starting too far apart to be able to hear one another. People were asking, are they planning? Do they have the sense of two Tuesdays from now we’re going to meet at the watering hole? And it wasn’t until people said maybe they’re using senses that fall outside of our own perceptual abilities. In particular, they measured very, very low frequencies and basically asked, okay, maybe they’re vocalizing in a way that we can’t perceive, right?

And so once they did that and greatly lowered the frequency of their recording equipment, they found that elephants were in fact vocalizing at very, very long distances, but they were doing it through this rumble vocalization that actually propagates through the ground rather than through the air.

And so they produce these, I can’t imitate it because you wouldn’t hear it even if I could, but they produce these very low rumbles that other elephants, kilometers away, perceive not through their ears but they perceive through specialized cells in the pads of their feet, where they can feel the vibrations.

And so I think this is a nice example of the way that we have to, in effect, not even necessarily think like an elephant, but imagine hearing like an elephant, having a body like an elephant, thinking, I like to call it thinking outside the human.

Humans are good at particular things, we have particular kinds of bodies, we perceive things on particular time scales, we perceive things at particular light wavelengths and auditory frequencies. Let’s set those aside for a second and think about, okay, what did that species evolve to do? What do its perceptual systems allow it to perceive? And try to ask questions that are better tailored to the species that we’re looking at.

Melanie: There’s been a lot of work throughout the many decades on trying to teach human language to other species like chimps or bonobos or African gray parrots. And there’s been so much controversy over what they have learned. What’s the current thinking on the language abilities of these other species and those experiments in general?

Erica: It’s almost hard to answer the question with the current thinking, because there’s very little current research. A lot of that research was done 20 or even 40 years ago. Compared to the work that was being done 30 years ago, there’s very little current work with apes and parrots and dolphins, all of which 30 years ago, everyone was trying to teach animals human language.

And I think it was a really interesting area of inquiry. I would say people differ a little bit, but I think that probably the sort of most dominant opinion or maybe the discussion is best characterized by saying that people today, I think, largely believe that those animals were able to learn, understand, and productively use words, but that they were limited in the scope of the words they could learn, and that they weren’t combining them into productive sentences.

And this was part of the argument that syntax, the combining of words according to particular rules, was something that human language did that was very different from what animals could produce. And so I think with the animal language studies that were showing largely that animals could learn words, they could produce words, sometimes produce words together, but they weren’t doing it in reliable sentence-like structures.

Melanie: But do you think that the fact that we were trying to teach them human language in order to assess their cognitive abilities was a good approach to understanding animal cognition or should we more do what you said before, sort of take their point of view, try to understand what it’s like to be them rather than train them to be more like us?

Erica: I think that’s a great question. My answer probably hinges around the limitations of human imagination. Where I think that teaching animals to communicate on our terms allows us to ask better questions and better interpret their answers than us trying to fully understand their communication systems. People certainly are using things like machine learning to try to quote unquote “decode” whale song or bird song, approaches that are more on the animals’ terms, using their natural communication.

And I think that those are very interesting approaches. I think they’ll be good at finding patterns in what animals are producing. The question I think still remains whether animals themselves are perceiving those patterns and are using them in ways that have meaning to them.

Abha: And the way we’ve tried to assess intelligence in today’s AI systems also hinges around the limitations of human imagination, perhaps even more so than with animals, given that by default, LLMs speak our language. We’re still figuring out how to evaluate them.

Ellie Pavlick: Yeah, I mean, I would say they’re evaluated very… I would say badly.

Abha: This is Ellie Pavlick. Ellie’s an assistant professor of computer science and linguistics at Brown University. Ellie has done a lot of work on trying to understand the capabilities of large language models.

Ellie: They’re evaluated right now using the things that we can conveniently evaluate, right? It is very much a, what can we measure? And that’s what we will measure. There’s a lot of repurposing of existing kind of evaluations that we use for humans. So things like the SAT or the MCAT or something like that.

And so it’s not that those are completely uncorrelated with the things we care about, but they’re not very deep or thoughtful diagnostics. Things like an IQ test or an SAT have long histories of problems for evaluating intelligence in humans. But they also just weren’t designed with models of this type being the subjects.

I think what it means when a person passes the MCAT or scores well on the SAT is not the same thing as what it might mean when a neural network does that. We don’t really know what it means when a neural network does it, and that’s part of the problem.

Melanie: So why do you think it’s not the same thing? I mean, what’s the difference between humans passing a bar exam and a large language model?

Ellie: Yeah, I mean, that’s a pretty deep question, right? So I would say, compared to a lot of my peers, I’m not as quick to say the language models are obviously not doing what humans do, right?

I tend to reserve some space for the fact that they might actually be more human-like than we want to admit. A lot of times processes that people might be using to pass these exams might not be as deep as we like to think. So when a person, say, scores well on the SAT, we might like to think that there’s some more general mathematical reasoning abilities and some general verbal reasoning abilities. And then that’s going to be predictive of their ability to do well in other types of tasks. That’s why it’s useful for college admission.

But we know in practice that humans often are just learning how to take an SAT, right? And I think we very much would think that these large language models are mostly learning how to take an SAT.

Melanie: So just to clarify, when you say, I mean, I know what it means when a human is learning how to pass a test, but how does a language model learn how to pass a test?

Ellie: Yeah, so we can imagine this simple setting. I think people are better at thinking about, let’s pretend we just trained the language model on lots of examples of SATs. They’re going to learn certain types of associations that are not perfect, but very reliable.

And I always have this joke with my husband when we were in college about how you could pass a multiple choice test without having ever taken the subject. And we would occasionally try to pass his qualifying exams in med school. I think he took an econ exam with me. So there’s certain things like, whenever there’s something like “all of the above” or “none of the above,” that’s more likely to be the right answer than not, because it’s not always there. So it’s only there when that’s the right thing.

Or it’s a good way for the professor to test that you know all three of these things efficiently. Similarly, when you see answers like “always” or “never” in them, those are almost always wrong because they’re trying to test whether you know some nuanced thing.

Then there’s some, and none of these is perfect, but you can get increasingly sophisticated kinds of heuristics and things, based on the words: this one seems more or less related, this seems kind of topically off base, whatever. So you can imagine there’s patterns that you can pick up on. And if you stitch many, many of them together, you can pretty quickly get to possibly perfect performance with enough of them.

So I think that’s a kind of common feeling about how language models could get away with looking like they know a lot more than they do by kind of stitching together a very large number of these kinds of heuristics.
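
An aside from me rather than the podcast: here’s a minimal sketch, in Python, of the kind of heuristic stacking Ellie describes. The rules, weights, and sample question are all invented for illustration; the point is just that shallow surface cues, requiring no knowledge of the subject, can add up to plausible-looking answers.

```python
# Toy sketch (not from the podcast) of answering multiple-choice questions
# by stacking shallow heuristics, with no knowledge of the subject.
# All rules, weights, and the sample question are invented.

def score_option(question: str, option: str) -> float:
    """Score one answer option using surface-level cues only."""
    score = 0.0
    text = option.lower()
    # Cue 1: "all/none of the above" tends to appear only when correct.
    if "all of the above" in text or "none of the above" in text:
        score += 2.0
    # Cue 2: absolute words ("always", "never") are usually distractors.
    if "always" in text or "never" in text:
        score -= 1.5
    # Cue 3: mild bonus for topical overlap with the question wording.
    overlap = len(set(question.lower().split()) & set(text.split()))
    score += 0.3 * overlap
    return score

def answer(question: str, options: list[str]) -> str:
    """Pick the option the heuristics like best."""
    return max(options, key=lambda opt: score_option(question, opt))

question = "Which factors can shift a market demand curve?"
options = [
    "Prices never change demand",
    "Income changes",
    "Changes in consumer preferences",
    "All of the above except the first",
]
print(answer(question, options))  # the heuristics favor the "all of the above" option
```

Each cue is weak on its own, but as Ellie says, stitch enough of them together and the result can look deceptively like subject knowledge.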

Abha: Would it help if we knew what was going on under the hood with LLMs? We don’t really know a whole lot about our brains either, and we don’t know anything about LLMs, but would it help in any way if we could sort of look under the hood?

Ellie: I mean, that’s where I’m placing my bets. Yeah.

Melanie: In Part 2, we’ll look at how researchers are actually looking under the hood. And many of them are trying to understand LLMs in a way that’s analogous to how neuroscientists understand the brain.

Melanie: Part 2: Going Under the Hood

Abha: Okay, so wait a minute. If we’re talking about mechanistic understanding in animals or humans — that is, understanding the brain circuits that give rise to behavior — it makes sense that it’s something we need to discover. It’s not obvious to us, in the same way that it’s not obvious how a car works if you just look at the outside of it.

But we do know how cars work under the hood because they’re human inventions. And we’ve spent a lot of this season talking about how to learn more about artificial intelligence systems and understand what they’re doing. It’s a given that they’re so-called “black boxes.”

But… we made AI. Human programmers created large language models. Why don’t we have a mechanistic understanding? Why is it a mystery? We asked Ellie what she thought.

Ellie: The program that people wrote was programmed to train the model, not the model itself, right? So the model itself is this series of linear algebraic equations. Nobody sat down and wrote, “Okay, in the 118th cell of the 5,000th matrix, there’ll be a point zero two,” right? Instead there’s a lot of mathematical theory that says, why is this the right function to optimize? And how do we write the code? And how do we parallelize it across machines?

There’s a ton of technical and mathematical knowledge that goes into this. There’s all of these other variables that factor in, they’re very much part of this process, but we don’t know how they map out in this particular thing. You kind of set up some rules and constraints to guide a system, but the system itself is on its own. So if you’re routing a crowd through a city or something for a parade, right?

And now you come afterward and you’re trying to figure out why there’s a particular cup on the ground in a particular orientation or something. But you set up, you knew where the people were going to go. But there’s all of this other stuff that, it’s constrained by what you set up, but that’s not all that there is. There’s many different ways to meet those constraints.

And some of them will have some behavioral effects and others will have others, right? There’s a world where everyone followed your rules and there wasn’t a cup there. And there’s a world where those cars crashed or didn’t crash, and all of those other things are subject to other processes. So it’s kind of an underspecified problem, right, that was written down. And there are many ways to fill in the details, and we don’t know why we got this one that we got.
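
Another aside from me: below is a minimal sketch, assuming only NumPy and an invented toy problem, of the distinction Ellie is drawing. The code specifies the data, the loss, and the update rule — the “rules and constraints” — but no line of it writes the final numbers in the weight matrix; those emerge from the optimization.

```python
# Toy sketch: humans write the training procedure, not the weights.
# The data and model here are invented; requires NumPy.
import numpy as np

rng = np.random.default_rng(0)

# Tiny "model": a single weight matrix, initialized randomly.
W = rng.normal(size=(2, 1))

# Made-up data generated from y = 3*x0 - 2*x1 plus a little noise.
X = rng.normal(size=(200, 2))
y = X @ np.array([[3.0], [-2.0]]) + 0.01 * rng.normal(size=(200, 1))

# What we actually program: a loss to optimize and an update rule.
for _ in range(500):
    grad = 2 * X.T @ (X @ W - y) / len(X)  # gradient of mean squared error
    W -= 0.1 * grad                        # gradient descent step

# The numbers now in W were never written by hand; they emerged from
# the constraints (data, loss, optimizer) that the code set up.
print(W.round(3))  # close to [[3.], [-2.]]
```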

Melanie: So when we’re assessing LLMs, it’s not quite the same as humans because we don’t know what happens between the constraints we set up and, for example, ChatGPT’s SAT score at the end.

And we don’t always know how individual people are passing the SAT either — how much someone’s score reflects their underlying reasoning abilities versus how much it reflects their ability to sort of “game” the test. But at the very least, when we see an SAT score on a college application, we do know that behind that SAT score, there’s a human being.

Ellie: We can take for granted that we all have a human brain. It’s true. We have no idea how it works, but it is a known entity because we’ve evolved dealing with humans. You live a whole life dealing with humans. So when you pick somebody to come to your university, or you hire someone for a job, it’s not just a thing that passes the SAT, it’s a human that passes the SAT, right?

That is one relevant feature. Presumably the more relevant feature is that it’s a human. And so with that comes a lot of inferences you can make about what humans who pass the SAT or score a certain score probably also have the ability to do, right? It’s a completely different ball game when you’re talking about somebody who’s not a human, because that’s just not what we’re used to working with.

And so it’s true, we don’t know how the brain works, but now you’re in the reality of having another thing that’s scoring well, and you have no idea how it works. To me, the only way to start to chip away at that is we need to ask if they’re similar at a mechanistic level. Whether a score on the SAT means the same thing when an LLM achieves it as when a human does is 100% dependent on how it got there.

Abha: Now, when it comes to assessing artificial intelligence, there’s another question here: How much do we need to understand how it works, or how intelligent it is, before we use it? As we’ve established, we don’t fully understand human intelligence or animal intelligence — people debate how effective the SAT is for us — but we still use it all the time, and the students who take it go on to attend universities and have careers.

Ellie: We use medicines all the time that we don’t understand the mechanisms that they work on. And that’s true. And I don’t think we cannot deploy LLMs until we understand how they work under the hood. But if we’re interested in these questions of, “Is it intelligent?” Just the fact that we care about that question. Answering that question probably isn’t relevant for whether or not you can deploy it in some particular use case.

If you have a startup for LLMs to handle customer service complaints, it’s not really important whether the LLM is intelligent. You just care whether it can do this thing, right? But if you want to ask that question, we’re opening up this very big can of worms, and we can’t ask the big questions and then not be willing to do the big work, right?

Melanie: And answering the question of mechanistic understanding is really big work. As in other areas of science, you have to decide what level of understanding you’re actually aiming for.

Ellie: Right, I mean, this kind of idea of levels of description has existed in cognitive science. I think cognitive scientists talk about it a lot, which is kind of what is the right language for describing a phenomenon? And sometimes you can have simultaneous consistent accounts, and they really should be consistent with one another, but it doesn’t make sense to answer certain types of questions at certain levels.

And so I think a favorite example in cognitive sciences is quantum physics versus classical mechanics, right? It would be really cumbersome and bizarre and highly unintuitive, and we can’t really do it, to roll this billiard ball into this billiard ball and try to describe it at the level of quantum mechanics. It would be an absurd thing to do and you would be missing a really important part of how physics works.

And there’s a lot of debate about whether you could explain the kind of billiards ball in quantum mechanics. But the point is there’s laws at the lower level that tell you that the ball will exist. And now once you know that the ball is there, it makes sense to explain things in terms of the ball because the ball has the causal force in this thing, not the individual things that make up the ball.

But you would want to have the rules that combine the small things together in order to get you to the ball. And then when you know that the ball is there, then you can just talk in terms of the ball and you don’t have to appeal to the lower level things. And sometimes it just makes more sense to talk about the ball and not talk about the lower level things.

And I think the feeling is we’re looking for those balls within the LLM, so that you can say why the language model answered this way on this prompt, but suddenly got the answer wrong when you changed the period to have a space before it.

That’s because it’s thinking in terms of these balls, right? And if we’re trying to understand it at the level of these low level things, it just seems random. If you’re missing the key causal thing, it just seems random. It could be that there is no key causal thing, right? That’s kind of part of the problem. I’m thinking there is, and if we find it, this will be so cool, and the common, legitimate point of skepticism is there might just not be one, right?

Abha: So we’re trying to find the shape and size of these “billiard balls” in LLMs. But as Ellie said, whether or not the billiard balls even exist is not certain. We’re assuming and hoping that they’re there and then going in and looking for them.

Melanie: And if we were to think about how these levels apply to humans, one way we try to gain mechanistic understanding of human intelligence is by looking inside our brains.

If you think back to Ev Fedorenko’s work from our episode about language, Ev’s use of fMRI brain scanning is exactly this — she’s looked at the pathways in the brain that light up when we use language. But imagine if we were to try to go even further and describe human language in terms of the protons, electrons, and neutrons within our brain cells. If you go down to that level of detail, you lose the order that you can see in the larger brain structures. It’s not coherent.

Abha: LLMs work by performing vast numbers of matrix multiplications: at the granular, detailed level, it’s all math. And we could look at those matrix operations, in the same way we can observe the quantum mechanics of billiard balls. And they’ll probably show us that something’s happening, but not necessarily what we’re looking for.

Ellie: And maybe part of when we’re very frustrated with large language models and they seem like quote “black boxes” is because that’s kind of what we’re trying to do, right? We’re trying to describe these higher level behaviors in terms of the matrix multiplications that implement them, which obviously they are implemented by matrix multiplications, but it doesn’t correspond to anything that looks like anything that we can grab onto.

So I think there’s this kind of higher level description that we all want. It’s useful for understanding the model for its own sake. It’s also really useful for these questions about similarity to humans, right? Because humans aren’t gonna have those exact same matrix multiplications. And so it’s kind of like, what are the higher level abstractions that are being represented? How are they being operated on?

And that’s where the similarity is likely to exist. It’s like we kind of need to invent fMRIs and EEGs, and we’ve got to figure out how to do that. And I think there are some things that exist. They’re good enough to start chipping away and we’re starting to get some interesting converging results, but they’re definitely not the last word on it.

So I would say one of the most popular tools that we use a lot that I think was really invented maybe back around 2019, 2020 or something is called path patching, but that paper I think called it causal mediation analysis. I think there are a lot of papers that kind of have simultaneously introduced and perfected this technique.

But it basically is saying try to find which components in the model are, like, maximally contributing to the choice of predicting A over B. So that’s been a really popular technique. There have been a lot of papers that have used it and it has produced very reproducible types of results.

And what you basically get is some kind of an fMRI: it lights up parts of the network, saying these ones are highly active in this decision, these ones are less active.
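
An aside from me: here’s a heavily simplified sketch of that patching idea on a made-up two-layer network, not real path-patching code for a transformer. We run a “clean” and a “corrupted” input, splice one hidden unit’s clean activation into the corrupted run, and see how much the preference for output A over B is restored; the units with the largest effects are the ones that “light up.”

```python
# Toy sketch of activation patching (a.k.a. causal mediation analysis).
# The two-layer network and inputs are invented; real work patches
# components of a transformer. Requires NumPy.
import numpy as np

rng = np.random.default_rng(1)
W1 = rng.normal(size=(4, 8))   # hidden layer: 8 units ("components")
W2 = rng.normal(size=(8, 2))   # output layer: logits for answers A and B

def forward(x, patch=None):
    """Run the net; optionally overwrite one hidden activation."""
    h = np.maximum(0, x @ W1)           # ReLU hidden activations
    if patch is not None:
        unit, value = patch
        h = h.copy()
        h[unit] = value                 # splice in an activation from another run
    return h @ W2

def logit_diff(logits):
    return logits[0] - logits[1]        # preference for answer A over B

x_clean = rng.normal(size=4)            # input where the model behaves "correctly"
x_corrupt = rng.normal(size=4)          # perturbed input
h_clean = np.maximum(0, x_clean @ W1)   # cached activations from the clean run

baseline = logit_diff(forward(x_corrupt))

# Patch each unit's clean activation into the corrupted run; units whose
# patch moves the output most are the ones that "light up".
for unit in range(8):
    patched = logit_diff(forward(x_corrupt, patch=(unit, h_clean[unit])))
    print(f"unit {unit}: effect {patched - baseline:+.3f}")
```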

Abha: So then, how do we get from path patching — this fMRI for large language models — to higher-level concepts like understanding, intentions, and intelligence?

We often wonder if LLMs “understand,” but what it means to “understand” something can depend on how you define it.

Melanie: Let me jump up from the matrix multiplication discussion to the highest philosophical level. So there was a paper in 2022 that was a survey of the natural language processing community.

And it asked people to agree or disagree with the following statement: “Some generative models trained only on text, given enough data and computational resources, could understand natural language in some non-trivial sense.” So this is in principle, trained only on language. So would you agree or disagree with that?

Ellie: I would say maybe I would agree. To me, it feels almost trivial because I think what’s nice about this question is it doesn’t treat understanding as a binary. And I think that’s the first place where I usually start when people ask this question. To me, a lot of the debate we’re having right now is not about large language models, it’s about distributional semantics, and it’s whether we thought distributional semantics could go this far.

Melanie: Can you explain what distributional semantics is?

Ellie: Yeah. You know, natural language processing has just been using text. And so using this idea that the words that occur before and after a word are a really good signal of its meaning. And so if you get a lot of text and you cluster things based on the words they co-occur with, maybe dog and puppy and Dalmatian will all occur together. Cat and dog and bird and other pets will co-occur together. Zebra and elephant, those will co-occur together.

And as you get bigger models and more text, the structure becomes more sophisticated. So you can cut similarity along lots of different dimensions. It’s not just one dimension of are these things similar or different. I’ve differentiated pets from zoo animals, but in this other dimension, I’ve just differentiated carnivores from herbivores, right?

So it’s obviously missing some stuff. It might know a lot about “cat” as it relates to other words, but it doesn’t know what a cat actually is, right? It wouldn’t be able to point out a cat. It can’t see. So it doesn’t know what cats look like and doesn’t know what they feel like.
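
One more aside from me: a bare-bones sketch of the distributional idea, using an invented toy corpus. Each word is represented by counts of its neighbors; words used in similar contexts (cat, dog) come out more similar than words that aren’t (cat, zebra). Real systems use vastly more text and learned embeddings, but the principle is the same.

```python
# Bare-bones distributional semantics: represent each word by counts of
# the words it co-occurs with. The corpus is invented and far too small.
from collections import Counter, defaultdict
from math import sqrt

corpus = (
    "the cat chased the mouse . the dog chased the cat . "
    "the dog and the puppy played . the zebra and the elephant grazed ."
).split()

window = 2
vectors = defaultdict(Counter)
for i, word in enumerate(corpus):
    for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
        if j != i:
            vectors[word][corpus[j]] += 1  # count each neighboring word

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u)
    norm = lambda c: sqrt(sum(n * n for n in c.values()))
    return dot / (norm(u) * norm(v))

# Words that appear in similar contexts get similar vectors.
print("cat ~ dog:  ", round(cosine(vectors["cat"], vectors["dog"]), 3))
print("cat ~ zebra:", round(cosine(vectors["cat"], vectors["zebra"]), 3))
```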

Melanie: So I think the results of that survey were interesting. That was in 2022. So it might be different now, but half the people agreed and half the people disagreed. And so the disagreement, I think the question was, could something trained only on language in principle understand language in a non-trivial sense? And I guess it’s just a kind of a difference between how people interpret the word understand.

And the people who disagreed would say what you said: these systems know how to use the word cat, but they don’t know what a cat is. Some people would say that’s not understanding.

Ellie: Right, I think this gets down to people’s definition of understand and people’s definition of trivial. And I think this is where I feel like it’s an interesting discussion to have over drinks or something like that, but is it a scientific discussion right now? And I often find it’s not a scientific discussion. Some people just feel like this is not understanding and other people feel sure it is.

And there’s no moving their opinions because I don’t know how you speak to that. So the way you have to speak to it is to try to figure out what’s really going on in humans. Assuming we all agree that humans really understand and that’s the only example we all agree on. We need to figure out whether it is.

And then we have to figure out what’s different in the LLMs and then we have to figure out whether those differences are important or not. And I don’t know. That’s just a really long game.

So as much as I kind of love this question, I’ve increasingly gotten annoyed at having to answer it, cause I just don’t feel like it’s a scientific question. But it could be. It’s not asking about the afterlife or something. It’s not outside of the realm of answerable questions.

Abha: In our previous episodes, we’ve talked about how one of the big questions around artificial intelligence is whether or not large language models have theory of mind, which researchers first started assessing with human psychology tests like the Sally-Anne scenario.

And a second question arose out of that process: if LLMs can pass our human theory of mind tests — if they pass Sally-Anne when the details and the names are changed — are they actually doing complicated reasoning, or are they just getting more sophisticated at matching patterns in their training data?

As Ellie said, she cares that we’re intentional and scientific when we say things like, an LLM “understands” or “doesn’t understand.” And yet —

Ellie: They’re learning much more interesting structure than I would have guessed. So, coming into this work, I would have called myself a neural network skeptic, and I still kind of view myself as that, right? I very often get annoyed when I hear people say stuff like they understand or they think.

And yet I actually spend more of my time writing papers saying, there is an interesting structure here. They do have some notion of compositionality. Or they, and I actually do use those words a lot, I really try not to in papers, but when I’m talking, I just don’t have another word for it. And it is so inefficient for me to come up with some new jargon, so I anthropomorphize like crazy in my talks and it’s terrible, and I apologize, blanket at the beginning, and I keep doing it.

But one big takeaway is I’m not willing to say that they think or they understand or any of these other words, but I definitely have stopped making claims about what they obviously can’t do or even obviously aren’t doing, right? Because I had to eat my words a couple of times and I think it’s just we understand so little that we should all just stop trying to call it and just take a little bit of time to study it.

I think that’s okay, we don’t need an answer right now on whether they’re intelligent or not. What is the point of that? It’s just guaranteed to be wrong. And so, let’s just take some time and figure out what we’re trying to even do by asking that question and do it right.

I think right now seeing LLMs on the scene, it’s too similar to humans in all the wrong kinds of ways to make intelligence the right way to be thinking about this. And so I would be happy if we just could abandon the word. The problem, like I said, is then you get bogged down in a ton of jargon and I think we should all just be in agreement that we are in the process, and it might take a while of redefining that word.

I hope it’ll get fractured up into many different words, and that a decade from now, you just won’t even see that in the papers anywhere, but you will see other types of terms where people are talking about other kinds of much more specific abilities.

Melanie: Well, also just sort of being willing to put up with uncertainty, which very few people in this field seem to be able to do.

Ellie: It would be nice if we could all just wait a decade. I get the world wouldn’t allow that, but I wish we could just do that, right?

Abha: And Erica agrees. Her work with animals has made her pause before making assumptions about what other entities can and can’t do.

Erica: I keep going to new talks and I sort of have an opinion and I get a new talk and then I go, well, that’s really interesting. And I have to kind of revise my opinion. And I push back a lot on human scientists moving the bar on, what makes humans unique? What makes human language unique?

And then I sort of find myself doing that a little bit with LLMs. And so I need to have a little bit of humility in that. So I don’t think they have a theory of mind, but I think demonstrating, one, that they don’t and, two, why they don’t are not simple tasks. And it’s important to me that I don’t just sort of dogmatically say, “Well, I believe that they don’t,” right?

Because I think people believe a lot of stuff about animals and then go into it saying, “Well, I believe animals don’t have concepts.” And then you say, “Well, why not?” “Well, because they don’t have language.” And it’s okay. So I think that LLMs are fundamentally doing next token prediction.

And I know you can build them within systems that do more sophisticated things, but they’re fundamentally, to the extent that my layperson’s understanding goes, I mean, I do not build these systems, and you know much more about this than I do.

But I think that they’re very good at predicting the ways that humans would answer those questions based on the corpora of how humans answer either exactly those questions or questions that are similar in form, that are sort of analogous, structurally and logically similar.

And I mean, I’ve been spending quite a bit of time trying to argue that chimpanzees have a theory of mind and people are historically, I mean, now I think they’re becoming a little more open to it, but historically have been quite opposed to that idea. But we’ll very readily attribute those ideas to an LLM simply because they can answer verbal questions about it.

Abha: We’ll readily attribute human characteristics to LLMs because, unlike the chimpanzees Erica studies, they speak like us. They’re built on our language. And that makes them both more familiar to us on a surface level, and more alien when we try to figure out how they’re actually doing things.

Melanie: Earlier, Erica described a tradeoff in studying intelligence in animals: how much do we gain by using the metrics we’re familiar with, like human language, versus trying to understand animals on their own terms, like elephants that rumble through the ground to communicate?

Abha: And we asked Ellie how this applies to large language models. Does that tradeoff exist with them too?

Ellie: Yeah, totally. From the point of view of LLMs, I actually think within our lab, we do a little bit of both of these. I often talk more about trying to understand LLMs in human terms. Definitely much more so than with animals. LLMs were invented to communicate with us and do things for us. So it is not unreasonable or it’s not unnatural to try to force that analogy, right?

Unlike elephants, which existed long before us and are doing their own thing, and they couldn’t care less and would probably prefer that we weren’t there at all, right?

Melanie: On the other hand, Erica finds them more difficult to interpret, because even though they can perform on our terms, the underlying “stuff” that they’re made of is less intuitive for her than animals.

Erica: Again, I’m not sure because, an LLM is not fundamentally a single agent, right? It’s a collective. It’s reflecting collective knowledge, collective information. I feel like I know much more how to interpret a single parrot or a single dolphin or a single orangutan performing on a task. How do they, sort of, how do they interpret it? How do they respond?

To me, that question is very intuitive. I know that mind might be very different from my own, but there is a mind there. There is a self. And whether that self is conscious, whether that self is aware of itself, those I think are big questions, but there is a self. There is something that was born into the world that has narrative continuity and one day will die, we all will, right? LLMs don’t have that.

They aren’t born into the world. They don’t have narrative continuity and they don’t die in the same way that we do. And so I think it’s a collective of a kind that humans have never interacted with before.

And I don’t think that our thinking has caught up with technology. So I just don’t think that we’re asking the right questions about them, because these are entities or collectives or programs unlike anything else that we have ever experienced in human history.

Abha: So Melanie, let’s recap what we’ve done in this episode. We’ve looked at the notion of assessing intelligence in humans, non-human animals, and machines. The history of thought concerning intelligence is very much human centered. And our ideas about how to assess intelligence have always valued the things that are most human-like.

Melanie: Yeah, I really resonated with Erica’s comment about our lack of imagination in doing research on animals. And she showed us how a human-centered view has really dominated research in animal cognition and that it might be blinding us to important aspects of how animals think, not giving them enough credit.

Abha: But sometimes we give animals too much credit by anthropomorphizing them. When we make assumptions about what our dog or cat is quote unquote thinking or feeling, we project our emotions and our notions of the world onto them, right?

Melanie: Yeah, our human-centered assumptions can definitely lead us astray in many ways. But Ellie pointed out similar issues for assessing LLMs. We give them tests that are designed for humans, like the SAT or the bar exam, and then if they pass the test, we make the mistake of assuming the same things that we would for humans passing that test. But it seems that they can pass these tests without actually having the general underlying skills that these tests were meant to assess.

Abha: But Ellie also points out that humans often game these tests. Maybe it’s not the tests themselves that are the problem. Maybe it’s the humans or the animals or the machines that take them.

Melanie: Sure, our methods of assessing human intelligence have always been a bit problematic. But on the other hand, there’s been decades of work on humans trying to understand what general abilities correlate with these test scores while we’re just beginning to figure out how to assess AI systems like LLMs. Ellie’s own work in trying to understand what’s going on under the hood in AI systems, as we described before, is called mechanistic understanding or mechanistic interpretability.

Abha: The way I understood this is that she’s looking at ways to understand LLMs at a higher level than just weights and activations in a neural network. It’s analogous to what neuroscientists are after, right? Understanding the brain without having to look at the activation of every neuron or the strength of every synapse.

Melanie: Yeah, as Ellie said, we need something like fMRIs for LLMs. Or maybe we actually need something entirely different, since as Erica pointed out, an LLM might be better thought of as a collective kind of intelligence rather than an individual. But in any case, this work is really at its inception.

Abha: Yeah, and also as both Ellie and Erica pointed out, we need to understand better what we mean by words like intelligence and understanding, which are not yet rigorously defined, right?

Melanie: Absolutely not. And maybe instead of making grand proclamations like, LLMs understand the world or LLMs can’t understand anything, we should do what Ellie urges us to do. That is to be willing to put up with uncertainty.

Abha: In our final episode of the season, I’ll ask Melanie more about what she thinks about all these topics. You’ll hear about her background in the field of intelligence, her views on AGI and if we can achieve it, how sustainable the industry is, and if she’s worried about AI in the future.

That’s next time, on Complexity. Complexity is the official podcast of the Santa Fe Institute. This episode was produced by Katherine Moncure. Our theme song is by Mitch Mignano, and additional music from Blue Dot Sessions. I’m Abha, thanks for listening.

If you enjoyed this article…Buy Me A Coffee

Learn more about the coaching process or
contact me to discuss your storytelling goals!

Subscribe to the newsletter for the latest updates!

Copyright Storytelling with Impact® – All rights reserved