Inferring Emotions

Beyond Emotional Gradients

The study of emotions, with its interdisciplinary appeal, is marked by quite a bit of controversy. This is partly because there is no general agreement over what words like “emotion” or “feeling” actually mean, just as there is no agreement on whether conscious affective states are created in the higher brain regions from signals generated lower down, or whether they begin as felt states in the lower brain and are elaborated upon in higher areas, where learned interpretations modify them into more complex emotions. In either interpretation, it is generally agreed that fully experienced, conscious affective states exist only in higher brain regions.

For our purposes, the precise answer to how feelings come to be felt does not really matter. Feelings are qualia by definition, and there is no useful debate over qualia in humans or anything else. We are more concerned with which feeling-producing gradients a person internally creates than with the precise mechanisms humans use to integrate them into conscious experience. We are also concerned with which feelings are projected from one person to another, and how those might be detected by an AI system. In other words, we want to know how to detect what a person projects as feelings, how to model that data, and how to use it to control a response.

The basis of this model is differential perception. We assume that regions in the brain are associated with different feeling states because they either provoke the release of related chemicals or receive signals about their concentrations. The states with which we are concerned can be mixed, but the basic ones are generally accepted to be few.

The late Jaak Panksepp, the neuroscientist who coined the term “affective neuroscience,” proposed a set of seven systems, which he called Seeking, Rage, Fear, Grief, Caring, Lust, and Play. Other formulations exist, but most are similar. Panksepp mapped each of these to a different region of the mammalian brain, all of them in medial and lower structures, and each attended by the release of different chemicals into the body. The mapping of feelings to these brain areas, sometimes referred to as parts of the limbic system, implies that these feelings are evolutionarily old. It also suggests that they serve similar functions in all mammals. Indeed, from the hissing of cats and the growling of dogs to the playfulness of puppies and kittens, the behaviors related to these states are easily observed and their functions readily explained.

 

Inferences vs. Projections

For humans, with their complex neocortices, basic feelings are obviously not the whole story of emotions. Rather, humans experience many complicated emotional states. It is believed that these complex emotions are learned, either through explicit teaching or through cultural absorption. This implies that the same feeling systems can generate the inputs to a variety of emotions, each of which depends on the learned social context in which it arises and the way people have been taught to interpret it.

This means that while feelings may begin in lower brain systems with changes in chemistry, they are interpreted into more nuanced emotions in the higher regions of the neocortex. The need to learn and interpret such contextual emotions before the brain can generate them internally also accounts for why very young children and those with cognitive disabilities do not seem to have the range of complex emotions found in typical adults. Even so, most people, and especially children, will display very strong basic feelings when aroused.

Another important implication of this concerns communication. People communicate their basic feelings rather clearly, and very likely have done so for millions of years, but they are unlikely to be able to communicate complex emotions so easily. Instead, they must either rely on context to carry the additional data others need to reconstruct their emotional states, or explain them with language. For example, complex emotions such as familial embarrassment and schadenfreude are elaborations of basic states. If only the feeling states are being transmitted, then it is the recipient, knowing the context and sharing the culture, who reconstructs the actual emotion cognitively, using modules similar to those that construct it in the transmitter. This decoding would be roughly accurate at first and be refined as more information arrives.

Because culture overlays biology, even the communication of basic feelings becomes less overt as the young mature. Nonetheless, we would argue that it is still all, or nearly all, of what is being directly projected. Indeed, when people interact with each other, strongly present feelings are communicated before anyone says anything. Tone of voice, speaking speed, body language, word choice, and similar cues are all part of this transmission and reception.

Different theories have been advanced for how exactly these feelings are picked up between humans, including the recently popularized discovery of mirror neurons, whose role is not fully understood and which has almost certainly been overstated in popular articles. Whatever the exact mechanisms, what is important is that feelings are clearly being transmitted and received, and that if AI systems are to fully participate in human communication, they will need to simulate their use as well.

 

Salience and Attention

For building AI systems, the fact that only a limited number of feelings can be projected is welcome news. It means that AI systems need components only for detecting these basic feelings, and that attempting to detect complex emotions directly is unlikely to be a fruitful direction. It is also reasonable to assume that a machine learning model trained to recognize the presence of any non-neutral emotional valence can be used to trigger a hand-off to a set of specialty models trained to separate out the basic feelings.
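As a rough illustration of that hand-off, the sketch below structures detection as a two-stage pipeline: an always-on valence detector screens every utterance, and only non-neutral input is passed to per-feeling specialist models. This is a minimal sketch in Python; the valence_model and feeling_models callables, the neutral_band threshold, and the label set (taken from Panksepp's list) are all assumptions standing in for whatever trained models a real system would use.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Panksepp's seven basic affective systems, used here as the label set.
BASIC_FEELINGS = ["seeking", "rage", "fear", "grief", "caring", "lust", "play"]

@dataclass
class FeelingEstimate:
    valence: float            # -1.0 (negative) .. +1.0 (positive)
    scores: Dict[str, float]  # per-feeling scores in 0..1; empty when neutral

def detect_feelings(
    text: str,
    valence_model: Callable[[str], float],              # cheap, always-on detector
    feeling_models: Dict[str, Callable[[str], float]],  # one specialist per basic feeling
    neutral_band: float = 0.2,
) -> FeelingEstimate:
    """Two-stage hand-off: score valence first, and call the specialist
    models only when the input is not emotionally neutral."""
    valence = valence_model(text)
    if abs(valence) < neutral_band:
        return FeelingEstimate(valence=valence, scores={})
    scores = {name: model(text) for name, model in feeling_models.items()}
    return FeelingEstimate(valence=valence, scores=scores)
```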

Once valence is known, it would be helpful to quantify arousal or affect, the degree of excitability being expressed. This is harder to measure from language alone, but there are clues. For example, excited people tend to speak in bursts, use shorter sentences, include exclamations, and use shorter words. They may also use more profanity, or make statements that display cognitive-bias errors.
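Such cues lend themselves to a rough heuristic. The sketch below combines a few of them into a single arousal score; the features and weights are illustrative assumptions rather than fitted values, and a real system would calibrate them, and add others such as a profanity lexicon, against labeled data.

```python
import re

def arousal_score(text: str) -> float:
    """Rough text-only proxy for arousal in 0..1, based on the cues above:
    shorter sentences, shorter words, and exclamations all push the score up."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0

    avg_sentence_len = len(words) / len(sentences)          # words per sentence
    avg_word_len = sum(len(w) for w in words) / len(words)  # characters per word
    exclamations = text.count("!")

    score = 0.0
    score += 0.4 * max(0.0, 1.0 - avg_sentence_len / 20.0)  # burstier speech scores higher
    score += 0.3 * max(0.0, 1.0 - avg_word_len / 8.0)       # shorter words score higher
    score += 0.3 * min(1.0, exclamations / 3.0)             # exclamations score higher
    return min(1.0, score)
```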

From there, the degrees of specific feelings, affect, and valence can be combined with information from the context, the speaker’s age, and the content of the communication to create a proxy for a detected emotion in the AI.
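One way to picture such a proxy is as a simple record that bundles the measured signals with the contextual fields mentioned above. The field names below are hypothetical, chosen only to illustrate what the combination might carry.

```python
from dataclasses import dataclass
from typing import Dict, Optional

@dataclass
class EmotionProxy:
    """A detected-emotion proxy: measured signals plus the context used to interpret them."""
    feeling_scores: Dict[str, float]         # per-basic-feeling scores from the specialist models
    valence: float                           # -1..+1 from the valence detector
    arousal: float                           # 0..1, e.g. from arousal_score()
    topic: Optional[str] = None              # e.g. "work", inferred from the content
    speaker_age_group: Optional[str] = None  # e.g. "adult" or "child"

    def dominant_feeling(self) -> Optional[str]:
        """The basic feeling with the highest score, if any were detected."""
        if not self.feeling_scores:
            return None
        return max(self.feeling_scores, key=self.feeling_scores.get)
```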

Thus, detecting a feeling that turned out to be anxiety from an adult in a statement about work could be used to generate a response such as “You seem anxious about something going on at work. Is there anything I can do to help with that?” Over time, as more emotional data were gathered, more refined outputs could be produced. One caveat is that such models will always need to be culturally specific, but that, too, may be something already being solved within LLMs themselves, as they are now trained on enormous language samples, and those samples are directly representative of culture.

In any case, once a control system is built to use modular subsystems to discover what is emotionally salient in a user’s conversational input, it can use that information to adjust the instructions given to the LLM, essentially directing its attention to those parts of its linguistic vector space from which emotionally appropriate responses can be generated.
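A minimal version of that control step might look like the following sketch, which simply prepends an emotion-aware directive to the LLM's instructions. It assumes the EmotionProxy record sketched above, and generate_reply is a hypothetical stand-in for whatever LLM interface is actually in use.

```python
from typing import Callable

def emotion_aware_instructions(proxy: EmotionProxy, base_instructions: str) -> str:
    """Adjust the LLM's instructions according to what is emotionally salient."""
    feeling = proxy.dominant_feeling()
    if feeling is None:
        return base_instructions
    directive = (
        f"The user appears to be expressing {feeling} "
        f"(valence {proxy.valence:+.2f}, arousal {proxy.arousal:.2f})"
    )
    if proxy.topic:
        directive += f" in connection with {proxy.topic}"
    directive += ". Acknowledge this feeling and respond in an emotionally appropriate way."
    return directive + "\n\n" + base_instructions

def respond(user_text: str, proxy: EmotionProxy,
            generate_reply: Callable[[str, str], str]) -> str:
    """generate_reply(system_prompt, user_text) is a placeholder for the LLM call."""
    system_prompt = emotion_aware_instructions(proxy, "You are a helpful assistant.")
    return generate_reply(system_prompt, user_text)
```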
