Research Article

Meaningful Versus Meaningless Sounds and Words

A False Memories Perspective

Published Online:https://doi.org/10.1027/1618-3169/a000506

Abstract

Abstract. The current study assessed memory performance for perceptually similar environmental sounds and speech-based material after short and long delays. In two studies, we demonstrated a similar pattern of memory performance for sounds and words in short-term memory, yet in long-term memory, the performance patterns differed. Experiment 1 examined the effects of two different types of sounds: meaningful (MFUL) and meaningless (MLESS), whereas Experiment 2 assessed memory performance for words and nonwords. We utilized a modified version of the classical Deese–Roediger–McDermott (Deese, 1959; Roediger & McDermott, 1995) procedure and adjusted it to test the effects of acoustic similarities between auditorily presented stimuli. Our findings revealed no difference in memory performance between MFUL and MLESS sounds, and between words and nonwords after short delays. However, following long delays, greater reliance on meaning was noticed for MFUL sounds than MLESS sounds, while performance for linguistic material did not differ between words and nonwords. Importantly, participants' memory performance for words and nonwords was accompanied by a more lenient response strategy. The results are discussed in terms of perceptual and semantic similarities between MLESS and MFUL sounds, as well as between words and nonwords.

Human memory is not consistently accurate and may lead individuals to remember events differently than they occurred or to remember an event that never happened, thus creating a "false" memory. False memories have been studied extensively using the Deese–Roediger–McDermott (DRM) paradigm (Deese, 1959; Roediger & McDermott, 1995), primarily because high levels of false recall and recognition have been induced under controlled conditions. In the typical DRM task, participants study lists of words that are semantically similar to a related lure. For example, bed, nap, rest, night, and pillow are all semantically similar to the related lure sleep, which is often just as likely to be recalled or recognized during retrieval as the words presented in the list. Several theories explain false memory creation, one of which is fuzzy trace theory (Brainerd & Reyna, 1998, 2002, 2005). Fuzzy trace theory is based on the dual-process theory of memory, asserting two basic memory types: gist memory and verbatim memory. Gist memory, also called gist traces, can be summarized as understanding the bottom-line meaning, or in other words, possessing fuzzy representations of a past event. Verbatim memory, on the other hand, is a detailed and accurate representation of a past event. Both verbatim and gist traces enhance correct recognition and recall, whereas gist traces cause false alarms and false recalls.

Recently, Olszewska et al. (2015) demonstrated that semantic associations are not limited to visual presentations, showing a false memory effect in short-term memory (STM) when semantic word lists were presented auditorily. McBride et al. (2019) showed that semantic associations are not the only form of stimuli that can activate a false memory effect. Indeed, McBride et al. (2019) tested STM for phonologically similar words (e.g., item list sip, gin, sit, and pin with a related critical lure sin) and semantically associated words (e.g., awake, doze, dream, pillow, and snore with a related critical lure sleep), framing the study within the DRM paradigm (p. 4). The results demonstrated a greater false memory effect for phonological material than for semantic material. McBride et al. (2019) argued that because fuzzy trace theory relies on stimuli naturally possessing meaning, it accounts well for semantic memory errors but is limited in explaining phonological false memories. Holliday and Weekes (2006) similarly proposed the term phonological gist, as it more accurately captures the elaboration of encoded targets that are phonologically related to the critical lure. The present research sought to extend the literature on memory distortions by examining false memories produced by acoustically related auditory stimuli with both sounds and linguistic material, using a modified DRM procedure in both STM and long-term memory (LTM).

The majority of studies on auditory memory have used linguistic (Baddeley, 1966; Baddeley et al., 1975; Greene & Pearlman, 1996; Roediger & McDermott, 1995) or musical (Baird & Samson, 2009; Cuddy & Duffin, 2005; Peretz & Coltheart, 2003) stimuli. Notably, several studies that tested participants auditorily examined either STM or LTM separately, since the properties of these two stores are distinct. Multistore models of memory (Atkinson & Shiffrin, 1968) typically attribute differences in performance across stores to STM relying on acoustic encoding, whereas LTM is driven by the meaning of stimuli. However, a unitary account conceptualizes working memory as the activated portion of LTM (Cowan, 1999) and shows that short- and long-term remembering are mediated by overlapping components. Moreover, other unitary theories assume that memory traces consist of a constellation of features that may differ in their accessibility and effectiveness as retrieval cues over time (e.g., Jonides et al., 2008; Nairne, 2002).

A hybrid task was recently developed to test both STM and LTM false memories toward visually presented semantic material (Flegal et al., 2010). This task was later extended to test memory for semantically related verbal material presented in the auditory modality (Olszewska et al., 2015). This unique procedure allowed for testing STM and LTM from the false memory perspective and under equivalent encoding conditions while preserving the definitions of these two memory stores. To our knowledge, this hybrid task has only been used to examine semantically associated verbal material.

Although many studies report findings on memory for words and nonwords (Gardiner & Java, 1990; Gathercole, 2006; Hulme et al., 1991; McKone, 1995; Papagno & Vallar, 1992; Saint-Aubin & Poirier, 2002; Vitevitch & Luce, 1999), studies on environmental sounds remain a largely unexplored field (Bartlett, 1997; Bower & Holyoak, 1973; Chiu & Schacter, 1995; Cycowicz & Friedman, 1999; Delogu et al., 2009; Hendrickson et al., 2015). Moreover, linguistic material is usually tested separately from nonlinguistic sounds (for an exception, see Orgs et al., 2006). However, auditorily presented words and environmental sounds can be compared in terms of their similarities and differences (see Hendrickson et al., 2015).

The most apparent difference between words and sounds is the lack of a linguistic component in sounds. For example, Chiu and Schacter (1995) stated that environmental sounds are remembered less well than speech because speech can be abstracted away from the auditory stimulus while retaining the semantic content, whereas memory for environmental sounds is more explicitly bound to the details of the waveform. This idea supports the notion that one of the main differences between words and environmental sounds relates to semantics. More specifically, environmental sounds do not contain a linguistic component; instead, they acquire meaning only through a causal relationship with an event or object that may produce the appropriate sound. For example, the sound of a toilet flush is associated with a toilet and with the act of flushing. In sum, this implies a poorer lexicon for sounds than for words, which have a direct link of reference (Ballas & Howard, 1987). Moreover, it should be stressed that some sounds, although produced by the everyday environment, may not possess a lexicon and thus lack meaning. How do people solve the dilemma of recognizing ambiguous stimuli? Interestingly, individuals attempt to name the sound, which is consistent with "top-down" processing. This notion is also consistent with a Gestalt approach to sounds and words: people naturally stress the role of organization in memory and perception (Koffka, 1935). The implication is that our experience of reality is not an exact depiction of real events but the result of a process of "regularization," in line with the reconstructive approach to memory (Bartlett, 1932), in which facts retrieved from memory are loaded with expectations and pre-existing knowledge. For example, presented figures are usually remembered as being more regular than they are; hence, an irregular quadrilateral may be remembered as a square or a rectangle. The same may apply to sounds that lack meaning: people may try to remember something similar to what was presented in order to organize their representation.

Conversely, other studies show that words and environmental sounds share some characteristics. For example, sounds may activate meaning that is associated with a referent (Ballas & Howard, 1987). This implies that words and sounds may be processed similarly, as found in studies revealing that semantically congruent words or pictures primed environmental sounds, and vice versa (Ballas, 1993; Chen & Spence, 2011; Schneider et al., 2008; Stuart & Jones, 1996; Özcan & Egmond, 2009). Similarities in processing these two types of stimuli were confirmed in studies using electrophysiological measures (Cummings et al., 2006; Frey et al., 2014; Orgs et al., 2006; Schön et al., 2010) as well as functional imaging (Leech & Saygin, 2011; Thierry et al., 2003; Tranel et al., 2003). It is established that STM performance decreases for phonologically similar words compared to words that sound different (Baddeley, 1966; Conrad, 1964; Conrad & Hull, 1964; Hintzman, 1965), a phenomenon known as the phonological similarity effect (Conrad & Hull, 1964). However, this phenomenon has most often been tested with meaningful (MFUL) material using serial recall (e.g., Fallon et al., 1999). Some studies have examined memory for ambiguous stimuli and found generally worse recall compared to MFUL material (Hulme et al., 1991, 1995) but attributed this difference to LTM mechanisms supporting the processing of MFUL verbal material. Yet, to date, the effects of acoustic similarities within meaningless (MLESS) linguistic material on memory remain largely unexplored.

Although the phonological similarity effect (Conrad & Hull, 1964) is a well-established phenomenon, to our knowledge, no studies have examined an acoustic similarity effect for environmental sounds and linguistic material from a false memory perspective. Moreover, the studies cited above that compared performance for sounds and words highlighted similarities in how the two are processed (e.g., Cummings et al., 2006). Therefore, we sought to further explore memory for linguistic material and sounds, framing our study within the false memory paradigm.

The Present Studies

The present investigation sought to extend the literature on false memories, emphasizing the auditory modality in both STM and LTM and acoustic similarities between stimuli rather than semantic similarities. More specifically, the main objective was to compare STM and LTM performance for auditorily presented material in the form of MFUL versus MLESS sounds (Experiment 1) as well as words and nonwords (Experiment 2) within the false memory perspective. MFUL stimuli can be summarized as material that is easily and instantaneously recognizable; in contrast, MLESS stimuli are ambiguous and lack identifiable characteristics. To accomplish this, we further modified the hybrid gist-and-verbatim-trace task (Flegal et al., 2010) and adjusted it to test memory for two categories of auditory material with acoustically similar items: environmental MFUL and MLESS sounds, and words and nonwords.

Our predictions will be presented for STM and LTM separately and in relation to memory performance defined as the ability to discriminate between studied and unstudied material related to sounds and linguistic stimuli.

Short-Term Memory

We predict that when presented with sounds, participants' memory performance for studied items and for related as well as unrelated lures should not differ between MFUL and MLESS material due to easy access to the acoustic features of the sounds (Chiu & Schacter, 1995), which can be acknowledged as durable and perceptually rich (Corballis, 1966; Crowder & Morton, 1969; Penney, 1989). Participants' ability to discriminate between studied and nonstudied stimuli should be equivalent for MFUL and MLESS sounds.

In terms of linguistic material in STM, we may observe a better ability to discriminate between studied and nonstudied items for words than for nonwords due to the contribution of LTM, as demonstrated in the past literature (Atkins & Reuter-Lorenz, 2008; Coane et al., 2007). However, these studies presented word stimuli visually. Importantly, other studies discovered that the effects of LTM contribution, although present, were less robust when semantic material was presented auditorily (Olszewska et al., 2015), suggesting that the auditory code is strong and salient (Penney, 1989). In the current study, the linguistic material consists of auditorily presented and acoustically similar words that possess differing semantic meanings, which, to some extent, may activate the semantic network (Graf & Schacter, 1985). Nevertheless, if activation of the semantic network does occur, it may not be as effective as when a series of semantically related words is displayed. Consistent with this, Papagno and Vallar (1992) suggested that the semantic contribution to learning words increases with time; hence, in our case, the meaning of stimuli might have a limited impact, as each STM trial does not exceed 10 s. On the other hand, it is also possible that recognition will be based primarily on acoustic information, producing similar rates of discriminability between studied and nonstudied words and nonwords.

Long-Term Memory

In LTM, retrieval is based on meaning; thus, we should observe more correct responses to MFUL studied items and more errors to MFUL related lures than to MLESS items. This reasoning should apply particularly to sounds since, in addition to their acoustic similarity, sound-based stimuli also belong to the same semantic category (e.g., four acoustically different cats' meows). MLESS sounds do not permit the creation of categories; therefore, we may observe fewer correct responses to studied sounds and fewer errors to related lures. Consequently, the ability to discriminate between studied, nonstudied, and related sounds may be similar for MFUL and MLESS sounds, yet a different response bias may be noticed. More specifically, recognizing MFUL stimuli should be accompanied by a more liberal response strategy.

In terms of linguistic material, a hypothesis may be considered from two perspectives. In the current study, words are acoustically similar; however, their meanings differentiate them. In other words, acoustically similar words cannot be classified within the same semantic category. Nonwords do not convey any meaning and do not immediately activate any lexical representation (Gathercole, 2006) as words do (Papagno & Vallar, 1992). If this is the case, it can be assumed that nonwords will be processed deeply due to their bizarreness (see Einstein & McDaniel, 1987), making them distinct and salient in memory. Moreover, nonwords are less fluent than words, requiring greater cognitive engagement (Oppenheimer, 2008). Distinctiveness, in conjunction with in-depth processing strategies, may result in more correct responses to studied nonwords and fewer errors to related nonwords as compared to words.

On the other hand, due to the prior existence of words in the participant's lexicon, meaning may be generated for nonwords (Schweickert, 1993; see also Arndt et al., 2008). As a result, multiple nonwords may be grouped as one word during encoding and, later at retrieval, participants' recognition will be based on phonological similarities and self-generated meaning. This suggests that in the nonword condition, participants' memory may be less loaded due to grouping nonwords into a one-word category serving as a retrieval cue at the time of recognition. Correspondingly, this could lead to a reliance on meaning, which could manifest in more correct responses to studied nonwords and more errors to related nonwords. In comparison, participants' memory for words would display fewer correct responses to studied stimuli in conjunction with fewer errors to related stimuli.

Both scenarios should reveal no difference between words and nonwords in the ability to discriminate between studied and nonstudied stimuli; however, in the second scenario, we expect a more liberal response bias for nonwords than for words due to less loaded memory.

According to fuzzy trace theory (Brainerd & Reyna, 1998, 2002, 2005), participants should show comparable reliance on verbatim and gist traces for both types of stimuli (sounds and linguistic material) in STM. In LTM, performance for MFUL sounds should reveal greater reliance on gist than for MLESS sounds due to the presence of meaning and generalization. Gist-based memory should be weaker for MLESS sounds, as categorization is exceptionally limited. In terms of linguistic material, verbatim and gist traces should be weak for both words and nonwords.

Experiment 1

Because environmental sounds are remembered differently and less well than speech (Chiu & Schacter, 1995), Experiment 1 sought to examine the effect of everyday sounds on memory performance. The goal was to test memory for perceptually similar MFUL and MLESS environmental sounds. To do this, we utilized a modified hybrid task of the DRM procedure (Flegal et al., 2010) to test both STM and LTM. The experiment used a 2 × 2 within-subjects design, with factors of memory delay (short vs. long) and level of sound meaning (MFUL vs. MLESS).

Method

Participants

Thirty-four undergraduates attending the University of Wisconsin-Oshkosh (21 female and 13 male, M age = 19.03, SD = 1.05) participated for course credits.

Materials

MFUL and MLESS stimuli consisted of sounds from 80 different categories taken from a freely downloadable online database (www.findsounds.com), each category consisting of highly similar concepts (e.g., category "dog," in which all stimuli were similar barking sounds produced by dogs). Each category contained five related sounds, resulting in 400 different but acoustically associated sounds. Each environmental sound file was identical in length (1,000 ms). For MFUL items, we were guided by Delogu et al.'s (2009) categories of environmental sounds and selected sounds from the categories rated in Delogu et al. (2009) as most easily identifiable. MLESS stimuli were gathered independently by research assistants and later discussed with the first, second, and third authors. Any disagreements about MLESS sounds were settled by relistening to the sounds together until agreement was reached.

To further ensure that the environmental sounds were easily associated with each other, 10 undergraduate students were presented with all 80 categories and asked to rate the similarity of the sounds. All dissimilar items were reconsidered and replaced by sounds of increased similarity, which allowed us to confirm the perceived resemblance of the sounds. The students then collectively relistened to all sounds and placed each stimulus in one of two larger groupings: MFUL or MLESS. Ten additional undergraduate students were then recruited to review the MFUL and MLESS sounds and asked to confirm or reject the categorization (e.g., all confirmed that "baby crying" belonged in the MFUL category). Sounds assigned to the MFUL or MLESS categories were then evaluated in a pilot study with 12 additional undergraduate students, who rated each sound on the question "How easily do you recognize what the sound is?" using a scale of 1–5, with 1 representing "extremely difficult to recognize" and 5 representing "very easy to recognize." MFUL sounds received M = 4.48, SD = 0.93, and MLESS sounds M = 2.95, SD = 1.32. Sounds previously categorized as MFUL were rated higher on the given scale than MLESS sounds, t(11) = 10.64, p < .001, d = 3.06, meaning that the meaning of the former (MFUL) sounds was easier to recognize.
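The pilot comparison above is a paired-samples t test with Cohen's d computed on the per-rater differences (one common convention for paired designs). A minimal sketch of those computations follows; the ratings used here are made-up illustrative numbers, not the values collected in the pilot study.

```python
import math
from statistics import mean, stdev

def paired_t_and_d(ratings_a, ratings_b):
    """Paired-samples t statistic and Cohen's d for paired ratings.

    t = mean(diff) / (sd(diff) / sqrt(n)); d = mean(diff) / sd(diff),
    where diff is each rater's difference between the two conditions.
    """
    diffs = [a - b for a, b in zip(ratings_a, ratings_b)]
    n = len(diffs)
    m, sd = mean(diffs), stdev(diffs)  # stdev uses n - 1 (sample SD)
    t = m / (sd / math.sqrt(n))
    d = m / sd
    return t, d

# Hypothetical mean recognizability ratings from five raters
# (MFUL vs. MLESS) -- NOT the study's data.
mful = [4.6, 4.2, 4.8, 4.5, 4.3]
mless = [3.0, 2.8, 3.4, 2.9, 2.7]
t, d = paired_t_and_d(mful, mless)
```

The degrees of freedom, n − 1, match the reported t(11) when n = 12 raters contribute one pair of condition means each.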

Procedure

The STM phase included two blocks (MFUL and MLESS). Each block contained 30 trials (60 trials in total), and the STM phase lasted approximately 11 min. For both blocks, participants were instructed to wear headphones. Two practice trials preceded each block. Each trial began with a 500-ms auditory beep followed by a series of four sounds (either MFUL or MLESS, depending on the block) presented at a rate of approximately one per second, with a silent 500-ms interstimulus interval. After the last stimulus, a 500-ms interval elapsed before a math equation appeared for 3,000 ms; participants were asked to press "M" on the keyboard if the equation was correct or "Z" if it was incorrect. This distractor task was based on the operation span task (Turner & Engle, 1989), which is widely utilized as a measure of working memory capacity (Reuter-Lorenz & Jonides, 2007); a similar distractor task was used previously in studies that applied the same hybrid task (Flegal & Reuter-Lorenz, 2014; Flegal et al., 2010; Olszewska et al., 2015). Next, for half of the trials, a fifth sound was presented, and participants had 3,000 ms to respond. Probe types for the fifth sound were as follows: (1) a studied item from the immediately preceding list, (2) a lure acoustically similar to the items studied in the immediately preceding list, or (3) a lure unrelated to the items studied in the immediately preceding list. If participants had heard the sound previously, they were asked to press "M"; if they had not, they were asked to press "Z." For the other half of the trials, participants heard a quick double beep and were instructed to merely press either response key to proceed to the next trial. This double beep served as a notification to press any key arbitrarily and was not perceived as similar to any of the sounds presented within the experiment. These trials were probed later in the LTM recognition test.

Figure 1 presents the design and procedures (example study and test items are the words and nonwords used in Experiment 2).

Figure 1

Experiments 1 and 2 design. (A) In Experiment 1, MFUL (meaningful) and MLESS (meaningless) sounds were used instead of words and nonwords. (B) After the short-term memory part was completed, instructions for the long-term part were given. LTM = long-term memory.

A break of around 2 min followed the completion of the last STM trial. Next, participants were given instructions for the LTM recognition test (Flegal et al., 2010); the LTM phase lasted approximately one to two and a half minutes. Participants were presented with 40 trials (30 of which tested memory sets that were not probed at STM) and were instructed to decide whether they had heard each sound during the STM phase. The 30 trials were tested in separate blocks (15 MFUL and 15 MLESS). If participants had heard the sound previously, they were instructed to press "M" on the keyboard; if not, to press "Z." In addition to these 15 MFUL or 15 MLESS list test probes (five each of the three probe types), five MFUL and five MLESS sounds that had been studied during STM trials were included to equate the number of test items that had been studied versus nonstudied. These 10 trials were included to match the proportions of yes and no responses in the LTM test and were not analyzed.

Results

To identify sensitivity and response bias, we performed signal detection analyses using dʹ as an estimate of sensitivity and C as an estimate of bias (Table 1).

Table 1 Measures of sensitivity (d′) and response bias (C) of acoustically similar MFUL and MLESS sounds for item-specific memory (studied vs. critical lures and studied vs. nonrelated lures) and for memory for gist (critical lures vs. nonrelated lures)

Following other studies (e.g., Arndt, 2010; Koutstaal & Schacter, 1997), to thoroughly test sensitivity, dʹ and C values were calculated for three conditions: (1) sensitivity comparing hits to critical lures (related sounds; a measure of item-specific memory), (2) sensitivity comparing hits to unrelated lures (also a measure of item-specific memory), and (3) sensitivity comparing critical lures to unrelated lures (false alarms to critical lures are treated as a form of gist-like memory and, thus, are treated as hits).
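The dʹ and C estimates used throughout follow the standard signal detection formulas dʹ = z(hit rate) − z(false alarm rate) and C = −[z(hit rate) + z(false alarm rate)] / 2. A minimal sketch of these computations from raw response counts is shown below; the log-linear correction for extreme rates is one common choice, not necessarily the one used in this article.

```python
from statistics import NormalDist

def d_prime_and_c(hits, signal_trials, false_alarms, noise_trials):
    """Sensitivity (d') and response bias (C) from raw response counts.

    A log-linear correction (add 0.5 to each count, 1 to each trial
    total) keeps the z-scores finite when a rate would be 0 or 1.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    hit_rate = (hits + 0.5) / (signal_trials + 1)
    fa_rate = (false_alarms + 0.5) / (noise_trials + 1)
    d_prime = z(hit_rate) - z(fa_rate)
    c = -(z(hit_rate) + z(fa_rate)) / 2  # negative C = liberal bias
    return d_prime, c
```

For the gist-based analysis (condition 3 above), false alarms to critical lures take the role of hits and false alarms to unrelated lures the role of false alarms.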

Item-Specific Memory (Hits Compared to Critical Lures)

A 2 (delay: short vs. long) × 2 (level of meaning: MFUL vs. MLESS) repeated-measures ANOVA revealed a main effect of delay in sensitivity associated with discriminating targets (studied items) from critical lures, F(1, 33) = 66.66, p < .001, indicating greater sensitivity in STM (M = 1.58; SD = 0.13) than in LTM (M = .28; SD = 0.12). Neither a main effect of the level of meaning, F(1, 33) = 1.46, p = .236, nor an interaction between the level of meaning and delay, F(1, 33) = 0.05, p = .817, was significant.

However, the same ANOVA performed on response bias C revealed a main effect of the level of meaning, F(1, 33) = 7.42, p = .010, indicating a more liberal style of responding for MFUL sounds (M = −.40; SD = 0.07) than for MLESS sounds (M = −.16; SD = 0.07), and an interaction between delay and the level of meaning, F(1, 33) = 45.82, p < .001. A main effect of delay was not significant, F(1, 33) = 1.66, p = .206. Bonferroni post hoc analyses revealed no difference in response bias between the MFUL and MLESS conditions in STM (p = .106). However, in LTM, participants were more liberal in the MFUL than in the MLESS condition (p < .001).

Item-Specific Memory (Hits Compared to Unrelated Lures)

A 2 (delay: short vs. long) × 2 (level of meaning: MFUL vs. MLESS) repeated-measures ANOVA on the dʹ measure revealed a main effect of delay, F(1, 33) = 77.57, p < .001, indicating greater sensitivity in STM (M = 2.75; SD = 0.13) than in LTM (M = 1.21; SD = 0.14), a marginal main effect of the level of meaning, F(1, 33) = 3.78, p = .061, and an interaction between delay and level of meaning, F(1, 33) = 14.05, p < .001. A Bonferroni post hoc analysis showed no difference in discriminating studied sounds from unrelated sounds in STM between MFUL and MLESS stimuli (p = .99). However, better discrimination was noticed in LTM for MFUL sounds than for MLESS sounds (p = .002). Moreover, the ability to discriminate MFUL studied sounds from MFUL unrelated sounds dropped from STM to LTM (p < .001), as it did for MLESS sounds (p < .001).

The same ANOVA on response bias C showed only an interaction between level of meaning and delay, F(1, 33) = 14.71, p < .001. Neither a main effect of delay, F(1, 33) = 0.035, p = .853, nor a main effect of the level of meaning, F(1, 33) = 0.012, p = .914, was significant. Bonferroni post hoc analyses showed that in STM participants were more conservative when responding to MFUL than to MLESS sounds (p = .049). In LTM, no difference in style of responding was noticed (p = .082).

Gist-Based Memory (Critical Lures Compared to Unrelated Lures)

A 2 (delay: short vs. long) × 2 (level of meaning: MFUL vs. MLESS) repeated-measures ANOVA on the dʹ measure revealed a main effect of the level of meaning, F(1, 33) = 11.76, p = .002, indicating greater sensitivity for MFUL sounds (M = 1.29; SD = 0.10) than for MLESS sounds (M = 0.81; SD = 0.10), and an interaction between the level of meaning and delay, F(1, 33) = 18.74, p < .001. A main effect of delay was not significant, F(1, 33) = 2.45, p = .126. Bonferroni post hoc tests revealed that in LTM participants relied on gist more when they studied MFUL sounds than MLESS sounds (p < .001), whereas no differences were present after a short delay (p = .99). The MLESS LTM condition was least sensitive to gist and differed from the other conditions: MFUL STM (p = .003), MFUL LTM (p < .001), and MLESS STM (p = .001).

The same ANOVA on response bias C showed a main effect of delay, F(1, 33) = 96.07, p < .001, showing a more conservative response strategy in STM (M = 1.04; SD = 0.07) than in LTM (M = 0.38; SD = 0.07), and an interaction between delay and the level of meaning, F(1, 33) = 12.16, p = .001. A main effect of the level of meaning was not significant, F(1, 33) = 1.35, p = .253. Bonferroni post hoc analyses revealed that in STM a strongly conservative strategy of responding did not differ between the MFUL and MLESS conditions (p = .862). Moreover, the least conservative strategy was used in the MFUL condition in LTM, and this differed from the MLESS LTM condition (p = .010).

Discussion

The results demonstrated no differences between MFUL and MLESS sounds in STM when discriminating between the different types of sounds, which suggests a perceptual rather than meaning-based analysis of the stimuli. When participants discriminated between studied and unrelated sounds in LTM, greater discrimination ability was noticed for MFUL sounds than for MLESS sounds. In addition, the ability to discriminate between sounds dropped from STM to LTM for both types of stimuli. Overall, this suggests a shift toward a meaning-based analysis of stimuli and decay of the acoustic memory trace after a longer delay.

An additional analysis of response bias showed a less conservative strategy for MFUL sounds than for MLESS sounds in LTM when the discrimination task contained critical lures. This may suggest greater reliance on gist when MFUL stimuli were examined.

Experiment 2

In Experiment 2, we sought to examine whether the same pattern would occur for linguistic material of a similar nature: MFUL (words) and MLESS (nonwords). If an acoustic component is processed prior to meaning, a performance pattern similar to that displayed for sounds in STM should be observed. However, in the nonword condition in LTM, it is anticipated that participants may group nonwords and generate a meaning that is close to a word in their lexicon. Therefore, at retrieval, participants may be cued by the meaning they individually generated, which will lead to more correct responses to studied items and more errors to related lures; in sum, this would result in a poor ability to discriminate between studied and nonstudied nonwords. On the other hand, nonwords may be processed more thoroughly, which would enhance item-specific encoding and lead to a mirror effect (Glanzer & Adams, 1990). As a result of deeper processing, more correct responses to studied items and fewer errors to related lures may be observed, resulting in a strong ability to discriminate between studied and nonstudied stimuli.

As in Experiment 1, we used a 2 × 2 within-subjects design, with factors of memory delay (short vs. long) and linguistic material (words vs. nonwords).

Method

Participants

Thirty-two undergraduates from the University of Wisconsin-Oshkosh (18 female and 14 male, M age = 19.34, SD = 1.79) participated for course credits.

Materials

Words

Acoustically similar words were generated from the English Lexicon Project database at elexicon.wustl.edu (see Balota et al., 2007). Words were chosen with a neighborhood density of eight or more and were grouped into 30 lists of four words, each with one lure word (e.g., first word = sage, followed by wage, wade, page, and wait).

Nonwords

Acoustically similar nonword lists were generated from the same database as the words and followed the same parameters (e.g., first word = fippo followed by fimo, foppo, mippo, and fitho). Moreover, all nonwords followed a variant of the English construction patterns of consonant–vowel–consonant or vowel–consonant–vowel.

In addition, we created lists of unrelated words and nonwords, randomly chosen from database-generated lists that had not been used in the related lists.

Procedure

We followed the same procedure as in Experiment 1, except instead of acoustically related sounds, participants encoded acoustically related linguistic stimuli (Figure 1).

Results

As in Experiment 1, sensitivity and response bias were analyzed in three different ways (Table 2).

Table 2 Measures of sensitivity (d′) and response bias (C) of acoustically similar words and nonwords for item-specific memory (studied vs. critical lures and studied vs. nonrelated lures) and for memory for gist (critical lures vs. nonrelated lures)
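For readers less familiar with signal detection measures, dʹ and C are standard transformations of hit and false-alarm rates (here, "hits" to studied items versus "false alarms" to critical or unrelated lures). The following is a minimal illustrative sketch of how such measures are computed, not the authors' analysis code, and the example rates are hypothetical:

```python
from statistics import NormalDist

def sensitivity_and_bias(hit_rate, fa_rate):
    """Compute d' (sensitivity) and C (response bias) from a hit rate
    and a false-alarm rate via the inverse normal CDF (z-transform)."""
    z = NormalDist().inv_cdf
    z_hit, z_fa = z(hit_rate), z(fa_rate)
    d_prime = z_hit - z_fa        # larger = better discrimination
    c = -0.5 * (z_hit + z_fa)     # positive = conservative, negative = liberal
    return d_prime, c

# Hypothetical example: 80% hits to studied items, 20% false alarms to lures
d, c = sensitivity_and_bias(0.80, 0.20)
```

With these hypothetical rates, dʹ ≈ 1.68 and C = 0 (an unbiased criterion); negative C values, such as those reported for nonwords, indicate a more liberal "yes" criterion.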

Item-Specific Memory (Hits Compared to Critical Lures)

A 2 (delay: short vs. long) × 2 (type of stimuli: words vs. nonwords) repeated-measures ANOVA on the dʹ measure revealed a main effect of delay on sensitivity in discriminating targets (studied items) from critical lures, F(1, 31) = 22.42, p < .001, showing better discrimination ability in STM (M = 1.55; SD = 0.18) than in LTM (M = .55; SD = 0.11). Neither the main effect of type of stimuli, F(1, 31) = 0.21, p = .647, nor the interaction, F(1, 31) = 0.12, p = .734, was significant.

The same ANOVA on the corresponding measure of response bias C revealed a main effect of delay, F(1, 31) = 10.34, p = .003, indicating a more liberal response style in STM (M = −.29; SD = 0.09) than in LTM (M = .08; SD = 0.07); a main effect of type of stimuli, F(1, 31) = 4.41, p = .044, indicating a more liberal style of responding for nonwords (M = −.21; SD = 0.07) than for words (M = −.00; SD = 0.08); and an interaction, F(1, 31) = 5.94, p = .021. Bonferroni post hoc tests showed that, when recognizing words in LTM, participants employed the most conservative criterion compared to the other conditions (words STM, nonwords STM, and nonwords LTM), in which they were more liberal (p < .001; p < .001; p = .008, respectively).

Item-Specific Memory (Hits Compared to Unrelated Lures)

A 2 (delay: short vs. long) × 2 (type of stimuli: words vs. nonwords) repeated-measures ANOVA on the dʹ measure revealed only a main effect of delay, F(1, 31) = 84.61, p < .001, indicating greater sensitivity in STM (M = 2.69; SD = 0.12) than in LTM (M = 1.07; SD = 0.12). Neither the main effect of type of stimuli, F(1, 31) = 1.91, p = .178, nor the interaction, F(1, 31) = 0.001, p = .973, was significant.

The same ANOVA on the C measure showed an interaction between delay and type of stimuli, F(1, 31) = 7.92, p = .008. Neither the main effect of type of stimuli, F(1, 31) = 0.94, p = .339, nor the main effect of delay, F(1, 31) = 0.29, p = .591, was significant. Bonferroni post hoc tests showed no differences in response style among conditions (all ps > .10).5

Gist-Based Memory (Critical Lures Compared to Unrelated Lures)

A 2 (delay: short vs. long) × 2 (type of stimuli: words vs. nonwords) repeated-measures ANOVA on the dʹ measure showed a main effect of delay, F(1, 31) = 11.72, p = .002, indicating greater reliance on gist in STM (M = 1.15; SD = 0.15) than in LTM (M = .52; SD = 0.10). Neither the main effect of type of stimuli, F(1, 31) = 2.31, p = .139, nor the interaction, F(1, 31) = .14, p = .710, was significant.

Response bias C also revealed a main effect of delay, F(1, 31) = 23.99, p < .001, indicating a less conservative strategy in LTM (M = .61; SD = 0.07) than in STM (M = 1.05; SD = 0.08). The interaction between delay and type of stimuli was only marginal, F(1, 31) = 4.03, p = .053, and the main effect of type of stimuli, F(1, 31) = 1.21, p = .280, was not significant.

Discussion

Experiment 2 showed that memory performance for words and nonwords did not differ in STM, suggesting reliance on acoustic features rather than on meaning. In LTM, memory performance associated with discriminating studied from nonstudied items decreased for both words and nonwords, indicating delay-related decay of the memory trace. Response bias proved more conservative for words than for nonwords in LTM when the discrimination task contained studied words.

General Discussion

In two experiments, we tested memory performance for acoustic material in the form of environmental MFUL and MLESS sounds and linguistic stimuli in the form of words and nonwords. Moreover, we examined immediate and delayed recognition performance. The study was framed within the false memories paradigm with a focus on acoustic similarities between stimuli and, to our knowledge, is the first that tested memory for sounds from this perspective.

To define commonalities and differences between processing of sounds and words, we will discuss STM and LTM separately while focusing on memory patterns for sounds and for linguistic material, followed by analyzing the relation between the two.

Sounds

STM

We hypothesized that memory performance for MFUL and MLESS sounds in STM should be based primarily on perceptual, acoustic features. The results from Experiment 1 showed no significant differences in memory performance between MFUL and MLESS sounds for either verbatim or gist traces (Brainerd & Reyna, 2002). The comparable memory performance for MFUL and MLESS sounds is consistent with our prediction that participants focused on the auditory characteristics of stimuli before assigning meaning. These results are also in line with prior research (e.g., Crowder & Morton, 1969) showing that auditorily presented information is durable in STM and that retrieval is based more on perceptual characteristics of stimuli than on meaning. Indeed, Penney (1989) reported that the auditory stream encodes the acoustic code involuntarily and that this code can persist in STM for up to a minute. Our findings for environmental sounds support the strong perceptual properties of auditory material. Although previous studies reported that auditorily presented verbal stimuli enhanced memory performance due to a perceptually rich auditory code, our results show that, to some extent, nonphonological auditory information is also stored and processed in working memory based mostly on perceptual properties rather than on meaning. While there are only a few studies on memory for environmental sounds, they showed that bimodal encoding was superior to single-modality encoding (Delogu et al., 2009). This implies that during single-modality encoding, other formats were not automatically activated, even for MFUL material. The benefit of bimodal encoding may suggest that participants focused primarily on perceptual features of stimuli, which accords with our findings. It is important to note that strong conclusions should not be drawn, as there are many procedural differences between Delogu et al.'s (2009) study and ours.
Thus, this aspect of semantic activation should be tested more thoroughly under similar procedural conditions since, as stressed by Delogu et al. (2009), the field of environmental sounds remains largely unexplored.

LTM

Our predictions concerned meaning-based retrieval in LTM; therefore, we expected more correct responses to studied sounds and more false alarms to related sounds for MFUL stimuli. We anticipated the opposite pattern for MLESS stimuli, with fewer correct responses to studied sounds and fewer false alarms to related sounds. This anticipated pattern for MFUL and MLESS stimuli would be reflected in a lack of differences in discriminability between MFUL and MLESS sounds; however, we hypothesized that it should surface as a different response style, with a more liberal strategy toward MFUL stimuli. Indeed, there was no difference in discriminating MFUL and MLESS studied sounds from MFUL and MLESS related sounds in LTM. However, differences emerged when participants discriminated between studied and nonrelated sounds, with greater accuracy for MFUL stimuli. The difference was also significant when participants discriminated between related and unrelated sounds, making memory for MFUL sounds more gist-based. This suggests that, after a delay, participants processing MFUL sounds activated relevant categories (e.g., meow or cry) and based their decisions mostly on meaning. This is supported by the aforementioned low dʹ for veridical memory (MFUL studied vs. MFUL related sounds), indicating that it was difficult for participants to perceive a difference between similar sounds. Importantly, a large dʹ for gist memory (MFUL related sounds vs. MFUL nonrelated lures) implies that participants could easily identify a difference between the two kinds of nonstudied sounds (related and unrelated), one of which (the related sound) belonged to the previously activated category, further supporting this explanation.
High discriminability between MFUL studied sounds and MFUL nonrelated sounds also indicates that the process of categorization increased the ability to distinguish environmental stimuli as one of these sounds belonged to a distinct category. Furthermore, when nonstudied items were not related to any previously studied categories, they were easily correctly rejected.

Performance for MLESS sounds was characterized by low dʹ for both veridical and gist-based memory. This difference from the pattern for MFUL sounds indicates that retrieval was based not on meaning but on perceptual properties. Even if participants attempted to identify the sounds and assign them meaning, they were unsuccessful; this reasoning is supported by the low ability to discriminate between studied and nonrelated sounds.

Overall, after a delay, the results imply that MFUL sounds were retrieved based on meaning, demonstrating strong evidence of false memory based on gist representation (Brainerd & Reyna, 1998, 2002, 2005). Moreover, it should be stressed that memory performance for MFUL sounds was accompanied by a more liberal style of responding, whereas for MLESS sounds participants were significantly more conservative in their decisions. This is in line with typical human behavior, in which greater experience is paired with greater confidence.

Linguistic Material

STM

Regarding linguistic material, we proposed two possible effects in processing words and nonwords, with the greater expectation that no differences would be observed due to the auditory presentation of stimuli. The results confirmed this prediction: no difference in memory performance between words and nonwords was revealed. Such results are consistent with the well-established advantage for verbal STM when the input modality is auditory (Penney, 1989). We therefore propose that participants focused their auditory attention on the signal's relevant distinguishing characteristics, that is, its acoustic features. The features of each word (or nonword) are highly durable and account for fairly good discrimination between studied and nonstudied stimuli. Moreover, the pattern we revealed for linguistic stimuli is the same as that demonstrated for sounds, further supporting our prediction that in STM greater reliance is placed on perceptual properties than on meaning. This reasoning was also supported by the lack of difference in response styles between words and nonwords, suggesting that similar processes were engaged in the MFUL and MLESS tasks.

LTM

Our predictions were twofold. On the one hand, we expected better performance for studied nonwords and fewer errors to related nonwords than to words. On the other hand, we did not exclude better performance for studied nonwords accompanied by more errors to related nonwords than to words, caused by grouping several acoustically similar nonwords and assigning them a single meaning, which would result in treating the self-generated meaning as a retrieval cue. In both cases we predicted no difference in discriminability; however, in the second scenario we expected a more liberal style of responding for nonwords.

As expected, the LTM results showed that the ability to discriminate studied from nonstudied stimuli did not differ between words and nonwords. At first glance, this suggests no change in performance for linguistic material after a delay, regardless of whether the material was MFUL or MLESS. However, response bias allows for a further explanation. When discriminating between studied and nonstudied (related and nonrelated) stimuli, participants demonstrated increased leniency when the material consisted of nonwords. Why would participants respond more leniently to MLESS linguistic material than to stimuli with obvious meaning? An explanation lies in the human lexicon, which may allow a word to be constructed from nonwords due to their acoustic similarity to that word (Schweickert, 1993). It is highly probable that while encoding acoustically similar nonwords (e.g., fimo, foppo, mippo, and fitho), participants tended to label them by generating the word from their lexicon that was closest in sound to the studied nonwords (e.g., hippo). Later, at retrieval, we suspect that participants based their decisions not only on phonological similarities but also on this self-generated meaning. Put differently, in the nonword condition, participants' memory may have been less loaded because the nonwords converged on one word that shared their acoustic features and additionally carried meaning; consequently, participants may have relied on that meaning. For words, participants also relied on meaning; however, because each item was meaningful, participants' memory was loaded with more information.
To some extent, this is also consistent with Holliday and Weekes (2006), who demonstrated that reliance on phonological gist declines with age, suggesting that retrieval may have been based on more careful source monitoring. In our study, this was manifested in a more conservative style of responding for words than for nonwords when participants discriminated between studied and related stimuli, further supporting our speculation about a larger representation for encoded words than for nonwords.

It should also be stressed that the pattern of processing linguistic material in LTM differs from that observed for sounds. To reiterate, MFUL sounds were retrieved based on meaning, accompanied by a liberal style of responding, whereas MLESS sounds elicited a more stringent strategy. This difference clearly suggests that processing of linguistic material goes beyond its perceptual characteristics and is flexible, allowing a search for the closest meaningful stimulus (a word), which most likely occurs automatically. Here, we propose an associative meaning account: it is easier to find a match between a nonword and a word than between a MLESS sound and a MFUL one. In other words, it is easier to notice that the nonword fippo is remarkably similar to the word hippo than to notice that a MLESS sound resembles the MFUL sound of a cat meowing.

Summary

Our findings clearly showed similarities and differences in memory for auditory stimuli in the form of environmental sounds and speech in immediate and delayed recognition, emphasizing the role of meaning. Interestingly, we demonstrated more similarities between the processing of words and nonwords than between MFUL and MLESS sounds. Future studies may consider these findings from Chomsky's (2006) perspective, as his approach has recently been revived and discussed (Ding et al., 2016).

Limitations

A limitation of this study concerns the sound database. To our knowledge, no existing database contains acoustically similar sounds such that each sound within a particular category is similar enough to the others for all to belong to the same category. Although each sound within a category is perceived as similar, the differences between sounds (how much each sound differed from the others within its category) should be more precisely equated. It would be beneficial for future studies to further test memory for environmental sounds to better understand the characteristics, sources, and encoding of sounds that occur in the everyday world.

Footnotes

1We define environmental sounds as sounds that occur in the environment and are either easily identifiable (e.g., a cat's meow, a doorbell) or difficult to identify (no obvious meaning comes to mind immediately).

2As we expected an interaction for response bias where studied sounds were compared to lures, a post hoc statistical power analysis (G*Power 3) revealed that, for η2 = .58 and f = 1.1 with α = .05, power equaled 0.99, well above the accepted level of 0.80 (Cohen, 1992).

3In this study, critical lures were sounds acoustically related, but not identical, to sounds in studied lists. For example, if a studied list contained four different cats' meows, a critical lure was a fifth sound of a cat meow (acoustically similar to four sounds in the presented list).

4For the predicted interaction for response bias where studied words were compared to lures, the power analysis revealed that, for η2 = .16 and f = 0.44 with α = .05, power equaled 0.99.

5Although Bonferroni post hoc tests revealed no differences between conditions, the means suggest a more conservative style of responding for words than nonwords in LTM. The lack of significant differences may stem from the conservative nature of the test used.