Behavioral results
Performance in terms of accuracy was similar for both conditions and in general very high, underlining the compliance of the participants during the experiment. The average accuracy was M = 93.19% (SD = 7.46%) for the auditory task and M = 92.89% (SD = 7.65%) for the visual task. The accuracies of the two conditions did not differ significantly (t(26) = 0.378, p = 0.709). The generally high levels of accuracy suggest a ceiling effect for this behavioral index. This can be explained by the fact that this study did not utilize threshold-level stimuli. Analysis of reaction times revealed significantly longer reaction times for the auditory (M = 552.5 ms, SD = 181.5 ms) compared to the visual task (M = 493.3 ms, SD = 177.9 ms; t(26) = 3.7302, p = 0.0009).
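The comparisons above are paired-samples t tests across participants; t(26) implies 27 paired observations. As a minimal sketch of how such a statistic is formed, the following computes t and the degrees of freedom from per-participant values. The function name and the reaction times are hypothetical and are not the study's data; the p value would then be read from a t distribution with the returned degrees of freedom.

```python
import math
import statistics


def paired_t(a, b):
    """Return (t, df) for a paired-samples t test on two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("paired samples must have equal length")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_diff = statistics.fmean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 in the denominator)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical per-participant mean reaction times in ms (illustration only).
rt_auditory = [560.0, 610.0, 495.0, 530.0]
rt_visual = [500.0, 570.0, 470.0, 480.0]
t_value, df = paired_t(rt_auditory, rt_visual)
```

A positive t here means longer reaction times in the first (auditory) sequence, matching the direction of the effect reported above.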
OOA at theta rhythm is modulated by intermodal attention
Typical oscillatory activity of the brain is pronounced in a frequency band of 1–80 Hz, whereas otoacoustic activity is found at much higher frequencies (500–4000 Hz). As the aim of this experiment is to study the effects of cortical top-down modulations on OOA, we applied the Hilbert transform to extract the amplitude modulation for frequencies typical of ongoing cortical oscillations. To avoid a stronger influence of the lower sound frequencies and to create a representation of the cochlea’s frequency response, the otoacoustic signal was bandpass filtered between 1000 and 2000 Hz in 10 Hz steps with a window size of ± 30 Hz. The power spectral densities (PSD) of the 201 bandpass windows were then concatenated to create a representation of the amplitude modulation between 1000 and 2000 Hz of the cochlea’s frequency response.
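The filter-bank procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used in the study: the bandpass is a brick-wall filter implemented by zeroing FFT bins, the PSD is a plain periodogram, and all function names, the sampling rate, and the test signal are hypothetical. The step size and window half-width are left as parameters.

```python
import numpy as np


def bandpass_fft(x, fs, low, high):
    """Brick-wall bandpass: zero all FFT bins outside [low, high] (a simplification)."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    spectrum[(freqs < low) | (freqs > high)] = 0.0
    return np.fft.irfft(spectrum, n=len(x))


def analytic_envelope(x):
    """Amplitude envelope as the magnitude of the analytic signal (Hilbert transform)."""
    n = len(x)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(np.fft.fft(x) * weights))


def envelope_psd_bank(x, fs, f_start=1000.0, f_stop=2000.0, step=10.0, half_width=30.0):
    """Periodogram of the amplitude envelope for each bandpass window."""
    centers = np.arange(f_start, f_stop + step / 2, step)
    psds = []
    for fc in centers:
        band = bandpass_fft(x, fs, fc - half_width, fc + half_width)
        env = analytic_envelope(band)
        env = env - env.mean()  # remove DC so slow modulations stand out
        psds.append(np.abs(np.fft.rfft(env)) ** 2 / (fs * len(env)))
    mod_freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    return centers, mod_freqs, np.array(psds)


# Hypothetical test signal: a 1500 Hz carrier, amplitude-modulated at 6 Hz.
fs = 8000.0
t = np.arange(16000) / fs
signal = (1.0 + 0.5 * np.cos(2 * np.pi * 6.0 * t)) * np.sin(2 * np.pi * 1500.0 * t)

centers, mod_freqs, psds = envelope_psd_bank(signal, fs)
row = int(np.argmin(np.abs(centers - 1500.0)))
mask = (mod_freqs >= 1.0) & (mod_freqs <= 80.0)
peak_modulation = mod_freqs[mask][np.argmax(psds[row][mask])]
```

For this synthetic signal, the envelope spectrum of the window centered on the carrier peaks at the modulation frequency, which is the quantity of interest in the analyses that follow.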
In a first step, we parameterized induced oscillatory modulations of OOA during the silent cue-target interval. We used the FOOOF-toolbox to differentiate genuine oscillatory contributions from aperiodic 1/f changes. In all subjects, a peak could be found at low (< 11 Hz) frequencies with a clustering around ~ 5–6 Hz. However, it has to be noted that for a number of subjects more than one peak was identified below 11 Hz. For the Attend Auditory condition, the average peak frequency was at 5.65 Hz (SD = 1.48) for the left and 5.88 Hz (SD = 2.33) for the right ear. For the Attend Visual condition, the average peak frequency was at 5.58 Hz (SD = 1.57) for the left and at 5.85 Hz (SD = 1.83) for the right ear. Which modality was attended to had no statistically significant impact on the peak frequencies in either ear (left: t(26) = 0.2068, p = 0.9462; right: t(26) = 0.0681, p = 0.9462; FDR-corrected (false discovery rate)). For the Attend Auditory condition, the average slope was at 0.416 (SD = 0.229) for the left and 0.401 (SD = 0.184) for the right ear. For the Attend Visual condition, the average slope was at 0.413 (SD = 0.226) for the left and 0.400 (SD = 0.192) for the right ear. We found no statistically significant effect of modality for slopes in either ear (left: t(26) = 0.9462, p = 0.6503; right: t(26) = 0.1107, p = 0.9462; FDR-corrected). Figure 1a, b show subjects’ individual peak frequencies and Fig. 1e, f the slope for aperiodic components (“1/f noise”). In order to test whether the identified peaks are significant components of the respective PSD, we evaluated for every PSD whether the power at the peak frequency is a significant outlier of the distribution of the power at frequencies that were not identified as peaks. For this purpose, we calculated Dixon’s Q tests for every PSD except one that did not fulfill the requirements of Dixon’s Q test. In 104 of the 107 tested PSDs, the power at the peak frequency was a significant outlier. 
An exact binomial test indicated that the proportion of significant outliers (0.97) was higher than the 0.50 that would be expected if the power at the peak frequencies were outliers by chance (p < 0.0001, two-sided). Moreover, we performed Kolmogorov-Smirnov tests to test for uniformity on the peak frequencies for every ear and condition. The percentage of peak frequencies for the left ear and Attend Auditory condition (D(26) = 9.2347, p < 0.0001) and the percentage of peak frequencies for the left ear and Attend Visual condition (D(26) = 9.2486, p < 0.0001) were both significantly different from uniformity, indicating that the peak frequencies were not uniformly distributed in either condition. The same holds true for the right ear (Attend Auditory: D(26) = 9.4619, p < 0.0001; Attend Visual: D(26) = 9.3502, p < 0.0001). While this analysis overall points to a theta-rhythmic modulation of cochlear activity in a silent cue-target interval, the range (1–10.03 Hz) of these peaks suggests a rather high interindividual variability.
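The exact binomial test on the outlier counts can be reproduced with a few lines of arithmetic. The sketch below is illustrative: the function names are hypothetical, and the two-sided p value is taken as the summed probability of all outcomes that are no more likely than the observed one, which is a common convention but an assumption about the test variant used here.

```python
import math


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success probability p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def exact_binomial_two_sided(successes, n, p=0.5):
    """Two-sided exact binomial p value (sum over outcomes at most as likely as observed)."""
    observed = binom_pmf(successes, n, p)
    total = sum(
        prob
        for prob in (binom_pmf(k, n, p) for k in range(n + 1))
        if prob <= observed * (1 + 1e-7)  # small tolerance for floating-point ties
    )
    return min(1.0, total)


# Counts reported in the text: 104 significant outliers among 107 tested PSDs.
proportion = 104 / 107
p_value = exact_binomial_two_sided(104, 107)
```

With these counts the proportion rounds to 0.97 and the p value lies far below 0.0001, in line with the reported result.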
Subsequently, we were interested in whether the ~ 6 Hz component was phase aligned given that the target was temporally predictable. We calculated evoked power in the same way as described above and then used the FOOOF-toolbox to extract periodic components. For the Attend Auditory condition, the average peak frequency was at 4.44 Hz (SD = 1.70) for the left and 3.97 Hz (SD = 1.90) for the right ear. For the Attend Visual condition, the average peak frequency was at 4.71 Hz (SD = 2.17) for the left and 4.50 Hz (SD = 1.66) for the right ear. Which modality was attended to had no statistically significant impact on the peak frequencies in either ear (left: t(26) = − 0.5628, p = 0.9462; right: t(26) = − 1.1651, p = 0.9462; FDR-corrected). For the Attend Auditory condition, the average slope was at 0.414 (SD = 0.249) for the left and 0.395 (SD = 0.230) for the right ear. For the Attend Visual condition, the average slope was at 0.445 (SD = 0.271) for the left and 0.489 (SD = 0.265) for the right ear. We found no statistically significant effect of modality for slopes in either ear (left: t(26) = − 0.7814, p = 0.9462; right: t(26) = − 2.855, p = 0.0664; FDR-corrected). Figure 1c, d show subjects’ individual peak frequencies and Fig. 1g, h the slope for aperiodic components (“1/f noise”). Tests for uniformity analogous to the ones used for the induced signal showed that the percentages of peak frequencies for both ears and attention conditions were significantly different from uniformity (left ear, Attend Auditory: D(26) = 8.2005, p < 0.0001; left ear, Attend Visual: D(26) = 9.1565, p < 0.0001; right ear, Attend Auditory: D(26) = 7.0971, p < 0.0001; right ear, Attend Visual: D(26) = 7.8705, p < 0.0001). Moreover, as for induced power, we performed Dixon’s Q tests for every PSD except for 22 that did not fulfill the requirements of Dixon’s Q test. In 86 of the 86 tested PSDs, the power at the peak frequency was a significant outlier. 
An exact binomial test indicated that the proportion of significant outliers (1.00) was higher than the 0.50 that would be expected if the power at the peak frequencies were outliers by chance (p < 0.0001, two-sided). Thus, we assume that the identified peaks are significant components of their respective PSDs. Finally, we were interested in whether the phase of the evoked oscillation differs between modalities and ears. With this in mind, we calculated FDR-corrected circular common median tests. The results showed no significant difference for either ear or modality (ear: P(26) = 0.3068, p = 0.5860; modality: P(26) = 0.2967, p = 0.5860; FDR-corrected). The results suggest that the evoked ~ 4 Hz component is not modulated by attentional focus and is the same for both ears. Moreover, we tested whether the induced and evoked components differed in frequency. A two-sided t test revealed that the frequency of the induced ~ 6 Hz component was significantly higher than that of the evoked ~ 4 Hz component (t(26) = 4.4373, p = 0.0001). The result suggests that the induced ~ 6 Hz component is different from the evoked one. Thus, we assume that the ~ 6 Hz component is not consistent in phase for the cue-target interval.
Next, we tested the hypothesis that cochlear activity is increased during periods of focused auditory compared to visual attention. Descriptively, it appears from the grand average that the amplitude differences (Fig. 2a, b) of the amplitude modulation index (AMI) lie predominantly in the range of low frequencies, corresponding to the frequency range of dominant rhythmic cochlear activity (Fig. 1a, b). Given this overlap, the AMI was pooled across the range of peak frequencies (left ear: 3–10 Hz; right ear: 1–10 Hz) for the cochlear response frequency range of 1000–2000 Hz for the left and right ear, respectively. In the next step, FDR-corrected one-tailed one-sample t tests against 0 were performed (see Fig. 2c). The result for the left ear revealed that induced cochlear activity (M = 1.1002%, SE = 0.3047%) was significantly higher for the Attend Auditory condition (t(26) = 2.4701, p = 0.0122). Similarly, the result for the right ear revealed significantly higher induced cochlear activity (M = 1.5343%, SE = 0.3047%) for the Attend Auditory condition (t(26) = 2.3881, p = 0.0122). No interaural differences could be observed (t(26) = − 0.8225, p = 0.4183). In an analogous manner, we performed FDR-corrected one-tailed one-sample t tests against 0 for evoked cochlear activity. The results demonstrate that in both ears evoked cochlear activity was not significantly higher for the Attend Auditory condition (left: t(26) = − 1.5779, p = 0.9367; right: t(26) = − 0.6909, p = 0.9367). These analyses suggest that, while induced cochlear activity shows attentional modulation, evoked cochlear activity does not appear to be modulated by attention.
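Several of the FDR-corrected p values above are identical within a family of tests even though the t statistics differ. The following minimal sketch shows how this can arise, assuming the Benjamini-Hochberg step-up procedure (the text states only that the p values are FDR-corrected, so the specific procedure is an assumption). The function name and the raw p values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p value, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical uncorrected one-tailed p values for two tests (illustration only).
raw = [0.010, 0.012]
corrected = benjamini_hochberg(raw)
```

Here the smaller raw value is scaled to 0.020 and then capped by the adjusted value of the larger one, so both tests end up with 0.012.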
Cortical alpha and theta power are related to cochlear changes
In order to assess effects of intermodal attention on brain level, we performed a nonparametric cluster-based permutation analysis on source-projected MEG-power over frequencies of 3–25 Hz (see the “Methods” section). The analysis was pooled across 1.7 s of the cue-target interval. An effect of condition (Attend Auditory > Attend Visual, p = 0.004) was observed that corresponded to a positive cluster in the observed data beginning around 4–6 Hz up to 24–25 Hz. As hypothesized, the extent of this cluster is largest in the alpha and beta range and located in posterior—mainly occipital and parietal—brain regions (see Fig. 2d).
We expected inhibited sensory processing of the current task-irrelevant sensory modality—occipital regions for the visual and temporal regions for the auditory modality. According to dominant frameworks [23], this functional inhibition should manifest as increased power in the alpha band. We found increased alpha power for the Attend Auditory condition over occipital regions. However, no increased alpha power for the Attend Visual condition in auditory regions could be found. This absence may be related to a reduced measurement sensitivity due to the significant loss of MEG sensors covering the temporal regions.
In order to assess whether attentional effects found at the cortical level were associated with the previously described cochlear effects, a correlation between the brain-AMI and the induced OOA-AMI of the left (pooled across 3–10 Hz) and right ear (pooled across 1–10 Hz), respectively, was calculated. A nonparametric cluster-based permutation analysis indicated a significant correlation of brain-AMI and OOA-AMI of the right ear (p = 0.01) but not the left ear (p = 0.62). This corresponded to a negative cluster in the observed data incorporating the whole frequency range (3–25 Hz) of the analysis (see Fig. 3a). The extent of the cluster peaks in the alpha, theta, and beta bands. Dominant locations of the correlation effect are illustrated in Fig. 3a (see Additional file 1: Fig. S1 for an illustration on the brain’s surface). For the theta and alpha frequency range, strong auditory cortical effects are seen in the left STG or medial portions of Heschl’s gyrus, respectively. Interestingly, the effects are strongest contralateral to the OAE probe. However, effects were also observed outside of classical auditory cortical regions, such as in the right (pre-motor) or left inferomedial temporal regions. To illustrate that the relevant effects in the theta and alpha bands are not driven by outlying participants, Fig. 3b, c show correlations for the voxels with the strongest effects. The negative correlations indicate that lower alpha and theta AMI is accompanied by higher OOA-AMI and vice versa. It is well known that decreasing alpha activity represents a mechanism for a release of inhibition [23, 25]. Thus, the negative correlation suggests that participants exhibiting a stronger release of inhibition (by lower alpha power) in the left auditory brain regions during periods of auditory attention also exhibit elevated OOA-levels (by higher OOA power). 
This analysis illustrates that attentional modulations of rhythmic activity at the “lowest” (i.e., cochlear) level of the corticofugal system go along with modulations of oscillatory brain activity at the “highest” level. The absence of a significant effect for the correlation with the OOA-AMI of the left ear could be explained by the large number of saturated sensors in (contralateral) temporal regions, which is caused by magnetic artifacts of the microphone probes (see the “Methods” section). Depending on the number of bad sensors on each side, measurement sensitivity can be severely reduced in the respective temporal regions.
OOA is not sensitive to within-subject performance variability
Finally, we investigated if the OOA- and cortical effects were sensitive to within-subject performance variability as these kinds of analyses provide more insight into how attention modulates both cortical and cochlear activity. Since accuracies show a ceiling effect, analyses are exclusively run for reaction times. So, for each subject and condition, trials were individually split into slow and fast trials by median splits. In performing the median split for each subject and condition individually, we avoid confounding effects of between-subject and intermodal performance variability.
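The per-subject, per-condition median split can be sketched as follows. The function names, subject labels, and reaction times are hypothetical, and the handling of trials that fall exactly on the median (excluded here) is an assumption, since the text does not specify it.

```python
import statistics


def median_split(reaction_times):
    """Return (fast, slow) trial indices relative to the median reaction time.

    Trials exactly at the median are assigned to neither half (an assumption).
    """
    median_rt = statistics.median(reaction_times)
    fast = [i for i, rt in enumerate(reaction_times) if rt < median_rt]
    slow = [i for i, rt in enumerate(reaction_times) if rt > median_rt]
    return fast, slow


def split_all(rts_by_subject_and_condition):
    """Apply the median split separately to every (subject, condition) pair."""
    return {
        key: median_split(rts)
        for key, rts in rts_by_subject_and_condition.items()
    }


# Hypothetical single-trial reaction times in ms (illustration only).
trials = {
    ("sub-01", "auditory"): [400, 520, 610, 450, 700, 480],
    ("sub-01", "visual"): [380, 390, 450, 410, 430],
}
splits = split_all(trials)
```

Because each median is computed within one subject and one condition, a generally slower participant or a generally slower modality contributes equally to the fast and slow halves.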
Initially, analyses for the induced OOA were calculated. A three-factorial ANOVA (2 × 2 × 2) with the repeated measures factors ear (left and right), reaction time (slow and fast), and condition (auditory and visual) was calculated for peak frequencies. The results revealed no significant main effects (ear: F(1, 26) = 0.8273, p = 0.3714; reaction time: F(1, 26) = 0.3177, p = 0.5778; condition: F(1, 26) = 1.4210, p = 0.2440). Next, the same ANOVA was calculated for slopes. Again, its results revealed no significant main effects (ear: F(1, 26) = 0.3452, p = 0.5619; reaction time: F(1, 26) = 2.7340, p = 0.1103; condition: F(1, 26) = 0.2272, p = 0.6376).
Subsequently, a two-factorial ANOVA (2 × 2) with the repeated measures factors ear (left and right) and reaction time (slow and fast) was calculated for induced OOA-AMIs. The results showed no significant main effects (ear: F(1, 26) = 1.0540, p = 0.3141; reaction time: F(1, 26) = 1.6380, p = 0.2119). As there is no “one-sample ANOVA,” we additionally performed FDR-corrected one-tailed one sample t tests against 0 to test if the OOA-AMI in slow and fast trials is increased during periods of focused auditory attention. The results revealed that the OOA-AMI in slow trials of the left ear (M = 0.0631%, SD = 4.4203%) was not significantly increased in the auditory condition (t(26) = 0.0741, p = 0.4828). The OOA-AMI in slow trials of the right ear (M = 0.0399%, SD = 4.7708%) failed to be significantly increased in the auditory condition (t(26) = 0.0435, p = 0.4828). The same pattern was found for fast trials in both ears. The OOA-AMI in fast trials of the left ear (M = 1.8846%, SD = 4.6601%) was not significantly increased in the auditory condition (t(26) = 2.1010, p = 0.0661). The OOA-AMI in fast trials of the right ear (M = 2.8994%, SD = 7.8528%) was not significantly increased in the auditory condition (t(26) = 1.9190, p = 0.0661).
To assess effects of reaction times (slow vs. fast trials) on brain level, we performed a nonparametric cluster-based permutation analysis on source-projected MEG-power over frequencies of 3–25 Hz (see the “Methods” section). The analysis revealed no effect of pretarget MEG-power on reaction times.
Overall, the reported result for cortical activity does not indicate a sensitivity to reaction times. The same holds true for cochlear activity. However, the increase of the OOA-AMI in fast trials for auditory compared to visual attention narrowly misses significance.