Right now we are enjoying this ability, but we need to pay attention to where we are going, lest we get lost. Eventually we'll put our eyes back on the road only to find out that we've fallen seriously behind again. Do we, as audio content providers, want to remain hapless victims, forced to follow along in the wake of the industry instead of helping to steer it?
Hey, we all need to eat, but that is short-term gain at the expense of long-term benefit. I know it's easier to give the developer what he's asking for than to figure out a new technology, and while it may be quite satisfying to hear such beautiful sounds coming from dimly lit living rooms, bedrooms, and offices, that's not the prize.
Don't get me wrong: we've made great strides with audio quality in games, and there is definitely a place in almost any game for good old Redbook audio. But there is something even better down the road, and we're neglecting it. The true reward is to have all of that plus the ability to make it interactive. Are we there yet? In the past, all we had to work with were little beeps and whatnot from that tiny speaker inside the PC case, so there was not much need to worry about quality audio. Then along came the sound card, and you could actually play music.
And so MIDI was born. Why MIDI? Well, there was no such thing as a CD, and a hard drive was as big as the PC itself. MIDI was developed as a way around the hardware and software limitations of the time, and it was and still is a marvelous technology. The trouble is that although MIDI itself became standardized, the quality of the synthesizer and the instrument patches a device uses to play it back never did.
When I say "patches" I'm talking about the sounds MIDI uses to replicate instruments, also referred to as 'instrument banks'. These all vary greatly in quality from sound card to sound card. That quality is usually directly related to the price of the sound card. As we've all heard a million times why doesn't it sink in?
Other technologies have come along, but they were and still are proprietary in nature, and that puts us right back to the problem of having a myriad of sound cards in use with varying degrees of compatibility and quality.
It would seem that MIDI has seen its day as a viable solution to our interactive audio dilemma. This is unfortunate, because MIDI talks to the computer in a language it can understand, making it fast, programmable, and flexible enough to be interactive, and its file size is unbeatable. With the industry moving rapidly toward online gaming, file size is again a major concern.
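To get a feel for why MIDI's file sizes are unbeatable, compare what a single note costs in each format: a standard MIDI Note On message is three bytes, while Redbook audio burns 176,400 bytes for every second of sound. A quick back-of-the-envelope sketch (Python here purely for the arithmetic):

```python
# A MIDI Note On event is just three bytes: status, pitch, velocity.
note_on = bytes([0x90, 60, 100])  # Note On, channel 1, middle C, velocity 100
print(len(note_on), "bytes per note event")          # -> 3

# Redbook (CD) audio: 44,100 samples/s x 2 bytes/sample x 2 channels.
redbook_bytes_per_second = 44_100 * 2 * 2
print(redbook_bytes_per_second, "bytes per second")  # -> 176400
```

That is the gap we give up when we abandon note data for streamed audio.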
What we need is some software company to make a program that would allow a small, fast audio file to sound the same on every PC, with CD quality and the ability to be interactive. Then of course they'll have to give it away for free to everyone who owns or buys a PC. Right, that'll happen when Satan is wearing ice skates. Would someone please get the Lord of the Abyss a pair of leggings to go with those red figure skates? Microsoft to the rescue!
I can't believe I actually put that phrase in print. Who else could pull it off? Microsoft has developed just what we needed: a program to ensure compatibility among sound cards, the Microsoft Synthesizer, and a program to create the audio content, DirectMusic Producer.
Many sound cards are already DLS compatible and DLS-compatible software synthesizers are becoming available through other companies as well.
The Microsoft Synthesizer is also installed automatically as a part of Internet Explorer so chances are pretty darn good that most PCs and all game players already have it installed on their systems. This too is provided free. So we now have the ability to create and implement interactive, CD quality audio at a fraction of the system resources required by linear Redbook audio.
Kudos to Microsoft; now if they would only make the interface understandable to musicians (hint, hint). I did say CD quality, didn't I?
So what is DLS, you ask? DLS stands for Downloadable Sounds, and it means the same sounds you would currently hear in your Redbook audio tracks can be used in a MIDI composition. Instead of a two-minute WAV of a violin solo taking up 20 MB of space, you take a short sample of that violin sound, which occupies only a tiny fraction of that space, and make a DLS instrument out of it.
A MIDI note then triggers that sound in the composition, and the result is the same two-minute violin solo at a fraction of the size. DLS combines the advantages of digital sampling with the compactness and flexibility of MIDI, and it functions independently of any on-board MIDI instrument sounds already in a sound card. If your sound card isn't DLS compatible out of the box, the Microsoft Synthesizer handles the processing.
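As a sanity check on those numbers, here is a rough size comparison, sketched in Python. The one-second sample length is a hypothetical figure for illustration; a real DLS instrument may use several samples across the pitch range:

```python
# Linear Redbook-quality WAV: two minutes of 16-bit, 44.1 kHz stereo.
redbook_bytes_per_second = 44_100 * 2 * 2
wav_size = 120 * redbook_bytes_per_second
print(f"two-minute WAV: {wav_size / 1e6:.1f} MB")       # ~21.2 MB

# DLS instrument: a short source sample that the synthesizer re-pitches
# and loops for every note the MIDI score asks for.
sample_size = int(1.0 * 44_100 * 2)                     # one second, 16-bit mono
print(f"DLS source sample: {sample_size / 1e3:.0f} KB") # ~88 KB
```

The MIDI score itself adds only a few kilobytes on top of the sample, so the whole piece ships in well under one percent of the WAV's footprint.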
DirectX 8 makes use of the DLS2 standard, which adds many features. You might also notice that I have been using the term "interactive audio" and not "interactive music."
The reason is that a DLS instrument can be built from any sound, which means sound effects and voices as well as musical instruments. One of the demonstrations I saw from Microsoft was a sports game sound effect set where the crowd cheered when your team got a hit and booed when the other team got one. At the same time there was an announcer speaking, a vendor hawking his wares, and a general crowd ambience. All of these sounds were layered on top of each other as needed by the game events, without ever having to switch tracks or suffer the stutter you get from loading and unloading an audio track.
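The demo's behavior is easier to see as an event-to-layer mapping. The sketch below is not the DirectMusic API, just a minimal illustration in Python with made-up layer names, showing one-shot stingers mixed over persistent background layers instead of whole tracks being swapped:

```python
# Persistent background layers that are always sounding.
BEDS = ["crowd_ambience", "announcer", "vendor_calls"]

# One-shot stingers triggered by game events, layered over the beds.
STINGERS = {"home_team_hit": "crowd_cheer", "away_team_hit": "crowd_boo"}

def on_game_event(event: str) -> None:
    """Trigger the stinger for an event; nothing is stopped or reloaded."""
    stinger = STINGERS.get(event)
    if stinger:
        print(f"mixing one-shot '{stinger}' over {BEDS}")

on_game_event("home_team_hit")   # crowd cheers over the ambience
on_game_event("away_team_hit")   # crowd boos over the same ambience
```

A real implementation would hand each layer to the synthesizer as a DLS-driven segment, but the idea is the same: events add sound on top of what is already playing.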
Where have you been all my life? You might wonder why, if this ability has been around for a while now, everyone doesn't use it. That's a valid question. As I pointed out earlier, the fear factor has kept developers from being interested in learning about it, even if the information were easily accessible -- which it isn't.
Since DirectMusic Producer is a free program, all of the attention given to its creation has gone into the technology, not the user interface. This means it is difficult to learn and use. Musicians are rarely programmers (although when I look around my studio, I wonder how I got all of this gear working together with three PCs) and are therefore not inclined to deal with the problem solving required to figure it all out. In addition, it's not useful in other areas of the music industry, which means it's gotten little attention in the music community.
Interactive audio also requires a whole new way of thinking about composing. You can't approach a composition in the traditional linear structure because changes in the game will dictate that your composition must change.
If your entire life you've been taught, listened to, and created music one way, it takes serious dedication and focus to learn to look at audio in a completely different way. With the steep learning curve, it's difficult to justify the loss of productivity while you try to get a handle on it. Who'll pay the rent? Then, after you learn it, you have to sell the developers and publishers on the technology.
As a free program it generates no revenue, which means it gets no advertising funds. With little available information, it's a hard sell. It's much easier to go with what you know and what you can sell. Add up all of these things and you see why interactive audio hasn't taken the industry by storm. The bottom line, however, is that the ability to produce interactive audio is available, and it's an exciting frontier for pioneering musicians and developers who are willing to explore beyond the boundaries.
We owe it to our audience and ourselves to move in this direction, and there is really no excuse not to be doing it. Yes, it is more difficult to learn, but I'm sure that learning a programming language, or putting down the pencil and learning to draw with a graphics program, was no piece of cake at first either.
Observations and group discussion show that people look for and identify spatial audio cues horizontally at eye height as well as upwards, but rarely below them. One participant closed his eyes and tried locating the sound by using his arms to point in the direction he heard it coming from (see Figure 6B).
Observation during discussion showed that hands are actively used to describe spatial sounds (see Figure 6C). Participants who had never listened to spatial audio before expressed an interest in a dedicated onboarding at the start of the experience.
Participants liked that their ears were not blocked and thus able to hear their surroundings. Yet, the format of glasses was perceived as strange since the lenses are not used and glasses are usually worn for specific purposes not relevant to the given experience. Speakers could be inserted in other head-worn objects such as a hat.
Lack of clear feedback sounds was discussed by all groups. The existing feedback was confusing as it remained unexplained.
The feedback sounds were not clearly distinguishable from sounds that are part of the content or game. The hardware itself does not give any tactile feedback, and there was no feedback on how hard to tap so that the tap registers without hurting the other person.
The most popular idea for other interactive AAR applications was experiences tied to specific locations via GPS, such as exploring a city or a space. The interactive technology would be popular for party games or to enhance board games. The participatory and collaborative aspect of the experience was deemed especially suitable for ice-breaker activities, warm-ups, workshops, or speed dating. Other applications could be seen in acting classes or in supporting the visually impaired.
Technological issues with hardware and audio (3) as well as confusion about instructions and the story (9) had adverse effects on the participants' enjoyment. The storytelling was sometimes described as simplistic (3) and boring (2), and spatial audio, despite being liked, as not necessary for the story (2). Two games were clearly preferred. In the Tapping for likes game, participants (10) found it exciting to get likes, and they preferred when there was consensus. The Circles and Crosses and Notification games each got one vote.
Circles and Crosses was an individual exercise, but with the other participants as spectators.

The objective of this experiment is to examine how audio interactions can prompt actions in AAR experiences and how AAR supports interactions in participatory performances. This section discusses the results from the user experience study to answer the research questions. Other findings are also discussed that introduce areas for further research.
Table 2 summarizes the main findings of this study. Asymmetric information, by allowing different users to hear different parts of the story, prompted group coordination through verbal and non-verbal communication.
Participants who discovered that they were hearing different elements gave each other accounts of what they heard and discussed how to solve tasks or perform as a group. Thus, asymmetric information had strong multiplayer potential and prompted discovery. Yet, even though some participants really enjoyed it, others got confused and criticized the lack of information about the discrepancies in what they received. The analysis of the post-study questionnaire and discussions reveals that confusion often comes from cognitive load, a common side-effect also noted in previous audio-only games.
To prevent it, some AAR studies suggest finding the right audio-only interaction techniques (Sawhney and Schmandt), using minimal information, or providing more guidance as interactions become more complex (Rovithis et al.).
Conversely, some participants expressed that too little audio feedback is confusing, too challenging, and increases cognitive load. In that respect, the meta-architecture presented in the LISTEN project, which provides meaningful audio descriptors that help make context-aware choices about which sound to play at which moment of the game, could be used (Zimmermann and Lorenz). Audio descriptors depend on the purpose of the game: technical descriptions, relations between real and virtual objects, content of real objects, the intended user, etc.
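As an illustration only, a descriptor of that kind might be modeled as a small record carrying the categories named above; the field names here are invented, and the LISTEN project's actual schema may differ:

```python
from dataclasses import dataclass

@dataclass
class AudioDescriptor:
    sound_id: str        # which sound this describes
    kind: str            # e.g. "feedback", "narration", "game_content"
    tied_to_real: bool   # does it relate to a real object or actor?
    intended_user: str   # "all", or a specific participant id

def choose_sounds(pool: list, user_id: str) -> list:
    """Context-aware selection: play only what is meant for this listener."""
    return [d for d in pool if d.intended_user in ("all", user_id)]

pool = [AudioDescriptor("tap_confirm", "feedback", False, "all"),
        AudioDescriptor("private_clue", "narration", False, "p2")]
print([d.sound_id for d in choose_sounds(pool, "p1")])   # -> ['tap_confirm']
```

Such a record would let the system decide, per listener and per moment, which sounds to deliver, which is exactly the kind of context-aware choice the meta-architecture is meant to support.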
Please Confirm you are not a Robot shows that, in a multiplayer AAR experience, asymmetric information could be used as an audio description to distinguish shared content from private information.
This can be achieved by using different sonic signatures or, as in the performance Consequences (Playlines), by placing sound sources in the real world, for example actors, speakers, or a live performance.

Some interaction design insights come from the analysis process.
Double tapping the frames to start and finish every game was successful in coordinating the storyline as well as the participants as a group. Yet, participants often reported a lack of control over the audio content.
This may be due to hardware issues (non-detection, disconnection), the narrative moving too fast and interrupting the players while they were acting, or the impossibility of repeating content that was not understood the first time or of pausing the narrator's voice. Hence, offering possibilities for repetition, and an audio system designed not to interrupt the participants, could be further explored. Also, feedback sounds were sometimes not clearly distinguishable from sounds that are part of the story.
In line with previous spatial audio studies (Brungart et al.), most participants in the Notification game reported difficulties locating some of the sounds. They also had trouble finding audio cues located below them, because they mostly looked for them horizontally and upwards, which supports previous results by Rovithis et al. Participants also mentioned that continuous spatial sounds (a continuous tone or music) are easier to locate than interrupted sounds (beeping). Hence, to facilitate interaction with AAR content, a distinct on-boarding covering audio controls, feedback sounds, and how to listen to spatial audio is advised.
Despite these challenges, participants were excited about the sound during the introduction and the parts where the spatial audio clues were strong, and suggested that these could play a bigger part in the story or be integrated with a strong reference to the real environment, which is further explored in the AAR studies that are part of the LISTEN project (Zimmermann and Lorenz).

Contrary to the existing AAR literature (Mariette), this study explores audio interactions through the angle of user engagement.
Most participants were glad to have had the experience and liked its novelty, as in previous multiplayer AAR studies (Moustakas et al.). Most of them would recommend it to a friend and were engaged. One participant sent an email a few days after the experiment describing how it made them reflect on their own technology use, revealing a longer-term impact. The engagement of some participants led them to feel immersed in the environment.
Even though they were still aware of their real surroundings, paid attention to them, and felt self-conscious, they were captivated by the experiment, and half of them felt that the virtual world surrounded them. Even if the augmented world was perceived as only moderately real, it was consistent with a real-world experience. These results show that participants experienced a form of presence in the story, illustrated for instance during the discussion, where some participants found it impressive when it was unclear whether sounds were augmented or real.
This aligns with the suggestion of Cummings et al. and supports previous evidence found by Moustakas et al.

Real-time data was used to coordinate storylines based on the choices and actions of all participants, thus creating a multiplayer experience. This has previously not been possible to achieve in real time, unless humans are present to observe the actions and direct the narrative, as in the Binaural Dinner Date experience; the narrative evolves linearly, where one element always follows another; or actions and recordings are tied to specific GPS locations and have to be placed there before an audience arrives. In this experiment, this group-coordinated audio was explored in three out of four games.
In Mirroring, this allowed people to choose and pair up with another participant and receive complementary audio tracks in order to perform as a pair. In the Notification game, each participant received an individual set of sounds to turn off, but the points they achieved were coordinated, so a competition with a winner could be identified.
Tapping for likes created a curious friction: the content first started in parallel but then diverged, making the interactions seem more absurd and eventually separating one participant completely from the group to perform a different role. This study shows that narratives can emerge from the actions of the audience and be recombined in ever different ways and forms.

Head-tracked spatial audio soundscapes, as opposed to static binaural audio, invite exploration of the sound sphere.
The experience used the sphere around the head of each user as the frame of reference for sounds to appear in. Participants used their whole body to move in the direction of sounds and look for sonic cues. When sound sources were placed in a specific spot in the sonic sphere, listeners used their arms and hands to locate or follow the sound. Spatial audio, narrative pace, and sonic cues all affected the way the performers moved.
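The "sphere around the head" frame of reference can be made concrete with a little geometry. The sketch below is not the renderer used in the study; it only assumes a head tracker reporting yaw and shows how subtracting it keeps a source fixed in the world while the head turns (a full renderer would treat elevation, pitch, and roll the same way):

```python
def head_relative_azimuth(source_az_deg: float, head_yaw_deg: float) -> float:
    """Where the source appears relative to the nose, wrapped to -180..180 degrees."""
    return (source_az_deg - head_yaw_deg + 180.0) % 360.0 - 180.0

# A source fixed 90 degrees to the listener's left; the listener turns toward it.
for yaw in (0.0, 45.0, 90.0):
    az = head_relative_azimuth(90.0, yaw)
    print(f"head yaw {yaw:5.1f} deg -> source heard at {az:6.1f} deg")
```

As the printout shows, the source drifts toward 0 degrees (straight ahead) as the listener turns, which is what invites the whole-body searching behavior observed in the study.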
Further research on the embodied experience of audio augmented reality and spatial audio would be valuable to investigate the relationship between sound and body movement.

During the study the researchers noted that each group performed and acted distinctly differently from the others.
The AAR system guided the performance, and the overall experience was perceived as similar across groups; nevertheless, the freedom of movement and interpretation allowed by the system made it possible for each group to choose its own pace and its own set of interactions.
The mood of each group was coherent within the four games but distinct from that of the other groups. Each participant had the freedom to add to or subtract from the content as they pleased. Once trust in the experience was built, participants dared to engage more and create their own characters, which they expressed through gestures and enactment. This is in line with the discussion of White that AAR, as an immersive medium, is suitable for audience engagement in a performance.
By allowing performers to interact with the narrative, the actors can set the pace at which they move through the story. The more control over the audio each participant has, the easier it is for them to follow instructions.
As suggested by Bluff and Johnston, the AAR technology shapes the experience and the performance.

The best parts of the experience were described as the social interactions. Participants saw a potential in the games to break down social norms and barriers. It was enjoyable to physically interact with other people, and most participants would like to play more multiplayer games. This insight supports previous results from Moustakas et al.
Thus, future AAR applications could focus on this multiplayer aspect, with participants evoking party games or board games. The results from the IOS scale show that collaborative action with multiple participants can create a feeling of connection. This positive feeling within the group could create empathy for the characters that emerge from the group performance since the performance received positive feedback for creating understanding around digital wellbeing through embodied, immersive engagement with a topic and narrative.
The design of the experiment, with questions asked about each task immediately after its completion, interfered with the flow of the performance and may have affected how participants perceived the experience. The testing situation presented an unnatural environment: with observers and passers-by present at all times, it was not easy for participants to let go. Furthermore, there was no way for the researchers to monitor what participants heard at any given moment. This may have led to certain reactions being overlooked and made it difficult to distinguish technological breakdown from human failure.
Hardware issues e. The researchers were not meant to interfere with the experience but due to technical issues observers had to intervene nevertheless. The high audio latency limited the spatial audio quality. The Frames were criticized for being too bulky and not meaningful for the context of the performance. There was no reference in the experience to the participants wearing glasses and thus no acknowledgment of the technology, which made it seem arbitrary. For future research and evaluation of the same experience it could be tested in one go rather than split into chapters.
The order of the four chapters was defined by the narrative and could not be altered during the experiment; switching the order of the games for each group might avoid possible bias. A follow-up study should reach out to a more diverse audience and ensure a better gender balance. The experience should also be tested in a more natural environment. Overall, participants would have preferred more story-line as part of the experience: they did not feel like characters in a story but simply followed instructions.
There could be more narrative drive to hold it all together.

The Audio Augmented Reality (AAR) collaborative experience Please Confirm you are not a Robot was designed, implemented, and tested with the aim of evaluating the potential of AAR as a material to shape the content of a participatory performance through human-computer and human-human interactions.
The experience was designed to explore the affordances of state-of-the-art AAR technology, namely asymmetric audio information between participants and haptic and gestural audio control. The experience was tested and evaluated using questionnaires, observation, and a guided group discussion. The results show that combining 3D audio with simple gesture interactions engaged the participants and made them feel present.
Participants enjoyed the audio-only aspect. The social aspect of the experience was the most liked feature. The audience, most often strangers to each other, communicated using verbal and non-verbal interactions to perform tasks and navigate asymmetric information. The connection between members of each group increased over time.

Many previous AAR studies focus on interactive possibilities with devices and media, with little exploration of user experience.
Please Confirm you are not a Robot draws from performance practices to create an embodied experience for the participants, benefiting from real-time audio interactions to generate narrative content.
Interactive AAR utilizes real-time data for generating content and interacting with the media. Interactive, spatial audio expands previous experiments and performances with binaural audio for more personalized and immersive experiences. The option to generate narratives based on the status of other participants enables performances for multiple participants, each receiving a personal but collaborative experience.
AAR has creative potential for the participant performers to express themselves and their mental images based on sonic stimuli. Interaction between participants, away from purely scripted plays, increases the feeling of empathy for the characters of a story and generates a stronger feeling of being immersed in it. In future studies, how real-time data can generate and influence the story could be explored in more detail.

The participants provided their written informed consent to participate in this study.
AN worked on interaction and experience design, and story and script development. Both AN and VB were equally involved in the study design, execution and analysis.
The paper was written in equal parts. The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. Many thanks to all participants, without whom this research would not have been possible.
Albrecht, R.
Ames, M.
Aron, A. Inclusion of other in the self scale and the structure of interpersonal closeness.
Bluff, A. Devising interactive theatre: trajectories of production with complex bespoke technologies.
Boal, A. Games for actors and non-actors.
Brungart, D. Effects of headtracker latency in virtual audio displays. Audio Eng.
Chatzidimitris, T.
Cummings, J. How immersive is enough? A meta-analysis of the effect of immersive technology on user presence. Media Psychol.
Drewes, T. Sleuth: an audio experience.