xTV Sound

Matthew Alan Kane

Lucasfilm Ltd.

February 23, 1987




On xTV, sound is the glue that holds together the experience. Each segment's pacing, mood, and style are controlled by its accompanying soundtrack. Other audio, such as sound effects, will be used on xTV to enhance existing soundtracks or animated graphic simulations. We also plan to explore the use of aural feedback in combination with visual feedback to make using the xTV Workshop as easy and intuitive as possible.


xTV System Sound Sources

There are four sound sources on the xTV classroom system.

1. Compressed digital audio from the CD-ROM player

Most of the segment soundtracks will exist as compressed digital data on CD-ROM. This data will be decompressed by the xTV Accessory Board.

2. CD-Digital Audio from the CD-ROM player

The CD-ROM player in Digital Audio ("Red Book") mode will be available to allow teachers and students to play regular CDs. xTV itself does not use this mode.

3. Videodisc

The audio tracks on the videodisc will hold soundtracks for the "running" video segments and computer graphic based segments.

4. Apple IIGS

All other sounds will be generated in real-time by the Apple IIGS sound hardware.


All xTV sound will be monophonic. Stereo would require twice as much data, which is unnecessary in xTV's classroom environment.



The main audio source on xTV will be a CD-ROM. Audio can be stored on a CD-ROM in two forms: CD-Digital Audio or compressed digital audio. Storing the sound as CD-Digital Audio will give the highest quality signal, but the total length of the sound is limited to approximately seventy minutes of stereo. Using compressed digital audio, it is possible to store at least eight hours of good quality mono sound with the CD-I ("Green Book") 4-bit ADPCM scheme ("FM quality" mode), which works out to over 160 complete three-minute soundtracks. For speech quality sound, which requires less than half of the frequency range of music, the potential time is upwards of sixteen hours. Engineers at Apple are currently attempting to determine an audio compression standard for the IIGS and Macintosh computers. We would like to see compression rates double those of the CD-I standard, but if Apple's resulting standard is at least as good as CD-I, then xTV can use it.
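The capacity figures above can be checked with a little arithmetic. The sketch below is illustrative only: the 37.8 kHz sample rate assumed for the 4-bit "FM quality" mode is not stated in this report, the function names are hypothetical, and sector and header overhead on the disc is ignored.

```python
def adpcm_bytes_per_second(sample_rate_hz, bits_per_sample=4, channels=1):
    """Raw data rate of an ADPCM stream, ignoring sector and header overhead."""
    return sample_rate_hz * bits_per_sample * channels / 8


def soundtracks_per_disc(hours_of_audio, minutes_per_track=3):
    """Whole soundtracks of a given length that fit in the available hours."""
    return int(hours_of_audio * 60 // minutes_per_track)


# Assumption (not from this report): 37,800 samples per second
# for the 4-bit "FM quality" mode.
FM_RATE_HZ = 37_800

fm_bytes_per_second = adpcm_bytes_per_second(FM_RATE_HZ)
eight_hours_megabytes = fm_bytes_per_second * 8 * 3600 / 1_000_000

print(soundtracks_per_disc(8))       # music soundtracks in eight hours
print(soundtracks_per_disc(16))      # speech-quality tracks in sixteen hours
print(round(eight_hours_megabytes))  # megabytes for eight hours of mono audio
```

Under these assumptions, eight hours of mono audio holds 160 three-minute soundtracks and occupies roughly 544 megabytes of raw sample data.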

One of the major problems in designing an audio compression scheme is deciding the trade-off between compression factor and sound quality. It is very important that the audio quality on xTV be at least as good as standard television sound, and it's equally important that a large number of soundtracks (100 - 500) fit on a single CD-ROM. On xTV, the demands on audio quality are limited by the playback system. It is assumed that xTV will be shown on a regular television set or computer monitor, with mono sound being played through the standard 2-inch speaker. The necessary frequency range of the audio will be similar to monophonic broadcast television (100 Hz to 10 kHz). All recorded sound for xTV will be designed and equalized for the compression scheme to minimize compression artifacts.

In order to have all the produced audio soundtracks and narrations for an entire xTV semester available without switching discs, it will be necessary to use some type of audio compression. We do not believe that the standard IIGS will be able to decompress and play back the audio in real time without hardware assistance. Therefore, we will need dedicated audio decompression and playback on the xTV Accessory Board. (See xTV Accessory Board, page 7.)

While a more elaborate audio system, stereo, and a broader frequency range would enhance the overall presentation of xTV, people are used to television sound and are unlikely to expect more from the xTV system.




For all of the "running" video segments, the accompanying soundtrack will be on one of the videodisc's stereo channels. The other channel will hold an alternate soundtrack. There will only be one audio output from the videodisc player. The videodisc audio switching will take place in the videodisc player in response to the IIGS's commands.

The images for most xTV segments will exist as still frames on the videodisc. Soundtracks for these segments will be stored as compressed digital data on the CD-ROM.

The audio tracks over the videodisc's still image file will be utilized to hold soundtracks for computer graphic based segments. During a graphic based segment, the CD-ROM is too busy delivering dynamic graphic data to the computer to be sending in compressed audio data, and the IIGS is too busy processing and displaying the graphic data to be playing complex sound. For these segments, xTV music can be provided by the videodisc player from the soundtracks over the still image file.
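The division of labor described above, with each kind of segment drawing its soundtrack from a different device, can be summarized as a small lookup. This is only a sketch of the routing rule; the enum and function names are hypothetical and do not correspond to any actual xTV software.

```python
from enum import Enum


class Segment(Enum):
    RUNNING_VIDEO = "running video"
    STILL_FRAME = "still frames"
    COMPUTER_GRAPHIC = "computer graphics"


class Source(Enum):
    VIDEODISC_VIDEO_TRACK = "videodisc audio channel accompanying the video"
    CD_ROM_COMPRESSED = "compressed digital audio from the CD-ROM"
    VIDEODISC_STILL_FILE_TRACK = "videodisc audio over the still image file"


def soundtrack_source(segment):
    """Return where the soundtrack for a given kind of segment is stored."""
    routing = {
        # Running video carries its own audio on the videodisc.
        Segment.RUNNING_VIDEO: Source.VIDEODISC_VIDEO_TRACK,
        # Still frames leave the CD-ROM free to deliver compressed audio.
        Segment.STILL_FRAME: Source.CD_ROM_COMPRESSED,
        # Graphics keep the CD-ROM and IIGS busy, so the videodisc plays music.
        Segment.COMPUTER_GRAPHIC: Source.VIDEODISC_STILL_FILE_TRACK,
    }
    return routing[segment]


for kind in Segment:
    print(kind.value, "->", soundtrack_source(kind).value)
```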

IIGS Sound Hardware

The Apple IIGS has very sophisticated sound generating hardware. The heart of the system is the Ensoniq Digital Oscillator Chip (DOC). This chip, which is standard in the IIGS, was designed by Ensoniq for their professional synthesizer products. Sounds are produced with the DOC's 32 oscillators from digitized waveforms stored in 64K of dedicated RAM. The audio section of the IIGS runs asynchronously from the rest of the system, which means that the IIGS could be playing music or sound effects at the same time it is creating graphics or controlling an external device such as a videodisc player.

A potential problem in running IIGS sound concurrently with IIGS generated graphics is the amount of processor time each task takes. According to engineers at Apple, servicing the interrupts to run eight voices of the DOC can eat up anywhere from 60% to 80% of the 65816's power. One possible solution to this problem would be to use fewer voices, but this would seriously decrease the IIGS's ability to create interesting music and sound. Another solution would be careful coordination of the graphics to the sound. By using fewer voices during the graphic intensive times, such as filling in map data, and enriching the sound during the graphic dormant times, it should be possible to reduce the processor slowdown. Unless the percentage of system time to generate complex sound can be significantly lessened, there will not be much IIGS sound during computation-intensive operations.
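The idea of thinning the sound during graphic-intensive moments can be made concrete with a rough voice budget. The sketch below assumes that the interrupt cost grows linearly with the number of voices, which this report does not establish; the only measured figures are the 60% to 80% quoted for eight voices, so the result is capped at eight. The function name and parameters are hypothetical.

```python
def max_voices(cpu_budget_percent, cost_of_eight_percent=80, measured_voices=8):
    """Estimate how many DOC voices fit in a given share of processor time.

    Assumes cost scales linearly with voice count (an assumption, not a
    measurement). Defaults to the pessimistic 80% figure for eight voices.
    """
    voices = (cpu_budget_percent * measured_voices) // cost_of_eight_percent
    # Never extrapolate beyond the measured voice count, never go negative.
    return max(0, min(measured_voices, voices))


# Sound budget left over while graphics take most of the processor.
print(max_voices(30))        # pessimistic case: eight voices cost 80%
print(max_voices(30, 60))    # optimistic case: eight voices cost 60%
```

With 30% of the processor available for sound, this estimate allows three voices in the pessimistic case and four in the optimistic case.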



IIGS Sound Quality

The IIGS's sound generating ability is a substantial improvement over the old Apple II or Commodore sound, but we are disappointed that the actual audio quality is not higher. In a professional quality synthesizer, each component of the hardware is "tweaked" to make the system as noise-free as possible. The IIGS's sound generator resides in a box with lots of computer and video circuits that produce noises which are audible when listening to IIGS produced music.

When the audio signal from the IIGS is amplified, rather than played through the computer's internal speaker, the noise becomes unbearable, rendering the IIGS's sound generating circuitry useless for xTV. Since the signal will necessarily be amplified on xTV in order to be heard by an entire classroom of students, it is vital that the sound circuitry within the IIGS be isolated from the rest of the GS circuitry.


Low-level IIGS Sound Tools

The Apple engineers have created a set of low-level sound tools and demonstrated several applications of IIGS sound. All of these will be adapted for use in the xTV Workshop's high-level music tools and other xTV sounds.


•Free Form synthesizer

This sound tool allows the user to play variable length sampled sounds directly through the DOC. Sound data stored in the system memory can be shuttled into the sound RAM in a continuous stream, producing roughly fifty seconds of musical quality sound per megabyte (1 megabyte of RAM / 20 kHz sample rate).

The potential for streaming in sampled sound data from a hard disk has been explored by the Apple engineers. They found that under ideal circumstances, it is possible to have the GS play a very long sample. For xTV's audio, this would not be a very reliable or efficient method of delivery, as it takes almost all of the IIGS's processor time.
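The seconds-per-megabyte figure quoted for the Free Form synthesizer follows directly from the sample rate. The sketch below assumes one byte per sample, which this report does not state explicitly, and uses a hypothetical function name. It also shows how little sound the 64K of dedicated sound RAM holds by itself, which is why longer samples must be streamed in from system memory.

```python
def playback_seconds(buffer_bytes, sample_rate_hz, bytes_per_sample=1):
    """Length of sampled sound that fits in a buffer of the given size."""
    return buffer_bytes / (sample_rate_hz * bytes_per_sample)


ONE_MEGABYTE = 1024 * 1024
SOUND_RAM = 64 * 1024      # dedicated DOC sound RAM
SAMPLE_RATE_HZ = 20_000    # rate quoted for musical quality sound

print(round(playback_seconds(ONE_MEGABYTE, SAMPLE_RATE_HZ), 1))  # system RAM
print(round(playback_seconds(SOUND_RAM, SAMPLE_RATE_HZ), 2))     # sound RAM only
```

One megabyte gives about 52.4 seconds at 20 kHz, while the sound RAM alone holds a little over three seconds.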


•Note synthesizer

The note synthesizer provides a way to control musical sounds and events on the IIGS. Its sound tools let the user turn on and off notes from several different instruments and handle the complex process of allocating the DOC's oscillators.


•Apple Instrument Format

The Apple Instrument Format (AIF) seems to be a comprehensive and well constructed standard for the definition of IIGS instruments. By using a standard format, the note synthesizer and other potential IIGS sound generators will be able to use the same instruments, thus promoting the development of large libraries of different instruments.



The Apple engineers are currently completing the programming of a IIGS sequencer. In conjunction with the note synthesizer and the AIF, the sequencer will round out the Apple supplied IIGS sound tools, giving xTV programmers the ability to make good use of the IIGS DOC.



IIGS Sound on xTV

There are several possible uses for IIGS sound in an xTV product. The biggest advantage of IIGS generated sound over prerecorded audio is its dynamic and changeable nature. The IIGS can respond to the particular musical needs of the teachers and students using the xTV Workshop with xTV's high-level music tools and other IIGS sounds.

The major stumbling block in using IIGS sound on xTV is the excess amount of noise currently inherent in the IIGS's audio output. It is of utmost importance that the GS sound be as clean as possible. Without good quality sound, the high-level music tools are essentially useless and the IIGS's other potential sound uses on xTV will be more of a distraction than an enrichment. We heartily urge that any and all steps be taken to rid the IIGS of its noise problem.


xTV High-Level Music Tools

A set of high-level music tools in the xTV Workshop will give teachers and students the ability to enhance their personally constructed segments.


•My Little Maestro™

The difficulties in creating a true auto-compositional program for the creation of "good" original music are enormous. My Little Maestro will avoid the impossible. It is not the intent of this program to compose original music, rather its intent is to create interesting yet non-distracting musical accompaniment.

My Little Maestro is a pseudo auto-compositional tool for xTV. It enables teachers and students to have their personally designed segments accompanied by music with very little effort. My Little Maestro will ask for a style of music (rock, country, jazz, classical, etc.), and proceed to create a score that can be easily coordinated with the preselected slides. It will use several pre-generated loops of notes and rhythms for each of the various musical styles, and combine them in a semi-intelligent manner. By offering enough stylistic, rhythmic and melodic variety, it should be able to generate a great number of usable musical pieces.
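One way the loop-combining approach might work is sketched below. Everything in it is an assumption made for illustration: the loop names, their lengths in bars, and the simple rule of avoiding an immediate repeat are hypothetical stand-ins for whatever "semi-intelligent" rules My Little Maestro eventually uses.

```python
import random

# Hypothetical loop library: style -> list of (loop name, length in bars).
LOOPS = {
    "rock": [("rock_a", 4), ("rock_b", 4), ("rock_fill", 2)],
    "jazz": [("jazz_a", 4), ("jazz_b", 8), ("jazz_turnaround", 2)],
}


def build_score(style, total_bars, seed=None):
    """Chain loops of one style without overshooting total_bars.

    Avoids playing the same loop twice in a row when another loop fits.
    """
    rng = random.Random(seed)
    score, remaining, previous = [], total_bars, None
    while remaining > 0:
        fitting = [loop for loop in LOOPS[style] if loop[1] <= remaining]
        if not fitting:
            break  # no loop is short enough to fill the remaining gap
        fresh = [loop for loop in fitting if loop[0] != previous]
        name, bars = rng.choice(fresh or fitting)
        score.append(name)
        remaining -= bars
        previous = name
    return score


print(build_score("rock", 16, seed=1))
```

The total length in bars would be chosen to match the running time of the preselected slides, and a fixed seed makes a piece reproducible once a teacher or student has found one they like.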


•Band On a Chip™

For the teacher or student who wishes to work with music at a deeper level than My Little Maestro allows, Band On a Chip (BOC) provides an alternative. BOC is similar in function to many inexpensive keyboards. It provides a selection of rhythms (rock, swing, march, bossa-nova, etc.), instrument definitions (flute, piano, organ, guitar, etc.), and the ability to record (sequence) a composition to accompany an xTV presentation.

BOC will make extensive use of Apple's note synthesizer and the AIF to control the DOC. It will need a very simple interface, perhaps modeled after a Casio type keyboard, that allows teachers and students to play with musical structure without requiring them to learn music theory.


Other IIGS Sound on xTV

Besides the high-level music tools, the IIGS's sound capabilities can be taken advantage of in other ways.


• Sampling

Using the sampling capacity of the IIGS and the Free Form synthesizer, it is possible to build a library of sound effects that could either augment the prerecorded audio portions of xTV or be added to teacher/student presentations.

According to Apple engineers, it may soon be possible to use the Free Form synthesizer in conjunction with the note synthesizer, to add compressed narration or an instrument lead line to a note synthesizer sequence.


• Aural feedback

In the very real world in which we live, we receive feedback and assurance through more than just our eyes. While visual feedback is vital to most human processes, we often comprehend much of a given situation with our other senses, most notably our ears. The clickings, whirrings, and buzzings of the mechanical world surround us. While driving, our ears let us know how well our car is running. When we hear the hum of kitchen appliances, we are assured that they are doing their jobs. And while wandering through the rest of the world, our ears work very closely with our eyes to guide us.

It is this type of aural feedback that should be a part of the xTV Workshop. By having the xTV remote control generate tones when buttons are pressed, or having each channel on xTV make its own unique sound, this feedback will let the teacher or student using xTV know that they are in the right place, on the right channel, that the system is listening to them, and that everything is A-OK. Using the sound generators of the IIGS, we hope to be able to explore and develop this concept.

xTV Accessory Board

The xTV Accessory Board will contain all of the non-standard Apple IIGS hardware that is necessary to the xTV system. The board needs an audio section (described below), a CD-ROM port, and ROM. (See the companion report on the xTV Classroom System)

The audio section of the board must have a decompressor for the audio data coming in the board's CD-ROM port, an 8-bit digital to analog converter, an audio amplifier, and an audio source switcher to select between the three xTV sound sources (compressed audio from the CD-ROM, the videodisc, and the IIGS). It would be very desirable to replace the audio source switcher with an audio mixer so that rather than just being able to select the current sound source, you could mix together all three xTV sources. This could be useful for adding sound effects from the IIGS to an already created score on the videodisc or the CD-ROM, mixing a heavily compressed narration from the CD-ROM with music from the IIGS, or any number of creative mixing applications.
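The difference between a source switcher and a mixer can be illustrated in software, although on the Accessory Board both would be done in hardware. The sketch below treats each source as a list of signed 8-bit sample values, which is an assumption made for illustration; the source names, gains, and function names are hypothetical.

```python
def select_source(sources, chosen):
    """Switcher: pass exactly one source through unchanged."""
    return list(sources[chosen])


def mix_sources(sources, gains):
    """Mixer: weighted sum of all sources, clipped to the signed 8-bit range."""
    length = min(len(samples) for samples in sources.values())
    mixed = []
    for i in range(length):
        total = sum(
            gains.get(name, 0.0) * samples[i]
            for name, samples in sources.items()
        )
        mixed.append(max(-128, min(127, round(total))))
    return mixed


# Hypothetical sample data for the three xTV sources.
sources = {
    "cd_rom": [100, -100, 50],
    "videodisc": [100, -100, 0],
    "iigs": [0, 0, 20],
}

print(select_source(sources, "iigs"))
print(mix_sources(sources, {"cd_rom": 0.5, "videodisc": 0.0, "iigs": 0.5}))
```

The second call corresponds to the case described above of laying IIGS sound effects or music over narration from the CD-ROM. Lowering the gains before summing keeps the mixed signal from clipping.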



Technical Summary

The two technical demands that must be met in order for xTV to be successful are a good audio compression scheme and the necessary cleanup of the IIGS audio output.

Without a good compression scheme and capable decompression electronics, it will be impossible for xTV to keep an entire semester's worth of educational material online and easily accessible. While audio quality remains very important, the more digital audio data that can be compressed onto a single CD-ROM, the more cost effective and genuinely useful the xTV setup can become. We need to create a product so easy to use that once the system is set up, it takes nothing more than turning on a single switch to be up and running. Being able to have all of the soundtracks online all of the time is a necessary part of the xTV system.

The other technical problem that must be tackled is the cleanup of the IIGS audio output. It is impossible for this report to stress too strongly the opinion that if the audio quality of the IIGS is not significantly improved, its use in the xTV system will be severely limited, or even non-existent.