In general, the audiobook read should be well-paced, easy to understand, and up to the same professional industry standards found in other audiobooks. If you have not spent much time listening to audiobooks, please do so before embarking on the labor-intensive process of personally narrating and editing an audiobook.
Audiobooks should be recorded in 16 bit / 44.1 kHz wav file format, which is considered CD quality and is best for archiving. Once you have fully produced your audio file it should be saved as a 192kbps mp3. This is the format that you will upload to ACX. Generally, audiobooks are recorded by one of two methods:
Punch Record: This is the most common method. When a mistake is made, go back to the previous sentence or break, and punch in (pick up the read from where you left off). Be careful not to cut off breaths, as the recording should sound clean, and any chopped audio will need to be fixed later. One of the main pros of this method is that editing becomes much quicker, since the book is already put together during recording. One of the cons is that it takes a reasonable amount of audio production experience to be able to “punch” cleanly and correctly.
The second method is less common. When a mistake is made, recording does not stop. The reader simply goes back and re-reads from the previous sentence or break. Later, the takes are edited together (and the mistakes are cut out). One of the pros of this method is that it’s faster to record the audiobook, and there are fewer interruptions. One of the cons is that it will take about three times as long to edit the book in the end.
Room Tone should be recorded at the end of every day of recording, using the same settings that were used to record the narration. This is ‘dead air’, where only the mic is recorded, by itself, capturing the ambience of the room. Later, this Room Tone is added between sentences, and used to mask noises and adjust pacing, in the edit.
Please record opening and closing credits for the book. The opening credits should be placed at the very start of the audiobook (before any narration), and the closing credits should be at the tail end. We suggest the following template:
Opening credits:
“[title of audiobook]
Written by [name of author]
Narrated by [name of narrator]”
Closing credits:
“This has been [title of audiobook]
Written by [name of author]
Narrated by [name of narrator]
Copyright [year and name of copyright holder]
Production copyright [year it was recorded] by [company name]”
Noises caused by the narrator moving around (this can be clothing, jewelry, rustling papers, or rubbing against the chair, for example) should be avoided. These sounds can take the listener ‘out of the element’, and ruin the experience.
Wind from the narrator’s mouth can often hit the mic too hard, and cause a plosive. This is an unwanted pop which can occur on any word, but usually on ones that start with the letter “P”. Plosives sound bad, and are distracting to the listener. If a plosive occurs, the line should be re-read, while making adjustments to the position of the narrator and/or force with which the word is read, so that no pop is audible.
If the audio is too loud, it will distort. The level meter will usually indicate this by turning red, or by showing that you have gone over the ‘zero’ mark. Often, distortion cannot be heard while playing back the audio within the recording software, because there is built-in ‘headroom’, which allows you to decrease the volume of that part in the edit without having to re-record. But even if distortion is not heard, if the level hits ‘zero’ (or goes into the red), it will distort once we encode the book for Audible.com and iTunes. To avoid all distortion in the audiobook, the level meter must be constantly monitored, keeping it below zero at all times.
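As a complement to watching the meter, a recording can also be scanned afterwards for samples that reached the ‘zero’ mark. The following is a minimal sketch, assuming the audio has already been loaded as a list of 16-bit integer samples; the function name and the `limit` parameter are illustrative and not part of any ACX tool.

```python
def find_clipped_times(samples, rate=44100, limit=32767):
    """Return the times (in seconds) of samples at or beyond full scale.

    Assumes 16-bit audio, where full scale is +32767 / -32768.
    A sample at full scale corresponds to the meter hitting 'zero'.
    """
    return [i / rate for i, s in enumerate(samples) if abs(s) >= limit]
```

Any time returned by this function marks a spot worth re-recording or lowering in the edit. An empty list only means that no sample reached full scale; it does not replace listening to the audio.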
Any part of the audiobook that requires consideration for continuity should be treated with care, so that it remains consistent throughout. This can include distinct character voices, accents, or alternate pronunciations. It’s always a good idea to add markers during recording (or manually note the time) for anything that may need continuity. This way, it’s easy to go back and see how it was read the first time. For example, the narrator may give a minor character a child-like voice in chapter one, but then forget what it sounded like when the character recurs later. In another example, the narrator may choose how to read a word that has two dictionary-accepted pronunciations. Later, if that word reappears in the text, it needs to be pronounced the exact same way. Markers (or notation of the record time) make it easier to locate these kinds of things.
The narrator should read at a pace which can be followed by any reasonable listener of audiobooks. A diverse group of people listen to audiobooks, and many listeners will not be able to follow along if the narrator is reading too fast. A pace which is too quick can also lead to skipping words, sentences, or paragraphs, which can cause edit and QC time to take longer. (Please see the Editing Guidelines below for more details on proper pacing, and how to know if it’s the correct speed.)
Once the audiobook has been recorded, it should be fully edited and QC’d, as follows:
All books should be separated into chapters, so that each file starts at zero and is easier to edit. Prologues and Epilogues should also be kept as separate files. The final edited master should contain the entire book broken up into files containing one chapter each.
Pacing is the editor’s art. The editor’s main objective is to keep the read natural sounding and flowing, as well as to preserve and promote the phrasing and dramatic intent of the reader and author. Nothing is more disturbing to the listening experience than an unnatural presentation of words or phrases, or of the spaces between them. As a suggestion, sit back at the start (and then, again, at the end) of the editing process, and simply listen to a few minutes of the audiobook, while closing your eyes and asking, ‘Is this moving too fast? Too slowly? What pace feels right?’ You may also have a trusted friend (who listens to audiobooks regularly) listen for the same.
Always use Room Tone. Never leave silence or spaces of any kind. Please listen carefully to the Room Tone and make sure it is free of all clicks, pops, and background noise. All spaces between words, phrases and sentences should be void of clicks, pops, smacks or any other sounds (stomach noises, car horns, etc.) and should be replaced when necessary with Room Tone only.
Breaths at the beginning of paragraphs should often be removed. Elsewhere, preserve breaths unless they disturb the flow of the phrase. More breaths should be removed in an instructional read than in a dramatic read. Breaths that seem “out of the element” of the narration should be cut. Breaths that feel natural should be left in.
There should be exactly 500ms (0.5 seconds) at the head of each file, and exactly 3.5 seconds at the tail. There should be exactly 2.5 seconds after the narrator announces the chapter (“chapter x”).
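At 44.1 kHz, these timings translate into exact sample counts: 0.5 seconds is 22,050 samples, 2.5 seconds is 110,250 samples, and 3.5 seconds is 154,350 samples. The sketch below shows one way to build the head and tail from Room Tone. It assumes that the narration and Room Tone are plain lists of samples and that the narration has already been trimmed to start and end exactly on speech; the function names are hypothetical.

```python
def room_tone_of_length(room_tone, seconds, rate=44100):
    """Loop a Room Tone clip (a list of samples) to exactly `seconds` long."""
    if not room_tone:
        raise ValueError("room_tone must contain at least one sample")
    needed = round(seconds * rate)
    repeats = -(-needed // len(room_tone))  # ceiling division
    return (list(room_tone) * repeats)[:needed]


def add_head_and_tail(narration, room_tone, head_s=0.5, tail_s=3.5, rate=44100):
    """Surround trimmed narration with Room Tone at the head and tail."""
    head = room_tone_of_length(room_tone, head_s, rate)
    tail = room_tone_of_length(room_tone, tail_s, rate)
    return head + list(narration) + tail
```

Looping a short clip this way can produce an audible seam at each repeat, so a Room Tone clip longer than 3.5 seconds is the safer input.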
Remove as many clicks and undesirable mouth sounds from within words and phrases as time will allow. A baseline production value should be established at the beginning of the edit that is adhered to throughout the program. Care should be taken when fixing or removing any undesirable sounds around or within words so that the end result actually sounds better.
A good editor, working from a good Punch Record, normally spends two to four hours to edit a finished hour of audio (i.e. 16 to 32 hours to edit an 8-hour audiobook). The more meticulously they adhere to the above rules, the longer it will take.
Every audiobook should go through a Quality Control (QC) process. The audiobook’s producer (i.e. the Narrator or other Studio Professional) should listen once upon completion. However, the Rights Holder (i.e. the Author, Agent, or Publisher) has final sign-off on the audiobook project.
The process of Quality Control begins after the edit. When problems arise in the read that cannot be readily repaired, such as omitted words or phrases, misreads etc., the editor should note the chapter, line, time and nature of the issue. During QC, the editor reviews the edited material to detect and repair all errors. Once mistakes are noted, they should be re-read and fixed in the original audio files. Re-records usually include misreads, omissions, mispronunciations, and noises that could not be patched with Room Tone. All of this must be corrected and fixed by the narrator, then inserted by the editor, before the edit is considered ‘complete’. The corrections should sound seamless, and the listener should not be able to hear any difference in audio quality or be able to tell that the correction was inserted into the narration.
In the end, the retail-ready audiobook should contain no misreads (whether minor or major), no mispronunciations, no cut-off breaths or other editing noises, and no unwanted external noises (which could be anything such as a stomach growl, a knock on the table, or a car horn from the street). There should not be any parts of the book where the listener can hear the narrator moving papers or making mouth noises in between sentences or paragraphs (that should all be covered up with Room Tone). Finally, the QC process should serve as a catch-all to correct these errors, and any other defects in the audiobook (such as a duplicated sentence, or a strip of silence).
Once the book is fully edited and QC’d, it should be mastered. Mastering is the process of adjusting the sound to make it more even and "listenable".
Your submitted files should measure between -23dB and -18dB RMS, with peaks hovering around -3dB. Your noise floor should fall between -60dB and -50dB.
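These three measurements can be approximated with a short script. This is a rough sketch under several assumptions that the text above does not state: the values are measured in dB relative to 16-bit full scale, “hovering around -3dB” is treated as “no higher than -3dB”, and the noise floor is taken as the RMS level of a Room Tone clip. All function names are illustrative.

```python
import math


def to_db(ratio):
    """Convert an amplitude ratio to decibels."""
    return 20 * math.log10(ratio) if ratio > 0 else float("-inf")


def rms_db(samples, full_scale=32768):
    """RMS level of 16-bit samples, in dB relative to full scale."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    return to_db(math.sqrt(mean_square) / full_scale)


def peak_db(samples, full_scale=32768):
    """Highest absolute sample value, in dB relative to full scale."""
    peak = max((abs(s) for s in samples), default=0)
    return to_db(peak / full_scale)


def check_levels(narration, room_tone):
    """Compare measured levels with the ranges given in the text."""
    return {
        "rms_ok": -23 <= rms_db(narration) <= -18,
        "peak_ok": peak_db(narration) <= -3,  # assumption: at or below -3 dB
        "noise_floor_ok": -60 <= rms_db(room_tone) <= -50,
    }
```

Dedicated metering tools may weight or window the signal differently, so results from this sketch should be read as an estimate.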
Making the audiobook levels louder and more even throughout is vital. Typically, this is achieved by RMS normalization to around -20dB, or by compression/limiting. Compression should be applied with a fast attack and release, at a ratio of around 3:1. A hard limiter may also be used. Audiobooks are also EQ’d at this stage, to sweeten the sound and make it more pleasing to the ear. Often, muddled low end and mid-range is cut to make the audiobook sound more clear and smooth.
Though such methods work for many recordings, they will not work in all instances and what matters most is achieving the levels noted in the first paragraph.
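The RMS normalization step mentioned above comes down to a single gain value: the target level minus the measured level, both in dB. Here is a minimal sketch, assuming 16-bit samples held in a list and levels measured relative to full scale; the function names are hypothetical, and compression, limiting and EQ are not modeled.

```python
import math


def rms_db(samples, full_scale=32768):
    """RMS level of 16-bit samples, in dB relative to full scale."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")
    return 20 * math.log10(math.sqrt(mean_square) / full_scale)


def normalization_gain_db(samples, target_db=-20.0):
    """Gain (in dB) that would bring the RMS level to the target."""
    return target_db - rms_db(samples)


def apply_gain(samples, gain_db):
    """Scale samples by a gain in dB. No limiting is applied, so the
    result may exceed the 16-bit range if the gain is too high."""
    factor = 10 ** (gain_db / 20)
    return [round(s * factor) for s in samples]
```

For example, a file measuring -30dB RMS would need about 10dB of gain. If the peaks then rise too high after the gain is applied, that is where compression or a limiter comes in.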
We recommend that all files be archived in WAV format and saved as 192 kbps MP3s, the file format required for uploading to ACX.
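Before archiving, the WAV files can be checked against the 16 bit / 44.1 kHz recording format. The sketch below uses Python's standard `wave` module to read the file header; the function name is hypothetical, and it does not inspect the MP3 files.

```python
import wave


def check_wav_format(path, expected_bits=16, expected_rate=44100):
    """Return a list of problems found in a WAV header (empty if none)."""
    problems = []
    with wave.open(path, "rb") as wav:
        bits = wav.getsampwidth() * 8  # sample width is reported in bytes
        rate = wav.getframerate()
    if bits != expected_bits:
        problems.append(f"bit depth is {bits}, expected {expected_bits}")
    if rate != expected_rate:
        problems.append(f"sample rate is {rate} Hz, expected {expected_rate} Hz")
    return problems
```

An empty list means the header matches the expected format. It says nothing about the quality of the audio itself.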
Please select and include a “sample” snippet of audio from your final audiobook production master. It will be used on Audible.com to let potential customers sample the audiobook. This should be a separate audio file, chosen using the following guidelines:
To upload audio files to ACX:
Please feel free to contact audio@acx.com with any technical or procedural questions.