We’ve come to love the Internet, both because it is so rich in content and inexpensive and, perhaps more importantly, because it allows us to define how we communicate. As its ability to carry richer forms of media advances, we’ll find ourselves using it more and more. Once Internet voice delivers quality that rivals (or betters) the capabilities of the PSTN, the phone company had better look for another line of business. The PSTN will cease to exist; all its complexity will be absorbed into the Internet, as just one more technology. As with most of the rest of the Internet, open source technologies will lead this transformation.
The dream of having our technical inventions talk to us is older than the telephone itself. Each new advance in technology spurs a new wave of eager experimentation. Generally, results never quite meet expectations, possibly because as soon as a machine says something that sounds intelligent, most people assume that it is intelligent.
People who program and maintain computers realize their limitations, and thus tend to allow for their weaknesses. Everybody else just expects their computers and software to work. The amount of thinking a user must do to interact with a computer is often inversely proportional to the amount of thinking the design team did. Simple interfaces belie complex design decisions.
The challenge, therefore, is to design a system that has anticipated the most common desires of its users, and can also adroitly handle unexpected challenges.
The Festival text-to-speech server can transform text into spoken words. While this is a whole lot of fun to play with, there are many challenges to overcome (for more on integrating Festival with Asterisk, refer back to the section called “Text-to-Speech Utilities”).
For Asterisk, an obvious value of text-to-speech might be the ability to have your telephone system read your emails back to you. If you’ve noticed the somewhat poor grammar, punctuation, and spelling typically found in email messages these days, you can perhaps appreciate the challenges this poses.
One cannot help but wonder if the emergence of text-to-speech will inspire a new generation of people dedicated to proper writing. Seeing spelling and punctuation errors on the screen is frustrating enough—having to hear a computer speak such things will require a level of Zazen that few possess.
If text-to-speech is rocket science, speech recognition is science fiction.
Speech recognition can actually work very well, but unfortunately this is generally true only if you provide it with the right conditions—and the right conditions are not those found on a telephone network. Even a perfect PSTN connection is considered to be at the lowest acceptable limit for accurate speech recognition. Add in compressed and lossy VoIP connections, or a cell phone, and you will discover far more limitations than uses.
Asterisk now has an entire speech API, so that outside companies (or even open source projects) can tie their speech recognition engines into Asterisk. One company that has done this is LumenVox. By using LumenVox’s speech recognition engine along with Asterisk, you can make voice-driven menus and IVR systems in record time! For more information, see http://www.lumenvox.com.
As we gain access to more and more bandwidth, it becomes less and less easy to understand why we still use low-fidelity codecs. Many people do not realize that Skype provides higher fidelity than a telephone; it’s a large part of the reason why Skype has a reputation for sounding so good.
If you were ever to phone CNN, wouldn’t you love to hear James Earl Jones’s mellifluous voice saying “This is CNN,” instead of some tinny electronic recording? And if you think Allison Smith[185] sounds good through the phone, you should hear her in person!
In the future, we will expect, and get, high-fidelity voice through our communications equipment.
As more and more hardware vendors start building support for high-fidelity voice into their VoIP hardware, you’ll see more support in Asterisk for making better-than-PSTN-quality calls.
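To put rough numbers on the fidelity gap, here is a minimal sketch comparing uncompressed narrowband telephone audio with a wideband stream. It assumes the classic PSTN encoding of 8,000 samples per second at 8 bits per sample, and an illustrative wideband stream sampled at 16,000 samples per second with 16-bit samples. The function names are hypothetical, and real wideband codecs compress the audio, so their on-the-wire bitrate is lower than the raw figure shown here.

```python
def pcm_bitrate_kbps(sample_rate_hz: int, bits_per_sample: int) -> float:
    """Raw (uncompressed) bitrate of a mono PCM stream, in kilobits per second."""
    return sample_rate_hz * bits_per_sample / 1000


def max_audio_frequency_hz(sample_rate_hz: int) -> float:
    """Highest audio frequency a given sample rate can represent (Nyquist limit)."""
    return sample_rate_hz / 2


# Illustrative profiles (assumptions, not measurements of any product).
profiles = {
    "narrowband telephone": (8000, 8),
    "wideband (illustrative)": (16000, 16),
}

for name, (rate, bits) in profiles.items():
    print(
        f"{name}: {pcm_bitrate_kbps(rate, bits):.0f} kbps raw, "
        f"audio up to {max_audio_frequency_hz(rate):.0f} Hz"
    )
```

Doubling the sample rate doubles the range of frequencies that can be carried, which is where much of the perceived improvement in clarity comes from.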
While most of this book focuses on audio, video is also supported in many ways within Asterisk. Video support is not complete, however. The problem is not so much one of functionality as one of bandwidth and processing power. Asterisk 1.10 is expected to contain better support for handling media, including video.
The concept of videoconferencing has been around since the invention of the cathode ray tube. The telecom industry has been promising a videoconferencing device in every home for decades.
As with so many other communications technologies, if you have videoconferencing in your house, you are probably running it over the Internet, with a simple, inexpensive webcam. Still, it seems that people see videoconferencing as a bit gimmicky. Yes, you can see the person you’re talking to, but there’s something missing.
Videoconferencing promises a richer communications experience than the telephone. Rather than simply hearing a disembodied voice, you have access to all the nuances of speech that come from face-to-face communication.
There are some challenges to overcome, though, and not all of them are technical.
Consider this: using a plain telephone, people working from their home offices can have business conversations unshowered, in their underwear, feet on the desk, coffee in hand. A similar video conversation would require half an hour of grooming to prepare for, and couldn’t happen in the kitchen, on the patio, or…well, you get the idea.
Also, the promise of eye-to-eye communication over video will never happen as long as the focal points of the participants are not in line with the cameras. If you look at the camera, your audience will see you looking at them, but you won’t see them. If you look at your screen to see whom you are talking to, the camera will show you looking down at something—not at your audience. That looks impersonal. Perhaps if a videophone could be designed like a Tele-Prompt-R, where the camera was behind the screen, it wouldn’t feel so unnatural. As it stands, there’s something psychological that’s missing. Video ends up being a gimmick.
Since Asterisk is fully VoIP-enabled, wireless is all part of the package.
Unified messaging is a term that has been hyped by the telecom industry for years, but adoption has been far slower than predicted.
Unified messaging refers to the concept of tying voice and text-messaging systems into one. With Asterisk, the two don’t need to be artificially combined, as Asterisk already treats them the same way.
Just by examining the terms, unified and messaging, we can see that the integration of email and voicemail must be merely the beginning—unified messaging needs to do a lot more than just that if it is to deserve its name.
Perhaps we need to define “messaging” as communication that does not occur in real time. In other words, when you send a message, you expect that the reply may take moments, minutes, hours, or even days to arrive. You compose what you wish to say, and your audience is expected to compose a reply.
Contrast this with conversing, which happens in real time. When you talk to someone on a telephone connection, you expect no more than a few seconds’ delay before the response arrives.
Several years ago, Tim O’Reilly delivered a speech entitled “Watching the Alpha Geeks: OS X and the Next Big Thing” (http://www.macdevcenter.com/pub/a/mac/2002/05/14/oreilly_wwdc_keynote.html), in which he talked about someone piping IRC through a text-to-speech engine. One could imagine doing the reverse as well, allowing us to join an IRC or instant messaging chat over a WiFi phone, with our Asterisk PBX providing the speech-to-text-to-speech translations.
As monopoly networks such as the PSTN give way to community-based networks like the Internet, there will be a period of time where it is necessary to interconnect the two. While the traditional providers would prefer that the existing model be carried into the new paradigm, it is increasingly likely that telephone calls will become little more than another application the Internet happily carries.
But a challenge remains: how to manage the telephone numbering plan with which we are all familiar and comfortable?
The ITU defined a numbering plan in its E.164 specification. If you’ve used a telephone to make a call across the PSTN, you can confidently state that you are familiar with the concept of E.164 numbering. Prior to the advent of publicly available VoIP, nobody cared about E.164 except the telephone companies—nobody needed to.
Now that calls are hopping from PSTN to Internet to who-knows-what, some consideration must be given to E.164.
In response to this challenge, the IETF has sponsored the Telephone Number Mapping (ENUM) working group, whose purpose is to map E.164 numbers into the Domain Name System (DNS).
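The mapping itself is simple enough to sketch. In the standard ENUM scheme, the digits of the E.164 number are reversed, separated by dots, and appended to a DNS zone (`e164.arpa` for the public tree). The function below is a minimal illustration of that transformation only; its name and the sample number are made up for this example, and it performs no DNS lookup.

```python
def e164_to_enum_domain(number: str, zone: str = "e164.arpa") -> str:
    """Convert an E.164 number such as '+1 555 123 4567' to its ENUM domain name."""
    # Keep only the ASCII digits; drop '+', spaces, and other punctuation.
    digits = [ch for ch in number if "0" <= ch <= "9"]
    if not digits:
        raise ValueError("no digits found in number")
    if len(digits) > 15:
        # E.164 numbers are limited to 15 digits.
        raise ValueError("E.164 numbers have at most 15 digits")
    # Reverse the digits so the most specific part comes first, as DNS expects.
    return ".".join(reversed(digits)) + "." + zone


print(e164_to_enum_domain("+1 555 123 4567"))
# prints 7.6.5.4.3.2.1.5.5.5.1.e164.arpa
```

A resolver would then query that name for NAPTR records to learn which protocols and addresses can reach the number. Passing a different `zone` is a way to target an alternative tree, on the assumption that it follows the same digit-reversal convention.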
While the concept of ENUM is sound, it requires cooperation from the telecom industry to achieve success. However, cooperation is not what the telecom industry is famous for, and thus far ENUM has foundered.
The folks at http://e164.org are trying to contribute to the success of ENUM. You can log onto this site, register your phone number, and inform the system of alternative methods of communicating with you. This means that someone who knows your phone number can connect a VoIP call to you, as the http://e164.org DNS zone will provide the IP addressing and protocol information needed to connect to your location.
As more and more people publish VoIP connectivity information, fewer and fewer calls will be connected through the PSTN.
As is true with any worthwhile thing, Asterisk will face challenges. Let’s take a glance at what some of them may be.
These days, the Internet is changing so fast, and offers so much diverse content, that it is impossible for even the most attentive geek to keep on top of it all. While this is as it should be, it also means that an enormous amount of technology churn is an inevitable part of keeping any communications system current.
As long as long-distance calls cost money, there will be criminals who wish to steal them. Toll fraud is nothing new, but with many unsecured Asterisk systems now on the Internet, the popularity of scripts that find and compromise these systems has exploded. Administrators of Internet-connected telephone systems will need to carefully design their security to ensure that any calls made from their systems are made only by authorized users.
Yes, voice spam is coming. There will always be people who believe they have the right to inconvenience and harass others in their pursuit of money. Efforts are under way to try to address this, but only time will tell how efficacious they will be.
The industry is making the transition from ignorance to laughter. If Gandhi is correct, we can expect the fight to begin soon.
As their revenue streams become increasingly threatened by open source telephony, the traditional industry players are certain to mount a fear campaign, in hopes of undermining the revolution.
There is a rumor that the major network providers will artificially cripple VoIP traffic by tagging and prioritizing the traffic of their premium VoIP services and, worse, detecting and bumping any VoIP traffic generated by services not approved by them.
Some of this is already taking place, with service providers blocking traffic of certain types through their networks, ostensibly as some public service (such as blocking popular file-sharing services to protect us from piracy). In the United States, the FCC has taken a clear stand on the matter and fined companies that engage in such practices. In the rest of the world, regulatory bodies are not always as accepting of VoIP.
What seems clear is that the community and the network will find ways around blockages, just as they always have.
A former chairman of the United States Federal Communications Commission, Michael Powell, delivered a gift that may well have altered the path of the VoIP revolution. Rather than attempting to regulate VoIP as a telecom service, he championed the concept that VoIP represents an entirely new way of communicating and requires its own regulatory space in which to evolve.
VoIP will become regulated, but not everywhere as a telephony service. Some of the regulations that may be created include:
One of the characteristics of a traditional PSTN circuit is that it is always in the same location. This is very helpful to emergency services, as they can pinpoint the location of a caller by identifying the address of the circuit from which the call was placed. The proliferation of cell phones has made this much more difficult to achieve, since a cell phone does not have a known address. VoIP creates similar challenges: a VoIP phone can be plugged into any network and can register to any server. If the phone does not identify its physical location, an emergency call from it will provide no clue as to where the caller is.
Law enforcement agencies have always been able to obtain wiretaps on traditional circuit-switched telephone lines. While regulations are being enacted that are designed to achieve the same end on the network, the technical challenges of delivering this functionality will probably never be completely solved. People value their privacy, and the more governments want to stifle it, the more effort will be put toward maintaining it.
These practices are already being seen in the US, with fines being levied against network providers who attempt to filter traffic based on content.
When it comes to regulation, Asterisk is both a saint and a devil: a saint because it feeds the poor, and a devil because it empowers the phreakers and spammers like nothing ever has. The regulation of open source telephony may in part be determined by how well the community regulates itself. Concepts such as DUNDi, which incorporate anti-spam processes, are an excellent start. On the other hand, concepts such as caller ID spoofing are ripe for abuse.
Due to the best-effort reality of the TCP/IP-based Internet, it is not yet known how increasing real-time VoIP traffic will affect overall network performance. Currently, there is so much excess bandwidth in the backbone that best-effort delivery is generally quite good indeed. Still, it has been proven time and time again that whenever we are provided with more bandwidth, we figure out a way to use it up. The 1 Mbps DSL connection undreamt of five years ago is now barely adequate.
Perhaps a corollary of Moore’s Law[186] will apply to network bandwidth. QoS may become moot, due to the network’s ability to deliver adequate performance without any special processing. Organizations that require higher levels of reliability may elect to pay a premium for a higher grade of service. Perhaps the era of paying by the minute for long-distance connections will give way to paying by the millisecond for guaranteed low latency, or by the percentage point for reduced packet loss. Premium services will offer the five-nines[187] reliability the traditional telecom companies have always touted as their advantage over VoIP.
Open systems require new approaches to solution design. Just because the hardware and software are cheap doesn’t mean the solution will be. Asterisk does not come out of the box ready to run; an Asterisk system has to be designed and built, and then maintained. While the base software is free, and the hardware costs will be based on commodity pricing, it is fair to say that the configuration costs for a highly customized system will be a sizable part of the overall solution cost. In fact, in many cases, because of Asterisk’s high degree of complexity and configurability, the cost will be more than would be expected with a traditional PBX.
The rule of thumb is generally considered to be something like this: if it can be done in the dialplan, the system design effort will be roughly the same as for any similarly featured traditional PBX. Beyond that, only experience will allow one to accurately estimate the time required to build a system.
Open source telephony creates limitless opportunities. Here are some of the more compelling ones.
Some people will tell you that price is the key, but we believe that the real reason Asterisk will succeed is that it is now possible to build a telephone system as one would a website: with complete, total customization of each and every facet of the system. Customers have wanted this for years. Only Asterisk can deliver.
Anyone can contribute to the future of communicating. It is now possible for someone with an old $200 PC to develop a communications system that has the intelligence to rival the most expensive proprietary systems. Granted, the hardware would not be production-ready, but there is no reason the software couldn’t be. This is one of the reasons why closed systems will have a hard time competing. The sheer number of people who have access to the required equipment is impossible to equal in a closed shop.
The design of a PBX was always a kind of art form, but before Asterisk, the art lay in finding creative ways to overcome the limitations of the technology. With limitless technology, those same creative skills can now be properly applied to the task of completely answering the needs of the customer. Open source telephony engines such as Asterisk will enable this. Telecom designers will dance for joy, as their considerable creative skills will now actually serve the needs of their customers, rather than being focused on managing kludge.
Ultimately, the promise of open source comes to nothing if it cannot fulfill the need people have to solve problems. The closed industries lost sight of the customer, and tried to fit the customer to the product.
Open source telephony brings voice communications in line with other information technologies. It is finally possible to properly begin the task of integrating email, voice, video, and anything else we might conceive of over flexible transport networks (whether wired or wireless), in response to the needs of the user, not the whims of monopolies.
[185] Allison Smith is The Voice of Asterisk—it is her voice in all of the system prompts. To have Allison produce your own prompt, simply visit http://www.theivrvoice.com.
[186] Gordon Moore wrote a paper in 1965 predicting that the number of transistors on an integrated circuit would double every year; in 1975 he revised the rate to roughly every two years.
[187] This term refers to 99.999%, which is touted as the reliability of traditional telecom networks. Achieving five nines requires that service interruptions for an entire year total no more than 5 minutes and 15 seconds. Many people believe that VoIP will need to achieve this level of reliability before it can fully replace the PSTN. Many other people believe that the PSTN doesn’t even come close to five-nines reliability. This could have been an excellent term to describe high reliability, but marketing departments abuse it far too frequently.
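The downtime figure quoted for five nines can be checked with a short calculation. This is a minimal sketch that assumes a 365-day year; the function name is hypothetical, and the other availability levels are included only for comparison.

```python
def allowed_downtime_seconds(availability_percent: float, days_per_year: float = 365.0) -> float:
    """Seconds of outage per year permitted at a given availability percentage."""
    seconds_per_year = days_per_year * 24 * 60 * 60
    return seconds_per_year * (1 - availability_percent / 100)


for level in (99.9, 99.99, 99.999):
    minutes, seconds = divmod(allowed_downtime_seconds(level), 60)
    print(f"{level}% availability: {int(minutes)} min {seconds:.2f} s of downtime per year")
```

At 99.999% the budget works out to about 315 seconds, which matches the 5 minutes and 15 seconds cited above.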