WebRTC Integrator's Guide
上QQ阅读APP看书,第一时间看更新

Running WebRTC without SIP

HTML5 websockets can be defined by ws:// followed by the URL in the server field while readying a WebRTC client for registration. This enables bidirectional, duplex communications with server-side processes, that is, server-side push events to the client. It also enables the handshake after sharing media metadata such as ports, codecs, and so on.

It should be noted that WebRTC works in an offer/answer mode and has ways of traversing the Network Address Translation (NAT) and firewalls by means of Interactive Connectivity Establishment (ICE). ICE makes use of the Session Traversal Utilities for NAT (STUN) protocol and its extension, Traversal Using Relay NAT (TURN). This is covered later in the chapter.

Sending media over WebSockets

WebRTC mainly comprises three operations: fetching user media from a camera/microphone, transmitting media over a channel, and sending messages over the channel. Now, let's take a look at the summarized description of every operation type.

getUserMedia

The JavaScript getUserMedia function (also known as MediaStream) is used to allow the web page to access users' media devices such as camera and microphone using the browser's native API, without the need of any other third-party plugins such as Adobe Flash and Microsoft Silverlight.

Tip

For simple demos of these methods, download the WebRTC read-only by executing the following command:

svn checkout http://webrtc.googlecode.com/svn/trunk/ webrtc-read-only

The following is the code to access the IP camera in the Google Chrome browser and display the local video in a <video/> element:

/*The HTML to define a button to begin the capture and HTML5 video element on web page body */
<video id="vid" autoplay="true"></video>
<button id="btn" onclick="start()">Start</button>

/*The JavaScript block contains the following function call to start the media capture using Chrome browser's getUserMedia function*/
video = document.getElementById("vid");
function start() {
  navigator.webkitGetUserMedia({video:true}, gotStream, function() {});
  btn.disabled = true;
}

/*The function to add the media stream to a video element on a page*/
  function gotStream(stream) {
  video.src = webkitURL.createObjectURL(stream);
}

Note

When the browser tries to access media devices such as a camera and mic from users, there is always a browser notification that asks for the user's permission.

Tip

Downloading the example code

You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you

The following screenshot depicts the user notification for granting permission to access the camera in Google Chrome:

The following screenshot depicts the user notification for granting permission to access the camera in Mozilla Firefox:

The following screenshot depicts the user notification for granting permission to access the camera in Opera:

RTCPeerConnection

In WebRTC, media traverses in a peer-to-peer fashion and is necessary to exchange information prior to setting up a communication path such as public IP and open ports. It is also necessary to know about the peer's codecs, their settings, bandwidth, and media types.

To make the peer connection, we will need a function to populate the values of the RTCPeerConnection, getUserMedia, attachMediaStream, and reattachMediaStream parameters. Due to the fact that the WebRTC standard is currently under development, the JavaScript API can change from one implementation to another. So, a web developer has to configure the RTCPeerConnection, getUserMedia, attachMediaStream, and reattachMediaStream variables in accordance to the browser on which we are running the HTML content.

Note

It is noted that WebRTC standards are in rapid evolution. The API that was used for the first version of WebRTC was the PeerConnection API, which had distinct methods for media transmission. As of now, the old PeerConnection API has been deprecated and a new enhanced version is under process. The new Media API has replaced the media streams handling in the old PeerConnection API.

The browser APIs of different browsers have different names. The criterion is to determine the browser on which the web page is opened and then call the appropriate function for the WebRTC operation. The identity of the browser can be determined by extracting a friendly name or checking for a match with a specific library name of the different browser. For example, when navigator.webkitGetUserMedia is true, then WebRTCDetectedBrowser = "chrome", and when navigator.mozGetUserMedia is true, then WebRTCDetectedBrowser = "firefox". The following table shows the W3C standard elements in Google Chrome and Mozilla Firefox:

Such methods also exist for Opera, which is a new addition to the WebRTC suite. Hopefully, Internet Explorer, in the future, would have native support for WebRTC standards. For other browsers such as Safari that don't support WebRTC as yet, there are temporary plugins that help capture and display the media elements, which can be used until these browsers release their own enhanced WebRTC supported versions. Creating WebRTC-compatible clients in Internet Explorer and Safari is discussed in Chapter 9, Native SIP Application and Interaction with WebRTC Clients.

The following code snippet is used to make an RTC peer connection and render videos from one HTML video frame to another on the same web page. The library file, adapter.js, is used, which renders the polyfill functionality to different browsers such as Mozilla Firefox and Google Chrome.

The HTML body content that includes two video elements for the local and remote videos, the text status area, and three buttons to start capturing, sending, and stop receiving the stream are given as follows:

<video id="vid1" autoplay="true" muted="true"></video>
<video id="vid2" autoplay></video>
<button id="btn1" onclick="start()">Start</button>
<button id="btn2" onclick="call()">Call</button>
<button id="btn3" onclick="hangup()">Hang Up</button>
<xtextarea id="ta1"></textarea>
<xtextarea id="ta2"></textarea>

The JavaScript program to transmit media from the video element to another at the click of the Start button, using the WebRTC API is given as follows:

/* setting the value of start, call and hangup to false initially*/
btn1.disabled = false;
btn2.disabled = true;
btn3.disabled = true;

/* declaration of global variables for peerconecection 1 and 2, local streams, sdp constrains */
var pc1,pc2;
var localstream;
var sdpConstraints = {'mandatory': {
                        'OfferToReceiveAudio':true,
                        'OfferToReceiveVideo':true }};

The following code snippet is the definition of the function that will get the user media for the camera and microphone input from the user:

function start() {
  btn1.disabled = true;
  getUserMedia({audio:true, video:true},
  /* get audio and video capture */
  gotStream, function() {});
}

The following code snippet is the definition of the function that will attach an input stream to the local video section and enable the call button:

function gotStream(stream){
  attachMediaStream(vid1, stream);
  localstream = stream;/* ready to call the peer*/
btn2.disabled = false;
}

The following code snippet is the function call to stream the video and audio content to the peer using RTCPeerConnection:

function call() {
  btn2.disabled = true;
  btn3.disabled = false;
  videoTracks = localstream.getVideoTracks();
  audioTracks = localstream.getAudioTracks();
  var servers = null;

  pc1 = new RTCPeerConnection(servers);/* peer1 connection to server */
  pc1.onicecandidate = iceCallback1;
  pc2 = new RTCPeerConnection(servers);/* peer2 connection to server */

  pc2.onicecandidate = iceCallback2;
  pc2.onaddstream = gotRemoteStream;
  pc1.addStream(localstream);
  pc1.createOffer(gotDescription1);
}

function gotDescription1(desc){/* getting SDP from offer by peer2 */
pc1.setLocalDescription(desc);
pc2.setRemoteDescription(desc);
pc2.createAnswer(gotDescription2, null, sdpConstraints);
}

function gotDescription2(desc){/* getting SDP from answer by peer1 */
pc2.setLocalDescription(desc);
pc1.setRemoteDescription(desc);
}

On clicking the Hang Up button, the following function closes both of the peer connections:

function hangup() {
  pc1.close();
  pc2.close();
  pc1 = null; /* peer1 connection to server closed */
  pc2 = null; /* peer2 connection to server closed */
  btn3.disabled = true; /* disables the Hang Up button */
  btn2.disabled = false; /*enables the Call button */

}

function gotRemoteStream(e){
  vid2.src = webkitURL.createObjectURL(e.stream);
}

function iceCallback1(event){
  if (event.candidate) {
    pc2.addIceCandidate(new RTCIceCandidate(event.candidate));
  }
}

function iceCallback2(event){
  if (event.candidate) {
    pc1.addIceCandidate(new RTCIceCandidate(event.candidate));
  }
}

In the preceding example, JSON/XHR (XMLHttpRequest) is the signaling mechanism. Both the peers, that is, the sender and receiver, are present on the same web page; this is represented by the two video elements shown in the following screenshot. They are currently in the noncommunicating state.

As soon as the Start button is hit, the user's microphone and camera begin to capture. The first peer is presented with the browser request to use their camera and microphone. After allowing the browser request, the first peer's media is successfully captured from their system and displayed on the screen. This is demonstrated in the following screenshot:

As soon as the user hits the Call button, the captured media stream is shared in the session with the second peer, who can view it on their own video element. The following screenshot depicts the two peers sharing a video stream:

The session can be discontinued by clicking on the Hang Up button.

RTCDataChannel

The DataChannel function is used to exchange text messages by creating a bidirectional data channel between two peers. The following is the code to demonstrate the working of RTCDataChannel.

The following code snippet is the HTML body of the code for the DataChannel function. It consists of a text area for the two peers to view the messages and three buttons to start the session, send the message, and stop receiving messages.

<div id="left">
<br>
<h2>Send data</h2>
<textarea id="dataChannelSend" rows="5" cols="15" disabled="true">
</textarea>
<br>
<button id="startButton" onclick="createConnection()">
  Start</button>
<button id="sendButton" onclick="sendData()">Send Data</button>
<button id="closeButton" onclick="closeDataChannels()">Stop Send Data
</button>
<br>
</div>
<div id="right">
<br>
<h2>Received Data</h2>
<textarea id="dataChannelReceive" rows="5" cols="15" disabled="true">
</textarea><br>
</div>

The style script for the text area is given as follows; to differentiate between the two peers, we place one text area aligned to the right and another to the left:

#left { position: absolute; left: 0; top: 0; width: 50%; }
#right { position: absolute; right: 0; top: 0; width: 50%; }

The JavaScript block that contains the functions to make the session and transmit the data is given as follows:

/*Declaring global parameters for both sides' peerconnection, sender, and receiver channel*/
var pc1, pc2, sendChannel, receiveChannel;

/*Only enable the Start button, keep the send data and stop send data button off*/
startButton.disabled = false;
sendButton.disabled = true;
closeButton.disabled = true;

The following code snippet is the script to create PeerConnection in Google Chrome, that is, webkitRTCPeerConnection that was seen in the previous table. It is noted that a user needs to have Google Chrome Version 25 or higher to test this code. Some old Chrome versions are also required to set the --enable-data-channels flag to the enabled state before using the DataChannel functions.

function createConnection() {
  var servers = null;
  pc1 = new webkitRTCPeerConnection(servers,{
    optional: [{RtpDataChannels: true}]});
    try {
      sendChannel = pc1.createDataChannel("sendDataChannel", {
        reliable: false});
    } catch (e) {
      alert('Failed to create data channel.' + 'You need Chrome M25 or later with --enable-data-channels flag'););
    }

  pc1.onicecandidate = iceCallback1;
  sendChannel.onopen = onSendChannelStateChange;
  sendChannel.onclose = onSendChannelStateChange;
  pc2 = new webkitRTCPeerConnection(servers,{
    optional: [{RtpDataChannels: true}]});
  
  pc2.onicecandidate = iceCallback2;
  pc2.ondatachannel = receiveChannelCallback;

  pc1.createOffer(gotDescription1);
  startButton.disabled = true; /*since session is up, disable start button */
  closeButton.disabled = false;  /*enable close button */
}

The following function is used to invoke the sendChannel.send function along with user text to send data across the data channel:

function sendData() {
  var data = document.getElementById("dataChannelSend").value;
  sendChannel.send(data);
}

The following function calls the sendChannel.close() and receiveChannel.close() functions to terminate the data channel connection:

function closeDataChannels() {
sendChannel.close();
receiveChannel.close();
pc1.close();      /* peer1 connection to server closed */
pc2.close();      /* peer2 connection to server closed */

  pc1 = null;
  pc2 = null;
startButton.disabled = false;
sendButton.disabled = true;
closeButton.disabled = true;
document.getElementById("dataChannelSend").value = "";
document.getElementById("dataChannelReceive").value = "";
document.getElementById("dataChannelSend").disabled = true;
}

Peer connection 1 sets the local description, and peer connection 2 sets the remote description from the SDP exchanged, and the answer is created:

function gotDescription1(desc) {
  pc1.setLocalDescription(desc);
  pc2.setRemoteDescription(desc);
  pc2.createAnswer(gotDescription2);
}

function gotDescription2(desc) {
  pc2.setLocalDescription(desc);
  trace('Answer from pc2 \n' + desc.sdp);
  pc1.setRemoteDescription(desc);
}

The following is the function to get the local ICE call back:

function iceCallback1(event) {
  if (event.candidate) {
    pc2.addIceCandidate(event.candidate);
  }
}

The following is the function for the remote ICE call back:

function iceCallback2(event) {
  if (event.candidate) {
    pc1.addIceCandidate(event.candidate);
  }
}

The function that receives the control when a message is passed back to the user is as follows:

function receiveChannelCallback(event) {
  receiveChannel = event.channel;
  receiveChannel.onmessage = onReceiveMessageCallback;
  receiveChannel.onopen = onReceiveChannelStateChange;
  receiveChannel.onclose = onReceiveChannelStateChange;
}

function onReceiveMessageCallback(event) {
  document.getElementById("dataChannelReceive").value = event.data;
}

function onReceiveChannelStateChange() {
  var readyState = receiveChannel.readyState;
}

function onSendChannelStateChange() {
  var readyState = sendChannel.readyState;
  if (readyState == "open") {
    document.getElementById("dataChannelSend").disabled = false;
    sendButton.disabled = false;
    closeButton.disabled = false;
  } else {
    document.getElementById("dataChannelSend").disabled = true;
    sendButton.disabled = true;
    closeButton.disabled = true;
  }
}

The following screenshot shows that Peer 1 is prepared to send text to Peer 2 using the DataChannel API of WebRTC:

Empty text areas before beginning the exchange of text

On clicking on the Start button, as shown in the following screenshot, a session is established between the peers and the server:

Putting in text from one's peers after hitting the Start button

As Peer 1 keys in the message and hits the Send button, the message is passed on to Peer 2. The preceding snapshot is taken before sending the message, and the following picture is taken after sending the message:

Text is exchanged on DataChannel on the click of the Send button

However, right now, you are only sending data from one localhost to another. This is because the system doesn't know any other peer IP or port. This is where socket-based servers such as Node.js come into the picture.

Media traversal in WebRTC clients

Real-time Transport Protocol (RTP) is the way for media to flow between end points. Media could be audio and/or video based.

Note

Media stream uses SRTP and DTLS protocols.

RTP in WebRTC is by default peer-to-peer as enforced by the Interactive Connectivity Establishment (ICE) protocol candidates, which could be either STUN or TURN. ICE is required to establish that firewalls are not blocking any of the UDP or TCP ports. The peer-to-peer link is established with the help of the ICE protocol. Using the STUN and TURN protocols, ICE finds out the network architecture and provides some transport addresses (ICE candidates) on which the peer can be contacted.

An RTCPeerConnection object has an associated ICE, comprising the RTCPeerConnection signaling state, the ICE gathering state, and the ICE connection state. These are initialized when an object is first created. The flow of signals through these nodes is depicted in the following call flow diagram:

ICE, STUN, and TURN are defined as follows:

  • ICE: This is the framework to allow your web browser to connect with peers. ICE uses STUN or TURN to achieve this.
  • STUN: This is the protocol to discover your public address and determine any restrictions in your router that would prevent a direct connection with a peer. It presents the outside world with a public IP to the WebRTC client that can be used by the peers to communicate to it.
  • TURN: This is meant to bypass the Symmetric NAT restriction by opening a connection with a TURN server and relaying all information through this server.

Note

STUN/ICE is built-in and mandatory for WebRTC.