WebSocket and Java
WebSocket is a cool new(ish) technology that allows real-time, two-way communication between the browser and the server with very little overhead. What I want to do here is provide a very succinct but sufficient overview of how to start using the technology. So, a few things to start with:
- a TCP socket connection is opened between the browser and the server, and each party can send messages to the other (i.e. the server can push data whenever it has it available, with no need for polling, long polling, iframes, etc.)
- not all browsers support it: IE 10 is the first IE version to support it, and Android still has issues. Fortunately, there’s SockJS, which falls back to other push emulations if WebSocket is not supported.
- not all proxy servers support it or allow it, so again a fallback might be needed
- suitable for games, trading applications, and in fact anything that requires the server to push data to the browser
- Java has a standard API (JSR-356), which you can use on the server to handle WebSocket connections.
- Spring provides an API on top of the Java API. The good thing about the Spring support is that it has server-side support for SockJS and you can use dependency injection effortlessly. Spring also provides STOMP support for a message-driven architecture. Both Spring articles include links to GitHub sample projects, which I recommend.
Before proceeding to some sample code, here is the socket lifecycle, including client and server (assuming one of the above APIs):
- The browser sends an HTTP request with a special Upgrade header, with value “websocket”.
- If the server “speaks” WebSocket, it replies with status 101 (Switching Protocols). From now on we are no longer using HTTP.
- Once the server has accepted the connection, an initialization method is invoked and is passed the current WebSocket session. Each socket has a unique session id.
- Whenever a browser sends a message over to the server, another method is invoked where you get the session and the message payload.
- Based on some payload parameter, the application code performs one of several actions. The payload format is entirely up to the developer. Normally, though, it is a JSON-serialized object.
- Whenever the server needs to send a message, it needs to obtain the session object, and use it to send a message.
- When the browser closes the connection, the server is notified, so that it can clean up any resources associated with the particular session.
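The first two steps of the lifecycle can be sketched in plain Java. This is only an illustration of what RFC 6455 requires of the server's 101 reply; the Java API and Spring do it for you, and the class and method names here (`HandshakeSketch`, `acceptKey`, `switchingProtocolsResponse`) are made up for the example. The browser sends a random `Sec-WebSocket-Key` header along with `Upgrade: websocket`, and the server proves it understood the request by hashing that key together with a fixed GUID.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

/** Illustrative helper (not part of JSR-356 or Spring) for building the handshake reply. */
final class HandshakeSketch {

    // Fixed GUID that RFC 6455 defines for computing Sec-WebSocket-Accept.
    private static final String WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

    /** Base64(SHA-1(key + GUID)), the value the server returns in Sec-WebSocket-Accept. */
    static String acceptKey(String secWebSocketKey) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            byte[] digest = sha1.digest(
                    (secWebSocketKey + WEBSOCKET_GUID).getBytes(StandardCharsets.US_ASCII));
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 is not available", e);
        }
    }

    /** The raw "101 Switching Protocols" reply for a given client key. */
    static String switchingProtocolsResponse(String secWebSocketKey) {
        return "HTTP/1.1 101 Switching Protocols\r\n"
                + "Upgrade: websocket\r\n"
                + "Connection: Upgrade\r\n"
                + "Sec-WebSocket-Accept: " + acceptKey(secWebSocketKey) + "\r\n\r\n";
    }
}
```

After this reply the same TCP connection carries WebSocket frames instead of HTTP, which is the point where the initialization method from the list above is invoked.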
At the time of writing, neither API routes the messages of a plain WebSocket connection to annotated methods. The Java API supports annotation-based endpoint handlers, but it gives you one class per connection URL, and normally you want to perform multiple operations on a single connection. For example, you connect to ws://yourserver.com/game/ and then you want to pass “joinGame” and “leaveGame” messages. Likewise, the server needs to send more than one type of message back. The way I implemented this is via an enum containing all possible types of actions/events, and a switch construct that determines what to invoke.
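Stripped of any framework, the enum-plus-switch idea looks roughly like the sketch below. The names (`GameAction`, `IncomingMessage`, `MessageRouter`) are hypothetical, and the message is assumed to be already deserialized from JSON, so only the routing itself is shown.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical action types, mirroring the "joinGame" / "leaveGame" example. */
enum GameAction { JOIN_GAME, LEAVE_GAME }

/** Stand-in for a payload that has already been deserialized. */
final class IncomingMessage {
    final GameAction action;
    final String playerName;

    IncomingMessage(GameAction action, String playerName) {
        this.action = action;
        this.playerName = playerName;
    }
}

/** One handler per connection URL; the action field decides what actually happens. */
final class MessageRouter {
    final List<String> players = new ArrayList<>();

    void route(IncomingMessage message) {
        switch (message.action) {
            case JOIN_GAME:
                players.add(message.playerName);
                break;
            case LEAVE_GAME:
                players.remove(message.playerName);
                break;
            default:
                throw new IllegalArgumentException("Unsupported action: " + message.action);
        }
    }
}
```

A side benefit of the enum is that `GameAction.valueOf(..)` rejects unknown action names with an `IllegalArgumentException`, so a malformed message fails before it reaches the switch.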
So I decided to make a simple game for my algorithmic music composer. It uses the Spring API. Here are the slides for a relevant presentation I did at the company I’m working for. And below is some sample code:
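
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    import org.springframework.stereotype.Component;
    import org.springframework.web.socket.CloseStatus;
    import org.springframework.web.socket.TextMessage;
    import org.springframework.web.socket.WebSocketSession;
    import org.springframework.web.socket.handler.TextWebSocketHandler;

    // In released Spring versions, handleTextMessage(..) is provided by TextWebSocketHandler
    @Component
    public class GameHandler extends TextWebSocketHandler {

        // Connected players, keyed by WebSocket session id
        private Map<String, Player> players = new ConcurrentHashMap<>();
        // Assumed here to map a session id to the game that player is in
        private Map<String, Game> playerGames = new ConcurrentHashMap<>();

        @Override
        public void afterConnectionEstablished(WebSocketSession session) throws Exception {
            Player player = new Player(session);
            players.put(session.getId(), player);
        }

        @Override
        public void afterConnectionClosed(WebSocketSession session, CloseStatus status) throws Exception {
            leaveGame(session.getId());
        }

        @Override
        protected void handleTextMessage(WebSocketSession session, TextMessage textMessage) throws Exception {
            try {
                GameMessage message = getMessage(textMessage); // deserializes the JSON payload
                switch (message.getAction()) {
                    case INITIALIZE: initialize(message, session); break;
                    case JOIN: join(message.getGameId(), message.getPlayerName(), session); break;
                    case LEAVE: leave(session.getId()); break;
                    case START: startGame(message); break;
                    case ANSWER: answer(message, session.getId()); break;
                }
            } catch (Exception ex) {
                logger.error("Exception occurred while handling message", ex);
            }
        }

        // The logger field and the helper methods (getMessage, initialize, join,
        // leave, leaveGame, startGame, answer) are omitted from this excerpt
    }
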
Let’s see a sample scenario where the server needs to send messages to clients. Let’s take the case when a player joins the game, and all other players need to be notified of the new arrival. The central class in the system is Game, which has a list of players, and as you can see, a Player contains a reference to a WebSocket session. So, when a player joins, the following method of Game is invoked:

    public boolean playerJoined(Player player) {
        // Notify everyone already in the game before adding the newcomer
        for (Player otherPlayer : players.values()) {
            otherPlayer.playerJoined(player);
        }
        players.put(player.getSession().getId(), player);
        return true;
    }

And player.playerJoined(..) sends a message over the underlying connection, notifying the browser that a new player has joined:

    public void playerJoined(Player player) {
        GameEvent event = new GameEvent(GameEventType.PLAYER_JOINED);
        event.setPlayerId(player.getSession().getId());
        event.setPlayerName(player.getName());
        try {
            // session is the WebSocket session of the player being notified
            session.sendMessage(new TextMessage(event.toJson()));
        } catch (IOException e) {
            // Surface the failure instead of silently dropping it
            throw new IllegalStateException(e);
        }
    }

Sending messages from the server to the browser may also be triggered by a scheduled job.
The point is that you keep a list of all connected browsers, so that you can send information back. The list can be a static field, but in the case of a singleton Spring bean it doesn’t need to be, because there is only one instance of the bean anyway.
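Here is a minimal, framework-free sketch of such a list, combined with a scheduled job that pushes to everyone in it. `PushTarget` and `ConnectionRegistry` are hypothetical names; in the Spring code above, a `PushTarget` would simply wrap `session.sendMessage(..)`. The "tick" message and the policy of dropping a connection on the first failed send are assumptions of this example.

```java
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical abstraction over "something the server can push text to". */
interface PushTarget {
    void send(String text) throws IOException;
}

/** Keeps track of connected clients so the server can push to them later. */
final class ConnectionRegistry {
    private final Map<String, PushTarget> targets = new ConcurrentHashMap<>();

    void register(String sessionId, PushTarget target) { targets.put(sessionId, target); }

    void unregister(String sessionId) { targets.remove(sessionId); }

    int size() { return targets.size(); }

    /** Sends to everyone; a target that fails is dropped so it cannot block the others. */
    void broadcast(String text) {
        for (Map.Entry<String, PushTarget> entry : targets.entrySet()) {
            try {
                entry.getValue().send(text);
            } catch (IOException e) {
                targets.remove(entry.getKey());
            }
        }
    }

    /** Starts a job that pushes a "tick" message to all clients at a fixed rate. */
    ScheduledExecutorService startHeartbeat(long periodSeconds) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(
                () -> broadcast("tick"), periodSeconds, periodSeconds, TimeUnit.SECONDS);
        return scheduler;
    }
}
```

The map is a `ConcurrentHashMap` for the same reason as in `GameHandler`: connections are opened and closed on different threads than the one running the scheduled job.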
Now, two important aspects: security and authentication. Here’s a nice article by Heroku, discussing both. You should prefer wss (which is WebSocket over TLS) if there is anything sensitive. You should also validate your input on both ends, and you should not rely on the Origin header alone, because it is only trustworthy when the client is a real browser; any other client can send whatever Origin value it likes.
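As for validating input on the server end, a sketch could look like the one below. The size limit and the allowed characters for a player name are arbitrary assumptions made for the example, not values taken from the game; the idea is only that every incoming payload is checked before the application acts on it.

```java
import java.util.regex.Pattern;

/** Illustrative server-side checks; the limits are assumptions of this example. */
final class InputValidationSketch {

    // Assumed upper bound for a single text message
    static final int MAX_PAYLOAD_CHARS = 4096;

    // Assumed rule: 1 to 32 letters, digits, underscores or spaces
    private static final Pattern PLAYER_NAME = Pattern.compile("[A-Za-z0-9_ ]{1,32}");

    /** Rejects missing, empty or oversized payloads before they are deserialized. */
    static boolean isAcceptablePayload(String payload) {
        return payload != null && !payload.isEmpty() && payload.length() <= MAX_PAYLOAD_CHARS;
    }

    /** Rejects names that could carry markup or control characters to other players. */
    static boolean isValidPlayerName(String name) {
        return name != null && PLAYER_NAME.matcher(name).matches();
    }
}
```

The name check matters here because a player's name is echoed to every other browser in the game, so whatever one client sends ends up rendered by the others.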
Authentication can rely on the HTTP session cookie, but apparently some people prefer to implement their own cookie-like workflow in order to obtain a short-lived token, which can be used to perform authenticated operations.
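One way such a short-lived token could work is sketched below: the server signs the user id and an expiry time with an HMAC over a regular authenticated HTTP request, and the browser presents the token when it opens the socket. The class name `ShortLivedTokens` and the `userId:expiry:signature` layout are assumptions of this example, not a standard format.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Illustrative issuer/verifier for tokens of the form userId:expiryMillis:signature. */
final class ShortLivedTokens {
    private final byte[] secret;

    ShortLivedTokens(byte[] secret) {
        this.secret = secret.clone();
    }

    /** Issued over plain HTTP, once the user is known from the session cookie. */
    String issue(String userId, long nowMillis, long ttlMillis) {
        String payload = userId + ":" + (nowMillis + ttlMillis);
        return payload + ":" + sign(payload);
    }

    /** Returns the user id if the token is authentic and not expired, otherwise null. */
    String verify(String token, long nowMillis) {
        int lastColon = token.lastIndexOf(':');
        if (lastColon < 0) {
            return null;
        }
        String payload = token.substring(0, lastColon);
        String signature = token.substring(lastColon + 1);
        // Constant-time comparison of the expected and the presented signature
        boolean authentic = MessageDigest.isEqual(
                sign(payload).getBytes(StandardCharsets.US_ASCII),
                signature.getBytes(StandardCharsets.US_ASCII));
        if (!authentic) {
            return null;
        }
        // The payload is trusted from here on, so it has the userId:expiry shape
        int expiryColon = payload.lastIndexOf(':');
        long expiry = Long.parseLong(payload.substring(expiryColon + 1));
        return nowMillis < expiry ? payload.substring(0, expiryColon) : null;
    }

    private String sign(String payload) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            byte[] raw = mac.doFinal(payload.getBytes(StandardCharsets.UTF_8));
            return Base64.getUrlEncoder().withoutPadding().encodeToString(raw);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The time is passed in as a parameter rather than read from the clock so that expiry can be tested deterministically; in the handler, `verify(..)` would be called on the first message (or in `afterConnectionEstablished`) and the connection closed if it returns null.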
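WebSocket makes domain-driven design (DDD) come naturally. Because the connection is long-lived, objects such as Game and Player stay in memory between messages, so you no longer work with anemic objects: your objects have their respective state, and operations are performed on that state. Related to that, a WebSocket application is more easily testable.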
That’s the general set of things to have in mind when developing a WebSocket application. Note that you don’t have to use WebSocket everywhere – I’d limit it only to features where “push” is needed.
Overall, WebSocket is a nice and interesting technology that hopefully obsoletes all hacky push emulations.