This Chatifesto has been written by Elijah from the wonderful leap.se project.
Old chat, new chat
The model of chat popularized in the 2000s was simple but useful: you had a client that listed your online contacts, and you could chat with them in real time if you were both online.
The protocols that emerged in this period, such as XMPP and OTR, had this model of interaction in mind when initially designed.
Since then, a new type of chat has started to rapidly overtake old chat. This new chat allows one to switch smoothly between synchronous and asynchronous communication. If you send someone a message with new chat, you have a very high degree of confidence that they will get it eventually: it will not be lost when they turn off a device without reading it, or silently bounced because they were not online at the moment. New chat is more media rich, allowing images and video inline. New chat promises that your data will be available: you can switch from device to device without losing messages or worrying about where they are stored. New chat is sometimes transport agnostic, switching between data and SMS as available. With new chat, it is often super easy to create new invite-only groups by listing a few of your contacts and sending a message. New chat often allows you to use mobile, desktop, or web clients.
New chat is not simply old chat with better features. It is a distinct and, in most cases, simplified user experience. People communicate differently with new chat. Most importantly, old chat is declining rapidly as new chat expands rapidly. We need to update our approach to securing chat to be compatible with a new chat model.
The goldilocks architecture
There are three possible architectures for a message system: centralized, peer-to-peer, or federated.
In a centralized model, all messages are passed through a single organization. Typically, everything works and is easy to use, but the user loses control over their data and must place all their security in the hands of the central authority.
In a peer-to-peer model, all users communicate directly with one another, cutting out the dependency on an authority. This comes at a cost: peer-to-peer systems are more difficult to use, the data is less available when you want it, they don't work well on mobile devices or with intermittent network connections, and authenticity becomes much more challenging to establish. Experience has shown that when authenticity is difficult, most people don't bother, thus greatly undermining the security of the system as a whole.
In a federated model, messages are passed from a user to their provider, relayed to the recipient's provider, and then finally to the recipient. A federated model is halfway between a centralized model and a peer-to-peer model. There are still authorities, but the user is free to choose whichever one they want, and they can later switch if they so desire. The user is not completely autonomous, but they still have control over their data.
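As a rough illustration of that relay path, here is a minimal Python sketch. The `Provider` class, the `directory` lookup (standing in for DNS), and the `a.example` and `b.example` domains are assumptions made for illustration; they are not part of any existing protocol or library.

```python
class Provider:
    """A toy provider that accepts messages from its users and relays between domains."""

    def __init__(self, domain, directory):
        self.domain = domain
        self.directory = directory      # hypothetical domain -> Provider lookup
        self.inboxes = {}               # recipient address -> list of message bodies
        directory[domain] = self

    def submit(self, sender, recipient, body):
        """Called by a local user's client; returns the hops the message took."""
        hops = [sender, self.domain]
        recipient_domain = recipient.partition("@")[2]
        target = self.directory[recipient_domain]
        if target is not self:
            hops.append(target.domain)  # server-to-server relay
        target.deliver(recipient, body)
        hops.append(recipient)
        return hops

    def deliver(self, recipient, body):
        """Hold the message for the recipient's client."""
        self.inboxes.setdefault(recipient, []).append(body)


directory = {}
a = Provider("a.example", directory)
b = Provider("b.example", directory)
print(a.submit("alice@a.example", "bob@b.example", "hello"))
```

In this sketch neither client ever talks to the other user's provider directly; each user depends only on the provider they chose.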
Centralized models should be abandoned as quickly as possible. They are easy to implement, but are dead end streets. Once upon a time, nothing on the internet was centralized. The last decade has seen the rise of centralized platforms, but it is likely they will not stand the test of history once open protocols are mature enough to catch up in terms of ease and features.
Peer-to-peer models are perhaps the future, but the far future. There are serious and difficult problems intrinsic to peer-to-peer approaches that are unlikely to be solved any time soon, such as authenticity, availability, and resistance to traffic analysis.
Federated models represent a goldilocks middle ground between centralized and peer-to-peer. Federated approaches have the potential to allow us the availability and user friendliness of centralized systems while also retaining a large degree of control and freedom concerning our data.
Some might say that federated systems are not that different because they still rely on the central authority of the root DNS zone. This authority is very different than other forms of central authority, because the actions taken by the root DNS zone are necessarily visible and auditable by external observers.
To the moon
Federated secure chat is like the moon landing. It is interesting in itself, but mostly the benefit comes from all the positive externalities that come from solving the problem of getting to the moon.
Chat is centrally situated in the world of secure communication, and touches upon nearly all the interesting and hard problems we must eventually tackle in order to achieve communication security.
Automatic authenticity. The problem of authenticating the identity of other users is incredibly important, because all other security properties depend on it. The only proven models that are easy enough for wide adoption are centralized. If we don't create a federated system of authenticity that is easy to use, we will be creating a secure system with a huge vulnerability: human error in validating public keys.
Cryptographic groups. What constitutes a secure group, what does membership mean, who controls a group, how do we adapt public key cryptography to groups, how do we negotiate a session key in a group? These are core questions which must be answered for any secure communication that is not strictly one to one.
Federated storage. Chat is media rich, so we have to tackle how to do secure cloud storage that is not tied to any one provider and that lets you grant specific access to particular media assets.
Secure routing. Metadata must be regarded as extremely sensitive information. It is a detailed blueprint of how you interact with society and is often far more revealing than content. What the NSA can do today, repressive governments will be able to do tomorrow.
Chat is flexible
Short messages can be incredibly flexible and lend themselves to many different idioms and patterns of communication.
The core of the XMPP chat protocol is not complex, but it lends itself to a lot of possibilities. People use XMPP not just for chat, but to control servers, transport data for multi-player games and real time document collaboration, implement a follow/subscribe model like Twitter, or a friend update model like Facebook. Chat is also a particularly good session negotiation protocol for audio and video conferencing.
Solving the problem of secure chat does not simply mean we have secure chat. It means we are very far along the road to also creating secure federated social networking, secure document collaboration, and secure notification.
The holy grail
Our broad goal is secure and easy to use chat that is always available, works across all your devices, and where you don't have to worry if the people you chat with are online or not.
In slightly more detail, this means:
Client encryption: The content of every message (or other stanza) from one user A to user B must be encrypted on the client device of user A and decrypted on the client device of user B.
Offline messages: Every server must be able to receive offline messages on behalf of a user, and forward these messages to the user's client when they next appear online.
Automatic authenticity: The user's client must attempt to automatically validate the public keys of other users. If the client is unable to do so, it must indicate this visually.
Groups: Every client and server must support encrypted multi-user chat. Access to group chats should be granted by one of the following: invite, membership in a pre-defined cryptographic group, or open to anyone.
Device portable: The user should be able to communicate via multiple devices, on any platform, without losing messages. The protocol must be able to function well on mobile devices with spotty network and limited battery.
Secure routing: The user should be protected from analysis of their associations and pattern of message traffic.
Transport encryption: Both clients and servers must require transport encryption of the communication stream using a cipher with forward secrecy. This includes client to server and server to server communication.
Easy to use: The client must be auto-configuring based on the information advertised by a service provider. The client should provide some visual confirmation that a message has been received by the recipient's server.
Media encryption: Any documents and other media attached to a chat message should be encrypted using the same access controls as the message itself.
Secure storage: Any user data (such as logs, roster, keys, etc) saved locally or synced by the client must be client-encrypted.
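The transport encryption requirement above can be made concrete with a small client-side sketch using Python's standard `ssl` module. The function name `forward_secret_context` and the exact cipher string are illustrative assumptions; a real deployment would choose its own policy.

```python
import ssl


def forward_secret_context() -> ssl.SSLContext:
    """Build a client TLS context that only offers forward-secret key exchange."""
    # The default context already verifies certificates and hostnames.
    ctx = ssl.create_default_context()
    # Refuse anything older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # For TLS 1.2, allow only ephemeral ECDH key exchange with AEAD ciphers.
    # TLS 1.3 cipher suites are not affected by this string; they always use
    # ephemeral key exchange.
    ctx.set_ciphers("ECDHE+AESGCM:ECDHE+CHACHA20")
    return ctx


ctx = forward_secret_context()
```

The same context could then be passed to whatever client library opens the connection. A server-to-server link would need an equivalent policy on both ends.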
These would be really nice, but may be very difficult in the near term:
- Ideally, the server should not know the list of contacts (roster) of a user.
- Ideally, the server should not know the membership list of a particular group.
- Ideally, the server should not know whom you communicate with.
Although these are nice, we are not trying to achieve any of the following:
- direct file transfer from one client to another
- direct peer to peer messaging without server relays
- support for gateways to other chat protocols or SMS
Session encryption -- possible to get forward secrecy and deniability. Group session negotiation is experimental and slow.
Object encryption -- possible for parties to communicate asynchronously (i.e. while only one party is online).
Hybrid approach -- long-lived session keys that can be used for online communication, or forward hashes of these keys (so keeping the key around does not make past conversation vulnerable).
Session encryption provides the highest security, but we also need the ability to support offline communication. Object encryption could be paired with transport that was forward secret to obviate some of its security problems (although not against an attack from the provider).
There is interesting research in optimizing group session negotiation, but not much code. http://link.springer.com/chapter/10.1007/978-3-540-45146-4_7
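One possible reading of the "forward hashes of these keys" idea in the hybrid approach is a one-way hash chain: each message key is derived from a stored chain key, which is then replaced by its own hash. The sketch below assumes both parties already share a secret. The class name `HashRatchet` and the `b"message"` and `b"chain"` labels are made up for illustration, and this is not a complete or reviewed protocol.

```python
import hashlib
import hmac


def _derive(chain_key: bytes, label: bytes) -> bytes:
    """One-way derivation step: HMAC-SHA256 keyed with the current chain key."""
    return hmac.new(chain_key, label, hashlib.sha256).digest()


class HashRatchet:
    """Hands out one key per message and forgets how to recompute old ones."""

    def __init__(self, shared_key: bytes):
        self._chain_key = shared_key

    def next_message_key(self) -> bytes:
        # Derive the key for this message from the current chain key...
        message_key = _derive(self._chain_key, b"message")
        # ...then overwrite the chain key with its own hash, so the previous
        # chain key (and the message keys derived from it) cannot be rebuilt
        # from what is stored afterwards.
        self._chain_key = _derive(self._chain_key, b"chain")
        return message_key


# Both sides start from the same (assumed, pre-negotiated) secret.
alice = HashRatchet(b"example shared secret")
bob = HashRatchet(b"example shared secret")
assert alice.next_message_key() == bob.next_message_key()
```

Because both sides step the chain in the same order, no round trip is needed per message, which is what makes this usable when only one party is online. It does not by itself provide deniability or protect against a compromised provider.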
By secure routing, we mean that the "from" and "to" information of every message or stanza should be protected from association analysis by servers that relay messages.
Some ideas to protect the routing information:
Auto-aliases: Each party auto-negotiates aliases for communicating with each other. Behind the scenes, the client then invisibly uses these aliases for subsequent communication. The advantage is that this is backward compatible with existing routing. The disadvantage is that the user's server stores a list of their aliases. As an improvement, you could add the possibility of a third party service to maintain the alias map.
Onion headers: A message from user A to user B is encoded so that the "to" routing information only contains the name of B's server. When B's server receives the message, it unwraps (decrypts) a supplementary header that contains the actual user "B". Like aliases, this provides no benefit if both users are on the same server. As an improvement, the message could bounce around intermediary servers, like Mixmaster.
Third party dropbox: To exchange messages, user A and user B negotiate a unique "dropbox" URL for depositing messages, potentially using a third party. To send a message, user A would post the message to the "dropbox". To receive a message, user B would regularly poll this URL to see if there are new messages.
Mixmaster with signatures: Messages are bounced through a Mixmaster-like set of anonymization relays and then finally delivered to the recipient's server. The user's client only displays the message if it is encrypted, has a valid signature, and the user has previously added the sender to an 'allow list' (perhaps automatically generated from the list of validated public keys).
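To make the first of these ideas more tangible, here is a minimal sketch of auto-aliases. It assumes each user's server mints a random alias per correspondent and that the two clients swap aliases over an already encrypted channel (not shown). The `AliasServer` class and the `example` domains are hypothetical.

```python
import secrets


class AliasServer:
    """A toy server that routes on opaque aliases instead of real usernames."""

    def __init__(self, domain):
        self.domain = domain
        self._alias_to_user = {}   # the alias list the server has to store

    def create_alias(self, user):
        """Mint a random alias that only this server can map back to `user`."""
        alias = secrets.token_hex(8)
        self._alias_to_user[alias] = user
        return f"{alias}@{self.domain}"

    def resolve(self, address):
        """Return the local user behind an alias address, or None if unknown."""
        alias, _, domain = address.partition("@")
        if domain != self.domain:
            return None
        return self._alias_to_user.get(alias)


server_a = AliasServer("a.example")
server_b = AliasServer("b.example")
alice_alias = server_a.create_alias("alice")   # used only for talking to bob
bob_alias = server_b.create_alias("bob")       # used only for talking to alice

# What a relaying server sees on the wire: aliases, not usernames.
envelope = {"from": alice_alias, "to": bob_alias}
```

The envelope stays compatible with ordinary "user@domain" routing, and each server can still deliver to its own user. The sketch also shows the drawback noted above: each server holds the alias map for its own users.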