
nanomsg is the next-generation zmq, which its author started designing and implementing two years ago; since it is essentially a complete re-engineering, the name changed as well.

These are quick notes on the article "Differences between nanomsg and ZeroMQ" (original link). While skimming it I originally planned just to excerpt it briefly, but the piece (including the articles it links to) turned out to be dense with the author's distilled lessons and reflections on his earlier open-source work, zmq; it is not the kind of fluff that compresses into two or three takeaways. So I jotted down the key points as I read, for my own future reference.

Lines starting with [GL] are my own editorializing; bear with me.

Implementing POSIX compatibility (unlike zmq, nanomsg aims for full POSIX compatibility)

  • The send/receive functions match POSIX in both syntax and semantics
  • The Context concept is gone, just gone...
  • Sockets changed from void* to int, i.e. handle-style (see the sketch after this list)
  • [GL] May I just point out that zmq only ever had three core concepts (context/socket/message), and whoosh, one of them is gone.
  • [GL] Once again this bears out that line from the zmq doc that deserves 32 likes: "We add power by removing complexity rather than exposing new functionality."
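
A minimal side-by-side sketch of that API difference; the endpoint name and port are made up for illustration:

/*  ZeroMQ: a context object plus void* sockets. */
void *ctx = zmq_ctx_new ();
void *zsock = zmq_socket (ctx, ZMQ_REQ);
zmq_connect (zsock, "tcp://server001:5555");   /*  Hypothetical endpoint. */
zmq_send (zsock, "hello", 5, 0);

/*  nanomsg: no context; the socket is a plain int, like a file
    descriptor, and send/recv follow POSIX syntax and semantics. */
int nsock = nn_socket (AF_SP, NN_REQ);
nn_connect (nsock, "tcp://server001:5555");
nn_send (nsock, "hello", 5, 0);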

Unlike zmq, which is written in C++, nanomsg is implemented in C

  • Not depending on the C++ runtime reduces both the total memory footprint and the number of memory allocations
  • In effect this reduces memory fragmentation and improves cache hit rates
  • [GL] As for why he moved from C++ to C, the author wrote two substantial essays (here and here) that are well worth reading
  • [GL] By the way, much of the value of those two essays lives in the comments below them; filter out the noise and they are quite brilliant. Just so you know.

Transports and protocols that are easier to extend

  • The transport- and protocol-related implementations are each factored out into their own header files, with a standard API for them
  • Several new protocols are already being implemented (SURVEY, BUS, etc.)
  • [GL] Predictably, with a standard API in place, user-defined protocols will gradually appear. Something to look forward to.

An improved threading model

  • A basic zmq design decision was to maintain a worker thread for each independent object inside the library; that design brought many limitations, and in the new design core objects are no longer bound to specific threads
  • In nanomsg, REQ supports retry and REQ/REP supports cancelling (see the sketch after this list)
  • The inproc transport now behaves (bind, auto-reconnect) much more consistently with the other transports
  • Under the new threading model, nanomsg is attempting to implement thread-safe sockets
  • [GL] Frankly, I doubt this helps much in practice; with the wealth of well-defined protocol combinations in zmq, having multiple threads read and write the same socket can almost be considered a bad smell.
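
A minimal sketch of the REQ retry mentioned above, using nanomsg's NN_REQ_RESEND_IVL option; the interval value is an arbitrary example:

/*  If no reply arrives within the interval, the REQ socket
    automatically resends the request (possibly to another peer). */
int s = nn_socket (AF_SP, NN_REQ);
int ivl = 1000;   /*  Resend after 1 second without a reply (example value). */
nn_setsockopt (s, NN_REQ, NN_REQ_RESEND_IVL, &ivl, sizeof (int));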

IOCP support

  • On Windows, use IOCP/named pipes where appropriate for better performance, instead of always going through BSD sockets
  • [GL] Elegant abstractions matter; the voice of the people matters more.

Routing priority support

  • That is, under certain conditions outgoing messages can fall back to different destinations

Other small improvements

  • TCP connections can specify the exact local interface to use ("tcp://eth0;192.168.0.111:5555")
  • Library-wide asynchronous DNS resolution
  • True zero-copy (zmq only guarantees zero-copy up to the kernel boundary); see the sketch after this list
  • Improved PUB/SUB performance with subscription counts on the order of 150M
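
A minimal sketch of nanomsg's zero-copy send/receive path; sock, payload and process are hypothetical placeholders:

/*  Sending side: the buffer is allocated by the library and ownership
    is transferred on send, so no copy is made on the way out. */
void *msg = nn_allocmsg (1024, 0);
memcpy (msg, payload, 1024);
nn_send (sock, &msg, NN_MSG, 0);

/*  Receiving side: the library hands over its own buffer directly. */
void *in;
int nbytes = nn_recv (sock, &in, NN_MSG, 0);
process (in, nbytes);
nn_freemsg (in);   /*  Release the library-owned buffer. */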

Protocol-level design improvements

  • Different protocols are fully isolated from one another (e.g. REQ and PUB cannot interoperate)
  • Each protocol's complete behavior is formally specified (the specs live in the rfc directory), with IETF standardization as the goal
  • [GL] Very ambitious. This looks like a bid to rule the whole world of messaging protocols.


nanomsg has no native ROUTER or DEALER patterns.

To handle those scenarios, the author has written the article on clustering reproduced below; with priorities added, it is smarter.

Load balancing is one of the typical features of messaging systems. Although some don't do it (MQTT), most of them provide some way to spread a workload among a cluster of boxes.

In ZeroMQ load balancing is done via REQ (requests that require replies) or, alternatively, by PUSH socket (requests that don't require replies). When designing it, I've opted for a completely fair load-balancer. What it means is that if there are two peers able to process requests, the first request goes to the first one, the second request goes to the second one, the third request goes to the first one again, etc. It's called round-robin load balancing. Of course, if a peer is dead, or it is busy at the moment, it's removed from the set of eligible destinations and left alone to restart or finish the task it is dealing with.
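
For illustration, a sketch of that round-robin in ZeroMQ terms, with made-up endpoint names: a REQ socket connected to two peers alternates requests between them, skipping dead or busy peers:

#include <zmq.h>

void *ctx = zmq_ctx_new ();
void *req = zmq_socket (ctx, ZMQ_REQ);
zmq_connect (req, "tcp://server001:5555");   /*  Hypothetical peers. */
zmq_connect (req, "tcp://server002:5555");

char reply [64];
/*  Request 1 goes to server001, request 2 to server002,
    request 3 to server001 again, and so on. */
zmq_send (req, "MyRequest", 9, 0);
zmq_recv (req, reply, sizeof (reply), 0);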

For nanomsg I've chosen a more nuanced approach. Similar to ZeroMQ, load balancing is done by REQ and PUSH sockets, however the algorithm is different. Instead of having a single ring of peers eligible for processing messages, nanomsg has 16 such rings with different priorities. If there are any peers with priority 1, messages are round-robined among them. The peers with priorities from 2 to 16 get no messages. Only after all the priority 1 peers are dead, disconnected or busy does the load balancer consider peers with priority 2. Once again, requests are round-robined among all such peers. If there are no more peers with priority 2, priority 3 peers are checked etc. Of course, once a peer with priority 1 comes back online, any subsequent messages will be sent to it instead of to the lower priority peers.
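
A sketch of that algorithm in C. This is not nanomsg's actual internal code, just an illustration of the described behavior; MAX_PEERS and the ring layout are made up:

#define NN_PRIORITIES 16
#define MAX_PEERS 64      /*  Hypothetical capacity per ring. */

/*  One ring of eligible peers per priority level. */
struct ring {
    int peers [MAX_PEERS];   /*  Handles of alive, non-busy peers. */
    int count;               /*  How many of them there are. */
    int current;             /*  Round-robin cursor. */
};

struct ring rings [NN_PRIORITIES];

/*  Pick the peer the next message goes to, or -1 if all peers are
    dead, disconnected or busy (in which case the send blocks). */
int choose_peer (void)
{
    int prio;
    for (prio = 0; prio != NN_PRIORITIES; ++prio) {
        struct ring *r = &rings [prio];
        if (r->count == 0)
            continue;         /*  No eligible peer at this priority. */
        int peer = r->peers [r->current];
        r->current = (r->current + 1) % r->count;   /*  Round-robin. */
        return peer;
    }
    return -1;
}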

Now, the obvious question is: How is this useful? What can I do with it?

Basically, it's a failover mechanism. Imagine you have two datacenters, one in New York, the other in London. You have some boxes able to provide a specific kind of service; there are some of them at each site. You want to avoid trans-continental traffic and the associated latency. However, if things go bad and there's no service available on your side of the ocean, you may prefer trans-Atlantic communication to not being able to use the service at all. So, this is a configuration you can set up in New York:

[Figure prio1.png: three New York services at priority 1, one London service at priority 2]

As you can see, local services have priority 1 and are thus used unless all of them are dead, busy or disconnected. Only at that point will requests begin to be routed across the ocean.

Here's the code needed to set up the requester:

#include <nanomsg/nn.h>
#include <nanomsg/reqrep.h>

int req;
int sndprio;
char buf [64];

/*  Open a REQ socket. (Current nanomsg initialises itself on first
    use, so no explicit init call is needed.) */
req = nn_socket (AF_SP, NN_REQ);

/*  Connect to 3 servers in New York and 1 in London. The tcp
    transport needs an explicit port; 5555 is an arbitrary example. */
sndprio = 1;
nn_setsockopt (req, NN_SOL_SOCKET, NN_SNDPRIO, &sndprio, sizeof (int));
nn_connect (req, "tcp://newyorksrv001:5555");
nn_connect (req, "tcp://newyorksrv002:5555");
nn_connect (req, "tcp://newyorksrv003:5555");
sndprio = 2;
nn_setsockopt (req, NN_SOL_SOCKET, NN_SNDPRIO, &sndprio, sizeof (int));
nn_connect (req, "tcp://londonsrv001:5555");

/*  Do your work. */
while (1) {
    nn_send (req, "MyRequest", 9, 0);
    nn_recv (req, buf, sizeof (buf), 0);
    process_reply (buf);
}

/*  Clean up (unreachable after the infinite loop; shown for
    completeness). */
nn_close (req);
nn_term ();

So far so good. Now let's have a look at a more sophisticated setup:

[Figure prio2.png: clients and worker clusters at both sites, joined by REQ/REP brokers that forward to the remote site]

As can be seen, there's a cluster of workers (REPs) at each site. Clients (REQs) access the cluster via an intermediate message broker (a REQ/REP device). The broker at each site is set up in such a way that if none of the boxes in the local cluster can process a request, it is forwarded to the remote site. Of course, if there's no worker available at either site, messages would bounce between New York and London, which is something you probably don't want to happen; you should take care to drop such messages in the broker. (Alternatively, if you believe that loop detection is a problem worth addressing in nanomsg itself, feel free to discuss it on the nanomsg mailing list.)
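
A minimal sketch of such a broker for the New York site, assuming made-up endpoint names and ports. Devices use raw (AF_SP_RAW) sockets, and the same NN_SNDPRIO option steers requests to local workers first:

#include <nanomsg/nn.h>
#include <nanomsg/reqrep.h>

int clients, workers, prio;

/*  Raw sockets, as required for a device. */
clients = nn_socket (AF_SP_RAW, NN_REP);   /*  Faces the REQ clients. */
workers = nn_socket (AF_SP_RAW, NN_REQ);   /*  Faces the REP workers. */
nn_bind (clients, "tcp://*:5555");

/*  Prefer the local cluster... */
prio = 1;
nn_setsockopt (workers, NN_SOL_SOCKET, NN_SNDPRIO, &prio, sizeof (int));
nn_connect (workers, "tcp://nyworker001:5556");
nn_connect (workers, "tcp://nyworker002:5556");

/*  ...and fall back to the London broker only when no local worker
    is available. */
prio = 2;
nn_setsockopt (workers, NN_SOL_SOCKET, NN_SNDPRIO, &prio, sizeof (int));
nn_connect (workers, "tcp://londonbroker:5555");

/*  Shuffle messages between the two sockets until nn_term () is called. */
nn_device (clients, workers);

Note that the loop-detection caveat from the paragraph above still applies; nothing in this sketch drops a request that has already crossed the ocean once.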

There's a similar priority system implemented for incoming messages (the NN_RCVPRIO option). It's somewhat less useful than the load-balancing priorities; however, it may prove useful in some situations. What it allows you to do is process requests from one source in preference to requests from another source. Of course, this mechanism kicks in only if there is a clash between the two. In slow, low-collision systems the inbound priorities have little effect.

Let's have a look at the following setup:

[Figure prio3.png: one REP server with two REQ clients at different receive priorities]

As long as there are requests from the client (REQ) on the left, the server (REP) will process them and won't care about requests from the client on the right.

EDIT: The RCVPRIO option was removed from nanomsg for now, as there were no obvious use cases for it. If you feel you need it, please discuss your use case on the nanomsg mailing list.

Finally, it's interesting to compare the above system with how priorities are dealt with in traditional messaging systems. Old-school messaging systems typically attach the priority to the message. The idea is that messages with high priorities should be delivered and processed faster than messages with low priorities.

nanomsg, on the other hand, believes that all messages are created equal: there's no priority attached to a message; instead, it's the paths within the topology that can be prioritised. In other words, nanomsg believes in a clear separation of mechanism (sending/receiving messages) and policy (how and with what priority the messages are routed). Put yet another way, there's a clear separation between the "programmer" role (sending and receiving messages) and the "admin" role (setting up the topology, with all the devices, priorities, etc.)

While the above may seem to be just a matter of different design philosophy, there are sane reasons why nanomsg handles the priorities in the way it does.

A traditional broker-based messaging system stores messages inside the broker. Typically, priorities are implemented by having separate internal message queues for different priority levels. The sender application sends a message via TCP to the broker, where it is stored in the appropriate queue (depending on the priority level defined in the message). The receiver application asks for a message, and the broker retrieves one from the highest-priority non-empty queue and sends it back via TCP:

[Figure prio4.png: sender, broker with per-priority internal queues, receiver]

This model works OK for traditional, slow messaging systems. The messages spend most of their lifetime stored in prioritised queues on the broker, so there's no problem. However, in modern, fast, low-latency and real-time systems this is no longer the case. The time spent inside the broker approaches zero, and most of a message's lifetime is spent inside TCP buffers.

That changes the situation dramatically. A TCP buffer is not prioritised; it's a plain first-in-first-out queue. If a high-priority message is stuck behind a low-priority message, bad luck: the low-priority message is going to be delivered first:

[Figure prio5.png: a high-priority message stuck behind a low-priority one in a FIFO TCP buffer]

To solve this problem nanomsg offers an alternative: create a different messaging topology for each priority level. That way there are multiple TCP connections involved, and the head-of-line blocking problem described above simply doesn't happen:

[Figure prio6.png: separate "urgent" (red) and "normal" (black) topologies between client and server]

The topology for "urgent" messages is drawn in red; the topology for "normal" messages is in black. The client sends a request to either the "urgent" or the "normal" REQ socket, depending on the priority level it is interested in. The server polls on both REP sockets and processes messages from the "urgent" one first. Finally, given that the two topologies use two different TCP ports, it is possible to configure network switches and routers in such a way that they treat urgent messages preferentially and allocate a certain amount of bandwidth for them, so that network congestion caused by normal messages won't affect delivery of urgent messages.
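
A minimal sketch of the server side of this setup, assuming port 5555 for the "urgent" topology and 5556 for the "normal" one (both arbitrary). The server polls both REP sockets with nn_poll and drains the urgent socket first:

#include <nanomsg/nn.h>
#include <nanomsg/reqrep.h>

int urgent = nn_socket (AF_SP, NN_REP);
int normal = nn_socket (AF_SP, NN_REP);
nn_bind (urgent, "tcp://*:5555");   /*  "Urgent" topology (example port). */
nn_bind (normal, "tcp://*:5556");   /*  "Normal" topology (example port). */

struct nn_pollfd pfd [2];
pfd [0].fd = urgent;
pfd [0].events = NN_POLLIN;
pfd [1].fd = normal;
pfd [1].events = NN_POLLIN;

while (1) {
    char buf [64];
    nn_poll (pfd, 2, -1);   /*  Block until either socket has a request. */
    if (pfd [0].revents & NN_POLLIN) {
        /*  Urgent requests are served first. */
        nn_recv (urgent, buf, sizeof (buf), 0);
        nn_send (urgent, "OK", 2, 0);
        continue;   /*  Re-poll so urgent traffic keeps precedence. */
    }
    if (pfd [1].revents & NN_POLLIN) {
        /*  Normal requests are served only when no urgent one is waiting. */
        nn_recv (normal, buf, sizeof (buf), 0);
        nn_send (normal, "OK", 2, 0);
    }
}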



