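A Brief Analysis of gRPC Client Connection Management
Background
- The client SDK uses gRPC as its communication protocol and periodically (roughly every 120 s) sends a pingServer request to the server.
- The server listens on port 80, e.g. xxx:80.
Problem
- The client was observed constantly dropping its connection and reconnecting to the server.
- Checked with netstat -antp.
- As shown in the figure, the connection to the server address marked in red is in TIME_WAIT, and it is followed by a new ESTABLISHED connection to the same server.
- In TCP, the side that closes first ends up in TIME_WAIT, so this state shows that the client actively closed the connection.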
-
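- This contradicted what I thought I knew: gRPC is supposed to use long-lived connections, so why is the connection torn down every time? Doesn't that turn it into a short-lived connection?
- And since the client is the one closing it, could something be wrong on the client side?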
-
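- With these questions in mind, I captured packets on the client.
- The client always received a packet of length 17, then sent a FIN and went through the TCP connection teardown.
- Opening the tcpdump capture in Wireshark showed that this length-17 packet is a GOAWAY frame.
As shown in the figure:
- GOAWAY is the "graceful" shutdown mechanism defined by HTTP/2.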
-
- The HTTP/2 specification (RFC 7540, Section 6.8) describes the GOAWAY frame.
-
-
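The length of 17 is consistent with a GOAWAY frame that carries no debug data: an HTTP/2 frame header is 9 bytes, and the GOAWAY payload is a 4-byte last stream ID plus a 4-byte error code. The sketch below parses such a frame from raw bytes. It assumes the 17 bytes are the cleartext TCP payload (the channel here is unencrypted); the `GoAway` struct and `ParseGoAway` function are illustrative names, not part of gRPC.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>

// Fields of an HTTP/2 GOAWAY frame (RFC 7540, Section 6.8).
struct GoAway {
  std::uint32_t last_stream_id;  // highest stream ID the sender may have processed
  std::uint32_t error_code;      // 0 means NO_ERROR, i.e. a graceful shutdown
  std::size_t debug_len;         // number of optional debug-data bytes
};

// Read a big-endian 32-bit integer from p.
static std::uint32_t ReadU32(const std::uint8_t* p) {
  return (std::uint32_t{p[0]} << 24) | (std::uint32_t{p[1]} << 16) |
         (std::uint32_t{p[2]} << 8) | std::uint32_t{p[3]};
}

// Parse one frame from buf. Returns nullopt unless buf holds a complete
// GOAWAY frame (9-byte header followed by at least 8 payload bytes).
std::optional<GoAway> ParseGoAway(const std::uint8_t* buf, std::size_t len) {
  constexpr std::size_t kHeaderLen = 9;       // length(3) type(1) flags(1) stream(4)
  constexpr std::uint8_t kTypeGoAway = 0x7;   // frame type of GOAWAY
  if (len < kHeaderLen) return std::nullopt;

  const std::size_t payload_len = (std::size_t{buf[0]} << 16) |
                                  (std::size_t{buf[1]} << 8) |
                                  std::size_t{buf[2]};
  const std::uint8_t type = buf[3];
  if (type != kTypeGoAway || payload_len < 8 ||
      len < kHeaderLen + payload_len) {
    return std::nullopt;
  }

  const std::uint8_t* payload = buf + kHeaderLen;
  // The top bit of the last-stream-ID field is reserved and must be masked off.
  return GoAway{ReadU32(payload) & 0x7fffffffu, ReadU32(payload + 4),
                payload_len - 8};
}
```

Feeding it a 17-byte buffer whose header announces an 8-byte payload of type 0x7 yields a `GoAway` with `debug_len == 0`, which matches the size seen in the capture.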
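- From what I already knew about gRPC, the gRPC client resolves the domain name and then maintains a load-balancing (LB) policy over the resolved addresses,
- so this is probably gRPC's handling of idle connections. pingServer runs every 120 s, but gRPC treats the connection as idle in between,
- and therefore tells the client to close the idle connection?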
-
- To test this idea, I modified the gRPC demo. Because our client uses gRPC's asynchronous C++ API,
- I wrote a simple async_client, based on gRPC's asynchronous demo, that just calls the server.
Code:
-
- #include <iostream>
- #include <memory>
- #include <string>
-
- #include <grpcpp/grpcpp.h>
- #include <grpc/support/log.h>
- #include <chrono>
- #include <thread>
-
- #include "gateway.grpc.pb.h"
-
- using grpc::Channel;
- using grpc::ClientAsyncResponseReader;
- using grpc::ClientContext;
- using grpc::CompletionQueue;
- using grpc::Status;
- using yournamespace::PingReq;
- using yournamespace::PingResp;
- using yournamespace::srv;
-
- class GatewayClient {
- public:
- explicit GatewayClient(std::shared_ptr<Channel> channel)
- : stub_(srv::NewStub(channel)) {}
-
- // Assembles the client's payload and sends it to the server.
- //void PingServer(const std::string& user) {
- void PingServer() {
- // Data we are sending to the server.
- PingReq request;
- request.set_peerid("1111111111111113");
- request.set_clientinfo("");
- request.set_capability(1);
- request.add_iplist(4197554190);
- request.set_tcpport(8080);
- request.set_udpport(8080);
- request.set_upnpip(4197554190);
- request.set_upnpport(8080);
- request.set_connectnum(10000);
- request.set_downloadingspeed(100);
- request.set_uploadingspeed(10);
- request.set_maxdownloadspeed(0);
- request.set_maxuploadspeed(0);
- // Call object to store rpc data
- AsyncClientCall* call = new AsyncClientCall;
- // stub_->AsyncPing() creates the RPC object and starts the call right away,
- // so no separate StartCall() is needed (that is only required after
- // PrepareAsyncPing()). Because we are using the asynchronous API, we need to
- // hold on to the "call" instance in order to get updates on the ongoing RPC.
- call->response_reader =
- stub_->AsyncPing(&call->context, request, &cq_);
- // Request that, upon completion of the RPC, "reply" be updated with the
- // server's response; "status" with the indication of whether the operation
- // was successful. Tag the request with the memory address of the call object.
- call->response_reader->Finish(&call->reply, &call->status, (void*)call);
-
- }
-
- // Loop while listening for completed responses.
- // Prints out the response from the server.
- void AsyncCompleteRpc() {
- void* got_tag;
- bool ok = false;
-
- // Block until the next result is available in the completion queue "cq".
- while (cq_.Next(&got_tag, &ok)) {
- // The tag in this example is the memory location of the call object
- AsyncClientCall* call = static_cast<AsyncClientCall*>(got_tag);
-
- // Verify that the request was completed successfully. Note that "ok"
- // corresponds solely to the request for updates introduced by Finish().
- GPR_ASSERT(ok);
-
- if (call->status.ok())
- std::cout << "xNetClient received: " << call->reply.code() << " task:" << call->reply.tasks_size() <<" pinginterval:"<< call->reply.pinginterval() << std::endl;
- else
- //std::cout << "RPC failed" << std::endl;
- std::cout << ": status = " << call->status.error_code() << " (" << call->status.error_message() << ")" << std::endl;
-
- // Once we're complete, deallocate the call object.
- delete call;
- }
- }
- private:
- // struct for keeping state and data information
- struct AsyncClientCall {
- // Container for the data we expect from the server.
- PingResp reply;
- // Context for the client. It could be used to convey extra information to
- // the server and/or tweak certain RPC behaviors.
- ClientContext context;
- // Storage for the status of the RPC upon completion.
- Status status;
- std::unique_ptr<ClientAsyncResponseReader<PingResp>> response_reader;
- };
- // Out of the passed in Channel comes the stub, stored here, our view of the
- // server's exposed services.
- std::unique_ptr<srv::Stub> stub_;
-
- // The producer-consumer queue we use to communicate asynchronously with the
- // gRPC runtime.
- CompletionQueue cq_;
- };
-
- int main(int argc, char** argv) {
-
-
- // Instantiate the client. It requires a channel, out of which the actual RPCs
- // are created. This channel models a connection to an endpoint (here, the
- // domain:port passed as argv[1]). We indicate that the channel isn't
- // authenticated (use of InsecureChannelCredentials()).
- if (argc < 2){
- std::cout << "usage: " <<argv[0]<< " domain:port" << std::endl;
- std::cout << "eg: " <<argv[0]<< " gw.xnet.xcloud.sandai.net:80" << std::endl;
- return 0;
- }
- GatewayClient xNetClient(grpc::CreateChannel( argv[1], grpc::InsecureChannelCredentials()));
- // Spawn reader thread that loops indefinitely
- std::thread thread_ = std::thread(&GatewayClient::AsyncCompleteRpc, &xNetClient);
- for (int i = 0; i < 1000; i++) {
- xNetClient.PingServer(); // The actual RPC call!
- std::this_thread::sleep_for(std::chrono::seconds(120));
- }
- std::cout << "Press control-c to quit" << std::endl << std::endl;
- thread_.join(); //blocks forever
- return 0;
- }
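The next step is simple: run it.
Watching with netstat -natp, the behavior can be reproduced: async_client also disconnects and reconnects.
Further debugging showed that with the send interval set to 10 s the connection is kept; with intervals longer than 10 s the connection is almost always dropped.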
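The observation can be captured in a toy model: a peer closes the connection whenever the gap between two requests exceeds an idle threshold, and the client then has to reconnect for the next request. This is only a sketch of the observed behavior. `IdleModel`, `Every`, and the 10 s threshold are hypothetical names and values chosen to match the experiment, not gRPC APIs or documented defaults.

```cpp
#include <vector>

// Hypothetical model of a peer that sends GOAWAY once a connection has
// carried no request for longer than idle_timeout_s seconds.
struct IdleModel {
  int idle_timeout_s;

  // Given the times (in seconds) at which the client sends requests,
  // count how many TCP connections the client ends up opening.
  int CountConnections(const std::vector<int>& send_times_s) const {
    int connections = 0;
    bool connected = false;
    int last_activity_s = 0;
    for (int t : send_times_s) {
      // The connection was closed during the gap if the gap exceeded the timeout.
      if (connected && t - last_activity_s > idle_timeout_s) {
        connected = false;
      }
      // A request on a closed connection forces a reconnect.
      if (!connected) {
        ++connections;
        connected = true;
      }
      last_activity_s = t;
    }
    return connections;
  }
};

// Request times 0, interval_s, 2 * interval_s, ... (count entries in total).
std::vector<int> Every(int interval_s, int count) {
  std::vector<int> times;
  for (int i = 0; i < count; ++i) {
    times.push_back(i * interval_s);
  }
  return times;
}
```

With an assumed 10 s threshold, five requests sent 120 s apart each need their own connection, while five requests sent 10 s apart share a single one, which is the pattern seen with netstat.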
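Summary
To summarize:
This is how gRPC managed the connection in these tests: with the settings used here, once no data had been sent for more than about 10 s, the connection was treated as idle. The server side then sent a GOAWAY frame to the client, the client actively closed the connection after receiving it, and a new connection was established the next time a request had to be sent.
I do not yet know whether there is a configuration option that changes this value. I am not very familiar with gRPC's internals yet and will look into it later.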