最近在看Go标准库里面的rpc源码,发现了下面一段代码:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
// ServeHTTP implements an http.Handler that answers RPC requests.
func (server *Server) ServeHTTP(w http.ResponseWriter, req *http.Request) {
	if req.Method != "CONNECT" {
		w.Header().Set("Content-Type", "text/plain; charset=utf-8")
		w.WriteHeader(http.StatusMethodNotAllowed)
		io.WriteString(w, "405 must CONNECT\n")
		return
	}
	conn, _, err := w.(http.Hijacker).Hijack()	//注意看这里
	if err != nil {
		log.Print("rpc hijacking ", req.RemoteAddr, ": ", err.Error())
		return
	}
	io.WriteString(conn, "HTTP/1.0 "+connected+"\n\n")
	server.ServeConn(conn)
}

这是一段接管 HTTP 连接的代码,所谓的接管 HTTP 连接是指这里接管了 HTTP 的 TCP 连接,也就是说 Golang 的内置 HTTP 库和 HTTPServer 库将不会管理这个 TCP 连接的生命周期,这个生命周期已经划给 Hijacker 了。

Hijack 介绍

首先看下官方的http.Hijacker接口介绍:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
type Hijacker interface {
	// Hijack lets the caller take over the connection.
	// After a call to Hijack the HTTP server library
	// will not do anything else with the connection.
	//
	// It becomes the caller's responsibility to manage
	// and close the connection.
	//
	// The returned net.Conn may have read or write deadlines
	// already set, depending on the configuration of the
	// Server. It is the caller's responsibility to set
	// or clear those deadlines as needed.
	//
	// The returned bufio.Reader may contain unprocessed buffered
	// data from the client.
	//
	// After a call to Hijack, the original Request.Body must not
	// be used. The original Request's Context remains valid and
	// is not canceled until the Request's ServeHTTP method
	// returns.
	Hijack() (net.Conn, *bufio.ReadWriter, error)
}

Hijack()可以将HTTP对应的TCP连接取出,连接在Hijack()之后,HTTP的相关操作就会受到影响,调用方需要负责去关闭连接。

实现原理

当我们在调用Hijack时,Http Server会将当前连接设置为StateHijacked,并从管理的connection列表中删除,交给使用者自己来管理:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
// 当前连接设置为StateHijacked
func (c *conn) setState(nc net.Conn, state ConnState) {
	srv := c.server
	switch state {
	case StateNew:
		srv.trackConn(c, true)
	case StateHijacked, StateClosed:
		srv.trackConn(c, false) //这里调用trackConn方法
	}
	if state > 0xff || state < 0 {
		panic("internal error")
	}
	packedState := uint64(time.Now().Unix()<<8) | uint64(state)
	atomic.StoreUint64(&c.curState.atomic, packedState)
	if hook := srv.ConnState; hook != nil {
		hook(nc, state)
	}
}

//从管理的连接中删除
func (s *Server) trackConn(c *conn, add bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.activeConn == nil {
		s.activeConn = make(map[*conn]struct{})
	}
	if add {
		s.activeConn[c] = struct{}{}
	} else {
		delete(s.activeConn, c)
	}
}

和正常的HTTP请求区别

为了和正常的HTTP请求对比,写了下面代码:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
//hijack请求
http.HandleFunc("/hijack", func(w http.ResponseWriter, r *http.Request) {
		hj, _ := w.(http.Hijacker)
		conn, buf, _ := hj.Hijack()
		defer conn.Close()

		buf.WriteString("hello hijack\n")
		buf.Flush()
	})

//正常的http请求
http.HandleFunc("/http", func(writer http.ResponseWriter, request *http.Request) {
	fmt.Fprintln(writer, "hello http\n")
})

我们分别来curl请求下这两个接口:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
$ curl "http://127.0.0.1:9001/hijack" -i
hello hijack

$ curl "http://127.0.0.1:9001/http" -i
HTTP/1.1 200 OK
Date: Mon, 17 Feb 2020 03:41:52 GMT
Content-Length: 12
Content-Type: text/plain; charset=utf-8

hello http

看出来了吗?乍一看,首先我们能看到使用Hijack的请求返回没有响应头信息。这里我们要明白的是,Hijack之后虽然能正常输出数据,但完全没有遵守http协议。

为什么呢?翻下源码:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
// Serve a new connection.
func (c *conn) serve(ctx context.Context) {
	//省略...
	serverHandler{c.server}.ServeHTTP(w, w.req)
	w.cancelCtx()
	if c.hijacked() { //被hijack后直接返回
			return
	}
	w.finishRequest()
	//省略...	
}

这是net/http包中的方法,也是http路由的核心方法。调用ServeHTTP方法,如果被Hijack了就直接return了,而一般的http请求会经过后边的finishRequest方法,加入headers等并关闭连接。

使用场景

当不想使用内置服务器的HTTP协议实现时,请使用Hijack

一般在在创建连接阶段使用HTTP连接,后续自己完全处理connection。符合这样的使用场景的并不多,基于HTTP协议的rpc算一个,从HTTP升级到WebSocket也算一个。

RPC使用

go中自带的rpc可以直接复用http server处理请求的那一套流程去创建连接,连接创建完毕后再使用Hijack方法拿到连接。

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
// ServeHTTP implements an http.Handler that answers RPC requests.
func (server *server) servehttp(w http.responsewriter, req *http.request) {
    if req.method != "connect" {
        w.header().set("content-type", "text/plain; charset=utf-8")
        w.writeheader(http.statusmethodnotallowed)
        io.writestring(w, "405 must connect\n")
        return
    }
    conn, _, err := w.(http.hijacker).hijack()
    if err != nil {
        log.print("rpc hijacking ", req.remoteaddr, ": ", err.error())
        return
    }
    io.writestring(conn, "http/1.0 "+connected+"\n\n")
    server.serveconn(conn)
}

客户端通过向服务端发送method为connect的请求创建连接,创建成功后即可开始rpc调用。

WebSocket使用

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
// ServeHTTP implements the http.Handler interface for a WebSocket
func (s Server) ServeHTTP(w http.ResponseWriter, req *http.Request) {
    s.serveWebSocket(w, req)
}

func (s Server) serveWebSocket(w http.ResponseWriter, req *http.Request) {
    rwc, buf, err := w.(http.Hijacker).Hijack()
    if err != nil {
        panic("Hijack failed: " + err.Error())
    }
    // The server should abort the WebSocket connection if it finds
    // the client did not send a handshake that matches with protocol
    // specification.
    defer rwc.Close()
    conn, err := newServerConn(rwc, buf, req, &s.Config, s.Handshake)
    if err != nil {
        return
    }
    if conn == nil {
        panic("unexpected nil conn")
    }
    s.Handler(conn)
}

websocket在创建连接的阶段与http使用相同的协议,而在后边的数据传输的过程中使用了他自己的协议,符合了Hijack的用途。通过serveWebSocket方法将HTTP协议升级到Websocket协议。

小结

使用Hijack时的一些注意点:

  • Hijack之后的conn需要手动关闭;
  • Hijack之后不能再对w http.responsewriter里面的w写入数据;
  • http.response实现了http.Hijacker接口;