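goproxy source code analysis
How go get fetches packages
1 Step one: derive the package's lookup path by regular-expression matching
go get takes either an explicit import path or the import paths it finds by analyzing the imports in your code. The import path is not itself the lookup path, though. In the go get source, the lookup path is derived by matching the import path against a set of regular expressions. In other words, the import path has to match this set of expressions; if it does not, the package cannot be resolved and the code will not build.
go get then appends the go-get=1 query parameter and sends a request such as https://github.com/goproxyio/goproxy?go-get=1 to the remote host.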
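2 Step two: look up the package's remote repository address
The remote repository address is read from the content attribute of the go-import meta tag in the response to the go-get=1 request.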
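3 Step three: clone the repository locally
Version control systems (VCS) differ in many ways, but their basic operations are broadly similar. During the clone, go get runs the operation that corresponds to the specific VCS.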
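As a rough sketch of that per-VCS dispatch, the function below maps a VCS name to the command line that would create a local copy. The function name cloneCommand and the exact argument lists are assumptions for illustration; the go command's real table carries more options than shown here.

```go
package main

import "fmt"

// cloneCommand returns the command line that creates a local copy of repo
// in dir for the given VCS. Hypothetical helper, simplified on purpose.
func cloneCommand(vcs, repo, dir string) ([]string, error) {
	switch vcs {
	case "git":
		return []string{"git", "clone", "--", repo, dir}, nil
	case "hg":
		return []string{"hg", "clone", "--", repo, dir}, nil
	case "svn":
		return []string{"svn", "checkout", "--", repo, dir}, nil
	case "bzr":
		return []string{"bzr", "branch", "--", repo, dir}, nil
	}
	return nil, fmt.Errorf("unsupported VCS %q", vcs)
}

func main() {
	args, err := cloneCommand("git", "https://github.com/goproxyio/goproxy", "goproxy")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(args)
}
```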
go get through a proxy
With the basic go get flow covered, here is the complete flow once Go Modules are enabled.
Use go get -x to see the fetch in detail:
go get -x github.com/goproxyio/goproxy
# get https://goproxy.cn/github.com/goproxyio/@v/list
# get https://goproxy.cn/github.com/@v/list
# get https://goproxy.cn/github.com/goproxyio/goproxy/@v/list
# get https://goproxy.cn/github.com/@v/list: 404 Not Found (0.686s)
# get https://goproxy.cn/github.com/goproxyio/@v/list: 404 Not Found (0.754s)
# get https://goproxy.cn/github.com/goproxyio/goproxy/@v/list: 200 OK (0.855s)
# get https://goproxy.cn/github.com/goproxyio/goproxy/@v/v1.0.0.info
# get https://goproxy.cn/github.com/goproxyio/goproxy/@v/v1.0.0.info: 200 OK (0.117s)
go: downloading github.com/goproxyio/goproxy v1.0.0
# get https://goproxy.cn/github.com/goproxyio/goproxy/@v/v1.0.0.zip
# get https://goproxy.cn/github.com/goproxyio/goproxy/@v/v1.0.0.zip: 200 OK (0.228s)
# get https://goproxy.cn/sumdb/sum.golang.org/supported
# get https://goproxy.cn/sumdb/sum.golang.org/supported: 200 OK (0.032s)
# get https://goproxy.cn/sumdb/sum.golang.org/lookup/github.com/goproxyio/goproxy@v1.0.0
# get https://goproxy.cn/sumdb/sum.golang.org/lookup/github.com/goproxyio/goproxy@v1.0.0: 200 OK (0.414s)
# get https://goproxy.cn/sumdb/sum.golang.org/tile/8/0/x014/109
# get https://goproxy.cn/sumdb/sum.golang.org/tile/8/0/x014/199.p/195
# get https://goproxy.cn/sumdb/sum.golang.org/tile/8/1/055.p/119
# get https://goproxy.cn/sumdb/sum.golang.org/tile/8/0/x014/109: 200 OK (0.028s)
# get https://goproxy.cn/sumdb/sum.golang.org/tile/8/1/055.p/119: 200 OK (0.040s)
# get https://goproxy.cn/sumdb/sum.golang.org/tile/8/0/x014/199.p/195: 200 OK (0.057s)
# get https://goproxy.cn/sumdb/sum.golang.org/tile/8/0/324
# get https://goproxy.cn/sumdb/sum.golang.org/tile/8/0/324: 200 OK (0.226s)
go: github.com/goproxyio/goproxy upgrade => v1.0.0
# get https://goproxy.cn/github.com/goproxyio/goproxy/@v/v1.0.0.mod
# get https://goproxy.cn/github.com/goproxyio/goproxy/@v/v1.0.0.mod: 200 OK (0.093s)
go: finding module for package github.com/goproxyio/goproxy/internal/module
# get https://goproxy.cn/github.com/goproxyio/goproxy/internal/module/@v/list
# get https://goproxy.cn/github.com/goproxyio/goproxy/internal/@v/list
go: finding module for package github.com/goproxyio/goproxy/internal/cfg
# get https://goproxy.cn/github.com/goproxyio/goproxy/internal/cfg/@v/list
go: finding module for package github.com/goproxyio/goproxy/internal/modfetch
# get https://goproxy.cn/github.com/goproxyio/goproxy/internal/modfetch/@v/list
go: finding module for package github.com/goproxyio/goproxy/internal/modload
# get https://goproxy.cn/github.com/goproxyio/goproxy/internal/modload/@v/list
go: finding module for package github.com/goproxyio/goproxy/internal/modfetch/codehost
# get https://goproxy.cn/github.com/goproxyio/goproxy/internal/modfetch/codehost/@v/list
# get https://goproxy.cn/github.com/goproxyio/goproxy/internal/module/@v/list: 404 Not Found (2.579s)
# get https://goproxy.cn/github.com/goproxyio/goproxy/internal/modfetch/codehost/@v/list: 404 Not Found (2.474s)
# get https://goproxy.cn/github.com/goproxyio/goproxy/internal/modfetch/@v/list: 404 Not Found (2.882s)
# get https://goproxy.cn/github.com/goproxyio/goproxy/internal/@v/list: 404 Not Found (2.984s)
# get https://goproxy.cn/github.com/goproxyio/goproxy/internal/cfg/@v/list: 404 Not Found (3.339s)
# get https://goproxy.cn/github.com/goproxyio/goproxy/internal/modload/@v/list: 404 Not Found (3.353s)
go: finding module for package github.com/goproxyio/goproxy/internal/modload
go: finding module for package github.com/goproxyio/goproxy/internal/module
go: finding module for package github.com/goproxyio/goproxy/internal/modfetch/codehost
go: finding module for package github.com/goproxyio/goproxy/internal/cfg
go: finding module for package github.com/goproxyio/goproxy/internal/modfetch
../../../../pkg/mod/github.com/goproxyio/goproxy@v1.0.0/pkg/proxy/proxy.go:12:2: module github.com/goproxyio/goproxy@latest found (v1.0.0), but does not contain package github.com/goproxyio/goproxy/internal/cfg
../../../../pkg/mod/github.com/goproxyio/goproxy@v1.0.0/pkg/proxy/proxy.go:13:2: module github.com/goproxyio/goproxy@latest found (v1.0.0), but does not contain package github.com/goproxyio/goproxy/internal/modfetch
../../../../pkg/mod/github.com/goproxyio/goproxy@v1.0.0/pkg/proxy/proxy.go:14:2: module github.com/goproxyio/goproxy@latest found (v1.0.0), but does not contain package github.com/goproxyio/goproxy/internal/modfetch/codehost
../../../../pkg/mod/github.com/goproxyio/goproxy@v1.0.0/pkg/proxy/proxy.go:15:2: module github.com/goproxyio/goproxy@latest found (v1.0.0), but does not contain package github.com/goproxyio/goproxy/internal/modload
../../../../pkg/mod/github.com/goproxyio/goproxy@v1.0.0/pkg/proxy/proxy.go:16:2: module github.com/goproxyio/goproxy@latest found (v1.0.0), but does not contain package github.com/goproxyio/goproxy/internal/module
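The first lines of the log show the go command probing @v/list for github.com/goproxyio/goproxy, github.com/goproxyio and github.com: every prefix of the requested path is a candidate module path. The sketch below reproduces that enumeration. The helper names candidateModulePaths and listURL are assumptions for this example, and the requests in the log appear to be issued concurrently, so the order printed here need not match the log.

```go
package main

import (
	"fmt"
	"path"
)

// candidateModulePaths returns pkg and each of its parent paths,
// longest first. Hypothetical helper for illustration.
func candidateModulePaths(pkg string) []string {
	var out []string
	for p := pkg; p != "" && p != "." && p != "/"; p = path.Dir(p) {
		out = append(out, p)
	}
	return out
}

// listURL builds the version-list URL for a candidate module path.
func listURL(proxy, modPath string) string {
	return proxy + "/" + modPath + "/@v/list"
}

func main() {
	for _, p := range candidateModulePaths("github.com/goproxyio/goproxy") {
		fmt.Println(listURL("https://goproxy.cn", p))
	}
}
```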
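With Go Modules enabled, go get honors a new environment variable, GOPROXY. Once it is set, go get switches entirely to the new fetch flow, the GOPROXY flow.
For the GOPROXY flow, the Go project defines a set of proxy endpoints; see the official protocol definition: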
https://tip.golang.org/cmd/go/#hdr-Module_proxy_protocol
GET $GOPROXY/<module>/@v/list returns a list of all known versions of the given module, one per line.
GET $GOPROXY/<module>/@v/<version>.info returns JSON-formatted metadata about that version of the given module.
GET $GOPROXY/<module>/@v/<version>.mod returns the go.mod file for that version of the given module.
GET $GOPROXY/<module>/@v/<version>.zip returns the zip archive for that version of the given module.
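A small sketch of how a client might build these four URLs. The names endpoints, moduleEndpoints and escapePath are assumptions for this example. One detail not visible in the list above: the proxy protocol encodes each uppercase letter in a module path or version as "!" followed by the lowercase letter, so that paths stay distinct on case-insensitive file systems; escapePath is a simplified stand-in for that rule and does no validation.

```go
package main

import (
	"fmt"
	"strings"
)

// escapePath applies the proxy protocol's case encoding:
// each uppercase ASCII letter becomes '!' plus its lowercase form.
func escapePath(s string) string {
	var b strings.Builder
	for _, r := range s {
		if 'A' <= r && r <= 'Z' {
			b.WriteByte('!')
			b.WriteRune(r + 'a' - 'A')
		} else {
			b.WriteRune(r)
		}
	}
	return b.String()
}

// endpoints groups the four URLs for one module version.
type endpoints struct {
	List, Info, Mod, Zip string
}

// moduleEndpoints builds the four endpoint URLs for mod at version.
func moduleEndpoints(proxy, mod, version string) endpoints {
	base := strings.TrimSuffix(proxy, "/") + "/" + escapePath(mod) + "/@v/"
	v := escapePath(version)
	return endpoints{
		List: base + "list",
		Info: base + v + ".info",
		Mod:  base + v + ".mod",
		Zip:  base + v + ".zip",
	}
}

func main() {
	e := moduleEndpoints("https://goproxy.cn", "github.com/goproxyio/goproxy", "v1.0.0")
	fmt.Println(e.List)
	fmt.Println(e.Info)
	fmt.Println(e.Mod)
	fmt.Println(e.Zip)
}
```

The URLs printed for github.com/goproxyio/goproxy at v1.0.0 are the same ones that appear in the go get -x log earlier.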
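This set of endpoints is in fact the file layout under $GOPATH/pkg/mod/cache/download. In other words, that directory can be used directly as a proxy:
export GOPROXY=file:///$GOPATH/pkg/mod/cache/download/
goproxy itself is simple: it is a proxy that implements the four endpoints above.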
% ls
Dockerfile  contrib              main.go   scripts
LICENSE     docker-compose.yaml  proxy     sumdb
Makefile    go.mod               renameio  test
README.md   go.sum               robustio
Start with main.go:
func main() {
	// ...
	handle = &logger{proxy.NewRouter(proxy.NewServer(new(ops)), &proxy.RouterOptions{
		Pattern:      excludeHost,
		Proxy:        proxyHost,
		DownloadRoot: downloadRoot,
	})}
	// ...
	handle = &logger{proxy.NewServer(new(ops))}
	// ...
	server := &http.Server{Addr: listen, Handler: handle}
	// ...
}
This registers a server backed by ops.
ops implements the interface that the protocol requires:
type ops struct{}

func (*ops) List(ctx context.Context, mpath string) (proxy.File, error)
func (*ops) Latest(ctx context.Context, path string) (proxy.File, error) {
	d, err := download(module.Version{Path: path, Version: "latest"})
	// ...
}
func (*ops) Info(ctx context.Context, m module.Version) (proxy.File, error)
func (*ops) GoMod(ctx context.Context, m module.Version) (proxy.File, error)
func (*ops) Zip(ctx context.Context, m module.Version) (proxy.File, error)
Next, proxy/router.go:
func NewRouter(srv *Server, opts *RouterOptions) *Router {
	rt := &Router{
		opts: opts,
		srv:  srv,
	}
	// ...
	remote, err := url.Parse(opts.Proxy)
	// ...
	proxy := httputil.NewSingleHostReverseProxy(remote)
	// ...
	proxy.Director = func(r *http.Request) {
		director(r)
		r.Host = remote.Host
	}
	// ...
	rt.proxy.Transport = &http.Transport{
		Proxy:           http.ProxyFromEnvironment,
		TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
	}
	// ...
}
NewRouter calls httputil.NewSingleHostReverseProxy to build the reverse proxy to the upstream.
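The pattern can be tried in isolation. The sketch below wraps the default Director so that the outgoing request carries the upstream's Host, as the NewRouter excerpt does; newUpstreamProxy and hostSeenByUpstream are hypothetical names, and the upstream is a throwaway httptest server rather than a real module proxy.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
)

// newUpstreamProxy builds a reverse proxy to upstream and rewrites
// the Host of every forwarded request to the upstream's host.
func newUpstreamProxy(upstream string) (*httputil.ReverseProxy, error) {
	remote, err := url.Parse(upstream)
	if err != nil {
		return nil, err
	}
	proxy := httputil.NewSingleHostReverseProxy(remote)
	director := proxy.Director // keep the default URL rewriting
	proxy.Director = func(r *http.Request) {
		director(r)
		r.Host = remote.Host
	}
	return proxy, nil
}

// hostSeenByUpstream sends one request through the proxy and returns the
// Host the upstream observed together with the upstream's own address.
func hostSeenByUpstream() (string, string, error) {
	upstream := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, r.Host)
	}))
	defer upstream.Close()

	proxy, err := newUpstreamProxy(upstream.URL)
	if err != nil {
		return "", "", err
	}
	front := httptest.NewServer(proxy)
	defer front.Close()

	resp, err := http.Get(front.URL + "/github.com/goproxyio/goproxy/@v/list")
	if err != nil {
		return "", "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", "", err
	}
	return string(body), upstream.Listener.Addr().String(), nil
}

func main() {
	seen, want, err := hostSeenByUpstream()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("upstream saw Host:", seen, "matches upstream address:", seen == want)
}
```

Without the Host rewrite the upstream would see the front server's address, which matters for upstreams that route by Host. The sketch keeps the default transport; the excerpt's InsecureSkipVerify: true turns off TLS certificate verification for upstream connections, which is worth weighing before copying.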
func (rt *Router) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	// ...
	if strings.HasPrefix(r.URL.Path, "/sumdb/") {
		sumdb.Handler(mw, r)
		// ...
	}
	// ...
	if strings.HasSuffix(r.URL.Path, "/@latest") {
		// ...
	}
	// ...
	rt.proxy.ServeHTTP(mw, r)
	// ...
}

func GlobsMatchPath(globs, target string) bool {
	// ...
	matched, _ := path.Match(glob, prefix)
	// ...
}
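Only the path.Match call of GlobsMatchPath survives in the excerpt. The sketch below shows one way such prefix-glob matching can work, under the assumption that globs is a comma-separated list and that each glob is matched against the leading path elements of target. It is an illustration named globsMatchPath, not a copy of the repository's function.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// globsMatchPath reports whether any glob in the comma-separated list
// matches a leading path prefix of target.
func globsMatchPath(globs, target string) bool {
	for _, glob := range strings.Split(globs, ",") {
		if glob == "" {
			continue
		}
		// A glob with n slashes is compared with the first n+1
		// path elements of target.
		n := strings.Count(glob, "/")
		prefix := target
		for i := 0; i < len(target); i++ {
			if target[i] == '/' {
				if n == 0 {
					prefix = target[:i]
					break
				}
				n--
			}
		}
		if n > 0 {
			continue // target has fewer elements than the glob
		}
		if matched, _ := path.Match(glob, prefix); matched {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(globsMatchPath("*.corp.example.com,github.com/goproxyio", "github.com/goproxyio/goproxy"))
	fmt.Println(globsMatchPath("github.com/goproxyio/goproxy/x", "github.com/goproxyio"))
}
```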
Finally, proxy/server.go.
First, ops is injected:
func NewServer(ops ServerOps) *Server {
	return &Server{ops: ops}
}
ServeHTTP then wraps the ops methods and routes each proxy request to the matching one:
func (s *Server) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	// ...
	if strings.HasPrefix(r.URL.Path, "/sumdb/") {
		sumdb.Handler(w, r)
		// ...
	}
	// ...
	i := strings.Index(r.URL.Path, "/@")
	// ...
	modPath, err := module.UnescapePath(strings.TrimPrefix(r.URL.Path[:i], "/"))
	// ...
	switch what {
	case "latest":
		ctype = contentTypeJSON
		f, openErr = s.ops.Latest(ctx, modPath)
	case "v/list":
		ctype = contentTypeText
		f, openErr = s.ops.List(ctx, modPath)
	default:
		what = strings.TrimPrefix(what, "v/")
		// ...
	}
	switch ext {
	case ".info":
		ctype = "application/json"
		f, openErr = s.ops.Info(ctx, m)
	case ".mod":
		ctype = "text/plain; charset=UTF-8"
		f, openErr = s.ops.GoMod(ctx, m)
	case ".zip":
		ctype = "application/octet-stream"
		f, openErr = s.ops.Zip(ctx, m)
	default:
		http.Error(w, "request not recognized", http.StatusNotFound)
		return
	}
	// ...
	http.ServeContent(w, r, what, info.ModTime(), f)
	// ...
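The dispatch in ServeHTTP comes down to splitting the URL path at "/@" and classifying what follows. The sketch below isolates that step; the request type and parseProxyPath are hypothetical names for this example, and the "!"-unescaping that the real code performs through module.UnescapePath is left out to keep the example within the standard library.

```go
package main

import (
	"errors"
	"fmt"
	"path"
	"strings"
)

// request describes one parsed proxy URL.
type request struct {
	Module  string // module path, still in escaped form
	Kind    string // "latest", "list", "info", "mod" or "zip"
	Version string // empty for "latest" and "list"
}

// parseProxyPath splits a URL path such as
// /github.com/goproxyio/goproxy/@v/v1.0.0.info into its parts.
func parseProxyPath(p string) (request, error) {
	i := strings.Index(p, "/@")
	if i < 0 {
		return request{}, errors.New("no /@ in path")
	}
	mod := strings.TrimPrefix(p[:i], "/")
	what := p[i+len("/@"):]
	switch what {
	case "latest":
		return request{Module: mod, Kind: "latest"}, nil
	case "v/list":
		return request{Module: mod, Kind: "list"}, nil
	}
	if !strings.HasPrefix(what, "v/") {
		return request{}, errors.New("request not recognized")
	}
	what = strings.TrimPrefix(what, "v/")
	ext := path.Ext(what)
	switch ext {
	case ".info", ".mod", ".zip":
		return request{Module: mod, Kind: ext[1:], Version: strings.TrimSuffix(what, ext)}, nil
	}
	return request{}, errors.New("request not recognized")
}

func main() {
	for _, p := range []string{
		"/github.com/goproxyio/goproxy/@v/list",
		"/github.com/goproxyio/goproxy/@v/v1.0.0.info",
		"/github.com/goproxyio/goproxy/@latest",
	} {
		req, err := parseProxyPath(p)
		fmt.Printf("%+v %v\n", req, err)
	}
}
```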
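func ServeContent(w ResponseWriter, req *Request, name string, modtime time.Time, content io.ReadSeeker)
ServeContent replies to the request using the content in the provided ReadSeeker. Its main advantage over io.Copy is that it handles Range requests properly, sets the MIME type, and handles If-Modified-Since requests. If the response's Content-Type header is not set, ServeContent first tries to deduce the type from name's file extension; if that fails, it reads the first block of the content and passes it to DetectContentType. The name is otherwise unused; in particular it can be empty and is never sent in the response. If modtime is not the zero time, ServeContent includes it in a Last-Modified header in the response. If the request includes an If-Modified-Since header, ServeContent uses modtime to decide whether the content needs to be sent at all. ServeContent uses Seek to determine the size of the content.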
