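Visualizing Kubernetes History
Introduction
Sloop monitors Kubernetes events, records the history of events and resource state changes, and provides visualizations to help debug past incidents.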
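Key features:
Lets you find and inspect resources that no longer exist (for example, pods from a previous deployment).
Provides timeline views that show rollouts of related resources during Deployment, ReplicaSet, and StatefulSet updates.
Helps debug transient and intermittent errors.
Lets you see how a Kubernetes application changes over time.
Runs as a self-contained service with no dependency on distributed storage.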
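Architecture
Installation and usage
Docker installation
We can install Sloop from the official image.
Data files are stored in the container's /data directory.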
docker run -it -p 8080:8080 -v ~/.kube/config:/kube/config -v /data:/data -e KUBECONFIG=/kube/config sloopimage/sloop
Open http://localhost:8080 to reach the web UI.
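The container can take a few seconds before the web UI answers. The sketch below is one way to wait for it: a generic retry helper that reruns a probe command until it succeeds. The helper name retry and the curl probe in the comment are assumptions made for this illustration and are not part of Sloop.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS is exhausted.
# Returns 0 on the first success, 1 if every attempt failed.
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Sleep between attempts, but not after the last one.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Example (assumes curl is installed and Sloop is published on port 8080):
# retry 30 2 curl -fsS -o /dev/null http://localhost:8080/
```

Passing the probe as arguments keeps the helper independent of curl, so the same function can wrap any readiness check.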
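In the sidebar we can choose the time range, namespace, and resource kind to view, and filter by keyword.
The detail view shows the details of an event.
We can also click details on the page to see the details of the resource object.
We can also click the debug menu at the top of the page to open the debug page and view metrics.
We can also configure the default view shown when the UI opens. Sloop has the following options: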
[root@dev-tools sloop]# docker run --rm -it -p 8080:8080 -v ~/.kube/config:/kube/config -e KUBECONFIG=/kube/config sloop sloop -h
Usage of configFileOnly:
-alsologtostderr
log to standard error as well as files
-apiserver-host string
Kubernetes API server endpoint
-badger-detail-log-enabled
Turns on detailed logging of BadgerDB
-badger-discard-ratio float
Badger value log GC uses this value to decide if it wants to compact a vlog file. The lower the value of discardRatio the higher the number of !badger!move keys. And thus more the number of !badger!move keys, the size on disk keeps on increasing over time.
-badger-enable-event-logging
Turns on badger event logging
-badger-keep-l0-in-memory
Keeps all level 0 tables in memory for faster writes and compactions
-badger-level-one-size int
The maximum total size for Level 1. 0 = use badger default
-badger-level-size-multiplier int
The ratio between the maximum sizes of contiguous levels in the LSM. 0 = use badger default
-badger-max-table-size int
Max LSM table size in bytes. 0 = use badger default
-badger-number-of-compactors int
Number of compactors for badger
-badger-number-of-level-zero-tables int
Number of level zero tables for badger
-badger-number-of-zero-tables-stall int
Number of Level 0 tables that once reached causes the DB to stall until compaction succeeds
-badger-sync-writes
Sync Writes ensures writes are synced to disk if set to true
-badger-use-lsm-only-options
Sets a higher valueThreshold so values would be collocated with LSM tree reducing vlog disk usage
-badger-vlog-file-size int
Max size in bytes per value log file. 0 = use badger default
-badger-vlog-fileIO-mapping
Indicates which file loading mode should be used for the value log data, in memory constrained environments the value is recommended to be true
-badger-vlog-gc-freq duration
Frequency of running badger's ValueLogGC
-badger-vlog-max-entries uint
Max number of entries per value log files. 0 = use badger default
-badger-vlog-truncate
Truncate value log if badger db offset is different from badger db size
-bind-address string
Web server bind ip address.
-cleanup-frequency duration
Frequency between subsequent runs for the database cleanup
-config string
Path to a yaml or json config file
-context string
Use a specific kubernetes context
-crd-refresh-interval duration
Frequency between CRD Informer refresh
-default-kind string
Default UX filter kind
-default-lookback string
Default UX filter lookback
-default-namespace string
Default UX filter namespace
-deletion-batch-size int
Size of batch for deletion
-disable-kube-watch
Turn off kubernetes watch
-disable-store-manager
Turn off store manager which is to clean up database
-display-context string
Use this to override the display context. When running in k8s the context is empty string. This lets you override that (mainly useful if you are running many copies of sloop on different clusters)
-enable-delete-keys
Use delete prefixes instead of dropPrefix for GC
-gc-threshold float
Threshold for GC to start garbage collecting
-keep-minor-node-updates
Keep all node updates even if change is only condition timestamps
-kube-watch-resync-interval duration
OPTIONAL: Kubernetes watch resync interval
-log_backtrace_at string
when logging hits line file:N, emit a stack trace
-logtostderr
log to standard error instead of files
-max-disk-mb int
Max disk storage in MB
-max-look-back duration
Max history data to keep
-playback-file string
Read watch data from a playback file
-port int
Web server port
-record-file string
Record watch data to a playback file
-restore-database-file string
Restore database from backup file into current context.
-stderrthreshold int
logs at or above this threshold go to stderr
-store-root string
Path to store history data
-use-mock-badger
Use a fake in-memory mock of badger
-v int
log level for V logs
-vmodule string
comma-separated list of pattern=N settings for file-filtered logging
-watch-crds
Watch for activity for CRDs
-web-files-path string
Path to web files
Failed to pre-parse flags looking for config file: flag: help requested
ERROR: logging before flag.Parse: I0509 10:51:23.862730 1 config.go:256] Default config set
Usage of sloop:
-alsologtostderr
log to standard error as well as files
-apiserver-host string
Kubernetes API server endpoint
-badger-detail-log-enabled
Turns on detailed logging of BadgerDB
-badger-discard-ratio float
Badger value log GC uses this value to decide if it wants to compact a vlog file. The lower the value of discardRatio the higher the number of !badger!move keys. And thus more the number of !badger!move keys, the size on disk keeps on increasing over time. (default 0.99)
-badger-enable-event-logging
Turns on badger event logging
-badger-keep-l0-in-memory
Keeps all level 0 tables in memory for faster writes and compactions (default true)
-badger-level-one-size int
The maximum total size for Level 1. 0 = use badger default
-badger-level-size-multiplier int
The ratio between the maximum sizes of contiguous levels in the LSM. 0 = use badger default
-badger-max-table-size int
Max LSM table size in bytes. 0 = use badger default
-badger-number-of-compactors int
Number of compactors for badger
-badger-number-of-level-zero-tables int
Number of level zero tables for badger
-badger-number-of-zero-tables-stall int
Number of Level 0 tables that once reached causes the DB to stall until compaction succeeds
-badger-sync-writes
Sync Writes ensures writes are synced to disk if set to true (default true)
-badger-use-lsm-only-options
Sets a higher valueThreshold so values would be collocated with LSM tree reducing vlog disk usage (default true)
-badger-vlog-file-size int
Max size in bytes per value log file. 0 = use badger default
-badger-vlog-fileIO-mapping
Indicates which file loading mode should be used for the value log data, in memory constrained environments the value is recommended to be true
-badger-vlog-gc-freq duration
Frequency of running badger's ValueLogGC (default 1m0s)
-badger-vlog-max-entries uint
Max number of entries per value log files. 0 = use badger default (default 200000)
-badger-vlog-truncate
Truncate value log if badger db offset is different from badger db size (default true)
-bind-address string
Web server bind ip address.
-cleanup-frequency duration
Frequency between subsequent runs for the database cleanup (default 30m0s)
-config string
Path to a yaml or json config file
-context string
Use a specific kubernetes context
-cpuprofile string
write profile to file
-crd-refresh-interval duration
Frequency between CRD Informer refresh (default 5m0s)
-default-kind string
Default UX filter kind (default "_all")
-default-lookback string
Default UX filter lookback (default "1h")
-default-namespace string
Default UX filter namespace (default "default")
-deletion-batch-size int
Size of batch for deletion (default 1000)
-disable-kube-watch
Turn off kubernetes watch
-disable-store-manager
Turn off store manager which is to clean up database
-display-context string
Use this to override the display context. When running in k8s the context is empty string. This lets you override that (mainly useful if you are running many copies of sloop on different clusters)
-enable-delete-keys
Use delete prefixes instead of dropPrefix for GC
-gc-threshold float
Threshold for GC to start garbage collecting (default 0.8)
-keep-minor-node-updates
Keep all node updates even if change is only condition timestamps
-kube-watch-resync-interval duration
OPTIONAL: Kubernetes watch resync interval (default 30m0s)
-log_backtrace_at value
when logging hits line file:N, emit a stack trace
-log_dir string
If non-empty, write log files in this directory
-logtostderr
log to standard error instead of files
-max-disk-mb int
Max disk storage in MB (default 32768)
-max-look-back duration
Max history data to keep (default 336h0m0s)
-playback-file string
Read watch data from a playback file
-port int
Web server port (default 8080)
-record-file string
Record watch data to a playback file
-restore-database-file string
Restore database from backup file into current context.
-stderrthreshold value
logs at or above this threshold go to stderr
-store-root string
Path to store history data (default "./data")
-use-mock-badger
Use a fake in-memory mock of badger
-v value
log level for V logs
-vmodule value
comma-separated list of pattern=N settings for file-filtered logging
-watch-crds
Watch for activity for CRDs (default true)
-web-files-path string
Path to web files (default "./pkg/sloop/webserver/webfiles")
Change the default namespace, resource kind, and lookback window:
docker run --rm -it -p 8080:8080 -v ~/.kube/config:/kube/config -e KUBECONFIG=/kube/config sloop sloop -default-namespace=kube-system -default-kind=pod -default-lookback=2h
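If these defaults differ between environments, one option is to build the flag list from environment variables. This is a minimal sketch: the variable names SLOOP_NAMESPACE, SLOOP_KIND, and SLOOP_LOOKBACK are hypothetical and are not read by Sloop itself, and the fallback values are the defaults printed in the help output above.

```shell
#!/bin/sh
# sloop_flags prints the three default-filter flags on one line.
# SLOOP_NAMESPACE, SLOOP_KIND and SLOOP_LOOKBACK are hypothetical names
# chosen for this sketch; unset variables fall back to Sloop's defaults.
sloop_flags() {
    printf '%s %s %s\n' \
        "-default-namespace=${SLOOP_NAMESPACE:-default}" \
        "-default-kind=${SLOOP_KIND:-_all}" \
        "-default-lookback=${SLOOP_LOOKBACK:-1h}"
}

# Example (left unquoted on purpose so the shell splits the flags):
# docker run --rm -it -p 8080:8080 -v ~/.kube/config:/kube/config \
#     -e KUBECONFIG=/kube/config sloop sloop $(sloop_flags)
```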
Install from source
mkdir -p $GOPATH/src/github.com/salesforce
cd $GOPATH/src/github.com/salesforce
git clone https://github.com/salesforce/sloop.git
cd sloop
go env -w GO111MODULE=auto
make
$GOPATH/bin/sloop
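The build steps above rely on git, go, and make being on PATH. A small check like the following can report missing tools before the build starts. The function name missing_tools is made up for this sketch.

```shell
#!/bin/sh
# missing_tools TOOL... prints each named tool that is not on PATH,
# one per line. It prints nothing when every tool is available.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Example: stop early if a build prerequisite is absent.
# missing=$(missing_tools git go make)
# [ -z "$missing" ] || { echo "missing tools: $missing" >&2; exit 1; }
```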
Install with Helm
git clone https://github.com/salesforce/sloop.git
cd sloop/helm/sloop
kubectl create namespace sloop
helm template . --namespace sloop > sloop-test.yaml
kubectl -n sloop apply -f sloop-test.yaml
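Because helm template writes plain YAML, the rendered file can be reviewed before it is applied. As a rough sketch, the helper below lists the distinct resource kinds in a rendered manifest. The function name manifest_kinds is an assumption for this example, and it only matches top-level kind: lines that start in the first column.

```shell
#!/bin/sh
# manifest_kinds FILE prints the distinct top-level "kind:" values found
# in a rendered manifest, sorted alphabetically, one per line.
manifest_kinds() {
    awk '/^kind:/ { print $2 }' "$1" | sort -u
}

# Example: review what the chart will create before applying it.
# manifest_kinds sloop-test.yaml
```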
Reference: https://github.com/salesforce/sloop.git