This article introduces a lightweight Kubernetes log collection solution. I have used it in production myself and, to my surprise, the Kubernetes resources it consumes are really small compared to an ELK stack. Follow along with this article and start learning it.
Why you should use Loki
This article focuses on Loki, a log collection application developed by Grafana. Loki is a lightweight log collection and analysis system: Promtail (or another agent) obtains the log content and sends it to Loki for storage, and Loki is then added to Grafana's data sources for log display and query.
Loki's persistent storage supports five types: Azure, GCS, S3, Swift, and local, of which S3 and local are the most commonly used. It also supports many kinds of log collection clients, such as the commonly used Logstash and Fluent Bit.
So what are the advantages of it?
- Supported clients include Promtail, Fluent Bit, Fluentd, Vector, Logstash, and the Grafana Agent
- Preferred agent Promtail, which can extract logs from multiple sources, including local log files, systemd, Windows event logs, Docker logging drivers, and more
- There are no log format requirements, including JSON, XML, CSV, logfmt, unstructured text
- Query logs using the same syntax as query metrics
- Log queries allow you to dynamically filter and convert log lines
- You can easily calculate the required metrics in the logs
- Minimal indexes at ingestion mean you can dynamically slice and dice logs at query time to answer new questions as they arise
- Cloud-native support, using Prometheus to scrape data
A simple comparison of the log collection components
How Loki handles log parsing and formats
As the figure above shows, when parsing logs Loki mainly builds an index consisting of the timestamp and part of the pod's labels (other labels include filename, container, and so on), while the rest is the raw log content. The query effect looks like this:
{app="loki",namespace="kube-public"}为索引
Log collection architecture pattern
In practice, it is recommended to use Promtail as the agent, deployed on the Kubernetes worker nodes in DaemonSet mode to collect logs. You can also use the other log collection tools mentioned above; their configuration is attached at the end of this article.
What are the Loki deployment modes?
Loki is built from a number of microservice components, five in total, plus a cache in front of them to hold data and speed up queries. Data is placed in shared storage, and the memberlist_config section shares state between instances, so Loki can be scaled out almost without limit.
After the memberlist_config section is configured, instances discover data by polling the ring. For ease of use, the project compiles all of the microservices into a single binary, which is controlled by the command-line parameter -target; it supports all, read, and write, so we can pick the mode that fits the log volume when deploying.
all (read/write mode)
After the service starts, both data queries and data writes are handled by this single node. Take a look at the diagram below:
read/write (read/write splitting mode)
When running in read/write splitting mode, the query-frontend forwards query traffic to the read nodes. The querier, ruler, and frontend are kept on the read nodes, while the distributor and ingester are kept on the write nodes.
Microservice mode
In the microservice mode, different roles are started through different configuration parameters, and each process references its target role service.
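To illustrate the -target parameter described above, the same binary can be started in any of these modes. The config path below is the one used later in this article, and the exact component names accepted by -target depend on your Loki version, so treat this as a sketch:
loki -config.file=/etc/loki/loki-all.yaml -target=all        # monolithic: reads and writes in one process
loki -config.file=/etc/loki/loki-all.yaml -target=read       # read path (query-frontend, querier, ruler)
loki -config.file=/etc/loki/loki-all.yaml -target=write      # write path (distributor, ingester)
loki -config.file=/etc/loki/loki-all.yaml -target=ingester   # a single component, for microservice mode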
Server-side deployment
We've talked a lot about Loki and how it works; now you're probably wondering how to deploy it, where to deploy it, and how to use it after deployment.
You need to prepare a k8s cluster before deploying. Okay, let's work through it patiently...
AllInOne deployment model
(1) K8S deployment
The program downloaded from GitHub does not ship with a configuration file, so we need to prepare one in advance. A complete allInOne configuration file, with some optimizations, is provided here.
The configuration file content is as follows
auth_enabled: false
target: all
ballast_bytes: 20480
server:
grpc_listen_port: 9095
http_listen_port: 3100
graceful_shutdown_timeout: 20s
grpc_listen_address: "0.0.0.0"
grpc_listen_network: "tcp"
grpc_server_max_concurrent_streams: 100
grpc_server_max_recv_msg_size: 4194304
grpc_server_max_send_msg_size: 4194304
http_server_idle_timeout: 2m
http_listen_address: "0.0.0.0"
http_listen_network: "tcp"
http_server_read_timeout: 30s
http_server_write_timeout: 20s
log_source_ips_enabled: true
  # If http_path_prefix is changed, the same prefix must be added to the path when pushing logs
# http_path_prefix: "/"
register_instrumentation: true
log_format: json
log_level: info
distributor:
ring:
heartbeat_timeout: 3s
kvstore:
prefix: collectors/
store: memberlist
      # A Consul cluster needs to be created in advance
# consul:
# http_client_timeout: 20s
# consistent_reads: true
# host: 127.0.0.1:8500
# watch_burst_size: 2
# watch_rate_limit: 2
querier:
engine:
max_look_back_period: 20s
timeout: 3m0s
extra_query_delay: 100ms
max_concurrent: 10
multi_tenant_queries_enabled: true
query_ingester_only: false
query_ingesters_within: 3h0m0s
query_store_only: false
query_timeout: 5m0s
tail_max_duration: 1h0s
query_scheduler:
max_outstanding_requests_per_tenant: 2048
grpc_client_config:
max_recv_msg_size: 104857600
max_send_msg_size: 16777216
grpc_compression: gzip
rate_limit: 0
rate_limit_burst: 0
backoff_on_ratelimits: false
backoff_config:
min_period: 50ms
max_period: 15s
max_retries: 5
use_scheduler_ring: true
scheduler_ring:
kvstore:
store: memberlist
prefix: "collectors/"
heartbeat_period: 30s
heartbeat_timeout: 1m0s
    # Defaults to the name of the first network interface
# instance_interface_names
# instance_addr: 127.0.0.1
    # Defaults to server.grpc-listen-port
instance_port: 9095
frontend:
max_outstanding_per_tenant: 4096
querier_forget_delay: 1h0s
compress_responses: true
log_queries_longer_than: 2m0s
max_body_size: 104857600
query_stats_enabled: true
scheduler_dns_lookup_period: 10s
scheduler_worker_concurrency: 15
query_range:
align_queries_with_step: true
cache_results: true
parallelise_shardable_queries: true
max_retries: 3
results_cache:
cache:
enable_fifocache: false
default_validity: 30s
background:
writeback_buffer: 10000
redis:
endpoint: 127.0.0.1:6379
timeout: 1s
expiration: 0s
db: 9
pool_size: 128
password: 1521Qyx6^
tls_enabled: false
tls_insecure_skip_verify: true
idle_timeout: 10s
max_connection_age: 8h
ruler:
enable_api: true
enable_sharding: true
alertmanager_refresh_interval: 1m
disable_rule_group_label: false
evaluation_interval: 1m0s
flush_period: 3m0s
for_grace_period: 20m0s
for_outage_tolerance: 1h0s
notification_queue_capacity: 10000
notification_timeout: 4s
poll_interval: 10m0s
query_stats_enabled: true
remote_write:
config_refresh_period: 10s
enabled: false
resend_delay: 2m0s
rule_path: /rulers
search_pending_for: 5m0s
storage:
local:
directory: /data/loki/rulers
    type: local
sharding_strategy: default
wal_cleaner:
period: 240h
min_age: 12h0m0s
wal:
dir: /data/loki/ruler_wal
max_age: 4h0m0s
min_age: 5m0s
truncate_frequency: 1h0m0s
ring:
kvstore:
store: memberlist
prefix: "collectors/"
heartbeat_period: 5s
heartbeat_timeout: 1m0s
# instance_addr: "127.0.0.1"
# instance_id: "miyamoto.en0"
# instance_interface_names: ["en0","lo0"]
instance_port: 9500
num_tokens: 100
ingester_client:
pool_config:
health_check_ingesters: false
client_cleanup_period: 10s
remote_timeout: 3s
remote_timeout: 5s
ingester:
autoforget_unhealthy: true
chunk_encoding: gzip
chunk_target_size: 1572864
max_transfer_retries: 0
sync_min_utilization: 3.5
sync_period: 20s
flush_check_period: 30s
flush_op_timeout: 10m0s
chunk_retain_period: 1m30s
chunk_block_size: 262144
chunk_idle_period: 1h0s
max_returned_stream_errors: 20
concurrent_flushes: 3
index_shards: 32
max_chunk_age: 2h0m0s
query_store_max_look_back_period: 3h30m30s
wal:
enabled: true
dir: /data/loki/wal
flush_on_shutdown: true
checkpoint_duration: 15m
replay_memory_ceiling: 2GB
lifecycler:
ring:
kvstore:
store: memberlist
prefix: "collectors/"
heartbeat_timeout: 30s
replication_factor: 1
num_tokens: 128
heartbeat_period: 5s
join_after: 5s
observe_period: 1m0s
# interface_names: ["en0","lo0"]
final_sleep: 10s
min_ready_duration: 15s
storage_config:
boltdb:
directory: /data/loki/boltdb
boltdb_shipper:
active_index_directory: /data/loki/active_index
build_per_tenant_index: true
cache_location: /data/loki/cache
cache_ttl: 48h
resync_interval: 5m
query_ready_num_days: 5
index_gateway_client:
grpc_client_config:
filesystem:
directory: /data/loki/chunks
chunk_store_config:
chunk_cache_config:
enable_fifocache: true
default_validity: 30s
background:
writeback_buffer: 10000
redis:
endpoint: 192.168.3.56:6379
timeout: 1s
expiration: 0s
db: 8
pool_size: 128
password: 1521Qyx6^
tls_enabled: false
tls_insecure_skip_verify: true
idle_timeout: 10s
max_connection_age: 8h
fifocache:
ttl: 1h
validity: 30m0s
max_size_items: 2000
max_size_bytes: 500MB
write_dedupe_cache_config:
enable_fifocache: true
default_validity: 30s
background:
writeback_buffer: 10000
redis:
endpoint: 127.0.0.1:6379
timeout: 1s
expiration: 0s
db: 7
pool_size: 128
password: 1521Qyx6^
tls_enabled: false
tls_insecure_skip_verify: true
idle_timeout: 10s
max_connection_age: 8h
fifocache:
ttl: 1h
validity: 30m0s
max_size_items: 2000
max_size_bytes: 500MB
cache_lookups_older_than: 10s
# Compact index fragments
compactor:
shared_store: filesystem
shared_store_key_prefix: index/
working_directory: /data/loki/compactor
compaction_interval: 10m0s
retention_enabled: true
retention_delete_delay: 2h0m0s
retention_delete_worker_count: 150
delete_request_cancel_period: 24h0m0s
max_compaction_parallelism: 2
# compactor_ring:
frontend_worker:
match_max_concurrent: true
parallelism: 10
dns_lookup_duration: 5s
# runtime_config: nothing is configured here
# runtime_config:
common:
storage:
filesystem:
chunks_directory: /data/loki/chunks
      rules_directory: /data/loki/rulers
replication_factor: 3
persist_tokens: false
# instance_interface_names: ["en0","eth0","ens33"]
analytics:
reporting_enabled: false
limits_config:
ingestion_rate_strategy: global
ingestion_rate_mb: 100
ingestion_burst_size_mb: 18
max_label_name_length: 2096
max_label_value_length: 2048
max_label_names_per_series: 60
enforce_metric_name: true
max_entries_limit_per_query: 5000
reject_old_samples: true
reject_old_samples_max_age: 168h
creation_grace_period: 20m0s
max_global_streams_per_user: 5000
unordered_writes: true
max_chunks_per_query: 200000
max_query_length: 721h
max_query_parallelism: 64
max_query_series: 700
cardinality_limit: 100000
max_streams_matchers_per_query: 1000
max_concurrent_tail_requests: 10
ruler_evaluation_delay_duration: 3s
ruler_max_rules_per_rule_group: 0
ruler_max_rule_groups_per_tenant: 0
retention_period: 700h
per_tenant_override_period: 20s
max_cache_freshness_per_query: 2m0s
max_queriers_per_tenant: 0
per_stream_rate_limit: 6MB
per_stream_rate_limit_burst: 50MB
max_query_lookback: 0
ruler_remote_write_disabled: false
min_sharding_lookback: 0s
split_queries_by_interval: 10m0s
max_line_size: 30mb
max_line_size_truncate: false
max_streams_per_user: 0
# The memberlist_config block configures gossip, which is used for discovery and connections between the distributors, ingesters, and queriers.
# The same configuration is used by all three components to ensure a single shared ring.
# Once at least one join_members entry is defined, a memberlist-type kvstore is configured automatically for the distributors, ingesters, and the ring.
memberlist:
randomize_node_name: true
stream_timeout: 5s
retransmit_factor: 4
join_members:
- 'loki-memberlist'
abort_if_cluster_join_fails: true
advertise_addr: 0.0.0.0
advertise_port: 7946
bind_addr: ["0.0.0.0"]
bind_port: 7946
compression_enabled: true
dead_node_reclaim_time: 30s
gossip_interval: 100ms
gossip_nodes: 3
gossip_to_dead_nodes_time: 3
# join:
leave_timeout: 15s
left_ingesters_timeout: 3m0s
max_join_backoff: 1m0s
max_join_retries: 5
message_history_buffer_bytes: 4096
min_join_backoff: 2s
# node_name: miyamoto
packet_dial_timeout: 5s
packet_write_timeout: 5s
pull_push_interval: 100ms
rejoin_interval: 10s
tls_enabled: false
tls_insecure_skip_verify: true
schema_config:
configs:
- from: "2020-10-24"
index:
period: 24h
prefix: index_
object_store: filesystem
schema: v11
store: boltdb-shipper
chunks:
period: 168h
row_shards: 32
table_manager:
retention_deletes_enabled: false
retention_period: 0s
throughput_updates_disabled: false
poll_interval: 3m0s
creation_grace_period: 20m
index_tables_provisioning:
provisioned_write_throughput: 1000
provisioned_read_throughput: 500
inactive_write_throughput: 4
inactive_read_throughput: 300
inactive_write_scale_lastn: 50
enable_inactive_throughput_on_demand_mode: true
enable_ondemand_throughput_mode: true
inactive_read_scale_lastn: 10
write_scale:
enabled: true
target: 80
# role_arn:
out_cooldown: 1800
min_capacity: 3000
max_capacity: 6000
in_cooldown: 1800
inactive_write_scale:
enabled: true
target: 80
out_cooldown: 1800
min_capacity: 3000
max_capacity: 6000
in_cooldown: 1800
read_scale:
enabled: true
target: 80
out_cooldown: 1800
min_capacity: 3000
max_capacity: 6000
in_cooldown: 1800
inactive_read_scale:
enabled: true
target: 80
out_cooldown: 1800
min_capacity: 3000
max_capacity: 6000
in_cooldown: 1800
chunk_tables_provisioning:
enable_inactive_throughput_on_demand_mode: true
enable_ondemand_throughput_mode: true
provisioned_write_throughput: 1000
provisioned_read_throughput: 300
inactive_write_throughput: 1
inactive_write_scale_lastn: 50
inactive_read_throughput: 300
inactive_read_scale_lastn: 10
write_scale:
enabled: true
target: 80
out_cooldown: 1800
min_capacity: 3000
max_capacity: 6000
in_cooldown: 1800
inactive_write_scale:
enabled: true
target: 80
out_cooldown: 1800
min_capacity: 3000
max_capacity: 6000
in_cooldown: 1800
read_scale:
enabled: true
target: 80
out_cooldown: 1800
min_capacity: 3000
max_capacity: 6000
in_cooldown: 1800
inactive_read_scale:
enabled: true
target: 80
out_cooldown: 1800
min_capacity: 3000
max_capacity: 6000
in_cooldown: 1800
tracing:
enabled: true
Caution:
The value of ingester.lifecycler.ring.replication_factor is 1 in the case of a single instance
The value of ingester.lifecycler.min_ready_duration is 15s, meaning the instance waits 15 seconds after startup before its status changes to ready
memberlist.node_name does not need to be set; it defaults to the name of the current host
memberlist.join_members is a list; with multiple instances, the hostname/IP address of each node needs to be added. In k8s, you can create a Service bound to the StatefulSet instead
query_range.results_cache.cache.enable_fifocache is recommended to be set to false; it is set to false in the configuration above
instance_interface_names is a list that defaults to ["en0","eth0"]; set the corresponding NIC names as needed, though generally no special setting is required
Create a configmap
Note: Write the above content to a file named loki-all.yaml and create it as a configmap named loki in the k8s cluster (the StatefulSet below mounts a configmap with this name). You can create it with the following command:
kubectl create configmap loki --from-file=./loki-all.yaml
You can view the created configmap by running the command, as shown in the following figure
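For example, a quick check that the configmap exists and contains the loki-all.yaml key (names follow the command above):
kubectl get configmap loki
kubectl describe configmap loki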
Create persistent storage
In k8s, data needs to be persisted. The log information collected by Loki matters to the business, so the logs must be retained when a container restarts.
That requires a PV and a PVC; the backing storage can be nfs, glusterfs, hostPath, azureDisk, cephfs, or any of the roughly 20 other supported types. Because there is no corresponding environment here, hostPath mode is used.
apiVersion: v1
kind: PersistentVolume
metadata:
name: loki
namespace: default
spec:
hostPath:
path: /glusterfs/loki
type: DirectoryOrCreate
capacity:
storage: 1Gi
accessModes:
- ReadWriteMany
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: loki
namespace: default
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 1Gi
volumeName: loki
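Assuming the manifest above is saved as loki-pv.yaml (the file name is only an example), it can be applied and verified like this:
kubectl apply -f loki-pv.yaml
kubectl get pv,pvc
# The PVC named loki should show STATUS Bound once it has matched the PV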
Create an app
After preparing the statefulSet deployment file of k8s, you can directly create an application in the cluster.
apiVersion: apps/v1
kind: StatefulSet
metadata:
labels:
app: loki
name: loki
namespace: default
spec:
podManagementPolicy: OrderedReady
replicas: 1
selector:
matchLabels:
app: loki
template:
metadata:
annotations:
prometheus.io/port: http-metrics
prometheus.io/scrape: "true"
labels:
app: loki
spec:
containers:
- args:
- -config.file=/etc/loki/loki-all.yaml
image: grafana/loki:2.5.0
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 3
httpGet:
path: /ready
port: http-metrics
scheme: HTTP
initialDelaySeconds: 45
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
name: loki
ports:
- containerPort: 3100
name: http-metrics
protocol: TCP
- containerPort: 9095
name: grpc
protocol: TCP
- containerPort: 7946
name: memberlist-port
protocol: TCP
readinessProbe:
failureThreshold: 3
httpGet:
path: /ready
port: http-metrics
scheme: HTTP
initialDelaySeconds: 45
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
resources:
requests:
cpu: 500m
memory: 500Mi
limits:
cpu: 500m
memory: 500Mi
securityContext:
readOnlyRootFilesystem: true
volumeMounts:
- mountPath: /etc/loki
name: config
- mountPath: /data
name: storage
restartPolicy: Always
securityContext:
fsGroup: 10001
runAsGroup: 10001
runAsNonRoot: true
runAsUser: 10001
serviceAccount: loki
serviceAccountName: loki
volumes:
- emptyDir: {}
name: tmp
- name: config
configMap:
name: loki
- persistentVolumeClaim:
claimName: loki
name: storage
---
kind: Service
apiVersion: v1
metadata:
name: loki-memberlist
namespace: default
spec:
ports:
- name: loki-memberlist
protocol: TCP
port: 7946
targetPort: 7946
selector:
    app: loki
---
kind: Service
apiVersion: v1
metadata:
name: loki
namespace: default
spec:
ports:
- name: loki
protocol: TCP
port: 3100
targetPort: 3100
selector:
    app: loki
In the manifest above I have added some pod-level security settings. There is also a cluster-level equivalent, PodSecurityPolicy, which helps prevent the whole cluster from being compromised through a single vulnerability; for details on cluster-level PSPs, see the official documentation.
Verify the deployment results
When you see the pods in the Running state as above, you can use the API to check whether the distributor is working properly; once the ring shows Active, log streams will be distributed normally to the collector (ingester).
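A minimal way to check this, assuming the service name, labels, and ports from the manifests above (port-forwarding is just one convenient option):
kubectl get pods -l app=loki
kubectl port-forward svc/loki 3100:3100 &
curl http://127.0.0.1:3100/ready     # returns ready when the instance is up
curl http://127.0.0.1:3100/ring      # the ring status page; the ingester instance should be ACTIVE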
(2) Bare metal deployment
Place the loki binary in the system's /bin/ directory and prepare a grafana-loki.service unit file (for example under /usr/lib/systemd/system/, as is done for Promtail later), then reload the systemd service list.
[Unit]
Description=Grafana Loki Log Ingester
Documentation=https://grafana.com/logs/
After=network-online.target
[Service]
ExecStart=/bin/loki --config.file /etc/loki/loki-all.yaml
ExecReload=/bin/kill -s HUP $MAINPID
ExecStop=/bin/kill -s TERM $MAINPID
[Install]
WantedBy=multi-user.target
Reload the systemd unit list, after which the service can be managed directly with systemctl:
systemctl daemon-reload
# Start the service
systemctl start grafana-loki
# Stop the service
systemctl stop grafana-loki
# Reload the service
systemctl reload grafana-loki
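Optionally, enable the service to start at boot (standard systemctl usage):
systemctl enable grafana-loki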
Promtail deployment in action
When deploying the client that collects the logs, you also need to create a configuration file, following the same steps as for the server above. The difference is that the client pushes the log content to the server.
(1) K8S deployment
Create a configuration file
server:
log_level: info
http_listen_port: 3101
clients:
- url: http://loki:3100/loki/api/v1/push
positions:
filename: /run/promtail/positions.yaml
scrape_configs:
- job_name: kubernetes-pods
pipeline_stages:
- cri: {}
kubernetes_sd_configs:
- role: pod
relabel_configs:
- source_labels:
- __meta_kubernetes_pod_controller_name
regex: ([0-9a-z-.]+?)(-[0-9a-f]{8,10})?
action: replace
target_label: __tmp_controller_name
- source_labels:
- __meta_kubernetes_pod_label_app_kubernetes_io_name
- __meta_kubernetes_pod_label_app
- __tmp_controller_name
- __meta_kubernetes_pod_name
regex: ^;*([^;]+)(;.*)?$
action: replace
target_label: app
- source_labels:
- __meta_kubernetes_pod_label_app_kubernetes_io_instance
- __meta_kubernetes_pod_label_release
regex: ^;*([^;]+)(;.*)?$
action: replace
target_label: instance
- source_labels:
- __meta_kubernetes_pod_label_app_kubernetes_io_component
- __meta_kubernetes_pod_label_component
regex: ^;*([^;]+)(;.*)?$
action: replace
target_label: component
- action: replace
source_labels:
- __meta_kubernetes_pod_node_name
target_label: node_name
- action: replace
source_labels:
- __meta_kubernetes_namespace
target_label: namespace
- action: replace
replacement: $1
separator: /
source_labels:
- namespace
- app
target_label: job
- action: replace
source_labels:
- __meta_kubernetes_pod_name
target_label: pod
- action: replace
source_labels:
- __meta_kubernetes_pod_container_name
target_label: container
- action: replace
replacement: /var/log/pods/*$1/*.log
separator: /
source_labels:
- __meta_kubernetes_pod_uid
- __meta_kubernetes_pod_container_name
target_label: __path__
- action: replace
regex: true/(.*)
replacement: /var/log/pods/*$1/*.log
separator: /
source_labels:
- __meta_kubernetes_pod_annotationpresent_kubernetes_io_config_hash
- __meta_kubernetes_pod_annotation_kubernetes_io_config_hash
- __meta_kubernetes_pod_container_name
target_label: __path__
Create a configmap from the content above; the method is the same as before.
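For example, assuming the configuration above was saved as promtail.yaml (the DaemonSet below mounts a configmap named promtail and reads /etc/promtail/promtail.yaml, so the file name becomes the key):
kubectl create configmap promtail --from-file=./promtail.yaml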
Create a DaemonSet file
Promtail is a stateless application and does not need persistent storage; it just needs to be deployed on every node in the cluster, so prepare a DaemonSet manifest as well.
kind: DaemonSet
apiVersion: apps/v1
metadata:
name: promtail
namespace: default
labels:
app.kubernetes.io/instance: promtail
app.kubernetes.io/name: promtail
app.kubernetes.io/version: 2.5.0
spec:
selector:
matchLabels:
app.kubernetes.io/instance: promtail
app.kubernetes.io/name: promtail
template:
metadata:
labels:
app.kubernetes.io/instance: promtail
app.kubernetes.io/name: promtail
spec:
volumes:
- name: config
configMap:
name: promtail
- name: run
hostPath:
path: /run/promtail
- name: containers
hostPath:
path: /var/lib/docker/containers
- name: pods
hostPath:
path: /var/log/pods
containers:
- name: promtail
image: docker.io/grafana/promtail:2.3.0
args:
- '-config.file=/etc/promtail/promtail.yaml'
ports:
- name: http-metrics
containerPort: 3101
protocol: TCP
env:
- name: HOSTNAME
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: spec.nodeName
volumeMounts:
- name: config
mountPath: /etc/promtail
- name: run
mountPath: /run/promtail
- name: containers
readOnly: true
mountPath: /var/lib/docker/containers
- name: pods
readOnly: true
mountPath: /var/log/pods
readinessProbe:
httpGet:
path: /ready
port: http-metrics
scheme: HTTP
initialDelaySeconds: 10
timeoutSeconds: 1
periodSeconds: 10
successThreshold: 1
failureThreshold: 5
imagePullPolicy: IfNotPresent
securityContext:
capabilities:
drop:
- ALL
readOnlyRootFilesystem: false
allowPrivilegeEscalation: false
restartPolicy: Always
serviceAccountName: promtail
serviceAccount: promtail
tolerations:
- key: node-role.kubernetes.io/master
operator: Exists
effect: NoSchedule
- key: node-role.kubernetes.io/control-plane
operator: Exists
effect: NoSchedule
Create a promtail application
kubectl apply -f promtail.yaml
After running the command above, you can see that the pods have been created. The next step is to add a DataSource in Grafana to view the data.
(2) Bare metal deployment
If it is a bare-metal deployment, you need to make a slight change to the above configuration file, change the address of the clients, and store the file in /etc/loki/, for example:
clients:
- url: http://ipaddress:port/loki/api/v1/push
Add the system boot configuration, and store the service configuration file in /usr/lib/systemd/system/loki-promtail.service as follows:
[Unit]
Description=Grafana Loki Log Ingester
Documentation=https://grafana.com/logs/
After=network-online.target
[Service]
ExecStart=/bin/promtail --config.file /etc/loki/loki-promtail.yaml
ExecReload=/bin/kill -s HUP $MAINPID
ExecStop=/bin/kill -s TERM $MAINPID
[Install]
WantedBy=multi-user.target
The startup method is the same as in the server-side deployment above.
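For completeness, a sketch of the corresponding commands, assuming the loki-promtail.service unit file created above:
systemctl daemon-reload
systemctl start loki-promtail
systemctl enable loki-promtail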
Loki as a Grafana DataSource
Add a data source
Steps: Grafana -> Setting -> DataSources -> AddDataSource -> Loki
Notes:
HTTP URL: depending on which namespace the application or service is deployed in, you need to specify its FQDN in the format ServiceName.namespace. With the default port 3100 you would fill in http://loki:3100. The service name is used instead of an IP address because the DNS server in the k8s cluster resolves this address automatically.
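If the data source test fails, a quick in-cluster check of the address can rule out DNS or Service problems; the pod name and image here are only examples:
kubectl run curl-test --rm -it --image=curlimages/curl --restart=Never -- curl -s http://loki.default:3100/ready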
Find log information
Other client configurations
Logstash acts as a log collection client
Install the plugin
After installing Logstash, we need to install the Loki output plugin; you can install it with the command below, and once the installation is complete you can add Loki to the output section of Logstash.
bin/logstash-plugin install logstash-output-loki
Add configurations for testing
For complete logstash configuration information, please refer to the LogstashConfigFile on the official website
output {
loki {
[url => "" | default = none | required=true]
[tenant_id => string | default = nil | required=false]
[message_field => string | default = "message" | required=false]
[include_fields => array | default = [] | required=false]
[batch_wait => number | default = 1(s) | required=false]
[batch_size => number | default = 102400(bytes) | required=false]
[min_delay => number | default = 1(s) | required=false]
[max_delay => number | default = 300(s) | required=false]
[retries => number | default = 10 | required=false]
[username => string | default = nil | required=false]
[password => secret | default = nil | required=false]
[cert => path | default = nil | required=false]
[key => path | default = nil| required=false]
[ca_cert => path | default = nil | required=false]
[insecure_skip_verify => boolean | default = false | required=false]
}
}
Or use the http output module of Logstash with the following configuration:
output {
http {
format => "json"
http_method => "post"
content_type => "application/json"
connect_timeout => 10
url => "http://loki:3100/loki/api/v1/push"
message => '"message":"%{message}"}'
}
}
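For reference when debugging any client, the /loki/api/v1/push endpoint expects JSON in the streams/values format shown below; the label and log line are placeholders, and the timestamp must be the current Unix time in nanoseconds:
curl -s -X POST http://loki:3100/loki/api/v1/push \
  -H "Content-Type: application/json" \
  -d '{"streams":[{"stream":{"job":"test"},"values":[["'"$(date +%s%N)"'","hello from curl"]]}]}'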
Helm installation
If you want a simpler installation, you can use Helm. Helm encapsulates all of the installation steps and simplifies them.
However, Helm is less suitable for people who want to learn more about k8s: because everything is encapsulated and executed automatically, k8s administrators may not understand how the components depend on each other, which can lead to misunderstandings.
Without further ado, let's start the helm installation
Add a repo source
helm repo add grafana https://grafana.github.io/helm-charts
Update the source
helm repo update
Deploy
Default configuration
helm upgrade --install loki grafana/loki-simple-scalable
Customize the namespace
helm upgrade --install loki --namespace=loki grafana/loki-simple-scalable
Customize the configuration information
helm upgrade --install loki grafana/loki-simple-scalable --set "key1=val1,key2=val2,..."
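After the chart is installed, a quick sanity check (namespace as in the custom-namespace example above):
helm list -n loki
kubectl get pods -n loki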
Troubleshooting
1. 502 Bad Gateway
The address of loki is incorrect
In k8s, the address was filled in incorrectly, resulting in a 502. Check whether the Loki address matches one of the following forms:
http://LokiServiceName
http://LokiServiceName.namespace
http://LokiServiceName.namespace:ServicePort
If Grafana and Loki are on different nodes, check the network connectivity and firewall policy between the nodes
2. Ingester not ready: instance xx:9095 in state JOINING
Wait patiently for a while, as it will take some time for the program to start in allInOne mode.
3. too many unhealthy instances in the ring
Change ingester.lifecycler.ring.replication_factor to 1. This error comes from an incorrect setting: the ring was configured for multiple replicas at startup, but only one instance is actually deployed, so the message appears when querying labels
4. Data source connected, but no labels received. Verify that Loki and Promtail is configured properly
- Promtail cannot send the collected logs to Loki; check whether Promtail's output is normal
- Promtail sent logs before Loki was ready, so Loki did not receive them. To have the logs collected again, delete the positions.yaml file (use find to locate it; see the example after this list)
- Promtail is ignoring the target log file, or a configuration-file error prevents it from starting properly
- Promtail was unable to discover the log file at the specified location
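A minimal sketch for the positions.yaml case above; the path matches the positions.filename set in the Promtail configuration earlier in this article (it is a hostPath mount, so the file lives on the node), adjust it if you changed that setting:
find / -name positions.yaml 2>/dev/null
rm /run/promtail/positions.yaml                 # on the node, per the hostPath /run/promtail above
kubectl rollout restart daemonset/promtail      # restart Promtail so the logs are collected again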
Official Documentation:
- https://kubernetes.io/docs/concepts/security/pod-security-policy/
Thanks for reading, I hope it helps you :) Source: juejin.cn/post/7150469420605767717