
Tools for Analyzing and Troubleshooting Linux System and Application Problems

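Linux servers regularly run into system-level and application-level problems, and diagnosing them takes good tools. This post lists commonly used utilities and trace tools, and closes with the trace tools for distributed systems that the Hadoop community has recently been developing.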

The figure from linux-performance-analysis-and-tools shows the layer of the stack at which each of these tools applies.


uname -a or cat /proc/version # print system information

Linux hadoopst2.cm6 2.6.18-164.el5 #1 SMP Tue Aug 18 15:51:48 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux

uptime

15:42:46 up 674 days, 6 min, 35 users, load average: 1.30, 5.97, 11.53
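When uptime is collected from many machines, it helps to turn the load averages into numbers that can be compared. The following is a minimal sketch, assuming the common "load average: a, b, c" layout shown above; the function name parse_load_average is illustrative, not part of any tool.

```python
import re


def parse_load_average(uptime_line):
    """Return the 1-, 5-, and 15-minute load averages from one line of uptime output."""
    match = re.search(
        r"load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)", uptime_line
    )
    if match is None:
        raise ValueError("no load average found in: %r" % uptime_line)
    return tuple(float(value) for value in match.groups())


line = "15:42:46 up 674 days, 6 min, 35 users, load average: 1.30, 5.97, 11.53"
one, five, fifteen = parse_load_average(line)
# A 1-minute value below the 15-minute value suggests the load is falling.
print("load is falling" if one < fifteen else "load is steady or rising")
```

In the sample line the 1-minute average (1.30) is well below the 15-minute average (11.53), so whatever loaded the machine has mostly passed.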

cat /etc/redhat-release

Red Hat Enterprise Linux Server release 5.4 (Tikanga)

lsb_release

LSB Version:  :core-3.1-amd64:core-3.1-ia32:core-3.1-noarch:graphics-3.1-amd64:graphics-3.1-ia32:graphics-3.1-noarch

cat /proc/cpuinfo

cat /proc/meminfo
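The contents of /proc/meminfo are "Name: value kB" lines, so they are easy to load into a dictionary for scripted checks. This is a rough sketch: the sample values below are made up for illustration, and summing MemFree, Buffers, and Cached is only an approximation of the memory that can be reclaimed.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> integer value."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "Name:" prefix
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])  # most values are in kB
    return fields


# Illustrative sample only; on a real host read open("/proc/meminfo").read().
SAMPLE = """MemTotal:       16336468 kB
MemFree:          204800 kB
Buffers:          102400 kB
Cached:          8192000 kB
"""

info = parse_meminfo(SAMPLE)
reclaimable_kb = info["MemFree"] + info["Buffers"] + info["Cached"]
print("roughly available: %d MB" % (reclaimable_kb // 1024))
```

A low MemFree on its own is not a problem on Linux, because the page cache (Cached) is released under memory pressure.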

lspci - list all pci devices

lsusb - list usb devices

last, lastb - show listing of last logged in users

lsmod - show the status of modules in the linux kernel

modprobe - add and remove modules from the linux kernel

ps

to print a process tree: ps -ejH / ps axjf

to get info about threads: ps -eLf / ps axms

ulimit -a

lsof - list open files; on Unix, everything is a file

lsof -p pid

rpm/yum

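rpm -qf file # find the RPM package that owns a file

rpm -ql package # list the files contained in an RPM package

/var/log/yum.log # yum package update log

/etc/xxx # system-wide program configuration directories, e.g.

/etc/yum.repos.d/ # yum repository configuration

/var/log/xxx # log directories, e.g.

/var/log/cron # crontab log; shows how scheduled jobs ran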

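ntpd - Network Time Protocol (NTP) daemon; keeps the clocks of the machines in a cluster synchronized

squid - proxy caching server; used as a proxy for the cluster's web UIs

mpstat - report processors related statistics; watch the %sys and %iowait values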

vmstat - report virtual memory statistics

iostat - report central processing unit (cpu) statistics and input/output statistics for devices and partitions.

netstat - print network connections, routing tables, interface statistics, masquerade connections, and multicast memberships

netstat -atpn | grep pid
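A plain grep for a PID also matches the same digits inside port numbers or addresses. The sketch below filters on the PID/Program name column only; it assumes the usual seven-column TCP layout of netstat -atpn, and the sample rows (addresses, ports, PIDs) are invented for illustration.

```python
def connections_for_pid(netstat_output, pid):
    """Return (local, foreign, state) for TCP rows owned by the given PID."""
    wanted = str(pid)
    rows = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: proto recv-q send-q local foreign state pid/program
        if len(fields) < 7 or not fields[0].startswith("tcp"):
            continue
        owner_pid = fields[6].split("/", 1)[0]  # "-" when the owner is hidden
        if owner_pid == wanted:
            rows.append((fields[3], fields[4], fields[5]))
    return rows


# Illustrative sample: note that port 4021 on the second row is not PID 4021.
SAMPLE = """Proto Recv-Q Send-Q Local Address    Foreign Address   State       PID/Program name
tcp        0      0 0.0.0.0:50070    0.0.0.0:*         LISTEN      4021/java
tcp        0      0 10.0.0.5:4021    10.0.0.9:50010    ESTABLISHED 7788/java
"""

for local, foreign, state in connections_for_pid(SAMPLE, 4021):
    print(local, foreign, state)
```

Without root privileges netstat shows "-" in the last column for sockets owned by other users, so those rows are never matched.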

ganglia - a scalable distributed monitoring system for high-performance computing systems such as clusters and grids.

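sar/tsar - collect, report, or save system activity information; tsar is Taobao's own improved version

Samples on a schedule (every minute) and keeps history that can be queried (5-minute granularity by default); it complements ganglia by showing more detail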

iftop - the "top" bandwidth consumers shown. iftop wiki

iotop

vmtouch, portable file system cache diagnostics and control

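telnet/nc ip port - check whether the target port is reachable; a successful ping does not guarantee that the port is reachable, because a firewall or similar may block it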
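The same check can be scripted when many host/port pairs have to be tested. This is a minimal sketch using only the standard socket module; port_is_open is an illustrative name, and the host and port in the demo are placeholders.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, DNS failure, and timeout.
        return False


if __name__ == "__main__":
    # Placeholder target; replace with the service being diagnosed.
    print(port_is_open("127.0.0.1", 22, timeout=1.0))
```

A refused connection returns quickly, while a port silently dropped by a firewall usually shows up as a timeout, so measuring how long the call takes can hint at which case applies.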

ifconfig/ifup/ifdown - configure a network interface

traceroute - print the route packets trace to network host

nslookup - query internet name servers interactively

tcpdump - dump traffic on a network; similar open-source tools include wireshark and netsniff-ng (more tool comparisons)

lynx - a general purpose distributed information browser for the world wide web

tcpcp - allows cooperating applications to pass ownership of tcp connection endpoints from one linux host to another one.

ldconfig - configure dynamic linker run time bindings

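ldconfig -p | grep so # check whether a shared object is in the linker cache

ldd - print shared library dependencies; shows the shared objects that an executable or .so depends on

nm - list symbols from object files; grep the output to check whether a symbol exists and whether it is undefined

readelf - displays information about ELF files, such as 32/64-bit class, target OS, and processor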
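The 32/64-bit class and the processor that readelf reports come from the first bytes of the ELF header, which can be read directly when readelf is not installed. The sketch below decodes only those two fields; the machine table is a small, incomplete subset, and the header in the demo is built by hand rather than read from a real file.

```python
import struct

ELF_CLASSES = {1: "32-bit", 2: "64-bit"}
# Small subset of e_machine values; anything else is reported numerically.
ELF_MACHINES = {3: "x86", 40: "ARM", 62: "x86-64", 183: "AArch64"}


def describe_elf_header(header):
    """Return (class, machine) from the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    elf_class = ELF_CLASSES.get(header[4], "unknown class")  # EI_CLASS
    byte_order = "<" if header[5] == 1 else ">"  # EI_DATA: 1 = little-endian
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)  # e_machine
    return elf_class, ELF_MACHINES.get(machine, "machine %d" % machine)


# Hand-built header: magic, 64-bit, little-endian, version 1, padding,
# then e_type = 2 (executable) and e_machine = 62 (x86-64).
demo = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8) + struct.pack("<HH", 2, 62)
print(describe_elf_header(demo))
```

For a real file, pass open(path, "rb").read(20) instead of the hand-built bytes.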

gdb

cat /proc/$pid/[cmdline|environ|limits|status|...] - process information
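Two of these files need a little decoding: cmdline separates arguments with NUL bytes, so cat prints them run together, and status is a list of "Key: value" lines. A minimal sketch follows; the function names are illustrative, and the demo reads the script's own entries under /proc, so it only runs on Linux.

```python
def parse_cmdline(raw):
    """Split the NUL-separated bytes of /proc/<pid>/cmdline into arguments."""
    if not raw:
        return []  # kernel threads have an empty cmdline
    if raw.endswith(b"\0"):
        raw = raw[:-1]  # drop the terminator after the last argument
    return [arg.decode("utf-8", "replace") for arg in raw.split(b"\0")]


def parse_status(text):
    """Parse /proc/<pid>/status text into a dict of key -> string value."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value.strip()
    return fields


if __name__ == "__main__":
    with open("/proc/self/cmdline", "rb") as handle:
        print(parse_cmdline(handle.read()))
    with open("/proc/self/status") as handle:
        print(parse_status(handle.read()).get("Threads"))
```

The Threads field of status is a quick way to spot a process that is leaking threads.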

pstack - print a stack trace of a running process

pmap - report memory map of a process

jdk tools and utilities

java troubleshooting tools

jinfo - print java process information, such as the classpath and java.library.path (the directories searched for JNI shared objects)

jstack - print a stack trace of a running java process; can reveal deadlocks

jmap - report memory map of a java process

jmap -histo:live # can trigger a full GC

jmap -dump:live,file=$file # dumps heap memory so that tools such as jhat can analyze how objects occupy the heap

jhat - heap dump browser - starts a web server on a heap dump file (eg, produced by jmap -dump), allowing the heap to be browsed.

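Starts an HTTP server; view the results in a web browser

-J-mxXXXm (e.g. -J-mx512m) # increase jhat's own heap size when analyzing a large dump file

If some object holds an unusually large amount of data or memory, a memory leak is very likely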
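The leak heuristic above can also be applied to the text histogram printed by jmap -histo before taking a full heap dump. The sketch below ranks classes by bytes; it assumes the usual "num: #instances #bytes class name" row layout, and the sample figures are invented for illustration.

```python
import re

ROW_RE = re.compile(r"^\s*\d+:\s+(\d+)\s+(\d+)\s+(\S.*)$")


def top_classes(histo_output, limit=3):
    """Return up to `limit` (class name, instances, bytes) tuples, largest first."""
    rows = []
    for line in histo_output.splitlines():
        match = ROW_RE.match(line)
        if match:  # header, separator, and Total lines do not match
            instances, size, name = match.groups()
            rows.append((name.strip(), int(instances), int(size)))
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows[:limit]


# Illustrative sample, not output from a real JVM.
SAMPLE = """ num     #instances         #bytes  class name
----------------------------------------------
   1:        250000       24000000  [C
   2:        240000        5760000  java.lang.String
   3:          1200        3600000  [B
Total        491200       33360000
"""

for name, instances, size in top_classes(SAMPLE, limit=2):
    print("%-20s %10d instances %12d bytes" % (name, instances, size))
```

Comparing two histograms taken some minutes apart is usually more telling than a single one, since a class whose count keeps growing is the likelier leak.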

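Memory Analyzer (MAT) - Eclipse plugin, Java heap analyzer

A visual tool, but it is limited by the machine's memory and cannot analyze very large heap dump files

jdb - Java debugger; the target JVM can be started as a debug server so that Eclipse and similar tools can connect and debug remotely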

jstat - java virtual machine statistics monitoring tool

jstatd - virtual machine jstat daemon; can be used together with jvisualvm

jvisualvm - java virtual machine monitoring, troubleshooting, and profiling tool; can connect remotely to jstatd/JMX and presents the data visually (demo)

jvmtop - in a top-like manner, displays jvm internal metrics (e.g. memory information) of running java processes.

jvm performance optimization - a series of tuning articles written by a JVM developer

overview

compilers

garbage collection

concurrently compacting gc

scalability

hprof - heap profiler: java -agentlib:hprof

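Write logs where possible; when the system is already in production or the source code is not available, use the tracing tools that follow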

strace - trace system calls and signals

Example: practical uses of strace/ltrace

Example: tracing the time spent in system calls, e.g. when a machine shows high CPU %sys
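For the %sys case, strace -T appends the time spent in each call as <seconds> at the end of the line, and summing those per syscall shows where the time goes. The sketch below does that on captured output; the sample lines are illustrative, and strace -c produces a similar summary directly when a live summary is enough.

```python
import re
from collections import defaultdict

# Optional "[pid N] " or "N " prefix, the syscall name, then "<seconds>" at the end.
LINE_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?(\w+)\(.*<(\d+\.\d+)>\s*$")


def time_per_syscall(strace_output):
    """Return [(syscall, total seconds)] sorted by total time, largest first."""
    totals = defaultdict(float)
    for line in strace_output.splitlines():
        match = LINE_RE.match(line)
        if match:
            totals[match.group(1)] += float(match.group(2))
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Illustrative sample of `strace -T` output, not captured from a real process.
SAMPLE = """open("/etc/hosts", O_RDONLY) = 3 <0.000020>
read(3, "127.0.0.1 localhost\\n", 4096) = 20 <0.000015>
futex(0x7f3a4c0, FUTEX_WAIT, 1, NULL) = 0 <1.250000>
read(3, "", 4096) = 0 <0.000010>
futex(0x7f3a4c0, FUTEX_WAIT, 1, NULL) = 0 <0.750000>
"""

for name, seconds in time_per_syscall(SAMPLE):
    print("%-8s %.6f s" % (name, seconds))
```

Lines for unfinished or resumed calls, which strace prints when several threads interleave, do not match the pattern and are skipped in this sketch.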

blktrace, generate traces of the i/o traffic on block devices

ltrace - a library call tracer

xtrace

gprof - a performance analysis tool, sampling and call-graph profiling

valgrind - an instrumentation framework for building dynamic analysis tools. automatically detect many memory management and threading bugs, and profile your programs in detail

systemtap - a simple command line interface and scripting language for writing instrumentation for a live running kernel plus user-space applications for complex tasks that may require live analysis, programmable on-line response, and whole-system symbolic access.

The Linux counterpart of DTrace (which Sun developed for Solaris)

Powerful: covers the kernel, user-space apps, multiple languages (Java, Perl, Python, Ruby), and built-in markers (PostgreSQL, MySQL)

can write and reuse simple scripts to deeply examine the activities of a live system

data can be extracted, filtered, and summarized quickly and safely, to enable diagnoses of complex performance or functional problems

A rich "tapset" script library

btrace - dynamic tracing tool for the java platform. userguide

Uses dynamic bytecode modification (HotSwap) to trace and replace code in a running Java program (how it works)

Summary of using btrace

Detailed introduction

byteman - simplifies tracing and testing of java programs. can modify a running application without needing to stop and restart it.

You define rules specifying the side effects you want to inject, whereas btrace scripts use a Java-like syntax

dapper, a large-scale distributed systems tracing infrastructure

x-trace, a network diagnostic tool designed to provide users and network operators with better visibility into increasingly complex internet applications.

htrace, a tracing framework intended for use with distributed systems written in java

add tracing to hdfs

update htrace for hbase

Some of this content quotes microblog posts by other people; if there is any problem, please get in touch promptly.