Linux Utilities Roundup: grep

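This series covers Linux tools I use often, all of them command-line tools. Most of them are used together with regular expressions, so it helps to learn regular expressions first; see my Regular Expression Study Notes (正则表达式学习笔记).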

About

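grep prints the lines that match a pattern (a regular expression or a fixed string) to standard output. It can filter piped input as well as search files or the files under a directory. Unless another syntax is selected (for example -P for Perl-compatible regular expressions), grep defaults to -G, basic regular expressions.
Usage (from the man page):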

grep [OPTIONS] PATTERN [FILE...]
grep [OPTIONS] [-e PATTERN | -f FILE] [FILE...]
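The two synopsis forms can be tried on a scratch file. This is only a sketch: the file names and sample words are made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample data; the file names are invented for this sketch.
dir=$(mktemp -d)
printf 'alpha\nbeta\ngamma\n' > "$dir/words.txt"
printf 'alpha\ngamma\n' > "$dir/patterns.txt"

# Form 1: a single PATTERN, read from a file or from standard input.
from_file=$(grep 'beta' "$dir/words.txt")
from_pipe=$(cat "$dir/words.txt" | grep 'beta')

# Form 2: several patterns, given with repeated -e or one per line via -f FILE.
with_e=$(grep -e 'alpha' -e 'gamma' "$dir/words.txt")
with_f=$(grep -f "$dir/patterns.txt" "$dir/words.txt")

printf '%s\n' "$from_file" "$from_pipe" "$with_e" "$with_f"
rm -rf "$dir"
```

Both variants of form 2 select the same lines here; -f is the handier one when the pattern list is long or generated by another tool.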

OPTIONS has many possible values; the more common ones are:

Pattern selection

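-E, --extended-regexp Interpret PATTERN as an extended regular expression (same as egrep)
-F, --fixed-strings Interpret PATTERN as a list of newline-separated fixed strings (same as fgrep)
-G, --basic-regexp Interpret PATTERN as a basic regular expression (the default)
-P, --perl-regexp Interpret PATTERN as a Perl-compatible regular expression
-e, --regexp=PATTERN Use PATTERN as the pattern; may be repeated, and protects a pattern that begins with "-". A lone pattern argument is treated as if it had been given with -e
-f, --file=FILE Read patterns from FILE, one per line
-i, --ignore-case Ignore case distinctions
-w, --word-regexp Match whole words only
-x, --line-regexp Match whole lines only
-z, --null-data Input lines end with a 0 byte instead of a newline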
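A small sketch of how the pattern-selection options change what matches. The sample lines are made up, and the counts in the comments hold for exactly this input.

```shell
#!/bin/sh
# Hypothetical sample input, invented for this sketch.
f=$(mktemp)
printf 'cat\nCat\nconcatenate\na.c\nabc\n' > "$f"

# -i: "cat", "Cat" and "concatenate" all match.
n_icase=$(grep -ci 'cat' "$f")
# -w: only the line where "cat" stands as a whole word.
n_word=$(grep -cw 'cat' "$f")
# -x: only the line that is exactly "cat".
n_line=$(grep -cx 'cat' "$f")
# Default (basic regex): "." matches any character, so "a.c" and "abc" both match.
n_regex=$(grep -c 'a.c' "$f")
# -F: the pattern is a fixed string, so only the literal "a.c" matches.
n_fixed=$(grep -cF 'a.c' "$f")
# -E: alternation with an unescaped "|".
n_ext=$(grep -cE 'cat|abc' "$f")

echo "$n_icase $n_word $n_line $n_regex $n_fixed $n_ext"
rm -f "$f"
```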

Miscellaneous:
-s, --no-messages Suppress error messages
-v, --invert-match Select the lines that do not match
-V, --version Print the version number
--help Print the help text
--mmap Use memory-mapped input if possible
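As a rough illustration of -v and -s, the sketch below strips comments and blank lines from a made-up config file, then searches a file that does not exist. It assumes GNU grep, which exits with status 2 on errors.

```shell
#!/bin/sh
# Hypothetical config file, invented for this sketch.
f=$(mktemp)
printf '# comment\nkey=value\n\n# another\nname=grep\n' > "$f"

# -v: keep only the lines that match neither pattern (comments, blank lines).
active=$(grep -v -e '^#' -e '^$' "$f")

# -s: no "No such file or directory" message is printed,
# but the exit status still reports the error.
grep -s 'key' "${f}.missing"
status=$?

printf '%s\n' "$active"
echo "exit status for the missing file: $status"
rm -f "$f"
```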

Output control

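-m, --max-count=NUM Stop reading a file after NUM matching lines
-b, --byte-offset Print the byte offset within the input before each line of output
-n, --line-number Prefix each output line with its line number
--line-buffered Flush the output after every line
-H, --with-filename Print the file name before each match (the default when searching several files)
-h, --no-filename Do not print the file name before each match when searching several files
--label=LABEL Print LABEL as the file name for standard input
-o, --only-matching Print only the matched text, not the whole line. When a pattern matches several times in one line, each match is printed on its own line, so piping the output to wc -l gives the total number of occurrences (-c counts matching lines, not occurrences)
-q, --quiet, --silent Print nothing; useful for checking grep's exit status (0 means a match was found)
--binary-files=TYPE Assume that binary files are TYPE; TYPE is 'binary', 'text', or 'without-match'
-a, --text Treat binary files as text (same as --binary-files=text)
-I Treat binary files as containing no match (same as --binary-files=without-match)
-d, --directories=ACTION How to handle directories: read, recurse, or skip
ACTION is 'read', 'recurse', or 'skip'
-D, --devices=ACTION How to handle devices, FIFOs, and sockets: read or skip
ACTION is 'read' or 'skip'
-R, -r, --recursive Search directories recursively
-R follows every symbolic link it meets, while -r descends only into real directories and skips symbolic links found during the recursion
--include=PATTERN Only files whose names match PATTERN are searched
--exclude=PATTERN Files whose names match PATTERN are skipped
--exclude-from=FILE Files whose names match any pattern listed in FILE are skipped
-L, --files-without-match When searching several files, print only the names of files with no match
-l, --files-with-matches When searching several files, print only the names of files that contain a match
-c, --count Print only the number of matching lines for each file
-Z, --null Print a 0 byte after each file name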
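A few of the output options are easier to tell apart side by side. This sketch uses two made-up files; it shows in particular that -c counts lines while -o piped to wc -l counts occurrences.

```shell
#!/bin/sh
# Hypothetical sample files, invented for this sketch.
dir=$(mktemp -d)
printf 'foo bar foo\nbaz\nfoo\n' > "$dir/a.txt"
printf 'nothing here\n' > "$dir/b.txt"

# -c counts matching lines: two lines of a.txt contain "foo".
lines=$(grep -c 'foo' "$dir/a.txt")
# -o prints every match on its own line, so wc -l counts all three occurrences.
occurrences=$(grep -o 'foo' "$dir/a.txt" | wc -l | tr -d ' ')

# -q prints nothing; only the exit status is used.
if grep -q 'baz' "$dir/a.txt"; then found=yes; else found=no; fi

# -l lists files with a match, -L lists files without one.
files_with=$(cd "$dir" && grep -l 'foo' a.txt b.txt)
files_without=$(cd "$dir" && grep -L 'foo' a.txt b.txt)

# -n adds the line number; -m 1 stops after the first matching line.
first=$(grep -n -m 1 'foo' "$dir/a.txt")

echo "lines=$lines occurrences=$occurrences found=$found"
echo "with=$files_with without=$files_without first=$first"
rm -rf "$dir"
```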

Context control:
-B, --before-context=NUM Print each match plus the NUM lines before it
-A, --after-context=NUM Print each match plus the NUM lines after it
-C, --context=NUM Print each match plus the NUM lines before and after it
-NUM Same as -C NUM
--color[=WHEN],
--colour[=WHEN] Use markers to distinguish the matching string
WHEN may be always, never, or auto.
-U, --binary Do not strip CR characters at EOL (MSDOS)
-u, --unix-byte-offsets Report offsets as if CRs were not there (MSDOS)
Worked examples

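1. Look up what the -A option does in grep's man page:

man grep | grep -A 3 '^\s*-A'

This renders the man page, pipes it to grep to find the line that begins with -A (after optional leading whitespace), and uses -A 3 to print that line plus the 3 lines after it. The result: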

-A NUM, --after-context=NUM
      Print NUM lines of trailing context after matching lines.  Places a line containing
      a group separator (--) between contiguous  groups  of  matches.   With  the  -o  or
      --only-matching option, this has no effect and a warning is given.

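2. Check whether the fixed-width integer types in the system headers are defined with typedef:

grep -ni 'typedef\s\+.*\s\+uint[[:digit:]]\+_t' /usr/include/*.h

-n prints line numbers and -i ignores case. In basic regular expression syntax the one-or-more operator must be written \+ (a GNU extension); an unescaped + matches a literal plus sign. \d cannot be used to match digits here, so use the POSIX character class [:digit:], which must itself sit inside a bracket expression, hence the double brackets in [[:digit:]]. The file argument can use shell wildcards.
The command above is equivalent to the following one, which uses -P to select Perl regular expression syntax:

grep -niP 'typedef\s+.*\s+uint\d+_t' /usr/include/*.h

The output is: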

/usr/include/stdint.h:48:typedef unsigned char      uint8_t;
/usr/include/stdint.h:49:typedef unsigned short int uint16_t;
/usr/include/stdint.h:51:typedef unsigned int       uint32_t;
/usr/include/stdint.h:55:typedef unsigned long int  uint64_t;
/usr/include/stdint.h:58:typedef unsigned long long int uint64_t;
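To check that the basic, extended, and Perl syntaxes select the same lines without depending on what is installed under /usr/include, the search can be run against a few made-up header lines. This sketch assumes GNU grep (for \s and \+) and only runs the -P variant when the local build supports it.

```shell
#!/bin/sh
# Hypothetical header excerpt, invented for this sketch.
f=$(mktemp)
cat > "$f" <<'EOF'
typedef unsigned char      uint8_t;
typedef unsigned short int uint16_t;
typedef signed char        int8_t;
#define UINT8_MAX 255
EOF

# Basic regex (default): the one-or-more operator is written \+.
bre=$(grep -n 'typedef\s\+.*\s\+uint[[:digit:]]\+_t' "$f")
# Extended regex (-E): + needs no backslash.
ere=$(grep -nE 'typedef\s+.*\s+uint[[:digit:]]+_t' "$f")
# Perl regex (-P): \d is available; skipped if this grep lacks -P.
if printf 'x\n' | grep -qP 'x' 2>/dev/null; then
  pcre=$(grep -nP 'typedef\s+.*\s+uint\d+_t' "$f")
else
  pcre='(grep -P not available)'
fi

printf '%s\n' "$bre"
rm -f "$f"
```

Only the two uint lines are selected; the int8_t typedef and the #define do not contain the required pieces.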

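3. Recursively search the header files for the whole word grep, printing file names and line numbers:

grep -rniw 'grep' /usr/include/

-r searches recursively and -w matches whole words only; -n adds line numbers and -i ignores case.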
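The same kind of recursive search can be tried on a throwaway directory tree, here combined with --include from the option list to restrict it to header files. The directory layout and file contents are made up for this sketch.

```shell
#!/bin/sh
# Hypothetical directory tree, invented for this sketch.
dir=$(mktemp -d)
mkdir -p "$dir/sub"
printf 'int grep;\n' > "$dir/a.h"
printf 'int egrep;\n' > "$dir/sub/b.h"
printf 'GREP in notes\n' > "$dir/sub/notes.txt"

# -r recurses, -n adds line numbers, -i ignores case, -w requires a whole word,
# so "egrep" in sub/b.h is not reported.
all=$(cd "$dir" && grep -rniw 'grep' . | sort)
# --include limits the search to files whose names match the glob.
headers=$(cd "$dir" && grep -rniw --include='*.h' 'grep' .)

printf '%s\n--\n%s\n' "$all" "$headers"
rm -rf "$dir"
```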

For more, see the man page.
GNU grep manual: https://www.gnu.org/software/grep/manual/grep.html