A regular expression (RE) is an arrangement of special characters used to search for, replace, or delete one or more lines of text, which makes finding or replacing a particular string straightforward. Put simply, it is a way of processing strings, and it works in any tool that supports the notation. Many tools do, including vi, sed, and awk.

Regular expressions are not the same as wildcards: wildcards are a feature of the bash command-line interface.

When using regular expressions, pay attention to the current locale. Practice exercises generally follow the POSIX-compatible standard, so they use the C locale, and many of the exercises below were run with LANG=C. In other locales a range such as [a-z] may follow a different collation order and match unexpected characters. You also need to know some special symbols, such as the [:lower:] character class used later in this article.

Some advanced grep options

-A : followed by a number (after); besides the matching line, the following n lines are also printed
-B : followed by a number (before); besides the matching line, the preceding n lines are also printed

Basic regular expression practice

First, set the locale to C:

    [super@localhost /]# echo $LANG
    zh_CN.UTF-8
    [super@localhost /]# export LANG=C; export LC_ALL=C
    [super@localhost /]# echo $LANG
    C

Practice file:

    [super@localhost /]# cat regular_express.txt
    "Open Source" is a good mechanism to develop programs.
    apple is my favorite food.
    Football game is not use feet only.
    this dress doesn't fit me.
    However, this dress is about $ 3183 dollars.
    GNU is free air not free beer.
    Her hair is very beauty.
    I can't finish the test.
    Oh! The soup taste good.
    motorcycle is cheap than car.
    This window is clear.
    the symbol '*' is represented as start.
    Oh! My god!
    The gd software is a library for drafting programs.
    You are the best is mean you are the no. 1.
    The world <Happy> is the same with "glad".
    I like dog.
    google is the best tools for search keyword.
    goooooogle yes!
    go! go! Let's go.
    # I am VBird

Searching for a specific string

Find 'the'. -n : show line numbers

    [super@localhost /]# grep -n 'the' regular_express.txt
    8:I can't finish the test.
    12:the symbol '*' is represented as start.
    15:You are the best is mean you are the no. 1.
    16:The world <Happy> is the same with "glad".
    18:google is the best tools for search keyword.

Find 'the' regardless of case. -i : ignore the difference between upper and lower case

    [super@localhost /]# grep -in 'the' regular_express.txt
    8:I can't finish the test.
    9:Oh! The soup taste good.
    12:the symbol '*' is represented as start.
    14:The gd software is a library for drafting programs.
    15:You are the best is mean you are the no. 1.
    16:The world <Happy> is the same with "glad".
    18:google is the best tools for search keyword.
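The -n, -i, -A, and -B options described above can be tried on a small excerpt of the practice file. The following is only a sketch: the temporary directory and the file name sample.txt are assumptions made for illustration, and -A/-B are extensions found in GNU and BSD grep rather than options required by POSIX.

```shell
#!/bin/sh
# Work in a throwaway directory so nothing in the real tree is touched.
workdir=$(mktemp -d)
sample="$workdir/sample.txt"   # hypothetical file name, not from the article

# A four-line excerpt of the practice text.
cat > "$sample" <<'EOF'
I can't finish the test.
Oh! The soup taste good.
motorcycle is cheap than car.
This window is clear.
EOF

# -n prefixes each match with its line number; the search is case-sensitive,
# so only the lowercase "the" on line 1 matches.
plain=$(LC_ALL=C grep -n 'the' "$sample")

# -i ignores case, so "The" on line 2 matches as well.
nocase=$(LC_ALL=C grep -in 'the' "$sample")

# -A1 also prints one line after each match; -B1 prints one line before it.
after=$(LC_ALL=C grep -n -A1 'soup' "$sample")
before=$(LC_ALL=C grep -n -B1 'soup' "$sample")

printf '%s\n' "--- plain" "$plain" "--- nocase" "$nocase"
printf '%s\n' "--- after" "$after" "--- before" "$before"

rm -rf "$workdir"
```

Setting LC_ALL=C on each command keeps the example independent of the caller's locale, in the same spirit as the export shown earlier.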
Searching for character sets with brackets []

Find 'test' and 'taste'. No matter how many characters are inside [], the set stands for exactly one character.

    [super@localhost /]# grep -n 't[ae]st' regular_express.txt
    8:I can't finish the test.
    9:Oh! The soup taste good.

Find 'oo':

    [super@localhost /]# grep -n 'oo' regular_express.txt
    1:"Open Source" is a good mechanism to develop programs.
    2:apple is my favorite food.
    3:Football game is not use feet only.
    9:Oh! The soup taste good.
    18:google is the best tools for search keyword.
    19:goooooogle yes!

Find 'oo' that is not preceded by 'g':

    [super@localhost /]# grep -n '[^g]oo' regular_express.txt
    2:apple is my favorite food.
    3:Football game is not use feet only.
    18:google is the best tools for search keyword.
    19:goooooogle yes!

Line 19 still matches because the character before 'oo' can itself be an 'o', which satisfies [^g]. Line 18 matches because of 'tools'.

Require that 'oo' is not preceded by a lowercase letter:

    [super@localhost /]# grep -n '[^a-z]oo' regular_express.txt
    3:Football game is not use feet only.

Print the lines that contain a digit:

    [super@localhost leiothrix_lutea]# grep -n '[0-9]' regular_express.txt
    5:However, this dress is about $ 3183 dollars.
    15:You are the best is mean you are the no. 1.

Use a special character class to require that 'oo' is not preceded by a lowercase letter:

    [super@localhost leiothrix_lutea]# grep -n '[^[:lower:]]oo' regular_express.txt
    3:Football game is not use feet only.

Line-start and line-end characters: '^' and '$'

List the lines that start with 'the':

    [super@localhost test]# grep -n '^the' regular_express.txt
    12:the symbol '*' is represented as start.

List the lines that start with a lowercase letter:

    [super@localhost test]# grep -n '^[a-z]' regular_express.txt
    2:apple is my favorite food.
    4:this dress doesn't fit me.
    10:motorcycle is cheap than car.
    12:the symbol '*' is represented as start.
    18:google is the best tools for search keyword.
    19:goooooogle yes!
    20:go! go! Let's go.

The same search can be written with a character class instead:

    [super@localhost test]# grep -n '^[[:lower:]]' regular_express.txt

Lines that do not start with an English letter. Note that ^ inside the brackets negates the set, while ^ outside the brackets anchors the match to the start of the line.

    [super@localhost test]# grep -n '^[^a-zA-Z]' regular_express.txt
    1:"Open Source" is a good mechanism to develop programs.
    21:# I am VBird

Lines that end with a period '.'. Because the period has a special meaning in a regular expression, it must be preceded by the escape character (\) to remove that meaning.

    [super@localhost test]# grep -n '\.$' regular_express.txt
    1:"Open Source" is a good mechanism to develop programs.
    2:apple is my favorite food.
    3:Football game is not use feet only.
    4:this dress doesn't fit me.
    5:However, this dress is about $ 3183 dollars.
    6:GNU is free air not free beer.
    ......

Find blank lines:

    [super@localhost test]# grep -n '^$' regular_express.txt
    22:

Hide blank lines and comment lines that start with '#':

    [super@localhost test]# grep -v '^$' /etc/rsyslog.conf | grep -v '^#'

Any character '.' and the repetition character '*'

In a regular expression, '.' means exactly one arbitrary character, and '*' means the preceding character repeated zero or more times.

Find strings of the form g??d:

    [super@localhost test]# grep -n 'g..d' regular_express.txt
    1:"Open Source" is a good mechanism to develop programs.
    9:Oh! The soup taste good.
    16:The world <Happy> is the same with "glad".

'o*' means zero or more 'o' characters, so it also matches the empty string, and searching for 'o*' prints every line of the file. To find strings with at least two consecutive 'o' characters, use 'ooo*':

    [super@localhost test]# grep -n 'ooo*' regular_express.txt
    1:"Open Source" is a good mechanism to develop programs.
    2:apple is my favorite food.
    3:Football game is not use feet only.
    9:Oh! The soup taste good.
    18:google is the best tools for search keyword.
    19:goooooogle yes!

Find strings with at least one 'o' between two 'g' characters:

    [super@localhost test]# grep -n 'goo*g' regular_express.txt
    18:google is the best tools for search keyword.
    19:goooooogle yes!

Find lines that contain a string beginning with 'g' and ending with 'g', with anything in between. '.*' means zero or more arbitrary characters.

    [super@localhost test]# grep -n 'g.*g' regular_express.txt
    1:"Open Source" is a good mechanism to develop programs.
    14:The gd software is a library for drafting programs.
    18:google is the best tools for search keyword.
    19:goooooogle yes!
    20:go! go! Let's go.

Limiting the repeat count with {}

In the basic regular expression syntax that grep uses by default, an interval is written with escaped braces, \{ and \}; unescaped braces are treated as ordinary characters. Braces are also special to the shell, so keep the pattern inside single quotes. \{n,m\} means n to m consecutive occurrences of the preceding RE character.

Find strings with two consecutive 'o' characters:

    [super@localhost test]# grep -n 'o\{2\}' regular_express.txt
    1:"Open Source" is a good mechanism to develop programs.
    2:apple is my favorite food.
    3:Football game is not use feet only.
    9:Oh! The soup taste good.
    18:google is the best tools for search keyword.
    19:goooooogle yes!

Find strings with 2 to 3 'o' characters between 'g' and 'g':

    [super@localhost test]# grep -n 'go\{2,3\}g' regular_express.txt
    18:google is the best tools for search keyword.
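The effect of the upper bound in an interval is easier to see side by side. The sketch below compares a bounded interval, an open-ended one (\{2,\}, which is not shown in the article), and the same interval in extended syntax via -E. The temporary directory and the file name sample.txt are assumptions for illustration only.

```shell
#!/bin/sh
# Work in a throwaway directory so nothing in the real tree is touched.
workdir=$(mktemp -d)
sample="$workdir/sample.txt"   # hypothetical file name, not from the article

# Four short lines modeled on the practice file.
cat > "$sample" <<'EOF'
gd software
go! go!
google is the best
goooooogle yes!
EOF

# 2 to 3 o's between the two g's: only "google" qualifies, because
# "goooooogle" has six o's before the second g.
two_to_three=$(LC_ALL=C grep -n 'go\{2,3\}g' "$sample")

# Leaving out the upper bound means "2 or more", so "goooooogle" matches too.
two_or_more=$(LC_ALL=C grep -n 'go\{2,\}g' "$sample")

# With -E (extended regular expressions) the braces need no backslashes.
extended=$(LC_ALL=C grep -En 'go{2,3}g' "$sample")

printf '%s\n' "--- 2 to 3" "$two_to_three"
printf '%s\n' "--- 2 or more" "$two_or_more"
printf '%s\n' "--- extended" "$extended"

rm -rf "$workdir"
```

The single quotes around each pattern matter: they stop the shell from interpreting the backslashes and braces before grep sees them.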
The special characters of regular expressions are not the same as the wildcards typed on an ordinary command line. For example, to find every file whose name starts with 'a':

    [super@localhost test]# ls -l a* /etc    # not that simple: the shell expands a* against the current directory, not /etc
    [super@localhost test]# ls /etc | grep '^a.*'

In the wildcard, * by itself stands for any run of characters. In the regular expression, * only repeats the preceding character, so "any run of characters" has to be written as '.*'.

Reference

鸟哥的Linux私房菜

Source: WeChat official account Jar的荒野秘境. In case of infringement, please contact qter@qter.org.