An Overview of the Grep Command - Computer Science


What is grep?
 % man grep
 DESCRIPTION
The grep utility searches text files for a pattern and prints all lines that contain that pattern. It uses a compact non-deterministic algorithm.
 Can search either a specified file or standard input.
 Be careful: certain characters such as $, *, [, ^, |, (, ), and \ are also meaningful to the shell.
grep options pattern input_file_names
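The two input modes above can be sketched as follows. This is a minimal illustration, assuming a POSIX shell; the sample file and its contents are made up for the example.

```shell
# Hypothetical sample file, created only for this illustration.
sample=$(mktemp)
printf 'apple pie\nbanana split\ncherry tart\n' > "$sample"

# Search a named file. The pattern is single-quoted so the shell leaves it alone.
from_file=$(grep 'an' "$sample")

# Search standard input: with no file name, grep reads what the pipe sends it.
from_stdin=$(printf 'apple pie\nbanana split\n' | grep 'an')

echo "$from_file"
echo "$from_stdin"
rm -f "$sample"
```

Both commands print the same line, because only "banana split" contains "an".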
 Other variants of grep
 fgrep – "fixed grep" (also known as "fast grep")
– does not support regular expressions (uses fixed strings).
 egrep – supports the extended set +, ?, |, and ( )
– supports some metacharacters, but not others such as \(, \), \n, \<, \>
A metacharacter is a character that has a special meaning to a computer program, such as a shell interpreter or a regular expression engine. It is therefore safest to enclose the pattern list in single quotes ('...') to avoid confusion.
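A short sketch of how the variants differ, using grep -F and grep -E (the option spellings that behave like fgrep and egrep). The input lines are invented for the example.

```shell
# Hypothetical input: one line with a literal dot, one without.
lines=$(printf 'a.b\naxb\n')

# grep -F treats the pattern as a fixed string, so the dot matches only a dot.
fixed=$(printf '%s\n' "$lines" | grep -F 'a.b')

# Plain grep treats the dot as "any single character", so both lines match.
basic=$(printf '%s\n' "$lines" | grep -c 'a.b')

# grep -E enables the extended operators +, ?, | and ( ).
extended=$(printf 'cat\ndog\nbird\n' | grep -E 'cat|dog')

echo "$fixed"
echo "$basic"
echo "$extended"
```

Single quotes keep the shell from interpreting | as a pipe in the last pattern.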
Useful grep options
 -i or --ignore-case
 -v or --invert-match //Selects non-matching lines
 -w //Selects only those lines containing matches that form whole words.
 -c //Prints only a count of the lines that contain the pattern.
 -n //Precedes each line by its line number in the file (first line is 1).
Many more advanced options are available in UNIX Resource #3 on the course website or by checking man grep.
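The options listed above can be tried on a small file. The sketch below assumes a POSIX shell; the file contents are hypothetical and chosen so that each option gives a different result.

```shell
# Hypothetical sample file; its contents are invented for this sketch.
f=$(mktemp)
printf 'Main street\nremain calm\nthe main idea\nno match here\n' > "$f"

# -i with -c: count lines containing "main" in any case ("Main", "remain", "main").
any_case=$(grep -i -c 'main' "$f")

# -v: keep only the lines that do NOT match.
inverted=$(grep -i -v 'main' "$f")

# -w: "main" must be a whole word, so "remain" no longer counts.
whole_words=$(grep -i -w -c 'main' "$f")

# -n: prefix each matching line with its line number.
numbered=$(grep -n 'idea' "$f")

echo "$any_case"
echo "$inverted"
echo "$whole_words"
echo "$numbered"
rm -f "$f"
```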
Examples
The file users contains 18 lines. Shown are examples of the results from
combining multiple grep options.
Examples (Cont’d)
 % grep -i -w main *
 search files in the current working directory for the word
main, any case
 % who | grep kc498
 send the output of who to grep and look for matches of
kc498
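Both commands can be tried safely in a scratch directory. In this sketch the directory, the file names, and the user list are all made up; printf stands in for who so the result does not depend on who is logged in.

```shell
# Hypothetical scratch directory with three small files.
dir=$(mktemp -d)
printf 'int main(void)\n' > "$dir/a.c"
printf 'The MAIN entry\n' > "$dir/b.txt"
printf 'remainder\n' > "$dir/c.txt"

# Same command as above, run inside the scratch directory.
# With several input files, grep prefixes each match with its file name.
hits=$(cd "$dir" && grep -i -w main *)

# Piping: grep filters whatever the previous command prints.
piped=$(printf 'kc498 pts/0\nab123 pts/1\n' | grep kc498)

echo "$hits"
echo "$piped"
rm -rf "$dir"
```

The file c.txt is not reported, because -w rejects "main" inside "remainder".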
Searching for text patterns with regular expressions
 A regular expression describes a pattern in a set of strings.
 As in the previous examples, letters and numbers are regular expressions that match themselves.
 Use the escape character \ before any metacharacter to match it literally. (\ also has other uses when combined with different characters.)
 Regular expressions can be concatenated.
 Use parentheses to override precedence and group.
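Escaping, concatenation, and grouping can be sketched as below. The input lines are hypothetical, and grouping is shown with grep -E because ( ) belongs to the extended set.

```shell
# Hypothetical input lines.
input=$(printf '3.14\n3x14\nabab\nabb\n')

# Escaping: \. matches only a literal dot, so "3x14" is rejected.
escaped=$(printf '%s\n' "$input" | grep '3\.14')

# Concatenation: "3", then any single character, then "14", in sequence.
concatenated=$(printf '%s\n' "$input" | grep -c '3.14')

# Grouping: + applies to the whole group "ab", so only a line made of
# repeated "ab" matches. Without the parentheses, + would apply to "b" alone.
grouped=$(printf '%s\n' "$input" | grep -E '^(ab)+$')

echo "$escaped"
echo "$concatenated"
echo "$grouped"
```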
Repetition and Placement Operators
Each entry gives the operator, what it matches, and an example.
 .  Any single character except newline.
Example: .an matches "ran" and "can" in the phrase "bran muffins can be tasty"
 ^  Beginning of line.
Example: ^A matches any line starting with "A"
 $  End of input or line.
Examples: h$ matches "h" in "teach" but not in "teacher"; grep '^$' searches for blank lines
 *  The preceding character 0 or more times.
Example: um* matches "um" in "bum", "umm" in "yummy", and "u" in "huge"
 +  The preceding character 1 or more times.
Example: um+ matches "um" in "rum" and "umm" in "yummy" but nothing in "huge"
 ?  The preceding item is optional and will be matched at most once.
Example: st?on matches "son" in "Johnson" and "ston" in "Johnston" but nothing in "Appleton" or "tension"
 x|y  Either x or y.
Example: FF0000|0000FF matches "FF0000" in bgcolor="#FF0000" and "0000FF" in font color="#0000FF"
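Several of the examples above can be checked directly. This sketch assumes a grep that supports -o (print only the matched text), an extension found in GNU and BSD grep; the operators +, ? and | are used with grep -E.

```shell
# . matches any single character: prints "ran" and "can" on separate lines.
dot=$(echo 'bran muffins can be tasty' | grep -o '.an')

# $ anchors the match to the end of the line.
line_end=$(printf 'teach\nteacher\n' | grep 'h$')

# ^$ matches blank lines; -c counts them.
blank=$(printf 'a\n\nb\n\n' | grep -c '^$')

# * allows zero repetitions, so a lone "u" still matches.
star=$(echo 'huge' | grep -o 'um*')

# + needs at least one "m".
plus=$(echo 'yummy' | grep -oE 'um+')

# ? makes the "t" optional.
optional=$(printf 'Johnson\nJohnston\nAppleton\n' | grep -oE 'st?on')

# | picks either alternative.
either=$(echo 'bgcolor="#FF0000"' | grep -oE 'FF0000|0000FF')

echo "$dot"; echo "$line_end"; echo "$blank"; echo "$star"
echo "$plus"; echo "$optional"; echo "$either"
```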
Searching for ranges and multiple matches
 {n}  Match preceding item exactly n times
 {n,}  Match preceding item at least n times
 {n,m}  Match preceding item between n and m times.
 {,m}  Match preceding item at most m times
 [ ]  Match any one of the enclosed characters, as in [aeiou]. Use a hyphen "-" for a range, as in [0-9].
 [^ ]  Match any one character except those enclosed in [ ], as in [^0-9]
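A brief sketch of the range and bracket operators, using grep -E so that { } can be written without backslashes. The input lines are invented for the example, and {,m} is left out because not every grep accepts it.

```shell
# Hypothetical input lines.
input=$(printf '555-1234\n55-1234\n5555-12345\nhello\n')

# {n}: exactly three digits, a hyphen, exactly four digits, on the whole line.
exact=$(printf '%s\n' "$input" | grep -E '^[0-9]{3}-[0-9]{4}$')

# {n,m}: between two and four digits before the hyphen.
ranged=$(printf '%s\n' "$input" | grep -cE '^[0-9]{2,4}-')

# {n,}: at least five digits in a row somewhere on the line.
at_least=$(printf '%s\n' "$input" | grep -E '[0-9]{5,}')

# [^ ]: lines made up entirely of non-digit characters.
no_digits=$(printf '%s\n' "$input" | grep -E '^[^0-9]*$')

echo "$exact"
echo "$ranged"
echo "$at_least"
echo "$no_digits"
```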
More special uses of \
 \b  Matches a word boundary. Ex: \bdog\b will match occurrences of the word "dog" (similar to -w).
 \s  Matches any single whitespace character (space, tab, etc.)
 \S  Matches any single non-whitespace character
 \d  Matches any digit, the same as [0-9]
 \D  Matches any non-digit, the same as [^0-9]
 \w  Matches any alphanumeric character, including underscore.
 \W  Matches any non-alphanumeric character
 \r  carriage return
 \n  line feed
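Not every grep accepts all of these backslash shorthands (POSIX does not define \d for grep, for example), so a cautious sketch can use the equivalents stated above: [0-9] for \d, [^0-9] for \D, and -w in place of \b. The input lines are hypothetical.

```shell
# Hypothetical input lines.
input=$(printf 'room 101\nlobby\n42\n')

# [0-9] stands in for \d: count lines that contain a digit.
with_digit=$(printf '%s\n' "$input" | grep -c '[0-9]')

# [^0-9] stands in for \D: count lines that contain a non-digit.
with_non_digit=$(printf '%s\n' "$input" | grep -c '[^0-9]')

# Lines consisting only of digits.
only_digits=$(printf '%s\n' "$input" | grep '^[0-9][0-9]*$')

# -w stands in for \bdog\b: "hotdog" and "dogma" are rejected.
word=$(printf 'dog\nhotdog\ndogma\n' | grep -w 'dog')

echo "$with_digit"
echo "$with_non_digit"
echo "$only_digits"
echo "$word"
```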
Regular Expression Tester
 http://www.cs.drexel.edu/~introcs/Fa09/extras/RegExp/RegExp_tester.html