TGREP(1)                    General Commands Manual                   TGREP(1)

NAME
       tgrep - print text that matches patterns of lexical tokens

SYNOPSIS
       tgrep [OPTION...]*  -e PATTERN ...  [FILE...]

DESCRIPTION
       Tgrep  searches for patterns in a set of one or more files, matching on
       lexical tokens rather than characters (as in grep) or words (as in grep
       -w). Patterns are insensitive to white-space, so matches can span  mul-
       tiple lines. If searching source code, comments are removed before pat-
       terns are matched.

       All  tokens  in  the pattern must be separated from each other by white
       space.

       Traditional regular-expression meta-characters that also occur as
       common symbols and operators in C-like languages (such as (, ), |,
       and +) are interpreted as regular characters unless preceded by a
       backslash, as in \+.  The Kleene star * is interpreted as a regular
       character if it is preceded in the pattern by white space. If it is
       used as a suffix of another token, as in token*, it is interpreted
       as a meta-character that specifies zero or more repetitions of the
       preceding token.
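
       As a hedged illustration of the two readings of *, the sketch below
       writes a small sample file and runs two patterns over it. The scratch
       directory and the file name ptr.c are assumptions made for this
       example, and tgrep is only invoked if it is found on the PATH.

```shell
# Scratch directory and file name are assumptions made for this sketch.
demo_dir=$(mktemp -d)
cat > "$demo_dir/ptr.c" <<'EOF'
int   value(void);
char *name(void);
EOF

if command -v tgrep >/dev/null 2>&1; then
    # A '*' preceded by white space is a literal token: one star is required.
    tgrep -e '@type * @ident (' "$demo_dir/ptr.c" || true
    # '**' is a literal star followed by the repetition suffix: zero or more stars.
    tgrep -e '@type ** @ident (' "$demo_dir/ptr.c" || true
else
    echo "tgrep not found; sample file left in $demo_dir" >&2
fi
```

       Under the rules above, the first pattern is expected to match only
       the declaration of name, and the second pattern both declarations.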

       The patterns can use variable binding and constraints (see examples be-
       low).

OPTIONS
       -c     Suppress normal output and only print a count of the  number  of
              times  that the pattern was matched. Note that there can be mul-
              tiple matches on a single line of input.

       -d     Display long matches in terse format, showing only the first and
              last two lines of each match.

       -e pattern
              Define the pattern expression to be matched. Tokens must be sep-
              arated by spaces.  A token category can be specified with  an  @
              prefix. The list of token types is:

                   @chr       - quoted character
                   @const     - short for one of:
                        @const_flt
                        @const_hex
                        @const_int
                        @const_oct
                   @cpp       - preprocessor directive
                   @ident     - identifier
                   @key       - keyword
                   @modifier  - long, short, signed, unsigned
                   @oper      - operator
                   @qualifier - const, volatile
                   @storage   - static, extern, register, auto
                   @str       - string
                   @type      - int, char, float, double, void

              The  character  pair  .* can be used to specify a don't-care se-
              quence of tokens. Variable binding can be used (definition,  for
              instance:  x:@ident,  with  references  later  in the expression
              written as :x).  Location constraints can also be added (see the
              pe manual page of the Cobra tool).
              There can be multiple -e options.

       -i     Case insensitive search. Text fields in the input are mapped  to
              lower-case.   The letters in the pattern itself are not changed,
              so use only lower-case in  the  pattern  itself  to  match  text
              fields.

       -j     Display matches in JSON format.

       -l     List only the names of files that contain matches of the pattern
              (see also -c).

       -Nn    Use  n  cores  (n>=1)  to read the input files.  By default n is
              equal to the number of files, with a maximum of  8.   (also  ac-
              cepted: -N n)

       -n     Display  filenames  and line number ranges for each matched pat-
              tern (not compatible with -j and -t).

       -r 'pattern'
              Recursively search the current directory and its sub-directories
              for filenames matching regular expression 'pattern' (e.g.
              '*.[ch]'), and use those files as input. Use single quotes around
              the pattern to prevent the shell from expanding it.

       -t     Display  the  matches, highlighting which tokens were matched on
              each line.

       -x     Search comments instead of code.

       languages:
              -Ada -C++ -html -Java -Python -text
              which can be abbreviated to: -A -C++ -H -J -P -T
              The default input language assumed is ISO standard C.
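
       The output options above can be compared on a single pattern. The
       following is a minimal sketch, assuming a scratch directory with two
       hypothetical files a.c and b.c. It uses only the -c, -l, and -n
       options described above, and makes no claim about the exact layout of
       the output.

```shell
# Scratch directory and file names are assumptions made for this sketch.
demo_dir=$(mktemp -d)
cat > "$demo_dir/a.c" <<'EOF'
#include "local.h"
#include "util.h"
#include <stdio.h>
EOF
cat > "$demo_dir/b.c" <<'EOF'
#include <stdlib.h>
EOF

if command -v tgrep >/dev/null 2>&1; then
    # -c: suppress normal output and print only a count of matches.
    tgrep -c -e '@cpp @str' "$demo_dir"/*.c || true
    # -l: list only the names of files that contain a match.
    tgrep -l -e '@cpp @str' "$demo_dir"/*.c || true
    # -n: show filenames and line number ranges for each match.
    tgrep -n -e '@cpp @str' "$demo_dir"/*.c || true
else
    echo "tgrep not found; sample files left in $demo_dir" >&2
fi
```

       Only a.c contains directives followed by a string, so it is the only
       file expected to be reported.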


NOTES
       Tgrep is a small shell-script that runs the Cobra tool to  perform  its
       function, in a style that is comparable to typical uses of the standard
       grep tool.

       Tgrep assumes Cobra Version 5.3 (January 2026) or later.

       A  full  description  of the notation for defining pattern expressions,
       including the optional use of bound variables and constraints,  can  be
       found in the online Cobra documentation. See: pe.html


EXAMPLES
       A  simple  way  to  list include directives that import locally defined
       files, in a set of source files, can be specified with  token  category
       specifications, as follows.

            $ tgrep -e '@cpp @str' *.[chyl]

       The notation @cpp matches a preprocessor directive, and the notation
       @str matches any string.  If instead we want to list system include
       files, the pattern becomes a little longer, as follows.

            $ tgrep -e '@cpp < @ident . h >' *.[chyl]


       If  we  want  all  system  include files, but not those matching either
       stdio.h or stdlib.h, we could write:

            $ tgrep -e '@cpp < ^[stdio stdlib] . h >' *.[chyl]


       To illustrate the use of variable binding and don't-care sequences
       using the Kleene star, the following pattern matches a sequence of
       code where the return value of fopen is used in a subsequent fprintf
       statement without first checking the return value for errors.

            $ tgrep -e 'x:@ident = fopen ( .* ) ^:x* fprintf ( :x' *.c

       Note the use of spaces to separate individual tokens and symbols.
       The first token matches any identifier, with the name bound to variable
       x, using the prefix notation x:.

       The  next five token specifiers match token texts exactly, with .* used
       as a short-hand for a don't care sequence of zero or more tokens.

       After the closing round brace of the call to fopen (the braces are
       guaranteed to match at the right level of nesting), the pattern
       requires the absence of uses of the bound variable, using the negation
       prefix ^ and the bound variable reference :x, followed by the suffix *
       to indicate a repetition of zero or more.

       The next token must match fprintf, followed by an open round brace and
       then a repeat of the bound variable x, again referred to as :x.
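
       To try this pattern out, the sketch below creates two small sample
       files. The scratch directory and the names unchecked.c and checked.c
       are assumptions made for this example. The first file uses the result
       of fopen directly. The second tests it first, so the bound variable
       appears between the two calls.

```shell
# Scratch directory and file names are assumptions made for this sketch.
demo_dir=$(mktemp -d)

# unchecked.c passes the fopen result to fprintf without testing it.
cat > "$demo_dir/unchecked.c" <<'EOF'
#include <stdio.h>
int main(void)
{
    FILE *fd;
    fd = fopen("log.txt", "w");
    fprintf(fd, "hello\n");
    fclose(fd);
    return 0;
}
EOF

# checked.c refers to fd between fopen and fprintf.
cat > "$demo_dir/checked.c" <<'EOF'
#include <stdio.h>
int main(void)
{
    FILE *fd;
    fd = fopen("log.txt", "w");
    if (fd == NULL) { return 1; }
    fprintf(fd, "hello\n");
    fclose(fd);
    return 0;
}
EOF

if command -v tgrep >/dev/null 2>&1; then
    tgrep -n -e 'x:@ident = fopen ( .* ) ^:x* fprintf ( :x' "$demo_dir"/*.c || true
else
    echo "tgrep not found; sample files left in $demo_dir" >&2
fi
```

       Going by the description of the pattern, a match is expected in
       unchecked.c only, because in checked.c the sequence ^:x* cannot
       extend across the test of fd.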

       The next example shows a pattern that looks for a C function prototype
       declaration that is immediately followed by the function definition (a
       form of redundancy).  It uses two bound variables, named x and y here.

            $ tgrep -d -e 'x:@type y:@ident ( .* ) \; :x :y ( .* ) {' *.c

       Because of the -d option, results are displayed in the terse form.
       We have to use a backslash escape to protect the semi-colon from
       being interpreted as a command separator.  To also account for
       return types that are pointers, we can extend the pattern with **,
       a literal * followed by the repetition suffix *, which matches zero
       or more * symbols.

            $ tgrep -e 'x:@type ** y:@ident ( .* ) \; :x ** :y ( .* ) {' *.c
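
       The difference between the two prototype patterns can be checked on a
       small sample. This is a sketch only: the scratch directory and the
       file name proto.c are assumptions, and the functions add and label
       are made up for the example.

```shell
# Scratch directory and file name are assumptions made for this sketch.
demo_dir=$(mktemp -d)
cat > "$demo_dir/proto.c" <<'EOF'
int add(int a, int b);
int add(int a, int b)
{
    return a + b;
}

char *label(int id);
char *label(int id)
{
    return id ? "on" : "off";
}
EOF

if command -v tgrep >/dev/null 2>&1; then
    # Without '**' a pointer return type is not covered.
    tgrep -d -e 'x:@type y:@ident ( .* ) \; :x :y ( .* ) {' "$demo_dir/proto.c" || true
    # With '**' zero or more '*' symbols may follow the type.
    tgrep -d -e 'x:@type ** y:@ident ( .* ) \; :x ** :y ( .* ) {' "$demo_dir/proto.c" || true
else
    echo "tgrep not found; sample file left in $demo_dir" >&2
fi
```

       The first pattern is expected to report only add, and the extended
       pattern both add and label.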


       The following example shows the use of a positional constraint:

            $ tgrep -e ' x:\; .* :x <1> @1 (:x.lnr == .lnr)' *.c

       This pattern matches two semi-colons with arbitrary text in between
       them. The first semi-colon is bound to variable x, and the positional
       constraint placed at the repeat of the semi-colon requires that the
       definition and the reference appear on the same line. This can be
       used to find multiple statements that appear on the same line of
       source text, which violates some coding standards.  The pattern as
       given will also match the control portion of for statements, but it
       can be refined to exclude those matches by extending the constraint
       to remove matches inside round brace pairs:

            $ tgrep -e ' x:\; .* :x <1> @1 (:x.lnr==.lnr && .round==0)' *.c
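
       A small sample makes the effect of the extra .round==0 condition
       easier to see. The sketch below assumes a scratch directory and a
       hypothetical file lines.c with one line holding two statements and
       one for statement whose body is on the following line.

```shell
# Scratch directory and file name are assumptions made for this sketch.
demo_dir=$(mktemp -d)
cat > "$demo_dir/lines.c" <<'EOF'
void f(void)
{
    int a, b, i;
    a = 1; b = 2;
    for (i = 0; i < 3; i++)
        a += i;
}
EOF

if command -v tgrep >/dev/null 2>&1; then
    # Same-line constraint only: the for control part is also reported.
    tgrep -e ' x:\; .* :x <1> @1 (:x.lnr == .lnr)' "$demo_dir/lines.c" || true
    # Refined constraint: matches inside round brace pairs are excluded.
    tgrep -e ' x:\; .* :x <1> @1 (:x.lnr==.lnr && .round==0)' "$demo_dir/lines.c" || true
else
    echo "tgrep not found; sample file left in $demo_dir" >&2
fi
```

       Following the description above, the first pattern is expected to
       report both the line with two assignments and the for statement, and
       the refined pattern only the line with two assignments.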

       Search for comments in Java code that match the regular expression
       FIXME, insensitive to capitalization.

            $ tgrep -Java -x -n -e '/[Ff][Ii][Xx][Mm][Ee]' *.java
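
       A comment search of this kind can be tried on a small sample, as
       sketched below. The scratch directory and the file Demo.java are
       assumptions made for this example. The pattern is quoted here so that
       the shell does not try to expand the bracket expressions.

```shell
# Scratch directory and file name are assumptions made for this sketch.
demo_dir=$(mktemp -d)
cat > "$demo_dir/Demo.java" <<'EOF'
public class Demo {
    // FIXME: handle empty input
    int size() { return 0; }   /* fixme later */
    String fixme = "not a comment";
}
EOF

if command -v tgrep >/dev/null 2>&1; then
    # -x searches comments instead of code; -n prints file names and line ranges.
    tgrep -Java -x -n -e '/[Ff][Ii][Xx][Mm][Ee]' "$demo_dir/Demo.java" || true
else
    echo "tgrep not found; sample file left in $demo_dir" >&2
fi
```

       With -x, the two comments are expected to be reported, while the
       field named fixme is code and should not be.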

AUTHOR
       Gerard Holzmann, gholzmann@acm.org

SEE ALSO
       cobra(1), grep(1), awk(1)
       https://codescrub.com
