PCRE(3)                                                 PCRE(3)





NAME
       PCRE - Perl-compatible regular expressions

PCRE CALLOUTS

       int (*pcre_callout)(pcre_callout_block *);

       PCRE  provides  a  feature  called "callout", which is a
       means of temporarily passing control to  the  caller  of
       PCRE  in  the  middle of pattern matching. The caller of
       PCRE provides an external function by putting its  entry
       point  in  the global variable pcre_callout. By default,
       this variable contains NULL, which disables all  calling
       out.

       Within  a  regular expression, (?C) indicates the points
       at which the external function is to be called.  Differ-
       ent callout points can be identified by putting a number
       less than 256 after the letter C. The default  value  is
       zero.  For example, this pattern has two callout points:

         (?C1)eabc(?C2)def

       If  the  PCRE_AUTO_CALLOUT  option  bit  is   set   when
       pcre_compile()  is  called,  PCRE  automatically inserts
       callouts, all with number 255, before each item  in  the
       pattern.  For example, if PCRE_AUTO_CALLOUT is used with
       the pattern

         A(\d{2}|--)

       it is processed as if it were

       (?C255)A(?C255)((?C255)\d{2}(?C255)|(?C255)-(?C255)-(?C255))(?C255)

       Notice  that  there  is  a callout before and after each
       parenthesis and alternation bar. Automatic callouts  can
       be  used  for tracking the progress of pattern matching.
       The pcretest command has an option that  sets  automatic
       callouts;  when it is used, the output indicates how the
       pattern is matched. This is useful information when  you
       are  trying  to optimize the performance of a particular
       pattern.

MISSING CALLOUTS

       You should be aware that, because  of  optimizations  in
       the way PCRE matches patterns, callouts sometimes do not
       happen. For example, if the pattern is

         ab(?C4)cd

       PCRE knows that any matching  string  must  contain  the
       letter "d". If the subject string is "abyz", the lack of
       "d" means that matching  doesn't  ever  start,  and  the
       callout  is  never reached. However, with "abyd", though
       the result is still no match, the callout is obeyed.

THE CALLOUT INTERFACE

       During matching, when PCRE reaches a callout point,  the
       external  function defined by pcre_callout is called (if
       it is  set).  The  only  argument  is  a  pointer  to  a
       pcre_callout   block.   This   structure   contains  the
       following fields:

         int          version;
         int          callout_number;
         int         *offset_vector;
         const char  *subject;
         int          subject_length;
         int          start_match;
         int          current_position;
         int          capture_top;
         int          capture_last;
         void        *callout_data;
         int          pattern_position;
         int          next_item_length;

       The version field is an integer containing  the  version
       number  of  the block format. The initial version was 0;
       the current version is 1. The version number will change
       again  in future if additional fields are added, but the
       intention is never to remove any of the existing fields.

       The  callout_number  field  contains  the  number of the
       callout, as compiled into the pattern (that is, the num-
       ber  after ?C for manual callouts, and 255 for automati-
       cally generated callouts).

       The offset_vector field is a pointer to  the  vector  of
       offsets  that  was  passed by the caller to pcre_exec().
       The contents can be inspected in order to  extract  sub-
       strings  that  have been matched so far, in the same way
       as for extracting substrings  after  a  match  has  com-
       pleted.

       The  subject and subject_length fields contain copies of
       the values that were passed to pcre_exec().

       The start_match field contains  the  offset  within  the
       subject  at  which the current match attempt started. If
       the pattern is not anchored, the callout function may be
       called  several times from the same point in the pattern
       for different starting points in the subject.

       The current_position field contains  the  offset  within
       the subject of the current match pointer.

       The  capture_top field contains one more than the number
       of the highest numbered captured substring so far. If no
       substrings  have been captured, the value of capture_top
       is one.

       The capture_last field contains the number of  the  most
       recently  captured substring. If no substrings have been
       captured, its value is -1.

       The callout_data field contains a value that  is  passed
       to pcre_exec() by the caller specifically so that it can
       be  passed  back  in  callouts.  It  is  passed  in  the
       pcre_callout  field of the pcre_extra data structure. If
       no such data was passed, the value of callout_data in  a
       pcre_callout  block  is  NULL. There is a description of
       the pcre_extra structure in the pcreapi documentation.

       The pattern_position field is present from version 1  of
       the  pcre_callout  structure.  It contains the offset to
       the next item to be matched in the pattern string.

       The next_item_length field is present from version 1  of
       the  pcre_callout  structure.  It contains the length of
       the next item to be matched in the pattern string.  When
       the  callout  immediately precedes an alternation bar, a
       closing parenthesis, or the  end  of  the  pattern,  the
       length  is  zero.  When  the callout precedes an opening
       parenthesis, the length is that of  the  entire  subpat-
       tern.

       The  pattern_position  and  next_item_length  fields are
       intended to help  in  distinguishing  between  different
       automatic callouts, which all have the same callout num-
       ber. However, they are set for all callouts.

RETURN VALUES

       The external callout  function  returns  an  integer  to
       PCRE. If the value is zero, matching proceeds as normal.
       If the value is greater than zero, matching fails at the
       current  point,  but backtracking to test other matching
       possibilities goes ahead, just as if a lookahead  asser-
       tion  had  failed.  If  the value is less than zero, the
       match is abandoned, and pcre_exec() returns the negative
       value.

       Negative  values  should normally be chosen from the set
       of     PCRE_ERROR_xxx     values.     In     particular,
       PCRE_ERROR_NOMATCH forces a standard "no match" failure.
       The error number PCRE_ERROR_CALLOUT is reserved for  use
       by  callout  functions;  it  will  never be used by PCRE
       itself.

Last updated: 09 September 2004
Copyright (c) 1997-2004 University of Cambridge.



                                                        PCRE(3)
