actions.qbk

[/
 / Copyright (c) 2008 Eric Niebler
 /
 / Distributed under the Boost Software License, Version 1.0. (See accompanying
 / file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
 /]

[section Semantic Actions and User-Defined Assertions]

[h2 Overview]

Imagine you want to parse an input string and build a `std::map<>` from it. For
something like that, matching a regular expression isn't enough. You want to
/do something/ when parts of your regular expression match. Xpressive lets
you attach semantic actions to parts of your static regular expressions. This
section shows you how.

[h2 Semantic Actions]

Consider the following code, which uses xpressive's semantic actions to parse
a string of word/integer pairs and stuff them into a `std::map<>`. It is
described below.

    #include <map>
    #include <string>
    #include <iostream>
    #include <boost/xpressive/xpressive.hpp>
    #include <boost/xpressive/regex_actions.hpp>
    using namespace boost::xpressive;

    int main()
    {
        std::map<std::string, int> result;
        std::string str("aaa=>1 bbb=>23 ccc=>456");

        // Match a word and an integer, separated by =>,
        // and then stuff the result into a std::map<>
        sregex pair = ( (s1= +_w) >> "=>" >> (s2= +_d) )
            [ ref(result)[s1] = as<int>(s2) ];

        // Match one or more word/integer pairs, separated
        // by whitespace.
        sregex rx = pair >> *(+_s >> pair);

        if(regex_match(str, rx))
        {
            std::cout << result["aaa"] << '\n';
            std::cout << result["bbb"] << '\n';
            std::cout << result["ccc"] << '\n';
        }

        return 0;
    }

This program prints the following:

[pre
1
23
456
]

The regular expression `pair` has two parts: the pattern and the action. The
pattern says to match a word, capturing it in sub-match 1, and an integer,
capturing it in sub-match 2, separated by `"=>"`. The action is the part in
square brackets: `[ ref(result)[s1] = as<int>(s2) ]`.
It says to take sub-match
1 and use it to index into the `result` map, and assign to it the result of
converting sub-match 2 to an integer.

[note To use semantic actions with your static regexes, you must
`#include <boost/xpressive/regex_actions.hpp>`]

How does this work? Just like the rest of the static regular expression, the part
between brackets is an expression template. It encodes the action and executes
it later. The expression `ref(result)` creates a lazy reference to the `result`
object. The larger expression `ref(result)[s1]` is a lazy map index operation.
Later, when this action is executed, `s1` is replaced with the
first _sub_match_. Likewise, when `as<int>(s2)` is executed, `s2` is replaced
with the second _sub_match_. The `as<>` action converts its argument to the
requested type using Boost.Lexical_cast. The effect of the whole action is to
insert a new word/integer pair into the map.

[note There is an important difference between the function `boost::ref()` in
`<boost/ref.hpp>` and `boost::xpressive::ref()` in
`<boost/xpressive/regex_actions.hpp>`. The first returns a plain
`reference_wrapper<>` which behaves in many respects like an ordinary
reference. By contrast, `boost::xpressive::ref()` returns a /lazy/ reference
that you can use in expressions that are executed lazily. That is why we can
say `ref(result)[s1]`, even though `result` doesn't have an `operator[]` that
would accept `s1`.]

In addition to the sub-match placeholders `s1`, `s2`, etc., you can also use
the placeholder `_` within an action to refer back to the string matched by
the sub-expression to which the action is attached.
For instance, you can use
the following regex to match a bunch of digits, interpret them as an integer
and assign the result to a local variable:

    int i = 0;
    // Here, _ refers back to all the
    // characters matched by (+_d)
    sregex rex = (+_d)[ ref(i) = as<int>(_) ];

[h3 Lazy Action Execution]

What does it mean, exactly, to attach an action to part of a regular expression
and perform a match? When does the action execute? If the action is part of a
repeated sub-expression, does the action execute once or many times? And if the
sub-expression initially matches, but ultimately fails because the rest of the
regular expression fails to match, is the action executed at all?

The answer is that by default, actions are executed /lazily/. When a sub-expression
matches a string, its action is placed on a queue, along with the current
values of any sub-matches to which the action refers. If the match algorithm
must backtrack, actions are popped off the queue as necessary. Only after the
entire regex has matched successfully are the actions actually executed. They
are executed all at once, in the order in which they were added to the queue,
as the last step before _regex_match_ returns.

For example, consider the following regex that increments a counter whenever
it finds a digit.

    int i = 0;
    std::string str("1!2!3?");
    // count the exciting digits, but not the
    // questionable ones.
    sregex rex = +( _d [ ++ref(i) ] >> '!' );
    regex_search(str, rex);
    assert( i == 2 );

The action `++ref(i)` is queued three times: once for each found digit. But
it is only /executed/ twice: once for each digit that precedes a `'!'`
character.
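
The queue-and-backtrack behaviour can also be modelled in plain C++, without
xpressive. The sketch below is a simplified illustration only: the names
`match_exciting_digits` and `count_exciting_digits` are hypothetical, the
matcher is hand-written for the single pattern `+( digit '!' )`, it only tries
to match at the start of the input, and it does not reflect how xpressive is
actually implemented.

```cpp
#include <cctype>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical hand-written matcher for the pattern +( digit '!' ).
// Each matched digit queues `on_digit`; a failed repetition pops the
// action it queued; the surviving actions run only after overall success.
bool match_exciting_digits(const std::string &input,
                           const std::function<void()> &on_digit)
{
    std::vector<std::function<void()>> queue; // pending (lazy) actions
    std::size_t pos = 0;
    std::size_t repetitions = 0;

    while (pos < input.size() &&
           std::isdigit(static_cast<unsigned char>(input[pos])))
    {
        queue.push_back(on_digit);            // digit matched: queue its action

        if (pos + 1 < input.size() && input[pos + 1] == '!')
        {
            pos += 2;                         // this repetition succeeded
            ++repetitions;
        }
        else
        {
            queue.pop_back();                 // backtrack: discard the action
            break;
        }
    }

    if (repetitions == 0)
        return false;                         // no match: nothing is executed

    for (const auto &action : queue)          // run actions in queue order
        action();
    return true;
}

// Counts the digits that are followed by '!' in the matched prefix.
int count_exciting_digits(const std::string &input)
{
    int count = 0;
    match_exciting_digits(input, [&count] { ++count; });
    return count;
}
```

Under these assumptions, `count_exciting_digits("1!2!3?")` queues three actions
but runs only two of them, mirroring the `assert( i == 2 )` in the example.
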
When the `'?'` character is encountered, the match algorithm
backtracks, removing the final action from the queue.

[h3 Immediate Action Execution]

When you want semantic actions to execute immediately, you can wrap the
sub-expression containing the action in a [^[funcref boost::xpressive::keep keep()]].
`keep()` turns off back-tracking for its sub-expression, but it also causes
any actions queued by the sub-expression to execute at the end of the `keep()`.
It is as if the sub-expression in the `keep()` were compiled into an
independent regex object, and matching the `keep()` is like a separate invocation
of `regex_search()`. It matches characters and executes actions but never backtracks
or unwinds. For example, imagine the above example had been written as follows:

    int i = 0;
    std::string str("1!2!3?");
    // count all the digits.
    sregex rex = +( keep( _d [ ++ref(i) ] ) >> '!' );
    regex_search(str, rex);
    assert( i == 3 );

We have wrapped the sub-expression `_d [ ++ref(i) ]` in `keep()`. Now, whenever
this regex matches a digit, the action will be queued and then immediately
executed before we try to match a `'!'` character. In this case, the action
executes three times.

[note Like `keep()`, actions within [^[funcref boost::xpressive::before before()]]
and [^[funcref boost::xpressive::after after()]] are also executed early when their
sub-expressions have matched.]

[h3 Lazy Functions]

So far, we've seen how to write semantic actions consisting of variables and
operators. But what if you want to be able to call a function from a semantic
action? Xpressive provides a mechanism to do this.

The first step is to define a function object type.
Here, for instance, is a
function object type that calls `push()` on its argument:

    struct push_impl
    {
        // Result type, needed for tr1::result_of
        typedef void result_type;

        template<typename Sequence, typename Value>
        void operator()(Sequence &seq, Value const &val) const
        {
            seq.push(val);
        }
    };

The next step is to use xpressive's `function<>` template to define a function
object named `push`:

    // Global "push" function object.
    function<push_impl>::type const push = {{}};

The initialization looks a bit odd, but this is because `push` is being
statically initialized. That means it doesn't need to be constructed
at runtime. We can use `push` in semantic actions as follows:

    std::stack<int> ints;
    // Match digits, cast them to an int
    // and push it on the stack.
    sregex rex = (+_d)[push(ref(ints), as<int>(_))];

You'll notice that doing it this way causes member function invocations
to look like ordinary function invocations. You can choose to write your
semantic action in a different way that makes it look a bit more like
a member function call:

    sregex rex = (+_d)[ref(ints)->*push(as<int>(_))];

Xpressive recognizes the use of the `->*` and treats this expression
exactly the same as the one above.

When your function object must return a type that depends on its
arguments, you can use a `result<>` member template instead of the
`result_type` typedef. Here, for example, is a `first` function object
that returns the `first` member of a `std::pair<>` or _sub_match_:

    // Function object that returns the
    // first element of a pair.
    struct first_impl
    {
        template<typename Sig> struct result {};

        template<typename This, typename Pair>
        struct result<This(Pair)>
        {
            typedef typename remove_reference<Pair>
                ::type::first_type type;
        };

        template<typename Pair>
        typename Pair::first_type
        operator()(Pair const &p) const
        {
            return p.first;
        }
    };

    // OK, use as first(s1) to get the begin iterator
    // of the sub-match referred to by s1.
    function<first_impl>::type const first = {{}};

[h3 Referring to Local Variables]

As we've seen in the examples above, we can refer to local variables within
an action using `xpressive::ref()`. Any such variables are held by reference
by the regular expression, and care should be taken to avoid letting those
references dangle. For instance, in the following code, the reference to `i`
is left to dangle when `bad_voodoo()` returns:

    sregex bad_voodoo()
    {
        int i = 0;
        sregex rex = +( _d [ ++ref(i) ] >> '!' );
        // ERROR! rex refers by reference to a local
        // variable, which will dangle after bad_voodoo()
        // returns.
        return rex;
    }

When writing semantic actions, it is your responsibility to make sure that
none of the references dangle. One way to do that would be to make the
variables shared pointers that are held by the regex by value.

    sregex good_voodoo(boost::shared_ptr<int> pi)
    {
