C++ - Split String By Regex


Answer:

    #include <iostream>
    #include <regex>
    #include <string>

    // string_to_split is the std::string holding the input text
    std::regex rgx("\\s+");
    std::sregex_token_iterator iter(string_to_split.begin(),
                                    string_to_split.end(),
                                    rgx,
                                    -1);
    std::sregex_token_iterator end;
    for ( ; iter != end; ++iter)
        std::cout << *iter << '\n';

The -1 is the key here: it tells the iterator to produce the text between matches rather than the matches themselves. When the iterator is constructed, it points at the text that precedes the first match; after each increment, it points at the text that follows the previous match.
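To make the role of that last constructor argument concrete, here is a minimal sketch that runs the same iterator with different submatch indices. The helper name `collect_tokens` is hypothetical and not part of the answer above; it only wraps the loop so the two modes can be compared.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not from the original answer): gathers every token
// the iterator yields for the given submatch index.
//   submatch == -1 -> the text between matches (the "split" behaviour)
//   submatch ==  0 -> the matches themselves (here, the delimiters)
std::vector<std::string> collect_tokens(const std::string& text,
                                        const std::regex& rgx,
                                        int submatch) {
    std::vector<std::string> tokens;
    std::sregex_token_iterator iter(text.begin(), text.end(), rgx, submatch);
    std::sregex_token_iterator end;
    for ( ; iter != end; ++iter)
        tokens.push_back(iter->str());
    return tokens;
}
```

With the pattern `,\s*` and the input `"a, b,c"`, passing -1 should give the fields `a`, `b`, `c`, while passing 0 should give the two delimiters instead.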

If you don't have C++11, the same thing should work with TR1 or (possibly with slight modification) with Boost.


To expand on the answer by @Pete Becker, here is an example of a resplit function that splits text using a regular expression:

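    #include <regex>
    #include <string>
    #include <vector>

    namespace my {  // the usage example calls this as my::resplit

    // Splits s on every match of rgx_str (default: runs of whitespace).
    std::vector<std::string>
    resplit(const std::string & s, std::string rgx_str = "\\s+") {
        std::vector<std::string> elems;

        std::regex rgx(rgx_str);

        // -1 selects the text between matches, i.e. the tokens
        std::sregex_token_iterator iter(s.begin(), s.end(), rgx, -1);
        std::sregex_token_iterator end;

        while (iter != end) {
            elems.push_back(*iter);
            ++iter;
        }

        return elems;
    }

    }  // namespace my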

This works as follows:

    std::string s1 = "first   second third    ";
    std::vector<std::string> v22 = my::resplit(s1);

    for (const auto & e : v22) {
        std::cout << "Token:" << e << std::endl;
    }

    //Token:first
    //Token:second
    //Token:third

    std::string s222 = "first|second:third,forth";
    std::vector<std::string> v222 = my::resplit(s222, "[|:,]");

    for (const auto & e : v222) {
        std::cout << "Token:" << e << std::endl;
    }

    //Token:first
    //Token:second
    //Token:third
    //Token:forth
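One case the usage above does not exercise is input that starts with a delimiter, or that has two delimiters in a row when the pattern matches only one character. With -1, the token iterator then yields an empty token for the empty text in front of the match. The following is a minimal sketch of a variant that drops such tokens; the name `resplit_nonempty` is hypothetical and this is one possible approach, not part of the original answer.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical variant of resplit: same splitting logic, but empty tokens
// (e.g. from a leading delimiter or adjacent delimiters) are skipped.
std::vector<std::string>
resplit_nonempty(const std::string & s, const std::string & rgx_str = "\\s+") {
    std::vector<std::string> elems;
    std::regex rgx(rgx_str);

    std::sregex_token_iterator iter(s.begin(), s.end(), rgx, -1);
    std::sregex_token_iterator end;

    for ( ; iter != end; ++iter) {
        if (iter->length() > 0)          // skip empty tokens
            elems.push_back(iter->str());
    }
    return elems;
}
```

For example, `resplit_nonempty("  first second ")` should return only `first` and `second`, and `resplit_nonempty(",a,,b", "[,]")` should return `a` and `b`.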

You don't need to use regular expressions if you just want to split a string by multiple spaces. Writing your own regex library is overkill for something that simple.

The answer you linked to in your comments, "Split a string in C++?", can easily be changed so that it skips the empty elements produced by consecutive spaces.

    #include <sstream>
    #include <string>
    #include <vector>

    // Appends the non-empty pieces of s, separated by delim, to elems.
    std::vector<std::string> &split(const std::string &s, char delim,
                                    std::vector<std::string> &elems) {
        std::stringstream ss(s);
        std::string item;
        while (std::getline(ss, item, delim)) {
            if (item.length() > 0) {
                elems.push_back(item);
            }
        }
        return elems;
    }

    // Convenience overload that returns a new vector.
    std::vector<std::string> split(const std::string &s, char delim) {
        std::vector<std::string> elems;
        split(s, delim, elems);
        return elems;
    }

By checking that item.length() > 0 before pushing item onto the elems vector, you no longer get empty elements when your input contains consecutive delimiters (spaces in your case).
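To see what that check changes, here is a small sketch with the filter made switchable. The function name `split_on` and its `skip_empty` parameter are hypothetical, introduced only for this comparison.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper for comparison: getline-based splitting on a single
// character, with the empty-item filter controlled by skip_empty.
std::vector<std::string> split_on(const std::string &s, char delim,
                                  bool skip_empty) {
    std::vector<std::string> elems;
    std::stringstream ss(s);
    std::string item;
    while (std::getline(ss, item, delim)) {
        if (skip_empty && item.empty())
            continue;                    // drop the piece between two adjacent delimiters
        elems.push_back(item);
    }
    return elems;
}
```

For the input `"a  b"` (two spaces), `split_on(s, ' ', false)` should return three elements with an empty string in the middle, while `split_on(s, ' ', true)` should return just `a` and `b`.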

