Do you know how to use regex in C++? Regex has been part of the C++ standard library since C++11!
If you use "sed" in bash or write code in Python, you are certainly already familiar with regex! Regular expressions make it easy to detect a specific pattern in any sequence of characters.
In this post, I'll show you how to use the different functions and get the most out of this powerful library!
Introduction.
What is a regular expression? What about regex in C++11?
First of all, what is a regular expression? A regular expression (regex for short) is a sequence of characters that defines a pattern. Using string-matching algorithms, a regex can find a specific pattern in text in order to match it, search for groups of patterns, or replace patterns.
Regex support was added to the standard library in C++11, through the <regex> header.
Which file is parsed in this post?
Throughout this post, all my regular expressions will process the same text file, line by line. Do not worry, I'll reprint each line before the result!
June 12th, 2018 - Danny commands 6 apple
July 15th, 2018 - Jean-Reblochon commands 4 banana
August 18th, 2018 - Jane commands 5 grape
August 19th, 2018 - JWhat?Gne?
December 4th, 2019 - Mathilde commands 50 cherry
January 5th, 2020 - Danny commands 2 pineapple
March 15th, 2020 - Jane commands 4 banana
August 18th, 2020 - Anna commands 1 potato
If you want to play with it or run your own tests, do not hesitate to use the same kind of code! Here is the main I used:
One more point: mastering regular expressions is not easy at first sight. However, with a little knowledge of the tool, you can easily build some complex expressions. Do not hesitate to use online tools like regex101 for help. You'll see, regex in C++ gives you a powerful tool to parse any text file easily!
Method reviews and code examples.
Using regex in C++ with "match" - Check if a pattern is present.
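First, we will take a look at the match function. As its name suggests, std::regex_match tells you whether a string matches a pattern. Note that it only succeeds when the whole string matches, so the pattern has to cover the complete line (for example with a leading and trailing ".*"). Let's review this code: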
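A minimal sketch of such a check, assuming the order lines are already stored in a vector of strings. The helper name `banana_orders` and the pattern `.*banana.*` are assumptions made for the illustration:

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: keeps only the lines that mention a banana order.
std::vector<std::string> banana_orders(const std::vector<std::string>& lines) {
    // regex_match must cover the whole line, hence the ".*" on both sides.
    const std::regex pattern(".*banana.*");
    std::vector<std::string> matches;
    for (const auto& line : lines) {
        if (std::regex_match(line, pattern)) {
            matches.push_back(line);
        }
    }
    return matches;
}
```

Building the std::regex once, outside the loop, avoids recompiling the pattern for every line.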
Pretty simple! Here I created my regular expression with a std::regex object. Now, I want to see who bought bananas in my list. Let's run the code:
Simple and efficient: we found that Jane and "Jean-Reblochon" bought bananas!
Using regex in C++ with "search" - Extract patterns.
Here, "search" is a powerful feature of regex. With a good knowledge of regular expressions, you can easily extract data from any text file into an ordered array. I personally use this feature to run automatic log analysis (based on Python).
For our example with regex in C++, I'll extract from the text file the day, the name of the buyer, the number of articles and the article name of each line:
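A minimal sketch of such an extraction with std::regex_search and std::smatch. The `Order` struct, the `parse_orders` helper and the exact pattern are assumptions based on the layout of the sample lines, not a definitive parser:

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical record holding the four captured groups of one order line.
struct Order {
    std::string day;
    std::string buyer;
    int quantity;
    std::string article;
};

// Hypothetical helper: extracts one Order per line that fits the pattern.
std::vector<Order> parse_orders(const std::vector<std::string>& lines) {
    // Assumed layout: "<date> - <buyer> commands <quantity> <article>".
    const std::regex pattern(R"((\w+ \d+\w*, \d{4}) - (\S+) commands (\d+) (\w+))");
    std::vector<Order> orders;
    std::smatch groups;
    for (const auto& line : lines) {
        // Lines that do not fit the pattern are simply skipped.
        if (std::regex_search(line, groups, pattern)) {
            orders.push_back({groups[1].str(), groups[2].str(),
                              std::stoi(groups[3].str()), groups[4].str()});
        }
    }
    return orders;
}
```

The raw string literal keeps the backslashes readable; without it, every `\d` would have to be written `\\d`.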
Do not hesitate to run it to see the full output...
As you can see, std::smatch behaves like an array of sub-matches, and each entry can be read as a std::string. The groups are stored in the following order: one group corresponds to each expression between parentheses (except group 0, which holds the complete matched text):
- Group 0 - The complete matched text.
- Group 1 - The day.
- Group 2 - Buyer name.
- Group 3 - Number of articles bought.
- Group 4 - The article's name.
Moreover, as you can see, the lines that do not match (such as the "JWhat?Gne?" line) are filtered out.
Using regex in C++ with "replace" - Replace pattern.
If you already use "sed" in bash scripts, you are probably familiar with replace.
Replace lets you detect a pattern in your string object and substitute another text for it. The usage is really simple! Let's replace the name "Danny" by "Danny 'the cool'":
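A minimal sketch of this substitution with std::regex_replace. The helper name `add_nickname` is hypothetical:

```cpp
#include <regex>
#include <string>

// Hypothetical helper: rewrites every "Danny" in the line with the nickname.
std::string add_nickname(const std::string& line) {
    const std::regex pattern("Danny");
    // regex_replace returns a new string; the input line is left untouched.
    return std::regex_replace(line, pattern, "Danny 'the cool'");
}
```

A line that does not contain "Danny" comes back unchanged.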
As you can see, it is efficient and easy to use!
Conclusion.
Now you know how to use std::regex in C++! If you want to go further, do not hesitate to learn more about regular expressions themselves. This skill, especially useful in Python and JavaScript, helps to automate testing or to quickly grab relevant data from log files.
If you want to learn more about regular expressions, or if you would like to test some special patterns, you can use this tutorial ==> https://regexone.com/
Personally, I regularly use regex in C++11 to parse text files, especially files sent by providers or handled by old code used for automatic testing.