Can I Replace Groups In Java Regex?


Answer :

Use $n (where n is a digit) to refer to captured subsequences in replaceFirst(...). I'm assuming you wanted to replace the first group with the literal string "number" and the second group with the value of the first group.

Pattern p = Pattern.compile("(\\d)(.*)(\\d)"); String input = "6 example input 4"; Matcher m = p.matcher(input); if (m.find()) {     // replace first number with "number" and second number with the first     String output = m.replaceFirst("number $3$1");  // number 46 } 

Consider (\D+) for the second group instead of (.*). * is a greedy matcher, and will at first consume the last digit. The matcher will then have to backtrack when it realizes the final (\d) has nothing to match, before it can match to the final digit.


You could use Matcher#start(group) and Matcher#end(group) to build a generic replacement method:

public static String replaceGroup(String regex, String source, int groupToReplace, String replacement) {     return replaceGroup(regex, source, groupToReplace, 1, replacement); }  public static String replaceGroup(String regex, String source, int groupToReplace, int groupOccurrence, String replacement) {     Matcher m = Pattern.compile(regex).matcher(source);     for (int i = 0; i < groupOccurrence; i++)         if (!m.find()) return source; // pattern not met, may also throw an exception here     return new StringBuilder(source).replace(m.start(groupToReplace), m.end(groupToReplace), replacement).toString(); }  public static void main(String[] args) {     // replace with "%" what was matched by group 1      // input: aaa123ccc     // output: %123ccc     System.out.println(replaceGroup("([a-z]+)([0-9]+)([a-z]+)", "aaa123ccc", 1, "%"));      // replace with "!!!" what was matched the 4th time by the group 2     // input: a1b2c3d4e5     // output: a1b2c3d!!!e5     System.out.println(replaceGroup("([a-z])(\\d)", "a1b2c3d4e5", 2, 4, "!!!")); } 

Check online demo here.


Sorry to beat a dead horse, but it is kind-of weird that no-one pointed this out - "Yes you can, but this is the opposite of how you use capturing groups in real life".

If you use Regex the way it is meant to be used, the solution is as simple as this:

"6 example input 4".replaceAll("(?:\\d)(.*)(?:\\d)", "number$11"); 

Or as rightfully pointed out by shmosel below,

"6 example input 4".replaceAll("\d(.*)\d", "number$11"); 

...since in your regex there is no good reason to group the decimals at all.

You don't usually use capturing groups on the parts of the string you want to discard, you use them on the part of the string you want to keep.

If you really want groups that you want to replace, what you probably want instead is a templating engine (e.g. moustache, ejs, StringTemplate, ...).


As an aside for the curious, even non-capturing groups in regexes are just there for the case that the regex engine needs them to recognize and skip variable text. For example, in

(?:abc)*(capture me)(?:bcd)* 

you need them if your input can look either like "abcabccapture mebcdbcd" or "abccapture mebcd" or even just "capture me".

Or to put it the other way around: if the text is always the same, and you don't capture it, there is no reason to use groups at all.


Comments

Popular posts from this blog

Are Regular VACUUM ANALYZE Still Recommended Under 9.1?

Can Feynman Diagrams Be Used To Represent Any Perturbation Theory?