regular_expressions thread: Problem using Regex to detect a block of code
Message #1 by "cheeleong82" <cheeleong82@h...> on Tue, 31 Dec 2002 06:01:51
Yes, Regex can be used for this, but it will be a very complicated job.
It is easier with HTML, because there you can match against the close
tag. With for loops, however, the { } characters delimit many different
kinds of blocks. I would suggest using some kind of counting mechanism
instead. Alternatively, match blocks in general (everything before a {
bracket, up until a ;\s* match). A solution using only regex groups is
as follows. First match any namespace contents:
... new Regex(@"(using\s+[^;]*;\s*)*namespace\s+[^{]*\{\s*(?<namespaceBlock>.*)\s*\}",
RegexOptions.Multiline | RegexOptions.Singleline |
RegexOptions.Compiled | RegexOptions.ExplicitCapture);
Now, in a loop, keep trying to find a match for this. Each time you
succeed, take the contents of the named namespaceBlock group and pass it
back to the regex's Match() method (IsMatch() only reports success and
does not expose the groups). When it fails, you know you have finished
with the namespace declarations and all that's left are class, struct,
and other blocks. So, you now want a regex like this on the last
namespaceBlock contents:
... new Regex(@"\s*(?<type>class\s+[^{]*|struct\s+[^{]*)\{(?<typeDefinition>.*)\s*\}",
RegexOptions.Multiline | RegexOptions.Singleline |
RegexOptions.Compiled | RegexOptions.ExplicitCapture);
Now the contents of the typeDefinition group are the actual block itself.
Finally, we can use the main match, the one for the for loops:
... new Regex(@"(?:\bfor\s*\(.*?\)\s*\{\s*(?<loopContents>.*)\s*\})\s*",
RegexOptions.Multiline | RegexOptions.Singleline |
RegexOptions.Compiled | RegexOptions.ExplicitCapture);
You can then run Match() on the contents of the named loopContents group
to see if it holds any other for loops. If so, extract the contents again.
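The extract-and-rematch step described above might look like the following minimal sketch. The names LoopPeeler and Peel are made up for illustration, and the pattern is the for-loop regex from this message written with for\s*\( so that "for (" with a space also matches. It assumes strictly nested loops with no sibling loops at the same level.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class LoopPeeler
{
    // For-loop pattern from this message; the greedy .* runs to the last }.
    private static readonly Regex ForLoop = new Regex(
        @"\bfor\s*\(.*?\)\s*\{\s*(?<loopContents>.*)\s*\}",
        RegexOptions.Singleline | RegexOptions.ExplicitCapture);

    // Returns the loopContents of each nesting level, outermost first.
    public static List<string> Peel(string source)
    {
        var levels = new List<string>();
        Match m = ForLoop.Match(source);
        while (m.Success)
        {
            string contents = m.Groups["loopContents"].Value;
            levels.Add(contents);
            // Look for another for loop inside the body just extracted.
            m = ForLoop.Match(contents);
        }
        return levels;
    }
}
```

Calling LoopPeeler.Peel("for (a) { for (b) { for (c) { } } }") yields one string per nesting level. Because .* is greedy, loopContents always runs to the last } in its input, so with the code in the original question the sibling loop for(int g...) ends up inside the contents of for(int j...). This approach can therefore report nesting depth, but it cannot tell siblings from children.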
This is probably possible with one hugely complex regex, with positive
and negative assertions and many alternations. However, it would take
too long to work out.
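One candidate for such a single regex, offered as a sketch rather than a tested solution, uses the balancing group definitions of the .NET regex engine to keep the { } pairs matched. The class name BalancedLoops, the method Outline and the group names header, body and depth are made up for illustration. It assumes the for header contains no nested parentheses and that no braces appear inside strings or comments.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class BalancedLoops
{
    // "depth" is pushed on every { and popped on every }, so "body" stops
    // at the } that closes the loop's own opening brace.
    private static readonly Regex ForBlock = new Regex(
        @"\bfor\s*\((?<header>[^)]*)\)\s*\{" +
        @"(?<body>(?>[^{}]+|\{(?<depth>)|\}(?<-depth>))*)" +
        @"(?(depth)(?!))\}",
        RegexOptions.ExplicitCapture);

    // Returns one line per for loop, indented two spaces per nesting level.
    public static List<string> Outline(string source, int depth = 0)
    {
        var lines = new List<string>();
        foreach (Match m in ForBlock.Matches(source))
        {
            lines.Add(new string(' ', depth * 2) + "for (" + m.Groups["header"].Value + ")");
            // Recurse into the body to find the loops nested inside this one.
            lines.AddRange(Outline(m.Groups["body"].Value, depth + 1));
        }
        return lines;
    }
}
```

Since each match consumes one complete, balanced block, sibling loops at the same level come back as separate matches, and the recursion on the body group finds the loops nested inside each of them.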
Try to merge the expressions above together slightly if you can, by
adding a new subgroup inside the loopContents group that optionally
matches another for loop. This way you can probably do more cunning
things.
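For comparison, the counting mechanism suggested at the start of this message could be sketched as below. It uses a regex only to find for headers and braces, and a stack to track which loop each brace belongs to. The types LoopNode and LoopScanner and their members are hypothetical names chosen for this example. It assumes the for header contains no nested parentheses and that no braces appear inside strings or comments.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical node type: one instance per for loop found in the source.
public sealed class LoopNode
{
    public string Header = "";
    public List<LoopNode> Children = new List<LoopNode>();
}

// Hypothetical helper, not part of any library.
public static class LoopScanner
{
    // Tokens: a for header with its opening brace, any other {, or a }.
    private static readonly Regex Token = new Regex(
        @"(?<loop>\bfor\s*\([^)]*\))\s*\{|(?<open>\{)|(?<close>\})");

    public static LoopNode Parse(string source)
    {
        var root = new LoopNode { Header = "<root>" };
        var stack = new Stack<LoopNode>();   // one entry per open brace
        stack.Push(root);
        foreach (Match m in Token.Matches(source))
        {
            if (m.Groups["loop"].Success)
            {
                var node = new LoopNode { Header = m.Groups["loop"].Value };
                stack.Peek().Children.Add(node);   // parent is the enclosing loop
                stack.Push(node);
            }
            else if (m.Groups["open"].Success)
            {
                // Non-loop block: repeat the current owner so its } pops cleanly.
                stack.Push(stack.Peek());
            }
            else if (stack.Count > 1)
            {
                stack.Pop();   // closing brace; stray extra braces are ignored
            }
        }
        return root;
    }
}
```

With the code from the original question, Parse returns a root whose single child is the for(int i...) loop, and that node's Children list holds for(int j...) and for(int g...) in source order, which answers both numbered questions directly from the tree.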
-----Original Message-----
From: cheeleong82 [mailto:cheeleong82@h...]
Sent: 31 December 2002 11:32
To: Regular Expressions
Subject: [regular_expressions] Problem using Regex to detect a block of
code
Hello! I'm having a problem using Regex to detect the structure of a
block of code.
For instance, I have the following block of code:
for (int i = 0; i < 8; i++)
{
    for (int j = 0; j < 8; j++)
    {
        for (int x = 0; x < 8; x++)
        {
        }
    }
    for (int g = 0; g < 9; g++)
    {
    }
}
Problems: 1. How can I detect the structure of the block of code?
For instance, "for(int x...)" has an outer loop "for(int j...)",
and the outermost loop is "for(int i...)".
2. How can I determine that "for(int g...)" and "for(int j...)" are
both under "for(int i...)"?
Is the Regex class able to solve my problem? Or is there a better way
that I can use? Thanks for your kind help!