Using Regular Expressions To Parse Functions
turbopowerdmaxst...
post Mar 24 2008, 02:07 AM
Post #1


I have a hierarchy of functions represented as something like:-

@BeforeText(@AfterText(@ReadURL('http://www.quotationspage.com/quote/1.html')@,'<dt>',True)@,'</dt>',True)@

Each function follows the format @FunctionName(<ParamString>)@. The ParamString is a comma-separated list of parameters, each of which can be a string, a number or a boolean value. For example:-

@AfterText('String', "st", True)@ - This function returns the portion of text following the specified substring. It takes three parameters: the first is the string being worked on, the second is the substring being sought, and the third denotes whether to ignore case.

The above code will result in an output of ring.

Trouble starts with function nesting. The following example should give the output so.

@BeforeText(@AfterText('Microsoft', "Micro", True)@, "ft", True)@

This kind of nesting should be possible to arbitrary levels. I can do this using loops and conditional branching statements, but I would rather use regular expressions and, in the process, learn some more nuances of .NET's regular expression support.

Classes for evaluating the functions have already been designed. What I need to do is parse the functions starting from the deepest level, just as programming languages do, execute each one, and use its output as a parameter to the enclosing function. But I am not sure where to begin. Would anybody care to give me a head start?
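One possible starting point for the "deepest level first" idea, as a minimal sketch rather than a finished parser: match only those calls whose parameter string contains no further @Name( opener, replace each with its result, and repeat until nothing matches. The Evaluate and SplitParams helpers below are hypothetical stand-ins for the existing function classes, the behaviour when the substring is not found is an assumption, and the sketch assumes that string parameters and results never contain a comma, @Name( or )@.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class InnermostFirst
{
    // Matches a call whose parameter string contains neither an "@Name(" opener
    // nor a ")@" closer, i.e. a call with no calls nested inside it.
    static readonly Regex Innermost = new Regex(
        @"@(?<name>[a-z0-9]+)\((?<paramstring>(?:(?!@[a-z0-9]+\(|\)@).)*)\)@",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Hypothetical evaluator covering only the two functions used in the examples.
    static string Evaluate(string name, string[] p)
    {
        string fn = name.ToLowerInvariant();
        if (fn != "aftertext" && fn != "beforetext")
            throw new ArgumentException("Unknown function: " + name);

        bool ignoreCase = p.Length > 2 && bool.Parse(p[2]);
        StringComparison cmp = ignoreCase
            ? StringComparison.OrdinalIgnoreCase
            : StringComparison.Ordinal;
        int i = p[0].IndexOf(p[1], cmp);

        // Assumption: AfterText yields "" and BeforeText yields the whole string
        // when the substring is not found.
        if (fn == "aftertext") return i < 0 ? "" : p[0].Substring(i + p[1].Length);
        return i < 0 ? p[0] : p[0].Substring(0, i);
    }

    // Naive split on commas; trims whitespace and surrounding quotes.
    static string[] SplitParams(string s) =>
        s.Split(',').Select(x => x.Trim().Trim('\'', '"')).ToArray();

    public static string Substitute(string message)
    {
        // Each pass evaluates every call that is currently innermost.
        while (Innermost.IsMatch(message))
        {
            message = Innermost.Replace(message, m =>
                Evaluate(m.Groups["name"].Value,
                         SplitParams(m.Groups["paramstring"].Value)));
        }
        return message;
    }
}
```

With these assumptions, InnermostFirst.Substitute applied to the nested Microsoft example first turns the inner call into soft and then the outer call into so.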
 
faulty.lee
post Mar 24 2008, 02:33 AM
Post #2


Why don't you consider using .NET scripting or PowerShell instead of racking your brain to reinvent the wheel? For .NET scripting, the script itself is plain .NET, be it VB.NET or C#. The main thing is to use System.CodeDom.Compiler.ICodeCompiler to compile and run the code on the fly. Windows PowerShell: http://www.microsoft.com/windowsserver2003...ll/default.mspx
It also relies on .NET, but its scripting engine is much more powerful and is suited to running either in a console or from scripts.

 
turbopowerdmaxst...
post Mar 24 2008, 02:50 AM
Post #3


I have tried my hand at the ICodeCompiler interface. It forms part of the package I am building for Pika Bot. But these functions have to be written in this custom language, and regular expressions seemed to be the best option. I could, however, convert these functions into VB or C# representations and execute the code using the ICodeCompiler interface.
 
turbopowerdmaxst...
post May 14 2008, 08:47 AM
Post #4


Given below is simplified code that represents the function evaluation method. Code irrelevant to the problem has been omitted to avoid clutter.

CODE
using System.Text.RegularExpressions;

public static string Substitute(string Message)
{
    // The greedy .* matches from the first "@Name(" to the last ")@" in Message.
    string Pat = @"@(?<name>[a-z0-9]+)\((?<paramstring>.*)\)@";
    Match M = Regex.Match(Message, Pat, RegexOptions.IgnoreCase | RegexOptions.Singleline);

    if (M.Success)
    {
        // BEGIN Parameter determination Code
        // (omitted; Params, Pre, Post and Func are set up by code not shown here)
        // END Parameter determination Code

        // Substitute parameters in case they contain functions (nested functions).
        for (int i = 0; i < Params.Length; i++)
        {
            Params[i] = Substitute(Params[i]);
        }

        // Pre & Post contain the strings before and after the matched text.
        // Func is an object of the appropriate function class.
        // The parameters are added to it after they are substituted.
        Message = Pre + Func.Invoke() + Post;
    }
    return Message;
}


The pattern matches the outermost function (using .* in the paramstring) and thus allows nested functions to be matched in subsequent recursive calls. A lot of code exists between the // BEGIN Parameter determination Code and // END Parameter determination Code markers. It splits the ParamString on , to determine the parameters and then rejoins pieces that were split wrongly (in case the , is contained inside a string parameter). The loop then calls Substitute on each parameter in turn, which takes care of nested functions. The object Func returns the result obtained from evaluating the function. For example, @Log10(100)@ will result in 2. Pre & Post contain the strings before and after the matched text. In the input ABC@Rnd(1,100)@DEF, ABC is Pre and DEF is Post.
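The split-then-rejoin step described above could also be done in a single pass. The following is only a sketch of that alternative, not the omitted code: a hypothetical ParamSplitter that treats a comma as a separator only when it is outside a quoted string and outside the parentheses of a nested call. It assumes quotes are never escaped inside strings.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class ParamSplitter
{
    // Splits a ParamString on top-level commas only.
    public static List<string> Split(string paramString)
    {
        var result = new List<string>();
        var current = new StringBuilder();
        char quote = '\0'; // active quote character, or '\0' when outside a string
        int depth = 0;     // parenthesis depth of nested calls

        foreach (char c in paramString)
        {
            if (quote != '\0')
            {
                // Inside a string: only the matching quote ends it.
                if (c == quote) quote = '\0';
            }
            else if (c == '\'' || c == '"')
            {
                quote = c;
            }
            else if (c == '(')
            {
                depth++;
            }
            else if (c == ')')
            {
                depth--;
            }
            else if (c == ',' && depth == 0)
            {
                // A separator between two parameters of the current call.
                result.Add(current.ToString().Trim());
                current.Clear();
                continue;
            }
            current.Append(c);
        }

        result.Add(current.ToString().Trim());
        return result;
    }
}
```

Calling ParamSplitter.Split on @AfterText('School','S')@,'l' gives two entries, the nested call and 'l', because the comma inside the nested call sits at depth 1. Quotes are kept on the returned parameters so the caller can still tell strings from numbers and booleans.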

Consider the following Input:-

@BeforeText(@AfterText('School','S')@,'l')@

In the first call to the Substitute function, the pattern matches the whole input: @BeforeText(@AfterText('School','S')@,'l')@. Here, BeforeText is the name sub-group while @AfterText('School','S')@,'l' is the paramstring sub-group. The next recursive call to the substitution function passes the input: @AfterText('School','S')@.

The problem now is that multiple functions at the same level cannot be evaluated.

@Log10(0)@ Some Intermediate Text @Log(0)@

The pattern matches the entire message: @Log10(0)@ Some Intermediate Text @Log(0)@. But what I want it to do is match @Log10(0)@ and @Log(0)@ separately. Excluding the symbols @ ( ) from the paramstring will not work, as that would prevent nested functions from being evaluated. I am wondering whether there is something like recursive pattern matching in .NET and whether it could help in this matter.
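For reference, .NET has no recursive pattern syntax such as PCRE's (?R), but it does support balancing group definitions, which can count nested openers and closers. The sketch below is one hedged way to apply them to this syntax; the class and method names are made up for illustration, and the pattern assumes that @Name( and )@ never occur inside string parameters.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class BalancedCalls
{
    // Matches one call together with any calls nested inside its parameters.
    static readonly Regex Call = new Regex(@"
        @(?<name>[a-z0-9]+)\(
        (?<paramstring>
            (?>
                @[a-z0-9]+\((?<open>)        # nested opener: push onto the open stack
              | \)@(?<-open>)                # nested closer: pop, fails if the stack is empty
              | (?!@[a-z0-9]+\(|\)@).        # any character that starts neither delimiter
            )*
        )
        (?(open)(?!))                        # fail if a nested call is still open
        \)@",
        RegexOptions.IgnoreCase | RegexOptions.Singleline |
        RegexOptions.IgnorePatternWhitespace);

    // Returns { name, paramstring } for every top-level call in the message.
    public static List<string[]> TopLevelCalls(string message)
    {
        var calls = new List<string[]>();
        foreach (Match m in Call.Matches(message))
        {
            calls.Add(new[] { m.Groups["name"].Value, m.Groups["paramstring"].Value });
        }
        return calls;
    }
}
```

On the two-function message above, TopLevelCalls should report Log10 and Log as separate calls, while a nested input still comes back as a single outer call whose paramstring contains the inner call. The existing recursive Substitute could then be applied to each paramstring as before.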
 
faulty.lee
post May 14 2008, 11:50 AM
Post #5


I'm not that good with regex, but I do have a suggestion: maybe you can count matching '@(' and ')@' pairs. For every '@(' you increment a counter, and for every ')@' you decrement it, like reference counting in C++. When the counter returns to 0, the current function has ended and you need to look for the next function instead of skipping over it. Get the index of the last matching '@(' ')@' pair; then you can either chop off the matched set of '@(' and ')@' and match the regex again, or use the Match overload where you can specify the starting position for matching (but you lose the RegexOptions).
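The counting idea can be sketched roughly as follows. This is an illustration under assumptions, not tested against the real function set: the DepthCounter class is hypothetical, openers are taken to be @Name( rather than a bare @(, and delimiters inside string parameters are not handled.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class DepthCounter
{
    // Either an opener such as "@Name(" or the closer ")@".
    static readonly Regex Token = new Regex(@"@[a-z0-9]+\(|\)@", RegexOptions.IgnoreCase);

    // Returns the text of every top-level call, nested calls included.
    public static List<string> TopLevelSpans(string message)
    {
        var spans = new List<string>();
        int depth = 0;
        int start = -1;

        foreach (Match t in Token.Matches(message))
        {
            if (t.Value[0] == '@')
            {
                // Opener: remember where an outermost call begins.
                if (depth++ == 0) start = t.Index;
            }
            else if (depth > 0 && --depth == 0)
            {
                // Closer that brings the counter back to 0: one call is complete.
                spans.Add(message.Substring(start, t.Index + t.Length - start));
            }
        }
        return spans;
    }
}
```

Each returned span can then be handed to the existing pattern on its own, so the greedy .* no longer runs across two sibling functions.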
 
turbopowerdmaxst...
post May 14 2008, 11:59 AM
Post #6


I would have done it that way back in the old days but ever since realizing the power of Regex, it just doesn't feel right. It is my last option, though. I have just come across some interesting ways of matching such constructs but I can't seem to get them to work. Any ideas?
 
Moudey
post Oct 23 2008, 10:53 PM
Post #7


Please, how can I parse this function?