cfRegeX


cfRegex Features

This page offers a quick overview of the features cfRegex provides.

For in-depth information, the documentation goes into detail for everything mentioned here, and more.

What Do You Think?

I want to know what people's views are on these features. Do they do everything you would expect? Does anything here seem like a mistake? If you have any thoughts, come and have a chat about them on the mailing list.


More, Better Functions

Regex support in core CFML currently consists of just three functions, and despite this there is still inconsistency!

With cfRegex the functions and arguments have been carefully considered to ensure that argument order is always consistent, that arguments which make sense are available, and that a range of useful functions are available.

To avoid conflicts or potential confusion, the functions are named "Regex[Action]" instead of the existing "Re[Action]".

Newer Regex Engine

The regex engine used by CF is Apache ORO, which has limited functionality compared to other regex engines, and was retired by Apache in September 2010.

The cfRegex project is built on top of the more powerful Java regex engine (java.util.regex), which provides new regex features not available with the Apache ORO Engine.

This is supplemented with additional easy-to-use functionality, making cfRegex a very flexible regex tool.
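As one illustration of what java.util.regex offers, the Java sketch below uses lookbehind and named groups directly against the engine. I am assuming, rather than asserting, that these are among the features ORO lacks. The class and method names are hypothetical and are not part of cfRegex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: plain java.util.regex, not the cfRegex API.
final class EngineFeatureSketch {

    // Lookbehind: match a run of digits only when a literal "$" precedes it.
    static List<String> dollarAmounts(String text) {
        Matcher m = Pattern.compile("(?<=\\$)\\d+").matcher(text);
        List<String> found = new ArrayList<>();
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    // Named groups: refer to a capture by name instead of by position.
    static String yearOf(String isoDate) {
        Matcher m = Pattern
            .compile("(?<year>\\d{4})-(?<month>\\d{2})-(?<day>\\d{2})")
            .matcher(isoDate);
        return m.matches() ? m.group("year") : null;
    }

    public static void main(String[] args) {
        System.out.println(dollarAmounts("Costs $15 or 20 euros, was $30"));
        System.out.println(yearOf("2010-09-01"));
    }
}
```

The "20" in the sample text is skipped because no "$" precedes it, which is the sort of context-sensitive match that is awkward to express without lookbehind.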


cfregex Tag

When working with a long or complex regex, spreading it across multiple lines and adding inline comments can be vital for maintainability.

The cfregex tag makes this just as easy as multiline commented SQL is with cfquery.

To put it another way, imagine coming across some code with this expression all on one line:

(\w+)(?:\nAlias ([^\n]+))?((?:\n\t\w+ \S+ .*$)*)\n([^\n]+)$((?:\n.*$)+?(?=\n\n+))

Wouldn't you much rather something more like:

<cfregex
	action     = "match"
	returntype = "groups"
	text       = #SectionText#
	variable   = "MatchedSections"
	>
	## Name            -> Group 1
	(\w+)
	
	## Optional Alias  -> Group 2
	(?:\nAlias ([^\n]+))?
	
	## All Arguments   -> Group 3
	## Format [name][type][hint]
	((?:\n\t\w+ \S+ .*$)*)
	
	## Return Type     -> Group 4
	\n([^\n]+)$
	
	## Optional Notes  -> Group 5
	## (Keep matching until double newline)
	((?:\n.*$)+?(?=\n\n+))
</cfregex>

The cfregex tag has an equivalent action for every function, and it can also be used to compile a new regex object. The only difference between the two is that comment mode is on by default for the cfregex tag and off by default for the functions.
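To show what comment mode looks like at the engine level, here is a minimal Java sketch that feeds a similar commented pattern straight to java.util.regex with the Pattern.COMMENTS flag. It is not how the cfregex tag is implemented; the class name, method name and sample text are hypothetical. One assumption worth flagging: in Java's comment mode unescaped spaces in the pattern are ignored, so the sketch writes literal spaces as \x20.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: plain java.util.regex, not the cfregex tag itself.
final class CommentedPatternSketch {

    // "\n" ends a line of the pattern (and so ends the # comment on it);
    // "\\n" is the regex escape that matches a newline in the input.
    static final Pattern SECTION = Pattern.compile(
          "(\\w+)                              # Name           -> group 1\n"
        + "(?:\\nAlias\\x20([^\\n]+))?         # Optional alias -> group 2\n"
        + "((?:\\n\\t\\w+\\x20\\S+\\x20.*$)*)  # All arguments  -> group 3\n"
        + "\\n([^\\n]+)$                       # Return type    -> group 4\n"
        + "((?:\\n.*$)+?(?=\\n\\n+))           # Notes          -> group 5\n",
        Pattern.COMMENTS | Pattern.MULTILINE);

    // Returns the five groups (whitespace-trimmed where useful), or null if nothing matches.
    static String[] parse(String sectionText) {
        Matcher m = SECTION.matcher(sectionText);
        if (!m.find()) {
            return null;
        }
        return new String[] {
            m.group(1),          // name
            m.group(2),          // alias, null when absent
            m.group(3).trim(),   // argument lines
            m.group(4),          // return type
            m.group(5).trim()    // notes
        };
    }

    public static void main(String[] args) {
        String sample = "Trim\nAlias Strip\n\ttext string The text to trim\n"
                      + "string\nRemoves whitespace.\n\nNext";
        for (String part : parse(sample)) {
            System.out.println(part);
        }
    }
}
```

Pattern.MULTILINE is needed here so that each $ can match at the end of a line rather than only at the end of the input.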

Regex Object

Usually, when working with regex in CFML, you have no control over whether a regex pattern is created every time, or if it is cached in memory for future use - indeed this behaviour differs between the different CFML engines.

If you are using a regex once, you don't need to keep it around after use, but if you will be using it many times it is more efficient to create a single object which can be used multiple times and doesn't need to be repeatedly re-parsed.

cfRegex provides programmers with a Regex Object, which is created from a compiled pattern, and can be re-used as much as needed.
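The compile-once idea can be sketched with the underlying Java engine, where a java.util.regex.Pattern is parsed a single time and then reused for every call. This only illustrates the principle; the class and method names are hypothetical and it does not show the cfRegex Regex Object's own interface.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: plain java.util.regex, not the cfRegex Regex Object.
final class ReusablePatternSketch {

    // Parsed once, when the class loads. Pattern instances are immutable
    // and can safely be shared between calls and threads.
    private static final Pattern WORD = Pattern.compile("\\b\\w+\\b");

    // Each call creates only a lightweight Matcher; the pattern is not re-parsed.
    static int countWords(String line) {
        Matcher m = WORD.matcher(line);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String[] lines = { "one two  three", "", "re-used as needed" };
        for (String line : lines) {
            System.out.println(countWords(line));
        }
    }
}
```

Calling Pattern.compile inside countWords would give the same results, but would repeat the parsing work on every call, which is the cost a reusable object avoids.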

Other Features

There are various features planned for future versions of cfRegex, but only if there is sufficient interest.

Examples of features under consideration are...

For more details on what is planned, see the roadmap.

It is important to note that features will not be added if nobody wants them. If you would find any of the above useful, or there are any others that you'd like to see implemented, make sure you tell us.

The more people who request a feature, the greater the chances it will be implemented.