Friday, February 4, 2011

Regular Expressions in Ruby

I'm about halfway through Chapter 4 of Beginning Ruby. The book has had me write bits of code like so:

puts "This is a test".scan(/\w+/).length

(which counts the words in "This is a test")
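
To see what that one-liner is doing, it can help to pull the pieces apart. This is just a small sketch of the same call; the variable names are mine, not the book's.

```ruby
text = "This is a test"

# scan returns an array holding every non-overlapping match of the pattern.
words = text.scan(/\w+/)

p words            # => ["This", "is", "a", "test"]
puts words.length  # => 4
```

So length is not counting anything inside the regular expression; it is just the size of the array that scan hands back.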

It was easy enough to grasp WHAT the code did, but HOW it did it was a bit trickier. I get hung up on these kinds of things, and I had no idea what the / and \ were for. But I've got it now. The forward slashes signal the beginning and end of a regular expression. The \ is an escape: in front of a character that has a special meaning in regular expressions, such as . or ?, it tells Ruby to treat that character literally and not as code. Characters that have no special meaning don't need it. (In front of certain letters the backslash does the reverse and creates a shorthand, which is what \w is: any word character.)
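
Here is a quick way to check the backslash behavior in irb. The sample strings are made up for illustration.

```ruby
sample = "a.b?c"

# Unescaped, the dot is a wildcard: it matches any character except a newline.
wildcard_matches = sample.scan(/./)

# Escaped, the dot matches only a literal period.
literal_dots = sample.scan(/\./)

# The same goes for ?, which is otherwise a quantifier.
literal_marks = sample.scan(/\?/)

# ! has no special meaning on its own, so it needs no backslash.
bangs = "Hi! Bye!".scan(/!/)

p wildcard_matches.length  # => 5
p literal_dots             # => ["."]
p literal_marks            # => ["?"]
p bangs                    # => ["!", "!"]
```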

And the pipes separate alternatives inside a regular expression: the pattern matches if any one of the alternatives does. So in a bit of code such as

We have the forward slashes signaling the beginning and end of the regular expression, the backslashes telling Ruby that the . and the ? are literal characters to be matched, not code, and the pipes telling Ruby that another alternative follows. And again, a backslash is not needed for the exclamation mark, because it has no other meaning inside of a regular expression.
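
For anyone who wants to try these ingredients together, here is a hypothetical stand-in pattern. It is my own illustration, not the code from the book, and the sample text is made up.

```ruby
# Hypothetical pattern (an assumption, not the book's code): match a literal
# period, OR a literal question mark, OR an exclamation mark.
sentence_end = /\.|\?|!/

text = "Is this a test? Yes. It works!"

# Every place where one of the three alternatives matched.
endings = text.scan(sentence_end)

# Splitting on the same pattern leaves the text between the matches.
sentences = text.split(sentence_end).map(&:strip)

p endings    # => ["?", ".", "!"]
p sentences  # => ["Is this a test", "Yes", "It works"]
```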

I think the best thing I can do is to remember to get a handle on big picture things, and not get hung up on the syntax.

1 comment:

  1. Looks like the regex you posted didn't show up in the article. Can you repost?

    \w means "any character that makes up a word". If I remember correctly it's just a shorthand way of writing something like [a-zA-Z0-9_]. So \w+ means any run of one or more word characters in a row, as long as possible.

    Pipes actually mean alternation (match this or that), most useful when the alternatives are more than one character long.

    /abc|def|ghi/ means match abc, def or ghi

    Or more complex options:

    /(http|ftp):\/\/.*/ matches http:// or ftp:// followed by anything

    Here's a great reference of almost all the popular regex options
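
The points in the comment are easy to try out. This is a rough sketch, with the alternation written using pipes; the sample strings and variable names are made up for illustration.

```ruby
# Parentheses group the alternatives, so :// has to follow either one of them.
url_pattern = /(http|ftp):\/\/.*/

samples = ["http://example.com", "ftp://example.com", "mailto:someone"]

# m[1] is whatever the first group matched; nil means the pattern did not match.
schemes = samples.map do |s|
  m = s.match(url_pattern)
  m && m[1]
end

p schemes  # => ["http", "ftp", nil]

# \w+ and the spelled-out character class find the same runs in an ASCII string.
shorthand = "snake_case 42!".scan(/\w+/)
longhand  = "snake_case 42!".scan(/[a-zA-Z0-9_]+/)

p shorthand              # => ["snake_case", "42"]
p shorthand == longhand  # => true
```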
