Ruby On Rails, Design, Simplicity, Web 2.0, Ajax, Mac and Tons of Pizza.

Jan 24

I'm not a RegExp Master..

Posted by Sandro Paganotti in Ruby on Rails

I’ve tried to create a regular expression that transforms each apostrophe into an escaped one, but my attempt in irb led to some very strange behavior:



irb(main):001:0> "string ' with aphostrophe" 
=> "string ' with aphostrophe" 
irb(main):002:0> "string ' with aphostrophe".gsub("'","\\'")
=> "string  with aphostrophe with aphostrophe" 


Can you explain why the expression I wrote duplicates everything after the apostrophe?
Can you point me to a solution?

Thank you guys.

Sandro

Comments

  • Keeto

    Posted on January 24

    I'm not a regexp master myself, so I don't know why the last two words are repeated. But the solution is to escape the backslashes: "string ' with apostrophe".gsub("'", "\\\\'") That should work.. ^_^"
  • twinwing

    Posted on January 24

    What Keeto said. It has something to do with the string being parsed twice, so after the first pass you end up with only \\ in front of the apostrophe. Just one of the quirks of Ruby...
  • Peter Cooper

    Posted on January 24

    \' references $', a special Ruby variable set after a regular expression match that holds "the characters to the right of the match". $` holds the characters to the left. For example: "string ' with aphostrophe".gsub(/'/, "\\\`") gives "string string  with aphostrophe". There's a list of these strings here. It works this way so that back-references can be obtained (\1 -> $1, \2 -> $2, etc.). What Keeto suggests will do what you want, even if it looks "wrong" in irb because irb escapes the results.
  • Sandro

    Posted on January 24

    Wow, thank you Keeto for your suggestion, and twinwing and Peter for the explanation! (And for the useful link :)
  • Dominik

    Posted on January 24

    Damn! ;-) I assume I figured this out just when you were writing your comment, Peter.
  • Radarek

    Posted on January 24

    It took me a while to understand what's going on, but I finally got it. When you pass a string as the first parameter to gsub, it is converted to a regexp, effectively Regexp.new(Regexp.escape(param)), so it still matches literally. The second parameter is of course a string, but it is first interpreted by Ruby (so "\\'" gives \') and then interpreted by the regexp engine. As we know, we can use some special character combinations there, for example \1. \' has a special meaning too: it is replaced by the part of the string from the end of the current match to the end of the string. If you want to prevent \' from being interpreted that way, you must add another escaped backslash: "\\\\'" => \\'.

    > Just one of the quirks of Ruby...

    I think the quirk is somewhere else. Why the hell does Ruby convert the string to a regexp? If I want to use a string, I mean a string, not a regexp ;). Double interpretation (by the string parser and the regexp engine) is OK.
  • Ryan

    Posted on January 24

    Just in case you aren't aware of it, I've been using Rubular a lot lately. It's a Ruby regular expression tester. It seems to work pretty well, and you can visually see what the matches are as you type, which is helpful for someone new to regex or still learning it.
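
The explanations in the comments above can be checked with a short sketch. The helper names below (naive_escape, escape_with_backslashes, escape_with_block) are hypothetical and exist only for illustration. The block form is not mentioned in the thread; it is shown here as one possible alternative, relying on the fact that gsub inserts a block's return value without interpreting backslash sequences.

```ruby
# Hypothetical helper names, used only for this illustration.

# The attempt from the post: the literal "\\'" is the two characters \',
# which gsub expands to the text that follows the match.
def naive_escape(text)
  text.gsub("'", "\\'")
end

# Keeto's fix: the literal "\\\\'" is the three characters \\',
# which gsub reduces to one literal backslash plus the apostrophe.
def escape_with_backslashes(text)
  text.gsub("'", "\\\\'")
end

# Alternative (an assumption, not from the thread): with a block,
# the returned string is inserted verbatim.
def escape_with_block(text)
  text.gsub("'") { "\\'" }
end

sample = "string ' with apostrophe"

puts naive_escape(sample)            # text after the match is duplicated
puts escape_with_backslashes(sample) # string \' with apostrophe
puts escape_with_block(sample)       # string \' with apostrophe

# \` expands to the text before the match, as in Peter's example.
puts sample.gsub("'", "\\`")

# Numbered back-references use the same mechanism.
puts "john smith".sub(/(\w+) (\w+)/, "\\2 \\1") # smith john
```

puts is used instead of relying on irb's return value display because irb prints results in escaped form, so a single backslash shows up as \\ there.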
