Monday, April 09, 2007

Email Pattern With Regex

I was looking for a good pattern to verify an email address with a regular expression. A quick search on Google led me to this link, which quickly gave me a headache, since the code is very long and confusing (try it yourself if you don't believe me). I believed there were plenty of good patterns out there, so I looked in the offline version of Microsoft's MSDN and found this pattern:
"^([\w-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([\w-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$"

I started testing this pattern and it gave good results at first, until I tried some combinations such as user.@domain.com or user-@domain.com. The pattern above fails to flag these as invalid email addresses (I don't know whether they are valid according to RFC 822, but I have never seen anyone use that kind of format). So I modified the pattern, and here is my version, which detects that format as an invalid email address:
"^([\w]+)(([-\.][\w]+)?)*@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([\w-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$"

What I added is a constraint on the characters that follow a hyphen or a dot. If there is a dot or hyphen before the "@" sign, it must be followed by at least one word character, or else the address is rejected. You can have as many dots or hyphens as you like, so an address such as you.long.name.here@domain.com is still accepted.
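To make the difference concrete, here is a minimal sketch that runs both patterns against the addresses mentioned above. It uses Python only for illustration (the post does not name a language), and the names ORIGINAL, MODIFIED and is_valid are made up for this example. Two adaptations are assumptions on my part: the class [\w-\.] is written as [\w.-] because Python's re module rejects the original form as a bad character range, and the group (([-\.][\w]+)?)* is written as the equivalent (?:[-.]\w+)*.

```python
import re

# Pattern as quoted from MSDN, with one adaptation for Python:
# [\w-\.] is written [\w.-] (same characters, hyphen moved to the end).
ORIGINAL = re.compile(
    r"^([\w.-]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([\w-]+\.)+))"
    r"([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$"
)

# Modified pattern: the part before "@" must start with word characters,
# and every dot or hyphen must be followed by at least one word character.
# (([-\.][\w]+)?)* is written here as the equivalent (?:[-.]\w+)*.
MODIFIED = re.compile(
    r"^(\w+)(?:[-.]\w+)*@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([\w-]+\.)+))"
    r"([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$"
)


def is_valid(pattern, address):
    """Return True if the whole address matches the given compiled pattern."""
    return pattern.match(address) is not None


# Print one line per address: address, result with ORIGINAL, result with MODIFIED.
for address in (
    "user.@domain.com",
    "user-@domain.com",
    "you.long.name.here@domain.com",
):
    print(address, is_valid(ORIGINAL, address), is_valid(MODIFIED, address))
```

The first two addresses are accepted by the original pattern and rejected by the modified one, while the long dotted name is accepted by both.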

I hope this pattern is useful to other people who are having trouble finding a good pattern for validating email addresses.

I know that this pattern is far from perfect, so if you find any problem or bug, please let me know and I'll update it as needed.