Regular Expression in R for twitter username

I am looking to handle any text between @ and : without any spaces, for example @rstat:. I would like to form a regular expression to handle this. I have tried ^@.[A-z0-9_].:$ but it's not working.

Kindly help me here.

1 answer

  • answered 2017-09-14 16:29 Wiktor Stribiżew

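    The ^@.[A-z0-9_].:$ pattern matches the start of string (^), then a @, then any one char (with .), then one char from the [A-z0-9_] class (a letter, a digit, _, or one of `, [, \, ], ^, because the A-z range also spans the punctuation between Z and a), then any one char again, a : and the end of string ($). So it only matches five-character strings such as @§`‘:, and it cannot match @rstat:.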
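
    To make that diagnosis concrete, here is a minimal base R sketch. The test strings other than "@rstat:" are made up for illustration, and perl = TRUE is an assumption chosen so that the A-z range is read by code point rather than by locale.

```r
# Pattern from the question: "@", one char, one class char, one char, ":".
pattern <- "^@.[A-z0-9_].:$"

# "@rstat:" is seven characters long, so the five-character pattern fails.
matches_rstat <- grepl(pattern, "@rstat:", perl = TRUE)

# A made-up five-character string does match.
matches_short <- grepl(pattern, "@abc:", perl = TRUE)

# [A-z] accepts the ASCII punctuation that sits between "Z" and "a".
punct_in_range <- grepl("^[A-z]$", c("[", "^", "`"), perl = TRUE)

print(matches_rstat)   # FALSE
print(matches_short)   # TRUE
print(punct_in_range)  # TRUE TRUE TRUE
```

    If only ASCII letters are wanted, [A-Za-z] avoids the stray punctuation.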

    You may use str_extract_all from the stringr package like this:

    library(stringr)
    str_extract_all(x, "(?<=@)[^\\s:]+")
    

    If you must check for the : presence, add a lookahead check:

    str_extract_all(x, "(?<=@)[^\\s:]+(?=:)")
                                      ^^^^^
    

    See the regex demo.
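
    As a quick check of both calls, here is a sketch on a small input vector. The vector x is a hypothetical sample (the question only mentions @rstat:), and the handle @user1 is invented for the example.

```r
library(stringr)

# Hypothetical sample input: one handle followed by ":", one not, one string without "@".
x <- c("@rstat: new release", "thanks @user1 for this", "no mention here")

# Every run of non-whitespace, non-colon characters that follows "@".
all_handles <- str_extract_all(x, "(?<=@)[^\\s:]+")

# Same, but kept only when a ":" comes right after the run.
colon_handles <- str_extract_all(x, "(?<=@)[^\\s:]+(?=:)")

print(all_handles)
print(colon_handles)
```

    str_extract_all returns a list with one character vector per input string, so strings without a match yield character(0) rather than NA.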

    Details

    • (?<=@) - a location in the string that is immediately preceded by an @ symbol
    • [^\\s:]+ - 1 or more (due to +) chars other than whitespace and :
    • (?=:) - a positive lookahead that requires the presence of : immediately to the right of the current location.
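
    The effect of each piece can be seen by adding them one at a time. The sketch below uses base R with perl = TRUE instead of stringr, which is a swapped-in but equivalent route for these patterns; the helper name extract_all and the sample string are made up for illustration.

```r
# Hypothetical sample string, not taken from the question.
s <- "@rstat: hello @other"

# Illustrative helper: all matches of a PCRE pattern in a single string.
extract_all <- function(pattern, text) {
  regmatches(text, gregexpr(pattern, text, perl = TRUE))[[1]]
}

# Without the lookbehind, the "@" is consumed and returned with the match.
with_at <- extract_all("@[^\\s:]+", s)

# With the lookbehind, "@" must be there but is left out of the result.
without_at <- extract_all("(?<=@)[^\\s:]+", s)

# Adding the lookahead keeps only the handle that is followed by ":".
before_colon <- extract_all("(?<=@)[^\\s:]+(?=:)", s)

print(with_at)       # "@rstat" "@other"
print(without_at)    # "rstat" "other"
print(before_colon)  # "rstat"
```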