Finding a substring in a text column that starts and ends with a specific string

I'm trying to scan a text dataframe column and retrieve a substring that starts with one specific string and ends with another. I tried combining substring with instr but couldn't get it working.

1 answer

  • answered 2018-03-13 20:50 LeMoN.xaH

    You could use a regex with pattern matching to achieve this:

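    @ def getText(startsWith: String, endsWith: String)(text: String): (String, String) = {
        // Group 1 is the whole match, group 2 is the text between the delimiters.
        val rr = s"($startsWith(.+?)$endsWith)".r
        text match {
          // Matches only when the entire string fits the pattern.
          case rr(all, partial) => (all, partial)
          case _ => ("", "")
        }
      }
    defined function getText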
    @ getText("1", "2")("1hdfjhsdf2") 
    res5: (String, String) = ("1hdfjhsdf2", "hdfjhsdf")

    This should do what you want.
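
One caveat with getText above: a Scala regex used in a `case` pattern has to match the entire input, so the function returns ("", "") when the delimited substring sits inside longer text. The delimiters are also interpolated as raw regex, so characters such as `[` or `.` would be read as regex syntax. A minimal sketch of a variant that searches anywhere in the text and treats the delimiters literally is shown below. The names `DelimitedText` and `findBetween` are hypothetical, and returning an `Option` instead of empty strings is a design choice of this sketch, not part of the original answer.

```scala
import java.util.regex.Pattern

object DelimitedText {
  // Hypothetical helper: returns (full match, inner text) for the first
  // occurrence found anywhere in `text`, or None when nothing matches.
  def findBetween(start: String, end: String)(text: String): Option[(String, String)] = {
    // Pattern.quote makes both delimiters literal; (?s) lets '.' span newlines.
    // The lazy (.+?) stops at the first occurrence of `end` and needs at
    // least one character between the delimiters.
    val pattern = ("(?s)" + Pattern.quote(start) + "(.+?)" + Pattern.quote(end)).r
    pattern.findFirstMatchIn(text).map(m => (m.matched, m.group(1)))
  }
}
```

For example, `DelimitedText.findBetween("1", "2")("abc 1hdfjhsdf2 xyz")` yields `Some(("1hdfjhsdf2", "hdfjhsdf"))`, while text with no matching pair yields `None`.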