How to rename an existing Spark SQL function

I am using Spark to call functions on the data which is submitted by the user.

How can I rename an already existing function to a different name, e.g. REGEXP_REPLACE to REPLACE?

I tried the following code:

ss.udf.register("REPLACE", REGEXP_REPLACE)           // This doesn't work
ss.udf.register("sum_in_all", sumInAll)
ss.udf.register("mod", mod)
ss.udf.register("average_in_all", averageInAll)

1 answer

  • answered 2017-12-11 07:12 philantrovert

Import it with an alias:

import org.apache.spark.sql.functions.{regexp_replace => replace}
    df.show
    +---+
    | id|
    +---+
    |  0|
    |  1|
    |  2|
    |  3|
    |  4|
    |  5|
    |  6|
    |  7|
    |  8|
    |  9|
    +---+
    
df.withColumn("replaced", replace($"id", "(\\d)", "$1+1")).show
    
    +---+--------+
    | id|replaced|
    +---+--------+
    |  0|     0+1|
    |  1|     1+1|
    |  2|     2+1|
    |  3|     3+1|
    |  4|     4+1|
    |  5|     5+1|
    |  6|     6+1|
    |  7|     7+1|
    |  8|     8+1|
    |  9|     9+1|
    +---+--------+
    

To do it with Spark SQL, you'll have to register the function in Hive under a different name:

sqlContext.sql(""" create temporary function replace
                   as 'org.apache.hadoop.hive.ql.udf.UDFRegExpReplace' """)
    
sqlContext.sql(""" select replace("a,b,c", ",", ".") """).show
    +-----+
    |  _c0|
    +-----+
    |a.b.c|
    +-----+
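A third route, and the reason the question's first line fails: `REGEXP_REPLACE` is a SQL function name, not a Scala value, so it cannot be passed to `udf.register` directly. You can instead register a plain Scala function with equivalent semantics under the new name. This is only a sketch: the session variable `spark` (a Spark 2.x `SparkSession`) and the wrapper name `replaceFn` are assumptions, not part of the original answer.

```scala
// Sketch: expose regexp-replace semantics to Spark SQL under a new name.
// `replaceFn` is a hypothetical wrapper; it uses java.util.regex, the same
// engine Spark's regexp_replace is built on.
val replaceFn = (s: String, pattern: String, replacement: String) =>
  s.replaceAll(pattern, replacement)

// Inside a Spark shell (with a SparkSession named `spark`), these lines
// would make it callable from SQL:
// spark.udf.register("REPLACE", replaceFn)
// spark.sql("""SELECT REPLACE('a,b,c', ',', '.')""").show()
```

Note that a Scala UDF like this is opaque to Catalyst, so it loses the optimizations the built-in `regexp_replace` gets; the import-alias or Hive-registration approaches above are preferable when they fit.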