Provide partition data based on month using hive

I have data in text file like:

323.81.303.680 - - [25/Oct/2011:01:41:00 -0500] "GET /download/download6.zip HTTP/1.1" 200 0 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.19) Gecko/2010031422 Firefox/3.0.19"

created table schema for above data using regexserde:

CREATE TABLE IF NOT EXISTS weblogs (
  host STRING,
   identity STRING,
   apache_user STRING,
   time STRING,
   request STRING,
   status STRING,
   size STRING,
   referer STRING,
   agent STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
"input.regex" = "([^ ]*) ([^ ]*) ([^ ]*) (-|\\[[^\\]]*\\]) ([^ \"]*|\"[^\"]*\") (-|[0-9]*) (-|[0-9]*)(?: ([^ \"]*|\"[^\"]*\") ([^ \"]*|\"[^\"]*\"))?",
"output.format.string" = "%1$s %2$s %3$s %4$s %5$s %6$s %7$s %8$s %9$s"
) 
STORED AS TEXTFILE;

and loaded the data properly

but here the problem is I need to Provide the partition data based on monthly

im getting time : [25/Oct/2011:01:41:00 -0500] as string.

I need to partition the data based on monthly, above time field is string type so please any one helps me to partition the all the data based on year and month wise.

1 answer

  • answered 2017-06-17 18:24 Dudu Markovitz

    select  time
           ,unix_timestamp(time,'[dd/MMM/yyyy:HH:mm:ss z]')                                                                                                         
           ,from_unixtime(unix_timestamp(time,'[dd/MMM/yyyy:HH:mm:ss z]'))
           ,to_utc_timestamp(from_unixtime(unix_timestamp(time,'[dd/MMM/yyyy:HH:mm:ss z]')),'PST')
           ,date_format(to_utc_timestamp(from_unixtime(unix_timestamp(time,'[dd/MMM/yyyy:HH:mm:ss z]')),'PST'),'YYYY-MM') 
    
    from    weblogs
    ;
    

    +------------------------------+---------------+---------------------+----------------------------+---------+
    |             time             |      c1       |         c2          |             c3             |   c4    |
    +------------------------------+---------------+---------------------+----------------------------+---------+
    | [25/Oct/2011:01:41:00 -0500] | 1,319,524,860 | 2011-10-24 23:41:00 | 2011-10-25 06:41:00.000000 | 2011-10 |
    +------------------------------+---------------+---------------------+----------------------------+---------+