If NA, replace the value with a geocoded value - R (geocode)

I have a data frame. Where lat is NA, I want a for loop to look up the geocode again and write the result into the data frame.

    Country  Continent       long       lat
Netherlands     Europe         NA        NA
     Norway     Europe   8.468946  60.47202
     Poland     Europe  19.145136  51.91944

library(ggmap)
geocode("CountryName") returns the longitude and latitude for that name.

How can I make R loop over each row of the data frame df, check for NA, and, where a value is NA, fetch the geocode and write it back into df?

Please help me with this. Thanks.

1 answer

  • answered 2018-01-11 19:42 oszkar

    My answer is fundamentally the same as mentioned above in Gregor's comment, but with a working example.

    After running the following commands in R:

    library(ggmap) # for using command 'geocode'
    
    # setting up a sample dataframe with missing longitudes and latitudes data
    df <- data.frame(Country = c('Netherland', 'Norway', 'Poland'), 
                     Continent = rep('Europe', 3),
                     long = c(NA, 8.468946, 19.145136),
                     lat = c(NA, 60.47202, 51.91944))
    # print the dataframe
    df
    

    You will get the following output:

         Country Continent      long      lat
    1 Netherland    Europe        NA       NA
    2     Norway    Europe  8.468946 60.47202
    3     Poland    Europe 19.145136 51.91944
    

    To fill in the missing longitudes and latitudes, run the following commands:

    # looking for rows where longitude is missing
    missing.long <- is.na(df$long)
    # getting the missing longitude for the above TRUE marked rows
    df[missing.long, 'long'] <- geocode(as.character(df$Country[missing.long]))$lon
    # looking for rows where latitude is missing
    missing.lat <- is.na(df$lat)
    # getting the missing latitude for the above TRUE marked rows
    df[missing.lat, 'lat'] <- geocode(as.character(df$Country[missing.lat]))$lat
    # print the dataframe
    df
    

    And you will get the output:

         Country Continent      long      lat
    1 Netherland    Europe  5.291266 52.13263
    2     Norway    Europe  8.468946 60.47202
    3     Poland    Europe 19.145136 51.91944
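
    Since the question asked for a for loop, here is a rough row-by-row sketch of the same idea. It is not needed for correctness, because the vectorized version already handles every row. To keep it runnable without network access, it uses lookup_coords, a hypothetical stand-in I made up for geocode; it only knows the one value shown in the output above. For real use you would call geocode instead, and as far as I know recent ggmap releases need a Google API key registered with register_google before geocode works.

    ```r
    # Hypothetical offline stand-in for ggmap::geocode() (an assumption, not part of ggmap).
    # Like geocode(), it returns a data frame with 'lon' and 'lat' columns.
    lookup_coords <- function(place) {
      known <- data.frame(name = "Netherland",
                          lon = 5.291266,
                          lat = 52.13263,
                          stringsAsFactors = FALSE)
      hit <- known[match(place, known$name), c("lon", "lat")]
      rownames(hit) <- NULL
      hit
    }

    # same sample data frame as above
    df <- data.frame(Country = c("Netherland", "Norway", "Poland"),
                     Continent = rep("Europe", 3),
                     long = c(NA, 8.468946, 19.145136),
                     lat = c(NA, 60.47202, 51.91944),
                     stringsAsFactors = FALSE)

    # visit each row; only rows with a missing coordinate trigger a lookup
    for (i in seq_len(nrow(df))) {
      if (is.na(df$long[i]) || is.na(df$lat[i])) {
        coords <- lookup_coords(df$Country[i])  # swap in geocode(...) for real use
        if (is.na(df$long[i])) df$long[i] <- coords$lon
        if (is.na(df$lat[i]))  df$lat[i]  <- coords$lat
      }
    }
    df
    ```

    The loop makes one lookup per incomplete row, so for a large data frame the vectorized commands are probably the better choice.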
    

    Of course, if the longitude and latitude are always missing together, you don't need separate missing.long and missing.lat vectors.
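
    A minimal sketch of that single-vector variant, which also makes only one lookup per incomplete row instead of two. The names missing.coords and lookup_coords are mine; lookup_coords is a hypothetical offline stand-in for geocode so the example runs without network access, and it assumes the lookup returns 'lon' and 'lat' columns the way geocode does.

    ```r
    # Hypothetical offline stand-in for ggmap::geocode() (an assumption, not part of ggmap).
    lookup_coords <- function(places) {
      known <- data.frame(name = "Netherland",
                          lon = 5.291266,
                          lat = 52.13263,
                          stringsAsFactors = FALSE)
      hit <- known[match(places, known$name), c("lon", "lat")]
      rownames(hit) <- NULL
      hit
    }

    df <- data.frame(Country = c("Netherland", "Norway", "Poland"),
                     Continent = rep("Europe", 3),
                     long = c(NA, 8.468946, 19.145136),
                     lat = c(NA, 60.47202, 51.91944),
                     stringsAsFactors = FALSE)

    # one logical vector marking rows where either coordinate is missing
    missing.coords <- is.na(df$long) | is.na(df$lat)

    # skip the lookup entirely when nothing is missing
    if (any(missing.coords)) {
      found <- lookup_coords(df$Country[missing.coords])  # swap in geocode(...) for real use
      # both coordinates of a marked row are replaced, so the pair stays consistent
      df$long[missing.coords] <- found$lon
      df$lat[missing.coords]  <- found$lat
    }
    df
    ```

    Note that a row with only one coordinate missing gets both values overwritten here. If you want to keep the existing value in that case, stay with the separate vectors.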