Computer Diary

    "God grant me the serenity to accept the things I cannot change; courage to change the things I can; and wisdom to know the difference."

    Reinhold Niebuhr

    Here is my little rant-and-praise place, where I write up the daily experiences of my programming work. I publish them with the idea that others might find them useful and benefit from them.

    Tag <Heuristics>

    Check also other posts with other tags.

    Automatically Geotag Photos without GPS
    last edited 2009/05/18 11:30

    Update 2009/04/26: Since the script has grown beyond just a quick hack, I moved it to my technology site: GeoTag, Automatic Geotagging Photos without GPS.

    I took a couple of hundred photos during my past bicycle travels, wrote diaries, and added descriptions to many of the photos.

    I considered geotagging the photos (finding the proper location and its latitude and longitude coordinates), but postponed it after my first attempts. Now I have made another attempt with a place-name database, using cities1000.txt, which lists approx. 85,000 cities with a population over 1,000 (allCountries.txt has 8,000,000 entries, which I will test next). The most useful data in this dataset are the aliases, which list city names in different languages and variations; that made my second attempt a success.

    Finding Location

    First I take cities1000.txt and fill a SQLite database with two tables, geocities and geoalias.

    • geocities holds one entry per city: name, alias, lat, long, district and country code
    • geoalias holds all aliases, each pointing to a geocities entry

    The geoalias table speeds up the lookup, at least in theory.
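
    The two-table layout can be sketched as follows. This is a minimal Python illustration, not the Perl script itself: the column names, the index and the lower-casing of aliases are assumptions, and only the table names come from the text above. The sample row reuses the Wien entry listed further down in this post.

```python
import sqlite3

def build_db(conn):
    """Create the two tables named in the text; the columns are assumed."""
    conn.executescript("""
        CREATE TABLE geocities (
            id INTEGER PRIMARY KEY,
            name TEXT, lat REAL, lon REAL,
            district TEXT, country TEXT
        );
        CREATE TABLE geoalias (
            alias TEXT,
            city_id INTEGER REFERENCES geocities(id)
        );
        -- The index is what makes the alias lookup fast.
        CREATE INDEX geoalias_alias ON geoalias(alias);
    """)

def lookup(conn, name):
    """Return all cities that have the given alias (case-insensitive)."""
    return conn.execute(
        """SELECT c.name, c.district, c.country, c.lat, c.lon
           FROM geoalias a JOIN geocities c ON c.id = a.city_id
           WHERE a.alias = ?""",
        (name.lower(),),
    ).fetchall()

conn = sqlite3.connect(":memory:")
build_db(conn)
conn.execute(
    "INSERT INTO geocities VALUES (?, ?, ?, ?, ?, ?)",
    (2761369, "Wien", 48.2084877601653, 16.3720750808716, "09", "AT"),
)
# Aliases are stored lower-cased so the lookup can ignore case.
conn.executemany("INSERT INTO geoalias VALUES (?, ?)",
                 [("wien", 2761369), ("vienna", 2761369)])
print(lookup(conn, "Vienna"))
```

    Both "Wien" and "Vienna" resolve to the same geocities row, which is the point of keeping the aliases in their own indexed table.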

    Applying Heuristics

    Finding a location in a text isn't as easy as one might think, so I came up with some assumptions (aka heuristics) to find a location.

    Anatomy of City Names

    I assumed a city name always starts with an uppercase letter, followed by lowercase characters. Italian and French city names often consist of multiple terms, where the middle terms may be lowercase but the first and last terms start with an uppercase letter.

    So I ended up with the following patterns, tried

    [A-Z][a-z]+ [A-Za-z]+ [A-Za-z]+ [A-Z][a-z]+
    [A-Z][a-z]+ [A-Za-z]+ [A-Z][a-z]+
    [A-Z][a-z]+ [A-Z][a-z]+
    in that order.
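
    A minimal sketch of how these patterns could be applied, longest first. It is written in Python for illustration (the script itself is Perl); the function name candidate_names is hypothetical, and the word-boundary anchors are an addition of this sketch. The candidates would still have to be checked against the database, since a capitalised word at the start of a sentence can produce a false hit.

```python
import re

# The three patterns from the text, longest first.
PATTERNS = [
    r"[A-Z][a-z]+ [A-Za-z]+ [A-Za-z]+ [A-Z][a-z]+",
    r"[A-Z][a-z]+ [A-Za-z]+ [A-Z][a-z]+",
    r"[A-Z][a-z]+ [A-Z][a-z]+",
]

def candidate_names(text):
    """Return multi-term city-name candidates found in text.

    Patterns are tried in the listed order; within one pattern the
    matches are non-overlapping and reported left to right.
    """
    found = []
    for pattern in PATTERNS:
        for match in re.finditer(r"\b" + pattern + r"\b", text):
            if match.group(0) not in found:
                found.append(match.group(0))
    return found

print(candidate_names("from Castelnuovo di Garfagnana to Lucca"))
print(candidate_names("arrived in New York late"))
```

    Note that the ASCII character classes do not cover names with umlauts or accents such as München; handling those would need wider character classes.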

    Additionally, when multiple locations with the same name are found, I sort them by distance to the last found location. How does this help? E.g. when I make a tour and travel from Prague/Praha to Vienna, and look up Vienna, I get 5 entries:

    • Vienna (VA,US) 38.9012225,-77.2652604 (#4791160)
    • Vienna (WV,US) 39.3270191,-81.5484578 (#4825976)
    • Vienna (GA,US) 32.0915577,-83.7954518 (#4228440)
    • Vienna (IL,US) 37.4153295,-88.8978435 (#4252025)
    • Wien (09,AT) 48.2084877601653,16.3720750808716 (#2761369)

    but the entry I want is "Wien", and this one is geographically closest to the previously looked up "Prag". This small enhancement helped a lot to determine the location correctly.
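
    The sorting step can be sketched like this in Python. The haversine formula and the Earth radius of 6371 km are assumptions of this sketch; the text does not say which distance formula the script uses. The coordinates are the ones listed in this post.

```python
from math import asin, cos, radians, sin, sqrt

def distance_km(a, b):
    """Great-circle distance in km between two (lat, long) pairs (haversine)."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(h))

def sort_by_proximity(candidates, last_found):
    """Order (label, lat, long) candidates by distance to the last found location."""
    return sorted(candidates, key=lambda c: distance_km((c[1], c[2]), last_found))

praha = (50.0878367932108, 14.4241322001241)
vienna_hits = [
    ("Vienna (VA,US)", 38.9012225, -77.2652604),
    ("Vienna (WV,US)", 39.3270191, -81.5484578),
    ("Vienna (GA,US)", 32.0915577, -83.7954518),
    ("Vienna (IL,US)", 37.4153295, -88.8978435),
    ("Wien (09,AT)", 48.2084877601653, 16.3720750808716),
]
# The closest candidate to Praha comes first.
print(sort_by_proximity(vienna_hits, praha)[0][0])  # prints: Wien (09,AT)
```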

    Geotagging Photos

    For internal use on my web-site I defined a file called "list", which resides in the folder of the photos and lists every file with its description, like this:

    0001.jpg    Zurich by night
    0002.jpg    Rapperswil in the morning, after long night

    My little Perl script geotag accepts either locations or filenames. If the argument is a file, it tries to find the location names in it; if it is a 'list' file, it handles it accordingly and prints out a similar 'list' file that I call 'list.geo', which looks like this:

    0001.jpg    geo:name=Zurich (ZH,CH),\
    0002.jpg    geo:name=Rapperswil (SG,CH),\

    the time is the timestamp of the photo.
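
    Reading a 'list' file of the form shown above could look like the following Python sketch. It assumes that the filename contains no whitespace and that everything after the first run of whitespace is the description; the function name parse_list is hypothetical.

```python
def parse_list(lines):
    """Yield (filename, description) pairs from 'list'-style lines."""
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        parts = line.split(None, 1)  # split on the first run of whitespace
        filename = parts[0]
        description = parts[1].strip() if len(parts) > 1 else ""
        yield filename, description

sample = [
    "0001.jpg    Zurich by night\n",
    "0002.jpg    Rapperswil in the morning, after long night\n",
]
for filename, description in parse_list(sample):
    print(filename, "->", description)
```

    Photos without a description come back with an empty string, which marks them as candidates for interpolation.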

    Interpolation of Locations

    Since I have many photos, not all of them have a description, let alone a location in the description. For those I optionally interpolate the location: I take the locations found before and after, and interpolate linearly between them according to the timestamp.

    This works quite well for me, since I usually stop and take a few photos within 1-2 minutes, then ride on to the next location and take photos there again, and I only put a description on the first photo of a sequence. Since it takes me 1-2 hours to reach the next location by bicycle, the timestamp gives a good guess: photos taken in quick succession are also near the location of the first photo, the one with the location description.
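
    The linear interpolation can be written down in a few lines. This Python sketch interpolates latitude and longitude independently, which ignores the curvature of the Earth and is an assumption of the sketch; for stretches of a day's bicycle ride the error is small. The timestamps are assumed to be seconds, and the function name is hypothetical.

```python
def interpolate(t, before, after):
    """Linearly interpolate (lat, long) at time t between two known fixes.

    before and after are (timestamp, lat, long) tuples with
    before[0] <= t <= after[0].
    """
    t0, lat0, lon0 = before
    t1, lat1, lon1 = after
    if t1 == t0:
        return lat0, lon0  # both fixes share a timestamp, nothing to interpolate
    f = (t - t0) / (t1 - t0)  # fraction of the time span that has elapsed
    return lat0 + f * (lat1 - lat0), lon0 + f * (lon1 - lon0)

# A photo taken 2 minutes after a fix in Praha, with the next fix
# (Wien) a hypothetical 2 hours later, stays very close to Praha.
praha = (0, 50.0878367932108, 14.4241322001241)
wien = (7200, 48.2084877601653, 16.3720750808716)
print(interpolate(120, praha, wien))
```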



    • sqlite-3.x, install via local package manager
    • perl module DBD::SQLite
    • perl module Time::HiRes

    Install the Perl modules either with your local package manager, or via CPAN:

    % perl -MCPAN -e 'install DBD::SQLite'
    % perl -MCPAN -e 'install Time::HiRes'


    Copy geotag into /usr/local/bin (as root) or keep it locally. First decompress cities1000.txt.gz:

    % gzip -d cities1000.txt.gz

    Next, run geotag with cities1000.txt in the same directory:

    % ./geotag

    It creates ~/DB/ and populates ~/DB/geotag.db, which takes a while: on a Pentium 4 2.4GHz it takes about 20 minutes to create the SQLite database geotag.db. After that, lookups respond instantly.
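
    For illustration, a bulk load of this kind can be sketched in Python as below. The column layout of the input file (tab-separated, with the id, name, comma-separated aliases, latitude, longitude, country code and district code at the positions named in the docstring) is an assumption of this sketch, as are the table columns. Wrapping all inserts in a single transaction is a common way to keep SQLite bulk loads fast; whether the script does so is not stated.

```python
import sqlite3

def parse_line(line):
    """Split one tab-separated record into a city row and its alias rows.

    Assumed column layout: 0=id, 1=name, 3=comma-separated aliases,
    4=latitude, 5=longitude, 8=country code, 10=district code.
    """
    f = line.rstrip("\n").split("\t")
    city = (int(f[0]), f[1], float(f[4]), float(f[5]), f[10], f[8])
    # The city's own name counts as an alias too; aliases are lower-cased.
    aliases = {f[1].lower()} | {a.lower() for a in f[3].split(",") if a}
    return city, [(alias, city[0]) for alias in sorted(aliases)]

def load(conn, lines):
    """Insert all records inside a single transaction."""
    conn.execute("CREATE TABLE IF NOT EXISTS geocities "
                 "(id INTEGER PRIMARY KEY, name TEXT, lat REAL, lon REAL, "
                 "district TEXT, country TEXT)")
    conn.execute("CREATE TABLE IF NOT EXISTS geoalias "
                 "(alias TEXT, city_id INTEGER)")
    with conn:  # commits once at the end, rolls back on error
        for line in lines:
            city, aliases = parse_line(line)
            conn.execute("INSERT INTO geocities VALUES (?,?,?,?,?,?)", city)
            conn.executemany("INSERT INTO geoalias VALUES (?,?)", aliases)
```

    With a real file this would be called as load(sqlite3.connect("geotag.db"), open("cities1000.txt", encoding="utf-8")).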


    % ./geotag prag
    Praha (52,CZ) lat=50.0878367932108,long=14.4241322001241
    % ./geotag vienna
    Wien (09,AT) lat=48.2084877601653,long=16.3720750808716
    % ./geotag boulder
    Boulder (CO,US) lat=40.0149856,long=-105.2705456
    % ./geotag vienna
    Vienna (IL,US) lat=37.4153295,long=-88.8978435
    % ./geotag paris
    Paris (TN,US) lat=36.3020023,long=-88.3267107
    % ./geotag munich
    München (02,DE) lat=48.1376831438553,long=11.5743541717529
    % ./geotag paris
    Paris (A8,FR) lat=48.85341,long=2.3488
    % ./geotag paris,il,us
    Paris (IL,US) lat=39.611146,long=-87.6961374
    % ./geotag vienna,at
    Wien (09,AT) lat=48.2084877601653,long=16.3720750808716
    % ./geotag -f gpx diary.txt > list.gpx

    so it behaves as I wanted: previously found matches determine the perimeter in which the next location is searched.

    I did a test run on my ./list file with the photo descriptions of my Europe 2008 Tour:

    % ./geotag list > list.geo
    % ./geotag -f gpx list > list.gpx

    and it made 1-2 errors, which I corrected by hand; I also added one waypoint (Rapperswil) so the path doesn't go over a lake - this is the result:

    Note: I had no GPS coordinates to start with; I solely used the descriptions of my photos to derive the waypoints. To display the path I used OpenLayers, borrowed from some of their examples, and referenced list.gpx within the JavaScript code:

       var lgml = new OpenLayers.Layer.GML("GPX", "list.gpx", {
          format: OpenLayers.Format.GPX,
          style: {
              strokeColor: 'red', strokeWidth: 5,
              strokeOpacity: 0.5 },
          projection: new OpenLayers.Projection("EPSG:4326")
       });
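
    The list.gpx file loaded above is plain XML. A minimal Python sketch of writing such a file from a sequence of found locations follows; that the output is a single GPX 1.1 track with named track points is an assumption of this sketch, since the exact GPX structure the script emits is not shown here. The creator string and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET

GPX_NS = "http://www.topografix.com/GPX/1/1"

def to_gpx(points):
    """Build a GPX 1.1 document with one track from (name, lat, long) tuples."""
    gpx = ET.Element("gpx", version="1.1", creator="geotag-sketch", xmlns=GPX_NS)
    segment = ET.SubElement(ET.SubElement(gpx, "trk"), "trkseg")
    for name, lat, lon in points:
        point = ET.SubElement(segment, "trkpt", lat=str(lat), lon=str(lon))
        ET.SubElement(point, "name").text = name
    return ET.tostring(gpx, encoding="unicode")

print(to_gpx([
    ("Praha (52,CZ)", 50.0878367932108, 14.4241322001241),
    ("Wien (09,AT)", 48.2084877601653, 16.3720750808716),
]))
```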

    I added some verbose output, such as the distance between the looked up locations, which is printed to stderr, whereas the location data goes to stdout (so you can redirect it via > file):

    % ./geotag berlin paris london
    Berlin (16,DE) lat=52.5166667,long=13.4
    Paris (A8,FR) lat=48.85341,long=2.3488
    London (ENG,GB) lat=51.5084152563931,long=-0.125532746315002
            3 locations looked up, 3 successes, 0 failed (0.0%)
            1222.499km cumulative distance


    Since version 0.012 a TCP-based client/server is also built in:

    % ./geotag -server

    Then, knowing the IP address of this machine, go to a client and run:

    % ./geotag -s 'new york'

    You can create a ~/.geotagrc where you can define the defaults:


    and then call

    % ./geotag 'new york'

    and it will use the server to look up the locations, via TCP on port 10102.
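
    The general shape of such a lookup service can be sketched in Python as follows. The wire protocol of geotag is not described here, so the one-line request and one-line reply used below are purely hypothetical, as are the names LookupHandler, start_server and lookup. The in-memory table stands in for the database and reuses output lines shown earlier in this post.

```python
import socket
import socketserver
import threading

# Stand-in for the database; the values are output lines from this post.
LOCATIONS = {
    "berlin": "Berlin (16,DE) lat=52.5166667,long=13.4",
    "paris": "Paris (A8,FR) lat=48.85341,long=2.3488",
}

class LookupHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # Hypothetical protocol: one query line in, one answer line out.
        query = self.rfile.readline().decode("utf-8").strip().lower()
        answer = LOCATIONS.get(query, "not found")
        self.wfile.write((answer + "\n").encode("utf-8"))

def start_server(host="127.0.0.1", port=0):
    """Start the server in a background thread; port 0 picks a free port."""
    server = socketserver.TCPServer((host, port), LookupHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server

def lookup(host, port, name):
    """Send one query to the server and return its one-line answer."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall((name + "\n").encode("utf-8"))
        return conn.makefile("r", encoding="utf-8").readline().strip()

server = start_server()
host, port = server.server_address
print(lookup(host, port, "Berlin"))
server.shutdown()
server.server_close()
```

    To mirror the setup described above, the server would be started with port=10102 and a host address reachable from the client.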


    Copyright 2007-2016, 2020-2024 © by René K. Müller
    Illustrations and graphics made with Inkscape, GIMP and Tgif