Computer Diary


    "For to be free is not merely to cast off one's chains, but to live in a way that respects and enhances the freedom of others."

    Nelson Mandela
    from "Long Walk to Freedom" 1995

    Here is my little place for rants and praise, where I write up the daily experiences of my programming work. I publish them in the hope that others might find them useful.

    2009/04/22
    Automatically Geotag Photos without GPS
    last edited 2009/05/18 11:30

    Update 2009/04/26: Since the script has grown beyond a quick hack, I moved it to my technology site The-Labs.com: GeoTag, Automatic Geotagging Photos without GPS.

    I took a couple of hundred photos during my past bicycle travels, wrote diaries, and added descriptions to many of the photos.

    I considered geotagging the photos (finding the proper location and its latitude and longitude coordinates), but I postponed it after my first attempts. Now I made another attempt with a database from Geonames.org, using cities1000.txt, which lists approx. 85,000 cities with a population of over 1,000 (allCountries.txt has 8,000,000 entries, which I will test next). The most useful data in this dataset are the aliases, which list city names in different languages and variations; that made my second attempt a success.

    Finding Location

    First I take cities1000.txt and fill an sqlite database with two tables, geocities and geoalias.

    • geocities holds one entry per city: name, alias, lat, long, district and country code
    • geoalias holds all aliases, each pointing to a geocities entry

    The geoalias table speeds up the lookup, at least in theory.
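    To make the two-table idea concrete, here is a minimal sketch in Python (not the Perl code of geotag itself). The table layout, the function names build_db and lookup, and the two sample rows are assumptions for illustration; only the tab-separated column order follows the Geonames dump format.

```python
import sqlite3

# Field positions in a tab-separated Geonames row (cities1000.txt layout).
GEONAMEID, NAME, ALTNAMES, LAT, LONG, COUNTRY, ADMIN1 = 0, 1, 3, 4, 5, 8, 10

def build_db(conn, rows):
    """Fill geocities and geoalias from Geonames-formatted lines (hypothetical schema)."""
    conn.execute("CREATE TABLE geocities (id INTEGER PRIMARY KEY, name TEXT,"
                 " lat REAL, long REAL, district TEXT, country TEXT)")
    conn.execute("CREATE TABLE geoalias (alias TEXT, city_id INTEGER)")
    for row in rows:
        f = row.rstrip("\n").split("\t")
        city_id = int(f[GEONAMEID])
        conn.execute("INSERT INTO geocities VALUES (?,?,?,?,?,?)",
                     (city_id, f[NAME], float(f[LAT]), float(f[LONG]),
                      f[ADMIN1], f[COUNTRY]))
        # Store the main name as an alias too, so a single query covers both.
        aliases = {f[NAME]} | {a for a in f[ALTNAMES].split(",") if a}
        conn.executemany("INSERT INTO geoalias VALUES (?,?)",
                         [(a.lower(), city_id) for a in aliases])
    # The index on the alias column is what makes the lookup fast.
    conn.execute("CREATE INDEX geoalias_idx ON geoalias (alias)")
    conn.commit()

def lookup(conn, name):
    """Return (name, district, country, lat, long) for every city with that alias."""
    return conn.execute(
        "SELECT c.name, c.district, c.country, c.lat, c.long FROM geoalias a"
        " JOIN geocities c ON c.id = a.city_id WHERE a.alias = ? ORDER BY c.id",
        (name.lower(),)).fetchall()

# Two made-up sample rows shaped like the dump; unused fields are placeholders.
SAMPLE = [
    "2761369\tWien\tWien\tVienna,Vienne,Viena\t48.20849\t16.37208\tP\tPPLC\tAT"
    "\t\t09\t\t\t\t0\t\t0\tEurope/Vienna\t2009-01-01",
    "4252025\tVienna\tVienna\t\t37.41533\t-88.89784\tP\tPPL\tUS"
    "\t\tIL\t\t\t\t0\t\t0\tAmerica/Chicago\t2009-01-01",
]

conn = sqlite3.connect(":memory:")
build_db(conn, SAMPLE)
for hit in lookup(conn, "Vienna"):
    print(hit)
```

    Looking up "Vienna" returns both the Austrian and the US entry here, which is exactly the ambiguity the heuristics in the next section deal with.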

    Applying Heuristics

    Finding a location in a text isn't as easy as one might think, so I came up with some assumptions (aka heuristics) to find a location.

    Anatomy of City Names

    I assumed a city name always starts with an uppercase letter, followed by lowercase characters. Italian and French city names often consist of multiple terms, where the middle terms may be lowercase, but the first and last terms start with an uppercase letter.

    So I ended up with the following patterns, tried

    [A-Z][a-z]+ [A-Za-z]+ [A-Za-z]+ [A-Z][a-z]+
    [A-Z][a-z]+ [A-Za-z]+ [A-Z][a-z]+
    [A-Z][a-z]+ [A-Z][a-z]+
    [A-Z][a-z]+ 
    
    in that order.
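    As a rough illustration of how these patterns can be applied, here is a small Python sketch (again not the Perl code of geotag). The word boundaries and the lookahead, which lets overlapping matches such as "From New" and "New York" both show up, are my own assumptions; the function name candidates is hypothetical. Note that [A-Z][a-z]+ only covers ASCII letters, so a name like "München" would not match as written.

```python
import re

# The four patterns from the text, longest first.
PATTERNS = [
    r"[A-Z][a-z]+ [A-Za-z]+ [A-Za-z]+ [A-Z][a-z]+",
    r"[A-Z][a-z]+ [A-Za-z]+ [A-Z][a-z]+",
    r"[A-Z][a-z]+ [A-Z][a-z]+",
    r"[A-Z][a-z]+",
]

# Assumption: wrap each pattern in a lookahead with word boundaries, so that
# overlapping candidates are found and partial words are never matched.
COMPILED = [re.compile(r"(?=\b(" + p + r")\b)") for p in PATTERNS]

def candidates(text):
    """Return possible city names, longest pattern first, in text order."""
    found = []
    for rx in COMPILED:
        found.extend(rx.findall(text))
    return found

print(candidates("Zurich by night"))
print(candidates("From New York to Boston"))
```

    Each candidate would then be looked up in the database in this order, and the first one that yields a hit wins; many candidates, such as "From New", simply find nothing.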

    Additionally, when multiple locations with the same name are found, I sort them by distance to the last found location. How does this help? E.g. when I make a tour and travel from Prague/Praha to Vienna and look up Vienna, I get 5 entries:

    • Vienna (VA,US) 38.9012225,-77.2652604 (#4791160)
    • Vienna (WV,US) 39.3270191,-81.5484578 (#4825976)
    • Vienna (GA,US) 32.0915577,-83.7954518 (#4228440)
    • Vienna (IL,US) 37.4153295,-88.8978435 (#4252025)
    • Wien (09,AT) 48.2084877601653,16.3720750808716 (#2761369)

    but the entry I want is "Wien", which is geographically closest to the previously looked-up "Prag". This small enhancement helped a lot in determining the location correctly.
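    The distance sorting can be sketched as follows in Python. The great-circle (haversine) formula with an earth radius of 6371 km is an assumption; the text does not say which distance formula geotag uses. The function names are hypothetical; the coordinates are the ones listed above.

```python
from math import radians, sin, cos, asin, sqrt

def distance_km(a, b):
    """Great-circle distance between two (lat, long) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(h))

def nearest(entries, previous):
    """Pick the (name, lat, long) entry closest to the previous location."""
    return min(entries, key=lambda e: distance_km(previous, (e[1], e[2])))

VIENNAS = [
    ("Vienna (VA,US)", 38.9012225, -77.2652604),
    ("Vienna (WV,US)", 39.3270191, -81.5484578),
    ("Vienna (GA,US)", 32.0915577, -83.7954518),
    ("Vienna (IL,US)", 37.4153295, -88.8978435),
    ("Wien (09,AT)", 48.2084877601653, 16.3720750808716),
]
PRAHA = (50.0878367932108, 14.4241322001241)

best = nearest(VIENNAS, PRAHA)
print(best[0], round(distance_km(PRAHA, (best[1], best[2]))), "km")
```

    With Praha as the previous location, the Austrian entry wins by a wide margin, since all the US entries are thousands of kilometers away.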

    Geotagging Photos

    For the renekmueller.com web-site I internally defined a file called "list", which resides in the folder of the photos and lists every file with its description, like this:

    0001.jpg    Zurich by night
    0002.jpg    Rapperswil in the morning, after long night
    ...
    

    My little perl-script geotag accepts either locations or filenames. If the argument is a file, it tries to find the location names in it; if it's a 'list' file, it handles it accordingly and prints out a similar 'list' file I call 'list.geo', which looks like this:

    0001.jpg    geo:name=Zurich (ZH,CH),\
    geo:long=8.55,lat=47.3666667,time=123238128
    0002.jpg    geo:name=Rapperswil (SG,CH),\
    geo:long=8.82227897644043,geo:lat=47.2255721988597,\
    time=123239228
    ...
    

    The time is the timestamp of the photo.

    Interpolation of Locations

    Since I have many photos, not all of them have a description or a location in the description. For those I optionally try to interpolate the location: I take the found locations before and after, and interpolate linearly between them according to the timestamp.

    This works quite well for me, since I usually stop and take a few photos within 1-2 mins, then ride on to the next location and take photos there again, and I only put a description on the first photo of a sequence. Since it takes me 1-2 hours to reach the next location by bicycle, the timestamp gives a good guess: photos taken in quick succession are also near the location of the first photo, the one with the location description.
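    A minimal Python sketch of this linear interpolation, assuming the photos are sorted by timestamp and that unknown locations are marked with None (both assumptions, as is the function name interpolate). Photos before the first or after the last known location are left untouched. The coordinates and the two outer timestamps are taken from the list.geo example above; the middle timestamp is made up.

```python
def interpolate(photos):
    """photos: list of (timestamp, (lat, long) or None), sorted by timestamp.

    Returns a list of locations where gaps between two known locations
    are filled in linearly by time.
    """
    known = [i for i, (_, loc) in enumerate(photos) if loc is not None]
    result = [loc for _, loc in photos]
    for left, right in zip(known, known[1:]):
        t0, (lat0, lon0) = photos[left]
        t1, (lat1, lon1) = photos[right]
        for i in range(left + 1, right):
            # Fraction of the time elapsed between the two known photos.
            f = (photos[i][0] - t0) / (t1 - t0) if t1 != t0 else 0.0
            result[i] = (lat0 + f * (lat1 - lat0), lon0 + f * (lon1 - lon0))
    return result

photos = [
    (123238128, (47.3666667, 8.55)),                      # Zurich
    (123238678, None),                                    # no description
    (123239228, (47.2255721988597, 8.82227897644043)),    # Rapperswil
]
print(interpolate(photos)[1])
```

    The middle photo sits exactly halfway in time, so it ends up halfway between Zurich and Rapperswil. Interpolating latitude and longitude directly is only reasonable for short hops like these.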

    Download

    Requirements

    • sqlite-3.x, install via local package manager
    • perl module DBD::SQLite
    • perl module Time::HiRes

    Install the perl modules either with your local package manager, or

    % perl -MCPAN -e 'install DBD::SQLite'
    % perl -MCPAN -e 'install Time::HiRes'
    

    Usage

    Copy geotag into /usr/local/bin (as root) or keep it locally. First unpack cities1000.txt.gz or cities1000.zip:

    % gzip -d cities1000.txt.gz
    

    Next, run geotag with cities1000.txt in the same directory:

    % ./geotag
    

    It will create ~/DB/ and populate ~/DB/geotag.db. Creating the sqlite database geotag.db takes at least a couple of minutes; on a Pentium4 2.4GHz it takes about 20 mins. After that, lookups respond instantly.

    Examples

    % ./geotag prag
    Praha (52,CZ) lat=50.0878367932108,long=14.4241322001241
    
    % ./geotag vienna
    Wien (09,AT) lat=48.2084877601653,long=16.3720750808716
    
    % ./geotag boulder
    Boulder (CO,US) lat=40.0149856,long=-105.2705456
    
    % ./geotag vienna
    Vienna (IL,US) lat=37.4153295,long=-88.8978435
    
    % ./geotag paris
    Paris (TN,US) lat=36.3020023,long=-88.3267107
    
    % ./geotag munich
    München (02,DE) lat=48.1376831438553,long=11.5743541717529
    
    % ./geotag paris
    Paris (A8,FR) lat=48.85341,long=2.3488
    
    % ./geotag paris,il,us
    Paris (IL,US) lat=39.611146,long=-87.6961374
    
    % ./geotag vienna,at
    Wien (09,AT) lat=48.2084877601653,long=16.3720750808716
    
    % ./geotag -f gpx diary.txt > list.gpx
    

    So it behaves as I wanted: previously found matches determine the perimeter for the next location lookup.

    I made a test-run based on my ./list file with the photo descriptions of my Europe 2008 Tour:

    % ./geotag list > list.geo
    % ./geotag -f gpx list > list.gpx
    

    and it made 1-2 errors which I corrected by hand, and I added one waypoint (Rapperswil) so the path doesn't go over a lake.

    Note: I had no GPS coordinates to start with; I solely used the descriptions of my photos to derive the waypoints. For the map I used OpenStreetMap.org and some of their examples, and referenced list.gpx within the javascript code:

       ...
       // Load list.gpx as a vector layer on top of the map
       var lgml = new OpenLayers.Layer.GML("GPX", "list.gpx", {
          format: OpenLayers.Format.GPX,
          style: {
              strokeColor: 'red', strokeWidth: 5,
              strokeOpacity: 0.5 },
          // GPX coordinates are plain WGS84 latitude/longitude
          projection: new OpenLayers.Projection("EPSG:4326")
       });
       map.addLayer(lgml);
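
    For reference, a GPX file like the list.gpx loaded above could be written roughly as in the following Python sketch. I am assuming a single track with one track point per photo; the exact layout that geotag -f gpx produces is not shown here, and the function name to_gpx and the creator string are made up.

```python
import xml.etree.ElementTree as ET

GPX_NS = "http://www.topografix.com/GPX/1/1"

def to_gpx(points):
    """Serialize (name, lat, long) tuples as a minimal GPX 1.1 track."""
    gpx = ET.Element("gpx", version="1.1", creator="geotag-sketch", xmlns=GPX_NS)
    seg = ET.SubElement(ET.SubElement(gpx, "trk"), "trkseg")
    for name, lat, lon in points:
        # GPX stores coordinates as decimal degrees in lat/lon attributes.
        pt = ET.SubElement(seg, "trkpt", lat=str(lat), lon=str(lon))
        ET.SubElement(pt, "name").text = name
    return ET.tostring(gpx, encoding="unicode")

xml_text = to_gpx([
    ("Zurich (ZH,CH)", 47.3666667, 8.55),
    ("Rapperswil (SG,CH)", 47.2255721988597, 8.82227897644043),
])
print(xml_text)
```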
    

    I added some verbosity, such as the distance between the looked-up locations, which is printed to stderr, whereas the location data goes to stdout (so you can redirect it via > file):

    % ./geotag berlin paris london
    Berlin (16,DE) lat=52.5166667,long=13.4
    Paris (A8,FR) lat=48.85341,long=2.3488
    London (ENG,GB) lat=51.5084152563931,long=-0.125532746315002
    statistics:
            3 locations looked up, 3 successes, 0 failed (0.0%)
            1222.499km cumulative distance
    

    Client/Server

    Since version 0.012 a tcp-based client/server is also built in:

    % ./geotag -server
    

    If this machine has 192.168.1.8 as its IP, go to a client and do this:

    % ./geotag -s 192.168.1.8 'new york'
    

    You can create a ~/.geotagrc where you can define the defaults:

    server: 192.168.1.8
    

    and then call

    % ./geotag 'new york'
    

    and it will use the server to look up the locations, via tcp on port 10102.
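
    Reading such a ~/.geotagrc is simple; here is a Python sketch of how the "key: value" lines could be parsed. Only the server key and port 10102 come from the text above; the comment syntax with # and the function names are assumptions.

```python
DEFAULT_PORT = 10102  # tcp port mentioned above

def parse_rc(text):
    """Parse 'key: value' lines into a dict, skipping blank lines."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # '#' comments are an assumption
            continue
        key, sep, value = line.partition(":")
        if sep:
            settings[key.strip()] = value.strip()
    return settings

def server_address(settings):
    """Return (host, port) if a server is configured, else None (local lookup)."""
    host = settings.get("server")
    return (host, DEFAULT_PORT) if host else None

settings = parse_rc("server: 192.168.1.8\n")
print(server_address(settings))
```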








    Copyright 2007-2016, 2020-2024 © by René K. Müller <spiritdude@gmail.com>
    Illustrations and graphics made with Inkscape, GIMP and Tgif