Computer Diary


    "Yes, we can."

    Barack Obama
    4th November 2008, Chicago, USA

    Here is my little rant and praise place, where the daily experiences of my programming work are expressed. I publish them with the idea that others might find them useful and benefit from them.

    Tag <MacOSX>

    Check also other posts with other tags.

    2012/07/10
    Metadata - The Unresolved Mess
    last edited 2012/07/21 11:01 (*)

    We count the year 2012; cloud computing and cloud storage have arrived and are still growing. We expect all our data to be available 24/7, and every device is online and grants us access to the internet - The Big Big Cloud of Everything.

    What Is Metadata?

    Metadata is the description of the data. So what does it mean?

    Let me give you an example: I provided computer support for many of my friends, and one friend in particular used to call me almost daily to resolve some of his problems - and I used to tell him "write things down". He kept forgetting logins and passwords of services he used, and also the root passwords of the machines I had installed, so I said "write down the password!", and he did: on a piece of paper he wrote down "hello12KB" (as an example):


    Pieces of paper with passwords on them: Data without Metadata

    Weeks passed by, and you guessed right, there were several pieces of paper with words written on them . . . but which password belonged to which login, which computer or which web service? The data which described the data was NOT written down - and you assume correctly: without the metadata your data becomes useless.

    Poor Man's Primary Metadata: The Filename

    "My Thesis 2.doc" or "Test.doc", "New Test.doc", yes, those are the filenames which should describe what the data is.

    I personally name files with some relevant metadata, e.g. including the date "2012-07-10" or so, like "My Thesis 2012-07-10.odt"

    A bad habit is to name files like "July 10, 2012", because simple sorting by filename gives no relevant order; month names sorted alphabetically tell you nothing. Likewise "07-10-11" is even bigger nonsense: which part is the year, the month and the day? Right, you always write it the same way, but others have to guess. And no matter how smartly you name the files, you can't embed everything you really want - filenames would become unreadable.
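    To make the sorting argument concrete, here is a small sketch in Python. The filenames are made up for illustration; the point is only that a plain string sort puts year-month-day names in chronological order, while month-name filenames need to be parsed first.

```python
from datetime import datetime

# Hypothetical filenames, one list per naming habit.
iso_names = [
    "My Thesis 2012-07-10.odt",
    "My Thesis 2012-04-02.odt",
    "My Thesis 2011-12-01.odt",
]
month_names = [
    "July 10, 2012.odt",
    "April 2, 2012.odt",
    "December 1, 2011.odt",
]


def date_of_month_name(filename: str) -> datetime:
    """Parse a name like 'July 10, 2012.odt' into a date (English month names)."""
    stem = filename.rsplit(".", 1)[0]
    return datetime.strptime(stem, "%B %d, %Y")


# Plain string sort: year-month-day names come out in chronological order.
iso_sorted = sorted(iso_names)

# Plain string sort on month names is alphabetical, not chronological.
month_sorted = sorted(month_names)

# Getting the real order requires parsing every single name.
month_chronological = sorted(month_names, key=date_of_month_name)

print(iso_sorted)
print(month_sorted)
print(month_chronological)
```

    With the month-name habit every tool that lists the files needs the extra parsing step; with the year-month-day habit the file manager's default sort is already enough.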

    Poor Man's Secondary Metadata: The Folder

    The use of directories or folders has helped a bit:

    • Thesis 2012/
      • Notes/
        • Sites.txt
      • Sketches/
        • Mindmap By S. Sturges 2009.pdf
      • Human Psyche 2012-07-10.odt
      • Human Psyche 2012-07-10.pdf

    Files of importance are fully qualified, which means they reveal their identity even without context (their location in the filesystem). Disposable files, like a text file with URLs or notes, may have the same filename everywhere and show their relevance only through their location, by association or context. But this isn't really covering metadata at the level it should be!

    Metadata has been neglected hugely, because it is a pain to preserve, obtain or (re-)create.

    I addressed this challenge with the Universal Annotation System (UAS), but I admit I hardly use it - it is not truly integrated into the system, it is just an add-on. It works on Linux, Windows (CLI/Cygwin/Perl) and MacOS-X (CLI) - no desktop interface has been made yet (which I should do).

    Good Examples

    Music (MP3)

    MP3 by default supports several metatags: title, artist, album, year, genre, even the cover photo is preserved when you copy the mp3 file. You need special software to manipulate the metadata.
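    As a rough illustration of what such "special software" does, here is a minimal Python sketch that reads the old ID3v1 tag, a fixed 128-byte block at the very end of an MP3 file. This is only the simplest tag variant: cover photos and longer fields live in the newer ID3v2 tag at the start of the file, which this sketch does not handle. The function names are my own, for illustration.

```python
from typing import Optional


def parse_id3v1(tail: bytes) -> Optional[dict]:
    """Parse a 128-byte ID3v1 block; return None if it is not a tag."""
    if len(tail) != 128 or tail[:3] != b"TAG":
        return None

    def text(raw: bytes) -> str:
        # Fields are fixed-width and padded with NUL bytes or spaces.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(tail[3:33]),      # 30 bytes
        "artist": text(tail[33:63]),    # 30 bytes
        "album": text(tail[63:93]),     # 30 bytes
        "year": text(tail[93:97]),      # 4 bytes
        "comment": text(tail[97:127]),  # 30 bytes
        "genre_index": tail[127],       # 1 byte, index into a genre table
    }


def read_id3v1(path: str) -> Optional[dict]:
    """Read the last 128 bytes of a file and parse them as an ID3v1 tag."""
    with open(path, "rb") as fh:
        fh.seek(0, 2)  # jump to the end to learn the file size
        if fh.tell() < 128:
            return None
        fh.seek(-128, 2)
        return parse_id3v1(fh.read(128))
```

    Because the tag travels inside the file, a plain copy preserves it, which is exactly the property that makes MP3 a good example.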

    Webpages (HTML)

    Every webpage has a URL and a title, aside from the content itself. The URL is like the path and filename on your local computer, but the title is the most useful metadata. One could argue that the entire HTML markup gives more information, but HTML has been a mix of design and content markup and is now a complete mess, so to speak: <b>bold</b> means what? That something is important, or just rendered bold? Is <strong> more meaningful? I guess you get my point: the distinction between design and true markup has been blurred, and due to this unfortunate circumstance a lot of metadata is lost, even though it existed at the moment the data was created.

    Photos (JPEG)

    The metadata in JPEG photos is actually done well, thanks to EXIF, which contains the most important metadata: the shutter time of the camera, time & date (even though the timezone is often unknown) and nowadays often also the GPS-based location. Needless to say, many graphics programs destroy the EXIF metadata, or overwrite it because it is no longer able to contribute - this is due to the ignorance of the developers and the limitations of the metadata format.

    Solution

    Before I can jump to the solution, let me look closely at the problem itself: the way we store data. How can we even attach metadata to data? The filename convention is limiting; extending the existing file structure is . . . possible, if we all agreed that any file format should be an XML-like structured file, into which we could put as many metadata fields as we like, followed by the binary data itself, for example:

    <xml>
    <title content="DSC0001"/>
    <date content="2012/07/10 15:10:07.012" timezone="+1:00"/>
    <location longitude="20.1182" latitude="8.3012" 
       elevation="800.23" celestialbody="Earth"/>
    <annotation type="audio/mp3" encoding="base64">...
       ....</annotation>
    <revision date="2012/07/12 10:05:01.192">GIMP-1.20:
       brightness(1.259),
       contrast(2.18),
       rgbadj(0.827,1.192,0.877)</revision>
    <annotation type="image/jpeg" 
       encoding="binary" length="50982">....
    </annotation>
    </xml>
    

    That would be a photo taken with a camera, with some audio attached (describing the scene or recording some sound), and then altered with GIMP.

    XML

    Technically such a transition could be made by tampering with the open() call (opening a file) in the Operating System (OS) and the related low-level file access libraries, so that it skips over the XML chunk for all existing applications, and by adding an openXML() call which opens the file including the XML chunk and parses the header. It would be important that file-copy operations use openXML(), whereas all other existing programs, e.g. GIMP or Photoshop, would still read an existing JPEG through the tampered open(), which skips the leading XML chunk.

    Once this is done, we could easily start to attach metadata to our data, since it would be integrated into the existing file.

    I realized that all other ways of attaching metadata externally have trouble keeping it properly attached - only in a tightly supervised system can you ensure that metadata remains tied to the data.
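    The two access paths can be sketched in user space with Python. Everything about the on-disk layout here is my assumption, not an existing format: a made-up 4-byte marker, a 4-byte length, the XML header, then the untouched payload. The names open_data() and open_xml() merely stand in for the tampered open() and for openXML().

```python
import struct
import xml.etree.ElementTree as ET

MAGIC = b"XMD1"  # hypothetical marker, not part of any real file format


def write_with_metadata(path: str, metadata_xml: str, payload: bytes) -> None:
    """Write marker, header length, XML header, then the original data."""
    header = metadata_xml.encode("utf-8")
    with open(path, "wb") as fh:
        fh.write(MAGIC)
        fh.write(struct.pack(">I", len(header)))
        fh.write(header)
        fh.write(payload)


def _read_header(fh):
    """Consume the XML header if present; otherwise rewind and return None."""
    if fh.read(4) != MAGIC:
        fh.seek(0)  # an ordinary file without embedded metadata
        return None
    (length,) = struct.unpack(">I", fh.read(4))
    return fh.read(length)


def open_data(path: str):
    """Stand-in for the tampered open(): positioned just past the XML chunk."""
    fh = open(path, "rb")
    _read_header(fh)
    return fh


def open_xml(path: str):
    """Stand-in for openXML(): parsed metadata (or None) plus the file object."""
    fh = open(path, "rb")
    header = _read_header(fh)
    root = ET.fromstring(header) if header is not None else None
    return root, fh
```

    A copy tool would go through open_xml() (or simply copy all bytes), while an image editor calling open_data() sees only the JPEG bytes. A real implementation inside the OS would also have to shift seek and tell offsets by the header size, which this sketch leaves out.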

    Metadata Usage

    First of all a title has to be given; then the filename could be anything, since the title is the most relevant piece of metadata.

    Further, date & time should be as exact as possible (not just seconds, but milliseconds or microseconds).

    Further, the location where the data was created.

    Further, all alterations done to the file. One could even store the delta (diff) in the metadata so a revision could be undone independently of the filesystem - that means a file can be transferred among filesystems with its own revision history embedded.

    <revision 
       date="2010/07/08 09:10:13.1823" timezone="+1"
       comment="Changed color from red to blue"
       type="data/diff"
       encoding="base64">7sn38vma85...26743abf738</revision>
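    To show that an embedded revision really is enough to undo a change, here is a small Python sketch for text data. It is one possible encoding, not a defined format: the delta comes from Python's difflib.ndiff, is stored as a JSON list and then base64-encoded into a <revision> element. The function names are hypothetical.

```python
import base64
import difflib
import json
import xml.etree.ElementTree as ET


def make_revision(old_text: str, new_text: str, date: str, comment: str) -> ET.Element:
    """Build a <revision> element carrying the delta between two versions."""
    delta = list(difflib.ndiff(old_text.splitlines(keepends=True),
                               new_text.splitlines(keepends=True)))
    rev = ET.Element("revision", {
        "date": date,
        "comment": comment,
        "type": "data/diff",
        "encoding": "base64",
    })
    # JSON keeps the delta lines separate even without trailing newlines.
    rev.text = base64.b64encode(json.dumps(delta).encode("utf-8")).decode("ascii")
    return rev


def undo_revision(rev: ET.Element) -> str:
    """Recover the text as it was before the revision, from the delta alone."""
    delta = json.loads(base64.b64decode(rev.text))
    return "".join(difflib.restore(delta, 1))


def redo_revision(rev: ET.Element) -> str:
    """Recover the text as it was after the revision."""
    delta = json.loads(base64.b64decode(rev.text))
    return "".join(difflib.restore(delta, 2))
```

    Note that an ndiff delta contains both versions in full, so it is not compact; a real format would more likely store a unified or binary diff. The sketch only demonstrates that the history can live inside the file and survive a move to another filesystem.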
    

    Real Life Example

    Let me use the above folder example, and make it ONE file:

    <xml>
    <title content="Thesis 2012"/>
    <date content="2012/07/10 15:10:07.012" timezone="+1:00"/>
    <location longitude="20.1182" latitude="8.3012" 
       elevation="800.23" celestialbody="Earth"/>
    <annotation title="Sites" type="text/html">
    &lt;a href="http://important.com/link/to/some/data.sqldump"
       &gt;study 1996-01-20 raw data Prof. Studer (Vienna)&lt;/a&gt;
    </annotation>
    <annotation type="text/odt" 
       encoding="binary" length="125162"
       >....</annotation>
    <revision 
       date="2010/07/08 09:10:13.1823" timezone="+1"
       location.longitude="20.1182" location.latitude="8.3012" 
       location.elevation="800.23" location.planet="Earth"
       type="data/diff"
       encoding="base64">7sn38vma85...26743abf738</revision>
    </xml>
    

    Features:

    • All notes are included, e.g. as HTML
    • Revisions either are stored within the odt, or as <revision>.

    This way we can trace changes, and view or undo them even at a much later time - and still have the notes attached to the original research.

    Filesystem as Database

    Currently (status 2012/07) only MacOS-X does desktop search right: all data is indexed at storage time, there is no post-indexing. Windows 7 and Linux are still a mess regarding desktop search: you have to fiddle around with which data gets indexed, or you are left with the worst kind of searching, where searching for a string in all files means opening and reading them all - in the age of terabyte (~10^12 bytes) hard disks this is a matter of several hours or even days, depending on the reading speed. It is incredible that so little attention has been given to this aspect of offline or desktop searching, except by Apple.com, which does many things right and a few things not; whereas the Open Source community as well as Microsoft try to catch up in regards to innovation.

    The solution is simple: make the filesystem a database. Auto-index all data, start with proper file types based on MIME types, and properly assign helpers and indexers based on those MIME types.
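    A toy version of such an index can be sketched in Python with SQLite. Be aware of the shortcut: this walks a directory after the fact, which is exactly the post-indexing I criticize; a real solution would hook into the filesystem and index at storage time. The table layout and function names are assumptions for illustration.

```python
import mimetypes
import os
import sqlite3
import time


def build_index(root: str, db: sqlite3.Connection) -> None:
    """Record path, guessed MIME type, size and mtime of every file below root."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS files "
        "(path TEXT PRIMARY KEY, mime TEXT, size INTEGER, mtime REAL)"
    )
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # The MIME type is guessed from the file extension only.
            mime, _encoding = mimetypes.guess_type(path)
            st = os.stat(path)
            db.execute(
                "INSERT OR REPLACE INTO files VALUES (?, ?, ?, ?)",
                (path, mime or "application/octet-stream", st.st_size, st.st_mtime),
            )
    db.commit()


def recently_changed(db: sqlite3.Connection, mime_prefix: str, seconds: float) -> list:
    """Return paths whose MIME type starts with mime_prefix, changed within `seconds`."""
    cutoff = time.time() - seconds
    rows = db.execute(
        "SELECT path FROM files WHERE mime LIKE ? AND mtime > ? ORDER BY path",
        (mime_prefix + "%", cutoff),
    )
    return [row[0] for row in rows]
```

    With the index in place, recently_changed(db, "image/", 3600) answers "which images changed in the last hour" from the database, without opening a single file. The MIME type would also be the natural key for choosing a per-type indexer.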

    Index not just strings or text, but also graphics, both pixel-based and vector-based (e.g. SVG), and music, so one can find similar forms and structures in data.

    With a database view of, or access to, properly "metadata-sized" data we actually begin to understand the data we have . . . Google has done its part to provide searching capability for the internet, but how does it do when I want to find a book whose title is the same as the name of a dish or a city? The context is part of the metadata, and it is only considered via tricks, such as including a related term that uniquely connects to the context one is searching in.

    A pseudo SQL-like statement:

    select * from files where mtime > -1hr and revision.app == 'gimp'

    This gives me all files altered within the last hour by GIMP (a Photoshop-like open-source image editor); or imagine this:

    select * from files where image ~

    and lists a photo from your vacation:

    or

    select * from files where image ~ circle(color=blue)

    and the results look like this:



    Metadata is best obtained at the moment of data creation - at that moment it needs to be preserved, stored and properly made available for searching.

    Let's see what the next years bring . . .



    2009/03/03
    Problems with MacOSX
    last edited 2009/03/30 19:08 (*)

    On MacOSX 10.5.6 on MacBook Pro:

    Rendering Artifacts

    I was disappointed by the half-baked, unfinished frontend programming of KDE-4.1, where graphical artifacts were quite common; yet after 2-3 weeks the MacBook Pro with MacOSX 10.5.6 and recent fixes shows OpenGL artifacts as well. Here is a screenshot showing the black artifacts (the blurred-out info and the red marker were added by me):


    MacOSX OpenGL artifacts (aka nobody is perfect)

    Crashing while Asleep

    Today I opened the MacBook Pro after having put the machine to sleep . . . and it failed to wake up. I had to force a shutdown by pressing the "ON/OFF" (|) button for 5 seconds; then it shut down, and after waiting a few seconds and pressing the button again, the machine booted afresh.

    I'm surprised that Apple has the same problems as PC makers with putting machines to sleep and waking them up instantly.

    Setting Up Apple File Server (AFP) on Kubuntu

    I found the link "HowTo: Make Ubuntu A Perfect Mac File Server And Time Machine", which was most informative. Surprisingly, one has to compile 'netatalk' by hand in order to include the settings required to make AFP work with recent MacOSX clients - but after 40 minutes the AFP server was up and running, and even announced itself on the network as if it were an original Apple File Server.

    The only problem I faced later: when you create files and folders on the console of the server itself (Linux), they become unreadable or inaccessible from MacOSX, and vice versa. This becomes a problem when you work on a web site, save files on the share, and have the Linux webserver access those files internally.

    Terminal.app: Fixing Backspace & Delete Keys

    As one notices quickly, Apple redefined the Backspace key to be "Delete" and the Delete key to be "Backspace" - Microsoft behaviour, and this is meant as an insult.

    Here the remedy:

    • Within Terminal.app -> Preferences -> Advanced, select "Delete sends ^H"
    • Within your .cshrc or .bash add 'stty erase ^H'




    2009/02/24
    MacOSX: My First Steps
    last edited 2009/04/10 13:13 (*)

    MacOS-X (10.5.6), UNIX made Eye Candy & Easy

    I admit, I was very excited to get my hands on a MacBookPro; I was expecting everything to work out of the box.


    MacBookPro with MacOS-X 10.5.6

    GUI/Quartz


    MacBook Pro System Profile
    The dock is made intuitively: I click on an icon and the application starts, and I can drag & drop an application there and it inserts nicely, or close it.

    Installing from a .dmg is a bit non-intuitive: you download a .dmg and open it, and you see a few files; sometimes you see two icons, one for the application and one for the folder "Application", and you are supposed to drag one over the other - why this is so, no clue. When I click on an installer, I want to install it, and not be taught how to drag one icon over another, stupid.


    MacOS-X in Action

    So, starting an application doesn't mean that it is installed; you are required to drag the application into the folder "Application" to ensure it is in the system, and afterwards you disconnect the .dmg disk image by . . . moving it into the Trash. I thought the Trash is where you put things when you want to delete them? E.g. to eject a CD or DVD, throw it into the Trash . . . . very strange.

    Expose/Spaces, multiple screens work fine, fast switching, key shortcuts definable.

    Open Source Programs

    Next I installed Firefox 3, Opera 9.63, GIMP 2.6 and Inkscape 0.46; I downloaded each application from its respective homepage.

    Then I installed MacPorts, and there the first fiddling around was required: the /opt/local folder was populated, but the xterm/bash didn't find it, so I had to edit .profile (or .cshrc in case you run tcsh). Next I installed screen using sudo port install screen, which worked perfectly. Unfortunately, the default behaviour of screen wasn't acceptable: it still called bash at startup instead of tcsh as defined in the xterm/Terminal.app. Anyway, Google has all the answers; I edited .screenrc to do what I want, and I'm almost done.

    termcapinfo xterm* ti@:te@
    shell /bin/tcsh
    

    Update: you can change the login shell in "System Preferences" -> "Account": select your own login, make sure you can make changes, and click with the right button or with CTRL pushed, and "Advanced Options" appears . . . one of the few inconsistencies I found, hiding this important menu.

    Perl & Perl Modules

    I start to install various perl modules, and I realize there are two perls now in the system, one in /opt/local and another in /usr/bin/, not good. I symlink /usr/bin/perl to /opt/local/bin/perl, and then install perl modules using

    perl -MCPAN -e "install Digest::SHA"

    and so forth, the few handy modules, including DBD::SQLite, but that requires sqlite3, which I install via sudo port install sqlite3, etc.

    I also compile my own old text editor, and it compiles without problems (the source code is over 20 years old).

    Firefox / Opera / Safari plus Flash and Quicktime

    This is what I was hoping for: flash and quicktime seem to work, not perfectly, but ok. E.g. the video stream at zdf.de doesn't work seamlessly; over the wifi connection quicktime has problems, audio fails and I have to restart the player a few times . . . quicktime streaming is not really mature, surprising.

    VirtualBox

    I installed VBox on Kubuntu 8.1 as host a few days ago and ran Windows XP SP2 as guest, and it installed without problems; under MacOS-X 10.5.6 and VBox 2.1.2 the very same CD failed to install (it reported "I386\asms: Error Message: the parameter is incorrect") . . . not very convincing, whoever is the culprit. I even burnt another CD with the Mac from the iso file I had, with the same error message; finally I mounted the .iso file and fed that in as the CD, and that worked. This means the Mac/VBox/Windows XP CD driver chain is likely broken somewhere - likely the Windows XP CD driver, based on findings on the net that people with real machines, not VirtualBox, encounter the same error.

    I installed Kubuntu 8.1 (only at 800x600, higher resolution wasn't possible, bad) and FreeBSD 7.0 ("boot only" did not work as ftp access to Internet failed, but "disk 1" did work) as guests, but it seems VirtualBox on Mac is a bit behind compared to the Linux host variant, surprising.

    Spotlight

    At the upper right corner you have a magnifying glass; no, it doesn't magnify the screen buffer, it does . . . searching. It's much faster to type the name of the application you want to start than to search through the directories via Finder, in particular when you installed the iPhone SDK, which is installed . . . somewhere deep in the filesystem . . .

    So, the full-text index via Spotlight is great; I wish KDE/Kubuntu would invest more time to provide a seamlessly working desktop search of that quality.

    Expose & Spaces

    When switching spaces, the window order is reset: e.g. when I click on a window to bring it to the front, switch the space and come back, the window I had on top is hidden and another window is on top, a slight annoyance.

    Beta Software

    iPhoney

    It's a Safari web browser scaled down to the size of the iPhone, ideal to test web apps for the iPhone, but do you think you can quit the application? No, the quit in the menu doesn't work, it remains running ... so we are back on the console, as if we were using a half-finished KDE program, and terminate it with the kill command. So half-finished programs exist on MacOS-X too, a reality check for me.

    Video Gmail Chat

    Asked me to quit all "active browsers" (without offer to start them up later again), which I did, and still can't install it . . . wow, Google makes software which doesn't work, first time for me.

    Conclusion

    I don't like the keyboard, too small RETURN/ENTER key, bright screen, good battery, warms your hands while working on the computer (is heating a bit too much I think).

    Software wise, a bit dimmed expectation, but mostly I'm very pleased with the entire setup, most stuff is intuitive, very few options are missing, most functionality is sufficient. I like the fast graphics, no delays, I can compile, watch movies, hear streamed radio, all at the same time (just for test purpose) and don't sense a slow down in GUI interaction.

    Keyboard Shortcuts

    Most things are intuitively laid out when using MacOS-X, yet a few keyboard shortcuts are not known to newbies:
    • File Rename: in Finder, click on name, and hit ENTER/RETURN
    • File Delete: in Finder, click on name, and hit APPLE + DEL/BACKSPACE
    • Exit/Terminate Running Program: APPLE + 'q'




    2009/02/24
    Kubuntu 8.1 as guest on VirtualBox MacOSX host
    last edited 2009/02/24 13:38 (*)


    Kubuntu 8.1 as guest on MacOSX with VirtualBox.org
    I do intensive graphics work with scripting, e.g. I manipulate .svg files and call inkscape on the command line without the GUI, and let it render/export a PNG from the manipulated .svg file, using perl.
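    The workflow looks roughly like the following sketch. It is written in Python rather than the perl I actually use, and the helper names and the example edit (changing a fill colour by element id) are made up for illustration. The --without-gui, --export-png and --export-width options are those of the Inkscape 0.4x series; Inkscape 1.x renamed the export option to --export-filename.

```python
import shutil
import subprocess
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def set_fill(svg_in: str, svg_out: str, element_id: str, color: str) -> None:
    """Manipulate the SVG: set the fill colour of the element with the given id."""
    ET.register_namespace("", SVG_NS)  # keep plain <svg> tags on output
    tree = ET.parse(svg_in)
    for el in tree.iter():
        if el.get("id") == element_id:
            el.set("fill", color)
    tree.write(svg_out)


def inkscape_command(svg_path: str, png_path: str, width: int) -> list:
    """Build the command line for a GUI-less PNG export (Inkscape 0.4x options)."""
    return [
        "inkscape",
        "--without-gui",
        f"--export-png={png_path}",
        f"--export-width={width}",
        svg_path,
    ]


def render(svg_path: str, png_path: str, width: int = 800) -> None:
    """Run Inkscape if it is installed; raise if it is missing or fails."""
    if shutil.which("inkscape") is None:
        raise RuntimeError("inkscape not found on PATH")
    subprocess.run(inkscape_command(svg_path, png_path, width), check=True)
```

    Because the script only needs the inkscape binary and a shared directory, it does not matter whether Inkscape runs on the Mac itself or on another machine that mounts the same files.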

    It took quite a while to

    % sudo port install inkscape
    

    it really took 2 hours on my MacBookPro 15" Duo Core. So I thought I would install Kubuntu 8.1, with the famous unfinished KDE-4.1, in a VirtualBox.org virtual machine, mount nfs there, and share the directories where the rendering of SVG to PNG is required.

    This turned out to be a very fast solution . . . as I was able to install a "server" installation quite fast:

    % sudo apt-get install tcsh openssh-server screen wget inkscape
    

    is about what I need on a server without X11 . . . and it installed within 1-2 minutes.

    So, even though I installed MacPorts.org on MacOSX 10.5.6, the virtual machine with Kubuntu (or Ubuntu) seems another suitable alternative for running special apps which aren't ported (yet) to MacOSX. The X11 support under MacOSX is sufficient for me.







    Copyright 2007-2016, 2020-2024 © by René K. Müller <spiritdude@gmail.com>
    Illustrations and graphics made with Inkscape, GIMP and Tgif