Info Management - Others
Friday, September 29, 2017
10:49 AM
- recognise (and manage) complexity
- focus on adoption
- deliver tangible & visible benefits
- prioritise according to business needs
- take a journey of a thousand steps
- provide strong leadership
- mitigate risks
- communicate extensively
- aim to deliver a seamless user experience
- choose the first project very carefully
- My simple folder hierarchy.
- A file name convention including tags.
- A set of self-written tools that minimizes my personal effort as much as possible.
- Advanced information retrieval features that you can't find anywhere else like TagTrees.
- The decisions between sub-directories on each level are not distinct. Unfortunately, you cannot put real-life items into a totally strict hierarchy without logical conflicts. Whatever structure you come up with, I can easily construct endless examples where its uniqueness fails. This is a crucial thing to know when designing complex hierarchies. It is also a well-known disadvantage of dated concepts like the Dewey Decimal Classification. Don't get me started on this one.
- Sometimes it is hard to recognize that you're not going to find the item in a given sub-hierarchy. You have to go through a number of locations until you realize that you followed the wrong sub-hierarchy. This is lost effort and often leads to the wrong conclusion that the file you're looking for does not exist there. This is the worst-case scenario for every retrieval task.
- The decisions you make while navigating through your directories are usually influenced by your current mental context. This context is different from the mental context you had when storing the items. For example, when you save an image from your birthday party, you most likely choose a directory related to your birthday event. When retrieving, you're probably looking for an image of Aunt Sally. That is a totally different context, and chances are that you won't look for the perfectly nice photograph of Aunt Sally in the directory of your birthday because, in this particular situation, you forgot that she was attending the party.
ten key principles to ensure that information management activities are effective and successful:
From <http://www.steptwo.com.au/papers/kmc_effectiveim/>
In a complex environment, it is not possible to enforce a strict command-and-control approach to management (principle 1).
Instead, a clear end point (‘vision’) must be created for the information management project, and communicated widely. This allows each project team to align themselves to the eventual goal, and to make informed decisions about the best approaches.
From <http://www.steptwo.com.au/papers/kmc_effectiveim/>
Experience has shown that a more effective approach is to think of an information management strategy in terms of a series of projects.
Each of these projects is chosen carefully to have the greatest impact on information management challenges.
From <http://www.steptwo.com.au/papers/cmb-information-management-strategy/>
========
publicvoit - Using a hierarchical classification for file management is a mistake, IMHO. You're neglecting basically every information management improvement of the past seven decades.
https://karl-voit.at/2018/08/25/deskop-metaphor/ should give you an impression of a few things that are wrong with this approach. I also recommend reading "Everything is miscellaneous" by David Weinberger on that topic.
Furthermore, you can read my answer to this issue at https://karl-voit.at/managing-digital-photographs/ (I improved the content yesterday)
I recommend reading "Everything is miscellaneous" by David Weinberger to open up everybody's mind a bit here and there.
Oh, and there is another article about this topic I'd recommend: https://karl-voit.at/2020/01/25/avoid-complex-folder-hierarchies/
Neha_Soma - Finding the very best classification/organizational system is basically impossible because it is a moving target. In other words, what you thought was the very best system today looks downright primitive compared to the new organizing idea you will have in the future. Your data will change, your priorities will change, technology is always changing. Constant change is the price of progress.
So.... instead of tediously changing the structure of your files/folders/tags every time there is a new classification kid on the block... only deal with the metadata, not the actual data itself. Put that metadata in a super flexible format like a spreadsheet, which has rich and powerful sort/filter/find capabilities, then output the categorization you want.
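The metadata-only idea suggested above can be sketched in a few lines, with a CSV table standing in for the spreadsheet. The file names and columns below are made up for illustration:

```python
import csv
import io

# Flat metadata table (CSV as a stand-in for a spreadsheet): instead of
# restructuring folders, keep metadata in one table and derive any
# categorization you want by sorting/filtering.
METADATA = """\
path,year,topic,person
photos/img_001.jpg,2017,birthday,aunt-sally
photos/img_002.jpg,2017,birthday,uncle-bob
docs/report.docx,2016,finance,
"""

rows = list(csv.DictReader(io.StringIO(METADATA)))

# "Output the categorization you want": e.g. everything featuring Aunt Sally.
sally = [r["path"] for r in rows if r["person"] == "aunt-sally"]
print(sally)  # → ['photos/img_001.jpg']
```

Swapping the filter expression changes the categorization without touching the files themselves; the folder layout can stay flat.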
Managing digital files is mostly done in an ad hoc way. People are collecting files on their computers and using folder hierarchies that grow organically. This does not scale well when it comes to retrieval success. Looking for a specific file is a task that often does not result in finding the information, or that comes with some level of frustration.
I'm somebody who has spent many years on personal organization schemes. I tried all kinds of tools and methods. I changed my personal file management concept multiple times in order to minimize file curation effort and maximize consistency and retrieval success. For a couple of years, I was a professional researcher on this topic, writing a PhD thesis on a new file management method that tries to overcome the limitations of the desktop metaphor. Scientific results from peers, as well as my own findings and methods from this PhD project, had a huge influence on the method I developed afterwards.
This article explains the concept I developed and which I am using on a daily basis. I'm using digital photographs as a use-case example. However, the method described is not about image files only.
My method consists of multiple ingredients:
From <https://karl-voit.at/managing-digital-photographs/>
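The file name convention mentioned among the ingredients follows the pattern used by date2name and filetags: an ISO date prefix, a free-text description, then a " -- " separator before the tags. A rough parser for that pattern, as a sketch — the regex is my own simplification, not the official implementation:

```python
import re

# Simplified sketch of the filetags/date2name file name convention:
#   "YYYY-MM-DD description -- tag1 tag2.ext"
# Date, tag list, and extension are all optional.
FILENAME = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2})?\s*"   # optional ISO date prefix
    r"(?P<description>.*?)"                # free-text description (lazy)
    r"(?:\s--\s(?P<tags>[^.]*))?"          # optional " -- tag1 tag2" part
    r"(?P<ext>\.\w+)?$"                    # optional file extension
)

def parse(name):
    m = FILENAME.match(name)
    tags = (m.group("tags") or "").split()
    return m.group("date"), m.group("description").strip(), tags

print(parse("2017-09-29 birthday party -- family photo.jpg"))
# → ('2017-09-29', 'birthday party', ['family', 'photo'])
```

Because the tags live in the file name itself, they survive copying to any file system, cloud storage, or backup medium — no database required.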
Learning about Inbox Zero for email management, I got rid of my hierarchy there. It was a rewarding experience not having to move emails into more or less fitting sub-folders. To my surprise and relief, an inbox folder and a single archive folder were enough. I won't go into details on my email workflow here. The thing I want to emphasize is that moving away from a complex structure with over a hundred sub-folders to a minimal one is not only possible but might even come with many benefits attached.
From <https://karl-voit.at/2020/01/25/avoid-complex-folder-hierarchies/>
research shows that the average person will end up with 100,000 files, 440,000 emails and 120,000 digital photographs.
From <https://karl-voit.at/2020/01/25/avoid-complex-folder-hierarchies/>
((Hardlinks, Symbolic Links, LNK files, etc.))
Soft links: they are able to link to files, directories and cross-partition items.
…But there is an even bigger argument against using reparse points. There is a pattern popular on Windows systems whose historical origin I don't know. When an application opens a file like Document.docx, it renames the file to a temporary name like ~Document.docx, starting with a tilde character. As the user modifies the content of the file, changes are written to memory and to Document.docx. When the user closes the application, the temporary file ~Document.docx gets removed and Document.docx now contains the updated content.
…has one big downside when somebody is using reparse-point features like symbolic links. As the original file gets renamed to a temporary file, a completely new file is written with the modified content, and the temporary file gets removed; this method replaces any links with new copies instead of taking care of links.
This way, tools like tagstore or filetags (mentioned above) cannot use file-system-supported features for linking, even when the Administrator limitation was dropped. Links get replaced by copies silently. This is the most important argument against (symbolic) links on Windows systems.
Therefore, Windows users are doomed to use the low-performing, limited LNK files instead of advanced NTFS file system features.
From <https://karl-voit.at/2018/08/25/links/>
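The link-destroying save pattern described above can be reproduced on any POSIX system using hard links. A minimal sketch, assuming a simplified "rename, rewrite, remove" save routine (real applications use their own temporary naming schemes):

```python
import os
import tempfile

def save_via_rename(path, new_content):
    """Simulate the problematic save pattern: rename the original to a
    temporary name, write a brand-new file under the old name, then
    remove the temporary copy."""
    tmp = path + "~"
    os.rename(path, tmp)           # the original inode is renamed away
    with open(path, "w") as f:     # a completely new file (new inode)
        f.write(new_content)
    os.remove(tmp)                 # temporary copy removed

workdir = tempfile.mkdtemp()
original = os.path.join(workdir, "Document.txt")
hardlink = os.path.join(workdir, "Archive.txt")

with open(original, "w") as f:
    f.write("version 1")
os.link(original, hardlink)        # second name for the same inode

save_via_rename(original, "version 2")  # "edit" the file the problematic way

with open(hardlink) as f:
    print(f.read())                # → version 1 — the link was silently severed
```

After the save, the hard link still holds the old content because it points at the old inode; the link relationship is gone, exactly the "links get replaced by copies" effect described in the quote.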
((( LEO context
Leo's clones are analogous to hardlinks. They can only exist within the same volume (here, the same .leo file), and there's no meaningful difference between one clone and another. Editing one edits them all. Deleting one has no effect on the others, until it's the last one.
Does this mean bookmarks are the same as symlinks? Are there any significant differences?
Would it be feasible, or even reasonable, to use Leo nodes as a filetags manager?
)))
In the literature, you will find more on these topics when you look for "semantic cueing". As a side-note, I tend to like "temporal cueing" for certain retrieval tasks. This is why I created Memacs and its eco-system.
Another thing worth mentioning is related to retrieval tasks where you don't know exactly what to look for. For example, when you are looking for a nice image to use as a background for a presentation slide on the topic of privacy and IT security. This is almost impossible to do via navigation without knowing exactly which image you have in which directory. Serendipity is hard with a strict hierarchy of directories.
From <https://karl-voit.at/2020/01/25/avoid-complex-folder-hierarchies/>
On the other hand, search does come with its conceptual disadvantages as well. Without going too much into detail, the downsides mostly relate to the psychological hurdles of coming up with a suitable search query. And: you have to know what you're looking for, to a certain degree.
Furthermore, some desktop search tools are just not well designed. For example, despite the fact that Apple Spotlight search in general works much better than Windows search, the visualization of its results is very simplified and therefore limiting. Fifteen years ago I was using Copernic Desktop Search, which was, in my opinion, much more advanced than today's desktop search engines.
From <https://karl-voit.at/2020/01/25/avoid-complex-folder-hierarchies/>
((and also the killed-off Google Desktop Search))
So what to do instead of having deeply nested directories?
My recommendation is that you follow the same rationale that librarians did centuries ago. Don't spend much effort in organizing the files. Follow a very flat hierarchy concept and invest your effort in advanced retrieval methods instead.
From <https://karl-voit.at/2020/01/25/avoid-complex-folder-hierarchies/>
My approach with filetags, date2name, appendfilename, move2archive and TagTrees offers people more efficient file management and multi-classification using tags. Instead of curating a directory structure, you should curate a controlled vocabulary of tags. This way, you can circumvent the strict hierarchy for information. With a decent (but not too big) set of tags, filetags is able to derive a complete directory structure called TagTrees which offers you many different navigational paths to the same file. This time, as long as you don't choose tags that do not apply (which is less likely than choosing directories that do not apply), you will find your file within the TagTrees in every case.
Similar to the index cards of the librarians, the file is represented as a link. And it is represented not once but many, many times. You can use the associative part of your brain instead of the part of your brain that remembers where you stored the item under a totally different context.
From <https://karl-voit.at/2020/01/25/avoid-complex-folder-hierarchies/>
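The TagTrees idea can be sketched roughly as follows. This is a hypothetical simplification, not the real filetags implementation: for every permutation of a file's tags, a chain of directories is created, and a link to the file is placed at each level, so any tag order you navigate leads to the file.

```python
import os
from itertools import permutations

def build_tagtrees(files_with_tags, root):
    """Sketch of the TagTrees concept: files_with_tags maps a file path
    to its list of tags; every ordering of every tag subset becomes a
    directory chain containing a symlink to the file."""
    for path, tags in files_with_tags.items():
        for n in range(1, len(tags) + 1):
            for combo in permutations(tags, n):
                d = os.path.join(root, *combo)
                os.makedirs(d, exist_ok=True)
                link = os.path.join(d, os.path.basename(path))
                if not os.path.exists(link):
                    os.symlink(os.path.abspath(path), link)

# Example: a photo tagged "birthday" and "family" becomes reachable via
# birthday/, family/, birthday/family/ and family/birthday/.
```

Note that the number of links grows factorially with the tag count, which is one reason the controlled vocabulary should stay small (and why the real filetags tool limits the TagTrees depth).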
Nayuki - a tag based future filesystem
https://www.nayuki.io/page/designing-better-file-organization-around-tags-not-hierarchies
Naturally hierarchical phenomena
Some natural concepts have a nested structure and are already labeled in a hierarchical way. Geographic locations are hierarchical – a city belongs to a state, which belongs to a country, which belongs to a continent. For example, if you want to organize a set of travel photos, you can make a folder for each country, a subfolder for each city, and so on. Each photo belongs to only one place, never two countries or two cities at once. Divisions of time are hierarchical. You can create a folder for each year and put each document in one year's folder. Or you can go finer and create a subfolder for each month, et cetera.
((YES! This is what is missing from digital, the 3rd and 4th dimensions. Does it already exist and we just haven't seen and described it?))
A much more reliable alternative is to name each file by the hash of its byte-level content, for example the SHA-256 hash value of 87529003f42c1f0439b0c760ebfe5e6dff1b436c3c9b4f8b41ad9b1fe6dc6795 (64 digits). Hashes are long and ugly, but have a couple of useful properties in this situation. The same file content always produces the same hash (unlike the timestamping scheme), so exact duplicates are easily detected. No communication or coordination is needed among a cluster of servers, because probability theory ensures that these random-looking names are extremely unlikely to collide.
Hence it may not be desirable or necessary to let the end user choose and manage unique file names. As we’ll see later, it might be a good idea to design a system where each file name is an auto-generated hash, but where files can be named, categorized, and queried using higher-level metadata systems.
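The content-addressing scheme described above is a one-liner with a standard library hash. A minimal sketch of the idea, using SHA-256 as in the quote:

```python
import hashlib

def content_address(data: bytes) -> str:
    """Name a file by the SHA-256 hex digest of its byte-level content
    (64 hex digits): identical content always maps to the same name,
    so exact duplicates are detected for free."""
    return hashlib.sha256(data).hexdigest()

a = content_address(b"the same bytes")
b = content_address(b"the same bytes")
c = content_address(b"different bytes")

assert a == b   # exact duplicates collide by design
assert a != c   # different content gets a (practically) unique name
print(a)
```

Since the name is derived purely from the content, no coordination between machines is needed to assign names, which is the property the quote relies on for clusters of servers.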
As I see it, all of the challenges listed stem from the fundamentally "connected" nature of spatial files. Unlike typical Word documents, they rarely stand alone. It doesn't make a lot of sense, and often flat out isn't possible, to pass along one or two files for someone else to read or do something with. It's like "here's my top-10 report" (but before you can read it you have to plug in this 1 TB USB drive). …a caricature to be sure, but one to illuminate the principle ;-)
Created with OneNote.