A New Vision For The Desktop


As I start to write this article at the end of March 2005 I have in my head a partially formed vision for how to improve the usability of the Personal Computer Desktop. I will use this article to slowly explore that vision and try to turn it into a complete concept. I will do that in stages, so that ultimately this becomes a multipage specification for the complete vision of a new version of the way the PC desktop operates. As I publish each update, I will reset the date of the article so that it reappears at the top of the list of articles on this site.

The current desktop environment has a fairly consistent approach. Each application generally, although not always, has a single window for most of the user's work. This window has borders, possibly with scroll bars, and at the top is a window title bar, a menu bar, and often a toolbar. At the bottom is often a status bar of some kind.

The overall desktop generally consists of a full screen area on which further icons are placed, some representing starting points for some activity, others files put there by the user. On one edge is some form of panel with lists of running tasks, a menu to start new ones, some form of notification area (the system tray) and potentially a quick launch section with the icons of frequently started applications.

Also, just as important is the way that each window has a position in terms of its on-topness compared with every other, and that the on-top window is both

  • the one that has the focus for keyboard and mouse input (although there are exceptions), and
  • opaque, obscuring what is beneath it.

So although we have a 3D concept, it's very limited and is more two-dimensional in nature and use.

I think this is basically a function of the lack of graphics power when these concepts were first put in place. What I want to do is revisit this approach now that we are in a world in which most of the hardware can

  • provide more flexibility in terms of transparency
  • make use of 3D perspectives.

So I want to take these additional capabilities and explore how we might make the user's life easier.

How does the user work?

Document Centric v Application Centric

When Apple first introduced the Lisa, and then the Mac, to the world it established an approach to the desktop in which users were supposed to see documents in folders, and work with them. The application was hidden, and just somehow automatically connected to the documents. Users were supposed to neatly file those documents into a folder hierarchy and never really know that an application existed.

I can’t speak for the Mac community as I don’t know anyone who uses a Mac these days, but I can speak about many of my colleagues at work who use Microsoft Windows, and they definitely do not work that way. They think application. For instance, if I ask a colleague to give me a copy of his PowerPoint presentation he will often start PowerPoint, open up the presentation from the application and use Save As to copy it on to (say) a memory stick. Now I don’t work that way, but many do. It is definitely not the way the original designers thought they might.

Why? Why do people see things this way? I don’t really know, but here are three theories:-

  • In the Windows world you generally have to pay money to buy an application, and the vendor's marketing department is therefore boosting the importance of an application as opposed to the documents it creates (see how I used the term PowerPoint presentation above).
  • These days, the two key applications for most people are e-mail and the web browser. Neither of these connects directly to documents stored on the desktop.
  • To line up windows so that you can copy file a to location b is really difficult. It is much simpler to do it using the application, or using Windows Explorer, where one pane has the tree structure in it and can be used as the destination of a move/copy command.

So in my vision, I think we have to give more credence to the concept of an application (or rather the function of doing something – I’ll talk about that later) as a key driver, although that does not mean that we must forget the document angle either (again more on that later).


The original concept of users operating on several things at once has led to the development of a desktop in which the user is expected to have multiple windows open at the same time, being able to switch between them at will.

I think the reality is different. I think that normally users focus on only one task at a time. However, it is quite important to understand that they might simultaneously want to monitor some other process in the background whilst concentrating on that one task. That monitoring is more than just asking "has this event occurred?"; it is more akin to continually judging progress whilst nevertheless concentrating on the main activity. Today's windowing systems insist that you leave some space on your screen for a monitoring window because of the limitations on where keyboard and mouse focus works when

Switching between tasks (or starting a new one) is also an important component of the user experience. Right now, things seem to be inconsistent between applications, in that those with a document focus require you to explicitly save the document to keep changes, whilst those that don’t have a document focus in quite the same way (for instance your mail program) just work.

Locating Information

A key element of efficient working is to rapidly locate a piece of information to work on. Traditional methods use the file system and effectively allow the user to build a tree structured hierarchy to locate the information. But in reality he doesn’t

The alternative that is being talked about is to allow the user to define metadata with the file, and then to use a search engine to help him locate the information. Having seen the search engine approach in use, I do not believe it works very well. The key problem is finding documents that you know are there, but for which the search criteria seem to be wrong. In this case, browsing is a necessary component.

In analysing my own approach to finding information, I think there are a number of criteria that we use naturally. Let's explore each of them in the subsections below.


In most cases we remember the type of information that we are searching for. This is generally coupled with the application, although not necessarily known that way. So we think of an e-mail message, a document, a slide show, an audio file or a movie (or any other file, such as a saved game …).


First off, I think there are two categories of time that we are talking about. Firstly, there is time for its own sake. Phrases such as last week, last month, or two days ago are all in this category. I would submit that our memories are a little hazy when it comes to remembering time, and the best that can be achieved is one unit back in time at any given resolution, with each coarser resolution about six times the one before. In other words, we can remember today and yesterday, but then the resolution needs to drop by a factor of approximately 6. So we remember this week and last week, then drop to this month and last month, and then drop to this half year and last half year.
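
To make that coarsening concrete, here is a minimal Python sketch of how a date might be mapped to the finest bucket a user is likely to recall. The function name `recall_bucket`, the label strings, the Monday start of the week and the January to June / July to December split of half years are all my assumptions for illustration, not part of any existing system.

```python
from datetime import date, timedelta


def recall_bucket(when: date, today: date) -> str:
    """Label a past date with the finest recall bucket that contains it."""
    if when > today:
        raise ValueError("date is in the future")
    if when == today:
        return "today"
    if when == today - timedelta(days=1):
        return "yesterday"

    # Weeks: compare the Monday that starts each date's week (assumption).
    week_start = today - timedelta(days=today.weekday())
    when_week_start = when - timedelta(days=when.weekday())
    if when_week_start == week_start:
        return "this week"
    if when_week_start == week_start - timedelta(days=7):
        return "last week"

    # Months: use a running month count so December/January wraps cleanly.
    month_index = today.year * 12 + (today.month - 1)
    when_month_index = when.year * 12 + (when.month - 1)
    if when_month_index == month_index:
        return "this month"
    if when_month_index == month_index - 1:
        return "last month"

    # Half years: January-June and July-December (assumption).
    half_index = today.year * 2 + (today.month - 1) // 6
    when_half_index = when.year * 2 + (when.month - 1) // 6
    if when_half_index == half_index:
        return "this half year"
    if when_half_index == half_index - 1:
        return "last half year"
    return "older"
```

A browser of stored items could group its list under these labels instead of showing exact timestamps, which matches the hazy way the time is remembered.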

The second mechanism is links via events (either spot events or ones that cover a period). So we can remember “at the meeting with Customer X”, or whilst I was working at.


Here, I think, is an area where users do need to have a mostly hierarchical model of subject areas that they can use to file information in.

[Need to think about exceptions to this rule]

Some new concepts

Classes of Task

I think the first thing we need to do is get away from the old concepts of applications, and think instead in terms of tasks. We can then classify each of these tasks into different types according to how much the user concentrates on them. So, as an initial list:-

  • Working on a specific document to create, view or edit it (at this stage, forget whether the document is an e-mail message, writing on a page, a drawing, or whatever). The important point is that there is a focus of attention on a specific area of the screen, and that there need to be some tools with which to manipulate the document. Although the term document is used here, I think it can also be other media. For instance, playing a game, or watching a DVD would also fit into this class.
  • As a specific enhancement to the above task, it may be necessary to copy information between two documents, or follow instructions from one document whilst working on another.
  • Filing away a document that has been worked on, or finding a previously worked on document to work on it some more. This may include looking at lists of potential items to understand the relationship between them (for instance a list of messages in an e-mail conversation thread). Tools will be required to control the navigation, or search for the item.
  • As a special case of the above, to switch between a limited number of previously active tasks, maybe triggered by an event (e.g. an e-mail arrives, so you read it, reply and then switch back to whatever you were doing before).
  • Do something as a background activity, with occasional need to intervene or monitor progress (e.g. play music, or download a set of files from the internet).

Actions and Tools


There is a frequent requirement to search for information that is in a hierarchy. I am a firm believer that

  • The user is much more able to find what he wants if the complete hierarchy is exposed at the beginning (i.e. no collapsible trees). [Think about a display mechanism that allows this]
  • The hierarchy must be related to what the user expects (for simple hierarchies) or what the user defines himself (for complex hierarchies)
  • The hierarchy is about subject areas and NOT about type or time (including events), because, as we see below, the computer should also maintain these links separately.


A number of actions will be standard amongst the tasks above. Some of them will relate to switching between tasks, whilst others will be specific to a given task (e.g. print the document currently being worked on).

[Need to expand this further]


Tasks will need to define their own actions.

The individual and his identity to the outside world.

In today's world the computer is not just a tool for undertaking solo tasks, it is also a tool for communicating with others. But there is a subtlety to this. It is no good assuming that, just because I am sitting in front of my computer,

  • I want the whole world to know that I am there
  • Everyone knows me by the same identity

Each individual will hold a number of identities and, for each identity, will also have a list of his contacts (other individual/identity combinations), grouped together into classes.

He can then enable his presence to be known by class of contact.
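
As a sketch of how that might be modelled, the Python below keeps a set of contact classes per identity and a separate set of classes that are allowed to see the user's presence. The names `Identity`, `add_contact`, `presence_enabled_for` and `visible_to` are hypothetical, invented here purely to illustrate the idea.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Identity:
    """One of the identities an individual presents to the outside world."""

    name: str
    # Contacts known under this identity, grouped into named classes.
    contact_classes: Dict[str, Set[str]] = field(default_factory=dict)
    # Classes of contact that are allowed to see the user's presence.
    presence_enabled_for: Set[str] = field(default_factory=set)

    def add_contact(self, contact: str, contact_class: str) -> None:
        """File a contact under the given class, creating the class if needed."""
        self.contact_classes.setdefault(contact_class, set()).add(contact)

    def visible_to(self, contact: str) -> bool:
        """True if the contact belongs to a class with presence enabled."""
        return any(
            contact in members
            for contact_class, members in self.contact_classes.items()
            if contact_class in self.presence_enabled_for
        )


# Example: presence is shown to colleagues but not to customers.
work = Identity("work")
work.add_contact("alice", "colleagues")
work.add_contact("bob", "customers")
work.presence_enabled_for.add("colleagues")
```

With this arrangement an unknown contact, or one filed only in a class without presence enabled, never learns that the user is at the computer.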

Locating Information

As we have discussed above, there needs to be a standardised way of storing information so that it can be found again. The key concepts that we need to link together are


  • Tree structured subject area – much akin to the folder hierarchy used today
  • Time linked to events (in a calendar)
  • Type of information (mail message, document, image …)
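
One possible way to link the three concepts is to record all of them against every stored item and let any combination narrow the list. The Python below is only a sketch under that assumption; `StoredItem`, `find` and the field names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class StoredItem:
    title: str
    subject: Tuple[str, ...]      # path in the subject tree
    kind: str                     # type of information, e.g. "mail"
    when: date                    # date the item was created or received
    event: Optional[str] = None   # calendar event it is linked to, if any


def find(
    items: Iterable[StoredItem],
    subject_prefix: Tuple[str, ...] = (),
    kind: Optional[str] = None,
    event: Optional[str] = None,
) -> List[StoredItem]:
    """Return items under a subject branch, optionally narrowed by type and event."""
    return [
        item
        for item in items
        if item.subject[: len(subject_prefix)] == subject_prefix
        and (kind is None or item.kind == kind)
        and (event is None or item.event == event)
    ]


# Example data, invented for illustration.
items = [
    StoredItem("Agenda", ("Work", "Customer X"), "document",
               date(2005, 3, 1), "Meeting with Customer X"),
    StoredItem("Follow-up", ("Work", "Customer X"), "mail", date(2005, 3, 2)),
    StoredItem("Holiday photo", ("Home",), "image", date(2005, 2, 14)),
]
```

For example, `find(items, ("Work",), kind="mail")` returns only the follow-up message, while `find(items, event="Meeting with Customer X")` returns the agenda regardless of where it was filed.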


This is a different problem and needs wider thought.

Putting it all together

The start

There will be a process, which I will not cover in this article, of starting up the computer, connecting it to a network, and getting it to a point where a known individual is sitting at a screen, keyboard, mouse, other peripheral combination ready to start work.

The approach to task selection seems to me to be really dependent on whether you are creating something new, or locating something old.

As we have seen from information identification, new documents start with application selection.

The focussed task

The fundamental concept above is of normal focus on a single document with a set of tools with which to work on that document.

I think that the pictorial representation of the document related to the focussed task should use the full screen. No standard borders or scroll bars, or menus or toolbars, with maximum space given to the document. If any dimension of the document is smaller than the screen, then the surrounding border will be primarily black (except for transparency effects). If the document is bigger than the screen then, rather than scroll bars, a standard mouse drag should be used to move the document around.
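
The two cases can be sketched per axis as below. This is a minimal Python illustration, not a specification: the function names are hypothetical, and centring a small document on the screen is my assumption, since all that is stated above is that the spare area is black.

```python
def document_offset(doc_size: int, screen_size: int, pan: int = 0) -> int:
    """Screen coordinate at which the document's first pixel is drawn, for one axis.

    A positive result means a black border of that width before the document;
    a negative result means that much of the document is off screen.
    """
    if doc_size <= screen_size:
        # Document fits: centre it (assumption) and ignore any pan.
        return (screen_size - doc_size) // 2
    # Document is larger: keep the pan within the document's extent.
    max_pan = doc_size - screen_size
    clamped_pan = max(0, min(pan, max_pan))
    return -clamped_pan


def apply_drag(pan: int, drag_delta: int) -> int:
    """Dragging the document by drag_delta pixels moves the view the opposite way."""
    return pan - drag_delta
```

Calling `document_offset` once for the width and once for the height gives the position of the document on screen; a mouse drag only changes the pan value through `apply_drag`, so no scroll bars are needed.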

Tools for manipulating the document, or for switching to another task, should be laid out in a visible window above this full screen (perhaps being slightly transparent so that wherever it is covering a key aspect of the main document, that aspect can still be seen). However, although it remains above the focussed window, keyboard and mouse input remains directed towards the full screen except when obviously manipulating the controls.

[Need to consider the alternative of a panel down the side of the screen that pops out when the mouse is pushed hard to the side of the screen, but disappears again when the mouse is moved away]

Where possible this should be like a control panel, with buttons to press or sliders (where an analogue input is required).

An Update from the Future

I just found this article as I migrate my web site using Jekyll and am running through my old posts converting them to Markdown. I never took the concept further, but there are a lot of similarities with the tablet of today (April 2020). Apps definitely have the focus, and many of the ideas resonate: full screen, with controls on top or appearing when needed.

Time series retrieval is an option - I just think about the last few days using Google Photos to retrieve old pictures, but hierarchical storage hasn’t really caught on. An app like Notability has a limited hierarchy (two levels of organisation, plus naming of the notes underneath by time and subject), and mail programs have folders, but I might be the only one who uses such things rigorously. My wife for instance has a single inbox with all mail received in the last 10 years stored in it, and despite my suggesting she organise it, she doesn’t.