Saturday, March 24, 2012

Gnome 3 - A loving critique

My Desktop view while writing this article

Comparison to Gnome 2

Before switching to Gnome 3, I had used Gnome 2 for years. I also had a brief stint with Unity. Many Gnome 2 users have lamented that 3 is not as customizable as 2. Given less control over their desktop experience, they loudly blast 3. You've probably heard them in places such as Slashdot. I mostly think this attitude stems from an ideological distaste for 3. The most vociferous objectors seem to be the long-time Linux users. They come from a traditional Linux culture which places customizability above almost everything else. Therefore, opinionated software, such as Unity or 3, strikes them as anti-Linux. The point the Gnome developers believe, and which I agree with, is that there is a difference between customizability and distracting clutter. However, any effort to reduce distracting clutter will, almost by definition, remove some control from you. This is not necessarily a bad or wrong thing. As one commentator on Slashdot put it, long-time Linux users are some of the most conservative people in the world.

Just because Gnome 2 allows for easier customizability in some areas does not mean it is better at fulfilling the primary purpose of a desktop environment (launching and maintaining applications). For one, it has a "start menu" like interface for selecting applications. So you have to navigate a categorized list to find and launch an application. That always seemed problematic for me when I wanted to run a more obscure app. For example, I would constantly forget which apps were in the "Preferences" category and which ones in the "System Settings" category. Therefore, I usually ended up scanning through both looking for a name or icon that seemed correct enough. This problem could be partially mitigated by installing Gnome-do (or a similar launcher). The Gnome-do experience seemed like a combination of 3's Activities view and its pop-up command dialog. On its own, though, that combined experience wasn't quite as nice as those two avenues are separately.

In Gnome 2, it is easier to pin favorite applications to the top panel. This makes launching a new instance of each application a one-click interaction. This works well for applications that naturally have multiple instances open. However, it does not work as well for single-instance applications such as the music player Rhythmbox. For them, you want the desktop to open the application on the first click then activate the running application on the second click. Gnome 3 does this. Actually, it is possible to have it both ways in 3 since there is the Panel Favorites extension that emulates the Gnome 2 top panel favorites.

Comparison to Unity

Ubuntu Unity

Unity

Ubuntu's Unity is similar to Gnome 3 in some ways. It has a left-hand side bar application launcher which is similar to the Dash. It allows you to search for applications to launch. However, where Unity fails, in my opinion, is in trying to be both a tablet environment and a desktop environment at once. Largely because of this, Unity's windows take up more space, the icons are blockier, and the application launcher system is even less configurable than in 3. Furthermore, the Activities view in 3 has more functionality than the equivalent view in Unity. The equivalent view in Unity (called the Dash view) does not let you switch to workspaces, select open windows, or drag app icons into workspaces. Where Unity is stronger is in keyboard shortcuts. After opening Unity's Dash view, you are presented with a variety of ways to select applications, files, and music without using a mouse. For example, the top 9 items in the Unity side bar (analogous to 3's Dash) can be selected with the 1-9 numeric keys. In some ways, these extra, temporary key bindings can be faster. Often they don't amount to much, though. I've found myself better remembering the keyboard shortcuts in 3 simply because there are fewer of them.

As of time of writing, launching a new instance of an app is a lot easier in 3 than in Unity. In 3, from the Dash, you always have the opportunity to launch a new instance of an app from a right-click context menu. Strangely in Unity, not every app gives you that choice. This behavior might not even be customizable, either.

Another big advantage 3 has over Unity is its extension ecosystem. While the ecosystem is currently in its nascency, it shows potential. It could be the killer app that vaults 3 over Unity in the same way that the Firefox extension system helped vault it over Internet Explorer (at least among choosier tech savvy users). Unity currently doesn't have an extension ecosystem.

The Activities view

My Activities view while writing this article

In Gnome 3, most of the action happens in the Activities view. It can be launched three different ways:

  1. Alt+F1
  2. Windows key (only the left Windows key on my system: Fedora 16)
  3. Slamming the mouse pointer into the top left corner of the screen (hot corner)

The Activities view is one of the most well designed parts of Gnome 3. You can tell the Gnome team spent a lot of time on getting this right. Most of the functionality does what you expect it to do. For example, single clicking a workspace preview square shows the windows within that workspace. Clicking on it again moves you to it and back to the Desktop view. Escape also does what you expect most of the time. When you first access the Activities view, hitting Escape will take you back to the Desktop view. If you do anything within the Activities view such as typing or right-clicking the Dash, Escape will only back out of that action. It keeps you in the Activities view. This seems like exactly the right way for Escape to be implemented.

The one quibble I have about the Activities view is that after typing to search for something (eg an application), you can only use the up/down arrows to navigate through the found icons. You cannot use the left/right keys. The reason for this is that the left/right keys are used for editing your search text. Therefore, the up/down arrows scroll through the found icons in reading order: left-to-right, top-to-bottom. The problem is that icons at the very bottom of the screen take a long time to reach. This forces you to either refine your search or use the mouse. An interesting solution could be an extension that reserves all of the arrow keys for navigation but uses Emacs style key-bindings for editing the text.
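To put a rough number on that complaint, here is a small Python sketch comparing the two navigation schemes. It is only an illustration: the grid width of six icons per row is a made-up figure, and the function names are hypothetical, not anything from Gnome Shell itself.

```python
def presses_reading_order(row: int, col: int, columns: int) -> int:
    """Key presses needed when Down steps through icons in reading order."""
    return row * columns + col


def presses_grid(row: int, col: int) -> int:
    """Key presses needed if all four arrow keys moved within the grid."""
    return row + col


if __name__ == "__main__":
    columns = 6        # assumed number of icons per row
    row, col = 4, 5    # zero-based: last icon of the fifth row
    print(presses_reading_order(row, col, columns))  # 29
    print(presses_grid(row, col))                    # 9
```

Under these assumptions the icon in the bottom-right corner costs 29 key presses today versus 9 if the left/right keys also navigated, which is why refining the search or grabbing the mouse usually wins.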

Opening the Activities view opens up some different views:

  • A left hand side bar called the Dash. This shows your favorite applications on top followed by running applications below.
  • All open windows within the current workspace on the left-screen (if you have more than one screen) plus all right-screen windows from all workspaces on the right-screen. Clicking on a window takes you to it and to the workspace where it resides. This is called app activation. You can also close windows by clicking on a circular X that is displayed after you move the mouse over it.
  • A panel showing previews of the different workspaces. Clicking on a workspace takes you to it. Dragging an app from the Dash opens it in that workspace. You can also drag windows to different workspaces.
  • The notifications panel. This is a place on the bottom of the left-hand screen which shows notifications and some other background running applications (such as Dropbox).
  • A search textbox that allows you to immediately start searching for an application to launch. You do not need to put this box "in focus". Typing at any time will start searching for applications.
  • Two tabs labeled "Windows" and "Applications". Windows is the default. Clicking on Applications gives you a view of all applications which can be filtered by category. Filtering by category is similar to the Gnome 2 Applications menu.

App launching

Within the Activities view, you use the Dash to activate a running application or launch a new instance. For people coming from Gnome 2, where clicking on an icon always launched a new instance, this can take some getting used to. There are a few different ways to do both:

Left clicking on an icon in the Dash will only launch a new application instance if an instance of that application is not already running. Otherwise, it will activate that app. As I have already defined, activating an app means putting it in focus, then moving you to its workspace. App activation is most useful for applications that are naturally single instance (eg Rhythmbox). For multi-instance apps (eg gnome-terminal), this behavior can be slightly frustrating. App icons are highlighted when at least one instance of them is already running. Unfortunately, the highlighting isn't really bright enough to be noticed unless you are already looking for it.

Right clicking on an icon brings up a small menu. This menu shows already running instances for you to activate. It also has a "New Instance" choice and a "Save as Favorite" choice. The latter will keep that icon permanently in the Dash even after all running instances are closed. Dragging an icon to a different place along the Dash will also save it as a permanent favorite. This makes the assumption that if you are going through the trouble of positioning an icon on the Dash you expect it to permanently remain there. Apps already saved as a favorite have a "Remove from Favorites" choice.

The whole concept of "app activation" takes a little getting used to if you are coming from a Gnome 2 mindset. I am slowly warming up to it. I still find that I don't choose to click on an already running app to "find" it as much as I notice if it is running (via the subtle highlighting) then click accordingly. This is probably because most of the time I only have 4-5 distinct apps open: Chrome, Emacs, Gnome Terminal, Rhythmbox, and maybe something else. Of them, only Rhythmbox is a naturally single-instance application (perhaps Emacs, too) and I already know where it is located (I keep it in Workspace 4). So middle-clicking (detailed below) to open a new instance is what I do more than left-clicking to activate. This does take slightly more brain cycles than clicking on a top panel favorite did in Gnome 2. Still, I'm starting to find it not terribly annoying. Maybe I'm just getting used to it.

There are a couple of other ways to launch a new application instance within the Activities view.

  1. Searching for an application by typing some of its name, then using the up/down arrow keys to highlight it, then pressing enter. This will always launch a new instance of the app. Clicking on it with the mouse will also work here.
  2. Middle-clicking on an icon in the Dash. Middle-clicking with my mouse means clicking the scroll wheel.

Middle-clicking an icon in the Dash actually creates a new instance of the app in the workspace just below the current one. The reason for moving it to a different workspace has to do with a new concept known as dynamic workspaces. I will discuss the concept of dynamic workspaces a little bit later. Also, there is an extension you can install which keeps the new instance within the current workspace. I will discuss extensions a little bit later, too.
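The click rules described in this section can be summed up in a toy Python model. This is only a sketch of the behavior as I have observed it, not how Gnome Shell implements it. The class and method names are hypothetical, picking the first running window on a left-click is an assumption, and the same_workspace flag stands in for the extension mentioned above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Window:
    app: str
    workspace: int


@dataclass
class Desktop:
    current_workspace: int = 1
    windows: List[Window] = field(default_factory=list)
    focused: Optional[Window] = None

    def launch(self, app: str, workspace: int) -> Window:
        """Start a new instance of `app` on the given workspace."""
        window = Window(app, workspace)
        self.windows.append(window)
        return window

    def left_click(self, app: str) -> Window:
        """Launch only if nothing is running; otherwise activate."""
        running = [w for w in self.windows if w.app == app]
        if not running:
            window = self.launch(app, self.current_workspace)
        else:
            window = running[0]  # assumption: the first instance wins
        # Activation: focus the window and move to its workspace.
        self.focused = window
        self.current_workspace = window.workspace
        return window

    def middle_click(self, app: str, same_workspace: bool = False) -> Window:
        """Always launch a new instance, by default one workspace below."""
        # Whether the view follows the new window is not modeled here.
        offset = 0 if same_workspace else 1
        return self.launch(app, self.current_workspace + offset)


if __name__ == "__main__":
    desktop = Desktop()
    desktop.left_click("rhythmbox")          # not running yet: launches
    desktop.left_click("rhythmbox")          # already running: activates
    desktop.middle_click("gnome-terminal")   # new instance on workspace 2
    print(len(desktop.windows))              # 2
```

The model makes the trade-off visible: left-click is idempotent, which suits single-instance apps, while middle-click is the only one-step way to get a second terminal.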

There is another way to launch applications (not counting dropping into a terminal shell). You can use the pop-up command dialog. This is like a bare-bones version of Gnome-do. You press Alt+F2, the screen dims slightly, and a small textbox appears in the middle of the current window. You then type the exact command desired into this textbox. This pop-up is super fast for short, known commands (eg gedit). If you develop Gnome 3 extensions, you'll find yourself using this a lot with "lg". Unlike after finding an application in the Activities view, you can pass arguments to the command using this dialog.

Pop-up command dialog

The Good and the Different

Gnome 2 was a kludge of an environment. It featured an ugly, heterogeneous mix of panels, drawers, and bars. In comparison, Gnome 3 is simple, elegant, and dare I say it for a Linux desktop, beautiful. The first thing I noticed is how sparse it is. In the desktop view, only a slim top bar is visible (in addition to the application windows). Even the draggable bar on top of each window is slimmer than its Gnome 2 equivalent. It certainly achieves the goal of being non-distracting. There is hardly anything visible to distract you! I think it also achieves the more difficult goal of having functionality that with few exceptions does what you expect it to do. Functionality that works as expected is more easily remembered and absorbed into your workflow.

With a fresh look at the desktop paradigm, the Gnome 3 developers have done away with some of the old patterns they thought were of low value. They've also created some new ones. Some of these I like; some I don't.

A decision I like was to remove the venerable "minimize, maximize, close" window button pattern. While all of these actions are still available, only the close button remains on the top right corner of a standard window (by default). Along with this is the removal of a bottom panel of window buttons (think Windows taskbar). Given that I use workspaces to group together logical windows of work, I usually do not use minimize at all within a given workspace. I typically move the windows into a certain workspace then set them at a certain position and size. Sort of the Warren Buffett buy-and-hold method of window configuration. Therefore, minimize and a taskbar are of little value (on Linux that is; I still use them when I'm using MS Windows since it does not have workspaces).

Another welcome enhancement is the natural way the Activities view can be opened by the mouse. The top left corner of the leftmost screen is called a "hot corner". Slamming the mouse into that corner opens the Activities view. It's fast, intuitive, and easy to hit. Also, since the top left corner is an area of the screen not typically frequented by the mouse, there is little chance of accidentally triggering it. It is also consistent with the Linux paradigm of "focus follows mouse" by triggering an action without having to click. After opening the Activities view, it does what you'd expect by de-activating the view if you re-slam the mouse back into the hot corner. So it is a nice little innovation without a downside.

A new feature I could not get used to was dynamic workspaces. Admittedly, I may not have given it enough of a chance. I quickly ran to the Static Workspaces extension. To me and to a lot of Linux users, workspaces are static. Each one has meaning. For example, I always keep Rhythmbox in workspace 4. Workspace 4 is my music workspace. I even have a hotkey to it (Ctrl+Alt+4). It's an easy, comfortable convention that I don't have to think about. With dynamic workspaces, there is no Workspace 4. In fact, you must have at least four windows open to ever even see a fourth workspace. You start out on Workspace 1. You can move an open window down to Workspace 2. However, you can never have an empty workspace. If you remove all the windows from a workspace, it goes away and the workspace numbers re-calibrate themselves. So if you have three workspaces and you remove all the windows from Workspace 2, Workspace 2 goes away and Workspace 3 becomes Workspace 2. It's a different way of approaching the concept of workspaces but one that doesn't work with my concept of categorizing workspaces.
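The renumbering is easier to see in a toy model. The Python sketch below only mirrors the behavior as described in the paragraph above (empty workspaces vanish and the rest close ranks). The class and its method names are hypothetical and have nothing to do with the real Gnome Shell code.

```python
class DynamicWorkspaces:
    """Toy model: a workspace is a list of windows, and empty ones vanish."""

    def __init__(self):
        self.workspaces = []  # index 0 is Workspace 1, and so on

    def _take(self, window):
        """Remove the window from wherever it currently lives."""
        for windows in self.workspaces:
            if window in windows:
                windows.remove(window)

    def _prune(self):
        """Drop empty workspaces, which renumbers everything after them."""
        self.workspaces = [windows for windows in self.workspaces if windows]

    def place(self, window, number):
        """Open or move a window onto workspace `number` (1-based)."""
        self._take(window)
        while len(self.workspaces) < number:
            self.workspaces.append([])
        self.workspaces[number - 1].append(window)
        self._prune()

    def close(self, window):
        self._take(window)
        self._prune()

    def number_of(self, window):
        for number, windows in enumerate(self.workspaces, start=1):
            if window in windows:
                return number
        return None


if __name__ == "__main__":
    ws = DynamicWorkspaces()
    ws.place("emacs", 1)
    ws.place("chrome", 2)
    ws.place("rhythmbox", 3)
    ws.close("chrome")
    print(ws.number_of("rhythmbox"))  # 2, no longer 3
```

In this model, asking for Workspace 4 on an otherwise empty desktop simply lands the window on Workspace 1, which is exactly why a fixed "music workspace" cannot exist without the Static Workspaces extension.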

Another new feature I found annoying was how windows are re-sized when you drag them to different areas of the screen. This is a problem because I like to have some windows maximized (eg Emacs) in one screen and others manually sized (eg gnome-terminal) in the other. I like to be able to snap the medium sized windows to the top of the screen. With drag-to-resize, dragging a medium sized window to the top of the screen became a chore. If I was overzealous with the mouse movement, the window would be accidentally maximized. That would cause me to curse and then unmaximize it. After this happened over and over, I finally turned off drag-to-resize.

Inconsistencies

Gnome 3 still being relatively new, I have also found it to have some minor annoyances. For one, gui configuration in 3 is spread across three different editors: gconf-editor, dconf-editor, and the Gnome Tweak tool. The Gnome Tweak tool exists entirely for configuration of extensions written for 3. That I understand. I don't understand the need for two other editors, though. To me there seems to be almost no rhyme or reason for why a particular configuration could be modified in dconf-editor or gconf-editor. I'm sure someone more versed in the finer points of Gnome could tell me the difference. My point is that to gain adoption from a wider base of users there needs to be only one editor.

Another minor annoyance is that Looking Glass (lg), the debugger for developing extensions, is modal. After opening it, you cannot interact with any other window. In order to move or modify a window, you have to first close lg, do the action, then reopen lg. You'll find yourself repeating that sequence a lot. The developers page for 3 states that lg is inspired by the Firebug extension for Firefox. Firebug is not modal.

Looking glass debugger

There is a small inconsistency in gtk support for extensions. In tinkering around I found it difficult (perhaps impossible) to open a simple, hello-world-style gtk window from within an extension. Therefore, extensions are restricted to a subset of windowing controls. Typically these controls have a different look-and-feel from standard gtk windows. You'll notice this difference in the various graphical "sudo" pop-up windows. If the window originates from an application or script it is in the form of a standard gtk window (eg gksudo, beesu). If it comes from a Gnome 3 extension, it looks like a Gnome 3 panel (ie black background with gray trim). This inconsistency is functionally meaningless. However, inconsistencies like this give you more of a disjointed experience.

Some of the hotkey bindings aren't quite as customizable as one would like either. For example, the key-binding for opening the Activities view is Alt+F1. That can be gui configured to something else in the System Settings dialog. In addition, the left Windows key will also open up the Activities view. Only the left Windows key, though. The right Windows key does nothing. Why? I have no idea. I also have no idea how to change this via a gui interface (or any other type of interface for that matter).

Bugs

In addition to the minor inconsistencies, there were also some things that appeared to not work at all. One of these is the Accessibility switcher within the Activities view. Perhaps this is just user error, but I couldn't figure out how to actually select an entry. To recreate this issue: first open the Activities view, then press Ctrl+Alt+Tab to bring up the Accessibility switcher. You should see the following selectable choices: Top Bar, Dash, Windows, Applications, Search. You can now use the arrow keys to highlight one of them. At this point, I couldn't figure out how to actually select the highlighted choice. The Enter key does nothing; neither does any combination of a control key + Enter key. So far Google searching for this issue has been futile as all results only point out how to launch the Accessibility switcher. Perhaps this is a bizarre issue with my setup and Fedora is to blame. Perhaps, but until proven innocent, I'm blaming Gnome 3.

Another buggy feature within the Activities view is dragging app icons from the Dash into a workspace. This is supposed to launch a new instance of that app within the workspace. Dragging the first app works as expected. However, if you immediately drag the same icon into another workspace, it launches itself in the currently active workspace and not the selected workspace. In addition, the mouse icon stays busy (ie spinner) until you exit out of the Activities view. Considering that I rarely use the drag-to-initiate feature, this is a minor bug for me. It is still something which should be tightened up.

Extensions

Functionality in Gnome 3 can be extended or changed via installable extensions. These are written by outside developers and are findable from the extension repository. As of time of writing, there are roughly 150 extensions found in the repository. Of these, probably fewer than 20 are of any value. So extension developers are still sort of feeling their way around this thing. I briefly tinkered with writing one myself. I found the experience somewhat frustrating. First, there is almost no official documentation on how to write an extension. Even the official wiki points you to only two blog entries written by a guy named Finnbarr P. Murphy. Two blog entries do not constitute documentation. Not to mention the fact that one of the entries is based on Gnome 3.0 and, since the release of Gnome 3.2, is almost completely out of date. Good documentation would include both reference material and cookbook-style entries for implementing common patterns. The reference material would have to be specific to just extension development since extension development is only a subset of general Gnome application development. I found that some Gnome application code (eg opening a standard gtk window) will not work within an extension. This lack of documentation and examples never gave me confidence that I was developing my extension the right way. For an environment that prides itself on convention and consistency, it needs to do a much better job of assuring developers that their code conforms to the Gnome way. I felt like any extension I would have managed to get working would have been filled with a lot of cargo-cult code.

Shell Extensions tab in Gnome Tweak tool

There are some noteworthy and useful extensions. The ones I'm currently using are:

  • Alternative Status Menu: For some Godforsaken reason, Gnome 3 doesn't let you fully power off your machine unless you hold Alt while choosing Suspend from the Status Menu. Suspend is a laptop feature, and I am on a desktop machine. So it doesn't really make sense to click on. Plus, sometimes you just need a good old-fashioned power off. Alternative Status Menu gives you that option in the menu without making you remember which control key to hold down.
  • gTile: This opens one panel per screen right in the middle of the in-focus window. By selecting tiles within the panel, you can resize windows to fit together on the screen. It works reasonably well at that task. It still seems a bit half baked. There are no key-bindings for it (eg Escape), and some of the buttons at the bottom of the panel make no sense to me. I couldn't find any documentation on what they do. Actually, they do not seem to do anything at all. So this extension is mildly useful but could certainly be improved.
  • Windows Alt Tab: This changes Alt-Tab to behave like you expect. Instead of tabbing across all workspaces as is the default behavior, it only tabs through the current one. Also, multiple instances of an application are shown as different entries. The default behavior is to show them as a single entry and then force you to Alt-Tilde to the desired instance.
  • New Instance on Current Workspace: By default, if you middle-click on a Dash entry, it will open a new instance of that application in a workspace below the current one. This only makes sense if you are using dynamic workspaces. I am using static workspaces. This extension changes the behavior to open the new instance in the current workspace.
  • Workspace Indicator: This is a little dropdown located on the top panel (near the Status Menu) which displays your current workspace and lets you navigate to a new one with the mouse.
  • Frippery Static Workspaces: It makes workspaces static and behave like they did in Gnome 2.
  • noa11y: This removes the accessibility menu from the top panel.
  • Media player indicator: This is a panel accessible from the top panel which displays which song your media player is currently playing. It allows for basic playback controls: play, pause, next song, volume control.

Of these, the extensions most important to me are: Static workspaces, Media player, and Workspace Indicator.

Summary

A critique like this tends to focus on the negative. I hope I didn't give the impression that Gnome 3 isn't ready for prime time. I believe it is. I use it on my personal computer and I'm not planning on switching. That being said, any sufficiently complex technology will have some disagreeable design decisions. Gnome 3 is no exception. The larger point is that Gnome 3 is a desktop paradigm improvement, and unlike Unity or MS Metro, it is not a hybrid desktop/tablet environment. So it is not saddled with their same compromising design decisions. It works with your mouse and keyboard and it works great by mostly keeping out of sight.