Monday, 30 November 2009

Shuffl sprint 8 - progress review

The updated sprint plan is at http://code.google.com/p/shuffl/wiki/SprintPlan_8.

This was the final sprint, so effort was directed mainly towards leaving the project in a clean state for further work.

The demonstration application running from Google Code SVN has been enhanced to represent the current state of development, and the underlying framework has been significantly improved and stabilized. While the delivered functionality is less than had been intended, the project is in fairly good shape for ongoing development as part of the Shuffl project, and the basic ideas have been somewhat vindicated.

I did not make time to install a copy of the system for Chris Holland, but still hope to do this soon. We will be continuing to work with Chris in the ADMIRAL project.

See also the final progress report at http://code.google.com/p/shuffl/wiki/20091130_JISCRI_ProjectFinalProgressReport.

Shuffl final progress report

The final progress report for the Shuffl JISCRI project has been published in the project wiki, at 20091130_JISCRI_ProjectFinalProgressReport.

Monday, 16 November 2009

Shuffl sprint 8 plan

The plan for sprint 8 has been posted at http://code.google.com/p/shuffl/wiki/SprintPlan_8.

This will be the final sprint to be conducted under the initial Shuffl JISCRI project umbrella, but work on Shuffl will continue as part of our ADMIRAL project.

The main goals for this sprint are:

  • to complete the back-end storage interface work started in the previous sprint
  • to improve the online prototype demonstrator for project evaluation (mainly by creating a new Shuffl workspace with instructions displayed for doing the various things that Shuffl can do), and to write notes for installing eXist and Shuffl on a new system
  • to install a copy of Shuffl and eXist on a machine in our research user's workgroup
  • to complete final reporting and project wrap-up, including tidying up some matters to do with project sustainability

Shuffl sprint 7 - progress review

The updated sprint plan is at http://code.google.com/p/shuffl/wiki/SprintPlan_7.

The high point of this sprint was the enthusiastic reception of the visualization interface, even though this has been completed at the expense of some of the other more curation-oriented features. If this truly helps us achieve better engagement for the ADMIRAL project, I judge this will have been a good trade-off, but some cautionary notes raised on the discussion group about maintaining the right project focus need to be borne in mind. The mechanisms for working directly with spreadsheet data went some way to reducing the pressure for some of the features not yet implemented. I have agreed with Chris Holland to install a copy of Shuffl on a system where he can use it directly with his own data, which will hopefully provide a powerful point of engagement for continuing work on ADMIRAL.

It may be worth noting that I don't feel the focus on visualization has been entirely at the expense of the original goals of Shuffl; i.e., to provide a lightweight tool for capturing and sharing annotations and data. Many of the fundamental capabilities have been demonstrated, but in different combinations: user-editable semi-structured data, card linking, and a flexible, pluggable framework for introducing new card structures. On the down side, some of the intended work on containers (e.g. stacks of cards) has not been addressed, and the card serialization format currently deployed is JSON and not RDF.

The testing framework has been extremely valuable. The full test suite now performs in excess of 2000 individual tests (though many of these are repetitious). Areas which have proved more challenging to debug have been exactly those parts of the user interaction code that are not covered by unit tests. I have resisted taking time to implement a UI test framework (e.g. based on Selenium), but rather have tried to move logic out of the user interaction code into unit-testable functions. This is a debatable strategy, but in the limited time available I didn't feel the benefits of deploying a full UI test suite would get me further forward. When I get time, I'd like to evaluate the Windmill framework (http://www.getwindmill.com/), as my past experience with Selenium has been somewhat mixed.
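
To illustrate the strategy of moving logic out of user interaction code, here is a minimal sketch. Shuffl itself is written in JavaScript; Python is used here only for brevity. The names parse_axis_range and on_range_field_changed, and the dictionary standing in for a card, are hypothetical and not taken from the Shuffl code base.

```python
def parse_axis_range(text):
    """Pure logic: turn user input such as "0.5, 20" into a (low, high) pair.

    Returns None when the input is not a valid range, leaving it to the
    caller to decide how the problem is reported to the user.
    """
    parts = [p.strip() for p in text.split(",")]
    if len(parts) != 2:
        return None
    try:
        low, high = float(parts[0]), float(parts[1])
    except ValueError:
        return None
    if low >= high:
        return None
    return (low, high)


def on_range_field_changed(card, text):
    """Thin UI handler (hypothetical): it only applies or rejects the result."""
    parsed = parse_axis_range(text)
    if parsed is None:
        card["error"] = "Range must be two numbers: low, high"
    else:
        card["range"] = parsed
        card["error"] = None
```

The parsing function can be covered by ordinary unit tests without a browser, while the handler is left small enough that little can go wrong in it.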

With work on Shuffl planned to continue as part of the ADMIRAL project, I feel my top priority is to implement as many as possible of the features desired by the actively engaged researcher - in particular, to improve the interface for saving and loading workspaces. Other than that, I need to continue the steps taken to promote sustainability of the outputs, including creation of a more approachable demonstration prototype. These two strands of effort would ideally come together in a back-end storage plugin that works with the Google Data API - if the opportunity presents itself, I'd rather like to do a mini-hackathon with someone who is familiar with the details of the Google Data API.

Wednesday, 11 November 2009

"I can really see myself using this"

Big win !!!

As the Shuffl project draws to a close, I have been having some doubts about the amount of progress made, and have been asking myself whether it was the right decision to spend effort on data visualization within Shuffl (which has been time consuming). But I've just had a real boost.

During a brief demonstration to Chris Holland, I showed him his own spreadsheet data loaded and plotted as graphs in Shuffl, eliciting the response:

"I can really see myself using this"

Vindication indeed!

Chris described the ability to quickly draw up plots and add annotations without having to use four different programs as a real winner for him. Working with Chris' own data, I have seen the need for and implemented (a) selecting data blocks from within a worksheet, and (b) supporting a mixture of linear and logarithmic data plots, both of which I believe contributed to Chris' response.
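
The two features mentioned above can be sketched roughly as follows. This is an illustration only, in Python rather than Shuffl's JavaScript; the function names select_block and to_axis_position, the inclusive zero-based bounds, and the sample worksheet are all assumptions made for the example.

```python
import math


def select_block(rows, first_row, last_row, first_col, last_col):
    """Return a rectangular block of cells; bounds are inclusive and zero-based."""
    return [row[first_col:last_col + 1] for row in rows[first_row:last_row + 1]]


def to_axis_position(value, scale):
    """Map a data value to an axis coordinate for a 'linear' or 'log' series."""
    if scale == "linear":
        return float(value)
    if scale == "log":
        if value <= 0:
            raise ValueError("log scale requires positive values")
        return math.log10(value)
    raise ValueError("unknown scale: %r" % (scale,))


# Made-up worksheet: a notes row sits above the block we actually want.
worksheet = [
    ["notes", "", ""],
    ["x", "a", "b"],
    [1, 0.1, 10.0],
    [2, 0.2, 1000.0],
]
block = select_block(worksheet, 1, 3, 0, 2)
labels, data = block[0], block[1:]

# Series "a" is plotted on a linear axis and series "b" on a logarithmic one.
scales = {"a": "linear", "b": "log"}
plotted_b = [to_axis_position(row[2], scales["b"]) for row in data]
```

Keeping the scale as a per-series setting is what allows one graph to mix linear and logarithmic plots.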

We have agreed that:

  • Label editing (requested previously) is not an immediate priority
  • Plot colour selection would still be nice
  • I shall focus my remaining efforts on improving the user interface for workspace saving and loading (which will require some reworking of the storage interface, but will in any case prepare the ground for continuing Shuffl work in the ADMIRAL project), and
  • I shall arrange to install a copy of eXist on a computer accessible to Chris so he can try using Shuffl in his own environment.

The main new feature that Chris requested during the demonstration was the ability to print a Shuffl workspace, or save it as an image for incorporation into a paper or document. I think this is a reasonable and do-able goal, but it won't be implemented within the current project. Maybe as part of ADMIRAL? For now, there is screen capture and printing.

There is also an interaction here with discussions I've had with Scott Wilson (Wookie Widgets) and Ross Gardler (OSS Watch), in which I have been wisely cautioned against trying to make Shuffl into a generic application server. If data visualization is to prove a draw for engaging with researchers then, looking forward to the ADMIRAL project, I think I should look seriously into the possibility of incorporating a Wookie server into the planned ADMIRAL Data Store server (LSDS). I suspect that a real win here would be if Wookie can serve a widget for displaying raw spreadsheet content (by "raw", I mean here without export to CSV format). I look forward to more interesting discussion and exploration.

Tuesday, 3 November 2009

Shuffl sprint 7 plan

The plan for Sprint 7 has been posted at http://code.google.com/p/shuffl/wiki/SprintPlan_7.

The main targets for this sprint will be to continue work not completed in the previous sprint:

  • Visualization of data: after getting feedback from Chris Holland, and obtaining some representative sample data, I have a number of user interface elements to complete.
  • Improve error handling when loading/saving workspaces.
  • Improve usability of the interface for importing data and loading/saving workspaces.

Sunday, 1 November 2009

Shuffl sprint 6 progress review

Progress during this sprint has been fairly poor, largely due to distraction from both personal and non-project affairs. The total amount of effort spent was 3.5 days against planned effort of 8 days. A further factor impacting progress was that reorganizing the code to enable graph label-row selection in a data table took about a day longer than expected.

On the positive side, I did hold a second review meeting with Chris Holland, and the current development is being conducted very much in response to his feedback, to make the data graphing display more useful to him.

Also on the positive side, the mechanisms for linking data between cards seem to be working nicely (e.g. when I reload new data into a table card, or change the label row, an associated graph card updates immediately; but I do need to capture this relationship when I save and restore a workspace).
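
The kind of linking described above can be sketched as a simple subscription between cards. This is not the Shuffl implementation (which is JavaScript); the class names TableCard and GraphCard and their methods are hypothetical, chosen only to show a graph following changes to its source table.

```python
class TableCard:
    """Holds tabular data and tells subscribers when it changes."""

    def __init__(self, rows, label_row=0):
        self.rows = rows
        self.label_row = label_row
        self._listeners = []

    def subscribe(self, callback):
        self._listeners.append(callback)

    def _notify(self):
        for callback in self._listeners:
            callback(self)

    def load(self, rows):
        """Replace the table contents and notify linked cards."""
        self.rows = rows
        self._notify()

    def set_label_row(self, index):
        """Choose which row supplies the series labels and notify linked cards."""
        self.label_row = index
        self._notify()


class GraphCard:
    """Derives its labels and series from a linked table card."""

    def __init__(self, table):
        self.source = table
        self.redraw(table)
        table.subscribe(self.redraw)

    def redraw(self, table):
        # Rows after the label row are treated as the data to plot.
        self.labels = table.rows[table.label_row]
        self.series = table.rows[table.label_row + 1:]
```

In a sketch like this, saving a workspace would also need to record which table each graph is linked to, so that the subscription can be re-established on reload.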

Some thoughts about sprint planning: for this sprint, with fewer than 4 days of actual effort expended, the sprint planning and review process seems to lack sufficient data to be meaningful. In setting sprint duration, it would seem reasonable to take account of the total amount of effort being applied rather than simply the number of elapsed days. I don't plan to change the duration of the remaining two sprints for this project, but for future projects, planning sprints with fewer than 10-15 total days of effort may be something to avoid.