Project Meeting November 2012

This week, Phin, Rob and I again had the pleasure of meeting up with Chris Batt to discuss progress on the project. On this occasion, we were also able to demo the new VentureNavigator assessment that plugs into Enlighten – the EPrints repository at Glasgow.

The demo was running on the development server and – this being a live demo – we did see one error. Chris assured us that this was quite normal in these situations, especially where in-development software was concerned, and the demo continued without further incident. The demo showed how the previously-discussed top-level SIC codes were used to direct user input before being further refined using a set of keywords derived from the Enlighten repository, on which VentureNavigator indexes the EPrints content. With only a subset of Glasgow’s EPrints records currently held in the development server, the results are clearly skewed towards those subject areas for which papers are available and, indeed, even the fully-functional live system will naturally produce better results for disciplines in which Glasgow currently excels. However, this is part of the point: if there is clearly expertise at the University which is of relevance to a particular ‘innovation aware’ business, to use Phin’s term again, then they will want to know about it. If there is no relevant expertise, then this is also important: a business will not want to waste time looking for opportunities that simply aren’t there. If this pilot project – based on Glasgow’s EPrints repository – is successful, then it would make perfect sense to consider expanding the VentureNavigator solution to include multiple repositories, in order to be able to produce results from a variety of institutions, each with its own research foci. Services provided by the likes of the Open University’s CORE project, for example, might also provide a means of expanding our assessment’s reach beyond a single institution. 
That said, the outputs of the existing project will be eminently transferable: the EPrints plug-in developed at Glasgow will permit any institution with an EPrints repository to expose its research outputs in a VentureNavigator-friendly format, or, indeed, in a format that could be digested by any number of other services.

Phin outlined our plans to test the software and to engage with key stakeholders in the project. To this end, we are aiming to talk to up to ten businesses with which Glasgow has an existing relationship to let them try out the assessment and provide feedback on the process as a whole, from the user interface to the quality and relevance of the information the assessment provides. Chris was particularly keen that we get the user interface right. What we’re doing behind the scenes is technically fairly complicated but this will all be for naught if the public face of the project isn’t user-friendly: our aspiration is to make the assessment as straightforward to complete as buying a song from iTunes or ordering a book from Amazon. The guys at Essex have gone to great lengths to distil the assessment’s interface down to the most concise form possible, while retaining a keen focus on ensuring the results are relevant and useful to the end user.

As Chris described it, we’re proving that the “supply chain” works, or to put it another way, we are demonstrating that there is demand in the business world for academic expertise, and that the software and techniques developed under the auspices of the Encapsulate project provide the requisite means of supplying that expertise. From where we stand now, I’m confident that we’ll be in a position to show that supply chain in action by the end of the project, and have useful things to say about the process of sharing academic outputs with the business community.


VentureNavigator development update

Locked away in the vaults of VentureNavigator, the development team (Phin and myself) have been beavering away in an attempt to connect the data warehouse that is Glasgow’s EPrints Repository with the dynamic business tool that is VentureNavigator Assessments. Many of the earlier issues and project decisions have been covered in other blog posts, so I’m going to try not to duplicate those posts. Instead, the aim of this post is to offer an overview of the project from the perspective of the VentureNavigator development team.

One of the first areas we tackled was the processing of the EPrints data. With a wealth of information made available by Glasgow, we needed to target the data and display it in a format that would be useful to VentureNavigator users. So, barely out of the barn door, we had already encountered our first problem: how to display the EPrints data in a logical, intuitive manner? The logical process would be to ask the users what they want, see if this data is available in EPrints and then return the relevant data to the user. The best mechanism for this is the VentureNavigator Assessment tool, which is already designed to return useful data to users based on their answers to preset questions. Assessments can funnel users’ answers into a number of options and then return predefined feedback based on the options selected by the user.
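As a rough illustration of that last point, here is a minimal sketch of answers being mapped to predefined feedback. It is not the VentureNavigator code: the option names, the feedback text and the `feedback_for` function are all hypothetical, and Python is used purely for illustration.

```python
# Hypothetical options and feedback text; the real assessment content
# is not shown in this post.
FEEDBACK_BY_OPTION = {
    "energy": "Predefined feedback for businesses interested in energy.",
    "health": "Predefined feedback for businesses interested in health.",
}


def feedback_for(selected_options):
    """Return the predefined feedback for each recognised option, in the order selected."""
    return [
        FEEDBACK_BY_OPTION[option]
        for option in selected_options
        if option in FEEDBACK_BY_OPTION
    ]
```

Options that have no predefined feedback are simply skipped in this sketch; a real assessment might treat them differently.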

Real time?

We initially thought that we could query Glasgow’s EPrints Repository in real time. This, however, threw up more issues. What happens if the connection between VN and Glasgow goes down or is slow? If a user returns to an old assessment, how would we know whether new data is available? In the end we concluded that by storing the relevant data locally we could resolve these issues and make the process more efficient. In order to do this we would need to import the data. To this end I wrote a simple PHP script that would read Glasgow’s EPrints Repository through their API and store the required data locally. This data was then linked to the existing assessment feedback mechanism.
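To give a flavour of the "import and store locally" approach, here is a minimal sketch. It makes several assumptions: it is written in Python with SQLite rather than the PHP we actually used, the table layout and field names (`eprint_id`, `title`, `abstract`) are hypothetical, and the records are assumed to have already been fetched from the repository, so no EPrints API calls are shown.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the local store; the schema here is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS eprints ("
        "eprint_id INTEGER PRIMARY KEY, "
        "title TEXT NOT NULL, "
        "abstract TEXT, "
        "imported_at TEXT NOT NULL)"
    )
    return conn


def import_records(conn, records, imported_at):
    """Insert or refresh records so that repeated imports do not create duplicates.

    Returns the total number of records held locally after the import.
    """
    rows = [
        (r["eprint_id"], r["title"], r.get("abstract", ""), imported_at)
        for r in records
    ]
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO eprints "
            "(eprint_id, title, abstract, imported_at) VALUES (?, ?, ?, ?)",
            rows,
        )
    return conn.execute("SELECT COUNT(*) FROM eprints").fetchone()[0]
```

Recording an import timestamp against each row is one possible way to answer the "is there new data since this assessment was taken?" question, since a returning user's assessment can be compared against the most recent import.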

Setting the stage

With the behind-the-scenes cabling all tucked away, it was time to prepare the stage. VN assessments have various inputs for users to enter their answers, including the standard free text, drop-down boxes and radio buttons. After some experimentation with each of these it became evident that none of them would work effectively. Free-text input fields are unlikely to return any data unless the user happens to enter values that correlate exactly with our keywords (for more information on keywords, please read Matt’s informative blog post, “Searching for the key(words)”).

AJAX Keyword auto-complete

Drop-down fields looked to be the holy grail, but we soon discovered that with any more than 30 or so keywords the drop-down field became overloaded and cumbersome to use. The best input tool would be one that allowed users to enter their fields of interest but restricted the actual input to an entry on our keyword list. As no such input mechanism existed for assessments, we hacked (ahem, I mean tweaked) an existing autocomplete input field used elsewhere in VentureNavigator. With this input field, as users start typing they see a list of keywords that match their entry. This prevents them from entering a keyword that will not match up with the results.
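The server-side half of such a field might look something like the sketch below. This is an illustration rather than our actual code: the keyword list is made up, the function names are hypothetical, and matching on the start of the keyword (rather than anywhere within it) is an assumption.

```python
# Hypothetical keyword list; the real list is derived from the repository.
KEYWORDS = ["bioinformatics", "biomaterials", "robotics", "renewable energy"]


def suggest(prefix, keywords=KEYWORDS, limit=10):
    """Return up to `limit` keywords that start with what the user has typed."""
    typed = prefix.strip().lower()
    if not typed:
        return []
    matches = [k for k in sorted(keywords) if k.lower().startswith(typed)]
    return matches[:limit]


def is_valid_keyword(value, keywords=KEYWORDS):
    """Accept a submitted value only if it is an entry on the keyword list."""
    return value.strip().lower() in {k.lower() for k in keywords}
```

The second function matters as much as the first: the suggestions guide the user, but it is the check on submission that actually restricts the input to the keyword list.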

Optimising for speed

As the project has progressed we have encountered issues in terms of processing the vast amount of data in Glasgow’s EPrints Repository. One method we used to optimise the system was to index the data that we pull in against the keywords we use to search for the data. The results have been impressive, reducing the processing time by over 30%. The other method we are currently employing is the tried and tested “throw more memory at it”. I can’t comment on the results as this is yet to be benchmarked. As Glasgow steadily increases the size of their repository, the effectiveness of these changes will become more and more apparent.
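For anyone curious what "indexing the data against the keywords" can look like, here is a minimal sketch of a keyword-to-record index built once at import time. It is an assumption-laden illustration in Python, not our implementation: the record fields and function names are hypothetical, and simple substring matching stands in for whatever matching the real system uses.

```python
from collections import defaultdict


def build_index(records, keywords):
    """Map each keyword to the set of record ids whose text mentions it."""
    index = defaultdict(set)
    lowered = [k.lower() for k in keywords]
    for record in records:
        text = (record["title"] + " " + record.get("abstract", "")).lower()
        for keyword in lowered:
            if keyword in text:
                index[keyword].add(record["eprint_id"])
    return index


def lookup(index, keyword):
    """Return the matching record ids without rescanning any record text."""
    return sorted(index.get(keyword.lower(), set()))
```

The idea is that the expensive scan over every record happens once, when the data is pulled in, so that each search at assessment time becomes a single dictionary lookup.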

So, we now have an assessment that will gauge users’ areas of interest, in terms of both current activity and future innovation, and display relevant excerpts from Glasgow’s EPrints Repository as well as signposting them to Glasgow’s EPrints site. We still need to iron out the format of the EPrints data that is displayed to VentureNavigator users and ensure that we can handle processing the full set of Glasgow’s EPrints repository, but early indicators are positive.