Preliminary findings of user trials

We’re now coming to the end of the user trials. Here are some preliminary conclusions, which mostly relate to the start of the trials, when we gave our users a questionnaire to check our assumptions about what would help and their expectations of what we might do.

Our users come from the Science and Engineering schools at Heriot-Watt University: they’re computer scientists, engineers, physicists, chemists, bioscientists and mathematicians. Just over half are PhD students; most of the others are post-docs, though there are also two lecturers and a professor.

This still seems like a good idea.
That is to say, potential users seem to think it will help them. We wanted 20 volunteer users for the trial and we didn’t find it difficult to get them; in fact we got 21. Nor was it too difficult to get them to use Sux0r: only one failed to use it to the extent we required. Of course there was a bit of chivvying involved, and we’re giving them an Amazon voucher as a thank-you when they complete the trial, which has probably helped, but compared to other similar evaluations it hasn’t been difficult to get potential users engaged with what we’re trying to do.

Our assumptions about how researchers keep up to date are valid for a section of potential users.
We assumed that researchers would try to keep up to date with what was happening in their field by monitoring the latest issues of a defined selection of relevant journals. That is true of most of them to some extent: for example, 11 said that they received email alerts to stay up to date with journal papers. On the other hand, the number of journals monitored was typically quite small (5 people looked at none; 8 at 1-4; 6 at 5-10; and 2 at 11-25). This matched what we heard from some volunteers: monitoring current journals wasn’t particularly important to them compared to fairly tightly focused library searches when starting a new project, and hearing about papers through social means (by which I mean through colleagues, at conferences and through citations). Our impression is that it was the newer researchers, the PhD students, who made more use of journal tables of contents. This would need checking, but perhaps it is because they work on a fairly specific topic for a number of years and are less well connected to the social research network, whereas a more mature researcher will have accreted a number of research interests and will know and communicate with others in the same field.

Feeds alone won’t do it.
Of our 21 mostly young science and technology researchers, 9 know they use RSS feeds (mostly through a personal homepage such as Netvibes), 5 don’t use them but know what they are, and 7 have never heard of them. Only 2 use RSS feeds to keep up to date with journals (the same number as use print copies of journals and photocopies of journal ToCs), compared with 11 who use email alerts.

If you consider this alongside the use of other means of finding new research papers, I think the conclusion is that we need to embed the filtered results into some other information discovery service rather than just provide an RSS feed from Sux0r. Just as well we’re producing an API.
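For readers unfamiliar with the mechanism, the kind of Bayesian word-scoring that underlies this sort of feed filtering can be sketched in a few lines. Everything below (function names, training titles) is invented for illustration; Sux0r’s actual implementation, training interface and API differ.

```python
# Toy naive-Bayes scoring of article titles: train on examples of
# "interesting" and "uninteresting" titles, then score new ones.
import math
from collections import Counter

def train(interesting, uninteresting):
    """Return per-class word counts from lists of example titles."""
    return (Counter(w for t in interesting for w in t.lower().split()),
            Counter(w for t in uninteresting for w in t.lower().split()))

def score(title, pos, neg):
    """Log-odds that a title is interesting, with add-one smoothing."""
    total_pos, total_neg = sum(pos.values()) + 1, sum(neg.values()) + 1
    s = 0.0
    for w in title.lower().split():
        s += math.log((pos[w] + 1) / total_pos) - math.log((neg[w] + 1) / total_neg)
    return s

pos, neg = train(["bayesian inference methods"], ["sports results roundup"])
print(score("bayesian methods survey", pos, neg) > 0)  # prints True
```

A score above zero means the title looks more like the “interesting” training examples; embedding this kind of score in another discovery service, rather than only publishing a filtered feed, is exactly what the API conclusion above points towards.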

We have defined what “works” means for filtering.
We found that currently fewer than 25% of articles in a table of contents are of interest to the individual researcher, and they expect this to rise to 50% or higher in the filtered feed (7 want 50%, 7 want 75%, and one wants everything to be of interest). On the other hand, false negatives, that is interesting articles that wrongly get filtered out, need to be lower than 5-10%.

Those are challenging targets. We’ll be checking the results against them in the second part of the user tests (which are happening as I’ve been writing this), but we’ll also check whether what we do achieve is perceived as good enough.
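The targets above reduce to simple confusion-matrix arithmetic, which is how we can check the second-stage results against them. A minimal sketch, with made-up counts for illustration (the function name and thresholds are mine, not part of the trial protocol):

```python
# Check a filter's output against the trial targets:
# precision (share of passed articles that are interesting) >= 50%,
# false-negative rate (interesting articles wrongly lost) <= 10%.

def meets_targets(tp, fp, fn, precision_target=0.5, fn_rate_target=0.10):
    """tp: interesting articles the filter passed;
    fp: uninteresting articles it passed;
    fn: interesting articles it wrongly filtered out."""
    precision = tp / (tp + fp)
    fn_rate = fn / (tp + fn)
    return precision >= precision_target and fn_rate <= fn_rate_target

# Invented counts: 40 interesting passed, 30 dull passed, 3 interesting
# lost -> precision ~0.57, false-negative rate ~0.07, so targets are met.
print(meets_targets(40, 30, 3))  # prints True
```

Note the tension the numbers in the questionnaire expose: pushing precision towards the 75% some users want tends to filter more aggressively, which pushes the false-negative rate up past the 5-10% ceiling.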

Just for the ultra-curious among you, here’s the aggregate data from the questionnaire for this part of the trials:

Total Started Survey: 21

Total Completed Survey: 21 (100%)

No participant skipped any questions

1. What methods do you use to stay up to date with journal papers?
Email Alerts 52.4% 11
Print copy of Journals 14.3% 3
Photocopy of Table of Contents 9.5% 2
RSS Feeds 9.5% 2
Use Current Awareness service (e.g. ticTOCs) 4.8% 1
None   0.0% 0
Other (please specify) 61.9% 13
2. How do you find out when an interesting paper has been published?
Find in a table of contents 14.3% 3
Alerted by a colleague 38.1% 8
Read about it in a blog 9.5% 2
Find by searching latest articles 76.2% 16
Other (please specify) 47.6% 10
3. How many journals do you regularly follow?
None 23.8% 5
1-4 38.1% 8
5-10 28.6% 6
11-25 9.5% 2
26+   0.0% 0
4. Do you subscribe to any RSS Feeds?
Yes, using a feed reader (e.g. bloglines, google reader) 9.5% 2
Yes, using a personal homepage (e.g. iGoogle, Netvibes, pageflakes) 23.8% 5
Yes, using a desktop client (thunderbird, outlook) 4.8% 1
Yes, using my mobile phone 4.8% 1
No, but I know what RSS Feeds are 23.8% 5
No, never heard of them 33.3% 7
Other (please specify)   0.0% 0
5. When scanning a table of contents for a journal you follow, on average, what percentage of articles are of interest to you?
100%   0.0% 0
Over 75%   0.0% 0
Over 50% 4.8% 1
Over 25% 19.0% 4
Less than 25% 71.4% 15
I don’t scan tables of contents 4.8% 1
6. The Bayesian Feed Filter project is investigating a tool which will filter out articles from the latest tables of contents for journals that are not of interest to you.
What would be an acceptable percentage of interesting articles for such a tool?
I would expect all articles to be of interest 4.8% 1
I would expect at least 75% of articles to be of interest 33.3% 7
I would expect at least 50% of articles to be of interest 33.3% 7
I would expect at least 25% of articles to be of interest 19.0% 4
I would only occasionally expect an article to be of interest 9.5% 2
7. What percentage of false negatives (i.e. wrongly filtering out interesting articles) would be acceptable for such a tool?
0% (No articles wrongly filtered out) 14.3% 3
<5% 23.8% 5
<10% 38.1% 8
<20% 4.8% 1
<30% 4.8% 1
<50%   0.0% 0
False negatives are not a problem 14.3% 3
8. What sources of research literature do you follow?
Journal Articles 95.2% 20
Conference proceedings 71.4% 15
Pre-prints 14.3% 3
Industry News 33.3% 7
Articles in Institutional or Subject Repositories 19.0% 4
Theses or Dissertations 57.1% 12
Blogs 33.3% 7
Other (please specify) 19.0% 4



4 responses to “Preliminary findings of user trials”

  1. In the ‘other’ answers for question 1 is there any interesting information, or are the ‘other’ ways they were keeping up with journal papers so varied that this is not useful?

  2. Ah, yes, I should have said that most of the other methods (9 of the 13) involved searching databases (Web of Science, Google Scholar, Science Direct). Browsing, following up citations and recommendations from colleagues also featured. This was partly what gave us the feeling that current awareness, in the sense of keeping up with what has just been published, isn’t seen by some researchers as a distinct activity from a general literature search (also, some of them told us that directly when we were talking to them).

  3. Chris Rusbridge

    I find these days that Twitter helps a lot, in two ways. First, my contacts tend to tweet about something interesting they have read recently (it’s low cost compared with blogging), and second, the Twitter search facility, while very flaky, lets me leave a complex search running so that I find stuff from people I have never met. E.g. during a workshop on Provenance, I discovered a related paper being given at another conference overseas, at the same time!

    My complex search is (digital OR data) AND (preservation OR curation). Looks a bit like a current awareness search, doesn’t it! There is some dross but occasional nuggets…

  4. Pingback: BayesFF: Final post « Bayesian Feed Filter