Scatter and Decay: E-journal Usage Patterns
Carol Tenopir
Abundant Data – too much?
Usage data for e-journal collections is easy to get.
There is still a need to gather more data. As good as the usage data is now, we have to ask users what they use, why, and what else they use (web, subject repositories) to get a fuller picture.
The MaxData project, funded by IMLS, collects data about data gathering at selected libraries and looks at the whole picture.
They want to look at the story behind the data: what can we conclude with confidence from the different methods of data collection?
The ultimate outcome will be a cost-benefit model for the different methods of data collection: which method is worthwhile for each thing you want to find out.
The goal is to help libraries make the best use of data. Today they are sharing what they have collected so far.
3 teams
Dave Norris (London) is looking at deep log data: logs from all of the OhioLINK libraries. They isolated 4 libraries that they also surveyed.
1 Research intensive, 1 research extensive and 2 masters heavy.
Log analysis of ejournal usage only
With a survey you can better identify the respondent base, and you can separate results by rank and degree.
The negative thing about surveys is that they are self-reported, but you can find out why people did things and what they value.
Gail from Tennessee: survey reports
2 findings for today: scatter and decay. Looking at how you measure use in libraries.
Scatter: the relative dispersion of points on a graph with respect to a mean value (used loosely today). They want to look at how reading varies by subject discipline.
Looked at how many journal articles were read in the last month and multiplied that by 12 to estimate a year.
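The annualization step above can be sketched in a few lines. This is only an illustration: the function name and the monthly counts are made up for the example and are not MaxData survey figures.

```python
def annualize(monthly_readings: float, months_per_year: int = 12) -> float:
    """Scale a one-month reading count to an estimated yearly count."""
    return monthly_readings * months_per_year


# Hypothetical survey answers (readings in the last month), grouped by role.
responses = {"faculty": [30, 40, 38], "students": [18, 20, 16]}

for group, counts in responses.items():
    mean_monthly = sum(counts) / len(counts)
    print(group, annualize(mean_monthly))
```

Multiplying one month by 12 assumes the sampled month is typical, so term breaks or assignment deadlines would distort the yearly estimate.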
Medical and health faculty and students read many more articles than other disciplines: 434 for faculty and 222 for students. The number of articles read varies by discipline.
Downloads by subject at Ohio University, found from the logs
Medical journals came first, followed by other sciences. They looked at the subject area of each journal to come up with this distribution.
Used COUNTER data, combined it to come up with a list of titles, and used information from the link resolver to assign subject headings. Medicine did not come out on top this way. The medical school is located in Memphis and uses separate databases from the Knoxville location, which would account for the discrepancy.
The Arts/Humanities section looks high because the link resolver put news/fashion/entertainment titles into that category.
File had 17,000 unique titles. Log report had 7,000.
In addition to COUNTER reports they also had other information from aggregators, which came through here.
Pointed out that some titles had multiple subjects.
A couple of titles in the top 20 by use are Billboard and Rolling Stone; a history of rock class is very popular there (Tenn.).
You have to look behind stats to find out where information is coming from. In the future it would be good to pull out scholarly titles and look at the distribution.
David looked at downloads by subject staff and faculty vs. student use.
Student use of online articles is much higher. This might be because faculty have other sources of information: personal subscriptions, reading rooms, colleagues.
Survey: how many minutes did you spend on the last article you read? Found that engineers spent twice as much time per article as medical readers.
Found that medical articles are more widely read, but they are read through more quickly.
Average number of seconds of view per article was gleaned from the log data.
Science viewers averaged 135 seconds per view. The assumption is that the article is scanned and a decision is then made to print or download.
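The talk did not say how view time was derived from the logs. One common approach, shown here purely as an assumption, is to treat the gap between one request and the next in the same session as the view time. The record layout `(timestamp, article_id)` and the function names are hypothetical.

```python
from datetime import datetime


def view_seconds(session):
    """Estimate view time per request as the gap to the next request.

    `session` is a list of (timestamp, article_id) tuples from one session.
    The last request has no following request, so it gets no estimate.
    """
    ordered = sorted(session, key=lambda record: record[0])
    return [
        (article, (next_time - time).total_seconds())
        for (time, article), (next_time, _) in zip(ordered, ordered[1:])
    ]


def mean_view_seconds(sessions):
    """Average the estimated view times across all sessions."""
    gaps = [secs for session in sessions for _, secs in view_seconds(session)]
    return sum(gaps) / len(gaps) if gaps else None


# Hypothetical session: three article requests a few minutes apart.
example_session = [
    (datetime(2006, 1, 1, 10, 0, 0), "a"),
    (datetime(2006, 1, 1, 10, 2, 0), "b"),
    (datetime(2006, 1, 1, 10, 5, 0), "c"),
]
print(mean_view_seconds([example_session]))
```

A measure like this cannot tell reading from an idle browser window, which is one reason to be cautious about log-derived times.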
Looked at the COUNTER data to see the entire collection. Found that 10% of titles accounted for about 80% of use.
Decided to look at titles alone and combined the data for like titles. It wasn't fun matching on titles or non-standard ISSNs. Ended up with 17,000 titles, of which 11.5% accounted for 80% of usage.
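A rough sketch of the two steps just described: merging usage rows that refer to the same title, then finding what share of titles accounts for 80% of use. It assumes each row is an `(issn, downloads)` pair and that stripping punctuation from the ISSN is enough to match duplicates. Real matching was clearly messier, and the ISSNs and counts below are invented.

```python
from collections import defaultdict


def normalize_issn(raw: str) -> str:
    """Keep only digits and the check character X, e.g. '1234-567x' -> '1234567X'."""
    return "".join(ch for ch in raw.upper() if ch.isdigit() or ch == "X")


def merge_usage(rows):
    """Sum downloads for rows that share the same normalized ISSN."""
    totals = defaultdict(int)
    for issn, downloads in rows:
        totals[normalize_issn(issn)] += downloads
    return dict(totals)


def share_of_titles_for(usage, target=0.8):
    """Fraction of titles, from most to least used, needed to reach `target` of all use."""
    counts = sorted(usage.values(), reverse=True)
    total = sum(counts)
    running = 0
    for n, count in enumerate(counts, start=1):
        running += count
        if running >= target * total:
            return n / len(counts)
    return 1.0


# Hypothetical rows: the first two are the same title reported two ways.
rows = [
    ("1234-5678", 600),
    ("12345678", 300),
    ("2222-333x", 50),
    ("4444-5555", 25),
    ("6666-7777", 25),
]
merged = merge_usage(rows)
print(len(merged), share_of_titles_for(merged))
```

Without the merge step the same title would be counted twice, which understates how concentrated the usage is.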
Highest-use titles: Science with over 8,000 downloads, Nature with 4,000, Tetrahedron, AMS archives (older chem. data), USA Today.
Data from OhioLINK looks somewhat different because they were looking at more scholarly titles (presumably).
–David –
Decay: the process of gradually becoming inferior; a gradual decrease. Here: how reading falls off with the age of articles.
Did a lot of work on all 6,000 OhioLINK titles, which gives good background on contextual use. There has been an increase in use of older material because it is much more visible than it previously was.
Evidence shows that people who use search engines find older material.
Decay of article readings.
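One simple way to picture decay is to count downloads by article age and look at the cumulative share. The sketch below assumes each record is a `(publication_year, downloads)` pair. The layout, function names, and numbers are illustrative and not taken from the OhioLINK logs.

```python
from collections import Counter


def downloads_by_age(records, year_of_use):
    """Count downloads per article age (year of use minus publication year)."""
    ages = Counter()
    for pub_year, downloads in records:
        ages[year_of_use - pub_year] += downloads
    return dict(sorted(ages.items()))


def cumulative_share(age_counts):
    """Cumulative share of downloads going to articles up to each age."""
    total = sum(age_counts.values())
    running = 0
    shares = {}
    for age in sorted(age_counts):
        running += age_counts[age]
        shares[age] = running / total
    return shares


# Hypothetical records: (publication year, downloads) observed in 2006.
records = [(2006, 50), (2005, 30), (2006, 10), (1996, 10)]
by_age = downloads_by_age(records, year_of_use=2006)
print(by_age)
print(cumulative_share(by_age))
```

A steep cumulative curve means use concentrates on recent articles; a flatter one suggests older material is still being found and read.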
Carol asked people about the age of the last article they read.
Looking at downloads
Looking at how the data is triangulated and what it means.
Citation: Journal of the American society for
When we actually look at downloads, we get a different picture.
Conclusion:
There is no one perfect method for data collection. Need to look at the sub-discipline of the reader and of the items. Found so far that student assignments can lead to skewed data.
When looking at stories behind data there is a big difference based on sub discipline. Caution about log data.
Online readership is skewed towards students, because faculty have other ways to get information.
We need to examine what our value is compared to our costs.
They will continue to look at possibilities for log collection. The more work involved, the higher the cost and the more of your time is used.
The more effort you put in, the fuller the story.