Statistics for Librarians (ACRL)

January 14th, 2005 by The Curious Tiger

Getting up at the crack of dawn to attend a statistics course is like taking Chinese herbal medicine: the stuff may be hard to swallow at first, but you’ll feel that much better for doing it afterwards. The following notes are my highlights of this morning’s 3-hour presentation on statistics, and specifically on how research libraries use them to evaluate electronic journal usage.

But first, a big disclaimer: it’s been over 10 years since my last formal math class. I expect to make a few completely innocent errors, so here’s how to get in touch with the brains who were responsible for this presentation: Philip Davis, transplanted Canadian and Plato-quotin’ librarian at Cornell University, or John McDonald, Acquisitions Librarian at Caltech Library System. John’s email is , or , or thatreallysmartandeffusiveguy@thesame. (while the statement is true, the addy is not)

Here’s an overview of what we covered:

  • Language of statistics: mean, median, standard deviation, normal distribution
  • Evaluating literature based on 3 key criteria: Validity, Reliability, Generalizability
  • How to sample an entire population and how to make statements based on the data
  • How to interpret usage statistics when making decisions

I’ve always been confused about the difference between median and mean. Here’s a quick n’ dirty tip from John on how to differentiate the two:

Median is the middle, whereas
Mean is the average
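
To see why the distinction matters, here is a minimal sketch using Python's built-in statistics module. The download counts are made up for illustration and are not from the presentation: one heavily used journal drags the mean upward, while the median stays in the middle.

```python
import statistics

# Hypothetical monthly download counts for five journals (invented numbers).
downloads = [2, 3, 4, 5, 100]

mean_downloads = statistics.mean(downloads)      # sum divided by count
median_downloads = statistics.median(downloads)  # middle value once sorted

print("mean:", mean_downloads)
print("median:", median_downloads)
```

With these numbers the mean is 22.8 and the median is 4, so a single outlier can make the "average" journal look far busier than the typical one.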

We spent some time on sampling and its different types: Simple Random Sampling and Stratified Random Sampling. What I got out of Phil: don’t ask for stratification unless you really, really need it. Low response rates don’t and shouldn’t compromise survey results. From John: librarians have great response rates. Surprised? We didn’t think so either.
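
For anyone who wants to see the two sampling types side by side, here is a rough sketch. The patron groups, their sizes, and the function names are all my own assumptions for illustration; nothing here comes from the handout. The stratified version uses proportional allocation, which is only one of several ways to stratify.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n members at random from the whole population, without replacement."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def stratified_sample(strata, n, seed=None):
    """Draw from each group in proportion to its share of the population.

    strata maps a group name to a list of members. Rounding each group's
    share means the total can differ slightly from n in general.
    """
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        k = round(n * len(members) / total)
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical patron list: 800 undergrads, 150 grad students, 50 faculty.
patrons = {
    "undergrad": [f"u{i}" for i in range(800)],
    "grad": [f"g{i}" for i in range(150)],
    "faculty": [f"f{i}" for i in range(50)],
}
everyone = [p for group in patrons.values() for p in group]

srs = simple_random_sample(everyone, 100, seed=1)
strat = stratified_sample(patrons, 100, seed=1)
print(len(srs), {name: len(picked) for name, picked in strat.items()})
```

The simple random sample might happen to include very few faculty, while the stratified one guarantees each group its proportional share. That guarantee is the thing you pay extra effort for, which fits Phil's advice to skip it unless you really need it.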

Before this morning, I never understood why pre-election surveys poll such a small number of registered voters. In my ignorance I just assumed that the bigger the population, the bigger the sample had to be. That’s until I learned about the Law of diminishing returns: as the population grows, the sample you need grows far more slowly and then levels off, as illustrated below:

Pop size    Sample size
500         341 (68%)
1,000       516 (52%)
5,000       880 (18%)
100,000     1,056 (1%)

Here’s an online calculator the guys included on the handout that will help calculate your sample size.
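
If you'd rather see the arithmetic than trust a calculator, here is a minimal sketch. The post doesn't say which formula the presenters or the calculator used, so this is my assumption: the standard sample-size formula for a proportion with a finite population correction, at a 95% confidence level (z = 1.96), a ±3% margin of error, and the most conservative proportion (p = 0.5). The function name and defaults are mine.

```python
def sample_size(population, margin=0.03, z=1.96, p=0.5):
    """Sample size for estimating a proportion in a finite population.

    Assumed settings (not stated in the handout): 95% confidence,
    +/-3% margin of error, p = 0.5 as the worst case.
    """
    # Sample size needed for an effectively infinite population.
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    # Finite population correction shrinks it for small populations.
    n = n0 / (1 + (n0 - 1) / population)
    # Rounded to the nearest whole person; many calculators round up instead.
    return round(n)

for pop in (500, 1000, 5000, 100_000):
    n = sample_size(pop)
    print(f"{pop:>7,}  {n:>5,}  ({n / pop:.0%})")
```

Under those assumptions the sketch gives 516, 880, and 1,056 for populations of 1,000, 5,000, and 100,000. For a population of 500 it gives 341, which lines up with the 68% shown. The uncorrected figure n0 is about 1,067, and that is the ceiling the sample size creeps toward no matter how large the population gets.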

Phil and John led the room in a fun, interactive exercise (meaning, we ate our test subjects) on statistical hypothesis testing using M&Ms. We compared our color distribution to the official numbers provided by Mars Corp. This handily transitioned us into talking about P values, which I actually got on the first try. It’s a statement of probability: in stating P […]

[…] usage stats don’t actually tell you what is being downloaded, who did the downloading, why an article was downloaded, or how many individuals are responsible for the statistics. If anything, usage stats are just the first step in evaluating e-journals; you have to find overwhelming evidence before deciding you need to cancel something.
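
Going back to the M&M exercise for a moment, here is a rough sketch of how a comparison like that can be run as a chi-square goodness-of-fit test. I'm assuming that's the kind of test we did; my notes don't name it. The color counts and the "official" proportions below are invented placeholders, not Mars's real figures. The only outside fact is the critical value: for 5 degrees of freedom (six colors minus one) at the 0.05 level it is about 11.07.

```python
# Critical chi-square value for 5 degrees of freedom at the 0.05 level.
CRITICAL_VALUE_DF5_05 = 11.070

def chi_square_statistic(observed, expected_props):
    """Sum of (observed - expected)^2 / expected over all colors."""
    total = sum(observed.values())
    stat = 0.0
    for color, count in observed.items():
        expected = expected_props[color] * total
        stat += (count - expected) ** 2 / expected
    return stat

# Hypothetical bag of 50 candies (invented counts).
observed = {"brown": 12, "yellow": 8, "red": 11,
            "blue": 6, "orange": 4, "green": 9}

# Placeholder "official" proportions (invented; they sum to 1).
official = {"brown": 0.2, "yellow": 0.2, "red": 0.2,
            "blue": 0.1, "orange": 0.1, "green": 0.2}

stat = chi_square_statistic(observed, official)
reject = stat > CRITICAL_VALUE_DF5_05
print(f"chi-square = {stat:.2f}, reject at 0.05 level: {reject}")
```

With these made-up numbers the statistic comes out to 1.4, well under 11.07, so the bag is consistent with the stated proportions and there's no reason to cry foul.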

Phil gave some background on Project COUNTER, which provides industry standards for counting and reporting usage stats across publishers. As Martha would say: standards, it’s a good thing.

Thanks to ACRL for providing complimentary bagels, cream cheese, muffins, tea, and coffee. Props to both Phil and John for taking the fear factor out of the subject, patiently fielding all of our questions, and of course, the complimentary M&Ms.
