The Evergreen Head-bob

July 3rd, 2008 by Karen

I have been stewing about my maiden post on this blog for some time, so here goes.

I presented or talked about Evergreen and open source seven times at ALA Annual 2008. (Seven times, I tell you, seven times!) It was a great experience, and part of the joy came from the many moments when the heads began bobbing. (No, not nodding off to sleep… well, maybe for one tired conference-goer… but actively bobbing in affirmation.)

I would say “No hidden code” and heads would bob.

I would say “Free to use, free to download, free to examine, free in every way” and heads would bob.

I would say “Interoperability” (which open code facilitates, even if it is not a characteristic of all open source) and heads would bob.  (Including the heads of many potential vendor partners.)

I would say “Reindexing, transaction load, deduping, powerful, avenue for growth” and heads would bob.

I would say “This brings us full circle to where we were in the beginning of library automation, when we steered our own ship,” and heads would bob.

Evergreen community members can’t always see the bobbing heads. But they are there all the same.

Happy 4th! (And belated Canada Day, not to mention the 400th birthday of Quebec!)

A Riff on Small

June 2nd, 2008 by drdata

In A Riff on Big, I dealt with the skewed nature of library distributions in the United States (although the finding is more generally applicable): there are few big libraries and many small libraries. Moreover, the big libraries are really, really big and the small libraries are really, really small.

However, the critical question is not this fact but its implications for a national information policy. In a consensual democracy with the foundational principle of an informed citizenry, any information policy has to consider the disparate sizes and resources of the nation’s libraries and how to ensure that those with access only to the smallest libraries can have access to greater resources, as great as is possible.

With all that in mind, Karen Schneider asked me whether the size of libraries in square feet was known. Well, actually, it is, and I produced a list for her of the PINES libraries with the size in square feet of each outlet running Evergreen. Hmmm. Interesting. The summary statistics, as well as the sources, are included below, but what do they mean?

The conventional model for allocating public library services has several layers. Independent libraries may operate independently, with outside resources available at worst through ILL (interlibrary loan). In larger settings, branches will draw on a central library or, in even larger ones, independent systems will draw on the resources of their region. Until recently, that is what was possible.

The PINES system allows every library in the system to draw on all other member libraries. The system includes about 2 million bibliographic items and 9 million or so physical items. This arrangement of resource sharing is not common nationally. That fact may be because a consortial library system was not available until recently.

PINES, then, has made more resources more available to more smaller libraries than the conventional model could if applied in Georgia. This is an important step forward and one that has profound information policy implications.

I conclude that the conventional model has served its function as well as was possible before the development of software capable of managing resources of a large consortium. If resource sharing is important, recreating such conventional systems today would, at best, be doing the wrong thing well.

Some Data

Here are some summary stats for libraries reporting this number to the US (N=15,858) and for PINES (N=243) for FY 2005:

Summary statistics of library outlets by square feet of space

Quartile         All U.S. outlets                  PINES outlets in
                 (square feet)                     each U.S. quartile

Fourth           12,000 or more                          59
Third            5,314 to 11,999                         62
Second           2,234 to 5,313                          80
First            2,233 or fewer                          42

Note that the number we normally see for the PINES libraries aggregates the figures for each of the (now) 48 systems, whereas these figures are broken out by individual outlet. Let’s look a bit more.

Smallest library outlets by square footage

                 All U.S. outlets                  PINES outlets (#)

Smallest 10%     less than 1,120 sq. ft.                 20
Smallest 5%      less than 784 sq. ft.                    9
Smallest 1%      fewer than 384 sq. ft.                   4

Note that these categories are not mutually exclusive and that each interval is included in the interval above.

Source of data

The data come from the FY 2005 Outlet File published originally by the U.S. National Center for Education Statistics and now available on the Website of the U.S. Institute of Museum and Library Services. The outlet file has an observation for 17,299 outlets. Appendix J of the documentation reports that 16,557 responded and 44 did not respond but had data from the year before. I found some responses were negative (a conventional way of representing such things as “not applicable” or “unknown”) and deleted those. Bookmobiles apparently do not conventionally report square footage, at least the PINES libraries do not, thus leaving 15,585 outlets (branches and central libraries) with reported figures.
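The cleanup step described above, dropping negative placeholder values before counting, can be sketched in a few lines of Python. The records, field names, and the specific negative codes here are invented for illustration; they are not taken from the Outlet File or its documentation.

```python
# Hypothetical records; field names and codes are assumptions for illustration.
outlets = [
    {"name": "Central", "sq_feet": 42_000},
    {"name": "Branch A", "sq_feet": 1_100},
    {"name": "Bookmobile", "sq_feet": -3},   # negative placeholder: not applicable
    {"name": "Branch B", "sq_feet": -1},     # negative placeholder: unknown
]


def with_reported_square_feet(records):
    """Drop records whose square footage is a negative placeholder code."""
    return [r for r in records if r["sq_feet"] >= 0]


usable = with_reported_square_feet(outlets)
print(len(usable), "of", len(outlets), "outlets have a reported figure")
```

Only the outlets with a non-negative figure survive, which is the population the summary statistics are computed over.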

Note that this number is from FY 2005, the latest national-level data we have, and the facts will have changed since that survey.

Bob Molyneux

Measures of maximum load in PINES - update

May 30th, 2008 by drdata

In an earlier post, I reported on some figures measuring the then maximum transaction load levels on PINES. Two of those were from the day after Memorial Day in 2007 (that is, May 29, 2007). What happened in 2008 the day after Memorial Day (May 27, 2008)? Well, we have another busy day.

                          May 29, 2007                               May 27, 2008

Total circs                 96,326                                     100,427
per day

Maximum circs
per hour                    11,305                                      12,227
                          (11AM-12PM)                                 (11AM-12PM) 

This year’s total figure is 4.26% over last year’s.
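For anyone who wants to check the arithmetic, here is a small sketch using the figures from the table above. The helper name `percent_change` is just for illustration.

```python
def percent_change(old: float, new: float) -> float:
    """Return the change from old to new as a percentage of old."""
    return (new - old) / old * 100.0


total_2007 = 96_326    # total circs, May 29, 2007
total_2008 = 100_427   # total circs, May 27, 2008
hour_2007 = 11_305     # busiest hour, 2007
hour_2008 = 12_227     # busiest hour, 2008

print(f"daily total:  {percent_change(total_2007, total_2008):.2f}%")
print(f"busiest hour: {percent_change(hour_2007, hour_2008):.2f}%")
```

The same function applied to the hourly maximums shows that the busiest hour grew faster than the daily total.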

The busiest minute in 2008 was 331 transactions while last year’s maximum figure was 548—although it was a day earlier in the month.

What was the busiest second? Why, I am glad you asked. There were three seconds during the day when 9 items circulated.

Bob Molyneux

So we beat on, boats against the current…

April 10th, 2008 by drdata

I have long puzzled over the fact that in the golden age of the library function, there has been a systematic failure of library institutions.

By “library function” I refer to the functions and institutions that maintain the memory of the human species. This function includes those institutions and people who save, organize, and provide for eventual retrieval of the human record. In spite of the fact that what we do is often ignored or disrespected, it is our genetic trick as a species. It is a key aspect of what we are built to do. We are not the fastest species nor the strongest. Smartest? Roll your own wry comment on that. But we remember and we do it with many institutions I lump in the library function: libraries, archives, museums, and so on, with apologies to the archivists.

In my lifetime, the human record has started moving from paper, stone, vinyl recordings and so on to digital forms. When I was graduated from library school in 1971, the importance of information as a thing would get a perfunctory: “yeah, yeah.” Now, everyone knows.

The failure of institutions of our field to adapt to this new information environment is a calamity for the species and one that is being remedied by bypassing the institutions that, traditionally, have done the library function. There is an old saying that “the Internet routes around trouble”–an expression that indicates a foundational principle of the Internet architecture. I believe that with so many things going on in the information world, we can see the human species routing around the traditional institutions because there are increasingly more efficient methods for remembering, organizing, and retrieving the human record—which we humans cannot not do–and which these traditional institutions do not do well. Not all people are routing around all library institutions. Not everywhere. Not everyone. But many–perhaps most. Librarians did not invent Yahoo nor Google. We made the sale about the importance of information but couldn’t consummate it. We have much to contribute to this new information world but we are largely failing.

I was forced to consider this dynamic again recently in a microcosm of the general problem I have outlined. What restarted this whole train of thought was a report that discussed integrated library systems in a specific context. I apologize for a special pleading but it was both inaccurate and sad. There was no aspect of this report that struck me as informed and the people who paid for it are in peril and they don’t know it. The recommended solution for their problem WILL NOT WORK because the recommended technology won’t do what the marketing folks said it will do and the person who wrote the report, I infer, was none the wiser. You would think after the events of the last year or so, library decision makers would have wised up. One would suppose that by now librarians would have their defenses up to resist vaporware and the blandishments of the slick.

I think one problem is that we do not have a critical mass of technically trained or technically astute people in the field nor an assessment culture. In the librarians’ house, there are many mansions…just not a big enough IT one. And the ones we have…well, you know how they are…not like us…difficult. We treat them like the Dilberts of the library world because too many library decision makers do not understand much about technology.

Years ago, Kathleen (de la Pena) Heim, in a different context, made the perceptive observation that we “need to recruit a new us.” We didn’t in information technology. As a result, we certainly do not have enough IT folks with an understanding of the library function in the field and information seekers are routing around us seeking information.

In many fields with a requisite critical mass of technically-trained people, there is an assessment community. PC Magazine, Ars Technica, and other such sites provide independent assessment of computers, hard drives, thumb drives, and so on. I remember reading magazines that discussed “hi-fi” and reviewed equipment back when I could hear. There are magazines like Consumer Reports for consumer products. These days, one has sources like the reviews and comments at sites like NewEgg.com if you are purchasing computer equipment and discussions there are often lively but usually informed.

In the library world, we have a blizzard of marketing twaddle about technical subjects and a host of people who comment but too few with the ability to pierce the marketing veil and tell folks in the library community what these various products will and will not do based on an accurate understanding of the underlying design concepts and their execution. We do not have an assessment community. How many IT failures based on librarians believing impossibilities do we have to endure?

Boswell quotes Samuel Johnson as saying: “Depend upon it, sir, when a man knows he is to be hanged in a fortnight, it concentrates his mind wonderfully.” Except in the library world.

Bob Molyneux

A Riff on Big

April 7th, 2008 by drdata

I have alluded in several posts to the disparities one finds in looking at distributions of library data. By “distributions” I am talking here about making observations and generalizations when one looks at all the data from a set of libraries.

I am going to discuss a fact of library life we all know about, give you a few numbers, and discuss a few implications. I take circulation figures (TOTCIR as NCES calls it) for all public libraries in the United States for fiscal year 2005. These are the latest national-level data we have from NCES. I use the data as I recompiled them in a dataset that began when I was at the U.S. National Commission on Libraries and Information Science. I have continued to update that series since then. Exhausting documentation exists on that site.

In that year in the dataset, there were 8,957 libraries and they circulated 2 billion items. Of those total circulations, 1.8 billion were reported for the highest quartile. That is, the 2,240 library systems that had the highest number of circulations accounted for 86% of all circulations. Those libraries in the first quartile, that is, those 2,240 with the lowest circulations had 13.9 million circulations or fewer than 1% of the total. This relationship—the big are awfully big and the small are really small—is an observable fact in most variables (staff, income, expenditures, holdings, and so on) and in every universal library dataset I have worked on. It is a characteristic of our national library system. The term in Statistics to describe this kind of distribution is “skew” and library distributions typically are skewed.
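To make the quartile arithmetic concrete, here is a sketch of how such shares could be computed. The circulation counts below are invented to show the shape of a skewed distribution; they are not the NCES figures, and `quartile_shares` is a hypothetical helper.

```python
def quartile_shares(values):
    """Split values into four equal-count groups, lowest to highest,
    and return each group's share of the grand total."""
    ordered = sorted(values)
    n = len(ordered)
    total = sum(ordered)
    shares = []
    for q in range(4):
        start = q * n // 4
        end = (q + 1) * n // 4
        shares.append(sum(ordered[start:end]) / total)
    return shares


# Invented, heavily skewed circulation counts for eight libraries.
toy_circs = [1_000, 2_000, 5_000, 9_000, 20_000, 60_000, 400_000, 1_500_000]

labels = ["first", "second", "third", "fourth"]
for label, share in zip(labels, quartile_shares(toy_circs)):
    print(f"{label} quartile: {share:.1%} of circulations")
```

With numbers shaped like these, the top quarter of libraries accounts for nearly all of the total and the bottom quarter for well under 1%, which is the pattern described above.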

There are two kinds of implications, I think, to this characteristic. One deals with information policy and the second with the design of integrated library systems. Considering the information policy implications first, there are, it seems to me, three aspects that arise when we consider that the structure of library resources in this country is so disparate.

First is the effect of differential resources in a consensual democracy where an informed citizenry is a foundational element.

Second, what effects do differential resources have on the post-Enlightenment notion that was particularly important in the history of U.S. libraries: the library as the university of the common man? This question still is important because of the necessity of continuing education in an era with so many dynamic changes in our economy and where people need to retool for new kinds of jobs.

Third, given that the small libraries are so small, their staffs are also small. Andrea Neiman, of the Kent County Public Library, Chestertown, Maryland, reminded me of something important the other day when we were chatting about the implications of skewness: how little we know about the small libraries. Consider: the top quartile of public libraries employs 85% of all public library total full-time equivalent staff while the bottom quartile employs fewer than 1%, numbers similar to what we saw with circulations, of course. The upper boundary of this bottom quartile is less than 1.2 FTEs; the rest are smaller. Thus, these libraries will not likely be adequately represented at conferences nor in decision-making bodies.

The policy implications are for another place and time, of course, and we all know of the halting attempts to address this problem, that is, to break down the information silos faced by users of libraries.

A second implication of the skewness has to do with the design of library automation systems. They have handled the fact of the distribution of library resources awkwardly and that, in turn, follows from the unsystematic way these resources have been designed traditionally. I will leave to my colleague Mike Rylander to discuss what he has explained to me about how ILSs were designed but there was market segmentation: big ILSs and small ILSs as a result of the limitations of early design decisions and capabilities of the ILSs. The influence these design limitations had on information policy would be a fascinating subject to explore.

In any case, Evergreen is, currently, unique in that its design encompasses very big to pretty small libraries and its ability to handle diverse consortia is also unique—and valuable as I have discussed in several previous posts. The PINES experience indicates that good design can address the information policy aspect of skewed library distributions.

Bob Molyneux

Birds Gathered, No Feathers Ruffled

April 2nd, 2008 by jason

No, despite the title, this isn’t a belated April Fools joke. :)

We had around 50 people descend on the Marriott City Center during PLA for the Evergreen Birds of a Feather gathering. Evergreen stakeholders seemed to be well distributed around the room and it was just an informal gathering with food and much chatting. My only regret is that I neglected to bring a camera. ;) I know I got a lot out of it, and it was very nice to put faces to names. I do think that next time we should also offer a few structured activities to complement the social aspect. If anyone has any ideas or comments, feel free to throw them out here or on the mailing lists.

Thanks folks!

– Jason

Evergreen Scales Down. Way down.

April 1st, 2008 by jason

Much ado has been made about Evergreen’s ability to scale up with both its service-oriented architecture and consortia-savvy interfaces, and as folks witnessed at PLA, it can also scale down to the size of a laptop. Well, now we have really done it. We have taken the n-tier concept to the extreme and have introduced fractional-tiers! Yes, you can now partition your data into quanta, and from base axiomatic principles truly grow Evergreen into a fractal framework that will help gestalt your libraries into the next millennium. It’s no longer about consortia; instead, it’s all about personal digital appliances and technological augmentation. You can take an Evergreen data seed, and install it on, say, an mp3 player or cell phone, or, if you want to really be on the bleeding edge (literally), you can plant an Evergreen data seed into a biochip and have it surgically implanted into your neocortex, freeing you forever from the confines of conventional search and discovery interfaces. Now you can carry your data with you! If you could put Google into your brain, would you? Well, this isn’t Google, but it can be your library!

Welcome to the brave new world of Evergreen!

– Jason

Why do you do it that way? (or, design rationale)

March 20th, 2008 by jason

Yesterday I was explaining to some librarians how Evergreen relates Items (the actual physical barcoded material that circulates) to Volumes (where the Call Number lives) to Bib Records (which contain the MARC), and one person was curious and asked, “why do you do it that way?” The short answer is because it’s good design, but the question momentarily threw me for a loop because the implication is that they’re used to systems which do not do it that way.

Imagine that the structure used for items in the database for your ILS is laid out something like this:

Item Table
----------
Internal ID
Creation Date
Barcode
Call Number
…other fields

In some automation systems, that Call Number field will be a free-text label. If you want to change the Call Number for an Item, then that field will necessarily get changed. If you want to change the Call Number for a large grouping of Items, then you will have to change the field for all those items (sometimes one at a time).

In Evergreen, that Call Number field will actually be a foreign key to another table in the database, one that looks something like this:

Volume
------
Internal ID
Owning Library
Call Number Label
…other fields

This allows a Call Number to be shared amongst a group of items, and be modified with a single edit.

This is also an example of database normalization.
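Here is a minimal sketch of the two tables above, using SQLite from Python's standard library. The table names, column names, barcodes, and call numbers are illustrative assumptions only; this is not Evergreen's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
    -- Hypothetical tables for illustration; not Evergreen's real schema.
    CREATE TABLE volume (
        id             INTEGER PRIMARY KEY,
        owning_library TEXT NOT NULL,
        label          TEXT NOT NULL      -- the call number label
    );
    CREATE TABLE item (
        id      INTEGER PRIMARY KEY,
        barcode TEXT NOT NULL UNIQUE,
        volume  INTEGER NOT NULL REFERENCES volume(id)  -- foreign key
    );
""")

conn.execute(
    "INSERT INTO volume (id, owning_library, label) VALUES (1, 'BR1', 'FIC SMI')"
)
conn.executemany(
    "INSERT INTO item (barcode, volume) VALUES (?, 1)",
    [("30000000000001",), ("30000000000002",), ("30000000000003",)],
)

# One edit to the shared volume row changes the call number for every item.
conn.execute("UPDATE volume SET label = 'FIC SMITH' WHERE id = 1")

rows = conn.execute("""
    SELECT item.barcode, volume.label
    FROM item JOIN volume ON volume.id = item.volume
    ORDER BY item.barcode
""").fetchall()
for barcode, label in rows:
    print(barcode, label)
```

With a free-text Call Number column on the item table, the same change would take one update per item, and nothing would stop the copies from drifting apart.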

Why is this important? It has to do with redundancy of information (and I’m not talking about backup storage, RAIDs, etc.).

Human beings love redundancy when it comes to communication; our meanings get emphasized by our body language, and the syntax and structure of our languages, both spoken and written, encode information in multiple ways. But redundancy opens the door to discrepancy, and discrepancy leads to ambiguity. Human beings can deal with ambiguity, but computers (and most software developers, and maybe catalogers) can’t abide it. “Time flies like an arrow, but fruit flies like a banana”.

A computer might resolve ambiguity in a non-obvious way, or worse, it may resolve ambiguity in a manner that coincidentally matches a human’s expectation, only to cause trouble later when you start getting data anomalies.

Normalization is a best-practice technique for designing relational database schemas, though there are cases when you want to make a trade-off and de-normalize certain data structures. But those cases are usually optimizations that should occur after you already have a good design as your starting point.

Relational databases are powerful enough that they have sunk into the consciousness of librarians, to the extent that they put down “Must be built on a relational database” in their RFPs. But most legacy ILS products actually use hierarchical databases, which are good for some things, but not for others. The design decisions made when you go with a hierarchical database are very different than the ones you make when you start with a relational database, and I worry about the legacy products that have “tacked on” relational databases for buzzword and RFP-compliance. Legacy automation may now have relational databases, but are they actually using them as a relational database? Some are, but I know of at least one that isn’t.

Let’s return to Evergreen for a moment. Because Call Number Volumes are represented as their own entities, you have the option of thinking of them differently. For example, you can move a set of items to a different “call number”, one which may be associated with a different Bib Record altogether (or you could move a volume itself to a new bib record). You could even change which library “owns” a volume, and suddenly all the items attached to that volume have a new owner. And you’re able to place a hold on a specific “volume”, in addition to title-level and copy-level holds (and in Evergreen, meta-record holds across editions and formats).
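A rough sketch of those moves, again in SQLite from Python. Everything here (tables, columns, library codes, barcodes) is a simplified assumption for illustration and does not reflect Evergreen's real schema or how its staff client performs these operations.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Hypothetical, simplified tables; not Evergreen's real schema.
    CREATE TABLE bib    (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE volume (id INTEGER PRIMARY KEY,
                         bib INTEGER REFERENCES bib(id),
                         owning_library TEXT, label TEXT);
    CREATE TABLE item   (id INTEGER PRIMARY KEY, barcode TEXT,
                         volume INTEGER REFERENCES volume(id));

    INSERT INTO bib VALUES (1, 'First edition'), (2, 'Second edition');
    INSERT INTO volume VALUES (10, 1, 'BR1', 'FIC SMI'), (20, 2, 'BR2', 'FIC SMI');
    INSERT INTO item VALUES (100, 'A', 10), (101, 'B', 10), (102, 'C', 20);
""")

# Move one item to a volume that hangs off a different bib record.
conn.execute("UPDATE item SET volume = 20 WHERE barcode = 'B'")

# Change the owner of a volume: every attached item follows automatically.
conn.execute("UPDATE volume SET owning_library = 'BR3' WHERE id = 20")

owners = dict(conn.execute("""
    SELECT item.barcode, volume.owning_library
    FROM item JOIN volume ON volume.id = item.volume
"""))
print(owners)
```

Neither change touches more than one row's worth of intent: the item points at a different volume, or the volume points at a different owner, and the joins do the rest.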

But you can also do what you may already be used to, and change the Call Number for an item (or a batch of items) from an Item Editing interface, and not have to know about the more flexible structures that are being manipulated underneath.

Because we already have Volumes, we’re also in a better place for adding Serials.

There are fewer widespread repercussions to changing an end-user interface than there are with changing your database schema, so that’s why it’s important to have a good database design from the beginning, so that your interfaces have more options. To steal a sentiment from Mr. Miyagi, Evergreen has strong root.

– Jason

Congrats to not one, but two other Open Source ILSs

March 18th, 2008 by miker

First I’d like to welcome NewGenLib to the virtual family of FOSS ILSs. In truth, we’ve known about them for a while and have been looking at their serials interfaces during our ACQ/SER design, but now that eIFL is covering them, well… ;) It’s great to see another entrant, and one that has already found an itch to scratch. I’m sure cross-pollination is in the stars as they seem to have an interesting system.

Next up, a pair of kudos to Koha.

Over the past weekend they added, at a mailing list member’s request, a call number browser inspired by Evergreen’s, which we call Shelf Browse. In Evergreen, because it supports a hierarchical organization of libraries, you can actually browse an entire system or even consortium as one huge virtual shelf! It’s a very nifty feature, and one that we know the PINES patrons have been making good use of (to the tune of 66,965 and counting so far this year, and about 300,000 times in 2007) since Evergreen launched in September of 2006. Now Koha will have a similar feature at the request of a small church Library! This, my friends, is Open Source at work.

By way of evidence from our users, I’ll mention that Evergreen provides call number / shelf browse as a “Quick Search” from the advanced search interface, which is useful to Evergreen users and may be useful for Koha patrons as well. In any case, good work.

I also noticed that Koha has incorporated, as of November of last year according to their source repository’s timestamps, the SIP2 code that David Fiander and Bill Erickson wrote for Evergreen. We’re glad to see the code that GPLS funded is going to good use in and inspiring other projects!

Three and a half years ago, when I first joined PINES and the Evergreen team, there was a dream and a small test server. Now we’ve written more than a quarter of a million lines of code, and that code runs the day-to-day operations of one state-wide library consortium (biggest in the world, he bragged ;) ) with at least two more in the works, and is helping to build a province-wide consortium in Canada — and let’s not forget the Laurentian/McMaster/Windsor “Unholy Trinity.” These are amazing, and they fill my heart with a satisfaction that is difficult to describe, but none of those things, even as possibilities, are why I signed up. I joined this effort because I believe in Open Source software. I believe whole-heartedly that it is a force for positive change in an industry I love, and fits perfectly with the mission of libraries.

Again, congrats to both NewGenLib and Koha, and let’s keep the cross-pollination going.

–miker

Laundry List

March 15th, 2008 by miker

It’s been a while, but here’s just some of what I’ve personally been doing during this latest stretch of radio silence:

  • In-database circulation permit calculation — ~5ms instead of ~100ms for arbitrarily complex circ rules
  • In-database hold permit calculation — ~5ms, down from about 50ms
  • In-database relevance rank adjustments — configure relative weights and have them take effect immediately
  • Native SRU interface (full CQL bibliographic searching context set) with Z39.50 based on Simple2ZOOM
  • Rewritten search code — no more stalls on overly common search terms, and much faster in general
  • Per-object permissions
  • I18N infrastructure for in-database strings (Library names, etc)
  • Infrastructure required for in-database record ingest (coming this summer)
  • Acquisitions and serials data modeling
  • Exposing many preexisting back-end features to the OPAC (advanced query syntax, resort and limit to available after search, etc)
  • Work on some other Evergreen-based ideas

And yes, like everybody else, I’ve added Google Book Search to the Evergreen catalog.

This doesn’t even scratch the surface of the development that’s been going on recently. ACQ is coming along, the OPAC is getting friendlier, patron self-service options are expanding, full I18N is now just a matter of string removal (and we have both French-Canadian and simplified Chinese translations) … the list is far too long for me to remember on a cloudy Saturday morning.

–miker

“We want to emulate the PINES experience”

March 7th, 2008 by drdata

One of the interesting aspects of working with Evergreen is the phone calls from large consortia or state libraries wishing to start large resource sharing networks like PINES. I believe there was a latent demand by the library community–particularly from library users–for ILS software capable of managing large networks that has now been met by Evergreen.

While Evergreen is happy running on small libraries, it does have a unique niche in large consortia because of its distributed database architecture, the OpenSRF backend, and its robustness. This structure allows small libraries to run Evergreen with one server and large consortia to run Evergreen by adding more servers.

A few days ago I reported on the highest circulations so far in PINES for a day (96,000), an hour (11,300) and a minute (548). What was not mentioned is that these transactions occurred while there were one thousand terminals logged into the database being used by library staff who were checking out those materials, cataloging books, and doing other activities that changed the database. There are 275 library outlets spread across a state using Evergreen and this network circulates about 19 million items a year.

The underlying breakthrough is not only the ability of the database to scale but also the ability to handle a high level of changes from many sources to the database, the heart of so much in the consortium. No other ILS software, proprietary or open source, can currently handle this kind of large consortium so gracefully.

In database speak, data silos are relatively small and separate databases that talk to each other with difficulty. Silos are the bane of analysts who have to pierce the spread-out databases to get a coherent picture of, say, the state of a company when each department has its own data in formats that do not integrate with those of other departments. It is similar in the library world: many small libraries that communicate only with difficulty. Any citizen of Georgia can get a PINES card and we know that library patrons are bypassing non-PINES libraries in order to get access to PINES at member libraries. By those actions, they are able to use the materials in a large, virtual library. When library users have a choice, they will break down library silos. Welcome to a long tail world.

Now, the politics is a bit awkward because there are librarians who either do not see the handwriting on the wall or prefer business as usual. In the age of Google, library users have been educated to expect better than that from their libraries.

It is curious that it took leadership from a state library and an open source community to create software with Evergreen’s capabilities and that it never came from legacy vendors.

Bob Molyneux

Code4lib 2008 (Mini) Roundup

March 3rd, 2008 by bill

I’ve been to my share of library conferences over the last couple of years. They all have something to offer in their own way: networking, schmoozing, wacky vendor displays, swag, etc. It’s been my experience, though, that as a geek/developer, I’m often a little disappointed at the content — it’s just not geared toward someone like me. This is why I love Code4lib. The conference is densely packed with interesting technical presentations. They are the kind of talks that not only inspire you, they give you something concrete to take back home. In addition to the presentations, you are surrounded by an interesting gang of library technologists, who, I get the sense, all have tricks up their sleeves. It makes for a fulfilling week, to say the least.

Some highlights

My week started with the Evergreen pre-conference, skillfully led by Dan Scott. Dan blogged about it here. Apart from some technical woes, a lot was seen and (hopefully) learned by all. We performed the install, imported bib and holdings data, implemented new functionality (new OpenSRF method and OPAC UI) to email a user their password if they have forgotten it, and took a quick look at some of the staff client interfaces.

Conference day 1 began with Brewster Kahle’s keynote address on his work with archive.org and openlibrary.org. Brewster’s vision and chutzpah are refreshing and inspiring. The goal of “One wiki page per book” sounds so simple (and obvious) when inserted into a slide show, but the implementation will require tremendous work and resources. I applaud their efforts, not only in principle, but because I think it will lead to better information and resource sharing in the long run.

After lunch, Winona Salesky and Michael Park did a joint presentation called “XForms for Metadata Creation”, which included two different MODS editors developed with XForms. I liked this talk in particular for two reasons. For starters, the demos I saw were of a simple (yet powerful) bibliographic data creation interface, which I see as potentially useful for any ILS. Additionally, the use of XForms, which I only have a basic understanding of, gave me a chance to see some new (for me) technology in action. My interest is piqued.

I skipped the breakout sessions that day to put together a (5 minute) lightning talk on Pylons. One of the interesting challenges of lightning talks is the use of a shared PC and what amounts to a moratorium on web access for the sake of efficiency. My presentation was a set of images and screenshots, the first a crane on a sunny Portland day, which seemed apropos of the topic. I discussed one particular aspect of Pylons, which allows you to easily plug in custom pre- and post-processing middleware applications for your web apps. I demonstrated a simple XML validator and a highlighting plugin which bolded pre-defined terms in the HTML on its way to the client. These gave me the chance to dig a little deeper into Pylons, an architecture which I think has a lot of interesting potential.
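For readers who have not seen this kind of middleware, here is a minimal sketch of the highlighting idea written as plain WSGI middleware, which is the mechanism Pylons builds on. This is not the code from the talk: the class name `HighlightTerms`, the stand-in `demo_app`, and the term list are all made up for illustration, and the regex is deliberately naive (it would also match terms inside tag names or attributes).

```python
import re


class HighlightTerms:
    """WSGI middleware that wraps pre-defined terms in <b> tags in HTML responses."""

    def __init__(self, app, terms):
        self.app = app
        # Match any of the terms as a whole word, case-insensitively
        self.pattern = re.compile(
            r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b",
            re.IGNORECASE,
        )

    def __call__(self, environ, start_response):
        captured = {}

        def capture(status, headers, exc_info=None):
            # Hold the status and headers until the body has been rewritten.
            # Simplification: the legacy write() callable is not supported.
            captured["status"] = status
            captured["headers"] = headers

        chunks = self.app(environ, capture)
        try:
            body = b"".join(chunks)
        finally:
            if hasattr(chunks, "close"):
                chunks.close()

        headers = captured["headers"]
        content_type = {k.lower(): v for k, v in headers}.get("content-type", "")
        if content_type.startswith("text/html"):
            text = self.pattern.sub(r"<b>\1</b>", body.decode("utf-8"))
            body = text.encode("utf-8")
            # The body length changed, so Content-Length must be recomputed
            headers = [(k, v) for k, v in headers if k.lower() != "content-length"]
            headers.append(("Content-Length", str(len(body))))

        start_response(captured["status"], headers)
        return [body]


def demo_app(environ, start_response):
    """Tiny stand-in application used only for this example."""
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    return [b"<p>Evergreen is an open source ILS.</p>"]


def run_demo():
    """Call the wrapped app directly, without a web server."""
    wrapped = HighlightTerms(demo_app, ["Evergreen", "ILS"])
    result = {}

    def start_response(status, headers, exc_info=None):
        result["status"] = status
        result["headers"] = headers

    body = b"".join(wrapped({"REQUEST_METHOD": "GET"}, start_response))
    return result["status"], dict(result["headers"]), body.decode("utf-8")
```

Because the middleware only sees the WSGI interface, the wrapped application needs no changes, which is what makes this style of plugin easy to add or remove.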

Wednesday morning I learned a lot about RDA. Later, I thought the discussion of the ILS Discovery Interface Task Force segued well into Ross Singer’s lightning talk on Jangle, which, as Ross pointed out, could be used to implement the proposed standardized ILS interface.

I was not able to attend the last day, so I missed a chunk of the conference, including Dan’s CouchDB talk. Glad it went well, Dan!

Wow, now I realize why most folks blog about conferences as they are happening ;) It’s a lot to digest, especially for a conference as dense as code4lib. There were a lot of great presentations, breakout sessions, and lightning talks this year and this post only comments on a few of them. Next year’s conference will be held in Providence, RI, and if you can, I would recommend checking it out.

-bill

Measures of maximum load in PINES

February 29th, 2008 by drdata

I was curious about how many circulations PINES had on the biggest day on record. And that led to our asking: which hour and which minute had the most circs?

Maximum circs by DAY:
96,326 - May 29, 2007

Maximum circs by HOUR:
11,305 - May 29, 2007, 11 AM

Maximum circs by MINUTE:
548 - May 2, 2007, 9:21 AM

I understand that the day after Memorial Day in 2007 is remembered as a very busy day for reasons that are clear.
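For anyone who wants to run the same kind of tally on their own data, here is a rough sketch of the counting, assuming you can export one timestamp per checkout. It is not the query we ran against PINES; the function name `busiest_periods` and the sample timestamps are invented for illustration.

```python
from collections import Counter
from datetime import datetime


def busiest_periods(timestamps):
    """Return the busiest (period, count) pair for day, hour, and minute buckets."""
    formats = {
        "day": "%Y-%m-%d",
        "hour": "%Y-%m-%d %H:00",
        "minute": "%Y-%m-%d %H:%M",
    }
    result = {}
    for name, fmt in formats.items():
        # Truncate each timestamp to the bucket size, then count per bucket
        counts = Counter(ts.strftime(fmt) for ts in timestamps)
        result[name] = counts.most_common(1)[0]
    return result


# Hypothetical checkout times, not PINES data
sample = [
    datetime(2007, 5, 29, 11, 5, 10),
    datetime(2007, 5, 29, 11, 5, 40),
    datetime(2007, 5, 29, 11, 30, 0),
    datetime(2007, 5, 29, 11, 47, 0),
    datetime(2007, 5, 29, 14, 2, 0),
    datetime(2007, 5, 2, 9, 21, 3),
    datetime(2007, 5, 2, 9, 21, 30),
    datetime(2007, 5, 2, 9, 21, 59),
]
peaks = busiest_periods(sample)
```

As in the PINES figures, the busiest minute in this toy sample falls on a different day than the busiest day and hour, since each bucket size is counted independently.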

For a bit of perspective, the FY 2005 NCES/NCLIS data for all US public libraries give the following annual figures for total circulations by library:

Q3 (third quartile–the libraries above this number are in the top 25%):
144,663

Q2 (median–the middle value):
43,912

Q1 (libraries below this number are the bottom 25%):
13,113

The arithmetic mean is 229,685.

That maximum number of circs for PINES in one day is more than twice the median annual value for all US public libraries.

Why is the mean so much higher than the middle value? Because the distribution of all these numbers is highly skewed–in the statistical sense. And that fascinating topic can be the subject of a compelling blog post at another time.
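A small sketch may make the skew concrete. The list of annual totals below is made up (it is not the NCES/NCLIS data); the only real numbers are the PINES peak day and the national median quoted above.

```python
from statistics import mean, median

# Hypothetical annual circulation totals: many small libraries, one very large
annual_circs = [8_000, 12_000, 20_000, 45_000, 60_000, 150_000, 1_400_000]

avg = mean(annual_circs)    # pulled far upward by the single huge library
mid = median(annual_circs)  # the middle library, unaffected by the outlier

# Figures quoted in this post
pines_peak_day = 96_326
us_median_annual = 43_912
ratio = pines_peak_day / us_median_annual
```

One very large library is enough to drag the mean to more than five times the median in this toy list, which is the same pattern the national figures show.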

Bob Molyneux

Acquisitions Update

February 15th, 2008 by bill

As noted in Dan’s blog post http://open-ils.org/blog/?p=115, we’ve chosen a new path for acquisitions and serials development for Evergreen. One of the benefits of this new approach is our movement toward a more traditional development process, a process very similar to the one used in the development of the majority of Evergreen. This allows us to efficiently prototype, build, deploy, and repeat. In fact, the only real difference between acquisitions development and the traditional Evergreen development process is our collaboration with other developers in the community from the outset. The developer community is growing and acquisitions and serials development is a good example of how we’ll be growing together.

So what’s been going on for the last month?

We’ve been working through key components of the acquisitions infrastructure, including models for funds, funding sources, providers, and picklists, to name a few. Additionally, we’ve implemented finer-grained permissions handling, some of which was already planned for Evergreen, some of which is new and more detailed to explicitly handle the unique needs of acquisitions and serials management.

To the chagrin of many who are watching, the user interfaces are not the first thing we develop. As we work through use cases, our initial objective is to make sure we understand the system components, how they interact, who can access them, and in what ways. However, we do have some basic interfaces up for testing on our development server: http://acq.open-ils.org/oils/acq/picklist/list. DISCLAIMER: The interfaces are very basic and running on a development (ups & downs) server. Though components of these interfaces may be used in the final system, these were not designed with usability in mind.

Our goal for this month was to complete the basic round-trip “buy a book” scenario. That scenario is almost complete and, after ACQFest III, where some of our remote development team comes to Georgia for a 3-day planning and coding session, we should easily reach that goal and more.

March will bring more focus on enhancing the basic order work flow, user interface development, and initial models for serials control.

The question of the day, of course, is when will Evergreen have an acquisitions system? At the current rate of development, we should have working core acquisitions functionality in April, with serials control following about a month behind.

We always encourage and appreciate feedback. As you check in and watch the interfaces evolve, let us know what you think.

-bill

BC Pines Growth

February 14th, 2008 by brad

Posting an announcement from Jacqueline van Dyk from the British Columbia Pines Project:

Greetings,

I am pleased to announce the launch of the 3rd Evergreen pilot site in BC, Powell River Public Library. Their online catalogue can now be viewed at: http://powellriver.catalogue.bclibrary.ca/

Joining Prince Rupert and Fort Nelson on the system, the collective holdings for all three libraries can be searched here: http://catalogue.bclibrary.ca

Watch for the launch of the Terrace public library next and stay tuned for the remainder of the 2008 implementation schedule.

Welcome aboard Powell River!

Indeed, welcome aboard!

logging #openils-evergreen, and writing OPEN-ILS-DOCUMENTATION

February 11th, 2008 by jason

Hi folks,

Just a quick blurb to let everyone know that we’re now publicly logging the #openils-evergreen IRC channel where most of the Evergreen developers hang out for real-time chat. This was proposed by Dan Scott and informally voted upon on the mailing lists and in-channel. Here’s an excerpt of some of his rationale:

…as our i18n work proceeds and Evergreen becomes a realistic option for more countries throughout the world, we will increasingly run into situations where the time zones of the North American core development team doesn’t match the time zones of potential adopters and potential developers. In addition, in many places Internet connectivity is more sporadic: it might be available just a few hours per day. In those cases, the knowledge that is shared via IRC could be greatly beneficial to others if they were able to at least search for previous discussions on the IRC logs.

We also added a web gateway to IRC to make it easier for folks to join us, and browse and search interfaces for the logs.

Something else I wanted to mention, involving the creation of a new mailing list, OPEN-ILS-DOCUMENTATION. On OPEN-ILS-GENERAL and OPEN-ILS-DEV, I had made this offer:

Here’s an open invitation to everyone: I’ll personally train anyone on Evergreen who’s willing to contribute to the documentation effort (via remote desktop and phone conference technology).

I was surprised by the number of volunteers, and while I have more than enough for the short-term, the offer is still open for those who want training down the road. Thanks again to Dan Scott for agreeing to manage the documentation efforts of these volunteers!

– Jason

Isn’t Evergreen just for consortia with hundreds of libraries?

February 11th, 2008 by brad

We get this question often at Equinox Software and we see it on occasion on the Evergreen community mailing lists and in the library blogosphere.

I am a single-branch public library with about 25,000 titles. I’ve looked at the options and I really like Evergreen. But isn’t it simply ‘too big’ for my situation?

It is true Evergreen was built from the ground up to handle the largest of library systems: it’s highly scalable, fault-tolerant (meaning it can sustain hardware failures and keep running), and has capabilities that make it a perfect selection for large libraries and consortia. Those attributes/goals are clearly stated on the open-ils community website.

However, it is important to realize this does not mean Evergreen is not a good solution for smaller libraries as well. Evergreen does in fact “scale down” elegantly. We know this because there are developers in our community who run Evergreen on their laptops. That’s right folks–they can run the entire server-side software stack on their laptops. Another example: Bill Erickson, at the last code4lib conference, ran an “Evergreen install in 20 minutes” breakout session in which Evergreen was installed on a laptop.

Evergreen was designed to be able to scale to multiple servers of varying sizes and purposes. But it can also scale to a single server (or laptop) if that’s what the library desires and what makes sense in the cost-to-uptime ratio.

Also, the ability of Evergreen to “scale down” from a functional standpoint is something that’s apparent in Kent County Library’s recent migration announcement. Kent County Library serves a population of approximately 19,500 through a main library and 2 smaller branches. There is more information about the decision to go with Evergreen on the Kent County Library blog.

I hope this information puts the rumor to rest that Evergreen is just for big libraries and consortia. The fact is Evergreen can work quite well in smaller situations.

Every reader his or her book; Every book its reader.

February 7th, 2008 by drdata

Lorcan Dempsey’s post Evergreen and Pines refers to a Georgia Public Library Service (GPLS) report Use of Georgia’s public libraries continues to rise in Internet Age and echoes one I made earlier here entitled Save the time of the reader. Briefly, all three deal with increases in the use of Georgia Public libraries generally and PINES particularly. But Dempsey makes a point of mentioning the consortial aspect of PINES and I have been thinking about this very point since my post a few weeks ago.

PINES, of course, runs on Evergreen, and although Evergreen is running individual systems, the fact that Evergreen is the most consortially aware ILS (open source or proprietary) has resulted in some fascinating dynamics, which these cited posts allude to.

In my post, Save the time of the reader, I reported on my attempt to measure the year-over-year change in three PINES use measures that could be compared before and after the switch from the legacy vendor to Evergreen. There are dramatic increases, which I attributed largely to the Ranganathan law the title refers to: the reader’s time was saved as a result of better design of the interface, particularly of holds.

However, there is a measurement problem that brings out something more important, with implications I didn’t understand a month ago. Not all public libraries in Georgia are in PINES and the number changed over the period of my analysis. If we want apples to apples, as the saying goes, we should analytically remove the libraries that were added during the year so that we would have circs, holds, and such at the same set of libraries, that is, so that increases reported are not because of the addition of new libraries but rather because of easier-to-use software with better features. People, though, are just a bit complicated and that simple statistical manipulation would miss something key.
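To show what that apples-to-apples adjustment looks like in practice, here is a minimal sketch. The library names and circulation counts are invented, and `same_set_change` is a hypothetical helper, not something from my earlier analysis.

```python
def same_set_change(before, after):
    """Percent change in total circs, counting only libraries present in both years."""
    common = before.keys() & after.keys()
    total_before = sum(before[name] for name in common)
    total_after = sum(after[name] for name in common)
    return 100.0 * (total_after - total_before) / total_before


# Hypothetical figures; library names and counts are invented
year_1 = {"Library A": 100_000, "Library B": 50_000}
year_2 = {"Library A": 115_000, "Library B": 56_000, "Library C": 40_000}

# Naive comparison: Library C's circs are counted as if they were growth
raw_change = 100.0 * (sum(year_2.values()) - sum(year_1.values())) / sum(year_1.values())

# Same-set comparison: Library C is dropped because it joined during the year
adjusted_change = same_set_change(year_1, year_2)
```

In this toy case the naive figure is roughly 41 percent while the same-set figure is 14 percent, so the adjustment matters a great deal whenever membership changes.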

Georgia has a universal borrower card, so if your local library is not in PINES, you can go to one that is and borrow from the PINES system. And what library users are doing is pretty clear: they are making the drive, because they want access to the 2 million bibliographic entities and the over 9 million items owned by the libraries in PINES. As a result, the obvious statistical treatment itself would miss this dynamic aspect of people’s behavior. They don’t hold still and they react to changes in their environment.

To come full circle, I offer this speculation: that some of the growth in the number of PINES systems (44 to 49) since Evergreen went live in 2006 may be a result of this behavior.

I was graduated from library school in 1971 and I have seen various fashions about how best to do library service: big regionals or small local and intimate libraries. Back and forth. Back and forth. I believe the jury is now in. Library users are voting with their feet; they want the big, resource-rich library.

Chris Jowaisis’s Texas Library Systems wiki has an ILS Discussion in which he offered an opinion I thought insightful about the Georgia experience: “(CJ opinion) political agreements are as impressive as the technology achievements of Evergreen.” I have quoted Chris’s observation often. Now, after seeing the experience in Georgia up close, I think that while we librarians and politicians care about the politics, the people who use libraries don’t. If you build the resource sharing consortium, they will come.

Bob Molyneux

PLA and an Evergreen Users Group, Birds of a Feather gathering?

February 6th, 2008 by jason

Hi folks,

I was wondering if any of you would be interested in gathering together to talk about Evergreen during the Public Library Association’s National Conference (http://www.placonference.org/) in March? Most of the Evergreen developers will be there and we can arrange for a meeting space. The notion is that anyone interested in the project (software, community, vendors, etc.) could be invited, and we could style it as a Birds of a Feather-type gathering, or maybe even the nucleus of an independent Evergreen Users Group.

What do you think? What sort of topics or activities would you be interested in? Would you be interested in something like this at other times or places? Is anyone interested in helping to organize this?

Thanks!

– Jason

LOCATION UPDATE: March 27, 5:30pm - 7:30pm at the Minneapolis Marriott City Center, Gray & Wayzata rooms (8th floor)

PINES Consortium welcomes Lake Blackshear Regional Library

January 31st, 2008 by brad

Just a quick post to announce PINES has grown by another library system: Lake Blackshear Regional. The PINES consortium has now grown to 49 library systems comprising over 275 physical locations. The database has also grown: PINES is up to 9.3 million items and 1.8 million active users. PINES had over 17 million circulations last year.