You’ve got … nothing?
Last week’s UCLink failure sheds light on incorporation of e-mail into every aspect of campus life
19 March 2003
The campus’s dependence on electronic mail — for teaching, business transactions, and routine communication between units and individuals — has grown exponentially over the past decade. The extent of that dependence came home last week, when a system failure directly affected all campus UCLink e-mail users, thousands of people trying to correspond with them, and a variety of business transactions.
“This is the biggest system failure that I’ve experienced,” says Jack McCredie, associate vice chancellor for information systems and technology (IST). It was also “the most troublesome,” he added — since for several days it wasn’t clear what had happened. “Complex systems fail in complex ways, and this was a very complex and subtle software problem.”
The problem began to unfold early Friday, March 7, after IST staff performed a routine space-reallocation operation on UCLink, the e-mail service the campus provides free of charge to 45,000 students and employees. It was immediately obvious that a system error had occurred. But the full extent of the damage — which corrupted uclink, uclink4, boalthall, tsw, and uhs accounts whose names started with the letters A-D and J-O — was not clear until the following Monday. IST then requested on-site support from the system vendor, Hewlett-Packard — whose technicians struggled, alongside IST staff, to diagnose the complex hardware/software failure and repair the system. A handful of programmer analysts worked over the weekend to recover data, and by the evening of Monday, March 17, all faulty hardware components had been replaced and service had been restored for virtually all UCLink accounts.
Word of events at Berkeley circulated quickly among e-mail managers at campuses around the country. “We got a sympathy message from the University of Chicago, describing their woes in December, when their e-mail system was down for 10 days during finals,” says Jerry Smith, director of Workstation Support Services, which oversees UCLink.
The week’s e-mail trouble, from initial delays in delivery to a complete shutdown on several days for almost half of UCLink’s 45,000 accounts, affected virtually every campus constituency — faculty trying to answer questions about course material; researchers collaborating on academic papers and time-sensitive grant applications; students trying to access their e-mail, through the web, from handheld computers and campus terminals; personnel officers attempting to process human-resources changes; and staff working on all manner of campus business.
“It was very enlightening to find out how deeply embedded e-mail is in campus business and academic processes,” says Ann Dobson, associate director of Central Computing Services. Since UCLink’s inception in the early 1990s, its usage has grown by roughly 5 to 10 percent each semester.
Novelty and frustrations
For some, the meltdown brought a pleasant reversion to simpler times and lower-tech means of communication. Senior Editor Dick Corten of the Graduate Division resorted to “sneakernet” to hand-deliver documents in Sproul Hall; Maureen Morely, executive director of the Academic Senate, Berkeley division, reported the pleasure of a face-to-face conversation with such a messenger.
But for many units and for campus instructors — especially those teaching large lecture courses — frustrations far outweighed novelty. Undergraduate Admissions now relies heavily on e-mail to communicate personally with prospective students, obtain supplemental application information, and provide passwords to admissions-status updates on its web page. “Our records show that 550 applicants were impacted by UCLink being down,” says Director of Undergraduate Admissions Pam Burnett. “After all of the nurturing we did with students, it was ironic that they were then rebuffed by UCLink, and we lost hundreds of their messages.”
In the School of Public Health, Maureen Lahiff teaches probability and statistics to some 80 students. “It’s really great to have the students e-mail me questions,” she says, “because often in the very process of articulating their questions, it allows them to discover the answer.”
For Raymond Jeanloz, professor of earth and planetary science, the technology has become an essential tool for communicating with 400-plus undergrads in a popular course, “The Planet.” On the eve of a midterm, jeanloz@uclink went down — provoking a dilemma for Jeanloz and his co-teacher. The two get dozens of e-mail messages from students asking them to clarify a concept or explaining a personal circumstance. “We had to think through the consequences of holding an exam. We finally decided to go ahead with the midterm, but I was personally concerned.”
A further frustration, for Jeanloz and others, was not knowing where to find reliable updates on how long the crisis might last, either from UCLink itself or from Public Affairs, whose staffers (not being on UCLink) were unaware of the extent and duration of the outage until Wednesday morning. (Regular updates on recovery efforts, with links to the UCLink web page, have been available since that time from Public Affairs on the campus’s online NewsCenter, newscenter.berkeley.edu.)
UCLink was designed a decade ago, “for communication between individuals,” says Smith. Increasingly, however, the data flowing through UCLink’s “veins” are generated by administrative or business applications. The Human Resources Management System (HRMS) might notify a departmental contact of a change to an employee’s record; the forthcoming e-Travel application will forward travel information from a preparer to a reviewer.
This use of UCLink, called “enterprise messaging,” has gained momentum over the last two years, Smith says, as the campus has put more of its administrative and business functions online. “We’re paying the penalty for success,” he says. “I guess we should take it as a compliment that people take UCLink for granted and build it into their own enterprise applications.”
Last week’s events revealed the vulnerabilities of the system. Employees trying to perform personnel transactions through HRMS, for example, were unable to complete their work until the e-mail-dependent feature was disabled.
According to McCredie, the campus has been aware of UCLink’s limitations for some time. IST has made numerous hardware and software upgrades to the system since its inception, and recently developed a request-for-proposal for an up-to-date replacement system designed to handle both conventional e-mail messages and enterprise-messaging traffic.
Because of a 4-percent budget cut for FY2002-03, that project was put on hold this year, says McCredie. Even bigger cuts loom on the horizon. “But this is a loud call to all of us, that we have to find a way, within the coming months, to make that purchase,” he says.
Will IST replace UCLink this year? “If you’d asked me two weeks ago, I would have told you that we won’t be able to afford it,” says McCredie. “But given the magnitude of problems we’ve just experienced, I have to make this a proposal that comes through this year. Enterprise messaging is too crucial.”