Device used across U.S. caused Va.'s computer crash


By Rosalind S. Helderman and Anita Kumar

RICHMOND

The data-storage unit that failed last week, wreaking havoc in the computer networks of a number of Virginia agencies, is technology used by virtually every major company and government in the country.

The crash - still baffling to state officials - exposes the vulnerability of modern interconnected computer networks and is being closely watched by information technology professionals across the country.

"People in the industry are watching in horrified fascination as this unfolds," said Robin Harris, an Arizona technology analyst who writes a blog on computer-storage systems. "There's a lot of 'there but for the grace of God go I' kind of thinking."

It is also a political headache for Republican Gov. Bob McDonnell, who promised on taking office that he would straighten out the state's troubled $2.4 billion computing contract with Northrop Grumman, and who is now grappling with what appears to be the worst network failure since the company took over the state's technology overhaul in 2003.

Chief Information Officer Sam Nixon said that the problem began Aug. 25 with the crash of a pair of 3-year-old memory cards, one of which was supposed to back up the other. That led to 485 of the state's 4,800 data servers being knocked offline.

"The thing that is never supposed to happen happened," he said.

Built by Massachusetts-based EMC Corp., the storage units are fundamental building blocks of the databases necessary to run complex organizations in a digital age. Industry analysts say EMC's drives are used throughout the world of finance and government and have generally been considered highly reliable. If the units were found to have flaws that caused the failure, the impact could be felt far beyond Virginia.

"This is surprising - it's a selling point for them when they talk to a major organization, that this stuff never goes down," said Bill Kreher, a senior technology analyst who monitors EMC at the investment firm Edward Jones.

Experts say that even more troubling than the failure of the initial storage drive - which was repaired by Friday - is how long it has taken state agencies, Northrop Grumman and the Virginia Information Technologies Agency to restore computer applications.

Nixon acknowledged that it took much longer than it should have for the system to be restored and said he plans to study ways to speed up the process.

In a statement this week, Northrop Grumman, which will face a state fine of more than $100,000, vowed to learn from the problems and correct them. EMC officials did not return calls for comment. State officials have said the company reported that the Virginia outage was unprecedented, an occurrence never before seen in 1 billion hours of system use.


Northrop Grumman must be paying someone up high...

This is what VITA (Virginia Information Technologies Agency) gets for putting all their eggs in one basket... NG is the worst contractor and has been consistently under-performing on its contract. VITA should have fired them, but instead chose to fire its CIO when he questioned their bills and did not want to pay them because they were not doing a good job.
http://hamptonroads.com/2009/06/company-hired-va-it-help-costing-money-disrupting-work

Sounds to me...

Sounds to me like someone's database somehow (bad programming) became corrupt.

The RAID arrays that are supposed to piggyback and protect data from this only really work for mechanical failure.

If this was software-related (as in an excess of corrupt data ruining the database), there is a very good chance it was just copied onto both hard drives (or more, depending on array size), making the RAID setup virtually worthless.

In other words, there will be positions available for IT careers with this company in the near future (probably).

EMC Corp

Why haven't we (VA taxpayers) heard from the brass at EMC? How about the Pilot giving Joe Tucci (Chairman, President and Chief Executive Officer) or Bill Teuber (Vice Chairman) a call? Claiming that their stuff doesn't break is false advertising. Any computer tech knows better.

It isn't that it doesn't

It isn't that it doesn't break; it is more that it has a built-in array for such times that a hard drive does go down.

Think of it like this: your computer has a hard drive, and everything you save goes to that hard drive.

In a RAID array setup, there are 2 or more hard drives with the same data.

If one hard drive goes down due to MECHANICAL failure, the others can easily continue to do the job needed until the defective unit is changed out.

HOWEVER, this only really protects against mechanical failure. If you get a nasty virus or corrupt data that goes unchecked, it can affect the entire RAID array.
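
A toy sketch of that point, in Python. The `Mirror` class and its methods are made up for illustration and model plain mirroring only; this is an assumption-laden example, not a description of how the EMC unit in Virginia actually works.

```python
class Mirror:
    """Toy mirrored array: every write is copied to all healthy drives."""

    def __init__(self, copies=2):
        # Each "drive" is just a dict mapping block number -> data.
        self.drives = [dict() for _ in range(copies)]
        self.failed = set()

    def write(self, block, data):
        # The array copies whatever it is given, good or bad.
        for i, drive in enumerate(self.drives):
            if i not in self.failed:
                drive[block] = data

    def fail(self, index):
        # Simulate a mechanical failure of one drive.
        self.failed.add(index)

    def read(self, block):
        # Serve the read from the first drive that is still healthy.
        for i, drive in enumerate(self.drives):
            if i not in self.failed:
                return drive[block]
        raise IOError("all mirror members have failed")


# Case 1: mechanical failure. The surviving drive still has the data.
hardware_case = Mirror()
hardware_case.write(0, "good record")
hardware_case.fail(0)
survived = hardware_case.read(0)

# Case 2: bad data written by software. Both copies end up corrupt.
software_case = Mirror()
software_case.write(0, "good record")
software_case.write(0, "corrupt record")
copies = [drive[0] for drive in software_case.drives]
```

In the first case the read still returns the good record from the second drive. In the second case every drive holds the same corrupt record, so the mirror offers no way back; that takes a separate backup or snapshot.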

I think the blame goes to misinformed IT guys who thought the system was indestructible, or more likely the powers that be didn't feel the need to pay an experienced IT guy a decent wage.

But as the article says, it is being watched closely by everyone in the field because this is (more or less) the first major hiccup in this design.

"never supposed to happen", "never goes down"

'Never' and 'always' USUALLY have exceptions.

To all those demanding perfection: ain't gonna happen

Perfection is an impossibility, but some people expect it. To me, 2 questions really need to be answered: what caused the failure, and why did the recovery take so long? I will reserve judgment until I get the answers.

Supposin' ------

--- that an impasse in foreign relations became an opportunity for a covert enemy of Uncle Sam to paralyze America's electronic arteries and force America into a further diminishing of world involvement; that is the goal of bin Laden's following: the withdrawal of American influence and presence in the Middle East. We have already been warned of Eastern European attacks via the Internet on technological interests here. Suppose the Chinese government or the Russian government wanted that capability.

I think you may be getting

I think you may be getting too riled up in this whole paranoia.

Computers are not flawless by any means.

Automatically assuming "TERRORISTS DID IT!" is not an answer for everything that goes wrong.

You give them WAY too much credit.

Money Isn't Everything, But To Virginia...

Anyone's dollar will do! That's a pretty hefty fine. Why don't they just find a company that can handle the load? Funny, no other states are reporting on this...

Hummmm another little white one, perhaps??? Not our officials!

It is a reasonable

It is a reasonable fine.

They installed the "failsafe" systems that caused delays and data loss over a week.

Remember, when a data server goes down, you lose money EVERY SECOND it is down.

Well that and the fact these server clusters can cost quite the pretty penny.
