BI in Action Blog
August 24, 2007
Big Data: Why Size Doesn't Necessarily Matter
A couple of months back, I had the opportunity to talk with Alex Spinelli, CTO of TheStreet.com, about the ways his company leverages the massive amounts of content and data surging through its servers each day. Jim "Mad Money" Cramer alone is probably enough to fill a few servers. (This interview originally appeared in an article in Database Trends & Applications.)

Spinelli told me that TheStreet.com maintains more than a terabyte’s worth of data in various forms: articles, alert data, company data, and trading data. Although the online information service’s data store totals well into the terabyte range, the company’s data managers prefer to keep that data in a distributed format. “We addressed the problem of having lots and lots of information on lots and lots of data servers by cutting it up into smaller segments,” Spinelli said. “Each one is a bit more manageable and allows us to have a bit more flexibility, since the information is specialized.”

What to do with all this data?

Just a few years ago, the largest databases were only just starting to top the 1TB mark. Now, a terabyte is almost commonplace, said Richard Winter, president of Winter Corp. In fact, a terabyte equals “only a few disk drives these days.”

The ability to effectively manage these large stores of data leads to greater business agility, Spinelli said. That’s why he prefers to maintain large data stores in a distributed fashion, linked by grid and clustering technologies. “The landscape changes quickly for businesses such as ours,” he said. “The ability to be agile and flexible is one of the most important things I can deliver to my business.”

TheStreet.com is deploying grid and clustering solutions to avoid building “a giant huge database running on big subsystems,” Spinelli explained.
“We can actually be very smart about building out a modular database that scales horizontally and lets us still slice and dice as we need to, and be very flexible and agile, but have it all within the same management systems. That will enable us to very quickly move in different directions.”

Posted by joemckendrick in Data Management















