
Trends in computing

 


I recently read Pat Helland's slides on trends in computing (also available in video).
They give a long-term perspective (about ten years) on deep changes in computing and on how those changes will affect the way we build software.
Distributed computing will become increasingly common, parallel computing will permeate all applications, and engineering trade-offs will open up to non-atomic guarantees and affect business models.


Here are some key take-aways:

Uni-processor performance is leveling off (because of the instruction-level parallelism, power, and memory walls). Adding many cores is the only way to significantly increase performance (see the sketch after this list).
Each core will be smaller and cooler, and the price for aggregate performance will continue to go down.
Similarly, data centers will also shrink, get cheaper, and become more distributed commodities.

Storage is also changing. Disk capacity has increased a lot, but I/O access rates have not kept pace. Soon, disks will be treated like tapes (sequential access) or used as cold storage.
On the other hand, flash storage offers scalable bandwidth (the more flash chips you add, the more bandwidth you get to your data) and its price per gigabyte keeps falling. On top of that, flash memory uses very little power.
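
To make the many-cores point concrete, here is a minimal Python sketch (the workload and numbers are made up for illustration): a CPU-bound function fanned out over all available cores with a parallel map, which is where aggregate performance gains now have to come from.

```python
# Toy illustration: spread a CPU-bound task over all available cores.
# The function and inputs are invented; the point is that throughput
# comes from adding cores, not from a faster single core.
from multiprocessing import Pool, cpu_count

def count_primes(limit):
    """Naive prime count up to `limit` (deliberately CPU-bound)."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

if __name__ == "__main__":
    chunks = [50_000] * 8  # eight independent pieces of work
    with Pool(processes=cpu_count()) as pool:
        results = pool.map(count_primes, chunks)  # chunks run in parallel
    print(sum(results))
```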


The trends Pat illustrates converge on one point: a deep shift towards parallelism.
Consequently, data will need to be local and replicated in more places (cache on each core, RAM, various devices and cloud storage).
This leads to increased concurrency and synchronization problems.

He suggests that a partial solution will be to accept relaxed consistency constraints, such as allowing local decisions (with the computer equivalent of an apology when there is a screw-up) and eventual consistency, instead of requiring atomic transactions.
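
Here is a minimal sketch of that "guess locally, apologize later" idea, with entirely hypothetical names and numbers: orders are accepted against a possibly stale local view of inventory, and a compensating apology is issued when later reconciliation shows the guess was wrong.

```python
# Sketch of "local decision + apology" instead of a global atomic check.
# All names (Store, accept_order, reconcile) are hypothetical.
class Store:
    def __init__(self, stock):
        self.local_stock_view = stock   # possibly stale replica of inventory
        self.accepted = []              # orders accepted optimistically

    def accept_order(self, order_id, qty):
        # Decide locally, without waiting for a distributed transaction.
        if self.local_stock_view >= qty:
            self.local_stock_view -= qty
            self.accepted.append((order_id, qty))
            return "accepted"
        return "rejected"

    def reconcile(self, true_stock):
        # Later, the authoritative count arrives; apologize if we over-sold.
        sold = sum(qty for _, qty in self.accepted)
        if sold > true_stock:
            return f"apology: {sold - true_stock} unit(s) over-sold, issue refunds"
        return "all good"

store = Store(stock=5)
print(store.accept_order("A1", 3))    # accepted, based on the local view
print(store.accept_order("A2", 2))    # accepted
print(store.reconcile(true_stock=4))  # apology: 1 unit(s) over-sold, ...
```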


The trends and analysis Pat presents resonate with what I've witnessed in the industry. See the pointers section below.

Many of these points are already visible in distributed computing, but the changes will increasingly apply within a single machine as well.
This will require new toolsets (programming languages and compilers) as we move from sequential to parallel programming. Humans simply will not be able to deal with this level of parallelism by hand (think threads, mutable shared memory, and locks) and with all the cases and errors that may occur.
I would expect techniques from functional programming to gain a lot of traction to help address these issues.
High-performance and secure message passing will probably require a mix of new hardware and OS design, and borrow from the object-capability security model as well.
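
As a rough illustration of that share-nothing, message-passing style (the names are illustrative, and Python threads will not give CPU parallelism because of the GIL, but the structure is what matters): workers communicate only through queues, so there is no shared mutable memory to guard with locks.

```python
# Sketch of actor-style message passing: the worker shares nothing with
# its caller and communicates only through queues.
import threading
import queue

def worker(inbox, outbox):
    while True:
        msg = inbox.get()
        if msg is None:           # sentinel value: shut down
            break
        outbox.put(msg * msg)     # pure transformation of the message

inbox, outbox = queue.Queue(), queue.Queue()
t = threading.Thread(target=worker, args=(inbox, outbox))
t.start()

for n in range(5):
    inbox.put(n)
inbox.put(None)                   # ask the worker to stop
t.join()

while not outbox.empty():
    print(outbox.get())           # 0, 1, 4, 9, 16
```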


As Dr. Burton Smith put it in his "Reinvention of Computing" keynote:
“The coming years will fundamentally reshape software and transform the way people use and interact with computers. In order for consumers to enjoy performance improvements in the future, mass-market technology providers will have to embrace parallel computing to differentiate and compete. It’s vital that software and hardware adapt to new models of computing.”


Pointers:


Hadoop, using disks as tapes.

The MapReduce programming model applied to multi-cores, here (video) and here (pdf); a toy word-count sketch follows this list.

Singularity is a research operating system which lowers the cost of communication between processes and the kernel by running all of them in the same ring, and keeps them isolated and secure by using a safe compiler and following capability discipline.

Polyphonic C#, Parallel LINQ, DryadLINQ, and the Parallel FX extensions are all extensions of .NET that make it more suitable for parallel computing.

Microsoft's Parallel Computing page also has many great resources.

Dynamo is a highly available distributed store with a simple Get/Set API (like a hashtable). It is a smart balancing of the constraints of the CAP conjecture (consistency, availability, and partition tolerance, choose two), which demonstrates that systems can be very useful even with a relaxed consistency guarantee. In particular, it keeps and returns multiple conflicting versions of a single object, because it prefers write availability over atomic transactions.
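
As a toy illustration of the MapReduce-on-multicore pointer above (the documents and function names are made up): the map phase runs across cores with a process pool, and the reduce phase merges the partial word counts.

```python
# Toy MapReduce word count: map runs on multiple cores, reduce merges
# the per-document counts. The input documents are invented.
from multiprocessing import Pool
from collections import Counter
from functools import reduce

def map_phase(document):
    return Counter(document.split())   # per-document word counts

def reduce_phase(acc, partial):
    acc.update(partial)                # merge partial counts into the total
    return acc

if __name__ == "__main__":
    documents = [
        "the quick brown fox",
        "the lazy dog",
        "the fox jumps over the dog",
    ]
    with Pool() as pool:
        partials = pool.map(map_phase, documents)   # parallel map phase
    totals = reduce(reduce_phase, partials, Counter())
    print(totals.most_common(3))
```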

Update (2008/12/24):

Jim Gray's Tape is Dead, Disk is Tape, Flash is Disk, RAM Locality is King and Rules of Thumb in Data Engineering (which explains the storage hierarchy).

Gustavo Duarte's summary of bandwidth and latency in the various parts of a computer.

______________________________________

Wow, what I would give for a 500GB flash drive! Maybe one day..

I wonder if someone will come up with a way to harness the tons of old single-processor chips that will be lying around one day - perhaps a way to make motherboards that combine multiple chips as another option for distributed computing. That would save a ton of waste from landfills in the future.

Posted by: Alice (April 13, 2008 03:27 AM)