Traditional programming has hit the power wall
April 13, 2011 // By Eric Verhulst and Bernhard Sputh
A recent article in EE-Times Europe states that computing has hit a power wall. Indeed, chip designers have spoiled programmers in the past with an ever-increasing supply of compute cycles and memory space to waste. This has led to great new features, which we would all like to keep. However, the way we program these hardware monsters has not really changed.
Yes, compilers have become better at optimising code, but everything after that step has stayed the same. The linking phase of C/C++ programs is still largely a brute-force operation that includes everything the program might need, and very often code that will never be executed. This leads to enormously bloated programs that have to be held:
- in non-volatile storage, and
- in the RAM of the system that executes them.
A simple “Hello World” might need a few Mbytes and link in tens of thousands of functions. This is less of an issue on desktop-type systems, which have ample cheap (D)RAM and reasonably sized caches. That is not the case for embedded systems, which represent the ever-growing bulk of computer-driven systems on the planet.
Needing a lot of (D)RAM costs not only money but also energy, because (D)RAM must be refreshed continuously and often operates with hundreds of wait states relative to superfast GHz CPUs. This becomes part of the power wall we are currently hitting. To keep following Moore’s law, the only way forward is more parallel processing cores on the same die, even though that does not increase the access speed to the external (D)RAM. In the end, chips are pin-bound and performance on such chips is cache-bound, so code size still matters.
However, this is only a sideline to the real problem that developers hit today when trying to exploit parallelism. First of all, the threading approach used today is difficult to get right: the state space easily explodes beyond what a single developer can keep in their head, and traditional testing cannot cope with this. The situation is made worse by the fact that most thread synchronisation mechanisms are themselves hard to use correctly. However, there is good news.
The problem was already solved over 30 years ago by C. A. R. Hoare, who developed the formal Communicating Sequential Processes (CSP) process algebra. CSP has been implemented in both software and hardware systems, and today implementations are available for many programming languages: JCSP for Java, PyCSP for Python, C++CSP for C++ and libCSP2 for C, to name just a few.
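To make the CSP idea concrete, here is a minimal sketch of synchronous channel communication using only the Python standard library. It does not use PyCSP or any of the libraries named above. The `Channel` class and the `generate`, `square` and `run_pipeline` functions are hypothetical names chosen for illustration, and the channel assumes exactly one sender and one receiver.

```python
import queue
import threading


class Channel:
    """Unbuffered, CSP-style channel (assumes one sender, one receiver)."""

    def __init__(self):
        self._q = queue.Queue(maxsize=1)

    def send(self, item):
        self._q.put(item)   # hand the item over
        self._q.join()      # block until the receiver has taken it

    def receive(self):
        item = self._q.get()
        self._q.task_done()  # release the blocked sender
        return item


def generate(out, n):
    """Process 1: emit 0..n-1, then None as an end marker."""
    for i in range(n):
        out.send(i)
    out.send(None)


def square(inp, out):
    """Process 2: square every value and forward the end marker."""
    while True:
        x = inp.receive()
        if x is None:
            out.send(None)
            return
        out.send(x * x)


def run_pipeline(n):
    """Wire the two processes together and collect the results."""
    a, b = Channel(), Channel()
    workers = [
        threading.Thread(target=generate, args=(a, n)),
        threading.Thread(target=square, args=(a, b)),
    ]
    for w in workers:
        w.start()
    results = []
    while True:
        y = b.receive()
        if y is None:
            break
        results.append(y)
    for w in workers:
        w.join()
    return results
```

Calling `run_pipeline(4)` returns `[0, 1, 4, 9]`. The processes share no variables. Each `send` completes only once the matching `receive` has happened, which is the rendezvous behaviour that keeps the state space of such a system small enough to reason about.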
However, most of these CSP implementations are still designed for single-CPU systems, although there are environments that spread CSP systems over multiple CPUs, even across the internet. At first sight CSP is also not intuitive to the “normal” developer, who usually works within a sequential programming paradigm, whereas CSP takes more of a top-down approach.
But 30 years of history also means that these problems have been recognised, and that people have worked on them and improved the situation. One novel result of such work, preceded by work on the Virtuoso parallel RTOS (acquired by Wind River Systems in 2001), is the so-called Interacting Entities approach developed by Altreonic and implemented in its network-centric OpenComRTOS.
It provides a scalable, concurrent way of programming, whether the target is a single-chip multicore device or a networked heterogeneous system with thousands of nodes. In the Interacting Entities approach there are two types of entities: tasks and hubs. Tasks can be compared to “clean” threads, i.e. they are active entities with their own user-defined functionality but with a private workspace.
Hubs, on the other hand, are passive entities that implement the interactions between tasks. Hubs can represent synchronisation primitives such as semaphores and mutexes, but they can do much more: users can develop their own hubs, which allows very complex synchronisation protocols to be implemented easily. Tasks interact only via hubs, so there are no direct task-to-task interactions.
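The split between active tasks and passive hubs can be sketched in a few lines of Python. This is an illustration of the idea only, not the OpenComRTOS or PIE API. The names `Hub`, `SemaphoreHub`, `PortHub`, `sampler`, `reporter` and `run_system` are assumptions made for this example, and threads from the standard library stand in for tasks.

```python
import queue
import threading


class Hub:
    """Passive entity: no thread of its own, it only reacts to task requests."""


class SemaphoreHub(Hub):
    """Hub acting as a counting semaphore (illustrative)."""

    def __init__(self):
        self._sem = threading.Semaphore(0)

    def signal(self):
        self._sem.release()

    def wait(self):
        self._sem.acquire()


class PortHub(Hub):
    """Hub that passes data items from one task to another (illustrative)."""

    def __init__(self):
        self._q = queue.Queue()

    def put(self, item):
        self._q.put(item)

    def get(self):
        return self._q.get()


def sampler(data_hub, done_hub, samples):
    """Active task: works on private data, publishes results via hubs only."""
    for s in samples:
        data_hub.put(s * 2)
    done_hub.signal()  # tell whoever waits on this hub that all data is out


def reporter(data_hub, done_hub, report_hub, count):
    """Active task: waits for the signal, then sums the published values."""
    done_hub.wait()
    total = sum(data_hub.get() for _ in range(count))
    report_hub.put(total)


def run_system(samples):
    """Create the hubs, start the tasks and read the final report."""
    data_hub, done_hub, report_hub = PortHub(), SemaphoreHub(), PortHub()
    tasks = [
        threading.Thread(target=sampler, args=(data_hub, done_hub, samples)),
        threading.Thread(
            target=reporter,
            args=(data_hub, done_hub, report_hub, len(samples)),
        ),
    ]
    for t in tasks:
        t.start()
    result = report_hub.get()
    for t in tasks:
        t.join()
    return result
```

With `run_system([1, 2, 3])` the result is `12`. Neither task holds a reference to the other. Each one knows only the hubs it was given, so the interaction protocol lives entirely in the hubs and can be changed without touching the task code.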
With this strong decoupling between active and passive entities, each of which has a unique address within the system, it is possible to distribute a system over many CPUs and still achieve the same logical behaviour without rewriting the code. This mechanism is now also being ported from the original embedded implementation in C (which requires only 5 KiBytes per node) to other language environments. The Python port is called Python Interacting Entities (PIE).
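The claim that addressing makes placement irrelevant to task code can be illustrated with a small simulation. Everything here is hypothetical: the `System` class, the `REQUESTS` and `REPLIES` addresses and the node names are invented for this sketch, and the "nodes" are plain dictionaries inside one process, where a real distributed system would route the same requests over a network.

```python
import queue
import threading

# Hypothetical hub addresses, unique within the system.
REQUESTS, REPLIES = 1, 2


class System:
    """Routes hub operations by address; tasks never see which node owns a hub."""

    def __init__(self, placement):
        # placement maps hub address -> node name
        self._placement = dict(placement)
        self._nodes = {}
        for address, node in self._placement.items():
            self._nodes.setdefault(node, {})[address] = queue.Queue()

    def _hub(self, address):
        return self._nodes[self._placement[address]][address]

    def put(self, address, item):
        self._hub(address).put(item)

    def get(self, address):
        return self._hub(address).get()


def doubler(system, count):
    """Task code: refers to hubs by address only, never by location."""
    for _ in range(count):
        system.put(REPLIES, 2 * system.get(REQUESTS))


def run(placement, values):
    """Run the same task code under a given hub-to-node placement."""
    system = System(placement)
    worker = threading.Thread(target=doubler, args=(system, len(values)))
    worker.start()
    for v in values:
        system.put(REQUESTS, v)
    replies = [system.get(REPLIES) for _ in values]
    worker.join()
    return replies
```

Both `run({REQUESTS: "cpu0", REPLIES: "cpu0"}, [1, 2, 3])` and `run({REQUESTS: "cpu0", REPLIES: "cpu1"}, [1, 2, 3])` return `[2, 4, 6]`. Only the placement table changes between the two runs, while `doubler` stays untouched.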
Other languages to follow are Java, Ruby and Haskell. This demonstrates that a straightforward, efficient programming paradigm that works across heterogeneous hardware platforms as well as heterogeneous programming languages for new multi-processor platforms is not a pipe dream requiring lots of new research. The fundamental work was already done by Dijkstra and further formalised by Hoare and others. Interacting Entities is the pragmatic superset that makes things happen.
More information is available at www.altreonic.com
Related article:
Computing has hit 'power wall'