Technology
The BrightScale Array™ Architecture Takes a New Approach
Currently there are two basic approaches to implementing video processing and compression functions. ASIC implementations provide efficient solutions in terms of cost and performance per unit of power. However, they are narrowly targeted designs that allow little or no flexibility. This limits feature and performance enhancements over time, adaptability to changes in algorithms and standards, and the ability to give end products multiple functional personalities. DSP-based solutions provide a programmable and flexible alternative, but existing programmable architectures cannot meet the demanding performance needs of video processing within the cost and power budgets appropriate to the application.
BrightScale takes a new approach to parallel architectures with its patented BrightScale Array architecture. BrightScale Array combines thousands of RISC-like processing elements with local memory on a single chip. This tight coupling of distributed processing and local memory is a variant of the next-generation Processor-in-Memory (PIM) designs currently in development at the major semiconductor companies.
BrightScale has made substantial progress over those companies with the first cost-effective PIM implementation. Implemented in standard bulk CMOS, BrightScale Array incorporates a proprietary inter-processor network that can deliver unprecedented levels of memory bandwidth and true linear scalability. The BrightScale Processing Elements (PEs) are driven by a programmable BrightScale controller, which executes a sequential instruction stream. This provides a software abstraction of the array processor that simplifies customer use.
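The controller-plus-array arrangement described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not BrightScale's instruction set or toolchain: the names `ProcessingElement`, `Controller`, `load`, `add_imm`, `clamp`, `store`, and `brighten` are all hypothetical, and the inner loop stands in for what the hardware would do across all PEs in parallel.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ProcessingElement:
    """Hypothetical PE model: one accumulator plus private local memory."""
    local_memory: List[int]
    acc: int = 0


# An instruction is one operation that every PE applies to its own data.
Instruction = Callable[[ProcessingElement], None]


def load(addr: int) -> Instruction:
    def op(pe: ProcessingElement) -> None:
        pe.acc = pe.local_memory[addr]
    return op


def add_imm(value: int) -> Instruction:
    def op(pe: ProcessingElement) -> None:
        pe.acc += value
    return op


def clamp(lo: int, hi: int) -> Instruction:
    def op(pe: ProcessingElement) -> None:
        pe.acc = max(lo, min(hi, pe.acc))
    return op


def store(addr: int) -> Instruction:
    def op(pe: ProcessingElement) -> None:
        pe.local_memory[addr] = pe.acc
    return op


class Controller:
    """Steps through a sequential program, broadcasting each step to all PEs."""

    def __init__(self, pes: List[ProcessingElement]) -> None:
        self.pes = pes

    def run(self, program: List[Instruction]) -> None:
        for instruction in program:      # single sequential instruction stream
            for pe in self.pes:          # modelled serially; parallel in hardware
                instruction(pe)


def brighten(pixels: List[int], amount: int) -> List[int]:
    """Place one pixel in each PE's local memory and run the same program on all."""
    pes = [ProcessingElement(local_memory=[p, 0]) for p in pixels]
    Controller(pes).run([load(0), add_imm(amount), clamp(0, 255), store(1)])
    return [pe.local_memory[1] for pe in pes]
```

The programmer writes one sequential program (`load`, `add_imm`, `clamp`, `store`), and the array applies it to every pixel, which is the kind of abstraction the text refers to.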
BrightScale Array and Processing Elements (PE)
The BrightScale Array is a new recipe composed of proven ingredients: it combines the efficiency of a RISC core, the scalable performance of a SIMD array and the simplicity of a DSP.
The BrightScale team first implemented the BrightScale Array in a TSMC 0.13 µm generic process in 2005. This prototype chip contained 4096 processing elements, with the controller running in an FPGA. This demonstration platform showed the basis for a production chip capable of delivering over 200 megapixels/sec of color-processed, JPEG-compressed images at an efficient 10 mW per megapixel/sec. The 2005 proof of concept successfully demonstrated the BrightScale Array claims of performance, low power and programmability.
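To put the quoted efficiency figure in more familiar units, the arithmetic can be sketched as follows. The function names are hypothetical, and the calculation assumes that the 10 mW per megapixel/sec figure scales linearly with throughput, which the source does not state explicitly.

```python
def total_power_watts(throughput_mpix_per_s: float, mw_per_mpix_per_s: float) -> float:
    """Total power in watts, assuming power scales linearly with throughput."""
    return throughput_mpix_per_s * mw_per_mpix_per_s / 1000.0


def energy_per_pixel_nj(mw_per_mpix_per_s: float) -> float:
    """Energy per pixel in nanojoules.

    1 mW per (megapixel/sec) = 1 mJ per megapixel = 1e-3 J / 1e6 pixels = 1 nJ per pixel,
    so the numeric value carries over unchanged.
    """
    return mw_per_mpix_per_s * 1.0


power = total_power_watts(200, 10)   # 2.0 W at 200 megapixels/sec
energy = energy_per_pixel_nj(10)     # 10.0 nJ per pixel
```

Under that assumption, 200 megapixels/sec at 10 mW per megapixel/sec corresponds to about 2 W in total, or roughly 10 nJ per processed pixel.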
In 2006, the BrightScale team developed a System on Chip, the BrightScale BA 1024 Programmable Media Processor, incorporating its patented BrightScale Array.
The BrightScale BA 1024 incorporates 1024 specialized PEs arranged in an array that enables high-performance parallel processing of digital video streams for Digital Television and other multimedia applications.
The BrightScale BA 1024’s unique processor-in-memory architecture, combined with its highly efficient interconnect fabric, is a groundbreaking implementation of fine-grained multiprocessing. Its performance is enabled by a careful balance between computing power and memory bandwidth.
The BrightScale Programmable Video Platform (PVP) approach enables creative deployment of algorithms for video processing and image enhancement. These algorithms are swappable, and the compute engine itself is shared and reused in applications where H.264 and MPEG-2 streams coexist.
In addition, many software “knobs” can be implemented, enabling DTV OEMs to tune image quality (de-interlacing, motion-adaptive algorithms, etc.) and to differentiate their products (color space selection, noise reduction features, etc.).
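The idea of swappable algorithms tuned through software knobs can be sketched in a few lines. This is purely illustrative and does not reflect any BrightScale API: `Knobs`, `STAGES`, `run_pipeline`, and both stage functions are hypothetical, a frame is reduced to a flat list of 8-bit samples, and the brightness stage is a stand-in for whichever enhancement an OEM might plug in.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Frame = List[int]  # simplified frame: a flat list of 8-bit samples


@dataclass
class Knobs:
    """Hypothetical tuning parameters an OEM might expose."""
    noise_reduction: bool = False
    brightness_offset: int = 0


def noise_reduce(frame: Frame, knobs: Knobs) -> Frame:
    """Three-tap averaging filter, applied only when the knob is enabled."""
    if not knobs.noise_reduction:
        return list(frame)
    out = []
    for i in range(len(frame)):
        window = frame[max(0, i - 1): i + 2]   # shorter window at the edges
        out.append(sum(window) // len(window))
    return out


def adjust_brightness(frame: Frame, knobs: Knobs) -> Frame:
    """Add the configured offset and clamp to the 8-bit range."""
    return [max(0, min(255, s + knobs.brightness_offset)) for s in frame]


# Registry of interchangeable stages; an OEM could register its own here.
STAGES: Dict[str, Callable[[Frame, Knobs], Frame]] = {
    "noise_reduce": noise_reduce,
    "adjust_brightness": adjust_brightness,
}


def run_pipeline(frame: Frame, stage_names: List[str], knobs: Knobs) -> Frame:
    """Run the selected stages in order on the same shared compute path."""
    for name in stage_names:
        frame = STAGES[name](frame, knobs)
    return frame
```

Changing the stage list swaps algorithms, and changing `Knobs` retunes them, without any change to the underlying engine.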
For more information, see the product brief: BrightScale BA 1024 (PDF).
The Values of the BrightScale Array
- Performance
  - Multiple simultaneous data streams accommodate both codec and post-processing needs
- Flexibility
  - Programmable functionality similar to other general-purpose processors such as DSPs and RISC CPUs
  - Reduced time to market, enabling many product releases per year
  - Ideal platform for a family of products and performance levels due to the flexibility and scalability of both hardware and software
  - Programmability allows OEMs to deploy their proprietary algorithms for image enhancements and advanced codecs if desired
- Power
  - Competitive with ASIC implementation power at an equivalent performance level
  - An order of magnitude better than most DSP-based programmable solutions
- Cost
  - Software-based implementations provide lower cost for new designs and subsequent feature enhancements than ASICs


