How to Use Single-Board Computers (SBCs) for Cluster Computing on the Edge



Many people have never heard of SETI@home, a volunteer computing project that searched for signs of extraterrestrial life and gave me my first encounter with the concept of cluster computing. Although it never found evidence of an extraterrestrial civilization, I was astounded by how it harnessed vast amounts of idle personal computing power to accomplish large-scale computational tasks. The next moment of awe came with the emergence of Bitcoin, which made me wonder whether a cluster of single-board computers (SBCs) such as the Raspberry Pi or LattePanda could be used to mine the cryptocurrency. That notion faded quickly once graphics cards entered the mining arena. Even so, it was then that the concept of clustering first caught our attention. Recently, with the rise of edge computing, cluster computing has come back into view. Today, therefore, we will explore cluster computing and the possibilities it holds for edge computing.


What Is Cluster Computing?


So what is cluster computing? A cluster is a group of independent computers linked by a high-speed network to form a larger computing service. Each node in the cluster is an individual computer, an autonomous server running its own services. These servers communicate with one another and cooperate to deliver applications, system resources, and data to users, while being managed as a single system. A user who sends a request to the cluster sees what appears to be one independent server, when in fact the request is handled by a group of clustered servers. Put simply, a cluster is a collection of computers, usually servers, that operate under one collective name. For instance, the home page of Google or Baidu looks like something one could build in minutes, yet the page you receive is the joint product of thousands of servers working in clusters. To capture the essence of clustering in a single sentence: it is a large group of servers working together on a common task, usually under coordinated management, and possibly spread across multiple data centers, whether local or distributed around the world.


Classifications of Clusters


Computer cluster architectures, delineated by function and structure, are commonly categorized into the following types: 


High-Availability Clusters (Abbreviated as HAC) 

High-availability clusters enhance system availability and fault tolerance by replicating data and services in multiple instances. In the event of node failure, automatic failover to operational nodes is facilitated. Application scenarios encompass critical domains such as web services and databases. 
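The failover idea can be illustrated with a minimal sketch. It assumes a hypothetical two-node cluster held in a priority-ordered list, with a simple `healthy` flag standing in for a real health check; the `Node` class and `pick_active` function are illustrative names, not part of any particular clustering tool.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    healthy: bool = True  # stand-in for a real health check (assumption)


def pick_active(nodes):
    """Return the first healthy node in priority order, or None if all are down."""
    for node in nodes:
        if node.healthy:
            return node
    return None


# Hypothetical two-node cluster: a primary plus one standby replica.
cluster = [Node("primary"), Node("standby")]

active_before = pick_active(cluster).name  # the primary serves requests
cluster[0].healthy = False                 # simulate a failure of the primary
active_after = pick_active(cluster).name   # requests now go to the standby
```

A production high-availability setup would also have to replicate data between the nodes and detect failures over the network, which this sketch leaves out.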


Load-Balancing Clusters (Abbreviated as LBC or LB) 

Load-balancing clusters uniformly distribute incoming requests across diverse nodes for processing, thereby augmenting the system's performance and scalability. Applicable scenarios include web servers, DNS servers, and the like.
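One simple way to spread requests evenly is round-robin rotation, sketched below. This is only one of several possible balancing strategies, and the node names and the `round_robin` helper are hypothetical.

```python
from collections import Counter
from itertools import cycle


def round_robin(nodes, request_count):
    """Assign each request to the next node in a fixed rotation."""
    rotation = cycle(nodes)
    return [next(rotation) for _ in range(request_count)]


# Hypothetical node names; any identifiers would do.
nodes = ["node-a", "node-b", "node-c"]
assignments = round_robin(nodes, 7)

# Count how many requests each node received.
load = Counter(assignments)
```

With seven requests and three nodes, no node ends up with more than one request above any other, which is the even spread the paragraph above describes.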


High-Performance Computing Clusters (Abbreviated as HPC) 

High-performance computing clusters, comprising multiple computational nodes, leverage parallel computing to elevate computational performance and expeditiousness. Their application spans scientific computation, simulation analysis, data mining, and related fields.


With the ceaseless advancement of technology and the perpetual expansion of application scenarios, cluster computing is witnessing ongoing deployment and innovation across various domains.


Why Employ Cluster Computing?


Reflecting upon the historical evolution of computers, one discerns humanity's unwavering pursuit of enhancing computational performance. From the earliest ENIAC computer to contemporary supercomputers, our quest for augmenting computational capabilities has been relentless.


Examining this vertically, we endeavored to improve the performance of individual computing components, such as elevating CPU processing speeds, expanding memory capacities, and accelerating communication rates. Consequently, we witnessed the introduction of mainframe designs spearheaded by IBM, which continuously elevated the performance and capacities of various components, as well as internal communication speeds, transitioning from cabling to optical fiber and from serial to parallel communication.


Simultaneously, it became evident that the sheer stacking of performance entailed astonishingly exorbitant costs, rendering it uneconomical for accomplishing our predetermined objectives. Perhaps, we reasoned, large-scale tasks could be deconstructed, allowing generalized devices to execute small-scale tasks in unison to achieve the desired outcome. Thus, cluster computing emerged as a viable solution.


Cluster computing boasts attributes such as high performance, reliability, scalability, cost-effectiveness, parallelism, and flexibility. Clusters can enhance computational performance by adding computational nodes, facilitating large-scale parallel computation and substantially abbreviating processing times. Redundancy is customarily incorporated into such computational systems, ensuring continued operation through alternative nodes in the event of node failure, thereby averting system-wide collapse.


Moreover, cluster computing systems can flexibly expand node quantities as needed, scaling with growing computational demands without necessitating wholesale system replacement. Tasks within cluster computing can be allocated to disparate computational nodes for parallel execution, fully leveraging the processing capacities of multiple computers and enhancing system-wide parallelism. Cluster computing can be optimized based on specific computational tasks, selecting appropriate node configurations and computational schemes to maximize resource utilization and computational efficiency.
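How much faster a job runs after adding nodes depends on how much of it can actually be split up. A common rule of thumb for estimating this is Amdahl's law, which the article does not name but which fits the point above. The sketch below assumes a workload that is 90% parallelizable; that figure is purely illustrative.

```python
def amdahl_speedup(parallel_fraction, nodes):
    """Ideal speedup on `nodes` machines when only `parallel_fraction` of the work can be split."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / nodes)


# Assumed workload: 90% of the job can run in parallel. Compare 1, 4, and 8 nodes.
speedups = {n: round(amdahl_speedup(0.9, n), 2) for n in (1, 4, 8)}
```

Under this assumption, eight nodes give a speedup of well under eight times, because the serial portion of the job does not shrink as nodes are added.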


Despite these advantages, cluster systems need not be exorbitantly expensive. A typical cluster architecture requires anywhere from a handful to several dozen server hosts, which makes it far more affordable than specialized supercomputers that frequently cost hundreds of thousands of dollars or more. For equivalent performance requirements, a cluster architecture is generally more cost-effective than a single large-scale computer of the same computational capacity.


Applications and Implementation of Cluster Computing on Single-Board Computers (SBCs)


An SBC is an embedded computer that integrates various functionalities such as a computer processor, memory, storage, and a range of input/output interfaces. Compared to traditional PCs, SBCs are more compact, consume less power, and exhibit higher integration, making them suitable for a wide array of applications including the Internet of Things (IoT), artificial intelligence, industrial control, medical devices, drones, and more.


As SBCs have advanced, their computational prowess has reached a level capable of meeting the demands of general-purpose computing. As a result, consideration has been given to utilizing cluster computing to further enhance their computational potential.


Cluster computing on SBCs typically entails the following steps:


1. Determine the Number and Configuration of SBCs: A cluster requires at least two computational nodes, and each node needs a network interface, typically Ethernet, through which it can reach the others. Additionally, each node must possess adequate processing power, memory, and storage capacity to support the computational tasks at hand.


2. Install Operating System and Software: Install the operating system and necessary software, such as MPI and OpenMP libraries, on each SBC node. These libraries are critical components for parallel computing and enable the full utilization of each node's CPU and memory resources.


3. Configure Network Connections: Connect every SBC node to the same local area network so that the nodes can communicate and exchange data. On SBCs this link is usually Ethernet; larger server clusters often use dedicated high-speed interconnects such as InfiniBand.


4. Write Parallel Computing Code: Writing parallel computing code is a critical step in cluster computing. The code should utilize parallel computing libraries such as MPI or OpenMP and distribute computational tasks across each node for parallel execution. Writing parallel computing code necessitates advanced programming skills and experience.


5. Execute Parallel Computing: Transfer the parallel computing code to the SBC nodes and run it simultaneously on all nodes in the cluster. Each node processes its assigned tasks using its own CPU and memory and transmits the results back to the master node for integration and aggregation.
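The data flow in steps 4 and 5 (split the work, compute on each node, gather the results on the master) can be sketched on a single machine. This is a simulation under stated assumptions: threads from the Python standard library stand in for cluster nodes, and the function names are hypothetical. A real cluster would typically use an MPI library to move the chunks between machines.

```python
from concurrent.futures import ThreadPoolExecutor


def split_evenly(items, parts):
    """Split `items` into `parts` contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def node_task(chunk):
    """Work done on one simulated node: a sum of squares over its chunk."""
    return sum(x * x for x in chunk)


def run_cluster_job(data, node_count):
    """Scatter chunks to workers, compute on each, then gather and combine on the 'master'."""
    chunks = split_evenly(data, node_count)
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partials = list(pool.map(node_task, chunks))
    return sum(partials)


# Sum of squares of 1..100, spread over four simulated nodes.
total = run_cluster_job(list(range(1, 101)), node_count=4)
```

The threads here only illustrate the scatter and gather pattern; they do not provide the multi-machine speedup that a real cluster would.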


Cluster computing with SBCs can be applied across multiple domains. For example, in radar signal processing, SBCs can distribute data collection, preprocessing, and processing tasks across multiple nodes, significantly enhancing the system's processing capacity and real-time performance. In video encoding and decoding, SBCs enable distributed video encoding and decoding across multiple nodes, improving processing efficiency and shortening computation times by distributing tasks across multiple nodes. For artificial intelligence tasks such as deep learning, distributing models across nodes can accelerate training and inference speeds. Additionally, SBCs facilitate distributed storage and processing of databases across multiple nodes, improving database performance and capacity.


These application scenarios can achieve heightened processing performance and scalability through cluster computing on single-board computers.


Raspberry Pi Cluster Computing Case Study


To date, the Raspberry Pi is among the most popular single-board computers, lauded for its affordability and backed by a vast community and software ecosystem. The latest Raspberry Pi models offer Gigabit Ethernet, USB 3, and 2, 4, or 8 GB of RAM, while maintaining low power consumption. This contrasts with building clusters from Dell or HP servers, or even older laptops, which can provide superior performance but are more power-hungry and costly. Raspberry Pi clusters are not only simple to use, but also compact, portable, and highly scalable.


On the official Raspberry Pi website, we can see hobbyists who have constructed clusters based on Raspberry Pi devices. Through the assembly process, one not only gains access to a personal cluster computer but also acquires knowledge pertaining to cluster computing. As a result, having one's own cluster computer is no longer a daunting task.


The author of one such article on the Raspberry Pi website describes the build as follows:


We’re going to put together an eight-node cluster connected to a single managed switch. One of the nodes will be the so-called “head” node: this node will have a second Gigabit Ethernet connection out to the LAN/WAN via a USB3 Ethernet dongle, and an external 1TB SSD mounted via a USB3-to-SATA connector. While the head node will boot from an SD card as normal, the other seven nodes — the “compute” nodes — will be configured to network boot, with the head node acting as the boot server and the OS images being stored on the external disk. As well as serving as the network boot volume, the 1TB disk will also host a scratch partition that is shared to all the compute nodes in the cluster.


All eight of our Raspberry Pi boards will have a Raspberry Pi PoE+ HAT attached. This means that, since we’re using a PoE+ enabled switch, we only need to run a single Ethernet cable to each of our nodes and don’t need a separate USB hub to power them.


· 8 x Raspberry Pi 4

· 8 x Raspberry Pi PoE+ HAT

· 8-port Gigabit PoE-enabled switch

· USB 3 to Gigabit Ethernet adaptor

· USB 3 to SATA adaptor

· SSD SATA drive

· 8 x Ethernet cables

· 16 GB SD card

· Cluster case


Cluster Computing Based on Raspberry Pi 4



Upon assembling a Raspberry Pi cluster, the following applications can be pursued:


1. Building a Private Cloud

If you want your own internal private cloud, a Raspberry Pi cluster is an excellent option. For instance, suppose you work in a company with hundreds or thousands of employees and want a cloud solution for backup, CRM hosting, document storage, file sharing, or pushing new versions of modified files in collaborative settings, without compromising privacy when accessing older versions of those files. There is no need to operate a bare-metal machine the size of a refrigerator to achieve this. A Raspberry Pi cluster is more cost-effective and compact. You can use it to host private clouds, version control systems, backup systems, and more, all integrated into one system. The advantage of building a private cloud is that you do not depend on third parties to set prices or hold your private data, and if necessary, you can upgrade the hardware within a few hours.


2. Self-Hosted Web Servers

If you require self-hosted web servers, a Raspberry Pi cluster is a great choice, especially if you aim to reduce costs. Simply connect external storage of the desired size and install the LAMP package to transform the Raspberry Pi cluster into a web server. You can use Grafana for visual monitoring, install cPanel for clients, use QoS to regulate bandwidth and user accounts, and allocate resources as needed. Best of all, using the same cluster, you can offer shared and dedicated web hosting options. If your client base has recently grown and requires more resources, you can install new nodes on the Raspberry Pi cluster without affecting service. Notably, the Raspberry Pi Foundation even hosts their Raspberry Pi 4 release website on a cluster composed of multiple Raspberry Pis, serving high volumes of traffic.


3. NAS and File Sharing in Companies or Homes 

You can create shared files for colleagues or employees using a Raspberry Pi cluster in multiple ways. Samba is a highly convenient tool for connecting LAN computers using different operating systems. You can also host a NAS on the Raspberry Pi cluster to store large files, such as demonstration videos, software suites, accounting PDF files, or backup files. Each individual in the company or family members can have their own space on the cluster, and different share sizes can be allocated to users. The cluster can also facilitate internal communication between computers. Bandwidth usage can be set and monitored to avoid obstructing data transfers for other users. This means that hundreds or even thousands of people can simultaneously use the same Raspberry Pi cluster to store and share files within the office or easily complete tasks from home via a VPN.


Cluster Board Based on Raspberry Pi



Hybrid Cluster Computing Case Study


With Raspberry Pis becoming increasingly difficult to acquire, many individuals have begun adopting a hybrid approach to building cluster computers. For instance, an internet user by the name of Carlos Eduardo employed a LattePanda as the master node and used two ARM-based nodes as worker nodes to construct a hybrid cluster computer based on Kubernetes. Such a cluster computer can be deployed for use as a Network Attached Storage (NAS) or as a Docker container server.


Cluster Computing Based on LattePanda



Disadvantages of Cluster Computing on Single-Board Computers (SBCs)


While SBCs can be employed for cluster computing, there are certain disadvantages, as outlined below:


1. Limited Computational Capabilities: SBCs typically possess lower computational power and memory capacity, making them less suitable for handling large-scale data and complex computational tasks. This implies that addressing complex problems may require the use of a greater number of SBCs, increasing costs and management complexity. However, the issue of limited computational power can be mitigated to some extent by stacking a large number of devices.


2. Constrained Network Performance: SBCs generally rely on standard Ethernet connections, so the network may become a bottleneck when processing large volumes of data. Gigabit Ethernet is the fastest wired interface on most current SBCs, and depending on the board, cabling, and switch, real-world throughput can fall well short of the nominal rate, in some cases to around 100 megabits per second. The use of fiber optics to further increase speed is being explored.
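To see why the link speed matters, consider the ideal time needed to move a dataset between two nodes. The sketch below ignores protocol overhead and latency, and the 1000 MB payload is an assumed figure chosen only for illustration.

```python
def transfer_seconds(size_megabytes, link_megabits_per_second):
    """Ideal time to move a payload over a link, ignoring protocol overhead and latency."""
    size_megabits = size_megabytes * 8  # 1 byte = 8 bits
    return size_megabits / link_megabits_per_second


# Assumed payload: 1000 MB moved between two nodes.
at_gigabit = transfer_seconds(1000, 1000)       # link running at 1000 Mbit/s
at_fast_ethernet = transfer_seconds(1000, 100)  # link running at 100 Mbit/s
```

Under these assumptions the same transfer takes ten times longer at 100 megabits per second than at full gigabit speed, so a job that exchanges a lot of data between nodes can end up waiting on the network rather than the CPUs.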


3. Limited Scalability: Given that SBCs are typically individual devices, their scalability is constrained. Expanding an SBC cluster requires more physical space, power supplies, and cooling systems, potentially increasing costs and complexity. With the trend of developing SBC core boards, the emergence of integrated devices is possible.


4. Lack of Professional Technical Support: SBCs are often designed and manufactured by small teams or community-driven projects, so technical support tends to be less comprehensive than that of large manufacturers. This may lead to difficulties in using and managing SBC clusters. Professional communities for single-board computers may offer a solution to this issue, and as commercialization progresses, dedicated service companies may emerge.


Recommendations for Cluster Devices in Edge Computing


Edge computing refers to the practice of processing data closer to the source of data generation, such as IoT devices or sensors, rather than relying solely on centralized cloud data centers. Implementing cluster devices in edge computing enhances processing capabilities, reduces latency, and improves reliability. The following are some recommendations for cluster devices suitable for edge computing:


The Raspberry Pi series has long been a popular choice for building cluster computing setups. The affordable and energy-efficient Raspberry Pi boards are used to create clusters that offer parallel processing capabilities for various tasks, such as data analysis, simulations, and scientific research.


Hybrid clusters represent an evolution in cluster computing. Unlike traditional clusters that utilize homogeneous hardware, hybrid clusters consist of a mix of different types of single-board computers. These clusters typically include a powerful master node based on x86 architecture (e.g., LattePanda) and multiple worker nodes based on ARM architecture (e.g., Raspberry Pi).


The shift toward hybrid clusters stems from the desire to achieve better performance, energy efficiency, and cost-effectiveness.


Performance Improvement


Hybrid clusters leverage the strengths of different architectures to enhance overall performance. The x86-based master node can handle complex and resource-intensive tasks, while the ARM-based worker nodes can efficiently execute simpler tasks in parallel.
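One way to picture this division of labor is a greedy scheduler that hands each task, largest first, to whichever node would finish it soonest. The sketch below is only an illustration: the relative node speeds and task costs are invented numbers, and `assign_tasks` is a hypothetical helper rather than a feature of any cluster manager.

```python
def assign_tasks(task_costs, node_speeds):
    """Greedy scheduling: give each task, largest first, to the node that would finish it soonest."""
    finish_time = {name: 0.0 for name in node_speeds}
    plan = {name: [] for name in node_speeds}
    ordered = sorted(task_costs.items(), key=lambda kv: kv[1], reverse=True)
    for task, cost in ordered:
        # Estimated completion time on each node if it took this task next.
        best = min(node_speeds, key=lambda n: finish_time[n] + cost / node_speeds[n])
        finish_time[best] += cost / node_speeds[best]
        plan[best].append(task)
    return plan, finish_time


# Hypothetical relative speeds: the x86 master is assumed 3x as fast as each ARM worker.
node_speeds = {"x86-master": 3.0, "arm-worker-1": 1.0, "arm-worker-2": 1.0}
# Hypothetical task costs in arbitrary work units.
task_costs = {"transcode": 9.0, "index": 3.0, "thumbnail": 1.0, "log-rotate": 1.0}

plan, finish_time = assign_tasks(task_costs, node_speeds)
```

With these made-up numbers, the heaviest task lands on the x86 node while the lighter ones are spread across the ARM workers, which mirrors the split described above.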


Energy Efficiency


Hybrid clusters offer improved energy efficiency by utilizing ARM-based worker nodes that consume less power compared to their x86 counterparts. This makes hybrid clusters an eco-friendly solution for computing workloads.




Cost-Effectiveness


Hybrid clusters provide a cost-effective way to build powerful computing setups. The combination of a relatively expensive x86-based master node and multiple affordable ARM-based worker nodes balances the overall cost without compromising on performance.


Challenges and Solutions for Hybrid Clusters


Despite the advantages of hybrid clusters, there are certain challenges to consider:


1. Software Compatibility: Ensuring that software runs seamlessly on both x86 and ARM architectures may require optimization and adaptation. 

2. Hardware Integration: Building a hybrid cluster requires careful consideration of hardware compatibility and integration.


3. Resource Allocation: Effective management of resources in a hybrid cluster is essential for optimal performance.


Solutions to these challenges include the use of containerization technologies, hardware abstraction layers, and resource management tools.
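As a small illustration of the software compatibility point, a deployment script can detect each node's CPU architecture and choose a matching container image. The sketch below uses Python's standard `platform.machine()` call; the per-architecture tag scheme and the image name `myapp:latest` are hypothetical, and many container registries can instead select the right variant automatically through multi-architecture images.

```python
import platform

# Common `platform.machine()` values mapped to the architecture names used for container images.
ARCH_ALIASES = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "aarch64": "arm64",
    "arm64": "arm64",
    "armv7l": "arm",
}


def container_arch(machine=None):
    """Return the container architecture name for a machine string (default: this host)."""
    machine = (machine or platform.machine()).lower()
    try:
        return ARCH_ALIASES[machine]
    except KeyError:
        raise ValueError(f"unrecognized architecture: {machine}") from None


def image_for(base_name, machine=None):
    """Build a hypothetical per-architecture image tag such as 'myapp:latest-arm64'."""
    return f"{base_name}-{container_arch(machine)}"
```

Calling `image_for("myapp:latest")` with no second argument on a node would pick the tag for that node's own architecture, so the same script could run unchanged on the x86 master and the ARM workers.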


Hybrid clusters represent a significant advancement in cluster computing, offering improved performance, energy efficiency, and cost-effectiveness. By combining the power of x86-based master nodes with the efficiency of ARM-based worker nodes, hybrid clusters provide a flexible and scalable solution for a wide range of applications. As the technology continues to evolve, hybrid clusters are poised to become an integral part of the computing landscape.




In summary, implementing cluster computing on SBCs requires a good deal of technical expertise and specialized knowledge. Proper planning and configuration are necessary, along with suitable software and hardware support, to ensure the system can execute computational tasks efficiently and reliably. A Raspberry Pi cluster can host and run all of the operations described above at the same time. It can host game servers and cloud services while transcoding your movies. It offers an opportunity to learn Linux while hosting a NAS, without impacting other operations on the cluster. You can experiment on the same multi-node machine that hosts your backups without worrying about losing them during testing. Working faster and storing data securely are both within the capabilities of the cluster.


Perhaps one day a project like SETI@home will discover extraterrestrial life. In the meantime, as you comfortably enjoy life at home, remember that somewhere a cluster may be quietly and diligently doing its work.