The risks of overlooking flash memory
During the design process, manufacturers (OEMs) of industrial products undergo long engagement processes with multiple part suppliers. The purpose of these engagements is to ensure that each embedded part will not only work for the application, but also meet the workload and environmental requirements in the field. Tony Diaz, Product Manager for Delkin Devices, speaks to Electronic Specifier.
However, supplier engagements can be lengthy and complex. As a result, OEMs may opt to purchase off-the-shelf parts, particularly when the part is perceived to be a ‘commodity’ item. This is often the case with flash memory, which is normally selected based on specifications such as type of flash, memory capacity and form factor.
Because flash memory is widely available in a variety of form factors for consumer electronics, many OEMs assume they can proceed without a customised solution. However, in doing so, OEMs may overlook considerations such as workload (the frequency of reading and writing large amounts of data to memory), power management issues (dirty power, power cycling, power failure) and environmental conditions (temperature, vibration).
These factors can lead to data corruption and other errors in the field, while reducing the reliability and lifespan of the flash storage. Industrial OEM products can have a lifecycle of three to five years or longer, compared to 6-18 months for consumer applications, so these problems can shorten the overall life of the product.
“Many industrial OEMs purchase flash storage devices over the internet only to discover at the launch of the product there were unexpected issues due to inaccurate assumptions about the environment and workload requirements,” said Tony Diaz, Product Manager for Delkin Devices, a value-added supplier of non-volatile flash storage in a variety of SSD, card and module solutions.
This can lead to severe consequences for users. In the transportation industry, for instance, the unexpected loss of mission-critical data may compromise safety features that drivers rely upon to prevent accidents. In manufacturing automation, unexpected storage device failures can cause machinery to malfunction, potentially leading to a costly and disruptive halt in production.
Given the critical role of flash storage in holding mission-critical data, Diaz said the majority of industrial flash storage solutions require some level of customisation to adequately meet workload requirements in real-world industrial scenarios.
All flash storage has a finite life, depending on how well it is managed and the workload requirements. To optimise and extend the life of a flash storage device, therefore, careful consideration must be given to how data is written to the device.
Writing to flash involves preparing blocks of flash and then programming new data to them. However, new data cannot be saved to a block until the old data has been erased. Due to the nature of flash storage, only a finite number of program and erase cycles can be performed before wear makes a block unreliable for storing data. In addition, some flash media is not used evenly, so a few heavily used blocks can wear out early and further reduce the life of the device.
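As a rough illustration of how workload drives wear, the sketch below estimates device life from the rated number of program/erase cycles and the daily write volume. It is a back-of-the-envelope model, not a vendor formula: the function name is hypothetical, it assumes wear is spread evenly across all blocks, and it introduces a "write amplification" factor (an assumption here, meaning how many bytes the flash actually programs for each byte the host writes).

```python
def estimated_life_years(capacity_gb, pe_cycles, host_writes_gb_per_day,
                         write_amplification=1.0):
    """Rough endurance estimate, assuming wear is spread evenly.

    capacity_gb            -- usable flash capacity in GB
    pe_cycles              -- rated program/erase cycles per block
    host_writes_gb_per_day -- data the application writes each day
    write_amplification    -- flash bytes programmed per host byte (>= 1)
    """
    if host_writes_gb_per_day <= 0:
        raise ValueError("host_writes_gb_per_day must be positive")
    if write_amplification < 1:
        raise ValueError("write_amplification cannot be below 1")

    # Total data the flash can absorb before reaching its rated cycles.
    total_flash_writes_gb = capacity_gb * pe_cycles
    # Data actually programmed to flash each day.
    flash_writes_gb_per_day = host_writes_gb_per_day * write_amplification
    return total_flash_writes_gb / flash_writes_gb_per_day / 365.0
```

With purely illustrative figures (a 32GB device rated for 3,000 cycles, 20GB written per day), a write amplification of 4 gives roughly 3.3 years, while a factor of 1.5 gives roughly 8.8 years. The same device can therefore fall short of, or comfortably exceed, a five-year product lifecycle depending on how it is written to.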
Fortunately, there are options to extend the life of a flash device, including reducing unnecessary copying of files or downloading of data, consolidating writes, wear levelling techniques and even selecting whether the data is written sequentially or randomly.
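To show why wear levelling matters, the following toy simulation counts how many block writes a device survives under two allocation policies. It is a deliberately simplified sketch, not how any particular flash controller works: every write is assumed to cost one erase cycle on one block, the device is treated as worn out as soon as the chosen block has reached its limit, and all names are hypothetical.

```python
def writes_until_wearout(num_blocks, pe_limit, pick_block):
    """Count block writes until the chosen block has no cycles left."""
    erase_counts = [0] * num_blocks
    writes = 0
    while True:
        block = pick_block(erase_counts)
        if erase_counts[block] >= pe_limit:
            return writes  # the policy wants a block that is worn out
        erase_counts[block] += 1  # each write costs one erase cycle
        writes += 1


def always_first(erase_counts):
    """No wear levelling: the same 'hot' block is reused every time."""
    return 0


def least_worn(erase_counts):
    """Simple wear levelling: pick the block with the fewest erases."""
    return min(range(len(erase_counts)), key=erase_counts.__getitem__)
```

With 8 blocks rated for 100 cycles each, `writes_until_wearout(8, 100, always_first)` returns 100, while `writes_until_wearout(8, 100, least_worn)` returns 800. Spreading the writes lets the device use the endurance of every block rather than only the busiest one.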
“If an OEM misjudges or misunderstands the workload requirements, there are implications for the storage,” explained Diaz. “It could be as simple as unexplained errors in the field, or it could be a situation where they are wearing out the flash memory much faster than they realise.”
An important flash storage customisation option involves mechanical ruggedness. Is the application subjected to unusual amounts of vibration? Does the typical operating environment exceed even standard industrial storage parameters?
Although industrial flash storage is designed to be rugged, different applications have different operating requirements. Customising the mechanical ruggedness of the storage can alleviate concerns about failures associated with operating conditions.
One of the best ways to ensure that a storage device will work as expected in operating conditions is to partner with a manufacturer who offers reliability testing services.
Delkin Devices, for example, offers design verification testing, ongoing reliability testing and even accelerated life testing, which simulates long-term operating conditions, at its manufacturing facility in San Diego, California.
Among the more common real-world problems for industrial flash storage are power issues such as dirty power, excessive power cycling and unexpected power failures.
When power is lost during a write operation, it can cause data loss. This is because the data that was being written to the storage was not completely saved.
Although only a small amount of data may not have been written when a power failure occurs, it can cause significant ongoing problems, including fatal corruption of the entire system. It can also cause inefficient use of memory capacity, which can dramatically shorten the lifespan of the embedded flash storage.
Taking steps to reduce external sources of power loss is important for mitigating the risk of power failures. However, power failures can still occur, so internal protections are essential for reducing the risk of data loss. For flash memory systems that handle critical data, that means built-in power loss controls, including systems for monitoring power supply and the ability to recover data after a power loss that occurs during a write operation.
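One general way to recover from a write interrupted by power loss is to store each record with its length and a checksum, then discard anything incomplete on the next start-up. The sketch below illustrates that idea at the application level. It is an assumption made for illustration, not a description of the power loss controls inside any flash device or Delkin product, and the record layout and function names are hypothetical.

```python
import struct
import zlib

# Hypothetical record header: payload length and CRC32, little-endian.
HEADER = struct.Struct("<II")


def encode_record(payload):
    """Prefix a payload with its length and CRC32 checksum."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def recover_records(log):
    """Return every complete record, stopping at the first torn one."""
    records = []
    offset = 0
    while offset + HEADER.size <= len(log):
        length, crc = HEADER.unpack_from(log, offset)
        start = offset + HEADER.size
        payload = log[start:start + length]
        # A short payload or a checksum mismatch means the write was
        # cut off (or corrupted), so this and later data is ignored.
        if len(payload) < length or zlib.crc32(payload) != crc:
            break
        records.append(payload)
        offset = start + length
    return records
```

If a log holds two complete records followed by a third whose last byte never reached storage, `recover_records` returns only the first two. The system can then restart from a known-good state instead of reading partially written data.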
Dirty power due to outages, brownouts, surges and power spikes is another concern. This can be particularly problematic in transportation, where the DC supply can dip below the required threshold, which can ultimately confuse the source and lead to errors in equipment critical to the operation of trains, automobiles or airplanes.
Excessive power cycling to conserve battery life can also become a problem. In some industries where OEM products are used in remote locations, the power is cycled tens of thousands of times a year to put the device into a sleep mode or to power it off altogether. This can also degrade the performance of the flash memory.
Finally, operational requirements concern the manufacturer’s supply chain: how it sources parts, engages suppliers and ensures that the parts it sources will be available throughout the product lifecycle.
It is very common for the bill of materials (BOM) of commercial grade flash storage to be updated without warning, and this is necessary for consumer OEMs because it helps maximise functionality while minimising price. For industrial OEMs, however, what is needed above all is consistency and reliability.
Diaz said there is an even higher standard that can be achieved, which is when the component parts are controlled and ‘locked’.
“This means that once qualified, the flash, controller and firmware will not change as long as the part number is active. If anything needs to be changed, the part number is changed and that essentially guarantees that the customer is notified and the BOM is updated,” explained Diaz.
In the short term, off-the-shelf flash storage may have the right specs and cost less than a customised part from a supplier, but there are always hidden costs and risks for the OEM.
Flash storage is a critical part for rugged industrial applications, and more manufacturers should engage a supplier from the beginning of the design process to ensure they get what they need for the entire lifecycle of their product.
“Industrial OEMs are often more focused on designing a high quality product, and so do not spend much time considering flash storage,” said Diaz. “But given the critical nature of data in today’s devices, there is too much risk to take industrial flash for granted.”