Friday, November 8, 2024

Micron SSD Endurance: Product Management Case Study

Over the following several weeks, Micron was able to provide its client with a solution. The product in question was the Micron 3500 SSD (Figure 1). Once the problem was resolved, the drive came to be regarded as one of the best client SSDs ever produced. As Jon Coulter of TweakTown puts it, “It’s simply the best OEM SSD ever made.”

Since then, Micron has revisited SSD endurance and made trade-offs to satisfy consumers with nearly every product generation.

Figure 1: Micron client SSD portfolio (image credit: Micron)

SSD endurance

Technical product managers learn something new every day. SSD endurance, meaning the usable lifespan of the drive, is one such topic.

There are two common ways to specify SSD endurance: TBW (terabytes written) for client SSDs and DWPD (drive writes per day) for enterprise SSDs. DWPD expresses how much of the SSD’s capacity can be written each day. For instance, a 1TB enterprise SSD rated at 0.3 DWPD allows the user to write about 300GB (30% of its capacity) to the SSD per day until the warranty expires. TBW is the total amount of data, in terabytes (thousands of gigabytes), that can be written to a drive before it is expected to wear out. Each capacity has its own TBW value (Figure 2).
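The DWPD example above is simple arithmetic, and a minimal Python sketch makes it easy to try other ratings. The function names are illustrative, and the five-year warranty used for the lifetime total is an assumption for the example only; the text does not state a warranty length for this drive.

```python
def daily_write_budget_gb(capacity_gb: float, dwpd: float) -> float:
    """Data that may be written per day, in GB, for a given DWPD rating."""
    return capacity_gb * dwpd


def lifetime_writes_tb(capacity_gb: float, dwpd: float, warranty_years: float) -> float:
    """Total writes allowed over the warranty, in TB (1,000GB per TB, 365-day years)."""
    return daily_write_budget_gb(capacity_gb, dwpd) * 365 * warranty_years / 1000


# 1TB drive at 0.3 DWPD, as in the text: about 300GB per day.
print(f"{daily_write_budget_gb(1000, 0.3):.0f} GB per day")

# Assumed five-year warranty (hypothetical, not from the article).
print(f"{lifetime_writes_tb(1000, 0.3, 5):.1f} TB over the warranty")
```

Multiplying the daily budget by the warranty length is also how a DWPD rating can be compared with a TBW rating for the same drive.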

Figure 2: TBW values for the Micron 3500 (image credit: Micron)

Figure 2 shows that a 512GB SSD is rated for at least 300 terabytes of writes before it wears out. To get a sense of how much data 300 terabytes is, consider a simple example.

If you wrote and overwrote 100GB of data on your computer every day for three years, you would reach only about 107 TBW by the time the drive’s warranty expired. That is roughly one-third of the drive’s rated endurance. And can you picture writing 100GB of data every single day for that long? Most of us wouldn’t come close to that amount even in a month!
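The arithmetic behind this example can be checked with a short Python sketch. The 100GB per day, the three years, and the 300 TBW rating come from the text; the function name and the 365-day year are assumptions. The result depends on how a terabyte is counted: with 1,000GB per terabyte the total is 109.5TB, while the "about 107" figure matches 1,024GB per terabyte.

```python
DAYS_PER_YEAR = 365  # assumption: leap days ignored


def total_written_tb(gb_per_day: float, years: float, gb_per_tb: int = 1000) -> float:
    """Total data written over the period, in terabytes."""
    return gb_per_day * DAYS_PER_YEAR * years / gb_per_tb


RATED_TBW = 300  # rating for the 512GB drive cited in the text

decimal_tb = total_written_tb(100, 3)                 # 1,000GB per TB
binary_tb = total_written_tb(100, 3, gb_per_tb=1024)  # 1,024GB per TB

print(f"{decimal_tb:.1f} TB (decimal), {binary_tb:.1f} TB (binary)")
print(f"{decimal_tb / RATED_TBW:.1%} of the rated endurance")
```

Either way, three years of heavy daily writing uses a little over a third of the rated endurance.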

The following formula, when simplified, yields the TBW specification for a specific drive:

TBW = (SSD capacity × P/E cycles) / WAF

You can see that TBW increases with SSD capacity and with the program/erase (P/E) cycle count, while it is inversely related to the write amplification factor (WAF). WAF is the ratio of the data actually written to the NAND to the data written by the host; it reflects how many times user data is moved and rewritten inside the SSD. WAF is influenced by a number of factors, the most important of which is the workload you place on the SSD. For typical client workloads the figure is low, averaging between three and four.
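These relationships can be sketched in Python, assuming the commonly used simplified relation TBW = capacity × P/E cycles / WAF. The function name is illustrative, and the 3,000 P/E cycles below is a hypothetical value chosen to show the effect of WAF; it is not a Micron specification.

```python
def tbw_estimate(capacity_gb: float, pe_cycles: float, waf: float) -> float:
    """Simplified endurance estimate in terabytes written (1,000GB per TB)."""
    if waf <= 0:
        raise ValueError("WAF must be positive")
    return capacity_gb * pe_cycles / waf / 1000


CAPACITY_GB = 512
PE_CYCLES = 3000  # hypothetical value, for illustration only

# WAF of three to four is the range the text gives for client workloads.
for waf in (3, 4):
    print(f"WAF {waf}: {tbw_estimate(CAPACITY_GB, PE_CYCLES, waf):.0f} TBW")
```

With capacity and P/E cycles held fixed, moving from a WAF of three to a WAF of four lowers the estimate by a quarter, which is why workload matters so much to the final rating.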

Alongside endurance, SSD reliability is expressed as mean time between failures, or MTBF. MTBF is a statistical average across a large population of drives, not a guarantee for any single drive. An SSD’s MTBF is a difficult metric to compute since it depends on the reliability of each and every SSD component. Nevertheless, the majority of Micron’s client SSDs have an MTBF rating of two million hours. Taken at face value, that rating corresponds to one failure roughly every 230 years of operation on average. That failure rate is incredibly low!
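The conversion from an MTBF rating to years, and to an annual failure rate, can be sketched as follows. The two-million-hour rating is from the text; the function names, the assumption of continuous operation (8,760 powered-on hours per year), and the constant-failure-rate (exponential) model are assumptions of this sketch.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8,760; assumes the drive is powered on all year


def mtbf_in_years(mtbf_hours: float) -> float:
    """Express an MTBF rating in years of continuous operation."""
    return mtbf_hours / HOURS_PER_YEAR


def annualized_failure_rate(mtbf_hours: float) -> float:
    """Probability that a drive fails within one year, assuming a constant failure rate."""
    return 1 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


MTBF_HOURS = 2_000_000

print(f"about {mtbf_in_years(MTBF_HOURS):.0f} years")                  # about 228 years
print(f"about {annualized_failure_rate(MTBF_HOURS):.2%} fail per year")  # about 0.44%
```

Read this way, the rating says less about one drive lasting two centuries and more about a fleet: under these assumptions, roughly four or five drives out of every thousand would be expected to fail in a given year.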

Deeper exploration of each of these variables quickly reveals that the final TBW is a trade-off among a very large number of factors, including SSD workload, NAND valid block count, NAND block size, NAND defectivity, static SLC P/E cycle count, super block architecture, and more.

Options for the future

Rethinking workloads and how they affect SSD endurance is one of the many unknowns that come with running AI models locally on the PC. Product managers need to prepare for these scenarios. Thanks to the work of solving the endurance problem across multiple product generations, Micron now understands the trade-offs well enough to improve SSD endurance by up to 10 times if that is what running complex vision-language models (VLMs) locally requires.

Enduring lessons

Resolving a complex issue presents a chance for growth and creativity. These are the lessons I took from it, and I hope they can help you too.

  • Remain composed; panicking won’t help you.
  • It’s acceptable to not know everything.
  • Seek assistance from others and have faith in your group.
  • Acknowledge uncertainty and turbulence; work toward establishing clarity.
  • Oftentimes, the obvious answer isn’t the best one to choose.
  • Get curious about the problem and develop a passion for solving it, not just for the answer itself.
  • Make it a daily goal to learn and embrace every challenge.
  • Sharing and teaching what you’ve learnt is essential to ensuring the success of the group as a whole.

If you read through to the end, you might be wondering how the Micron 3500 TBW problem was resolved.

To reach these new heights, Micron G8 NAND introduced a number of new process advancements. Because the process was so new, the early defectivity estimates were pessimistic, which kept the team from meeting the 2-million-hour MTBF target at the requested TBW specification. Working with the customer’s team, Micron learned that the customer’s specification called for a lower MTBF target. As a result, Micron could quickly satisfy the endurance request. Ultimately, the product met the original 2-million-hour MTBF target without any issues or compromises because, by the time it was released, defectivity was much lower than anticipated. Being open about the problem led to a win-win solution.

Cheekuru Bhargav
Cheekuru Bhargav has been writing laptop, RAM, and SSD articles for govindhtech since October 2023. He is a science graduate and a laptop enthusiast.