Open Standards Featured at FMS 2023

SNIA welcomes colleagues to join us at the upcoming Flash Memory Summit, August 8-10, 2023 in Santa Clara, CA.

SNIA is pleased to join standards organizations CXL Consortium™ (CXL™), PCI-SIG®, and Universal Chiplet Interconnect Express™ (UCIe™) in an Open Standards Pavilion, Booth #725, in the Exhibit Hall.  CMSI will feature SNIA member companies in a computational storage cross-industry demo by Intel, MINIO, and Solidigm and a Data Filtering demo by ScaleFlux; a software memory tiering demo by VMware; a persistent memory workshop and hackathon; and the latest SSD form factor work on E1 and E3 by the SNIA SFF TA Technical Work Group. SMI will showcase SNIA Swordfish® management of NVMe SSDs on Linux with demos by Intel, Samsung, and Solidigm.

CXL will discuss their advances in coherent connectivity.  PCI-SIG will feature their PCIe 5.0 (32GT/s) and PCIe 6.0 (64GT/s) architectures and industry adoption, along with the upcoming PCIe 7.0 specification development (128GT/s).  UCIe will discuss their new open industry standard establishing a universal interconnect at the package level.

SNIA STA Forum will also be in Booth #849 – learn more about the SCSI Trade Association joining SNIA.

These demonstrations and discussions will augment FMS program sessions in the SNIA-sponsored System Architecture Track on memory, computational storage, CXL, and UCIe standards.  A SNIA mainstage session on Wednesday August 9 at 2:10 pm will discuss Trends in Storage and Data: New Directions for Industry Standards.

SNIA colleagues and friends can receive a $100 discount off the 1-, 2-, or 3-day full conference registration by using code SNIA23.

Visit snia.org/fms to learn more about the exciting activities at FMS 2023 and join us there!

So just what is an SSD?

It seems like an easy enough question, “What is an SSD?” but surprisingly, most of the search results for this get somewhat confused quickly on media, controllers, form factors, storage interfaces, performance, reliability, and different market segments. 

The SNIA SSD SIG has spent time demystifying various SSD topics like endurance, form factors, and the different classifications of SSDs – from consumer to enterprise and hyperscale SSDs.

“Solid state drive is a general term that covers many market segments, and the SNIA SSD SIG has developed a new overview of ‘What is an SSD?’,” said Jonmichael Hands, SNIA SSD Special Interest Group (SIG) Co-Chair. “We are committed to helping make storage technology topics, like endurance and form factors, much easier to understand coming straight from the industry experts defining the specifications.”

The “What is an SSD?” page offers a concise description of what SSDs do, how they perform, and how they connect, and also provides a jumping off point for more in-depth clarification of the many aspects of SSDs. It joins an ever-growing category of 20 one-page “What Is?” answers that provide a clear and concise, vendor-neutral definition of often-asked technology terms, a description of what they are, and how each of these technologies works.  Check out all the “What Is?” entries at https://www.snia.org/education/what-is

And don’t miss other topics of interest from the SNIA SSD SIG, including the Total Cost of Ownership Model for Storage and SSD videos and presentations in the SNIA Educational Library.

Your comments and feedback on this page are welcomed.  Send them to askcmsi@snia.org.

Your Questions Answered on Persistent Memory, CXL, and Memory Tiering

With the persistent memory ecosystem continuing to evolve with new interconnects like CXL™ and applications like memory tiering, our recent Persistent Memory, CXL, and Memory Tiering-Past, Present, and Future webinar was a big success.  If you missed it, watch it on demand HERE!

Many questions were answered live during the webinar, but we did not get to all of them.  Our moderator Jim Handy from Objective Analysis, and experts Andy Rudoff and Bhushan Chithur from Intel, David McIntyre from Samsung, and Sudhir Balasubramanian and Arvind Jagannath from VMware have taken the time to answer them in this blog. Happy reading!

Q: What features or support are required from a CXL-capable endpoint (e.g., an accelerator) to support memory pooling? Any references?

A: You will have two interfaces, one for the primary memory accesses and one for the management of the pooling device. The primary memory interface is CXL.mem, and the management interface will be via CXL.io or via a sideband interface. In addition, you will need to implement a robust failure recovery mechanism since the blast radius is much larger with memory pooling.

Q: How do you recognize weak information security (in CXL)?

A: CXL has multiple features around security and there is considerable activity around this in the Consortium.  For specifics, please see the CXL Specification or send us a more specific question.

Q: If the system (e.g. x86 host) wants to deploy CXL memory (Type 3) now, is there any OS kernel configuration or BIOS configuration needed to make the hardware run with VMware (ESXi)? How easy or difficult is this setup process?

A: A simple CXL Type 3 Memory Device providing volatile memory is typically configured by the pre-boot environment and reported to the OS along with any other main memory.  In this way, a platform that supports CXL Type 3 Memory can use it without any additional setup and can run an OS that contains no CXL support; the memory will appear as memory belonging to another NUMA node.  That said, using an OS that does support CXL enables more complex management, error handling, and more complex CXL devices.
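On a Linux host, for example, one way to see this is to look at the NUMA topology the kernel exposes through sysfs. The sketch below is illustrative only. It assumes the usual Linux layout under /sys/devices/system/node (a cpulist and a meminfo file per node); the function name list_numa_nodes and the "memory-only" flag are our own inventions, not part of any CXL or VMware tooling. A node with memory but no CPUs is one common way CXL-attached memory shows up, though other kinds of memory-only nodes exist as well.

```python
import os
import re


def list_numa_nodes(root="/sys/devices/system/node"):
    """Return one dict per NUMA node found under `root` (Linux sysfs layout assumed)."""
    nodes = []
    for entry in os.listdir(root):
        match = re.fullmatch(r"node(\d+)", entry)
        if not match:
            continue  # skip files such as "possible" or "online"
        node_dir = os.path.join(root, entry)

        # cpulist is empty for a node that has memory but no CPUs.
        with open(os.path.join(node_dir, "cpulist")) as f:
            cpulist = f.read().strip()

        # meminfo lines look like: "Node 1 MemTotal:  134217728 kB"
        mem_total_kb = 0
        with open(os.path.join(node_dir, "meminfo")) as f:
            for line in f:
                m = re.search(r"MemTotal:\s+(\d+)\s+kB", line)
                if m:
                    mem_total_kb = int(m.group(1))
                    break

        nodes.append({
            "node": int(match.group(1)),
            "cpulist": cpulist,
            "mem_total_kb": mem_total_kb,
            # Heuristic (our assumption): memory present, no CPUs attached.
            "memory_only": cpulist == "" and mem_total_kb > 0,
        })
    nodes.sort(key=lambda n: n["node"])
    return nodes
```

Calling list_numa_nodes() on a Linux system and printing each entry gives a quick view of which nodes carry CPUs and which are memory-only.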

Q: There was a question on “Hop” length. Would you clarify?

A: In the webinar around minute 48, it was stated that a Hop was 20ns, but this is not correct. A Hop is often spoken of as “Around 100ns.”  The Microsoft Azure Pond paper quantifies it four different ways, which range from 85ns to 280ns.

Q: Do we have any idea how much longer the latency will be?  

A: The language CXL folks use is “Hops.”   An address going into CXL is one Hop, and data coming back is another.  In a fabric it would be twice that, or four Hops.  The  latency for a Hop is somewhere around 100ns, although other latencies are accepted.
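As a back-of-the-envelope illustration of the Hop arithmetic, the sketch below multiplies a Hop count by a per-Hop latency. The "around 100ns" figure and the 85ns to 280ns range are the ones quoted in these answers; the function name and the idea that Hops simply add up linearly are simplifying assumptions, and real systems also include DRAM access time and other overheads.

```python
def added_latency_ns(hops, ns_per_hop=100):
    """Rough added latency: number of Hops times an assumed per-Hop latency."""
    return hops * ns_per_hop


DIRECT_HOPS = 2  # address going in, data coming back
FABRIC_HOPS = 4  # "twice that" when going through a fabric

for label, hops in (("direct", DIRECT_HOPS), ("fabric", FABRIC_HOPS)):
    low = added_latency_ns(hops, 85)
    typical = added_latency_ns(hops, 100)
    high = added_latency_ns(hops, 280)
    print(f"{label}: {hops} Hops, about {typical} ns (range {low} to {high} ns)")
```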

Q: For memory semantic SSD:  There appears to be a trend among 2LM device vendors to presume the host system will be capable of providing telemetry data for a device-side tiering mechanism to decide what data should be promoted and demoted.  Meanwhile, software vendors seem to be focused on the devices providing telemetry for a host-side tiering mechanism to tell the device where to move the memory.  What is your opinion on how and where tiering should be enforced for 2LM devices like a memory semantic SSD?

A: Tiering can be managed both by the host and within computational storage drives that could have an integrated compute function to manage local tiering- think edge applications.

Q: Re VM performance in Tiering: It appears you’re comparing the performance of 2 VM’s against 1.  It looked like the performance of each individual VM on the tiering system was slower than the DRAM only VM.  Can you explain why we should take the performance of 2 VMs against the 1 VM?  Is the proposal that we otherwise would have required those 2 VM’s to run on separate NUMA node, and now they’re running on the same NUMA node?

A: The use case here was lower TCO and increased memory capacity, along with the aggregate performance of more VMs versus running a few VMs on DRAM alone. In this use case, the DRAM per NUMA node was 384GB, the Tier2 memory per NUMA node was 768GB, and the VM RAM was 256GB.

In the DRAM-only case, if we have to run business-critical workloads such as Oracle with VM RAM=256GB, we could only run 1 VM (256GB) per NUMA node (DRAM=384GB). We cannot over-provision memory in the DRAM-only case, as every NUMA node has only 384GB. So potentially we could run 4 such VMs (VM RAM=256GB) with NUMA node affinity set, as we did in this use case, or, without NUMA node affinity, maybe 5 such VMs without completely maxing out the server RAM.  Remember, we set NUMA node affinity in this use case to eliminate any cross-NUMA latency.

Now with Tier2 memory in the mix, each NUMA node has 384GB DRAM and 768GB Tier2 memory, so theoretically one could run 16-17 such VMs (VM RAM=256GB). We are therefore able to maximize resource use, run more workloads, and increase transactions, which means lower TCO, increased capacity, and improved aggregate performance.
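The capacity arithmetic behind these numbers can be sketched as follows. The per-node sizes and the VM size come from the answer; the count of 4 NUMA nodes is an assumption inferred from "4 such VMs" at one VM per node, and the function names are ours. The pooled figures are upper bounds that leave no headroom for the hypervisor, which is why the answer quotes slightly lower numbers.

```python
NUMA_NODES = 4           # assumption: inferred from "4 such VMs", one per node
DRAM_PER_NODE_GB = 384
TIER2_PER_NODE_GB = 768
VM_RAM_GB = 256


def vms_with_affinity(mem_per_node_gb, nodes=NUMA_NODES, vm_gb=VM_RAM_GB):
    """VMs that fit when each VM must stay inside a single NUMA node."""
    return (mem_per_node_gb // vm_gb) * nodes


def vms_pooled(mem_per_node_gb, nodes=NUMA_NODES, vm_gb=VM_RAM_GB):
    """Upper bound when VMs may span nodes (no headroom reserved)."""
    return (mem_per_node_gb * nodes) // vm_gb


tiered_per_node_gb = DRAM_PER_NODE_GB + TIER2_PER_NODE_GB

print("DRAM only, with affinity:", vms_with_affinity(DRAM_PER_NODE_GB))
print("DRAM only, pooled upper bound:", vms_pooled(DRAM_PER_NODE_GB))
print("DRAM + Tier2, with affinity:", vms_with_affinity(tiered_per_node_gb))
print("DRAM + Tier2, pooled upper bound:", vms_pooled(tiered_per_node_gb))
```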

Q: CXL is changing very fast; we have had 3 protocol versions in 2 years. As a new consumer of CXL, what are the top 3 advantages of adopting CXL right away vs. waiting a couple more years?

A: All versions of CXL are backward compatible.  Users should have no problem using today’s CXL devices with newer versions of CXL, although they won’t be able to take advantage of any new features that are introduced after the hardware is deployed.

Q: (What is the) ideal when using Agilex FPGAs as accelerators?

A: CXL 3.0 supports multiple accelerators via the CXL switching fabric. This is good for memory sharing across heterogeneous compute accelerators, including FPGAs.

Thanks again for your support of SNIA education, and we invite you to write askcmsi@snia.org for your ideas for future webinars and blogs!

It’s A Wrap – But Networking and Education Continue From Our C+M+S Summit!

Our 2023 SNIA Compute+Memory+Storage Summit was a success! The event featured 50 speakers in 40 sessions over two days. Over 25 SNIA member companies and alliance partners participated in creating content on computational storage, CXL™ memory, storage, security, and UCIe™. All presentations and videos are free to view at www.snia.org/cms-summit.

“For 2023, the Summit scope expanded to examine how the latest advances within and across compute, memory and storage technologies should be optimized and configured to meet the requirements of end customer applications and the developers that create them,” said David McIntyre, Co-Chair of the Summit.  “We invited our SNIA Alliance Partners Compute Express Link™ and Universal Chiplet Interconnect Express™ to contribute to a holistic view of application requirements and the infrastructure resources that are required to support them,” McIntyre continued.  “Their panel on the CXL device ecosystem and usage models and presentation on UCIe innovations at the package level along with three other sessions on CXL added great value to the event.”

Thirteen computational storage presentations covered what is happening in NVMe™ and SNIA to support computational storage devices and define new interfaces with computational storage APIs that work across different hardware architectures.  New applications for high performance data analytics, discussions of how to integrate computational storage into high performance computing designs, and new approaches to integrate compute, data and I/O acceleration closely with storage systems and data nodes were only a few of the topics covered.

“The rules by which the memory game is played are changing rapidly and we received great feedback on our nine presentations in this area,” said Willie Nelson, Co-Chair of the Summit.  “SNIA colleagues Jim Handy and Tom Coughlin always bring surprising conclusions and opportunities for SNIA members to keep abreast of new memory technologies, and their outlook was complemented by updates on SNIA standards on memory-to-memory data movement and on JEDEC memory standards; presentations on thinking memory, fabric attached memory, and optimizing memory systems using simulations; a panel examining where the industry is going with persistent memory; and much more.”

Additional highlights included an EDSFF panel covering the latest SNIA specifications that support these form factors, sharing an overview of platforms that are EDSFF-enabled, and discussing the future for new product and application introductions; a discussion on NVMe as a cloud interface; and a session on detecting ransomware with computational storage.

New to the 2023 Summit – and continuing to get great views – was a “mini track” on Security, led by Eric Hibbard, chair of the SNIA Storage Security Technical Work Group with contributions from IEEE Security Work Group members, including presentations on cybersecurity, fine grain encryption, storage sanitization, and zero trust architecture.

Co-Chairs McIntyre and Nelson encourage everyone to check out the video playlist and send your feedback to askcmsi@snia.org. The “Year of the Summit” continues with networking opportunities at the upcoming SmartNIC Summit (June), Flash Memory Summit (August), and SNIA Storage Developer Conference (September).  Details on all these events and more are at the SNIA Event Calendar page.  See you soon!

50 Speakers Featured at the 2023 SNIA Compute+Memory+Storage Summit

SNIA’s Compute+Memory+Storage Summit is where architectures, solutions, and community come together. Our 2023 Summit – taking place virtually on April 11-12, 2023 – is the best example to date, featuring a stellar lineup of 50 speakers in 40 sessions covering topics including computational storage real-world applications, the future of memory, critical storage security issues, and the latest on SSD form factors, CXL™, and UCIe™.

“We’re excited to welcome executives, architects, developers, implementers, and users to our 11th annual Summit,” said David McIntyre, C+M+S Summit Co-Chair, and member of the SNIA Board of Directors.  “We’ve gathered the technology leaders to bring us the latest developments in compute, memory, storage, and security in our free online event.  We hope you will watch live to ask questions of our experts as they present, and check out those sessions you miss on-demand.”

Memory sessions begin with Watch Out – Memory’s Changing! where Jim Handy and Tom Coughlin will discuss the memory technologies vying for the designer’s attention, with CXL™ and UCIe™ poised to completely change the rules. Speakers will also cover thinking memory, optimizing memory using simulations, providing capacity and TCO to applications using software memory tiering, and fabric attached memory.

Compute sessions include Steven Yuan of StorageX discussing the Efficiency of Data Centric Computing, and presentations on the computational storage and compute market, big-disk computational storage arrays for data analytics, NVMe as a cloud interface, improving storage systems for simulation science with computational storage, and updates on SNIA and NVM Express work on computational storage standards.

CXL and UCIe will be featured with presentations on CXL 3.0 and Universal Chiplet Interconnect Express™ On-Package Innovation Slot for Compute, Memory, and Storage Applications.

The Summit will also dive into security with an introductory view of today’s storage security landscape and additional sessions on zero trust architecture, storage sanitization, encryption, and cyber recovery and resilience.

For 2023, the Summit is delighted to present three panels – one on Exploring the Compute Express Link™ (CXL™) Device Ecosystem and Usage Models moderated by Kurtis Bowman of the CXL Consortium, one on Persistent Memory Trends moderated by Dave Eggleston of Microchip, and one on Form Factor Updates, moderated by Cameron Brett of the SNIA SSD Special Interest Group.

We will also feature the popular SNIA Birds-of-a-Feather sessions. On Tuesday April 11 at 4:00 pm PDT/7:00 pm EDT, you can join to discuss the latest compute, memory, and storage developments, and on Wednesday April 12 at 3:00 pm PDT/6:00 pm EDT, we’ll be talking about memory advances.

Learn more in our Summit preview video, check out the agenda, and register for free to access our Summit platform!

“Year of the Summit” Kicks Off with Live and Virtual Events

For 11 years, SNIA Compute, Memory and Storage Initiative (CMSI) has presented a Summit featuring industry leaders speaking on the key topics of the day.  In the early years, it was persistent memory-focused, educating audiences on the benefits and uses of persistent memory.  In 2020 it expanded to a Persistent Memory+Computational Storage Summit, examining that new technology, its architecture, and use cases.

Now in 2023, the Summit is expanding again to focus on compute, memory, and storage.  In fact, we’re calling 2023 the Year of the Summit – a year to get back to meeting in person and offering a variety of ways to listen to leaders, learn about technology, and network to discuss innovations, challenges, solutions, and futures.

We’re delighted that our first event of the Year of the Summit is a networking event at MemCon, taking place March 28-29 at the Computer History Museum in Mountain View CA.

At MemCon, SNIA CMSI member and IEEE President-elect Tom Coughlin of Coughlin Associates will moderate a panel discussion on Compute, Memory, and Storage Technology Trends for the Application Developer.  Panel members Debendra Das Sharma of Intel and the CXL™ Consortium, David McIntyre of Samsung and the SNIA Board of Directors, Arthur Sainio of SMART Modular and the SNIA Persistent Memory Special Interest Group, and Arvind Jagannath of VMware and SNIA CMSI will examine how applications and solutions available today offer ways to address enterprise and cloud provider challenges – and they’ll provide a look to the future.

SNIA leaders will be on hand to discuss work in computational storage, smart data acceleration interface (SDXI), SSD form factor advances, and persistent memory trends.  Share a libation or two at the SNIA hosted networking reception on Tuesday evening, March 28.

This inaugural MemCon event is perfect to start the conversation, as it focuses on the intersection between systems design, memory innovation (emerging memories, storage & CXL) and other enabling technologies. SNIA colleagues and friends can register for MemCon with a 15% discount using code SNIA15.

April 2023 Networking!

We will continue the Year with a newly expanded SNIA Compute+Memory+Storage Summit coming up April 11-12 as a virtual event.  Complimentary registration is now open for a stellar lineup of speakers, including Stephen Bates of Huawei, Debendra Das Sharma of  Universal Chiplet Interconnect Express™, Jim Handy of Objective Analysis, Shyam Iyer of Dell, Bill Martin of Samsung, Jake Oshins of Microsoft, Andy Rudoff of Intel, Andy Walls of IBM, and Steven Yuan of StorageX.

Summit topics include Memory’s Headed for Change, High Performance Data Analytics, CXL 3.0, Detecting Ransomware, Meeting Scaling Challenges, Open Standards for Innovation at the Package Level, and Standardizing Memory to Memory Data Movement. Great panel discussions are on tap as well.  Kurt Lender of the CXL Consortium will lead a discussion on Exploring the CXL Device Ecosystem and Usage Models, Dave Eggleston of Microchip will lead a panel with Samsung and SMART Modular on Persistent Memory Trends, and Cameron Brett of KIOXIA will lead an SSD Form Factors Update.  More details at www.snia.org/cms-summit.

Later in 2023…

Opportunities for networking will continue throughout 2023. We look forward to seeing you at the SmartNIC Summit (June 13-15), Flash Memory Summit (August 8-10), SNIA Storage Developer Conference (September 18-21), OCP Global Summit (October 17-19), and SC23 (November 12-17). Details on SNIA participation coming soon!

Our Storage Life on the Edge Webcast Series Continues….

The second webcast in our Storage Life on the Edge series is coming up on March 22, 2022 at 10:00 am Pacific time.  This panel, moderated by Bill Martin, SNIA Compute, Memory, and Storage Initiative Chair, takes a deeper dive to focus on edge use cases in the computational storage space.

Our panelists Mayank Saxena from Samsung, Stephen Bates from Eideticom, and Tong Zhang from ScaleFlux will discuss edge to cloud use cases where storage and compute resources need to be deployed in practical topologies that deliver the very best in application performance. They’ll examine high performance edge data needs, database acceleration solutions, meeting retail chain challenges, and more. You won’t want to miss their panel discussion and the chance to ask your questions live.

Register here to attend. We’ll look forward to seeing you!

Why Cryptocurrency and Computational Storage?

Our new SNIA Compute, Memory, and Storage webcast focuses on a hot topic – storage-based cryptocurrency.

Blockchains, cryptocurrency, and the internet of markets are working to transform finance, wealth, safety, digital security, and trust. Storage-based cryptocurrencies had a breakout year in 2021. Proof of Space and Time is a new blockchain consensus that uses storage capacity to secure the blockchain. Decentralized file storage will enable alternatives to hyperscale data centers for hosting files and objects. Understanding the TCO of a storage system and optimizing the utilization of the storage hardware is critical in scaling these systems.

Join our speakers, Jonmichael Hands of Chia Network and Eli Tiomkin of NGD Systems, for this discussion on how a new approach of auto-plotting SSDs combined with computational storage can lower the TCO. Registration is free for this webcast on Tuesday, February 15 at 10:00 am Pacific time. Click on the link to register and see you there! https://www.brighttalk.com/webcast/663/526154

What is eBPF, and Why Does it Matter for Computational Storage?

Recently, a question came up in the SNIA Computational Storage Special Interest Group on new developments in a technology called eBPF and how they might relate to computational storage. To learn more, SNIA on Storage sat down with Eli Tiomkin, SNIA CS SIG Chair with NGD Systems; Matias Bjørling of Western Digital; Jim Harris of Intel; Dave Landsman of Western Digital; and Oscar Pinto of Samsung.

SNIA On Storage (SOS):  The eBPF.io website defines eBPF, extended Berkeley Packet Filter, as a revolutionary technology that can run sandboxed programs in the Linux kernel without changing kernel source code or loading kernel modules.  Why is it important?

Dave Landsman (DL): eBPF emerged in Linux as a way to do network filtering, and enables the Linux kernel to be programmed.  Intelligence and features can be added to existing layers, and there is no need to add additional layers of complexity.

SNIA On Storage (SOS):  What are the elements of eBPF that would be key to computational storage? 

Jim Harris (JH):  The key to eBPF is that it is architecturally agnostic; that is, applications can download programs into a kernel without having to modify the kernel.  Computational storage allows a user to do the same types of things – develop programs on a host and have the controller execute them without having to change the firmware on the controller.

Using a hardware agnostic instruction set is preferred to having an application need to download x86 or ARM code based on what architecture is running.

DL:  It is much easier to establish a standard ecosystem with architecture independence.  Instead of an application needing to download x86 or ARM code based on the architecture, you can use a hardware agnostic instruction set where the kernel can interpret and then translate the instructions based on the processor. Computational storage would not need to know the processor running on an NVMe device with this “agnostic code”.

SOS: How has the use of eBPF evolved?

JH:  It is more efficient to run programs directly in the kernel I/O stack rather than have to return packet data to the user, operate on it there, and then send the data back to the kernel. In the Linux kernel, eBPF began as a way to capture and filter network packets.  Over time, eBPF use has evolved to additional use cases.

SOS:  What are some use case examples?

DL: One of the use cases is performance analysis. For example, eBPF can be used to measure things such as latency distributions for file system I/O, details of storage device I/O and TCP retransmits, and blocked stack traces and memory.

Matias Bjørling (MB): Other examples in the Linux kernel include tracing and gathering statistics.  However, while the eBPF programs in the kernel are fairly simple, and can be verified by the Linux kernel VM, computational programs are more complex, and longer running. Thus, there is a lot of work ongoing to explore how to efficiently apply eBPF to computational programs.

For example, open questions include what the right set of run-time restrictions for the eBPF VM is, whether any new instructions need to be defined, and how to make the program run as close as possible to the instruction set of the target hardware.

JH: One of the big use cases involves data analytics and filtering. A common data flow for data analytics involves large database table files that are often compressed and encrypted.  Without computational storage, you read the compressed and encrypted data blocks to the host, decompress and decrypt the blocks, and maybe do some filtering operations like a SQL query.  All this, however, consumes a lot of extra host PCIe, host memory, and cache bandwidth because you are reading the data blocks and doing all these operations on the host.  With computational storage, inside the device you can tell the SSD to read data and transfer it not to the host but to some memory buffers within the SSD.  The host can then tell the controller to run a fixed function program, such as decrypting the data and putting it in another location on the SSD, and then run a user supplied program like eBPF to do some filtering operations on that local decrypted data.  In the end you would transfer only the filtered data to the host.  You are doing the compute closer to the storage, saving memory and bandwidth.
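To make the data flow concrete, here is a small simulation of the two paths described above. It is only a sketch: the XOR "cipher" is a toy stand-in for real encryption, the sample rows and all function names are made up for illustration, and the "device" is just a Python function. The point is simply to compare how many bytes would have to cross the bus to the host in each case.

```python
import zlib

KEY = 0x5A  # toy XOR key: a stand-in for real encryption, NOT secure


def xor_bytes(data, key=KEY):
    """Toy symmetric 'cipher' used only to keep the example self-contained."""
    return bytes(b ^ key for b in data)


def make_stored_blob(rows):
    """Compress, then 'encrypt', the rows as they would sit on the drive."""
    return xor_bytes(zlib.compress("\n".join(rows).encode()))


def decode_and_filter(blob, predicate):
    """Decrypt, decompress, and keep only the rows matching the predicate."""
    rows = zlib.decompress(xor_bytes(blob)).decode().split("\n")
    return [r for r in rows if predicate(r)]


def host_side(blob, predicate):
    """Without computational storage: the whole blob crosses the bus."""
    bytes_to_host = len(blob)
    return decode_and_filter(blob, predicate), bytes_to_host


def device_side(blob, predicate):
    """With computational storage: only the filtered rows cross the bus."""
    result = decode_and_filter(blob, predicate)  # runs "inside the SSD"
    bytes_to_host = len("\n".join(result).encode())
    return result, bytes_to_host


# Hypothetical table: an id column and a value column.
rows = [f"{i},{(i * 7919) % 1000}" for i in range(10000)]
blob = make_stored_blob(rows)


def wanted(row):
    """Stand-in for the user supplied filter program (value column equals 0)."""
    return row.split(",")[1] == "0"


host_rows, host_bytes = host_side(blob, wanted)
dev_rows, dev_bytes = device_side(blob, wanted)

print("rows returned:", len(dev_rows))
print("bytes moved to host, host-side filtering:", host_bytes)
print("bytes moved to host, device-side filtering:", dev_bytes)
```

Both paths return the same rows; the difference is in how much data moves. The saving depends entirely on how selective the filter is, so a filter that keeps most rows would show little or no benefit.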

SOS:  How does using eBPF for computational storage look the same?  How does it look different?

JH: There are two parts to this answer.  Part 1 is the eBPF instruction set with registers and how eBPF programs are assembled.  Where we are excited about computational storage and eBPF is that the instruction set is common. There are already existing tool chains that support eBPF.  You can take a C program and compile it into an eBPF object file, which is huge.  If you add computational storage aspects to standards like NVMe, where developing unique tool chain support can take a lot of work, you can now leverage what is already there for the eBPF ecosystem.

Part 2 of the answer centers around the Linux kernel’s restrictions on what an eBPF program is allowed to do when downloaded. For example, the eBPF instruction set allows for unbounded loops, and toolchains such as gcc will generate eBPF object code with unbounded loops, but the Linux kernel will not permit those to execute – and rejects the program. These restrictions are manageable when doing packet processing in the kernel.  The kernel knows a packet’s specific data structure and can verify that data is not being accessed outside the packet.  With computational storage, you may want to run an eBPF program that operates on a set of data that has a very complex data structure – perhaps arrays not bounded or multiple levels of indirection.  Applying Linux kernel verification rules to computational storage would limit or even prevent processing this type of data.
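To make the loop restriction concrete, here is a deliberately tiny toy model of a load-time check. It is not the Linux verifier and it does not use real eBPF encoding: the instruction tuples, the opcode names, and the rule "reject any backward jump" are simplifications invented for illustration, standing in for the much more detailed analysis a real verifier performs.

```python
JUMP_OPS = ("jmp", "jnz")  # hypothetical opcodes whose argument is a target index


def has_backward_jump(program):
    """Return True if any jump targets the same or an earlier instruction."""
    for pc, (op, arg) in enumerate(program):
        if op in JUMP_OPS and arg <= pc:
            return True
    return False


def load_program(program):
    """Accept a program only if the toy check passes, as a loader might."""
    if has_backward_jump(program):
        raise ValueError("rejected: program may loop without bound")
    return program


# Forward-only control flow: accepted by the toy check.
straight = [("load", 0), ("jnz", 3), ("drop", None), ("accept", None)]

# A loop: instruction 2 jumps back to instruction 0, so it is rejected.
looping = [("load", 0), ("add", 1), ("jmp", 0), ("accept", None)]

for name, prog in (("straight", straight), ("looping", looping)):
    try:
        load_program(prog)
        print(name, "-> accepted")
    except ValueError as err:
        print(name, "->", err)
```

A rule this blunt is easy to live with for short packet filters, but it would also reject a program that legitimately needs to walk a long or variably sized data structure, which is the tension described above for computational storage.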

SOS: What are some of the other challenges you are working through with using eBPF for computational storage?

MB:  We know that x86 works fast with high memory bandwidth, while other cores are slower.  We have some general compute challenges in that eBPF needs to be able to hook into today’s hardware like we do for SSDs.  What kind of operations make sense to offload for these workloads?  How do we define a common implementation API for all of them and build an ecosystem on top of it?  Do we need an instruction-based compiler, or a library to compile up to – and if you have it on the NVMe drive side, could you use it?  eBPF in itself is great- but getting a whole ecosystem and getting all of us to agree on what makes value will be the challenge in the long term.

Oscar Pinto (OP): The Linux kernel for eBPF today is more geared towards networking in its functionality but light on storage. That may be a challenge in building a computational storage framework. We need to think through how to enhance this given that we download and execute eBPF programs in the device. As Matias indicated, x86 is great at what it does in the host today. But if we have to work with smaller CPUs in the device, they may need help from, say, dedicated hardware or similar, implemented using additional logic to aid the eBPF programs.  One question is how would these programs talk to them?  We don’t have a setup for storage like this today, and there are a variety of storage services that can benefit from eBPF.

SOS: Is SNIA addressing this challenge?

OP: On the SNIA side we are building on program functions that are downloaded to computational storage engines.  These functions run on the engines, which are CPUs or some other form of compute that are tied to an FPGA, DPU, or dedicated hardware. We are defining these abstracted functionalities in SNIA today, and the SNIA Computational Storage Technical Work Group is developing a Computational Storage Architecture and Programming Model and Computational Storage APIs to address it.  The latest versions, v0.8 and v0.5, have been approved by the SNIA Technical Council, and are now available for public review and comment at the SNIA Feedback Portal.

SOS: Is there an eBPF standard? Is it aligned with storage?

JH:  We have a challenge around what an eBPF standard should look like.  Today it is defined in the Linux kernel.  But if you want to incorporate eBPF in a storage standard you need to have something specified for that storage standard.  We know the Linux kernel will continue to evolve adding and modifying instructions. But if you have a NVMe SSD or other storage device you have to have something set in stone –the version of eBPF that the standard supports.  We need to know what the eBPF standard will look like and where will it live.  Will standards organizations need to define something separately?

SOS:  What would you like an eBPF standard to look like from a storage perspective?

JH:  We’d like an eBPF standard that can be used by everyone.  We are looking at how computational storage can be implemented in a way that is safe and secure but also be able to solve use cases that are different.

MB:  Security will be a key part of an eBPF standard.  Programs should not access data they should not have access to.  This will need to be solved within a storage device. There are some synergies with external key management. 

DL: The storage community has to figure out how to work with eBPF and make this standard something that a storage environment can take advantage of and rely on.

SOS: Where do you see the future of eBPF?

MB:  The vision is that you can build eBPFs and it works everywhere.  When we build new database systems and integrate eBPFs into them, we then have embedded kernels that can be sent to any NVMe device over the wire and be executed.  The cool part is that it can be anywhere on the path, so there becomes a lot of interesting ways to build new architectures on top of this. And together with the open system ecosystem we can create a body of accelerators in which we can then fast track the build of these ecosystems.  eBPF can put this into overdrive with use cases outside the kernel.

DL:  There may be some other environments where computational storage is being evaluated, such as WebAssembly.

JH: An eBPF run time is much easier to put into an SSD than a WebAssembly run time.

MB: eBPF makes more sense – it is simpler to start and build upon as it is not set in stone for one particular use case.

Eli Tiomkin (ET):  Different SSDs have different levels of constraints.  Every computational storage SSD in production, and even those in development, has unique capabilities that are dependent on the workload and application.

SOS:  Any final thoughts?

MB: At this point, technologies are coming together that will change the industry, letting us redesign storage systems both with computational storage and in how we manage security in NVMe devices for these programs.  We have the perfect storm pulling things together. Exciting platforms can be built using open standards specifications not previously available.

SOS:  Looking forward to this exciting future. Thanks to you all.

Q&A on Data Movement and Computational Storage

Recently, the SNIA Compute, Memory, and Storage Initiative hosted a live webcast “Data Movement and Computational Storage”, moderated by Jim Fister of The Decision Place with Nidish Kamath of KIOXIA, David McIntyre of Samsung, and Eli Tiomkin of NGD Systems as panelists.  We had a great discussion on new ways to look at storage, flexible computer systems, and how to put on your security hat.

During our conversation, we answered audience questions, and raised a few of our own!  Check out some of the back-and-forth, and tune in to the entire video for customer use cases and thoughts for the future.

Q:  What is the value of computational storage?

A:  With computational storage, you can better serve latency-sensitive workloads: you can make decisions faster at the edge and can also distribute computing to process decisions anywhere.

Q:  Why is it important to consider “data movement” with regard to computational storage?

A:  There is a reduction in data movement that computational storage brings to the system, along with higher efficiencies while moving that data and a reduction in power which users may not have yet considered.   

Q: How does power use change when computational storage is brought in?

A:  You want to “move” compute to the point in the system where the data is “at rest” and perform the operations there. In traditional systems, if you need to move data from storage to the host, there are power costs that may not even be currently measured.  However, if you can now run applications and not move data, you will realize that power reduction, which is more and more important with the anticipation of massive quantities of data coming in the future.
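
As a rough illustration of the data movement argument, the sketch below counts the bytes that cross to the host when a filter runs on the host versus when it runs where the data sits. It is a toy model under stated assumptions: the record size, the synthetic data, and both function names are hypothetical, and bytes moved is used only as a stand-in for the transfer cost, not as a power measurement.

```python
RECORD_SIZE = 4096  # bytes per record; an assumed value for illustration


def make_records(count):
    """Build synthetic (id, temperature) records with values from 20 to 69."""
    return [(i, 20 + (i % 50)) for i in range(count)]


def host_side_filter(records, threshold):
    """Traditional path: every record is moved to the host, then filtered."""
    bytes_moved = len(records) * RECORD_SIZE
    matches = [r for r in records if r[1] >= threshold]
    return matches, bytes_moved


def device_side_filter(records, threshold):
    """Computational storage path: filter at rest, move only the matches."""
    matches = [r for r in records if r[1] >= threshold]
    bytes_moved = len(matches) * RECORD_SIZE
    return matches, bytes_moved


records = make_records(10_000)
host_matches, host_bytes = host_side_filter(records, threshold=65)
dev_matches, dev_bytes = device_side_filter(records, threshold=65)

print(host_matches == dev_matches)  # True: same answer either way
print(host_bytes)                   # 40960000
print(dev_bytes)                    # 4096000
```

In this toy case one record in ten matches, so the device-side path moves a tenth of the data for the same result. Real savings depend entirely on how selective the operation is and on what the device can actually run.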

Q: Are the traditional processing/storage transistor counts the same with computational storage?

A:  With computational storage, you can put the programming where it is needed, moving the compute to the point in the system where it can do the work with a limited amount of overhead and networking bandwidth. Compute moves to where the data sits at rest, which is especially important with the explosion of data sets.

Q:  Does computational storage play a role in data security and privacy?

A: Security threats don’t always happen at the same time, so you need to consider a top-down holistic perspective. It will be important both today and in the future to consider new security threats because of data movement.

There is always a security risk when data is moving; however, computational storage reduces data movement significantly, and can serve as a more secure way to handle data because the data is not moving as much. Computational storage allows you to lock data (medical data, for example) and process it only when and if needed, in an authenticated and secure fashion.  There’s no requirement to build a whole system around this.

Q:  What are the computational storage opportunities at the edge? 

A:  We need to understand the ecosystem the computational storage device is going into. Computational storage sits at the front line of edge applications and management of edge infrastructure pieces in the cloud.  It’s a great time to embrace existing cloud policies and collaborate with customers on how policies will migrate and change to the edge.

Q: In your discussions with customers, how dynamic do they expect the sets of code running on computational storage to be? The extremes range from code that never changes (installed once/updated rarely) to code that is different for every query or operation. Please discuss how challenges differ for these approaches.

A:  The heavy lift comes into play with the application and the system integration.  To run flexible code, customers want a simple, straightforward, and seamless programming model that enables them to run as many applications as they need and change them in an easy way without disrupting the system.  Clients are using computational storage to speed up the processing of their data with dynamic reconfiguring in cutting edge applications.  We are putting a lot of effort toward this seamless and transparent model with our work in the SNIA Computational Storage Technical Work Group.

Q:  What does computational storage mean for data in the future?

A: The infrastructure of data and data movement will drastically change in the future as the edge emerges and the cloud continues to grow. Using computational storage will be extremely beneficial in the new infrastructure, and we will need to work together as an ecosystem and under SNIA to make sure we are all aligned to provide the right solutions to the customer.