Bridgeworks features in this Digitalisation World article on the benefits of WAN Acceleration in tackling slow cloud applications and Gen-AI.
April 30, 2024
Generative artificial intelligence (Gen-AI) is the latest technology craze, but David Linthicum, author of ‘An Insider’s Guide to Cloud Computing’ and a contributor at InfoWorld, opines that most cloud-based generative AI performance stinks. What’s lacking is basic computer architecture best practices, making these systems quite sluggish. This issue can arise with any cloud-based application because of the effects of latency, packet loss and poor bandwidth utilisation.
As for generative AI’s sluggishness, he comments: “Performance is often an afterthought with generative AI development and deployment. Most deploying generative AI systems on the cloud – and even without the cloud – have yet to learn what the performance of their generative AI systems should be, take no steps to determine performance and end up complaining about the performance after deployment. Or, more often, the users complain, and then generative AI designers and developers complain to me.”
Phil Hill – a systems architect at Bridgeworks – explains that there was an incident in July 2023 with Photoshop’s generative AI that led users to tweak their routes or connectivity to get around sluggish performance. There may be many reasons for this occurrence, which could include Wide Area Network (WAN) congestion on specific WAN links. Sometimes SD-WAN can avoid the congestion on certain links, allowing traffic to be rerouted and the workload evened out.
He adds: “Forgetting AI for the moment, let’s remember why software is written. Basically, need is the mother of invention. What we are seeing is a plethora of new apps and solutions on a monthly basis addressing some sort of perceived need. However, the author of such systems only has specific tools in his toolbox – the ones he owns or understands.”
Creating trade-offs
This means that, while there are many ways of doing things, a comparison of two vastly different methods will reveal trade-offs between them. As with any network, the more people sharing it, the slower it can become, because the bandwidth is apportioned among the connected users. Then again, the spectres of latency and packet loss may also be playing a role – as expressed by Linthicum.
Hill finds that the systems moving the most data will suffer the most, and it will be especially evident when they are separated from each other by large geographical distances.
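To make the combined effect of latency and packet loss concrete, here is a minimal sketch. It assumes the traffic is a single standard TCP stream and uses the widely cited Mathis approximation for a throughput ceiling; the article does not name a protocol or a formula, so both are assumptions, and the function name and sample figures are hypothetical.

```python
import math


def mathis_throughput_mbps(mss_bytes: int, rtt_ms: float, loss_rate: float) -> float:
    """Approximate ceiling on one TCP stream's throughput, in Mbit/s.

    Uses the Mathis approximation: rate <= (MSS / RTT) * (C / sqrt(p)),
    with C = sqrt(3/2). This is an illustrative model, not a measurement.
    """
    if rtt_ms <= 0:
        raise ValueError("rtt_ms must be positive")
    if not 0 < loss_rate < 1:
        raise ValueError("loss_rate must be between 0 and 1 (exclusive)")

    rtt_s = rtt_ms / 1000.0
    c = math.sqrt(3.0 / 2.0)  # roughly 1.22
    bits_per_second = (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss_rate))
    return bits_per_second / 1e6


if __name__ == "__main__":
    # Hypothetical figures: 1460-byte segments, 0.01% packet loss.
    for rtt in (10, 50, 100):
        ceiling = mathis_throughput_mbps(1460, rtt, 0.0001)
        print(f"RTT {rtt:>3} ms -> about {ceiling:.1f} Mbit/s per stream")
```

Under this model the ceiling falls in direct proportion to round-trip time, which is consistent with Hill's observation that distance hurts most when large volumes of data are being moved.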
“You also have to remember though that the AI itself is doing a few things to process the request – broadly it needs to understand the request, synthesise plausible responses, assess accuracy and ethical implications and respond,” he explains.
Processing sluggishness
Consider, as well, that when the hardware processing is shared, there is bound to be some processing latency as there will be lots of service requests. This is a fact of processing data and hardware performance. “It’s only when we transfer data that we can improve the outcome by using WAN Acceleration,” he advises.
Performance is, nevertheless, seen as an afterthought in many cases – including with generative AI development and deployment. So, at this juncture, he notes Abraham Maslow’s comment: “If the only tool you have is a hammer, you tend to see every problem as a nail.” With regard to cloud applications more broadly, when the app relies heavily on database usage or distance related to data transfers, then similar issues of sluggishness can occur.
Determining performance
So, how should organisations determine what the performance of their cloud-based applications, including Gen-AI, should be? His advice is to go back to basics and establish some key performance indicators (KPIs). This requires them to research how people currently complete a task, and how they could complete it more efficiently by doing something differently. They should also consider how long the task takes to complete, and how easy or painful the experience is. By analysing what works well and what doesn’t, performance improvements can be attained.
With regard to performance, Hill says technologies such as SD-WANs and WAN Optimisation can resolve some cloud performance issues for Gen-AI and for cloud-based applications. They can speed up the transfer of dispersed data, but not database queries or generative processes.
Better results
However, SD-WANs and WAN Optimisation can’t achieve the same results as WAN Acceleration. Hill explains: “Where data transfer has been identified as the bottleneck, and where the bandwidth is only minimally used, [we can achieve] 90% bandwidth utilisation using our PORTrockIT WAN and data acceleration appliances, but each case will be different.”
He recommends that organisations assess the WAN needs of their cloud-based applications – including Gen-AI. This requires an investigation into how much data an organisation needs to transfer in order to specify its WAN bandwidth requirements. The starting point is to analyse how many megabytes per second it needs to transfer.
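The sizing step described above is simple arithmetic, sketched below. The function name, the use of decimal units (1 GB = 1,000 MB) and the sample workload are assumptions made for illustration, not figures from Bridgeworks.

```python
from typing import Tuple


def required_bandwidth(data_gb: float, window_hours: float) -> Tuple[float, float]:
    """Return (MB/s, Mbit/s) needed to move data_gb within window_hours.

    Assumes decimal units and a perfectly sustained transfer, so the
    result is a lower bound on the link speed to specify.
    """
    if data_gb < 0:
        raise ValueError("data_gb must not be negative")
    if window_hours <= 0:
        raise ValueError("window_hours must be positive")

    megabytes = data_gb * 1000        # GB -> MB (decimal)
    seconds = window_hours * 3600     # hours -> seconds
    mb_per_s = megabytes / seconds
    mbit_per_s = mb_per_s * 8         # bytes -> bits
    return mb_per_s, mbit_per_s


if __name__ == "__main__":
    # Hypothetical workload: 3,600 GB to move inside a 10-hour window.
    mb_s, mbit_s = required_bandwidth(3600, 10)
    print(f"{mb_s:.0f} MB/s, i.e. {mbit_s:.0f} Mbit/s sustained")
```

Because WAN links are usually quoted in megabits per second while transfer volumes are measured in bytes, the factor of eight is an easy one to miss when specifying a link.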
He then advises requesting bandwidth in excess of current usage requirements. As part of the performance improvement, organisations should also consider WAN Acceleration technology to mitigate latencies of 5ms or higher.
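One way to see why even a few milliseconds of latency can leave a link underused is the window limit on a single TCP stream: it can have at most one window of data in flight per round trip. The sketch below assumes a fixed 64 KiB window with no window scaling and a 1 Gbit/s link. These are illustrative assumptions, not figures from the article, and the sketch does not describe how any particular WAN Acceleration product works.

```python
def window_limited_utilisation(window_bytes: int, rtt_ms: float, link_mbps: float) -> float:
    """Fraction of a link that one stream can fill when limited by its window.

    A single stream can send at most one window per round trip, so its
    rate is capped at window / RTT regardless of the link's capacity.
    """
    if window_bytes <= 0 or rtt_ms <= 0 or link_mbps <= 0:
        raise ValueError("all arguments must be positive")

    rtt_s = rtt_ms / 1000.0
    max_bits_per_second = window_bytes * 8 / rtt_s
    link_bits_per_second = link_mbps * 1e6
    return min(1.0, max_bits_per_second / link_bits_per_second)


if __name__ == "__main__":
    # Hypothetical case: 64 KiB window on a 1 Gbit/s link.
    for rtt in (1, 5, 50):
        share = window_limited_utilisation(64 * 1024, rtt, 1000)
        print(f"RTT {rtt:>2} ms -> {share:.1%} of the link")
```

Under these assumptions a single stream fills roughly a tenth of the link at 5ms and about one per cent at 50ms, which is the kind of under-utilisation that buying more bandwidth alone does not fix.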