Oracle releases FIPS-validated crypto module for Java 30 Apr 2025, 6:10 pm
Oracle has announced Oracle Jipher, which makes cryptographic services available for Java developers using the standard Java Cryptography Architecture (JCA) framework.
Announced April 29, Jipher is a Java cryptographic service provider that packages a Federal Information Processing Standards (FIPS) 140-2 validated OpenSSL cryptographic module. Jipher is packaged as a JAR file and is downloadable from Java Tools and Resources and from My Oracle Support for Java SE users.
Oracle noted that the JDK includes cryptographic service providers such as Sun, SunRsaSign, and SunJCE, which provide concrete implementations of algorithms as defined by the JCA framework. These providers let Java applications access security algorithm implementations by specifying a particular provider or by letting the framework locate the requested algorithm by searching the registered providers in the specified preference order. However, these cryptographic service providers are not FIPS 140 validated. The FIPS 140 standards, which are published by the National Institute of Standards and Technology (NIST), define security requirements for cryptographic modules.
Jipher enables deployments of Java applications in FIPS 140-regulated environments. It achieves this by leveraging the OpenSSL 3.x FIPS module, Oracle said. Jipher requires an up-to-date release of Oracle JDK 17 or JDK 21, or GraalVM for JDK 17 or JDK 21, and is made available under the Java SE OTN license. It is supported for Java SE subscribers and users running Java workloads in Oracle Cloud Infrastructure.
Jipher represents a significant advancement in Oracle’s commitment to delivering standards-compliant security solutions, Oracle said. With JDK 24, released in March, Oracle delivered two post-quantum algorithm implementations standardized by FIPS 203 and FIPS 204 to help protect Java users against emerging threats of quantum computing.
Zoho adds AI capabilities to its low code dev platform 30 Apr 2025, 1:57 pm
Zoho on Wednesday announced the addition of 10 AI-centric services and features within Zoho Creator, the company’s low-code application development platform, which it said are part of its pledge to invest only in “AI capabilities that drive real-time, practical and secure benefits to business users.”
The expanded offerings include CoCreator, the firm’s new AI “development partner” powered by Zia, Zoho’s AI assistant, that it said in a release “facilitates faster, simpler and more intelligent app building with the use of voice and written prompts, process flows and business specification documents.”
New features also include the ability to transform unstructured data from different file types and databases into customized applications, aided by what the company described as “advanced AI-based data prep capabilities that remove inconsistencies and bring logical structure to detail.”
In addition, users can leverage either ZohoAI or OpenAI to develop applications using prompts, process flow diagrams, or systems documentation, and, it said, “automatically generate contextual code blocks tailored to application requirements and structure.”
Shashi Bellamkonda, principal research director at Info-Tech Research Group, said, “there is a dichotomy in the use and access to AI. While anyone with a browser has access to an LLM, creating custom applications has largely been inaccessible to regular business users and has needed the intervention of technology.”
Zoho, he said, “is not new to AI and has been working on it for some time. Zoho Creator is a step in the right direction for business users to create their own apps within the organization, with built-in guardrails that may ease concerns from CxOs, especially around allocating precious data science resources to be used for simple custom applications.”
Bellamkonda added, “‘I love writing detailed documentation,’ said nobody, and tools like Zoho’s CoCreator can assist in automating software requirement creation and may potentially do a better job than any bored human being and not miss anything; [there’s] less spreadsheet wrangling.”
It is, he said, “aimed at improving developer productivity, but I also see clear use cases for business users using CoCreator, [as well as] finance, marketing, and sales teams creating their own mini-CRMs. Reducing development cycles within enterprises will be a huge cost-saving using tools like Zoho CoCreator.”
The company, said Bellamkonda, “has shared its stance on responsible AI, noting that its models are not trained on customer data, which aligns with the privacy and compliance concerns of CIOs and CTOs. [Its] pricing models, which are typically designed with accessibility in mind, may appeal to customers and attract organizations of various sizes.”
That said, he pointed out, “adoption beyond developers and technical teams will be a challenge, as non-tech users are not used to creating their own apps. Zoho will have to continue to reach out and educate end users by collaborating with their technology buyers.”
Used effectively, “this could lead to measurable productivity gains within organizations,” Bellamkonda said. “Zoho is an outlier, as they own their tech stack and do not have the pressure of external funding, which means the economics of doing business with Zoho may be an advantage.”
Bharath Kumar, head of marketing and CX, Zoho Creator, said during a press and analyst briefing that since its launch in 2006, Creator has been “a reflection of market trends, initially acting as an online database when cloud was starting up, ensuring anytime, anywhere access of data was possible.”
Now, he said, “we are at a paradigm shift, at a point where the impact is coming from AI. [It] empowers everyone, it makes the whole process easy, and it is enabling users to build powerful applications.”
Info-Tech’s Bellamkonda added that the fact that CoCreator, which now is part of customers’ existing Creator subscriptions, is so user-friendly is paramount. “In my opinion it should be used by citizen developers more than regular developers,” he said. “[For the latter], it will make their job easier, but it’s powerful for [the former] because they can do something themselves instead of having to go through a whole process.”
Meta will offer its Llama AI model as an API too 30 Apr 2025, 11:34 am
Meta has unveiled a preview version of an API for its Llama large language models. The new offering will transform Meta’s popular open-source models into an enterprise-ready service directly challenging established players like OpenAI while addressing a key concern for enterprise adopters: freedom from vendor lock-in.
“We want to make it even easier for you to quickly start building with Llama, while also giving you complete control over your models and weights without being locked into an API,” Meta said in a statement during its first-ever LlamaCon developer forum.
The Llama API represents Meta’s evolution from simply releasing open-source models to providing cloud-based AI infrastructure and services.
Greyhound Research chief analyst Sanchit Vir Gogia said, “They’re shifting the battlefield from model quality alone to inference cost, openness, and hardware advantage.”
OpenAI SDK compatibility
The new service will offer one-click API key creation, interactive model playgrounds, and immediate access to Meta’s latest Llama 4 Scout and Llama 4 Maverick models, the company said.
Integration with existing infrastructure is straightforward through lightweight SDKs in both Python and TypeScript. Meta has maintained compatibility with the OpenAI SDK, allowing developers to convert existing applications with minimal code changes.
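In practice, that compatibility means an existing TypeScript client built on the standard openai npm package can, in principle, be pointed at the Llama API by swapping the base URL and key. The sketch below is illustrative only; the endpoint URL, environment variable, and model identifier are placeholders rather than Meta’s documented values.

// Minimal sketch using the OpenAI TypeScript SDK against an OpenAI-compatible
// endpoint. The base URL and model name below are placeholders, not Meta's
// documented values; check the Llama API documentation for the real ones.
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.LLAMA_API_KEY,          // key created in the Llama API console (assumed env var)
  baseURL: "https://api.llama.example/v1",    // hypothetical OpenAI-compatible endpoint
});

async function main() {
  const completion = await client.chat.completions.create({
    model: "llama-4-scout",                   // placeholder model identifier
    messages: [{ role: "user", content: "Summarize our Q3 incident reports." }],
  });
  console.log(completion.choices[0].message.content);
}

main().catch(console.error);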
The solution includes tools for fine-tuning and evaluation, enabling developers to create custom versions of the new Llama 3.3 8B model — potentially reducing costs while improving performance for specific use cases.
Chip partnerships
Meta will collaborate with AI chip makers Cerebras and Groq to improve inferencing speed, a critical factor in production AI applications.
Cerebras, known for its specialized AI chips, promises dramatically faster performance compared to conventional GPU solutions. According to third-party benchmarks cited by the company, Llama 4 Scout runs on its chips at over 2,600 tokens per second, compared to OpenAI’s ChatGPT running at approximately 130 tokens per second.
“Developers building agentic and real-time apps need speed,” said Andrew Feldman, CEO of Cerebras. “With Cerebras on Llama API, they can build AI systems that are fundamentally out of reach for leading GPU-based inference clouds.”
Similarly, Groq’s Language Processing Unit (LPU) chips deliver speeds of up to 625 tokens per second. Jonathan Ross, Groq’s CEO, emphasized that their solution is “vertically integrated for one job: inference,” with every layer “engineered to deliver consistent speed and cost efficiency without compromise.”
Neil Shah, VP for research and partner at Counterpoint Research, said, “By adopting cutting-edge but ‘open’ solutions like Llama API, enterprise developers now have better choices and don’t have to compromise on speed and efficiency or get locked into proprietary models.”
Greyhound’s Gogia said that Meta’s strategic tie-ups with Groq and Cerebras to support the Llama AI “mark a decisive pivot in the LLM-as-a-Service market.”
Exploiting hesitancy about proprietary AI
The Llama API enters a market where OpenAI’s GPT models have established early dominance, but Meta is leveraging key advantages to attract enterprise customers who remain hesitant about proprietary AI infrastructure.
“Meta’s Llama API presents a fundamentally different proposition for enterprise AI builders — it’s not just a tool, but a philosophy shift,” Gogia noted. “Unlike proprietary APIs from OpenAI or Anthropic, which bind developers into opaque pricing, closed weights, and restrictive usage rights, Llama offers openness, modularity, and the freedom to choose one’s own inference stack.”
Meta has made an explicit commitment to data privacy, saying it does not use prompts or model responses to train its AI models, which directly addresses concerns about other providers using customer data to improve their systems. Furthermore, its data portability guarantee ensures that models built on the Llama API are not locked to Meta’s servers, but can be moved and hosted wherever enterprises wish.
This approach creates a unique middle ground: enterprise-grade convenience with the ultimate exit strategy of complete model ownership.
Market impact and future plans
Currently available as a limited free preview with broader access planned “in the coming weeks and months,” the Llama API positions Meta as a direct competitor to OpenAI, Microsoft, and Google. The company describes this release as “just step one,” with additional enterprise capabilities expected throughout 2025.
Prabhu Ram, VP for the industry research group at CyberMedia Research, described Meta’s Llama API as a faster, more open, and modular alternative to existing LLM-as-a-service offerings. “However, it still trails proprietary platforms like OpenAI and Google in ecosystem integration and mature enterprise tooling.”
For technical teams eager to test these performance claims, accessing Llama 4 models powered by Cerebras and Groq requires only a simple selection within the API interface.
Industry analysts suggest Meta’s entry could accelerate price competition in the AI API market while raising the bar for inference performance. For enterprises developing customer-facing AI applications, the performance improvements could enable new categories of applications where response time is critical.
“Meta’s long-term impact will hinge on how effectively it can close the ecosystem gap and deliver enterprise-grade solutions atop its open model stack,” Ram concluded.
Four essential ingredients of software development 30 Apr 2025, 5:00 am
Somewhere between marching to a screaming Marine Corps drill instructor, watching Naval Aviators smash into a steel deck, and living in constant fear of getting chewed out for an “Irish pennant,” I learned some valuable lessons about writing good code.
In the fall of 1987, I was commissioned as an Ensign in the US Navy. I received my commission as a result of 14 weeks of close, personal instruction from Gunnery Sergeant Bernhard Jones, USMC, while at Aviation Officer Candidate School (AOCS). From there, I spent the next five years as an Aviation Intelligence Officer in FA-18 squadrons, where I trained and briefed Naval Aviators on the threats they might face in the course of their duties.
Those years defined much of who I am today, and many of the lessons I learned from the Navy carried over into my career as a software developer and development manager. Here are four of those lessons.
Attention to detail
Much of my experience at AOCS involved the simple lesson of paying attention. Push-ups are a great teacher. You quickly become a believer when you’re screaming “Attention to detail!” on the way down and “Teamwork!” on the way up. A slight blemish on the shine of a shoe or a loose thread—the result of not paying attention to one’s uniform, or the uniform of one’s classmate—resulted in the command “On your face!”, sending us to the floor for more push-ups. Attention to detail is critical in aviation. Even the tiniest, seemingly insignificant details can mean the difference between a safe mission and disaster.
While most software isn’t a life or death matter, attention to detail is critical for success. When coding, if one pays attention to all the little details—proper naming, good formatting, ensuring corner cases are covered—the result will be more maintainable code and fewer bugs. Attention to detail today means less hassle tomorrow.
Teamwork
The other half of the push-up chant is teamwork. It is rare for a military member to do anything alone. Aircraft almost always fly in at least a group of two, and most aircraft have multiple crew members. Getting an aircraft launched off the deck of an aircraft carrier requires a highly coordinated dance of many people working together. Nearly every step involves working together with others to accomplish the goal.
In software development, some lone developers build things, generally for themselves. However, most developers are part of a team, and working together is critical for the success of any project. Whether it is pair programming, answering questions in Slack, doing code reviews, or giving training, a software developer has to work with others to get a project into the hands of customers.
Communication
In naval aviation, communication is critical. Talking on the radio is a skill unto itself. There are precise ways to ensure that messages are delivered and acknowledged, as radio communications are notoriously unstable. Within an aircraft, the simple notion of who has control of the plane is very clearly communicated. “I have the aircraft” is said clearly and firmly. On a flight deck, the noise is overwhelming, so all communications must be done through signalling and other means. A miscommunication can very easily and very quickly result in a deadly accident. There is no margin for error.
Software developers won’t usually cause a fatal mishap, but clear communication still makes or breaks a project. Good communication is essential to successful teamwork. It can mean writing well (in emails, chats, issues, documentation, etc.), dealing well with others (including difficult personalities), and even writing good code.
I think code is an underappreciated means of communication. I’d go so far as to say it is the single most important way a developer communicates. Not only does code communicate to the compiler, but it also communicates to all future developers who will maintain that code. It might seem a bit odd to make sure that you are expressing yourself clearly to someone you may never meet, but that’s a big part of writing good code.
Standard operating procedures
All naval aviators live by the NATOPS manual—Naval Air Training and Operating Procedures Standardization—which outlines the acceptable, proper, and proven way to operate a given aircraft. Aviators are expected to know the NATOPS manual for their aircraft inside and out. The manual is said to be “written in blood” because many of the procedures found within were instituted as the result of crashes.
In the software development world, we have “best practices.” Thankfully, our manuals are written in project postmortems, not actual ones. The right way to get things done is learned through the experience of, well, doing things the wrong way. A good development team has a set of rules that they follow—for formatting code, conducting code reviews, building object libraries, and so on—that are proven ways to be successful. Lessons learned become blueprints for future operations.
Every flight in Naval Aviation has a strict ritual that it follows: a briefing that discusses everything expected to happen on the flight, a walk around the aircraft to ensure that it is in good working order, the flight itself, which is conducted according to the briefing, and a thorough debriefing of everything that happened on the flight.
This ritual will be pretty familiar to any development team that does planning meetings, code reviews, and retrospectives.
I’ve never done it, but easily one of the craziest things we humans do is land a fighter jet on the pitching deck of an aircraft carrier in the dark of night. Such amazing feats are possible only through attention to detail and teamwork. Unless you are writing code for something like a pacemaker, software development isn’t a matter of life or death, but there is a lot we can learn from what it takes to bring a fighter jet safely home to a pitching flight deck.
Catching up with Angular 19 30 Apr 2025, 5:00 am
Angular 19 continues its ongoing project of simplifying and improving developer experience while boosting performance. The development team’s well-publicized attention to these goals has paid off in the general perception of Angular as a project on the upswing. The most recent State of JavaScript survey showed a strong increase in positive developer sentiment toward Angular.
Now is a great time to catch up with what’s new with this dynamic framework. I’ll discuss several innovative ideas and changes found in the latest release. Also see my recent article comparing Angular with React, Vue, and Svelte.
What’s different about Angular?
Angular is a complete, all-in-one reactive framework for JavaScript. It started as an architected system that has evolved in response to real-world usage and feedback. Changes recently have focused on developer experience and performance. This latest release brings continued progress that will be of interest to both Angular users and the Angular-curious.
Incremental hydration
Incremental hydration is a performance optimization that is new with the developer preview of Angular 19. Hydration is the process of making components in the user interface interactive. This is a key element of optimization in modern applications, where striking a balance between server-side rendering (SSR) and client-side rendering is essential.
There are many ways to slice this cake, and Angular 19’s incremental hydration puts the power in the developer’s hands. Developers using Angular 19 can define exactly how a page will be loaded on the client after it has been rendered on the server side.
Angular developers use the @defer block to describe to the engine how a component should be made interactive in the browser. The engine uses these instructions to load the necessary JavaScript. Triggers in the @defer block are used to fine-tune this behavior.
While @defer is not new, the Angular 19 version lets you control how hydration is done on a per-component basis.
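As a rough illustration of the idea, the sketch below shows a deferred block that is server-rendered with the rest of the page but only hydrated when it scrolls into view. The component and selector names are invented, and it assumes incremental hydration has been enabled in the application config (via provideClientHydration with the incremental hydration feature in the Angular 19 developer preview).

// Minimal sketch of an incremental hydration trigger on a @defer block.
// CommentsComponent and its selector are hypothetical stand-ins.
import { Component } from "@angular/core";
import { CommentsComponent } from "./comments.component"; // hypothetical child component

@Component({
  selector: "app-article",
  imports: [CommentsComponent],
  template: `
    <article><!-- article body, rendered and hydrated normally --></article>

    @defer (hydrate on viewport) {
      <!-- Server-rendered along with the page, but its JavaScript is only
           fetched and the component only hydrated once it scrolls into view. -->
      <app-comments />
    }
  `,
})
export class ArticleComponent {}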
Event replay
The Angular 19 documents include a deep dive into event replay, which is a largely under-the-hood mechanism related to hydration. Essentially, event replay answers the question: If a component is loaded onto the browser without interactivity and then engaged by the user, how will the engine “replay” that event once the component is bootstrapped?
Usually, these details are hidden from framework users, even though they have a major impact on application performance. Event replay also has interesting uses when optimizing or debugging deeply. It’s an interesting new feature for programmers to explore.
Granular control over server routes
Angular is adding a ServerRoute to allow users to fine-tune the engine’s rendering of server-side routes. Angular joins other full-stack frameworks in empowering developers with more control over route rendering. Essentially, the new ServerRoute allows you, as the app developer, to control how your routes are rendered on a per-page basis. In Angular 19, you now have three options:
- Rendered on the server
- Rendered on the client
- Pre-rendered on the server
This lets you get the right mode for your specific pages, based on their characteristics. By default, routes with parameters are server-side rendered while routes without parameters are pre-rendered.
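Here is a minimal sketch of what that per-route configuration can look like, using the ServerRoute type and RenderMode options from @angular/ssr; the route paths are illustrative.

// Minimal sketch of per-route render modes with Angular 19's server routing
// (developer preview). Route paths are examples, not a prescribed layout.
import { RenderMode, ServerRoute } from "@angular/ssr";

export const serverRoutes: ServerRoute[] = [
  // Static landing page: pre-rendered on the server at build time.
  { path: "", renderMode: RenderMode.Prerender },
  // Parameterized detail page: rendered on the server per request.
  { path: "products/:id", renderMode: RenderMode.Server },
  // Highly interactive dashboard: rendered entirely on the client.
  { path: "dashboard", renderMode: RenderMode.Client },
];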
These kinds of hooks into the SSR engine allow developers to explore more aggressive, application-specific optimizations than were previously possible. It’s important to remember that you don’t have to adopt these optimizations up-front. It’s sometimes not clear where to apply them until an application has matured enough that you have real data to work with. But when you do find the bottlenecks, being able to drill down and adjust directly in the UI is vital.
There is both an art and a science to this kind of UI optimization. It often also depends on end-user behavior. Angular now provides tools that make it easier than ever for application developers to handle complex user interactions in the UI.
Factoring out Zone.js
Factoring out Zone.js was an implementation decision to simplify Angular’s internal design and performance. Zone.js was used by the server-side rendering engine to determine when the page was finished with asynchronous operations (things like data store and API requests and loading resources for navigation). These had to be completed before the page was marked as ready (“stable”) and sent to the client. Most developers who interacted with Zone.js did so for debugging or optimizing.
Angular 19 replaces Zone.js with RxJS, which is already in use throughout Angular. RxJS simplifies the dependencies for Angular, brings this aspect of the engine in line with the existing idioms for the framework, and makes it easier for application developers to push debugging into the rendering engine.
Standalone components are now the default option
Angular 19 makes standalone components (i.e., components defined without a module) the default, though developers can still use modules. Non-module components were first introduced in Angular 14, and have already become the de facto standard for developers in practice. It is remarkable how much lighter components feel after dropping the module front matter.
Signals-based inputs, outputs, and view queries
Signals are quite popular for providing fine-grained, universal reactivity in a simple JavaScript syntax. Angular adopted Signals early on, and Angular 19 solidifies their use in inputs (Angular’s version of child props), outputs (child-to-parent eventing), and view queries (direct DOM access, like React’s useRef). This simplifies and unifies Angular’s reactive idiom, though in many cases the difference is subtle: just an extra dot operator or parentheses for Signals.
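Here is a small, hypothetical component sketching the signal-based forms of all three: input(), output(), and viewChild(). Reading an input or a view query is a function call, which is the extra set of parentheses mentioned above.

// Minimal sketch of signal-based inputs, outputs, and view queries.
// Component, selector, and member names are illustrative.
import { Component, ElementRef, input, output, viewChild } from "@angular/core";

@Component({
  selector: "app-search-box",
  template: `
    <input #field [placeholder]="placeholder()" (keyup.enter)="submit(field.value)" />
  `,
})
export class SearchBoxComponent {
  // Signal-based input: the parent binds [placeholder]="..."; read it by calling placeholder().
  placeholder = input<string>("Search...");

  // Signal-based output: the parent listens with (search)="..."; emit with .emit().
  search = output<string>();

  // Signal-based view query: returns a signal holding the element ref (or undefined).
  inputRef = viewChild<ElementRef<HTMLInputElement>>("field");

  submit(value: string): void {
    this.search.emit(value);
  }
}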
Angular 19 includes CLI commands that will auto-migrate your application to Signals. The Angular team has also integrated Signals auto-migration capabilities into their IDE schematics. You can simply right-click to get context-sensitive access to the Signals conversion support. In cases where you have code explicitly modifying values, you will still have to set up the migration manually, since Signals inputs are immutable.
Command-line environment variable declarations
Any CLI tool needs a way to work with environment variables. Angular 19 goes a step further, adding the ability to define variables directly on the build command:
ng build --define "apiKey='$API_KEY'"
An exciting project
That is a lot of action, especially given Angular’s brisk release cadence. It’s an exciting project and merits exploring if you aren’t already using it. Of course, all this change means Angular users have to keep up with new releases. It helps that the framework is improving with each new version and implementing many of the latest ideas in reactive programming.
It is remarkable that Angular has renewed itself to such an extent that it is now considered one of the most dynamic and forward-looking of all the reactive frameworks. It’s a big, flagship project, backed by Google, but with a welcoming open source vibe. That’s impressive.
Apiiro launches AI-powered risk analysis map for software 29 Apr 2025, 10:33 pm
Agentic application security provider Apiiro has unveiled Software Graph Visualization, an AI-powered interactive map that enables real-time visualization of software architectures across all components, vulnerabilities, data exposure, and material changes, according to the company.
Introduced April 28, Software Graph Visualization offers question-driven, dynamic graphs that map risk exposure, attack surfaces, and sensitive data flow in real time, helping security teams understand changes to software architecture and pinpoint threats with a visual inventory of critical software components. “Software Graph Visualization eliminates the need to interview developers or use self-based attestation questionnaires that make it hard to identify how software components connect and where security risks emerge,” said Idan Plotnik, co-founder and CEO of Apiiro. “By using AI agents to generate a visual map of the entire software inventory, along with contextual security review questions and threat model stories, security teams can quickly identify, prioritize, remediate, and communicate risks.”
The following use cases are addressed by Software Graph Visualization:
- Threat modeling to assess risk and vulnerabilities in designs and evaluate how sensitive data moves across boundaries.
- Penetration test scoping to understand attack surfaces when scoping tests. The graph visualizes API architecture and data flow, highlighting risk entry points, potential vulnerabilities, and business-critical areas of the system more prone to exploitation.
- Change impact assessment to assess risk introduced by new code changes by comparing pre-release and post-release states of the application.
- Privacy review to streamline the identification of privacy risks with sensitive data.
- Blast radius analysis to measure the potential spread and impact of security breaches.
- Toxic combinations to identify dangerous combinations across an application.
- Vulnerability management to prioritize and remediate vulnerabilities with complete context.
Software Graph Visualization works with multiple programming languages including Java, C#, Python, JavaScript, and TypeScript. It is available in SaaS, hybrid, and on-premises deployment options.
GCC 15 compilers arrive with Rust, C, C++, and Cobol enhancements 29 Apr 2025, 5:47 pm
GCC (GNU Compiler Collection) 15.1 has arrived with improvements for programming languages ranging from Rust to C to Cobol. GCC 15.1 also brings improvements for vectorization and for compiling very large input files.
Announced April 25 as the first release in the GCC 15 branch, GCC 15.1 is described by GCC developers as a major release. Among the highlights, compile time for large input files with -Wmisleading-indentation has been significantly improved, according to the GCC team. The compiler can now track column numbers larger than 4,096, and very large source files have more accurate location reporting.
In another improvement, the vectorizer now supports vectorizing loops with early exits where the number of elements for the input pointers is unknown, through peeling for alignment. This capability is supported only for loops with fixed vector lengths. GCC 15.1 includes a long list of other changes, such as the ability to emit diagnostics in multiple formats simultaneously via the new -fdiagnostics-add-output= option.
In language support, the C compiler now defaults to the C23 standard and fully conforms to it, said Richard Biener, who participates in the development of GCC. Biener cited C’s support for C23 as likely the most impressive feature of GCC 15.1. New Rust accommodations include basic inline assembly support in the front end, enabling compilation of architecture functions of core 1.49; the addition of support for for loops; and the lowering of the minimum required Rust version to Rust 1.49, allowing more systems to compile the Rust front end. For Cobol, GCC now includes an ISO Cobol compiler, gcobol. And for C++, the GCC front end now implements additional C++26 features such as attributes for structured bindings and variadic friends, along with some missing C++23 features and defect resolutions.
GCC is under the jurisdiction of the Free Software Foundation.
Why hasn’t cheaper hardware lowered cloud prices? 29 Apr 2025, 5:00 am
Public cloud providers have established themselves as the primary lifeline for modern enterprise IT, delivering unprecedented scalability, operational efficiency, and innovation. Despite all the advancements they’ve ushered in, businesses are noticing a disparity that’s hard to ignore: Why are public cloud prices holding firm—or even increasing—while hardware costs have plummeted?
As an analyst who closely follows this industry, I believe the answer lies at the intersection of economics, business priorities, and infrastructure complexities. Public cloud providers operate on the promise of seemingly infinite scalability, yet they are businesses beholden to investors and shareholders as well as customers. Their billion-dollar infrastructure investments, shareholder expectations for consistent returns, and high operational costs contribute to a rigid pricing structure—a reality many enterprises now grapple with.
Keep in mind that I don’t work for a cloud provider. I’m offering some educated guesses based on anecdotal data, observed trends, and logical conclusions. With that in mind, I’ll explore why major cloud providers haven’t passed on savings from declining hardware costs and what that means for businesses. More importantly, how can enterprises navigate this landscape? I recommend considering alternatives to the hyperscalers, from managed service providers to private clouds. The public cloud’s unchecked expansion may face serious headwinds as organizations reprioritize cost efficiency.
Reasons for high cloud prices
During the past 15 years, public cloud providers have made massive investments in building and maintaining their global infrastructure. Billions of dollars have gone into the construction of state-of-the-art data centers and global private networks and to fund R&D for advanced cloud services.
These expenditures are not one-time costs. Hyperscalers must continuously invest to keep up with demand, roll out new services, and navigate regulatory challenges. Understandably, investors expect strong and consistent returns on these investments. In fact, public cloud providers are not incentivized to dramatically lower costs; doing so could adversely affect margins and shareholder confidence.
This may explain why cloud pricing remains steady even as hardware costs (servers, storage, networking equipment) fall. The hyperscalers’ priority appears to be sustaining a long-term business model rather than passing on immediate cost savings to customers. In addition, the operational demands associated with running hyperscale cloud environments remain significant. Public cloud providers face ongoing expenses, including:
- Power and cooling for massive data centers
- Maintenance costs for a global, distributed infrastructure
- Compliance with sustainability and carbon-neutrality initiatives
- Cybersecurity defenses in a constantly shifting threat landscape
The sheer scale of these providers’ operations adds layers of complexity. Providers must design for hyper-redundancy, manage geographically dispersed data centers, and guarantee uptime and service levels. These factors likely contribute to their reluctance to lower prices.
Public cloud providers justify their premium pricing by pointing to the wide array of features they offer. Beyond compute, storage, and networking, these platforms provide managed databases, machine learning tools, internet of things capabilities, edge services, global user access, and more.
This value-added approach positions public clouds as more than just infrastructure providers—they are integral enablers of modern enterprise innovation. However, premium pricing may become increasingly unjustifiable for businesses that do not require this full spectrum of services or for companies that can find functional alternatives.
Public cloud alternatives are increasingly attractive
Enterprises must recognize that public cloud providers ultimately operate as businesses driven by profitability and shareholder returns. This is not inherently bad, but it does suggest businesses should evaluate their IT infrastructure strategies with a broader perspective. The alternatives are becoming more accessible and affordable, increasingly attractive for organizations seeking cost-efficient solutions.
One such option is the use of managed service providers (MSPs). These providers deliver infrastructure scalability and customizable services combined with tailored support. They are often a cost-effective choice for businesses that have specialized requirements but don’t want the premium price tag of hyperscalers.
Another alternative is colocation facilities, which allow organizations to deploy and maintain their own hardware within state-of-the-art shared facilities. Businesses retain hardware ownership while managing long-term costs. Similarly, sovereign or regional cloud providers cater to enterprises with intense compliance and regulatory requirements. These providers focus on localized storage and regulations, services that benefit industries such as healthcare and finance, often at a cost considerably lower than hyperscalers.
Private cloud solutions are a viable option for enterprises with predictable, high-volume workloads. By owning and managing their own infrastructure, organizations can bypass recurring fees while exercising full control over resources and security. Additionally, hybrid cloud architectures offer an appealing balance, utilizing public clouds for burstable workloads and private or alternative infrastructure for baseline operations. A hybrid strategy enables both flexibility and cost optimization, ensuring businesses get the value of public clouds without being entirely reliant on them.
Enterprises that assess these alternatives could mitigate rising costs in the cloud and find solutions better aligned with their goals. By carefully analyzing workloads, scalability demands, and compliance requirements, organizations can identify IT infrastructure that delivers greater ROI while maintaining operational efficiency.
A cautious outlook for public cloud adoption
Based on anecdotal data and observed trends, I would argue that persistent high pricing in the public cloud will continue to drive organizations to rethink their infrastructure strategies. Some enterprises will always require the scalability and redundancy of hyperscalers, but others may reconsider the economics, especially when compared to alternative solutions.
If these pricing trends continue, it’s possible we’ll see a reduction in new enterprise cloud spending or a stronger focus on hybrid and multicloud environments. The discrepancy between public cloud pricing and the falling cost of hardware could ultimately serve as a wake-up call for enterprises to diversify their IT spending.
Although I don’t have a crystal ball, it’s pretty clear that high public cloud prices are here to stay—at least for the foreseeable future. Enterprises should carefully evaluate their workloads, cost structures, and long-term IT strategies. MSPs, colocation services, and private clouds deserve a second look if costs in the public cloud seem prohibitive.
5 ways generative AI boosts cloud and IT operations 29 Apr 2025, 5:00 am
Anyone who thinks engineers, administrators, and analysts in IT operations have an easy job hasn’t spent enough time in their shoes.
Automation, observability, and machine learning help IT operations deploy and manage many more large-scale and mission-critical workloads. However, expected service levels, compliance requirements, multi-cloud complexities, and exponentially increasing data volumes all increase business requirements and expectations of IT operations.
How to leverage genAI in IT and cloud operations
According to the 2024 Global Workforce AI Report, 85% of IT teams say AI makes their workday more positive. These professionals said AI gives them time to learn new skills, get more work done, and take on more creative work.
- Software developers use genAI to generate code, create documentation, and simplify app modernizations.
- GenAI is helping data scientists spend more time learning end-user workflows and reducing data bias in training data.
- CIOs invest in agentic AI to drive customer success, improve supply chain forecasting, and find manufacturing defects.
But how can genAI simplify work in IT and cloud operations? Lori Rosano, MD & SVP of North American Public Cloud at SAP, says, “Integrating genAI into cloud and IT operations empowers organizations to elevate their performance, improve agility, and become more resourceful and better equipped to respond to evolving environments.”
Here are five ways to use genAI in incident response, security, cloud infrastructure, and finops.
Improve AIops and incident response
I’ve previously written about various ways to use AIops, including for machine learning in application monitoring, helping site reliability engineers (SREs) meet service level objectives, and reducing major incident resolution times. AIops solves the problem of centralizing alert information, sequencing telemetry data, identifying likely root causes, and triggering common remediation automations.
”GenAI is significantly enhancing IT and cloud operations by automating tasks such as incident resolution and log analysis,” says Kellyn Gorman, advocate and engineer at Redgate. “It leverages predictive analytics to monitor system performance and address potential issues before they arise, provides data-driven recommendations for workload optimization, and improves user interactions through conversational tools.”
GenAI increases the scope of AIops capabilities, especially in complex IT environments where it’s hard to trace incidents to their sources. By providing engineers with genAI prompt capabilities, they can explore different scenarios around root causes and remediations of challenging incidents.
“GenAI assists by generating insights, summarizing complex system data, and automating documentation for incident response and remediation,” says Preetpal Singh, global head of product and platform engineering at Xebia. “By reducing manual effort in interpreting system logs and operational workflows, genAI helps ops teams make data-driven decisions faster while AI-driven automation handles performance tuning and anomaly detection.”
Enable accurate root cause analysis
Most IT service management functions separate incident management and problem management functions. The primary role of incident management is to find the source of an issue and restore services, while problem management performs root cause analysis, especially regarding recurring issues with multiple underlying symptoms.
“Coupling observability with AIops enables automated detection, diagnosis, and remediation—delivering self-healing infrastructure that strengthens application resilience,” says Steve Mayzak, global managing director of Search AI at Elastic. “Teams can also better interpret data and signals, gain visibility, and optimize operations. GenAI takes this further, providing intuitive navigation and deeper insights via simple queries. For example, if code consumes excessive processing power, genAI can analyze code profiling data, pinpoint high-load functions, and recommend optimizations to boost efficiency and cut costs.”
IT operations have long sought the opportunity to extend performance analytics into the application and networking layers where the more complex issues occur.
“Not only can genAI help to reduce time to resolution through the swift analysis of data sets and incident alerts, but it can also directly assist IT teams by answering their questions,” says Anant Adya, EVP at Infosys Cobalt. “AI chatbots can guide professionals through complex incidents by compiling resources and solutions from different networks.”
As organizations train genAI tools on observability, incident response, and asset management data, they will usher in a new era where AI agents trained for IT operations can analyze historical performance and recommend configuration changes to improve resiliency.
Enhance security audits and threat detection
Resolving and finding root causes of security incidents is more challenging with the growing number of threats and bad actors exposing vulnerabilities in ways that can be impossible to trace manually.
“Cloud security, even with a large team of human IT professionals, often feels like a game of whack-a-mole because there are too many entrances with too many moles to whack,” says Joe Warnimont, security and technical expert at HostingAdvice.com. “Generative AI changes the game, as it can patrol many entrances simultaneously while also making predictions on where to respond based on trends and past infiltrations.”
I expect specialized AI agents to support IT operations that differ from those information security professionals use. Each agent focuses on a specific function to detect, predict, and respond to issues and optimization opportunities.
“For cloud security, genAI enhances threat detection, identifies anomalies, and automates incident response,” says Bakul Banthia, co-founder of Tessell. “It strengthens access management by analyzing user behavior and device security while continuously auditing cloud configurations for compliance.”
Another opportunity for IT operations is accelerating compliance with data governance policies. Many organizations are deploying data security posture management (DSPM) platforms and defining their AI governance policies, but what about the required implementations in IT operations?
“With the vast amounts of data stored in the cloud, ensuring data security and privacy is paramount,” says Josh Ray, CEO at Blackwire. “GenAI can help enforce data governance policies, improve threat detection and response, automate compliance policy enforcement, and deliver continuous security improvements.”
Scale cloud ops in complex environments
Incident and problem management are reactive, where genAI can analyze data quickly and respond to issues autonomously or with a human in the middle. Another opportunity is using genAI for more proactive work where it can improve the robustness and scale of implementing standard operating procedures.
“Generative AI for IT operations helps organizations struggling to keep up with the complexity and scale of modern IT environments by streamlining processes and automating routine tasks, like patching,” says Joel Carusone, SVP of data and AI at NinjaOne.
GenAI is also used in strategic IT functions, such as scaling cloud operations for complex workloads.
“GenAI is improving IT and cloud operations by automating infrastructure, predicting demand, and reducing waste, but without oversight, it can just as easily drive up costs,” says Karthik SJ, GM of AI at LogicMonitor. “Ops teams need to learn to track AI workloads in real-time, fine-tune automation to prevent unnecessary scaling, and use AI insights to optimize costs. The real value isn’t in letting AI run the show—it’s in knowing how to control it to make cloud operations leaner, faster, and more cost-effective.”
I also see agentic AI as a partner to cloud architects and engineers, especially as public clouds and infrastructure providers release new capabilities and innovations. We should expect cloud AI agents to do more than scale infrastructure. As their sophistication improves, they can be invaluable partners for scenario-planning architecture upgrades.
Shift to scalable finops and IT strategic planning
A similar shift is happening in finops, where early AI agents are reactive and provide tactical best practices to reduce cloud costs.
“GenAI is transforming finops by automating cloud cost optimization, identifying unused resources, and dynamically adjusting workloads to reduce waste,” says Tiago Miyaoka, AI and data practice lead at Andela. “Tasks that once required manual effort from finops engineers—such as tracking underutilized instances and reallocating resources—can now be streamlined with AI-driven systems. By continuously scanning cloud environments and applying intelligent cost-saving strategies, genAI helps organizations minimize expenses while maintaining performance.”
Integrating and normalizing all the cost and consumption data to support finops activities can be challenging for larger enterprises operating in multiple clouds, geographically dispersed data centers, and edge-computing locations. GenAI is already overhauling data integration capabilities, and finops use cases offer a significant cost and carbon savings opportunity.
“Many of the cloudops and finops tools involve analyzing tons of usage data stored in several different databases and rely on APIs and scripts to get insights into the usage and costs,” says Karthik Kannan, head of product management, strategy, and operations at Nile. “GenAI capabilities such as data summarization, data visualization, and text summarization can potentially reduce or eliminate the need for such software. Ops teams can get instant insights into the usage and costs and design their optimization strategies around those insights.”
GenAI will present new opportunities to simplify work in IT operations, but don’t expect an agentic AI silver bullet anytime soon. With every wave of infrastructure and operational simplifications comes a new generation of capabilities businesses need, creating new challenges to operational resiliency.
Microsoft previews SignalR client for iOS 28 Apr 2025, 4:14 pm
Microsoft has introduced a Swift client for its SignalR library for ASP.NET, allowing iOS developers to add real-time web functionality to their applications.
SignalR Swift is a client library for connecting to SignalR servers from Swift applications, according to Microsoft. The client also works with the Azure SignalR service.
Introduced in a public preview April 22, the SignalR Swift client addresses a gap in which iOS developers who wanted real-time bi-directional communication with SignalR had to rely on community-built clients or write their own Swift implementation, both of which brought maintenance and compatibility issues, Microsoft said. With the new client, iOS developers can add real-time features such as chat, notifications, and live dashboards to SwiftUI or UIKit apps, and leverage full SignalR functionality including hubs, groups, and client/server streaming on iOS and macOS.
Samples leveraging the client can be found at github.com/dotnet. Instructions on installing the Swift client are at devblogs.microsoft.com.
SignalR is intended to simplify the process of adding real-time web functionality to applications. This real-time capability enables server code to push content to connected clients instantly, rather than having the server wait for a client to request new data.
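The new client itself is Swift, but the hub concepts it shares with the rest of SignalR are easiest to show with the long-standing JavaScript/TypeScript client (@microsoft/signalr). The sketch below is purely illustrative; the hub URL and method names depend on what the server defines.

// Illustrative sketch with the existing @microsoft/signalr TypeScript client;
// the new Swift client exposes analogous hub, group, and streaming concepts.
// The hub endpoint and method names here are placeholders.
import * as signalR from "@microsoft/signalr";

const connection = new signalR.HubConnectionBuilder()
  .withUrl("https://example.com/hubs/chat")   // hypothetical hub endpoint
  .withAutomaticReconnect()
  .build();

// Server pushes arrive without the client polling for new data.
connection.on("ReceiveMessage", (user: string, message: string) => {
  console.log(`${user}: ${message}`);
});

async function run() {
  await connection.start();                    // open the connection
  await connection.invoke("SendMessage", "alice", "Hello from TypeScript");
}

run().catch(console.error);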
AWS updates Amazon Bedrock’s Data Automation capability 28 Apr 2025, 5:41 am
AWS has updated the Data Automation capability inside its generative AI service Amazon Bedrock to further support the automation of generating insights from unstructured data and bring down the development time required for building applications underpinned by large language models (LLMs).
Bedrock’s Data Automation, according to AWS, is targeted at developers who can use the capability to accelerate the development of generative AI-driven applications by helping build components or workflows, such as automated data analysis for insights, in a simplified manner.
AWS has integrated Data Automation with Bedrock’s Knowledge Bases capability to help developers extract information from unstructured multimodal data and use it as context for retrieval augmented generation (RAG) use cases.
The latest update to Data Automation includes support for modality enablement, modality routing by file type, extraction of embedded hyperlinks when processing documents, and an increased overall document page limit of 3,000 pages.
“These new features give you more control over how your multimodal content is processed and improve Bedrock Data Automation’s overall document extraction capabilities,” AWS wrote in a blog post.
AWS said that enterprise developers can use the modality enablement feature to configure which of the modalities — image, document, audio, and video — in their data will be processed for a particular project or application.
Developers also have the option to route specific file types to a chosen modality, which means they can have Data Automation process JPEG or JPG files as documents, and MP4 or M4V files as videos, rather than as their default image or audio types.
Another feature that has been added to Data Automation is the embedding of hyperlinks found in PDFs as part of the output or insights generated.
“This feature enhances the information extraction capabilities from documents, preserving valuable link references for applications such as knowledge bases, research tools, and content indexing systems,” the cloud services provider wrote.
AWS has also increased the document processing limit in Bedrock Data Automation to 3,000 pages per document, up from 1,500 pages.
The increased limit will enable developers to process larger documents without splitting them, the cloud services provider said, adding that this also simplifies workflows for enterprises dealing with long documents or document packets.
Currently, Amazon Bedrock Data Automation is generally available in the US West (Oregon) and US East (Northern Virginia) regions.
OpenSearch in 2025: Much more than an Elasticsearch fork 28 Apr 2025, 5:00 am
Open source has never been more popular. It’s also never been more contentious.
With hundreds of billions of dollars available to the companies that best turn open source into easy-to-use cloud services, vendors have fiercely competed for enterprise dollars. This has led to a spate of licensing changes from the companies that develop the open source software, and forks from the clouds that want to package and sell it. But something interesting is happening: These forks may start as clone wars, but they’re increasingly innovative projects in their own right. I’ve recently written about OpenTofu hitting its stride against Terraform, but OpenSearch, which has its big community event this week in Amsterdam, is an even bigger success story.
Born from the fire of Elastic’s 2021 license change, OpenSearch spent its first few years stabilizing and proving it could (and should) continue to exist. In the past year, OpenSearch has actively forged its own identity as a truly independent and innovative force in enterprise search, one that is quickly evolving to be much more than an Elasticsearch look-alike.
Moving beyond the fork
To understand OpenSearch’s recent path, a quick rewind is essential. In early 2021, Elastic ditched the Apache License (ALv2) for new Elasticsearch and Kibana versions, opting for the Server Side Public License (SSPL) and the Elastic License (ELv2). The goal? Keep AWS and other cloud vendors from offering Elasticsearch as a service without Elastic getting a cut. AWS, whose managed service relied heavily on the ALv2 codebase, responded swiftly, forking Elasticsearch 7.10.2 and Kibana 7.10.2. They stripped Elastic’s proprietary code and telemetry, launching the OpenSearch project under ALv2. It was a bold move but it left a lot of uncertainty: AWS didn’t have expertise in running a community-driven project, and only had a bit more experience managing its own open source projects (such as Firecracker).
Frankly, the odds weren’t great that AWS would succeed. And yet it has.
In 2023 I noted some of OpenSearch’s early successes as it expanded its community and won over some early customers. But it’s really the events in the past year that have demonstrated just how far AWS has come in learning how to contribute to open source in big ways, setting up OpenSearch as a serious contender in enterprise search.
Even though most open source projects have very limited contributor pools and often are the handiwork of a single developer (or a single company), it’s easier to attract volunteer contributors when a project sits within a neutral foundation. As such, AWS demonstrated how serious it was about OpenSearch’s open source success when it moved the project to the Linux Foundation in late 2024, establishing the OpenSearch Software Foundation (OSSF). This wasn’t just admin shuffling; it was strategic. Placing the project within a neutral foundation directly addressed concerns about AWS controlling the project. Suddenly the Technical Steering Committee (TSC) boasted representatives from SAP, Uber, Oracle, Bytedance, and others. Additionally, OpenSearch now can claim more than 1,400 unique contributors (over 350 active), hundreds of maintainers across dozens of organizations, and activity spanning more than 100 GitHub repositories by early 2025. Critically, the percentage of contributions and maintainers from outside AWS has significantly increased, signaling progress towards genuine diversification.
For AWS, whose Leadership Principles almost demand control over customer outcomes (“Deliver Results,” etc.), this is a revolutionary change in how it does business.
Getting better all the time
Clearly, OpenSearch is on the correct path. With governance solidifying, OpenSearch has pursued aggressive development, guided by a public road map, pushing beyond its roots to tackle modern data challenges, especially in AI/vector search and observability. OpenSearch has significantly moved beyond mere Elasticsearch compatibility. Driven by user needs, OpenSearch has added vector similarity search, hybrid search combining keyword and semantic methods, and built-in neural search capabilities. In 2024 alone, OpenSearch made major strides—adding integration with Facebook’s FAISS, SIMD hardware acceleration, and vector quantization for high-performance semantic searches.
Performance and scalability improvements have also been dramatic. Query speeds increased significantly (up to six times faster than early versions), thanks to extensive optimizations. New features, such as segment replication, have boosted data ingestion throughput by approximately 25%. Additionally, remote-backed storage now enables cost-efficient indexing directly into cloud object storage services, critical for enterprises dealing with petabyte-scale data sets.
This isn’t a community hoping to play catch-up. This is a strategic bid for leadership.
It’s one thing to write good code. It’s quite another to convince enterprises to use it. In this area, there’s growing evidence that OpenSearch is gaining enterprise ground. Just measuring use (without concern for whether it’s paid adoption), by the end of 2023 OpenSearch had surpassed 300 million cumulative downloads, clearly signaling mainstream adoption. AWS, for its part, touts “tens of thousands” of customers (which may be true, but that number includes users of older Elasticsearch versions). Although it’s hard to find public examples of large enterprises adopting OpenSearch, past and future OpenSearchCon events reveal LINE, Coursera, and other significant users (though most of the talks are still given by AWS employees). Job postings show that Fidelity Investments, Warner Bros, and others are OpenSearch users. Plus, a Linux Foundation report found 46% of surveyed users run OpenSearch as a managed service, indicating significant cloud uptake. High demand (87%) for better interoperability also suggests users see it as part of a broader stack.
The long shadow of Elasticsearch
Despite progress, OpenSearch faces challenges, primarily the constant comparison with Elasticsearch. For example, Elastic often claims performance advantages (40% to 140% faster). However, a March 2025 Trail of Bits benchmark comparing OpenSearch 2.17.1 and Elasticsearch 8.15.4 found OpenSearch faster overall on the “Big 5” workload and moderately faster in Vectorsearch (default engines), though results varied. Benchmarks are notoriously unreliable gauges; your mileage may vary.
Nor can OpenSearch still claim to be the open source alternative to Elasticsearch. In late 2024, Elastic added an AGPLv3 license option alongside SSPL and ELv2. Skeptics viewed this return to open source as a cynical response to OpenSearch’s momentum, but in my own conversations with Shay Banon, Elastic’s cofounder, the company had always wanted to return to an OSI-approved license: “I personally always wanted to get back to open source, even when we changed the license. We were hoping AWS would fork and would let us move back when enough time has passed.” Whatever the motivation, Elasticsearch is now just as open source as OpenSearch.
That comparison no longer really matters. OpenSearch has proven it’s more than AWS’ knee-jerk reaction to supply chain risk. OpenSearch is building its own identity, focused on next-gen workloads. Still, its big challenge remains converting open governance and permissive licensing into an ecosystem that builds better search than Elasticsearch and other competitors. There’s a long way to go, but its progress in the past few years, and particularly in 2024, suggests OpenSearch is here to stay—and to win.
14 tiny tricks for big cloud savings 28 Apr 2025, 5:00 am
When the cost of cloud computing is listed in cents or even fractions of a cent, it’s hard to remember that even small numbers can add up to big bills. Yet they do, and every month it seems CFOs come close to dying of multiple heart attacks when the cloud computing bill arrives.
To save the health of these bean counters, and also the necks of the engineers on the receiving end of their ire, here’s a list of small ways organizations can save money on the cloud. None of these tricks is likely to lead to big savings on its own, but together they can add up—or should we say subtract down?—to lower the overall cloud bill.
Shut down development clusters when they’re not in use
Some developers work eight hours a day, and some work more. But it’s rare for anyone to use a development cluster for more than 12 hours a day for a sustained period. There are 168 hours in a week, and if you and your team use the cluster for only a quarter of them, shutting it down the rest of the time can cut the cost of running your development clusters by 75%. Yes, shutting down big clusters can take time. Yes, some types of odd machines may be hard to spin up immediately. Consider writing scripts that run in the background and manage it all for you, as in the sketch below.
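Here is a minimal sketch of such a script, assuming AWS, the boto3 SDK, and a convention of tagging development instances with Environment=dev; the tag, working hours, and scheduling approach are illustrative assumptions, not a prescription.

```python
# Stop tagged development instances outside working hours.
# Run it from cron or a scheduled job; tags and hours are illustrative.
import datetime

import boto3

ec2 = boto3.client("ec2")
WORK_HOURS = range(8, 20)  # 8:00-19:59 local time


def stop_idle_dev_instances() -> None:
    """Stop running instances tagged Environment=dev outside working hours."""
    if datetime.datetime.now().hour in WORK_HOURS:
        return
    reservations = ec2.describe_instances(
        Filters=[
            {"Name": "tag:Environment", "Values": ["dev"]},
            {"Name": "instance-state-name", "Values": ["running"]},
        ]
    )["Reservations"]
    instance_ids = [
        inst["InstanceId"] for r in reservations for inst in r["Instances"]
    ]
    if instance_ids:
        ec2.stop_instances(InstanceIds=instance_ids)


if __name__ == "__main__":
    stop_idle_dev_instances()
```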
Smart mock your microservices
Many cloud applications are constellations of machines running microservices. Instead of firing up all your machines, you can employ a smart set of mock services to imitate machines that are not the focus of the daily work. Mock instances of microservices can significantly shrink what is required to test new code. Smart developers can often configure these mock instances to offer better telemetry for debugging by tracking all the data that comes their way.
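As an illustration, here is a minimal mock of a hypothetical downstream “pricing” service, built only with Python’s standard library; it returns canned responses and logs every request it receives, which doubles as debugging telemetry.

```python
# A toy mock microservice: canned responses plus request logging as telemetry.
import json
import logging
from http.server import BaseHTTPRequestHandler, HTTPServer

logging.basicConfig(level=logging.INFO)


class MockPricingService(BaseHTTPRequestHandler):
    def do_GET(self):
        logging.info("mock pricing called: %s", self.path)  # telemetry
        sku = self.path.rstrip("/").split("/")[-1]
        body = json.dumps({"sku": sku, "price_cents": 1999}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), MockPricingService).serve_forever()
```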
Cap local disk storage
Many cloud instances come with standard disks or persistent storage with default sizes. This can be some of the most expensive disk space for your computation, so it makes sense to limit how much you assign to your machines. Instead of choosing the default, try to get by with as little as possible. This may mean clearing caches or deleting local copies of data after it’s safely stored in a database or object storage. In other words, try to build very lightweight versions of your servers that don’t need much local storage.
Right-size cloud instances
Good algorithms can boost the size of your machine when demand peaks, but clouds don’t always make it as easy to scale back down; disks in particular tend to be easy to grow and hard to shrink. By monitoring these machines closely, you can ensure that your cloud instances consume only as much as they need and no more.
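A minimal sketch of that kind of monitoring, assuming AWS, boto3, and CloudWatch metrics; the one-week window and the 20% threshold are illustrative assumptions.

```python
# Flag instances whose average CPU over the past week suggests they are oversized.
import datetime

import boto3

ec2 = boto3.client("ec2")
cw = boto3.client("cloudwatch")
now = datetime.datetime.now(datetime.timezone.utc)

for reservation in ec2.describe_instances()["Reservations"]:
    for inst in reservation["Instances"]:
        datapoints = cw.get_metric_statistics(
            Namespace="AWS/EC2",
            MetricName="CPUUtilization",
            Dimensions=[{"Name": "InstanceId", "Value": inst["InstanceId"]}],
            StartTime=now - datetime.timedelta(days=7),
            EndTime=now,
            Period=3600,
            Statistics=["Average"],
        )["Datapoints"]
        if datapoints:
            avg = sum(p["Average"] for p in datapoints) / len(datapoints)
            if avg < 20:  # illustrative threshold
                print(f"{inst['InstanceId']} ({inst['InstanceType']}): "
                      f"avg CPU {avg:.1f}% - candidate for downsizing")
```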
Choose cold storage
Some cloud providers include services for storing data that does not need fast access. AWS’s S3 Glacier and Scaleway’s cold storage, for instance, charge very low prices, but only if you accept retrieval latencies of several hours or more. It makes sense to carefully migrate cold data to these cheaper locations. In some cases, security could be another argument for choosing this option. Scaleway boasts of using a former nuclear fallout shelter to physically protect data.
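One low-effort way to do that migration on AWS is a lifecycle rule. Here is a minimal boto3 sketch; the bucket name, prefix, and transition windows are illustrative assumptions.

```python
# Move objects under a hypothetical "archive/" prefix to colder storage tiers.
import boto3

s3 = boto3.client("s3")
s3.put_bucket_lifecycle_configuration(
    Bucket="my-example-bucket",  # placeholder bucket name
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-cold-data",
                "Status": "Enabled",
                "Filter": {"Prefix": "archive/"},
                "Transitions": [
                    {"Days": 30, "StorageClass": "GLACIER"},
                    {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
                ],
            }
        ]
    },
)
```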
Choose cheaper providers
Some competitors offer dramatically lower prices for services like object storage. Wasabi claims to offer prices that are 80% less than the competition. Backblaze says its services cost one-fifth of what you might pay elsewhere. Those are big savings. These services also compete on access latency, offering faster “hot storage” response times. Of course, you’ll still have to wait for your queries to travel over the general internet instead of staying inside one data center, but the difference can still be significant. Affordable providers also sometimes offer competitive terms for data access. Some cut out the egress fees, which can be valuable for data that is downloaded frequently.
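Because most of these providers speak the S3 API, switching is often just a matter of pointing your client at a different endpoint. In this minimal sketch the endpoint URL, credentials, and bucket name are placeholders.

```python
# Upload to an S3-compatible provider by overriding the endpoint URL.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://s3.example-cheap-provider.com",  # provider-specific
    aws_access_key_id="YOUR_KEY",
    aws_secret_access_key="YOUR_SECRET",
)
s3.upload_file("backup.tar.gz", "my-backups", "2025/04/backup.tar.gz")
```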
Spot machines
Some cloud providers run auctions on spare machines, and the price tags can be temptingly low. Because tasks without firm deadlines can wait until the spot price drops, spare machines are great for background work like generating monthly reports. On the other hand, it’s important to know these spot instances may be shut down without much warning, so applications that run on them should be idempotent. It’s also worth noting that when demand is high, spot prices can soar. Just think of using them as a bit of a financial adventure.
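On AWS, requesting a Spot instance can be a single API call. In this minimal boto3 sketch the AMI ID, instance type, and the $0.05-per-hour price cap are illustrative placeholders.

```python
# Launch one instance on the spot market with a price ceiling.
import boto3

ec2 = boto3.client("ec2")
ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder AMI
    InstanceType="m5.large",
    MinCount=1,
    MaxCount=1,
    InstanceMarketOptions={
        "MarketType": "spot",
        "SpotOptions": {
            "MaxPrice": "0.05",         # never pay more than this per hour
            "SpotInstanceType": "one-time",
        },
    },
)
```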
Reserved instances
Cloud providers can offer significant discounts for organizations that make a long-term commitment to using hardware. These are sometimes called reserved instances, or usage-based discounts. They can be ideal when you know just how much you’ll need for the next few years. The downside is that the commitment locks in both sides of the deal. You can’t just shut down machines in slack times or when a project is canceled.
Be transparent
Engineers are pretty good at solving problems, especially numerical ones, and in the end, cloud cost is just another metric to optimize. Many teams leave the cloud costs to some devops pro who might have a monthly meeting with someone from finance. A better solution is to broadcast the spending data to everyone on the team. Let them drill down into the numbers and see just where the cash is going. A good dashboard that breaks down cloud costs may just spark an idea about where to cut back.
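The raw numbers for such a dashboard are usually one API call away. Here is a minimal sketch that pulls a month of spend grouped by service, assuming AWS Cost Explorer is enabled and boto3 is installed; the dates and the $1 reporting floor are illustrative.

```python
# Print last month's spend by service: the raw material for a team dashboard.
import boto3

ce = boto3.client("ce")
resp = ce.get_cost_and_usage(
    TimePeriod={"Start": "2025-03-01", "End": "2025-04-01"},
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}],
)
for group in resp["ResultsByTime"][0]["Groups"]:
    service = group["Keys"][0]
    amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
    if amount > 1:  # skip the noise
        print(f"{service:45s} ${amount:,.2f}")
```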
Go serverless
The cloud computing revolution has always been about centralizing resources and then making it easy for users to buy only as much as they need. The logical extreme of this is billing by each transaction. The poorly named “serverless” architecture is a good example of saving money by buying only what you need. A friend of mine brags that one of his side hustles costs him only 3 cents per month, but one day he hopes it will go viral and the bills will spiral into the tens or even hundreds of dollars. Businesses with skunkworks projects or proofs of concept love these options because they can keep computing costs quite low until the demand arrives.
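A pay-per-invocation function can be as small as this sketch, written in the AWS Lambda handler style; the event shape and the greeting are illustrative.

```python
# A minimal pay-per-invocation function: no idle servers, billed per request.
import json


def handler(event, context):
    """Echo back a greeting; billed only for the milliseconds it runs."""
    name = (event.get("queryStringParameters") or {}).get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```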
Store less data
Programmers like to keep data around in case they might ever need it again. That’s a good habit until your app starts scaling and it’s repeated a bazillion times. If you don’t call the user, do you really need to store their telephone number? Tossing personal data aside not only saves storage fees but limits the danger of releasing personally identifiable information. Stop keeping extra log files or backups of data that you’ll never use again.
Store data locally
Many modern browsers make it possible to store data on the user’s own machine, in anything from a simple key-value store to a basic version of a classic database. The WebStorage API handles simple key-value data, while IndexedDB stores structured records and indexes them, too. Both were intended to be smart local caches for building more sophisticated web applications that respond quickly without overloading the network connection. But they can also be used to save storage costs. If the user wants to save endless drafts, well, maybe they can pay for it themselves.
Move the work elsewhere
While many cloud providers charge the same no matter where you store your data, some vary the price tag based on location. AWS, for instance, charges $0.023 per gigabyte per month for S3 storage in Northern Virginia but $0.026 in Northern California. Alibaba recently cut its prices in offshore data centers much more than in onshore ones. Location matters quite a bit in these examples. Unfortunately, it may not be easy to take advantage of these savings for large blocks of data, because some cloud providers charge data transfer (egress) fees for moving data between regions. Still, it’s a good idea to shop around when setting up new programs.
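To make those fractions of a cent concrete: at the rates above, storing 100 TB (roughly 102,400 GB) of S3 data costs about $2,355 per month in Northern Virginia versus about $2,662 in Northern California, a gap of more than $300 a month, or roughly $3,700 a year, from location alone.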
Offload cold data
Cutting back on some services will save money, but the best way to save cash is to go cold turkey. There’s nothing stopping you from dumping your data into a hard disk on your desk or down the hall in a local data center. Hard disk prices can be just above $10 per terabyte for a new hard disk or below $7 for a used disk. And that’s not a monthly price or even an annual one; it’s for as long as the disk keeps spinning. Of course, you only get that price in return for taking on all the responsibility and the cost of electricity. It won’t make sense for your serious workloads, but the savings for not-so-important tasks like very cold backup data can be significant. You might also note some advantages in cases where compliance rules favor having physical control of the data.
Conquering the costs and complexity of cloud, Kubernetes, and AI 28 Apr 2025, 5:00 am
Platform engineering teams are at the forefront of enterprise innovation, leading initiatives in cloud computing, Kubernetes, and AI to drive efficiency for developers and data scientists. However, these teams face mounting challenges in managing costs and complexity across their expanding technological landscape. According to industry research conducted by my company, Rafay Systems, 93% of teams face hurdles in Kubernetes management, with cost visibility and complex cloud infrastructure cited as top challenges for organizations.
While IT leaders clearly see the value in platform teams—nine in 10 organizations have a defined platform engineering team—there’s a clear disconnect between recognizing their importance and enabling their success. This gap signals major stumbling blocks ahead that risk derailing platform team initiatives if not addressed early and strategically. For example, platform teams find themselves burdened by constant manual monitoring, limited visibility into expenses, and a lack of standardization across environments. These challenges are only amplified by the introduction of new and complex AI projects. There’s a pressing need for solutions that balance innovation with cost control so that platform teams can optimize resources efficiently without stunting modernization.
The problem with platform team tools
Let’s zoom out a bit. The root cause of platform teams’ struggles with Kubernetes cost visibility and control often traces back to their reliance on tools that are fundamentally misaligned with modern infrastructure requirements.
Legacy cost monitoring tools often fall short for several reasons:
- They lack the granular visibility needed for cost allocation across complex containerized environments.
- They weren’t designed for today’s multi-team, multi-cloud architectures, creating blind spots in resource tracking.
- Their limited visibility often results in budget overruns and inefficient resource allocation.
- They provide inadequate cost forecasting and budgeting.
Our research shows that almost a third of organizations underestimate their total cost of ownership for Kubernetes, and that a lack of proper visibility into costs is a major hurdle for organizations. Nearly half (44%) of organizations reported that “providing cost visibility” is a key organizational focus for addressing Kubernetes challenges in the next year. And while standardization is essential for effective cost management and successful overall operational efficiency, close to 40% of organizations report challenges in establishing and maintaining enterprise-wide standardization—a foundational element for both cost control and operational efficiency.
Platform teams that manually juggle cost monitoring across cloud, Kubernetes, and AI initiatives find themselves stretched thin and trapped in a tactical loop of managing complex multi-cluster Kubernetes environments. This prevents them from driving strategic initiatives that could actually transform their organizations’ capabilities.
These challenges reflect the overall complexity of modern cloud, Kubernetes, and AI environments. While platform teams are chartered with providing infrastructure and tools necessary to empower efficient development, many resort to short-term patchwork solutions without a cohesive strategy. This creates a cascade of unintended consequences: slowed adoption, reduced productivity, and complicated AI integration efforts.
The AI complexity multiplier
The integration of AI and generative AI workloads adds another layer of complexity to an already challenging landscape, as managing computational costs and the resources needed to train models introduces new hurdles. Nearly all organizations (95%) plan to increase Kubernetes usage in the next year while simultaneously doubling down on AI and genAI capabilities. Some 96% of organizations say it’s important to provide efficient methods for developing and deploying AI apps, and 94% say the same for generative AI apps. This threatens to overwhelm platform teams even more if they don’t have the right tools and strategies in place.
As a result, organizations increasingly seek capabilities for GPU virtualization and sharing across AI workloads to improve utilization and reduce costs. The ability to automatically allocate AI workloads to appropriate GPU resources based on cost and performance considerations has become essential for managing these advanced technologies effectively.
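To illustrate the idea (this is not any vendor’s scheduler, just a toy sketch with made-up pool names and prices): pick the cheapest GPU pool that satisfies a workload’s memory requirement, before layering on quotas, utilization data, and preemption.

```python
# A toy cost-aware GPU placement heuristic; pools and prices are hypothetical.
from dataclasses import dataclass


@dataclass
class GpuPool:
    name: str
    memory_gb: int
    dollars_per_hour: float


POOLS = [  # illustrative prices, not vendor quotes
    GpuPool("t4-shared", 16, 0.35),
    GpuPool("a10g", 24, 1.00),
    GpuPool("a100-80g", 80, 3.70),
]


def cheapest_pool(required_memory_gb: int) -> GpuPool:
    """Return the lowest-cost pool with enough GPU memory for the workload."""
    candidates = [p for p in POOLS if p.memory_gb >= required_memory_gb]
    if not candidates:
        raise ValueError("no pool large enough for this workload")
    return min(candidates, key=lambda p: p.dollars_per_hour)


print(cheapest_pool(20).name)  # -> "a10g"
```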
Prioritizing automation and self-service
Our research reveals a clear mandate: Organizations must fundamentally transform how they approach infrastructure management and become enablers of self-service capabilities. They are prioritizing proactive, automation-driven solutions such as automated cluster provisioning, standardized and automated infrastructure, and self-service experiences as top initiatives for developers.
Organizations are zeroing in on a range of cost management initiatives for platform teams over the next year, including:
- Reducing and optimizing costs associated with Kubernetes infrastructure,
- Visibility and showback into cloud and Kubernetes costs,
- Providing chargeback to internal groups (finops).
The push toward automation and self-service represents more than just a technical evolution—it’s a fundamental shift in how organizations approach infrastructure management. Self-service automation allows developers to move quickly while maintaining guardrails for resource usage and cost control. At the same time, standardized infrastructure and automated provisioning help ensure consistent deployment practices across increasingly complex environments. The result is a more sustainable approach to platform engineering that can scale with organizational needs while keeping costs in check.
By investing in automation and self-service capabilities now, organizations can position their platform teams to handle future challenges more effectively, whether they come from new technologies, changing business needs, or evolving infrastructure requirements.
Empowering platform teams
Platform team challenges—from Kubernetes and multi-cloud management to generative AI implementation—are significant, but not insurmountable. Organizations that successfully navigate this landscape understand that empowering platform teams requires more than just acknowledging their importance; it requires robust, versatile tools and processes that enable effective cost management and standardization. Platform teams need comprehensive solutions that balance innovation with cost control while optimizing resources efficiently, without impeding modernization efforts. Empowered platform teams will be the key differentiator between organizations that merely survive and those that excel as the landscape continues to evolve with new challenges in cloud, Kubernetes, and AI.
Haseeb Budhani is co-founder and CEO of Rafay Systems.
—
New Tech Forum provides a venue for technology leaders—including vendors and other outside contributors—to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to doug_dineley@foundryco.com.
Baidu hits the turbo button to get back into AI race 25 Apr 2025, 7:28 pm
An industry analyst Friday offered a lukewarm response to a series of announcements from Chinese tech giant Baidu around upgrades to its multimodal foundation model, ERNIE 4.5, and reasoning model, ERNIE X1, first released last month.
During his keynote at the firm’s annual developer conference in Wuhan, China, CEO Robin Li launched ERNIE 4.5 Turbo and ERNIE X1 Turbo, which, according to a release, feature “enhanced multimodal capabilities, strong reasoning, low costs and are available for users to access on Ernie Bot now free of charge.”
Li said, “the releases aim to empower developers to build the best applications — without having to worry about model capability costs, or development tools. Without practical applications, neither advanced chips nor sophisticated models hold value.”
At the launch of the new models’ predecessors last month, Baidu said in a release that the introduction of the two offerings “pushes the boundaries of multimodal and reasoning models,” adding that ERNIE X1 “delivers performance on par with DeepSeek R1 at only half the price.”
The firm said it plans to integrate both new models into its product ecosystem, and that the integration will include Baidu Search, China’s largest search engine, as well as other offerings.
According to a Reuters report, during his keynote Li also announced that Baidu had “successfully illuminated a cluster comprising [of] 30,000 of its self-developed, third generation P800 chips, which can support the training of DeepSeek-like models.”
Analysts unimpressed
Paul Smith-Goodson, vice president and principal analyst for quantum computing, AI and robotics at Moor Insights & Strategy, was unimpressed.
“[Baidu’s] announcement that the P800 Kunlun chip clusters were ‘illuminated’ only means they were turned on in preparation for training models with hundreds of billions of parameters,” he said. “While that is a technical advancement for China, it is the norm for companies such as OpenAI, Google, IBM, Anthropic, Microsoft, and Meta to train their models with hundreds of billions of parameters.”
Also, said Smith-Goodson, “Baidu’s statement that it used 30,000 Kunlun chips is nothing exceptional when compared to the number of GPUs the US uses to train large models. Kunlun chips are also inferior to US GPUs. In the next-gen AI we will be using something on the order of 100,000 GPUs. Because there is a lack of benchmarks, I have to be skeptical about the performance of this model compared to global leaders.”
Smith-Goodson pointed out, “it boils down to a race between China and the US to build the first Artificial General Intelligence (AGI) model. The US still holds a lead, but China is pressing hard to catch up.”
Thomas Randall, director of AI market research at Info-Tech Research Group, was also lukewarm about the announcements. Still, he pointed out, “Baidu remains an important part of China’s competitive AI sector, which includes companies like Alibaba, Tencent, and Huawei.”
Baidu’s ERNIE models, he said, “are one of the few domestically developed LLM series that compete with OpenAI/GPT-level models. The Kunlun chips and new cluster announcement reinforce that Baidu isn’t just doing models. Baidu has become a broad provider for hardware and applications, too.”
Strategically relevant but commercially limited
However, Randall said, Baidu “remains under immense pressure from emerging startups like DeepSeek, Moonshot AI, and the cloud giants like Alibaba. While still a heavyweight, Baidu is not unchallenged in China.”
He added that, across western countries, Baidu remains largely irrelevant because of the lack of trust in geopolitics, and the decoupling of the US and Chinese tech ecosystems. “[This] makes Western expansion near impossible. Moreover, in global AI model benchmarks, Baidu is mostly a secondary mention against the likes of OpenAI, Anthropic, Google, and Mistral.”
But overall, said Randall, “Baidu remains strategically relevant globally, but commercially limited across the West. The key takeaway for western AI companies is that innovation is not US-centric, but that only assists in pushing the AI race forward.”
Thesys introduces generative UI API for building AI apps 25 Apr 2025, 6:21 pm
AI software builder Thesys has introduced C1 by Thesys, which the company describes as a generative UI API that uses large language models (LLMs) to generate user interfaces on the fly.
Unveiled April 18, C1 by Thesys is available for general use. C1 lets developers turn LLM outputs into dynamic, intelligent interfaces in real time, Thesys said. In elaborating on the C1 technology, Thesys said enterprises now are racing to adopt AI, but building the front end for AI agents has remained a major hurdle. Teams spend months and significant resources designing and coding interfaces, only to deliver static, inconsistent, and often-disengaging user experiences, the company said.
Generative UI enables LLMs to generate interactive interfaces in real time, Thesys said. Generative UI interprets natural language prompts, generates contextually relevant UI components, and adapts dynamically based on user interaction or state changes. C1 can generate UI for any use case and any data, enables UI generations to be guided via system prompts, supports integration with external tools via function calling, and supports a wide variety of UI components via its Crayon React framework, according to Thesys.
More than 300 teams already are using Thesys tools to design and deploy adaptive AI interfaces, according to the company.
Cloud native explained: How to build scalable, resilient applications 25 Apr 2025, 9:48 am
What is cloud native? Cloud native defined
The term “cloud-native computing” describes the modern approach to building and running software applications that exploit the flexibility, scalability, and resilience of cloud computing. The phrase is a catch-all that encompasses not just the specific architecture choices and environments used to build applications for the public cloud, but also the software engineering techniques and philosophies used by cloud developers.
The Cloud Native Computing Foundation (CNCF) is an open source organization that hosts many important cloud-related projects and helps set the tone for the world of cloud development. The CNCF offers its own definition of cloud native:
Cloud native practices empower organizations to develop, build, and deploy workloads in computing environments (public, private, hybrid cloud) to meet their organizational needs at scale in a programmatic and repeatable manner. It is characterized by loosely coupled systems that interoperate in a manner that is secure, resilient, manageable, sustainable, and observable.
Cloud native technologies and architectures typically consist of some combination of containers, service meshes, multi-tenancy, microservices, immutable infrastructure, serverless, and declarative APIs — this list is not exhaustive.
This definition is a good start, but as cloud infrastructure becomes ubiquitous, the cloud-native world is beginning to spread beyond the core of this definition. We’ll explore that evolution as well, and look into the near future of cloud-native computing.
Cloud native architectural principles
Let’s start by exploring the pillars of cloud-native architecture. Many of these technologies and techniques were considered innovative and even revolutionary when they hit the market over the past few decades, but now have become widely accepted across the software development landscape.
Microservices. One of the huge cultural shifts that made cloud-native computing possible was the move from huge, monolithic applications to microservices: small, loosely coupled, and independently deployable components that work together to form a cloud-native application. These microservices can be scaled across cloud environments, though (as we’ll see in a moment) this makes systems more complex.
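As an illustration, a single microservice can be as small as this sketch, which uses Flask purely as an illustrative framework: one narrowly scoped API plus a health check an orchestrator can probe. The service name and its in-memory data are toy examples.

```python
# A minimal, independently deployable microservice with a health endpoint.
from flask import Flask, jsonify

app = Flask(__name__)

# A toy in-memory "inventory"; a real service would own its own datastore.
INVENTORY = {"widget": 12, "gadget": 3}


@app.route("/healthz")
def healthz():
    return jsonify(status="ok")


@app.route("/inventory/<item>")
def inventory(item: str):
    if item not in INVENTORY:
        return jsonify(error="unknown item"), 404
    return jsonify(item=item, count=INVENTORY[item])


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```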
Containers and orchestration. In cloud-native architectures, individual microservices are executed inside containers — lightweight, portable virtual execution environments that can run on a variety of servers and cloud platforms. Containers insulate developers from having to worry about the underlying machines on which their code will execute. That is, all they have to do is write to the container environment.
Getting the containers to run properly and communicate with one another is where the complexity of cloud native computing starts to emerge. Initially, containers were created and managed by relatively simple platforms, the most common of which was Docker. But as cloud-native applications got more complex, container orchestration platforms that augmented Docker’s functionality emerged, such as Kubernetes, which allows you to deploy and manage multi-container applications at scale. Kubernetes is critical to cloud native computing as we know it — it’s worth noting that the CNCF was set up as a spinoff of the Linux Foundation on the same day that Kubernetes 1.0 was announced — and adhering to Kubernetes best practices is an important key to cloud native success.
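To make that concrete, here is a minimal sketch using the official Kubernetes Python client, assuming a working kubeconfig; the container image and names are placeholders. It declares a three-replica Deployment and leaves it to Kubernetes to keep those copies running.

```python
# Declare a three-replica Deployment for a containerized service.
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() inside a cluster

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="inventory"),
    spec=client.V1DeploymentSpec(
        replicas=3,  # the orchestrator keeps three copies running
        selector=client.V1LabelSelector(match_labels={"app": "inventory"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "inventory"}),
            spec=client.V1PodSpec(containers=[
                client.V1Container(
                    name="inventory",
                    image="registry.example.com/inventory:1.0",  # placeholder
                    ports=[client.V1ContainerPort(container_port=8080)],
                )
            ]),
        ),
    ),
)

client.AppsV1Api().create_namespaced_deployment(
    namespace="default", body=deployment
)
```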
Open standards and APIs. The fact that containers and cloud platforms are largely defined by open standards and open source technologies is the secret sauce that makes all this modularity and orchestration possible, and standardized and documented APIs offer the means of communication between distributed components of a larger application. In theory, anyway, this standardization means that every component should be able to communicate with other components of an application without knowing about their inner workings, or about the inner workings of the various platform layers on which everything operates.
DevOps, agile methodologies, and infrastructure as code. Because cloud-native applications exist as a series of small, discrete units of functionality, cloud-native teams can build and update them using agile philosophies like DevOps, which promotes rapid, iterative CI/CD development. This enables teams to deliver business value more quickly and more reliably.
The virtualized nature of cloud environments also makes them great candidates for infrastructure as code (IaC), a practice in which teams use tools like Terraform, Pulumi, and AWS CloudFormation to manage infrastructure declaratively and version those declarations just like application code. IaC boosts automation, repeatability, and resilience across environments—all big advantages in the cloud world. IaC also goes hand-in-hand with the concept of immutable infrastructure—the idea that, once deployed, infrastructure-level entities like virtual machines, containers, or network appliances don’t change, which makes them easier to manage and secure. IaC stores declarative configuration code in version control, which creates an audit log of any changes.
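For a taste of what declarative IaC looks like in practice, here is a minimal Pulumi sketch in Python (Pulumi being one of the tools named above); the resource names and tags are illustrative assumptions. The program itself lives in version control, and running it through Pulumi converges the cloud to the declared state.

```python
# Declare the desired state; the IaC tool figures out how to reach it.
import pulumi
import pulumi_aws as aws

assets = aws.s3.Bucket(
    "app-assets",  # illustrative resource name
    tags={"environment": "dev", "managed-by": "pulumi"},
)

pulumi.export("assets_bucket", assets.id)
```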

There’s a lot to love about cloud-native architectures, but there are also several things to be wary of when considering it.
How the cloud-native stack is expanding
As cloud-native development becomes the norm, the cloud-native ecosystem is expanding; the CNCF maintains a graphical representation of what it calls the cloud native landscape that hammers home the expansive and bewildering variety of products, services, and open source projects that contribute to (and seek to profit from) cloud-native computing. And there are a number of areas where new and developing tools are complicating the picture sketched out by the pillars we discussed above.
An expanding Kubernetes ecosystem. Kubernetes is complex, and teams now rely on an entire ecosystem of projects to get the most out of it: Helm for packaging, ArgoCD for GitOps-style deployments, and Kustomize for configuration management. And just as Kubernetes augmented Docker for enterprise-scale deployments, Kubernetes itself has been augmented and expanded by service mesh offerings like Istio and Linkerd, which offer fine-grained traffic control and improved security.
Observability needs. The complex and distributed world of cloud-native computing requires in-depth observability to ensure that developers and admins have a handle on what’s happening with their applications. Cloud-native observability uses distributed tracing and aggregated logs to provide deep insight into performance and reliability. Tools like Prometheus, Grafana, Jaeger, and OpenTelemetry support comprehensive, real-time observability across the stack.
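Instrumenting a service for that kind of observability can start small. Here is a minimal sketch using the Prometheus Python client; the metric names and the simulated workload are illustrative.

```python
# Expose a request counter and latency histogram for Prometheus to scrape.
import random
import time

from prometheus_client import Counter, Histogram, start_http_server

REQUESTS = Counter("app_requests_total", "Total requests handled")
LATENCY = Histogram("app_request_seconds", "Request latency in seconds")


@LATENCY.time()
def handle_request() -> None:
    REQUESTS.inc()
    time.sleep(random.uniform(0.01, 0.1))  # stand-in for real work


if __name__ == "__main__":
    start_http_server(8000)  # metrics served at http://localhost:8000/metrics
    while True:
        handle_request()
```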
Serverless computing. Serverless computing, particularly in its function-as-a-service guise, offers to strip needed compute resources down to their bare minimum, with functions running on service provider clouds using exactly as much as they need and no more. Because these services can be exposed as endpoints via APIs, they are increasingly integrated into distributed applications, operating side-by-side with functionality provided by containerized microservices. Watch out, though: the big FaaS providers (Amazon, Microsoft, and Google) would love to lock you in to their ecosystems.
FinOps. Cloud computing was initially billed as a way to cut costs — no need to pay for an in-house data center that you barely use — but in practice it replaces capex with opex, and sometimes you can run up truly shocking cloud service bills if you aren’t careful. Serverless computing is one way to cut down on those costs, but financial operations, or FinOps, is a more systematic discipline that aims to align engineering, finance, and product to optimize cloud spending. FinOps best practices make use of those observability tools to determine which departments and applications are eating up resources.
Advantages and challenges for cloud-native development
Cloud native has become so ubiquitous that its advantages are almost taken for granted at this point, but it’s worth reflecting on the beneficial shift the cloud native paradigm represents. Huge, monolithic codebases that saw updates rolled out once every couple of years have been replaced by microservice-based applications that can be improved continuously. Cloud-based deployments, when managed correctly, make better use of compute resources and allow companies to offer their products as SaaS or PaaS services.
But cloud-native deployments come with a number of challenges, too:
- Complexity and operational overhead: You’ll have noticed by now that many of the cloud-native tools we’ve discussed, like service meshes and observability tools, are needed to deal with the complexity of cloud-native applications and environments. Individual microservices are deceptively simple, but coordinating them all in a distributed environment is a big lift.
- Security: More services executing on more machines, communicating by open APIs, all adds up to a bigger attack surface for hackers. Containers and APIs each have their own special security needs, and a policy engine can be an important tool for imposing a security baseline on a sprawling cloud-native app. DevSecOps, which adds security to DevOps, has become an important cloud-native development practice to try to close these gaps.
- Vendor lock-in: This may come as a surprise, since cloud-native is based on open standards and open source. But there are differences in how the big cloud and serverless providers work, and once you’ve written code with one provider in mind, it can be hard to migrate elsewhere.
- A persistent skills gap: Cloud-native computing and development may have years under its belt at this point, but the number of developers who are truly skilled in this arena is a smaller portion of the workforce than you’d think. Companies face difficult choices in bridging this skills gap, whether that’s bidding up salaries, working to upskill current workers, or allowing remote work so they can cast a wide net.
Cloud native in the real world
Cloud native computing is often associated with giants like Netflix, Spotify, Uber, and Airbnb, where many of its technologies were pioneered in the early ’10s. But the CNCF’s Case Studies page provides an in-depth look at how cloud native technologies are helping companies of all kinds. Examples include the following:
- A UK-based payment technology company that can switch between data centers and clouds with zero downtime
- A software company whose product collects and analyzes data from IoT devices — and can scale up as the number of gadgets grows
- A Czech web service company that managed to improve performance while reducing costs by migrating to the cloud
Cloud-native infrastructure’s capability to quickly scale up to large workloads also makes it an attractive platform for developing AI/ML applications: another one of those CNCF case studies looks at how IBM uses Kubernetes to train its Watsonx assistant. The big three providers are putting a lot of effort into pitching their platforms as the place for you to develop your own generative AI tools, with offerings like Azure AI Foundry, Google Firebase Studio, and Amazon Bedrock. It seems clear that cloud native technology is ready for what comes next.
Learn more about related cloud-native technologies:
- Platform-as-a-service (PaaS) explained
- What is cloud computing
- Multicloud explained
- Agile methodology explained
- Agile development best practices
- Devops explained
- Devops best practices
- Microservices explained
- Microservices tutorial
- Docker and Linux containers explained
- Kubernetes tutorial
- CI/CD (continuous integration and continuous delivery) explained
- CI/CD best practices
Docker’s new MCP Catalog, Toolkit to solve major developer challenges, experts say 25 Apr 2025, 6:24 am
Docker, provider of containers for application development, is planning to add a new Model Context Protocol (MCP) Catalog and a Toolkit that experts say could solve major challenges faced by developers when building out agentic applications.
Anthropic’s MCP, which was released in November last year, is an open protocol that allows AI agents inside applications to access external tools and data to complete a user request using a client-server mechanism, where the client is the AI agent and the server provides tools and data.
Agentic applications, which can perform tasks without manual intervention, have caught the fancy of enterprises as they allow them to do more with constrained resources.
Without MCP, developers face a major challenge: they have no standard way to connect disparate data sources and tools to large language models (LLMs), and without those connections agents cannot perform tasks on their own.
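For readers unfamiliar with the server side of that client-server mechanism, here is a minimal sketch assuming the official MCP Python SDK (the mcp package) and its FastMCP helper; the tool itself is a toy example, not a Docker or Anthropic sample.

```python
# A toy MCP server exposing one tool an AI agent could call.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("inventory-tools")


@mcp.tool()
def check_stock(item: str) -> str:
    """Report stock for an item so an agent can answer inventory questions."""
    fake_inventory = {"widget": 12, "gadget": 3}  # stand-in for a real system
    count = fake_inventory.get(item)
    return f"{item}: {count} in stock" if count is not None else f"{item}: not found"


if __name__ == "__main__":
    mcp.run()  # serves the tool to MCP clients (stdio by default)
```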
Docker, whose platform is used by at least 20 million developers to build their applications, is adding the Catalog and Toolkit because, it says, despite MCP’s popularity the experience is “not production-ready — yet.”
“Discovery (of tools) is fragmented, trust is manual, and core capabilities like security and authentication are still patched together with workarounds,” Docker executives wrote in a blog post.
Paul Chada, co-founder of DoozerAI, an agentic digital worker platform, said that presently, MCP servers are messy client-side installs and not true enterprise-grade solutions, meaning they run directly on users’ PCs and potentially expose credentials.
Catalog and Toolkit are expected to solve challenges related to tool discovery, credential management, and security, with the Catalog serving as the home for discovering MCP tools and Toolkit simplifying the process of running and managing MCP servers securely, the executives explained.
MCP Catalog to aid tools discovery
The Catalog, according to Docker, is essentially a marketplace where authors or builders of these tools can publish them for developers to discover.
To get the marketplace running, Docker has partnered with several companies, including Stripe, Elastic, Heroku, Pulumi, Grafana Labs, Kong Inc., Neo4j, New Relic, and Continue.dev, the executives said. At launch, the marketplace would contain over 100 verified tools.
Chada sees MCP Catalog serving as an accredited, secure hub and marketplace for MCP servers, which can then be subsequently deployed into Docker containers using the new MCP Toolkit.
“The Catalog will help developers find trusted or verified tools, which reduces the risk of security breaches,” Chada explained.
In the same vein, Moor Insights and Strategy’s principal analyst Jason Andersen said that Docker is probably the first to try and build a centralized place to discover tools related to MCP, which is non-existent presently as the protocol is very new.
MCP Toolkit for simple management of MCP servers
Docker’s MCP Toolkit, according to Chada, solves key developer challenges, such as environment conflicts, security vulnerabilities from host access, complex setup requirements, and cross-platform inconsistencies, through a handful of new features.
“To bypass developer challenges, the Toolkit offers a one-click deployment from Docker Desktop, built-in credential management, containerized isolation, a Gateway Server, and a dedicated command line interface (CLI),” Chada said.
“By containerizing MCP servers, Docker creates a standardized, secure environment where developers can focus on building AI applications rather than wrestling with configuration and security issues,” Chada added.
Both Catalog and Toolkit are expected to be made available in May, the company said. However, it is yet to finalize the pricing of both offerings.
Both Chada and Andersen believe that Docker’s rivals, such as Kubernetes, are also expected to add similar capabilities soon. Andersen further believes that most cloud service providers will also start offering something similar to the Catalog and Toolkit as MCP’s popularity grows.
Hype versus execution in agentic AI 25 Apr 2025, 5:00 am
Agentic AI has captured the imagination of enterprises everywhere. It promises autonomous systems capable of reasoning, making decisions, and dynamically adapting to changing conditions. The allure lies in machines operating independently, free of human intervention, streamlining processes and enhancing efficiency at unprecedented scales. No wonder it’s billed as the next big thing.
It’s a tempting vision, reinforced by marketing headlines and ambitious vendor pitches. Global AI investments surged past $90 billion in 2022, with a significant slice aimed specifically at technologies like agentic AI. But if you step back from the narrative, a troubling truth emerges: Agentic AI in the cloud is focused more on glossy presentations than on enterprise realities.
Execution falls short
Agentic AI remains more conceptual than practical. For all its potential, the technology has failed to demonstrate widespread adoption or scalability in enterprise contexts. We hear a lot about self-directed systems transforming industries, but evidence of meaningful deployment is painfully scarce.
Deloitte’s recent AI survey found that only 4% of enterprises pursuing AI are actively piloting or implementing agentic AI systems. The vast majority remain trapped in cautious experimentation. This gap isn’t surprising given the challenges involved. Agentic AI requires advanced reasoning, contextual understanding, and the ability to learn and adapt autonomously in complex, unstructured environments. This level of sophistication is still aspirational for most organizations.
Furthermore, infrastructure and cost hurdles are daunting. A recent Gartner report revealed that rolling out agentic AI projects often costs two to five times more than traditional machine learning initiatives. These systems demand extensive training data, advanced processing power, and robust integration with existing workflows—investments not all enterprises are prepared to make.
Where the disconnect lies
Agentic AI adoption often stumbles for two key reasons: technological immaturity and overblown expectations. The technology promises autonomous decision-making, but it struggles to handle edge cases, unpredictable variables, and the nuances of human decision-making contexts in practical scenarios. I’ve seen this firsthand.
Consider self-driving vehicles, touted for years as a flagship example of agentic AI. Although companies like Tesla and Waymo have made progress, full autonomy remains a distant goal fraught with technical setbacks. Enterprises pursuing agentic AI quickly encounter similar pitfalls where the systems falter in dynamic, real-world scenarios that require judgment and adaptability.
These examples highlight the widening gap between marketing rhetoric and implementation capabilities. The hype promises revolutionary change, yet real progress is slow and incremental.
Reassess your approach
Hype-driven initiatives rarely end well. Enterprises that invest in agentic AI without a clear road map for value creation risk wasting time, money, and resources. Instead of chasing the flashiest new technology, organizations should concentrate on their specific needs and measurable outcomes. Large-scale agentic AI solutions may not provide the answer. Many organizations could achieve a better ROI by implementing simpler AI tools, such as recommendation systems or predictive analytics that integrate seamlessly into existing workflows.
The path to meaningful AI adoption starts with clarity. Before scaling, enterprises should prioritize pilot programs and test agentic AI in controlled environments. These tests should be accompanied by key performance indicators that track measurable performance, such as cost savings and improvements in process efficiency.
Additionally, infrastructure readiness is crucial. Agentic AI typically requires robust data sets, seamless integration, and a commitment to addressing ethical concerns such as bias and accountability. Without these elements, projects are likely to fail.
Enterprises also need to hold vendors accountable. Too much of today’s agentic AI marketing lacks transparency and makes bold claims without providing adequate proof points or benchmarks. Ask questions. Get objective answers. Businesses must demand deeper insights into scalability, deployment timelines, and technical limitations to make informed decisions.
Managing hype versus value
Agentic AI has undeniable potential, but its current state is overhyped and underdelivered. Enterprises rushing to adopt these technologies risk falling into expensive traps, seduced by promises of autonomy without understanding the underlying complexities.
Organizations can avoid the pitfalls of hype-driven adoption by focusing on immediate business needs, prioritizing incremental AI solutions, and demanding transparency from vendors. This should not be a race to be the first to adopt agentic AI—it should be about adopting it in the smartest ways possible. The best path forward for the vast majority of enterprises is to wait for the technology to mature while pursuing today’s more pragmatic AI initiatives.
Ultimately, AI success in enterprises isn’t about chasing the headlines; it’s about creating real, measurable value. By staying grounded in practical realities, businesses will position themselves for sustainable growth today and in the future when agentic AI finally fulfills its potential.
Python and WebAssembly? Here’s how to make it work 25 Apr 2025, 5:00 am
Top picks for Python readers on InfoWorld
6 languages you can deploy to WebAssembly right now
Learn how to deploy Python and five other languages to run on Wasm, along with the advantages and disadvantages of each language choice.
Airgapped Python: Setting up Python without a network
Who here has unreliable networks? Maybe your admins have blocked too many sites, or you’re preparing for another 10-hour flight without Wi-Fi. Whatever the issue is, here’s a step-by-step guide to help.
Life without Python’s ‘dead batteries’
Python’s a “batteries included” language, but some of those batteries have been dead for a while. Python 3.13 ripped ‘em out and sent ’em sailing. But what to do about the ones you needed? Here’s how to safely replace packages like smtpd, cgi, msilib, and more.
Django 5.2 release touts automatic model importing—and phases out earlier 5.x editions
The newest Django does more than add features you’ll want to use; it also pushes Django 5.1 out of mainstream support and leaves Python 3.9 and earlier behind.
More good reads and Python updates elsewhere
What’s new in the 2025.1 edition of PyCharm
Could this be the one PyCharm edition to rule them all? AI-powered code generation, support for Python’s Hatch project manager tool, better Jupyter notebooks, and tons more. (Start with the Pro edition and continue with free-tier features.)
Build Python projects to standalone directories with py-app-standalone
Imagine an alternative to PyInstaller that uses uv to deploy Python apps as redistributables without the headaches. py-app-standalone is still considered experimental, but it’s worth a look.
AutoKitteh: Workflow automation and orchestration
Billed as “a developer-first alternative to no-code/low-code platforms,” AutoKitteh lets you write workflows and automations in Python, then run them with your own self-hosted AutoKitteh server or the in-beta cloud service.
Slightly off topic: Beej’s Guide to Git
The creator of one of the best guides ever written to the C language—and network programming, and POSIX interprocess communication—now has an equally great guide to Git.