<?xml version="1.0" encoding="UTF-8"?><rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/"><channel><title>Cloud – IMG.LY Blog</title><description>Posts tagged Cloud on the IMG.LY blog.</description><link>https://img.ly/blog/tag/cloud/</link><language>en-us</language><image><url>https://img.ly/apple-touch-icon.png</url><title>Cloud – IMG.LY Blog</title><link>https://img.ly/blog/tag/cloud/</link></image><atom:link href="https://img.ly/blog/tag/cloud/rss.xml" rel="self" type="application/rss+xml"/><generator>Astro</generator><lastBuildDate>Tue, 09 Jun 2026 13:48:12 GMT</lastBuildDate><ttl>60</ttl><item><title>How to Load Stripe Data into Google BigQuery</title><link>https://img.ly/blog/how-to-load-stripe-data-into-google-bigquery/</link><guid isPermaLink="true">https://img.ly/blog/how-to-load-stripe-data-into-google-bigquery/</guid><description>Discover how IMG.LY leverages Stripe&apos;s Data Pipeline to seamlessly transfer data into Google BigQuery using Google Cloud Functions.</description><pubDate>Thu, 18 Jul 2024 10:04:26 GMT</pubDate><content:encoded>&lt;p&gt;At IMG.LY, we recognize that leveraging data is essential for driving innovation and growth. To optimize our data for reporting, we consolidate multiple data sources, including Stripe billing and financial data, into Google BigQuery.&lt;/p&gt;
&lt;p&gt;IMG.LY is the leading provider of creative editing SDKs for &lt;a href=&quot;https://img.ly/products/video-sdk/?utm_source=imgly&amp;#x26;utm_medium=blog&amp;#x26;utm_campaign=stripebigquery&quot;&gt;video&lt;/a&gt;, &lt;a href=&quot;https://img.ly/products/photo-sdk/?utm_source=imgly&amp;#x26;utm_medium=blog&amp;#x26;utm_campaign=stripebigquery&quot;&gt;photo&lt;/a&gt;, and &lt;a href=&quot;https://img.ly/products/creative-sdk/?utm_source=imgly&amp;#x26;utm_medium=blog&amp;#x26;utm_campaign=stripebigquery&quot;&gt;design templates&lt;/a&gt;. While this article may not directly relate to media creation, we believe in empowering developers through knowledge sharing. Let’s dive in.&lt;/p&gt;
&lt;p&gt;Until now, we’ve relied on &lt;a href=&quot;https://www.fivetran.com&quot;&gt;Fivetran&lt;/a&gt; to fetch our data from Stripe and store it in Google BigQuery. Fivetran uses Stripe’s API, calling each endpoint, iterating over all resources, and storing the results in BigQuery (or any other supported data warehouse). While this generally works well, issues can arise. For instance, we sometimes create Stripe Subscriptions using inline pricing with &lt;a href=&quot;https://docs.stripe.com/api/subscription_items/create#create_subscription_item-price_data&quot;&gt;the &lt;code&gt;price_data&lt;/code&gt; parameter&lt;/a&gt;. This generates a new &lt;code&gt;Price&lt;/code&gt; object in Stripe on-the-fly and immediately sets it to &lt;code&gt;active: false&lt;/code&gt;. Consequently, the &lt;code&gt;Price&lt;/code&gt; object is not returned by Stripe API’s price endpoint, leading to missing data in our warehouse. Although Fivetran’s support was exceptional in resolving this issue within a day, it highlighted a potential flaw in relying solely on ETL services for data extraction.&lt;/p&gt;
&lt;p&gt;Recently, Stripe introduced &lt;a href=&quot;https://stripe.com/data-pipeline&quot;&gt;Data Pipeline&lt;/a&gt;, its own service for transferring Stripe data into a data warehouse. This ensures complete, reliable data without needing a third-party service to read Stripe’s API. Additionally, you can receive test environment data and access several tables not available via the API. For a comprehensive summary of the available data, &lt;a href=&quot;https://dashboard.stripe.com/stripe-schema&quot;&gt;refer to Stripe’s official data schema&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Currently, Stripe supports only Snowflake and Amazon Redshift as data warehouses. However, they’ve recently added the option to &lt;a href=&quot;https://docs.stripe.com/stripe-data/access-data-in-warehouse/cloud-storage/google-cloud-storage&quot;&gt;deliver data as Parquet files into Google Cloud Storage (GCS)&lt;/a&gt;. The next step for us was to import this data into Google BigQuery.&lt;/p&gt;
&lt;h2 id=&quot;setting-up-stripe-data-pipeline-with-google-cloud-storage&quot;&gt;Setting Up Stripe Data Pipeline with Google Cloud Storage&lt;/h2&gt;
&lt;p&gt;Stripe is renowned for its excellent developer experience, and this beta feature is no exception. Enabling it within the Stripe Dashboard is quick, and the &lt;a href=&quot;https://docs.stripe.com/stripe-data/access-data-in-warehouse/cloud-storage/google-cloud-storage&quot;&gt;documentation&lt;/a&gt; is straightforward. After following the instructions and enabling the feature, it takes a while for data to appear in GCS. Once available, a complete data dump is provided every 6 hours, structured as follows:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;At the root level, Stripe creates a folder representing the date and time of the latest transfer, e.g., &lt;code&gt;2024071600&lt;/code&gt; (&lt;code&gt;YYYYMMDDHH&lt;/code&gt;), representing the 12 am push on July 16, 2024.&lt;/li&gt;
&lt;li&gt;One level deeper, there are two folders: &lt;code&gt;livemode&lt;/code&gt; and &lt;code&gt;testmode&lt;/code&gt;, representing live and test data, respectively.&lt;/li&gt;
&lt;li&gt;Each folder contains one folder per data table, e.g., &lt;code&gt;subscriptions&lt;/code&gt; or &lt;code&gt;invoices&lt;/code&gt;. Additionally, a &lt;code&gt;coreapi_SUCCESS&lt;/code&gt; file indicates successful data transfer to your GCS bucket and readiness for consumption.&lt;/li&gt;
&lt;li&gt;Within the table folders are several Parquet files containing the actual data for each table.&lt;/li&gt;
&lt;/ul&gt;
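&lt;p&gt;The layout above can be captured in a small path parser. This is a minimal sketch, not Stripe’s own tooling: the helper name &lt;code&gt;parseStripeObjectName&lt;/code&gt;, the &lt;code&gt;.parquet&lt;/code&gt; file extension, and reading the folder name as UTC are assumptions made for illustration.&lt;/p&gt;

```javascript
// Hypothetical helper: split a GCS object name written by Stripe Data Pipeline
// into its parts, following the folder layout described in the list above.
function parseStripeObjectName(objectName) {
  const parts = objectName.split('/');
  const folder = parts[0];
  const mode = parts[1];

  // The root folder is YYYYMMDDHH, e.g. 2024071600.
  if (!/^\d{10}$/.test(folder)) return null;
  if (!['livemode', 'testmode'].includes(mode)) return null;

  // Assumption: the folder name is a UTC timestamp.
  const transferTime = new Date(Date.UTC(
    Number(folder.slice(0, 4)),
    Number(folder.slice(4, 6)) - 1, // JavaScript months are zero-based
    Number(folder.slice(6, 8)),
    Number(folder.slice(8, 10))
  ));

  // The success marker sits directly inside the mode folder.
  if (parts.length === 3) {
    if (parts[2] !== 'coreapi_SUCCESS') return null;
    return { folder, transferTime, mode, kind: 'success-marker' };
  }

  // Parquet files sit one level deeper, in a folder named after the table.
  if (parts.length === 4) {
    if (!parts[3].endsWith('.parquet')) return null;
    return { folder, transferTime, mode, kind: 'data', table: parts[2], file: parts[3] };
  }

  return null;
}
```

&lt;p&gt;Anything that does not match the expected layout yields &lt;code&gt;null&lt;/code&gt;, so unrelated objects in the bucket can be skipped.&lt;/p&gt;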
&lt;h2 id=&quot;loading-the-data-from-google-cloud-storage-into-google-bigquery&quot;&gt;Loading the Data from Google Cloud Storage into Google BigQuery&lt;/h2&gt;
&lt;p&gt;There are multiple ways to transfer data from GCS to BigQuery. We opted for the following approach:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Using Google Cloud Scheduler to publish a message to Google Pub/Sub every 6 hours at 1 am, 7 am, 1 pm, and 7 pm.&lt;/li&gt;
&lt;li&gt;Creating a Google Cloud Function that listens for new messages on the above Pub/Sub topic. When a message is received, it triggers a Node.js script that loads the most recent data from GCS into BigQuery and deletes it from GCS.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Let’s delve into the details.&lt;/p&gt;
&lt;h3 id=&quot;create-a-google-cloud-scheduler-job&quot;&gt;Create a Google Cloud Scheduler Job&lt;/h3&gt;
&lt;p&gt;First, create a new Cloud Scheduler job &lt;a href=&quot;https://console.cloud.google.com/cloudscheduler/jobs/new&quot;&gt;here&lt;/a&gt; with the following configuration:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Name&lt;/strong&gt;: Choose a name for this job.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Region&lt;/strong&gt;: The region is not crucial for this task; we used &lt;code&gt;europe-west3&lt;/code&gt; since most of our services are in Germany.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Frequency&lt;/strong&gt;: We want the job to run every 6 hours at 1 am, 7 am, 1 pm, and 7 pm. Stripe publishes data every 6 hours, but it takes time to transfer it to GCS. We chose 1 hour later than Stripe’s push time, so our value is &lt;code&gt;0 1,7,13,19 * * *&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Timezone&lt;/strong&gt;: Choose ‘Coordinated Universal Time (UTC)’.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Target type&lt;/strong&gt;: Choose ‘Pub/Sub’.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Select a Cloud Pub/Sub topic&lt;/strong&gt;: Select or create a new Pub/Sub topic using the default configuration. This is used to trigger the Cloud Function.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Message body&lt;/strong&gt;: The function never reads the contents of the message, so this value doesn’t matter. We opted for a simple &lt;code&gt;load&lt;/code&gt; string.&lt;/li&gt;
&lt;/ul&gt;
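&lt;p&gt;The frequency value is simply Stripe’s push hours shifted by a safety margin. As a minimal sketch, the hypothetical helper &lt;code&gt;cronForDelayedPushes&lt;/code&gt; below derives the expression; the push hours 0, 6, 12, and 18 UTC are inferred from the schedule described above and are an assumption, not something taken from Stripe’s documentation.&lt;/p&gt;

```javascript
// Hypothetical helper: build a unix-cron expression that fires a fixed number
// of hours after each assumed Stripe push hour.
function cronForDelayedPushes(pushHoursUtc, delayHours) {
  const hours = pushHoursUtc
    .map((hour) => (hour + delayHours) % 24) // wrap around midnight
    .sort((a, b) => a - b);
  // Fields: minute, hour, day of month, month, day of week.
  return `0 ${hours.join(',')} * * *`;
}

// Assumed push hours plus the one-hour margin used in this post.
console.log(cronForDelayedPushes([0, 6, 12, 18], 1)); // prints "0 1,7,13,19 * * *"
```

&lt;p&gt;If the data regularly arrives later than expected, increasing the delay is the only change needed.&lt;/p&gt;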
&lt;p&gt;&lt;img alt=&quot;&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; sizes=&quot;(min-width: 566px) 566px, 100vw&quot; data-astro-image=&quot;constrained&quot; data-astro-image-pos=&quot;center&quot; width=&quot;566&quot; height=&quot;1240&quot; src=&quot;https://img.ly/_astro/Screenshot-2024-07-16-at-11.46.22_1dcdSQ.webp&quot; srcset=&quot;/_astro/Screenshot-2024-07-16-at-11.46.22_1dcdSQ.webp 566w&quot;&gt;&lt;/p&gt;
&lt;p&gt;Finally, click ‘Create’ to set up the scheduler. Now, a message is published to the selected Pub/Sub topic every 6 hours. Next, we need to respond to this message.&lt;/p&gt;
&lt;h3 id=&quot;create-a-google-cloud-function&quot;&gt;Create a Google Cloud Function&lt;/h3&gt;
&lt;p&gt;Create a Google Cloud Function triggered by Pub/Sub &lt;a href=&quot;https://console.cloud.google.com/functions/add&quot;&gt;here&lt;/a&gt; with the following configuration:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Environment&lt;/strong&gt;: Choose ‘2nd gen’.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Function name&lt;/strong&gt;: Choose a name for this Cloud Function.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Region&lt;/strong&gt;: Select the region for the function; we used &lt;code&gt;europe-west3&lt;/code&gt;, as for most of our services.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Trigger type&lt;/strong&gt;: Choose ‘Cloud Pub/Sub’.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Cloud Pub/Sub topic&lt;/strong&gt;: Select the Pub/Sub topic created in the previous step.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Adjust the ‘Runtime, build, connections and security settings’ based on your Cloud setup and the required processing power for Stripe data. Generally, the following settings work well:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Memory allocated&lt;/strong&gt;: ‘512 MiB’&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;CPU&lt;/strong&gt;: ‘1’&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Timeout&lt;/strong&gt;: ‘540’ (seconds)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Minimum number of instances&lt;/strong&gt;: ‘0’ (to ensure the function shuts down when not in use)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Maximum number of instances&lt;/strong&gt;: ‘1’&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Service account&lt;/strong&gt;: Use or create a service account with permissions to access the GCS bucket where Stripe data is stored and the BigQuery datasets to load the data.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Ingress settings&lt;/strong&gt;: Choose ‘Allow internal traffic only’.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img alt=&quot;&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; sizes=&quot;(min-width: 554px) 554px, 100vw&quot; data-astro-image=&quot;constrained&quot; data-astro-image-pos=&quot;center&quot; width=&quot;554&quot; height=&quot;1286&quot; src=&quot;https://img.ly/_astro/Screenshot-2024-07-16-at-11.56.55_1Iq5S7.webp&quot; srcset=&quot;/_astro/Screenshot-2024-07-16-at-11.56.55_1Iq5S7.webp 554w&quot;&gt;&lt;/p&gt;
&lt;p&gt;Click ‘Next’ to provide the function’s code. Select:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Runtime&lt;/strong&gt;: ‘Node.js 20’&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source code&lt;/strong&gt;: ‘Inline Editor’&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Entry point&lt;/strong&gt;: &lt;code&gt;loadStripeData&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
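&lt;p&gt;In the &lt;code&gt;package.json&lt;/code&gt;, add the BigQuery and Cloud Storage Node.js packages (&lt;code&gt;@google-cloud/bigquery&lt;/code&gt; and &lt;code&gt;@google-cloud/storage&lt;/code&gt;) as dependencies.&lt;/p&gt;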

&lt;p&gt;In the &lt;code&gt;index.js&lt;/code&gt;, add the following code:&lt;/p&gt;

&lt;p&gt;This script does the following:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;For each environment (live and test), it searches for the latest folder containing a &lt;code&gt;coreapi_SUCCESS&lt;/code&gt; file.&lt;/li&gt;
&lt;li&gt;For each table, it groups all related Parquet files and loads them into the BigQuery table using &lt;code&gt;WRITE_TRUNCATE&lt;/code&gt;, which overwrites existing data. Note that the location is specified as &lt;code&gt;EU&lt;/code&gt;, matching our BigQuery dataset and GCS bucket location. Adjust this parameter if your data is elsewhere.&lt;/li&gt;
&lt;li&gt;If all files for an environment are loaded without errors, the files are deleted from GCS. This step is optional; if you prefer to keep a backup, you can omit this part.&lt;/li&gt;
&lt;/ol&gt;
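&lt;p&gt;The three steps above can be sketched as follows. This is a minimal sketch under stated assumptions, not the original listing: the function name &lt;code&gt;loadEnvironment&lt;/code&gt; and its parameters are hypothetical, &lt;code&gt;bucket&lt;/code&gt; is assumed to be a bucket object from &lt;code&gt;@google-cloud/storage&lt;/code&gt;, and &lt;code&gt;dataset&lt;/code&gt; a dataset object from &lt;code&gt;@google-cloud/bigquery&lt;/code&gt;. Passing both in keeps the logic testable without cloud access.&lt;/p&gt;

```javascript
// Sketch of the three steps for one environment ('livemode' or 'testmode').
// Assumptions: `bucket` behaves like new Storage().bucket(name) and
// `dataset` behaves like new BigQuery().dataset(id).
async function loadEnvironment(bucket, dataset, mode, location) {
  const [files] = await bucket.getFiles();
  const names = files.map((file) => file.name);

  // Step 1: latest transfer folder that contains a coreapi_SUCCESS marker.
  // Folder names are fixed-width (YYYYMMDDHH), so a string sort is enough.
  const marker = `/${mode}/coreapi_SUCCESS`;
  const latest = names
    .filter((name) => name.endsWith(marker))
    .map((name) => name.split('/')[0])
    .sort()
    .pop();
  if (latest === undefined) return [];

  // Step 2: group the Parquet files of that folder by table name.
  const prefix = `${latest}/${mode}/`;
  const byTable = new Map();
  for (const file of files) {
    if (!file.name.startsWith(prefix)) continue;
    const rest = file.name.slice(prefix.length).split('/');
    if (rest.length !== 2) continue; // skips the success marker itself
    if (!byTable.has(rest[0])) byTable.set(rest[0], []);
    byTable.get(rest[0]).push(file);
  }

  // Load each group, replacing whatever the table held before.
  for (const [table, tableFiles] of byTable) {
    const [job] = await dataset.table(table).load(tableFiles, {
      sourceFormat: 'PARQUET',
      writeDisposition: 'WRITE_TRUNCATE',
      location,
    });
    const errors = (job.status || {}).errors || [];
    if (errors.length !== 0) throw new Error(`Loading ${table} failed`);
  }

  // Step 3 (optional): delete the loaded folder once every load succeeded.
  const loaded = files.filter((file) => file.name.startsWith(prefix));
  await Promise.all(loaded.map((file) => file.delete()));
  return [...byTable.keys()];
}
```

&lt;p&gt;Inside the Cloud Function, this would be called once per environment, for example with &lt;code&gt;livemode&lt;/code&gt; and the live dataset, then with &lt;code&gt;testmode&lt;/code&gt; and the test dataset, passing &lt;code&gt;EU&lt;/code&gt; (or your own location) as the last argument.&lt;/p&gt;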
&lt;p&gt;Click ‘Deploy’ to deploy your Cloud Function.&lt;/p&gt;
&lt;h3 id=&quot;create-bigquery-datasets&quot;&gt;Create BigQuery Datasets&lt;/h3&gt;
&lt;p&gt;The final step is to create two datasets in Google BigQuery. Open &lt;a href=&quot;https://console.cloud.google.com/bigquery&quot;&gt;Google BigQuery&lt;/a&gt;, click on the three dots next to your project’s name, and select ‘Create dataset’. Enter a name and choose a location matching your GCS bucket’s location. Repeat this process for the test dataset.&lt;/p&gt;
&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Following these steps will ensure your Stripe data is imported into BigQuery and automatically updated every 6 hours. However, as Data Pipeline for GCS is still in beta, there are some limitations. For example, the schema of the Parquet files lacks type annotations for timestamps, so all timestamps in BigQuery are represented as &lt;code&gt;INTEGER&lt;/code&gt; instead of &lt;code&gt;TIMESTAMP&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Additionally, some tables, such as &lt;code&gt;subscription_item_change_events&lt;/code&gt;, are not currently transferred when syncing with Google Cloud Storage, although this issue is expected to be resolved soon. Meanwhile, we continue to use Fivetran in conjunction with the above method to sync Stripe data to Google BigQuery and plan to fully migrate to Data Pipeline once it exits the beta phase.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Thank you for reading!&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;3,000+ creative professionals gain exclusive access and hear of our releases first—&lt;a href=&quot;https://share.hsforms.com/1IgAOV1wASXGPnFG4ZPLejg1hk3i&quot;&gt;subscribe&lt;/a&gt; to our newsletter and never miss out.&lt;/strong&gt;&lt;/p&gt;</content:encoded><dc:creator>Sascha</dc:creator><media:content url="https://blog.img.ly/2024/07/stripe-bigquery-how-to.jpg" medium="image"/><category>How-To</category><category>Business Intelligence</category><category>Data</category><category>Cloud</category><category>Insights</category></item><item><title>Cutting Through The Jungle: An In-depth Review of Cloud GPU Providers to Train Your AI Models in 2024</title><link>https://img.ly/blog/reviewing-cloud-gpu-providers-for-training-ai-models/</link><guid isPermaLink="true">https://img.ly/blog/reviewing-cloud-gpu-providers-for-training-ai-models/</guid><description>Embark with us on a journey to finding the best place to host AI models.</description><pubDate>Mon, 22 Apr 2024 06:35:54 GMT</pubDate><content:encoded>&lt;h2 id=&quot;navigating-the-world-of-ai-models-hosting&quot;&gt;Navigating the World of AI Models Hosting&lt;/h2&gt;
&lt;p&gt;Here at IMG.LY, we recently dug into finding the best place to host AI models to support apps we’re dreaming up. We wanted to figure out if using cloud GPUs or going serverless would work better for us. As we were looking specifically for service providers to run image generation workloads on, we focused on those that could be the best fit for that. Along the way, we picked up some cool insights and ran into a few hiccups. We think sharing our journey and the things we figured out could help you when you’re looking to deploy your own AI models.&lt;/p&gt;
&lt;p&gt;First off, we’ll explain what cloud GPU and serverless hosting really mean. Then, we’ll chat about their good and not-so-good sides when it comes to hosting AI models. It’s super important to make sure whatever hosting you choose fits your model like a glove. We’ll talk about some tools we stumbled upon that could help with that. Next up, we’ll give you a peek at some of the providers we checked out and our thoughts on how they might fit with what we’re working on. We decided to skip over the big names like IBM, Google, and Amazon this time. We were curious about what the newer, smaller companies have to offer.&lt;/p&gt;
&lt;p&gt;To wrap things up, we’ll share some final thoughts on all our research. Plus, we’ll throw in some tips and ideas you might want to think about when you’re doing your own digging. Whether you’re developing AI models or planning to host some of the well-known ones, we hope our adventure helps you nail down the perfect hosting solution for what you need. Ready to jump in?&lt;/p&gt;
&lt;h2 id=&quot;kinds-of-cloud-hosting-for-ai-models&quot;&gt;Kinds of Cloud Hosting for AI Models&lt;/h2&gt;
&lt;p&gt;Cloud hosting has been around for as long as there has been a cloud. Though the server hardware is not at your location, earlier versions of cloud hosting required that your team learnt lots about server infrastructure. As things have evolved, providers now manage the infrastructure so that you can focus on your work. You can now host even just a single function in the cloud, if that’s what you need. In our research, we looked at general serverless hosting and at Cloud GPU AI providers.&lt;/p&gt;
&lt;h3 id=&quot;serverless-hosting&quot;&gt;Serverless Hosting&lt;/h3&gt;
&lt;p&gt;Serverless hosting can be defined as an architecture model that lets developers build and run applications and services without managing the servers they run on. The cloud provider manages things like security, provisioning, scaling, and connectivity.&lt;/p&gt;
&lt;p&gt;&lt;img alt=&quot;Diagram of serverless app services all running together on general hardware.&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; sizes=&quot;(min-width: 478px) 478px, 100vw&quot; data-astro-image=&quot;constrained&quot; data-astro-image-pos=&quot;center&quot; width=&quot;478&quot; height=&quot;390&quot; src=&quot;https://img.ly/_astro/serverless2_9SKyE.webp&quot; srcset=&quot;/_astro/serverless2_9SKyE.webp 478w&quot;&gt;&lt;br&gt;
With serverless hosting for CPU workloads, the host provisions your services onto the most appropriate available hardware. With most providers of GPU workloads, however, you get to choose the hardware.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Serverless Pros:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Pay-per-compute model: you only pay for the compute time you consume.&lt;/li&gt;
&lt;li&gt;Autoscaling: the provider will automatically scale up or down depending on load, from a few requests a day to thousands per second.&lt;/li&gt;
&lt;li&gt;No server management: eliminates the need for developers to &lt;em&gt;also&lt;/em&gt; understand server infrastructure. Often, just a Docker image holding an application is sufficient.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Serverless Cons:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Cold starts: instances are deallocated after a certain idle time (which is what enables the pay-per-compute model), so the first request afterwards can be noticeably slow.&lt;/li&gt;
&lt;li&gt;Limited control over specifics: certain GPU hardware or even server hardware may be unavailable at times, which can impact performance.&lt;/li&gt;
&lt;li&gt;Limitations on time: there may be limits on the execution time of functions, which can impact long-running processes.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&quot;cloud-gpu-hosting&quot;&gt;Cloud GPU Hosting&lt;/h3&gt;
&lt;p&gt;Cloud GPU hosting provides access to GPU and TPU (Tensor Processing Unit) hardware that can perform the parallel operations essential for AI model training and inference. The provider allows users to configure specific hardware for their jobs.&lt;/p&gt;
&lt;p&gt;&lt;img alt=&quot;Diagram showing AI model running on specific GPU&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; sizes=&quot;(min-width: 545px) 545px, 100vw&quot; data-astro-image=&quot;constrained&quot; data-astro-image-pos=&quot;center&quot; width=&quot;545&quot; height=&quot;390&quot; src=&quot;https://img.ly/_astro/gpu3_Z27RcaD.webp&quot; srcset=&quot;/_astro/gpu3_Z27RcaD.webp 545w&quot;&gt;&lt;br&gt;
With cloud GPU hosting, each service or model gets its own GPU while running. Your other services communicate with the model through an API.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Cloud GPU Pros:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;High performance: GPUs are well suited to the highly parallel computation behind AI models, deep learning, and complex simulations.&lt;/li&gt;
&lt;li&gt;Full control of hardware: users can choose specific hardware configurations for their projects.&lt;/li&gt;
&lt;li&gt;Persistent availability: resources are not deallocated, so there is no latency for provisioning for the first request.&lt;/li&gt;
&lt;li&gt;Cost-effective experiments: the upfront cost of purchasing GPU hardware to experiment with different configurations is eliminated. Services are priced with a pay-as-you-go model.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Cloud GPU Cons:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Costs over time: costs do not go down during periods of low demand. Over time, costs can potentially surpass the cost of investing in local hardware.&lt;/li&gt;
&lt;li&gt;Management overhead: managing and optimizing hardware configurations is not automatically part of the hosting. You’ve got to learn some server administration and manage security and upgrades.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;providers&quot;&gt;Providers&lt;/h2&gt;
&lt;p&gt;It’s important to understand that this isn’t a ranking of the best providers or an endorsement. It’s what we discovered with some web searching, reviewing the available documentation, and tinkering with any demo or free tools and models the provider makes available. A different set of providers could easily have made the list, and we think many of the pros, cons, and qualities would be the same. Hopefully, the questions we raise and the pros and cons we noticed can help guide your own research.&lt;/p&gt;
&lt;p&gt;Our goal was to find potential hosts for various workflows with different models in a scalable manner. We want to be able to build applications around the workflows. Some of our specific requirements include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Autoscaling, ideally out-of-the-box without the need for custom Kubernetes setup or similar technologies.&lt;/li&gt;
&lt;li&gt;Minimal vendor lock-in.&lt;/li&gt;
&lt;li&gt;Compatibility with various technologies (REST API, WebSocket, Webhooks, etc.).&lt;/li&gt;
&lt;li&gt;Support for Windows Server.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;With those disclaimers and caveats, here is a short summary of our research.&lt;/p&gt;
&lt;table&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Provider&lt;/th&gt;&lt;th&gt;Best For&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;Runpod IO (Serverless)&lt;/td&gt;&lt;td&gt;Deploying AI models with GPU support behind customizable API interfaces.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Vast AI (Serverless)&lt;/td&gt;&lt;td&gt;Affordable GPU resources and a variety of GPU options for AI model training.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Paperspace (Serverless)&lt;/td&gt;&lt;td&gt;Flexible workflows and support for different stages of AI model development.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;CoreWeave (Serverless)&lt;/td&gt;&lt;td&gt;Teams with strong Kubernetes knowledge that need autoscaling for AI workloads.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Modal (Serverless)&lt;/td&gt;&lt;td&gt;Comprehensive documentation and examples for deploying AI models in containers.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;ComfyICU (Serverless)&lt;/td&gt;&lt;td&gt;Serverless infrastructure tailored for hosting ComfyUI applications.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Replicate (Serverless)&lt;/td&gt;&lt;td&gt;Easy-to-use API for executing AI tasks without managing infrastructure.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Genesis Cloud (Cloud GPU)&lt;/td&gt;&lt;td&gt;Teams that value sustainability and need scalable GPU instances for AI model training.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Fly IO (Cloud GPU)&lt;/td&gt;&lt;td&gt;Deploying complete applications with GPU support in a scalable environment.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Runpod IO (Cloud GPU)&lt;/td&gt;&lt;td&gt;GPU resources in various regions with customizable Docker-based deployments.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Lambda Labs (Cloud GPU)&lt;/td&gt;&lt;td&gt;On-demand GPU resources for model training and inference tasks.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Together AI (Cloud GPU)&lt;/td&gt;&lt;td&gt;A platform for testing serverless models, with occasional access to GPU clusters.&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;If you want to skip ahead to a specific part, here are the providers we will be diving into:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Serverless Providers&lt;/strong&gt;&lt;br&gt;
&lt;a href=&quot;https://img.ly/blog/reviewing-cloud-gpu-providers-for-training-ai-models//#runpodioserverless&quot;&gt;Runpod IO (Serverless)&lt;/a&gt;&lt;br&gt;
&lt;a href=&quot;https://img.ly/blog/reviewing-cloud-gpu-providers-for-training-ai-models//#vastai&quot;&gt;Vast AI&lt;/a&gt;&lt;br&gt;
&lt;a href=&quot;https://img.ly/blog/reviewing-cloud-gpu-providers-for-training-ai-models//#paperspace&quot;&gt;Paperspace&lt;/a&gt;&lt;br&gt;
&lt;a href=&quot;https://img.ly/blog/reviewing-cloud-gpu-providers-for-training-ai-models//#bananadev&quot;&gt;Banana Dev&lt;/a&gt;&lt;br&gt;
&lt;a href=&quot;https://img.ly/blog/reviewing-cloud-gpu-providers-for-training-ai-models//#coreweave&quot;&gt;CoreWeave&lt;/a&gt;&lt;br&gt;
&lt;a href=&quot;https://img.ly/blog/reviewing-cloud-gpu-providers-for-training-ai-models//#modal&quot;&gt;Modal&lt;/a&gt;&lt;br&gt;
&lt;a href=&quot;https://img.ly/blog/reviewing-cloud-gpu-providers-for-training-ai-models//#comfyicu&quot;&gt;ComfyICU&lt;/a&gt;&lt;br&gt;
&lt;a href=&quot;https://img.ly/blog/reviewing-cloud-gpu-providers-for-training-ai-models//#replicate&quot;&gt;Replicate&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;GPU Cloud Providers&lt;/strong&gt;&lt;br&gt;
&lt;a href=&quot;https://img.ly/blog/reviewing-cloud-gpu-providers-for-training-ai-models//#genesiscloud&quot;&gt;Genesis Cloud&lt;/a&gt;&lt;br&gt;
&lt;a href=&quot;https://img.ly/blog/reviewing-cloud-gpu-providers-for-training-ai-models//#flyio&quot;&gt;Fly IO&lt;/a&gt;&lt;br&gt;
&lt;a href=&quot;https://img.ly/blog/reviewing-cloud-gpu-providers-for-training-ai-models//#runpodiocloudgpu&quot;&gt;Runpod IO (Cloud GPU)&lt;/a&gt;&lt;br&gt;
&lt;a href=&quot;https://img.ly/blog/reviewing-cloud-gpu-providers-for-training-ai-models//#lamdalabs&quot;&gt;Lambda Labs&lt;/a&gt;&lt;br&gt;
&lt;a href=&quot;https://img.ly/blog/reviewing-cloud-gpu-providers-for-training-ai-models//#togetherai&quot;&gt;Together AI&lt;/a&gt;&lt;/p&gt;
&lt;h2 id=&quot;serverless-providers&quot;&gt;Serverless Providers&lt;/h2&gt;
&lt;h3 id=&quot;runpod-io-serverless&quot;&gt;Runpod IO (Serverless)&lt;/h3&gt;
&lt;p&gt;&lt;a href=&quot;https://www.runpod.io/&quot;&gt;Runpod IO&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A Docker image that includes the installation of Python + GPU packages, models, and ComfyUI.&lt;/li&gt;
&lt;li&gt;Python/Go handlers act as an API interface to ComfyUI, which is vendor-specific, but can be wrapped in a more general API for reuse. For more information, see &lt;a href=&quot;https://9elements.com/blog/hosting-a-comfyui-workflow-via-api/&quot;&gt;this article on hosting a ComfyUI workflow via API&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Good documentation, including public GitHub repositories with examples.&lt;/li&gt;
&lt;li&gt;Relatively large community for a new provider.&lt;/li&gt;
&lt;li&gt;Compatibility with Windows Server.&lt;/li&gt;
&lt;li&gt;Handlers allow for webhook and WebSocket-like communication for API feedback.&lt;/li&gt;
&lt;li&gt;Network volume to store models/data and reduce cold start times.&lt;/li&gt;
&lt;li&gt;Control over the number of workers and the ability to define persistently active workers.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Availability of GPUs, especially in Europe, needs to be validated.&lt;/li&gt;
&lt;li&gt;Handlers can only be written in Python and Go.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Open Questions:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The usual open questions for serverless infrastructure and AI inference tasks, such as pricing and cold starts.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Conclusion:&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The overall package seems very mature. The setup can largely be adopted from the GitHub examples, and both the documentation and the community support (notably on Reddit) are good. The open questions regarding pricing and cold starts are typical for serverless infrastructure.&lt;/p&gt;
&lt;hr&gt;
&lt;h3 id=&quot;vast-ai&quot;&gt;Vast AI&lt;/h3&gt;
&lt;p&gt;&lt;a href=&quot;https://vast.ai/&quot;&gt;Vast AI&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Peer-to-Peer Sharing. Companies/organizations can rent out their unused GPUs.&lt;/li&gt;
&lt;li&gt;A GPU Marketplace approach.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Affordable prices through their peer-to-peer GPU sharing model.&lt;/li&gt;
&lt;li&gt;A wide selection of different GPUs.&lt;/li&gt;
&lt;li&gt;Good global availability of GPUs.&lt;/li&gt;
&lt;li&gt;Ability to define autoscaler groups, allowing different workflows to scale differently.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The autoscaler is currently only in beta mode.&lt;/li&gt;
&lt;li&gt;Data privacy/security concerns when renting GPUs from anonymous providers.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Open Questions:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;How will the autoscaler beta evolve?&lt;/li&gt;
&lt;li&gt;Control over GPU providers: Can one allow only certain trusted providers (e.g., those based in the EU)?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Conclusion:&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Although the pricing is more affordable, there may be significant security and data protection issues, and the autoscaler is still in beta.&lt;/p&gt;
&lt;hr&gt;
&lt;h3 id=&quot;paperspace&quot;&gt;Paperspace&lt;/h3&gt;
&lt;p&gt;&lt;a href=&quot;https://docs.digitalocean.com/products/paperspace/workflows/getting-started/your-first-workflow/&quot;&gt;Paperspace&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The serverless approach (Workflows or Gradient) is still in beta. &lt;a href=&quot;https://www.paperspace.com/gradient/workflows&quot;&gt;Paperspace Gradient Workflows&lt;/a&gt; is based on &lt;a href=&quot;https://argoproj.github.io/workflows/&quot;&gt;Argo Workflows&lt;/a&gt;, which runs on Kubernetes.&lt;/li&gt;
&lt;li&gt;A predefined API is available for communicating with workflows, as detailed in &lt;a href=&quot;https://docs.digitalocean.com/reference/paperspace/pspace/commands/completion/&quot;&gt;DigitalOcean’s documentation for Paperspace commands&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The ability to use different machines (GPUs) at different stages of a workflow.&lt;/li&gt;
&lt;li&gt;Provided by DigitalOcean, which allows general hosting customers to expand into GPU hosting without finding a new vendor.&lt;/li&gt;
&lt;li&gt;Possible Windows support as outlined in &lt;a href=&quot;https://docs.digitalocean.com/products/paperspace/machines/getting-started/run-windows-app/&quot;&gt;DigitalOcean’s documentation on running Windows apps&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Complex documentation: offers many features for various use cases (AI learning, data preparation, validation, and inference).&lt;/li&gt;
&lt;li&gt;Vendor lock-in through a proprietary system: Gradient Workflows and YAML config are specific to Paperspace.&lt;/li&gt;
&lt;li&gt;No real-time feedback over the API.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Open Questions:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Since it’s still in beta, how will the ecosystem continue to develop?&lt;/li&gt;
&lt;li&gt;How much Kubernetes knowledge is required to implement autoscaling?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Conclusion:&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;It’s a plus that Paperspace is offered by DigitalOcean, a more mature company with general hosting experience. However, the approach seems very specific to DigitalOcean, and it may require experience with Kubernetes.&lt;/p&gt;
&lt;hr&gt;
&lt;h3 id=&quot;banana-dev&quot;&gt;Banana Dev&lt;/h3&gt;
&lt;p&gt;We excluded Banana Dev: they recently announced that they are terminating their serverless model because it was not cost-effective.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Lesson learned: many new providers are currently entering the market, aiming to establish themselves as cloud GPU or serverless GPU providers. This highlights the importance of minimizing vendor lock-in.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
&lt;h3 id=&quot;coreweave&quot;&gt;CoreWeave&lt;/h3&gt;
&lt;p&gt;&lt;a href=&quot;https://docs.coreweave.com/coreweave-kubernetes/serverless&quot;&gt;CoreWeave&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Heavily based on Kubernetes.
&lt;ul&gt;
&lt;li&gt;A Kubernetes file is created for setup; scaling and additional infrastructure are managed by CoreWeave.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Autoscaling by default with the possibility of scaling to zero.&lt;/li&gt;
&lt;li&gt;Supports Windows.&lt;/li&gt;
&lt;li&gt;Minimal vendor lock-in due to Kubernetes configuration.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Strong dependency on Kubernetes, with the serverless setup based on &lt;a href=&quot;https://knative.dev/docs/&quot;&gt;Knative&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Does not offer a handler API, etc., to communicate directly with ComfyUI.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Open Questions:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;How complicated would it be to implement an API interface, and the scaling that goes with it, so that requests reach the correct instances?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Conclusion:&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Good documentation and a close interface to Kubernetes. For a team with strong knowledge of Kubernetes, this could be a prime candidate.&lt;/p&gt;
&lt;hr&gt;
&lt;h3 id=&quot;modal&quot;&gt;Modal&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Container Setup: Containers are defined through Modal’s own container setup (see the &lt;a href=&quot;https://modal.com/docs/guide/custom-container&quot;&gt;Modal custom container documentation&lt;/a&gt;).
&lt;ul&gt;
&lt;li&gt;Docker images can also be used.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Modal-specific handlers to communicate with ComfyUI and other models.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Supports webhooks and custom endpoints (see the &lt;a href=&quot;https://modal.com/docs/guide/webhooks#custom-domains&quot;&gt;Modal webhooks documentation&lt;/a&gt;).&lt;/li&gt;
&lt;li&gt;Focus on fast startups/cold starts.&lt;/li&gt;
&lt;li&gt;Emphasis on AI inference tasks.&lt;/li&gt;
&lt;li&gt;Comprehensive documentation with many examples.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Vendor lock-in if Modal’s container setup is used.&lt;/li&gt;
&lt;li&gt;Autoscaling and scaling configuration are not directly described.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Open Questions:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;How exactly does the autoscaling work?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Assessment:&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;For us, this is a candidate for closer consideration. The container setup can be managed through Dockerfiles, and the API can be defined through Modal’s own interface.&lt;/p&gt;
&lt;hr&gt;
&lt;h3 id=&quot;comfyicu&quot;&gt;ComfyICU&lt;/h3&gt;
&lt;p&gt;&lt;a href=&quot;https://comfy.icu/serverless/&quot;&gt;ComfyICU&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Pure focus on ComfyUI, serverless infrastructure.&lt;/li&gt;
&lt;li&gt;API interface for communication.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Minimal setup effort.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Limited control over the API.&lt;/li&gt;
&lt;li&gt;Limited GPU resources.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Open Questions:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;How does the autoscaling work, if it exists at all?&lt;/li&gt;
&lt;li&gt;It is community-based open source. What does long-term support for the project look like?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Conclusion:&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Potentially useful for testing or building a demo site, but probably not suitable for developing our commercial applications.&lt;/p&gt;
&lt;hr&gt;
&lt;h3 id=&quot;replicate&quot;&gt;Replicate&lt;/h3&gt;
&lt;p&gt;&lt;a href=&quot;https://replicate.com/&quot;&gt;Replicate&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Execution of AI tasks/models in the cloud via an API.&lt;/li&gt;
&lt;li&gt;No access to infrastructure, etc.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Supports various languages: Node, Python, Swift.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;No control over the infrastructure, number of GPUs, or workers.&lt;/li&gt;
&lt;li&gt;API rate limits.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Open Questions:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;How can autoscaling be enabled?&lt;/li&gt;
&lt;li&gt;Is it possible to create custom API endpoints, webhooks, websockets?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Conclusion:&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;For testing or as a demo for one’s own model, this can be a very good platform. However, as a standalone application interface, it doesn’t meet some of our core requirements.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id=&quot;gpu-cloud-providers&quot;&gt;GPU Cloud Providers&lt;/h2&gt;
&lt;h3 id=&quot;genesis-cloud&quot;&gt;Genesis Cloud&lt;/h3&gt;
&lt;p&gt;&lt;a href=&quot;https://www.genesiscloud.com/&quot;&gt;Genesis Cloud&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Focus on sustainability and renewable energy.&lt;/li&gt;
&lt;li&gt;Scaling through instances, as detailed in the &lt;a href=&quot;https://developers.genesiscloud.com/instances&quot;&gt;Genesis Cloud developer documentation&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A REST API is available for managing instances.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The availability of GPUs varies significantly by region.&lt;/li&gt;
&lt;li&gt;Limited selection of GPUs.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Open Questions:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;How quickly can new instances be scaled up or down?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Conclusion:&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Genesis Cloud appears better suited to model training or other tasks that require significant computing power for extended periods.&lt;/p&gt;
&lt;hr&gt;
&lt;h3 id=&quot;fly-io&quot;&gt;Fly.io&lt;/h3&gt;
&lt;p&gt;&lt;a href=&quot;https://fly.io/&quot;&gt;Fly.io&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Focus on the deployment of complete applications.&lt;/li&gt;
&lt;li&gt;Also offers its own GPU servers.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Dockerfile support with additional configuration via a TOML file.&lt;/li&gt;
&lt;li&gt;Quick scaling of GPUs up or down facilitated by the launch process.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Limited selection of GPUs, with only very large GPUs available.&lt;/li&gt;
&lt;li&gt;Specifically tailored for Linux.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Open Questions:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;How well does the launch system perform for relatively fast inference tasks?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Conclusion:&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Since mainly large GPUs are available, the focus here also appears to be on model training or other long-running tasks. However, the launch system might also be usable for inference.&lt;/p&gt;
&lt;hr&gt;
&lt;h3 id=&quot;runpod-io-cloud-gpu&quot;&gt;Runpod IO (Cloud GPU)&lt;/h3&gt;
&lt;p&gt;&lt;a href=&quot;https://www.runpod.io/&quot;&gt;Runpod IO&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A wide range of GPUs available across various regions.&lt;/li&gt;
&lt;li&gt;Base Docker images for popular tasks or support for custom Docker images.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Many different data center regions.&lt;/li&gt;
&lt;li&gt;A variety of CPUs available.&lt;/li&gt;
&lt;li&gt;Simple setup via Docker images.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;No direct autoscaling (would need to use Runpod Serverless for that).&lt;/li&gt;
&lt;li&gt;Despite a large selection of GPUs and many different data center locations, the availability of GPUs is not very high.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Open Questions:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Can autoscaling be implemented without using serverless?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Conclusion:&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The setup can largely be adopted from the GitHub examples. There is good documentation and a community (much of it on Reddit). The availability of GPUs could become a problem, especially for smaller GPUs.&lt;/p&gt;
&lt;hr&gt;
&lt;h3 id=&quot;lamda-labs&quot;&gt;Lambda Labs&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;On-demand cloud with a focus on model training and inference.&lt;/li&gt;
&lt;li&gt;Similar concept to Runpod, offering a variety of GPUs.
&lt;ul&gt;
&lt;li&gt;GPU availability is very limited.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Conclusion:&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Runpod and Lambda Labs seem to have a similar approach and similar offerings. Runpod appears to have greater availability.&lt;/p&gt;
&lt;hr&gt;
&lt;h3 id=&quot;together-ai&quot;&gt;Together AI&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Concept:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Offers an API and playground for testing serverless models.&lt;/li&gt;
&lt;li&gt;Also offers GPU clusters but only upon request.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Conclusion:&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;We didn’t dig into the GPU clusters since information is available only upon request. Otherwise, in the API/serverless area, it appears to be similar to Replicate.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id=&quot;established-providers&quot;&gt;Established Providers&lt;/h2&gt;
&lt;p&gt;As we said in the introduction, we did not examine the older, larger providers like Google Cloud, AWS, Azure, Nvidia, etc., in detail. Rather, we focused on the new providers aiming specifically at the market segment of AI GPUs. The older providers sit more in the realm of cloud GPUs and less in serverless. Given the size of these providers and the wide range of market segments they cover, it can make sense to opt for them if you are already familiar with their architecture and documentation.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Google Cloud Platform (GCP)&lt;/li&gt;
&lt;li&gt;AWS&lt;/li&gt;
&lt;li&gt;Microsoft Azure&lt;/li&gt;
&lt;li&gt;IBM Cloud&lt;/li&gt;
&lt;li&gt;NVIDIA GPU Cloud (NGC)&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Just as we saw that performance can vary wildly for different models, pricing can be similarly complex. When evaluating costs, consider factors like response times, the number of required workers, and potential charges for features like caching. Many providers publish detailed pricing guidelines on their websites, which can be crucial for ensuring you only pay for the computing power you truly need. Measuring the performance of your model and application during development will help ensure that both your hardware and your pricing fit your use case.&lt;/p&gt;
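&lt;p&gt;To make these cost factors concrete, here is a minimal Python sketch of a back-of-the-envelope monthly estimate. The function name, the per-second rate, and the traffic figures are hypothetical placeholders rather than any provider’s actual pricing, and the model assumes simple per-second billing with no caching or storage fees.&lt;/p&gt;

```python
def estimate_monthly_cost(requests_per_day, seconds_per_request,
                          price_per_gpu_second, idle_seconds_per_day=0.0,
                          days=30):
    """Rough monthly cost for a GPU endpoint billed per second.

    All inputs are assumptions you supply yourself; real bills may add
    charges for storage, caching, or network egress.
    """
    busy_seconds = requests_per_day * seconds_per_request
    billed_seconds = (busy_seconds + idle_seconds_per_day) * days
    return billed_seconds * price_per_gpu_second


# Hypothetical workload: 2,000 requests a day at 8 s each, plus 1 hour
# of idle-but-billed worker time per day, at a made-up $0.0004 per second.
cost = estimate_monthly_cost(2000, 8, 0.0004, idle_seconds_per_day=3600)
print(f"Estimated monthly cost: ${cost:.2f}")
```

&lt;p&gt;Swapping in your own measured response times and a provider’s published rate gives a quick way to compare offers before committing to one.&lt;/p&gt;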
&lt;p&gt;Another thing to consider is the experience your team already has. Most cloud GPU services provide tools like a CLI or REST APIs to manage resources, which can mean a steep learning curve if your team is not familiar with these technologies. Additionally, while serverless platforms may support multiple programming languages, compatibility with your team’s preferred language (be it JavaScript, Python, or Go) is essential. As exciting as it can be to learn new languages, it’s probably not the best use of your team’s time.&lt;/p&gt;
&lt;p&gt;The size of files you’ll be moving between your model and the other parts of your project may also be a factor. Your users may not notice latency for models that communicate using text only. Text moves quickly from point to point in a network. However, if your model takes large image files as input or output, you may find that moving data between data centers is too slow. Then you’d want to focus on providers who can offer more general hosting in addition to cloud GPU hosting.&lt;/p&gt;
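&lt;p&gt;A rough way to judge whether payload size matters is to estimate transfer time from file size and bandwidth. The Python sketch below is illustrative only: the function name and the file sizes, bandwidth, and round-trip latency are assumed values, not measurements from any provider.&lt;/p&gt;

```python
def transfer_seconds(size_megabytes, bandwidth_megabits_per_s,
                     round_trip_latency_s=0.0):
    """Approximate time to move one payload between two data centers."""
    size_megabits = size_megabytes * 8  # 1 byte = 8 bits
    return round_trip_latency_s + size_megabits / bandwidth_megabits_per_s


# Assumed link: 100 Mbit/s with 50 ms round-trip latency.
text_prompt = transfer_seconds(0.002, 100, 0.05)  # about 2 KB of text
large_image = transfer_seconds(25, 100, 0.05)     # a 25 MB image

print(f"Text prompt: {text_prompt:.3f} s")
print(f"Large image: {large_image:.3f} s")
```

&lt;p&gt;With these assumed numbers, the image takes roughly two seconds while the text arrives in a few hundredths of a second.&lt;/p&gt;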
&lt;p&gt;As we continue to research this for our own projects, we think the best configuration for us is to use a cloud GPU exclusively for generation tasks and communicate with it via an API from our existing back end. We will have to experiment to see whether those functions can be geographically separate, or whether we need one hosting company and one data center for both. By using the higher-cost cloud GPU for as few tasks as possible, we’ll know we aren’t wasting compute power on things easily handled by a general-purpose CPU. As we learn more we may change our ideas, but that’s part of the fun of working in technology: things change.&lt;/p&gt;
&lt;p&gt;We hope this has given you some useful background and ideas as you research hosting options for your AI projects. Understanding the subtle differences between serverless and cloud GPU hosting can spark innovative ideas tailored to your needs. Perhaps some of the lesser-known providers we’ve explored might just be the perfect fit for your next project. As always, the dynamic nature of technology keeps us on our toes—ready to adapt and evolve. Happy hosting!&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Thanks for reading. Join over 3000 specialists with powerful apps and &lt;a href=&quot;https://share.hsforms.com/1IgAOV1wASXGPnFG4ZPLejg1hk3i&quot;&gt;subscribe&lt;/a&gt; to our newsletter. We keep you in the loop with brand-new features, early access, and updates.&lt;/strong&gt;&lt;/p&gt;</content:encoded><dc:creator>Walter</dc:creator><media:content url="https://blog.img.ly/2024/04/cloud-gpu-review.jpg" medium="image"/><category>AI</category><category>Cloud</category><category>Development</category><category>Machine Learning</category></item><item><title>Infopark reinvents CMS asset management with the PhotoEditor SDK</title><link>https://img.ly/blog/infopark-reinvents-cms-asset-management-with-the-photoeditor-sdk-eb026e6864fe/</link><guid isPermaLink="true">https://img.ly/blog/infopark-reinvents-cms-asset-management-with-the-photoeditor-sdk-eb026e6864fe/</guid><description>Scrivito: A Native Cloud CMS built with Ruby on Rails </description><pubDate>Wed, 24 Oct 2018 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;strong&gt;Infopark:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Founded 1994&lt;/li&gt;
&lt;li&gt;Based in Berlin, Germany&lt;/li&gt;
&lt;li&gt;SaaS provider for website infrastructure technologies (CMS, CRM)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Requirements:&lt;/strong&gt; A simple and intuitive photo editing solution to provide their clients with seamless and easy asset management&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Results:&lt;/strong&gt; A versatile image editor within &lt;em&gt;Scrivito’s&lt;/em&gt; Content Browser&lt;/p&gt;
&lt;p&gt;The native cloud service &lt;em&gt;Scrivito&lt;/em&gt; is tailored to meet the requirements of editors worldwide and provides a clean and simple what-you-see-is-what-you-get, drag &amp;#x26; drop interface that enables its users to change their websites and content on the fly, eliminating complex tree hierarchies and all the guesswork that goes along with them. Since successful websites hinge on high-quality pictures, Infopark wanted to equip its users with a robust image editing solution. “We wanted to provide our users with a coherent and easy-to-use working environment, and because most editors don’t have Photoshop or a proper skillset for other sophisticated programs, we were searching for a solution that would enable our users to quickly and professionally edit their images within &lt;em&gt;Scrivito&lt;/em&gt;,” says Thomas Witt, co-founder and Director of Product &amp;#x26; Business Development at Infopark.&lt;/p&gt;
&lt;p&gt;Before integrating the PhotoEditor SDK into their award-winning service &lt;em&gt;Scrivito&lt;/em&gt;, Infopark tried to tackle the problem themselves with a simple, self-made solution based on ImageMagick. The editor could perform operations such as rotate, crop or flip “but that solution didn’t have an appealing UI, nor was it easy to use” says Thomas Witt. “After evaluating some solutions, we felt that the PhotoEditor SDK would be a good match because it is the easiest to use for both developers and users and also has a nice UI that doesn’t look cluttered.”&lt;/p&gt;
&lt;p&gt;&lt;img alt=&quot;&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot; sizes=&quot;(min-width: 800px) 800px, 100vw&quot; data-astro-image=&quot;constrained&quot; data-astro-image-pos=&quot;center&quot; width=&quot;800&quot; height=&quot;395&quot; src=&quot;https://img.ly/_astro/0-twdBvfCaQIwyYWf4_Z1FivKm.webp&quot; srcset=&quot;/_astro/0-twdBvfCaQIwyYWf4_ZfMBAM.webp 640w, /_astro/0-twdBvfCaQIwyYWf4_Z24qT6L.webp 750w, /_astro/0-twdBvfCaQIwyYWf4_Z1FivKm.webp 800w&quot;&gt;&lt;/p&gt;
&lt;p&gt;With the SDK the developers at Infopark accomplished in days what would’ve otherwise taken weeks. “Our developers loved that the integration was really straightforward and that everything you need like APIs and documentation is easy to find and of very good quality. And when we had questions, the support team responded very quickly and extensively” says Witt. With the PhotoEditor SDK Infopark provides their customers with a native image editing solution that seamlessly integrates into &lt;em&gt;Scrivito’s&lt;/em&gt; content browser and further facilitates the creation of websites and content. “Asset management is critical to our users, and we get a lot of positive feedback for our solution,” says Thomas Witt.&lt;/p&gt;
&lt;p&gt;“&lt;a href=&quot;https://img.ly/products/photo-sdk/&quot;&gt;The PhotoEditor SDK&lt;/a&gt; is a great product, it does exactly what it’s supposed to do, and we never encountered any difficulties. It even provides more functionalities than our customers would ever use. The support was very fast and helpful, and the documentation is extensive and comprehensible” concludes Witt. With the PhotoEditor SDK &lt;em&gt;Scrivito&lt;/em&gt; allows for an intuitive and straightforward approach to both content and website creation. Unlike other CMS &lt;em&gt;Scrivito&lt;/em&gt; now enables its users to creatively experiment with their website assets and make impromptu decisions for their design.&lt;/p&gt;</content:encoded><dc:creator>Felix</dc:creator><media:content url="https://blog.img.ly/2020/04/image-44.png" medium="image"/><category>Web Development</category><category>Case Study</category><category>CMS</category><category>Website</category><category>Cloud</category><category>Case Studies</category></item></channel></rss>